Abstract

1       A Real-Virtual Interaction Model

2       Live Theatrical Stage Performance Use Case

2.1        Purpose

2.2        Description and flow of actions

3       Collaborative immersive laboratory

3.1        Purpose

3.2        Description

3.3        Specific application areas

4       eSports Tournament (XRV-EST)

4.1        Purpose

4.2        Description

5       Experiential retail/shopping

5.1        Purpose

5.2        Description and flow of actions

6       Immersive art experience

6.1        Purpose

6.2        Description

7       DJ/VJ performance at a dance party

7.1        Purpose

7.2        Description

8       Live concert performance

8.1        Purpose

8.2        Description

9       Experiential marketing/branding

9.1        Purpose

9.2        Description

10     Meetings/presentations

10.1      Purpose

10.2      Description

Abstract

XR Venues (MPAI-XRV) is an MPAI project addressing a multiplicity of use cases that are enabled by Extended Reality (XR) – the combination of Augmented Reality (AR), Virtual Reality (VR), and Mixed Reality (MR) technologies – and enhanced by Artificial Intelligence (AI) technologies. The word Venue is used as a synonym for Real and Virtual Environments.

1        A Real-Virtual Interaction Model

An important feature of MPAI-XRV is the strong interaction between – and sometimes even interchangeability of – the Real Environment and the Virtual Environment. The MPAI-XRV model, depicted in Figure 1, is helpful to guide the analysis of the MPAI-XRV use cases.

The model assumes that there is a complete symmetry between the actions performed and the data formats exchanged between a Real Environment and a Virtual Environment (e.g., a metaverse).

 

Figure 1 – Real World (yellow) and Virtual (blue) Interactions.

 

Figure 1 depicts the processing elements that bidirectionally capture and process data from a Real Environment to generate actions and deliver experiences in a Virtual Environment, and vice versa. Table 1 defines the functions of the identified components.

 

Table 1 – The functions of the components in the MPAI-XRV Model

Data Capture: Captures the Environment as collections of signals and/or Data.
Feature Extraction: Analyses Data to extract Descriptors.
Feature Interpretation: Analyses Descriptors to yield Interpretations.
Action Generation: Analyses Interpretations to generate Actions.
Experience Generation: Analyses Actions to generate the Environment.
Environment Delivery: Delivers the Environment as collections of signals and/or Data.
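Because the model is symmetric, the same chain of functions runs in both directions (Real-to-Virtual and Virtual-to-Real). Purely as an illustrative sketch – not part of any MPAI specification – the six functions of Table 1 can be viewed as a composable processing chain; all class, field, and function names below are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class XRVPipeline:
    """One direction of the MPAI-XRV chain (Real-to-Virtual or Virtual-to-Real)."""
    data_capture: Callable[[Any], List[Any]]            # Environment -> signals and/or Data
    feature_extraction: Callable[[List[Any]], Dict]      # Data -> Descriptors
    feature_interpretation: Callable[[Dict], Dict]        # Descriptors -> Interpretations
    action_generation: Callable[[Dict], List[Dict]]       # Interpretations -> Actions
    experience_generation: Callable[[List[Dict]], Any]    # Actions -> Environment
    environment_delivery: Callable[[Any], None]           # Environment -> delivered signals/Data

    def run(self, source_environment: Any) -> None:
        data = self.data_capture(source_environment)
        descriptors = self.feature_extraction(data)
        interpretations = self.feature_interpretation(descriptors)
        actions = self.action_generation(interpretations)
        environment = self.experience_generation(actions)
        self.environment_delivery(environment)
```

A Real-to-Virtual instance of this chain would capture signals from sensors in the Real Environment and deliver the generated experience into the Virtual Environment; the Virtual-to-Real instance mirrors it.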

 

2        Live Theatrical Stage Performance Use Case

2.1       Purpose

Theatrical stage performances such as Broadway theatre, musicals, dramas, operas, and other performing arts increasingly use video scrims, backdrops, and projection mapping to create digital sets rather than constructing physical ones, which allows animated backdrops and reduces the cost of mounting shows.

 

The use of immersion domes – especially LED volumes – promises to surround audiences with virtual environments that the live performers can inhabit and interact with. In addition, a Live Theatrical Stage Performance can extend into the metaverse as a digital twin, implementing the model depicted in Figure 1. In this case, elements of the Virtual Environment experience can be projected in the Real Environment and elements of the Real Environment experience can be rendered in the Virtual Environment (metaverse).

 

Use of AI in Live Theatrical Stage Performance will allow:

  1. Rapid mounting of a show into a variety of real and virtual venues.
  2. Orchestration of the complex lighting, video, audio, and stage set cues that must adapt to the pace of live performers without extensive staff.
  3. Large shows to tour to smaller venues that otherwise could not support complex productions.
  4. Live performances spanning both Virtual- and Real-Environments, including in-person or remote participants and performers with enhanced participant interactivity.
  5. A more direct connection between the artist and participants by consolidating many complex experiential modalities into a simple user interface.
  6. Artists to access a large amount of data from opted-in individuals, which can be incorporated into the visual and musical performance. Each show can thus be unique for each audience.

2.2       Description and flow of actions

The typical set up can be described as follows:

  1. A physical stage.
  2. Lighting, projections (e.g., dome, holograms, AR goggles), and Special Effects (FX).
  3. Audience (standing or seated) in the real and virtual venue and external audiences via interactive streaming.
  4. Interactive interfaces to allow audience participation (e.g., voting, branching, real-virtual action generation).
  5. Performers on stage, platforms around domes or moving through the audience (immersive theatres).
  6. Multisensory experience delivery system (immersive video and spatialised audio, touch, smell).
  7. Capture of biometric data from audience and/or performers from wearables, sensors embedded in the seat, remote sensing (e.g., audio, video, lidar).
  8. Show operator(s) to allow manual augmentation and oversight of an AI that has been trained by show operator activity.
  9. Virtual Environment (metaverse) that mirrors selected elements of the Real Environment. For example, performers on the stage are mirrored by digital twins in the metaverse, using:
    1. Captured body motion (MoCap) used to animate an avatar.
    2. Keyed 2D image mapped on a plane.
    3. Volumetrically captured 3D images producing photorealistic digital embodiments.
  10. The Real Environment can also mirror selected elements of the Metaverse, similar to in-camera visual effects/virtual production techniques. For instance, elements of the Metaverse such as avatars, landscape, sky, and objects can be represented in the Real Environment through:
    1. Immersive displays
    2. The floor of the stage itself and set pieces on the stage may be projection-mapped or wrapped with LED to integrate them into the immersive environment. This allows, for instance, a set piece such as a tree to come alive with moving leaves, blooming flowers, or ripening fruits, and to cast a virtual shadow across the stage from a virtual light source moving across an immersive dome. Many of these elements may be extracted from the metaverse and projected into the real-world immersive environment.
    3. Augmented reality overlays using AR glasses, “hologram” displays or scrims.
    4. Lighting and FX.
  11. The physical stage and set pieces blend seamlessly into the virtual 3D backdrop projected onto the dome, such that spectators perceive a single immersive environment.
  12. Real performers enter the stage. As they move about the stage, whether dancing, acting, etc., their performance may be mirrored in the Virtual Environment (metaverse) by tracking the performers’ motion, gestures, vocalisations, and biometrics. The performance is accompanied by music, lighting, and FX.
  13. In addition, virtual performers in the Virtual Environment (metaverse) may be projected onto the real-world immersive environment via immersive display, AR, etc.
  14. The Script or cue list describes the show events, guiding and synchronising the actions of all AI Modules (AIMs) as the show evolves from cue to cue and scene to scene. In addition to performing the show, the AIMs might spontaneously innovate show variations, amplify the actions of performers, or respond to commands from operators by modifying the Real or Virtual Environment within scripted guidelines (a minimal cue-list sketch follows this list).
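Purely by way of illustration, such a Script or cue list could be represented as structured data and dispatched to the AIMs responsible for lighting, FX, and metaverse mirroring; every field name, module name, and value below is a hypothetical assumption, not a normative format.

```python
# Hypothetical cue-list structure and dispatcher; all names and values are illustrative.
cue_list = [
    {
        "cue_id": "scene1.entrance",
        "trigger": {"type": "performer_position", "zone": "stage_left"},
        "actions": [
            {"aim": "LightingControl", "command": "fade", "params": {"preset": "dawn", "seconds": 5}},
            {"aim": "MetaverseMirror", "command": "spawn_avatar", "params": {"performer": "lead"}},
        ],
    },
    {
        "cue_id": "scene1.chorus",
        "trigger": {"type": "audio_beat", "bar": 16},
        "actions": [
            {"aim": "FXControl", "command": "start", "params": {"effect": "fog", "level": 0.4}},
        ],
    },
]


def dispatch(cue, aims):
    """Send each action of a fired cue to the AI Module (AIM) responsible for it."""
    for action in cue["actions"]:
        aims[action["aim"]].execute(action["command"], **action["params"])
```

An orchestration AIM would evaluate each cue’s trigger against the interpreted performer and audience data and dispatch the listed actions when the trigger fires, within the scripted guidelines.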

 

3        Collaborative immersive laboratory

3.1       Purpose

Create a collaborative immersive environment allowing citizen scientists and researchers to join physically, or virtually via an avatar or a volumetric representation of themselves, for navigation, inspection, analysis, and simulation of scientific or industrial 3D/spatial models/datasets ranging from the microscopic to the macroscopic.

Examples are:

  • View data in its actual 3D or 4D (over time) form through Immersive Reality.
  • Present very large data sets generated by microscopes and by patient and industrial scanners.
  • Format/reformat, qualify, and quantify sliced datasets with enhanced visualisation and analysis tools, or import results for rapid correction of metadata for volumetric import.
  • Provide tools for investigators to understand complex data sets completely and communicate their findings efficiently.

Objective of an exemplary case: to define interfaces of AI Modules that create 3D models of the fascia from 2D slices of microscopic medical images, classify cells based on their spatial phenotype morphology, and enable the user to explore, interact with, and zoom into the 3D model, count cells, and jump from one portion of the endoderm to another.

3.2       Description

There is a file containing the digital capture of 2D slices, e.g., of the endocrine system.

An AIM reads the file and creates the 3D model of the fascia.

Another AIM finds the cells in the model and classifies them.

A human

  1. navigates the 3D model.
  2. interacts with the 3D model.
  3. zooms into the 3D model (e.g., ×2000).
  4. converts a confocal image stack into a volumetric model.
  5. analyses the movement of an athlete for setting peak performance goals.

Relevant data formats (see the loading sketch after the list) are:

  1. Image Data: TIFF, PNG, JPEG, DICOM, VSI, OIR, IMS, CZI, ND2, and LIF files
  2. Mesh Data: OBJ, FBX, and STEP files
  3. Volumetric Data: OBJ, PLY, XYZ, PCG, RCS, RCP and E57[1]
  4. Supplemental Slides from PowerPoint/Keynote/Zoom
  5. 3D Scatterplots from CSV files
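As a minimal sketch of how such image data might be turned into a navigable volumetric model – assuming the 2D slices are stored as a multi-page TIFF stack and that the tifffile and scikit-image packages are available – the following is illustrative only:

```python
import numpy as np
import tifffile                      # third-party reader for (multi-page) TIFF stacks
from skimage import measure          # scikit-image, used here for iso-surface extraction

# Read a stack of 2D slices into a single 3D volume (z, y, x); the file name is illustrative.
volume = tifffile.imread("endocrine_slices.tif").astype(np.float32)

# Normalise intensities to [0, 1] before meshing or volume rendering.
volume = (volume - volume.min()) / (volume.max() - volume.min() + 1e-9)

# Extract an iso-surface mesh (vertices/faces) that can be exported, e.g., as OBJ or PLY.
verts, faces, normals, values = measure.marching_cubes(volume, level=0.5)
```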

3.3       Specific application areas

3.3.1      Microscopic dataset visualisation

  1. Deals with different object types, e.g.:
    1. 3D Visual Output of a microscope.
    2. 3D model of the brain of a mouse.
    3. Molecules captured as 3D objects by an electronic microscope.
  2. Create and add metadata to a 3D audio-visual object:
    1. Define a portion of the object – manual or automatic.
    2. Assign physical properties to (different parts) of the 3D AV object.
    3. Annotate a portion of the 3D AV object.
    4. Create links between different parts of the 3D AV object.
  3. Enter, navigate and act on 3D audio-visual objects:
    1. Define a portion of the object – manual or automatic.
    2. Count objects per assigned volume size (see the sketch after this list).
    3. Detect structures in a (portion of) the 3D AV object.
    4. Deform/sculpt the 3D AV object.
    5. Combine 3D AV objects.
    6. Call an anomaly detector on a portion with an anomaly criterion.
    7. Follow a link to another portion of the object.
    8. 3D print (portions of) the 3D AV object.
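A minimal sketch of the object-counting step, assuming the dataset is already loaded as a 3D NumPy array and that a simple intensity threshold is an adequate segmenter (real cell classification would rely on a trained model); all numbers are illustrative:

```python
import numpy as np
from scipy import ndimage


def count_objects(volume: np.ndarray, threshold: float, roi: tuple) -> int:
    """Count connected components (e.g., cells) inside a region of interest of a 3D volume."""
    binary = volume[roi] > threshold          # segment with a simple intensity threshold
    _, num_objects = ndimage.label(binary)    # 3D connected-component labelling
    return num_objects


# Example: count objects inside a 100x100x100-voxel sub-volume; numbers are illustrative.
roi = (slice(0, 100), slice(0, 100), slice(0, 100))
# num = count_objects(volume, threshold=0.5, roi=roi)
```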

3.3.2      Macroscopic dataset visualisation and simulation

  1. Deals with different dataset types, e.g.:
    1. Stars and 3D star maps (HIPPARCOS, Tycho Catalogues, etc.) – see the catalogue sketch after this list.
    2. Deep-sky objects (galaxies, star clusters, nebulae, etc.).
    3. Deep-sky surveys (galaxy clusters, large-scale structures, distant galaxies, etc.).
    4. Satellites and man-made objects in the atmosphere and above, space junk, planetary and Moon positions.
    5. Real-time air traffic.
    6. Geospatial information including CO2 emission maps, ocean temperature, weather, etc.
  2. Simulation data
    1. Future/past positions of celestial objects.
    2. Stellar and galactic evolution.
    3. Weather simulations.
    4. Galaxy collisions.
    5. Black hole simulation.
  3. Create and add metadata to datasets and simulations:
    1. Assign properties to (different parts) of the datasets and simulations.
    2. Define a portion of the dataset – manual or automatic.
    3. Annotate a portion of the datasets and simulations.
    4. Create links between different parts of the datasets and simulations.
  4. Enter, navigate, and act on 3D audio-visual objects:
    1. Search data for extra-solar planets.
    2. Count objects per assigned volume size.
    3. Detect structures and trends in a (portion of) the datasets and simulations.
    4. Call an anomaly detector on a portion with an anomaly criterion.

3.3.3      Educational lab

  1. Experiential learning models and simulations for humans.
  2. Group navigation across datasets and simulations.
  3. Group interactive curricula.
  4. Evaluation maps.

3.3.4      Collaborative CAD

  1. Building information management.
  2. Collaborative design and art.
  3. Collaborative design reviews.
  4. Event simulation (emergency planning etc.).
  5. Material behaviour simulation (thermal, stress, collision, etc.).

4        eSports Tournament (XRV-EST)

4.1       Purpose

To define interfaces between components enabling an XR Theatre in the Real World (RW) to host any pre-existing Virtual World (VW) game for the purpose of producing an eSports tournament with RW and VW audience interactivity. To the extent that the game possesses the required interfaces, the XR Theatre can drive action within the VW.

4.2       Description

The eSports Tournament Use Case consists of the following:

  1. Two teams of 5 RW players are arranged on either side of a RW stage, each using a computer to compete within a common real-time Massively Multiplayer Online (MMO) VW game space.
  2. The 10 players in the VW are represented by avatars, each driven by:
    1. Role (e.g., magician, warrior, soldier).
    2. Properties (e.g., costumes, physical form, physical features).
    3. Actions (e.g., casting spells, shooting, flying, jumping) operating in the VW.
  3. The VW is populated by
    1. Avatars representing the other players.
    2. Autonomous characters (e.g., dragon, monsters, various creatures)
    3. Environmental structures (e.g., terrain, mountains, bodies of water).
  4. The action in the VW, with all related sounds of the VW game space, is captured by multiple VW cameras and:
    1. Projected onto an immersive screen surrounding RW spectators.
    2. Live streamed to remote spectators as a 2D video.

  5. A shoutcaster calls the action as the game proceeds.
  6. The image of RW players, player stats, or other information or imagery may also be displayed on the immersive screen and the live stream.
  7. The RW tournament venue is augmented with lighting and special effects, music, and costumed performers.
  8. Interactions:
    1. Live stream viewers interact with one another and with commentators through live chats, Q&A sessions, etc.
    2. RW spectators interact through shouting, waving and interactive devices (e.g., LED wands, smartphones) through processing where:
      1. Data are captured by camera/microphone or wireless data interface (see RW data in Figure 1).
      2. Features are extracted and interpreted.
    3. RW/VW actions can be generated (see the sketch after this list) as a result of:
      1. In-person or remote audience behaviour (RW).
      2. Data collected from VW action (e.g., spell casting, characters dying, bombs exploding).
  9. At the end of the tournament, an award ceremony featuring the winning players on the RW stage is held with great fanfare.
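A purely illustrative sketch of this action-generation step – mapping interpreted audience behaviour and VW game events to RW and VW actions – in which every event name, device, and threshold is an assumption:

```python
# Hypothetical mapping from interpreted audience/game data to RW and VW actions;
# every event name, device, and threshold below is an assumption.
def generate_actions(interpretation: dict) -> list:
    actions = []
    if interpretation.get("crowd_energy", 0.0) > 0.8:       # RW audience behaviour
        actions.append({"target": "RW", "device": "led_wands", "command": "pulse", "colour": "team"})
        actions.append({"target": "VW", "command": "camera_shake", "intensity": 0.3})
    if interpretation.get("event") == "spell_cast":          # data collected from VW action
        actions.append({"target": "RW", "device": "lighting", "command": "strobe", "seconds": 2})
    return actions
```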

5        Experiential retail/shopping

5.1       Purpose

To define components and interfaces to facilitate a retail shopping experience enhanced using immersive/interactive technologies driven with AI.

Enhancements include:

  1. Faster locating of products.
  2. Easy access to product information and reviews.
  3. Delivery of special offers.
  4. Collaborative shopping (members of a group know what other members have purchased).
  5. Product annotation according to user preference and theming of the environment according to season and user preferences.
  6. Analytics of data collected to inform sales and marketing decisions, inventory control, and business model optimisation.
  7. Offering remote shoppers the ability to enter a digital twin of the real-world store as an avatar (as 3D Graphics or as a volumetric “hologram”) and interact with friends who are physically or virtually present in the real-world store.

5.2       Description and flow of actions

The environment displays the following features:

  1. It gives the user the impression that it is intelligent: the environment has access to the user’s identity/behaviour/preferences/shopping history/shopping list and can guide the buyer to the area containing products of their supposed interest, propose products, annotate products, and display a particular product and make it flash because the environment thinks it is of interest to the buyer (see the sketch after this list).
  2. It broadcasts music etc. to all buyers in the environment, driven by their preferences. Friends in the shop at the same time can “meet”, but buyers can opt out from being discoverable (by the store, by friends, etc.). Buyers can opt out from the loyalty card and not have the products they buy recorded by the shop.
  3. It can be digitally rethemed for different occasions.
  4. It offers an experience that can take shape anywhere, e.g., in a vehicle or in a public transit space.
  5. It enables remote shoppers to virtually enter a digital twin of the store and interact with friends who are physically present in the store for a collaborative shopping experience.
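A minimal sketch of the guidance behaviour described in item 1, assuming a store product map and an explicit opt-in flag; all names, fields, and products are illustrative:

```python
# Hypothetical guidance step: match a shopping list against the store's product map
# and emit highlight/navigation actions; all names, fields, and products are illustrative.
store_map = {
    "oat milk": {"aisle": 4, "shelf": "B2"},
    "coffee beans": {"aisle": 7, "shelf": "A1"},
}


def guide_shopper(shopping_list, opted_in):
    if not opted_in:                 # respect the shopper's opt-out choices
        return []
    actions = []
    for item in shopping_list:
        location = store_map.get(item)
        if location:
            actions.append({"command": "highlight_shelf", **location})
            actions.append({"command": "route_to", "aisle": location["aisle"]})
    return actions
```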

6        Immersive art experience

6.1       Purpose

Define interfaces and components to enhance magical Environments created by skilled artists to provide each user with a unique interactive experience including the ability to modify the environment per their personal style and preferences.

6.2       Description

Immersive art experiences such as Immersive Van Gogh provide visitors with a visually and aurally immersive experience, often based on the work of a specific artist. These are typically passive walk-through and sit-down experiences. The addition of AI to these Environments allows numerous enhancements including the recognition of individual visitors, allowing them to interact with and modify these environments based on pre-selected preferences and style choices. AI style transfer allows the featured artist’s style to be applied to unique visitor interactions which might include AI voice or text-based image diffusion, gesture-based interactions, proximity effects and more. The addition of AR glasses allows visitors to experience, create and interact with “holograms” within the Environment. Biometric wearables allow the AI to monitor and adjust the multisensory experience to maximize target brain/nervous system states related to well-being, restorative states and more. The XR Venue model also allows visitors in the RW and VW to interact.

7        DJ/VJ performance at a dance party

7.1       Purpose

Define interfaces and components to enhance the overall experience within a nightclub, lounge, or dance party Environment. The goal is to empower the DJ/VJ to create and control entertaining immersive and interactive experiences that reduce social inhibitions, encourage play, invoke a greater sense of togetherness, encourage personal connections, evoke altered states of consciousness, amplify users’ self-expression, and generally create a highly pro-social experience for participants.

7.2       Description

Dance parties, lounges, clubs, and electronic music festivals use powerful visuals, sound, and other effects to captivate participants. The DJ (disc jockey) mixes audio tracks, energizes the crowd, and is central to the experience. However, the visual artist or VJ (video jockey) is also an important contributor, often supported by lighting, laser, and effects operators, dancers, performers, and more. Quite often these venues offer peripheral activities as well to further engage participants off the dance floor, including interactive screens, spatial art, and vendors offering costumes and LED accessories. These venues can be thought of as play spaces. Pro-social intoxicants such as alcohol are sometimes used to lower inhibitions that would otherwise limit social connections. XR Venues can supercharge the dance party experience by providing powerful immersive visuals and by including VW participants. Assisted by AI, all music, visuals, lights, and effects can be controlled by a single DJ (or immersive jockey) using gestures, simple control surfaces, vocal commands, and the like. In addition, expanded peripheral activities for deeper engagement might include immersive visuals that respond to emergent crowd behaviours, “photonic go-go booths” that modulate immersive visuals to amplify the creative expression of dancers’ movements, and AI-based matchmaking that fosters connections between like-minded attendees.

8        Live concert performance

8.1       Purpose

Define interfaces and components to enhance live musical concerts with AI-driven visuals and special effects and allow enhanced audience participation while extending concert performances into the metaverse.

8.2       Description

Similar to live theatrical stage performances, musical concerts – whether orchestral or popular music – are increasingly using visuals and other effects to enhance the audience experience. A band or orchestral musicians on stage can be substantially enhanced by video projections from a live VJ, audio-responsive visuals, image magnification from cameras, and other effects. In addition, skilful live mixing of audio is critical to the audience experience, but it is complicated by the architectural properties of the physical venue. AI can dynamically optimize the listening experience and allow tight synchronization of visuals with spontaneous musical performances, in addition to optimizing the VW experience for remote attendees.

9        Experiential marketing/branding

9.1       Purpose

Define interfaces and components to enhance a wide range of experiences in support of corporate branding.

9.2       Description

Wherever large numbers of people gather, advertisers and corporate brands seek visibility. Experiential marketing goes beyond simple advertising or signage by offering memorable experiences to attendees. Experiential marketing often makes use of pop-up venues or storefronts co-located at festivals, sporting events, concerts, and more. Digital interactive or immersive experiences are increasingly employed, often incorporating branded story-worlds or iconic brand elements. The XRV allows delivery of a unique experience to each participant and deeper engagement to build brand loyalty. In addition, the experience can be extended into the VW to reach a larger number of attendees.

10    Meetings/presentations

10.1   Purpose

Define interfaces and components to enhance live presentations and dialog, both in RW and VW, using rich multimedia, dialog mapping, AI-based mediation and fact checking.

10.2   Description

Meetings and presentations are increasingly hybrid, including both live and virtual attendees and allowing the sharing of rich multimedia content including documents, videos, and website links. Use of an XRV for presentations and especially dialog – including political discourse – presents an opportunity for AI to monitor, track, organize, and summarize large amounts of data in real time, look past hyperbole, and guide the conversation toward rapid convergence on positive outcomes. Real-time fact-finding/fact-checking, dialog mapping (creating a logical tree showing relationships and dependencies between various points raised), group polling, and other advanced methods can be employed in an XRV to guide dialog or facilitate presentations.
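As an illustrative sketch of the dialog-mapping structure mentioned above – a tree in which each point raised supports, rebuts, or questions another point – with all class and field names being assumptions:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DialogNode:
    """One point raised in a discussion and its relation to the point it responds to."""
    speaker: str
    statement: str
    relation: str = "raises"                       # e.g., "supports", "rebuts", "questions"
    children: List["DialogNode"] = field(default_factory=list)

    def add(self, node: "DialogNode") -> "DialogNode":
        self.children.append(node)
        return node


# Building a small dialog map as a discussion unfolds; content is illustrative.
root = DialogNode("moderator", "Should the venue adopt hybrid attendance?")
pro = root.add(DialogNode("A", "Hybrid widens participation.", "supports"))
pro.add(DialogNode("B", "Remote attendees engage less.", "rebuts"))
```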

 

[1] https://info.vercator.com/blog/what-are-the-most-common-3d-point-cloud-file-formats-and-how-to-solve-interoperability-issues