Moving Picture, Audio and Data Coding
by Artificial Intelligence

Archives: 2022-01-26

MPAI issues three Calls for Technologies and publishes five standards for Community Comments

Geneva, Switzerland – 23 August 2023. Today, the international, non-profit, and unaffiliated Moving Picture, Audio and Data Coding by Artificial Intelligence (MPAI) organisation developing AI-based data coding standards has concluded its 35th General Assembly (MPAI-35) approving the publication of three Calls for Technologies and five Technical Specifications with a request for Community Comments. The table gives links to the documents, the dates of the online presentations with their registration links, and the deadlines for submitting responses to the Calls and comments on the Technical Specifications.

Call for Technologies | Link | Presentation | Deadline
AI for Health Data (AIH) | X | Sep 08 at 08 & 15 | Oct 19 23:59
Object and Scene Description (OSD) | X | Sep 07 at 09 & 16 | Sep 20 23:59
XR Venues – Live Theatrical Stage Performance (XRV) | X | Sep 12 at 07 & 17 | Nov 20 23:59

Standard for Community Comments | Link | Presentation | Deadline
AI Framework (AIF) V2 | X | Sep 11 at 08 & 15 | Sep 24 23:59
Avatar Representation and Animation (ARA) | X | Sep 07 at 08 & 15 | Sep 27 23:59
Connected Autonomous Vehicles – Architecture (CAV) | X | Sep 06 at 08 & 15 | Sep 26 23:59
Multimodal Conversation (MMC) V2 | X | Sep 05 at 08 & 15 | Sep 25 23:59
MPAI Metaverse Model – Architecture (MMM) | X | Sep 01 at 08 & 15 | Sep 21 23:59

Additional information about the purpose of the projects can be found here.
Anybody may respond to any of the three Calls for Technologies. However, non-members should join MPAI to participate in the development of the relevant standards.
Anybody can make comments on the Technical Specifications published with a request for Community Comments.
MPAI is continuing its work plan that includes the development of the following Technical Specifications:

  • AIF-DC, the group in charge of AI Framework (MPAI-AIF), is now working on the review of comments made on MPAI-AIF V2, developing the reference software and drafting the conformance testing.
  • Requirements (ARA), the group in charge of Avatar Representation and Animation (MPAI-ARA), is now working on the review of comments made on MPAI-ARA, developing the reference software and drafting the conformance testing.
  • MMC-DC, the group in charge of Multimodal Conversation (MPAI-MMC), is now working on the review of comments made on MPAI-MMC V2, developing the reference software and drafting the conformance testing.
  • Requirements (MMM), the group in charge of MPAI Metaverse Model (MPAI-MMM) – Architecture, is now working on the review of comments made on MPAI-MMM – Architecture, developing the reference software and drafting the conformance testing.

The MPAI work plan also includes exploratory activities, some of which are close to becoming standard or technical report projects:

  • AI Health (MPAI-AIH). Targets an architecture where smartphones store users’ health data processed using AI and AI Models are updated using Federated Learning.
  • End-to-End Video Coding (MPAI-EEV). Extends the video coding frontiers using AI-based End-to-End Video coding.
  • AI-Enhanced Video Coding (MPAI-EVC). Improves existing video coding with AI tools for short-to-medium term applications.
  • Server-based Predictive Multiplayer Gaming (MPAI-SPG). Uses AI to train neural networks that help an online gaming server to compensate for data losses and detect false data.
  • XR Venues (MPAI-XRV). Identifies common AI Modules used across various XR-enabled and AI-enhanced use cases where venues may be both real and virtual.

Legal entities and representatives of academic departments supporting the MPAI mission and able to contribute to the development of standards for the efficient use of data can become MPAI members.
Please visit the MPAI website, contact the MPAI secretariat for specific information, subscribe to the MPAI Newsletter and follow MPAI on social media: LinkedIn, Twitter, Facebook, Instagram, and YouTube.


Do we need standards for Connected Autonomous Vehicles?

Enabling individuals or groups of people to move independently has been a major achievement that has changed human life for the better. Motor vehicles, however, have also had a number of negative consequences: accidents causing damage, injuries, and deaths; congestion on the roads; millions of cars carrying a single person for a couple of hours and then sitting unused; air pollution; the worsening of urban environments; and more.

Connected autonomous vehicles (CAV) have the potential to replace human error with a rate of machine errors that is orders of magnitude lower, optimise the use of vehicles and infrastructure, give more time to human brains for rewarding activities, optimise traffic management, reduce congestion and pollution, help elderly or disabled people to have a better life, and more.

Much has been happening since the first 1939 attempt at creating an autonomous vehicle. Today CAVs are technically feasible, and prototypes are driving on public roads and streets. The Society of Automotive Engineers in the USA has published a classification of autonomous vehicles based on levels.

Should we just wait for the industry to produce higher SAE-Level vehicles until one day we will only see CAVs around us? This is an option, but not necessarily the one that will let us reach the CAV holy grail in the most efficient and timely way.

Some 35 years ago, most public authorities, “owners” of their countries’ VHF and UHF bands, realised that digital television would allow them to keep their cherished terrestrial television service while getting a “digital dividend” in the form of VHF and UHF slots that they could re-assign to other purposes. Especially in the United States, digital television was a national goal and steps were taken to implement it. Some enlightened people understood the value of a global digital television standard (MPEG-2) and things simply “happened”, not just for terrestrial, but also for satellite and cable television, and packaged media as well.

Of course, cars are not television sets, but the game-changing role of standards can be the same. Standards can convert today’s niche market of CAVs (if we can call it a “market”) into a mass market. They can accelerate the availability of technology, promote competition, yield better and cheaper products, assuage consumer concerns, and provide tools for regulation.

Artificial Intelligence (AI) is the technology that can provide the solutions we need. MPAI can provide AI-based standards that are explainable.

MPAI intends to publish a standard called Connected Autonomous Vehicle (MPAI-CAV) – Architecture. This will enable component manufacturers to put their standard components on the market and car manufacturers to access an open global market of components with standard functions and interfaces that can be tested for conformance using standard procedures.

Register for one of the two online presentations on 26 July at 8 UTC and 15 UTC or read an overview of MPAI-CAV – Architecture.


MPAI issues Call for Technologies: Connected Autonomous Vehicle – Architecture

Geneva, Switzerland – 12 July 2023. Today, the international, non-profit, and unaffiliated Moving Picture, Audio and Data Coding by Artificial Intelligence (MPAI) organisation developing AI-based data coding standards has concluded its 34th General Assembly (MPAI-34) approving the Call for Technologies: Connected Autonomous Vehicle (MPAI-CAV) – Architecture. Two online presentations of the Call will be made on 26 July at 8 and 15 UTC. Responses are due by 15 August.

The goal of the MPAI-CAV standard is to promote the development of a CAV industry by specifying components that can be easily integrated into larger subsystems. To achieve this goal, MPAI intends to develop the MPAI-CAV standard as a series of standards each adding more details to enhance CAV component interoperability. The first issue, MPAI-CAV – Architecture, to be developed using the results of the Call, aims to partition CAVs into subsystems and to further partition those subsystems into components. Both subsystems and components are identified by their function and interfaces, i.e., data exchanged between subsystems and components.

Three documents are attached to the Call: the first is Use Cases and Functional Requirements. It includes an initial set of Functionalities that the Architecture should provide.

The second document is the Framework Licence designed to facilitate the timely access to IP that is essential to implement the planned MPAI-CAV – Architecture standard. Finally, the third document is a Template for responses that respondents to the Call may wish to use in their responses.

Anybody may respond to the Call. However, non-members should join MPAI to participate in the development of the MPAI-CAV – Architecture standard.

MPAI is continuing its work plan comprising the development of the following Technical Specifications:

  1. The AI Framework (MPAI-AIF) V2 Technical Specification will enable an implementer to establish a secure AIF environment to execute AI Workflows (AIW) composed of AI Modules (AIM).
  2. The Avatar Representation and Animation (MPAI-ARA) V1 Technical Specification will support creation and animation of interoperable human-like avatar models able to understand and express a Personal Status.
  3. The Multimodal Conversation (MPAI-MMC) V2 Technical Specification will generalise the notion of Emotion by adding Cognitive State and Social Attitude and specify a new data type called Personal Status.
  4. The MPAI Metaverse Model (MPAI-MMM) – Architecture V1 Technical Specification will specify the Operation Model and its components Actions, Items, and Data Types.

The MPAI work plan also includes exploratory activities, some of which are close to becoming standard or technical report projects:

  1. AI Health (MPAI-AIH). Targets an architecture where smartphones store users’ health data processed using AI and AI Models are updated using Federated Learning.
  2. End-to-End Video Coding (MPAI-EEV). Extends the video coding frontiers using AI-based End-to-End Video coding.
  3. AI-Enhanced Video Coding (MPAI-EVC). Improves existing video coding with AI tools for short-to-medium term applications.
  4. Server-based Predictive Multiplayer Gaming (MPAI-SPG). Uses AI to train neural networks that help an online gaming server to compensate for data losses and detect false data.
  5. XR Venues (MPAI-XRV). Identifies common AI Modules used across various XR-enabled and AI-enhanced use cases where venues may be both real and virtual.

Legal entities and representatives of academic departments supporting the MPAI mission and able to contribute to the development of standards for the efficient use of data can become MPAI members.

Please visit the MPAI website, contact the MPAI secretariat for specific information, subscribe to the MPAI Newsletter and follow MPAI on social media: LinkedIn, Twitter, Facebook, Instagram, and YouTube.

 

 


An Introduction to the MPAI Metaverse Model Architecture – Part III

In parts I and II of this series of posts, we have highlighted:

  1. The basic elements that enable operation of an M-Instance, especially Processes, Items, Actions and Data Type. In particular, Processes can take the shape of a User (a representative of a human in an M-Instance), a Device (to enable the connection of an M-Instance with the real world, called Universe), and a Service.
  2. The functional requirements of an initial list of Actions that enable a Process to do useful things in an M-Instance.

We are now going to identify the functional requirements of an initial list of Items that enable a Process to do useful things in an M-Instance. For convenience, Items will be grouped in classes.

Remember the online presentations at 8 and 15 UTC on 23 June, where you can learn more about the MPAI Metaverse Model Architecture and the Call for Technologies. Register here for the first and here for the second presentation.

Here are Items with a general applicability.

Functional requirements Item
An M-Instance is an abstract entity bearing an Identifier. An M-Instance may expose its Capabilities. M-Instance
An M-Environment is an abstract entity bearing an Identifier.

1.      An M-Environment is hosted by an M-Instance.

2.      An M-Environment may expose its Capabilities.

3.      The Capabilities of an M-Environment may extend the Capabilities of its hosting M-Instance.

M-Environment
An Item or a Process shall bear Identifiers in such a way that:

1.      An Identifier uniquely references an Item or Process.

2.      An Item can have more than one Identifier.

An Item may have a hierarchical structure, such as:

Item: M-InstanceID, M-EnvironmentID, M-LocationID, ItemID.

Process: M-InstanceID, M-EnvironmentID, ProcessID.

Identifier
With the Rights Item we can express the Actions that a Process can perform on Items, at M-/U-Locations, during a period, e.g.,

Action1 Item1 Location1 T11-T12
Action2 Item2 Location2 T21-T22
Rights
Program is Data (and Metadata) that can be executed.

A Program Item shall be executable in the M-Instance.

A Program Item may be subject to certification before being admitted to an M-Instance

Program
Contract is a special Program that can be activated (Executed) by an external entity, e.g., a User or another already activated Contract. The contract shall include:

1.      Offer: Rights.

2.      Acceptance: By both parties.

3.      Consideration: There may be a Transaction.

The terms of the Contract are enforced in the jurisdiction of the M-Instance.

Contract
An M-Instance/M-Environment may show its Capabilities, i.e., an Item describing the characteristics of an M-Instance/M-Environment, including:

1.      Currencies supported.

2.      Items supported with Data Formats.

3.      Data Types supported.

Capabilities
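
As a rough illustration of the Identifier and Rights requirements above, the following Python sketch models a hierarchical Item Identifier and a Rights Item as a list of (Action, Item, Location, period) entries. The class names, the "/" separator and the numeric timestamps are assumptions made for the example; none of them is defined by the MPAI-MMM requirements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemIdentifier:
    """Hierarchical Identifier: M-Instance, M-Environment, M-Location, Item."""
    m_instance_id: str
    m_environment_id: str
    m_location_id: str
    item_id: str

    def __str__(self) -> str:
        # The "/" separator is an assumption of this sketch.
        return "/".join((self.m_instance_id, self.m_environment_id,
                         self.m_location_id, self.item_id))


@dataclass(frozen=True)
class Right:
    """One Rights entry: an Action on an Item at a Location during a period."""
    action: str
    item: ItemIdentifier
    location_id: str
    start: float  # beginning of the period (T11 in the text)
    end: float    # end of the period (T12 in the text)

    def permits(self, action: str, item: ItemIdentifier,
                location_id: str, time: float) -> bool:
        return (self.action == action and self.item == item
                and self.location_id == location_id
                and self.start <= time <= self.end)


def has_right(rights: list[Right], action: str, item: ItemIdentifier,
              location_id: str, time: float) -> bool:
    """True if any entry of the Rights Item permits the Action."""
    return any(r.permits(action, item, location_id, time) for r in rights)


# Hypothetical identifiers, used only to exercise the sketch.
chair = ItemIdentifier("inst1", "env1", "loc1", "chair42")
rights = [Right("MM-Embed", chair, "loc1", start=0.0, end=100.0)]
```

With these definitions, has_right(rights, "MM-Embed", chair, "loc1", 50.0) holds, while the same query at time 150.0 falls outside the period and fails.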

Here are Items related to the interaction between Processes.

Functional requirements Item
Processes may need to exchange application-level Messages. Message
A Process should be able to expose its Capabilities, i.e., an Item containing a description of its characteristics including:

1.        List of Actions that can be performed.

2.        List of Items supported with Data Formats.

3.        List of Data Types supported.

4.        The cost of performing an Action.

5.        Human represented (User)

6.        Apps on board (Device).

Capabilities
When a Process requests another Process to perform an Action on its behalf, it issues a Request-Action, an Item including:

1.      Time the Request-Action was issued.

2.      The Source ProcessID.

3.      The Destination ProcessID.

4.      The Action requested.

5.      The ItemIDs relevant to the Action.

6.      The Location of the Items.

7.      The Location of the output Items produced by the Request-Action.

8.      The requested Rights on the output Items.

Request-Action
When a Process has received a Request-Action and succeeds in performing it, it provides a Response-Action, an Item containing:

1.      Time the Response-Action was issued.

2.      The Source ProcessID (Source refers to the Process that issued the request).

3.      The Destination ProcessID.

4.      The output Items produced by performing the Request-Action.

Response-Action
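
The Request-Action and Response-Action Items listed above map naturally onto two records. The Python sketch below is one possible shape, under the assumption that IDs, Locations and Rights can be carried as plain strings; the field names and the respond helper are hypothetical.

```python
import time
from dataclasses import dataclass, field


@dataclass
class RequestAction:
    time_issued: float
    source_process_id: str
    destination_process_id: str
    action: str
    item_ids: list[str]
    item_location: str
    output_location: str
    requested_rights: list[str] = field(default_factory=list)


@dataclass
class ResponseAction:
    time_issued: float
    source_process_id: str       # the Process that issued the request
    destination_process_id: str
    output_items: list[str]


def respond(request: RequestAction, output_items: list[str]) -> ResponseAction:
    """Build the Response-Action for a successfully performed Request-Action."""
    return ResponseAction(
        time_issued=time.time(),
        # Source keeps referring to the requesting Process, as stated above.
        source_process_id=request.source_process_id,
        destination_process_id=request.destination_process_id,
        output_items=output_items,
    )


# Hypothetical exchange: a User asks a Service to Modify an Item.
request = RequestAction(
    time_issued=0.0,
    source_process_id="user1",
    destination_process_id="service1",
    action="Modify",
    item_ids=["item7"],
    item_location="loc1",
    output_location="loc2",
    requested_rights=["MM-Embed"],
)
response = respond(request, ["item8"])
```

The sketch only covers the success case described above; how a failure is reported is left open here.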

Here are Items related to the use of an M-Instance.

Functional requirements Item
Account is an Item that uniquely references a human who has Registered.

A User may have more than one Account with one or more M-Instances or M-Environments.

An Account shall include:

1.      The ID of the Registered human.

2.      An M-Instance-specific subset of the Registered human’s User Data.

3.      The Rights held by each User in the M-Instance/M-Environment.

4.      The IDs of Devices, Apps, Users, and Personae.

5.      The validity of:

5.1.   Rights.

5.2.   Account.

Account
Activity Data is an Item containing the record of the Actions made by a User at all M-Locations for a period. Therefore, Activity Data shall include a list of Activities and, for each Activity:

1.      The M-LocationID the Activity refers to.

2.      The duration (t1-t2) the Activity refers to.

3.      The list of Actions.

Activity Data
Personal Profile is an Item containing the Data about the human represented by a User. It may include:

1.      First Name

2.      Last Name

3.      Address

4.      Country

5.      Age

6.      Biometric data

7.     

Personal Profile
The Manager of an M-Instance sets Rules, an Item expressing the terms and conditions under which Processes operate in the M-Instance. The Rules may express:

1.      The ability of a User to perform Actions on Items for which it has Rights.

2.      The inability of a User to perform Actions on Items for which it has no Rights.

3.      The duty of a User to perform Actions on Items.

4.      The ability of a User to make Transactions on the Rights of Items.

Rules
Social Graph is a representation of a User’s network of connections with Items and Processes representing the following:

1.      The types and the connections with Items and their M-Locations.

2.      The types and the connections with Devices (frequency of use, etc.).

3.      The types and the connections with Services (frequency of use, etc.).

4.      The types and the connections with Users, groups of Users in terms of:

4.1.   Time

4.2.   M-Locations.

4.3.   Declared purpose.

Social Graph
User Data is an Item that collects all the Data related to a human and their Users:

1.      Rights held by the human’s Users in the M-Instance.

2.      The Personal Profile of the human.

3.      The Personae that the human’s Users impersonate.

4.      The Activity Data of the human’s Users.

5.      The Social Graphs of the human’s Users.

User Data should have a representation that allows easy identification, extraction, and sharing of subsets of User Data.

User Data
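
To make the Activity Data requirement above more concrete, here is a minimal Python sketch of a record of Activities and of the extraction of a subset by M-Location, the kind of operation that the User Data requirement on subsets calls for. The class and function names are assumptions, and times are plain numbers.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    m_location_id: str
    start: float        # t1
    end: float          # t2
    actions: list[str]  # Actions made during (t1, t2)


def actions_at(activity_data: list[Activity], m_location_id: str) -> list[str]:
    """Collect, in recorded order, the Actions made at one M-Location."""
    return [action
            for activity in activity_data
            if activity.m_location_id == m_location_id
            for action in activity.actions]


# Hypothetical Activity Data of one User.
activity_data = [
    Activity("loc1", 0.0, 10.0, ["MM-Add", "MM-Enable"]),
    Activity("loc2", 10.0, 20.0, ["Discover"]),
    Activity("loc1", 20.0, 30.0, ["MM-Disable"]),
]
```

Here actions_at(activity_data, "loc1") returns the three Actions made at loc1 and skips the Discover made at loc2.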

Here are Items with a financial impact.

Functional requirements Item
An Item that may be the object of a Transaction is called Asset. An Asset may be:

1.        MM-Embedded at an M-Location.

2.        Posted to a Service.

An Asset shall:

1.      Preserve the Data Formats of the Item that has spawned it.

2.      Include the date it was created.

Asset
It is useful to consider the Ledger associated with a specific Asset. This Item includes the list of all Transactions executed:

1.      On an Asset.

2.      Starting from the first Transaction and including the last.

3.      The Marketplace on which a Transaction was performed.

Ledger
The Provenance Item shall include the list of all Transactions executed:

1.      On an Asset.

2.      Starting from the first Transaction and including the last.

3.      The Marketplace on which a Transaction was performed.

Provenance
Transaction is an Item representing the changed state of:

1.        The Rights on an Asset held by a seller User and a buyer User.

2.        The Accounts of the Users and of the Service facilitating/enabling the Transaction (Optional).

The Transaction shall represent:

1.      The Time the Transaction is performed.

2.      The Value moved into the Wallet of User1 (seller).

3.      The Value moved from the Wallet of User2 (buyer).

4.      The Value moved into the Wallet of User3 (service) – optional.

5.      The Time the Values were moved.

6.      The Rights to Act owned by User1 after Time.

7.      The Rights to Act owned by User2 after Time.

Transaction
Value is expressed by an Amount and the Currency related to the Amount. It shall have a representation that enables the expression of the Amount and the Currency used to represent the Amount.

Value
A Wallet is a container of Currency units. A Wallet shall enable the representation of:

1.      The Amount of each Currency contained in the Wallet.

2.      The Transactions performed.

Wallet
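
The interplay of Value, Wallet and Transaction described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the MPAI-MMM data model: the names are hypothetical, the optional Service is paid a fee deducted from the seller's proceeds, all Values use a single Currency, and the change of Rights on the Asset is not modelled.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Value:
    amount: Decimal
    currency: str


@dataclass(frozen=True)
class Transaction:
    time: float
    to_seller: Value                    # Value moved into the seller's Wallet
    from_buyer: Value                   # Value moved from the buyer's Wallet
    to_service: Optional[Value] = None  # Value moved to the Service (optional)


@dataclass
class Wallet:
    amounts: dict[str, Decimal] = field(default_factory=dict)  # per Currency
    transactions: list[Transaction] = field(default_factory=list)

    def balance(self, currency: str) -> Decimal:
        return self.amounts.get(currency, Decimal("0"))


def transact(seller: Wallet, buyer: Wallet, price: Value, time: float,
             service: Optional[Wallet] = None,
             fee: Decimal = Decimal("0")) -> Transaction:
    """Move `price` from buyer to seller; an optional fee goes to the Service."""
    if service is None:
        fee = Decimal("0")  # no Service involved, so no fee is retained
    cur = price.currency
    if buyer.balance(cur) < price.amount:
        raise ValueError("insufficient Amount in the buyer's Wallet")
    buyer.amounts[cur] = buyer.balance(cur) - price.amount
    seller.amounts[cur] = seller.balance(cur) + price.amount - fee
    to_service = None
    if service is not None:
        service.amounts[cur] = service.balance(cur) + fee
        to_service = Value(fee, cur)
    tx = Transaction(time, Value(price.amount - fee, cur), price, to_service)
    # Each Wallet involved keeps a record of the Transaction.
    for wallet in (w for w in (seller, buyer, service) if w is not None):
        wallet.transactions.append(tx)
    return tx


# Hypothetical sale of an Asset for 30 EUR with a 3 EUR Service fee.
seller, buyer, market = Wallet(), Wallet({"EUR": Decimal("100")}), Wallet()
tx = transact(seller, buyer, Value(Decimal("30"), "EUR"), time=1.0,
              service=market, fee=Decimal("3"))
```

Decimal is used rather than float so that Amounts are not affected by binary rounding.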

Here are Items specifically used to access a group of Services.

Functional requirements Item
To Authenticate an Entity (an Item that can be perceived), a special Item called AuthenticateIn is produced. This contains:

1.      The (ID of the) Entity Authenticated.

2.      (Optionally) information related to the way AuthenticateOut is rendered.

The Entity to be Authenticated can be:

1.      Speech produced by a User.

2.      The visual appearance of a User, etc.

Information on the rendering of AuthenticateOut is provided by:

1.      Media type (text, speech, image, etc.) used for rendering.

2.      Spatial Attitude of the Object rendering AuthenticateOut.

AuthenticateIn
AuthenticateOut is the Item containing the result of the Service Acting on the Request-Authenticate Item and information about its rendering. It is rendered as requested in AuthenticateIn. AuthenticateOut
To Discover Items, an Item called DiscoverIn is produced that contains:

1.      A description of the Items to be Discovered.

2.      Information related to the rendering of DiscoverOut.

Items candidate for Discovery may be described by:

1.      Verbal/text description.

2.      Similar Items.

3.      Belonging to specific M-Instances/M-Environments/M-Locations.

4.      Belonging to specific sections of Activity Data.

Information on DiscoverOut Rendering may be provided by:

1.      Media type used for rendering.

2.      Spatial Attitude of the Object rendering DiscoverOut.

DiscoverIn
DiscoverOut is the Item containing the result of the Service Acting on the Request-Discover Item and information about its rendering. It is rendered as requested in DiscoverIn. DiscoverOut
To obtain information on an Item, a User produces InformIn, an Item containing:

1.      A description of the Item about which information is requested.

2.      Information related to the rendering of InformOut.

InformIn may refer to:

1.      Item Metadata

2.      Any other information that a Service may have on the Item.

Information on rendering of InformOut may be provided by:

1.      Media type used for rendering.

2.      Spatial Attitude of the InformOut rendered Object.

InformIn
InformOut is the Item containing the result of the Service Acting on the Request-Inform Item and information about its rendering. It is rendered as requested in InformIn. InformOut
To obtain the interpretation of an Item, a User produces InterpretIn, an Item containing:

1.      The Item to be Interpreted or its ID.

2.      Information related to the rendering of InterpretOut.

Items candidate for interpretation may be identified by: Item or ItemID.

Information on InterpretOut Rendering may be provided by:

1.      Media type used for rendering.

2.      Spatial Attitude of InterpretOut rendered Object.

InterpretIn
InterpretOut is the Item containing the result of the Service Acting on the Request-Interpret Item and information about its rendering. It is rendered as requested in InterpretIn. InterpretOut

Here are Items producing a perceptible experience.

Functional requirements Item
An Entity is an Item that can be perceived. MPAI introduces the following perceptible Items: Object, Model, Scene, Event, and Experience. Entity
It is useful to introduce Event, an Entity that includes selected Entities at an M-Location and their Animations during a period. Therefore, an Event shall include:

1.      M-LocationID.

2.      Start Time and End Time.

3.      List of Entities, their Animations, and Interactions.

Event
It is also useful to introduce the Entity Experience, comprising selected Entities of an Event and User Interactions with the Entities of the Event. Therefore, an Experience shall include:

1.      Start Time and End Time

2.      EventID

3.      List of selected Entities, their Animations, and User Interactions.

Experience
Object is an Entity representing an object including:

1.      The type(s) of Media (Audio-Visual-Haptic) composing the Object.

2.      The Data representation.

3.      The Data Format used.

Object
Model is an Object representing an object with its features ready to be MM-Animated or UM-Animated. Model
Persona is a Model representing a human. Persona
Scene is a composition of Objects with the following features:

1.      May be hierarchical.

2.      May be MM-Embedded at a specified M-Location.

3.      Represent Objects:

3.1.   With a Spatial Attitude.

3.2.   Animated by a stream or by an autonomous agent.

Scene
A Stream is an Item made by a continuous flow of Data with the following features:

1.      May be scalable in space and time.

2.      May be used to:

2.1.   Animate a Model.

2.2.   Represent a Digitised Object in an M-Instance.

Stream
Interaction is an Item containing the Request-Action issued by a User on an Entity at an M-Location and the corresponding Time. Interaction
Map is the basic Item of an AR application. It is an Item containing a structure establishing a correspondence between U-Locations and M-Locations. Therefore, a Map shall include:

1.      The M-Instance the Map refers to.

2.      For each U-Location having one correspondence with an M-Location:

2.1.   The ID of the M-Location corresponding to the U-LocationID.

2.2.   Metadata related to the U-LocationID.

2.3.   Metadata related to the M-LocationID.

Map
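
The Map Item described above is essentially a lookup table. As a sketch only, assuming string IDs and free-form Metadata dictionaries (the names MapEntry, to_m_location and to_u_locations are hypothetical), it could look like this in Python:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MapEntry:
    m_location_id: str                              # M-Location matching the U-Location
    u_metadata: dict = field(default_factory=dict)  # Metadata of the U-Location
    m_metadata: dict = field(default_factory=dict)  # Metadata of the M-Location


@dataclass
class Map:
    m_instance_id: str  # the M-Instance the Map refers to
    entries: dict[str, MapEntry] = field(default_factory=dict)  # keyed by U-LocationID

    def to_m_location(self, u_location_id: str) -> Optional[str]:
        """Return the M-LocationID for a U-LocationID, or None if unmapped."""
        entry = self.entries.get(u_location_id)
        return entry.m_location_id if entry else None

    def to_u_locations(self, m_location_id: str) -> list[str]:
        """Return every U-LocationID mapped to the given M-LocationID."""
        return [u for u, e in self.entries.items()
                if e.m_location_id == m_location_id]


# Hypothetical Map with a single correspondence.
ar_map = Map("inst1", {"u:plaza": MapEntry("m:plaza", {"city": "Geneva"})})
```

Keying the entries by U-LocationID reflects the requirement that each U-Location has one corresponding M-Location, while several U-Locations may still point to the same M-Location.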

Here are Items with a spatial impact.

Functional requirements Item
M-Location is an Identifiable delimited spatial portion of an M-Instance, e.g., the place occupied by the representation of a human. An M-Location:

1.      Shall define the space of the M-Instance belonging to the M-Location.

2.      May enable the creation of sub-spaces defining sub-M-Locations

M-Location
U-Location is an Identifiable delimited spatial portion of the Universe, e.g., the place occupied by the human. A U-Location shall:

1.      Define the space in the Universe belonging to the U-Location.

2.      Enable the definition of sub-spaces (sub-U-Locations) comprised in the U-Location.

The enforcement of Rights to a U-Location is not intended to be part of the MPAI-MMM Architecture.

U-Location

Of course, more Items can be identified but those introduced above have been tested to cover a significantly large number of use cases in a variety of application domains.


An Introduction to the MPAI Metaverse Model Architecture – Part II

In part I of this series of posts, we have highlighted the basic elements that enable operation of an M-Instance, especially Processes, Items, Actions and Data Type. In particular, Processes can take the shape of a User (a representative of a human in an M-Instance), a Device (to enable the connection of an M-Instance with the real world, called Universe), and a Service.

We are now going to identify the functional requirements of an initial list of Actions that enable a Process to do useful things in an M-Instance.

Functional requirements Actions
To Register with an M-Instance or M-Environment. This Action may only be performed by a human or a legal entity, not by a User. Register (human)
To transmit a Request-Action to a Process. The Request-Action should contain: the Action, the Items involved in the Action, the location where the necessary input Items are located, the location where the produced output Items are located, and the Rights that the requesting Process wishes to hold to be able to perform Actions on the Items produced. Request (Request-Action)
To transmit a Response-Action to a Process. The Response-Action should contain the output Items or an error message. Respond (Response-Action)
To enable a User to increase or diminish the Rights of a Process, e.g., because new Rights have been acquired or because a User has not complied with the Rules, we need the Action Change (Rights of a Process)
To confirm that the speech or the face of a human or an object imported into an M-Instance is from a specific human or U-Location (place in the real world). Note: The User requesting Authentication may also request Rights to use the information received, e.g., to publish it. Authenticate (Item)
To make an Item no longer accessible by all Processes (assuming that Rights have not been irrevocably granted to a Process). Note: The Item may be made accessible again depending on the Rights of the User Hiding the Item. Hide (Item)
To create an Item out of Data and Metadata. For instance, a Device may capture Media as Data subject to certain Rights for use in an M-Instance. Identify converts them to an Item usable in the M-Instance because an M-Instance can only Act on Items. Identify (Item)
To create a new Item by modifying an original Item with new or partially new Data and Metadata. For instance, a User with Rights on an Item may wish to clone and then modify the components of an existing Item. Modify (Item)
To author an Item by calling a Service and providing it with Data and Metadata. Note: An M-Instance can provide a Service, internal or external to the M-Instance, that Users can call to create Items for use in the M-Instance. Author (Item)
To find Items by giving a description of the Items. Note: An M-Instance can provide a Service that Users can call to find Items or Processes they need. Alternatively, the M-Instance may allow a User to call an external Service to find Items of interest also outside of the M-Instance. Discover (Item)
To inform about an Item. A User may wish to know more about an Item, starting from its Metadata but potentially including other information that a Service has collected on the Item. Inform (Item)
To interpret an Item. For instance, a User may need the translation of an utterance produced by an avatar, the recognition of the face of an avatar, or the conversion of its own message expressed in sign language into a speech segment. Interpret (Item)
To display an Asset. For instance, a User may wish to manifest its intention to surrender (part of) its Rights on an Asset. This can be done by placing the Asset at an M-Location that other Users can see or by posting it to a marketplace. Post (Asset)
To make a Transaction of an object. A User may like to surrender (part of) the Rights to an Asset to another User, possibly recognising the facilitation role of a Service in the Transaction. At the end of the Transaction the parties making the Transaction have different Rights on the Asset and the status of their Wallets may have changed. Transact (Asset)
To place an Entity (an Item that can be perceived) at an M-Location in such a way that other Users may not perceive it. This may be useful when the User needs to add more Entities to the M-Location without showing the preparations. MM-Add (Entity)
To make perceptible an Entity that was not perceptible until that moment, e.g., because the User did not want to show the Entity before a given time. MM-Enable (Entity)
To stop making an Entity perceptible, e.g., because the User does not want to show the setting of an event when the event is over. MM-Disable (Entity)
To transmit an Item to a Process. This is done, e.g., when a Device sends Data and Metadata coming from the Universe to a Service that Identifies the Item created from Data and Metadata, when a User captures an Entity at an M-Location (i.e., it asks the Service to MM-Send the Entity) or when a User transmits an Entity to a Device for rendering in the Universe. MM-Send (Item or Data and Metadata)
To activate a Contract, i.e., a Program and its Metadata stored on a Device and activated by an external entity, e.g., a User or another activated Contract. Of course, Contracts may be Executed by an underlying Blockchain. Execute (Contract)
To animate a Model, i.e., an Object in the M-Instance representing an object at a U-Location with its features ready to be animated using a Process that is an autonomous agent. MM-Animate (Item)
To animate a Model using a Process that receives a Stream from a U-Location and animates the Model. The Process may be provided by the M-Instance, the human, or a third-party. UM-Animate
To present Media (i.e., Data and Metadata representing perceptible information) available at a Device to a U-Location as an Entity with an associated Spatial Attitude (i.e., Position and Orientation). For instance, a User may request that a Device present the Media it has received as an Entity from the M-Instance via an MM-Send Action. MU-Actuate (Media)
To present an Entity that is at an M-Location to a U-Location as an Entity with an associated Spatial Attitude. This operation is performed in two steps: MM-Send the Entity to a Device and MU-Actuate the Media from the Device. MU-Render (Entity)
To capture a scene at a U-Location as Media. A User may ask a Device to capture a scene at a U-Location as Media. UM-Capture (scene)
To transmit Data and Metadata to a Process. A User may ask a Device to transmit the Data corresponding to the Media and Device Metadata. UM-Send (Data and Metadata)
To present a scene that is at a U-Location to an M-Location as an Entity with an associated Spatial Attitude. This operation is performed in three steps: the Device captures the scene that is at the U-Location as Media (UM-Capture), then it UM-Sends the Data and Device Metadata to a Service that Identifies the Entity and MM-Embeds it at the M-Location. UM-Render (scene)
To store an Item at an Address, i.e., to transmit an Item to a Device or to store an Item at an Address. MU-Send (Item)
To place a Model at an M-Location, animate it with a Stream, and present the animated Model at a U-Location with an associated Spatial Attitude. In other terms, the round trip real-virtual-real is established. Track (Model)
To verify that a Process has Rights to make an Action on an Item, to preserve the integrity of M-Instance operation. Validate (Process)
To convert an Item of a Request-Action or Response-Action to another Data Format. As for other Services, Convert can be a Service offered by the M-Instance or available outside of the M-Instance. Convert (Item)
To transmit a Request-Action to a Resolution Service to enable a ProcessA in M-InstanceA to communicate with ProcessB in a different M-InstanceB. As for other Services, Resolution can be a Service offered by the M-Instance or available outside of the M-Instance. Request (Request-Action)
To transmit a Response-Action to a Resolution Service that has sent a Request-Action. Respond (Response-Action)

Of course, more Actions can be identified but those introduced above have been tested to cover a significantly large number of use cases in a variety of application domains.
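Several of the Actions above are described as sequences of simpler Actions. The following is a minimal Python sketch of that decomposition, not part of the MPAI documents: `COMPOSITE_ACTIONS` and `expand` are hypothetical names, "Identify" is taken from the wording of the UM-Render description rather than from a listed Action, and the Track entry is one possible reading of its textual description.

```python
# Hypothetical sketch: composite Actions as ordered lists of simpler Actions.
COMPOSITE_ACTIONS = {
    # "MM-Send the Entity to a Device and MU-Actuate the Media from the Device"
    "MU-Render": ["MM-Send", "MU-Actuate"],
    # Capture the scene, send Data and Metadata, Identify the Entity, MM-Embed it
    "UM-Render": ["UM-Capture", "UM-Send", "Identify", "MM-Embed"],
    # Assumed reading of Track: place a Model, animate it, present it at a U-Location
    "Track": ["MM-Add", "UM-Animate", "MU-Render"],
}


def expand(action: str) -> list[str]:
    """Return the basic Actions that make up `action`, in order."""
    steps = COMPOSITE_ACTIONS.get(action)
    if steps is None:
        return [action]  # not composite: already a basic Action
    expanded: list[str] = []
    for step in steps:
        expanded.extend(expand(step))  # composites may contain composites
    return expanded


print(expand("Track"))
```

Here Track expands through MU-Render, so the nested composite is flattened into its basic Actions.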

To know more about the MPAI Metaverse Model Architecture and the Call for Technologies, join the online presentations at 8 and 15 UTC on 23 June. Register here for the first and here for the second presentation.


An Introduction to the MPAI Metaverse Model Architecture – Part I

This is the first of a series of posts that illustrate the Call for Technologies: MPAI Metaverse Model – Architecture, a document inviting interested parties to submit comments to and proposals for Use Cases and Functional Requirements: MPAI Metaverse Model – Architecture. The goal is to facilitate the task of those who wish to contribute to the first “Metaverse Architecture” standard ever attempted by a standards body.

Before starting, let’s clarify why MPAI, a standards body developing standards for AI-based data coding, is engaged in the “metaverse”. The answer is manifold:

  1. The metaverse will require a range of technologies that deal with data and their transformations, i.e., data coding.
  2. The metaverse is thus an excellent source of valuable standard projects.
  3. MPAI already has a few standards that respond to specific metaverse needs.
  4. The metaverse is thus also an excellent technology integration platform.

Note that the word metaverse is used here to mean the “metaverse notion” while Metaverse Instance (M-Instance) indicates a “specific implementation” of the MPAI Metaverse Model Architecture (in the following: MPAI-MMM). An M-Instance is considered as a set of Processes providing some or all of the following functions:

  1. To sense data from U-Locations.
  2. To process the sensed data and produce Data from the sensed data and/or autonomously.
  3. To produce one or more M-Environments populated by Objects that can be either digitised or virtual, the latter with or without autonomy.
  4. To process Objects from the M-Instance or potentially from other M-Instances to affect U- and/or M-Environments using Objects in ways that are:
    • Consistent with the goals set for the M-Instance.
    • Effected within:
      • The capabilities of the M-Instance
      • The Rules set for the M-Instance.

The above gives the opportunity to highlight the convention that words beginning with a capital letter are defined here while those beginning with a lowercase letter have the normal meaning of the context.

Some terms are more important than others, so let’s report a few of them here:

  1. A Process is a Program and Metadata that can be executed in the M-Instance to perform Actions on Items.
  2. An Action is a supported Functionality that is performed in an M-Instance.
  3. An Item is Data and Metadata supported by the M-Instance where the Item exists.
  4. Metadata adds information on a Process or an Item and may include Rights.
  5. Rights define:
    • The ability of a Process to perform Actions on Items.
    • The possibility that an Item be subjected to an Action by a Process.
  6. An Item may include Rights held by a User and Rights that it may be possible to acquire on the Item.
  7. Data Types are data referenced by Actions and Items.
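To make the relationship between these terms concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the MPAI data model: the class names, the field names and the encoding of Rights as (Process, Action, Item) triples are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Right:
    """One entry of Rights: `process_id` may perform `action` on `item_id`."""
    process_id: str
    action: str
    item_id: str


@dataclass
class Item:
    """Data plus Metadata; here the only Metadata modelled is the Rights."""
    item_id: str
    data: bytes
    rights: set[Right] = field(default_factory=set)


@dataclass
class Process:
    """Program plus Metadata that performs Actions on Items."""
    process_id: str

    def may_perform(self, action: str, item: Item) -> bool:
        # The Process may act only if the Item carries a matching Right.
        return Right(self.process_id, action, item.item_id) in item.rights


asset = Item("item-1", b"mesh")
user = Process("user-1")
asset.rights.add(Right("user-1", "MM-Embed", "item-1"))
print(user.may_perform("MM-Embed", asset), user.may_perform("Transact", asset))
```

Storing the Rights in the Item's Metadata mirrors the statement that Metadata "may include Rights"; an implementation could equally keep them in a separate Service.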

We are now able to list the first Functionalities of an M-Instance:

An M-Instance is a set of Processes providing some or all the following functions:

  1. Senses data from U-Locations.
  2. Produces Items autonomously or by processing the sensed data.
  3. Hosts one or more M-Environments populated by Objects that can be either digitised or virtual, the latter with or without autonomy.
  4. Processes Objects from the M-Instance or potentially from other M-Instances to affect U- and/or M-Environments using Objects in ways that are:
    • Consistent with the goals set for the M-Instance.
    • Effected within the capabilities and Rules of the M-Instance, and in accordance with applicable laws and regulations.
  5. Identifies Processes and Items with one or more than one Identifier each of which uniquely refers to one Process or Item.
  6. May contain one or more M-Environments each of which:
    • Includes an Identifier.
    • May include M-Locations with space and time attributes.
    • May require a Registration specific to the M-Environment.
  7. May make available information regarding its Capabilities.
  8. May require Registration for use:
    • A human can request to deploy one or more Users and one or more Personae in an M-Instance.
    • An M-Instance may request a subset of the Personal Profile of the Registering human.
  9. Establishes Rules that a human’s Users in the M-Instance shall comply with.
  10. May penalise a User for lack of compliance with the Rules.

MPAI-MMM – Architecture identifies the following types of Process (see Figure 1):

  1. User represents a human rendered as:
    • A Model (Persona) animated by a stream generated by the human or by an autonomous agent. A User may be rendered by one or more Personae.
    • An Object rendering the human.
  2. Device connects User with a human or a U-Location:
    • From the Universe to an M-Instance: captures a scene as Media and provides Media as Data and Metadata.
    • From an M-Instance to the Universe: receives an Entity and renders it as Media with a Spatial Attitude.
  3. Service provides Functionalities.
  4. App is a Program executed on a Device.

Figure 1 – Relationship of Human-Device-User-Service-Persona

Figure 1 highlights a few basic aspects:

  1. A human is connected to one of its potentially many digital counterparts (Users).
  2. An object has one or more digital correspondents (Objects).
  3. A User can be rendered as a Persona (possibly more than one).
  4. Processes, in particular Users, can interact with one another.

Processes play an important role. Here are some features of a Process:

  1. Performs Actions on Items if it has the Rights to do that.
  2. Can make available information about its Capabilities.
    • The Actions it can Perform.
    • The Items on which Actions can be performed.
    • The time during which they can be performed.
    • The M-Locations where they can be performed.
  3. Can request another Process to perform Actions on Items by transmitting to it a Request-Action Item.
  4. Can be requested to perform an Action and it does so if:
    • The requesting Process has the Rights required to perform that request, e.g., it has made a Transaction to acquire the Rights, the Rights are part of the set of Rights assigned at Registration time, etc.
    • The requested Process has the Rights to perform the requested Action on the Item.
  5. Can respond to the Process requesting an Action with a Response-Action Item (see Figure 2).
  6. Uses a supported format:
    • To request another Process to perform Actions on Items (Request-Action).
    • To respond to another Process that has requested an Action (Response-Action).
  7. May perform, or request other Processes to perform, Actions on Items even in the absence of Rights, if the Rules so allow.
  8. May need to be certified by the M-Instance Manager for use in an M-Instance.

Figure 2 – Processes requesting Action and responding to request
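The request and response behaviour described above can be sketched as follows. This is a simplified illustration, not an MPAI API: the `handle` function, the tuple encoding of Rights and the `rules_waive_rights` flag (standing in for "if the Rules so allow") are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestAction:
    source: str       # ID of the requesting Process
    destination: str  # ID of the requested Process
    action: str
    item_id: str


@dataclass(frozen=True)
class ResponseAction:
    request: RequestAction
    performed: bool
    reason: str = ""


def handle(request, may_request, may_act, rules_waive_rights=False):
    """Decide whether the requested Process performs the Action.

    may_request: set of (source, destination, action) triples (hypothetical encoding)
    may_act:     set of (process, action, item_id) triples (hypothetical encoding)
    """
    if not rules_waive_rights:
        if (request.source, request.destination, request.action) not in may_request:
            return ResponseAction(request, False, "requesting Process lacks Rights")
        if (request.destination, request.action, request.item_id) not in may_act:
            return ResponseAction(request, False, "requested Process lacks Rights")
    return ResponseAction(request, True)


req = RequestAction("user-1", "service-1", "MM-Embed", "item-1")
may_request = {("user-1", "service-1", "MM-Embed")}
may_act = {("service-1", "MM-Embed", "item-1")}
print(handle(req, may_request, may_act))
```

Both checks must pass unless the Rules waive them, which matches the two conditions and the exception listed above.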

An M-Instance is typically administered by a single Manager that makes decisions about the technologies that best fit the entity’s needs. Two independent Managers need not make the same technology choices. The following workflow enables interoperability between MetaverseA and MetaverseB when ProcessA in MetaverseA requests ProcessB in MetaverseB to perform an Action on ItemA.1 (Note: RS=Resolution Service, CS=Conversion Service).

  1. ProcessA transmits Request-Action1 to RSA.
  2. RSA transmits Request-Action1 to RSB.
  3. RSB transmits Item1 to CS.
  4. CS produces and transmits Item2 containing Converted Data to RSB.
  5. RSB transmits the new Request-Action2 to ProcessB.
  6. ProcessB:
    • Performs the Action specified in Request-Action2 using ItemA.2.
    • Produces Response-Action2.
    • Requests RSB to transmit to RSA Response-Action2 containing ItemA.3 (result of performing Request-ActionA.2).
  7. RSB transmits Response-Action2 to RSA.
  8. RSA transmits Item3 to CS.
  9. CS produces and transmits to RSA Item4, corresponding to ItemA.3 with converted Data.
  10. RSA produces and transmits to ProcessA a new Response-Action4 that references ItemA.4.
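The workflow above can be condensed into a short Python sketch. It is illustrative only: the format labels "A" and "B", the `convert` and `process_b` stand-ins and the string-based Item data are assumptions, and the forwarding between Resolution Services is reduced to comments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    data: str
    fmt: str  # Data Format label of the owning M-Instance (hypothetical)


def convert(item: Item, target_fmt: str) -> Item:
    """Conversion Service stand-in: relabels the format and records the conversion."""
    if item.fmt == target_fmt:
        return item
    return Item(data=f"{item.data}|{item.fmt}->{target_fmt}", fmt=target_fmt)


def process_b(action: str, item: Item) -> Item:
    """ProcessB stand-in: only accepts format B and returns its result in format B."""
    if item.fmt != "B":
        raise ValueError("ProcessB only understands format B")
    return Item(data=f"{action}({item.data})", fmt="B")


def request_across_instances(action: str, item_a1: Item) -> Item:
    # ProcessA -> RSA -> RSB: plain forwarding, not modelled further.
    # RSB has the CS convert the Item into MetaverseB's format.
    item_a2 = convert(item_a1, "B")
    # ProcessB performs the Action and produces the result Item.
    item_a3 = process_b(action, item_a2)
    # The response travels RSB -> RSA, and RSA has the CS convert it back.
    return convert(item_a3, "A")


result = request_across_instances("MM-Embed", Item("chair", "A"))
print(result)
```

The point of the sketch is that neither Process ever sees the other M-Instance's Data Format: conversion happens only at the Resolution Services.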

It should be noted that an M-Instance may allow Processes to communicate directly with Processes in other M-Instances without calling ResolutionServiceA.

Given the above, what is then the scope of the MPAI-MMM Architecture Call for Technologies?

To make comments on, add functionalities to, or propose new elements to the MPAI Metaverse Model Architecture.

Join the online presentation on 23 June at 08 and 15 UTC. Register here for the first or here for the second presentation.

 


MPAI issues MPAI Metaverse Model – Architecture Call for Technologies

Geneva, Switzerland – 14 June 2023. Today, the international, non-profit, and unaffiliated Moving Picture, Audio and Data Coding by Artificial Intelligence (MPAI) organisation developing AI-based data coding standards has concluded its 33rd General Assembly (MPAI-33) approving the Call for Technologies: MPAI Metaverse Model (MPAI-MMM) – Architecture. Responses are due by 10 July. Two online presentations of the Call will be made on 23 June at 8 and 15 UTC.

After publishing two Technical Reports on Functionalities and Functionality Profiles of the MPAI Metaverse Model, MPAI is now kicking off an ambitious plan to develop a Technical Specification on MPAI Metaverse Model – Architecture. This is a project that no standards body has ever attempted so far.

The first step of the plan is the publication of the Call for Technologies, as mandated by the MPAI standard development process. Note that the Call does not address data formats, only metaverse functionalities.

Three documents are attached to the Call: the first is Use Cases and Functional Requirements. It includes a reference to some thirty metaverse use cases explored by MPAI, a set of Functionalities that the Architecture should provide, and the functional requirements of its key elements: Processes, Items, Actions and Data Types.

The second document is the Framework Licence designed to facilitate the timely access to IP that is essential to implement the planned MPAI-MMM – Architecture standard. Finally, the third document is a Template for responses that respondents to the Call may wish to use in their responses.

Anybody may respond to the Call. However, non-members should join MPAI to participate in the development of the MPAI-MMM – Architecture standard.

MPAI is continuing its work plan comprising the development of the following Technical Specifications:

  1. The AI Framework (MPAI-AIF) V2 Technical Specification will enable an implementer to establish a secure AIF environment to execute AI Workflows (AIW) composed of AI Modules (AIM).
  2. The Avatar Representation and Animation (MPAI-ARA) V1 Technical Specification will support creation and animation of interoperable human-like avatar models able to understand and express a Personal Status.
  3. The Multimodal Conversation (MPAI-MMC) V2 Technical Specification will generalise the notion of Emotion by adding Cognitive State and Social Attitude and specify a new data type called Personal Status.

The MPAI work plan also includes exploratory activities, some of which are close to becoming standard or technical report projects:

  1. AI Health (MPAI-AIH). Targets an architecture where smartphones store users’ health data processed using AI and AI Models are updated using Federated Learning.
  2. Connected Autonomous Vehicles (MPAI-CAV). Targets the Human-CAV Interaction, Environment Sensing, Autonomous Motion, and Motion Actuation subsystems implemented as AI Workflows.
  3. End-to-End Video Coding (MPAI-EEV). Extends the video coding frontiers using AI-based End-to-End Video coding.
  4. AI-Enhanced Video Coding (MPAI-EVC). Improves existing video coding with AI tools for short-to-medium term applications.
  5. Server-based Predictive Multiplayer Gaming (MPAI-SPG). Uses AI to train neural networks that help an online gaming server to compensate for data losses and detect false data.
  6. XR Venues (MPAI-XRV). Identifies common AI Modules used across various XR-enabled and AI-enhanced use cases where venues may be both real and virtual.

Legal entities and representatives of academic departments supporting the MPAI mission and able to contribute to the development of standards for the efficient use of data can become MPAI members.

Please visit the MPAI website, contact the MPAI secretariat for specific information, subscribe to the MPAI Newsletter and follow MPAI on social media: LinkedIn, Twitter, Facebook, Instagram, and YouTube.

 

 


The MPAI Metaverse Model – Status Report

  1. Introduction

Many people use the word metaverse, but it is unlikely that they all mean the same thing. In general one can say that a metaverse instance is a rather complex communication and interaction environment with features such as synchronous and persistent experiences, and virtual reality features such as avatars that may or may not be controlled by humans or objects of the real world.

MPAI Metaverse Model (MPAI-MMM) is the MPAI project developing technical documents (so far Technical Reports) that can be applied to as many kinds of metaverse instances as possible and enable varied metaverse implementations to interoperate.

The first document, Technical Report – MPAI Metaverse Model – Functionalities, collects the functionalities that potential metaverse users expect a metaverse instance to provide, rather than trying to define what the metaverse is. It includes definitions, assumptions guiding the project, potential sources of functionalities, an organised list of commented functionalities, and an analysis of some of the main technology areas underpinning the development of the metaverse.

The MPAI-MMM is based on the idea of using the notion of Profiles and Levels that digital media standardisation has successfully employed for three decades to cope with the wide variety of expected application domains. As some metaverse technologies are not yet available, the second document, Technical Report – MPAI Metaverse Model – Functionality Profiles, develops Functionality Profiles, a new notion in standardisation that defines profiles for what they do (“functionalities”) rather than for how they do it (“technologies”).

The second document reaches another important milestone by:

  1. Extending the existing collection of definitions.
  2. Developing a functional metaverse operation model based on Sources requesting Destinations to perform Actions on Items both containing Data Types.
  3. Specifying the Actions that Sources request Destinations to perform on Items and the responses of Destinations.
  4. Specifying the Metadata of the Items but not their Data Formats, in line with the Functionality approach.
  5. Developing nine Use Cases to test the suitability of Actions and Items.
  6. Developing four Functionality Profiles.

  2. MPAI-MMM Functional Operation Model

As it is hard to describe the many terms defined in the document, we will rely on the common meaning of the words. When in doubt about the meaning of a term (starting with a capital letter), please use the search window.

Figure 1 shows a simple example of the connection between the real world (right-hand side, called Universe) and the representations of U-Environments in M-Instances on the left-hand side. Green indicates that the Users/Objects represent real-world humans/objects. Users are visualised as Personae: light blue indicates that a Persona or Object is driven by an autonomous agent and brick red that the Persona moves according to its real twin’s movements.

Figure 1 – An example of Metaverse Scenario

An M-Instance is populated by Processes, e.g., a real or virtual Persona is driven by a Process. A Process may request another Process to perform an Action by sending it a Request-Action and receiving a Response-Action. The Request-Action is an Item, i.e., Data and Metadata, possibly with Rights. The Item contains the Time the request was issued and Source Process, Destination Process, Action requested, InItems provided as input and their InLocations, OutLocations of the output Items, and requested OutRights to Act on the produced Items.

Figure 2 – Processes interacting within and without M-Instances

So far, the following elements have been identified and specified:

  1. 4 Processes: App, Device, Service, and User.
  2. 27 Actions, such as Authenticate an Entity (an Item that can be perceived), Discover (request a Service to find Items responding to certain criteria), MM-Embed (place and make perceptible an Entity at an M-Location), UM-Animate (animate an Entity with data from the real world), etc.
  3. 33 Items such as Account, Asset (an Item that can be Transacted), Map (a list of connections between U-Locations and M-Locations), Model (a representation of an object ready to be UM-Animated by a Stream or to be MM-Animated by an autonomous agent), Rights (a description of what Actions can be done on an Item), etc.
  4. 13 Data Types, such as Currency, Emotion, Spatial Attitude, Time, etc.

  3. Use Cases

Nine use cases have been developed. Here is a simple use case showing the descriptive capabilities of the MPAI-MMM scene description language.

Figure 3 – The Virtual lecture use case

Here is a description of the workflow.

  1. The meeting manager authors and embeds a virtual classroom.
  2. The student
    1. Connects its place in the M-Instance (“home”) with the place where the human is.
    2. Pays for the right to attend the lecture and save the Experience of the lecture.
    3. Places its Persona in the virtual classroom and stops the rendering of the Persona at home.
  3. The teacher
    1. Does likewise (but does not pay),
    2. Places a 3D Model used in the lecture and animates it.
  4. The student
    1. Moves close to the teacher’s desk without changing the display of its Persona to feel the audio, visual, and haptic components of the 3D Model.
    2. Saves the lecture how they experienced it.
  5. The meeting manager pays lecture fees to the teacher.
  6. Both student and teacher go back home.

The other use cases are: Virtual Meeting, Hybrid Working, eSports Tournament, Virtual Performance, AR Tourist Guide, Virtual Dance, Virtual Car Showroom, and Drive a Connected Autonomous Vehicle.

  4. Functionality Profiles

The structure of the Metaverse Functionality Profiles is derived from the Use Cases and includes hierarchical Profiles and independent Profiles. Profiles may have Levels. As depicted in Figure 4, the currently identified Profiles are Baseline, Management, Finance, and High. The currently identified Levels for Baseline, Management, and High Profiles are Audio only, Audio-Visual, and Audio-Visual-Haptic. The Finance Profile does not have Levels.

Figure 4 – The currently identified Functionality Profiles
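The Profile and Level structure just described can be captured in a small lookup table. The Python sketch below is only an illustration: `PROFILE_LEVELS` and `is_valid` are hypothetical names, and requiring a Level for every Profile that defines Levels is an assumption of this sketch.

```python
from typing import Optional

LEVELS = ("Audio only", "Audio-Visual", "Audio-Visual-Haptic")

PROFILE_LEVELS = {
    "Baseline": LEVELS,
    "Management": LEVELS,
    "High": LEVELS,
    "Finance": (),  # the Finance Profile does not have Levels
}


def is_valid(profile: str, level: Optional[str] = None) -> bool:
    """Check that the Profile exists and that the Level fits it."""
    if profile not in PROFILE_LEVELS:
        return False
    levels = PROFILE_LEVELS[profile]
    if not levels:
        return level is None  # a Level makes no sense for a Profile without Levels
    return level in levels    # assumption: such Profiles always need a Level


print(is_valid("Baseline", "Audio-Visual"), is_valid("Finance"))
```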

  5. What is next

MPAI has now laid down the basic elements and can start the development of the Technical Specification – Metaverse Architecture. This will contain the main components of an M-Instance, their interconnections and the types of data exchanged. It will also contain the APIs called by the Processes to enable implementation of M-Instances.


Technical Report – MPAI Metaverse Model – Functionality Profiles

Abstract

1       Introduction
2       A functional operation model
3       Actions
4       Items
5       Data Types
6       Use cases
6.1        AR Tourist Guide
6.2        Virtual Dance
7       Initial functionality profiles
8       Conclusions
9       References

Abstract

MPAI has developed a roadmap for Metaverse interoperability. The published Technical Report – MPAI Metaverse Model – Functionalities [1] claims that, as standards for a market as vast as the one expected for the Metaverse are difficult to develop, functional profiles should be developed. The published draft Technical Report – MPAI Metaverse Model – Functionality Profiles [2] develops a Metaverse functional operation model, applies it to 7 Use Cases and proposes 4 initial functionality profiles. This document is a concise summary of [2].

1        Introduction

The MPAI Metaverse Model (MPAI-MMM) aims to provide Technical Reports and Technical Specifications that apply to as many kinds of Metaverse instances as possible and enable varied Metaverse implementations to interoperate.

At present, achieving interoperability is difficult because:

  1. There is no common understanding of what a Metaverse is or should be, in detail.
  2. There is an abundance of existing and potential Metaverse Use Cases.
  3. Some independently designed Metaverse implementations are very successful.
  4. Some important technologies enabling more advanced and even unforeseen forms of the Metaverse may be uncovered in the next several years.

MPAI has developed a roadmap to deal with this challenging situation. The first milestone of the roadmap is based on the idea of collecting the functionalities that potential Users expect the Metaverse to provide, instead of trying to define what the Metaverse is. The first Technical Report of this roadmap [1] includes definitions, makes assumptions for the MPAI-MMM project, identifies sources that can generate functionalities, develops an organised list of commented functionalities, and analyses the main enabling technologies. Reference [3] provides a summary of [1].

Potential Metaverse Users with different needs might require different technologies to support these needs. Therefore, an approach that tried to achieve the goal of making every M-Instance be able to interoperate with every other M-Instance would force implementers to take technologies on board that are useless for their needs and potentially costly.

Reference [1] posits that Metaverse standardisation should be based on the notion of Profiles[1] and Levels[2] successfully adopted by digital media standardisation. A Metaverse Standard that includes Profiles and Levels would enable Metaverse developers to use only the technologies they need that are offered by whatever profile is most suitable to them.

The notion of profile can mitigate the impact of having many disparate Metaverse Users with diverse requirements. Unfortunately, that notion cannot be currently implemented because some key technologies are not yet available and at this time it is unclear which technologies, existing or otherwise, will eventually be adopted. To cope with this situation, [2] only targets Functionality Profiles, i.e., profiles that are defined by the functionalities they offer, not by technologies implementing them. Functionality Profiles are not meant to fully address the interoperability problem, but rather to allow a technology-independent definition of profiles based on the functional value they provide rather than on the “influence” of specific technologies.

2        A functional operation model

A Metaverse instance – called M-Instance – is composed of interacting Processes within and without the M-Instance. An M-Instance interoperates with another M-Instance to the extent its Processes interoperate.

MPAI has identified 3 classes of Process:

  1. Users: Processes that represent humans using data from the real world (U-Locations) or autonomous agents (both need not be human-like).
  2. Devices: Processes interconnecting U-Locations to M-Locations and vice-versa.
  3. Services: Processes providing Functionalities.

A Process performs, or requests another Process to perform, Actions on Items. An Item is data with attached metadata, possibly including Rights that enable a Process to perform Actions on it. Rights express the ability to perform an Action on an Item.

The names of some Actions are prefixed depending on where the Action is performed:

  1. MM- if the interaction takes place in the Metaverse.
  2. UM- if the interaction is from the Universe to the Metaverse (the Universe is the real world).
  3. MU- if the interaction is from the Metaverse to the Universe.
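The prefix convention lends itself to a simple lookup. The following Python sketch uses a hypothetical helper, `interaction_scope`, and treats any Action without one of the three prefixes as "unprefixed".

```python
PREFIX_SCOPES = {
    "MM-": "within the Metaverse",
    "UM-": "from Universe to Metaverse",
    "MU-": "from Metaverse to Universe",
}


def interaction_scope(action: str) -> str:
    """Return where an Action takes place, based on its name prefix."""
    for prefix, scope in PREFIX_SCOPES.items():
        if action.startswith(prefix):
            return scope
    return "unprefixed"


for name in ("MM-Embed", "UM-Capture", "MU-Render", "Transact"):
    print(name, "->", interaction_scope(name))
```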

An interaction of Process #1 with Process #2 unfolds as follows:

  1. Process #1 requests Process #2 to perform Actions on Items.
  2. Process #2 executes the request if Process #1 has Rights to call Process #2.
  3. Process #2 responds to Process #1’s request.

Note that Processes and Items need not be in the same M-Instance. However, the possibility of having Actions performed on Items may be more limited if they are not in the same M-Instance.

Figure 1 depicts a Metaverse configuration with interacting Processes.

Figure 1- A Metaverse configuration

Devices connect humans and objects in a U-Location (a location in the real world – called Universe) to one or more Metaverse Services generating M-Instances as depicted in Figure 2.

Figure 2 – The relationship between the real world (Universe) and M-Instances

A Device UM-Captures scenes with its Sensors and MU-Renders Entities (i.e., Items that can be perceived) through its Actuators.

To enable its User(s) to perform Actions in an M-Instance, a human may be asked to Register and provide a subset of their User data, e.g., Persona(e) and ID(s) of Wallet. The M-Instance provides an Account that univocally associates a Registered human with their Items with the following features:

  1. A human may have more than one Account in more than one M-Instance.
  2. A User has Rights to act in the M-Instance associated with the human’s Account.
  3. A human may have more than one Account and more than one User per Account.
  4. A User exists after a human Registers with an M-Instance.
  5. Different Users of an Account may have different Rights.

The Rules of an M-Instance express the obligations undertaken by the Registering human represented by the User and the terms and conditions under which a User exists in an M-Instance and operates either there or in another M-Instance. Depending on the Rules, a User of a human Registered on an M-Instance may or may not interact with another M-Instance and Rights enforcement on some Actions performed on some Items may be forfeited.

Data entering an M-Instance, e.g., by Reading or MM-Capturing, may include metadata and the Rights granted to a Process to perform Actions on the data.

An Item bears an Identifier that is uniquely associated with that Item. However, an Item may bear more than one Identifier. It is assumed that Identifiers have the following structure:

[M-InstanceID] [ItemID]; [M-InstanceID] [M-LocationID] [ItemID].
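The two identifier shapes can be parsed with a few lines of Python. The sketch below assumes that the components are separated by whitespace and contain no whitespace themselves; the text does not specify a concrete syntax, so the separator, the class name and the field names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ItemIdentifier:
    m_instance_id: str
    item_id: str
    m_location_id: Optional[str] = None  # present only in the three-part form


def parse_identifier(text: str) -> ItemIdentifier:
    """Parse '[M-InstanceID] [ItemID]' or '[M-InstanceID] [M-LocationID] [ItemID]'."""
    parts = text.split()  # assumption: whitespace-separated components
    if len(parts) == 2:
        return ItemIdentifier(parts[0], parts[1])
    if len(parts) == 3:
        return ItemIdentifier(parts[0], parts[2], m_location_id=parts[1])
    raise ValueError(f"expected 2 or 3 components, got {len(parts)}")


print(parse_identifier("instance-7 item-42"))
print(parse_identifier("instance-7 plaza-3 item-42"))
```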

3        Actions

A User can call a Process to perform Actions on Entities. An Entity can be:

  1. Authored, i.e., the User calls an Authoring Tool Service with an accompanying request to obtain Rights to act on the authored Entity.
  2. MM-Added, i.e., the User requests that an Entity be added to an M-Location with a Spatial Attitude.
  3. MM-Enabled, i.e., the User requests that a Process be allowed to MM-Capture an Entity that is MM-Added at an M-Location.
  4. MM-Embedded, i.e., MM-Added and MM-Enabled in one stroke.
  5. MM-Captured, i.e., the User requests that an Entity MM-Embedded at an M-Location be MM-Sent to a Process.
  6. UM-Animated, i.e., the User requests that a Process change the features of an Entity using a Stream of data.
  7. MM-Disabled, i.e., the User requests that the MM-Enabling of the Entity be stopped.
  8. Authenticated, i.e., the User calls a Service to make sure that an Item is what it claims to be.
  9. Interpreted, i.e., the User calls a Service to obtain an interpretation of Items in an M-Instance, e.g., translate a Speech Object MM-Embedded at an M-Location into a specific language.
  10. Informed, i.e., the User calls a Service to obtain data about an Item.
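The MM-Add, MM-Enable, MM-Embed and MM-Disable entries in this list suggest a small life cycle for an Entity at an M-Location. The Python sketch below is one possible reading: the state names "absent", "added" and "enabled" are hypothetical, and returning to "added" after MM-Disable is an assumption.

```python
# Hypothetical states: "absent" (not at the M-Location), "added" (placed but not
# capturable), "enabled" (placed and capturable by other Processes).
TRANSITIONS = {
    ("absent", "MM-Add"): "added",
    ("added", "MM-Enable"): "enabled",
    ("absent", "MM-Embed"): "enabled",   # MM-Add and MM-Enable in one stroke
    ("enabled", "MM-Disable"): "added",  # assumption: the Entity stays in place
}


def apply(state: str, action: str) -> str:
    """Return the new state, or raise if the Action is not allowed in this state."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"{action} is not allowed in state {state!r}") from None


def can_capture(state: str) -> bool:
    """MM-Capture is possible only while the Entity is MM-Enabled."""
    return state == "enabled"


state = apply(apply("absent", "MM-Add"), "MM-Enable")
print(state, can_capture(state))
```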

An example of a composite Action, i.e., an Action that involves a plurality of Actions and a plurality of Items, is Track, which enables a User to request Services to:

  1. MM-Add a Persona at an M-Location with a Spatial Attitude.
  2. UM-Animate the Persona MM-Added at an M-Location.
  3. MU-Render specified Entities at the M-Location to a U-Location with Spatial Attitudes.

So far, the following Actions have been identified:

1 Authenticate  8  Inform      15 MM-Enable   22 Track
2 Author        9  Interpret   16 MM-Send     23 Transact
3 Call          10 MM-Add      17 MU-Actuate  24 UM-Animate
4 Change        11 MM-Animate  18 MU-Render   25 UM-Capture
5 Create        12 MM-Capture  19 Post        26 UM-Render
6 Destroy       13 MM-Disable  20 Read        27 UM-Send
7 Discover      14 MM-Embed    21 Register    28 Write

Of course, the Actions identified are not intended to be the only ones that will be needed by all M-Instances.

A Process requests another Process to perform an Action by sending the following payload:

Source        User (ID=UserID) ∨ Device (ID=DeviceID) ∨ Service (ID=ServiceID)
Destination   User (ID=UserID) ∨ Device (ID=DeviceID) ∨ Service (ID=ServiceID)
Action        Act
InItem        Item (ID=ItemID)
InLocations   M-LocationID ∨ U-LocationID ∨ Service (ID=ServiceID)
OutLocations  M-LocationID ∨ U-LocationID ∨ Device (ID=DeviceID) ∨ Service (ID=ServiceID)
OutRights     Rights (ID=RightsID)

The requested Process will respond by sending the following payload:

Success | OutItem | Item (ID=ItemID)
Error | Request | Faulty
Error | IDs | Incorrect
Error | Rights | Missing or incomplete
Error | Unsupported | Item not supported
Error | Mismatch | Item type mismatch
Error | User Data | Faulty
Error | Wallet | Insufficient Value
Error | Clash | Entity clashes with another Entity
Error | M-Location | Out of range
Error | U-Location | Out of range
Error | Address | Incorrect
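The response can be modelled as either a success carrying an OutItem or one of the listed errors. The sketch below is an assumption-laden illustration: the names `ActionError`, `ActionResponse`, and `describe` are hypothetical, and the rule that exactly one of the two outcomes is present is an interpretation of the table, not a statement from the Technical Report.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ActionError(Enum):
    """Error categories and reasons transcribed from the response table."""
    REQUEST = ("Request", "Faulty")
    IDS = ("IDs", "Incorrect")
    RIGHTS = ("Rights", "Missing or incomplete")
    UNSUPPORTED = ("Unsupported", "Item not supported")
    MISMATCH = ("Mismatch", "Item type mismatch")
    USER_DATA = ("User Data", "Faulty")
    WALLET = ("Wallet", "Insufficient Value")
    CLASH = ("Clash", "Entity clashes with another Entity")
    M_LOCATION = ("M-Location", "Out of range")
    U_LOCATION = ("U-Location", "Out of range")
    ADDRESS = ("Address", "Incorrect")


@dataclass(frozen=True)
class ActionResponse:
    out_item_id: Optional[str] = None    # set on success
    error: Optional[ActionError] = None  # set on failure

    def __post_init__(self) -> None:
        # Assumed rule: a response is either a success or an error, never both.
        if (self.out_item_id is None) == (self.error is None):
            raise ValueError("exactly one of out_item_id or error must be set")


def describe(response: ActionResponse) -> str:
    """Render a response as a human-readable line."""
    if response.error is None:
        return f"Success: OutItem {response.out_item_id}"
    category, reason = response.error.value
    return f"Error: {category}: {reason}"


print(describe(ActionResponse(out_item_id="item-42")))
print(describe(ActionResponse(error=ActionError.WALLET)))
```

Each enum member holds a (category, reason) pair so that categories sharing the same reason text, such as Request and User Data, remain distinct members.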

4        Items

An Item can be:

  1. Created: from data and metadata. Metadata may include Rights.
  2. Changed: its Rights are modified.
  3. Discovered by calling an appropriate Service.
  4. Written, i.e., stored.
  5. Posted to a marketplace as an Asset (i.e., an Item that can be the object of a Transaction).
  6. Transacted as an Asset.

An Item can belong to one of six categories:

  1. Items characterised by the fact that they can be MM-Captured by a User.
  2. Items that can cause an Entity to change its perceptible features.
  3. Items that have space and time attributes.
  4. Items that are finance related.
  5. Items that are non-perceptible.
  6. Items that are Process-related.

Entities are Items that can be perceived. Here are some relevant Items.

  1. Event: the set of Entities that are MM-Embedded at an M-Location from Start Time until End Time.
  2. Experience: An Event as a User MM-Captured it and the User’s Interactions with the Entities belonging to the Entity that spawned the Event.
  3. Object: the representation of an object and its features. Currently, only the following object types are considered: Audio, Visual, and Haptic.
  4. Model: An Object that can be UM-Animated by a Stream or a Process. A Persona is the Model of a human.
  5. Scene: a dynamic composition of Objects described by Times and Spatial Attitudes.

The finance-related Items are:

  1. Asset: An Item that may be the object of a Transaction and is embedded at an M-Location or Posted to a Service.
  2. Ledger: the list of Transactions executed on Assets.
  3. Provenance: the Ledger of an Asset.
  4. Transaction: Item representing the changed state of the Accounts and the Rights of a seller User and a buyer User on an Asset and optionally of the Service facilitating/enabling the Transaction.
  5. Value: An Amount expressed in a Currency.
  6. Wallet: A container of Currency units.

So far, the following Items have been identified:

1 Account 11 M-Environment 21 Request-Authenticate 31 Scene
2 Activity Data 12 M-Instance 22 Request-Discover 32 Service
3 App 13 M-Location 23 Request-Inform 33 Social Graph
4 Asset 14 Map 24 Request-Interpret 34 Stream
5 Device 15 Message 25 Response-Authenticate 35 Transaction
6 Event 16 Model 26 Response-Discover 36 U-Location
7 Experience 17 Object 27 Response-Inform 37 User
8 Identifier 18 Personal Profile 28 Response-Interpret 38 User Data
9 Interaction 19 Process 29 Rights 39 Value
10 Ledger 20 Provenance 30 Rules 40 Wallet

Of course, more Items will be identified as more application domains will be taken into consideration.

Each Item is specified by the following table. It should be noted that, in line with the assumptions made by the MPAI Metaverse roadmap and the current focus on functionalities, the formats of the Item Data are not specified.

Purpose A functional description of the Item.
Data In general, the Item data format(s) is(are) not provided.
Acted on Metadata
ItemID ID of the Item.
UserID ID of the User who holds Rights on the Item with ItemID.
WalletID ID of the Wallet held by the User with UserID.
InRightsID ID of the Rights the User with UserID has on the Item with ItemID.
OutRightsID ID of the Rights a User may acquire on the Item with ItemID.
AuthorID ID of the User who Authored the Item with ItemID.
AuthorToolID ID of the Service that provided the AuthorTool.
ParentItemID ID of the Item that spawned the Item.
ServiceID ID of the Service that is Called.
ServiceWalletID ID of the Wallet of a Service.
ActedOnItemID ID of the Item that was Acted on.
TargetUserID ID of the User to be affected by the Action.
TargetWalletID ID of the Wallet of the User to be affected by the Action.
UserDataID ID of a User Data.
PersonaID ID of a User’s Persona.
PersonalDataID ID of a User’s Personal Data.
ActivityDataID ID of a User’s Activity Data.
DescrMdata Any additional descriptive Metadata of the Item.
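To illustrate how such Metadata might be used, the sketch below models a small subset of the fields and follows ParentItemID links back through the Items that spawned an Item. The `ItemMetadata` class, the `lineage` function, the in-memory `registry`, and the example identifiers are all hypothetical; the Technical Report deliberately leaves data formats unspecified.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ItemMetadata:
    """A subset of the Metadata fields listed above (hypothetical representation)."""
    item_id: str                          # ItemID
    user_id: str                          # UserID: holder of Rights on the Item
    author_id: Optional[str] = None       # AuthorID
    parent_item_id: Optional[str] = None  # ParentItemID: Item that spawned this one
    descr_mdata: str = ""                 # DescrMdata


def lineage(registry: Dict[str, ItemMetadata], item_id: str) -> List[str]:
    """Follow ParentItemID links from an Item to its oldest known ancestor."""
    chain: List[str] = []
    current: Optional[str] = item_id
    # Stop on an unknown ID, a missing parent, or a cycle.
    while current is not None and current in registry and current not in chain:
        chain.append(current)
        current = registry[current].parent_item_id
    return chain


# Illustrative registry: an Experience assumed to be spawned by an Event.
registry = {
    "event-1": ItemMetadata("event-1", "user-1"),
    "experience-1": ItemMetadata("experience-1", "user-2", parent_item_id="event-1"),
}
print(lineage(registry, "experience-1"))  # ['experience-1', 'event-1']
```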

5        Data Types

Data Types are data that are referenced in Actions or in Items. Currently, the following data types have been identified.

  1. Address
  2. Amount
  3. Coordinates
  4. Currency
  5. Personal status
    1. Cognitive state
    2. Emotion
    3. Social attitude
  6. Point
  7. Spatial attitude
    1. position
    2. orientation
  8. Time
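The composite data types in this list can be pictured as small records. The following is a sketch only: the field representations (three-component tuples, `Decimal` amounts, free-text strings) are assumptions made for illustration, since the Technical Report identifies the data types without fixing their formats. The `Value` record reflects the definition of Value as an Amount expressed in a Currency.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Tuple


@dataclass(frozen=True)
class SpatialAttitude:
    """Spatial attitude = position + orientation (representation assumed)."""
    position: Tuple[float, float, float]
    orientation: Tuple[float, float, float]


@dataclass(frozen=True)
class PersonalStatus:
    """Personal status = cognitive state + emotion + social attitude (free text assumed)."""
    cognitive_state: str
    emotion: str
    social_attitude: str


@dataclass(frozen=True)
class Value:
    """A Value is an Amount expressed in a Currency."""
    amount: Decimal
    currency: str


# Illustrative values only.
attitude = SpatialAttitude(position=(1.0, 2.0, 0.0), orientation=(0.0, 90.0, 0.0))
status = PersonalStatus("attentive", "joy", "friendly")
fee = Value(Decimal("25.00"), "EUR")
print(attitude, status, fee, sep="\n")
```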

6        Use cases

So far, seven Use Cases have been developed to verify the completeness of the specified Actions, Items, and Data Types. Use Cases are also a tool that facilitates the development of functionality profiles.

To cope with the fact that Use Cases may involve several locations and Items, a notation has been developed exemplified by the following:

  1. Useri MM-Embeds Personai.1, Personai.2, etc.
  2. Useri Calls Processi.1, Processi.2, etc.
  3. Useri MM-Embeds Personaj at M-Locationi.1, M-Locationi.2, etc.
  4. Useri MU-Renders Entityj at U-Locationi.1, U-Locationi.2, etc.
  5. Useri MM-Sends object2 to Userj.

The following is to be noted:

Note1 A = Audio, A-V = Audio-Visual, A-V-H = Audio-Visual-Haptic, SA=Spatial Attitude.
Note2 If a Composite Action is listed, its Basic Actions are not listed, unless they are independently used by the Use Case.

Here only two of the seven Use Cases of [2] will be presented.

6.1       AR Tourist Guide

6.1.1      Description

This Use Case describes a human who intends to develop a tourist application through an App that alerts the holder of the smartphone on which the App is installed and lets them view Entities and talk to autonomous agents residing at M-Locations:

  1. User1
    1. Buys M-Location1 (parcel) in an M-Environment.
    2. Creates Entity1 (landscape suitable for a virtual path through n sub-M-Locations).
    3. Embeds Entity1 (landscape) on M-Location1.1 (parcel).
    4. Sells Entity1 (landscape) and M-Location1.1 (parcel) to a User2.
  2. User2
    1. Authors Entity1 to Entity2.n for the M-Locations.
    2. Embeds the Entities at M-Location1 to M-Location2.n.
    3. Sells the result to User3.
  3. human4
    1. Develops
      1. Map recording the pairs M-Locationi – U-Location2.i
      2. App alerting a human5 holding the Device with the App installed that a key U-Location has been reached.
    2. Sells Map and App to human3.
  4. User3 MM-Embeds one or more autonomous Personae at M-Location1 to M-Location2.n.
  5. When human5 gets close to a key U-Location:
    • App prompts Device to Request User3 to MU-Render the Entityi MM-Embedded at M-Location2.i to the key U-Location2.i.
    • human5 interacts with MU-Rendered Entityi that may include an MM-Animated Persona2.i.

6.1.2      Workflow

Table 1 – AR Tourist Guide workflow.

Who Does What Where/comment
User1 Transacts M-Location1.1 (Parcel in an M-Environment)
Authors Entity1.1 (A landscape for the parcel)
MM-Embeds Entity1.1 M-Location1.1
Transacts Entity1.1 User2 (landscape)
Transacts M-Location1.1 User2 (parcel)
User2 Authors Entity2.1 to Entity2.n Promotion material for U-Locations.
MM-Embeds Entity2.1 to Entity2.n M-Location2.1 to M-Location2.n
Writes M-Locations Address2.1
MM-Sends Address2.1 User4
Transacts Entity1.1 User4 (landscape)
Transacts M-Location1.1 User4 (parcel)
Transacts Entity2.1 to Entity2.n User4
human3 develops Map3.1 (U-location2.i-M-Location2.i-Metadata2.i)
sells Map and App To human4
User4 MM-Embeds Persona4.1-Persona4.n M-Location2.1 to M-Location2.n w/ SA
MM-Animates Persona4.1-Persona4.n M-Location2.1 to M-Location2.n
human5 downloads App (To Device)
approaches U-Location2.i (App’s key point)
App prompts Device5.1
Device5.1 MM-Sends Message5.1 User4 (Persona4.i)
Persona4.i MU-Sends Entity2.i U-Location2.i
human5 interacts (W/ MU-Rendered Entity4.i and Persona4.i)

6.1.3      Actions, Items, and Data Types

Actions Items Data Types
Author App Amount
MM-Animate Device Currency
MM-Embed Entity Coordinates
MM-Send Map Spatial Attitude
MU-Render M-Location Spatial Attitude
Send Persona Value
Write Service
Transact U-Location
User

6.2       Virtual Dance

6.2.1      Description

  1. User2 (dance teacher)
    • Teaches dance at a virtual classroom.
    • Works at M-Location1 where its digital twin Persona2.1 is Audio-Visually MM-Embedded.
    • MM-Embeds and MM-Animates Persona2 (A-V) (another of its Personae) at M-Location2.2 as a virtual secretary to attend to students coming to learn dance.
  2. User1 (dance student #1):
    • MM-Embeds its Persona1 (A-V) at Location1.1 (its “home”).
    • Audio-Visual-Haptically MM-Embeds Persona1 (A-V-H) at Location1.2 close to Location2.2.
    • Sends Object1 (A) to Persona2.2 (greets virtual secretary).
  3. Virtual secretary:
    • Sends Object1 (A) to dance student #1 (reciprocates greeting).
    • Sends Object2 (A) to call regular dance teacher’s Persona2.1.
  4. Dance teacher MM-Embeds (A-V-H) Persona1 at Location2.3 (classroom) where it dances with Persona1.1 (dance student #1).
  5. While Persona1 (student #1) and Persona2.1 (teacher) dance, User3 (dance student #2):
    • MM-Embeds (A-V) Persona1 (its digital twin) at Location3.1 (its “home”).
    • MM-Embeds (A-V-H) Persona1 to Location3.2 (close to Location2.2 where the secretary is located).
  6. After a while, User2 (dance teacher):
    • MM-Embeds (A-V-H) Persona1 at Location2.4, (close to Location3.2 of dance student #2).
    • MM-Disables Persona1 from Location2.3 where it was dancing with Persona1.1 (student #1).
    • MM-Embeds (A-V-H) and MM-Animates an autonomous Persona3 replacing Persona2.1 from Location2.3 so that student #1 can continue practising dance.
    • Dances with Persona1 (student #2).

6.2.2      Workflow

Table 2 – Virtual Dance workflow.

Who Does What Where/(comment)
User2 (Teacher) Tracks Persona2.1 (AV) M-Location2.1
MM-Embeds Persona2.2 (AV) M-Location2.2 w/ SA
MM-Animates Persona2.2 (AV) M-Location2.2
User1 (Student) Tracks Persona1.1 (AV) M-Location1.1
Transacts Value1.1 (Lesson fees)
MM-Embeds Persona1.1 (AVH) M-Location1.2 w/ SA
MM-Disables Persona1.1 M-Location1.1
MM-Sends Object1.1 (A) Persona2.2 (greetings)
User2 (Persona2.2) MM-Sends Object2.1 (A) Persona1.1 (greetings)
MM-Sends Object2.2 (A) Persona2.1 (alert)
User2 (Persona2.1) MM-Embeds Persona2.1 M-Location2.3 w/ SA
MM-Disables Persona2.2 M-Location2.2
MM-Embeds Object2.3 (A) M-Location2.4 (music)
Persona1.1 (dances)
Persona2.1 (dances)
User3 (Student) Tracks Persona3.1 (AV) M-Location3.1
Transacts Value3.1 (Lesson fees)
MM-Embeds Persona3.1 (AVH) M-Location3.2 w/ SA
MM-Disables Persona3.1 M-Location3.1
User2 (Teacher) MM-Disables Persona2.1 M-Location2.3
MM-Embeds Persona2.3 M-Location2.4 w/ SA
MM-Animates Persona2.3 M-Location2.4
Persona3.1 (dances)
Persona2.1 (dances)

6.2.3      Actions, Items, and Data Types

Actions Items Data Types
MM-Animate Persona (AV) Amount
MM-Disable M-Location Currency
MM-Embed Object (A) Spatial Attitude
MM-Send Persona (AVH) Value
Track Service
Transact U-Location
Value

7       Initial functionality profiles

The structure of the Metaverse functionality profiles derived from the above includes hierarchical profiles and one independent profile. Profiles may have levels. As depicted in Figure 2, the currently identified profiles are baseline, management, finance, and high. The currently identified levels for baseline, management, and high profiles are audio only, audio-visual, and audio-visual-haptic.

Figure 3 – the currently identified Functionality Profiles

As an example, the baseline functionality profile enables a human equipped with a Device to allow their Users to:

  1. Author Entities, e.g., object models.
  2. Sense a scene at a U-Location:
    • UM-Capture a scene.
    • UM-Send data.
  3. MM-Embed Personae and objects:
    • MM-Add Persona and object.
    • Animate a Persona with a Stream (UM-Animate) or using a Process (MM-Animate).
  4. MU-Render an Entity at an M-Location to a U-Location.
  5. MM-Capture Entities at an M-Location.
  6. MM-Disable an Entity.

This Profile supports baseline lecture, meeting, and hang-out Use Cases. Transactions and User management are not supported.

Table 3 lists the Actions, Entities, and Data Types of the Baseline Functionality Profile.

Table 3 – Actions, Entities, and Data Types of the Baseline Functionality Profile

Actions Author Call Create Destroy
MM-Add MM-Animate MM-Capture MM-Embed
MM-Disable MM-Enable MM-Render MM-Send
MU-Render MU-Send Read Register
Track UM-Animate UM-Capture UM-Render
UM-Send Write
Items Device Event Experience Identifier
Map M-Instance M-Location Model
Object Persona Process Scene
Service Stream U-Environment U-Location
User
Data Types Address Coordinates Orientation Position
  Spatial Attitude
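A functionality profile can be read as a set of supported Actions, so checking whether a Use Case fits a profile reduces to a set difference. The sketch below transcribes the Actions of Table 3 and the Actions listed for the Virtual Dance Use Case; the function name `missing_actions` and the constant names are hypothetical, and the comparison covers Actions only, not Items or Data Types.

```python
from typing import FrozenSet, List

# Actions of the Baseline Functionality Profile, transcribed from Table 3.
BASELINE_ACTIONS: FrozenSet[str] = frozenset({
    "Author", "Call", "Create", "Destroy",
    "MM-Add", "MM-Animate", "MM-Capture", "MM-Embed",
    "MM-Disable", "MM-Enable", "MM-Render", "MM-Send",
    "MU-Render", "MU-Send", "Read", "Register",
    "Track", "UM-Animate", "UM-Capture", "UM-Render",
    "UM-Send", "Write",
})

# Actions listed for the Virtual Dance Use Case.
VIRTUAL_DANCE_ACTIONS: FrozenSet[str] = frozenset({
    "MM-Animate", "MM-Disable", "MM-Embed", "MM-Send", "Track", "Transact",
})


def missing_actions(required: FrozenSet[str], profile: FrozenSet[str]) -> List[str]:
    """Return the Actions a Use Case needs that the profile does not offer."""
    return sorted(required - profile)


print(missing_actions(VIRTUAL_DANCE_ACTIONS, BASELINE_ACTIONS))  # ['Transact']
```

The result matches the statement that the baseline profile does not support Transactions: Virtual Dance, which charges lesson fees, needs a profile that includes Transact.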

The four identified profiles serve well the needs conveyed by the identified functionalities. As more functionalities are added, the number of profiles, and potentially of levels, is likely to increase.

8       Conclusions

Reference [2] demonstrates the feasibility of the first two milestones of the proposed MPAI roadmap to Metaverse interoperability. Currently, four functionality profiles supporting the selected functionalities have been identified and specified. As more basic Metaverse elements are added, however, more functionality profiles are likely to be found necessary. Functionality profiles can be extended and restructured as more Functionalities are added.

The next step of the MPAI roadmap to Metaverse interoperability is the development of Technical Specification – MPAI Metaverse Model (MPAI-MMM) – Metaverse Architecture.

9        References

  1. MPAI; Technical Report – MPAI Metaverse Model – Functionalities (MPAI-MMM); January 2023; https://mpai.community/standards/mpai-mmm/mpai-Metaverse-model/mmm-functionalities/
  2. MPAI; Technical Report – MPAI Metaverse Model – Functionality Profiles (MPAI-MMM); March 2023; https://mmm.mpai.community/
  3. MPAI; A roadmap to Metaverse Interoperability; February 2023; FGMV-I-012

[1] A Profile is a set of one or more base standards and, if applicable, chosen classes, subsets, options, and parameters of those standards that are necessary for accomplishing a particular function.

[2] A Level is a subdivision of a Profile indicating the completeness of the User experience.


The MPAI metaverse standardisation proposal

1. Introduction

Metaverse is expected to create new jobs, opportunities, and experiences and transform virtually all sectors of human interaction. To harness its potential, however, there are hurdles to overcome:

  1. There is no common agreement on what a “metaverse” is or should be.
  2. The potential users of the metaverse are too disparate.
  3. Many successful independent implementations of “metaverse” already exist.
  4. Some important enabling technologies may be years away.

To seamlessly use metaverses, standards are needed. Before engaging in metaverse standardisation, we should find a way to achieve metaverse standardisation, even without:

  1. Reaching an agreement on what a metaverse is.
  2. Disenfranchising potential users.
  3. Alienating existing initiatives.
  4. Dealing with technologies (for now).

2. The MPAI proposal

MPAI – Moving Picture, Audio, and Data Coding by Artificial Intelligence – believes that developing a (set of) metaverse standards is a very challenging goal. Metaverse standardisation requires that we should:

  1. Start small and grow.
  2. Be creative and devise a new working method.
  3. Test the method.
  4. Gather confidence in the method.
  5. Achieve a wide consensus on the method.

Then, we could develop a (set of) metaverse standards.

MPAI has developed an initial roadmap:

  1. Build a metaverse terminology.
  2. Agree on basic assumptions.
  3. Collect metaverse functionalities.
  4. Develop functionality profiles.
  5. Develop a metaverse architecture.
  6. Develop functional requirements of the metaverse architecture data types.
  7. Develop a Table of Contents of Common Metaverse Specifications.
  8. Map technologies to the ToC of the Common Metaverse Specifications.

Step #1 – develop a common terminology

We need no convincing of the importance of this step as many are developing metaverse terminologies. Unfortunately, there is no attempt at converging to an industry-wide terminology.

The terminology should:

  • Have an agreed scope.
  • Be technology- and business-agnostic.
  • Not use one industry’s terms if they are used by more than one industry.

The terminology is:

  • Intimately connected with the standard that will use it.
  • Functional to the following milestones of the roadmap.

MPAI has already defined some 150 classified metaverse terms and encourages the convergence of existing terminology initiatives. The MPAI terminology can be found here.

Step #2 – Agree on basic assumptions

Assumptions are needed for a multi-stakeholder project because designing a roadmap depends on the goal and on the methods used to reach it.

MPAI has laid down 16 assumptions which it proposes for a discussion. Here the first 3 assumptions are presented. All the assumptions can be found here.

  • Assumption #1: As there is no agreement on what a metaverse is, let’s accept all legitimate requests of “metaverse” functionalities.
    • Note: an accepted functionality does not imply that a Metaverse Instance shall support it.
  • Assumption #2: Common Metaverse Specifications (CMS) will be developed.
    • Note: they will provide the technologies supporting identified Functionalities.
  • Assumption #3: CMS Technologies will be grouped into Technology Profiles in response to industry needs.
    • Note: A profile shall maximise the number of technologies supported by specific groups of industries.

The notion of profile is well known and widely used in digital media standardisation and is defined here:

A set of one or more base standards, and, if applicable, their chosen classes, subsets, options and parameters, necessary for accomplishing a particular function.

Step #3 – Collect metaverse functionalities

The number of industries potentially interested in deploying metaverse is very large and MPAI has explored 18 of them. See here. A metaverse implementation is also likely to use external service providers – interfaces should be defined. See here.

MPAI has collected more than 150 functionalities organised in: Areas – Subareas – Titles. See here.

This task is not complete; it is only the beginning. Collecting metaverse functionalities should be a continuous task.

Step #4 – Which profiles?

The notion of profile is not currently implementable because some key technologies are not yet available and it is not clear which technologies, existing or otherwise, will eventually be selected.

MPAI proposes to introduce a new type of profile – functionality profile, characterised by the functionalities offered, not by the technologies implementing them. By dealing only with functionalities and not technologies, profile definition is not “contaminated” by technology considerations. MPAI is in the process of developing:

Technical Report – MPAI Metaverse Model (MPAI-MMM) – Functionality Profiles.

It is expected that the Technical Report will be published for Community Comments on the 26th of March and finally adopted on the 19th of April 2023. It will contain the following table of contents:

  1. Scalable Metaverse Operational Model.
  2. Actions (what you do in the metaverse):
    1. Purpose – what the Action is for.
    2. Payload – data to the Metaverse.
    3. Response – data from the Metaverse.
  3. Items (what you perform Actions on):
    1. Purpose – what the Item is for.
    2. Data – functional requirements.
    3. Metadata – functional requirements.
  4. Example functionality profiles.

The Technical Report will neither contain nor make reference to technologies.

3. The next steps of the MPAI proposal

The Technical Report will demonstrate that it is possible to develop metaverse functionality profiles (and levels) that do not make reference to technologies, only to functionalities.

Step #5 – Develop a metaverse architecture.

The goal is to specify a Metaverse Architecture, including the main functional blocks and the data types exchanged between blocks.

Step #6 – Develop functional requirements of the metaverse architecture data types.

The goal is to develop the functional requirements of the data types exchanged between functional blocks of the metaverse architecture.

Step #7 – Develop the Table of Contents of the Common Metaverse Specifications.

The goal is to produce an initial Table of Contents (ToC) of Common Metaverse Specifications to have a clear understanding of which Technologies are needed for which purpose in which parts of the metaverse architecture to achieve interoperability.

Step #8 – Map technologies to the ToC of the Common Metaverse Specifications.

MPAI intends to map its relevant technologies and see how they fit in the Common Metaverse Specification architecture. Other SDOs are invited to join the effort.

4. Conclusions

Of course, step #8 will not provide the metaverse specifications but a tested method to produce them. MPAI envisages reaching step #8 in December 2023. It is a good price to pay before engaging in a perilous project.