Moving Picture, Audio and Data Coding
by Artificial Intelligence


MPAI releases reference software leveraging AI Framework and Neural Network Watermarking for Generative AI applications

Geneva, Switzerland – 20 March 2024. MPAI – Moving Picture, Audio and Data Coding by Artificial Intelligence – the international, non-profit, and unaffiliated organisation developing AI-based data coding standards has concluded its 42nd General Assembly (MPAI-42) approving the release of Reference Software using Neural Network Watermarking for Generative AI applications.

The new V1.1 version of the Neural Network Watermarking (MPAI-NNW) Reference Software includes an implementation of the AI Framework (MPAI-AIF) and of an AI Workflow enabling a user to make queries that include a text and an image and obtain a vocal response. The inference is watermarked so that the issuer of the query can ascertain that the response they receive comes from the intended source. The Software will be presented online on the 16th of April at 15:00 UTC. Register at https://us06web.zoom.us/meeting/register/tZ0udeutqT0vHdBh1DLiUxoRr59cUs7iQzzN.

Presentations and video recordings of all MPAI standards are available (ppt = PowerPoint file, YT = YouTube, nYT = WimTV):

AI Framework (MPAI-AIF) ppt YT nYT
Context-based Audio Enhancement (MPAI-CAE) ppt YT nYT
Connected Autonomous Vehicle (MPAI-CAV) – Architecture ppt  YT nYT
Compression and Understanding of Industrial Data (MPAI-CUI) ppt YT nYT
Governance of the MPAI Ecosystem (MPAI-GME) ppt YT nYT
Human and Machine Communication (MPAI-HMC) ppt YT nYT 
Multimodal Conversation (MPAI-MMC) ppt YT nYT
MPAI Metaverse Model (MPAI-MMM) – Architecture ppt  YT  nYT
Neural Network Watermarking (MPAI-NNW) ppt YT nYT
Object and Scene Description (MPAI-OSD) ppt YT nYT
Portable Avatar Format (MPAI-PAF) ppt  YT  nYT

MPAI is continuing its work plan, which involves the following activities:

  1. AI Framework (MPAI-AIF): developing open-source applications based on the AI Framework.
  2. AI for Health (MPAI-AIH): developing the specification of a system enabling clients to improve models processing health data and federated learning to share the training.
  3. Context-based Audio Enhancement (CAE-DC): preparing new projects.
  4. Connected Autonomous Vehicle (MPAI-CAV): Functional Requirements of the data used by the MPAI-CAV – Architecture standard.
  5. Compression and Understanding of Industrial Data (MPAI-CUI): preparation for an extension to existing standard that includes support for more corporate risks.
  6. End-to-End Video Coding (MPAI-EEV): video coding using AI-based End-to-End Video coding.
  7. AI-Enhanced Video Coding (MPAI-EVC): video coding with AI tools added to existing tools.
  8. Human and Machine Communication (MPAI-HMC): developing reference software.
  9. Multimodal Conversation (MPAI-MMC): developing reference software and conformance testing and exploring new areas.
  10. MPAI Metaverse Model (MPAI-MMM): developing reference software specification and identifying metaverse technologies requiring standards.
  11. Neural Network Watermarking (MPAI-NNW): reference software for enhanced applications.
  12. Portable Avatar Format (MPAI-PAF): reference software, conformance testing and new areas.
  13. Server-based Predictive Multiplayer Gaming (MPAI-SPG): technical report on mitigation of data loss and cheating.
  14. XR Venues (MPAI-XRV): development of the standard.

Legal entities and representatives of academic departments supporting the MPAI mission and able to contribute to the development of standards for the efficient use of data can become MPAI members.

Please visit the MPAI website, contact the MPAI secretariat for specific information, subscribe to the MPAI Newsletter and follow MPAI on social media: LinkedIn, Twitter, Facebook, Instagram, and YouTube.

 

 


Recent MPAI standards – presentations and video recordings

In the last few months, MPAI has published eight new or updated MPAI standards. They were presented online during the week of 11-15 March 2024.

Here are the titles of the standards with links to the presentations and to the video recordings provided by two services. They are a good opportunity to stay abreast of the progress in MPAI.

rev MPAI Metaverse Model  (MPAI-MMM) – Architecture ppt  YT  nYT
new Portable Avatar Format  (MPAI-PAF) ppt  YT  nYT
new Human and Machine Communication  (MPAI-HMC) ppt YT nYT 
new Connected Autonomous Vehicle  (MPAI-CAV) – Architecture ppt  YT nYT
rev Context-based Audio Enhancement (MPAI-CAE) ppt YT nYT
new Object and Scene Description (MPAI-OSD) ppt YT nYT
rev Multimodal Conversation  (MPAI-MMC) ppt YT nYT
rev AI Framework (MPAI-AIF) ppt YT nYT
MPAI presentation ppt YT nYT

MPAI publishes two standards: the new version of Context-based Audio Enhancement and the new Human and Machine Communication

Geneva, Switzerland – 21 February 2024. MPAI – Moving Picture, Audio and Data Coding by Artificial Intelligence – the international, non-profit, and unaffiliated organisation developing AI-based data coding standards has concluded its 41st General Assembly (MPAI-41) approving the publication of two standards and announcing the availability of all its standards in linked form on the web.

Context-based Audio Enhancement (MPAI-CAE) V2.1 extends the previously published Version 2.0 adding full online references to the specification of all AI Workflows, AI Modules, JSON Metadata, and Data Types used by the standard.

Human and Machine Communication (MPAI-HMC) V1.0 integrates a wide range of technologies from existing MPAI standards to enable new forms of communication between entities, i.e., humans present or represented in a real or virtual space or machines represented in a virtual space as speaking avatars and acting in a context using text, speech, face, gesture, and the audio-visual scene in which they are embedded.

In the week of 11-15 March, MPAI will present its recently published standards in a series of 40-minute online sessions. The presentations will illustrate the scope, the features, and the technologies of each standard and will be followed by open discussions. The new web-based access to all published MPAI standards will also be presented. All times are UTC.

Standard | March (day, time) | Registration
AI Framework (MPAI-AIF) | 11 T16:00 | Link
Context-based Audio Enhancement (MPAI-CAE) | 12 T17:00 | Link
Connected Autonomous Vehicle (MPAI-CAV) – Architecture | 13 T15:00 | Link
Human and Machine Communication (MPAI-HMC) | 13 T16:00 | Link
Multimodal Conversation (MPAI-MMC) | 12 T14:00 | Link
MPAI Metaverse Model (MPAI-MMM) – Architecture | 15 T15:00 | Link
Portable Avatar Format (MPAI-PAF) | 14 T14:00 | Link

MPAI is continuing its work plan, which involves the following activities:

  1. AI Framework (MPAI-AIF): developing open-source applications based on the AI Framework.
  2. AI for Health (MPAI-AIH): developing the specification of a system enabling clients to improve models processing health data and federated learning to share the training.
  3. Context-based Audio Enhancement (CAE-DC): preparing new projects.
  4. Connected Autonomous Vehicle (MPAI-CAV): Functional Requirements of the data used by the MPAI-CAV – Architecture standard.
  5. Compression and Understanding of Industrial Data (MPAI-CUI): preparation for an extension to existing standard that includes support for more corporate risks.
  6. Human and Machine Communication (MPAI-HMC): developing reference software.
  7. Multimodal Conversation (MPAI-MMC): developing reference software and conformance testing, and exploring new areas.
  8. MPAI Metaverse Model (MPAI-MMM): developing reference software specification and identifying metaverse technologies requiring standards.
  9. Neural Network Watermarking (MPAI-NNW): reference software for enhanced applications.
  10. Portable Avatar Format (MPAI-PAF): reference software, conformance testing and new areas.
  11. End-to-End Video Coding (MPAI-EEV): video coding using AI-based End-to-End Video coding.
  12. AI-Enhanced Video Coding (MPAI-EVC): video coding with AI tools added to existing tools.
  13. Server-based Predictive Multiplayer Gaming (MPAI-SPG): technical report on mitigation of data loss and cheating.
  14. XR Venues (MPAI-XRV): development of the standard.


Standards that innovate technology and standardisation

At its 40th General Assembly (MPAI-40), MPAI approved one draft, one new, and three extension standards. For an organisation that already has nine standards in its game bag, this may not look like big news. There are two reasons, though, to consider this a remarkable moment in MPAI’s short but intense life.

The first reason is that the draft standard posted for Community Comments – Human and Machine Communication (MPAI-HMC) – does not specify new technologies but leverages technologies from existing MPAI standards: Context-based Audio Enhancement (MPAI-CAE), Multimodal Conversation (MPAI-MMC), the newly approved Object and Scene Description (MPAI-OSD), and Portable Avatar Format (MPAI-PAF).

If not new technologies, what does MPAI-HMC specify then? To answer this question let’s consider Figure 1.

Figure 1 – The MPAI-HMC communications model

The human labelled as #1 is part of a scene with audio and visual attributes and communicates with the Machine by transmitting speech information and the entire audio-visual scene including him or herself. The Machine receives that information, processes it, and emits internally generated audio-visual scenes that include itself uttering vocal and displaying visual manifestations of its own internal state generated to interact more naturally with the human. The human may also communicate with the Machine when other humans are in the scene with him or her and the Machine can discern the individual human and identify (i.e., give a name to) audio and visual objects. However, only one human at a time can communicate with the Machine.

The Machine need not capture the human in a real space. His or her digital representation can be rendered in a Virtual Space as a Digitised Human. The human may not be alone but together with other Digitised Humans or with Virtual Humans, i.e., audio-visual representations of processes, such as Machines. For this reason, we will use the word Entity to indicate both a human or their avatar and a Machine rendered as an avatar.

The Machine can also act as an interpreter between the Entities and Contexts labelled as #1 or #2 and #3 or #4. By Context we mean information surrounding an Entity that provides additional insight into the information communicated by the Entity. An example of Context is language and, more generally, culture.

Communication between #1 and #3 represents the case of a human in a Context communicating with a Machine, e.g., an information service, in another Context. In this case the Machine communicates with the human by sensing and actuating audio-visual information, but the communication between the Machine and #3 may use a different communication protocol. The payload used to communicate is the “Portable Avatar” defined as a Data Type specified by the MPAI-PAF standard representing an Avatar and its Context.

Communication between the human in #1 and the Machine is based on raw audio-visual communication, while communication between the Machine and Entity #3 is carried out using a Portable Avatar.

Read a collection of usage scenarios.

The name of the standard is Human and Machine Communication (MPAI-HMC). It is published as a draft with a request for Community Comments, the last step before publication. Comments are due by 2024/02/19T23:59 UTC to secretariat@mpai.community.

To explain the second reason why the 40th General Assembly is a remarkable moment, we have to recall that most MPAI application standards are based on the notion of an AI Workflow (AIW) composed of interconnected AI Modules (AIM) executed in the AI Framework (AIF) specified by the MPAI-AIF standard. Four out of five documents are now published in a new format where the Use Cases, AI Modules, and Data Types chapters make reference to a common body of AIMs and Data Types.

Component-based software engineering aims to build software out of modular components. MPAI is implementing this notion in the world of standards.
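To make the component idea concrete, here is a minimal Python sketch of workflows that reference AI Modules from a shared catalogue. It is only an illustration under our own assumptions: the class names, the catalogue, the toy modules, and the simple linear chaining are all hypothetical and are not the MPAI-AIF API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AIModule:
    """An AI Module (AIM): a named component with one processing function."""
    name: str
    process: Callable[[dict], dict]


@dataclass
class AIWorkflow:
    """An AI Workflow (AIW): here simply AIMs executed one after another."""
    name: str
    modules: List[AIModule] = field(default_factory=list)

    def run(self, data: dict) -> dict:
        for module in self.modules:
            data = module.process(data)
        return data


# Hypothetical shared catalogue: several workflows reference the same AIMs.
CATALOGUE: Dict[str, AIModule] = {
    "speech_recognition": AIModule(
        "speech_recognition", lambda d: {**d, "text": d["speech"].lower()}
    ),
    "text_reply": AIModule(
        "text_reply", lambda d: {**d, "reply": "You said: " + d["text"]}
    ),
}

conversation = AIWorkflow(
    "conversation",
    [CATALOGUE["speech_recognition"], CATALOGUE["text_reply"]],
)
transcription = AIWorkflow("transcription", [CATALOGUE["speech_recognition"]])

result = conversation.run({"speech": "HELLO"})
```

Both workflows reuse the same `speech_recognition` module, which is the point of keeping AIMs and Data Types in a common body rather than redefining them in each standard.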

See the links below and enjoy:

MPAI-HMC: https://mpai.community/standards/mpai-hmc/mpai-hmc-specification/

MPAI-MMC: https://mpai.community/standards/mpai-mmc/mpai-mmc-specification/

MPAI-OSD: https://mpai.community/standards/mpai-osd/mpai-osd-specification/

MPAI-PAF: https://mpai.community/standards/mpai-paf/mpai-paf-specification/


MPAI publishes 5 documents: 1 draft for community comments, 3 extensions, and 1 new standard

Geneva, Switzerland – 24 January 2024. MPAI – Moving Picture, Audio and Data Coding by Artificial Intelligence – the international, non-profit, and unaffiliated organisation developing AI-based data coding standards has concluded its 40th General Assembly (MPAI-40) approving the publication of a range of standards covering disparate technologies and application domains.

Human and Machine Communication (MPAI-HMC) is a draft published for Community Comments, the last step before publication. It includes a wide range of technologies available from existing MPAI standards to enable an Entity, i.e., a human or a machine, to communicate with other Entities as humans do. Comments are due by 2024/02/19T23:59 UTC to secretariat@mpai.community.

The newly approved Object and Scene Description (MPAI-OSD) V1.0 standard provides important technologies enabling the digital representation of the position and orientation of Audio and Visual Objects and their combinations in Scenes. The MPAI-OSD capabilities enhance the usability of the new Multimodal Conversation (MPAI-MMC) V2.1 and Portable Avatar Format (MPAI-PAF) V1.1.

MPAI Metaverse Model – Architecture (MPAI-MMM) V1.1 updates the MMM-Architecture Metadata to streamline communication between the Processes of a Metaverse Instance and uses the new MPAI-MMM Scripting Language (MMM-Script) to represent a wide range of use cases.

MPAI is now offering an innovative way to access its new standards via the web:

MPAI-HMC: https://mpai.community/standards/mpai-hmc/mpai-hmc-specification/

MPAI-MMC: https://mpai.community/standards/mpai-mmc/mpai-mmc-specification/

MPAI-OSD: https://mpai.community/standards/mpai-osd/mpai-osd-specification/

MPAI-PAF: https://mpai.community/standards/mpai-paf/mpai-paf-specification/

MPAI is continuing its work plan, which involves the following activities:

  1. AI Framework (MPAI-AIF): reference software, conformance testing, and application areas.
  2. AI for Health (MPAI-AIH): reference model and technologies for a system enabling clients to improve models processing health data and federated learning to share the training.
  3. Context-based Audio Enhancement (CAE-DC): new projects are brewing.
  4. Connected Autonomous Vehicle (MPAI-CAV): Functional Requirements of the data used by the MPAI-CAV – Architecture standard.
  5. Compression and Understanding of Industrial Data (MPAI-CUI): preparation for an extension to existing standard that includes support for more corporate risks.
  6. Human and Machine Communication (MPAI-HMC): model and technologies enabling a human or a machine to communicate with a machine or a human in a different cultural environment.
  7. Multimodal Conversation (MPAI-MMC): drafting reference software and conformance testing, and exploring new areas.
  8. MPAI Metaverse Model (MPAI-MMM): reference software and metaverse technologies requiring standards.
  9. Neural Network Watermarking (MPAI-NNW): reference software for enhanced applications.
  10. Portable Avatar Format (MPAI-PAF): reference software, conformance testing and new areas.
  11. End-to-End Video Coding (MPAI-EEV): video coding using AI-based End-to-End Video coding.
  12. AI-Enhanced Video Coding (MPAI-EVC): video coding with AI tools added to existing tools.
  13. Server-based Predictive Multiplayer Gaming (MPAI-SPG): technical report on mitigation of data loss and cheating.
  14. XR Venues (MPAI-XRV): development of the standard.


MPAI publishes Context-based Audio Enhancement and Object and Scene Description for Community Comments

Geneva, Switzerland – 20 December 2023. MPAI, Moving Picture, Audio and Data Coding by Artificial Intelligence, the international, non-profit, and unaffiliated organisation developing AI-based data coding standards has concluded its 39th General Assembly (MPAI-39) approving the publication of the Context-based Audio Enhancement standard and Object and Scene Description standard for Community Comments.

The draft of the Context-based Audio Enhancement (MPAI-CAE) Version 2.1 standard enhances the compatibility of the Audio with the Visual and the Audio-Visual Scene Description specified by the draft Object and Scene Description (MPAI-OSD) standard. Both are published with requests for Community Comments. These are due by 2024/01/23T23:59 UTC and 17T23:58 UTC, respectively, to secretariat@mpai.community.

MPAI is continuing its work plan, which involves the following activities:

  1. AI Framework (MPAI-AIF): reference software, conformance testing, and application areas.
  2. AI for Health (MPAI-AIH): reference model and technologies for a system enabling clients to improve models processing health data and federated learning to share the training.
  3. Context-based Audio Enhancement (CAE-DC): new projects are brewing.
  4. Connected Autonomous Vehicle (MPAI-CAV): Functional Requirements of the data used by the MPAI-CAV – Architecture standard.
  5. Compression and Understanding of Industrial Data (MPAI-CUI): preparation for an extension to existing standard that includes support for more corporate risks.
  6. Human and Machine Communication (MPAI-HMC): model and technologies enabling a human or a machine to communicate with a machine or a human in a different cultural environment.
  7. Multimodal Conversation (MPAI-MMC): drafting reference software and conformance testing, and exploring new areas.
  8. MPAI Metaverse Model (MPAI-MMM): reference software and metaverse technologies requiring standards.
  9. Neural Network Watermarking (MPAI-NNW): reference software for enhanced applications.
  10. Portable Avatar Format (MPAI-PAF): reference software, conformance testing and new areas.
  11. End-to-End Video Coding (MPAI-EEV): video coding using AI-based End-to-End Video coding.
  12. AI-Enhanced Video Coding (MPAI-EVC): video coding with AI tools added to existing tools.
  13. Server-based Predictive Multiplayer Gaming (MPAI-SPG): technical report on mitigation of data loss and cheating.
  14. XR Venues (MPAI-XRV): development of the standard.


Connected autonomous vehicles can be connected to the metaverse

In a previous article, we described the Architecture of the MPAI Connected Autonomous Vehicle (CAV). The CAV’s Environment Sensing Subsystem (ESS) captures data from the environment with a variety of sensors and produces the Basic Environment Representation (BER), which is passed to the Autonomous Motion Subsystem (AMS). The AMS exchanges (subsets of) the BER with other CAVs in range and uses the received information to produce the Full Environment Representation (FER). The AMS can then issue commands to the Motion Actuation Subsystem (MAS) to move the CAV toward the destination.
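The data flow just described can be sketched in a few lines of Python. This is a toy illustration under our own assumptions, not the MPAI-CAV specification: the representation (a map from object identifiers to positions), the function names, and the rule that a CAV's own observations take precedence over received ones are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Position = Tuple[float, float]


@dataclass(frozen=True)
class EnvironmentRepresentation:
    """Toy stand-in for a BER or FER: object identifier -> position."""
    objects: Dict[str, Position]


def sense_environment(sensor_readings: List[Dict[str, Position]]) -> EnvironmentRepresentation:
    """ESS: fuse the readings of several sensors into a BER."""
    fused: Dict[str, Position] = {}
    for reading in sensor_readings:
        fused.update(reading)
    return EnvironmentRepresentation(fused)


def build_fer(
    own_ber: EnvironmentRepresentation,
    received_bers: List[EnvironmentRepresentation],
) -> EnvironmentRepresentation:
    """AMS: combine the CAV's BER with (subsets of) BERs from CAVs in range."""
    merged = dict(own_ber.objects)
    for ber in received_bers:
        for object_id, position in ber.objects.items():
            # Assumption: the CAV's own observations take precedence.
            merged.setdefault(object_id, position)
    return EnvironmentRepresentation(merged)


def plan_commands(fer: EnvironmentRepresentation, destination: str) -> List[str]:
    """AMS: produce commands for the MAS (placeholder: a single command)."""
    return [f"move toward {destination}, aware of {len(fer.objects)} objects"]


ber = sense_environment([{"tree": (1.0, 2.0)}, {"car": (3.0, 4.0)}])
other_cav_ber = EnvironmentRepresentation({"car": (9.0, 9.0), "bike": (5.0, 5.0)})
fer = build_fer(ber, [other_cav_ber])
commands = plan_commands(fer, "home")
```

Here the FER contains the bike that only the other CAV could see, which is the benefit of exchanging BER subsets between CAVs in range.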

In another article, we described the Architecture of the MPAI Metaverse Model (MPAI-MMM), where a Metaverse Instance (M-Instance) is defined as a set of Processes providing some or all of the following functions (terms beginning with a lowercase letter are in the Universe and terms beginning with a capital letter are in an M-Instance):

  1. To sense data from U-Locations.
  2. To process the sensed data and produce Data.
  3. To produce one or more M-Environments populated by Objects that can be either digitised or virtual, the latter with or without autonomy.
  4. To process Objects from the M-Instance or potentially from other M-Instances to affect U-Locations (in the Universe) and/or M-Locations (in this or other M-Instances) using Objects in ways that are:
    • Consistent with the goals set for the M-Instance.
    • Effected within the capabilities of the M-Instance.
    • Complying with the Rules set for the M-Instance and applicable laws.

At first glance, a CAV’s BER and FER bear a lot of similarities to the M-Instance of the MMM Architecture, as we can see from the comparison in Table 1.

 

Table 1 – Comparison between M-Instance and CAV

M-Instance | CAV
An M-Instance is a set of Processes providing some or all the following functions: | A CAV is a set of Processes (Subsystems and AI Modules) providing the following functions:
1. To sense data from U-Locations. | 1. To sense data from the environment.
2. To process the sensed data and produce Data. | 2. To process the sensed data and produce Data processable by the CAV, in particular BERs.
3. To produce one or more M-Environments populated by Objects that can be either digitised or virtual, the latter with or without autonomy. | 3. To produce one M-Instance populated by Objects.
4. To process Objects from the M-Instance or potentially from other M-Instances to affect U- and/or M-Environments using Objects in ways that are: | 4. To process (subsets of) BERs from the CAV’s M-Instance and potentially from other CAVs’ M-Instances in ways that are:
4.1. Consistent with the goals set for the M-Instance. | 4.1. Consistent with the goals set to the CAVs to reach a destination.
4.2. Effected within the capabilities of the M-Instance. | 4.2. Effected within the CAV’s capabilities (processing but also physical).
4.3. Complying with the Rules set for the M-Instance and applicable laws. | 4.3. Complying with the Rules (law and traffic regulations).

We need to look in more detail into these “similarities”. Before proceeding, let’s recall three assumptions at the basis of MPAI-MMM – Architecture:

  1. User is a type of Process that represents and acts on behalf of a human. A human may have more than one User in an M-Instance.
  2. Persona is a rendered User.
  3. User may have or acquire the Rights to perform an Action, e.g., to authenticate another User.

To do that, let’s consider the simple case of two CAVs, CAVA and CAVB, owned respectively by humanA and humanB, where humanA is a friend of humanB. humanA has two Users: UserA.1, who represents humanA in the Human-CAV Interaction (HCI) Subsystem (or M-EnvironmentA.1), and UserA.2, who represents humanA in the Autonomous Motion Subsystem (or M-EnvironmentA.2). The same applies to humanB.

humanA wants to see the landscape seen by humanB in their CAVB.

This is a simplified description of the workflow (a fuller workflow is in the MPAI-CAV – Architecture standard):

  1. humanA requests UserA.1 (HCI) to take them to a destination.
  2. UserA.1 requests UserA.2 (AMS) to take CAVA to the destination.
  3. UserA.2
    • Gets the BER from CAVA’s ESS (or M-Environment3).
    • Computes the Route to Destination.
    • Issues a series of Commands to the MAS.
    • Authenticates its peer UserB.2.
    • Gets a subset of the BER from UserB.2.
    • Produces CAVA’s FER.
  4. UserA.1
    • Authenticates its peer User2.
    • Renders their Persona in CAVB (e.g., using advanced 3D rendering technologies).
    • Converses with humanB.
    • Watches CAVB’s M-Location corresponding to the environment currently traversed by CAVB.

This example is a first demonstration of the compatibility of an M-Instance produced by a CAV implementing the MPAI-CAV – Architecture standard with the MPAI-MMM – Architecture standard.


MPAI releases new version of Neural Network Watermarking Reference Software; starts new project on XR Venues – Live Theatrical Stage Performance

Geneva, Switzerland – 22 November 2023. MPAI, Moving Picture, Audio and Data Coding by Artificial Intelligence, the international, non-profit, and unaffiliated organisation developing AI-based data coding standards has concluded its 38th General Assembly (MPAI-38) approving the release of a new version of its Neural Network Watermarking reference software and the start of the development of the new XR Venues – Live Theatrical Stage Performance standard.

The new version of the Neural Network Watermarking (MPAI-NNW) reference software makes it possible to upgrade conventional AI-based processing workflows with traceability and integrity checking functions. For instance, it is now possible to add AI Modules to an MPAI-AIF workflow to detect whether a particular text was indeed produced by the expected service or AI Module (AIM). Register to attend the online presentation on 2023/12/12T15:00 UTC.

The XR Venues (MPAI-XRV) – Live Theatrical Stage Performance standard project specifies functions and interfaces of AI Modules designed to automate live multisensory immersive stage performances which ordinarily require extensive on-site show control staff to operate. By running AI Workflows (AIW) composed of AIMs, it will be possible to obtain a more direct, precise yet spontaneous show implementation and control of multiple complex systems to achieve the show director’s vision.

MPAI is continuing its work plan, which involves the following activities:

  1. AI Framework (MPAI-AIF): reference software, conformance testing, and application areas.
  2. AI for Health (MPAI-AIH): development of the standard.
  3. Context-based Audio Enhancement (CAE-DC): new projects are brewing.
  4. Connected Autonomous Vehicle (MPAI-CAV): Functional Requirements of data used by the CAV architecture.
  5. Compression and Understanding of Industrial Data (MPAI-CUI): preparation for an extension to existing standard.
  6. Multimodal Conversation (MPAI-MMC): reference software, drafting conformance testing, and new areas.
  7. MPAI Metaverse Model (MPAI-MMM): reference software and metaverse technologies requiring standards.
  8. Neural Network Watermarking (MPAI-NNW): reference software for enhanced applications.
  9. Portable Avatar Format (MPAI-PAF): reference software, conformance testing and new areas.
  10. End-to-End Video Coding (MPAI-EEV): video coding using AI-based End-to-End Video coding.
  11. AI-Enhanced Video Coding (MPAI-EVC): video coding with AI tools added to existing tools.
  12. Server-based Predictive Multiplayer Gaming (MPAI-SPG): technical report on mitigation of data loss and cheating.
  13. XR Venues (MPAI-XRV): development of the standard.


Visiting MPAI standards – the MPAI Metaverse Model foundations

Much has been and is being said about the vagueness of the notion of “metaverse”. To compensate for this, the current trend is to add an adjective to the “metaverse” name. So, now we have studies on industrial metaverse, medical metaverse, tourist metaverse, and more.

In the early phases of its metaverse studies, when it was scoping the field, MPAI did consider 18 metaverse domains (use cases). Now, however, that phase is over because the right approach to standards is to identify what is common first and the differences (profiles) later.

In this paper we will try to identify features that are expected to be common across metaverse instances (M-Instances).

The basic metaverse features are the ability to:

  • Sense U-Environments (i.e., portions of the Universe) and their elements: inanimate and animate objects, and measurable features (temperature, pressure, etc.). By animate we mean humans, animals, and machines that move such as robots.
  • Create a virtual space (M-Instance) and its subsets (M-Environments).
  • Populate virtual spaces with digitised objects (captured from U-Environments) and virtual objects (created in the M-Instance).
  • Communicate with other M-Instances.
  • Actuate U-Environments as a result of the activities taking place in the M-Instance.

How will such an M-Instance be implemented?

We assume that an M-Instance is composed of a set of processes running on a computing environment. Of course, the M-Instance could be implemented as a single process, but this is a detail. What is important is that the M-Instance implements a variety of functions. Here we assume that functions correspond to processes. These are individually activated where they are accessible at the atomic level or by a single large process.

While a process is a process is a process, it is useful to characterise some processes. The first type of process is a Device, having the task to “connect” a U-Environment with an M-Environment. We assume that to achieve a safe governance, a Device should be connected to an M-Instance under the responsibility of a human. The second type of process is the User, a process that “represents” a human in the M-Instance and acts on their behalf. The third type is a Service, a process able to perform specific functions such as creating objects. The fourth type is an App, a process running on a Device. An example of App is a User that is not executing in the metaverse platform but rather on the Device.

An M-Instance includes objects connected with a U-Instance; some objects, like digitised humans, mirror activities carried out in the Universe; activities in the M-Instance may have effects on U-Environments. There are sufficient reasons to assume that the operation of an M-Instance should be governed by Rules. A reasonable application of the notion of Rules is that a human wishing to connect a Device to, or deploy Users in, an M-Instance should register and provide some data.

Data required to register could be a subset of the human’s Personal Profile, Device IDs, and User IDs. There are, however, other important elements that may have to be provided for a fuller experience. One is what we call Persona, i.e., an Avatar Model that the User process can utilise to render itself. Obviously, a User can be rendered as different Personae, if the Rules so allow. A second important element is the Wallet: a registering human may decide to allow one of their Users to access a particular Wallet to carry out its economic activity in the M-Instance.
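As an illustration only, the registration data described above could be modelled as follows. All class and field names, and the numeric limit on Personae per User, are hypothetical choices of ours and are not defined by MPAI-MMM; the sketch merely shows how a Rule could be checked against what a human registers.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UserEntry:
    """One User the human wishes to deploy (field names are hypothetical)."""
    user_id: str
    persona_ids: List[str] = field(default_factory=list)  # Avatar Models the User may render itself as
    wallet_id: Optional[str] = None  # Wallet the User is allowed to access, if any


@dataclass
class Registration:
    """Data a human provides when registering with an M-Instance."""
    personal_profile: dict  # subset of the human's Personal Profile
    device_ids: List[str]
    users: List[UserEntry]


def check_rules(registration: Registration, max_personae_per_user: int) -> List[str]:
    """Return the violations of a made-up Rule limiting Personae per User."""
    violations = []
    for user in registration.users:
        count = len(user.persona_ids)
        if count > max_personae_per_user:
            violations.append(
                f"{user.user_id}: {count} Personae exceed the limit of {max_personae_per_user}"
            )
    return violations


registration = Registration(
    personal_profile={"nickname": "alice"},
    device_ids=["device-1"],
    users=[
        UserEntry("user-1", persona_ids=["persona-a", "persona-b"], wallet_id="wallet-1"),
        UserEntry("user-2", persona_ids=["persona-c"]),
    ],
)
```

With a limit of two Personae per User the registration passes; with a limit of one, `check_rules` reports `user-1`.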

Figure 1 pictorially represents some of the points made so far.

Figure 1 – Universe-Metaverse interaction

The activities of a human in a U-Environment captured by a Device may drive the activities of a User in the M-Instance. The human can let one of their Users:

  1. Just execute in the M-Instance without rendering itself.
  2. Render itself as an autonomously animated Persona.
  3. Render itself as a Persona animated by the movements of the human.

We have treated the important case of a human and their User agent. What about other objects?

Besides processes performing various functions, an M-Instance is populated by Items, i.e., Data and Metadata supported by the M-Instance and bearing an Identifier. An Item may be produced by Identifying imported Data/Metadata or internally produced by an Authoring Service. This is depicted in Figure 2, where a User produces:

  1. Item1 by calling Authoring Service1.
  2. Item2 by importing data and metadata and then calling Identification Service2.

Figure 2 – Objects in an M-Instance
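
The two ways of producing an Item can be sketched in Python as follows. This is only an illustration: the class `Item`, the functions `identify` and `author`, and the use of UUIDs as Identifiers are assumptions made for the example and are not taken from the MPAI-MMM specification.

```python
from dataclasses import dataclass, field
import uuid


@dataclass
class Item:
    """Data and Metadata bearing an Identifier (hypothetical representation)."""
    identifier: str
    data: bytes
    metadata: dict = field(default_factory=dict)


def identify(data: bytes, metadata: dict) -> Item:
    """Hypothetical Identification Service: imported Data/Metadata become an Item."""
    return Item(identifier=str(uuid.uuid4()), data=data, metadata=dict(metadata))


def author(description: str) -> Item:
    """Hypothetical Authoring Service: the Item is produced inside the M-Instance."""
    data = description.encode("utf-8")  # stand-in for real authored content
    return Item(identifier=str(uuid.uuid4()), data=data, metadata={"authored": True})


item1 = author("a chair")                               # path 1: Authoring Service
item2 = identify(b"imported mesh", {"source": "file"})  # path 2: import, then Identify
```

In both paths the result carries an Identifier, which is what distinguishes an Item from plain Data and Metadata.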

A more complete view of an M-Instance is provided by Figure 3.

Figure 3 – M-Instance Model

In Figure 3 we see that:

  1. Human1 and Human3 are each connected to the M-Instance via one Device, but Human2 is connected via two Devices.
  2. Human1 has deployed one User, Human2 two Users and Human3
  3. User1.1 of Human1 is rendered as one Persona (Persona1.1.1), User2.1 of Human2 as two Personae (Persona2.1.1 and Persona2.1.2), and User2.2 as one Persona (Persona2.2.1).
  4. Object1 in the U-Environment is captured by 1 and Device3.1 and mapped as two distinct Objects: Object1.2 and Object3.1.
  5. Users and Services variously interact.

What are the interactions referred to in point 5 above? We assume that an M-Instance is populated by Services performing functions that are useful for the life of the M-Instance. We call standard functions “Actions”. MPAI has specified the functional requirements of a set of Actions:

  1. General Actions (Register, Change, Hide, Authenticate, Identify, Modify, Validate, Execute).
  2. Call a Service (Author, Discover, Inform, Interpret, Post, Transact, Convert, Resolve).
  3. M-Instance to M-Instance (MM-Add, MM-Animate, MM-Disable, MM-Embed, MM-Enable, MM-Send).
  4. M-Instance to U-Environment (MU-Actuate, MU-Render, MU-Send, Track).
  5. U-Environment to M-Instance (UM-Animate, UM-Capture, UM-Render, UM-Send).

The semantics of some of these Actions are:

  1. Identify: convert data and metadata into an Item bearing an Identifier.
  2. Discover: request a Service to provide Items and/or Processes with certain features.
  3. MM-Embed: place an Item at a particular M-Instance location (M-Location).
  4. MU-Render: select Items at an M-Location and render them at a U-Environment.
  5. UM-Animate: use a captured animation stream to animate a Persona.

How do interactions take place in the M-Instance?

A User may have the capability to perform certain Actions on certain Items but more commonly a User may ask a Device to do something for it, like capture an animation stream and use it to animate a Persona. The help of the Device may not be sufficient because MPAI assumes that an animation stream is not an Item until it gets Identified as such. Hence, the help of the Identify Service is also needed.

MPAI has defined the Inter-Process Interaction Protocol (Figure 4) whereby:

  1. A process creates, identifies and sends a Request-Action Item to the destination process.
  2. The receiving process:
    1. May or may not perform the Action requested.
    2. Sends a Response-Action.

Figure 4 – The Inter-Process Interaction Protocol

Table 1 – The Inter-Process Interaction Protocol

| Request-Action | Response-Action | Comments |
|---|---|---|
| Request-Action ID | Response-Action ID | Unique ID |
| Emission Time | Emission Time | Time of Issuance |
| Source Process ID | Source Process ID | Requesting Process ID |
| Destination Process ID | Destination Process ID | Requested Process ID |
| Action | | The Action requested |
| InItems | OutItems | In/Output Items of Action |
| InLocations | | Locations of InItems |
| OutLocations | | Locations of OutItems |
| OutRights | | Expected Rights on OutItems |

The Request-Action payload includes the ID, the emission time, the IDs of the requesting and destination processes, the Action requested, the InItems to which the Action is applied, the locations of the InItems and of the resulting OutItems, and the Rights the requesting process expects to have in order to act on the OutItems.
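
As a rough illustration, the two payloads and the request/response exchange could be modelled in Python as below. The field names follow Table 1, but the types, the `performed` flag, the `handle` function and the way OutItems are named are assumptions made for this sketch, not definitions from the standard. The sketch also assumes that the Source and Destination Process IDs keep the same meaning in the response as in the request.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import uuid


def _now() -> datetime:
    return datetime.now(timezone.utc)


def _new_id() -> str:
    return str(uuid.uuid4())


@dataclass
class RequestAction:
    source_process_id: str        # requesting process
    destination_process_id: str   # requested process
    action: str
    in_items: list = field(default_factory=list)
    in_locations: list = field(default_factory=list)
    out_locations: list = field(default_factory=list)
    out_rights: list = field(default_factory=list)
    request_action_id: str = field(default_factory=_new_id)
    emission_time: datetime = field(default_factory=_now)


@dataclass
class ResponseAction:
    source_process_id: str
    destination_process_id: str
    out_items: list = field(default_factory=list)
    performed: bool = False       # hypothetical flag, not listed in Table 1
    response_action_id: str = field(default_factory=_new_id)
    emission_time: datetime = field(default_factory=_now)


def handle(request: RequestAction, supported: set) -> ResponseAction:
    """Receiving process: may or may not perform the Action, but always responds."""
    performed = request.action in supported
    out_items = [f"{item}:out" for item in request.in_items] if performed else []
    return ResponseAction(
        source_process_id=request.source_process_id,
        destination_process_id=request.destination_process_id,
        out_items=out_items,
        performed=performed,
    )


request = RequestAction("User1.1", "Service1", "MM-Embed",
                        in_items=["Item1"], out_locations=["M-Location1"])
response = handle(request, supported={"MM-Embed"})
```

A receiving process that does not support the Action still returns a Response-Action, here with `performed` set to `False` and no OutItems.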

Having defined standard Actions, here is how standard Items are defined:

  1. General (M-Instance, M-Capabilities, M-Environment, Identifier, Rules, Rights, Program, Contract).
  2. Human/User-related (Account, Activity Data, Personal Profile, Social Graph, User Data).
  3. Process Interaction (Message, P-Capabilities, Request-Action, Response-Action).
  4. Service Access (AuthenticateIn, AuthenticateOut, DiscoverIn, DiscoverOut, InformIn, InformOut, InterpretIn, InterpretOut).
  5. Finance-related (Asset, Ledger, Provenance, Transaction, Value, Wallet).
  6. Perception-related (Event, Experience, Interaction, Map, Model, Object, Scene, Stream, Summary).
  7. Space-related (M-Location, U-Location).

Here are a few examples of the Item semantics:

  1. Rights: the Item describes the ability of a process to perform an Action on an Item at a given time and M-Location.
  2. Social Graph: the log of a process, e.g., a User.
  3. P-Capabilities: the Item describes the Rights held by a process and related abilities.
  4. DiscoverIn: the description of the User’s request.
  5. Asset: an Item that can be transacted.
  6. Model: data exposing animation interfaces.
  7. M-Location: delimits a space in the M-Instance.

We also need to define several entities – called Data Types – used in the M-Instance:

  1. Location and time (Address, Coordinates, Orientation, Point of View, Position, Spatial Attitude, Time).
  2. Transaction-related (Amount, Currency).
  3. Internal state of a User (Cognitive State, Emotion, Social Attitude, Personal Status).

Finally, we need to address the case of a process in M-InstanceA requesting a process in M-InstanceB to perform Actions on Items. In general, a process in M-InstanceA cannot communicate directly with a process in M-InstanceB, because of security concerns and because M-InstanceB may use different data types. MPAI solves this problem by extending the Inter-Process Interaction Protocol and introducing two Services:

  1. Resolution ServiceA: can talk to Resolution ServiceB.
  2. Conversion Service: can convert the format of M-InstanceA data into the format of M-InstanceB.

Figure 5 – The Inter-Process Interaction Protocol between M-Instances
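
The role of the two Services can be sketched as follows. Everything here is hypothetical: the `ResolutionService` class, its methods, the dictionary payloads and the metre-to-centimetre conversion only stand in for whatever formats two real M-Instances would use.

```python
from typing import Callable, Dict

Payload = dict
Handler = Callable[[Payload], Payload]


class ResolutionService:
    """Hypothetical Resolution Service: the entry point of one M-Instance."""

    def __init__(self, instance: str, data_format: str) -> None:
        self.instance = instance
        self.data_format = data_format
        self._processes: Dict[str, Handler] = {}

    def register(self, process_id: str, handler: Handler) -> None:
        self._processes[process_id] = handler

    def deliver(self, process_id: str, payload: Payload) -> Payload:
        """Hand a payload to a local process; foreign formats are rejected."""
        if payload["format"] != self.data_format:
            raise ValueError("payload must be converted before delivery")
        return self._processes[process_id](payload)

    def forward(self, peer: "ResolutionService", process_id: str,
                payload: Payload, convert: Handler) -> Payload:
        """Talk to the peer Resolution Service, converting the data if needed."""
        if payload["format"] != peer.data_format:
            payload = convert(payload)  # hypothetical Conversion Service
        return peer.deliver(process_id, payload)


def convert_a_to_b(payload: Payload) -> Payload:
    """Toy Conversion Service: format A uses metres, format B centimetres."""
    return {"format": "B", "length_cm": payload["length_m"] * 100}


resolution_a = ResolutionService("M-InstanceA", "A")
resolution_b = ResolutionService("M-InstanceB", "B")
resolution_b.register("ServiceB1",
                      lambda p: {"format": "B", "received_cm": p["length_cm"]})

reply = resolution_a.forward(resolution_b, "ServiceB1",
                             {"format": "A", "length_m": 2}, convert_a_to_b)
```

The process in M-InstanceA never addresses the process in M-InstanceB directly; it only reaches it through the two Resolution Services. Converting the reply back to format A is left out of this sketch.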

This is a very high-level description of the MPAI Metaverse Model – Architecture standard that enables Interoperability of two or more M-Instances if they:

  1. Rely on the Operation Model, and
  2. Use the same Profile Architecture, and
    1. Either the same technologies, or
    2. Independent technologies while accessing Conversion Services that losslessly transform Data of an M-InstanceA to Data of an M-InstanceB.
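
The interoperability conditions above can be read as a simple predicate. The sketch below is one possible encoding; the `MInstance` fields and the `interoperable` function are illustrative assumptions, and a real check would depend on how Profiles and technologies are actually declared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MInstance:
    relies_on_operation_model: bool
    profile: str
    technologies: frozenset
    has_lossless_conversion: bool = False  # hypothetical flag: Conversion Services available


def interoperable(a: MInstance, b: MInstance) -> bool:
    """True if both rely on the Operation Model, share a Profile, and either
    use the same technologies or can access lossless Conversion Services."""
    same_technologies = a.technologies == b.technologies
    can_convert = a.has_lossless_conversion and b.has_lossless_conversion
    return (
        a.relies_on_operation_model
        and b.relies_on_operation_model
        and a.profile == b.profile
        and (same_technologies or can_convert)
    )


instance_a = MInstance(True, "ProfileX", frozenset({"format1"}), has_lossless_conversion=True)
instance_b = MInstance(True, "ProfileX", frozenset({"format2"}), has_lossless_conversion=True)
instance_c = MInstance(True, "ProfileX", frozenset({"format2"}))
```

With these values, `instance_a` and `instance_b` interoperate through conversion, while `instance_a` and `instance_c` do not, because `instance_c` has no access to Conversion Services.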

Visiting MPAI standards – Connected Autonomous Vehicles (MPAI-CAV) – Architecture

Introduction

MPAI-36 (September 2023) approved the publication of five standards. Two are extensions of already published standards (and adopted by IEEE without modifications) and three are brand new. This is an overview of the main content of one of the new standards, Technical Specification: Connected Autonomous Vehicles – Architecture (MPAI-CAV).

MPAI works on CAV standards because replacing current vehicles with CAVs is desirable from many viewpoints. CAVs are expected to offer safer driving by replacing human errors with machine errors that are expected to be less frequent. They are also expected to allow more time for rewarding activities, offer opportunities for better use of vehicles and road infrastructure, enable more sophisticated traffic management, reduce congestion and pollution, and help elderly and disabled people have a better life. On the other hand, CAVs are available today more for experimental than for regular use, for two reasons. First, unlike the highly componentised automotive industry, CAV companies are monolithic: they develop internally and assemble all the components they need to make their CAVs. Second, the level of reliability would be insufficient if CAVs were deployed in massive numbers.

A CAV standard would accelerate the maturing of the CAV industry. But which standard? First, the standard MPAI is targeting would not be related to the “hardware” side but to the “software” of a CAV. Second, the standard should not address a CAV in its entirety but adopt a component approach, not unlike what the automotive industry does for the “hardware” side.

A component approach is useful for the two expected stages of the process. The first stage applies now: research can concentrate on a specific component, defined by its functions and interfaces, and optimise its performance, possibly proposing revised functions and interfaces. The second stage applies to the time when CAVs will have reached a sufficient level of performance and component-based mass production of CAVs becomes attractive. An open market of CAV components can then naturally form, where competing providers offer components with standard functions and interfaces but with performance that differs from what is available on the market at that time.

The MPAI-CAV – Architecture standard

MPAI-CAV – Architecture is a standard implementing the first step of this strategy. It defines a CAV Reference Model composed of four subsystems, each composed of interconnected components. Technical Specification: AI Framework (MPAI-AIF) is the natural selection for implementing this Reference Model. The AI Framework (AIF) specified by the standard is the environment executing AI Workflows (AIW) that correspond to the subsystems of the Reference Model. Each subsystem is composed of components called AI Modules (AIM).
The Subsystem-level Reference model is represented in Figure 1.

Figure 1 – The MPAI-CAV – Architecture Reference Model

There are four subsystem-level reference models, each specified in terms of:

  1. The functions the subsystem performs.
  2. The AIF-based Reference Model.
  3. The input/output data exchanged by the CAV subsystem with other subsystems and the environment.
  4. The functions of each of the subsystem's components, to be implemented as AI Modules.
  5. The input/output data exchanged by the component with other components.

The functions and reference models of the MPAI-CAV – Architecture Subsystems will be presented next.

Human-CAV Interaction (HCI)

The HCI functions are:

  1. To authenticate humans, e.g., to let them into the CAV.
  2. To converse with humans by interpreting utterances, e.g., a request to go to a destination, or utterances made during a conversation.
  3. To converse with the Autonomous Motion Subsystem to implement and execute human commands.
  4. To enable passengers to navigate the Full Environment Representation (FER), i.e., the best representation of the external environment achieved by the CAV.
  5. To appear as a speaking avatar showing a Personal Status, i.e., a simulated internal status of the machine represented according to the criteria used by humans (see https://mpai.community/standards/mpai-mmc/about-mpai-mmc/).

The HCI Reference Model is depicted in Figure 2.

Figure 2 – Human CAV Interaction Reference Model

Environment Sensing Subsystem (ESS)

The ESS functions are:

  1. To acquire Environment information using the Subsystem’s RADAR, LiDAR, Cameras, Ultrasound, Offline Map, Audio, GNSS, …
  2. To receive the Ego CAV’s position, orientation, and environment data (temperature, humidity, etc.) from the Motion Actuation Subsystem.
  3. To produce Scene Descriptors for each sensor technology in a common format.
  4. To produce the Basic Environment Representation (BER) by integrating the sensor-specific Scene Descriptors produced during the travel.
  5. To hand over the BERs, including Alerts, to the Autonomous Motion Subsystem.

The ESS Reference Model is depicted in Figure 3.

Figure 3 – Environment Sensing Subsystem Reference Model

Autonomous Motion Subsystem (AMS)

The AMS functions are:

  1. To compute and execute the human-requested Route(s).
  2. To receive the current BER from the Environment Sensing Subsystem.
  3. To communicate with other CAVs’ AMSs (e.g., to exchange subsets of BER and other data).
  4. To produce the Full Environment Representation by fusing its own BER with information from other CAVs in range.
  5. To send Commands to the Motion Actuation Subsystem to take the CAV to the next Pose.
  6. To receive and analyse responses from the MAS.

The AMS Reference Model is depicted in Figure 4.

Figure 4 – Autonomous Motion Subsystem Reference Model

Motion Actuation Subsystem (MAS)

The MAS functions are:

  1. To transmit spatial/environmental information from sensors/mechanical subsystems to the Environment Sensing Subsystem.
  2. To receive Autonomous Motion Subsystem Commands.
  3. To translate Commands into specific Commands to its own mechanical subsystems, e.g., steering, brakes, wheel directions, and wheel motors.
  4. To receive and analyse Responses from its mechanical subsystems.
  5. To send Responses to the Autonomous Motion Subsystem about the execution of Commands.

The MAS Reference Model is depicted in Figure 5.

Figure 5 – Motion Actuation Subsystem Reference Model
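
To see how the four subsystems fit together, one pass of the data flow described above can be sketched in Python. All function names, dictionary keys and the toy "stop if something is in the path" rule are assumptions made for illustration; the standard defines the subsystems and their functions, not this code.

```python
def hci_interpret(utterance: str) -> dict:
    """HCI: turn a human utterance into a request for the AMS (toy parsing)."""
    return {"destination": utterance.replace("go to ", "", 1).strip()}


def mas_report_state() -> dict:
    """MAS: provide spatial information to the ESS."""
    return {"position": (0.0, 0.0), "orientation": 0.0}


def ess_build_ber(scene_descriptors: list, ego_state: dict) -> dict:
    """ESS: integrate sensor-specific Scene Descriptors into a BER."""
    objects = [obj for desc in scene_descriptors for obj in desc["objects"]]
    return {"ego": ego_state, "objects": objects}


def ams_build_fer(ber: dict, remote_bers: list) -> dict:
    """AMS: fuse its own BER with information from other CAVs into the FER."""
    objects = list(ber["objects"])
    for remote in remote_bers:
        objects.extend(remote["objects"])
    return {"ego": ber["ego"], "objects": objects}


def ams_next_command(fer: dict, request: dict) -> dict:
    """AMS: decide the next Command for the MAS (toy rule)."""
    blocked = any(obj.get("in_path") for obj in fer["objects"])
    return {"type": "stop" if blocked else "move",
            "destination": request["destination"]}


def mas_execute(command: dict) -> dict:
    """MAS: act on the Command and report back to the AMS."""
    return {"command": command["type"], "status": "executed"}


request = hci_interpret("go to Geneva")
ber = ess_build_ber(
    [{"sensor": "lidar", "objects": [{"id": "car", "in_path": False}]},
     {"sensor": "camera", "objects": []}],
    mas_report_state(),
)
fer = ams_build_fer(ber, [{"objects": [{"id": "pedestrian", "in_path": True}]}])
command = ams_next_command(fer, request)
response = mas_execute(command)
```

In this run the pedestrian reported by another CAV appears only in the FER, not in the local BER, which is why the AMS issues a stop Command.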

Conclusions

The MPAI-CAV Architecture standard is the starting point for the next steps of the MPAI-CAV roadmap addressing functional requirements of the data exchanged by subsystems and components.