
Leonardo Chiariglione
2025-04-27

Communicating avatars in worlds

An avatar is generally intended as a representation of a real or fictitious human in a virtual space. Research has dedicated much effort to creating and animating realistic avatars. However, the scope of use is typically assumed to be a closed environment such as a proprietary video game. Therefore, the portability of avatars has seldom been a priority.

Some 30 years ago, the Humanoid Animation (H-Anim) standard was developed. It defined a human skeleton composed of joints, segments, and sites exhibiting four levels of articulation, and the default skeleton pose. The latest versions of the H-Anim standards are ISO/IEC 19774-1:2019 and ISO/IEC 19774-2:2019.

An avatar is a basic but not the only element in an application. What if you want to convey to a third party a speaking avatar immersed in an environment with all its features so that it is displayed as you intended?

Technical Specification: Portable Avatar Format (MPAI-PAF), whose Version 1.4 has recently been approved, offers a solution to this problem for a broad range of applications. The solution is called Portable Avatar. A Portable Avatar is a package of data conveying the following information:

The ID of the virtual space (M-Instance) where the Portable Avatar is to be placed.
The space and time information of the “environment” to be placed in the M-Instance.
The Audio-Visual Scene representing the “environment”.
The space and time information of the Avatar in the scene.
The Avatar represented as a 3D Model, its Face Descriptors and Body Descriptors.
The Language Preference of the Avatar.
The Text Object the Avatar is associated with, or which will be converted into a Speech Object.
The Speech Model used to synthesise the Text Object.
The Speech Object alternative to the Text Object that the Avatar utters.
The Personal Status of the Avatar.

Here is a brief description of the Portable Avatar components.

The ID of the virtual space (M-Instance). This is the ID of a virtual space where the Portable Avatar is to be placed. It can be a metaverse (for MPAI, this would be an M-Instance of MMM-TEC).
The space and time information of the “environment”. MPAI has defined a data type called Space-Time that defines:
1. The space information as Spatial Attitude, Position, Orientation, and optionally their velocities and acceleration.
2. The time information, defined as either absolute (from 1970/01/01T00:00) or relative (from an arbitrary origin of time).
The Audio-Visual Scene. MPAI has defined Scene as a data type that describes a scene as composed of scenes and objects with their Space-Time information. The MPAI scene is thus hierarchical (see Audio-Visual Scene Descriptors).
The space and time information of the Avatar, considered as a particular type of object in the scene. The Avatar data type has its own space-time information. This is overridden by the space-time information of the scene, if different.
The Avatar. The MPAI Avatar data type is a structure that includes:
1. The space-time information (that is overridden by the space-time information of the scene).
2. The 3D Model Object composed of data and Qualifier giving additional information to the data, e.g., the format.
3. The Face Descriptors. MPAI has adopted the Action Units of the Facial Action Coding System (FACS).
4. The Body Descriptors. MPAI has adopted the H-Anim standard, but the 3D Model Qualifier allows the use of other standards.
The Language Preference. MPAI supports the signalling of a large number of language codes.
The Text Object. An avatar may have a textual description represented by a Text Object (text data and Qualifier providing various types of information on the text, e.g., language and character code).
The Text Object may be used to synthesise a Speech Object (speech data and Qualifier). The Portable Avatar can convey a neural network speech model to synthesise the text. In MPAI, a neural network model (more generally, a Machine Learning Model) has an associated Qualifier providing various types of information such as the conformity of the model to a particular regulation.
The Speech Object. In some cases, the speech associated with the avatar is conveyed by the Portable Avatar. Same as for the Text Object, a Speech Object includes speech data and a Qualifier that may be used to provide information on the language, compression format, speaker identity etc.
The Personal Status. The Personal Status is an MPAI data type including information on the Cognitive State, Emotion, and Social Attitude conveyed by the Text, Speech, Face, and Gesture of an Entity (in MPAI, Entity indicates either a human or the process animating an avatar).
1. Cognitive State represents the internal state of an Entity reflecting its knowledge of the context, such as “surprised” or “interested”.
2. Emotion represents the internal state of an Entity resulting from its interaction with the context, such as “angry” or “sad”.
3. Social Attitude represents the internal state of an Entity related to the way it intends to position itself vis-à-vis the context, e.g., “respectful” or “soothing”.
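
The component list above can be pictured as a nested data structure. The following is a minimal sketch, not the MPAI-PAF JSON syntax: the class and field names (`SpaceTime`, `Avatar`, `PortableAvatar`, `effective_space_time`) are assumptions made for illustration, and only a few of the listed components are included. It shows the override rule described above, where the space-time information given at scene level takes precedence over the Avatar's own.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SpaceTime:
    """Hypothetical stand-in for the MPAI Space-Time data type."""
    position: Tuple[float, float, float]
    orientation: Tuple[float, float, float]
    time: float  # seconds from the chosen origin of time


@dataclass
class Avatar:
    """Hypothetical, reduced view of the Avatar data type."""
    model_id: str                           # reference to the 3D Model Object
    space_time: Optional[SpaceTime] = None  # the Avatar's own Space-Time


@dataclass
class PortableAvatar:
    """Hypothetical, reduced view of a Portable Avatar."""
    m_instance_id: str
    avatar: Avatar
    avatar_space_time_in_scene: Optional[SpaceTime] = None
    language_preference: Optional[str] = None
    text: Optional[str] = None


def effective_space_time(pa: PortableAvatar) -> Optional[SpaceTime]:
    """Scene-level Space-Time, when present, overrides the Avatar's own."""
    if pa.avatar_space_time_in_scene is not None:
        return pa.avatar_space_time_in_scene
    return pa.avatar.space_time
```

A receiver would call `effective_space_time` before placing the avatar, so that the sender's scene-level placement wins whenever both are present.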

The Portable Avatar is an essential component of a variety of use cases. It is typically used as input to the Audio-Visual Scene Rendering AI Module that produces Speech, Audio, and Visual Objects from Portable Avatar, Audio-Visual Scene Descriptors (in case one is not available in the Portable Avatar), and a Point of View as depicted in Figure 1.

Figure 1 – Audio-Visual Scene Rendering AI Module

The Personal Status Display (PAF-PSD) AIM produces a Portable Avatar corresponding to an Avatar Model uttering a Speech Object synthesised from a Text Object with a Speech Model and displaying a Personal Status:

Figure 2 – Personal Status Display

Here, the input is a Text Object, a Neural Network Speech Model (in case the PAF-PSD does not have one embedded), an Avatar Model, and a Personal Status:

The Text is used to synthesise speech modulated with the Speech component of the input Personal Status.
The Speech, the input Text, the Face component of the input Personal Status, and the input Avatar Model are used to synthesise the face.
The Text, the Gesture component of the input Personal Status, and the input Avatar Model are used to synthesise the body.
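
The data flow of the three steps above can be sketched in a few lines. This is only an illustration of which inputs feed which step: the function names are hypothetical, the "synthesis" functions are stubs that record their inputs rather than produce speech or descriptors, and nothing here reflects the actual PAF-PSD interfaces.

```python
from dataclasses import dataclass


@dataclass
class PersonalStatus:
    """Hypothetical container with one component per modality used here."""
    speech: str
    face: str
    gesture: str


def synthesise_speech(text, ps_speech):
    # Stub: a real module would return a Speech Object.
    return {"kind": "speech", "inputs": (text, ps_speech)}


def synthesise_face(speech, text, ps_face, avatar_model):
    # Stub: a real module would return Face Descriptors.
    return {"kind": "face", "inputs": (speech["kind"], text, ps_face, avatar_model)}


def synthesise_body(text, ps_gesture, avatar_model):
    # Stub: a real module would return Body Descriptors.
    return {"kind": "body", "inputs": (text, ps_gesture, avatar_model)}


def personal_status_display(text, avatar_model, ps):
    """Chain the three steps; the speech output feeds the face step."""
    speech = synthesise_speech(text, ps.speech)
    face = synthesise_face(speech, text, ps.face, avatar_model)
    body = synthesise_body(text, ps.gesture, avatar_model)
    return {"avatar_model": avatar_model, "speech": speech, "face": face, "body": body}
```

Note that only the face step depends on the synthesised speech, so the body step could run in parallel with the speech step.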

In MPAI, the Lego approach to avatar deployment in applications is a reality.


Leonardo Chiariglione
2025-04-21

An overview of the MPAI Metaverse Model – Technologies standard

The MPAI Metaverse Model – Technologies standard (in short, MMM-TEC V2.0) is the first open metaverse standard enabling independently designed and implemented metaverse instances (which MMM-TEC calls M-Instances) and clients to interoperate. These are the main MMM-TEC elements:

The Architecture is based on Processes acting on Items based on the Rights they hold.
Items represent any abstract and concrete objects in an M-Instance.
Processes – possibly in different M-Instances – communicate using the Inter-Process Protocol (IPP).
Process Actions represent the payload of a message sent by a Process.
Qualifiers are containers of technology-specific information of an Item.
The MPAI-MMM API enables fast development of M-Instances.
Verification Use Cases verify the completeness of the standard.
The MMM Open-Source Software implementation can easily be installed.

Below is a more extended introduction to MMM-TEC. Here is the text of Technical Specification: MPAI Metaverse Model (MPAI-MMM) – Technologies (MMM-TEC) V2.0.

MMM-TEC defines an M-Instance as an Information and Communication Technologies platform populated by Processes. Processes perform a range of activities: they operate with various degrees of autonomy and interactivity, sense data from the real world, produce various types of entities called Items, perform activities represented by Process Actions or request other Processes (possibly in other M-Instances) to perform them, hold or acquire Rights on Items, and act on the real world in a variety of ways. They can perform Process Actions based on Rights they may hold, acquire, or be granted.

Processes may be characterised as:

Services providing specific functionalities, such as content authoring.
Devices connecting the Universe to the M-Instance and the M-Instance to the Universe.
Apps running on Devices.
Users representing and acting on behalf of human entities residing in the Universe. A User is rendered as a Persona, i.e., an avatar.

Figure 1 depicts the main elements on which the MMM-TEC Specification is based: humans, Devices, Apps, Users, Services, and Personae.

Figure 1 – Main elements of an M-Instance

Processes Sense Data from U-Environments, i.e., portions of the Universe and may produce three types of Items, i.e., Data that has been Identified in – and thus recognised by – the M-Instance:

Digitised – i.e., sensed from the Universe – possibly animated by activities in the Universe.
Virtual – i.e., imported from the Universe as Data or internally generated – possibly autonomous or driven by activities in the Universe.
Mixed – Digitised and Virtual.

Processes Perform – either on their initiative, or driven by the actions of humans or machines in the Universe – Process Actions that combine:

An Action, possibly prepended by:
1. MM: to indicate Actions performed inside the M-Instance, e.g., MM-Animate using a stream to animate a 3D Model with a Spatial Attitude (defined as Position, Orientation, and their velocities and accelerations).
2. MU: to indicate Actions in the M-Instance influencing the Universe, e.g., MU-Actuate to render one of its Items to a U-Location as Media with a Spatial Attitude.
3. UM: to indicate Actions in the Universe influencing the M-Instance, e.g., UM-Embed to place an Item produced by Identifying a scene, UM-Captured at a U-Location, at an M-Location with a Spatial Attitude.
Items on which the Action is performed or are required for performance, such as Asset, 3D Model, Audio Object, Audio-Visual Scene, etc.
M-Locations and/or U-Locations where the Process Action is performed.
Processes with which the Action is performed.
Time(s) during which the Process Action is requested to be and is performed.
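
As an illustration of how the elements listed above fit together, here is a minimal sketch of a Process Action record with a helper that separates the optional MM/MU/UM prefix from the Action name. The class, field, and function names are assumptions made for this example and do not reproduce the JSON syntax specified by MMM-TEC.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Meaning of the three prefixes, as described in the text above.
SCOPES = {
    "MM": "performed inside the M-Instance",
    "MU": "in the M-Instance influencing the Universe",
    "UM": "in the Universe influencing the M-Instance",
}


def split_action(name: str) -> Tuple[Optional[str], str]:
    """Split 'MM-Animate' into ('MM', 'Animate'); unprefixed names pass through."""
    prefix, sep, rest = name.partition("-")
    if sep and prefix in SCOPES:
        return prefix, rest
    return None, name


@dataclass
class ProcessAction:
    """Hypothetical record combining the elements of a Process Action."""
    action: str
    items: List[str] = field(default_factory=list)
    locations: List[str] = field(default_factory=list)
    processes: List[str] = field(default_factory=list)
    times: List[str] = field(default_factory=list)

    def scope(self) -> Optional[str]:
        """Return the MM/MU/UM prefix of the Action, if it has one."""
        return split_action(self.action)[0]
```

For example, `ProcessAction("MM-Animate", items=["3D Model"]).scope()` yields `"MM"`, while an Action written without a prefix yields `None`.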

Processes may hold Rights on an Item, i.e., they may perform the set of Process Actions listed in their Rights. An Item may include Rights signalling which Processes may perform Process Actions on it. Processes affect U-Environments and/or M-Instances using Items in ways that are Consistent with the goals of the M-Instance as expressed by the Rules, within the M-Capabilities of the M-Instance, e.g., to support Transactions, and respecting applicable laws and regulations.

Processes perform activities strictly inside the M-Instance or have various degrees of interaction with Data sensed from and/or actuated in the Universe.

Processes may request other Processes to perform Process Actions on their behalf by using the Inter-Process Protocol, possibly after Transacting a Value (i.e., an Amount in a Currency) to a Wallet.

An M-Instance is managed by an M-Instance Manager. At the initial time, the M-Instance Manager has Rights covering the M-Instance and may decide to define certain subsets inside the M-Instance – called M-Environments – on which it has Rights and attach Rights to them.

A Registering human may:

Request to Register to open an Account of a certain class.
Be requested to provide their Personal Profile and possibly to perform a Transaction to open an Account.
Obtain in exchange a set of Rights that their Processes may exercise. Rights have Levels indicating that the Rights are:
1. Internal, e.g., assigned by the M-Instance at Registration time according to the M-Instance Rules and the Account type.
2. Acquired, e.g., obtained by initiative of the Process.
3. Granted to the Process by another Process.
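
A Rights check along these lines might look like the sketch below. It assumes, purely for illustration, that a Right pairs one Process Action with one Item and carries one of the three Levels; the actual Rights data type in MMM-TEC is richer, and the names `Level`, `Right`, and `may_perform` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class Level(Enum):
    INTERNAL = "Internal"  # assigned by the M-Instance at Registration time
    ACQUIRED = "Acquired"  # obtained by initiative of the Process
    GRANTED = "Granted"    # granted to the Process by another Process


@dataclass(frozen=True)
class Right:
    """Hypothetical Right: one Process Action allowed on one Item."""
    action: str
    item_id: str
    level: Level


def may_perform(rights: Iterable[Right], action: str, item_id: str) -> bool:
    """A Process may perform an Action on an Item only if a Right lists it."""
    return any(r.action == action and r.item_id == item_id for r in rights)
```

In this sketch the Level records how the Right was obtained; it does not affect the outcome of the check.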

MMM-TEC V2.0 does not specify how an M-Instance verifies that the Process Actions performed by a Process comply with the Process’s Rights or the M-Instance Rules. An M-Instance may decide to verify the full set of Activity Data (the log of performed Process Actions), to make verifications based on claims by another Process, to make random verifications, or to not make any verification at all. Therefore, MMM-TEC V2.0 does not specify how an M-Instance Manager can sanction non-complying Processes.
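
Since the verification strategy is left to each M-Instance, the four options mentioned above can be pictured as a policy switch. The sketch below is an assumption-laden illustration, not part of the standard: the policy names, the `id` key of an Activity Data entry, and the sampling `rate` parameter are all invented for this example.

```python
import random


def select_for_verification(activity_data, policy, claimed_ids=(), rate=0.1, seed=None):
    """Pick which logged Process Actions an M-Instance would verify.

    activity_data: list of dicts with a hypothetical 'id' key.
    policy: 'full', 'claims', 'random', or 'none' (names are assumptions).
    """
    if policy == "full":
        return list(activity_data)
    if policy == "claims":
        # Verify only the entries that another Process has reported.
        return [entry for entry in activity_data if entry["id"] in claimed_ids]
    if policy == "random":
        # Spot-check each entry with probability `rate`.
        rng = random.Random(seed)
        return [entry for entry in activity_data if rng.random() < rate]
    if policy == "none":
        return []
    raise ValueError(f"unknown policy: {policy}")
```

What happens after a failed verification, i.e., how a non-complying Process is sanctioned, is outside this sketch, as it is outside the specification.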

In some cases, an M-Instance with limited scope could be needlessly costly to build if it were required to implement all the technologies in the MMM Technical Specification. MMM-TEC V2.0 therefore specifies Profiles to facilitate the take-off of M-Instance implementations that conform to the MMM-TEC V2.0 specification without unduly burdening some other implementations.

A Profile includes only a subset of the Process Actions that are expected to be needed and are shared by a sizeable number of applications. MMM-TEC V2.0 defines four Profiles (see Figure 2):

Baseline Profile enables basic applications such as lecture, meeting, and hang-out.
Finance Profile enables trading activities.
Management Profile enables a controlled ecosystem with more advanced functionalities.
High Profile enables all the functionalities of the Management Profile with a few additional functionalities of its own.

Figure 2 – MMM-TEC V2.0 Profiles

MPAI developed and used some use cases in the two MPAI-MMM Technical Reports (1 and 2) published in 2023 to develop the MMM-ARC and MMM-TEC Technical Specifications. However, MMM-TEC V2.0 includes various Verification Use Cases that use Process Actions to verify that the currently specified Actions and Items completely support those Use Cases.

The fast development of certain technology areas is one of the issues that has so far prevented the development of metaverse interoperability standards. MMM-TEC deals with this issue by providing JSON syntax and semantics for all Items. When needed, the JSON syntax references Qualifiers, MPAI-defined Data Types that supply additional information to the Data in the form of:

Sub-Type (e.g., the colour space of a Visual Data Type).
Format (e.g., the compression or the file/streaming format of Speech).
Attributes (e.g., the Binaural Cues of an Audio Object).

For instance, a Process receiving an Object can understand from the Qualifier referenced in the Object whether it has the required technology to process it, or else it has to rely on a Conversion Service to obtain a version of the Object matching its P-Capabilities. This approach should help to prolong the life of the MMM-TEC specification as in many cases only the Qualifier specification will need to be updated, not the MMM-TEC specification.
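
The decision described above can be sketched as a simple lookup. The Object layout and the key names (`Qualifier`, `Format`) are assumptions for illustration only and are not the JSON schema defined by MPAI; the format labels are placeholders.

```python
def route_object(obj: dict, p_capabilities: set) -> str:
    """Return 'process' if the receiving Process supports the Format named in
    the Object's Qualifier, otherwise 'convert' (rely on a Conversion Service)."""
    fmt = obj["Qualifier"]["Format"]
    return "process" if fmt in p_capabilities else "convert"


# Hypothetical Speech Object whose Qualifier names its format.
speech_object = {
    "Data": "speech-data-reference",
    "Qualifier": {"Format": "format-A", "Attributes": {}},
}
```

With this layout, supporting a new format only requires a new Qualifier value and a Process that lists it among its P-Capabilities; the routing logic itself does not change.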

Finally, MMM-TEC V2.0 specifies the MPAI-MMM API. By calling the APIs, a developer can easily develop M-Instances and applications.


Leonardo Chiariglione
2025-04-18

MPAI publishes the MPAI-MMM API as part of MPAI Metaverse Model – Technologies (MMM-TEC)

Geneva, Switzerland – 16th April 2025. MPAI – Moving Picture, Audio and Data Coding by Artificial Intelligence – the international, non-profit, unaffiliated organisation developing AI-based data coding standards – has concluded its 55th General Assembly (MPAI-55) with the final release of the Connected Autonomous Vehicle and the MPAI Metaverse Model standards.

The MPAI Metaverse Model (MPAI-MMM) – Technologies (MMM-TEC) V2.0 standard specifies:

The Functional Requirements of the Processes operating in a metaverse instance (M-Instance).
The Items, i.e., the Data Types and their Qualifiers recognised in an M-Instance.
The Process Actions that a Process can perform on Items.
The Protocols enabling a Process to communicate with another Process.
The MPAI Metaverse Model
The MPAI-MMM API.

The availability of APIs enables the rapid development of M-Instances and clients that interoperate with M-Instances conforming to the MMM-TEC V2.0 standard.

An online presentation of the MMM-TEC V2.0 standard will be held on 9 May at 15 UTC. Register at https://tinyurl.com/5a2d4ucv.

The Connected Autonomous Vehicle (MPAI-CAV) – Technologies (CAV-TEC) V1.0 standard specifies the Reference Model partitioning a CAV into subsystems and components. The standard Reference Model promotes CAV componentisation by enabling:

Researchers to optimise component technologies.
Component manufacturers to bring their standard-conforming components to an open market.
Car manufacturers to access a global market of interchangeable components.
Regulators to oversee conformance testing of components following standard procedures.
Users to rely on Connected Autonomous Vehicles whose operation they can explain.

An online presentation of the CAV-TEC V1.0 standard will be held on 8 May at 15 UTC. Register at https://tinyurl.com/372739sa.

MPAI is continuing its work plan that involves the following activities:

AI Framework (MPAI-AIF): building a community of MPAI-AIF-based implementers.
AI for Health (MPAI-AIH): developing the specification of a system enabling clients to improve models processing health data and federated learning to share the training.
Context-based Audio Enhancement (CAE-DC): developing the Audio Six Degrees of Freedom (CAE-6DF) standard.
Connected Autonomous Vehicle (MPAI-CAV): investigating extensions of the current CAV-TEC standard.
Compression and Understanding of Industrial Data (MPAI-CUI): developing Company Performance Prediction standard V2.0.
End-to-End Video Coding (MPAI-EEV): exploring the potential of video coding using AI-based End-to-End Video coding.
AI-Enhanced Video Coding (MPAI-EVC): developing the Up-sampling Filter for Video applications (EVC-UFV) standard.
Governance of the MPAI Ecosystem (MPAI-GME): working on version 2.0 of the Specification.
Human and Machine Communication (MPAI-HMC): developing reference software and performance assessment.
Multimodal Conversation (MPAI-MMC): developing technologies for more Natural-Language-based user interfaces capable of handling more complex questions.
MPAI Metaverse Model (MPAI-MMM): extending the MMM-TEC specs to support more applications.
Neural Network Watermarking (MPAI-NNW): studying the use of fingerprinting as a technology for neural network traceability.
Object and Scene Description (MPAI-OSD): studying applications requiring more space-time handling.
Portable Avatar Format (MPAI-PAF): studying more applications using digital humans needing new technologies.
AI Module Profiles (MPAI-PRF): specifying which features AI Workflow or more AI Modules need to support.
Server-based Predictive Multiplayer Gaming (MPAI-SPG): exploring new standard opportunities in the domain.
Data Types, Formats, and Attributes (MPAI-TFA): extending the standard to data types used by MPAI standards (e.g., automotive and health).
XR Venues (MPAI-XRV): developing the standard for improved development and execution of Live Theatrical Performances and studying the prospects of Collaborative Immersive Laboratories.

Legal entities and representatives of academic departments supporting the MPAI mission and able to contribute to the development of standards for the efficient use of data can become MPAI members.

Please visit the MPAI website, contact the MPAI secretariat for specific information, subscribe to the MPAI Newsletter and follow MPAI on social media: LinkedIn, Twitter, Facebook, Instagram, and YouTube.


Leonardo Chiariglione
2025-04-02

Component standards versus monolithic standards

MPAI has recently published two main new standards with a request for Community Comments. In MPAI lingo this means that the standards are mature, but MPAI asks the Community to review the drafts before publication.

The two standards are Connected Autonomous Vehicle – Technologies (CAV-TEC) V1.0 and MPAI Metaverse Model – Technologies (MMM-TEC) V2.0. They are not “new” MPAI standards, as they have already been published in earlier versions, but the new versions represent significant improvements.

You may ask: “Why is this topic handled in a single paper when we are talking about two standards that apparently have so little in common?” If you ask this question, you may want to continue reading and discover an essential aspect of MPAI standardisation. A standard for an application domain is (almost) never a monolith but is often made of components shared with standards of other domains, often unrelated.

Let’s first have a scan of the two standards.

Connected Autonomous Vehicle (CAV-TEC V1.0) is a standard for the ICT (Information and Communication Technology) part of a vehicle that can move itself in the physical world to reach a destination. The standard assumes that a CAV is composed of four functionally separated but interconnected subsystems:

The Human-CAV Interaction Subsystem (HCI): enables a human to establish a dialogue with the CAV to issue a variety of commands to it, such as moving to a destination, or to have conversations where the human and the CAV can express their internal status (e.g., emotion), whether real or fictitious.
The Environment Sensing Subsystem (ESS): leverages the sensors on board to create the Basic Environment Descriptors (BED), the most accurate digital representation of the external environment possible with the available sensors.
The Autonomous Motion Subsystem (AMS): receives the BED and improves its accuracy by exchanging portions of its Environment Descriptors with CAVs in range. Then it analyses or reviews the situation and decides how to issue commands to implement the route decided by the human.
The Motion Actuation Subsystem (MAS): converts a general request to move the CAV, potentially by a few meters, into specific commands for brakes, motors, and wheels, and reports on the implementation of the command.

The four subsystems are implemented as AI Workflows per the AI Framework standard. Each AI Workflow includes several AI Modules (AIMs) that exchange Data specified by CAV-TEC V1.0 or by other MPAI standards. Most AIMs and data types of:

HCI: are specified not by CAV-TEC but by Multimodal Conversation (MPAI-MMC) V2.3. The rest are specified by Object and Scene Description (MPAI-OSD) V1.3, Portable Avatar Format (MPAI-PAF) V1.4, Data Types, Formats, and Attributes (MPAI-TFA) V1.3, and CAV-TEC.
ESS, AMS, and MAS: are specified by CAV-TEC, MPAI-OSD, MPAI-PAF, and MPAI-TFA.

A concise description of the operation of an implementation of CAV-TEC is available here.

MPAI Metaverse Model (MMM-TEC V2.0) specifies a virtual space composed of processes operating on an ICT platform and executing, or requesting other processes to execute, actions on items, i.e., MMM-TEC V2.0-specified data types. Processes may be rendered as avatars.

MMM-TEC does not use AIMs but only data types. It defines 27 actions and a language enabling processes to communicate using speech acts. Many of the data types are specified by MPAI-MMC, MPAI-OSD, and MPAI-PAF.

Examples of actions are MM-Embed, applied to an avatar or an object to place it somewhere in the metaverse; UM-Capture, applied to media information in the physical world to acquire it for use in the metaverse; Identify, applied to captured data to convert it to an item with an identifier; MM-Animate, applied to an item, e.g., an avatar, to animate it; and MU-Render, applied to an item in the metaverse to render it in the universe. A concise description of the operation of an implementation of MMM-TEC is available here.

As already said, both draw a sizeable part of their data types (and CAV-TEC of its AIMs) from three other standards: MPAI-OSD V1.3, MPAI-PAF V1.4, and MPAI-TFA V1.3. Some of these AIMs and data types are new in the mentioned versions of the three standards. They were developed in response to the CAV-TEC and MMM-TEC needs.

Why not develop them in CAV-TEC and MMM-TEC directly, then? Because the three standards address a specific area of standardisation that is also required by many other MPAI standards: objects and scenes, 3D graphics, and qualifiers. Therefore, MPAI-OSD V1.3, MPAI-PAF V1.4, and MPAI-TFA V1.3 are also published with a request for Community Comments.

Anybody is invited to send comments on any of the five standards to the MPAI Secretariat by April 13 at 23:59 UTC.


Leonardo Chiariglione
2025-03-19

MPAI releases Connected Autonomous Vehicle and MPAI Metaverse Model for Community Comments, starts Up-sampling Filter for Video Applications

Geneva, Switzerland – 19th March 2025. MPAI – Moving Picture, Audio and Data Coding by Artificial Intelligence – the international, non-profit, unaffiliated organisation developing AI-based data coding standards – has concluded its 54th General Assembly (MPAI-54) with the release of the Connected Autonomous Vehicle and the MPAI Metaverse Model standards for Community Comments and the start of the Up-sampling Filter for Video Applications (EVC-UFV) V1.0 standard project.

MPAI has been working on the Connected Autonomous Vehicle (MPAI-CAV) project since its early days and released the Architecture Specification (CAV-ARC). Today, MPAI-54 released the Technology specification (CAV-TEC) V1.0, which builds on the Architecture Specification by adding subsystems, components, and data types. The specification is released with a request for Community Comments and is available from the MPAI website. Anybody is invited to send comments on CAV-TEC V1.0 to the MPAI Secretariat by April 13 at 23:59 UTC.

MPAI has been working on the MPAI Metaverse Model (MPAI-MMM) project since January 2022. So far, two Technical Reports and two Technical Specifications on Architecture and Technologies have been published. Today, MPAI-54 released the Technology specification (MMM-TEC) V2.0 integrating the Architecture and Technologies specifications and the Reference Software published by MPAI-53. The specification is released with a request for Community Comments and is available from the MPAI website. Anybody is invited to send comments on MMM-TEC V2.0 to the MPAI Secretariat by April 13 at 23:59 UTC.

Both CAV-TEC V1.0 and MMM-TEC V2.0 reuse technology specifications that are shared with other MPAI standards, namely Object and Scene Description (MPAI-OSD) V1.3, Portable Avatar Format (MPAI-PAF) V1.4, and Data Types, Formats and Attributes (MPAI-TFA) V1.3. The specifications, available from the MPAI-OSD, MPAI-PAF, and MPAI-TFA web pages, respectively, were released with a request for Community Comments. Anybody is invited to send comments on any of the three standards to the MPAI Secretariat by April 13 at 23:59 UTC.

MPAI has identified the need for a standard that specifies an up-sampling filter for video applications with improved performance compared to the currently used up-sampling filters. MPAI-54 decided to kick off the new Up-sampling Filter for Video applications (EVC-UFV) V1.0 standard project based on the received responses to the Call for Technologies.

MPAI is continuing its work plan that involves the following activities:

AI Framework (MPAI-AIF): building a community of MPAI-AIF-based implementers.
AI for Health (MPAI-AIH): developing the specification of a system enabling clients to improve models processing health data and federated learning to share the training.
Context-based Audio Enhancement (CAE-DC): developing the Audio Six Degrees of Freedom (CAE-6DF) standard.
Connected Autonomous Vehicle (MPAI-CAV): updating the MPAI-CAV Architecture part and developing the new MPAI-CAV Technologies (CAV-TEC) part of the standard.
Compression and Understanding of Industrial Data (MPAI-CUI): developing Company Performance Prediction standard V2.0.
End-to-End Video Coding (MPAI-EEV): exploring the potential of video coding using AI-based End-to-End Video coding.
AI-Enhanced Video Coding (MPAI-EVC): waiting for responses to the Call for Technologies for video up-sampling filter on 11 February.
Governance of the MPAI Ecosystem (MPAI-GME): working on version 2.0 of the Specification.
Human and Machine Communication (MPAI-HMC): developing reference software and performance assessment.
Multimodal Conversation (MPAI-MMC): developing technologies for more Natural-Language-based user interfaces capable of handling more complex questions.
MPAI Metaverse Model (MPAI-MMM): extending the MPAI-MMM specs to support more applications.
Neural Network Watermarking (MPAI-NNW): studying the use of fingerprinting as a technology for neural network traceability.
Object and Scene Description (MPAI-OSD): studying applications requiring more space-time handling.
Portable Avatar Format (MPAI-PAF): studying more applications using digital humans needing new technologies.
AI Module Profiles (MPAI-PRF): specifying which features AI Workflow or more AI Modules need to support.
Server-based Predictive Multiplayer Gaming (MPAI-SPG): exploring new standard opportunities in the domain.
Data Types, Formats, and Attributes (MPAI-TFA): extending the standard to data types used by MPAI standards (e.g., automotive and health).
XR Venues (MPAI-XRV): developing the standard for improved development and execution of Live Theatrical Performances and studying the prospects of Collaborative Immersive Laboratories.

Legal entities and representatives of academic departments supporting the MPAI mission and able to contribute to the development of standards for the efficient use of data can become MPAI members.


Leonardo Chiariglione
2025-02-19

MPAI releases the MPAI Metaverse Model as Open-Source Software

Geneva, Switzerland – 19th February 2025. MPAI – Moving Picture, Audio and Data Coding by Artificial Intelligence – the international, non-profit, unaffiliated organisation developing AI-based data coding standards – has concluded its 53rd General Assembly (MPAI-53) releasing the first version of the MPAI Metaverse Model Open-Source Reference Software and kicking off the new project Compression and Understanding of Industrial Data (MPAI-CUI) – Company Performance Prediction (CUI-CPP) V2.0.

MPAI has been working on MPAI Metaverse Model (MPAI-MMM) standards since January 2022 and has published two Technical Reports and two Technical Specifications on Architecture and Technologies. The Reference Software released today implements a significant number of the MMM functionalities and uses a set of Unity instances to realise the different environments of the metaverse instance. You can find the MMM software at http://bit.ly/41J0wsj (REST API web server) and https://bit.ly/4ituw0R (Unity web server).

The Company Performance Prediction (CUI-CPP) project intends to provide a solution to a problem afflicting all companies: given the governance structure and the financial situation of a company and various types of risks that may affect it, what is the impact of the components of the governance structure on the governance, finance, and risk on the probability of default? MPAI has developed a set of functional requirements and a framework licence. A Call for Technologies was published and the standard will be collaboratively developed based on the responses to the Call.

MPAI is continuing its work plan that involves the following activities:

AI Framework (MPAI-AIF): building a community of MPAI-AIF-based implementers.
AI for Health (MPAI-AIH): developing the specification of a system enabling clients to improve models processing health data and federated learning to share the training.
Context-based Audio Enhancement (CAE-DC): developing the Audio Six Degrees of Freedom (CAE-6DF) standard.
Connected Autonomous Vehicle (MPAI-CAV): updating the MPAI-CAV Architecture part and developing the new MPAI-CAV Technologies (CAV-TEC) part of the standard.
Compression and Understanding of Industrial Data (MPAI-CUI): developing Company Performance Prediction standard V2.0.
End-to-End Video Coding (MPAI-EEV): exploring the potential of video coding using AI-based End-to-End Video coding.
AI-Enhanced Video Coding (MPAI-EVC): waiting for responses to the Call for Technologies for video up-sampling filter on 11 February.
Governance of the MPAI Ecosystem (MPAI-GME): working on version 2.0 of the Specification.
Human and Machine Communication (MPAI-HMC): developing reference software and performance assessment.
Multimodal Conversation (MPAI-MMC): developing technologies for more Natural-Language-based user interfaces capable of handling more complex questions.
MPAI Metaverse Model (MPAI-MMM): extending the MPAI-MMM specs to support more applications.
Neural Network Watermarking (MPAI-NNW): studying the use of fingerprinting as a technology for neural network traceability.
Object and Scene Description (MPAI-OSD): studying applications requiring more space-time handling.
Portable Avatar Format (MPAI-PAF): studying more applications using digital humans needing new technologies.
AI Module Profiles (MPAI-PRF): specifying which features AI Workflow or more AI Modules need to support.
Server-based Predictive Multiplayer Gaming (MPAI-SPG): exploring new standard opportunities in the domain.
Data Types, Formats, and Attributes (MPAI-TFA): extending the standard to data types used by MPAI standards (e.g., automotive and health).
XR Venues (MPAI-XRV): developing the standard for improved development and execution of Live Theatrical Performances and studying the prospects of Collaborative Immersive Laboratories.

Legal entities and representatives of academic departments supporting the MPAI mission and able to contribute to the development of standards for the efficient use of data can become MPAI members.


Leonardo Chiariglione
2025-02-01

Is it possible to mitigate data loss effects in online gaming?

The 52nd MPAI General Assembly (MPAI-52) has approved Server-based Predictive Multiplayer Gaming (MPAI-SPG) – Mitigation of Data Loss effects (SPG-MDL) V1.0. It is a Technical Report that provides a methodology to predict the game state of an online gaming server when some controller data is lost. The Prediction is obtained by applying Machine Learning algorithms based on historical data of the online game.

An online Multiplayer Game is based on a server. When the server maintains consistency among all clients’ game instances, it is called authoritative. It updates and broadcasts the game state using the controller data of all the clients. This function is harmed when controller data are not correctly received or are maliciously modified.

Several techniques are currently used to remedy this situation. In Client Prediction, the client game state is updated locally using predicted or interpolated data while waiting for the server data; in Time Delay, the server buffers the game state updates to synchronise all clients; and in Time Warp, the server rolls back the game state to when controller data was sent by a client and acts as if the action was taken then, reconciling this new game state with the current game state.
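As an illustration of the Time Warp idea only, here is a minimal Python sketch. The `TimeWarpServer` class, the integer game state, and the `step` rule are all hypothetical and chosen for brevity; they are not taken from any game or from the Technical Report. The server keeps a history of states and inputs, and when a late input arrives it rolls back to the tick in which the input was sent and replays the following ticks.

```python
from typing import List


def step(state: int, inputs: List[int]) -> int:
    # Hypothetical, order-sensitive game rule used only for illustration.
    return state * 2 + sum(inputs)


class TimeWarpServer:
    """Toy authoritative server that can roll back and replay ticks."""

    def __init__(self, initial_state: int) -> None:
        self.states: List[int] = [initial_state]  # states[t]: state at start of tick t
        self.inputs: List[List[int]] = []          # inputs[t]: inputs applied in tick t

    def advance(self, tick_inputs: List[int]) -> None:
        # Normal operation: record the inputs and compute the next state.
        self.inputs.append(list(tick_inputs))
        self.states.append(step(self.states[-1], self.inputs[-1]))

    def apply_late_input(self, tick: int, value: int) -> None:
        # Time Warp: insert the late input at the tick it was sent,
        # discard every state computed after that tick, and replay.
        self.inputs[tick].append(value)
        del self.states[tick + 1:]
        for t in range(tick, len(self.inputs)):
            self.states.append(step(self.states[-1], self.inputs[t]))

    @property
    def current(self) -> int:
        return self.states[-1]


server = TimeWarpServer(initial_state=0)
for tick_inputs in ([1], [2], [3]):
    server.advance(tick_inputs)
state_before = server.current

# An input sent during tick 1 arrives only now.
server.apply_late_input(tick=1, value=10)
state_after = server.current
```

In this toy run the current state changes once the late input is reconciled, which is exactly the effect that the other players perceive.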

These three methods have shortcomings. Client Prediction causes perceptible delay, Time Delay affects responsiveness, and Time Warp disadvantages other players because the new game state likely differs from the previous one.

Figure 1 depicts the arrangement proposed by SPG-MDL. The right-hand side represents the online game server with the Game State Engine tasked to produce the game state using all clients’ controller data and specialised engines. The Engines of the figure are:

Behaviour Engine, orchestrating actions from players and non-player entities.
Rules Engine, ensuring adherence to game mechanics.
Physics Engine, responsible for physical interactions within the game environment.

Figure 1 – Server Prediction

Whenever a client’s controller data is lost, the server requests the SPG-MDL module (the left-hand side of Figure 1) to compute the next game state. In this way, even if a client is experiencing network latency, the other clients maintain a continuous playing environment. The more accurate the predictions, the less noticeable the effect of the synchronisation process on the lagging client when the network resumes normal operations. A latency-affected client will still receive the results of its actions with a delay, but the server will send the new game state before receiving the action from the client, effectively halving the wait time. Of course, it is possible to further mitigate the effects of this problem by implementing additional client-side techniques, for example Client Prediction.

When some controller data is lost, the process begins with the last correct game state being fed into the SPG-MDL’s Game State Demultiplexer, which deconstructs it into discrete Game Messages*. To differentiate the Game Messages inside the “twin” game server from the ones inside the game server, the ‘*’ symbol is attached to the SPG-MDL Game Messages. Each Game Message* is then processed by its respective Engine AI, leveraging a Neural Network Model to produce a predicted Game Message*. These predictions are assembled by the Predicted Game State Multiplexer into the predicted Game State, which is then sent back to the game server for the next iteration of Game State computation.
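The demultiplex, predict, and multiplex flow can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary-based game state, the function names, and the stand-in "Engine AIs" (a constant-velocity extrapolation and an identity function in place of trained Neural Network Models) are hypothetical and are not defined by SPG-MDL.

```python
from typing import Callable, Dict

# Assumption: a game state maps an engine name to that engine's Game Message*,
# and a Game Message* is a flat dictionary of numeric fields.
GameMessage = Dict[str, float]
GameState = Dict[str, GameMessage]
EngineAI = Callable[[GameMessage], GameMessage]


def demultiplex(game_state: GameState) -> Dict[str, GameMessage]:
    """Split the last correct game state into one Game Message* per engine."""
    return {engine: dict(message) for engine, message in game_state.items()}


def multiplex(messages: Dict[str, GameMessage]) -> GameState:
    """Assemble the predicted Game Messages* into a predicted game state."""
    return {engine: dict(message) for engine, message in messages.items()}


def predict_game_state(last_state: GameState,
                       engine_ais: Dict[str, EngineAI]) -> GameState:
    """Run each Game Message* through its Engine AI and reassemble the result."""
    messages = demultiplex(last_state)
    predicted = {engine: engine_ais[engine](msg) for engine, msg in messages.items()}
    return multiplex(predicted)


def physics_ai(msg: GameMessage) -> GameMessage:
    # Stand-in for a trained model: constant-velocity extrapolation.
    return {"x": msg["x"] + msg["vx"], "vx": msg["vx"]}


def identity_ai(msg: GameMessage) -> GameMessage:
    # Stand-in for a trained model: predicts no change.
    return dict(msg)


ENGINE_AIS = {"behaviour": identity_ai, "rules": identity_ai, "physics": physics_ai}

last_state = {
    "behaviour": {"throttle": 1.0},
    "rules": {"lap": 2.0},
    "physics": {"x": 10.0, "vx": 3.0},
}
predicted_state = predict_game_state(last_state, ENGINE_AIS)
```

A game without one of the three engines would simply omit the corresponding entry from both the game state and `ENGINE_AIS`.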

While the SPG-MDL module operates, the server independently computes its updated Game State (GS_t+1) using the available Client Data. The server utilises the predicted Game State as follows: if any Client Data is missing, the server uses the predicted game state to compensate for the data missing from one or more clients. Note that the online game server architecture is a reference model and that the three engines are not a requirement for a specific game applying the MPAI-SPG methodology. For example, some games may not have a Physics Engine because physics-based behaviour is not required.
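The fallback rule can be illustrated with a short Python sketch. It assumes, purely for illustration, that the predicted game state can be read per client as a dictionary of controller values; the function name and the field names (`steer`, `throttle`) are hypothetical.

```python
from typing import Dict, Optional

ClientData = Dict[str, float]


def fill_missing_client_data(
    received: Dict[str, Optional[ClientData]],
    predicted: Dict[str, ClientData],
) -> Dict[str, ClientData]:
    """Keep received controller data; use the prediction only where data is missing."""
    merged: Dict[str, ClientData] = {}
    for client_id, data in received.items():
        if data is not None:
            merged[client_id] = data                   # data arrived in time
        else:
            merged[client_id] = predicted[client_id]   # compensate with the prediction
    return merged


# "bob" is affected by data loss in this tick, so None stands for missing data.
received = {"alice": {"steer": 0.2, "throttle": 1.0}, "bob": None}
predicted = {
    "alice": {"steer": 0.0, "throttle": 1.0},
    "bob": {"steer": -0.1, "throttle": 0.8},
}
server_inputs = fill_missing_client_data(received, predicted)
```

Clients whose data arrived are never overridden by the prediction in this sketch; only the missing entries are replaced.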

SPG-MDL provides a 10-step procedure to develop an SPG-MDL module that is applicable to any authoritative game server.

The first 4 steps outline the game setup and enable informed decisions for the implementation of SPG.

1. Select the game.
2. Define the Entities (to identify NN Model parameters):
- Environment
- Human-controlled players (HPC) and Non-player characters (NPC)
3. Define the Game State and relevant Entities.
4. Design the training dataset.
5. Collect the training dataset.
6. Train prediction NN Models defining viable architectures and training parameters and comparing the training results of different architectures.
7. Implement SPG-MDL.
8. Evaluate SPG-MDL to select the model yielding the best predictions.
9. Implement modules which simulate the disturbances.
10. Evaluate the SPG-MDL-enabled game experience with human players.

For each of the ten steps, the Technical Report provides:

High-level guidelines to outline the actions required.
An example of how the guidelines are implemented using a car racing game (Figure 2).

Figure 2 – Modular tiles and an example of a racetrack

The game was developed using the Unity game engine, and the networking features were implemented with the open-source game networking library Mirror.

The following components are provided:

The car racing game.
Four different categories of Agent Players trained using Unity’s ML-Agents library.
The dataset used for training generated by simulating game sessions played by the Agent Players.
Jupyter Notebooks for training experiments and results.
The trained models used by the AI-Behaviour Engine.

The software is available online. Please contact the MPAI secretariat to access the repository. Details on how to use the material are provided in the repository’s README page.


Leonardo Chiariglione
2025-01-22

MPAI applies AI to Server-based Predictive Multiplayer Gaming

Geneva, Switzerland – 22nd January 2025. MPAI – Moving Picture, Audio and Data Coding by Artificial Intelligence – the international, non-profit, unaffiliated organisation developing AI-based data coding standards – has concluded its 52nd General Assembly (MPAI-52) approving publication of Technical Report: Server-based Predictive Multiplayer Gaming (MPAI-SPG) – Mitigation of Data Loss effects (SPG-MDL) V1.0.

Technical Report: Server-based Predictive Multiplayer Gaming (MPAI-SPG) – Mitigation of Data Loss effects (SPG-MDL) V1.0 addresses the effect of controller data latency in online multiplayer gaming. When controller data from a player does not reach the server on time, the server is unable to update and distribute a correct game state. The Technical Report provides guidelines on the design and use of Neural Networks that produce reliable and accurate predictions making up for the absence of players’ control data in multiplayer gaming contexts based on authoritative servers. An example Reference Software allows experimenters to test the suggested guidelines in a practical case.

MPAI will make an online presentation of the main results of the SPG-MDL V1.0 Technical Report on 12th of February 2025 at 15 UTC. Register at https://encr.pw/CzO72 to attend.

See also the MPAI presentation to LA SIGGRAPH by Leonardo Chiariglione, Marina Bosi (Six Degrees of Freedom Audio), Andrea Bottino (MPAI Metaverse Model), Mark Seligman (Multimodal Conversation), and Ed Lantz (XR Venues).

MPAI is continuing its work plan that involves the following activities:

AI Framework (MPAI-AIF): building a community of MPAI-AIF-based implementers.
AI for Health (MPAI-AIH): developing the specification of a system enabling clients to improve models processing health data and federated learning to share the training.
Context-based Audio Enhancement (CAE-DC): developing the Audio Six Degrees of Freedom (CAE-6DF) standard.
Connected Autonomous Vehicle (MPAI-CAV): updating the MPAI-CAV Architecture part and developing the new MPAI-CAV Technologies (CAV-TEC) part of the standard.
Compression and Understanding of Industrial Data (MPAI-CUI): waiting for responses to the Call for Technologies on 11 February.
End-to-End Video Coding (MPAI-EEV): exploring the potential of video coding using AI-based End-to-End Video coding.
AI-Enhanced Video Coding (MPAI-EVC): waiting for responses to the Call for Technologies for video up-sampling filter on 11 February.
Governance of the MPAI Ecosystem (MPAI-GME): working on version 2.0 of the Specification.
Human and Machine Communication (MPAI-HMC): developing reference software and performance assessment.
Multimodal Conversation (MPAI-MMC): developing technologies for more Natural-Language-based user interfaces capable of handling more complex questions.
MPAI Metaverse Model (MPAI-MMM): extending the MPAI-MMM specs to support more applications.
Neural Network Watermarking (MPAI-NNW): studying the use of fingerprinting as a technology for neural network traceability.
Object and Scene Description (MPAI-OSD): studying applications requiring more space-time handling.
Portable Avatar Format (MPAI-PAF): studying more applications using digital humans needing new technologies.
AI Module Profiles (MPAI-PRF): specifying which features AI Workflow or more AI Modules need to support.
Server-based Predictive Multiplayer Gaming (MPAI-SPG): developing technical report on mitigation of data loss.
Data Types, Formats, and Attributes (MPAI-TFA): extending the standard to data types used by MPAI standards (e.g., automotive and health).
XR Venues (MPAI-XRV): developing the standard for improved development and execution of Live Theatrical Performances and studying the prospects of Collaborative Immersive Laboratories.

Legal entities and representatives of academic departments supporting the MPAI mission and able to contribute to the development of standards for the efficient use of data can become MPAI members.


Leonardo Chiariglione
2025-01-02

MPAI calls for a new generation of company performance prediction technologies

The 51st MPAI General Assembly has decided to develop a new version V2.0 of Compression and Understanding of Industrial Data (MPAI-CUI) – Company Performance Prediction (CUI-CPP) and has issued a Call for Technologies to acquire relevant technologies. Register to attend the online event where the Call will be presented on 2025/01/08 at 15:00 UTC.

Compression and Understanding of Industrial Data (MPAI-CUI) was one of the first (2021) MPAI standards. The MPAI-CUI V1.0 Company Performance Prediction Use Case was based on the notion that the future of a company largely depends on its governance structure, its financial state, and the risks it may face. The solution to the complex task of creating a standard that predicts the future of a company from such variables was based on:

Governance Data, Financial Data, and Risk Assessment Data.
Conversion of Governance Data and Financial Data into Descriptors.
Conversion of Risk Assessment Data into a Risk Matrix.
Passing the Governance and Financial Descriptors to an Organisation Assessment and Default Prediction neural network.
Perturbing the Default Probability from the neural network with the Risk Matrix to obtain the Discontinuity Prediction.

Governance Descriptors used as input to the neural network were: #Stakeholder Individuals, #Stakeholder Companies, Shareholder Share, Shareholders Gender, Decision-Makers Gender, #Decision-Makers, Members of the Revision And Advisory Board, Presence Of The Advisory Company, #Decision-Makers By The Same Family, Company Phase (Age).

Financial Descriptors used as input to the neural network were: Revenues, EBITDA Margin, EBITDA, Quick Ratio, Current Ratio, Net Working Capital, Net Financial Position, Net Short-Term Assets, Shareholder Funds-Fixed Assets, Long-Term Liability Ratio, Coverage Of Fixed Assets, Amortisation Rate, Debt On Sales, Interest Coverage Ratio, Average Stock Turnover, Stock Coverage Days, Return On Investments (ROI), Return On Assets (ROA), Return On Sales (ROS), Return On Equity (ROE), Cash Flow, Interest On Sales, Type Of Financial Statement.

The Risk Matrix included the following characteristics: Occurrence (3 values), Business Impact (3 values), Gravity (5 values), Risk retention (portion of the risk that the Company decides to retain).

The Governance and Financial Descriptors, together with the Prediction Horizon, were fed to the neural network, which provided an Organisational Model Index and a Default Probability. The Default Probability and the Risk Matrix were fed to a Prediction Result Perturbation AIM, which perturbed the Default Probability and produced the Business Discontinuity Probability.

Figure 1 depicts the AI Workflow that performs as described above.

Figure 1 – The Company Performance Prediction AI Workflow of MPAI-CUI V1.1

At the 51st General Assembly (MPAI-51), MPAI issued a Call for Technologies with a substantially more ambitious goal for MPAI-CUI V2.0. The new Company Performance Prediction targets:

A more precise identification of Cyber, Digitisation, Climate, and Business risks.
A definition of risks to be organised as follows:
- Risk name, Risk type: (cyber, etc.),
- Target regulation,
- Vector of inputs including, e.g. for Cyber Risks:
  - Name of input: IP address, Denial of service;
  - Time: time the attack was detected;
  - Source: provider of input vector;
  - Type: image, text, category, etc.;
  - Value: depends on type.
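To make the structure above concrete, the risk definition could be modelled as two small Python dataclasses. This is a sketch only: the class and field names, the use of an ISO 8601 string for the time, and the example values (including the documentation-range IP address and the placeholder regulation and provider) are assumptions and are not taken from the Call for Technologies.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Union


@dataclass
class RiskInput:
    """One element of the vector of inputs attached to a risk."""
    name: str    # e.g. "IP address" or "Denial of service"
    time: str    # time the attack was detected (ISO 8601 string assumed)
    source: str  # provider of the input vector
    type: str    # "image", "text", "category", etc.
    value: Union[str, float, bytes]  # interpretation depends on `type`


@dataclass
class Risk:
    """A risk definition organised as in the list above."""
    name: str
    risk_type: str          # "cyber", "digitisation", "climate", "business", ...
    target_regulation: str
    inputs: List[RiskInput] = field(default_factory=list)


# Hypothetical example values, for illustration only.
risk = Risk(
    name="Denial of service on web portal",
    risk_type="cyber",
    target_regulation="hypothetical-regulation",
    inputs=[
        RiskInput(
            name="IP address",
            time="2025-01-08T15:00:00Z",
            source="hypothetical-provider",
            type="text",
            value="192.0.2.1",
        )
    ],
)
risk_as_dict = asdict(risk)  # nested plain dictionaries, convenient for serialisation
```

Keeping `value` loosely typed mirrors the statement that its meaning depends on the declared type of the input.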

With this additional information, the task is to define a (set of) neural network(s) that receive(s) Risk Data in addition to Governance and Financial Descriptors. The outputs of the Assessment and Prediction network of Figure 2 are not only numbers but Descriptors that include the index or probability and also information on which input elements have more influence on the index and probabilities.

Figure 2 – The Company Performance Prediction AI Workflow of MPAI-CUI V1.1

Risk Assessment and Risk Matrix are used when sufficient data are not available to train the Neural Network or when the Neural Network may not be used because it does not comply with relevant regulations.

All parties having rights to technologies satisfying the Use Cases and Functional Requirements and the Framework Licence of the planned Technical Specification MPAI-CUI V2.0 are invited to respond to the Call for Technologies, preferably using the Template for Responses. Submissions received by 2025/02/11 will be assessed and considered for use in the development of said MPAI-CUI Technical Specification.


Leonardo Chiariglione
2024-12-18

MPAI calls for “Company Performance Prediction” technologies

Geneva, Switzerland – 18th December 2024. MPAI – Moving Picture, Audio and Data Coding by Artificial Intelligence – the international, non-profit, unaffiliated organisation developing AI-based data coding standards – has concluded its 51st General Assembly (MPAI-51) approving publication of the Compression and Understanding of Industrial Data (MPAI-CUI) V2.0 Call for Technologies.

Call for Technologies: Compression and Understanding of Industrial Data (MPAI-CUI) V2.0 invites any party able and wishing to contribute to the development of the planned MPAI-CUI V2.0 Technical Specification to submit a response by 11th of February 2025. The new standard will extend the current company organisation index and default/discontinuity probabilities with descriptors and information on the compliance of the Machine Learning Models used.

MPAI-51 also approved as MPAI standards:

Neural Network Traceability (MPAI-NNT) V1.0 to evaluate the ability to trace back to its source a neural network that has been modified, the computational cost of injecting, extracting, detecting, decoding, or matching data from a neural network, and the impact on the performance of a neural network with inserted traceability data and its inference.
Human and Machine Communication (MPAI-HMC) V2.0 that enables advanced forms of communication between humans in a real space or represented in a Virtual Space, and Machines represented as humanoids in a Virtual Space or rendered as humanoids in a real space.
Context-based Audio Enhancement (MPAI-CAE) V2.3, Multimodal Conversation (MPAI-MMC) V2.3, Object and Scene Descriptors (MPAI-OSD) V1.2, and Portable Avatar Format (MPAI-PAF) V1.3.

The MPAI-CUI V2.0 Call for Technologies will be presented online on 8th of January 2025 at 15 UTC. Register at https://tinyurl.com/4vdps8f3 to attend.

MPAI is continuing its work plan that involves the following activities:

AI Framework (MPAI-AIF): building a community of MPAI-AIF-based implementers.
AI for Health (MPAI-AIH): developing the specification of a system enabling clients to improve models processing health data and federated learning to share the training.
Context-based Audio Enhancement (CAE-DC): developing the Audio Six Degrees of Freedom (CAE-6DF) standard.
Connected Autonomous Vehicle (MPAI-CAV): updating the MPAI-CAV Architecture part and developing the new MPAI-CAV Technologies (CAV-TEC) part of the standard.
Compression and Understanding of Industrial Data (MPAI-CUI): waiting for responses to the Call for Technologies on 11 February.
End-to-End Video Coding (MPAI-EEV): exploring the potential of video coding using AI-based End-to-End Video coding.
AI-Enhanced Video Coding (MPAI-EVC): waiting for responses to the Call for Technologies for video up-sampling filter on 11 February.
Governance of the MPAI Ecosystem (MPAI-GME): working on version 2.0 of the Specification.
Human and Machine Communication (MPAI-HMC): developing reference software and performance assessment.
Multimodal Conversation (MPAI-MMC): developing technologies for more Natural-Language-based user interfaces capable of handling more complex questions.
MPAI Metaverse Model (MPAI-MMM): extending the MPAI-MMM specs to support more applications.
Neural Network Watermarking (MPAI-NNW): studying the use of fingerprinting as a technology for neural network traceability.
Object and Scene Description (MPAI-OSD): studying applications requiring more space-time handling.
Portable Avatar Format (MPAI-PAF): studying more applications using digital humans needing new technologies.
AI Module Profiles (MPAI-PRF): specifying which features AI Workflow or more AI Modules need to support.
Server-based Predictive Multiplayer Gaming (MPAI-SPG): developing technical report on mitigation of data loss.
Data Types, Formats, and Attributes (MPAI-TFA): extending the standard to data types used by MPAI standards (e.g., automotive and health).
XR Venues (MPAI-XRV): developing the standard for improved development and execution of Live Theatrical Performances and studying the prospects of Collaborative Immersive Laboratories.

Legal entities and representatives of academic departments supporting the MPAI mission and able to contribute to the development of standards for the efficient use of data can become MPAI members.
