

Artificial Intelligence (AI) has recently made great strides in offering more efficient ways to implement processes formerly carried out with Data Processing (DP) technologies. However, AI has often been used in an ad hoc way. Many machines using AI perform extremely complex functions, but the value of the result is known to depend on the training data sets, which are typically known only to the implementer. In certain applications – information services, for instance – this data issue may have potentially devastating social impacts due to unexpected or possibly deliberate bias in the training. In other applications – such as in autonomous vehicles – the inability to explain how the vehicle came to a particular decision may likewise be unacceptable.

Data Processing standards have played a major role in promoting the wide use of digital technologies for products, services, and applications. However, few if any examples are known of AI standards with an approach comparable to that of DP standards. The MPAI organisation (Moving Picture, Audio, and Data Coding by Artificial Intelligence [1]) has taken on the mission of developing AI-based data coding standards. The group has already developed several Technical Specifications using AI Modules (AIMs), which attempt to break monolithic applications into components that have known functions and interfaces and can be implemented using AI or DP technologies. Applications built from these modules can be implemented as AI Workflows (AIWs), which themselves have known functions and external interfaces and are composed of AIMs interconnected according to a specified topology.
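
The idea of an AIW as a set of AIMs wired together by a topology can be illustrated with a minimal Python sketch. This is not the MPAI-AIF API: the class names `AIM` and `AIW`, the `"aim.port"` naming scheme, and the two toy modules are all hypothetical and chosen only for illustration. The sketch also assumes the AIMs are listed in an order in which they can be executed.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class AIM:
    """Hypothetical stand-in for an AI Module: a known function with declared ports."""
    name: str
    inputs: List[str]
    outputs: List[str]
    run: Callable[..., Dict[str, Any]]  # may wrap an AI model or plain DP code


@dataclass
class AIW:
    """Hypothetical AI Workflow: AIMs plus a topology of (source, destination) ports."""
    aims: List[AIM]                   # assumed to be listed in executable order
    topology: List[Tuple[str, str]]   # e.g. ("normalise.clean", "count.clean")

    def execute(self, external_inputs: Dict[str, Any]) -> Dict[str, Any]:
        values = dict(external_inputs)  # maps "aim.port" to the data on that port
        for aim in self.aims:
            # Copy data along every channel that ends at one of this AIM's inputs.
            for src, dst in self.topology:
                if dst.startswith(aim.name + ".") and src in values:
                    values[dst] = values[src]
            kwargs = {port: values[f"{aim.name}.{port}"] for port in aim.inputs}
            result = aim.run(**kwargs)
            for port in aim.outputs:
                values[f"{aim.name}.{port}"] = result[port]
        return values


# Two toy AIMs (both plain DP here; either could be swapped for an AI model
# without changing the workflow, because only the interfaces are fixed).
normalise = AIM("normalise", ["raw"], ["clean"],
                lambda raw: {"clean": raw.strip().lower()})
count = AIM("count", ["clean"], ["n_words"],
            lambda clean: {"n_words": len(clean.split())})

workflow = AIW(aims=[normalise, count],
               topology=[("normalise.clean", "count.clean")])
result = workflow.execute({"normalise.raw": "  Hello MPAI World "})
print(result["count.n_words"])
```

Because the workflow refers to modules only through their port names, replacing one module with another that exposes the same ports leaves the rest of the workflow untouched, which is the property the component-based approach relies on.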

MPAI Technical Specifications offer two main advantages. The first is the ability to implement AI applications whose operation is more traceable and explainable. The second is the ability to create a competitive market of components – AIMs – with standardised functions and interfaces and potentially providing competitive performance.

MPAI has been pursuing this mission for several years. The group has developed Technical Specification: Governance of the MPAI Ecosystem (MPAI-GME) [3]. The MPAI Ecosystem is defined by the following elements:

  1. The collections of Technical Specifications, Reference Software Specifications, Conformance Testing Specifications, and Performance Assessment Specifications jointly called Standard.
  2. The MPAI Store in charge of making AIMs and AIWs available and providing Implementer Identifiers through its Implementer ID Registration Authority.
  3. Implementers of MPAI Technical Specifications who have obtained an Implementer Identifier.
  4. Performance Assessors, i.e., independent entities appointed by MPAI who assess the performance of implementations in terms of Reliability, Replicability, Robustness, and Fairness.

Another foundational Technical Specification is Technical Specification: AI Framework (MPAI-AIF) [4], which enables dynamic configuration, initialisation, and control of AIWs in a standard environment (the AI Framework) depicted in Figure 1.

Figure 1 – The AI Framework (MPAI-AIF) V2 Reference Model

An Implementation of MPAI-AIF enables the secure execution of AIWs constituted by AIMs. AIMs can execute Data Processing (DP) or Artificial Intelligence (AI) algorithms and can be implemented in hardware, software, or hybrid hardware/software. They can be Composite, i.e., include interconnected AIMs.
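
The notion of a Composite AIM, i.e., an AIM that contains interconnected AIMs while presenting the same kind of interface, can be sketched as follows. The classes `AIM` and `CompositeAIM` and their methods are hypothetical names, not part of any MPAI specification, and the sketch simplifies the internal interconnection to a linear chain, whereas the text allows any interconnection.

```python
from typing import Any, Callable, List


class AIM:
    """Hypothetical basic AI Module wrapping a single processing function."""

    def __init__(self, name: str, fn: Callable[[Any], Any], kind: str = "DP"):
        self.name = name
        self.fn = fn
        self.kind = kind  # "AI" or "DP"; callers do not need to know which

    def process(self, data: Any) -> Any:
        return self.fn(data)

    def leaves(self) -> List[str]:
        """Names of the basic AIMs this module is made of."""
        return [self.name]


class CompositeAIM(AIM):
    """Hypothetical Composite AIM: same interface, but built from inner AIMs."""

    def __init__(self, name: str, children: List[AIM]):
        self.name = name
        self.kind = "Composite"
        self.children = children  # simplification: children form a linear chain

    def process(self, data: Any) -> Any:
        for child in self.children:
            data = child.process(data)
        return data

    def leaves(self) -> List[str]:
        return [name for child in self.children for name in child.leaves()]


tokenise = AIM("tokenise", str.split)
dedupe = AIM("dedupe", lambda tokens: sorted(set(tokens)))
front_end = CompositeAIM("front_end", [tokenise, dedupe])
pipeline = CompositeAIM("pipeline", [front_end, AIM("count", len)])

n_unique = pipeline.process("a b a c")
inner = pipeline.leaves()
print(n_unique, inner)
```

Since a Composite AIM is used exactly like a basic one, composites can be nested, as `pipeline` containing `front_end` shows.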

Thus, MPAI specifications enable the implementation of applications whose internal operation end-users can understand to some degree, rather than machines that are just “black boxes” resulting from unknown training with unknown datasets. The developers of AIMs used in the AIWs can compete by providing components with standard interfaces that may perform better than other implementations.

So far, MPAI has developed eight application-specific Technical Specifications on a wide range of application domains: context-enhanced audio, connected autonomous vehicles, audio enhancement, prediction of company performance, multimodal human-machine conversation, metaverse architecture, neural network watermarking, object and scene description, and portable avatars.

MPAI Technical Specifications are developed in compliance with a rigorous process [2] in service of the following policies:

  1. Closely accommodate a given AI use case while remaining, so far as possible, agnostic to the technology (AI or DP) used in an implementation.
  2. Facilitate the practical exploitation of Technical Specifications once adopted by MPAI.
  3. Attempt to attract various industries, end users, and regulators.
  4. Address three levels of standardisation, any of which an implementer can freely decide to adopt: the data exchanged by AIMs (“Data Types”), AIMs, and AIWs.
  5. Specify the Data Types with clear, humanly understandable semantics, so far as possible.

This Technical Specification: Human and Machine Communication (MPAI-HMC) leverages five MPAI Technical Specifications: Context-based Audio Enhancement, MPAI Metaverse Model – Architecture, Multimodal Conversation, Object and Scene Description, and Portable Avatar Format, all of which deal with technologies enabling communication of real and digital humans in real or virtual environments. MPAI-HMC reproduces the normative elements from the five Technical Specifications that are relevant to this Technical Specification.

A Term beginning with a capital letter is defined in Table 1 if it is MPAI-HMC-specific or in Table 26 if its use extends across MPAI Technical Specifications. A term beginning with a small letter has the commonly intended meaning.

MPAI may extend this Version of MPAI-HMC with new technologies drawing from existing or new Technical Specifications.

Chapters, Sections, and Annexes are Normative unless they are explicitly identified as Informative.

