1 Introduction
2 MPAI-AIF
2.1 AIF V1
2.2 AIF V2
3 MPAI-AIH
4 MPAI-ARA
5 MPAI-CAE
6 MPAI-CUI
7 MPAI-CAV
8 MPAI-EEV
9 MPAI-EVC
10 MPAI-GSA
11 MPAI-MCS
12 MPAI-MMC
12.1 MMC V1
12.2 MMC V2
13 MPAI-NNW
14 MPAI-OSD
15 MPAI-SPG
16 MPAI-XRV

1        Introduction

MPAI’s standards development is based on projects that evolve through a workflow extending over 6 + 1 stages.

| # | Acr | Name | Description |
|---|---|---|---|
| 0 | IC | Interest Collection | Collection and harmonisation of use cases proposed. |
| 1 | UC | Use cases | Proposals of use cases, their description and merger of compatible use cases. |
| 2 | FR | Functional Reqs | Identification of the functional requirements that the standard including the Use Case should satisfy. |
| 3 | CR | Commercial Reqs | Development and approval of the framework licence of the standard. |
| 4 | CfT | Call for Technologies | Preparation and publication of a document calling for technologies supporting the functional and commercial requirements. |
| 5 | SD | Standard Development | Development of the standard in a specific Development Committee (DC). |
| 6 | CC | Community Comments | When the standard has achieved sufficient maturity it is published with request for comments. |
| 7 | MS | MPAI standard | The standard is approved by the General Assembly. |
| 7.1 | TS | Technical Specification | The normative specification to make a conforming implementation. |
| 7.2 | RS | Reference Software | The descriptive text and the software implementing the Technical Specification. |
| 7.3 | CT | Conformance Testing | The Specification of the steps to be executed to test an implementation for conformance. |
| 7.4 | PA | Performance Assessment | The Specification of the steps to be executed to assess an implementation for performance. |

A project progresses from one stage to the next by resolution of the General Assembly.
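
The progression rule can be sketched as a small state machine. This is only an illustrative sketch: the `Stage` and `Project` names, the `ga_resolution` flag and the project name used below are assumptions made for the example, not part of any MPAI specification.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Workflow stages, numbered as in the table above."""
    IC = 0   # Interest Collection
    UC = 1   # Use Cases
    FR = 2   # Functional Requirements
    CR = 3   # Commercial Requirements
    CFT = 4  # Call for Technologies
    SD = 5   # Standard Development
    CC = 6   # Community Comments
    MS = 7   # MPAI Standard


class Project:
    """Hypothetical record of a project's position in the workflow."""

    def __init__(self, name, stage=Stage.IC):
        self.name = name
        self.stage = stage

    def advance(self, ga_resolution):
        """Move to the next stage only if a General Assembly resolution exists."""
        if not ga_resolution:
            raise PermissionError("stage change requires a General Assembly resolution")
        if self.stage is Stage.MS:
            raise ValueError(f"{self.name} is already an MPAI standard")
        self.stage = Stage(self.stage + 1)
        return self.stage


# Usage with a hypothetical project name
project = Project("MPAI-EXAMPLE")
project.advance(ga_resolution=True)
print(project.stage.name)  # prints "UC"
```

Modelling the stages as an ordered enumeration keeps the "one stage at a time" rule explicit; the sub-stages 7.1 to 7.4 are left out because they describe components of an approved standard rather than steps in the progression.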

The stages of the MPAI projects currently active (as of MPAI-21) are represented graphically in Figure 1.

Legend: TS: Technical Specification, RS: Reference Software, CT: Conformance Testing, PA: Performance Assessment; V2: Version 2.

Figure 1 – Snapshot of the MPAI work plan

2         MPAI-AIF

Artificial Intelligence Framework (MPAI-AIF) enables creation and automation of mixed Artificial Intelligence – Machine Learning – Data Processing workflows for the application areas currently considered by the MPAI work plan.

2.1        Version 1

Figure 2 shows the MPAI-AIF V1 Reference Model.

Figure 2 – Reference model of the MPAI AI Framework (MPAI-AIF) V1

The MPAI-AIF Technical Specification V1 and Reference Software V1 have been approved and are available here.

2.2        Version 2

MPAI has developed Use Cases and Requirements for MPAI-AIF V2 adding security support to MPAI-AIF V1.

Figure 3 – Reference model of the MPAI AI Framework (MPAI-AIF) V2

The collection of public documents is available here.

3         MPAI-AIH

Artificial Intelligence for Health data (MPAI-AIH) is an MPAI project addressing the secure collection, AI-based processing and secure access to Health data (Figure 4).

Figure 4 – MPAI-AIH Reference Model

4          MPAI-ARA

Avatar Representation and Animation (MPAI-ARA) is a project developing requirements for standards supporting the needs of MPAI-MMC, MPAI-CAV and MPAI-MCS. Figure 5 gives the Reference Model of Personal Status Display (ARA-PSD), which can animate a speaking avatar based on text and Personal Status.

The collection of public documents is available here.

Figure 5 – Personal Status Display (ARA-PSD)

5          MPAI-CAE

Context-based Audio Enhancement (MPAI-CAE) uses AI and context information to act on input audio content. It improves the user experience in audio-related applications such as entertainment, communication, teleconferencing, gaming, post-production and restoration, in contexts such as the home, the car, on-the-go and the studio.

Figure 6 is the reference model of the Emotion-Enhanced Speech Use Case.

Figure 6 – An MPAI-CAE Use Case: Emotion-Enhanced Speech

The MPAI-CAE Technical Specification has been approved and is available here. MPAI has developed Use Cases and Requirements for Version 2 as part of the MPAI-MMC V2 standard.

The collection of public documents is available here.

6          MPAI-CUI

Compression and understanding of industrial data (MPAI-CUI) aims to enable AI-based filtering and extraction of key information to predict company performance by applying Artificial Intelligence to governance, financial and risk data. This is depicted in Figure 7.

Figure 7 – The MPAI-CUI Use Case

The collection of publicly available MPAI-CUI documents is here. The set of specifications composing the MPAI-CUI standard is available here.

7          MPAI-CAV

Connected Autonomous Vehicles (MPAI-CAV) addresses the Connected Autonomous Vehicle (CAV) domain and the 5 main subsystems of a CAV:

  1. Human-CAV interaction (HCI), i.e., the CAV subsystem that responds to humans’ commands and queries, senses human activities in the CAV passenger compartment and activates other subsystems as required by humans or as deemed necessary by the identified conditions.
  2. CAV-Environment interaction, i.e., the subsystem that acquires information from the physical environment via a variety of sensors.
  3. Autonomous Motion Subsystem (AMS), i.e., the CAV subsystem that uses different sources of information to instruct the CAV to reach the intended destination.
  4. CAV-Device Interaction (CDI), i.e., the subsystem that communicates with sources of external information, including other CAVs, Roadside Units (RSU), other vehicles etc.
  5. Motion Actuation Subsystem (MAS), i.e., the subsystem that operates and actuates the motion instructions in the physical world.

The interaction of the 5 subsystems is depicted in Figure 8.

Figure 8 – The CAV subsystems

Requirements for the Human-CAV Interaction subsystem (Figure 9) have been developed and integrated in the MPAI-MMC Call for Technologies.

Figure 9 – Reference Model of the Human-CAV Interaction Subsystem

The collection of public documents is available here.

8          MPAI-EEV

There is consensus in the video coding research community that the so-called End-to-End (E2E) video coding schemes can yield significantly higher performance than those targeted, e.g., by MPAI-EVC. AI-based End-to-End Video Coding (MPAI-EEV) intends to address this promising area.

MPAI is extending the OpenDVC model (Figure 10).

Figure 10 – MPAI-EEV Reference Model

The collection of public documents is available here.

9          MPAI-EVC

AI-Enhanced Video Coding (MPAI-EVC) is a video compression standard that substantially enhances the performance of a traditional video codec by improving or replacing traditional tools with AI-based tools. Two approaches – Horizontal Hybrid and Vertical Hybrid – are envisaged. The Vertical Hybrid approach envisages an AVC/HEVC/EVC/VVC base layer plus an enhanced machine learning-based layer. This case can be represented by Figure 11.

Figure 11 – A reference diagram for the Vertical Hybrid approach

The Horizontal Hybrid approach combines AI-based algorithms with a traditional image/video codec by replacing one block of the traditional schema with a machine learning-based one. This case can be described by Figure 12, where green circles represent tools that can be replaced or enhanced with their AI-based equivalent.

Figure 12 – A reference diagram for the Horizontal Hybrid approach

MPAI is engaged in the MPAI-EVC Evidence Project seeking to find evidence that AI-based technologies provide sufficient improvement to the Horizontal Hybrid approach. A second project on the Vertical Hybrid approach is being considered.

The collection of public documents is available here.

10          MPAI-GSA

Integrative Genomic/Sensor Analysis (MPAI-GSA) uses AI to understand and compress the result of high-throughput experiments combining genomic/proteomic and other data, e.g., from video, motion, location, weather, medical sensors.

Figure 13 addresses the Smart Farming Use Case.

Figure 13 – An MPAI-GSA Use Case: Smart Farming

The collection of public documents is available here.

11          MPAI-MCS

Mixed-Reality Collaborative Spaces (MPAI-MCS) is a project riding on the opportunities offered by emerging technologies that enable developers to deliver mixed-reality collaborative space (MCS) applications where biomedical, scientific, and industrial sensor streams and recordings are to be viewed. MCS systems use AI to achieve immersive presence, rendering of spatial maps (e.g., Lidar scans, inside-out tracking), multiuser synchronisation etc.

MPAI has developed the requirements for the Avatar-Based Videoconference (MCS-ABV) Use Case (Figure 14).

Figure 14 – End-to-End block diagram of Avatar-Based Videoconference

The Reference Model of the transmitting client of MCS-ABV is given by Figure 15.

Figure 15 – Reference Model of MCS-ABV Transmitting Client

Figure 16 gives the Reference Model of the Virtual Secretary of the MPAI-MCS Avatar-Based Videoconference (MCS-ABV).

Figure 16 – Reference Model of Avatar-Based Videoconference

The collection of public documents is available here.

12          MPAI-MMC

Multi-modal conversation (MPAI-MMC) aims to enable human-machine conversation that emulates human-human conversation in completeness and intensity by using AI.

12.1        Version 1

Five Use Cases have been developed for MPAI-MMC V1: Conversation with emotion, Multimodal Question Answering (QA) and 3 Automatic Speech Translation Use Cases.

Figure 17 depicts the Reference Model of the Conversation with Emotion Use Case.

Figure 17 – An MPAI-MMC V1 Use Case: Conversation with Emotion

The MPAI-MMC Technical Specification V1.2 has been approved and is available here.

12.2       Version 2

Five new Use Cases have been identified for Multi-modal conversation V2 (MPAI-MMC V2), including Conversation About a Scene (CAS) and Avatar-Based Videoconference (ABV).

Figure 18 is the reference model of the Conversation About a Scene (CAS) Use Case.

Figure 18 – An MPAI-MMC V2 Use Case: Conversation About a Scene

The collection of public documents is available here.

13          MPAI-NNW

Neural Network Watermarking (MPAI-NNW) will be a standard whose purpose is to enable watermarking technology providers to qualify their products by providing the means to measure, for a given size of the watermarking payload, the ability of:

  1. The watermark inserter to inject a payload without deteriorating the NN performance.
  2. The watermark detector to recognise the presence of the inserted watermark when applied to:
    1. A watermarked network that has been modified (e.g., by transfer learning or pruning).
    2. An inference of the modified model.
  3. The watermark decoder to successfully retrieve the payload when applied to:
    1. A watermarked network that has been modified (e.g., by transfer learning or pruning).
    2. An inference of the modified model.
  4. The watermark inserter to inject a payload at a measured computational cost on a given processing environment.
  5. The watermark detector/decoder to detect/decode a payload from a watermarked model or from any of its inferences, at a low computational cost, e.g., execution time on a given processing environment.
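
As an illustration of item 3 above, the decoder's ability to retrieve the payload from modified networks could be summarised by a per-modification bit error rate. This is only a sketch under stated assumptions: the source does not say which metric MPAI-NNW uses, and `bit_error_rate`, `evaluate_decoder` and the toy "models" below are hypothetical names and data chosen for the example.

```python
def bit_error_rate(inserted, decoded):
    """Fraction of payload bits that differ between inserted and decoded payloads."""
    if len(inserted) != len(decoded):
        raise ValueError("payloads must have the same length")
    if not inserted:
        raise ValueError("payload must not be empty")
    errors = sum(a != b for a, b in zip(inserted, decoded))
    return errors / len(inserted)


def evaluate_decoder(payload, decode, modified_models):
    """Return the bit error rate of `decode` for each named model modification.

    `decode` is a hypothetical callable standing in for a watermark decoder;
    `modified_models` maps a modification name to the modified model.
    """
    return {
        name: bit_error_rate(payload, decode(model))
        for name, model in modified_models.items()
    }


# Toy usage: each "model" is simply the payload that a decoder would recover
# from it, so the decoder stand-in is the identity function.
payload = [1, 0, 1, 1]
models = {
    "pruning": [1, 0, 1, 1],            # payload survives intact
    "transfer_learning": [1, 1, 1, 0],  # two of four bits corrupted
}
print(evaluate_decoder(payload, lambda model: model, models))
# prints {'pruning': 0.0, 'transfer_learning': 0.5}
```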

MPAI has developed the requirements for MPAI-NNW.

The collection of public documents is available here.

14          MPAI-OSD

Visual object and scene description (MPAI-OSD) is a collection of Use Cases sharing the goal of describing visual objects and locating them in space. Scene description includes the usual description of objects and their attributes in a scene and the semantic description of the objects.

Unlike proprietary solutions that address the needs of the use cases but lack interoperability or force all users to adopt a single technology or application, a standard representation of the objects in a scene allows for better satisfaction of the requirements.

MPAI has developed requirements related to MPAI-OSD for

  1. MMC-PSE
  2. MMC-PSD
  3. MMC-CAS
  4. CAV-HCI
  5. MMC-ABV

The collection of public documents is available here.

15          MPAI-SPG

Server-based Predictive Multiplayer Gaming (MPAI-SPG) aims to minimise the audio-visual and gameplay discontinuities caused by high latency or packet losses during an online real-time game. In case information from a client is missing, the data collected from the clients involved in a particular game are fed to an AI-based system that predicts the moves of the client whose data are missing. The same technologies provide a response to the need to detect who amongst the players is cheating.
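
The idea of filling in a missing client's data can be sketched with a deliberately simple stand-in. The example below uses linear extrapolation from the client's last two known positions, which is not the AI-based predictor the project envisages and, unlike it, ignores the data of the other clients. The function name and the representation of game state as 2D positions are assumptions made for illustration.

```python
def predict_missing_position(history, steps_ahead=1):
    """Extrapolate a client's next position from its last two known positions.

    `history` is a list of (x, y) positions at consecutive game ticks.
    Linear extrapolation is a placeholder for a learned predictor.
    """
    if len(history) < 2:
        raise ValueError("need at least two known positions")
    (x0, y0), (x1, y1) = history[-2], history[-1]
    dx, dy = x1 - x0, y1 - y0  # displacement per tick
    return (x1 + dx * steps_ahead, y1 + dy * steps_ahead)


# Usage: the client's packets for the next tick are lost
known = [(0, 0), (1, 2)]
print(predict_missing_position(known))  # prints (2, 4)
```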

Figure 19 depicts the MPAI-SPG reference model including the cloud gaming model.

 

Figure 19 – The MPAI-SPG Use Case

The collection of public documents is available here.

16          MPAI-XRV

XR Venues (MPAI-XRV) is an MPAI project addressing a multiplicity of use cases enabled by AR/VR/MR (XR) and enhanced by Artificial Intelligence technologies. The word venue is used as a synonym for Environment, both real and virtual.

The goals of the project are:

  1. To identify and characterise AI Modules (AIMs) re-usable across use cases.
  2. To develop requirements for the AI Workflows (AIWs) implementing the identified use cases and for AIM functions and input/output data.
  3. To draft and publish Calls for Technologies satisfying the identified functional requirements and the commercial requirements.
  4. To specify the enabling technologies in a series of MPAI standards.

MPAI is developing requirements for a set of Use Cases related to MPAI-XRV based on a general diagram covering the interdependence of the Real World and the Virtual World depicted in Figure 20.

Figure 20 – RW-VW-RW paths in MPAI-XRV

The collection of public documents is available here.