1 Introduction
2 Approved standards
    Compression and Understanding of Industrial Data MPAI-CUI
3 Areas at stage 6 (SD)
    Multi-Modal Conversation MPAI-MMC
    Artificial Intelligence Framework MPAI-AIF
    Context-based Audio Enhancement MPAI-CAE
4 Areas at stage 2 (FR)
    Server-based Predictive Multiplayer Gaming MPAI-SPG
    AI-Enhanced Video Coding MPAI-EVC
    Connected Autonomous Vehicles MPAI-CAV
    Mixed-Reality Collaborative Spaces MPAI-MCS
    Integrative Genomic/Sensor Analysis MPAI-GSA
    Neural Network Watermarking MPAI-NNW
    AI-based End-to-End Video Coding MPAI-EEV
    Avatar Representation and Animation MPAI-ARA
5 Areas at stage 1 (UC)
    Visual Object and Scene Description MPAI-OSD
6 Areas at stage 0 (IC)

1        Introduction

MPAI’s standards development is based on projects that evolve through a workflow of eight stages, numbered 0 to 7.

| # | Acr | Name | Description |
|---|-----|------|-------------|
| 0 | IC | Interest Collection | Collection and harmonisation of use cases proposed. |
| 1 | UC | Use cases | Proposals of use cases, their description and merger of compatible use cases. |
| 2 | FR | Functional Reqs | Identification of the functional requirements that the standard including the Use Case should satisfy. |
| 3 | CR | Commercial Reqs | Development and approval of the framework licence of the standard. |
| 4 | CfT | Call for Technologies | Preparation and publication of a document calling for technologies supporting the functional and commercial requirements. |
| 5 | SD | Standard Development | Development of the standard in a specific Development Committee (DC). |
| 6 | CC | Community Comments | When the standard has achieved sufficient maturity it is published with request for comments. |
| 7 | MS | MPAI standard | The standard is approved by the General Assembly. |

A project progresses from one stage to the next by resolution of the General Assembly.
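The stage progression can be pictured as a small state model. The following is only an illustrative sketch: the `Stage` enumeration mirrors the stage table in this section, while the `advance` function and its `resolution_passed` parameter are hypothetical names introduced here, not part of any MPAI document.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Workflow stages, numbered as in the stage table."""
    IC = 0   # Interest Collection
    UC = 1   # Use Cases
    FR = 2   # Functional Requirements
    CR = 3   # Commercial Requirements
    CFT = 4  # Call for Technologies
    SD = 5   # Standard Development
    CC = 6   # Community Comments
    MS = 7   # MPAI Standard


def advance(stage: Stage, resolution_passed: bool) -> Stage:
    """Move to the next stage only if the General Assembly resolved to do so.

    A project already at MS (the final stage) stays there.
    """
    if not resolution_passed or stage is Stage.MS:
        return stage
    return Stage(stage + 1)


print(advance(Stage.FR, resolution_passed=True).name)
```

Using an integer-valued enumeration keeps the stage order explicit, so "one stage to the next" is simply the next integer.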

The stages of the MPAI projects currently active (as of MPAI-12) are shown in Figure 1.

Figure 1 Snapshot of the MPAI work plan

2        Approved standards

2.1       MPAI-CUI

Compression and Understanding of Industrial Data (MPAI-CUI) aims to enable AI-based filtering and extraction of key information to predict company performance by applying Artificial Intelligence to governance, financial and risk data. This is depicted in Figure 2.

Figure 2 The MPAI-CUI Use Case

The collection of public documents is available here. The set of specifications is available here.

3        Areas at stage 6 (SD)

3.1       MPAI-MMC

Multi-Modal Conversation (MPAI-MMC) aims to enable human-machine conversation that emulates human-human conversation in completeness and intensity by using AI.

So far, 5 Use Cases have been identified for MPAI-MMC: Conversation with emotion, Multimodal Question Answering (QA) and 3 Automatic Speech Translation Use Cases.

Figure 3 addresses the Conversation with Emotion Use Case.

Figure 3 An MPAI-MMC Use Case: Conversation with emotion

The collection of public documents is available here. The MPAI-MMC Technical Specification has been approved and is available here.

3.2      MPAI-AIF

Artificial Intelligence Framework (MPAI-AIF) enables creation and automation of mixed Artificial Intelligence – Machine Learning – Data Processing workflows for the application areas currently considered by the MPAI work plan. MPAI-AIF will be extended to support new application areas if the need arises. Figure 4 shows the general MPAI-AIF Reference Model.

Figure 4 – Reference model of the MPAI AI Framework

The collection of public documents is available here. The MPAI-AIF Technical Specification has been approved and is available here.

3.3       MPAI-CAE

Context-based Audio Enhancement (MPAI-CAE) uses AI and context information to act on input audio content. It improves the user experience for audio-related applications such as entertainment, communication, teleconferencing, gaming, post-production and restoration, in contexts such as the home, the car, on-the-go and the studio.

Figure 5 An MPAI-CAE Use Case: Emotion-Enhanced Speech

The collection of public documents is available here. The MPAI-CAE Technical Specification has been approved and is available here.

MPAI has initiated work on Audio on the Go (AOG), a new use case for MPAI-CAE V2.

4        Areas at stage 2 (FR)

4.1       MPAI-SPG

Server-based Predictive Multiplayer Gaming (MPAI-SPG) aims to minimise the audio-visual and gameplay discontinuities caused by high latency or packet losses during an online real-time game. In case information from a client is missing, the data collected from the clients involved in a particular game are fed to an AI-based system that predicts the moves of the client whose data are missing. The same technologies provide a response to the need to detect who amongst the players is cheating.

Figure 6 depicts the MPAI-SPG reference model including the cloud gaming model.

Figure 6 The MPAI-SPG Use Case
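The idea of filling in a missing client update with a server-side prediction can be sketched as follows. This is a toy illustration under stated assumptions: plain linear extrapolation from the client's own last two positions stands in for the AI-based predictor, which according to the description above would draw on data from all clients in the game. The names `predict_missing` and `server_tick` are hypothetical.

```python
from typing import Optional

Position = tuple[float, float]


def predict_missing(history: list[Position]) -> Position:
    """Stand-in predictor: extrapolate linearly from the last two positions."""
    if not history:
        raise ValueError("cannot predict without any history")
    if len(history) == 1:
        return history[-1]
    (x0, y0), (x1, y1) = history[-2], history[-1]
    return (2 * x1 - x0, 2 * y1 - y0)


def server_tick(histories: dict[str, list[Position]],
                updates: dict[str, Optional[Position]]) -> dict[str, Position]:
    """Build the game state for one tick, predicting any missing update."""
    state = {}
    for client, update in updates.items():
        if update is None:  # packet lost or arrived too late
            update = predict_missing(histories[client])
        histories[client].append(update)
        state[client] = update
    return state


histories = {"a": [(0.0, 0.0), (1.0, 0.5)], "b": [(5.0, 5.0), (5.0, 4.0)]}
state = server_tick(histories, {"a": (2.0, 1.0), "b": None})
print(state["b"])  # predicted position for the client whose data are missing
```

A large gap between a client's predicted and reported moves is also the kind of signal a cheating detector could inspect, although this sketch does not attempt that.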

The collection of public documents is available here.

4.2       MPAI-EVC

AI-Enhanced Video Coding (MPAI-EVC) is a video compression standard that substantially enhances the performance of a traditional video codec by improving or replacing traditional tools with AI-based tools. Two approaches, Horizontal Hybrid and Vertical Hybrid, are envisaged. The Vertical Hybrid approach envisages an AVC/HEVC/EVC/VVC base layer plus an enhanced machine learning-based layer. This case can be represented by Figure 7.

Figure 7 A reference diagram for the Vertical Hybrid approach

The Horizontal Hybrid approach combines AI-based algorithms with a traditional image/video codec, replacing one block of the traditional schema with a machine learning-based one. This case can be described by Figure 8, where green circles represent tools that can be replaced or enhanced with their AI-based equivalent.

Figure 8 A reference diagram for the Horizontal Hybrid approach
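The Horizontal Hybrid idea of swapping a single tool in an otherwise unchanged coding chain can be sketched as below. Everything here is a simplified assumption made for illustration: the tool names `quantisation` and `loop_filter`, the one-dimensional sample lists, and the `learned_filter` whose weights are made up rather than trained. It is not the MPAI-EVC design.

```python
from typing import Callable

Samples = list[float]
Tool = Callable[[Samples], Samples]


def quantise(samples: Samples) -> Samples:
    """Stand-in for a traditional tool: quantise with a step of 2."""
    return [float(round(s / 2) * 2) for s in samples]


def traditional_filter(samples: Samples) -> Samples:
    """Stand-in for a traditional filter: 3-tap moving average."""
    padded = [samples[0]] + samples + [samples[-1]]
    return [(padded[i - 1] + padded[i] + padded[i + 1]) / 3
            for i in range(1, len(padded) - 1)]


def learned_filter(samples: Samples) -> Samples:
    """Stand-in for an AI-based filter; weights are invented, not trained."""
    weights = (0.25, 0.5, 0.25)
    padded = [samples[0]] + samples + [samples[-1]]
    return [sum(w * padded[i + k - 1] for k, w in enumerate(weights))
            for i in range(1, len(padded) - 1)]


def replace_tool(tools: dict[str, Tool], name: str, new_tool: Tool) -> dict[str, Tool]:
    """Return a copy of the chain with exactly one tool replaced."""
    if name not in tools:
        raise KeyError(name)
    return {**tools, name: new_tool}


def run_pipeline(samples: Samples, tools: dict[str, Tool]) -> Samples:
    """Apply the tools in their stored order."""
    for tool in tools.values():
        samples = tool(samples)
    return samples


traditional = {"quantisation": quantise, "loop_filter": traditional_filter}
hybrid = replace_tool(traditional, "loop_filter", learned_filter)
print(run_pipeline([0.0, 4.0, 0.0], hybrid))
```

Because each tool has the same interface, the traditional and hybrid chains can be run on the same input and compared, which is the kind of comparison an evidence project would need.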

MPAI is engaged in the MPAI-EVC Evidence Project seeking to find evidence that AI-based technologies provide sufficient improvement to the Horizontal Hybrid approach. A second project on the Vertical Hybrid approach is being considered.

The collection of public documents is available here.

4.3       MPAI-CAV

Connected Autonomous Vehicles (MPAI-CAV) is a project seeking to standardise the components that enable the implementation of a Connected Autonomous Vehicle (CAV), i.e., a mechanical system capable of executing the command to move its body autonomously, save for the exceptional intervention of a human, based on the analysis of the data produced by a range of sensors exploring the environment and of the information transmitted by other sources in range, e.g., other CAVs and roadside units (RSU).

Figure 9 – The MPAI-CAV subsystems

The collection of public documents is available here.

4.4      MPAI-MCS

New technologies are emerging that equip developers to deliver mixed-reality collaborative space (MCS) scenarios where biomedical, scientific, and industrial sensor streams and recordings are to be viewed. Artificial Intelligence can be used throughout MCS systems, e.g., for immersive presence, rendering of spatial maps (such as Lidar scans and inside-out tracking), and multiuser synchronisation.

Figure 11 depicts one Use Case being considered where most functionalities are executed in the client.

Figure 11 – The Virtual E-Learning transmission side

The collection of public documents is available here.

4.5       MPAI-GSA

Integrative Genomic/Sensor Analysis (MPAI-GSA) uses AI to understand and compress the results of high-throughput experiments combining genomic/proteomic and other data, e.g., from video, motion, location, weather and medical sensors.

Figure 12 addresses the Smart Farming Use Case.

Figure 12 An MPAI-GSA Use Case: Smart Farming

The collection of public documents is available here.

4.6      MPAI-NNW

Neural Network Watermarking (MPAI-NNW), chaired by Mihai Mitrea of IMT, is a project developing requirements for a standard that enables measuring, for a given size of the watermarking payload:

  1. The impact, e.g., the degradation of the user experience caused by the watermark applied to a neural network.
  2. The resistance to attacks, e.g., transfer learning, pruning.
  3. The cost of watermark injection, e.g., time, processing cost.
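The three measurements listed above can be illustrated with a toy experiment. The sketch below rests entirely on assumptions made here for illustration: a naive non-blind watermark that nudges the first weights up or down by a fixed `strength`, a one-layer linear "network", magnitude pruning as the attack, and wall-clock time as the cost. None of this is the MPAI-NNW method.

```python
import time

Weights = list[float]


def embed(weights: Weights, bits: list[int], strength: float = 0.1) -> Weights:
    """Toy watermark: raise weight i for bit 1, lower it for bit 0."""
    marked = list(weights)
    for i, bit in enumerate(bits):
        marked[i] += strength if bit else -strength
    return marked


def extract(suspect: Weights, original: Weights, n_bits: int) -> list[int]:
    """Non-blind extraction: compare suspect weights with the originals."""
    return [1 if suspect[i] > original[i] else 0 for i in range(n_bits)]


def prune(weights: Weights, threshold: float) -> Weights:
    """Attack: zero every weight whose magnitude is below the threshold."""
    return [0.0 if abs(w) < threshold else w for w in weights]


def linear_output(weights: Weights, inputs: list[float]) -> float:
    """Toy 'network': a single dot product."""
    return sum(w * x for w, x in zip(weights, inputs))


def bit_error_rate(sent: list[int], received: list[int]) -> float:
    return sum(s != r for s, r in zip(sent, received)) / len(sent)


original = [0.5, -0.3, 0.8, 0.05, -0.6, 0.2]
payload = [1, 0, 1, 1]
inputs = [1.0] * len(original)

# 3. Cost of injection, measured here as elapsed time.
start = time.perf_counter()
marked = embed(original, payload)
cost_seconds = time.perf_counter() - start

# 1. Impact: change in the toy network's output caused by the watermark.
impact = abs(linear_output(marked, inputs) - linear_output(original, inputs))

# 2. Resistance: payload bit error rate after a pruning attack.
attacked = prune(marked, threshold=0.2)
ber_after_pruning = bit_error_rate(payload, extract(attacked, original, len(payload)))

print(impact, ber_after_pruning, cost_seconds)
```

Holding the payload size fixed, as the text requires, makes the three figures comparable across different watermarking schemes.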

Draft Use Cases and Functional Requirements

5        Areas at stage 1 (UC)

5.1       MPAI-OSD

Visual Object and Scene Description (MPAI-OSD) is a collection of Use Cases sharing the goal of describing visual objects and locating them in space. Scene description includes the usual description of objects and their attributes in a scene and the semantic description of the objects.

Unlike proprietary solutions that address the needs of the use cases but lack interoperability or force all users to adopt a single technology or application, a standard representation of the objects in a scene allows for better satisfaction of the requirements.

The collection of public documents is available here.

5.2       MPAI-EEV

There is consensus in the video coding research community that the so-called End-to-End (E2E) video coding schemes can yield significantly higher performance than those targeted, e.g., by MPAI-EVC. AI-based End-to-End Video Coding (MPAI-EEV) intends to address this promising area.

The collection of public documents is available here.

5.3      MPAI-ARA

Avatar Representation and Animation (MPAI-ARA) is a project developing requirements for standards supporting the needs of MPAI-MMC, MPAI-CAV and MPAI-MCS.

6        Areas at stage 0 (IC)