On the 31st of March 2023 – thirty months after its establishment – MPAI reaffirms its mission, revisits its organisation, retraces its achievements, and reviews its next goals in 16 presentations starting at 06:00 UTC and lasting 20 minutes each, for a total time of 5h20min. The presentations are repeated starting at 17:00 UTC, with some talks delivered in a different order. We guarantee that the talks are given at the times indicated in the table below, so that you can attend the presentations that interest you.
Register here and join the talks of your interest. The date is 31st of March 2023. All times are UTC.
A Standards body for AI
|Speaker||Title||Session 1 (UTC)||Session 2 (UTC)|
|L. Chiariglione||MPAI Mission, organisation, and activities||06:00||17:00|
|M. Bosi||Enhancing audio with AI||06:20||19:00|
|S. Dukes||Connecting with standards organisations||06:40||22:00|
|E. Lantz||AI-powered XR Venues||07:00||21:40|
|D. Schultens||Connected Autonomous Vehicles||07:20||21:20|
|S. Casale-Brunet||MPAI Metaverse Model||07:40||18:20|
|J. Yoon/A. Bottino||Avatar interoperability||08:00||18:40|
|M. Choi/M. Seligman||Humans and computers converse||08:20||19:20|
|M. Mitrea||Watermarking Neural Networks||08:40||20:00|
|G. Perboli||Predicting company performance||09:00||19:40|
|A. Basso||Multi-sourced AI apps||09:20||18:00|
|A. De Almeida/M. Breternitz||Federated AI for Health||09:40||20:20|
|P. Ribeca||MPAI Ecosystem Governance||10:00||17:40|
|C. Jia||End-to-End Video Coding||10:20||17:20|
|R. Iacoviello||AI-Enhanced Video Coding||10:40||20:40|
|M. Mazzaglia||Better and fairer online games with AI||11:00||21:00|
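Because all times in the table above are UTC, a short script can help translate a slot into local time and check the session length. The following is only a minimal sketch using Python's standard library; the function names `to_local` and `session_end` and the fixed UTC offsets in the example are illustrative assumptions, not part of the announcement.

```python
from datetime import datetime, timedelta, timezone

EVENT_DATE = (2023, 3, 31)  # event date stated in the announcement


def to_local(hhmm: str, offset_hours: float) -> str:
    """Convert an 'HH:MM' UTC slot on the event date to a fixed-offset local time."""
    hour, minute = map(int, hhmm.split(":"))
    utc_time = datetime(*EVENT_DATE, hour, minute, tzinfo=timezone.utc)
    # Assumption: the caller supplies their own UTC offset in hours.
    local_zone = timezone(timedelta(hours=offset_hours))
    return utc_time.astimezone(local_zone).strftime("%Y-%m-%d %H:%M")


def session_end(start_hhmm: str, talks: int = 16, minutes_each: int = 20) -> str:
    """Return the UTC end time of a session of back-to-back talks."""
    hour, minute = map(int, start_hhmm.split(":"))
    start = datetime(*EVENT_DATE, hour, minute, tzinfo=timezone.utc)
    return (start + timedelta(minutes=talks * minutes_each)).strftime("%H:%M")


if __name__ == "__main__":
    print(to_local("06:00", 2))   # first talk of session 1 for a UTC+2 attendee
    print(session_end("06:00"))   # 16 talks x 20 minutes after 06:00 UTC
```

A fixed offset keeps the sketch dependency-free; an attendee in a region with daylight saving time would need to pass the offset valid on the event date.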
MPAI Mission, organisation, and activities
To our knowledge, MPAI is the only standards developing organisation with the sole mission of developing data coding standards that are enhanced or enabled by Artificial Intelligence. MPAI has a rigorous process for accepting proposals for, developing, and approving technical documents. These include Technical Specifications, usually complemented by Reference Software, Conformance Testing and Performance Assessment, and in some cases Technical Reports. Its technical work is based on the principle of separating complex AI solutions into components (AI Modules) whose functions and interfaces are standardised. AI applications are executed in a standardised AI Framework that enables the execution of independently sourced AI Modules in an environment with standard APIs. MPAI offers a novel approach to IP in standards by means of Framework Licences, which contain no values in $, %, dates etc. but give general indications of the time to develop the licence and of the IP cost. MPAI has already developed 12 technical documents. Three standards have already been adopted by IEEE without modifications and two more are in the pipeline. MPAI has developed version 2 of one standard and is doing the same for two more. It is developing a brand new standard and a brand new Technical Report. It is also preparing 6 new projects.
Enhancing audio with AI
Digital technologies have done a lot to improve the audio experience. Compression and the internet have changed the way humans enjoy audio. Artificial Intelligence can do more, and MPAI has already taken a few steps in this direction. There are already 2 versions of the MPAI-CAE (Context-based Audio Enhancement) standard, addressing several audio-enhancement use cases.
Emotion-Enhanced Speech enables a user to indicate a model utterance and obtain an emotionally charged version of a given utterance. An important variation enables a user to indicate one emotion out of a standard list and convert an emotionless utterance into a version that conveys that emotion.
Audio Recording Preservation facilitates the preservation of the features of open-reel magnetic tapes, which can carry annotations (by the composer or by the technicians), include multiple splices and/or display several types of irregularities (e.g., corruptions of the carrier, tape of different colour or chemical composition).
Speech Restoration System enables the replacement of damaged speech recordings using a specially created speech model.
Enhanced Audioconference Experience and Human Connected Autonomous Vehicle (CAV) Interaction provide a standard way to represent the audio sources of multiple speakers inside a room and outside. Audio sources are represented by their audio streams and their directions.
Connecting with standards organisations
The interaction of MPAI with industry and standards organisations is the subject of one of the talks planned for the 31st of March, the day marking the 30th month of operation of MPAI.
Interacting with industry and standards organisations is important because of MPAI’s wide scope of work. Indeed, MPAI works on data coding for audio, avatar interoperability, voice communication, financial data and more.
Two examples show the importance of industry and standards relations. The first is IEEE, with which MPAI maintains an ongoing relationship. Three MPAI standards have been adopted without modification by the IEEE Standards Association as IEEE Standards and two more are in the pipeline. The second concerns the metaverse. MPAI is contributing to the ITU-T Focus Group on metaverse, is a member of the Metaverse Standards Forum, and is interacting with the IEEE, W3C and others.
AI-powered XR Venues
One of the talks planned for the 31st of March, the day marking the 30th month of operation of MPAI, will cover the use of AI to enable more powerful XR Venues. But what is an XR Venue?
MPAI-XRV is a project addressing contexts enabled by Extended Reality (XR) – any combination of Augmented Reality (AR), Virtual Reality (VR) and Mixed Reality (MR) technologies – and enhanced by Artificial Intelligence (AI) technologies. The word “Venue” is used as a synonym for real and virtual environments.
MPAI is developing a Call for Technologies which seeks technologies that support some, and preferably all, of the Functional Requirements being developed for the “Live Theatrical Stage Performance” use case. The availability of proposed technologies will enable MPAI to develop the XR Venues (MPAI-XRV) Standard for Live Theatrical Stage Performance.
The standard will define interfaces and components to facilitate live multisensory immersive stage performances, which ordinarily require extensive on-site show control staff to operate. Use of the XRV will allow more direct, precise yet spontaneous show implementation and control to achieve the show director’s vision, by freeing staff from repetitive and technical tasks and letting them focus on their artistic and creative skills.
This is the first Use Case. In the pipeline there are more use cases: eSports Tournament, Experiential retail/shopping, and Collaborative immersive laboratory.
Connected Autonomous Vehicles
The media are highlighting the tectonic changes affecting the automotive industry in its transition from internal combustion engines to electric vehicles to vehicles that have more and more autonomy. How do we achieve autonomy? The answer is obvious: by means of Artificial Intelligence (AI). But there are many ways to use AI to achieve vehicles’ autonomous motion. MPAI’s approach is to include connectivity in vehicle autonomy – hence the name Connected Autonomous Vehicles (CAV) – and to subdivide a CAV into four subsystems. Each subsystem is subdivided into AI Modules whose functions and input and output data are specified.
The MPAI Metaverse Model project
While the fortunes of the metaverse term go up and down, the interest of MPAI in the work is unshaken. In January, MPAI published a substantial Technical Report that included definitions, the assumptions guiding the MPAI Metaverse Model (MPAI-MMM) project, a list of sources that can generate functionalities, an organised list of commented functionalities, and an analysis of some of the main technology areas underpinning the development of the metaverse. In a matter of days, the MPAI General Assembly is expected to publish a new document called Functionality Profiles that contains an operational functional model of a metaverse, three basic sets of entities (actions, items and data types), a collection of use cases expressed by means of actions, items and data types, and a first set of functionality profiles. This document is the second deliverable of the MPAI-MMM project, and more are planned: Architecture, Data Formats, Table of Contents of Metaverse Technologies, and an initial mapping of Technologies into the Table of Contents.
Avatar interoperability
Avatars have a serious presence, with many services offering users the possibility to create and decorate digital representations of humans. MPAI has been working on a use case called Avatar-Based Videoconference, which enables videoconference participants at their locations to send to a server avatars that faithfully represent their visual features and motions. The server selects a virtual meeting room, locates the avatars at selected positions and sends the resulting virtual room populated with avatars to each participant.
The project intends to define the various formats that allow this use case to be implemented interoperably.
Humans and computers converse
Humans’ interaction with machines has taken many steps forward since the early signal processing days, and AI can be credited with many of the innovations. Multimodal Conversation is one of the first projects started with the foundation of MPAI. Version 1 of the standard was first approved 18 months ago with 3 main use cases: enhancing the quality of the conversation with emotion detected in the human and added to the machine’s response; asking questions to a machine with speech and images; and asking a machine to translate speech while preserving the features of the human speech in the translation. MPAI is now close to publishing V2, which will support an extended form of representation of the internal state of a human (and of a machine…) called Personal Status.
Watermarking Neural Networks
Neural networks are increasingly used, and the trend is bound to continue. A neural network may be the result of significant investments and, as for media, watermarking allows rights holders to retain a level of control over their assets.
MPAI has recently approved a Neural Network Watermarking (NNW) standard that specifies methodologies to evaluate:
- The impact of the watermark on the performance of a watermarked neural network and its inference.
- How well a neural network watermarking detector/decoder can detect/decode a payload when the watermarked neural network has been modified.
- The computational cost of injecting, detecting, or decoding a payload in the watermarked neural network.
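The three evaluation axes above can be illustrated with simple measurements. The sketch below is not taken from the NNW standard: the metrics (relative accuracy drop, bit error rate of the decoded payload, wall-clock time) and all function names are assumptions chosen only to make the idea concrete.

```python
import time
from typing import Callable, Sequence, Tuple


def relative_performance_drop(baseline: float, watermarked: float) -> float:
    """Fraction of the baseline score lost after watermarking (assumed metric)."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return (baseline - watermarked) / baseline


def bit_error_rate(embedded: Sequence[int], decoded: Sequence[int]) -> float:
    """Share of payload bits the decoder got wrong after the network was modified."""
    if len(embedded) != len(decoded) or not embedded:
        raise ValueError("payloads must be non-empty and of equal length")
    errors = sum(a != b for a, b in zip(embedded, decoded))
    return errors / len(embedded)


def timed(operation: Callable[[], object]) -> Tuple[object, float]:
    """Run an inject/detect/decode step and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = operation()
    return result, time.perf_counter() - start


if __name__ == "__main__":
    print(relative_performance_drop(0.90, 0.81))
    print(bit_error_rate([1, 0, 1, 1], [1, 1, 1, 0]))
    _, seconds = timed(lambda: sum(range(100_000)))  # stand-in for a decode step
    print(f"decode stand-in took {seconds:.6f} s")
```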
Predicting company performance
Being able to predict the future of a company is one of the most challenging endeavours. Of course, most of that future is embedded in financial data, but the way a company is organised also carries weight. This is only the beginning of the story, because life is not a steady sequence of events: something unexpected can happen. Designing an AI that manages all that is not simple, but MPAI managed to produce a standard called Compression and Understanding of Industrial Data, containing the Company Performance Prediction Use Case, just one year after its foundation. In fact, MPAI produced not only the Technical Specification, but also the Reference Software, Conformance Testing and Performance Assessment.
Multi-sourced AI apps
A founding pillar of the MPAI standardisation effort is the notion of AI Framework, an environment that developers of AI Apps can use to build functionally rich solutions by connecting components, called AI Modules, through standard interfaces. MPAI-AIF V1 is the standard that offered the solution to this problem, assuming that the environment is “Zero Trust”. MPAI-AIF V2, now under development, intends to provide a set of APIs that enable developers to configure the security of the AIF environment according to their needs.
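The idea of composing independently sourced modules behind a common interface can be sketched in a few lines. This is an illustration only: the `AIModule` protocol, the `Workflow` class and the two toy modules are hypothetical names invented for this example and do not reproduce the MPAI-AIF API.

```python
from typing import Any, Dict, List, Protocol


class AIModule(Protocol):
    """Hypothetical common interface: every module maps a data dict to a data dict."""

    name: str

    def process(self, data: Dict[str, Any]) -> Dict[str, Any]: ...


class SpeechRecognizer:
    """Toy stand-in from 'vendor A': treats the audio field as already-transcribed text."""

    name = "speech_recognizer"

    def process(self, data: Dict[str, Any]) -> Dict[str, Any]:
        return {"text": data["audio"].lower()}


class EmotionTagger:
    """Toy stand-in from 'vendor B': tags the text with a crude emotion label."""

    name = "emotion_tagger"

    def process(self, data: Dict[str, Any]) -> Dict[str, Any]:
        emotion = "happy" if "great" in data["text"] else "neutral"
        return {**data, "emotion": emotion}


class Workflow:
    """Runs modules in order, feeding each module's output to the next one."""

    def __init__(self, modules: List[AIModule]) -> None:
        self.modules = list(modules)

    def run(self, data: Dict[str, Any]) -> Dict[str, Any]:
        for module in self.modules:
            data = module.process(data)
        return data


if __name__ == "__main__":
    app = Workflow([SpeechRecognizer(), EmotionTagger()])
    print(app.run({"audio": "This Is GREAT"}))
```

Because the workflow depends only on the shared interface, either toy module could be swapped for another implementation without changing the rest of the application, which is the property the paragraph above attributes to standard interfaces.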
Federated AI for Health
AI promises to improve the way we approach health. Obviously, there are many possible approaches to the problem. MPAI’s AI Health project assumes that everybody could acquire health data and process it using AI Models on their handsets. Processed health data can then be uploaded to a system, accompanied by a smart contract setting the use that the system can make of the data. Third parties can process health data based on the said smart contract. The AI models can learn while they process health data. Federated learning techniques are used to collect the AI models and create an improved one that is distributed to all handsets. MPAI is developing use cases and requirements to issue a Call for Technologies later in the year.
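The collect-and-improve step can be illustrated with federated averaging (FedAvg), a widely used aggregation rule. The text does not say which technique the project will adopt, so the choice of FedAvg, the `Weights` layout and the function name below are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Assumed layout: a model is a dict of named parameter vectors.
Weights = Dict[str, List[float]]


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Average handset models, weighting each by its number of local samples."""
    if not updates:
        raise ValueError("at least one model update is required")
    total_samples = sum(count for _, count in updates)
    if total_samples <= 0:
        raise ValueError("total sample count must be positive")

    reference = updates[0][0]
    averaged: Weights = {}
    for key, values in reference.items():
        averaged[key] = [
            sum(weights[key][i] * count for weights, count in updates) / total_samples
            for i in range(len(values))
        ]
    return averaged


if __name__ == "__main__":
    handset_a = ({"w": [1.0, 2.0]}, 1)  # model trained on 1 local sample
    handset_b = ({"w": [3.0, 4.0]}, 3)  # model trained on 3 local samples
    print(federated_average([handset_a, handset_b]))
```

Only model parameters leave the handset in this scheme, while the raw health data stays local; the improved model returned by the function is what would be redistributed.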
MPAI Ecosystem Governance
All standards generate an ecosystem. The root of trust is the standard itself or the body that produced the standard; then there are implementers of the standard, and then users of the standard. How can a user be sure that an implementation of a standard complies with its prescriptions? Here comes another element of the ecosystem: conformance assessment. An ecosystem needs governance.
MPAI standards, too, create an ecosystem, with one important difference: the standards are typically AI-based, and MPAI is the first body to produce such standards.
AI-Enhanced Video Coding
MPAI’s mission is data coding by AI. Therefore, the large and growing number of bits required to provide an adequate video experience has prompted MPAI to work on two video coding projects. AI-Enhanced Video Coding (MPAI-EVC) improves an existing video codec, MPEG-EVC, by enhancing or replacing existing coding tools with AI-based tools. So far, MPAI has addressed two tools, intra coding and super resolution, obtaining a total compression improvement of 16% compared to MPEG-EVC. MPAI is now working on in-loop filtering and inter prediction. All tools are added to the same environment, thus providing a true measure of the improvement.
End-to-End Video Coding
Unlike AI-Enhanced Video Coding (MPAI-EVC), which seeks to improve video compression by adding AI-based tools to, or replacing, traditional data processing coding tools, MPAI-EEV exploits AI-based data coding technologies in an end-to-end fashion and is entirely based on AI. The latest EEV reference model, EEV-0.4, offers very promising results (see MPAI – End-to-end Video (EEV) presented at the 2023/01/01 EEV-0.4 reference model event). On the MS-SSIM metric, EEV-0.4 exceeds the performance of MPEG-VVC.
Better and fairer online games with AI
The global game market is an impressive economic reality: 93 B$ for mobile, 37 B$ for PC, and 50 B$ for console (2021). On Steam, Valve’s digital video game platform and store, 9 of the 10 most played games of 2020 are online games. Despite these results, two problems still plague online gaming: latency/packet loss and game cheating. The MPAI-SPG project works to solve or mitigate these two important issues. MPAI-SPG solutions could be used as plug-ins in a generic game engine. At the time of game development, each project could add MPAI-SPG, and the online server instance would exchange information with the MPAI-SPG component to get the missing information needed to complete an incomplete Game State.