The 19th of July 2022 was the second anniversary of the launch of the MPAI idea. After two years of existence, it is useful to have a summary of MPAI’s vision, mission, processes, achievements, plans, and the sister organisation MPAI Store. Those in a hurry can have a look at a 2 min video about MPAI (YouTube – non-YouTube).
Vision. The MPAI idea was driven by the impact digital media standards had on the media industry. While traditionally not very inclined to adopt “official” standards, that industry has seen relentless development in the last 1/3 of a century since digital media standards came to the fore and the industry began adopting them.
The state of Artificial Intelligence today is like the state of digital media some 1/3 of a century ago. Many players hold many technologies, but none has the power alone to create a level playing field where different players can deploy interoperable products, services, and applications.
Mission. The international, non-profit, and unaffiliated MPAI organisation develops standards for AI-based data coding and seeks to play the role of enabling that level playing field. 1/3 of a century ago the blocking factor was the high amount of data generated by the digitisation of analogue media. Today this remains an issue, but Artificial Intelligence can also be applied to all sorts of data when it is convenient to transform it from one format into another format.
Processes. Developing standards is a challenging business because standards are often based on sophisticated technologies that result from large research investments and have the potential to be used by millions of people. MPAI takes the following approach:
- Anybody should be allowed to propose standards and contribute to the definition of their functional requirements.
- Before the development of a standard starts, users should know as many details of the functional and commercial requirements as legally possible.
- Investments that have produced good research results should be remunerated.
- Once approved, the terms and conditions for using a standard should be known in a timely and simple fashion.
MPAI is developing its standards using a process that accommodates such requirements:
- Anybody can propose standards, attend online meetings, and develop functional requirements.
- MPAI Principal Members develop and approve the Framework Licence of a standard. Unlike Fair, Reasonable and Non-Discriminatory (FRAND) declarations, the Framework Licence includes terms and conditions without values (dollars, percentages, rates, dates, etc.) and a declaration that:
  - The licence will be issued before commercial implementations are available on the market.
  - The total cost will be in line with the total cost of the licences for similar data coding technologies.
  - The market value of the specific standardised technology will be considered.
- MPAI issues Calls for Technologies requesting proposals satisfying functional and commercial requirements.
- Anybody can respond to a Call and participate in the integration of technologies for a standard on the condition of membership in MPAI and acceptance of the Framework Licence for proposals submitted.
Achievements. MPAI has developed 4 technical specifications and 1 standard, i.e., the full set of technical specification, reference software, conformance testing and performance assessment:
- AI Framework (MPAI-AIF) enables the creation of environments (AIF) that execute AI Workflows (AIW) composed of basic components called AI Modules (AIM). It is a foundational MPAI standard on which other MPAI application standards are built.
- Context-based Audio Enhancement (MPAI-CAE) uses AI to improve the user experience for audio-related entertainment, teleconferencing, restoration, and other applications in contexts such as in the home, in the car, on the go, in the studio, etc.
- Compression and Understanding of Industrial Data (MPAI-CUI) uses AI to handle financial data for such purposes as assessing adequacy of governance and predicting the default and business discontinuity probabilities of a company.
- Multimodal Conversation (MPAI-MMC) uses AI to enable conversation between humans and machines emulating human-human conversation in completeness and intensity.
MPAI has also developed Governance of the MPAI Ecosystem (MPAI-GME), a foundational standard laying down the rules that govern the submission of and access to MPAI standard implementations with attributes of Reliability, Robustness, Replicability, and Fairness, available from the MPAI Store.
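The relationship between AI Modules and AI Workflows described for MPAI-AIF above can be pictured with a small sketch. It is an illustration only: the class names `AIModule` and `AIWorkflow`, the `run` and `replace` methods, and the toy text-processing modules are all hypothetical and are not taken from the MPAI-AIF specification. The sketch also assumes a workflow is a simple chain, whereas a real AIW may connect its modules in richer ways.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List

# Hypothetical names throughout: this is NOT the MPAI-AIF API.


@dataclass
class AIModule:
    """A basic processing component (AIM): a name plus a function."""
    name: str
    process: Callable[[Any], Any]


@dataclass
class AIWorkflow:
    """An AIW, simplified here to an ordered chain of AIMs."""
    name: str
    modules: List[AIModule] = field(default_factory=list)

    def run(self, data: Any) -> Any:
        # Feed the output of each module into the next one.
        for module in self.modules:
            data = module.process(data)
        return data

    def replace(self, name: str, new_module: AIModule) -> None:
        """Swap the module called `name` for another implementation."""
        for i, module in enumerate(self.modules):
            if module.name == name:
                self.modules[i] = new_module
                return
        raise KeyError(name)


# Toy workflow: normalise a string, then split it into words.
workflow = AIWorkflow(
    name="toy-text-workflow",
    modules=[
        AIModule("normalise", lambda text: text.strip().lower()),
        AIModule("tokenise", lambda text: text.split()),
    ],
)
result = workflow.run("  Hello MPAI World ")
```

Because each module is addressed by name, `workflow.replace("tokenise", ...)` can swap in a different implementation without touching the rest of the chain, which is the kind of component replacement the framework is meant to make possible.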
Plans. MPAI is engaged in 3 projects which have just reached the Call for Technologies stage and aim at:
- Providing the AI Framework standard with a security infrastructure so that AIF V2 components can access security services. Please have a look at the 1 min 20 sec video about the MPAI-AIF V2 Call for Technologies (YouTube – non-YouTube); the slides presented at the online meeting on 2022/07/11; the video recording of the online presentation (YouTube, non-YouTube) given at that meeting; and the Call for Technologies, Use Cases and Functional Requirements, and Framework Licence.
- Extending the Multimodal Conversation standard. Please have a look at the 2 min video (YouTube, non-YouTube) illustrating MPAI-MMC V2; the slides presented at the online meeting on 2022/07/12; the video recording of the online presentation (YouTube, non-YouTube) given at that meeting; and the Call for Technologies, Use Cases and Functional Requirements, and Framework Licence. MPAI-MMC V2 calls for a range of technologies, such as:
  - Extraction of Personal Status, a set of internal characteristics from a person or avatar, currently Emotion, Cognitive State, and Attitude, conveyed by Modalities: Text, Speech, Face, and Gesture.
  - Generation of a speaking avatar from Text and Personal Status, typically generated by a machine conversing with a human.
  - Audio-Visual Scene Description to describe the structured composition of the audio-visual objects in a scene.
  - Avatar Model to describe a static avatar from the waist up displaying movements in face and gesture.
  - Avatar Descriptors to represent the instantaneous alterations of the face, head, arms, hands, and fingers of an Avatar Model.
  - Extraction of Speech and Face Descriptors for remote authentication.
- Developing the Neural Network Watermarking (MPAI-NNW) standard providing the means to measure the performance of a neural network watermarking technology. Please have a look at the 1 min 30 sec video (YouTube, non-YouTube) illustrating MPAI-NNW; the slides presented at the online meeting on 2022/07/12; the video recording of the online presentation (YouTube, non-YouTube) given at that meeting; and the Call for Technologies, Use Cases and Functional Requirements, and Framework Licence.
MPAI is also engaged in several other projects which have not reached the Call for Technologies stage:
- AI Health (MPAI-AIH): addresses users equipped with an AIF-enabled smartphone who collect, process, and license health data to a central service which satisfies data processing requests from third parties in line with the data licence. Neural network models are shared and improved via federated learning.
- Avatar Representation and Animation (MPAI-ARA): addresses the extraction of visual human features to animate a speaking avatar which accurately reproduces the features and the movements of a human.
- Connected Autonomous Vehicle (MPAI-CAV): addresses the AI Modules and AI Workflows of a CAV, i.e., a system capable of moving autonomously based on the analysis of the data produced by a range of sensors exploring the environment and the information transmitted by other sources in range.
- AI-based End-to-End Video Coding (MPAI-EEV): seeks to reduce the number of bits required to represent 2D video by exploiting AI-based end-to-end data coding technologies without being constrained by how data coding has traditionally been used for video coding.
- AI-Enhanced Video Coding (MPAI-EVC): aims at substantially enhancing the performance of a traditional video codec (MPEG-5 EVC) by improving or replacing traditional tools with AI-based tools.
- Integrative Genomic/Sensor Analysis (MPAI-GSA): aims at understanding and compressing the result of high-throughput experiments combining genomic/proteomic and other data, e.g., from video, motion, location, weather, and medical sensors.
- Mixed-Reality Collaborative Spaces (MPAI-MCS): addresses virtual spaces where humans and avatars collaborate to achieve common goals, such as Conversation About a Scene (CAS) and Avatar-Based Videoconference (ABV). These are two use cases enabled by MPAI-MMC V2.
- Visual Object and Scene Description (MPAI-OSD): addresses use cases sharing the goal of describing visual objects and locating them in space. Scene description includes the description of objects, their attributes in a scene and their semantic description.
- Server-based Predictive Multiplayer Gaming (MPAI-SPG): aims to mitigate the gameplay discontinuities caused by high latency or packet losses in online and cloud gaming applications and to detect game players who are getting an unfair advantage by manipulating the data generated by their game client.
- XR Venues (MPAI-XRV): addresses use cases enabled by AR/VR/MR (XR) and enhanced by Artificial Intelligence technologies. Examples are eSports, experiential retail/shopping, and immersive art experiences.
MPAI Store. Standards are about interoperability, but what is MPAI Interoperability? MPAI defines it as the ability to replace an Implementation of an AI Workflow or an AI Module with a functionally equivalent and conforming Implementation. MPAI defines 3 Interoperability Levels of an AIW executed in an AIF:
Level 1 – The AIW is implementer-specific and satisfies the MPAI-AIF Standard.
Level 2 – The AIW is specified by an MPAI Application Standard.
Level 3 – The AIW is specified by an MPAI Application Standard and validated by a Performance Assessor.
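The three levels can be read as a simple decision rule, sketched below. This is an informal illustration and not a procedure defined by MPAI: the `Implementation` record, its boolean fields, and the `interoperability_level` function are hypothetical names. The sketch assumes that running in a conforming AIF is a precondition for every level, and it returns `None` when that precondition is not met.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class InteroperabilityLevel(IntEnum):
    LEVEL_1 = 1  # implementer-specific AIW that satisfies MPAI-AIF
    LEVEL_2 = 2  # AIW specified by an MPAI Application Standard
    LEVEL_3 = 3  # Level 2 plus validation by a Performance Assessor


@dataclass(frozen=True)
class Implementation:
    """Hypothetical record of what is known about an AIW implementation."""
    name: str
    satisfies_aif: bool
    specified_by_application_standard: bool
    validated_by_performance_assessor: bool


def interoperability_level(impl: Implementation) -> Optional[InteroperabilityLevel]:
    """Map an implementation's attributes to one of the three levels."""
    if not impl.satisfies_aif:
        # Assumption of this sketch: no level applies outside an AIF.
        return None
    if impl.specified_by_application_standard:
        if impl.validated_by_performance_assessor:
            return InteroperabilityLevel.LEVEL_3
        return InteroperabilityLevel.LEVEL_2
    return InteroperabilityLevel.LEVEL_1


custom = Implementation("custom-aiw", True, False, False)
standard = Implementation("standard-aiw", True, True, False)
assessed = Implementation("assessed-aiw", True, True, True)
levels = [interoperability_level(i) for i in (custom, standard, assessed)]
```

Using an `IntEnum` keeps the levels ordered, so a caller can compare them directly, for example to require at least Level 2 before deploying an implementation.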
Implementations should be labelled so as not to confuse users. The Governance of the MPAI Ecosystem assigns this task to the MPAI Store, a not-for-profit organisation that verifies the security of implementations, tests their claimed conformance to an MPAI technical specification, records the results of Performance Assessors, and makes the implementations available for download. The MPAI Store also manages a reputation system recording reviews of MPAI implementations.
MPAI offers Users access to the promised benefits of AI with a guarantee of increased transparency, trust and reliability as the Interoperability Level of an Implementation moves from Level 1 to 3.