MPAI goes public with MPAI-CUI – Compression and Understanding of Industrial Data
The title is slightly inaccurate: MPAI-CUI did “go public” on 30 September, as one of the first batch of 3 MPAI standards. Now, however, MPAI wants to give everybody the opportunity to understand what MPAI-CUI actually does, how it can be used, and to see a real-time implementation at work.
To facilitate understanding of the planned webinar, let’s say a few things about MPAI-CUI.
An implementation of the standard (see Fig. 1) is composed of three modules that pre-process the input data. A fourth module – called Prediction – is a neural network that has been trained with a large amount of company data of the same type as those used by the implementation and can provide an accurate estimate of the company default probability and the governance adequacy. The fifth module – called Perturbation – takes as input the estimate of the company default probability and the assessment of vertical risks (i.e., seismic and cyber) and estimates the probability that a business discontinuity will occur in the future.
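As a rough illustration of this flow, the Python sketch below chains stand-ins for the pre-processing, Prediction and Perturbation modules. All function names, the toy heuristic replacing the neural network, and the risk-combination rule are assumptions made for illustration; the standard specifies the actual module interfaces and data formats.

```python
# Illustrative sketch of the MPAI-CUI Company Performance Prediction flow.
# Module names and all logic are placeholders, not taken from the standard.

def preprocess(financial, governance, risk):
    """Stand-in for the pre-processing modules: package raw company data."""
    return {"financial": financial, "governance": governance, "risk": risk}

def prediction(features):
    """Stand-in for the Prediction module (a trained neural network in the
    standard): returns default probability and governance adequacy."""
    # Toy heuristic in place of a real trained model.
    default_probability = min(1.0, max(0.0, 1.0 - features["financial"]))
    governance_adequacy = features["governance"]
    return default_probability, governance_adequacy

def perturbation(default_probability, seismic_risk, cyber_risk):
    """Stand-in for the Perturbation module: combines the default probability
    with the vertical (seismic and cyber) risks into a business
    discontinuity probability."""
    combined_risk = 1.0 - (1.0 - seismic_risk) * (1.0 - cyber_risk)
    return min(1.0, default_probability
               + (1.0 - default_probability) * combined_risk)

features = preprocess(financial=0.8, governance=0.7, risk=0.2)
dp, ga = prediction(features)
bdi = perturbation(dp, seismic_risk=0.05, cyber_risk=0.1)
```

Note how the business discontinuity probability can only be equal to or higher than the default probability, since the vertical risks add to it.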
Figure 1 – Company Performance Prediction in MPAI-CUI
The MPAI-CUI standard is a set of 4 specifications. The Technical Specification outlined above is the first; the others are:
- a second specification, called Conformance Assessment, enabling a user of an implementation of the standard to verify that it is technically correct;
- a third specification, called Performance Assessment, enabling a user to detect whether the training of the neural network is biased against some geographic locations (e.g., North vs. South) or some industry types (e.g., Service vs. Manufacturing);
- a fourth specification, called Reference Software, a software implementation of the standard.
The novelty of MPAI-CUI is in its ability to analyse, through AI, the large amount of data required by regulation and extract the most relevant information. Moreover, compared to state-of-the-art techniques that predict the performance of a company, MPAI-CUI allows extending the time horizon of prediction.
Companies and financial institutions can use MPAI-CUI in a variety of contexts, e.g.:
- To support the company’s board in analysing the financial performance, identifying clues to a crisis or risk of bankruptcy years in advance. It may help decision-makers to make proper decisions to avoid these situations, conduct what-if analyses, and devise efficient strategies.
- To assess the financial health of companies applying for funds. A financial institution receiving a request from a troubled company can access the company’s financial and organisational data, make an AI-based assessment, and predict the performance of the company. The financial institution can then make the right decision on whether to fund that company, based on a broader view of its situation.
The webinar will be held on 25 November 2021 at 15:00 UTC with the following agenda:
1. Introduction (5’): introduction to MPAI, its mission, what has been done in the year since its establishment, and its plans.
2. MPAI-CUI standard (15’):
- The process that led to the standard: study of Use Cases, Functional Requirements, Commercial Requirements, Call for Technologies, Request for Community Comments and Standard.
- The MPAI-CUI modules and their function.
- Extensions to the standard under way.
- Some applications of the standard (banking, insurance, public administrations).
3. Demo (15’): a set of anonymous companies with identified financial, governance and risk features will be passed through an MPAI-CUI implementation.
Conversing with a machine
It will take some time before we can argue with a machine about the different forms of the novel across centuries and cultures. It is clear, however, that there is a significant push by the industry to endow machines with the ability to hold at least a limited form of conversation with humans.
The MPAI-MMC standard, approved on 30 September, provides two significant examples. The first assumes there is a machine that responds to user queries. The query can be about the line of products sold by a company, about the operation of a product, or a complaint about a malfunctioning product.
It would be a great improvement over some systems available today if the machine could understand the state of mind of the human and fine-tune its speech so as to make it more in tune with the mood of the human.
MPAI is developing standards for typical use cases.
The first is Conversation with Emotion (MMC-CWE), depicted in Figure 2. Here MPAI has standardised the architecture of the processing units (called AI Modules – AIMs) and the data formats exchanged by the processing units that support the following scenario: a human types or speaks to a machine that captures the human’s text or speech and face, and responds with a speaking avatar.
Figure 2 – Conversation with Emotion
A Video Analysis AIM extracts emotion and meaning from the human’s face, while the Speech Recognition AIM converts the speech to text and extracts emotion from the human’s speech. Emotion from the different media is fused, and all the data are passed to the Dialogue Processing AIM, which provides the machine’s answer along with the appropriate emotion. Both are passed to a Speech Synthesis AIM and a Lips Animation AIM.
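The workflow just described can be sketched as a chain of placeholder functions. The function names mirror the AIMs of Figure 2, while their bodies are invented stand-ins, since the standard specifies interfaces and data formats, not the AI inside the modules.

```python
# Sketch of the MMC-CWE topology. All AIM bodies are toy placeholders.

def video_analysis(face_frames):
    return {"emotion": "happy", "meaning": "greeting"}   # from the face

def speech_recognition(speech):
    return {"text": "hello", "emotion": "happy"}         # text + vocal emotion

def emotion_fusion(video_emotion, speech_emotion):
    # Trivial fusion rule, for illustration only.
    return video_emotion if video_emotion == speech_emotion else "neutral"

def dialogue_processing(text, emotion, meaning):
    return {"reply_text": "Hello! How can I help?", "reply_emotion": emotion}

def speech_synthesis(reply):
    return f"[{reply['reply_emotion']}] {reply['reply_text']}"

def lips_animation(reply):
    return {"visemes_for": reply["reply_text"]}

# Wire the AIMs together as in Figure 2.
v = video_analysis(face_frames=[])
s = speech_recognition(speech=b"")
fused = emotion_fusion(v["emotion"], s["emotion"])
reply = dialogue_processing(s["text"], fused, v["meaning"])
audio = speech_synthesis(reply)
lips = lips_animation(reply)
```

The point of the componentised design is visible here: each AIM could be swapped for another implementation with the same input/output formats without touching the rest of the workflow.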
The second case is Multimodal Question Answering (MMC-MQA), depicted in Figure 3. Here MPAI has done the same for a system where a human holds an object in their hand and asks a question about it, which the machine answers with a synthetic voice.
Figure 3 – Multimodal Question Answering
An AIM recognises the object while a Speech Recognition AIM converts the human’s speech to text. The text, together with information about the object, is processed by the Language Understanding AIM, which produces Meaning that the Question Analysis AIM converts to Intention. The Question Answering AIM processes the human’s text, Intention and Meaning to produce the machine’s answer, which is finally converted into synthetic speech.
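This chain, too, can be sketched with placeholder functions. The names follow the AIMs in the text; the returned values are invented examples, since the standard defines interfaces and data formats rather than algorithms.

```python
# Sketch of the MMC-MQA chain. All AIM bodies are toy placeholders.

def object_recognition(image):
    return "coffee_grinder"                             # object identifier

def speech_recognition(speech):
    return "what is this used for"                      # recognised text

def language_understanding(text, object_id):
    return {"topic": object_id, "question_type": "usage"}   # Meaning

def question_analysis(meaning):
    return f"explain_{meaning['question_type']}"            # Intention

def question_answering(text, intention, meaning):
    return f"The {meaning['topic']} is used to grind coffee beans."

obj = object_recognition(image=None)
text = speech_recognition(speech=None)
meaning = language_understanding(text, obj)
intention = question_analysis(meaning)
answer = question_answering(text, intention, meaning)
```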
The MPAI-MMC standard is publicly available at
MPAI is now busy developing the Reference Software of the 5 MPAI-MMC Use Cases. At the same time MPAI is exploring other environments where human-machine conversation is possible with technologies within reach.
The third case is conversation between a human and a Connected Autonomous Vehicle (CAV) depicted in Figure 4. In this case the CAV should be able to
- Recognise that the human does indeed have the right to ask the CAV to carry out requests.
- Understand commands like “take me home” and respond by offering a range of possibilities among which the human can choose.
- Respond to other questions while travelling and engage in a conversation with the human.
Figure 4 – Human to Connected Autonomous Vehicle dialogue
The CAV is impersonated by an avatar which should be capable of several additional things compared to the MMC-CWE case, e.g., distinguishing which human in the compartment is asking a question and turning its eyes to that human.
The fourth case, Conversation About a Scene, depicted in Figure 5, is actually an extension of Multimodal Question Answering: a human and a machine hold a conversation about the objects in a scene. The machine understands what the human is saying and which object they are pointing to, and reads the changes in the face and in the speech denoting approval, disapproval, etc.
Figure 5 – Conversation About a Scene
Figure 5 represents the architecture of the AIMs whose concurrent actions allow a human and a machine to have a dialogue. It integrates the emotion-detecting AIMs of MMC-CWE and the question-handling AIMs of MMC-MQA with the AIM that detects the human’s gesture (“what the human’s arm/finger is pointing at”) and the AIM that creates a model of the objects in the scene. The Object in Scene AIM fuses the two data streams and provides the object identifier, which is processed in a way similar to MMC-MQA.
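As a toy illustration of the fusion performed by the Object in Scene AIM, the sketch below picks the scene object best aligned with the pointing direction. The data structures and the alignment rule are assumptions made for illustration; they are not taken from the standard.

```python
import math

def object_in_scene(pointing_dir, scene_objects):
    """Toy fusion of gesture and scene model: return the identifier of the
    object best aligned with the pointing direction (cosine similarity).
    The real AIM's method is not specified in the article."""
    px, py = pointing_dir
    p_norm = math.hypot(px, py)

    def alignment(obj):
        ox, oy = obj["position"]
        return (ox * px + oy * py) / (math.hypot(ox, oy) * p_norm)

    return max(scene_objects, key=alignment)["id"]

# Hypothetical scene model with two objects and a human pointing along x.
scene = [{"id": "vase", "position": (1.0, 0.1)},
         {"id": "lamp", "position": (0.0, 1.0)}]
obj_id = object_in_scene(pointing_dir=(1.0, 0.0), scene_objects=scene)
```

The resulting object identifier would then feed the question-handling AIMs exactly as in MMC-MQA.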
The fifth case is part of a recent new MPAI project called Mixed-reality Collaborative Spaces (MPAI-MCS), depicted in Figure 6. It applies to scenarios where geographically separated humans collaborate in real time with speaking avatars in virtual-reality spaces called ambients, to achieve goals generally defined by the use scenario and specifically carried out by humans and avatars.
Figure 6 – Mixed-reality Collaborative Space
Strictly speaking, in MCS the problem is not conversation with a machine but the creation of a virtual twin of a human (an avatar) that looks like and behaves in a similar way to its physical twin. Many of the AIMs we need for this case are similar to, and in some cases exactly the same as, those needed by MMC-CWE and MMC-MQA: we need to capture the emotion or the meaning in the face and in the speech of the physical twin so that we can map them to the virtual twin.
MPAI meetings in November-December (draft)
| Group | Meeting days | Time |
| --- | --- | --- |
| Mixed-reality Collaborative Spaces | 29, 6, 13, 20 | 14 |
| Context-based Audio enhancement | 30, 7, 14, 21 | 15:30 |
| Connected Autonomous Vehicles | 1, 8, 15, 22 | 13 |
| AI-Enhanced Video Coding | 1, 15 | 14 |
| AI-based End-to-End Video Coding | 8, 21 | 14 |
| Compression and Understanding of Industrial Data | 8, 15 | 15 |
| Server-based Predictive Multiplayer Gaming | 8, 9, 16, 22 | 13:30 |
| Industry and Standards | 10 | 14 |
| General Assembly (MPAI-15) | 24 | |
The foundational AI Framework standard is coming to the fore
One of the distinctive characteristics of MPAI standardisation is componentisation. For instance, a complex system like Conversation with Emotion, where a human holds a conversation with a machine impersonated by a synthetic voice and an animated face using text, speech and video, is reduced to 7 interconnected subsystems; for each of them the function, the format and semantics of the input and output data, and the interconnections are specified. The result is a workflow of connected processing elements.
This approach has several benefits. One is that it is easier to work with smaller rather than larger entities; another is that it is easier to trace a result back to its source, a notion that experts call Explainability; a third is that a component used in one workflow may often be reused “as is” in another workflow.
The problem is that, to execute such a workflow, you need an environment that allows easy interconnection and interoperability of modules organised in workflows. This is one of the functions of the MPAI AI Framework standard. MPAI-AIF can execute AI Workflows (AIW) composed of AI Modules (AIM). AIMs can be based on data processing, AI or Machine Learning functions implemented in hardware, software and hybrid hardware and software.
Figure 1 represents the MPAI-AIF Reference Model and its components.
Figure 1 – The AI Framework (AIF) Reference Model and its Components
The Controller plays a key role because:
- It provides basic functionalities such as scheduling, inter-AIM communication, and access to AIF components such as the Internal and Global Storage.
- It activates/suspends/resumes/deactivates AIWs or AIMs according to the user’s or other inputs.
- It can support complex application scenarios by balancing load and resources.
- It exposes three APIs:
- AIM API through which modules can communicate with it (register themselves, communicate and access the rest of the AIF environment)
- User API through which the user or other Controllers can perform high-level tasks (e.g., switch the Controller on and off, give inputs to the AIW through the Controller).
- MPAI Store API to enable communication between the AIF and the Store.
- It may run one or more AIWs.
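A minimal sketch of how a Controller might expose those functions is given below. The class and method names are assumptions made for illustration; MPAI-AIF specifies the actual APIs.

```python
# Illustrative Controller exposing AIM-, User- and Store-facing methods.
# Names and behaviour are assumptions, not the MPAI-AIF API.

class Controller:
    def __init__(self):
        self.aims = {}         # AIMs registered through the AIM API
        self.workflows = {}    # AIWs managed by this Controller
        self.global_storage = {}

    # --- AIM API: modules register themselves and reach the AIF ---
    def register_aim(self, name, run_fn):
        self.aims[name] = run_fn

    # --- User API: high-level tasks such as configuring and feeding an AIW ---
    def start_aiw(self, name, aim_names):
        self.workflows[name] = [self.aims[a] for a in aim_names]

    def run_aiw(self, name, data):
        # Scheduling reduced to a simple sequential pass for illustration.
        for step in self.workflows[name]:
            data = step(data)
        return data

    # --- MPAI Store API: fetch an implementation (stubbed here) ---
    def download_from_store(self, aim_name):
        return lambda x: x     # identity AIM as a stand-in

ctrl = Controller()
ctrl.register_aim("uppercase", lambda s: s.upper())
ctrl.register_aim("exclaim", lambda s: s + "!")
ctrl.start_aiw("demo", ["uppercase", "exclaim"])
result = ctrl.run_aiw("demo", "hello")
```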
The MPAI Store is another critical component. It is a repository of AIF, AIW and AIM implementations that users can access to set up an environment, and download an application (AIW) and its modules (AIM).
An AIW executed in an AIF may have one of the following MPAI-defined Interoperability Levels:
- Interoperability Level 1, if the AIW is proprietary and composed of AIMs with proprietary functions using any proprietary or standard data Format.
- Interoperability Level 2, if the AIW is composed of AIMs having all their Functions, Formats and Connections specified by an MPAI Application Standard.
- Interoperability Level 3, if the AIW has Interoperability Level 2, and the AIW and its AIMs are certified by an MPAI-appointed Assessor to hold the attributes of Reliability, Robustness, Replicability and Fairness – collectively called Performance.
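The three levels can be captured by a simple check, sketched below; the attribute names are invented for illustration, and note that Level 3 presupposes Level 2, as the text states.

```python
# Illustrative mapping of an AIW's attributes to its Interoperability Level.
# Attribute names are assumptions; the levels follow the MPAI definitions.

def interoperability_level(aiw):
    conforms = aiw.get("conforms_to_mpai_standard", False)
    assessed = aiw.get("performance_assessed", False)
    if conforms and assessed:   # Level 3 requires Level 2 plus assessment
        return 3
    if conforms:                # Functions/Formats/Connections per standard
        return 2
    return 1                    # proprietary AIW and AIMs

lvl1 = interoperability_level({})
lvl2 = interoperability_level({"conforms_to_mpai_standard": True})
lvl3 = interoperability_level({"conforms_to_mpai_standard": True,
                               "performance_assessed": True})
```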
MPAI-AIF is now at the WD0.12 level. MPAI plans to approve MPAI-AIF V1 as a standard at its next General Assembly (MPAI-15) on 24 November. The MPAI standard development process, however, has a stage called Community Comments that takes place before final approval. WD0.12 can be downloaded from the MPAI web site (https://mpai.community/standards/mpai-aif/). Anybody may submit comments to the MPAI Secretariat (email@example.com). Comments are especially requested on the suitability of the standard in its current form, and suggestions for future work.
Comments will not be shared outside MPAI but considered by AIF-DC, the Development Committee in charge of the standard. The Secretariat will provide individual responses to comments.
Comments shall not include any Intellectual Property matters. If any are received, the Secretariat will return the email and not forward it to any MPAI member or Development Committee.
AI and Video, a partnership destined to last
Since the day MPAI was announced, there has been considerable interest in the application of AI to video coding. In the Video Codec world, research focuses on radical changes to the classic block-based hybrid coding framework to meet the challenge of offering more efficient video compression solutions. AI approaches can play an important role in achieving this goal.
According to a literature survey of AI-based video coding papers, performance improvements of up to 30% can be expected. Therefore, MPAI has been investigating whether it is possible to improve the performance of the MPEG-5 Essential Video Coding (EVC) standard by enhancing/replacing existing video coding tools with AI tools, while keeping the complexity increase at an acceptable level.
While the MPAI-EVC Use Cases and Requirements document is ready and would enable the MPAI General Assembly to proceed to the Commercial Requirements phase, MPAI has made a deliberate decision not to move to the next stage because it first wanted to make sure that the results collected from different papers were indeed confirmed when implemented in a unified platform.
As MPAI members are globally distributed and they work across multiple software frameworks, the first challenge faced by MPAI was to enable them to collaborate in real-time in a shared environment. The main goals of this environment are:
- to allow testing of independently developed AI tools on a common EVC code base.
- to run the training and inference phases in a “plug and play” manner.
To address these requirements, MPAI decided to adopt a solution based on a networked server application listening for inputs over a UDP/IP socket.
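A minimal sketch of such a networked server is shown below: a UDP/IP socket receives a tool request and replies with a status line. The message format (tool name followed by a payload) is an assumption, since the article does not specify the actual protocol used by the MPAI-EVC platform.

```python
import socket
import threading

def run_server(sock):
    """Handle one request: parse a tool name and payload, reply with a
    status line. The message format is invented for illustration."""
    data, addr = sock.recvfrom(4096)
    tool, _, payload = data.decode().partition(" ")
    sock.sendto(f"OK {tool}".encode(), addr)

# Server side: bind to an ephemeral port on the loopback interface.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))
port = server.getsockname()[1]
threading.Thread(target=run_server, args=(server,), daemon=True).start()

# Client side: a collaborator submits a tool request over UDP.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.sendto(b"intra_prediction block_data", ("127.0.0.1", port))
reply, _ = client.recvfrom(4096)
```

UDP keeps the exchange lightweight; in a real deployment the server would dispatch the payload to the requested AI tool and return its output rather than a bare acknowledgement.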
MPAI has been working on three tools (Intra prediction, Super Resolution, In-loop Filtering). For each tool there are three phases: database building, learning and inference.
Significant gains have already been obtained.
Once the MPAI-EVC Evidence Project – as the current activity is called – demonstrates that AI tools can improve MPEG-5 EVC efficiency by at least 25%, MPAI will be in a position to initiate work on its own MPAI-EVC standard. The functional requirements already developed only need to be revised, while the framework licence needs to be developed before a Call for Technologies can be issued.
There is consensus in the video coding research community – and some papers make claims grounded on results – that so-called End-to-End (E2E) video coding schemes can yield significantly higher performance. However, many issues need to be examined, e.g., how such schemes can be adapted to a standard-based codec. In the longer term, E2E schemes promise AI-based video coding standards with significantly higher performance.
As a technical body unconstrained by IP legacy and whose mission is to provide efficient and usable data coding standards, MPAI has initiated the study of what we can call End-to-End Video Coding (MPAI-EEV). This decision answers the needs of the many who seek not only environments where academic knowledge is promoted but also a body that develops common understanding, models and, eventually, standards for End-to-End video coding.
Of course, the MPAI-EVC Evidence Project continues, and new resources have been found to support the new activity. MPAI-EEV is at the Interest Collection stage, the first of the stages through which an activity can become a standard.
MPAI-EEV is designed to serve long-term video coding needs. In the first phase of work MPAI-EEV researchers are engaged in cycles comprising:
- Coordinated research
- Comparison of results within a common model
- Definition of new rounds of investigation.
The definition of a reference model was the first step. The method envisaged is to extract the models implicitly or explicitly assumed by published End-to-End Video Coding papers, including published open-source software.
Work is progressing. Stay tuned.
MPAI meetings in November
| Group | Meeting days | Time |
| --- | --- | --- |
| Mixed-reality Collaborative Spaces | 1, 8, 15, 22 | 14 |
| Context-based Audio enhancement | 2, 9, 16, 23 | 15:30 |
| Connected Autonomous Vehicles | 3, 10, 17, 24 | 13 |
| AI-Enhanced Video Coding | 10, 24 | 14 |
| AI-based End-to-End Video Coding | 3, 17 | 14 |
| Compression and Understanding of Industrial Data | 13, 27 | 15 |
| Server-based Predictive Multiplayer Gaming | 4, 11, 18 | 14:30 |
| Industry and Standards | 5, 19 | 14 |
| General Assembly (MPAI-13) | 24 | |
One year after establishment MPAI publishes 3 standards
MPAI was established on 30 September 2020. One year later, on 30 September 2021, MPAI is proud to announce that it has developed, approved and published 3 standards.
Compression and Understanding of Industrial Data (MPAI-CUI)
The AI-based Company Performance Prediction of MPAI-CUI predicts the performance of a company over a given time horizon from its governance, financial and risk data.
The performance of a company is measured by:
- Default Probability (i.e., the default probability in a specified number of future months based on company financial features).
- Organisational Model Index (i.e., the adequacy of the organisational model).
- Business Discontinuity Index (i.e., the probability of an interruption of the operations of the company for a period of time less than 2% of the prediction horizon).
Figure 2 – The MPAI Company Performance Prediction (CPP)
Read the MPAI-CUI Standard for further details.
An online MPAI-CUI demo is available online.
The novelty of MPAI-CUI is in its ability to analyse, through AI, the large amount of data required by regulation and extract the most relevant information elements. Moreover, compared to state-of-the-art techniques that predict the performance of a company, MPAI-CUI allows extending the time horizon of prediction.
Companies and financial institutions can use MPAI-CUI in a variety of contexts:
- To support the company’s board in deploying efficient strategies. A company can analyse its financial performance, identifying possible clues to a crisis or risk of bankruptcy years in advance. It may help the board of directors and decision-makers to make the proper decisions to avoid these situations, conduct what-if analyses, and devise efficient strategies.
- To assess the financial health of companies applying for funds/financial help. A financial institution receiving a request for financial help from a troubled company can access the company’s financial and organisational data and make an AI-based assessment, as well as a prediction of the future performance of the company. This helps the financial institution to make the right decision on whether to fund that company, based on a broad view of its situation.
Multimodal Conversation (MPAI-MMC)
The MPAI-MMC Application Standard presently specifies five use cases.
- In the Conversation with Emotion (CWE) use case, a human holds an audio-visual conversation with a computational system personified by a synthetic voice and an animated face.
- In the Multimodal Question Answering (MQA) use case, a human user requests and receives from a computational system spoken information concerning a displayed object.
- In three conversational translation use cases, a computational system translates from one spoken language to one or more other spoken languages. The translation path may be one-to-one from Language A to B only (in the Unidirectional Speech Translation (UST) use case); from Language A to B and vice versa (in the Bidirectional Speech Translation (BST) use case); or from Language A to B, C, … (in the One-to-Many Speech Translation (MST) use case). Synthetic spoken output can preserve specified features of the source language speech.
In all these use cases, MPAI-MMC specifies the implementation architecture, the AI Modules composing the topology, and the formats of the AIMs’ input and output data. Based on the AI Framework (MPAI-AIF) specification, it also defines the workflow linking the AIMs, along with their metadata.
Figure 2 – The MPAI One-to-Many Speech Translation (MST)
Read the MPAI-MMC Standard for further details.
The AIMs specified by MPAI-MMC – typically based on AI technologies but in some cases on data processing technologies – can be reused in more than one application scenario.
Multimodality is the essence of MPAI-MMC. Text, speech, and video are exploited, in both input and output data, to enhance user experience and improve human-machine interaction. This application standard will be valuable in AI industries aiming to enhance services based on human machine interaction of all sorts, particularly when the emphasis is on utilization of multimodal user interfaces enabling emotional expression through natural language and visual communication.
Governance of the MPAI Ecosystem (MPAI-GME)
MPAI does not consider its role accomplished with the publication of a Technical Specification. MPAI also delivers Reference Software, a conforming implementation of a Technical Specification whose components are written in a programming language. Some modules (AIM) are available in source code, and some other modules are provided in executable form. MPAI also provides the specification of the procedure, the tools and the data – Conformance Testing – to test the conformance of an implementation to a standard.
MPAI makes a further step. MPAI provides the specification of the procedure, the tools and the characteristics of the data – Performance Assessment – to assess the degree of Reliability, Robustness, Fairness and Replicability of an implementation, collectively called Performance.
The ecosystem created by MPAI standards includes MPAI issuing standards, implementers developing implementations, and end users using implementations. However, there are 3 serious questions whose answers determine the viability of the ecosystem.
- Who verifies the security of an implementation?
- Who tests the conformance of an implementation?
- Who assesses the performance of an implementation?
MPAI has selected the approach, depicted in Figure 2, to:
- Establish the MPAI Store, an MPAI-controlled not-for-profit entity in charge of answering the first two questions, and
- Appoint Performance Assessors in charge of answering the third question.
Figure 2 – The MPAI Ecosystem entities and their interactions
Read the Governance of the MPAI Ecosystem Standard for further details.
The MPAI Store offers secure access to implementations executing AIWs composed of AIMs in AIF implementations that:
- Are proprietary – Level 1.
- Conform to MPAI application standards (e.g., MPAI-CUI and MPAI-MMC) – Level 2.
- Have had their Performance assessed by a Performance Assessor – Level 3.
The MPAI Store always informs end users of the level of guarantee offered by an implementation.
AI can offer great new benefits to humankind. MPAI standards offer the way to practically promote and disseminate use of AI. The Governance of the MPAI Ecosystem assures implementers that the Store holds interoperable implementations and end users that the implementations they enjoy have undergone different levels of scrutiny.
MPAI meetings in October
The meetings of MPAI groups until 27 October (date of the 13th General Assembly) are given in the table below. The numbers 4-8 etc. are days of the month of October. Attendance at meetings of groups in bold is restricted to MPAI members.
| Group | Meeting days | Time |
| --- | --- | --- |
| Mixed-reality Collaborative Spaces | 4, 11, 18, 25 | 14 |
| Context-based Audio enhancement | 5, 12, 19, 26 | 14:30 |
| Connected Autonomous Vehicles | 6, 13, 20, 27 | 13 |
| AI-Enhanced Video Coding | 13, 27 | 14 |
| AI-based End-to-End Video Coding | 6, 20 | 14 |
| Compression and Understanding of Industrial Data | 13, 27 | 15 |
| Server-based Predictive Multiplayer Gaming | 7, 14, 21 | 14:30 |
| Industry and Standards | 8, 22 | 14 |
| General Assembly (MPAI-13) | 27 | |
MPAI publishes 2 draft standards and 1 document for comments
MPAI (https://mpai.community/) was established 11 months ago as an international, not-for-profit, unaffiliated standards developing organisation. Its mission is to develop Data Coding standards that primarily use Artificial Intelligence.
MPAI is currently working on 10 standards projects (https://mpai.community/standards/). The publicly available text of the working drafts of two of these standards is close to settled, and final approval is expected in a matter of weeks:
- Multimodal Conversation (https://mpai.community/standards/mpai-mmc/) comprising Conversation with Emotion, Multimodal Question Answering and 3 Speech Translation Use Cases, and
- Compression and Understanding of Industrial Data (https://mpai.community/standards/mpai-cui/) comprising the AI-based Company Performance Prediction Use Case.
Professionals in these fields are invited to send comments on the drafts, especially on the suitability of the standards in their current form and suggestions for future work.
Comments should be sent to MPAI Secretariat (firstname.lastname@example.org) via email by 20 September 2021.
Comments will not be shared outside MPAI but will be considered by the Development Committees in charge of the standards. An individual response will be provided to you.
Comments shall not include any Intellectual Property (IP) matters. If any are received, the Secretariat will return the email and not forward it to any MPAI member or Development Committee.
Those who believe they have IP in the areas where MPAI develops standards, should read the MPAI Statutes (https://mpai.community/statutes/) and consider joining MPAI (https://mpai.community/how-to-join/join/).
MPAI is also publishing for comments its foundational document titled Governance of the MPAI Ecosystem (https://mpai.community/governance/). This lays down the rules that will govern access to the implementations based on MPAI standards and on their attributes of Reliability, Robustness, Replicability and Fairness.
An introductory paper is available at https://mpai.community/2021/08/27/the-governance-of-the-mpai-ecosystem/.
MPAI engages in standards for AI-based Connected Autonomous Vehicles
For several decades, Autonomous Vehicles have been the target of research and experimentation in industry and academia. For a decade now, trials on real roads have been and are being conducted. Connected Vehicles are a reality today.
Standardisation of Connected Autonomous Vehicle (CAV) components will be required because of the size of the future CAV market (one estimate is 1.38 T$ in 2030). More importantly, users and regulators will need to be assured of the safety, reliability and explainability of CAV components.
In a traditional view of standardisation, the CAV state of development may not warrant space for CAV standardisation. However, MPAI heralds a more modern approach, one where a standard is the result of a continuous interaction between research providing results and standardisation building hypotheses to be proved or modified or disproved by subsequent research results.
MPAI has been working on the first steps of such a process. It has first partitioned a CAV into 5 functionally homogeneous subsystems, as depicted in Figure 1.
Each of these subsystems has an architecture that is based on the emerging MPAI AI Framework (MPAI-AIF) and contains several components called AI Modules (AIM).
Figure 1 – The CAV subsystems
Figure 2 depicts the architecture of the 1st subsystem: Human-CAV interaction. A human may issue vocal commands to the CAV which are interpreted and sent to the Autonomous Motion Subsystem for action. A human may also entertain a dialogue with the CAV or with fellow passengers in the compartment or can indicate objects or make gestures that the CAV would understand and act upon accordingly.
Figure 2 – Human-CAV interaction (HCI)
The existence of an established organisation – MPAI – with a distinctive process to develop standards, and actually developing them using that approach, facilitates the implementation of the proposed plan. Indeed, MPAI follows the rigorous process represented in Figure 3.
Figure 3 – Standard development process
The first 3 stages – Interest Collection, Use Cases and Functional Requirements – are open to participation by non-members. Stage 4 – Commercial Requirements – is the prerogative of Principal Members. Stages 5 and 6 – Call for Technologies and Standard Development – are restricted to MPAI members. Stage 7 – MPAI standard – is again the prerogative of Principal Members. Note that MPAI membership is open to corporate entities and to individuals representing academic departments.
The MPAI-CAV project is currently at stage 3. This means that non-members can participate in the development of the functional requirements document which will provide the final CAV partitioning in subsystems; the functions performed by and the functional requirements of the I/O data of each subsystem; the partitioning of subsystems in AIMs, and the functions performed by and the functional requirements of the I/O data of each AIM. Independently produced results will be collectively assessed and used to design experiments executed by different participants in agreed conditions to provide comparable results.
MPAI was established on 30 September 2020 as a not-for-profit, unaffiliated organisation with the mission (https://mpai.community/statutes/):
- to develop data coding standards based on Artificial Intelligence and
- to bridge the gap between standards and their practical use through Framework Licences.
MPAI develops its standards through a rigorous process depicted in the following figure:
The MPAI standard development process
An MPAI standard passes through 6+1 stages. Anybody can contribute to the first 3 stages. The General Assembly approves the progression of a standard to the next stage. MPAI defines standard interfaces of AI Modules (AIMs) combined and executed in an MPAI-specified AI Framework (AIF). AIMs receive input data in standard formats and produce output data in standard formats.
Figures: An MPAI AI Module (AIM); The MPAI AI Framework (AIF)
MPAI is currently developing 10 technical specifications.
In the following, the name, acronym and scope of each standard are provided. The first 4 standards will be approved within 2021.
AI Framework – MPAI-AIF (Stage 5)
Specifies 6 elements: Management and Control, AIM, Execution, Communication, Storage and Access to enable creation and automation of mixed ML-AI-DP processing and inference workflows.
Context-based Audio Enhancement – MPAI-CAE (Stage 5)
Improves the user experience in audio applications, e.g., entertainment, communication, teleconferencing, gaming, post-production, restoration etc. for different contexts, e.g., in the home, in the car, on-the-go, in the studio etc.
Multimodal Conversation – MPAI-MMC (Stage 5)
Enables human-machine conversation that emulates human-human conversation in completeness and intensity.
Compression and understanding of industrial data – MPAI-CUI (Stage 5)
Enables AI-based filtering and extraction of governance, financial and risk data to predict company performance.
Server-based Predictive Multiplayer Gaming – MPAI-SPG (Stage 3)
Minimises the audio-visual discontinuities caused by network disruption during an online real-time game and helps detect which players are cheating.
Integrative Genomic/Sensor Analysis – MPAI-GSA (Stage 3)
Understands and compresses the result of high-throughput experiments combining genomic/proteomic and other data, e.g., from video, motion, location, weather, medical sensors.
AI-Enhanced Video Coding – MPAI-EVC (Stage 3)
Substantially enhances the performance of a traditional video codec by improving or replacing traditional tools with AI-based tools.
Connected Autonomous Vehicles – MPAI-CAV (Stage 3)
Uses AI to enable a Connected Autonomous Vehicle with 3 subsystems: Human-CAV Interaction, Autonomous Motion and CAV-Environment Interaction.
Visual object and scene description – MPAI-OSD (Stage 2)
A collection of Use Cases sharing the goal of describing visual objects and locating them in space. Scene description includes the usual description of objects and their attributes in a scene, as well as the semantics of the objects.
Mixed-Reality Collaborative Spaces – MPAI-MCS (Stage 1)
Enables mixed-reality collaborative spaces in which biomedical, scientific and industrial sensor streams and recordings are viewed, and where AI can be utilised for immersive presence, spatial map rendering, multi-user synchronisation, etc.
Additionally, MPAI is developing a standard titled “Governance of the MPAI ecosystem”. This will specify how:
- Implementers can get certification of the adherence of an implementation to an MPAI standard from the technical (Conformance) and ethical (Performance) viewpoints.
- End users can reliably execute AI workflows on their devices.
MPAI is currently developing 4 standards
Established less than 8 months ago, on 30 September 2020, MPAI promptly produced a process for developing its standards and immediately put it into action.
In simple terms, the MPAI process identifies the need for a standard and determines its functional requirements. It then determines the commercial requirements (framework licences), acquires technologies by issuing a public Call for Technologies, and develops the standard using the technologies proposed and evaluated.
MPAI is currently developing 4 standards, which means that the functional and commercial requirements have been developed, calls have been issued, and responses received in 4 instances:
- Artificial Intelligence Framework (MPAI-AIF) enables creation and automation of mixed ML-AI-DP processing and inference workflows. See https://mpai.community/standards/mpai-aif/
- Context-based Audio Enhancement (MPAI-CAE) improves the user experience for several audio-related applications including entertainment, communication, teleconferencing, gaming, post-production, restoration etc. in a variety of contexts. See https://mpai.community/standards/mpai-cae/
- Multimodal Conversation (MPAI-MMC) aims to enable human-machine conversation that emulates human-human conversation in completeness and intensity by using AI. See https://mpai.community/standards/mpai-mmc/
- Compression and understanding of industrial data (MPAI-CUI) aims to predict the performance of a company by applying Artificial Intelligence to governance, financial and risk data. See https://mpai.community/standards/mpai-cui/
MPAI Use Cases being standardised – Emotion enhanced speech.
Imagine that you have a sentence uttered without particular emphasis or emotion, that you have a sample sentence uttered with a particular intonation, emphasis and emotion, and that you would like the emotion-less sentence to be uttered as in the sample sentence.
This is one of the use cases belonging to the Context-based Audio Enhancement standard that MPAI is developing as part of the process described above.
What is being standardised by MPAI in this Emotion-Enhanced Speech (EES) use case? The input and output interfaces of an EES box that takes: a speech uttered without emotion (“emotion-less speech”); a segment of that speech between times t1 and t2; and a sample speech carrying the emotion, timbre etc. with which the segment between t1 and t2 at the output of the EES should be pronounced.
The EES use case does not stop there. It also defines the architecture of the box, composed of AI Modules (AIMs), of which only the functionality and the input and output data are defined, not the internals.
MPAI believes that this “lightweight” standardisation achieves two apparently opposite goals: AIMs can be obtained from different sources, and they can be replaced by AIMs with more advanced functionalities.
MPAI standards not only offer interoperability; they also build on and further promote AI innovation.
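The EES interface just described can be sketched in Python as follows. The data types, the fixed sample rate and the crude “gain” transformation are hypothetical placeholders: the actual standard specifies interfaces and data formats, not the internal processing.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SpeechSegment:
    samples: List[float]  # placeholder for a standard-format speech signal
    t1: float             # start of the segment to transform, in seconds
    t2: float             # end of the segment to transform, in seconds


def emotion_enhanced_speech(emotionless: SpeechSegment,
                            sample_with_emotion: List[float]) -> List[float]:
    """Toy stand-in for the EES box: only the interface mirrors the use case.

    A real implementation would transfer the intonation, emphasis and
    emotion of the sample onto the t1..t2 segment; here we merely scale
    that segment to mark where the transformation would apply.
    """
    sr = 1  # pretend sample rate of 1 Hz so sample indices equal seconds
    i1, i2 = int(emotionless.t1 * sr), int(emotionless.t2 * sr)
    out = list(emotionless.samples)
    gain = max(sample_with_emotion, default=1.0)  # crude "emotion intensity"
    for i in range(i1, min(i2, len(out))):
        out[i] *= gain
    return out
```

In an actual EES implementation, this single function would itself be decomposed into AIMs with standard interfaces, each replaceable by a more advanced module from a different source.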
Two weeks left to respond to the Context-based Audio Enhancement and Multimodal Conversation Calls for Technologies.
On 17 February, MPAI issued two Calls for Technologies.
The Context-based Audio Enhancement (MPAI-CAE) https://bit.ly/3rqrvn1 Call comprises 4 Use Cases designed to improve the user experience for several audio applications in contexts such as in the home, in the car, on-the-go, in the studio etc. Usage examples are: adding a desired emotion to a speech without emotion, preserving old audio tapes, improving the audioconference experience, and removing unwanted sounds while keeping the relevant ones to a user walking in the street.
The Multimodal Conversation (MPAI-MMC) https://bit.ly/3tZqF2y Call comprises 3 Use Cases that use Artificial Intelligence (AI) to enable conversations between humans and machines that emulate conversations between humans in completeness and intensity. Usage examples are: an audio-visual conversation with an emotion-understanding machine impersonated by a synthetic voice and an animated face, requesting information about an object while displaying it, a human talking to a machine doing the translation with a voice that preserves the speech features of the human.
MPAI has already received a sufficient number of intentions to submit proposals covering all use cases. However, more competition makes better standards. If you have relevant technologies, please have a look at the Call for Technologies (https://bit.ly/3ryNsAF) page, read the text of the Call of your interest, study the Use Cases and Functional Requirements document and review the Framework Licence. In case of doubt, use the Template for submissions. Your proposal should be received by 12 April 2021.
A new MPAI Call for Technologies tackles AI-based risk analysis
At its 6th General Assembly, MPAI approved the Compression and understanding of industrial data (MPAI-CUI) https://bit.ly/2PCD1hP Call for Technologies. The standard will enable prediction of a company’s performance by extracting information from its governance, financial and risk data. MPAI believes that Artificial Intelligence (AI) can achieve that goal.
In its current form, MPAI-CUI uses AI for such purposes as assessing and monitoring a company’s financial and organisational performance and the impact of vertical risks (e.g., cyber, seismic); identifying clues to a crisis or bankruptcy years in advance; and supporting financial institutions in deciding on a loan to a troubled company.
The Use Cases and Functional Requirements (https://bit.ly/3dapNkE) document identifies the requirements that proposed technologies should satisfy, e.g., a governance data ontology capturing today’s practice at the global level, a digital representation of financial statements, risk-assessment technical data with universally valid semantics, and tree-like decision models to predict the probability of a company crisis.
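As a purely illustrative example of the kind of tree-like decision model the requirements mention, the sketch below hard-codes made-up thresholds and probabilities; a real MPAI-CUI model would be trained on large amounts of company data.

```python
def crisis_probability(debt_ratio: float, liquidity: float) -> float:
    """Toy tree-like decision model over two financial indicators.

    All thresholds and probability values are invented for illustration;
    they do not come from the MPAI-CUI specification.
    """
    if debt_ratio > 0.8:                      # heavily leveraged company
        return 0.7 if liquidity < 1.0 else 0.4
    return 0.2 if liquidity < 1.0 else 0.05   # moderately leveraged company


print(crisis_probability(0.9, 0.5))  # -> 0.7
print(crisis_probability(0.5, 2.0))  # -> 0.05
```

The appeal of tree-like models in this setting is that every prediction can be traced back to explicit threshold tests on named indicators, which matters when the output supports decisions such as granting a loan.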
The Call for Technologies will be introduced at two online conferences, to be held on the 31st of March at 15.00 UTC and the 7th of April at 15.00 UTC. Interested parties are welcome to attend using the URL of the first conference call (https://bit.ly/2PdUSMf).
All parties, including non-MPAI members, who believe they have relevant technologies satisfying all or most of the MPAI-CUI Functional Requirements (https://bit.ly/39CzuaP) are invited to submit proposals using a template (https://bit.ly/39mtzpX). They are also invited to inform the secretariat (email@example.com) of their intention to respond to the Call by the 16th of April.
The MPAI-CUI Call for Technologies (https://bit.ly/3rnDl1i) requests that the technologies proposed, if accepted for inclusion in the standard, be released according to the MPAI-CUI Framework Licence (https://bit.ly/2QNWTzv) to facilitate patent holders in their definition of the final licence.
Framework Licences and MPAI standards
MPAI’s Calls for Technologies are documents that describe the purpose of the standard (called XYZ in the following), what submitters should do to respond, and how submissions will be evaluated. Additionally, a Call contains the following text that must be included in a submission: < Company > submits this technical document in response to MPAI Call for Technologies for MPAI project XYZ.
< Company > explicitly agrees to the steps of the MPAI standards development process defined in Annex 1 to the MPAI Statutes https://bit.ly/2PxO3Fm (N80), in particular
< Company > declares that < Company > or its successors will make available the terms of the Licence related to its Essential Patents according to the Framework Licence of XYZ, alone or jointly with other IPR holders, after the approval of the XYZ Technical Specification by the General Assembly and in no event after commercial implementations of the XYZ Technical Specification become available on the market.
With this declaration a submitter agrees to license their technologies that have been accepted into the XYZ standard in line with the Framework Licence of the XYZ standard. MPAI has already developed four Framework Licences, (https://bit.ly/2P2aCSM) but what is a Framework Licence?
It is the business model, defined and adopted by the MPAI Principal Members who intend to actively contribute to the standard, to monetise their patents. The Framework Licence does not contain values such as dollars, percentages, dates, etc.
Here are 3 examples of clauses contained in the Framework Licences (FWLs) adopted for the three standards mentioned in this newsletter:
- The License will be free of charge to the extent it is only used to evaluate or demo solutions or for technical trials.
- The License may be granted free of charge for particular uses if so decided by the licensors.
- A preference will be expressed on the entity that should administer the pool of patent holders.
MPAI is confident that Framework Licences will accelerate the definition of licences benefitting industry, consumers and patent holders.
MPAI is barely 5 months old, but its community is expanding. So, we thought that it might be useful to have a slim and effective communication channel to keep our extended community informed of the latest and most relevant news. We plan to have a monthly newsletter.
We are keen to hear from you, so don’t hesitate to give us your feedback.
MPAI has started the development of its first standard: MPAI-AIF.
In December last year, MPAI issued a Call for Technologies for its first standard. The call concerned “AI Framework”, an environment capable of assembling and executing AI Modules (AIMs), components that perform certain functions to achieve certain goals.
The call requested technologies to support the life cycle of single and multiple AIMs, and to manage machine learning and workflows.
The standard is expected to be released in July 2021.
MPAI is looking for technologies to develop its Context-based Audio Enhancement standard
In September last year, 3 weeks before MPAI was formally established, the group of people developing the MPAI organisation had already identified Context-based Audio Enhancement as an important target of MPAI standardisation. The idea was to improve the user experience in several audio-related applications including entertainment, communication, teleconferencing, gaming, post-production, restoration etc. The intention was promptly announced in a press release.
A lot has happened since then. Finally, in February 2021 the original intention took shape with the publication of a Call for Technologies for the upcoming Context-based Audio Enhancement (MPAI-CAE) standard.
The Call envisages 4 use cases. In Emotion-enhanced speech (left), an emotion-less synthesised or natural speech is enhanced with a specified emotion of specified intensity. In Audio recording preservation (right), sound from an old audio tape is enhanced and a preservation master file is produced using a video camera pointing at the magnetic head.
In Enhanced audioconference experience (left), speech captured in an unsuitable environment (e.g., at home) is cleaned of unwanted sounds. In Audio on the go (right), the audio experienced by a user in an environment preserves the external sounds that are considered relevant.
MPAI needs more technologies
On the same day the MPAI-CAE Call was published, MPAI published another Call for Technologies for the Multimodal Conversation (MPAI-MMC) standard. This broad application area can vastly benefit from AI.
Currently, the standard supports 3 use cases in which a human holds an audio-visual conversation with a machine emulating human-to-human conversation in completeness and intensity. In Conversation with emotion, the human holds a dialogue using speech, video and possibly text with a machine that responds with a synthesised voice and an animated face.
In Multimedia question answering (left), a human requests information about an object while displaying it. The machine responds with synthesised speech. In Personalized Automatic Speech Translation (right), a sentence uttered by a human is translated by a machine using a synthesised voice that preserves the human speech features.