Moving Picture, Audio and Data Coding
by Artificial Intelligence


MPAI approves AI Framework and calls for comments on Enhanced Audio standards

Geneva, Switzerland – 24 November 2021. After releasing 3 official standards, today the Moving Picture, Audio and Data Coding by Artificial Intelligence (MPAI) standards developing organisation has approved one standard for final publication and released one draft standard for Community Comments, the step before official release.

The standard approved for final publication is AI Framework (MPAI-AIF), a standard that enables creation and automation of mixed Machine Learning, Artificial Intelligence, Data Processing and inference workflows. The Framework can be implemented as software, hardware, or hybrid software and hardware, and is the enabler of the MPAI Store, a not-for-profit entity giving access to certified MPAI implementations.

The draft standard released for Community Comments is Context-based Audio Enhancement (MPAI-CAE). The standard supports 4 identified use cases: adding a desired emotion to an emotion-less speech segment, preserving old audio tapes, restoring audio segments and improving the audio conference experience. Comments are requested, by 15 December, prior to final approval at MPAI’s next General Assembly (MPAI-15) on 22 December 2021.

MPAI is currently working on several other standards, some of which are:

  1. Server-based Predictive Multiplayer Gaming (MPAI-SPG) uses AI to train a network that compensates data losses and detects false data in online multiplayer gaming.
  2. AI-Enhanced Video Coding (MPAI-EVC), a candidate MPAI standard improving existing video coding tools with AI and targeting short-to-medium term applications.
  3. End-to-End Video Coding (MPAI-EEV) is a recently launched MPAI exploration promising a fuller exploitation of the AI potential in a longer-term time frame than MPAI-EVC.
  4. Connected Autonomous Vehicles (MPAI-CAV) uses AI in key features: Human-CAV Interaction, Environment Sensing, Autonomous Motion, CAV to Everything and Motion Actuation.
  5. Mixed Reality Collaborative Spaces (MPAI-MCS) creates AI-enabled mixed-reality spaces populated by streamed objects such as avatars, other objects and sensor data, and their descriptors for use in meetings, education, biomedicine, science, gaming and manufacturing.

So far MPAI has published 4 standards in final form:

  1. The AI Framework (MPAI-AIF) standard.
  2. The Governance of the MPAI Ecosystem (MPAI-GME) establishing the process and rules that allow users to select and access implementations with the desired interoperability level.
  3. The Compression and Understanding of Industrial Data (MPAI-CUI) standard giving the financial risk assessment industry new, powerful and extensible means to predict the performance of a company.
  4. The Multimodal Conversation (MPAI-MMC) standard allowing industry to accelerate the availability of products, services and applications such as: multimodal conversation with a machine; requesting and receiving information via speech about a displayed object; translating speech using a synthetic voice that preserves the features of the speaker.

MPAI develops data coding standards for applications that have AI as the core enabling technology. Any legal entity supporting the MPAI mission may join MPAI if able to contribute to the development of standards for the efficient use of data.

Visit the MPAI web site and contact the MPAI secretariat for specific information.

 


Time to join MPAI

Are there serious reasons to be part of the Moving Picture, Audio and Data Coding by Artificial Intelligence (MPAI) organisation?

The latest is that, starting from the 20th of November 2021, a legal entity or a representative of an academic department able to contribute to the development of Technical Specifications can join MPAI now and have their membership extended until the end of 2022.

This, however, is more an opportunity to accelerate the decision to join MPAI, because there are other substantive reasons:

  1. Data coding is a very important technology area.
  2. Standards accelerate technology exploitation into products.
  3. AI is the technology with the highest potential to yield high-performance solutions.
  4. MPEG has shown that standards should be developed by applying technology across the board and MPAI is the only standards organisation doing so for AI-based data coding.
  5. MPEG has shown that standards should be accessible and MPAI applies the practice of making available Framework Licences to accelerate accessibility to its standards.
  6. MPAI has solid foundations, experience and a rigorous standards-development process.
  7. MPAI addresses high-profile areas: AI framework, audio enhancement, human-machine conversation, prediction of company performance, video coding, online gaming, autonomous vehicles and more.
  8. MPAI is productive: in 15 months it has developed 3 standards (governance, human-machine conversation and company performance prediction), by the end of the year it will complete 2 standards (AI framework and audio enhancement), and it has 7 more standards in the pipeline.
  9. MPAI standards are viral: products conforming to MPAI standards are already present on the market.
  10. MPAI develops methods to assess the level of conformance and reliability of standard implementations, including methods ensuring that implementations are bias-free.
  11. MPAI plans on establishing the MPAI Store, a not-for-profit commercial organisation tasked with testing implementations for security and conformance, and verifying that they are bias-free.

Join the fun, build the future!


Workshop announcement: MPAI-CUI standard assesses business performance with AI

MPAI – the international AI-based data coding standards developing organisation – has published an epoch-marking AI-based standard called MPAI-CUI (Compression and Understanding of Industrial Data) that allows the assessment of a company from its financial, governance and risk data.

An implementation of the standard is composed of five modules. The first three pre-process the input data. The fourth module – called Prediction – is a neural network that has been trained with a large amount of company data of the same type as that used by the implementation; it provides an accurate estimate of the company’s default probability and governance adequacy. The fifth module – called Perturbation – takes as input the estimate of the default probability and the assessment of vertical risks (i.e., seismic and cyber) and estimates the probability that a business discontinuity will occur in the future.
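To make the module chain above concrete, here is a minimal Python sketch of how data might flow through it. Only the module names Prediction and Perturbation come from the description above; the data fields, function names and toy formulas are assumptions made purely for illustration.

```python
# Hypothetical sketch of the MPAI-CUI module chain described above.
# Only the Prediction and Perturbation module names come from the text;
# the data fields and the toy formulas are illustrative placeholders.
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class CompanyData:
    financial: Dict[str, float]   # e.g. balance-sheet ratios
    governance: Dict[str, float]  # e.g. board-composition indices
    risks: Dict[str, float]       # vertical risks, e.g. seismic and cyber exposure

def preprocess(data: CompanyData) -> List[float]:
    """Stands in for the pre-processing modules that turn raw company
    data into the feature vector fed to the Prediction module."""
    return list(data.financial.values()) + list(data.governance.values())

def prediction(features: List[float]) -> Tuple[float, float]:
    """Stands in for the trained neural network; returns a
    (default probability, governance adequacy) pair."""
    score = min(1.0, max(0.0, sum(features) / (len(features) or 1)))
    return score, 1.0 - score

def perturbation(default_probability: float, risks: Dict[str, float]) -> float:
    """Combines the default probability with the vertical risks to estimate
    the probability of a future business discontinuity."""
    return min(1.0, default_probability + 0.5 * max(risks.values(), default=0.0))

company = CompanyData(financial={"leverage": 0.6}, governance={"independence": 0.4},
                      risks={"seismic": 0.2, "cyber": 0.3})
default_prob, adequacy = prediction(preprocess(company))
print(default_prob, adequacy, perturbation(default_prob, company.risks))
```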

The MPAI-CUI standard is complemented by a second specification called Conformance Assessment. This allows a user of an implementation of the standard to verify that the implementation is technically correct.

The standard is further complemented by a specification called Performance Assessment. The goal of the specification is to allow a user to detect whether the training of the neural network was biased against some geographic locations (e.g., North-South) and some industry types among the four currently supported: service, public, commerce and manufacturing.

The novelty of MPAI-CUI is in its ability to analyse, through AI, the large amount of data required by regulation and extract the most relevant information. Moreover, compared to state-of-the-art techniques that predict the performance of a company, MPAI-CUI allows extending the time horizon of prediction.

Companies and financial institutions can use MPAI-CUI in a variety of contexts, e.g.:

  1. To support the company’s board in deploying efficient strategies. A company can analyse its financial performance, identifying possible clues to a crisis or a risk of bankruptcy years in advance. This may help the board of directors and decision-makers make the proper decisions to avoid these situations, conduct what-if analyses, and devise efficient strategies.
  2. To assess the financial health of companies applying for funds/financial help. A financial institution receiving a request for financial help from a troubled company can access the company’s financial and organisational data and make an AI-based assessment, as well as a prediction of the future performance of the company. This helps the financial institution make the right decision on whether or not to fund that company, based on a broad view of its situation.

MPAI is organising a public workshop to promote understanding of MPAI-CUI and of its potential use in industry. The event will be held on the 25th of November 2021 at 15:00 UTC with the following agenda:

  1. Introduction (5’) – will introduce MPAI, its mission, what has been done in the year since its establishment, and its plans.
  2. MPAI-CUI standard (15’) – will describe
    1. The process that led to the standard: study of Use Cases, Functional Requirements, Commercial Requirements, Call for Technologies, Request for Community Comments and Standard.
    2. The MPAI-CUI modules and their function.
    3. Extensions under way.
    4. Some applications of the standard (banking, insurance, public administrations).
  3. Demo (15’) – a set of anonymous companies with identified financial, governance and risk features will be passed through an MPAI-CUI implementation.
  4. Q&A

MPAI calls for comments on one more candidate standard

Geneva, Switzerland – 27 October 2021. After releasing 3 official standards at its previous monthly General Assembly, today the Moving Picture, Audio and Data Coding by Artificial Intelligence (MPAI) standards developing organisation has published 1 more draft standard for comments, the step before official release.

Comments on the following are requested by 20 November, prior to final approval at MPAI’s next General Assembly (MPAI-14) on 24 November:

AI Framework (MPAI-AIF) enables creation and automation of mixed Machine Learning, Artificial Intelligence, Data Processing and inference workflows, implemented as software, hardware, or hybrid software and hardware. MPAI-AIF is also an enabler of the MPAI Store, part of the Governance of the MPAI Ecosystem (MPAI-GME) approved by MPAI-12.

MPAI-12 released the full set of the AI-based Compression and Understanding of Industrial Data (MPAI-CUI) standard – Technical Specification, Reference Software, Conformance Testing and Performance Assessment. As MPAI-12 only released the Multimodal Conversation (MPAI-MMC) Technical Specification, MPAI is currently developing the MPAI-MMC Conformance Testing specification to enable a user to verify the technical correctness of an implementation.

MPAI is currently working on several other standards, e.g.:

  1. Context-based Audio Enhancement (MPAI-CAE): adding a desired emotion to an emotion-less speech segment, preserving old audio tapes, restoring audio segments and improving the audio conference experience.
  2. Server-based Predictive Multiplayer Gaming (MPAI-SPG) uses AI to train a network that compensates data losses and detects false data in online multiplayer gaming.
  3. Connected Autonomous Vehicles (MPAI-CAV) uses AI in key features: Human-CAV Interaction, Environment Sensing, Autonomous Motion, CAV to Everything and Motion Actuation.
  4. Mixed Reality Collaborative Spaces (MPAI-MCS) creates AI-enabled mixed-reality spaces populated by streamed objects such as avatars, other objects and sensor data, and their descriptors for use in meetings, education, biomedicine, science, gaming and manufacturing.
  5. AI-Enhanced Video Coding (MPAI-EVC), a candidate MPAI standard improving existing video coding tools with AI and targeting short-to-medium term applications.
  6. End-to-End Video Coding (MPAI-EEV) is a recently launched MPAI exploration promising a fuller exploitation of the AI potential in a longer-term time frame than MPAI-EVC.

MPAI develops data coding standards for applications that have AI as the core enabling technology. Any legal entity that supports the MPAI mission may join MPAI if it is able to contribute to the development of standards for the efficient use of data.

Visit the MPAI web site and contact the MPAI secretariat for specific information.


MPAI celebrates its first anniversary approving 3 standards for publication

Geneva, Switzerland – 30 September 2021. Today, at its 12th General Assembly, the Moving Picture, Audio and Data Coding by Artificial Intelligence (MPAI) standards developing organisation has approved 3 standards for publication.

Established exactly one year ago as an international, unaffiliated, not-for-profit association, MPAI is proud to announce that the first two AI-powered standards approved today serve two of the many industries targeted by MPAI: financial risk assessment and human-to-machine communication. MPAI standards generate an ecosystem whose governance constitutes the 3rd standard approved today.

The AI-based Compression and Understanding of Industrial Data (MPAI-CUI) standard gives the financial risk assessment industry new, powerful and extensible means to predict the performance of a company. The standard includes Reference Software, Conformance Testing (to test that the standard has been correctly implemented) and Performance Assessment (to assess how well an implementation satisfies the criteria of Reliability, Robustness, Replicability and Fairness). The Reference Software is released with a modified BSD licence (link). An online demo (link) demonstrates the potential of the MPAI-CUI standard.

A slate of applications is enabled by the approved Multimodal Conversation (MPAI-MMC) standard. Industry can accelerate the availability of products, services and applications such as: holding an audio-visual conversation with a machine impersonated by a synthetic voice and an animated face; requesting and receiving information via speech about a displayed object; interpreting speech to one, two or many languages using a synthetic voice that preserves the features of the human speech.

AI is a technology with great potential for good but also for misleading use. With the document Governance of the MPAI Ecosystem (MPAI-GME) approved today, MPAI has laid down the rules governing an ecosystem of implementers and users of secure MPAI standard implementations guaranteed for Conformance and Performance, and accessible through the not-for-profit MPAI Store.

All standards can be downloaded from the MPAI web site.

MPAI has been working on AI-Enhanced Video Coding (MPAI-EVC), a candidate MPAI standard improving existing video coding tools with AI and targeting short-to-medium term applications. MPAI believes that the so-called end-to-end approach lends itself to a fuller exploitation of the AI potential. Therefore, MPAI is launching a new project called AI-based End-to-End Video Coding (MPAI-EEV) targeting long-term applications next to MPAI-EVC.

MPAI has also decided to approve the following two draft standards for community comments:

  1. Context-based Audio Enhancement (MPAI-CAE): adding a desired emotion to an emotion-less speech segment, preserving old audio tapes, restoring audio segments, improving the audio conference experience and removing unwanted sounds for a user on the go.
  2. AI Framework (MPAI-AIF) enables creation and automation of mixed Machine Learning, Artificial Intelligence, Data Processing and inference workflows, implemented as software, hardware, or hybrid software and hardware.

MPAI is currently also working on several other standards, e.g.:

  1. Server-based Predictive Multiplayer Gaming (MPAI-SPG) uses AI to train a network that compensates data losses and detects false data in online multiplayer gaming.
  2. Connected Autonomous Vehicles (MPAI-CAV) uses AI in key features: Human-CAV Interaction, Environment Sensing, Autonomous Motion, CAV to Everything and Motion Actuation.
  3. Mixed Reality Collaborative Spaces (MPAI-MCS) creates AI-enabled mixed-reality spaces populated by streamed objects such as avatars, other objects and sensor data, and their descriptors for use in meetings, education, biomedicine, science, gaming and manufacturing.

MPAI develops data coding standards for applications that have AI as the core enabling technology. Any legal entity that supports the MPAI mission may join MPAI if it is able to contribute to the development of standards for the efficient use of data.

Visit the MPAI web site and contact the MPAI secretariat for specific information.


Looking forward to MPAI’s 12th General Assembly

MPAI’s 12th General Assembly (MPAI-12) is not going to be like any of the previous 12 General Assemblies (MPAI was established at MPAI-0), and the reason is simple to explain. So far MPAI has made big announcements about its plans to develop AI-based data coding standards. In less than two weeks it plans on releasing the first three. At its last General Assembly (MPAI-11), it actually published the Working Drafts (WDs) of the three standards for “Community Comments”: MPAI-GME, MPAI-CUI, MPAI-MMC.

Comments are flowing in, and this article will briefly describe the content of the 3 standards and remind the community that the deadline for comments is close (20 September).

The first standard is “Governance of the MPAI Ecosystem” (MPAI-GME). In general, standards have powerful positive effects. MPAI Standards will naturally create an Ecosystem whose Actors are:

  1. Implementers: develop and upload Implementations to the MPAI Store.
  2. Performance Assessors: assess the Performance of Implementations, i.e., their Reliability, Robustness, Replicability and Fairness.
  3. MPAI Store: verifies security, tests Conformance and checks the positive outcome of Performance of Implementations.
  4. End Users: download and enjoy Implementations.

A system of this complexity requires governance and “Governance of the MPAI Ecosystem” lays down the rules that the Actors shall follow.

WD0.4 of MPAI-GME is available for Community Comments.

The second standard is “Compression and Understanding of Industrial Data” (MPAI-CUI). Unlike MPAI-GME – a system standard – MPAI-CUI is an application standard that currently contains one use case called AI-based Company Performance Prediction. By extracting key information from the flow of data produced by companies – currently financial and organisational data – and vertical risks – currently seismic and cyber – MPAI-CUI enables users to predict default probability and business discontinuity probability of a company.

WD0.4 of MPAI-CUI is available for Community Comments.

The third standard is “Multimodal Conversation” (MPAI-MMC), an application standard containing 5 use cases making possible a slate of applications enabling industry to accelerate the availability of products, services and applications. Examples are: holding an audio-visual conversation with a machine impersonated by a synthetic voice and an animated face; requesting and receiving information via speech about a displayed object; interpreting speech to one, two or many languages using a synthetic voice that preserves the features of the human speech.

WD0.4 of MPAI-MMC is available for Community Comments.

These are just the starters of the rich menu of MPAI-12. After the 30th of September, look at the MPAI web site blog for a full report.


The Governance of the MPAI Ecosystem

Artificial Intelligence is not just another technology coming to the fore, of the kind humankind has seen many times before. By mimicking the way humans interpret and act on new information based on their experience, AI may influence its users in subtle ways.

The MPAI Statutes read that MPAI’s mission is to produce standards for Moving Picture, Audio and Data Coding by Artificial Intelligence. While Moving Pictures and Audio have been singled out in the mission because of their importance, ultimately they are “Data”. Therefore, from now on we will only talk about Data.

The most immediate example of Data Coding is data compression. If the machine doing the compression, after careful training, has learnt that certain patterns are more common than others, it will probably compress better than a machine that has been hardwired to perform certain operations by a human who has understood how certain patterns appear and devised a mechanism to exploit those patterns. As they grow older, humans understand that the world is always more complex than it was assumed to be before.

It is the age of the power of numbers versus the ingenuity of the human that AI heralds.

A more interesting example of Data Coding is feature extraction. The machine called “human eye and brain” has few if any competitors in its ability to recognise objects. Even a short period of learning is sufficient for a newborn to recognise the faces of people and the objects in its environment. After decades of broad experience, the object-recognition ability of the adult who grew from that child is very high. If the adult, however, has lived in an environment where “being different” carries a negative connotation, it is more likely that an object with those features will be given a negative connotation. Think about how certain cultures handle certain objects or words.

Machine learning is no different. Just as education can largely influence the future behaviour of a child, the way a machine is trained influences the way that machine will respond to future stimuli.

The above is not a general warning about anything related to Artificial Intelligence. If you use a video codec where AI improves the efficiency of a traditional video codec, you should not expect that, because of the way the codec has been trained, the codec will show you a cat when the original animal was a dog. But if you use a system that assesses the performance of a company using AI, you may discover that a company with certain combinations of features is judged negatively simply because the cases used in the training phase were associated with a negative performance.

Again, AI is not like any other technology and it would be irresponsible for MPAI to simply stop with the publication of its standards and not to care about their ultimate use.

With its notion of MPAI ecosystem, MPAI has developed a strategy that allows end users to find implementations that have a level of “guarantee” that depends on the thoroughness of the “review”.

The way MPAI implements its ecosystem is based on the following elements:

  1. MPAI standardises an environment – called AI Framework (AIF) – in which AI-based applications – implemented as AI Workflows (AIW) – are executed. AIWs are composed of mostly AI-based basic components – called AI Modules (AIM). The architecture of the AIF with its components is sketched in Figure 1.

Figure 1 – Architecture and Components of the AI Framework (AIF)

  2. MPAI develops application standards, i.e., collections of use cases belonging to an application domain such as “audio enhancement”, “multimodal conversation” or “understanding financial data”, which specify the AIWs and their AIMs implementing well-identified use cases. Standardisation is lightweight (a minimal interface sketch follows this list and Figure 3) because:
    1. For AIWs: only the functions, the formats and semantics of the input and output data, and the interconnections of the AIMs are specified.
    2. For AIMs: only the functions and the formats and semantics of the input and output data are specified.

An example of a Use Case is “Unidirectional Speech Translation”. Figure 2 depicts the corresponding AIW standardised by MPAI-MMC: it translates a Speech Segment uttered in a specified language into another specified language while preserving the characteristics of the original speech.

Figure 2 – An AIW example

  3. MPAI additionally specifies how:
    1. To test the conformance of AIF-AIW-AIM implementations with the standard.
    2. To assess the performance of AIW-AIM implementations. Performance of an implementation is defined as the support of the attributes of Reliability, Robustness, Replicability and Fairness.
  4. MPAI appoints performance assessors, entities with the task to assess the performance of an implementation in a given application domain.
  5. MPAI defines interoperability as the ability to replace an AIF, AIW or AIM implementation with a functionally equivalent implementation. MPAI defines 3 interoperability levels of an AIF implementation that runs an AIW composed of AIMs:
    1. Executing any proprietary function and exposing any proprietary interface (Level 1).
    2. With functions and interfaces specified by an MPAI Application Standard (Level 2).
    3. Certified by a performance assessor (Level 3).
  6. MPAI establishes and oversees the not-for-profit commercial company MPAI Store, which distributes implementations of MPAI standards. Depending on the interoperability level of the implementation, the MPAI Store performs the following functions: it tests the security and conformance of AIFs, AIWs and AIMs, and assesses the performance of AIWs and AIMs.

Figure 3 – Operation of the MPAI ecosystem
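The lightweight nature of this specification can be visualised with a minimal Python sketch. The names AIM, AIW and AIF mirror the terms used above; the dictionary-based data passing and the method names are assumptions for illustration, not MPAI-AIF syntax.

```python
# Minimal, hypothetical sketch of the AIF/AIW/AIM decomposition described above.
# The dictionary-based interfaces and method names are illustrative only.
from abc import ABC, abstractmethod
from typing import Any, Dict, List

class AIM(ABC):
    """AI Module: only its function and the formats/semantics of its
    input and output data are standardised."""
    @abstractmethod
    def run(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        ...

class AIW:
    """AI Workflow: an interconnection of AIMs implementing a Use Case;
    its inputs, outputs and topology are given by an MPAI Application Standard."""
    def __init__(self, modules: List[AIM]):
        self.modules = modules

    def execute(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        data = inputs
        for aim in self.modules:       # each AIM's outputs feed the next AIM
            data = aim.run(data)
        return data

class AIF:
    """AI Framework: the environment in which AIWs are executed."""
    def run_workflow(self, workflow: AIW, inputs: Dict[str, Any]) -> Dict[str, Any]:
        return workflow.execute(inputs)
```

Because only functions and data formats are fixed, any AIM in the chain can be swapped for a functionally equivalent one, which is exactly what the interoperability levels listed above are meant to make possible.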

MPAI has developed the Governance of the MPAI Ecosystem WD0.4 and is seeking comments from the MPAI community. Comments should be provided by 20 September 2021, in time for review before adoption of the document by the MPAI General Assembly (MPAI-12) on 30 September.


MPAI publishes 2 draft standards and 1 document for comments

Geneva, Switzerland – 25 August 2021. At its 11th General Assembly, the international, unaffiliated Moving Picture, Audio and Data Coding by Artificial Intelligence (MPAI) standards developing organisation has published 2 draft standards and 1 foundational document for comment.

Comments are requested, by 20 September, prior to final approval at MPAI’s next General Assembly (MPAI-12), on:

  1. Compression and Understanding of Industrial Data (MPAI-CUI). AI-based Company Performance Prediction enables a user to assess a company’s default probability, organisational adequacy and business discontinuity probability in a given prediction horizon.
  2. Multimodal Conversation (MPAI-MMC). Conversation with Emotion supports audio-visual conversation with a machine impersonated by a synthetic voice and an animated face; Multimodal Question Answering supports request for information about a displayed object; Unidirectional, Bidirectional and One-to-Many Speech Translation support conversational translation using a synthetic voice that preserves the speech features of the human.
  3. Governance of the MPAI Ecosystem lays down the rules governing an ecosystem of implementers and users of secure and performance-guaranteed MPAI standard implementations accessible through the not-for-profit MPAI Store.

MPAI is currently also working on other standards, e.g.:

  1. Context-based Audio Enhancement (MPAI-CAE): adding a desired emotion to an emotion-less speech segment, preserving old audio tapes, restoring audio segments, improving the audio conference experience and removing unwanted sounds for a user on the go.
  2. AI Framework (MPAI-AIF) enables creation and automation of mixed Machine Learning, Artificial Intelligence, Data Processing and inference workflows, implemented as software, hardware, or hybrid software and hardware.
  3. Server-based Predictive Multiplayer Gaming (MPAI-SPG) uses AI to train a network that compensates data losses and detects false data in online multiplayer gaming.
  4. AI-Enhanced Video Coding (MPAI-EVC) uses AI to improve the performance of existing video coding tools.
  5. Connected Autonomous Vehicles (MPAI-CAV) uses AI in key features: Human-CAV Interaction, Environment Sensing, Autonomous Motion, CAV to Everything and Motion Actuation.
  6. Mixed Reality Collaborative Spaces (MPAI-MCS) applies Artificial Intelligence to create mixed-reality spaces populated by streamed objects such as avatars representing individuals, other objects and sensor data, and their descriptions for meetings, education, biomedicine, science, gaming and manufacturing.

MPAI develops data coding standards for applications that have AI as the core enabling technology. Any legal entity that supports the MPAI mission may join MPAI if it is able to contribute to the development of standards for the efficient use of data.

Visit the MPAI web site and contact the MPAI secretariat for specific information.


MPAI Standards

MPAI’s raison d’être is developing standards. So, including the word in the title looks like a pleonasm. But there is a reason: the word standard can be used to mean several things. Let’s first explore which ones.

In my order of importance, the first is “information representation”. If there were no standard saying that 65 (in 7 or 8 bits) means “A” and 97 means “a”, there would be no email and no WWW; indeed, many earlier technologies would not have existed either. Similarly, if there were no standard saying that 0111 means a sequence of 2 white pixels and 11 a sequence of 2 black pixels, there would be no digital fax (not that it would make a lot of difference today, but even 10 years ago it would have been a disaster). Going into more sophisticated fields, without standards there would be no MP3, which is about the digital representation of a song.
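A toy snippet can make the “information representation” idea tangible. The character codes are the ASCII values quoted above; the run-length encoder, however, is a deliberately simplified stand-in for the actual fax codes defined in ITU-T T.4, not a reproduction of them.

```python
# Toy illustration of "information representation": a shared character code
# and a simplified run-length idea. The real fax codes are defined in
# ITU-T T.4; this encoder is illustrative only.
from typing import List, Tuple

assert ord("A") == 65 and ord("a") == 97   # the shared character code points

def run_length_encode(pixels: str) -> List[Tuple[str, int]]:
    """Collapse a scanline like 'WWWBB' into [('W', 3), ('B', 2)]."""
    runs: List[Tuple[str, int]] = []
    count = 1
    for prev, cur in zip(pixels, pixels[1:]):
        if cur == prev:
            count += 1
        else:
            runs.append((prev, count))
            count = 1
    if pixels:
        runs.append((pixels[-1], count))
    return runs

print(run_length_encode("WWWBBW"))   # [('W', 3), ('B', 2), ('W', 1)]
```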

A second, apparently different, shade of the word standard is found in the Encyclopaedia Britannica, which says that a standard “permits large production runs of component parts that are readily fitted to other parts without adjustment”, which I can label as “componentisation”. Today, no car manufacturer would internally develop the nuts and bolts used in their cars (and many more sophisticated components as well). They can do that because there are standards for nuts and bolts, e.g., ISO 4014:2011 Hexagon head bolts – Product grades A and B, which specifies “the characteristics of hexagon head bolts with threads from M1,6 up to and including M64 etc.”.

MPAI is developing standards that fit the first definition, but it is also involved in standards that fit the second one. For sure, it does neither for hexagon head bolts. Actually, its first four standards, to be published shortly, cover both areas. Let’s see how.

MPAI develops its standards focusing on application domains. For instance, MPAI-CAE targets Context-based Audio Enhancement and MPAI-MMC targets Multimodal Conversation. Within these broad areas MPAI identifies Use Cases that fall in the application area and are conducive to meaningful standards. An example of an MPAI-CAE Use Case is Emotion Enhanced Speech (EES). You pronounce a sentence without particular “colour”, you give a model utterance and you ask the machine to provide your sentence with the “colour” of the given utterance. An example of an MPAI-MMC Use Case is Unidirectional Speech Translation (UST): you pronounce a sentence in your language and with your colour and you ask the machine to interpret your sentence and pronounce it in another specified language with your own colour.

If the role of MPAI stopped there its standards would be easy to write. In the case of CAE-EES, you specify the input signals – plain speech, model utterance – and the output signal – speech with colour. In the case of MMC-UST, you specify the input signals – speech in your language and the target language – and the output signal – speech in the target language.

As such standards would be of limited use, MPAI tackles the problem from a different direction. AI-based products and services typically require training. What guarantee does a user have that the box has been properly trained? What if the box has major – intentional or unintentional – performance holes? Reverse engineering an AI-based box is a dauntingly complex problem.

To decrease the complexity of the problem, MPAI splits a complex box into components. Let’s see how MPAI has modelled its Unidirectional Speech Translation (UST) Use Case. Looking at Figure 1, we can see that the UST box contains 4 sub-boxes (a minimal code sketch follows Figure 1):

  1. A Speech Recognition box receiving speech as input and providing text as output.
  2. A Translation box receiving text either from a user or from the output of Speech Recognition, in addition to a signal indicating the desired output language.
  3. A Speech Feature Extraction box able to extract what is specific to the input speech in terms of intonation, emotion, etc.
  4. A Speech Synthesis box using not only text as input but also the features of the speech input to the UST box.

MPAI standards are defined at two levels:

  1. The UST box level, by defining the input and output signals, and the function of the UST box.
  2. The UST sub-box level (the sub-boxes are what MPAI calls AI Modules – AIMs), by defining the input and output signals, and the function of each AIM.

Figure 1 – The MPAI Unidirectional Speech Translation (UST) Use Case
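To show how the two levels fit together, here is a minimal Python sketch of the UST decomposition of Figure 1. The four functions correspond to the four AIMs listed above; their signatures, stub bodies and data fields are assumptions made for illustration only.

```python
# Hypothetical sketch of the Unidirectional Speech Translation (UST) AIW.
# The four functions mirror the four AIMs listed above; all bodies are stubs.

def speech_recognition(speech: bytes) -> str:
    """Speech in, recognised text out."""
    return "hello world"                                 # stub

def translation(text: str, target_language: str) -> str:
    """Text (typed, or recognised speech) plus the desired output language in,
    translated text out."""
    return f"[{target_language}] {text}"                 # stub

def speech_feature_extraction(speech: bytes) -> dict:
    """Extracts what is specific to the input speech (intonation, emotion, ...)."""
    return {"pitch": "medium", "emotion": "neutral"}     # stub

def speech_synthesis(text: str, features: dict) -> bytes:
    """Synthesises the translated text while preserving the extracted features."""
    return text.encode()                                 # stub

def unidirectional_speech_translation(speech: bytes, target_language: str) -> bytes:
    """The UST box itself: standardised by its inputs, outputs and function."""
    text = speech_recognition(speech)
    translated = translation(text, target_language)
    features = speech_feature_extraction(speech)
    return speech_synthesis(translated, features)

print(unidirectional_speech_translation(b"...", "it"))
```

Because each function’s inputs and outputs are fixed, replacing, say, translation with a better implementation leaves the rest of the box untouched; this is the “componentisation” advantage discussed below.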

There are at least two advantages to the MPAI approach:

  1. It is possible to trace back a specific UST box output to the UST box input that generated it. This is the “information representation” part of MPAI standards.
  2. It is possible to replace individual AIMs in a UST box because the functions of the AIMs are normatively defined, and so are the syntax and semantics of each AIM’s input and output data. This is the “componentisation” part of MPAI standards.

What guarantee do you have that, by replacing an AIM in an implementation, you get a working system? The answer is “Conformance Testing”. Each MPAI Technical Specification has a corresponding Conformance Testing specification that you can run to make sure that an implementation of a Use Case or of an AIM is technically correct.

That may not be enough if you also want to know whether the AIMs do a proper job. What if the Speech Feature Extraction AIM has been poorly trained and your interpreted voice does not really sound like your voice?

The MPAI answer to this question is called “Performance Assessment”. Each MPAI Technical Specification has a corresponding Performance Assessment specification that you can run to make sure that an implementation of a Use Case or of an AIM has an acceptable grade of performance.

All this is interesting, but when will MPAI standards actually be available? The first standard (planned to be approved by the 12th MPAI General Assembly on 30 September) will support AI-based Company Performance Prediction. You feed the relevant data of a company into a box and you are told the organisational adequacy and the default probability of the company.

There is no better example than this first planned standard to understand that AI boxes cannot be treated like the Oracle of Delphi.


MPAI lays the foundations for a Mixed Reality Collaborative Spaces standard

Geneva, Switzerland – 19 July 2021. At its 10th General Assembly, the international, unaffiliated Moving Picture, Audio and Data Coding by Artificial Intelligence (MPAI) standards association has continued the development of 4 standards, progressed the study of functional requirements of 4 projects and refined the definition of two use cases.

The latest addition is Mixed Reality Collaborative Spaces (MPAI-MCS) – where MPAI is studying the application of Artificial Intelligence to the creation of mixed-reality spaces populated by streamed objects such as avatars representing geographically distributed individuals, other objects and sensor data, and their descriptions. Some of the applications envisaged are education, biomedicine, science, gaming, manufacturing and remote conferencing.

Functional requirements are being developed for:

  1. Server-based Predictive Multiplayer Gaming (MPAI-SPG) that uses AI to train a network to compensate data losses and detect false data in online multiplayer gaming.
  2. AI-Enhanced Video Coding (MPAI-EVC) that uses AI to improve the performance of existing data processing-based video coding tools.
  3. Connected Autonomous Vehicles (MPAI-CAV) that uses AI in Human-CAV Interaction, Environment Sensing, Autonomous Motion, CAV to Everything and Motion Actuation.
  4. Integrative Genomic/Sensor Analysis (MPAI-GSA) that uses AI to compress and understand data from combined genomic and other experiments.

The four standards are at an advanced stage of development:

  1. Compression and Understanding of Industrial Data (MPAI-CUI) covers the AI-based Company Performance Prediction instance; it enables prediction of default probability and assessment of organisational adequacy using the governance, financial and risk data of a given company.
  2. Multimodal Conversation (MPAI-MMC) covers three instances: audio-visual conversation with a machine impersonated by a synthesised voice and an animated face, request for information about a displayed object, and translation of a sentence using a synthetic voice that preserves the speech features of the human.
  3. Context-based Audio Enhancement (MPAI-CAE) covers four instances: adding a desired emotion to a speech segment without emotion, preserving old audio tapes, improving the audio conference experience, and removing unwanted and keeping relevant sounds for a user on the go.
  4. AI Framework standard (MPAI-AIF) enables creation and automation of mixed Machine Learning (ML) – Artificial Intelligence (AI) – Data Processing (DP) – inference workflows, implemented as software, hardware, or mixed software and hardware.

MPAI develops data coding standards for applications that have AI as the core enabling technology. Any legal entity that supports the MPAI mission may join MPAI if it is able to contribute to the development of standards for the efficient use of data.

Visit the MPAI web site and contact the MPAI secretariat for specific information.