Moving Picture, Audio and Data Coding
by Artificial Intelligence

Looking forward to MPAI’s 12th General Assembly

MPAI’s 12th General Assembly (MPAI-12) is not going to be like any of the previous 12 General Assemblies (MPAI was established at MPAI-0), and the reason is simple to explain. So far, MPAI has made big announcements about its plans to develop AI-based data coding standards. In less than two weeks it plans to release the first three. At its last General Assembly (MPAI-11), it published the working drafts (WDs) of the three standards for “Community Comments”: MPAI-GME, MPAI-CUI and MPAI-MMC.

Comments are flowing in, and this article will briefly describe the content of the three standards and remind the community that the deadline for comments is close (20 September).

The first standard is “Governance of the MPAI Ecosystem” (MPAI-GME). In general, standards have powerful positive effects. MPAI Standards will naturally create an Ecosystem whose Actors are:

  1. Implementers: develop and upload Implementations to the MPAI Store.
  2. Performance Assessors: assess the Performance of Implementations, i.e., their Reliability, Robustness, Replicability and Fairness.
  3. MPAI Store: verifies security, tests Conformance and checks that Performance Assessment of Implementations had a positive outcome.
  4. End Users: download and enjoy Implementations.

A system of this complexity requires governance and “Governance of the MPAI Ecosystem” lays down the rules that the Actors shall follow.

WD0.4 of MPAI-GME is available for Community Comments.

The second standard is “Compression and Understanding of Industrial Data” (MPAI-CUI). Unlike MPAI-GME – a system standard – MPAI-CUI is an application standard that currently contains one use case called AI-based Company Performance Prediction. By extracting key information from the flow of data produced by companies – currently financial and organisational data – and vertical risks – currently seismic and cyber – MPAI-CUI enables users to predict default probability and business discontinuity probability of a company.

WD0.4 of MPAI-CUI is available for Community Comments.

The third standard is “Multimodal Conversation” (MPAI-MMC), an application standard containing five use cases that enable industry to accelerate the availability of products, services and applications. Examples are: holding an audio-visual conversation with a machine impersonated by a synthetic voice and an animated face; requesting and receiving information via speech about a displayed object; interpreting speech into one, two or many languages using a synthetic voice that preserves the features of the human speech.

WD0.4 of MPAI-MMC is available for Community Comments.

These are just the starters of the rich menu of MPAI-12. After the 30th of September, look at the MPAI web site blog for a full report.


The Governance of the MPAI Ecosystem

Artificial Intelligence is not just another technology coming to the fore, as humankind has seen many times before. By mimicking the way humans interpret and act on new information based on their experience, AI may influence its users in subtle ways.

The MPAI Statutes state that MPAI’s mission is to produce standards for Moving Picture, Audio and Data Coding by Artificial Intelligence. While Moving Pictures and Audio have been singled out in the mission because of their importance, ultimately they are “Data”. Therefore, from now on we will only talk about Data.

The most immediate example of Data Coding is data compression. If the machine doing the compression has learnt, after careful training, that certain patterns are more common than others, it will probably compress better than a machine that has been hardwired to perform certain operations by a human who has understood how certain patterns appear and devised a mechanism to exploit them. As they grow older, humans understand that the world is always more complex than it was assumed to be before.

AI heralds the age of the power of numbers versus the ingenuity of the human.

A more interesting example of Data Coding is feature extraction. The machine called “human eye and brain” has few if any machines capable of competing with it in its ability to recognise objects. Even a short period of learning is sufficient for a newborn to recognise the faces of people and the objects in the environment. After decades of broad experience, the object recognition ability of the adult who has grown from that child is very high. If the adult, however, has lived in an environment where “being different” carries a negative connotation, it is more likely that an object with those features will have a negative connotation. Think about how certain cultures handle certain objects or words.

Machine learning is no different. As much as education can largely influence the future behaviour of a child, the way a machine is trained influences the way that machine will respond to future stimuli.

The above is not a general warning about anything related with Artificial Intelligence. If you use a video codec where AI improves the efficiency of a traditional video codec, you should not expect that, because of the way the codec has been trained, the codec will show you a cat when the original animal was a dog. But if you use a system that assesses the performance of a company using AI, you may discover that a company with certain combinations of features is judged negatively simply because the cases used in the training phase were associated with a negative performance.

Again, AI is not like any other technology and it would be irresponsible for MPAI to simply stop with the publication of its standards and not to care about their ultimate use.

With its notion of MPAI Ecosystem, MPAI has developed a strategy that allows end users to find implementations that have a level of “guarantee” that depends on the thoroughness of the “review”.

The way MPAI implements its ecosystem is based on the following elements:

  1. MPAI standardises an environment – called AI Framework (AIF) – in which AI-based applications – implemented as AI Workflows (AIW) – are executed. AIWs are composed of mostly AI-based basic components called AI Modules (AIM). The architecture of the AIF with its components is sketched in Figure 1.

Figure 1 – Architecture and Components of the AI Framework (AIF)

  2. MPAI develops application standards, i.e., collections of use cases belonging to an application domain such as “audio enhancement”, “multimodal conversation” or “understanding financial data”, which specify the AIWs and their AIMs implementing well-identified use cases. Standardisation is lightweight because:
    1. For AIWs: only functions, and formats and semantics of the input and output data and the interconnections of the AIMs are specified
    2. For AIMs: only functions, and the formats and semantics of the input and output data are specified.

An example of Use Case is “Unidirectional Speech Translation”. The AIW of Figure 2, standardised by MPAI-MMC, translates a Speech Segment uttered in a specified language into another specified language, preserving the characteristics of the original speech.

Figure 2 – An AIW example

  3. MPAI additionally specifies how
    1. To test the conformance of AIF-AIW-AIM implementations with the standard
    2. To assess the performance of AIW-AIM implementations. Performance of an implementation is defined as the support of the attributes of Reliability, Robustness, Replicability and Fairness.
  4. MPAI appoints performance assessors, entities tasked with assessing the performance of implementations in a given application domain.
  5. MPAI defines interoperability as the ability to replace an AIF, AIW or AIM implementation with a functionally equivalent implementation. MPAI defines 3 interoperability levels of an AIF implementation that runs an AIW composed of AIMs:
    1. Executing any proprietary function and exposing any proprietary interface (Level 1).
    2. With functions and interfaces specified by an MPAI Application Standard (Level 2).
    3. Certified by a performance assessor (Level 3).
  6. MPAI establishes and oversees the not-for-profit commercial company MPAI Store, which distributes implementations of MPAI standards. Depending on the interoperability level of the implementation, the MPAI Store performs the following functions: tests security and conformance of AIFs, AIWs and AIMs, and assesses performance of AIWs and AIMs.
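The AIW/AIM split described above can be illustrated with a minimal Python sketch (all names are invented for illustration; MPAI standards specify interfaces and data formats normatively, not code). Because a standard fixes only each AIM's function and its input/output formats, any conforming implementation can be slotted into the workflow:

```python
from dataclasses import dataclass
from typing import Protocol

# Hypothetical sketch: an MPAI Application Standard specifies only the
# function of each AIM and the formats of its input/output data, not
# its internals, so implementations are interchangeable.

class AIM(Protocol):
    def process(self, inputs: dict) -> dict: ...

@dataclass
class AIW:
    """An AI Workflow: AIMs connected in a standard-specified topology."""
    aims: list  # AIMs in execution order (a simple chain here)

    def run(self, inputs: dict) -> dict:
        data = inputs
        for aim in self.aims:
            data = aim.process(data)  # output format of one AIM matches
        return data                   # the input format of the next

class UpperCaseAIM:
    """Toy AIM standing in for, e.g., a Speech Recognition module."""
    def process(self, inputs: dict) -> dict:
        return {"text": inputs["text"].upper()}

workflow = AIW(aims=[UpperCaseAIM()])
print(workflow.run({"text": "hello"}))  # {'text': 'HELLO'}
```

Replacing `UpperCaseAIM` with any other object exposing the same `process` interface and data format leaves the workflow unchanged, which is the interoperability property the levels above are about.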

Figure 3 – Operation of the MPAI ecosystem

MPAI has developed WD0.4 of the Governance of the MPAI Ecosystem and is seeking comments from the MPAI community. Comments should be provided by 20 September 2021, in time for review before adoption of the document by the MPAI General Assembly (MPAI-12) on 30 September.


MPAI publishes 2 draft standards and 1 document for comments

Geneva, Switzerland – 25 August 2021. At its 11th General Assembly, the international, unaffiliated Moving Picture, Audio and Data Coding by Artificial Intelligence (MPAI) standards developing organisation has published 2 draft standards and 1 foundational document for comment.

Comments are requested, by 20 September, prior to final approval at MPAI’s next General Assembly (MPAI-12) on:

  1. Compression and Understanding of Industrial Data (MPAI-CUI). AI-based Company Performance Prediction enables a user to assess a company’s default probability, organisational adequacy and business discontinuity probability in a given prediction horizon.
  2. Multimodal Conversation (MPAI-MMC). Conversation with Emotion supports audio-visual conversation with a machine impersonated by a synthetic voice and an animated face; Multimodal Question Answering supports requests for information about a displayed object; Unidirectional, Bidirectional and One-to-Many Speech Translation support conversational translation using a synthetic voice that preserves the speech features of the human.
  3. Governance of the MPAI Ecosystem lays down the rules governing an ecosystem of implementers and users of secure and performance-guaranteed MPAI standard implementations accessible through the not-for-profit MPAI Store.

MPAI is currently also working on other standards, e.g.:

  1. Context-based Audio Enhancement (MPAI-CAE): adding a desired emotion to an emotion-less speech segment, preserving old audio tapes, restoring audio segments, improving the audio conference experience and removing unwanted sounds for a user on the go.
  2. AI Framework (MPAI-AIF) enables creation and automation of mixed Machine Learning, Artificial Intelligence, Data Processing and inference workflows, implemented as software, hardware, or hybrid software and hardware.
  3. Server-based Predictive Multiplayer Gaming (MPAI-SPG) uses AI to train a network that compensates data losses and detects false data in online multiplayer gaming.
  4. AI-Enhanced Video Coding (MPAI-EVC) uses AI to improve the performance of existing video coding tools.
  5. Connected Autonomous Vehicles (MPAI-CAV) uses AI in key features: Human-CAV Interaction, Environment Sensing, Autonomous Motion, CAV to Everything and Motion Actuation.
  6. Mixed Reality Collaborative Spaces (MPAI-MCS) applies Artificial Intelligence to create mixed-reality spaces populated by streamed objects such as avatars representing individuals, other objects and sensor data, and their descriptions, for meetings, education, biomedicine, science, gaming and manufacturing.

MPAI develops data coding standards for applications that have AI as the core enabling technology. Any legal entity that supports the MPAI mission may join MPAI if it is able to contribute to the development of standards for the efficient use of data.

Visit the MPAI web site and contact the MPAI secretariat for specific information.


MPAI Standards

MPAI’s raison d’être is developing standards. So, including the word in the title looks like a pleonasm. But there is a reason: the word standard can be used to mean several things. Let’s first explore which ones.

In my order of importance, the first is “information representation”. If there were no standard saying that 65 (in 7 or 8 bits) means “A” and 97 means “a”, there would be no email and no WWW. Indeed, many things before them would not have existed either. Similarly, if there were no standard saying that 0111 means a sequence of 3 white pixels and 11 a sequence of 3 black pixels, there would be no digital fax (not that it would make a lot of difference today, but even 10 years ago it would have been a disaster). Going into more sophisticated fields, without standards there would be no MP3, which is about the digital representation of a song.
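The two examples above can be made concrete in a few lines of Python: the character mapping is the ASCII standard, and the fax-style pixel code is a run-length codebook (the codewords below are illustrative stand-ins, not the actual ITU-T fax tables):

```python
# ASCII: the standard maps 65 -> "A" and 97 -> "a".
assert ord("A") == 65 and ord("a") == 97

# Toy run-length decoder: each entry plays the role of a fax codeword
# such as "0111 = three white pixels" (codes invented for illustration).
CODEBOOK = {"0111": ("W", 3), "11": ("B", 3)}

def decode(bits: str) -> str:
    out, i = [], 0
    while i < len(bits):
        for code, (colour, run) in CODEBOOK.items():
            if bits.startswith(code, i):
                out.append(colour * run)
                i += len(code)
                break
        else:
            raise ValueError("unknown code")
    return "".join(out)

print(decode("011111"))  # WWWBBB
```

Without agreement on the codebook, sender and receiver cannot interoperate, which is the whole point of an information representation standard.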

A second, apparently different, shade of the word standard is found in the Encyclopaedia Britannica, which says that a standard “permits large production runs of component parts that are readily fitted to other parts without adjustment” – something I can label as “componentisation”. Today, no car manufacturer would internally develop the nuts and bolts used in its cars (and many more sophisticated components as well). They can do that because there are standards for nuts and bolts, e.g., ISO 4014:2011 Hexagon head bolts – Product grades A and B, which specifies “the characteristics of hexagon head bolts with threads from M1,6 up to and including M64 etc.”.

MPAI is developing standards that fit the first definition, but it is also involved in standards that fit the second one. For sure, it does neither for hexagon head bolts. Actually, its first four standards, to be published shortly, cover both areas. Let’s see how.

MPAI develops its standards focusing on application domains. For instance, MPAI-CAE targets Context-based Audio Enhancement and MPAI-MMC targets Multimodal Conversation. Within these broad areas MPAI identifies Use Cases that fall in the application area and are conducive to meaningful standards. An example of MPAI-CAE Use Case is Emotion Enhanced Speech (EES): you pronounce a sentence without particular “colour”, you give a model utterance and you ask the machine to provide your sentence with the “colour” of the given utterance. An example of MPAI-MMC Use Case is Unidirectional Speech Translation (UST): you pronounce a sentence in your language and with your colour, and you ask the machine to interpret your sentence and pronounce it in another specified language with your own colour.

If the role of MPAI stopped there its standards would be easy to write. In the case of CAE-EES, you specify the input signals – plain speech, model utterance – and the output signal – speech with colour. In the case of MMC-UST, you specify the input signals – speech in your language and the target language – and the output signal – speech in the target language.

As such standards would be of limited use, MPAI tackles the problem from a different direction. AI-based products and services typically require training. What guarantee does a user have that the box has been properly trained? What if the box has major – intentional or unintentional – performance holes? Reverse engineering an AI-based box is a dauntingly complex problem.

To decrease the complexity of the problem, MPAI splits a complex box into components. Let’s see how MPAI has modelled its Unidirectional Speech Translation (UST) Use Case. Looking at Figure 1, we can see that the UST box contains 4 sub-boxes:

  1. A Speech Recognition box receiving speech as input and providing text as output.
  2. A Translation box receiving text either from a user or from the output of Speech Recognition, in addition to a signal indicating the desired output language.
  3. A Speech Feature Extraction box able to extract what is specific to the input speech in terms of intonation, emotion, etc.
  4. A Speech Synthesis box using not only text as input but also the features of the speech input to the UST box.
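The four-box decomposition above can be sketched as a simple pipeline (all functions are illustrative stubs with invented behaviour; the real AIMs would wrap trained models):

```python
# Hedged sketch of the UST decomposition: four AIMs chained so that
# speech features extracted from the input "colour" the synthesised
# output.  Every function body here is a toy stand-in.

def speech_recognition(speech: bytes) -> str:
    return "hello world"  # stub: speech in, text out

def translation(text: str, target_lang: str) -> str:
    return {"it": "ciao mondo"}.get(target_lang, text)  # stub

def speech_feature_extraction(speech: bytes) -> dict:
    return {"intonation": "rising", "emotion": "neutral"}  # stub

def speech_synthesis(text: str, features: dict) -> bytes:
    # A real synthesiser would use the features to preserve the
    # speaker's "colour"; here we just tag the output with one.
    return f"{text} [{features['emotion']}]".encode()

def ust(speech: bytes, target_lang: str) -> bytes:
    text = speech_recognition(speech)
    translated = translation(text, target_lang)
    features = speech_feature_extraction(speech)
    return speech_synthesis(translated, features)

print(ust(b"...", "it"))  # b'ciao mondo [neutral]'
```

The point of the decomposition is visible in the code: each sub-box has a narrow, inspectable interface, so each can be tested and replaced independently of the others.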

MPAI standards are defined at two levels:

  1. The UST box level, by defining the input and output signals, and the function of the UST box.
  2. The UST sub-box level (sub-boxes that MPAI calls AI Modules – AIMs), by defining the input and output signals, and the function of each AIM.

Figure 1 – The MPAI Unidirectional Speech Translation (UST) Use Case

There are at least two advantages of the MPAI approach:

  1. It is possible to trace back a specific UST box output to the UST box input that generated it. This is the “information representation” part of MPAI standards.
  2. It is possible to replace individual AIMs in a UST box because the functions of the AIMs are normatively defined and so is the syntax and semantics of each AIM input and output data. This is the “componentisation” part of MPAI standards.

What guarantee do you have that by replacing an AIM in an implementation you get a working system? The answer is “Conformance Testing”. Each MPAI Technical Specification has a corresponding Conformance Testing specification that you can run to make sure that an implementation of a Use Case or of an AIM is technically correct.
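To make the idea tangible, here is a sketch of what one step of a conformance run might check for a single AIM: that, for reference inputs in the standard format, the implementation produces outputs whose syntax matches the specification. The schema and test vectors are invented for illustration; the actual Conformance Testing specifications define their own reference data:

```python
# Hypothetical conformance check: verify every output carries the
# required fields with the required types (syntax only; semantics
# would need reference outputs as well).

def conforms(aim, test_vectors, output_schema) -> bool:
    for inputs in test_vectors:
        out = aim.process(inputs)
        for field, ftype in output_schema.items():
            if field not in out or not isinstance(out[field], ftype):
                return False
    return True

class EchoTextAIM:
    """Toy AIM under test."""
    def process(self, inputs):
        return {"text": str(inputs["text"])}

ok = conforms(EchoTextAIM(),
              test_vectors=[{"text": "hi"}],
              output_schema={"text": str})
print(ok)  # True
```

An AIM that passes such a check can be swapped into a workflow without breaking the data contracts between modules, which is exactly the guarantee the text describes.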

That may not be enough if you also want to know whether the AIMs do a proper job. What if the Speech Feature Extraction AIM has been poorly trained and your interpreted voice does not really sound like your voice?

The MPAI answer to this question is called “Performance Assessment”. Each MPAI Technical Specification has a corresponding Performance Testing specification that you can run to make sure that an implementation of a Use Case or of an AIM has an acceptable grade of performance.

All this is interesting, but when will MPAI standards actually be available? The first standard (planned to be approved by the 12th MPAI General Assembly on 30 September) will support AI-based Company Performance Prediction. You feed the relevant data of a company into a box and you are told the organisational adequacy and the default probability of the company.

There is no better example than this first planned standard to understand that AI boxes cannot be treated like the Oracle of Delphi.


MPAI lays the foundations for a Mixed Reality Collaborative Spaces standard

Geneva, Switzerland – 19 July 2021. At its 10th General Assembly, the international, unaffiliated Moving Picture, Audio and Data Coding by Artificial Intelligence (MPAI) standards association has continued the development of 4 standards, progressed the study of functional requirements of 4 projects and refined the definition of two use cases.

The latest addition is Mixed Reality Collaborative Spaces (MPAI-MCS) – where MPAI is studying the application of Artificial Intelligence to the creation of mixed-reality spaces populated by streamed objects such as avatars representing geographically distributed individuals, other objects and sensor data, and their descriptions. Some of the applications envisaged are education, biomedicine, science, gaming, manufacturing and remote conferencing.

Functional requirements are being developed for

  1. Server-based Predictive Multiplayer Gaming (MPAI-SPG) that uses AI to train a network to compensate data losses and detect false data in online multiplayer gaming.
  2. AI-Enhanced Video Coding (MPAI-EVC) that uses AI to improve the performance of existing data processing-based video coding tools.
  3. Connected Autonomous Vehicles (MPAI-CAV) that uses AI in Human-CAV Interaction, Environment Sensing, Autonomous Motion, CAV to Everything and Motion Actuation.
  4. Integrative Genomic/Sensor Analysis (MPAI-GSA) that uses AI to compress and understand data from combined genomic and other experiments.

The four standards are at an advanced stage of development:

  1. Compression and Understanding of Industrial Data (MPAI-CUI) covers the AI-based Company Performance Prediction instance, which enables prediction of default probability and assessment of organisational adequacy using governance, financial and risk data of a given company.
  2. Multimodal Conversation (MPAI-MMC) covers three instances: audio-visual conversation with a machine impersonated by a synthesised voice and an animated face, request for information about a displayed object, and translation of a sentence using a synthetic voice that preserves the speech features of the human.
  3. Context-based Audio Enhancement (MPAI-CAE) covers four instances: adding a desired emotion to a speech segment without emotion, preserving old audio tapes, improving the audio conference experience, and removing unwanted sounds while keeping relevant ones for a user on the go.
  4. AI Framework standard (MPAI-AIF) enables creation and automation of mixed Machine Learning (ML) – Artificial Intelligence (AI) – Data Processing (DP) – inference workflows, implemented as software, hardware, or mixed software and hardware.

MPAI develops data coding standards for applications that have AI as the core enabling technology. Any legal entity that supports the MPAI mission may join MPAI if it is able to contribute to the development of standards for the efficient use of data.

Visit the MPAI web site and contact the MPAI secretariat for specific information.


MPAI Status report – July 2021

MPAI was established on 30 September 2020 as a not-for-profit unaffiliated organisation with the mission: 1) to develop data coding standards based on Artificial Intelligence and 2) to bridge the gap between standards and their practical use through Framework Licences.

MPAI develops its standards through a rigorous process depicted in Figure 1.

Figure 1 – Process to develop MPAI standards

An MPAI standard passes through 6+1 stages. Anybody can contribute to the first 3 stages. The General Assembly approves the progression of a standard to the next stage.

MPAI defines standard interfaces of AI Modules (AIM) combined and executed in an MPAI-specified AI-Framework (AIF). AIMs receive data with standard formats and produce output data with standard formats.

Figure 2 – The MPAI AI Module (AIM); Figure 3 – The MPAI AI Framework (AIF)

MPAI is currently engaged in the development of 10 technical specifications. The table below gives the current stage, the MPAI name, the title and the scope of each standard. The first 4 standards will be approved within 2021.

Table 1 – MPAI standards under development

Stage | MPAI name | Title | Scope
5 | MPAI-AIF | AI Framework | Specifies 6 elements (Management and Control, AIM, Execution, Communication, Storage and Access) to enable creation and automation of mixed ML-AI-DP processing and inference workflows.
5 | MPAI-CAE | Context-based Audio Enhancement | Improves the user experience in audio applications, e.g., entertainment, communication, teleconferencing, gaming, post-production, restoration etc., for different contexts, e.g., in the home, in the car, on the go, in the studio etc.
5 | MPAI-MMC | Multimodal Conversation | Enables human-machine conversation that emulates human-human conversation in completeness and intensity.
5 | MPAI-CUI | Compression and Understanding of Industrial Data | Enables AI-based filtering and extraction of governance, financial and risk data to predict company performance.
3 | MPAI-SPG | Server-based Predictive Multiplayer Gaming | Minimises the audio-visual discontinuities caused by network disruption during an online real-time game and provides a response to the need to detect which players are cheating.
3 | MPAI-GSA | Integrative Genomic/Sensor Analysis | Understands and compresses the results of high-throughput experiments combining genomic/proteomic and other data, e.g., from video, motion, location, weather and medical sensors.
3 | MPAI-EVC | AI-Enhanced Video Coding | Substantially enhances the performance of a traditional video codec by improving or replacing traditional tools with AI-based tools.
3 | MPAI-CAV | Connected Autonomous Vehicles | Uses AI to enable a Connected Autonomous Vehicle with 3 subsystems: Human-CAV Interaction, Autonomous Motion and CAV-Environment Interaction.
2 | MPAI-OSD | Visual Object and Scene Description | A collection of Use Cases sharing the goal of describing visual objects and locating them in space. Scene description includes the description of objects and their attributes in a scene and the semantics of the objects.
1 | MPAI-MCS | Mixed-Reality Collaborative Spaces | Enables mixed-reality collaborative space scenarios where biomedical, scientific and industrial sensor streams and recordings are viewed, and where AI can be utilised for immersive presence, spatial map rendering, multiuser synchronisation etc.

Additionally, MPAI is engaged in the development of a standard titled “Governance of the MPAI ecosystem”. This standard will specify how:

  1. Implementers can get certification of the adherence of an implementation to an MPAI standard from the technical (Conformance) and ethical (Performance) viewpoint.
  2. End users can reliably execute AI workflows on their devices.

Any legal entity supporting the mission of MPAI, if able to contribute to the development of standards for the efficient use of Data, may apply for MPAI membership. Additionally, individuals representing technical departments of academic institutions may apply for Associate Membership.


MPAI opens new projects leveraging its unique approach to AI standards

Geneva, Switzerland – 09 June 2021. At its 9th General Assembly, the international, unaffiliated Moving Picture, Audio and Data Coding by Artificial Intelligence (MPAI) standards association has started the development of new projects while continuing the development of four standards.

MPAI application standards use aggregations of components called AI Modules (AIM) executed in an MPAI-specified environment called AI Framework (AIF). MPAI only specifies the interfaces, not the internals of the AIMs, to enable interoperability while promoting the emergence of an open market of components.

MPAI is currently developing functional requirements for a future Connected Autonomous Vehicles standard (MPAI-CAV). Three CAV subsystems are being targeted: Human-CAV interaction, Autonomous Motion and CAV-environment interaction. All data flows between the AIMs in the 3 subsystems are being identified and requirements developed.

The four standards under development are:

  1. AI Framework standard (MPAI-AIF) enables creation and automation of mixed Machine Learning (ML) – Artificial Intelligence (AI) – Data Processing (DP) – inference workflows, implemented as software, hardware, or mixed software and hardware.
  2. Context-based Audio Enhancement (MPAI-CAE) covers four instances: adding a desired emotion to a speech without emotion, preserving old audio tapes, improving the audioconference experience and removing unwanted sounds while keeping the relevant ones for a user walking in the street.
  3. Multimodal Conversation (MPAI-MMC) covers three instances: audio-visual conversation with a machine impersonated by a synthesised voice and an animated face, request for information about a displayed object, and translation of a sentence using a synthetic voice that preserves the speech features of the human.
  4. Compression and Understanding of Industrial Data (MPAI-CUI) currently includes one instance: AI-based Company Performance Prediction, enabling prediction of performance, e.g., organisational adequacy or default probability, by extracting information from governance, financial and risk data of a given company.

The MPAI web site provides information about other AI-based standards being developed: AI-Enhanced Video Coding (MPAI-EVC) will improve the performance of existing video codecs using AI, Server-based Predictive Multiplayer Gaming (MPAI-SPG) will compensate data loss and detect false data in online multiplayer gaming, and Integrative Genomic/Sensor Analysis (MPAI-GSA) will compress and understand data from combined genomic and other experiments produced by related devices/sensors.

MPAI develops data coding standards for applications that have AI as the core enabling technology. Any legal entity that supports the MPAI mission may join MPAI if it is able to contribute to the development of standards for the efficient use of data.

Visit the MPAI web site and contact the MPAI secretariat for specific information.

 


Three minutes to know what you need to know about MPAI

If media have become so pervasive, it is because smart use of data processing has reduced the amount of data generated by audio and video. Digital media can give more, but the spirit that produced MP3, digital television, audio and video on the internet, DASH and so much more has waned.

The new organisation MPAI – Moving Picture, Audio and Data Coding by Artificial Intelligence (AI) – has the necessary propulsive thrust. MPAI uses AI as the enabling technology to code data.

The term AI is in the MPAI title because AI is the enabling technology extending coding from compression (i.e., less bits for a similar result) to understanding (i.e., what the bits mean), and the use of coding from media, to many more data types.

Data processing remains a valid alternative to AI in many domains, though.

MPAI has defined 5 pillars on which it bases its operation. The formulation of the process benefits from 30+ years of standardisation during which a huge organisation – MPEG – was created from nothing and processes were fine-tuned from day-to-day real-world experience.

Pillar #1 – the process

MPAI likes to call itself, as the domain extension in mpai.community implies, a “community”. The development of MPAI standards is divided into 4 phases: Preparation, Framework Licence, Standard Development and Standard Approval. They are characterised by those who are allowed to participate in each phase. In total there are 7 stages, as can be seen from Figure 1.

Figure 1 – The MPAI standard development stages

Phase 1 – Preparation.

Stage 0 – Interest Collection (IC): Members as well as non-members may submit proposals. These are collected and harmonised. Some proposals get merged with other similar proposals. Some get split because the harmonisation process so demands. The goal is to identify standard proposals that reflect the proponents’ wishes while making sense in terms of specification and use across different environments.

Stage 1 – Use Case (UC): Use Cases are fully characterised and a description of the work programme that will produce the Functional Requirements is developed.

Stage 2 – Functional Requirements (FR): detailed functional requirements of the Use Case are developed.

In the three stages above, MPAI is “open” in the sense that anybody interested may participate. However, if an MPAI Member wants to discuss a confidential proposal, only MPAI members may attend. From the Commercial Requirements stage onward, non-members are not allowed to participate (but they may become members at any time).

Phase 2 – Framework Licence

Stage 3 – Commercial Requirements (CR): a supply contract describes the characteristics (Functional Requirements) and the conditions (Commercial Requirements) of the supply. Antitrust laws do not permit sellers (technology providers) and buyers (standard users) to sit together and agree on values such as numbers, percentages or dates. However, sellers (technology providers) may indicate supply conditions, without values. Therefore, the embodiment of the Commercial Requirements, i.e., the Framework Licence, will not contain such details. Only Principal Members who declare they will make technical contributions to the standard can participate in the drafting of the Framework Licence.

Phase 3 – Standard Development

Stage 4 – Call for Technologies (CT): once both Requirements are available, MPAI is in a position to draft the CfT. Anybody may respond to a CfT. However, if a technology proposed by a non-member is accepted, the responder must join MPAI.

Stage 5 – Standard Development (SD): the Development Committee in charge reviews the responses and develops the standards.

Phase 4 – Standard approval

Stage 6 – MPAI Standard (MS): only Principal Members may vote to approve the standard and hence trigger its publication. Associate Members, however, may become Principal Members at any time.

For each standard project, transition to each of the 7 stages of Figure 1 must be approved by a resolution of the General Assembly.
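The staged process above can be sketched as a simple state machine. All names below are hypothetical illustration, not MPAI-defined APIs; the two rules encoded are the ones stated in the article: every transition needs a General Assembly resolution, and participation is open to non-members only before the Commercial Requirements stage.

```python
from enum import IntEnum

class Stage(IntEnum):
    """The 7 stages of an MPAI standard project (illustrative sketch)."""
    INTEREST_COLLECTION = 0
    USE_CASE = 1
    FUNCTIONAL_REQUIREMENTS = 2
    COMMERCIAL_REQUIREMENTS = 3
    CALL_FOR_TECHNOLOGIES = 4
    STANDARD_DEVELOPMENT = 5
    MPAI_STANDARD = 6

def advance(stage: Stage, ga_resolution: bool) -> Stage:
    """Move a project to the next stage; every transition requires
    a General Assembly resolution."""
    if not ga_resolution:
        raise PermissionError("transition requires a General Assembly resolution")
    if stage is Stage.MPAI_STANDARD:
        raise ValueError("project already reached the final stage")
    return Stage(stage + 1)

def open_to_non_members(stage: Stage) -> bool:
    """Stages 0-2 are open to anybody; from Commercial Requirements
    onward only members participate."""
    return stage < Stage.COMMERCIAL_REQUIREMENTS
```

The enum values follow the stage numbering used in the article, so the ordering check doubles as the openness rule.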

Pillar #2 & #3 – AI Modules and Framework

MPAI makes assumptions about the internal structure of an AI system in order to provide levels of guarantee about the “ethical performance” of a system that implements an MPAI standard.

  1. An implementation of an MPAI-specified Use Case is subdivided into functional components called AI Modules (AIMs) that use Artificial Intelligence (AI), Machine Learning (ML), traditional Data Processing (DP) or a combination of these, and are implemented in software, hardware or mixed hardware and software.
  2. An AI system implementing a Use Case is an aggregation of the AIMs specified by the Use Case, interconnected in the topology specified by the standard and executed inside an AI Framework (AIF).

The two basic elements of MPAI standardisation are represented in Figure 2 and Figure 3.

Figure 2 – The MPAI AI Module (AIM). Figure 3 – The MPAI AI Framework (AIF)

Figure 2 depicts a video coming from a camera shooting a human face. The function of the AIM (green block) is to detect the emotion on the face and the meaning of the sentence the human is uttering. The AIM can be implemented with a neural network or with DP technologies. In the latter case, the AIM accesses a knowledge base external to the AIM.
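The key point of the AIM concept is that the interface is fixed while the implementation technology is free. As an illustrative sketch (all class and method names are hypothetical, not part of any MPAI specification), the emotion/meaning AIM could be modelled as one interface with a neural-network variant and a DP variant that consults an external knowledge base:

```python
from abc import ABC, abstractmethod

class EmotionMeaningAIM(ABC):
    """Hypothetical AIM interface: same inputs and outputs regardless
    of whether the implementation uses AI/ML or traditional DP."""
    @abstractmethod
    def process(self, face_video: bytes, speech: bytes) -> dict:
        """Return the detected emotion and the meaning of the utterance."""

class NeuralAIM(EmotionMeaningAIM):
    def process(self, face_video, speech):
        # A trained network would run here; stubbed for illustration.
        return {"emotion": "neutral", "meaning": "greeting"}

class DataProcessingAIM(EmotionMeaningAIM):
    def __init__(self, knowledge_base: dict):
        # DP implementations access a knowledge base external to the AIM.
        self.kb = knowledge_base

    def process(self, face_video, speech):
        return {"emotion": self.kb.get("default_emotion", "neutral"),
                "meaning": self.kb.get("default_meaning", "unknown")}
```

Because both variants honour the same interface, one can be swapped for the other inside a workflow without touching the rest of the system.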

The input data enter the Execution area of the AIF (Figure 3) where the workflow is executed under the supervision of Management and Control. AIMs communicate via the AIF’s Communication and Storage infrastructure and may access static or slowly changing data sources (e.g., those of Figure 2) called Access. The result of the execution of the workflow is provided as output data.
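The AIF’s role described above, with Management and Control executing a workflow of AIMs that exchange data over shared infrastructure, can be sketched in a few lines. This is an assumed, minimal model for illustration only; the real MPAI-AIF standard defines its own components and APIs:

```python
class AIFramework:
    """Minimal AIF sketch: AIMs are registered with the names of the
    channels they read and write; Management and Control executes them
    in workflow order over a shared storage area."""
    def __init__(self):
        self.storage = {}   # Communication and Storage infrastructure
        self.workflow = []  # ordered list of (aim, input names, output names)

    def register(self, aim, inputs, outputs):
        self.workflow.append((aim, inputs, outputs))

    def execute(self, **input_data):
        # Input data enter the Execution area.
        self.storage.update(input_data)
        # Management and Control runs each AIM in workflow order.
        for aim, inputs, outputs in self.workflow:
            results = aim(*[self.storage[name] for name in inputs])
            for name, value in zip(outputs, results):
                self.storage[name] = value
        # The result of the workflow is provided as output data.
        return self.storage
```

In this sketch the registration order defines the topology: a downstream AIM simply names the channels an upstream AIM wrote to.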

Pillar #4 – IPR Guidelines

Seventy years ago, when ISO was established, it made sense to ask a participant in the standardisation process to declare availability to license their technology at fair, reasonable and non-discriminatory (FRAND) terms. Any IP item was typically held by one company. In fact, that company most likely approached ISO because it already had products on the market, was already licensing its technology and just wanted ISO to ratify the status quo.

Forty years later, when MPEG started releasing its standards, the situation was entirely different. Each participant had IP but most participants were interested in using the standard.

Another 20 years later, the situation had changed beyond recognition. Each participant had IP, but most participants were not interested in using the standard, only in monetising their IP.

MPAI fully endorses IP as the engine of progress but cautions against standards released with FRAND promises. MPAI has not washed its hands of the IP issue; instead, it has developed the notion of Framework Licence. This is the IPR holders’ business model, adopted to monetise their IP in a standard, without values ($, %, dates etc.). This is the practice:

  • Before the standard is developed: Active members develop & adopt the Framework Licence
  • While the standard is developed: Members declare that, for any submission they make, they will make their licences available according to the Framework Licence after the standard is approved.
  • After the standard is developed: All members declare they will get a licence for other members’ IPRs, if used, within 1 year after publication of IPR holders’ licensing terms.

Non-members must get a licence from IPR holders to use an MPAI standard.

You can see an example of an actual Framework Licence.

Pillar #5 – Ethical AI

MPAI is not alone in being aware of the impact AI will have on humans and society. Instead of just raising concerns about bias in AI, MPAI intends to offer practical solutions.

As an example, MPAI’s AIM-AIF approach is already capable of enhancing the explainability of implementations of MPAI standards. By Explainability we mean “the ability to trace the output of an AI system back to the inputs that have produced it”.
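Under that definition, one minimal way the modular structure helps is that each AIM’s inputs and outputs can be recorded as the workflow runs, so any output can be traced back through the chain of AIMs that produced it. The sketch below is an illustrative assumption, not the MPAI mechanism:

```python
def traced(name, aim, trace):
    """Wrap an AIM so every call is logged to a trace list."""
    def wrapper(*inputs):
        outputs = aim(*inputs)
        trace.append({"aim": name, "inputs": inputs, "outputs": outputs})
        return outputs
    return wrapper

def explain(trace, output_value):
    """Walk the trace backwards from an output value, collecting the
    AIMs that produced it (follows each AIM's first input upstream)."""
    chain = []
    target = output_value
    for record in reversed(trace):
        if target in record["outputs"]:
            chain.append(record["aim"])
            target = record["inputs"][0]
    return list(reversed(chain))
```

Because AIMs are discrete components with specified interfaces, such tracing is possible at module granularity, which is exactly what a monolithic AI system cannot offer.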

The solution MPAI is working on will provide the means to test the Performance of MPAI standard implementations. An element of the solution is an identification system that helps users check whether an AI system is a tested implementation of an MPAI standard.

Who benefits from the MPAI approach?

  1. Component providers can offer conforming AIMs to an open, competitive market and test them, or have them tested, for Performance.
  2. Application developers wishing to develop complex applications may have some, but not all, of the AIMs they need. However, they can find the missing AIMs on the open competitive market.
  3. Consumers have a wider choice of better AI applications from competing application developers. They can distinguish generic AI systems from AI systems that implement MPAI standards.
  4. Innovation will be fuelled by the need for novel/more performing AIMs to face competition.
  5. Society can lift the veil of opacity from large and monolithic AI systems.

MPAI has been in operation since the 30th of September 2020. A few months later, it is already developing 4 standards and the Functional Requirements for 3 more, and is honing 2 Use Cases. Figure 4 depicts the situation.

Figure 4 – Snapshot of the MPAI work plan


MPAI starts development of AI-based company performance prediction standard

Geneva, Switzerland – 12 May 2021. At its 8th General Assembly, the international, unaffiliated Moving Picture, Audio and Data Coding by Artificial Intelligence (MPAI) standards association has received substantial proposals in response to its Call for Technologies on AI-based Company Performance Prediction Use Case. Meanwhile the development of its foundational AI Framework standard is steadily progressing and the technical review of responses to the Context-based Audio Enhancement (MPAI-CAE) and Multimodal Conversation (MPAI-MMC) Calls for Technologies has been completed.

The goal of the AI Framework standard, nicknamed MPAI-AIF, is to enable creation and autom­ation of mixed Machine Learning (ML) – Artificial Intelligence (AI) – Data Processing (DP) – inference workflows, implemented as software, hardware, or mixed software and hardware. A major MPAI-AIF feature is enhanced explainability of MPAI standard applications.

Development of two new standards has started after completing the technical review of responses to the Calls for Technologies. Context-based Audio Enhancement (MPAI-CAE) covers four instances: adding a desired emotion to a speech without emotion, preserving old audio tapes, improving the audioconference experience, and removing unwanted sounds while keeping the relevant ones for a user walking in the street. Multimodal Conversation (MPAI-MMC) covers three instances: audio-visual conversation with a machine impersonated by a synthesised voice and an animated face, request for information about a displayed object, and translation of a sentence using a synthetic voice that preserves the speech features of the human.

Substantial proposals received in response to the MPAI-CUI Call for Technologies have allowed starting the work on a fourth standard, AI-based Company Performance Prediction, part of the Compression and Understanding of Industrial Data standard. The standard will enable prediction of performance, e.g., organisational adequacy or default probability, by extracting information from the governance, financial and risk data of a given company.

The MPAI web site provides information about other AI-based standards being developed: AI-Enhanced Video Coding (MPAI-EVC) will improve the performance of existing video codecs using AI, Server-based Predictive Multiplayer Gaming (MPAI-SPG) will compensate for the loss of data and detect false data in online multiplayer gaming, and Integrative Genomic/Sensor Analysis (MPAI-GSA) will compress and understand data from combined genomic and other experiments produced by related devices/sensors.

MPAI develops data coding standards for applications that have AI as the core enabling technology. Any legal entity who supports the MPAI mission may join MPAI if it is able to contribute to the development of standards for the efficient use of data.

Visit the MPAI web site and contact the MPAI secretariat for specific information.


MPAI consolidates the development of three AI-based data coding standards

Geneva, Switzerland – 14 April 2021. At its 7th General Assembly, the international, unaffiliated Moving Picture, Audio and Data Coding by Artificial Intelligence (MPAI) standards association has received substantial proposals in response to its two Calls for Technologies on Enhanced Audio and Multimodal Conversation that closed on the 12th of April. Meanwhile the development of its foundational AI Framework standard is steadily progressing targeting July 2021 for delivery of the standard.

The goal of the AI Framework standard, nicknamed MPAI-AIF, is to enable creation and automation of mixed Machine Learning (ML) – Artificial Intelligence (AI) – Data Processing (DP) – inference workflows, implemented as software, hardware, or mixed software and hardware. A major MPAI-AIF feature is enhanced explainability of applications conforming to MPAI standards.

Work on the two new Context-based Audio Enhancement (MPAI-CAE) and Multimodal Conversation (MPAI-MMC) standards has started after receiving substantial technologies in response to the Calls for Technologies. MPAI-CAE covers four instances: adding a desired emotion to a speech without emotion, preserving old audio tapes, improving the audioconference experience and removing unwanted sounds while keeping the relevant ones to a user walking in the street. MPAI-MMC covers three instances: audio-visual conversation with a machine impersonated by a synthesised voice and an animated face, request for information about a displayed object, translation of a sentence using a synthetic voice that preserves the speech features of the human.

Work on a fourth standard is scheduled to start at the next General Assembly (12th of May) after receiving responses – both from MPAI and non-MPAI members – to the currently open MPAI-CUI Call for Technologies. The standard will enable prediction of performance, e.g., organisational adequacy or default probability, using Artificial Intelligence (AI)-based filtering and extraction of information from a company’s governance, financial and risk data.

The MPAI web site provides information about other AI-based standards being developed: AI-Enhanced Video Coding (MPAI-EVC) that improves the performance of existing video codecs, Server-based Predictive Multiplayer Gaming (MPAI-SPG) that compensates the loss of data in online multiplayer gaming and Integrative Genomic/Sensor Analysis (MPAI-GSA) that compresses and understands data from combined genomic and other experiments produced by related devices/sensors.

MPAI develops data coding standards for applications that have AI as the core enabling technology. Any legal entity who supports the MPAI mission may join MPAI if it is able to contribute to the development of standards for the efficient use of data.

Visit the MPAI web site and contact the MPAI secretariat for specific information.