2021/05/22

MPAI is currently developing 4 standards

Established less than eight months ago – on 30 September 2020 – MPAI promptly defined a process to develop its standards and immediately put it into action.
In simple words, the MPAI process identifies the need for a standard and determines its functional requirements. It then determines the commercial requirements (framework licences). Finally, it acquires technologies by issuing a public Call for Technologies and develops the standard using the technologies proposed and evaluated.
MPAI is currently developing 4 standards, which means that in 4 instances the functional and commercial requirements have been developed, Calls for Technologies have been issued, and responses have been received:

  1. Artificial Intelligence Framework (MPAI-AIF) enables creation and automation of mixed ML-AI-DP processing and inference workflows. See https://mpai.community/standards/mpai-aif/
  2. Context-based Audio Enhancement (MPAI-CAE) improves the user experience for several audio-related applications including entertainment, communication, teleconferencing, gaming, post-production, restoration etc. in a variety of contexts. See https://mpai.community/standards/mpai-cae/
  3. Multimodal Conversation (MPAI-MMC) aims to enable human-machine conversation that emulates human-human conversation in completeness and intensity by using AI. See https://mpai.community/standards/mpai-mmc/
  4. Compression and Understanding of Industrial Data (MPAI-CUI) aims to predict the performance of a company by applying Artificial Intelligence to governance, financial and risk data. See https://mpai.community/standards/mpai-cui/

MPAI Use Cases being standardised – Emotion-Enhanced Speech

Imagine that you have a sentence uttered without any particular emphasis or emotion, that you have a sample sentence uttered with a particular intonation, emphasis and emotion, and that you would like the emotion-less sentence to be uttered as in the sample sentence.

This is one of the use cases belonging to the Context-based Audio Enhancement standard that MPAI is developing as part of the process described above.
What is being standardised by MPAI in this Emotion-Enhanced Speech (EES) use case? The input and output interfaces of an EES box. The box takes speech uttered without emotion (“emotion-less speech”), the boundaries t1 and t2 of a segment of that speech, and a sample speech carrying the emotion, timbre etc. with which the segment between t1 and t2 should be pronounced at the output of the EES box.
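To make the interface description concrete, here is a minimal sketch in Python. The names and types (EESInput, EESOutput, the fields) are hypothetical illustrations of the inputs and output just listed, not the data formats defined by the draft standard.

from dataclasses import dataclass

# Hypothetical sketch of the EES box interfaces described above.
# Names and types are illustrative only; the actual data formats are
# those the MPAI-CAE standard defines.

@dataclass
class EESInput:
    emotionless_speech: bytes  # speech uttered without emotion
    t1: float                  # start of the segment to re-utter, in seconds
    t2: float                  # end of the segment to re-utter, in seconds
    emotion_sample: bytes      # sample speech carrying the target emotion, timbre etc.

@dataclass
class EESOutput:
    speech: bytes              # input speech, with the segment between t1 and t2
                               # pronounced with the sample's emotion and timbre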
The EES use case does not stop here. It also defines the architecture of the box, composed of AI Modules (AIMs) of which only the functionality and the input and output data are defined, but not the internals.
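Again as a hypothetical sketch, and in the same hedged spirit as above, an AIM can be thought of as a component with a standardised function and input/output contract whose internals are left free; the class and function names below (AIModule, NeuralEmotionInserter, run_workflow) are illustrative, not taken from the standard.

from abc import ABC, abstractmethod

class AIModule(ABC):
    """An AIM: the standard fixes its function and its input/output
    data, but not how the outputs are computed internally."""

    @abstractmethod
    def process(self, inputs: dict) -> dict:
        """Map standard-defined input data to standard-defined output data."""

class NeuralEmotionInserter(AIModule):
    def process(self, inputs: dict) -> dict:
        # Internals are implementation-specific: one vendor may use a
        # neural model, another a classical signal-processing chain.
        return {"speech": inputs["emotionless_speech"]}  # placeholder

def run_workflow(modules: list[AIModule], data: dict) -> dict:
    # AIMs with matching interfaces can be sourced independently,
    # chained, and later replaced by more advanced implementations.
    for module in modules:
        data = module.process(data)
    return data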
MPAI believes that this “lightweight” standardisation achieves two apparently opposite goals: AIMs can be obtained from different sources, and they can be replaced with AIMs offering more advanced functionalities.
MPAI standards thus not only offer interoperability but also build on and further promote AI innovation.