This is the public page of the Context-based Audio Enhancement (MPAI-CAE) standard. See the MPAI-CAE homepage.
Context-based Audio Enhancement (MPAI-CAE) is a collection of 4 use cases that improve the user audio experience in applications including entertainment, communication, teleconferencing and restoration, in a variety of contexts such as in the home, in the car, on-the-go and in the studio. Context information acts on the input audio content to provide the desired results.
The 4 use cases considered are: Emotion-Enhanced Speech, Audio Recording Preservation, Speech Restoration System and Enhanced Audioconference Experience.
The figures below show the reference models of the MPAI-CAE Use Cases. Note that an Implementation is expected to run in the MPAI-specified AI Framework (MPAI-AIF).
Figure 1 – Emotion Enhanced Speech
Emotion-Enhanced Speech (EES) enables a user to indicate a model utterance or an Emotion to obtain an emotionally charged version of a given utterance.
In many use cases, emotional force can usefully be added to speech which by default would be neutral or emotionless.
Figure 2 – Audio Recording Preservation
The Audio Recording Preservation (ARP) Use Case enables a user to create digital copies of the digitised audio of open-reel magnetic tapes that are suitable for long-term preservation and for correct playback of the digitised recording (restored, if necessary).
Figure 3 – Speech Restoration System
Speech Restoration System (SRS) enables a user to restore a Damaged Segment of an Audio Segment containing only speech from a single speaker. No filtering or signal processing is involved. Instead, replacements for the damaged vocal elements are synthesised using a speech model.
Figure 4 – Enhanced Audioconference Experience
Enhanced Audioconference Experience (EAE) enables a user to improve the auditory quality of the audioconference experience by processing speech signals recorded by microphone arrays to provide speech signals free from background noise and acoustics-related artefacts.
Watch the demo: YouTube, Non-YouTube
The Context-based Audio Enhancement Technical Specification Version 1 has been developed by CAE-DC, chaired by Marina Bosi (Stanford University). MPAI-CAE is publicly available.
CAE-DC is now developing the Reference Software Implementation and the Conformance Testing and Performance Assessment Specifications, and is working on extensions to Version 1.
If you wish to participate in this work you have the following options:
- Join MPAI
- Keep an eye on this page.