Multimodal Conversation (MPAI-MMC)
The goal of the MPAI Multimodal Conversation (MPAI-MMC) project is to develop standards that enable human-machine conversation that is more human-like and richer in content, emulating human-human conversation in completeness and intensity.
Visit the About MPAI-MMC page for an introduction to the MPAI-MMC V1 and V2 standards. MPAI-MMC V1 was adopted without modifications by the Institute of Electrical and Electronics Engineers (IEEE) with the name IEEE 3300-2022.
Version 2
Technical Specification: Multimodal Conversation (MPAI-MMC) V2 specifies 1) data formats for the analysis of text, speech, and other non-verbal components used in human-machine and machine-machine conversation, and 2) use cases that implement recognised applications using data formats from MPAI-MMC and other MPAI standards.
Read An overview of Multimodal Conversation (MPAI-MMC) V2.
Download Technical Specification – Multimodal Conversation (MPAI-MMC) V2.
MPAI thanks the following individuals for their valuable contributions to the development of MPAI-MMC V2: Miran Choi, Gérard Chollet, Paolo Ribeca, Mark Seligman, Fathy Yassa, and Jaime Yoon.
The MPAI-MMC V2 Working Draft (html, pdf) was published with a request for Community Comments. See also the video recordings (YouTube, WimTV) and the slides of the presentation made on 5 September. Comments should be sent to the MPAI Secretariat by 25 September 2023 at 23:59 UTC. MPAI will use the Comments received to develop the final draft, planned for publication at the 36th General Assembly (29 September 2023).
MPAI-MMC Version 2 extends the capabilities of V1 by specifying the data formats of two Composite AIMs:
- Personal Status Extraction: provides an estimate of the Personal Status (PS) of a human or an avatar conveyed by Text, Speech, Face, and Gesture. PS is the ensemble of information internal to a person, including Emotion, Cognitive State, and Attitude (an illustrative sketch of a PS record follows the lists below).
- Personal Status Display: generates an avatar from Text and PS; the avatar utters speech with the intended PS while its face and gestures show the same PS.
in support of three new use cases:
- Conversation About a Scene: a human holds a conversation with a machine about the objects in a scene. While conversing, the human points a finger to indicate interest in a particular object. The machine is helped by its understanding of the human’s Personal Status.
- Human-Connected Autonomous Vehicle (CAV) Interaction: a group of humans converses with a CAV that understands their utterances and Personal Statuses and manifests itself as the output of a Personal Status Display.
- Virtual Secretary for Videoconference: in the Avatar-Based Videoconference use case, a Virtual Secretary summarises what the avatars utter while understanding and capturing their Personal Status.
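The Personal Status data formats are specified normatively in the MPAI-MMC V2 Technical Specification. The sketch below is only a minimal illustration of how the three PS factors (Emotion, Cognitive State, Attitude) and the four conveying modalities (Text, Speech, Face, Gesture) could be organised in a single record; all field names, the label value, and the confidence field are assumptions made for this illustration, not MPAI-MMC syntax.

```typescript
// Illustrative only: field names and structure are assumptions,
// not the normative MPAI-MMC V2 data formats.

// The three Personal Status factors named in the specification.
type PersonalStatusFactor = "Emotion" | "CognitiveState" | "Attitude";

// The four modalities through which Personal Status is conveyed.
type Modality = "Text" | "Speech" | "Face" | "Gesture";

// A single estimate of one factor carried by one modality.
interface FactorEstimate {
  factor: PersonalStatusFactor;
  modality: Modality;
  label: string;      // e.g. "cheerful" (hypothetical label set)
  confidence: number; // 0..1 (hypothetical field)
}

// The ensemble produced by Personal Status Extraction and
// consumed by Personal Status Display.
interface PersonalStatus {
  subject: "human" | "avatar";
  estimates: FactorEstimate[];
}

// Example instance for a human whose speech sounds cheerful.
const example: PersonalStatus = {
  subject: "human",
  estimates: [
    { factor: "Emotion", modality: "Speech", label: "cheerful", confidence: 0.8 },
  ],
};
```

Under these assumptions, a Personal Status Extraction implementation could populate such a record from its per-modality analyses, while a Personal Status Display could consume it to drive speech synthesis and avatar animation.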
MPAI has published the following documents to develop MPAI-MMC V2:
- MPAI-MMC V2 Call for Technologies (closed)
- MPAI-MMC V2 Use Cases and Functional Requirements
- Clarifications about MPAI-MMC V2 CfT data formats
- MPAI-MMC V2 Framework Licence
- MPAI-MMC V2 Template for responses
Read about the MPAI-MMC V2 Call for Technologies:
- 2-minute video illustrating MPAI-MMC V2 (YouTube, non-YouTube).
- slides presented at the online meeting of 12 July 2022.
- video recording of that 12 July online presentation (YouTube, non-YouTube).
- Call for Technologies, Use Cases and Functional Requirements, and Framework Licence.
Version 1
Multimodal Conversation (MPAI-MMC) Version 1 includes five Use Cases:
- Conversation with Emotion: supports audio-visual conversation with a machine impersonated by a synthetic voice and an animated face.
- Multimodal Question Answering: supports requests for information about a displayed object.
- Unidirectional, Bidirectional, and One-to-Many Speech Translation: support conversational translation using a synthetic voice that preserves the speech features of the human speaker.
MPAI is indebted to the following individuals: Miran Choi (ETRI), Gérard Chollet (IMT), Jisu Kang (KLleon), Mark Seligman (SMI) and Fathy Yassa (SMI) for their efforts in developing the MPAI-MMC Technical Specification V1.
MPAI-MMC V1
Users of MPAI standards should bear in mind the Notices and Disclaimers concerning use of MPAI Standards.
Version 1.1: Technical Specification
The Institute of Electrical and Electronics Engineers (IEEE) has adopted MPAI-MMC with the name IEEE 3300-2022.
Reference Software, Conformance Testing, and Performance Assessment for MPAI-MMC V1 are under development. Read the V1.2-related documents:
- MPAI-MMC V1 Standard
- Call for Patent Pool Administrator (Closed)
- Introduction to MPAI-MMC (V1)
- MPAI-MMC Standard (V1)
- Call for Technologies (V1)
- Use Cases and Functional Requirements (V1)
- Framework Licence (V1)
- Application Note
Visit the About MPAI-MMC page.