Multimodal Conversation (MPAI-MMC)

Multi-modal conversation (MPAI-MMC) is an MPAI standard that uses AI to enable conversation between humans and machines emulating human-human conversation in completeness and intensity.

After completing the development of Version 1, MPAI is engaged in the development of MPAI-MMC V2 using the submissions received in response to the MPAI-MMC V2 Call for Technologies (now closed). The relevant documents are:

  1. MPAI-MMC Standard
  2. MPAI-MMC V2 Call for Technologies (closed)
  3. MPAI-MMC V2 Use Cases and Functional Requirements
  4. Clarifications about MPAI-MMC V2 CfT data formats
  5. MPAI-MMC V2 Framework Licence
  6. MPAI-MMC V2 Template for responses

MPAI is now engaged in the development of Multi-modal conversation (MPAI-MMC) Version 2, which specifies the data formats of two Composite AIMs:

  1. Personal Status Extraction: provides an estimate of the Personal Status (PS) – of a human or an avatar – conveyed by Text, Speech, Face, and Gesture. PS is the ensemble of information internal to a person, including Emotion, Cognitive State, and Attitude.
  2. Personal Status Display: generates, from Text and PS, an avatar that utters speech and displays face and gestures conveying the intended PS.

in support of two new use cases:

  1. Conversation About a Scene: a human holds a conversation with a machine about objects in a scene. While conversing, the human points a finger to indicate interest in a particular object. The machine is aided by its understanding of the human’s Personal Status.
  2. Human-Connected Autonomous Vehicle (CAV) Interaction: a group of humans converse with a CAV that understands their utterances and PSs and manifests itself as the output of a Personal Status Display.

Multi-modal conversation (MPAI-MMC) Version 1 includes five Use Cases:

  1. Conversation with Emotion: supports audio-visual conversation with a machine impersonated by a synthetic voice and an animated face.
  2. Multimodal Question Answering: supports requests for information about a displayed object.
  3. Unidirectional, Bidirectional, and One-to-Many Speech Translation: support conversational translation using a synthetic voice that preserves the speech features of the human speaker.

MPAI is indebted to the following individuals: Miran Choi (ETRI), Gérard Chollet (IMT), Jisu Kang (KLleon), Mark Seligman (SMI) and Fathy Yassa (SMI) for their efforts in developing the MPAI-MMC Technical Specification V1.

The Institute of Electrical and Electronics Engineers (IEEE) has adopted MPAI-MMC with the name IEEE 3300-2022.

Reference Software, Conformance Testing and Performance Assessment for MPAI-MMC V1 are under development. Read the V1.2-related documents:

  1. Call for Patent Pool Administrator (closed)
  2. Introduction to MPAI-MMC (V1)
  3. MPAI-MMC Standard (V1)
  4. Call for Technologies (V1)
  5. Use Cases and Functional Requirements (V1)
  6. Framework Licence (V1)
  7. Application Note

Visit the About MPAI-MMC page