Multimodal Conversation (MPAI-MMC)

Important Information: The deadline for responses to the Call is 24 October 2022.

Multimodal Conversation (MPAI-MMC) is an MPAI standard that uses AI to enable conversation between humans and machines emulating human-human conversation in completeness and intensity.

After completing Version 1, MPAI is engaged in developing MPAI-MMC V2. The V2 documents are:

  1. MPAI-MMC V2 Call for Technologies (the call closes on 2022/10/24)
  2. MPAI-MMC V2 Use Cases and Functional Requirements
  3. Clarifications about MPAI-MMC V2 CfT data formats
  4. MPAI-MMC V2 Framework Licence
  5. MPAI-MMC V2 Template for responses

The MPAI Secretariat shall receive submissions in response to the MPAI-MMC V2 Call for Technologies by 2022/10/24T23:59 UTC.

Multimodal Conversation (MPAI-MMC) Version 2 will specify two Composite AIMs:

  1. Personal Status Extraction: provides an estimate of the Personal Status (PS) of a human or an avatar as conveyed by Text, Speech, Face, and Gesture. PS is the ensemble of information internal to a person, including Emotion, Cognitive State, and Attitude (see the illustrative sketch after the Use Case list below).
  2. Personal Status Display: generates, from Text and PS, an avatar that utters speech and shows a face and gestures conveying the intended PS.

and three new Use Cases:

  1. Conversation About a Scene: a human holds a conversation with a machine about objects in a scene. While conversing, the human points a finger to indicate interest in a particular object. The machine is aided by its understanding of the human’s Personal Status.
  2. Human-Connected Autonomous Vehicle (CAV) Interaction: a group of humans converses with a CAV that understands the utterances and PSs of the humans it converses with and manifests itself as the output of a Personal Status Display.
  3. Avatar-Based Videoconference: avatars that represent humans with a high degree of accuracy participate in a videoconference. A Virtual Secretary (VS), represented as an avatar displaying PS, creates an online summary of the meeting; the summary’s quality is enhanced by the VS’s ability to understand the PS of the avatars it converses with.
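
To make the two Composite AIMs concrete, the sketch below models Personal Status in Python. All class, field, and value names are hypothetical illustrations chosen for this page, not the normative MPAI-MMC V2 data formats, which are specified in the Use Cases and Functional Requirements document.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class PersonalStatus:
        # The three PS factors named above; the example values are
        # hypothetical, not taken from the MPAI-MMC specification.
        emotion: Optional[str] = None          # e.g. "cheerful"
        cognitive_state: Optional[str] = None  # e.g. "confused"
        attitude: Optional[str] = None         # e.g. "confrontational"

    @dataclass
    class PersonalStatusEstimate:
        # One PS per modality analysed by Personal Status Extraction.
        text: PersonalStatus
        speech: PersonalStatus
        face: PersonalStatus
        gesture: PersonalStatus

    # The kind of output Personal Status Extraction could produce, and the
    # kind of input (together with Text) Personal Status Display could consume.
    estimate = PersonalStatusEstimate(
        text=PersonalStatus(emotion="cheerful"),
        speech=PersonalStatus(emotion="cheerful", cognitive_state="confident"),
        face=PersonalStatus(emotion="cheerful"),
        gesture=PersonalStatus(attitude="friendly"),
    )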

Multimodal Conversation (MPAI-MMC) Version 1 includes five Use Cases:

  1. Conversation with Emotion: supports audio-visual conversation with a machine impersonated by a synthetic voice and an animated face.
  2. Multimodal Question Answering: supports requests for information about a displayed object.
  3. Unidirectional, Bidirectional, and One-to-Many Speech Translation: support conversational translation using a synthetic voice that preserves the speech features of the human.

MPAI is indebted to Miran Choi (ETRI), Gérard Chollet (IMT), Jisu Kang (KLleon), Mark Seligman (SMI), and Fathy Yassa (SMI) for their efforts in developing the MPAI-MMC Technical Specification V1.

Reference Software, Conformance Testing, and Performance Assessment for MPAI-MMC V1 are under development. Read the V1.2-related documents:

  1. Call for Patent Pool Administrator (Closed)
  2. Introduction to MPAI-MMC (V1)
  3. MPAI-MMC Standard (V1)
  4. Call for Technologies (V1)
  5. Use Cases and Functional Requirements (V1)
  6. Framework Licence (V1)
  7. Application Note

Visit the About MPAI-MMC page