MPAI-MMC

Application Note – Requirements

MPAI-MMC Functional Requirements Work Programme

1 Introduction

Moving Picture, Audio and Data Coding by Artificial Intelligence (MPAI) is an international association with the mission to develop AI-enabled data coding standards. Artificial Intelligence (AI) technologies have shown they can offer more efficient data coding than existing technologies.

MPAI has analysed six use cases covering application areas benefiting from AI technologies. Even though the use cases are disparate, each of them can be implemented with a combination of processing modules performing functions that combine to achieve the intended result.

MPAI has assessed that leaving it to the market to develop individual implementations would multiply costs and delay adoption of AI technologies. Modules with standard interfaces, combined and executed within the MPAI-specified AI Framework, will instead favour the emergence of horizontal markets in which proprietary, competing module implementations exposing standard interfaces will reduce costs, promote adoption and spur progress of AI technologies. MPAI calls these modules AI Modules (AIM).

This paper describes the current plans to develop the MPAI “MultiModal Conversation” standard (MPAI-MMC) to enable human-machine conversation that emulates human-human conversation in completeness and intensity using AI.

Chapter 2 introduces the MPAI-MMC features. Chapter 3 provides summary information on the advanced IT environment that will execute MPAI-MMC applications. Chapter 4 identifies the items that will likely be the object of the MPAI-MMC standard.

2 MPAI-MMC features

Owing to the recent advancement of AI technologies, natural language processing has started to be widely used in various applications. One useful application is the conversational partner, which provides the user with information, entertains, chats and answers questions through a speech interface. For the application to provide a better service to the user, more than just a speech interface should be included. For example, an emotion recognizer and a gesture interpreter are needed for improved multi-modal interfaces.

MPAI Multi-modal conversation (MPAI-MMC) aims to enable human-machine conversation that emulates human-human conversation in completeness and intensity by using AI.

The following list gives examples of MMC conversations between a human user and a computer/robot. The user input can be voice, text, image or a combination of them. Taking the emotion of the human user into account, MMC will output responses as text, speech or music, depending on the user's needs.

  • Chats: “I am bored. What should I do now?” – “You look tired. Why don’t you take a walk?”
  • Question Answering: “Who is the famous artist in Barcelona?” – “Do you mean Gaudi?”
  • Information Request: “What’s the weather today?” – “It is a little cloudy and cold.”
  • Action Request: “Play some classical music, please” – “OK. Do you like Brahms?”

So far, the AIMs required by the following application areas have been considered for possible standardisation by MPAI-MMC:

  1. Conversation with emotion: a human-machine conversation system where the computer can recognize emotion in the user’s speech to produce a reply
  2. Multimodal Question Answering: a human-machine Question Answering system where the human asks questions to the computer presenting an image
  3. Personalized Automatic Speech Translation: a system that recognizes speech uttered in one language by a speaker, converts the recognized speech into another language through automatic translation, and outputs the result as text-type subtitles or as a synthesized voice

3 AI Framework

Most MPAI applications considered so far can be implemented as a set of AIMs – based on AI/ML or even on traditional data processing – with standard interfaces, assembled in suitable topologies to achieve the specific goal of an application and executed in an MPAI-defined AI Framework. MPAI is making all efforts to identify processing modules that are re-usable and upgradable without necessarily changing the logic of the application.

MPAI plans to complete the development of a 1st-generation AI Framework, called MPAI-AIF, in July 2021.

The MPAI-AIF Architecture is given in Figure 1.

Figure 1 – The MPAI-AIF Architecture

where:

  1. Management and Control manages and controls the AIMs, so that they execute in the correct order and at the time when they are needed.
  2. Execution is the environment in which combinations of AIMs operate. It receives external inputs and produces the requested outputs, both of which are application specific, interfacing with Management and Control and with Communication, Storage and Access.
  3. AI Modules (AIM) are the basic processing elements receiving processing-specific inputs and producing processing-specific outputs.
  4. Communication is required in several cases and can be implemented, e.g., by means of a service bus; it may be used to connect with remote parts of the framework.
  5. Storage encompasses traditional storage and is used, e.g., to store the inputs and outputs of the individual AIMs, the AIMs' state data, intermediate results, and data shared among AIMs.
  6. Access represents the access to static or slowly changing data that are required by the application such as domain knowledge data, data models, etc.
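
Since MPAI-AIF is still under development, the following Python sketch is only an informal illustration of how these components could relate in code: an AIM as a basic processing element behind a standard interface, and a minimal Management and Control loop that executes registered AIMs in order, with a dictionary standing in for Storage (Communication and Access are omitted). None of the class or method names below are defined by MPAI-AIF; they are assumptions for illustration.

```python
# Hypothetical sketch of an AIM behind a standard interface, orchestrated by a
# minimal Management and Control loop. All names are illustrative assumptions.
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Tuple


class AIM(ABC):
    """Basic processing element: processing-specific inputs in, outputs out."""

    @abstractmethod
    def process(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        ...


class ManagementAndControl:
    """Runs registered AIMs in order, routing their outputs to later inputs."""

    def __init__(self) -> None:
        self.aims: List[Tuple[AIM, List[str], List[str]]] = []

    def register(self, aim: AIM, input_keys: List[str], output_keys: List[str]) -> None:
        self.aims.append((aim, input_keys, output_keys))

    def run(self, external_inputs: Dict[str, Any]) -> Dict[str, Any]:
        storage: Dict[str, Any] = dict(external_inputs)   # stands in for Storage
        for aim, input_keys, output_keys in self.aims:
            outputs = aim.process({k: storage[k] for k in input_keys})
            storage.update({k: outputs[k] for k in output_keys})
        return storage
```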

4 MPAI-MMC work plan

In this chapter three application areas are described with their relevant AI Modules (AIM) identified, and their inputs/outputs summarily specified.

4.1 Conversation with emotion

One instance of MPAI-MMC is conversation with emotion. When people talk, they use multiple modalities: speech, facial expression, text, sign language and gesture. Emotion is one of the key features for understanding the meaning of the utterances made by the speaker. Therefore, a conversation system should have the capability to recognize emotion in order to understand the user's speech and produce the reply as its output.

The AIMs implied by a multi-modal conversation system would look approximately as presented in Figure 2. The interactions between the different AIMs are described, including a language understanding module, a speech recognition module, an image analysis module, an emotion recognition module, a dialog processing module, and a speech synthesis module.

Figure 2 – Conversation with emotion

The following Table 1 lists the AIMs and their inputs and outputs.

Table 1 – AI Modules interactions

AI Module | Input | Output | External data
Language understanding (LU) | Text, Text from ER | Meaning, Emotion | Emotion ontology
Speech recognition (SR) | Voice | Text, Emotion | Emotion ontology
Speech synthesis (SS) | Reply from DP | Speech |
Emotion recognition (ER) | Emotion from LU, Emotion from SR, Emotion from IA | Final emotion | Emotion ontology, Emotion model
Image analysis (IA) | Image | Emotion | Emotion ontology
Dialog processing (DP) | Meaning from LU, Final emotion from ER, Emotion from IA | Reply | Dialog model, Dialog Knowledge Base
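
As an informal illustration of the data flow in Table 1, the following Python sketch wires stub functions according to the table, with the routing slightly simplified so that the recognized text reaches Language Understanding directly. All function bodies are placeholders and all names are assumptions rather than proposed interfaces.

```python
# Stub wiring of the Table 1 AIMs; only the data flow is meaningful.
from typing import Dict, Tuple


def speech_recognition(voice: bytes) -> Tuple[str, str]:
    return "I am bored. What should I do now?", "sad"        # (text, emotion) stub


def image_analysis(image: bytes) -> str:
    return "tired"                                           # emotion stub


def language_understanding(text: str) -> Tuple[Dict, str]:
    return {"intent": "chat", "topic": "boredom"}, "bored"   # (meaning, emotion) stub


def emotion_recognition(e_lu: str, e_sr: str, e_ia: str) -> Dict[str, float]:
    return {"bored": 0.5, "tired": 0.5}                      # final emotion with proportions, stub


def dialog_processing(meaning: Dict, final_emotion: Dict[str, float], e_ia: str) -> str:
    return "You look tired. Why don't you take a walk?"      # reply stub


def speech_synthesis(reply: str) -> bytes:
    return reply.encode()                                    # waveform placeholder


def converse(voice: bytes, image: bytes) -> bytes:
    text, e_sr = speech_recognition(voice)
    e_ia = image_analysis(image)
    meaning, e_lu = language_understanding(text)
    final_emotion = emotion_recognition(e_lu, e_sr, e_ia)
    reply = dialog_processing(meaning, final_emotion, e_ia)
    return speech_synthesis(reply)
```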

In the following subsections each AIM is analysed in detail.

4.1.1 Language understanding

Function To analyse natural language in a text format to produce its meaning and emotion included in the text
Inputs Text, Text from Emotion Recognition
Outputs Emotion, Meaning
External data Emotion ontology

4.1.2 Speech recognition

Function To analyse the voice input and generate text output and the emotion it carries
Inputs Voice
Outputs Text, Emotion
External data Emotion ontology

4.1.3 Speech synthesis

Function To produce speech from the input text
Inputs Reply from Dialog Processing in the text form
Outputs Speech
External data  

4.1.4 Emotion recognition

Function To determine the final emotion from multi-source emotions
Inputs
  1. Emotion from Language Understanding
  2. Emotion from Speech Recognition
  3. Emotion from Image Analysis
Outputs Final emotion with proportions (e.g. 80% happy, 20% surprise)
External data Emotion ontology, Emotion model
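
One possible way to obtain a final emotion with proportions from the three sources is a weighted combination of the per-source emotion scores, as in the Python sketch below. The weights and the fusion rule are assumptions for illustration; an actual Emotion Recognition AIM could equally rely on a learned model over the Emotion ontology and Emotion model.

```python
# Weighted fusion of emotions from Language Understanding (LU), Speech
# Recognition (SR) and Image Analysis (IA) into proportions. Weights are
# illustrative assumptions.
from collections import defaultdict
from typing import Dict, Optional


def fuse_emotions(e_lu: Dict[str, float],
                  e_sr: Dict[str, float],
                  e_ia: Dict[str, float],
                  weights: Optional[Dict[str, float]] = None) -> Dict[str, float]:
    weights = weights or {"lu": 0.4, "sr": 0.4, "ia": 0.2}
    scores: Dict[str, float] = defaultdict(float)
    for source, emotions in (("lu", e_lu), ("sr", e_sr), ("ia", e_ia)):
        for label, confidence in emotions.items():
            scores[label] += weights[source] * confidence
    total = sum(scores.values()) or 1.0
    # Normalize to proportions, highest first.
    return {label: round(score / total, 2)
            for label, score in sorted(scores.items(), key=lambda kv: -kv[1])}


# Roughly reproduces the "80% happy, 20% surprise" example above.
print(fuse_emotions({"happy": 0.9}, {"happy": 0.8}, {"surprise": 0.9}))
```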

4.1.5 Image analysis

Function To analyse the image and produce the emotion it conveys
Inputs Image
Outputs Emotion
External data Emotion ontology

4.1.6 Dialog processing

Function To analyse user’s utterance and produce a reply based on the user’s intention and emotion
Inputs
  1. Meaning from Language Understanding
  2. Final emotion from Emotion Recognition
  3. Emotion from Image Analysis
Outputs Reply in natural language in the text form
External data Dialog model, Dialog Knowledge Base
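
The rule-based Python sketch below illustrates, purely as an assumption, how a Dialog Processing AIM might combine the meaning from Language Understanding with the final emotion from Emotion Recognition to select a reply; an actual AIM would instead query a Dialog model and a Dialog Knowledge Base.

```python
# Toy rule-based dialog processing: intent first, then dominant emotion.
# Rules and reply templates are illustrative assumptions.
from typing import Dict


def dialog_processing(meaning: Dict[str, str], final_emotion: Dict[str, float]) -> str:
    dominant = max(final_emotion, key=final_emotion.get) if final_emotion else "neutral"
    intent = meaning.get("intent", "chat")
    if intent == "action_request":
        return "OK. Do you like Brahms?"
    if intent == "information_request":
        return "It is a little cloudy and cold."
    if dominant in ("sad", "tired", "bored"):
        return "You look tired. Why don't you take a walk?"
    return "Tell me more about that."


print(dialog_processing({"intent": "chat"}, {"bored": 0.7, "tired": 0.3}))
```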

4.2 Multimodal Question Answering

A Question Answering (QA) system is a technology that answers a user's question presented in natural language. Current QA systems only deal with the case where the input is in text or speech form. However, more attention is being paid these days to the case where mixed inputs, such as speech with an image, are presented to the system. For example, a user can ask a question about a picture that contains a specific tool, as in "Where can I buy this tool?", while showing the picture of the tool. In that case, the QA system should process the question text along with the image and find the answer to the question. Figure 3 illustrates the multimodal question answering system with several AIMs dealing with the example question.

Figure 3 – Multimodal Question Answering

The following Table 2 lists the AI Modules and their inputs and outputs.

Table 2 – AI Module interactions

AI Module | Input | Output | External data
Language understanding (LU) | Text, Text from SR, Text from Image analysis | Meaning | Dictionaries, Language model
Speech recognition (SR) | Voice | Text | Acoustic model, Language model
Speech synthesis (SS) | Answer from QA | Speech |
Intention analysis (IA) | Meaning from LU | Intention | Intention ontology, Intention model
Question Answering (QA) | Meaning from LU, Intention from Intention analysis | Answer | Wikipedia, question ontology
Image analysis (IA) | Image | Object name | Image DB
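
As an informal illustration of the data flow in Table 2, the Python sketch below follows the example question "Where can I buy this tool?" asked together with a picture. Only the routing follows the table; the function bodies are stubs and the names are assumptions.

```python
# Stub wiring of the Table 2 AIMs for a question asked together with an image.
from typing import Dict


def speech_recognition(voice: bytes) -> str:
    return "Where can I buy this tool?"            # stub


def image_analysis(image: bytes) -> str:
    return "cordless drill"                        # object name in focus, stub


def language_understanding(question: str, object_name: str) -> Dict:
    resolved = question.replace("this tool", object_name)
    return {"predicate": "buy", "object": object_name, "text": resolved}


def intention_analysis(meaning: Dict) -> str:
    return "find_shop"                             # intention, stub


def question_answering(meaning: Dict, intention: str) -> str:
    return f"You can buy a {meaning['object']} at a hardware store."  # stub


def speech_synthesis(answer: str) -> bytes:
    return answer.encode()                         # waveform placeholder


def answer_question(voice: bytes, image: bytes) -> bytes:
    question = speech_recognition(voice)
    object_name = image_analysis(image)
    meaning = language_understanding(question, object_name)
    intention = intention_analysis(meaning)
    answer = question_answering(meaning, intention)
    return speech_synthesis(answer)
```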

In the following subsections each AIM is analysed in detail.

4.2.1 Language understanding

Function To analyse natural language in a text format to produce its meaning.
Inputs Text from input, speech recognition and image analysis
Outputs Meaning
External data Dictionaries, Language model

4.2.2 Speech recognition

Function To analyse the voice input and generate text output
Inputs Voice
Outputs Text
External data Acoustic model, Language model

4.2.3 Speech synthesis

Function To produce speech from the input text
Inputs Answers from Question Answering in the text form
Outputs Speech
External data  

4.2.4 Intention Analysis

Function To determine the intention from the sentence meaning
Inputs Meaning from Language Understanding
Outputs Intention
External data Intention ontology, Intention model

4.2.5 Image analysis

Function To analyse the image and produce the name of the object in focus
Inputs Image
Outputs Text (object name)
External data Image DB

4.2.6 Question Answering

Function To analyse user’s question and produce the reply based on the user’s intention
Inputs Meaning from Language understanding

Intention from Intention analysis

Outputs Answer in natural language in the text form
External data Wikipedia, question ontology

4.3 Personalized Automatic Speech Translation

Automatic speech translation denotes technology that recognizes speech uttered in one language by a speaker, converts the recognized speech into another language through automatic translation, and outputs the result as text-type subtitles or as a synthesized voice. Recently, as interest in voice synthesis, one of the main technologies of automatic interpretation, has grown, personalized voice synthesis is being researched beyond simple communication. Personalized voice synthesis denotes technology that outputs the target language, obtained through voice recognition and automatic translation, as a synthesized voice similar to the tone (or utterance style) of the speaker.

The AI Modules implied by a personalized automatic speech translation system would look approximately as presented in Figure 4. The interactions between the different AIMs are described, including a speech recognition module, a speech feature extraction module, a translation module and a speech synthesis module.

Figure 4 – Personalized Automatic Speech Translation

The following Table 3 lists the AI Modules and their inputs and outputs.

Table 3 – AI Module interactions

AI Module | Input | Output | External data
Speech recognition (SR) | Voice | Text | Acoustic model, Language model
Speech feature extraction (SF) | Voice | Speech features | Speech feature DB
Translation (TR) | Text input, Text from SR | Text (translation result) |
Speech synthesis (SS) | Text from TR, Speech features from SF | Text or personalized speech |
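
As an informal illustration of the data flow in Table 3, the Python sketch below transcribes and translates the source voice, while speech features extracted from the same voice condition the synthesis so that the output resembles the speaker. Bodies are stubs and names are assumptions, not normative interfaces.

```python
# Stub wiring of the Table 3 AIMs for personalized automatic speech translation.
from typing import Dict


def speech_recognition(voice: bytes) -> str:
    return "Where is the station?"                                    # stub (source language)


def speech_feature_extraction(voice: bytes) -> Dict[str, float]:
    return {"pitch_hz": 180.0, "rate_wpm": 150.0, "energy": 0.6}      # stub speaker features


def translation(text: str, target_language: str) -> str:
    return "¿Dónde está la estación?" if target_language == "es" else text  # stub


def speech_synthesis(text: str, speaker_features: Dict[str, float]) -> bytes:
    # A personalized synthesizer would condition on the speaker features here.
    return text.encode()                                              # waveform placeholder


def translate_speech(voice: bytes, target_language: str = "es") -> bytes:
    text = speech_recognition(voice)
    features = speech_feature_extraction(voice)
    translated = translation(text, target_language)
    return speech_synthesis(translated, features)
```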

In the following subsections each AIM is analysed in detail.

4.3.1 Speech recognition

Function To analyse the voice input and generate text output
Inputs Voice
Outputs Text
External data Acoustic model, Language model

4.3.2 Speech feature extraction

Function To extract speech features such as tone, intonation, intensity, pitch, emotion or speed from the input voice, and to encode personal voice features
Inputs Voice
Outputs Speech features, hidden variable encoding the personal voice features
External data Speech feature DB
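
As a minimal, purely illustrative sketch of two of the features mentioned above, the following Python code computes per-frame intensity (RMS energy) and an autocorrelation-based pitch estimate from a mono waveform using only NumPy. A real Speech Feature Extraction AIM would also produce the learned speaker encoding (the "hidden variable"), which is omitted here.

```python
# Per-frame intensity and pitch estimate from a mono waveform (NumPy only).
from typing import Dict, List

import numpy as np


def frame_features(samples: np.ndarray, sample_rate: int = 16000,
                   frame_len: int = 400, hop: int = 160) -> List[Dict[str, float]]:
    """25 ms frames with 10 ms hop at 16 kHz; intensity plus pitch in 50-400 Hz."""
    features = []
    for start in range(0, len(samples) - frame_len, hop):
        frame = samples[start:start + frame_len].astype(np.float64)
        intensity = float(np.sqrt(np.mean(frame ** 2)))              # RMS energy
        # Pitch: lag of the autocorrelation peak searched in the 50-400 Hz range.
        ac = np.correlate(frame, frame, mode="full")[frame_len - 1:]
        lo, hi = sample_rate // 400, sample_rate // 50
        lag = lo + int(np.argmax(ac[lo:hi]))
        pitch = sample_rate / lag if ac[lag] > 0 else 0.0
        features.append({"intensity": intensity, "pitch_hz": pitch})
    return features
```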

4.3.3 Translation

Function To convert from the source language to the target language automatically
Inputs Text in the source language (direct text input or the output of Speech Recognition)
Outputs Text of translation results in target language
External data  

4.3.4 Speech synthesis

Function To produce speech from the input text
Inputs Translation result in the text form, speech features, hidden variable from the personal voice features
Outputs Personalized Speech in target language
External data  

5 Conclusions

The document in its current form is work in progress. MPAI intends to add more details to the existing document to enable MPAI to issue a Call for Technologies. MPAI may also add more usage examples.

When the document is considered sufficiently mature, MPAI will issue a Call for Technologies requesting MPAI members and the industry to submit proposals for:

  1. Data formats suitable as inputs and outputs of the identified AIMs
  2. Possible alternative partitioning of the AIMs implementing the example cases, providing:
    1. Arguments in support of the proposed partitioning
    2. Detailed specifications of the inputs and outputs of the proposed AIMs
  3. New usage examples fully described as in the final version of this document.

Respondents will be asked to state in their submissions their intention to adhere to the Framework Licence developed for MPAI-MMC when licensing their technologies if included in the MPAI-MMC standard. Please note that “a Framework Licence is the set of conditions of use of a license without the values, e.g. currency, percent, dates etc.”. The Framework Licence will give the MPAI-MMC standard a clear IPR licensing framework.

The MPAI-MMC Framework Licence will be developed, as for all other MPAI Framework Licences, in compliance with the generally accepted principles of competition law.

 

Requirements – Application Note

MPAI Application Note #6

Multi-Modal Conversation (MPAI-MMC)

Proponent: Miran Choi (ETRI)

Description: Owing to recent advances of AI technologies, natural language processing has started to be widely used in various applications. One of the useful applications is the conversational partner, which provides the user with information, entertains, chats and answers questions through a speech interface. However, an application should include more than just a speech interface to provide a better service to the user. For example, an emotion recognizer and a gesture interpreter are needed for better multi-modal interfaces.

Multi-modal conversation (MPAI-MMC) aims to enable human-machine conversation that emulates human-human conversation in completeness and intensity by using AI.

The interaction of AI processing modules implied by a multi-modal conversation system would look approximately as presented in Figure 1, where one can see a language understanding module, a speech recognition module, an image analysis module, a dialog processing module, and a speech synthesis module.

Figure 1 – Multi-Modal Conversation (emotion-focused)

Comments: The processing modules of the MPAI-MMC instance of Figure 1 would be operated in the MPAI-AIF framework.

Examples

Examples of MMC are conversations between a human user and a computer/robot, as in the following list. The input from the user can be voice, text, image or a combination of them. Taking the emotion of the human user into account, MMC will output responses as text, speech or music, depending on the user's needs.

  • Chats: “I am bored. What should I do now?” – “You look tired. Why don’t you take a walk?”
  • Question Answering: “Who is the famous artist in Barcelona?” – “Do you mean Gaudi?”
  • Information Request: “What’s the weather today?” – “It is a little cloudy and cold.”
  • Action Request: “Play some classical music, please” – “OK. Do you like Brahms?”

Processing modules involved in MMC:

A preliminary list of processing modules is given below:

  1. Fusion of multi-modal input information
  2. Natural language understanding
  3. Natural language generation
  4. Speech recognition
  5. Speech synthesis
  6. Emotion recognition
  7. Intention understanding
  8. Image analysis
  9. Knowledge fusion from different sources such as speech, facial expression, gestures, etc.
  10. Dialog processing
  11. Question Answering
  12. Machine Reading Comprehension (MRC)

Requirements:

These are the initial functional requirements; the full set will be developed in the Functional Requirements (FR) phase.

  1. The standard shall specify the following natural input signals:
  • Sound signals from a microphone
  • Text from a keyboard or keypad
  • Images from a camera
  2. The standard shall specify a user profile format (e.g. gender, age, specific needs, etc.)
  3. The standard shall support emotion-based dialog processing that uses the emotion produced by emotion recognition as input and decides the replies based on the user's intention.
  4. The standard should provide means to carry emotion and user preferences in the speech synthesis processing module.
  5. Processing modules should be agnostic to AI, ML or DP technology: they should be general enough to avoid limitations in terms of algorithmic structure, storage and communication and to allow full interoperability with other processing modules.
  6. The standard should provide support for the storage of, and access to, the following (a sketch of a possible record format is given after this list):
  • Unprocessed data in speech, text or image form
  • Processed data in the form of annotations (semantic labelling). Such annotations can be produced as the result of primary analysis of the unprocessed data or come from external sources such as a knowledge base.
  • Metadata (such as collection date and place; classification data)
  • Structured data produced from the raw data.
  7. The standard should also provide support for:
  • The combination into a general analysis workflow of a number of computational blocks that access processed, and possibly unprocessed, data such as input channels, and produce output as a sequence of vectors in a space of arbitrary dimension.
  • The possibility of defining and implementing a novel processing block from scratch in terms of either some source code or a proprietary binary codec
  • A number of pre-defined blocks that implement well-known analysis methods (such as NN-based methods).
  • The parallel and sequential combination of processing modules that comprise different services.
  • Real-time processing for the conversation between the user and the robot/computer.
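
As an informal complement to requirement 6 above, the Python sketch below shows one possible shape of a stored record that keeps a reference to the unprocessed data, its annotations (semantic labels) and its metadata together. The field and class names are hypothetical illustrations, not a proposed normative format.

```python
# Hypothetical record format for stored data: raw data reference, annotations
# and metadata kept together. Field names are illustrative assumptions.
from dataclasses import dataclass, field, asdict
from typing import List, Optional
import json


@dataclass
class Annotation:
    start_ms: int            # span of the annotated segment within the media
    end_ms: int
    label: str               # semantic label, e.g. "emotion:bored"
    source: str              # producing AIM or external knowledge base
    confidence: float


@dataclass
class MediaRecord:
    media_type: str                       # "speech" | "text" | "image"
    uri: str                              # reference to the unprocessed data
    collected_at: str                     # metadata: collection date
    location: Optional[str] = None        # metadata: collection place
    annotations: List[Annotation] = field(default_factory=list)


record = MediaRecord("speech", "store://session-42/utterance-7.wav",
                     "2021-02-15T10:00:00Z", "Daejeon",
                     [Annotation(0, 1200, "emotion:bored", "Emotion recognition (ER)", 0.8)])
print(json.dumps(asdict(record), indent=2))
```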

Object of standard: Interfaces of processing components utilized in multimodal communication.

  • Input interfaces: how to deal with inputs in different formats
  • Processing component interfaces: interfaces between a set of updatable and extensible processing modules
  • Delivery protocol interfaces: Interfaces of the processed data signal to a variety of delivery protocols
  • Framework: the glue keeping the pieces together => mapping to MPAI-AIF

Benefits:

  1. Decisively improve communication between humans and machines and the user experience
  2. Reuse of processing components for different applications
  3. Create a horizontal market of multimodal conversational components
  4. Make the market more competitive

Bottlenecks:

Some processing units should be improved because end-to-end processing has lower performance compared to modular approaches. Therefore, the standard should be able to cover traditional methods as well as hybrid approaches.

Social aspects:

Enhanced user interfaces will provide accessibility for people with disabilities. MMC can also be used in care-giving services for the elderly and patients.

Success criteria:

  • How easily MMC can be extended to different services by combining several processing modules.
  • The performance of multi-modality compared to uni-modality in the user interface.
  • Interconnection and integration among different processing modules