1 Version
V2.1
2 Functions
The Conversation With Emotion (MMC-CWE) AIW:
- Receives:
  - Input Selector indicating use of Text or Speech.
  - Input Text, complementing or replacing the Audio Input and Visual Input.
  - Speech Object of the conversing human.
  - Face Object of the conversing human.
- Understands the information and the Emotion of the human conveyed by Text, Speech, and Face.
- Produces and emits a multimodal response to the human, consisting of Speech that carries the response of the Machine and an Output Visual of a face that displays the Emotion of the Machine and moves its lips synchronously with the Machine Text.
3 Reference Model
Figure 1 depicts the Reference Model of the Conversation With Emotion AIW.

Figure 1 – Conversation With Emotion AIW
4 I/O Data
The input and output data of the Conversation With Emotion Use Case are:
Table 19 – I/O Data of Conversation With Emotion
| Input | Descriptions |
| Input Selector | Data determining the use of Speech vs Text. |
| Text Object | Text typed by the human, either as an additional information stream or as a replacement for the speech, depending on the value of the Input Selector. |
| Speech Object | Speech of the human having a conversation with the machine. |
| Face Object | Visual information of the Face of the human having a conversation with the machine. |
| Output | Descriptions |
| Text Object | Text of the Speech produced by the machine. |
| Speech Object | Synthetic Speech produced by the machine. |
| Face Object | Video of a Face that displays the synthetic machine emotion and whose lip movements are synchronised with the Output Speech. |
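The I/O data in the table above can be pictured as two simple records. The following is only an illustrative sketch: the class names `InputSelector`, `CWEInput`, and `CWEOutput`, the enum values, the use of raw bytes for Speech and Face Objects, and the `validate` rule are all assumptions made for this example. The normative data formats are those referenced by the JSON metadata.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class InputSelector(Enum):
    """Hypothetical encoding of the Input Selector (not taken from the specification)."""
    SPEECH = "speech"
    TEXT = "text"


@dataclass
class CWEInput:
    selector: InputSelector
    text: Optional[str] = None      # Text Object typed by the human
    speech: Optional[bytes] = None  # Speech Object (placeholder: raw bytes)
    face: Optional[bytes] = None    # Face Object (placeholder: raw bytes)


@dataclass
class CWEOutput:
    text: str      # Text of the Speech produced by the machine
    speech: bytes  # synthetic Speech produced by the machine
    face: bytes    # video of the Face with synchronised lips


def validate(inp: CWEInput) -> None:
    """Assumed rule: the modality named by the selector must be present.

    Raises ValueError when the selected modality is missing.
    """
    if inp.selector is InputSelector.TEXT and inp.text is None:
        raise ValueError("Input Selector is TEXT but no Text Object was given")
    if inp.selector is InputSelector.SPEECH and inp.speech is None:
        raise ValueError("Input Selector is SPEECH but no Speech Object was given")
```

Under this assumed rule, a Speech input may still carry a Text Object as an additional information stream, while a Text input without a Text Object is rejected.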
5 SubAIMs
| AIW and AIMs | Name | JSON |
| MMC-ASR | Automatic Speech Recognition | X |
| MMC-ISD | Input Speech Description | X |
| PAF-IFD | Input Face Description | X |
| MMC-NLU | Natural Language Understanding | X |
| MMC-PSI | PS-Speech Interpretation | X |
| PAF-PFI | PS-Face Interpretation | X |
| MMC-PTI | PS-Text Interpretation | X |
| MMC-MEF | Multimodal Emotion Fusion | X |
| MMC-EDP | Entity Dialogue Processing | X |
| MMC-TTS | Text-to-Speech | X |
| MMC-VLA | Video Lip Animation | X |
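To show how the AIMs listed above could be chained, the sketch below derives one valid execution order from a dependency map. The wiring in `DEPENDS_ON` is an assumption inferred from the AIM names (for example, that Multimodal Emotion Fusion consumes the three PS interpretations). It is not copied from Figure 1, which remains the authoritative description of the connections.

```python
from graphlib import TopologicalSorter

# ASSUMED wiring: each AIM maps to the set of AIMs whose output it consumes.
# This is an illustration only; consult the Reference Model for the real topology.
DEPENDS_ON = {
    "MMC-ASR": set(),                              # Speech -> recognised text
    "MMC-ISD": set(),                              # Speech -> speech descriptors
    "PAF-IFD": set(),                              # Face -> face descriptors
    "MMC-NLU": {"MMC-ASR"},                        # text -> meaning
    "MMC-PSI": {"MMC-ISD"},                        # emotion from speech
    "PAF-PFI": {"PAF-IFD"},                        # emotion from face
    "MMC-PTI": {"MMC-NLU"},                        # emotion from text
    "MMC-MEF": {"MMC-PSI", "PAF-PFI", "MMC-PTI"},  # fused input emotion
    "MMC-EDP": {"MMC-NLU", "MMC-MEF"},             # machine text and emotion
    "MMC-TTS": {"MMC-EDP"},                        # machine speech
    "MMC-VLA": {"MMC-EDP", "MMC-TTS"},             # machine face video
}


def execution_order(graph):
    """Return the AIMs in an order where every AIM follows its dependencies."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for name in execution_order(DEPENDS_ON):
        print(name)
```

Several orders are valid for the same map, since the three description AIMs are independent of each other, so the printed sequence should be read as one possible schedule rather than a prescribed one.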
6 JSON Metadata
https://schemas.mpai.community/MMC/V2.1/AIWs/ConversationWithEmotion.json