This Chapter covers the eight AI Modules (AIMs) for which Profile signaling is specified and lists the Attributes that determine their Profiles.
1 Entity Context Understanding (HMC-ECU)
1.1 Specification
The Entity Context Understanding Composite AIM is specified
1.2 Attributes
HMC-ECU Profiles are determined by the use of one or more of the following attributes by the AIM:
| Attribute | Code | Function of HMC-ECU |
| Body Descriptors | BDD | Receives Body Descriptors |
| Face Descriptors | FCD | Receives Face Descriptors |
| Speech Object | SPO | Receives Speech Object |
| Text Object | TXO | Receives Text Object |
| Visual Object | VIO | Receives Visual Object |
| Audio Object | AUO | Receives Audio Object |
| Audio-Visual Scene Descriptors | AVS | Receives Audio-Visual Scene Descriptors |
| Translation | TRN | Translates Text Object |
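As an illustration of how the table above might be used, the following is a minimal sketch that treats a candidate HMC-ECU Profile as a collection of Attribute codes and checks it against the table. The codes and names are copied from the table; the representation of a Profile as a set of codes and the helper `describe_profile` are assumptions made for this example and are not the Profile signaling syntax.

```python
# Attribute codes and names copied from the HMC-ECU table above.
HMC_ECU_ATTRIBUTES = {
    "BDD": "Body Descriptors",
    "FCD": "Face Descriptors",
    "SPO": "Speech Object",
    "TXO": "Text Object",
    "VIO": "Visual Object",
    "AUO": "Audio Object",
    "AVS": "Audio-Visual Scene Descriptors",
    "TRN": "Translation",
}


def describe_profile(codes):
    """Hypothetical helper: return the Attribute names for a set of codes.

    Raises ValueError if no code is given or if a code is not in the table.
    Names are returned in table order, not in input order.
    """
    selected = set(codes)
    if not selected:
        raise ValueError("a Profile uses one or more Attributes")
    unknown = sorted(selected - set(HMC_ECU_ATTRIBUTES))
    if unknown:
        raise ValueError(f"codes not defined for HMC-ECU: {unknown}")
    return [name for code, name in HMC_ECU_ATTRIBUTES.items() if code in selected]


if __name__ == "__main__":
    print(describe_profile({"TXO", "SPO"}))
```

The same pattern applies to the other AIMs in this Chapter by swapping in the codes from their tables.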
2 Entity Dialogue Processing (MMC-EDP)
2.1 Specification
Entity Dialogue Processing is specified.
2.2 Attributes
MMC-EDP Profiles are determined by the use of one or more of the following attributes by the AIM:
| Attribute | Code | Function of MMC-EDP |
| Text Object | TXO | Receives Text (directly from a human or through NLU). |
| Object Instance ID | OII | Receives the ID of an A/V/AV Instance referenced in the dialogue. |
| Input Personal Status | EPS | Receives Personal Status. |
| Text Descriptors | TXD | Receives Meaning. |
| AV Scene Descriptors | AVS | Receives AV Scene Descriptors to enable it to locate the Object. |
| Speaker ID | SPI | Receives Speaker ID. |
| Face ID | FCI | Receives Face ID. |
| Memory | MEM | Takes into account prior Input Data of the dialogue session. |
3 Natural Language Understanding (MMC-NLU)
3.1 Specification
The Natural Language Understanding AIM is specified.
3.2 Attributes
MMC-NLU Profiles are determined by the use of one or more of the following attributes by the AIM:
| Attribute | Code | Function of MMC-NLU |
| Text Object | TXO | Receives Text directly from a human. |
| Recognised Text | TXR | Receives Text from ASR. |
| Object Instance ID | OII | Receives Object Instance ID. |
| Audio-Visual Scene Descriptors | AVS | Receives Audio-Visual Scene Descriptors. |
| Text Descriptors | TXD | Produces Text Descriptors (Meaning) |
4 Personal Status Extraction (MMC-PSE)
4.1 Specification
Personal Status Extraction is specified.
4.2 Attributes
MMC-PSE Profiles are determined by the use of one or more of the following attributes by the AIM:
| Attribute | Code | Function of MMC-PSE |
| Text Object | TXO | Receives Text |
| Speech Object | SPO | Receives Speech |
| Face Object | FCO | Receives Face |
| Body Object | BDO | Receives Gesture |
When an MMC-PSE is used as a component AIM in a Composite AIM as in the case of HMC-ECU, the MMC-PSE Attributes become Sub-Attributes of the Composite AIM.
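The relationship between a Composite AIM and the Attributes of a component AIM can be pictured as a nested structure. The sketch below is one possible in-memory model, assuming a Profile is a set of codes: the class `CompositeProfile`, its fields, and the method `add_component` are hypothetical names introduced for illustration only. The MMC-PSE codes are taken from the table above.

```python
from dataclasses import dataclass, field

# Attribute codes copied from the MMC-PSE table above.
MMC_PSE_CODES = {"TXO", "SPO", "FCO", "BDO"}


@dataclass
class CompositeProfile:
    """Hypothetical model of a Composite AIM Profile (not a normative syntax)."""

    aim: str                      # e.g. "HMC-ECU"
    attributes: set               # Attribute codes of the Composite AIM itself
    sub_attributes: dict = field(default_factory=dict)  # component AIM -> codes

    def add_component(self, component, codes, allowed):
        """Record the Attributes of a component AIM as Sub-Attributes."""
        codes = set(codes)
        invalid = sorted(codes - set(allowed))
        if invalid:
            raise ValueError(f"{component} does not define: {invalid}")
        self.sub_attributes[component] = codes


if __name__ == "__main__":
    profile = CompositeProfile(aim="HMC-ECU", attributes={"TXO", "SPO"})
    profile.add_component("MMC-PSE", {"TXO", "SPO"}, MMC_PSE_CODES)
    print(profile)
```

Keeping Sub-Attributes keyed by component AIM avoids ambiguity when the same code, such as TXO, appears both at the Composite level and inside a component.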
5 Text and Speech Translation (MMC-TST)
5.1 Specification
Text and Speech Translation is specified.
5.2 Attributes
MMC-TST Profiles are determined by the use of one or more of the following attributes by the AIM:
| Attribute | Code | Function |
| Language Preferences | LGP | MMC-TST receives information on input and output languages. |
| Text Object | TXO | MMC-TST receives Text. |
| Speech Object | SPO | MMC-TST receives Speech. |
| Speech Descriptors | SPD | MMC-TST uses Speech Descriptors. |
| Personal Status | EPS | MMC-TST receives Personal Status. |
When an MMC-TST is used as a component AIM in a Composite AIM, as in the case of HMC-ECU, the LGP (Language Preferences) Attribute of MMC-TST becomes a Sub-Attribute of the Composite AIM, represented as 3-letter codes of [6], Part 3.
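A rough sketch of how Language Preferences might be carried as 3-letter codes follows. It assumes one input and one output language, following the LGP row of the table, and checks only the shape of each code (three lowercase ASCII letters); it does not check membership in the code list of [6]. The function name `lgp_sub_attributes`, the dictionary layout, and the example strings are assumptions for illustration.

```python
import re

# Shape check only: three lowercase ASCII letters.
# This does NOT verify that the code exists in [6], Part 3.
_THREE_LETTERS = re.compile(r"[a-z]{3}")


def lgp_sub_attributes(input_language, output_language):
    """Hypothetical helper: build an LGP Sub-Attribute entry."""
    for code in (input_language, output_language):
        if not _THREE_LETTERS.fullmatch(code):
            raise ValueError(f"not a 3-letter code: {code!r}")
    return {"LGP": {"input": input_language, "output": output_language}}


if __name__ == "__main__":
    # Example strings chosen for illustration.
    print(lgp_sub_attributes("eng", "ita"))
```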
6 Audio-Visual Scene Rendering (PAF-AVR)
6.1 Specification
Audio-Visual Scene Rendering is specified.
6.2 Attributes
PAF-AVR Profiles are determined by the use of one or more of the following attributes by the AIM:
| Attribute | Code | Function |
| Point of View | POV | PAF-AVR is instructed to provide Output Audio and/or Output Visual as perceived from a Point of View. |
| Portable Avatar | PAV | PAF-AVR receives a Portable Avatar and produces an Audio-Visual Scene from the Point of View. |
| Audio-Visual Scene Descriptors | AVS | PAF-AVR receives Audio-Visual Scene Descriptors and produces an Audio-Visual Scene from the Point of View. |
| Output Text | TXO | PAF-AVR produces Text Object. |
| Output Audio | AUO | PAF-AVR produces Audio Object. |
| Output Visual | VIO | PAF-AVR produces Visual Object. |
7 Personal Status Display (PAF-PSD)
7.1 Specification
Personal Status Display is specified.
7.2 Attributes
PAF-PSD Profiles are determined by the use of one or more of the following attributes by the AIM:
| Attribute | Code | Function |
| Text Object | TXO | PAF-PSD receives Text and produces Speech. |
| Personal Status | EPS | PAF-PSD receives Personal Status. |
| Speech Model | SPM | PAF-PSD receives Speech Model |
| Avatar Model | AVM | PAF-PSD receives Avatar Model. |
8 Text-to-Speech (MMC-TTS)
8.1 Specification
Text-to-Speech is specified.
8.2 Attributes
An MMC-TTS AIM Profile is determined by whether the AIM uses one or more of the following Attributes:
| Attribute | Code | Function |
| Text Object | TXO | MMC-TTS receives Text Object |
| Personal Status | EPS | MMC-TTS receives Personal Status |
| Speech Model | SPM | MMC-TTS receives NN Speech Model |
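Several codes recur across the eight tables of this Chapter, and the same code can describe a different function depending on the AIM (for example, TXO is received by MMC-TTS but produced by PAF-AVR). The sketch below collects the codes from the tables into a registry keyed by AIM and looks up which AIMs define a given code. The registry layout and the function `aims_defining` are assumptions made for illustration.

```python
# Attribute codes copied from the eight tables of this Chapter.
AIM_ATTRIBUTE_CODES = {
    "HMC-ECU": {"BDD", "FCD", "SPO", "TXO", "VIO", "AUO", "AVS", "TRN"},
    "MMC-EDP": {"TXO", "OII", "EPS", "TXD", "AVS", "SPI", "FCI", "MEM"},
    "MMC-NLU": {"TXO", "TXR", "OII", "AVS", "TXD"},
    "MMC-PSE": {"TXO", "SPO", "FCO", "BDO"},
    "MMC-TST": {"LGP", "TXO", "SPO", "SPD", "EPS"},
    "PAF-AVR": {"POV", "PAV", "AVS", "TXO", "AUO", "VIO"},
    "PAF-PSD": {"TXO", "EPS", "SPM", "AVM"},
    "MMC-TTS": {"TXO", "EPS", "SPM"},
}


def aims_defining(code):
    """Hypothetical helper: sorted list of AIMs whose table contains `code`."""
    return sorted(aim for aim, codes in AIM_ATTRIBUTE_CODES.items() if code in codes)


if __name__ == "__main__":
    for code in ("SPM", "EPS", "TXO"):
        print(code, aims_defining(code))
```

Because a code is only meaningful together with its AIM, the lookup is keyed by AIM first rather than by a single global code table.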