1 Function | 2 Reference Model | 3 Input/Output Data |
4 SubAIMs | 5 JSON Metadata | 6 Profiles |
7 Reference Software | 8 Conformance Testing | 9 Performance Assessment |
1 Function
Entity and Context Understanding (HMC-ECU) enables a Machine to understand the information conveyed by an Entity and its Context, so that the Entity Dialogue Processing AIM can produce a pertinent response (Communication Item).
Therefore, Entity and Context Understanding (HMC-ECU):
- Receives the Audio-Visual Scene Descriptors.
- Separates the components of the Audio-Visual Scene Descriptors.
- Performs:
  - Recognition of Entity’s Speech.
  - Recognition of Audio Object and Visual Object.
  - Understanding of Entity’s Natural Language expressed as Text in the Context of the Audio and Visual Instance.
  - Extraction of the Entity’s Personal Status.
  - Translation of the Entity’s Text.
- Produces:
  - Audio-Visual Scene Geometry
  - Entity ID
  - Audio Instance ID
  - Visual Instance ID
  - Personal Status
  - Translated and Refined Text
  - Meaning
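The steps listed above can be read as a small dataflow. The following Python sketch is only an illustration under stated assumptions: the function names, dictionary keys, stage signatures, and the order in which stages feed each other are hypothetical and are not taken from the specification. Each processing step is supplied by the caller as a plain callable, so the sketch shows only how the receive, separate, perform, and produce steps fit together.

```python
from typing import Any, Callable, Dict

Data = Dict[str, Any]
Stages = Dict[str, Callable[..., Any]]


def demultiplex(scene_descriptors: Data) -> Data:
    # Placeholder for "separates the components": here the components are
    # assumed to be already keyed by name, so they are simply copied.
    return dict(scene_descriptors)


def entity_and_context_understanding(scene_descriptors: Data, stages: Stages) -> Data:
    # All keys and stage names below are hypothetical, chosen for illustration.
    parts = demultiplex(scene_descriptors)

    # Recognition of speech, audio objects and visual objects.
    recognised_text = stages["recognise_speech"](parts["speech_object"])
    audio_instance_id = stages["identify_audio"](parts["audio_object"])
    visual_instance_id = stages["identify_visual"](parts["visual_objects"])

    # Understanding of the text in the context of the identified instances
    # (assumed here to return the refined text together with its meaning).
    refined_text, meaning = stages["understand_language"](
        recognised_text, audio_instance_id, visual_instance_id
    )

    # Personal status extraction and translation (inputs are assumptions).
    personal_status = stages["extract_personal_status"](parts, meaning)
    translated_text = stages["translate"](refined_text)

    return {
        # Passed through unchanged from the input.
        "audio_visual_scene_geometry": parts["audio_visual_scene_geometry"],
        "entity_id": parts["entity_id"],
        "audio_instance_id": audio_instance_id,
        "visual_instance_id": visual_instance_id,
        "personal_status": personal_status,
        "translated_text": translated_text,
        "refined_text": refined_text,
        "meaning": meaning,
    }
```

Supplying the stages as callables keeps the sketch independent of any particular speech recogniser, object identifier, or translator.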
2 Reference Model
Entity and Context Understanding is an AIM whose Reference Model is depicted in Figure 1.
Figure 1 – The Entity and Context Understanding Composite AIM
3 Input and Output Data
Table 1 specifies the Input and Output Data of the Entity and Context Understanding AIM.
Table 1 – I/O Data of the Entity and Context Understanding Composite AIM
Input | Description |
Body Descriptors | The Descriptors of the Body Objects of Entities in the Visual Scene. |
Face Descriptors | The Descriptors of the Face Objects of Entities in the Visual Scene. |
Speech Object | The digital representation of the speech emitted by the Entity. |
Audio-Visual Scene Geometry | The digital representation of the spatial arrangement of the Audio, Visual, and Audio-Visual Objects of the Scene. |
Visual Objects | The Visual Objects of the Scene. |
Audio Object | The Audio Objects of the Scene. |
Text Object | Text of Entity with Entity ID. |
Output | Description |
Personal Status | Personal Status of Entity having the Entity ID. |
Translated Text | Translated Text of Text Object or of Text conveyed by Speech Object. |
Refined Text | Refined Text of Speech Object. |
Meaning | Other name for Refined Text Descriptors. |
Visual Instance ID | The Identifier of the specific Visual Object belonging to a level in the taxonomy. |
Audio-Visual Scene Geometry | As in Input |
Audio Instance ID | The Identifier of the specific Audio Object belonging to a level in the taxonomy. |
4 SubAIMs
Entity and Context Understanding is a Composite AIM whose Reference Model is depicted in Figure 2.
Figure 2 – The Entity and Context Understanding Composite AIM
Note that Output Data in italic are passed directly from the homonymous Input Data.
Table 2 – AIMs and JSON Metadata
AIM/1 | AIM/2 | AIM Name | JSON | |
HMC-ECU | Entity and Context Understanding | X | ||
OSD-SDX | Audio-Visual Scene Demultiplexing | X | ||
MMC-ASR | Automatic Speech Recognition | X | ||
OSD-VOI | Visual Object Identification | X | ||
CAE-AOI | Audio Object Identification | X | ||
MMC-NLU | Natural Language Understanding | X | ||
MMC-PSE | Personal Status Extraction | X | ||
MMC-ETD | Entity Text Description | X | ||
MMC-ESD | Entity Speech Description | X | ||
PAF-EFD | Entity Face Description | X | ||
PAF-EBD | Entity Body Description | X | ||
MMC-PTI | PS-Text Interpretation | X | ||
MMC-PSI | PS-Speech Interpretation | X | ||
PAF-PFI | PS-Face Interpretation | X | ||
PAF-PGI | PS-Gesture Interpretation | X | ||
MMC-PMX | Personal Status Multiplexing | X | ||
MMC-TTT | Text-to-Text Translation | X |
5 JSON Metadata
https://schemas.mpai.community/HMC/V2.0/AIMs/EntityAndContextUnderstanding.json
6 Profiles
Entity Context Understanding Profiles are defined.
7 Reference Software
8 Conformance Testing
Table 3 provides the Conformance Testing Method for the HMC-ECU AIM.
If a schema contains references to other schemas, conformance of data to the primary schema implies that any data referencing a secondary schema shall also validate against the relevant schema, if present, and conform with the Qualifier, if present.
Table 3 – Conformance Testing Method for HMC-ECU AIM
Receives | Body Descriptors | Shall validate against Body Descriptors XML Schema. |
| Face Descriptors | Shall validate against Face Descriptors Schema. |
| Speech Object | Shall validate against Speech Object Schema. Speech Data shall conform with Speech Qualifier. |
| Audio-Visual Scene Geometry | Shall validate against Audio-Visual Scene Geometry Schema. |
| Visual Objects | Shall validate against Visual Object Schema. Visual Data shall conform with Visual Qualifier. |
| Audio Object | Shall validate against Audio Object Schema. Audio Data shall conform with Audio Qualifier. |
| Text Object | Shall validate against Text Object Schema. Text Data shall conform with Text Qualifier. |
Produces | Personal Status | Shall validate against Personal Status Schema. |
| Translated Text | Shall validate against Text Object Schema. Text Data shall conform with Text Qualifier. |
| Refined Text | Shall validate against Text Object Schema. Text Data shall conform with Text Qualifier. |
| Meaning | Shall validate against Meaning Schema. |
| Visual Instance ID | Shall validate against Instance ID Schema. |
| Audio-Visual Scene Geometry | Shall validate against Audio-Visual Scene Geometry Schema. |
| Audio Instance ID | Shall validate against Instance ID Schema. |
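The conformance rule in this section combines three checks: validation against the primary schema, validation of referenced data against its secondary schema, and conformance with a Qualifier where one exists. The Python sketch below illustrates that logic with a deliberately simplified stand-in for a schema (a dictionary mapping field names to Python types or nested dictionaries). It is not JSON Schema, and the field names, the `SPEECH_SCHEMA` example, and the qualifier check are hypothetical. As a further simplification, every field listed in a schema is treated as required.

```python
from typing import Any, Callable, Dict, Optional

# Toy schema: field name -> Python type, or a nested dict acting as a
# secondary schema. This is an illustrative stand-in, not JSON Schema.
Schema = Dict[str, Any]


def validates(data: Any, schema: Schema) -> bool:
    """Return True if data has every field in schema with the expected shape."""
    if not isinstance(data, dict):
        return False
    for field, expected in schema.items():
        if field not in data:
            return False
        value = data[field]
        if isinstance(expected, dict):
            # Secondary schema: the referenced data must validate as well.
            if not validates(value, expected):
                return False
        elif not isinstance(value, expected):
            return False
    return True


def conforms(
    data: Any,
    schema: Schema,
    qualifier_check: Optional[Callable[[Any], bool]] = None,
) -> bool:
    """Schema validation first, then the Qualifier check if one is given."""
    if not validates(data, schema):
        return False
    if qualifier_check is not None:
        return bool(qualifier_check(data))
    return True


# Hypothetical example: a speech object whose qualifier names a format.
SPEECH_SCHEMA: Schema = {"id": str, "qualifier": {"format": str}, "data": bytes}
speech = {"id": "s1", "qualifier": {"format": "wav"}, "data": b"\x00\x01"}
speech_ok = conforms(
    speech, SPEECH_SCHEMA, lambda d: d["qualifier"]["format"] in {"wav", "flac"}
)
```

A real conformance test would replace `validates` with a JSON Schema validator run against the published schemas; the structure of `conforms` is the part this sketch is meant to show.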
9 Performance Assessment