| 1 Function | 2 Reference Model | 3 Input/Output Data |
| 4 SubAIMs | 5 JSON Metadata | 6 Profiles |
| 7 Reference Software | 8 Conformance Testing | 9 Performance Assessment |
1 Function
Entity and Context Understanding (HMC-ECU) enables a Machine:
- To understand the information conveyed by an Entity and its Context, in the form of either:
  - An Audio-Visual Scene (if the Entity is a human).
  - A Portable Avatar (if the Entity is a machine).
- To produce a pertinent response composed of Machine Text and Machine Personal Status.
Therefore, Entity and Context Understanding (HMC-ECU):
| Receives | Audio-Visual Scene Descriptors | And separates them into components. |
| Recognises | Speech | Of Entity. |
| | Audio Object and Visual Object | Providing their Identities. |
| Understands | Natural Language | (Of Entity) expressed as Text, being cognizant of the Audio and Visual Instances. |
| Extracts | Personal Status | Of Entity. |
| Translates | Text | Of Entity. |
| Produces | Audio-Visual Scene Geometry | Geometry of the Scene. |
| | Entity ID | Entity producing Input Data. |
| | Audio Instance ID | Identified Instance. |
| | Visual Instance ID | Identified Instance. |
| | Personal Status | Of Entity. |
| | Translated Text | Of Refined Text. |
| | Meaning | Of Refined Text. |
2 Reference Model
Entity and Context Understanding (HMC-ECU) is an AIM whose Reference Model is depicted in Figure 1.

Figure 1 – The Entity and Context Understanding (HMC-ECU) AIM
3 Input and Output Data
Table 1 specifies the Input and Output Data of the Entity and Context Understanding (HMC-ECU) AIM.
Table 1 – I/O Data of the Entity and Context Understanding (HMC-ECU) AIM
| Input | Description |
| Body Descriptors Object | The Descriptors of the Body Objects of Entities in the Visual Scene. |
| Face Descriptors Object | The Descriptors of the Face Objects of Entities in the Visual Scene. |
| Speech Object | The digital representation of the speech emitted by the Entity. |
| Audio-Visual Scene Geometry | The digital representation of the spatial arrangement of the Audio, Visual, and Audio-Visual Objects of the Scene. |
| Visual Objects | The Visual Objects of the Scene. |
| Audio Object | The Audio Objects of the Scene. |
| Text Object | Text of Entity with Entity ID. |
| Output | Description |
| Personal Status | Personal Status of Entity having the Entity ID. |
| Translated Text Object | Translated Text of Text Object or of Text conveyed by Speech Object. |
| Refined Text Object | Refined Text of Speech Object. |
| Meaning | Other name for Refined Text Descriptors. |
| Visual Instance Identifier | The Identifier of the specific Visual Object belonging to a level in the taxonomy. |
| Audio-Visual Scene Geometry | As in Input |
| Audio Instance Identifier | The Identifier of the specific Audio Object belonging to a level in the taxonomy. |
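The rows of Table 1 can be mirrored in code as two plain containers, one for the Input Data and one for the Output Data. The following is a minimal sketch, not a normative binding: the class names `ECUInput` and `ECUOutput`, the field names, and the choice to leave every payload untyped are assumptions made for illustration. The actual data formats are defined by the respective MPAI schemas.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional

# Payloads are kept opaque: the real formats are defined by the MPAI data type
# schemas, which this sketch does not model.
Payload = Any


@dataclass
class ECUInput:
    """Hypothetical container mirroring the Input rows of Table 1."""
    body_descriptors: Optional[Payload] = None
    face_descriptors: Optional[Payload] = None
    speech_object: Optional[Payload] = None
    av_scene_geometry: Optional[Payload] = None
    visual_objects: List[Payload] = field(default_factory=list)
    audio_objects: List[Payload] = field(default_factory=list)
    text_object: Optional[Payload] = None


@dataclass
class ECUOutput:
    """Hypothetical container mirroring the Output rows of Table 1."""
    personal_status: Optional[Payload] = None
    translated_text_object: Optional[Payload] = None
    refined_text_object: Optional[Payload] = None
    meaning: Optional[Payload] = None
    visual_instance_id: Optional[Payload] = None
    av_scene_geometry: Optional[Payload] = None  # "As in Input"
    audio_instance_id: Optional[Payload] = None
```

Every field defaults to empty because the table does not state which inputs are mandatory. An implementation that knows which ones are would tighten the types accordingly.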
4 SubAIMs
Entity and Context Understanding (HMC-ECU) is a Composite AIM whose Reference Model is depicted in Figure 2.
Figure 2 – The Entity and Context Understanding Composite (HMC-ECU) AIM
Table 2 – AIMs and JSON Metadata
| AIM/1 | AIM/2 | AIM Name | JSON | |
| HMC-ECU | Entity and Context Understanding | X | ||
| OSD-SDX | Audio-Visual Scene Demultiplexing | X | ||
| MMC-ASR | Automatic Speech Recognition | X | ||
| OSD-VOI | Visual Object Identification | X | ||
| CAE-AOI | Audio Object Identification | X | ||
| MMC-NLU | Natural Language Understanding | X | ||
| MMC-PSE | Personal Status Extraction | X | ||
| MMC-ETD | Entity Text Description | X | ||
| MMC-ESD | Entity Speech Description | X | ||
| PAF-EFD | Entity Face Description | X | ||
| PAF-EBD | Entity Body Description | X | ||
| MMC-PTI | PS-Text Interpretation | X | ||
| MMC-PSI | PS-Speech Interpretation | X | ||
| PAF-PFI | PS-Face Interpretation | X | ||
| PAF-PGI | PS-Gesture Interpretation | X | ||
| MMC-PMX | Personal Status Multiplexing | X | ||
| MMC-TTT | Text-to-Text Translation | X |
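As an illustration of how a Composite AIM might chain its SubAIMs, the sketch below runs seven of the AIMs listed above in the order suggested by the Function section: demultiplexing, recognition and identification, understanding, Personal Status extraction, translation. It is a sketch under stated assumptions rather than the reference implementation. The callable interface `AIM`, the shared dictionary of named data, the fixed order in `ORDER`, and the function name `run_ecu` are all hypothetical; the actual connections between AIMs are those of the Reference Model in Figure 2.

```python
from typing import Any, Callable, Dict, List

# Assumption: each sub-AIM is a callable that reads a dict of named data and
# returns a dict of the named data it produces.
AIM = Callable[[Dict[str, Any]], Dict[str, Any]]

# Assumption: a fixed invocation order following the Function section.
ORDER: List[str] = [
    "OSD-SDX",  # Audio-Visual Scene Demultiplexing
    "MMC-ASR",  # Automatic Speech Recognition
    "OSD-VOI",  # Visual Object Identification
    "CAE-AOI",  # Audio Object Identification
    "MMC-NLU",  # Natural Language Understanding
    "MMC-PSE",  # Personal Status Extraction
    "MMC-TTT",  # Text-to-Text Translation
]


def run_ecu(aims: Dict[str, AIM], inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Invoke the sub-AIMs in ORDER, accumulating their outputs in one dict."""
    missing = [name for name in ORDER if name not in aims]
    if missing:
        raise KeyError(f"missing sub-AIM implementations: {missing}")
    data = dict(inputs)  # copy so the caller's dict is not mutated
    for name in ORDER:
        data.update(aims[name](data))
    return data
```

A sequential loop is the simplest choice here. The recognition and identification steps do not obviously depend on one another, so a real workflow could run them concurrently.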
5 JSON Metadata
https://schemas.mpai.community/HMC/V2.1/AIMs/EntityAndContextUnderstanding.json
6 Profiles
Entity Context Understanding Profiles are defined.
7 Reference Software
8 Conformance Testing
Table 3 provides the Conformance Testing Method for the HMC-ECU AIM.
If a schema contains references to other schemas, conformance of data for the primary schema implies that any data referencing a secondary schema shall also validate against the relevant schema, if present, and conform with the Qualifier, if present.
Table 3 – Conformance Testing Method for the HMC-ECU AIM
| Receives | Body Descriptors | Shall validate against Body Descriptors XML Schema. |
| | Face Descriptors | Shall validate against Face Descriptors Schema. |
| | Speech Object | Shall validate against Speech Object Schema. Speech Data shall conform with Speech Qualifier. |
| | Audio-Visual Scene Geometry | Shall validate against Audio-Visual Scene Geometry Schema. |
| | Visual Objects | Shall validate against Visual Object Schema. Visual Data shall conform with Visual Qualifier. |
| | Audio Object | Shall validate against Audio Object Schema. Audio Data shall conform with Audio Qualifier. |
| | Text Object | Shall validate against Text Object Schema. Text Data shall conform with Text Qualifier. |
| Produces | Personal Status | Shall validate against Personal Status Schema. |
| | Translated Text | Shall validate against Text Object Schema. Text Data shall conform with Text Qualifier. |
| | Refined Text | Shall validate against Text Object Schema. Text Data shall conform with Text Qualifier. |
| | Meaning | Shall validate against Meaning Schema. |
| | Visual Instance ID | Shall validate against Instance ID Schema. |
| | Audio-Visual Scene Geometry | Shall validate against Audio-Visual Scene Geometry Schema. |
| | Audio Instance ID | Shall validate against Instance ID Schema. |
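The rule on referenced schemas stated in this section is recursive: a data item conforms only if it validates against its own schema, conforms with its Qualifier when one applies, and every item it references does the same. The sketch below shows one way to express that. The `DataItem` wrapper, its field names, and the `conformance_failures` function are hypothetical. The schema and Qualifier checks are passed in as plain callables, so in practice `schema_check` would wrap a JSON Schema validator of the implementer's choice.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional

# Assumption: a check is any callable that returns True when the value passes.
Check = Callable[[Any], bool]


@dataclass
class DataItem:
    """Hypothetical wrapper pairing a data instance with its conformance checks."""
    name: str
    value: Any
    schema_check: Check
    qualifier_check: Optional[Check] = None  # None when no Qualifier is present
    referenced: List["DataItem"] = field(default_factory=list)


def conformance_failures(item: DataItem) -> List[str]:
    """Return one message per failed check, visiting referenced items recursively."""
    failures: List[str] = []
    if not item.schema_check(item.value):
        failures.append(f"{item.name}: schema")
    if item.qualifier_check is not None and not item.qualifier_check(item.value):
        failures.append(f"{item.name}: qualifier")
    for child in item.referenced:
        failures.extend(conformance_failures(child))
    return failures
```

An empty result means the item and everything it references passed. Collecting all failures, rather than stopping at the first, gives a conformance tester a complete report in one pass.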
9 Performance Assessment