1      Functions

The functions of Entity and Context Understanding (HMC-ECU) allow a Machine to understand the information conveyed by an Entity and its Context, so that the Entity Dialogue Processing AIM can produce a pertinent communication.

Therefore, Entity and Context Understanding (HMC-ECU):

Receives Audio-Visual Scene Descriptors.
Separates the components of the Audio-Visual Scene Descriptors.
Performs:
    Recognition of Speaker ID.
    Recognition of Face ID.
    Recognition of Entity’s Speech.
    Recognition of Audio Object and Visual Object.
    Understanding of Entity’s Natural Language expressed as Text in the Context of Audio and/or Visual Instance.
    Extraction of the Entity’s Personal Status.
    Translation of the Entity’s Text.
Produces:
    Entity ID
    Personal Status
    Translated and Refined Text
    Meaning
    Audio Instance ID
    Visual Instance ID
    Audio-Visual Scene Descriptors (same as input)
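
The order of the functions listed above can be illustrated with a minimal Python sketch. It is not a normative interface: every stage name, dictionary key, and return shape is hypothetical, and the wiring between stages (for example, that language understanding returns a Refined Text and Meaning pair, or that Speaker ID and Face ID are resolved into one Entity ID by a caller-supplied function) is an assumption made only for illustration.

```python
from typing import Any, Callable, Dict

# Every stage name and dictionary key below is hypothetical; the specification
# names the functions, not a programming interface.
Stages = Dict[str, Callable[..., Any]]


def understand_entity_and_context(scene_descriptors: Any, stages: Stages) -> Dict[str, Any]:
    """Run caller-supplied stages in the order the function list gives them."""
    # Separate the components of the Audio-Visual Scene Descriptors.
    parts = stages["separate"](scene_descriptors)

    # Recognition steps.
    speaker_id = stages["recognise_speaker"](parts["speech"])
    face_id = stages["recognise_face"](parts["face"])
    recognised_text = stages["recognise_speech"](parts["speech"])
    audio_instance_id = stages["identify_audio_object"](parts["audio_object"])
    visual_instance_id = stages["identify_visual_object"](parts["visual_object"])

    # Understanding of the Text in the context of the Audio/Visual Instances.
    # Assumption: this stage returns a (refined_text, meaning) pair.
    refined_text, meaning = stages["understand_language"](
        recognised_text, audio_instance_id, visual_instance_id
    )

    # Assumption: Personal Status is extracted from the separated components
    # together with the Meaning.
    personal_status = stages["extract_personal_status"](parts, meaning)
    translated_text = stages["translate"](refined_text)

    return {
        # Assumption: how Speaker ID and Face ID combine into one Entity ID
        # is delegated to a caller-supplied function.
        "entity_id": stages["resolve_entity"](speaker_id, face_id),
        "personal_status": personal_status,
        "translated_text": translated_text,
        "refined_text": refined_text,
        "meaning": meaning,
        "audio_instance_id": audio_instance_id,
        "visual_instance_id": visual_instance_id,
        # Passed through unchanged ("same as input").
        "audio_visual_scene_descriptors": scene_descriptors,
    }
```

Passing the stages in as callables keeps the sketch independent of any particular recognition or translation technology; only the ordering and the pass-through of the input descriptors are taken from the list.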

2      Reference Model

Figure 1 depicts the Reference Architecture of the Entity and Context Understanding AIM.

Figure 1 – The Entity and Context Understanding Composite AIM

3      I/O Data

Table 1 specifies the Input and Output Data of the Entity and Context Understanding AIM.

Table 1 – I/O Data of the Entity and Context Understanding Composite AIM

| Input | Description |
|---|---|
| Audio-Visual Scene Descriptors | The digital representation of the Audio, Visual, and Audio-Visual Objects of the Scene and their spatial arrangement. |

| Output | Description |
|---|---|
| Entity ID | |
| Personal Status | Personal Status of Entity having the Entity ID. |
| Translated Text | Translated Text of Text Object or of Text conveyed by Speech Object. |
| Refined Text | Refined Text of Speech Object. |
| Meaning | Other name for Refined Text Descriptors. |
| Visual Instance ID | The Identifier of the specific Visual Object belonging to a level in the taxonomy. |
| Audio-Visual Scene Descriptors | As in Input. |
| Audio Instance ID | The Identifier of the specific Audio Object belonging to a level in the taxonomy. |
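
As an illustration of how the Output rows of Table 1 might be carried in software, the following Python sketch defines one record with a field per row. The class name, field names, and field types are all assumptions; the actual data formats are those defined by the JSON Metadata, not by this sketch.

```python
from dataclasses import dataclass, fields
from typing import Any, List


@dataclass(frozen=True)
class EntityContextUnderstandingOutput:
    """Hypothetical container with one field per Output row of Table 1."""

    entity_id: str
    personal_status: Any             # Personal Status of the Entity with entity_id
    translated_text: str
    refined_text: str
    meaning: Any                     # descriptors of the Refined Text
    visual_instance_id: str
    audio_visual_scene_descriptors: Any  # "As in Input": passed through unchanged
    audio_instance_id: str


def output_names() -> List[str]:
    """Return the field names in the order of the Table 1 rows."""
    return [f.name for f in fields(EntityContextUnderstandingOutput)]
```

Making the record immutable (`frozen=True`) is a design choice of the sketch: it prevents a downstream consumer from altering the pass-through Audio-Visual Scene Descriptors by accident.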

4      SubAIMs

HMC-ECU is a Composite AIM having the Reference Model depicted in Figure 2.

Figure 2 – The Entity and Context Understanding Composite AIM

Table 2 provides the list of AIMs – both Basic and Composite – included in the Entity and Context Understanding Composite AIM.

Table 2 – AIW, AIMs, and JSON Metadata

| AIMs | Name |
|---|---|
| HMC-ECU | Entity And Context Understanding |
| OSD-SDX | Audio-Visual Scene Demultiplexing |
| MMC-SIR | Speaker Identity Recognition |
| PAF-FIR | Face Identity Recognition |
| MMC-ASR | Automatic Speech Recognition |
| OSD-VOI | Visual Object Identification |
| CAE-AOI | Audio Object Identification |
| MMC-NLU | Natural Language Understanding |
| MMC-PSE | Personal Status Extraction |
| MMC-TTT | Text-to-Text Translation |
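
Each identifier in Table 2 has the form prefix, hyphen, acronym. The short Python sketch below groups the sub-AIM identifiers by prefix, on the assumption (an interpretation, not something the table states) that the prefix marks the MPAI Technical Specification in which the AIM is defined. The constant and function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

# Sub-AIM identifiers copied from Table 2 (the composite HMC-ECU itself is omitted).
SUB_AIM_IDS = [
    "OSD-SDX", "MMC-SIR", "PAF-FIR", "MMC-ASR", "OSD-VOI",
    "CAE-AOI", "MMC-NLU", "MMC-PSE", "MMC-TTT",
]


def group_by_prefix(aim_ids: List[str]) -> Dict[str, List[str]]:
    """Group identifiers by the part before the first hyphen, keeping table order."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for aim_id in aim_ids:
        prefix, _, _ = aim_id.partition("-")
        groups[prefix].append(aim_id)
    return dict(groups)
```

Such a grouping is one way to see at a glance which sub-AIMs the composite reuses from other specifications.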

5     JSON Metadata

https://schemas.mpai.community/HMC/V1.1/AIMs/EntityAndContextUnderstanding.json 

6    Profiles

The Profiles of the Entity and Context Understanding AIM are specified.