MPAI-HMC V2.0 AIMs Entity and Context Understanding

1 Function	2 Reference Model	3 Input/Output Data
4 SubAIMs	5 JSON Metadata	6 Profiles
7 Reference Software	8 Conformance Testing	9 Performance Assessment

1 Function

Entity and Context Understanding (HMC-ECU) enables a Machine to understand the information conveyed by an Entity and its Context, providing the information that the Entity Dialogue Processing AIM needs to produce a pertinent response composed of Machine Text and Machine Personal Status.

Therefore, Entity and Context Understanding (HMC-ECU):

Receives	Audio-Visual Scene Descriptors	And separates into components.
Recognises	Speech	Of Entity.
	Audio Object and Visual Object.	Providing their Identities.
Understands	Natural Language	Of Entity, expressed as Text, being cognizant of the Audio and Visual Instances.
Extracts	Personal Status	Of Entity.
Translates	Text	Of Entity.
Produces:	Audio-Visual Scene Geometry	Geometry of the Scene.
	Entity ID	Entity producing Input Data.
	Audio Instance ID	Identified Instance.
	Visual Instance ID	Identified Instance.
	Personal Status	Of Entity.
	Translated Text	Of Refined Text.
	Meaning	Of Refined Text.

2 Reference Model

Entity and Context Understanding is an AIM whose Reference Model is depicted in Figure 1.

Figure 1 – The Entity and Context Understanding Composite AIM

3 Input and Output Data

Table 1 specifies the Input and Output Data of the Entity and Context Understanding AIM.

Table 1 – I/O Data of the Entity and Context Understanding Composite AIM

Input	Description
Body Descriptors	The Descriptors of the Body Objects of Entities in the Visual Scene.
Face Descriptors	The Descriptors of the Face Objects of Entities in the Visual Scene.
Speech Object	The digital representation of the speech emitted by the Entity.
Audio-Visual Scene Geometry	The digital representation of the spatial arrangement of the Audio, Visual, and Audio-Visual Objects of the Scene.
Visual Objects	The Visual Objects of the Scene.
Audio Object	The Audio Objects of the Scene.
Text Object	Text of Entity with Entity ID.
Output	Description
Personal Status	Personal Status of Entity having the Entity ID.
Translated Text	Translated Text of Text Object or of Text conveyed by Speech Object.
Refined Text	Refined Text of Speech Object.
Meaning	Other name for Refined Text Descriptors.
Visual Instance ID	The Identifier of the specific Visual Object belonging to a level in the taxonomy.
Audio-Visual Scene Geometry	As in Input
Audio Instance ID	The Identifier of the specific Audio Object belonging to a level in the taxonomy.
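
The I/O Data of Table 1 can be pictured as two plain containers. The following is a minimal Python sketch and not part of the specification: the class names, the field names, and the untyped placeholders are assumptions made for illustration only. The normative definitions remain the schemas of the individual Data Types.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class EcuInputs:
    """Hypothetical container mirroring the Input rows of Table 1."""
    body_descriptors: Any
    face_descriptors: Any
    speech_object: Any
    av_scene_geometry: Any
    visual_objects: List[Any]
    audio_objects: List[Any]
    text_object: Any


@dataclass
class EcuOutputs:
    """Hypothetical container mirroring the Output rows of Table 1."""
    personal_status: Any
    translated_text: Any
    refined_text: Any
    meaning: Any
    visual_instance_id: Any
    av_scene_geometry: Any
    audio_instance_id: Any


def geometry_passed_through(inputs: EcuInputs, outputs: EcuOutputs) -> bool:
    """Table 1 describes the output Scene Geometry as "As in Input"."""
    return outputs.av_scene_geometry == inputs.av_scene_geometry
```

The helper at the end only illustrates the "As in Input" row: under this reading, an implementation forwards the Audio-Visual Scene Geometry unchanged.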

4 SubAIMs

Entity and Context Understanding is a Composite AIM whose Reference Model is depicted in Figure 2.

Figure 2 – The Entity and Context Understanding Composite AIM

Table 2 – AIMs and JSON Metadata

AIM	AIM/1	AIM/2	AIM Name	JSON
HMC-ECU			Entity and Context Understanding	X
	OSD-SDX		Audio-Visual Scene Demultiplexing	X
	MMC-ASR		Automatic Speech Recognition	X
	OSD-VOI		Visual Object Identification	X
	CAE-AOI		Audio Object Identification	X
	MMC-NLU		Natural Language Understanding	X
	MMC-PSE		Personal Status Extraction	X
		MMC-ETD	Entity Text Description	X
		MMC-ESD	Entity Speech Description	X
		PAF-EFD	Entity Face Description	X
		PAF-EBD	Entity Body Description	X
		MMC-PTI	PS-Text Interpretation	X
		MMC-PSI	PS-Speech Interpretation	X
		PAF-PFI	PS-Face Interpretation	X
		PAF-PGI	PS-Gesture Interpretation	X
		MMC-PMX	Personal Status Multiplexing	X
	MMC-TTT		Text-to-Text Translation	X
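
The AIM hierarchy in the table above can be expressed as a small tree. This is a sketch under stated assumptions: the `Aim` class and its methods are hypothetical, and the AIM/2 entries are attached to MMC-PSE because they follow it in the table, which is a reading of the layout rather than something the text states explicitly.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Aim:
    """Hypothetical node: an AIM code, its name, and any SubAIMs."""
    code: str
    name: str
    sub_aims: List["Aim"] = field(default_factory=list)

    def is_composite(self) -> bool:
        # An AIM with at least one SubAIM is treated as Composite here.
        return bool(self.sub_aims)

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "Aim"]]:
        # Depth-first traversal yielding (nesting level, AIM).
        yield depth, self
        for sub in self.sub_aims:
            yield from sub.walk(depth + 1)


# Assumption: the AIM/2 rows are SubAIMs of Personal Status Extraction.
PSE = Aim("MMC-PSE", "Personal Status Extraction", [
    Aim("MMC-ETD", "Entity Text Description"),
    Aim("MMC-ESD", "Entity Speech Description"),
    Aim("PAF-EFD", "Entity Face Description"),
    Aim("PAF-EBD", "Entity Body Description"),
    Aim("MMC-PTI", "PS-Text Interpretation"),
    Aim("MMC-PSI", "PS-Speech Interpretation"),
    Aim("PAF-PFI", "PS-Face Interpretation"),
    Aim("PAF-PGI", "PS-Gesture Interpretation"),
    Aim("MMC-PMX", "Personal Status Multiplexing"),
])

HMC_ECU = Aim("HMC-ECU", "Entity and Context Understanding", [
    Aim("OSD-SDX", "Audio-Visual Scene Demultiplexing"),
    Aim("MMC-ASR", "Automatic Speech Recognition"),
    Aim("OSD-VOI", "Visual Object Identification"),
    Aim("CAE-AOI", "Audio Object Identification"),
    Aim("MMC-NLU", "Natural Language Understanding"),
    PSE,
    Aim("MMC-TTT", "Text-to-Text Translation"),
])

if __name__ == "__main__":
    for depth, aim in HMC_ECU.walk():
        print("  " * depth + f"{aim.code}  {aim.name}")
```

Running the module prints the table as an indented outline, which makes the two Composite AIMs (HMC-ECU and, under the assumption above, MMC-PSE) easy to spot.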

5 JSON Metadata

https://schemas.mpai.community/HMC/V2.0/AIMs/EntityAndContextUnderstanding.json

6 Profiles

Entity and Context Understanding Profiles are defined.

7 Reference Software

8 Conformance Testing

Table 3 provides the Conformance Testing Method for the HMC-ECU AIM.

If a schema contains references to other schemas, conformance of data to the primary schema implies that any data referencing a secondary schema shall also validate against that schema, if present, and conform with the Qualifier, if present.

Table 3 – Conformance Testing Method for HMC-ECU AIM

Receives	Body Descriptors	Shall validate against Body Descriptors XML Schema.
	Face Descriptors	Shall validate against Face Descriptors Schema.
	Speech Object	Shall validate against Speech Object Schema. Speech Data shall conform with Speech Qualifier.
	Audio-Visual Scene Geometry	Shall validate against Audio-Visual Scene Geometry Schema.
	Visual Objects	Shall validate against Visual Object Schema. Visual Data shall conform with Visual Qualifier.
	Audio Object	Shall validate against Audio Object Schema. Audio Data shall conform with Audio Qualifier.
	Text Object	Shall validate against Text Object Schema. Text Data shall conform with Text Qualifier.
Produces	Personal Status	Shall validate against Personal Status Schema.
	Translated Text	Shall validate against Text Object Schema. Text Data shall conform with Text Qualifier.
	Refined Text	Shall validate against Text Object Schema. Text Data shall conform with Text Qualifier.
	Meaning	Shall validate against Meaning Schema.
	Visual Instance ID	Shall validate against Instance ID Schema.
	Audio-Visual Scene Geometry	Shall validate against Audio-Visual Scene Geometry Schema.
	Audio Instance ID	Shall validate against Instance ID Schema.
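
The conformance table above pairs each Data item with a schema to validate against and, for some items, an additional Qualifier check. A minimal sketch of that procedure follows. It is an illustration under assumptions, not the normative test: the rule table is transcribed by hand, `check_conformance` is a hypothetical helper, and the two callables `validate_schema` and `check_qualifier` stand in for whatever schema validator and Qualifier check an implementer actually uses.

```python
from typing import Any, Callable, Dict, List, Mapping, Tuple

# Data name -> (schema name, whether a Qualifier check also applies).
# Transcribed from the conformance table; hypothetical representation.
CONFORMANCE_RULES: Dict[str, Tuple[str, bool]] = {
    "Body Descriptors": ("Body Descriptors", False),
    "Face Descriptors": ("Face Descriptors", False),
    "Speech Object": ("Speech Object", True),
    "Audio-Visual Scene Geometry": ("Audio-Visual Scene Geometry", False),
    "Visual Objects": ("Visual Object", True),
    "Audio Object": ("Audio Object", True),
    "Text Object": ("Text Object", True),
    "Personal Status": ("Personal Status", False),
    "Translated Text": ("Text Object", True),
    "Refined Text": ("Text Object", True),
    "Meaning": ("Meaning", False),
    "Visual Instance ID": ("Instance ID", False),
    "Audio Instance ID": ("Instance ID", False),
}


def check_conformance(
    data: Mapping[str, Any],
    validate_schema: Callable[[str, Any], bool],
    check_qualifier: Callable[[str, Any], bool],
) -> List[str]:
    """Return a list of failure messages; an empty list means no failures."""
    failures: List[str] = []
    for name, value in data.items():
        rule = CONFORMANCE_RULES.get(name)
        if rule is None:
            failures.append(f"{name}: not listed in the conformance table")
            continue
        schema, has_qualifier = rule
        if not validate_schema(schema, value):
            failures.append(f"{name}: does not validate against {schema} Schema")
        if has_qualifier and not check_qualifier(name, value):
            failures.append(f"{name}: data does not conform with its Qualifier")
    return failures
```

Passing the validators in as callables keeps the sketch independent of any particular schema library; an implementer would plug in a real validator loaded with the published schemas.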

9 Performance Assessment