1     Functions of Conversation with Personal Status

2     Reference Architecture of Conversation with Personal Status

3     I/O Data of Conversation with Personal Status

4     Functions of AI Modules of Conversation with Personal Status

5     I/O Data of AI Modules of Conversation with Personal Status

6     JSON Metadata of Conversation with Personal Status

1      Functions of Conversation with Personal Status

When humans have a conversation with other humans, they use speech and, in constrained cases, text. Their interlocutors perceive the speech and/or text supplemented by visual information related to the speaker's face and gesture. Text, speech, face, and gesture may convey information about the internal state of the speaker, which MPAI calls Personal Status. Handling Personal Status information is therefore a key feature of a machine engaged in human-machine conversation – and, in the future, even machine-machine conversation – because recognising the speaker's Personal Status can improve the machine's understanding of the speaker's utterances and help it produce better replies.

Conversation with Personal Status (MMC-CPS) is a general Use Case in which an Entity – a real or digital human – converses with, or puts questions to, a machine. The machine captures and understands the Speech, extracts the Personal Status conveyed by the Text, Speech, Face, and Gesture Factors, and fuses the Factors into an estimated Personal Status of the Entity to achieve a better understanding of the context in which the Entity utters the Speech.
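
As an illustration of the data involved, the following Python sketch models a Personal Status estimated per Factor and then fused into a single Entity-level estimate. The class names, the fields (emotion, attitude, confidence), and the naive fusion rule are assumptions made for the example, not the normative MPAI data format.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# The four Factors from which Personal Status is estimated in this Use Case.
FACTORS = ("Text", "Speech", "Face", "Gesture")

@dataclass
class FactorStatus:
    """Personal Status estimated from a single Factor (illustrative fields)."""
    emotion: Optional[str] = None   # e.g. "cheerful"
    attitude: Optional[str] = None  # e.g. "polite"
    confidence: float = 0.0         # estimator confidence in [0, 1]

@dataclass
class PersonalStatus:
    """Personal Status of an Entity: one estimate per Factor, plus a fused view."""
    per_factor: Dict[str, FactorStatus] = field(default_factory=dict)

    def fuse(self) -> FactorStatus:
        """Naive fusion: keep the highest-confidence estimate across Factors."""
        return max(self.per_factor.values(),
                   key=lambda status: status.confidence,
                   default=FactorStatus())
```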

2      Reference Architecture of Conversation with Personal Status

Figure 1 gives the Conversation with Personal Status Reference Model including the input/output data, the AIMs, and the data exchanged between and among the AIMs.

Figure 1 – Reference Model of Conversation with Personal Status

The operation of the Conversation with Personal Status Use Case develops as follows:

  1. Input Selector is used to inform the machine whether the human employs Text or Speech in conversation with the machine.
  2. Visual Scene Description extracts the Scene Geometry, the Visual Objects and the Face and Body Descriptors of humans in the Scene.
  3. Audio Scene Description extracts the Scene Geometry, and the Speech Objects in the Scene.
  4. Visual Object Identification assigns an Identifier to each Visual Object indicated by a human.
  5. Audio-Visual Alignment uses the Audio Scene Description and Visual Scene Description to assign unique Identifiers to Audio, Visual, and Audio-Visual Objects.
  6. Automatic Speech Recognition recognises Speech utterances.
  7. Natural Language Understanding refines Text and extracts Meaning.
  8. Personal Status Extraction extracts a human’s Personal Status.
  9. Entity Dialogue Processing produces the machine’s response and its Personal Status.
  10. Personal Status Display produces a speaking Avatar expressing Personal Status.
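
Read as a processing pipeline, the ten steps above can be wired together as in the Python sketch below. This is a non-normative illustration: the aims registry holds one callable per AIM supplied by the application, and the function names, argument lists, and string-valued Input Selector are assumptions made for the example.

```python
from typing import Any, Callable, Dict

AIMRegistry = Dict[str, Callable[..., Any]]  # one callable per AIM, supplied by the caller

def conversation_with_personal_status(aims: AIMRegistry,
                                      input_text: str,
                                      input_speech: bytes,
                                      input_visual: bytes,
                                      input_selector: str) -> Any:
    """Illustrative end-to-end flow of the Use Case (all names are placeholders)."""
    # 2-3. Describe the visual and audio scenes.
    visual_scene = aims["VisualSceneDescription"](input_visual)
    audio_scene = aims["AudioSceneDescription"](input_speech)

    # 4-5. Identify indicated Visual Objects and align Audio and Visual Objects.
    object_ids = aims["VisualObjectIdentification"](visual_scene)
    av_objects = aims["AudioVisualAlignment"](audio_scene, visual_scene)  # scene context, unused below

    # 1, 6-7. Take typed Text or recognise Speech, then refine it and extract Meaning.
    if input_selector == "Speech":
        recognised_text = aims["AutomaticSpeechRecognition"](input_speech)
    else:
        recognised_text = input_text
    refined_text, meaning = aims["NaturalLanguageUnderstanding"](recognised_text, object_ids)

    # 8. Estimate the human's Personal Status from the available Factors
    #    (visual_scene is assumed to carry the Face and Body Descriptors).
    personal_status = aims["PersonalStatusExtraction"](refined_text, meaning,
                                                       input_speech, visual_scene)

    # 9. Produce the machine's reply Text and the machine's own Personal Status.
    machine_text, machine_status = aims["EntityDialogueProcessing"](refined_text,
                                                                    personal_status)

    # 10. Render the reply as a speaking Avatar: the Machine Portable Avatar.
    return aims["PersonalStatusDisplay"](machine_text, machine_status)
```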

3      I/O Data of Conversation with Personal Status

Table 1 gives the input and output data of the Conversation with Personal Status Use Case:

Table 1 – I/O Data of Conversation with Personal Status

Input Data
Input Text: Text typed by the human as an additional information stream or as a replacement of Speech.
Input Speech: Speech of the human having a conversation with the machine.
Input Visual: Visual information on the Face and Body of the human having a conversation with the machine.
Input Selector: Data determining whether the human uses Speech or Text.
Output Data
Machine Portable Avatar: The representation of the machine's response.
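
For concreteness, the I/O Data of Table 1 could be carried in structures such as the ones sketched below, with the Input Selector modelled as a two-valued flag. The field names and types are illustrative assumptions, not part of the specification.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class InputSelector(Enum):
    """Tells the machine whether the human converses by Text or by Speech."""
    TEXT = "Text"
    SPEECH = "Speech"

@dataclass
class CPSInput:
    """Input data of the Use Case."""
    selector: InputSelector
    text: Optional[str] = None      # Input Text (used when selector is TEXT)
    speech: Optional[bytes] = None  # Input Speech, e.g. PCM samples
    visual: Optional[bytes] = None  # Input Visual, e.g. encoded video frames

@dataclass
class CPSOutput:
    """Output data of the Use Case."""
    avatar: bytes  # the Machine Portable Avatar representing the machine's response
```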

4      Functions of AI Modules of Conversation with Personal Status

Table 2 provides the functions of the AI Modules of the Conversation with Personal Status Use Case.

Table 2 – Functions of AI Modules of Conversation with Personal Status

AIM Function
Visual Scene Description:
  1. Receives Input Visual.
  2. Provides Visual Objects and Visual Scene Geometry.
Audio Scene Description:
  1. Receives Input Audio.
  2. Provides Speech Objects and Audio Scene Geometry.
Visual Object Identification:
  1. Receives Visual Scene Geometry, Body Descriptors, and Visual Objects.
  2. Provides Visual Object Instance IDs.
Automatic Speech Recognition:
  1. Receives Input Speech.
  2. Extracts Recognised Text.
Natural Language Understanding:
  1. Receives Recognised Text.
  2. Refines Text and extracts Meaning.
Personal Status Extraction:
  1. Receives Meaning, Refined Text, Body Descriptors, and Face Descriptors.
  2. Extracts Personal Status.
Entity Dialogue Processing:
  1. Receives Refined Text and Personal Status.
  2. Produces the machine's Text and Personal Status.
Personal Status Display:
  1. Receives Machine Text and Machine Personal Status.
  2. Synthesises the Machine Portable Avatar.
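
Each row of Table 2 follows the same pattern: the AIM receives some data items and provides others. An implementation could capture this with a common module interface, as in the sketch below; the Protocol name, the process method, and the dictionary keyed by data-item name are assumptions made for illustration only.

```python
from typing import Any, Dict, Protocol

class AIModule(Protocol):
    """Common call shape shared by the AIMs of Table 2 (illustrative)."""

    def process(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        """Map the data the AIM Receives to the data it Provides."""
        ...

class AutomaticSpeechRecognition:
    """Example AIM: Receives Input Speech, Extracts Recognised Text."""

    def process(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        speech = inputs["Input Speech"]
        return {"Recognised Text": self._recognise(speech)}

    def _recognise(self, speech: bytes) -> str:
        return ""  # stub: a real implementation would run a speech recogniser here
```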

5      I/O Data of AI Modules of Conversation with Personal Status

Table 3 provides the I/O Data of the AI Modules of the Conversation with Personal Status Use Case.

Table 3 – I/O Data of AI Modules of Conversation with Personal Status

AIM Receives Produces
Visual Scene Description:
  Receives: Input Visual.
  Produces: Face Descriptors, Body Descriptors, Visual Scene Geometry, Visual Objects.
Audio Scene Description:
  Receives: Input Audio.
  Produces: Speech Objects, Audio Scene Geometry.
Visual Object Identification:
  Receives: Body Descriptors, Visual Scene Geometry, Visual Objects.
  Produces: Visual Object ID.
Automatic Speech Recognition:
  Receives: Input Speech.
  Produces: Recognised Text.
Natural Language Understanding:
  Receives: Visual Object ID, Input Text, Recognised Text, Input Selector.
  Produces: Meaning, Refined Text.
Personal Status Extraction:
  Receives: Body Descriptors, Face Descriptors, Meaning, Input Speech.
  Produces: Input Personal Status.
Entity Dialogue Processing:
  Receives: Input Text, Refined Text, Input Personal Status, Input Selector.
  Produces: Machine Personal Status, Machine Text.
Personal Status Display:
  Receives: Machine Text, Machine Personal Status.
  Produces: Machine Portable Avatar.
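
Table 3 implicitly defines a dataflow graph: every item an AIM Receives must either be an input of the Use Case or be Produced by another AIM. The Python snippet below encodes the table and performs that simple consistency check; it is an illustrative aid, not part of the specification, and the Audio-Visual Alignment AIM is omitted because Table 3 does not list its I/O Data.

```python
# Use Case inputs of Table 1, plus Input Audio, which feeds Audio Scene Description.
EXTERNAL_INPUTS = {"Input Text", "Input Speech", "Input Visual", "Input Selector",
                   "Input Audio"}

# Table 3 encoded as AIM -> (receives, produces).
DATAFLOW = {
    "Visual Scene Description": (
        {"Input Visual"},
        {"Face Descriptors", "Body Descriptors", "Visual Scene Geometry", "Visual Objects"}),
    "Audio Scene Description": (
        {"Input Audio"},
        {"Speech Objects", "Audio Scene Geometry"}),
    "Visual Object Identification": (
        {"Body Descriptors", "Visual Scene Geometry", "Visual Objects"},
        {"Visual Object ID"}),
    "Automatic Speech Recognition": (
        {"Input Speech"},
        {"Recognised Text"}),
    "Natural Language Understanding": (
        {"Visual Object ID", "Input Text", "Recognised Text", "Input Selector"},
        {"Meaning", "Refined Text"}),
    "Personal Status Extraction": (
        {"Body Descriptors", "Face Descriptors", "Meaning", "Input Speech"},
        {"Input Personal Status"}),
    "Entity Dialogue Processing": (
        {"Input Text", "Refined Text", "Input Personal Status", "Input Selector"},
        {"Machine Personal Status", "Machine Text"}),
    "Personal Status Display": (
        {"Machine Text", "Machine Personal Status"},
        {"Machine Portable Avatar"}),
}

def unresolved_inputs() -> dict:
    """Per AIM, the received items that no AIM produces and that are not Use Case inputs."""
    produced = {item for _, outs in DATAFLOW.values() for item in outs}
    available = produced | EXTERNAL_INPUTS
    return {aim: ins - available for aim, (ins, _) in DATAFLOW.items() if ins - available}

if __name__ == "__main__":
    print(unresolved_inputs() or "All received items are accounted for.")
```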

6      JSON Metadata of Conversation with Personal Status

Table 4 gives, for each AI Module used by the Conversation with Personal Status Use Case, its acronym and the link to its JSON Metadata.

Table 4 – Acronyms and URLs of JSON Metadata

Acronym Name JSON Metadata URL
MMC-CPS Conversation with Personal Status X
OSD-VSD Visual Scene Description X
CAE-ASD Audio Scene Description X
CAE-AAT Audio Analysis Transform X
CAE-ASL Audio Source Localisation X
CAE-ASE Audio Separation and Enhancement X
CAE-AST Audio Synthesis Transform X
CAE-AMX Audio Descriptor Multiplexing X
OSD-VOI Visual Object Identification X
OSD-VDI Visual Direction Identification X
OSD-VOE Visual Object Extraction X
OSD-VII Visual Instance Identification X
OSD-AVA Audio-Visual Alignment X
MMC-ASR Automatic Speech Recognition X
MMC-NLU Natural Language Understanding X
MMC-PSE Personal Status Extraction X
MMC-ITD Input Text Description X
MMC-ISD Input Speech Description X
PAF-IFD Input Face Description X
PAF-IBD Input Body Description X
MMC-PTI PS-Text Interpretation X
MMC-PSI PS-Speech Interpretation X
PAF-PFI PS-Face Interpretation X
PAF-PGI PS-Gesture Interpretation X
MMC-PMX Personal Status Multiplexing X
MMC-EDP Entity Dialogue Processing X
PAF-PSD Personal Status Display X
MMC-TTS Text-to-Speech X
PAF-IFD Input Face Description X
PAF-IBD Input Body Description X
PAF-PMX Portable Avatar Multiplexing X
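
The URL of each AIM's JSON Metadata (marked X above) can be resolved and inspected programmatically. The Python sketch below shows one way to fetch and parse such a file; the URL is a parameter because the actual links are not reproduced here, and nothing is assumed about the metadata schema beyond it being a JSON object.

```python
import json
import urllib.request

def load_aim_metadata(url: str) -> dict:
    """Fetch and parse the JSON Metadata of an AIM from its published URL."""
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))

if __name__ == "__main__":
    # Replace the placeholder with the URL listed in Table 4 for the AIM of interest.
    metadata = load_aim_metadata("https://example.org/placeholder-for-MMC-CPS.json")
    print(sorted(metadata))  # list the top-level fields of the metadata
```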