Personal Status Extraction (PSE) is a Composite AIM that extracts the Cognitive State, Emotion, and Social Attitude, called Factors, conveyed by each of Text, Speech, Face, and Gesture, called Modalities, and provides an estimate of the Personal Status, understood as a combination of Factors. The Personal Status Extraction Composite AIM is used in MPAI-MMC and other Use Cases as a replacement for the combination of AIMs depicted in Figure 1. Note that the Personal Status Data Type need not convey information on all Factors and all Modalities.

1     Scope of Personal Status Extraction

2     Reference Architecture of Personal Status Extraction

3     I/O Data of Personal Status Extraction

4     Functions of AI Modules of Personal Status Extraction

5     I/O Data of AI Modules of Personal Status Extraction

6     Specification of Personal Status Extraction AIMs and JSON Metadata

1      Scope of Personal Status Extraction

Personal Status Extraction produces the estimate of the Personal Status of a human or an avatar by analysing each Modality in three steps:

  1. Data Capture (e.g., characters and words, a digitised speech segment, the digital video containing the hand of a person, etc.).
  2. Descriptor Extraction (e.g., pitch and intonation of the speech segment, thumb of the hand raised, the right eye winking, etc.).
  3. Personal Status Interpretation (i.e., one of Emotion, Cognitive State, and Social Attitude).

An implementation may combine two or more of the AIMs implementing the steps.
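The three steps, and the option of combining the AIMs that implement them, can be illustrated for the Text Modality with a minimal Python sketch. Everything in it is hypothetical: the function names, the toy Descriptors, and the labels "excited" and "puzzled" are illustrative assumptions, not part of the specification, and real AIMs would typically be trained models.

```python
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def combine(first: Callable[[A], B], second: Callable[[B], C]) -> Callable[[A], C]:
    """Fuse two consecutive steps into one callable, hiding the intermediate data."""
    def fused(data: A) -> C:
        return second(first(data))
    return fused


def describe_text(text: str) -> dict:
    """Step 2 (Descriptor Extraction): toy Descriptors of a captured Text (assumption)."""
    return {"exclamations": text.count("!"), "questions": text.count("?")}


def interpret_text(descriptors: dict) -> dict:
    """Step 3 (Interpretation): toy mapping from Descriptors to Factors (assumption)."""
    status = {}
    if descriptors["exclamations"] > 0:
        status["Emotion"] = "excited"
    if descriptors["questions"] > 0:
        status["Cognitive State"] = "puzzled"
    return status


# Step 1 (Data Capture) is represented by the captured string itself.
captured = "Is this really the result?!"

# Two separate AIMs, with the Descriptors exposed between them ...
separate = interpret_text(describe_text(captured))

# ... or one combined AIM that does not expose the Descriptors.
text_to_status = combine(describe_text, interpret_text)
assert text_to_status(captured) == separate
```

Both arrangements yield the same estimate; the only difference is whether the intermediate Descriptors are visible outside the implementation.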

2      Reference Architecture of Personal Status Extraction

Figure 1 depicts the Personal Status extraction process:

  1. Descriptors are extracted from Text, Speech, Face Object, and Body Object. Alternatively, an upstream AI Module can provide the Descriptors. For each Modality, the value of the corresponding Input Selector tells PSE whether the Modality or its Descriptors are used.
  2. Descriptors are interpreted and the specific indicators of the Personal Status in the Text, Speech, Face, and Gesture Modalities are derived.
  3. Personal Status is obtained by combining the Personal Status estimates of the different Modalities.

Figure 1 – Reference Model of Personal Status Extraction
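The last step of the process, combining per-Modality estimates into one Personal Status, can be sketched as follows. This is only an illustration under stated assumptions: the `PersonalStatus` class, the `multiplex` function, and the representation of an estimate as a Factor-to-label dictionary are hypothetical, and the labels used are placeholders. The sketch simply collects the available estimates, which reflects that a Personal Status need not cover all Factors and all Modalities; it does not claim how an actual implementation weighs or fuses them.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

FACTORS = ("Cognitive State", "Emotion", "Social Attitude")
MODALITIES = ("Text", "Speech", "Face", "Gesture")

# One per-Modality estimate: Factor name -> label; missing Factors are simply absent.
ModalityStatus = Dict[str, str]


@dataclass
class PersonalStatus:
    """Hypothetical container: Modality -> (Factor -> label)."""
    estimates: Dict[str, ModalityStatus] = field(default_factory=dict)

    def labels(self, factor: str) -> Dict[str, str]:
        """Return the label that each available Modality assigns to one Factor."""
        return {m: s[factor] for m, s in self.estimates.items() if factor in s}


def multiplex(
    ps_text: Optional[ModalityStatus] = None,
    ps_speech: Optional[ModalityStatus] = None,
    ps_face: Optional[ModalityStatus] = None,
    ps_gesture: Optional[ModalityStatus] = None,
) -> PersonalStatus:
    """Collect the per-Modality estimates that are present into one Personal Status."""
    inputs = dict(zip(MODALITIES, (ps_text, ps_speech, ps_face, ps_gesture)))
    estimates: Dict[str, ModalityStatus] = {}
    for modality, status in inputs.items():
        if not status:
            continue  # this Modality conveys no information
        unknown = set(status) - set(FACTORS)
        if unknown:
            raise ValueError(f"unknown Factors for {modality}: {sorted(unknown)}")
        estimates[modality] = dict(status)
    return PersonalStatus(estimates)


ps = multiplex(
    ps_text={"Emotion": "happy"},
    ps_face={"Emotion": "happy", "Cognitive State": "attentive"},
)
print(ps.labels("Emotion"))  # {'Text': 'happy', 'Face': 'happy'}
```

Keeping the estimates separate per Modality leaves any fusion policy to the downstream AIM that consumes the Personal Status.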

Note that:

  1. A Modality can be input into the Personal Status Extraction Composite AIM either as the Modality itself or as its Descriptors. In both cases the Descriptors have the same syntax and semantics. Text Descriptors are equivalent to Meaning. Input Body Description extracts Gesture Descriptors from Body Object. In the future, other Descriptors may be extracted from Body Object.
  2. An Implementation can combine, e.g., the Input Body Description and PS-Gesture Interpretation AIMs into one AIM, and directly provide PS-Gesture from a Body Object without exposing Gesture Descriptors.
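The role of an Input Selector can be sketched as a small routing function. The `Selector` enumeration, its values, and the `resolve_descriptors` helper are hypothetical names introduced for illustration; the specification text here does not state how a selector is encoded. The sketch assumes only what the notes above say: a Modality arrives either as raw data or as Descriptors, and both paths yield Descriptors of the same form.

```python
from enum import Enum
from typing import Any, Callable, Optional


class Selector(Enum):
    """Hypothetical encoding of an Input Selector."""
    MODALITY = "modality"        # run the Description AIM inside PSE
    DESCRIPTORS = "descriptors"  # use Descriptors supplied by an upstream AIM


def resolve_descriptors(
    selector: Selector,
    modality_data: Optional[Any],
    upstream_descriptors: Optional[Any],
    describe: Callable[[Any], Any],
) -> Any:
    """Return the Descriptors for one Modality according to its selector."""
    if selector is Selector.DESCRIPTORS:
        if upstream_descriptors is None:
            raise ValueError("selector asks for Descriptors but none were supplied")
        return upstream_descriptors
    if modality_data is None:
        raise ValueError("selector asks for the Modality but no data was supplied")
    return describe(modality_data)


def toy_describe(body_object: list) -> dict:
    """Toy stand-in for a Description AIM working on a Body Object (assumption)."""
    return {"thumb_raised": "thumb_up" in body_object}


from_body = resolve_descriptors(Selector.MODALITY, ["thumb_up"], None, toy_describe)
from_upstream = resolve_descriptors(
    Selector.DESCRIPTORS, None, {"thumb_raised": True}, toy_describe
)
assert from_body == from_upstream
```

Because both paths return the same kind of Descriptors, the Interpretation AIM downstream does not need to know which path was taken.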

3      I/O Data of Personal Status Extraction

Table 1 gives the input/output data of Personal Status Extraction.

Table 1 – I/O data of Personal Status Extraction

| Input data | From | Description |
|---|---|---|
| Input Text Selector | An external signal | Text/Descriptors Selector |
| Text | Keyboard or AIM | Text or Recognised Text |
| Text Descriptors | An upstream AIM | Descriptors of Text |
| Input Speech Selector | An external signal | Speech/Descriptors Selector |
| Speech | Microphone | Speech of the human |
| Speech Descriptors | An upstream AIM | Descriptors of Speech |
| Input Face Selector | An external signal | Face/Descriptors Selector |
| Face Object | Visual Scene Description | The face of the human |
| Face Descriptors | An upstream AIM | Descriptors of Face |
| Input Gesture Selector | An external signal | Body/Descriptors Selector |
| Body Object | Visual Scene Description | The body of the human |
| Gesture Descriptors | An upstream AIM | Descriptors of Gesture |
| Output data | To | Description |
| Personal Status | A downstream AIM | For further processing |

4      Functions of AI Modules of Personal Status Extraction

Table 2 gives the functions of the AIMs.

Table 2 – AI Modules of Personal Status Extraction

| AIM | Function |
|---|---|
| Input Text Description | Extracts the Descriptors of Text. |
| Input Speech Description | Extracts the Descriptors of Speech. |
| Input Face Description | Extracts the Descriptors of Face. |
| Input Body Description | Extracts the Descriptors of Body. |
| PS-Text Interpretation | Interprets the Personal Status Descriptors of Text. |
| PS-Speech Interpretation | Interprets the Personal Status Descriptors of Speech. |
| PS-Face Interpretation | Interprets the Personal Status Descriptors of Face. |
| PS-Gesture Interpretation | Interprets the Personal Status Descriptors of Body. |
| Personal Status Multiplexing | Produces the Personal Status. |

5      I/O Data of AI Modules of Personal Status Extraction

Table 3 gives the input and output data of the AIMs.

Table 3 – I/O Data of AI Modules of Personal Status Extraction

| AIM | Receives | Produces |
|---|---|---|
| Input Text Description | Text | Text Descriptors |
| Input Speech Description | Speech | Speech Descriptors |
| Input Face Description | Face Object | Face Descriptors |
| Input Body Description | Body Object | Gesture Descriptors |
| PS-Text Interpretation | Text Descriptors | PS-Text |
| PS-Speech Interpretation | Speech Descriptors | PS-Speech |
| PS-Face Interpretation | Face Descriptors | PS-Face |
| PS-Gesture Interpretation | Gesture Descriptors | PS-Gesture |
| Personal Status Multiplexing | PS-Text, PS-Speech, PS-Face, PS-Gesture | Personal Status |
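The I/O data of the AIMs define a dataflow graph, and a short sketch can check that the graph is connected as intended. The dictionary below transcribes the AIM inputs and outputs, using the name PS-Face Interpretation for the AIM that consumes Face Descriptors; the helper functions `execution_order` and `external_inputs` are hypothetical and rely on Python's standard `graphlib` module (Python 3.9 or later). This is one possible way to reason about the wiring, not a prescribed execution model.

```python
from graphlib import TopologicalSorter

# AIM name -> (data received, data produced)
AIMS = {
    "Input Text Description": (["Text"], ["Text Descriptors"]),
    "Input Speech Description": (["Speech"], ["Speech Descriptors"]),
    "Input Face Description": (["Face Object"], ["Face Descriptors"]),
    "Input Body Description": (["Body Object"], ["Gesture Descriptors"]),
    "PS-Text Interpretation": (["Text Descriptors"], ["PS-Text"]),
    "PS-Speech Interpretation": (["Speech Descriptors"], ["PS-Speech"]),
    "PS-Face Interpretation": (["Face Descriptors"], ["PS-Face"]),
    "PS-Gesture Interpretation": (["Gesture Descriptors"], ["PS-Gesture"]),
    "Personal Status Multiplexing": (
        ["PS-Text", "PS-Speech", "PS-Face", "PS-Gesture"],
        ["Personal Status"],
    ),
}


def execution_order(aims: dict) -> list:
    """Order the AIMs so that each one runs after the AIMs producing its inputs."""
    producer = {out: name for name, (_, outs) in aims.items() for out in outs}
    # Data with no producer among the AIMs come from outside the Composite AIM.
    deps = {
        name: {producer[i] for i in ins if i in producer}
        for name, (ins, _) in aims.items()
    }
    return list(TopologicalSorter(deps).static_order())


def external_inputs(aims: dict) -> list:
    """List the data that no AIM in the table produces."""
    produced = {out for _, outs in aims.values() for out in outs}
    received = {i for ins, _ in aims.values() for i in ins}
    return sorted(received - produced)


order = execution_order(AIMS)
print(order[-1])              # Personal Status Multiplexing
print(external_inputs(AIMS))  # ['Body Object', 'Face Object', 'Speech', 'Text']
```

The external inputs found this way match the Modalities that enter the Composite AIM; the Input Selectors and upstream Descriptors are deliberately left out of this sketch.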

6      Specification of Personal Status Extraction AIMs and JSON Metadata

Table 4 – AIMs and JSON Metadata

| Acronym | AIM | JSON |
|---|---|---|
| MMC-PSE | Personal Status Extraction | X |
| MMC-ITD | Input Text Description | X |
| MMC-ISD | Input Speech Description | X |
| PAF-IFD | Input Face Description | X |
| PAF-IBD | Input Body Description | X |
| MMC-PTI | PS-Text Interpretation | X |
| MMC-PSI | PS-Speech Interpretation | X |
| PAF-PFI | PS-Face Interpretation | X |
| PAF-PGI | PS-Gesture Interpretation | X |
| MMC-PMX | Personal Status Multiplexing | X |
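The acronyms in Table 4 can be kept in a simple lookup structure, for example to group the AIMs by acronym prefix. The sketch below is an illustration only: the `AIM_REGISTRY` dictionary and the `by_prefix` helper are hypothetical, and reading the prefix (MMC or PAF) as the MPAI standard that specifies the AIM is an assumption, not something Table 4 states. The JSON Metadata themselves are not reproduced here.

```python
# Acronym -> AIM name, transcribed from Table 4.
AIM_REGISTRY = {
    "MMC-PSE": "Personal Status Extraction",
    "MMC-ITD": "Input Text Description",
    "MMC-ISD": "Input Speech Description",
    "PAF-IFD": "Input Face Description",
    "PAF-IBD": "Input Body Description",
    "MMC-PTI": "PS-Text Interpretation",
    "MMC-PSI": "PS-Speech Interpretation",
    "PAF-PFI": "PS-Face Interpretation",
    "PAF-PGI": "PS-Gesture Interpretation",
    "MMC-PMX": "Personal Status Multiplexing",
}


def by_prefix(registry: dict) -> dict:
    """Group AIM names by the part of the acronym before the hyphen."""
    groups: dict = {}
    for acronym, name in registry.items():
        prefix, _, _ = acronym.partition("-")
        groups.setdefault(prefix, []).append(name)
    return groups


groups = by_prefix(AIM_REGISTRY)
print(groups["PAF"])
```

With the table as transcribed, the PAF group contains the four AIMs that deal with Face and Body, and the MMC group contains the remaining six.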