Terms beginning with a capital letter have the meaning defined in Table 1. Terms beginning with a small letter have the meaning commonly defined for the context in which they are used. For instance, Table 1 defines Object and Scene but does not define object and scene.
A dash “-” preceding a Term in Table 1 indicates the following readings according to the font:
- Normal font: the nearest preceding Term without a dash is read before the dashed Term. For example, “Avatar” and “- Model” yield “Avatar Model.”
- Italic font: the nearest preceding Term without a dash is read after the dashed Term. For example, “Avatar” and “- Portable” yield “Portable Avatar.”
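The reading rule can be illustrated with a short Python sketch. It is not part of the specification: the function name `expand_terms` and the `(term, is_italic)` input shape are assumptions introduced here, since the font of each entry has to be supplied by the caller.

```python
def expand_terms(rows):
    """Expand dash-prefixed table entries into full Terms.

    rows: list of (term, is_italic) pairs in table order.
    This input shape is hypothetical; it is not defined by the specification.
    """
    full_terms = []
    parent = None  # most recent Term without a dash
    for term, is_italic in rows:
        has_dash = term.lstrip().startswith(("-", "–"))
        name = term.lstrip("-– ").strip()
        if has_dash:
            if parent is None:
                raise ValueError(f"dash entry {term!r} has no preceding Term")
            # Italic font: the parent Term is read after the dashed Term.
            # Normal font: the parent Term is read before the dashed Term.
            full = f"{name} {parent}" if is_italic else f"{parent} {name}"
        else:
            parent = name
            full = name
        full_terms.append(full)
    return full_terms


rows = [("Avatar", False), ("- Model", False), ("- Portable", True)]
print(expand_terms(rows))  # ['Avatar', 'Avatar Model', 'Portable Avatar']
```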
The full set of Terms and Definitions relevant to all MPAI Technical Specifications, including MPAI-HMC, can be accessed online.
Table 1 – General MPAI-HMC terms
Terms | Definitions |
Attitude | |
– Social | The coded representation of the internal state related to the way a human or avatar intends to position itself vis-à-vis the Environment or subsets of it, e.g., “Respectful”, “Confrontational”, “Soothing”. |
– Spatial | Position and Orientation and their velocities and accelerations of an Object in a Real or Virtual Environment. |
Attribute | |
Audio | Digital representation of an analogue audio signal sampled at a frequency between 8 kHz and 192 kHz with a number of bits/sample between 8 and 32, and with non-linear or linear quantisation. Data with characteristics of Audio may be synthetically produced. |
Audio Block | A set of consecutive Audio samples. |
Audio Channel | A sequence of Audio Blocks. |
Avatar | An Object rendered to represent a Human or a Machine in a virtual space. |
– Model | An inanimate Avatar exposing animation interfaces. |
– Portable | A Data Type including Avatar ID, Time, Visual Environment, Spatial Attitude, Avatar Model, Body Descriptors, Face Descriptors, Language Preference, Speech Coding, Speech Data, Text, and Personal Status [8]. |
Body | A digital representation of a human body, head included, face excluded. |
Centre Point | The point of an Object selected to have Local Coordinates (0,0,0). |
Cognitive State | The coded representation of the internal state reflecting the way a human or avatar understands the Environment, such as “Confused”, “Dubious”, “Convinced”. |
Communication Item | An element generated by a Machine communicating with an Entity expressed with a Portable Avatar. |
Context | Information surrounding an Entity and providing additional insight into the information the Entity communicates. |
Coordinate System | A coordinate system where the position of a point is specified by three numbers. |
– Cartesian | A coordinate system where the three numbers are the signed distances from the point to three mutually perpendicular planes. |
– Spherical | A coordinate system where the three numbers are: (1) the radial distance of that point from a fixed origin; (2) the polar angle measured from a fixed zenith direction; (3) the azimuthal angle of its orthogonal projection on a reference plane. |
Culture | The collection of language and customs that a human or a group of humans employs to express their internal statuses. |
Data | Information in digital form. |
– Format | The standard digital representation of Data. |
– Type | An instance of Data with a specific Data Format. |
Descriptor | The Digital Representation of a feature of an Object. |
– Body | A Data Type including the digital representation of the features of the body of a real or digital human. |
– Face | A Data Type including the digital representation of a feature of the face of a real or digital human. |
– Speech | A Data Type including the digital representation of a feature of speech of a real or digital human, such as degree of vocal tension, pitch, etc. |
– Text | A Data Type including the digital representation of a feature of text. |
Digital Representation | Data corresponding to and representing a physical entity. |
Emotion | The coded representation of the internal state resulting from the interaction of a human or avatar with the Environment or subsets of it, such as “Angry”, “Sad”, “Determined”. |
Entity | A human in a real environment, a human digitally represented as a Digitised Human in a Virtual Environment, or a Virtual Human in a Virtual Environment. |
Environment | A Virtual Space that may be null or may include an Audio-Visual Scene. |
Experience | The state of an Entity whose senses/sensors are continuously affected for a meaningful period. |
Face | A digital representation of a human face. |
Factor | One of Emotion, Cognitive State, and Attitude. |
Gesture | A movement of a Digital Human or part of it, such as the head, arm, hand, and finger, often a complement to a vocal utterance. |
Human | A human being in a real space. |
– Digital | A Digitised or a Virtual Human in a Virtual Space. |
– Digitised | An Object in a Virtual Space that has the appearance of a specific human when rendered. |
– Virtual | An Object in a Virtual Space created by a computer that has a human appearance when rendered but is not a Digitised Human. |
Identifier | The label uniquely associated with a human or an Object. |
Instance | An element of a set of entities – Objects, Digital Humans etc. – belonging to some levels in a hierarchical classification (taxonomy). |
– Audio | The instance of an Audio Object. |
– Visual | The instance of a Visual Object. |
Machine | An Implementation of MPAI-MMC. |
Meaning | Information extracted from Text such as syntactic and semantic information, Personal Status, and other information, such as an Object Identifier. |
Microphone Array | A microphone system that uses multiple microphones arranged in a specific pattern to capture audio in an audio space. |
– Geometry | A Data Type representing the spatial arrangement of the microphones in a Microphone Array. |
Modality | One of Text, Speech, Face, or Gesture. |
Object | A data structure that can be rendered to cause an Experience. |
– Audio | An Object described by Audio Descriptors. |
– Audio-Visual | An Object described by Audio-Visual Descriptors. |
– Body | A digital representation of the body of a Human or a Machine. |
– Descriptor | The digital representation of the feature of an Object. |
– Digital | A Digitised or a Virtual Object. |
– Digitised | The digital representation of a real object. |
– Face | The digital representation of the face of a Human or a Machine. |
– Speech | An Object described by Speech Descriptors. |
– Text | A string of Text. |
– Virtual | An Object not representing an object in the real environment. |
– Visual | An Object described by Visual Descriptors. |
Orientation | The 3 Euler angles of an Object in a Virtual Space. |
Personal Status | A Data Type including three Factors – Cognitive State, Emotion, and Social Attitude – conveyed by four Modalities – Text, Speech, Face, and Gesture – and providing standard extensible labels for the three Factors [6]. |
– Face | The Cognitive State, Emotion, and Social Attitude conveyed by a Face Object. |
– Gesture | The Cognitive State, Emotion, and Social Attitude conveyed by the Gesture of a Body Object. |
– Speech | The Cognitive State, Emotion, and Social Attitude conveyed by a Speech Object. |
– Text | The Cognitive State, Emotion, and Social Attitude conveyed by a Text Object. |
Portable Avatar | A Data Type representing an Avatar and its Context. |
Position | The coordinates of a representative point for an object in a Virtual Space with respect to a set of coordinate axes. |
Principal Axis | The x axis of an Object. |
Rendering | The process of instantiating a Virtual Space as a human-perceptible entity. |
Scene | A composition of Objects located according to a Scene Geometry. |
– Audio | A Scene composed of Audio Objects. |
– Audio-Visual | A Scene composed of Audio Objects, Visual Objects and co-located Audio-Visual Objects. |
– Multichannel | A data structure containing at least 2 time-aligned interleaved Audio Channels. |
– Visual | A Scene composed of Visual Objects. |
Scene Descriptors | The digital representation of a feature of a scene. |
– Audio | A Data Type including the digital representation of the audio features of a real or digital scene. |
– Audio-Visual | A Data Type combining the Audio and Visual Scene Descriptors. |
– Visual | A Data Type including the digital representation of the visual features of a real or digital scene. |
Scene Geometry | The digital representation of the Object arrangement of a Scene. |
– Audio | A Data Type describing the spatial arrangement of the Audio Objects of a Scene. |
– Audio-Visual | A Data Type describing the spatial arrangement of the Audio, Visual, and Audio-Visual Objects of a Scene. |
– Visual | A Data Type describing the spatial arrangement of the Visual Objects of a Scene. |
Selector | Input Data having the goal to set a parameter (e.g., use of Text vs Speech or Language Preference) or an operating mode of a Machine. |
Speech | Digital representation of analogue speech sampled at a frequency between 8 kHz and 96 kHz with a number of bits/sample of 8, 16 or 24, and with non-linear or linear quantisation, or compressed. Data with characteristics of Speech may be synthetically produced. |
Text | A sequence of characters represented according to [12]. |
– Recognised | The Text at the output of an Automatic Speech Recognition AIM. |
– Refined | The Text at the output of a Natural Language Understanding AIM. |
– Translated | The Text at the output of a Natural Language Translation AIM. |
Virtual Space | A space generated and maintained by a computing platform that can be rendered. |
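The Cartesian and Spherical Coordinate Systems defined in Table 1 can be related by a short Python sketch. Table 1 does not fix the axis conventions, so the following are assumptions made only for illustration: the zenith direction is the positive z axis, the reference plane is the x-y plane, the azimuthal angle is measured from the positive x axis, and angles are in radians. The function names are hypothetical.

```python
import math


def spherical_to_cartesian(r, polar, azimuth):
    """Convert (radial distance, polar angle, azimuthal angle) to (x, y, z).

    Assumed convention (not stated in Table 1): zenith = +z,
    reference plane = x-y plane, azimuth measured from +x, radians.
    """
    x = r * math.sin(polar) * math.cos(azimuth)
    y = r * math.sin(polar) * math.sin(azimuth)
    z = r * math.cos(polar)
    return x, y, z


def cartesian_to_spherical(x, y, z):
    """Inverse conversion under the same assumed convention."""
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        # Angles are undefined at the origin; zero is returned by choice here.
        return 0.0, 0.0, 0.0
    # Clamp to guard against rounding slightly outside [-1, 1].
    polar = math.acos(max(-1.0, min(1.0, z / r)))
    azimuth = math.atan2(y, x)
    return r, polar, azimuth


# A point at unit distance, on the reference plane, along the assumed x axis.
print(spherical_to_cartesian(1.0, math.pi / 2, 0.0))
```

Under these assumptions the x and z axes play distinct roles, so an implementation that also uses the Principal Axis definition would need to state how its Object axes map onto this convention.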
The Terms used in this standard whose first letter is capital and are not already included in Table 1 are defined in Table 2.
Term | Definition |
Access | Static or slowly changing data that are required by an application such as domain knowledge data, data models, etc. |
AI Framework (AIF) | The environment where AIWs are executed. |
AI Module (AIM) | A data processing element receiving AIM-specific Inputs and producing AIM-specific Outputs according to its Function. An AIM may be an aggregation of AIMs. |
– Attribute | An input Data or an output Data or a functionality, such as the ability to translate or retain memory of past operations. |
– Basic | An AIM that does not aggregate other AIMs. |
– Composite | An AIM aggregating more than one AIM. |
– Profile | The label that uniquely identifies a set of Attributes of an AIM. |
AI Workflow (AIW) | A structured aggregation of AIMs implementing a Use Case receiving AIW-specific inputs and producing AIW-specific outputs according to the AIW Function. |
Application Standard | An MPAI Standard designed to enable a particular application domain. |
Channel | A connection between an output port of an AIM and an input port of an AIM. The term “Connection” is used as a synonym. |
Communication | The infrastructure that implements message passing between AIMs. |
Component | One of the 7 AIF elements: Access, Communication, Controller, Internal Storage, Global Storage, Store, and User Agent. |
Composite AIM | An AIM aggregating more than one AIM. |
Conformance | The attribute of an Implementation of being a correct technical Implementation of a Technical Specification. |
– Testing | The normative document specifying the Means to Test the Conformance of an Implementation. |
– Testing Means | Procedures, tools, data sets and/or data set characteristics to Test the Conformance of an Implementation. |
Connection | A channel connecting an output port of an AIM and an input port of an AIM. |
Controller | A Component that manages and controls the AIMs in the AIF, so that they execute in the correct order and at the time when they are needed. |
Data | Information in digital form. |
– Format | The standard digital representation of Data. |
– Type | An instance of Data with a specific Data Format. |
– Semantics | The meaning of Data. |
Descriptor | Coded representation of a text, audio, speech, or visual feature. |
Digital Representation | Data corresponding to and representing a physical entity. |
Ecosystem | The ensemble of actors making it possible for a User to execute an application composed of an AIF, one or more AIWs, each with one or more AIMs potentially sourced from independent implementers. |
Explainability | The ability to trace the output of an Implementation back to the inputs that have produced it. |
Fairness | The attribute of an Implementation whose extent of applicability can be assessed by making the training set and/or network open to testing for bias and unanticipated results. |
Format | |
– Data | |
– File | |
– Stream | |
Function | The operations effected by an AIW or an AIM on input data. |
Global Storage | A Component to store data shared by AIMs. |
AIM/AIW Storage | A Component to store data of the individual AIMs. |
Identifier | A name that uniquely identifies an Implementation. |
Implementation | 1. An embodiment of the MPAI-AIF Technical Specification, or 2. An AIW or AIM of a particular Level (1-2-3) conforming with a Use Case of an MPAI Application Standard. |
Implementer | A legal entity implementing MPAI Technical Specifications. |
ImplementerID (IID) | A unique name assigned by the ImplementerID Registration Authority to an Implementer. |
ImplementerID Registration Authority (IIDRA) | The entity appointed by MPAI to assign ImplementerIDs to Implementers. |
Instance ID | Instance of a class of Objects and the Group of Objects the Instance belongs to. |
Interoperability | The ability to functionally replace an AIM with another AIM having the same Interoperability Level. |
– Level | The attribute of an AIW and its AIMs to be executable in an AIF Implementation and to:
1. Be proprietary (Level 1) |
Knowledge Base | Structured and/or unstructured information made accessible to AIMs via MPAI-specified interfaces. |
Message | A sequence of Records transported by Communication through Channels. |
Normativity | The set of attributes of a technology or a set of technologies specified by the applicable parts of an MPAI standard. |
Performance | The attribute of an Implementation of being Reliable, Robust, Fair and Replicable. |
– Assessment | The normative document specifying the Means to Assess the Grade of Performance of an Implementation. |
– Assessment Means | Procedures, tools, data sets and/or data set characteristics to Assess the Performance of an Implementation. |
– Assessor | An entity Assessing the Performance of an Implementation. |
Profile | A particular subset of the technologies used in MPAI-AIF, AIW, or AIM and, where applicable, the classes, other subsets, options and parameters relevant to that subset. |
Qualifier | |
– Attribute | |
– Format | |
– Sub-Type | |
Record | A data structure with a specified structure. |
Reference Model | The AIMs and their Connections in an AIW. |
Reference Software | A technically correct software implementation of a Technical Specification containing source code, or source and compiled code. |
Reliability | The attribute of an Implementation that performs as specified by the Application Standard, profile, and version the Implementation refers to, e.g., within the application scope, stated limitations, and for the period of time specified by the Implementer. |
Replicability | The attribute of an Implementation whose Performance, as Assessed by a Performance Assessor, can be replicated, within an agreed level, by another Performance Assessor. |
Robustness | The attribute of an Implementation that copes with data outside of the stated application scope with an estimated degree of confidence. |
Scope | The domain of applicability of an MPAI Application Standard. |
Service Provider | An entrepreneur who offers an Implementation as a service (e.g., a recommendation service) to Users. |
Standard | The set of Technical Specification, Reference Software, Conformance Testing, Performance Assessment, and Technical Report of an MPAI Application Standard. |
Technical Specification | The normative specification of the set of AIWs belonging to an application domain along with the AIMs required to Implement the AIWs that includes:
1. The formats of the Input/Output data of the AIWs implementing the AIWs. |
Testing Laboratory | A laboratory accredited to Assess the Grade of Performance of Implementations. |
Time Base | The protocol specifying how Components can access timing information. |
Topology | The set of AIM Connections of an AIW. |
Use Case | A particular instance of the Application domain target of an Application Standard. |
User | A user of an Implementation. |
User Agent | The Component interfacing the user with an AIF through the Controller. |
Version | A revision or extension of a Standard or of one of its elements. |
Zero Trust | A cybersecurity model primarily focused on data and service protection that assumes no implicit trust. |
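The relationship between AIM, Channel, Topology, and AIW in Table 2 can be pictured with a minimal Python sketch. It is an illustration only, not an MPAI-defined API: every class, field, method, and port name below is hypothetical, and the two example AIMs are loosely based on the Automatic Speech Recognition and Natural Language Understanding AIMs mentioned in Table 1.

```python
from dataclasses import dataclass, field


@dataclass
class AIM:
    """A processing element with named input and output ports (hypothetical model)."""
    name: str
    inputs: tuple = ()
    outputs: tuple = ()
    sub_aims: tuple = ()  # AIMs aggregated by this AIM, if any

    @property
    def is_composite(self):
        # Table 2: a Composite AIM aggregates more than one AIM.
        return len(self.sub_aims) > 1


@dataclass(frozen=True)
class Channel:
    """A connection from an output port of one AIM to an input port of another."""
    src_aim: str
    src_port: str
    dst_aim: str
    dst_port: str


@dataclass
class AIW:
    """A structured aggregation of AIMs; its Topology is the set of Channels."""
    name: str
    aims: dict = field(default_factory=dict)
    topology: list = field(default_factory=list)

    def add_aim(self, aim):
        self.aims[aim.name] = aim

    def connect(self, src_aim, src_port, dst_aim, dst_port):
        # Reject Channels that reference unknown AIMs or unknown ports.
        if src_aim not in self.aims or dst_aim not in self.aims:
            raise ValueError("both AIMs must belong to the AIW")
        if src_port not in self.aims[src_aim].outputs:
            raise ValueError(f"{src_aim} has no output port {src_port!r}")
        if dst_port not in self.aims[dst_aim].inputs:
            raise ValueError(f"{dst_aim} has no input port {dst_port!r}")
        self.topology.append(Channel(src_aim, src_port, dst_aim, dst_port))


# Two Basic AIMs joined by one Channel (names and ports are illustrative).
asr = AIM("ASR", inputs=("Speech",), outputs=("Recognised Text",))
nlu = AIM("NLU", inputs=("Recognised Text",), outputs=("Refined Text",))
aiw = AIW("Example Workflow")
aiw.add_aim(asr)
aiw.add_aim(nlu)
aiw.connect("ASR", "Recognised Text", "NLU", "Recognised Text")
print(len(aiw.topology))  # 1
```

Storing the Topology as an explicit list of Channels keeps the check in `connect` local to one connection; a fuller model would also have to cover the Controller, Messages, and Records, which this sketch leaves out.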