The Reference Model underpinning Technical Specification: Human and Machine Communication (MPAI-HMC) is depicted in Figure 1. It is based on the AI Framework (AIF) standardised by Technical Specification: AI Framework (MPAI-AIF) and is implemented by an AI Workflow (AIW) that includes AI Modules (AIM). Three out of six AIMs in Figure 1 (Audio-Visual Scene Description, Entity Context Understanding, and Personal Status Display) are Composite, i.e., they include interconnected AIMs. Basic information on MPAI-AIF is provided here.
Figure 1 – Human-Machine Communication AIW
- Words beginning with a capital letter are defined in Definitions; words beginning with a lowercase letter have their common meaning.
- Input Selector enables the Entity to inform the Machine, through the Entity Context Understanding AIM, about the use of Text vs. Speech, Language Preferences, and Selected Language in translation.
- Input Text, Input Speech, and Input Visual convey the information emitted by the Entity and its Context as captured by the Machine.
- Input Portable Avatar is the Communication Item emitted by a communicating Machine.
- Audio-Visual Scene Descriptors produced by the Audio-Visual Scene Description and Audio-Visual Scene Integration and Description AIMs are digital representations of a real audio-visual scene or a Virtual Audio-Visual Scene.
- For convenience, each AIM is labelled by the three letters identifying the Technical Specification that specifies it, followed by a hyphen “-”, followed by three letters uniquely identifying the AIM within that Technical Specification. For instance, Portable Avatar Demultiplexing is indicated as PAF-PDX, where PAF refers to Technical Specification: Portable Avatar Format (MPAI-PAF) and PDX refers to the Portable Avatar Demultiplexing AIM specified by MPAI-PAF.
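
The AIM labelling convention can be illustrated with a minimal Python sketch that splits a label such as PAF-PDX into its two parts. The names `AimLabel` and `parse_aim_label` are hypothetical and not part of any MPAI specification, and the sketch assumes that both parts consist of exactly three uppercase ASCII letters, which the text above suggests but does not state explicitly.

```python
import re
from typing import NamedTuple


class AimLabel(NamedTuple):
    """Hypothetical container for the two parts of an AIM label."""
    spec: str  # three letters identifying the Technical Specification, e.g. "PAF"
    aim: str   # three letters identifying the AIM within it, e.g. "PDX"


# Assumption: three uppercase ASCII letters, a hyphen, three uppercase ASCII letters.
_LABEL_PATTERN = re.compile(r"([A-Z]{3})-([A-Z]{3})")


def parse_aim_label(label: str) -> AimLabel:
    """Split a label of the form 'XXX-YYY' into specification and AIM parts."""
    match = _LABEL_PATTERN.fullmatch(label)
    if match is None:
        raise ValueError(f"not a valid AIM label: {label!r}")
    return AimLabel(spec=match.group(1), aim=match.group(2))


if __name__ == "__main__":
    parsed = parse_aim_label("PAF-PDX")
    print(parsed.spec, parsed.aim)
```

A label that does not match the assumed pattern, such as a lowercase or four-letter code, raises `ValueError` rather than being silently accepted.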