The Reference Model underpinning Technical Specification: Human and Machine Communication (MPAI-HMC) is depicted in Figure 1. It is based on the AI Framework (AIF) standardised by Technical Specification: AI Framework (MPAI-AIF) and is implemented by an AI Workflow (AIW) that includes AI Modules (AIM). Three out of six AIMs in Figure 1 (Audio-Visual Scene Description, Entity Context Understanding, and Personal Status Display) are Composite, i.e., they include interconnected AIMs. Basic information on MPAI-AIF is provided here.

Figure 1 – Human-Machine Communication AIW

Note that:

  1. Words beginning with a capital letter are defined in Definitions; words beginning with a lowercase letter have their common meaning.
  2. Input Selector enables the Entity to inform the Machine, through the Entity Context Understanding AIM, about the use of Text vs. Speech, Language Preferences, and the Selected Language in translation.
  3. Input Text, Input Speech, and Input Visual convey the information emitted by the Entity and its Context as captured by the Machine.
  4. Input Portable Avatar is the Communication Item emitted by a communicating Machine.
  5. Audio-Visual Scene Descriptors produced by the Audio-Visual Scene Description and Audio-Visual Scene Integration and Description AIMs are digital representations of a real audio-visual scene or a Virtual Audio-Visual Scene.
  6. For convenience, each AIM is labelled with the three letters of the Technical Specification that specifies it, followed by a hyphen “-” and three letters that uniquely identify the AIM within that Technical Specification. For instance, Portable Avatar Demultiplexing is labelled PAF-PDX, where PAF refers to Technical Specification: Portable Avatar Format (MPAI-PAF) and PDX refers to the Portable Avatar Demultiplexing AIM specified by MPAI-PAF.
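
The labelling convention in note 6 can be illustrated with a minimal sketch that splits a label such as PAF-PDX into its two parts. The names `AimLabel` and `parse_aim_label` are hypothetical and not part of any MPAI specification, and the sketch assumes that both parts consist of exactly three uppercase ASCII letters.

```python
import re
from typing import NamedTuple

# Assumption: each part is exactly three uppercase ASCII letters,
# separated by a single hyphen, as in the PAF-PDX example.
_LABEL_RE = re.compile(r"([A-Z]{3})-([A-Z]{3})")


class AimLabel(NamedTuple):
    """Hypothetical container for the two parts of an AIM label."""
    spec: str  # three letters of the Technical Specification, e.g. "PAF"
    aim: str   # three letters identifying the AIM within it, e.g. "PDX"


def parse_aim_label(label: str) -> AimLabel:
    """Split a label of the form XXX-YYY into its specification and AIM parts."""
    match = _LABEL_RE.fullmatch(label)
    if match is None:
        raise ValueError(f"not a valid AIM label: {label!r}")
    return AimLabel(spec=match.group(1), aim=match.group(2))


print(parse_aim_label("PAF-PDX"))  # AimLabel(spec='PAF', aim='PDX')
```

Here `spec` holds the letters referring to the Technical Specification (PAF for MPAI-PAF) and `aim` holds the letters identifying the AIM (PDX for Portable Avatar Demultiplexing). A label that does not follow the assumed pattern raises `ValueError`.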