

Technical Specification: Portable Avatar Format (MPAI-PAF) V1.1 enables the implementation of the Avatar-Based Videoconference Use Case (PAF-ABV), a form of videoconference held in a Virtual Environment populated by Avatars that represent humans. The Avatars show the humans’ visual appearance and utter their voices. Figure 2 depicts the system, which is composed of four types of subsystems implemented as MPAI-AIF AI Workflows (AIW) whose specifications are available at the following links:

  1. Videoconference Client Transmitter
  2. Avatar Videoconference Server
  3. Virtual Meeting Secretary
  4. Videoconference Client Receiver

Figure 2 – Avatar-Based Videoconference end-to-end diagram

The components of the PAF-ABV system are:

  1. Participant: a human joining an ABV either individually or as a member of a group of humans in the same physical room.
  2. Audio-Visual Scene: a virtual audio-visual space equipped with Visual Objects such as a table and an appropriate number of chairs and Audio Objects described by Audio-Visual Scene Descriptors.
  3. Portable Avatar: a representation of a human Participant in the Portable Avatar Format (PAF).
  4. Videoconference Client Transmitter:
    • At the beginning of the conference:
      • Receives from Participants and sends to the Server Portable Avatars containing the Avatar Models and Language Preferences.
      • Sends to the Server Speech Object and Face Object for Authentication.
    • Continuously sends to the Server Portable Avatars containing Avatar Descriptors and Speech.
  5. Avatar Videoconference Server:
    • At the beginning:
      • Selects a Visual Environment Model, e.g., a meeting room.
      • Equips the room with objects, i.e., a meeting table and chairs.
      • Places Avatar Models around the table with a given Spatial Attitude.
      • Distributes the Environment and the Portable Avatars containing Avatar Models and their Spatial Attitudes to all Receiving Clients.
      • Authenticates Speech and Face Objects and assigns IDs to Avatars.
      • Sets the common conference language.
    • Continuously:
      • Translates Speech to Participants according to their Language Preferences.
      • Sends Portable Avatars containing Avatar Descriptors, Speech, and Spatial Attitude of Participants and Virtual Meeting Secretary to all Receiving Clients and Virtual Meeting Secretary.
  6. Virtual Meeting Secretary: an Avatar not corresponding to any Participant that continuously:
    • Uses the common meeting language.
    • Understands Avatars’ utterances and extracts their Personal Statuses.
    • Drafts a Summary of its understanding of Avatars’ Text and Personal Status.
    • Displays the Summary either:
      • Outside the Environment, for Participants to read and edit directly, or
      • In the Visual Environment, for Avatars to comment on, e.g., via Text.
    • Refines the Summary.
    • Sends its Portable Avatar containing its Avatar Descriptors to the Server.
  7. Videoconference Client Receiver:
    • At the beginning:
      • Receives the Visual Environment and Portable Avatars containing Avatar Models with Spatial Attitudes.
    • Continuously:
      • Receives Portable Avatars with Avatar Descriptors and Speech.
      • Produces Visual and Audio Scene Descriptors.
      • Renders the Audio-Visual Scene by spatially placing each Participant’s utterances at the Spatial Attitude of the respective Avatar’s mouth. Rendering may be done from a Point of View selected by the Participant, possibly different from the position assigned to their Avatar in the Visual Environment. Participants use a device of their choice (HMD or 2D display/earpad).
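
The Server steps listed above (placing Avatar Models around the table with a Spatial Attitude, then forwarding Speech according to Language Preferences) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the field names, the reduction of a Spatial Attitude to a floor position plus a yaw angle, the circular seating rule, and the placeholder `translate` function are hypothetical and are not taken from the MPAI-PAF specification.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpatialAttitude:
    # Hypothetical simplification: floor-plane position plus a yaw angle.
    x: float
    y: float
    yaw: float  # radians; direction the Avatar faces


@dataclass
class PortableAvatar:
    # Hypothetical subset of fields; the real format carries more data.
    avatar_id: str
    language_preference: str
    avatar_model: Optional[str] = None         # sent at the beginning
    avatar_descriptors: Optional[dict] = None  # sent continuously
    speech: Optional[str] = None               # stand-in for a Speech Object
    spatial_attitude: Optional[SpatialAttitude] = None


def place_around_table(avatars, radius=1.5):
    """Seat Avatars evenly on a circle around the table, facing its centre."""
    n = len(avatars)
    for i, avatar in enumerate(avatars):
        angle = 2 * math.pi * i / n
        x, y = radius * math.cos(angle), radius * math.sin(angle)
        # Facing the centre means looking back along the radius.
        yaw = (angle + math.pi) % (2 * math.pi)
        avatar.spatial_attitude = SpatialAttitude(x, y, yaw)
    return avatars


def translate(text, source, target):
    """Placeholder for the Server's translation step; it only tags the text."""
    return text if source == target else f"[{source}->{target}] {text}"


def route_speech(speaker, receivers):
    """Map each other receiver's ID to the speech in its preferred language."""
    return {
        r.avatar_id: translate(
            speaker.speech, speaker.language_preference, r.language_preference
        )
        for r in receivers
        if r.avatar_id != speaker.avatar_id
    }


avatars = place_around_table([
    PortableAvatar("a", "en"),
    PortableAvatar("b", "it"),
    PortableAvatar("c", "en"),
])
avatars[0].speech = "hello"
routed = route_speech(avatars[0], avatars)
```

In this sketch the Server assigns Spatial Attitudes once at the beginning, while `route_speech` stands for the work it repeats continuously for every incoming Portable Avatar.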

Each component of the Avatar-Based Videoconference Use Case is implemented as an AI Workflow (AIW) composed of AI Modules (AIMs) executed in an AI Framework. Basic notions concerning Technical Specification: AI Framework (MPAI-AIF) V2.0 are available here.
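
As a rough illustration of an AI Workflow composed of AI Modules, the following Python sketch models each AIM as a function from named inputs to named outputs and the AIW as an ordered chain of such functions. It does not reproduce the MPAI-AIF API: the `AIWorkflow` class, the two example AIMs, and all data names are hypothetical and chosen only to show the composition idea.

```python
from typing import Callable, Dict, List, Tuple

# An AIM is modelled here as a function from named inputs to named outputs.
AIM = Callable[[Dict[str, object]], Dict[str, object]]


class AIWorkflow:
    """Hypothetical stand-in for an AIW: runs AIMs in order over shared data."""

    def __init__(self, name: str, aims: List[Tuple[str, AIM]]):
        self.name = name
        self.aims = aims

    def run(self, inputs: Dict[str, object]) -> Dict[str, object]:
        data = dict(inputs)  # copy so the caller's inputs are not modified
        for _aim_name, aim in self.aims:
            # Outputs of each AIM become available to the AIMs that follow.
            data.update(aim(data))
        return data


def describe_face(data):
    # Hypothetical AIM: turns a face input into descriptors.
    return {"face_descriptors": f"descriptors({data['face']})"}


def package_avatar(data):
    # Hypothetical AIM: bundles descriptors and speech into one output.
    return {
        "portable_avatar": {
            "descriptors": data["face_descriptors"],
            "speech": data["speech"],
        }
    }


transmitter = AIWorkflow(
    "ClientTransmitter",
    [("describe", describe_face), ("package", package_avatar)],
)
result = transmitter.run({"face": "frame0", "speech": "hello"})
```

A real AI Framework would also handle module lifecycle and communication between AIMs; the sketch leaves those aspects out and keeps only the chaining of inputs and outputs.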

The specification of each AIW includes the following elements:

  1. Functions of the AIW
  2. Reference Architecture of the AIW
  3. Input and Output Data of the AIW
  4. Functions of the AIMs
  5. Input and Output Data of the AIMs
  6. JSON Metadata of the AIW and its AIMs.

