(Informative)

There is a long history of computer-created objects called “digital humans”, i.e., digital objects that have a human appearance when rendered. In most cases, the underlying assumption has been that creation, animation, and rendering are done in a closed environment. Such digital humans had little or no need for standards.

In communication contexts, and even more so in metaverse contexts, a digital human is often not confined to a closed environment, which calls for some form of standardisation. Technical Specification: Portable Avatar Format (MPAI-PAF), referred to below as MPAI-PAF, is a first response to the requirements of these new usage contexts. MPAI-PAF specifies the Portable Avatar Format (PAF), which enables a receiving party to render a digital human as intended by the sending party.

MPAI-PAF has been developed by MPAI (Moving Picture, Audio, and Data Coding by Artificial Intelligence), the international, unaffiliated, non-profit organisation that develops standards for Artificial Intelligence (AI)-based data coding with clear Intellectual Property Rights licensing frameworks. MPAI follows the rigorous MPAI Process and pursues the following policies:

  1. Be friendly to the AI context but, to the extent possible, agnostic to the technology – AI or Data Processing – used in an implementation.
  2. Be attractive to different industries, end users, and regulators.
  3. Address three levels of standardisation: data types, components (called AI Modules), and configurations of components (called AI Workflows), all exposing standard interfaces with an aggregation level decided by the implementer.
  4. Specify the data exchanged by components with clear semantics to the extent possible.

As manager of the MPAI Ecosystem specified by Governance of MPAI Ecosystem (MPAI-GME) [1], MPAI ensures that a user can:

  1. Operate the reference implementation of the Technical Specification, by providing a Reference Software Specification with annexed software.
  2. Test the conformance of an implementation with the Technical Specification, by providing a Conformance Testing Specification.
  3. Assess the performance of an implementation of the Technical Specification, by providing a Performance Assessment Specification.
  4. Get conforming implementations possibly with a performance assessment report from a trusted source through the MPAI Store.

The MPAI-PAF Technical Specification will be accompanied by the Reference Software, Conformance Testing, and Performance Assessment Specifications. Conformance Testing specifies methods enabling users to ascertain whether a data type generated by an AI Module (AIM) or an AI Workflow (AIW) conforms with this Technical Specification.

The MPAI-PAF Technical Specification applies technologies to the Avatar-Based Videoconference (PAF-ABV) Use Case where:

  1. Client Transmitters send to the Server PAFs containing:
    • At the beginning: Avatar Models, Language Selector, and the Speech Object and Face Object used for participant authentication.
    • Continuously: Avatar Descriptors and Speech Objects.
  2. Avatar Videoconference Server:
    • At the beginning:
      • Selects an Environment, i.e., a meeting room, and equips it with objects, i.e., a meeting table and chairs.
      • Places Avatar Models around the table.
      • Distributes to all receiving clients, for each participant, a PAF containing the Environment, the Avatar Models, and their positions.
    • Continuously sends to receiving clients:
      • Translated Speech from participants according to Language Selectors.
      • PAFs containing Avatar Descriptors and translated Speech.
  3. Client Receivers:
    • At the beginning: receive Environment and PAFs containing Avatar Models and Language Selectors from the server.
    • Continuously from the server:
      • Receive PAFs containing Avatar Descriptors and translated Speech.
      • Create Audio and Visual Scene Descriptors.
      • Render the Audio-Visual Scene as seen from the human participant-selected Point of View.
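
The message flow of the Use Case above can be illustrated with a minimal Python sketch. It is not the normative PAF syntax: the class names `PortableAvatar` and `Server`, all field names, the circular seating rule, and the `translate` callable are hypothetical and introduced only for illustration. Participant authentication and rendering are omitted.

```python
import math
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Tuple


@dataclass
class PortableAvatar:
    """Illustrative stand-in for a PAF message (field names are assumptions)."""
    participant_id: str
    avatar_model: Optional[Any] = None
    language_selector: Optional[str] = None
    speech_object: Optional[Any] = None
    face_object: Optional[Any] = None
    avatar_descriptors: Optional[Dict[str, Any]] = None
    environment: Optional[str] = None
    position: Optional[Tuple[float, float]] = None


class Server:
    """Hypothetical Avatar Videoconference Server; authentication is omitted."""

    def __init__(self, environment: str = "meeting-room") -> None:
        self.environment = environment
        self.participants: Dict[str, PortableAvatar] = {}

    def join(self, paf: PortableAvatar) -> None:
        # "At the beginning": keep the Avatar Model and Language Selector.
        self.participants[paf.participant_id] = paf

    def place_avatars(self, radius: float = 1.5) -> List[PortableAvatar]:
        # Assumed seating rule: spread the avatars evenly on a circle
        # around the table, then build one PAF per participant.
        count = len(self.participants)
        placed = []
        for index, (pid, p) in enumerate(sorted(self.participants.items())):
            angle = 2 * math.pi * index / count
            placed.append(PortableAvatar(
                participant_id=pid,
                avatar_model=p.avatar_model,
                language_selector=p.language_selector,
                environment=self.environment,
                position=(radius * math.cos(angle), radius * math.sin(angle)),
            ))
        return placed

    def relay(
        self,
        paf: PortableAvatar,
        translate: Callable[[Any, Optional[str]], Any],
    ) -> Dict[str, PortableAvatar]:
        # "Continuously": forward the Avatar Descriptors and the speech,
        # translated into each receiver's selected language.
        outgoing = {}
        for rid, receiver in self.participants.items():
            if rid == paf.participant_id:
                continue  # do not send a participant their own stream
            outgoing[rid] = PortableAvatar(
                participant_id=paf.participant_id,
                avatar_descriptors=paf.avatar_descriptors,
                speech_object=translate(
                    paf.speech_object, receiver.language_selector
                ),
            )
        return outgoing
```

In this sketch, a Client Receiver would take the output of `place_avatars` once and the output of `relay` repeatedly, and build its Audio and Visual Scene Descriptors from them. Whether the server excludes the sender from its own stream, as done here, is an assumption of the sketch.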

MPAI-PAF specifies Digital Human technologies and utilises its own technologies and those specified by other MPAI standards for the Avatar-Based Videoconference Use Case. Similarly, other MPAI standards utilise standard MPAI-PAF technologies in other Use Cases such as Human-Connected Autonomous Vehicle (CAV) Interaction (CAV-HCI).

Chapters, Sections, and Annexes are Normative unless they are explicitly identified as Informative.