This is the public page of the Portable Avatar Format (MPAI-PAF) standard project. It provides technical information about the project. Also see the MPAI-PAF homepage.

MPAI-PAF specifies the Portable Avatar Format, including the AvatarID, the Time, the Visual Environment and the Spatial Attitude of the Avatar, the Model, the Body and Face Descriptors, the Language Preference, the Speech, the Text, and the Personal Status.
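As an illustration only, the elements listed above could be grouped in a single record. The sketch below is not the normative MPAI-PAF syntax: the class name, field names, and Python types are assumptions chosen to mirror the list, and every element except the identifier and the time is treated as optional.

```python
from dataclasses import dataclass, fields
from typing import List, Optional, Tuple


@dataclass
class SpatialAttitude:
    # Hypothetical layout: a position and an orientation in the environment.
    position: Tuple[float, float, float] = (0.0, 0.0, 0.0)
    orientation: Tuple[float, float, float] = (0.0, 0.0, 0.0)


@dataclass
class PortableAvatar:
    # Field names are illustrative, not taken from the Technical Specification.
    avatar_id: str
    time: float
    visual_environment: Optional[str] = None
    spatial_attitude: Optional[SpatialAttitude] = None
    model: Optional[str] = None
    body_descriptors: Optional[dict] = None
    face_descriptors: Optional[dict] = None
    language_preference: Optional[str] = None
    speech: Optional[bytes] = None
    text: Optional[str] = None
    personal_status: Optional[dict] = None

    def present_fields(self) -> List[str]:
        """Return the names of the elements this avatar actually carries."""
        return [f.name for f in fields(self) if getattr(self, f.name) is not None]


avatar = PortableAvatar(avatar_id="a1", time=0.0, text="Hello", language_preference="en")
print(avatar.present_fields())
```

Treating most elements as optional reflects the way the use cases below send different subsets of the data at different moments, for example the Avatar Model at the start of a session and Descriptors afterwards.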

Personal Status Display generates a Portable Avatar from Text and Personal Status (PS) that 1) utters speech with the intended PS, 2) displays a face whose lips move in sync with the text and shows the intended PS, and 3) displays a body making gestures that accompany the Text and show the intended PS.
Figure 1 – Personal Status Display 
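The data flow of Personal Status Display can be sketched as three generators fed by the same Text and Personal Status. This is a minimal sketch, not the reference implementation: all function names are hypothetical, the Personal Status is reduced to a single emotion label, and each generator is a placeholder that only records its inputs.

```python
from dataclasses import dataclass


@dataclass
class PersonalStatus:
    # Assumption: PS is simplified to one label such as "happy" or "neutral".
    emotion: str = "neutral"


def synthesize_speech(text: str, ps: PersonalStatus) -> dict:
    # Placeholder for a speech synthesizer conditioned on the PS.
    return {"utterance": text, "style": ps.emotion}


def describe_face(text: str, ps: PersonalStatus) -> dict:
    # Placeholder: lip movement follows the text, expression follows the PS.
    return {"lip_sync_source": text, "expression": ps.emotion}


def describe_body(text: str, ps: PersonalStatus) -> dict:
    # Placeholder: gestures accompany the text and reflect the PS.
    return {"gesture_source": text, "posture": ps.emotion}


def personal_status_display(avatar_id: str, text: str, ps: PersonalStatus) -> dict:
    """Combine the three outputs into one avatar record (hypothetical keys)."""
    return {
        "avatar_id": avatar_id,
        "text": text,
        "personal_status": ps,
        "speech": synthesize_speech(text, ps),
        "face_descriptors": describe_face(text, ps),
        "body_descriptors": describe_body(text, ps),
    }


result = personal_status_display("a1", "Nice to meet you", PersonalStatus("happy"))
print(result["speech"], result["face_descriptors"], result["body_descriptors"])
```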
Avatar-Based Videoconference (ABV) is a system in which humans direct their digital twins: speaking avatars that closely resemble them in voice and appearance. Participants at a location use a Client connected to the ABV Server, which is optionally augmented by a Virtual Secretary.
Figure 2 – Avatar-Based Videoconference (ABV)
At the start of the videoconference, the Transmitting Client receives each participant’s Avatar Model and spoken-language preference. Subsequently, it receives the audio and video information of the physical meeting room.

Each Transmitting Client extracts visual and speech data for authentication purposes and continuously generates Portable Avatars, using the participant’s Personal Status to improve the accuracy of the participant description.

Figure 3 – Avatar-Based Videoconference (Transmitting Client)
The ABV Server authenticates participants, selects the Visual Environment, receives the Portable Avatars and distributes the Portable Avatars containing Environment, Avatar Model and Spatial Attitude in the Environment. Subsequently, the ABV Server translates the text and the utterances of the individual participants into Text and Speech in the languages selected by the participants.
Figure 4 – Avatar-Based Videoconference (Server)
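The server steps described above (authenticate, select the environment, translate, distribute) can be sketched as one function. This is a simplified illustration under stated assumptions: `serve_round` and its parameters are hypothetical names, authentication is reduced to checking the avatar ID against the registered participants, avatars are plain dictionaries, and the translator is injected as a callable so no real translation service is implied.

```python
from typing import Callable, Dict, List


def serve_round(
    avatars: List[dict],
    language_of: Dict[str, str],
    environment: str,
    translate: Callable[[str, str], str],
) -> Dict[str, List[dict]]:
    """Return, per registered participant, the avatars to send to them."""
    # Hypothetical authentication: admit only avatars of registered participants.
    admitted = [a for a in avatars if a["avatar_id"] in language_of]

    outbox: Dict[str, List[dict]] = {}
    for recipient, language in language_of.items():
        outbox[recipient] = [
            {
                **a,
                "visual_environment": environment,
                # Each recipient gets the text in their selected language.
                "text": translate(a["text"], language),
            }
            for a in admitted
        ]
    return outbox


def fake_translate(text: str, language: str) -> str:
    # Stand-in for a real translation module; it only tags the text.
    return f"[{language}] {text}"


incoming = [{"avatar_id": "alice", "text": "Hello"}, {"avatar_id": "mallory", "text": "Hi"}]
print(serve_round(incoming, {"alice": "en", "bob": "it"}, "meeting-room", fake_translate))
```

Speech translation would follow the same per-recipient pattern; it is omitted here to keep the sketch short.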
The Virtual Secretary (VS) is a human-like speaking avatar that does not represent a human. It takes notes of what is said at the meeting, taking the avatars’ PSs into account. The avatars participating in the ABV can make comments to the VS, answer questions, etc. The VS sends a Portable Avatar that includes its Personal Status data. The ABV Server composes the Virtual Secretary into the ABV.
Figure 5 – Avatar-Based Videoconference (Virtual Secretary)
Each Receiving Client arranges the participating avatars as instructed by the ABV Server and attaches the speech of each participant to the mouth of the corresponding avatar with the appropriate Spatial Attitude. A participant may select a point of view that coincides with their avatar’s point of view or a different one and see the meeting participants as the ABV Server has arranged them.
Figure 6 – Avatar-Based Videoconference (Receiving Client)
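The Receiving Client behaviour described above can be approximated as a scene-composition step. This is a rough sketch under assumptions: `compose_scene` and the dictionary keys are hypothetical, the Spatial Attitude is reduced to a position, and the speech is attached to the avatar's position rather than to a modelled mouth.

```python
from typing import List, Optional, Tuple

Position = Tuple[float, float, float]


def compose_scene(
    avatars: List[dict],
    viewpoint_avatar_id: Optional[str] = None,
    free_viewpoint: Position = (0.0, 1.6, -3.0),
) -> dict:
    """Place avatars as instructed and choose the participant's point of view."""
    placed = []
    for avatar in avatars:
        placed.append(
            {
                "avatar_id": avatar["avatar_id"],
                "position": avatar["position"],
                "speech": avatar.get("speech"),
                # Simplification: speech originates at the avatar's position.
                "speech_origin": avatar["position"],
            }
        )

    # Use the selected avatar's position as the camera, else a free viewpoint.
    positions = {a["avatar_id"]: a["position"] for a in avatars}
    camera = positions.get(viewpoint_avatar_id, free_viewpoint)
    return {"avatars": placed, "camera": camera}


meeting = [
    {"avatar_id": "a1", "position": (0.0, 0.0, 0.0), "speech": b"..."},
    {"avatar_id": "a2", "position": (1.0, 0.0, 0.0)},
]
print(compose_scene(meeting, viewpoint_avatar_id="a2")["camera"])
```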

The current goal is to develop a Reference Software implementation of the MPAI-PAF Technical Specification.


If you wish to participate in this work, you have the following options:

  1. Join MPAI
  2. Keep an eye on this page.
