1     Functions

Audio-Visual Scene Rendering (PAF-AVR):

Receives
  1. Audio-Visual Scene Descriptors, or
  2. Portable Avatar, and
  3. Point of View.
Transforms Portable Avatar into generic Audio-Visual Scene Descriptors if input is Portable Avatar.
Produces
  1. Text.
  2. Media resulting from rendering of Audio-Visual Scene Descriptors from Point of View:
    1. Output Audio.
    2. Output Visual.
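
The behaviour listed above can be illustrated with a minimal Python sketch. It is not part of the specification: every class, field, and function name below (PortableAvatar, AudioVisualSceneDescriptors, PointOfView, to_scene_descriptors, render_scene) is a hypothetical stand-in, and the "rendering" step is a placeholder that only pairs each scene object with the Point of View. The sketch shows only the dispatch on the input type and the three outputs.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass
class AudioVisualSceneDescriptors:
    # Hypothetical simplification: scene objects identified by name only.
    audio_objects: List[str]
    visual_objects: List[str]


@dataclass
class PortableAvatar:
    # Hypothetical simplification of a Portable Avatar: text plus media objects.
    text: str
    audio_objects: List[str]
    visual_objects: List[str]


@dataclass(frozen=True)
class PointOfView:
    # Assumed representation: a position and an orientation in 3D.
    position: Tuple[float, float, float]
    orientation: Tuple[float, float, float]


@dataclass
class RenderingOutput:
    text: Optional[str]                      # Output Text (None if no Portable Avatar)
    audio: List[Tuple[str, PointOfView]]     # Output Audio (placeholder)
    visual: List[Tuple[str, PointOfView]]    # Output Visual (placeholder)


def to_scene_descriptors(avatar: PortableAvatar) -> AudioVisualSceneDescriptors:
    """Transform a Portable Avatar into generic Audio-Visual Scene Descriptors."""
    return AudioVisualSceneDescriptors(
        audio_objects=list(avatar.audio_objects),
        visual_objects=list(avatar.visual_objects),
    )


def render_scene(
    scene_input: Union[AudioVisualSceneDescriptors, PortableAvatar],
    point_of_view: PointOfView,
) -> RenderingOutput:
    """Accept either input kind, normalise it, and produce the three outputs."""
    text = None
    if isinstance(scene_input, PortableAvatar):
        text = scene_input.text
        descriptors = to_scene_descriptors(scene_input)
    elif isinstance(scene_input, AudioVisualSceneDescriptors):
        descriptors = scene_input
    else:
        raise TypeError("expected AudioVisualSceneDescriptors or PortableAvatar")

    # Placeholder rendering: pair each object with the Point of View.
    audio = [(name, point_of_view) for name in descriptors.audio_objects]
    visual = [(name, point_of_view) for name in descriptors.visual_objects]
    return RenderingOutput(text=text, audio=audio, visual=visual)
```

In this sketch, Output Text is populated only when the input is a Portable Avatar, since the Text is described as being included in the Portable Avatar; a real implementation would replace the placeholder pairing with actual audio and visual rendering.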

2      Reference Architecture

Figure 1 depicts the Reference Architecture of the Audio-Visual Scene Rendering AIM.

Figure 1 – The Audio-Visual Scene Rendering AIM

3      I/O Data

Table 1 specifies the Input and Output Data of the Audio-Visual Scene Rendering AIM.

Table 1 – I/O Data of the Audio-Visual Scene Rendering AIM

Input            Description
Portable Avatar  Data produced, e.g., by Personal Status Display.
Point of View    Point from where an Entity perceives the Audio-Visual Scene.
Output           Description
Output Text      The Text included in the Portable Avatar.
Output Audio     The Audio components of the Audio-Visual Scene.
Output Visual    The Visual components of the Audio-Visual Scene.

4      SubAIMs

No SubAIMs.

5     JSON Metadata

https://schemas.mpai.community/PAF/V1.2/AIMs/AudioVisualSceneRendering.json

6     Profiles

The Profiles of Audio-Visual Scene Rendering are specified.