The Audio-Visual Scene Description Composite AIM is specified in the following six sections.
1 Functions of Audio-Visual Scene Description
2 Reference Architecture of Audio-Visual Scene Description
3 Input/output data of Audio-Visual Scene Description
4 Functions of Audio-Visual Scene Description AI Modules
5 I/O Data of Audio-Visual Scene Description AI Modules
6 Specification of Audio-Visual Scene Description AIMs and JSON Metadata
1 Functions of Audio-Visual Scene Description
The Audio-Visual Scene Description (OSD-AVS) Composite AIM receives Audio Scene Descriptors and Visual Scene Descriptors that were produced independently of each other for the same Virtual Space. It produces Audio-Visual Scene Descriptors in which co-located Audio Objects and Visual Objects carry the same or related identifiers.
2 Reference Architecture of Audio-Visual Scene Description
Figure 1 gives the Reference Model of Audio-Visual Scene Description.
Figure 1 – Reference Model of Audio-Visual Scene Description
3 Input/output data of Audio-Visual Scene Description
Table 1 gives the input/output data of Audio-Visual Scene Description.
Table 1 – I/O data of Audio-Visual Scene Description
Input data | From | Comment |
Input Audio | A real environment | The Input Audio and Input Visual originate from the same scene |
Input Visual | A real environment | The Input Audio and Input Visual originate from the same scene |
Output data | To | Comments |
Audio-Visual Scene Descriptors | Downstream AIM | The co-located Audio and Visual Objects in the Scene convey the same or related identifiers. |
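The output in Table 1 can be pictured with a minimal Python sketch. It assumes a scene is a flat list of objects, each carrying an identifier, a position, and a media type. The class and field names (`SceneObject`, `object_id`, `media`) are hypothetical and are not taken from the specification's data formats. The sketch only covers identical identifiers, not "related" ones.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Position = Tuple[float, float, float]  # assumption: Cartesian coordinates


@dataclass
class SceneObject:
    object_id: str      # hypothetical identifier field
    position: Position  # hypothetical location in the scene
    media: str          # "audio" or "visual"


@dataclass
class AudioVisualSceneDescriptors:
    objects: List[SceneObject] = field(default_factory=list)

    def co_located_groups(self) -> Dict[str, List[SceneObject]]:
        """Return the identifiers shared by at least one audio and one visual object."""
        groups: Dict[str, List[SceneObject]] = {}
        for obj in self.objects:
            groups.setdefault(obj.object_id, []).append(obj)
        return {
            object_id: members
            for object_id, members in groups.items()
            if {m.media for m in members} == {"audio", "visual"}
        }


# Usage: a speaker seen and heard at the same place, plus a silent visual object.
scene = AudioVisualSceneDescriptors(objects=[
    SceneObject("obj-1", (0.0, 0.0, 0.0), "audio"),
    SceneObject("obj-1", (0.0, 0.0, 0.0), "visual"),
    SceneObject("obj-2", (1.0, 0.0, 0.0), "visual"),
])
shared = scene.co_located_groups()
```

A downstream AIM could use such a grouping to treat an audio source and its visual counterpart as one entity.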
4 Functions of Audio-Visual Scene Description AI Modules
Table 2 gives the functions of the AIMs.
Table 2 – AI Modules of Audio-Visual Scene Description
AIM | Function |
Audio Scene Description | Produces the Audio Scene Descriptors (Geometry+Objects). |
Visual Scene Description | Produces the Visual Scene Descriptors (Geometry+Objects). |
Audio-Visual Alignment | Identifies co-located Audio and Visual Objects. Assigns the same or related Identifiers to the co-located Audio and Visual Objects. Updates the Audio-Visual Scene Geometry. |
Audio-Visual Scene Multiplexing | Multiplexes the new Audio-Visual Scene Geometry and the Audio and Visual Objects. |
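The Alignment and Multiplexing rows of Table 2 can be illustrated with a short Python sketch. It is not the method defined by the specification: it assumes that "co-located" means "within a distance threshold" and uses a greedy nearest-pair match. The function names `align` and `multiplex`, the `max_dist` parameter, and the dictionary layout are all hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]


def align(audio: Dict[str, Point], visual: Dict[str, Point],
          max_dist: float = 0.5) -> Dict[str, str]:
    """Greedily map audio object IDs to the closest unmatched visual object ID."""
    # All audio/visual pairs, closest first.
    pairs = sorted(
        (math.dist(a_pos, v_pos), a_id, v_id)
        for a_id, a_pos in audio.items()
        for v_id, v_pos in visual.items()
    )
    mapping: Dict[str, str] = {}
    used_visual = set()
    for dist, a_id, v_id in pairs:
        if dist > max_dist:
            break  # remaining pairs are even farther apart
        if a_id in mapping or v_id in used_visual:
            continue
        mapping[a_id] = v_id
        used_visual.add(v_id)
    return mapping


def multiplex(audio: Dict[str, Point], visual: Dict[str, Point],
              mapping: Dict[str, str]) -> List[dict]:
    """Merge both object sets; aligned audio objects take the visual object's ID."""
    scene = [{"id": v_id, "media": "visual", "position": pos}
             for v_id, pos in visual.items()]
    scene += [{"id": mapping.get(a_id, a_id), "media": "audio", "position": pos}
              for a_id, pos in audio.items()]
    return scene


# Usage: a1 sits next to v1; a2 has no visual counterpart nearby.
audio_objects = {"a1": (0.0, 0.0, 0.0), "a2": (5.0, 5.0, 5.0)}
visual_objects = {"v1": (0.1, 0.0, 0.0), "v2": (9.0, 9.0, 9.0)}
id_map = align(audio_objects, visual_objects)
av_scene = multiplex(audio_objects, visual_objects, id_map)
```

Greedy matching is chosen here only for brevity; an optimal assignment or a matcher that also uses object features would fit the same interface.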
5 I/O Data of Audio-Visual Scene Description AI Modules
Table 3 gives the I/O data of the AIMs.
Table 3 – I/O Data of the AI Modules of Audio-Visual Scene Description
6 Specification of Audio-Visual Scene Description AIMs and JSON Metadata
Table 4 – AIM and JSON Metadata
OSD-AVS | | Audio-Visual Scene Description | X |
– | CAE-ASD | Audio Scene Description | X |
– | OSD-VSD | Visual Scene Description | X |
– | OSD-AVA | Audio-Visual Alignment | X |
– | OSD-AMX | Audio-Visual Scene Multiplexing | X |
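The composition listed in Table 4 can be written down as a small Python structure, which may help when checking that every sub-AIM is accounted for. The identifiers and names come from the table. The key names (`id`, `name`, `sub_aims`) are hypothetical and do not reproduce the actual JSON Metadata schema.

```python
import json

# Composite AIM and its four sub-AIMs, as listed in Table 4.
COMPOSITE_AIM = {
    "id": "OSD-AVS",
    "name": "Audio-Visual Scene Description",
    "sub_aims": [  # hypothetical key name
        {"id": "CAE-ASD", "name": "Audio Scene Description"},
        {"id": "OSD-VSD", "name": "Visual Scene Description"},
        {"id": "OSD-AVA", "name": "Audio-Visual Alignment"},
        {"id": "OSD-AMX", "name": "Audio-Visual Scene Multiplexing"},
    ],
}


def sub_aim_ids(aim: dict) -> list:
    """Return the identifiers of the AIMs contained in a composite AIM."""
    return [sub["id"] for sub in aim["sub_aims"]]


# Serialise the sketch for inspection.
serialised = json.dumps(COMPOSITE_AIM, indent=2)
```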