1 Functions
Audio-Visual Scene Description (OSD-AVS) provides standard descriptors of an Audio-Visual Scene.
Receives | Audio Objects |
Receives | Visual Objects |
Creates | Visual or Audio-Visual Scene Descriptors if there is only one Visual or Audio-Visual Object, respectively, and no Scene Geometry. |
Produces | Audio-Visual Scene Descriptors |
2 Reference Architecture
The Reference Architecture is depicted in Figure 1.
Figure 1 – The Audio-Visual Scene Description AIM
3 I/O Data
Table 1 specifies the Input and Output Data of the Audio-Visual Scene Description AIM. Links are to the Data Type specifications.
Table 1 – I/O Data of the Audio-Visual Scene Description AIM
Input | Description |
Audio Objects | Audio Objects. |
Visual Objects | Visual Objects. |
Output | Description |
Audio-Visual Scene Descriptors | The Audio-Visual Scene Descriptors. |
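The I/O shape in Table 1 can be illustrated with a minimal sketch. Everything named here (`AudioObject`, `VisualObject`, `AudioVisualSceneDescriptors`, `describe_audio_visual_scene`, and the `object_id` field) is a hypothetical placeholder introduced for illustration; the normative Data Types are those given in the linked specifications, and the function body only bundles its inputs rather than performing any scene analysis.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class AudioObject:
    """Hypothetical stand-in for an Audio Object (not the normative Data Type)."""
    object_id: str


@dataclass
class VisualObject:
    """Hypothetical stand-in for a Visual Object (not the normative Data Type)."""
    object_id: str


@dataclass
class AudioVisualSceneDescriptors:
    """Hypothetical container for the single output listed in Table 1."""
    audio_objects: List[AudioObject] = field(default_factory=list)
    visual_objects: List[VisualObject] = field(default_factory=list)


def describe_audio_visual_scene(
    audio_objects: Iterable[AudioObject],
    visual_objects: Iterable[VisualObject],
) -> AudioVisualSceneDescriptors:
    """Placeholder AIM: takes the two inputs of Table 1 and returns one output.

    A real implementation would run the SubAIMs; this sketch only groups
    the received objects so the input/output contract is visible.
    """
    return AudioVisualSceneDescriptors(
        audio_objects=list(audio_objects),
        visual_objects=list(visual_objects),
    )
```

A call such as `describe_audio_visual_scene([AudioObject("a1")], [VisualObject("v1")])` returns one descriptor object holding both inputs, mirroring the two-inputs, one-output layout of Table 1.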
4 SubAIMs
Audio-Visual Scene Description (OSD-AVS) is a Composite AIM whose structure is depicted in Figure 2.
Figure 2 – The Audio-Visual Scene Description (OSD-AVS) Composite AIM
Table 2 provides the links to the specifications of the OSD-AVS Basic AIMs.
Table 2 – Basic AIMs of the Audio-Visual Scene Description (OSD-AVS) Composite AIM
AIMs | Names |
CAE-ASD | Audio Scene Description |
CAE-AAT | Audio Analysis Transform |
CAE-ASL | Audio Source Localisation |
CAE-ASE | Audio Separation and Enhancement |
CAE-AST | Audio Synthesis Transform |
CAE-AMX | Audio Descriptors Multiplexing |
OSD-VSD | Visual Scene Description |
OSD-AVA | Audio-Visual Alignment |
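The identifiers in Table 2 can be kept in a simple lookup, sketched below. The codes and names are copied from the table; the dictionary name `SUB_AIMS` and the helper `aims_by_prefix` are hypothetical, and grouping by the part of the code before the hyphen is an assumption about how the identifiers are structured, not something the specification states here.

```python
# Codes and names copied from Table 2; the variable name is illustrative only.
SUB_AIMS = {
    "CAE-ASD": "Audio Scene Description",
    "CAE-AAT": "Audio Analysis Transform",
    "CAE-ASL": "Audio Source Localisation",
    "CAE-ASE": "Audio Separation and Enhancement",
    "CAE-AST": "Audio Synthesis Transform",
    "CAE-AMX": "Audio Descriptors Multiplexing",
    "OSD-VSD": "Visual Scene Description",
    "OSD-AVA": "Audio-Visual Alignment",
}


def aims_by_prefix(prefix: str) -> list:
    """Return the AIM codes whose part before the hyphen equals `prefix`.

    Assumes every code has the form PREFIX-ACRONYM, as in Table 2.
    """
    return [code for code in SUB_AIMS if code.split("-", 1)[0] == prefix]


print(aims_by_prefix("OSD"))  # ['OSD-VSD', 'OSD-AVA']
```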
5 JSON Metadata
https://schemas.mpai.community/OSD/V1.1/AIMs/AudioVisualSceneDescription.json
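As a small illustration, the metadata URL above can be split into its path segments. The interpretation of those segments as a standard code, a version, and an AIM name is an assumption read off this one URL, and the function name `split_metadata_url` and the dictionary keys are hypothetical. The sketch does not fetch the schema.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

METADATA_URL = (
    "https://schemas.mpai.community/OSD/V1.1/AIMs/AudioVisualSceneDescription.json"
)


def split_metadata_url(url: str) -> dict:
    """Split a metadata URL of the form /<standard>/<version>/AIMs/<name>.json.

    The segment meanings are an assumption based on the single URL above.
    """
    parts = PurePosixPath(urlparse(url).path).parts
    # parts looks like ('/', 'OSD', 'V1.1', 'AIMs', 'AudioVisualSceneDescription.json')
    _, standard, version, _, filename = parts
    return {
        "standard": standard,
        "version": version,
        "aim": PurePosixPath(filename).stem,
    }


print(split_metadata_url(METADATA_URL))
```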
6 Profiles
No Profiles.