1    Functions

Audio-Visual Basic Scene Description (OSD-AVB):

Receives Space-Time, Audio Objects, Speech Objects, Visual Objects, and Audio-Visual Objects.
Processes all Objects.
Creates Audio, Speech, Visual, and Audio-Visual Scene Descriptors from the Objects, if possible.
Combines the Scene Descriptors of all the Objects.
Produces Audio-Visual Scene Descriptors.
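The sequence of functions above can be illustrated with a minimal Python sketch. It is not part of the specification: the `SceneObject` class, the dictionary layout of a descriptor, and the rule that an Object without Space-Time cannot be described are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class SceneObject:
    """Hypothetical container for one Object received by the AIM."""
    kind: str                          # "audio", "speech", "visual" or "audio-visual"
    payload: Any                       # the Object data; its format is not modelled here
    space_time: Optional[dict] = None  # Space-Time information, if available


def create_descriptor(obj: SceneObject) -> Optional[dict]:
    """Create a Scene Descriptor for one Object, or None when that is not possible."""
    if obj.space_time is None:
        # Assumption: an Object without Space-Time cannot be described.
        return None
    return {"kind": obj.kind, "space_time": obj.space_time, "object": obj.payload}


def describe_scene(objects: List[SceneObject]) -> dict:
    """Combine the per-Object descriptors into one Audio-Visual Scene Descriptors record."""
    descriptors = []
    for obj in objects:
        descriptor = create_descriptor(obj)
        if descriptor is not None:  # "if possible": skip Objects that cannot be described
            descriptors.append(descriptor)
    return {"type": "AudioVisualSceneDescriptors", "descriptors": descriptors}
```

Calling `describe_scene` with a list of `SceneObject` instances returns a single record that keeps only the Objects for which a descriptor could be created.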

2      Reference Architecture

The Reference Architecture is depicted in Figure 1.

Figure 1 – The Audio-Visual Basic Scene Description (OSD-AVB) AIM

3      I/O Data

Table 1 specifies the Input and Output Data of the Audio-Visual Basic Scene Description (OSD-AVB). Links are to the Data Type specifications.

Table 1 – I/O Data of the Audio-Visual Basic Scene Description (OSD-AVB) AIM

Input                            Description
SpaceTime                        Space-Time information of Objects.
Audio Objects                    Audio Objects.
Speech Objects                   Speech Objects.
Visual Objects                   Visual Objects.
Output                           Description
Audio-Visual Scene Descriptors   The Audio-Visual Descriptors of the Scene.
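As an illustration of Table 1, the inputs and the output could be modelled as typed containers. The class and field names below are hypothetical, and the payload types are left as `Any` because the Data Type specifications are not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class OSDAVBInput:
    """Inputs listed in Table 1; field names are illustrative, not normative."""
    space_time: Any                                          # Space-Time information of Objects
    audio_objects: List[Any] = field(default_factory=list)   # Audio Objects
    speech_objects: List[Any] = field(default_factory=list)  # Speech Objects
    visual_objects: List[Any] = field(default_factory=list)  # Visual Objects

    def object_count(self) -> int:
        """Total number of Objects across the three input lists."""
        return (len(self.audio_objects)
                + len(self.speech_objects)
                + len(self.visual_objects))


@dataclass
class OSDAVBOutput:
    """Output listed in Table 1."""
    audio_visual_scene_descriptors: Any  # The Audio-Visual Descriptors of the Scene
```

Keeping each input in its own field mirrors the rows of the table, so a missing input shows up as an empty list rather than as an absent key.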

4     SubAIMs

Audio-Visual Basic Scene Description (OSD-AVB) is a Composite AIM whose Reference Model is depicted in Figure 2.

 

Figure 2 – Reference Model of Audio-Visual Basic Scene Description Composite AIM

Table 2 provides the AI Modules composing the AIM.

AIMs                                   JSON
Audio-Visual Basic Scene Description   X
Audio Scene Description                X
Speech Scene Description               X
Visual Scene Description               X
Audio-Visual Alignment                 X
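The composition listed in Table 2 might be wired together as in the following sketch. The data flow is an assumption, since Figure 2 is not reproduced here: the three Scene Description sub-AIMs are assumed to run independently and to feed Audio-Visual Alignment. All function names and return structures are hypothetical, and the alignment step is a placeholder that only groups the three descriptor sets.

```python
from typing import Any, Dict, List

Descriptors = Dict[str, Any]


def audio_scene_description(audio_objects: List[Any], space_time: Any) -> Descriptors:
    """Stand-in for the Audio Scene Description sub-AIM."""
    return {"objects": list(audio_objects), "space_time": space_time}


def speech_scene_description(speech_objects: List[Any], space_time: Any) -> Descriptors:
    """Stand-in for the Speech Scene Description sub-AIM."""
    return {"objects": list(speech_objects), "space_time": space_time}


def visual_scene_description(visual_objects: List[Any], space_time: Any) -> Descriptors:
    """Stand-in for the Visual Scene Description sub-AIM."""
    return {"objects": list(visual_objects), "space_time": space_time}


def audio_visual_alignment(audio: Descriptors, speech: Descriptors,
                           visual: Descriptors) -> Descriptors:
    """Placeholder for Audio-Visual Alignment: groups the three descriptor sets."""
    return {"audio_scene": audio, "speech_scene": speech, "visual_scene": visual}


def audio_visual_basic_scene_description(space_time: Any,
                                         audio_objects: List[Any],
                                         speech_objects: List[Any],
                                         visual_objects: List[Any]) -> Descriptors:
    """Composite AIM: run each sub-AIM, then align the results (assumed order)."""
    audio = audio_scene_description(audio_objects, space_time)
    speech = speech_scene_description(speech_objects, space_time)
    visual = visual_scene_description(visual_objects, space_time)
    return audio_visual_alignment(audio, speech, visual)
```

Expressing each sub-AIM as a separate function keeps the composite a thin orchestration layer, so any one sub-AIM can be replaced without touching the others.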

5     JSON Metadata

https://schemas.mpai.community/OSD/V1.1/AIMs/AudioVisualBasicSceneDescription.json
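The file name in the URL above appears to be the AIM name with spaces and hyphens removed. The sketch below reproduces that URL from the AIM name. The helper `metadata_url` is hypothetical, and applying the same pattern to other AIMs is an assumption: the resulting URLs are not guaranteed to resolve.

```python
import re

# Base path taken from the URL given for this AIM.
BASE_URL = "https://schemas.mpai.community/OSD/V1.1/AIMs/"


def metadata_url(aim_name: str) -> str:
    """Build a JSON metadata URL by removing spaces and hyphens from the AIM name."""
    parts = re.split(r"[\s\-]+", aim_name.strip())
    return BASE_URL + "".join(parts) + ".json"


print(metadata_url("Audio-Visual Basic Scene Description"))
```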

6     Profiles

No Profiles.