1 Definition
A Data Type including the Audio-Visual Scene’s Objects and Sub-Scenes and their arrangement in the Scene.
2 Functional Requirements
Audio-Visual Scene Descriptors include Scenes (as Sub-Scenes) in addition to Objects.
3 Syntax
https://schemas.mpai.community/OSD/V1.1/data/AudioVisualSceneDescriptors.json
4 Semantics
| Label | Size | Description |
| --- | --- | --- |
| Header | N1 Bytes | Audio-Visual Scene Descriptors Header |
| – Standard-AVSceneDescriptors | 9 Bytes | The characters “OSD-AVS-V” |
| – Version | N2 Bytes | Major version – 1 or 2 characters |
| – Dot-separator | 1 Byte | The character “.” |
| – Subversion | N3 Bytes | Minor version – 1 or 2 characters |
| MInstanceID | N4 Bytes | Identifier of M-Instance. |
| AVBasicSceneDescriptorsID | N5 Bytes | Identifier of the AV Object. |
| ObjectCount | N6 Bytes | Number of Objects in Scene |
| AVSceneSpaceTime | N7 Bytes | Data about Space and Time |
| AudioObjectsData[] | N8 Bytes | Set of Audio Objects |
| – AudioObject | N9 Bytes | ID of Audio Object |
| – AudioObjectSpaceTime | N10 Bytes | Space-Time of Audio Object |
| – AudioObjectPayload | N11 Bytes | Length in Bytes and URI of Audio Object Payload |
| SpeechObjectsData[] | N12 Bytes | Set of Speech Objects |
| – SpeechObject | N13 Bytes | Speech Object |
| – SpeechObjectSpaceTime | N14 Bytes | Space-Time of Speech Object |
| VisualObjectsData[] | N15 Bytes | Set of Visual Objects |
| – VisualObjectID | N16 Bytes | ID of Visual Object |
| – VisualObjectSpaceTime | N17 Bytes | Space-Time of Visual Object |
| – VisualObjectPayload | N18 Bytes | Length in Bytes and URI of Visual Object Payload |
| AudioVisualObjectsData[] | N19 Bytes | Set of Audio-Visual Objects |
| – AudioVisualObjectID | N20 Bytes | ID of Audio-Visual Object |
| – AudioObjectSpaceTime | N21 Bytes | Space-Time of Audio-Visual Object |
| SubSceneCount | N22 Bytes | Number of Sub-Scenes in Scene |
| SubSceneData[] | N23 Bytes | Set of Sub-Scenes |
| – SubSceneID | N24 Bytes | ID of Sub-Scene |
| – SubSceneSpaceTime | N25 Bytes | Space-Time of Sub-Scene |
| – Payload | N26 Bytes | Length in Bytes and URI of Sub-Scene Payload |
| DescrMetadata | N27 Bytes | Descriptive Metadata |
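
The Header rows above describe a short string: the nine characters “OSD-AVS-V”, a one- or two-character major version, a “.” separator, and a one- or two-character minor version. The following is a minimal Python sketch of how such a header might be built and checked. The names `parse_header`, `build_header`, and `AVSHeader` are hypothetical and not part of the specification. The sketch also assumes that the version characters are decimal digits, which the table does not state explicitly.

```python
import re
from typing import NamedTuple

# Layout taken from the Semantics table: "OSD-AVS-V" + major + "." + minor.
# Assumption: major and minor versions are 1 or 2 ASCII decimal digits;
# the table only says "1 or 2 characters".
HEADER_RE = re.compile(r"OSD-AVS-V([0-9]{1,2})\.([0-9]{1,2})")


class AVSHeader(NamedTuple):
    """Hypothetical container for the two version fields of the header."""
    major: int
    minor: int


def parse_header(header: str) -> AVSHeader:
    """Split a header string into its major and minor version numbers."""
    match = HEADER_RE.fullmatch(header)
    if match is None:
        raise ValueError(f"not an AV Scene Descriptors header: {header!r}")
    return AVSHeader(int(match.group(1)), int(match.group(2)))


def build_header(major: int, minor: int) -> str:
    """Compose a header string from version numbers that fit in 2 digits."""
    if not (0 <= major <= 99 and 0 <= minor <= 99):
        raise ValueError("major and minor versions must be in the range 0..99")
    return f"OSD-AVS-V{major}.{minor}"
```

For example, `parse_header("OSD-AVS-V1.1")` yields major version 1 and minor version 1, matching the V1.1 in the schema URL of the Syntax section. A string with a three-digit version or a different prefix raises `ValueError`.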