1 Definition
A Data Type including the Objects of an Audio-Visual Scene and their arrangement in the Scene.
2 Functional Requirements
Audio-Visual Basic Scene Descriptors include:
- The ID of a Virtual Space where the Audio-Visual Basic Scene is or will be located.
- The ID of the Audio-Visual Basic Scene Descriptors.
- The number of:
  - Speech Objects in the Audio-Visual Basic Scene.
  - Audio Objects in the Audio-Visual Basic Scene.
  - Visual Objects in the Audio-Visual Basic Scene.
  - Audio-Visual Objects in the Audio-Visual Basic Scene.
- The Audio-Visual Basic Scene Space-Time info.
- The Speech Objects including, for each Speech Object:
  - The Speech Object Space-Time.
  - The Speech Object.
- The Audio Objects including, for each Audio Object:
  - The Audio Object Space-Time.
  - The Audio Object.
- The Visual Objects including, for each Visual Object:
  - The Visual Object Space-Time.
  - The Visual Object.
- The Audio-Visual Objects including, for each Audio-Visual Object:
  - The Audio-Visual Object Space-Time.
  - The Audio-Visual Object.
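The requirements listed above can be pictured as a small in-memory structure. The following is only an illustrative sketch, not the normative representation (that is the JSON schema referenced under Syntax). The class names `SceneObject` and `AVBasicSceneDescriptors`, the field names, and the use of a plain dictionary for Space-Time are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class SceneObject:
    """One Object together with its Space-Time (hypothetical container)."""
    space_time: Dict[str, Any]  # Space-Time of the Object; internal layout not modeled
    obj: Any                    # the Object itself, or an identifier referring to it


@dataclass
class AVBasicSceneDescriptors:
    """Illustrative model of the listed requirements (names are assumptions)."""
    virtual_space_id: str       # ID of the Virtual Space where the scene is located
    descriptors_id: str         # ID of the Audio-Visual Basic Scene Descriptors
    space_time: Dict[str, Any]  # Space-Time information of the scene as a whole
    speech_objects: List[SceneObject] = field(default_factory=list)
    audio_objects: List[SceneObject] = field(default_factory=list)
    visual_objects: List[SceneObject] = field(default_factory=list)
    audio_visual_objects: List[SceneObject] = field(default_factory=list)

    def counts(self) -> Dict[str, int]:
        """Derive the four object counts from the lists so they cannot disagree."""
        return {
            "speech": len(self.speech_objects),
            "audio": len(self.audio_objects),
            "visual": len(self.visual_objects),
            "audio_visual": len(self.audio_visual_objects),
        }
```

Deriving the counts from the lists, rather than storing them separately, is a design choice of this sketch: it keeps each count consistent with the corresponding set of Objects.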
3 Syntax
https://schemas.mpai.community/OSD/V1.1/data/AudioVisualBasicSceneDescriptors.json
4 Semantics
| Label | Size | Description |
| --- | --- | --- |
| Header | N1 Bytes | Audio-Visual Basic Scene Descriptors Header |
| – Standard-AVScene | 9 Bytes | The characters “OSD-AVB-V” |
| – Version | N2 Bytes | Major version – 1 or 2 characters |
| – Dot-separator | 1 Byte | The character “.” |
| – Subversion | N3 Bytes | Minor version – 1 or 2 characters |
| MInstanceID | N4 Bytes | Identifier of M-Instance. |
| AVBasicSceneDescriptorsID | N5 Bytes | Identifier of the AV Basic Scene Descriptors. |
| AVBasicSceneSpaceTime | N7 Bytes | Data about AVScene’s Space and Time |
| AudioObjectCount | N6 Bytes | Number of Audio Objects in Scene |
| AudioObjectsData[] | N8 Bytes | Set of Audio Objects |
| – AudioObjectID and/or Object | N9 Bytes | Audio Object ID and/or Object |
| – AudioObjectSpaceTime | N10 Bytes | Space-Time of Audio Object |
| SpeechObjectCount | N6 Bytes | Number of Speech Objects in Scene |
| SpeechObjectsData[] | N11 Bytes | Set of Speech Objects |
| – SpeechObjectID and/or Object | N12 Bytes | Speech Object ID and/or Object |
| – SpeechObjectSpaceTime | N13 Bytes | Space-Time of Speech Object |
| VisualObjectCount | N6 Bytes | Number of Visual Objects in Scene |
| VisualObjectsData[] | N14 Bytes | Set of Visual Objects |
| – VisualObjectID and/or Object | N15 Bytes | Visual Object ID and/or Object |
| – VisualObjectSpaceTime | N16 Bytes | Space-Time of Visual Object |
| AudioVisualObjectCount | N6 Bytes | Number of Audio-Visual Objects in Scene |
| AudioVisualObjectsData[] | N17 Bytes | Set of Audio-Visual Objects |
| – AudioVisualObjectID and/or Object | N18 Bytes | Audio-Visual Object ID and/or Object |
| – AudioVisualObjectSpaceTime | N19 Bytes | Space-Time of Audio-Visual Object |
| DescrMetadata | N20 Bytes | Descriptive Metadata |
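
The Header rows in the table describe a short string: the nine characters "OSD-AVB-V", a major version of 1 or 2 characters, a "." separator, and a minor version of 1 or 2 characters. A minimal sketch of building and checking such a string follows. It assumes the version characters are decimal digits, which the table does not state explicitly; the function names `build_header` and `parse_header` are hypothetical.

```python
import re
from typing import Tuple

# Prefix taken from the Standard-AVScene row of the table.
PREFIX = "OSD-AVB-V"

# Assumption: major and minor versions are 1 or 2 decimal digits each.
HEADER_RE = re.compile(re.escape(PREFIX) + r"([0-9]{1,2})\.([0-9]{1,2})")


def build_header(major: int, minor: int) -> str:
    """Return a header string such as 'OSD-AVB-V1.1'."""
    if not (0 <= major <= 99 and 0 <= minor <= 99):
        raise ValueError("major and minor must each fit in 1 or 2 digits")
    return f"{PREFIX}{major}.{minor}"


def parse_header(header: str) -> Tuple[int, int]:
    """Split a header string into (major, minor); raise ValueError if malformed."""
    match = HEADER_RE.fullmatch(header)
    if match is None:
        raise ValueError(f"not a valid header: {header!r}")
    return int(match.group(1)), int(match.group(2))
```

For example, `parse_header(build_header(1, 1))` yields `(1, 1)`, while a string with a three-digit version or a different prefix is rejected.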