1 Definition
AudioVisual Scene Descriptors is a Data Type that includes the AudioVisual Objects of a Scene, its Sub-Scenes, and their arrangement in the Scene. AudioVisual Scene Descriptors may be hierarchical, i.e., an instance may contain both Objects and other AudioVisual Scene Descriptors.
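The hierarchical structure can be illustrated with a short Python sketch that measures how deeply AudioVisual Sub-Scenes are nested. The key names `SubAudioVisualScenes` and `SubAudioVisualScene` are taken from the Semantics table; everything else is an assumption made for illustration: that an inline Sub-Scene is a JSON object (a Python `dict`), that a Sub-Scene given by ID is a string, and the function name `nesting_depth` itself.

```python
from typing import Any


def nesting_depth(scene: dict[str, Any]) -> int:
    """Return the nesting depth of AudioVisual Sub-Scenes (1 for a flat scene)."""
    deepest_child = 0
    for entry in scene.get("SubAudioVisualScenes", []):
        sub = entry.get("SubAudioVisualScene")
        # Assumption: a Sub-Scene given only by its ID is a string and
        # cannot be descended into without resolving the ID first.
        if isinstance(sub, dict):
            deepest_child = max(deepest_child, nesting_depth(sub))
    return 1 + deepest_child


# Hypothetical instances; the IDs are made up for this example.
inner = {"AudioVisualSceneDescriptorsID": "scene-inner"}
middle = {
    "AudioVisualSceneDescriptorsID": "scene-middle",
    "SubAudioVisualScenes": [{"SubAudioVisualScene": inner}],
}
outer = {
    "AudioVisualSceneDescriptorsID": "scene-outer",
    "SubAudioVisualScenes": [
        {"SubAudioVisualScene": middle},
        {"SubAudioVisualScene": "scene-referenced-by-id"},
    ],
}

print(nesting_depth(outer))  # 3
```

Sub-Scenes referenced by ID are counted as leaves here; an application that can resolve IDs would look them up before descending.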
2 Functional Requirements
An AudioVisual Scene Descriptors instance must include:
- The AudioVisual Scene Descriptors Header.
- The ID of the AudioVisual Scene Descriptors instance.
- The Space/Time of the AudioVisual Scene Descriptors instance.
An AudioVisual Scene Descriptors instance may include:
- The ID of a Virtual Space (M-Instance) where it is or is intended to be located.
- The ID of a U-Environment (Real Space) where it is or is intended to be located.
- The Time when this instance was produced.
- Audio, Speech, Visual, 3D Model, and AudioVisual Objects in the AudioVisual Scene, each with their Space/Time.
- Audio, Speech, Visual, 3D Model, and AudioVisual Sub-Scenes in the AudioVisual Scene, each with their Space/Time.
- A mixed array of Basic AudioVisual Scene Descriptors and AudioVisual Scene Descriptors.
- MPAI-PTF Data Exchange Metadata.
- Descriptive Metadata.
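As a rough illustration of the mandatory/optional split above, the following Python sketch reports which mandatory elements are absent from an instance. The three key names come from the Semantics table. The Header value, the ID, and the empty Space/Time placeholder are hypothetical, since their actual formats are defined by the JSON Schema and the referenced Data Types, not by this sketch.

```python
# Mandatory elements, using the labels from the Semantics table.
REQUIRED_KEYS = (
    "Header",
    "AudioVisualSceneDescriptorsID",
    "AudioVisualSceneDescriptorsSpaceTime",
)


def missing_required(instance: dict) -> list[str]:
    """Return the mandatory keys that are absent from the instance."""
    return [key for key in REQUIRED_KEYS if key not in instance]


# Hypothetical minimal instance: all values are placeholders.
minimal = {
    "Header": "OSD-MSD-V1.5",                    # illustrative version string
    "AudioVisualSceneDescriptorsID": "scene-001",
    "AudioVisualSceneDescriptorsSpaceTime": {},  # real structure defined elsewhere
}

print(missing_required(minimal))                     # []
print(missing_required({"Header": "OSD-MSD-V1.5"}))
```

This only checks presence; whether each value is well formed is a matter for schema validation.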
3 Syntax
https://schemas.mpai.community/OSD/V1.5/data/AudioVisualSceneDescriptors.json
4 Semantics
| Label | Description |
| --- | --- |
| Header | AudioVisual Scene Descriptors Header – Standard “OSD‑MSD‑Vx.y”. |
| MInstanceID | Identifier of the M-Instance (Virtual Space) where this AudioVisual Scene is or is intended to be located. |
| UEnvironmentID | Identifier of the U-Environment (Real Space) where this AudioVisual Scene is or is intended to be located. |
| AudioVisualSceneDescriptorsID | Unique identifier of this AudioVisual Scene Descriptors instance. |
| AudioVisualSceneDescriptorsTime | Time this AudioVisual Scene Descriptors instance was produced. |
| AudioVisualSceneDescriptorsSpaceTime | Space/Time where/when this AudioVisual Scene Descriptors instance is located. |
| AudioObjectCount | Number of Audio Objects in the AudioVisual Scene. |
| AudioObjects[] | Set of Audio Objects in the AudioVisual Scene. |
| – AudioObjectSpaceTime | Space/Time where/when this Audio Object is located within the AudioVisual Scene. |
| – AudioObject | Either an Audio Object or the ID of an Audio Object. |
| SpeechObjectCount | Number of Speech Objects in the AudioVisual Scene. |
| SpeechObjects[] | Set of Speech Objects in the AudioVisual Scene. |
| – SpeechObjectSpaceTime | Space/Time where/when this Speech Object is located within the AudioVisual Scene. |
| – SpeechObject | Either a Speech Object or the ID of a Speech Object. |
| VisualObjectCount | Number of Visual Objects in the AudioVisual Scene. |
| VisualObjects[] | Set of Visual Objects in the AudioVisual Scene. |
| – VisualObjectSpaceTime | Space/Time where/when this Visual Object is located within the AudioVisual Scene. |
| – VisualObject | Either a Visual Object or the ID of a Visual Object. |
| 3DModelObjectCount | Number of 3D Model Objects in the AudioVisual Scene. |
| 3DModelObjects[] | Set of 3D Model Objects in the AudioVisual Scene. |
| – 3DModelObjectSpaceTime | Space/Time where/when this 3D Model Object is located within the AudioVisual Scene. |
| – 3DModelObject | Either a 3D Model Object or the ID of a 3D Model Object. |
| AudioVisualObjectCount | Number of AudioVisual Objects in the AudioVisual Scene. |
| AudioVisualObjects[] | Set of AudioVisual Objects in the AudioVisual Scene. |
| – AudioVisualObjectSpaceTime | Space/Time where/when this AudioVisual Object is located within the AudioVisual Scene. |
| – AudioVisualObject | Either an AudioVisual Object or the ID of an AudioVisual Object. |
| SubAudioSceneCount | Number of Audio Sub-Scenes in the AudioVisual Scene. |
| SubAudioScenes[] | Set of Audio Sub-Scenes in the AudioVisual Scene. |
| – SubAudioSceneSpaceTime | Space/Time where/when this Audio Sub-Scene is located within the AudioVisual Scene. |
| – SubAudioScene | Either an Audio Scene Descriptors instance or its ID. |
| SubSpeechSceneCount | Number of Speech Sub-Scenes in the AudioVisual Scene. |
| SubSpeechScenes[] | Set of Speech Sub-Scenes in the AudioVisual Scene. |
| – SubSpeechSceneSpaceTime | Space/Time where/when this Speech Sub-Scene is located within the AudioVisual Scene. |
| – SubSpeechScene | Either a Speech Scene Descriptors instance or its ID. |
| SubVisualSceneCount | Number of Visual Sub-Scenes in the AudioVisual Scene. |
| SubVisualScenes[] | Set of Visual Sub-Scenes in the AudioVisual Scene. |
| – SubVisualSceneSpaceTime | Space/Time where/when this Visual Sub-Scene is located within the AudioVisual Scene. |
| – SubVisualScene | Either a Visual Scene Descriptors instance or its ID. |
| Sub3DModelSceneCount | Number of 3D Model Sub-Scenes in the AudioVisual Scene. |
| Sub3DModelScenes[] | Set of 3D Model Sub-Scenes in the AudioVisual Scene. |
| – Sub3DModelSceneSpaceTime | Space/Time where/when this 3D Model Sub-Scene is located within the AudioVisual Scene. |
| – Sub3DModelScene | Either a 3D Model Scene Descriptors instance or its ID. |
| SubAudioVisualSceneCount | Number of AudioVisual Sub-Scenes in the AudioVisual Scene. |
| SubAudioVisualScenes[] | Set of AudioVisual Sub-Scenes in the AudioVisual Scene. |
| – SubAudioVisualSceneSpaceTime | Space/Time where/when this AudioVisual Sub-Scene is located within the AudioVisual Scene. |
| – SubAudioVisualScene | Either an AudioVisual Scene Descriptors instance or its ID. |
| BasicAudioVisualSceneOrAudioVisualScene | A mixed array of Basic AudioVisual Scene Descriptors and AudioVisual Scene Descriptors. |
| DataXMData | MPAI-PTF Data Exchange Metadata providing provenance, authorisation, legal, security, and confidence information associated with this AudioVisual Scene Descriptors instance in a trusted data exchange context. |
| DescrMetadata | Human-readable descriptive metadata of the AudioVisual Scene Descriptors instance (plain text, max 2048 characters). |
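Each array in the table is paired with a count field. One plausible consistency check, sketched below in Python, is that a declared count equals the length of its array. This is an interpretation of the table, not a rule stated in this specification, and the sample values (IDs, empty Space/Time objects) are hypothetical.

```python
# Count field -> array field, using the labels from the Semantics table.
COUNTED_ARRAYS = {
    "AudioObjectCount": "AudioObjects",
    "SpeechObjectCount": "SpeechObjects",
    "VisualObjectCount": "VisualObjects",
    "3DModelObjectCount": "3DModelObjects",
    "AudioVisualObjectCount": "AudioVisualObjects",
    "SubAudioSceneCount": "SubAudioScenes",
    "SubSpeechSceneCount": "SubSpeechScenes",
    "SubVisualSceneCount": "SubVisualScenes",
    "Sub3DModelSceneCount": "Sub3DModelScenes",
    "SubAudioVisualSceneCount": "SubAudioVisualScenes",
}


def count_mismatches(scene: dict) -> list[str]:
    """List every count field that disagrees with the length of its array."""
    problems = []
    for count_key, array_key in COUNTED_ARRAYS.items():
        if count_key not in scene and array_key not in scene:
            continue  # optional pair not used in this instance
        declared = scene.get(count_key)
        actual = len(scene.get(array_key, []))
        if declared != actual:
            problems.append(
                f"{count_key}={declared!r} but {array_key} has {actual} item(s)"
            )
    return problems


# Hypothetical scene: one Audio Object given inline, one given by ID.
scene = {
    "AudioObjectCount": 2,
    "AudioObjects": [
        {"AudioObjectSpaceTime": {}, "AudioObject": {"AudioObjectID": "a-1"}},
        {"AudioObjectSpaceTime": {}, "AudioObject": "a-2"},
    ],
    "VisualObjectCount": 1,
    "VisualObjects": [],
}

print(count_mismatches(scene))
# ['VisualObjectCount=1 but VisualObjects has 0 item(s)']
```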
5 Conformance Testing
A Data instance conforms with AudioVisual Scene Descriptors (OSD‑MSD) if:
- The Data validates against the AudioVisual Scene Descriptors’ JSON Schema.
- All Data in the AudioVisual Scene Descriptors’ JSON Schema:
  - Have the specified type.
  - Validate against their JSON Schemas.
  - Conform with their Data Qualifiers.
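The "have the specified type" condition can be sketched with a small standard-library type checker. This is not a JSON Schema validator: it handles only the `type`, `required`, `properties`, and `items` keywords, and the `FRAGMENT` schema below is a hypothetical stand-in whose types are guesses, not the published schema. An actual conformance test would validate against the schema at the URL given in the Syntax section with a full JSON Schema validator.

```python
# JSON Schema type names mapped to Python types (subset).
JSON_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
}


def type_errors(data, schema: dict, path: str = "$") -> list[str]:
    """Check data against a small subset of JSON Schema keywords."""
    expected = schema.get("type")
    if expected is not None:
        wrong = not isinstance(data, JSON_TYPES[expected])
        # bool is a subclass of int in Python, but JSON keeps them distinct.
        if expected in ("integer", "number") and isinstance(data, bool):
            wrong = True
        if wrong:
            return [f"{path}: expected {expected}"]
    errors = []
    if isinstance(data, dict):
        for key in schema.get("required", []):
            if key not in data:
                errors.append(f"{path}: missing required property {key!r}")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in data:
                errors.extend(type_errors(data[key], sub_schema, f"{path}.{key}"))
    if isinstance(data, list) and "items" in schema:
        for index, item in enumerate(data):
            errors.extend(type_errors(item, schema["items"], f"{path}[{index}]"))
    return errors


# Hypothetical schema fragment; the types are assumptions for illustration.
FRAGMENT = {
    "type": "object",
    "required": ["Header", "AudioVisualSceneDescriptorsID"],
    "properties": {
        "Header": {"type": "string"},
        "AudioVisualSceneDescriptorsID": {"type": "string"},
        "AudioObjectCount": {"type": "integer"},
        "AudioObjects": {"type": "array", "items": {"type": "object"}},
    },
}

candidate = {
    "Header": "OSD-MSD-V1.5",
    "AudioObjectCount": "2",
    "AudioObjects": [{}, "oops"],
}

for error in type_errors(candidate, FRAGMENT):
    print(error)
# $: missing required property 'AudioVisualSceneDescriptorsID'
# $.AudioObjectCount: expected integer
# $.AudioObjects[1]: expected object
```

Conformance with Data Qualifiers is outside what a type check can establish and is not covered by this sketch.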
6 Performance Assessment
Not part of this specification.