Go to MPAI-OSD V1.5 Data Types

Definition
Functional Requirements
Syntax
Semantics
Conformance Testing
Performance Assessment

1      Definition

AudioVisual Scene Descriptors is a Data Type that describes an AudioVisual Scene: the Objects it contains, its Sub-Scenes, and the arrangement of both in the Scene. AudioVisual Scene Descriptors may be hierarchical, i.e., an instance may contain both Objects and other AudioVisual Scene Descriptors instances.
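The hierarchical nature of the Data Type can be illustrated with a small sketch that walks nested Sub-Scenes. It assumes an instance is held as a Python dictionary keyed by the labels of the Semantics table, that an ID is a plain string, and that a Sub-Scene given only by ID is not resolved. The function name `scene_depth` and all IDs are hypothetical.

```python
from typing import Any


def scene_depth(scene: dict[str, Any]) -> int:
    """Return the nesting depth of an AudioVisual Scene Descriptors instance.

    A scene without inline AudioVisual Sub-Scenes has depth 1. Sub-Scenes
    given only by ID (assumed here to be strings) are not followed.
    """
    deepest_child = 0
    for entry in scene.get("SubAudioVisualScenes", []):
        sub = entry.get("SubAudioVisualScene")
        if isinstance(sub, dict):  # inline instance rather than an ID
            deepest_child = max(deepest_child, scene_depth(sub))
    return 1 + deepest_child


# Hypothetical example: the IDs and the layout are illustrative only.
room = {
    "AudioVisualSceneDescriptorsID": "scene-room",
    "SubAudioVisualScenes": [
        {
            "SubAudioVisualScene": {
                "AudioVisualSceneDescriptorsID": "scene-table",
                "SubAudioVisualScenes": [
                    {"SubAudioVisualScene": "scene-cup"}  # referenced by ID only
                ],
            }
        },
    ],
}

print(scene_depth(room))  # 2
```

The same traversal pattern would apply to the other Sub-Scene arrays; only the AudioVisual one is followed here to keep the sketch short.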

2      Functional Requirements

An AudioVisual Scene Descriptors instance must include:

  1. The AudioVisual Scene Descriptors Header.
  2. The ID of the AudioVisual Scene Descriptors instance.
  3. The Space/Time of the AudioVisual Scene Descriptors instance.

An AudioVisual Scene Descriptors instance may include:

  1. The ID of a Virtual Space (M-Instance) where it is or is intended to be located.
  2. The ID of a U-Environment (Real Space) where it is or is intended to be located.
  3. The Time when the AudioVisual Scene Descriptors instance was produced.
  4. Audio, Speech, Visual, 3D Model, and AudioVisual Objects in the AudioVisual Scene, each with their Space/Time.
  5. Audio, Speech, Visual, 3D Model, and AudioVisual Sub-Scenes in the AudioVisual Scene, each with their Space/Time.
  6. A mixed array of Basic AudioVisual Scene Descriptors and AudioVisual Scene Descriptors.
  7. MPAI-PTF Data Exchange Metadata.
  8. Descriptive Metadata.
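As a minimal sketch of the mandatory part of these requirements, the check below reports which of the three required elements are absent from an instance. It assumes the instance is a Python dictionary keyed by the labels of the Semantics table. The helper name `missing_required`, the Header value, and the empty Space/Time placeholder are assumptions made for illustration, not values taken from the JSON Schema.

```python
# Labels from the Semantics table that correspond to the three
# mandatory elements (Header, ID, Space/Time).
REQUIRED_LABELS = (
    "Header",
    "AudioVisualSceneDescriptorsID",
    "AudioVisualSceneDescriptorsSpaceTime",
)


def missing_required(instance: dict) -> list[str]:
    """Return the required labels that are absent from the instance."""
    return [label for label in REQUIRED_LABELS if label not in instance]


# Hypothetical instances: all values are placeholders.
minimal = {
    "Header": "OSD-MSD-V1.5",                    # assumed version string
    "AudioVisualSceneDescriptorsID": "scene-001",
    "AudioVisualSceneDescriptorsSpaceTime": {},  # Space/Time content omitted
}
incomplete = {"Header": "OSD-MSD-V1.5"}

print(missing_required(minimal))     # []
print(missing_required(incomplete))
# ['AudioVisualSceneDescriptorsID', 'AudioVisualSceneDescriptorsSpaceTime']
```

A presence check of this kind does not replace validation against the JSON Schema; it only mirrors the "must include" list above.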

3      Syntax

https://schemas.mpai.community/OSD/V1.5/data/AudioVisualSceneDescriptors.json

4      Semantics

Table 1 – Semantics of the AudioVisual Scene Descriptors Data Type

Label Description
Header AudioVisual Scene Descriptors Header – Standard “OSD‑MSD‑Vx.y”.
MInstanceID Identifier of the M-Instance (Virtual Space) where this AudioVisual Scene is or is intended to be located.
UEnvironmentID Identifier of the U-Environment (Real Space) where this AudioVisual Scene is or is intended to be located.
AudioVisualSceneDescriptorsID Unique identifier of this AudioVisual Scene Descriptors instance.
AudioVisualSceneDescriptorsTime Time this AudioVisual Scene Descriptors instance was produced.
AudioVisualSceneDescriptorsSpaceTime Space/Time where/when this AudioVisual Scene Descriptors instance is located.
AudioObjectCount Number of Audio Objects in the AudioVisual Scene.
AudioObjects[] Set of Audio Objects in the AudioVisual Scene.
    – AudioObjectSpaceTime Space/Time where/when this Audio Object is located within the AudioVisual Scene.
    – AudioObject Either an Audio Object or the ID of an Audio Object.
SpeechObjectCount Number of Speech Objects in the AudioVisual Scene.
SpeechObjects[] Set of Speech Objects in the AudioVisual Scene.
    – SpeechObjectSpaceTime Space/Time where/when this Speech Object is located within the AudioVisual Scene.
    – SpeechObject Either a Speech Object or the ID of a Speech Object.
VisualObjectCount Number of Visual Objects in the AudioVisual Scene.
VisualObjects[] Set of Visual Objects in the AudioVisual Scene.
    – VisualObjectSpaceTime Space/Time where/when this Visual Object is located within the AudioVisual Scene.
    – VisualObject Either a Visual Object or the ID of a Visual Object.
3DModelObjectCount Number of 3D Model Objects in the AudioVisual Scene.
3DModelObjects[] Set of 3D Model Objects in the AudioVisual Scene.
    – 3DModelObjectSpaceTime Space/Time where/when this 3D Model Object is located within the AudioVisual Scene.
    – 3DModelObject Either a 3D Model Object or the ID of a 3D Model Object.
AudioVisualObjectCount Number of AudioVisual Objects in the AudioVisual Scene.
AudioVisualObjects[] Set of AudioVisual Objects in the AudioVisual Scene.
    – AudioVisualObjectSpaceTime Space/Time where/when this AudioVisual Object is located within the AudioVisual Scene.
    – AudioVisualObject Either an AudioVisual Object or the ID of an AudioVisual Object.
SubAudioSceneCount Number of Audio Sub-Scenes in the AudioVisual Scene.
SubAudioScenes[] Set of Audio Sub-Scenes in the AudioVisual Scene.
    – SubAudioSceneSpaceTime Space/Time where/when this Audio Sub-Scene is located within the AudioVisual Scene.
    – SubAudioScene Either an Audio Scene Descriptors instance or its ID.
SubSpeechSceneCount Number of Speech Sub-Scenes in the AudioVisual Scene.
SubSpeechScenes[] Set of Speech Sub-Scenes in the AudioVisual Scene.
    – SubSpeechSceneSpaceTime Space/Time where/when this Speech Sub-Scene is located within the AudioVisual Scene.
    – SubSpeechScene Either a Speech Scene Descriptors instance or its ID.
SubVisualSceneCount Number of Visual Sub-Scenes in the AudioVisual Scene.
SubVisualScenes[] Set of Visual Sub-Scenes in the AudioVisual Scene.
    – SubVisualSceneSpaceTime Space/Time where/when this Visual Sub-Scene is located within the AudioVisual Scene.
    – SubVisualScene Either a Visual Scene Descriptors instance or its ID.
Sub3DModelSceneCount Number of 3D Model Sub-Scenes in the AudioVisual Scene.
Sub3DModelScenes[] Set of 3D Model Sub-Scenes in the AudioVisual Scene.
    – Sub3DModelSceneSpaceTime Space/Time where/when this 3D Model Sub-Scene is located within the AudioVisual Scene.
    – Sub3DModelScene Either a 3D Model Scene Descriptors instance or its ID.
SubAudioVisualSceneCount Number of AudioVisual Sub-Scenes in the AudioVisual Scene.
SubAudioVisualScenes[] Set of AudioVisual Sub-Scenes in the AudioVisual Scene.
    – SubAudioVisualSceneSpaceTime Space/Time where/when this AudioVisual Sub-Scene is located within the AudioVisual Scene.
    – SubAudioVisualScene Either an AudioVisual Scene Descriptors instance or its ID.
BasicAudioVisualSceneOrAudioVisualScene A mixed array of Basic AudioVisual Scene Descriptors and AudioVisual Scene Descriptors.
DataXMData MPAI-PTF Data Exchange Metadata providing provenance, authorisation, legal, security, and confidence information associated with this AudioVisual Scene Descriptors instance in a trusted data exchange context.
DescrMetadata Human-readable descriptive metadata of the AudioVisual Scene Descriptors instance (plain text, max 2048 characters).
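Two properties in Table 1 lend themselves to a simple consistency check: each Count label sits next to an array label, and DescrMetadata is limited to 2048 characters. The sketch below assumes that a Count is meant to equal the length of its array (the table does not state this explicitly) and that the instance is a Python dictionary keyed by the table's labels. The names `count_label` and `semantic_issues` are hypothetical.

```python
# Array labels from Table 1; each has a companion "...Count" label.
ARRAY_LABELS = (
    "AudioObjects", "SpeechObjects", "VisualObjects", "3DModelObjects",
    "AudioVisualObjects", "SubAudioScenes", "SubSpeechScenes",
    "SubVisualScenes", "Sub3DModelScenes", "SubAudioVisualScenes",
)

MAX_DESCR_METADATA_CHARS = 2048  # limit stated for DescrMetadata


def count_label(array_label: str) -> str:
    """Derive the Count label, e.g. 'AudioObjects' -> 'AudioObjectCount'."""
    return array_label[:-1] + "Count"


def semantic_issues(instance: dict) -> list[str]:
    """List count/array mismatches and an over-long DescrMetadata."""
    issues = []
    for array_label in ARRAY_LABELS:
        items = instance.get(array_label, [])
        # Assumption: an absent Count is treated as consistent.
        declared = instance.get(count_label(array_label), len(items))
        if declared != len(items):
            issues.append(
                f"{count_label(array_label)}={declared} but "
                f"{array_label} has {len(items)} item(s)"
            )
    if len(instance.get("DescrMetadata", "")) > MAX_DESCR_METADATA_CHARS:
        issues.append("DescrMetadata exceeds 2048 characters")
    return issues


# Hypothetical instance: two Audio Objects declared, one present.
example = {
    "AudioObjectCount": 2,
    "AudioObjects": [{"AudioObject": "audio-001"}],
    "DescrMetadata": "Living room, evening recording.",
}

print(semantic_issues(example))
# ['AudioObjectCount=2 but AudioObjects has 1 item(s)']
```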

5      Conformance Testing

A Data instance conforms with AudioVisual Scene Descriptors (OSD‑MSD) if:

  1. The Data validates against the AudioVisual Scene Descriptors’ JSON Schema.
  2. All Data in the AudioVisual Scene Descriptors’ JSON Schema:
    1. Have the specified type.
    2. Validate against their JSON Schemas.
    3. Conform with their Data Qualifiers.
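The conformance conditions above can be organised as a list of named checks that are run against a Data instance. The sketch below is one possible arrangement, not a normative test procedure: the names `ConformanceReport` and `assess` are hypothetical, and the lambdas are stand-ins for real checks. In practice, condition 1 would be delegated to a JSON Schema validator, for example the `validate` function of the third-party `jsonschema` package.

```python
from dataclasses import dataclass, field
from typing import Callable

Check = Callable[[dict], bool]


@dataclass
class ConformanceReport:
    """Collects the names of the checks that an instance failed."""
    failed: list[str] = field(default_factory=list)

    @property
    def conforms(self) -> bool:
        return not self.failed


def assess(instance: dict, checks: dict[str, Check]) -> ConformanceReport:
    """Run every check and record the ones that return False."""
    report = ConformanceReport()
    for name, check in checks.items():
        if not check(instance):
            report.failed.append(name)
    return report


# Stand-in checks, keyed by the condition they loosely correspond to.
checks = {
    "1 schema validation": lambda d: "Header" in d,  # placeholder only
    "2.1 specified types": lambda d: isinstance(
        d.get("AudioVisualSceneDescriptorsID"), str
    ),
    "2.2 nested schemas": lambda d: True,    # not implemented in this sketch
    "2.3 data qualifiers": lambda d: True,   # not implemented in this sketch
}

# Hypothetical instance whose ID has the wrong type (assumed to be a string).
report = assess(
    {"Header": "OSD-MSD-V1.5", "AudioVisualSceneDescriptorsID": 7}, checks
)
print(report.conforms, report.failed)  # False ['2.1 specified types']
```

Keeping the checks as separate callables makes it possible to report which condition failed rather than a single pass or fail result.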

6      Performance Assessment

Not part of this specification.
