Go to MPAI-OSD V1.5 Data Types

Definition
Functional Requirements
Syntax
Semantics
Conformance Testing
Performance Assessment

1    Definition

Speech Scene Descriptors is a Data Type that includes the Speech Objects of a Speech Scene, its Speech Sub-Scenes, and their arrangement in the Scene. Speech Scene Descriptors may be hierarchical, i.e., an instance may contain both Objects and other Speech Scene Descriptors.

2    Functional Requirements

A Speech Scene Descriptors instance must include:

  1. The Speech Scene Descriptors Header.
  2. The ID of the Speech Scene Descriptors instance.
  3. The IDs of the Speech Objects in the Speech Scene.
  4. The IDs of the Speech Sub-Scenes in the Speech Scene.
  5. The Space-Time information of the Speech Objects and Speech Sub-Scenes in the Speech Scene.

A Speech Scene Descriptors instance may include:

  1. The ID of a virtual Space (M-Instance) where it is, or is intended to be, located.
  2. The ID of a U-Environment where it is, or is intended to be, located.
  3. The Space-Time information of the Speech Scene.
  4. Information about this Speech Scene Descriptors instance.
  5. Descriptive Metadata.
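
The mandatory items above can be illustrated with a minimal Python sketch that reports which required entries are absent from an instance. The key names are taken from Table 1, but treating them as top-level JSON keys, the mapping of requirements to keys, and the example values (such as the ID strings) are assumptions made for illustration; the JSON Schema referenced in the Syntax section remains the normative definition.

```python
# Key names come from Table 1. Treating them as top-level keys of the JSON
# object is an assumption; the normative structure is given by the JSON Schema.
REQUIRED_KEYS = (
    "Header",
    "SpeechSceneDescriptorsID",
    "SpeechSceneObjects",
    "SpeechSubScenes",
)


def missing_required(instance: dict) -> list:
    """Return the required keys that are absent from the instance, in order."""
    return [key for key in REQUIRED_KEYS if key not in instance]


# Hypothetical instance: all IDs and the header version are illustrative only.
example = {
    "Header": "OSD-SSD-V1.5",
    "SpeechSceneDescriptorsID": "ssd-001",
    "MInstanceID": "m-instance-42",  # optional item
    "SpeechSceneObjects": [],
    "SpeechSubScenes": [],
}
```

Calling `missing_required(example)` returns an empty list, while an instance carrying only a `Header` would have the three remaining keys reported.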

3    Syntax

https://schemas.mpai.community/OSD/V1.5/data/SpeechSceneDescriptors.json

4    Semantics

Table 1 – Semantics of the Speech Scene Descriptors Data Type

Label                               Description
Header                              Speech Scene Descriptors Header – Standard “OSD-SSD-Vx.y”
MInstanceID                         Identifier of the M-Instance.
UEnvironmentID                      Identifier of the U-Environment.
SpeechSceneDescriptorsID            ID of this Speech Scene Descriptors instance.
SpeechSceneDescriptorsTime          Space and Time of this Speech Scene Descriptors instance.
SpeechObjectCount                   Number of Speech Objects in the Speech Scene.
SpeechSubSceneCount                 Number of Speech Sub-Scenes in the Speech Scene.
SpeechSceneDescriptorsSpaceTime     Space and Time of the Speech Scene Descriptors instance.
SpeechSceneObjects[]                Set of Data related to Speech Objects in the Speech Scene.
– BasicSpeechSceneObjectSpaceTime   Space and Time of the Speech Object in the Speech Scene.
– BasicSpeechSceneObject            Basic Speech Object in the Scene.
SpeechSubScenes[]                   Set of Data related to Speech Sub-Scenes in the Scene.
– BasicSpeechSubSceneSpaceTime      Space and Time of the Speech Sub-Scene in the Scene.
DataXMData                          MPAI-PTF Data Exchange Metadata providing provenance, authorisation, legal, security, and confidence information associated with this Speech Scene Descriptors instance in a trusted data exchange context.
DescrMetadata                       Human-readable descriptive metadata of the Speech Scene Descriptors instance (plain text, max 2048 characters).
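
As an illustration of how some of the fields in Table 1 might be checked by an application, the following Python sketch parses the Header string and compares the two count fields with the lengths of the corresponding arrays. It assumes that x and y in “OSD-SSD-Vx.y” are decimal integers and that the counts are expected to match the array lengths; neither assumption is stated in this specification, and the function names are hypothetical.

```python
import re

# Assumption: x and y in "OSD-SSD-Vx.y" are non-negative decimal integers.
HEADER_PATTERN = re.compile(r"OSD-SSD-V(\d+)\.(\d+)")

# Limit stated in Table 1 for DescrMetadata.
MAX_DESCR_METADATA_CHARS = 2048


def parse_header(header: str) -> tuple:
    """Return (major, minor) parsed from the Header, or raise ValueError."""
    match = HEADER_PATTERN.fullmatch(header)
    if match is None:
        raise ValueError(f"not a Speech Scene Descriptors Header: {header!r}")
    return int(match.group(1)), int(match.group(2))


def counts_consistent(instance: dict) -> bool:
    """Check the count fields against the array lengths (an assumed rule).

    A count field that is absent is treated as consistent.
    """
    objects = instance.get("SpeechSceneObjects", [])
    sub_scenes = instance.get("SpeechSubScenes", [])
    return (
        instance.get("SpeechObjectCount", len(objects)) == len(objects)
        and instance.get("SpeechSubSceneCount", len(sub_scenes)) == len(sub_scenes)
    )


def descr_metadata_ok(instance: dict) -> bool:
    """Check that DescrMetadata, if present, is text within the stated limit."""
    text = instance.get("DescrMetadata", "")
    return isinstance(text, str) and len(text) <= MAX_DESCR_METADATA_CHARS
```

For example, `parse_header("OSD-SSD-V1.5")` yields `(1, 5)`, while the literal placeholder string `"OSD-SSD-Vx.y"` is rejected with a `ValueError`.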

5    Conformance Testing

A Data instance conforms with Speech Scene Descriptors (OSD-SSD) if:

  1. The Data validates against the Speech Scene Descriptors JSON Schema.
  2. All Data in the Speech Scene Descriptors JSON Schema:
    1. Have the specified type.
    2. Validate against their JSON Schemas.
    3. Conform with their Data Qualifiers.
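
The first two levels of the procedure above can be sketched in Python as a function that validates the instance against its schema and then validates each nested Data item against its own schema. The validator is passed in as a callable so that any JSON Schema validator can be plugged in; `type_only_validate` is a deliberately simplified stand-in that checks only the top-level `type` keyword, and the nested schemas in the usage note are hypothetical. Conformance with Data Qualifiers is not covered by this sketch.

```python
from typing import Any, Callable, Dict, List

# A validator takes (data, schema) and raises an exception on failure.
Validator = Callable[[Any, dict], None]

# Stand-in for illustration only: a full JSON Schema validator should be
# used in practice. Only the top-level "type" keyword is inspected.
_JSON_TYPES = {"object": dict, "array": list, "string": str}


def type_only_validate(data: Any, schema: dict) -> None:
    declared = schema.get("type")
    expected = _JSON_TYPES.get(declared) if isinstance(declared, str) else None
    if expected is not None and not isinstance(data, expected):
        raise TypeError(f"expected {declared}, got {type(data).__name__}")


def conformance_errors(
    instance: Any,
    schema: dict,
    nested_schemas: Dict[str, dict],
    validate: Validator,
) -> List[str]:
    """Return a list of error messages; an empty list means no error found."""
    try:
        # Step 1: the instance validates against its own JSON Schema.
        validate(instance, schema)
    except Exception as exc:
        return [f"top level: {exc}"]

    errors = []
    # Step 2: each nested Data item validates against its own JSON Schema.
    for label, sub_schema in nested_schemas.items():
        if label in instance:
            try:
                validate(instance[label], sub_schema)
            except Exception as exc:
                errors.append(f"{label}: {exc}")
    return errors
```

With `nested_schemas = {"SpeechSceneObjects": {"type": "array"}}`, an instance whose `SpeechSceneObjects` entry is a string produces a single error message that begins with the label of the offending item.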

6    Performance Assessment

Not part of this specification.
