Definition
Functional Requirements
Syntax
Semantics
Conformance Testing
Performance Assessment
1 Definition
Speech Scene Descriptors is a Data Type that includes the Speech Objects of a Speech Scene, its Speech Sub-Scenes, and their arrangement in the Scene. Speech Scene Descriptors may be hierarchical, i.e., they may contain both Speech Objects and other Speech Scene Descriptors.
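The hierarchical structure can be illustrated with a short recursive traversal. This is a minimal sketch, not part of the specification: the keys `SpeechSceneObjects` and `SpeechSubScenes` are taken from the Semantics table, while the assumption that each Sub-Scene entry carries the same two keys, and all identifier values, are hypothetical.

```python
from typing import Any


def count_speech_objects(scene: dict[str, Any]) -> int:
    """Count Speech Objects in a scene and, recursively, in its Sub-Scenes."""
    own = len(scene.get("SpeechSceneObjects", []))
    nested = sum(
        count_speech_objects(sub) for sub in scene.get("SpeechSubScenes", [])
    )
    return own + nested


# Hypothetical scene: two objects at the top level, one in a Sub-Scene.
scene = {
    "SpeechSceneObjects": [
        {"BasicSpeechSceneObject": "obj-1"},
        {"BasicSpeechSceneObject": "obj-2"},
    ],
    "SpeechSubScenes": [
        {
            # Assumption: a Sub-Scene entry is shaped like a scene.
            "SpeechSceneObjects": [{"BasicSpeechSceneObject": "obj-3"}],
            "SpeechSubScenes": [],
        }
    ],
}

total = count_speech_objects(scene)
print(total)  # 3
```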
2 Functional Requirements
A Speech Scene Descriptors instance must include:
- The Speech Scene Descriptors Header.
- The ID of the Speech Scene Descriptors instance.
- The ID of the Speech Objects in the Speech Scene.
- The ID of the Speech Sub-Scenes in the Speech Scene.
- The Space-Time information of the Speech Objects and Speech Sub-Scenes in the Speech Scene.
A Speech Scene Descriptors instance may include:
- The ID of a virtual Space (M-Instance) where it is, or is intended to be, located.
- The ID of a U-Environment where it is, or is intended to be, located.
- The Space-Time information of the Speech Scene.
- Information about this Speech Scene Descriptors instance.
- Descriptive Metadata.
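The split between mandatory and optional elements can be sketched as a simple presence check. The mapping from each requirement to a label of the Semantics table is an assumption made for illustration; the published JSON Schema remains the authoritative source for which properties are required. The identifier and the Header value are made-up examples.

```python
# Assumed mapping of the "must include" items to Semantics labels.
REQUIRED = (
    "Header",
    "SpeechSceneDescriptorsID",
    "SpeechSceneObjects",
    "SpeechSubScenes",
)

# Assumed mapping of the "may include" items to Semantics labels.
OPTIONAL = (
    "MInstanceID",
    "UEnvironmentID",
    "SpeechSceneDescriptorsSpaceTime",
    "DataXMData",
    "DescrMetadata",
)


def missing_required(instance: dict) -> list[str]:
    """Return the required labels that are absent from the instance."""
    return [label for label in REQUIRED if label not in instance]


# Hypothetical instance that omits the Sub-Scene list.
instance = {
    "Header": "OSD-SSD-V1.5",              # assumed example value
    "SpeechSceneDescriptorsID": "ssd-001",  # made-up identifier
    "SpeechSceneObjects": [],
}

missing = missing_required(instance)
print(missing)  # ['SpeechSubScenes']
```

Optional labels are listed only for reference; their absence is never reported.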
3 Syntax
https://schemas.mpai.community/OSD/V1.5/data/SpeechSceneDescriptors.json
4 Semantics
| Label | Description |
| Header | Speech Scene Descriptors Header – Standard “OSD-SSD-Vx.y” |
| MInstanceID | Identifier of the M-Instance. |
| UEnvironmentID | Identifier of the U-Environment. |
| SpeechSceneDescriptorsID | ID of this Speech Scene Descriptors instance. |
| SpeechSceneDescriptorsTime | Space and Time of this Speech Scene Descriptors instance. |
| SpeechObjectCount | Number of Speech Objects in the Speech Scene. |
| SpeechSubSceneCount | Number of Speech Sub-Scenes in the Speech Scene. |
| SpeechSceneDescriptorsSpaceTime | Space and Time of Speech Scene Descriptors instance. |
| SpeechSceneObjects[] | Set of Data related to Speech Objects in the Speech Scene. |
| – BasicSpeechSceneObjectSpaceTime | Space and Time of Speech Object in the Speech Scene. |
| – BasicSpeechSceneObject | Basic Speech Object in the Scene. |
| SpeechSubScenes[] | Set of Data related to Speech Sub-Scenes in the Scene. |
| – BasicSpeechSubSceneSpaceTime | Space and Time of Speech Sub-Scene in the Scene. |
| DataXMData | MPAI-PTF Data Exchange Metadata providing provenance, authorisation, legal, security, and confidence information associated with this Speech Scene Descriptors instance in a trusted data exchange context. |
| DescrMetadata | Human-readable descriptive metadata of the Speech Scene Descriptors instance (plain text, max 2048 characters). |
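Some of the constraints stated in the table can be checked mechanically. The sketch below is illustrative only and makes several assumptions: that "x.y" in the Header stands for two decimal integers, that each count equals the length of the corresponding array, and that the 2048-character limit is measured in Python string length. None of these is confirmed by the table itself.

```python
import re

# Assumption: "Vx.y" means a major and a minor decimal version number.
HEADER_RE = re.compile(r"OSD-SSD-V\d+\.\d+")
MAX_DESCR_CHARS = 2048  # limit stated for DescrMetadata


def semantic_problems(instance: dict) -> list[str]:
    """Return the labels whose values break one of the assumed constraints."""
    problems = []
    if not HEADER_RE.fullmatch(str(instance.get("Header", ""))):
        problems.append("Header")
    if len(instance.get("DescrMetadata", "")) > MAX_DESCR_CHARS:
        problems.append("DescrMetadata")
    # Assumption: a count, when present, matches the length of its array.
    pairs = (
        ("SpeechObjectCount", "SpeechSceneObjects"),
        ("SpeechSubSceneCount", "SpeechSubScenes"),
    )
    for count_label, array_label in pairs:
        if count_label in instance:
            if instance[count_label] != len(instance.get(array_label, [])):
                problems.append(count_label)
    return problems


good = {
    "Header": "OSD-SSD-V1.5",
    "SpeechObjectCount": 1,
    "SpeechSceneObjects": [{}],
    "DescrMetadata": "Two speakers in a meeting room.",
}
bad = {
    "Header": "OSD-SSD",            # version suffix missing
    "SpeechObjectCount": 2,          # array below holds one entry
    "SpeechSceneObjects": [{}],
    "DescrMetadata": "x" * 3000,     # longer than 2048 characters
}

print(semantic_problems(good))  # []
print(semantic_problems(bad))   # ['Header', 'DescrMetadata', 'SpeechObjectCount']
```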
5 Conformance Testing
A Data instance conforms with Speech Scene Descriptors (OSD-SSD) if:
- The Data validates against the Speech Scene Descriptors JSON Schema.
- All Data in the Speech Scene Descriptors JSON Schema:
  - Have the specified type.
  - Validate against their JSON Schemas.
  - Conform with their Data Qualifiers.
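The "specified type" condition can be sketched with the standard library alone. The type table below is a hypothetical fragment written for this example and is not copied from the published schema; an actual conformance test would load the schema from the Syntax URL and run a full JSON Schema validator, which also covers nested schemas. Data Qualifier checks are out of scope here.

```python
# JSON Schema type names mapped to Python types (subset used in this sketch).
JSON_TYPES = {
    "string": str,
    "integer": int,
    "array": list,
    "object": dict,
}

# Hypothetical fragment: the label-to-type pairs are assumptions.
SCHEMA_FRAGMENT = {
    "Header": "string",
    "SpeechObjectCount": "integer",
    "SpeechSceneObjects": "array",
    "SpeechSubScenes": "array",
}


def type_errors(instance: dict, schema: dict = SCHEMA_FRAGMENT) -> list[str]:
    """Return the labels present in the instance whose value has the wrong type."""
    errors = []
    for label, type_name in schema.items():
        if label not in instance:
            continue  # presence is a separate check
        value = instance[label]
        ok = isinstance(value, JSON_TYPES[type_name])
        # bool is a subclass of int in Python, but JSON true/false are not integers.
        if type_name == "integer" and isinstance(value, bool):
            ok = False
        if not ok:
            errors.append(label)
    return errors


candidate = {
    "Header": "OSD-SSD-V1.5",
    "SpeechObjectCount": "2",   # string where an integer is expected
    "SpeechSceneObjects": [],
}

errors = type_errors(candidate)
print(errors)  # ['SpeechObjectCount']
```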
6 Performance Assessment
Not part of this specification.