1 Definition
A Speech Qualifier is a set of Data providing additional information on Speech Data for potential use by a machine.
A Speech Object, which is specified by MPAI-MMC V2.2, includes a Speech Qualifier in addition to Speech Data.
2 Functional Requirements
A Speech Qualifier must allow the expression of the following Elements:
- Formats
- Content
- Transport
- Attributes
- Source
- Metadata
- Spatial Attributes
- Device
Users who need support for entries not listed in MPAI-TFA should submit a documented request to the MPAI Secretariat, asking it to consider adding those entries.
3 Syntax
https://schemas.mpai.community/TFA/V1.0/formats/SpeechQualifiers.json
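As an illustration only, the sketch below shows how a Speech Qualifier instance might be held in memory and checked for the groups described in this document. The key names (`Formats`, `Attributes`, and so on), the sample values, and the helper `missing_groups` are assumptions made for this example; the normative property names are those of the JSON schema at the URL above, which is not reproduced here.

```python
import json

# Hypothetical instance: key names and values are illustrative assumptions,
# not taken from the normative SpeechQualifiers.json schema.
EXAMPLE_QUALIFIER = json.loads("""
{
  "Formats": {
    "Content": {"RawSpeech": {"SamplingFrequency": 16, "SamplePrecision": 16}},
    "Transport": {"File": "WAV"}
  },
  "Attributes": {
    "SourceType": "Real",
    "Metadata": {"Language": "en"},
    "Device": {"DeviceID": "mic-01", "SensorCharacteristics": "Cardioid"}
  }
}
""")

# Top-level groups and the entries this document lists under each of them.
EXPECTED_GROUPS = {
    "Formats": ("Content", "Transport"),
    "Attributes": ("SourceType", "Metadata", "Device"),
}


def missing_groups(qualifier: dict) -> list[str]:
    """Return dotted paths of expected entries absent from the qualifier."""
    missing = []
    for group, entries in EXPECTED_GROUPS.items():
        if group not in qualifier:
            # The whole group is absent, so its entries are not checked.
            missing.append(group)
            continue
        for entry in entries:
            if entry not in qualifier[group]:
                missing.append(f"{group}.{entry}")
    return missing
```

Calling `missing_groups(EXAMPLE_QUALIFIER)` returns an empty list. This presence check is not a substitute for validating an instance against the schema itself.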
4 Semantics
- Sub-Types
  - No Sub-Types
- Formats
  - Content
    - Definition: the type of data arrangement used to digitally represent speech.
    - Types:
      - Raw Speech
        - Definition: the type of data arrangement used to digitally represent samples.
        - Types:
          - Sampling Frequency: Number expressing kHz.
          - Sample Precision: Number expressing bits/sample.
      - Speech Compression Formats
  - Transport
    - Definition: the type of data arrangement used to transport Speech.
    - Types:
      - File
        - Definition: the type of data arrangement used to statically transport Speech by files.
        - Types:
          - WAV
          - MP4 (ISO/IEC 14496-12:2022)
      - Stream
        - Definition: the type of data arrangement used to dynamically transport Speech by stream.
        - Types:
          - DASH (ISO/IEC 23009-1:2022)
          - HTTP Live Streaming
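The two Raw Speech parameters are enough to derive the data rate of uncompressed speech. The following is a minimal sketch, assuming a single channel unless stated otherwise (the channel count is not one of the Format entries listed above) and using the hypothetical helper name `raw_speech_bitrate_kbps`:

```python
def raw_speech_bitrate_kbps(sampling_frequency_khz: float,
                            sample_precision_bits: int,
                            channels: int = 1) -> float:
    """Return the uncompressed data rate in kbit/s.

    kHz multiplied by bits/sample gives kbit/s per channel, because
    1 kHz = 1000 samples/s and 1 kbit = 1000 bits.
    """
    if sampling_frequency_khz <= 0 or sample_precision_bits <= 0 or channels <= 0:
        raise ValueError("all parameters must be positive")
    return sampling_frequency_khz * sample_precision_bits * channels
```

For example, Raw Speech with a Sampling Frequency of 16 kHz and a Sample Precision of 16 bits/sample yields 256 kbit/s per channel.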
- Attributes
  - Source Type
    - Definition: the type of source that produced the Speech instance.
    - Types:
      - Real
      - Synthetic
  - Metadata
    - Definition: the type of data arrangement used to attach information to a Speech instance.
    - Types:
      - Language
        - Definition: the type of data arrangement used to indicate the Language used by a Speech instance.
        - Type:
          - ISO 639-1
          - ISO 639-2
          - ISO 639-3
      - Speaker Identity
        - Definition: the type of data arrangement used to identify a speaker.
        - Type:
          - MPAI Instance Identifier
      - Content Description
        - Definition: the type of data arrangement used to describe the content of a Speech instance.
        - Types:
          - ASCII
          - UTF-8
          - UTF-16
          - UTF-32
      - Entity Internal Status
        - Definition: the type of data arrangement used to describe the internal status such as cognitive state, emotion, and social attitude.
        - Type:
          - MPAI Personal Status
  - Device
    - Definition: the characteristics of the device that captured the Speech.
    - Characteristics:
      - Device ID
        - Definition: an identifier of the device.
        - Identifier:
          - String
      - Device Location
        - Definition: the position and orientation of the device in a real or virtual space.
        - Types:
          - MPAI Point of View
      - Sensor Characteristics
        - Definition: sensor features having an impact on the captured speech.
        - Sensor features:
          - Omnidirectional
          - Figure of eight
          - Cardioid
          - Supercardioid
          - Hypercardioid
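The Attributes entries map naturally onto enumerations and simple shape checks. The sketch below is one possible in-memory representation, not part of the specification: all class and function names are hypothetical, and `language_standard_candidates` only inspects the shape of a code (two letters for ISO 639-1, three letters for ISO 639-2 and ISO 639-3) without checking that the code is actually registered.

```python
from dataclasses import dataclass
from enum import Enum


class SourceType(Enum):
    REAL = "Real"
    SYNTHETIC = "Synthetic"


class SensorFeature(Enum):
    # Values reuse the names listed under Sensor Characteristics.
    OMNIDIRECTIONAL = "Omnidirectional"
    FIGURE_OF_EIGHT = "Figure of eight"
    CARDIOID = "Cardioid"
    SUPERCARDIOID = "Supercardioid"
    HYPERCARDIOID = "Hypercardioid"


def language_standard_candidates(code: str) -> list[str]:
    """List the ISO 639 parts whose code shape matches `code`."""
    if not (code.isascii() and code.isalpha() and code.islower()):
        return []
    if len(code) == 2:
        return ["ISO 639-1"]
    if len(code) == 3:
        return ["ISO 639-2", "ISO 639-3"]
    return []


@dataclass(frozen=True)
class SpeechAttributes:
    """Hypothetical container for a subset of the Attributes entries."""
    source_type: SourceType
    language: str
    device_id: str
    sensor_feature: SensorFeature

    def __post_init__(self) -> None:
        # Reject language codes that cannot belong to any listed ISO 639 part.
        if not language_standard_candidates(self.language):
            raise ValueError(f"unrecognised language code shape: {self.language!r}")
```

An instance could then be built with `SpeechAttributes(SourceType.REAL, "en", "mic-01", SensorFeature.CARDIOID)`, and `SensorFeature("Figure of eight")` looks up a sensor feature by the name used in this document.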