1 Definition
The Audio Qualifier is a set of Data providing additional information on Audio Data for potential use by a machine.
The combination of Audio Data and Audio Qualifier is called an Audio Object and is specified by MPAI-OSD V1.5.
2 Functional Requirements
The Audio Qualifier allows the expression of the following Elements:
- Sub-Types
- Formats
  - ContentFormat
  - TransportFormat
- Attributes
  - Source
  - SpatialAttributes
  - Device
  - Metadata
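The grouping of these Elements follows the section numbering used under Semantics (4.2 Formats, 4.3 Attributes). A minimal sketch of that hierarchy is given here for orientation only: the names `AUDIO_QUALIFIER_ELEMENTS` and `element_paths` are hypothetical, and the JSON Schema referenced under Syntax remains the normative definition.

```python
# Hypothetical outline of the Audio Qualifier Elements; not the normative schema.
# Nesting mirrors the Semantics section numbering (4.1, 4.2.x, 4.3.x).
AUDIO_QUALIFIER_ELEMENTS = {
    "Sub-Types": {},
    "Formats": {
        "ContentFormat": {},
        "TransportFormat": {},
    },
    "Attributes": {
        "Source": {},
        "SpatialAttributes": {},
        "Device": {},
        "Metadata": {},
    },
}


def element_paths(tree, prefix=""):
    """Return every Element as a slash-separated path, parents before children."""
    paths = []
    for name, children in tree.items():
        path = f"{prefix}/{name}" if prefix else name
        paths.append(path)
        paths.extend(element_paths(children, path))
    return paths
```

Calling `element_paths(AUDIO_QUALIFIER_ELEMENTS)` lists the three top-level Elements together with their six children, which can be handy when checking that an instance document uses only known Elements.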
Users needing additional entries in the Audio Qualifier or support for new Qualifiers should make a documented request to the MPAI Secretariat.
Requests will be considered by the appropriate MPAI committee.
3 Syntax
https://schemas.mpai.community/TFA/V1.5/data/AudioQualifier.json
4 Semantics
4.1 Sub-Types
Defines the nature of the audio signal.
- Speech
- Music
- SoundEffects
- Noise
- Mixed
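One way an implementation might represent the five Sub-Types is as a closed enumeration, so that unknown labels are rejected early. The following is a sketch under that assumption; `AudioSubType` and `parse_sub_type` are hypothetical names that do not come from the specification.

```python
from enum import Enum


class AudioSubType(Enum):
    """Hypothetical enumeration of the Sub-Types listed above."""
    SPEECH = "Speech"
    MUSIC = "Music"
    SOUND_EFFECTS = "SoundEffects"
    NOISE = "Noise"
    MIXED = "Mixed"


def parse_sub_type(label: str) -> AudioSubType:
    """Map a Sub-Type label to its enum member, rejecting unknown labels."""
    try:
        return AudioSubType(label)
    except ValueError:
        allowed = ", ".join(member.value for member in AudioSubType)
        raise ValueError(f"unknown Sub-Type {label!r}; expected one of: {allowed}") from None
```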
4.2 Formats
4.2.1 ContentFormat
Defines the structure and representation of audio data.
- SampleSpace:
- PCM representation
- TransformSpace:
- Sequence (Sequential, Interleaved)
- Precision (float32, float64)
- Spherical Harmonics Decomposition:
- Order
- Precision
- Ambisonics:
- Normalisation
- ChannelOrder
- Order
- Precision
- MultiPointAmbisonics
Additional content formats are defined in AudioContentFormats.
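The Order parameter of the Spherical Harmonics Decomposition and Ambisonics entries determines how many channels (coefficients) a full-sphere representation carries: an order-N representation has (N + 1)^2 of them. The sketch below illustrates that relation. It also shows the ACN channel index, which is one widely used Ambisonics channel ordering; whether ChannelOrder in this specification admits ACN is an assumption here, and both function names are hypothetical.

```python
def ambisonic_channel_count(order: int) -> int:
    """Number of channels in a full-sphere representation of the given order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


def acn_index(degree: int, index: int) -> int:
    """ACN channel number for spherical-harmonic degree l and index m.

    ACN = l^2 + l + m, with -l <= m <= l. Assumes the ACN ordering,
    which the specification text above does not name explicitly.
    """
    if degree < 0 or abs(index) > degree:
        raise ValueError("require degree >= 0 and -degree <= index <= degree")
    return degree * degree + degree + index
```

For instance, first order gives 4 channels and third order gives 16, and the ACN indices of an order-N signal run from 0 to (N + 1)^2 - 1.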
4.2.2 TransportFormat
Defines how Audio Data is transported.
- FileFormats: AudioFileFormats
- StreamFormats: AudioStreamFormats
4.3 Attributes
4.3.1 Source
Defines the origin of the audio signal.
- Vocal: Real or Synthetic
- Music: Real or Synthetic
- SoundEffects: Real or Synthetic
- Noise: Real or Synthetic
4.3.2 SpatialAttributes
Defines spatial perception characteristics of audio.
- BinauralCues:
- InterauralLevelDifference
- InterauralTimeDelay
- InterauralPhaseDifference
- SpectralCues
- InterchannelDifferences
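To make two of the binaural cues concrete, the sketch below estimates InterauralLevelDifference and InterauralTimeDelay from a pair of left/right sample sequences. It is an illustration only: the specification does not prescribe an estimation method, the function names are hypothetical, and the sign conventions (left relative to right for the level difference, positive delay when the right channel lags) are assumptions.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def interaural_level_difference_db(left, right):
    """Level of the left channel relative to the right channel, in dB."""
    rms_left, rms_right = rms(left), rms(right)
    if rms_left == 0.0 or rms_right == 0.0:
        raise ValueError("level difference is undefined for a silent channel")
    return 20.0 * math.log10(rms_left / rms_right)


def interaural_time_delay_s(left, right, sample_rate, max_lag):
    """Delay in seconds that maximises the left/right cross-correlation.

    Only lags in [-max_lag, max_lag] samples are searched.
    A positive result means the right channel lags the left channel.
    """
    def cross_correlation(lag):
        return sum(
            left[n] * right[n + lag]
            for n in range(len(left))
            if 0 <= n + lag < len(right)
        )

    best_lag = max(range(-max_lag, max_lag + 1), key=cross_correlation)
    return best_lag / sample_rate
```

As a usage example, an impulse that reaches the right channel 3 samples later and at half the amplitude yields a level difference of about 6.02 dB and, at 48 kHz, a delay of 62.5 microseconds.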
4.3.3 Device
Defines the device used for capturing or rendering audio data.
- DeviceRole:
  - Capture: microphones, arrays, wearable sensors
  - Render: speakers, headphones
  - Bidirectional
- DeviceType:
  - Microphone
  - MicrophoneArray
  - Speaker
  - Headphones
  - WearableMic
- CaptureConfiguration:
  - ChannelCount
  - SamplingMode (Mono, Stereo, MultiChannel, Ambisonics)
- RenderConfiguration:
  - ChannelCount
  - RenderingMode (Mono, Stereo, Multichannel, Binaural, Ambisonics)
- SpeakerConfiguration:
  - ChannelCount
  - Layout (Mono, Stereo, 5.1, 7.1, Custom)
- OperationalParameters:
  - Gain
  - Sensitivity
  - DynamicRange
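Because SpeakerConfiguration carries both a ChannelCount and a Layout, an implementation may want to check that the two agree. The sketch below assumes the conventional channel counts for the named layouts (a 5.1 layout has 6 channels, a 7.1 layout has 8) and treats Custom as unconstrained. The specification does not state that such a check is required, and the names used are hypothetical.

```python
# Conventional channel counts for the named layouts (assumed, not normative).
LAYOUT_CHANNELS = {"Mono": 1, "Stereo": 2, "5.1": 6, "7.1": 8}


def speaker_configuration_is_consistent(channel_count: int, layout: str) -> bool:
    """Return True when ChannelCount matches what the Layout implies.

    A Custom layout accepts any positive channel count.
    """
    if layout == "Custom":
        return channel_count >= 1
    if layout not in LAYOUT_CHANNELS:
        raise ValueError(f"unknown Layout {layout!r}")
    return LAYOUT_CHANNELS[layout] == channel_count
```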
4.3.4 Metadata
Defines metadata associated with the audio signal.
- AudioMetadataFormats
- ObjectID
4.4 Conceptual Model
The Audio Qualifier describes:
- The structure of the signal (Formats)
- The type of content (Sub-Types)
- The origin of the signal (Source)
- The spatial properties (SpatialAttributes)
- The systems interacting with the signal (Device)