1     Definition

The Audio Qualifier is a set of Data providing additional information on Audio Data for potential use by a machine.

The combination of Audio Data and Audio Qualifier is called an Audio Object and is specified by MPAI-OSD V1.5.

2     Functional Requirements

The Audio Qualifier allows the expression of the following Elements:

Users needing additional entries in the Audio Qualifier or support of new Qualifiers should make a documented request to the MPAI Secretariat.
Requests will be considered by the appropriate MPAI committee.

3     Syntax


https://schemas.mpai.community/TFA/V1.5/data/AudioQualifier.json

4     Semantics

4.1  Sub-Types

Defines the nature of the audio signal.

  • Speech
  • Music
  • SoundEffects
  • Noise
  • Mixed
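
As an illustration only, the five Sub-Types listed above can be modelled as a closed enumeration so that unknown labels are rejected early. The class and function names below are hypothetical and are not taken from the JSON schema.

```python
from enum import Enum


class AudioSubType(Enum):
    """Sub-Types listed in 4.1; the member names are illustrative."""
    SPEECH = "Speech"
    MUSIC = "Music"
    SOUND_EFFECTS = "SoundEffects"
    NOISE = "Noise"
    MIXED = "Mixed"


def parse_sub_type(label: str) -> AudioSubType:
    """Return the matching Sub-Type, or raise ValueError for an unknown label."""
    return AudioSubType(label)
```

Calling `parse_sub_type("Music")` returns `AudioSubType.MUSIC`, while a label outside the list raises `ValueError`.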

4.2  Formats

4.2.1  ContentFormat

Defines the structure and representation of audio data.

  • SampleSpace:
    • PCM representation
  • TransformSpace:
    • Sequence (Sequential, Interleaved)
    • Precision (float32, float64)
  • Spherical Harmonics Decomposition:
    • Order
    • Precision
  • Ambisonics:
    • Normalisation
    • ChannelOrder
    • Order
    • Precision
  • MultiPointAmbisonics

Additional content formats are defined in AudioContentFormats.
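
The Ambisonics entry above can be sketched as a small record, shown here only to make the role of each field concrete. The class name, the field names, and the example values ("SN3D", "ACN") are assumptions and are not taken from the schema; the only fact relied on is that full-sphere Ambisonics of order N carries (N + 1)² channels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AmbisonicsFormat:
    """Hypothetical container for the Ambisonics ContentFormat fields."""
    order: int            # Ambisonics order N (0 means omnidirectional only)
    normalisation: str    # e.g. "SN3D" or "N3D" (illustrative values)
    channel_order: str    # e.g. "ACN" or "FuMa" (illustrative values)
    precision: str        # e.g. "float32" or "float64" (assumed values)

    def channel_count(self) -> int:
        """Full-sphere Ambisonics of order N carries (N + 1)^2 channels."""
        if self.order < 0:
            raise ValueError("order must be non-negative")
        return (self.order + 1) ** 2
```

Under these assumptions a first-order stream has 4 channels and a third-order stream has 16, which is a quick consistency check against any declared channel count.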

4.2.2  TransportFormat

Defines how Audio Data is transported.

4.3  Attributes

4.3.1  Source

Defines the origin of the audio signal.

  • Vocal: Real or Synthetic
  • Music: Real or Synthetic
  • SoundEffects: Real or Synthetic
  • Noise: Real or Synthetic

4.3.2  SpatialAttributes

Defines spatial perception characteristics of audio.

  • BinauralCues:
    • InterauralLevelDifference
    • InterauralTimeDelay
    • InterauralPhaseDifference
  • SpectralCues
  • InterchannelDifferences

4.3.3  Device

Defines the device used for capturing or rendering audio data.

  • DeviceRole:
    • Capture: microphones, arrays, wearable sensors
    • Render: speakers, headphones
    • Bidirectional
  • DeviceType:
    • Microphone
    • MicrophoneArray
    • Speaker
    • Headphones
    • WearableMic
  • CaptureConfiguration:
    • ChannelCount
    • SamplingMode (Mono, Stereo, MultiChannel, Ambisonics)
  • RenderConfiguration:
    • ChannelCount
    • RenderingMode (Mono, Stereo, Multichannel, Binaural, Ambisonics)
  • SpeakerConfiguration:
    • ChannelCount
    • Layout (Mono, Stereo, 5.1, 7.1, Custom)
  • OperationalParameters:
    • Gain
    • Sensitivity
    • DynamicRange

4.3.4  Metadata

Defines metadata associated with the audio signal.

4.4  Conceptual Model

The Audio Qualifier describes:

  • The structure of the signal (Formats)
  • The type of content (SubType)
  • The origin of the signal (Source)
  • The spatial properties (SpatialAttributes)
  • The systems interacting with the signal (Device)
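
The conceptual model above can be pictured as one nested record. The sketch below groups the five aspects under keys that mirror the section headings of this page; the key names, the nesting, and the example values are assumptions for illustration and may differ from the normative JSON schema referenced in the Syntax section.

```python
import json


def build_audio_qualifier(sub_type, content_format, source,
                          spatial_attributes, device):
    """Group the five aspects under hypothetical keys mirroring the headings."""
    return {
        "SubType": sub_type,
        "Formats": {"ContentFormat": content_format},
        "Attributes": {
            "Source": source,
            "SpatialAttributes": spatial_attributes,
            "Device": device,
        },
    }


# Example values are illustrative, not taken from the schema.
qualifier = build_audio_qualifier(
    sub_type="Speech",
    content_format={"Ambisonics": {"Order": 1, "Precision": "float32"}},
    source={"Vocal": "Real"},
    spatial_attributes=["BinauralCues", "SpectralCues"],
    device={"DeviceRole": "Capture", "DeviceType": "MicrophoneArray"},
)
print(json.dumps(qualifier, indent=2))
```

A record built this way would still need to be validated against the published schema before being exchanged as an Audio Qualifier.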