1       Definition

Multichannel Audio provided by a Microphone Array and used to:

  1. Create Audio Scene Descriptors to:
    • Enable extraction of speech addressed to the HCI by humans outside or inside the CAV.
    • Incorporate outdoor Audio information into the Basic Environment Representation.
  2. Suppress noise and individual sound sources outside the passenger cabin.

2       Functional Requirements

Microphones (or Microphone Arrays) capture sound both outdoors and indoors in order to create Audio Scene Descriptors used to:

  1. Provide the location of sound sources.
  2. Enable extraction of speech addressed to CAV by humans.
  3. Remove unwanted noise from the passenger cabin.
  4. Incorporate Audio information into the Basic Environment Representation.
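
As an illustration of requirement 1, the following is a minimal sketch of how two channels of a Microphone Array could yield a direction of arrival. It is not part of any MPAI specification: the function names, the two-microphone far-field model, the brute-force cross-correlation, and the default speed of sound of 343 m/s are all assumptions made for this example.

```python
import math

def estimate_delay(ref, other, max_lag):
    """Return the lag in samples at which `other` best matches `ref`.

    A positive lag means the sound reaches `other` later than `ref`.
    Brute-force cross-correlation; adequate only for short buffers.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, x in enumerate(ref):
            m = n + lag
            if 0 <= m < len(other):
                score += x * other[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def arrival_angle_deg(lag, sample_rate_hz, spacing_m, speed_of_sound=343.0):
    """Convert a sample lag into an angle from the array broadside.

    Assumes a far-field source and two microphones `spacing_m` apart.
    """
    tau = lag / sample_rate_hz                  # delay in seconds
    s = speed_of_sound * tau / spacing_m        # sine of the arrival angle
    s = max(-1.0, min(1.0, s))                  # clamp against rounding and noise
    return math.degrees(math.asin(s))

# Hypothetical test signal: one click that reaches the second microphone 3 samples later.
mic_a = [0.0] * 32
mic_b = [0.0] * 32
mic_a[10] = 1.0
mic_b[13] = 1.0

lag = estimate_delay(mic_a, mic_b, max_lag=5)
angle = arrival_angle_deg(lag, sample_rate_hz=48000, spacing_m=0.1)
print(lag, round(angle, 1))
```

A real Audio Scene Description would combine many microphone pairs, using the Microphone Array Geometry to resolve full 3D positions; the sketch only shows the underlying time-difference idea.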

MPAI has developed specifications for Multichannel Audio, Multichannel Audio Stream, and Microphone Array Geometry.

3       Syntax

https://schemas.mpai.community/CAE1/V2.2/data/AudioFormatID.json

https://schemas.mpai.community/OSD/V1.1/data/SpaceTime.json

4       Semantics

Label                            Size       Description
Header                           N1 Bytes
·  Standard                      9 Bytes    The characters “CAV-IAU-V”.
·  Version                       N2 Bytes   Major version, 1 or 2 Bytes.
·  Dot-separator                 1 Byte     The character “.”.
·  Subversion                    N3 Bytes   Minor version, 1 or 2 Bytes.
InputAudioID                     N4 Bytes   Identifier of the Audio sensor (Microphone Array).
InputAudioTimeSpaceAttributes    N5 Bytes   Time and Space of Input Audio Data.
InputAudioData                   N6 Bytes
·  AudioFormatID                 N7 Bytes   Format ID of Input Audio Data.
·  InputAudioDataLength          N8 Bytes   Data Length of Input Audio Data in Bytes.
·  InputAudioDataURI             N9 Bytes   Location of Input Audio Data.
InputAudioAttributes[]           N10 Bytes
·  AudioAttributeID              N11 Bytes  ID of Attribute of Input Audio Data.
·  AudioAttributeFormatID        N12 Bytes  ID of Attribute Format of Input Audio Data.
·  InputAudioAttributeLength     N13 Bytes  Number of Bytes in Input Audio Attribute Data.
·  InputAudioAttributeDataURI    N14 Bytes  URI of Input Audio Attribute Data.
DescrMetadata                    N15 Bytes  Descriptive Metadata.
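
The field layout above can be sketched in code. The example below is illustrative only and is not the normative JSON schema: it assumes that the Version and Subversion fields are ASCII decimal digits, and the class names (`Header`, `AudioData`, `AudioAttribute`, `InputAudioRecord`) and all sample values are hypothetical. Only the field names and the “CAV-IAU-V” prefix come from the table.

```python
import json
import re
from dataclasses import asdict, dataclass, field
from typing import List

STANDARD = "CAV-IAU-V"  # the 9-character Standard field

@dataclass
class Header:
    version: str     # major version, 1 or 2 characters
    subversion: str  # minor version, 1 or 2 characters

    def encode(self) -> bytes:
        """Serialise as Standard + Version + "." + Subversion."""
        return f"{STANDARD}{self.version}.{self.subversion}".encode("ascii")

# Assumption: both version fields are ASCII decimal digits.
_HEADER_RE = re.compile(rb"CAV-IAU-V(\d{1,2})\.(\d{1,2})")

def parse_header(data: bytes) -> Header:
    """Parse an isolated header; raise ValueError if it does not match."""
    m = _HEADER_RE.fullmatch(data)
    if m is None:
        raise ValueError("not a CAV-IAU header")
    return Header(m.group(1).decode("ascii"), m.group(2).decode("ascii"))

@dataclass
class AudioData:
    AudioFormatID: str
    InputAudioDataLength: int  # length of the audio data in bytes
    InputAudioDataURI: str

@dataclass
class AudioAttribute:
    AudioAttributeID: str
    AudioAttributeFormatID: str
    InputAudioAttributeLength: int
    InputAudioAttributeDataURI: str

@dataclass
class InputAudioRecord:
    Header: str
    InputAudioID: str
    InputAudioTimeSpaceAttributes: dict  # placeholder for a SpaceTime object
    InputAudioData: AudioData
    InputAudioAttributes: List[AudioAttribute] = field(default_factory=list)
    DescrMetadata: str = ""

# Hypothetical sample values, chosen only to exercise the structure.
record = InputAudioRecord(
    Header=Header("1", "0").encode().decode("ascii"),
    InputAudioID="mic-array-front",
    InputAudioTimeSpaceAttributes={},
    InputAudioData=AudioData("example-format", 4096, "urn:example:audio:0001"),
)
print(json.dumps(asdict(record), indent=2))
```

Because the Version and Subversion fields are variable-length, `parse_header` expects the header bytes on their own; a parser working on a longer byte stream would need an explicit length or delimiter to know where the Subversion ends.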

5       Data Formats

Input Audio requires:

  1. Audio Format.
  2. Audio Attribute Format.

6       To Respondents

Respondents are invited to:

  1. Comment or elaborate on the relevance and applicability of the above-mentioned three standards to CAV.
  2. Comment on the Functional Requirements.
  3. Propose motivated Functional Requirements for an Audio Array Format suitable to create a 3D sound field representation of the Environment for the stated purposes.
  4. Propose Data Formats and Attributes for use in the future Technical Specification: Data Types, Formats, and Attributes (MPAI-TFA).