1        Definition

A data structure containing Audio Objects packaged with Audio Scene Geometry and Time Code according to the structure specified in Table 1.

2        Functional Requirements

3        Syntax


4        Semantics

Table 1 – Multichannel Audio Stream Semantics

Label                     | Size     | Description
Header                    | N1 Bytes | Multichannel Audio Header
– Standard and Data Type  | 9 Bytes  | The characters “CAE-MCA-V”
– Version                 | N2 Bytes | Major MPAI-CAE version
– Dot-separator           | 1 Byte   | The character “.”
– Subversion              | N3 Bytes | Minor MPAI-CAE version
MInstanceID               | N4 Bytes | ID of the Metaverse Instance.
MultichannelAudioID       | N4 Bytes | Identifier of the Multichannel Audio Stream.
– BlockIndex              | 8 Bytes  | Indicates the timing order of the output block. Derived from Audio Scene Geometry.
– BlockStart              | 8 Bytes  | Derived from Audio Scene Geometry.
– BlockEnd                | 8 Bytes  | Derived from Audio Scene Geometry.
– BlockSize               | 1 Byte   | Derived from Audio Scene Geometry.
– Checksum                | 1 Byte   | Sum of the block and speech header bytes, modulo 256.
AudioObjectCount          | 1 Byte   | AudioObjectCount of Audio Scene Geometry.
AudioObjectsData          | N1 Bytes |
– AudioObjectID           | 16 Bytes | AudioObjectID in Audio Object.
– Sampling Rate           | bits 0-3 | SamplingRate of Audio Scene Descriptors.
– Sample Type             | bits 4-6 | Sample precision in bits/sample: 0 = 8, 1 = 16, 2 = 24, 3 = 32, 4 = 64.
– Reserved                | bit 7    |
– Spatial Attitude        | N2 Bytes |
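
As an illustration of two fields in Table 1, the following Python sketch computes a modulo-256 checksum and packs or unpacks the byte that holds Sampling Rate (bits 0-3), Sample Type (bits 4-6) and Reserved (bit 7). It is a minimal sketch, not a normative implementation. The function names are hypothetical, and the code assumes that bit 0 is the least significant bit, which Table 1 does not state. Table 1 also does not define the exact byte range covered by the checksum or the meaning of the Sampling Rate codes, so the caller supplies the bytes to be summed and the Sampling Rate value is returned as a raw code.

```python
# Sample Type code -> bits per sample, as listed in Table 1.
SAMPLE_TYPE_BITS = {0: 8, 1: 16, 2: 24, 3: 32, 4: 64}


def checksum(data: bytes) -> int:
    """Return the sum of all bytes in `data`, modulo 256.

    The caller decides which bytes are covered; Table 1 only says
    "block and speech header bytes".
    """
    return sum(data) % 256


def pack_format_byte(sampling_rate_code: int, sample_type_code: int) -> int:
    """Pack both codes into one byte, leaving the Reserved bit at 0.

    Assumption: bit 0 is the least significant bit.
    """
    if not 0 <= sampling_rate_code <= 0x0F:
        raise ValueError("Sampling Rate code must fit in 4 bits")
    if sample_type_code not in SAMPLE_TYPE_BITS:
        raise ValueError("unknown Sample Type code")
    return sampling_rate_code | (sample_type_code << 4)


def unpack_format_byte(value: int) -> dict:
    """Split one byte into Sampling Rate, Sample Type and Reserved.

    Assumption: bit 0 is the least significant bit.
    """
    if not 0 <= value <= 0xFF:
        raise ValueError("expected a single byte (0..255)")
    sample_type_code = (value >> 4) & 0x07  # bits 4-6
    return {
        "sampling_rate_code": value & 0x0F,  # bits 0-3, raw code
        "sample_type_code": sample_type_code,
        # None when the code is not one of the five listed in Table 1.
        "bits_per_sample": SAMPLE_TYPE_BITS.get(sample_type_code),
        "reserved": (value >> 7) & 0x01,  # bit 7
    }
```

Under these assumptions, `pack_format_byte(3, 2)` gives `0x23`, and `unpack_format_byte(0x23)` reports Sampling Rate code 3 and 24 bits per sample. If the specification numbers bits from the most significant end instead, the shifts and masks would need to be mirrored.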