1     Function 2     Reference Model 3     Input/Output Data
4     SubAIMs 5     JSON Metadata 6     Profiles
7     Reference Software 8     Conformance Testing 9     Performance Assessment

1     Functions

Receives | Microphone Array Geometry | Geometry of the Microphone Array.
  | Transform Multichannel Audio | Audio Object in the Transform domain.
Produces | Spherical Harmonics Decomposition Coefficients | Result of the transformation into the spherical frequency domain.

2      Reference Model

3      Input/Output Data

Input data | Semantics
Microphone Array Geometry | Geometry of the Microphone Array.
Transform Multichannel Audio | Audio Object in the Transform domain.
Output data | Semantics
Spherical Harmonics Decomposition Coefficients | Result of the transformation into the spherical frequency domain.

4      SubAIMs

No SubAIMs.

5      JSON Metadata

https://schemas.mpai.community/CAE1/V2.4/AIMs/SoundFieldDescription.json

6 Profiles

No Profiles.

7 Reference Software

The Sound Field Description Reference Software can be downloaded from the MPAI Git.

8 Conformance Testing

Receives | Microphone Array Geometry | Shall validate against the Microphone Array Geometry schema.
  | Transform Multichannel Audio | Shall validate against the Audio Object schema. The Qualifier shall validate against the Audio Qualifier schema. The values of any Sub-Type, Format, and Attribute of the Qualifier shall correspond to the Sub-Type, Format, and Attributes of the Audio Object Qualifier schema.
Produces | Spherical Harmonics Decomposition Coefficients | Shall validate against the Audio Object schema. The Qualifier shall validate against the Audio Qualifier schema. The values of any Sub-Type, Format, and Attribute of the Qualifier shall correspond to the Sub-Type, Format, and Attributes of the Audio Object Qualifier schema.

9 Performance Assessment

Table 52 gives the Enhanced Audioconference Experience (CAE-EAE) Sound Field Description Means and how they are used.

Table 52 – Means and use of Enhanced Audioconference Experience (CAE-EAE) Sound Field Description

Means | Actions
Performance Testing Dataset | DS1: n Test files containing real recordings or simulations structured in Transform Multichannel Audio format.
  | DS2: n Microphone Array Geometry files associated with the real recordings or simulations.
  | DS3: n Expected Output files containing data in SHD format.
Procedure | 1. Feed the AIM under test with the Test files (DS1) and their associated Microphone Array Geometry (DS2).
  | 2. Compare the output SHD with the Expected Output files (DS3).
Evaluation | 1. Check the data format of the output SHD against the format of the Expected Output files.
  | 2. Calculate the peak-to-peak Amplitude (A) of each Audio block in the Expected Output files.
  | 3. Calculate the RMSE of each Audio block in the SHD by comparing the output (x) with the Expected Output files (y).
  | 4. Accept the AIM under test if, for each Audio block, these two conditions are satisfied:
  |    a. The data format of the SHD is the same as that of the Expected Output files, and
  |    b. RMSE < A * 0.1%
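The Evaluation steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Reference Software: each Audio block is assumed to be a flat sequence of real-valued samples, the data format check is stood in for by a simple length comparison, and the function names (`peak_to_peak`, `rmse`, `block_passes`, `aim_passes`) are hypothetical.

```python
import math
from typing import Sequence


def peak_to_peak(block: Sequence[float]) -> float:
    """Peak-to-peak Amplitude A of one Expected Output block."""
    return max(block) - min(block)


def rmse(output: Sequence[float], expected: Sequence[float]) -> float:
    """Root-mean-square error between output (x) and expected output (y)."""
    if len(output) != len(expected):
        raise ValueError("blocks must have the same length")
    squared = sum((x - y) ** 2 for x, y in zip(output, expected))
    return math.sqrt(squared / len(expected))


def block_passes(output: Sequence[float], expected: Sequence[float],
                 tolerance: float = 0.001) -> bool:
    """Apply conditions a. and b. to a single Audio block.

    Assumption: equal, non-zero length stands in for the data format check.
    tolerance = 0.001 corresponds to the 0.1% factor in the criterion.
    """
    if not expected or len(output) != len(expected):
        return False
    amplitude = peak_to_peak(expected)
    # Strict inequality: a constant expected block (A == 0) can never pass.
    return rmse(output, expected) < amplitude * tolerance


def aim_passes(output_blocks: Sequence[Sequence[float]],
               expected_blocks: Sequence[Sequence[float]]) -> bool:
    """Accept the AIM only if every Audio block satisfies both conditions."""
    if len(output_blocks) != len(expected_blocks):
        return False
    return all(block_passes(x, y)
               for x, y in zip(output_blocks, expected_blocks))
```

For example, with an expected block of `[0.0, 1.0, -1.0, 0.5]` the Amplitude A is 2.0, so the block is accepted only if its RMSE is below 0.002.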

Figure 21 – Sound Field Description Testing Flow

After the tests, the Performance Assessor shall fill out Table 53.

Table 53 – Performance Testing form of Enhanced Audioconference Experience (CAE-EAE) Sound Field Description

Performance Assessor ID | Unique Performance Assessor Identifier assigned by MPAI.
Standard, Use Case ID and Version | Standard ID and Use Case ID, Version and Profile of the standard in the form “CAE:EAE:1:0”.
Name of AIM | Sound Field Description
Implementer ID | Unique Implementer Identifier assigned by MPAI Store.
AIM Implementation Version | Unique Implementation Identifier assigned by Implementer.
Neural Network Version* | Unique Neural Network Identifier assigned by Implementer.
Identifier of Performance Testing Dataset | Unique Dataset Identifier assigned by MPAI Store.
Test ID | Unique Test Identifier assigned by Performance Assessor.
Actual output | The Performance Assessor will provide the following matrix containing a limited number of input records (n) with the corresponding outputs. If an input record fails, the Assessor will specify the reason why the test case fails.

Input data (DS1, DS2) | Expected Output Data (DS3) | Data Format | RMSE
Transform Multichannel Audio ID1, Microphone Array Geometry ID1 | SHD ID1 | T/F | < A * 0.1%
Transform Multichannel Audio ID2, Microphone Array Geometry ID2 | SHD ID2 | T/F | < A * 0.1%
Transform Multichannel Audio ID3, Microphone Array Geometry ID3 | SHD ID3 | T/F | < A * 0.1%
Transform Multichannel Audio IDn, Microphone Array Geometry IDn | SHD IDn | T/F | < A * 0.1%

Final evaluation: T/F

Denoting by i the record number in DS1, DS2, and DS3, the matrices reflect the results obtained for input record i and the corresponding output i.

DS1 | DS2 | DS3 | Sound Field Description output value (obtained through the AIM under test)
DS1[i] | DS2[i] | DS3[i] | SoundFieldDescription[i]
Execution time* | Duration of test execution.
Test comment* | Comments on test results and any actions needed.
Test Date | yyyy/mm/dd.

* Optional field
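
The "Actual output" matrix and the final evaluation in Table 53 could be assembled as in the sketch below. It is illustrative only: the `RecordResult` class, its fields, and the helper functions are hypothetical names, and each record is assumed to carry a single RMSE and Amplitude value (for instance, those of its worst Audio block).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RecordResult:
    """Outcome for one input record i (DS1[i], DS2[i]) against DS3[i]."""
    record_id: str
    format_ok: bool           # "Data Format" column (T/F)
    rmse: float               # RMSE between output and Expected Output
    amplitude: float          # peak-to-peak Amplitude A of Expected Output
    failure_reason: str = ""  # filled in by the Assessor when a record fails

    @property
    def rmse_ok(self) -> bool:
        # "RMSE" column: RMSE < A * 0.1%
        return self.rmse < self.amplitude * 0.001

    @property
    def passed(self) -> bool:
        return self.format_ok and self.rmse_ok


def final_evaluation(results: List[RecordResult]) -> bool:
    """Final evaluation is T only if there are records and all of them pass."""
    return bool(results) and all(r.passed for r in results)


def format_matrix(results: List[RecordResult]) -> List[str]:
    """Render the results as plain-text rows in the style of Table 53."""
    def flag(value: bool) -> str:
        return "T" if value else "F"

    lines = ["Record | Data Format | RMSE < A * 0.1%"]
    for r in results:
        lines.append(f"{r.record_id} | {flag(r.format_ok)} | {flag(r.rmse_ok)}")
    lines.append(f"Final evaluation: {flag(final_evaluation(results))}")
    return lines
```

Keeping the failure reason on the record makes it straightforward to report why a test case failed, as the form requires.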