1 Function 2 Reference Model 3 Input/Output Data
4 SubAIMs 5 JSON Metadata 6 Profiles
7 Reference Software 8 Conformance Testing 9 Performance Assessment

1 Functions

  1. Receives Spherical Harmonics Decomposition Coefficients, Transform Speech, Audio Scene Geometry, and Source Model KB info.
  2. Removes background noise and reverberation that reduce audio quality. It acts as a Passthrough AIM if the environment does not add substantial ambient noise to the desired speech.
  3. Produces Denoised Transform Speech.

2 Reference Model

3 Input/Output Data

Input data | Semantics
Spherical Harmonics Decomposition Coefficients | Result of the transformation of Transform Multichannel Audio into the spherical frequency domain.
Transform Audio | Audio Object in the Transform Domain.
Audio Scene Geometry | Spatial arrangement of Audio Objects.
Source Model KB info | Discrete-time and discrete-valued simple acoustic source models used in source separation.

Output data | Semantics
Enhanced Transform Speech | Transform Speech whose noise level has been reduced.

4 SubAIMs

No SubAIMs.

5 JSON Metadata

https://schemas.mpai.community/CAE1/V2.4/AIMs/NoiseCancellationModule.json

6 Profiles

No Profiles.

7 Reference Software

The Noise Cancellation Module can be downloaded from the MPAI Git.

8 Conformance Testing

Receives | Spherical Harmonics Decomposition Coefficients | Shall validate against the Audio Object schema. The Qualifier shall validate against the Audio Qualifier schema. The values of any Sub-Type, Format, and Attribute of the Qualifier shall correspond with the Sub-Type, Format, and Attributes of the Audio Object Qualifier schema.
Receives | Transform Audio | Shall validate against the Audio Object schema. The Qualifier shall validate against the Audio Qualifier schema. The values of any Sub-Type, Format, and Attribute of the Qualifier shall correspond with the Sub-Type, Format, and Attributes of the Audio Object Qualifier schema.
Receives | Audio Scene Geometry | Shall validate against the Audio Scene Geometry schema.
Receives | Source Model KB info | Discrete-time and discrete-valued simple acoustic source models used in source separation.
Produces | Enhanced Transform Speech | Shall validate against the Audio Object schema. The Qualifier shall validate against the Audio Qualifier schema. The values of any Sub-Type, Format, and Attribute of the Qualifier shall correspond with the Sub-Type, Format, and Attributes of the Audio Object Qualifier schema.

9 Performance Assessment

Table 58 gives the Enhanced Audioconference Experience (CAE-EAE) Noise Cancellation Means and how they are used.

Table 58 – AIM Means and use of Enhanced Audioconference Experience (CAE-EAE) Noise Cancellation

Means | Actions
Performance Assessment Dataset | DS1: n Test files containing SHD.
    DS2: n Test files containing Transform Speech.
    DS3: n Test files containing Audio Scene Geometry.
    DS4: n Expected Denoised Transform Speech.
Procedure | 1. Feed the AIM under test with the Test files (DS1, DS2, DS3).
    2. Analyse the Denoised Transform Speech (DS4).
Evaluation | 1. Compare the number of Audio Blocks in the Expected Denoised Transform Speech with the number of Audio Blocks in the Denoised Transform Speech Files.
    2. Compute the Perceptual Evaluation of Speech Quality (PESQ) between the Expected and Output Denoised Transform Speech Files [6].
    3. Accept the AIM under test if these two conditions are satisfied:
        a. The number of Audio Blocks in the Denoised Transform Speech is the same as the number of Audio Blocks in the Expected Denoised Transform Speech.
        b. Each Denoised Transform Speech, compared with the Expected Denoised Transform Speech, meets the PESQ threshold for the room:
            i. If the room reverberation time (T60) is greater than 0.5 seconds, each object’s PESQ between the Expected and Output is greater than P = 2.0.
            ii. If the room reverberation time (T60) is smaller than 0.5 seconds, each object’s PESQ between the Expected and Output is greater than P = 3.0.
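The acceptance rule in the Evaluation step can be sketched in Python as follows. This is only an illustration under stated assumptions: the function and constant names are hypothetical, the PESQ scores are assumed to have been computed beforehand by a separate tool, and the text does not say which threshold applies when T60 is exactly 0.5 seconds, so the sketch applies the stricter P = 3.0 in that case.

```python
from typing import Sequence

T60_BOUNDARY_S = 0.5        # reverberation-time boundary named in the Evaluation step
PESQ_MIN_REVERBERANT = 2.0  # P when T60 is greater than 0.5 s
PESQ_MIN_DRY = 3.0          # P when T60 is smaller than 0.5 s


def pesq_threshold(t60_seconds: float) -> float:
    """Return the minimum PESQ value P for a room with the given T60."""
    if t60_seconds > T60_BOUNDARY_S:
        return PESQ_MIN_REVERBERANT
    # Assumption: T60 == 0.5 s is not covered by the text; the stricter P is used.
    return PESQ_MIN_DRY


def accept_aim(
    expected_blocks: int,
    output_blocks: int,
    object_pesq_scores: Sequence[float],
    t60_seconds: float,
) -> bool:
    """Apply both acceptance conditions to one test record."""
    # Condition a: the Audio Block counts must match.
    if expected_blocks != output_blocks:
        return False
    # Condition b: every object's PESQ must be strictly greater than P.
    p = pesq_threshold(t60_seconds)
    return all(score > p for score in object_pesq_scores)
```

For example, `accept_aim(10, 10, [2.5, 2.1], 0.8)` accepts the record because both scores exceed P = 2.0, while the same scores with a T60 of 0.3 seconds are rejected against P = 3.0.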

Figure 23 – Noise Cancellation Testing Flow

After the tests, the Performance Assessor shall fill out Table 59.

Table 59 – Performance Assessment form of Enhanced Audioconference Experience (CAE-EAE) Noise Cancellation

Performance Assessor ID | Unique Performance Assessor Identifier assigned by MPAI.
Standard, Use Case ID and Version | Standard ID and Use Case ID, Version and Profile of the standard in the form “CAE:EAE:1:0”.
Name of AIM | Noise Cancellation
Implementer ID | Unique Implementer Identifier assigned by MPAI Store.
AIM Implementation Version | Unique Implementation Identifier assigned by Implementer.
Neural Network Version* | Unique Neural Network Identifier assigned by Implementer.
Identifier of Performance Assessment Dataset | Unique Dataset Identifier assigned by MPAI Store.
Test ID | Unique Test Identifier assigned by Performance Assessor.
Actual output | The Performance Assessor will provide the following matrix containing a limited number of input records (n) with the corresponding outputs. If an input record fails, the tester specifies the reason why the test case fails.

Input data (DS1, DS2, DS3) | Expected Output Data (DS4) | Data Format | PESQ Score
SHD ID1, Transform Speech ID1, Audio Scene Geometry ID1 | Denoised Transform Speech ID1 | | T/F > P
SHD ID2, Transform Speech ID2, Audio Scene Geometry ID2 | Denoised Transform Speech ID2 | | T/F > P
SHD ID3, Transform Speech ID3, Audio Scene Geometry ID3 | Denoised Transform Speech ID3 | | T/F > P
SHD IDn, Transform Speech IDn, Audio Scene Geometry IDn | Denoised Transform Speech IDn | | T/F > P

Final evaluation: T/F

Denoting by i the record number in DS1, DS2, and DS3, the matrices reflect the results obtained with input records i and the corresponding outputs i.

DS1 | DS2 | DS3 | DS4 | Noise Cancellation output value (obtained through the AIM under test)
DS1[i] | DS2[i] | DS3[i] | DS4[i] | NoiseCancellation[i]
Execution time* | Duration of test execution.
Test comment* | Comments on test results and possible needed actions.
Test Date | yyyy/mm/dd.

* Optional field
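As an illustration of how the form in Table 59 could be held in software, the following Python sketch models its fields and pairs the dataset records by index i. All class, field, and function names are hypothetical and are not defined by this specification. The rule that the final evaluation is T only when every record passes is also an assumption, since the form does not state how the final value is derived.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Sequence


@dataclass
class RecordResult:
    """One row of the result matrix: the inputs with index i and their outcome."""
    shd_id: str                  # DS1[i]
    transform_speech_id: str     # DS2[i]
    scene_geometry_id: str       # DS3[i]
    expected_output_id: str      # DS4[i]
    pesq_above_p: bool           # the "T/F > P" column
    failure_reason: Optional[str] = None


@dataclass
class AssessmentForm:
    """Fields of the Performance Assessment form; starred fields are optional."""
    assessor_id: str
    standard_use_case_version: str   # for example "CAE:EAE:1:0"
    aim_name: str
    implementer_id: str
    implementation_version: str
    dataset_id: str
    test_id: str
    test_date: date
    records: List[RecordResult] = field(default_factory=list)
    neural_network_version: Optional[str] = None  # optional (*)
    execution_time: Optional[str] = None          # optional (*)
    test_comment: Optional[str] = None            # optional (*)

    def final_evaluation(self) -> bool:
        # Assumption: T only if there is at least one record and all records pass.
        return bool(self.records) and all(r.pesq_above_p for r in self.records)

    def formatted_date(self) -> str:
        # The form asks for the date as yyyy/mm/dd.
        return self.test_date.strftime("%Y/%m/%d")


def pair_records(
    ds1: Sequence[str],
    ds2: Sequence[str],
    ds3: Sequence[str],
    ds4: Sequence[str],
    passed: Sequence[bool],
) -> List[RecordResult]:
    """Pair DS1[i], DS2[i], DS3[i], DS4[i] with the outcome for record i."""
    if not (len(ds1) == len(ds2) == len(ds3) == len(ds4) == len(passed)):
        raise ValueError("all datasets must contain the same number of records")
    return [
        RecordResult(a, b, c, d, ok)
        for a, b, c, d, ok in zip(ds1, ds2, ds3, ds4, passed)
    ]
```

Rejecting datasets of unequal length up front keeps record i of every dataset aligned, which is what the index-based matrix relies on.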