1 Functions | 2 Reference Model | 3 Input/Output Data |
4 SubAIMs | 5 JSON Metadata | 6 Profiles |
7 Reference Software | 8 Conformance Testing | 9 Performance Assessment |
1 Functions
Audio Analysis Transform (CAE-AAT):
Receives | Audio Object | As Multichannel Audio |
Transforms | Multichannel Audio | Into frequency bands via a Fast Fourier Transform (FFT). Subsequent operations are carried out in discrete frequency bands. When such a configuration is used, a 50% overlap between consecutive audio blocks needs to be employed. The output is a data structure comprising complex-valued audio samples in the frequency domain. |
Produces | Audio Object | In the Transform domain. |
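The transform described above can be illustrated with a minimal sketch. It assumes a de-interleaved input (one list of samples per channel), a power-of-two block size, and no window function, since the text does not specify one. The names `fft`, `analysis_transform`, and `block_size` are hypothetical and are not taken from the specification or the Reference Software.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def analysis_transform(channels, block_size=8):
    """Split each channel into 50%-overlapping blocks and FFT each block.

    channels: one list of samples per channel (assumed layout).
    Returns result[channel][block][bin] as complex values.
    """
    if block_size < 2 or block_size & (block_size - 1):
        raise ValueError("block_size must be a power of two >= 2")
    hop = block_size // 2  # 50% overlap between consecutive blocks
    result = []
    for samples in channels:
        blocks = []
        # Trailing samples that do not fill a whole block are ignored here.
        for start in range(0, len(samples) - block_size + 1, hop):
            blocks.append(fft(samples[start:start + block_size]))
        result.append(blocks)
    return result


# Usage: two channels of 16 samples each, 8-sample blocks, hop of 4 samples.
tone = [math.cos(2 * math.pi * n / 8) for n in range(16)]
spectrum = analysis_transform([tone, [0.0] * 16], block_size=8)
print(len(spectrum), len(spectrum[0]), len(spectrum[0][0]))  # channels, blocks, bins
```

A real implementation would likely use an optimised FFT library and a much larger block size; the pure-Python version is only meant to show how the 50% overlap determines the hop between blocks.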
2 Reference Model
Figure 1 depicts the Reference Architecture of the Audio Analysis Transform (CAE-AAT) AIM.
Figure 1 – Audio Analysis Transform (CAE-AAT) AIM
3 Input/Output Data
Table 1 specifies the Input and Output Data of the Audio Analysis Transform (CAE-AAT) AIM.
Table 1 – Audio Analysis Transform (CAE-AAT) AIM
Input | Description |
Audio Object | Audio Object (with associated Microphone Array info) |
Output | Description |
Audio Object (Transform) | The result of the application of the Fast Fourier Transform to Multichannel Audio. |
4 SubAIMs
No SubAIMs.
5 JSON Metadata
https://schemas.mpai.community/CAE1/V2.3/AIMs/AudioAnalysisTransform.json
6 Profiles
No Profiles.
7 Reference Software
8 Conformance Testing
The following steps shall be followed when testing the Conformance of a CAE-AAT AIM instance.
- Use the following datasets:
- DS1: n Test Audio Object files containing Multichannel Audio in Interleaved Multichannel Audio format.
- DS2: n Expected Audio Object Output files including data in Transform Interleaved Multichannel Audio format.
- Feed the AIM under test with the Test files (DS1).
- Perform the following steps to compare the Audio Object (Transform) against the Expected Audio Objects (DS2):
- Check the data format of the Audio Object (Transform) against the format of the given Expected Audio Objects.
- Calculate the peak-to-peak Amplitude (A) of each Audio block in the Expected Audio Objects.
- Calculate the RMSE of each Audio block by comparing the Audio Object (Transform) (x) with the Expected Audio Objects (y).
- Accept the AIM under test if, for each audio block, these two conditions are satisfied:
- Data format of the Audio Object (Transform) is the same as the format of the Expected Audio Object and
- RMSE < A * 0.1%.
- The Conformance Tester will provide the following matrix containing a limited number of input records (n) with the corresponding outputs. If an input record fails, the tester will specify the reason why the test case fails.
Input data (DS1) | Expected Output Data (DS2) | Data Format | RMSE |
Audio Object (Microphone Array) ID1 | Audio Object (Transform) ID1 | T/F | < A*0.1% |
Audio Object (Microphone Array) ID2 | Audio Object (Transform) ID2 | T/F | < A*0.1% |
Audio Object (Microphone Array) ID3 | Audio Object (Transform) ID3 | T/F | < A*0.1% |
… | … | … | … |
Audio Object (Microphone Array) IDn | Audio Object (Transform) IDn | T/F | < A*0.1% |
- Final evaluation: T/F
Denoting with i the record number in DS1 and DS2, the matrices reflect the results obtained with input records i and the corresponding outputs i.
DS1 | DS2 | Audio Object (Transform) output value (from AIM under test) |
DS1[i] | DS2[i] | Audio Object (Transform)[i] |
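The per-block acceptance test described above (peak-to-peak Amplitude A, RMSE, and the 0.1% threshold) can be sketched as follows. This is an illustration under stated assumptions, not the Conformance Tester itself: each audio block is assumed to be a flat list of real numbers (for example, real and imaginary parts stored as separate values), and the data-format check is reduced to a length comparison. The function names are hypothetical.

```python
import math


def rmse(actual, expected):
    """Root-mean-square error between two equal-length blocks."""
    squared = sum((x - y) ** 2 for x, y in zip(actual, expected))
    return math.sqrt(squared / len(expected))


def block_conforms(actual, expected, tolerance=0.001):
    """Accept a block if RMSE < A * 0.1%, with A taken from the expected block.

    The length comparison stands in for the data-format check (assumption).
    """
    if not expected or len(actual) != len(expected):
        return False
    amplitude = max(expected) - min(expected)  # peak-to-peak Amplitude A
    return rmse(actual, expected) < amplitude * tolerance


def aim_conforms(actual_blocks, expected_blocks):
    """Final evaluation: True only if every audio block passes."""
    return len(actual_blocks) == len(expected_blocks) and all(
        block_conforms(a, e) for a, e in zip(actual_blocks, expected_blocks)
    )


# Usage with a hypothetical expected block and two candidate outputs.
expected = [0.0, 1.0, -1.0, 0.5]           # A = 2.0, so the threshold is 0.002
close = [v + 0.001 for v in expected]      # RMSE of about 0.001
far = [v + 0.01 for v in expected]         # RMSE of about 0.01
print(block_conforms(close, expected), block_conforms(far, expected))
```

With the strict inequality as written, an expected block whose values are all equal has A = 0 and cannot pass even on an exact match; the text does not state how such blocks are to be handled.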
Table 2 provides the Conformance Testing Method for the formats of the CAE-AAT AIM output.
Note: If a schema contains references to other schemas, conformance of data for the primary schema implies that any data referencing a secondary schema shall also validate against the relevant schema, if present, and conform with the Qualifier, if present.
Table 2 – Conformance Testing Method for CAE-AAT AIM
Receives | Audio Object (Microphone Array) | Shall validate against Audio Object schema. Audio Data shall conform with Audio Qualifier. |
Produces | Audio Object (Transform) | Shall validate against Audio Object schema. Audio Data shall conform with Audio Qualifier. |