The Audio Scene Description Composite AIM is specified in the following six sections.
1 Functions of Audio Scene Description
2 Reference Model of Audio Scene Description
3 I/O Data of Audio Scene Description
4 Functions of AI Modules of Audio Scene Description
5 I/O Data of AI Modules of Audio Scene Description
6 AIM and JSON Metadata Specification of Audio Scene Description
1 Functions of Audio Scene Description
Audio Scene Description (CAE-ASD):
- Receives the Audio Scene composed of:
  - Microphone Array Geometry.
  - Multichannel Audio, i.e., the output of the Microphone Array.
- Separates Audio Objects in the scene.
- Produces Audio Scene Descriptors.
2 Reference Model of Audio Scene Description
Figure 1 depicts the Reference Model of CAE-ASD.
Figure 1 – Reference Model of Audio Scene Description Composite AIM
3 I/O Data of Audio Scene Description
Table 1 gives the Input/Output data of Audio Scene Description.
Table 1 – I/O data of Audio Scene Description
Input data | Comment |
Microphone Array Geometry | The description of the spatial microphone arrangement. |
Multichannel Audio | The Audio output of the Microphone Array. |
Output data | Comment |
Audio Scene Descriptors | The Descriptors of the Audio Scene. |
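The inputs in Table 1 can be pictured as two plain data containers. The sketch below is illustrative only: the class fields (`positions`, `channels`, `sample_rate_hz`) and the helper `check_scene_input` are hypothetical and are not defined by this specification. It also assumes one audio channel per microphone, which follows from Multichannel Audio being the output of the Microphone Array but is not stated as a normative rule.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MicrophoneArrayGeometry:
    # Hypothetical layout: one (x, y, z) position per microphone.
    positions: List[Tuple[float, float, float]]


@dataclass
class MultichannelAudio:
    # Hypothetical layout: channels[c][n] is sample n of channel c.
    channels: List[List[float]]
    sample_rate_hz: int


def check_scene_input(geometry: MicrophoneArrayGeometry,
                      audio: MultichannelAudio) -> bool:
    """Return True when there is one channel per microphone and
    all channels have the same number of samples (assumed rule)."""
    if len(audio.channels) != len(geometry.positions):
        return False
    lengths = {len(channel) for channel in audio.channels}
    return len(lengths) <= 1


# Usage: a two-microphone array with two channels of equal length.
geometry = MicrophoneArrayGeometry(positions=[(0.0, 0.0, 0.0), (0.1, 0.0, 0.0)])
audio = MultichannelAudio(channels=[[0.0, 0.1, 0.2], [0.0, 0.2, 0.4]],
                          sample_rate_hz=48000)
scene_ok = check_scene_input(geometry, audio)
```

A check of this kind would sit at the boundary of the Composite AIM, before any of its internal AI Modules run.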
4 Functions of AI Modules of Audio Scene Description
Table 2 gives the list of the AIMs with their functions.
Table 2 – AI Modules of Audio Scene Description
AIM | Function |
Audio Analysis Transform | |
Audio Source Localisation | |
Audio Separation and Enhancement | |
Audio Synthesis Transform | |
Audio Descriptor Multiplexing | |
5 I/O Data of AI Modules of Audio Scene Description
Table 3 – AI Modules of Audio Scene Description and their data
AIM | Input Data | Output Data |
Audio Analysis Transform | Multichannel Audio | Transform Multichannel Audio |
Audio Source Localisation | Transform Multichannel Audio, Microphone Array Geometry | Audio Spatial Attitudes |
Audio Separation and Enhancement | Audio Spatial Attitudes, Transform Multichannel Audio, Microphone Array Geometry | Transform Enhanced Audio, Audio Scene Geometry |
Audio Synthesis Transform | Transform Enhanced Audio | Enhanced Audio |
Audio Descriptor Multiplexing | Enhanced Audio, Audio Scene Geometry, Microphone Array Geometry | Audio Scene Descriptors |
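The input and output data of the AI Modules in Table 3 imply a processing order: each AIM can run only after the AIMs that produce its inputs. The following sketch, which is not part of the specification, transcribes the table into a dictionary and derives that order with Python's standard `graphlib` module. The names `AIMS`, `EXTERNAL_INPUTS` and `execution_order` are hypothetical.

```python
from graphlib import TopologicalSorter

# AIM name -> (input data, output data), transcribed from Table 3.
AIMS = {
    "Audio Analysis Transform": (
        ["Multichannel Audio"],
        ["Transform Multichannel Audio"],
    ),
    "Audio Source Localisation": (
        ["Transform Multichannel Audio", "Microphone Array Geometry"],
        ["Audio Spatial Attitudes"],
    ),
    "Audio Separation and Enhancement": (
        ["Audio Spatial Attitudes", "Transform Multichannel Audio",
         "Microphone Array Geometry"],
        ["Transform Enhanced Audio", "Audio Scene Geometry"],
    ),
    "Audio Synthesis Transform": (
        ["Transform Enhanced Audio"],
        ["Enhanced Audio"],
    ),
    "Audio Descriptor Multiplexing": (
        ["Enhanced Audio", "Audio Scene Geometry",
         "Microphone Array Geometry"],
        ["Audio Scene Descriptors"],
    ),
}

# Data supplied to the Composite AIM from outside (Table 1).
EXTERNAL_INPUTS = {"Microphone Array Geometry", "Multichannel Audio"}


def execution_order(aims=AIMS, external=EXTERNAL_INPUTS):
    """Return the AIM names in an order that respects data dependencies."""
    # Map each data item to the AIM that produces it.
    producer = {out: name for name, (_, outs) in aims.items() for out in outs}
    graph = {}
    for name, (inputs, _) in aims.items():
        deps = set()
        for item in inputs:
            if item in external:
                continue  # provided by the caller, no upstream AIM needed
            if item not in producer:
                raise ValueError(f"{name}: no producer for {item!r}")
            deps.add(producer[item])
        graph[name] = deps
    # TopologicalSorter expects node -> set of predecessors.
    return list(TopologicalSorter(graph).static_order())


order = execution_order()
```

For this table the dependencies form a single chain, so the order is the same as the row order of Table 3. The same check also confirms that every input listed in the table is either an external input or the output of another AIM.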
6 AIM and JSON Metadata Specification of Audio Scene Description
Table 4 – AIM and JSON Metadata
AIW | AIMs | Names | JSON |
CAE-ASD | | Audio Scene Description | X |
– | CAE-AAT | Audio Analysis Transform | X |
– | CAE-ASL | Audio Source Localisation | X |
– | CAE-ASE | Audio Separation and Enhancement | X |
– | CAE-AST | Audio Synthesis Transform | X |
– | CAE-AMX | Audio Descriptor Multiplexing | X |
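As a reading aid, the acronyms in Table 4 can be held in a small lookup structure. This is a minimal sketch and not the JSON Metadata itself, whose schema is not reproduced here; the names `COMPOSITE`, `MEMBERS` and `describe` are hypothetical.

```python
# Acronym and name of the Composite AIM, transcribed from Table 4.
COMPOSITE = ("CAE-ASD", "Audio Scene Description")

# Acronym -> name of each AIM inside the Composite AIM (Table 4).
MEMBERS = {
    "CAE-AAT": "Audio Analysis Transform",
    "CAE-ASL": "Audio Source Localisation",
    "CAE-ASE": "Audio Separation and Enhancement",
    "CAE-AST": "Audio Synthesis Transform",
    "CAE-AMX": "Audio Descriptor Multiplexing",
}


def describe(acronym: str) -> str:
    """Return the name for an acronym; for the Composite AIM,
    also list the acronyms of the AIMs it contains."""
    if acronym == COMPOSITE[0]:
        return f"{COMPOSITE[1]} (composite of {', '.join(MEMBERS)})"
    return MEMBERS[acronym]  # raises KeyError for unknown acronyms


name = describe("CAE-AST")
summary = describe("CAE-ASD")
```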