| 1 Functions | 2 Reference Model | 3 Input/Output Data |
| 4 SubAIMs | 5 JSON Metadata | 6 Profiles |
| 7 Reference Software | 8 Conformance Testing | 9 Performance Assessment |
1 Functions
The Basic Audio-Visual Scene Description (OSD-BMS) AIM receives Audio Objects, Speech Objects, Visual Objects, and Audio-Visual Objects with their Space-Time information and produces the Descriptors of a Scene composed of those Objects and, optionally, an Alert:
| Receives | Space-Time | Of the Basic Audio-Visual Scene. |
| | Audio Objects | In input. |
| | Speech Objects | In input. |
| | Visual Objects | In input. |
| | Audio-Visual Objects | In input. |
| Processes | All Objects | |
| Creates | Audio, Speech, Visual, and Audio-Visual Scene Descriptors | From the Objects, if possible. |
| Combines | Scene Descriptors | Of all Objects. |
| Produces | Audio-Visual Scene Descriptors | |
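The Receives, Creates, Combines, and Produces steps listed above can be pictured with a minimal Python sketch. It is an illustration only: `SceneObject`, `create_descriptors`, `combine`, `describe_scene`, and the dictionary layout are hypothetical names and structures, not part of the specification, and the optional Alert is not modelled. The normative data formats are those defined by the JSON schemas.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Hypothetical labels for the four Object types the AIM receives.
KINDS = ("audio", "speech", "visual", "audio-visual")


@dataclass
class SceneObject:
    """Hypothetical stand-in for an input Object with its Space-Time information."""
    kind: str
    object_id: str
    space_time: Dict[str, Any] = field(default_factory=dict)


def create_descriptors(objects: List[SceneObject]) -> Dict[str, List[Dict[str, Any]]]:
    """Creates: build one descriptor list per Object type that is actually present."""
    per_kind: Dict[str, List[Dict[str, Any]]] = {}
    for obj in objects:
        if obj.kind not in KINDS:
            raise ValueError(f"unknown object kind: {obj.kind}")
        per_kind.setdefault(obj.kind, []).append(
            {"id": obj.object_id, "space_time": obj.space_time}
        )
    return per_kind


def combine(per_kind: Dict[str, List[Dict[str, Any]]]) -> Dict[str, Any]:
    """Combines: merge the per-type descriptors into a single scene record."""
    return {"scene_type": "audio-visual", "descriptors": per_kind}


def describe_scene(objects: List[SceneObject]) -> Dict[str, Any]:
    """Produces: the (illustrative) Audio-Visual Scene Descriptors."""
    return combine(create_descriptors(objects))
```

Calling `describe_scene` with two audio objects and one visual object yields a record whose `descriptors` entry has an `audio` list of two items and a `visual` list of one; types with no input objects are simply absent, which is one way to read "if possible" in the table.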
2 Reference Model
Figure 1 specifies the Reference Model of the Basic Audio-Visual Scene Description (OSD-BMS) AIM.

Figure 1 – The Basic Audio-Visual Scene Description (OSD-BMS) AIM
3 Input/Output Data
Table 1 specifies the Input and Output Data of the Basic Audio-Visual Scene Description (OSD-BMS) AIM.
Table 1 – I/O Data of the Basic Audio-Visual Scene Description (OSD-BMS) AIM
| Input | Description |
| Space-Time | Space-Time information of Objects. |
| Audio Objects | Input Audio Objects. |
| Speech Objects | Input Speech Objects. |
| Visual Objects | Input Visual Objects. |
| Audio-Visual Objects | Input Audio-Visual Objects. |
| Output | Description |
| Audio-Visual Scene Descriptors | The Audio-Visual Descriptors of the Scene. |
4 SubAIMs
Basic Audio-Visual Scene Description (OSD-AVB) is a Composite AIM whose Reference Model is depicted in Figure 2.

Figure 2 – Reference Model of Basic Audio-Visual Scene Description Composite (OSD-BMS) AIM
Table 2 provides the AI Modules composing the AIM.
Table 2 – AI Modules of the Basic Audio-Visual Scene Description (OSD-AVB) AIM
| Acronym | Name | JSON |
| OSD-AVS | Basic Audio-Visual Scene Description | X |
| OSD-BBS | Basic Audio Scene Description | X |
| OSD-BSS | Basic Speech Scene Description | X |
| OSD-BVS | Basic Visual Scene Description | X |
| OSD-AVA | Audio-Visual Alignment | X |
5 JSON Metadata
https://schemas.mpai.community/OSD/V1.4/AIMs/BasicAudioVisualSceneDescription.json
6 Profiles
No Profiles.
7 Reference Software
8 Conformance Testing
Table 3 provides the Conformance Testing Method for the OSD-BMS AIM. Conformance Testing of the individual AIMs of the OSD-BMS Composite AIM is given by the individual AIM Specifications.
If a schema contains references to other schemas, conformance of data for the primary schema implies that any data referencing a secondary schema shall also validate against the relevant schema, if present, and conform with the Qualifier, if present.
Table 3 – Conformance Testing Method for OSD-BMS AIM
| Receives | Space-Time | Shall validate against Space-Time schema. |
| | Speech Objects | Shall validate against Speech Objects schema. Speech Data shall conform with Qualifier. |
| | Audio Objects | Shall validate against Audio Objects schema. Audio Data shall conform with Qualifier. |
| | Visual Objects | Shall validate against Visual Objects schema. Visual Data shall conform with Qualifier. |
| | Audio-Visual Objects | Shall validate against Audio-Visual Objects schema. Audio-Visual Data shall conform with Qualifier. |
| Produces | Audio-Visual Scene Descriptors | Shall validate against AV Scene Descriptors schema. |
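The rule that data referencing a secondary schema must also validate against that schema, and conform with the Qualifier when one is present, can be sketched as a recursive check. The sketch below is a simplification under stated assumptions: the `SCHEMAS` table, its `required` and `refs` keys, the field names, and the `qualifier_check` callback are all hypothetical and do not reproduce the actual MPAI JSON schemas. A real conformance test would validate against the published schemas with a JSON Schema validator.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical, heavily simplified schema table: each entry lists required keys
# and which keys hold data governed by a secondary (referenced) schema.
SCHEMAS: Dict[str, Dict[str, Any]] = {
    "SpeechObject": {
        "required": ["id", "space_time"],
        "refs": {"space_time": "SpaceTime"},
    },
    "SpaceTime": {"required": ["position", "time"], "refs": {}},
}


def conformance_errors(
    data: Any,
    schema_name: str,
    schemas: Dict[str, Dict[str, Any]] = SCHEMAS,
    qualifier_check: Optional[Callable[[Dict[str, Any]], bool]] = None,
) -> List[str]:
    """Return a list of conformance errors; an empty list means the data conforms."""
    schema = schemas.get(schema_name)
    if schema is None:
        return []  # referenced schema not present: nothing to validate against
    if not isinstance(data, dict):
        return [f"{schema_name}: expected an object"]

    errors: List[str] = []
    # Primary schema: every required key must be present.
    for key in schema["required"]:
        if key not in data:
            errors.append(f"{schema_name}: missing '{key}'")
    # Secondary schemas: recurse into data that references another schema.
    for key, ref_name in schema["refs"].items():
        if key in data:
            errors.extend(
                conformance_errors(data[key], ref_name, schemas, qualifier_check)
            )
    # Qualifier: checked only if the data carries one and a check is supplied.
    if "qualifier" in data and qualifier_check is not None:
        if not qualifier_check(data):
            errors.append(f"{schema_name}: data does not conform with Qualifier")
    return errors
```

With these illustrative schemas, a speech object whose `space_time` lacks `time` fails even though the top-level keys are all present, which mirrors the "primary schema implies secondary schema" wording above.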