Object and Scene Description (MPAI-OSD) is a project to develop a standard specifying technologies for describing objects and localising them in space. These technologies are shared by several use cases across several MPAI standards.

Figure 1 gives two examples showing the types of output assumed for the Audio and Visual Scene Descriptors.

Figure 1 – Audio and Visual Scene Description

Figure 2 shows one solution to the problem of assigning identifiers to the Objects extracted from an audio-visual scene, in particular for identifying those Objects that are audio-visual, such as a human and their speech.

Figure 2 – Audio-Visual Alignment
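
The alignment idea can be illustrated with a minimal sketch. It does not reproduce the method shown in Figure 2; it assumes, purely for illustration, that each audio and visual object carries a 3D position and that an audio object and a visual object lying close together belong to the same audio-visual object. The names `SceneObject`, `align`, `max_distance` and the `OBJ…` identifier scheme are hypothetical and are not taken from MPAI-OSD.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class SceneObject:
    local_id: str                          # identifier within its own audio or visual scene
    position: tuple[float, float, float]   # assumed (x, y, z) position, in metres


def align(audio, visual, max_distance=0.5):
    """Greedy nearest-neighbour pairing of audio and visual objects.

    Returns a dict mapping a shared identifier to (audio_id, visual_id).
    Objects without a partner keep None on the missing side.
    """
    # All candidate pairs, closest first.
    pairs = sorted(
        (dist(a.position, v.position), a.local_id, v.local_id)
        for a in audio
        for v in visual
    )
    used_audio, used_visual, result = set(), set(), {}
    for distance, a_id, v_id in pairs:
        if distance > max_distance:
            break  # remaining pairs are even farther apart
        if a_id in used_audio or v_id in used_visual:
            continue  # each object joins at most one pair
        used_audio.add(a_id)
        used_visual.add(v_id)
        result[f"OBJ{len(result) + 1}"] = (a_id, v_id)
    # Audio-only and visual-only objects still receive an identifier.
    for a in audio:
        if a.local_id not in used_audio:
            result[f"OBJ{len(result) + 1}"] = (a.local_id, None)
    for v in visual:
        if v.local_id not in used_visual:
            result[f"OBJ{len(result) + 1}"] = (None, v.local_id)
    return result


audio_objects = [SceneObject("a1", (0.0, 0.0, 1.0)), SceneObject("a2", (5.0, 0.0, 0.0))]
visual_objects = [SceneObject("v1", (0.1, 0.0, 1.0)), SceneObject("v2", (2.0, 2.0, 2.0))]
aligned = align(audio_objects, visual_objects)
```

In this example the speech source `a1` and the visual object `v1` are 0.1 m apart and share one identifier, while `a2` and `v2` remain audio-only and visual-only objects. A greedy match is used only for brevity; a real system could rely on other cues than position.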

Another example is provided by Figure 3.

Figure 3 – Visual Spatial Object Identification

Figure 4 is an example of the Conversation with Personal Status use case that makes use of all the (Composite) AI Modules described above.

Figure 4 – Reference Model of Conversation with Personal Status (MPAI-CPS)

MPAI has sought proposals for data formats and reference models for the identified application areas.

Call for Technologies (closed): html, pdf
Use Cases and Functional Requirements: html, pdf
Framework Licence: html, pdf
Template for responses: html, docx

See also the video recordings (YouTube, WimTV) and the slides of the presentation made on 07 September.