Object and Scene Description (MPAI-OSD) is a project for a standard specifying technologies for describing objects and localising them in space. These technologies are used in several use cases across several MPAI standards.
Figure 1 gives two examples showing the assumed types of output of Audio and Visual Scene Description.
Figure 1 – Audio and Visual Scene Description
Figure 2 shows one solution to the problem of assigning identifiers to the Objects extracted from an audio-visual scene, especially for identifying Objects that are audio-visual, such as a human and their speech.
Figure 2 – Audio-Visual Alignment
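As a purely illustrative sketch of what assigning shared identifiers to audio and visual Objects could look like, the Python below pairs Objects by spatial proximity. This is an assumption made for illustration, not the method defined by MPAI-OSD: the `SceneObject` type, the `align` function, the `AV1`-style identifiers, the shared coordinate frame, and the `max_distance` threshold are all hypothetical names and choices.

```python
from dataclasses import dataclass
from math import dist
from typing import Optional


@dataclass
class SceneObject:
    """Hypothetical stand-in for an Object in an audio or visual scene."""
    local_id: str                         # identifier within its own scene description
    position: tuple[float, float, float]  # assumed (x, y, z) in a frame shared by both scenes
    av_id: Optional[str] = None           # shared audio-visual identifier, set when matched


def align(audio, visual, max_distance=0.5):
    """Greedily pair audio and visual objects that lie close together.

    Returns a list of (audio_local_id, visual_local_id, shared_id) tuples.
    Objects farther apart than max_distance stay unmatched (av_id is None).
    """
    # All candidate pairs, closest first.
    candidates = sorted(
        (dist(a.position, v.position), i, j)
        for i, a in enumerate(audio)
        for j, v in enumerate(visual)
    )
    used_audio, used_visual = set(), set()
    matches = []
    for distance, i, j in candidates:
        if distance > max_distance:
            break  # remaining pairs are even farther apart
        if i in used_audio or j in used_visual:
            continue  # each object takes part in at most one pair
        used_audio.add(i)
        used_visual.add(j)
        shared_id = f"AV{len(matches) + 1}"
        audio[i].av_id = shared_id
        visual[j].av_id = shared_id
        matches.append((audio[i].local_id, visual[j].local_id, shared_id))
    return matches


# Example: two speech sources and three visual objects.
audio_objects = [
    SceneObject("A1", (0.0, 0.0, 1.0)),
    SceneObject("A2", (5.0, 0.0, 1.0)),
]
visual_objects = [
    SceneObject("V1", (5.1, 0.0, 1.0)),
    SceneObject("V2", (0.2, 0.0, 1.0)),
    SceneObject("V3", (9.0, 9.0, 9.0)),
]
matches = align(audio_objects, visual_objects)
```

In this sketch a visual Object with no nearby audio Object (here `V3`) keeps `av_id` set to `None`, so it remains a purely visual Object. A greedy nearest-first pairing is chosen only for brevity; an optimal assignment method could be substituted.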
Another example is provided by Figure 3.
Figure 3 – Visual Spatial Object Identification
Figure 4 is an example of the Conversation with Personal Status use case that makes use of all the (Composite) AI Modules described above.
Figure 4 – Reference Model of Conversation with Personal Status (MPAI-CPS)
MPAI has sought proposals for data formats and reference models for the identified application areas.
Document | Formats |
Call for Technologies (closed) | html, pdf |
Use Cases and Functional Requirements | html, pdf |
Framework Licence | html, pdf |
Template for responses | html, docx |
See also the video recordings (YouTube, WimTV) and the slides of the presentation made on 07 September.