1      Functions

Speaker Identity Recognition (MMC-SIR):

Receives Auxiliary Text related to the Speech Object.
Speech Object for which the Speaker ID is requested.
Speech Time for which the Speaker ID is requested.
Speech Overlap signalling which parts of the Speech Object contain overlapping Speech.
Produces Speaker Identifier.

2      Reference Architecture

The Reference Architecture is depicted in Figure 1.

Figure 1 – The Speaker Identity Recognition AIM

3      I/O Data

Table 1 specifies the Input and Output Data of the Speaker Identity Recognition AIM.

Table 1 – I/O Data of the Speaker Identity Recognition AIM

Input               Description
Auxiliary Text      Text with content related to the Speaker ID.
Speech Object       Speech Object emitted by the Speaker.
Speech Time         The start and end time of the Speech.
Speech Overlap      Information about which parts of the Speech Object contain overlapping Speech.
Output              Description
Speaker Identifier  The identifier of the Speaker who emitted the Speech Object.
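
The inputs in Table 1 can be pictured as a simple data structure. The following is a minimal sketch only, not part of the specification: the class names, field names, the use of seconds as the time unit, and the representation of Speech Overlap as a list of (start, end) intervals are all assumptions made for illustration. The helper shows one way an implementation might use Speech Overlap to find the portions of the Speech Time that contain a single speaker.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SpeechTime:
    """Start and end time of the Speech (seconds are an assumed unit)."""
    start: float
    end: float


@dataclass
class SpeakerIdentityRecognitionInput:
    """Hypothetical container mirroring the Input rows of Table 1."""
    speech_object: bytes                  # Speech Object emitted by the Speaker
    speech_time: SpeechTime               # start and end time of the Speech
    auxiliary_text: Optional[str] = None  # Text related to the Speaker ID
    # Assumed representation: intervals of the Speech Object with overlapping Speech.
    speech_overlap: List[Tuple[float, float]] = field(default_factory=list)


def non_overlapping_segments(
    inp: SpeakerIdentityRecognitionInput,
) -> List[Tuple[float, float]]:
    """Return the parts of the Speech Time not covered by any overlap interval."""
    segments = []
    cursor = inp.speech_time.start
    for o_start, o_end in sorted(inp.speech_overlap):
        # Clip each overlap interval to the requested Speech Time.
        o_start = max(o_start, inp.speech_time.start)
        o_end = min(o_end, inp.speech_time.end)
        if o_end <= o_start:
            continue  # interval lies outside the Speech Time
        if o_start > cursor:
            segments.append((cursor, o_start))
        cursor = max(cursor, o_end)
    if cursor < inp.speech_time.end:
        segments.append((cursor, inp.speech_time.end))
    return segments
```

An implementation could, for example, restrict recognition to the returned segments, since overlapping Speech makes it harder to attribute a segment to one Speaker. The actual data formats are defined by the JSON Metadata referenced in Section 5.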

4     SubAIMs

No SubAIMs

5     JSON Metadata

https://schemas.mpai.community/MMC/V2.2/AIMs/SpeakerIdentityRecognition.json

6      Profiles

No Profiles