MPAI-PAF V2.2 AIWs Videoconference Client Transmitter

1 Functions	2 Reference Model	3 Input and Output Data
4 Functions of AI Modules	5 I/O Data of AI Modules	6 AIW, AIMs and JSON Metadata

The function of a Client Transmitter is to:

Receive from a Participant:
- Input Audio from the microphone.
- Input Visual from the camera.
- Participant’s Avatar Model.
- Participant’s language preferences (e.g., EN-US, IT-CH).
Send to the Server:
- Speech Object (for Authentication).
- Face Object (for Authentication).
- Input Portable Avatars containing:
  - Language preferences (at the start).
  - Avatar Model (at the start).
  - Speech.
  - Avatar Descriptors.
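
The items sent to the Server can be pictured as one data record in which two fields are filled only at the start. The following is a minimal sketch, not the normative MPAI-PAF data format: the class name, field names and the helper `make_portable_avatar` are illustrative assumptions, and `participant_id` is included because Portable Avatar Multiplexing lists Participant ID among its inputs.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PortableAvatar:
    """Illustrative container only; field names are assumptions."""
    participant_id: str
    speech: bytes = b""
    avatar_descriptors: Dict[str, object] = field(default_factory=dict)
    # The two fields below are sent only at the start of the meeting.
    language_preferences: Optional[List[str]] = None
    avatar_model: Optional[bytes] = None


def make_portable_avatar(participant_id, speech, avatar_descriptors,
                         language_preferences=None, avatar_model=None):
    """Build one message; pass the last two arguments only at the start."""
    # Assumption of this sketch: the two start-only fields travel together.
    if (language_preferences is None) != (avatar_model is None):
        raise ValueError("language preferences and Avatar Model go together")
    return PortableAvatar(participant_id, speech, avatar_descriptors,
                          language_preferences, avatar_model)


# First message of a meeting, then a later message without start-only data.
first = make_portable_avatar("participant-1", b"", {},
                             language_preferences=["EN-US", "IT-CH"],
                             avatar_model=b"<model bytes>")
later = make_portable_avatar("participant-1", b"\x01\x02", {"face": [0.1]})
```

Keeping the start-only fields optional lets the same record type serve both the first message and every later one.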

Figure 1 gives the Reference Model of Client Transmitter AIW. Red text refers to data sent at meeting start.

Figure 1 – Reference Model of Videoconference Client Transmitter (PAF-CTX) AIW

At the start, each participant provides their language preferences and Avatar Model.

During the videoconference:

Audio-Visual Scene Description produces Speech Objects, Face Objects, Face Descriptors, Body Descriptors and Audio-Visual Scene Geometry.
Automatic Speech Recognition produces Recognised Text.
Personal Status Extraction produces Personal Status.
Portable Avatar Multiplexing multiplexes Recognised Text, Personal Status, Input Speech, Face Descriptors, Body Descriptors, Language Selector, Avatar Model, and Participant ID.
Videoconference Client Transmitter sends Portable Avatars to the Avatar Videoconference Server, which processes them and redistributes them to the Client Receivers.
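
The sequence of steps above can be sketched as a chain of function calls. This is an illustration only: every function below is a hypothetical placeholder that returns dummy values, and the dictionary keys simply reuse the data names from the text. A real AIM would run actual audio, visual and language processing.

```python
def audio_visual_scene_description(input_audio, input_visual):
    # Placeholder: tags the raw inputs instead of analysing them.
    return {
        "Input Speech": input_audio,
        "Speech Object": ("speech-object", input_audio),
        "Face Object": ("face-object", input_visual),
        "Face Descriptors": ("face-descriptors", input_visual),
        "Body Descriptors": ("body-descriptors", input_visual),
    }


def automatic_speech_recognition(input_speech):
    # Placeholder: returns a dummy transcript.
    return "<recognised text for %d bytes>" % len(input_speech)


def personal_status_extraction(text, speech, face_descriptors, body_descriptors):
    # Placeholder: a real AIM would estimate the Personal Status.
    return {"status": "unknown"}


def portable_avatar_multiplexing(**fields):
    # Placeholder: bundles all fields into one record.
    return dict(fields)


def client_transmitter_step(input_audio, input_visual, participant_id,
                            language_selector=None, avatar_model=None):
    """Run the four AIMs once and return what is sent to the Server."""
    scene = audio_visual_scene_description(input_audio, input_visual)
    text = automatic_speech_recognition(scene["Input Speech"])
    status = personal_status_extraction(
        text, scene["Input Speech"],
        scene["Face Descriptors"], scene["Body Descriptors"])
    avatar = portable_avatar_multiplexing(
        recognised_text=text,
        personal_status=status,
        input_speech=scene["Input Speech"],
        face_descriptors=scene["Face Descriptors"],
        body_descriptors=scene["Body Descriptors"],
        language_selector=language_selector,
        avatar_model=avatar_model,
        participant_id=participant_id,
    )
    return avatar, scene["Speech Object"], scene["Face Object"]


avatar, speech_object, face_object = client_transmitter_step(
    b"\x00\x01", b"<frame>", "participant-1",
    language_selector="EN-US", avatar_model=b"<model>")
```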

Table 1 gives the input and output data of the Client Transmitter AIW:

Table 1 – Input and output data of Client Transmitter AIW

Input	Description
Input Text	Chat text used by a human to communicate with the Virtual Meeting Secretary or other participants.
Language Selector	The language the participant wishes to speak and hear.
Input Audio	Audio of Speech of participants in a meeting room.
Input Visual	Video of participants in a meeting room.
Avatar Model	The avatar model selected by the participant.
Output	Description
Speech Object	An utterance of a Participant used by Server for authentication.
Participant Portable Avatar	Portable Avatar produced by Client Transmitter.
Face Object	Participant’s face used by Server for authentication.

Table 2 gives the functions of AI Modules of the Client Transmitter AIW.

Table 2 – AI Modules of Client Transmitter AIW

AIM	Function
Audio-Visual Scene Description	1. Receives Input Audio and Input Visual. 2. Provides Input Speech, Speech Object, Face Descriptors, Body Descriptors, Face Object.
Automatic Speech Recognition	1. Receives Input Speech. 2. Provides Recognised Text.
Personal Status Extraction	1. Receives Recognised Text, Speech, Face Descriptors, Body Descriptors. 2. Provides the Participant’s Personal Status.
Portable Avatar Multiplexing	1. Receives Language Selector, Avatar Model, Input Text, Input Speech, Recognised Text, Personal Status, Participant ID, Face Descriptors, Body Descriptors. 2. Provides Participant Portable Avatars.
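
The "receives" and "provides" lists in Table 2 imply an execution order among the AIMs. The sketch below derives that order with Python's standard `graphlib` module. It is an illustration under stated assumptions: data names are normalised to those used in Table 1 and in the workflow description (for example Language Selector and Recognised Text), Participant ID is treated as externally supplied, and the helper `execution_order` is hypothetical.

```python
from graphlib import TopologicalSorter

# Data assumed to arrive from outside the AIMs (Participant ID is an assumption).
EXTERNAL = {"Input Audio", "Input Visual", "Input Text",
            "Language Selector", "Avatar Model", "Participant ID"}

# AIM name -> (data received, data provided), following Table 2.
AIMS = {
    "Audio-Visual Scene Description": (
        {"Input Audio", "Input Visual"},
        {"Input Speech", "Speech Object", "Face Object",
         "Face Descriptors", "Body Descriptors"}),
    "Automatic Speech Recognition": (
        {"Input Speech"}, {"Recognised Text"}),
    "Personal Status Extraction": (
        {"Recognised Text", "Input Speech",
         "Face Descriptors", "Body Descriptors"},
        {"Personal Status"}),
    "Portable Avatar Multiplexing": (
        {"Language Selector", "Avatar Model", "Input Text", "Input Speech",
         "Recognised Text", "Personal Status", "Participant ID",
         "Face Descriptors", "Body Descriptors"},
        {"Portable Avatar"}),
}


def execution_order(aims, external):
    """Return AIM names ordered so that each runs after its producers."""
    producer = {}
    for name, (_, outputs) in aims.items():
        for item in outputs:
            producer[item] = name
    dependencies = {}
    for name, (inputs, _) in aims.items():
        unresolved = inputs - external - set(producer)
        if unresolved:
            raise ValueError(f"{name} has unresolved inputs: {sorted(unresolved)}")
        dependencies[name] = {producer[i] for i in inputs if i in producer}
    return list(TopologicalSorter(dependencies).static_order())


ORDER = execution_order(AIMS, EXTERNAL)
```

Because each AIM depends on all of the AIMs before it, the resulting order is a single chain from Audio-Visual Scene Description to Portable Avatar Multiplexing. The same check raises an error if an AIM lists an input that nothing provides.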

Table 3 gives the input and output data of the AI Modules of the Client Transmitter AIW.

Table 3 – I/O Data of AI Modules of Client Transmitter AIW

AIM	Input	Output
Audio-Visual Scene Description	Input Audio, Input Visual	Input Speech, Speech Object, Face Object, Face Descriptors, Body Descriptors
Automatic Speech Recognition	Speech Objects	Recognised Text
Personal Status Extraction	Recognised Text, Input Speech, Face Object, Body Object	Personal Status
Portable Avatar Multiplexing	Recognised Text, Personal Status, Input Speech, Face Descriptors, Body Descriptors, Input Text, Language Selector, Avatar Model, Participant ID	Portable Avatars

Table 4 – AIW, AIMs, and JSON Metadata

AIW	AIMs/1	AIMs/2	AIMs/3	Name	JSON
PAF-CTX				Videoconference Client Transmitter	X
	OSD-AVS			Audio-Visual Scene Description	X
		CAE-ASD		Audio Scene Description	X
			CAE-AAT	Audio Analysis Transform	X
			CAE-ASL	Audio Source Localisation	X
			CAE-ASE	Audio Separation and Enhancement	X
			CAE-AST	Audio Synthesis Transform	X
			CAE-AMX	Audio Descriptors Multiplexing	X
		OSD-VSD		Visual Scene Description	X
		OSD-AVA		Audio-Visual Alignment	X
	MMC-ASR			Automatic Speech Recognition	X
	MMC-PSE			Personal Status Extraction	X
		MMC-ITD		Entity Text Description	X
		MMC-ISD		Entity Speech Description	X
		PAF-IFD		Entity Face Description	X
		PAF-IBD		Entity Body Description	X
		MMC-PTI		PS-Text Interpretation	X
		MMC-PSI		PS-Speech Interpretation	X
		PAF-PFI		PS-Face Interpretation	X
		PAF-PGI		PS-Gesture Interpretation	X
		MMC-PMX		Personal Status Multiplexing	X
	PAF-PMX			Portable Avatar Multiplexing	X
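
The column in which an identifier appears in Table 4 gives its nesting depth, so the table can be read as a tree rooted at PAF-CTX. The following sketch rebuilds the parent-child relations from (depth, identifier) pairs. The `ROWS` list transcribes the table, while the helper `build_children` is a hypothetical illustration, not part of the standard.

```python
# (nesting depth, identifier), in the row order of Table 4.
ROWS = [
    (0, "PAF-CTX"),
    (1, "OSD-AVS"),
    (2, "CAE-ASD"),
    (3, "CAE-AAT"), (3, "CAE-ASL"), (3, "CAE-ASE"),
    (3, "CAE-AST"), (3, "CAE-AMX"),
    (2, "OSD-VSD"),
    (2, "OSD-AVA"),
    (1, "MMC-ASR"),
    (1, "MMC-PSE"),
    (2, "MMC-ITD"), (2, "MMC-ISD"), (2, "PAF-IFD"), (2, "PAF-IBD"),
    (2, "MMC-PTI"), (2, "MMC-PSI"), (2, "PAF-PFI"), (2, "PAF-PGI"),
    (2, "MMC-PMX"),
    (1, "PAF-PMX"),
]


def build_children(rows):
    """Map each identifier to the list of identifiers nested directly under it."""
    children = {ident: [] for _, ident in rows}
    path = []  # path[d] is the most recent identifier seen at depth d
    for depth, ident in rows:
        del path[depth:]          # drop entries at this depth or deeper
        if path:
            children[path[-1]].append(ident)
        path.append(ident)
    return children


CHILDREN = build_children(ROWS)
```

For example, `CHILDREN["PAF-CTX"]` lists the four top-level AIMs of the workflow, and `CHILDREN["CAE-ASD"]` lists the five AIMs that make up Audio Scene Description.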
