PGM-AUA V1.0 AIMs User State Description

1. Function	2. Reference Model	3. Input/Output Data
4. SubAIMs	5. JSON Metadata	6. Profiles
7. Reference Software	8. Conformance Testing	9. Performance Assessment

1. Function

The User State Description (PGM-USD) AIM derives a description of the User’s observable state by interpreting enriched audio‑visual scene information and cross‑modal correspondence evidence.

USD operates on the enhanced scene descriptors and alignment evidence produced by Space and User Description. Under directive control, it applies evidence‑based reasoning to construct a User Entity State suitable for downstream reasoning, control, and personalization.

2. Reference Model

Figure 1 gives the Reference Model of the User State Description (PGM-USD) AIM.

Figure 1 – Reference Model of User State Description (PGM-USD) AIM.

The User State Description reference model consists of an evidence‑driven interpretative pipeline operating on enriched perceptual inputs.

At a conceptual level, USD performs the following functions:

Multimodal Evidence Integration: Integration of enhanced audio descriptors, enhanced visual descriptors, and audio‑visual scene geometry to establish user‑centric evidence.
Linguistic and Paralinguistic Analysis: Interpretation of textual input (e.g. ASR output) and associated audio evidence to extract communicative cues relevant to user state.
Behavioural and Expressive Analysis: Interpretation of visual and audio evidence to derive observable behavioural and expressive indicators.
Entity State Construction: Evidence‑based construction of a User Entity State under the constraints imposed by directives and policies.
Output Packaging and Provenance: Packaging of the constructed User Entity State together with status and provenance metadata.

The reference model explicitly separates evidence extraction, interpretation, and state construction, ensuring traceability, auditability, and modularity.
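The separation of stages and the traceability it enables can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Traced` wrapper, the `run_stage` helper, and the toy descriptor values are hypothetical and are not defined by this specification. The sketch only shows how each stage could record its name so that the final User Entity State carries an auditable trail.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class Traced:
    """Hypothetical wrapper: a value plus the ordered list of steps that produced it."""
    value: Any
    provenance: List[str] = field(default_factory=list)


def run_stage(name: str, fn: Callable[..., Any], *inputs: Traced) -> Traced:
    """Apply fn to the input values and append the stage name to the merged trail."""
    trail: List[str] = []
    for item in inputs:
        for step in item.provenance:
            if step not in trail:  # keep each upstream step once, in first-seen order
                trail.append(step)
    result = fn(*(item.value for item in inputs))
    return Traced(result, trail + [name])


# Toy inputs (illustrative values only, not real descriptor formats).
audio = Traced({"pitch": "rising"}, ["Enhanced Audio Scene Descriptors"])
visual = Traced({"gaze": "averted"}, ["Enhanced Visual Scene Descriptors"])

# Evidence integration and state construction stay separate functions.
evidence = run_stage(
    "Multimodal Evidence Integration", lambda a, v: {**a, **v}, audio, visual
)
state = run_stage(
    "Entity State Construction", lambda e: {"observed": sorted(e)}, evidence
)

for step in state.provenance:
    print(step)
```

Because every stage is an ordinary function passed to `run_stage`, a stage can be replaced without touching the others, and the trail attached to `state` lists the inputs and stages that contributed to it.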

3. Input/Output Data

Table 1 – Input and Output Data of User State Description (PGM-USD) AIM

Input Data	Description
Enhanced Audio Scene Descriptors	Audio Scene Descriptors augmented with derived and semantic properties by Audio Scene Enhancement.
Enhanced Visual Scene Descriptors	Visual Scene Descriptors augmented with derived and semantic properties by Visual Scene Enhancement.
Speech	Speech component of Enhanced Audio Scene Descriptors.
User SUD Directive	Control directives specifying scope, depth, or policy constraints for user state interpretation.
User Domain Request	Domain‑specific knowledge supporting user state interpretation and constraint enforcement.
Output Data	Description
User Entity State	Structured description of the User’s observable state derived from multimodal evidence.
User SUD Status	Status information describing the execution and outcome of User State Description processing.
User Domain Response	Response to domain‑specific knowledge request.
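
As an illustration of Table 1, the inputs and outputs can be modelled as two plain records. This is a sketch under stated assumptions: the class names `USDInput` and `USDOutput`, the snake-case field names (derived from the row labels of Table 1), and all payload contents are hypothetical. The normative data formats are those defined by the specification, not the ones shown here.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict


@dataclass
class USDInput:
    """Hypothetical container mirroring the Input Data rows of Table 1."""
    enhanced_audio_scene_descriptors: Dict[str, Any]
    enhanced_visual_scene_descriptors: Dict[str, Any]
    speech: str  # placeholder reference to the speech data, not the audio itself
    user_sud_directive: Dict[str, Any]
    user_domain_request: Dict[str, Any]


@dataclass
class USDOutput:
    """Hypothetical container mirroring the Output Data rows of Table 1."""
    user_entity_state: Dict[str, Any]
    user_sud_status: Dict[str, Any]
    user_domain_response: Dict[str, Any]


# Toy payloads: the keys and values below are invented for the example.
example_output = USDOutput(
    user_entity_state={"attention": "engaged"},
    user_sud_status={"outcome": "ok"},
    user_domain_response={},
)

# Serialise to JSON so the record can be passed to a downstream consumer.
encoded = json.dumps(asdict(example_output), sort_keys=True)
print(encoded)
```

Keeping inputs and outputs as separate records makes it explicit which data an implementation consumes and which it is expected to produce.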

4. SubAIMs (Informative)

An implementation of the User State Description (PGM-USD) AIM may adopt the architecture of Figure 2.

Figure 2 – Reference Model of User State Description (PGM-USD) Composite AIM

A User State Description (PGM-USD) AIM adopting the architecture of Figure 2 uses the Input and Output Data of Table 2.

Table 2 – Input and Output Data of the User State Description (PGM-USD) Composite AIM’s SubAIMs.

SubAIM Specification	Purpose	Input Data	Output Data
Automatic Speech Recognition	Converts speech input into textual representation suitable for downstream multimodal processing.	Speech	TXT
Multimodal Input Harmonisation	Aligns audio, visual, and textual inputs into a harmonised multimodal context without semantic interpretation.	ASD, VSD, TXT, User‑SUD Directive	Harmonised Multimodal Context
Linguistic–Paralinguistic Analysis	Extracts linguistic and paralinguistic evidence from harmonised multimodal context.	Harmonised Multimodal Context, User‑SUD Directive, Domain RS	Linguistic‑Paralinguistic Evidence
Behavioural and Expressive Analysis	Derives behavioural and expressive indicators of the User from multimodal evidence.	Harmonised Multimodal Context, User‑SUD Directive, Domain RS	Behavioural & Expressive Indicators
Cross‑Modal Interpretation	Integrates linguistic and behavioural evidence into cross‑modal interpretative evidence under directives and domain constraints.	Linguistic‑Paralinguistic Evidence, Behavioural & Expressive Indicators, User‑SUD Directive, Domain RS	Cross‑Modal Interpretative Evidence
Entity State Construction	Constructs the User Entity State from cross‑modal interpretative evidence.	Cross‑Modal Interpretative Evidence, User‑SUD Directive	User Entity State, User‑SUD Status
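
The data flow implied by Table 2 can be checked mechanically. The sketch below transcribes the table as a dictionary and derives an execution order in which every SubAIM runs only after the SubAIMs producing its inputs. It rests on assumptions that the table does not state explicitly: Speech, ASD, VSD, User‑SUD Directive, and Domain RS are treated as arriving from outside the composite AIM, and the helper `execution_order` is hypothetical. Plain ASCII hyphens are used in the identifiers for convenience.

```python
from graphlib import TopologicalSorter

# Table 2 transcribed as data: SubAIM name -> (input data, output data).
SUBAIMS = {
    "Automatic Speech Recognition": (["Speech"], ["TXT"]),
    "Multimodal Input Harmonisation": (
        ["ASD", "VSD", "TXT", "User-SUD Directive"],
        ["Harmonised Multimodal Context"],
    ),
    "Linguistic-Paralinguistic Analysis": (
        ["Harmonised Multimodal Context", "User-SUD Directive", "Domain RS"],
        ["Linguistic-Paralinguistic Evidence"],
    ),
    "Behavioural and Expressive Analysis": (
        ["Harmonised Multimodal Context", "User-SUD Directive", "Domain RS"],
        ["Behavioural & Expressive Indicators"],
    ),
    "Cross-Modal Interpretation": (
        ["Linguistic-Paralinguistic Evidence", "Behavioural & Expressive Indicators",
         "User-SUD Directive", "Domain RS"],
        ["Cross-Modal Interpretative Evidence"],
    ),
    "Entity State Construction": (
        ["Cross-Modal Interpretative Evidence", "User-SUD Directive"],
        ["User Entity State", "User-SUD Status"],
    ),
}

# Assumption: these items are supplied from outside the composite AIM.
EXTERNAL = {"Speech", "ASD", "VSD", "User-SUD Directive", "Domain RS"}


def execution_order(subaims, external):
    """Return SubAIM names ordered so that producers precede their consumers."""
    producer = {out: name for name, (_, outs) in subaims.items() for out in outs}
    graph = {}
    for name, (inputs, _) in subaims.items():
        missing = [i for i in inputs if i not in external and i not in producer]
        if missing:
            raise ValueError(f"{name}: no producer for {missing}")
        # Map each SubAIM to the set of SubAIMs it depends on.
        graph[name] = {producer[i] for i in inputs if i not in external}
    return list(TopologicalSorter(graph).static_order())


ORDER = execution_order(SUBAIMS, EXTERNAL)
for position, name in enumerate(ORDER, start=1):
    print(position, name)
```

The two analysis SubAIMs depend only on the Harmonised Multimodal Context, so their relative order is not constrained by Table 2 and an implementation could run them in parallel.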