| Function | Reference Model | Input/Output Data |
| SubAIMs | JSON Metadata | Profiles |
Function
The Visual Scene Enhancement (PGM-VSE) AIM performs interpretative enrichment of a visual scene captured by Context Capture to derive additional, non‑perceptual visual properties relevant to spatial understanding, interaction, and user‑centric reasoning.
VSE operates exclusively on Visual Scene Descriptors (VSD) and produces Enhanced Visual Scene Descriptors, preserving the original perceptual semantics while augmenting them with derived and semantic information under directive control.
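The "preserve and augment" behaviour described above can be illustrated with a minimal sketch. It is not the normative data format: the dictionary layout, the `enhancements` key, and the `allowed_properties` directive field are all hypothetical names chosen for illustration.

```python
import copy

def enhance(vsd: dict, derived: dict, directive: dict) -> dict:
    """Return a new descriptor; the input `vsd` is never mutated."""
    # Deep-copy so the original perceptual description stays untouched.
    enhanced = copy.deepcopy(vsd)
    # Hypothetical directive field: the derived properties that may be attached.
    # If the field is absent, every derived property is attached.
    allowed = set(directive.get("allowed_properties", derived.keys()))
    # Derived data lives under its own key (an assumption of this sketch),
    # so it cannot overwrite perceptual fields.
    enhanced["enhancements"] = {k: v for k, v in derived.items() if k in allowed}
    return enhanced

# Toy data, invented for this example.
vsd = {"objects": [{"id": "obj-1", "position": [0.5, 0.0, 2.0]}]}
derived = {
    "proximity_class": {"obj-1": "near"},
    "object_type": {"obj-1": "human"},
}
directive = {"allowed_properties": ["proximity_class"]}

result = enhance(vsd, derived, directive)
```

Here `result` carries the original `objects` list unchanged plus only the derived property that the directive permits, while `vsd` itself is left as it was.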
Reference Model
Figure 1 gives the Reference Model of the Visual Scene Enhancement (PGM-VSE) AIM.

Figure 1 – The Reference Model of the Visual Scene Enhancement (PGM-VSE) AIM
Input/Output Data
Table 1 gives the Input and Output Data of the Visual Scene Enhancement (PGM-VSE) AIM.
Table 1 – Input/Output Data of the Visual Scene Enhancement (PGM-VSE) AIM
| Input | Description |
|---|---|
| Visual Scene Descriptors | Perceptual description of the visual scene produced by Context Capture. |
| Visual SUD Directive | Control directives specifying scope, depth, or policy constraints for visual enhancement. |
| Visual Domain Request | Domain‑specific knowledge supporting visual interpretation and semantic classification. |
| Output | Description |
| Enhanced Visual Scene Descriptors | Visual Scene Descriptors augmented with derived and semantic visual properties produced by VSE. |
| Visual SUD Status | Status information describing the execution and outcome of Visual Scene Enhancement processing. |
| Visual Domain Response | Response to domain‑specific knowledge request. |
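As a rough illustration of the interface in Table 1, the inputs and outputs could be grouped into two record types. This is only a sketch: the class names, the field names, the use of plain dictionaries as payloads, and the placeholder `run_vse` function are assumptions, and the optionality of the domain fields is a guess rather than something the table states.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class VSEInput:
    # Field names mirror the Input rows of Table 1.
    visual_scene_descriptors: dict
    visual_sud_directive: dict
    visual_domain_request: Optional[dict] = None  # optionality is assumed

@dataclass(frozen=True)
class VSEOutput:
    # Field names mirror the Output rows of Table 1.
    enhanced_visual_scene_descriptors: dict
    visual_sud_status: dict
    visual_domain_response: Optional[dict] = None  # optionality is assumed

def run_vse(inputs: VSEInput) -> VSEOutput:
    """Placeholder that only shows the call shape; it performs no enhancement."""
    # The status content is invented for this sketch.
    status = {"outcome": "no-op"}
    return VSEOutput(
        enhanced_visual_scene_descriptors=dict(inputs.visual_scene_descriptors),
        visual_sud_status=status,
    )

out = run_vse(VSEInput({"objects": []}, {"depth": "full"}))
```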
SubAIMs
A Visual Scene Enhancement (PGM-VSE) AIM implementation may adopt the architecture of Figure 2.
Figure 2 – Reference Model of Visual Scene Enhancement (PGM-VSE) Composite AIM
Table 3 specifies the Functions and I/O Data of the SubAIMs of the Visual Scene Enhancement (PGM-VSE) AIM.
Table 3 – Functions and I/O Data of the SubAIMs of the Visual Scene Enhancement (PGM-VSE) AIM
| SubAIM Specification | Purpose | Input Data | Output Data |
|---|---|---|---|
| Visual Descriptors Parsing | Structures raw Visual Scene Descriptors into explicit Visual Objects and spatial attributes without semantic interpretation. | Visual Scene Descriptors; Visual SUD Directive | Visual Objects; Spatial Attitudes; Visual SUD Status |
| Visual Motion & Proximity Analysis | Detects temporal and spatial dynamics of visual objects by tracking their evolution in space and time. | Visual Objects; Spatial Attitudes | Motion Flags (e.g. stationary, moving); Proximity Class (e.g. near, mid, far) |
| Depth and Occlusion Estimation | Computes relative depth relationships and occlusion conditions among visual objects. | Visual Objects; Spatial Attitudes | Relative Depths; Occlusion Flags |
| Visual Object Identification | Assigns semantic object type labels to visual objects using classification models and optional domain knowledge. | Visual Objects; Spatial Attitudes; Domain Response | Visual Object Type (e.g. human, vehicle, tool); Type Confidence |
| Visual Salience Mapping | Determines the relative relevance of visual objects with respect to user interaction and context. | Motion Flags; Proximity Class; Relative Depths; Occlusion Flags; Visual Object Type; Visual SUD Directive; Domain Response | Ranked Visual Objects; Filtered Salient Visual Objects |
| Visual Output Construction | Aggregates perceptual and enriched evidence into Enhanced Visual Scene Descriptors and emits execution status. | Visual Objects; Spatial Attitudes; Motion Flags; Proximity Class; Relative Depths; Occlusion Flags; Visual Object Type; Salience Results | Enhanced Visual Scene Descriptors (Enhanced VSD); Visual SUD Status |
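The Input Data and Output Data columns of the SubAIMs table imply a partial execution order. The sketch below transcribes those columns and derives one valid order with Python's standard `graphlib` module (Python 3.9 or later). Two points are assumptions of this sketch, not statements of the specification: the "Salience Results" input of Visual Output Construction is taken to mean the two outputs of Visual Salience Mapping, and `SUBAIMS` and `execution_order` are hypothetical names.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# name -> (input data, output data), transcribed from the SubAIMs table.
SUBAIMS = {
    "Visual Descriptors Parsing": (
        {"Visual Scene Descriptors", "Visual SUD Directive"},
        {"Visual Objects", "Spatial Attitudes", "Visual SUD Status"},
    ),
    "Visual Motion & Proximity Analysis": (
        {"Visual Objects", "Spatial Attitudes"},
        {"Motion Flags", "Proximity Class"},
    ),
    "Depth and Occlusion Estimation": (
        {"Visual Objects", "Spatial Attitudes"},
        {"Relative Depths", "Occlusion Flags"},
    ),
    "Visual Object Identification": (
        {"Visual Objects", "Spatial Attitudes", "Domain Response"},
        {"Visual Object Type", "Type Confidence"},
    ),
    "Visual Salience Mapping": (
        {"Motion Flags", "Proximity Class", "Relative Depths", "Occlusion Flags",
         "Visual Object Type", "Visual SUD Directive", "Domain Response"},
        {"Ranked Visual Objects", "Filtered Salient Visual Objects"},
    ),
    "Visual Output Construction": (
        # Assumption: "Salience Results" is expanded to the two outputs
        # of Visual Salience Mapping.
        {"Visual Objects", "Spatial Attitudes", "Motion Flags", "Proximity Class",
         "Relative Depths", "Occlusion Flags", "Visual Object Type",
         "Ranked Visual Objects", "Filtered Salient Visual Objects"},
        {"Enhanced Visual Scene Descriptors", "Visual SUD Status"},
    ),
}

def execution_order(subaims: dict) -> list:
    """Order SubAIMs so that each runs after the producers of its inputs."""
    graph = {}
    for consumer, (inputs, _) in subaims.items():
        # A SubAIM depends on every other SubAIM that outputs one of its inputs.
        graph[consumer] = {
            producer
            for producer, (_, outputs) in subaims.items()
            if producer != consumer and inputs & outputs
        }
    return list(TopologicalSorter(graph).static_order())

order = execution_order(SUBAIMS)
```

Under these assumptions, Visual Descriptors Parsing always comes first, Visual Salience Mapping second to last, and Visual Output Construction last. The three analysis SubAIMs in between depend only on the parsing outputs, so their relative order is not constrained by the data flow.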
Table 4 gives the AIMs composing the Visual Scene Enhancement (PGM-VSE) Composite AIM:
Table 4 – AIMs of the Visual Scene Enhancement (PGM-VSE) Composite AIM
| AIM | AIMs | Names | JSON |
|---|---|---|---|
| PGM-VSE | | Visual Scene Enhancement | Link |
| | PGM-ADP | Visual Descriptors Parsing | Link |
| | PGM-VMP | Visual Motion & Proximity Analysis | Link |
| | PGM-DOE | Depth and Occlusion Estimation | Link |
| | OSD-VOI | Visual Object Identification | Link |
| | PGM-SMP | Visual Salience Mapping | Link |
| | PGM-VOC | Visual Output Construction | Link |
JSON Metadata
https://schemas.mpai.community/PGM1/V1.0/AIMs/VisualSceneEnhancement.json
Profiles
No Profiles