(Tentative)
| Definition | Functional Requirements | Syntax | Semantics |
Definition
Visual Spatial Directive (PGM-VSD)
- Is produced by the Domain Access (DA) AIM.
- Provides referential cues, user intent, and contextual constraints to guide interpretation of visual scenes.
- Enables Visual Spatial Reasoning (VSR) to resolve ambiguous entities, assess spatial feasibility, and generate primitives aligned with the User’s interaction goals.
Functional Requirements
Visual Spatial Directive conveys the following main information elements:
| Function | Description |
| Referent Cueing | Includes ambiguous or partially resolved tokens (e.g., “that”, “there”) requiring spatial disambiguation; supports linkage to user gestures, gaze, or prior dialogue context |
| Intent Anchoring | Includes the current user intent or goal motivating the spatial query; supports alignment with Prompt Creation and User State |
| Scene Anchoring | Includes identifiers for the current scene or zone; supports multi-zone and multi-user environments |
| Constraint Injection | Includes spatial or interaction constraints relevant to the query; supports filtering of unreachable, occluded, or non-viable entities |
| Resolution Guidance | Includes optional hints or preferences for resolution strategy (e.g., prioritise gaze over proximity); supports modular resolution logic in VSR |
| Traceability | Includes origin metadata and timestamp for audit and replay |
Syntax
https://schemas.mpai.community/PGM1/V1.0/data/VisualSpatialDirective.json
Semantics
| Label | Description |
| Header | Visual Spatial Directive Header |
| – Standard-VSD | The characters “PGP-VSD-V” |
| – Version | Major version – 1 or 2 characters |
| – Dot-separator | The character “.” separating version components |
| – Subversion | Minor version – 1 or 2 characters |
| VisualSpatialDirectiveID | Unique identifier for this Visual Spatial Directive instance |
| SceneID | Identifier of the scene or zone being queried |
| Intent | User intent motivating the spatial query |
| ReferentCues | Tokens requiring spatial disambiguation |
| ├─ Token | Ambiguous expression (e.g., “that”, “there”) |
| └─ ContextHint | Optional semantic or gestural cue |
| Constraints | Spatial or interaction constraints |
| ├─ ConstraintType | Type of constraint (e.g., reachable_from, visible_from) |
| └─ ConstraintValue | Entity or zone involved |
| ResolutionHints | Optional preferences for resolution strategy |
| ├─ HintType | e.g., prioritise_gaze, exclude_occluded |
| └─ HintValue | Boolean or string |
| Trace | Provenance metadata |
| ├─ Origin | Module or subsystem that generated the query |
| └─ Timestamp | Time of creation |