(Tentative)
| Definition | Functional Requirements | Syntax | Semantics |
Definition
Visual Spatial Output (PGM-VSO):
- Is produced by the Visual Spatial Reasoning (VSR) AIM.
- Provides spatial relationships, referent resolutions, and interaction constraints to Domain Access (DA).
- Supports reasoning about physical feasibility, visibility, and referential clarity within a visual scene.
Functional Requirements
Visual Spatial Output conveys the following main information elements:
| Function | Description |
| Referent Resolution | Resolves vague or underspecified expressions (e.g. “that one”, “nearby object”) into object IDs |
| Spatial Primitives | Represents geometric and topological data (e.g. bounding boxes, centroids, zones, vectors) |
| Occlusion Mapping | Indicates which entities are visually occluded from the User’s perspective |
| Reachability Assessment | Determines whether entities are physically reachable given constraints such as distance, obstacles, or posture |
| Zone Classification | Classifies spatial zones (e.g. shelf, table, floor) with semantic tags and interaction affordances |
| Scene Anchoring | Includes a unique VisualSceneID and a timestamp to anchor outputs to a specific frame of reference |
| Interaction Constraints | Represents constraints such as proximity thresholds, collision risks, or required orientation for interaction |
| Traceability | Includes provenance metadata (e.g. VSR ID, timestamp, origin) for audit and rollback |
| Modularity | Supports decomposition into subcomponents (e.g. ResolvedReferents, OcclusionMap, ReachabilityMap) for selective consumption |
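The Modularity requirement can be illustrated with a minimal Python sketch of selective consumption. It assumes a Visual Spatial Output instance has already been parsed into a dictionary keyed by subcomponent name. The helper `select_components` and the entity and zone IDs are hypothetical and are not defined by this specification.

```python
from typing import Any, Dict, Iterable


def select_components(vso: Dict[str, Any], wanted: Iterable[str]) -> Dict[str, Any]:
    """Return only the requested top-level subcomponents that are present.

    Hypothetical helper: the specification asks for selective consumption
    but does not prescribe how a consumer should implement it.
    """
    return {name: vso[name] for name in wanted if name in vso}


# Illustrative instance fragment; all IDs are made up for this example.
example_vso = {
    "ResolvedReferents": {"SourceEntity": "obj-12", "TargetZone": "zone-3"},
    "OcclusionMap": {"obj-12": False, "obj-7": True},
    "ReachabilityMap": {"obj-12": True, "obj-7": False},
}

# A consumer interested only in reachability ignores the other subcomponents.
reachability_only = select_components(example_vso, ["ReachabilityMap"])
```

Requested names that are absent from the instance are skipped rather than raising an error, so a consumer can ask for optional subcomponents without checking for them first.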
Syntax
https://schemas.mpai.community/PGM1/V1.0/data/VisualSpatialOutputs.json
Semantics
| Label | Description |
| Header | Visual Spatial Output Header |
| ├─ Standard-VSO | The characters “PGM1-VSO-V” |
| ├─ Version | Major version – 1 or 2 characters |
| ├─ Dot-separator | The character “.” separating version components |
| └─ Subversion | Minor version – 1 or 2 characters |
| VisualSceneID | Unique identifier for the spatial scene |
| ResolvedReferents | Entities resolved from user input |
| ├─ SourceEntity | Identifier of the primary object of attention |
| ├─ TargetZone | Identifier of the spatial target zone |
| └─ AdditionalEntities | Optional list of secondary entities involved |
| OcclusionMap | Map of occlusion status for entities in the scene |
| ├─ Key | Entity ID |
| └─ Value | Boolean: true if occluded, false if visible |
| ReachabilityMap | Map of reachability status for entities |
| ├─ Key | Entity ID |
| └─ Value | Boolean: true if reachable, false if not |
| ZoneClassification | Semantic classification of spatial zones |
| ├─ Key | Zone ID |
| └─ Value | Semantic label (e.g. “shelf”, “floor”, “container”) |
| InteractionConstraints | Constraints relevant to physical interaction |
| ├─ ProximityThreshold | Minimum distance required for interaction |
| ├─ CollisionRisk | Whether interaction poses a collision risk |
| └─ RequiredOrientation | Required orientation for successful interaction |
| Trace | Provenance metadata |
| ├─ Timestamp | Time at which the Visual Spatial Output was generated |
| └─ Origin | Module or subsystem that generated the Visual Spatial Output |
| DescrMetadata | Descriptive metadata for schema documentation and audit |
| ├─ Author | Author or generator of the schema instance |
| ├─ SchemaVersion | Version of the schema used |
| └─ Validated | Boolean indicating schema validation status |
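To make the labels above concrete, the following Python sketch builds one possible instance and reads it. It is not normative; the JSON schema referenced under Syntax remains the authority. Several points are assumptions: that the Header is serialized as a single string, that Version and Subversion are decimal digits, the value types chosen for InteractionConstraints, and every ID and value in the instance. The helpers `parse_header` and `actionable_entities` are hypothetical.

```python
import re
from typing import Any, Dict, List, Optional, Tuple

# Assumption: Version and Subversion are 1 or 2 decimal digits each.
HEADER_PATTERN = re.compile(r"PGM1-VSO-V(\d{1,2})\.(\d{1,2})")


def parse_header(header: str) -> Optional[Tuple[int, int]]:
    """Return (major, minor) if the header matches the assumed layout, else None."""
    match = HEADER_PATTERN.fullmatch(header)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def actionable_entities(vso: Dict[str, Any]) -> List[str]:
    """List entity IDs explicitly marked reachable (true) and not occluded (false).

    Entities missing from either map are excluded; this conservative choice
    is an assumption of the sketch, not a rule from the specification.
    """
    occlusion = vso.get("OcclusionMap", {})
    reachability = vso.get("ReachabilityMap", {})
    return sorted(
        entity
        for entity, reachable in reachability.items()
        if reachable is True and occlusion.get(entity) is False
    )


# Illustrative instance; all identifiers and values are invented.
vso_instance = {
    "Header": "PGM1-VSO-V1.0",
    "VisualSceneID": "scene-0001",
    "ResolvedReferents": {
        "SourceEntity": "obj-12",
        "TargetZone": "zone-3",
        "AdditionalEntities": ["obj-7"],
    },
    "OcclusionMap": {"obj-12": False, "obj-7": True},
    "ReachabilityMap": {"obj-12": True, "obj-7": True},
    "ZoneClassification": {"zone-3": "shelf"},
    "InteractionConstraints": {
        "ProximityThreshold": 0.5,  # assumed numeric; unit not stated in the table
        "CollisionRisk": False,
        "RequiredOrientation": "front-facing",  # assumed free-text value
    },
    "Trace": {"Timestamp": "2025-01-01T00:00:00Z", "Origin": "VSR"},
    "DescrMetadata": {"Author": "example", "SchemaVersion": "1.0", "Validated": False},
}

version = parse_header(vso_instance["Header"])
candidates = actionable_entities(vso_instance)
```

In this instance, `obj-7` is reachable but occluded, so only `obj-12` is returned as a candidate for interaction.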