(Tentative)

Definition

Visual Spatial Output (PGM-VSO):

  1. Is produced by the Visual Spatial Reasoning (VSR) AIM.
  2. Provides spatial relationships, referent resolutions, and interaction constraints to Domain Access (DA).
  3. Supports reasoning about physical feasibility, visibility, and referential clarity within a visual scene.

Functional Requirements

Visual Spatial Output conveys the following main information elements:

| Function | Description |
|---|---|
| Referent Resolution | Resolves vague or underspecified expressions (e.g. “that one”, “nearby object”) into object IDs |
| Spatial Primitives | Represents geometric and topological data (e.g. bounding boxes, centroids, zones, vectors) |
| Occlusion Mapping | Indicates which entities are visually occluded from the User’s perspective |
| Reachability Assessment | Determines whether entities are physically reachable given constraints such as distance, obstacles, or posture |
| Zone Classification | Classifies spatial zones (e.g. shelf, table, floor) with semantic tags and interaction affordances |
| Scene Anchoring | Includes a unique SceneID and timestamp to anchor outputs to a specific frame of reference |
| Interaction Constraints | Represents constraints such as proximity thresholds, collision risks, or required orientation for interaction |
| Traceability | Includes provenance metadata (e.g. VSR ID, timestamp, origin) for audit and rollback |
| Modularity | Supports decomposition into subcomponents (e.g. ResolvedReferents, OcclusionMap, ReachabilityMap) for selective consumption |
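
The Modularity requirement above can be illustrated with a minimal Python sketch of selective consumption. It is not part of the specification: the helper names `select_components` and `actionable_entities`, the entity and zone IDs, and the choice to treat an entity missing from the OcclusionMap as not occluded are all assumptions made for illustration.

```python
from typing import Any

# Hypothetical instance fragment; all IDs and values are illustrative only.
vso: dict[str, Any] = {
    "VisualSceneID": "scene-001",
    "ResolvedReferents": {"SourceEntity": "cup-1", "TargetZone": "zone-shelf"},
    "OcclusionMap": {"cup-1": False, "book-2": True, "box-3": False},
    "ReachabilityMap": {"cup-1": True, "book-2": True, "box-3": False},
}


def select_components(output: dict[str, Any], wanted: list[str]) -> dict[str, Any]:
    """Keep the scene anchor plus only the requested subcomponents."""
    selected: dict[str, Any] = {"VisualSceneID": output["VisualSceneID"]}
    for name in wanted:
        if name in output:
            selected[name] = output[name]
    return selected


def actionable_entities(output: dict[str, Any]) -> list[str]:
    """Return entity IDs that are reachable and not occluded.

    Assumption: an entity absent from the OcclusionMap counts as not occluded.
    """
    occluded = output.get("OcclusionMap", {})
    reachable = output.get("ReachabilityMap", {})
    return sorted(
        entity
        for entity, is_reachable in reachable.items()
        if is_reachable and not occluded.get(entity, False)
    )


print(select_components(vso, ["OcclusionMap"]))
print(actionable_entities(vso))
```

A consumer that only needs occlusion data can call `select_components(vso, ["OcclusionMap"])` and ignore the rest of the output, while `actionable_entities` combines two subcomponents to find entities that can be both seen and reached.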

Syntax

https://schemas.mpai.community/PGM1/V1.0/data/VisualSpatialOutputs.json

Semantics

| Label | Description |
|---|---|
| Header | Visual Spatial Output Header |
| ├─ Standard-VSO | The characters “PGM1-VSO-V” |
| ├─ Version | Major version – 1 or 2 characters |
| ├─ Dot-separator | The character “.” separating version components |
| └─ Subversion | Minor version – 1 or 2 characters |
| VisualSceneID | Unique identifier for the spatial scene |
| ResolvedReferents | Entities resolved from user input |
| ├─ SourceEntity | Identifier of the primary object of attention |
| ├─ TargetZone | Identifier of the spatial target zone |
| └─ AdditionalEntities | Optional list of secondary entities involved |
| OcclusionMap | Map of occlusion status for visible entities |
| ├─ Key | Entity ID |
| └─ Value | Boolean: true if occluded, false if visible |
| ReachabilityMap | Map of reachability status for entities |
| ├─ Key | Entity ID |
| └─ Value | Boolean: true if reachable, false if not |
| ZoneClassification | Semantic classification of spatial zones |
| ├─ Key | Zone ID |
| └─ Value | Semantic label (e.g. “shelf”, “floor”, “container”) |
| InteractionConstraints | Constraints relevant to physical interaction |
| ├─ ProximityThreshold | Minimum distance required for interaction |
| ├─ CollisionRisk | Whether interaction poses a collision risk |
| └─ RequiredOrientation | Required orientation for successful interaction |
| Trace | Provenance metadata |
| ├─ Timestamp | Time of Spatial Output generation |
| └─ Origin | Module or subsystem that generated the Spatial Output |
| DescrMetadata | Descriptive metadata for schema documentation and audit |
| ├─ Author | Author or generator of the schema instance |
| ├─ SchemaVersion | Version of the schema used |
| └─ Validated | Boolean indicating schema validation status |
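
To make the labels above concrete, the following Python sketch builds one possible instance and runs a few lightweight checks on it. It is an illustration under stated assumptions, not a substitute for validating against the published JSON schema: the Header is assumed to be serialized as a single string with decimal-digit version components, the value types chosen for ProximityThreshold, CollisionRisk, and RequiredOrientation are guesses, and `check_instance` plus all IDs and values are hypothetical.

```python
import re
from typing import Any

# Assumption: Header is one string and the version parts are decimal digits;
# the table only states "1 or 2 characters" for each part.
HEADER_PATTERN = re.compile(r"PGM1-VSO-V[0-9]{1,2}\.[0-9]{1,2}")

# Hypothetical instance; every ID, value, and value type is illustrative.
example: dict[str, Any] = {
    "Header": "PGM1-VSO-V1.0",
    "VisualSceneID": "scene-001",
    "ResolvedReferents": {
        "SourceEntity": "cup-1",
        "TargetZone": "zone-shelf",
        "AdditionalEntities": ["book-2"],
    },
    "OcclusionMap": {"cup-1": False, "book-2": True},
    "ReachabilityMap": {"cup-1": True, "book-2": False},
    "ZoneClassification": {"zone-shelf": "shelf"},
    "InteractionConstraints": {
        "ProximityThreshold": 0.5,         # assumed numeric, unit unspecified
        "CollisionRisk": False,            # assumed Boolean
        "RequiredOrientation": "upright",  # assumed free-form string
    },
    "Trace": {"Timestamp": "2025-01-01T00:00:00Z", "Origin": "VSR"},
    "DescrMetadata": {"Author": "example", "SchemaVersion": "1.0", "Validated": False},
}


def check_instance(instance: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems: list[str] = []
    if not HEADER_PATTERN.fullmatch(str(instance.get("Header", ""))):
        problems.append("Header does not match PGM1-VSO-V<version>.<subversion>")
    # Both maps are described as entity ID -> Boolean.
    for map_name in ("OcclusionMap", "ReachabilityMap"):
        for entity, value in instance.get(map_name, {}).items():
            if not isinstance(value, bool):
                problems.append(f"{map_name}[{entity}] is not a Boolean")
    # ZoneClassification is described as zone ID -> semantic label.
    for zone, label in instance.get("ZoneClassification", {}).items():
        if not isinstance(label, str):
            problems.append(f"ZoneClassification[{zone}] is not a string label")
    return problems


print(check_instance(example))
```

Returning a list of problems rather than raising on the first one lets a consumer log every issue found in a single pass, which fits the audit purpose of the Trace and DescrMetadata fields.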