(Tentative)
| Definition | Functional Requirements | Syntax | Semantics |
Definition
Audio Spatial Guide (PGM-ASG):
- Is produced by the Audio Spatial Reasoning (ASR) AIM
- Represents a User-centric view of the spatial audio context derived from scene interpretation
- Enriches the User’s spoken or written input with spatial cues – such as sound source relevance, directionality, and proximity – prior to prompt generation
Functional Requirements
Audio Spatial Guide supports the following main functions:
| Function | Description |
| Salient Source Selection | Prioritises audio sources relevant to user intent or focus |
| Directional Cue Mapping | Translates azimuth and elevation into user-relative descriptors |
| Proximity Framing | Classifies sources as near, mid, or far for contextual emphasis |
| Semantic Labeling | Attaches meaningful labels (e.g., “alarm”, “voice”, “music”) |
| Acoustic Environment Summary | Provides high-level descriptors of ambient audio context |
| Viewpoint Normalisation | Adjusts spatial descriptors to match user orientation |
| Output Generation | Produces the Audio Spatial Guide for the Prompt Creation AIM |
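As an illustration of Viewpoint Normalisation, Directional Cue Mapping, and Proximity Framing, the following minimal Python sketch shows one way these functions could be realised. It is not part of the specification: the function names, the angle convention (azimuth 0 straight ahead, increasing to the User's right), the eight-sector vocabulary, the ±30 degree elevation band, and the 2 m and 10 m distance thresholds are all assumptions chosen for the example.

```python
def to_user_relative(source_azimuth_deg: float, user_heading_deg: float) -> float:
    """Viewpoint Normalisation (hypothetical helper): rotate a scene-referenced
    azimuth so that 0 degrees means straight ahead of the User."""
    return (source_azimuth_deg - user_heading_deg) % 360.0


def direction_descriptor(azimuth_deg: float, elevation_deg: float) -> str:
    """Directional Cue Mapping (hypothetical helper): turn user-relative
    azimuth and elevation into a coarse textual descriptor.

    Assumed convention: azimuth increases clockwise (towards the User's
    right); elevation 0 is ear level, positive is above.
    """
    sectors = ["front", "front-right", "right", "back-right",
               "back", "back-left", "left", "front-left"]
    az = azimuth_deg % 360.0
    # Each sector spans 45 degrees and is centred on a multiple of 45.
    horizontal = sectors[int(((az + 22.5) % 360.0) // 45.0)]
    if elevation_deg > 30.0:       # assumed threshold
        return f"{horizontal}, above"
    if elevation_deg < -30.0:      # assumed threshold
        return f"{horizontal}, below"
    return horizontal


def proximity_class(distance_m: float) -> str:
    """Proximity Framing (hypothetical helper): classify a source as
    near, mid, or far. The 2 m and 10 m boundaries are assumptions."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    if distance_m < 2.0:
        return "near"
    if distance_m < 10.0:
        return "mid"
    return "far"
```

An implementation would normally tune the sector vocabulary and thresholds to the application; the specification text above only requires that descriptors be user-relative and that proximity distinguish near, mid, and far.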
Syntax
https://schemas.mpai.community/PGM1/V1.0/data/AudioSpatialGuide.json
Semantics
| Label | Description |
| Header | Schema header with version tag |
| ├─ Standard-ASG | The characters “PGM-ASG-V” |
| ├─ Version | Major version – 1 or 2 characters |
| ├─ Dot-separator | The character “.” separating version components |
| └─ Subversion | Minor version – 1 or 2 characters |
| AudioSpatialGuideID | Unique identifier for this guide instance |
| MInstanceID | Identifier of M-Instance |
| MEnvironmentID | Identifier of M-Environment |
| SalientSources | List of user-relevant sound sources |
| ├─ SourceID | Unique ID for each source |
| ├─ Label | Semantic label (e.g., “voice”, “alarm”) |
| ├─ RelativeDirection | Azimuth and elevation relative to user |
| ├─ Proximity | Estimated closeness to the User |
| ├─ Motion | Whether the source is static or moving |
| └─ NarrativeCue | Optional natural language cue for prompt enrichment |
| AmbientAudioContext | Summary of ambient audio conditions |
| ├─ NoiseLevel | Background noise classification |
| ├─ Reverberation | Echo profile of the environment |
| └─ DominantSource | Most prominent sound source |
| Trace | Provenance metadata |
| ├─ Origin | Module that generated the guide |
| └─ Timestamp | Time of guide creation |
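The table above can be illustrated with a hypothetical instance. The sketch below builds one as a Python dictionary and checks its header. The field names follow the Label column; everything else is an assumption made for the example and should be checked against the JSON schema referenced under Syntax: the Header serialised as a single string, version components being decimal digits, the `Azimuth`/`Elevation` sub-field names, and all value types and sample values.

```python
import json
import re

# Assumed serialisation: "PGM-ASG-V" + major version + "." + minor version,
# each version component being 1 or 2 digits.
HEADER_PATTERN = re.compile(r"PGM-ASG-V(\d{1,2})\.(\d{1,2})")


def parse_header(header: str) -> tuple[int, int]:
    """Return (version, subversion) or raise ValueError if malformed."""
    match = HEADER_PATTERN.fullmatch(header)
    if match is None:
        raise ValueError(f"not a valid PGM-ASG header: {header!r}")
    return int(match.group(1)), int(match.group(2))


# Hypothetical instance; identifiers and values are illustrative only.
example_guide = {
    "Header": "PGM-ASG-V1.0",
    "AudioSpatialGuideID": "asg-0001",
    "MInstanceID": "minstance-01",
    "MEnvironmentID": "menvironment-01",
    "SalientSources": [
        {
            "SourceID": "src-001",
            "Label": "alarm",
            "RelativeDirection": {"Azimuth": 90.0, "Elevation": 0.0},
            "Proximity": "near",
            "Motion": "static",
            "NarrativeCue": "An alarm is sounding close by on your right.",
        }
    ],
    "AmbientAudioContext": {
        "NoiseLevel": "moderate",
        "Reverberation": "low",
        "DominantSource": "src-001",
    },
    "Trace": {
        "Origin": "ASR",
        "Timestamp": "2025-01-01T12:00:00Z",
    },
}

version, subversion = parse_header(example_guide["Header"])
serialised = json.dumps(example_guide, indent=2)
```

Checking the header before reading the remaining fields lets a consumer reject a guide produced for an incompatible version early; full validation would be done against the published schema rather than with this regular expression.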