(Tentative)
Definition
Audio Spatial Output (PGM-ASO):
- Is produced by the Audio Spatial Reasoning (ASR) AIM.
- Provides localised, annotated, and salience-ranked Audio Objects to Domain Access (DAC).
- Encapsulates spatial, temporal, and semantic interpretations of audio-emitting Items.
Functional Requirements
Audio Spatial Output conveys the following main information elements:
| Function | Description |
| Salience Estimation | Provides normalised salience scores per Audio Object over defined time windows. |
| Attention Prediction | Provides a probabilistic estimate of where the A-User is likely to focus next. |
| Temporal Salience Curve | Provides a time series of salience scores for dynamic modulation. |
| Semantic Annotation | Includes optional semantic tags tied to object roles, states, or relationships (e.g., “likely soloist”), each with a confidence score. |
| Traceability | Retains object IDs and timestamps for downstream audit and expressive rendering. |
| Modularity | Must support partial outputs (e.g., only salience or only localisation), depending on the DAC query. |
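The Modularity requirement can be illustrated with a minimal Python sketch that filters a full output down to the components a DAC query asks for. The component names, the `SalienceScores` field, and the choice to always keep the identification and Trace fields are assumptions made for illustration only; the PGM-ASO schema may organise partial outputs differently.

```python
from typing import Any

# Hypothetical mapping from a requested component to the output fields it
# covers. "SalienceScores" is an illustrative name, not taken from the schema.
COMPONENT_FIELDS = {
    "salience": {"SalienceScores"},
    "localisation": {"SoundSources"},
}

# Assumption: identification and provenance fields accompany every partial
# output so that the Traceability requirement is still met.
ALWAYS_KEPT = {"Header", "AudioSpatialOutputID", "Trace"}


def partial_output(full: dict[str, Any], requested: set[str]) -> dict[str, Any]:
    """Return only the fields that belong to the requested components."""
    unknown = requested - set(COMPONENT_FIELDS)
    if unknown:
        raise ValueError(f"unknown components: {sorted(unknown)}")
    keep = set(ALWAYS_KEPT)
    for name in requested:
        keep |= COMPONENT_FIELDS[name]
    return {key: value for key, value in full.items() if key in keep}
```

Calling `partial_output(full, {"localisation"})` on a complete output would, under these assumptions, return the header, identifier, sound sources, and trace while dropping salience and reverberation data.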
Syntax
https://schemas.mpai.community/PGM1/V1.0/data/AudioSpatialOutput.json
Semantics
| Label | Description |
| Header | Schema identifier and version tag |
| ├─ Standard-ASO | The characters “PGM-ASO-V” |
| ├─ Version | Major version – 1 or 2 characters |
| ├─ Dot-separator | The character “.” separating version components |
| └─ Subversion | Minor version – 1 or 2 characters |
| AudioSpatialOutputID | Unique identifier for this Audio Spatial Output instance |
| MInstanceID | Metaverse instance ID |
| MEnvironmentID | Environment ID |
| SoundSources | List of detected sound sources |
| ├─ Spatial Attitude | Reference to spatial-attitude schema |
| ├─ Proximity | Relative distance class: near, mid, or far |
| ├─ Motion | Movement status: static or moving |
| └─ Confidence | Confidence score (range: 0 to 1) |
| ReverberationProfile | Environmental echo characteristics |
| ├─ EchoDelay | Delay of echo in milliseconds |
| ├─ DecayRate | Rate of echo decay |
| └─ SurfaceEstimate | Estimated surface type contributing to reverberation |
| Trace | Provenance metadata |
| ├─ Origin | Module or subsystem that generated the output |
| └─ Timestamp | Time of creation |
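As an illustration of the constraints in the table above, the following Python sketch checks a header string and one SoundSources entry. It assumes that the header components are concatenated into a single string, that the version and subversion characters are decimal digits (the table only says "characters"), and that the JSON keys match the table labels. None of these is confirmed by the schema, which remains the normative reference.

```python
import re

# "PGM-ASO-V", a 1-2 character version, ".", and a 1-2 character subversion.
# Assumption: the version characters are decimal digits.
HEADER_RE = re.compile(r"PGM-ASO-V(\d{1,2})\.(\d{1,2})")

PROXIMITY_CLASSES = {"near", "mid", "far"}
MOTION_STATES = {"static", "moving"}


def parse_header(header: str) -> tuple[int, int]:
    """Split a header string into (version, subversion)."""
    match = HEADER_RE.fullmatch(header)
    if match is None:
        raise ValueError(f"malformed header: {header!r}")
    return int(match.group(1)), int(match.group(2))


def check_sound_source(source: dict) -> list[str]:
    """Return the problems found in one SoundSources entry (empty if none)."""
    problems = []
    if source.get("Proximity") not in PROXIMITY_CLASSES:
        problems.append("Proximity must be near, mid, or far")
    if source.get("Motion") not in MOTION_STATES:
        problems.append("Motion must be static or moving")
    confidence = source.get("Confidence")
    is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
    if not is_number or not 0 <= confidence <= 1:
        problems.append("Confidence must be a number between 0 and 1")
    return problems
```

A check of this kind complements, rather than replaces, validation against the JSON schema referenced in the Syntax section.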