6.6.1    Input Audio

6.6.2    Input Visual

6.6.3    Input RADAR

6.6.4    Input LiDAR

6.6.5    Input Ultrasound

6.6.6    GNSS Data

6.6.7    Offline Map Data

6.6.8    Audio-Visual Scene Descriptors

6.6.9    Visual Scene Descriptors

6.6.10  LiDAR Scene Descriptors

6.6.11  RADAR Scene Descriptors

6.6.12  Ultrasound Scene Descriptors

6.6.13  Offline Maps Scene Descriptors

6.6.14  Audio Scene Descriptors

6.6.15  Traffic Signal Descriptors

6.6.16  Basic Environment Representation

6.6.17  Alert

6.6.1        Input Audio

6.6.1.1       Definition

Multichannel Audio provided by a Microphone Array, used to:

  1. Create Audio Scene Descriptors that:
    • Enable extraction of speech addressed to the HCI by humans outside or inside the CAV.
    • Incorporate outdoor Audio information into the Basic Environment Representation.
  2. Suppress noise and individual sound sources outside the passenger cabin.

6.6.1.2       Functional Requirements

Microphones (or microphone arrays) are used to capture outdoor and indoor sound for the purpose of creating Audio Scene Descriptors that:

  1. Provide the location of sound sources.
  2. Enable extraction of speech addressed to the CAV by humans.
  3. Remove unwanted noise from the passenger cabin.
  4. Incorporate Audio information into the Basic Environment Representation.

MPAI has developed specifications for Multichannel Audio, Multichannel Audio Stream, and Microphone Array Geometry.

6.6.1.3       Syntax

https://schemas.mpai.community/CAE1/V2.2/data/AudioFormatID.json

https://schemas.mpai.community/OSD/V1.1/data/SpaceTime.json

6.6.1.4       Semantics

Label | Size | Description
Header | N1 Bytes |
· Standard | 9 Bytes | The characters “CAV-IAU-V”
· Version | N2 Bytes | Major version – 1 or 2 Bytes
· Dot-separator | 1 Byte | The character “.”
· Subversion | N3 Bytes | Minor version – 1 or 2 Bytes
InputAudioID | N4 Bytes | Identifier of Audio Sensor (Microphone Array).
InputAudioTimeSpaceAttributes | N5 Bytes | Time and Space of Input Audio Data.
InputAudioData | N6 Bytes |
· AudioFormatID | N7 Bytes | Format ID of Input Audio Data.
· InputAudioDataLength | N8 Bytes | Data Length of Input Audio Data in Bytes.
· InputAudioDataURI | N9 Bytes | Location of Input Audio Data.
InputAudioAttributes[] | N10 Bytes |
· AudioAttributeID | N11 Bytes | ID of Attribute of Input Audio Data.
· AudioAttributeFormatID | N12 Bytes | ID of Attribute Format of Input Audio Data.
· InputAudioAttributeLength | N13 Bytes | Number of Bytes in Input Audio Attribute Data.
· InputAudioAttributeDataURI | N14 Bytes | URI of Input Audio Attribute Data.
DescrMetadata | N15 Bytes | Descriptive Metadata.
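
The sketch below illustrates one possible reading of the Header layout in the table above: a 9-Byte magic string followed by a dot-separated Major.Minor version. It is a hypothetical illustration, not normative MPAI code, and it leaves the encodings of the variable-length N4..N15 fields open.

```python
# Hypothetical Header (de)serialization for Input Audio; assumes the
# buffer contains only the Header fields shown in the table.

MAGIC = b"CAV-IAU-V"  # the 9-Byte "Standard" field

def pack_header(major: int, minor: int) -> bytes:
    """Standard + Version + Dot-separator + Subversion."""
    return MAGIC + str(major).encode() + b"." + str(minor).encode()

def parse_header(buf: bytes) -> tuple[int, int]:
    """Recover (major, minor) from a Header produced by pack_header."""
    if not buf.startswith(MAGIC):
        raise ValueError("not an Input Audio Header")
    major, _, minor = buf[len(MAGIC):].partition(b".")
    return int(major), int(minor)

assert parse_header(pack_header(2, 1)) == (2, 1)
```

The same pattern would apply to the other Input Data Types below, with their respective magic strings (“CAV-IVI-V”, “CAV-IRA-V”, etc.).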

6.6.1.5       Data Formats

Input Audio requires:

  1. Audio Format.
  2. Audio Attribute Format.

6.6.1.6       To Respondents

Respondents are invited to:

  1. Comment or elaborate on the relevance and applicability to CAVs of the three above-mentioned specifications.
  2. Comment on the Functional Requirements.
  3. Propose motivated Functional Requirements for an Audio Array Format suitable to create a 3D sound field representation of the Environment for the stated purposes.
  4. Propose Data Formats and Attributes for use in the future Technical Specification: Data Types, Formats, and Attributes (MPAI-TFA).

6.6.2        Input Visual

6.6.2.1       Definition

Digital representation of information captured in the visible range of the electromagnetic spectrum by a single camera or an array of cameras.

6.6.2.2       Functional Requirements

A visual scene can be captured by an array of visual sensors characterised by (a configuration sketch follows the lists below):

  1. Number and position of sensing devices.
  2. Number of horizontal and vertical sensors in a sensing device.
  3. Frame frequency.
  4. Colour space.
  5. Bit-depth information.
  6. Depth (distance from scene pixel) information.

Captured Data:

  1. Provide pixel values, time, and potentially depth.
  2. Are used to:
    1. Provide the position and orientation of individual visual objects.
    2. Provide Visual Scene Descriptors.
    3. Enable identification, tracking and representation of relevant visual objects, including humans.
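
As noted above, here is a minimal sketch of a record carrying the six characteristics of a visual sensor array; all field names and types are assumptions made for illustration, not part of the requirements.

```python
from dataclasses import dataclass

@dataclass
class VisualSensorArrayConfig:
    """Hypothetical container for the characteristics listed above."""
    sensor_positions: list[tuple[float, float, float]]  # 1. number/position of devices
    h_sensors: int                                      # 2. horizontal sensors per device
    v_sensors: int                                      #    vertical sensors per device
    frame_rate_hz: float                                # 3. frame frequency
    colour_space: str                                   # 4. e.g., "RGB", "YCbCr"
    bit_depth: int                                      # 5. bits per sample
    has_depth: bool                                     # 6. per-pixel depth available

cfg = VisualSensorArrayConfig(
    sensor_positions=[(0.0, 0.0, 1.5), (0.2, 0.0, 1.5)],
    h_sensors=1920, v_sensors=1080, frame_rate_hz=30.0,
    colour_space="YCbCr", bit_depth=10, has_depth=False,
)
```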

6.6.2.3       Syntax

https://schemas.mpai.community/PAF/V1.2/data/VisualFormatID.json

https://schemas.mpai.community/OSD/V1.1/data/SpaceTime.json

6.6.2.4       Semantics

Label | Size | Description
Header | N1 Bytes |
· Standard | 9 Bytes | The characters “CAV-IVI-V”
· Version | N2 Bytes | Major version – 1 or 2 Bytes
· Dot-separator | 1 Byte | The character “.”
· Subversion | N3 Bytes | Minor version – 1 or 2 Bytes
InputVisualID | N4 Bytes | Identifier of Visual Sensor.
InputVisualTimeSpaceAttributes | N5 Bytes | Time and Space of Input Visual Data.
InputVisualData | N6 Bytes |
· InputVisualFormatID | N7 Bytes | Format ID of Input Visual Data.
· InputVisualDataLength | N8 Bytes | Data Length of Input Visual Data in Bytes.
· InputVisualDataURI | N9 Bytes | Location of Input Visual Data.
InputVisualAttributes[] | N10 Bytes |
· InputVisualAttributeID | N11 Bytes | ID of Attribute of Input Visual Data.
· InputVisualAttributeFormatID | N12 Bytes | ID of Attribute Format of Input Visual Data.
· InputVisualAttributeLength | N13 Bytes | Number of Bytes in Input Visual Attribute Data.
· InputVisualAttributeDataURI | N14 Bytes | URI of Input Visual Attribute Data.
DescrMetadata | N15 Bytes | Descriptive Metadata.

6.6.2.5       Data Formats

Input Visual requires the following Formats:

  1. Visual Format.
  2. Visual Attribute.
  3. Visual Attribute Format.

6.6.2.6       To Respondents

Respondents are invited to:

  1. Comment on the Functional Requirements or propose new ones.
  2. Comment on and propose formats (2D, 2D+depth, or 3D visual sensors) for use in the future Technical Specification: Data Types, Formats, and Attributes (MPAI-TFA).

6.6.3        Input RADAR

6.6.3.1       Definition

Data produced by a “time-of-flight”-based active sensor called Radio Detection and Ranging (RADAR), able to measure the distance and speed of objects from the time it takes for a signal emitted by the sensor to hit an object and be reflected.

RADAR operates in the mm-wave range. It detects vehicles (CAVs and trucks) well because they typically reflect radar signals, while smaller and less reflective objects, e.g., pedestrians and motorcycles, have poor reflectance. In a busy environment, the reflections of big vehicles can overwhelm a motorcycle’s, causing missed detection of important objects (e.g., a human next to a vehicle), while a metal can may produce a return out of proportion to its size.

6.6.3.2       Functional Requirements

The main features of RADAR Data are (a worked relation follows the list):

  1. Ability to detect objects and measure their speed at distances ≤ 250 m (long-range radar in the 76-77 GHz band).
  2. Ability to provide a resolution of ~25 cm radial and ~1.5° angular.
  3. Suitability to measure short distances (short-range radar in the 24 GHz band).
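
As referenced above, the quoted radial resolution follows from the standard relation between range resolution and signal bandwidth B; the bandwidth figure is inferred here, not stated in the requirements.

```latex
\Delta R = \frac{c}{2B}
\qquad\Rightarrow\qquad
\Delta R \approx \frac{3 \times 10^{8}\,\mathrm{m/s}}{2 \times 600\,\mathrm{MHz}} = 0.25\,\mathrm{m}
```

i.e., a ~25 cm radial resolution corresponds to a sweep of roughly 600 MHz, which fits within the 76-77 GHz band.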

6.6.3.3       Syntax

https://schemas.mpai.community/CAV2/V1.0/data/RADARFormatID.json

https://schemas.mpai.community/OSD/V1.1/data/SpaceTime.json

6.6.3.4       Semantics

Label | Size | Description
Header | N1 Bytes |
· Standard | 9 Bytes | The characters “CAV-IRA-V”
· Version | N2 Bytes | Major version – 1 or 2 Bytes
· Dot-separator | 1 Byte | The character “.”
· Subversion | N3 Bytes | Minor version – 1 or 2 Bytes
InputRADARID | N4 Bytes | Identifier of RADAR Sensor.
InputRADARTimeSpaceAttributes | N5 Bytes | Time and Space of Input RADAR Data.
InputRADARData | N6 Bytes |
· InputRADARFormatID | N7 Bytes | Format ID of Input RADAR Data.
· InputRADARDataLength | N8 Bytes | Data Length of Input RADAR Data in Bytes.
· InputRADARDataURI | N9 Bytes | Location of Input RADAR Data.
InputRADARAttributes[] | N10 Bytes |
· InputRADARAttributeID | N11 Bytes | ID of Attribute of Input RADAR Data.
· RADARAttributeFormatID | N12 Bytes | ID of Attribute Format of Input RADAR Data.
· InputRADARAttributeLength | N13 Bytes | Number of Bytes in Input RADAR Attribute Data.
· InputRADARAttributeDataURI | N14 Bytes | URI of Input RADAR Attribute Data.

6.6.3.5       Data Formats

Input RADAR requires:

  1. RADAR Format.
  2. RADAR Attribute Format.

6.6.3.6       To Respondents

Respondents are invited to:

  1. Identify functional requirements of the output data produced by RADAR sensors for outdoor and indoor (cabin) use.
  2. Propose Data Formats and Attributes for use in the future Technical Specification: Data Types, Formats, and Attributes (MPAI-TFA).

6.6.4        Input LiDAR

6.6.4.1       Definition

Data produced by a “time-of-flight”-based active sensor called Light Detection and Ranging (LiDAR), able to measure the distance and speed of objects from the time it takes for a signal emitted by the sensor to hit an object and be reflected.

6.6.4.2       Functional Requirements

LiDAR Data provide: the distance of a voxel from the sensor; its greyscale level, from the intensity variation of the reflected light; its colour, by using more than one wavelength; and its velocity, either from the Doppler shift in frequency caused by motion or by taking the position at different times. Typical angular resolution is ~0.1° vertically and ~1° horizontally, with a maximum vertical field of capture of ~40°.
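
The underlying round-trip relations are the standard time-of-flight ones, where Δt is the measured round-trip time, λ the laser wavelength, and Δf the Doppler shift:

```latex
d = \frac{c\,\Delta t}{2},
\qquad
v_r = \frac{\lambda\,\Delta f}{2}
```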

6.6.4.3       Syntax

https://schemas.mpai.community/CAV2/V1.0/data/LiDARFormatID.json

https://schemas.mpai.community/OSD/V1.1/data/SpaceTime.json

6.6.4.4       Semantics

Label | Size | Description
Header | N1 Bytes |
· Standard | 9 Bytes | The characters “CAV-ILI-V”
· Version | N2 Bytes | Major version – 1 or 2 Bytes
· Dot-separator | 1 Byte | The character “.”
· Subversion | N3 Bytes | Minor version – 1 or 2 Bytes
InputLiDARID | N4 Bytes | Identifier of LiDAR Sensor.
InputLiDARTimeSpaceAttributes | N5 Bytes | Time and Space of Input LiDAR Data.
· Duration | N6 Bytes | Duration of LiDAR Data Block.
· SpatialAttitude | N7 Bytes | CAV’s Spatial Attitude when getting Data.
InputLiDARData | N8 Bytes |
· InputLiDARFormatID | N9 Bytes | Format ID of Input LiDAR Data.
· InputLiDARDataLength | N10 Bytes | Data Length of Input LiDAR Data in Bytes.
· InputLiDARDataURI | N11 Bytes | Location of Input LiDAR Data.
InputLiDARAttributes[] | N12 Bytes |
· InputLiDARAttributeID | N13 Bytes | ID of Attribute of Input LiDAR Data.
· LiDARAttributeFormatID | N14 Bytes | ID of Attribute Format of Input LiDAR Data.
· InputLiDARAttributeLength | N15 Bytes | Number of Bytes in Input LiDAR Attribute Data.
· InputLiDARAttributeDataURI | N16 Bytes | URI of Input LiDAR Attribute Data.

6.6.4.5       Data Formats

Input LiDAR requires:

  1. LiDAR Format.
  2. LiDAR Attribute Format.

6.6.4.6       To Respondents

Respondents are invited to:

  1. Comment on or extend the functional requirements of the data produced by LiDAR sensors for outdoor and indoor (cabin) use.
  2. Propose Data Formats and Attributes for use in the future Technical Specification: Data Types, Formats, and Attributes (MPAI-TFA).

6.6.5        Input Ultrasound

6.6.5.1       Definition

A Data Type representing the signals captured by an ultrasonic sensor.

6.6.5.2       Functional Requirements

Ultrasound is produced by an active time-of-flight sensor typically operating in the 40 kHz to 250 kHz range.

The main features of Ultrasound are (the underlying relations follow the list):

  1. Ability to monitor the immediate surroundings of the vehicle (≤ 10 m).
  2. Operating frequency above 30 kHz.
  3. Low-resolution images.
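
As referenced above, the same time-of-flight principle applies with the speed of sound in air (v_s ≈ 343 m/s at 20 °C) in place of the speed of light, so the ≤ 10 m range implies round-trip times up to about 58 ms:

```latex
d = \frac{v_s\,\Delta t}{2}
\qquad\Rightarrow\qquad
\Delta t_{\max} \approx \frac{2 \times 10\,\mathrm{m}}{343\,\mathrm{m/s}} \approx 58\,\mathrm{ms}
```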

6.6.5.3       Syntax

https://schemas.mpai.community/CAV2/V1.0/data/UltrasoundFormatID.json

https://schemas.mpai.community/OSD/V1.1/data/SpaceTime.json

6.6.5.4       Semantics

Label | Size | Description
Header | N1 Bytes |
· Standard | 9 Bytes | The characters “CAV-IUS-V”
· Version | N2 Bytes | Major version – 1 or 2 Bytes
· Dot-separator | 1 Byte | The character “.”
· Subversion | N3 Bytes | Minor version – 1 or 2 Bytes
InputUltrasoundID | N4 Bytes | Identifier of Ultrasound Sensor.
InputUltrasoundTimeSpaceAttributes | N5 Bytes | Time and Space of Input Ultrasound Data.
· Duration | N6 Bytes | Duration of Ultrasound Data Block.
· SpatialAttitude | N7 Bytes | CAV’s Spatial Attitude when getting Data.
InputUltrasoundData | N8 Bytes |
· InputUltrasoundFormatID | N9 Bytes | Format ID of Input Ultrasound Data.
· InputUltrasoundDataLength | N10 Bytes | Data Length of Input Ultrasound Data in Bytes.
· InputUltrasoundDataURI | N11 Bytes | Location of Input Ultrasound Data.
InputUltrasoundAttributes[] | N12 Bytes |
· InputUltrasoundAttributeID | N13 Bytes | ID of Attribute of Input Ultrasound Data.
· UltrasoundAttributeFormatID | N14 Bytes | ID of Attribute Format of Input Ultrasound Data.
· InputUltrasoundAttributeLength | N15 Bytes | Number of Bytes in Input Ultrasound Attribute Data.
· InputUltrasoundAttributeDataURI | N16 Bytes | URI of Input Ultrasound Attribute Data.

6.6.5.5       Data Formats

Ultrasound Data Formats are required.

6.6.5.6       To Respondents

Respondents are invited to:

  1. Comment on or elaborate the functional requirements of Ultrasound image formats, with the goal of enabling tracking and representation of objects in the Ultrasound Scene Descriptors.
  2. Propose Data Formats and Attributes for use in the future Technical Specification: Data Types, Formats, and Attributes (MPAI-TFA).

6.6.6        GNSS Data

6.6.6.1       Definition

Global Navigation Satellite System (GNSS) Data come from constellations of satellites that transmit positioning and timing data to GNSS receivers, which use them to determine their location.

6.6.6.2       Functional Requirements

GNSS Data can come from four global systems – GPS (US), GLONASS (RU), Galileo (EU), and BeiDou (CN) – and two regional systems – QZSS (Japan) and IRNSS/NavIC (India). Position accuracy depends on the GNSS system.

6.6.6.3       Syntax

https://schemas.mpai.community/CAV2/V1.0/data/GNSSFormatID.json

https://schemas.mpai.community/OSD/V1.1/data/SpaceTime.json

6.6.6.4       Semantics

Label | Size | Description
Header | N1 Bytes |
· Standard | 9 Bytes | The characters “CAV-IGN-V”
· Version | N2 Bytes | Major version – 1 or 2 Bytes
· Dot-separator | 1 Byte | The character “.”
· Subversion | N3 Bytes | Minor version – 1 or 2 Bytes
InputGNSSID | N4 Bytes | Identifier of GNSS Sensor.
InputGNSSTimeSpaceAttributes | N5 Bytes | Time and Space of Input GNSS Data.
· Duration | N6 Bytes | Duration of GNSS Data Block.
· SpatialAttitude | N7 Bytes | CAV’s Spatial Attitude when getting Data.
InputGNSSData | N8 Bytes |
· InputGNSSFormatID | N9 Bytes | Format ID of Input GNSS Data.
· InputGNSSDataLength | N10 Bytes | Data Length of Input GNSS Data in Bytes.
· InputGNSSDataURI | N11 Bytes | Location of Input GNSS Data.
InputGNSSAttributes[] | N12 Bytes |
· InputGNSSAttributeID | N13 Bytes | ID of Attribute of Input GNSS Data.
· GNSSAttributeFormatID | N14 Bytes | ID of Attribute Format of Input GNSS Data.
· InputGNSSAttributeLength | N15 Bytes | Number of Bytes in Input GNSS Attribute Data.
· InputGNSSAttributeDataURI | N16 Bytes | URI of Input GNSS Attribute Data.

6.6.6.5       Data Formats

Some data formats are (a GPX parsing sketch follows the list):

  1. GPS Exchange Format (GPX): an XML schema providing a common GPS data format that can be used to describe waypoints, tracks, and routes.
  2. World Geodetic System (WGS): definition of the coordinate system’s fundamental and derived constants, the ellipsoidal (normal) Earth Gravitational Model (EGM), a description of the associated World Magnetic Model (WMM), and a current list of local datum transformations.
  3. International GNSS Service (IGS) SSR: format used to disseminate real-time products to support the IGS (igs.org) Real-Time Service. The messages support multi-GNSS and include corrections for orbits, clocks, DCBs, phase-biases, and ionospheric delays. Extensions are planned to also cover satellite attitude, phase centre offsets and variations, and group delay variations.
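
As referenced above, here is a minimal GPX 1.1 waypoint parsed with the Python standard library; the coordinates and names are invented for the example.

```python
import xml.etree.ElementTree as ET

GPX = """<?xml version="1.0" encoding="UTF-8"?>
<gpx version="1.1" creator="example"
     xmlns="http://www.topografix.com/GPX/1/1">
  <wpt lat="45.0703" lon="7.6869">
    <ele>239.0</ele>
    <name>Example waypoint</name>
  </wpt>
</gpx>"""

NS = {"gpx": "http://www.topografix.com/GPX/1/1"}
root = ET.fromstring(GPX)
for wpt in root.findall("gpx:wpt", NS):
    print(wpt.get("lat"), wpt.get("lon"),
          wpt.findtext("gpx:ele", namespaces=NS),
          wpt.findtext("gpx:name", namespaces=NS))
# -> 45.0703 7.6869 239.0 Example waypoint
```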

6.6.6.6       To Respondents

Respondents are requested to:

  1. Comment on the functional requirements.
  2. Propose Data Formats and Attributes for use in the future Technical Specification: Data Types, Formats, and Attributes (MPAI-TFA).

6.6.7        Offline Map Data

6.6.7.1       Definition

An Offline Map (also called HD map or 3D map) is a roadmap with cm-level accuracy and high environmental fidelity, reporting the positions of pedestrian crossings, traffic lights/signs, barriers, etc. at the time the Offline Map was created.

6.6.7.2       Functional Requirements

The Offline Map Data Format used by a CAV should consider the features of data formats such as:

  1. Navigation Data Standard (NDS), which calls itself “the worldwide standard for map data in automotive eco-systems”. The NDS specification covers data model, storage format, interfaces, and protocols.
  2. SharedStreets Referencing System, which calls itself a global, non-proprietary system for describing streets.

6.6.7.3       Syntax

6.6.7.4       Semantics

Label | Size | Description
Header | N1 Bytes |
· Standard | 9 Bytes | The characters “CAV-OLM-V”
· Version | N2 Bytes | Major version – 1 or 2 Bytes
· Dot-separator | 1 Byte | The character “.”
· Subversion | N3 Bytes | Minor version – 1 or 2 Bytes
OffLineMapSourceID | N4 Bytes | Identifier of Offline Map.
OffLineMapDataFormatID | N5 Bytes | Format ID of Offline Map Data.
OffLineMapData | N6 Bytes | Offline Map Data.
DescrMetadata | N7 Bytes | Descriptive Metadata.

6.6.7.5       Data Formats

Several Data Formats are used in practice.

6.6.7.6       To Respondents

Respondents are requested to:

  1. Comment on the functional requirements that the Offline Map Data Format needs to support the most common offline map formats.
  2. Propose Data Formats and Attributes for use in the future Technical Specification: Data Types, Formats, and Attributes (MPAI-TFA).

6.6.8        Audio-Visual Scene Descriptors

6.6.8.1       Definition

A Scene is a Data Type representing the outcome of a process involving:

  1. A specific Environment Sensing Technology (EST).
  2. Sensed data (EST Data).
  3. Processing the EST Data to represent the environment with a Scene.

6.6.8.2       Functional Requirements

To the extent possible, a Scene created from the Data of a specific EST should have a format compatible with those of the other ESTs, to facilitate the fusion of the individual EST-based Scenes into the Basic Environment Representation passed to the Autonomous Motion Subsystem.

The operation of the Environment Sensing Subsystem unfolds as follows:

  1. A given EST produces EST Data at discrete Δt time increments that depend on the EST operating frequency. Different ESTs may use different Δt values.
  2. EST-specific Data are passed to the EST-specific Scene Description AIM.
  3. An EST-specific Scene Description AIM produces EST-specific Scene Descriptors. These may have a complex data structure that includes several elementary Data Types, each having its own Data Format.
  4. EST-specific Scene Descriptors enable an object-based, time-dependent, and constantly updated Scene description that may contain Objects with different resolutions, e.g., an object at 100 m and another at 10 m may be represented with different spatial and temporal resolutions.
  5. Scene Descriptors#1 produced from EST#1 Data may include Data Types not included in Scene Descriptors#2 produced from EST#2 Data. However, the Environment Sensing Subsystem (ESS) Data Fusion AIM is cognisant of both Data Formats.
  6. Scene Descriptors#1 from EST#1 Data may not represent the environment with the same Accuracy as, or may provide values that conflict with, the environment representation provided by Scene Descriptors#2 from EST#2 Data.
  7. The format of the Offline Maps should allow for transformation of its EST Data into Scene Descriptors without loss of information, so as to enable the fusion of its Scene Descriptors into the Basic Environment Representation produced by the ESS Data Fusion AIM.
  8. EST Scene Descriptors SD(t) at time t are obtained by (a code sketch follows this list):
    • Using sensed EST Data at time t and previously computed Scene Descriptors SD(t-Δt), SD(t-2Δt), etc.
    • Updating the Objects inherited from preceding SDs.
    • Removing Objects present in previous SDs and no longer present in SD(t).
    • Adding and assigning attributes to new Objects, i.e., entirely new Objects, the merge of two or more Objects, or the splitting of a previously merged Object.
  9. SD(t) is a list of Objects detected and confirmed at time t, with their attributes.
  10. EST Scene Description AIMs keep memory of past Scene Descriptors. Recent Objects may retain all attributes, while Objects from farther in the past may have coarser attributes or not be available at all.
  11. EST-specific Scene Descriptors allow for the description of Objects using one of a limited number of MPAI-standardised formats:
    • The coordinates of the centre of gravity of an Object.
    • The Bounding Box of the Object.
    • 2D Scene Objects:
      • Static environment:
        • Parametric free-space representation, represented as a single object.
        • Alternative representations as individual static objects.
      • Dynamic environment: object-based representation.
    • 2.5D Scene Objects:
      • Static components of the scene:
        • Grid-based (elevation maps or Stixel World), represented as a single object.
        • Object-based for traffic poles and signals (e.g., Stixel World, Multi-level surface map).
      • Object-based for the dynamic parts (e.g., Stixel World, Multi-level surface map).
    • 3D (Volumetric) Scene Objects:
      • Static components of the scene:
        • Voxel grids, meshes, possibly as a single object.
        • Object-based for traffic poles and signals (voxel grids, meshes).
      • Dynamic components of the scene (point clouds, voxel grids, meshes, …).
  12. An EST-specific Scene can contain Objects with different formats.
  13. At a given time that depends on the operating frequency of a specific EST, the Scene described by the EST-specific Scene Descriptors represents an EST-specific snapshot of the environment.
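
As referenced in item 8, here is a minimal sketch of the SD(t) update cycle of items 8-10, assuming Objects are keyed by their Identifier and that detection and Object matching/merging happen elsewhere; all names are hypothetical.

```python
def update_scene(prev_sd: dict, detections: dict, t: float) -> dict:
    """Produce SD(t) from SD(t-Δt) and the detections at time t."""
    sd = {}
    for obj_id, attrs in detections.items():
        if obj_id in prev_sd:
            # Update Objects inherited from the preceding SD (item 8).
            sd[obj_id] = {**prev_sd[obj_id], **attrs, "last_seen": t}
        else:
            # Add and assign attributes to new Objects (item 8).
            sd[obj_id] = {**attrs, "first_seen": t, "last_seen": t}
    # Objects in prev_sd but absent from the detections are not copied
    # into SD(t); per item 10, a real AIM would keep them in its memory
    # with progressively coarser attributes rather than discard them.
    return sd
```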


MPAI has developed specifications for the Audio Object, Visual Object, and Audio-Visual Object, and for Audio Scene Descriptors, Visual Scene Descriptors, and Audio-Visual Scene Descriptors supporting the functional requirements identified above.

The Syntax and Semantics of the Audio-Visual Basic Scene Descriptors, where the Scene is defined as a composition of Objects, are reported here from [13]. Other Scene Descriptors can easily be derived from them.

6.6.8.3       Syntax

This is provided by 5.5.9.3.

6.6.8.4       Semantics

This is provided by 5.5.9.4.

6.6.8.5       Data Formats and Attributes

Traffic Signal Descriptors can be considered as Attributes of the Scene and its Objects.

6.6.8.6       To Respondents

Respondents are:

  1. Invited to comment on the functional requirements identified above and on the MPAI specifications that provide the information identified in 6.6.8.2.
  2. Requested to propose motivated extensions or new technologies.
  3. Requested to propose Traffic Signal Descriptors as Attributes.

6.6.9        Visual Scene Descriptors

The Visual Scene Description AIM:

  1. Receives the Spatial Attitude from MAS.
  2. Retrieves the current Spatial Attitude.
  3. Receives or retrieves a specified subset of a prior Basic Environment Representation.
  4. Provides Visual Scene Descriptors, a machine-readable description of the Visual Scene’s:
    • Spatial Attitudes of the Visual Objects.
    • Visual Objects.

To Respondents

Respondents are requested to propose functional requirements of Visual Scene Descriptors that provide the information identified in 6.6.8.2.

6.6.10    LiDAR Scene Descriptors

The LiDAR Scene Description AIM receives LiDAR Data, Spatial Attitude from MAS, and a portion of a prior Basic Environment Representation and provides LiDAR Scene Descriptors.


To Respondents

Respondents are requested to propose functional requirements of LiDAR Scene Descriptors that provide the information identified in 6.6.8.2.

6.6.11    RADAR Scene Descriptors

The RADAR Scene Description AIM receives RADAR Data, Spatial Attitude from MAS, and a portion of a prior Basic Environment Representation and provides RADAR Scene Descriptors.


To Respondents

Respondents are requested to propose functional requirements of RADAR Scene Descriptors that provide the information identified in 6.6.8.2.

6.6.12    Ultrasound Scene Descriptors

The Ultrasound Scene Description AIM receives Ultrasound Data, Spatial Attitude from MAS, and a portion of a prior Basic Environment Representation and provides Ultrasound Scene Descriptors.


To Respondents

Respondents are requested to propose functional requirements of Ultrasound Scene Descriptors that provide the information identified in 6.6.8.2.

6.6.13    Offline Maps Scene Descriptors

The Offline Map Scene Description AIM receives Offline Map Data, Spatial Attitude from MAS, and a portion of a prior Basic Environment Representation and provides Offline Map Scene Descriptors.


To Respondents

Respondents are requested to propose functional requirements of Offline Map Scene Descriptors that provide the information identified in 6.6.8.2.

6.6.14    Audio Scene Descriptors

The Audio Scene Description AIM receives Audio Data, Spatial Attitude from MAS, and a portion of a prior Basic Environment Representation and provides Audio Scene Descriptors.


To Respondents

Respondents are requested to propose functional requirements of Audio Scene Descriptors that provide the information identified in 6.6.8.2.

6.6.15    Traffic Signal Descriptors

6.6.15.1   Definition

The digital representation of the traffic signalisation used at a U-Location. For the sake of simplicity, it is assumed that Traffic Signal Descriptors are derived from Audio and Visual Scene Descriptors. The content of this Subsection can easily be extended to apply to the Scene Descriptors of other Environment Sensing Technology Data.

6.6.15.2   Functional Requirements

Traffic Signal Descriptors include:

  1. Position and Orientation of the traffic audio and visual signals at the U-Location:
    • Road signs
    • Traffic signs
    • Traffic lights
    • Walkways
    • Lanes
    • Traffic sound
  2. Semantics of the traffic signals.

Traffic Signal Descriptors can be used as Attributes of the MPAI-specified Scene Descriptors.
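
A hypothetical illustration of that use: a Traffic Signal Descriptor carried as an Attribute of a Scene Object. The field names are invented for the example; the normative structure is given by the Semantics below.

```python
# A Visual Object of the Scene with an attached Traffic Signal Attribute.
traffic_light_object = {
    "object_id": "vis-0042",
    "spatial_attitude": {"position": [12.3, -4.1, 5.2],
                         "orientation": [0.0, 0.0, 0.97, 0.26]},
    "attributes": {
        "traffic_signal": {
            "kind": "traffic light",   # road sign, traffic sound, lane, ...
            "state": "red",            # semantics of the signal
            "applies_to_lane": "lane-2",
        }
    },
}
```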

6.6.15.3   Syntax

https://schemas.mpai.community/OSD/V1.1/data/AudioVisualSceneDescriptors.json

6.6.15.4   Semantics

Label | Size | Description
Header | N1 Bytes |
· Standard | 9 Bytes | The characters “CAV-TSD-V”
· Version | N2 Bytes | Major version – 1 or 2 Bytes
· Dot-separator | 1 Byte | The character “.”
· Subversion | N3 Bytes | Minor version – 1 or 2 Bytes
TrafficSignalConfigurationID | N4 Bytes | Identifier of TSD.
TrafficSignalConfigurationData | N5 Bytes |
· AVSceneDescriptors | N6 Bytes | AV Scene Descriptors with added Object semantics (Traffic Signal Descriptors).
DescrMetadata | N7 Bytes | Descriptive Metadata.

6.6.15.5   Data Types and Formats

Traffic Signal Descriptors are Attributes of the Audio-Visual Scene’s Objects.

6.6.15.6   To Respondents

Respondents are requested to:

  1. Comment on, extend, or reformulate the Functional Requirements.
  2. Comment on the use of MPAI Object and Space Descriptors for Traffic Signal Descriptor needs.
  3. Propose alternative Traffic Signal Descriptor solutions.

6.6.16    Basic Environment Representation

6.6.16.1   Definition

The Basic Environment Representation (BER) is the digital representation of the environment traversed by a CAV. The BER results from the integration of all data sensed by the CAV:

  1. Spatial information (e.g., GNSS, odometry).
  2. Audio-Visual Scene Descriptors obtained from the fusion of EST-specific Scene Descriptors.
  3. Road Topology.
  4. Environmental data (e.g., weather, temperature, air pressure, ice and water on the road, wind, fog, etc.).

6.6.16.2   Functional Requirements

The functional requirements of the BER format are:

  1. Includes all available information that enables the Autonomous Motion Subsystem (AMS) to define a Path to be executed in a Decision Horizon Time.
  2. Describes the Environment in terms of Scene Descriptors (including static objects, e.g., from Offline Maps) and Topology (e.g., roads and lanes).
  3. Enables object tracking, inference of motion vectors, etc. by referencing the BERs of sufficiently many prior snapshots.
  4. Describes each Object with the following attributes (a data-structure sketch follows this list):
    • Time of validity, specified as Start and End Time of the Object Description.
    • Object Identifier: an Identifier assigned to an Object and retained until the Object disappears.
    • AIM Identifier: identifies the AIM that provided the initial Data used to represent the Object.
    • Object Format ID: MPAI is identifying a set of Object Format specifications that enable unambiguous reference to an Object Format.
    • Identifiers of parent Objects corresponding to the current Object.
    • Identifier of a parent Object that has spawned more than one current Object.
    • ID of the spatially corresponding Object of a different Type.
    • Spatial Attitude of the Object.
    • Object dimensionality (2D, 2.5D, and 3D); applicable only to Visual Objects.
    • Visual Object shape.
    • Semantic relationship with other Objects, e.g., identification of groups of Objects (platoon). The components of a platoon may broadcast Platooning Information, or a CAV may be able to deduce it by observing the behaviour of a group of CAVs over a period of time.
    • Accuracy of all Object values.
  5. Allows for easy verification of the feasibility of a Trajectory (e.g., the AMS can easily check that the intended Trajectory of the ego CAV, designed to reach the intended point, does not collide with other Visual Objects in the Decision Horizon based on the current state of the BER).
  6. Has a scalable representation, i.e., it allows for:
    • Gradual refinement of a BER when new EST-specific Scene Descriptors are added.
    • Extraction of part of the BER based on a required Level of Detail (e.g., Object bounding boxes and their Spatial Attitudes).
    • Easy addition of new data (e.g., adding the shape of an Object when there was only the bounding box).
    • Fast access to Object metadata, e.g.:
      • Spatial Attitude.
      • Shape (e.g., bounding box for a Visual Object).
    • Selective (read) access to data required by different AIMs, e.g., the RADAR Scene Description AIM accesses the current BER to improve its description.
    • Easy update of Objects and Scene from one snapshot to another.
    • Possibility for a CAV to communicate a subset of its BER to another CAV, e.g., Objects with different degrees of detail, starting from bounding boxes and their Position Attributes, depending on the available bandwidth.
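
As referenced in item 4, here is a sketch of a container for the per-Object attributes listed there; names, types, and optionality are assumptions made for illustration, not a normative format.

```python
from dataclasses import dataclass, field

@dataclass
class BERObject:
    """Hypothetical per-Object record of the Basic Environment Representation."""
    object_id: str                         # retained until the Object disappears
    start_time: float                      # validity of this Object Description
    end_time: float
    aim_id: str                            # AIM that provided the initial Data
    object_format_id: str                  # reference to an MPAI Object Format
    parent_ids: list[str] = field(default_factory=list)
    spawning_parent_id: str | None = None  # parent that spawned several Objects
    corresponding_id: str | None = None    # spatially corresponding Object of another Type
    spatial_attitude: tuple = ()           # position/orientation (and derivatives)
    dimensionality: str | None = None      # "2D" | "2.5D" | "3D" (Visual Objects only)
    shape: object | None = None            # e.g., bounding box or mesh
    group_ids: list[str] = field(default_factory=list)  # e.g., platoon membership
    accuracy: dict = field(default_factory=dict)        # Accuracy of all values
```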

6.6.16.3   Syntax

6.6.16.4   Semantics

6.6.16.5   Data Formats

6.6.16.6   To Respondents

Respondents are requested to:

  1. Explore the use of the MPAI Audio-Visual Scene Descriptors to support the Basic Environment Representation by adding the missing functionalities.
  2. Comment on the Functional Requirements.

6.6.17    Alert

6.6.17.1   Definition

6.6.17.2   Functional Requirements

6.6.17.3   Syntax

6.6.17.4   Semantics

6.6.17.5   Data Formats

6.6.17.6   To Respondents