1      Introduction

The CAV-TEC standard specifies the Connected Autonomous Operation (CAO), a component of a land-based Connected Autonomous Vehicle (CAV). On request from an authorised human or process, the CAO is expected to autonomously take the CAV from its current Position and Orientation (together called a Point of View) to a specified Point of View. It does so by capturing environment data from onboard systems, exchanging relevant data with peer CAOs onboard other CAVs, and issuing control instructions, compliant with applicable traffic laws, to three components of its CAV called Brakes, Motors, and Wheels.

Figure 1 characterises the type of environment that a CAV executing the CAO may be requested to traverse.

Figure 1 – An example of an environment traversed by a CAV

The environment is populated by humans carrying devices, by CAVs, and by objects such as vehicles, roadside units, and traffic lights. Any of these may be “CAV-aware”, i.e., able to understand the information transmitted to them.

Figure 2 depicts the interaction of CAV subsystems with infrastructure, other vehicles, and the environment in a closed-loop system.

Figure 2 – CAV subsystems, infrastructure/CAVs, and environment

  • A Human or service interacts with the Human‑CAV Interaction (HCI) subsystem to activate and interact with the Autonomous Motion Subsystem (AMS).
  • The AMS acts as the central intelligence, coordinating perception, decision‑making, and control, and activating the Environment Sensing Subsystem (ESS).
  • The ESS acquires sensory data from the environment, receives spatial information from the AMS, consolidates scene descriptions, and provides raw scene data to the AMS.
  • The AMS exchanges information with infrastructure and other CAVs and generates motion commands directed to the Motion Actuation Subsystem (MAS).
  • The MAS performs low‑level control and manages execution through physical actuators such as brakes, motors, and steering. The actuators affect the environment and return status information to the MAS and upstream subsystems.

Figure 3 depicts the four Subsystems composing a CAV. Each Subsystem is implemented as a Composite AIM conforming to Technical Specification AI Framework (MPAI-AIF) V3.0 and Process Instance Trust Framework (MPAI-PTF) V1.0.

Figure 3 – The Reference Model of a CAV

AIMs may be located in other subsystems, on condition that the interfaces specified by CAV-TEC are preserved.

A human approaching a CAV asks the Human‑CAV Interaction Subsystem (HCI), using a combination of Audio, Visual, and LiDAR signals, to be taken to a Point of View. A remote process can make a similar request to the CAV.

Either request is passed to the Autonomous Motion Subsystem (AMS), which asks the Environment Sensing Subsystem (ESS) for the current Point of View of the CAV. Using the current Point of View, the Destination Point of View, and any accessible Offline Maps, the AMS can propose one or more Routes, one of which the human or process can select.

With the human aboard, the AMS continuously receives environment information from the ESS — possibly complemented with information received from other CAVs in range — and instructs the Motion Actuation Subsystem to move the vehicle appropriately.

2      Human-CAV Interaction

The operation of the HCI in its interaction with humans and the rest of the CAV is best explained using the CAV-HCI Reference Model of Figure 4.

Figure 4 – Reference Model of CAV-HCI

The Audio-Visual Scene Description (AVS) monitors the environment and produces Audio-Visual Scene Descriptors from which it extracts Speech Scene Descriptors and from these, Speech Objects corresponding to any speaking humans in the environment surrounding the CAV. Visual Scene Descriptors may also be extracted in the form of Face and Body Descriptors of all humans present.

The CAV activates Automatic Speech Recognition (ASR) to have the speech of each human recognised and converted into Recognised Text. Each Speech Object is identified according to its position in space. The CAV also activates Visual Object Identification (VOI) that is able to produce the Instance IDs of Visual Objects as indicated by humans.

Natural Language Understanding (NLU) processes the Speech Objects, produces Refined Text, and extracts Meaning from the Text of each input Speech. This process is facilitated by the use of the IDs of the Visual Objects provided by VOI.

Speaker Identity Recognition (SIR) and Face Identity Recognition (FIR) help the CAV to reliably obtain the Identifiers of the humans the HCI is interacting with. If the Face ID(s) provided by FIR correspond to the ID(s) provided by SIR, the CAV may proceed to attend to further requests. Especially with humans aboard, Personal Status Extraction (PSE) provides useful information regarding the humans’ state of mind by extracting their Personal Status.
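
The cross-check between SIR and FIR can be illustrated with a minimal Python sketch. It assumes that "correspond" means the two AIMs report the same non-empty set of identifiers; CAV-TEC may define the correspondence differently. The function name `identities_agree` and the identifier strings are hypothetical.

```python
def identities_agree(speaker_ids, face_ids):
    """Return True when SIR and FIR report the same non-empty set of IDs.

    Assumption: correspondence is modelled as set equality.
    """
    speakers, faces = set(speaker_ids), set(face_ids)
    # An empty result means nobody was identified, so nothing is confirmed.
    return bool(speakers) and speakers == faces


# Example: both AIMs agree on one human, so the HCI may attend to requests.
print(identities_agree(["person-17"], ["person-17"]))
```

A real implementation would probably also need to handle partial matches, for example a human who is visible but silent.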

The CAV interacts with humans through Entity Dialogue Processing (EDP). When a human requests to be taken to a Destination, the EDP interprets and communicates the request to the Autonomous Motion Subsystem (AMS). A dialogue may then ensue where the AMS may offer different choices to satisfy potentially different human needs (e.g., a long but comfortable Route or a short but less predictable one).

While the CAV moves to the Destination, the HCI may converse with the humans, show the Full Environment Descriptors developed by the AMS to the passengers, and communicate information about the CAV from the Ego AMS or more generally from the HCIs of remote CAVs.

The HCI responds using the two main EDP outputs: Text and Personal Status. Response and Scene Rendering (RSR) uses these to produce the Portable Avatar that represents the HCI, whose synthesised Speech, Face, and Gesture are rendered as Audio, Speech, and Visual information. Alternatively, RSR can display the Full Environment Descriptors produced by the AMS from the Point of View selected by the human.

3      Environment Sensing Subsystem

The operation of the Environment Sensing Subsystem (ESS) is best explained using the Reference Model of the CAV-ESS subsystem depicted in Figure 5.

Figure 5 – Reference Model of CAV-ESS

When the CAV is activated in response to a request by a human owner or renter or by a process, Spatial Attitude Generation continuously computes the CAV’s Spatial Attitude, relying on the initial Point of View provided by the Motion Actuation Subsystem and, if available, on information from the Global Navigation Satellite Systems (GNSS).

An ESS may be equipped with a variety of Environment Sensing Technologies (EST). CAV-TEC assumes they are Audio, LiDAR, RADAR, Ultrasound, and Visual, and that a CAV implementation may support only a subset of them. An Offline Map is also treated as an EST.

An EST-specific Scene Description AIM receives EST-specific Data Objects and produces EST-specific Scene Descriptors, which are integrated into the Basic Environment Descriptors (BED) by the Basic Environment Description AIM using:

  1. All available sensing technologies.
  2. Weather Data.
  3. Road State.
  4. The Full Environment Descriptors of previous instants produced by the AMS.

Note that, although Figure 5 shows each sensing technology processed by its own Scene Description AIM, an implementation may combine two or more Scene Description AIMs to handle two or more ESTs, provided the relevant interfaces are preserved. An EST-specific Scene Description AIM may need to access the BED of previous instants and may produce Alerts that are immediately communicated to the AMS.
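
The integration step described above can be sketched as a small Python data structure. This is an illustration only: the class `BasicEnvironmentDescriptors`, the function `build_bed`, the field names, and the string keys for the ESTs are all hypothetical and are not the formats specified by CAV-TEC. The sketch shows one point from the text, namely that an implementation may supply only a subset of the ESTs.

```python
from dataclasses import dataclass, field
from typing import Optional

# ESTs named in the text; the string keys themselves are hypothetical.
KNOWN_ESTS = {"Audio", "LiDAR", "RADAR", "Ultrasound", "Visual", "OfflineMap"}


@dataclass
class BasicEnvironmentDescriptors:
    scene_descriptors: dict  # EST name -> EST-specific Scene Descriptors
    weather_data: Optional[dict] = None
    road_state: Optional[str] = None
    previous_feds: list = field(default_factory=list)


def build_bed(scene_descriptors, weather_data=None, road_state=None,
              previous_feds=None):
    """Combine whatever EST outputs are present into one BED record."""
    unknown = set(scene_descriptors) - KNOWN_ESTS
    if unknown:
        raise ValueError(f"unknown sensing technologies: {sorted(unknown)}")
    return BasicEnvironmentDescriptors(
        scene_descriptors=dict(scene_descriptors),
        weather_data=weather_data,
        road_state=road_state,
        previous_feds=list(previous_feds or []),
    )


# A CAV with only LiDAR and Visual sensing still yields a valid BED.
bed = build_bed({"LiDAR": ["pedestrian"], "Visual": ["pedestrian", "stop sign"]},
                road_state="wet")
print(sorted(bed.scene_descriptors))
```

The record only collects the inputs; the actual fusion of descriptors from different ESTs is outside the scope of this sketch.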

The Objects included in the BEDs may carry Annotations specifically related to traffic signalling, e.g.: Point of View of traffic signals in the environment, Traffic Policemen, Road signs (lanes, turn right/left on the road, one way, stop signs, words painted on the road), Traffic signs — vertical signalisation (signs above the road, signs on objects, poles with signs), Traffic lights, Walkways, and Traffic sounds (siren, whistle, horn).

4      Autonomous Motion Subsystem

The operation of the Autonomous Motion Subsystem (AMS) is best explained using the Reference Model of the CAV-AMS subsystem depicted in Figure 6.

Figure 6 – Reference Model of CAV-AMS

When the HCI sends the AMS a request from a human or a process to move the CAV to a Destination, Route Selection Planning uses the Basic Environment Descriptors from the ESS and produces a set of Waypoints starting from the current Point of View to the Destination.

When the CAV is in motion, Route Selection Planning causes Path Selection Planning to generate a set of Points of View to reach the next Waypoint. The Full Environment Description AIM may request the AMSs of Remote CAVs to send (subsets of) their Scene Descriptors and integrates all sources of Environment Descriptors into its Full Environment Descriptors (FED), and may also respond to similar requests from Remote CAVs.

Motion Selection Planning generates a Trajectory to reach the next Point of View in each Path. Traffic Obstacle Avoidance receives the Trajectory and checks whether any received Alert indicates a potential collision along it. If a potential collision is detected, Traffic Obstacle Avoidance requests a new Trajectory from Motion Selection Planning; otherwise, it issues an AMS‑MAS Message to the Motion Actuation Subsystem (MAS).
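
The exchange between Traffic Obstacle Avoidance and Motion Selection Planning can be sketched as a retry loop. The Python below is a minimal illustration under stated assumptions: the two callables stand in for the AIMs, the dictionary stands in for an AMS‑MAS Message, and the `max_attempts` bound is an addition of this sketch, since the text does not say how many times a new Trajectory may be requested.

```python
def select_safe_trajectory(plan_trajectory, alerts_conflict, max_attempts=3):
    """Request Trajectories until one does not conflict with an Alert.

    plan_trajectory(attempt) -> trajectory  (stands in for Motion Selection Planning)
    alerts_conflict(trajectory) -> bool     (stands in for the Alert check)
    Returns a hypothetical AMS-MAS Message, or None if every attempt conflicts.
    """
    for attempt in range(max_attempts):
        trajectory = plan_trajectory(attempt)
        if not alerts_conflict(trajectory):
            # No potential collision: hand the Trajectory to the MAS.
            return {"type": "AMS-MAS Message", "trajectory": trajectory}
    return None


# Example: the first candidate is blocked by an Alert, the second is clear.
candidates = ["straight", "swerve_left"]
blocked = {"straight"}
message = select_safe_trajectory(lambda i: candidates[i],
                                 lambda t: t in blocked,
                                 max_attempts=len(candidates))
print(message)
```

Returning `None` when no candidate is safe leaves the fallback behaviour, such as stopping the vehicle, to the caller.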

The MAS sends an AMS‑MAS Message to the AMS informing it about the execution of the AMS‑MAS Message received. The AMS, based on the received AMS‑MAS Messages, may discontinue the execution of the earlier AMS‑MAS Message, issue a new AMS‑MAS Message, and inform Traffic Obstacle Avoidance. The decision of each element of the chain may be recorded in the AMS Memory (“black box”).

5      Motion Actuation Subsystem

The operation of the Motion Actuation Subsystem (MAS) is best explained using the Reference Model of the CAV-MAS subsystem depicted in Figure 7.

Figure 7 – Reference Model of CAV-MAS

When the AMS Message Interpretation AIM receives an AMS‑MAS Message from the AMS, it interprets the message, partitions it into commands, and sends commands to the Brake, Motor, and Wheel mechanical subsystems. CAV-TEC is silent on how the three mechanical subsystems process the commands but specifies the format of the responses issued to and received by the AMS Message Interpretation AIM. The result of the interpretation is sent as an AMS‑MAS Message to the AMS.
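
The interpret, partition, and dispatch steps can be sketched as follows. The message layout (a dictionary keyed by subsystem name), the function names, and the actuator callables are hypothetical, since the actual AMS‑MAS Message format is defined by CAV-TEC and is not reproduced here.

```python
SUBSYSTEMS = ("Brake", "Motor", "Wheel")


def partition(message):
    """Split one AMS-MAS Message into per-subsystem commands."""
    return {name: message[name] for name in SUBSYSTEMS if name in message}


def interpret(message, actuators):
    """Dispatch each command and gather the responses into a reply message."""
    responses = {
        name: actuators[name](command)
        for name, command in partition(message).items()
    }
    return {"type": "AMS-MAS Message", "responses": responses}


# Stand-in actuators that simply acknowledge the command they receive.
actuators = {name: (lambda cmd, name=name: f"{name} ok: {cmd}")
             for name in SUBSYSTEMS}
reply = interpret({"Brake": 0.0, "Motor": 0.4}, actuators)
print(reply["responses"])
```

Keeping `partition` separate from the dispatch makes it possible to test the splitting logic without any actuator attached.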

The MAS includes further AIMs:

  1. Inertial Sensing includes a variety of devices (Odometer, Speedometer, Accelerometer, and Inclinometer) and produces Spatial Data.
  2. Spatial Attitude Generation computes the Ego CAV’s initial Spatial Attitude using the Spatial Data provided by Inertial Sensing. This initial Spatial Attitude is sent to the ESS.
  3. Weather Sensing includes a variety of devices (thermometer, hygrometer, anemometer, etc.) and produces Weather Data.
  4. Ice Condition Analysis augments the Weather Data by analysing the Brake, Motor, and Wheel mechanical subsystems’ responses and sends the augmented Weather Data to the ESS.

6      Connected Autonomous Vehicle Security

The Connected Autonomous Vehicle considered by CAV-TEC is composed of:

  1. Connected Autonomous Operation (CAO),
  2. Brakes, Motors, and Wheels, and
  3. The rest of the physical vehicle.

Connected Autonomous Operation (CAV-CAO) is implemented as an AI Module and executed in the AI Framework according to the MPAI-AIF standard. After trust has been established as specified by MPAI-PTF, the AI Modules that are part of the CAO AIM interact by exchanging data enriched with Data Exchange Metadata.

The Motion Actuation Subsystem (MAS) issues commands to, and receives responses from, Brakes, Motors, and Wheels. These operate as Processes, not as AIMs. The MAS exchanges commands and responses with Brakes, Motors, and Wheels as specified by this CAV-TEC standard.
