
 

Terms beginning with a capital letter have the meaning defined in Table 1. Terms beginning with a small letter have the meaning commonly defined for the context in which they are used. For instance, Table 1 defines Object and Scene but does not define object and scene.

A dash “-” preceding a Term in Table 1 indicates that the Term combines with the nearest preceding Term that has no dash. The font of the dashed Term determines the reading order:

  1. Normal font: the preceding Term without a dash is read before the dashed Term. For example, “Avatar” and “- Model” yield “Avatar Model.”
  2. Italic font: the preceding Term without a dash is read after the dashed Term. For example, “Avatar” and “- Portable” yield “Portable Avatar.”
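
The two reading rules can be sketched in a few lines of Python. This is only an illustration: the function name `expand_term` and the `italic` flag are hypothetical, because the font is a property of the table's typesetting and not of the Term text itself.

```python
def expand_term(parent: str, dashed: str, italic: bool = False) -> str:
    """Combine a parent Term with a dashed sub-Term from Table 1.

    `italic` is an assumed flag standing in for the font of the dashed Term.
    """
    # Drop surrounding whitespace and the leading hyphen or en dash.
    sub = dashed.strip().lstrip("-–").strip()
    # Normal font: parent first. Italic font: parent last.
    return f"{sub} {parent}" if italic else f"{parent} {sub}"


print(expand_term("Avatar", "- Model"))                  # Avatar Model
print(expand_term("Avatar", "- Portable", italic=True))  # Portable Avatar
```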

Table 1. Table of terms and definitions

Term Definition
Attitude
–       Social The coded representation of the internal state related to the way a human or avatar intends to position itself vis-à-vis the Environment or subsets of it, e.g., “Respectful”, “Confrontational”, “Soothing”.
–       Spatial Position and Orientation and their velocities and accelerations of an Audio and Visual Object in a Virtual Environment.
Audio Digital representation of an analogue audio signal sampled at a frequency between 8 kHz and 192 kHz with a number of bits/sample between 8 and 32, and with non-linear or linear quantisation.
–       Object Coded representation of Audio information with its metadata. An Audio Object can be a combination of Audio Objects.
–       Scene The Audio Objects of an Environment with Object location metadata.
Audio-Visual Object Coded representation of Audio-Visual information with its metadata. An Audio-Visual Object can be a combination of Audio-Visual Objects.
Audio-Visual Scene (AV Scene) The Audio-Visual Objects of an Environment with Object location metadata.
Avatar An animated 3D object representing a real or fictitious person in a Virtual Space.
–       Model An inanimate avatar exposing interfaces enabling animation.
Cognitive State The coded representation of the internal state reflecting the way a human or avatar understands the Environment, such as “Confused”, “Dubious”, “Convinced”.
Colour (of speech) The timbre of an identifiable voice independent of a current Personal Status and language.
Connected Autonomous Vehicle (CAV) A vehicle able to autonomously reach an assigned geographical position by:
  1. Understanding human utterances.
  2. Planning a route.
  3. Sensing and interpreting the Environment.
  4. Exchanging information with other CAVs.
  5. Acting on the CAV’s motion actuation subsystem.

Context Information surrounding an Entity and providing additional information about the communication emitted by the Entity.
Data Information in digital form.
–       Format The standard digital representation of Data.
–       Type An instance of Data with a specific Data Format.
Descriptor Coded representation of a text, audio, speech, or visual feature.
Digital Representation Data corresponding to and representing a real entity.
Emotion The coded representation of the internal state resulting from the interaction of a human or avatar with the Environment or subsets of it, such as “Angry”, “Sad”, “Determined”.
Entity A real or Digital Human.
Environment A Virtual Space containing a Scene.
Face The portion of a 2D or 3D digital representation corresponding to the face of a human.
Factor One of Emotion, Cognitive State and Attitude.
Gesture A movement of the body or part of it, such as the head, arm, hand, and finger, often a complement to a vocal utterance.
Grade The intensity of a Factor.
Human A human being in a real space.
–       Digital A Digitised or a Virtual Human in a Virtual Space.
–       Digitised An Object in a Virtual Space that has the appearance of a specific human when rendered.
–       Virtual An Object in a Virtual Space created by a computer that has a human appearance when rendered but is not a Digitised Human.
Identifier The label uniquely associated with a human, an avatar, or an object.
Instance An element of a set of entities – Objects, users etc. – belonging to some levels in a hierarchical classification (taxonomy).
Intention The result of analysis of the goal of an input question.
Manifestation The manner of showing the Personal Status, or a subset of it, in any one of Speech, Face, and Gesture.
Meaning Information extracted from Text such as syntactic and semantic information, Personal Status, and other information, such as an Object Identifier.
Modality One of Text, Speech, Face, or Gesture.
Object Descriptor An individual attribute of the coded representation of an object in a Scene, including its Spatial Attitude.
Orientation The set of 3 angles (roll, pitch, yaw) indicating the rotation around the principal axis (x) of an Object, with its y axis at an angle of 90° counterclockwise (right-to-left) from the x axis and its z axis pointing up toward the viewer.
Personal Status The ensemble of information internal to a person, including Emotion, Cognitive State, and Attitude.
Portable Avatar A Data Type representing an Avatar and its Context.
Pitch The fundamental frequency of Speech. Pitch is the attribute that makes it possible to judge sounds as “higher” and “lower.”
Point of View The Spatial Attitude of a human or avatar looking at an Environment.
Position The 3 coordinates (x, y, z) of a representative point of an object in a Real or Virtual Space.
Refined Text The Text resulting from the analysis, made by Natural Language Understanding, of the Text produced by Automatic Speech Recognition.
Scene A structured composition of Objects.
Speech Digital representation of analogue speech sampled at a frequency between 8 kHz and 96 kHz with a number of bits/sample of 8, 16, or 24, and with non-linear or linear quantisation.
–       Features Aspects of a speech segment that enable its description and reproduction, e.g., degree of vocal tension, Pitch, etc., and that can be automatically recognised and extracted for speech synthesis or other related purposes.
–       Rate The number of Speech Units per second.
–       Unit Phoneme, syllable, or word as a segment of Speech.
Summary An abridged outline of the content of the utterance(s) of one or more Users possibly including their Personal Statuses.
Text A sequence of characters drawn from a finite alphabet.
Visual Object Coded representation of Visual information with its metadata. A Visual Object can be a combination of Visual Objects.
Vocal Gesture Utterance, such as cough, laugh, hesitation, etc. Lexical elements are excluded.
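
As an illustration of how the Terms Personal Status, Factor, and Grade in Table 1 relate to each other, the following Python sketch models them as simple data types. It is a sketch under stated assumptions, not a normative data format: the class names, the `label` field, and the 0.0 to 1.0 range for Grade are hypothetical, since Table 1 only defines Grade as "the intensity of a Factor".

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class Factor(Enum):
    """One of Emotion, Cognitive State, and Attitude (Table 1)."""
    EMOTION = "Emotion"
    COGNITIVE_STATE = "Cognitive State"
    ATTITUDE = "Attitude"


@dataclass(frozen=True)
class FactorValue:
    factor: Factor
    label: str    # e.g. "Angry", "Confused", "Respectful"
    grade: float  # assumed scale: 0.0 (weakest) to 1.0 (strongest)

    def __post_init__(self) -> None:
        if not 0.0 <= self.grade <= 1.0:
            raise ValueError("grade must lie in the assumed range [0.0, 1.0]")


@dataclass
class PersonalStatus:
    """The ensemble of Factors; holds at most one value per Factor."""
    values: Dict[Factor, FactorValue] = field(default_factory=dict)

    def set(self, value: FactorValue) -> None:
        self.values[value.factor] = value


status = PersonalStatus()
status.set(FactorValue(Factor.EMOTION, "Angry", 0.8))
status.set(FactorValue(Factor.COGNITIVE_STATE, "Confused", 0.3))
```

Keeping one value per Factor is a simplification chosen for brevity; a Manifestation in a given Modality (Speech, Face, or Gesture) could be modelled as one such `PersonalStatus` per Modality.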

 
