CAE-USC V2.4 AIMs Speech Feature Analysis 1

Speech Feature Analysis 1 (CAE-SF1):

Receives	Model Utterance containing emotion.
Extracts	Speech Features1 from the Model Utterance.
Produces	Prosodic Speech Features.

Figure 1 depicts the Speech Feature Analysis 1 (CAE-SF1) AIM:

Figure 1 – Speech Feature Analysis 1 (CAE-SF1) AIM

Table 1 gives the Input/Output Data of the Speech Feature Analysis 1 (CAE-SF1) AIM.

Table 1 – Input/Output Data of the Speech Feature Analysis 1 (CAE-SF1) AIM

Input data	Semantics
Model Utterance	Utterance provided as a model.
Output data	Semantics
Prosodic Speech Features	A type of Speech Features (Descriptors).

No SubAIMs.

No Profiles.

Reference Software not available.

Receives	Model Utterance	Shall validate against the Audio Object schema. The Qualifier shall validate against the Audio Qualifier schema. The values of any Sub-Type, Format, and Attribute of the Qualifier shall correspond with the Sub-Type, Format, and Attributes of the Audio Object Qualifier schema.
Produces	Prosodic Speech Features	Shall validate against the Speech Features schema.

Table 6 gives the Means (verification procedures) for the Emotion Enhanced Speech (EES) Speech Feature Analyser1 and how they are used.

Table 6 – Means and use of Emotion Enhanced Speech (EES) Speech Feature Analyser1 AIM

Means	Actions
Conformance Testing Dataset	DS1: a dataset of n Model Utterances, with n > M. DS2: a dataset of n Speech Features 1 arrays, each associated with a specific utterance of DS1 used as input and representing one correct output for that input.
Procedure	For each of the n Model Utterances in input:
1. Feed the Speech Feature Analyser (SFA) 1 under test with the current Model Utterance.
2. Verify that the number of features in the output Speech Features 1 array equals the corresponding one in DS2.
3. For each feature of the output Speech Features 1 array, compute the delta (absolute difference) between:
   - the pitch property and the corresponding DS2 data, in Hz;
   - the intensity property and the corresponding DS2 data, in dB;
   - the duration property and the corresponding DS2 data, in ms.
4. Compute the Average of:
   - the deltas of the pitch property;
   - the deltas of the intensity property;
   - the deltas of the duration property.
Then compute the Average for each of the three properties among the n Model Utterances. Considering one of the three properties (pitch, intensity and duration) and denoting it as p, a mathematical representation of the computation for each property is:
Evaluation

Figure 3 – EES Speech Feature Analyser1.
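
The averaging described in the Procedure of Table 6 can be written as follows. This is a sketch reconstructed from the step descriptions only; the symbols F_i (number of features in the i-th Speech Features 1 array), the superscripts "out" and "DS2", and the bar notation are assumed here and are not taken from the standard.

```latex
% p is one of the three properties: pitch (Hz), intensity (dB), duration (ms)
% i = 1..n indexes the Model Utterances, j = 1..F_i the features of utterance i
\delta^{(p)}_{i,j} = \left| p^{\mathrm{out}}_{i,j} - p^{\mathrm{DS2}}_{i,j} \right|
\qquad
\bar{\delta}^{(p)}_{i} = \frac{1}{F_i} \sum_{j=1}^{F_i} \delta^{(p)}_{i,j}
\qquad
\bar{\Delta}^{(p)} = \frac{1}{n} \sum_{i=1}^{n} \bar{\delta}^{(p)}_{i}
```

Under this reading, step 3 yields the deltas, step 4 yields the per-utterance Average, and the final sentence of the Procedure yields the Average among the n Model Utterances.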

After the Tests, the Conformance Tester shall fill out Table 7.

Table 7 – Conformance Testing form of Emotion Enhanced Speech (EES) Speech Feature Analyser1 (AIM1)

Conformance Tester ID	Unique Conformance Tester Identifier assigned by MPAI
Standard, Use Case ID and Version	Standard ID and Use Case ID, Version and Profile of the standard in the form “CAE:EES:1.2:0”.
Name of AIM	Speech Feature Analyser1
Implementer ID	Unique Implementer Identifier assigned by Conformance Tester.
AIM Implementation Version	Unique Implementation Identifier assigned by Implementer.
Neural Network Version*	Unique Neural Network Identifier assigned by Implementer.
Identifier of Test Dataset	Unique Dataset Identifier assigned by Conformance Tester.
Test ID	Unique Test Identifier assigned by Conformance Tester.
Actual output	Actual output provided as a matrix of n+1 rows containing all computed Average values: Result: Threshold: m Final evaluation: Passed / Not passed
Execution time*	Duration of test execution.
Test comment*
Test Date	yyyy/mm/dd.

* Optional field
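
The Procedure of Table 6 and the "Actual output" matrix of Table 7 can be sketched in Python as below. This is a minimal illustration, not Reference Software: the function names, the representation of each feature as a dict with the keys "pitch", "intensity" and "duration", and the layout of the matrix (n per-utterance rows followed by one row of overall Averages) are all assumptions. The comparison against the Threshold m is left out because the text does not state the rule.

```python
from statistics import fmean

# Hypothetical property keys; units follow the Procedure (Hz, dB, ms).
PROPERTIES = ("pitch", "intensity", "duration")


def utterance_averages(output, reference):
    """Steps 2-4 for one Model Utterance.

    `output` is the Speech Features 1 array produced by the SFA1 under test,
    `reference` is the corresponding DS2 array. Both are assumed to be
    non-empty lists of dicts keyed by PROPERTIES.
    """
    # Step 2: the number of features must match the DS2 entry.
    if len(output) != len(reference):
        raise ValueError("feature count differs from the DS2 array")
    # Steps 3 and 4: mean absolute difference, one value per property.
    return [
        fmean(abs(o[p] - r[p]) for o, r in zip(output, reference))
        for p in PROPERTIES
    ]


def conformance_matrix(outputs, references):
    """Return n + 1 rows: one per utterance, then the Average over all n."""
    if len(outputs) != len(references):
        raise ValueError("DS1 outputs and DS2 references differ in length")
    rows = [utterance_averages(o, r) for o, r in zip(outputs, references)]
    overall = [fmean(column) for column in zip(*rows)]
    return rows + [overall]


if __name__ == "__main__":
    # Made-up values for illustration only; not taken from any MPAI dataset.
    outputs = [
        [{"pitch": 200.0, "intensity": 60.0, "duration": 100.0},
         {"pitch": 210.0, "intensity": 62.0, "duration": 120.0}],
        [{"pitch": 150.0, "intensity": 55.0, "duration": 90.0}],
    ]
    references = [
        [{"pitch": 202.0, "intensity": 61.0, "duration": 100.0},
         {"pitch": 206.0, "intensity": 62.0, "duration": 110.0}],
        [{"pitch": 151.0, "intensity": 55.0, "duration": 94.0}],
    ]
    for row in conformance_matrix(outputs, references):
        print(row)
```

Each row holds the three Average values in the order pitch, intensity, duration. A mismatch in the number of features raises an error instead of producing a row, mirroring the check in step 2.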
