| 1. Functions | 2. Reference Model | 3. Input/Output Data |
| 4. JSON Metadata | 5. SubAIMs | 6. Profiles |
| 7. Reference Software | 8. Conformance Testing | 9. Performance Assessment |
1. Functions
Speech Feature Analysis 2 (CAE-SF2):
| Receives | Emotionless Speech. |
| Extracts | Emotionless Speech Features from Emotionless Speech. |
| Produces | Prosodic Speech Features. |
2. Reference Model
Figure 1 depicts the Speech Feature Analysis 2 (CAE-SF2) AIM:

Figure 1 – Speech Feature Analysis 2 (CAE-SF2) AIM
3. Input/Output Data
Table 1 gives the Input/Output Data of the Speech Feature Analysis 2 (CAE-SF2) AIM.
Table 1 – Input/Output Data of the Speech Feature Analysis 2 (CAE-SF2) AIM
| Input data | Semantics |
| Emotionless Speech | Utterance provided as a model. |
| Output data | Semantics |
| Emotionless Speech Features | Descriptors of the Speech without Emotion. |
4. JSON Metadata
https://schemas.mpai.community/CAE1/V2.4/AIMs/SpeechFeatureAnalysis2.json
5. SubAIMs
No SubAIMs.
6. Profiles
No Profiles.
7. Reference Software
Reference Software not available.
8. Conformance Testing
| Receives | Emotionless Speech | Shall validate against the Audio Object schema. The Qualifier shall validate against the Audio Qualifier schema. The values of any Sub-Type, Format, and Attribute of the Qualifier shall correspond with the Sub-Type, Format, and Attributes of the Audio Object Qualifier schema. |
| Produces | Emotionless Speech Features | Shall validate against the Speech Features Schema. |
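The "shall validate against the schema" requirements above can be checked mechanically. The following is a minimal sketch in Python, assuming a drastically simplified stand-in schema: the actual Audio Object, Audio Qualifier and Speech Features schemas are published by MPAI and are not reproduced here. `TOY_SPEECH_FEATURES_SCHEMA`, its property names and the `validate` helper are hypothetical, and the helper covers only required properties and basic JSON types, a small subset of JSON Schema.

```python
import json

# Hypothetical stand-in schema: NOT the MPAI Speech Features schema.
# Property names are invented for illustration only.
TOY_SPEECH_FEATURES_SCHEMA = {
    "type": "object",
    "required": ["Header", "SpeechFeaturesData"],
    "properties": {
        "Header": {"type": "string"},
        "SpeechFeaturesData": {"type": "array"},
    },
}

# Map JSON Schema type names to Python types (subset only).
_JSON_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "number": (int, float),
    "boolean": bool,
}


def validate(instance, schema):
    """Return a list of error strings; an empty list means the instance passed."""
    errors = []
    expected = schema.get("type")
    if expected is not None:
        py_type = _JSON_TYPES[expected]
        # bool is a subclass of int in Python, so reject it explicitly for "number".
        is_bool_as_number = expected == "number" and isinstance(instance, bool)
        if not isinstance(instance, py_type) or is_bool_as_number:
            return [f"expected {expected}, got {type(instance).__name__}"]
    if isinstance(instance, dict):
        # Report every missing required property.
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"missing required property: {key}")
        # Recurse into the properties that are present.
        for key, sub_schema in schema.get("properties", {}).items():
            if key in instance:
                errors.extend(f"{key}: {e}" for e in validate(instance[key], sub_schema))
    return errors


candidate = json.loads('{"Header": "CAE-SF2", "SpeechFeaturesData": [0.1, 0.2]}')
print(validate(candidate, TOY_SPEECH_FEATURES_SCHEMA))  # prints []
```

A real Conformance Tester would more likely load the published schema files and use a full JSON Schema validator; the sketch only shows the shape of the check.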
9. Performance Assessment
Table 12 gives the Means (verification procedures) of the Emotion Enhanced Speech (EES) Speech Feature Analysis 2 AIM and how they are used.
Table 12 – Means and use of Emotion Enhanced Speech (EES) Speech Feature Analysis 2 AIM
| Means | Actions |
| Conformance Testing Dataset | DS1: a dataset of at least y > N Emotionless Speech Segments.
DS2: a dataset of y Emotion Lists.
DS3: a dataset of one element, specifying the Language in question.
DS4: a dataset of y Speech with Emotion Segments, where each is associated with specific elements of DS1, DS2, and DS3 used as input, and thus represents one correct output, given this input. |
| Procedure | Given a reference Emotion Feature Producer (ID: efp), a reference Emotion Inserter 2 (ID: ei2) and a Speech Feature Analysis 2 module that we want to test, we measure the quality of Speech Feature Analysis 2 in relation to the reference modules as follows:
|
| Evaluation |
|

Figure 5 – EES path 2

Figure 6 – EES Speech Feature Analyser2.
After the Tests, the Conformance Tester shall fill out Table 13.
Table 13 – Conformance Testing form of Emotion Enhanced Speech (EES) Speech Feature Analysis 2 AIM
| Conformance Tester ID | Unique Conformance Tester Identifier assigned by MPAI |
| Standard, Use Case ID and Version | Standard ID and Use Case ID, Version and Profile of the standard in the form “CAE:EES:1:0”. |
| Name of AIM | Speech Feature Analyser2 |
| Implementer ID | Unique Implementer Identifier assigned by MPAI Store. |
| AIM Implementation Version | Unique Implementation Identifier assigned by Implementer. |
| Neural Network Version* | Unique Neural Network Identifier assigned by Implementer. |
| Identifier of Conformance Testing Dataset | Unique Dataset Identifier assigned by MPAI Store. |
| Test ID | Unique Test Identifier assigned by Conformance Tester. |
| Actual output | The Conformance Tester will provide the following matrix related to the modules utilized for the tests. Denoting with i and j, 0≤i<x and 0≤j<y, the record number in DS1 and DS2 respectively, the matrices reflect the results obtained with a limited number of random multiple inputs and the corresponding outputs.
Example:
Language: DS3 |
| Execution time* | Duration of test execution. |
| Test comment* | In case step 1 of Conformance Testing fails, the Conformance Tester shall request the implementer to provide an Emotion Feature Producer AIM (AIM2).
In case step 4 or 5 of Conformance Testing also fails, the Conformance Tester shall inform the implementer that the Speech Feature Analyser2 (AIM1) did not pass the CT. |
| Test Date | yyyy/mm/dd. |
* Optional field
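The form in Table 13 can be held as a simple record. The sketch below is illustrative only: the class and field names (`StandardRef`, `ConformanceForm`, `parse_test_date`) are hypothetical, the sample values are placeholders, and it assumes that the four colon-separated parts of “CAE:EES:1:0” map, in order, to Standard, Use Case, Version and Profile. The "Actual output" matrix is left out because its layout is not shown in the table.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


@dataclass(frozen=True)
class StandardRef:
    """Hypothetical holder for an identifier such as "CAE:EES:1:0"."""
    standard: str
    use_case: str
    version: str
    profile: str

    @classmethod
    def parse(cls, text: str) -> "StandardRef":
        # Assumed field order: Standard, Use Case, Version, Profile.
        parts = text.split(":")
        if len(parts) != 4 or not all(parts):
            raise ValueError(f"expected 4 non-empty colon-separated fields, got {text!r}")
        return cls(*parts)


def parse_test_date(text: str) -> date:
    """Parse a Test Date written as yyyy/mm/dd."""
    return datetime.strptime(text, "%Y/%m/%d").date()


@dataclass
class ConformanceForm:
    tester_id: str
    standard_ref: StandardRef
    aim_name: str
    implementer_id: str
    aim_implementation_version: str
    dataset_id: str
    test_id: str
    test_date: date
    # Fields marked "*" in Table 13 are optional.
    neural_network_version: Optional[str] = None
    execution_time_s: Optional[float] = None
    test_comment: Optional[str] = None


# Placeholder values; real identifiers are assigned as described in Table 13.
form = ConformanceForm(
    tester_id="tester-001",
    standard_ref=StandardRef.parse("CAE:EES:1:0"),
    aim_name="Speech Feature Analyser2",
    implementer_id="implementer-001",
    aim_implementation_version="0.1",
    dataset_id="dataset-001",
    test_id="test-001",
    test_date=parse_test_date("2024/01/31"),
)
print(form.standard_ref.version, form.test_date.isoformat())  # prints: 1 2024-01-31
```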