Contents

Foreword

1      Introduction

2      Scope and purpose

3      How to submit a response

4      Evaluation Criteria and Procedure

5      Expected timeline

6      References

Annex A: Information Form

Annex B: Evaluation Sheet

Annex C: Check list of data formats proposed by a respondent

Annex D: Technologies that may require specific testing

Annex E: Mandatory text in a submission

 

Foreword

This document is issued by Moving Picture, Audio, and Data Coding by Artificial Intelligence (MPAI) to request parties having rights to technologies satisfying the Use Cases and Functional Requirements [4] and the Framework Licence [5] of MPAI-CAE – Six Degrees of Freedom Audio (CAE-6DF) to respond to this Call for Technologies preferably using the Template for Responses [6].

MPAI is an international non-profit organisation with the mission to develop standards for Artificial Intelligence (AI)-enabled data coding and technologies facilitating integration of data coding components into Information and Communication Technology (ICT) systems [1]. The MPAI Patent Policy [2] guides the accomplishment of this mission.

1          Introduction

Established in September 2020, MPAI has developed eleven Technical Specifications relevant to its mission such as execution environment of multi-component AI applications, portable avatar format, object and scene description, neural network watermarking, context-based audio enhancements, multimodal human-machine conversation and communication, company performance prediction, metaverse, and governance of the MPAI ecosystem. Five Technical Specifications have been adopted by IEEE without modification and four more are in the pipeline. Several other standard projects – such as AI for Health, online gaming and XR Venues – are under way and are expected to deliver specifications in the next few months.

MPAI specifications are the result of a process whose main steps are:

  1. Development of functional requirements in an open environment.
  2. Adoption of “commercial requirements” (Framework Licence) by MPAI principal members setting the main elements of the future licence to be issued by standard essential patent holders.
  3. Publication of a Call for Technologies referring to the two sets of requirements that invites the submission of contributions by parties who accept licensing their technologies according to the Framework Licence, if their technologies are accepted to be part of the target Technical Specification.

This Call for Technologies document requests technologies for use in the planned Technical Specification: Context-based Audio Enhancement (MPAI-CAE) – Six Degrees of Freedom Audio (in the following called CAE-6DF).

2          Scope and purpose

This Call for Technologies: Context-based Audio Enhancement (MPAI-CAE) – Six Degrees of Freedom Audio (CAE-6DF) invites any party able and wishing to contribute to the development of the planned MPAI CAE-6DF Technical Specification to submit a response. Respondents who own technologies relevant to this Call are required to eventually license them according to the Framework Licence: Context-based Audio Enhancement (MPAI-CAE) – Six Degrees of Freedom Audio (CAE-6DF) [5] if those technologies are selected by MPAI for possible modification and inclusion in the planned CAE-6DF Technical Specification.

Any respondent who is not an MPAI member and wishes to participate in the development of CAE-6DF shall join MPAI. If they own accepted technologies and do not join MPAI, they lose the opportunity to have their technologies included in the planned CAE-6DF.

The planned CAE-6DF will be developed using technologies that comply with the following mandatory requirements:

  1. Be part of responses to this Call submitted by parties accepting the CAE-6DF Framework Licence and satisfy the CAE-6DF Use Cases and Functional Requirements.
  2. Be based on technologies specified in published MPAI standards, where relevant, desirable, and feasible [3].

Therefore, the scope of this Call is restricted to responses whose specification in the planned CAE-6DF conforms to [4]. However, respondents are welcome to additionally submit:

  1. Comments on any technical element of [4].
  2. Technologies based on motivated proposals to extend or add new functional requirements in [4] provided they:
    • Are in line with the scope of [4].
    • Satisfy the Framework Licence [5].

At this stage, MPAI membership is not a prerequisite for responding to this Call for Technologies. However, proponents should be aware that, if their proposal or part thereof is accepted for inclusion in the planned Technical Specification: Context-based Audio Enhancement (MPAI-CAE) – Six Degrees of Freedom Audio (CAE-6DF), they will be requested to immediately join MPAI, or lose the opportunity to have their accepted technologies included in the standard.

MPAI will select the most suitable technologies based on their technical merits. However, MPAI is not obligated, by virtue of this Call, to select a particular technology or to select any of the proposed technologies if those submitted are found to be inadequate.

Note that in the future, MPAI may decide to further extend the planned CAE-6DF as a CAE-6DF extension or as a new part of the MPAI-CAE series of Technical Specifications.

3          How to submit a response

Those planning to respond to this Call are:

  1. Advised that this CAE-6DF Call for Technologies will be presented online on 2024/05/28.
  2. Requested to communicate their intention to respond to this Call by sending an initial version of the form of Annex A to the MPAI secretariat by 2024/06/04. Submission of a duly filled out Annex A helps MPAI to properly plan for the review of submissions. Submission of Annex A at this stage, however, is not a requirement, and a response from a respondent who did not submit Annex A will still be accepted.
  3. Encouraged to regularly visit the MPAI-CAE webpage where relevant additional information will be posted.
  4. Required to deliver their submissions to the MPAI secretariat by 2024/08/19 T23:59 UTC. The secretariat will acknowledge receipt of the submission via email.
  5. Required to attend the review of submissions according to the schedule that the 47th MPAI General Assembly (MPAI-47) will define at its online meeting on 2024/08/21. The MPAI secretariat will inform submitters about how non-MPAI members can attend the review sessions. Respondents shall present their submission at these online review sessions. If no presenter of a submission is in attendance, the submission will be discarded.

Responses to this Call for Technologies may/shall include:

Table 1 – Optional and mandatory elements of a response

| Item | Status |
|---|---|
| Detailed documentation describing the proposed technologies. | mandatory |
| The completed version of Annex A. | mandatory |
| The text of Annex B duly filled out with the table indicating to which Functional Requirements the response applies. | mandatory |
| Comments on the completeness and appropriateness of the CAE-6DF Functional Requirements and any motivated suggestion to amend and/or extend those Requirements. | optional |
| A preliminary demonstration, with a detailed document describing it. | optional |
| Any other additional relevant information that may help evaluate the submission, such as additional use cases. | optional |
| The text of Annex E. | mandatory |

Respondents are invited to take advantage of the check list of Annex C before filling out Annex A and submitting their response.

Further information on MPAI can be obtained from the MPAI website.

4          Evaluation Criteria and Procedure

Proposals will be assessed using the following process:

  1. An Evaluation Panel is created from:
    1. MPAI members in attendance.
    2. Non-MPAI members who are respondents.
    3. Non-respondent, non-MPAI-member experts invited in a consulting capacity.
  2. No one from 1.1 and 1.2 is denied membership in the Evaluation Panel if they request it.
  3. Respondents present their proposals.
  4. Evaluation Panel members ask questions.
  5. If required, subjective and/or objective tests are carried out with the following process:
    1. The required tests are defined.
    2. Test subjects are selected.
    3. The required tests are carried out.
    4. A report is produced.
  6. If required, at least 2 reviewers are appointed to review and report about specific points in a proposal.
  7. Evaluation panel members fill out Annex B for each proposal.
  8. Respondents comment on evaluations.
  9. Proposal evaluation report is produced.

5          Expected timeline

Timeline of the Call for Technologies, deadlines and response evaluation:

Table 2 – Dates and deadlines

| Step | Date | Time |
|---|---|---|
| CAE-6DF Call for Technologies issued. | 2024/05/15 | 17:00 UTC |
| CAE-6DF Call for Technologies presented online. | 2024/05/28 | 16:00 UTC |
| Notification of intention to submit proposal (Annex A). | 2024/06/10 | 23:59 UTC |
| Response submission deadline. | 2024/09/16 | 23:59 UTC |
| Start of response evaluation. | 2024/09/30 (MPAI-48) | |

Evaluations will be carried out during 2-hour online sessions according to the calendar agreed at MPAI-47.

6          References

  1. MPAI Statutes
  2. MPAI Patent Policy
  3. MPAI Technical Specifications
  4. MPAI; Use Cases and Functional Requirements: Context-based Audio Enhancement (MPAI-CAE) – Six Degrees of Freedom Audio (CAE-6DF); N1764
  5. MPAI; Framework Licence: Context-based Audio Enhancement (MPAI-CAE) – Six Degrees of Freedom Audio (CAE-6DF); N1765
  6. MPAI; Template for Responses: Context-based Audio Enhancement (MPAI-CAE) – Six Degrees of Freedom Audio (CAE-6DF); N1766

Annex A: Information Form

This information form is to be filled in by a Respondent to this CAE-6DF Call for Technologies.

  1. Title of the proposal.
  2. Organisation: company name, position, e-mail of contact person.
  3. What are the main functionalities of your proposal?
  4. Does your proposal provide or describe a formal specification and APIs?
  5. Will you provide a demonstration to show how your proposal meets the evaluation criteria?

Annex B: Evaluation Sheet

NB: This evaluation sheet will be filled out by Evaluation Panel members.

Proposal title:

Main functionalities: 

Response summary: (a few lines)

Comments on relevance to the Call for Technologies (Requirements):

Comments on possible CAE-6DF profiles[1]:

Evaluation table:

Table 3 – Assessment of submission features

Note 1 Table 4 gives the semantics of submission features.
Note 2 Evaluation Elements indicate the elements used by the evaluator in assessing the submission.
Note 3 Final Assessment indicates the ultimate assessment based on the Evaluation Elements.

 

| Submission features | Evaluation Elements | Final Assessment |
|---|---|---|
| Completeness of description | | |
| Understandability | | |
| Extensibility | | |
| Use of standard technology | | |
| Efficiency | | |
| Test cases | | |
| Maturity of reference implementation | | |
| Relative complexity | | |
| Support of non-CAE-6DF use cases | | |

Content of the criteria table cells:

Evaluation facts should mention:

  • Not supported / partially supported / fully supported.
  • What supports these facts: submission/presentation/demo.
  • The summary of the facts themselves, e.g., very good in one way, but weak in another.

Final assessment should mention:

  • Possibilities to improve or add to the proposal, e.g., any missing or weak features.
  • How sure the evaluators are, i.e., evidence shown, very likely, very hard to tell, etc.
  • Global evaluation (Not Applicable / -- / - / + / ++)

New Use Cases/Requirements Identified:

(Please describe)

Evaluation summary:

  • Main strong points, qualitatively:
  • Main weak points, qualitatively:
  • Overall evaluation: (0/1/2/3/4/5)

0: could not be evaluated.

1: proposal is not relevant.

2: proposal is relevant but requires significantly more work.

3: proposal is relevant, but requires a few changes.

4: proposal has some very good points, so it is a good candidate for standard.

5: proposal is superior in its category, very strongly recommended for inclusion in standard.

Additional remarks: (points of importance not covered above.)

The submission features in Table 3 are explained in the following Table 4.

Table 4 – Explanation of submission features

| Submission features | Criteria |
|---|---|
| Completeness of description | Evaluators should: 1. Compare the list of requirements (Annex C) with the submission. 2. Check if respondents have described in sufficient detail how the requirements are supported by the proposal. Note: Submissions will be judged for the merit of what is proposed. A submission on a single technology that is excellent may be considered instead of a submission that is complete but has a less performing technology. |
| Understandability | Evaluators should identify items that are demonstrably unclear (inconsistencies, sentences with dubious meaning, etc.). |
| Extensibility | Evaluators should check if the respondent has proposed extensions to the Use Cases. Note: Extensibility is the capability of the proposed solution to support use cases that are not supported by current requirements. |
| Use of standard technology | Evaluators should check if new technologies are proposed where widely adopted technologies exist. If this is the case, the merit of the new technology shall be proved. |
| Efficiency | Evaluators should assess power consumption, computational speed, and computational complexity. |
| Test cases | Evaluators should report whether a proposal contains suggestions for testing the technologies proposed. |
| Maturity of reference implementation | Evaluators should assess the maturity of the proposal. Note 1: Maturity is measured by the completeness, i.e., having all the necessary information and appropriate parts of the HW/SW implementation of the submission disclosed. Note 2: If there are parts of the implementation that are not disclosed but demonstrated, they will be considered if and only if such components can be replicated. |
| Relative complexity | Evaluators should identify issues that would make it difficult to implement the proposal compared to the state of the art. |
| Support of non-CAE-6DF use cases | Evaluators should check whether the technologies proposed can demonstrably be used in other significantly different use cases. |

Annex C: Check list of data formats proposed by a respondent

Table 5 – Table of response areas

| CAE-6DF use cases | Response |
|---|---|
| Use Case 1 – Immersive Concert Experience (Music plus Video) | Y/N |
| Use Case 2 – Immersive Radio Drama (Speech plus Foley/Effects) | Y/N |
| Use Case 3 – Virtual lecture (Audio plus Video) | Y/N |
| Use Case 4 – Immersive Opera/Ballet/Dance/Theatre experience (Music, Drama plus 360° Video/6DoF Visual) | Y/N |

| CAE-6DF Functional Requirements | Response |
|---|---|
| 1. The Functional Requirements apply to the Audio experience and to the impact of visual conditions on the Audio experience. For instance: | Y/N |
| a. Audio-Visual Contract, i.e., alignment of audio scenes with visual scenes. | Y/N |
| b. Effects of locomotion on human audio-visual perception. | Y/N |
| c. Orientation response, i.e., turning toward a sound source of interest. | Y/N |
| d. Distance perception such that visual and auditory modalities affect each other. | Y/N |
| 2. One or more of the following three content profiles should be addressed: | Y/N |
| a. Scene-based, i.e., the captured audio scene, for example Ambisonics, is accurately reconstructed so that the Audio Scene provides a high degree of correspondence to the acoustic ambient characteristics of the captured audio scene. | Y/N |
| b. Object-based, i.e., the Audio Scene comprises Audio Objects and associated metadata to allow synthesising a perceptually veridical, but not necessarily physically accurate, representation of the captured audio scene. | Y/N |
| c. Mixed, i.e., a combination of scene-based and object-based profiles where Audio Objects can be overlaid or mixed with scene-based content. | Y/N |
| 3. One or both of the following rendering modalities should be addressed: | Y/N |
| a. Loudspeaker-based, i.e., the content is rendered through at least two loudspeakers. | Y/N |
| b. Headphone-based, i.e., the content is rendered through headphones. | Y/N |
| 4. If the audio content is rendered through loudspeakers, the rendering space should have the following characteristics: | Y/N |
| a. Shape and dimensions: | Y/N |
| i. Not larger than the captured space. | Y/N |
| b. Acoustic ambient characteristics: | Y/N |
| i. Early decay time (EDT) lower than the captured space. | Y/N |
| ii. Frequency mode density lower than the captured space. | Y/N |
| iii. Echo density lower than the captured space. | Y/N |
| iv. Reverberation time (T60) lower than the captured space. | Y/N |
| v. Energy decay curve characteristics same or lower than the captured space. | Y/N |
| vi. Background noise less than 50 dB(A) SPL. | Y/N |
| 5. If the audio content is rendered through headphones that can successfully block the audibility of the ambient acoustical characteristics of the rendering space, the rendering space should have the following characteristics: | Y/N |
| a. Shape and dimensions: | Y/N |
| i. Not larger than the captured space. | Y/N |
| b. Acoustic ambient characteristics: No constraints on the ambient characteristics defined in point 4.b. | Y/N |
| 6. The User movement in the rendering space may be the result of actual or virtual locomotion or orientation. | Y/N |
| a. Actual locomotion/orientation of the User as tracked by sensors. | Y/N |
| b. Virtual locomotion/orientation is actuated by controlling devices. | Y/N |
| 7. The maximum responsive latency of the audio system to user movement should be 20 ms or less; however, some applications may have higher latency. | Y/N |

Respondents should in any case review the equivalent list in the table of contents of [4].
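As an illustration only, the rendering-space conditions in items 4 and 5 of the check list can be read as simple comparisons between the rendering space and the captured space. The Python sketch below is not part of the Call: the `SpaceAcoustics` class, its field names and units, the use of room volume as a stand-in for "shape and dimensions", and the sample values are all assumptions made for this example. Item 4.b.v (energy decay curve characteristics) is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpaceAcoustics:
    """Hypothetical measurements of a captured or rendering space."""
    volume_m3: float      # assumed proxy for "shape and dimensions"
    edt_s: float          # early decay time (EDT), seconds
    mode_density: float   # frequency mode density, modes per Hz
    echo_density: float   # echo density, echoes per second
    t60_s: float          # reverberation time (T60), seconds
    noise_dba: float      # background noise, dB(A) SPL


MAX_NOISE_DBA = 50.0  # item 4.b.vi: background noise less than 50 dB(A) SPL


def loudspeaker_violations(rendering: SpaceAcoustics,
                           captured: SpaceAcoustics) -> list[str]:
    """Return labels of the item 4 conditions the rendering space fails."""
    checks = {
        "4.a.i size": rendering.volume_m3 <= captured.volume_m3,
        "4.b.i EDT": rendering.edt_s < captured.edt_s,
        "4.b.ii mode density": rendering.mode_density < captured.mode_density,
        "4.b.iii echo density": rendering.echo_density < captured.echo_density,
        "4.b.iv T60": rendering.t60_s < captured.t60_s,
        "4.b.vi background noise": rendering.noise_dba < MAX_NOISE_DBA,
    }
    return [label for label, ok in checks.items() if not ok]


def headphone_violations(rendering: SpaceAcoustics,
                         captured: SpaceAcoustics) -> list[str]:
    """Item 5: with isolating headphones only the size condition remains."""
    if rendering.volume_m3 <= captured.volume_m3:
        return []
    return ["5.a.i size"]


# Made-up sample values, for illustration only.
concert_hall = SpaceAcoustics(12000.0, 1.9, 4.0, 3000.0, 2.1, 28.0)
living_room = SpaceAcoustics(60.0, 0.35, 0.5, 800.0, 0.4, 55.0)

print(loudspeaker_violations(living_room, concert_hall))
# ['4.b.vi background noise']
print(headphone_violations(living_room, concert_hall))
# []
```

In this made-up example the living room fails only the background-noise condition for loudspeaker rendering, and passes for headphone rendering because item 5 places no constraint on the ambient characteristics.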

Annex D: Technologies that may require specific testing

Table 6 will be compiled based on the responses received.

Table 6 – Functional Requirements that may require specific testing

Section Technology Nature of Test

Annex E: Mandatory text in a submission

A response to this Call for Technologies: Context-based Audio Enhancement (MPAI-CAE) – Six Degrees of Freedom Audio (CAE-6DF) shall mandatorily include the following text:

<Company/Member> submits this technical document in response to CAE-6DF (N1763).

<Company/Member> explicitly agrees to the steps of the MPAI standards development process defined in Annex 1 to the MPAI Statutes (N421); in particular, <Company/Member> declares that <Company/Member> or its successors will make available the terms of the Licence related to its Essential Patents according to the Framework Licence: Context-based Audio Enhancement (MPAI-CAE) – Six Degrees of Freedom Audio (CAE-6DF) (N1765), alone or jointly with other IPR holders, after the approval of the planned CAE-6DF by the General Assembly and in no event after commercial implementations of CAE-6DF become available on the market.

In case the respondent is a non-MPAI member, the submission shall mandatorily include the following text:

If (a part of) this submission is identified for inclusion in a specification, <Company> understands that <Company> will be requested to immediately join MPAI and that, if <Company> elects not to join MPAI, this submission will be discarded.

Subsequent technical contributions shall mandatorily include the following text:

<Member> submits this document to MPAI as a contribution to the development of the planned CAE-6DF.

<Member> explicitly agrees to the steps of the MPAI standards development process defined in Annex 1 to the MPAI Statutes (N421); in particular, <Member> declares that <Member> or its successors will make available the terms of the Licence related to its Essential Patents according to the CAE-6DF Framework Licence (N1765), alone or jointly with other IPR holders, after the approval of the CAE-6DF Technical Specification by the General Assembly and in no event after CAE-6DF commercial implementations become available on the market.

 

[1] A profile of a standard is a particular subset of the technologies used in the standard and, where applicable, the classes, subsets, options and parameters relevant for the subset.