Abstract
This document, issued by Moving Picture, Audio, and Data Coding by Artificial Intelligence (MPAI), collects Use Cases and Functional Requirements relevant to Technical Specification: Up-sampling Filter for Video applications (MPAI-UFV) V1.0.
MPAI is an international non-profit organisation whose mission is to develop standards for Artificial Intelligence (AI)-enabled data coding and for technologies that facilitate the integration of data coding components into Information and Communication Technology (ICT) systems [1]. The MPAI Patent Policy [2] guides the accomplishment of the mission.
1 Introduction
Established in September 2020, MPAI has developed eleven Technical Specifications relevant to its mission, covering areas such as the execution environment of multi-component AI applications, portable avatar format, object and scene description, neural network watermarking, context-based audio enhancement, multimodal human-machine conversation and communication, company performance prediction, metaverse, and governance of the MPAI ecosystem. Five Technical Specifications have been adopted by IEEE without modification and more are in the pipeline. Several other standard projects – such as AI for Health, online gaming and XR Venues – are under way and are expected to deliver specifications in the next few months.
MPAI specifications are the result of a process whose main steps are:
- Development of functional requirements in an open environment.
- Adoption of “commercial requirements” (Framework Licence) by MPAI principal members, setting the main elements of the future licence to be issued by standard essential patent holders.
- Publication of a Call for Technologies that refers to the two sets of requirements and invites contributions from parties who agree to license their technologies according to the Framework Licence if those technologies are accepted as part of the target Technical Specification.
This document contains the Use Cases and Functional Requirements for the planned Technical Specification: Up-sampling Filter for Video applications (MPAI-UFV) V1.0 – called UFV in the following – developed in the context of the work of the MPAI AI-Enhanced Video Coding (EVC) group.
2 Functional Requirements
This document collects the Use Cases and Functional Requirements for an up-sampling filter approach, in which the image is downscaled prior to encoding and an up-sampling filter is then applied to the decoded image to restore the original resolution (Fig. 1).
Figure 1 – Reference model of a video coding system with an up-sampling filter
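The processing chain of the reference model can be illustrated with a minimal Python sketch. It is not part of the MPAI work: all function names are hypothetical, the codec is replaced by a coarse quantiser, and the up-sampling filter is plain sample repetition, chosen only to keep the example self-contained.

```python
# Illustrative sketch of the chain: downscale -> encode/decode -> up-sample.
# Every name below is hypothetical; nothing here is defined by MPAI-UFV.

def downscale_2x(frame):
    """Halve both dimensions by averaging each 2x2 block of samples."""
    h, w = len(frame), len(frame[0])
    return [
        [
            (frame[y][x] + frame[y][x + 1]
             + frame[y + 1][x] + frame[y + 1][x + 1] + 2) // 4
            for x in range(0, w - 1, 2)
        ]
        for y in range(0, h - 1, 2)
    ]

def fake_codec(frame, step=8):
    """Stand-in for encoding and decoding: quantise samples to multiples of `step`."""
    return [[min(255, (v + step // 2) // step * step) for v in row] for row in frame]

def upsample_2x_nearest(frame):
    """Double both dimensions by repeating each sample (simplest possible filter)."""
    out = []
    for row in frame:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def reference_model(frame, upsampler=upsample_2x_nearest):
    """Run the full chain; `upsampler` is the component a UFV filter would replace."""
    return upsampler(fake_codec(downscale_2x(frame)))

# Synthetic 8x8 luminance frame used only for demonstration.
original = [[(16 * x + 8 * y) % 256 for x in range(8)] for y in range(8)]
restored = reference_model(original)
print(len(restored), len(restored[0]))  # same dimensions as the original frame
```

Passing the up-sampler as a parameter mirrors the idea that only the final filter stage changes while the rest of the chain stays fixed.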
MPAI has carried out the following experiments on super resolution (SR)-based up-sampling, using EVC- and VVC-encoded sequences:
- SR networks were trained with EVC- and VVC-encoded sequences.
- SR-based up-sampling of luminance-only sequences was tested for:
  - SD to HD
  - HD to 4K
- The following configurations were used:
  - Training using EVC-encoded and testing using EVC-encoded sequences
  - Training using VVC-encoded and testing using VVC-encoded sequences
  - Training using EVC-encoded and testing using VVC-encoded sequences
  - Training using VVC-encoded and testing using EVC-encoded sequences
The results were:
- The PSNR of decoded and SR-up-sampled sequences vs the original sequences improved by ~20% compared to bicubic up-sampling.
- Neural networks trained on sequences encoded with the same codec as the test sequences improved performance by ~1%.
- Results were confirmed using colour sequences (objective) and subjective tests (expert viewing).
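For readers unfamiliar with the metric, the sketch below shows the standard PSNR definition and one way of enumerating the four training/testing configurations. The source does not state how the ~20% figure was computed, so the `relative_gain_percent` helper is only an assumed reading (relative difference of two PSNR values in dB), and all names are hypothetical.

```python
import math

def psnr(reference, test, peak=255):
    """PSNR in dB between two equally sized frames (lists of rows of samples)."""
    count = 0
    squared_error = 0
    for ref_row, test_row in zip(reference, test):
        for r, t in zip(ref_row, test_row):
            squared_error += (r - t) ** 2
            count += 1
    if squared_error == 0:
        return math.inf  # identical frames
    mse = squared_error / count
    return 10 * math.log10(peak ** 2 / mse)

def relative_gain_percent(psnr_candidate, psnr_baseline):
    """Assumed reading of 'improved by N%': relative PSNR difference in percent."""
    return 100 * (psnr_candidate - psnr_baseline) / psnr_baseline

# The four training/testing codec pairings listed above.
configurations = [(train, test) for train in ("EVC", "VVC") for test in ("EVC", "VVC")]

# Toy frames, for demonstration only (not experimental data).
original = [[100, 110], [120, 130]]
bicubic_like = [[96, 114], [116, 134]]   # hypothetical baseline reconstruction
sr_like = [[99, 111], [119, 131]]        # hypothetical SR reconstruction

baseline_db = psnr(original, bicubic_like)
candidate_db = psnr(original, sr_like)
print(f"baseline {baseline_db:.2f} dB, candidate {candidate_db:.2f} dB")
print(f"relative gain {relative_gain_percent(candidate_db, baseline_db):.1f}%")
```

In an actual evaluation each reconstruction would come from decoding and up-sampling a test sequence under one of the listed configurations, with PSNR measured against the original sequence.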
An assessment of the results obtained in the exploration phase has led MPAI to conclude that a high-performance standard up-sampling filter would benefit many applications as described in the next chapter.
3 Use Cases
Use Case 1 – Enhancing visuals in real-time: Up-sampling filter on eSports streaming
A user is watching a streamed eSports tournament. The up-sampling filter tool analyses the incoming video frames and enhances them, resulting in sharper images, reduced pixelation, and improved overall visual quality while saving the user bandwidth. For instance, even when streaming at lower resolutions (e.g., 720p), the up-sampling filter can make the visuals appear closer to native 1080p or even 4K quality.
Use Case 2 – Expanding channel capacity with up-sampling filter
A regional TV broadcaster wants to expand its channel lineup to attract a broader audience. However, the available transmission bandwidth is limited, and adding more physical channels is not feasible due to regulatory constraints. The broadcaster decides to downscale the content before broadcasting it, effectively doubling the amount of content broadcast without using additional physical channels. An up-sampling filter at the end user's side ensures that even lower-resolution channels maintain acceptable quality. For instance, a viewer with a 4K TV sees a UHD version of the channel even though it was transmitted in HD resolution, allowing broadcasters to optimise channel capacity, improve quality and adapt to different viewer needs.
Use Case 3 – UAV Skylens
In the metropolis of Turin, security and surveillance are paramount. A network of unmanned aerial vehicles (UAVs) patrols the city’s skies, ensuring the safety of citizens, monitoring traffic and safeguarding critical infrastructure. However, the live video downlinks from these drones suffer from low resolution, which hinders effective threat detection and response.
The city of Turin decides to use SR to improve the resolution of live UAV video downlinks while maintaining low latency. By leveraging advanced algorithms and neural networks, SR enhances the visual fidelity of surveillance footage, empowering security personnel with clearer, more detailed imagery.
4 References
[1] MPAI Statutes
[2] MPAI Patent Policy
[3] MPAI-EVC; https://mpai.community/standards/mpai-evc/
[4] MPAI; Call for Technologies: Up-sampling Filter for Video Applications (MPAI-UFV); N2077
[5] MPAI; Framework Licence: Use Cases and Functional Requirements: Up-sampling Filter for Video Applications (MPAI-UFV); N2079
[6] MPAI; Template for Responses: Up-sampling Filter for Video Applications (MPAI-UFV); N2080