Temporal Pyramid Structure for Video Frame Interpolation


Publisher

Université d'Ottawa | University of Ottawa

Creative Commons License

Attribution 4.0 International

Abstract

The most prevalent approach to video frame interpolation uses optical flow to guide frame warping and typically considers only the two adjacent frames. Such methods often fail to capture long-range temporal dependencies, which leads to significant deformation in complex motion scenarios. We propose a novel Temporal Pyramid Attention (TPA) block, which employs a temporal pyramid structure to connect four frames within a sliding window for the generation of intermediate frames. The temporal pyramid consists of three layers that leverage multi-level features, estimate the frame window, and connect to a gated recurrent unit (GRU) to generate a bi-directional feature flow. Furthermore, the dual pyramid structure incorporates channel attention mechanisms, enabling the interpolation of three frames in a single pass. The TPA block takes a multi-scale approach to capture temporal dependencies and spatial correlations, enhancing the quality of the interpolated frames. Our model achieves state-of-the-art performance on the Vimeo90K septuplet dataset compared with existing methods using pre-trained parameters.
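
The frame bookkeeping described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which frames of a septuplet serve as inputs, so the split below (positions 1, 3, 5, 7 as the four inputs and positions 2, 4, 6 as the three frames to interpolate) is an assumption, and the helper names `split_septuplet` and `sliding_windows` are hypothetical.

```python
from typing import List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")  # any frame representation (array, tensor, file path, ...)


def split_septuplet(frames: Sequence[Frame]) -> Tuple[List[Frame], List[Frame]]:
    """Split a 7-frame clip into four inputs and three targets.

    ASSUMPTION: inputs are positions 1, 3, 5, 7 and targets are positions
    2, 4, 6 (1-based). The abstract does not state this mapping.
    """
    if len(frames) != 7:
        raise ValueError("expected exactly 7 frames")
    inputs = list(frames[0::2])   # 0-based indices 0, 2, 4, 6
    targets = list(frames[1::2])  # 0-based indices 1, 3, 5
    return inputs, targets


def sliding_windows(frames: Sequence[Frame], size: int = 4,
                    stride: int = 1) -> List[List[Frame]]:
    """Return every window of `size` consecutive frames, moved by `stride`."""
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    last_start = len(frames) - size
    return [list(frames[i:i + size]) for i in range(0, last_start + 1, stride)]


if __name__ == "__main__":
    clip = ["f1", "f2", "f3", "f4", "f5", "f6", "f7"]
    inputs, targets = split_septuplet(clip)
    print(inputs)   # four input frames
    print(targets)  # three frames to interpolate
    print(sliding_windows(["a", "b", "c", "d", "e", "f"]))
```

Under this assumed split, each window of four input frames brackets three gaps, which matches the count of three interpolated frames mentioned in the abstract. The actual windowing used by the TPA block may differ.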

Keywords

Deep learning, Video frame interpolation, Gated recurrent unit, Knowledge distillation, Temporal feature extraction
