Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/161421
Title: Feature flow: in-network feature flow estimation for video object detection
Authors: Jin, Ruibing; Lin, Guosheng; Wen, Changyun; Wang, Jianliang; Liu, Fayao
Keywords: Engineering::Electrical and electronic engineering; Engineering::Computer science and engineering
Issue Date: 2022
Source: Jin, R., Lin, G., Wen, C., Wang, J. & Liu, F. (2022). Feature flow: in-network feature flow estimation for video object detection. Pattern Recognition, 122, 108323. https://dx.doi.org/10.1016/j.patcog.2021.108323
Journal: Pattern Recognition
Abstract: Optical flow, which expresses pixel displacement, is widely used in many computer vision tasks to provide pixel-level motion information. However, with the remarkable progress of convolutional neural networks, recent state-of-the-art approaches solve problems directly at the feature level. Since feature displacement is not consistent with pixel displacement, a common approach is to feed optical flow to a neural network and fine-tune this network on the task dataset, expecting the fine-tuned network to produce tensors that encode feature-level motion information. In this paper, we rethink this de facto paradigm and analyze its drawbacks for the video object detection task. To mitigate these issues, we propose a novel network (IFF-Net) with an In-network Feature Flow estimation module (IFF module) for video object detection. Without resorting to pre-training on any additional dataset, our IFF module directly produces feature flow, which indicates the feature displacement. The IFF module is a shallow module that shares features with the detection branches. This compact design enables IFF-Net to detect objects accurately while maintaining a fast inference speed. Furthermore, we propose a transformation residual loss (TRL) based on self-supervision, which further improves the performance of IFF-Net. IFF-Net outperforms existing methods and achieves new state-of-the-art performance on ImageNet VID.
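The abstract describes estimating a "feature flow" field that indicates per-position feature displacement, which can then be used to align feature maps across video frames. As a rough illustration of the warping step only (not the paper's actual IFF module or training procedure), the following numpy sketch bilinearly resamples a feature map according to a displacement field; the function name `warp_features` and the (C, H, W) / (2, H, W) layouts are assumptions for this example:

```python
import numpy as np

def warp_features(feat, flow):
    """Warp a feature map by a per-position displacement field ("feature flow").

    feat: (C, H, W) feature map from a reference frame.
    flow: (2, H, W) displacement (dx, dy) at feature-map resolution.
    Returns a (C, H, W) map bilinearly sampled at the displaced positions.
    """
    C, H, W = feat.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(np.float64)
    # Displaced sampling coordinates, clamped to the feature-map border.
    x = np.clip(xs + flow[0], 0, W - 1)
    y = np.clip(ys + flow[1], 0, H - 1)
    # Integer corners surrounding each sampling point.
    x0 = np.floor(x).astype(int); x1 = np.minimum(x0 + 1, W - 1)
    y0 = np.floor(y).astype(int); y1 = np.minimum(y0 + 1, H - 1)
    wx, wy = x - x0, y - y0
    # Bilinear interpolation over the four corners, per channel.
    top = feat[:, y0, x0] * (1 - wx) + feat[:, y0, x1] * wx
    bot = feat[:, y1, x0] * (1 - wx) + feat[:, y1, x1] * wx
    return top * (1 - wy) + bot * wy
```

A zero flow leaves the features unchanged; a uniform integer flow shifts them, which is the sanity check one would run before plugging such a warp into a detection pipeline.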
URI: https://hdl.handle.net/10356/161421
ISSN: 0031-3203
DOI: 10.1016/j.patcog.2021.108323
Schools: School of Electrical and Electronic Engineering; School of Computer Science and Engineering
Rights: © 2021 Elsevier Ltd. All rights reserved.
Fulltext Permission: none
Fulltext Availability: No Fulltext
Appears in Collections: EEE Journal Articles; SCSE Journal Articles
SCOPUS™ Citations: 50 (updated Mar 13, 2025)
Web of Science™ Citations: 50 (updated Oct 31, 2023)
Page view(s): 150 (updated Mar 16, 2025)
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.