Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/80911
Full metadata record
DC Field | Value | Language
dc.contributor.author | Zhu, Hongyuan | en
dc.contributor.author | Vial, Romain | en
dc.contributor.author | Lu, Shijian | en
dc.contributor.author | Peng, Xi | en
dc.contributor.author | Fu, Huazhu | en
dc.contributor.author | Tian, Yonghong | en
dc.contributor.author | Cao, Xianbin | en
dc.date.accessioned | 2019-05-09T03:36:11Z | en
dc.date.accessioned | 2019-12-06T14:17:14Z | -
dc.date.available | 2019-05-09T03:36:11Z | en
dc.date.available | 2019-12-06T14:17:14Z | -
dc.date.issued | 2018 | en
dc.identifier.citation | Zhu, H., Vial, R., Lu, S., Peng, X., Fu, H., Tian, Y., & Cao, X. (2018). YoTube : searching action proposal via recurrent and static regression networks. IEEE Transactions on Image Processing, 27(6), 2609-2622. doi:10.1109/TIP.2018.2806279 | en
dc.identifier.issn | 1057-7149 | en
dc.identifier.uri | https://hdl.handle.net/10356/80911 | -
dc.description.abstract | In this paper, we propose YoTube, a novel deep learning framework for generating action proposals in untrimmed videos, where each action proposal corresponds to a spatial-temporal tube that potentially locates one human action. Most existing works generate proposals by clustering low-level features or linking image proposals, which ignores the interplay between long-term temporal context and short-term cues. In contrast, our method models this interplay with a new recurrent YoTube detector and a static YoTube detector. The recurrent YoTube detector sequentially regresses candidate bounding boxes using long-term temporal context learned by a recurrent neural network. The static YoTube detector produces bounding boxes using rich appearance cues in every single frame. To fully exploit the complementary appearance, motion, and temporal context, we train the recurrent and static detectors using RGB (color) and flow information. Moreover, we fuse the corresponding outputs of the detectors to produce accurate and robust proposal boxes, and obtain the final action proposals by linking the proposal boxes using dynamic programming with a novel path trimming method. Thanks to this pipeline, untrimmed videos can be handled effectively and efficiently. Extensive experiments on the challenging UCF-101, UCF-Sports, and JHMDB datasets show that the proposed method outperforms the state of the art. | en
dc.format.extent | 13 p. | en
dc.language.iso | en | en
dc.relation.ispartofseries | IEEE Transactions on Image Processing | en
dc.rights | © 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: https://doi.org/10.1109/TIP.2018.2806279. | en
dc.subject | Object Detection | en
dc.subject | DRNTU::Engineering::Computer science and engineering | en
dc.subject | Image Sequence Analysis | en
dc.title | YoTube : searching action proposal via recurrent and static regression networks | en
dc.type | Journal Article | en
dc.contributor.school | School of Computer Science and Engineering | en
dc.identifier.doi | 10.1109/TIP.2018.2806279 | en
dc.description.version | Accepted version | en
item.fulltext | With Fulltext | -
item.grantfulltext | open | -
Appears in Collections: SCSE Journal Articles
Files in This Item:
File | Description | Size | Format
YoTube Searching Action Proposal via Recurrent and Static Regression Networks.pdf | | 4.14 MB | Adobe PDF

SCOPUS citations: 32 (updated on Sep 2, 2020)
Publons citations: 35 (updated on Feb 24, 2021)
Page view(s): 56 (updated on Feb 28, 2021)
Download(s): 19 (updated on Feb 28, 2021)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.