Video-Based Person Re-Identification by an End-To-End Learning Architecture with Hybrid Deep Appearance-Temporal Feature

Abstract

Video-based person re-identification is an important task in multi-camera visual sensor networks, where it must cope with lighting variation, low-resolution images, background clutter, occlusion, and similarity in human appearance. In this paper, we propose a video-based person re-identification method: an end-to-end learning architecture with a hybrid deep appearance-temporal feature. It learns the appearance features of pivotal frames, the temporal features, and an independent distance metric for each type of feature. The architecture consists of a two-stream deep feature structure and two Siamese networks. For the first stream, we propose the Two-branch Appearance Feature (TAF) sub-structure to obtain the appearance information of persons, and use one of the two Siamese networks to learn the similarity between the appearance features of a pair of persons. To exploit temporal information, we design the second stream, consisting of the Optical flow Temporal Feature (OTF) sub-structure and the other Siamese network, to learn each person's temporal features and the distances between pairs of features. In addition, we select the pivotal frames of each video as inputs to the Inception-V3 network in the TAF sub-structure, and employ a salience-learning fusion layer to fuse the learned global and local appearance features. Extensive experiments on the PRID2011, iLIDS-VID, and Motion Analysis and Re-identification Set (MARS) datasets show that the proposed architecture reaches Rank-1 accuracies of 79%, 59%, and 72%, respectively, and has advantages over state-of-the-art algorithms. It also improves the representation ability of the learned person features.
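To make the Rank-1 figures quoted in the abstract concrete, the following is a minimal sketch of how Rank-k accuracy can be computed from a query-by-gallery distance matrix, and of how two independently learned distances (appearance and temporal) might be combined. The weighted sum in `fuse_distances`, the `weight` parameter, and all function names are illustrative assumptions; the abstract does not state how the two distances are combined, and this is not the authors' implementation.

```python
from typing import List, Sequence


def fuse_distances(appearance: Sequence[Sequence[float]],
                   temporal: Sequence[Sequence[float]],
                   weight: float = 0.5) -> List[List[float]]:
    """Combine two query-by-gallery distance matrices element-wise.

    ASSUMPTION: a simple weighted sum; the fusion rule and the weight
    are hypothetical and chosen only for illustration.
    """
    return [
        [weight * a + (1.0 - weight) * t for a, t in zip(row_a, row_t)]
        for row_a, row_t in zip(appearance, temporal)
    ]


def rank_k_accuracy(dist: Sequence[Sequence[float]],
                    query_ids: Sequence[str],
                    gallery_ids: Sequence[str],
                    k: int = 1) -> float:
    """Fraction of queries whose true identity is among the k nearest gallery items."""
    hits = 0
    for row, qid in zip(dist, query_ids):
        # Gallery indices sorted by increasing distance to this query.
        order = sorted(range(len(row)), key=row.__getitem__)
        if qid in (gallery_ids[j] for j in order[:k]):
            hits += 1
    return hits / len(query_ids)
```

With Rank-1 (`k=1`), a query counts as correct only when its nearest gallery item has the same identity, so a reported 79% means that 79% of query sequences were matched correctly at the first position.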

Author and article information

Journal

Journal ID (nlm-ta): Sensors (Basel)

Journal ID (iso-abbrev): Sensors (Basel)

Journal ID (publisher-id): sensors

Title: Sensors (Basel, Switzerland)

Publisher: MDPI

ISSN (Electronic): 1424-8220

Publication date (Electronic): 29 October 2018

Publication date Collection: November 2018

Volume: 18

Issue: 11

Electronic Location Identifier: 3669

Affiliations

School of Computer Science and Information Engineering, Hefei University of Technology, Feicui Road 420, Hefei 230000, China; sunrui@hfut.edu.cn (R.S.); 18225514947@163.com (M.X.); zhangjun@hfut.edu.cn (J.Z.)

Author notes

* Correspondence: jchqh123@163.com; Tel.: +86-151-566-99439

Author information

Rui Sun https://orcid.org/0000-0002-1547-161X

Article

Publisher ID: sensors-18-03669

DOI: 10.3390/s18113669

PMC ID: 6263398

PubMed ID: 30380623

SO-VID: 3aad4056-1a81-487e-a82c-9355dbcc25f0

License:

Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
