A unified framework for automated person re-identification

  • Hong Quan Nguyen

    School of Electronics and Telecommunications, Hanoi University of Science and Technology, Hanoi, Vietnam
    Viet-Hung Industry University, Hanoi, Vietnam
  • Thuy Binh Nguyen

    School of Electronics and Telecommunications, Hanoi University of Science and Technology, Hanoi, Vietnam
    Faculty of Electrical-Electronic Engineering, University of Transport and Communications, Hanoi, Vietnam
  • Duc Long Tran

    International Research Institute MICA, Hanoi University of Science and Technology, Hanoi, Vietnam
  • Thi Lan Le

    School of Electronics and Telecommunications, Hanoi University of Science and Technology, Hanoi, Vietnam
    International Research Institute MICA, Hanoi University of Science and Technology, Hanoi, Vietnam
Email: thuybinh_ktdt@utc.edu.vn
Keywords: Person re-identification, human detection, tracking

Abstract

With the rapid development of camera networks, video analysis systems have become increasingly popular and have been deployed in a variety of practical applications. In this paper, we focus on the person re-identification (person ReID) task, a crucial step in video analysis systems. The purpose of person ReID is to associate multiple images of a given person as that person moves through a network of non-overlapping cameras. Much effort has been devoted to person ReID. However, most studies deal only with well-aligned bounding boxes that are manually detected and treated as perfect inputs for person ReID. In practice, when building a fully automated person ReID system, the quality of the two preceding steps, person detection and tracking, may strongly affect person ReID performance. The contribution of this paper is two-fold. First, a unified framework for person ReID based on deep learning models is proposed. This framework couples a deep neural network for person detection with a deep-learning-based tracking method. In addition, features extracted from an improved ResNet architecture are proposed for person representation to achieve higher ReID accuracy. Second, our self-built dataset is introduced and used to evaluate all three steps of the fully automated person ReID framework.
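
As a rough illustration of the final matching step only, the sketch below ranks gallery identities by their distance to a query tracklet. It is a minimal sketch under stated assumptions, not the method of the paper: the detector, tracker and ResNet feature extractor are assumed to have already produced one feature vector per frame, and the average pooling over frames, the cosine distance, and all names (`tracklet_feature`, `rank_gallery`, `person_A`, `person_B`) are hypothetical choices made for the example.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_distance(a: Vector, b: Vector) -> float:
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def tracklet_feature(frame_features: List[Vector]) -> List[float]:
    """Average per-frame features into one tracklet descriptor (assumed pooling)."""
    count = len(frame_features)
    dim = len(frame_features[0])
    return [sum(f[i] for f in frame_features) / count for i in range(dim)]


def rank_gallery(query: Vector, gallery: Dict[str, Vector]) -> List[Tuple[str, float]]:
    """Sort gallery identities from the closest to the farthest from the query."""
    scored = [(pid, cosine_distance(query, feat)) for pid, feat in gallery.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Toy data: two frames of one tracked person and a two-identity gallery.
query_frames = [[1.0, 0.0, 0.2], [0.8, 0.2, 0.0]]
gallery = {"person_A": [0.9, 0.1, 0.1], "person_B": [0.0, 1.0, 0.0]}
ranking = rank_gallery(tracklet_feature(query_frames), gallery)
```

In a fully automated setting, errors from detection and tracking (for example, misaligned boxes or fragmented tracklets) would propagate into `query_frames`, which is why the quality of those two steps affects the ranking.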

