Publications

You can also find my articles on my Google Scholar profile.

Visual Place Recognition: A Survey from Deep Learning Perspective

Published in Pattern Recognition, 2020

Abstract: Visual place recognition has attracted widespread research interest in multiple fields such as computer vision and robotics. Recently, researchers have employed advanced deep learning techniques to tackle this problem. While an increasing number of studies have proposed novel place recognition methods based on deep learning, few of them have provided a complete picture of how and to what extent deep learning has been utilized for this issue. In this paper, by delving into over 200 references, we present a comprehensive survey that covers various aspects of place recognition from a deep learning perspective. We first present a brief introduction to deep learning and discuss its opportunities for recognizing places. After that, we focus on existing approaches built upon convolutional neural networks, including off-the-shelf and specifically designed models as well as novel image representations. We also discuss challenging problems in place recognition and present an extensive review of the corresponding datasets. To explore future directions, we describe open issues and some new tools for this research topic, for instance, generative adversarial networks, semantic scene understanding and multi-modality feature learning. Finally, a conclusion is drawn for this paper.

Recommended citation: Xiwu Zhang, Lei Wang and Yan Su. Visual Place Recognition: A Survey from Deep Learning Perspective. Pattern Recognition. November 2020. https://doi.org/10.1016/j.patcog.2020.107760

Graph-Based Place Recognition in Image Sequences with CNN Features

Published in Journal of Intelligent & Robotic Systems, 2019

Abstract: Visual place recognition is a critical and challenging problem in both the robotics and computer vision communities. In this paper, we focus on place recognition for visual Simultaneous Localization and Mapping (vSLAM) systems. These systems have long been limited to paradigms based on handcrafted features, which normally use local visual information of images and are not sufficiently robust against variations applied to images. In this work, we address place recognition with features automatically learned from data. First, we propose a graph-based visual place recognition method. The graph is constructed by combining the visual features extracted from convolutional neural networks (CNNs) and the temporal information of the images in a sequence. Second, we propose to employ a diffusion process to enhance the data association in the graph to achieve more accurate recognition results. Finally, to evaluate the proposed method, we experiment on four commonly used datasets. Experimental results indicate that the proposed method is able to obtain significantly better performance (e.g. 95.37% recall at 100% precision) than FAB-MAP (47.16% recall at 100% precision), a commonly used method for place recognition based on handcrafted features, especially on some challenging datasets.

Recommended citation: Zhang, X., Wang, L., Zhao, Y. et al. Graph-Based Place Recognition in Image Sequences with CNN Features. Journal of Intelligent & Robotic Systems 95, 389–403 (2019).

Loop closure detection for visual SLAM systems using convolutional neural network

Published in 2017 23rd International Conference on Automation and Computing (ICAC), 2017

Abstract: This paper is concerned with the loop closure detection problem, which is one of the most critical parts of visual Simultaneous Localization and Mapping (SLAM) systems. Most state-of-the-art methods use hand-crafted features and bag-of-visual-words (BoVW) to tackle this problem. Recent developments in deep learning indicate that CNN features significantly outperform hand-crafted features for image representation. This advanced technology has not been fully exploited in robotics, especially in visual SLAM systems. We propose a loop closure detection method based on convolutional neural networks (CNNs). Images are fed into a pre-trained CNN model to extract features. Instead of using the CNN features directly, as most previous approaches did, we pre-process them before they are used to detect loops. The workflow of extracting CNN features, processing data, computing similarity scores and detecting loops is presented. Finally, the performance of the proposed method is evaluated on several open datasets by comparing it with FAB-MAP using precision-recall metrics.

Recommended citation: X. Zhang, Y. Su and X. Zhu, "Loop closure detection for visual SLAM systems using convolutional neural network," 2017 23rd International Conference on Automation and Computing (ICAC), Huddersfield, 2017, pp. 1-6.