June 20, 2023
ABSTRACT: This study proposes a deep-learning-based personal recognition algorithm for barefoot footprint images. The ResNet 50 network architecture, known for its effectiveness in image recognition, serves as the foundation. Barefoot footprint images are collected and their features extracted. Horizontal Pyramid Matching (HPM) is then used to separate and recombine the features at multiple scales, and personal metric learning with a Separate Triplet Loss is performed on the complete barefoot footprint features. The relevant formulas are provided as required.
Results: The algorithm was trained on barefoot footprints from 6,433 individuals and tested on an open barefoot dataset comprising 11,028 people. It achieved a rank-1 accuracy of 96.2% and outperformed ResNet 50 combined with Cross-Entropy Loss and with ArcFace Loss on both the Cumulative Matching Characteristic (CMC) and mean Average Precision (mAP) indicators.
Conclusions: The proposed personal recognition algorithm based on barefoot footprint images demonstrated excellent recognition performance, achieving 96.2% accuracy on a test set of more than ten thousand individuals.
Traditional identification of barefoot footprints relies on various features proposed in the field of footprint analysis, such as footprint structure and dynamic morphology. Feature analysis and fusion, combined with classifiers and other methods, are used to perform feature comparison and recognition.
In recent years, deep learning has emerged as a powerful tool in biometric recognition, including fingerprint and face recognition. Deep neural networks have demonstrated superior performance in image recognition tasks, surpassing traditional methods that rely on manually crafted features.
To address the challenges in barefoot footprint identification, a large-scale barefoot footprint database is essential. In this study, a professional single-footprint acquisition instrument was used to construct a comprehensive database of left and right barefoot footprints. A total of 213,254 footprints were collected from 18,380 individuals, with 3 to 54 footprints per person. The collected footprint images are shown in Figure 1.
The collected barefoot footprints underwent the following preprocessing steps:
1. Normalization of footprint direction based on the toe-up and heel-down orientation.
2. Clipping the image to a size of 399×886 pixels around the footprint center, followed by normalization to a unified size of 300×660 pixels without scaling.
3. Normalization of image gray scale levels.
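The cropping and gray-level steps above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the image is taken to be a row-major list of pixel rows, 399×886 is read as width × height, the footprint center is assumed to be known already, and gray-level normalization is assumed to be a min-max stretch to [0, 1]. All helper names are hypothetical, and the step to 300×660 pixels is left out.

```python
CROP_W, CROP_H = 399, 886  # crop size from the text, read as width x height (assumption)


def crop_box(center_x, center_y, img_w, img_h, crop_w=CROP_W, crop_h=CROP_H):
    """Return (left, top, right, bottom) of a crop window centred on the footprint.

    The window is shifted, not shrunk, when it would cross the image border.
    """
    if crop_w > img_w or crop_h > img_h:
        raise ValueError("image is smaller than the crop window")
    left = min(max(center_x - crop_w // 2, 0), img_w - crop_w)
    top = min(max(center_y - crop_h // 2, 0), img_h - crop_h)
    return left, top, left + crop_w, top + crop_h


def crop(image, box):
    """Cut the window out of a row-major image (list of pixel rows)."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def normalize_gray(image):
    """Min-max stretch of gray levels to [0, 1] (one possible normalization)."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # constant image: avoid division by zero
        return [[0.0 for _ in row] for row in image]
    return [[(p - lo) / (hi - lo) for p in row] for row in image]
```

Shifting the window at the border keeps every crop at the same size, which a fixed-input network needs; padding would be an equally plausible choice.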
Data set partitioning:
To facilitate network training and research, the data set was randomly divided into training, validation, and test sets. The partitioning details are presented in Table 1.
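A random partition of this kind can be sketched as below, assuming the split is made by person so that no individual appears in more than one set. The training and test counts are the ones quoted in the abstract; the validation count is simply the remainder of the 18,380 individuals and is an inference, not a figure confirmed by Table 1. The function name and the seed are hypothetical.

```python
import random

N_TOTAL, N_TRAIN, N_TEST = 18_380, 6_433, 11_028  # counts quoted in the text
N_VAL = N_TOTAL - N_TRAIN - N_TEST  # inferred remainder, not confirmed by Table 1


def split_identities(person_ids, n_train, n_val, seed=0):
    """Shuffle identities once and cut them into disjoint train/val/test lists."""
    ids = list(person_ids)
    random.Random(seed).shuffle(ids)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]
    return train, val, test


train_ids, val_ids, test_ids = split_identities(range(N_TOTAL), N_TRAIN, N_VAL)
```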
Overall network framework:
The network's overall framework is illustrated in Figure 2.
The network architecture uses ResNet 50 as its base model. During the training stage, the base network extracts features from the barefoot image. The features are then separated and recombined by the Horizontal Pyramid Matching (HPM) method, and metric learning is performed on the resulting features. During the testing stage, the barefoot image undergoes feature extraction through the base network, followed by multi-scale feature extraction and recombination using HPM. The resulting features are concatenated into a 4608-dimensional feature vector, which is used for feature retrieval. The layer structure of the network and the sizes of the feature maps are shown in Table 2.
To enhance the network's generalization capability and increase sample diversity, the following data augmentation techniques are applied to the input images:
(1) Random vertical flipping.
(2) Random horizontal flipping.
(3) Random rotation within a range of 0 to 10 degrees.
(4) Random grayscale scaling by a factor between 0.8 and 1.1.
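A minimal sketch of the flips and the grayscale scaling listed above is given here. It assumes a row-major image with pixel values in [0, 1] and a flip probability of 0.5, neither of which is stated in the text. Rotation is left out because it needs an interpolation routine. The function name is hypothetical.

```python
import random


def augment(image, rng, flip_prob=0.5, gray_range=(0.8, 1.1)):
    """Apply random flips and a random gray-level scaling to a row-major image.

    Pixel values are assumed to lie in [0, 1]; results are clipped to that range.
    """
    out = [list(row) for row in image]
    if rng.random() < flip_prob:          # vertical flip: reverse the row order
        out = out[::-1]
    if rng.random() < flip_prob:          # horizontal flip: reverse each row
        out = [row[::-1] for row in out]
    factor = rng.uniform(*gray_range)     # gray-level scaling factor
    return [[min(max(p * factor, 0.0), 1.0) for p in row] for row in out]


rng = random.Random(0)
sample = [[0.1, 0.2], [0.3, 0.4]]
augmented = augment(sample, rng)
```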
Horizontal Pyramid Matching (HPM):
Horizontal Pyramid Matching (HPM) is a recognition technique introduced in the field of pedestrian re-identification. Its main idea is to divide the feature map extracted by the base network into horizontal strips of different proportions along the image height, and to transform and learn from each strip. This approach extracts multi-scale features from the image, thereby improving recognition accuracy and network generalization.
HPM demonstrates that segmenting images into different proportions along the height allows both global and local features to be learned at different scales. Compared with relying solely on global features, HPM is more effective at capturing multi-scale features, including local characteristics. Because human barefoot footprints vary distinctly across regions (toes, soles, arches, and heels), each strip feature is independently passed through a fully connected layer that reduces its dimensionality from 2048 to 256.
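The horizontal partitioning can be illustrated with a small pure-Python sketch. The strip counts (1, 2, 3, 4, 8) are a hypothetical choice that happens to total 18 strips; the text does not state the actual partition. Pooling each strip as average plus max follows the original HPM formulation, and whether this study pools the same way is an assumption. A real implementation would operate on PyTorch tensors and follow each strip with its own 2048-to-256 fully connected layer.

```python
SCALES = (1, 2, 3, 4, 8)  # hypothetical strip counts; they total 18 strips


def strip_bounds(height, scales=SCALES):
    """Return (start_row, end_row) for every horizontal strip at every scale.

    The height should be at least max(scales) so that no strip is empty.
    """
    return [(i * height // s, (i + 1) * height // s)
            for s in scales for i in range(s)]


def pool_strip(feature_map, start, end):
    """Pool one strip per channel as average + max over its rows and columns."""
    pooled = []
    for channel in feature_map:                      # channel: H rows of W values
        values = [v for row in channel[start:end] for v in row]
        pooled.append(sum(values) / len(values) + max(values))
    return pooled


def horizontal_pyramid_pool(feature_map, scales=SCALES):
    """Turn a C x H x W feature map into one C-dimensional vector per strip."""
    height = len(feature_map[0])
    return [pool_strip(feature_map, a, b) for a, b in strip_bounds(height, scales)]


# Toy feature map: 4 channels, 24 rows, 2 columns (ResNet 50 would give 2048 channels).
toy = [[[float(c + h) for _ in range(2)] for h in range(24)] for c in range(4)]
strips = horizontal_pyramid_pool(toy)
print(len(strips), len(strips[0]))  # 18 strips, 4 channels each
```

With 18 strips reduced to 256 dimensions each, the concatenated descriptor has 18 × 256 = 4608 dimensions, which matches the vector size given for the testing stage.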
During training, each individual's 18 sets of 256-dimensional features are treated as separate personal characteristics, and metric learning is performed on each set against the corresponding features of other individuals. During testing, all 18 sets of 256-dimensional features are concatenated to form the complete personal feature (18 × 256 = 4608 dimensions) used for individual identification.
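One way to read this scheme is sketched below: a triplet loss is computed on each strip independently and the per-strip losses are averaged, while the test-time descriptor is a plain concatenation. The batch-hard mining, the Euclidean distance, and the margin of 0.3 are assumptions chosen for illustration; the exact form of the Separate Triplet Loss is defined by the paper's formulas, which are not reproduced here. All function names are hypothetical.

```python
import math


def batch_hard_triplet(features, labels, margin=0.3):
    """Batch-hard triplet loss for one strip: features[i] is a vector, labels[i] its person."""
    losses = []
    for i, anchor in enumerate(features):
        pos = [math.dist(anchor, f) for j, f in enumerate(features)
               if labels[j] == labels[i] and j != i]
        neg = [math.dist(anchor, f) for j, f in enumerate(features)
               if labels[j] != labels[i]]
        if pos and neg:  # skip anchors without a positive or a negative
            losses.append(max(0.0, max(pos) - min(neg) + margin))
    return sum(losses) / len(losses) if losses else 0.0


def separate_triplet_loss(strip_features, labels, margin=0.3):
    """Average the per-strip losses; strip_features[s][i] is strip s of sample i."""
    per_strip = [batch_hard_triplet(f, labels, margin) for f in strip_features]
    return sum(per_strip) / len(per_strip)


def concat_strips(strip_features, i):
    """Test-time descriptor of sample i: all of its strip vectors joined end to end."""
    return [v for strip in strip_features for v in strip[i]]
```

Treating the strips separately means a strip that already separates individuals well contributes no loss, while a weaker strip keeps receiving gradient on its own.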
Experimental results:
In this experiment, the hardware setup consists of two RTX 2080 Ti graphics cards, and the software framework is PyTorch 1.7 with CUDA 10. The mini-batch size for training is set to (p, k) = (32, 16), meaning that 32 individuals are sampled at each iteration, with 16 footprint images per individual (images are drawn repeatedly if an individual has too few). A total of 150,000 iterations are performed until convergence. Training uses a fixed learning rate of 0.0001.
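The (p, k) batch construction can be sketched as below. Sampling with replacement is one possible reading of "drawn repeatedly" and is an assumption, as are the function name, the toy index, and the file names.

```python
import random

P, K = 32, 16  # identities per batch and images per identity, from the text


def sample_pk_batch(images_by_person, rng, p=P, k=K):
    """Draw p identities, then k images each; sample with replacement when short."""
    batch = []
    for person in rng.sample(sorted(images_by_person), p):
        images = images_by_person[person]
        if len(images) >= k:
            chosen = rng.sample(images, k)       # enough images: no repeats
        else:
            chosen = rng.choices(images, k=k)    # too few images: allow repeats
        batch.extend((person, image) for image in chosen)
    return batch


# Toy index: 40 people with 3 to 54 hypothetical image names each.
rng = random.Random(0)
index = {pid: [f"p{pid}_{n}.png" for n in range(rng.randint(3, 54))]
         for pid in range(40)}
batch = sample_pk_batch(index, rng)
print(len(batch))  # 32 * 16 = 512
```

Each batch therefore holds 512 images, and every anchor has 15 same-person images and 496 other-person images available for metric learning.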
The results of the proposed method are presented in Table 3.
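For reference, the CMC and mAP indicators are commonly computed as in the sketch below. It assumes that, for each query, the gallery labels have already been sorted by increasing feature distance; the function names are hypothetical, and the exact evaluation protocol used for Table 3 is not specified here.

```python
def rank_of_first_match(ranked_labels, query_label):
    """1-based rank of the first gallery item with the query's identity, or None."""
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            return rank
    return None


def cmc_at(rankings, query_labels, r=1):
    """Fraction of queries whose first correct match appears within the top r."""
    hits = 0
    for ranked, query in zip(rankings, query_labels):
        first = rank_of_first_match(ranked, query)
        if first is not None and first <= r:
            hits += 1
    return hits / len(query_labels)


def average_precision(ranked_labels, query_label):
    """Mean of the precision values taken at each correct match in the ranking."""
    hits, precisions = 0, []
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(rankings, query_labels):
    """Average of the per-query average precision."""
    aps = [average_precision(r, q) for r, q in zip(rankings, query_labels)]
    return sum(aps) / len(aps)
```

Rank-1 accuracy is `cmc_at(..., r=1)`. CMC only rewards the first correct match, whereas mAP also reflects how highly the remaining footprints of the same person are ranked.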
This paper demonstrates the successful application of deep learning to barefoot footprint identification. By extracting multi-scale features from barefoot footprints, the network's generalization capability is improved, leading to a high recognition rate of 96.2% on a test set of more than 10,000 individuals. Compared with traditional feature recognition methods, using deep networks to automatically extract the core features of human footprints not only removes the need for manual feature extraction but also significantly improves recognition accuracy. Combining multi-scale barefoot footprint features with HPM further improves the network's generalization capability and recognition performance.
In future studies, the algorithm's performance can be improved further in several ways, including optimizing the network structure, incorporating attention mechanisms, refining the loss functions, and applying better re-ranking methods to the search results. In addition, addressing challenges such as incomplete footprints, object changes, variations in walking state, and other factors will be crucial for the practical application of barefoot footprint recognition. Overcoming these challenges will pave the way for its widespread use in real-world scenarios.
JIN Yifeng, WANG Li, LI Daixi, JIANG Xuemei, CHENG Jian, XIE Min, OUYANG Weijia