DOI: 10.14489/vkit.2021.04.pp.011-020

Vizilter Yu. V., Gorbatsevich V. S., Moiseenko A. S.
SINGLE-PASS ALGORITHM FOR FACE DETECTION AND RECOGNITION BASED ON CONVOLUTIONAL NEURAL NETWORKS
(pp. 11-20)

Abstract. An architecture and a training method are proposed for a deep convolutional neural network (DCNN) that detects and recognizes faces simultaneously. The proposed approach combines ideas from the SSD (Single Shot Detector) and Faster R-CNN (Region-based Convolutional Neural Network) algorithms. Faces are detected as in single-pass detection algorithms, after which a biometric template is built by a separate branch of the network using region-of-interest pooling layers. It is shown that the main feature of the algorithm is its high processing speed, which does not depend on the number of faces in the input image. With ResNet-34 as the base DCNN architecture, detecting faces and building biometric templates for an image with 100 faces takes less than 13 ms. Testing on the FDDB (Face Detection Dataset and Benchmark) and Fei Face DataBase datasets showed that the proposed approach can be used in practice for real-time reidentification tasks.

Keywords: deep convolutional neural networks; biometrics; face detection; facial landmark detection; biometric template construction.
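
The abstract's claim that processing time does not depend on the number of faces is plausibly explained by the shape of the pipeline: the backbone runs once per image, and each detected face adds only a region-of-interest pooling step over the shared feature map. Below is a minimal Python sketch of RoI max pooling on a toy single-channel feature map. It is an illustration under stated assumptions, not the authors' implementation: the function name `roi_max_pool`, the integer bin layout, the 6x6 feature map, and the two boxes are all hypothetical.

```python
from typing import List, Tuple

FeatureMap = List[List[float]]    # one channel, H x W (toy stand-in for backbone output)
Box = Tuple[int, int, int, int]   # (x0, y0, x1, y1) in feature-map cells, x1/y1 exclusive


def roi_max_pool(fmap: FeatureMap, box: Box, out_size: int = 2) -> List[List[float]]:
    """Max-pool the cells inside `box` into a fixed out_size x out_size grid."""
    x0, y0, x1, y1 = box
    h, w = y1 - y0, x1 - x0
    if h < out_size or w < out_size:
        raise ValueError("box must span at least out_size cells per side")
    pooled = []
    for i in range(out_size):
        # Integer bin edges; with h >= out_size every bin holds at least one row.
        r0 = y0 + (i * h) // out_size
        r1 = y0 + ((i + 1) * h) // out_size
        row = []
        for j in range(out_size):
            c0 = x0 + (j * w) // out_size
            c1 = x0 + ((j + 1) * w) // out_size
            row.append(max(fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)))
        pooled.append(row)
    return pooled


# Hypothetical shared feature map, computed once per image.
feature_map = [[float(r * 6 + c) for c in range(6)] for r in range(6)]

# Hypothetical detected face boxes of different sizes.
boxes = [(0, 0, 4, 4), (2, 1, 6, 6)]

# Each face costs one cheap pooling pass; the backbone is not re-run.
pooled_faces = [roi_max_pool(feature_map, b) for b in boxes]
print(pooled_faces)
```

Both boxes, although of different sizes, yield a fixed 2x2 grid that a recognition branch could consume. In a real network the pooled features would span many channels and feed further layers before a biometric template is produced.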

Vizilter Yu. V., Gorbatsevich V. S., Moiseenko A. S.
SINGLE-SHOT FACE DETECTION AND RECOGNITION BASED ON CNN
(pp. 11-20)

Abstract. The paper proposes an architecture and a training method for a deep convolutional neural network that performs face detection and recognition simultaneously. The approach combines ideas from the SSD (Single Shot Detector) and Faster R-CNN (Region-based Convolutional Neural Network) algorithms. Faces are detected as in single-stage detection algorithms, and a biometric template is then built by a separate branch of the network using RoI (Region of Interest) pooling layers. Training has three stages: pretraining the base CNN for face recognition on face images; fine-tuning with RoI pooling on inpainted face images; and adding SSD layers and fine-tuning for face detection. At the last stage, training uses shared layers and runs on two databases simultaneously. The main feature of the algorithm is its high processing speed, which does not depend on the number of faces in the input image. For example, with ResNet-34 as the core architecture, detecting faces and building biometric templates for an image with 100 faces takes less than 13 ms. For training, CASIA-WebFace is used for the face recognition task and Wider Face for the face detection task. Detection is tested on FDDB (Face Detection Dataset and Benchmark), since this database is closer to practical applications than Wider Face. Because the main practical task the method is intended for is face reidentification, recognition quality is tested on Fei Face DataBase. We obtain TPR (True Positive Rate) = 0.928@1000 on FDDB and FAR (False Acceptance Rate) = 0.03309 at FRR (False Rejection Rate) = 10⁻⁴. The proposed algorithm therefore solves face detection and reidentification tasks in real time for any number of faces in the input image.

Keywords: Deep convolutional neural networks; Biometrics; Face detection; Finding facial features; Biometric template.
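
The recognition result in the abstract is reported as a FAR value at a fixed FRR operating point. As a reading aid, here is a minimal Python sketch of how such a figure is commonly computed from similarity scores, using the standard definitions of false acceptance rate and false rejection rate. It is not the authors' evaluation code: the function names `far_frr` and `far_at_frr`, the "accept if score >= threshold" convention, and the toy scores are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def far_frr(genuine: Sequence[float], impostor: Sequence[float],
            threshold: float) -> Tuple[float, float]:
    """Return (FAR, FRR) when a pair is accepted if its score >= threshold."""
    # FAR: share of impostor (different-person) pairs wrongly accepted.
    far = sum(s >= threshold for s in impostor) / len(impostor)
    # FRR: share of genuine (same-person) pairs wrongly rejected.
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def far_at_frr(genuine: Sequence[float], impostor: Sequence[float],
               target_frr: float) -> Tuple[float, float, float]:
    """Return (FAR, FRR, threshold) with the lowest FAR such that FRR <= target_frr."""
    best = None
    # Only observed scores can change the error counts, so they suffice as candidates.
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        if frr <= target_frr and (best is None or far < best[0]):
            best = (far, frr, t)
    if best is None:
        raise ValueError("no threshold satisfies the target FRR")
    return best


# Toy similarity scores (hypothetical, not taken from the paper).
genuine_scores = [0.9, 0.8, 0.75, 0.6]
impostor_scores = [0.7, 0.4, 0.3, 0.2, 0.1]

print(far_at_frr(genuine_scores, impostor_scores, target_frr=0.25))
```

Raising the threshold lowers FAR and raises FRR, so fixing one rate and reporting the other, as the abstract does, pins down a single operating point on that trade-off.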

About the Authors

Yu. V. Vizilter, V. S. Gorbatsevich, A. S. Moiseenko (State Research Institute of Aviation Systems, State Scientific Center of the Russian Federation, Moscow, Russia)

