DOI: 10.14489/vkit.2026.04.pp.037-044

Харитоненко В. Г., Иванов Д. А., Соковнин С. Д.
A MODEL FOR IDENTIFYING CRYPTOGRAPHIC HASH FUNCTIONS BASED ON HETEROGENEOUS STACKING WITH AN EXTENDED FEATURE SPACE
(pp. 37-44)

Abstract. The paper addresses the problem of identifying a hash function using a two-level heterogeneous stacking architecture. The architecture relies on blending, which combines algorithms of different types (XGBoost, Random Forest, and MLP): a top-level model learns to combine the predictions of the lower-level models. The extended feature space includes new categories of features (entropy-based and frequency-based) and makes it possible to account for the specific properties of cryptographic functions. The relevance of the study stems from the growing complexity of hash functions, the inadequacy of traditional analysis methods, and rising information security requirements. The proposed approach combines the analysis of statistical, structural, and frequency features of hashes with a combination of base classifiers and a neural network algorithm in order to improve hash function identification accuracy over traditional methods. Experiments showed that the stacking model achieves accuracy 5 to 7 % higher than the best single classifiers. The importance of the task is emphasized in the context of data verification, malware analysis, and cryptographic auditing.

Keywords: cryptography; hash functions; identification; machine learning; gradient boosting; blending; multilayer perceptron.
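The two-level blending scheme named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the base learners are simple pure-Python stand-ins (a nearest-centroid classifier and two one-feature stumps) in place of XGBoost, Random Forest, and MLP; the meta-learner is a plain logistic regression, since the abstract does not name the top-level model; the data are synthetic two-class points rather than hash features. The class and function names (`NearestCentroid`, `Stump`, `LogisticMeta`, `run_demo`) are hypothetical.

```python
import math
import random


class NearestCentroid:
    """Stand-in base learner: scores class 1 by relative distance to centroids."""

    def fit(self, X, y):
        self.centroids = {}
        for label in (0, 1):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_proba(self, x):
        d0 = math.dist(x, self.centroids[0])
        d1 = math.dist(x, self.centroids[1])
        return d0 / (d0 + d1) if d0 + d1 else 0.5


class Stump:
    """Stand-in base learner: sigmoid of one feature around the class-mean midpoint."""

    def __init__(self, feature):
        self.feature = feature

    def fit(self, X, y):
        v0 = [x[self.feature] for x, t in zip(X, y) if t == 0]
        v1 = [x[self.feature] for x, t in zip(X, y) if t == 1]
        m0, m1 = sum(v0) / len(v0), sum(v1) / len(v1)
        self.threshold = (m0 + m1) / 2
        self.sign = 1.0 if m1 >= m0 else -1.0
        return self

    def predict_proba(self, x):
        return 1 / (1 + math.exp(-self.sign * (x[self.feature] - self.threshold)))


class LogisticMeta:
    """Top-level model (assumed): logistic regression over base-model outputs."""

    def fit(self, Z, y, epochs=500, lr=0.5):
        self.w, self.b = [0.0] * len(Z[0]), 0.0
        n = len(Z)
        for _ in range(epochs):  # full-batch gradient descent on log-loss
            gw, gb = [0.0] * len(self.w), 0.0
            for z, t in zip(Z, y):
                err = self.predict_proba(z) - t
                gb += err
                for j, v in enumerate(z):
                    gw[j] += err * v
            self.w = [w - lr * g / n for w, g in zip(self.w, gw)]
            self.b -= lr * gb / n
        return self

    def predict_proba(self, z):
        return 1 / (1 + math.exp(-(self.b + sum(w * v for w, v in zip(self.w, z)))))


def make_data(n, rng):
    """Synthetic two-class data: Gaussian clusters centred at (0, 0) and (2.5, 2.5)."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(2.5 * label, 1.0), rng.gauss(2.5 * label, 1.0)])
        y.append(label)
    return X, y


def base_outputs(models, X):
    """Level-1 features: one predicted score per base model for each sample."""
    return [[m.predict_proba(x) for m in models] for x in X]


def run_demo(seed=0):
    rng = random.Random(seed)
    X, y = make_data(600, rng)
    X_fit, y_fit = X[:300], y[:300]          # trains the base models
    X_hold, y_hold = X[300:450], y[300:450]  # holdout: trains the meta-model
    X_test, y_test = X[450:], y[450:]        # untouched evaluation split

    models = [NearestCentroid(), Stump(0), Stump(1)]
    for m in models:
        m.fit(X_fit, y_fit)

    # Blending: the meta-model only sees base predictions on data
    # that the base models were not trained on.
    meta = LogisticMeta().fit(base_outputs(models, X_hold), y_hold)

    preds = [int(meta.predict_proba(z) >= 0.5) for z in base_outputs(models, X_test)]
    return sum(p == t for p, t in zip(preds, y_test)) / len(y_test)


if __name__ == "__main__":
    print(f"blended accuracy on the test split: {run_demo():.3f}")
```

The point of the holdout split is that the top-level model learns from predictions the lower-level models made on unseen samples, so it does not simply reward whichever base model overfits its training data.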

Haritonenko V. G., Ivanov D. A., Sokovnin S. D.
MULTILAYER SYSTEM OF IDENTIFICATION OF CRYPTOGRAPHIC HASH FUNCTIONS BASED ON A HETEROGENEOUS STACKING ENSEMBLE
(pp. 37-44)

Abstract. This paper presents an approach to hash function identification, a critical task in digital forensics, malware analysis, and cryptographic security auditing. Motivated by the increasing complexity of modern cryptographic functions and the limitations of conventional analytical techniques, we propose a two-level heterogeneous stacking architecture. The ensemble combines diverse base-level algorithms: XGBoost, Random Forest, and a Multi-Layer Perceptron (MLP). Their individual predictions are not simply aggregated; instead, a high-level meta-learner is trained to blend these outputs non-linearly, learning the combination strategy that gives the best accuracy. We also expand the feature space beyond traditional statistical descriptors, adding new categories of features that capture intrinsic properties of hash outputs: entropy-based metrics, which quantify randomness, and frequency-based attributes, which reveal structural patterns. This enriched feature set allows the model to discern subtle, algorithm-specific signatures more effectively. Experimental results show a consistent 5 to 7 % improvement in identification accuracy over the best-performing single classifier within the ensemble. The study thus confirms that the two-level stacking model, supported by a comprehensive feature engineering strategy, offers a robust solution for hash function recognition and addresses pressing challenges in information security.

Keywords: Cryptography; Hash functions; Identification; Machine learning; Gradient boosting; Blending; Multilayer perceptron.
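The entropy-based and frequency-based feature categories mentioned in the abstract can be sketched as follows. The abstract does not specify the exact features, so the particular choices here (byte-level Shannon entropy, the ratio of set bits, and 4-bit value frequencies) and the function names are assumptions for illustration only; the sketch uses only the Python standard library.

```python
import hashlib
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in Counter(data).values())


def ones_ratio(data: bytes) -> float:
    """Fraction of bits in the data that are set to 1."""
    if not data:
        return 0.0
    return sum(bin(b).count("1") for b in data) / (8 * len(data))


def nibble_frequencies(data: bytes) -> list:
    """Relative frequency of each 4-bit value (0..15) in the data."""
    if not data:
        return [0.0] * 16
    counts = [0] * 16
    for byte in data:
        counts[byte >> 4] += 1    # high nibble
        counts[byte & 0x0F] += 1  # low nibble
    return [c / (2 * len(data)) for c in counts]


def extract_features(digest: bytes) -> list:
    """Illustrative feature vector: length, entropy, ones ratio, 16 nibble frequencies."""
    return (
        [float(len(digest)), shannon_entropy(digest), ones_ratio(digest)]
        + nibble_frequencies(digest)
    )


if __name__ == "__main__":
    for name in ("sha256", "sha3_256", "sha512"):
        digest = hashlib.new(name, b"example message").digest()
        length, entropy, ones = extract_features(digest)[:3]
        print(f"{name}: length={length:.0f} entropy={entropy:.3f} ones_ratio={ones:.3f}")
```

SHA-256 and SHA3-256 produce digests of the same length, so the length feature alone cannot separate them; that is the kind of case where additional feature categories would have to carry the signal.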

About the Authors

V. G. Haritonenko, D. A. Ivanov, S. D. Sokovnin (Military University of Radioelectronics, Cherepovets, Vologda Region, Russia)
