
DOI: 10.14489/vkit.2019.04.pp.013-024

Vizilter Yu. V., Vygolov O. V., Zheltov S. Yu., Kniaz V. V.
SEMANTIC-MORPHOLOGICAL IMAGE DESCRIPTION AND SYNTHESIS USING DEEP NEURAL NETWORKS
(pp. 13-24)

Abstract. An approach to image analysis and synthesis based on semantic-morphological models (SMMs), which combine the properties of morphological and semantic descriptions of images of visible scenes (semantic-morphological image analysis), is substantiated. SMMs of segmented mosaic images are defined. Possible ways of representing SMMs in deep convolutional neural networks are described. The proposed semantic-morphological approach to image description and synthesis is tested on the task of synthesizing infrared images from source color images using SMMs and generative adversarial networks. The results obtained with the proposed SMMs are shown to be qualitatively superior to those of traditional mosaic morphological models and other known methods.

Keywords: morphological image analysis; semantic models; semantic segmentation; image synthesis; deep neural networks; generative adversarial networks.

 

Vizilter Yu. V., Vygolov O. V., Zheltov S. Yu., Kniaz V. V.
SEMANTIC-MORPHOLOGICAL IMAGE DESCRIPTION AND SYNTHESIS VIA CONVOLUTIONAL NEURAL NETWORKS
(pp. 13-24)

Abstract. An approach to the analysis and synthesis of images based on semantic-morphological models (SMMs) is presented. SMMs combine the properties of morphological and semantic descriptions of images of the observed scene (semantic-morphological image analysis). Semantic-morphological models of mosaic images are defined, and various approaches to implementing SMMs in deep convolutional neural networks are described. The proposed semantic-morphological approach to image description and synthesis is tested on the task of infrared image synthesis, where infrared images are synthesized using generative adversarial networks (GANs) that take an SMM and a color image as input. An overview of related work on image synthesis with generative adversarial networks is presented. A modification of a generative adversarial network, termed SemanticThermalGAN, is proposed; it synthesizes infrared images in two steps: first, it predicts the semantic segmentation of all objects in the scene; second, it predicts the relative temperature textures of those objects. A qualitative evaluation shows that the results generated with the SMM-based approach outperform those of comparable modern approaches, including mosaic morphological models.
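For orientation, the adversarial synthesis referred to above is commonly trained with a conditional GAN objective of the kind used in the cited image-to-image translation work (refs. 12, 16). The formula below is a generic template, not the exact loss of SemanticThermalGAN, which the abstract does not reproduce:

\mathcal{L}_{\mathrm{cGAN}}(G, D) = \mathbb{E}_{x,y}\big[\log D(x, y)\big] + \mathbb{E}_{x,z}\big[\log\big(1 - D(x, G(x, z))\big)\big],

where x is the conditioning input (here, the color image together with the SMM), y is the real infrared image, and z is a noise vector; the generator G is trained to minimize this objective while the discriminator D tries to maximize it, and a pixel-wise reconstruction term (e.g., an L1 loss) is usually added.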

Keywords: morphological image analysis; semantic model; semantic segmentation; image synthesis; convolutional neural networks; generative adversarial networks.
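The two-step generator described in the abstract (semantic segmentation first, then relative temperature textures) can be sketched as follows. This is a minimal illustrative sketch in PyTorch under assumed design choices, not the authors' implementation: the class and attribute names (TwoStageGenerator, seg_head, thermal_head) and all layer sizes are hypothetical, and the discriminator and adversarial training loop are omitted.

```python
# Illustrative sketch (not the paper's code): a two-stage generator in the
# spirit of the approach described in the abstract. Stage 1 predicts a
# semantic segmentation from the color image; stage 2 predicts a relative
# temperature (thermal) texture conditioned on the image and the segmentation.
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch):
    """3x3 convolution + batch norm + ReLU, a common building block."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class TwoStageGenerator(nn.Module):
    """Hypothetical generator: RGB -> (segmentation logits, thermal map)."""

    def __init__(self, num_classes=8):
        super().__init__()
        # Stage 1: semantic segmentation head.
        self.seg_head = nn.Sequential(
            conv_block(3, 32), conv_block(32, 32),
            nn.Conv2d(32, num_classes, kernel_size=1),
        )
        # Stage 2: thermal texture head, conditioned on the image + soft labels.
        self.thermal_head = nn.Sequential(
            conv_block(3 + num_classes, 32), conv_block(32, 32),
            nn.Conv2d(32, 1, kernel_size=1),
            nn.Tanh(),  # relative temperature in [-1, 1]
        )

    def forward(self, rgb):
        seg_logits = self.seg_head(rgb)              # step 1: segmentation
        seg_soft = torch.softmax(seg_logits, dim=1)  # soft per-class maps
        thermal = self.thermal_head(torch.cat([rgb, seg_soft], dim=1))  # step 2
        return seg_logits, thermal


if __name__ == "__main__":
    gen = TwoStageGenerator(num_classes=8)
    rgb = torch.randn(1, 3, 128, 128)           # dummy color image
    seg_logits, thermal = gen(rgb)
    print(seg_logits.shape, thermal.shape)      # (1, 8, 128, 128), (1, 1, 128, 128)
```

In a GAN setting such a generator would be paired with a discriminator and trained with an adversarial loss (plus segmentation and reconstruction terms), along the lines of the conditional GAN formulation given above.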

Rus

Yu. V. Vizilter, O. V. Vygolov, S. Yu. Zheltov, V. V. Kniaz (Federal State Unitary Enterprise "State Research Institute of Aviation Systems", State Scientific Center of the Russian Federation, Moscow, Russia)

Eng

Yu. V. Vizilter, O. V. Vygolov, S. Yu. Zheltov, V. V. Kniaz (State Research Institute of Aviation Systems, State Scientific Center of the Russian Federation, Moscow, Russia)

Rus

1. Qi G.-J. Hierarchically Gated Deep Networks for Semantic Segmentation // IEEE Conf. on Computer Vision and Pattern Recognition (CVPR). 2016. P. 2267 – 2275.
2. Xu C., Corso J. J. Actor-Action Semantic Segmentation with Grouping Process Models // IEEE Conf. on Computer Vision and Pattern Recognition (CVPR). 2016. P. 3083 – 3092.
3. Dai J., He K., Sun J. Instance-Aware Semantic Segmentation via Multi-Task Network Cascades // IEEE Conf. on Computer Vision and Pattern Recognition (CVPR). 2016. P. 3150 – 3158.
4. ScribbleSup: Scribble-Supervised Convolutional Networks for Semantic Segmentation / G. Lin et al. // IEEE Conf. on Computer Vision and Pattern Recognition (CVPR). 2016. P. 3159 – 3167.
5. Efficient Piecewise Training of Deep Structured Models for Semantic Segmentation / G. Lin et al. // IEEE Conf. on Computer Vision and Pattern Recognition (CVPR). 2016. P. 3194 – 3203.
6. Gaussian Conditional Random Field Network for Semantic Segmentation / R. Vemulapalli et al. // IEEE Conf. on Computer Vision and Pattern Recognition (CVPR). 2016. P. 3224 – 3233.
7. Bertasius G., Shi J., Torresani L. Semantic Segmentation with Boundary Neural Fields // IEEE Conf. on Computer Vision and Pattern Recognition (CVPR). 2016. P. 3602 – 3610.
8. You Only Look Once: Unified, Real-Time Object Detection. arXiv [cs.CV] / J. Redmon et al. 2015. URL: http://arxiv.org/abs/1506.02640 (дата обращения: 11.03.2019).
9. Going Deeper with Convolutions. arXiv [cs.CV] / C. Szegedy et al. 2014. URL: http://arxiv.org/abs/1409.4842 (дата обращения: 11.03.2019).
10. Пытьев Ю. П., Чуличков А. И. Методы морфологического анализа изображений. М.: ФИЗМАТЛИТ, 2010. 336 с.
11. Визильтер Ю. В., Горбацевич В. С. Описание формы объектов на изображениях при помощи гибких структурирующих элементов // Техническое зрение в системах управления – 2011: тр. науч.-техн. конф. Москва, 15 – 17 марта 2011 г. 2011. С. 162 – 167.
12. Generative Adversarial Networks / I. Goodfellow et al. // Advances in Neural Information Processing Systems (NIPS). 2014. P. 2672 – 2680.
13. Zhang R. Colorful Image Colorization. ECCV. 2016. V. 9907, No. 40. P. 649 – 666.
14. Gatys L. A., Ecker A. S., Bethge M. A Neural Algorithm of Artistic Style. arXiv [cs.CV]. URL: https://arxiv.org/abs/1508.06576 (дата обращения: 22.03.2019).
15. Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks / J.-Y. Zhu et al. // IEEE Intern. Conf. on Computer Vision (ICCV). 2017. P. 2242 – 2251.
16. Image-to-Image Translation with Conditional Adversarial Networks / P. Isola et al. // Conf. on Computer Vision and Pattern Recognition (CVPR). 2017. P. 5967 – 5976.
17. Toward Multimodal Image-to-Image Translation / J.-Y. Zhu et al. // NIPS. Annual Conference on Neural Information Processing Systems. 2017. P. 465 – 476.
18. Image Blind Denoising with Generative Adversarial Network Based Noise Modeling // Conf. on Computer Vision and Pattern Recognition (CVPR). 2018. P. 3155 – 3164.
19. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network / C. Ledig et al. // Conf. on Computer Vision and Pattern Recognition (CVPR). 2017. P. 105 – 114.
20. Lim B., Son S., Kim H. Enhanced Deep Residual Networks for Single Image Super-Resolution // Conf. on Computer Vision and Pattern Recognition Workshops (CVPR Workshops). 2017. P. 1132 – 1140.
21. Generative Adversarial Text to Image Synthesis / S. Reed et al. // 33rd Intern. Conf. on Machine Learning (ICML). 2016. V. 48. P. 1060 – 1069.
22. StackGAN++: Text to Photorealistic Image Synthesis with Stacked Generative Adversarial Networks / H. Zhang et al. // Intern. Conf. on Computer Vision (ICCV). 2017. P. 5908 – 5916.
23. Recycle-GAN: Unsupervised Video Retargeting / A. Bansal et al. // 15th European Conf. Computer Vision (ECCV). 2018. V. 11209. P. 122 – 138.
24. Kniaz V. V., Mizginov V. A. Thermal Texture Generation and 3D-Model Reconstruction Using SfM and GAN // ISPRS TC II Midterm Symposium “Towards Photogrammetry 2020”. ISPRS – International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. 2018. V. XLII-2. P. 519 – 524.
25. ThermalGAN: Multimodal Color-to-Thermal Image Translation for Person Re-Identification in Multispectral Dataset / V. V. Kniaz et al. // The European Conference on Computer Vision (ECCV) Workshops. 2018. 19 p. URL: http://openaccess.thecvf.com/content_ECCVW_2018/papers/11134/Kniaz_ThermalGAN_Multimodal_Color-to-Thermal_Image_Translation_for_Person_Re-Identification_in_Multispectral_ECCVW_2018_paper.pdf (дата обращения: 11.03.2019).

Eng

1. Qi G.-J. (2016). Hierarchically Gated Deep Networks for Semantic Segmentation. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2267-2275.
2. Xu C., Corso J. J. (2016). Actor-Action Semantic Segmentation with Grouping Process Models. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3083-3092.
3. Dai J., He K., Sun J. (2016). Instance-Aware Semantic Segmentation via Multi-Task Network Cascades. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3150-3158.
4. Lin G. et al. (2016). ScribbleSup: Scribble-Supervised Convolutional Networks for Semantic Segmentation. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3159-3167.
5. Lin G. et al. (2016). Efficient Piecewise Training of Deep Structured Models for Semantic Segmentation. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3194-3203.
6. Vemulapalli R. et al. (2016). Gaussian Conditional Random Field Network for Semantic Segmentation. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3224-3233.
7. Bertasius G., Shi J., Torresani L. (2016). Semantic Segmentation with Boundary Neural Fields. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3602-3610.
8. Redmon J. et al. (2015). You Only Look Once: Unified, Real-Time Object Detection. arXiv [cs.CV] Available at: http://arxiv.org/abs/1506.02640 (Accessed: 11.03.2019).
9. Szegedy C. et al. (2014). Going Deeper with Convolutions. arXiv [cs.CV]. Available at: http://arxiv.org/abs/1409.4842 (Accessed: 11.03.2019).
10. Pyt'ev Yu. P., Chulichkov A. I. (2010). Methods of morphological image analysis. Moscow: FIZMATLIT. [in Russian language]
11. Vizil'ter Yu. V., Gorbatsevich V. S. (2011). Description of the shape of objects in images using flexible structuring elements. Technical Vision in Management Systems - 2011: Proceedings of Scientific and Technical Conference Moscow, March 15 - 17, pp. 162-167. [in Russian language]
12. Goodfellow I. et al. (2014). Generative Adversarial Networks. Advances in Neural Information Processing Systems (NIPS), pp. 2672-2680.
13. Zhang R. (2016). Colorful Image Colorization. European Conference on Computer Vision, Vol. 9907, 40, pp. 649-666.
14. Gatys L. A., Ecker A. S., Bethge M. (2015). A Neural Algorithm of Artistic Style. arXiv [cs.CV]. Available at: https://arxiv.org/abs/1508.06576 (Accessed: 22.03.2019).
15. Zhu J.-Y. et al. (2017). Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. IEEE International Conference on Computer Vision (ICCV), pp. 2242-2251.
16. Isola P. et al. (2017). Image-to-Image Translation with Conditional Adversarial Networks. Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5967-5976.
17. Zhu J.-Y. et al. (2017). Toward Multimodal Image-to-Image Translation. NIPS. Annual Conference on Neural Information Processing Systems, pp. 465-476.
18. Image Blind Denoising with Generative Adversarial Network Based Noise Modeling. (2018). Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3155-3164.
19. Ledig C. et al. (2017). Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. Conference on Computer Vision and Pattern Recognition (CVPR), pp. 105-114.
20. Lim B., Son S., Kim H. (2017). Enhanced Deep Residual Networks for Single Image Super-Resolution. Conference on Computer Vision and Pattern Recognition Workshops (CVPR Workshops), pp. 1132-1140.
21. Reed S. et al. (2016). Generative Adversarial Text to Image Synthesis. 33rd International Conference on Machine Learning (ICML), Vol. 48, pp. 1060-1069.
22. Zhang H. et al. (2017). StackGAN++: Text to Photorealistic Image Synthesis with Stacked Generative Adversarial Networks. International Conference on Computer Vision (ICCV), pp. 5908-5916.
23. Bansal A. et al. (2018). Recycle-GAN: Unsupervised Video Retargeting. 15th European Conference Computer Vision (ECCV), Vol. 11209, pp. 122-138.
24. Kniaz V. V., Mizginov V. A. (2018). Thermal Texture Generation and 3D Model Reconstruction Using SfM and GAN. ISPRS TC II Midterm Symposium “Towards Photogrammetry 2020”. ISPRS – International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol. XLII-2, pp. 519-524.
25. Kniaz V. V. et al. (2018). ThermalGAN: Multimodal Color-to-Thermal Image Translation for Person Re-Identification in Multispectral Dataset. The European Conference on Computer Vision (ECCV) Workshops. Available at: http://openaccess.thecvf.com/content_ECCVW_2018/papers/11134/Kniaz_ThermalGAN_Multimodal_Color-to-Thermal_Image_Translation_for_Person_Re-Identification_in_Multispectral_ECCVW_2018_paper.pdf (Accessed: 11.03.2019).

Rus

This article can be purchased in electronic form (PDF).

The price of the article is 350 rubles (including 18 % VAT). Within a few days after you place an order, an invoice and a receipt for payment at a bank will be sent to the e-mail address you provide.

Once the payment is received in the publisher's account, the electronic version of the article will be sent to you.

To place an order, copy the article DOI:

10.14489/vkit.2019.04.pp.013-024

and fill out the order form.

By submitting the form you consent to the processing of your personal data.

 

Eng

This article is available in electronic format (PDF).

The cost of a single article is 350 rubles (including 18 % VAT). Within a few days after you place an order, an invoice and a receipt for payment at a bank will be sent to the e-mail address you provide.

Once your payment is received in our bank account, the file of the article will be e-mailed to you.

To order the article, copy its DOI:

10.14489/vkit.2019.04.pp.013-024

and fill out the order form.

 
