FGSM attacks on CNN image classifiers: Vulnerability analysis and a proposed effective defense

Huong-Giang Doan; Thi Thanh Thuy Pham

doi:10.54939/1859-1043.j.mst.104.2025.155-163

Authors

Doan Huong Giang, Faculty of Control and Automation, Electric Power University
Pham Thi Thanh Thuy (Corresponding author), Faculty of Cybersecurity and High Tech Crime Prevention, Academy of People Security

Keywords:

Convolutional neural network (CNN), Adversarial attack, Defense, Adversarial training, Regularization technique.

Abstract

Convolutional neural networks (CNNs) have shown many outstanding advantages and are therefore widely applied in many fields. However, adversarial attacks have exposed serious vulnerabilities in these models, threatening the security and reliability of the systems built on them. Although many studies have addressed attacks on deep learning models, the specific impact of these attacks on CNN-based image classifiers still needs further clarification, especially for the popular CNN models that underpin many important real-world applications. This study analyzes the weaknesses of CNN image classifiers against the FGSM (Fast Gradient Sign Method) adversarial attack and proposes an effective defense named WR_FGSM. Experiments on standard datasets show that several CNN models are quite seriously affected by FGSM. The perturbed images generated by this attack not only fool CNN image classifiers but are also difficult for human vision to distinguish from the original images. In addition to adversarial training, an effective defense method in current use, the proposed WR_FGSM defense integrates a regularization technique into the training steps. This protects CNN models effectively against FGSM attacks while maintaining a balance between attack resistance and the generalization ability of the model.
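To make the attack and the general shape of the defense concrete, the following is a minimal sketch, not the authors' WR_FGSM implementation, whose details are not given in this abstract. It assumes a toy logistic classifier as a stand-in for a CNN, applies the standard FGSM step x_adv = clip(x + eps * sign(grad_x loss), 0, 1), and then evaluates a hypothetical training objective that mixes clean and adversarial loss with an L2 weight penalty. All names (`fgsm`, `robust_loss`, `alpha`, `lam`) and all numeric values are illustrative assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    """Probability of class 1 under a toy logistic model (stand-in for a CNN)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def bce(p, y):
    """Binary cross-entropy for one example, with clamping for stability."""
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def loss_grad_wrt_input(w, b, x, y):
    """Gradient of the cross-entropy loss w.r.t. the input: (p - y) * w."""
    p = predict_proba(w, b, x)
    return [(p - y) * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    """One FGSM step, clipped to the valid pixel range [0, 1]."""
    g = loss_grad_wrt_input(w, b, x, y)
    return [min(1.0, max(0.0, xi + eps * sign(gi))) for xi, gi in zip(x, g)]

def robust_loss(w, b, x, y, eps, alpha=0.5, lam=1e-2):
    """Assumed objective: clean loss + FGSM loss + L2 weight regularization."""
    x_adv = fgsm(w, b, x, y, eps)
    clean = bce(predict_proba(w, b, x), y)
    adv = bce(predict_proba(w, b, x_adv), y)
    l2 = sum(wi * wi for wi in w)
    return alpha * clean + (1.0 - alpha) * adv + lam * l2

# Illustrative values only: three "pixels", true label 1.
w, b = [2.0, -3.0, 1.0], 0.0
x, y, eps = [0.6, 0.3, 0.5], 1, 0.2

x_adv = fgsm(w, b, x, y, eps)
print("clean p(class 1):", predict_proba(w, b, x))
print("adversarial p(class 1):", predict_proba(w, b, x_adv))
print("robust training loss:", robust_loss(w, b, x, y, eps))
```

In this toy setting, a perturbation of at most `eps` per input component is enough to push the prediction across the decision boundary. Minimizing an objective like `robust_loss` over `w` and `b` is one common way to combine adversarial training with regularization; the weighting actually used in WR_FGSM may differ.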

ISSN: 1859-1043
