Applying Tree-Based Convolutional Neural Networks to classify design patterns

Kusper, Gábor and Hidi, Erik Zoltán and Kusper, Krisztián and Yang, Zijian Győző and Márien, Szabolcs (2025) Applying Tree-Based Convolutional Neural Networks to classify design patterns. In: Proceedings of the International Conference on Formal Methods and Foundations of Artificial Intelligence. Eszterházy Károly Katolikus Egyetem Líceum Kiadó, Eger, pp. 140-147. ISBN 9789634963035

fmfai2025_pp140-147.pdf - Published Version

Abstract

Automatic detection and classification of design patterns is an increasingly relevant task in modern software engineering, as it directly contributes to improving code quality, readability, and maintainability. In this paper, we propose the application of a modified Tree-Based Convolutional Neural Network (TBCNN) architecture for the recognition of Gang of Four (GoF) design patterns in Java source code. The approach uses Abstract Syntax Trees (ASTs) as structural representations of programs, where nodes are encoded by a pre-trained embedding model that captures semantic similarities between language keywords. The resulting vectorized ASTs are processed by the TBCNN, enabling the model to learn both structural and semantic features characteristic of design patterns. For training and evaluation, we collected a dataset of Java implementations of design patterns from GitHub repositories, resulting in approximately 500–600 samples per pattern. Experimental results demonstrate high classification accuracy, with average precision, recall, and F1-scores exceeding 98% across eight design patterns. These findings confirm the viability of tree-based deep learning methods for pattern recognition in source code. However, the model shows limitations when applied to real-world production code, likely because the training data, which consists mainly of educational implementations, is not representative of such code.
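
The pipeline described in the abstract (AST nodes mapped to embedding vectors, then a tree-based convolution followed by pooling) can be illustrated with a minimal sketch. It is not the modified architecture of the paper: it assumes a depth-2 convolution window and a position-based weighting of children similar to the "continuous binary tree" scheme of the original TBCNN formulation. The names `Node`, `tree_conv`, `encode_tree`, `EMB_DIM`, `OUT_DIM` and the AST node kinds are hypothetical, the embeddings are random stand-ins for a pre-trained model, the weights are untrained, and the final classification layer is omitted.

```python
import math
import random
from dataclasses import dataclass, field

EMB_DIM = 4   # assumed embedding size (illustrative)
OUT_DIM = 3   # assumed number of convolution feature detectors


@dataclass
class Node:
    kind: str                                   # AST node type
    children: list = field(default_factory=list)


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def tree_conv(node, emb, Wt, Wl, Wr, bias):
    """Convolve one depth-2 window: a parent node and its direct children."""
    # The parent contributes through the "top" weight matrix only.
    acc = [b + v for b, v in zip(bias, matvec(Wt, emb[node.kind]))]
    n = len(node.children)
    for i, child in enumerate(node.children):
        # Leftmost child uses Wl, rightmost uses Wr, others a linear blend.
        eta_r = 0.5 if n == 1 else i / (n - 1)
        eta_l = 1.0 - eta_r
        x = emb[child.kind]
        left, right = matvec(Wl, x), matvec(Wr, x)
        acc = [a + eta_l * l + eta_r * r for a, l, r in zip(acc, left, right)]
    return [math.tanh(a) for a in acc]


def walk(node):
    """Yield every node of the tree in pre-order."""
    yield node
    for child in node.children:
        yield from walk(child)


def encode_tree(root, emb, Wt, Wl, Wr, bias):
    """Apply the window at every node, then max-pool to a fixed-size vector."""
    feats = [tree_conv(n, emb, Wt, Wl, Wr, bias) for n in walk(root)]
    return [max(column) for column in zip(*feats)]


rng = random.Random(0)
kinds = ["ClassDeclaration", "FieldDeclaration",
         "MethodDeclaration", "ReturnStatement"]
# Random vectors stand in for the pre-trained node embeddings.
emb = {k: [rng.uniform(-1, 1) for _ in range(EMB_DIM)] for k in kinds}
Wt = random_matrix(OUT_DIM, EMB_DIM, rng)
Wl = random_matrix(OUT_DIM, EMB_DIM, rng)
Wr = random_matrix(OUT_DIM, EMB_DIM, rng)
bias = [0.0] * OUT_DIM

small = Node("ClassDeclaration", [Node("FieldDeclaration")])
large = Node("ClassDeclaration", [
    Node("FieldDeclaration"),
    Node("MethodDeclaration", [Node("ReturnStatement")]),
    Node("MethodDeclaration", [Node("ReturnStatement")]),
])

vec_small = encode_tree(small, emb, Wt, Wl, Wr, bias)
vec_large = encode_tree(large, emb, Wt, Wl, Wr, bias)
print(len(vec_small), len(vec_large))
```

Because pooling takes the maximum over all nodes, trees of different sizes yield vectors of the same length, which is what allows a fixed-size classifier (for example, a softmax layer over the eight pattern classes) to be placed on top.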

Item Type: Book Section
Additional Information: International Conference on Formal Methods and Foundations of Artificial Intelligence, Eger, June 5–7, 2025
Uncontrolled Keywords: design patterns, Tree-Based CNN, source code analysis
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics > QA76.76 Software Design and Development
Depositing User: Tibor Gál
Date Deposited: 30 Oct 2025 13:26
Last Modified: 30 Oct 2025 14:26
URI: https://real.mtak.hu/id/eprint/227753
