REAL

Syntactic comparison of human and AI-written scientific texts

B. Varga, Erika and Baksa, Attila (2025) Syntactic comparison of human and AI-written scientific texts. ANNALES MATHEMATICAE ET INFORMATICAE, 61. pp. 248-260. ISSN 1787-6117

248_260_varga.pdf - Published Version


Abstract

The spread of large language models (LLMs) has transformed scientific writing, enabling the generation of fluent and convincing text with minimal human input. This development poses significant challenges for authorship verification, especially when AI-generated or AI-assisted content is embedded in academic manuscripts. While most existing detection approaches rely on surface-level lexical features or stylometric clues, our study proposes a novel syntactic-level method to distinguish between human-authored, translated, and AI-generated scientific texts. We constructed a controlled corpus of 24 scientific articles in the field of computer science, divided into four categories: native-authored, human-translated, ChatGPT 4.0-generated, and ChatGPT 4o-generated with deep research. Each corpus was processed using part-of-speech (POS) and dependency parsing, followed by statistical profiling and sentence-structure discovery via process mining. Our results reveal that AI-generated texts differ significantly in their use of modal verbs, participles, coordination, and syntactic complexity. We demonstrate that process-mined graphs of syntactic transitions provide an interpretable and robust fingerprint of authorship, enabling us to detect AI-generated patterns and differentiate them from translated or native writing. The proposed framework contributes a novel methodological perspective to the growing field of AI authorship detection.
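The idea of treating each sentence as a trace of syntactic events can be illustrated with a small sketch. The abstract does not name the discovery algorithm or the tag set used, so the following assumes a plain directly-follows graph (a basic process-mining representation) over Universal POS tags. The function names, the `<START>`/`<END>` markers, and the two pre-tagged sentences are hypothetical, not taken from the paper.

```python
from collections import Counter

# Hypothetical boundary markers so sentence openings and endings become edges too.
START, END = "<START>", "<END>"


def directly_follows_graph(tagged_sentences):
    """Count how often one POS tag is directly followed by another.

    Each sentence is treated as one trace, each tag as one event.
    """
    edges = Counter()
    for tags in tagged_sentences:
        path = [START, *tags, END]
        edges.update(zip(path, path[1:]))
    return edges


def transition_probabilities(edges):
    """Normalize edge counts by the total outgoing count of the source tag."""
    totals = Counter()
    for (src, _dst), count in edges.items():
        totals[src] += count
    return {(src, dst): count / totals[src] for (src, dst), count in edges.items()}


# Illustrative input only: in practice the tags would come from a POS tagger.
sentences = [
    ["DET", "NOUN", "VERB", "DET", "NOUN"],
    ["DET", "NOUN", "AUX", "VERB"],
]

dfg = directly_follows_graph(sentences)
probs = transition_probabilities(dfg)

for (src, dst), p in sorted(probs.items()):
    print(f"{src} -> {dst}: {p:.2f}")
```

Building one such graph per corpus category and comparing edge weights (for example, how often a modal auxiliary precedes a verb) is one way the transition profiles could be contrasted, under the assumptions stated above.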

Item Type: Article
Uncontrolled Keywords: AI-generated text detection, syntactic analysis, sentence structure modeling, process discovery
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Depositing User: Tibor Gál
Date Deposited: 11 Nov 2025 10:11
Last Modified: 11 Nov 2025 10:11
URI: https://real.mtak.hu/id/eprint/228852
