Hungarian case study on automated detection of body-shaming comments using machine learning

Gőz, Franciska Noémi and B. Varga, Erika (2025) Hungarian case study on automated detection of body-shaming comments using machine learning. In: Proceedings of the International Conference on Formal Methods and Foundations of Artificial Intelligence. Eszterházy Károly Katolikus Egyetem Líceum Kiadó, Eger, pp. 65-77. ISBN 9789634963035

Text
fmfai2025_pp065-077.pdf - Published Version

Official URL: http://doi.org/10.17048/fmfai.2025.65

Abstract

Social media facilitates online interaction but also enables body-shaming comments, which are often ambiguous. This paper presents a machine-learning approach to detecting Hungarian body-shaming comments, an underrepresented area in NLP. A dataset of Facebook comments was collected and expanded with synthetic data. Using HuSpaCy and HuBERT, logistic regression and MLP classification models were trained with TF-IDF and SBERT embeddings. The best-performing model achieved 88% accuracy, demonstrating the potential of NLP techniques for moderating harmful online content in low-resource languages. The results highlight key challenges, including category overlap and class imbalance, and emphasize the need for context-aware classification methods in automated content moderation.
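
To make the TF-IDF plus logistic regression combination named in the abstract more concrete, the following is a minimal pure-Python sketch. It is not the authors' implementation: the paper's preprocessing (HuSpaCy), its SBERT and MLP variants, and its Facebook dataset are not reproduced here. The toy English comments, the labels, and all function names (`tokenize`, `fit_idf`, `tfidf`, `train`, `predict_proba`) are assumptions introduced only for illustration.

```python
import math
from collections import Counter

# Invented toy comments (NOT from the paper's dataset).
# Label 1 = body-shaming, 0 = neutral.
TRAIN = [
    ("you are so fat and ugly", 1),
    ("nobody wants to see that fat belly", 1),
    ("ugly skinny legs", 1),
    ("what a lovely photo", 0),
    ("great trip have fun", 0),
    ("congratulations on the new job", 0),
]

def tokenize(text):
    # Whitespace tokenizer; a real pipeline would lemmatize Hungarian text.
    return text.lower().split()

def fit_idf(docs):
    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1.
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}

def tfidf(text, idf):
    # Sparse, L2-normalized TF-IDF vector; unknown tokens are ignored.
    counts = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, idf, epochs=500, lr=0.5):
    # Binary logistic regression fitted with plain stochastic gradient descent.
    weights = {t: 0.0 for t in idf}
    bias = 0.0
    vecs = [(tfidf(text, idf), y) for text, y in samples]
    for _ in range(epochs):
        for vec, y in vecs:
            p = sigmoid(bias + sum(weights[t] * v for t, v in vec.items()))
            err = p - y  # gradient of the log loss with respect to the logit
            bias -= lr * err
            for t, v in vec.items():
                weights[t] -= lr * err * v
    return weights, bias

def predict_proba(text, idf, weights, bias):
    # Probability that the comment belongs to class 1.
    vec = tfidf(text, idf)
    return sigmoid(bias + sum(weights[t] * v for t, v in vec.items()))

idf = fit_idf([text for text, _ in TRAIN])
weights, bias = train(TRAIN, idf)

for comment in ["such an ugly fat face", "lovely photo of the trip"]:
    print(comment, round(predict_proba(comment, idf, weights, bias), 3))
```

A bag-of-words model like this scores each token independently of its neighbours, which may be one reason the abstract stresses ambiguity and the need for context-aware methods.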

Item Type:	Book Section
Additional Information:	International Conference on Formal Methods and Foundations of Artificial Intelligence, Eger, June 5–7, 2025
Uncontrolled Keywords:	Hungarian text analysis, toxic comment filtering, social media moderation, body-shaming detection, machine learning classification
Subjects:	Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Depositing User:	Tibor Gál
Date Deposited:	30 Oct 2025 13:23
Last Modified:	30 Oct 2025 14:34
URI:	https://real.mtak.hu/id/eprint/227743
