
Integrating Quasi-symbolic Conceptual Knowledge into Language Model Pre-training

Berend, Gábor (2024) Integrating Quasi-symbolic Conceptual Knowledge into Language Model Pre-training. In: The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning.

2024.conll-babylm.13.pdf - Published Version

Abstract

In this paper, we investigate the integration of latent conceptual knowledge into the pre-training of masked language models. Our solution is based on the use of an auxiliary model, from which we extract training signals for training a student model. We determine the training signals from the hidden representations of the student model in an unsupervised way, using sparse coding. Models trained on latent concepts alone show improved fine-tunability on downstream tasks; however, they perform worse on traditional language modeling, i.e., when the goal is to output missing tokens as opposed to latent semantic classes of words. To preserve the improved fine-tuning capability of the models while making them better at language modeling, we propose a final stage of pre-training, during which we perform traditional masked language modeling. This final stage starts from a model that has already been pre-trained on the task of modeling latent semantic properties, with the weights of the backbone model frozen. During the final training phase, we train only a lightweight linear classifier layer on top of the logits that the model produces for the latent semantic properties. With this modification, we obtain the benefits of both the traditional training paradigm and the one based on latent semantic properties. We release our source code at github.com/SzegedAI/MLSM.
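
The final pre-training stage described in the abstract (frozen backbone, trainable linear classifier over the latent-property logits) can be illustrated with a small sketch. This is not the author's implementation, which is available in the released source code; it is a toy NumPy version under stated assumptions. The backbone is replaced by a fixed random projection, the data are random, and all names and sizes (`backbone_W`, `head_W`, `NUM_CONCEPTS`, `VOCAB_SIZE`, the learning rate) are hypothetical. The sparse coding step that produces the latent-property targets is not covered here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes; the real model and vocabulary are far larger.
HIDDEN = 12        # hidden size of the stand-in backbone
NUM_CONCEPTS = 16  # number of latent semantic properties
VOCAB_SIZE = 8     # toy vocabulary size
NUM_TOKENS = 64    # masked positions in the toy batch
LR = 0.1           # assumed learning rate for plain gradient descent

# Stand-in for the backbone already pre-trained on latent properties.
# It is frozen: nothing below ever writes to backbone_W.
backbone_W = rng.normal(size=(HIDDEN, NUM_CONCEPTS)) / np.sqrt(HIDDEN)
backbone_W_before = backbone_W.copy()


def backbone_concept_logits(hidden_states):
    """Map hidden states to logits over latent semantic properties."""
    return hidden_states @ backbone_W


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def cross_entropy(probs, targets):
    picked = probs[np.arange(len(targets)), targets]
    return float(-np.log(picked + 1e-12).mean())


# Random toy data standing in for masked positions and their true tokens.
hidden_states = rng.normal(size=(NUM_TOKENS, HIDDEN))
target_tokens = rng.integers(0, VOCAB_SIZE, size=NUM_TOKENS)

# The only trainable parameters: a linear layer from latent-property
# logits to vocabulary logits.
head_W = np.zeros((NUM_CONCEPTS, VOCAB_SIZE))
head_b = np.zeros(VOCAB_SIZE)

# Because the backbone is frozen, its outputs can be computed once.
concept_logits = backbone_concept_logits(hidden_states)

losses = []
for step in range(200):
    probs = softmax(concept_logits @ head_W + head_b)
    losses.append(cross_entropy(probs, target_tokens))

    # Gradient of the mean cross-entropy w.r.t. the vocabulary logits.
    grad_logits = probs.copy()
    grad_logits[np.arange(NUM_TOKENS), target_tokens] -= 1.0
    grad_logits /= NUM_TOKENS

    # Update the linear head only; the backbone receives no update.
    head_W -= LR * (concept_logits.T @ grad_logits)
    head_b -= LR * grad_logits.sum(axis=0)

print(f"masked LM loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

In this sketch the token predictions are a linear function of the latent-property logits, so the representation learned in the earlier stage is left untouched while the model regains the ability to output tokens.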

Item Type: Conference or Workshop Item (Paper)
Subjects: T Technology (applied and technical sciences) > T2 Technology (General)
Depositing User: Gábor Berend
Date Deposited: 26 Sep 2025 08:13
Last Modified: 26 Sep 2025 08:13
URI: https://real.mtak.hu/id/eprint/225506
