Could key word masking strategy improve language model?

Borovikova Mariya, Ferré Arnaud, Bossy Robert, Roche Mathieu, Nédellec Claire. 2023. Could key word masking strategy improve language model? In: Natural language processing and information systems: 28th International Conference on Applications of Natural Language to Information Systems, NLDB 2023, Derby, UK, June 21–23, 2023, Proceedings. Métais Elisabeth (ed.), Meziane Farid (ed.), Sugumaran Vijayan (ed.), Manning Warren (ed.), Reiff-Marganiec Stephan (ed.). Cham: Springer, 271-284. (Lecture Notes in Computer Science, 13913) ISBN 978-3-031-35319-2. International Conference on Applications of Natural Language to Information Systems (NLDB 2023), 28th, Derby, United Kingdom, 21-23 June 2023.

Conference paper with proceedings
Published version - English
Access restricted to Cirad staff.
Use subject to authorization by the author or Cirad.
Borovikova_et_al_NLDB2023.pdf


URL - dataset - other repository: https://doi.org/10.57745/HVPITE

Abstract: This paper presents an enhanced approach for adapting a Language Model (LM) to a specific domain, with a focus on Named Entity Recognition (NER) and Named Entity Linking (NEL) tasks. Traditional NER/NEL methods require large amounts of labeled data, which are time- and resource-intensive to produce. Unsupervised and semi-supervised approaches overcome this limitation but suffer from lower quality. Our approach, called KeyWord Masking (KWM), fine-tunes an LM for the Masked Language Modeling (MLM) task with a masking strategy focused on keywords. Our experiments demonstrate that KWM outperforms traditional methods in restoring domain-specific entities. This work is a preliminary step towards developing a more sophisticated NER/NEL system for domain-specific data.
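
Illustration (not taken from the paper): the abstract does not detail the masking procedure. The sketch below assumes that keyword masking means hiding tokens that match a domain keyword list, instead of hiding randomly selected tokens as in standard MLM. The function name `keyword_mask`, the `mask_prob` parameter, the keyword list, and the sample sentence are all hypothetical and chosen only for illustration.

```python
import random

MASK = "[MASK]"  # placeholder symbol, as used by BERT-style models


def keyword_mask(tokens, keywords, mask_prob=1.0, rng=None):
    """Mask tokens that appear in a keyword list (assumed KWM behaviour).

    Returns (masked_tokens, labels). labels[i] holds the original token
    where position i was masked, and None everywhere else.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    vocab = {k.lower() for k in keywords}
    masked, labels = [], []
    for tok in tokens:
        # Only keyword tokens are candidates for masking.
        if tok.lower() in vocab and rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


# Hypothetical plant-health sentence and keyword list.
tokens = "Xylella fastidiosa was detected on olive trees in Apulia".split()
keywords = ["Xylella", "fastidiosa", "olive"]
masked, labels = keyword_mask(tokens, keywords)
print(" ".join(masked))
```

In a real fine-tuning setup, the masked sequence and labels would be fed to an MLM training loop, so that the model is trained to restore domain-specific entities rather than arbitrary words.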

Free keywords: Natural Language Processing, Named entity recognition, Language Model, BERT, Plant disease surveillance, Epidemiological surveillance

Non-EU funding agencies: Agence Nationale de la Recherche

Funded projects: (FRA) Building epidemiological surveillance and prophylaxis with observations both near and distant

Authors and affiliations

  • Borovikova Mariya, Université Paris-Saclay (FRA)
  • Ferré Arnaud, Université Paris-Saclay (FRA)
  • Bossy Robert, INRAE (FRA)
  • Roche Mathieu, CIRAD-ES-UMR TETIS (FRA) ORCID: 0000-0003-3272-8568
  • Nédellec Claire, INRAE (FRA)

Other links for this publication

Source: Cirad-Agritrop (https://agritrop.cirad.fr/605120/)
