Borovikova Mariya, Ferré Arnaud, Bossy Robert, Roche Mathieu, Nédellec Claire.
2024. Semantically-informed domain adaptation for named entity recognition.
In: Foundations of intelligent systems: 27th International Symposium, ISMIS 2024, Poitiers, France, June 17-19, 2024, Proceedings. Appice Annalisa (ed.), Azzag Hanane (ed.), Hacid Mohand-Said (ed.), Hadjali Allel (ed.), Ras Zbigniew (ed.).
Published version
- English
Access restricted to Cirad staff. Use subject to authorization by the author or Cirad. Borovikova_et_al_ISMIS2024.pdf (999kB)
Abstract: Named Entity Recognition (NER) is an important task in Natural Language Processing that involves identifying entities in unstructured text. State-of-the-art NER methods often require extensive manual labeling for training. To reduce this annotation burden, this paper introduces a domain adaptation technique that leverages semantic information about entity types using Sentence-BERT embeddings of their textual descriptions. We conduct experiments across various datasets from both general and biological domains, evaluating our approach in standard and zero-shot settings. Our experiments demonstrate the effectiveness of our method, which outperforms existing zero-shot techniques on certain datasets. Our findings underscore the importance of accurate semantic representations for entity types. This paper contributes to the advancement of zero-shot domain adaptation for NER and opens avenues for future research in improving NER systems' adaptability and performance across diverse domains.
Free keywords: Natural Language Processing, Language Model, Named entity recognition, Domain adaptation
Non-EU funding agencies: Agence Nationale de la Recherche
Funded projects: (FRA) Building epidemiological surveillance and prophylaxis with observations both near and distant
Authors and affiliations
- Borovikova Mariya, Université Paris-Saclay (FRA)
- Ferré Arnaud, Université Paris-Saclay (FRA)
- Bossy Robert, INRAE (FRA)
- Roche Mathieu, CIRAD-ES-UMR TETIS (FRA) ORCID: 0000-0003-3272-8568
- Nédellec Claire, INRAE (FRA)
Source: Cirad-Agritrop (https://agritrop.cirad.fr/609693/)