Agritrop

A new method to extract n-Ary relation instances from scientific documents

Lentschat Martin, Buche Patrice, Dibie-Barthélemy Juliette, Roche Mathieu. 2022. A new method to extract n-Ary relation instances from scientific documents. Expert Systems with Applications, 209:118332, 16 p.

Journal article ; Research article ; Journal article with impact factor
Published version - English
Access restricted to Cirad staff
Use subject to authorization by the author or Cirad.
Lentschat_et_al_ESWA2022.pdf


Dataset URL - Cirad Dataverse: https://doi.org/10.18167/DVN1/U7HK8J / Dataset URL - Cirad Dataverse: https://doi.org/10.18167/DVN1/GCZBC9

HCERES journal list (humanities and social sciences): yes

HCERES journal theme(s) (humanities and social sciences): Economics and management

Abstract: A new method to extract knowledge structured as n-Ary relations from scientific articles is presented. We designed and assessed different approaches to reconstruct instances of n-Ary relations extracted from scientific articles in experimental domains, driven by an Ontological and Terminological Resource (OTR) and based on a multi-feature representation of relations and their arguments. The proposed method starts with the identification of partial n-Ary relations in the tables of scientific articles and then seeks to complete them with argument instances found in the article texts. Based on the Scientific Publication Representation (SciPuRe) of textual arguments and the Scientific Table Representation (STaRe) of n-Ary relations, which originates from partial n-Ary relations extracted from document tables, we propose and evaluate different approaches for selecting the textual argument instances that could complement partial n-Ary relations: structural, frequentist and word embedding models. The application domain concerns food packaging, especially composition and permeability data. Experiments were conducted on a corpus of 332 relation instances composed of 1547 arguments. Corpora of full and partial relations recognized in document tables and argument instances extracted from texts are available online. The different methods and strategies obtained F-scores ranging from 0.34 to 0.74. These results show that the n-Ary relation reconstruction approach depends on the number of selected candidate argument instances.
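
The reconstruction step summarized in the abstract (a partial relation taken from a table, completed with argument instances found in the text) can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes a purely frequency-based selection, and the function name `complete_relation`, the argument names and all values are hypothetical and invented for illustration. The `top_k` parameter stands in for the number of selected candidate argument instances mentioned in the abstract.

```python
from collections import Counter


def complete_relation(partial, candidates, top_k=1):
    """Fill the missing arguments of a partial n-ary relation.

    partial: dict mapping argument name -> value, or None when missing.
    candidates: list of (argument_name, value) pairs found in the text.
    Returns a dict mapping each argument name to a list of values:
    the known value, or up to top_k most frequent candidate values.
    """
    # Count how often each candidate value appears, per argument name.
    counts = {}
    for name, value in candidates:
        counts.setdefault(name, Counter())[value] += 1

    completed = {}
    for name, value in partial.items():
        if value is not None:
            # Argument already known from the table: keep it.
            completed[name] = [value]
        else:
            # Missing argument: keep the top_k most frequent candidates.
            ranked = counts.get(name, Counter()).most_common(top_k)
            completed[name] = [v for v, _ in ranked]
    return completed


# Hypothetical partial relation (values are made up for illustration).
partial = {
    "packaging": "PLA film",
    "permeability": "1.2e-10",
    "temperature": None,
    "relative_humidity": None,
}
# Hypothetical argument instances extracted from the article text.
candidates = [
    ("temperature", "23 °C"),
    ("temperature", "23 °C"),
    ("temperature", "38 °C"),
    ("relative_humidity", "50 %"),
]

result = complete_relation(partial, candidates, top_k=1)
```

Raising `top_k` returns more candidates per missing argument, a simple way to explore the trade-off the abstract points to between the number of selected candidates and the quality of the reconstruction.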

Agrovoc keywords: food composition, ontology

Free keywords: Natural Language Processing, Knowledge extraction, Information extraction, N-ary relation, Ontological and Terminological Resource, Smart data, SciPuRe (Scientific Publication Representation), STaRe (Scientific Table Representation)

Cirad strategic field: CTS 7 (2019-) - Outside strategic fields

Funding agencies outside the EU: Montpellier Université d'Excellence

Authors and affiliations

  • Lentschat Martin, CIRAD-ES-UMR TETIS (FRA)
  • Buche Patrice, INRAE (FRA)
  • Dibie-Barthélemy Juliette, INRAE (FRA)
  • Roche Mathieu, CIRAD-ES-UMR TETIS (FRA) ORCID: 0000-0003-3272-8568

Source: Cirad-Agritrop (https://agritrop.cirad.fr/601747/)


[ Page generated and cached on 2024-02-09 ]