Agritrop

AgriCode: Automated coding for qualitative research and its application to the valorization of agricultural residues

Koptelov Maksim, Linck Jan, Bisquert Pierre, Buche Patrice, Roche Mathieu. 2025. AgriCode: Automated coding for qualitative research and its application to the valorization of agricultural residues. SoftwareX, 31:102258, 10 p.

Journal article; Research article; Journal article with impact factor; Fully open-access journal
Published version - English
Licensed under a Creative Commons license.
Koptelov_SoftwareX_2025.pdf


URL - dataset - Other repository: https://doi.org/10.57745/91BXIY

Abstract: Qualitative research, widely employed across various academic fields, explores phenomena using non-numerical data, with a particular focus on understanding the meanings, experiences, and perspectives of participants. In contrast to other types of research, it seeks to answer how, where, what, when and why individuals behave or respond in certain ways toward specific issues or topics. Qualitative research involves collecting and analyzing textual data, with interviews playing a central role in gathering expert knowledge. An essential part of data analysis is coding, which uses a specially developed code system hierarchy to categorize and organize responses and to facilitate the retrieval of insights. Manual data coding is labor-intensive; to automate this process, we developed the AgriCode tool, based on machine learning and manually annotated data. To address data scarcity and improve the prediction quality of our offline classifiers, we perform data augmentation using Retrieval-Augmented Generation (RAG), a state-of-the-art method originally designed for online Q&A systems. Our tool automates the coding of interview responses within the Horizon Europe Agriloop project, which focuses on agricultural waste in the food industry. AgriCode predicts a subset of a predefined code system hierarchy, assisting a human coder by accelerating the process and identifying errors in manual coding. Although initially designed for the valorization of agricultural residues, AgriCode's methodology can be adapted to any qualitative research domain characterized by data scarcity and the need for automated textual analysis. To achieve this, responses from the first round of interviews must be manually annotated using a dedicated code system hierarchy. They can then be used for fine-tuning the model, while the RAG method can be employed to address the lack of data for certain classes.
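As an illustration of the idea of predicting a subset of a code hierarchy and flagging possible errors in manual coding, the following is a minimal sketch and not AgriCode's actual implementation. The code labels, the per-code scores, the 0.5 threshold, and all function names are assumptions made for this example; the abstract does not specify them.

```python
from dataclasses import dataclass

# Hypothetical code hierarchy: child code -> parent code (None for roots).
# These labels are invented for illustration; they are not AgriCode's code system.
CODE_PARENT = {
    "Barriers": None,
    "Barriers/Economic": "Barriers",
    "Barriers/Regulatory": "Barriers",
    "Drivers": None,
    "Drivers/Environmental": "Drivers",
}


def with_ancestors(codes):
    """Expand a set of codes so that every ancestor in the hierarchy is included."""
    expanded = set()
    for code in codes:
        while code is not None:
            expanded.add(code)
            code = CODE_PARENT[code]
    return expanded


def select_codes(scores, threshold=0.5):
    """Keep codes whose (assumed) classifier score reaches the threshold."""
    return with_ancestors({c for c, s in scores.items() if s >= threshold})


@dataclass
class Review:
    missing: set  # predicted by the model but absent from the manual coding
    extra: set    # present in the manual coding but not predicted


def compare(manual_codes, scores, threshold=0.5):
    """Flag disagreements between manual and predicted codes for human review."""
    predicted = select_codes(scores, threshold)
    manual = with_ancestors(manual_codes)
    return Review(missing=predicted - manual, extra=manual - predicted)


# Made-up scores for one interview response (assumed classifier output).
scores = {
    "Barriers/Economic": 0.82,
    "Barriers/Regulatory": 0.31,
    "Drivers/Environmental": 0.64,
}
review = compare({"Barriers/Economic"}, scores)
```

In this sketch, codes listed in `review.missing` or `review.extra` are only candidates for a second look by the human coder; the final decision stays with the coder.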

Agrovoc keywords: data analysis, agricultural waste

Free keywords: Text classification, Natural Language Processing, Data augmentation, Retrieval-augmented generation, Automatic coding, Qualitative research, Bioeconomy

European funding agencies: European Commission

Funded projects: (EU) Pushing the frontier of circular agriculture by converting residues into novel economic, social and environmental opportunities

Authors and affiliations

  • Koptelov Maksim, INRAE (FRA) - corresponding author
  • Linck Jan, ECOZEPT (DEU)
  • Bisquert Pierre, INRAE (FRA)
  • Buche Patrice, INRAE (FRA)
  • Roche Mathieu, CIRAD-ES-UMR TETIS (FRA) ORCID: 0000-0003-3272-8568

Source: Cirad-Agritrop (https://agritrop.cirad.fr/614273/)

