Leblanc César, Bonnet Pierre, Servajean Maximilien, Joly Alexis.
2024. Pl@ntBERT: leveraging large language models to enhance vegetation classification through species composition analysis.
In: ECCB 2024 - 7th European Congress of Conservation Biology: Biodiversity positive by 2030. Book of abstracts. SCB
Published version
- English
Abstract: Biodiversity is under pressure, as many disturbance events threaten natural areas. Habitat distribution mapping is therefore increasingly relevant for monitoring the status of these areas. Such mapping aims to quantify the mathematical relationships between predictors and the occurrences of categorized locations, so advanced digital technologies are needed more than ever. They help summarize our knowledge of species assemblages. Herein, we present Pl@ntBERT, a framework that encodes vegetation patterns and improves their classification. This tool leverages computer science and linguistic processes based on transformers. In particular, the pipeline implements two artificial intelligence tasks: fill-mask and text classification. First, masked language modeling builds a statistical understanding of vascular plant compositions. Then, subsequent training assigns a label to sentences describing phytosociological relevés. Fine-tuning a pretrained foundation model on in-domain words yields a significant improvement and clearly outperforms previous state-of-the-art methods. The software pushes the accuracy score on a database containing millions of European surveys to 92.48%. Finally, our results show that flora is a strong marker of ecosystems and does not need to be coupled with environmental data to train neural networks. The proposed application has a vocabulary covering over ten thousand organisms. This approach offers a methodology for advancing our understanding of community ecology and conservation biology.
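To make the fill-mask step mentioned in the abstract more concrete, here is a minimal sketch of how a relevé might be turned into a "sentence" of species names and then masked for training. The abstract does not say how relevés are serialized, so everything specific here is an assumption made for illustration: the ordering by decreasing cover, the comma separator, the BERT-style `[MASK]` token, the function names, and the example relevé itself. It is not the authors' implementation.

```python
import random

MASK_TOKEN = "[MASK]"  # assumption: a BERT-style mask token


def releve_to_sentence(species_cover):
    """Join species names into one 'sentence', ordered by decreasing cover.

    Ties are broken alphabetically so the output is deterministic.
    The ordering and the ", " separator are illustrative assumptions.
    """
    ordered = sorted(species_cover.items(), key=lambda kv: (-kv[1], kv[0]))
    return ", ".join(name for name, _ in ordered)


def mask_one_species(sentence, rng):
    """Replace one randomly chosen species with the mask token.

    Returns the masked sentence and the hidden species, i.e. the
    (input, target) pair a masked language model would be trained on.
    """
    species = sentence.split(", ")
    idx = rng.randrange(len(species))
    target = species[idx]
    species[idx] = MASK_TOKEN
    return ", ".join(species), target


# Hypothetical relevé: species name -> percentage cover (made-up values).
releve = {
    "Fagus sylvatica": 75.0,
    "Hedera helix": 10.0,
    "Anemone nemorosa": 10.0,
}

sentence = releve_to_sentence(releve)
masked, target = mask_one_species(sentence, random.Random(0))
print(sentence)
print(masked, "->", target)
```

In the second task described in the abstract, the unmasked sentence would instead be paired with a habitat label and passed to a text classifier; that step is not shown here.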
Authors and affiliations
- Leblanc César, CIRAD-BIOS-UMR AMAP (FRA)
- Bonnet Pierre, CIRAD-BIOS-UMR AMAP (FRA) ORCID: 0000-0002-2828-4389
- Servajean Maximilien, CNRS (FRA)
- Joly Alexis, INRIA (FRA)
Source: Cirad-Agritrop (https://agritrop.cirad.fr/611843/)