Mora-Fallas Adán, Goeau Hervé, Mazer Susan J., Love Natalie, Mata-Montero Erick, Bonnet Pierre, Joly Alexis. 2019. Accelerating the automated detection, counting and measurements of reproductive organs in herbarium collections in the era of deep learning. Biodiversity Information Science and Standards, 3:e37341, 3 p. Biodiversity Next: Building a global infrastructure for biodiversity data, Leiden, Netherlands, 22-25 October 2019.
Published version
- English
Abstract: Millions of herbarium records provide an invaluable legacy and knowledge of the spatial and temporal distributions of plants over centuries across all continents (Soltis et al. 2018). Due to recent efforts to digitize and to make publicly accessible most major natural collections, investigations of ecological and evolutionary patterns at unprecedented geographic scales are now possible (Carranza-Rojas et al. 2017, Lorieul et al. 2019). Nevertheless, biologists now face the problem of extracting basic information, such as textual descriptions, the numbers of organs, and measurements of various morphological traits, from a huge number of herbarium sheets. Deep learning technologies can dramatically accelerate the extraction of such basic information by automating the routines of organ identification, counts and measurements, thereby allowing biologists to spend more time on investigations such as phenological or geographic distribution studies. Recent progress on instance segmentation demonstrated by the Mask-RCNN method is very promising in the context of herbarium sheets, in particular for detecting with high precision different organs of interest on each specimen, including leaves, flowers, and fruits. However, like any deep learning approach, this method requires a significant number of labeled examples with fairly detailed outlines of individual organs. Creating such a training dataset can be very time-consuming and may be discouraging for researchers. We propose in this work to integrate the Mask-RCNN approach within a global system enabling an active learning mechanism (Sener and Savarese 2018) in order to minimize the number of outlines of organs that researchers must manually annotate. The principle is to alternate cycles of manual annotation, training updates of the deep learning model, and predictions on the entire collection to be processed.
The challenge of the active learning mechanism is then to estimate automatically, at each cycle, which objects would be most useful to annotate manually in the next cycle, so that an accurate model is learned in as few cycles as possible. We discuss experiments addressing the effectiveness, the limits, and the annotation time required by our approach, in the context of a phenological study of more than 10,000 reproductive organs (buds, flowers, fruits and immature fruits) of Streptanthus tortuosus, a species known to be highly variable in appearance and therefore very difficult for an instance segmentation deep learning model to process.
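The alternating cycle described in the abstract (annotate, retrain, predict on the whole collection, choose what to annotate next) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the paper: `train`, `predict` and `annotate` are placeholder callables standing in for Mask-RCNN training, inference and manual outlining, and the selection rule (pick the items whose confidence score is closest to 0.5) is a simple uncertainty heuristic chosen for illustration. It is not the authors' criterion, nor the core-set method of Sener and Savarese (2018).

```python
from typing import Any, Callable, Dict, List


def select_most_uncertain(scores: Dict[str, float], budget: int) -> List[str]:
    """Return up to `budget` item ids whose confidence is closest to 0.5.

    Ties are broken by item id so the result is deterministic.
    """
    ranked = sorted(scores, key=lambda item_id: (abs(scores[item_id] - 0.5), item_id))
    return ranked[:budget]


def active_learning_loop(
    pool: List[str],
    annotate: Callable[[str], Any],                      # hypothetical: manual outlining
    train: Callable[[Dict[str, Any]], Any],              # hypothetical: model (re)training
    predict: Callable[[Any, List[str]], Dict[str, float]],  # hypothetical: confidence per item
    budget: int,
    cycles: int,
) -> Dict[str, Any]:
    """Alternate training, prediction and manual annotation for `cycles` rounds."""
    labeled: Dict[str, Any] = {}
    for _ in range(cycles):
        unlabeled = [item for item in pool if item not in labeled]
        if not unlabeled:
            break  # nothing left to annotate
        model = train(labeled)              # update the model on current annotations
        scores = predict(model, unlabeled)  # score every item not yet annotated
        for item in select_most_uncertain(scores, budget):
            labeled[item] = annotate(item)  # manual step, limited to `budget` items
    return labeled
```

In this sketch the annotation effort per cycle is capped by `budget`, so the total manual work is at most `budget * cycles` items, whatever the size of the collection.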
Agrovoc keywords: herbarium, plant morphology, plant anatomy, botanical collection, identification, machine learning, data processing, plant reproductive organ
Additional keywords: deep learning, Streptanthus tortuosus
Free keywords: Herbarium collection, Plant phenology, Deep Learning, Instance detection, Phenophase
Agris classification: F50 - Plant anatomy and morphology
F70 - Plant taxonomy and phytogeography
C30 - Documentation and information
Cirad strategic field: CTS 1 (2019-) - Biodiversity
Authors and affiliations
- Mora-Fallas Adán, Instituto Tecnológico de Costa Rica (CRI)
- Goeau Hervé, CIRAD-BIOS-UMR AMAP (FRA) - corresponding author
- Mazer Susan J., UC (USA)
- Love Natalie, UC (USA)
- Mata-Montero Erick, Instituto Tecnológico de Costa Rica (CRI)
- Bonnet Pierre, CIRAD-BIOS-UMR AMAP (FRA) ORCID: 0000-0002-2828-4389
- Joly Alexis, INRIA (FRA)
Source: Cirad-Agritrop (https://agritrop.cirad.fr/598381/)