GOBii, a scalable genomics data management system with rapid data extract times and integration with downstream genomic selection analysis pipelines

Nti-Addae Yaw, Ulat Victor Jun, Matthews Dave, Sempere Guilhem, Guignon Valentin, Larmande Pierre, Renner Jon, Petel Adrien, Jones Elizabeth, Robbins Kelly. 2019. GOBii, a scalable genomics data management system with rapid data extract times and integration with downstream genomic selection analysis pipelines. PAG. San Diego: PAG, Abstract. Plant and Animal Genome Conference (PAG). 27, San Diego, United States, 12 January 2019/16 January 2019.

Conference paper without proceedings
Published version - English
Use subject to authorization from the author or Cirad.
ID600979.pdf (131kB)

Published version - English
Use subject to authorization from the author or Cirad.
ID600979_diaporama.pdf (319kB)

Publisher URL: https://pag.confex.com/pag/xxvii/meetingapp.cgi/Session/5845

Accompanying material: 1 slideshow (11 slides)

Abstract: The Genomic Open-Source Breeding informatics initiative (GOBii) has built a genomics data management system that is highly scalable and has focused on data extract performance for large genomics data files. We have benchmarked several SQL and NoSQL open-source data management systems with a view to managing large-scale genomics data, and have determined that the HDF5 file system outperformed other data management systems in both loading and extract times. In order to also accommodate metadata management, we have designed and developed a hybrid system based on Postgres for sample and marker metadata management and HDF5 for the large genomics files. The HDF5 genomics files are stored in two different orientations to enable rapid extract in either sample-fast or marker-fast formats. The system is flexible enough to be used across different crops and with diverse marker and sequence-based platforms. We are now working to integrate the genomics data extracts with downstream genomic selection applications in Galaxy.
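
The hybrid layout described in the abstract can be illustrated with a small sketch. This is not GOBii code: to keep it runnable with the Python standard library alone, sqlite3 stands in for Postgres and two flat byte buffers stand in for the two HDF5 orientations. All table names, function names, and data below are hypothetical. The point it tries to show is that storing the genotype matrix twice lets both a per-sample and a per-marker extract be served as a single contiguous read.

```python
import sqlite3

# Hypothetical toy data: 3 samples x 4 markers, one byte per genotype call.
samples = ["S1", "S2", "S3"]
markers = ["M1", "M2", "M3", "M4"]
calls = [
    b"ACGT",  # S1
    b"AAGT",  # S2
    b"ACGG",  # S3
]

# Metadata store (assumption: sqlite3 used here in place of Postgres).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sample (name TEXT PRIMARY KEY, idx INTEGER)")
db.execute("CREATE TABLE marker (name TEXT PRIMARY KEY, idx INTEGER)")
db.executemany("INSERT INTO sample VALUES (?, ?)",
               [(name, i) for i, name in enumerate(samples)])
db.executemany("INSERT INTO marker VALUES (?, ?)",
               [(name, i) for i, name in enumerate(markers)])

# Matrix store (assumption: flat buffers used here in place of HDF5 datasets).
# Sample-fast: all calls of one sample are contiguous.
sample_fast = b"".join(calls)
# Marker-fast: the transposed copy, all calls of one marker are contiguous.
marker_fast = bytes(
    calls[s][m] for m in range(len(markers)) for s in range(len(samples))
)

def lookup(table, name):
    """Resolve a name to its matrix index via the metadata store.

    `table` is only ever one of the two fixed table names above.
    """
    row = db.execute(
        f"SELECT idx FROM {table} WHERE name = ?", (name,)
    ).fetchone()
    return row[0]

def extract_sample(name):
    """Return all marker calls for one sample as one contiguous slice."""
    i, width = lookup("sample", name), len(markers)
    return sample_fast[i * width:(i + 1) * width]

def extract_marker(name):
    """Return all sample calls for one marker as one contiguous slice."""
    j, width = lookup("marker", name), len(samples)
    return marker_fast[j * width:(j + 1) * width]
```

In this sketch, extract_sample("S2") reads only from the sample-fast copy and extract_marker("M2") only from the marker-fast copy, so neither extract has to stride across the other dimension. The cost of this design is that the matrix is stored twice.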

Free keywords: NoSQL database, SNP markers, INDELs, VCF, Web tool, Interoperability

Authors and affiliations

  • Nti-Addae Yaw, GOBii (USA)
  • Ulat Victor Jun, CIMMYT
  • Matthews Dave, Cornell University (USA)
  • Sempere Guilhem, CIRAD-BIOS-UMR INTERTRYP (FRA)
  • Guignon Valentin, Bioversity International (FRA)
  • Larmande Pierre, Université de Montpellier (FRA)
  • Renner Jon, University of Minnesota (USA)
  • Petel Adrien, CIRAD-BIOS-UMR PVBMT (REU)
  • Jones Elizabeth, GOBii (USA)
  • Robbins Kelly, Cornell University (USA)

Source: Cirad-Agritrop (https://agritrop.cirad.fr/600979/)

