BIOINFORMATICS

 

The platform has two objectives: to provide bioinformatics solutions that meet the needs of users and collaborators, and to develop new tools and environments for the analysis of biological data, particularly for genome and diversity studies.

ACTIVITIES

Our longest-standing expertise is genome annotation. We have contributed to the development of the EuGene and FrameD/FrameDP software for the structural annotation of genomes (in collaboration with Thomas Schiex, MIAT). In recent years, we have trained many bioinformatics colleagues in the use of these tools.

Once genomes have been assembled and annotated, both structurally and functionally, we maintain several genome portals over the long term, covering symbionts, pathogens and plants (the model legume Medicago truncatula, sunflower, rose, etc.).

At the post-genomic level, we develop integrative analysis environments and knowledge bases (Legoo, EffectorK, etc.) to integrate heterogeneous high-throughput data and/or knowledge. Another line of integration work concerns the modeling of metabolic and regulatory networks, in particular through the MetExplore project (in collaboration with Fabien Jourdan, TOXALIM).

We are currently developing an environment for studying the diversity of cultivated species through the construction and exploitation of allele catalogs (Atlas).

Beyond our own developments, we offer analysis protocols for epigenetic studies in plants and bacteria, transcriptomic analyses, SNP-based diversity analyses, functional annotation of proteins, etc.

Our sequence data management plan relies on the Archive environment, which we have been developing for a decade. Archive allows us to collect data and metadata as close as possible to the point of production, so that we can manage the full data lifecycle, from acquisition to publication (in public databases or by generating an Archive DOI).

At the technological level, our recent development projects are based on the Apache Spark ‘big data’ architecture, which brings together data analysis methods, distributed computing and coupling with data organized in data lakes. Our 1152 HT Spark computing cluster can also be used in SGE mode.

Our publications/contributions

Our developments and funding supports