Manuel Spannagl

PGSB (formerly MIPS) has been engaged in the development of plant genome bioinformatics for over 15 years and has generated annotation for many plant genomes. The main focus of Manuel’s work includes gene prediction and annotation, genome structure, expression and gene family analyses, and comparative genomics. Manuel is also the database manager of PGSB PlantsDB, where genomic data from wheat (and many other plants) are structured, integrated and visualized.

After working on the analysis of several plant genomes including tomato, medicago, brachypodium and sorghum, Manuel joined the EU-funded TriticeaeGenome project in 2008; he became an IWGSC member that same year. Since then, he has been involved in the IWGSC survey sequence and the reference sequence projects.

Throughout 2017, Manuel worked with colleagues at INRA (France) and The Earlham Institute (UK) to produce the IWGSC RefSeq annotation v1.0. Manuel led the efforts at PGSB to generate gene models using the in-house annotation pipeline (previously called the MIPS pipeline). Manuel’s team at PGSB also contributed analyses on gene families/phylogenomics, gene expression, transposable elements and genome structure to the current work on the IWGSC RefSeq v1.0 assembly.

Please describe your contribution to the IWGSC

My earlier contributions to the IWGSC included IT infrastructure and data management/access work, in collaboration with URGI/INRA, namely the online version of the wheat GenomeZipper. After that I was involved in the gene prediction, genome analyses and data representation of the 2014 IWGSC genome sequence survey. More recently, I coordinated our contributions to the IWGSC RefSeq study, including gene prediction and annotation, repeat detection, transcriptome analysis, genome structure and organization, and phylogenomics and gene families.

You most recently worked to produce the IWGSC RefSeq annotation v1.0 with colleagues in France and the UK. What did you find most challenging about your work on this project?

Generating (somewhat) independent gene calls by different established pipelines and combining and integrating them to achieve a reference gene set for wheat was a novel but promising strategy, at least for us. For me the greatest challenge in this process was to find ways to evaluate the results in an objective, comprehensive and meaningful way. Annotations can be evaluated for many aspects and metrics, with each pipeline performing differently and no single “gold-standard” reference gene set to train with. Evaluation is also needed not only at the end of an annotation process (e.g. to select the “best” gene model at a locus) but also between the individual steps, in order to adjust parameters, choice of tools etc. In the end, we had an integrated gene evaluation and combination step (performed by Earlham) which, among other things, made extensive use of the very helpful full-length IsoSeq sequences.

As a bioinformatician, do you see the wheat genome reference sequence as the end or the beginning of the story?

I believe we are just at the beginning, because for so many questions we will gain all the power from comparing the genome reference sequence against the genome sequences of other wheats or cereals. I’m very happy to see numerous projects going on that aim to generate additional reference genome sequences in the near future, not only for bread wheat, but also for its progenitors, close relatives and the cereals in general. I’m convinced that the bioinformatics comparisons will teach us things we have not anticipated so far, for example about genome organization.

What do you think the biggest advances in wheat research will be in the next 5-10 years?

On the closer horizon I see great benefits from the (already ongoing) creation of a haplotype graph for wheat as well as from the generation of a larger panel of reference genome sequences. Both will have immediate impacts on breeding programs. In addition, I expect to see more and more studies using targeted mutagenesis with CRISPR/Cas9, once platforms and protocols are fully established in wheat.

You have been working on different plant species. Which one would you consider as the most challenging?

Bread wheat and barley for sure. Not just because we were looking at large and, for wheat, hexaploid genomes, but because of the consequences for gene prediction and genome analysis. The large numbers of pseudogenes, transposable elements and gene fragments complicate the identification of orthologs/homeologs and gene families and therefore called for specialized analysis pipelines and strategies. Another example: for wheat we found up to 10% more tandem duplicates than for other monocotyledonous species. That is a very interesting thing to look at in more detail, but it makes ortholog identification more challenging.

Is there a plant genome you would love to work on?

Up to now, I was mainly involved in plant genome sequencing projects either targeting model species or crops. I’d be interested to investigate the genome of a biologically interesting, but more “exotic” plant species one day…like the sequoia.

What are your future plans?

I’m looking forward to continuing my work on plant genomics. With the new resources and technologies that have become available, we can tackle questions right now that seemed out of reach to me for years.

About Manuel

While studying biology in high school, Manuel became fascinated by the possibilities that the emerging field of genomics had to offer. Therefore, in 2000, he decided to join the newly established bachelor course in “bioinformatics” – one of the very first offered in Germany – at the Technical University Munich and Ludwig-Maximilians-University. After obtaining his bachelor’s degree in 2003, he started a scientific staff job at MIPS (Munich Institute for Protein Sequences) Helmholtz in Klaus Mayer’s plant genomics group. At first, he was working more on the IT/software development side and became the database manager of MIPS PlantsDB. He really got involved in genomics with the international genome sequencing projects of Medicago, brachypodium, tomato, sorghum and finally the cereals. He decided to pursue his studies while still working in the plant genomics group and obtained his bioinformatics Master and PhD degrees in 2009 and 2015, respectively.
Manuel is – in his own words – a bicycle maniac: he likes everything about bikes, from riding and racing to building them. As he also enjoys travelling, he tries to combine both passions by doing multi-day mountain bike trips, as this is “the best way to free [his] mind”.