RefSeq v2.1 Assembly and Annotation now freely available at URGI and NCBI

The International Wheat Genome Sequencing Consortium (IWGSC) is pleased to announce that version 2.1 of the reference sequence of bread wheat, IWGSC RefSeq v2.1, and the accompanying IWGSC Annotation v2.1 are now available from the IWGSC data repository hosted by URGI-INRAE and from NCBI under BioProject PRJNA669381.

A revised version of the reference wheat genome has been completed under the leadership of Mingcheng Luo and Jan Dvorak (UC Davis, CA, USA), with funding from the US National Science Foundation grant IOS-1929053 and the USDA Agricultural Research Service CRIS project 2030-21430-014-00-D. It is available for use without restriction.

The genome assembly of Triticum aestivum cv. Chinese Spring (IWGSC RefSeq v1.0; IWGSC, 2018) was revised using whole-genome optical maps and contigs assembled from whole-genome-shotgun (WGS) PacBio SMRT reads (Zimin et al. 2017). Optical maps were used to detect and resolve chimeric scaffolds, anchor unassigned scaffolds, correct ambiguities in the positions and orientations of scaffolds, create super-scaffolds, and estimate gap sizes more accurately. PacBio contigs were used for gap closing. Pseudomolecules of the 21 Chinese Spring chromosomes were reconstructed to produce a new reference sequence, IWGSC RefSeq v2.1. The revisions affected approximately 10% of the sequence length of IWGSC RefSeq v1.0.

A new annotation, IWGSC Annotation v2.1, was completed to accompany RefSeq v2.1 under the leadership of Frédéric Choulet and Hélène Rimbert (INRAE), with funding from the French Government managed by the French National Research Agency (ANR) under the Investments for the Future program (BreedWheat project ANR-10-BTBR-03). Initially, the previous annotation was updated to IWGSC Annotation v1.2 by integrating a set of 117 novel genes and 81 microRNAs, many of which had been curated manually by the wheat community. This interim gene annotation was used to annotate IWGSC RefSeq v2.1. The transposable elements (TEs) in the resulting assembly, IWGSC RefSeq v2.1, were reannotated, and the gene annotation was updated by transferring the previously known gene models (v1.1) using a fine-tuned, dedicated strategy implemented in the Marker-Assisted Gene Annotation Transfer for Triticeae (MAGATT) pipeline. The newly released IWGSC Annotation v2.1 contains 266,753 genes, comprising 106,913 high-confidence (HC) genes and 159,840 low-confidence (LC) genes.
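For readers who want to check the HC and LC gene totals against a downloaded annotation, the following is a minimal sketch of how gene features in a GFF3 file could be tallied. It rests on assumptions not stated in this announcement: that the annotation is distributed as GFF3, and that low-confidence gene IDs end in "LC". The function name, the sample lines, and the gene IDs in them are hypothetical and are used only for illustration.

```python
from collections import Counter


def count_genes_by_confidence(gff3_lines):
    """Count 'gene' features, split into HC and LC by an assumed ID suffix."""
    counts = Counter()
    for line in gff3_lines:
        # Skip blank lines and GFF3 comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A GFF3 feature line has 9 tab-separated columns; column 3 is the type.
        if len(fields) != 9 or fields[2] != "gene":
            continue
        # Column 9 holds key=value attributes separated by ';'.
        attrs = dict(
            pair.split("=", 1) for pair in fields[8].split(";") if "=" in pair
        )
        gene_id = attrs.get("ID", "")
        # Assumption: low-confidence gene IDs end with "LC".
        counts["LC" if gene_id.endswith("LC") else "HC"] += 1
    return counts


# Hypothetical sample lines; real data would be read from the downloaded file.
sample = [
    "##gff-version 3",
    "Chr1A\tIWGSC_v2.1\tgene\t100\t900\t.\t+\t.\tID=GeneA0001",
    "Chr1A\tIWGSC_v2.1\tmRNA\t100\t900\t.\t+\t.\tID=GeneA0001.1;Parent=GeneA0001",
    "Chr1A\tIWGSC_v2.1\tgene\t2000\t2600\t.\t-\t.\tID=GeneA0002LC",
]
counts = count_genes_by_confidence(sample)
print(counts["HC"], counts["LC"], sum(counts.values()))
```

If those assumptions hold for the released files, passing an open file handle instead of `sample` should give totals that can be compared with the figures reported above (106,913 + 159,840 = 266,753).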

Direct links to download the data

Reference

  • Zhu, T., Wang, L., Rimbert, H., Rodriguez, J.C., Deal, K.R., De Oliveira, R., Choulet, F., Keeble‐Gagnère, G., Tibbits, J., Rogers, J., Eversole, K., Appels, R., Gu, Y.Q., Mascher, M., Dvorak, J. and Luo, M.‐C. (2021), Optical maps refine the bread wheat Triticum aestivum cv Chinese Spring genome assembly. The Plant Journal. Accepted Author Manuscript. https://doi.org/10.1111/tpj.15289