Chinese Spring genome near-complete assembly (CS-CAU)

Chinese Spring (CS) is a landrace from China and a crucial cultivar used for genetic studies of wheat worldwide. Since its release, the CS-IWGSC genome assembly has been the most widely used wheat reference genome and has greatly promoted basic and applied research in wheat. However, the CS-IWGSC assembly still contains numerous gaps. Here, we generated a 14.46 Gb near-complete assembly of the CS genome (CS-CAU), with a contig N50 of more than 266 Mb and an overall base accuracy of 99.9963%, which fills almost all gaps in the CS-IWGSC genome assembly. The CS-CAU genome will serve as a valuable resource for research and breeding in wheat and its related species.
Chinese Spring genome near-complete assembly and annotation
A total of 4,100.33 Gb (283.56× coverage) of ultralong Oxford Nanopore Technologies (ONT) data, 420.60 Gb (29.01× coverage) of PacBio HiFi data, and 928.82 Gb (64.06× coverage) of MGI next-generation sequencing (NGS) data were generated for assembly of the wheat CS genome. The final near-complete CS genome assembly (CS-CAU) is 14.46 Gb in size, with a contig N50 of 266.44 Mb. The assembled sizes of the A, B, and D subgenomes are 5.01, 5.37, and 4.07 Gb, respectively. Four chromosomes of the D subgenome (chr1D, chr3D, chr4D, and chr5D) are gap-free, and two of them, chr1D and chr5D, are fully resolved from telomere to telomere. Of the 290 remaining gaps, 26, 257, and 7 are located in the A, B, and D subgenomes, respectively. The centromeric sequences of 20 of the 21 chromosomes were completely assembled; the exception is the centromere of chr1B, which has a gap associated with superlong GAA repeat arrays.
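To make the contig N50 and gap counts above concrete, the following is a minimal Python sketch of how such statistics are commonly computed from a chromosome-level sequence. It assumes that gaps are encoded as runs of N characters, which is a common FASTA convention but is not stated on this page; the function names are hypothetical and this is not the pipeline used to produce the CS-CAU figures.

```python
import re

# Assumption: gaps inside a chromosome sequence are written as runs of "N".
GAP_PATTERN = re.compile(r"[Nn]+")


def split_contigs(sequence):
    """Split one scaffold/chromosome at gap runs and return the contig lengths."""
    return [len(piece) for piece in GAP_PATTERN.split(sequence) if piece]


def count_gaps(sequence):
    """Count the gap runs (stretches of N) in one sequence."""
    return len(GAP_PATTERN.findall(sequence))


def contig_n50(lengths):
    """Return the N50: the largest length L such that contigs of
    length >= L together cover at least half of the total length."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0  # empty input
```

Under the same assumption, a chromosome such as chr1D, which is reported as gap-free, would yield a single contig and a gap count of zero.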
Gene annotation of the CS assembly was performed using PASA and Mikado with RNA-seq evidence and GeMoMa with protein homology evidence, complemented by ab initio predictions using Fgenesh3. In total, 274,018 protein-coding genes were annotated, including 151,405 high-confidence (HC) genes. Of these HC genes, 59,180 are newly annotated relative to the currently widely used annotation of the CS genome (the IWGSC v1.0 annotation).

Schematic representation showing the locations of centromeres, gaps, and pSc119.2 and pAs1 repeats in CS-CAU.
Resources available: CS-CAU assembly and annotation
The CS-CAU assembly has been deposited at DDBJ/ENA/GenBank under the accession JBJQUP000000000.1.
Gene model (GFF3): wheat.CS.Gapless.gff3 [download].
Gene CDS sequence (fa): wheat.CS.Gapless.cds.fa [download].
Protein sequence of high-confidence gene (fa): wheat.CS.Gapless.HC.pep.fa [download].
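As a quick sanity check after downloading the gene model file, gene features can be tallied per subgenome. The sketch below is a hedged example only: it assumes standard nine-column, tab-separated GFF3, that gene records use the feature type "gene", and that sequence IDs end in the subgenome letter (for example chr1A). None of these details is confirmed on this page, and the function name is hypothetical.

```python
from collections import Counter

SUBGENOMES = {"A", "B", "D"}


def count_genes_by_subgenome(lines):
    """Count GFF3 'gene' features per subgenome.

    Assumption: the sequence ID in column 1 ends with the subgenome
    letter (e.g. 'chr1A'); anything else is counted as 'other'.
    """
    counts = Counter()
    for line in lines:
        # Skip blank lines, directives, and comments.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Standard GFF3 has nine tab-separated columns; column 3 is the type.
        if len(fields) < 9 or fields[2] != "gene":
            continue
        last_char = fields[0][-1:]
        counts[last_char if last_char in SUBGENOMES else "other"] += 1
    return counts
```

The function accepts any iterable of lines, so an open file handle for wheat.CS.Gapless.gff3 can be passed directly, for example `count_genes_by_subgenome(open("wheat.CS.Gapless.gff3"))`.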
Citation
Zijian Wang*, Lingfeng Miao*, Kaiwen Tan, Weilong Guo, Beibei Xin, Rudi Appels, Jizeng Jia, Jinsheng Lai, Fei Lu#, Zhongfu Ni#, Xiangdong Fu#, Qixin Sun#, Jian Chen#. (2025). Near-complete assembly and comprehensive annotation of the wheat Chinese Spring genome, Molecular Plant, doi:10.1016/j.molp.2025.02.002.
*These authors contributed equally to this work.
#Correspondence and requests for materials should be addressed to Jian Chen (jianchen@cau.edu.cn), Qixin Sun (qxsun@cau.edu.cn), Xiangdong Fu (xdfu@genetics.ac.cn), Zhongfu Ni (nizf@cau.edu.cn), Fei Lu (flu@genetics.ac.cn)
External links
• Triticeae Multi-omics Center:
The Triticeae Multi-omics Center collects and hosts data for species in the Triticeae tribe. Useful utilities, such as
- Genome BLAST,
- Sequence retrieval,
- Genome browser
and others are available for CS_gapless.
• Triticeae-GeneTribe (TGT):
A collinearity-incorporating homology database for Triticeae genomes. The CS_gapless genome has been incorporated into TGT and is ready for analysis.
Quick examples:
- Finding homologues between CS_gapless and Chinese Spring (IWGSC RefSeqv1.1),
- Macro collinearity of chr1A between CS_gapless and Chinese Spring (IWGSC RefSeqv1.1).