
Questions & Answers

Who participated?

The IWGSC article has almost 100 authors representing 23 laboratories in 14 countries. They include the groups that generated the data by providing wheat lines, flow-sorting DNA and sequencing, as well as the bioinformaticians and wheat geneticists who analyzed the data.

What did they do?

They produced the first survey of the gene content and composition (also called a “draft sequence”) of the 21 individual chromosomes of bread wheat. Out of 124,201 sequences that could be recognized as genes or gene loci, more than 75,000 were positioned along the chromosomes.

They also produced and/or used whole genome shotgun sequences to identify genes in species of diploid and tetraploid wheat that are related to the ancestors of hexaploid bread wheat.

Finally, they measured gene expression in five different tissues of wheat plants taken at three stages of development and studied how genes from different chromosomes behave in a hexaploid context.

The data that have been generated provide a unique resource for accelerating gene mapping and marker development in wheat breeding, as well as for analyzing how wheat genes are used and how they have evolved.

How did they do it?

Using DNA isolated from individual wheat chromosomes, they produced sequence information covering each base at least 30 times on average, using high-throughput short-read DNA sequencing technologies (so-called ‘Next Generation Sequencing’). The sequences were then assembled into short fragments of unique DNA (contigs). These contigs were assigned positions along the chromosomes with an approach called the GenomeZipper, which produces a pseudo-order by aligning the contigs against wheat genetic maps and other sequenced grass genomes.
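To give a feel for what "at least 30 times on average" implies, the following minimal sketch applies the standard relation coverage = number of reads × read length / target size. It is a back-of-the-envelope illustration, not the consortium's own calculation: it assumes the whole 17 Gb genome as the target and the read lengths of 100 to 150 bases quoted elsewhere in this Q&A, and the function name reads_needed is made up for this example.

```python
import math


def reads_needed(target_bases: int, read_length: int, coverage: float) -> int:
    """Reads required so that each base is covered `coverage` times on average.

    Uses coverage = reads * read_length / target_bases, solved for reads.
    Hypothetical helper for illustration only.
    """
    return math.ceil(coverage * target_bases / read_length)


GENOME_SIZE = 17_000_000_000  # 17 Gb, the bread wheat genome size given in this Q&A
COVERAGE = 30                 # the minimum average coverage given in this Q&A

for read_length in (100, 150):  # assumed read lengths, in bases
    n = reads_needed(GENOME_SIZE, read_length, COVERAGE)
    print(f"{read_length}-base reads: about {n / 1e9:.1f} billion reads")
```

Under these assumptions, 30-fold coverage corresponds to several billion short reads. In the project itself the reads were generated chromosome by chromosome, so the same arithmetic would apply to each chromosome's own size.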

Are the results publicly available?


A central IWGSC repository has been established within the national plant bioinformatics database in France, at the Unité de Recherche Génomique (URGI) of the Institut National de la Recherche Agronomique (INRA), to provide public access to wheat sequence data and other IWGSC resources such as physical maps and marker data. All data from the survey sequence initiative are available to the scientific community through this repository.

The raw sequence reads for each chromosome have also been deposited in the Short Read Archive (SRA) at the European Bioinformatics Institute (EBI), and the assembled sequence data have been integrated into Ensembl Plants at EBI.

How are those results significant and how will they be used?

For scientists: the draft sequences provide a unique resource for rapidly localizing specific genes on the chromosomes and for comparative genomics analysis with related species to study the evolution of the gene content in the wheat lineage. They also provide a foundation for analyzing the expression of homoeologous genes in a large polyploid genome.

For plant breeders: the draft sequences provide the putative location of about 75,000 genes along the wheat chromosomes and a substrate for designing genome-specific markers through sequence capture and resequencing activities. Breeders can then use marker sets to map genes and to deploy marker-assisted breeding and genomic selection schemes.

How much did that cost?

The IWGSC estimates the project cost to be on the order of €200,000 for DNA preparation and sequencing, which was carried out by seven laboratories around the world. Salaries and any costs associated with bioinformatics and analyses are not included in this estimate.

What’s the next step?

The IWGSC’s focus is to continue its efforts to obtain a complete wheat genome sequence, also called a reference sequence. More than 80% of the wheat genome consists of repetitive sequences, which cannot be assembled unambiguously from the short stretches of 100 to 150 bases generated by current high-throughput sequencing technologies. As a result, the survey sequences are still very fragmented. They do not provide long stretches that include regulatory sequences and several genes in a single sequence. Moreover, they provide only partial information about the order and orientation of the contiguous sequences, and thus of the genes and surrounding sequences that make up the chromosomes. The information available for positional cloning of genes underlying agronomically important traits is therefore incomplete.
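The reason short reads struggle with repeats can be illustrated with a toy example. The sketch below is purely illustrative and is not part of the IWGSC pipeline: the sequences are invented, and the helper find_all is a hypothetical name. It shows that a read lying entirely inside a repeated element matches more than one place, whereas a longer read that reaches into the unique flanking sequence matches only one.

```python
def find_all(genome: str, read: str) -> list[int]:
    """Return every start position where `read` matches `genome` exactly."""
    positions = []
    start = genome.find(read)
    while start != -1:
        positions.append(start)
        start = genome.find(read, start + 1)
    return positions


# Invented sequences: one repeated element placed between unique flanks.
REPEAT = "ACGTACGTTAGC"
genome = "GGATCC" + REPEAT + "TTGACA" + REPEAT + "CCATGG"

short_read = REPEAT[2:10]             # lies entirely inside the repeat
long_read = "ATCC" + REPEAT + "TTGA"  # spans the repeat and both unique flanks

print("short read matches at:", find_all(genome, short_read))
print("long read matches at:", find_all(genome, long_read))
```

In this toy genome the short read fits both copies of the repeat, so an assembler cannot tell which copy it came from, while the long read has a single placement. With over 80% of the wheat genome being repetitive, the same ambiguity arises at a vastly larger scale.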

In the same issue of Science, another article reports the completion of the first reference sequence for the largest chromosome, 3B. This establishes a proof of concept and a template for sequencing the remaining 20 chromosomes.

When will the wheat genome sequence finally be completed?

Wheat chromosome 3B has demonstrated that it is both necessary and possible to sequence the wheat genome to a high quality. On that basis, the IWGSC has developed a three-year plan to complete the reference sequences for the remaining 20 chromosomes. Sequencing of some chromosomes is already underway, and the whole genome could be completed by 2017 if the remaining funding were to become available immediately.

The aim is to obtain a high quality sequence that provides an accurate representation of the structure and organization of sequences along individual chromosomes, enabling the identification of the positions of genes, regulatory elements, repetitive elements, sequence-based markers, and other features.

Why does it take so long?

Obtaining a high quality reference sequence of the genome of bread wheat is a scientific challenge. With a size of 17 Gb, the wheat genome is more than five times larger than the human genome. In addition, the bread wheat genome comprises 21 chromosomes originating from three individual subgenomes that contain highly similar gene sets and have a repeat content of over 80%. Because of these issues, the IWGSC decided that the only approach that would deliver a high quality reference was to reduce the complexity and follow a strategy similar to that used for other high quality reference genomes – such as human, mouse, zebrafish, Arabidopsis and rice – namely, sequencing each chromosome. It has been a challenge to secure funding in part because investment into wheat research is generally lower than in other major crops such as maize, despite the importance of wheat as a major source of human food.

Why is it useful to sequence the wheat genome?

The world is facing enormous challenges with a human population projected to rise to over 9 billion by 2050. Food production will need to increase by over 50% without expanding land use in the face of a changing climate and with dwindling availability of fertilisers, water and effective pest treatment. To produce sufficient wheat for the human population in the future, there is an urgent need to develop new wheat varieties with higher yield, better resistance to diseases and pests, and tolerance to abiotic stresses such as drought, high salinity or high aluminium content of the soils.

Once the reference genome sequence is completed, breeders will have at their disposal tools to identify genes and regulatory elements underlying complex traits and accelerate improvement through genomics assisted breeding and biotechnology. In particular, they will be able to understand the interplay between sets of similar genes that reside in the genome and can be regulated in different ways. Using this information, breeders will be able to produce a new generation of wheat varieties with higher yields and improved sustainability to meet the demands of a growing world population in a changing environment. The draft genome sequence provides a useful first step towards this goal.

Wasn’t the wheat genome sequenced two years ago?

In 2012, a group in the United Kingdom reported the production of sequence information that represented approximately 5-fold coverage of the wheat genome, covering up to 70% of the non-repetitive regions, and allowed them to identify approximately 95,000 gene sequences. They made putative assignments of genes to each of the A, B and D subgenomes, based on similarity to two diploid wheat species that are relatives of the wheat A and D genomes and that had been sequenced by two Chinese groups. However, they were not able to assign genes to individual chromosomes, and they did not provide information that allows breeders to identify differences between genes that lie on chromosomes within each of the A, B and D subgenomes.