
Glossary of terms used frequently in genome sequencing.

This glossary was compiled using the following sources; please refer to them for additional information:


Annotation

The process of identifying regions of a genome sequence that are associated with specific functions and adding pertinent biological information to these sequences; for example, the specific gene for which the sequence codes.


Assembly

The process of taking fragments of DNA sequences and putting them together by matching overlapping sequences to create a representation of the original DNA that was sequenced.
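The overlap-matching idea can be illustrated with a minimal sketch. It assumes error-free fragments and exact overlaps, which real assemblers do not; the function names and the `min_overlap` parameter are hypothetical and chosen only for this example.

```python
from typing import Optional


def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 when no overlap of at least `min_overlap` bases exists.
    """
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge(left: str, right: str, min_overlap: int = 3) -> Optional[str]:
    """Join two fragments through their overlap, or return None if they do not overlap."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    # Keep `left` whole and append only the part of `right` beyond the overlap.
    return left + right[size:]


print(merge("ACGTACGGA", "CGGATTC"))  # ACGTACGGATTC
```

Here the two fragments share the four bases CGGA, so the merged sequence contains them only once.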


Base

The chemical units that make up DNA molecules, also called nucleotides, known by their abbreviations: A (adenine), T (thymine), C (cytosine) and G (guanine).

Bases can form bonds with each other: A pairs only with T, and C only with G, linking the two strands in the helical structure of DNA.
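Because of these pairing rules, one strand fully determines the other. The following is a minimal sketch of that relationship; the function names are hypothetical, and the code assumes the input contains only the letters A, T, C and G.

```python
# Pairing rules from the definition above: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base that pairs with each base of `strand`, in the same order."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the opposite strand, read in its own direction."""
    return complement(strand)[::-1]


print(complement("ACGTTG"))          # TGCAAC
print(reverse_complement("ACGTTG"))  # CAACGT
```

The reversal reflects the fact that the two strands of a DNA molecule run in opposite directions.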

Base Pair

A unit of DNA comprising two paired bases on opposite strands, commonly used to measure the size of genomes. The wheat genome has 16-17 billion base pairs, or pairs of DNA “letters” (A, T, C, and G).


Bioinformatics

The science of managing and analyzing biological data using advanced computing techniques.

BAC (Bacterial Artificial Chromosome)

An engineered DNA molecule used to clone DNA sequences in bacterial cells (for example, Escherichia coli). Segments of an organism's DNA, ranging from 100,000 to about 300,000 base pairs, can be inserted into BACs. The BACs, with their inserted DNA, are then taken up by bacterial cells. As the bacterial cells grow and divide, they amplify the BAC DNA, which can then be isolated and used in sequencing DNA.

BACs have proved very useful for producing physical maps and sequencing of large genomes, such as the human, rice, mouse and bread wheat genomes.

BAC Library

Because large genomes are difficult to sequence as a whole, the DNA is fragmented into smaller segments that are inserted into BACs and amplified. A BAC library is the collection of all the BACs produced in the process, representing the entire genome of an organism.


BLAST

Basic Local Alignment Search Tool. A computer program used to perform sequence comparisons.


Cell

The smallest unit of life that can exist independently. All organisms are made up of one or more cells.


Chromosome

A piece of DNA that is formed into a compact structure by folding and association with specific proteins. Each species has a characteristic number of chromosomes. Bread wheat has 42 chromosomes: three sets of 7 pairs of chromosomes that are derived from ancestral diploid species.

Comparative Genomics

The science of comparing the genome sequences of different species to discover similarities and differences in biology. For instance, genome scientists and breeders might compare the genomes of cultivated wheat varieties with those of wild species to understand evolution or to increase the diversity of cultivated varieties through crossing with wild species.


Contig

Short for “contiguous sequence”. A piece of DNA sequence that has been assembled from overlapping sequence fragments.


Diploid

A cell or organism that contains two copies of each chromosome.

DNA (deoxyribonucleic acid)

A molecule found in all living organisms that carries the genetic information.

The DNA molecule consists of two strands – or chains – of nucleotides joined together by bonds, forming a shape known as a double helix.

DNA Sequence

The order of genetic “letters,” or nucleotides, in a piece of DNA. For instance: ACGTACGTACGT

Draft sequence

A sequence that has been assembled into contigs, but a proportion of the sequence is missing (i.e., there are gaps) and the complete order and orientation of the fragments is unknown.

Functional genomics

The study of how genomes function, including the identification and regulation of genes, their resulting proteins, and the role played by the proteins in biochemical processes.


Gene

A gene is the basic physical and functional unit of heredity (i.e., of the properties of an organism that are passed from one generation to the next). Genes are made of nucleic acid: linear molecules consisting of a string of the four nucleotides (in DNA: A, T, G, C). They provide the instructions, or part of the instructions, needed to make molecules called proteins. In genomics, a gene is an ordered sequence of nucleotides located at a particular position on a chromosome.

Genetic Marker

An easily identifiable piece of genetic material, e.g., a gene or a portion of DNA, with a known location on a chromosome that can be tracked from one generation to the next.


Genome

All the genetic material in the chromosomes of a particular organism.

A genome contains the biological information for building, running, and maintaining an organism—and for passing life on to the next generation. Nearly every cell of an organism contains a complete copy of its genome.

Genome map

A map of the relative positions of landmarks within a genome, their chromosomal position, and the distances between them. Landmarks might include short DNA sequences, regulatory sites that turn genes on or off, and genes.

Genetic map

A map of the relative positions of genes, genetic markers, and other features within a chromosome or genome determined on the basis of recombination frequency between markers.
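As a rough illustration of how recombination frequency translates into map distance, the sketch below uses the common approximation that 1% recombination corresponds to about 1 centimorgan (cM). This approximation only holds for closely linked markers; larger distances require a mapping function, which is not shown. The function names and the counts in the example are hypothetical.

```python
def recombination_frequency(recombinants: int, total: int) -> float:
    """Fraction of offspring in which the two markers were separated by recombination."""
    if total <= 0:
        raise ValueError("total must be positive")
    return recombinants / total


def map_distance_cm(recombinants: int, total: int) -> float:
    """Approximate genetic distance in centimorgans (1% recombination ~ 1 cM)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * recombinants / total


# Hypothetical cross: 12 recombinant offspring out of 200 scored.
print(recombination_frequency(12, 200))  # 0.06
print(map_distance_cm(12, 200))          # 6.0
```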


Genomics

The study of the structure and organization of genomes, their individual elements (e.g., genes), how they function, and how they are regulated.


Gap

In genomics, a region of the genome that is not represented in a map or by sequence.


Hexaploid

Containing six sets of chromosomes in each cell.

The bread wheat genome is hexaploid, containing three sets of 7 pairs of chromosomes.

High-throughput sequencing

A rapid method of determining the order of the DNA bases of a genome. With this method, some small genomes can be sequenced in just a few days.

Kilobase (kb)

Unit of length for DNA fragments, equal to 1,000 nucleotides.

Minimum tiling path (MTP)

A minimum tiling path is used when a chromosome or genome is sequenced clone by clone: the genome is divided into BACs, which are then sequenced and assembled. The MTP is an ordered list, or “map”, of the minimum set of overlapping BACs needed to provide complete coverage of the whole chromosome or genome.
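Selecting a minimum set of overlapping clones can be sketched as a greedy interval-cover problem. This is a simplified illustration, not the procedure used in any particular project: it assumes that each BAC already has known start and end coordinates, whereas in practice clone overlaps are inferred from a physical map. The clone names, coordinates, and function name are hypothetical.

```python
from typing import List, Tuple

Clone = Tuple[str, int, int]  # (name, start, end), hypothetical coordinates in base pairs


def minimum_tiling_path(clones: List[Clone], region_start: int, region_end: int) -> List[str]:
    """Greedy interval cover: at each step, among the clones that begin inside the
    region covered so far, pick the one that reaches farthest to the right."""
    ordered = sorted(clones, key=lambda clone: clone[1])
    path: List[str] = []
    covered = region_start
    i = 0
    while covered < region_end:
        best = None
        # Examine every not-yet-seen clone that starts at or before the covered position.
        while i < len(ordered) and ordered[i][1] <= covered:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered:
            raise ValueError(f"gap in clone coverage at position {covered}")
        path.append(best[0])
        covered = best[2]
    return path


bacs = [
    ("BAC1", 0, 120_000),
    ("BAC2", 30_000, 150_000),
    ("BAC3", 100_000, 260_000),
    ("BAC4", 140_000, 250_000),
    ("BAC5", 240_000, 400_000),
]
print(minimum_tiling_path(bacs, 0, 400_000))  # ['BAC1', 'BAC3', 'BAC5']
```

In this example BAC2 and BAC4 are redundant, because the regions they cover are already spanned by their neighbours.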

Non-coding DNA

DNA in the genome that is not directly involved in making proteins or other molecules.

About 98 % of the wheat genome consists of non-coding DNA. The functions of most non-coding fragments are not yet known; recent evidence suggests that they are involved in controlling the activity of genes.


Nucleotide

The four chemical subunits of the DNA molecule, also called bases, known by their abbreviations A, T, C, and G.


Phenotype

The set of observable characteristics of an organism.

These characteristics can be controlled by genetics, by the environment, or a combination of both.

Positional cloning

A technique used to identify and isolate genes, usually those that are associated with a specific trait, based on their physical location on a chromosome. Traits are usually positioned first on the basis of proximity to genetic markers associated with chromosomal regions. Then, if a physical map covering the region is available, they are positioned relative to BACs across the region and subsequently to genes annotated in the BAC sequences.

Physical map

A map of the locations of identifiable landmarks on a chromosome or genome. Physical maps are an alignment of sequences (BACs) with distance between markers measured in base pairs. A physical map often refers to a map of overlapping BAC clones from a library that shows the relative positions of the clones along chromosomes. High resolution physical maps serve as a scaffold for genome sequence assembly.


Pseudomolecule

A representation of the entire sequence of a chromosome that is assembled from smaller sequence contigs. In most cases, the pseudomolecule is ordered using physical and genetic map information.

Quantitative trait locus (QTL)

A stretch of DNA containing or linked to genes that underlie a quantitative trait.


Recombination

The exchange of DNA sequence between homologous chromosomes (non-sister chromatids) during meiosis.

Reference sequence

The formally recognized, verified genome sequence of an organism that is used as a representative example of the genome for a particular species. A reference sequence is useful for assembling and comparing individual genomes of the same species (e.g., comparing elite varieties of wheat with the reference sequence for the purpose of understanding the inherited basis of key traits).


Sequence

The sequential order of nucleotides (genetic “letters”) in a piece of DNA. A short DNA sequence might be: ACGTACGTACGT


Sequencing

The determination of the sequential order of nucleotides in a piece of DNA or an entire genome.

Single Nucleotide Polymorphism (SNP)

A variation in a single base (A, T, C or G) found when comparing the same DNA sequence from two different individuals in the same species.
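A minimal sketch of the comparison described above follows. It assumes the two sequences are already aligned and of equal length, with no insertions or deletions; the function name and the example sequences are hypothetical.

```python
from typing import List, Tuple


def find_snps(seq_a: str, seq_b: str) -> List[Tuple[int, str, str]]:
    """List single-base differences as (position, base in seq_a, base in seq_b).

    Positions are 1-based. The sequences must already be aligned.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (position, a, b)
        for position, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


individual_1 = "ACGTACGTACGT"
individual_2 = "ACGTACCTACGT"
print(find_snps(individual_1, individual_2))  # [(7, 'G', 'C')]
```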

Shotgun Sequencing (Also called Whole-Genome Shotgun Sequencing)

A laboratory technique for determining the DNA sequence of an organism's genome. The method involves breaking the genome into a collection of small DNA fragments (typically 600 bp to 50 kb in size, depending on sequencing technology) that are sequenced individually. A computer program looks for overlaps in the DNA sequences and uses them to place the individual fragments in their correct order to reconstitute the genome.


Trait

A physical or agronomical characteristic, such as high yield, resistance to pathogens, or resistance to a stress.

Whole Genome Assembly

A whole genome assembly is the process of taking fragments of DNA sequences from an entire (whole) genome and, using high throughput technology, joining them by matching overlapping sequences to create a representation of the original DNA that was sequenced. This contrasts with sequence assemblies of individual chromosomes/chromosome arms.