Overview

Fungi, and yeasts in particular, have a long-standing role in the development of genetics and molecular biology. Génolevures examines the conservation of chromosome maps to identify the "yeast-specific" genes, and to review the distribution of gene families into functional classes.
The Génolevures 1 project started in 1999 and ended with the publication of the special Génolevures issue of FEBS Letters in December 2000. A random sequencing analysis was performed on 13 different species sharing a small genome size and a low frequency of introns.
The Génolevures 2 project started from this point and concluded with a publication in Nature in July 2004. A total of approximately 24,200 novel genes were identified, the translation products of which were classified together with Saccharomyces cerevisiae proteins into about 4,700 families, forming the basis for interspecific comparisons.
The Génolevures 3 project is currently in progress.

Génolevures 3

Presentation  


We selected three yeast species related to the Kluyveromyces clade for complete sequencing. We compared the resulting genomes to the already available genomes of species belonging to the same clade, in order to measure the diversity of gene content among related pre-WGD (Whole Genome Duplication) species.

  • Zygosaccharomyces rouxii is one of the most osmotolerant and halotolerant yeasts. Its genome was partially explored during Génolevures 1.
  • Saccharomyces kluyveri shows a close phylogenetic relationship with a variety of species of other genera, including Kluyveromyces and Zygosaccharomyces. It is becoming a model organism for industrial applications. Its genome was partially explored during Génolevures 1.
  • Kluyveromyces thermotolerans has been assigned to the Kluyveromyces genus on the basis of ascus deliquescence. Its genome was partially explored during Génolevures 1.
  • Kluyveromyces lactis is a yeast species commonly used for genetic studies and industrial applications. Its genome was already sequenced in Génolevures 2, but it was completely reannotated during Génolevures 3.
  • Eremothecium (Ashbya) gossypii shows filamentous growth with multinucleated and extensively branching hyphae. Its genome was already sequenced (Dietrich FS et al., Science, 304(5668):304-307, 2004) and is currently maintained at AGD (Hermida L et al., Nucleic Acids Res., 33(Database issue):D348-352, 2005).


Members  



Génolevures 2

Presentation  


We selected four yeast species representing various and distant branches among the hemiascomycetes for complete sequencing.

  • Candida glabrata was chosen because it has become the second causative agent of human candidiasis, and because, despite its name, it is phylogenetically more closely related to S. cerevisiae than to C. albicans, the major human fungal pathogen with which it shares only a few properties.
  • Kluyveromyces lactis is a yeast species commonly used for genetic studies and industrial applications, and it occupies an interesting position within the phylogeny of hemiascomycetes.
  • Debaryomyces hansenii was selected because it is a halotolerant yeast, related to C. albicans and other pathogenic yeasts, that is often found on fish and salted dairy products.
  • Yarrowia lipolytica, an alkane-using yeast commonly used in genetic studies, is very distantly related to the rest of the yeasts; instead it shares a number of common properties with filamentous fungi.

  • For each species, the haploid type strain was sequenced.


Of importance for evolutionary studies, the four yeast species display different mechanisms of sexuality. Yarrowia lipolytica has a haplo-diplontic cycle (that is, it alternates between haploid and diploid phases of similar importance), whereas D. hansenii is a homothallic yeast with an essentially haplontic life cycle. Both species have only one mating-type locus (MAT), whereas the other two have two silent mating-type cassette homologues, similar to S. cerevisiae. As is often the case with pathogens, C. glabrata displays no known sexual cycle, despite the fact that haploid strains of the two distinct mating types are regularly isolated from patients. Finally, K. lactis is a heterothallic species with a predominantly haplontic cycle, in contrast to S. cerevisiae, which has a predominantly diplobiontic cycle and is pseudo-homothallic due to mating-type switching.


This work, which represents the first multispecies exploration of genome evolution across an entire eukaryotic phylum, reveals the variety of events and mechanisms that have taken place, and should allow useful comparisons with other phyla of multicellular organisms when more genome sequences are determined.

Hemiascomycetous yeast species with publicly available genome sequences. The phylogeny of the hemiascomycetous yeasts is adapted from [1] and [2] (only the general topology of the tree is illustrated). Phylogenetically circumscribed species are grouped as clades (colored triangles). This topology comes from [3].


References  


  1. Phylogenetic circumscription of Saccharomyces, Kluyveromyces and other members of the Saccharomycetaceae, and the proposal of the new genera Lachancea, Nakaseomyces, Naumovia, Vanderwaltozyma and Zygotorulaspora
    Kurtzman CP
    FEMS Yeast Res. 4:233-245, 2003
  2. Phylogeny and evolution of medical species of Candida and related taxa: a multigenic analysis
    Diezmann S, Cox CJ, Schönian G, Vilgalys RJ, Mitchell TG
    J. Clin. Microbiol. 42:5624-5635, 2004
  3. Yeasts illustrate the molecular mechanisms of eukaryotic genome evolution
    Dujon B
    Trends in Genetics 22(7):375-387, 2006


Members  



Génolevures 1

Presentation  


Génolevures 1 aimed at a large-scale analysis of a wide range of evolutionary distances. Based on recent 18S rDNA phylogeny, a set of species representing the various branches of the Hemiascomycete class was defined. Preference was given to species of industrial or biomedical interest.



The 13 partial genome sequences of Génolevures 1.

  • Saccharomyces bayanus var. uvarum
  • Kazachstania exigua or Saccharomyces exiguus
  • Saccharomyces servazzii
  • Zygosaccharomyces rouxii**
  • Lachancea kluyveri or Saccharomyces kluyveri**
  • Kluyveromyces thermotolerans**
  • Kluyveromyces lactis*
  • Kluyveromyces marxianus
  • Pichia angusta
  • Debaryomyces hansenii var. hansenii*
  • Pichia farinosa or Pichia sorbitophila
  • Candida tropicalis
  • Yarrowia lipolytica*

  • * genomes completely sequenced during Génolevures 2,
    ** genomes completely sequenced during Génolevures 3.



    The analysis of the 13 genomes was performed by sequencing random genomic libraries. For each species, a random genomic DNA library was prepared to generate fragments ranging in size from 3 to 5 kb. This size was chosen based on the average length of S. cerevisiae ORFs and intergenic regions. Single-pass sequencing (up to 1 kb) of both ends of each insert characterized each insert by 2 Random Sequence Tags (RSTs). For some species, analysis was performed on about 5000 RSTs, and for the others on 2500 RSTs.
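
The sampling scheme described above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: inserts are drawn uniformly from a single linear sequence, every read is exactly 1 kb, and the names `sample_rsts` and `reverse_complement` are hypothetical helpers, not part of any Génolevures software.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def sample_rsts(genome, n_inserts, min_insert=3000, max_insert=5000,
                read_len=1000, seed=0):
    """Simulate paired end reads (2 RSTs per insert) from random inserts.

    Insert sizes of 3 to 5 kb and reads of up to 1 kb follow the text;
    uniform sampling and the fixed seed are assumptions of this sketch.
    """
    if len(genome) < max_insert:
        raise ValueError("genome is shorter than the largest insert size")
    rng = random.Random(seed)
    rsts = []
    for i in range(n_inserts):
        size = rng.randint(min_insert, max_insert)
        start = rng.randint(0, len(genome) - size)
        insert = genome[start:start + size]
        # One read from each end of the insert, each read 5' to 3'.
        forward = insert[:read_len]
        reverse = reverse_complement(insert[-read_len:])
        rsts.append((f"insert{i}_F", forward))
        rsts.append((f"insert{i}_R", reverse))
    return rsts


if __name__ == "__main__":
    rng = random.Random(42)
    toy_genome = "".join(rng.choice("ACGT") for _ in range(20000))
    for name, seq in sample_rsts(toy_genome, 2):
        print(name, len(seq))
```

Because the two reads of a 3 to 5 kb insert never overlap, each insert yields two independent tags separated by a known approximate distance.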


    Each set of RSTs was compared to the Saccharomyces cerevisiae genome, and annotated accordingly. The sequences which were not annotated at this step were subsequently compared to a collection of protein sequences called GPROTEOME. GPROTEOME consists of proteins and ORF products from completely sequenced organisms plus a filtered SwissProt database.
    The sequences available on this site are the full complement of the RSTs generated in the project, together with the annotations made by the Consortium members. Altogether they represent about 20000 newly-identified genes.


    Annotation  


    Each set of RSTs was compared to Saccharomyces cerevisiae rDNA, tRNA genes, Ty elements and mitochondrial sequences, using BLASTx or tBLASTx. The RSTs with valid alignments were then set apart. The remaining RSTs were compared to the Saccharomyces cerevisiae proteome using BLASTx (Tekaia et al., 2000) and the alignments were submitted to expert validation.

    All species use the Standard genetic code except Debaryomyces hansenii var. hansenii, Pichia sorbitophila and Candida tropicalis, which use the Alternative Yeast Nuclear genetic code.
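
The practical effect of the two genetic codes can be shown with a short translation sketch. It assumes that the Alternative Yeast Nuclear code corresponds to NCBI translation table 12, which differs from the Standard code (table 1) in reading the codon CTG as serine instead of leucine. The function and table names are illustrative only.

```python
BASES = "TCAG"

# Amino acids for the 64 codons in NCBI order (TTT, TTC, TTA, TTG, TCT, ...).
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
ALT_YEAST = "FFLLSSSSYY**CC*WLLLSPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def codon_table(amino_acids):
    """Map each codon to its amino acid for a 64-letter NCBI-style string."""
    codons = [a + b + c for a in BASES for b in BASES for c in BASES]
    return dict(zip(codons, amino_acids))


TABLES = {
    "standard": codon_table(STANDARD),
    "alt_yeast": codon_table(ALT_YEAST),
}


def translate(seq, code="standard"):
    """Translate a DNA string in frame 1; unknown codons become 'X'."""
    table = TABLES[code]
    seq = seq.upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        protein.append(table.get(seq[i:i + 3], "X"))
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGCTGTAA"))               # Standard code
    print(translate("ATGCTGTAA", "alt_yeast"))  # CTG read as serine
```

Codons containing undetermined residues (for example `N`) are emitted as `X`, which is one simple way to handle the single-read quality issues of RSTs.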

    For the validation, RST segments having a single clearcut homolog (denoted "o") were distinguished from those having several possible homologs as a result of the existence of gene families in Saccharomyces cerevisiae (denoted "oo").
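
The "o" / "oo" distinction can be sketched as a simple rule over BLASTx hits. The original validation was done by experts, so the automatic rule below is only an approximation: the hit tuple layout, the E-value cutoff and the function name `classify_rsts` are assumptions made for illustration, and the identifiers in the example are placeholder labels.

```python
from collections import defaultdict


def classify_rsts(hits, max_evalue=1e-5):
    """Label each RST 'o' (single homolog) or 'oo' (several homologs).

    `hits` is an iterable of (rst_id, orf_name, evalue) tuples. The
    E-value cutoff is an arbitrary choice for this sketch, not a value
    taken from the project.
    """
    homologs = defaultdict(set)
    for rst_id, orf_name, evalue in hits:
        if evalue <= max_evalue:
            homologs[rst_id].add(orf_name)
    return {
        rst_id: ("o" if len(orfs) == 1 else "oo")
        for rst_id, orfs in homologs.items()
    }


if __name__ == "__main__":
    example_hits = [
        ("rst_001", "ORF_A", 1e-40),
        ("rst_002", "ORF_B", 1e-30),
        ("rst_002", "ORF_C", 1e-28),  # second member of a gene family
        ("rst_003", "ORF_D", 0.5),    # too weak, left unannotated
    ]
    print(classify_rsts(example_hits))
```

RSTs with no hit below the cutoff are absent from the result; in the project these are the sequences passed on to the comparison with GPROTEOME.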

    The sequences which were not annotated at this step were subsequently compared to a collection of protein sequences from other completely sequenced organisms called GPROTEOME.

    Sequence Quality

    As the sequences produced in this project are single-read RSTs, they are prone to contain undetermined residues and frameshifts. Consequently, caution should be exercised when computing RST translation products.
    All the translations given here are hypothetical translations of the segments corresponding to BLASTx alignments only.

    Coordinates

    The coordinates given in the annotation tables correspond to the beginnings and the ends of the alignments. These coordinates are expressed in nucleotides for the RSTs and in amino acids for the BLASTx hits.
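
A minimal sketch of how such coordinates might be used to recover the aligned part of an RST follows. It assumes 1-based inclusive positions and, as in common BLAST output, that a start greater than the end denotes a hit on the reverse strand; neither convention is confirmed by the annotation tables themselves, and the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def aligned_segment(rst_seq, start, end):
    """Extract the aligned nucleotides of an RST.

    Coordinates are assumed to be 1-based and inclusive; start > end is
    treated as a reverse-strand alignment (an assumption of this sketch).
    """
    if start <= end:
        return rst_seq[start - 1:end]
    return reverse_complement(rst_seq[end - 1:start])


def span_lengths(nt_start, nt_end, aa_start, aa_end):
    """Return (nucleotide span, amino-acid span) of one alignment."""
    return abs(nt_end - nt_start) + 1, aa_end - aa_start + 1


if __name__ == "__main__":
    rst = "AAATGCCCTT"
    print(aligned_segment(rst, 3, 8))   # forward strand
    print(aligned_segment(rst, 8, 3))   # reverse strand
    print(span_lengths(3, 8, 10, 11))   # 6 nucleotides, 2 amino acids
```

For an ungapped alignment the nucleotide span is three times the amino-acid span; gaps and frameshifts in single-read sequences can break that relation.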

    Reference data

    GPROTEOME basically consists of a compilation of 23 completely sequenced organisms, plus a partial sequence of the Schizosaccharomyces pombe genome, plus a "filtered" SwissProt (the entries corresponding to the species already present in GPROTEOME were removed).

    Species                                 Date of database      Number of proteins

    Bacteria
    Aquifex aeolicus                        April 19th, 1999       1,522
    Bacillus subtilis                       April 19th, 1999       4,100
    Borrelia burgdorferi                    April 19th, 1999       1,639
    Campylobacter jejuni                    April 19th, 1999       1,731
    Chlamydia pneumoniae                    April 19th, 1999       1,052
    Chlamydia trachomatis                   April 19th, 1999         877
    Escherichia coli                        April 19th, 1999       4,290
    Haemophilus influenzae                  April 19th, 1999       1,713
    Helicobacter pylori                     April 19th, 1999       1,577
    Mycobacterium tuberculosis              April 19th, 1999       3,924
    Mycoplasma genitalium                   April 19th, 1999         479
    Mycoplasma pneumoniae                   April 19th, 1999         677
    Rickettsia prowazekii                   April 19th, 1999         837
    Synechocystis sp.                       April 19th, 1999       3,168
    Thermotoga maritima                     May 28th, 1999         1,849
    Treponema pallidum                      April 19th, 1999       1,031

    Archaea
    Aeropyrum pernix K1                     July 23rd, 1999        2,694
    Archaeoglobus fulgidus                  April 19th, 1999       2,409
    Methanobacterium thermoautotrophicum    April 19th, 1999       1,871
    Methanococcus jannaschii                April 19th, 1999       1,771
    Pyrococcus abyssi                       May 5th, 1999          1,765
    Pyrococcus horikoshii                   April 19th, 1999       2,061

    Eukaryota
    Caenorhabditis elegans                  April 19th, 1999      19,099
    Schizosaccharomyces pombe               October 9th, 1999      3,955

    SwissProt (filtered)                    November 3rd, 1999    58,365

    BLASTx comparison to GPROTEOME was done on sequences which were not annotated from Saccharomyces cerevisiae comparisons. Saccharomyces cerevisiae DNA sequences were downloaded from MIPS on March 2nd, 1999. The ORFs were predicted and filtered, based on MIPS annotations (Blandin et al., 2000).

    References


    Members