Chlamydomonas Genome Project:

I served as PI on an NSF grant, carried out by a consortium of researchers, entitled 'Chlamydomonas genomics: Photosynthesis and acclimation'. Figure 1 shows two Chlamydomonas cells with their long flagella (analogous to cilia in animals) at the anterior of the cells. This project has been extremely productive, resulting in numerous papers and websites. A manuscript describing the full Chlamydomonas genome sequence was published in 2007 (Merchant et al., 2007). The project has also led to the development of the GreenCut, a set of proteins present in green algae and plants but absent from nonphotosynthetic organisms (Grossman et al., 2010); many of these proteins have unknown physiological functions but likely localize to the chloroplast, where they perform important metabolic functions. We are in the process of elucidating the functions of GreenCut proteins. The project also helped trigger a chain of new projects in Europe relating to Chlamydomonas proteomics, and motivated the Joint Genome Institute to finish the Volvox carteri genomic sequence. A number of relevant websites are listed below:

Gene Expression and Proteome Dynamics in Chlamydomonas reinhardtii

The Chlamydomonas Genetics Center

The Flagellar Proteome

The EST Database

The Microarray Project

JGI Portal for Chlamydomonas Genome

Chlamydomonas Chloroplast Genome Portal

JGI Genome Projects

JGI Integrated Microbial Genomes

Figure 1. Swimming cells of Chlamydomonas reinhardtii. Note the long flagella at the anterior of the cell.

The major results of the Chlamydomonas Genome Project are:

1. Development of a unigene set and a microarray (Eberhard et al., 2006; Jain et al., 2007) with 10,000 unique gene elements. The information used to develop the arrays came from ~300,000 cDNA sequences and from the genomic sequences corresponding to those cDNAs. We established a protocol for aligning contiguous EST sequences with genomic sequence information, which enabled us to generate a reliable set of ACEGs (Aligned Contiguous EST sequences with Genomic sequence support). The ACEGs represent a unigene set (Jain et al., 2006, 2007), which was important for array design.

Figure 2. A typical microarray experiment performed using RNA samples isolated from nutrient-replete and sulfur-starved cells (Zhang et al., 2004; Eberhard et al., 2006).

2. Arrays were used to examine modulation of gene expression by blue and high light to establish the role of the PHOT photoreceptor (Im et al., 2006), to define how light influences expression of genes critical for pigment biosynthesis (Lohr et al., 2005), to characterize acclimation of cells to nutrient deprivation (Eberhard et al., 2006; Moseley et al., 2006; Moseley et al., 2008b; Pollock et al., 2005; Zhang et al., 2004) and anoxia (Mus et al., 2007; Dubini et al., 2009; Grossman et al., 2010, in submission), and to identify regulatory elements that control cellular responses to the environment (Eberhard et al., 2006). A typical array comparing gene expression in cells grown under sulfur-deprived and sulfur-replete conditions is presented in Figure 2. New transcriptome analyses using Illumina sequencing (rather than microarrays) are being performed in collaboration with Sabeeha Merchant, Matt Posewitz, Maria Ghirardi, Mike Seibert and Matteo Pellegrini (Gonzalez-Ballester et al., 2010).

3. New molecular markers were identified, allowing for the rapid map-based cloning of mutant alleles (Rymarquis et al., 2005). A mapping kit was developed (

4. Mutant libraries were generated in which genes are interrupted by a DNA tag; marking the genes allows them to be readily identified and helps couple genomic and genetic information (Dent et al., 2005).

5. Molecular methods for altering gene expression using RNAi have been developed (Rohr et al., 2004), and TILLING (Targeting Induced Local Lesions In Genomes) is being attempted (Niyogi, Kustu, Till); a separate project to support the TILLING work was recently funded (NIH Grant awarded to Niyogi and Kustu).

6. Numerous resources have been generated and are available to the general scientific community. There are updated web sites for disseminating genomic information, as given above, and information from these databases has been probed by undergraduates from Stanford University and community colleges in the Bay Area. Furthermore, researchers have been trained in gene annotation, and we have successfully completed an EMBO course on Chlamydomonas genetics/genomics/cell biology (September, 2006).

7. Perhaps the most important finding of our comparative genomic analyses of the Chlamydomonas genome is the identification of a set of ~350 genes in the green lineage (the number of genes in this group is now ~600), most of which are associated with chloroplast function (based on presequence and transcript profiling), although for almost 300 the precise function is not known. We have recently been developing a large-scale project around this work (collaborating with Sabeeha Merchant, Matteo Pellegrini and Francis-Andre Wollman) in which we will use mutants in Chlamydomonas, Arabidopsis and Synechocystis, coupled with sophisticated biophysical methods, to establish the functions of these 'unknowns'. A summary of the different aspects of the Chlamydomonas Genome Project is shown in Figure 3.
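
The comparative filter behind the green-lineage gene set can be pictured as a simple set operation: keep the protein families found in every photosynthetic organism examined, then remove any that also occur in a nonphotosynthetic organism. The Python sketch below is a minimal illustration under that assumption and is not the published analysis; the function name, the organism groupings and the family labels (famA, famB, ...) are all hypothetical.

```python
from typing import Dict, Set


def green_lineage_candidates(
    families_by_organism: Dict[str, Set[str]],
    photosynthetic: Set[str],
    nonphotosynthetic: Set[str],
) -> Set[str]:
    """Families present in every photosynthetic organism and in no
    nonphotosynthetic one (hypothetical helper, for illustration only)."""
    if not photosynthetic:
        raise ValueError("at least one photosynthetic organism is required")
    # Families shared by all photosynthetic organisms.
    shared = set.intersection(
        *(families_by_organism[name] for name in photosynthetic)
    )
    # Families seen in any nonphotosynthetic organism.
    excluded = set().union(
        *(families_by_organism[name] for name in nonphotosynthetic)
    )
    return shared - excluded


# Toy input: family labels are placeholders, not real protein families.
toy_families = {
    "Chlamydomonas": {"famA", "famB", "famC", "famD"},
    "Arabidopsis": {"famA", "famB", "famC"},
    "human": {"famC", "famE"},
    "yeast": {"famD"},
}

candidates = green_lineage_candidates(
    toy_families,
    photosynthetic={"Chlamydomonas", "Arabidopsis"},
    nonphotosynthetic={"human", "yeast"},
)
print(sorted(candidates))
```

In this toy case famC is dropped because it also appears in a nonphotosynthetic organism, and famD is dropped because it is missing from one of the photosynthetic organisms. A real analysis would first have to assign proteins to families across genomes, which this sketch takes as given.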


Figure 3. The multiple aspects of Chlamydomonas genomics.
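
The idea behind the ACEGs in item 1 (EST contigs retained only when they have genomic sequence support, then collapsed into a unigene set) can also be sketched in code. The Python example below is an assumption-laden illustration rather than the protocol of Jain et al.: the Alignment record, the identity and coverage thresholds, and the rule that overlapping contigs on the same scaffold and strand form one unigene are all hypothetical choices made for the sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Alignment:
    """One EST contig aligned to the genome (hypothetical record layout)."""
    contig_id: str
    scaffold: str
    strand: str
    start: int       # genomic start of the alignment
    end: int         # genomic end of the alignment
    identity: float  # fraction of identical bases in the alignment
    coverage: float  # fraction of the contig that aligned


def build_unigenes(
    alignments: List[Alignment],
    min_identity: float = 0.95,  # assumed threshold, not from the source
    min_coverage: float = 0.80,  # assumed threshold, not from the source
) -> List[List[str]]:
    """Group genome-supported contigs that overlap on the same scaffold/strand."""
    # Keep only contigs with adequate genomic support.
    supported = [
        a for a in alignments
        if a.identity >= min_identity and a.coverage >= min_coverage
    ]
    supported.sort(key=lambda a: (a.scaffold, a.strand, a.start))

    groups: List[List[str]] = []
    current_key = None
    current_end = -1
    for a in supported:
        key = (a.scaffold, a.strand)
        if key == current_key and a.start <= current_end:
            # Overlaps the running interval: same unigene.
            groups[-1].append(a.contig_id)
            current_end = max(current_end, a.end)
        else:
            # New locus: start a new unigene.
            groups.append([a.contig_id])
            current_key = key
            current_end = a.end
    return groups


# Toy input with made-up contig names and coordinates.
toy_alignments = [
    Alignment("c1", "scaffold_1", "+", 100, 500, 0.99, 0.95),
    Alignment("c2", "scaffold_1", "+", 450, 900, 0.98, 0.90),
    Alignment("c3", "scaffold_1", "+", 2000, 2500, 0.99, 0.99),
    Alignment("c4", "scaffold_2", "-", 100, 400, 0.80, 0.95),
]
unigenes = build_unigenes(toy_alignments)
print(unigenes)
```

Here c1 and c2 overlap and collapse into one unigene, c3 stands alone, and c4 is discarded for lack of genomic support under the assumed identity threshold.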