Researchers: Ocean Survey Reveals Millions of New Genes, Thousands of New Protein Families
News Service
March 13, 2007 18:03 EST

ROCKVILLE, Maryland -- Researchers from the J. Craig Venter Institute (JCVI) today announced the publication of several studies from the Sorcerer II Global Ocean Sampling Expedition (GOS) in PLoS Biology. The studies detail the discovery of millions of new genes and thousands of new protein families and, specifically, the characterization of thousands of new protein kinases from ocean microbes, using whole-environment shotgun sequencing and new computational tools. Researchers believe these data will lead to a better understanding of key biological processes, which could eventually offer new ideas for alternative energy production and solutions for dealing with climate change and other environmental issues.

The GOS dataset is 90-fold larger than other marine metagenomic datasets, making it the largest ever released in the public domain. The GOS analysis also nearly doubles the number of previously known proteins. This enormous amount of data allowed the researchers to better understand the genomic structure and evolution of microorganisms, as well as the function of important protein families such as protein kinases, which are key regulators of cellular function in all organisms. Although invisible to the naked eye, microbes make up the vast majority of life on the planet and are responsible for the creation and maintenance of Earth's atmosphere. It is therefore important to understand the role and function of these organisms to ensure the survival of the planet and human life on it.

"This publication is not only providing an unprecedented level of new genes and protein family discoveries, but is also pivotal in that we have provided compelling analysis of evolution and function of these genes and proteins within the larger context of organisms interacting with their environment," said J. Craig Venter, Ph.D., founder and chairman, the J. Craig Venter Institute. "Given the findings, it's clear that we've only begun to scratch the surface of understanding the microbial world around us."

The Sorcerer II Expedition began with a pilot project in 2003 in the Sargasso Sea near Bermuda, in which more than one million new genes and hundreds of new photoreceptors were discovered in what was thought to be an area of low diversity. Today's GOS publication is the result of ocean water sampling conducted from Halifax, Nova Scotia, to the Eastern Tropical Pacific during the two-year circumnavigation by the Sorcerer II Expedition. The Gordon and Betty Moore Foundation and the United States Department of Energy, Office of Science, funded the sequencing and analysis of the Expedition. The JCVI funded the operation of the vessel.

The group also announced today the launch of a new online database and high-speed computational resource, Community Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analysis (CAMERA). Funded by a seven-year, $24.5 million grant from the Moore Foundation, CAMERA was developed by the UC San Diego Division of the California Institute for Telecommunications and Information Technology (Calit2) in partnership with JCVI and UCSD's Center for Earth Observations and Applications (CEOA) at Scripps Institution of Oceanography.

"The scale and complexity of the GOS data required Calit2 to architect a powerful new cyberinfrastructure to enable both interactive access as well as high performance computation on the data by the global metagenomic community," said Larry Smarr, Calit2 director and principal investigator on CAMERA.

CAMERA houses metagenomic data and provides the advanced software tools and computer hardware to analyze these data. Through dedicated optical circuits on the National LambdaRail or international optical networks, CAMERA permits scientists to connect their local laboratory computers directly to the CAMERA database and tools, resulting in up to a hundred-fold increase in bandwidth over current standards. CAMERA has been in beta testing since January 2007 and is available to researchers worldwide as of today. In addition to the CAMERA database, the GOS data are also being deposited in the U.S. National Institutes of Health's public database, GenBank.

The GOS publication was a result of intensive analysis of these data by scientists from the JCVI along with collaborators at four University of California campuses (San Diego, Los Angeles, Berkeley and Davis), University of Southern California, Salk Institute for Biological Studies, Burnham Institute, University of Hawaii, Brown University, Universidad Nacional Autonoma de Mexico, Universidad de Costa Rica, Universidad de Concepcion, Bedford Institute of Oceanography, Smithsonian Tropical Research Institute, and Rutgers University.