International consortium of researchers generates gene sequences from more than 1100 plant species
Image caption: Green alga Lacunastrum gracillimum, female cones of the gymnosperm Gnetum gnemon, and cherry tree flower, Prunus domestica.
Photo credits: Michael Melkonian and Walter S. Judd.
Gene sequences for more than 1100 plant species have been released by an international consortium of nearly 200 plant scientists, the culmination of a nine-year research project.
The One Thousand Plant Transcriptomes Initiative (1KP) is a global collaboration to examine the diversification of plant species, genes and genomes across the more than one-billion-year history of green plants, dating back to the ancestors of flowering plants and green algae.
“In the tree of life, everything is interrelated,” said Gane Ka-Shu Wong, lead investigator and professor in the University of Alberta’s Faculty of Science and Faculty of Medicine & Dentistry. “And if we want to understand how the tree of life works, we need to examine the relationships between species. That’s where genetic sequencing comes in.”
The findings, published today in Nature, reveal the timing of whole genome duplications and the origins, expansions and contractions of gene families contributing to fundamental genetic innovations enabling the evolution of green algae, mosses, ferns, conifer trees, flowering plants and all other green plant lineages. The history of how and when plants secured the ability to grow tall, and make seeds, flowers and fruits, provides a framework for understanding plant diversity around the planet, including annual crops and long-lived forest tree species.
“Our inferred relationships among living plant species inform us that over the billion years since an ancestral green algal species split into two separate evolutionary lineages, one including flowering plants, land plants and related algal groups and the other comprising a diverse array of green algae, plant evolution has been punctuated with innovations and periods of rapid diversification,” said James Leebens-Mack, professor of plant biology in the University of Georgia Franklin College of Arts and Sciences and co-corresponding author on the study. “In order to link what we know about gene and genome evolution to a growing understanding of gene function in flowering plant, moss and algal organisms, we needed to generate new data to better reflect gene diversity among all green plant lineages.”
The study inspired a community effort to gather and sequence diverse plant lineages derived from terrestrial and aquatic habitats on a global scale. Over 100 taxonomic specialists contributed material from field and living collections that include the Central Collection of Algal Cultures; Royal Botanic Gardens, Kew; Royal Botanic Garden Edinburgh; Atlanta Botanical Garden; New York Botanical Garden; Fairylake Botanical Garden, Shenzhen; the Florida Museum of Natural History; Duke University; University of British Columbia Botanical Garden; and the University of Alberta. By sequencing and analyzing genes from a broad sampling of plant species, researchers are better able to reconstruct gene content in the ancestors of all crops and model plant species, and gain a more complete picture of the gene and genome duplications that enabled evolutionary innovations.
Nearly a decade ago, Wong organized private funding through the Somekh Family Foundation as well as support from the Government of Alberta and a sequencing commitment from BGI in Shenzhen, China, to launch 1KP. Once the project was operational, additional resources came from other ongoing projects, including iPlant (now CyVerse), funded by the U.S. National Science Foundation.
The massive scope of the project demanded development and refinement of new computational tools for sequence assembly and phylogenetic analysis.
“New algorithms were developed by software engineers at BGI to assemble the massive volume of gene sequence data generated for this project,” explained Wong.
Founder professor of computer science Tandy Warnow of the University of Illinois at Urbana-Champaign and Siavash Mirarab, assistant professor of electrical and computer engineering at the University of California San Diego, developed new algorithms for inferring evolutionary relationships from hundreds of gene sequences for over one thousand species, addressing substantial heterogeneity in evolutionary histories across the genomes.
The timing of 244 whole genome duplications across the green plant tree of life was one of the interrelated research foci of the project.
“Perhaps the biggest surprise of our analyses was the near absence of whole genome duplications in the algae,” said Mike Barker, associate professor of ecology and evolutionary biology at the University of Arizona. “Building on nearly 20 years of research on plant genomes, we found that the average flowering plant genome has nearly 4 rounds of ancestral genome duplication dating as far back as the common ancestor of all seed plants more than 300 million years ago. We also find multiple rounds of genome duplication in fern lineages, but there is little evidence of genome doubling in algal lineages.”
In addition to genome duplications, the expansion of key gene families has contributed to the evolution of multicellularity and complexity in green plants.
“Gene family expansions through duplication events catalyzed diversification of plant form and function across the green tree of life,” said co-author Marcel Quint, professor of crop physiology at Halle University, Germany. “Such expansions unleashed during terrestrialization or even before set the stage for evolutionary innovations including the origin of the seed and later the origin of the flower.”
“The view of evolutionary relationships provided by 1KP has led to new hypotheses about the origins of key structures and processes in green plants,” said co-author Pam Soltis of the Florida Museum of Natural History, University of Florida.
The paper, “One Thousand Plant Transcriptomes and Phylogenomics of Green Plants,” was published in Nature (https://www.nature.com/articles/s41586-019-1693-2). Sequences, sequence alignments and tree data are available through the CyVerse Data Commons.
By Alan Flurry (aflurry@uga.edu), University of Georgia, Athens, GA, U.S.A., and Katie Willis (kewillis@ualberta.ca), University of Alberta, Edmonton, Canada.