UCSC releases a more readable version of the human genome
“It’s funny, biology is such a mass discipline,” says Jim Kent. “What we focus on is often as much of a reflection on ourselves as it is on anything.” As director of the UC Santa Cruz Genome Browser Project and head of the ENCODE Data Coordination Center, Kent sets quite an example for channeling cultural fascination with science into real-world application.
“Back in the 1960s, when the focus was on energy and physics, people figured out the energy centers of the cell,” says Kent. But in 2012—a time of immense, international collaboration between biologists and computer engineers—the “culture of the age” culminated in September with a much more accessible and informative translation of the entire human genome than was previously available.
The Encyclopedia of DNA Elements (ENCODE) kicked off in 2003, after UCSC sequenced and published a complete human genome sequence draft days before the competing Celera Genomics Corporation attempted to patent theirs.
Funded chiefly by the National Human Genome Research Institute, this publicly available information was vast but largely hieroglyphic (at the time, researchers could attribute genetic functionality to only about 2.5 percent of the genome). ENCODE has since been an effort to put meaning to the madness by determining the actual functions within this DNA “Rosetta stone.”
A probing technique in the project’s pilot phase found that roughly 80 percent of human DNA, once considered wholly inactive, transcribes into RNA. While more recent data points to the conservative estimate that 25 percent has direct genetic functionality, that’s still a tenfold increase in viable DNA.
But a particularly astounding component of this newly significant portion of the genome is “regulatory DNA.” Regulatory regions determine which genes are expressed and where, giving humans two arms and two legs and differentiating us from chimpanzees.
“The regulatory region is the control switch,” explains Kent, who wrote the original sequencing project’s computer program as a graduate student at UCSC. “It’s the closest thing that the cell has to the computer system.”
For Kate Rosenbloom, who serves as the technical project manager of the ENCODE Data Coordination Center at UCSC, the interpretation is a step toward completing the genome puzzle. “We’ve had all the parts all along and now we’re pulling them into a story to see how they all connect,” she says.
Detailing these hugely significant regulatory regions was as much a responsibility as a privilege. The amount of data storage and software power required increased exponentially, but Kent and internationally acclaimed colleague David Haussler were still up for the task. The data repository, in its behemoth entirety, was relocated from the NHGRI to the web-based UCSC Genome Browser, which has maintained a culture of public availability since the original genome sequencing.
The first “revision” of the Encyclopedia’s data and format wrapped up this September. “Our target is so a first-year graduate student in a biomedical discipline can understand,” says Kent. “That’s probably where we’re going to stay for a while.”
In coordination with the City of Santa Cruz, UCSC has been pushing to create a stronger culture of entrepreneurship at the university. To curb the county’s trend of thousands of residents commuting over the hill to Silicon Valley each weekday, the university has leveraged hundreds of thousands of federal, local and private dollars to encourage recent grads to stay and plant their business seeds in Santa Cruz.
Enter Five3 Genomics, a startup founded by UCSC bioinformatics graduates Steve Benz, Ph.D., Charles Vaske, Ph.D., and Zack Sanborn. In addition to keeping Slug brainpower in town, the company is using the UCSC Genome Browser to genetically tailor medicine.
“We’re a genomics software company focused on cancer,” Benz, the CEO, explains. “We’re trying to identify key aspects of cancer genome information to target treatment. It’s called precision medicine.”
The trio, currently operating out of NextSpace in downtown Santa Cruz, say they will officially launch in about a month. They say the culmination of ENCODE’s second phase is more than a windfall.
“Most scientists can’t comprehend how large this is,” says Vaske, the company’s CSO. “Our job would be impossible without [the UCSC Genome Browser].”
Vaske quickly navigates to a page on the Genome Browser’s website, where he points to the sequencing charts and graphs of a breast cancer cell’s DNA. The company has been studying this DNA to discern which of the cancer’s gene mutations are common and which are more abnormal and virulent.
“Steve [Benz] and I work on modeling,” Vaske explains. “We try to create … a rough simulation of what is going on, so we can say ‘if you have this [gene], you’re more likely to have this response to this treatment.’”
Benz adds that the goal is to get to causality, the invaluable knowledge that allows doctors both to evolve their preventative medicine and to prescribe more effective treatment.
Benz explains that while access to the Encyclopedia is like looking at a road map and trying to determine how to get from point A to point B, the information allows their local startup to make a global impact.
Going forward, Five3 will be one of many groups using the genome data. UCSC will convene a panel of medical and legal experts titled “Genomics Gets Personal: Property, Persons, Privacy” on Thursday, Sept. 27 at the UC San Francisco Mission Bay campus to discuss the ethical implications of genome data access as science and culture continue to evolve.