Phase 5 of the Cancer Gene Index Project Begins


By Allison Proffitt

June 25, 2008 | Sophic has announced $1.3 million in funding from the National Cancer Institute to complete the Cancer Gene Index Project over the next 12 months. Sophic started the project in June 2004 with the goal of mining 8.8 million Medline abstracts to identify suspected cancer genes and manually annotate gene-disease and gene-compound relationships. So far, 4,658 cancer genes have been made publicly available on the NCI website.

“We’ve completed four years of work and this phase is the completion phase, which means we will have completely analyzed and annotated the 6,610 identified cancer genes with manual annotations for role codes and evidence codes,” Patrick Blake, Sophic’s CEO, told Bio-IT World. “I just attended the caBIG conference and they are identifying this dataset as the backbone for cancer research across the cancer community. We’re proud and we’re thrilled to be able to offer this asset to people who are fighting this terrible disease.”

The fifth phase of the project, announced on Monday, will bring the total number of cancer-related genes indexed to 6,610. Sophic has carried out the work in conjunction with NCI and Biomax Informatics AG of Munich, Germany, in what Blake calls “a true collaboration” using Biomax’ BioLT literature mining tool. “We developed a ‘factory assembly line’ methodology that allows the automated text mining results to be fed into the scientific team who curate and annotate the information in an efficient, quality-controlled, work-flow process,” said Klaus Heumann, CEO of Biomax. The phase-based strategy has been designed so that “nothing is missed” and all cancer genes, cancer types, and compounds and treatments related to cancer genes are examined.




For reprints and/or copyright permission, please contact The YGS Group, 1808 Colonial Village Lane, Lancaster, PA; (717) 399-1900 ext. 125, or via email to Ashley.Zander@theYGSgroup.com.