• Duke Medicine built homegrown data analytics tools to better manage electronic health information
  • Data dates back to 1996, with more than 4 million patient records
  • High take-up rate among clinicians increases the frequency and quality of research and clinical care

Duke Medicine may not be the only institution to use data analytics on patient data, but it is one of the first to offer self-service analytics. Researchers themselves access the data warehouse to explore data and generate their own insights, with no IT person as intermediary.

“Data is an extremely valuable enterprise asset, and information is foundational for clinical, financial, research, and translational medicine initiatives,” said Dr Jeffrey Ferranti, Vice President and Chief Information Officer (CIO), Duke Medicine.

In fact, Duke’s ability to gather and store electronic health information has outpaced its ability to analyse and use that information to improve healthcare quality and reduce cost.

The result? The building of homegrown data analytics tools like Duke Enterprise Data Unified Content Explorer (DEDUCE) and Duke Integrated Subject Cohort Enrollment Research Network (DISCERN).


These tools help Duke, as an academic medical centre, to create a more seamless continuum of research and clinical care by securely drawing on patient data to do better research.

“The goal is to build a tool that answers questions relevant to different domains of expertise. The alternative is to ask an IT person to pull out the data.”

These tools are built on the foundation of EPIC, a comprehensive electronic medical record system that replaced 135 systems with a single one, in which each patient’s data is captured as one record.

Duke’s data – such as patients’ lab records, Current Procedural Terminology (CPT) coded procedures, computerised physician order entry (CPOE) orders, encounters and billing invoices – dates back to 1996 and covers more than 4 million patient records.

DEDUCE helps doctors to identify cohorts of patients with similar issues. Its power was seen in the TECOS trial, which evaluated patient outcomes after treatment with sitagliptin, an antidiabetic drug. The researchers narrowed nearly 4 million patients down to a cohort of 2,000 by applying inclusion and exclusion criteria such as Type II diabetes, age over 50 and an HbA1c result between 6.4 and 8.1.

“Without this tool, we wouldn’t have been able to find those patients,” said Ferranti.
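The kind of cohort narrowing described for the TECOS trial can be pictured as a filter over patient records. The sketch below is illustrative only and is not how DEDUCE is implemented: the `Patient` fields, the sample records and the choice to treat the HbA1c bounds as inclusive are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical record layout; the real warehouse schema is not described in the article.
    patient_id: str
    has_type2_diabetes: bool
    age: int
    hba1c: float  # most recent HbA1c result


def in_cohort(p: Patient) -> bool:
    """Apply the three criteria quoted in the article.

    Assumption: the HbA1c range 6.4 to 8.1 is treated as inclusive.
    """
    return p.has_type2_diabetes and p.age > 50 and 6.4 <= p.hba1c <= 8.1


def select_cohort(patients):
    """Return only the patients who satisfy every inclusion criterion."""
    return [p for p in patients if in_cohort(p)]


# Made-up sample data, one record failing each criterion in turn.
patients = [
    Patient("A", True, 62, 7.2),   # meets all criteria
    Patient("B", True, 45, 7.0),   # too young
    Patient("C", False, 70, 6.8),  # no Type II diabetes
    Patient("D", True, 58, 9.3),   # HbA1c out of range
]
cohort = select_cohort(patients)
print([p.patient_id for p in cohort])
```

In practice a self-service tool would let the researcher compose such criteria without writing code; the point here is only that each criterion shrinks the candidate pool further.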

Since 2008, some 1,300 clinicians have undergone the four-hour training to use DEDUCE, and they perform 2,000 to 3,000 queries a month.


DEDUCE can also map patients geospatially, slice the data by demographic and socioeconomic variables, run text analytics and produce graphics.

Complementing DEDUCE is DISCERN. Where DEDUCE trawls retrospective data, DISCERN combines the warehouse data with real-time clinical events to identify potential study recruits and alert the appropriate staff.


Without DISCERN, many recruitment opportunities were missed because of the aggressive timeline for recruiting patients.
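One way to picture the combination of warehouse data and real-time events is a matcher that checks each incoming clinical event against a list of patients already flagged as potential recruits. This is a rough sketch under stated assumptions, not DISCERN's actual design: `ClinicalEvent`, `find_alerts`, the event type names and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ClinicalEvent:
    # Hypothetical event shape; the article does not describe DISCERN's event feed.
    patient_id: str
    event_type: str  # e.g. "admission"


def find_alerts(candidate_ids, events, trigger_types):
    """Return (patient_id, event_type) pairs that should be flagged to study staff.

    candidate_ids: patients pre-screened from retrospective warehouse data.
    trigger_types: event types that make a candidate reachable for recruitment.
    """
    return [
        (e.patient_id, e.event_type)
        for e in events
        if e.patient_id in candidate_ids and e.event_type in trigger_types
    ]


# Made-up example: two pre-screened candidates and three incoming events.
candidates = {"A", "D"}
events = [
    ClinicalEvent("A", "admission"),   # candidate and trigger event: alert
    ClinicalEvent("B", "admission"),   # not a candidate
    ClinicalEvent("D", "lab_result"),  # candidate, but not a trigger event
]
alerts = find_alerts(candidates, events, {"admission"})
print(alerts)
```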

Going forward, Ferranti’s vision is to have federated queries, which search multiple resources simultaneously. With DEDUCE installed at multiple institutions around the US and the world, a researcher could pull data from across these disparate databases.
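The federated idea can be sketched as sending the same query to several sites at once and merging what comes back. The example below is a minimal illustration, assuming each site exposes a simple query function; `make_site`, `federated_query` and the sample records are hypothetical, and a real deployment would also need to handle authentication, privacy rules and differing schemas.

```python
from concurrent.futures import ThreadPoolExecutor


def make_site(name, records):
    """Build a stand-in for one institution's query endpoint (hypothetical)."""
    def run_query(predicate):
        # Tag each matching record with the site it came from.
        return [dict(r, site=name) for r in records if predicate(r)]
    return run_query


def federated_query(sites, predicate):
    """Run the same predicate against every site concurrently and merge the results."""
    with ThreadPoolExecutor(max_workers=len(sites)) as pool:
        # pool.map yields results in the same order as the sites list.
        per_site = pool.map(lambda site: site(predicate), sites)
        return [row for rows in per_site for row in rows]


# Made-up data for two sites.
sites = [
    make_site("site_a", [{"id": 1, "age": 61}, {"id": 2, "age": 40}]),
    make_site("site_b", [{"id": 7, "age": 55}]),
]
results = federated_query(sites, lambda r: r["age"] > 50)
print(results)
```

Keeping the data at each site and moving only the query and its results is what distinguishes this approach from copying every institution's records into one central warehouse.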