How does data mining help healthcare?

Is data mining just another buzzword in the modern business world? How can data mining tools enable the discovery of deeper insights into healthcare? How can the huge amounts of data generated by healthcare EDI transactions be used to advance medical services?

What is data mining?

The term's meaning differs from industry to industry. The most common definition, as provided by TechTarget, is “the process of sorting through large data sets to identify patterns and establish relationships to solve problems through data analysis.”

Data mining tools allow you to discover patterns and to use those patterns to predict future trends or the likelihood of future events. Typically, data mining is applied to structured data.

How does data mining work?

To perform data mining, you need two things:

  • the data itself (lots of data indeed)
  • and the computing power capable of dealing with that data (petabytes of data, to be more precise).

The more organized the data is, the easier it is to mine it and get useful information for analysis.

Data mining is commonly used for marketing purposes. For example, online services such as Facebook, Google, and many others mine vast amounts of data to provide users with targeted content. E-commerce companies, such as Amazon, use data mining to offer cross-sells and up-sells. When you see a box saying “People who viewed this product also liked this”, you are looking at the results of very sophisticated data mining.

What is data mining in healthcare?

The medical industry collects a dazzling array of data, most of which consists of electronic health records (EHRs) collected by HIPAA-covered healthcare facilities. According to a survey available on PubMed, data mining is becoming increasingly popular in healthcare, if not increasingly essential. The huge amounts of data generated by healthcare EDI transactions are too complex and too voluminous to be processed and analyzed using traditional methods.

Data mining provides the methodology and technology for healthcare organizations to:

  • evaluate treatment effectiveness,
  • save lives of patients using predictive medicine,
  • manage healthcare at different levels,
  • manage customer relationships,
  • detect waste, fraud and abuse.

How does data mining work in healthcare?

To sift through the collected medical data and to extract the useful knowledge hidden there, data mining is used as a part of the Knowledge Discovery in Databases (KDD) process.

The whole process includes the following main steps, which can be performed in an iterative and interactive sequence:

  • Data selection. The main goal of this step is to create a target data set from the original data, on which knowledge discovery has to be performed.
  • Data preprocessing. In this step the data is “cleaned”: strategies are defined and applied for handling missing data fields and accounting for time-sequence information.
  • Data transformation. This step reduces and projects the data using transformation techniques or methods to find invariant aspects of the data.
  • Data mining. This step deals with extracting interesting patterns by choosing methods, tasks, and algorithms and presents the output results appropriately.
  • Data interpretation or evaluation. This step is performed by the user to interpret and extract knowledge from the mined patterns.
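The steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a reference KDD implementation: the records, the field names (`systolic_bp`, `readmitted`), the threshold of 140 and the helper functions are all hypothetical and chosen for readability.

```python
from statistics import median

# Hypothetical patient records; None marks a missing field.
records = [
    {"age": 64, "dept": "cardiology", "systolic_bp": 150, "readmitted": True},
    {"age": 51, "dept": "cardiology", "systolic_bp": None, "readmitted": False},
    {"age": 70, "dept": "cardiology", "systolic_bp": 160, "readmitted": True},
    {"age": 35, "dept": "orthopedics", "systolic_bp": 120, "readmitted": False},
    {"age": 58, "dept": "cardiology", "systolic_bp": 130, "readmitted": False},
]

def select(rows, dept):
    """Data selection: build the target data set (copies, so the source stays intact)."""
    return [dict(r) for r in rows if r["dept"] == dept]

def preprocess(rows, field):
    """Data preprocessing: fill missing values with the median of the known ones."""
    known = [r[field] for r in rows if r[field] is not None]
    fill = median(known)
    for r in rows:
        if r[field] is None:
            r[field] = fill
    return rows

def transform(rows, field, threshold):
    """Data transformation: reduce a numeric field to a binary feature."""
    for r in rows:
        r["high_" + field] = r[field] >= threshold
    return rows

def mine(rows, feature, outcome):
    """Data mining: outcome rate for rows with and without the feature."""
    rates = {}
    for flag in (True, False):
        group = [r for r in rows if r[feature] == flag]
        rates[flag] = sum(r[outcome] for r in group) / len(group) if group else None
    return rates

target = select(records, "cardiology")
clean = preprocess(target, "systolic_bp")
features = transform(clean, "systolic_bp", threshold=140)
pattern = mine(features, "high_systolic_bp", "readmitted")
print(pattern)
```

The final step, interpretation, is left to the user: whether a difference in readmission rates between the two groups is meaningful depends on sample size and clinical context, which a toy data set like this cannot settle.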

Data mining techniques used in healthcare

Microsoft says data mining “uses mathematical analysis to derive patterns and trends that exist in data. Typically, these patterns cannot be discovered by traditional data exploration because the relationships are too complex or because there is too much data”.

Data mining often involves creating association rules and using support and confidence criteria to locate the most important relationships within the data.
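To make support and confidence concrete, here is a small sketch. The visit data and diagnosis labels are hypothetical. Support is the fraction of records that contain an itemset, and confidence of a rule A → B is support(A and B) divided by support(A).

```python
# Hypothetical sets of diagnosis labels, one set per patient visit.
visits = [
    {"diabetes", "hypertension"},
    {"diabetes", "hypertension", "obesity"},
    {"hypertension"},
    {"diabetes", "obesity"},
    {"diabetes", "hypertension"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimated probability of the consequent given the antecedent."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)

rule_support = support({"diabetes", "hypertension"}, visits)
rule_confidence = confidence({"diabetes"}, {"hypertension"}, visits)
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

In this toy data set, three of five visits contain both diagnoses (support 0.60), and three of the four visits with diabetes also list hypertension (confidence 0.75). A rule is usually kept only if both values exceed thresholds chosen by the analyst.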

Other healthcare data mining parameters include:

  • sequence or path analysis (i.e. finding patterns where one event leads to another later event),
  • classification (i.e. looking for new patterns and predicting variables based on the factors the database contains),
  • clustering (i.e. grouping a set of objects and aggregating them based on their similarity to each other),
  • and forecasting.
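As an illustration of clustering, the sketch below runs a plain k-means loop on a handful of hypothetical patients described by age and yearly visit count. The data, the choice of two clusters and the starting centroids are assumptions made for the example; a production system would normally rely on an established library and scaled features.

```python
from math import dist

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(axis) / len(c) for axis in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical patients as (age, visits per year).
patients = [(25, 1), (30, 2), (28, 1), (70, 9), (75, 12), (68, 10)]
centroids, clusters = kmeans(patients, centroids=[patients[0], patients[3]])

for center, members in zip(centroids, clusters):
    print(center, members)
```

Here the algorithm separates the younger, low-utilization patients from the older, high-utilization ones without being told which group anyone belongs to, which is the defining property of clustering as opposed to classification.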

On top of mining large databases, such as hospital EHRs, data mining techniques include:

  • web mining,
  • network approaches,
  • text mining,
  • natural language processing (NLP),
  • machine learning,
  • predictive modeling,
  • relationship and link analysis,
  • statistical analysis, etc.

The healthcare industry possesses rich data sources, such as electronic medical records, administrative reports and other benchmarking findings.

Purposes of data mining in healthcare

Today, data mining in healthcare is used mainly for predicting various diseases, assisting with diagnosis and advising doctors in making clinical decisions. But the potential of data mining is much bigger: it can provide question-based answers, anomaly-based discoveries, more informed decisions, probability measures, predictive modeling, and decision support.

Using data mining, the healthcare industry can be very effective in such fields as:

  • medical research,
  • pharmaceuticals,
  • medical devices,
  • genetics,
  • hospital management,
  • and healthcare insurance.

Examples of healthcare data mining applications

Let’s review some applications of data mining in the healthcare industry and how mathematical and statistical data mining can address various cases in the clinical, financial, and operational environments to find best practices and the most effective solutions.

Detection and prevention of fraud and abuse

One of the most prominent examples of data mining use in healthcare is the detection and prevention of fraud and abuse. In this area, data mining techniques involve establishing normal patterns and then identifying unusual patterns in the medical claims submitted by healthcare providers (clinics, doctors, labs, etc.).
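A minimal sketch of this idea, assuming each provider is summarized by a single number (a hypothetical average claim amount): the "normal pattern" is the median across peers, and a provider is flagged when its modified z-score, based on the median absolute deviation, exceeds a threshold. The provider names and amounts are invented. The 0.6745 constant and the 3.5 cutoff are a commonly used convention for this score, not a regulatory standard, and real fraud detection combines many more signals.

```python
from statistics import median

# Hypothetical average claim amounts per provider (USD).
claims = {
    "clinic_a": 210.0,
    "clinic_b": 195.0,
    "clinic_c": 205.0,
    "clinic_d": 220.0,
    "clinic_e": 640.0,
    "clinic_f": 200.0,
}

def flag_unusual(values, threshold=3.5):
    """Return keys whose modified z-score (median/MAD based) exceeds threshold."""
    center = median(values.values())
    mad = median(abs(v - center) for v in values.values())
    if mad == 0:
        return []  # all values (nearly) identical: nothing stands out
    return [
        name for name, v in values.items()
        if 0.6745 * abs(v - center) / mad > threshold
    ]

flagged = flag_unusual(claims)
print(flagged)
```

A median-based score is used here instead of mean and standard deviation because a single extreme provider would otherwise inflate the standard deviation and partly hide itself. A flag is only a prompt for human review, not proof of fraud.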

On May 17, 2013, the Department of Health and Human Services (HHS) issued the final rule "State Medicaid Fraud Control Units; Data Mining", which permits Federal financial participation in the cost of data mining if certain criteria are satisfied.

Medicaid Fraud Control Units (MFCUs) must submit data mining applications to the Office of Inspector General for approval. The Centers for Medicare & Medicaid Services (CMS) also updated its data mining rules to enrich patient care. The updated CMS rule makes identifiable data files (IDFs) available to certain stakeholders as allowed by federal laws and regulations and CMS policy.

Measuring treatment effectiveness

This application involves comparing and contrasting symptoms, causes and courses of treatment to find the most effective course of action for a certain illness or condition. Data mining tools compare symptoms, causes, treatments and negative effects, identify the side effects of a particular treatment, and analyze which decision would be most effective.
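At its simplest, such a comparison is a grouped summary of outcomes per treatment. The sketch below assumes hypothetical records of the form (treatment, recovered, had side effect); the treatment names and values are invented, and a real analysis would also need to control for patient differences and test for statistical significance.

```python
from collections import defaultdict

# Hypothetical outcome records: (treatment, recovered, had_side_effect).
outcomes = [
    ("drug_a", True, False),
    ("drug_a", True, True),
    ("drug_a", False, False),
    ("drug_a", True, False),
    ("drug_b", True, True),
    ("drug_b", False, True),
    ("drug_b", False, False),
    ("drug_b", True, False),
]

def summarize(rows):
    """Recovery and side-effect rates per treatment."""
    totals = defaultdict(lambda: {"n": 0, "recovered": 0, "side_effects": 0})
    for treatment, recovered, side_effect in rows:
        t = totals[treatment]
        t["n"] += 1
        t["recovered"] += recovered      # True counts as 1
        t["side_effects"] += side_effect
    return {
        name: {
            "recovery_rate": t["recovered"] / t["n"],
            "side_effect_rate": t["side_effects"] / t["n"],
        }
        for name, t in totals.items()
    }

summary = summarize(outcomes)
best = max(summary, key=lambda name: summary[name]["recovery_rate"])
print(best, summary[best])
```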

Through data mining, providers can develop smarter treatment methodologies and better standards of medical and care practice. For example, a research paper published in the International Journal of Scientific & Engineering Research explores a case of data mining used by United HealthCare.

The company has mined its treatment record data to find ways to deliver better medicine at a lower cost. The data obtained allowed it to develop clinical profiles “to give physicians information about their practice patterns and to compare these with those of other physicians and peer-reviewed industry standards.”

Aiding hospital management

Data mining tools can identify and track chronic disease states and high-risk patients, help develop appropriate treatment schemes, and reduce the number of hospital admissions and claims.

The paper "A Survey of Health Care Prediction Using Data Mining" cites the Arkansas Data Network data mining initiative as an example of an organization that is developing better diagnosis and treatment protocols. The network analyzes readmission and resource utilization data and compares its data with current scientific literature to “determine the best treatment options, thus using evidence to support medical care and streamlining the process”.

These are only a few examples of data mining in healthcare, but its potential and benefits for healthcare systems are very promising.

Benefits of healthcare data mining

Data mining is gaining momentum in the healthcare industry because it offers benefits to all stakeholders:

  • Care providers can use data mining to identify effective treatments and best practices as well as to develop guidelines and standards of care;
  • Patients, especially those with chronic or high-risk diseases, can receive better, more affordable healthcare services through proper identification and tracking and the use of appropriate interventions and treatment protocols;
  • Healthcare organizations can use data mining to improve patient satisfaction, to provide more patient-centered care, and to decrease costs and increase operating efficiency while maintaining high-quality care;
  • Insurance organizations can detect medical insurance fraud and abuse through data mining and reduce their losses.

