Data Governance and Frankenstein

“Knowledge is Knowing Frankenstein isn’t the Monster, Wisdom is Knowing Frankenstein is the Monster”
- Alexandra Melnick

The physicians I work with tell me that before the EMR came on the scene, the extent of patient “data” was whatever could be stuffed into a manila file folder bursting at the seams. Whatever condition it lived in, healthcare data has been collected and used not only to treat patients but also to support the billing and claims process. With the advent and expansion of EMRs, the amount of patient data being recorded has grown exponentially. Reports say that in 2011 the data from the U.S. healthcare system alone reached 150 exabytes. At this rate of growth, big data for U.S. healthcare will soon reach the zettabyte (10^21 bytes) scale and, not long after, the yottabyte (10^24 bytes)!
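For anyone who wants to check the unit arithmetic, here is a small sketch. It assumes decimal (SI) prefixes, where an exabyte is 10^18 bytes, and the variable names are my own, purely for illustration.

```python
import math

# Decimal (SI) byte units; an assumption, since binary units differ slightly.
EXABYTE = 10**18
ZETTABYTE = 10**21
YOTTABYTE = 10**24

# The 2011 figure quoted above: 150 exabytes.
us_healthcare_2011 = 150 * EXABYTE

# How much of a zettabyte is that?
fraction_of_zettabyte = us_healthcare_2011 / ZETTABYTE

# How many times would the volume have to double to reach one zettabyte?
doublings_to_zettabyte = math.log2(ZETTABYTE / us_healthcare_2011)

print(f"{fraction_of_zettabyte:.2f} ZB")               # 0.15 ZB
print(f"{doublings_to_zettabyte:.1f} doublings to 1 ZB")  # roughly 2.7
```

In other words, 150 exabytes is already about a seventh of a zettabyte, so fewer than three doublings would cross that line.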

Not only is this a massive volume of data, but it poses the additional challenge of coming in two flavors: structured and unstructured. For anyone not familiar with these terms, picture writing a review for a product on Amazon: choosing between 1 and 5 stars is discrete, structured data, but typing out the more detailed review in sentences is unstructured data. In the same way, much of a patient’s clinical data is stored in unstructured form, such as in clinical notes. The sheer volume, combined with the difficulty of consuming unstructured data, has called for the help of Big Data scientists and technology. The goal is to make sense of the data and extract as much value from it as possible. To put this vast amount of healthcare data to use, it needs to be properly maintained, curated, and structured so that patients, health systems, and startups can all benefit.
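To make the product review analogy concrete, here is a minimal sketch. The `Review` class, the `mentions` helper, and the sample reviews are all made up for illustration. The point is that the star rating can be averaged directly, while the free text needs some kind of processing (here, a deliberately naive keyword search) before it answers any question.

```python
import re
from dataclasses import dataclass


@dataclass
class Review:
    stars: int  # structured: a discrete value from 1 to 5
    text: str   # unstructured: free-form narrative


def mentions(review: Review, term: str) -> bool:
    """Naive whole-word keyword check on the free text.

    Real clinical notes need far more than this (negation, synonyms,
    abbreviations); this only shows that text must be processed first.
    """
    pattern = rf"\b{re.escape(term)}\b"
    return re.search(pattern, review.text, flags=re.IGNORECASE) is not None


reviews = [
    Review(5, "Arrived early and the battery lasts all day."),
    Review(2, "Battery died after a week."),
    Review(4, "Great screen, decent price."),
]

# Structured data: usable as-is.
average_stars = sum(r.stars for r in reviews) / len(reviews)

# Unstructured data: needs interpretation before it can be counted.
battery_reviews = [r for r in reviews if mentions(r, "battery")]

print(f"average rating: {average_stars:.2f}")
print(f"reviews mentioning the battery: {len(battery_reviews)}")
```

A keyword search like this would miss a reviewer who wrote “it won’t hold a charge,” which is exactly why unstructured clinical notes are so much harder to consume than coded fields.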

So, what is Data Governance? Data Governance is a quality control discipline for assessing, managing, using, improving, monitoring, maintaining, and protecting organizational information. Think of how a world-class gardener runs a garden: choosing the appropriate seeds or crops, deciding where to plant them, picking the best soil, and nurturing them with the right fertilizer and water at the right times. A health system that ignores Data Governance will be in for a long reporting winter if it can’t reap the benefits of great, usable data. Or, to use the analogy of the quote at the start of this post, if Dr. Frankenstein had had better quality control for assessing and managing the brains for his “science project”, he would have gotten into a lot less trouble!

The goal of Data Governance in healthcare, specifically, is for patient data to be complete and valid, to be understood throughout the organization (e.g., a health system), and to have maximum value. When an organization meets these criteria, its data will be ripe for use by modern technology and by experts who can leverage it for population management, predictive analytics, precision medicine, and more. When executed properly, data governance lets us turn the knowledge locked in raw data into wisdom.
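What “complete and valid” might look like in practice can be sketched in a few lines. Everything here is an assumption for illustration: the field names, the set of accepted codes, and the rules themselves are hypothetical, and a real governance program would define them with clinicians and data stewards.

```python
from datetime import date

# Hypothetical rules, for illustration only.
REQUIRED_FIELDS = ("patient_id", "birth_date", "sex")
VALID_SEX_CODES = {"F", "M", "O", "U"}


def check_record(record: dict) -> list[str]:
    """Return a list of data quality problems found in one patient record."""
    problems = []

    # Completeness: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            problems.append(f"missing {field}")

    # Validity: values that are present must make sense.
    birth = record.get("birth_date")
    if isinstance(birth, date) and birth > date.today():
        problems.append("birth_date is in the future")

    sex = record.get("sex")
    if sex not in (None, "") and sex not in VALID_SEX_CODES:
        problems.append(f"unknown sex code {sex!r}")

    return problems


records = [
    {"patient_id": "A1", "birth_date": date(1980, 5, 17), "sex": "F"},
    {"patient_id": "A2", "birth_date": None, "sex": "X"},
    {"patient_id": "", "birth_date": date(2999, 1, 1), "sex": "M"},
]

for record in records:
    print(record["patient_id"] or "(no id)", check_record(record))
```

The first record passes cleanly, while the other two are flagged before they can quietly skew a report downstream.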

As with most things health tech, Data Governance may not sound like the most glamorous undertaking. But for those of us who constantly look for ways to put technology to work for patients and providers, it’s exciting! Forward-thinking health systems are paving the way by investing in this area. The University of Mississippi worked for 18 months on Data Governance without producing a single report. But this hard work has paid dividends: with the groundwork in place, they have been able to produce 40 data visualization apps and 1,200 reports with just five report writers. UPMC recently partnered with informatics to build a single backbone of data re-usability for clean, safe, and connected data.

Startups are emerging to help meet the need for data governance and advocate for its importance, with investors in tow. Heureka Software recently raised $1.1M in seed funding to help work with “Dark Data” (no, not Dark Matter; scientists are still working on figuring that one out). Dark data represents all electronically stored artifacts and files that sit outside of core transaction-based systems. Phemi Health Systems, which offers a Hadoop-based big data warehouse with a heavy emphasis on privacy and data governance, has also received up to $15M in funding.

The further adoption of strong Data Governance processes and tools will take the countless pieces of data scattered throughout patient stories and mold them into actionable wisdom to help patients both today and in the future. Healthcare has always been a bit of a “horse-and-buggy” industry compared to others. This is partly due to an aversion to technology (one that has been rapidly fading in recent years) and partly due to industry regulations that make it hard for innovation to disrupt this space as quickly as it wants (and needs) to. Data Governance may not be as exciting as virtual reality or self-driving cars, but these early successes show how powerful this knowledge can become with the right expertise and discipline.