Linguamatics CaseStudy Payer PDF


Big data analytics for population health

With Linguamatics Health, powered by the I2E text mining solution, healthcare payers can extract
member insights from unstructured big data to improve population risk stratification.

Quick facts
Situation: A top-5 payer needed to mine member-related data from a mixture of unstructured
formats held in a data lake, to strengthen their analysis of Congestive Heart Failure (CHF)
populations. The payer wanted to integrate the extracted data with conventional data
warehousing and analytics approaches, to support improved patient stratification.

Solution: The payer implemented the Linguamatics Health I2E platform with an automation
workflow to ingest data from Hadoop, process it, and load it into a data warehouse for analysis.

Success: The team demonstrated that Linguamatics Health can be integrated into existing
Hadoop and Netezza systems, to gather and use insights from unstructured data as part
of risk stratification analytics for CHF. The Linguamatics Health I2E infrastructure can be
easily extended to support other disease areas and risk factors such as diabetes, obesity,
and more.

Situation

Many payers are assessing how to improve stratification of patient populations using big data to fuel the
drive toward better member wellness. Risk stratification has so far been biased toward structured data, with
major investments in data warehouses, analytical tools, dashboards, and Master Data Management (MDM).
However, the growing availability of electronic health record (EHR) data in Continuity of Care
Document (CCD) format from providers, together with extensive notes about members and nurses’ notes,
means there is huge untapped potential in unstructured data. To manage these documents, many groups
are using Hadoop, which has proven to scale to the data volumes payers need to support.

But how can payers use unstructured data to stratify populations more effectively when much of
their infrastructure is tied to structured data and the sources of unstructured data are so varied?
How can these data worlds be brought together?

As long-term member wellness grows in importance, the insights trapped in unstructured data
will become the differentiator in a changing and competitive market.
Solution

Linguamatics teamed up with a top-5 payer to transform their unstructured data into fuel to drive risk
stratification. Unstructured data from CCDs, nurses’ notes, and Optical Character Recognition (OCR)
documents, stored in Cloudera, needed to be loaded into Netezza in structured form.

Unstructured data is extracted from Cloudera Hadoop Distributed File System (HDFS) and passed through
an I2E pipeline to mine risk factors of interest (such as diseases and family history, and lifestyle factors
such as smoking), and then turned into structured data. This process is described in more detail in Figure 1.

I2E is used to extract, for example, a person’s smoking status so that people can be grouped by smoking
behavior. The different ways smoking status can be expressed linguistically are incorporated into the
query, which returns a consistent, normalized value for each person’s status. Standard Extract,
Transform, Load (ETL) approaches are used to load the structured data output from I2E into Netezza.
Improving the efficiency and effectiveness of drug trial planning in this way reduces the cost of the whole
trial, and has the potential to minimize the time before the drug reaches the market.
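
The smoking status normalization and load step described above can be illustrated with a minimal Python sketch. It is not I2E's query language or API, which this document does not show. The phrase patterns, the `normalize_smoking_status` and `load_statuses` helpers, and the `member_smoking` table are hypothetical, and an in-memory SQLite database stands in for Netezza.

```python
import re
import sqlite3

# Hypothetical phrase patterns, checked in order: "never" and "former" come
# first so that "non-smoker" and "ex-smoker" are not caught by "smoker".
# Real clinical text needs far richer handling (for example, negation).
SMOKING_PATTERNS = [
    ("never", re.compile(r"\b(never smoked|non-?smoker|denies (smoking|tobacco))\b", re.I)),
    ("former", re.compile(r"\b(ex-?smoker|former smoker|quit smoking|stopped smoking)\b", re.I)),
    ("current", re.compile(r"\b(current smoker|smokes|smoker|packs? (a|per) day)\b", re.I)),
]


def normalize_smoking_status(note: str) -> str:
    """Map free-text wording to one normalized smoking status value."""
    for status, pattern in SMOKING_PATTERNS:
        if pattern.search(note):
            return status
    return "unknown"


def load_statuses(conn: sqlite3.Connection, notes) -> int:
    """Load (member_id, normalized status) rows into a structured table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS member_smoking "
        "(member_id TEXT PRIMARY KEY, status TEXT)"
    )
    rows = [(member_id, normalize_smoking_status(text)) for member_id, text in notes]
    conn.executemany("INSERT OR REPLACE INTO member_smoking VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)
```

Once loaded, the normalized column can be grouped or joined like any other structured field, which is the point of normalizing the wording before the ETL step.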

Success

Transforming unstructured data into actionable insights


The team demonstrated that I2E can be integrated into existing Hadoop and Netezza systems to gain
insights from unstructured data to be used as part of risk stratification analytics for CHF. Linguamatics
Health helps the payer advance their ability to stratify patients at a much more detailed level of insight.
In addition, by cross-referencing insights across multiple sources of unstructured data, a more complete
picture emerges.

Leveraging existing infrastructure


By providing an extraction pipeline that supports existing investments in big data and data warehouses,
rather than tearing out well understood and established approaches, Linguamatics is able to plug
natural language processing (NLP) into these systems to enhance understanding of members based on
unstructured data.

Fast time to value


The payer learned to build queries very quickly thanks to Linguamatics’ GUI-driven NLP interface; users
do not have to be NLP experts. The system is fully configurable, so modifications can be made easily
without Linguamatics Professional Services.

Extending value into other disease areas and applications


The I2E infrastructure can be easily extended to support other disease areas and risk factors, for
example COPD, diabetes, and obesity. Linguamatics Health is also being used to investigate reducing the
manual chart review required for HEDIS (Healthcare Effectiveness Data and Information Set) metrics and
improving the capture rate, demonstrating the power and flexibility of this enterprise NLP platform.
Figure 1: Technical workflow

1 Unstructured member-related documents stored in HDFS in Hadoop are extracted and sent to the
Linguamatics Asynchronous Messaging Pipeline (AMP) via RESTful Web Services to manage information
extraction.

2 AMP distributes the documents across multiple I2E servers depending on the required workload.
If servers are down, or there are connection issues, AMP will reschedule the extraction jobs.

3 Multiple instances of I2E receive documents; these are indexed and information is extracted. Extracted
information may include diseases, medications, and lab values, as well as concepts such as lifestyle
factors (smoking, and alcohol and drug use), ambulatory status, living location, and social determinants
(social support network).

4 The extracted information is returned to AMP in XML, JSON, or CSV/TSV format.

5 AMP sends the extracted data to data warehousing or MDM solutions.

6 The end user/automated routine is able to run population analytics across structured and unstructured
data sets to improve risk stratification.
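
Steps 2 to 4 above (distributing documents across servers, rescheduling failed jobs, and returning results as JSON) can be sketched in Python as follows. This is an illustration only: AMP's actual scheduling logic and interfaces are not described in this document, so the round-robin strategy, the `dispatch` function, the `ServerUnavailable` exception, the retry limit, and the two example server functions are all assumptions.

```python
import json
from collections import deque


class ServerUnavailable(Exception):
    """Raised by a (hypothetical) server that cannot take a job right now."""


def dispatch(documents, servers, max_attempts=3):
    """Send documents to servers round-robin and requeue jobs that fail.

    Returns (results, failed): JSON strings for extracted documents and the
    ids of documents that still failed after max_attempts tries.
    """
    queue = deque((doc_id, text, 1) for doc_id, text in documents)
    results, failed = [], []
    turn = 0
    while queue:
        doc_id, text, attempt = queue.popleft()
        server = servers[turn % len(servers)]
        turn += 1
        try:
            extracted = server(text)
        except ServerUnavailable:
            if attempt < max_attempts:
                # Reschedule the job at the back of the queue.
                queue.append((doc_id, text, attempt + 1))
            else:
                failed.append(doc_id)
            continue
        results.append(json.dumps({"doc_id": doc_id, **extracted}, sort_keys=True))
    return results, failed


def healthy_server(text):
    # Stand-in for an extraction server: flags a CHF mention by keyword.
    return {"mentions_chf": "heart failure" in text.lower()}


def down_server(text):
    # Stand-in for a server that is down or unreachable.
    raise ServerUnavailable("connection refused")
```

With `dispatch(docs, [healthy_server, down_server])`, a job that lands on the unavailable server is requeued and completes on its next turn with the healthy one, so no document is lost while at least one server is reachable.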

If you are interested in learning more about Linguamatics Health, I2E, and population health and
risk stratification, please email enquiries@linguamatics.com

© 2017 Linguamatics Ltd. The Linguamatics logo is a trademark of Linguamatics Ltd. All rights reserved. All other trademarks mentioned in this document
are the property of their respective owners.
