Unstructured Data
Unstructured data is data that does not conform to a data model and has no easily identifiable
structure, so it cannot be used easily by a computer program. Unstructured data is not
organised in a pre-defined manner and does not have a pre-defined data model, so it is not a
good fit for a mainstream relational database.
Characteristics of Unstructured Data:
• Data neither conforms to a data model nor has any structure.
• Data cannot be stored in the form of rows and columns as in databases
• Data does not follow any semantics or rules
• Data lacks any particular format or sequence
• Data has no easily identifiable structure
• Due to the lack of identifiable structure, it cannot be used by computer programs easily
Sources of Unstructured Data:
• Web pages
• Images (JPEG, GIF, PNG, etc.)
• Videos
• Memos
• Reports
• Word documents and PowerPoint presentations
• Surveys
Advantages of Unstructured Data:
• It supports data that lacks a proper format or sequence
• The data is not constrained by a fixed schema
• It is very flexible due to the absence of a schema
• Data is portable
• It is very scalable
• It can deal easily with the heterogeneity of sources
• This type of data has a variety of business intelligence and analytics applications
Disadvantages of Unstructured Data:
• It is difficult to store and manage unstructured data due to lack of schema and structure
• Indexing the data is difficult and error-prone because the structure is unclear and there are
no pre-defined attributes. As a result, search results are not very accurate.
• Ensuring the security of the data is a difficult task.
Problems faced in storing unstructured data:
• It requires a lot of storage space to store unstructured data.
• It is difficult to store video, image, and audio files.
• Due to the unclear structure, operations like update, delete, and search are very difficult.
• Storage cost is high as compared to structured data
• Indexing the unstructured data is difficult
Possible solutions for storing unstructured data:
• Unstructured data can be converted to easily manageable formats
• A content-addressable storage (CAS) system can be used to store unstructured data.
It stores data based on its metadata, and a unique name is assigned to every object
stored in it. An object is retrieved based on its content, not its location.
• Unstructured data can be stored in XML format.
• Unstructured data can be stored in an RDBMS that supports BLOBs
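The content-addressable idea above can be sketched in a few lines of Python. This is a toy in-memory illustration, not how any particular CAS product works: the class name `ContentAddressableStore` is hypothetical, and using a SHA-256 hash of the bytes as the "unique name" is an assumption made for the example.

```python
import hashlib


class ContentAddressableStore:
    """Toy in-memory CAS: each object is keyed by the SHA-256 hash of its bytes."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself, not from a path or location.
        address = hashlib.sha256(data).hexdigest()
        self._objects[address] = data
        return address

    def get(self, address: str) -> bytes:
        # Retrieval needs only the content-derived address.
        return self._objects[address]


store = ContentAddressableStore()
memo_address = store.put(b"Quarterly memo: sales up in the north region")
# Storing identical content again yields the same address, so no duplicate is kept.
duplicate_address = store.put(b"Quarterly memo: sales up in the north region")
```

Because the address depends only on the bytes, two identical files map to one stored object, which is one reason this approach suits large collections of documents, images, and videos.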
Extracting information from unstructured data:
Unstructured data does not have any structure, so it cannot easily be interpreted by conventional
algorithms. It is also difficult to tag and index unstructured data, so extracting information
from it is a tough job. Here are possible solutions:
• Taxonomies, or classifications of data, help organise data in a hierarchical structure,
which makes the search process easier.
• Data can be stored in a virtual repository and be automatically tagged. Documentum is
one example.
• Use of application platforms like XOLAP.
XOLAP helps in extracting information from e-mails and XML-based documents.
• Use of various data mining tools
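As a rough illustration of the taxonomy and automatic-tagging ideas in the list above, the following sketch tags a document by matching keywords against a small hierarchy. The `TAXONOMY` contents and the `tag_document` helper are hypothetical, and plain substring matching stands in for the far richer classification a real repository would use.

```python
# Hypothetical two-level taxonomy: category -> subcategory -> trigger keywords.
TAXONOMY = {
    "finance": {"invoice": ["invoice", "payment due"], "budget": ["budget", "forecast"]},
    "hr": {"leave": ["vacation", "sick leave"], "hiring": ["interview", "candidate"]},
}


def tag_document(text):
    """Return sorted 'category/subcategory' tags whose keywords occur in the text."""
    lowered = text.lower()
    tags = []
    for category, subcategories in TAXONOMY.items():
        for subcategory, keywords in subcategories.items():
            if any(keyword in lowered for keyword in keywords):
                tags.append(f"{category}/{subcategory}")
    return sorted(tags)


memo = "Please approve the invoice before the candidate interview on Friday."
memo_tags = tag_document(memo)  # tags from both the finance and hr branches
```

Once documents carry hierarchical tags like these, a search can be narrowed to one branch of the taxonomy instead of scanning every document.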
Importance of Unstructured Data:
Unstructured data, in the form of text, images, audio, and video, constitutes a vast majority of
the data generated today. Recognizing its significance is crucial for harnessing comprehensive
insights and making informed decisions. Unstructured data includes social media posts, emails,
customer reviews, and multimedia content, providing valuable but often untapped information.
1. Rich Information Source: Unstructured data holds nuanced and qualitative
information, offering deeper insights into customer sentiments, preferences, and market
trends compared to structured data alone.
2. Holistic Understanding: To gain a holistic understanding of a situation, businesses
need to analyze both structured and unstructured data. Unstructured data contributes
context, helping to paint a complete picture.
3. Innovation and Competitive Advantage: Organizations leveraging unstructured data
gain a competitive edge by uncovering hidden patterns, emerging trends, and innovative
ideas that might be overlooked in structured data.
Unstructured Data Analytics: Descriptive, Diagnostic, Predictive, and Prescriptive
Analytics:
Analyzing unstructured data involves different levels of analytics, each serving a unique
purpose.
1. Descriptive Analytics: Describes what has happened, summarizing historical data. In
the context of unstructured data, this could involve sentiment analysis, summarization,
and categorization to understand the current state.
2. Diagnostic Analytics: This goes a step further to determine why something happened.
Unstructured data diagnostic analytics might involve root cause analysis, identifying
factors influencing trends or patterns.
3. Predictive Analytics: Forecasts future trends based on patterns identified in historical
and current data. In unstructured data, predictive analytics could involve predicting
customer behavior, market trends, or emerging issues.
4. Prescriptive Analytics: Recommends actions to optimize outcomes. Utilizing
unstructured data, prescriptive analytics provides insights into what actions should be
taken to achieve desired results, guiding decision-makers proactively.
Case Study: Application of Unstructured Data Analytics:
Consider a retail company analyzing customer reviews (unstructured data) alongside sales data
(structured data). Descriptive analytics reveals sentiments and popular products. Diagnostic
analytics uncovers the reasons behind positive or negative sentiments. Predictive analytics
forecasts potential trends based on past reviews. Finally, prescriptive analytics suggests actions,
such as modifying marketing strategies or enhancing specific product features, to optimize
customer satisfaction and sales.
In conclusion, the importance of unstructured data lies in its ability to offer a more profound
understanding of various aspects, while analytics on this data spectrum provides a
comprehensive approach to decision-making, thereby fostering innovation and a competitive
edge.
Case Study: Enhancing Customer Satisfaction through Unstructured Data Analytics
Background:
A multinational e-commerce company faced challenges in understanding customer sentiments
and improving overall satisfaction. The company operated in diverse markets, and traditional
structured data analysis didn't capture the nuanced feedback scattered across customer reviews,
social media comments, and emails.
Objectives:
1. Understand Customer Sentiments: Analyze unstructured data to comprehend
customer sentiments towards products and services.
2. Identify Key Issues: Use diagnostic analytics to identify specific issues or trends
impacting customer satisfaction.
3. Predict Future Trends: Apply predictive analytics to forecast potential shifts in
customer preferences and behaviors.
4. Prescribe Actionable Insights: Provide prescriptive analytics to suggest strategies for
enhancing customer satisfaction.
Methodology:
1. Descriptive Analytics:
• Utilized Natural Language Processing (NLP) tools to analyze customer reviews,
social media mentions, and emails.
• Categorized sentiments into positive, negative, and neutral to gain an overview
of customer feelings.
2. Diagnostic Analytics:
• Identified common themes in negative sentiments, such as delivery delays and
product quality concerns.
• Cross-referenced structured data on returns and complaints to pinpoint root
causes.
3. Predictive Analytics:
• Developed predictive models to anticipate potential shifts in customer
preferences based on historical unstructured data.
• Used sentiment analysis to forecast emerging trends and potential areas of
concern.
4. Prescriptive Analytics:
• Provided actionable insights to address identified issues, such as implementing
faster delivery options and improving quality control processes.
• Suggested targeted marketing strategies based on predictive insights to align
with evolving customer preferences.
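The first two methodology steps can be sketched as follows. This is a deliberately tiny, lexicon-based illustration and not the tooling used in the case study: the word lists, the `THEMES` mapping, and the function names are all assumptions made for the example, and production NLP tools are considerably more sophisticated.

```python
from collections import Counter

# Hypothetical mini-lexicons; real sentiment tools use far larger resources.
POSITIVE_WORDS = {"great", "love", "fast", "excellent"}
NEGATIVE_WORDS = {"late", "broken", "slow", "poor"}
THEMES = {
    "delivery": {"delivery", "late", "shipping"},
    "quality": {"broken", "quality", "poor"},
}


def tokenize(text):
    return [word.strip(".,!?").lower() for word in text.split()]


def classify_sentiment(text):
    """Descriptive step: label a review positive, negative, or neutral."""
    words = tokenize(text)
    score = sum(w in POSITIVE_WORDS for w in words) - sum(w in NEGATIVE_WORDS for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def themes_in_negative_reviews(reviews):
    """Diagnostic step: count which themes appear in the negative reviews."""
    counts = Counter()
    for review in reviews:
        if classify_sentiment(review) == "negative":
            words = set(tokenize(review))
            for theme, keywords in THEMES.items():
                if words & keywords:
                    counts[theme] += 1
    return counts


reviews = [
    "Love it, excellent product!",
    "Delivery was late and the box arrived broken.",
    "Shipping was slow.",
    "It is a phone.",
]
sentiments = [classify_sentiment(r) for r in reviews]
negative_themes = themes_in_negative_reviews(reviews)
```

Counting themes only within negative reviews is what moves the analysis from "how do customers feel" toward "what are they unhappy about", which can then be cross-referenced with structured returns and complaints data.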
Results:
1. Improved Customer Satisfaction: By addressing specific concerns highlighted in
unstructured data, the company witnessed a noticeable improvement in overall
customer satisfaction scores.
2. Proactive Issue Resolution: Predictive analytics helped the company proactively
address emerging issues before they became widespread, minimizing negative impacts
on customer experience.
3. Enhanced Product Development: Insights from unstructured data guided product
development strategies, ensuring new offerings aligned with customer expectations.
4. Cost Savings: Efficiently targeted resources and initiatives based on prescriptive
analytics, resulting in cost savings and improved ROI.
Conclusion:
This case study illustrates the transformative impact of unstructured data analytics on customer
satisfaction. By integrating insights from various sources, the company not only gained a
comprehensive understanding of customer sentiments but also proactively addressed issues,
fostering a positive customer experience and supporting long-term business growth.
Case Study: Unstructured Data Analytics in Healthcare for Patient Care Optimization
Background:
A regional hospital aimed to enhance patient care by harnessing insights from both structured
and unstructured data. Traditional methods primarily focused on structured clinical data, but
valuable information existed in unstructured formats, such as doctors' notes, patient feedback,
and medical forums.
Objectives:
1. Comprehensive Patient Profile: Develop a holistic patient profile by integrating
structured medical records with unstructured data, including doctor notes, patient
emails, and community forum discussions.
2. Early Diagnosis and Intervention: Use diagnostic analytics to identify patterns in
unstructured data that could lead to early diagnosis and intervention, improving patient
outcomes.
3. Predictive Analytics for Disease Trends: Forecast potential disease trends and health
issues by analyzing unstructured data sources, enabling proactive healthcare planning
and resource allocation.
4. Prescriptive Analytics for Personalized Care: Provide prescriptive insights to guide
personalized patient care plans based on a combination of structured and unstructured
data.
Methodology:
1. Descriptive Analytics:
• Employed Natural Language Processing (NLP) to extract information from
unstructured sources, translating doctor notes and patient communications into
structured data.
• Categorized patient sentiments and concerns to create a comprehensive patient
profile.
2. Diagnostic Analytics:
• Identified recurring patterns in unstructured data related to patient symptoms,
treatment responses, and side effects.
• Correlated this information with structured data to diagnose potential issues and
improve treatment strategies.
3. Predictive Analytics:
• Developed predictive models to anticipate potential disease outbreaks or
prevalent health issues within the community.
• Analyzed unstructured patient data to identify early indicators of chronic
conditions, allowing for proactive interventions.
4. Prescriptive Analytics:
• Provided healthcare professionals with actionable insights based on patient-
specific unstructured data, guiding the development of personalized care plans.
• Integrated predictive analytics to recommend preventive measures for patients
with identified health risks.
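The step of translating free-text notes into structured data can be illustrated with a small pattern-based extractor. The note format, the regular expressions, and the `extract_record` helper are assumptions made for this sketch; real clinical text varies widely and normally needs dedicated NLP tooling.

```python
import re

# Hypothetical note conventions, e.g. "BP 128/82" and "temp 99.1 F".
BP_PATTERN = re.compile(r"BP\s*(\d{2,3})/(\d{2,3})")
TEMP_PATTERN = re.compile(r"temp(?:erature)?\s*(\d{2,3}(?:\.\d)?)\s*F", re.IGNORECASE)
SYMPTOMS = ("cough", "fever", "headache", "nausea")


def extract_record(note):
    """Turn a free-text note into a flat dictionary (one structured row)."""
    record = {"systolic": None, "diastolic": None, "temperature_f": None}
    bp = BP_PATTERN.search(note)
    if bp:
        record["systolic"] = int(bp.group(1))
        record["diastolic"] = int(bp.group(2))
    temp = TEMP_PATTERN.search(note)
    if temp:
        record["temperature_f"] = float(temp.group(1))
    lowered = note.lower()
    # Simple keyword spotting; it does not understand negation such as "no fever".
    record["symptoms"] = [s for s in SYMPTOMS if s in lowered]
    return record


note = "Pt reports dry cough and mild headache. BP 128/82, temp 99.1 F."
record = extract_record(note)
```

Each extracted dictionary can be stored as a row next to the structured medical record, which is what makes the later correlation and prediction steps possible.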
Results:
1. Comprehensive Patient Profiles: Integration of structured and unstructured data
allowed healthcare professionals to have a more comprehensive understanding of each
patient's medical history and preferences.
2. Early Diagnosis and Intervention: Diagnostic analytics facilitated early identification
of potential health issues, enabling timely intervention and improved patient outcomes.
3. Proactive Healthcare Planning: Predictive analytics helped the hospital proactively
allocate resources and plan for potential disease trends, enhancing community
healthcare initiatives.
4. Personalized Patient Care: Prescriptive analytics guided the development of
personalized care plans, leading to higher patient satisfaction and improved overall
healthcare quality.
Conclusion:
This case study demonstrates the transformative impact of unstructured data analytics in
healthcare. By leveraging insights from both structured and unstructured data, the hospital
optimized patient care, achieved early diagnosis, and enhanced proactive healthcare planning,
ultimately improving health outcomes for the community.
Analytics is used in almost every industry, and many of the technological changes you see every
day are driven by analytics. This section covers the four main types of analytics:
• Descriptive Analytics
• Diagnostic Analytics
• Predictive Analytics
• Prescriptive Analytics
Each type is discussed below.
Descriptive Analytics :
Descriptive analytics deals with data on past trends: it finds out what has happened in
the past. One of its main objectives is to look at the trends in past data and summarize them
in a way that is useful for generating insight, which the other types of analytics can build on.
Example –
Take the example of DMart. We can look at the sales history and find out which
products have sold more, or which products are in high demand, by looking at the
sales trends. Based on this analysis, we can decide to stock those items in larger
quantities for the coming year.
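A minimal sketch of this kind of summary, using a made-up sales log (the products, quantities, and the `units_by_product` helper are all hypothetical):

```python
from collections import Counter

# Hypothetical sales log: (product, units sold) per transaction.
sales_log = [
    ("rice", 40), ("soap", 15), ("rice", 35),
    ("tea", 20), ("soap", 10), ("rice", 25),
]


def units_by_product(log):
    """Summarize past transactions into total units sold per product."""
    totals = Counter()
    for product, units in log:
        totals[product] += units
    return totals


totals = units_by_product(sales_log)
top_products = totals.most_common(2)  # the two best-selling products
```

The output only describes what was sold; deciding how much to stock next year is a separate step that builds on this summary.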
Diagnostic Analytics :
Diagnostic analytics works hand in hand with descriptive analytics. Where descriptive analytics
finds out what happened in the past, diagnostic analytics finds out why it happened,
what measures were taken at that time, and how frequently it has happened. It
gives a detailed explanation of a particular scenario by understanding behavior
patterns.
Example –
Take the example of DMart again. Suppose we want to find out why a particular product
is in high demand: is it because of its brand, or because of its quality? Diagnostic analytics
is designed to answer this kind of question.
Predictive Analytics :
We can use the information obtained from descriptive and diagnostic analytics
to predict future data: predictive analytics finds out what is likely to happen in the
future. Predicting future data does not mean we have become fortune-tellers; by looking
at past trends and behavioral patterns, we forecast what might happen in the future.
Example –
Good examples are the Amazon and Netflix recommender systems. You might have noticed
that whenever you buy a product from Amazon, the payment page shows you a
recommendation saying that customers who purchased this item also purchased another product. That
recommendation is based on customers' past purchase behavior. By looking at
past purchase behavior, analysts create associations between products, and that is
why a recommendation appears when you buy a product.
The next example is Netflix. When you watch a movie or web series on Netflix, you
can see that Netflix provides you with many recommended movies or web series. Those
recommendations are based on past data or past trends: the system identifies which movies or
series have gained a lot of public interest and creates recommendations based on that.
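The "customers who bought this also bought" association from the Amazon example can be approximated by counting how often products appear in the same order. This is only a sketch of the co-occurrence idea with invented orders and hypothetical helper names; it is not a description of how Amazon or Netflix actually build their recommenders.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical order history: each set holds the products bought together in one order.
orders = [
    {"phone", "case", "charger"},
    {"phone", "case"},
    {"laptop", "mouse"},
    {"phone", "charger"},
    {"case", "screen guard"},
]


def build_co_purchase_counts(order_history):
    """Count, for every product, how often each other product shares an order with it."""
    co_counts = defaultdict(Counter)
    for basket in order_history:
        for bought, other in permutations(basket, 2):
            co_counts[bought][other] += 1
    return co_counts


def also_bought(product, co_counts, top_n=2):
    """Return the products most often bought with `product` (ties broken alphabetically)."""
    ranked = sorted(co_counts[product].items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]


co_counts = build_co_purchase_counts(orders)
suggestions = also_bought("phone", co_counts)
```

The recommendation is predictive in the sense used above: it assumes that a new buyer of a product is likely to behave like past buyers of the same product.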
Prescriptive Analytics :
This is an advanced form of predictive analytics. When you predict something, or when
you start thinking outside the box, you will usually have many options, and it can be
unclear which option will actually work. Prescriptive analytics helps to find
the best option to make it work. Where predictive analytics forecasts future data,
prescriptive analytics helps to bring about what has been forecasted.
Prescriptive analytics is the highest level of analytics; it is used for choosing the optimal
solution by looking at descriptive, diagnostic, and predictive data.
Example –
A good example is Google's self-driving car: by looking at past trends and
forecasted data, it identifies when to turn or when to slow down, much like a human
driver.
Types of Analytics
Types of analytics explained — descriptive, predictive, prescriptive, and more
Most business leaders have a general understanding of data analytics and many companies have
departments dedicated to gathering and interpreting information about customers, processes,
and markets. But there is more than one kind of analytics — and each tells a different story
about your business. Understanding the different types of analytics can help you choose the
ones that will benefit your business most and ultimately drive business objectives.
This post will explore the three most common types of data analytics and one lesser-known
model. This information will help you gain better insight into what your data says about your
business so you can make adjustments to meet your goals.
• Descriptive analytics
• Predictive analytics
• Prescriptive analytics
• Diagnostic analytics
• The future of analytics
Types of business analytics
The process of business analytics is an essential tool for interpreting and applying the vast
amount of data your company collects and organizes. From customer behavior and conversion
rates to revenue and business processes, the information generated by your company’s
operations has to tell a helpful story to benefit you. Business analytics is the process that helps
turn those data points into actionable insights.
The four different types of business analytics are descriptive, predictive, prescriptive, and
diagnostic. Exploring the distinctions between these models can help you learn how to use
each to support your business goals.
Descriptive analytics
Descriptive analytics examines what happened in the past. You’re utilizing descriptive
analytics when you examine past data sets for patterns and trends. This is the core of most
businesses’ analytics because it answers important questions like how much you sold and if
you hit specific goals. It’s easy to understand even for non-data analysts.
Descriptive analytics functions by identifying what metrics you want to measure, collecting
that data, and analyzing it. It turns the stream of facts your business has collected into
information you can act on, plan around, and measure.
Examples of descriptive analytics include:
• Annual revenue reports
• Survey response summaries
• Year-over-year sales reports
The main drawback of descriptive analytics is its limited scope. It’s a helpful first step for decision
makers and managers, but it can’t go beyond analyzing data from past events. Once descriptive
analytics is done, it’s up to your team to ask how or why those trends occurred, brainstorm and
develop possible responses or solutions, and choose how to move forward.
Predictive analytics
Predictive analytics is what it sounds like — it aims to predict likely outcomes and make
educated forecasts using historical data. Predictive analytics extends trends into the future to
see possible outcomes. This is a more complex version of data analytics because it uses
probabilities for predictions instead of simply interpreting existing facts.
Use predictive analytics by first identifying what you want to predict and then bringing existing
data together to project possibilities to a particular date. Statistical modeling or machine
learning are commonly used with predictive analytics. This is how you answer planning
questions such as how much you might sell or if you’re on track to hit your Q4 targets.
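As a minimal sketch of extending a trend into the future, the example below fits a straight line to past quarterly sales with ordinary least squares and projects the next quarter. The sales figures are invented, and a linear trend is only one simple modeling choice among the statistical and machine learning options mentioned above.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical quarterly sales (in thousands of units) for quarters 1 to 4.
quarters = [1, 2, 3, 4]
sales = [100, 110, 120, 130]

slope, intercept = fit_line(quarters, sales)
q5_forecast = slope * 5 + intercept  # projected sales for the next quarter
```

Comparing a projection like `q5_forecast` with the target for that quarter is one way to judge whether the business is on track.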
A business is in a better position to set realistic goals and avoid risks if it uses data to create
a list of likely outcomes. Predictive analytics can keep your team or the company as a whole
aligned on the same strategic vision.
Examples of predictive analytics include:
• Ecommerce businesses that use a customer’s browsing and purchasing history to make
product recommendations.
• Financial organizations that need help determining whether a customer is likely to pay
their credit card bill on time.
• Marketers who analyze data to determine the likelihood that new customers will
respond favorably to a given campaign or product offering.
The primary challenge with predictive analytics is that the insights it generates are limited to
the data. First, that means that smaller or incomplete data sets will not yield predictions as
accurate as larger data sets might. Getting good business intelligence (BI) from predictive
analytics requires sufficient data, but what counts as “sufficient” depends on the industry,
business, audience, and the use case.
Second, being restricted to the data means that even the best algorithms with the biggest
data sets can’t weigh intangible or distinctly human factors. A sudden economic shift or even
a change in the weather can affect spending, but a predictive analytics model can’t account
for variables that are absent from its data.
Prescriptive analytics
Prescriptive analytics uses the data from a variety of sources — including statistics, machine
learning, and data mining — to identify possible future outcomes and show the best option.
Prescriptive analytics is the most advanced of the three types because it provides actionable
insights instead of raw data. This methodology is how you determine what should happen, not
just what could happen.
Using prescriptive analytics enables you to not only envision future outcomes, but to
understand why they will happen. Prescriptive analytics also can predict the effect of future
decisions, including the ripple effects those decisions can have on different parts of the
business. And it does this in whatever order the decisions may occur.
Prescriptive analytics is a complex process that involves many variables and tools like
algorithms, machine learning, and big data. Proper data infrastructures need to be established
or this type of analytics could be a challenge to manage.
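The step from "what could happen" to "what should happen" can be sketched by scoring each candidate decision against a forecast and picking the best one. Everything here is an assumption made for illustration: the demand forecast, the cost and price constants, and the helper names. Real prescriptive systems weigh many more variables.

```python
# Hypothetical forecast from a predictive model: demand level -> probability.
demand_forecast = {80: 0.2, 100: 0.5, 120: 0.3}

UNIT_COST = 6.0    # assumed cost to stock one unit
UNIT_PRICE = 10.0  # assumed selling price per unit


def expected_profit(stock_level, forecast):
    """Average profit of a stocking decision over all forecast demand scenarios."""
    profit = 0.0
    for demand, probability in forecast.items():
        units_sold = min(stock_level, demand)
        profit += probability * (units_sold * UNIT_PRICE - stock_level * UNIT_COST)
    return profit


def best_stock_level(candidates, forecast):
    """Recommend the candidate decision with the highest expected profit."""
    return max(candidates, key=lambda level: expected_profit(level, forecast))


candidate_levels = [80, 100, 120]
recommended = best_stock_level(candidate_levels, demand_forecast)
```

The predictive part supplies the probabilities; the prescriptive part is the comparison across decisions, which is why this type of analytics depends on the earlier ones.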
Examples of prescriptive analytics include:
• Calculating client risk in the insurance industry to determine what plans and rates an
account should be offered.
• Discovering what features to include in a new product to ensure its success in the
market, possibly by analyzing data like customer surveys and market research to
identify what features are most desirable for customers and prospects.
• Identifying tactics to optimize patient care in healthcare, like assessing the risk for
developing specific health problems in the future and targeting treatment decisions to
reduce those risks.
The most common issue with prescriptive analytics is that it requires a lot of data to produce
useful results, but a large amount of data isn’t always available. This type of analytics could
easily become inaccessible for most.
Though the use of machine learning dramatically reduces the possibility of human error, an
additional downside is that it can’t always account for all external variables since it often relies
on machine learning algorithms.
Diagnostic analytics
Another common type of analytics is diagnostic analytics and it helps explain why things
happened the way they did. It’s a more complex version of descriptive analytics, extending
beyond what happened to why it happened.
Diagnostic analytics identifies trends or patterns in the past and then goes a step further to
explain why the trends occurred the way they did. It’s a logical step after descriptive analytics
because it answers questions like why a certain amount was sold or why Q1 targets were hit.
Diagnostic analytics is also a useful tool for businesses that want more confidence to duplicate
good outcomes and avoid negative ones. Descriptive analytics can tell you what happened but
then it is up to your team to figure out what to do with that data. Diagnostic analytics applies
data to figure out why something happened so you can develop better strategies without so
much trial and error.
Examples of diagnostic analytics include:
• Why did year-over-year sales go up?
• Why did a certain product perform above expectations?
• Why did we lose customers in Q3?
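A question such as "why did year-over-year sales go up?" is often approached by breaking the total change down by segment. The sketch below does this for made-up regional figures; the data and the `change_by_segment` helper are hypothetical, and a breakdown like this shows where the change came from rather than proving its cause.

```python
# Hypothetical sales by region for two consecutive years.
sales_last_year = {"north": 200, "south": 150, "east": 100}
sales_this_year = {"north": 205, "south": 210, "east": 90}


def change_by_segment(before, after):
    """Return (segment, change) pairs, largest absolute change first."""
    changes = {segment: after[segment] - before[segment] for segment in before}
    return sorted(changes.items(), key=lambda item: abs(item[1]), reverse=True)


drivers = change_by_segment(sales_last_year, sales_this_year)
total_change = sum(change for _, change in drivers)
```

Here most of the overall increase comes from a single region, which tells the team where to look next for an explanation.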
The main flaw with diagnostic analytics is that, because it focuses on past occurrences, it is limited
in the actionable observations it can provide about the future. Understanding the causal relationships and
sequences may be enough for some businesses, but it may not provide sufficient answers for
others. For the latter, managing big data will likely require more advanced analytics solutions
and you might have to implement additional tools — venturing into predictive or prescriptive
analytics — to find meaningful insights.
The future of analytics
The use of analytics in business is not new, but it is on a steep growth trajectory. Fueled by
huge data sets streaming in from the IoT, advancements in AI, and the growth of self-service
BI tools, the use of analytics in business has yet to peak.
The US Bureau of Labor Statistics predicts huge growth in the number of research analysts in
the coming years, projecting a “much faster than average” growth rate of 19%. Additionally,
some of the industry’s top experts in data science and analytics predict the ideal candidate for
businesses in the future will be a person who can both understand and speak data.
Even as the need for analytics experts grows, the market for self-service tools continues to
escalate as well. A report from Allied Market Research expects the self-service BI market to
reach $14.19 billion by 2026, and Gartner cites the growth of business-composed data and
analytics, which focuses on people, “shifting from IT to business.”
Implementing more advanced analytics — and for some businesses bringing analytics into
business strategy — will continue to become more important for companies of all sizes.