Ba Unit3 Notes
Analyzing large volumes of data is already a crucial part of the decision-making process for any
business, irrespective of its size. Available big data helps resolve everyday problems such as
improving the conversion rate or building customer loyalty for an e-commerce business. But did
you know that you can also use this data to forecast events before they actually happen? That is
the value of predictive analytics solutions: predicting user behavior based on historical data and
acting accordingly to optimize sales.
For online businesses, executing predictive analytics periodically amounts to improving your
understanding of the customer and identifying changes in the market before they occur.
Predictive analytics models extract patterns from past and transactional data to recognize
risks and opportunities. Self-learning software automatically evaluates the existing data and
provides early warning of future problems. This enables you to build new sales strategies,
adjust to changes, and increase profit growth.
INTRODUCTION TO BUSINESS FORECASTING TECHNIQUES
Companies conduct business forecasts to determine their goals, targets, and project plans for each
new period, whether quarterly, annually, or even 2–5-year planning. Some companies utilize
predictive analytics software to collect and analyze the data necessary to make an accurate
business forecast. Predictive analytics solutions give you the tools to store data, organize
information into comprehensive datasets, develop predictive models to forecast business
opportunities, adapt datasets to data changes, and allow import/export from other data channels.
Forecasting helps managers guide strategy and make informed decisions about critical business
operations such as sales, expenses, revenue, and resource allocation. When done right, forecasting
adds a competitive advantage and can be the difference between successful and unsuccessful
companies.
To deal with the increasing variety and complexity of management forecasting problems, many
forecasting methods have been developed in recent years. Each method has its distinct uses, and
attention must be paid to selecting the right method for a specific application. The manager and
the forecaster share a significant role in choosing the technique; the better they understand the
range of forecasting possibilities, the more likely the company's forecasting effort will pay off.
Choosing the right forecasting method depends on many factors:
• significance and accessibility of historical data,
• the context of the prediction,
• time available for analysis,
• the degree of precision anticipated,
• value to the company,
• and the desired time period for the forecast.
The manager and the forecaster need to work together to achieve successful forecasting; the
questions they must answer together are discussed later in these notes.
Qualitative Techniques:
Qualitative techniques are applied when enough data is not available – i.e. when a product is
launched in the market for the first time. They use human evaluation and rating schemes to
convert qualitative judgments into quantitative estimates.
The goal is to gather all information and considerations related to the factors being evaluated in a
logical, impartial, and systematic manner. Such methods are often used in the field of new
technologies, where developing a product idea may require substantial “invention”, making
research and development requirements difficult to estimate, and where market acceptance and
penetration are highly uncertain.
Qualitative models are most successful with short-term projections. They are expert-driven,
weighing contrasting opinions and relying on judgment rather than calculable data. Examples
of qualitative models in business forecasting include:
• Market research: This involves polling people – experts, customers, employees – to get
their preferences, opinions, and feedback on a product or service.
• Delphi method: The Delphi method relies on asking a panel of experts for their opinions
and recommendations and compiling them into a forecast.
Time Series Analysis
A time series is a set of observations on the values that a variable takes at different times.
Examples: sales trends, stock market prices, weather forecasts, etc. In simple terms, take monthly
sales data: for January you sold 150 units, in February a bit more, say 300, and so on for all 12
months. That sales history is a time series, and given that there is a pattern in it, we can predict
the future sales of the same unit.
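To make this concrete, here is a minimal Python sketch (the monthly figures are hypothetical) that produces a naive forecast for the next month by averaging the most recent observations:

```python
# Naive time-series forecast: average the last k monthly observations.
monthly_sales = [150, 300, 280, 320, 310, 350,
                 340, 330, 360, 370, 390, 400]  # hypothetical units sold

def moving_average_forecast(series, k=3):
    """Forecast the next value as the mean of the last k values."""
    window = series[-k:]
    return sum(window) / len(window)

print(moving_average_forecast(monthly_sales))  # forecast for month 13
```

More sophisticated time-series methods (exponential smoothing, Box-Jenkins models) refine this same idea by weighting recent observations differently.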
Causal Methods:
Causal forecasting recognizes that the predicted dependent variable is affected by one or more
independent variables. Causal methods take into account all possible factors that may affect the
dependent variable. Consequently, the data necessary for such forecasting can vary from internal
data to external data, such as surveys, macroeconomic indicators, product characteristics, social
chatter, etc. Typically, causal models are continually revised to ensure that the latest data are
included in the model.
Quantitative business forecasting
Use quantitative forecasting when there is accurate past data available to analyze patterns and
predict the probability of future events in your business or industry.
Quantitative forecasting extracts trends from existing data to determine the more probable results.
It connects and analyzes different variables to establish cause and effect between events, elements,
and outcomes. An example of data used in quantitative forecasting is past sales numbers.
Quantitative models work with data, numbers, and formulas. There is little human interference in
quantitative analysis. Examples of quantitative models in business forecasting include:
• The indicator approach: This approach depends on the relationship between specific
indicators being stable over time, e.g., GDP and the unemployment rate. By following the
relationship between these two factors, forecasters can estimate a business's performance.
• The average approach: This approach infers that the predictions of future values are equal
to the average of the past data. It is best to use this approach only when assuming that the
future will resemble the past.
• Econometric modeling: Econometric modeling is a mathematically rigorous approach to
forecasting. Forecasters assume the relationships between indicators stay the same and test
the consistency and strength of the relationship between datasets.
• Time-series methods: Time-series methods use historical data to predict future outcomes.
By tracking what happened in the past, forecasters expect to get a near-accurate view of the
future.
Choosing the right business forecasting technique depends on many factors. Some of these are:
• Context of the forecast
• Availability and relevance of past data
• Period to be forecast
Managers and forecasters must consider the stage of the product or business as this influences the
availability of data and how you establish relationships between variables. A new startup with no
previous revenue data would be unable to use quantitative methods in its forecast. The more you
understand the use, capabilities, and impact of different forecasting techniques, the more likely
your forecasting efforts will pay off.
Any insight into the future puts your organization at an advantage. Forecasting helps you predict
potential issues, make better decisions, and measure the impact of those decisions.
By combining quantitative and qualitative techniques, statistical and econometric models, and
objectivity, forecasting becomes a formidable tool for your company.
Business forecasting helps managers develop the best strategies for current and future trends and
events. Today, artificial intelligence, forecasting software, and big data make business forecasting
easier, more accurate, and personalized to each organization.
Forecasting does not promise an accurate picture of the future or how your business will evolve,
but it points in a direction informed by data, logic, and experiential reasoning.
While there are different forecasting techniques and methods, all forecasts follow the same process
on a conceptual level. Standard elements of business forecasting include:
• Prepare the stage: Before you begin, develop a system to investigate the current state of
business.
• Choose a data point: An example for any business could be "What is our sales projection
for next quarter?"
• Choose indicators and data sets: Identify the relevant indicators and data sets you need
and decide how to collect the data.
• Make initial assumptions: To kick start the forecasting process, forecasters may make
some assumptions to measure against variables and indicators.
• Select forecasting technique: Pick the technique that fits your forecast best.
• Analyze data: Analyze available data using your selected forecasting technique.
• Estimate forecasts: Estimate future conditions based on data you've gathered to reach data-
backed estimates.
• Verify forecasts: Compare your forecast to the eventual results. This helps you identify any
problems, tweak errant variables, correct deviations, and continue to improve your
forecasting technique.
• Review forecasting process: Review any deviations between your forecasts and actual
performance data.
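One concrete way to “verify forecasts” is to compute the mean absolute percentage error (MAPE) between forecasts and eventual actuals; a minimal sketch with hypothetical numbers:

```python
# Compare forecasts against eventual actual results with MAPE (lower is better).
forecasts = [410, 395, 430, 450]   # hypothetical quarterly sales forecasts
actuals   = [400, 380, 445, 470]   # what actually happened

def mape(forecast, actual):
    """Mean absolute percentage error, in percent."""
    errors = [abs(f - a) / a for f, a in zip(forecast, actual)]
    return 100 * sum(errors) / len(errors)

print(f"MAPE: {mape(forecasts, actuals):.1f}%")
```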
Successful business forecasting begins with a collaboration between the manager and forecaster.
They work together to answer the following questions:
1. What is the purpose of the forecast, and how is it to be used?
2. What are the components and dynamics of the system the forecast is focused on?
With the right forecasting method, you can develop your process using the integral elements of
business forecasting mentioned above.
A forecast is only as good as the data supplied, so before collecting data, be clear about what you
need to know and where it can be found. When you have these answers, you can start collecting
data from two main sources:
• Primary sources: These sources are gathered first-hand using reporting tools — you or
members of your team source data through interviews, surveys, research, or observations.
• Secondary sources: Secondary sources are second-hand information or data that others have
collected. Examples include government reports, publications, financial statements,
competitors' annual reports, journals, and other periodicals.
Ideally, prediction methods should be evaluated in the situations in which they will be
used. The basis for conducting the evaluation is the need to test methods against
reasonable alternatives.
The evaluation consists of four steps:
• test assumptions,
• test data and methods,
• repeat results,
• and evaluate results.
Most of the principles for testing prediction methods are based on generally accepted
methodological procedures, such as defining criteria or obtaining a large sample of
prediction errors. However, forecasters often violate such principles, even in academic
studies.
The way a company forecasts is always unique to its needs and resources, but the primary
forecasting process can be summed up in five steps. These steps outline how business forecasting
starts with a problem and ends with not only a solution but valuable learnings.
1. Choose a problem to solve
The first step in predicting the future is choosing the problem you’re trying to solve or the question
you’re trying to answer. This can be as simple as determining whether your audience will be
interested in a new product your company is developing. Because this step doesn’t yet involve any
data, it relies on internal considerations and decisions to define the problem at hand.
2. Gather relevant data
The next step in forecasting is to collect as much data as possible and decide how to use it. This
may require digging up some extensive historical company data and examining the past and
present market trends. Suppose your company is trying to launch a new product. In this case, the
gathered data can be a culmination of the performance of your previous product and the current
performance of similar competing products in the target market.
3. Pick a forecasting technique
After collecting the necessary data, it’s time to choose a business forecasting technique that works
with the available resources and the type of prediction. All the forecasting models are effective
and get you on the right track, but one may be more favorable than others in creating a unique,
comprehensive forecast.
For example, if you have extensive data on hand, quantitative forecasting is ideal for interpretation.
Qualitative forecasting is best if you have less hard data available and are willing to invest in
extensive market research.
4. Analyze the data
Once the ball starts rolling, you can begin identifying patterns in the past and predicting the probability
of their repetition. This information will help your company’s decision-makers determine what to
do beforehand to prepare for the predicted scenarios.
5. Verify your findings
The end of business forecasting is simple. You wait to see if what you predicted actually happens.
This step is especially important in determining not only the success of your forecast but also the
effectiveness of the entire process. Having done some forecasting, you can compare the present
experience with these forecasts to identify potential areas for growth.
When in doubt, never throw away “old” data. The final information of one forecasting process can
also be used as the past data for another forecast. It’s like a life cycle of business development
predictions.
Common applications of business forecasting include:
• Calculating cash flow forecasts, i.e., predicting your financial needs within a timeframe
• Analyzing relationships between variables, e.g., Facebook ads and potential revenue
• Comparing customer acquisition costs and customer lifetime value over time
A rapidly evolving modern business climate has proven how fast things can change, with
businesses evolving beside it to succeed. In fact, today’s world requires agile strategy and
management.
This is where business forecasting can help, enabling businesses to plan for unexpected events.
In this unit, you’ll learn the basic principles of business forecasting and how to implement forecasting
techniques in your business planning.
Business forecasting involves using forecasting tools and techniques to help businesses predict certain
developments, such as revenue, sales, and growth. Through analytics, data, insights, and
experience, business forecasting provides organizations with the information they can use to
improve their decision-making. Whether you have a large or small company or offer products or
services, accurate forecasts can help your business prepare for future events and future trends.
For example, let’s say a new company started the year with few sales. During Q3, their sales began
to skyrocket because of a new marketing technique spreading brand awareness. Applying a
business forecasting technique, the team can better gauge Q4 sales—preparing inventory,
expanding their team, and taking the necessary steps to have a successful quarter.
Now that you understand the basics of business forecasting, it’s time to see how it works in
practice. Read the following examples to better understand the different approaches to business
forecasting.
1. A company forecasting its sales through the end of the year
Let’s suppose a small greeting card company wants to forecast its sales through the end of the
year. The company has just a year and a half of experience and limited data to use for predictions.
Though the first few quarters were slow to start, they have gained a great reputation in the last
three quarters. For this reason, sales are on the rise.
Since the business has limited historical data, they might consider a qualitative model for
predicting future sales. By polling their customers, the greeting card company can gauge the
willingness of their audience to buy new cards and pricing for the remaining quarters of the year.
Market surveys are a type of qualitative forecasting, which utilizes questionnaires to estimate
future customer behavior.
2. A shoe company using the indicator approach
In this example, let’s suppose a well-established shoe brand is forecasting profits for the next
quarter. Normally, this company would use the time series forecasting technique to estimate profits
for the next quarter. However, economic conditions have shifted, and the unemployment rate is
higher than normal. As a result, the company chooses the indicator approach to predict the actual
performance of its product.
In this scenario, the company might compare two variables: employment rate and spending rates.
With this business forecasting approach, the company predicts it will have a decrease in profits for
the upcoming quarter. Following this prediction, it chooses to produce fewer items in response to
economic changes and adjust budgets accordingly.
3. A loungewear company forecasting demand for a new product
In this next example, let’s suppose a loungewear company plans on rolling out a new product:
slippers. Since this product is new to the company, there are no official metrics for pricing and
popularity. For this reason, the company needs to gauge the interest level of its target audience.
In this case, demand forecasting would be a great approach to gauge how much customers are
willing to spend and how much the company will need to invest in terms of materials. By using
this forecasting process, the loungewear company can decide if the product will perform well and
what kind of demand exists. Ultimately, this will help the team make informed business decisions
for production as well as sales.
What are the limits of business forecasting?
You can follow the rules, use the right methods, and still get your business forecast wrong. It is,
after all, an attempt to predict the future, and no technique can remove that underlying
uncertainty: forecasts are limited by the quality of the data and the assumptions behind them.
PREDICTIVE ANALYTICS
Predictive analytics uses historical data to predict future events. Typically, historical data is used
to build a mathematical model that captures important trends. That predictive model is then used
on current data to predict what will happen next, or to suggest actions to take for optimal outcomes.
Predictive analytics has received a lot of attention in recent years due to advances in supporting
technology, particularly in the areas of big data and machine learning.
Predictive analytics is often discussed in the context of big data. Engineering data, for example,
comes from sensors, instruments, and connected systems out in the world. Business system data
at a company might include transaction data, sales results, customer complaints, and marketing
information. Increasingly, businesses make data-driven decisions based on this valuable trove of
information.
Increasing Competition
With increased competition, businesses seek an edge in bringing products and services to crowded
markets. Data-driven predictive models can help companies solve long-standing problems in new
ways.
Equipment manufacturers, for example, can find it hard to innovate in hardware alone. Product
developers can add predictive capabilities to existing solutions to increase value to the customer.
Using predictive analytics for equipment maintenance, or predictive maintenance, can anticipate
equipment failures, forecast energy needs, and reduce operating costs. For example, sensors that
measure vibrations in automotive parts can signal the need for maintenance before the vehicle
fails on the road.
Companies also use predictive analytics to create more accurate forecasts, such as forecasting the
demand for electricity on the electrical grid. These forecasts enable resource planning (for
example, the scheduling of various power plants) to be done more effectively.
To extract value from big data, businesses apply algorithms to large data sets using tools such as
Hadoop and Spark. The data sources might consist of transactional databases, equipment log files,
images, video, audio, sensor, or other types of data. Innovation often comes from combining data
from several sources.
With all this data, tools are necessary to extract insights and trends. Machine learning techniques
are used to find patterns in data and to build models that predict future outcomes. A variety of
machine learning algorithms are available, including linear and nonlinear regression, neural
networks, support vector machines, decision trees, and other algorithms.
The term “predictive analytics” describes the application of a statistical or machine learning
technique to create a quantitative prediction about the future. Frequently, supervised machine
learning techniques are used to predict a future value (How long can this machine run before
requiring maintenance?) or to estimate a probability (How likely is this customer to default on a
loan?).
Predictive analytics starts with a business goal: to use data to reduce waste, save time, or cut costs.
The process harnesses heterogeneous, often massive, data sets into models that can generate clear,
actionable outcomes to support achieving that goal, such as less material waste, less stocked
inventory, and manufactured product that meets specifications.
We are all familiar with predictive models for weather forecasting. A vital industry application of
predictive models relates to energy load forecasting to predict energy demand. In this case, energy
producers, grid operators, and traders need accurate forecasts of energy load to make decisions for
managing loads in the electric grid. Vast amounts of data are available, and using predictive
analytics, grid operators can turn this information into actionable insights.
Step-by-Step Workflow for Predicting Energy Loads
Typically, the workflow for a predictive analytics application follows these basic steps:
1. Import data from varied sources, such as web archives, databases, and spreadsheets.
Data sources include energy load data in a CSV file and national weather data showing
temperature and dew point.
2. Clean the data by removing outliers and combining data sources.
Identify data spikes, missing data, or anomalous points to remove from the data. Then
aggregate different data sources together – in this case, creating a single table including
energy load, temperature, and dew point.
3. Develop an accurate predictive model based on the aggregated data using statistics,
curve fitting tools, or machine learning.
Energy forecasting is a complex process with many variables, so you might choose to use
neural networks to build and train a predictive model. Iterate through your training data set
to try different approaches. When the training is complete, you can try the model against new
data to see how well it performs.
4. Integrate the model into a load forecasting system in a production environment.
Once you find a model that accurately forecasts the load, you can move it into your
production system, making the analytics available to software programs or devices, including
web apps, servers, or mobile devices.
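A minimal scikit-learn sketch of steps 1–3 of this workflow; the file names, column names, and model settings here are assumptions for illustration, not part of any particular production system:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# Step 1: import data (hypothetical CSV files and column names).
load = pd.read_csv("energy_load.csv")     # columns: timestamp, load_mw
weather = pd.read_csv("weather.csv")      # columns: timestamp, temp, dew_point

# Step 2: clean and aggregate into a single table.
df = load.merge(weather, on="timestamp").dropna()
df = df[df["load_mw"] > 0]                # drop anomalous readings

# Step 3: train a neural-network model and check it against held-out data.
X, y = df[["temp", "dew_point"]], df["load_mw"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0)
model.fit(X_train, y_train)
print("R^2 on new data:", model.score(X_test, y_test))
```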
The computational predictive modeling approach differs from the mathematical approach because
it relies on models that are not easy to explain in equation form and often require simulation
techniques to create a prediction. This approach is often called “black box” predictive modeling
because the model structure does not provide insight into the factors that map model input to
outcome. Examples include using neural networks to predict which winery a glass of wine
originated from or bagged decision trees for predicting the credit rating of a borrower.
Predictive modeling is often performed using curve and surface fitting, time series regression,
or machine learning approaches. Regardless of the approach used, the process of creating a
predictive model is the same across methods. Several commonly used predictive modeling
techniques are described below.
1. Linear Regression
A linear regression model would be useful when a doctor wants to predict a new patient’s
cholesterol based only on their body mass index (BMI). In this example, the analyst would know
to put the data the doctor gathered from his 5,000 other patients—including each of their BMIs
and cholesterol levels—into the linear regression model. They are hoping to predict an unknown
based on a predetermined set of quantifiable data.
The linear regression model would take the data, plot it onto a graph, and establish a line down the
center that properly depicts the smallest distance between all plotted data points. In this scenario,
when that new patient arrives knowing only that their BMI is 31, a data analyst will be able to
predict the patient’s cholesterol by looking at that line and seeing what cholesterol level most
closely aligns with other patients who have a BMI of 31.
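A minimal sketch of this BMI-to-cholesterol example; the data below is synthetic stand-in data, not the 5,000-patient dataset described:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the patient records: BMI -> cholesterol (mg/dL).
rng = np.random.default_rng(0)
bmi = rng.uniform(18, 40, size=200).reshape(-1, 1)
cholesterol = 120 + 3.0 * bmi.ravel() + rng.normal(0, 10, size=200)

# Fit the line that minimizes the distance to all plotted points.
model = LinearRegression().fit(bmi, cholesterol)

# Predict cholesterol for the new patient whose BMI is 31.
print(model.predict([[31]]))
```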
2. Text Mining
Whereas linear regression uses only numeric data, mathematical models can also be used to make
predictions about non-numerical factors. Text mining is a perfect example.
“Text mining is part of predictive analytics in the sense that analytics is all about finding the
information I previously knew nothing about,” Goulding says. In this scenario, the tool takes data
points in the form of text-based words or phrases and searches a giant database for those specific
points.
Sound Familiar? The algorithm used by Google or other search engines to bring up relevant links
when you search for a specific keyword is an example of text mining.
Real-World Example
Although tools like search engines—or even the “find” function you may use when searching for
a word in a digital body of text—represent some common examples of text mining, there are also
industry-specific instances where this type of predictive analytics comes into play.
Goulding describes another medical application of predictive analytics, explaining how doctors
rely on text mining when analyzing patient symptoms and trying to determine the root cause. “If
I’m a doctor and I have 50 children in front of me with flu symptoms, my brain can figure out that
the next patient to walk in the door [with similar symptoms] also has the flu,” he says. “But if I
see an unusual set of symptoms from just one patient, I may need the case history of patients from
all over the world to make a correct diagnosis. My brain can’t help me do this; analytics, however,
can.”
Especially in complex patient cases, an analyst can use text mining modeling tools to comb
databases, locate similar symptoms among patients of the past, and generate a prediction as to
what this new patient is “most likely” suffering from based on that data.
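A toy sketch of this idea: rank past case notes by textual similarity to a new symptom description using TF-IDF vectors (the records are invented for illustration):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

past_cases = [
    "fever cough sore throat body aches",        # documented as flu
    "rash joint pain fever after travel",        # documented as dengue
    "chest pain shortness of breath sweating",   # documented as cardiac
]
new_patient = ["high fever cough and severe body aches"]

vectorizer = TfidfVectorizer()
case_vectors = vectorizer.fit_transform(past_cases)
query_vector = vectorizer.transform(new_patient)

# Rank past cases by similarity to the new patient's symptoms.
scores = cosine_similarity(query_vector, case_vectors).ravel()
print("Most similar past case:", scores.argmax(), scores)
```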
3. Optimal Estimation
Optimal estimation is a modeling technique that is used to make predictions based on observed
factors. This model has been used in analytics for over 50 years and has laid the groundwork for
many of the other predictive tools used today. According to Goulding, past applications of this
method include determining “how to best recalibrate equipment on a manufacturing floor…[and]
estimating where a bullet might go when shot,” as well as in other aspects of the defense industry.
Real-World Example
If two planes were flying toward one another, an analyst might use the optimal estimation model
to predict if or when they will collide. To do this, the analyst would put a variety of observed
factors into the mathematical modeling tool, including the airplanes’ height, altitude, speed, angle,
and more. The mathematical model would then be able to help predict at which point, if any, the
planes would meet.
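A toy sketch of the collision question under a simplifying constant-velocity assumption (real optimal estimation, e.g. Kalman filtering, would also account for noisy observations; the positions and velocities here are invented):

```python
import numpy as np

# Observed state of each aircraft: position (km) and velocity (km/min).
p1, v1 = np.array([0.0, 0.0, 10.0]), np.array([8.0, 0.0, 0.0])
p2, v2 = np.array([100.0, 5.0, 10.0]), np.array([-7.0, 0.0, 0.0])

# Minimize |(p1 - p2) + t*(v1 - v2)| over t >= 0: time of closest approach.
dp, dv = p1 - p2, v1 - v2
t_star = max(0.0, -dp.dot(dv) / dv.dot(dv))
min_distance = np.linalg.norm(dp + t_star * dv)
print(f"Closest approach: {min_distance:.2f} km at t = {t_star:.1f} min")
```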
4. Clustering Models
Clustering models are focused on finding different groups with similar qualities or elements within
the data. Many mathematical modeling tools fall within this category, including:
K-Means
Hierarchical Clustering
TwoStep
Density-Based Scan Clustering
Gaussian Clustering Model
Kohonen
Real-World Example
If a fast-food restaurant wanted to open a new location in a new city, the corporate team may work
with a data analyst to figure out exactly where that new location should go. The analyst would start
by gathering an array of specific, relevant data about each location—including factors like
demographics, where the high-end houses are, how close the location is to a college, etc.—then
input all of that data into a clustering mathematical model. This model would most efficiently
analyze this particular type of data and predict where the most strategic location in the city for that
restaurant is based on the data alone.
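A minimal K-Means sketch of that site-selection idea; the candidate-location features below are invented:

```python
import numpy as np
from sklearn.cluster import KMeans

# One row per candidate site: [median income ($k), distance to college (km)].
locations = np.array([
    [45, 1.2], [48, 0.8], [95, 6.0], [90, 7.5],
    [50, 1.5], [88, 5.5], [47, 0.9], [92, 8.0],
])

# Group candidate sites into clusters with similar demographics.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(locations)
print("Cluster labels:", kmeans.labels_)
print("Cluster centers:", kmeans.cluster_centers_)
```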
5. Neural Networks
Neural networks are complex algorithms inspired by the structure of the human brain. They
process historical and current data and identify complex relationships within the data to predict the
future, similar to how the human brain can spot trends and patterns.
A typical neural network is composed of artificial neurons, called units, arranged in different
layers. The neural network uses input units to learn about and process data. On the other hand,
output units are on the opposite side and outline how the neural network should respond to the
input units. Between the two are hidden layers, which are layers of mathematical functions that
produce a specific output.
Real-World Example
If an e-commerce retailer wants to accurately predict which products its customers are likely to
consider purchasing in the future, a data analyst or data scientist might use neural networks to
inform the company’s product recommendation algorithm. The analyst will pull purchase data
and feed it to the neural network, giving the network real examples to learn from. This data will
travel through the neural network through various mathematical functions until the output is
produced and a product recommendation populates.
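A toy sketch of the idea: a small neural network learning from invented purchase histories whether a customer is likely to buy a particular product:

```python
from sklearn.neural_network import MLPClassifier

# Each row: did the customer buy [shoes, socks, laces]? Target: bought polish?
purchases = [[1, 1, 0], [1, 0, 1], [0, 1, 0], [1, 1, 1], [0, 0, 1], [0, 1, 1]]
bought_polish = [1, 1, 0, 1, 0, 0]

# Hidden layers of mathematical functions sit between input and output units.
net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=5000, random_state=0)
net.fit(purchases, bought_polish)

# Recommend polish if the predicted purchase probability is high enough.
new_customer = [[1, 0, 0]]
print(net.predict_proba(new_customer))
```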
Other Common Predictive Models
In addition to the mathematical models above, there are additional models that data analysts use
to make predictions, including:
Decision trees
Random forests
Logistic regression
Bayesian methods
Why Is Predictive Analytics Important?
While organizations have recognized the importance of gathering data as a means of looking back
on industry trends for years, business teams have only just started scratching the surface of
possibility when it comes to predictive analytics.
“Analytics is getting exciting in every industry because we’re [more] equipped than ever to…use
the data in the back room that has been gathering dust…to make better business decisions,”
Goulding says.
From insurance to retail to healthcare, organizations are starting to adapt to this model of
informed decision-making and are using it to their advantage:
• Today, insurance companies can predict if a new client is a risk based on their age, history,
health conditions, etc. They can weigh this data and make an informed decision about
whether or not they want to cover that individual.
• Retail organizations can predict how new brands or items might sell in their local market
based on consumer demographics. They can then make strategic decisions about how much
product to stock.
• Doctors can use predictive data to help determine not only what ailment someone’s
conditions point to but also their chances of survival, whether or not they need immediate
surgery, and their condition’s expected decline over a certain period of time.
No matter the industry, the recent advancements in mathematical modeling and the overall lean
into data as a prescriptive form of insight have changed the way businesses operate today.
Businesses can make data-driven decisions based on predictive models, allowing them to mitigate
potential risks and maximize profits. These changes have created an overall trend in decision-
making that is sure to continue developing and expanding for years to come.
Predictive Analytics vs. Prescriptive Analytics
Organizations that have successfully implemented predictive analytics see prescriptive analytics
as the next frontier. Predictive analytics creates an estimate of what will happen next;
prescriptive analytics tells you how to react in the best way possible given the prediction.
Prescriptive analytics is a branch of data analytics that uses predictive models to suggest actions
to take for optimal outcomes. Prescriptive analytics relies on optimization and rules-based
techniques for decision making. Forecasting the load on the electric grid over the next 24 hours is
an example of predictive analytics, whereas deciding how to operate power plants based on this
forecast represents prescriptive analytics.
Forecasting is a method by which companies find out trends that will dominate the market in the
coming years. It has many advantages, not just for new startups but for established and old
companies. Forecasting is defined as a planning tool that can help the management to cope with
an uncertain future, mainly through the use of past data and analysis of market trends. The process
of forecasting begins with certain assumptions based on the experience, knowledge, and astute
judgment of the management team. These estimates are then projected using techniques like
Box-Jenkins models, the Delphi method, exponential smoothing, moving averages, regression
analysis, and trend projection. Since any error in the assumptions will also result in a similar or
magnified error in the forecasting results, the technique of sensitivity analysis is used, where a
range of values is assigned to uncertain factors, which are also called variables.
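A minimal sensitivity-analysis sketch: vary one uncertain assumption (the monthly growth rate) over a range of values and observe the spread in the resulting forecast. All figures are hypothetical:

```python
# Sensitivity analysis: how sensitive is the annual revenue forecast
# to the assumed monthly growth rate?
current_monthly_revenue = 100_000  # hypothetical starting point

for growth_rate in (0.00, 0.01, 0.02, 0.03):  # range of uncertain values
    revenue = current_monthly_revenue
    total = 0.0
    for _ in range(12):
        revenue *= 1 + growth_rate
        total += revenue
    print(f"growth {growth_rate:.0%}: forecast annual revenue = {total:,.0f}")
```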
4 Major Benefits of Forecasting. Given below are the major benefits of forecasting.
1. Forecasting helps new businesses get set up successfully: Forecasting is an important element
when new brands are being set up in the industry. This is especially true when the industry is
filled with multiple challenges and there are many hurdles in the path of setting up a successful
brand. Forecasting can help entrepreneurs find the best way to overcome these challenges and
thereby establish a successful company. Through forecasting, brands can understand how they
will be perceived in the market and whether their products have the capability to meet the
expectations and demands of the target audience. In short, good and strong forecasting can help
startup companies increase their chances of success by helping them plan and strategize their
entry in a much better manner. At the same time, good forecasting can help new brands to meet
the supply and demand situation, thereby increasing their brand power and loyalty.
2. Forecasting can help brands to use their financial resources in a much better manner than
before: Financial concerns, especially for new and small companies, are a very important aspect.
That is why it is important that in such situations, the available resources are utilised in a proper
and effective manner. As no brand can survive without adequate capital, financial forecasting
plays a very important role in such a scenario. By helping companies to allocate their resources
in a proper manner, financial forecasting can hold the key to their survival and growth.
3. Forecasting can help the administration take good and successful management decisions:
Without a strong administrative backbone, companies will completely turn into a failure, sooner
or later. The administration team of any company is essentially a decision-making body; it has
responsibility for making decisions and for ascertaining that the decisions made are carried out.
That is why it is important that the wheels of the administrative department keep working in a
continuous manner, and it is here that forecasting plays a very important role.
4. Forecasting helps companies plan better: Planning is an important component of any
company, be it in the long term or short term. Forecasting can help companies to plan their
growth strategy while keeping in mind the needs of the consumers, while at the same time having
an intricate understanding of the market trends as well. In other words, good and proper planning,
whether it is for the overall growth of the company or for a section of the company, is completely
dependent on good forecasting techniques.
Conclusion
In the end, both predictive analysis and forecasting are techniques through which brands can
correctly anticipate and understand market trends while at the same time meeting customer
expectations as well. In short, the need today is not to choose between predictive analysis and
forecasting, but to apply each where it serves the business best.
PREDICTIVE MODELING
Predictive modeling means developing models that can be used to forecast or predict future
events. Models can be developed either through logic or data.
• 1. Logic-driven models are based on experience, knowledge, and logical relationships of
variables and constants connected to the desired business performance outcome situation.
• 2. Data-driven models refer to models in which data is collected from many sources to
quantitatively establish model relationships. Logic-driven modeling is often used as a first
step to establish relationships that are then confirmed through data-driven models. Data-driven
models include sampling and estimation, regression analysis, correlation analysis, forecasting
models, and simulation.
It leverages statistics to predict outcomes. Most often the event one wants to predict is in the future,
but predictive modeling can be applied to any type of unknown event, regardless of when it
occurred. For example, predictive models are often used to detect crimes and identify suspects,
after the crime has taken place.
In many cases the model is chosen on the basis of detection theory to try to estimate the
probability of an outcome given a set amount of input data; for example, given an email,
determining how likely it is to be spam.
Models can use one or more classifiers in trying to determine the probability of a set of data
belonging to one set or another, say ‘spam’ or ‘ham’.
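A toy spam/‘ham’ sketch using a Naïve Bayes classifier over word counts; the training messages are invented:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

messages = [
    "win free money now", "limited offer click here",        # spam
    "meeting at noon tomorrow", "please review the report",  # ham
]
labels = ["spam", "spam", "ham", "ham"]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(messages)
classifier = MultinomialNB().fit(X, labels)

# Probability that a new email belongs to each class.
new_email = vectorizer.transform(["free offer click now"])
print(classifier.classes_, classifier.predict_proba(new_email))
```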
Usage
Predictive models can either be used directly to estimate a response (output) given a defined set
of characteristics (input), or indirectly to drive the choice of decision rules.
Depending on the methodology employed for the prediction, it is often possible to derive a formula
that may be used in a spreadsheet software. This has some advantages for end users or decision
makers, the main one being familiarity with the software itself, hence a lower barrier to adoption.
Tree-based methods (e.g. CART, survival trees) provide one of the most graphically intuitive ways
to present predictions. However, this advantage is limited to methods that use this type of
modelling approach, which can have several drawbacks. Trees can also be employed to represent
decision rules graphically.
Scorecards are tabular or graphical tools used to represent either predictions or decision rules.
A statistical model embodies a set of assumptions concerning the generation of the observed data,
and similar data from a larger population. A model represents, often in considerably idealized
form, the data-generating process. The model assumptions describe a set of probability
distributions, some of which are assumed to adequately approximate the distribution from which
a particular data set is sampled.
A cause-and-effect diagram enables a user to hypothesize relationships between potential causes
of an outcome.
Influence diagrams are another tool for conceptualizing the relationships among the variables
that drive business performance.
Example –
A restaurant customer dines 6 times a year and spends an average of $50 per visit. The restaurant
realizes a 40% margin on the average bill for food and drinks.
30% of customers do not return each year. Average lifetime of a customer = 1/0.3 ≈ 3.33 years.
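Putting those numbers together gives a simple customer lifetime value (CLV) estimate; a sketch of the arithmetic:

```python
visits_per_year = 6
spend_per_visit = 50.0
margin = 0.40          # 40% margin on the average bill
annual_churn = 0.30    # 30% of customers do not return each year

avg_lifetime_years = 1 / annual_churn                          # ~3.33 years
annual_profit = visits_per_year * spend_per_visit * margin     # $120 per year
customer_lifetime_value = annual_profit * avg_lifetime_years   # ~$400
print(f"CLV = ${customer_lifetime_value:.0f}")
```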
Predictive modeling is a method of predicting future outcomes by using data modeling. It’s one of
the premier ways a business can see its path forward and make plans accordingly. While not
foolproof, this method tends to have high accuracy rates, which is why it is so commonly used.
In short, predictive modeling is a statistical technique using machine learning and data mining to
predict and forecast likely future outcomes with the aid of historical and existing data. It works by
analyzing current and historical data and projecting what it learns on a model generated to forecast
likely outcomes. Predictive modeling can be used to predict just about anything, from TV ratings
and a customer’s next purchase to credit risks and corporate earnings.
A predictive model is not fixed; it is validated or revised regularly to incorporate changes in the
underlying data. In other words, it’s not a one-and-done prediction. Predictive models make
assumptions based on what has happened in the past and what is happening now. If incoming, new
data shows changes in what is happening now, the impact on the likely future outcome must be
recalculated, too. For example, a software company could model historical sales data against
marketing expenditures across multiple regions to create a model for future revenue based on the
impact of the marketing spend.
Most predictive models work fast and often complete their calculations in real time. That’s why
banks and retailers can, for example, calculate the risk of an online mortgage or credit card
application and accept or decline the request almost instantly based on that prediction.
Some predictive models are more complex, such as those used in computational biology
and quantum computing; the resulting outputs take longer to compute than a credit card application
but are done much more quickly than was possible in the past thanks to advances in technological
capabilities, including computing power.
Fortunately, predictive models don’t have to be created from scratch for every application.
Predictive analytics tools use a variety of vetted models and algorithms that can be applied to a
wide spread of use cases.
Predictive modeling techniques have been perfected over time. As we add more data, more
muscular computing, AI and machine learning and see overall advancements in analytics, we’re
able to do more with these models.
1. Classification model: Considered the simplest model, it categorizes data for simple and
direct query response. An example use case would be to answer the question “Is this a
fraudulent transaction?”
2. Clustering model: This model nests data together by common attributes. It works by
grouping things or people with shared characteristics or behaviors and plans strategies for
each group at a larger scale. An example is in determining credit risk for a loan applicant
based on what other people in the same or a similar situation did in the past.
3. Forecast model: This is a very popular model, and it works on anything with a numerical
value based on learning from historical data. For example, in answering how much lettuce
a restaurant should order next week or how many calls a customer support agent should be
able to handle per day or week, the system looks back to historical data.
4. Outliers model: This model works by analyzing abnormal or outlying data points. For
example, a bank might use an outlier model to identify fraud by asking whether a
transaction is outside of the customer’s normal buying habits or whether an expense in a
given category is normal or not. For example, a $1,000 credit card charge for a washer and
dryer in the cardholder’s preferred big box store would not be alarming, but $1,000 spent
on designer clothing in a location where the customer has never charged other items might
be indicative of a breached account.
5. Time series model: This model evaluates a sequence of data points based on time. For
example, the number of stroke patients admitted to the hospital in the last four months is
used to predict how many patients the hospital might expect to admit next week, next month
or the rest of the year. A single metric measured and compared over time is thus more
meaningful than a simple average.
Common predictive modeling algorithms include:
1. Random Forest: This algorithm is derived from a combination of decision trees, none of
which are related, and can use both classification and regression to classify vast amounts
of data.
2. Generalized Linear Model (GLM) for Two Values: This algorithm narrows down the
list of variables to find “best fit.” It can work out tipping points and change data capture
and other influences, such as categorical predictors, to determine the “best fit” outcome,
thereby overcoming drawbacks in other models, such as a regular linear regression.
3. Gradient Boosted Model: This algorithm also uses several combined decision trees, but
unlike Random Forest, the trees are related. It builds out one tree at a time, thus enabling
the next tree to correct flaws in the previous tree. It’s often used in rankings, such as on
search engine outputs.
4. K-Means: A popular and fast algorithm, K-Means groups data points by similarities and
so is often used for the clustering model. It can quickly render things like personalized
retail offers to individuals within a huge group, such as a million or more customers with
a similar liking of lined red wool coats.
5. Prophet: This algorithm is used in time-series or forecast models for capacity planning,
such as for inventory needs, sales quotas and resource allocations. It is highly flexible and
can easily accommodate heuristics and an array of useful assumptions.
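A minimal Random Forest sketch for a fraud-style classification question like the one above; the transactions are synthetic:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic transactions: features standing in for amount, hour, distance, etc.
X, y = make_classification(n_samples=1000, n_features=8, weights=[0.95],
                           random_state=0)  # roughly 5% "fraudulent" class
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An ensemble of unrelated decision trees voting on each transaction.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)
print("Held-out accuracy:", forest.score(X_test, y_test))
```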
Predictive Modeling and Data Analytics
Predictive modeling is also known as predictive analytics. Generally, the term “predictive
modeling” is favored in academic settings, while “predictive analytics” is the preferred term for
commercial applications of predictive modeling.
Successful use of predictive analytics depends heavily on unfettered access to sufficient volumes
of accurate, clean and relevant data. While predictive models can be extraordinarily complex, such
as those using decision trees and k-means clustering, the most complex part is always the neural
network; that is, the model by which computers are trained to predict outcomes. Machine learning
uses a neural network to find correlations in exceptionally large data sets and “to learn” and
identify patterns within the data.
DATA MINING
The data mining process is used to extract patterns and probabilities from large datasets, which
is why it is heavily used in business for forecasting trends. It is also used in fields like marketing,
manufacturing, finance, and government to make predictions and analyses, using tools and
techniques like the R language and Oracle Data Mining, and it involves a flow of six different
steps.
One of the essential tasks of data mining is the automatic and semi-automatic analysis of large
quantities of raw data and information to extract previously unknown, interesting patterns, such
as clusters or groups of data records, anomalies (unusual records), and dependencies, the last of
which makes use of sequential pattern mining and association rule mining. This may make use
of spatial indices. These patterns can be seen as a summary of the input data and can be used in
further analysis, such as predictive analysis and machine learning. More accurate results can be
obtained once you start making use of decision support systems.
Before patterns can be found, organizations must gather and process the data accordingly.
Basically, in a nutshell, this involves the ETL set of processes – the extraction, transformation,
and loading of the data – and everything else required for this ETL to happen. It involves the
cleansing, change, and processing of data in various systems and representations. Clients can use
this processed data to analyse their businesses and the trends within their markets.
Advantages
Data mining offers advantages in business as well as in fields like medicine, weather forecasting,
healthcare, transportation, insurance, and government. Some of the advantages include:
1. Marketing/Retail: It helps all the marketing companies and firms to build models which
are based on a historical set of data and information to predict the responsiveness to the
marketing campaigns prevailing today, such as online marketing campaigns, direct mail,
etc.
2. Finance/Banking: Data mining gives financial institutions information about loans and also
credit reporting. When a model is built on historical information, good or bad loans can then be
determined by the financial institutions.
3. Manufacturing: Faulty equipment and the quality of manufactured products can be
determined by using the optimal control parameters. For example, for some semiconductor
development industries, water hardness and quality become a major factor in the quality of the
finished product.
4. Government: Governments can benefit from monitoring and gauging financial movements to
identify suspicious activity such as money laundering or fraud.
The data mining process itself moves through the following stages:
1. Data cleaning: Data cleaning becomes an essential component to obtain the final data
analysis. It involves identifying and removing inaccurate and tricky data from a set of tables,
databases, and record sets. Some techniques include ignoring the tuple, which is mainly done
when the class label is not in place; the next approach requires filling in the missing values,
replacing missing and incorrect values with global constants or with predicted or mean values.
2. Data integration: It is a technique that involves merging the new set of information with
the existing group. The source may, however, involve many data sets, databases, or flat files.
The customary implementation for data integration is creating an EDW (enterprise data
warehouse), which involves two concepts – tight and loose coupling – whose details are beyond
the scope of these notes.
3. Data transformation: This requires transforming data between formats, generally from the
source system to the required destination system. Common strategies include smoothing,
aggregation, normalization, and generalization.
4. Data discretization: The technique that splits a continuous attribute domain into intervals is
called data discretization. The datasets are stored in small chunks, thereby making our study
much more efficient. The two strategies are top-down discretization and bottom-up discretization.
5. Concept hierarchies: They minimize the data by collecting and replacing low-level concepts
with high-level concepts. Concept hierarchies define the multi-dimensional data with multiple
levels of abstraction. The methods include binning, histogram analysis, cluster analysis, etc.
6. Pattern evaluation and data presentation: If the data is presented efficiently, the client and
the customers can make use of it in the best possible way. After going through the above set of
stages, the data is presented in graphs and diagrams, thereby enabling its effective use.
The following two are among the most popular tools and techniques of data mining:
1. R language: It is an open-source tool that is used for graphics and statistical computing. It
offers a wide variety of classical statistical tests, classification and graphical techniques,
time-series analysis, and more.
2. Oracle Data Mining: Popularly known as ODM, it is part of the Oracle Advanced Analytics
database option, generating detailed insights and predictions specifically used to detect customer
behavior, develop customer profiles, and identify cross-selling opportunities.
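To make the preprocessing stages above concrete, here is a minimal pandas sketch of cleaning, integration, and discretization on invented data:

```python
import pandas as pd

# Data integration: merge two hypothetical sources on a shared key.
sales = pd.DataFrame({"id": [1, 2, 3], "amount": [120.0, None, 300.0]})
customers = pd.DataFrame({"id": [1, 2, 3], "age": [25, 41, 63]})
df = sales.merge(customers, on="id")

# Data cleaning: replace the missing amount with the column mean.
df["amount"] = df["amount"].fillna(df["amount"].mean())

# Data discretization: split the continuous age attribute into intervals.
df["age_band"] = pd.cut(df["age"], bins=[0, 30, 50, 100],
                        labels=["young", "middle", "senior"])
print(df)
```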
Conclusion
Data mining is all about explaining historical data and real streaming data, thereby making
predictions and analyses on top of the mined data. It is closely related to data science and
machine learning.
One of the drawbacks can be the training of people on the software, which can be a complicated
and time-consuming task. Data mining has become a necessary component of systems today, and
by making efficient use of it, businesses can grow and predict their future sales and revenue.
DATA MINING AND PREDICTIVE ANALYSIS MODELING
KEY DIFFERENCES OF PREDICTIVE ANALYTICS VS DATA MINING
Below is the difference between predictive analytics and data mining
The data mining process typically follows six phases (CRISP-DM):
a. Business Understanding Phase – define the project objectives and requirements in business
terms and translate them into a data mining problem definition.
b. Data Understanding Phase – collect data and use exploratory data analysis to familiarize
yourself with it.
c. Data Preparation Phase – clean and apply transformations to the raw data so that it is ready
for modeling.
d. Modeling Phase – select and apply appropriate modeling techniques and calibrate their
settings.
e. Evaluation Phase – Models must be evaluated for quality and effectiveness before we deploy.
Also, determine whether the model, in fact, achieves the objectives set for it in phase 1.
f. Deployment Phase – Making use of models in production might be a simple deployment like
generating a report or a complex one like Implementing a parallel data mining process in
another department.
The predictive analytics process, by contrast, typically proceeds as follows:
a. Define Business Goal – Determine what business goal is to be achieved and how data fits in.
For example, the business goal is more effective offers to new customers, and the data needed
is the profile and behaviour of past customers.
b. Collect Data – Pull data from online systems or from third-party tools to better understand it.
This helps to find a reason behind a pattern. Sometimes marketing surveys are conducted to
collect data.
c. Draft Predictive Model – A model is created with the newly collected data and business
knowledge. A model can be a simple business rule like “there is a greater chance of converting
users from age a to b in India if we make an offer like this” or a complex mathematical model.
Typical patterns surfaced along the way include:
• performance patterns specific to KPIs (e.g., is subscription increasing with active user count?)
• system performance patterns (e.g., page loading time across different devices – any pattern?)
Benefits of predictive analytics include:
a. Vision – Helps to see what is invisible to others. Predictive analytics can go through a lot of
past customer data, associate it with other pieces of data, and assemble all the pieces in the
right order.
b. Decision – Helps to make decisions free of emotion and bias. It provides consistent and
unbiased insights to support decisions.
c. Precision – Helps to use automated tools to do the reporting job for you, saving time and
reducing errors.
● Performance – The performance of data mining is measured by how well the model finds
patterns in data. Most of the time it will be a regression, classification, or clustering model, and
there is a well-defined performance measure for each of these. The performance of predictive
analytics, by contrast, is measured on business impact. For example, how well does a targeted
ad campaign work compared to a general campaign? No matter how well data mining finds
patterns, business insight is a must for predictive models to work well.
● Future – The data mining field is evolving very fast, aiming to find patterns in data with fewer
data points and a minimum number of features with the help of more sophisticated models like
deep neural networks. Pioneers in this field, like Google, are also trying to make the process
simple and accessible to everyone; one example is Cloud AutoML from Google. Predictive
analytics is expanding to a wide variety of new areas such as employee retention prediction and
crime prediction (aka predictive policing). At the same time, organizations are trying to predict
more accurately by collecting as much information about users as possible, such as where they
are, and feeding it back into data mining.
HOW DATA MINING WORKS
Imagine that you have gathered three friends and decided which pizza to buy - vegetarian, meat,
or fish? You just poll everyone and conclude what exactly needs to be ordered in your favorite
pizzeria. But what if, for example, you have three million friends and several hundred varieties of
pizza from several dozen establishments? It's not so easy to deal with an order, is it? Nevertheless,
it is what data mining specialists do.
According to this principle, when you go to an online store to buy earrings, you will immediately
be offered a bracelet, pendant, and rings to match. And to the swimsuit - a straw hat, sunglasses,
and sandals.
It is precisely an ideally structured array of specific information that makes it possible to identify
a suspicious declaration of income among millions of others of the same kind.
Data mining is conventionally divided into three stages:
• Exploration, in which the data is sorted into essential and non-essential (cleaning, data
transformation, selection of subsets)
• Model building or hidden pattern identification, in which the same datasets are applied to
different models, allowing better choices; this is called competitive evaluation of models
• Deployment, in which the selected data model is used to predict the results
Data mining is handled by highly qualified mathematicians and engineers as well as AI/ML
experts.
For many organisations, big data – incredible volumes of raw structured, semi-structured and
unstructured data – is an untapped resource of intelligence that can support business decisions and
enhance operations. As data continues to diversify and change, more and more organisations are
embracing predictive analytics, to tap into that resource and benefit from data at scale.
A common misconception is that predictive analytics and machine learning are the same thing.
This is not the case. (Where the two do overlap, however, is predictive modelling – but more
on that later.)
At its core, predictive analytics encompasses a variety of statistical techniques (including machine
learning, predictive modelling and data mining) and uses statistics (both historical and current) to
estimate, or ‘predict’, future outcomes. These outcomes might be behaviours a customer is likely
to exhibit or possible changes in the market, for example. Predictive analytics help us to understand
possible future occurrences by analysing the past.
Machine learning, on the other hand, is a subfield of computer science that, as per Arthur Samuel’s
definition from 1959, gives ‘computers the ability to learn without being explicitly programmed’.
Machine learning evolved from the study of pattern recognition and explores the notion that
algorithms can learn from and make predictions on data. And, as they become more
'intelligent', these algorithms can go beyond their explicit program instructions to make highly
accurate, data-driven decisions.
Predictive analytics is driven by predictive modelling. It’s more of an approach than a process.
Predictive analytics and machine learning go hand-in-hand, as predictive models typically include
a machine learning algorithm. These models can be trained over time to respond to new data
or values, delivering the results the business needs. Predictive modelling largely overlaps with the
field of machine learning.
There are two types of predictive models: classification models, which predict class membership, and regression models, which predict a number. These models are made up of algorithms. The algorithms perform the data mining and statistical analysis, determining trends and patterns in data. Predictive analytics software solutions have built-in algorithms that can be used to make predictive models. The algorithms are defined as 'classifiers', identifying which set of categories data belongs to.
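To make the two model types concrete, here is a small scikit-learn sketch (our own illustration, using its bundled example datasets): the classifier predicts class membership, while the regressor predicts a number.

from sklearn.datasets import load_iris, load_diabetes
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Classification model: predicts class membership (a category)
X_c, y_c = load_iris(return_X_y=True)
clf = DecisionTreeClassifier().fit(X_c, y_c)
print(clf.predict(X_c[:1]))   # e.g. [0] - a class label

# Regression model: predicts a number (a continuous value)
X_r, y_r = load_diabetes(return_X_y=True)
reg = DecisionTreeRegressor().fit(X_r, y_r)
print(reg.predict(X_r[:1]))   # e.g. [151.0] - a numeric value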
Decision trees:
Decision trees are a simple, but powerful form of multiple variable analysis. They are produced
by algorithms that identify various ways of splitting data into branch-like segments. Decision
trees partition data into subsets based on categories of input variables, helping you to understand
someone’s path of decisions.
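A brief sketch of what those branch-like splits look like in practice (assuming scikit-learn; the iris dataset stands in for real business data):

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=2).fit(X, y)

# export_text prints the branch-like segments, i.e. the path of
# decisions the tree uses to partition the data into subsets
print(export_text(tree, feature_names=load_iris().feature_names))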
Neural networks:
Patterned after the operation of neurons in the human brain, neural networks (also called
artificial neural networks) are a variety of deep learning technologies. They’re typically used to
solve complex pattern recognition problems – and are incredibly useful for analyzing large data
sets. They are great at handling nonlinear relationships in data – and work well when certain
variables are unknown.
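For instance, a minimal scikit-learn sketch (an illustration, not from the source) of a small neural network handling a nonlinear pattern that a linear model could not separate:

from sklearn.datasets import make_moons
from sklearn.neural_network import MLPClassifier

# Two interleaving half-moons: a classic nonlinear pattern
X, y = make_moons(n_samples=500, noise=0.2, random_state=0)
net = MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=2000,
                    random_state=0).fit(X, y)
print(net.score(X, y))   # training accuracy, typically well above 0.9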
Other classifiers:
Time Series Algorithms: Time series algorithms sequentially plot data and are useful for
forecasting continuous values over time.
Clustering Algorithms: Clustering algorithms organise data into groups whose members are
similar.
Ensemble Models: Ensemble models use multiple machine learning algorithms to obtain better
predictive performance than what could be obtained from one algorithm alone.
Factor Analysis: Factor analysis is a method used to describe variability and aims to find
independent latent variables.
Naïve Bayes: The Naïve Bayes classifier allows us to predict a class/category based on a given
set of features, using probability.
Support vector machines: Support vector machines are supervised machine learning techniques
that use associated learning algorithms to analyze data and recognize patterns.
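As a quick side-by-side illustration of two of these classifiers (scikit-learn assumed, with its bundled digits dataset):

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Naive Bayes predicts a class from a set of features using probability
nb = GaussianNB().fit(X_tr, y_tr)
# A support vector machine analyzes the data and recognizes patterns
svm = SVC().fit(X_tr, y_tr)

# Each classifier approaches the data differently, so accuracy differs
print("Naive Bayes:", nb.score(X_te, y_te))
print("SVM:        ", svm.score(X_te, y_te))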
Each classifier approaches data in a different way; therefore, to get the results they need, organizations must choose the right classifiers and models.
For organizations overflowing with data but struggling to turn it into useful insights, predictive
analytics and machine learning can provide the solution. No matter how much data an organization
has, if it can’t use that data to enhance internal and external processes and meet objectives, the data
becomes a useless resource.
Predictive analytics is most commonly used for security, marketing, operations, and risk and fraud detection. Predictive analytics and machine learning are utilised in these ways across many different industries.
While machine learning and predictive analytics can be a boon for any organisation, implementing
these solutions haphazardly, without considering how they will fit into everyday operations, will
drastically hinder their ability to deliver the insights the organisation needs.
To get the most out of predictive analytics and machine learning, organisations need to ensure they
have the architecture in place to support these solutions, as well as high-quality data to feed them
and help them to learn. Data preparation and quality are key enablers of predictive analytics. Input
data, which may span multiple platforms and contain multiple big data sources, must be
centralised, unified and in a coherent format.
In order to achieve this, organisations must develop a sound data governance program to police
the overall management of data and ensure only high-quality data is captured and recorded.
Secondly, existing processes will need to be altered to include predictive analytics and machine
learning as this will enable organisations to drive efficiency at every point in the business. Lastly,
organisations need to know what problems they are looking to solve, as this will help them to
determine the best and most applicable model to use.
Typically, an organisation's data scientists and IT experts are tasked with choosing the right predictive models – or building their own to meet the organisation's needs.
Today, however, predictive analytics and machine learning is no longer just the domain of
mathematicians, statisticians and data scientists, but also that of business analysts and consultants.
More and more of a business’ employees are using it to develop insights and improve business
operations – but problems arise when employees do not know what model to use, how to deploy
it, or need information right away.
At SAS, we develop sophisticated software to support organisations with their data governance and analytics. Our data governance solutions help organisations to maintain high-quality data, as well as align operations across the business and pinpoint data problems within the same environment. Our predictive analytics solutions help organisations to turn their data into timely insights for better, faster decision making. These predictive analytics solutions are designed to meet the needs of all types of users and enable them to deploy predictive models rapidly.
MACHINE LEARNING
Machine learning derives insightful information from large volumes of data by leveraging
algorithms to identify patterns and learn in an iterative process. ML algorithms use computation
methods to learn directly from data instead of relying on any predetermined equation that may
serve as a model.
While machine learning is not a new concept – its roots are often traced back to the World War II era and the codebreaking work around the Enigma machine – the ability to apply complex mathematical calculations automatically to growing volumes and varieties of available data is a relatively recent development.
Today, with the rise of big data, IoT, and ubiquitous computing, machine learning has become essential for solving problems across numerous areas of business and science.
Machine learning algorithms are built from a training dataset to create a model. As new input
data is introduced to the trained ML algorithm, it uses the developed model to make a prediction.
[Figure: How Machine Learning Works – a high-level use case scenario. Typical machine learning examples may involve many other factors, variables, and steps.]
Further, the prediction is checked for accuracy. Based on its accuracy, the ML algorithm is either
deployed or trained repeatedly with an augmented training dataset until the desired accuracy is
achieved.
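That train–check–retrain cycle can be sketched as a simple loop. Note that this is schematic: get_more_training_data() is a hypothetical helper standing in for whatever augmentation process an organization uses.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

def train_until_accurate(X, y, target_accuracy=0.9):
    # Hold out a test set so the prediction can be checked for accuracy
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    while True:
        model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
        if model.score(X_te, y_te) >= target_accuracy:
            return model  # desired accuracy achieved: deploy the model
        # Otherwise, augment the training dataset and train again
        X_new, y_new = get_more_training_data()   # hypothetical helper
        X_tr = np.vstack([X_tr, X_new])
        y_tr = np.concatenate([y_tr, y_new])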
Machine learning algorithms can be trained in many ways, with each method having its pros and
cons. Based on these methods and ways of learning, machine learning is broadly categorized into
four main types:
Types of Machine Learning

1. Supervised machine learning
This type of ML involves supervision, where machines are trained on labeled datasets and enabled
to predict outputs based on the provided training. The labeled dataset specifies that some input and
output parameters are already mapped. Hence, the machine is trained with the input and
corresponding output. A device is made to predict the outcome using the test dataset in subsequent
phases.
For example, consider an input dataset of parrot and crow images. Initially, the machine is trained
to understand the pictures, including the parrot and crow’s color, eyes, shape, and size. Post-
training, an input picture of a parrot is provided, and the machine is expected to identify the object
and predict the output. The trained machine checks for the various features of the object, such as
color, eyes, shape, etc., in the input picture, to make a final prediction. This is the process of object
identification in supervised machine learning.
The primary objective of the supervised learning technique is to map the input variable (x) to the output variable (y). Supervised machine learning is further classified into two broad categories:
• Classification, where the algorithm predicts a categorical output. Some well-known classification algorithms include the Random Forest Algorithm, Decision Tree Algorithm, Logistic Regression Algorithm, and Support Vector Machine Algorithm.
• Regression, where the algorithm predicts a continuous numeric output. Popular regression algorithms include the Simple Linear Regression Algorithm, Multivariate Regression Algorithm, Decision Tree Algorithm, and Lasso Regression.
2. Unsupervised machine learning
Unsupervised learning refers to a learning technique that's devoid of supervision. Here, the
machine is trained using an unlabeled dataset and is enabled to predict the output without any
supervision. An unsupervised learning algorithm aims to group the unsorted dataset based on the
input’s similarities, differences, and patterns.
For example, consider an input dataset of images of a fruit-filled container. Here, the images are not known to the machine learning model. When we input the dataset into the ML model, the task of the model is to identify the pattern of objects, such as color, shape, or differences seen in the input images, and categorize them. Upon categorization, the machine then predicts the output as it gets tested with a test dataset. Unsupervised machine learning is further classified into two categories:
• Clustering: The clustering technique refers to grouping objects into clusters based
on parameters such as similarities or differences between objects. For example,
grouping customers by the products they purchase.
Some known clustering algorithms include the K-Means Clustering Algorithm, Mean-Shift Algorithm, DBSCAN Algorithm, Principal Component Analysis, and Independent Component Analysis.
• Association: Association rule learning refers to identifying interesting relations between variables within a large dataset, such as products that are frequently bought together.
Popular algorithms obeying association rules include the Apriori Algorithm, Eclat Algorithm, and FP-Growth Algorithm.
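For instance, the customer-grouping example above might look like this with K-Means (scikit-learn assumed; the purchase counts are hypothetical):

import numpy as np
from sklearn.cluster import KMeans

# Hypothetical purchase counts per customer: [electronics, groceries, clothing]
purchases = np.array([[9, 1, 0], [8, 2, 1], [0, 7, 8],
                      [1, 9, 7], [0, 8, 9], [9, 0, 1]])

# Group the unlabeled data into clusters of similar customers
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(purchases)
print(kmeans.labels_)   # e.g. [0 0 1 1 1 0] - two customer segments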
3. Semi-supervised learning
Semi-supervised learning combines the two approaches above: the model is trained on a small amount of labeled data together with a large amount of unlabeled data.

4. Reinforcement learning
Unlike supervised learning, reinforcement learning lacks labeled data, and the agents learn via experience only. Consider video games: the game specifies the environment, and each move of the reinforcement agent defines its state. The agent receives feedback via punishments and rewards, which affect the overall game score, and its ultimate goal is to achieve a high score.
Reinforcement learning is applied across different fields such as game theory, information theory, and multi-agent systems. Reinforcement learning methods are further divided into two types: positive and negative reinforcement learning.
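The reward-and-punishment loop can be illustrated with tabular Q-learning on a toy five-cell "corridor" game (a simplified sketch of our own, not from the source):

import numpy as np

# Toy environment: states 0..4; the agent is rewarded on reaching state 4
n_states, n_actions = 5, 2            # actions: 0 = move left, 1 = move right
Q = np.zeros((n_states, n_actions))   # the agent's learned value table
alpha, gamma, epsilon = 0.1, 0.9, 0.1

for episode in range(500):
    state = 0
    while state != 4:
        if np.random.rand() < epsilon:
            action = np.random.randint(n_actions)   # explore
        else:
            action = int(np.argmax(Q[state]))       # exploit what was learned
        next_state = max(0, state - 1) if action == 0 else min(4, state + 1)
        reward = 1.0 if next_state == 4 else -0.01  # reward vs. punishment
        # Q-learning update: learn from the feedback received
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max()
                                     - Q[state, action])
        state = next_state

print(np.argmax(Q, axis=1))   # the learned policy should favor moving right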
Industry verticals handling large amounts of data have realized the significance and value of
machine learning technology. As machine learning derives insights from data in real-time,
organizations using it can work efficiently and gain an edge over their competitors.
Every industry vertical in this fast-paced digital world benefits immensely from machine learning tech. Here, we look at the top five ML application sectors.
1. Healthcare industry
Machine learning is being increasingly adopted in the healthcare industry, thanks to wearable devices and sensors such as fitness trackers and smart health watches. All such devices monitor users' health data to assess their health in real time.
Moreover, the technology is helping medical practitioners in analyzing trends or flagging events
that may help in improved patient diagnoses and treatment. ML algorithms even allow medical
experts to predict the lifespan of a patient suffering from a fatal disease with increasing accuracy.
Companies like Genentech have collaborated with GNS Healthcare to leverage machine learning and simulation AI platforms in order to innovate biomedical treatments. ML technology looks for patients' response markers by analyzing individual genes, which makes it possible to offer targeted therapies to patients.
2. Finance sector
Today, several financial organizations and banks use machine learning technology to tackle
fraudulent activities and draw essential insights from vast volumes of data. ML-derived insights
aid in identifying investment opportunities that allow investors to decide when to trade.
Moreover, data mining methods help cyber-surveillance systems zero in on warning signs of
fraudulent activities, subsequently neutralizing them. Several financial institutes have already
partnered with tech companies to leverage the benefits of machine learning.
For example,
• Citibank has partnered with fraud detection company Feedzai to handle online and
in-person banking frauds.
• PayPal uses several machine learning tools to differentiate between legitimate and
fraudulent transactions between buyers and sellers.
3. Retail sector
Retail websites extensively use machine learning to recommend items based on users’ purchase
history. Retailers use ML techniques to capture data, analyze it, and deliver personalized shopping
experiences to their customers. They also implement ML for marketing campaigns, customer
insights, customer merchandise planning, and price optimization.
According to a September 2021 report by Grand View Research, Inc., the global recommendation
engine market is expected to reach a valuation of $17.30 billion by 2028. Common day-to-day
examples of recommendation systems include:
• When you browse items on Amazon, the product recommendations that you see on
the homepage result from machine learning algorithms. Amazon uses artificial
neural networks (ANN) to offer intelligent, personalized recommendations relevant
to customers based on their recent purchase history, comments, bookmarks, and
other online activities.
• Netflix and YouTube rely heavily on recommendation systems to suggest shows and
videos to their users based on their viewing history.
Moreover, retail sites are also powered with virtual assistants or conversational chatbots that
leverage ML, natural language processing (NLP), and natural language understanding (NLU) to
automate customer shopping experiences.
4. Travel industry
Machine learning is playing a pivotal role in expanding the scope of the travel industry. Rides
offered by Uber, Ola, and even self-driving cars have a robust machine learning backend.
Consider Uber's machine learning algorithm that handles the dynamic pricing of its rides. Uber uses a machine learning model called 'Geosurge' to manage dynamic pricing parameters. It uses real-time predictive modeling on traffic patterns, supply, and demand. If you are getting late for a meeting and need to book an Uber in a crowded area, the dynamic pricing model kicks in, and you can get an Uber ride immediately but may need to pay twice the regular fare.
Moreover, the travel industry uses machine learning to analyze user reviews. User comments are
classified through sentiment analysis based on positive or negative scores. This is used for
campaign monitoring, brand monitoring, compliance monitoring, etc., by companies in the travel
industry.
5. Social media
With machine learning, billions of users can efficiently engage on social media networks. Machine
learning is pivotal in driving social media platforms, from personalizing news feeds to delivering
user-specific ads. For example, Facebook’s auto-tagging feature employs image recognition to
identify your friend’s face and tag them automatically. The social network uses ANN to recognize
familiar faces in users’ contact lists and facilitates automated tagging.
Similarly, LinkedIn knows when you should apply for your next role, whom you need to connect
with, and how your skills rank compared to peers. All these features are enabled by machine
learning.
Below is a point-by-point comparison between Machine Learning and Predictive Modelling.
Machine learning is an area of computer science which uses cognitive learning methods to program systems without the need of being explicitly programmed; it is closely related to other mathematical techniques and to data mining. Predictive modelling, on the other hand, is a mathematical technique which uses statistics for prediction. It aims to work upon the provided information to reach a conclusion about what is likely to happen.
Machine Learning vs Predictive Modelling:
1. Machine learning is an AI technique where the algorithms are given data and are asked to process it without a predetermined set of rules, whereas predictive analysis is the analysis of historical data as well as existing external data to find patterns and behaviors.
2. Machine learning algorithms are trained to learn from their past mistakes to improve future performance, whereas predictive modelling makes informed predictions based upon historical data.
3. Machine learning is a newer-generation technology that works on better algorithms and massive amounts of data, whereas predictive analysis is a study rather than a particular technology, and it existed long before machine learning came into existence; Alan Turing had already made use of such techniques to decode messages during World War II.
4. Related practices and learning techniques for machine learning include supervised and unsupervised learning.
5. Once a machine learning model is trained and tested on a relatively small dataset, the same method can be applied to unseen data. The data must effectively not be biased, as that would result in bad decision-making. In the case of predictive analysis, data is useful when it is complete, accurate, and substantial, so data quality needs to be taken care of when data is first ingested. Organizations use predictive analysis to produce forecasts, anticipate consumer behavior, and make rational decisions based on their findings.
BUSINESS FORECASTING WITH MACHINE LEARNING
Training any ML forecasting model requires an assessment stage, which foresees a comparison of predicted and actual results. It brings an understanding of how well the model performs. After that, it is possible to compare different forecasting algorithms and choose the one which produces the minimal amount of error. With this approach, businesses can replace traditional techniques with ML, getting the following benefits for their business forecasts:
• Acquiring insights and detecting hidden patterns that are difficult to trace with traditional approaches. Training ML forecasting models on big data and moving computation to the cloud is becoming a de facto industry standard.
• A reduced number of errors in forecasting. For instance, McKinsey claims that AI-driven forecasting models applied to supply chain management can reduce the number of errors by 20–50%.
• The ability to infuse more data into a model. External data may be valuable here and change the outcomes in terms of predictions.
• Flexibility and rapid adaptability to changes. Compared to traditional non-AI approaches, ML forecasting algorithms can be quickly adapted in case of any significant changes.
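The assessment stage described above can be as simple as comparing the errors of competing forecasters against held-out actuals. A minimal sketch (the demand figures and model outputs below are hypothetical; scikit-learn assumed):

import numpy as np
from sklearn.metrics import mean_absolute_error

# Hypothetical actual demand for the last six periods
actual = np.array([100, 120, 130, 125, 140, 150])
# Hypothetical predictions from two competing forecasting models
model_a = np.array([98, 118, 135, 120, 138, 155])
model_b = np.array([110, 105, 120, 140, 120, 170])

# Choose the algorithm that produces the minimal amount of error
for name, preds in [("model_a", model_a), ("model_b", model_b)]:
    print(name, "MAE:", mean_absolute_error(actual, preds))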
Please note that we're considering forecasting here, not predictive modeling – two related but distinct practices.
DEMAND FORECASTING
For example, a restaurant chain owner may want:
• to know the number of dishes that will be sold in the restaurant, in order to plan food stock in advance;
• to understand and define the appropriate number of employees required to provide quality customer service;
• to come up with a proper and timely marketing campaign.
In order to develop a demand forecasting model and help businesses fulfill these goals, a good starting point is to analyze historical data from previous periods. One of the ways to improve the model's performance could be the integration of NLP algorithms as well. For example, we can consider Google reviews for our restaurant chain, as well as for its main competitors, to identify the main dishes and aspects of service that customers like or do not like.
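A minimal sketch of such a demand model (our own illustration; the daily_sales figures are hypothetical), using the previous days' sales as features to predict the next day:

import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical daily dish sales for the previous period
daily_sales = np.array([80, 85, 90, 88, 95, 100, 97, 105, 110, 108, 115, 120])

# Use the previous 3 days as features to predict the next day's demand
X = np.array([daily_sales[i:i + 3] for i in range(len(daily_sales) - 3)])
y = daily_sales[3:]
model = LinearRegression().fit(X, y)

# Forecast tomorrow's demand from the 3 most recent days
print(model.predict(daily_sales[-3:].reshape(1, -1)))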
FRAUD DETECTION
According to a TransUnion report, there was a 52.2% increase in the rate of suspected digital fraud globally between 2019 and 2021. This indicates that companies should put greater effort into developing anti-fraud tactics. ML
algorithms can detect suspicious financial transactions by learning from past data.
They are already successfully applied in e-commerce, banking, healthcare, fintech,
and other areas.
For instance, suppose a cafe chain owner wants to analyze the productivity of employees, and one of the main goals is to detect hidden patterns that allow employees to cheat. Fraud like this can cause the business to lose money. Based on historical data, we can develop a fraud detection model that will detect anomalous patterns and send notifications about them. Managers can then closely analyze the detected anomalies and identify the root cause of such deviations in the data, so that similar cases can be prevented in the future and the business kept safe.
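A brief sketch of such an anomaly detector using Isolation Forest, one common choice for this task (scikit-learn assumed; the per-shift records are hypothetical):

import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical per-shift records: [refunds issued, voided sales]
shifts = np.array([[1, 2], [0, 1], [2, 2], [1, 1], [0, 2],
                   [1, 3], [2, 1], [12, 9]])   # last row looks suspicious

detector = IsolationForest(contamination=0.1, random_state=0).fit(shifts)
# -1 flags an anomaly for a manager to investigate, 1 means normal
print(detector.predict(shifts))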
Key Machine Learning Forecasting Algorithms:
Let’s look at some key machine learning forecasting algorithms to better
understand how ML forecasting can be applied.
REGRESSION ALGORITHMS
ML regression models are applied to predict trends and outcomes, being capable of
comprehending how variables impact each other along with the results. The
dependency between variables can be both linear and nonlinear, while labeled data
is required for training. After understanding the relationship of variables, regression
models can predict what results will be in unseen data.
Simple and multiple linear regression, as well as logistic regression (where the target variable has only two values), are among the most common baseline models used to predict sales, stock prices, and customer behavior.
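For example, a logistic regression sketch predicting a two-valued outcome – will a customer churn or not – from hypothetical usage features (scikit-learn assumed):

import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical features: [monthly visits, support tickets]; target: churned?
X = np.array([[20, 0], [18, 1], [2, 5], [1, 4], [15, 0], [3, 6]])
y = np.array([0, 0, 1, 1, 0, 1])   # 1 = churned, 0 = stayed

model = LogisticRegression().fit(X, y)
print(model.predict([[17, 1], [2, 7]]))   # predicted classes
print(model.predict_proba([[2, 7]]))      # class probabilities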
DEEP LEARNING ALGORITHMS
The time series forecasting toolbox is gradually being replenished with new deep learning algorithms. The more versatile and explainable a model is, the higher the
chances for its production use. Let’s take a look at a few deep learning models for
time series forecasting.
The first one is DeepAR. It’s a supervised ML algorithm created by Amazon and
based on recurrent neural networks. It has proven its efficiency with datasets
consisting of hundreds of interrelated time series. The advantages of the method
are the possibility to use a rich set of inputs, scaling capabilities, and suitability for
probabilistic forecasting.
The second one is the Temporal Fusion Transformer (TFT). It outperforms other deep learning models in terms of versatility and can be built on multiple time series.
TFT performs well even if trained on a small dataset, thus being suitable for
demand forecasting as just one example.
The third algorithm is long short-term memory (LSTM), which is based upon an artificial recurrent neural network (RNN) in which the output from one step is fed as the input to the next step. As for the architecture of LSTM, it consists of neural networks and memory cells for maintaining data, while any manipulation within the memory is performed by gates. There are three gates: Forget, Input, and Output. However, LSTM requires plenty of resources and a long time for training.
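A compact sketch of an LSTM forecaster (assuming TensorFlow/Keras is installed; the series here is synthetic), where windows of past values predict the next step:

import numpy as np
from tensorflow import keras   # assumes TensorFlow/Keras is installed

# Synthetic series; windows of 10 past steps predict the next value
series = np.sin(np.linspace(0, 20, 300))
X = np.array([series[i:i + 10] for i in range(len(series) - 10)])[..., None]
y = series[10:]

model = keras.Sequential([
    keras.layers.LSTM(32, input_shape=(10, 1)),  # memory cells with gates
    keras.layers.Dense(1),                       # next-step prediction
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, verbose=0)             # training is the costly part
print(model.predict(X[-1:], verbose=0))          # forecast one step ahead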
TREE-BASED ALGORITHMS
Tree-based algorithms are supervised learning approaches whose advantages include accuracy, stability, and suitability for mapping non-linear patterns. The idea is to define homogeneous sets in the sample, taking into account the key differentiators in the input. The classification of tree-based algorithms depends on the target variable. Tree-based algorithms can be easily grasped, require minimal data cleaning, and handle different types of variables; their tendency toward overfitting and their poor handling of continuous variables may be seen as disadvantages.
GAUSSIAN PROCESSES
Gaussian processes (GP) are inferior in popularity to other models, yet they are powerful enough for industrial application, including automatic forecasting. Gaussian processes enable us to incorporate expert opinion via the kernel, though their application in forecasting depends on the number of parameters and may be computationally expensive.
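A short sketch of expert opinion entering via the kernel (scikit-learn assumed; the series and kernel choices are illustrative):

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ExpSineSquared

# Expert opinion enters via the kernel: here we assume the series is
# smooth (RBF) and has a 12-period seasonal cycle (ExpSineSquared)
kernel = RBF(length_scale=10.0) + ExpSineSquared(periodicity=12.0)

X = np.arange(48).reshape(-1, 1)                           # 48 past periods
y = np.sin(2 * np.pi * X.ravel() / 12) + 0.05 * X.ravel()  # toy series

gp = GaussianProcessRegressor(kernel=kernel).fit(X, y)
# Probabilistic forecast: both the mean and its uncertainty
mean, std = gp.predict(np.arange(48, 60).reshape(-1, 1), return_std=True)
print(mean[:3], std[:3])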
AUTO-REGRESSIVE ALGORITHMS
The group of auto-regressive algorithms predicts future values using the outputs from previous steps as inputs. Forecasting algorithms of this group include ARIMA, SARIMA, and others. In ARIMA, forecasting is carried out through a combination of autoregressive terms, differencing, and moving averages. For instance, an ARIMA model can predict fuel costs or forecast a company's revenue based on past periods. SARIMA uses the same basic idea but adds a seasonal component that may affect the outcomes.
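A sketch using the statsmodels library (assumed installed; the revenue figures are hypothetical):

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical monthly revenue for the past 3 years
revenue = np.array([100, 102, 105, 103, 108, 110, 109, 112, 115,
                    114, 118, 120, 119, 123, 125, 124, 128, 130,
                    129, 133, 136, 135, 139, 141, 140, 144, 147,
                    146, 150, 152, 151, 155, 158, 157, 161, 163])

# order=(p, d, q): autoregressive terms, differencing, moving average
fitted = ARIMA(revenue, order=(1, 1, 1)).fit()
print(fitted.forecast(steps=6))   # revenue forecast for the next 6 months
# For seasonal data, SARIMA adds a seasonal order, e.g.
# ARIMA(revenue, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))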
EXPONENTIAL SMOOTHING
Exponential smoothing is an alternative to ARIMA models. It can be applied as a forecasting model for univariate data and can be extended to support data with a systematic trend or seasonal component. In this model, a forecast is a weighted sum of past observations, where the importance (weight) of past observations decreases exponentially with age. The accuracy of prediction depends on the type of exponential smoothing model, which can be single, double, or triple; the most sophisticated exponential smoothing models take into account both trend and seasonality.
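A sketch of triple (Holt-Winters) exponential smoothing via statsmodels (assumed installed; the quarterly sales figures are hypothetical):

import numpy as np
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Hypothetical quarterly sales with trend and yearly seasonality
sales = np.array([30, 42, 55, 36, 34, 48, 62, 40,
                  38, 53, 69, 45, 43, 59, 76, 50], dtype=float)

# Triple exponential smoothing: level + trend + seasonal component
model = ExponentialSmoothing(sales, trend="add", seasonal="add",
                             seasonal_periods=4).fit()
print(model.forecast(8))   # sales forecast for the next 8 quarters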
Challenges of ML Forecasting
Nothing good comes without challenges, and ML forecasting is no exception. Key challenges of business forecasting with machine learning include the following:
• An insufficient amount of data to train a model
• An incorrectly chosen metric for evaluating results in alignment with business needs
• Imputation of missing data
• Dealing with outliers/anomalies
While infusing data at the scale of AI, businesses encounter difficulties and limitations; that's why it's crucial to involve experienced data science professionals and AI engineers when implementing machine learning.