Ba Unit3 Notes
Analyzing large volumes of data is already a crucial part of the decision-making process for any
business, irrespective of its size. Available big data helps resolve everyday problems such as
improving the conversion rate or building customer loyalty for an e-commerce business. But did
you know that you can also use this data to forecast events before they actually happen? That is
the value of predictive analytics solutions: predicting user behavior based on historical data and
acting accordingly to optimize sales.
For online businesses, executing predictive analytics periodically amounts to improving your
understanding of the customer and identifying changes in the market before they occur.
Predictive analytics models extract patterns from past and transactional data to recognize
risks and opportunities. Self-learning software automatically evaluates the existing data and
provides early warning of future problems. This enables you to build new sales strategies,
adjust to changes, and increase profit growth.
INTRODUCTION TO BUSINESS FORECASTING TECHNIQUES
Companies conduct business forecasts to determine their goals, targets, and project plans for each
new period, whether quarterly, annually, or even 2–5-year planning. Some companies utilize
predictive analytics software to collect and analyze the data necessary to make an accurate
business forecast. Predictive analytics solutions give you the tools to store data, organize
information into comprehensive datasets, develop predictive models to forecast business
opportunities, adapt datasets to data changes, and allow import/export from other data channels.
Forecasting helps managers guide strategy and make informed decisions about critical business
operations such as sales, expenses, revenue, and resource allocation. When done right, forecasting
adds a competitive advantage and can be the difference between successful and unsuccessful
companies.
To deal with the increasing variety and complexity of management forecasting problems, many
forecasting methods have been developed in recent years. Each method has its distinct uses, and
attention must be paid to selecting the right method for a specific application. The manager and
the forecaster share a significant role in choosing the technique; the better they understand the
range of forecasting possibilities, the more likely the company's forecasting effort will pay off.
Choosing the right forecasting method depends on many factors:
• significance and accessibility of historical data,
• the context of the prediction,
• time available for analysis,
• the degree of precision anticipated,
• value to the company,
• and the desired time period for the forecast.
The manager and the forecaster need to work together to achieve successful forecasting; the
questions they must answer together are discussed later in these notes.
Qualitative Techniques:
Qualitative techniques are applied when enough data is not available – i.e. when a product is
launched in the market for the first time. They use human evaluation and rating schemes to
convert qualitative judgments into quantitative estimates.
The goal is to gather all information and considerations related to the factors being evaluated in a
logical, impartial, and systematic manner. Such methods are often used in the field of new
technologies, where developing a product idea may require substantial “invention”, making
research and development requirements difficult to estimate, and where market acceptance and
penetration are highly uncertain.
Qualitative models are most successful with short-term projections. They are expert-driven,
weighing contrasting opinions and relying on judgment rather than calculable data. Examples
of qualitative models in business forecasting include:
• Market research: This involves polling people – experts, customers, employees – to get
their preferences, opinions, and feedback on a product or service.
• Delphi method: The Delphi method relies on asking a panel of experts for their opinions
and recommendations and compiling them into a forecast.
Time Series Analysis
A time series is a set of observations on the values that a variable takes at different times.
Examples: sales trends, stock market prices, weather forecasts, etc. In simple terms, take monthly
sales data: for January you sold 150 units, in February a bit more, say 300, and so on for all 12
months. That sales history is a time series, and given that there is a pattern in it, we can predict
the future sales of the same unit.
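To make this concrete, here is a minimal Python sketch (the monthly figures are hypothetical) that produces a naive forecast for the next month by averaging the most recent observations:

```python
# Naive time-series forecast: average the last k monthly observations.
monthly_sales = [150, 300, 280, 320, 310, 350,
                 340, 330, 360, 370, 390, 400]  # hypothetical units sold

def moving_average_forecast(series, k=3):
    """Forecast the next value as the mean of the last k values."""
    window = series[-k:]
    return sum(window) / len(window)

print(moving_average_forecast(monthly_sales))  # forecast for month 13
```

More sophisticated time-series methods (exponential smoothing, Box-Jenkins models) refine this same idea by weighting recent observations differently.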
Causal Methods:
Causal forecasting recognizes that the predicted dependent variable is affected by one or more
independent variables. Causal methods take into account all possible factors that may affect the
dependent variable. Consequently, the data necessary for such forecasting can vary from internal
data to external data, such as surveys, macroeconomic indicators, product characteristics, social
chatter, etc. Typically, causal models are continually revised to ensure that the latest data are
included in the model.
Quantitative business forecasting
Use quantitative forecasting when there is accurate past data available to analyze patterns and
predict the probability of future events in your business or industry.
Quantitative forecasting extracts trends from existing data to determine the more probable results.
It connects and analyzes different variables to establish cause and effect between events, elements,
and outcomes. An example of data used in quantitative forecasting is past sales numbers.
Quantitative models work with data, numbers, and formulas. There is little human interference in
quantitative analysis. Examples of quantitative models in business forecasting include:
• The indicator approach: This approach depends on the relationship between specific
indicators being stable over time, e.g., GDP and the unemployment rate. By following the
relationship between these two factors, forecasters can estimate a business's performance.
• The average approach: This approach infers that the predictions of future values are equal
to the average of the past data. It is best to use this approach only when assuming that the
future will resemble the past.
• Econometric modeling: Econometric modeling is a mathematically rigorous approach to
forecasting. Forecasters assume the relationships between indicators stay the same and test
the consistency and strength of the relationship between datasets.
• Time-series methods: Time-series methods use historical data to predict future outcomes.
By tracking what happened in the past, forecasters expect to get a near-accurate view of the
future.
Choosing the right business forecasting technique depends on many factors. Some of these are:
• Context of the forecast
• Availability and relevance of past data
• Period to be forecast
Managers and forecasters must consider the stage of the product or business as this influences the
availability of data and how you establish relationships between variables. A new startup with no
previous revenue data would be unable to use quantitative methods in its forecast. The more you
understand the use, capabilities, and impact of different forecasting techniques, the more likely
your forecasting efforts will pay off.
Any insight into the future puts your organization at an advantage. Forecasting helps you predict
potential issues, make better decisions, and measure the impact of those decisions.
By combining quantitative and qualitative techniques, statistical and econometric models, and
objectivity, forecasting becomes a formidable tool for your company.
Business forecasting helps managers develop the best strategies for current and future trends and
events. Today, artificial intelligence, forecasting software, and big data make business forecasting
easier, more accurate, and personalized to each organization.
Forecasting does not promise an accurate picture of the future or how your business will evolve,
but it points in a direction informed by data, logic, and experiential reasoning.
While there are different forecasting techniques and methods, all forecasts follow the same process
on a conceptual level. Standard elements of business forecasting include:
• Prepare the stage: Before you begin, develop a system to investigate the current state of
business.
• Choose a data point: An example for any business could be "What is our sales projection
for next quarter?"
• Choose indicators and data sets: Identify the relevant indicators and data sets you need
and decide how to collect the data.
• Make initial assumptions: To kick start the forecasting process, forecasters may make
some assumptions to measure against variables and indicators.
• Select forecasting technique: Pick the technique that fits your forecast best.
• Analyze data: Analyze available data using your selected forecasting technique.
• Estimate forecasts: Estimate future conditions based on data you've gathered to reach data-
backed estimates.
• Verify forecasts: Compare your forecast to the eventual results. This helps you identify any
problems, tweak errant variables, correct deviations, and continue to improve your
forecasting technique.
• Review forecasting process: Review any deviations between your forecasts and actual
performance data.
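One concrete way to “verify forecasts” is to compute the mean absolute percentage error (MAPE) between forecasts and eventual actuals; a minimal sketch with hypothetical numbers:

```python
# Compare forecasts against eventual actual results with MAPE (lower is better).
forecasts = [410, 395, 430, 450]   # hypothetical quarterly sales forecasts
actuals   = [400, 380, 445, 470]   # what actually happened

def mape(forecast, actual):
    """Mean absolute percentage error, in percent."""
    errors = [abs(f - a) / a for f, a in zip(forecast, actual)]
    return 100 * sum(errors) / len(errors)

print(f"MAPE: {mape(forecasts, actuals):.1f}%")
```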
Successful business forecasting begins with a collaboration between the manager and forecaster.
They work together to answer the following questions:
1. What is the purpose of the forecast, and how is it to be used?
2. What are the components and dynamics of the system the forecast is focused on?
With the right forecasting method, you can develop your process using the integral elements of
business forecasting mentioned above.
A forecast is only as good as the data supplied, so before collecting data, be clear about what you
need to know and where it can be found. When you have these answers, you can start collecting
data from two main sources:
• Primary sources: These sources are gathered first-hand using reporting tools — you or
members of your team source data through interviews, surveys, research, or observations.
• Secondary sources: Secondary sources are second-hand information or data that others have
collected. Examples include government reports, publications, financial statements,
competitors' annual reports, journals, and other periodicals.
Ideally, prediction methods should be evaluated in the situations in which they will be
used. The basis for conducting the evaluation is the need to test methods against
reasonable alternatives.
The evaluation consists of four steps:
• test assumptions,
• test data and methods,
• repeat results,
• and evaluate results.
Most of the principles for testing prediction methods are based on generally accepted
methodological procedures, such as defining criteria or obtaining a large sample of
prediction errors. However, forecasters often violate such principles, even in academic
studies.
The way a company forecasts is always unique to its needs and resources, but the primary
forecasting process can be summed up in five steps. These steps outline how business forecasting
starts with a problem and ends with not only a solution but valuable learnings.
1. Choose a problem to solve
The first step in predicting the future is choosing the problem you’re trying to solve or the question
you’re trying to answer. This can be as simple as determining whether your audience will be
interested in a new product your company is developing. Because this step doesn’t yet involve any
data, it relies on internal considerations and decisions to define the problem at hand.
2. Gather relevant data
The next step in forecasting is to collect as much data as possible and decide how to use it. This
may require digging up some extensive historical company data and examining the past and
present market trends. Suppose your company is trying to launch a new product. In this case, the
gathered data can be a culmination of the performance of your previous product and the current
performance of similar competing products in the target market.
3. Pick a forecasting technique
After collecting the necessary data, it’s time to choose a business forecasting technique that works
with the available resources and the type of prediction. All the forecasting models are effective
and get you on the right track, but one may be more favorable than others in creating a unique,
comprehensive forecast.
For example, if you have extensive data on hand, quantitative forecasting is ideal for interpretation.
Qualitative forecasting is best if you have less hard data available and are willing to invest in
extensive market research.
4. Analyze the data
Once the ball starts rolling, you can begin identifying patterns in the past and predicting the probability
of their repetition. This information will help your company’s decision-makers determine what to
do beforehand to prepare for the predicted scenarios.
5. Verify your findings
The end of business forecasting is simple. You wait to see if what you predicted actually happens.
This step is especially important in determining not only the success of your forecast but also the
effectiveness of the entire process. Having done some forecasting, you can compare the present
experience with these forecasts to identify potential areas for growth.
When in doubt, never throw away “old” data. The final information of one forecasting process can
also be used as the past data for another forecast. It’s like a life cycle of business development
predictions.
Common applications of business forecasting include:
• Calculating cash flow forecasts, i.e., predicting your financial needs within a timeframe
• Analyzing relationships between variables, e.g., Facebook ads and potential revenue
• Comparing customer acquisition costs and customer lifetime value over time
A rapidly evolving modern business climate has proven how fast things can change, with
businesses evolving beside it to succeed. In fact, today’s world requires agile strategy and
management.
This is where business forecasting can help, enabling businesses to plan for unexpected events.
In this unit, you’ll learn the basic principles of business forecasting and how to implement forecasting
techniques in your business planning.
Business forecasting involves using forecasting tools and techniques to help businesses predict certain
developments, such as revenue, sales, and growth. Through analytics, data, insights, and
experience, business forecasting provides organizations with the information they can use to
improve their decision-making. Whether you have a large or small company or offer products or
services, accurate forecasts can help your business prepare for future events and future trends.
For example, let’s say a new company started the year with few sales. During Q3, their sales began
to skyrocket because of a new marketing technique spreading brand awareness. Applying a
business forecasting technique, the team can better gauge Q4 sales—preparing inventory,
expanding their team, and taking the necessary steps to have a successful quarter.
Now that you understand the basics of business forecasting, it’s time to see how it works in
practice. Read the following examples to better understand the different approaches to business
forecasting.
1. A company forecasting its sales through the end of the year
Let’s suppose a small greeting card company wants to forecast its sales through the end of the
year. The company has just a year and a half of experience and limited data to use for predictions.
Though the first few quarters were slow to start, they have gained a great reputation in the last
three quarters. For this reason, sales are on the rise.
Since the business has limited historical data, they might consider a qualitative model for
predicting future sales. By polling their customers, the greeting card company can gauge the
willingness of their audience to buy new cards and pricing for the remaining quarters of the year.
Market surveys are a type of qualitative forecasting, which utilizes questionnaires to estimate
future customer behavior.
2. A shoe company using the indicator approach
In this example, let’s suppose a well-established shoe brand is forecasting profits for the next
quarter. Normally, this company would use the time series forecasting technique to estimate profits
for the next quarter. However, economic conditions have shifted, and the unemployment rate is
higher than normal. As a result, the company chooses the indicator approach to predict the actual
performance of its product.
In this scenario, the company might compare two variables: employment rate and spending rates.
With this business forecasting approach, the company predicts it will have a decrease in profits for
the upcoming quarter. Following this prediction, it chooses to produce fewer items in response to
economic changes and adjust budgets accordingly.
3. A loungewear company forecasting demand for a new product
In this next example, let’s suppose a loungewear company plans on rolling out a new product:
slippers. Since this product is new to the company, there are no official metrics for pricing and
popularity. For this reason, the company needs to gauge the interest level of its target audience.
In this case, demand forecasting would be a great approach to gauge how much customers are
willing to spend and how much the company will need to invest in terms of materials. By using
this forecasting process, the loungewear company can decide if the product will perform well and
what kind of demand exists. Ultimately, this will help the team make informed business decisions
for production as well as sales.
What are the limits of business forecasting?
You can follow the rules, use the right methods, and still get your business forecast wrong. It is,
after all, an attempt to predict the future, and no technique can remove that underlying
uncertainty: forecasts are limited by the quality of the data and the assumptions behind them.
PREDICTIVE ANALYTICS
Predictive analytics uses historical data to predict future events. Typically, historical data is used
to build a mathematical model that captures important trends. That predictive model is then used
on current data to predict what will happen next, or to suggest actions to take for optimal outcomes.
Predictive analytics has received a lot of attention in recent years due to advances in supporting
technology, particularly in the areas of big data and machine learning.
Predictive analytics is often discussed in the context of big data. Engineering data, for example,
comes from sensors, instruments, and connected systems out in the world. Business system data
at a company might include transaction data, sales results, customer complaints, and marketing
information. Increasingly, businesses make data-driven decisions based on this valuable trove of
information.
Increasing Competition
With increased competition, businesses seek an edge in bringing products and services to crowded
markets. Data-driven predictive models can help companies solve long-standing problems in new
ways.
Equipment manufacturers, for example, can find it hard to innovate in hardware alone. Product
developers can add predictive capabilities to existing solutions to increase value to the customer.
Using predictive analytics for equipment maintenance, or predictive maintenance, can anticipate
equipment failures, forecast energy needs, and reduce operating costs. For example, sensors that
measure vibrations in automotive parts can signal the need for maintenance before the vehicle
fails on the road.
Companies also use predictive analytics to create more accurate forecasts, such as forecasting the
demand for electricity on the electrical grid. These forecasts enable resource planning (for
example, the scheduling of various power plants) to be done more effectively.
To extract value from big data, businesses apply algorithms to large data sets using tools such as
Hadoop and Spark. The data sources might consist of transactional databases, equipment log files,
images, video, audio, sensor, or other types of data. Innovation often comes from combining data
from several sources.
With all this data, tools are necessary to extract insights and trends. Machine learning techniques
are used to find patterns in data and to build models that predict future outcomes. A variety of
machine learning algorithms are available, including linear and nonlinear regression, neural
networks, support vector machines, decision trees, and other algorithms.
The term “predictive analytics” describes the application of a statistical or machine learning
technique to create a quantitative prediction about the future. Frequently, supervised machine
learning techniques are used to predict a future value (How long can this machine run before
requiring maintenance?) or to estimate a probability (How likely is this customer to default on a
loan?).
Predictive analytics starts with a business goal: to use data to reduce waste, save time, or cut costs.
The process harnesses heterogeneous, often massive, data sets into models that can generate clear,
actionable outcomes to support achieving that goal, such as less material waste, less stocked
inventory, and manufactured product that meets specifications.
We are all familiar with predictive models for weather forecasting. A vital industry application of
predictive models relates to energy load forecasting to predict energy demand. In this case, energy
producers, grid operators, and traders need accurate forecasts of energy load to make decisions for
managing loads in the electric grid. Vast amounts of data are available, and using predictive
analytics, grid operators can turn this information into actionable insights.
Step-by-Step Workflow for Predicting Energy Loads
Typically, the workflow for a predictive analytics application follows these basic steps:
1. Import data from varied sources, such as web archives, databases, and spreadsheets.
Data sources include energy load data in a CSV file and national weather data showing
temperature and dew point.
2. Clean the data by removing outliers and combining data sources.
Identify data spikes, missing data, or anomalous points to remove from the data. Then
aggregate different data sources together – in this case, creating a single table including
energy load, temperature, and dew point.
3. Develop an accurate predictive model based on the aggregated data using statistics,
curve fitting tools, or machine learning.
Energy forecasting is a complex process with many variables, so you might choose to use
neural networks to build and train a predictive model. Iterate through your training data set
to try different approaches. When the training is complete, you can try the model against new
data to see how well it performs.
4. Integrate the model into a load forecasting system in a production environment.
Once you find a model that accurately forecasts the load, you can move it into your
production system, making the analytics available to software programs or devices, including
web apps, servers, or mobile devices.
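A minimal scikit-learn sketch of steps 1–3 of this workflow; the file names, column names, and model settings here are assumptions for illustration, not part of any particular production system:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# Step 1: import data (hypothetical CSV files and column names).
load = pd.read_csv("energy_load.csv")     # columns: timestamp, load_mw
weather = pd.read_csv("weather.csv")      # columns: timestamp, temp, dew_point

# Step 2: clean and aggregate into a single table.
df = load.merge(weather, on="timestamp").dropna()
df = df[df["load_mw"] > 0]                # drop anomalous readings

# Step 3: train a neural-network model and check it against held-out data.
X, y = df[["temp", "dew_point"]], df["load_mw"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0)
model.fit(X_train, y_train)
print("R^2 on new data:", model.score(X_test, y_test))
```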
The computational predictive modeling approach differs from the mathematical approach because
it relies on models that are not easy to explain in equation form and often require simulation
techniques to create a prediction. This approach is often called “black box” predictive modeling
because the model structure does not provide insight into the factors that map model input to
outcome. Examples include using neural networks to predict which winery a glass of wine
originated from or bagged decision trees for predicting the credit rating of a borrower.
Predictive modeling is often performed using curve and surface fitting, time series regression,
or machine learning approaches. Regardless of the approach used, the process of creating a
predictive model is the same across methods. Several commonly used predictive modeling
techniques are described below.
1. Linear Regression
A linear regression model would be useful when a doctor wants to predict a new patient’s
cholesterol based only on their body mass index (BMI). In this example, the analyst would know
to put the data the doctor gathered from his 5,000 other patients—including each of their BMIs
and cholesterol levels—into the linear regression model. They are hoping to predict an unknown
based on a predetermined set of quantifiable data.
The linear regression model would take the data, plot it onto a graph, and establish a line down the
center that properly depicts the smallest distance between all plotted data points. In this scenario,
when that new patient arrives knowing only that their BMI is 31, a data analyst will be able to
predict the patient’s cholesterol by looking at that line and seeing what cholesterol level most
closely aligns with other patients who have a BMI of 31.
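A minimal sketch of this BMI-to-cholesterol example; the data below is synthetic stand-in data, not the 5,000-patient dataset described:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the patient records: BMI -> cholesterol (mg/dL).
rng = np.random.default_rng(0)
bmi = rng.uniform(18, 40, size=200).reshape(-1, 1)
cholesterol = 120 + 3.0 * bmi.ravel() + rng.normal(0, 10, size=200)

# Fit the line that minimizes the distance to all plotted points.
model = LinearRegression().fit(bmi, cholesterol)

# Predict cholesterol for the new patient whose BMI is 31.
print(model.predict([[31]]))
```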
2. Text Mining
Whereas linear regression uses only numeric data, mathematical models can also be used to make
predictions about non-numerical factors. Text mining is a perfect example.
“Text mining is part of predictive analytics in the sense that analytics is all about finding the
information I previously knew nothing about,” Goulding says. In this scenario, the tool takes data
points in the form of text-based words or phrases and searches a giant database for those specific
points.
Sound Familiar? The algorithm used by Google or other search engines to bring up relevant links
when you search for a specific keyword is an example of text mining.
Real-World Example
Although tools like search engines—or even the “find” function you may use when searching for
a word in a digital body of text—represent some common examples of text mining, there are also
industry-specific instances where this type of predictive analytics comes into play.
Goulding describes another medical application of predictive analytics, explaining how doctors
rely on text mining when analyzing patient symptoms and trying to determine the root cause. “If
I’m a doctor and I have 50 children in front of me with flu symptoms, my brain can figure out that
the next patient to walk in the door [with similar symptoms] also has the flu,” he says. “But if I
see an unusual set of symptoms from just one patient, I may need the case history of patients from
all over the world to make a correct diagnosis. My brain can’t help me do this; analytics, however,
can.”
Especially in complex patient cases, an analyst can use text mining modeling tools to comb
databases, locate similar symptoms among patients of the past, and generate a prediction as to
what this new patient is “most likely” suffering from based on that data.
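A toy sketch of this idea: rank past case notes by textual similarity to a new symptom description using TF-IDF vectors (the records are invented for illustration):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

past_cases = [
    "fever cough sore throat body aches",        # documented as flu
    "rash joint pain fever after travel",        # documented as dengue
    "chest pain shortness of breath sweating",   # documented as cardiac
]
new_patient = ["high fever cough and severe body aches"]

vectorizer = TfidfVectorizer()
case_vectors = vectorizer.fit_transform(past_cases)
query_vector = vectorizer.transform(new_patient)

# Rank past cases by similarity to the new patient's symptoms.
scores = cosine_similarity(query_vector, case_vectors).ravel()
print("Most similar past case:", scores.argmax(), scores)
```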
3. Optimal Estimation
Optimal estimation is a modeling technique that is used to make predictions based on observed
factors. This model has been used in analytics for over 50 years and has laid the groundwork for
many of the other predictive tools used today. According to Goulding, past applications of this
method include determining “how to best recalibrate equipment on a manufacturing floor…[and]
estimating where a bullet might go when shot,” as well as in other aspects of the defense industry.
Real-World Example
If two planes were flying toward one another, an analyst might use the optimal estimation model
to predict if or when they will collide. To do this, the analyst would put a variety of observed
factors into the mathematical modeling tool, including the airplanes’ height, altitude, speed, angle,
and more. The mathematical model would then be able to help predict at which point, if any, the
planes would meet.
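A toy sketch of the collision question under a simplifying constant-velocity assumption (real optimal estimation, e.g. Kalman filtering, would also account for noisy observations; the positions and velocities here are invented):

```python
import numpy as np

# Observed state of each aircraft: position (km) and velocity (km/min).
p1, v1 = np.array([0.0, 0.0, 10.0]), np.array([8.0, 0.0, 0.0])
p2, v2 = np.array([100.0, 5.0, 10.0]), np.array([-7.0, 0.0, 0.0])

# Minimize |(p1 - p2) + t*(v1 - v2)| over t >= 0: time of closest approach.
dp, dv = p1 - p2, v1 - v2
t_star = max(0.0, -dp.dot(dv) / dv.dot(dv))
min_distance = np.linalg.norm(dp + t_star * dv)
print(f"Closest approach: {min_distance:.2f} km at t = {t_star:.1f} min")
```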
4. Clustering Models
Clustering models are focused on finding different groups with similar qualities or elements within
the data. Many mathematical modeling tools fall within this category, including:
K-Means
Hierarchical Clustering
TwoStep
Density-Based Scan Clustering
Gaussian Clustering Model
Kohonen
Real-World Example
If a fast-food restaurant wanted to open a new location in a new city, the corporate team may work
with a data analyst to figure out exactly where that new location should go. The analyst would start
by gathering an array of specific, relevant data about each location—including factors like
demographics, where the high-end houses are, how close the location is to a college, etc.—then
input all of that data into a clustering mathematical model. This model would most efficiently
analyze this particular type of data and predict where the most strategic location in the city for that
restaurant is based on the data alone.
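A minimal K-Means sketch of that site-selection idea; the candidate-location features below are invented:

```python
import numpy as np
from sklearn.cluster import KMeans

# One row per candidate site: [median income ($k), distance to college (km)].
locations = np.array([
    [45, 1.2], [48, 0.8], [95, 6.0], [90, 7.5],
    [50, 1.5], [88, 5.5], [47, 0.9], [92, 8.0],
])

# Group candidate sites into clusters with similar demographics.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(locations)
print("Cluster labels:", kmeans.labels_)
print("Cluster centers:", kmeans.cluster_centers_)
```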
5. Neural Networks
Neural networks are complex algorithms inspired by the structure of the human brain. They
process historical and current data and identify complex relationships within the data to predict the
future, similar to how the human brain can spot trends and patterns.
A typical neural network is composed of artificial neurons, called units, arranged in different
layers. The neural network uses input units to learn about and process data. On the other hand,
output units are on the opposite side and outline how the neural network should respond to the
input units. Between the two are hidden layers, which are layers of mathematical functions that
produce a specific output.
Real-World Example
If an e-commerce retailer wants to accurately predict which products its customers are likely to
consider purchasing in the future, a data analyst or data scientist might use neural networks to
inform the company’s product recommendation algorithm. The analyst will pull purchase data
and feed it to the neural network, giving the network real examples to learn from. This data will
travel through the neural network through various mathematical functions until the output is
produced and a product recommendation populates.
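A toy sketch of the idea: a small neural network learning from invented purchase histories whether a customer is likely to buy a particular product:

```python
from sklearn.neural_network import MLPClassifier

# Each row: did the customer buy [shoes, socks, laces]? Target: bought polish?
purchases = [[1, 1, 0], [1, 0, 1], [0, 1, 0], [1, 1, 1], [0, 0, 1], [0, 1, 1]]
bought_polish = [1, 1, 0, 1, 0, 0]

# Hidden layers of mathematical functions sit between input and output units.
net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=5000, random_state=0)
net.fit(purchases, bought_polish)

# Recommend polish if the predicted purchase probability is high enough.
new_customer = [[1, 0, 0]]
print(net.predict_proba(new_customer))
```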
Other Common Predictive Models
In addition to the mathematical models above, there are additional models that data analysts use
to make predictions, including:
Decision trees
Random forests
Logistic regression
Bayesian methods
Why Is Predictive Analytics Important?
While organizations have recognized the importance of gathering data as a means of looking back
on industry trends for years, business teams have only just started scratching the surface of
possibility when it comes to predictive analytics.
“Analytics is getting exciting in every industry because we’re [more] equipped than ever to…use
the data in the back room that has been gathering dust…to make better business decisions,”
Goulding says.
From insurance to retail to healthcare, organizations are starting to adapt to this model of
informed decision-making and are using it to their advantage:
• Today, insurance companies can predict if a new client is a risk based on their age, history,
health conditions, etc. They can weigh this data and make an informed decision about
whether or not they want to cover that individual.
• Retail organizations can predict how new brands or items might sell in their local market
based on consumer demographics. They can then make strategic decisions about how much
product to stock.
• Doctors can use predictive data to help determine not only what ailment someone’s
conditions point to but also their chances of survival, whether or not they need immediate
surgery, and their condition’s expected decline over a certain period of time.
No matter the industry, the recent advancements in mathematical modeling and the overall lean
into data as a prescriptive form of insight have changed the way businesses operate today.
Businesses can make data-driven decisions based on predictive models, allowing them to mitigate
potential risks and maximize profits. These changes have created an overall trend in decision-
making that is sure to continue developing and expanding for years to come.
Predictive Analytics vs. Prescriptive Analytics
Organizations that have successfully implemented predictive analytics see prescriptive analytics
as the next frontier. Predictive analytics creates an estimate of what will happen next;
prescriptive analytics tells you how to react in the best way possible given the prediction.
Prescriptive analytics is a branch of data analytics that uses predictive models to suggest actions
to take for optimal outcomes. Prescriptive analytics relies on optimization and rules-based
techniques for decision making. Forecasting the load on the electric grid over the next 24 hours is
an example of predictive analytics, whereas deciding how to operate power plants based on this
forecast represents prescriptive analytics.
Forecasting is a method by which companies find out trends that will dominate the market in the
coming years. It has many advantages, not just for new startups but for established and old
companies. Forecasting is defined as a planning tool that can help the management to cope with
an uncertain future, mainly through the use of past data and analysis of market trends. The process
of forecasting begins with certain assumptions based on the experience, knowledge, and astute
judgment of the management team. These estimates are then projected using techniques like
Box-Jenkins models, the Delphi method, exponential smoothing, moving averages, regression
analysis, and trend projection. Since any error in the assumptions will also result in a similar or
magnified error in the forecasting results, the technique of sensitivity analysis is used, where a
range of values is assigned to uncertain factors, which are also called variables.
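A minimal sensitivity-analysis sketch: vary one uncertain assumption (the monthly growth rate) over a range of values and observe the spread in the resulting forecast. All figures are hypothetical:

```python
# Sensitivity analysis: how sensitive is the annual revenue forecast
# to the assumed monthly growth rate?
current_monthly_revenue = 100_000  # hypothetical starting point

for growth_rate in (0.00, 0.01, 0.02, 0.03):  # range of uncertain values
    revenue = current_monthly_revenue
    total = 0.0
    for _ in range(12):
        revenue *= 1 + growth_rate
        total += revenue
    print(f"growth {growth_rate:.0%}: forecast annual revenue = {total:,.0f}")
```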
4 Major Benefits of Forecasting. Given below are the major benefits of forecasting.
1. Forecasting helps new businesses get set up successfully: Forecasting is an important element
when new brands are being set up in the industry. This is especially true when the industry is
filled with multiple challenges and there are many hurdles in the path of setting up a successful
brand. Forecasting can help entrepreneurs find the best way to overcome these challenges and
thereby establish a successful company. Through forecasting, brands can understand how they
will be perceived in the market and whether their products have the capability to meet the
expectations and demands of the target audience. In short, good and strong forecasting can help
startup companies increase their chances of success by helping them plan and strategize their
entry in a much better manner. At the same time, good forecasting can help new brands to meet
the supply and demand situation, thereby increasing their brand power and loyalty.
2. Forecasting can help brands to use their financial resources in a much better manner than
before: Financial concerns, especially for new and small companies, are a very important aspect.
That is why it is important that in such situations, the available resources are utilised in a proper
and effective manner. As no brand can survive without adequate capital, financial forecasting
plays a very important role in such a scenario. By helping companies to allocate their resources
in a proper manner, financial forecasting can hold the key to their survival and growth.
3. Forecasting can help the administration take good and successful management decisions:
Without a strong administrative backbone, companies will completely turn into a failure, sooner
or later. The administration team of any company is essentially a decision-making body; it has
responsibility for making decisions and for ascertaining that the decisions made are carried out.
That is why it is important that the wheels of the administrative department keep working in a
continuous manner, and it is here that forecasting plays a very important role.
4. Forecasting helps companies plan better: Planning is an important component of any
company, be it in the long term or short term. Forecasting can help companies to plan their
growth strategy while keeping in mind the needs of the consumers, while at the same time having
an intricate understanding of the market trends as well. In other words, good and proper planning,
whether it is for the overall growth of the company or for a section of the company, is completely
dependent on good forecasting techniques.
Conclusion
In the end, both predictive analysis and forecasting are techniques through which brands can
correctly anticipate and understand market trends while at the same time meeting customer
expectations as well. In short, the need today is not to choose between predictive analysis and
forecasting, but to apply each where it serves the business best.
PREDICTIVE MODELING
Predictive modeling means developing models that can be used to forecast or predict future
events. Models can be developed either through logic or data.
• 1. Logic-driven models are based on experience, knowledge, and logical relationships of
variables and constants connected to the desired business performance outcome situation.
• 2. Data-driven models refer to models in which data is collected from many sources to
quantitatively establish model relationships. Logic-driven modeling is often used as a first
step to establish relationships that are then confirmed through data-driven models. Data-driven
models include sampling and estimation, regression analysis, correlation analysis, forecasting
models, and simulation.
It leverages statistics to predict outcomes. Most often the event one wants to predict is in the future,
but predictive modeling can be applied to any type of unknown event, regardless of when it
occurred. For example, predictive models are often used to detect crimes and identify suspects,
after the crime has taken place.
In many cases the model is chosen on the basis of detection theory to try to estimate the
probability of an outcome given a set amount of input data; for example, given an email,
determining how likely it is to be spam.
Models can use one or more classifiers in trying to determine the probability of a set of data
belonging to one set or another, say ‘spam’ or ‘ham’.
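A toy spam/‘ham’ sketch using a Naïve Bayes classifier over word counts; the training messages are invented:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

messages = [
    "win free money now", "limited offer click here",        # spam
    "meeting at noon tomorrow", "please review the report",  # ham
]
labels = ["spam", "spam", "ham", "ham"]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(messages)
classifier = MultinomialNB().fit(X, labels)

# Probability that a new email belongs to each class.
new_email = vectorizer.transform(["free offer click now"])
print(classifier.classes_, classifier.predict_proba(new_email))
```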
Usage
Predictive models can either be used directly to estimate a response (output) given a defined set
of characteristics (input), or indirectly to drive the choice of decision rules.
Depending on the methodology employed for the prediction, it is often possible to derive a formula
that may be used in a spreadsheet software. This has some advantages for end users or decision
makers, the main one being familiarity with the software itself, hence a lower barrier to adoption.
Tree-based methods (e.g. CART, survival trees) provide one of the most graphically intuitive ways
to present predictions. However, this advantage is limited to methods that use this type of
modelling approach, which can have several drawbacks. Trees can also be employed to represent
decision rules graphically.
Scorecards are tabular or graphical tools used to represent either predictions or decision rules.
A statistical model embodies a set of assumptions concerning the generation of the observed data,
and similar data from a larger population. A model represents, often in considerably idealized
form, the data-generating process. The model assumptions describe a set of probability
distributions, some of which are assumed to adequately approximate the distribution from which
a particular data set is sampled.
A cause-and-effect diagram enables a user to hypothesize relationships between potential causes
of an outcome.
Influence diagrams are another tool for conceptualizing the relationships among the variables
that drive business performance.
Example –
A restaurant customer dines 6 times a year and spends an average of $50 per visit. The restaurant
realizes a 40% margin on the average bill for food and drinks.
30% of customers do not return each year. Average lifetime of a customer = 1/0.3 ≈ 3.33 years.
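Putting those numbers together gives a simple customer lifetime value (CLV) estimate; a sketch of the arithmetic:

```python
visits_per_year = 6
spend_per_visit = 50.0
margin = 0.40          # 40% margin on the average bill
annual_churn = 0.30    # 30% of customers do not return each year

avg_lifetime_years = 1 / annual_churn                          # ~3.33 years
annual_profit = visits_per_year * spend_per_visit * margin     # $120 per year
customer_lifetime_value = annual_profit * avg_lifetime_years   # ~$400
print(f"CLV = ${customer_lifetime_value:.0f}")
```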
Predictive modeling is a method of predicting future outcomes by using data modeling. It’s one of
the premier ways a business can see its path forward and make plans accordingly. While not
foolproof, this method tends to have high accuracy rates, which is why it is so commonly used.
In short, predictive modeling is a statistical technique using machine learning and data mining to
predict and forecast likely future outcomes with the aid of historical and existing data. It works by
analyzing current and historical data and projecting what it learns on a model generated to forecast
likely outcomes. Predictive modeling can be used to predict just about anything, from TV ratings
and a customer’s next purchase to credit risks and corporate earnings.
A predictive model is not fixed; it is validated or revised regularly to incorporate changes in the
underlying data. In other words, it’s not a one-and-done prediction. Predictive models make
assumptions based on what has happened in the past and what is happening now. If incoming, new
data shows changes in what is happening now, the impact on the likely future outcome must be
recalculated, too. For example, a software company could model historical sales data against
marketing expenditures across multiple regions to create a model for future revenue based on the
impact of the marketing spend.
Most predictive models work fast and often complete their calculations in real time. That’s why
banks and retailers can, for example, calculate the risk of an online mortgage or credit card
application and accept or decline the request almost instantly based on that prediction.
Some predictive models are more complex, such as those used in computational biology
and quantum computing; the resulting outputs take longer to compute than a credit card application
but are done much more quickly than was possible in the past thanks to advances in technological
capabilities, including computing power.
Fortunately, predictive models don’t have to be created from scratch for every application.
Predictive analytics tools use a variety of vetted models and algorithms that can be applied to a
wide spread of use cases.
Predictive modeling techniques have been perfected over time. As we add more data, more
muscular computing, AI and machine learning and see overall advancements in analytics, we’re
able to do more with these models.
1. Classification model: Considered the simplest model, it categorizes data for simple and
direct query response. An example use case would be to answer the question “Is this a
fraudulent transaction?”
2. Clustering model: This model nests data together by common attributes. It works by
grouping things or people with shared characteristics or behaviors and plans strategies for
each group at a larger scale. An example is in determining credit risk for a loan applicant
based on what other people in the same or a similar situation did in the past.
3. Forecast model: This is a very popular model, and it works on anything with a numerical
value based on learning from historical data. For example, in answering how much lettuce
a restaurant should order next week or how many calls a customer support agent should be
able to handle per day or week, the system looks back to historical data.
4. Outliers model: This model works by analyzing abnormal or outlying data points. For
example, a bank might use an outlier model to identify fraud by asking whether a
transaction is outside of the customer’s normal buying habits or whether an expense in a
given category is normal or not. For example, a $1,000 credit card charge for a washer and
dryer in the cardholder’s preferred big box store would not be alarming, but $1,000 spent
on designer clothing in a location where the customer has never charged other items might
be indicative of a breached account.
5. Time series model: This model evaluates a sequence of data points based on time. For
example, the number of stroke patients admitted to the hospital in the last four months is
used to predict how many patients the hospital might expect to admit next week, next month
or the rest of the year. A single metric measured and compared over time is thus more
meaningful than a simple average.
Common predictive modeling algorithms include:
1. Random Forest: This algorithm is derived from a combination of decision trees, none of
which are related, and can use both classification and regression to classify vast amounts
of data.
2. Generalized Linear Model (GLM) for Two Values: This algorithm narrows down the
list of variables to find “best fit.” It can work out tipping points and change data capture
and other influences, such as categorical predictors, to determine the “best fit” outcome,
thereby overcoming drawbacks in other models, such as a regular linear regression.
3. Gradient Boosted Model: This algorithm also uses several combined decision trees, but
unlike Random Forest, the trees are related. It builds out one tree at a time, thus enabling
the next tree to correct flaws in the previous tree. It’s often used in rankings, such as on
search engine outputs.
4. K-Means: A popular and fast algorithm, K-Means groups data points by similarities and
so is often used for the clustering model. It can quickly render things like personalized
retail offers to individuals within a huge group, such as a million or more customers with
a similar liking of lined red wool coats.
5. Prophet: This algorithm is used in time-series or forecast models for capacity planning,
such as for inventory needs, sales quotas and resource allocations. It is highly flexible and
can easily accommodate heuristics and an array of useful assumptions.
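A minimal Random Forest sketch for a fraud-style classification question like the one above; the transactions are synthetic:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic transactions: features standing in for amount, hour, distance, etc.
X, y = make_classification(n_samples=1000, n_features=8, weights=[0.95],
                           random_state=0)  # roughly 5% "fraudulent" class
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An ensemble of unrelated decision trees voting on each transaction.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)
print("Held-out accuracy:", forest.score(X_test, y_test))
```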
Predictive Modeling and Data Analytics
Predictive modeling is also known as predictive analytics. Generally, the term “predictive
modeling” is favored in academic settings, while “predictive analytics” is the preferred term for
commercial applications of predictive modeling.
Successful use of predictive analytics depends heavily on unfettered access to sufficient volumes
of accurate, clean and relevant data. While predictive models can be extraordinarily complex, such
as those using decision trees and k-means clustering, the most complex part is always the neural
network; that is, the model by which computers are trained to predict outcomes. Machine learning
uses a neural network to find correlations in exceptionally large data sets and “to learn” and
identify patterns within the data.
DATA MINING
The data mining process is used to extract patterns and probabilities from large datasets, which
is why it is heavily used in business for forecasting trends. It is also used in fields like marketing,
manufacturing, finance, and government to make predictions and analyses, using tools and
techniques like the R language and Oracle Data Mining, and it involves a flow of six different
steps.
One of the essential tasks of data mining is the automatic and semi-automatic analysis of large
quantities of raw data and information to extract previously unknown, interesting patterns, such
as clusters or groups of data records, anomalies (unusual records), and dependencies, the last of
which makes use of sequential pattern mining and association rule mining. This may make use
of spatial indices. These patterns can be seen as a summary of the input data and can be used in
further analysis, such as predictive analysis and machine learning. More accurate results can be
obtained once you start making use of decision support systems.
Before patterns can be found, organizations must gather and process the data accordingly.
Basically, in a nutshell, this involves the ETL set of processes – the extraction, transformation,
and loading of the data – and everything else required for this ETL to happen. It involves the
cleansing, change, and processing of data in various systems and representations. Clients can use
this processed data to analyse their businesses and the trends within their markets.
Advantages
Data mining offers advantages in business as well as in fields like medicine, weather forecasting,
healthcare, transportation, insurance, and government. Some of the advantages include:
1. Marketing/Retail: It helps all the marketing companies and firms to build models which
are based on a historical set of data and information to predict the responsiveness to the
marketing campaigns prevailing today, such as online marketing campaigns, direct mail,
etc.
2. Finance/Banking: Data mining gives financial institutions information about loans and also
credit reporting. When a model is built on historical information, good or bad loans can then be
determined by the financial institutions.
3. Manufacturing: Faulty equipment and the quality of manufactured products can be
determined by using the optimal control parameters. For example, for some semiconductor
development industries, water hardness and quality become a major factor in the quality of the
finished product.
4. Government: Governments can benefit from monitoring and gauging financial movements to
identify suspicious activity such as money laundering or fraud.
The data mining process itself moves through the following stages:
1. Data cleaning: Data cleaning becomes an essential component to obtain the final data
analysis. It involves identifying and removing inaccurate and tricky data from a set of tables,
databases, and record sets. Some techniques include ignoring the tuple, which is mainly done
when the class label is not in place; the next approach requires filling in the missing values,
replacing missing and incorrect values with global constants or with predicted or mean values.
2. Data integration: It is a technique that involves merging the new set of information with
the existing group. The source may, however, involve many data sets, databases, or flat files.
The customary implementation for data integration is creating an EDW (enterprise data
warehouse), which involves two concepts – tight and loose coupling – whose details are beyond
the scope of these notes.
3. Data transformation: This requires transforming data between formats, generally from the
source system to the required destination system. Common strategies include smoothing,
aggregation, normalization, and generalization.
4. Data discretization: The technique that splits a continuous attribute domain into intervals is
called data discretization. The datasets are stored in small chunks, thereby making our study
much more efficient. The two strategies are top-down discretization and bottom-up discretization.
5. Concept hierarchies: They minimize the data by collecting and replacing low-level concepts
with high-level concepts. Concept hierarchies define the multi-dimensional data with multiple
levels of abstraction. The methods include binning, histogram analysis, cluster analysis, etc.
6. Pattern evaluation and data presentation: If the data is presented efficiently, the client and
the customers can make use of it in the best possible way. After going through the above set of
stages, the data is presented in graphs and diagrams, thereby enabling its effective use.
The following two are among the most popular tools and techniques of data mining:
1. R language: It is an open-source tool that is used for graphics and statistical computing. It
offers a wide variety of classical statistical tests, classification and graphical techniques,
time-series analysis, and more.
2. Oracle Data Mining: Popularly known as ODM, it is part of the Oracle Advanced Analytics
database option, generating detailed insights and predictions specifically used to detect customer
behavior, develop customer profiles, and identify cross-selling opportunities.
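To make the preprocessing stages above concrete, here is a minimal pandas sketch of cleaning, integration, and discretization on invented data:

```python
import pandas as pd

# Data integration: merge two hypothetical sources on a shared key.
sales = pd.DataFrame({"id": [1, 2, 3], "amount": [120.0, None, 300.0]})
customers = pd.DataFrame({"id": [1, 2, 3], "age": [25, 41, 63]})
df = sales.merge(customers, on="id")

# Data cleaning: replace the missing amount with the column mean.
df["amount"] = df["amount"].fillna(df["amount"].mean())

# Data discretization: split the continuous age attribute into intervals.
df["age_band"] = pd.cut(df["age"], bins=[0, 30, 50, 100],
                        labels=["young", "middle", "senior"])
print(df)
```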
Conclusion
Data mining is all about explaining historical data and real streaming data, thereby making
predictions and analyses on top of the mined data. It is closely related to data science and
machine learning.
One of the drawbacks can be the training of people on the software, which can be a complicated
and time-consuming task. Data mining has become a necessary component of systems today, and
by making efficient use of it, businesses can grow and predict their future sales and revenue.
DATA MINING AND PREDICTIVE ANALYSIS MODELING
KEY DIFFERENCES OF PREDICTIVE ANALYTICS VS DATA MINING
Below is the difference between predictive analytics and data mining
The data mining process typically follows six phases (CRISP-DM):
a. Business Understanding Phase – define the project objectives and requirements in business
terms and translate them into a data mining problem definition.
b. Data Understanding Phase – collect data and use exploratory data analysis to familiarize
yourself with it.
c. Data Preparation Phase – clean and apply transformations to the raw data so that it is ready
for modeling.
d. Modeling Phase – select and apply appropriate modeling techniques and calibrate their
settings.
e. Evaluation Phase – Models must be evaluated for quality and effectiveness before we deploy.
Also, determine whether the model, in fact, achieves the objectives set for it in phase 1.
f. Deployment Phase – Making use of models in production might be a simple deployment like
generating a report or a complex one like Implementing a parallel data mining process in
another department.
The predictive analytics process, by contrast, typically proceeds as follows:
a. Define Business Goal – Determine what business goal is to be achieved and how data fits in.
For example, the business goal is more effective offers to new customers, and the data needed
is the profile and behaviour of past customers.
b. Collect Data – Pull data from online systems or from third-party tools to better understand it.
This helps to find a reason behind a pattern. Sometimes marketing surveys are conducted to
collect data.
c. Draft Predictive Model – A model is created with the newly collected data and business
knowledge. A model can be a simple business rule like “there is a greater chance of converting
users from age a to b in India if we make an offer like this” or a complex mathematical model.
Typical patterns surfaced along the way include:
• performance patterns specific to KPIs (e.g., is subscription increasing with active user count?)
• system performance patterns (e.g., page loading time across different devices – any pattern?)
Benefits of predictive analytics include:
a. Vision – Helps to see what is invisible to others. Predictive analytics can go through a lot of
past customer data, associate it with other pieces of data, and assemble all the pieces in the
right order.
b. Decision – Helps to make decisions free of emotion and bias. It provides consistent and
unbiased insights to support decisions.
c. Precision – Helps to use automated tools to do the reporting job for you, saving time and
reducing errors.
● Performance – The performance of data mining is measured by how well the model finds
patterns in data. Most of the time it will be a regression, classification, or clustering model, and
there is a well-defined performance measure for each of these. The performance of predictive
analytics, by contrast, is measured on business impact. For example, how well does a targeted
ad campaign work compared to a general campaign? No matter how well data mining finds
patterns, business insight is a must for predictive models to work well.
● Future – The data mining field is evolving very fast, aiming to find patterns in data with fewer
data points and a minimum number of features with the help of more sophisticated models like
deep neural networks. Pioneers in this field, like Google, are also trying to make the process
simple and accessible to everyone; one example is Cloud AutoML from Google. Predictive
analytics is expanding to a wide variety of new areas such as employee retention prediction and
crime prediction (aka predictive policing). At the same time, organizations are trying to predict
more accurately by collecting as much information about users as possible, such as where they
are, and feeding it back into data mining.
HOW DATA MINING WORKS
Imagine that you have gathered three friends and decided which pizza to buy - vegetarian, meat,
or fish? You just poll everyone and conclude what exactly needs to be ordered in your favorite
pizzeria. But what if, for example, you have three million friends and several hundred varieties of
pizza from several dozen establishments? It's not so easy to deal with an order, is it? Nevertheless,
it is what data mining specialists do.
According to this principle, when you go to an online store to buy earrings, you will immediately
be offered a bracelet, pendant, and rings to match. And to the swimsuit - a straw hat, sunglasses,
and sandals.
It is precisely an ideally structured array of specific information that makes it possible to identify
a suspicious declaration of income among millions of others of the same kind.
Data mining is conventionally divided into three stages:
• Exploration, in which the data is sorted into essential and non-essential (cleaning, data
transformation, selection of subsets)
• Model building or hidden pattern identification, in which the same datasets are applied to
different models, allowing better choices; this is called competitive evaluation of models
• Deployment, in which the selected data model is used to predict the results
Data mining is handled by highly qualified mathematicians and engineers as well as AI/ML
experts.
For many organisations, big data – incredible volumes of raw structured, semi-structured and
unstructured data – is an untapped resource of intelligence that can support business decisions and
enhance operations. As data continues to diversify and change, more and more organisations are
embracing predictive analytics, to tap into that resource and benefit from data at scale.
A common misconception is that predictive analytics and machine learning are the same thing.
This is not the case. (Where the two do overlap, however, is predictive modelling – but more
on that later.)
At its core, predictive analytics encompasses a variety of statistical techniques (including machine
learning, predictive modelling and data mining) and uses statistics (both historical and current) to
estimate, or ‘predict’, future outcomes. These outcomes might be behaviours a customer is likely
to exhibit or possible changes in the market, for example. Predictive analytics help us to understand
possible future occurrences by analysing the past.
Machine learning, on the other hand, is a subfield of computer science that, as per Arthur Samuel’s
definition from 1959, gives ‘computers the ability to learn without being explicitly programmed’.
Machine learning evolved from the study of pattern recognition and explores the notion that
algorithms can learn from and make predictions on data. And, as they become more
'intelligent', these algorithms can go beyond their explicit program instructions to make highly
accurate, data-driven decisions.
Predictive analytics is driven by predictive modelling. It’s more of an approach than a process.
Predictive analytics and machine learning go hand-in-hand, as predictive models typically include
a machine learning algorithm. These models can be trained over time to respond to new data
or values, delivering the results the business needs. Predictive modelling largely overlaps with the
field of machine learning.
There are two types of predictive models: classification models, which predict class membership, and regression models, which predict a number. These models are made up of algorithms. The algorithms perform the data mining and statistical analysis, determining trends and patterns in data. Predictive analytics software solutions have built-in algorithms that can be used to make predictive models. The algorithms are defined as 'classifiers', identifying which set of categories data belongs to.
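To make the two model types concrete, here is a small scikit-learn sketch (our own illustration, using its bundled example datasets): the classifier predicts class membership, while the regressor predicts a number.

from sklearn.datasets import load_iris, load_diabetes
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Classification model: predicts class membership (a category)
X_c, y_c = load_iris(return_X_y=True)
clf = DecisionTreeClassifier().fit(X_c, y_c)
print(clf.predict(X_c[:1]))   # e.g. [0] - a class label

# Regression model: predicts a number (a continuous value)
X_r, y_r = load_diabetes(return_X_y=True)
reg = DecisionTreeRegressor().fit(X_r, y_r)
print(reg.predict(X_r[:1]))   # e.g. [151.0] - a numeric value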
Decision trees:
Decision trees are a simple, but powerful form of multiple variable analysis. They are produced
by algorithms that identify various ways of splitting data into branch-like segments. Decision
trees partition data into subsets based on categories of input variables, helping you to understand
someone’s path of decisions.
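A brief sketch of what those branch-like splits look like in practice (assuming scikit-learn; the iris dataset stands in for real business data):

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=2).fit(X, y)

# export_text prints the branch-like segments, i.e. the path of
# decisions the tree uses to partition the data into subsets
print(export_text(tree, feature_names=load_iris().feature_names))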
Neural networks:
Patterned after the operation of neurons in the human brain, neural networks (also called
artificial neural networks) are a variety of deep learning technologies. They’re typically used to
solve complex pattern recognition problems – and are incredibly useful for analyzing large data
sets. They are great at handling nonlinear relationships in data – and work well when certain
variables are unknown.
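For instance, a minimal scikit-learn sketch (an illustration, not from the source) of a small neural network handling a nonlinear pattern that a linear model could not separate:

from sklearn.datasets import make_moons
from sklearn.neural_network import MLPClassifier

# Two interleaving half-moons: a classic nonlinear pattern
X, y = make_moons(n_samples=500, noise=0.2, random_state=0)
net = MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=2000,
                    random_state=0).fit(X, y)
print(net.score(X, y))   # training accuracy, typically well above 0.9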
Other classifiers:
Time Series Algorithms: Time series algorithms sequentially plot data and are useful for
forecasting continuous values over time.
Clustering Algorithms: Clustering algorithms organise data into groups whose members are
similar.
Ensemble Models: Ensemble models use multiple machine learning algorithms to obtain better
predictive performance than what could be obtained from one algorithm alone.
Factor Analysis: Factor analysis is a method used to describe variability and aims to find
independent latent variables.
Naïve Bayes: The Naïve Bayes classifier allows us to predict a class/category based on a given
set of features, using probability.
Support vector machines: Support vector machines are supervised machine learning techniques
that use associated learning algorithms to analyze data and recognize patterns.
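As a quick side-by-side illustration of two of these classifiers (scikit-learn assumed, with its bundled digits dataset):

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Naive Bayes predicts a class from a set of features using probability
nb = GaussianNB().fit(X_tr, y_tr)
# A support vector machine analyzes the data and recognizes patterns
svm = SVC().fit(X_tr, y_tr)

# Each classifier approaches the data differently, so accuracy differs
print("Naive Bayes:", nb.score(X_te, y_te))
print("SVM:        ", svm.score(X_te, y_te))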
Each classifier approaches data in a different way; therefore, to get the results they need, organizations must choose the right classifiers and models.
For organizations overflowing with data but struggling to turn it into useful insights, predictive
analytics and machine learning can provide the solution. No matter how much data an organization
has, if it can’t use that data to enhance internal and external processes and meet objectives, the data
becomes a useless resource.
Predictive analytics is most commonly used for security, marketing, operations, and risk and fraud detection. Predictive analytics and machine learning are utilised in these ways across many different industries.
While machine learning and predictive analytics can be a boon for any organisation, implementing
these solutions haphazardly, without considering how they will fit into everyday operations, will
drastically hinder their ability to deliver the insights the organisation needs.
To get the most out of predictive analytics and machine learning, organisations need to ensure they
have the architecture in place to support these solutions, as well as high-quality data to feed them
and help them to learn. Data preparation and quality are key enablers of predictive analytics. Input
data, which may span multiple platforms and contain multiple big data sources, must be
centralised, unified and in a coherent format.
In order to achieve this, organisations must develop a sound data governance program to police
the overall management of data and ensure only high-quality data is captured and recorded.
Secondly, existing processes will need to be altered to include predictive analytics and machine
learning as this will enable organisations to drive efficiency at every point in the business. Lastly,
organisations need to know what problems they are looking to solve, as this will help them to
determine the best and most applicable model to use.
Typically, an organisation's data scientists and IT experts are tasked with choosing the right predictive models – or building their own to meet the organisation's needs.
Today, however, predictive analytics and machine learning is no longer just the domain of
mathematicians, statisticians and data scientists, but also that of business analysts and consultants.
More and more of a business’ employees are using it to develop insights and improve business
operations – but problems arise when employees do not know what model to use, how to deploy
it, or need information right away.
At SAS, we develop sophisticated software to support organisations with their data governance and analytics. Our data governance solutions help organisations to maintain high-quality data, as well as align operations across the business and pinpoint data problems within the same environment. Our predictive analytics solutions help organisations to turn their data into timely insights for better, faster decision making. These predictive analytics solutions are designed to meet the needs of all types of users and enable them to deploy predictive models rapidly.
MACHINE LEARNING
Machine learning derives insightful information from large volumes of data by leveraging
algorithms to identify patterns and learn in an iterative process. ML algorithms use computation
methods to learn directly from data instead of relying on any predetermined equation that may
serve as a model.
While machine learning is not a new concept – its roots are often traced back to the World War II era and the codebreaking work around the Enigma machine – the ability to apply complex mathematical calculations automatically to growing volumes and varieties of available data is a relatively recent development.
Today, with the rise of big data, IoT, and ubiquitous computing, machine learning has become essential for solving problems across numerous areas of business and science.
Machine learning algorithms are built from a training dataset to create a model. As new input
data is introduced to the trained ML algorithm, it uses the developed model to make a prediction.
[Figure: How Machine Learning Works – a high-level use case scenario. Typical machine learning examples may involve many other factors, variables, and steps.]
Further, the prediction is checked for accuracy. Based on its accuracy, the ML algorithm is either
deployed or trained repeatedly with an augmented training dataset until the desired accuracy is
achieved.
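That train–check–retrain cycle can be sketched as a simple loop. Note that this is schematic: get_more_training_data() is a hypothetical helper standing in for whatever augmentation process an organization uses.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

def train_until_accurate(X, y, target_accuracy=0.9):
    # Hold out a test set so the prediction can be checked for accuracy
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    while True:
        model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
        if model.score(X_te, y_te) >= target_accuracy:
            return model  # desired accuracy achieved: deploy the model
        # Otherwise, augment the training dataset and train again
        X_new, y_new = get_more_training_data()   # hypothetical helper
        X_tr = np.vstack([X_tr, X_new])
        y_tr = np.concatenate([y_tr, y_new])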
Machine learning algorithms can be trained in many ways, with each method having its pros and
cons. Based on these methods and ways of learning, machine learning is broadly categorized into
four main types:
Types of Machine Learning

1. Supervised machine learning
This type of ML involves supervision, where machines are trained on labeled datasets and enabled
to predict outputs based on the provided training. The labeled dataset specifies that some input and
output parameters are already mapped. Hence, the machine is trained with the input and
corresponding output. A device is made to predict the outcome using the test dataset in subsequent
phases.
For example, consider an input dataset of parrot and crow images. Initially, the machine is trained
to understand the pictures, including the parrot and crow’s color, eyes, shape, and size. Post-
training, an input picture of a parrot is provided, and the machine is expected to identify the object
and predict the output. The trained machine checks for the various features of the object, such as
color, eyes, shape, etc., in the input picture, to make a final prediction. This is the process of object
identification in supervised machine learning.
The primary objective of the supervised learning technique is to map the input variable (x) to the output variable (y). Supervised machine learning is further classified into two broad categories:
• Classification, where the algorithm predicts a categorical output. Some well-known classification algorithms include the Random Forest Algorithm, Decision Tree Algorithm, Logistic Regression Algorithm, and Support Vector Machine Algorithm.
• Regression, where the algorithm predicts a continuous numeric output. Popular regression algorithms include the Simple Linear Regression Algorithm, Multivariate Regression Algorithm, Decision Tree Algorithm, and Lasso Regression.
2. Unsupervised machine learning
Unsupervised learning refers to a learning technique that's devoid of supervision. Here, the
machine is trained using an unlabeled dataset and is enabled to predict the output without any
supervision. An unsupervised learning algorithm aims to group the unsorted dataset based on the
input’s similarities, differences, and patterns.
For example, consider an input dataset of images of a fruit-filled container. Here, the images are not known to the machine learning model. When we input the dataset into the ML model, the task of the model is to identify the pattern of objects, such as color, shape, or differences seen in the input images, and categorize them. Upon categorization, the machine then predicts the output as it gets tested with a test dataset. Unsupervised machine learning is further classified into two categories:
• Clustering: The clustering technique refers to grouping objects into clusters based
on parameters such as similarities or differences between objects. For example,
grouping customers by the products they purchase.
Some known clustering algorithms include the K-Means Clustering Algorithm, Mean-Shift Algorithm, DBSCAN Algorithm, Principal Component Analysis, and Independent Component Analysis.
• Association: Association rule learning refers to identifying interesting relations between variables within a large dataset, such as products that are frequently bought together.
Popular algorithms obeying association rules include the Apriori Algorithm, Eclat Algorithm, and FP-Growth Algorithm.
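For instance, the customer-grouping example above might look like this with K-Means (scikit-learn assumed; the purchase counts are hypothetical):

import numpy as np
from sklearn.cluster import KMeans

# Hypothetical purchase counts per customer: [electronics, groceries, clothing]
purchases = np.array([[9, 1, 0], [8, 2, 1], [0, 7, 8],
                      [1, 9, 7], [0, 8, 9], [9, 0, 1]])

# Group the unlabeled data into clusters of similar customers
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(purchases)
print(kmeans.labels_)   # e.g. [0 0 1 1 1 0] - two customer segments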
3. Semi-supervised learning
Semi-supervised learning combines the two approaches above: the model is trained on a small amount of labeled data together with a large amount of unlabeled data.

4. Reinforcement learning
Unlike supervised learning, reinforcement learning lacks labeled data, and the agents learn via experience only. Consider video games: the game specifies the environment, and each move of the reinforcement agent defines its state. The agent receives feedback via punishments and rewards, which affect the overall game score, and its ultimate goal is to achieve a high score.
Reinforcement learning is applied across different fields such as game theory, information theory, and multi-agent systems. Reinforcement learning methods are further divided into two types: positive and negative reinforcement learning.
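The reward-and-punishment loop can be illustrated with tabular Q-learning on a toy five-cell "corridor" game (a simplified sketch of our own, not from the source):

import numpy as np

# Toy environment: states 0..4; the agent is rewarded on reaching state 4
n_states, n_actions = 5, 2            # actions: 0 = move left, 1 = move right
Q = np.zeros((n_states, n_actions))   # the agent's learned value table
alpha, gamma, epsilon = 0.1, 0.9, 0.1

for episode in range(500):
    state = 0
    while state != 4:
        if np.random.rand() < epsilon:
            action = np.random.randint(n_actions)   # explore
        else:
            action = int(np.argmax(Q[state]))       # exploit what was learned
        next_state = max(0, state - 1) if action == 0 else min(4, state + 1)
        reward = 1.0 if next_state == 4 else -0.01  # reward vs. punishment
        # Q-learning update: learn from the feedback received
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max()
                                     - Q[state, action])
        state = next_state

print(np.argmax(Q, axis=1))   # the learned policy should favor moving right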
Industry verticals handling large amounts of data have realized the significance and value of
machine learning technology. As machine learning derives insights from data in real-time,
organizations using it can work efficiently and gain an edge over their competitors.
Every industry vertical in this fast-paced digital world benefits immensely from machine learning tech. Here, we look at the top five ML application sectors.
1. Healthcare industry
Machine learning is being increasingly adopted in the healthcare industry, thanks to wearable devices and sensors such as fitness trackers and smart health watches. All such devices monitor users' health data to assess their health in real time.
Moreover, the technology is helping medical practitioners in analyzing trends or flagging events
that may help in improved patient diagnoses and treatment. ML algorithms even allow medical
experts to predict the lifespan of a patient suffering from a fatal disease with increasing accuracy.
Companies like Genentech have collaborated with GNS Healthcare to leverage machine learning and simulation AI platforms in order to innovate biomedical treatments. ML technology looks for patients' response markers by analyzing individual genes, which makes it possible to offer targeted therapies to patients.
2. Finance sector
Today, several financial organizations and banks use machine learning technology to tackle
fraudulent activities and draw essential insights from vast volumes of data. ML-derived insights
aid in identifying investment opportunities that allow investors to decide when to trade.
Moreover, data mining methods help cyber-surveillance systems zero in on warning signs of
fraudulent activities, subsequently neutralizing them. Several financial institutes have already
partnered with tech companies to leverage the benefits of machine learning.
For example,
• Citibank has partnered with fraud detection company Feedzai to handle online and
in-person banking frauds.
• PayPal uses several machine learning tools to differentiate between legitimate and
fraudulent transactions between buyers and sellers.
3. Retail sector
Retail websites extensively use machine learning to recommend items based on users’ purchase
history. Retailers use ML techniques to capture data, analyze it, and deliver personalized shopping
experiences to their customers. They also implement ML for marketing campaigns, customer
insights, customer merchandise planning, and price optimization.
According to a September 2021 report by Grand View Research, Inc., the global recommendation
engine market is expected to reach a valuation of $17.30 billion by 2028. Common day-to-day
examples of recommendation systems include:
• When you browse items on Amazon, the product recommendations that you see on
the homepage result from machine learning algorithms. Amazon uses artificial
neural networks (ANN) to offer intelligent, personalized recommendations relevant
to customers based on their recent purchase history, comments, bookmarks, and
other online activities.
• Netflix and YouTube rely heavily on recommendation systems to suggest shows and
videos to their users based on their viewing history.
Moreover, retail sites are also powered with virtual assistants or conversational chatbots that
leverage ML, natural language processing (NLP), and natural language understanding (NLU) to
automate customer shopping experiences.
4. Travel industry
Machine learning is playing a pivotal role in expanding the scope of the travel industry. Rides
offered by Uber, Ola, and even self-driving cars have a robust machine learning backend.
Consider Uber's machine learning algorithm that handles the dynamic pricing of its rides. Uber uses a machine learning model called 'Geosurge' to manage dynamic pricing parameters. It uses real-time predictive modeling on traffic patterns, supply, and demand. If you are getting late for a meeting and need to book an Uber in a crowded area, the dynamic pricing model kicks in, and you can get an Uber ride immediately but may need to pay twice the regular fare.
Moreover, the travel industry uses machine learning to analyze user reviews. User comments are
classified through sentiment analysis based on positive or negative scores. This is used for
campaign monitoring, brand monitoring, compliance monitoring, etc., by companies in the travel
industry.
5. Social media
With machine learning, billions of users can efficiently engage on social media networks. Machine
learning is pivotal in driving social media platforms, from personalizing news feeds to delivering
user-specific ads. For example, Facebook’s auto-tagging feature employs image recognition to
identify your friend’s face and tag them automatically. The social network uses ANN to recognize
familiar faces in users’ contact lists and facilitates automated tagging.
Similarly, LinkedIn knows when you should apply for your next role, whom you need to connect
with, and how your skills rank compared to peers. All these features are enabled by machine
learning.
Below is a point-by-point comparison between Machine Learning and Predictive Modelling.
Machine learning is an area of computer science which uses cognitive learning methods to program systems without the need of being explicitly programmed; it is closely related to other mathematical techniques and to data mining. Predictive modelling, on the other hand, is a mathematical technique which uses statistics for prediction. It aims to work upon the provided information to reach a conclusion about what is likely to happen.
Machine Learning vs Predictive Modelling:
1. Machine learning is an AI technique where the algorithms are given data and are asked to process it without a predetermined set of rules, whereas predictive analysis is the analysis of historical data as well as existing external data to find patterns and behaviors.
2. Machine learning algorithms are trained to learn from their past mistakes to improve future performance, whereas predictive modelling makes informed predictions based upon historical data.
3. Machine learning is a newer-generation technology that works on better algorithms and massive amounts of data, whereas predictive analysis is a study rather than a particular technology, and it existed long before machine learning came into existence; Alan Turing had already made use of such techniques to decode messages during World War II.
4. Related practices and learning techniques for machine learning include supervised and unsupervised learning.
5. Once a machine learning model is trained and tested on a relatively small dataset, the same method can be applied to unseen data. The data must effectively not be biased, as that would result in bad decision-making. In the case of predictive analysis, data is useful when it is complete, accurate, and substantial, so data quality needs to be taken care of when data is first ingested. Organizations use predictive analysis to produce forecasts, anticipate consumer behavior, and make rational decisions based on their findings.
BUSINESS FORECASTING WITH MACHINE LEARNING
Training any ML forecasting model requires an assessment stage, which foresees a comparison of predicted and actual results. It brings an understanding of how well the model performs. After that, it is possible to compare different forecasting algorithms and choose the one which produces the minimal amount of error. With this approach, businesses can replace traditional techniques with ML, getting the following benefits for their business forecasts:
• Acquiring insights and detecting hidden patterns that are difficult to trace with traditional approaches. Training ML forecasting models on big data and moving computation to the cloud is becoming a de facto industry standard.
• A reduced number of errors in forecasting. For instance, McKinsey claims that AI-driven forecasting models applied to supply chain management can reduce the number of errors by 20–50%.
• The ability to infuse more data into a model. External data may be valuable here and change the outcomes in terms of predictions.
• Flexibility and rapid adaptability to changes. Compared to traditional non-AI approaches, ML forecasting algorithms can be quickly adapted in case of any significant changes.
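The assessment stage described above can be as simple as comparing the errors of competing forecasters against held-out actuals. A minimal sketch (the demand figures and model outputs below are hypothetical; scikit-learn assumed):

import numpy as np
from sklearn.metrics import mean_absolute_error

# Hypothetical actual demand for the last six periods
actual = np.array([100, 120, 130, 125, 140, 150])
# Hypothetical predictions from two competing forecasting models
model_a = np.array([98, 118, 135, 120, 138, 155])
model_b = np.array([110, 105, 120, 140, 120, 170])

# Choose the algorithm that produces the minimal amount of error
for name, preds in [("model_a", model_a), ("model_b", model_b)]:
    print(name, "MAE:", mean_absolute_error(actual, preds))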
Please note that we're considering forecasting here, not predictive modeling – two related but distinct practices.
DEMAND FORECASTING
For example, a restaurant chain owner may want:
• to know the number of dishes that will be sold in the restaurant, in order to plan food stock in advance;
• to understand and define the appropriate number of employees required to provide quality customer service;
• to come up with a proper and timely marketing campaign.
In order to develop a demand forecasting model and help businesses fulfill these goals, a good starting point is to analyze historical data from previous periods. One of the ways to improve the model's performance could be the integration of NLP algorithms as well. For example, we can consider Google reviews for our restaurant chain, as well as for its main competitors, to identify the main dishes and aspects of service that customers like or do not like.
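A minimal sketch of such a demand model (our own illustration; the daily_sales figures are hypothetical), using the previous days' sales as features to predict the next day:

import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical daily dish sales for the previous period
daily_sales = np.array([80, 85, 90, 88, 95, 100, 97, 105, 110, 108, 115, 120])

# Use the previous 3 days as features to predict the next day's demand
X = np.array([daily_sales[i:i + 3] for i in range(len(daily_sales) - 3)])
y = daily_sales[3:]
model = LinearRegression().fit(X, y)

# Forecast tomorrow's demand from the 3 most recent days
print(model.predict(daily_sales[-3:].reshape(1, -1)))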
FRAUD DETECTION
According to a TransUnion report, there was a 52.2% increase in the rate of suspected digital fraud globally between 2019 and 2021. This indicates that companies should put greater effort into developing anti-fraud tactics. ML
algorithms can detect suspicious financial transactions by learning from past data.
They are already successfully applied in e-commerce, banking, healthcare, fintech,
and other areas.
For instance, suppose a cafe chain owner wants to analyze the productivity of employees, and one of the main goals is to detect hidden patterns that allow employees to cheat. Fraud like this can cause the business to lose money. Based on historical data, we can develop a fraud detection model that will detect anomalous patterns and send notifications about them. Managers can then closely analyze the detected anomalies and identify the root cause of such deviations in the data, so that similar cases can be prevented in the future and the business kept safe.
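A brief sketch of such an anomaly detector using Isolation Forest, one common choice for this task (scikit-learn assumed; the per-shift records are hypothetical):

import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical per-shift records: [refunds issued, voided sales]
shifts = np.array([[1, 2], [0, 1], [2, 2], [1, 1], [0, 2],
                   [1, 3], [2, 1], [12, 9]])   # last row looks suspicious

detector = IsolationForest(contamination=0.1, random_state=0).fit(shifts)
# -1 flags an anomaly for a manager to investigate, 1 means normal
print(detector.predict(shifts))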
Key Machine Learning Forecasting Algorithms:
Let’s look at some key machine learning forecasting algorithms to better
understand how ML forecasting can be applied.
REGRESSION ALGORITHMS
ML regression models are applied to predict trends and outcomes, being capable of
comprehending how variables impact each other along with the results. The
dependency between variables can be both linear and nonlinear, while labeled data
is required for training. After understanding the relationship of variables, regression
models can predict what results will be in unseen data.
Simple and multiple linear regression, as well as logistic regression (where the target variable has only two values), are among the most common baseline models used to predict sales, stock prices, and customer behavior.
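For example, a logistic regression sketch predicting a two-valued outcome – will a customer churn or not – from hypothetical usage features (scikit-learn assumed):

import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical features: [monthly visits, support tickets]; target: churned?
X = np.array([[20, 0], [18, 1], [2, 5], [1, 4], [15, 0], [3, 6]])
y = np.array([0, 0, 1, 1, 0, 1])   # 1 = churned, 0 = stayed

model = LogisticRegression().fit(X, y)
print(model.predict([[17, 1], [2, 7]]))   # predicted classes
print(model.predict_proba([[2, 7]]))      # class probabilities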
DEEP LEARNING ALGORITHMS
The time series forecasting toolbox is gradually being replenished with new deep learning algorithms. The more versatile and explainable a model is, the higher the
chances for its production use. Let’s take a look at a few deep learning models for
time series forecasting.
The first one is DeepAR. It’s a supervised ML algorithm created by Amazon and
based on recurrent neural networks. It has proven its efficiency with datasets
consisting of hundreds of interrelated time series. The advantages of the method
are the possibility to use a rich set of inputs, scaling capabilities, and suitability for
probabilistic forecasting.
The second one is the Temporal Fusion Transformer (TFT). It outperforms other deep learning models in terms of versatility and can be built on multiple time series.
TFT performs well even if trained on a small dataset, thus being suitable for
demand forecasting as just one example.
The third algorithm is long short-term memory (LSTM), which is based upon an artificial recurrent neural network (RNN) in which the output from one step is fed as the input to the next step. As for the architecture of LSTM, it consists of neural networks and memory cells for maintaining data, while any manipulation within the memory is performed by gates. There are three gates: Forget, Input, and Output. However, LSTM requires plenty of resources and a long time for training.
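A compact sketch of an LSTM forecaster (assuming TensorFlow/Keras is installed; the series here is synthetic), where windows of past values predict the next step:

import numpy as np
from tensorflow import keras   # assumes TensorFlow/Keras is installed

# Synthetic series; windows of 10 past steps predict the next value
series = np.sin(np.linspace(0, 20, 300))
X = np.array([series[i:i + 10] for i in range(len(series) - 10)])[..., None]
y = series[10:]

model = keras.Sequential([
    keras.layers.LSTM(32, input_shape=(10, 1)),  # memory cells with gates
    keras.layers.Dense(1),                       # next-step prediction
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, verbose=0)             # training is the costly part
print(model.predict(X[-1:], verbose=0))          # forecast one step ahead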
TREE-BASED ALGORITHMS
Tree-based algorithms are supervised learning approaches whose advantages include accuracy, stability, and suitability for mapping non-linear patterns. The idea is to define homogeneous sets in the sample, taking into account the key differentiators in the input. The classification of tree-based algorithms depends on the target variable. Tree-based algorithms can be easily grasped, require minimal data cleaning, and handle different types of variables; their tendency toward overfitting and their poor handling of continuous variables may be seen as disadvantages.
GAUSSIAN PROCESSES
Gaussian processes (GP) are inferior in popularity to other models, yet they are powerful enough for industrial application, including automatic forecasting. Gaussian processes enable us to incorporate expert opinion via the kernel, though their application in forecasting depends on the number of parameters and may be computationally expensive.
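A short sketch of expert opinion entering via the kernel (scikit-learn assumed; the series and kernel choices are illustrative):

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ExpSineSquared

# Expert opinion enters via the kernel: here we assume the series is
# smooth (RBF) and has a 12-period seasonal cycle (ExpSineSquared)
kernel = RBF(length_scale=10.0) + ExpSineSquared(periodicity=12.0)

X = np.arange(48).reshape(-1, 1)                           # 48 past periods
y = np.sin(2 * np.pi * X.ravel() / 12) + 0.05 * X.ravel()  # toy series

gp = GaussianProcessRegressor(kernel=kernel).fit(X, y)
# Probabilistic forecast: both the mean and its uncertainty
mean, std = gp.predict(np.arange(48, 60).reshape(-1, 1), return_std=True)
print(mean[:3], std[:3])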
AUTO-REGRESSIVE ALGORITHMS
The group of auto-regressive algorithms predicts future values using the outputs from previous steps as inputs. Forecasting algorithms of this group include ARIMA, SARIMA, and others. In ARIMA, forecasting is carried out through a combination of autoregressive terms, differencing, and moving averages. For instance, an ARIMA model can predict fuel costs or forecast a company's revenue based on past periods. SARIMA uses the same basic idea but adds a seasonal component that may affect the outcomes.
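A sketch using the statsmodels library (assumed installed; the revenue figures are hypothetical):

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical monthly revenue for the past 3 years
revenue = np.array([100, 102, 105, 103, 108, 110, 109, 112, 115,
                    114, 118, 120, 119, 123, 125, 124, 128, 130,
                    129, 133, 136, 135, 139, 141, 140, 144, 147,
                    146, 150, 152, 151, 155, 158, 157, 161, 163])

# order=(p, d, q): autoregressive terms, differencing, moving average
fitted = ARIMA(revenue, order=(1, 1, 1)).fit()
print(fitted.forecast(steps=6))   # revenue forecast for the next 6 months
# For seasonal data, SARIMA adds a seasonal order, e.g.
# ARIMA(revenue, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))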
EXPONENTIAL SMOOTHING
Exponential smoothing is an alternative to ARIMA models. It can be applied as a forecasting model for univariate data and can be extended to support data with a systematic trend or seasonal component. In this model, a forecast is a weighted sum of past observations, where the importance (weight) of past observations decreases exponentially with age. The accuracy of prediction depends on the type of exponential smoothing model, which can be single, double, or triple; the most sophisticated exponential smoothing models take into account both trend and seasonality.
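A sketch of triple (Holt-Winters) exponential smoothing via statsmodels (assumed installed; the quarterly sales figures are hypothetical):

import numpy as np
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Hypothetical quarterly sales with trend and yearly seasonality
sales = np.array([30, 42, 55, 36, 34, 48, 62, 40,
                  38, 53, 69, 45, 43, 59, 76, 50], dtype=float)

# Triple exponential smoothing: level + trend + seasonal component
model = ExponentialSmoothing(sales, trend="add", seasonal="add",
                             seasonal_periods=4).fit()
print(model.forecast(8))   # sales forecast for the next 8 quarters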
Challenges of ML Forecasting
Nothing good comes without challenges, and ML forecasting is no exception. Key challenges of business forecasting with machine learning include the following:
• An insufficient amount of data to train a model
• An incorrectly chosen metric for evaluating results in alignment with business needs
• Imputation of missing data
• Dealing with outliers/anomalies
While infusing data at the scale of AI, businesses encounter difficulties and limitations; that's why it's crucial to involve experienced data science professionals and AI engineers when implementing machine learning.