Unit 2 TE Honours
2. When a correlation coefficient is (-1), that means for every increase in one
variable, there is a decrease of a fixed proportion in the other. For example, the
quantity of gas in a gas tank decreases in (almost) perfect inverse correlation
with speed.
3. When a correlation coefficient is (0), an increase in one variable is accompanied
by neither a consistent increase nor a consistent decrease in the other, and the two
variables are not linearly related.
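The two cases can be checked numerically. The sketch below is only an illustration: the function name pearson_r and the small data sets are made up for this example, and the coefficient computed is the usual Pearson correlation.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data, chosen only to illustrate the two cases.
x = [1, 2, 3, 4, 5]
inverse_y = [10, 8, 6, 4, 2]   # drops by a fixed 2 units per unit of x
no_trend_y = [2, 1, 0, 1, 2]   # symmetric around x = 3, so no linear trend

r_inverse = pearson_r(x, inverse_y)   # coefficient of -1
r_zero = pearson_r(x, no_trend_y)     # coefficient of (numerically) 0

print(r_inverse, r_zero)
```

Note that the second data set still follows a pattern (a V shape); a coefficient of 0 only rules out a linear relationship.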
Z-value, Z-score, or Z
For a recent final exam in STAT 500, the mean was 68.55 with a standard deviation
of 15.45.
Characteristics of Z-scores
For instance, assume U.S. adult heights and weights are both normally distributed.
Clearly, they would have different means and standard deviations. However, if you
knew these means and standard deviations, you could find the z-scores for your own
weight and height.
You can now use the Standard Normal Table to find the probability, say, of a
randomly selected U.S. adult weighing less than you or being taller than you.
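As a rough sketch of the calculation, the code below uses the exam mean (68.55) and standard deviation (15.45) quoted above. The score of 80 is a hypothetical value chosen for illustration, the function names are made up, and the result assumes the exam scores are approximately normally distributed. The standard normal table lookup is replaced by the equivalent formula based on the error function.

```python
import math

def z_score(x, mean, sd):
    """Number of standard deviations that x lies above the mean."""
    return (x - mean) / sd

def standard_normal_cdf(z):
    """P(Z < z) for a standard normal variable (what the table lists)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

exam_mean, exam_sd = 68.55, 15.45   # values from the STAT 500 example
score = 80.0                         # hypothetical score, for illustration only

z = z_score(score, exam_mean, exam_sd)
p_below = standard_normal_cdf(z)     # share of scores expected below 80
p_above = 1.0 - p_below              # share of scores expected above 80

print(round(z, 2), round(p_below, 3), round(p_above, 3))
```

The same two functions apply to heights or weights once their means and standard deviations are known.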
1. Node a is the parent of nodes b, c, and e, and nodes b, c, and e are the child nodes of node a.
2. Nodes b and c are the parent nodes of d.
3. Node e is the child node of nodes d, c, and a.
It is important to note the relationships between the nodes. Bayesian networks fall under
probabilistic graphical techniques; hence, probability plays a crucial role in defining the
relationship among these nodes.
There are two types of probabilities that you need to be fully aware of in Bayesian networks:
1. Joint probability
Joint probability is a probability of two or more events happening together. For example, the
joint probability of two events A and B is the probability that both events occur, P(A∩B).
2. Conditional probability
Conditional probability is the probability that event B will occur, given that event A has
already occurred, written P(B|A). There are two ways the joint probability can be represented
in terms of it: P(A∩B) = P(A)·P(B|A) and P(A∩B) = P(B)·P(A|B).
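A small numeric check can make the link between joint and conditional probability concrete. The joint distribution below is invented for illustration; the sketch computes the marginals and both conditionals from it and confirms that either conditional, multiplied by the matching marginal, gives back P(A∩B).

```python
# Hypothetical joint distribution over two binary events A and B.
# Keys are (A occurred, B occurred); the four values sum to 1.
joint = {
    (True, True): 0.20,
    (True, False): 0.10,
    (False, True): 0.20,
    (False, False): 0.50,
}

# Marginal probabilities: sum the joint over the other event.
p_a = sum(p for (a, _), p in joint.items() if a)
p_b = sum(p for (_, b), p in joint.items() if b)

# Joint probability that both events occur.
p_a_and_b = joint[(True, True)]

# Conditional probabilities from the definition P(B|A) = P(A and B) / P(A).
p_b_given_a = p_a_and_b / p_a
p_a_given_b = p_a_and_b / p_b

print(p_a, p_b, p_b_given_a, p_a_given_b)
```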
The conditional probability distribution of each node is represented by a table called the
"node table" (more commonly known as a conditional probability table). It has one row for
each possible combination of states of the parent nodes (the "parent random variables") and
one column for each possible state of the node itself (the "child random variable"), so each
row sums to 1.
In order to find out how likely it is that a certain event will happen, we need to sum up the
joint probabilities over all combinations of the other variables that are consistent with
that event.
You have installed a burglar alarm at home. The alarm not only detects burglary but also
responds to minor earthquakes. You have two neighbors, Chris and Martin, who have agreed
to get in touch with you when the alarm rings. Chris calls you when he hears the alarm but
sometimes confuses the telephone ringing with the alarm and calls then too. On the other
hand, Martin is a music lover who sometimes misses the alarm due to the loud music he plays.
Problem:
Based on the evidence on who will or will not call, find the probability of a burglary
occurring in the house.
1. Burglary (B)
2. Earthquake (E)
3. Alarm (A)
4. Chris calls (C)
5. Martin calls (M)
Links act as causal dependencies that define the relationship between the nodes. Both Chris
and Martin call when there is an alarm.
Let’s write the probability distribution function formula for the above five nodes.
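Assuming the structure described above (Burglary and Earthquake are independent root nodes, the Alarm depends on both, and each neighbor's call depends only on the Alarm), the joint distribution over the five nodes would factorise as follows. This is the standard chain-rule factorisation for that structure.

```latex
P(B, E, A, C, M) = P(B)\, P(E)\, P(A \mid B, E)\, P(C \mid A)\, P(M \mid A)
```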
Now, let's look at the observed values for each of the nodes with the table of probabilities:
[Probability tables for nodes B, E, A, C, and M are not reproduced here.]
Based on the above observed values, the conditional values can be derived and, therefore, the
probability distribution can be calculated.
[Conditional probability tables for nodes A, C, and M are not reproduced here.]
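The calculation can be sketched by brute-force enumeration. Every probability value in the code below is an illustrative placeholder, not a value taken from the tables referred to in the text, and the function names are made up for this example. The sketch multiplies the five factors of the network for every combination of the unobserved variables, then normalises to get the probability of a burglary given who called.

```python
from itertools import product

# Illustrative placeholder probabilities (NOT the values from the text's tables).
P_B = 0.001                      # P(Burglary)
P_E = 0.002                      # P(Earthquake)
P_A = {                          # P(Alarm | Burglary, Earthquake)
    (True, True): 0.95,
    (True, False): 0.94,
    (False, True): 0.29,
    (False, False): 0.001,
}
P_C = {True: 0.90, False: 0.05}  # P(Chris calls | Alarm)
P_M = {True: 0.70, False: 0.01}  # P(Martin calls | Alarm)

def prob(p_true, value):
    """Probability of a binary variable taking `value`, given P(True)."""
    return p_true if value else 1.0 - p_true

def joint(b, e, a, c, m):
    """P(B, E, A, C, M) as the product of each node's conditional probability."""
    return (prob(P_B, b) * prob(P_E, e) * prob(P_A[(b, e)], a)
            * prob(P_C[a], c) * prob(P_M[a], m))

def posterior_burglary(c, m):
    """P(Burglary | C=c, M=m), summing out Earthquake and Alarm."""
    weight = {True: 0.0, False: 0.0}
    for b, e, a in product([True, False], repeat=3):
        weight[b] += joint(b, e, a, c, m)
    return weight[True] / (weight[True] + weight[False])

p_both_call = posterior_burglary(c=True, m=True)
p_neither_calls = posterior_burglary(c=False, m=False)
print(round(p_both_call, 3), p_neither_calls)
```

Enumeration is exponential in the number of unobserved variables, so it is only practical for small networks like this one.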
1. Spam filtering: A spam filter is a program that detects unsolicited and spam mails.
Bayesian spam filters estimate the probability that a mail is spam, learning from examples
of both spam and ham (legitimate) messages.
2. Biomonitoring: This involves the use of indicators to quantify the concentration of
chemicals in the human body, typically measured in blood or urine.
3. Information retrieval: Bayesian networks assist in information retrieval for research, which
is a constant process of extracting information from databases. It works in a loop. Hence, we
have to continuously reconsider and redefine our research problem to avoid data overload.
4. Image processing: A form of signal processing, image processing uses mathematical
operations to convert images into digital format. Once images are converted, their quality can
be enhanced with further operations. The input can be any form of image, such as a
photograph or a video frame.
5. Gene regulatory network: Bayesian networks can be applied to gene regulatory networks
in order to make predictions about the effects of genetic variations on cellular phenotypes.
A gene regulatory network describes the interactions between genes, proteins, and
metabolites, often as a set of mathematical equations. Such networks are used to study
how genetic variations affect the development of a cell or organism.
6. Turbo code: Turbo codes are a class of high-performance error correction codes that come
close to the theoretical limit on the reliable data rate of a noisy channel. Their iterative
decoding can be viewed as belief propagation on a Bayesian network. They have been used in
satellites, space probes, deep-space missions, military communications systems, and civilian
wireless communication systems, including 3G and 4G LTE cellular telephone systems.
7. Document classification: This is a problem often encountered in computer science and
information science. The task is to assign a document to one or more classes. It can be
done manually or algorithmically. Since manual effort takes too much time, algorithmic
classification is used to complete it quickly and effectively.
We have seen what Bayesian networks in machine learning are and how they work. To recap,
they are a type of probabilistic graphical model. The first stage of belief networks is to
convert all possible states of the world into beliefs, which are either true or false. In the
second stage, all possible transitions between states are encoded as conditional probabilities.
The final stage is to encode all possible observations as likelihoods for each state.
A belief network can be seen as an inference procedure for a set of random variables,
conditioned on some other random variables. The conditional independence assumptions
define the joint probability distribution from which the conditional probabilities are
computed.
What is regression analysis and what does it mean to perform a regression?
Regression analysis is a reliable method of identifying which variables have
an impact on a topic of interest. The process of performing a regression allows
you to confidently determine which factors matter most, which factors can
be ignored, and how these factors influence each other.
Let’s continue using our application training example. In this case, we’d
want to measure the historical levels of satisfaction with the events from the
past three years or so (or however long you deem statistically significant),
as well as any available information on the independent variables.
Perhaps we’re particularly curious about how the price of a ticket to the
event has impacted levels of satisfaction.
Our dependent variable (in this case, the level of event satisfaction) should
be plotted on the y-axis, while our independent variable (the price of the
event ticket) should be plotted on the x-axis.
Once your data is plotted, you may begin to see correlations. If the
theoretical chart above did indeed represent the impact of ticket prices on
event satisfaction, then we’d be able to confidently say that the higher the
ticket price, the higher the levels of event satisfaction.
But how can we tell the degree to which ticket price affects event satisfaction?
To begin answering this question, draw a line through the middle of all of
the data points on the chart. This line is referred to as your regression line,
and it can be precisely calculated using a standard statistics program like
Excel.
We’ll use a theoretical chart once more to depict what a regression line
should look like.
Excel will even provide a formula for the line, including its slope, which adds
further context to the relationship between your independent and
dependent variables.
The formula for a regression line might look something like Y = 100 + 7X +
error term.
This tells you that if there is no “X”, then Y = 100. If X is our increase in
ticket price, this informs us that if there is no increase in ticket price, event
satisfaction will still be 100 points, and that each one-unit increase in ticket
price is associated with a 7-point increase in satisfaction.
You’ll notice that the formula calculated by Excel includes an error
term. Regression lines always consider an error term because in reality,
independent variables are never perfect predictors of dependent
variables. This makes sense when looking at the impact of ticket prices on
event satisfaction: there are clearly other variables that contribute
to event satisfaction besides price.
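To illustrate how a line like Y = 100 + 7X and its error term arise, the sketch below fits a regression line to a small made-up data set. The numbers are hypothetical (constructed so the fit comes out at an intercept of 100 and a slope of 7), and fit_line is a name invented for this example. It uses the ordinary least-squares formulas, which is the usual method behind a spreadsheet trendline.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: increase in ticket price (X) and satisfaction score (Y).
price_increase = [0, 1, 2, 3, 4, 5]
satisfaction = [101, 105, 115, 122, 126, 136]

intercept, slope = fit_line(price_increase, satisfaction)

# The error term is what the line does not explain for each observation.
residuals = [y - (intercept + slope * x)
             for x, y in zip(price_increase, satisfaction)]

print(intercept, slope)
print(residuals)
```

The residuals are small but not zero, which is the error term at work: price alone does not fully determine satisfaction.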
Least Squares Method: What It Means, How to Use It, With Examples
By Will Kenton. Updated September 24, 2023. Reviewed by Michael J. Boyle. Fact checked by Yarilet Perez.
Investopedia / Xiaojie Liu
KEY TAKEAWAYS
The least squares method is a statistical procedure to find the best fit
for a set of data points.
The method works by minimizing the sum of the squared offsets or residuals
of points from the plotted curve.
Least squares regression is used to predict the behavior of
dependent variables.
The least squares method provides the overall rationale for the
placement of the line of best fit among the data points being studied.
Traders and analysts can use the least squares method to identify
trading opportunities and economic or financial trends.
For instance, an analyst may use the least squares method to generate a
line of best fit that explains the potential relationship between independent
and dependent variables. The line of best fit determined from the least
squares method has an equation that highlights the relationship between
the data points.
If the data shows a linear relationship between two variables, it results in a
least-squares regression line. This minimizes the vertical distance from the
data points to the regression line. The term least squares is used because
the fitted line has the smallest possible sum of squared errors (the residual
sum of squares). A non-linear least-squares problem, on the other hand, has no
closed-form solution and is generally solved by iteration.
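For the linear case, the closed-form solution can be written out. The notation here is an assumption introduced for illustration: a is the intercept, b the slope, and (x_i, y_i) are the n data points with means x-bar and y-bar. Setting both partial derivatives of the sum of squared errors S to zero gives the standard result.

```latex
\begin{align*}
S(a, b) &= \sum_{i=1}^{n} \bigl(y_i - a - b\,x_i\bigr)^2 \\
b &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \\
a &= \bar{y} - b\,\bar{x}
\end{align*}
```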
Advantages
One of the main benefits of using this method is that it is easy to apply and
understand. That's because it only uses two variables (one that is shown
along the x-axis and the other on the y-axis) while highlighting the best
relationship between them.
Investors and analysts can use the least squares method by analyzing past
performance and making predictions about future trends in the economy
and stock markets. As such, it can be used as a decision-making tool.
Disadvantages
The primary disadvantage of the least squares method lies in the data used.
It can only highlight the relationship between two variables. As such, it
doesn't take any others into account. And if there are any outliers, the
results become skewed.
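The effect of an outlier can be seen in a small experiment. The data below are hypothetical and fit_line is a name made up for this sketch. It fits the same six points twice, once as they are and once with the last value replaced by an outlier, and compares the slopes.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data lying exactly on the line y = 2x.
xs = [1, 2, 3, 4, 5, 6]
ys_clean = [2, 4, 6, 8, 10, 12]

# Same data with the last observation replaced by an outlier.
ys_outlier = [2, 4, 6, 8, 10, 40]

clean_intercept, clean_slope = fit_line(xs, ys_clean)
outlier_intercept, outlier_slope = fit_line(xs, ys_outlier)

print(clean_slope, outlier_slope)
```

Because errors are squared, one distant point pulls the line strongly toward itself; here a single changed value triples the fitted slope.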
Another problem with this method is that the data must be evenly
distributed. If this isn't the case, the results may not be reliable.