Biostatistics - Probability - 02 October 2024

biostatistics biomedical sciences , mut

Uploaded by

melusimthethwa30

Probability

02 October 2024

Presented by Mrs K.N. Bhengu – Biomedical Sciences


Conditional Probability
The conditional probability of an event B is the probability that the event will
occur given the knowledge that an event A has already occurred. This
probability is written P(B|A), notation for the probability of B given A. In the case
where events A and B are independent (where event A has no effect on the
probability of event B), the conditional probability of event B given event A is
simply the probability of event B, that is P(B).
If events A and B are not independent, then the probability of the
intersection of A and B (the probability that both events occur) is defined
by
P(A and B) = P(A)P(B|A).



Conditional Probability
From this definition, the conditional probability P(B|A) is obtained by
dividing both sides by P(A), provided that P(A) is greater than zero:
P(B|A) = P(A and B) / P(A).

In a card game, suppose a player needs to draw two cards of the same suit in order to
win. Of the 52 cards, there are 13 cards in each suit. Suppose first the player draws a
heart. Now the player wishes to draw a second heart. Since one heart has already been
chosen, there are now 12 hearts remaining in a deck of 51 cards. So the conditional
probability is
P(Draw second heart|First card a heart) = 12/51.
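The card example can be checked by brute force. The sketch below is only an illustration (the deck representation and the function name are assumptions, not part of the slides): it enumerates every ordered pair of cards and counts how often the second card is a heart among the pairs whose first card is a heart.

```python
from fractions import Fraction
from itertools import permutations

# A standard 52-card deck as (rank, suit) pairs: 13 ranks in each of 4 suits.
SUITS = ["hearts", "diamonds", "clubs", "spades"]
DECK = [(rank, suit) for suit in SUITS for rank in range(1, 14)]


def second_heart_given_first_heart():
    """Exact P(second card is a heart | first card is a heart)."""
    first_heart = 0   # ordered draws whose first card is a heart
    both_hearts = 0   # ordered draws in which both cards are hearts
    for first, second in permutations(DECK, 2):
        if first[1] == "hearts":
            first_heart += 1
            if second[1] == "hearts":
                both_hearts += 1
    return Fraction(both_hearts, first_heart)


print(second_heart_given_first_heart())  # 4/17, the reduced form of 12/51
```

Counting ordered pairs without replacement gives the same answer as the shortcut argument in the text (12 hearts left among 51 cards).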
Conditional Probability
Problem: A math teacher gave her class two tests. 25% of the class passed
both tests and 42% of the class passed the first test.
What percent of those who passed the first test also passed the second test?
P(Second|First) = P(First and Second) / P(First)
= 0.25 / 0.42
≈ 0.60
= 60%
https://youtu.be/hR0fQ31-3lg
Conditional Probability
Example 1: A jar contains black and white marbles. Two marbles are chosen without
replacement.
The probability of selecting a black marble and then a white marble is 0.34, and the
probability of selecting a black marble on the first draw is 0.47.
What is the probability of selecting a white marble on the second draw, given that the first
marble drawn was black?
P(White|Black) = P(Black and White) / P(Black)
= 0.34 / 0.47
≈ 0.72
= 72%
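Both worked problems use the same division, so they can be expressed with one small helper. This is a minimal sketch; the function name `conditional_probability` is an assumption introduced for illustration.

```python
def conditional_probability(p_a_and_b, p_a):
    """Return P(B|A) = P(A and B) / P(A). P(A) must be positive."""
    if p_a <= 0:
        raise ValueError("P(A) must be greater than zero")
    return p_a_and_b / p_a


# Two-tests problem: 25% passed both tests, 42% passed the first.
passed_second_given_first = conditional_probability(0.25, 0.42)

# Marble problem: P(black then white) = 0.34, P(black first) = 0.47.
white_given_black = conditional_probability(0.34, 0.47)

print(round(passed_second_given_first, 2), round(white_given_black, 2))
```

The guard against P(A) = 0 reflects the fact that conditioning on an impossible event is undefined.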
Bayes Theorem
In probability theory and statistics, Bayes’ theorem (alternatively Bayes’ law or Bayes’ rule)
describes the probability of an event, based on prior knowledge of conditions that might
be related to the event.

For example, if cancer is related to age, then, using Bayes’ theorem, a person's age can be
used to more accurately assess the probability that they have cancer than can be done
without knowledge of the person’s age.

https://youtu.be/cqTwHnNbc8g
Bayes Theorem

• Another important method for calculating conditional probabilities is given by Bayes's
formula. The formula is based on the expression P(B) = P(B|A)P(A) + P(B|Ac)P(Ac), where
Ac is the complement of A. It states that the probability of event B is the sum of the
conditional probabilities of event B given that event A has or has not occurred, each
weighted by the probability of that condition.
• For independent events A and B, this is equal to P(B)P(A) + P(B)P(Ac) = P(B)(P(A) + P(Ac)) =
P(B)(1) = P(B), since the probability of an event and its complement must always sum to
1. Bayes's formula is defined as follows:
P(A|B) = P(B|A)P(A) / (P(B|A)P(A) + P(B|Ac)P(Ac))
Bayes Theorem
Suppose a voter poll is taken in three states.

In state A, 50% of voters support the liberal candidate, in state B, 60% of the voters support the liberal
candidate, and in state C, 35% of the voters support the liberal candidate.

Of the total population of the three states, 40% live in state A, 25% live in state B, and 35% live in
state C.

Given that a voter supports the liberal candidate, what is the probability that she lives in state B?
Bayes Theorem
P(Voter lives in state B|Voter supports liberal candidate)
= P(Voter supports liberal candidate|Voter lives in state B)P(Voter lives in state B)
divided by
(P(Voter supports lib. cand.|Voter lives in state A)P(Voter lives in state A) +
P(Voter supports lib. cand.|Voter lives in state B)P(Voter lives in state B) +
P(Voter supports lib. cand.|Voter lives in state C)P(Voter lives in state C))
= (0.60)*(0.25)/((0.50)*(0.40) + (0.60)*(0.25) + (0.35)*(0.35))
= (0.15)/(0.20 + 0.15 + 0.1225)
= 0.15/0.4725
≈ 0.3175.
The probability that the voter lives in state B is approximately 0.32.
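The voter calculation generalises to any number of groups. The following is a hedged sketch: the function name `bayes_posterior` and the dictionary layout are assumptions made for illustration, while the numbers are the ones given in the poll example.

```python
def bayes_posterior(priors, likelihoods, target):
    """P(target group | evidence) from P(group) and P(evidence | group)."""
    # Law of total probability: overall chance of seeing the evidence.
    total = sum(priors[g] * likelihoods[g] for g in priors)
    return priors[target] * likelihoods[target] / total


# Share of the combined population living in each state.
priors = {"A": 0.40, "B": 0.25, "C": 0.35}
# Share of each state's voters who support the liberal candidate.
likelihoods = {"A": 0.50, "B": 0.60, "C": 0.35}

posterior_b = bayes_posterior(priors, likelihoods, "B")
print(round(posterior_b, 4))  # 0.3175
```

The denominator is the same three-term sum shown in the worked solution, so the posteriors for states A, B and C add up to 1.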
Probabilities and Odds
The odds are defined as the probability that the event will occur divided by the probability
that the event will not occur.
If the probability of an event occurring is Y, then the probability of the event not occurring is
1-Y. Example: if the probability of an event is 0.80 (80%), then the probability that the
event will not occur is 1-0.80 = 0.20, or 20%.

Odds of Events = Y / (1-Y)


So, in this example, if the probability of the event occurring = 0.80, then the odds are
0.80 / (1-0.80) = 0.80/0.20 = 4 (i.e., 4 to 1).
If a horse runs 100 races, wins 5 and loses the other 95, the probability of
winning is 0.05 or 5%, and the odds of the horse winning are 5/95 = 0.0526 (i.e., 1 to 19).
Probability of Odds

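Using the RED marble example [P(RED) = 1/4 and Odds_Favor(RED) = 1/3] we can
demonstrate how these are equivalent:
Odds_Favor(RED) = P(RED) / (1 - P(RED)) = (1/4) / (3/4) = 1/3
P(RED) = Odds_Favor(RED) / (1 + Odds_Favor(RED)) = (1/3) / (4/3) = 1/4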
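Converting between probability and odds is a one-line calculation in each direction. The sketch below uses exact fractions so the results match the hand arithmetic; the function names are assumptions introduced for illustration.

```python
from fractions import Fraction


def probability_to_odds(p):
    """Odds in favour = p / (1 - p). Undefined when p equals 1."""
    if not 0 <= p < 1:
        raise ValueError("probability must satisfy 0 <= p < 1")
    return p / (1 - p)


def odds_to_probability(odds):
    """Inverse conversion: p = odds / (1 + odds)."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (1 + odds)


print(probability_to_odds(Fraction(4, 5)))    # 4     (event with probability 0.80)
print(probability_to_odds(Fraction(5, 100)))  # 1/19  (horse that wins 5 of 100 races)
print(probability_to_odds(Fraction(1, 4)))    # 1/3   (RED marble)
print(odds_to_probability(Fraction(1, 3)))    # 1/4   (back to P(RED))
```

Applying one conversion and then the other returns the starting value, which is the sense in which the two descriptions are equivalent.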
Random variable
• A random variable is a variable whose value is a numerical outcome of a random
phenomenon.

• A random variable is denoted with a capital letter

• The probability distribution of a random variable X tells what the possible values of X
are and how probabilities are assigned to those values

• A random variable can be discrete or continuous

• A discrete random variable X has a countable number of possible values.


Discrete Random Variables
Variables that have a finite or countable number of possible values.
These variables usually occur in counting experiments.
Examples: number of students present
number of red marbles in a jar
number of heads when flipping three coins
students’ grade level
Continuous Random Variables
Variables that can take on any value in some interval, i.e. they have infinitely (uncountably) many possible values.
These variables usually occur in experiments where measurements are taken.
Examples: height of students in class
weight of students in class
time it takes to get to school
distance travelled between classes
Continuous Random Variable

A continuous random variable X takes all values in a given interval of numbers.

The probability distribution of a continuous random variable is shown by a density curve.

The probability that X lies in an interval of numbers is the area under the density
curve between the interval endpoints.

The probability that a continuous random variable X is exactly equal to a number is zero
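The "probability as area" idea can be illustrated numerically. This sketch assumes a simple uniform density on the interval from 0 to 2 (an example chosen here, not one from the slides) and approximates the area with the trapezoid rule.

```python
def uniform_density(x, low=0.0, high=2.0):
    """Density of a variable spread evenly over [low, high] (assumed example)."""
    return 1.0 / (high - low) if low <= x <= high else 0.0


def area_under_density(pdf, a, b, steps=1000):
    """Approximate the area under pdf between a and b (trapezoid rule)."""
    width = (b - a) / steps
    total = 0.5 * (pdf(a) + pdf(b))
    for i in range(1, steps):
        total += pdf(a + i * width)
    return total * width


whole_curve = area_under_density(uniform_density, 0.0, 2.0)    # total area
middle_part = area_under_density(uniform_density, 0.5, 1.5)    # P(0.5 < X < 1.5)
single_point = area_under_density(uniform_density, 1.0, 1.0)   # P(X = 1)

print(round(whole_curve, 6), round(middle_part, 6), single_point)
```

An interval of zero width has zero area, which is why the probability of any single exact value is zero for a continuous random variable.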
Means and Variances of Random Variables

• The mean of a discrete random variable, X, is its weighted average. Each value of X is
weighted by its probability.
• To find the mean of X, multiply each value of X by its probability, then add all the
products.
• The mean of a random variable X is called the expected value of X.

$\mu_X = x_1 p_1 + x_2 p_2 + \cdots + x_k p_k = \sum_{i=1}^{k} x_i p_i$
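As a small worked case of the weighted average, the sketch below computes the expected number of heads when flipping three fair coins (one of the discrete examples listed earlier). The function name and the dictionary representation of the distribution are assumptions for illustration.

```python
from fractions import Fraction


def expected_value(distribution):
    """Mean of a discrete random variable given as {value: probability}."""
    if sum(distribution.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    # Multiply each value by its probability, then add the products.
    return sum(x * p for x, p in distribution.items())


# Number of heads in three fair coin flips: 8 equally likely outcomes.
heads = {
    0: Fraction(1, 8),
    1: Fraction(3, 8),
    2: Fraction(3, 8),
    3: Fraction(1, 8),
}

print(expected_value(heads))  # 3/2
```

The expected value of 1.5 heads is not itself a possible outcome; it is the long-run average over many repetitions.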
Mean of Random Variables
Law of Large Numbers
As the number of observations increases, the mean of the observed values, $\bar{x}$,
approaches the mean of the population, $\mu$.

The more variation in the outcomes, the more trials are needed to ensure that $\bar{x}$ is close to $\mu$.
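A short simulation can make the law of large numbers visible. The sketch below assumes a fair six-sided die (population mean 3.5) as the random phenomenon; the die, the seed and the function name are illustrative choices, not part of the slides.

```python
import random


def sample_mean_of_die(n_rolls, rng):
    """Mean of n_rolls simulated rolls of a fair six-sided die."""
    rolls = [rng.randint(1, 6) for _ in range(n_rolls)]
    return sum(rolls) / n_rolls


rng = random.Random(2024)  # fixed seed so the run is repeatable
for n in (10, 1_000, 100_000):
    # The sample mean should drift toward 3.5 as n grows.
    print(n, round(sample_mean_of_die(n, rng), 3))
```

Small samples can land well away from 3.5, while large samples are typically very close, which is the behaviour the law describes.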
Rules of means
If X is a random variable and a and b are fixed numbers, then
$\mu_{a+bX} = a + b\mu_X$

If X and Y are random variables, then
$\mu_{X+Y} = \mu_X + \mu_Y$
Normal Distribution

A normal distribution, sometimes called the bell curve, is a distribution that
occurs naturally in many situations.

Its shape resembles a bell (hence the nickname). The bell curve is symmetrical:
half of the data fall to the left of the mean and half fall to the right.
Properties of Normal Distribution
• The mean, mode and median are all equal.

• The curve is symmetric at the center (i.e. around the mean, μ).

• Exactly half of the values are to the left of center and exactly half the
values are to the right.

• The total area under the curve is 1.


Normal Distribution
The empirical rule tells you what percentage of your data falls within a certain number of standard
deviations from the mean:
• About 68% of the data falls within one standard deviation of the mean.
• About 95% of the data falls within two standard deviations of the mean.
• About 99.7% of the data falls within three standard deviations of the mean.
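The three percentages of the empirical rule can be recovered from the normal cumulative distribution function. This sketch uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is an assumption for illustration.

```python
from statistics import NormalDist


def share_within(k, dist=NormalDist()):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    upper = dist.mean + k * dist.stdev
    lower = dist.mean - k * dist.stdev
    return dist.cdf(upper) - dist.cdf(lower)


for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 1))
```

The exact shares do not depend on the particular mean or standard deviation, so the rule applies to any normal distribution.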
Confidence Interval
• A confidence interval (CI) is a type of interval estimate, computed from the statistics of the observed
data, that might contain the true value of an unknown population parameter.
• The confidence level is designated prior to examining the data. Most commonly, the 95% confidence
level is used.
• However, other confidence levels can be used, for example, 90% and 99%.
• We measure the heights of 40 randomly chosen men, and get a mean height of 175cm,
• We also know the standard deviation of men's heights is 20cm.
• The 95% Confidence Interval (we show how to calculate it later) is:
• 175cm ± 6,2cm
• This says the true mean of ALL men (if we could measure all their heights) is likely to be between
168,8cm and 181,2cm.
Confidence Interval
• Step 1: find the number of observations n, calculate their mean X, and standard deviation s
• Using our example:
• Number of observations: n = 40
• Mean: X = 175
• Standard Deviation: s = 20
• Note: we should use the standard deviation of the entire population, but in many cases we won't
know it.
• We can use the standard deviation for the sample if we have enough observations (at least n=30,
hopefully more).
• Step 2: decide what Confidence Interval we want: 95% or 99% are common choices. Then find the
"Z" value for that Confidence Interval here:
Confidence Interval

Confidence Interval    Z
80%                    1,282
85%                    1,440
90%                    1,645
95%                    1,960
99%                    2,576
99,5%                  2,807
99,9%                  3,291

For 95% the Z value is 1,960.
Step 3: use that Z in this formula for the Confidence Interval:

X ± Z × s/√n

Where:
X is the mean
Z is the chosen Z-value from the table above
s is the standard deviation
n is the number of observations
And we have:
175 ± 1,960 × 20/√40
Confidence Interval
• Which is:

• 175cm ± 6,20cm

• In other words: from 168,8cm to 181,2cm

• The value after the ± is called the margin of error

• The margin of error in our example is 6,20cm
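The three steps can be sketched in a few lines. The example below recomputes the heights interval (n = 40, mean 175 cm, standard deviation 20 cm); the function name is an assumption, and the Z value is obtained from `statistics.NormalDist` rather than read from a table.

```python
import math
from statistics import NormalDist


def z_confidence_interval(mean, sd, n, level=0.95):
    """Return (lower, upper, margin) for a Z-based confidence interval."""
    # Two-sided interval: put (1 - level) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sd / math.sqrt(n)   # margin of error = Z * s / sqrt(n)
    return mean - margin, mean + margin, margin


lower, upper, margin = z_confidence_interval(mean=175, sd=20, n=40)
print(round(lower, 1), round(upper, 1), round(margin, 2))
```

Changing `level` to 0.99 widens the interval, because a larger Z value multiplies the same standard error.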


Testing Hypothesis
Testing Hypothesis
• A statistical hypothesis is a hypothesis that is testable on the basis of observing a
process that is modelled via a set of random variables. A statistical hypothesis test,
sometimes called confirmatory data analysis, is a method of statistical inference.

• Commonly, two statistical data sets are compared, or a data set obtained by sampling is
compared against a synthetic data set from an idealized model.

• A hypothesis is proposed for the statistical relationship between the two data sets, and
this is compared as an alternative to an idealized null hypothesis that proposes no
relationship between two data sets.
Testing Hypothesis

The process of distinguishing between the null hypothesis and the alternative
hypothesis is aided by considering two conceptual types of errors.

The first type of error (a Type I error) occurs when the null hypothesis is wrongly
rejected. The second type of error (a Type II error) occurs when the null hypothesis is
wrongly not rejected.
Critical Values and Critical Region

A critical value is a line on a graph that splits the graph into sections. One or two of the
sections is the “rejection region”; if your test value falls into that region, then you reject the
null hypothesis.
[Figure: a one-tailed test with the rejection region in one tail. The critical value is the red line to the left of that
region.]
Critical Values
[Figure, left panel: results of a one-tailed Z-test are significant if the test statistic is equal to or
greater than 1.64 (more precisely 1.645), the critical value in this case. The shaded area is 5% (α) of the area
under the curve.]
[Figure, right panel: results of a two-tailed Z-test are significant if the absolute value of the test statistic
is equal to or greater than 1.96, the critical value in this case.]
When are critical values of z used?

• A critical value of z (Z-score) is used when the sampling distribution is
normal, or close to normal.

• Z-scores are used when the population standard deviation is known or when
you have larger sample sizes.

• While the z-score can also be used to calculate probability for unknown
standard deviations and small samples, many statisticians prefer to use the t
distribution to calculate these probabilities.
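Critical values of z can be computed rather than looked up. The sketch below is an illustration under the assumption of a standard normal sampling distribution; the function names are hypothetical, and the one-tailed case is taken to be an upper-tail test.

```python
from statistics import NormalDist


def z_critical(alpha, two_tailed):
    """Critical z value for significance level alpha."""
    # A two-tailed test splits alpha evenly between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


def reject_null(z_stat, alpha=0.05, two_tailed=True):
    """True when the test statistic falls in the rejection region."""
    critical = z_critical(alpha, two_tailed)
    if two_tailed:
        return abs(z_stat) >= critical
    return z_stat >= critical   # upper-tail test (assumed direction)


print(round(z_critical(0.05, two_tailed=False), 3))  # 1.645
print(round(z_critical(0.05, two_tailed=True), 3))   # 1.96
print(reject_null(1.7), reject_null(1.7, two_tailed=False))
```

A statistic of 1.7 is significant in the one-tailed test but not in the two-tailed test, which shows why the choice of tails should be made before looking at the data.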
Examples of critical regions
Hypothesis
Here are 7 steps to take to formulate a strong A/B testing Hypothesis

• 1. Define your problem

• Defining your problem is the first thing that needs to be done. What is it that
you want to test or solve? Is it to double your sales or to increase the number
of opt-ins?
• Whatever your goals are, they need to be clearly defined, quantifiable, and
measurable. This should give you a clear idea of what your new design
should solve including the process that will be followed to achieve the
results.
Here are 7 steps to take to formulate a strong A/B testing Hypothesis

• 2. Find out the reasons behind the numbers

• Now that you have defined your problem and you have a clear picture of what it is you want to
achieve, the next thing that follows is an in-depth analysis of the current problem. Basically, you want
to take as much time as possible to learn the reasons behind your numbers.

• You won’t be able to form an accurate hypothesis without studying what is happening on the website
where you want to run your A/B test. Since you are already looking for better variables to improve
your conversion rates, it is only logical to find the reasons that brought you to the current
situation. Why are you experiencing a high bounce rate? Why aren’t you seeing more conversions?
Why are most of your customers failing to complete the payment process? These are some
of the questions that may push you to improve your website.
Here are 7 steps to take to formulate a strong A/B testing Hypothesis

3. Talk to your visitors

• It is important to get real feedback from your visitors. One way is to use surveys: entry
surveys discover your visitors’ objectives, and exit surveys determine whether their goals
have been met. The aim is to understand what your visitors want.

• Knowing the reasons behind their decisions and actions is the most important part of
the survey. Therefore, do not hesitate to ask them to give reasons for their actions in the
survey. For instance, you can place an exit survey at the end of a buying process to ask
them why they bought your product. You could also place an exit survey immediately
after they abandon a buying process to understand why they did so.
Here are 7 steps to take to formulate a strong A/B testing Hypothesis
• 4. Use segmentation to get actionable data

• An experiment may show that a certain product is not performing well, but upon further
analysis, it may be discovered that the majority of people who buy the product are women
aged between 18 and 29 years.

• Upon further investigation, it may turn out that ads for the product were being targeted to
the general population. So when you do segmentation, it may eventually occur to you that
you should concentrate your marketing efforts on the women who fall in the 18-29 age
bracket.
Here are 7 steps to take to formulate a strong A/B testing Hypothesis

• 5. Articulate a Hypothesis for your test

• Now that you have gathered enough evidence to show what or where the problem is, it
is time to state why you think the problem occurs.
• Your hypothesis should have the following characteristics:
• It is goal oriented—it clearly states what needs to be accomplished
• It can be tested—it can easily be implemented
• It is insightful—looking at the hypothesis, one should learn something about the problem.
• An example of a hypothesis
• Problem: fewer than 5% of visitors buy the mobile app
• Hypothesis: the text in the CTA button does not provide a clear message to the customer
Here are 7 steps to take to formulate a strong A/B testing Hypothesis

6. Test substantial variations based on your Hypothesis

• We can call this the brainstorming stage. After determining the problem and articulating a
hypothesis, the next step is to come up with substantial variations based on
your hypothesis.
• Taking the above example, the hypothesis states that “The text in the CTA button does
not provide a clear message to the customer.” The substantial variations could include
things like changing the colour of the button, changing the position of the CTA on the landing
page, changing the wording, creating a different icon, etc.
• The substantial variations of your hypothesis are meant to bring you closer to the
solution as quickly as possible and provide you with insights.
Here are 7 steps to take to formulate a strong A/B testing Hypothesis

7. Analyse results to validate your hypothesis and Repeat

• Once you’ve managed to articulate your hypotheses and test substantial variations,
it’s time to analyse results to validate your hypothesis. You need to have sufficient
test results in order to analyse and compare.

• When you are analysing your tests with the aim of implementing solutions, you
should bear in mind that revenue is the ultimate measurement of improvement.

• Customer feedback and analytics are tools you can use. You should look at the data
your customers have left to help you choose the elements that need to be analysed.
The various elements you could test include:
Here are 7 steps to take to formulate a strong A/B testing Hypothesis

• CTAs: colours, texts, size
• Images: placement, size
• Headlines: size, length, style, tone, text colour
• Testimonials: placement, number, length
• Videos: number, with or without videos
• Forms: files type, colour, number of fields
• Shopping cart: icon, text, number of steps
• Copywriting: long text or short, style, tone
• After learning from your results, you should start the process all over
because there is always room for improvement
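One possible way to analyse the results of an A/B test is a two-proportion Z-test, which ties this step back to the critical values discussed earlier. The slides do not prescribe a specific test, so the choice of test is an assumption, and the visitor and purchase counts below are hypothetical numbers invented for illustration. Conversion rate is used here for simplicity, although the steps above treat revenue as the ultimate measure.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion Z-test. Returns (z statistic, two-tailed p-value)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Under the null hypothesis both variants share one conversion rate.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: original CTA text (A) versus reworded CTA text (B).
z, p_value = two_proportion_z_test(conv_a=40, n_a=1000, conv_b=65, n_b=1000)
print(round(z, 2), round(p_value, 4))
print("significant at 5%" if abs(z) >= 1.96 else "not significant at 5%")
```

With these made-up counts the statistic exceeds the two-tailed critical value of 1.96, so the null hypothesis of no difference between the variants would be rejected at the 5% level.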
