Quantitative Level 1
QUANTITATIVE METHOD
Lecturer: Tran Trong Kien, CFA
Email: Trankien88@gmail.com
READING 6: THE TIME VALUE OF MONEY
Learning outcomes:
DEFINITION:
- Future value (FV): projecting cash flows forward, on the basis of an
appropriate compound interest rate, to the end of the investment’s life;
- Present value (PV): bringing the cash flows (CF) from an investment back to the beginning of the
investment’s life based on an appropriate compound rate;
- Time line: a diagram of the cash flows associated with a time value of money (TVM) problem;
Note:
- A cash flow at the end of one period occurs at the same point in time as the beginning of the next
period.
6.b Explain an interest rate as the sum of a real risk-free rate and premiums that
compensate investors for bearing distinct types of risk:
- Other types of risk: each is added to increase the required rate of return
+ Default risk: the risk that a borrower will not make the promised payments in a
timely manner;
+ Liquidity risk: the risk of receiving less than fair value for an investment if it
must be sold for cash quickly;
+ Maturity risk: the prices of long-term bonds are more volatile than those
of short-term bonds.
Longer-maturity bonds have more maturity risk than short-term bonds and
require a maturity risk premium.
Required interest rate on a security = nominal risk-free rate (T-bills) +
default risk premium + liquidity risk premium + maturity risk premium.
6.c. Calculate and interpret the effective annual rate, given the stated
annual interest rate and the frequency of compounding:
-Effective annual rate (EAR) or effective annual yield (EAY): annual rate
of return actually being earned after adjustments have been made for
different compounding periods.
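As a minimal sketch of the EAR calculation, assuming EAR = (1 + periodic rate)^m − 1 with periodic rate = stated annual rate / m. The 12% stated rate and quarterly compounding are hypothetical inputs chosen for illustration, not figures from the lecture:

```python
def effective_annual_rate(stated_rate: float, periods_per_year: int) -> float:
    """EAR implied by a stated annual rate compounded m times per year."""
    periodic_rate = stated_rate / periods_per_year
    return (1 + periodic_rate) ** periods_per_year - 1


# Hypothetical example: 12% stated annual rate, compounded quarterly.
ear_quarterly = effective_annual_rate(0.12, 4)
print(f"{ear_quarterly:.4%}")  # 12.5509%
```

With annual compounding (m = 1) the EAR equals the stated rate; the more frequent the compounding, the larger the gap between the two.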
Example: FV of a single sum
Calculate the FV of a $200 investment at the end of two years if it
earns an annually compounded rate of return of 10%.
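The example can be worked as a short sketch, assuming FV = PV × (1 + r)^N; the function name and the optional periods_per_year argument are illustrative choices:

```python
def future_value(pv: float, rate: float, years: int, periods_per_year: int = 1) -> float:
    """Future value of a single sum under periodic compounding."""
    periodic_rate = rate / periods_per_year
    n_periods = years * periods_per_year
    return pv * (1 + periodic_rate) ** n_periods


fv = future_value(200, 0.10, 2)
print(round(fv, 2))  # 242.0
```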
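- With more than one compounding period per year, the future value is:
FV = PV × (1 + stated annual rate / m)^(m × N), where m is the number of compounding periods per year and N is the number of years.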
Example: PV of a single sum
Given a discount rate of 10%, calculate the PV of a $200 cash flow that will
be received in two years.
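A minimal sketch of this example, assuming PV = FV / (1 + r)^N with annual compounding:

```python
def present_value(fv: float, rate: float, years: int) -> float:
    """Present value of a single sum received `years` from now."""
    return fv / (1 + rate) ** years


pv = present_value(200, 0.10, 2)
print(round(pv, 2))  # 165.29
```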
- Annuity: a stream of equal cash flows that occur at equal intervals over
a given period
+ Ordinary annuity: cash flows occur at the end of each compounding
period.
+ Annuity due: payments or receipts occur at the beginning of each period.
Example: FV of an ordinary annuity
What is the future value of an ordinary annuity that pays $200 per year at
the end of the next three years, given the investment is expected to earn a
10% rate of return?
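One way to check this example is to compound each end-of-year payment to the end of year 3 and compare the total with the closed-form annuity factor. This is a sketch; the variable names are illustrative:

```python
payment, rate, n = 200, 0.10, 3

# Each payment arrives at the end of year t and compounds for (n - t) years.
fv_by_payment = sum(payment * (1 + rate) ** (n - t) for t in range(1, n + 1))

# Closed form for an ordinary annuity: PMT * [(1 + r)^n - 1] / r
fv_closed_form = payment * ((1 + rate) ** n - 1) / rate

print(round(fv_by_payment, 2), round(fv_closed_form, 2))  # 662.0 662.0
```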
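-Future value of an annuity due:
Note: annuity due payments are made or received at the beginning of each
period, and the FV of an annuity due is calculated as of the end of the last period.
Formula: FV of annuity due = FV of ordinary annuity × (1 + r), where r is the rate of return per period.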
Example: PV of a perpetuity
Kodon Corporation issues preferred stock that will pay $4.50 per
year in annual dividends beginning next year and plans to follow this dividend
policy forever. Given an 8% rate of return, what is the value of Kodon’s preferred
stock today?
Example: PV of a deferred perpetuity
Assume the Kodon preferred stock in the preceding example is scheduled to
pay its first dividend in four years and is non-cumulative (i.e., it does not pay
any dividends for the first three years). Given an 8% required rate of return,
what is the value of Kodon’s preferred stock today?
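Both Kodon examples can be sketched with the perpetuity formula PV = payment / rate, which gives the value one period before the first payment. For the deferred case the first dividend is at t = 4, so the formula gives a value at t = 3 that must then be discounted three more years:

```python
def perpetuity_pv(payment: float, rate: float) -> float:
    """Value of a perpetuity one period before its first payment."""
    return payment / rate


dividend, required_return = 4.50, 0.08

# First dividend one year from now: value today.
pv_today = perpetuity_pv(dividend, required_return)

# First dividend at t = 4: the formula gives the value at t = 3,
# which is then discounted back three years to t = 0.
pv_at_t3 = perpetuity_pv(dividend, required_return)
pv_deferred = pv_at_t3 / (1 + required_return) ** 3

print(round(pv_today, 2), round(pv_deferred, 2))  # 56.25 44.65
```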
Example: FV of an uneven cash flow stream
Using a rate of return of 10%, compute the future value of a three-year uneven
cash flow stream of 300 USD in year 1, 600 USD in year 2, and 200 USD in year 3.
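A minimal sketch of this calculation, assuming each cash flow arrives at the end of its year and is compounded to the end of year 3:

```python
# year -> cash flow in USD, assumed to arrive at the end of that year
cash_flows = {1: 300, 2: 600, 3: 200}
rate, horizon = 0.10, 3

fv_total = sum(cf * (1 + rate) ** (horizon - year) for year, cf in cash_flows.items())
print(round(fv_total, 2))  # 1223.0
```

The three pieces are 300 × 1.21 = 363, 600 × 1.10 = 660, and 200, which sum to 1,223.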
6.f Demonstrate the use of a time line in modeling and solving time value
of money problems.
Example: Computing a loan payment
Suppose you are considering applying for a $2000 loan that will be repaid
with equal end-of-year payments over the next 13 years. If the annual interest rate
for the loan is 6%, how much will your payment be?
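The payment can be sketched by solving the ordinary-annuity present value formula for the payment, PMT = PV × r / [1 − (1 + r)^(−N)]. The function name is illustrative:

```python
def loan_payment(principal: float, rate: float, n_payments: int) -> float:
    """Level end-of-period payment that fully amortizes the loan."""
    return principal * rate / (1 - (1 + rate) ** -n_payments)


payment = loan_payment(2000, 0.06, 13)
print(round(payment, 2))  # 225.92
```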
Example: Computing the number of years in an ordinary annuity
Suppose you have a $1000 ordinary annuity earning an 8% return. How
many annual end-of-year $150 withdrawals can be made?
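A sketch of the solution, assuming the annuity present value formula is solved for N: N = −ln(1 − PV × r / PMT) / ln(1 + r).

```python
from math import log


def annuity_periods(pv: float, payment: float, rate: float) -> float:
    """Number of end-of-period payments a present value can fund."""
    return -log(1 - pv * rate / payment) / log(1 + rate)


n_withdrawals = annuity_periods(1000, 150, 0.08)
print(round(n_withdrawals, 2))  # 9.9
```

The result is not a whole number: nine full $150 withdrawals can be made, plus a smaller final one.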
The connection between present values, future values, and series of cash
flows:
- The cash flow additivity principle: the present value of any stream of cash flows
equals the sum of the present values of the individual cash flows.
Example: Additivity principle
A security will make the following payments at the end of the next four
years: $100, $100, $400 and $100. Calculate the present value of these cash
flows using the concept of the present value of an annuity when the
appropriate discount rate is 10%.
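One way to apply additivity here, as a sketch, is to split the stream into a four-year $100 annuity plus a single extra $300 at the end of year 3, and to confirm that this matches discounting each cash flow directly:

```python
rate = 0.10
cash_flows = [100, 100, 400, 100]  # end of years 1 through 4

# Direct approach: discount every cash flow individually.
pv_direct = sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows, start=1))

# Additivity: a 4-year $100 ordinary annuity plus an extra $300 at t = 3.
annuity_factor = (1 - (1 + rate) ** -4) / rate
pv_additive = 100 * annuity_factor + 300 / (1 + rate) ** 3

print(round(pv_direct, 2), round(pv_additive, 2))  # 542.38 542.38
```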
MODULE 7.1 DESCRIBING DATA SETS
7.a Distinguish between descriptive statistics and inferential statistics,
between a population and a sample, among the types of measurement scales:
-Statistics: refers to data and to the methods used to analyze the data
+ Descriptive statistics: summarize the important characteristics of large
data sets.
+ Inferential statistics: procedures used to make forecasts, estimates, or
judgments about a larger set of data on the basis of the statistical characteristics of
a smaller set.
-Population: the set of all possible members of a stated group.
-Sample: a subset of the population of interest
-Types of measurement scales:
+ Nominal scale: contains the least information; observations are classified or counted with no
particular order. (A nominal scale only divides or names the surveyed groups into different
classification categories and carries no other meaning, for example urban = 1 and rural = 2. Here
one cannot say that rural is greater than urban, or vice versa.)
+ Ordinal scale: every observation is assigned to one of several categories.
These categories are ordered with respect to a specified characteristic. (In essence a
nominal scale whose categories are arranged in ascending or descending rank,
for example first/second/third place. The distances between scale points are not necessarily equal, and
differences have no meaning.)
+ Interval scale: provides relative ranking, like ordinal scales, plus the
assurance that differences between scale values are equal. (In essence an ordinal scale with
equal distances between points but no absolute zero. The relationship between points on the
scale is A > B > C > D and A − B = B − C. An important feature of the interval scale is that
it has no absolute zero: the zero point is not real, only a convention. For example, 0 degrees C
does not mean "no temperature" but "the temperature at which water changes from solid to liquid",
and temperatures can fall below 0 degrees. As a result, comparing ratios between measured values
(division) is meaningless: we cannot say that 40 degrees C is four times as hot as 10 degrees C.)
+ Ratio scale: provides ranking and equal differences between scale values
and also has a true zero point as the origin. (An interval scale with an absolute zero. Examples:
scales for physical measurements such as length, width, height, and weight; income; expenditure. Thanks to the origin and a concrete measurement standard, we
can use every mathematical and statistical tool to analyze the data and can compare ratios between measured values. We
can say that a person with an income of 10 million dong per month earns twice as much as a person with an income of 5 million dong per month.)
Order, intervals, and ratios all make sense with a ratio scale.
7.b Define a parameter, a sample statistic, and a frequency distribution:
-Parameter: describes a characteristic of a population (e.g., mean return,
standard deviation);
-Sample statistic: measures a characteristic of a sample;
-Frequency distribution: a tabular presentation of statistical data that
summarizes the data by assigning it to specified groups or intervals;
Constructing a frequency distribution:
-Step 1: Define the intervals:
+ An interval, also referred to as a class, is the set of values that an observation may take on;
+ Each interval must have a lower and an upper limit, and the intervals must be all-inclusive and non-overlapping;
+ Modal interval: the interval with the greatest frequency
-Step 2: Tally the observations:
+ Observations must be tallied, or assigned to their appropriate interval;
-Step 3: Count the observations:
+ Absolute frequency, or simple frequency, is the actual number of observations
that fall within an interval;
Example: Constructing a frequency distribution
Use the data in table A to construct a frequency distribution for the returns
on Intelco’s common stock:
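- Cumulative frequency: found by summing the absolute or relative frequencies starting
at the lowest interval and progressing through the highest.
- Frequency polygon: the midpoint of each interval is plotted on the horizontal
axis and the absolute frequency is plotted on the vertical axis.
- Population mean: μ = (X1 + X2 + … + XN) / N, where N is the number of observations in the population
- Sample mean: X̄ = (X1 + X2 + … + Xn) / n, where n is the number of observations in the sample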
Example: Population mean and sample mean
You have calculated the stock return for AXZ Corporation over the last five
years as 25%, 34%, 19%, 54% and 17%. Given this information, estimate
the mean of the distribution of returns.
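A minimal sketch of this example using Python's standard `statistics` module. The arithmetic is the same whether the five returns are treated as a population or as a sample; only the interpretation differs:

```python
from statistics import mean

returns_pct = [25, 34, 19, 54, 17]  # annual returns in percent, from the example

mean_return = mean(returns_pct)
print(mean_return)  # 29.8
```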
-Mode: the value that occurs most frequently in a data set. A data set may have more than
one mode or even no mode.
+ Unimodal: one value appears most frequently
+ Bimodal and trimodal: the data set has two or three values that appear most frequently
Example: The mode
What is the mode of the following data set?
Data set: 30%, 28%, 25%, 23%, 28%, 15% and 5%.
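The mode of this data set can be found with `statistics.multimode` (available in Python 3.8 and later), which returns every value tied for the highest frequency and so also handles bimodal or trimodal data. A short sketch:

```python
from statistics import multimode

data_pct = [30, 28, 25, 23, 28, 15, 5]  # returns in percent, from the example

modes = multimode(data_pct)
print(modes)  # [28]
```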
-Geometric mean: G = (X1 × X2 × … × Xn)^(1/n)
This equation has a solution only if the product under the radical sign is non-negative.
When calculating the geometric mean for a returns data set, it is necessary to
add 1 to each value under the radical and then subtract 1 from the result.
-Harmonic mean: used for certain computations, such as the average cost
of shares purchased over time.
For values that are not all equal: harmonic mean < geometric mean <
arithmetic mean
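The ordering of the three means can be illustrated with the standard `statistics` module (Python 3.8 and later). The values and the returns below are hypothetical, chosen only for illustration:

```python
from statistics import geometric_mean, harmonic_mean, mean

values = [2, 4, 8]  # hypothetical positive values that are not all equal

h = harmonic_mean(values)   # 3 / (1/2 + 1/4 + 1/8), about 3.43
g = geometric_mean(values)  # (2 * 4 * 8) ** (1/3) = 4
a = mean(values)            # 14 / 3, about 4.67
print(h < g < a)  # True

# Geometric mean return: add 1 to each return, then subtract 1 from the result.
returns = [0.10, -0.05, 0.20]  # hypothetical annual returns
geo_return = geometric_mean([1 + r for r in returns]) - 1
```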
- Holding period return:
-When the location, Ly, of the y-th percentile, Py, in the sorted data is a whole number, the location corresponds to an actual
observation.
-When Ly is not a whole number, Ly lies between the two closest integers
(one above and one below), and we use linear interpolation between those
two places to determine Py.
Investment analysts use quantiles every day to rank performance, for example the
performance of portfolios. The performance of investment managers is often
characterized in terms of the quartile in which they fall relative to the performance
of their peer group of managers.
LOS 7.g: Calculate and interpret 1) a range and a mean absolute deviation
and 2) the variance and standard deviation of a population and of a sample.
-Dispersion is defined as the variability around the central tendency.
+ Central tendency: measure of the reward;
+ Dispersion: measure of risk;
-Range: is the distance between the largest and the smallest value
(maximum value – minimum value);
-Population standard deviation: the square root of the population
variance, calculated as follows:
σ = √[ Σ (Xi − μ)^2 / N ]
-Chebyshev’s inequality: the percentage of the observations that lie within k
standard deviations of the mean is at least 1 − 1/k^2 for all k > 1.
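A minimal sketch of the bound 1 − 1/k^2; the function name is illustrative:

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum proportion of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's inequality is only informative for k > 1")
    return 1 - 1 / k ** 2


for k in (1.5, 2, 3):
    print(k, round(chebyshev_lower_bound(k), 4))
# 1.5 0.5556
# 2 0.75
# 3 0.8889
```

The bound holds for any distribution, so for example at least 75% of observations lie within two standard deviations of the mean regardless of the distribution's shape.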
+ For a symmetrical, unimodal distribution, the mean, median and
mode are equal
-A symmetric distribution has skewness of 0;
- A positively skewed distribution has positive skewness;
- A negatively skewed distribution has negative skewness;
LOS 7.l: Explain measures of sample skewness and kurtosis.
-Kurtosis: a measure of the combined weight of the tails of a distribution relative to
the rest of the distribution, that is, the proportion of the total probability in the tails;
+ Leptokurtic: a distribution with fatter tails than a normal distribution; it tends to
generate more frequent extremely large deviations from the mean than a normal
distribution;
+ Platykurtic: a distribution with thinner tails than a normal distribution;
+ Mesokurtic: a distribution with the same kurtosis as a normal distribution;
Most risk managers focus more on the distribution of returns in the tails of
the distribution. In general, greater positive kurtosis and more negative skew indicate
increased risk.
Measures of sample skew and kurtosis:
-Sample skewness: equal to the sum of the cubed deviations from the mean
divided by the cubed standard deviation and by the number of observations.
-Geometric mean:
+ is used for multi-year returns and captures how the total returns are linked
over time.
-Arithmetic mean:
+ the average return over a one-year horizon;
+ is used in a forward-looking context.
E.g.: A = the portfolio earns a return of 10 percent, B = the portfolio earns a
return below 10 percent, and C = the portfolio earns a return above 10 percent
-Objective probability:
+ Empirical probability: established by analyzing past data;
+ A priori probability: based on logical analysis rather than on
observation or personal judgment;
-Subjective probability: involves the use of personal judgment;
LOS 8.c: State the probability of an event in terms of odds for and against the
event.
-Odds: stating the odds that an event will or will not occur is an alternative way of expressing
probabilities;
(Odds are defined similarly to probability, but they are a ratio of two probabilities. Specifically,
odds are the ratio of the probability that an event occurs to the probability that it does not occur.)
E.g.: an event has a probability of occurrence of 0.125 (1/8):
- The odds that the event will occur are (1/8)/(7/8) = 1/7, or one to seven;
- The odds against the event occurring are the reciprocal of 1/7, or seven to
one;
1. Odds for E = P(E)/[1 − P(E)]. The odds for E are the probability of E divided by
1 minus the probability of E. Given odds for E of “a to b,” the implied probability
of E is a/(a + b).
If a race horse runs 100 races and wins 25 times and loses the other 75 times, the
probability of winning is 25/100 = 0.25 or 25%, but the odds of the horse winning
are 25/75 = 0.333, or 1 win to 3 losses.
2. Odds against E = [1 − P(E)]/P(E), the reciprocal of odds for E. Given odds
against E of “a to b,” the implied probability of E is b/(a + b).
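The conversions between probabilities and odds can be sketched with exact fractions from the standard library; the function names are illustrative:

```python
from fractions import Fraction


def odds_for(p: Fraction) -> Fraction:
    """Odds for an event: P(E) / [1 - P(E)]."""
    return p / (1 - p)


def odds_against(p: Fraction) -> Fraction:
    """Odds against an event: [1 - P(E)] / P(E)."""
    return (1 - p) / p


def prob_from_odds_for(a: int, b: int) -> Fraction:
    """Implied probability given odds for the event of 'a to b'."""
    return Fraction(a, a + b)


p_event = Fraction(1, 8)  # probability of 0.125
print(odds_for(p_event), odds_against(p_event))  # 1/7 7

p_win = Fraction(25, 100)  # the race horse example
print(odds_for(p_win), prob_from_odds_for(1, 3))  # 1/3 1/4
```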
LOS 8.d: Distinguish between unconditional and conditional probabilities.
- Addition rule of probability: used to determine the probability that at
least one of two events will occur: P(A or B) = P(A) + P(B) − P(AB). If A and B share any outcomes, the joint probability P(AB) must be subtracted so that those outcomes are not counted twice.
LOS 8.f: Calculate and interpret 1) the joint probability of two events, 2) the
probability that at least one of two events will occur, given the probability of
each and the joint probability of the two events, and 3) a joint probability of
any number of independent events.
-Joint probability: the joint probability of two events is the probability that they will both occur.
+ Multiplication rule of probability: P(AB) = P(A | B) × P(B)
Calculating a joint probability of any number of independent events: P(A and B and C and …) = P(A) × P(B) × P(C) × …
LOS 8.h: Calculate and interpret an unconditional probability using the total
probability rule.
-Total probability rule: highlights the relationship between unconditional
and conditional probabilities of mutually exclusive and exhaustive events.
-Expected variance: The variance is calculated as the probability-weighted
sum of the squared differences between each possible outcome and expected EPS
- Tree diagram:
LOS 8.k: Calculate and interpret covariance and correlation and interpret a
scatterplot.
-Covariance: measure how two assets move together;
-Sample covariance: determines covariance using historical data
-Correlation coefficient: ρ(X,Y) = Cov(X,Y) / (σX × σY), which always lies between −1 and +1
- Scatterplots are a method for displaying the relationship between two
variables. With one variable on the vertical axis and the other on the horizontal
axis, their paired observations can each be plotted as a single point.
A key advantage of creating scatterplots is that they can reveal non-linear
relationships, which are not described by the correlation coefficient.
Limitations of Correlation Analysis.
Correlation measures the linear association between two variables, but it
may not always be reliable. Two variables can have a strong nonlinear relation and
still have a very low correlation.
For example, the relation Y = (X − 4)^2 is a nonlinear relation, in contrast to the
linear relation Y = 2X − 4. Even though X and Y are perfectly associated through Y = (X − 4)^2,
there is no linear association between them.
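This point can be checked numerically. The sketch below computes the Pearson correlation by hand and assumes X takes the integer values 0 to 8, which are symmetric around 4; that choice of X values is an assumption made for illustration, and it is this symmetry that makes the correlation for Y = (X − 4)^2 come out to zero:

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


xs = list(range(0, 9))                  # assumed X values, symmetric around 4
nonlinear = [(x - 4) ** 2 for x in xs]  # Y = (X - 4)^2
linear = [2 * x - 4 for x in xs]        # Y = 2X - 4

r_nonlinear = pearson(xs, nonlinear)    # approximately 0
r_linear = pearson(xs, linear)          # approximately 1
```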
Role of outliers (extreme values) in the correlation of two variables: if
removing the outliers significantly reduces the calculated correlation, further
inquiry is necessary into whether the outliers provide information or are caused by
noise (randomness) in the data used.
- Spurious correlation refers to correlation that is either the result of chance
relationships in a particular data set or present due to changes in both variables over
time that are caused by their association with a third variable.
8.l Calculate and interpret the expected value, variance and standard
deviation of a random variable and of return on a portfolio.
-Weight of portfolio asset:
-Portfolio variance:
LOS 8.n: Calculate and interpret an updated probability using Bayes’
formula.
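-Bayes’ formula: updates a given set of prior probabilities for a given event
in response to the arrival of new information:
Updated probability of the event given the new information = (probability of the new information given the event / unconditional probability of the new information) × prior probability of the event
-Combination formula: general formula for labeling when k=2;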
-Multiplication rule of counting: if one task can be done in n1 ways, and
a second task, given the first, can be done in n2 ways, and a third task, given the
first two tasks, can be done in n3 ways, and so on for k tasks, then the number of
ways the k tasks can be done is (n1)(n2)(n3) … (nk).
-Factorial: used to assign every member of a group of size n to one of n slots (or
tasks);
- The labeling formula applies to three or more subgroups of predetermined
size. Each element of the entire group must be assigned a place, or label, in one of
the three or more subgroups.
- The combination formula applies to only two groups of predetermined
size. Look for the word “choose” or “combination.”
- The permutation formula applies to only two groups of predetermined size.
Look for a specific reference to “order” being important.
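The counting rules above can be sketched with Python's `math` module (`comb` and `perm` need Python 3.8 or later). The group of 8 items and the subgroup sizes are hypothetical numbers chosen for illustration, and `labeling` is an illustrative helper implementing n! / (n1! × n2! × … × nk!):

```python
from math import comb, factorial, perm


def labeling(n: int, group_sizes: list) -> int:
    """Number of ways to label n items into subgroups of the given sizes."""
    if sum(group_sizes) != n:
        raise ValueError("subgroup sizes must add up to n")
    ways = factorial(n)
    for size in group_sizes:
        ways //= factorial(size)
    return ways


print(factorial(8))             # 40320 ways to assign 8 items to 8 slots
print(labeling(8, [4, 3, 1]))   # 280 ways to label 8 items into groups of 4, 3 and 1
print(comb(8, 3))               # 56 ways to choose 3 of 8 when order does not matter
print(perm(8, 3))               # 336 ways to choose 3 of 8 when order matters
```

With only two subgroups, the labeling formula reduces to the combination formula: `labeling(8, [3, 5])` equals `comb(8, 3)`.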
READING 9: COMMON PROBABILITY DISTRIBUTIONS
Module 9.1: Uniform and binomial distribution:
LOS 9.a: Define a probability distribution and distinguish between discrete and
continuous random variables and their probability functions.
LOS 9.b: Describe the set of possible outcomes of a specified discrete random
variable.
-A probability distribution: describes the probabilities of all possible outcomes for a
random variable.
+ The probabilities of all possible outcomes must sum to 1.
E.g.: a simple probability distribution is that for the roll of one fair die; there are
six possible outcomes and each one has a probability of 1/6, so they sum to 1.
- Random variable: a quantity whose future outcomes are uncertain.
-A discrete random variable: the number of possible outcomes can be counted;
+ For each possible outcome, there is a measurable and positive probability;
E.g.: the number of days it will rain in a given month
-A probability function: p(x) specifies the probability that a random variable
is equal to a specific value.
Two key properties of a probability function are:
+ 0 ≤ p(x) ≤ 1;
+ ∑ p(x) = 1;
+ The probability for a range of outcomes is p(x) × k, where k is the number of
possible outcomes in the range.
E.g.: Suppose that the possible outcomes are the integers (whole numbers) 1
to 8, inclusive, and the probability that the random variable takes on any of these
possible values is the same for all outcomes (that is, it is uniform). With eight
outcomes, p(x) = 1/8, or 0.125, for all values of X (X = 1, 2, 3, 4, 5, 6, 7, 8). The
distribution has a finite number of specified outcomes, and each outcome is
equally likely.
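The binomial probability function, which gives the probability of x successes in n
trials, can be expressed using the following formula:
p(x) = [n! / ((n − x)! × x!)] × p^x × (1 − p)^(n − x), where p is the probability of success on each trial.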
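A minimal sketch of the binomial probability function, p(x) = [n! / ((n − x)! x!)] × p^x × (1 − p)^(n − x), using `math.comb` for the number of combinations. The inputs (5 trials, success probability 0.5) are hypothetical:

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


# Hypothetical example: exactly 3 successes in 5 trials with p = 0.5.
prob_3_of_5 = binomial_pmf(3, 5, 0.5)
print(prob_3_of_5)  # 0.3125

# The probabilities over all possible outcomes sum to 1.
total = sum(binomial_pmf(x, 5, 0.5) for x in range(6))
```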
LOS 9.g: Construct a binomial tree to describe stock price movement.
-A binomial tree: showing all possible combination of up-moves and down-
moves over a number of successive periods;
LOS 9.h: Define the continuous uniform distribution and calculate and
interpret probabilities, given a continuous uniform distribution.
-Continuous uniform distribution: defined over a range that spans between
some lower limit, a, and some upper limit, b.
+ Even if a < x < b, P(X = x) = 0, because the probability of any single point in a continuous distribution is zero;
MODULE 9.2: NORMAL DISTRIBUTION
LOS 9.i: Explain the key properties of the normal distribution.
LOS 9.j: Distinguish between a univariate and a multivariate distribution and
explain the role of correlation in the multivariate normal distribution.
A univariate distribution describes a single random variable.
A multivariate distribution specifies the probabilities associated with a
group of random variables.
- The relationship between the return on a given stock and the return on the S&P 500 or some other
market index will have special significance;
- Regardless of the specific variables, the simultaneous analysis of two or
more random variables requires an understanding of multivariate distributions.
The multivariate normal distribution for the returns of n assets can be defined
by the following 3 sets of parameters:
+ the n means of the n series of returns;
+ the n variances of the n series of returns;
+ the n(n − 1)/2 pairwise correlations.
LOS 9.l: Define the standard normal distribution, explain how to standardize
a random variable, and calculate and interpret probabilities using the
standard normal distribution.
There are as many different normal distributions as there are choices for mean (μ)
and variance (σ^2). For the sake of efficiency, however, we would like to refer all
probability statements to a single normal distribution.
-Standard normal distribution: a normal distribution that has been
standardized so that it has a mean of zero and a standard deviation of 1, N(0,1);
+ Standardization: the process of converting an observed value for a
random variable to its z-value: z = (x − μ) / σ.
- Formula for Roy’s safety-first criterion: SFRatio = [E(Rp) − RL] / σp, where E(Rp) is the expected portfolio return, RL is the threshold (minimum acceptable) return, and σp is the portfolio standard deviation.
LOS 9.n: Explain the relationship between normal and lognormal
distributions and why the lognormal distribution is used to model asset prices.
- Continuously compounded rates of return are additive for multiple periods;
In general, the holding period return after T years, when the annual continuously
compounded rate is Rcc, is given by HPR_T = e^(Rcc × T) − 1.
Given investment results over a 2-year period, we can calculate the 2-year
continuously compounded return and divide by two to get the annual rate.
Consider an investment that appreciated from $1,000 to $1,221.40 over a 2-year
period. The 2-year continuously compounded rate is ln(1,221.40 / 1,000) = 20%,
and the annual continuously compounded rate (Rcc) is 20% / 2 = 10%.
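The numbers in this example can be reproduced with a short sketch, assuming that the continuously compounded return over a period is ln(ending value / beginning value) and that the holding period return after T years is e^(Rcc × T) − 1:

```python
from math import exp, log

begin_value, end_value, years = 1000.0, 1221.40, 2

two_year_rcc = log(end_value / begin_value)  # approximately 0.20
annual_rcc = two_year_rcc / years            # approximately 0.10

# Converting back: holding period return implied by the annual rate.
hpr = exp(annual_rcc * years) - 1            # approximately 0.2214

print(round(two_year_rcc, 4), round(annual_rcc, 4), round(hpr, 4))  # 0.2 0.1 0.2214
```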
LOS 9.p: Explain Monte Carlo simulation and describe its applications and
limitations.
-Monte Carlo simulation: is a technique based on the repeated generation of
one or more risk factors that affect security values to generate a distribution of
security values.
1. Specify the quantity of interest, here the value of an Asian call option on a stock. Let CiT represent the value of the option at
maturity T. The subscript i in CiT indicates that CiT is a value resulting from the
ith simulation trial.
2. Specify a time grid. Take the horizon in terms of calendar time and split it into a
number of subperiods, say K in total.
3. Specify distributional assumptions for the risk factors that drive the underlying
variables.
- For example, the stock price is the underlying variable for the Asian call, so we
need a model for stock price movement.
Finally, calculate the mean value of Ci0, the present value of CiT, over the total number of simulation trials. This mean value
is the Monte Carlo estimate of the value of the Asian call.
READING 10: SAMPLING AND ESTIMATION
LOS 10.c: Distinguish between simple random and stratified random
sampling.
-Stratified random sampling: uses a classification system to separate the
population into smaller groups based on one or more distinguishing characteristics.
+ From each subgroup, or stratum, a random sample is taken and the results
are pooled.
+ The size of the sample from each stratum is based on the size of the
stratum relative to the population.
LOS 10.d: Distinguish between time-series and cross-sectional data.
-Time-series data: consist of observations taken over a period of time at
specific and equally spaced time intervals.
- Cross-sectional data: a sample of observations taken at a single point in
time.
- Longitudinal data: observations over time of multiple characteristics of the
same entity, such as unemployment, inflation, and GDP growth rates for a country
over 10 years.
- Panel data: observations over time of the same characteristic for
multiple entities, such as debt/equity ratios for 20 companies over the most recent
24 quarters.
LOS 10.e: Explain the central limit theorem and its importance.
When the population standard deviation is not known, the sample
standard deviation, s, is used in its place to compute the standard error of the sample mean: s / √n.
LOS 10.g: Identify and describe desirable properties of an estimator.
Desirable properties of an estimator are unbiasedness, efficiency and consistency.
-An unbiased estimator: the expected value of the estimator is equal to the
parameter you are trying to estimate. E.g., the expected value of the sample mean equals the population mean.
-An efficient estimator: an unbiased estimator is also efficient if the variance of its sampling
distribution is smaller than that of all the other unbiased estimators.
+ The sample mean is an unbiased and efficient estimator of the population
mean.
+ The sample variance is an efficient estimator of the population variance.
-A consistent estimator: the accuracy of the parameter estimate increases as the
sample size increases.
+ As the sample size increases, the standard error of the sample mean falls,
and the sampling distribution bunches more closely around the population mean.
MODULE 10.2: CONFIDENCE INTERVAL AND T-DISTRIBUTIONS
LOS 10.h: Distinguish between a point estimate and a confidence interval
estimate of a population parameter.
-Point estimates: single values used to estimate population parameters.
+ The formula used to compute the point estimate is called the estimator.
+ E.g.: the sample mean is an estimator of the population mean μ.
- Characteristics of the t-distribution:
+ it is a symmetrical distribution centered around zero;
+ it has fatter tails than the normal distribution, so outliers are more likely;
+ as the number of observations (and so the degrees of freedom) increases, the t-distribution becomes more
spiked and its tails become thinner;
+ as the number of degrees of freedom increases without bound, the t-distribution
converges to the z-distribution (standard normal);
+ with a hypothesis test using the t-distribution it is more difficult to reject the null than with a
hypothesis test using the z-distribution;
+ this means that confidence intervals for a random variable that follows a
t-distribution must be wider (narrower) when degrees of freedom are fewer (more)
for a given significance level;
+ the greater the degrees of freedom, the greater the percentage of
observations near the center of the distribution and the lower the percentage of observations in the
tails.
LOS 10.j: Calculate and interpret a confidence interval for a population
mean, given a normal distribution with 1) a known population variance, 2) an
unknown population variance, or 3) an unknown population variance and a
large sample size.
-Confidence interval: a range of values within which the actual value of a parameter
will lie with a probability of 1 − α;
+ α is called the level of significance;
+ 1 − α is the degree of confidence that the interval will contain the parameter it is intended to
estimate;
+ E.g.: the population mean of a random variable will range from 15 to 25 with a
95% degree of confidence.
+ Confidence intervals take on the following form:
point estimate ± (reliability factor × standard error)
The most basic confidence interval for the population mean arises when
sampling from a normal distribution with known variance. The reliability factor in
this case is based on the standard normal distribution with a mean of 0 and a variance of 1.
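A sketch of this case, assuming the interval takes the form sample mean ± z × σ / √n, where z is the standard normal reliability factor for the chosen confidence level. The inputs (sample mean 80, population standard deviation 15, n = 36) are hypothetical; `NormalDist` comes from Python's standard `statistics` module (3.8 and later):

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, pop_sd, n, confidence=0.95):
    """Confidence interval for the mean when the population variance is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 95% confidence
    half_width = z * pop_sd / sqrt(n)        # reliability factor x standard error
    return sample_mean - half_width, sample_mean + half_width


# Hypothetical inputs: sample mean 80, known sigma 15, sample size 36.
lower, upper = z_confidence_interval(80, 15, 36)
print(round(lower, 2), round(upper, 2))  # 75.1 84.9
```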
Confidence intervals can be interpreted from a probabilistic perspective or a
practical perspective (in practice, we rarely know the population variance):
Confidence interval for a population mean when the population variance is
unknown, given a large sample from any type of distribution
If the sample is not random, the central limit theorem doesn’t apply and we cannot
form unbiased confidence intervals.
LOS 10.k: Describe the issues regarding selection of the appropriate sample
size, data-mining bias, sample selection bias, survivorship bias, look-ahead
bias, and time-period bias.
-Data mining: analysts repeatedly use the same database to search for
patterns or trading rules;
-Data-mining bias: the statistical significance of a pattern is
overestimated because the results were found through data mining;
+ The best way to avoid data-mining bias is to test the pattern using out-of-sample data.
-Sample selection bias: some data is systematically excluded from the
analysis, usually because of a lack of availability.
+ This causes the observed sample to be nonrandom;
+ Conclusions drawn from this sample can’t be applied to the population.
-Survivorship bias: a common type of sample selection bias
-Look-ahead bias: a study tests a relationship using sample data that was not
available on the test date.
-Time-period bias: arises if the time period over which the data is gathered is
either too short or too long.
+ If the time period is too short, research results may reflect phenomena specific
to that time period, or even data mining.
+ If the time period is too long, the fundamental economic relationships
underlying the results may have changed.
READING 11: HYPOTHESIS TESTING
The Null Hypothesis and Alternative Hypothesis:
- The null hypothesis (H0): the hypothesis that the researcher wants to reject.
-The alternative hypothesis (Ha): what is concluded if there is sufficient evidence
to reject the null hypothesis. It is the hypothesis that the researcher is really trying to assess.
- The alternative hypothesis can be one-sided or two-sided.
+ A one-sided test is referred to as a one-tailed test;
+ A two-sided test is referred to as a two-tailed test;
- A two-tailed test for the population mean may be structured as:
H0: μ = μ0 versus Ha: μ ≠ μ0
- A two-tailed test uses two critical values; the general decision rule is:
Reject H0 if: test statistic > upper critical value or test statistic < lower
critical value.
- For a one-tailed hypothesis test of the population mean, the null and alternative
hypotheses are either:
Upper tail: H0: μ ≤ μ0 versus Ha: μ > μ0, or
Lower tail: H0: μ ≥ μ0 versus Ha: μ < μ0
LOS 11.c: Explain a test statistic, Type I and Type II errors, a significance
level, and how significance levels are used in hypothesis testing.
+ Population standard deviation is not known:
Or
The range within which we fail to reject the null for a two-tailed hypothesis
test at a given significance level.
LOS 11.g: Identify the appropriate test statistic and interpret the results for a
hypothesis test concerning the population mean of both large and small
samples when the population is normally or approximately normally
distributed and the variance is 1) known or 2) unknown.
The t-test: employs a test statistic that is distributed according to a t-distribution;
Use the t-test if the population variance is unknown and either of the
following conditions exists:
+ The sample is large (n ≥ 30);
+ The sample is small (less than 30), but the distribution of the population is
normal or approximately normal.
-If the sample is small and the distribution is nonnormal, there is no reliable statistical
test.
-The t-statistic with n − 1 degrees of freedom is computed as:
t = (x̄ − μ0) / (s / √n), where x̄ is the sample mean, μ0 is the hypothesized population mean, s is the sample standard deviation, and n is the sample size.
The z-test:
The z-test is the appropriate hypothesis test of the population mean when the
population is normally distributed with known variance.
The z-statistic for a hypothesis test for a population mean is computed as
follows:
z = (x̄ − μ0) / (σ / √n), where σ is the population standard deviation.
When the sample size is large and the population variance is unknown, the z-statistic is:
z = (x̄ − μ0) / (s / √n)
This is acceptable if the sample size is large, although the t-statistic is the
more conservative measure when the population variance is unknown.
MODULE 11.3: MEAN DIFFERENCES, DIFFERENCE IN MEANS
LOS 11.h: Identify the appropriate test statistic and interpret the results for a
hypothesis test concerning the equality of the population means of two at least
approximately normally distributed populations, based on independent
random samples with 1) equal or 2) unequal assumed variances
-A pooled variance is used with the t-test for testing the hypothesis that the
means of two normally distributed populations are equal, when the variances of the
populations are unknown but assumed to be equal.
Assuming independent samples, the t-statistic in this case is computed as:
The t-test for equality of population means when the populations are
normally distributed and have variances that are unknown and assumed to be
unequal uses the sample variances for both populations.
Assuming independent samples, the t-statistic in this case is computed as
follows:
LOS 11.i: Identify the appropriate test statistic and interpret the results for a
hypothesis test concerning the mean difference of two normally distributed
populations.
-If the observations in the two samples both depend on some other factor,
use a “paired comparisons” test
+ Paired comparisons test: a test of whether the mean of the differences
between paired observations for the two samples is different from zero (or from another hypothesized value);
+ The paired comparisons test requires that the sample data be normally
distributed.
The general form of the test for any hypothesized mean difference, μdz, is as follows:
LOS 11.j: Identify the appropriate test statistic and interpret the results for a
hypothesis test concerning 1) the variance of a normally distributed
population, and 2) the equality of the variances of two normally distributed
populations based on two independent random samples.
-The chi-square test is used for hypothesis tests concerning the variance of a
normally distributed population.
- The hypotheses for a two-tailed test of a single population variance are
structured as:
H0: σ^2 = σ0^2 versus Ha: σ^2 ≠ σ0^2, where σ0^2 is the hypothesized variance
The chi-square test compares the test statistic, χ^2, to a critical chi-square
value at a given level of significance and n − 1 degrees of freedom.
Testing the equality of the variances of two normally distributed populations,
based on two independent random samples
-The F-test is used under the assumption that the populations from which
samples are drawn are normally distributed and that the samples are independent.
- The upper critical value is always greater than one;
- The lower critical value is always less than one;
LOS 11.k: Formulate a test of the hypothesis that the population correlation
coefficient equals zero and determine whether the hypothesis is rejected at a
given level of significance.
Correlation measures the strength of the relationship between two variables.
If the correlation between two variables is zero, there is no linear relationship
between them.
When the sample correlation coefficient for two variables is different from zero,
we must address the question of whether the true population correlation coefficient
(ρ) is equal to zero.
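The appropriate test statistic for the hypothesis that the population correlation
equals zero, when the two variables are normally distributed, is:
t = r × √(n − 2) / √(1 − r^2), with n − 2 degrees of freedom, where r is the sample correlation and n is the number of observations.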
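A minimal sketch of the test statistic for H0: ρ = 0, assuming the form t = r × √(n − 2) / √(1 − r^2) with n − 2 degrees of freedom. The sample correlation of 0.40 and sample size of 32 are hypothetical inputs:

```python
from math import sqrt


def correlation_t_stat(r: float, n: int) -> float:
    """t-statistic for testing whether a population correlation is zero."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


# Hypothetical inputs: sample correlation 0.40 from 32 paired observations.
t_stat = correlation_t_stat(0.40, 32)
degrees_of_freedom = 32 - 2
print(round(t_stat, 4), degrees_of_freedom)  # 2.3905 30
```

The computed statistic is then compared with the critical t-value for the chosen significance level and n − 2 degrees of freedom. For a given r, a larger sample gives a larger test statistic.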
The Spearman rank correlation test can be used when the data are not
normally distributed.
+ A large positive value of the Spearman rank correlation, such as
0.85, would indicate that a high (low) rank in one year is associated with a high
(low) rank in the second year.
+ Alternatively, a large negative rank correlation would indicate that a
high rank in year 1 suggests a low rank in year 2, and vice versa.