3-Basic Stats


Basic Statistical Concepts

Learning Objectives :
1. Probability Density function
2. Normal Distribution
3. Correlation and Covariance
4. Various methods of computing Volatility
5. Vasicek Model
Probability Density Function- Basic Concepts
❑ Let X be a continuous random variable. A probability density function (pdf) of X is a function f(x) such that, for any two numbers a and b with a ≤ b,
$P(a \le X \le b) = \int_a^b f(x)\,dx$
❑ That is, the probability that X takes a value in the interval [a, b] is the area above this interval and under the graph of the density function. The graph of f(x) is often referred to as the density curve.
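❑ For f(x) to be a legitimate pdf, it must satisfy the following two conditions:
✓ $f(x) \ge 0$ for all x
✓ $\int_{-\infty}^{\infty} f(x)\,dx = 1$ (the total area under the curve is 1)
[Figure: density curve f(x) plotted against X, with the points a and b marked on the X axis]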
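The two pdf conditions and the "probability as area" idea can be checked numerically. The sketch below is only an illustration: it assumes a standard normal density as the example f(x), and the helper name `integrate` and the integration limits of ±10 are choices made here, not part of the slides.

```python
from statistics import NormalDist

def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the trapezoidal rule."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for k in range(1, steps):
        total += f(a + k * h)
    return total * h

# Assumed example density: standard normal (mu = 0, sigma = 1).
f = NormalDist(0.0, 1.0).pdf

# Condition 1: f(x) >= 0, checked on a grid of points from -10 to 10.
non_negative = all(f(x / 10.0) >= 0.0 for x in range(-100, 101))

# Condition 2: total area is 1 (the tails beyond +/-10 are negligible).
total_area = integrate(f, -10.0, 10.0)

# P(a <= X <= b) is the area under f between a = -1 and b = 1.
prob_ab = integrate(f, -1.0, 1.0)

print(non_negative, round(total_area, 6), round(prob_ab, 4))
```

The same `integrate` helper works for any other density passed in as `f`.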
Cumulative Distribution Function- Basic Concepts
❑ The cumulative distribution function (CDF) F(x) of a discrete random variable X gives, for every number x, the probability P(X ≤ x)
❑ It is obtained by summing the probability mass function over all possible values y with y ≤ x
❑ The cumulative distribution function F(x) of a continuous rv X is defined for every number x by
$F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy$

[Figure: CDF curve, vertical axis running from 0 to 1.20]
Normal Distribution
❑ Of all the distribution functions, Normal Distribution is amongst the most important

❑Numerous economic and physical measures and indicators are normally distributed

❑A continuous rv X is said to have a normal distribution with parameters μ and σ, if the pdf of X is

$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$

❑Each density curve is symmetric about μ and bell-shaped, so the center of the bell (point of
symmetry) is both the mean of the distribution and the median.

❑ For μ= 0 and σ= 1, the pdf is called Standard Normal Distribution

❑ The Standard Normal Distribution is a reference distribution from which information on other normal distributions can be obtained using z = (x − μ)/σ. z is negative for values to the left of the mean

❑ For any z value, the area under the curve to the left of z can be found from standard normal tables
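As a rough illustration of the density formula and of standardising with z = (x − μ)/σ, the sketch below codes the normal pdf directly and compares it with Python's standard-library `statistics.NormalDist`. The parameters μ = 100, σ = 15 and the point x = 85 are made-up values chosen only for this example.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density written out from the formula."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Hypothetical parameters, for illustration only.
mu, sigma, x = 100.0, 15.0, 85.0

# Standardise: how many standard deviations x lies from the mean.
z = (x - mu) / sigma

# Area to the left of z under the standard normal curve (the "table lookup").
p_left = NormalDist().cdf(z)

print(z, round(normal_pdf(x, mu, sigma), 6), round(p_left, 4))
```

The area to the left of z under the standard normal equals the area to the left of x under the original distribution, which is why one reference table is enough.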
Important Z values and Percentiles

Critical Z Values and Percentiles

Percentile    95       99       99.5     99.9
Z value       1.645    2.33     2.58     3.09

✓ About 68% of the population lies within 1 SD of the mean
✓ About 95% of the population lies within 2 SD of the mean
✓ About 99.7% of the population lies within 3 SD of the mean
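The critical values and the 1, 2 and 3 SD coverage figures can be reproduced with the standard library rather than a printed table. This is a minimal sketch; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# z value below which a fraction p of the distribution lies.
percentiles = [0.95, 0.99, 0.995, 0.999]
critical_z = {p: std_normal.inv_cdf(p) for p in percentiles}

def mass_within(k):
    """Probability of falling within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

within = {k: mass_within(k) for k in (1, 2, 3)}

for p, z in critical_z.items():
    print(f"{p:.3f} -> z = {z:.3f}")
for k, mass in within.items():
    print(f"within {k} SD: {mass:.1%}")
```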
Normal Probability Distributions
Let p = 0.99, Mean µ = 0, Std Dev σ = 1
✓ z value for a given p: z = F⁻¹(p) = 2.33, from =NORM.INV(p, mean, SD)
✓ P value for a given z: P(z) = 0.99, from =NORM.DIST(z, mean, SD, TRUE)

X        pdf     CDF
-4.00    0.00    0.00
-3.50    0.00    0.00
-3.00    0.00    0.00
-2.50    0.02    0.01
-2.00    0.05    0.02
-1.50    0.13    0.07
-1.00    0.24    0.16
-0.50    0.35    0.31
 0.00    0.40    0.50
 0.50    0.35    0.69
 1.00    0.24    0.84
 1.50    0.13    0.93
 2.00    0.05    0.98
 2.50    0.02    0.99
 3.00    0.00    1.00
 3.50    0.00    1.00
 4.00    0.00    1.00

[Figure: PDF and CDF plotted against X]
Inverse Normal Function
✓ Enter a p value (say 0.8)
✓ Use the NORM.INV function to compute the z value
✓ This gives the z value corresponding to an area of 80% to its left
✓ In other words, we are finding z using the inverse function F⁻¹(p)
✓ We are thus inverting the CDF
[Figure: two panels, the CDF (top) and the pdf (bottom)]
Typical problem

❑ Consider a portfolio with a mean return of 15% and SD of 25%.


✓ What is the probability that the portfolio returns will be between 10% and 20% assuming normal distribution
✓ What is the probability that the returns will be negative ?

                    Between 10% and 20%    Negative returns
Mean                15.00%                 15.00%
SD                  25.00%                 25.00%
Lower Bound         10%                    -Infinity
Upper Bound         20%                    0%
Z Value-LB          -0.20                  NA
Z Value-UB          0.20                   -0.60
CDF Lower Bound     42%                    0%
CDF Upper Bound     58%                    27%
PROBABILITY         15.9%                  27.4%
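The two probabilities in this problem can be checked in a few lines. The sketch below assumes, as the problem does, that returns are normally distributed with mean 15% and SD 25%; the variable names are illustrative.

```python
from statistics import NormalDist

# Portfolio returns: mean 15%, standard deviation 25% (from the problem).
returns = NormalDist(mu=0.15, sigma=0.25)

# P(10% <= return <= 20%) = CDF(upper bound) - CDF(lower bound).
p_between = returns.cdf(0.20) - returns.cdf(0.10)

# P(return < 0): the lower bound is minus infinity, so its CDF is 0.
p_negative = returns.cdf(0.0)

print(f"between 10% and 20%: {p_between:.1%}")
print(f"negative:            {p_negative:.1%}")
```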
Question (Hull 1.17)

✓A bank estimates that its profit next year is normally distributed with a mean of 0.8% of assets and
the standard deviation of 2% of assets. How much equity (as a percentage of assets) does the
company need to be (a) 99% sure that it will have a positive equity at the end of the year and (b)
99.9% sure that it will have positive equity at the end of the year? Ignore taxes.

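                                      (a) 99%      (b) 99.9%
Mean                                  0.80%        0.80%
SD                                    2.00%        2.00%
Z value for the given probability     2.326        3.090
(Be careful about sign: the loss lies in the left tail, so z enters as a negative number)
Using z = (X-μ)/σ, we get X as        -3.85%       -5.38%

✓ The bank therefore needs equity of 3.85% of assets to be 99% sure, and 5.38% of assets to be 99.9% sure, of having positive equity at the end of the year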
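The same calculation can be sketched in code: find the profit level that is undershot only 1% (or 0.1%) of the time, and hold enough equity to absorb it. The function name `equity_needed` is an illustrative choice, not from Hull.

```python
from statistics import NormalDist

# Profit as a fraction of assets: mean 0.8%, standard deviation 2%.
profit = NormalDist(mu=0.008, sigma=0.02)

def equity_needed(confidence):
    """Equity (fraction of assets) that absorbs the worst profit at this confidence."""
    # Left-tail quantile: profit falls below this level with probability 1 - confidence.
    worst_profit = profit.inv_cdf(1.0 - confidence)
    # If even the worst-case profit were positive, no equity would be needed.
    return max(0.0, -worst_profit)

equity_99 = equity_needed(0.99)
equity_999 = equity_needed(0.999)

print(f"99%   sure: {equity_99:.2%} of assets")
print(f"99.9% sure: {equity_999:.2%} of assets")
```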
Two other properties : Skew and Kurtosis
$r\text{th moment} = E\left[(X-\mu)^r\right]$

$\text{Skew} = \frac{E\left[(X-\mu)^3\right]}{\sigma^3}$        $\text{Kurtosis} = \frac{E\left[(X-\mu)^4\right]}{\sigma^4}$

Skew
✓ In case of Positive Skew, very HIGH returns are MORE likely and very LOW returns are LESS likely
✓ In case of Negative Skew, very HIGH returns are LESS likely and very LOW returns are MORE likely

Kurtosis
✓ Usually 3 is subtracted from the figure arrived at, giving the excess kurtosis
✓ 3 : Normal Distn
✓ Fat Tailed : Leptokurtosis (>3)
✓ Light Tailed : Platykurtosis (<3)
✓ Positive Excess Kurtosis leads to a situation where VERY HIGH and VERY LOW returns are MORE likely than under a NORMAL DISTN
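A small sketch of how skew and kurtosis follow from central moments. The two data sets are made-up numbers chosen to give a symmetric case and a case with a long right tail; the population (divide by N) form of each moment is assumed.

```python
def central_moment(xs, r):
    """r-th central moment E[(X - mu)^r], using population (1/N) weights."""
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** r for x in xs) / len(xs)

def skew(xs):
    """Third central moment divided by sigma cubed."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5

def kurtosis(xs):
    """Fourth central moment divided by sigma to the fourth (3 for a normal)."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2

# Hypothetical samples, for illustration only.
symmetric = [-2, -1, 0, 1, 2]
right_tailed = [-1, -1, -1, 0, 8]

print(skew(symmetric), kurtosis(symmetric) - 3)        # zero skew, negative excess kurtosis
print(skew(right_tailed), kurtosis(right_tailed) - 3)  # positive skew
```

Subtracting 3 from `kurtosis` gives the excess kurtosis, so a normal distribution scores zero on both measures.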
Computation of Population Variance ….

X        f(x)     X·f(x)    X-μ       (X-μ)²    f(x)·(X-μ)²
1        10%      0.100     -2.100    4.410     0.441
2        20%      0.400     -1.100    1.210     0.242
3        35%      1.050     -0.100    0.010     0.003
4        20%      0.800      0.900    0.810     0.162
5        15%      0.750      1.900    3.610     0.542
Total    100%     3.100                         1.39

Mean (μ)                  3.1
Variance (σ²)             1.39
Standard Deviation (σ)    1.18

✓ Variance is a measure of dispersion
✓ It can also be calculated as E(X²) − [E(X)]²
✓ Note that Population Variance and Sample Variance differ
✓ In case of Sample Variance we divide by (N−1)
✓ This adjusts for the degrees of freedom and is a bit more conservative
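The table can be reproduced directly, and the shortcut E(X²) − [E(X)]² checked against the definition. The last lines illustrate the N versus N−1 divisor on a made-up five-point sample using the standard library's `pvariance` and `variance`.

```python
import statistics

# Values and probabilities from the table.
xs = [1, 2, 3, 4, 5]
probs = [0.10, 0.20, 0.35, 0.20, 0.15]

mean = sum(x * p for x, p in zip(xs, probs))

# Definition: probability-weighted squared deviation from the mean.
var_def = sum(p * (x - mean) ** 2 for x, p in zip(xs, probs))

# Shortcut: E(X^2) - [E(X)]^2.
var_short = sum(p * x * x for x, p in zip(xs, probs)) - mean ** 2

sd = var_def ** 0.5
print(round(mean, 2), round(var_def, 2), round(var_short, 2), round(sd, 2))

# Hypothetical sample: population variance divides by N, sample variance by N - 1.
sample = [1, 2, 3, 4, 5]
pop_var = statistics.pvariance(sample)   # divides by N
samp_var = statistics.variance(sample)   # divides by N - 1
print(pop_var, samp_var)
```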
Covariance and Correlation
❑ When two random variables X and Y are not independent, it is frequently of interest to assess how
strongly they are related to one another.

❑ For discrete random variables it is given by Cov(X,Y) = E[(X−μx)(Y−μy)], that is, the average of the products (X−μx)(Y−μy)

❑ For a strong positive relationship the products will mostly be positive; for a strong negative relationship they will mostly be negative

❑ If they are not correlated, the negatives and positives will mostly cancel out, giving a covariance close to zero

❑ The correlation coefficient of X and Y, denoted by Corr(X, Y) or just ρ, is defined by

$\rho = \frac{\mathrm{Cov}(X, Y)}{\sigma_x \sigma_y}$
Illustrating COVAR and CORREL

A      B      A-μA     B-μB     Product
4      1      -4.5     -3.8     17.1
5      4      -3.5     -0.8     2.8
6      4      -2.5     -0.8     2.0
7      3      -1.5     -1.8     2.7
8      2      -0.5     -2.8     1.4
9      5       0.5      0.2     0.1
10     8       1.5      3.2     4.8
11     7       2.5      2.2     5.5
12     6       3.5      1.2     4.2
13     8       4.5      3.2     14.4
Sum of products                 55

μA         8.5
μB         4.8
COVAR      5.50
Sigma a    2.87
Sigma b    2.32
CORREL     0.83
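The illustration can be reproduced as follows. This sketch uses the population (divide by N) convention, which is what matches the COVAR and Sigma figures shown; the variable names are illustrative.

```python
import math

# Columns A and B from the illustration.
a = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
b = [1, 4, 4, 3, 2, 5, 8, 7, 6, 8]
n = len(a)

mu_a = sum(a) / n
mu_b = sum(b) / n

# Covariance: average of the products of deviations from the means.
cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n

# Population standard deviations.
sigma_a = math.sqrt(sum((x - mu_a) ** 2 for x in a) / n)
sigma_b = math.sqrt(sum((y - mu_b) ** 2 for y in b) / n)

# Correlation: covariance scaled by both standard deviations.
corr = cov / (sigma_a * sigma_b)

print(round(cov, 2), round(sigma_a, 2), round(sigma_b, 2), round(corr, 2))
```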
Volatility
❑ Volatility is defined as the Standard Deviation of the return provided per unit of time

✓ Suppose that $S_i$ is the value of a variable on day i. The volatility per day is the standard deviation of the returns $\ln(S_i / S_{i-1})$

✓Continuous Compounding returns are used for Volatility computation

❑Normally days when markets are closed are ignored in volatility calculations

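❑ The volatility per year is √252 times the daily volatility, assuming 252 trading days per year (it is the variance rate, not the volatility, that scales linearly with time)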

❑Variance rate is the square of volatility

❑Of the variables needed to price an option the one that cannot be observed directly is volatility

❑We can therefore imply volatilities from market prices and vice versa
Computing volatility- Daily Standard Deviation

Price relative: $S_i / S_{i-1}$
Return: $u_i = \ln\frac{S_i}{S_{i-1}} \cong \frac{S_i - S_{i-1}}{S_{i-1}}$

Closing Price    Price Relative    ln Return    Squared Return
20.00
20.10            1.00500            0.00499     0.00002
19.90            0.99005           -0.01000     0.00010
20.00            1.00503            0.00501     0.00003
20.90            1.00966            0.00962     0.00009
20.40            0.97608           -0.02421     0.00059
20.50            1.00490            0.00489     0.00002
20.60            1.00488            0.00487     0.00002
20.30            0.98544           -0.01467     0.00022
SUM                                 0.01489     0.00424
Average                             0.00074     0.00021

Daily Volatility    1.456%

Daily Standard Deviation:
$\sigma_n^2 = \frac{1}{m-1}\sum_{i=1}^{m}\left(u_{n-i} - \bar{u}\right)^2$

Simplified form (taking $\bar{u}$ as zero and replacing m−1 by m):
$\sigma_n^2 = \frac{1}{m}\sum_{i=1}^{m} u_{n-i}^2$

✓ The above simplification is usually made in Risk Management
Problem : HULL 10.18
Suppose that observations on a stock price (in dollars) at the end of each of 15 consecutive days are
as follows: 30.2, 32.0, 31.1, 30.1, 30.2, 30.3, 30.6, 30.9, 30.5, 31.1, 31.3, 30.8, 30.3, 29.9, 29.8

Estimate the daily volatility using both approaches


1st Method: log returns with the mean subtracted. 2nd Method: percentage returns with the mean taken as zero.

                 |------------- 1st Method -------------|   |------- 2nd Method -------|
i     S_i        ln(S_i/S_{i-1})   minus ave   squared      (S_i-S_{i-1})/S_{i-1}   squared
0     30.20
1     32.00       0.0579            0.0588     0.0035        0.0596                 0.0036
2     31.10      -0.0285           -0.0276     0.0008       -0.0281                 0.0008
3     30.10      -0.0327           -0.0317     0.0010       -0.0322                 0.0010
4     30.20       0.0033            0.0043     0.0000        0.0033                 0.0000
5     30.30       0.0033            0.0043     0.0000        0.0033                 0.0000
6     30.60       0.0099            0.0108     0.0001        0.0099                 0.0001
7     30.90       0.0098            0.0107     0.0001        0.0098                 0.0001
8     30.50      -0.0130           -0.0121     0.0001       -0.0129                 0.0002
9     31.10       0.0195            0.0204     0.0004        0.0197                 0.0004
10    31.30       0.0064            0.0074     0.0001        0.0064                 0.0000
11    30.80      -0.0161           -0.0152     0.0002       -0.0160                 0.0003
12    30.30      -0.0164           -0.0154     0.0002       -0.0162                 0.0003
13    29.90      -0.0133           -0.0123     0.0002       -0.0132                 0.0002
14    29.80      -0.0034           -0.0024     0.0000       -0.0033                 0.0000
SUM                                            0.0067                               0.0069
SD                                             0.0228                               0.0222
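Both estimates can be reproduced with a short script. This is a sketch under the same conventions as the table: the first method uses log returns, subtracts their mean and divides by m−1; the second uses percentage returns, takes the mean as zero and divides by m. Variable names are illustrative.

```python
import math

# Closing prices for the 15 consecutive days given in the problem.
prices = [30.2, 32.0, 31.1, 30.1, 30.2, 30.3, 30.6, 30.9,
          30.5, 31.1, 31.3, 30.8, 30.3, 29.9, 29.8]

# Method 1: log returns, mean subtracted, divisor m - 1.
log_returns = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]
m = len(log_returns)
mean_u = sum(log_returns) / m
var_1 = sum((u - mean_u) ** 2 for u in log_returns) / (m - 1)
vol_1 = math.sqrt(var_1)

# Method 2: percentage returns, mean assumed zero, divisor m.
pct_returns = [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]
var_2 = sum(r * r for r in pct_returns) / m
vol_2 = math.sqrt(var_2)

print(f"method 1: {vol_1:.4f}   method 2: {vol_2:.4f}")
```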
Computing volatility- EWMA
✓ Historical SD assigns the same weight to all observations
✓ Thus it is an equally weighted approach
✓ Recent spikes will not be captured
✓ Exponentially Weighted Moving Average (EWMA) assigns greater weight to recent returns
✓ It does this with a single parameter λ (80% in this example)
✓ Each successive weight is lowered by a factor of λ

Day    Closing Price    Price Ratio    Daily Return    Squared Return    Weights    Weighted Squared Return
-      20.00
1      19.80            1.010101        1.01%          0.0001010         20%        0.0000202
2      20.13            0.9838055      -1.63%          0.0002666         16%        0.0000427
3      20.15            0.9988103      -0.12%          0.0000014         13%        0.0000002
4      20.18            0.9985084      -0.15%          0.0000022         10%        0.0000002
5      20.17            1.000386        0.04%          0.0000001         8%         0.0000000
...
55     20.00            1.0008524       0.09%          0.0000007         0%         0.0000000
56     20.00            1.0001241       0.01%          0.0000000         0%         0.0000000
57     20.04            0.9981933      -0.18%          0.0000033         0%         0.0000000
58     20.04            0.9999751       0.00%          0.0000000         0%         0.0000000
59     20.04            1.000166        0.02%          0.0000000         0%         0.0000000
60     19.98            1.0025724       0.26%          0.0000066         0%         0.0000000
Sum                                                    0.000495          100.0%
Avg                                     0.0000135      0.00083%                     0.00642%
Daily volatility                                       0.28730%                     0.80127%

Simplified formula:
$\sigma_n^2 = \lambda\sigma_{n-1}^2 + (1-\lambda)u_{n-1}^2$
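A rough sketch of how the weights arise and why the recursive formula is equivalent to the weighted sum. It assumes λ = 0.80 as in the table, takes the first five daily returns from the table with the most recent first, and starts the recursion from a variance of zero; the function names are illustrative.

```python
lam = 0.80  # decay parameter used in the table

def ewma_weights(lam, m):
    """Weight on the i-th most recent squared return: (1 - lam) * lam**(i - 1)."""
    return [(1.0 - lam) * lam ** (i - 1) for i in range(1, m + 1)]

def ewma_update(prev_var, prev_return, lam):
    """One step of the recursion: new variance from old variance and latest return."""
    return lam * prev_var + (1.0 - lam) * prev_return ** 2

weights = ewma_weights(lam, 60)
print([round(w, 2) for w in weights[:5]], round(sum(weights), 4))

# First five daily returns from the table, most recent first.
recent_first = [0.0101, -0.0163, -0.0012, -0.0015, 0.0004]

# Explicit weighted sum of squared returns.
var_weighted = sum(w * u * u for w, u in zip(ewma_weights(lam, 5), recent_first))

# Same number via the recursion, fed oldest return first, starting from zero.
var_recursive = 0.0
for u in reversed(recent_first):
    var_recursive = ewma_update(var_recursive, u, lam)

print(var_weighted, var_recursive)
```

The recursion only ever needs the previous variance and the latest return, which is the storage advantage of EWMA over keeping the full return history.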
Advantages of EWMA

❑ Relatively little data needs to be stored

❑We need only remember the current estimate of the variance rate and the most recent observation
on the market variable

❑Tracks volatility changes

❑ λ = 0.94 has been found to be a good choice across a wide range of market variables

✓ RiskMetrics, a database created by J.P. Morgan, uses λ = 0.94 for updating volatility estimates across a range of market variables
Estimating Volatilities - GARCH

❑ In GARCH(1,1) we assign some weight to the long-run average variance rate $V_L$

$\sigma_n^2 = \gamma V_L + \alpha u_{n-1}^2 + \beta \sigma_{n-1}^2$
❑ The largest weight, at least 80%, is assigned to β
❑ The balance is distributed between α and γ
❑ The weights must sum to 1:
$\gamma + \alpha + \beta = 1$
Computing volatility- GARCH
Simplified formula:
$\sigma_n^2 = \omega + \alpha u_{n-1}^2 + \beta\sigma_{n-1}^2$
where $\omega = \gamma V_L$, $V_L$ is the long-run average variance rate, and $\alpha + \beta + \gamma = 1$

Assumed Parameters
α         10%
β         80%
γ         10%
σ²(LR)    0.00010
ω         0.000010

Day    Prices    Relatives S_i/S_{i-1}    Daily u_i    Daily u_i²    Weights αβ^(i-1)    Weighted squared returns
0      20.00
1      19.80     1.01010                   1.01%       0.00010       10%                 0.000010
2      20.13     1.01646                   1.63%       0.00027       8.00%               0.000021
3      20.15     1.00119                   0.12%       0.00000       6.40%               0.000000
4      20.18     1.00149                   0.15%       0.00000       5.12%               0.000000
5      20.17     0.99961                  -0.04%       0.00000       4.10%               0.000000
...
53     20.00     0.99732                  -0.27%       0.00001       0.00%               0.000000
54     20.02     1.00090                   0.09%       0.00000       0.00%               0.000000
55     20.00     0.99915                  -0.09%       0.00000       0.00%               0.000000
56     20.00     0.99988                  -0.01%       0.00000       0.00%               0.000000
57     20.04     1.00181                   0.18%       0.00000       0.00%               0.000000
58     20.04     1.00002                   0.00%       0.00000       0.00%               0.000000
59     20.04     0.99983                  -0.02%       0.00000       0.00%               0.000000
60     19.98     0.99743                  -0.26%       0.00001       0.00%               0.000000
Sum of weighted squared returns                                                          0.000032

σ²(MA)    0.000825%        σ²n    0.008210%
σ(MA)     0.2873%          σn     0.9061%

Key difference: We are assuming a Long Run Volatility / SD and assigning it a certain weight
Problem : HULL 10.19
Suppose that the price of an asset at close of trading yesterday was $300 and its volatility was
estimated as 1.3% per day. The price at the close of trading today is $298. Update the volatility
estimate using (a) the EWMA model with λ = 0.94 and (b) the GARCH(1,1) model with ω = 0.000002, α = 0.04, and β = 0.94.

a) Use $\sigma_n^2 = \lambda\sigma_{n-1}^2 + (1-\lambda)u_{n-1}^2$

where $u_{n-1} = -2/300 = -0.00667$

b) Use $\sigma_n^2 = \omega + \alpha u_{n-1}^2 + \beta\sigma_{n-1}^2$
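Plugging the numbers into the two update formulas can be sketched as follows. The inputs are those stated in the problem; the variable names are illustrative.

```python
import math

prev_price, price = 300.0, 298.0
prev_vol = 0.013  # yesterday's volatility estimate, 1.3% per day

# Most recent daily return, as a proportional price change.
u = (price - prev_price) / prev_price  # -2/300

# (a) EWMA with lambda = 0.94.
lam = 0.94
ewma_var = lam * prev_vol ** 2 + (1.0 - lam) * u ** 2
ewma_vol = math.sqrt(ewma_var)

# (b) GARCH(1,1) with omega = 0.000002, alpha = 0.04, beta = 0.94.
omega, alpha, beta = 0.000002, 0.04, 0.94
garch_var = omega + alpha * u ** 2 + beta * prev_vol ** 2
garch_vol = math.sqrt(garch_var)

print(f"EWMA:  {ewma_vol:.3%} per day")
print(f"GARCH: {garch_vol:.3%} per day")
```

Both updated estimates come out slightly below 1.3% because the latest return (about 0.67% in absolute terms) is smaller than the previous volatility estimate.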
Comparing the three approaches

Method: Volatility / Standard Deviation (Conventional Way)
  Formula: $\sigma_n^2 = \frac{1}{m}\sum_{i=1}^{m} u_{n-i}^2$
  Weighting: Assigns equal weight to each day

Method: EWMA
  Formula: $\sigma_n^2 = \lambda\sigma_{n-1}^2 + (1-\lambda)u_{n-1}^2$
  Weighting: Assigns higher weight to the latest days. λ, typically 90% and above, decides the weight

Method: GARCH
  Formula: $\sigma_n^2 = \omega + \alpha u_{n-1}^2 + \beta\sigma_{n-1}^2$
  Weighting: Weight γ is given to the long-run average variance $V_L$ (through $\omega = \gamma V_L$), to which the variance is ultimately pulled. Can be used to FORECAST
Forecasting Volatility using GARCH
Key formulae we have learnt:
$\sigma_n^2 = \omega + \alpha u_{n-1}^2 + \beta\sigma_{n-1}^2$
$\sigma_n^2 = \gamma V_L + \alpha u_{n-1}^2 + \beta\sigma_{n-1}^2$

From the above it can be derived that
$E\left[\sigma_{n+t}^2\right] = V_L + (\alpha + \beta)^t\left(\sigma_n^2 - V_L\right)$

❑ This equation forecasts the variance rate t days forward using the information available at the end of day n-1; the volatility forecast is its square root
❑ The Variance Rate exhibits mean reversion with a reversion level of $V_L$ and a reversion rate of 1 − α − β
❑ γ = 1 − α − β is effectively the rate at which the variance mean reverts
Problem : HULL 10.21
Suppose that the parameters in a GARCH(1,1) model are α = 0.03, β = 0.95 and ω = 0.000002.
(a) What is the long-run average volatility?
(b) If the current volatility is 1.5% per day, what is your estimate of the volatility in 20, 40, and 60 days?

(c) Suppose that there is an event that increases the volatility from 1.5% per day to 2% per day. Estimate the effect on
the volatility in 20, 40, and 60 days.

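• γV_L = ω, therefore V_L = 0.000002 / (1 − 0.03 − 0.95) = 0.0001, so the long-run average volatility is √0.0001 = 1% per day

$E\left[\sigma_{n+t}^2\right] = V_L + (\alpha + \beta)^t\left(\sigma_n^2 - V_L\right)$
Therefore, expected variance in 20 days = 0.0001 + (0.98)^20 × (1.5%^2 − 0.0001) = 0.000183
Volatility = SQRT(0.000183) = 1.35%

With 2%, the volatility in 20 days will be 1.73%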
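The forecasts for all three horizons can be sketched with the mean-reversion formula. The parameters are those given in the problem; the function name `forecast_vol` is an illustrative choice.

```python
import math

omega, alpha, beta = 0.000002, 0.03, 0.95

# Weight on the long-run variance, and the long-run variance itself.
gamma = 1.0 - alpha - beta
v_long = omega / gamma
long_run_vol = math.sqrt(v_long)

def forecast_vol(current_vol, t):
    """Expected volatility t days ahead: square root of the forecast variance rate."""
    expected_var = v_long + (alpha + beta) ** t * (current_vol ** 2 - v_long)
    return math.sqrt(expected_var)

print(f"long-run volatility: {long_run_vol:.2%} per day")
for t in (20, 40, 60):
    base = forecast_vol(0.015, t)     # starting from 1.5% per day
    shocked = forecast_vol(0.020, t)  # starting from 2% per day
    print(f"t = {t}: {base:.2%} vs {shocked:.2%}")
```

The gap between the two starting points shrinks as t grows, because both forecasts are pulled toward the same long-run level.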
Vasicek Model
❑Very important for determining the credit risk capital for a portfolio of loans
❑ For a large portfolio of loans, each of which has a probability PD of defaulting by time T, the Worst Case Default Rate (WCDR) that will not be exceeded at the X% confidence level is

$\mathrm{WCDR} = N\left(\frac{N^{-1}(PD) + \sqrt{\rho}\,N^{-1}(X)}{\sqrt{1-\rho}}\right)$

where ρ is the Gaussian copula correlation

❑ Assumes that the default probability is the same for all loans, and so is the default correlation between the loans

❑ Credit VaR at the X% confidence level is WCDR × EAD × LGD less Expected Loss
Vasicek Model
❑ The result from the Vasicek model is used to determine the Credit VaR

[Figure: probability distribution of loss over 1 year, marking the Expected Loss and the loss corresponding to the WCDR at 99.9%; the distance between the two is the Credit VaR = Capital Requirement for Unexpected Loss]

❑ Incidentally, for PD = 1% and ρ = 0.5, WCDR works out to 42% at the 99.9% confidence level. If correlation = 0, WCDR = PD
Problem : HULL 11.19
Suppose that a bank has made a large number of loans of a certain type. The one-year
probability of default on each loan is 1.2%. The bank uses a Gaussian copula for time to
default. It is interested in estimating a “99.97% worst case” for the percentage of loans that
default on the portfolio. Show how this varies with the copula correlation.

PD                  1.2%
Confidence Level    99.97%

Correlation    N⁻¹(PD)    N⁻¹(X)    Value given by formula    WCDR = N(Value)
0              -2.257     3.432     -2.257                    1.2%
0.2            -2.257     3.432     -0.808                    21.0%
0.4            -2.257     3.432     -0.112                    45.5%
0.6            -2.257     3.432      0.634                    73.7%
0.9            -2.257     3.432      3.157                    99.9%
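The table can be reproduced with a few lines of code. This sketch evaluates the Vasicek worst-case default rate, N applied to (N⁻¹(PD) + √ρ·N⁻¹(X)) / √(1 − ρ), using the standard library's normal distribution; the function name `wcdr` is an illustrative choice.

```python
import math
from statistics import NormalDist

std_normal = NormalDist()

def wcdr(pd, rho, confidence):
    """Worst-case default rate for default probability pd and copula correlation rho."""
    numerator = std_normal.inv_cdf(pd) + math.sqrt(rho) * std_normal.inv_cdf(confidence)
    return std_normal.cdf(numerator / math.sqrt(1.0 - rho))

pd, confidence = 0.012, 0.9997
for rho in (0.0, 0.2, 0.4, 0.6, 0.9):
    print(f"rho = {rho:.1f}: WCDR = {wcdr(pd, rho, confidence):.1%}")
```

With ρ = 0 the result collapses to PD itself, and it rises steeply as the correlation increases, since defaults then cluster in the bad states of the common factor.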