Formulario

The document discusses various functions and commands in MATLAB related to vectors, matrices, random variables, plots, and linear regression. It covers: 1) How to generate random vectors and matrices with different distributions and change their mean and standard deviation. 2) Functions to find elements, dimensions, sums, and means of matrices. 3) Commands for logical indexing and filtering matrices. 4) Using for loops and if statements to iterate through vectors and matrices. 5) Functions for linear regression like fitting a simple linear model and estimating beta in the Capital Asset Pricing Model (CAPM).

Uploaded by

frapass99

%% VECTORS AND MATRICES

%vectors and matrices of random variables


x = randn(3,5) %3x5 matrix of N(0,1) r.vs.
%we can shift the mean from 0 to 2
x = randn(3,5)+2 %3x5 matrix of N(2,1) r.vs.
%we can change the st. dev. from 1 to 3 (so, the variance from 1 to 9)
x = 3*randn(3,5)+2 %3x5 matrix of N(2,9) r.vs.

x = rand(3,5) %3x5 matrix of uniform r.vs. (distribution from 0 to 1)
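%illustrative check (a sketch, not part of the original notes; the sample size
%and the interval endpoints below are arbitrary assumptions):
%with many draws, the sample mean and st. dev. of 3*randn+2 should be close to 2 and 3
n_draws = 1e6; %assumed sample size, chosen only for illustration
x_check = 3*randn(n_draws,1)+2;
mean(x_check) %close to 2
std(x_check) %close to 3
%the same shift-and-scale idea gives uniform r.vs. on a generic interval [lo,hi]
lo = 2; hi = 5; %assumed interval endpoints
u_check = lo + (hi-lo)*rand(3,5) %3x5 matrix of uniform r.vs. on [2,5]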

%vectors and matrices of random integer numbers


x = randi(10) %random integer between 1 and 10 (the input is the upper bound)
x = randi(10, 3, 3) %3x3 matrix of random integer numbers between 1 and 10

%linspace command
%row vector of n equally spaced points between X1 and X2.
%X1 is the first input, X2 is the second input, n is the third input (if
%we don't specify n, it is automatically 100)
x = linspace(0,1,10)
%X1 and X2 will always be the first and the last element of the vector

%find the elements of a matrix


numel(A)%number of elements of A
A(:) %column vector that stacks all the columns of A
A(:, 4) %fourth column of A
A(4, :) %fourth row of A
A(end-1:end, 4) %elements on the fourth column, penultimate and last row
A(4, end-1:end) %elements on the fourth row, penultimate and last column

%mean of the elements of a matrix


%mean of the elements on each column of the matrix A
mean(A,1) %you can omit the 1
%mean of the elements on each row of the matrix A
mean(A,2) %you can't omit the 2

%“size” function
%assigns the first output (the number of rows) to the variable n_rows and
%the second output (the number of columns) to the variable n_cols
[n_rows, n_cols] = size(A)

%“sum” function
sum(A) %sum of the elements on each column of A (column-wise sum)
sum(A,2) %sum of the elements on each row of A (row-wise sum)
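%small worked example of the commands above (the matrix A_ex is made up for
%illustration only)
A_ex = [1 2; 3 4]
mean(A_ex,1) %[2 3] column means
mean(A_ex,2) %[1.5; 3.5] row means
sum(A_ex) %[4 6] column-wise sum
sum(A_ex,2) %[3; 7] row-wise sum
[n_rows_ex, n_cols_ex] = size(A_ex) %2 and 2
numel(A_ex) %4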

%plots
plot(x, y,'ok') %scatter plot with black points
plot(x, y,'-b') %line plot with blue lines
plot(x, y,'-or') %plot with red lines and points
figure(1), plot(x,y,'ok'),title('blabla') %we put the plot into figure 1 and we
%assign a title to it

%“diff” function
x = [1,2,3,4,5]'
diff(x) %the first component of this vector contains the difference between
%the 2nd and the 1st component of the original vector, and so on
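%a possible use of diff (sketch with made-up prices, not taken from the notes):
%log-returns of a price series, like the ones used later in the CAPM section
prices_ex = [100; 110; 121] %hypothetical prices
logret_ex = diff(log(prices_ex)) %2x1 vector, both entries equal to log(1.1)
%note that diff returns a vector with one element less than its input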
%logical indexing
n = sum(x == 1) %how many times x is = 1?
mean(price(price > 1)) %mean price conditionally on having the price > 1
x = z(z<0 & w>=2) %vector that contains the components of z at the positions
%where z is < 0 AND the corresponding element of w is >= 2
y = z(z<0 | w>=2) %vector that contains the components of z at the positions
%where z is < 0 OR the corresponding element of w is >= 2
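%small worked example (z_ex and w_ex are made-up vectors of the same size)
z_ex = [-1 2 -3 4];
w_ex = [1 3 2 0];
and_ex = z_ex(z_ex<0 & w_ex>=2) %-3 (only the third position satisfies both)
or_ex = z_ex(z_ex<0 | w_ex>=2) %[-1 2 -3] (the fourth position satisfies neither)
n_neg_ex = sum(z_ex < 0) %2 negative components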

%if statement
u = rand %uniformly distributed in [0,1] random variable

%Matlab checks if the logical condition u >= 0.8 is true.


%if it is true, the following command is executed.
%if it is false, Matlab checks if the second logical condition is true.
%if it is true, the following command is executed.
%if it is false, Matlab checks if the third logical condition is true. Etc..
if (u >= 0.8)
disp('The variable u is greater than or equal to 0.8')
% "disp" is a command that displays a string in the command window
elseif (u < 0.8 && u > 0.2) %&& is the short-circuit AND: if the first
%condition is false, the second one is not evaluated
disp('The variable u is between 0.2 and 0.8')
else
disp('The variable u is smaller than or equal to 0.2')
end

%if statement with FOR loops


%we want to flip the sign of the components of v that are < 0.
%before doing a FOR loop over the elements of a vector we must declare the length of the vector
v = randn(100,1)
for j = 1:100 %we must specify the range over which the index j must vary
if(v(j)<0)
v(j)=-v(j)
end
end
%Matlab will repeat for 100 times the commands written between "for"
%and the last "end", every time with a new value of j
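%the same sign flip can be sketched without a loop, using logical indexing
%(v_ex is a made-up vector; for real numbers the result coincides with abs(v_ex))
v_ex = [0.5; -1.2; 3; -0.1]
v_ex(v_ex<0) = -v_ex(v_ex<0) %[0.5; 1.2; 3; 0.1]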

%“cumsum” function
a = (1:10)'
cumsum(a) %the element j is the sum of a(j) and all the elements before a(j)
%the first element of cumsum(a) coincides with the first element of a

%Brownian motion (0711)


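a = randn(1000,1) %the increments must be i.i.d. N(0,1) r.vs.
y = cumsum(a)/sqrt(length(a)) %rescaled random walk that approximates a Brownian motion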

%strcmp command
%the string compare command compares each element of a cell array of strings
%(here a column of the table T) with a given string.
%we obtain a logical vector (1 if the comparison is true, 0 if it is false).
strcmp(T.Gender, 'F') %1 for females, 0 for males
%ages of females (we select the age every time there’s a female)
T.Age(strcmp(T.Gender, 'F'))

%prctile command
%how many smokers have a Systolic pressure larger than the 90th percentile
%(value of the Systolic pressure below which lies 90% of the sample)?
perc90 = prctile(T.Systolic,90)
sum((T.Systolic > perc90) & (T.Smoker==1))

%max and min of arrays


x = randn(100,1)
[x_max,i_max] = max(x) %x_max is the max, i_max is the index of the component
%of x at which the max is attained
[x_min,i_min] = min(x) %x_min is the min, i_min is the index of the component
%of x at which the min is attained

%“sort” function
x = randn(100,1)
sort(x) %it puts the components of x in increasing order (the first component
%of the vector sort(x) is min(x), while the last component is max(x))

%“find” function
find(x==min(x)) %index of the component(s) where the condition is true

%NaN’s filtering and “isnan” function


isnan(A) %the function gives us 0 if the component isn’t a NaN, and 1 if it is
isfinite(A) %it gives us 0 if the component isn’t finite , and 1 if it is
sum(isnan(A),1) %how many NaNs appear in each column (column wise sum)
sum(isnan(A),2) %how many NaNs appear in each row (row wise sum)
A(:,sum(isnan(A),1) > 3) = [] %we delete the columns that contain more than 3 NaNs
A(sum(isnan(A),2) > 3,:) = [] %we delete the rows that contain more than 3 NaNs
y_denan = y(~isnan(y)) %variable y purified from the NaNs

%”interp1” function (interpolation)


%we interpolate the missing values of a dataset using the information
%provided by the other values of the dataset.
%we must specify the variables with no missing observations (time_denan and
%y_denan), and the original time vector, which also covers the dates at which
%y is missing (time).
%then we must specify which interpolation we want to use (linear or previous).

y_linear_interp = interp1(time_denan,y_denan, time, 'linear')


%a line connects the two non-missing observations that are closest to the missing observation

y_previous_interp = interp1(time_denan,y_denan, time, 'previous')


%the last non-missing observation is kept constant
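%small worked example of the two interpolations (time_ex and y_ex are made up)
time_ex = (1:5)';
y_ex = [10; NaN; 30; NaN; 50];
keep_ex = ~isnan(y_ex); %positions of the non-missing observations
lin_ex = interp1(time_ex(keep_ex), y_ex(keep_ex), time_ex, 'linear') %[10; 20; 30; 40; 50]
prev_ex = interp1(time_ex(keep_ex), y_ex(keep_ex), time_ex, 'previous') %[10; 10; 30; 30; 50]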

%fminbnd command (function bounded minimization)


%we find the minimum of a function within a specific interval.
%@(x) means that x is the independent variable with respect to which we want
%to perform the minimization.
%(1,3) is the interval within which the algorithm must look for the minimum.
[x_min,f_min] = fminbnd(@(x)my_function_to_min(x),1,3)
%2 outputs: the position of the minimum (x_min) and the value of the
%function at that position (f_min)
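%minimal sketch of a bounded minimization: here the function to minimize is an
%assumed parabola (it stands in for my_function_to_min, whose code is not shown
%in these notes) with its minimum at x = 2, inside the interval (1,3)
f_ex = @(x) (x-2).^2 + 1;
[x_min_ex, f_min_ex] = fminbnd(f_ex,1,3) %x_min_ex close to 2, f_min_ex close to 1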

%optimset command (setting of the optimizer)


%to find the minimum of a function we use an algorithm that is iterated
%several times until the difference between the new estimate of the minimum
%and the previous one is below a certain tolerance. when the tolerance is
%reached, further iterations bring no improvement, so the algorithm is stopped.
options = optimset('Display','iter', 'TolX',0.000001)
%'Display': the function will display all the iterations of the optimization
%algorithm (the result doesn't change, we just see how the
%algorithm works).
%'TolX': we say to the optimizer to stop the iteration when the difference
%between an estimate and the previous one is below 0.000001 (the
%result changes).
[x_min, f_min, EXITFLAG] = fminbnd(@(x)my_function_to_min(x),1,3,options)
%EXITFLAG is an integer number that tells us which criterion was used to stop
%the iteration of the minimization algorithm.
%EXITFLAG = 1: the iteration has been stopped because the tolerance has
%been reached
%EXITFLAG = 0: the algorithm has been stopped because the maximum number of
%iterations has been reached (but the tolerance hasn't been reached)

%lsqnonlin command (non-linear least squares estimation)


%we create a function that computes the vector of the differences y - y_hat
%(residuals), then lsqnonlin computes the sum of the squares and minimizes it.
%doing this, it finds the estimates of the parameters that give the best fit
%of the model to the data
par0 = [0.01,1,3] %starting point of the minimizer
par_est = lsqnonlin(@(par)model_to_fit(par,x,y),par0,[0.01 0.01 0.01],...
[1 3 3],options)
%par_est: the output is a vector made by the 3 estimates of the parameters.
%@(par): par is the variable upon which the minimization must be performed
%[..],[..]: lower and upper bounds of the 3 parameters (the brackets can be
%empty - unconstrained optimization - but they must be there)
%par0: we must specify the starting value par0!
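%sketch with simulated data (requires the Optimization Toolbox, like lsqnonlin
%above). the model below is an assumption made only for illustration, since the
%notes do not show model_to_fit: y = p1 + p2*exp(-p3*x)
x_ex = linspace(0,4,50)';
par_true_ex = [0.5, 2, 1.5]; %made-up "true" parameters, inside the bounds
y_ex = par_true_ex(1) + par_true_ex(2)*exp(-par_true_ex(3)*x_ex) + 0.01*randn(50,1);
%the function must return the vector of residuals, not their sum of squares
resid_ex = @(par) y_ex - (par(1) + par(2)*exp(-par(3)*x_ex));
lb_ex = [0.01 0.01 0.01]; %lower bounds
ub_ex = [1 3 3]; %upper bounds
par_est_ex = lsqnonlin(resid_ex,[0.1,1,2],lb_ex,ub_ex)
%with so little noise, par_est_ex is expected to be close to par_true_ex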

%% SIMPLE LINEAR REGRESSION MODEL


%Matlab's SSE in fitlm(X,Y) = the professor's SSR
%Matlab's SSR in fitlm(X,Y) = the professor's SSE
%Matlab's SST in fitlm(X,Y) = SST

%if Y = km travelled, X = cost of the trip, what is the estimated cost for a
%trip of 600 km? we invert the fitted line Y = beta_0_hat + beta_1_hat*X
cost = (600-beta_0_hat)/beta_1_hat

%CAPM
%The CAPM is a model we use to estimate the effect of a market movement
%on a stock. Therefore, we regress the return of the stock in
%excess of the risk-free rate on the return of
%the market in excess of the risk-free rate.
Y = r_stock - rf %log-returns of the stock in excess wrt the risk-free rate
X = r_mkt - rf %log-returns of the market in excess wrt the risk-free rate
T = length(Y) %sample size

%beta_1 is the "beta of the stock", which expresses the sensitivity of the stock
%to a market variation (the higher the beta, the higher the sensitivity)
%if beta_1 > 1, the shocks are amplified
%if beta_1 < 1, the shocks are dampened

%OLS estimators
beta_1_hat = sum((X-mean(X)).*(Y-mean(Y)))/sum((X-mean(X)).^2) %cov(X,Y)/var(X)
beta_0_hat = mean(Y) - beta_1_hat*mean(X)
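%sketch of a cross-check on simulated data (all the numbers below are made up):
%the closed-form OLS estimates should match polyfit, which fits a straight
%line by least squares
Xs_ex = randn(200,1);
Ys_ex = 0.5 + 1.2*Xs_ex + 0.1*randn(200,1);
b1_ex = sum((Xs_ex-mean(Xs_ex)).*(Ys_ex-mean(Ys_ex)))/sum((Xs_ex-mean(Xs_ex)).^2);
b0_ex = mean(Ys_ex) - b1_ex*mean(Xs_ex);
pf_ex = polyfit(Xs_ex,Ys_ex,1) %pf_ex(1) is the slope, pf_ex(2) is the intercept
[b1_ex, b0_ex] %should coincide with pf_ex up to rounding error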

%quality of the regression (R2)


Y_hat = beta_0_hat + beta_1_hat*X %predicted/fitted values
res = Y - Y_hat %vector of differences between data and predicted values
SSR = sum(res.^2)
SST = sum((Y-mean(Y)).^2)
R2 = 1 - (SSR/SST) %data variation explained by the regression

%estimator of the variance of the errors


sigma2_eps_hat = SSR/(T-2)
%estimator of the variance of beta_1_hat
V_beta_1_hat = sigma2_eps_hat/sum((X-mean(X)).^2)
%estimator of the variance of beta_0_hat
V_beta_0_hat = ((sigma2_eps_hat)/T)*((sum(X.^2))/(sum((X-mean(X)).^2)))

%t-stat (H0: beta_1 = 1 H1: beta_1 ≠ 1) (beta_1 = 1 means the
%variation of the market is transferred to the stock 1 to 1).
t_stat_beta_1 = (beta_1_hat - 1)/sqrt(V_beta_1_hat)

%t-stat (H0: beta_1 = 0 H1: beta_1 ≠ 0)


t_stat_beta_1 = (beta_1_hat)/sqrt(V_beta_1_hat)

%we reject the null in favor of the alternative with significance level 5%
%if abs(t) > 1.96 (rejection rule for a two-sided test)
abs(t_stat_beta_1)>1.96 %1 = null rejected, 0 = null not rejected
%or if p-value < 0.05
p_value_beta_1_hat = 2*(1 - normcdf(abs(t_stat_beta_1)))

%% MULTIPLE LINEAR REGRESSION MODEL


%Matlab's SSE in fitlm(X,Y) = the professor's SSR
%Matlab's SSR in fitlm(X,Y) = the professor's SSE
%Matlab's SST in fitlm(X,Y) = SST

%OLS estimators
X = [ones(length(Y),1),X1,X2,X3]
K=3
beta_ols = ((X'*X)^(-1))*(X'*Y)
Yhat = X*beta_ols
res = Y - Yhat
SSR = sum(res.^2)
sigma2_eps_hat = SSR/(T-K-1)
var_covar = sigma2_eps_hat*((X'*X)^(-1))
std_errors = sqrt(diag(var_covar))
t_stat = beta_ols./std_errors
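%sketch of an alternative (not used in the original notes): the backslash
%operator solves the same least squares problem without forming the inverse
%explicitly, which is usually more stable numerically.
%the data and the coefficients below are simulated / made up
T_ex = 100;
Xm_ex = [ones(T_ex,1), randn(T_ex,3)];
Ym_ex = Xm_ex*[1; 0.5; -2; 0.3] + 0.1*randn(T_ex,1);
beta_inv_ex = ((Xm_ex'*Xm_ex)^(-1))*(Xm_ex'*Ym_ex);
beta_bs_ex = Xm_ex\Ym_ex;
max(abs(beta_inv_ex - beta_bs_ex)) %of the order of the rounding error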

%Collinearity
X = [ones(5,1),X1,X2,X3,X4]
%if one of the regressors is a linear combination of the other regressors..
det(X'*X) %has a very small value, so..
(X'*X)^(-1) %Matlab calculates the inverse but gives us a warning
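%sketch with made-up regressors: Xc3_ex is an exact multiple of Xc1_ex, so the
%columns of the design matrix are linearly dependent
Xc1_ex = randn(5,1);
Xc2_ex = randn(5,1);
Xc3_ex = 2*Xc1_ex;
Xc_ex = [ones(5,1),Xc1_ex,Xc2_ex,Xc3_ex];
rank_ex = rank(Xc_ex) %3 instead of 4: one column is redundant
cond(Xc_ex'*Xc_ex) %huge condition number: the inverse is not reliable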

%VIF
X = [ones(length(Y),1),X1,X2,X3,X4]
%compute the VIF for each of the 4 regressors
VIF = NaN(4,1)
for j = 1:4
%the new regressand is the j-th regressor, which in the matrix X is not
%the column j but the column j+1 (the first regressor is the constant!)
Y_new = X(:,j+1)
%the new regressors are the remaining regressors
X_new = X
X_new(:,j+1) = [] %we remove the column corresponding to the j-th
%regressor, that is our new dependent r.v.
beta_ols_new = ((X_new'*X_new)^(-1))*(X_new'*Y_new)
Yhat_new = X_new*beta_ols_new
res_new = Y_new - Yhat_new
SSR_new = sum((res_new).^2)
SST_new = sum((Y_new - mean(Y_new)).^2)
R2 = 1 - (SSR_new/SST_new) %R2 > 0.9: multicollinearity
VIF(j) = 1/(1 - R2) %VIF > 10: multicollinearity
end

%F-test
%unrestricted model
T = length(Y)
X_U = [ones(length(Y),1),X1,X2,X3,X4]
result_U = my_ols_routine(Y,X_U)
result_U.tstat
SSR_U = result_U.SSR

%we get rid of the insignificant regressor


X_R = [ones(length(Y),1),X1,X2,X3]
result_R = my_ols_routine(Y,X_R)
SSR_R = result_R.SSR
q = 1 %we put just one regressor equal to 0 (it’s K_U – K_R)
K = 4 %number of regressors of the unrestricted model!
F = ((SSR_R-SSR_U)/q)/(SSR_U/(T-K-1))
pvalue_R = 1 - fcdf(F,q,T-K-1)
% if it is > 0.05, we don't reject the null hypothesis, so we can rely on the
% restricted model (the increase that we have in SSR removing the last
% regressor is acceptable).
% if it is < 0.05, we reject the null hypothesis, so we can rely on the
% unrestricted model (the increase that we have in SSR removing the last
% regressor is not acceptable).
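%sketch of the same F-test without my_ols_routine (whose code is not shown in
%these notes), on simulated data where the 4th regressor is irrelevant by
%construction. fcdf requires the Statistics Toolbox, as above
Tf_ex = 200;
Zf_ex = randn(Tf_ex,4);
Yf_ex = 1 + Zf_ex(:,1:3)*[0.5; -1; 2] + randn(Tf_ex,1);
XU_ex = [ones(Tf_ex,1),Zf_ex]; %unrestricted model: constant + 4 regressors
XR_ex = XU_ex(:,1:4); %restricted model: the last regressor is dropped
SSRU_ex = sum((Yf_ex - XU_ex*(XU_ex\Yf_ex)).^2);
SSRR_ex = sum((Yf_ex - XR_ex*(XR_ex\Yf_ex)).^2);
F_ex = ((SSRR_ex-SSRU_ex)/1)/(SSRU_ex/(Tf_ex-4-1))
pvalue_ex = 1 - fcdf(F_ex,1,Tf_ex-4-1)
%SSRR_ex can never be smaller than SSRU_ex, so F_ex is never negative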

%t-test to see if 2 regressors have the same effect on Y


Y = GNP
T = length(Y)
X = [ones(T,1),FI,CD,PP]
result = my_ols_routine(Y,X)
result.tstat

% H_0: beta_FI = beta_CD (FI and CD have the same coefficient)
% H_1: beta_FI ≠ beta_CD
K=3
X_new = [ones(T,1),FI,FI+CD,PP]
result_new = my_ols_routine(Y,X_new)
result_new.tstat
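%in the new model the coefficient associated to FI equals beta_FI - beta_CD:
%if its t-stat is larger than 1.96 in absolute value, we reject
%the null hypothesis, so the two coefficients are different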

%LR test
% Unrestricted model: BD = beta_0 + beta_1*PF + beta_2*PA + beta_3*Wkg
Y = BD
T = length(Y)
X = [ones(T,1),PF,PA,Wkg]
result_U = my_ols_routine(Y,X)
result_U.beta_ols
result_U.tstat
SSR_U = result_U.SSR

%I want to compare the Unrestricted model with the Restricted model


%BD = beta_0 + beta_2*PA + 0.0066*Wkg
%so we are not assuming that both coefficients are 0 (like in the F test),
%but that one is 0 and another one is 0.0066

%H_0: beta_1 = 0 & beta_3 = 0.0066


%H_1: beta_1 ≠ 0 or beta_3 ≠ 0.0066
%under H_0 the regressor PF drops out (beta_1 = 0) and PA is kept
Y_R = Y - 0.0066*Wkg - 0*PF
N = length(Y)
X_R = [ones(T,1),PA]
result_R = my_ols_routine(Y_R,X_R)
result_R.beta_ols
result_R.tstat
SSR_R = result_R.SSR
LR = T*log(SSR_R/SSR_U) %under H_0 is a chi2 with 2 degrees of freedom
%(because we're imposing 2 restrictions)
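%in general LR = -2*log(L_R/L_U), where L_R and L_U are the maximized
%likelihoods of the restricted and of the unrestricted model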
pvalue = 1 - chi2cdf(LR,2)
%if the null is rejected, we have to rely on the unrestricted model

%gradient of f(B)
grad = 2*A*B %d(B'*A*B)/dB = 2*A*B
%where
% - A is a nxn symmetric matrix (A' = A)
% - B is a nx1 vector of the betas
% - f(B) = B'*A*B
% grad is a column vector made by the derivatives of f(B) wrt beta_1, ..., beta_n
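%numerical check of the formula (sketch with a made-up symmetric matrix Ag_ex
%and a made-up vector Bg_ex): central finite differences vs 2*A*B
Ag_ex = [2 1; 1 3];
Bg_ex = [1; -1];
fg_ex = @(B) B'*Ag_ex*B;
h_ex = 1e-6; %step of the finite differences
grad_num_ex = zeros(2,1);
for j = 1:2
e_j = zeros(2,1);
e_j(j) = 1; %j-th unit vector
grad_num_ex(j) = (fg_ex(Bg_ex+h_ex*e_j) - fg_ex(Bg_ex-h_ex*e_j))/(2*h_ex);
end
grad_num_ex %close to 2*Ag_ex*Bg_ex = [2; -4]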
