CS1A (2)
INDICATIVE SOLUTION
Introduction
The indicative solution has been written by the Examiners with the aim of helping candidates. The solutions
given are only indicative. It is recognized that other points could be valid answers, and the Examiners have given
credit for any alternative approach or interpretation which they consider to be reasonable.
IAI CS1A-0523
Solution 1:
i) Correct Answer is Option C
It is a basic property of MGFs that, for any random variable, the MGF evaluated at t = 0 equals 1,
since M(0) = E[exp(0)] = 1. Option C returns a value of 2/3 at t = 0, hence it is not a
valid MGF. [1]
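The result can be obtained by integrating the pdf from a to x. It returns the following result:
$-e^{\lambda a} \left[ e^{-\lambda t} \right]_{a}^{x}$
$= -e^{\lambda a} \left[ e^{-\lambda x} - e^{-\lambda a} \right]$
$= e^{\lambda a} \left[ e^{-\lambda a} - e^{-\lambda x} \right]$ [1]
Setting this equal to u, rearranging for $e^{-\lambda x}$, taking logs to the base e on both sides and substituting λ = 0.10 and a = 0.05,
$-0.10 \, x = \ln\left( e^{-0.10 \times 0.05} - u / e^{0.10 \times 0.05} \right)$
$x = -10 \ln\left( (1-u) \, e^{-0.10 \times 0.05} \right)$
$x = -10 \ln\left( (1-u) \times 0.995012 \right)$ [1]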
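The inversion above can be checked with a short sketch. It assumes λ = 0.10 and a = 0.05 (the values substituted in the working) and that u is a Uniform(0, 1) random number; the function names are illustrative only and do not come from the question.

```python
import math
import random

LAMBDA = 0.10  # rate parameter, as substituted in the working above
A = 0.05       # lower limit a of the integral, as substituted in the working above


def cdf(x: float) -> float:
    """CDF from the working: exp(λa) * (exp(-λa) - exp(-λx)) for x >= a."""
    return math.exp(LAMBDA * A) * (math.exp(-LAMBDA * A) - math.exp(-LAMBDA * x))


def inverse_cdf(u: float) -> float:
    """Inverse of the CDF: x = -(1/λ) * ln((1 - u) * exp(-λa))."""
    return -(1.0 / LAMBDA) * math.log((1.0 - u) * math.exp(-LAMBDA * A))


def simulate(n: int, seed: int = 0) -> list:
    """Draw n values by applying the inverse CDF to uniform random numbers."""
    rng = random.Random(seed)
    return [inverse_cdf(rng.random()) for _ in range(n)]


if __name__ == "__main__":
    print(simulate(3))
```

Applying `cdf` to the output of `inverse_cdf` should return the original u, which is a quick way to confirm that the algebra has been inverted correctly.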
Solution 2:
i) Pr(Y < X / 4)
= Pr (Y < 250)
= Pr (3Y < X < ∞, 0 < Y < 250) [0.5]
$= \frac{3}{10^{3}} \left[ \frac{e^{-3y/1000}}{-3/1000} \right]_{0}^{250}$
$= e^{0} - e^{-3(250/1000)}$
= 1 – exp(-0.75) [1]
= 1 – 0.472
= 0.528
The probability that the ratio has in fact reduced to 1:4 is 0.528. [0.5]
[3]
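As a rough numerical check of the figure 0.528, the sketch below integrates the density 3/1000 · exp(-3y/1000) over 0 < y < 250 with a simple midpoint rule and compares it with the closed form 1 – exp(-0.75). The helper names and the step count are assumptions made for illustration.

```python
import math


def marginal_y(y: float) -> float:
    """Density 3/1000 * exp(-3y/1000) integrated in the working above."""
    return 3.0 / 1000.0 * math.exp(-3.0 * y / 1000.0)


def midpoint_integral(f, lower: float, upper: float, steps: int = 100_000) -> float:
    """Approximate the integral of f over [lower, upper] by the midpoint rule."""
    width = (upper - lower) / steps
    return sum(f(lower + (i + 0.5) * width) for i in range(steps)) * width


closed_form = 1.0 - math.exp(-0.75)
numeric = midpoint_integral(marginal_y, 0.0, 250.0)

if __name__ == "__main__":
    print(round(closed_form, 3), round(numeric, 3))
```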
ii) $f(y) = \int_{x=3y}^{\infty} \frac{3}{10^{6}} e^{-x/1000} \, dx$
$= \frac{3}{10^{6}} \left[ \frac{e^{-x/1000}}{-1/1000} \right]_{3y}^{\infty}$
$= \frac{3}{1000} e^{-3y/1000}$ [2]
iv) For random variables X and Y to be independent, f_X(x) * f_Y(y) = f_XY(x, y) must hold for all x and y.
By inspection, f_XY(x, y) does not contain any term in y whereas f_Y(y) does, so the
above cannot hold and hence X and Y are not independent. [1]
vi) We have decided to use g XY (x, y) and we know that for this joint probability density
function X and Y are independent random variables.
Hence, conditional expectation E (Y | X > 950) is independent of X and hence it is equal to
E(Y). [1]
Solution 3:
i) Correct Answer is Option A
All other options represent statements which are not true. [1]
$f(y) = \exp\left[ \frac{y\theta - b(\theta)}{a(\phi)} + c(y, \phi) \right]$ [0.5]
[1]
Where:
$\theta = \log\left( \mu / (1 - \mu) \right)$ [0.5]
$\phi = n$
$a(\phi) = 1 / \phi$ [0.5]
$c(y, \phi) = \log \binom{n}{ny}$ [0.5]
Using $\theta = \log(\mu / (1 - \mu))$, we can show that $\mu = e^{\theta} / (1 + e^{\theta})$
$b(\theta) = \log(1 + e^{\theta})$ [1]
[4]
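The parameterisation above can be sanity-checked numerically. The sketch below assumes that Y = Z/n with Z ~ Binomial(n, µ), which is the setting the listed components correspond to, and compares the binomial probability with the exponential family expression. The values n = 10 and µ = 0.3 are hypothetical and chosen only for illustration.

```python
import math


def binomial_pmf_of_proportion(y: float, n: int, mu: float) -> float:
    """P(Y = y) where Y = Z / n and Z ~ Binomial(n, mu)."""
    successes = round(n * y)
    return math.comb(n, successes) * mu**successes * (1 - mu) ** (n - successes)


def exponential_family_density(y: float, n: int, mu: float) -> float:
    """exp[(y*theta - b(theta)) / a(phi) + c(y, phi)] with the components listed above."""
    theta = math.log(mu / (1 - mu))            # natural parameter
    phi = n                                    # scale parameter
    a = 1 / phi
    b = math.log(1 + math.exp(theta))
    c = math.log(math.comb(n, round(n * y)))
    return math.exp((y * theta - b) / a + c)


if __name__ == "__main__":
    n, mu = 10, 0.3  # hypothetical values, not taken from the question
    for k in range(n + 1):
        y = k / n
        print(y, binomial_pmf_of_proportion(y, n, mu), exponential_family_density(y, n, mu))
```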
Solution 4:
i) Correct Answer is Option D
All methods are valid methods to estimate the value of a parameter based on information from a
sample. [1]
ii) $\hat{p}_E = \frac{33}{6 \times 7} = 0.786$, $\hat{p}_Z = \frac{37}{11 \times 7} = 0.481$, $\hat{p} = \frac{70}{17 \times 7} = 0.588$ [1]
Let $7n_i$ be the total number of questions for the whole of batch i and $B_i$ be the total number of correct
answers by batch i.
$L(b; \theta) = (2\theta)^{B_E} (1 - 2\theta)^{7n_E - B_E} \, \theta^{B_Z} (1 - \theta)^{7n_Z - B_Z} \times \text{constant}$ [2]
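$l(b; \theta) = \ln L(b; \theta)$
$= 33 \ln(2\theta) + (42 - 33) \ln(1 - 2\theta) + 37 \ln(\theta) + (77 - 37) \ln(1 - \theta) + \text{constant}$
$= 33 \ln(2\theta) + 9 \ln(1 - 2\theta) + 37 \ln(\theta) + 40 \ln(1 - \theta) + \text{constant}$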
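iv) $\frac{dl}{d\theta} = \frac{66}{2\theta} - \frac{18}{1 - 2\theta} + \frac{37}{\theta} - \frac{40}{1 - \theta}$
$= \frac{70}{\theta} - \frac{18}{1 - 2\theta} - \frac{40}{1 - \theta}$ [1.5]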
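One way to locate the maximum is to solve dl/dθ = 0 numerically. The sketch below is an illustration rather than the Examiners' method: it applies bisection to the derivative on 0 < θ < 0.5, the range in which both 2θ and 1 – 2θ are valid probabilities. The function names and the tolerance are assumptions.

```python
def score(theta: float) -> float:
    """dl/dθ = 70/θ - 18/(1 - 2θ) - 40/(1 - θ), valid for 0 < θ < 0.5."""
    return 70 / theta - 18 / (1 - 2 * theta) - 40 / (1 - theta)


def bisect_root(f, lower: float, upper: float, tol: float = 1e-12) -> float:
    """Find a root of f in (lower, upper), assuming f changes sign on the interval."""
    f_lower = f(lower)
    while upper - lower > tol:
        mid = (lower + upper) / 2
        f_mid = f(mid)
        if (f_lower > 0) == (f_mid > 0):
            lower, f_lower = mid, f_mid  # root lies in the upper half
        else:
            upper = mid                  # root lies in the lower half
    return (lower + upper) / 2


theta_hat = bisect_root(score, 1e-6, 0.5 - 1e-6)

if __name__ == "__main__":
    print(round(theta_hat, 4))
```

Multiplying the derivative through by θ(1 – 2θ)(1 – θ) gives the quadratic 119θ² – 134θ + 35 = 0, whose root in the admissible range is θ = 7/17, which the numerical root should reproduce.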
From the table it appears that MLE gives a better fit as predicted values are close to expected
values. [1]
[2]
[10 Marks]
Solution 5:
i) Correct Answer is Option A
Since the sample size is small and since the population variance is not known, t distribution
would be suitable to perform this test. [1]
Pooled Variance:
$S_p^2 = \frac{1}{18} (9 \times 144.7484 + 9 \times 233.7427)$
$= 189.2456$ [1]
95 % confidence interval is: $(\bar{X}_A - \bar{X}_B) \pm t_{n_A + n_B - 2} \, S_p \sqrt{\frac{1}{n_A} + \frac{1}{n_B}}$
$= (51.48 - 40.14) \pm 2.101 \sqrt{189.2456} \sqrt{\frac{2}{10}}$
$= (-1.58569, 24.26569)$ [1]
Since 0 lies in the above confidence interval, we can conclude that at 5% level of significance,
there is insufficient evidence to reject the null hypothesis. [0.5]
[4]
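The arithmetic of the pooled variance and the confidence interval can be reproduced with the sketch below. It assumes sample sizes of 10 in each group (implied by the 9 degrees of freedom per sample) and takes the critical value 2.101 directly from the working rather than computing it; the variable names are illustrative.

```python
import math

n_a = n_b = 10                        # sample sizes implied by the working above
var_a, var_b = 144.7484, 233.7427     # sample variances
mean_a, mean_b = 51.48, 40.14         # sample means
t_crit = 2.101                        # t value with 18 degrees of freedom, as quoted above

# Pooled variance weights each sample variance by its degrees of freedom.
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)

# Half-width of the interval uses the pooled standard deviation.
half_width = t_crit * math.sqrt(pooled_var) * math.sqrt(1 / n_a + 1 / n_b)
diff = mean_a - mean_b
interval = (diff - half_width, diff + half_width)

if __name__ == "__main__":
    print(round(pooled_var, 4), tuple(round(v, 5) for v in interval))
```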
Normality of the population data and equal population variances are the assumptions used for
conducting a t-test. [1]
$= t_{2.5\%,\,2n-2} \times \frac{38.9097}{\sqrt{n}}$ [1]
This should be less than 20, so using percentage points of the t distribution,
We have:
n = 15 => $t_{2.5\%,\,2n-2} = 2.048$
=> 38.9097 * 2.048/√15 = 20.57511 > 20 [1]
And n = 16 => $t_{2.5\%,\,2n-2} = 2.042$
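The trial-and-error comparison above can be laid out as a small sketch. It uses only the two t values quoted in the working (for n = 15 and n = 16) and the limit of 20; the dictionary and function names are assumptions made for illustration.

```python
import math

SCALE = 38.9097
T_CRIT = {15: 2.048, 16: 2.042}  # t values quoted in the working above
LIMIT = 20.0


def bound(n: int) -> float:
    """Evaluate t * 38.9097 / sqrt(n) for a sample size with a quoted t value."""
    return SCALE * T_CRIT[n] / math.sqrt(n)


if __name__ == "__main__":
    for n in sorted(T_CRIT):
        print(n, round(bound(n), 5), bound(n) < LIMIT)
```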
Solution 6:
i) Correct Answer is Option B
Since the data is perfectly monotonically decreasing, both the coefficients would be equal to -1
(perfect negative correlation).
[2]
ii) H0: ρ = 0
H1: ρ > 0 [0.5]
Under H0, the sampling distribution of Kendall’s rank correlation coefficient is approximately
normal with mean 0 and variance = 2(2n+5) / [9n(n-1)] [0.5]
Hence, we can conclude that the inflation rates for Freedonia and Genovia are not positively
correlated.
[4]
iii) The completed table with the rank values for Genovia is given below:
In case of Rank 1 (Freedonia), there are 4 concordant pairs and 5 discordant pairs. So, the rank
value here for Genovia is higher than 4 rank values (10, 9, 8, 7) but lower than 5 rank values (1,
2, 3, 4, 5). It must be 6.
In case of Rank 2 (Freedonia), there are 6 concordant pairs and 2 discordant pairs. So, the rank
value here for Genovia is higher than 6 rank values (10, 9, 8, 7, 5, 4) but lower than 2 rank values
(1, 2). It must be 3. Note that 6 has already been assigned against Rank 1 (Freedonia), so it is
not taken into consideration here.
The above process can be continued till we get all rank values for Genovia. [2.5]
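The step-by-step reasoning above can be sketched as a small helper. It assumes 10 untied rank values and that the rows are processed in order of the Freedonia rank, so that each concordant pair corresponds to a larger unused Genovia rank value and each discordant pair to a smaller one. Only the two rows worked through above are used; the function name is hypothetical.

```python
def assign_rank(remaining, concordant: int, discordant: int) -> int:
    """Return the unused rank value that has `concordant` larger and
    `discordant` smaller values among the other unused rank values."""
    ordered = sorted(remaining)
    if concordant + discordant != len(ordered) - 1:
        raise ValueError("counts must cover all other unused rank values")
    return ordered[discordant]


remaining = set(range(1, 11))            # Genovia rank values 1 to 10

first = assign_rank(remaining, concordant=4, discordant=5)
remaining.remove(first)                  # a value already assigned is excluded

second = assign_rank(remaining, concordant=6, discordant=2)
remaining.remove(second)

if __name__ == "__main__":
    print(first, second)
```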
$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$ [0.5]
iv) If we want to retain those components which explain 90% of the total variance, PC1 should be
retained as it accounts for 95.88% of the total variance. [1]
Based on the Scree Plot, the plot becomes flat from PC2 and onwards. Hence using Scree Test,
PC1 should be retained as variances level off after PC1. [1]
As per Kaiser’s Test, only those PCs with variances greater than 1 should be retained (applicable
in case of scaled data). Since only PC1 has variance greater than 1, only PC1 should be retained. [1]
[3]
Solution 7:
i) Correct Answer is Option B
Since we are modelling N which represents the number of trials to be performed until the first
success occurs, the appropriate distribution would be geometric distribution. [1]
ii) The prior distribution of “p” is uniform over the interval [0,1]
So $f_{\text{prior}}(p) = 1$, $0 \le p \le 1$ [0.5]
Sample contains only one observation $n_1$. So the likelihood function of “p” is:
$L(p) = P(N = n_1) = (1 - p)^{n_1 - 1} \, p$
The above expression is based on the fact that N | p ~ Geometric(p) [1]
[3]
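Combining the uniform prior with the geometric likelihood gives a posterior proportional to p(1 – p)^(n1 – 1), which is the kernel of a Beta(2, n1) distribution. The sketch below normalises this kernel numerically and computes its mean, which can be compared with the standard Beta mean 2/(n1 + 2). The value n1 = 5 is hypothetical and is not taken from the question; the helper names are also assumptions.

```python
def posterior_kernel(p: float, n1: int) -> float:
    """Prior (uniform, equal to 1) times likelihood (1 - p)^(n1 - 1) * p."""
    return 1.0 * (1.0 - p) ** (n1 - 1) * p


def midpoint_integral(f, steps: int = 200_000) -> float:
    """Approximate the integral of f over [0, 1] by the midpoint rule."""
    width = 1.0 / steps
    return sum(f((i + 0.5) * width) for i in range(steps)) * width


def posterior_mean(n1: int) -> float:
    """Mean of the normalised posterior, computed numerically."""
    norm = midpoint_integral(lambda p: posterior_kernel(p, n1))
    return midpoint_integral(lambda p: p * posterior_kernel(p, n1)) / norm


if __name__ == "__main__":
    n1 = 5  # hypothetical observed value, not taken from the question
    print(posterior_mean(n1), 2 / (n1 + 2))
```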
v) Bayesian estimate of “p” under squared error loss is the mean of the posterior distribution
which is given by:
Solution 8:
i) Correct Answer is Option B
Value of estimates for the intercept, X1 and X2 will be the values of α, β1 and β2 respectively. [1]
= 1013.90 / 1472.10
= 0.6887
$\hat{\lambda} = \bar{y} - \hat{\mu} \, \bar{z}$
= 80.30 – 0.6887 * 43.70
= 50.2019 [1]
$\hat{y}_{\text{Emerald City}}$
= 50.2019 + 0.6887 (125-90)
= 74.3064 [0.5]
$\hat{e}_{\text{Emerald City}}$
= 77 – 74.3064
= 2.6936 [0.5]
$\hat{y}_{\text{Dark City}}$
= 50.2019 + 0.6887 (124-72)
= 86.0143 [0.5]
$\hat{e}_{\text{Dark City}}$
= 88 – 86.0143
= 1.9857 [0.5]
[5]
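The fitted values and residuals above can be reproduced with the sketch below. The variable names are assumptions, and the explanatory value for each city is taken as the difference shown in the working (125 – 90 and 124 – 72). The code keeps the unrounded slope, so its results can differ from the figures above in the third decimal place, since the working uses the slope rounded to 0.6887.

```python
s_cross, s_zz = 1013.90, 1472.10   # sums of cross-products and squares from the working
y_bar, z_bar = 80.30, 43.70        # sample means of the response and explanatory variable

slope = s_cross / s_zz
intercept = y_bar - slope * z_bar

# (explanatory value, observed response) for each city, as used in the working
cities = {"Emerald City": (125 - 90, 77), "Dark City": (124 - 72, 88)}

results = {}
for name, (z, observed) in cities.items():
    fitted = intercept + slope * z
    results[name] = (fitted, observed - fitted)

if __name__ == "__main__":
    print(round(slope, 4), round(intercept, 4))
    for name, (fitted, residual) in results.items():
        print(f"{name}: fitted {fitted:.4f}, residual {residual:.4f}")
```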
v) $R^2$
$= S_{yz}^2 / (S_{yy} \, S_{zz})$
$= (1013.90)^2 / (776.10 \times 1472.10)$
= 89.9778% [1]
Adjusted $R^2$
= 1 – (10 – 1) / (10 – 1 – 1) * (1 – 0.899778)
= 88.725% [1]
[2]
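A brief sketch of the two calculations above follows. It assumes n = 10 observations and k = 1 explanatory variable, as used in the adjusted figure; the variable names are illustrative.

```python
s_cross, s_resp, s_zz = 1013.90, 776.10, 1472.10  # sums from the working above
n, k = 10, 1                                      # observations and explanatory variables

# Coefficient of determination for a regression with one explanatory variable.
r_squared = s_cross**2 / (s_resp * s_zz)

# Adjustment penalises the number of explanatory variables relative to n.
adjusted_r_squared = 1 - (n - 1) / (n - k - 1) * (1 - r_squared)

if __name__ == "__main__":
    print(f"{r_squared:.4%}", f"{adjusted_r_squared:.3%}")
```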
vi) Correct Answer is Option D
2 = (0.6887) * $z_{\text{reduction}}$
$z_{\text{reduction}}$ = 2/0.6887 = 2.90 [1]
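vii) a) We are given that $\bar{w} = 0$.
$S_{ww} = \sum (w - \bar{w})^2 = \sum (w - 0)^2 = \sum (z - \bar{z})^2 = S_{zz}$ [1]
b) $\hat{\delta} = \bar{y} - \hat{\mu} \, \bar{w} = \bar{y} - \hat{\mu} \times 0 = \bar{y} = \hat{\lambda} + \hat{\mu} \, \bar{z}$ [1]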
[4]
viii)
City                   Multiple Linear      Bivariate Linear    Improvised Bivariate
                       Regression Model     Model               Linear Model
Emerald City (ŷ, e)    (74.49, 2.51)        (74.31, 2.69)       (74.31, 2.69)
Dark City (ŷ, e)       (85.72, 2.28)        (86.02, 1.99)       (86.02, 1.98)
Adjusted R²            87.21%               88.73%              88.72%
[1]
In terms of the predicted responses and residuals, for Emerald City, the multiple linear
regression model appears to be a better fit. However, for Dark City, the bivariate linear model
gives better results as compared to the multiple linear regression model.
In terms of Adjusted R² (which measures the proportion of variation in the response explained by
the model, adjusted for the number of explanatory variables), the Bivariate Linear Model appears
to be a better fit. [1]
The Improvised Bivariate Linear Model just employs a linear combination of the explanatory
variable of the Original Bivariate Linear Model and hence gives almost the same results as the
original model. Contrary to the presumption made by your son, it is clear that the improvised model
does not provide a better fit as compared to the original model. [1]
[3]
[20 Marks]
*****************