ML U3 MCQ


Which of the following steps/assumptions in regression modeling impacts the trade-off between under-fitting and over-fitting the most?
A. The polynomial degree
B. Whether we learn the weights by matrix inversion or gradient descent
C. The use of a constant term
Answer: A

Suppose you have the following data with one real-valued input variable and one real-valued output variable. What is the leave-one-out cross-validation mean square error in the case of linear regression (Y = bX + c)?

10/27
20/27
50/27
49/27
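
The data table for this question is not reproduced in this copy, but the computation itself is mechanical. Below is a minimal sketch (assuming scikit-learn and NumPy; the arrays are placeholders, not the question's actual data) of how the leave-one-out cross-validation MSE of a fitted line Y = bX + c would be computed.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut
from sklearn.metrics import mean_squared_error

# Placeholder data; substitute the question's actual (X, Y) values here.
x_data = np.array([[0.0], [1.0], [2.0]])
y_data = np.array([2.0, 1.0, 4.0])

errors = []
for train_idx, test_idx in LeaveOneOut().split(x_data):
    # Fit Y = bX + c on all points except one, then score the held-out point.
    model = LinearRegression().fit(x_data[train_idx], y_data[train_idx])
    pred = model.predict(x_data[test_idx])
    errors.append(mean_squared_error(y_data[test_idx], pred))

print("LOOCV MSE:", np.mean(errors))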

Which of the following is/are true about the maximum likelihood estimate (MLE)?

1. MLE may not always exist
2. MLE always exists
3. If MLE exists, it may not be unique
4. If MLE exists, it must be unique

A. 1 and 4
B. 2 and 3
C. 1 and 3
D. 2 and 4
Answer: C

Suppose a linear regression model perfectly fits the training data (training error is zero). Which of the following statements is true?
You will always have zero test error
You cannot have zero test error
None of the above

Which one of the following statements is true regarding residuals in regression analysis?
Mean of residuals is always zero
Mean of residuals is always less than zero
Mean of residuals is always greater than zero
There is no such rule for residuals

Which one of the following is true about heteroskedasticity?
Linear Regression with varying error terms
Linear Regression with constant error terms
Linear Regression with zero error terms
None of the above

Which of the following indicates a fairly strong relationship between X and Y?
Correlation coefficient = 0.9
The p-value for the null hypothesis "beta coefficient = 0" is 0.0001
The t-statistic for the null hypothesis "beta coefficient = 0" is 30
None of these

Which of the following assumptions do we make while deriving the linear regression parameters?

1. The true relationship between the dependent variable y and the predictor x is linear
2. The model errors are statistically independent
3. The errors are normally distributed with zero mean and constant standard deviation

A. 1, 2 and 3
B. 1 and 3
C. All of the above
Answer: C

To test the linear relationship between y (dependent) and x (independent) continuous variables, which of the following plots is best suited?
Scatter plot
Bar chart
Histograms
None of these

Generally, which of the following method(s) is used for predicting a continuous dependent variable?

1. Linear Regression
2. Logistic Regression

1 and 2
Only 1
Only 2
None of the above

A correlation between the age and health of a person was found to be -1.09. On the basis of this, you would tell the doctors that:
Age is a good predictor of health
Age is a poor predictor of health
None of these
All of the above

Which of the following offsets do we use in the case of a least-squares line fit? Suppose the horizontal axis is the independent variable and the vertical axis is the dependent variable.
A. Vertical offset
B. Perpendicular offset
C. Both, depending on the situation
D. Both A and B

Suppose we have generated the data with the help of polynomial regression of degree 3 (a degree-3 polynomial will perfectly fit this data). Now consider the points below and choose an option based on them.

1. Simple linear regression will have high bias and low variance
2. Simple linear regression will have low bias and high variance
3. A polynomial of degree 3 will have low bias and high variance
4. A polynomial of degree 3 will have low bias and low variance

A. Only 1
B. 1 and 3
C. 1 and 4
D. None of the above
Answer: C

Suppose you are training a linear regression model. Now consider these points.

1. Overfitting is more likely if we have less data
2. Overfitting is more likely when the hypothesis space is small

Which of the above statement(s) is/are correct?
Both are false
1 is false and 2 is true
1 is true and 2 is false
None of the above

Suppose we fit a Lasso regression to a data set which has 100 features (X1, X2, ..., X100). Now we rescale one of these features by multiplying it by 10 (say that feature is X1), and then refit the Lasso regression with the same regularization parameter.

Now, which of the following options will be correct?
A. It is more likely for X1 to be excluded from the model
B. It is more likely for X1 to be included in the model
C. Can't say
D. None of the above
Answer: B
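
The intuition behind this answer can be checked empirically. The sketch below (assuming scikit-learn; the synthetic data and alpha value are illustrative, not from the question) refits a Lasso after multiplying one column by 10: the rescaled column needs a ten-times smaller coefficient to express the same effect, so it pays a smaller L1 penalty and is less likely to be zeroed out.

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 0.3 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

lasso = Lasso(alpha=0.2)
coef_before = lasso.fit(X, y).coef_[0]        # coefficient on X1 as given

X_scaled = X.copy()
X_scaled[:, 0] *= 10                          # rescale X1 by a factor of 10
coef_after = lasso.fit(X_scaled, y).coef_[0]  # coefficient on the rescaled X1

print("X1 coefficient before scaling:", coef_before)
print("X1 coefficient after scaling: ", coef_after)
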
Which of the following is true about Ridge or Lasso regression methods in the case of feature selection?
Ridge regression uses subset selection of features
Lasso regression uses subset selection of features
Both use subset selection of features
All of the above

Which of the following statement(s) can be true after adding a variable to a linear regression model?

1. R-squared and adjusted R-squared both increase
2. R-squared increases and adjusted R-squared decreases
3. R-squared decreases and adjusted R-squared decreases
4. R-squared decreases and adjusted R-squared increases

1 and 2
1 and 3
2 and 4
None of these
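
A standard fact worth recalling here (background, not stated in the question itself): with ordinary least squares, adding a regressor can never increase the residual sum of squares, so R-squared never decreases; adjusted R-squared = 1 - (1 - R^2)(n - 1)/(n - k - 1), where n is the number of observations and k the number of predictors, penalizes the extra parameter and can move in either direction.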

Which of the following metrics can be used for evaluating regression models?

1. R-squared
2. Adjusted R-squared
3. F statistic
4. RMSE / MSE / MAE

2 and 4
1 and 2
2, 3 and 4
All of the above

We can also compute the coefficients of linear regression with the help of an analytical method called the "normal equation". Which of the following is/are true about the normal equation?

1. We don't have to choose the learning rate
2. It becomes slow when the number of features is very large
3. There is no need to iterate

1 and 2
1 and 3
2 and 3
1, 2 and 3

The expected value of Y is a linear function of the variables X (X1, X2, ..., Xn), and the regression line is defined as:
Y = β0 + β1 X1 + β2 X2 + ... + βn Xn
Which of the following statement(s) is/are true?

1. If Xi changes by an amount ∆Xi, holding the other variables constant, then the expected value of Y changes by a proportional amount βi ∆Xi, for some constant βi (which in general could be a positive or negative number).
2. The value of βi is always the same, regardless of the values of the other X's.
3. The total effect of the X's on the expected value of Y is the sum of their separate effects.

A. 1 and 2
B. 1 and 3
C. 2 and 3
D. 1, 2 and 3
Answer: D

How many coefficients do you need to estimate in a simple linear regression model (one independent variable)?
1
2
Can't say

The graphs below show two fitted regression lines (A and B) on randomly generated data. Now, I want to find the sum of residuals in cases A and B.

Which of the following statements is true about the sums of residuals of A and B?
A. A has a higher sum of residuals than B
B. A has a lower sum of residuals than B
C. Both have the same sum of residuals
D. None of these
Answer: C

If two variables are correlated, is it necessary that they have a linear relationship?
A. Yes
B. No
C. Both A and B
D. None of the above

Correlated variables can have a zero correlation coefficient. True or False?
TRUE
FALSE

Suppose I applied a logistic regression model on data and got training accuracy X and testing accuracy Y. Now I want to add a few new features to the data. Select the option(s) which is/are correct in such a case.
Note: Consider that the remaining parameters are the same.

1. Training accuracy always decreases
2. Training accuracy always increases or remains the same
3. Testing accuracy always decreases
4. Testing accuracy always increases or remains the same

Only 2
Only 1
Only 3
All of the above

The graph below represents a regression line predicting Y from X. The values on the graph show the residuals for each predicted value. Use this information to compute the SSE.

3.02
0.75
1.01
None of these

Suppose the distribution of salaries in a company X has median $35,000, and the 25th and 75th percentiles are $21,000 and $53,000 respectively.
Would a person with a salary of $1 be considered an outlier?
A. Yes
B. No
C. More information is required
D. None of these
Answer: C
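
For context, under the common 1.5 x IQR rule (an assumption; the question does not name a criterion): IQR = $53,000 - $21,000 = $32,000, and the lower fence is $21,000 - 1.5 x $32,000 = -$27,000, so a $1 salary would not be flagged by that rule alone. The answer therefore hinges on which outlier criterion is adopted, hence "more information is required".
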
Which of the following options is true regarding regression and correlation?
Note: y is the dependent variable and x is the independent variable.
The relationship is symmetric between x and y in both.
The relationship is not symmetric between x and y in both.
The relationship is not symmetric between x and y in case of correlation
but in case of regression it is symmetric.
The relationship is symmetric between x and y in case of correlation but
in case of regression it is not symmetric.

True or False: Is logistic regression a supervised machine learning algorithm?
TRUE

FALSE

True or False: Is logistic regression mainly used for regression?
TRUE
FALSE
Answer: FALSE

True or False: Is it possible to design a logistic regression algorithm using a neural network algorithm?
TRUE
FALSE

True or False: Is it possible to apply a logistic regression algorithm to a 3-class classification problem?
TRUE
FALSE
Answer: TRUE

Which of the following methods do we use to best fit the data in logistic regression?
A. Least squares error
B. Maximum likelihood
C. Jaccard distance
D. Both A and B

One very good method to analyze the performance of logistic regression is AIC, which is similar to R-squared in linear regression. Which of the following is true about AIC?
We prefer a model with the minimum AIC value
We prefer a model with the maximum AIC value
Both, depending on the situation
None of the above
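
For reference (a standard definition, not part of the question text): AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood; lower AIC indicates a better trade-off between goodness of fit and model complexity, which is why the minimum-AIC model is preferred.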

True or False: Standardisation of features is required before training a logistic regression model.
TRUE
FALSE
Answer: FALSE

Which of the following algorithms do we use for variable selection?
LASSO
Ridge
Both
All of these

Suppose you have been given a fair coin and you want to find the odds of getting heads. Which of the following options is true in such a case?
Odds will be 0
Odds will be 0.5
Odds will be 1
None of the above
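
As a reminder of the definition (standard, not stated in the question): odds = p / (1 - p), so for a fair coin p = 0.5 and the odds of heads are 0.5 / 0.5 = 1.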

The logit function (given as l(x)) is the log of the odds function. What could be the range of the logit function in the domain x = [0, 1]?
(-∞, ∞)
(0, 1)
(0, ∞)
(-∞, 0)
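
For intuition (standard algebra, not part of the question text): logit(x) = ln(x / (1 - x)); as x approaches 0 the ratio tends to 0 and its log to minus infinity, and as x approaches 1 the ratio tends to infinity, so over (0, 1) the logit takes values in (-∞, ∞).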

Which of the following options is true?
Linear regression error values have to be normally distributed, but in the case of logistic regression this is not required
Both linear regression and logistic regression error values have to be normally distributed
Both linear regression and logistic regression error values do not have to be normally distributed

Which of the following is true regarding the logistic function for any value "x"?
Note:
Logistic(x) is the logistic function of any number "x"
Logit(x) is the logit function of any number "x"
Logit_inv(x) is the inverse logit function of any number "x"

A. Logistic(x) = Logit(x)
B. Logistic(x) = Logit_inv(x)
C. None of these
Answer: B
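
For reference (standard definitions, not part of the question): the logistic (sigmoid) function is logistic(x) = 1 / (1 + e^(-x)) and the logit is logit(p) = ln(p / (1 - p)); the two are inverses of each other, which is why Logistic(x) = Logit_inv(x).
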
Suppose you applied a logistic regression model on given data and got a training accuracy X and testing accuracy Y. Now, you want to add a few new features to the same data. Select the option(s) which is/are correct in such a case.

Note: Consider that the remaining parameters are the same.

A. Training accuracy increases
B. Training accuracy increases or remains the same
C. Testing accuracy decreases
D. Testing accuracy increases or remains the same
Answer: A and D

Choose which of the following options is true regarding the One-vs-All method in logistic regression.
We need to fit n models for an n-class classification problem
We need to fit n-1 models to classify into n classes
We need to fit only 1 model to classify into n classes
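
A minimal sketch (assuming scikit-learn; the toy data is purely illustrative) of why n models are needed: One-vs-Rest wraps one binary logistic regression per class, so a 3-class problem produces 3 fitted estimators.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Toy 3-class data.
X = np.array([[0.0], [0.1], [1.0], [1.1], [2.0], [2.1]])
y = np.array([0, 0, 1, 1, 2, 2])

# One binary classifier is trained per class.
ovr = OneVsRestClassifier(LogisticRegression()).fit(X, y)
print("number of fitted binary models:", len(ovr.estimators_))  # -> 3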

Suppose you are using a logistic regression model on a huge dataset. One of the problems you may face with such huge data is that logistic regression will take a very long time to train. What would you do if you want to train logistic regression on the same data so that it takes less time and gives comparatively similar (though not necessarily identical) accuracy?
Decrease the learning rate and decrease the number of iterations
Decrease the learning rate and increase the number of iterations
Increase the learning rate and increase the number of iterations
Increase the learning rate and decrease the number of iterations

Which of the following images shows the cost function for y = 1?
The graphs show the loss function in logistic regression (y-axis: loss, x-axis: log probability) for a two-class classification problem.
Note: Y is the target class.
A
B
Both
None of these

Logistic regression is used when you want to:
Predict a dichotomous variable from continuous or dichotomous variables
Predict a continuous variable from dichotomous variables
Predict any categorical variable from several other categorical variables
Predict a continuous variable from dichotomous or continuous variables

The correlation coefficient is used to determine:
A specific value of the y-variable given a specific value of the x-variable
A specific value of the x-variable given a specific value of the y-variable
The strength of the relationship between the x and y variables
None
