Exercises ML PDF
Uploaded by moktarm243

Exercises

1. Using Naïve Bayes, what is the predicted gender for a person whose weight is 130?

Weight Eye Hair length Gender


180 Blue Short Male
100 Brown Long Female
150 Blue Long Female
130 Blue Long Female
170 Brown Short Male
150 Blue Long Female
140 Brown Short Female
190 Blue Long Male
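
As a quick way to check the hand computation, here is a minimal frequency-count Naive Bayes sketch in Python over the table above. Since the question conditions only on weight = 130, only the weight column is used; this is an illustration, not part of the exercise:

```python
from collections import Counter

# Training data from the table above: (weight, gender)
data = [(180, "Male"), (100, "Female"), (150, "Female"), (130, "Female"),
        (170, "Male"), (150, "Female"), (140, "Female"), (190, "Male")]

def naive_bayes_weight(weight):
    """Score each gender by prior * likelihood, using raw frequency counts."""
    n = len(data)
    class_counts = Counter(g for _, g in data)
    scores = {}
    for g, count in class_counts.items():
        likelihood = sum(1 for w, gg in data if gg == g and w == weight) / count
        scores[g] = (count / n) * likelihood  # P(gender) * P(weight | gender)
    return scores

scores = naive_bayes_weight(130)
prediction = max(scores, key=scores.get)
```

With these counts, P(Female) * P(130 | Female) = (5/8) * (1/5) = 0.125 while the Male score is 0, so the predicted class is Female.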

2. Consider a dataset with two variables: height (cm) and weight (kg). Each point is classified
as Normal or Underweight. Based on the data below, classify the new sample (weight 57 kg,
height 170 cm) as Normal or Underweight using the KNN algorithm. To find the nearest
neighbors, compute the Euclidean distance, and let k = 3.

Weight Height Class


51 167 Underweight
62 182 Normal
69 176 Normal
64 173 Normal
65 172 Normal
56 174 Underweight
58 169 Normal
57 173 Normal
55 170 Normal
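
To verify the distances and the majority vote by machine, a minimal KNN sketch over the table above might look like this (an illustration, not part of the exercise):

```python
import math
from collections import Counter

# Training data from the table above: (weight, height, class)
train = [(51, 167, "Underweight"), (62, 182, "Normal"), (69, 176, "Normal"),
         (64, 173, "Normal"), (65, 172, "Normal"), (56, 174, "Underweight"),
         (58, 169, "Normal"), (57, 173, "Normal"), (55, 170, "Normal")]

def knn_classify(query, k=3):
    """Label the query by majority vote among its k Euclidean-nearest neighbors."""
    neighbours = sorted((math.dist(query, (w, h)), cls) for w, h, cls in train)
    votes = Counter(cls for _, cls in neighbours[:k])
    return votes.most_common(1)[0][0]

result = knn_classify((57, 170), k=3)
```

The three nearest neighbors of (57, 170) are (58, 169), (55, 170), and (57, 173), all labeled Normal, so the vote is unanimous.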

3. Let’s consider a simple dataset comprised of 10 data samples:


Age group    Income status    Class (will they buy a car?)    Prediction
Young Low Yes Yes
Old Low No Yes
Middle-aged Low No No
Young Medium Yes No
Middle-aged Medium Yes Yes
Young Low No No
Old High Yes Yes
Middle-aged Medium Yes Yes
Middle-aged High Yes No
Old Low No Yes

Given the two inputs Young and Low, we want to compute the probability that such a person will buy a car by
using Naive Bayes.
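
A frequency-count sketch of the computation follows; note the assumption that the third column (the class, "will they buy a car?") is the target, with the separate Prediction column ignored:

```python
from collections import Counter

# Rows from the table above: (age group, income status, class).
# Assumption: the class column is the target; the Prediction column is ignored.
rows = [("Young", "Low", "Yes"), ("Old", "Low", "No"),
        ("Middle-aged", "Low", "No"), ("Young", "Medium", "Yes"),
        ("Middle-aged", "Medium", "Yes"), ("Young", "Low", "No"),
        ("Old", "High", "Yes"), ("Middle-aged", "Medium", "Yes"),
        ("Middle-aged", "High", "Yes"), ("Old", "Low", "No")]

def naive_bayes(age, income):
    """Score each class by P(class) * P(age | class) * P(income | class)."""
    n = len(rows)
    class_counts = Counter(c for _, _, c in rows)
    scores = {}
    for c, count in class_counts.items():
        p_age = sum(1 for a, _, cc in rows if cc == c and a == age) / count
        p_income = sum(1 for _, i, cc in rows if cc == c and i == income) / count
        scores[c] = (count / n) * p_age * p_income
    return scores

scores = naive_bayes("Young", "Low")
p_yes = scores["Yes"] / (scores["Yes"] + scores["No"])
```

Under this reading of the table, the Yes score is (6/10)(2/6)(1/6) ≈ 0.033 and the No score is (4/10)(1/4)(4/4) = 0.1, giving a normalized buying probability of 0.25.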

4. Suppose we have four objects (four types of medicine), and each object has two attribute
features (pH and weight index). Each medicine therefore represents one point with two attributes (x, y)
that we can represent as a coordinate in an attribute space, as shown in the figure below.
Use Euclidean distance in your distance computations, assume k = 2, and let the initial cluster centers be
c1 = (1, 1) and c2 = (2, 1). Show two iterations of the k-means clustering algorithm for these data points,
giving the cluster means and cluster contents after each iteration.
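
The figure with the four medicine points is not reproduced here, so the sketch below uses four hypothetical points purely to illustrate the two-iteration procedure; substitute the coordinates from your figure. Each iteration assigns every point to its nearest centroid and then recomputes each centroid as the mean of its cluster (the sketch assumes no cluster goes empty):

```python
import math

def kmeans(points, centroids, iterations):
    """Plain k-means: assign each point to its nearest centroid, then update means."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # New centroid = mean of the cluster's points (assumes clusters are non-empty).
        centroids = [(sum(x for x, _ in c) / len(c),
                      sum(y for _, y in c) / len(c)) for c in clusters]
    return centroids, clusters

# Hypothetical points; replace with the values from the figure.
points = [(1, 1), (2, 1), (4, 3), (5, 4)]
centroids, clusters = kmeans(points, [(1, 1), (2, 1)], iterations=2)
```

Printing the centroids and cluster contents after each pass of the loop reproduces exactly what the exercise asks you to show by hand.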

5. Suppose we have the height, weight, and T-shirt size of some customers, and we need to predict the T-shirt size
of a new customer given only their height and weight. The data, including height, weight, and T-shirt size, is
shown below.

A new customer named 'Monica' has height 161 cm and weight 61 kg.

Use the Manhattan distance.

Let k be 5.
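
The customer table for this exercise is not reproduced above, so the sketch below pairs a Manhattan-distance KNN with a small hypothetical (height, weight, size) table just to show the mechanics; swap in the real rows before checking Monica's size:

```python
from collections import Counter

def knn_manhattan(query, train, k=5):
    """Majority vote among the k nearest rows under Manhattan (L1) distance."""
    neighbours = sorted(
        (abs(query[0] - h) + abs(query[1] - w), size) for h, w, size in train
    )
    votes = Counter(size for _, size in neighbours[:k])
    return votes.most_common(1)[0][0]

# Hypothetical rows: (height cm, weight kg, T-shirt size) -- not the exercise data.
sample = [(158, 58, "M"), (160, 59, "M"), (163, 61, "M"),
          (165, 63, "L"), (168, 66, "L"), (170, 68, "L")]
size = knn_manhattan((161, 61), sample, k=5)
```

The only change from exercise 2 is the distance function: |Δheight| + |Δweight| instead of the Euclidean root of squared differences.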

Multiple Choice Questions

1. ________ is a part of machine learning.


a. Artificial intelligence b. Deep learning c. Both a and b d. None of the mentioned

2. Naive Bayes classifiers are a collection of ________ algorithms.


a. Classification b. clustering c. regression d. all

3. Which of the following are common classes of problems in machine learning?


a. Classification b. Clustering c. Regression d. All of the mentioned

4. Which is Naive Bayes equation?


a. P(C|X) = P(X|C) * P(C) / P(X)    b. P(C|X) = P(X|C) * P(X) / P(C)
c. P(C|X) = P(C|X) * P(C) / P(X)    d. P(C|X) = P(X) * P(C) / P(C|X)

5. Identify the type of learning in which labeled training data is used.


a. Classification b. Regression c. Clustering d. Both (a) and (b)

6. What is unsupervised learning?
a. Labels of the groups are known b. Features of the groups are known
c. Neither the features nor the labels of the groups are known d. None of the mentioned

7. What kind of distance metric(s) is suitable for binary variables to find the closest neighbors?
a. Euclidean distance. b. Manhattan distance.
c. Minkowski distance. d. Hamming distance.

8. You have a dataset of flower names together with their petal lengths and colors. Your model
has to predict the name of the flower for a given petal length and color. This is a ________
a. Regression task b. Classification task c. Clustering task d. None of the mentioned

9. What does k stand for in the K-means algorithm?


a. Number of neighbors b. Number of output clusters
c. Number of input features d. None of the mentioned

10. The algorithm that improves upon itself, typically learning by trial and error to achieve a clear objective, is ________
a. Deep learning b. Reinforcement c. Regression d. Similarity distance

11. Classify the following problem as supervised or unsupervised learning:


"Identify that online shoppers often purchase groups of products at the same time"
a. Supervised b. Unsupervised

12. Which of the following can be used to handle high-dimensional data?


a. Drop features with many missing values.
b. Drop features with low variance.
c. Drop features with low correlation with the response variable.
d. All of the above

13. The importance of the feature extraction phase is ________


a. To transform complex data into smaller sets
b. To transform large data into smaller sets
c. Both (a) and (b)
d. None of the above

14. Which of the following is an example of regression?


a. Predicting stock market prices b. Spam and non-spam emails
c. Both (a) and (b) d. None of them
15. The importance of using PCA before clustering is to find the dimensions of the data that minimize the
feature variance.
a. True b. False
