Unit 1
Course WorkBook
Index Sheet - Coursework Book
2 History Corner
3 Assignment
4 Resources
5 Programming Corner-1
6 Programming Corner-2
13 Unit-1 Assignment
14 MCQ Corner
UNIT I MACHINE LEARNING BASICS
Deep learning and machine learning differ in how each algorithm learns. "Deep" machine learning can use labeled datasets, also known as supervised learning, to inform its algorithm, but a labeled dataset is not always required. Deep learning can take unstructured data in its raw form (such as text or images) and automatically determine the set of features that distinguishes different categories of data. This eliminates some of the human intervention required and allows the use of larger data sets.
2.Essential concepts of ML
UC Berkeley divides the learning system of a machine learning algorithm into three main parts: a decision process, an error function, and a model optimization process.
3.Types of learning
Create models for the following applications mentioned in the above link.
● Bananameter
● Snap Clap Whistle
● Head Tilt
Screen-record your executions and store them in your GitHub repository AD8552_ML.
4.Machine learning methods based on Time
5.Dimensionality
Dimensionality in statistics refers to the number of attributes a dataset has. For instance, healthcare data is notorious for having vast numbers of variables (for example blood pressure, weight, cholesterol level). In an ideal world, this data could be represented in a spreadsheet, with one column representing each dimension. In practice, this is difficult to do, in part because many variables are inter-related (like weight and blood pressure).
Note: Dimensionality means something different in other areas of mathematics and science. For instance, in physics, dimensionality can usually be expressed in terms of fundamental dimensions like mass, time, or length. In matrix algebra, two units of measure have the same dimensionality if both of the following statements are true:
1.A function exists that maps one variable onto another variable.
2.The inverse of the function in (1) does the reverse.
High dimensional means that the number of dimensions is staggeringly high, so high that calculations become extremely difficult. With high dimensional data, the number of features can exceed the number of observations. For example, microarrays, which measure gene expression, can contain hundreds of samples, and each sample can contain tens of thousands of genes. One individual (i.e. one observation) has a huge number of possible gene combinations. Other areas where features exceed observations include finance, high resolution imaging, and website analysis (for example advertising, crawling, or ranking).
As a simple example, suppose we are using a model to predict the location of a large bacterium in a 25 cm² petri dish. The model might be fairly accurate at pinning the organism down to the nearest square centimetre. However, suppose we add just one more dimension: instead of a 2D petri dish we use a 3D container. The predictive space increases dramatically, from 25 cm² to 125 cm³. When we add more dimensions, it makes sense that the computational burden also increases. It would not be impossible to pinpoint where the bacterium might be in a 3D model, but it is a considerably more difficult task.
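To make this growth concrete, here is a small illustrative sketch (an addition to the text, assuming each axis of the space is 5 cm long) that counts how many 1 cm cells the predictive space contains as dimensions are added.
# Assumed setup: each axis is 5 cm long and is divided into 1 cm cells.
# The number of cells grows as 5**d, which is why adding dimensions inflates the search space.
side_cm = 5
for d in (1, 2, 3, 4):
    print(f"{d}D space: {side_cm ** d} cells")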
History Corner
A frequently cited research article on the dimensionality of data, published in 1977, is given below for your reference.
Finney, D.J. (1977). "Dimensions of Statistics". Journal of the Royal Statistical Society, Series C (Applied Statistics), 26(3), 285-289.
6.Linearity & Non Linearity
Linear data is data that can be represented by a straight line on a graph. This means that there is a linear relationship between the variables, so the plotted graph will be a straight line.
Activity Corner
1.
2.
3.
4.
Non-linear data, on the other hand, cannot be represented by a straight line. The relationship between the variables is not linear, so the plotted graph will be curved.
Activity Corner
1.
2.
3.
4.
The basic principle is to use linear regression first to determine whether it can fit the particular type of curve in our data. If we cannot obtain an adequate fit using linear regression, that is when we might need to choose nonlinear regression.
Linear regression is easier to use, simpler to interpret, and we obtain more statistics that help us assess the model. While linear regression can model curves, it is relatively restricted in the shapes of the curves that it can fit. Sometimes it cannot fit the specific curve in our data.
Nonlinear regression can fit many more kinds of curves, but it can require more effort both to find the best fit and to interpret the role of the independent variables. Additionally, R-squared is not valid for nonlinear regression, and it is difficult to calculate p-values for the parameter estimates.
Programming Corner
Execute the following Python Program in Colab/Jupyter Notebook and draw the output
graph in the following workspace.
import pandas as pd
import numpy as np
from sklearn import datasets
import matplotlib.pyplot as plt
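Only the import statements of this program are reproduced above; the program body appears to be missing. A minimal sketch of what a program for this section might plot, assuming the intent is to contrast a linear and a non-linear relationship (the generated data and functions below are assumptions, not the original program), is:
import numpy as np
import matplotlib.pyplot as plt

# Assumed example data: one straight-line relationship and one curved relationship
x = np.linspace(0, 10, 50)
y_linear = 2 * x + 1            # linear: points fall on a straight line
y_nonlinear = 0.5 * x ** 2 + 1  # non-linear: points fall on a curve

plt.plot(x, y_linear, label='linear: y = 2x + 1')
plt.plot(x, y_nonlinear, label='non-linear: y = 0.5x^2 + 1')
plt.xlabel('x')
plt.ylabel('y')
plt.legend(loc='upper left')
plt.show()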
Output:
Inference:
Programming Corner
Execute the following Python Program in Colab/Jupyter Notebook and draw the output
graph in the following workspace.
import pandas as pd
import numpy as np
from sklearn import datasets
import matplotlib.pyplot as plt

# Load the IRIS dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Create a scatter plot of sepal length (column 0) vs sepal width (column 1)
# for the versicolor (rows 50-99) and virginica (rows 100-149) classes
plt.scatter(X[50:100, 0], X[50:100, 1], color='black', marker='x', label='versicolor')
plt.scatter(X[100:150, 0], X[100:150, 1], color='red', marker='+', label='virginica')
plt.xlabel('sepal length [cm]')
plt.ylabel('sepal width [cm]')
plt.legend(loc='upper left')
plt.show()
Output:
Inference:
7.Occam's Razor
Occam's (or Ockham's) razor is a principle attributed to the 14th-century logician and Franciscan friar William of Ockham; Ockham is the village in the English county of Surrey where he was born. The principle, also known as the principle of parsimony, states that when two competing explanations or models account for the observations equally well, the simpler one should be preferred.
The No Free Lunch (NFL) theorem states that all optimization algorithms perform equally well when their performance is averaged across all possible problems.
The law of diminishing returns is an economic theory predicting that, after an optimal level of capacity is reached, adding an additional factor of production yields progressively smaller increases in output.
Research Corner
Identify a research article of your choice that was published between 2021 and 2022.
1.Write the title of the article.
2.Read the complete paper and write an overview of it in the following workspace, within 50 words / 8 sentences.
There are two key stages of Data Understanding: a Data Assessment and Data Exploration.
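As an illustration of these two stages (added here as a sketch, not part of the original workbook; the choice of the Iris dataset is an assumption), the data used elsewhere in this unit can be assessed and explored with pandas as follows.
from sklearn import datasets

# Data Assessment: load the data and check its size, attribute types and missing values
iris = datasets.load_iris(as_frame=True)
df = iris.frame
print(df.shape)          # number of observations and attributes
print(df.dtypes)         # data type of each attribute
print(df.isna().sum())   # missing values per attribute

# Data Exploration: summary statistics and relationships between attributes
print(df.describe())     # central tendency and spread of each attribute
print(df.corr())         # pairwise correlations between attributes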
Activity Corner
Principal Component Analysis (PCA) is an unsupervised learning algorithm that is used for dimensionality reduction in machine learning. It is a statistical process that converts the observations of correlated features into a set of linearly uncorrelated features with the help of an orthogonal transformation. These new transformed features are called the Principal Components. PCA is one of the popular tools used for exploratory data analysis and predictive modelling. It is a technique for drawing strong patterns from the given dataset by reducing the variances.
PCA generally tries to find the lower-dimensional surface onto which to project the high-dimensional data. PCA works by considering the variance of each attribute, because a high-variance attribute shows a good split between the classes, and hence it reduces the dimensionality. Some real-world applications of PCA are image processing, movie recommendation systems, and optimizing the power allocation in various communication channels. It is a feature extraction technique, so it retains the important variables and drops the least important ones.
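A short scikit-learn sketch of PCA in use (added here for illustration; the Iris dataset and the standardization step are assumptions, not part of the workbook text) reduces four features to two principal components:
from sklearn import datasets
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Load the Iris data and standardize the features before applying PCA
iris = datasets.load_iris()
X = StandardScaler().fit_transform(iris.data)

# Project the 4-dimensional feature space onto its first 2 principal components
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                 # (150, 2)
print(pca.explained_variance_ratio_)   # share of variance captured by each component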
Problem
Given the data in Table, reduce the dimension from 2 to 1 using the Principal Component
Analysis (PCA) algorithm.
Feature Example-1 Example-2 Example-3 Example-4
X1 4 8 13 7
X2 11 4 5 14
The figure shows the scatter plot of the given data points.
To find the first principal component, we need only compute the eigenvector corresponding to the largest eigenvalue of the covariance matrix S of the data. In the present example, the largest eigenvalue is λ1, and so we compute the eigenvector corresponding to λ1, i.e. a non-zero solution e of (S − λ1 I) e = 0.
Using the theory of systems of linear equations, we note that these equations are not independent, so the solutions form a one-parameter family; normalising a solution to unit length gives the eigenvector e1.
Let X_k be the kth sample in the above Table (dataset). The first principal component of this sample is given by e1^T (X_k − X̄), where X̄ is the mean of the samples (here "T" denotes the transpose of the matrix).
For example, the first principal component corresponding to the first example X_1 is calculated by substituting X_1 and X̄ into this expression.
Geometrically, we first shift the origin of the coordinate system to the mean of the data points and then change the directions of the coordinate axes to the directions of the eigenvectors e1 and e2.
Next, we drop perpendiculars from the given data points to the e1-axis (see the figure below). The first principal components are the e1-coordinates of the feet of these perpendiculars, that is, the projections onto the e1-axis. The projections of the data points onto the e1-axis may be taken as approximations of the given data points; hence we may replace the given data set with these projected points.
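The numbers in this worked example are easiest to verify by machine. The following numpy sketch (an addition to the workbook text) recomputes the covariance matrix, the largest eigenvalue and its eigenvector, and the first principal components for the data in the table above.
import numpy as np

# Data from the table: rows are the features X1 and X2, columns are the four examples
X = np.array([[4.0, 8.0, 13.0, 7.0],
              [11.0, 4.0, 5.0, 14.0]])

mean = X.mean(axis=1, keepdims=True)     # mean of each feature
S = np.cov(X)                            # sample covariance matrix of the data

# Eigen-decomposition of the symmetric covariance matrix
# (np.linalg.eigh returns eigenvalues in ascending order)
eigvals, eigvecs = np.linalg.eigh(S)
e1 = eigvecs[:, -1]                      # eigenvector of the largest eigenvalue

# First principal component of each example: e1^T (X_k - mean)
pc1 = e1 @ (X - mean)
print("covariance matrix:\n", S)
print("largest eigenvalue:", eigvals[-1])
print("first principal components:", pc1)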
CLASS 1: X = 2, 3, 4; Y = 1, 5, 3
CLASS 2: X = 5, 6, 7; Y = 6, 7, 8
11.5 Linear Discriminant Analysis
The first linear discriminant was described for a 2-class problem, and it was later generalized as "multi-class Linear Discriminant Analysis" or "Multiple Discriminant Analysis" by C. R. Rao in 1948 (The utilization of multiple measurements in problems of biological classification). In a nutshell, the objective of LDA is often to project a feature space (a dataset of n-dimensional samples) onto a smaller subspace k (where k ≤ n−1), while maintaining the class-discriminatory information. In general, dimensionality reduction not only helps reduce computational costs for a given classification task, but it can also be useful to avoid overfitting by minimizing the error in parameter estimation.
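As a small illustration (added here; using the two-class data listed just before this section and a hypothetical test point are assumptions), scikit-learn's LDA implementation can project those samples onto a single discriminant axis and classify a new point:
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Two-class data from the table above: each row is one (X, Y) sample
X = np.array([[2, 1], [3, 5], [4, 3],    # CLASS 1
              [5, 6], [6, 7], [7, 8]])   # CLASS 2
y = np.array([1, 1, 1, 2, 2, 2])

# Fit LDA and project the 2-dimensional samples onto one discriminant axis (k <= n - 1)
lda = LinearDiscriminantAnalysis(n_components=1)
X_projected = lda.fit_transform(X, y)

print(X_projected.ravel())     # 1-D projections that preserve class separation
print(lda.predict([[4, 4]]))   # predicted class for a hypothetical new sample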
Problem
Problem Solving Workspace
Factory "ABC" produces very expensive and high quality chip rings that their qualities are
measured in terms of curvature and diameter. Result of quality control by experts is given
in the table below.
As a consultant to the factory, you get a task to set up the criteria for automatic quality
control. Then, the manager of the factory also wants to test your criteria upon new type of
chip rings that even the human experts are argued to each other. The new chip rings have
curvature 2.81 and diameter 5.46. Can you solve this problem by employing Discriminant
Analysis?
Research Corner
Download the research article from the link below and answer the questions that follow.
Link : https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3972280
Questions
c.Perform PCA and LDA using the data set in Page 12.
e.Identify a research paper in the literature review section (Pages 3 to 7), use its reference number to find the full citation in the references, and download it.
The concept of machine intelligence highlights the intersection of machine learning and
artificial intelligence, as well as the broad spectrum of opportunities and approaches in the
field.
Research on MI at Google
Machine Intelligence at Google raises deep scientific and engineering challenges, allowing
us to contribute to the broader academic research community through technical talks and
publications in major conferences and journals. Contrary to much of current theory and
practice, the statistics of the data we observe shifts rapidly, the features of interest change
as well, and the volume of data often requires enormous computation capacity. When
learning systems are placed at the core of interactive services in a fast changing and
sometimes adversarial environment, combinations of techniques including deep learning
and statistical models need to be combined with ideas from control and game theory.
Research on MI at Microsoft
Intelligent machines and intelligent software rely on algorithms that can reason about
observed data to make predictions or decisions that are useful. Such systems rely on
machine learning and artificial intelligence, combining computation, data, models, and
algorithms. Microsoft's mission, in the Machine Intelligence theme at Microsoft Research Cambridge, is to expand the reach and efficiency of machine intelligence technology.
Microsoft researches how to incorporate structured input data such as code and molecules effectively into deep learning models. Microsoft invents new methods so models can accurately quantify their uncertainty when making predictions. Microsoft builds models that learn from small data that is corrupted or only partially observed. Microsoft develops deep learning algorithms that apply to interactive settings in gaming and in decision-making tasks, where model predictions have consequences on future inputs.
Nair, Vineet, et al. "ADVISER: AI-Driven Vaccination Intervention Optimiser for Increasing
Vaccine Uptake in Nigeria." arXiv preprint arXiv:2204.13663 (2022).
Activity Corner
Identify a recent research article related to Machine Intelligence and write its citation in
IEEE format.
Unit-1 Assignment
Identify a society-related problem in your area/native place and share possible solution strategies through Machine Learning.
Resources
1.https://doi.org/10.1007/978-3-030-26622-6
2.Andrew Ng YouTube Tutorial - Machine Learning
3.Introduction to Machine Learning, IITM, NPTEL/Swayam Course.
AD8552- Machine Learning - Question Bank
Unit-1: Introduction to Machine Learning
Part A
Sl.no Questions CO Blooms Level*
1 Define Big Data.Give Examples. CO1 B1
2 Relate BigData with Artificial Intelligence and Machine Learning. CO1 B3
3 “Storage and Computational Resources from Cloud will impact the performance of Machine Learning Algorithms by the role of Big Data” - Justify. CO1 B5
4 List the types of learning algorithms in Machine Learning. CO1 B1
5 Compare the types of learning algorithms in Machine Learning. CO1 B5
6 List the Machine Learning methods based on time. CO1 B1
7 Identify the metric to quote the property of Orthogonality in a data set. CO1 B1
8 Define Curse of Dimensionality. CO1 B1
9 Relate data density with model accuracy. CO1 B3
10 “Image classification of different birds is a valid static learning” - Justify. CO1 B5
11 “Prediction of Car Engine Performance is a valid dynamic learning” - Justify. CO1 B5
12 Compare linear data and non linear data. CO1 B5
13 Compare linear model and non linear model. CO1 B5
14 Test that y = 3e^x produces non-linear data. CO1 B6
15 Transform y = 7e^x into a linear data modelling equation. CO1 B2
16 Write the algorithms for linear model. CO1 B3
17 Write the algorithms for non linear model. CO1 B3
18 State the Principle of Parsimony. CO1 B1
19 State the NFL theorem. CO1 B1
20 State the Law of Diminishing Returns. CO1 B1
21 Define Expert Systems.Give Examples. CO1 B1
22 List the steps in data preprocessing. CO1 B1
23 Write the steps to understand the nature of data. CO1 B3
24 Write the examples of entities in a data set. CO1 B3
25 Write the examples of attributes in a data set. CO1 B3
26 Write the examples of datatypes in a dataset. CO1 B3
27 Relate Data Visualization with Data Dimensionality. CO1 B2
28 Define Principal Component Analysis(PCA). CO1 B1
29 Write the algorithm of PCA. CO1 B3
30 Define Linear Discriminant Analysis (LDA). CO1 B1
31 Write the algorithm of LDA. CO1 B3
32 Compare PCA and LDA. CO1 B4
Part B
Sl.no Questions CO Blooms Level*
3 Criticize the performance of PCA and LDA using your own data set with three variables. CO1 B6
NAME :
CLASS :
Machine Learning
DATE :
30 Questions
1. Real-Time decisions, Game AI, Learning Tasks, Skill Acquisition, and Robot Navigation are applications of which of the following?
4. Targeted marketing, Recommended Systems, and Customer Segmentation are applications in which of the following?
5. Fraud Detection, Image Classification, Diagnostic, and Customer Retention are applications in which of the following?
A The autonomous acquisition of knowledge through the use of manual programs
B The selective acquisition of knowledge through the use of manual programs
C The selective acquisition of knowledge through the use of computer programs
D The autonomous acquisition of knowledge through the use of computer programs
7. Which of the following are common classes of problems in machine learning?
A Clustering B Classification
A Deduction B Analogy
C Introduction D Memorization
A False B True
13. Which is the best language for machine learning ___
A Java B JavaScript
C Python D C++
14. A machine learning model which is built on sample data is known as ___
15. Application of machine learning methods to large databases is called ___
A Accuracy B Outcome
A It is a diagram or chart that helps determine a course of action or show a statistical probability.
B It is used to train algorithms that classify data or predict outcomes accurately by using labeled datasets.
C It is a diagram or chart that helps determine a course of action or show a statistical probability.
D It is a type of algorithm that learns patterns from untagged data and predicts an outcome.
19. Which of the following is the best machine learning method?
A Accuracy B Scalable
20. Which of the following is a widely used and effective machine learning algorithm based on the idea of bagging?
21. When performing regression or classification, which of the following is the correct way to preprocess the data?
22. A Machine Learning technique that helps in detecting the outliers in data.
23. ………………. algorithms enable computers to learn from data, and even improve themselves, without being explicitly programmed.
24. If a machine learning model's output involves a target variable, then that model is called ___
26. The problem of finding hidden structure in unlabeled data is called…
29. What is the application of machine learning methods to a large database called?
30. Which of the following are common classes of problems in machine learning?
C Regression D Clustering