Project Cycle Notes
we’re trying to address under the _______ stage of the AI Project Cycle. a. Data Exploration b.
Evaluation c. Modelling d. Problem Scoping Ans: d. Problem Scoping
Reviewing the project or business requirements for the AI model a. Data Exploration b. Evaluation c.
Modelling d. Problem Scoping Ans: d. Problem Scoping
What do you mean by Problem Scoping? a. Creating an algorithm to solve a problem b. Proper
solution of a problem c. Recognizing a problem and having a plan to address it d. Analyzing the
trends in the collected data sets Ans: c. Recognizing a problem and having a plan to address it
a. 18 b. 17 c. 16 d. 15
Ans: b. 17
People who experience the mentioned issue and would gain from the solution are referred to as
___________. a. Key Persons b. Stakeholders c. End user d. None of the above Ans: b. Stakeholders
Aman wants to build an artificially intelligent system which can predict the salary of an employee
based on his previous salaries. He has to feed in the data of the previous salaries. This is the data
with which the machine can be trained. The previous salary data here is known as ____________, while
the next salary prediction data set is known as the ___________.
The problem statement template gives a clear idea of the basic framework required to achieve
the goal. It is built from the 4Ws canvas, which separates the problem into four questions: what is
the problem, where does it arise, who is affected, and why is it a problem? Answering these keeps the
work focused on the goal.
1. No Poverty: This is Goal 1, which strives to end poverty in all its forms everywhere by 2030.
The goal has a total of seven targets to be achieved.
2. Quality Education: This is Goal 4, which aspires to ensure inclusive and equitable quality
education and promote lifelong learning opportunities for all. It has 10 targets to achieve.
* (Any two goals can be defined)
Mention the precautions to be taken while acquiring data for developing an AI project.
The data should come from an authentic source and be accurate. Look for redundant and irrelevant
data parameters that do not take part in prediction.
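The second precaution can be illustrated with a minimal sketch. It assumes a small, made-up set of records (the field names and values are hypothetical, not from any real dataset) and shows one simple way to drop repeated records and to spot a parameter that never varies and so cannot contribute to a prediction.

```python
# Hypothetical raw records; field names and values are made up for illustration.
raw_records = [
    {"id": 1, "experience_years": 2, "country": "IN", "salary": 30000},
    {"id": 2, "experience_years": 5, "country": "IN", "salary": 52000},
    {"id": 2, "experience_years": 5, "country": "IN", "salary": 52000},  # exact duplicate
    {"id": 3, "experience_years": 8, "country": "IN", "salary": 81000},
]


def drop_duplicates(records):
    """Keep only the first occurrence of each identical record."""
    seen = set()
    unique = []
    for record in records:
        key = tuple(sorted(record.items()))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def constant_columns(records):
    """Return the columns whose value never changes across the records."""
    columns = records[0].keys()
    return [c for c in columns if len({r[c] for r in records}) == 1]


def drop_columns(records, columns):
    """Return copies of the records without the given columns."""
    return [{k: v for k, v in r.items() if k not in columns} for r in records]


unique_records = drop_duplicates(raw_records)
irrelevant = constant_columns(unique_records)
clean_records = drop_columns(unique_records, irrelevant)

print(irrelevant)          # columns judged irrelevant by this simple check
print(len(clean_records))  # number of records left after removing duplicates
```

A constant column is only one kind of irrelevant parameter. Deciding whether other parameters matter usually needs knowledge of the problem itself.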
Draw the 4Ws problem canvas and explain each one of them briefly. The 4Ws problem canvas is the
basic template for scoping a problem. Using this canvas, the picture becomes clearer while we
are working to solve it.
a) Who: The “Who” block helps you analyze the people who are affected directly or indirectly by
the problem. Under this, you find out who the ‘stakeholders’ to this problem are and what you know
about them. Stakeholders are the people who face this problem and would benefit from the solution.
b) What: Under the “What” block, you need to look into what you have on hand. At this stage, you
need to determine the nature of the problem. What is the problem and how do you know that it is a
problem?
c) Where: In this block, you need to focus on the context/situation/location of the problem. It will
help you look into the situation in which the problem arises, the context of it, and the locations
where it is prominent.
d) Why: In the “Why” block, think about the benefits which the stakeholders would get from the
Rule-Based Approach: It refers to AI modelling where the relationships or patterns in the data are
defined by the developer. The machine follows the rules or instructions given by the developer
and performs its task accordingly. For example, suppose you have a dataset comprising 100 images
of apples and 100 images of bananas. To train your machine, you feed this data into the machine and
label each image as either apple or banana. Now if you test the machine with the image of an apple,
it will compare the image with the trained data and, according to the labels of the trained images,
identify the test image as an apple. This is known as the rule-based approach. The rules given to the
machine in this example are the labels given to the machine for each image in the training dataset.
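The idea that the developer, not the machine, defines the rules can be sketched in a few lines. This is only an illustration under stated assumptions: instead of image labels it uses two hand-picked features (colour and length), and the function name and thresholds are hypothetical.

```python
def classify_fruit(colour, length_cm):
    """Rules written by the developer; nothing here is learned from data."""
    if colour == "yellow" and length_cm > 12:
        return "banana"
    if colour in ("red", "green") and length_cm <= 12:
        return "apple"
    # No rule matches, so the machine cannot decide.
    return "unknown"


print(classify_fruit("red", 8))
print(classify_fruit("yellow", 18))
print(classify_fruit("purple", 5))
```

The last call shows the main limitation: any input the developer did not anticipate falls outside the rules, and the machine does not adapt by itself.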
Learning-Based Approach: In this approach, the machine learns by itself. It refers to AI modelling
where the relationships or patterns in the data are not defined by the developer. In this approach,
random data is fed to the machine and it is left to the machine to figure out patterns and trends
in it. Generally, this approach is followed when the data is unlabelled and too random for a human
to make sense of it. For example, suppose you have a dataset of 1000 images of random stray
dogs from your area. You would feed this into a learning-based AI machine, and the machine
would come up with various patterns it has observed in the features of these 1000 images, which you
might not have even thought of!
Supervised Learning: In a supervised learning model, the dataset which is fed to the machine is
labelled. It means the data is already tagged with the correct answer. In other words, the dataset is
known to the person who is training the machine; only then is he/she able to label the data.
Unsupervised Learning: An unsupervised learning model works on an unlabelled dataset. This means
that the data which is fed to the machine is random, and there is a possibility that the person who is
training the model does not have any information about it. Unsupervised learning models are used to
identify relationships, patterns and trends in the data which is fed into them. This helps the user
understand what the data is about and what major features the machine has identified in it.
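To make the unsupervised case concrete, here is a minimal sketch of k-means clustering on one-dimensional data, written in plain Python. The data (shoulder heights of dogs in centimetres), the starting centroids and the function name are all assumptions made for illustration. No labels are supplied; the grouping comes from the numbers alone.

```python
def k_means_1d(values, centroids, iterations=10):
    """Assign each value to its nearest centroid, then move each centroid
    to the mean of its group. Repeats a fixed number of times."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical, unlabelled measurements.
heights_cm = [30, 32, 35, 60, 62, 65]
centroids, clusters = k_means_1d(heights_cm, centroids=[30, 65])

print(clusters)
print(centroids)
```

The machine ends up with two groups of similar heights, but it does not know what the groups mean. Interpreting them (for example, as small and large dogs) is left to the user.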
Differentiate between classification and regression algorithms with the help of suitable examples.
Classification is a process of finding a function which helps in dividing the dataset into classes based
on different parameters. In classification, a computer program is trained on the training dataset and,
based on that training, it categorizes the data into different classes. The task of the classification
algorithm is to find the mapping function that maps the input (x) to the discrete output (y).
Example: A good example for understanding the classification problem is email spam detection. The
model is trained on millions of emails using different parameters, and whenever it receives a new
email, it identifies whether the email is spam or not. If the email is spam, it is moved to the Spam
folder.
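The spam example can be sketched as a toy classifier that maps an input text (x) to one of two discrete labels (y). This is a deliberately simplified word-count scheme, not how production spam filters work; the training emails and function names are hypothetical.

```python
from collections import Counter

# Hypothetical labelled training data: (email text, label).
training_emails = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("please review the project report", "not spam"),
]


def train(emails):
    """Count how often each word appears under each label."""
    word_counts = {"spam": Counter(), "not spam": Counter()}
    for text, label in emails:
        word_counts[label].update(text.split())
    return word_counts


def classify(word_counts, text):
    """Pick the label whose training words overlap most with the new email."""
    scores = {
        label: sum(counts[word] for word in text.split())
        for label, counts in word_counts.items()
    }
    return max(scores, key=scores.get)


model = train(training_emails)
print(classify(model, "claim your free prize"))
print(classify(model, "project meeting on monday"))
```

The output is always one of a fixed set of classes, which is what makes this a classification task.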
Regression is a process of finding the correlations between dependent and independent variables. It
helps in predicting continuous variables, such as the prediction of market trends, house prices, etc.
The task of the regression algorithm is to find the mapping function that maps the input
variable (x) to the continuous output variable (y).
Example: Suppose we want to do weather forecasting. For this, we will use a regression
algorithm. In weather prediction, the model is trained on past data, and once the training is
completed, it can predict the weather for future days.
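A minimal sketch of the regression idea, assuming made-up daily temperatures and a straight-line model fitted by ordinary least squares. Real weather forecasting uses far richer data and models; the point here is only that the output is a continuous number rather than a class.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a straight line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical past data: day number and temperature in degrees Celsius.
days = [1, 2, 3, 4, 5]
temps_c = [20.0, 21.0, 22.0, 23.0, 24.0]

slope, intercept = fit_line(days, temps_c)
predicted_day_6 = slope * 6 + intercept
print(predicted_day_6)
```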
Answer – Various ways to collect data are – a. Surveys b. Web Scraping c. Sensors d. Cameras e.
Observations f. Application Programming Interface (API)
Answer – Data exploration is the process of displaying and detecting unique patterns and trends in
data using tools and procedures. Data visualization and other complex statistical techniques can be
used to do this.
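As a small illustration of exploring data before modelling, the sketch below computes a few summary statistics and prints a text histogram using only the Python standard library. The marks list is hypothetical, and the bin width of 20 is an arbitrary choice.

```python
import statistics
from collections import Counter

# Hypothetical data: marks scored by seven students.
marks = [35, 42, 42, 58, 61, 67, 90]

summary = {
    "count": len(marks),
    "min": min(marks),
    "max": max(marks),
    "mean": statistics.mean(marks),
    "median": statistics.median(marks),
    "mode": statistics.mode(marks),
}
print(summary)

# Group the marks into bins of width 20 and draw one '#' per value.
bins = Counter((m // 20) * 20 for m in marks)
for start in sorted(bins):
    print(f"{start:>3}-{start + 19:<3} {'#' * bins[start]}")
```

Even this rough view shows where most values lie and that one value (90) sits well apart from the rest, which is the kind of pattern data exploration is meant to surface.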
What is data modelling? Answer – Data modelling is the process of developing a visual
representation of an entire information system or certain components of it. In the context of AI,
the development, training, and application of machine learning algorithms that simulate logical
decision-making based on the available data is known as AI modelling.