Capstone Project

1> What is a capstone project?

A capstone project is a project in which students
research a topic independently to gain a deep
understanding of the subject matter.
It gives students an opportunity to integrate all
their knowledge and demonstrate it through a
comprehensive project.
2> What is the importance of patterns in problem
solving?
• The premise that underlies all Machine Learning
disciplines is that there needs to be a pattern.
• If there is no pattern, then the problem cannot be
solved with AI technology.
• It is fundamental to ask this question before
deciding to embark on an AI development journey.
3> List the different problem categories that come
under predictive analysis. Write one example for each.
1) Which category? (Classification) - Eg: Spam mail
classification
2) How much or how many? (Regression) - Eg: Flight
fare prediction
3) Which group? (Clustering) - Eg: Email marketing
4) Is this unusual? (Anomaly Detection) - Eg: Credit card
fraud detection
5) Which option should be taken? (Recommendation) -
Eg: Video recommendation system
4> What is design thinking? Draw the diagram and
briefly explain each stage of design thinking.
Design Thinking is a design methodology that provides
a solution-based approach to solving problems.
It is extremely useful in tackling complex problems that
are ill-defined or unknown.
Stage 1: Empathize
• Observe consumers to gain a deeper understanding
of the problem.
• Observation must be made with empathy.
• Use the 5W1H method to ask the right questions:
Who, What, When, Where, Why, and How.
Stage 2: Define
• Define the problem statement.
• Determine the cause of the problem.
• Brainstorm to generate possible solutions.
• Select the most suitable solution.
Stage 3: Ideate
• Gather ideas to solve the problem you defined.
• Brainstorm to arrive at various creative solutions.
Stage 4: Prototype
• A prototype is a simple experimental model of a
proposed solution.
• Build representations (charts, models) of one or more
ideas.
Stage 5: Test
• Test the prototype and gather user feedback.
• Iterate: design thinking is an iterative process.
5> What is problem decomposition? Write down the
steps involved in problem decomposition.
Problem decomposition is the process of breaking a
problem down into smaller units before coding.
Problem decomposition steps:
• Understand the problem, then restate it in your
own words.
• Break the problem down into a few large pieces.
• Break complicated pieces down into smaller pieces.
• Code one small piece at a time:
  • Think about how to implement it.
  • Write the code/query.
  • Test it on its own.
  • Fix problems, if any.
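As a minimal sketch of these steps, consider a hypothetical task that is not part of the original notes: report the average of the valid scores in a list of raw strings. The function names and the 0 to 100 range are assumptions chosen for illustration. Each small piece is written so that it can be tested on its own before the pieces are combined.

```python
def parse_score(text):
    """Piece 1: turn one raw string into a float, or None if it is not a number."""
    try:
        return float(text.strip())
    except ValueError:
        return None


def keep_valid(scores, low=0.0, high=100.0):
    """Piece 2: keep only the parsed scores that fall inside the allowed range."""
    return [s for s in scores if s is not None and low <= s <= high]


def average(values):
    """Piece 3: mean of a list, or None when the list is empty."""
    return sum(values) / len(values) if values else None


def average_valid_score(raw_items):
    """Whole problem: combine the three small pieces."""
    parsed = [parse_score(item) for item in raw_items]
    return average(keep_valid(parsed))


# "abc" is not a number and "150" is out of range, so only 80 and 90 count.
print(average_valid_score(["80", " 90 ", "abc", "150"]))  # 85.0
```

Because each function does one thing, a failure in the final result can be traced to a single piece and fixed there.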
6> Explain train-test split evaluation.
The train-test split is a technique for evaluating the
performance of a machine learning algorithm.
• It can be used for classification or regression
problems and with any supervised learning
algorithm.
• The procedure involves taking a dataset and dividing it
into two subsets.
• The first subset is used to fit the model and is referred
to as the training dataset.
• The second subset is not used to train the model; it is
used to evaluate the fitted model and is referred to as
the testing dataset.
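7> How will you configure the train-test split procedure?
# X holds the features and y the target (scikit-learn)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)
OR
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.67)
(The training set is normally larger than the test set: TRAIN > TEST.)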
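To make the effect of test_size concrete, here is a minimal standard-library sketch of what such a split does: shuffle the row indices, then cut them at the chosen ratio. The helper name simple_train_test_split, the seed argument, and the rounding rule are assumptions made for illustration. This is not scikit-learn's implementation, which offers further options such as stratification.

```python
import random


def simple_train_test_split(X, y, test_size=0.33, seed=0):
    """Shuffle the rows and hold out a fraction of them for testing."""
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of rows")
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    n_test = round(len(X) * test_size)    # assumption: round to the nearest row
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


X = [[i] for i in range(100)]    # 100 rows with one feature each
y = [2 * i for i in range(100)]  # one target value per row
X_train, X_test, y_train, y_test = simple_train_test_split(X, y, test_size=0.33)
print(len(X_train), len(X_test))  # 67 33
```

Features and targets are indexed with the same shuffled indices, so each row keeps its own target after the split.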
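8> Explain cross-validation.
Cross-validation is a resampling technique for evaluating
machine learning models on a limited sample of data.
• The process includes a parameter k, which specifies
the number of groups into which a given data sample
should be divided.
• The process is referred to as k-fold cross-validation. For
example, k=10 gives 10-fold cross-validation.
• Each group is held out once for testing while the model
is trained on the remaining k-1 groups, and the k scores
are averaged.
• It gives a more reliable estimate than a single
train-test split, though it takes longer to run.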
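The k-fold procedure can be sketched with the standard library alone. In the sketch below, the function names are hypothetical, and the "model" is a deliberately trivial baseline that predicts the mean of the training targets; it stands in for whatever learning algorithm is actually being evaluated.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and deal them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate_mean_model(y, k=5):
    """Score a 'predict the training mean' baseline with k-fold cross-validation."""
    scores = []
    for test_idx in k_fold_indices(len(y), k):
        held_out = set(test_idx)
        train_y = [y[i] for i in range(len(y)) if i not in held_out]
        prediction = sum(train_y) / len(train_y)  # "fit" on the other k-1 folds
        # Evaluate on the held-out fold using mean squared error.
        mse = sum((y[i] - prediction) ** 2 for i in test_idx) / len(test_idx)
        scores.append(mse)
    return scores


y = [float(v) for v in range(20)]
scores = cross_validate_mean_model(y, k=5)
print(len(scores), sum(scores) / len(scores))  # number of folds, average score
```

Every sample is used for testing exactly once, which is why the averaged score is steadier than the score from one split, at the cost of fitting the model k times.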
9> Explain the difference between cross-validation and
train-test split.
• On small datasets, the extra computational burden of
running cross-validation isn't a big deal. So, if your
dataset is small, you should run cross-validation.
• If your dataset is large, a single train-test split is
usually sufficient: the test set is big enough to give a
stable estimate, and repeated training is costly.
10> What are hyperparameters?
Hyperparameters are parameters whose values govern
the learning process. They are set before training
rather than learned from the data.
They also influence the values of the model parameters
learned by a learning algorithm.
Eg: the train-test split ratio, the number of hidden
layers in a neural network, the number of clusters in a
clustering task.
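The distinction can be illustrated with a small sketch, under the assumption of a one-parameter model y = w * x fitted by gradient descent (an example chosen here, not taken from the notes). The learning rate and the number of epochs are hyperparameters chosen before training; the slope w is a model parameter learned from the data.

```python
def fit_slope(xs, ys, learning_rate=0.01, n_epochs=200):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0  # model parameter: learned from the data
    n = len(xs)
    for _ in range(n_epochs):
        # Gradient of the MSE with respect to w: (2/n) * sum((w*x - y) * x)
        grad = (2.0 / n) * sum((w * x - y) * x for x, y in zip(xs, ys))
        w -= learning_rate * grad
    return w


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # generated with a true slope of 2

# Same data, different hyperparameters, different learned parameter.
w_long = fit_slope(xs, ys, learning_rate=0.01, n_epochs=200)
w_short = fit_slope(xs, ys, learning_rate=0.01, n_epochs=5)
print(w_long, w_short)
```

With 200 epochs the learned slope is very close to 2, while 5 epochs stop well short of it, which shows how hyperparameter choices affect the parameters the algorithm ends up with.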
11> How are MSE and RMSE related? What is their
range? Are they sensitive to outliers?

MSE: Mean Squared Error (MSE) is one of the most widely
used regression loss functions.
• Squaring the error gives large errors (outliers) more
weight, so MSE is sensitive to outliers, while keeping a
smooth gradient for small errors.
• Because the errors are squared, MSE can never be
negative. Its value ranges from 0 to infinity.
• MSE grows quadratically as the error grows. An
MSE value close to zero indicates a good model.
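12> RMSE: RMSE is the square root of MSE. Root Mean
Square Deviation (RMSD) is another name for the Root
Mean Square Error.
• An RMSE value of 0 implies that the model fits the data
perfectly. The lower the RMSE, the better the model and
its predictions perform.
• A greater RMSE indicates a substantial discrepancy
between the predictions and the ground truth.
• Like MSE, RMSE ranges from 0 to infinity and is
sensitive to outliers, but it is expressed in the same
units as the target.
• A commonly quoted rule of thumb is that the RMSE of a
good model should be less than 1; this only holds when
the target is on a comparable scale.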
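In formula form, MSE = (1/n) * sum of (y_i - yhat_i)^2 over the n samples, and RMSE = sqrt(MSE). The short sketch below computes both with the standard library; the sample values are made up to show how a single large error dominates the result.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error: average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the MSE, in the target's units."""
    return math.sqrt(mse(y_true, y_pred))


y_true = [10.0, 12.0, 14.0, 16.0]
y_close = [11.0, 11.0, 15.0, 15.0]    # every prediction is off by 1
y_outlier = [10.0, 12.0, 14.0, 26.0]  # three perfect predictions, one off by 10

print(mse(y_true, y_close), rmse(y_true, y_close))      # 1.0 1.0
print(mse(y_true, y_outlier), rmse(y_true, y_outlier))  # 25.0 5.0
```

The second set of predictions is exact on three of four points, yet its MSE is 25 times larger, because the one error of 10 contributes 100 to the sum of squares.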
13> What is a loss function? What are the different
categories of loss function?
• A loss function is a measure of how well a prediction
model predicts the expected outcome.
• Loss functions can be broadly categorized into 2
types: Classification Loss and Regression Loss.
• Regression models predict a quantity, and
classification models predict a label, so each type of
loss measures the error of that kind of prediction.
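As an illustration of the two categories, the sketch below pairs one regression loss with one classification loss. Mean absolute error and binary cross-entropy are picked here as common representatives; they are assumptions for the example, not the only losses in each category.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Regression loss: average absolute gap between predicted and true quantities."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_cross_entropy(labels, probs, eps=1e-12):
    """Classification loss: penalises confident wrong probabilities for 0/1 labels."""
    total = 0.0
    for label, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
    return total / len(labels)


# Regression: predictions are quantities compared with true quantities.
mae = mean_absolute_error([3.0, 5.0], [2.5, 6.0])
print(mae)  # 0.75

# Classification: predictions are probabilities of label 1.
good = binary_cross_entropy([1, 0], [0.9, 0.1])  # confident and correct
bad = binary_cross_entropy([1, 0], [0.1, 0.9])   # confident and wrong
print(good, bad)
```

The confident but wrong probabilities receive a much larger loss than the confident and correct ones, which is the behaviour a classification loss is meant to have.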

14> Draw the diagram of the Analytic Approach and
explain each stage.

1. Business understanding
• What problem are you trying to solve?
• Every project, whatever its size, begins with an
understanding of the business.
• Business partners who need the analytics solution
play a critical role in this phase by defining the
problem, the project objectives, and the solution
requirements from a business perspective.
2. Analytic approach
• The problem must be expressed in the context of
statistical learning to identify the appropriate machine
learning techniques to achieve the desired result.
3. Data requirements
• What data do you need to answer the question?
• The analytic approach determines the data requirements:
specific content, formats, and data representations,
based on domain knowledge.
4. Data collection
• Where is the data coming from (identify all sources)
and how will you get it?
• The data scientist identifies and collects data
resources (structured, unstructured and
semi-structured) that are relevant to the problem area.
• If the data scientist finds gaps in the data collection,
they may need to review the data requirements and
collect more data.
5. Data understanding
• Is the data that you collected representative of the
problem to be solved?
• Descriptive statistics and visualization techniques can
help a data scientist understand the content of the
data, assess its quality, and obtain initial information
about the data.
6. Data preparation
• What additional work is required to manipulate and
work with the data?
• The data preparation step includes all the activities
used to create the data set used during the modelling
phase.
• This includes cleansing data, combining data from
multiple sources, and transforming data into more
useful variables.
• In addition, feature engineering and text analysis can
be used to derive new structured variables.
7. Model training
• In what way can the data be visualized to get the
answer that is required?
• Starting from the first version of the prepared data set,
data scientists use a training dataset (historical data in
which the desired result is known) to develop predictive
or descriptive models.
• The modelling process is highly iterative.
8. Model evaluation
• Does the model used really answer the initial
question or does it need to be adjusted?
• The data scientist evaluates the quality of the model
and verifies that the business problem is handled in a
complete and adequate manner.
9. Deployment
• Can you put the model into practice?
• Once a satisfactory model has been developed and
approved by the business sponsors, it is
deployed in the production environment or in a
comparable test environment.
10. Feedback
• Can you get constructive feedback into answering the
question?
• By collecting the results of the implemented
model, the organization receives feedback on the
performance of the model and its impact on the
implementation environment.
