Rdatascience - Problem Statements

Uploaded by

pawartushar0502

1.

Perform the following operations using Python on Iris data set.


1. Load the dataset into a pandas DataFrame.
2. Display information about missing values in the data.
3. Display initial statistics.
4. Check the dimensions of the DataFrame.
5. Display the data type of each variable.
6. Apply proper data type conversions.
7. Convert categorical variables into quantitative variables using one-hot encoding and a label encoder.
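The two encodings named in step 7 can be sketched without any library to show what each one produces. This is only an illustration: the `species` sample is made up, and in practice `pandas.get_dummies` and scikit-learn's `LabelEncoder` would normally do this work on the loaded DataFrame.

```python
# Dependency-free sketch of label encoding and one-hot encoding.
# The sample values are illustrative, not read from the real Iris file.
species = ["setosa", "versicolor", "virginica", "setosa", "virginica"]

# Label encoding: map each category to an integer, in sorted order.
classes = sorted(set(species))
label_of = {name: index for index, name in enumerate(classes)}
label_encoded = [label_of[name] for name in species]

# One-hot encoding: one 0/1 column per category, exactly one 1 per row.
one_hot = [[1 if name == cls else 0 for cls in classes] for name in species]

print(classes)
print(label_encoded)
for row in one_hot:
    print(row)
```

Label encoding keeps a single column but implies an ordering between categories; one-hot encoding avoids that at the cost of one column per category.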

2.
Perform the following operations using Python by creating student performance dataset.
1. Display missing values
2. Replace missing values using any 2 suitable techniques
3. Identify outliers using a boxplot and a scatterplot
4. Handle outliers using any technique
5. Perform any 2 data normalization techniques
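As a minimal sketch of steps 1 and 2, the snippet below counts missing values and fills them in two ways, with the mean and with the median of the observed values. The `math_scores` list and the helper names are hypothetical; with pandas the same idea is usually expressed through `isnull().sum()` and `fillna(...)`.

```python
from statistics import mean, median

# Hypothetical student performance scores; None marks a missing value.
math_scores = [72, None, 85, 90, None, 60, 78]

def count_missing(values):
    """Return how many entries are missing (None)."""
    return sum(value is None for value in values)

def impute(values, fill):
    """Replace missing entries with a statistic of the observed entries."""
    observed = [value for value in values if value is not None]
    fill_value = fill(observed)
    return [fill_value if value is None else value for value in values]

print("missing:", count_missing(math_scores))
print("mean-filled:", impute(math_scores, mean))
print("median-filled:", impute(math_scores, median))
```

The median is the safer choice when the column contains outliers, because a single extreme score shifts the mean but barely moves the median.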

3.
Perform the following operations on iris dataset

1. Display mean, median, minimum, maximum, standard deviation


2. Provide mean, median, minimum, maximum, standard deviation for a given dataset by grouping using one of the qualitative (categorical) variables
3. Display missing values and inconsistencies.
4. Replace missing values using any 2 suitable techniques
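The grouped statistics in step 2 can be sketched with the standard library alone. The rows below are made-up (species, petal length) pairs standing in for the Iris data, and `summarize` is a hypothetical helper; with pandas, a `groupby` on the categorical column followed by an aggregation is the usual route.

```python
from collections import defaultdict
from statistics import mean, median, stdev

# Made-up (species, petal_length) rows standing in for the Iris data.
rows = [
    ("setosa", 1.4), ("setosa", 1.3), ("setosa", 1.5),
    ("versicolor", 4.7), ("versicolor", 4.5), ("versicolor", 4.9),
]

def summarize(values):
    """Return the five summary statistics asked for in the task."""
    return {
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
        "std": stdev(values),  # sample standard deviation (divides by n - 1)
    }

# Group the measurements by the categorical variable.
groups = defaultdict(list)
for species, petal_length in rows:
    groups[species].append(petal_length)

summary = {species: summarize(values) for species, values in groups.items()}
for species, stats in summary.items():
    print(species, {name: round(value, 3) for name, value in stats.items()})
```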
4.
Perform the following operation using titanic data set.

1. Check how the ticket price (column name: 'fare') is distributed across passengers by plotting a histogram.

2. Plot a box plot of the age distribution for each gender, split by whether the passenger survived (column names: 'sex' and 'age').
3. Write observations on the inference from the above statistics.
5.
Perform the following operations on iris dataset

1. List down the features and their types


2. Create a box plot and histogram for each feature in the dataset.
3. Compare distributions and identify outliers.

6. Create a Linear Regression model using Python/R to predict home prices using the Boston Housing dataset. Evaluate the performance of your model.
7. Create a logistic regression model on social network ads.csv to classify the given dataset. Compute the confusion matrix to find TP, FP, TN, FN, accuracy, error rate, precision, and recall.
8. Create a Naïve Bayes classification model using Python on the social network ads.csv dataset. Compute the confusion matrix to find TP, FP, TN, FN, accuracy, error rate, precision, and recall on the given dataset.
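The evaluation step shared by problems 7 and 8 can be sketched independently of the classifier. The labels below are made up (1 is taken as the positive class) and the function names are hypothetical; the snippet only shows how TP, FP, TN, FN and the derived metrics relate, whatever model produced the predictions.

```python
# Hypothetical true and predicted labels; 1 is the positive class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]

def confusion_counts(actual, predicted):
    """Count the four cells of a binary confusion matrix."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == 1 and p == 1 for a, p in pairs)
    fp = sum(a == 0 and p == 1 for a, p in pairs)
    tn = sum(a == 0 and p == 0 for a, p in pairs)
    fn = sum(a == 1 and p == 0 for a, p in pairs)
    return tp, fp, tn, fn

def metrics(tp, fp, tn, fn):
    """Derive the requested metrics from the four counts.

    Assumes at least one predicted positive and one actual positive,
    otherwise precision or recall would divide by zero.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "error_rate": (fp + fn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print("TP, FP, TN, FN:", tp, fp, tn, fn)
print(metrics(tp, fp, tn, fn))
```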

9.
For given text apply following preprocessing methods:

1. Tokenization
2. POS Tagging
3. Stop word Removal
4. Lemmatization
5. Stemming
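Two of these steps, tokenization and stop word removal, can be sketched with the standard library. The stop-word set and the sample sentence are illustrative assumptions. POS tagging, lemmatization, and stemming depend on trained models or rule sets and are normally done with a library such as NLTK, so they are left out of this sketch.

```python
import re

# A tiny illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "and", "to", "in", "on"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [token for token in tokens if token not in STOP_WORDS]

text = "The cats are sitting on the mat in the garden."
tokens = tokenize(text)
filtered = remove_stop_words(tokens)

print(tokens)
print(filtered)
```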
10. Calculate term frequency (TF) and inverse document frequency (IDF) over the sentences of the given documents.
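A minimal sketch of this calculation follows, treating each sentence as one document. It uses the plain definitions TF = count / document length and IDF = ln(N / document frequency); libraries often apply smoothed variants, so their numbers can differ. The sample sentences and function names are assumptions made for illustration.

```python
import math
from collections import Counter

# Illustrative sentences; each one is treated as a separate document.
documents = [
    "data science is fun",
    "python is used in data science",
    "scala runs on the jvm",
]
tokenized = [doc.split() for doc in documents]

def term_frequency(tokens):
    """TF: how often each term occurs, relative to the document length."""
    counts = Counter(tokens)
    return {term: count / len(tokens) for term, count in counts.items()}

def inverse_document_frequency(docs):
    """IDF: ln(N / number of documents that contain the term)."""
    n_docs = len(docs)
    vocabulary = {term for doc in docs for term in doc}
    return {
        term: math.log(n_docs / sum(term in doc for doc in docs))
        for term in vocabulary
    }

idf = inverse_document_frequency(tokenized)
tf_idf = [
    {term: tf * idf[term] for term, tf in term_frequency(doc).items()}
    for doc in tokenized
]

for doc, scores in zip(documents, tf_idf):
    print(doc, {term: round(score, 3) for term, score in scores.items()})
```

Terms that occur in many sentences get a low IDF, so words shared across sentences are weighted down relative to words that are specific to one sentence.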

11. Write a Scala program to find the average temperature, average dew point, and average wind speed for a given weather dataset.

12.
Perform the following operations using Python by creating student performance dataset.
1. Display Missing Values
2. Replace missing values using any 2 suitable techniques
3. Identify outliers using IQR and Z-score
4. Handle outliers using any technique
5. Perform data normalization using Min-Max scaling
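Steps 3 to 5 can be sketched as follows, assuming a made-up `scores` column. The 1.5 × IQR fences and the |z| > 3 cutoff are common conventions rather than requirements of the task, removal is just one possible way to handle outliers, and all function names are hypothetical.

```python
from statistics import mean, pstdev, quantiles

# Made-up scores with one obviously extreme value.
scores = [55, 58, 60, 61, 62, 63, 64, 65, 66, 68, 70, 120]

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

def zscore_outliers(values, threshold=3.0):
    """Flag values whose absolute z-score exceeds the threshold."""
    mu, sigma = mean(values), pstdev(values)  # population standard deviation
    return [v for v in values if abs((v - mu) / sigma) > threshold]

def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

flagged = set(iqr_outliers(scores)) | set(zscore_outliers(scores))
cleaned = [v for v in scores if v not in flagged]  # handle outliers by removal
scaled = min_max(cleaned)

print("IQR outliers:", iqr_outliers(scores))
print("z-score outliers:", zscore_outliers(scores))
print("scaled:", [round(v, 3) for v in scaled])
```

Min-max scaling is sensitive to extreme values, which is why the sketch removes the flagged outliers before rescaling.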

13.
Perform the following operations using Python by creating student performance dataset.
1. Display Missing Values
2. Replace missing values using any 2 suitable techniques
3. Identify outliers using IQR and Z-score
4. Handle outliers using any technique
5. Perform data normalization using decimal scaling
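Decimal scaling divides every value by 10^j, where j is the smallest integer that brings the largest absolute value below 1. A minimal sketch, with a hypothetical `decimal_scale` helper and made-up values:

```python
def decimal_scale(values):
    """Return (scaled_values, j) where scaled = value / 10**j and max |scaled| < 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    # Increase j until the largest magnitude falls strictly below 1.
    while max_abs / (10 ** j) >= 1:
        j += 1
    return [v / 10 ** j for v in values], j

# Made-up values; the largest magnitude is 120, so j should be 3.
scores = [55, 70, 120, -8]
scaled, j = decimal_scale(scores)

print("j =", j)
print("scaled:", scaled)
```

Unlike min-max scaling, this method keeps the sign of each value and only moves the decimal point, so relative magnitudes stay easy to read.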

14.
For given text apply following preprocessing methods:

1. Tokenization

2. POS Tagging
3. Stop word Removal
