Aai101 Data Science Question Bank

The document is a question bank for the course 'Introduction to Data Science' for the academic year 2024-2025, authored by Mr. Madhavan R. It outlines course outcomes, knowledge levels based on Bloom's taxonomy, and includes various questions categorized into parts A, B, and C covering topics such as data science processes, types of data, and statistical descriptions. The document serves as a comprehensive resource for assessing students' understanding and application of data science concepts.

DEPARTMENT OF COMPUTER SCIENCE AND BUSINESS SYSTEMS

QUESTION BANK
SUBJECT CODE: AAI101 YEAR / SEM: I/II
SUBJECT NAME: INTRODUCTION TO DATA SCIENCE
ACADEMIC YEAR: 2024-2025
NAME OF THE FACULTY: MR. MADHAVAN R

Course Outcomes
After successful completion of the course, the students should be able to:

Course Outcome No    Course Outcomes
CO1    Define the data science process
CO2    Understand different types of data description for data science process
CO3    Gain knowledge on relationships between data
CO4    Use the Python Libraries for Data Wrangling
CO5    Apply visualization Libraries in Python to interpret and explore data

Knowledge Level (Bloom's Taxonomy)

K1    Remembering (Knowledge)
K2    Understanding (Comprehension)
K3    Applying (Application of Knowledge)
K4    Analysing (Analysis)
K5    Evaluating (Evaluation)
K6    Creating (Synthesis)
UNIT 1

SYLLABUS: UNIT-1 INTRODUCTION


Data Science: Benefits and uses – facets of data – Data Science Process: Overview – Defining
research goals – Retrieving data – Data preparation – Exploratory Data analysis – build the
model– presenting findings and building applications – Data Mining – Data Warehousing –
Basic Statistical descriptions of Data
PART - A

Q.No    Questions    Topic    BT Level    Mark

1.    Define Data Science and Big Data. (NOV/DEC 2022)    Data Science: Uses and Benefits    BTL1    2
2.    What is the role of data science in business, medical research, healthcare, education, social media, technology and financial institutions?    Data Science: Uses and Benefits    BTL1    2
3.    What are the main types/categories of data?    Facets of Data    BTL1    2
4.    What is NLP? Is natural language structured data?    Facets of Data    BTL1    2
5.    What is machine-generated data? Give an example.    Facets of Data    BTL1    2
6.    What is graph-based or network data?    Facets of Data    BTL2    2
7.    List the steps involved in the data science process.    Data Science Process    BTL1    2
8.    What are outliers?    Data Science Process    BTL2    2
9.    Explain the types of data. (AU NOV/DEC 2023)    Facets of Data    BTL2    2
10.    Give an overview of common errors in retrieving data and the cleansing solutions to be employed. (NOV/DEC 2022)    Data Science Process    BTL2    2
11.    How are missing values present in a dataset treated during the data analysis phase? (AU APR/MAY 2024)    Data Science Process    BTL1    2
12.    Define median with an example. (AU NOV/DEC 2023)    Statistical Description of Data    BTL2    2
13.    Identify and write down the various data analytic challenges faced in the conventional system. (AU APR/MAY 2024)    Data Science Process    BTL2    2
14.    What is data preparation and process?    Data Science Process    BTL2    2
15.    What is data modeling?    Data Science Process    BTL2    2
16.    Define Data Mining. (APRIL/MAY 2023)    Data Mining    BTL2    2
17.    Give an overview of common errors. (NOV/DEC 2023)    Data Science Process    BTL1    2
18.    Identify the components of data science.    Data Science Process    BTL1    2
19.    How are missing values present in a dataset treated during the data analysis phase? (AU APR/MAY 2024)    Data Science Process    BTL1    2
20.    Enumerate the categories of data used in data science.    Facets of Data    BTL1    2
21.    Identify and write down the various data analytic challenges faced in the conventional system. (AU APR/MAY 2024)    Data Science Process    BTL1    2
22.    Mention the significance of setting goals in a data science project.    Data Science Process    BTL2    2
23.    What is structured data? (NOV/DEC 2023)    Facets of Data    BTL2    2
24.    Outline the difference between structured and unstructured data. (APRIL/MAY 2023)    Facets of Data    BTL2    2
25.    Define data warehousing, data mart and data lake.    Data Warehousing    BTL1    2
PART – B
1.    Examine the different facets of data with the challenges in their processing. (AU NOV/DEC 2022)    Facets of Data    BTL1    13
2.    List the facets of data with examples.    Facets of Data    BTL1    13
3.    Elaborate the steps in the data science process with a diagram. (AU NOV/DEC 2022, APR/MAY 2023)    Data Science Process    BTL1    13
4.    Briefly explain the architecture of data mining.    Data Mining    BTL1    13
5.    What is a data warehouse? Outline the architecture of a data warehouse with a diagram. (AU APR/MAY 2023)    Data Warehousing    BTL2    13
6.    How do you set the research goal, retrieve data and prepare data in the data science process?    Data Science Process    BTL2    13
7.    (i) Suppose a dataset has variables with more than 30% missing values. How will you deal with such a dataset?
      (ii) List the various feature selection methods for selecting the right variables for building efficient predictive models. Explain any two selection methods. (AU APR/MAY 2024)    Data Preparation    BTL2    13
8.    Explain the basic statistical description of data.    Statistical Description of Data    BTL2    6
9.    (i) Explain the data analytic life cycle. Write briefly about time-series analysis.
      (ii) Outline the purpose of data cleansing. How are missing and nullified data attributes handled and modified during the preprocessing stage? (AU APR/MAY 2024)    Data Preparation    BTL2    7
10.    Explain cleaning, integrating and transforming data in detail. (AU NOV/DEC 2023)    Exploratory data analysis    BTL2    6
PART – C (If applicable)
1.    Explain the challenges and implementation of data mining.    Data Mining    BTL2    15M
2.    Find the following for the given data set: Mean, Median, Mode, Variance, Standard Deviation and Skewness. (AU NOV/DEC 2023)    Statistical Description of Data    BTL2    15M
      MARKS             0-10   10-20   20-30   30-40   40-50   50-60   60-70   70-80
      NO OF STUDENTS    10     40      20      0       10      40      16      14
3.    If the collected dataset is population.csv, what are the steps for data preparation? Elaborate in detail.    Data Preparation    BTL2    15M
4.    What are predictor and target variables? Elaborate in detail on their use in data modeling, with a program and an example.    Data Modelling    BTL2    15M
5.    Elaborate the steps of data exploration techniques if the given data set is a population dataset from the year 2011 to 2016 and the quantity of data required ranges from 6 to 50.    Data Exploration    BTL3    15M

UNIT - 2

SYLLABUS: UNIT 2- DESCRIBING DATA

Types of Data – Types of Variables -Describing Data with Tables and Graphs –Describing Data with
Averages – Describing Variability – Normal Distributions and Standard (z) Scores

PART – A

BT
Q.No Questions Topic Mark
Level

What is frequency distribution? Describing BTL1 2


Data with
1.
Tables and
Graphs
what are the types and uses of frequency distribution? Describing BTL1 2
Data with
2.
Tables and
Graphs
what is grouped frequency distribution? Describing BTL1 2
Data with
3.
Tables and
Graphs
What is the ungrouped frequency distribution? Describing BTL3 2
Data with
4.
Tables and
Graphs
What is cumulative frequency distribution? Describing BTL1 2
Data with
5.
Tables and
Graphs
6. What is relative frequency distribution? Describing BTL1 2
Data with
Tables and
Graphs
Define percentile ranks? Describing BTL2 2
7. Data with
Averages
What is a histogram? Describing BTL2 2
Data with
8.
Tables and
Graphs
Explain any three features of histogram? Describing BTL2 2
Data with
9.
Tables and
Graphs
What is frequency polygon? Describing BTL2 2
Data with
10.
Tables and
Graphs
11.    What if a distribution has more than one mode or no mode at all?    Describing Data with Tables and Graphs    BTL1    2
Explain range, variance and SD ? Describing BTL2 2
12.
variability
What is degree of freedom? Describing BTL3 2
13.
variability
What is interquartile range (IQR)? Describing BTL2 2
14.
variability
Define normal curve and its property? Normal BTL2 2
15. Distributions
and z score
What is z-score? Normal BTL1 2
16. Distributions
and z score
Define data. What are the types of data Types of BTL1 2
17.
data
Compare and contrast qualitative and quantitative data with an example. Describing BTL3 2
(APRIL/MAY 2023) Data with
18.
Tables and
Graphs
Classify the below list of data into their types : a) ethnic group b) age c) BTL3 2
19. Describing
family size d) academic major e) sexual preference f) IQ score g) net worth Data with
(dollars) h) third-place finish (i) gender (j) temperature and write a brief Tables and
notes on them. (NOV/ DEC 2022) Graphs
List the differences between a discrete and continuous variable with an Types of BTL1 2
20.
example. (APRIL/MAY 2023)(NOV/ DEC 2022) variables
Differentiate between bar graph and a histogram Describing BTL1 2
Data with
21.
Tables and
Graphs
Define median with example.(AU NOV/DEC 2023) Describing BTL2 2
22. Data with
Averages
What is positively skewed distribution? Normal BTL2 2
23. Distributions
and z score
What is negatively skewed distribution? Normal BTL2 2
24. Distributions
and z score
What is a normal curve? Normal 2
25. Distributions BTL2
and z score
PART – B
1.    (i) Describe the types of variables.    Describing Variability    BTL3    13
      (ii) Suppose a hospital tested the age and body fat data for randomly selected adults with the following result:
      Age     23    27    39    49    50    52    54    56    57    58    60
      %fat    9.5   17.8  31.4  27.2  31.2  34.6  42.5  33.4  30.2  34.1  41
      Draw the boxplots for age. (AU NOV/DEC 2023)
2.    Indicate whether each of the following distributions is positively or negatively skewed. The distribution of    Normal Distributions and z score    BTL3    7
      (1) incomes of taxpayers has a mean of $48,000 and a median of $43,000
      (2) GPAs for all students at some college has a mean of 3.01 and a median of 3.20. (AU APR/MAY 2024)
3.    (i) Explain the normal curve and the z-score.    Normal Distributions and z score    BTL3    13
      (ii) Using the standard normal table, find the proportion of the total area identified with the following segments.
      (1) Above a z score of 1.80
      (2) Between the mean and a z score of 1.65
      (3) Between z scores of 0 and -1.96 (AU NOV/DEC 2023)
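The table lookups in part (ii) can be cross-checked numerically. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist` is an acceptable stand-in for the printed standard normal table; the variable names are illustrative only.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
z = NormalDist(mu=0.0, sigma=1.0)

# (1) Proportion of the area above z = 1.80.
area_above_180 = 1.0 - z.cdf(1.80)

# (2) Proportion between the mean (z = 0) and z = 1.65.
area_mean_to_165 = z.cdf(1.65) - z.cdf(0.0)

# (3) Proportion between z = 0 and z = -1.96.
area_0_to_neg196 = z.cdf(0.0) - z.cdf(-1.96)

print(round(area_above_180, 4))
print(round(area_mean_to_165, 4))
print(round(area_0_to_neg196, 4))
```

Because the normal curve is symmetric, segment (3) has the same area as the segment between 0 and +1.96.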
Demonstrate the different types of variables used in data analysis with an Types of BTL3 13
4.
example for each.(NOV/DEC 2022) variables
5.    The number of friends reported by Facebook users is summarized in the following frequency distribution. (NOV/DEC 2022)    Describing Data with Tables and Graphs    BTL3    13

      FRIENDS      f
      400-above    2
      350-399      5
      300-349      12
      250-299      17
      200-249      23
      150-199      49
      100-149      27
      50-99        29
      0-49         36
      Total        200

      (i) What is the shape of this distribution?
      (ii) Find the relative frequencies.
      (iii) Find the approximate percentile rank of the interval 300-349.
      (iv) Convert to a histogram.
      (v) Why would it not be possible to convert to a stem and leaf display?


6.    What is relative frequency distribution? The GRE scores for a group of graduate school applicants are distributed as follows:    Describing Data with Tables and Graphs    BTL3    8

      GRE Score    Frequency
      725-749      1
      700-724      3
      675-699      14
      650-674      30
      625-649      34
      600-624      42
      575-599      30
      550-574      27
      525-549      13
      500-524      4
      475-499      2
      TOTAL        200

      Explain the procedure to convert a frequency distribution into a relative frequency distribution and convert the data presented in the above table to a relative frequency distribution. Do not round the numbers to two digits to the right of the decimal point. (NOV/DEC 2022)
7.    Suppose IQ scores have a bell-shaped distribution with a mean of 100 and a standard deviation of 15. Calculate the following:    Normal Distributions and z score    BTL3    7
      (i) What percentage of people should have an IQ score between 85 and 115?
      (ii) What percentage of people should have an IQ score between 70 and 130?

8.    (i) What is a z-score? Outline the steps to obtain a z-score. (APR/MAY 2023)    Normal Distributions and z score    BTL3    7
9.    (ii) Express each of the following scores as a z-score: First, Mary's intelligence quotient is 135, given a mean of 100 and a standard deviation of 15. Second, Mary obtained a score of 470 in the Competitive Examination conducted in April 2022, given a mean of 500 and a standard deviation of 100. (APR/MAY 2023)    Normal Distributions and z score    BTL3    6
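The z-score conversion asked for here is z = (score - mean) / standard deviation. A minimal sketch, where the helper name `z_score` is an illustrative choice and not part of the question:

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Return how many standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


iq_z = z_score(135, mean=100, sd=15)       # Mary's IQ of 135
exam_z = z_score(470, mean=500, sd=100)    # Mary's exam score of 470

print(round(iq_z, 2))    # positive: above the mean
print(round(exam_z, 2))  # negative: below the mean
```

A positive result places the score above the mean and a negative result places it below, which is why the two scores for the same person can differ in sign.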
10.    What is frequency distribution? Customers who have purchased a particular product rated the usability of the product on a 10-point scale, ranging from 1 (poor) to 10 (excellent) as follows:    Describing Data with Tables and Graphs    BTL3    5
      3    7    2    7    8
      3    1    4    10   3
      2    5    3    5    8
      9    7    6    3    7
      8    9    7    3    6
      Construct a frequency distribution for the above data. (APR/MAY 2023)
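One way to tally an ungrouped frequency distribution for the 25 ratings above is sketched below. It assumes the ratings are read row by row, and uses `collections.Counter` from the standard library; the variable names are illustrative.

```python
from collections import Counter

# The 25 usability ratings, read row by row from the question.
ratings = [
    3, 7, 2, 7, 8,
    3, 1, 4, 10, 3,
    2, 5, 3, 5, 8,
    9, 7, 6, 3, 7,
    8, 9, 7, 3, 6,
]

freq = Counter(ratings)

# List every point on the scale from high to low, including any with zero frequency.
print("Rating  f")
for rating in range(10, 0, -1):
    print(f"{rating:>6}  {freq[rating]}")
print(f" Total  {sum(freq.values())}")
```

Listing every scale point, even ones nobody chose, keeps the table complete and makes gaps in the distribution visible.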

PART – C
1.    (i) What is mode? Can there be distributions with no mode or more than one mode? The owner of a new car conducts six gas mileage tests and obtains the following results, expressed in miles per gallon: 26.3, 28.7, 27.4, 26.6, 27.4, 26.9. Find the mode for these data.    Describing Data with Tables and Graphs    BTL3    15
      (ii) What is median? Outline the steps to find the median and find the median for the following scores: first, a set of five scores 2, 8, 2, 7, 6 and second, a set of six scores 3, 8, 9, 3, 1, 8, with steps.
      (APRIL/MAY 2023)
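Hand-worked answers to this question can be checked with a short sketch like the one below. It assumes the standard-library `statistics` module is acceptable for the check; `multimode` is used because it also reports ties when a distribution has more than one mode.

```python
import statistics

mileage = [26.3, 28.7, 27.4, 26.6, 27.4, 26.9]
five_scores = [2, 8, 2, 7, 6]
six_scores = [3, 8, 9, 3, 1, 8]

# multimode returns every value tied for the highest frequency.
modes = statistics.multimode(mileage)

# Odd count: the median is the middle value of the sorted scores.
median_five = statistics.median(five_scores)

# Even count: the median is the mean of the two middle sorted values.
median_six = statistics.median(six_scores)

print(modes, median_five, median_six)
```

Sorting first is the key manual step: the five scores become 2, 2, 6, 7, 8 and the six scores become 1, 3, 3, 8, 8, 9.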

2.    During their first swim through a water maze, 15 laboratory rats made the following number of errors (blind alleyway entrances): 2, 17, 5, 3, 28, 7, 5, 8, 5, 6, 2, 12, 10, 4, 3 (AU APR/MAY 2024)    Describing Data with Averages    BTL2    15M
Explain in detail about the Normal Distribution. Normal BTL2 15 M
3. Distributions
and z score
Explain in detail about the Describing data with Tables and Graphs. Describing BTL2 15 M
Data with
4.
Tables and
Graphs
Explain in detail about the Types of Variables in Data Science give a case BTL2 15 M
Types of
5. study on Sample and Population mean and variance with relevant examples
variables
and dataset.
UNIT - 3

SYLLABUS: UNIT III- DESCRIBING RELATIONSHIPS


Correlation –Scatter plots –correlation coefficient for quantitative data –computational formula for
correlation coefficient – Regression –regression line –least squares regression line – Standard error of
estimate – interpretation of r2 –multiple regression equations –regression towards the mean.
PART – A

BT
Q.No Questions Topic Mark
Level

What is percentile rank? Give an example.(NOV/DEC 2022) BTL4 2


1. Correlation

Consider Helen sent 10 greeting cards to her friends and she BTL3 2
2. received back 8 cards, what is the kind of relationship it is? Brief on Correlation
it.(NOV/DEC 2022)
What is the use of scatter plot? (APRIL/MAY 2023) BTL4 2
3. Scatter plots

4.    Define Correlation Coefficient. (APRIL/MAY 2023)    Correlation coefficient    BTL4    2
What is correlation and its types? Correlation BTL1 2
5.
coefficient
Define Scatterplots? BTL4 2
6. Scatter plots

What is a correlation coefficient? Correlation BTL1 2


7.
coefficient
Define Regression. BTL1 2
8. Regression

Write the types of regression analysis. BTL1 2


9. Regression

Define single and multiple linear regression. Multiple BTL1 2


10.
regression
equations
What is ridge regression? BTL2 2
11. Regression

What is standard error estimate? Standard BTL3 2


12. error of
estimate
What is the need for correlation? BTL1 2
13. Correlation

What is causation? BTL4 2


14. Correlation

What is linear relationship and non-linear relationship? BTL1 2


15. Regression

List the types of nonlinear relationship BTL1 2


16. Scatterplots

What is curvilinear relationship BTL1 2


17. Scatterplots

What are the key properties of Pearson correlation coefficient ? (AU Correlation BTL1 2
18.
NOV/DEC 2023) coefficient
Compare correlation and regression BTL2 2
19. Correlation

What is restricted range? Correlation BTL3 2


20.
coefficient
What is interpretation of r2 ? Correlation BTL1 2
21.
coefficient
22.    What is regression towards the mean? (AU NOV/DEC 2023)    Regression towards mean    BTL4    2
23.    State the purpose of adding additional quantitative and/or categorical explanatory variables to any developed linear regression model. Justify with an example. (AU APR/MAY 2024)    Regression    BTL1    2
24.    Give an example of a dataset with a non-Gaussian distribution. (AU APR/MAY 2024)    Regression    BTL4    2
25.    What is multicollinearity?    Multiple regression equations    BTL4    2
PART – B
1.    Calculate the correlation coefficient for the heights (in inches) of fathers (x) and their sons (y) with the data presented below. (APR/MAY 2023)    Correlation coefficient    BTL3    13
      x    66   68   68   70   71   72   72
      y    68   70   69   72   72   72   74
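A hand calculation for this question can be verified with the sketch below. It assumes the usual raw-score (computational) form of Pearson's r, r = (nΣxy - ΣxΣy) / sqrt((nΣx² - (Σx)²)(nΣy² - (Σy)²)); the variable names are illustrative.

```python
import math

x = [66, 68, 68, 70, 71, 72, 72]   # fathers' heights (inches)
y = [68, 70, 69, 72, 72, 72, 74]   # sons' heights (inches)
n = len(x)

# Raw sums needed by the computational formula.
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(round(r, 4))
```

The value is close to +1, which is consistent with taller fathers tending to have taller sons in this small sample.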
2.    The values of x and their corresponding values of y are presented below. (APR/MAY 2023)    Least squares    BTL3    13
      x    0.5   1.5   2.5   3.5   4.5   5.5   6.5
      y    2.5   3.5   5.5   4.5   6.5   8.5   10.5
      (i) Find the least squares regression line y = ax + b.
      (ii) Estimate the value of y when x = 10.
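The least squares line for this data can be sketched as follows, assuming the standard closed-form estimates a = SPxy / SSx and b = mean(y) - a · mean(x); the names `a`, `b` and `y_at_10` mirror the question and are otherwise illustrative.

```python
x = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5]
y = [2.5, 3.5, 5.5, 4.5, 6.5, 8.5, 10.5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Sum of products of deviations and sum of squared x deviations.
sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
ss_x = sum((xi - mean_x) ** 2 for xi in x)

a = sp_xy / ss_x            # slope
b = mean_y - a * mean_x     # intercept

y_at_10 = a * 10 + b        # part (ii): extrapolated estimate at x = 10

print(f"y = {a:.4f}x + {b:.4f}")
print(round(y_at_10, 4))
```

Note that x = 10 lies outside the observed range of 0.5 to 6.5, so the estimate in part (ii) is an extrapolation and should be read with that caution.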

3.    Categorize the different types of relationships using scatter plots.    Scatter plots    BTL3    13
      Each of the following pairs represents the number of licensed drivers (X) and the number of cars (Y) for seven houses in my neighborhood:
      DRIVERS (X)    CARS (Y)
      5              4
      5              3
      2              2
      2              2
      3              2
      1              1
      2              2
      (1) Construct a scatterplot to verify a lack of pronounced curvilinearity.
      (2) Determine the least squares equation for these data. (Remember, you will first have to calculate r, SSy and SSx.)
      (3) Determine the standard error of estimate, Sy/x, given that n = 7.
      (NOV/DEC 2022)
4. (i) In studies dating back over 100 years, it’s well established that Correlation BTL3 13
regression toward the mean occurs between the heights of fathers
and the heights of their adult sons.
Indicate whether the following statements are true or false.
(1) Sons of tall fathers will tend to be shorter than their fathers.
(2) Sons of short fathers will tend to be taller than the mean for
all sons.
(3) Every son of a tall father will be shorter than his father.
(4) Taken as a group, adult sons are shorter than their fathers.
(5) Fathers of tall sons will tend to be taller than their sons.
(6) Fathers of short sons will tend to be taller than their sons but
shorter than the mean for all fathers.
(ii) Interpret the value of r2 in correlation based analysis.(NOV/DEC
2022)
(i) In statistics, highlight the impact when the goodness of fit test BTL3 13
score is low?
(ii) Given the following dataset of employee.Using regression
analysis, find the expected salary of an employee if the age is 45.
Age Salary
5. 54 67000 Regression
42 43000
49 55000
57 71000
35 25000
(AU APR/MAY 2024)
6.    (i) Define autocorrelation. How is it calculated? What does a negative correlation convey?    Linear Regression    BTL2    13
      (ii) What is the philosophy of logistic regression? What kind of model is it? What does logistic regression predict? Tabulate the cardinal differences between linear and logistic regression.
      (AU APR/MAY 2024)
(i) Explain scatter plot. BTL2 13
7. (ii) Describe range and variance Scatter plot
(AU NOV/DEC 2023)
8.    Calculate the value of r using the computational formula for the following data.    Correlation coefficient    BTL3    13
      FRIENDS    SENT    RECEIVED
      Dories     13      14
      Steve      9       18
      Mike       7       12
      Andrea     5       10
      John       1       6
(i) Explain the correlation coefficient. Correlation BTL3 6
9.
(ii) Explain how the least squares equation which is used to coefficient
minimize the total of all squared prediction errors with example.
(AU NOV/DEC 2023)
Explain in detail about Multiple Regression Equations. Multiple BTL2 7
10.
regression
PART – C
1.    Assume that an r of -.80 describes the strong negative relationship between years of heavy smoking (X) and life expectancy (Y). Assume, furthermore, that the distributions of heavy smoking and life expectancy each have the following means and sums of squares:    Standard error of estimate    BTL3    15M
      X̄ = 5, Ȳ = 60, SSx = 35, SSy = 70
      (i) Determine the least squares regression equation for predicting life expectancy from years of heavy smoking.
      (ii) Determine the standard error of estimate, Sy/x, assuming that the correlation of -.80 was based on n = 50 pairs of observations.
      (iii) Supply a rough interpretation of Sy/x.
      (iv) Predict the life expectancy for John, who has smoked heavily for 8 years.
      (v) Predict the life expectancy for Katie, who has never smoked heavily. (NOV/DEC 2022)
2.    Consider the following dataset with one response variable y and two predictor variables x1 and x2.    Multiple regression    BTL3    15M
      y     140   155   159   179   192   200   212   215
      x1    60    62    67    70    71    72    75    78
      x2    22    25    24    20    15    14    14    11
      Fit a multiple linear regression model to this dataset.
      (APRIL/MAY 2023)
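A fit of the form y = b0 + b1·x1 + b2·x2 for the data above can be sketched with NumPy's least squares solver. This is one possible approach, not the method the question prescribes; the variable names are illustrative, and a hand solution via the normal equations should give the same coefficients.

```python
import numpy as np

y = np.array([140, 155, 159, 179, 192, 200, 212, 215], dtype=float)
x1 = np.array([60, 62, 67, 70, 71, 72, 75, 78], dtype=float)
x2 = np.array([22, 25, 24, 20, 15, 14, 14, 11], dtype=float)

# Design matrix with a leading column of ones for the intercept b0.
X = np.column_stack([np.ones_like(x1), x1, x2])

# Ordinary least squares: minimizes the sum of squared residuals.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2 = beta

residuals = y - X @ beta
print(f"y = {b0:.3f} + {b1:.3f}*x1 + {b2:.3f}*x2")
```

At the least squares solution the residuals are orthogonal to every column of the design matrix, which gives a simple numerical check on the fit.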

How do businesses use Regression Analysis? BTL3 15M


Linear
3.
regression
What are the Linear model assumptions in Regression Analysis?
4.    You are given a data set. The set contains many variables, some of which are highly correlated, and you know about it. Propose ways to handle such high-dimensional data.    Correlation    BTL3    15M
5. Fit the line using the following data using the least square method Least square BTL3 15M
Years of Experience Salary method
16 5.1 66029

20 6.8 91738

8 3.2 64445

6 2.2 39891

4 7.1 98273

21 3.2 54445

7 10.5 121872

29 6.0 93490

19 4.0 55794

11 8.2 11812

UNIT 4

SYLLABUS: UNIT IV PYTHON LIBRARIES FOR DATA WRANGLING


Basics of Numpy arrays –aggregations –computations on arrays –comparisons, masks,
boolean logic – fancy indexing – structured arrays – Data manipulation with Pandas – data
indexing and selection – operating on data – missing data – Hierarchical indexing –
combining datasets – aggregation and grouping – pivot tables
PART – A
Topics

Q.No Questions BT Level


Mark
State the advantages of using Numpy NumPy BTL2 2
1. arrays.(APRIL/MAY 2023) Arrays
Outline the two types of Numpy’s UFuncs. Universal BTL2 2
2. (APRIL/MAY 2023) functions
List the attributes of Numpy array. Give an example for NumPy BTL2 2
3. it.(NOV/DEC 2022) Arrays

Create a data frame with key and data pairs as Key-Data Data BTL2 2
4. pair as A-10,C-20, C-5,B-10,C-10. Find the sum of each manipulation
key and display the result as each key group. with pandas
(NOV/DEC 2022)
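The key-wise sum described in this question can be sketched with a pandas `groupby`. The column names `key` and `data` are assumptions chosen to match the wording of the question.

```python
import pandas as pd

# Key-data pairs from the question: A-10, C-20, C-5, B-10, C-10.
df = pd.DataFrame(
    {
        "key": ["A", "C", "C", "B", "C"],
        "data": [10, 20, 5, 10, 10],
    }
)

# Split by key, sum each group, and combine into one Series indexed by key.
totals = df.groupby("key")["data"].sum()

print(totals)
```

By default `groupby` sorts the group keys, so the result lists A, B, C in that order regardless of the row order in the input.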
Explain Partial sort.(AU NOV/DEC 2023) Sorting arrays BTL2 2
5.
Under what circumstances,the pivot_table() in pandas is Data BTL2 2
6. used? (AU APR/MAY 2024) manipulation
with pandas
7.    Write the output for the following NumPy code.    NumPy Arrays    BTL2    2
      (i) np.array([3,14,4,2,3])
      (ii) np.array([1,2,3,4],dtype='float32')
      (iii) np.array([range(i,i+3) for i in [2,4,6]])
      (iv) np.zeros(10,dtype=int)
      (v) np.ones((3,5), dtype=float)
      (vi) np.full((3,5),3.14)
      (vii) np.arange(0,20,20)
      (viii) np.linspace(0,1,50)
      (ix) np.random.random((3,3))
      (x) np.random.normal(0,1,(3,3))
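The array-creation calls in this question can be run as written once the function names are spelled as NumPy defines them. The sketch below assumes the intended functions in (vii) and (viii) are `np.arange` and `np.linspace`, and keeps the arguments exactly as printed in the question.

```python
import numpy as np

a = np.array([3, 14, 4, 2, 3])                       # 1-D integer array
b = np.array([1, 2, 3, 4], dtype="float32")          # explicit 32-bit float dtype
c = np.array([range(i, i + 3) for i in [2, 4, 6]])   # 3x3 array built from ranges
d = np.zeros(10, dtype=int)                          # ten integer zeros
e = np.ones((3, 5), dtype=float)                     # 3x5 array of 1.0
f = np.full((3, 5), 3.14)                            # 3x5 array filled with 3.14
g = np.arange(0, 20, 20)                             # start 0, stop 20 (exclusive), step 20
h = np.linspace(0, 1, 50)                            # 50 evenly spaced values from 0 to 1
i = np.random.random((3, 3))                         # uniform samples in [0, 1)
j = np.random.normal(0, 1, (3, 3))                   # normal samples, mean 0, sd 1

for name, arr in [("a", a), ("b", b), ("c", c), ("g", g), ("h", h)]:
    print(name, arr.shape, arr.dtype)
```

With a step of 20 the `np.arange` call yields a single element, and the two random calls produce different values on every run unless a seed is set.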
8.    Using appropriate data visualization modules, develop a Python code snippet that generates a simple sinusoidal wave in an empty gridded axes. (AU APR/MAY 2024)    Data manipulation with pandas    BTL2    2
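One possible answer to this question is sketched below, assuming Matplotlib and NumPy are the "appropriate modules". The non-interactive `Agg` backend is selected only so the sketch runs without a display; that line is an assumption of this example and can be dropped in a notebook.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; remove this line for interactive use

import matplotlib.pyplot as plt
import numpy as np

# 200 sample points over one full period of the sine function.
x = np.linspace(0, 2 * np.pi, 200)

fig, ax = plt.subplots()
ax.grid(True)                      # draw the grid on the otherwise empty axes
(line,) = ax.plot(x, np.sin(x))    # the sinusoidal wave
ax.set_xlabel("x")
ax.set_ylabel("sin(x)")
```

In an interactive session, calling `plt.show()` after this displays the figure; `fig.savefig(...)` writes it to a file instead.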
What is Data frame? Data BTL2 2
9. manipulation
with pandas
How a pandas data frame can be constructed? Data BTL2 2
10. manipulation
with pandas
What are indexers? Hierarchical BTL2 2
11. indexing
How missing data can be handled in python? Handling BTL2 2
12. missing data
How the operations can be performed on null values in Handling BTL2 2
13. pandas data science? missing data
Define Hierarchical indexing. Hierarchical BTL2 2
14. indexing
What is pivot table? Data BTL2 2
15. manipulation
with pandas
Identify the details maintained by python to store an Structured BTL2 2
16. integer data
Write python code to create 1D,2D and 3D numpy Numpy arrays BTL2 2
17. arrays.
How do you verify the shape of 1D, 2D and 3D/ND Numpy arrays BTL2 2
18. array respectively?
Compare python list with arrays Data Indexing BTL2 2
19. and Selection
Write short note on python array object Data Indexing BTL2 2
20. and Selection
How to perform slicing to access the elements of numpy Data Indexing BTL2 2
21. arrays and Selection
Summarize some built-in Pandas Aggregation BTL2 2
22. aggregations.(NOV/DEC 2023) and Grouping
What is indexing and negative indexing in tuple. Data Indexing BTL2 2
23. and Selection
Write the list of aggregate functions of numpy Aggregation BTL2 2
24. and Grouping
What is fancy indexing? Fancy BTL2 2
25. Indexing

PART - B
Imagine you have a series of data that represents the Data BTL3 13
1. amount of precipitation each day for a year in a given manipulation
city. Load the daily rainfall statistics for the City of with pandas
Chennai in 2021 which is given in a csv file
Chennairainfall2021.csv using pandas generate a
histogram for rainy days, and find out the days that have
high rainfall.(NOV/DEC 2022)
Consider that an E-Commerce organization like Data BTL3 13
2. Amazon, have different regions sales as NorthSales, manipulation
SouthSales, WestSales, EastSales.csv files. They want with pandas
to combine North and West region sales and South and
East sales so to find the aggregate sales of these
collaborating regions help them to do so using Python
code.(NOV/DEC 2022).
Explain grouping in python with example. (AU Aggregation BTL3 13
3. NOV/DEC 2023) and Grouping

Describe about fancy indexing with an example. Fancy BTL3 6


4. Indexing
Explain the following in python Data Indexing BTL2 7
5. (i) Data indexing and Selection
(ii) Operation on missing data
(AU NOV/DEC 2023)
What is a universal function? Explain clearly each Universal BTL2 13
6. function with examples. Functions
Define Dictionary in Python.Do the following Data Indexing BTL3 13
7. operations on dictionaries. and Selection
(i) Initialize two dictionaries (D1 and D2) with
key and value pairs.
(ii) Compare those two dictionaries with master
key list ‘M’ and print the missing keys.
(iii) Find keys that are in D1 but NOT in D2
(iv) Merge D1 and D2 and create D3 using
expressions(AU APR/MAY 2024)
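The four dictionary operations in this question can be sketched in plain Python. The sample contents of `D1`, `D2` and `M` below are invented for illustration, since the question leaves them to the student.

```python
# (i) Two dictionaries with hypothetical key-value pairs.
D1 = {"a": 1, "b": 2, "c": 3}
D2 = {"b": 20, "d": 40}

# Hypothetical master key list M.
M = ["a", "b", "c", "d", "e"]

# (ii) Keys from M that are missing in each dictionary.
missing_from_d1 = [k for k in M if k not in D1]
missing_from_d2 = [k for k in M if k not in D2]

# (iii) Keys in D1 but not in D2, using set difference on the key views.
only_in_d1 = sorted(D1.keys() - D2.keys())

# (iv) Merge with a dict-unpacking expression; D2's value wins on shared keys.
D3 = {**D1, **D2}

print(missing_from_d1, missing_from_d2, only_in_d1, D3)
```

On Python 3.9 and later, `D1 | D2` is an equivalent merge expression with the same right-hand-side-wins rule.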
(i) How to create Hierarchical data from the existing Hierarchical BTL3 13
8. data frame? indexing
(ii) How to use group by with 2 columns in data set?
Give a python code snippet.
Explain data objects in pandas. Data BTL2 13
9. manipulation
with pandas
Briefly explain the hierarchical indexing with examples Hierarchical BTL3 7
10. indexing
PART – C (If applicable)
Explain about the example Counting Rainy days in Aggregation BTL3 15
1. Boolean Masking. and Grouping
Given an unsorted multi indexes that represents the Data Indexing BTL3 15
2. distance between two cities, write a python code snippet and Selection
using appropriate libraries to find the shortest distance
between any two given cities. The following matrix
representation can be used to create the data frame that
can be served as an input for the prescribed program.
(AU APR/MAY 2024)
A B C D E
A 0 30 24 6 13
B 16 0 19 5 10
C 7 16 0 15 12
D 9 17 22 0 18
E 21 8 9 11 0
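The question does not say whether "shortest distance" means the direct entry in the matrix or the shortest route through intermediate cities. The sketch below assumes the second reading, treats rows as origins and columns as destinations, and applies the Floyd-Warshall algorithm with NumPy broadcasting; these choices and all variable names are assumptions of this example.

```python
import numpy as np
import pandas as pd

cities = ["A", "B", "C", "D", "E"]

# Direct distances from the question: row = origin, column = destination.
direct = pd.DataFrame(
    [
        [0, 30, 24, 6, 13],
        [16, 0, 19, 5, 10],
        [7, 16, 0, 15, 12],
        [9, 17, 22, 0, 18],
        [21, 8, 9, 11, 0],
    ],
    index=cities,
    columns=cities,
)

# Floyd-Warshall: allow each city k in turn as an intermediate stop.
d = direct.to_numpy(dtype=float).copy()
for k in range(len(cities)):
    # d[i, j] = min(d[i, j], d[i, k] + d[k, j]) for all i, j at once.
    d = np.minimum(d, d[:, [k]] + d[[k], :])

shortest = pd.DataFrame(d, index=cities, columns=cities)

# The same table as a Series with a two-level (origin, destination) MultiIndex.
by_pair = shortest.stack()

print(direct.loc["A", "B"], shortest.loc["A", "B"], by_pair[("A", "B")])
```

For example, the direct A to B entry is 30, but routing through E (13 + 8) is shorter, so the two readings of the question give different answers for that pair.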
3.    A URL server wants to consolidate the history of websites visited by a user 'U'. Every visited website's information is stored in a 2-tuple format, viz. (website_id, Duration_of_visit), in the URL cache. Using split, apply and combine operations, devise a code snippet that consolidates the website history and finds the website whose duration of visit is maximum.    Data Indexing and Selection    BTL3    15
      Example:
      Input: [(4,2),(5,1),(4,3),(1,4),(7,3),(5,2),(1,1),(7,1)]
      Output: [(4,5),(5,3),(1,5),(7,4)]
      The website with key_id '1' has the max. duration of visit = 5.
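A split-apply-combine version of this consolidation can be sketched with a pandas `groupby`, as below. The column names are assumptions taken from the tuple format in the question. In the sample input two websites end up with the same total of 5, so the sketch reports every website tied for the maximum.

```python
import pandas as pd

visits = [(4, 2), (5, 1), (4, 3), (1, 4), (7, 3), (5, 2), (1, 1), (7, 1)]
df = pd.DataFrame(visits, columns=["website_id", "duration"])

# Split by website, apply sum to each group, combine into one Series.
# sort=False keeps the websites in order of first appearance.
totals = df.groupby("website_id", sort=False)["duration"].sum()

consolidated = [(int(site), int(total)) for site, total in totals.items()]

# Every website whose total equals the maximum (handles ties).
longest = [int(site) for site in totals[totals == totals.max()].index]

print(consolidated)
print(longest, int(totals.max()))
```

Using `totals.idxmax()` instead would return only the first website that reaches the maximum, which hides the tie.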

Explain about the steps involved in GroupBy with Aggregation BTL3 15


4. suitable diagram and coding. and Grouping
Describe in detail about pivot table.(AU NOV/DEC Data BTL3 15
5. 2023) manipulation
with pandas

UNIT 5

SYLLABUS: UNIT V DATA VISUALIZATION


Importing Matplotlib – Line plots – Scatter plots – visualizing errors – density and contour plots –
Histograms – legends – colors – subplots – text and annotation – customization – three-
dimensional plotting – Geographic Data with Basemap – Visualization with Seaborn.

PART – A
Topic

Q.No Questions BT Level


Mark

What is the purpose of errorbar function in Matplotlib? visualizing BTL2 2


1. Give an example.(NOV/DEC 2022) errors
Showcase 3-dimensional drawing in Matplotlib with Three BTL2 2
2. corresponding Python Code. (NOV/DEC 2022) dimensional
plotting
State the two possible options in Ipython notebook used Three BTL2 2
3. to embed graphics directly in the notebook. dimensional
(APRIL/MAY 2023) plotting
4.    How does the plt.scatter function differ from the plt.plot function? (APRIL/MAY 2023)    Scatter plots    BTL2    2
What is purpose of matplotlib? Importing BTL2 2
5. Matplotlib
Write the dual interface of matplotlib? Importing BTL2 2
6. Matplotlib
How to draw a simple line plot using matplotlib? Line plots BTL2 2
7.
What functions can be used to draw scatter plots? Scatter plots BTL2 2
8.
Write the difference between plot and scatter functions? Scatter plots BTL2 2
9.
Define contour plots? Density and BTL2 2
10. contour plots
What functions can be used to draw contour plots? Density and BTL2 2
11. contour plots
What is the purpose of histogram? Histograms BTL2 2
12.
Write a source code to draw a simple histogram Histograms BTL2 2
13.
How to create a 3-D wireframe plot? Three BTL2 2
14. dimensional
plotting
Define surface plot? Three BTL2 2
15. dimensional
plotting
What is the use of seaborn? Visualization BTL2 2
16. with Seaborn
What is pair plots? Density and BTL2 2
17. contour plots
What is density plot? Density and BTL2 2
18. contour plots
Mention the significance of subplots? Subplots BTL2 2
19.
Brief on basemap tool kit. Geographic BTL2 2
20. Data with
Basemap
Write python code to plot sine and cos wave. density and BTL2 2
21. contour plots
22.    How can you set different colors for line plot.    Line plots    BTL2    2
23.    Write a python code snippet that generates a time-series graph representing COVID-19 incidence cases for a particular week. (AU APR/MAY 2024)    Histograms    BTL2    2
      Day 1    Day 2    Day 3    Day 4    Day 5    Day 6    Day 7
      7        18       9        44       2        5        89
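A simple time-series line plot for the seven daily counts above might look like the sketch below. It assumes Matplotlib is the intended library; the `Agg` backend line is only there so the example runs without a display, and the labels and title are illustrative.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; remove this line for interactive use

import matplotlib.pyplot as plt

days = [1, 2, 3, 4, 5, 6, 7]
cases = [7, 18, 9, 44, 2, 5, 89]   # incidence counts from the question

fig, ax = plt.subplots()
(line,) = ax.plot(days, cases, marker="o")   # line with a marker per day
ax.set_xticks(days)
ax.set_xticklabels([f"Day {d}" for d in days])
ax.set_xlabel("Day of week")
ax.set_ylabel("Reported cases")
ax.set_title("COVID-19 incidence cases for one week")
ax.grid(True)
```

Markers are useful here because there are only seven observations, so each day's value stays visible on the line.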
Write a python code snippet that draws a histogram for Histograms BTL2 2
24. the following list of positive numbers (AU APR/MAY
2024)
7 1 9 44 2 5 89 9 11 6 7 85 91
8 1 7
State the categories of colormaps. Colors BTL2 2
25.
PART – B
How text and image annotations are done using Python? Text and BTL3 13
1. Give an example of your own with appropriate Python annotation
code.(NOV/DEC 2022)
Appraise the following (i) Histograms (ii) Binnings (iii) Histogram BTL3 13
2. Density with appropriate Python code.(NOV/DEC
2022)
Explain the different types of joins in python.(AU Customization BTL1 13
3. NOV/DEC 2023)

Explain various features of Matplotlib platform used for Importing BTL1 13


4. data visualization and illustrate its challenges.(AU Matplotlib
NOV/DEC 2023)
Explain contour plot and histogram. Density and BTL2 13
5. contour plots
6.    Write a code snippet that projects our globe as a 2-D flat surface (using cylindrical projection) and conveys information about the location of any three major Indian cities in the map (using scatter plot). (AU APR/MAY 2024)    Three dimensional plotting    BTL3    13
7.    (i) Write working code that performs a simple Gaussian process regression (GPR), using the Scikit-Learn API.    Visualization with Seaborn    BTL3    13
      (ii) Briefly explain visualization with Seaborn. Give an example working code segment that represents a 2-D kernel density plot for any data. (AU APR/MAY 2024)
Explain in detail Customizing Colorbars. Colors BTL3 13
8.
Explain in detail Customizing Plot Legends. Legends BTL3 13
9.
Explain in detail about Multiple Subplots. Subplots BTL3 13
10.
PART – C (If applicable)
Perform an exploratory data analysis for the following Visualization BTL3 15
1. data with different types of plots: with Seaborn
The dataset contains cases from a study that was
conducted between 1958 and 1970 at the University of
Chicago’s Billings Hospital on the survival of patients
who had undergone surgery for breast cancer.
Data attributes :-
Age of patient at the time of operation (numerical)
Patient’s year of operation (year - 1900, numerical)
Number of positive axillary nodes detected(numerical)
Survival status (class attribute ) 1= the patient survived
5 years or longer, 2= the patient died within 5 year.
(NOV/DEC 2022)

Explain in detail about Visualizing a Mobius Strip. Three BTL3 15


2. dimensional
plotting
Explain about Geographic data with Basemap with Geographic BTL3 15
3. different Map Projections, Map background and Plotting data with
data in Maps. Basemap
Explain the example California Cities. Geographic BTL3 15
4. data with
Basemap
Explain the example of Surface Temperature Data. Geographic BTL3 15
5. data with
Basemap
