QBSTS_C1
DESCRIPTIVE STATISTICS
Classification, Tabulation, Frequency Distribution and Graphical Representation
Fill in the Blanks
1. Classification can be done according to _____.
2. Year-wise recording of data of food production will be called _____ classification.
3. The difference between the upper and lower limits of a class is called _____.
4. The average of the upper and lower limits of a class is known as _____.
5. The formula for determining the number of classes was given by _____.
6. Sturges' formula for determining the number of classes is _____.
7. An arrangement of data in rows and columns is known as _____.
8. The graphs of the less-than and more-than ogives intersect at _____.
9. The distribution of frequencies according to individual variate values is called _____ distribution.
10. Frequency distributions are often constructed with the help of _____.
1. Statistics is the study of _____ and their interpretation.
2. The _____ of statistics includes collecting, organizing, analyzing, and interpreting data.
3. A _____ consists of all the items or individuals under consideration in a statistical study.
4. A _____ is a subset of the population selected for analysis.
5. Data can be categorized as _____ or _____.
6. _____ data are those that can be measured and expressed numerically.
7. _____ data are descriptive and cannot be measured numerically.
8. Attributes are the characteristics or _____ of the items being studied.
9. Variables are properties that can change or _____ among the items being studied.
10. The _____ scale of measurement involves categories without any inherent order.
11. An _____ scale of measurement has categories with a meaningful order but no fixed interval.
12. The _____ scale of measurement has meaningful order and consistent intervals but no true zero
point.
13. The _____ scale of measurement has meaningful order, consistent intervals, and a true zero point.
14. _____ presentation of data involves presenting it in tables.
15. _____ presentation of data involves representing it visually through charts or graphs.
16. A _____ is a graphical representation of frequency distribution using bars.
17. An _____ is a graphical representation of cumulative frequency distribution.
18. Consistency of data refers to the degree of _____ among measurements.
19. Independence of data means that one observation does not _____ another in the dataset.
20. When dealing with attributes, _____ refers to the situation when data items belong to one and only
one category.
21. The primary purpose of statistics is to _____ data to gain insights and make informed decisions.
22. A _____ is a characteristic of interest that can vary among the subjects in a population.
23. In a _____ scale, data can be ranked, but the differences between values are not meaningful.
24. The _____ scale of measurement has all the properties of the other scales and a true zero point.
25. A _____ displays data using a series of connected data points.
26. The _____ of data refers to the spread or variability of the data values.
27. An example of a _____ variable is the color of a car.
28. _____ data consist of categories that have a natural order but no consistent difference between them.
29. A _____ is a graphical representation of data that uses lines to connect data points.
30. The _____ of data measures the tendency of data values to cluster around a central point.
31. A _____ is a table that shows the frequency of each category in a dataset.
32. When two variables are _____, the presence or value of one does not affect the other.
33. A _____ variable is one that can take any value within a certain range.
34. The _____ of data refers to the middle value in a set of ordered data.
35. An example of a _____ variable is the number of people in a household.
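Worked sketch for the item on Sturges' formula: assuming the commonly quoted form k = 1 + 3.322 log10(N), the number of classes can be estimated as below. The function name and the choice to round to the nearest whole number are assumptions for illustration; some texts round up instead.

```python
import math


def sturges_classes(n_observations: int) -> int:
    """Suggested number of classes for a frequency distribution (Sturges' rule)."""
    if n_observations < 1:
        raise ValueError("need at least one observation")
    k = 1 + 3.322 * math.log10(n_observations)
    # Assumption: round to the nearest whole class; conventions differ.
    return round(k)


if __name__ == "__main__":
    for n in (50, 100, 1000):
        print(n, sturges_classes(n))
```

The result is only a starting point; the final number of classes usually also depends on the range of the data and a convenient class width.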
MULTIPLE CHOICES
1. Define statistics.
2. Define primary and secondary data.
3. Give the advantages of tabulation.
4. Write a detailed note on the types of classification.
5. What are the essential characteristics of a good table?
6. What is the purpose of data presentation?
7. Explain the concept of a sample.
8. Write the limitations of Statistics.
9. State the difference between qualitative and quantitative data.
10. Differentiate between qualitative and quantitative data.
11. What does the term "scale of measurement" refer to?
12. Define consistency of data.
13. What is the central tendency of data?
14. Explain the difference between mean and median.
15. Define the term "attribute" in statistics.
16. Why is independence of data important?
17. What does statistics aim to study?
18. Give an example of qualitative data.
19. Define the term "variable."
20. Differentiate between nominal and ordinal scales.
21. What is the purpose of graphical data representation?
22. Define the concept of central tendency.
23. Explain the difference between consistency and independence of data.
24. What does a histogram display?
25. State the significance of a sample in statistics.
26. Define the term "population" in statistical terms.
27. Explain the difference between a census and a sample survey.
28. Describe the concept of a discrete variable.
29. How does a pie chart represent data?
30. Define the term "skewness" in statistics.
31. Differentiate between a frequency polygon and a histogram.
32. What is the purpose of using a line graph?
33. Explain the concept of "range" in data analysis.
34. Define the coefficient of variation.
35. Describe the key features of a bar chart.
36. What is a quartile in statistics?
37. Differentiate between a stem-and-leaf plot and a box-and-whisker plot.
38. Explain the concept of positive correlation.
39. Define the term "margin of error" in relation to surveys.
40. Describe the concept of "bivariate data."
41. What is the significance of the interquartile range?
42. Differentiate between a line graph and a scatter plot.
43. Explain the concept of a representative sample.
44. Define "cumulative frequency" in statistics.
45. Describe the purpose of a Pareto chart.
2.5-3.5 4
3.5-4.5 6
4.5-5.5 10
5.5-6.5 26
6.5-7.5 24
7.5-8.5 15
8.5-9.5 10
9.5-10.5 5
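The frequency table above appears without its question text. As a hedged sketch, assuming the first column holds class intervals and the second the class frequencies, the mean and median of such grouped data could be computed as follows. The function names are illustrative, and the median uses the usual interpolation within the median class.

```python
# (lower limit, upper limit, frequency) copied from the table above.
classes = [
    (2.5, 3.5, 4), (3.5, 4.5, 6), (4.5, 5.5, 10), (5.5, 6.5, 26),
    (6.5, 7.5, 24), (7.5, 8.5, 15), (8.5, 9.5, 10), (9.5, 10.5, 5),
]


def grouped_mean(rows):
    """Mean of grouped data, using class mid-points as representative values."""
    total = sum(f for _, _, f in rows)
    return sum(((lo + hi) / 2) * f for lo, hi, f in rows) / total


def grouped_median(rows):
    """Median of grouped data: l + ((N/2 - cf) / f) * h."""
    total = sum(f for _, _, f in rows)
    half = total / 2
    cumulative = 0  # cumulative frequency below the current class
    for lo, hi, f in rows:
        if cumulative + f >= half:
            return lo + (half - cumulative) / f * (hi - lo)
        cumulative += f
    raise ValueError("empty frequency table")


if __name__ == "__main__":
    print("mean  :", round(grouped_mean(classes), 4))
    print("median:", round(grouped_median(classes), 4))
```

The running total kept in `cumulative` is the less-than cumulative frequency, which is the same quantity plotted on a less-than ogive.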
MEASURES OF CENTRAL TENDENCY AND DISPERSION
1. Mean is a measure of:
a. Central value
b. Dispersion
c. Correlation
d. None of the above
2. Which of the following represents the median?
a. First quartile
b. Fiftieth percentile
c. Sixth decile
d. None of the above
3. If each observation of a set is divided by 2, then the mean of the new values:
a. Is two times the original mean
b. Is decreased by 2
c. Is half of the original mean
d. Remains the same
4. The harmonic mean is better than other means if the data are for:
a. Speed or rate
b. Heights or lengths
c. Binary values like 0 and 1
d. None of the above
5. Extreme values have no effect on:
a. Average
b. Median
c. Geometric mean
d. Harmonic mean
6. The correct relationship between A.M., G.M., and H.M. is:
a. A.M. = G.M. = H.M.
b. G.M. ≥ A.M. ≥ H.M.
c. A.M. ≥ G.M. ≥ H.M.
d. None of the above
7. What percentage of values is greater than the 3rd quartile?
a. 75 percent
b. 50 percent
c. 25 percent
d. 0 percent
8. A frequency distribution having two modes is said to be:
a. Unimodal
b. Bimodal
c. Trimodal
d. Without mode
9. The average of the first n natural numbers is:
a. n(n+1)/2
b. (n+1)/2
c. n²(n+1)/2
d. None of the above
10. For deciles, the total number of partition values is:
a. 5
b. 8
c. 9
d. 10
11. Which of the following is not a measure of dispersion?
a. Mean deviation
b. Quartile deviation
c. Standard deviation
d. Average deviation from mean
13. The correct formula for the mean deviation from a constant A of a series in which the
variate values $x_1, x_2, x_3, \ldots, x_k$ have frequencies $f_1, f_2, \ldots, f_k$ respectively is:
a. $\frac{1}{N} \sum (f_i x_i - A)$
b. $\frac{1}{N} \sum f_i (x_i - A)$
c. $\frac{1}{N} \sum_i |f_i (x_i - A)|$
d. None of the above
15. Sum of squares of the deviations is minimum when deviations are taken from
a. Mean
b. Median
c. Mode
d. Zero
17. The average wages of workers of a factory are Rs. 550 per month and the SD of wages
is 110. The coefficient of variation is:
a. 30 percent
b. 15 percent
c. 500 percent
d. 20 percent
22. In the case of a positively skewed distribution, the relation between mean, median and
mode that holds is:
a. Median > mean > mode
b. Mean > median > mode
c. Mean = median = mode
d. None of the above
23. In the case of a positively skewed distribution, the extreme values lie in the:
a. Left tail
b. Right tail
c. Middle
d. Anywhere
25. All values in a sample are same. Then their variance is:
a. Zero
b. One
c. Not calculable
d. All the above
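A few of the multiple-choice items above turn on short calculations. The sketch below is illustrative only and uses Python's standard `statistics` module to check three of them: the coefficient of variation for the wage figures in question 17, the ordering of the arithmetic, geometric and harmonic means, and the variance of identical values. The list `values` is a made-up example, not data from this question bank.

```python
import statistics


def coefficient_of_variation(mean: float, sd: float) -> float:
    """CV as a percentage: (standard deviation / mean) * 100."""
    return sd * 100 / mean


# Question 17: mean wage Rs. 550, SD 110.
cv_wages = coefficient_of_variation(550, 110)

# Hypothetical positive values, chosen only to illustrate the ordering of means.
values = [2, 4, 8]
am = statistics.mean(values)
gm = statistics.geometric_mean(values)
hm = statistics.harmonic_mean(values)

# Identical observations have no spread at all.
variance_of_constants = statistics.pvariance([7, 7, 7])

if __name__ == "__main__":
    print("CV of wages (%):", cv_wages)
    print("AM, GM, HM:", am, gm, hm)
    print("variance of identical values:", variance_of_constants)
```

For any set of positive values the three means satisfy A.M. ≥ G.M. ≥ H.M., with equality only when all values are equal.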
Short Questions (each carries 2 Marks)
Marks 64 63 62 61 60 59
Number of students: 08 18 12 09 07 06
2. The following data pertain to the number of insects per plant. Find the
median number of insects per plant.
No. of plants(f) 2 3 5 6 10 13 9 5 3 2 2 1
6. What are the differences between absolute measure and relative measure of
dispersion?
Seed yield in gms (x): 2.5-3.5 3.5-4.5 4.5-5.5 5.5-6.5 6.5-7.5
No. of plants (f): 4 6 15 15 10
9. If the weights of sorghum ear heads are 45, 60, 48, 100, 65 gms, find the geometric mean.
10. Compute quartiles for the data given below (grains/panicle): 25, 18, 30, 8, 15, 5, 20, 40, 45.
11. For which type of data can the mode be calculated?
14. Explain the difference between the mean and the median. Provide examples to illustrate situations
where each measure is more appropriate in representing central tendency. Highlight the advantages and
limitations of using mean and median in data analysis.
15. Compare the range and the standard deviation as measures of dispersion. Describe how each measure
quantifies data spread. Discuss situations where one measure might be preferred over the other and
explain why. Provide a dataset example to demonstrate the calculation and interpretation of both range
and standard deviation.
16. Discuss the concept of skewness and kurtosis in probability distributions. Explain how skewness
indicates the asymmetry of a distribution, while kurtosis measures its tail behavior. Illustrate the
differences between positively and negatively skewed distributions, as well as leptokurtic and
platykurtic distributions, using graphical representations and practical scenarios.
17. Explain the significance of quartiles and percentiles in statistics. Define what quartiles and percentiles
represent and how they divide a dataset. Describe how they are used to identify data points at specific
positions within the dataset. Provide an example to illustrate the calculation and interpretation of
quartiles and percentiles.
18. Elaborate on the process of calculating the coefficient of variation (CV). Define the coefficient of
variation and explain how it relates to the standard deviation and mean. Discuss the advantages of
using CV to compare variability between datasets with different units of measurement. Provide a
numerical example to demonstrate the calculation and interpretation of CV.
19. Compare and contrast the absolute moments and factorial moments. Explain the concept of moments in
statistics and discuss how absolute moments and factorial moments are calculated. Highlight the
differences in their formulas and applications. Provide practical examples to illustrate the calculation
and interpretation of both types of moments.
20. Explain the difference between a scatter plot and a line graph. Detail the purposes and characteristics of
each type of graph. Provide examples of scenarios where one would be more suitable than the other
and explain the insights gained from each.
21. Compare the concepts of variance and standard deviation. Define both terms and describe their roles in
measuring data variability. Discuss how they are related mathematically and explain when it's
advantageous to use one over the other in different contexts.
22. Discuss the characteristics and applications of a histogram and a frequency polygon. Explain how each
type of graphical representation displays frequency distribution patterns. Provide examples of datasets
that would be effectively represented by each type and explain the benefits of using each in data
visualization.
23. Elaborate on the differences between skewness and kurtosis in probability distributions. Define both
skewness and kurtosis and explain their significance in describing distribution shapes. Provide
graphical representations of distributions with different skewness and kurtosis values to demonstrate
how these measures impact data visualization and analysis.
24. Compare the concepts of absolute moments and factorial moments in statistical analysis. Define both
terms and explain their significance in characterizing data distributions. Discuss how these moments
differ in terms of calculation methods and the insights they provide about the data.
25. Discuss the characteristics and applications of a cumulative frequency graph and an ogive graph.
Explain how each type of graph depicts cumulative frequency distributions. Provide real-world
examples to illustrate the use of these graphs and explain how they aid in understanding data trends.
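Two of the computational questions above (the geometric mean of the sorghum ear head weights and the quartiles of the grains per panicle data) can be sketched with Python's standard `statistics` module. Quartile conventions differ between textbooks; this sketch assumes the (n + 1)p position rule with linear interpolation, which is what `statistics.quantiles` uses by default, so a course that uses another rule may expect slightly different values.

```python
import statistics

# Question 9: weights of sorghum ear heads (gms).
ear_head_weights = [45, 60, 48, 100, 65]
geometric_mean = statistics.geometric_mean(ear_head_weights)

# Question 10: grains per panicle.
grains = [25, 18, 30, 8, 15, 5, 20, 40, 45]
# n=4 asks for the three cut points Q1, Q2, Q3; the data are sorted internally.
q1, q2, q3 = statistics.quantiles(grains, n=4, method="exclusive")

if __name__ == "__main__":
    print("geometric mean:", round(geometric_mean, 2))
    print("Q1, Q2 (median), Q3:", q1, q2, q3)
    print("quartile deviation:", (q3 - q1) / 2)
```

With nine observations, Q1 falls at position 2.5 and Q3 at position 7.5 of the ordered data, so both are midway between two neighbouring values.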
Correlation and Regression
2. The Spearman rank order correlation is used when the variables to be correlated
are measured on ______________ scale.
3. When an increase in one variable is associated with a decrease in the other variable, the
correlation between these variables is ______________.
4. Bivariate data involves the analysis of relationships between ______________ variables.
5. A scatter diagram visually represents the relationship between two ______________ variables.
6. Simple correlation measures the ______________ of the linear relationship between two variables.
7. Partial correlation measures the relationship between two variables while controlling for the effect
of ______________ variable(s).
8. Rank correlation, also known as ______________ correlation, assesses the relationship between
variables using their ranks.
9. Simple linear regression aims to find the best-fitting ______________ line for a set of data points.
10. The principle of ______________ involves minimizing the sum of squared differences between
observed and predicted values.
11. Polynomial regression involves fitting a curve of ______________ degree to the data points.
12. Exponential regression models data that exhibits ______________ growth or decay.
13. The process of finding a line or curve that best fits data points is often referred to as
______________.
14. Bivariate data involves the simultaneous analysis of two ______________ variables.
15. A scatter diagram consists of points that represent paired ______________ observations.
16. The coefficient of correlation ranges between -1 and ______________.
17. In simple correlation, the Pearson correlation coefficient measures ______________ relationships.
18. Rank correlation is used when the data isn't suitable for ______________ correlation.
19. In simple linear regression, the slope of the line represents the ______________ between variables.
20. The principle of least squares aims to minimize the ______________ of the residuals.
21. Polynomial regression uses ______________ to fit curves to data.
22. Exponential regression is used for data with an ______________ growth or decay pattern.
23. Fitting a curve to data involves finding the best ______________ representation.
24. Bivariate data analysis deals with the relationship between ______________ variables.
25. A scatter diagram displays the ______________ between two variables.
26. In simple correlation, the value of the Pearson correlation coefficient lies between
______________ and ______________.
27. If the correlation coefficient is close to -1, it indicates a ______________ relationship.
28. Rank correlation, such as the Spearman rank correlation coefficient, is used when data is
______________ or ______________.
29. Simple linear regression aims to find the best-fitting ______________ line.
30. The principle of least squares minimizes the sum of squared ______________.
31. Polynomial regression fits a curve to data using ______________ functions.
32. Exponential regression is suitable for data that follows an ______________ growth or decay
pattern.
33. Fitting a curve to data involves finding the optimal ______________ to represent the relationship.
34. The range of the correlation coefficient is ______________.
35. The correlation coefficient is independent of ______________.
36. Correlation can be calculated when the variables have ______________ unit.
37. If the correlation coefficient value is +1 then it indicates ______________.
38. The regression line is also called a ______________.
39. The slope of the regression line is represented by ______________.
40. In regression, the independent variable is also called ______________.
41. The geometric mean of the two regression coefficients is ______________.
42. If one regression coefficient is more than unity then the other is ______________.
Multiple choices
1. The term regression was introduced by:
a. R.A. Fisher
b. Sir Francis Galton
c. Karl Pearson
d. None of above
2. In a regression line Y on X , the variable X is known as:
a. Independent variable
b. Regressor
c. Explanatory variable
d. All the above
9. The formula for the simple correlation coefficient between the variables X and Y with
usual notations is:
a. $\mathrm{Cov}(X, Y) / \sqrt{V(X)\,V(Y)}$
b. $\mu_{xy} / \sqrt{\mu_{xx}\,\mu_{yy}}$
c. $\sigma_{xy} / (\sigma_x \sigma_y)$
d. All the above
10. The unit of the correlation coefficient is:
a. Kg/cc
b. Per cent
c. Non-existing
d. None of the above
11. The range of simple correlation coefficient is:
a. 0 to ∞
b. -∞ to ∞
c. 0 to 1
d. -1 to 1
12. If $\rho = 1$, the relation between the two variables X and Y is:
a. Y is proportional to X
b. Y is inversely proportional to X
c. Y is equal to X
d. None of the above
13. The geometric mean of the two regression coefficients $b_{yx}$ and $b_{xy}$ is equal to:
a. Correlation co-efficient
b. Co-efficient determination
c. Regression co-efficient
d. None of the above
14. Homogeneity of three or more population correlation coefficients can be tested by:
a. t- test
b. Z- test
c. Chi-square test
d. F- test
15. Regression coefficient is independent of the change of:
a. Scale
b. Origin
c. Both origin and scale
d. Neither origin nor scale
16. A positive significant correlation between the number of shoes produced and the steel
produced per year is:
a. A nonsense correlation
b. A spurious correlation
c. A meaningless correlation
d. All the above
17. If the correlation between the two variables X and Y is negative, the regression
coefficient of Y on X is:
a. Positive
b. Negative
c. Not certain
d. None of the above
Short Questions (each carries 2 Marks)
1. Given two variables X and Y: X (10, 15, 20) and Y (25, 30, 35), calculate the Pearson correlation
coefficient.
2. Calculate the rank correlation coefficient (Spearman's rho) for the dataset: X (15, 20, 25) and Y (30, 40,
50).
3. For a simple linear regression, if the slope is 0.5 and the intercept is 10, predict the value of Y when X is
8.
4. Fit a linear regression line to the data points: X (2, 4, 6, 8) and Y (12, 18, 22, 28), and find the equation
of the line.
5. Using the principle of least squares, find the equation of the quadratic polynomial that best fits the data:
X (1, 2, 3) and Y (3, 7, 12).
6. Fit an exponential curve to the data: X (1, 2, 3, 4) and Y (5, 10, 20, 40), and determine the equation of
the curve.
7. Explain the concept of bivariate data with an example from real life.
8. How does a scatter diagram help in understanding the relationship between two variables? Provide a
hypothetical scenario.
9. Describe the key differences between simple, partial, and multiple correlations with examples.
10. In a rank correlation analysis, what does a positive correlation coefficient indicate? Provide a brief
interpretation.
11. Compare and contrast Pearson correlation coefficient and Spearman's rank correlation coefficient.
12. A dataset has three variables: X, Y, and Z. Explain how you would calculate the partial correlation
coefficient between X and Y, controlling for Z.
13. Define simple linear regression. How is the regression line determined using the least squares method?
14. What is the principle of least squares in regression analysis? How does it help in finding the best-fitting
line?
15. Compare polynomial regression with linear regression. When might one be more suitable than the other?
16. Explain the exponential regression model. Give an example of a situation where exponential regression
is applicable.
17. A dataset has X (1, 2, 3) and Y (5, 8, 12). Calculate the slope and intercept of the linear regression line
using the least squares method.
18. Given a dataset X (2, 4, 6) and Y (16, 32, 54), fit an appropriate curve and explain your choice.
19. Mention the properties of the correlation coefficient.
20. Find correlation coefficient between plant height and number of pods.
X 15 20 17 22 25 29 12
Y 18 17 21 23 20 19 22
23. State the properties of the regression coefficient.
26. From a paddy field, 36 plants were selected at random. The length of panicles (x) and the
number of grains per panicle (y) of the selected plants were recorded. The results are given
below. Fit a regression line y on x.
S.No. Y X
1 95 22.4
2 109 23.3
3 133 24.1
4 132 24.3
5 136 23.5
6 116 22.3
7 126 23.9
8 124 24.0
9 137 23.9
10 90 20.0
From the following data, find the regression equation: ∑X = 21, ∑Y = 20, ∑X² = 91, ∑XY = 74, n = 10.
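Several of the numerical questions above reduce to the same two formulas: Pearson's correlation coefficient and the least squares line of Y on X. The sketch below is a plain-Python illustration under that assumption (regression of Y on X); the function names are hypothetical. It is applied to question 1, question 4 and the summary totals given in the last question.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: Sxy / sqrt(Sxx * Syy), using deviations from the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fit_line_from_sums(n, sum_x, sum_y, sum_x2, sum_xy):
    """Least squares line Y = a + bX from summary totals; returns (a, b)."""
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return intercept, slope


def fit_line(xs, ys):
    """Least squares line Y = a + bX from raw paired data; returns (a, b)."""
    return fit_line_from_sums(
        len(xs), sum(xs), sum(ys),
        sum(x * x for x in xs), sum(x * y for x, y in zip(xs, ys)),
    )


if __name__ == "__main__":
    print("r (question 1):", pearson_r([10, 15, 20], [25, 30, 35]))
    print("a, b (question 4):", fit_line([2, 4, 6, 8], [12, 18, 22, 28]))
    print("a, b (from totals):", fit_line_from_sums(10, 21, 20, 91, 74))
```

The totals in the last question do not include ∑Y², so only the line of Y on X can be obtained from them; the line of X on Y and the correlation coefficient would need that extra total.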
INDEX NUMBER
1. An index number is a statistical measure that quantifies changes in a ______________ over time.
2. Weighted index numbers assign different ______________ to different items based on their
importance.
3. Laspeyre's index is a weighted index using ______________ prices as weights.
4. Paasche's index is a weighted index using ______________ prices as weights.
5. Chain index numbers are used to account for changes in ______________ over time.
6. Index numbers quantify the ______________ of changes in a variable over time.
7. Weighted index numbers assign ______________ to different items based on their importance.
8. The Laspeyres index uses ______________ prices as weights.
9. The Paasche index uses ______________ prices as weights.
10. Chain index numbers adjust for changes in a ______________ set of items.
11. Conversion of fixed base to chain index numbers involves ______________ the index values.
12. Consumer price index numbers measure changes in the cost of ______________ goods and services.
13. Index numbers can help in comparing ______________ of different periods.
14. The Fisher’s Ideal Index seeks to eliminate the ______________ bias in index numbers.
15. One limitation of index numbers is the ______________ choice of base year.
16. Index numbers provide a way to measure ______________ in a set of related values.
17. Weighted index numbers assign different ______________ to different items based on importance.
18. Laspeyres' index uses ______________ period quantities as weights.
19. Paasche's index uses ______________ period quantities as weights.
20. Errors in index numbers arise due to ______________ or ______________ biases.
21. Conversion of fixed base to chain index involves updating the index using ______________ data.
22. Consumer price index numbers measure changes in the cost of ______________ over time.
23. Index numbers are useful for ______________ trends and making comparisons.
24. Fisher's Ideal Index aims to eliminate ______________ bias in index numbers.
25. A limitation of index numbers is that they might not account for ______________ changes.
a) Index number
b) Reliable
c) Scope
d) Special purpose
36. When the ratio of a sum of prices in the current period to the sum of prices in the base period is
expressed in the form of a percentage, it is known as :
a) Simple price index number
b) Simple aggregate price index number
c) Quantity index number
d) The weighted aggregative price index
37. To measure the relative change in purchasing a specific basket of goods and services between two
periods of a certain locality with a group of people with fixed incomes, which of the below can be used :
a) Consumer price index
b) Paasche's price index
c) Cost of living index
d) Both (A) and (C)
38. Commodities that show a considerable price fluctuation can be measured by _________ index?
a) Value
b) Price
c) Quantity
d) None
39. Which of the following are the limitations of using an index number?
a) It is only useful for short term comparison
b) It ignores the quantity of the commodity
c) The use of each of the indexes is restricted for a specific purpose
d) All of the above
40. Index number for base year is always considered as------
a. 100
b. 101
c. 201
d. 1000
41. Index number is a special type of ----------
a. Average
b. dispersion
c. correlation
d. None of the above
42. Index number is always expressed in ----------
a. Percentage
b. ratio
c. proportion
d. None of the above
43. Index number is also called as-----------
a. Economic barometer
b. Parameter
c. Constant
d. None of the above
44. Which index number is called the ideal index number?
a. Laspeyres
b. Paasche
c. Fisher
d. None of the above
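The Laspeyres, Paasche and Fisher indices named in the items above differ only in the quantities used as weights. The sketch below illustrates the standard aggregate formulas on a hypothetical two-commodity example; the prices and quantities are made up for illustration and do not come from this question bank.

```python
import math


def laspeyres(p0, p1, q0):
    """Laspeyres price index: base-period quantities as weights."""
    return sum(p * q for p, q in zip(p1, q0)) / sum(p * q for p, q in zip(p0, q0)) * 100


def paasche(p0, p1, q1):
    """Paasche price index: current-period quantities as weights."""
    return sum(p * q for p, q in zip(p1, q1)) / sum(p * q for p, q in zip(p0, q1)) * 100


def fisher(p0, p1, q0, q1):
    """Fisher's ideal index: geometric mean of Laspeyres and Paasche."""
    return math.sqrt(laspeyres(p0, p1, q0) * paasche(p0, p1, q1))


# Hypothetical data: base and current prices and quantities for two commodities.
base_prices, current_prices = [10, 20], [12, 25]
base_quantities, current_quantities = [5, 2], [4, 3]

if __name__ == "__main__":
    print("Laspeyres:", round(laspeyres(base_prices, current_prices, base_quantities), 2))
    print("Paasche  :", round(paasche(base_prices, current_prices, current_quantities), 2))
    print("Fisher   :", round(fisher(base_prices, current_prices,
                                     base_quantities, current_quantities), 2))
```

Because Fisher's index is a geometric mean of the other two, it always lies between them.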
1. Calculate cost of living index number using Family Budget method from the following data.
House Rent 5 50 150
Clothing 2 30 60
Fuel 3 30 75
Others 5 50 75
2. Explain briefly the steps in the construction of consumer price index number.
A 50 100 60 180
B 40 120 40 200
D 20 80 25 100
B 320 690 20 60
C 720 1600 10 10
D 720 2100 10 20
6. Determine the price index number from the following data using weighted arithmetic mean of price
relatives
Commodity Unit Weight Price per unit
A Quintal 14 90 120
B Kilogram 20 10 17
C Dozen 35 40 60
D Litre 15 50 95
7. Provide examples of situations where index numbers are used beyond economic contexts.
Determine the price index number from the following data using weighted
arithmetic mean of price relatives.
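The weighted arithmetic mean of price relatives can be sketched with the four commodities listed in question 6. The table there does not label its two price columns, so this sketch assumes the first is the base-year price and the second the current-year price; if the columns are the other way round, the two prices in each row should be swapped. The function name is illustrative.

```python
def weighted_price_relative_index(rows):
    """Index = sum(W * P) / sum(W), where P = (current price / base price) * 100.

    rows: iterable of (weight, base_price, current_price).
    """
    total_weight = sum(w for w, _, _ in rows)
    weighted_sum = sum(w * (p1 / p0 * 100) for w, p0, p1 in rows)
    return weighted_sum / total_weight


# (weight, base price, current price) for commodities A to D in question 6.
# Assumption: the first price column is the base year, the second the current year.
commodities = [
    (14, 90, 120),   # A, per quintal
    (20, 10, 17),    # B, per kilogram
    (35, 40, 60),    # C, per dozen
    (15, 50, 95),    # D, per litre
]

if __name__ == "__main__":
    print("price index:", round(weighted_price_relative_index(commodities), 2))
```

The Family Budget method for a cost of living index uses the same formula, with the weights taken from the share of each item in the family budget.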