Business Statistics Assignment 2 & 3
Rini Darathy
Business Statistics: Assignment 2 & 3
Assignment 2:
Q. Write a short note (250-300 words each) on the following, with suitable
business examples. (20 Marks)
1. Mean
2. Median
3. Standard Deviation
4. Skewness
Answer: Descriptive statistics are the basic tools we use to summarise a sample
and to describe its main characteristics, such as its centre, spread and shape.
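The mean is the average of a data set and is the most widely used measure of
central tendency. It is calculated by adding all the values and dividing the
sum by the number of values.
Population Mean: μ = Σxi / N
Sample Mean: x̄ = Σxi / n
The sample mean is represented as x-bar (x̄).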
The median is also a measure of central tendency. It is the middle value of a
data set once the values have been sorted in ascending or descending order.
Because it is not pulled by extreme values, it often describes a typical value
better than the mean does. The first step in finding the median is to arrange
the values from lowest to highest.
If the data set has an odd number of values, the median is the value at the
((n+1)/2)th position, that is, the middle value of the sorted data set. If the
data set has an even number of values, the median is the average of the values
at the (n/2)th and (n/2 + 1)th positions, that is, the two middle values are
added together and the sum is divided by two.
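The median gives a fairer picture than the mean when the data set contains
extreme values, because a few very large or very small values pull the mean
towards them but leave the median almost unchanged.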
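The position rules above can be sketched in a few lines of Python. The salary figures and the helper name median_by_position are made up for illustration only; the last salary is deliberately an outlier, to show how the mean and the median react differently.

```python
from statistics import mean, median

# Hypothetical monthly salaries (in thousands) for a six-person team.
# The last value is an assumed outlier, e.g. a senior manager.
salaries = [30, 32, 35, 38, 40, 250]


def median_by_position(values):
    """Median found by sorting and picking the middle position(s)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n+1)/2)th value (1-based) is the middle one.
        return ordered[mid]
    # Even n: average of the (n/2)th and (n/2 + 1)th values (1-based).
    return (ordered[mid - 1] + ordered[mid]) / 2


avg = mean(salaries)
med = median_by_position(salaries)
print(f"mean = {avg:.2f}, median = {med}")
```

Here the single large salary pulls the mean far above what most of the team earns, while the median stays close to the typical salary.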
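The standard deviation measures how far the data points are spread around the
mean. For a population it is calculated as:
σ = √( Σ(xi - μ)² / N )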
Normal Distribution:
In a normal distribution the values are distributed symmetrically around the
mean. Its density curve is called the bell curve or the normal curve. The
central value of the curve is the mean, denoted μ (mu). Most of the data points
lie close to the mean and fewer lie far away from it.
Empirical rule:
The empirical rule states that, for a normal distribution, nearly all of the
data fall within three standard deviations of the mean: about 68% of the data
fall within one standard deviation, about 95% within two standard deviations,
and about 99.7% within three standard deviations.
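As a quick check of these percentages, the share of a normal distribution lying within k standard deviations of the mean can be computed from its cumulative distribution function. The sketch below uses NormalDist from Python's standard statistics module; the helper name share_within is an illustrative assumption.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Share of the data lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


shares = {k: share_within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.2%}")
```

The three shares come out at roughly 68.3%, 95.4% and 99.7%, which is where the rounded figures of the empirical rule come from.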
As a training team lead, suppose we ask the participants to rate different
trainers from 1 to 5, so that each trainer knows how the participants responded
and whether the ratings are average, below average or above average. With the
sample data, the first step is to calculate the mean by adding all the ratings
and dividing the sum by the number of data points. The second step is to
subtract the mean from each data point. The third step is to square each of
these differences, add up the squares and divide the sum by the number of data
points minus 1 (n - 1, because the ratings are a sample and not the whole
population). The final step is to take the square root of this result, which
gives the sample standard deviation.
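The four steps can be followed one by one in Python. The ratings below are invented for illustration; they are not data from an actual training session. The result is compared with stdev from the standard statistics module, which also divides by n - 1.

```python
from math import sqrt
from statistics import stdev

# Hypothetical ratings (1 to 5) given to one trainer by eight participants.
ratings = [4, 5, 3, 4, 5, 2, 4, 5]

# Step 1: mean = sum of the ratings / number of ratings.
n = len(ratings)
mean_rating = sum(ratings) / n

# Step 2: subtract the mean from each rating.
deviations = [r - mean_rating for r in ratings]

# Step 3: square the deviations, sum them, divide by n - 1 (sample variance).
sample_variance = sum(d ** 2 for d in deviations) / (n - 1)

# Step 4: the square root of the variance is the sample standard deviation.
sample_sd = sqrt(sample_variance)

print(f"mean = {mean_rating}, sample standard deviation = {sample_sd:.3f}")
print(f"statistics.stdev gives {stdev(ratings):.3f}")
```

A small standard deviation would mean that the participants largely agree in their rating of the trainer, while a large one would mean that opinions are divided.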
Skewness is a measure of the asymmetry of a data set relative to the symmetrical
bell curve of the normal distribution. A distribution is said to be skewed when
the data points cluster more towards one side of the scale than the other.
Skewness can be seen from a list of the data or from a graph. There are two
types of skewness:
Positive Skewness:
When the frequency curve of the distribution has a long tail on the right side,
the distribution is said to have positive skewness.
Negative Skewness:
When the frequency curve of the distribution has a long tail on the left side,
the distribution is said to have negative skewness.
Skewness = n Σ(xi - x̄)³ / ( (n-1)(n-2) S³ )
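The sample skewness formula can be tried out on a small data set. The sketch below assumes that S is the sample standard deviation (calculated with n - 1); the daily order counts and the function name sample_skewness are hypothetical.

```python
from statistics import mean, stdev


def sample_skewness(values):
    """Sample skewness: n * sum((x - x_bar)^3) / ((n-1) * (n-2) * S^3)."""
    n = len(values)
    x_bar = mean(values)
    s = stdev(values)  # sample standard deviation, divides by n - 1
    cubed_sum = sum((x - x_bar) ** 3 for x in values)
    return n * cubed_sum / ((n - 1) * (n - 2) * s ** 3)


# Hypothetical daily order counts: mostly small, with one very busy day.
orders = [10, 12, 11, 13, 12, 50]
# The same data mirrored, so the long tail is on the left.
mirrored = [-x for x in orders]

print(f"right-tailed data: {sample_skewness(orders):.3f}")
print(f"left-tailed data:  {sample_skewness(mirrored):.3f}")
```

The busy day creates a long right tail, so the first result is positive; mirroring the data moves the tail to the left and flips the sign.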
Assignment 3:
Q. "Correlation does not necessarily mean causation." Do you agree or disagree
with this statement? In either case, support your answer with two different
business examples. (20 Marks)
Causation refers to a relationship between two events in which one event is
the result of the other. In other words, the value of one variable increases or
decreases because of a change in the other variable. Causation can be
established with confidence only when a controlled experiment shows that the
occurrence of one variable is affected by the other variable.
There are several reasons why two variables X and Y can be correlated without X
causing Y. The first reason is that there may be some third variable (Z) that
affects both X and Y at the same time, moving X and Y together. The second
reason is reverse causality: X and Y move together not because X causes Y, but
because Y causes X. The third reason is sample selection: the sample we observe
is not representative of the population we are interested in. The fourth reason
is measurement error: the results we are interested in are difficult to measure
and can only be fully