5.3 Quartiles (Skewness)


1|Page RFI

Quartiles divide the ordered data into four equal parts.

Q1 is the 25% point, Q2 (the median) is the 50% point, and Q3 is the 75% point.

Example 1: Prices (tk.): 40, 21, 10, 14, 30, 32, 12, 25, 34, 22

Find all the quartiles, the median, the inter-quartile range (IQR),
and the quartile deviation (QD), and comment.

Example 2: Weights of students (kg):
55, 55, 65, 45, 43, 70, 50, 47, 65, 60, 54, 52, 68

Find all the quartiles, the median, the inter-quartile range, and the
quartile deviation, and comment on each.

Ex #1: 14, 17, 19, 23, 27, 32, 40, 49, 54, 59, 71, 80
Find all the quartiles and the quartile deviation.
Sol:

n = 12

For Q1: n/4 = 12/4 = 3 is an integer

So, Q1 will be the average of the 3rd and 4th values = (19 + 23)/2 = 21

For Q2: 2n/4 = 24/4 = 6 is an integer

So, Q2 will be the average of the 6th and 7th values = (32 + 40)/2 = 36

For Q3: 3n/4 = 36/4 = 9 is an integer

So, Q3 will be the average of the 9th and 10th values = (54 + 59)/2 = 56.5

Q1 = 21; Q2 = 36; Q3 = 56.5

Ex #2: 14, 17, 19, 23, 27, 32, 40, 49, 54, 59, 71, 80, 81
Sol:
n = 13

For Q1: n/4 = 13/4 = 3.25 is not an integer

So, rounding up, Q1 will be the 4th value = 23

For Q2: 2n/4 = 26/4 = 6.5 is not an integer

So, rounding up, Q2 will be the 7th value = 40

For Q3: 3n/4 = 39/4 = 9.75 is not an integer

So, rounding up, Q3 will be the 10th value = 59

Q1 = 23; Q2 = 40; Q3 = 59

Ex #3: 14, 17, 19, 23, 27, 32, 40, 49, 54, 59, 71, 80, 81, 81
Sol:
n = 14

For Q1: n/4 = 14/4 = 3.5 is not an integer

So, rounding up, Q1 will be the 4th value = 23

For Q2: 2n/4 = 28/4 = 7 is an integer

So, Q2 will be the average of the 7th and 8th values = (40 + 49)/2 = 44.5

For Q3: 3n/4 = 42/4 = 10.5 is not an integer

So, rounding up, Q3 will be the 11th value = 71

Q1 = 23; Q2 = 44.5; Q3 = 71
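
The position rule used in Ex #1 to Ex #3 can be sketched in Python as follows. This is only an illustration of the rule as it appears in these solutions (position i*n/4; if it is an integer, average that value and the next one; otherwise round up). The function name `quartile` is made up for this sketch, and library routines often use other conventions, so they may return slightly different values.

```python
import math


def quartile(values, i):
    """Return the i-th quartile (i = 1, 2, 3) using the position rule i*n/4."""
    data = sorted(values)
    n = len(data)
    if (i * n) % 4 == 0:
        # Integer position k: average the k-th and (k+1)-th values (1-based).
        k = (i * n) // 4
        return (data[k - 1] + data[k]) / 2
    # Non-integer position: round up and take that value (1-based).
    k = math.ceil(i * n / 4)
    return data[k - 1]


ex1 = [14, 17, 19, 23, 27, 32, 40, 49, 54, 59, 71, 80]
ex2 = ex1 + [81]
ex3 = ex2 + [81]

for name, data in [("Ex #1", ex1), ("Ex #2", ex2), ("Ex #3", ex3)]:
    print(name, [quartile(data, i) for i in (1, 2, 3)])
```

Run on the three datasets above, the sketch reproduces the hand-computed quartiles (21, 36, 56.5), (23, 40, 59) and (23, 44.5, 71).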

Inter-quartile range (IQR) = Q3 - Q1 = 71 - 23 = 48

Quartile deviation (QD) = (Q3 - Q1)/2 = (71 - 23)/2 = 24
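
As a quick check of the arithmetic, here is a minimal Python sketch that computes both measures from Q1 = 23 and Q3 = 71, the values from Ex #3 that give the quartile deviation of 24. The helper name `iqr_and_qd` is an assumption made for this illustration.

```python
def iqr_and_qd(q1, q3):
    """Return (inter-quartile range, quartile deviation) for given Q1 and Q3."""
    iqr = q3 - q1   # spread of the middle 50% of the data
    qd = iqr / 2    # quartile deviation is half the IQR
    return iqr, qd


iqr, qd = iqr_and_qd(23, 71)
print(iqr, qd)  # 48 24.0
```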

Skewness

Shape of the data (distribution)


[Note: All three curves below are frequency polygons drawn from imaginary
statistical data.]

[Figure: frequency polygon of Age with the mean, median and mode marked;
mean > median > mode]

- Positively skewed (right-tailed): most of the data (ages) take lower
values.

[Figure: frequency polygon of Age with the mean, median and mode marked;
mode > median > mean]

- Negatively skewed (left-tailed): most of the data (ages) take higher
values.


[Figure: symmetric frequency polygon of Age, 50% of the area on each side of
the centre; mode = median = mean]

- Symmetric: 50% of the data lie below the average and 50% above it.


Qualifying the data as positively skewed or negatively skewed can in some
cases reveal if mean > median or median > mean, respectively. However, these
are simplistic observations, and must not be strictly applied.

Measuring skewness

#1. Karl Pearson's coefficient of skewness

Skewness, SKP = (mean - mode) / sd

or, using the statistical law Mode = 3*median - 2*mean,

SKP = 3*(mean - median) / sd

If SKP > 0, it is positively skewed

If SKP < 0, it is negatively skewed

If SKP = 0, it is symmetric

Example: From the income data of employees, it is found that the average
income is 5000 tk, the modal income is 5050 tk, the median income is 5010 tk,
and the variance of income is 10000 tk². Find Karl Pearson's coefficient of
skewness. What will be the shape (skewness) of income? Comment with a graph.
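
A minimal Python sketch of this calculation is given below. It assumes the usual forms of Karl Pearson's coefficient, (mean - mode)/sd and 3*(mean - median)/sd, with sd taken as the square root of the given variance. The function names are made up for illustration.

```python
import math


def pearson_skew_mode(mean, mode, sd):
    """Karl Pearson's coefficient using the mode: (mean - mode) / sd."""
    return (mean - mode) / sd


def pearson_skew_median(mean, median, sd):
    """Karl Pearson's coefficient using the median: 3 * (mean - median) / sd."""
    return 3 * (mean - median) / sd


sd = math.sqrt(10000)  # variance 10000 tk^2 gives sd = 100 tk
skp_mode = pearson_skew_mode(5000, 5050, sd)
skp_median = pearson_skew_median(5000, 5010, sd)
print(skp_mode, skp_median)
```

With these figures the mode-based form gives -0.5 and the median-based form gives -0.3. The two differ because the relation Mode = 3*median - 2*mean is only approximate for this data, but both are negative, so income is negatively skewed (left-tailed).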

#2. Bowley's coefficient of skewness

[This method is preferable for ungrouped data.]

SKB = (Q3 + Q1 - 2*Q2) / (Q3 - Q1)

If SKB > 0, it is positively skewed

If SKB < 0, it is negatively skewed

If SKB = 0, it is symmetric

**Note: there is another way to measure skewness.

Say, µ3 = Σ(x - mean)³ / n
Then skewness = µ3 / σ³
where σ = sd

[This method is preferable for grouped data.]
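
The moment-based measure in this note can be sketched in Python as below. The sketch assumes raw (ungrouped) values and the population standard deviation (dividing by n, not n - 1). For grouped data each term would be weighted by its class frequency. The function name `moment_skewness` is an assumption for illustration.

```python
def moment_skewness(values):
    """Return mu3 / sigma**3, where mu3 is the third central moment."""
    n = len(values)
    mean = sum(values) / n
    mu2 = sum((x - mean) ** 2 for x in values) / n  # variance (divide by n)
    mu3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    sigma = mu2 ** 0.5
    return mu3 / sigma ** 3


data = [14, 17, 19, 23, 27, 32, 40, 49, 54, 59, 71, 80, 81, 81]
print(moment_skewness(data))       # positive for this dataset
print(moment_skewness([1, 2, 3]))  # 0.0 for symmetric data
```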

Ex #4: 14, 17, 19, 23, 27, 32, 40, 49, 54, 59, 71, 80, 81, 81
Sol:
n = 14

For Q1: n/4 = 14/4 = 3.5 is not an integer

So, rounding up, Q1 will be the 4th value = 23

For Q2: 2n/4 = 28/4 = 7 is an integer

So, Q2 will be the average of the 7th and 8th values = (40 + 49)/2 = 44.5

For Q3: 3n/4 = 42/4 = 10.5 is not an integer

So, rounding up, Q3 will be the 11th value = 71

SKB = (Q3 + Q1 - 2*Q2) / (Q3 - Q1) = (71 + 23 - 2*44.5) / (71 - 23) = 5/48 ≈ 0.10
It is positively skewed
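
The Bowley calculation for this example can be checked with a short Python sketch. It assumes the standard formula (Q3 + Q1 - 2*Q2)/(Q3 - Q1), and the function name `bowley_skew` is hypothetical.

```python
def bowley_skew(q1, q2, q3):
    """Bowley's coefficient of skewness from the three quartiles."""
    return (q3 + q1 - 2 * q2) / (q3 - q1)


skb = bowley_skew(23, 44.5, 71)
print(round(skb, 3))  # 0.104
```

The value is 5/48, a small positive number, which agrees with the conclusion that the data are positively skewed.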

**
It is a popular (mis)conception that the sign or direction of the skew for a
numerical dataset dictates the location of the mean with respect to the
median. The idea is that the mean hangs out in the tail region.

Specifically, if a dataset is unimodal, then we are told to have the following
"expectations":

- Positively skewed? → Mean > Median

- Negatively skewed? → Mean < Median

Here is a link to an article that corrects this misconception.
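
To see that these "expectations" can fail, here is a small Python sketch on a made-up dataset (not taken from the article). The data have a single peak at 0 and a positive moment-based skewness (µ3/σ³ with the population sd), yet the mean lies below the median.

```python
import statistics

# Made-up counts: eight -1s, nine 0s, two 1s, one 2, one 3 (single peak at 0).
data = [-1] * 8 + [0] * 9 + [1] * 2 + [2] + [3]

n = len(data)
mean = sum(data) / n
median = statistics.median(data)
mu2 = sum((x - mean) ** 2 for x in data) / n
mu3 = sum((x - mean) ** 3 for x in data) / n
skewness = mu3 / mu2 ** 1.5  # mu3 / sigma**3

print(mean, median, skewness)
```

Here the mean is -1/21 (about -0.048), the median is 0, and the skewness is positive, so the rule "positively skewed implies mean > median" does not hold for this dataset.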

#3. Box-and-Whisker Plot (Box Plot), EDA

Ex #5: 14, 17, 19, 23, 27, 32, 40, 49, 54, 59, 71, 80, 81, 81
Sol:
n = 14

For Q1: n/4 = 14/4 = 3.5 is not an integer

So, rounding up, Q1 will be the 4th value = 23

For Q2: 2n/4 = 28/4 = 7 is an integer

So, Q2 will be the average of the 7th and 8th values = (40 + 49)/2 = 44.5

For Q3: 3n/4 = 42/4 = 10.5 is not an integer

So, rounding up, Q3 will be the 11th value = 71

Q1 = 23; Q2 = 44.5; Q3 = 71

[Figure: box plot on an axis from 10 to 90, marking the lowest value (14),
Q1, Q2, Q3 and the highest value (81)]

Since Q2 is nearer to Q1 than to Q3 (Q2 - Q1 = 21.5, Q3 - Q2 = 26.5), it is
positively skewed.

** The dotted lines to the left and to the right of the box are called
whiskers. What these whiskers extend to is a matter of choice. There are some
situations in statistics where we must make a call and stand by it. This is
one of them. The American mathematician John Tukey came up with this version
of a boxplot. The whiskers stop either at the extreme values, or at a fixed
distance of 1.5 IQRs (inter-quartile ranges) from the box, whichever comes
first. Points that lie beyond the 1.5 IQR mark are one way to qualify what
are known as outliers.
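
The whisker rule described here can be sketched in Python as follows. This is a minimal illustration assuming the common convention that each whisker ends at the most extreme data value still within 1.5 IQRs of the box. The function name `tukey_whiskers` and the parameter `k` are made up for this sketch.

```python
def tukey_whiskers(values, q1, q3, k=1.5):
    """Return (low whisker end, high whisker end, outliers) for a box plot."""
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    # Whiskers end at the most extreme values that are still inside the fences.
    inside = [x for x in values if lower_fence <= x <= upper_fence]
    # Anything beyond a fence is flagged as an outlier.
    outliers = [x for x in values if x < lower_fence or x > upper_fence]
    return min(inside), max(inside), outliers


data = [14, 17, 19, 23, 27, 32, 40, 49, 54, 59, 71, 80, 81, 81]
low, high, outliers = tukey_whiskers(data, 23, 71)
print(low, high, outliers)  # 14 81 []
```

For the Ex #5 data the fences are at 23 - 72 = -49 and 71 + 72 = 143, so the whiskers reach the extreme values 14 and 81 and no point is flagged as an outlier.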

Exercise

The 2015 batch of an Executive MBA program has 100 students. Their GPAs
after the first quarter are captured by the box plot below. You are also
supplied the dataset on the GPAs, which we ask you to download.

Now, answer the following questions.

MEASURES FOR THE DATA
(7 points possible)

What is the approximate skew of the GPAs?
Options: 0.9; -0.9; 0.15; -0.15

What approximately is the value at the left edge of the box?
Options: 3.1; 3.4; 3.3; 3.2

What is the value at the right edge of the box?
Options: 3.35; 3.45; 3.5; 3.52

What is the value at the vertical line within the box?
Options: 3.1; 3.3; 3.28; 3.25

What is the approximate width of the box?
Options: 0.255; 0.325; 0.38; 0.5

Below what value on the left do we have outliers?
Options: 2.5; 2.0; 2.43; 2.47

OUTLIERS
(1 point possible)

How many outliers exist beyond the left whisker?
Options: 2; 5; 4; 6

FINAL

Ramya was feeling very energetic as she drove to her office that late
November morning. Her Diwali festival break had expanded into a whole
week, thanks to a few days of casual leave. As she stepped into the building
that housed her office, she was greeted by the concierge: “Hey Ramya,
looks like the holidays have rejuvenated you.” Ramya replied with a broad
smile, “Thanks Roshan, did you burst any crackers?” "Yup! But no noisy
ones," he replied.

As she walked into her office, Ramya realised that something else was
making her feel pumped up. She held an executive position at
Souravi Fashions (SF), a company that manufactured and sold niche
garments to women. The company owned 13 stores in 5 metros across
India. Being a small store chain, their investment in IT infrastructure was
bare-bones, consisting of a desktop machine at each location, which was
connected to the Internet. Each store had a well-trained cashier, who
doubled up as a data entry specialist.

Business at SF had been quite good for the last couple of years. However,
competition from online retailers was fast becoming a threat. In a long and
arduous meeting held just before the extended weekend, the company had
decided to open more stores, thereby increasing access to its loyal customer
base. Ramya was entrusted with the task of finding potentially profitable
localities to open these stores.

“This meeting could not have been timed better”, she thought. “Diwali is the
one season when retail business is booming all across India.” Later that day,
she sat in her office and sent out emails to all the stores. Being tech-savvy,
Ramya created a spreadsheet on the cloud, and instructed the store
managers to enter the details of every transaction that had taken place
during the festival week. Specifically, she instructed them to meticulously
record all transactions for the period between 16 November and 22
November. Diwali had fallen on 22nd November that year.

Settling into her office after the holidays, Ramya wondered what additional
information might be helpful besides the transactions. The company had
carefully archived some data when it began its operations across the various
cities.  In these archives, there was a dataset with median incomes at all
localities where the existing stores operated. Another dataset contained the
list of declared household incomes (DHI) of customers registered in SF's
loyalty program.

Download this spreadsheet containing the details of the transactions during
the week leading to the festival day. Analyse the data in the sheet, and
answer the following questions.

STORE PERFORMANCE
(3 points possible)
You may use a pivot table to aggregate store-wise sales. Obtain the number
of garments sold, total sales and average price per garment sold in each
store. Keep in mind that the dataset is only for a brief festival period.

Then proceed to answer these questions.
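
Outside a spreadsheet, the same store-wise aggregation can be sketched in Python. The rows below are invented purely to show the mechanics and are not taken from the actual dataset. The tuple layout (store ID, units sold, amount) and the function name are assumptions about how the sheet might be represented.

```python
from collections import defaultdict

# Hypothetical rows: (store_id, units_sold, amount). Not the real transactions.
transactions = [
    ("BAN1", 2, 1800.0),
    ("BAN1", 1, 1200.0),
    ("HYD1", 3, 2100.0),
    ("HYD1", 1, 900.0),
    ("DEL1", 2, 3000.0),
]


def summarise_by_store(rows):
    """Aggregate units sold, total sales and average price per garment."""
    totals = defaultdict(lambda: [0, 0.0])  # store -> [units, sales]
    for store, units, amount in rows:
        totals[store][0] += units
        totals[store][1] += amount
    return {
        store: {"units": units, "sales": sales, "avg_price": sales / units}
        for store, (units, sales) in totals.items()
    }


summary = summarise_by_store(transactions)
for store, row in sorted(summary.items()):
    print(store, row)
```

Note that the average price per garment is total sales divided by total units, which is what a pivot table gives when the two sums are divided, rather than an average of per-transaction prices.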

Which of these stores sold the highest number of garments?
Options: BAN1; HYD1; DEL1; CHE1

At which store was the average price per garment sold the highest?
Options: BAN4; BAN8; KOL4; HYD8

Calculate the correlation between the number of garments sold and the
average price per garment. Which of these statements is the most accurate?

- The higher the volume of sales the higher is the average price per garment
- The higher the volume of sales the lower is the average price per garment
- Stores selling high priced garments are more likely to have a larger volume of sales
- Stores selling high priced garments are less likely to have a larger volume of sales

CORRELATION
(1 point possible)
Which of the following statements are correct?

- There is a positive correlation between the number of garments sold and the total sales amount
- There is a negative correlation between the number of garments sold and the total sales amount
- There is a positive correlation between the number of garments sold and the average price per garment
- There is a negative correlation between the number of garments sold and the average price per garment

BANGALORE
(4 points possible)
Construct a two-dimensional pivot table, with Store IDs along the rows and
Garment Types along the columns. Fill in the table with total revenues from
the sale of a specific type of garment from a specific store. You are ready!

Proceed to answer the following questions.

Within Bangalore, the revenue from sales of sportswear is the highest for
which of these stores?
Options: BAN1; BAN2; BAN3; BAN4

Within Bangalore, the revenue from sales of formal garments is the lowest
for which of these stores?
Options: BAN1; BAN2; BAN3; BAN4

For which of these stores in Bangalore is the revenue from sales of
partywear lower than from the sales of formal garments?
Options: BAN1; BAN2; BAN3; BAN4

Which of these stores is the worst performing in Bangalore, in terms of total
revenue from sales?
Options: BAN2; BAN4; BAN6; BAN8

COUNTS
(2 points possible)
The sales of how many units of garments are recorded within this dataset?

Across how many transactions were these sales made?

LOCAL INCOME TRENDS
(5 points possible)
For the following list of problems, assume that people prefer to shop at the
store nearest to them. In other words, everyone buys garments from the
store in their locality.

Answer the questions that follow, all of which concern correlations. (Please
be careful to give complete columns when entering the formula for
correlation in your spreadsheet.)
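
For readers working outside a spreadsheet, the correlation coefficient can be sketched in Python as below. The numbers are invented to show the mechanics and do not come from the dataset. The length check plays the same role as the advice about complete columns: both columns must cover the same stores.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length columns."""
    if len(xs) != len(ys):
        raise ValueError("both columns must have the same number of rows")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-store values (not the real data): income and average price.
income = [30, 40, 50, 60]
avg_price = [900, 1100, 1300, 1500]

print(pearson_r(income, avg_price))        # perfectly linear, so r = 1.0
print(pearson_r(income, avg_price[::-1]))  # reversed order, so r = -1.0
```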

What is the correlation of the average household income in the locality of the
store, with the minimum price of garment sold in the corresponding store?
Options: 1; 0.9; 0.09; -0.9

What is the correlation of the average household income in the locality of the
store, with the maximum price of garment sold in the corresponding store?
Options: 1; 0.9; 0.8; -1

What is the correlation of the average household income in the locality of the
store, with the total number of garments sold in the corresponding store?
Options: 1; 0.9; -0.37; -0.77

What is the correlation of the average household income in the locality of the
store, with the total revenue from sales in the corresponding store?
Options: 1; 0.9; -0.27; -0.77

What is the correlation of the average household income in the locality of the
store, with the average price of garments sold in the corresponding store?
Options: 1; 0.9; -0.9; -1

TRENDS
(1 point possible)
Using the correlations in this exercise, which of these conclusions can be
made?

Select all the correct options.

- The minimum price of garment sold is generally higher in stores located within high income localities
- The maximum price of garment sold is generally lower in stores located within low income localities
- Sauravi Fashions is not performing well in terms of total revenue in localities with high average household income
- Stores in high income localities generally sell high priced garments

Study the distribution of the transacted amounts using histograms and
boxplots.
