Frequency Distribution & Data Visualisation
Example
A survey was taken on Maple Avenue. In each of 20 homes, people were asked how
many cars were registered to their households. The results were recorded as
follows:
1, 2, 1, 0, 3, 4, 0, 1, 1, 1, 2, 2, 3, 2, 3, 2, 1, 4, 0, 0
Use the following steps to present this data in a frequency distribution table.
1. Divide the results (x) into intervals, and then count the number of results in
each interval. In this case, the intervals would be the number of households
with no car (0), one car (1), two cars (2) and so forth.
2. Make a table with separate columns for the interval numbers (the number of
cars per household), the tallied results, and the frequency of results in each
interval. Label these columns Number of cars, Tally and Frequency.
3. Read the list of data from left to right and place a tally mark in the
appropriate row. For example, the first result is a 1, so place a tally mark in
the row beside where 1 appears in the interval column (Number of cars).
The next result is a 2, so place a tally mark in the row beside the 2, and so
on. When you reach your fifth tally mark, draw a tally line through the
preceding four marks to make your final frequency calculations easier to
read.
4. Add up the number of tally marks in each row and record them in the final
column entitled Frequency.
5. The relative frequency of a particular observation or class interval is
found by dividing the frequency (f) by the number of observations (n): that
is, (f ÷ n). Thus:
Relative frequency = frequency ÷ number of observations
6. The percentage frequency is found by multiplying each relative frequency
value by 100. Thus:
Percentage frequency = relative frequency × 100 = (f ÷ n) × 100
Your frequency distribution table for this exercise should look like this:
Table 1
Frequency table for the number of cars registered in each household

Number of   Frequency   Cumulative    Relative frequency     Cumulative
cars (x)    (f)         frequency     (f ÷ n)                percentage
0           4           4             4/20 = 0.20 or 20%     20
1           6           4+6 = 10      6/20 = 0.30 or 30%     20+30 = 50
2           5           10+5 = 15     5/20 = 0.25 or 25%     50+25 = 75
3           3           15+3 = 18     3/20 = 0.15 or 15%     75+15 = 90
4           2           18+2 = 20     2/20 = 0.10 or 10%     90+10 = 100
By looking at this frequency distribution table quickly, we can see that out of
20 households surveyed, 4 households had no cars, 6 households had 1 car, etc.
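The tallying and the relative and percentage frequency formulas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name frequency_table and the row layout are hypothetical choices, not part of the original exercise.

```python
from collections import Counter

# Survey results copied from the example (cars registered per household).
cars = [1, 2, 1, 0, 3, 4, 0, 1, 1, 1, 2, 2, 3, 2, 3, 2, 1, 4, 0, 0]

def frequency_table(data):
    """Return one row per distinct value:
    (value, f, cumulative f, relative f, percentage, cumulative percentage)."""
    counts = Counter(data)   # does the tallying for us
    n = len(data)            # number of observations
    rows = []
    cumulative = 0
    for value in sorted(counts):
        f = counts[value]
        cumulative += f
        rows.append((value, f, cumulative,
                     f / n, 100 * f / n, 100 * cumulative / n))
    return rows

table = frequency_table(cars)
for value, f, cum_f, rel, pct, cum_pct in table:
    print(f"{value:>3} {f:>3} {cum_f:>3} {rel:>5.2f} {pct:>4.0f}% {cum_pct:>4.0f}%")
```

Each printed row corresponds to one row of the frequency distribution table, so the output can be compared against a hand-built tally.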
At a recent chess tournament, all 10 of the participants had to fill out a form
that gave their name, address and age. The ages of the participants were
recorded as follows:
36, 48, 54, 92, 57, 63, 66, 76, 66, 80
Use the following steps to present these data in a cumulative frequency
distribution table.
1. Divide the results into intervals, and then count the number of
results in each interval. In this case, intervals of 10 are appropriate.
Since 36 is the lowest age and 92 is the highest age, start the
intervals at 35 to 44 and end the intervals with 85 to 94.
2. Create a table similar to the frequency distribution table but with
three extra columns.
o In the first column or the Lower value column, list the lower
value of the result intervals. For example, in the first row, you
would put the number 35.
o The next column is the Upper value column. Place the upper
value of the result intervals. For example, you would put the
number 44 in the first row.
o The third column is the Frequency column. Record the number
of times a result appears between the lower and upper values.
In the first row, place the number 1.
o The fourth column is the Cumulative frequency column. Here
we add the cumulative frequency of the previous row to the
frequency of the current row. Since this is the first row, the
cumulative frequency is the same as the frequency. However, in
the second row, the frequency for the 35–44 interval (i.e., 1) is
added to the frequency for the 45–54 interval (i.e., 2). Thus, the
cumulative frequency is 1 + 2 = 3, meaning we have 3 participants in
the 35 to 54 age group.
Table 2
Ages of participants at a chess tournament

Lower   Upper   Frequency   Cumulative   Percentage   Cumulative
value   value   (f)         frequency                 percentage
35      44      1           1            10.0         10.0
45      54      2           3            20.0         30.0
55      64      2           5            20.0         50.0
65      74      2           7            20.0         70.0
75      84      2           9            20.0         90.0
85      94      1           10           10.0         100.0
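The interval grouping and the running totals described in the steps above can be sketched as follows. The helper name cumulative_table and its parameters (start, width, stop) are assumptions made for this illustration; the intervals of 10 starting at 35 come from the example.

```python
# Ages copied from the chess tournament example.
ages = [36, 48, 54, 92, 57, 63, 66, 76, 66, 80]

def cumulative_table(data, start, width, stop):
    """Group data into intervals [lower, lower + width - 1] and return rows of
    (lower, upper, f, cumulative f, percentage, cumulative percentage)."""
    n = len(data)
    rows = []
    cumulative = 0
    for lower in range(start, stop, width):
        upper = lower + width - 1
        # Frequency: how many results fall between lower and upper, inclusive.
        f = sum(1 for x in data if lower <= x <= upper)
        # Cumulative frequency: previous running total plus this row's frequency.
        cumulative += f
        rows.append((lower, upper, f, cumulative,
                     100 * f / n, 100 * cumulative / n))
    return rows

age_table = cumulative_table(ages, start=35, width=10, stop=95)
for row in age_table:
    print("{:>3} {:>3} {:>3} {:>3} {:>6.1f} {:>6.1f}".format(*row))
```

The last cumulative frequency should equal the number of observations (10 here), which is a quick check that no result fell outside the chosen intervals.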
A pie chart usually shows the component parts of a whole. Sometimes you will
see a segment of the drawing separated from the rest of the pie in order to
emphasize an important piece of information.
The grouped bar chart is another effective means of comparing sets of data
about the same places or items. It gives two or more pieces of information
for each item on the x-axis instead of just one as in Chart 2. This allows you
to make direct comparisons on the same chart by age group, gender or
anything else you wish to compare. However, if a grouped bar chart has too
many series of data, the chart becomes cluttered and it can be confusing to
read.
Chart 3, a grouped vertical bar chart, compares two series of data: the
numbers of boys and girls that have a smartphone at CUHP from 2012 to
2019. The blue bar represents the number of boys, and the red bar
represents the number of girls.
Year   Number of boys   Number of girls
2012   110              85
2013   185              175
2014   240              225
2015   285              295
2016   305              280
2017   310              315
2018   315              305
2019   315              320
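To illustrate how a grouped bar chart places two series side by side for each item, here is a minimal text-only sketch using the boys and girls counts listed for Chart 3. The function grouped_bars, the "#" symbol and the scale of one symbol per 10 students are all assumptions chosen for the illustration; a plotting library would normally draw the actual chart.

```python
# Data for Chart 3: students with a smartphone, by year.
years = [2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019]
boys = [110, 185, 240, 285, 305, 310, 315, 315]
girls = [85, 175, 225, 295, 280, 315, 305, 320]

def grouped_bars(labels, series, scale=10):
    """Return text lines with one bar per series for each label.
    Each '#' stands for `scale` full units."""
    lines = []
    for i, label in enumerate(labels):
        for name, values in series.items():
            bar = "#" * (values[i] // scale)
            lines.append(f"{label} {name:<5} {bar} {values[i]}")
        lines.append("")  # blank line separates one group from the next
    return lines

chart_lines = grouped_bars(years, {"boys": boys, "girls": girls})
print("\n".join(chart_lines))

# The direct comparison the chart makes visible: years where girls exceed boys.
girls_ahead = [y for y, b, g in zip(years, boys, girls) if g > b]
print(girls_ahead)
```

Keeping the two bars for a year on adjacent lines is what makes the comparison within each group easy to read; adding many more series would clutter each group in the same way the text warns about.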
One disadvantage of vertical bar charts, however, is that they lack space for
text labelling at the foot of each bar. When category labels in the chart are
too long, you might find a horizontal bar chart better for displaying
information, like the example in Chart 4.
Chart 4 data
Sport        Percentage of girls (%)   Percentage of boys (%)
Wrestling    5                         10
Volleyball   23                        10
Tennis       8                         8
Swimming     12                        9
Soccer       17                        20
Football     2                         40
Basketball   25                        40
Baseball     17                        24
Athletics    17                        17
Stacked bar charts
There are several other types of bar chart that you may encounter.
The population pyramid is a special application of a grouped bar chart.
Another useful type of bar chart is the stacked bar chart.
The stacked bar chart is a preliminary data analysis tool used to show
segments of totals. The stacked bar chart can be very difficult to analyze if
too many items are in each stack. It can contrast values, but not necessarily
in the simplest manner.
In Chart 5, it is easy to analyze the data presented since there are only
three items in each stack: swimming, running and biking. It is easy to see at
a glance what percentage of time each woman spent on an event. Had this
been a chart representing a decathlon (with 10 events) the data would have
been significantly harder to analyze.
Name       Percentage of time    Percentage of time   Percentage of time
           spent swimming (%)    spent cycling (%)    spent running (%)
Averi      13                    50                   37
Bronwyn    32                    53                   15
Hillary    21                    28                   51
Jessa      41                    14                   45
Megan      9                     81                   10
Mercedes   28                    47                   25
Rosalyn    32                    40                   28
Tiiu       38                    24                   38
[Chart 5: stacked horizontal bar chart of the percentage of time spent swimming, cycling and running, one bar per name; horizontal axis from 0% to 100%]
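A stacked bar is built by laying each segment end to end, so every segment starts where the previous one stopped. The sketch below computes those start and end positions for the Chart 5 data; the helper name stack_segments is a hypothetical choice for this illustration, and drawing the bars is left to whatever charting tool is in use.

```python
# Percentage of time spent on each event (swimming, cycling, running).
events = ("swimming", "cycling", "running")
times = {
    "Averi": (13, 50, 37),
    "Bronwyn": (32, 53, 15),
    "Hillary": (21, 28, 51),
    "Jessa": (41, 14, 45),
    "Megan": (9, 81, 10),
    "Mercedes": (28, 47, 25),
    "Rosalyn": (32, 40, 28),
    "Tiiu": (38, 24, 38),
}

def stack_segments(values):
    """Return a (start, end) pair for each value when stacked end to end."""
    segments = []
    start = 0
    for v in values:
        segments.append((start, start + v))
        start += v
    return segments

for name, values in times.items():
    parts = ", ".join(
        f"{event} {lo}-{hi}%"
        for event, (lo, hi) in zip(events, stack_segments(values))
    )
    print(f"{name:<9} {parts}")
```

Because the values are percentages of a whole, the final segment of every stack should end at 100; with 10 events instead of 3, each stack would hold 10 such segments, which is why the chart becomes harder to read.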
Month      Number of students
January    250
February   250
April      260
May        280
June       290
July       315
[Chart: Number of students by month, January to July; vertical axis from 0 to 350]
Age   Average donation ($)
15    36
16    52
17    83
18    100
19    110
[Chart: Age & donation ($); horizontal axis: Age, 15 to 19; vertical axis: Average donation ($), 0 to 120]
40,000    75
50,000    85
60,000    82
70,000    97
80,000    87
90,000    90
100,000   95
[Chart: horizontal axis: Income]
The histogram