The document covers Grade 11 data handling in mathematics, focusing on statistical graphs such as histograms, frequency polygons, and ogives. It includes methods for calculating variance, standard deviation, and identifying outliers, as well as exercises for practical application. The content builds on concepts learned in Grade 10, emphasizing the importance of graphical data representation for analysis.
Grade 11 Data handling

Mathematics (High School - South Africa)


Grade 11 Data Handling


In this chapter you:
 Draw histograms
 Draw frequency polygons
 Draw ogives (cumulative frequency curves)
 Calculate variance and standard deviation of ungrouped data
 Determine whether data is symmetric or skewed
 Identify the values of the outliers.

WHAT YOU LEARNED ABOUT DATA HANDLING IN GRADE 10

In Grade 10 you covered the following data handling concepts:


 Measures of central tendency of lists of data, of data in frequency tables and of data
in grouped frequency tables.
 The range, percentiles, quartiles, interquartile and semi-interquartile range
 The five number summary and box-and-whisker diagram
 Using statistical summaries (measures of central tendency and dispersion) to analyse
and make meaningful comments on the context associated with the given data.

STATISTICAL GRAPHS
✓ Organised data can often be presented in graphical form.
 Statistical graphs are used to describe data or to analyse it.
 The purpose of graphs in statistics is to communicate the data to the viewers
in pictorial form. It is easier for most people to understand data when it is
presented as a graph than when it is presented numerically in tables.

✓ In earlier grades you dealt with the following graphs


 Bar graphs and double bar graphs
 Histograms
 Pie charts
 Broken-line graphs.


✓ In Grade 11 you study three statistical graphs often used in research: the
histogram, the frequency polygon and the cumulative frequency graph or
ogive.

HISTOGRAMS
✓ A histogram gives us a visual interpretation of data. It looks very similar to a
bar graph, but there are definite differences between them.

HISTOGRAM
 It is a representation of grouped data.
 There is no gap between the bars.
 For example, you draw a HISTOGRAM to show the number of people whose heights
(h) lie in the following intervals (measured in cm): 150 ≤ h < 160; 160 ≤ h < 170; etc.

BAR GRAPH
 It is a representation of ungrouped data that does not have to be numerical.
 There is generally a gap between the bars.
 For example, you draw a BAR GRAPH to show the number of learners in a class
who wear glasses and the number who do not wear glasses.

EXAMPLE 1


The following table lists the marks (given as percentage) obtained by
the Grade 11 learners of Musi High School in their mathematics test:

24 70 50 22 63 45 48 52 56 38
65 68 65 17 32 60 62 53 63 45
49 44 56 12 55 83 54 22 67 54
34 77 46 50 58 80 81 39 84 75
55 76 73 80 66 71 62 40 23 76

a) Organise the data using a grouped frequency table.


b) Draw a histogram to illustrate the data.
c) Calculate the modal interval. What does this measure of central
tendency tell you about the learners’ marks?
d) Estimate the median. What does this measure of central tendency
tell you about the learners’ marks?
SOLUTION:
a) The lowest mark was 12% and the highest mark was 84%
It is often easiest to use multiples of 10 as the class intervals, so start the first
interval at 10% and end the last interval at 90%


Percentages (t)     Frequency (number of learners)
10 ≤ t < 20         2
20 ≤ t < 30         4
30 ≤ t < 40         4
40 ≤ t < 50         7
50 ≤ t < 60         11
60 ≤ t < 70         10
70 ≤ t < 80         7
80 ≤ t < 90         5
TOTAL               50
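The grouping in part a) can also be checked with a short program. The following is a minimal Python sketch, not part of the original solution; the function name `grouped_frequencies` and its parameters are illustrative choices, and it assumes class intervals of the form lower ≤ t < upper with a width of 10.

```python
from collections import Counter

# Marks (percentages) of the 50 learners in Example 1.
marks = [
    24, 70, 50, 22, 63, 45, 48, 52, 56, 38,
    65, 68, 65, 17, 32, 60, 62, 53, 63, 45,
    49, 44, 56, 12, 55, 83, 54, 22, 67, 54,
    34, 77, 46, 50, 58, 80, 81, 39, 84, 75,
    55, 76, 73, 80, 66, 71, 62, 40, 23, 76,
]

def grouped_frequencies(data, width, start, stop):
    """Count the items in each class lower <= t < lower + width.

    Hypothetical helper: `start` is the lower limit of the first class and
    `stop` is the upper limit of the last class.
    """
    # Integer division maps every mark to the lower limit of its class.
    counts = Counter((value // width) * width for value in data)
    return [(lower, lower + width, counts.get(lower, 0))
            for lower in range(start, stop, width)]

table = grouped_frequencies(marks, width=10, start=10, stop=90)
for lower, upper, frequency in table:
    print(f"{lower} <= t < {upper}: {frequency}")
print("TOTAL:", sum(frequency for _, _, frequency in table))
```

The printed frequencies should agree with the table above, and the total should equal the number of learners.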

b) Draw the histogram as follows:


STEP 1: Draw and label the horizontal and vertical axes.
STEP 2: Represent the frequency on the vertical axis and the classes on the
horizontal axis.
STEP 3: Using the frequencies (or number of learners) as the heights, draw
vertical bars for each class.

[Figure: histogram titled "Mathematics test marks"; horizontal axis: Percentages (class intervals from 10 ≤ t < 20 to 80 ≤ t < 90); vertical axis: Number of learners (0 to 12)]

c) The modal interval is the interval with the largest frequency or largest number of
learners. So the modal interval is 50 ≤ t < 60.
This tells us that more learners got marks in the interval 50 ≤ t < 60 than in any of
the other intervals.
d) There are 50 data items (marks/percentages).
The median lies between the 25th and the 26th marks.
Add up the frequencies until you reach more than 25:
2 + 4 + 4 + 7 = 17 and 2 + 4 + 4 + 7 + 11 = 28
So the 18th to the 28th marks, which include the 25th and the 26th marks, lie in the
interval 50 ≤ t < 60
So the median lies in the interval 50 ≤ t < 60
The median ≈ 55% (the midpoint of the interval)
This tells us that about 50% of the learners got marks that were less than 55% and
about 50% of the learners got marks that were more than 55%
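The reasoning in parts c) and d) can be sketched in Python as follows. This is only an illustration: the helper names `modal_interval` and `median_interval` are assumptions, and the median is estimated by the midpoint of its class, as in the solution above.

```python
# Grouped frequency table from Example 1: (lower limit, upper limit, frequency).
table = [
    (10, 20, 2), (20, 30, 4), (30, 40, 4), (40, 50, 7),
    (50, 60, 11), (60, 70, 10), (70, 80, 7), (80, 90, 5),
]

def modal_interval(rows):
    """Return the limits of the class with the largest frequency."""
    lower, upper, _ = max(rows, key=lambda row: row[2])
    return lower, upper

def median_interval(rows):
    """Return the limits of the class that contains the median position.

    The median position is (n + 1) / 2, e.g. 25.5 for n = 50, which lies
    between the 25th and 26th items. If those two items fall in different
    classes, this sketch returns the class of the later item.
    """
    total = sum(frequency for _, _, frequency in rows)
    running = 0
    for lower, upper, frequency in rows:
        running += frequency  # running total (cumulative frequency)
        if running >= (total + 1) / 2:
            return lower, upper
    raise ValueError("the table is empty")

mode_class = modal_interval(table)
median_class = median_interval(table)
median_estimate = (median_class[0] + median_class[1]) / 2  # midpoint of the class

print("Modal interval:", mode_class)
print("Median interval:", median_class)
print("Median estimate:", median_estimate)
```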


NOTE:
A histogram should have the following:
 A title which describes the information that is contained in the histogram.
 A horizontal axis with a label which shows the scale of values into which the data fit
(grouped data intervals)
 A vertical axis with a label which shows the number of times the data within the
interval occurred (frequency)
 Adjacent bars (i.e. there are no gaps between the bars).

EXERCISE 2.1

1) The frequency table below represents the distribution of the amount of time (in
hours) that 80 high school learners spent in one week watching their favourite sport.
Time in hours Frequency
10 < t ≤ 15 8
15 < t ≤ 20 28
20 < t ≤ 25 27
25 < t ≤ 30 12
30 < t ≤ 35 4
35 < t ≤ 40 1

a) Draw a histogram to represent the data


b) Calculate
i) the modal interval
ii) an estimate of the median
c) What do these two measures of central tendency tell you about the amount of
time the learners devote to watching their favourite sport?
2) In the 2009 Census@School, learners were asked which subjects at school were
their favourites. Fifty Grade 11 learners from a certain school in Limpopo chose
Science as their favourite subject. The following are their Science marks (as
percentages):
31 62 51 44 61 63 59 47 59 67
50 54 61 41 48 74 53 53 53 36
60 42 50 48 42 27 43 42 43 54
49 47 51 28 54 48 83 65 54 35
61 56 57 32 38 32 40 63 56 59

a) Organise the data in a grouped frequency table.


b) Draw a histogram to represent the data.
c) Calculate the modal interval and an estimate of the median and say what these
two measures of central tendency tell you about the learners’ marks.


FREQUENCY POLYGONS

✓ A frequency polygon can be used instead of a histogram for illustrating grouped data.

NOTE: It is called a frequency polygon because of its shape. (A polygon is a closed
geometric shape made up of line segments.)

✓ One way of drawing a frequency polygon is to


a) Draw a histogram
b) Join the midpoints of the top of the columns of the histogram
c) Extend the line to the midpoint of the class interval below the lowest value
and to the midpoint of the class interval above the highest value so that the
line touches the horizontal axis on both sides.

✓ Another way of drawing a frequency polygon is to


a) Calculate the midpoint of each interval and then plot the ordered pair
(midpoint of the interval; frequency)
b) Find the midpoint of the interval below the lowest interval and of the interval
above the highest interval, and plot the points (midpoint of the interval; 0)
c) Join these points with straight lines.
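The second method can be sketched in Python. This is a minimal illustration rather than a prescribed procedure: the function name `polygon_points` is hypothetical, the class intervals are those of the recycling survey in the example that follows, and all classes are assumed to have the same width.

```python
# Class intervals as (lower limit, upper limit, frequency),
# i.e. 9 < t <= 13, 13 < t <= 17, and so on.
classes = [(9, 13, 8), (13, 17, 28), (17, 21, 27),
           (21, 25, 12), (25, 29, 4), (29, 33, 1)]

def polygon_points(rows):
    """Return the (midpoint, frequency) pairs of a frequency polygon.

    An empty class is added below the lowest interval and above the highest
    interval so that the polygon touches the horizontal axis on both sides.
    Assumes equal class widths.
    """
    width = rows[0][1] - rows[0][0]
    first_lower = rows[0][0]
    last_upper = rows[-1][1]
    padded = ([(first_lower - width, first_lower, 0)]
              + list(rows)
              + [(last_upper, last_upper + width, 0)])
    return [((lower + upper) / 2, frequency) for lower, upper, frequency in padded]

points = polygon_points(classes)
for midpoint, frequency in points:
    print(f"({midpoint}; {frequency})")
```

Joining consecutive points with straight lines gives the polygon; the first and last points lie on the horizontal axis.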


EXAMPLE 2


Eighty of the learners at Alexandra High School were surveyed to find
out how many minutes each week they spent collecting waste material
for recycling. The grouped frequency table shows the results of the
survey.

Number of minutes (t)     Number of learners (f)
9 < t ≤ 13                8
13 < t ≤ 17               28
17 < t ≤ 21               27
21 < t ≤ 25               12
25 < t ≤ 29               4
29 < t ≤ 33               1

a) Use the frequency table to draw a histogram and then draw a frequency
polygon on the histogram.
b)
i) Find the midpoints of the intervals.
ii) Use the table to draw a frequency polygon on a separate set of axes.

SOLUTION
a) Step 1: Add in two classes with a frequency of zero:
Number of minutes (t)     Number of learners (f)
5 < t ≤ 9                 0
9 < t ≤ 13                8
13 < t ≤ 17               28
17 < t ≤ 21               27
21 < t ≤ 25               12
25 < t ≤ 29               4
29 < t ≤ 33               1
33 < t ≤ 37               0

Step 2: Draw the histogram and then join the midpoints of the top of the columns
to form the frequency polygon.
[Figure: histogram with a frequency polygon drawn on it, titled "Number of minutes spent collecting waste materials"; horizontal axis: Number of minutes (class intervals from 5 < t ≤ 9 to 33 < t ≤ 37); vertical axis: Number of learners (0 to 30)]

b)
i) Calculate the midpoint of each interval using the formula:
$\text{Midpoint} = \dfrac{\text{lower limit of interval} + \text{upper limit of interval}}{2}$

Number of minutes (t)   Midpoint                    Frequency (f)   Ordered pair
5 < t ≤ 9               (5 + 9)/2 = 14/2 = 7        0               (7; 0)
9 < t ≤ 13              (9 + 13)/2 = 22/2 = 11      8               (11; 8)
13 < t ≤ 17             (13 + 17)/2 = 30/2 = 15     28              (15; 28)
17 < t ≤ 21             (17 + 21)/2 = 38/2 = 19     27              (19; 27)
21 < t ≤ 25             (21 + 25)/2 = 46/2 = 23     12              (23; 12)
25 < t ≤ 29             (25 + 29)/2 = 54/2 = 27     4               (27; 4)
29 < t ≤ 33             (29 + 33)/2 = 62/2 = 31     1               (31; 1)
33 < t ≤ 37             (33 + 37)/2 = 70/2 = 35     0               (35; 0)

ii) Plot the ordered pairs (midpoint; frequency) and join them with straight lines.
Make sure that the graph touches the horizontal axis on both sides.

[Figure: frequency polygon titled "Number of minutes spent collecting waste materials"; horizontal axis: Number of minutes (class intervals from 5 < t ≤ 9 to 33 < t ≤ 37); vertical axis: Number of learners (0 to 30)]

NOTE:
The main advantage of using a frequency polygon instead of a histogram is that you
can easily draw two or more frequency polygons on the same set of axes and make
comparisons between the sets of data.


EXAMPLE 3


The Grade 10 and Grade 11 learners were surveyed to find out the
approximate number of hours every week they spend doing their
Mathematics and Science homework. The results are summarised in
the following grouped frequency table:
Number of hours spent on Mathematics and     Number of Grade 10     Number of Grade 11
Science homework each week (t)               learners               learners
5 ≤ t < 10                                   3                      12
10 ≤ t < 15                                  4                      22
15 ≤ t < 20                                  7                      10
20 ≤ t < 25                                  19                     6
25 ≤ t < 30                                  16                     8
30 ≤ t < 35                                  1                      2
TOTAL                                        50                     60
a) Draw two frequency polygons on the same set of axes to illustrate
this data.
b) Use the table and the graphs to answer the following:
i) What is the modal interval for Grade 10 and also for Grade 11?
ii) Approximately how many more Grade 11 learners than Grade
10 learners spent between 15 and 20 hours doing their
homework each week?
iii) Which grade spent more time doing their homework?
SOLUTION:
a)
Number of hours spent on Mathematics and     Midpoint of     Number of Grade 10     Number of Grade 11
Science homework each week (t)               the interval    learners               learners
0 ≤ t < 5                                    2,5             0                      0
5 ≤ t < 10                                   7,5             3                      12
10 ≤ t < 15                                  12,5            4                      22
15 ≤ t < 20                                  17,5            7                      10
20 ≤ t < 25                                  22,5            19                     6
25 ≤ t < 30                                  27,5            16                     8
30 ≤ t < 35                                  32,5            1                      2
35 ≤ t < 40                                  37,5            0                      0

 Note that it is not essential to have the same number of data items in the two sets
of data.


[Figure: two frequency polygons (Grade 10 and Grade 11) on the same set of axes, titled "Number of hours spent on Maths and Science homework"; horizontal axis: Number of hours (class intervals from 0 ≤ t < 5 to 35 ≤ t < 40); vertical axis: Number of learners (0 to 25)]

b)
i) The modal interval for Grade 10 is 20 ≤ t < 25
The modal interval for Grade 11 is 10 ≤ t < 15
ii) Difference in the number of learners who spent between 15 and 20 hours doing
homework each week = Number in Grade 11 – Number in Grade 10
= 10 – 7
=3
So 3 more Grade 11 learners than Grade 10 learners spent between 15 and 20
hours doing homework each week
iii) According to the table:
36 out of 50 Grade 10 learners (72% of them) spent 20 hours or more doing
homework each week.
16 out of 60 Grade 11 learners (27% of them) spent 20 hours or more doing
homework each week.
So the Grade 10s spent more time on homework than the Grade 11s.
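The percentages in part iii) can be reproduced with a few lines of Python. This is a sketch only; the function name `share_at_least` is an illustrative choice, and the frequencies are taken from the table in Example 3.

```python
# Lower limits of the class intervals 5 <= t < 10, 10 <= t < 15, ...
lower_limits = [5, 10, 15, 20, 25, 30]
grade_10 = [3, 4, 7, 19, 16, 1]
grade_11 = [12, 22, 10, 6, 8, 2]

def share_at_least(frequencies, limits, threshold):
    """Return (count, total, percentage) of learners in classes that
    start at `threshold` hours or more."""
    total = sum(frequencies)
    count = sum(frequency for frequency, lower in zip(frequencies, limits)
                if lower >= threshold)
    return count, total, 100 * count / total

count_10, total_10, percent_10 = share_at_least(grade_10, lower_limits, 20)
count_11, total_11, percent_11 = share_at_least(grade_11, lower_limits, 20)

print(f"Grade 10: {count_10} out of {total_10} ({percent_10:.0f}%)")
print(f"Grade 11: {count_11} out of {total_11} ({percent_11:.0f}%)")
```

Comparing percentages rather than raw counts matters here because the two grades do not have the same number of learners.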


EXERCISE 2.2

1) The learners at Mjolo High School enjoy taking part in athletics. (Athletics is a
collection of sporting events that involve competitive running, jumping and throwing.)
Some of the learners took part in the long jump.
The distances they jumped (in metres) are:

5,46 5,97 6,72 6,26 5,13 6,36


6,11 6,38 6,55 5,84 6,20 6,34
5,80 5,43 5,93 6,64 5,67 6,00
6,05 6,88 5,50 5,51 6,10 5,49

a) Copy and complete the following grouped frequency table:

Distance in metres (m) Frequency Midpoints


5,00 < m ≤ 5,50
5,50 < m ≤ 6,00
6,00 < m ≤ 6,50
6,50 < m ≤ 7,00

b) Draw a frequency polygon to illustrate the data.


c) Write down the modal interval.
2) Some of the learners took part in the javelin competition. The best distances (in
metres) thrown by each competitor in 2011 and 2012 are shown.
Distance thrown in     Number of competitors     Number of competitors
metres (m)             2011                      2012
10 < m ≤ 20            0                         1
20 < m ≤ 30            3                         4
30 < m ≤ 40            14                        19
40 < m ≤ 50            21                        13
50 < m ≤ 60            7                         11
60 < m ≤ 70            0                         2
TOTAL                  45                        50
a) On the same set of axes, draw frequency polygons to illustrate the 2011 and 2012
results.
b) By referring to the table and the frequency polygons, comment on the
performance of the competitors in 2011 and 2012.


OGIVES / CUMULATIVE FREQUENCY CURVES

FREQUENCY: Frequency tells us how many of each item there are in a data set.

For example
As part of the Census@School, 170 learners were surveyed to find
out the type of dwelling that they lived in.
The following table shows the result of the survey:

Type of house that you live in                  Frequency (number of learners)
Traditional dwelling                            7
House on separate yard                          76
Tent                                            1
Informal dwelling in an informal settlement     86
TOTAL                                           170

CUMULATIVE FREQUENCY: Cumulative frequency shows the number of results that are less
than (<) or less than or equal to (≤) a stated value in a set of data.

To find the cumulative frequency,


 Add up the frequencies as you go down the frequency table.
 Write each running total or cumulative frequency in your table.

For example
Using the above information, we can find the cumulative frequency.
Type of house that you live in                  Frequency (number of learners)     Cumulative frequency
Traditional dwelling                            7                                  7
House on separate yard                          76                                 7 + 76 = 83
Tent                                            1                                  83 + 1 = 84
Informal dwelling in an informal settlement     86                                 84 + 86 = 170
TOTAL                                           170

Can you see that the last cumulative frequency is equal to the total
frequency? (This is a useful check of your addition.)

You can find cumulative frequencies of discrete data and continuous data.


✓ An ogive or cumulative frequency curve is a graph that shows the information
in a cumulative frequency table. The graph is useful for estimating the median
and interquartile range of the grouped data.

✓ You can draw an ogive of ungrouped discrete data, grouped discrete data or
grouped continuous data. It can be drawn from a grouped frequency table or an
ungrouped frequency table.

EXAMPLE 4
The following frequency table shows the time (in minutes) taken by
learners to travel to school.

Time taken to travel to school     Frequency     Cumulative frequency     Ordered pairs
0 < t ≤ 10                         4
10 < t ≤ 20                        12
20 < t ≤ 30                        28
30 < t ≤ 40                        32
40 < t ≤ 50                        29
50 < t ≤ 60                        15
a) Complete the table.
b) Draw an ogive to illustrate the information.
SOLUTION:
a) Steps to follow when completing the table:
 Add in an interval with a frequency of 0 before the first interval.
 Find the cumulative frequency by adding the frequencies.
 List the ordered pairs where the first coordinate = upper limit of the interval and
the second coordinate = cumulative frequency.
Note: A cumulative frequency of 105 means that 105 learners took 50 minutes or
less to travel to school.

Time taken to travel to school     Frequency     Cumulative frequency     Ordered pairs
–10 < t ≤ 0                        0             0                        (0; 0)
0 < t ≤ 10                         4             4                        (10; 4)
10 < t ≤ 20                        12            4 + 12 = 16              (20; 16)
20 < t ≤ 30                        28            16 + 28 = 44             (30; 44)
30 < t ≤ 40                        32            44 + 32 = 76             (40; 76)
40 < t ≤ 50                        29            76 + 29 = 105            (50; 105)
50 < t ≤ 60                        15            105 + 15 = 120           (60; 120)
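The running totals and ordered pairs in this table can be generated with Python's standard `itertools.accumulate`. The sketch below assumes the frequencies of Example 4, including the added interval with a frequency of 0; the variable names are illustrative.

```python
from itertools import accumulate

# Upper limits of the intervals, starting with the added interval -10 < t <= 0.
upper_limits = [0, 10, 20, 30, 40, 50, 60]
frequencies = [0, 4, 12, 28, 32, 29, 15]

# accumulate() yields the running total after each frequency is added.
cumulative = list(accumulate(frequencies))

# Each ogive point is (upper limit of the interval; cumulative frequency).
ogive_points = list(zip(upper_limits, cumulative))

for upper, total in ogive_points:
    print(f"({upper}; {total})")

# The last cumulative frequency must equal the total frequency.
print("Check:", cumulative[-1] == sum(frequencies))
```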

b) Draw the ogive as follows:


i) Draw the axes and label the variable on the x-axis and the cumulative frequency
on the y-axis.
ii) Plot the ordered pairs.
iii) Join the points to form a smooth curve.


The ogive:

[Figure: ogive titled "Time taken to travel to school"; horizontal axis: Time (in minutes), 0 to 60; vertical axis: Cumulative frequency, 0 to 120]

✓ When drawing a cumulative frequency curve from a table of grouped data, always
plot the cumulative frequencies at the upper limit of each interval.


EXAMPLE 5


Use the ogive drawn in Example 4 to
a) Determine the approximate values of
i) the median
ii) the lower quartile
iii) the upper quartile of the set of data.
b) What does each of these values tell you about the time taken by the
learners?
SOLUTION:
a) This is the ogive drawn in Example 4:

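[Figure: ogive titled "Time taken to travel to school"; horizontal axis: Time taken (in minutes), 0 to 60; vertical axis: Cumulative frequency, 0 to 120; the lower quartile (Q1), the median (M) and the upper quartile (Q3) are marked on the horizontal axis]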

i) To find the approximate value of the median (M), find the midpoint of the
values plotted on the cumulative frequency axis.
 The maximum value is 120, so the median lies between the 60th and 61st
term.
 Draw a horizontal line from just above 60 until it touches the ogive.
 From that point draw a vertical line down to the horizontal axis.
So the median ≈ 35 minutes.

ii) To find the approximate value of the lower quartile (Q1), find the midpoint of
the lower half of the values plotted on the cumulative frequency axis.
 There are 60 terms in the lower half of the data, so the lower quartile lies
between the 30th and the 31st term.
 Draw a horizontal line from just above 30 until it touches the ogive.
 From that point draw a vertical line down to the horizontal axis.
So the lower quartile ≈ 25 minutes.

iii) To find the approximate value of the upper quartile (Q3), find the midpoint of
the upper half of the values plotted on the cumulative frequency axis.
 There are 60 terms in the upper half of the data, so the upper quartile lies
between the 90th (60 + 30 = 90) and the 91st term.
 Draw a horizontal line from just above 90 until it touches the ogive.
 From that point draw a vertical line down to the horizontal axis.
So the upper quartile ≈ 45 minutes.
b)
i) The median tells us that 50% of the learners took 35 minutes or less to travel
to school.
ii) The lower quartile tells us that 25% of the learners took 25 minutes or less to
travel to school.
iii) The upper quartile tells us that 75% of the learners took 45 minutes or less to
travel to school.
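Reading values off the ogive can be approximated numerically. The sketch below is an assumption-laden illustration, not the graphical method used above: it joins the plotted points of Example 4 with straight lines (linear interpolation) instead of a smooth curve, and it reads the quartiles at cumulative frequencies of n/4, n/2 and 3n/4. The function name `read_from_ogive` is hypothetical.

```python
# Ogive points from Example 4: (upper limit of interval, cumulative frequency).
ogive_points = [(0, 0), (10, 4), (20, 16), (30, 44), (40, 76), (50, 105), (60, 120)]

def read_from_ogive(points, position):
    """Return the value on the horizontal axis at which the ogive reaches
    the cumulative frequency `position`, using straight-line segments."""
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= position <= c1 and c1 > c0:
            # Move along the segment in proportion to the cumulative frequency.
            return x0 + (position - c0) * (x1 - x0) / (c1 - c0)
    raise ValueError("position lies outside the ogive")

total = ogive_points[-1][1]  # the last cumulative frequency is n
lower_quartile = read_from_ogive(ogive_points, total / 4)
median = read_from_ogive(ogive_points, total / 2)
upper_quartile = read_from_ogive(ogive_points, 3 * total / 4)

print(f"Q1 is about {lower_quartile:.0f} minutes")
print(f"Median is about {median:.0f} minutes")
print(f"Q3 is about {upper_quartile:.0f} minutes")
```

Because the textbook method uses a hand-drawn smooth curve, readings taken from a graph may differ slightly from these interpolated values.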

EXERCISE 2.3

1) In the 2009 Census@School learners were asked what their arm span was, correct to
the nearest centimetre. The results of two hundred of the Grade 10, 11 and 12
learners who took part were recorded as follows:
Arm span in cm     Frequency     Cumulative frequency
130 < h ≤ 135      16
135 < h ≤ 140      26
140 < h ≤ 145      42
145 < h ≤ 150      54
150 < h ≤ 155      26
155 < h ≤ 160      22
160 < h ≤ 165      14

(To find your arm span: open your arms wide and measure the distance across your
back from the tip of your right-hand middle finger to the tip of your left-hand
middle finger.)

a) Copy and complete the table.


b) Draw an ogive to illustrate the data.
c) Use your ogive to determine approximately how many learners have arm spans
that are less than or equal to 152 cm.
d) Use your graph to determine approximately how many learners have arm spans
of between 138 cm and 158 cm.


2) Fifty learners who travel by car to school were asked to record the number of
kilometres travelled to and from school in one week. The following table shows the
results:
Number of kilometres     Number of learners     Cumulative frequency
10 < x ≤ 20              2                      2
20 < x ≤ 30                                     9
30 < x ≤ 40                                     13
40 < x ≤ 50                                     26
50 < x ≤ 60                                     42
60 < x ≤ 70                                     50
TOTAL                    50
a) Copy the table and then fill in the second column of the table.
b) Draw an ogive to illustrate the data.
c) Use your graph to estimate the median number of kilometres travelled per week.
3) The histogram below shows the distribution of the Accounting examination marks
for 200 learners.

[Figure: histogram titled "Accounting Examination Marks"; horizontal axis: Percentages (class intervals from 30 < x ≤ 40 to 90 < x ≤ 100); vertical axis: Frequency (0 to 60); the bars are labelled with the frequencies 57, 55, 43, 18, 12, 11 and 4]

a) Draw a grouped frequency table to record the data shown on the histogram.
b) Draw an ogive to illustrate the data in the frequency table.
c) Use the ogive to estimate how many learners scored 72% or more for the
examination.


4) The masses of a random sample of 50 boys in Grade 11 were recorded. This
cumulative frequency graph (ogive) represents the recorded masses.

[Figure: ogive titled "OGIVE SHOWING THE MASSES OF THE BOYS"; horizontal axis: Mass (in kilograms), 50 to 120; vertical axis: Cumulative frequency, 0 to 55]

a) How many of the boys had a mass between 90 and 100 kilograms?
b) Estimate the median mass of the boys.
c) Estimate how many of the boys had a mass of less than 80 kilograms.


CHOOSING WHICH DISPLAY TO USE


The following table will help you when you have to select the appropriate diagram or
graph for your data by identifying the diagrams most commonly associated with
different types of data.

DATA TYPE: Qualitative data
DESCRIPTION: Data that can be arranged into categories that are not numerical, such as
physical traits, gender, and colours.
EXAMPLE AND TYPE OF DISPLAY:
 To show frequencies, e.g. 10 girls in this class have blonde hair, 18 black hair,
12 brown hair, etc.: Bar graph
 To show proportions, e.g. area by province in South Africa: Western Cape 10,6%;
Gauteng 1,5%; Free State 10,6%; KwaZulu Natal 7,7%; Limpopo 10,3%; Mpumalanga 6,3%;
North West 8,6%; Northern Cape 30,5%: Pie chart

DATA TYPE: Discrete data
DESCRIPTION: Data that has a finite number of different responses, such as the number of
people in a household.
EXAMPLE AND TYPE OF DISPLAY:
 Few different values, e.g. 2, 4, 67, 34, 69: Tally table for counting, bar graph for display
 Many different values, e.g. 4, 24, 25, 26, 45, 37, 38, 48, 53, 120, 75, 67, 100, 89, 47,
58, 87, 55, 45: Stem-and-leaf diagram or a bar graph or a histogram or a box-and-whisker
diagram or a frequency polygon

DATA TYPE: Grouped data
DESCRIPTION: Data which have been arranged in groups or classes rather than showing all
the original figures.
EXAMPLE AND TYPE OF DISPLAY:
 Equal intervals, e.g. 3 ≤ x < 9; 9 ≤ x < 15; 15 ≤ x < 21: Histogram or frequency polygon

DATA TYPE: Cumulative data
DESCRIPTION: Data that is increasing by successive additions of the same numbers.
EXAMPLE AND TYPE OF DISPLAY:
 Continuous variable, e.g. time, height, etc.: Ogive

DATA TYPE: Two samples
EXAMPLE AND TYPE OF DISPLAY:
 Qualitative data, for example comparing the number of learners in the class with brown
eyes, blue eyes, and green eyes: Compound bar graphs or side-by-side pie charts
 Discrete data, for example comparing the English exam marks for boys and girls in a
Grade 11 class: For ungrouped data use back-to-back stem-and-leaf diagrams or compound
bar graphs. For grouped data use frequency polygons
 Continuous data, for example comparing the heights of the girls and the boys in a
Grade 11 class: Ogives or frequency polygons

DATA TYPE: Two variables
DESCRIPTION: Used to decide whether there is a relationship between the two variables.
EXAMPLE AND TYPE OF DISPLAY:
 Data in pairs, e.g. shoe size and the age of a person: Scatter plot


VARIANCE AND STANDARD DEVIATION


✓ The interquartile range (IQR) measures the spread of the middle half of the data
and is closely linked to the median.

Interquartile range = upper quartile – lower quartile


Or
IQR = Q3 – Q1

✓ We can define two more measures of dispersion, taking into account all of the
data, which are linked to the mean. They are the variance and the standard
deviation.

✓ The variance is the mean of the squares of the deviations from the mean.
We find the variance by:
i) Finding the mean: $\bar{x} = \dfrac{\sum x}{n}$
ii) Finding the deviation from the mean of each item of the data set:
Deviation = data item – mean = $x - \bar{x}$
iii) Squaring each deviation: $(\text{deviation})^2 = (x - \bar{x})^2$
iv) Finding the sum of the squares of the deviations:
$\sum(\text{deviation})^2 = \sum (x - \bar{x})^2$
v) Finding the mean of the squares of the deviations by dividing by the
number of terms in the data set:
$\text{Variance} = \dfrac{\sum(\text{deviation})^2}{\text{number of data items}} = \dfrac{\sum (x - \bar{x})^2}{n}$

✓ The standard deviation is the square root of the variance:
$\text{Standard deviation} = \sqrt{\text{variance}} = \sqrt{\dfrac{\sum(\text{deviation})^2}{\text{number of data items}}} = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}}$

✓ When data elements are tightly clustered together, the standard deviation and
variance are small; when they are spread apart, the standard deviation and the
variance are relatively large.
 A data set with more data near the mean will have less spread and a
smaller standard deviation.
 A data set with lots of data far from the mean will have a greater
spread and a larger standard deviation.


EXAMPLE 6


a) Calculate the variance and the standard deviation of the following
two data sets:
Set A 182 182 184 184 185 185 186
Set B 152 166 176 184 194 200 216
b) Use the two standard deviations to compare the distribution of data
in the two sets.
SOLUTION:
a) Step 1: Find the mean of each set
$\text{Mean of Set A} = \dfrac{182+182+184+184+185+185+186}{7} = \dfrac{1\,288}{7} = 184$
$\text{Mean of Set B} = \dfrac{152+166+176+184+194+200+216}{7} = \dfrac{1\,288}{7} = 184$
Step 2: Find the deviation from the mean of each item in the data set
Step 3: Square each deviation
Step 4: Find the variance
Step 5: Find the standard deviation

SET A
Data item     Deviation from the mean     (Deviation)²
182           182 – 184 = –2              (–2)² = 4
182           182 – 184 = –2              (–2)² = 4
184           184 – 184 = 0               0² = 0
184           184 – 184 = 0               0² = 0
185           185 – 184 = 1               1² = 1
185           185 – 184 = 1               1² = 1
186           186 – 184 = 2               2² = 4
Σ(deviation)² = 14
$\text{Variance} = \dfrac{\sum(\text{deviation})^2}{\text{number of data items}} = \dfrac{14}{7} = 2$
$\text{Standard deviation} = \sqrt{\text{variance}} = \sqrt{2} \approx 1{,}414$

SET B
Data item     Deviation from the mean     (Deviation)²
152           152 – 184 = –32             (–32)² = 1 024
166           166 – 184 = –18             (–18)² = 324
176           176 – 184 = –8              (–8)² = 64
184           184 – 184 = 0               0² = 0
194           194 – 184 = 10              10² = 100
200           200 – 184 = 16              16² = 256
216           216 – 184 = 32              32² = 1 024
Σ(deviation)² = 2 792
$\text{Variance} = \dfrac{\sum(\text{deviation})^2}{\text{number of data items}} = \dfrac{2\,792}{7} = 398{,}857\ldots$
$\text{Standard deviation} = \sqrt{\text{variance}} = \sqrt{\dfrac{2\,792}{7}} \approx 19{,}971$

b) The larger standard deviation in Set B indicates that the data items are generally
much further from the mean than the data items in Set A.
This means that the data items in Set B are more spread out than the data items
in Set A.
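The five steps of Example 6 can be written as a short Python sketch. The function names `variance` and `standard_deviation` are illustrative; as in the formulas of this chapter, the sum of the squared deviations is divided by n (the number of data items).

```python
from math import sqrt

def variance(data):
    """Mean of the squared deviations from the mean (divides by n)."""
    n = len(data)
    mean = sum(data) / n                                   # Step 1
    squared_deviations = [(x - mean) ** 2 for x in data]   # Steps 2 and 3
    return sum(squared_deviations) / n                     # Step 4

def standard_deviation(data):
    """Square root of the variance."""
    return sqrt(variance(data))                            # Step 5

set_a = [182, 182, 184, 184, 185, 185, 186]
set_b = [152, 166, 176, 184, 194, 200, 216]

for name, data in (("Set A", set_a), ("Set B", set_b)):
    print(f"{name}: variance = {variance(data):.3f}, "
          f"standard deviation = {standard_deviation(data):.3f}")
```

Both sets have the same mean (184), so the difference in standard deviation reflects only how far the items lie from that mean.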



EXAMPLE 7
Use a scientific calculator to calculate the standard deviation of 9, 7, 11,
10, 13 and 7.

SOLUTION:
CASIO fx-82ZA PLUS calculator
Press the following keys:
[MODE] [2: STAT] [1: 1 – VAR]
9 [=] 7 [=] 11 [=] 10 [=] 13 [=] 7 [=]
[AC]
[SHIFT] [1] (STAT) [4: VAR]
[3: σx] [=]

SHARP EL-W535HT calculator
Press the following keys:
[MODE] [1: STAT] [0: SD]
9 [CHANGE] 7 [CHANGE]
11 [CHANGE] 10 [CHANGE]
13 [CHANGE] 7 [CHANGE]
[ALPHA] [6: σx] [=]

So the Standard Deviation ≈ 2,141
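The calculator result can be cross-checked with Python's standard `statistics` module. This is a sketch for comparison only: `statistics.pstdev` divides by n, which matches the formula used in this chapter, whereas `statistics.stdev` divides by n – 1 and gives a different (sample) value.

```python
import statistics

data = [9, 7, 11, 10, 13, 7]

# Population standard deviation: divides the sum of squared deviations by n.
population_sd = statistics.pstdev(data)

# Sample standard deviation: divides by n - 1, so it is slightly larger.
sample_sd = statistics.stdev(data)

print(f"Mean = {statistics.mean(data)}")
print(f"Standard deviation (divide by n) = {population_sd:.3f}")
print(f"Standard deviation (divide by n - 1) = {sample_sd:.3f}")
```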

EXERCISE 2.4

Where necessary, give decimal answers correct to 1 decimal place

1) The arm spans (in cm) of the eleven players in each of two different soccer teams A
and B are recorded.
a) The arm spans for TEAM A are:
203, 214, 187, 188, 196, 199, 205, 203, 199, 194 and 206
i) Calculate the mean of the arm spans using the formula: $\bar{x} = \dfrac{\sum x}{n}$.
ii) Copy and complete the table given.
iii) Calculate the standard deviation of the arm spans using the formula:
$\sigma = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}}$.

x          x − x̄          (x − x̄)²
203
214
187
188
196
199
205
203
199
194
206
n =                        Σ(x − x̄)² =

b) For TEAM B, the variance is 875 cm². Calculate the standard deviation of the
arm spans of TEAM B.
c) Make a comment about the dispersion of the arm spans of the players in both
teams.


2) The time (in minutes) taken by a group of athletes from Lesiba High School to run a
3 km cross country race is: 18 21 16 24 28 20 22 29 19 23
Use your calculator to determine
a) The mean time taken to complete the race.
b) The standard deviation of the time taken to complete the race.

3) The following tables show the masses (in kilograms) of the A and B rugby teams at
Sir John Adamson High School:
TEAM A
51 82 71 64 81 81 76 77 62 68
70 74 81 61 68 69 67 71 68 74
80 62 70 68 62

TEAM B
83 79 67 79 87 62 60 83 76 79
94 110 73 97 70 68 103 85 74 55
47 63 62 87 74

a) Use your calculator to determine the mean and the standard deviations of each
data set.
b) Is the standard deviation a good measure for determining which team plays
better? Give reasons for your answer.
4) As part of the Census@School, learners had to record the length (in centimetres) of
their right foot without a shoe. The girls (G) and boys (B) in Grade 11C measured
their foot lengths and recorded the results in the following table.
G: 29 22 28 23 23 29 29 25 27 23 27 21 24 21 20 25 22 29
B: 28 30 26 29 25 28 26 25 28 22 30 25 21 27 25 23
a) Use your calculator to determine the mean and standard deviation of the foot
lengths of
i) The girls
ii) The boys.
b) Use the mean and the standard deviation of the foot lengths to comment on the
differences in foot sizes of the two groups.


SYMMETRIC AND SKEWED DATA


✓ A measure of shape describes the distribution of the data within a data set.

✓ A distribution of data values can be symmetric or skewed.

 In a symmetric distribution, the two sides of the distribution are a mirror
image of each other.

 In a skewed distribution, the two sides of the distribution are NOT mirror
images of each other.

✓ Both frequency polygons and box-and-whisker diagrams can be used to
illustrate symmetric and skewed data.

KEY FEATURES OF A SYMMETRIC DISTRIBUTION


 The shape is symmetrical
 The mean, median and mode have the same value.
 Most of the data are clustered around the centre.
In fact, when the distribution is also bell-shaped (a normal distribution):
About 68% of the data lie within 1 standard deviation of the mean.
About 95% of the data lie within 2 standard deviations of the mean.
About 99,7% of the data lie within 3 standard deviations of the mean.
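
The three percentages above can be checked against the normal (bell-shaped) distribution, for which they hold. The following is a minimal sketch, assuming Python 3.8 or later and its standard statistics.NormalDist class; the helper name share_within is ours and does not come from the text.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s) of the mean: {share_within(k):.1%}")
```

The printed values round to about 68%, 95% and 99,7%. A symmetric distribution that is not bell-shaped can give different percentages.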

KEY FEATURES OF SKEWED DATA


Skewness is the tendency for the values to occur more frequently around the high or
low ends of the x-axis.
 With a positively skewed distribution, the tail on the right side is longer
than the left side.
Most of the values tend to cluster toward the left side of the x-axis (i.e. the
smaller values) with increasingly fewer values on the right side of the x-axis
(i.e. the larger values).
 With a negatively skewed distribution, the tail on the left side is longer than
the right side.
Most of the values tend to cluster toward the right side of the x-axis (i.e. the
larger values) with increasingly fewer values on the left side of the x-axis
(i.e. the smaller values).

EXAMPLE 8


The Grade 10 learners of Leihlo Secondary School, Helen Frans
Secondary School and Pitseng Secondary School attended a meeting at
a hall in Senwabarwana about the problems they have encountered with
the bus company which transports them to school.
The following table shows the time the learners spent in the meeting:
                                  Frequency
Time spent      Midpoint of       Leihlo        Helen Frans   Pitseng
in the hall     the intervals     Secondary     Secondary     Secondary
(in minutes)    (in minutes)      School        School        School
0 < t ≤ 5       2,5               0             0             0
5 < t ≤ 10      7,5               25            10            40
10 < t ≤ 15     12,5              41            18            55
15 < t ≤ 20     17,5              60            25            74
20 < t ≤ 25     22,5              73            32            60
25 < t ≤ 30     27,5              81            40            50
30 < t ≤ 35     32,5              73            60            35
35 < t ≤ 40     37,5              64            50            26
40 < t ≤ 45     42,5              55            44            17
45 < t ≤ 50     47,5              25            35            15
50 < t ≤ 55     52,5              0             0             0

a) Draw frequency polygons to represent the time spent by learners
of each school in the hall.
b) Describe the shapes of the polygons.
SOLUTION:
a) [Figure: TIME SPENT BY THE LEARNERS IN THE HALL. Three frequency polygons
(Leihlo, Helen Frans, Pitseng). Horizontal axis: Time in minutes, marked at the
interval midpoints 2,5 to 52,5. Vertical axis: Frequency, from 0 to 100.]

b) The data from Leihlo Secondary is symmetric.
The data from Helen Frans is not symmetric. It is more spread out on the left and
clustered more closely together on the right. We say that it is skewed left.
The data from Pitseng is also not symmetric. It is more spread out on the right and
clustered more closely together on the left. We say that it is skewed right.

✓ Note that if the mean and the median of a data set are known, then
 If mean – median ≈ 0, then the distribution is symmetric
 If mean – median > 0, then the distribution is positively skewed
 If mean – median < 0, then the distribution is negatively skewed
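
The rule above can be sketched as a small Python function. This is only an illustration: the function name describe_skew and the tolerance used to decide when a difference counts as "approximately zero" are our assumptions, since the text does not prescribe a cut-off. The values passed in are the means and medians quoted in Example 9.

```python
def describe_skew(mean, median, tolerance=0.1):
    """Classify a distribution from the sign of mean - median.

    `tolerance` is an assumed cut-off for "approximately zero";
    the text does not give a specific value.
    """
    difference = mean - median
    if abs(difference) <= tolerance:
        return "symmetric"
    if difference > 0:
        return "positively skewed"
    return "negatively skewed"


# Means and medians (in minutes) as quoted in Example 9.
print("Leihlo:", describe_skew(27.5, 27.5))
print("Helen Frans:", describe_skew(31.6, 32.5))
print("Pitseng:", describe_skew(22.99, 22.5))
```

With the assumed tolerance of 0,1 this labels Leihlo as symmetric, Helen Frans as negatively skewed and Pitseng as positively skewed. A larger tolerance would treat small differences as symmetric.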

✓ Note that for a box-and-whisker diagram
 If the distribution is symmetric, the median is in the middle of the box and
the whiskers are equal in length.
 When the data is more spread out on the left side and clustered on the right, the
distribution is said to be negatively skewed or skewed to the left.
 When the data is more spread out on the right side and clustered on the left, the
distribution is said to be positively skewed or skewed to the right.
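
The reading of a box-and-whisker diagram described above can be approximated from the five-number summary alone. The sketch below is a simplified check and not a formal test; the function name skew_from_five_numbers and its tolerance parameter are our assumptions. The summaries used are the ones quoted in Example 9.

```python
def skew_from_five_numbers(minimum, q1, median, q3, maximum, tolerance=0.0):
    """Rough shape check based on a five-number summary.

    Symmetric: the median is in the middle of the box and the whiskers
    are equally long. Otherwise the side on which the data stretch
    further from the median gives the direction of the skew.
    """
    box_left, box_right = median - q1, q3 - median
    whisker_left, whisker_right = q1 - minimum, maximum - q3
    if (abs(box_left - box_right) <= tolerance
            and abs(whisker_left - whisker_right) <= tolerance):
        return "symmetric"
    left_spread = box_left + whisker_left
    right_spread = box_right + whisker_right
    if left_spread > right_spread:
        return "negatively skewed (skewed left)"
    if right_spread > left_spread:
        return "positively skewed (skewed right)"
    return "unclear"


# Five-number summaries (in minutes) as quoted in Example 9.
print("Leihlo:", skew_from_five_numbers(7.5, 17.5, 27.5, 37.5, 47.5))
print("Helen Frans:", skew_from_five_numbers(7.5, 22.5, 32.5, 37.5, 47.5))
print("Pitseng:", skew_from_five_numbers(7.5, 12.5, 22.5, 32.5, 47.5))
```

The "unclear" result covers summaries where the box and the whiskers point in opposite directions. In such cases a frequency polygon is a better guide.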


EXAMPLE 9
Use the data given in Example 8 for the following:
a) Calculate the mean and the five-number-summary for the time spent
by learners in each school.
b) Draw box-and-whisker diagrams to represent the data.
c) State whether each data set is symmetric, positively skewed or
negatively skewed.
SOLUTION:
Leihlo Secondary School
a) x̄ ≈ 27,5 minutes
Minimum value ≈ 7,5 minutes
Lower quartile = Q1 ≈ 17,5 minutes
Median ≈ 27,5 minutes
Upper quartile = Q3 ≈ 37,5 minutes
Maximum value ≈ 47,5 minutes
b) [Figure: box-and-whisker diagram for Leihlo Secondary School, drawn on a
scale from 5 to 50 minutes.]

c) Mean – median = 27,5 minutes – 27,5 minutes = 0
This means that the distribution is symmetric.

Helen Frans Secondary School


a) x̄ ≈ 31,6 minutes
Minimum value ≈ 7,5 minutes
Lower quartile = Q1 ≈ 22,5 minutes
Median ≈ 32,5 minutes
Upper quartile = Q3 ≈ 37,5 minutes
Maximum value ≈ 47,5 minutes
b) [Figure: box-and-whisker diagram for Helen Frans Secondary School, drawn on a
scale from 5 to 50 minutes.]

c) Mean – median = 31,6 minutes – 32,5 minutes = –0,9 minutes
This means that the distribution is negatively skewed (or skewed left).

Pitseng Secondary School


a) x̄ ≈ 22,99 minutes
Minimum value ≈ 7,5 minutes
Lower quartile = Q1 ≈ 12,5 minutes
Median ≈ 22,5 minutes
Upper quartile = Q3 ≈ 32,5 minutes
Maximum value ≈ 47,5 minutes
b) [Figure: box-and-whisker diagram for Pitseng Secondary School, drawn on a
scale from 5 to 50 minutes.]

c) Mean – median = 22,99 minutes – 22,5 minutes = 0,49
This means that the distribution is positively skewed (or skewed right).
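
The means and medians for Helen Frans and Pitseng can be checked without a calculator. The sketch below treats every learner's time as the midpoint of its interval, which is the usual estimate for grouped data. The function names grouped_mean and grouped_median are ours, and the two empty intervals (midpoints 2,5 and 52,5) are left out because they do not change the result.

```python
from statistics import median

# Interval midpoints (in minutes) and frequencies from the table in Example 8.
midpoints = [7.5, 12.5, 17.5, 22.5, 27.5, 32.5, 37.5, 42.5, 47.5]
frequencies = {
    "Helen Frans": [10, 18, 25, 32, 40, 60, 50, 44, 35],
    "Pitseng": [40, 55, 74, 60, 50, 35, 26, 17, 15],
}


def grouped_mean(midpoints, freqs):
    """Estimate the mean: sum of (midpoint x frequency) divided by total frequency."""
    total = sum(m * f for m, f in zip(midpoints, freqs))
    return total / sum(freqs)


def grouped_median(midpoints, freqs):
    """Estimate the median by listing each midpoint as often as its frequency."""
    expanded = [m for m, f in zip(midpoints, freqs) for _ in range(f)]
    return median(expanded)


for school, freqs in frequencies.items():
    mean_time = grouped_mean(midpoints, freqs)
    median_time = grouped_median(midpoints, freqs)
    print(f"{school}: mean = {mean_time:.2f}, median = {median_time}, "
          f"mean - median = {mean_time - median_time:.2f}")
```

The estimates agree with the values quoted in the solution to within rounding: a mean of about 31,6 minutes and a median of 32,5 minutes for Helen Frans, and a mean of about 23,0 minutes and a median of 22,5 minutes for Pitseng.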

When we draw all three box and whisker diagrams on the same page, we can
immediately see that the Leihlo data is symmetric, the Helen Frans data is negatively
skewed, and the Pitseng data is positively skewed.

[Figure: box-and-whisker diagrams for Pitseng, Helen Frans and Leihlo, drawn one
above the other on a common scale from 5 to 50 minutes.]

EXERCISE 2.5

1) The box-and-whisker diagrams of two data sets, A and B, are shown below.

[Figure: box-and-whisker diagrams of data sets A and B.]

a) Write down what is common to both sets of data.
b) Which data set is symmetrical? State the reasons.
c) Is the other data set skewed left or right? State the reasons.
2) For the 2009 Census@School, 47 Grade 11 learners recorded how long (in minutes)
it took them to travel to school. The following data was obtained:
Time (in minutes) Frequency
5 < t ≤ 10 1
10 < t ≤ 15 5
15 < t ≤ 20 9
20 < t ≤ 25 13
25 < t ≤ 30 11
30 < t ≤ 35 8

a) Use the given information to determine the five-number summary.
b) Draw a box-and-whisker diagram to illustrate the five-number summary.
c) Comment on the spread of the time taken to travel to school.

3) Three high schools in Limpopo have a total number of 132 Grade 12 learners. These
learners completed the question in the 2009 Census@School where they were asked
to record the distance (in kilometres) they travel each day from home to school. The
results of the survey are shown in the grouped frequency table below.

Distance in         Midpoint of     Frequency
kilometres (x)      intervals
0 < x ≤ 5                           12
5 < x ≤ 10                          29
10 < x ≤ 15                         13
15 < x ≤ 20                         63
20 < x ≤ 25                         12
25 < x ≤ 30                         3

a) Copy and complete the table.
b) Draw a frequency polygon to illustrate the data.
c) Determine the median of the data in the table.
d) Use your calculator to determine the mean of the data.
e) Calculate mean – median
f) By referring to the shape of the polygon and the relationship between the mean
and the median, state whether the distribution of the data is symmetric,
positively skewed or negatively skewed.

OUTLIERS
✓ An outlier is a data entry that is far removed from the other entries in the data
set e.g. a data entry that is much smaller or much larger than the rest of the data
values.

✓ An outlier has an influence on the mean and the range of the data set, but has little or no
influence on the median or the lower and upper quartiles.

✓ An outlier can affect the skewness of the data.

✓ Any data item that is less than Q1 – 1,5 × IQR OR more than Q3 + 1,5 × IQR is
an outlier.
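
The effect of an outlier on the mean, the range and the median can be seen in a short Python sketch. It uses the data set from the worked example that follows, together with a hypothetical copy in which the largest value, 32, is replaced by a much more extreme 320. The value 320 is made up for illustration only.

```python
from statistics import mean, median

# Data set from the worked example on outliers.
original = [1, 8, 12, 14, 14, 15, 17, 17, 19, 26, 32]
# Hypothetical copy: the largest value is replaced by a far more extreme one.
with_extreme = original[:-1] + [320]

for label, data in (("original", original), ("with 320", with_extreme)):
    data_range = max(data) - min(data)
    print(f"{label}: mean = {mean(data):.1f}, median = {median(data)}, "
          f"range = {data_range}")
```

The mean rises from about 15,9 to about 42,1 and the range from 31 to 319, while the median stays at 15.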


EXAMPLE 1
Investigate the following data set:
1, 8, 12, 14, 14, 15, 17, 17, 19, 26, 32
a) Calculate (where necessary correct to 1 decimal place)
i) The mean
ii) The median
iii) The interquartile range
b) Are any of the entries in the data set outliers?
SOLUTION:
a) i) Mean = x̄ = (1 + 8 + 12 + 14 + 14 + 15 + 17 + 17 + 19 + 26 + 32) / 11
             = 175 / 11
             = 15,9090...
      x̄ ≈ 15,9
ii) There are 11 terms so the median is the 6th term.
Median = 15
iii) There are 5 terms less than the median so Q1 is the 3rd term. So Q1 = 12.
To find Q3 we add 3 terms to the position of the median and get the 9th term.
So Q3 = 19
IQR = 19 – 12 = 7
b) Lower outlier < Q1 – 1,5 × IQR
                 < 12 – 1,5 × 7
                 < 1,5
   So 1 is an outlier.
   Upper outlier > Q3 + 1,5 × IQR
                 > 19 + 1,5 × 7
                 > 29,5
   And 32 is also an outlier.
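
The steps of this example can be sketched in Python. The function names quartiles and find_outliers are ours. The quartiles are found as in the solution above: Q1 and Q3 are the medians of the lower and upper halves of the ordered data, and the overall median is left out of both halves when the number of values is odd. Other software may use a different quartile rule and give slightly different fences.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    ordered = sorted(values)
    n = len(ordered)
    lower_half = ordered[: n // 2]          # values below the median position
    upper_half = ordered[(n + 1) // 2:]     # values above the median position
    return median(lower_half), median(upper_half)


def find_outliers(values):
    """Return the lower fence, the upper fence and the list of outliers."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return lower_fence, upper_fence, outliers


data = [1, 8, 12, 14, 14, 15, 17, 17, 19, 26, 32]
print(quartiles(data))      # (12, 19)
print(find_outliers(data))  # (1.5, 29.5, [1, 32])
```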

EXERCISE 2.6

1) Determine the interquartile range and then find outliers (if there are any) for the
following set of data:
10,2 ; 14,1 ; 14,4 ; 14,4 ; 14,5 ; 14,5 ; 14,6 ;
14,7 ; 14,7 ; 14,9 ; 15,1 ; 15,9 ; 16,4 ; 18,9

2) A class of 20 learners has to submit Mathematics assessment tasks over the course
of the year. While some learners were conscientious others were not.
The following table shows the number of assessment tasks each learner handed in:
9 5 11 8 12 2 6 9 15 10
12 6 9 3 9 13 14 16 4 7
a) Determine the IQR
b) Determine the outliers (if any).

3) The following are the ages of boys in one of the Grade 8 classes of Dendron
Secondary School:
12 12 13 14 14 13 12 15 15 14 12 19 14 12 9
a) Determine the five-number summary.
b) Determine the outliers, if any.

REFERENCES
Bowie, L. et al. (2007). Focus on Mathematical Literacy Grade 12. Maskew Miller
Longman.
Freund, J. E. (1999). Statistics: A First Course. Prentice Hall, New Jersey.
Larson, R. and Farber, B. (2006). Elementary Statistics: Picturing the World. Third
Edition. Pearson, Prentice Hall.
Upton, G. and Cook, I. (2001). Introducing Statistics. 2nd edition. Oxford.
Statistics South Africa (2010). Census At School Results (2009).
The Answer Series. Grade 12 Mathematics Paper 3: notes, questions and answers.
