5.1 Representation of Data (Easy)
Subject: Mathematics
Syllabus Code: 9709
Level: AS Level
Component: Probability and Statistics 1
Topic: 5.1 Representation of Data
Difficulty: Easy
Questions
1. For n values of the variable x, it is given that
2. A driver records the distance travelled in each of 150 journeys. These distances, correct to the
nearest km, are summarised in the table. (9709/52/F/M/21 number 5)
Distance (km)   0 − 4   5 − 10   11 − 20   21 − 30   31 − 40   41 − 60
Frequency       12      16       32        66        20        4
3. The times taken by 200 players to solve a computer puzzle are summarised in the following table.
(9709/51/M/J/21 number 5)
4. The heights in cm of 160 sunflower plants were measured. The results are summarised on the
cumulative frequency curve. (9709/53/M/J/21 number 1)
[Figure: cumulative frequency curve. x-axis: Height (cm), 0 to 240; y-axis: cumulative frequency, 0 to 160.]
(a) Use the graph to estimate the number of plants with heights less than 100 cm.
(b) Use the graph to estimate the 65th percentile of the distribution.
(c) Use the graph to estimate the interquartile range of the heights of these plants.
5. A sports club has a volleyball team and a hockey team. The heights of the 6 members of the
volleyball team are summarised by $\Sigma x = 1050$ and $\Sigma x^2 = 193\,700$, where x is the height of a
member in cm. The heights of the 11 members of the hockey team are summarised by $\Sigma y = 1991$
and $\Sigma y^2 = 366\,400$, where y is the height of a member in cm. (9709/53/M/J/21 number 3)
(a) Find the mean height of all 17 members of the club.
(b) Find the standard deviation of the heights of all 17 members of the club.
6. A summary of 40 values of x gives the following information:
$\Sigma(x - k) = 520$, $\Sigma(x - k)^2 = 9640$,
where k is a constant. (9709/51/O/N/21 number 2)
(a) Given that the mean of these 40 values of x is 34, find the value of k.
(b) Find the variance of these 40 values of x.
7. For n values of the variable x, it is given that
Σ(x − 200) = 446 and Σx = 6846
Find the value of n. (9709/52/M/J/22 number 1)
8. The time taken, t minutes, to complete a puzzle was recorded for each of 150 students. These
times are summarised in the table. (9709/53/M/J/22 number 1)
9. Twenty children were asked to estimate the height of a particular tree. Their estimates, in metres,
were as follows. (9709/53/M/J/22 number 2)
4.1 4.2 4.4 4.5 4.6 4.8 5.0 5.2 5.3 5.4
5.5 5.8 6.0 6.2 6.3 6.4 6.6 6.8 6.9 19.4
12. The times taken by 120 children to complete a particular puzzle are represented in the cumulative
frequency graph. (9709/51/O/N/23 number 1)
[Figure: cumulative frequency graph. x-axis: Time (seconds), 0 to 50; y-axis: Cumulative Frequency, 0 to 120.]
(a) Use the graph to estimate the interquartile range of the data.
35% of the children took longer than T seconds to complete the puzzle.
(b) Use the graph to estimate the value of T.
Answers
1. For n values of the variable x, it is given that
$\overline{x - c} = \bar{x} - c$
We can create two equations using the formula for the mean,
$\overline{x - 50} = \frac{\Sigma(x - 50)}{n}$, $\bar{x} = \frac{\Sigma x}{n}$
Make n the subject of the formula,
50n = 800
n = 16
2. A driver records the distance travelled in each of 150 journeys. These distances, correct to the
nearest km, are summarised in the table. (9709/52/F/M/21 number 5)
Distance (km)   0 − 4   5 − 10   11 − 20   21 − 30   31 − 40   41 − 60
Frequency       12      16       32        66        20        4
Notice that there are gaps between the classes in our table: 0 − 4 is followed by 5 − 10, so there is
a gap between 4 and 5. To fix this we apply a continuity correction. This means that
you should subtract 0.5 from the lower bounds and add 0.5 to the upper bounds. The first
lower bound stays at 0, because a distance cannot be negative,
Distance (km)          0 − 4.5   4.5 − 10.5   10.5 − 20.5   20.5 − 30.5   30.5 − 40.5   40.5 − 60.5
Frequency              12        16           32            66            20            4
Cumulative Frequency   12        28           60            126           146           150
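The running totals in the cumulative frequency row can be checked with a short Python sketch. The variable names are illustrative only, and the starting point (0, 0) is an assumption based on the first class beginning at 0 km.

```python
from itertools import accumulate

# Upper class boundaries and frequencies copied from the corrected table.
upper_bounds = [4.5, 10.5, 20.5, 30.5, 40.5, 60.5]
frequencies = [12, 16, 32, 66, 20, 4]

# Each cumulative frequency is the running total of the frequencies so far.
cumulative = list(accumulate(frequencies))

# Points to plot: (upper class boundary, cumulative frequency).
# Assumption: the curve starts at (0, 0), the lower boundary of the first class.
points = [(0, 0)] + list(zip(upper_bounds, cumulative))

print(cumulative)
print(points)
```

The last cumulative frequency should equal the total number of journeys, which is a quick way to catch an addition slip.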
Label the y-axis as cumulative frequency and the x-axis as distance in km. Plot the
upper bounds against the cumulative frequency. Join the points to form an s-shaped
curve,
[Figure: cumulative frequency curve. x-axis: Distance (km), 0 to 60; y-axis: cumulative frequency, 0 to 150.]
(b) For 30% of these journeys the distance travelled is d km or more. Use your graph to estimate
the value of d.
This means that 70% of the journeys have a distance less than d km. Let’s find 70%
of 150,
$\frac{70}{100} \times 150 = 105$
Draw construction lines at a cumulative frequency of 105 and read off the distance,
d = 27
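As a rough check on the reading from the graph, the sketch below joins the plotted points with straight lines and interpolates at a cumulative frequency of 105. This is an assumption: the hand-drawn curve is smooth, so the value read from it may differ slightly. The function name `read_off` is hypothetical.

```python
def read_off(target_cf, points):
    """Linearly interpolate the x-value at a given cumulative frequency."""
    for (x0, cf0), (x1, cf1) in zip(points, points[1:]):
        if cf0 <= target_cf <= cf1:
            return x0 + (target_cf - cf0) / (cf1 - cf0) * (x1 - x0)
    raise ValueError("target lies outside the plotted range")

# (upper class boundary, cumulative frequency), starting from (0, 0).
points = [(0, 0), (4.5, 12), (10.5, 28), (20.5, 60),
          (30.5, 126), (40.5, 146), (60.5, 150)]

target = 70 * 150 / 100          # 70% of the 150 journeys
d_estimate = read_off(target, points)
print(round(d_estimate, 1))
```

The interpolated value rounds to the same 27 km read from the graph.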
Distance (km)   0 − 4.5   4.5 − 10.5   10.5 − 20.5   20.5 − 30.5   30.5 − 40.5   40.5 − 60.5
Frequency       12        16           32            66            20            4
Calculate the mean using the formula,
$\bar{x} = \frac{\Sigma xf}{\Sigma f}$
Simplify,
$\bar{x} = \frac{3238}{150}$
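The total of 3238 comes from multiplying each class midpoint by its frequency. A minimal Python sketch of that calculation, assuming the corrected class boundaries from the table above (variable names are illustrative):

```python
# Corrected class boundaries and frequencies from the table.
bounds = [0, 4.5, 10.5, 20.5, 30.5, 40.5, 60.5]
frequencies = [12, 16, 32, 66, 20, 4]

# The midpoint of each class stands in for every value in that class.
midpoints = [(lo + hi) / 2 for lo, hi in zip(bounds, bounds[1:])]

sum_xf = sum(x * f for x, f in zip(midpoints, frequencies))
sum_f = sum(frequencies)
mean = sum_xf / sum_f

print(sum_xf, sum_f, round(mean, 1))
```

Because midpoints replace the raw distances, the result is only an estimate of the mean.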
We first need to find the class width and the frequency density,
Class width         10    10    20    20    40
Frequency Density   1.6   5.4   3.9   1.6   0.5
Label the y-axis with frequency density and x-axis with time in t seconds. Plot the
time classes against the frequency density, ensuring each class has the respective
class width,
[Figure: histogram. x-axis: Time (t seconds), with ticks at 10, 20, 40, 60 and 100; y-axis: Frequency density.]
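The frequency densities can be reproduced by dividing each frequency by its class width. The sketch below assumes the class boundaries and player counts given in the table for this question; the variable names are illustrative.

```python
# Class boundaries (seconds) and number of players in each class.
bounds = [0, 10, 20, 40, 60, 100]
players = [16, 54, 78, 32, 20]

# Class width = upper boundary minus lower boundary.
widths = [hi - lo for lo, hi in zip(bounds, bounds[1:])]

# Frequency density = frequency / class width.
densities = [f / w for f, w in zip(players, widths)]

print(widths)
print([round(d, 1) for d in densities])
```

In a histogram the area of each bar, not its height, represents the frequency, which is why unequal class widths need this step.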
(b) Calculate an estimate of the mean time taken by these 200 players.
Time (t seconds)   0 ≤ t < 10   10 ≤ t < 20   20 ≤ t < 40   40 ≤ t < 60   60 ≤ t < 100
No. of players     16           54            78            32            20
Let’s start by finding the midpoints of each class,
Midpoint         5    15   30   50   80
No. of players   16   54   78   32   20
Simplify,
$\bar{x} = \frac{6430}{200}$
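To find the greatest possible value of the interquartile range we need to find the
maximum value of the upper quartile and the minimum value of the lower quartile.
The upper quartile is at position,
$\frac{3}{4}n = \frac{3}{4} \times 200 = 150$
When we add up the frequencies, we notice that the 150th value lies in the class,
$40 \le t < 60$
The maximum value in this class is 60,
$q_3 = 60$
Similarly, the lower quartile is at position $\frac{1}{4} \times 200 = 50$, which lies in the class,
$10 \le t < 20$
The minimum value in this class is 10,
$q_1 = 10$
$IQR = q_3 - q_1$
$IQR = 60 - 10 = 50$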
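The search for the classes containing the quartiles can be sketched in Python as below. It assumes the class boundaries and frequencies from the table for this question, and the helper name `class_containing` is hypothetical.

```python
from itertools import accumulate

# Classes (seconds) and number of players, from the table for this question.
classes = [(0, 10), (10, 20), (20, 40), (40, 60), (60, 100)]
players = [16, 54, 78, 32, 20]
cumulative = list(accumulate(players))

def class_containing(position):
    """Return the first class whose cumulative frequency reaches the position."""
    for bounds, cf in zip(classes, cumulative):
        if position <= cf:
            return bounds
    raise ValueError("position exceeds the total frequency")

n = sum(players)
q1_class = class_containing(n / 4)        # class holding the lower quartile
q3_class = class_containing(3 * n / 4)    # class holding the upper quartile

# Greatest IQR: top of the upper-quartile class minus bottom of the lower one.
greatest_iqr = q3_class[1] - q1_class[0]
print(q1_class, q3_class, greatest_iqr)
```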
4. The heights in cm of 160 sunflower plants were measured. The results are summarised on the
cumulative frequency curve. (9709/53/M/J/21 number 1)
[Figure: cumulative frequency curve. x-axis: Height (cm), 0 to 240; y-axis: Cumulative Frequency, 0 to 160.]
(a) Use the graph to estimate the number of plants with heights less than 100 cm.
Draw construction lines at a height of 100 cm, and read off the respective cumulative
frequency,
60
Therefore, the final answer is,
60
(b) Use the graph to estimate the 65th percentile of the distribution.
The 65th percentile is at a cumulative frequency of $\frac{65}{100} \times 160 = 104$.
Draw construction lines at a cumulative frequency of 104, and read off the respective
height,
136 cm
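(c) Use the graph to estimate the interquartile range of the heights of these plants.
The formula for the interquartile range is,
$IQR = q_3 - q_1$
The lower quartile is at a cumulative frequency of $\frac{1}{4} \times 160 = 40$. Draw construction lines at the cumulative frequency 40 and read off the height,
$q_1 = 76$
The upper quartile is at a cumulative frequency of $\frac{3}{4} \times 160 = 120$. Draw construction lines at the cumulative frequency 120 and read off the height,
$q_3 = 150$
Substitute into the formula,
$IQR = 150 - 76$
$IQR = 74$ cm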
5. A sports club has a volleyball team and a hockey team. The heights of the 6 members of the
volleyball team are summarised by $\Sigma x = 1050$ and $\Sigma x^2 = 193\,700$, where x is the height of a
member in cm. The heights of the 11 members of the hockey team are summarised by $\Sigma y = 1991$
and $\Sigma y^2 = 366\,400$, where y is the height of a member in cm. (9709/53/M/J/21 number 3)
(a) Find the mean height of all 17 members of the club.
Simplify,
$\hat{\mu} = \frac{3041}{17}$
$\hat{\mu} = 178.9$
(b) Find the standard deviation of the heights of all 17 members of the club.
Simplify,
$\sigma_{x+y} = 30.8$
Therefore, the final answer is,
$\sigma_{x+y} = 30.8$
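Both results can be checked by pooling the two sets of summary totals before applying the usual formulas. The Python sketch below uses only the sums given in the question; the variable names are illustrative.

```python
from math import sqrt

# Summary totals given in the question.
n_x, sum_x, sum_x2 = 6, 1050, 193_700      # volleyball team
n_y, sum_y, sum_y2 = 11, 1991, 366_400     # hockey team

# Pool the totals: counts, sums and sums of squares simply add.
n = n_x + n_y
total = sum_x + sum_y
total_sq = sum_x2 + sum_y2

mean = total / n
variance = total_sq / n - mean ** 2        # mean of squares minus square of mean
sd = sqrt(variance)

print(total, round(mean, 1), round(sd, 1))
```

Note that the two teams' standard deviations cannot simply be averaged; the pooled sums are needed.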
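6. A summary of 40 values of x gives the following information:
$\Sigma(x - k) = 520$, $\Sigma(x - k)^2 = 9640$,
where k is a constant. (9709/51/O/N/21 number 2)
(a) Given that the mean of these 40 values of x is 34, find the value of k.
$\bar{x} = 34$, $\Sigma(x - k) = 520$, $n = 40$
$\overline{x - k} = \frac{\Sigma(x - k)}{n}$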
Solve for k,
34 − k = 13
k = 34 − 13
k = 21
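(b) Find the variance of these 40 values of x.
$\Sigma(x - k)^2 = 9640$, $n = 40$, $k = 21$, $\bar{x} = 34$
Subtracting a constant from every value does not change the spread, so the variance
$\sigma^2_x$ is the same as the variance $\sigma^2_{x-k}$. In this case we are given more
information about $x - k$, so let’s find $\sigma^2_{x-k}$,
$\sigma^2_{x-k} = \frac{\Sigma(x - k)^2}{n} - \left(\overline{x - k}\right)^2$
We need to find $\overline{x - k}$. Remember the idea we used above,
$\overline{x - k} = \bar{x} - k$
$\overline{x - k} = 34 - 21$
$\overline{x - k} = 13$
Substitute and simplify,
$\sigma^2_{x-k} = \frac{9640}{40} - 13^2$
$\sigma^2_{x-k} = 72$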
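The two parts of this question can be verified with a few lines of Python. This is a sketch using only the coded totals from the question, and the variable names are illustrative.

```python
# Coded summary totals from the question.
n = 40
sum_coded = 520          # sum of (x - k)
sum_coded_sq = 9640      # sum of (x - k)^2
mean_x = 34

# The mean of the coded values is the mean of x shifted down by k.
mean_coded = sum_coded / n
k = mean_x - mean_coded

# Shifting by a constant leaves the variance unchanged,
# so the variance of x equals the variance of (x - k).
variance = sum_coded_sq / n - mean_coded ** 2

print(k, variance)
```

The standard deviation would be the square root of this variance, but the question asks only for the variance.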
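7. For n values of the variable x, it is given that
$\Sigma(x - 200) = 446$ and $\Sigma x = 6846$
Find the value of n. (9709/52/M/J/22 number 1)
$\overline{x - c} = \bar{x} - c$
We can create two equations using the formula for the mean,
$\overline{x - 200} = \frac{\Sigma(x - 200)}{n}$, $\bar{x} = \frac{\Sigma x}{n}$
Substitute the given sums into the two equations,
$\overline{x - 200} = \frac{446}{n}$, $\bar{x} = \frac{6846}{n}$
Since $\overline{x - 200} = \bar{x} - 200$,
$\frac{446}{n} = \frac{6846}{n} - 200$
$200n = 6400$
$n = 32$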
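The same result follows from expanding the coded sum, since the sum of (x − c) over n values equals the sum of x minus c times n. A minimal Python check, with illustrative variable names:

```python
# Totals given in the question.
c = 200
sum_coded = 446      # sum of (x - 200)
sum_x = 6846         # sum of x

# sum(x - c) = sum(x) - c * n, so n = (sum(x) - sum(x - c)) / c.
n, remainder = divmod(sum_x - sum_coded, c)

print(n, remainder)
```

A remainder of zero confirms that n is a whole number, as it must be for a count of values.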
8. The time taken, t minutes, to complete a puzzle was recorded for each of 150 students. These
times are summarised in the table. (9709/53/M/J/22 number 1)
Label the y-axis as cumulative frequency and the x-axis as time taken (t minutes).
Plot the upper bounds of the time taken against the cumulative frequency. Join
the points to form an s-shaped curve,
[Figure: cumulative frequency curve. x-axis: Time (t minutes), 0 to 200; y-axis: Cumulative Frequency, 0 to 160.]
(b) Use your graph to estimate the 20th percentile of the data.
Draw construction lines at a cumulative frequency of 30 and read off the respective
time taken,
t = 38
Therefore, the final answer is,
t = 38
9. Twenty children were asked to estimate the height of a particular tree. Their estimates, in metres,
were as follows. (9709/53/M/J/22 number 2)
4.1 4.2 4.4 4.5 4.6 4.8 5.0 5.2 5.3 5.4
5.5 5.8 6.0 6.2 6.3 6.4 6.6 6.8 6.9 19.4
Simplify,
Σx = 123.4
We have a total of 20 values and the data are already arranged in rank order. The
median lies between the 10th and 11th data points, because its position is,
$\frac{n + 1}{2} = \frac{20 + 1}{2} = 10.5$
The 10th and 11th data points are 5.4 and 5.5 respectively. The median is the average
of those two points,
$q_2 = \frac{5.4 + 5.5}{2}$
q2 = 5.45
If you analyse the data that we are given, you will notice that we have an anomalous
value, 19.4, i.e. a value that doesn’t follow the trend. This value inflates the mean,
so the mean is not representative of a typical estimate.
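The effect of the anomalous value can be seen by computing both averages for the twenty estimates. The sketch below uses Python's standard `statistics` module; the variable names are illustrative.

```python
from statistics import mean, median

# The twenty estimates (metres), already in rank order.
estimates = [4.1, 4.2, 4.4, 4.5, 4.6, 4.8, 5.0, 5.2, 5.3, 5.4,
             5.5, 5.8, 6.0, 6.2, 6.3, 6.4, 6.6, 6.8, 6.9, 19.4]

total = sum(estimates)
average = mean(estimates)     # pulled upwards by the value 19.4
middle = median(estimates)    # average of the 10th and 11th values

print(round(total, 1), round(average, 2), round(middle, 2))
```

The median depends only on the two middle values, so the single extreme estimate does not move it.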
$\overline{x - c} = \bar{x} - c$
$\overline{x - 20} = 0.7$
$\bar{x} - 20 = 0.7$
$\bar{x} = 20 + 0.7$
$\bar{x} = 20.7$
Simplify,
σ = 72.23
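We need to find $\overline{x - q}$,
$\overline{x - q} = \frac{\Sigma(x - q)}{n}$
$\overline{x - q} = \frac{700}{50}$
$\overline{x - q} = 14$
Simplify,
$\sigma_{x-q} = 9.42$
$\Sigma x = 2865$, $n = 50$, $\overline{x - q} = 14$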
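To find q we will use the idea that,
$\overline{x - q} = \bar{x} - q$
Let’s find $\bar{x}$,
$\bar{x} = \frac{\Sigma x}{n}$
$\bar{x} = \frac{2865}{50}$
$\bar{x} = 57.3$
Substitute into $\overline{x - q} = \bar{x} - q$,
$14 = 57.3 - q$
$q = 43.3$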
12. The times taken by 120 children to complete a particular puzzle are represented in the cumulative
frequency graph. (9709/51/O/N/23 number 1)
[Figure: cumulative frequency graph. x-axis: Time (seconds), 0 to 50; y-axis: Cumulative Frequency, 0 to 120.]
(a) Use the graph to estimate the interquartile range of the data.
The formula for interquartile range is,
IQR = q3 − q1
Draw construction lines at a cumulative frequency of 90 and read off the time,
q3 = 31
Draw construction lines at a cumulative frequency of 30 and read off the time,
q1 = 23.7
Substitute into the formula,
IQR = 31 − 23.7
IQR = 7.3 seconds
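35% of the children took longer than T seconds to complete the puzzle.
(b) Use the graph to estimate the value of T.
This means 65% of the children took less than T seconds to complete the puzzle.
Let’s find 65% of 120,
$\frac{65}{100} \times 120 = 78$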
Draw construction lines at a cumulative frequency of 78 and read off the time. This
is the value of T ,
T = 28.5
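The cumulative frequencies at which the construction lines are drawn all come from the same percentage calculation. The Python sketch below shows it for this question; the helper name `cf_position` is hypothetical, and the quartile values are the readings quoted in the worked answer, not computed from data.

```python
def cf_position(percent, n):
    """Cumulative frequency that corresponds to a given percentile."""
    return percent * n / 100

n = 120                              # number of children

q1_pos = cf_position(25, n)          # lower quartile line
q3_pos = cf_position(75, n)          # upper quartile line
t_pos = cf_position(100 - 35, n)     # 35% took longer, so 65% took less

# Readings taken from the graph in the worked answer above.
q1, q3 = 23.7, 31
iqr = q3 - q1

print(q1_pos, q3_pos, t_pos, round(iqr, 1))
```

Converting "35% took longer" into the 65th percentile first is the step most easily missed, since the graph is cumulative from the left.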