5 1 Representation of Data Easy

Uploaded by

Aaron Bvitira
Copyright
© © All Rights Reserved

Probability and Statistics 1

5.1 Representation of Data - Easy

Subject: Mathematics
Syllabus Code: 9709
Level: AS Level
Component: Probability and Statistics 1
Topic: 5.1 Representation of Data
Difficulty: Easy
Questions
1. For n values of the variable x, it is given that

Σ(x − 50) = 144 and Σx = 944

Find the value of n. (9709/52/M/J/20 number 1)

2. A driver records the distance travelled in each of 150 journeys. These distances, correct to the
nearest km, are summarised in the table. (9709/52/F/M/21 number 5)

Distance (km) | 0 − 4 | 5 − 10 | 11 − 20 | 21 − 30 | 31 − 40 | 41 − 60
Frequency     | 12    | 16     | 32      | 66      | 20      | 4

(a) Draw a cumulative frequency graph to illustrate the data.


(b) For 30% of these journeys the distance travelled is d km or more. Use your graph to estimate
the value of d.
(c) Calculate an estimate of the mean distance travelled for the 150 journeys.

3. The times taken by 200 players to solve a computer puzzle are summarised in the following table.
(9709/51/M/J/21 number 5)

Time (t seconds) | 0 ≤ t < 10 | 10 ≤ t < 20 | 20 ≤ t < 40 | 40 ≤ t < 60 | 60 ≤ t < 100
No. of players   | 16         | 54          | 78          | 32          | 20

(a) Draw a histogram to represent this information.


(b) Calculate an estimate of the mean time taken by these 200 players.
(c) Find the greatest possible value of the interquartile range for these times.

4. The heights in cm of 160 sunflower plants were measured. The results are summarised on the
cumulative frequency curve. (9709/53/M/J/21 number 1)

[Figure: cumulative frequency curve; x-axis: Height (cm), 0 to 240; y-axis: Cumulative Frequency, 0 to 160]

(a) Use the graph to estimate the number of plants with heights less than 100 cm.
(b) Use the graph to estimate the 65th percentile of the distribution.
(c) Use the graph to estimate the interquartile range of the heights of these plants.

5. A sports club has a volleyball team and a hockey team. The heights of the 6 members of the
volleyball team are summarised by Σx = 1050 and Σx² = 193 700, where x is the height of a
member in cm. The heights of the 11 members of the hockey team are summarised by Σy = 1991
and Σy² = 366 400, where y is the height of a member in cm. (9709/53/M/J/21 number 3)
(a) Find the mean height of all 17 members of the club.
(b) Find the standard deviation of the heights of all 17 members of the club.
6. A summary of 40 values of x gives the following information:
Σ(x − k) = 520, Σ(x − k)² = 9640,
where k is a constant. (9709/51/O/N/21 number 2)
(a) Given that the mean of these 40 values of x is 34, find the value of k.
(b) Find the variance of these 40 values of x.
7. For n values of the variable x, it is given that
Σ(x − 200) = 446 and Σx = 6846
Find the value of n. (9709/52/M/J/22 number 1)

8. The time taken, t minutes, to complete a puzzle was recorded for each of 150 students. These
times are summarised in the table. (9709/53/M/J/22 number 1)

Time (t minutes)     | t ≤ 25 | t ≤ 50 | t ≤ 75 | t ≤ 100 | t ≤ 150 | t ≤ 200
Cumulative frequency | 16     | 44     | 86     | 104     | 132     | 150

(a) Draw a cumulative frequency graph to illustrate the data.


(b) Use your graph to estimate the 20th percentile of the data.

9. Twenty children were asked to estimate the height of a particular tree. Their estimates, in metres,
were as follows. (9709/53/M/J/22 number 2)

4.1 4.2 4.4 4.5 4.6 4.8 5.0 5.2 5.3 5.4
5.5 5.8 6.0 6.2 6.3 6.4 6.6 6.8 6.9 19.4

(a) Find the mean of the estimated heights.


(b) Find the median of the estimated heights.
(c) Give a reason why the median is likely to be a more suitable measure of the central tendency
for this information.

10. 50 values of the variable x are summarised by

Σ(x − 20) = 35 and Σx² = 25 036

Find the variance of these 50 values. (9709/53/O/N/22 number 1)

11. A summary of 50 values of x gives

Σ(x − q) = 700, Σ(x − q)² = 14 235,

where q is a constant. (9709/51/M/J/23 number 1)

(a) Find the standard deviation of these values of x.


(b) Given that Σx = 2865, find the value of q.

12. The times taken by 120 children to complete a particular puzzle are represented in the cumulative
frequency graph. (9709/51/O/N/23 number 1)

[Figure: cumulative frequency graph; x-axis: Time (seconds), 0 to 50; y-axis: Cumulative Frequency, 0 to 120]

(a) Use the graph to estimate the interquartile range of the data.

35% of the children took longer than T seconds to complete the puzzle.
(b) Use the graph to estimate the value of T.

Answers
1. For n values of the variable x, it is given that

Σ(x − 50) = 144 and Σx = 944

Find the value of n. (9709/52/M/J/20 number 1)

Σ(x − 50) = 144 and Σx = 944

To solve this question, we will use the idea that,

$$\overline{x - c} = \bar{x} - c$$

We can create two equations using the formula for the mean,

$$\overline{x - 50} = \frac{\sum (x - 50)}{n} \qquad \bar{x} = \frac{\sum x}{n}$$

Substitute in the values of Σ(x − 50) and Σx,

$$\overline{x - 50} = \frac{144}{n} \qquad \bar{x} = \frac{944}{n}$$

Now let’s use the idea that $\overline{x - c} = \bar{x} - c$ in the first equation,

$$\bar{x} - 50 = \frac{144}{n} \qquad \bar{x} = \frac{944}{n}$$

Now we can solve the two equations simultaneously. Substitute $\bar{x}$ from the second equation into the first equation,

$$\frac{944}{n} - 50 = \frac{144}{n}$$

Multiply through by n to get rid of the denominator,

$$944 - 50n = 144$$

Make n the subject of the formula,

$$50n = 944 - 144$$
$$50n = 800$$
$$n = 16$$

Therefore, the final answer is,

$$n = 16$$
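
As a quick numerical check, the relationship Σ(x − c) = Σx − nc can be rearranged for n directly. The sketch below is only an illustration; the function name `count_from_coded_sum` is hypothetical and was chosen for this example.

```python
def count_from_coded_sum(sum_x, sum_coded, c):
    """Solve sum(x - c) = sum(x) - n*c for n."""
    return (sum_x - sum_coded) / c


# Question 1: sum(x - 50) = 144 and sum(x) = 944
n = count_from_coded_sum(sum_x=944, sum_coded=144, c=50)
print(n)  # 16.0
```

The same function applies to any question of this type, as long as c is not zero.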

2. A driver records the distance travelled in each of 150 journeys. These distances, correct to the
nearest km, are summarised in the table. (9709/52/F/M/21 number 5)

Distance (km) | 0 − 4 | 5 − 10 | 11 − 20 | 21 − 30 | 31 − 40 | 41 − 60
Frequency     | 12    | 16     | 32      | 66      | 20      | 4

(a) Draw a cumulative frequency graph to illustrate the data.

Notice that there are gaps between the classes in our table, e.g. between 0 − 4 and 5 − 10 there is a
gap between 4 and 5. To fix this we apply a continuity correction. This means that
you should subtract 0.5 from the lower bounds and add 0.5 to the upper bounds. The first lower
bound stays at 0, since a distance cannot be negative,

Distance (km) | 0 − 4.5 | 4.5 − 10.5 | 10.5 − 20.5 | 20.5 − 30.5 | 30.5 − 40.5 | 40.5 − 60.5
Frequency     | 12      | 16         | 32          | 66          | 20          | 4

Remember that we want to plot a cumulative frequency graph, so we need to use
the frequency to find the cumulative frequency for each class,

Distance (km)        | 0 − 4.5 | 4.5 − 10.5 | 10.5 − 20.5 | 20.5 − 30.5 | 30.5 − 40.5 | 40.5 − 60.5
Cumulative Frequency | 12      | 28         | 60          | 126         | 146         | 150

Label the y-axis as cumulative frequency and the x-axis as distance in km. Plot the
upper bounds against the cumulative frequency. Join the points to form an s-shaped
curve,

[Figure: cumulative frequency curve; x-axis: Distance (km), 0 to 60; y-axis: Cumulative Frequency, 0 to 150]

(b) For 30% of these journeys the distance travelled is d km or more. Use your graph to estimate
the value of d.

This means that 70% of the journeys have a distance less than d km. Let’s find 70%
of 150,

$$\frac{70}{100} \times 150 = 105$$

Draw construction lines at a cumulative frequency of 105 and read off the distance,

$$d = 27$$

Therefore, the final answer is,

$$d = 27$$
(c) Calculate an estimate of the mean distance travelled for the 150 journeys.

Distance (km) | 0 − 4.5 | 4.5 − 10.5 | 10.5 − 20.5 | 20.5 − 30.5 | 30.5 − 40.5 | 40.5 − 60.5
Frequency     | 12      | 16         | 32          | 66          | 20          | 4

Find the midpoint of each class,

Midpoint  | 2.25 | 7.5 | 15.5 | 25.5 | 35.5 | 50.5
Frequency | 12   | 16  | 32   | 66   | 20   | 4

Calculate the mean using the formula,

$$\bar{x} = \frac{\sum xf}{\sum f}$$

Substitute into the formula,

$$\bar{x} = \frac{(2.25 \times 12) + (7.5 \times 16) + (15.5 \times 32) + (25.5 \times 66) + (35.5 \times 20) + (50.5 \times 4)}{150}$$

Simplify,

$$\bar{x} = \frac{3238}{150} \approx 21.6$$

Therefore, the final answer is,

$$\bar{x} = \frac{3238}{150} \approx 21.6 \text{ km}$$
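
The midpoint method above can be sketched in a few lines of Python. This is a minimal illustration, assuming the class boundaries after the continuity correction; the helper name `grouped_mean` is hypothetical.

```python
def grouped_mean(boundaries, freqs):
    """Estimate the mean of grouped data from class midpoints."""
    midpoints = [(lo + hi) / 2 for lo, hi in boundaries]
    return sum(m * f for m, f in zip(midpoints, freqs)) / sum(freqs)


# Class boundaries (km) and frequencies from the corrected table
classes = [(0, 4.5), (4.5, 10.5), (10.5, 20.5),
           (20.5, 30.5), (30.5, 40.5), (40.5, 60.5)]
freqs = [12, 16, 32, 66, 20, 4]

mean_distance = grouped_mean(classes, freqs)
print(round(mean_distance, 1))  # 21.6
```

Because every value in a class is replaced by its midpoint, the result is an estimate of the mean rather than the exact mean.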
3. The times taken by 200 players to solve a computer puzzle are summarised in the following table.
(9709/51/M/J/21 number 5)

Time (t seconds) | 0 ≤ t < 10 | 10 ≤ t < 20 | 20 ≤ t < 40 | 40 ≤ t < 60 | 60 ≤ t < 100
No. of players   | 16         | 54          | 78          | 32          | 20

(a) Draw a histogram to represent this information.

We first need to find the class width and the frequency density (frequency ÷ class width) of each class,

Class width       | 10  | 10  | 20  | 20  | 40
Frequency Density | 1.6 | 5.4 | 3.9 | 1.6 | 0.5

Label the y-axis with frequency density and x-axis with time in t seconds. Plot the
time classes against the frequency density, ensuring each class has the respective
class width,

[Figure: histogram; x-axis: Time (t seconds), with class boundaries at 10, 20, 40, 60 and 100; y-axis: Frequency density]
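
The frequency densities used for the histogram can be reproduced with a short Python sketch. It is only a check of the arithmetic, using the classes and frequencies from the table; the variable names are chosen for this example.

```python
# Classes (t seconds) and number of players from the table
classes = [(0, 10), (10, 20), (20, 40), (40, 60), (60, 100)]
freqs = [16, 54, 78, 32, 20]

# Frequency density = frequency / class width
densities = [f / (hi - lo) for (lo, hi), f in zip(classes, freqs)]
print(densities)  # [1.6, 5.4, 3.9, 1.6, 0.5]

# The area of each bar (density x width) gives back the frequency
areas = [d * (hi - lo) for d, (lo, hi) in zip(densities, classes)]
```

The areas of the bars add up to the 200 players, which is a useful check on a finished histogram.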

(b) Calculate an estimate of the mean time taken by these 200 players.
Time (t seconds) | 0 ≤ t < 10 | 10 ≤ t < 20 | 20 ≤ t < 40 | 40 ≤ t < 60 | 60 ≤ t < 100
No. of players   | 16         | 54          | 78          | 32          | 20

Let’s start by finding the midpoints of each class,

Midpoint       | 5  | 15 | 30 | 50 | 80
No. of players | 16 | 54 | 78 | 32 | 20

Calculate the mean, using the formula,

$$\bar{x} = \frac{\sum xf}{\sum f}$$

Substitute into the formula,

$$\bar{x} = \frac{(16 \times 5) + (54 \times 15) + (78 \times 30) + (32 \times 50) + (20 \times 80)}{200}$$

Simplify,

$$\bar{x} = \frac{6430}{200} = 32.15$$

Therefore, the final answer is,

$$\bar{x} = \frac{6430}{200} = 32.15 \text{ seconds}$$
(c) Find the greatest possible value of the interquartile range for these times.

To find the greatest possible value of the interquartile range we need to find the
maximum value of the upper quartile and the minimum value of the lower quartile.
The upper quartile is at position,

$$\frac{3}{4}n = \frac{3}{4} \times 200 = 150$$

When we add up the frequencies (16, 70, 148, 180, 200), we notice that the 150th value lies in the class,

$$40 \le t < 60$$

The upper boundary of this class is 60, so the greatest possible upper quartile is,

$$q_3 = 60$$

Let’s find the minimum value of the lower quartile. The lower quartile is at position,

$$\frac{1}{4}n = \frac{1}{4} \times 200 = 50$$

When we add up the frequencies, we notice that the 50th value lies in the class,

$$10 \le t < 20$$

The minimum value in this class is 10,

$$q_1 = 10$$

The formula for interquartile range is,

$$IQR = q_3 - q_1$$

Substitute into the formula,

$$IQR = 60 - 10$$
$$IQR = 50$$

Therefore, the final answer is,

$$IQR = 50$$
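
The search for the classes that contain the quartiles can be sketched in Python as follows. This is a minimal illustration under the assumption that the quartiles sit at positions n/4 and 3n/4; the helper name `class_containing` is hypothetical.

```python
from itertools import accumulate


def class_containing(position, classes, freqs):
    """Return the class whose cumulative frequency first reaches `position`."""
    for bounds, cumulative in zip(classes, accumulate(freqs)):
        if position <= cumulative:
            return bounds
    raise ValueError("position exceeds the total frequency")


classes = [(0, 10), (10, 20), (20, 40), (40, 60), (60, 100)]
freqs = [16, 54, 78, 32, 20]
n = sum(freqs)  # 200

q1_class = class_containing(n / 4, classes, freqs)      # (10, 20)
q3_class = class_containing(3 * n / 4, classes, freqs)  # (40, 60)

# Greatest IQR: upper boundary of the Q3 class minus lower boundary of the Q1 class
greatest_iqr = q3_class[1] - q1_class[0]
print(greatest_iqr)  # 50
```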

4. The heights in cm of 160 sunflowers plants were measured. The results are summarised on the
cumulative frequency curve. (9709/53/M/J/21 number 1)

[Figure: cumulative frequency curve; x-axis: Height (cm), 0 to 240; y-axis: Cumulative Frequency, 0 to 160]

(a) Use the graph to estimate the number of plants with heights less than 100 cm.

Draw construction lines at a height of 100cm, and read off the respective cumulative
frequency,
60

Therefore, the final answer is,
60
(b) Use the graph to estimate the 65th percentile of the distribution.

We have a total of 160 sunflowers. Let’s find 65% of that,

$$\frac{65}{100} \times 160 = 104$$

Draw construction lines at a cumulative frequency of 104, and read off the respective
height,

$$136 \text{ cm}$$

Therefore, the final answer is,

$$136 \text{ cm}$$
(c) Use the graph to estimate the interquartile range of the heights of these plants.

The formula for interquartile range is,

$$IQR = q_3 - q_1$$

Let’s start by finding the upper quartile. It is at a cumulative frequency of,

$$\frac{3}{4}n = \frac{3}{4} \times 160 = 120$$

Draw construction lines at a cumulative frequency of 120 and read off the height,

$$q_3 = 150$$

Now let’s find the lower quartile. It is at a cumulative frequency of,

$$\frac{1}{4}n = \frac{1}{4} \times 160 = 40$$

Draw construction lines at a cumulative frequency of 40 and read off the height,

$$q_1 = 76$$

Now let’s go back to the formula for interquartile range,

$$IQR = q_3 - q_1$$

Substitute into the formula,

$$IQR = 150 - 76$$
$$IQR = 74 \text{ cm}$$

Therefore, the final answer is,

$$IQR = 74 \text{ cm}$$

5. A sports club has a volleyball team and a hockey team. The heights of the 6 members of the
volleyball team are summarised by Σx = 1050 and Σx² = 193 700, where x is the height of a
member in cm. The heights of the 11 members of the hockey team are summarised by Σy = 1991
and Σy² = 366 400, where y is the height of a member in cm. (9709/53/M/J/21 number 3)

(a) Find the mean height of all 17 members of the club.

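$$\sum x = 1050 \qquad n_x = 6 \qquad \sum y = 1991 \qquad n_y = 11$$

The formula for the combined mean is,

$$\hat{\mu} = \frac{\sum x + \sum y}{n_x + n_y}$$

Substitute into the formula,

$$\hat{\mu} = \frac{1050 + 1991}{6 + 11}$$

Simplify,

$$\hat{\mu} = \frac{3041}{17}$$
$$\hat{\mu} = 178.9$$

Therefore, the final answer is,

$$\hat{\mu} = 178.9 \text{ cm}$$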
(b) Find the standard deviation of the heights of all 17 members of the club.
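$$\sum x^2 = 193\,700 \qquad n_x = 6 \qquad \sum y^2 = 366\,400 \qquad n_y = 11 \qquad \hat{\mu} = \frac{3041}{17}$$

The formula for the combined standard deviation is,

$$\sigma_{x+y} = \sqrt{\frac{\sum x^2 + \sum y^2}{n_x + n_y} - \hat{\mu}^2}$$

Substitute into the formula,

$$\sigma_{x+y} = \sqrt{\frac{193\,700 + 366\,400}{6 + 11} - \left(\frac{3041}{17}\right)^2}$$

Simplify,

$$\sigma_{x+y} = 30.8$$

Therefore, the final answer is,

$$\sigma_{x+y} = 30.8 \text{ cm}$$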
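
The combined mean and standard deviation can be checked numerically. The sketch below simply pools the two sets of summary totals, as in the working above; the variable names are chosen for this example.

```python
from math import sqrt

# Summary totals for the volleyball (x) and hockey (y) teams
n = 6 + 11
sum_heights = 1050 + 1991
sum_squares = 193_700 + 366_400

combined_mean = sum_heights / n
combined_sd = sqrt(sum_squares / n - combined_mean ** 2)

print(round(combined_mean, 1))  # 178.9
print(round(combined_sd, 1))    # 30.8
```

Using the unrounded mean inside the square root avoids the rounding error that would come from substituting 178.9.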

6. A summary of 40 values of x gives the following information:

Σ(x − k) = 520, Σ(x − k)² = 9640,

where k is a constant. (9709/51/O/N/21 number 2)

(a) Given that the mean of these 40 values of x is 34, find the value of k.

$$\bar{x} = 34 \qquad \sum (x - k) = 520 \qquad n = 40$$

The formula for the mean of the (x − k) values is,

$$\overline{x - k} = \frac{\sum (x - k)}{n}$$

We will use the idea that,

$$\overline{x - k} = \bar{x} - k$$

Our equation becomes,

$$\bar{x} - k = \frac{\sum (x - k)}{n}$$

Substitute in the values,

$$34 - k = \frac{520}{40}$$

Solve for k,

$$34 - k = 13$$
$$k = 34 - 13$$
$$k = 21$$

Therefore, the final answer is,

$$k = 21$$
(b) Find the variance of these 40 values of x.

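$$\sum (x - k)^2 = 9640 \qquad n = 40 \qquad k = 21 \qquad \bar{x} = 34$$

The variance $\sigma_x^2$ is the same as the variance $\sigma_{x-k}^2$, because subtracting a constant
shifts every value without changing the spread. In this case we are given more
information about x − k. So let’s find $\sigma_{x-k}^2$,

$$\sigma_{x-k}^2 = \frac{\sum (x - k)^2}{n} - \left(\overline{x - k}\right)^2$$

We need to find $\overline{x - k}$. Remember the idea we used above,

$$\overline{x - k} = \bar{x} - k$$
$$\overline{x - k} = 34 - 21$$
$$\overline{x - k} = 13$$

Now let’s go back to our formula for variance and substitute into it,

$$\sigma_{x-k}^2 = \frac{9640}{40} - (13)^2$$

Simplify,

$$\sigma_{x-k}^2 = 241 - 169 = 72$$

We said that $\sigma_{x-k}^2 = \sigma_x^2$,

$$\sigma_x^2 = 72$$

Therefore, the final answer is,

$$\sigma_x^2 = 72$$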
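
Both parts of this question can be verified with a short Python sketch. It assumes the population form of the variance, Σd²/n − (Σd/n)², applied to the coded values d = x − k; the function name `coded_variance` is hypothetical.

```python
def coded_variance(sum_d, sum_d2, n):
    """Variance from coded sums, where d = x - k for some constant k."""
    mean_d = sum_d / n
    return sum_d2 / n - mean_d ** 2


n = 40
mean_x = 34

# Part (a): mean of (x - k) equals mean of x minus k
k = mean_x - 520 / n
print(k)  # 21.0

# Part (b): the variance of x equals the variance of (x - k)
variance = coded_variance(sum_d=520, sum_d2=9640, n=n)
print(variance)  # 72.0
```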

7. For n values of the variable x, it is given that

Σ(x − 200) = 446 and Σx = 6846

Find the value of n. (9709/52/M/J/22 number 1)

To solve this question, we will use the idea that,

$$\overline{x - c} = \bar{x} - c$$

We can create two equations using the formula for the mean,

$$\overline{x - 200} = \frac{\sum (x - 200)}{n} \qquad \bar{x} = \frac{\sum x}{n}$$

Substitute in the values of Σ(x − 200) and Σx,

$$\overline{x - 200} = \frac{446}{n} \qquad \bar{x} = \frac{6846}{n}$$

Now let’s use the idea that $\overline{x - c} = \bar{x} - c$ in the first equation,

$$\bar{x} - 200 = \frac{446}{n} \qquad \bar{x} = \frac{6846}{n}$$

Now we can solve the two equations simultaneously. Substitute $\bar{x}$ from the second equation into the first equation,

$$\frac{6846}{n} - 200 = \frac{446}{n}$$

Multiply through by n to get rid of the denominator,

$$6846 - 200n = 446$$

Make n the subject of the formula,

$$200n = 6846 - 446$$
$$200n = 6400$$
$$n = 32$$

Therefore, the final answer is,

$$n = 32$$

8. The time taken, t minutes, to complete a puzzle was recorded for each of 150 students. These
times are summarised in the table. (9709/53/M/J/22 number 1)

Time (t minutes)     | t ≤ 25 | t ≤ 50 | t ≤ 75 | t ≤ 100 | t ≤ 150 | t ≤ 200
Cumulative frequency | 16     | 44     | 86     | 104     | 132     | 150

(a) Draw a cumulative frequency graph to illustrate the data.

Label the y-axis as cumulative frequency and the x-axis as time taken (t minutes).
Plot the upper bounds of the time taken against the cumulative frequency. Join
dots to form an s-shaped curve,
[Figure: cumulative frequency curve; x-axis: Time (t minutes), 0 to 200; y-axis: Cumulative Frequency, 0 to 160]

(b) Use your graph to estimate the 20th percentile of the data.

Let’s start by finding 20% of the students,

$$\frac{20}{100} \times 150 = 30$$

Draw construction lines at a cumulative frequency of 30 and read off the respective
time taken,

$$t = 38$$

Therefore, the final answer is,

$$t = 38$$
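
A reading taken from a graph can be sanity-checked by linear interpolation between the plotted points. The sketch below is an approximation only: it assumes straight lines between points and an extra starting point (0, 0), neither of which is stated in the question, so it will not match a smooth curve exactly. The function name `interpolate_value` is hypothetical.

```python
def interpolate_value(target_cf, points):
    """Linearly interpolate the x-value at a given cumulative frequency.

    `points` is a list of (upper_bound, cumulative_frequency) pairs in
    increasing order.
    """
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= target_cf <= c1:
            return x0 + (target_cf - c0) * (x1 - x0) / (c1 - c0)
    raise ValueError("target lies outside the plotted range")


# (0, 0) is an assumed starting point; the rest come from the table
points = [(0, 0), (25, 16), (50, 44), (75, 86),
          (100, 104), (150, 132), (200, 150)]

p20 = interpolate_value(20 * 150 / 100, points)
print(p20)  # 37.5
```

The interpolated value of 37.5 is close to the graph reading of 38, which suggests the reading is reasonable.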

9. Twenty children were asked to estimate the height of a particular tree. Their estimates, in metres,
were as follows. (9709/53/M/J/22 number 2)

4.1 4.2 4.4 4.5 4.6 4.8 5.0 5.2 5.3 5.4
5.5 5.8 6.0 6.2 6.3 6.4 6.6 6.8 6.9 19.4

(a) Find the mean of the estimated heights.

Let’s use the formula for mean,

$$\bar{x} = \frac{\sum x}{n}$$

We know that n is 20. Let’s find Σx,

$$\sum x = 4.1 + 4.2 + 4.4 + 4.5 + 4.6 + 4.8 + 5.0 + 5.2 + 5.3 + 5.4 + 5.5 + 5.8 + 6.0 + 6.2 + 6.3 + 6.4 + 6.6 + 6.8 + 6.9 + 19.4$$

Simplify,

$$\sum x = 123.4$$

Substitute into the formula,

$$\bar{x} = \frac{123.4}{20}$$
$$\bar{x} = 6.17$$

Therefore, the final answer is,

$$\bar{x} = 6.17 \text{ m}$$
(b) Find the median of the estimated heights.

We have a total of 20 values and the data is already arranged in rank order. The
position of the median is,

$$\frac{n + 1}{2} = \frac{20 + 1}{2} = 10.5$$

so the median must lie between the 10th and 11th data points. The 10th and 11th data points
are 5.4 and 5.5 respectively. The median is the average of those two points,

$$q_2 = \frac{5.4 + 5.5}{2}$$
$$q_2 = 5.45$$

Therefore, the final answer is,

$$q_2 = 5.45 \text{ m}$$
(c) Give a reason why the median is likely to be a more suitable measure of the central tendency
for this information.

If you analyse the data that we are given, you will notice that we have an anomalous
value, 19.4, i.e. a value that doesn’t follow the trend of the other estimates. This value
inflates the mean and makes it larger than is typical of the data, whereas the median is
not affected by it.

Therefore, the final answer is,

The mean is unduly affected by the extreme (anomalous) value, 19.4
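
The effect of the extreme value can be seen numerically with Python’s standard `statistics` module. This is a small illustrative check; the comparison with the value 19.4 removed is an addition for this example and is not part of the question.

```python
from statistics import mean, median

estimates = [4.1, 4.2, 4.4, 4.5, 4.6, 4.8, 5.0, 5.2, 5.3, 5.4,
             5.5, 5.8, 6.0, 6.2, 6.3, 6.4, 6.6, 6.8, 6.9, 19.4]

mean_height = mean(estimates)
median_height = median(estimates)
print(round(mean_height, 2))    # 6.17
print(round(median_height, 2))  # 5.45

# Dropping the extreme value 19.4 (illustration only)
without_extreme = estimates[:-1]
print(round(mean(without_extreme), 2))    # 5.47
print(round(median(without_extreme), 2))  # 5.4
```

Removing one value moves the mean from 6.17 to about 5.47, while the median barely changes.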

10. 50 values of the variable x are summarised by

Σ(x − 20) = 35 and Σx² = 25 036

Find the variance of these 50 values. (9709/53/O/N/22 number 1)

$$\sum (x - 20) = 35 \qquad \sum x^2 = 25\,036 \qquad n = 50$$

The formula for variance is,

$$\sigma^2 = \frac{\sum x^2}{n} - \bar{x}^2$$

We need to find $\bar{x}$. Let’s do that by first finding $\overline{x - 20}$,

$$\overline{x - 20} = \frac{\sum (x - 20)}{n}$$

Substitute into the formula,

$$\overline{x - 20} = \frac{35}{50}$$
$$\overline{x - 20} = 0.7$$

To find $\bar{x}$, we will use the idea that,

$$\overline{x - c} = \bar{x} - c$$

$$\overline{x - 20} = 0.7$$
$$\bar{x} - 20 = 0.7$$

Make $\bar{x}$ the subject of the formula,

$$\bar{x} = 20 + 0.7$$
$$\bar{x} = 20.7$$

Now let’s go back to our formula for variance and substitute into it,

$$\sigma^2 = \frac{25\,036}{50} - (20.7)^2$$

Simplify,

$$\sigma^2 = 72.23$$

Therefore, the final answer is,

$$\sigma^2 = 72.23$$
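
The two steps above (recover the mean from the coded sum, then apply the variance formula to Σx²) can be checked with a few lines of Python. The variable names are chosen for this sketch.

```python
n = 50
sum_coded = 35      # sum of (x - 20)
sum_x2 = 25_036     # sum of x squared

# mean of (x - 20) equals mean of x minus 20
mean_x = 20 + sum_coded / n

# population variance: mean of the squares minus the square of the mean
variance = sum_x2 / n - mean_x ** 2

print(round(mean_x, 1))    # 20.7
print(round(variance, 2))  # 72.23
```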

11. A summary of 50 values of x gives

Σ(x − q) = 700, Σ(x − q)² = 14 235,

where q is a constant. (9709/51/M/J/23 number 1)

(a) Find the standard deviation of these values of x.

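$$\sum (x - q) = 700 \qquad \sum (x - q)^2 = 14\,235 \qquad n = 50$$

Let’s start by finding $\sigma_{x-q}$,

$$\sigma_{x-q} = \sqrt{\frac{\sum (x - q)^2}{n} - \left(\overline{x - q}\right)^2}$$

We need to find $\overline{x - q}$,

$$\overline{x - q} = \frac{\sum (x - q)}{n}$$
$$\overline{x - q} = \frac{700}{50}$$
$$\overline{x - q} = 14$$

Now let’s go back to our formula for standard deviation and substitute into it,

$$\sigma_{x-q} = \sqrt{\frac{14\,235}{50} - (14)^2}$$

Simplify,

$$\sigma_{x-q} = 9.42$$

$\sigma_{x-q}$ is the same as $\sigma_x$,

$$\sigma_x = 9.42$$

Therefore, the final answer is,

$$\sigma_x = 9.42$$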
(b) Given that Σx = 2865, find the value of q.

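$$\sum x = 2865 \qquad n = 50 \qquad \overline{x - q} = 14$$

To find q we will use the idea that,

$$\overline{x - q} = \bar{x} - q$$

Let’s find $\bar{x}$,

$$\bar{x} = \frac{\sum x}{n}$$
$$\bar{x} = \frac{2865}{50}$$
$$\bar{x} = 57.3$$

Now let’s go back to the idea,

$$\overline{x - q} = \bar{x} - q$$

Substitute and solve for q,

$$14 = 57.3 - q$$
$$q = 57.3 - 14$$
$$q = 43.3$$

Therefore, the final answer is,

$$q = 43.3$$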
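
Both parts of this question can be confirmed with a short Python sketch that works directly from the summary totals. The variable names are chosen for this example.

```python
from math import sqrt

n = 50
sum_coded = 700       # sum of (x - q)
sum_coded2 = 14_235   # sum of (x - q) squared
sum_x = 2865

# Part (a): the standard deviation of x equals that of (x - q)
mean_coded = sum_coded / n
sd = sqrt(sum_coded2 / n - mean_coded ** 2)
print(round(sd, 2))  # 9.42

# Part (b): mean of (x - q) equals mean of x minus q
q = sum_x / n - mean_coded
print(round(q, 1))  # 43.3
```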

12. The times taken by 120 children to complete a particular puzzle are represented in the cumulative
frequency graph. (9709/51/O/N/23 number 1)

[Figure: cumulative frequency graph; x-axis: Time (seconds), 0 to 50; y-axis: Cumulative Frequency, 0 to 120]

(a) Use the graph to estimate the interquartile range of the data.

The formula for interquartile range is,

$$IQR = q_3 - q_1$$

Let’s start by finding the upper quartile. It is at a cumulative frequency of,

$$\frac{3}{4}n = \frac{3}{4} \times 120 = 90$$

Draw construction lines at a cumulative frequency of 90 and read off the time,

$$q_3 = 31$$

Now let’s find the lower quartile. It is at a cumulative frequency of,

$$\frac{1}{4}n = \frac{1}{4} \times 120 = 30$$

Draw construction lines at a cumulative frequency of 30 and read off the time,

$$q_1 = 23.7$$

Now let’s go back to the formula for interquartile range,

$$IQR = q_3 - q_1$$

Substitute into the formula,

$$IQR = 31 - 23.7$$
$$IQR = 7.3$$

Therefore, the final answer is,

$$IQR = 7.3 \text{ seconds}$$

35% of the children took longer than T seconds to complete the puzzle.
(b) Use the graph to estimate the value of T .

This means 65% of the children took less than T seconds to complete the puzzle.
Let’s find 65% of 120,

$$\frac{65}{100} \times 120 = 78$$

Draw construction lines at a cumulative frequency of 78 and read off the time. This
is the value of T,

$$T = 28.5$$

Therefore, the final answer is,

$$T = 28.5$$
