PYC3704 Past Exam Questions
2012-2013
PYC3704 (PYC304C)
Psychological Research
EXAM PREPARATION
This document is a compilation of past UNISA exam papers and their answers.
Please note:
This document is an additional tool for exam preparation. The author takes no responsibility for incorrect answers. Students must ensure that they learn the
prescribed material and understand the content.
This document was sold on Stuvia.co.za. You may not redistribute this document.
PYC3704 Flow Chart
Does the researcher plan a groups design or a correlational design?
Groups design:
  Interval or ratio scale: statistical hypotheses are about the mean (μ)
    Will one or two samples be selected?
    Yes/No branches lead to the test statistics z, t, tc and td
  Categorical: statistics and hypotheses are about P (population proportion) (no longer in syllabus)
Correlational design:
  Both variables are categorical (qualitative; nominal or ordinal): Chi-square test
  Both variables are continuous (quantitative; interval or ratio scale): Pearson's correlation coefficient (r), -1 ≤ r ≤ 1
Every test ends in a probability (p-value).
Symbols - Populations and Samples
Description                                                   Population      Sample
                                                              (Parameter)     (Statistic)
Arithmetic mean                                               μ               x̄
Standard deviation                                            σ               s (s = √s²)
Variance                                                      σ²              s²
Standard error of the mean                                    σx̄ (= σ/√n)     sx̄ (= s/√n)
  (also called the standard deviation of the sampling distribution of the mean)
Mean of the sampling distribution of the mean                 μx̄
  (mean of all means; equal to the mean under H0; central value of the sampling distribution)
z score for means                                             z
Difference between scores                                                     D
Standard deviation of a sample of difference scores                           sD
Correlation between two measurements (Pearson's r)            ρ               r
Proportions                                                   P               p
Other symbols
α    Level of significance, set by the researcher at the start of the project. It is the probability of making a Type I error (mistakenly rejecting H0 when it is true).
β    Probability of making a Type II error (not rejecting H0 when H0 is false and H1 is true).
r²   Squared correlation. Can be used as an indication of the size of the effect.
Standard Normal Distribution (z)
Columns (repeated five times across each row): z, Mean to z, Larger portion, Smaller portion
0.00 0.0000 0.5000 0.5000 0.40 0.1554 0.6554 0.3446 0.80 0.2881 0.7881 0.2119 1.20 0.3849 0.8849 0.1151 1.60 0.4452 0.9452 0.0548
0.01 0.0040 0.5040 0.4960 0.41 0.1591 0.6591 0.3409 0.81 0.2910 0.7910 0.2090 1.21 0.3869 0.8869 0.1131 1.61 0.4463 0.9463 0.0537
0.02 0.0080 0.5080 0.4920 0.42 0.1628 0.6628 0.3372 0.82 0.2939 0.7939 0.2061 1.22 0.3888 0.8888 0.1112 1.62 0.4474 0.9474 0.0526
0.03 0.0120 0.5120 0.4880 0.43 0.1664 0.6664 0.3336 0.83 0.2967 0.7967 0.2033 1.23 0.3907 0.8907 0.1093 1.63 0.4484 0.9484 0.0516
0.04 0.0160 0.5160 0.4840 0.44 0.1700 0.6700 0.3300 0.84 0.2995 0.7995 0.2005 1.24 0.3925 0.8925 0.1075 1.64 0.4495 0.9495 0.0505
0.05 0.0199 0.5199 0.4801 0.45 0.1736 0.6736 0.3264 0.85 0.3023 0.8023 0.1977 1.25 0.3944 0.8944 0.1056 1.65 0.4505 0.9505 0.0495
0.06 0.0239 0.5239 0.4761 0.46 0.1772 0.6772 0.3228 0.86 0.3051 0.8051 0.1949 1.26 0.3962 0.8962 0.1038 1.66 0.4515 0.9515 0.0485
0.07 0.0279 0.5279 0.4721 0.47 0.1808 0.6808 0.3192 0.87 0.3078 0.8078 0.1922 1.27 0.3980 0.8980 0.1020 1.67 0.4525 0.9525 0.0475
0.08 0.0319 0.5319 0.4681 0.48 0.1844 0.6844 0.3156 0.88 0.3106 0.8106 0.1894 1.28 0.3997 0.8997 0.1003 1.68 0.4535 0.9535 0.0465
0.09 0.0359 0.5359 0.4641 0.49 0.1879 0.6879 0.3121 0.89 0.3133 0.8133 0.1867 1.29 0.4015 0.9015 0.0985 1.69 0.4545 0.9545 0.0455
0.10 0.0398 0.5398 0.4602 0.50 0.1915 0.6915 0.3085 0.90 0.3159 0.8159 0.1841 1.30 0.4032 0.9032 0.0968 1.70 0.4554 0.9554 0.0446
0.11 0.0438 0.5438 0.4562 0.51 0.1950 0.6950 0.3050 0.91 0.3186 0.8186 0.1814 1.31 0.4049 0.9049 0.0951 1.71 0.4564 0.9564 0.0436
0.12 0.0478 0.5478 0.4522 0.52 0.1985 0.6985 0.3015 0.92 0.3212 0.8212 0.1788 1.32 0.4066 0.9066 0.0934 1.72 0.4573 0.9573 0.0427
0.13 0.0517 0.5517 0.4483 0.53 0.2019 0.7019 0.2981 0.93 0.3238 0.8238 0.1762 1.33 0.4082 0.9082 0.0918 1.73 0.4582 0.9582 0.0418
0.14 0.0557 0.5557 0.4443 0.54 0.2054 0.7054 0.2946 0.94 0.3264 0.8264 0.1736 1.34 0.4099 0.9099 0.0901 1.74 0.4591 0.9591 0.0409
0.15 0.0596 0.5596 0.4404 0.55 0.2088 0.7088 0.2912 0.95 0.3289 0.8289 0.1711 1.35 0.4115 0.9115 0.0885 1.75 0.4599 0.9599 0.0401
0.16 0.0636 0.5636 0.4364 0.56 0.2123 0.7123 0.2877 0.96 0.3315 0.8315 0.1685 1.36 0.4131 0.9131 0.0869 1.76 0.4608 0.9608 0.0392
0.17 0.0675 0.5675 0.4325 0.57 0.2157 0.7157 0.2843 0.97 0.3340 0.8340 0.1660 1.37 0.4147 0.9147 0.0853 1.77 0.4616 0.9616 0.0384
0.18 0.0714 0.5714 0.4286 0.58 0.2190 0.7190 0.2810 0.98 0.3365 0.8365 0.1635 1.38 0.4162 0.9162 0.0838 1.78 0.4625 0.9625 0.0375
0.19 0.0753 0.5753 0.4247 0.59 0.2224 0.7224 0.2776 0.99 0.3389 0.8389 0.1611 1.39 0.4177 0.9177 0.0823 1.79 0.4633 0.9633 0.0367
0.20 0.0793 0.5793 0.4207 0.60 0.2257 0.7257 0.2743 1.00 0.3413 0.8413 0.1587 1.40 0.4192 0.9192 0.0808 1.80 0.4641 0.9641 0.0359
0.21 0.0832 0.5832 0.4168 0.61 0.2291 0.7291 0.2709 1.01 0.3438 0.8438 0.1562 1.41 0.4207 0.9207 0.0793 1.81 0.4649 0.9649 0.0351
0.22 0.0871 0.5871 0.4129 0.62 0.2324 0.7324 0.2676 1.02 0.3461 0.8461 0.1539 1.42 0.4222 0.9222 0.0778 1.82 0.4656 0.9656 0.0344
0.23 0.0910 0.5910 0.4090 0.63 0.2357 0.7357 0.2643 1.03 0.3485 0.8485 0.1515 1.43 0.4236 0.9236 0.0764 1.83 0.4664 0.9664 0.0336
0.24 0.0948 0.5948 0.4052 0.64 0.2389 0.7389 0.2611 1.04 0.3508 0.8508 0.1492 1.44 0.4251 0.9251 0.0749 1.84 0.4671 0.9671 0.0329
0.25 0.0987 0.5987 0.4013 0.65 0.2422 0.7422 0.2578 1.05 0.3531 0.8531 0.1469 1.45 0.4265 0.9265 0.0735 1.85 0.4678 0.9678 0.0322
0.26 0.1026 0.6026 0.3974 0.66 0.2454 0.7454 0.2546 1.06 0.3554 0.8554 0.1446 1.46 0.4279 0.9279 0.0721 1.86 0.4686 0.9686 0.0314
0.27 0.1064 0.6064 0.3936 0.67 0.2486 0.7486 0.2514 1.07 0.3577 0.8577 0.1423 1.47 0.4292 0.9292 0.0708 1.87 0.4693 0.9693 0.0307
0.28 0.1103 0.6103 0.3897 0.68 0.2517 0.7517 0.2483 1.08 0.3599 0.8599 0.1401 1.48 0.4306 0.9306 0.0694 1.88 0.4699 0.9699 0.0301
0.29 0.1141 0.6141 0.3859 0.69 0.2549 0.7549 0.2451 1.09 0.3621 0.8621 0.1379 1.49 0.4319 0.9319 0.0681 1.89 0.4706 0.9706 0.0294
0.30 0.1179 0.6179 0.3821 0.70 0.2580 0.7580 0.2420 1.10 0.3643 0.8643 0.1357 1.50 0.4332 0.9332 0.0668 1.90 0.4713 0.9713 0.0287
0.31 0.1217 0.6217 0.3783 0.71 0.2611 0.7611 0.2389 1.11 0.3665 0.8665 0.1335 1.51 0.4345 0.9345 0.0655 1.91 0.4719 0.9719 0.0281
0.32 0.1255 0.6255 0.3745 0.72 0.2642 0.7642 0.2358 1.12 0.3686 0.8686 0.1314 1.52 0.4357 0.9357 0.0643 1.92 0.4726 0.9726 0.0274
0.33 0.1293 0.6293 0.3707 0.73 0.2673 0.7673 0.2327 1.13 0.3708 0.8708 0.1292 1.53 0.4370 0.9370 0.0630 1.93 0.4732 0.9732 0.0268
0.34 0.1331 0.6331 0.3669 0.74 0.2704 0.7704 0.2296 1.14 0.3729 0.8729 0.1271 1.54 0.4382 0.9382 0.0618 1.94 0.4738 0.9738 0.0262
0.35 0.1368 0.6368 0.3632 0.75 0.2734 0.7734 0.2266 1.15 0.3749 0.8749 0.1251 1.55 0.4394 0.9394 0.0606 1.95 0.4744 0.9744 0.0256
0.36 0.1406 0.6406 0.3594 0.76 0.2764 0.7764 0.2236 1.16 0.3770 0.8770 0.1230 1.56 0.4406 0.9406 0.0594 1.96 0.4750 0.9750 0.0250
0.37 0.1443 0.6443 0.3557 0.77 0.2794 0.7794 0.2206 1.17 0.3790 0.8790 0.1210 1.57 0.4418 0.9418 0.0582 1.97 0.4756 0.9756 0.0244
0.38 0.1480 0.6480 0.3520 0.78 0.2823 0.7823 0.2177 1.18 0.3810 0.8810 0.1190 1.58 0.4429 0.9429 0.0571 1.98 0.4761 0.9761 0.0239
0.39 0.1517 0.6517 0.3483 0.79 0.2852 0.7852 0.2148 1.19 0.3830 0.8830 0.1170 1.59 0.4441 0.9441 0.0559 1.99 0.4767 0.9767 0.0233
Columns (repeated four times across each row): z, Mean to z, Larger portion, Smaller portion
2.00 0.4772 0.9772 0.0228 2.43 0.4925 0.9925 0.0075 2.86 0.4979 0.9979 0.0021 3.29 0.4995 0.9995 0.0005
2.01 0.4778 0.9778 0.0222 2.44 0.4927 0.9927 0.0073 2.87 0.4979 0.9979 0.0021 3.30 0.4995 0.9995 0.0005
2.02 0.4783 0.9783 0.0217 2.45 0.4929 0.9929 0.0071 2.88 0.4980 0.9980 0.0020 3.31 0.4995 0.9995 0.0005
2.03 0.4788 0.9788 0.0212 2.46 0.4931 0.9931 0.0069 2.89 0.4981 0.9981 0.0019 3.32 0.4995 0.9995 0.0005
2.04 0.4793 0.9793 0.0207 2.47 0.4932 0.9932 0.0068 2.90 0.4981 0.9981 0.0019 3.33 0.4996 0.9996 0.0004
2.05 0.4798 0.9798 0.0202 2.48 0.4934 0.9934 0.0066 2.91 0.4982 0.9982 0.0018 3.34 0.4996 0.9996 0.0004
2.06 0.4803 0.9803 0.0197 2.49 0.4936 0.9936 0.0064 2.92 0.4982 0.9982 0.0018 3.35 0.4996 0.9996 0.0004
2.07 0.4808 0.9808 0.0192 2.50 0.4938 0.9938 0.0062 2.93 0.4983 0.9983 0.0017 3.36 0.4996 0.9996 0.0004
2.08 0.4812 0.9812 0.0188 2.51 0.4940 0.9940 0.0060 2.94 0.4984 0.9984 0.0016 3.37 0.4996 0.9996 0.0004
2.09 0.4817 0.9817 0.0183 2.52 0.4941 0.9941 0.0059 2.95 0.4984 0.9984 0.0016 3.38 0.4996 0.9996 0.0004
2.10 0.4821 0.9821 0.0179 2.53 0.4943 0.9943 0.0057 2.96 0.4985 0.9985 0.0015 3.39 0.4997 0.9997 0.0003
2.11 0.4826 0.9826 0.0174 2.54 0.4945 0.9945 0.0055 2.97 0.4985 0.9985 0.0015 3.40 0.4997 0.9997 0.0003
2.12 0.4830 0.9830 0.0170 2.55 0.4946 0.9946 0.0054 2.98 0.4986 0.9986 0.0014 3.41 0.4997 0.9997 0.0003
2.13 0.4834 0.9834 0.0166 2.56 0.4948 0.9948 0.0052 2.99 0.4986 0.9986 0.0014 3.42 0.4997 0.9997 0.0003
2.14 0.4838 0.9838 0.0162 2.57 0.4949 0.9949 0.0051 3.00 0.4987 0.9987 0.0013 3.43 0.4997 0.9997 0.0003
2.15 0.4842 0.9842 0.0158 2.58 0.4951 0.9951 0.0049 3.01 0.4987 0.9987 0.0013 3.44 0.4997 0.9997 0.0003
2.16 0.4846 0.9846 0.0154 2.59 0.4952 0.9952 0.0048 3.02 0.4987 0.9987 0.0013 3.45 0.4997 0.9997 0.0003
2.17 0.4850 0.9850 0.0150 2.60 0.4953 0.9953 0.0047 3.03 0.4988 0.9988 0.0012 3.46 0.4997 0.9997 0.0003
2.18 0.4854 0.9854 0.0146 2.61 0.4955 0.9955 0.0045 3.04 0.4988 0.9988 0.0012 3.47 0.4997 0.9997 0.0003
2.19 0.4857 0.9857 0.0143 2.62 0.4956 0.9956 0.0044 3.05 0.4989 0.9989 0.0011 3.48 0.4997 0.9997 0.0003
2.20 0.4861 0.9861 0.0139 2.63 0.4957 0.9957 0.0043 3.06 0.4989 0.9989 0.0011 3.49 0.4998 0.9998 0.0002
2.21 0.4864 0.9864 0.0136 2.64 0.4959 0.9959 0.0041 3.07 0.4989 0.9989 0.0011 3.50 0.4998 0.9998 0.0002
2.22 0.4868 0.9868 0.0132 2.65 0.4960 0.9960 0.0040 3.08 0.4990 0.9990 0.0010 3.51 0.4998 0.9998 0.0002
2.23 0.4871 0.9871 0.0129 2.66 0.4961 0.9961 0.0039 3.09 0.4990 0.9990 0.0010 3.52 0.4998 0.9998 0.0002
2.24 0.4875 0.9875 0.0125 2.67 0.4962 0.9962 0.0038 3.10 0.4990 0.9990 0.0010 3.53 0.4998 0.9998 0.0002
2.25 0.4878 0.9878 0.0122 2.68 0.4963 0.9963 0.0037 3.11 0.4991 0.9991 0.0009 3.54 0.4998 0.9998 0.0002
2.26 0.4881 0.9881 0.0119 2.69 0.4964 0.9964 0.0036 3.12 0.4991 0.9991 0.0009 3.55 0.4998 0.9998 0.0002
2.27 0.4884 0.9884 0.0116 2.70 0.4965 0.9965 0.0035 3.13 0.4991 0.9991 0.0009 3.56 0.4998 0.9998 0.0002
2.28 0.4887 0.9887 0.0113 2.71 0.4966 0.9966 0.0034 3.14 0.4992 0.9992 0.0008 3.57 0.4998 0.9998 0.0002
2.29 0.4890 0.9890 0.0110 2.72 0.4967 0.9967 0.0033 3.15 0.4992 0.9992 0.0008 3.58 0.4998 0.9998 0.0002
2.30 0.4893 0.9893 0.0107 2.73 0.4968 0.9968 0.0032 3.16 0.4992 0.9992 0.0008 3.59 0.4998 0.9998 0.0002
2.31 0.4896 0.9896 0.0104 2.74 0.4969 0.9969 0.0031 3.17 0.4992 0.9992 0.0008 3.60 0.4998 0.9998 0.0002
2.32 0.4898 0.9898 0.0102 2.75 0.4970 0.9970 0.0030 3.18 0.4993 0.9993 0.0007 3.65 0.4999 0.9999 0.0001
2.33 0.4901 0.9901 0.0099 2.76 0.4971 0.9971 0.0029 3.19 0.4993 0.9993 0.0007 3.70 0.4999 0.9999 0.0001
2.34 0.4904 0.9904 0.0096 2.77 0.4972 0.9972 0.0028 3.20 0.4993 0.9993 0.0007 3.75 0.4999 0.9999 0.0001
2.35 0.4906 0.9906 0.0094 2.78 0.4973 0.9973 0.0027 3.21 0.4993 0.9993 0.0007 3.80 0.4999 0.9999 0.0001
2.36 0.4909 0.9909 0.0091 2.79 0.4974 0.9974 0.0026 3.22 0.4994 0.9994 0.0006 3.85 0.4999 0.9999 0.0001
2.37 0.4911 0.9911 0.0089 2.80 0.4974 0.9974 0.0026 3.23 0.4994 0.9994 0.0006 3.90 0.5000 1.0000 0.0000
2.38 0.4913 0.9913 0.0087 2.81 0.4975 0.9975 0.0025 3.24 0.4994 0.9994 0.0006 3.95 0.5000 1.0000 0.0000
2.39 0.4916 0.9916 0.0084 2.82 0.4976 0.9976 0.0024 3.25 0.4994 0.9994 0.0006 4.00 0.5000 1.0000 0.0000
2.40 0.4918 0.9918 0.0082 2.83 0.4977 0.9977 0.0023 3.26 0.4994 0.9994 0.0006
2.41 0.4920 0.9920 0.0080 2.84 0.4977 0.9977 0.0023 3.27 0.4995 0.9995 0.0005
2.42 0.4922 0.9922 0.0078 2.85 0.4978 0.9978 0.0022 3.28 0.4995 0.9995 0.0005
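The three portions in the table above can be reproduced from the cumulative standard normal distribution. The following is a minimal sketch using only the Python standard library; the function name z_table_row is an illustrative assumption and does not come from the study material.

```python
import math

def z_table_row(z):
    """Return (mean to z, larger portion, smaller portion) for a non-negative z.

    z_table_row is a hypothetical helper name used only for this illustration.
    """
    # Area under the standard normal curve to the left of z (the larger portion).
    larger = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    # Area between the mean (z = 0) and z.
    mean_to_z = larger - 0.5
    # Area in the tail beyond z.
    smaller = 1.0 - larger
    return mean_to_z, larger, smaller

for z in (1.00, 1.96, 2.00):
    mean_to_z, larger, smaller = z_table_row(z)
    print(f"z = {z:.2f}: mean to z = {mean_to_z:.4f}, "
          f"larger = {larger:.4f}, smaller = {smaller:.4f}")
```

Rounded to four decimals, the printed values should agree with the corresponding table rows. Doubling the "mean to z" value gives the area within z standard deviations on both sides of the mean.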
PYC3704 (PYC304C)
May/June 2012
Question 2
In psychological research, a construct may be a(n) ______
1. measurement based on the careful observation of aspects of humans or human behaviour
2. observation of an aspect of humans or human behaviour which was operationalised in some way
3. hypothetical aspect of humans or human behaviour which we wish to investigate
4. explanation of empirical observations based on the measurement of certain variables
Ans: 3
Page: P4
Comments: Constructs and their interrelations (how they affect each other, their patterns of interaction) are used in this way to develop theoretical explanations of why people behave in certain ways in certain contexts, or why mental phenomena appear to be as they are.
# Question Ans Page Comments
Question 3
Which of the options below provides the best description of the main purpose of quantitative research in psychology? Its purpose is to ______
1. develop theories that explain the relationships among observed aspects of human behaviour and mental processes
2. develop predictions about human behaviour which can be applied with absolute certainty
3. describe and classify aspects of humans and human behaviour
4. develop hypotheses about relationships that may exist among various constructs
Ans: 4
Page: P1-2, P2, P3, P4, P6
Comments:
P1-2: Psychology is a discipline that endeavours to collect information and develop theories about human behaviour and mental processes. The aim is to establish facts that are related to psychological phenomena, that are valid and can be justified on scientific grounds.
P2: The act of simply observing phenomena and describing them or collecting facts about them is usually not sufficient. The next step in the scientific process is to go beyond the level of description by attempting to develop explanations for the things we observe: we want to know not only what the facts are, but also why they appear to be as they are. In other words, we want to develop theories, which explain why things are as they appear to be when we observe them.
P3: Psychologists try to develop explanations for human experiences and behaviour. To do this, they often have to make use of abstract concepts (also called constructs) that serve as explanations for the behaviour they observe.
P4: Psychologists are interested in finding out which constructs are important (in the sense of being required or useful to explain human behaviour) and how they work together in a pattern, or what their interrelationships are. One of the objectives of psychology is not only to describe human behaviour, but also to find explanations for it. Constructs and how they interact fill the role of explanatory mechanisms in psychology. We try to find out which constructs offer an appropriate explanation of the behaviour or events we perceive, and what the pattern of their interactions with other constructs may be. In this sense, it can be said that constructs are the building blocks of theory.
P6: The link between observing a construct and measuring it is so close that when we talk about 'observation' in quantitative research, we often imply the process of measurement. The taking of a measurement is regarded as an act of observation.
Question 5
Empirical knowledge is knowledge that is based on ______
1. careful reasoning
2. appropriate theories
3. the observation of events
4. published research
Ans: 3
Page: P2
Comments: All scientific knowledge begins with description of the phenomena being studied, based on careful observation. Knowledge based on observation of physical events is referred to as empirical knowledge (as distinct from knowledge based on contemplation, unexplained insights, mystical experiences or claims by authority figures).
“Generalised anxiety disorder (GAD) refers to a pattern of almost constant worry or tension, even when there is little or no apparent cause. Both genetic predisposition and stressors in the life of a particular patient are believed to contribute to this condition. The research will investigate whether the level of anxiety of persons diagnosed with GAD is actually reduced by psychotherapy. It is expected that patients receiving therapy will score lower on the Manifest Anxiety Scale than patients not receiving therapy.”
Question 6
“Both genetic predisposition and stressors in the life of a particular patient are believed to contribute to this condition” is ______
1. the research hypothesis
2. a theory about the causes of GAD
3. a postulated relation between two constructs
4. a description of the constructs in terms of which GAD can be observed
Ans: 2
Page: P4, P15, P18-19, P21-26
Comments:
A theory is a well-established principle that has been developed to explain some aspect of the natural world. A theory arises from repeated observation and testing and incorporates facts, laws, predictions, and tested hypotheses that are widely accepted. In science, a theory is a framework for facts. It is some kind of description that tells you how the facts are connected, and why the facts are as they are (where the word 'facts' refers to things or events that were observed and described in a careful way). A theory is a network of relations among facts that were proposed to be true and explanations for observed phenomena in terms of constructs.
Constructs and their interrelations (how they affect each other, their patterns of interaction) are used in this way to develop theoretical explanations of why people behave in certain ways in certain contexts, or why mental phenomena appear to be as they are.
Question 8
The dependent variable is ______ and the independent variable is ______
1. whether or not psychotherapy is received, the level of anxiety experienced by patients
2. the effectiveness of psychotherapy, the level of anxiety
3. the level of anxiety experienced by patients, whether or not psychotherapy is received
4. the anxiety score as measured on the Manifest Anxiety Scale, the presence of stressors in the life of the patient
Ans: 3
Page: P8-9, P24
Comments:
The dependent variable is the one that is predicted or explained, and the independent variable is manipulated to see how it affects the dependent variable.
The independent variable is that variable which affects the dependent variable; or, conversely, the dependent variable depends on the independent variable.
When a researcher focuses on the interaction of only two variables at a time, the dependent variable is usually the one that the researcher is interested in, the variable that is the focus of the research. The independent variable is something that the researcher manipulates, to see how this affects the dependent variable (in other words, the dependent variable is dependent on the independent variable).
Question 10
A researcher would use a ______ to make a(n) ______ about the nature of the ______
1. sample, inference, population
2. sample, hypothesis, population
3. variable, prediction, construct
4. population, inference, sample
Ans: 1
Page: P11
Comments:
The entire collection of cases that you are interested in when you make your measurements for a particular construct is referred to as the population. The population depends on which people or objects or events you are interested in studying.
Because populations can be very large, and we rarely have access to them, we would draw a sample of observations from the population and use that sample to infer certain things about the population's characteristics. The most appropriate sample is usually a simple random sample, where each individual has the same chance of being included. If our samples are not random, they may lack external validity: it may not be possible to generalise beyond the group from which we drew the sample.
Question 11
A measurement that summarises an aspect of a population is called a ______ while a measurement that describes the same aspect of a sample is called a ______
1. construct, variable
2. parameter, statistic
3. statistic, parameter
4. variable, construct
Ans: 2
Page: P14, P23
Comments:
A statistic is a measured characteristic of a sample.
A test statistic is the quantity you calculate (often by making use of sample statistics) to test a statistical hypothesis.
When we refer to these test quantities, we always use the name in full, 'test statistic'. When we use the term 'statistic' on its own, it refers to a descriptive statistic that describes an aspect of the sample data.
Parameters are values that summarise aspects of population data. While the word 'parameters' does refer to descriptive statistics, it does not refer to all descriptive statistics. It is used only for those descriptive statistics that relate to the population, not for those that describe aspects of the sample.
Question 12
A ______ is a speculative statement about the relationship among ______, based on observations or expectations
1. theory, constructs
2. hypothesis, statistics
3. theory, variables
4. hypothesis, constructs
Ans: 4
Page: P1, P18-19
Comments:
A research hypothesis is formed as a clear statement in terms of a relationship among the constructs (and the variables by which they are measured). It is a statement about a possible relationship among constructs that may explain some set of observations that one intends to investigate.
Constructs: concepts that act as explanations for phenomena, events and behaviour and are abstracted from observations.
Theories: a theory is a frame of reference for facts that attempts to account for why things are as they are; a claim about how constructs are related to produce phenomena, which has been validated by research.
Question 13
A class of 10 boys and 11 girls, including Mary and her friend Elizabeth, chooses a class representative by writing their names on slips of paper, putting these into a box and asking their teacher to draw one name blindly.
Ans: 3
Page: P29
Comments:
Number of possible outcomes = Total kids = 21
Number of favourable events = Either Mary or Elizabeth = 2
p(E) = Number of favourable events / Number of possible outcomes = 2/21 ≈ 0.095
1/3 × 1/2 = 1/6 = 0.167
Question 15
Which statement best represents an application of the law of large numbers? If I flip a coin 1000 times, it will fall heads-up ______ 500 times
1. approximately
2. exactly
3. at least
4. either much more or much less than
Ans: 1
Page: P31-32
Comments:
The principle is called the law of large numbers, and it states the following: If an experiment is done repeatedly, and if the outcomes are independent of one another, the observed proportion of favourable occurrences of an event will eventually approach its theoretical probability.
What the law states is that a probability value should be seen as a theoretical limit on which the relative occurrence of an event (outcome) can be expected to converge in the long run. For example, in the coin-flipping example, the probability of the coin coming up heads or tails on any flip is not influenced by the result of the previous flip. Each flip is independent of the others, and the theoretical probability of heads coming up remains the same, that is, p(heads) = 1/2 = 0.5.
In terms of the law of large numbers, we can make the following prediction: If we flip the coin repeatedly, even though we do not know whether heads or tails will come up on any particular flip, the actual proportion of heads will eventually get close to 0.5. Thus, as the experiment gets repeated over and over, the relative frequency or proportion of heads will approximate the theoretical probability of 0.5.
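The law of large numbers can be illustrated with a short simulation. This is a minimal sketch, assuming a fair coin simulated with Python's pseudo-random generator; the function name proportion_of_heads and the fixed seed are illustrative assumptions, not part of the study material.

```python
import random

def proportion_of_heads(n_flips, seed=0):
    """Simulate n_flips independent fair coin flips and return the proportion of heads.

    proportion_of_heads is a hypothetical helper; the seed only makes runs repeatable.
    """
    rng = random.Random(seed)
    # Each flip is independent: heads when the random draw falls below 0.5.
    heads = sum(1 for _ in range(n_flips) if rng.random() < 0.5)
    return heads / n_flips

# The observed proportion tends to settle near 0.5 as the number of flips grows.
for n in (10, 100, 1000, 100000):
    print(n, proportion_of_heads(n))
```

With few flips the proportion can be far from 0.5; with many flips it is expected to lie close to 0.5, which is why "approximately 500" is the sensible reading for 1000 flips.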
Question 16
The expression "0.05 < p ≤ 0.10" should be interpreted as a probability value ______
1. smaller than 0.05 and larger or equal to 0.10
2. halfway between 0.05 and 0.10
3. larger than 0.05 and smaller or equal to 0.10
4. smaller than 0.05 and equal to 0.10
Ans: 3
Page: P33-34
Comments:
Because probabilities fall in a range from 0.0 to 1.0 when expressed decimally, a probability can never be higher than 1 or lower than 0. The general rule is written symbolically as follows: 0 ≤ p ≤ 1. Note that a probability can be 0, but to say that a probability is 0 is actually the same as saying that the event is impossible and can never happen. Likewise, to say that the probability of an event is 1 is to assert that it is an absolute certainty. In actual practice, probabilities fall within these two extremes.
You will typically encounter reference to probabilities in expressions such as "p > 0.05". This statement is interpreted as "the probability value is higher than 0.05".
Question 17
Suppose that over the years 10 000 students wrote the examinations in PYC 3704-C and that 6000 of them passed, of which 300 obtained exactly 50%. This means that for randomly selected students the probability of obtaining exactly 50% is ______ while the probability of obtaining 50% or more is ______
1. 0.60, 0.03
2. 0.05, 0.60
3. 0.60, 0.03
4. 0.03, 0.60
Ans: 4
Page: P35-36
Comments:
Part 1: p(E) = Number of favourable events / Number of possible outcomes = 300/10000 = 3/100 = 0.03
Part 2: p(E) = Number of favourable events / Number of possible outcomes = 6000/10000 = 6/10 = 0.6
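The two probabilities in question 17 follow directly from the ratio of favourable events to possible outcomes. A minimal sketch of that calculation, with hypothetical variable and function names chosen for this illustration:

```python
from fractions import Fraction

def probability(favourable, possible):
    """p(E) = number of favourable events / number of possible outcomes."""
    return Fraction(favourable, possible)

total_students = 10_000   # all students who wrote the examination
exactly_50 = 300          # students who obtained exactly 50%
passed = 6_000            # students who obtained 50% or more

p_exactly_50 = probability(exactly_50, total_students)
p_pass = probability(passed, total_students)

print(p_exactly_50, float(p_exactly_50))
print(p_pass, float(p_pass))
```

Using Fraction keeps the intermediate results exact (3/100 and 3/5) before converting to decimals.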
Question 18
During the interpretation of psychological measurements the normal distribution is often ______
1. adapted to fit the observed frequency distribution of scores
2. used as a theoretical model for interpreting the observed distribution of scores
3. used to calculate the relative frequency of observed scores
4. used to derive the mean and standard deviation of a sample
Ans: 2
Page: P50-51
Comments:
Many of the scores that we use are also clustered around the average, and tail off to the ends of the distribution. Because it can be used to describe the distribution of many naturally or 'normally' occurring continuous variables, this type of symmetrical probability distribution is called a normal distribution. It is also commonly referred to as the normal curve, because the distribution can be plotted by a bell-shaped curve.
The definition of the standard normal distribution (see section 2.3.3 on p52-53) is that it has a mean (μ) of 0 and a standard deviation (σ) of 1.
Question 19
The scale along the x-axis of the standard normal distribution indicates ______
1. probabilities
2. the mean of the distribution
3. the number of standard deviations below and above the mean
4. the p-values
Ans: 3
Page: P52-53
Comments: Statisticians have derived a rather complicated-looking equation (or formula) which describes the normal curve, and have shown that it contains only two variables, the mean (μ) and the standard deviation (σ), with the rest of its terms being constants. The formula produces distributions that are all bell-shaped, but the actual shape of the curve (how high it is or how spread out it is) depends only on the mean and the standard deviation of the distribution concerned.
1. 2
2. 10
3. 50
4. 25
Comments: In other words, it is an indication of the size of the error that you make by using a sample of a particular size (n) to determine the population mean. This amount of error will decrease as the size of the sample increases.
σx̄ = σ/√n = 50/√25 = 50/5 = 10
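The standard error calculation above, and the claim that the error shrinks as the sample grows, can be checked with a few lines of code. This is a minimal sketch; the function name standard_error and the extra sample sizes are assumptions added for illustration.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: the population standard deviation divided by √n."""
    return sigma / math.sqrt(n)

# The worked example above: σ = 50 and n = 25.
print(standard_error(50, 25))

# Larger samples give a smaller standard error for the same σ.
for n in (25, 100, 400):
    print(n, standard_error(50, n))
```

Quadrupling the sample size halves the standard error, because n enters the formula under a square root.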
Question 22
Which of the following statements about population parameters is the most accurate?
1. They are essential for making statements about probability distributions
2. They are always unknown but appropriate values can be estimated prior to sampling
3. They are essential, but cannot be estimated from sample information
4. They are always required prior to sampling because they are needed to calculate the sample statistics
Ans: 4
Page: P23, P13, P65 Q10, P60
Comments:
P23: Parameters are values that summarise aspects of population data. While the word 'parameters' does refer to descriptive statistics, it does not refer to all descriptive statistics. It is used only for those descriptive statistics that relate to the population, not for those that describe aspects of the sample.
P13, P65 Q10: Population parameters are rarely known (usually unknown), since the only way to determine them would be to collect the relevant data from the entire population. Population parameters are usually unknown and have to be inferred from sample data. Since population parameters are unknown, they cannot be essential to make statements about probability. Option 1 is, therefore, incorrect. Option 3 is also incorrect because it states that population parameters cannot be estimated from sample information, whereas the whole process of statistical inference is concerned with inferring information about a population from sample data.
P60: We use the sample to represent the population, and do our calculations on the sample data, but ultimately we want to determine the situation in the population. To do this, we often have to estimate the (population) parameters by using the (sample) statistics. A researcher seldom knows the values of the population parameters, but the values of sample statistics can be calculated by means of clearly formulated mathematical procedures and these can be used as estimates of the parameters of the corresponding population. Normally, we'll not know what our true population parameter is, and we would have calculated the mean from only a single sample, but we can still apply the basic principle: that our sample mean will be a reliable estimate of our population mean.
Question 23
What is the principal advantage of z scores? They enable one to ______
1. determine whether scores are normally distributed around the mean
2. transform a person's scores on tests with different means and the same standard deviations into comparable percentages
3. compare a person's scores on tests with different means and standard deviations
4. determine frequency distributions for tests with different means
Ans: 3
Page: P53
Comments: This curve has a mean of μ = 0 and a standard deviation of σ = 1 and is known as the standard normal distribution, and is by convention indicated with the letter 'z' (so it is also referred to as the z-distribution). The measures on this distribution are referred to as standard scores or z-scores.
According to the standard normal distribution table (z-table), if z = 1 then the mean-to-z portion is 0.3413. Multiply by 2 to cover both sides of the mean: 2 × 0.3413 = 0.6826, or 68.26%.
Question 24
Consider the following table:
Subject   Student X   Mean of class   Std. dev. of class
A         50%         40%             5%
B         55%         50%             5%
C         60%         50%             10%
D         65%         65%             5%
In which subject did Student X do best, relative to his class?
1. A
2. C
3. D
4. B
Ans: 1
Page: Tut201 2012 Q21
Comments:
The marks should first be converted to z-values, to make it possible to compare them across the different means and standard deviations:
z = (X - X̄) / s
z(Subject A) = (50 - 40) / 5 = 10/5 = 2
z(Subject B) = (55 - 50) / 5 = 5/5 = 1
z(Subject C) = (60 - 50) / 10 = 10/10 = 1
z(Subject D) = (65 - 65) / 5 = 0/5 = 0
So it is clear that in the case of subject A, the student's marks are 2 standard deviations above the mean. In the other subjects the student's marks are 1 standard deviation or less above the mean.
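The comparison in question 24 can be sketched in code by converting each mark to a z-score and picking the largest. The dictionary layout and the names z_score and marks are illustrative assumptions; the numbers come from the table in the question.

```python
def z_score(x, mean, sd):
    """Standard score: how many standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# subject: (Student X's mark, class mean, class standard deviation), all in percent
marks = {
    "A": (50, 40, 5),
    "B": (55, 50, 5),
    "C": (60, 50, 10),
    "D": (65, 65, 5),
}

z_scores = {subject: z_score(x, mean, sd) for subject, (x, mean, sd) in marks.items()}
best_subject = max(z_scores, key=z_scores.get)

for subject, z in z_scores.items():
    print(subject, z)
print("Best relative to class:", best_subject)
```

Note that subject D has the highest raw mark but the lowest z-score, which is exactly why raw marks cannot be compared across classes with different means and standard deviations.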
Question 25
Study the histogram below of the exam marks of a group of students in the same class. Note that the values on the horizontal axis are the class (category) limits.
Assume we use this histogram as a basis for making probability predictions. What is the probability that a student's score will be between 40 and 60?
1. 0.20
2. 0.10
3. 0.70
4. 0.30
Ans: 2
Page: P29
Comments:
Possible outcomes = 10 + 20 + 40 + 10 + 20 = 100
Favourable events = score > 40 and < 60
= exam mark of 50, with frequency 10
= 10
P(40 < score < 60) = 10 / 100 = 0.10
OR
Favourable events = score > 40
= exam mark 50 (frequency 10) and exam mark 60 (frequency 20)
= 10 + 20 = 30
P(score > 40) = Number of favourable events / Number of possible outcomes
= (10 + 20) / 100
= 30 / 100
= 0.30
A researcher suspects that the addition of certain food supplements to the diet of elderly people will reduce the decline in cognitive functioning that comes
about because of aging. She decides to test this using a neuropsychological test that measures the speeds with which objects are identified (the
Neuropsychological Perceptual Speed or NPS test). It is known that the distribution of scores on this test is approximately normal and that a mean of µ =
80 and σ = 20 was found in the population of persons older than 65.
To investigate her hypothesis, she obtains a random sample of n=100 persons older than 65. Each member of this sample is given a daily dose of
supplements over a period of six months. At the end of this time, each person is tested on the NPS test and a mean of x̄ = 76 is found. The researcher
plans to test the hypothesis at α = 0.05.
26 The appropriate research hypothesis suggested by the 3 Tut201 A psychological hypothesis formulates a testable empirical claim (something
scenario above is as follows 2012 Q8 that can in principle be observed), and this usually involves postulating a
relationship between two or more variables.
1. Cognitive functioning declines with age
2. The cognitive functioning of elderly persons is
related to their perceptual speed
3. Cognitive functioning will be better for elderly
persons who take the dietary supplement than for
those who do not
4. The perceptual speed of elderly persons who take
the dietary supplement will be greater than for those
who do not
27 The appropriate alternative hypothesis to be tested 1 H0: μ = 80, which is the mean NPS score in the population older than 65. For
is ______ perceptual speed to improve, the NPS score must go down (think of the score as
the time taken). The alternative hypothesis is therefore that the mean NPS score
1. H1: μ < 80 is less than 80. So H1: μ < 80
2. H1: μ < 84
3. H1: μ > 80
4. H1: μ ≠ 80
28 The mean of the sampling distribution of the mean is 1 SG P60- The sampling distribution of means refers to the distribution of the means of all
______ 61 possible samples of a particular size randomly selected from the same
population
1. 80
2. 76 μx̄ = μ = 80
3. 20
4. unknown
29 The standard error is ______ 2 SG P60 We can estimate the size of the error we would make if we used the sample
mean as an estimate of the population mean. This is referred to as the
1. 20 standard error, and it is specified in the central limit theorem.
2. 2
3. 0.05 SG P61 The standard error is denoted by σx̄. The σ indicates that we are describing
4. unknown a population, and the subscript x̄ informs us that we are dealing with a
population of sample means. The standard error is given by dividing the
population standard deviation by the square root of the sample size:
σx̄ = σ / √n
If σ = 20 and n = 100, then
σx̄ = 20 / √100
= 20 / 10
= 2
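As a minimal sketch, the standard error calculation above can be reproduced as follows. The function name `standard_error` is an illustrative choice; the values σ = 20 and n = 100 come from the scenario.

```python
import math


def standard_error(sd, n):
    """Standard error of the mean: standard deviation divided by sqrt(n)."""
    return sd / math.sqrt(n)


sigma = 20   # population standard deviation given in the scenario
n = 100      # sample size given in the scenario

se = standard_error(sigma, n)
print(se)
```

The same function applies when only a sample standard deviation s is available (s / √n).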
30 With the information as given in the scenario, what 4 P100-106 The t-distribution is a statistical distribution with a probability distribution that
would be the appropriate statistical test to test can be determined, which means that we can use it to predict the chances of
hypothesis? obtaining specific outcomes when testing for comparisons of means when the
population standard deviation σ is unknown.
1. A one sample t-test So we use the t-test (t) when the population standard deviation
2. A two sample t-test (σ) is unknown and the only standard deviation available
3. A test of correlation r for relationship between comes from the sample.
variables
4. A one sample z-test
When the population standard deviation (σ) is known, as it is here (σ = 20), we use the z-test (z).
31 The test statistic is calculated and, based on this, a 1 A test statistic is calculated to determine how far the observed measurements
computer program is used to determine that the one deviate from what we may expect by chance. Calculating the test statistic is
sided p-value =0.022. What conclusion can be drawn? the first step in a process of comparing the observed data with what may be
expected by chance (i.e., if the null hypothesis were true).
1. The null hypothesis can be rejected, so the
supplement improves cognitive functioning P81 A computer program usually supplies a two-tailed p-value, but in this case the
2. The null hypothesis cannot be rejected, so the question states that the one-tailed p-value =0.022. This also means we are
supplement improves cognitive functioning referring to a directional alternative hypothesis.
3. The alternative hypothesis can be rejected, so the
supplement improves cognitive functioning We have already established that H0: μ = 80 and H1: μ ˂ 80. The researcher
4. Insufficient information is given to make a plans to test the hypothesis at α = 0.05. We can therefore compare the p-
conclusion without further calculations value (0.022) with the alpha (0.05). The p-value is smaller than the alpha
which means we have to reject the null hypothesis.
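The one-sided p-value quoted in the question can be approximated from the scenario values with a one-sample z-test. This is a sketch only, using `NormalDist` from Python's standard library; the variable names are illustrative, and the result agrees with the quoted 0.022 only up to rounding.

```python
import math
from statistics import NormalDist

mu0 = 80          # population mean under H0
sigma = 20        # known population standard deviation
n = 100           # sample size
sample_mean = 76  # observed sample mean
alpha = 0.05      # level of significance chosen by the researcher

se = sigma / math.sqrt(n)        # standard error of the mean
z = (sample_mean - mu0) / se     # test statistic

# H1: mu < 80, so the one-sided p-value is the area to the left of z
# under the standard normal curve.
p_one_sided = NormalDist().cdf(z)

reject_h0 = p_one_sided < alpha
print(z, round(p_one_sided, 4), reject_h0)
```

Because the p-value is below α = 0.05, the sketch reaches the same decision as the comment above: H0 is rejected.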
32 When applying a statistical test, the probability of a 2 SG 82-86 An error of Type I is the error we make if we reject the null hypothesis when
type I error is equal to ______ we should not have done so, and the level of significance represents the
Tut202 greatest risk of doing this that we are willing to take.
1. 0.05 or 0.01 2014 Q5
2. the level of significance We know that the extent of the type I error that a researcher is willing to make
3. the calculated value of the test statistic is controlled by the researcher by setting the level of significance (α) in
4. the p-value of the test statistic under the alternative advance. The probability of a type II error (β) is not controlled in advance by
hypothesis the researcher except for the fact that we know that the lower (smaller)
the probability of a type I error (α) the greater (larger) the probability of a
type II error (β).
You could eliminate error of type I (rejecting H0 when you should not)
altogether by never rejecting the null hypothesis, irrespective of how small the
p-value becomes, but that would make the probability of not rejecting H0 when
you should reject it (error of type II) an absolute certainty.
33 A statistical hypothesis is a formal statement about 1 P18 The next step in the research process is to turn the research hypothesis into a
______ statistical hypothesis: a formal hypothesis that can be tested by
statistical techniques. (on the basis of sample observations, whether the
1. parameters relationship proposed in the research hypothesis indeed exists.)
2. statistics
3. level of significance P71 This statistical hypothesis is a formal expression of the research hypothesis,
4. p-values which enables us to test it.
P74 Take note that a research hypothesis always translates into two mutually
exclusive hypotheses (i.e. both cannot be true at the same time): a null and an
alternative hypothesis. Also remember that, in Topic 1, we referred to
quantities such as μ as parameters (population parameters). These particular
statistical hypotheses are, thus, statements about the value of a
particular population parameter.
34 The sampling distribution of a statistic (e.g. of the 1 P58 The sampling distribution of a statistic is the set of all possible values of the
sample mean) can be calculated if we assume that the statistic when all possible samples of a fixed size are taken from the
______ hypothesis is true, but not if we assume that the population. The sampling distribution refers to the variation of a statistic, for
______ hypothesis is true example, the sample mean (x̄), from sample to sample. Note that here we are
not concerned with the variation of individual elements in the sample, or
1. null, alternative individual elements in the population, but with the variation of a summary
2. alternative, null value (such as the mean) for a sample.
3. statistical, research
4. research, statistical P77-79 So what we do instead is to calculate how far from the expected mean our
observed mean is, and determine from this the probability that this difference
is not 'real' but just a consequence of chance (random error). In other words,
we determine the probability of getting this sample result, on a sample of this
size, if H0 were true. We use the expression 'under the null hypothesis' by
which we mean, 'assuming that the hypothesis H0 is true'. Similarly, the
phrase 'under H1' would mean, 'assuming that H1 is true'.
35 When a statistical test yields a large p-value, which of 3 P81 Here is a summary of the important points regarding the p-value:
the following statements is most correct? • The p-value gives the probability of obtaining the sample result under H0.
• If the p-value is very small, the probability is very small that the sample
1. The alternative hypothesis is probably true result would occur under H0, and one should consider rejecting H0 in
2. The null hypothesis is probably false favour of H1.
3. The null hypothesis is probably true • The smaller the p-value, the more likely that the null hypothesis is false
4. The probability of an error of Type I is small and should be rejected in favour of the alternative hypothesis.
So, if the p-value is very large, the probability is high that the sample result
would occur under H0, and one should not reject H0 in favour of H1.
The null hypothesis is then probably true.
36 The hypothesis "H1 µ < 50" is a ______ hypothesis and 4 P75-76 The alternative hypothesis can contain any of the symbols '>', '<' or '≠'
requires a ______ statistical test respectively, the symbols for 'larger than', 'smaller than' or 'not equal to'.
1. non-directional, one-tailed When a comparison is between a value that is greater (more) than another,
2. directional, two-tailed we use the symbol '>' and when a comparison is between a value that is
3. non-directional, two-tailed smaller (less) than another, we use '<'. The statistical test that must be
4. directional, one-tailed performed in either of these cases is a directional or one-tailed statistical
test (we use these expressions interchangeably).
When we do not specify what the direction of the difference should be, and
both a larger and a smaller difference between means are considered as
relevant, the symbol '≠' must be used. The statistical test to be performed will
now be a non-directional or two-tailed test.
The important point to remember is that the p-value indicates more or less
how likely the particular result we have observed in our data is if the null
hypothesis were true; or, as we say, 'under the null hypothesis'.
37 When applying a z-test to compare a sample mean to a 3 Tut201 The observed results are the values which you find in your sample(s) of data,
known population mean, the p-value represents the 2014 for example the sample mean and sample standard deviation, or (if it is
probability of ______ Q10 relevant), the correlation coefficient which you calculated.
1. rejecting the null hypothesis if it is false The p-value shows you the probability of seeing some relationship among
2. obtaining the mean found in the sample of data these variables based on your calculations (such as a difference between
under the alternative hypothesis means or a high correlation), if in fact this observed relationship is merely the
3. obtaining the mean found in the sample of data consequence of chance (in other words, if the null hypothesis was true). You
under the null hypothesis are in fact comparing the observed relationships in the data with what you
4. failing to reject the null hypothesis when it is in fact would expect if the null hypothesis is true by calculating a relevant test
true statistic.
This test statistic can then be used to find the p-value if we know the
probability distribution of the test statistic. If this probability is small, it implies
the null hypothesis is probably not true.
38 When applying a statistical test a decision is reached 1 SG 82-86 An error of Type I is the error we make if we reject the null hypothesis when
by comparing the ______ to the ______ we should not have done so, and the level of significance represents the
Tut202 greatest risk of doing this that we are willing to take.
1. p-value, level of significance 2014 Q5
2. test statistic, population parameter We know that the extent of the type I error that a researcher is willing to make
3. test statistic, level of significance is controlled by the researcher by setting the level of significance (α) in
4. p-value, test statistic advance. The probability of a type II error (β) is not controlled in advance by
the researcher except for the fact that we know that the lower (smaller)
the probability of a type I error (α) the greater (larger) the probability of a
type II error (β).
You could eliminate error of type I (rejecting H0 when you should not)
altogether by never rejecting the null hypothesis, irrespective of how small the
p-value becomes, but that would make the probability of not rejecting H0 when
you should reject it (error of type II) an absolute certainty.
39 The lower we set the level of significance, the greater 2 SG 82-86 An error of Type I is the error we make if we reject the null hypothesis when
the probability of ______ we should not have done so, and the level of significance represents the
Tut202 greatest risk of doing this that we are willing to take.
1. rejecting the null hypothesis 2014 Q5
2. a type II error We know that the extent of the type I error that a researcher is willing to make
3. a type I error is controlled by the researcher by setting the level of significance (α) in
4. accepting the alternative hypothesis advance. The probability of a type II error (β) is not controlled in advance by
the researcher except for the fact that we know that the lower (smaller)
the probability of a type I error (α) the greater (larger) the probability of a
type II error (β).
You could eliminate error of type I (rejecting H0 when you should not)
altogether by never rejecting the null hypothesis, irrespective of how small the
p-value becomes, but that would make the probability of not rejecting H0 when
you should reject it (error of type II) an absolute certainty.
40 Which of the following assumptions do we make when 2 P77 The decision rule for H0 is simply as follows:
applying a statistical test? P82 If the p-value of the sample result is smaller (less) than α (level of
We assume that the ______ significance), the null hypothesis is rejected. If the p-value is not smaller than
α, the null hypothesis (H0) is not rejected.
1. level of significance is small
2. null hypothesis is true
3. alternative hypothesis is true
4. the null hypothesis is false
41 The size of the level of significance depends on ______ 1 Tut201 The level of significance (α) reflects the greatest risk that the researcher is
2012 willing to take of rejecting the null hypothesis in error. The researcher wants to
1. a choice made by the researcher Q29 establish that the observation which was made (as calculated from a sample
2. conventional rules of data) has a very small chance of being purely the result of chance
3. the calculation of a test statistic Tut202 variations in the data. He/she controls this by requiring that this calculated
4. the p-value under H0 2013 Q2 probability (the p-value) should be below a specific level (α) which is chosen in
advance.
Alternative 4 is false because the p-value refers to this calculated probability of
finding a test statistic of a particular size if the null hypothesis is true (i.e.
‘under the null hypothesis’), while the level of significance is the maximum
value of this p-value which the researcher is willing to consider if the null
hypothesis is to be rejected. The p-value must be less than this chosen level
of significance, or else the statistical relationship between the variables is
considered to be too small to be regarded as significant (the greater the p-
value, the greater the probability that the effect which was observed in the
sample data is purely the result of chance).
Alternative 3 is false because an appropriate test statistic has to be calculated
in order to find the p-value, but this test statistic is not called ‘the level of
significance.’
While values for α such as 0.01 or 0.05 are often used by convention, the
researcher can in fact use any value which he/she deems appropriate, so
alternative 2 is not strictly correct.
42 When two population means are compared, the p-value 4 P81 Here is a summary of the important points regarding the p-value:
expresses the probability of the difference between the • The p-value gives the probability of obtaining the sample result under H0.
sample means given that ______ • If the p-value is very small, the probability is very small that the sample
result would occur under H0, and one should consider rejecting H0 in
1. H0 is false favour of H1.
2. H1 is true • The smaller the p-value, the more likely that the null hypothesis is false
3. H1 is false and should be rejected in favour of the alternative hypothesis.
4. H0 is true
So, if the p-value is very large, the probability is high that the sample result
would occur under H0, and one should not reject H0 in favour of H1.
The null hypothesis is then probably true.
P76-77 Generally, we would compare the two population means because H0 seems to
be false, but H1 has not yet been proven to be true.
43 What does it mean to say “the difference between the 2 Tut202 The null hypothesis states that there is no difference in the means calculated
means of groups A and B is statistically significant”? 2014 Q8 from samples of data from each of groups A and B. When we calculate the
two means from sample data (which we regard as an observation) we may
1. It is unlikely that the alternative hypothesis will be find a difference in the two calculated means, but at least part of this
true difference could be due to measurement errors. We calculate the p-value
2. The sample result is more probable under the (based on a test statistic with a known probability distribution) to find out what
alternative hypothesis the probability is that these observed differences in the sample data are
3. The null hypothesis explains the sample result just a consequence of measurement error if the null hypothesis is assumed to
4. The alternative hypothesis should be rejected be true. If this probability is low (lower than a pre-determined cut-off level, α),
we conclude that the difference in the two means is statistically significant
because the probability that the null hypothesis is true is very small.
In other words, we conclude that the size of the difference between means
found in the sample data would not be likely if the null hypothesis were true.
Therefore: The sample result is more probable under the alternative hypothesis
44 When two means are compared, the p-value expresses 3 P76-77 In fact, we are not yet entitled to conclude that the alternative hypothesis is
the probability that a difference ______ true. This is because of the problem of sampling error. This error exists partly
because we are using a sample to make conclusions about a population, in
1. is statistically significant addition to which we are using a test that is only accurate to a certain degree.
2. which is found between the means is due to the It is because of this random error that we require the use of statistical tests to
alternative hypothesis see whether the result is in fact adequate for us to make a decision about the
3. which is found between the means is due to chance hypothesis. (See section 1.4.4 on the problem of the error term in
or sampling error measurement.)
4. will be found between the means
45 The power of a statistical test refers to the ______ 2 P85-86 The ability of a statistical test to detect a significant relationship between
variables when such a relationship does in fact exist, is referred to as its
1. test's ability to give small p-values power. This is the inverse of a Type II error: it is the probability of rejecting H0
2. test's ability to detect significant results when, in fact, it is false and H1 is true. To put it succinctly, it is the probability
3. sample size of correctly rejecting a false null hypothesis
4. probability that an error of Type I will not be made The power of a test is calculated by subtracting the probability of a Type II
when the test is used error from one (i.e., power = 1 - β). It can be thought of as a measure of the
"accuracy" of the test. The power of a test is related to how sensitive the test
should be (see section 3.3.4 on effect size below) as well as the sample size
(n) that you are going to use.
In practice, we usually control only the α-level when we use a particular
statistical test. But, given a fixed α-level, there are ways of increasing the
power of a test even if we do not actually calculate the value of 1 - β.
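To make power = 1 - β concrete, here is a rough sketch for a lower-tailed one-sample z-test. It reuses H0: μ = 80, σ = 20 and n = 100 from the earlier NPS scenario, but the "true" mean of 75 is purely an assumption made for illustration, as is the function name `power_lower_tailed_z`.

```python
import math
from statistics import NormalDist


def power_lower_tailed_z(mu0, mu_true, sigma, n, alpha):
    """Power of a one-sample z-test of H0: mu = mu0 against H1: mu < mu0."""
    se = sigma / math.sqrt(n)
    # Largest sample mean that still leads to rejecting H0 at this alpha.
    z_crit = NormalDist().inv_cdf(alpha)
    critical_mean = mu0 + z_crit * se
    # Probability of a sample mean below that cut-off when the true mean is mu_true.
    return NormalDist(mu_true, se).cdf(critical_mean)


# mu_true = 75 is a hypothetical value chosen only for this illustration.
power_05 = power_lower_tailed_z(mu0=80, mu_true=75, sigma=20, n=100, alpha=0.05)
power_01 = power_lower_tailed_z(mu0=80, mu_true=75, sigma=20, n=100, alpha=0.01)

beta_05 = 1 - power_05  # probability of a Type II error at alpha = 0.05
print(round(power_05, 3), round(beta_05, 3), round(power_01, 3))
```

With everything else fixed, lowering α from 0.05 to 0.01 lowers the power, which is the same trade-off between Type I and Type II errors described in the comments on the earlier questions.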
46 The value that is conventionally indicated with the 4 P81-82 Small p-values would lead one to reject the null hypothesis, because it shows
symbol α refers to the ______ that the probability of H0 being true is not very high. But how small must the p-
value be? The practice in empirical research is to decide what size p-values
1. maximum probability of obtaining the observed would be considered small enough to justify rejecting the null hypothesis
results under H0 before the research is actually conducted. We do this by specifying a 'cut-off'
2. probability of making an error of Type II if the p-value so that, if the calculated p-value of our sample result is smaller than
rejection of H0 is in fact true this 'cut-off' p-value, the null hypothesis is rejected. This 'cut-off' p-value is
3. ability of the statistical test to detect whether an called the significance level of the statistical test procedure. We will use the
effect exists symbol 'α' to denote this significance level. The symbol 'α' is pronounced
4. maximum probability of making an error of Type I if 'alpha' and is the Greek letter equivalent to the normal 'a' in our (Roman)
the rejection of H0 is to be considered alphabet. By convention, this value is often set at either 0.05 or 0.01. The α-
value specifies the maximum risk that we are willing to take of making an
error if we reject the null hypothesis (see section 3.3.3 for more details on
this).
47 A researcher wants to test the hypothesis that the mean 2 P105 sx̄ = s / √n
depression score on a depression scale for patients = 24 / √64
diagnosed with clinical depression is greater than 120. = 24 / 8
The statistical hypothesis to be tested is =3
H0 µ = 120
H1 µ > 120
1. 0.37
2. 3.0
3. 0.61
4. sx̄ cannot be calculated from the information that
was provided
48 Suppose H0 μ = 100 is tested against H1 μ ≠ 100 with 1 p = 0.04
α=0.05. If the t-statistic is found to be -3.20 and the two- α = 0.05
tailed p-value is 0.04, what decision regarding the
statistical hypothesis can be taken? General rule: if p < α, reject H0 and accept H1
49 Suppose the alternative hypothesis states that μ > 60. 1 P106 H1: μ > 60
The researcher should test H0 against H1 if the ______
μ = population mean (estimated by the sample mean)
1. sample mean is larger than 60 μ > 60 is directional indicating larger than.
2. sample mean is smaller than 60
3. sample mean differs from 60 So if the sample mean is greater than 60, a test should be performed.
4. p-value is smaller than the level of significance
50 The following list contains a number of situations where 2 P123 One can use t-tests to compare two groups at a time until one has compared
a researcher may consider using a variation of the t-test all three groups with one another. It would probably be wise to use a smaller
a) To compare two group means level of significance since the probability of a Type I error increases as you do
b) To determine whether a relationship exists between more statistical tests on the same data.
two categorical (nominal scale) variables
c) To compare a group mean with a constant value P115-123 The t-test does not test for relationships; it compares groups.
d) To determine whether a relationship exists between
two continuous quantitative variables
52 Samples can be considered independent when ______ 4 P112 Samples are considered as comprising independent groups if the
composition of the one sample in no way affects, in any systematic way,
1. the sample comes from the assignment of subjects the composition of the other sample. The two samples come from two
to a treatment or experimental group and this is groups that have no obvious relationship. For example, where one sample is
varied to see how it affects certain measurements measurements of a construct like 'self-esteem' among men, and the other
2. care was taken that the samples are drawn under among women, but both groups were sampled purely randomly.
different experimental conditions
3. the samples are drawn from more than a single
population of subjects
4. the composition of one sample is not systematically
related to the composition of the other one
53 A social psychologist wants to test how long people will 3 P113-116 The dependent variable is the one that is predicted or explained, and the
wait before responding to cries of help from an independent variable is manipulated to see how it affects the dependent
unknown person. The psychologist wants to confirm his variable.
suspicion that people will take less time to react when
they hear a female voice than when they hear a male We have to perform a tc test
voice. He tests this on a sample of n=15 people who
are told (one at a time) to sit in a waiting room to be In order to use the t-test (tc) statistic, we need to make two assumptions
called for an interview. While they wait, each participant regarding the data:
hears a call for help from a male or female voice, which • that the two populations being compared are normally distributed
is actually a recording. The dependent variable is the • with the same variance (or standard deviation).
number of seconds that each participant waits until they
go to investigate or try to find help. The following
sample statistics are calculated from the results.
Male voice x̄1 = 11.9 seconds, s1 = 3.5
Female voice x̄2 = 15.3 seconds, s2 = 4.1
What minimum assumptions from the ones given above We can also assume that the samples are independent - since the samples
need to be met before he may proceed? were selected randomly, we can safely consider them to be independent of
each other. All of this makes the tc-test an appropriate test.
1. At least one of (a) or (b) must be true
2. (a) and (b) must both be true P116 Note: Even the most elementary statistics program makes provision for
3. Neither (a) nor (b) is relevant but other assumptions performing t-tests. Such programs usually require that we indicate which
exist that will have to be considered variable should be used to identify the two groups and which is the dependent
4. The t-test should never be used with such a small variable. In addition, we have to choose between a tc test for independent
sample at all samples or a td test for dependent or correlated groups
55 A researcher wants to test the following hypotheses 1 P78-81 The relationship between one-tailed and two-tailed p-values can be
summarised as follows:
H0 μ1 = μ2 • One-tailed p-value = (two-tailed p-value) / 2
H1 μ1 > μ2 • Two-tailed p-value = (one-tailed p-value) x 2
On the basis of data provided, the output from a The important point to remember is that the p-value indicates more or less
computer programme indicates that a t-value of t = 1.72 how likely the particular result we have observed in our data is if the null
was found, with the p-value for a two-tailed test given hypothesis were true; or, as we say, 'under the null hypothesis'.
as p = 0.056. What should the researcher do to
evaluate this result at a level of significance of α = P105 H1 μ1 > μ2 is a directional hypothesis and a one-tailed test should be
0.05? performed. Computer programs often only provide the p-value for non-
directional testing (i.e., for the two-tailed t-test).
1. Divide the p-value by 2 before comparing it with α
2. Multiply the p-value by 2 before comparing it with α In this case, the non-directional p-value (p = 0.056) should be divided by two to
3. Divide α by 2 before comparing p to α get the one-tailed value of p =0.028.
4. Compare the p-value as given with α
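The conversion described above can be sketched as follows. The helper `one_tailed_p` and its `effect_in_predicted_direction` flag are illustrative names, not part of any statistics package. Halving is only appropriate when the observed difference lies in the direction predicted by H1, which is the case here (t = 1.72 is positive and H1 states μ1 > μ2).

```python
def one_tailed_p(two_tailed_p, effect_in_predicted_direction=True):
    """Convert a two-tailed p-value into a one-tailed p-value."""
    if effect_in_predicted_direction:
        return two_tailed_p / 2
    # If the observed difference goes against H1, the one-tailed p-value is large.
    return 1 - two_tailed_p / 2


alpha = 0.05
p_two_tailed = 0.056  # value reported by the computer programme

p_one_tailed = one_tailed_p(p_two_tailed)
reject_h0 = p_one_tailed < alpha

print(p_one_tailed, reject_h0)
```

The two-tailed value (0.056) is larger than α, but the one-tailed value (0.028) is smaller, so under the directional hypothesis H0 is rejected.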
A researcher suspects that a relationship exists between colour perception and visual memory (i.e. the capacity to recall visual information). She suspects
that high ability to detect colours rapidly acts as an aid to the capacity of visual memory. A group of 100 research participants are divided into two groups,
based on the capacity of their visual memory, as determined by an appropriate test. One group (Group 1) of n1=44 displays high recollection of visual
images, the other group (Group 2) of n2=56 scores low on the test. Each participant from each of the groups is then tested on how many colours they
can recall of objects they see very briefly displayed on a computer screen.
56 Which is the most appropriate research hypothesis for 3 "She suspects that high ability to detect colours rapidly acts as an aid to the
the researcher to test? capacity of visual memory"
1. H1 μ1 < μ2
2. H1 x̄1 > x̄2
3. H1 μ1 > μ2
4. H1 μ1 ≠ μ2
58 Which is the appropriate test statistic to be calculated 4 P129-130 Correlation: measuring the association between variables
when analysing the results of this research?
Correlation is a measurement of the extent to which a measurement on
1. The t-statistic for the difference between the means one variable is related to a measurement on another variable for the
of two independent samples same sample of individual cases.
2. The t-statistic for the difference between the means
of two dependent samples This can be visualised by way of a graphical representation called a scatter
3. The t-statistic for the mean difference score of a plot. A scatter plot is a graph that represents the measurements of two
single sample variables on two perpendicular axes, usually called the x-axis (horizontal axis
4. The test statistic based on the correlation coefficient or abscissa) and the y-axis (vertical axis or ordinate).
r for the relationship between two variables (visual
memory and recall of colours)
To test the efficacy of a workshop aimed at improving people's interpersonal skills, a researcher applies a scale which rates the interpersonal skills of 20
participants before and after they participate in the workshop. Scores on his rating scale among the general population have a mean of 5 and a standard
deviation of 1.5
59 Which of the following is the most appropriate way to 2 " To test the efficacy of a workshop aimed at improving people's interpersonal
express the null hypothesis for an analysis of the skills, a researcher applies a scale"
results? (Interpret μ as a population mean and Ḋ as the
population mean of the difference scores) There is no direction indicated (greater, more, smaller, etc.)
H0: μ = 5 " participants before and after they participate in the workshop"
H0: μ1= μ2
H0: Ḋ = 0 So two group means are compared.
H0: μ1 ≠ μ2
H0 is always "=" Therefore: H0: μ1= μ2
60 Which is the appropriate test statistic to calculate? 2 P112 " participants before and after they participate in the workshop"
1. The z-statistic for the mean of a sample So the same group is used, which makes the samples dependent.
2. The t-statistic for the difference between the means
of two dependent samples Samples are considered as comprising independent groups if the
3. The t-statistic for the difference between the means composition of the one sample in no way affects, in any systematic way, the
of two independent samples composition of the other sample. The two samples come from two groups that
4. The t-statistic for the mean of a single sample have no obvious relationship. For example, where one sample is
measurements of a construct like 'self-esteem' among men, and the other
among women, but both groups were sampled purely randomly.
62 A scatter plot is a graphical representation of ______ 4 SG A graph showing the position of each of a number of sampling units on each
P130-132 of two variables
1. the relationship between two variables measured on
a nominal scale within a single group Tut202 A scatter plot is a graph showing the relationship between two numerical
2. the frequency distribution of a sample of 2014 variables. In such a graph the data of the one variable are plotted on the
measurements Q18 horizontal axis (usually referred to as the X axis), and the data of the other
3. relationship between two groups of subjects with variable on the vertical (or Y) axis.
regard to a single variable measured on an interval
or ratio scale It is not a comparison of sample and population, nor has it to do with spread
4. the relationship between two variables measured on of data or the independence of variables
a ratio or interval scale within a single group
63 A positive correlation between variables X and Y implies 2 P133 If a correlation exists, the way in which one variable varies will be related to
that persons scoring low on X will generally score variation on the other one. A negative correlation implies that as one variable
______ on Y changes, the other changes in the opposite direction. A high value on X will
imply a low value on Y, while a low value on X will be matched by a high value
1. high on Y. Conversely, if the correlation is positive, the variable values will
2. low generally vary in the same direction (both high or both low).
3. either high or low
4. in an indeterminate way When positive relationships occur, this implies that as one variable gets
larger, so does the other. When negative relationships occur, this implies that
as one variable gets larger, the other gets smaller.
64 Which of the following can take on a value of -0.5? 3 P132-133 Correlation coefficients that measure the linear relationship between two
variables, such as the Pearson product-moment correlation coefficient, can
1. A probability have a continuous value that ranges from -1 to 1 (a positive value is usually
2. A level of significance written without the sign, so '1' is presumed to mean '+1').
3. A correlation coefficient
4. A variance We use 'r' as the symbol that represents a correlation coefficient (as in the
case of the Pearson product-moment correlation coefficient), and the following
applies:
• r = +1 implies a perfect positive linear relationship (the dots in a scatter
plot will run from lower left to upper right in a perfectly straight line)
• r = 0 implies no linear relationship at all (the dots may be scattered all over
the place)
• r = -1 implies a perfect negative linear relationship (the dots will run from
upper left to lower right in a straight line)
65 What is the most likely value of the correlation 3 P132-133 [Scatter plot of Y against X; both axes run from 0 to 8]
coefficient between the following values of variables X
and Y?
X 2 7 4 5 1
Y 2 7 4 5 1
1. -1
2. 0
3. +1
4. 100
A perfect positive linear relationship exists (the dots in the scatter plot run from
lower left to upper right in a perfectly straight line)
We use 'r' as the symbol that represents a correlation coefficient (as in the
case of the Pearson product-moment correlation coefficient), and the following
applies:
• r = +1 implies a perfect positive linear relationship (the dots in a scatter
plot will run from lower left to upper right in a perfectly straight line)
• r = 0 implies no linear relationship at all (the dots may be scattered all over
the place)
• r = -1 implies a perfect negative linear relationship (the dots will run from
upper left to lower right in a straight line)
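As a rough check on the answer to Question 65, the correlation can be computed directly from the five pairs of values. This is only a sketch: the function name pearson_r is made up for illustration, and it uses the standard product-moment formula rather than any method prescribed in the study guide.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Values from Question 65: every Y equals its X
x = [2, 7, 4, 5, 1]
y = [2, 7, 4, 5, 1]
r = pearson_r(x, y)
print(round(r, 4))
```

Because every Y value equals its X value, the points lie on one rising straight line, which is why r comes out at +1.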
66 A researcher hypothesizes that a relationship should 4 SG P137 The symbol ‘ρ’ (the Greek letter ‘rho’) is used to represent the population
exist between spatial ability and general aptitude for parameter being tested when you calculate the Pearson’s correlation
mathematics. She collects the results of a sample of n = Tut202 coefficient ‘r.’ That is, you calculate r for the sample, then have to decide
100 school children for a mathematics test and measure 2014 whether this is likely to represent a significant linear correlation between two
the spatial ability of each with a test that represents a Q13 variables for the whole population (with this population correlation symbolised
person's ability to rotate objects mentally on a 10-point by ρ), by looking at the p-value associated with this calculated sample statistic
scale. r.
Which of the following is the most appropriate way to In a similar way ‘μ’ represents the population parameter (statistic) for a mean,
express the null hypothesis for this research? and ‘σ’ the population parameter for a standard deviation.
1. r=0
2. μ=0
3. ẋ=0
4. ρ=0
67 A number of psychiatric patients are classified into one 1 P142 After setting up the hypotheses to be tested for, the next step is to create a
of four categories as: schizophrenic, severely contingency table, which is a table indicating the number of individual objects
depressed, bipolar disorder and others. Which of the falling in each cell of cross-tabulated data. In other words, it is a two-
following is suitable for representing this information dimensional table in which each observation is classified in terms of two
versus the gender of these patients? categories simultaneously.
X Y Total
A 6 4 10
B 4 6 10
Total 10 10 20
70 Which of the following is the appropriate formula for the 4 P144 The Pearson chi-square test statistic is a calculation of the difference between
Chi square test? the observed and expected frequencies.
2.
This means the expected value for each cell in the contingency table is
subtracted from the observed value for that cell, squared, and divided by the
expected value for that cell.
3.
Then all of these terms are added together to yield χ² = Σ (O − E)² / E
4.
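The calculation described for Question 70 can be sketched in a few lines. As an assumption for illustration only, the observed counts are taken from the small X/Y by A/B contingency table shown under Question 67, and the expected count for each cell is computed as (row total × column total) / N. The function name chi_square is hypothetical.

```python
def chi_square(observed):
    """Pearson chi-square statistic for a two-way table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected frequency under independence of rows and columns
            e = row_totals[i] * col_totals[j] / n
            chi2 += (o - e) ** 2 / e
    return chi2

# Illustrative counts: rows A and B, columns X and Y
table = [[6, 4], [4, 6]]
chi2 = chi_square(table)
print(round(chi2, 2))
```

Every expected count here is 5, so each cell contributes (1)² / 5 = 0.2 and the four cells add up to 0.8.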
Oct/Nov 2012
2 Mean, range, variance and standard deviation are 2 P10-11 A distinction exists between inferential statistics and descriptive statistics.
examples of______ Descriptive statistics refers to a set of quantities used to summarise
aspects of numerical data. Examples that you may be familiar with are
1. variables means, range, variance and standard deviation (see Appendix C for a quick
2. descriptive statistics introduction). These summary quantities are sometimes referred to as
3. test statistics parameters (when they refer to the whole collection or population of data; see
4. inferential statistics section 1.4.3 below).
7 Which of the following best describes “latent”? 3 P7 So the (visible) variable reflects the intensity of the underlying (invisible)
construct, in terms of how it was measured. We say that the variable is
1. observable manifest (it is visible in the sense that we can observe it) and the construct is
2. manifest latent (it is invisible in the sense that we need some way to make it
3. hidden appear). So the latent construct is made manifest by the use of an
4. independent appropriate measurement procedure.
P23 To say that a construct is 'latent' is another way of saying it is hidden from
direct observation
8 A psychologist has a theory that visual perceptual 2 P8-9 The dependent variable is the one that is predicted or explained, and the
ability influences the marks that learners will get in a P24 independent variable is manipulated to see how it affects the dependent
mathematics test. In this example, 'visual perceptual variable.
ability' is the ______ variable
The independent variable is that variable which affects the dependent
1. dependent variable; or, conversely, the dependent variable depends on the independent
2. independent variable.
3. manifest
4. hidden When a researcher focuses on the interaction of only two variables at a time,
the dependent variable is usually the one that the researcher is interested in,
the variable that is the focus of the research. The independent variable is
something that the researcher manipulates, to see how this affects the
dependent variable (in other words, the dependent variable is dependent on
the independent variable).
9 An operationally defined variable is ______ 4 P24-26 Operational definitions of psychological constructs should define constructs in
terms of observable behaviour.
1. abstract
2. latent "Operational" refers to practical procedures by which constructs are made
3. independent visible.
4. observable
"Operationalisation" is where you make the construct (which is usually an
abstract concept, so it is difficult to observe it clearly) visible by finding some
suitable way to measure it.
10 A psychologist is interested in studying the interaction 1 P15-16 We normally start with a research question. This could be an implication of
between small groups of four to five people in each a theory - something that seems to be implied by the theory or some kind of
group. He suspects that the interactions between such practical problem, which is stated in general terms. Using our existing
groups can be described in similar terms to the knowledge about plausible answers, we reformulate the research question in
interactions between individual persons. In order to be terms of a conjecture or supposition, which has the goal of helping the
able to do a scientific study of this (a) ______ question, researcher select what he or she has to observe in order to answer the
he would have to provide a(an) (b) ______ definition of research question. This is the research hypothesis (although there could be
the (c) ______ called "interaction" more than one), which expresses the problem in terms of very specific
relationships among constructs that we expect to find (if our guess is true). It
1. (a) research (b) operational (c) construct is important that this possible relationship should be clear and unambiguous.
2. (a) scientific (b) experimental (c) concept An hypothesis that is stated clearly and specifies exactly what is to be
3. (a) experimental (b) research (c) statistic observed and what should be true if it is valid, is often called an operational
4. (a) hypothetical (b) empirical (c) parameter hypothesis. However, this is just another name for a research hypothesis
where the relationship between the measurements (representing the
construct as variables) is written out in clear and explicit detail. You can
think of the research hypothesis as a description of relationships that
should hold among the constructs (two or more). The operational
hypothesis is then the way the research hypothesis is expressed in the form
of the relationships among the variables produced when the constructs are
measured. But the operational hypothesis is usually taken as equivalent to
the research hypothesis, so the distinction is rarely made in practice.
11 The variable manipulated by a researcher in an 2 P8-9 The dependent variable is the one that is predicted or explained, and the
experiment is called the ______ variable P24 independent variable is manipulated to see how it affects the dependent
variable.
1. hypothetical
2. independent The independent variable is that variable which affects the dependent
3. dependent variable; or, conversely, the dependent variable depends on the independent
4. empirical variable.
P4 Constructs and their interrelations (how they affect each other, their patterns
of interaction) are used in this way to develop theoretical explanations of
why people behave in certain ways in certain contexts, or why mental
phenomena appear to be as they are.
Number of items remembered 1 2 3 4 5 6 7 8 9 10
Frequency 0 4 11 13 22 18 17 9 6 0
So:
p(9 or more) = (frequency of 9 + frequency of 10) / N
= (6+0) / 100
= 6 / 100
= 0.06
Using this table as a basis, estimate the probability that
a specific person will remember nine or more three-
digit numbers.
1. 0.06
2. 0
3. 94%
4. 0.15
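The estimate above is simply a relative frequency, which the following sketch reproduces from the table. The variable and function names are made up for illustration.

```python
# Frequency table from the question: items remembered -> number of people
frequency = {1: 0, 2: 4, 3: 11, 4: 13, 5: 22, 6: 18, 7: 17, 8: 9, 9: 6, 10: 0}

n_total = sum(frequency.values())

def prob_at_least(k):
    """Relative frequency of remembering k or more items."""
    return sum(f for items, f in frequency.items() if items >= k) / n_total

p_nine_or_more = prob_at_least(9)
print(n_total, p_nine_or_more)
```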
16 Two class representatives, one boy and one girl, must 2 P34-35 The multiplicative rule states that p(A and B) = p(A) x p(B) where A and B
be selected from a class of 10 boys and 8 girls, which are both independent events. This rule is used to determine the product of
includes Mary and her friend John. The teacher writes two or more probabilities and is indicated by the word 'and' (i.e. the
the names of all the children on slips of paper. She first probability of A and B).
puts the girls' names into a box and then draws one of
their names blindly. Then she empties the box and Total number of kids = 18 (10 boys and 8 girls)
puts the names of all the boys inside, and one name is John selected = 1 out of 10
again drawn blindly. Mary selected = 1 out of 8
What is the probability that Mary and John will both be P(Mary and John) = P(Mary) x P(John) = 1/8 x 1/10 = 1/80 = 0.0125
selected?
The additive rule is p(A or B) = p(A) + p(B). This rule is used when two or
1. 2/80 more events are mutually exclusive. The additive rule is used to determine
2. 0.0125 the sum of two or more probabilities, and is signalled by the use of the word
3. 0.225 'or' (i.e. the probability of A or B).
4. 2/18
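The multiplicative rule for Question 16 can be checked with exact fractions. This sketch assumes, as the comment does, that the two draws are independent; the variable names are illustrative.

```python
from fractions import Fraction

# One girl is drawn from 8 girls, one boy is drawn from 10 boys
p_mary = Fraction(1, 8)
p_john = Fraction(1, 10)

# Multiplicative rule for independent events: p(A and B) = p(A) x p(B)
p_both = p_mary * p_john
print(p_both, float(p_both))
```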
Base your answers to Questions 17 and 18 on the following information:
Suppose the weights of the population of military recruits are distributed normally with a mean of 64 kg and a standard deviation of 8 kg. Different
samples of these recruits, each with a sample size of 16, are drawn repeatedly
17 We expect the standard deviation of the sample means 1 P60-61 Central limit theorem.
to be about ______ kg If a simple random sample of size n is selected from a population with mean
μ and standard deviation σ, the sampling distribution of means obtained from
1. 2 all possible samples is approximately normal with mean μ and standard
2. 3 deviation σ/√n. The central limit theorem gives a precise description of the
3. 4 distribution that you will obtain if you selected every possible sample,
4. 8 calculated every sample mean, and constructed the distribution of the sample
mean. The importance of the theorem lies in the fact that we can use it to
describe a sampling distribution without actually having to sample a
population of raw scores 'infinitely', and because of this we can calculate the
extent to which any sample mean approximates the mean of the population
from which it was drawn.
Just as the normal distribution is defined by its mean and standard deviation,
so the distribution of sample means is described by the same two quantities.
The central value of the sampling distribution equals the population mean (i.e.
the mean of the distribution of all possible means is the same as the mean of
the population from which the samples were drawn, or µẋ = µ) while the
standard deviation of the sample means is estimated by a value we call
the standard error of the mean. Like a standard deviation, the standard
error of the mean tells us by what average amount the sample means deviate
from the mean of the sampling distribution. It is an estimate of the size of the
error we shall make if we use the mean of the distribution of sample means
as an estimate of the true population mean, that is, if we use µẋ to estimate µ.
The standard error is denoted by σẋ. The σ indicates that we are describing a
population, and the subscript ẋ informs us that we are dealing with a
population of sample means. The standard error is given by dividing the
population standard deviation by the square root of the sample size:
σẋ = σ / √n
where: μ = 64 (mean)
σ = 8 (standard deviation)
n = 16 (sample size)
So: σẋ = 8 / √16 = 8 / 4 = 2
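The standard error for Question 17 follows directly from the formula σ / √n. A minimal sketch, with a hypothetical helper name:

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: population SD divided by sqrt(sample size)."""
    return sigma / math.sqrt(n)

# Question 17: sigma = 8 kg, samples of size n = 16
se = standard_error(8, 16)
print(se)
```

Quadrupling the sample size halves the standard error, so samples of 64 recruits would give a standard error of 1 kg.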
Base your answers to Questions 19 and 20 on the information in the table below:
Subject Student X Mean of class Std. dev. of class
A 50% 46% 2%
B 55% 50% 4%
C 60% 50% 6%
D 66% 65% 3%
19 In which subject did Student X do best, relative to his 1 P55 Calculate the z-score for each subject. The subject with the highest z-score is
class? the one in which Student X did best.
1. A Formula is: z = (x − μ) / σ
2. C
3. D Where x = Student X score
4. B μ = Mean of class
σ = Std dev of class
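Applying z = (x − μ) / σ to each row of the table shows why subject A is the answer to Question 19. This is a sketch with illustrative variable names; the numbers are the percentages from the table.

```python
# Subject -> (Student X score, class mean, class standard deviation)
subjects = {
    "A": (50, 46, 2),
    "B": (55, 50, 4),
    "C": (60, 50, 6),
    "D": (66, 65, 3),
}

# z = (x - mu) / sigma for each subject
z_scores = {name: (x - mu) / sigma for name, (x, mu, sigma) in subjects.items()}

# The subject with the largest z-score is where Student X did best
best = max(z_scores, key=z_scores.get)
for name, z in z_scores.items():
    print(name, round(z, 2))
print("Best relative to class:", best)
```

Student X's highest raw mark is in subject D, but it is only a third of a standard deviation above the class mean, whereas the mark in A is two standard deviations above it.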
21 A z-score is conventionally used to refer to a variable 4 P52-53 There is one form of the normal distribution that is of special importance. This
from which probability distribution? curve has a mean of μ = 0 and a standard deviation of σ = 1 and is known as
the standard normal distribution, and is by convention indicated with the
1. Any normal distribution letter 'z' (so it is also referred to as the z-distribution). The measures on this
2. The binomial distribution distribution are referred to as standard scores or z-scores.
3. The even distribution
4. The standardized normal distribution
25 The asymptotic property of the normal curve refers to 2 P52 Normal curves share a number of key properties, such as the following:
the fact that ______ • They are bell-shaped. Most observations occur at the midpoint of the
curve.
1. the curve is bell-shaped • They are symmetrical. The left side is a mirror image of the right side.
2. the endpoints of the curve get continuously closer • They are continuous. Theoretically, the values which the variables can
to the x-axis without ever touching it assume are infinite and are measured on a truly continuous scale so that
3. the curve has a standardised variance the curve is smooth.
4. the curve is symmetrical • Their curves are asymptotic, which means that the two tails never touch
the horizontal axis but move ever closer to it as they extend towards infinity, because there is always
some probability that more extreme values will occur.
26 The standard error is a measurement of ______ 1 P60 We can estimate the size of the error we would make if we used the sample
mean as an estimate of the population mean. This is referred to as the
1. how well a sample mean approximates a population standard error, and it is specified in the central limit theorem.
mean
2. the extent to which a variable varies around its P61 The standard error is denoted by σẋ. The σ indicates that we are
mean describing a population, and the subscript ẋ informs us that we are dealing
3. the extent to which one variable changes as with a population of sample means. The standard error is given by dividing
another one changes the population standard deviation by the square root of the sample size
4. the size of the error being made when you fail to σẋ = σ / √n
reject a null hypothesis which is actually false
Like a standard deviation, the standard error of the mean tells us by what
average amount the sample means deviate from the mean of the sampling
distribution. It is an estimate of the size of the error we shall make if we
use the mean of the distribution of sample means as an estimate of the
true population mean, that is, if we use µẋ to estimate µ.
27 Statistical hypotheses are statements about ______ 1 P74 Take note that a research hypothesis always translates into two mutually
exclusive hypotheses (i.e. both cannot be true at the same time): a null and
1. population parameters an alternative hypothesis. Also remember at this stage that, in Topic 1, we
2. sample statistics referred to quantities such as µ as parameters (population parameters). These
3. characteristics of statistical distributions particular statistical hypotheses are, thus, statements about the value of
4. all of the above a particular population parameter.
The p-value is read from appendix D as p(Z < -3) = p(Z > 3) = 0.0013. We
read the smaller p in appendix D because we are looking for p(Z < -3) or p(Z
> 3). It is a one-tailed probability so do NOT multiply by 2 to get 0.0026 for a
two-tailed test.
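The tail probability read from the table can be approximated without appendix D. The sketch below uses the complementary error function from Python's standard library; the helper name upper_tail is made up for illustration.

```python
import math

def upper_tail(z):
    """p(Z > z) for the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# One-tailed probability: p(Z > 3), which by symmetry equals p(Z < -3)
p_one_tailed = upper_tail(3.0)

# A two-tailed test would double this value
p_two_tailed = 2 * p_one_tailed

print(round(p_one_tailed, 5), round(p_two_tailed, 5))
```

The computed one-tailed value is about 0.00135, which agrees with the 0.0013 read from the table.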
29 The hypothesis “H1: µ > 50" is a (a) ______ hypothesis 4 H1: µ > 50
and requires a (b) ______ statistical test
The ">" indicates a directional hypothesis which requires a one-tailed test
1. (a) non-directional (b) one-tailed
2. (a) directional (b) two-tailed P75-76 The alternative hypothesis can contain any of the symbols '>', '<' or '≠'
3. (a) non-directional (b) two-tailed respectively, the symbols for 'larger than', 'smaller than' or 'not equal to'.
4. (a) directional (b) one-tailed
When a comparison is between a value that is greater (more) than another,
we use the symbol '>' and when a comparison is between a value that is
smaller (less) than another, we use '<'. The statistical test that must be
performed in either of these cases is a directional or one-tailed statistical
test (we use these expressions interchangeably).
When we do not specify what the direction of the difference should be, and
both a larger and a smaller difference between means are considered as
relevant, the symbol '≠' must be used. The statistical test to be performed will
now be a non-directional or two-tailed test.
The important point to remember is that the p-value indicates more or less
how likely the particular result we have observed in our data is if the null
hypothesis were true; or, as we say, 'under the null hypothesis'.
30 The level of significance of a statistical test ______ 2 SG P82-84 The α-value specifies the maximum risk that we are willing to take of making
an error if we reject the null hypothesis.
1. refers to the p-value which is calculated from the test
statistic We know that the extent of the type I error that a researcher is willing to make
2. indicates the maximum risk that a researcher is is controlled by the researcher by setting the level of significance (α) in
willing to take of making an error of Type I advance. The probability of a type II error (β) is not controlled in advance by
3. is the probability of obtaining the sample statistic the researcher except for the fact that we know that the lower (smaller) the
under the null hypothesis probability of a type I error (α) the greater (larger) the probability of a type II
4. is used to indicate the probability of making an error error (β).
by not rejecting the null hypothesis
You could eliminate error of type I (rejecting H0 when you should not)
altogether by never rejecting the null hypothesis, irrespective of how small the
p-value becomes, but that would make the probability of not rejecting H0
when you should reject it (error of type II) an absolute certainty.
31 When applying a statistical test, if the p-value is larger 1 SG 82- An error of Type I is the error we make if we reject the null hypothesis when
than the level of significance we ______ the alternative 86 we should not have done so, and the level of significance represents the
hypothesis greatest risk of doing this that we are willing to take.
Tut202
1. do not accept 2014 Q5 We know that the extent of the type I error that a researcher is willing to make
2. fail to reject is controlled by the researcher by setting the level of significance (α) in
3. accept advance. The probability of a type II error (β) is not controlled in advance by
4. cannot make a conclusion about the researcher except for the fact that we know that the lower (smaller) the
probability of a type I error (α) the greater (larger) the probability of a type II
error (β).
You could eliminate error of type I (rejecting H0 when you should not)
altogether by never rejecting the null hypothesis, irrespective of how small the
p-value becomes, but that would make the probability of not rejecting H0
when you should reject it (error of type II) an absolute certainty.
Base your answers to Questions 33 to 37 on the following scenario:
Rose is interested in the problem of depth perception. She wonders whether artists who practise visual arts, and who are known to have made a study of
the problem of perspective, would be better at judging depth than people in general. She decides to investigate this using a test for depth perception
which was standardized on the general population with a mean of 5, where a greater number implies better depth perception on a scale of 1 to 9. She
randomly draws 100 students who had graduated from a class on perspective at a school for fine arts and tests each of them on the depth perception
test. She finds that the mean depth perception score of her sample is 6.2 and the sample standard deviation is 1.7
33 How would you describe the population investigated in 3 She randomly draws 100 students who had graduated from a class on
this research? perspective at a school for fine arts
34 Which of the following best describes the research or 2 She wonders whether artists who practise visual arts, and who are known
theoretical hypothesis to be tested? to have made a study of the problem of perspective, would be better at
judging depth than people in general
1. Depth perception is related to artistic ability
2. Visual artists have a superior ability for depth
perception to people in general
3. Students from the school of visual arts have better
depth perception than the general population
4. The relationship between depth perception and
artistic ability is statistically significant
35 Which of the following are appropriate null and 3 H0 always has "=" sign so option 4 is incorrect.
alternative hypotheses?
The hypothesis Rose made was " She wonders whether artists who practise
visual arts, and who are known to have made a study of the problem of
1. H0: μ = 5 H1: μ < 5 perspective, would be better at judging depth than people in general"
2. H0: μ = 5 H1: μ ≠ 5
3. H0: μ = 5 H1: μ > 5 Therefore, H1 must state that the mean is greater than the population mean (5)
4. H0: μ ≠ 5 H1: μ > 5 H1: μ > 5
37 Which is the appropriate test statistic to calculate? 2 P102-106 The t-statistic for the mean of a single sample. This is because the population
standard deviation is unknown. What is given was calculated from a sample of 100.
1. The t-statistic for the difference between the means
of two independent groups In this question the population standard deviation (σ) is considered to be
2. The t-statistic for the mean of a single group unknown because the given standard deviation comes from the sample. So
3. The z-statistic for the mean of a single group we have to use the t-test (t)
4. The t-statistic for the difference between the means
of two dependent groups The important point is that - as in the case of the z-distribution - the t-
distribution is a statistical distribution with a probability distribution that can be
determined, which means that we can use it to predict the chances of
obtaining specific outcomes when testing for comparisons of means when the
population standard deviation σ is unknown.
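For Rose's data the single-sample t-statistic can be sketched as follows. The formula t = (ẋ − μ) / (s / √n) is the usual one for this test; the function name and the printout are illustrative only, and the p-value would still have to be looked up for n − 1 = 99 degrees of freedom.

```python
import math

def one_sample_t(sample_mean, mu0, s, n):
    """t-statistic for the mean of a single sample against a hypothesised mean."""
    estimated_se = s / math.sqrt(n)  # standard error estimated from the sample SD
    return (sample_mean - mu0) / estimated_se

# Scenario values: sample mean 6.2, population mean 5, s = 1.7, n = 100
t = one_sample_t(6.2, 5, 1.7, 100)
df = 100 - 1
print(round(t, 2), df)
```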
38 When two population means are compared, the p- 1 P81 Here is a summary of the important points regarding the p-value:
value is calculated to represent the probability of • The p-value gives the probability of obtaining the sample result under H0.
observing a specific difference between the sample • If the p-value is very small, the probability is very small that the sample
means given that ______ result would occur under H0, and one should consider rejecting H0 in
favour of H1.
1. H0 is true • The smaller the p-value, the more likely that the null hypothesis is false
2. H1 is true and should be rejected in favour of the alternative hypothesis.
3. H0 is false
4. H1 is false So, if the p-value is large, the sample result is quite likely to
occur under H0, and one should not reject H0 in favour
of H1. The data are then consistent with the null hypothesis.
P86 Effect size: A major determinant of the sensitivity or power of a statistical test
is sample size (which is why we can increase sample size to enhance
power). When the sample is large, even smaller effects will have statistical
significance. The reason is that the larger the sample, the less error variance
can be expected (variance purely due to randomness). This is due to a
principle called the law of large numbers, which states that on average the
result obtained from a large number of trials should be close to the expected
value, and will tend to become closer as more trials are performed (this law is
described in section 2.1.2). This implies that when sample sizes are large,
even sample effects that seem insignificant can produce small p-values,
leading to the rejection of H0.
P88 Effect size, power and sample size are interrelated; you can determine one if
you have information regarding the other two. For example, if you set your
desired effect size and know the power of the test, you can use this to
determine what an optimal sample size would be to use the test effectively.
40 The mean score of a sample of research participants is 1 p-value = 0.036
compared with a population mean of 20 for a particular α = 0.01
questionnaire which measures anxiety level. The
following hypothesis is set up to be tested p-value(0.036) > α(0.01)
H0: μ = 20 Since the p-value(0.036) is greater than the level of significance (0.01), we do
H1: μ ≠ 20 not reject the null hypothesis.
A researcher draws a random sample of 25 persons Therefore: H0: μ = 20 (or very close to 20)
and calculates the mean score and the standard
deviation of this sample. This is used to calculate a t-
test statistic to test the hypothesis at a significance The first step is to apply the decision rule based on the p-value.
level of α = 0.01. If a p-value of p = 0.036 is found, We decide that we will not reject H0, and we have been given that H0:
which of the following statements about the mean μ = 20.
which was calculated from this sample is most likely to
be true? If we look at the options provided: if we choose option 2 we would be saying
that we accept the alternative hypothesis, but we have already decided not to
1. It is close to 20 do that. So it can't be option 2. The same reasoning applies to option 3. And
2. It differs significantly from 20 option 4 is not correct because we have been provided with all the
3. It is definitely not equal to 20 information required to apply the decision rule: we have the
4. There is not sufficient information given to estimate significance level and the p-value.
it
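The decision rule used for Question 40 is a single comparison, sketched below. The function name decide is hypothetical, and the rule assumed here is the one stated in these notes: reject H0 only when the p-value is smaller than α.

```python
def decide(p_value, alpha):
    """Apply the decision rule: reject H0 only if the p-value is below alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"

# Question 40: p = 0.036 tested at alpha = 0.01
print(decide(0.036, 0.01))

# The same p-value would lead to rejection at the more lenient alpha = 0.05
print(decide(0.036, 0.05))
```

The second call shows that the conclusion depends on the level of significance chosen in advance, not on the p-value alone.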
41 When two means are compared, the p-value 4 Tut202 The null hypothesis states that there is no difference in the means calculated
expresses the probability that a difference between the 2014 Q8 from samples of data from each of groups A and B. When we calculate the
means ______ two means from sample data (which we regard as an observation) we may
find a difference in the two calculated means, but at least part of this
1. will be significant difference could be due to measurement errors. We calculate the p-value
2. is due to the alternative hypothesis (based on a test statistic with a known probability distribution) to find
3. will be found between the means out what the probability is that these observed differences in the
4. is due to chance or sampling error sample data are just a consequence of measurement error if the null
hypothesis is assumed to be true. If this probability is low (lower than a
pre-determined cut-off level, α), we conclude that the difference in the two
means is statistically significant and reject the null hypothesis.
In other words, we conclude that the size of the difference between means
found in the sample data would not be likely if the null hypothesis were true.
43 Cohen's d refers to the ______ 2 P87 One way that statisticians have suggested to deal with this problem is by the
notion of effect size. Different procedures exist to determine the effect size of
1. difference score when two means from dependent a result. In the case of a comparison between means, one way of calculating
samples are compared this is by the use of Cohen's d. We do this by expressing the mean difference
2. effect size that we observed relative to the standard deviation: d = (x̄1 - x̄2) / sp
3. power of a test
4. amount of variance shared by two variables when
they are correlated
44 Effect size is calculated to determine ______ 4 The effect size is to assess whether a significant effect is meaningful from a
practical point of view.
1. whether an effect is statistically significant or not
2. the ability of a statistical test to detect a significant P87 The implication is that we have to be careful how we interpret significant
relationship between variables when such a results. A p-value of smaller than our chosen level of significance (α) simply
relationship does in fact exist implies that, relative to this sample, it is improbable that the effect we see in
3. the level of confidence one can reach that the test our observations is purely due to chance. It does not imply that the effect is
is valid big or important. This is something that we have to decide by looking at what
4. whether a significant effect is meaningful from a the data means. One way that statisticians have suggested to deal with this
practical point of view problem is by the notion of effect size. Different procedures exist to determine
the effect size of a result. In the case of a comparison between means, one
way of calculating this is by the use of Cohen's d. We do this by expressing
the mean difference that we observed relative to the standard deviation: d = (x̄1 - x̄2) / sp
45 A random sample of n=100 people are tested to see 2 Probability (p-value) = ??
how many items they can recall from a list with pictures Mean (x̄) = 7
of 12 items. The distribution of the results is found to Std Dev (s) = 2.0
be more or less normal with a mean of x̄ = 7 and a Sample (n) = 100
standard deviation of s = 2.0. What is the probability Raw score (X) = 10
that a specific person, chosen at random from the
general population, will remember 10 or more items First calculate the z-score using z = (X - x̄) / s:
from the list?
z = (10 - 7) / 2
z=3/2
z = 1.5
Now you have standardised the normal distribution so the mean is 0 and the
std dev is 1. When you look up the z-score (1.5) in the standard normal
distribution tables (Appendix D) you will see the larger and smaller portion
values. Larger portion is 0.9332 (93%) and smaller is 0.0668 (7%). The probability of recalling 10 or more items is the smaller portion, 0.0668.
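The table lookup can be cross-checked numerically. This is a minimal sketch that assumes Python's standard-library `statistics.NormalDist` as a stand-in for the Appendix D table; the variable names are illustrative only and are not taken from the study guide.

```python
from statistics import NormalDist

mean = 7.0       # sample mean given in the question
std_dev = 2.0    # sample standard deviation given in the question
raw_score = 10   # "10 or more items"

# Standardise the raw score: z = (X - mean) / s
z = (raw_score - mean) / std_dev

# Area below z is the "larger portion"; area above z is the "smaller portion"
larger_portion = NormalDist().cdf(z)
smaller_portion = 1 - larger_portion

print(f"z = {z:.2f}")
print(f"larger portion  = {larger_portion:.4f}")
print(f"smaller portion = {smaller_portion:.4f}")
```

Because the question asks for "10 or more", the area above z (the smaller portion) is the one that applies.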
Suppose that the memory span of adults is normally distributed with a mean of 7 items and a standard deviation of 2 items. A researcher is investigating
the impairment of memory among persons who have been diagnosed as suffering from Korsakoff's syndrome (a neurological disorder linked to chronic
alcohol abuse). He intends to test his prediction on a sample of 50 persons who were diagnosed as suffering from this syndrome.
47 Which of the following is an appropriate null hypothesis 2 P73-75 The null hypothesis will always contain equal signs. In this case H0 : μ = 7.
for testing the above prediction?
H0 is defined as the hypothesis of no effect.
1. The mean memory span of the population of
persons suffering from Korsakoff's syndrome is • The null hypothesis (H0) represents the status quo or the current belief in
smaller than 7 a situation. The null hypothesis will always contain equal signs.
2. The mean memory span of the population of • The alternative hypothesis (H1) is the opposite of the null hypothesis and
persons suffering from Korsakoff's syndrome is represents a research claim or specific inference you would like to prove.
equal to 7 This means that the alternative hypothesis takes the sign of the test
3. The mean memory span of the population of depending on the situation.
persons suffering from Korsakoff’s syndrome is not o If we are testing the difference, H1 is indicated with ≠.
equal to 7 o Otherwise we can use signs like less than (<) or greater than (>)
4. The mean memory span of the sample of persons depending on the problem statement.
suffering from Korsakoff's syndrome is equal to 7 • If you reject H0, you have statistical proof that the alternative is correct.
• If you do not reject H0, you have failed to prove that the alternative
hypothesis is correct. Failure to prove the alternative hypothesis does not
necessarily mean that the null hypothesis is true.
• The null hypothesis (H0) always refers to a specific value of a parameter
(such as μ), not a statistic (such as x̄). This value is always known or will
come from the given scenario.
48 Which of the following is an appropriate alternative 1 A researcher is investigating the impairment of memory among persons who
hypothesis for testing the above prediction? have been diagnosed as suffering from Korsakoff's syndrome.
1. The mean memory span of the population of "Impairment" indicates less than or smaller than, so H1 : μ < 7
persons suffering from Korsakoff's syndrome is
smaller than 7 Therefore: The mean memory span of the population of persons suffering from
2. The mean memory span of the population of Korsakoff's syndrome is smaller than 7
persons suffering from Korsakoff's syndrome is
equal to 7
3. The mean memory span of the population of
persons suffering from Korsakoff's syndrome is not
equal to 7
4. The mean memory span of the sample of persons
suffering from Korsakoff's syndrome is smaller than 7
49 Testing the above prediction on a sample will require a 3 P75 Directional because H1 : μ < 7. It is also one-tailed because it only focuses on
______ statistical test values smaller than 7 and not on values larger than 7 as well.
1. non-directional Two-tailed is when H1 : μ ≠ 7. Now the focus will be on results both smaller than and larger
2. two-tailed than 7.
3. directional
4. non-parametric
50 A pharmaceutical company claims that a new sleeping 2 P81 The relationship between one-tailed and two-tailed p-values can be
pill which they are marketing will put people to sleep in summarised as follows:
less than 15 minutes. A researcher wants to test if the • One-tailed p-value = (two-tailed p-value) / 2
average time before people fall asleep after using this • Two-tailed p-value = (one-tailed p-value) x 2
pill matches this claim. She uses the following
hypothesis The important point to remember is that the p-value indicates more or less
how likely the particular result we have observed in our data is if the null
H0: μ = 15 hypothesis were true; or, as we say, 'under the null hypothesis'.
H1: μ ˂ 15
1. 0.03450
2. 0.01725
3. 0.06900
4. Insufficient information is given to determine this
value
51 A researcher wants to compare the mean of the non- 4 P61-62 The standard error is an extremely valuable measure because we can use it
verbal reasoning scores of a sample of n=25 students to estimate how well a sample mean approximates its population mean in
with that of the general population. According to the general, that is, how much error you can expect on average between the
literature, the non-verbal reasoning test which she sample mean (x̄) that you calculated from your sample and the population
uses was standardized to a population mean of μ = mean (μ) that you are trying to estimate.
100 and a population standard deviation of σ = 10.
What is the value of the standard deviation of the In other words, it is an indication of the size of the error that you make by
sampling distribution of the mean which will be using a sample of a particular size (n) to determine the population mean. This
required to calculate the zx̄ test statistic? amount of error will decrease as the size of the sample increases.
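The standard error asked for here follows directly from the symbol table's definition σ/√n. As a minimal sketch using the values given in the question (the variable names are illustrative):

```python
import math

sigma = 10.0   # population standard deviation given in the question
n = 25         # sample size given in the question

# Standard error of the mean (standard deviation of the sampling
# distribution of the mean): sigma / sqrt(n)
standard_error = sigma / math.sqrt(n)

print(standard_error)  # 2.0
```

Re-running the sketch with a larger n shows the standard error shrinking, which is the point made in the comment about sample size.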
A market researcher is asked to conduct a study to examine people's reactions to a movie trailer. He draws a random sample of 20 males and 20 females
who saw the trailer. He asks them to indicate how likely it is that they will go and see the movie on a 7-point scale, where 1 indicates 'not at all' and 7
indicates 'definitely'. He wants to establish whether males and females differ in their intention to see the movie based on exposure to the
trailer.
Suppose the researcher finds that the means and standard deviations for each group in the sample are as follows:
1. H0: x̄M = x̄F H1: x̄M ≠ x̄F The word "differ" does not indicate a direction and therefore the alternative
2. H0: μM = μF H1: μM > μF hypothesis must have a "≠" sign.
3. H0: μM = μF H1: μM ≠ μF Hypotheses are tested on population parameters only, therefore only "μ", "σ"
4. H0: μ = 0 H1: μ ≠ 0 and "P" can be used. A hypothesis is not stated for samples or statistics ("x̄"
or "s").
P161
Summary value                                    Population     Sample
                                                 (Parameter)    (Statistic)
Arithmetic mean                                  μ              x̄
Standard deviation                               σ              s (s = √s²)
Variance                                         σ²             s²
Standard error of the mean                       σx̄ (= σ/√n)    sx̄ (= s/√n)
  (also called the standard deviation of the
  sampling distribution of the mean)
Mean of the sampling distribution of the mean    µx̄
  (mean of all means; equal to the mean under H0)
z-score for means                                zx̄
Correlation between two measurements             ρ              r
  (Pearson's r)
Proportions                                      P              p
54 Which is the appropriate test statistic to calculate? 2 P110 Samples are considered as comprising independent groups if the
composition of the one sample in no way affects, in any systematic way, the
1. The z-statistic for the difference between the means composition of the other sample. The two samples come from two groups
of two samples that have no obvious relationship. For example, where one sample is
2. The t-statistic for the difference between the means measurements of a construct like 'self-esteem' among men, and the other
of two independent samples among women, but both groups were sampled purely randomly.
3. The t-statistic for the mean of a single sample
4. The t-statistic for the difference between the means On the other hand, the concept of dependent groups refers to situations
of two dependent samples where the samples are related, and it implies that each subject in one group
can be systematically paired off with a subject from the other group. For this
reason, a dependent groups research design is often referred to as a
matched-pairs design.
55 A researcher is asked by a motivational speaker to 2 P120 This t-test statistic (td) is used for the comparison of means from two matched
establish whether a workshop on assertiveness or dependent samples.
training is effective. The researcher decides to use a
particular questionnaire which tests an individual's P110 The concept of dependent groups refers to situations where the samples are
level of assertiveness. He presents the questionnaire related, and it implies that each subject in one group can be systematically
to each of a sample of 50 participants in the workshop paired off with a subject from the other group. For this reason, a dependent
before it begins and once again after it has ended. groups research design is often referred to as a matched-pairs design.
When analysing these results the researcher should
use a statistical test for the ______ Another example of such a design would be a repeated measures design,
where the same research participant is observed under more than one
1. comparison of means for a single group treatment or experimental condition. For example, to test the effectiveness of
2. comparison of means for two dependent groups a psychotherapy technique, people can be tested before the treatment
3. comparison of means for two independent groups begins, and again afterwards. The two sets of measurement (indicated by two
4. correlation of two variables variables) can be regarded as two samples of data, which are to be compared
to see whether some kind of change has taken place. Dependent samples
are also sometimes referred to as correlated samples.
NOTE: Make sure that you do not confuse the notion of dependent versus
independent samples with the distinction between dependent and
independent variables (Topic 1, section 1.3.2). While the latter refers to the
relationships among variables - how one may affect the other - in the case of
samples it is a relationship among the groups from which the data were
collected (i.e., where the variables were measured) that is of concern.
56 The probability under the null hypothesis of obtaining a 2 SG P81 A one-tailed p-value (used in the case of a directional hypothesis) is half the
t-value of 2.0 or higher in the case of a two-tailed test size of a two-tailed probability.
is ______ that for a one-tailed test Tut201
2014 Conversely, a two-tailed p-value (used in the case of a non-directional
Q12 hypothesis) is twice the size of a one-tailed p-value.
1. the same as
2. twice The relationship between one-tailed and two-tailed p-values can be
3. half summarised as follows:
4. impossible to calculate from • One-tailed p-value = (two-tailed p-value) / 2
• Two-tailed p-value = (one-tailed p-value) x 2
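The two conversion rules can be written as a pair of small helper functions. This is a sketch only: the function names are hypothetical, and the value 0.0345 is an illustrative two-tailed p-value rather than one taken from a specific question. Halving is only appropriate when the observed effect lies in the direction predicted by H1.

```python
def one_tailed_from_two_tailed(p_two_tailed):
    """One-tailed p-value = (two-tailed p-value) / 2."""
    return p_two_tailed / 2


def two_tailed_from_one_tailed(p_one_tailed):
    """Two-tailed p-value = (one-tailed p-value) x 2."""
    return p_one_tailed * 2


p_two = 0.0345                             # illustrative value (assumption)
p_one = one_tailed_from_two_tailed(p_two)  # half of the two-tailed value
p_back = two_tailed_from_one_tailed(p_one) # doubling recovers the original

print(p_one, p_back)
```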
57 When calculating the t-test for two independent 4 P113 In order to use the t-test (tc) statistic, we need to make two assumptions
samples, which of the following assumptions must be regarding the data:
made if the sample sizes are relatively small? • that the two populations being compared are normally distributed
• with the same variance (or standard deviation).
a) the value of σ is known for both populations
b) the two population means differ P116 Note: Even the most elementary statistics program makes provision for
c) the two populations have the same variance performing t-tests. Such programs usually require that we indicate which
d) the two samples come from normally distributed variable should be used to identify the two groups and which is the dependent
data variable. In addition, we have to choose between a tc test for independent
samples or a td test for dependent or correlated groups
1. (a) and (d)
2. (b) and (c)
3. (a) and (c)
4. (c) and (d)
58 A sample of 70 people is tested on a test for 4 P119- "70 people is tested on a test for assertiveness before and after a
assertiveness before and after a workshop in which 120 workshop."
they are given assertiveness training. Which of the
following is the most appropriate formula for comparing We therefore know that we are dealing with two matched or dependent
the mean assertiveness score before the training with samples. We have to use the td test.
the one thereafter?
1.
2.
3.
4.
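Since the formula options did not survive in this copy, the following sketch shows how a t-statistic for two dependent samples is commonly computed from difference scores: the mean difference divided by its standard error, with n - 1 degrees of freedom. The before/after scores are hypothetical and the study guide's own notation for td may differ.

```python
import math
from statistics import mean, stdev

# Hypothetical assertiveness scores for the same five people
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 13, 14]

# Difference score for each matched pair
differences = [a - b for a, b in zip(after, before)]

n = len(differences)
mean_diff = mean(differences)   # mean of the difference scores
sd_diff = stdev(differences)    # sample SD of the difference scores (n - 1)

# t for dependent samples: mean difference over its standard error
t_d = mean_diff / (sd_diff / math.sqrt(n))
df = n - 1

print(f"t_d = {t_d:.3f}, df = {df}")
```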
A researcher compares a sample of children from a special school for gifted children with a group of children randomly drawn from other schools on a test
which measures the creativity of the children on a 9-point scale. She finds the following:
1. Between 0.0 and 0.3 d = (x̄1 - x̄2) / sp = (5.5 - 4.9) / 1.0 = 0.6 / 1.0 = 0.6
2. Between 0.3 and 0.5
3. Between 0.5 and 0.8 d = 0.6 which lies between 0.5 and 0.8
4. Greater than 0.8
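The same calculation as a short sketch, using the means (5.5 and 4.9) and the pooled standard deviation (1.0) that appear in the worked answer. The function name is hypothetical.

```python
def cohens_d(mean_1, mean_2, pooled_sd):
    """Mean difference expressed relative to the pooled standard deviation."""
    return (mean_1 - mean_2) / pooled_sd


d = cohens_d(5.5, 4.9, 1.0)

print(round(d, 2))  # 0.6
```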
61 A scatter plot is a graphical representation of the 2 SG A graph showing the position of each of a number of sampling units on each
relation between ______ P130- of two variables
132
1. two variables measured on a nominal scale within a A scatter plot is a graph showing the relationship between two numerical
single group Tut202 variables. In such a graph the data of the one variable are plotted on the
2. two variables measured on a ratio or interval scale 2014 horizontal axis (usually referred to as the X axis), and the data of the other
within a single group Q18 variable on the vertical (or Y) axis. It is not a comparison of sample and
3. two groups of subjects measured on an interval or population, nor does it have to do with the spread of data or the independence of
ratio scale on a single variable variables.
4. two groups of subjects measured on an interval or
ratio scale on two variables
62 A researcher obtains a correlation coefficient of 0.40 3 P130- Pearson product-moment correlation coefficient - the notion of the
between IQ scores and examination marks in a 140 relationship between two continuous variables and how the size of the
random sample of 10 PYC3704 students, and again a relationship can be expressed in terms of a correlation between them. This
correlation coefficient of 0.40 between the same two coefficient can also be used as a test statistic.
variables on another random sample of 100 PYC3704
students. Which of these two correlation coefficients is Correlation is a measurement of the extent to which a measurement on one
the more likely to differ significantly from zero under variable is related to a measurement on another variable for the same sample
the null hypothesis? of individual cases.
1. That obtained on the smaller sample P139 One should, however, be careful as to how one interprets a significant result.
2. Both are equally likely to be significant To clarify this, consider the relationship between the calculated significance
3. That obtained on the larger sample (the p-value) and the sample size (n).
4. There is no relationship between the size of the
correlation coefficient and significance For a smaller sample n, the test must be much more conservative. You must,
therefore, put up a bigger hurdle to be crossed before you conclude that the
result is not the consequence of chance. You, therefore, require a larger
value of r before you can conclude that the result is not a chance event due
to sampling or measurement error, but an actual representation of the state of
affairs in the population. The consequence of this is that, for a large sample,
a relatively modest correlation can turn out to be significant. For example, for
a sample of n = 40 (as in the HIV/AIDS research project in Appendix A), the
value of r must be at least r = 0.26 for α = 0.05 (a 5% level). If we increase
the sample size to 100, a smaller result of r = 0.16 would be significant at the
same level of α = 0.05. This shows that, for a large value of n, a very modest
r can be significant. The implication of this is that significance does not
indicate that a relationship is large. It merely tells you that some relationship
exists (perhaps a modest one), and that it is large enough not to be regarded
as purely due to the effect of chance, given the size of the sample.
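To see why the same r = 0.40 is more likely to be significant in the larger sample, r can be converted to a t-statistic. The sketch below assumes the standard conversion t = r√(n - 2) / √(1 - r²) with n - 2 degrees of freedom; this formula is not quoted in the comments above, and the function name is hypothetical.

```python
import math


def t_for_r(r, n):
    """Convert a Pearson r from a sample of size n into a t-statistic."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


t_small = t_for_r(0.40, 10)    # sample of 10 students
t_large = t_for_r(0.40, 100)   # sample of 100 students

print(f"n = 10:  t = {t_small:.3f}")
print(f"n = 100: t = {t_large:.3f}")
```

The t-value grows with n for a fixed r, so the larger sample clears the critical value far more easily.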
63 Which of the combinations of the options below can be 2 The question reads: "when a significant negative correlation is found"
substituted in the following sentence to describe the
situation when a significant negative correlation is P133 When positive relationships occur, this implies that as one variable gets
found between two variables X and Y? larger, so does the other.
A person who scores ______ on variable X is likely to When negative relationships occur, this implies that as one variable gets
have a ______ score on variable Y larger, the other gets smaller.
(a) low, low
(b) low, high
(c) high, low
(d) high, high
64 A researcher wants to establish whether the type of 4 SG P140 The chi-square test is usually used when you have a cross tabulation of
employment category that is filled by employees of a frequency counts of events which are nominal scale measurements. This
particular company (manager, middle manager, clerical Tut202 table is referred to as a contingency table. It is used to compare an observed
worker, or technical worker) is at all related to their 2014 frequency distribution (frequency counts based on a sample of observations)
gender (male or female). Which would be the most Q22 with the frequency distribution which we would expect to find if the null
appropriate test to use? hypothesis of no relationship between two cross-tabulated variables were
true. The variables involved are qualitative in nature.
1. The t-test for two independent samples
2. Pearson’s correlation test statistic
3. The t-test for two dependent samples
4. The Chi-square (χ²) test statistic
A group of hospitalized patients who have been diagnosed as suffering from schizophrenia are treated with certain drugs over a period of time. These
drugs were prescribed to improve their mental alertness. A researcher studies a random sample of 30 of these patients who have been on these drugs for
varying amounts of time, hoping to establish a relationship between the number of days of drug treatment and patients’ scores on a Mental Alertness Test.
65 Which is an appropriate null hypothesis for this 1 P161 ρ = Correlation between two measurements for population parameters
research? r = Correlation between two measurements for sample statistics
μ = population mean
1. ρ=0
2. μ1 = 0 This is a correlation/relationship between the number of days of
3. r=0 drug treatment and patients' scores on a mental alertness test. In this case,
4. μ1 = μ2 we cannot select r=0 because the hypothesis is tested on population
parameters.
Summary value                                    Population     Sample
                                                 (Parameter)    (Statistic)
Arithmetic mean                                  μ              x̄
Standard deviation                               σ              s (s = √s²)
Variance                                         σ²             s²
Standard error of the mean                       σx̄ (= σ/√n)    sx̄ (= s/√n)
  (also called the standard deviation of the
  sampling distribution of the mean)
Mean of the sampling distribution of the mean    µx̄
  (mean of all means; equal to the mean under H0)
z-score for means                                zx̄
Correlation between two measurements             ρ              r
  (Pearson's r)
Proportions                                      P              p
66 Which is an appropriate alternative hypothesis for this 1 P161 The question does not state "better than" or "more than" or any other directional
research? alternative. It is merely trying to establish whether a relationship exists. It is
therefore non-directional (≠) and needs a two-tailed test.
1. ρ≠0
2. μ1 ≠ μ2 Since the hypothesis is tested on population parameters as established in the
3. p>0 previous question, option 1 must be correct.
4. r>0
67 What is the expected frequency in cell AX of the 2 P143- It is important to note that the relation between the variables is described by
following Contingency table? 144 the cell and not by the row or column frequencies. These cell frequencies
represent the way the information is distributed relative to the two variables.
X Y These cell frequencies are often referred to as the observed or empirical cell
A 7 3 frequencies.
B 3 7
To find the expected frequency for a particular cell, the row total for that row
1. 3 is multiplied by the column total for that column and this result is then divided
2. 5 by the overall total. These expected frequencies show what the results would
3. 7 have been like if the distribution of frequencies through the cells were
4. 20 homogeneous, in proportion to the respective row and column totals. If the
observed frequencies correspond precisely with the expected frequencies,
we know that the null hypothesis cannot be rejected. But the observed
frequencies will seldom be precisely equal to the expected frequencies - even
if H0 is not rejected - because of sampling error.
X Y Total
A 7 3 10
B 3 7 10
Total 10 10 20
Row total (O.1) = 10
Column total (O1.) = 10
Sample total (size) (O..) = 20
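Applying the rule described above (row total × column total ÷ overall total) to this table can be sketched as follows. The variable names are illustrative.

```python
# Observed frequencies: rows A and B, columns X and Y
observed = [
    [7, 3],  # row A
    [3, 7],  # row B
]

row_totals = [sum(row) for row in observed]        # [10, 10]
col_totals = [sum(col) for col in zip(*observed)]  # [10, 10]
grand_total = sum(row_totals)                      # 20

# Expected frequency per cell = row total * column total / overall total
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

expected_ax = expected[0][0]
print(expected_ax)  # 5.0
```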
When positive relationships occur, this implies that as one variable gets
larger, so does the other. When negative relationships occur, this implies that
as one variable gets larger, the other gets smaller.
69 A contingency table represents ______ 4 App B Contingency tables are used to represent frequency counts of data that
have been classified in terms of 2 nominal variables (for example, gender
1. the distribution of the frequencies for a variable P142- and occupational category). It is possible to fit ordinal, interval or ratio scale
2. the data used to plot the relationship between two 144 measurements into such a table, but they would first have to be transformed
variables into a classification system; that is, the data have to be treated as if they
3. frequency counts for each of a number of possible represent nominal scale measurements.
outcomes of an experiment
4. the frequency counts when two nominal-scale Tut202 A contingency table is a two dimensional table used to represent the cross
variables are cross-classified 2014 classification, or cross tabulation, of the responses relating to two nominal or
Q20 categorical variables. It is basically a way to display and record the
relationship between the two variables. The frequency counts of one variable
are presented in the rows of the table and the frequency counts of the other
variable in the columns, as shown in table 6.4 on page 142 and table 6.5 on
p. 144 of the PYC3704 Guide
70 Which of the values given below is the closest to the 1 P132- Correlation coefficients that measure the linear relationship between two
probable value of the Pearson's product moment 133 variables, such as the Pearson product-moment correlation coefficient, can
correlation coefficient for the variables X and Y? have a continuous value that ranges from -1 to 1 (a positive value is usually
written without the sign, so '1' is presumed to mean '+1'). We use 'r' as the
Variable X 1 2 3 4 5 6 7 8 symbol that represents a correlation coefficient (as in the case of the Pearson
Variable Y 16 14 12 10 8 6 4 2 product-moment correlation coefficient), and the following applies:
• r = 1 implies a perfect positive linear relationship (the dots in a scatter plot
1. -1.0 will run from lower left to upper right in a perfectly straight line)
2. 0.5 • r = 0 implies no linear relationship at all (the dots may be scattered all
3. 0 over the place)
4. 1.0 • r = -1 implies a perfect negative linear relationship (the dots will run from
upper left to lower right in a straight line)
When positive relationships occur, this implies that as one variable gets
larger, so does the other or as one variable gets smaller, so does the other.
Variable Y does the same as Variable X.
When negative relationships occur, this implies that as one variable gets
larger, the other gets smaller or as one variable gets smaller, the other gets
larger. Variable Y does the opposite to Variable X.
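The value for this data set can be confirmed by computing r directly. This is a minimal sketch of the usual product-moment calculation; the function name is hypothetical.

```python
import math

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [16, 14, 12, 10, 8, 6, 4, 2]


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(x, y)
print(r)  # -1.0
```

Every increase of 1 in X comes with a decrease of exactly 2 in Y, so the points fall on a straight descending line.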
2 A researcher believes that there is a difference in the 4 P2 An inference is a conclusion that follows from existing information, by
reasoning strategies used to solve puzzles between generalising from the specific information to the general type of phenomenon,
students who study physical sciences such as physics where the conclusion is not absolutely certain. So in summary inferential
and chemistry and students who study social sciences statistics are techniques for making generalisations based on imperfect
such as psychology or sociology. She sets up a series P10-11 numeric data, where the conclusions have a high probability of being true, but
of puzzles to be solved by students from different you can never be completely certain.
colleges or faculties at a university. This kind of
research is referred to as ________ research A distinction exists between inferential statistics and descriptive statistics.
1. statistical Descriptive statistics refers to a set of quantities used to summarise
2. theoretical aspects of numerical data. Examples that you may be familiar with are
3. empirical means, range, variance and standard deviation (see Appendix C for a quick
4. inferential introduction). These summary quantities are sometimes referred to as
parameters (when they refer to the whole collection or population of data; see
section 1.4.3 below).
You are INFERRING from your sample back to your population of all
students. If they had said experimental that would also have been correct.
We don't really use the word EMPIRICAL to refer to a TYPE of research; it is
used to describe the nature of the research, i.e. that your research should be
testable.
# Question Ans Page Comments
3 Which of the following definitions best describes the 1 P6 The taking of a measurement is regarded as an act of observation
meaning of ‘measurement’ in the context of
psychological research? Measurement means to P7 A construct that has been measured in some way produces a variable. A
________ variable refers to a number that can take on any one of a range of possible
1. find a way to observe a specific construct or values. They can be discrete (when only whole numbers like 1, 2, 3 are
phenomenon which is hidden allowed) or continuous (what mathematicians refer to as 'real numbers'). In
2. determine the extent to which a specific some cases variables also take on values smaller than zero to produce
phenomenon is present on a numeric scale negative numbers.
3. specify the relationship that is believed to exist
between two (or more) constructs or phenomena So the (visible) variable reflects the intensity of the underlying (invisible)
4. calculate a summary value which describes an construct, in terms of how it was measured. We say that the variable is
aspect of a specific construct or phenomenon manifest (it is visible in the sense that we can observe it) and the construct is
latent (it is invisible in the sense that we need some way to make it appear).
So the latent construct is made manifest by the use of an appropriate
measurement procedure.
4 A variable is described as ‘manifest’ because it is a[n] 4 P7 So the (visible) variable reflects the intensity of the underlying (invisible)
(a) ______ measurement of a construct which is (b) construct, in terms of how it was measured. We say that the variable is
______ manifest (it is visible in the sense that we can observe it) and the construct is
1. (a) latent (b) observable latent (it is invisible in the sense that we need some way to make it
2. (a) dependent (b) independent appear). So the latent construct is made manifest by the use of an
3. (a) independent (b) dependent appropriate measurement procedure.
4. (a) observable (b) latent
P23 To say that a construct is 'latent' is another way of saying it is hidden from
direct observation
5 When a specific psychological construct or 3 P6 The taking of a measurement is regarded as an act of observation
phenomenon is measured on a quantitative scale, the
resulting value is referred to as a ______ P7 A construct that has been measured in some way produces a variable.
A variable refers to a number that can take on any one of a range of possible
1. parameter values. They can be discrete (when only whole numbers like 1, 2, 3 are
2. descriptive statistic allowed) or continuous (what mathematicians refer to as 'real numbers'). In
3. variable some cases variables also take on values smaller than zero to produce
4. test statistic negative numbers.
1. other concepts "Operational" refers to practical procedures by which constructs are made
2. observable instances visible.
3. latent variables
4. underlying constructs "Operationalisation" is where you make the construct (which is usually an
abstract concept, so it is difficult to observe it clearly) visible by finding some
suitable way to measure it.
7 Which of the following is appropriate as a research or 3 P9 a) H0: μM = μF H1: μM ≠ μF
operational hypothesis? Too many independent factors like job level or seniority etc
10 A researcher believes that people who make eye 1 P8-9 The dependent variable is the one that is predicted or explained, and the
contact with others when they speak to them are P24 independent variable is manipulated to see how it affects the dependent
generally perceived to be more trustworthy than those variable.
who do not. She sets up an experiment where a group
of 100 research participants are each interviewed by a The independent variable is that variable which affects the dependent
research assistant. In half of the cases the interviewer variable; or, conversely, the dependent variable depends on the independent
makes a lot of eye contact with the participants during variable.
the interview and in half of the cases no or very little
eye contact is made. Afterwards participants are asked When a researcher focuses on the interaction of only two variables at a time,
to rate the research assistant for level of the dependent variable is usually the one that the researcher is interested in,
trustworthiness. In this scenario, whether eye contact the variable that is the focus of the research. The independent variable is
was made or not is the (a) ______ variable, while something that the researcher manipulates, to see how this affects the
perceived level of trustworthiness is the (b) ______ dependent variable (in other words, the dependent variable is dependent on
variable the independent variable).
= 1 / 99 = 0.010
# Question Ans Page Comments
16 Suppose that over the years 2 000 students wrote the 3 P35-36 Part 1:
examinations in PYC 304-C and that 1200 of them p(E) = number of favourable events / number of possible outcomes
passed, of which 600 obtained exactly 50%. This = 600 / 2000 = 3/10 = 0.30
means that for randomly selected students the
probability of obtaining exactly 50% is ______ while Part 2:
the probability of obtaining 50% or more is ______ p(E) = number of favourable events / number of possible outcomes
= 1200 / 2000 = 6/10 = 0.60
1. 0.60; 0.30
2. 0.05; 0.60
3. 0.30; 0.60
4. 0.60; 0.50
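The two relative-frequency calculations for Question 16 can be checked with a short Python sketch. The counts come from the question; the function and variable names are chosen here purely for illustration.

```python
# Counts taken from Question 16.
total_students = 2000   # all students who wrote the exam
passed = 1200           # obtained 50% or more
exactly_50 = 600        # obtained exactly 50%


def probability(favourable, possible):
    """Relative-frequency probability: favourable events / possible outcomes."""
    return favourable / possible


p_exactly_50 = probability(exactly_50, total_students)
p_50_or_more = probability(passed, total_students)

print(round(p_exactly_50, 2), round(p_50_or_more, 2))
```

Both values match option 3 (0.30; 0.60).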
17 The probability value “p is larger than or equal to 0.2" 1 P33-34 p is larger than or equal to 0.2 (p ≥ 0.2)
is ______ the probability value "p is smaller than or p is smaller than or equal to 10% (p ≤ 10% or p ≤ 0.1)
equal to 10%"
So the probability value “p is larger than or equal to 0.2" is larger than the
1. larger than probability value "p is smaller than or equal to 10%"
2. larger than or equal to
3. smaller than or equal to
4. exactly the same as
Individual A B C D E F G H I J
Test score 12 12 7 10 9 12 13 8 9 8
# Question Ans Page Comments
18 The mean of the test scores is? 2 P59-60 Formula is: μ = ∑xi / N
1. 12.00
2. 10.00
3. 9.00 So:
4. 8.00 μ = (12+12+7+10+9+12+13+8+9+8) / 10
= 100 / 10
= 10
19 The standard deviation of the distribution of sample 3 P53 Formula is: z = (x - μ) / σ
scores is 2.11. Therefore the z-score for individual E is
1. 0.47
2. 1.42 Where:
3. -0.47 x = 9 (test score for individual E)
4. -1.42 μ = 10 (calculated in previous question)
σ = 2.11 (standard deviation)
So:
z = (x - μ) / σ = (9 - 10) / 2.11 = -1 / 2.11 = -0.474 ≈ -0.47
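As a rough check on Questions 18 and 19, the mean, standard deviation and z-score can be recomputed in Python. This sketch assumes that the 2.11 quoted in the question is the sample standard deviation (n - 1 in the denominator), which is what `statistics.stdev` computes; the variable names are illustrative.

```python
from statistics import mean, stdev

# Test scores for individuals A to J, taken from the table above.
scores = {"A": 12, "B": 12, "C": 7, "D": 10, "E": 9,
          "F": 12, "G": 13, "H": 8, "I": 9, "J": 8}

values = list(scores.values())
score_mean = mean(values)   # sum of the scores divided by the number of scores
score_sd = stdev(values)    # sample standard deviation (assumption: n - 1)

# z-score for individual E: distance from the mean in standard deviation units.
z_e = (scores["E"] - score_mean) / score_sd

print(score_mean, round(score_sd, 2), round(z_e, 2))
```

The negative sign shows that individual E scored below the mean.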
20 The proportion of scores less than z=0.00 is 2 App D P(Z < z) = 0.5000
1. 0.00 See standard normal distribution table in Appendix D for Smaller Portion of
2. 0.50 z=0.00
3. 1.00
4. -0.50 If z=0.00, then half the scores will be less than and half will be more than the
mean.
# Question Ans Page Comments
21 In a normal distribution, approximately ______ of the 3 P53 The normal curve (also known as the bell curve) is the most common
scores fall within 1 standard deviation of the mean distribution of data. The normal curve is completely determined by two
parameters: the mean (μ) and the standard deviation (σ); for the standard normal curve, μ = 0 and σ = 1. The normal curve is
1. 14% symmetric about the mean, which is also the median and the mode. Most data
2. 95% are clustered close to the mean.
3. 68%
4. 83% Theorem 1 The 68-95-99.7 Rule: In every normal distribution with mean µ
and standard deviation σ, approximately 68% of the data falls within one
standard deviation of the mean. Approximately 95% of the data falls within
two standard deviations of the mean. And finally, approximately 99.7%
(almost everything) of the data falls within three standard deviations of the
mean.
According to the standard normal distribution table (z-table), if z = 1 then the
area from the mean to z is 0.3413. Multiply by 2 to cover both sides of the mean: 0.6826 or
68.26%
So:
68.27% of the values lie within one standard deviation of the mean.
(0.3413 + 0.3413 = 0.6826 = 68.26% - numbers were rounded)
95.45% of the values lie within two standard deviations of the mean.
(0.1359 + 0.3413 + 0.3413 + 0.1359 = 0.9544 = 95.44%)
99.73% of the values lie within three standard deviations of the mean.
(0.0215 + 0.1359 + 0.3413 + 0.3413 + 0.1359 + 0.0215 = 0.9974 =
99.74%)
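The 68-95-99.7 figures can also be obtained without a printed table. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `proportion_within` is an illustrative choice. The results differ from the table-based sums in the last decimal only because the table values are rounded.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0, sigma=1)


def proportion_within(k):
    """Proportion of scores lying within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(proportion_within(k), 4))
```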
The sampling distribution of a statistic is the set of all possible values of the
statistic when all possible samples of a fixed size are taken from the
population. The sampling distribution refers to the variation of a statistic, for
example, the sample mean (ẋ), from sample to sample. Note that here we are
not concerned with the variation of individual elements in the sample, or
individual elements in the population, but with the variation of a summary
value (such as the mean) for a sample.
25 An alpha level of 0.05 indicates that ______ 1 SG 82- An error of Type I is the error we make if we reject the null hypothesis when
86 we should not have done so, and the level of significance represents the
1. if H0 is true, the probability of falsely rejecting it is greatest risk of doing this that we are willing to take.
limited to 0.05
2. 95% of the time, chance is operating. The alpha level is the level of significance, in this case 0.05 or 5%.
3. the probability of a Type II error is 0.05
4. the probability of a correct decision is 0.05
# Question Ans Page Comments
A researcher believes that women today weigh less than in previous years. To investigate this belief she randomly samples 41 adult women and records
their weights. The scores have a mean of 51 kg and a standard deviation of 5.6. A local census taken several years ago shows the mean weight of adult
women was 52.6 kg at that time
26 Given the data above, what would be the most 2 P102- In this question the population standard deviation (σ) is considered to be
appropriate statistical approach to establish whether 106 unknown because the given standard deviation comes from the sample. So
there is a statistically significant difference between the we have to use the t-test (t)
average weight of the women in the sample and the
weight of the women recorded in the census? The important point is that - as in the case of the z-distribution - the t-
distribution is a statistical distribution with a probability distribution that can be
1. A correlational study focusing on the linear increase determined, which means that we can use it to predict the chances of
in weight obtaining specific outcomes when testing for comparisons of means when the
2. A study of the group differences using a single population standard deviation σ is unknown.
sample t-test
3. A study of the group differences using the t-test for
independent groups
4. The z-test
27 If the population standard deviation was available 4 P80 When the population standard deviation (σ) is known we use the z-test (z)
instead of the sample standard deviation, which
technique would then have been the most appropriate P100- In Topic 3 (section 3.2.2), in the process of explaining the logic of statistical
for the statistical analysis of the data? 102 testing in general, we introduced you to the z test for single-sample
comparisons. This is used when you have only one sample of data of a
1. A correlational study focusing on the linear increase variable from which a mean could be derived, and you want to compare this
in weight mean with a specific constant value.
2. A study of the group differences using a single
sample t-test
3. A study of the group differences using the t-test for
independent groups
4. The single-sample z-test
# Question Ans Page Comments
28 Suppose the obtained value of the appropriate statistic 4 p-value = 0.025
is -2.07, and subsequently a p-value of 0.025 was α = 0.01 (one-tailed)
found. What can be concluded based on these results
if a significance level of α = 0.01 (one-tailed) is used? Since p > α (0.025 > 0.01), do not reject H0
1. Accept H1
2. Do not accept H0
3. Reject H0
4. Do not reject H0
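The decision rule applied in Question 28 (reject H0 only when the p-value is smaller than α) can be written as a small Python function. The function name and the returned strings are illustrative.

```python
def decide(p_value, alpha):
    """Apply the decision rule: reject H0 only if the p-value is smaller than alpha."""
    if p_value < alpha:
        return "reject H0"
    return "do not reject H0"


# Values from Question 28: p = 0.025 tested at alpha = 0.01 (one-tailed).
decision_q28 = decide(0.025, 0.01)
print(decision_q28)

# The same p-value would lead to rejection at the more lenient alpha = 0.05.
decision_lenient = decide(0.025, 0.05)
print(decision_lenient)
```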
29 The normal distribution is useful for interpreting 1 P51 Many psychological and educational variables are distributed approximately
psychological measurements because ______ normally, so that the normal curve can be used as a theoretical model for
interpreting the distribution of these variables. The distributions relating to
1. many psychological variables are approximately psychological variables such as measures of reading ability, introversion, job
normally distributed satisfaction and memory can all be plotted on a normal curve, and
2. it has a mean of zero and a standard deviation of 1 psychometric tests are often standardised in such a way that they conform to
3. it represents an arbitrarily large population of this distribution. Almost all the statistical tests discussed in this module
scores assume normal distributions. Furthermore, many psychological
4. it is symmetrical in shape measurements work very well even if the distribution is only approximately
normally distributed. Some tests work well even with very wide deviations
from normality. Also, apart from its theoretical significance, the normal
distribution is useful because it is easy to work with in practice, and because
many kinds of statistical tests can be derived for normal distributions.
App B On a nominal scale, numbers show category membership, but are otherwise
P156 arbitrary. They do not represent a size or intensity of something, but are only
used as labels to distinguish among qualities or characteristics. They can
also be referred to as categorical variables, or qualitative variables. This is
because differences in the numbers represent differences in quality,
character or type, but not in amount.
For example, we could code a variable like 'region' into 1 = North; 2 =West; 3
= South; and 4 = East. But these four categories can be coded in a different
sequence if we choose, without any information being lost. Note in the special
case where there are only two options, for example, when we code 'Gender'
as 1 = male and 2 = female, we refer to it as a dichotomy.
The important point about nominal scale measurements is that you cannot do
arithmetic with them. Adding them and obtaining an average makes no sense
(e.g. adding telephone numbers to obtain an 'average telephone number').
# Question Ans Page Comments
30 If examination scores are approximately normally 3 Calculate Pete's z-score, then look up the proportion of scores that fall
distributed with a mean of 60% and a standard below it in the standard normal distribution table (Appendix D).
deviation of 8% and Pete’s score is 66%, he did better
than about ______ of the candidates Formula is: z = (x - μ) / σ = (66 - 60) / 8 = 0.75
For z = 0.75 the larger portion is 0.7734, so
Pete did better than about 77% of the candidates, who got less than 66%.
Alternatively:
The smaller portion for z = 0.75 is 0.2266, so about 23% did better than 66% and Pete did better than the 77% who got
less than 66%.
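The table lookup for Question 30 can be reproduced with `statistics.NormalDist`, assuming (as the question does) that the scores are approximately normally distributed. Variable names are illustrative.

```python
from statistics import NormalDist

# Exam scores: mean 60%, standard deviation 8% (from Question 30).
exam_scores = NormalDist(mu=60, sigma=8)
pete = 66

z_pete = (pete - 60) / 8                 # Pete's z-score
below_pete = exam_scores.cdf(pete)       # proportion scoring less than Pete
above_pete = 1 - below_pete              # proportion scoring more than Pete

print(z_pete, round(below_pete, 4), round(above_pete, 4))
```

The first proportion corresponds to the "larger portion" column of the z-table and the second to the "smaller portion" column.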
# Question Ans Page Comments
31 After finding that a significant difference exists 4 P86-87 A statistical hypothesis test may indicate a significant relationship, yet
between male and female participants on a test which the practical importance of such a relationship may be questionable in real
tests level of creativity, a researcher decides to also life. For example, a study on
calculate an effect size, using Cohen's d. This is used reckless driving may indicate that taxi drivers in Johannesburg are the most
to determine ______ careful drivers in South Africa, yet such a result is questionable in real life.
1. the size of the error that would be made if the null To determine whether a significant result is also meaningful from a practical
hypothesis is rejected point of view, we calculate an effect size such as Cohen’s d.
2. the ability of a statistical test to detect a significant
relationship between variables
3. the level of confidence one can have that the test is
valid
4. whether a significant effect is meaningful from a A result of d > 1 would imply a difference of greater than one standard
practical point of view deviation between the means, which is quite large.
As a rule of thumb, we can interpret the effect size as follows:
Around 0.2 “small”
Around 0.5 “medium”
Around 0.8 “large effect size”
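Cohen's d expresses the difference between two means in standard deviation units. The sketch below is a minimal illustration only: the means and the standard deviation passed to the function are hypothetical numbers, not data from any question, and the function name is chosen for this example.

```python
def cohens_d(mean_1, mean_2, sd):
    """Difference between two means expressed in standard deviation units."""
    return (mean_1 - mean_2) / sd


# Hypothetical values, for illustration only.
d = cohens_d(mean_1=22.3, mean_2=20.4, sd=6.0)
print(round(d, 2))
```

With these hypothetical inputs d is about 0.32, which falls between the "small" and "medium" guideline values listed above.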
32 Transforming variables to z-scores is useful because it 2 P55 Transforming a set of measurements, each with a different mean and a
______ different standard deviation, into a z-score can be used to compare an
individual across different distributions. After transformation, all the scores will
1. is used to calculate the test statistic fall on a common standard normal distribution with a mean of 0 and a
2. enables one to compare variables with different standard deviation of 1, which makes it possible to compare them directly.
means and standard deviations from scores with
different original units
3. can be used to test whether a score is normally
distributed
4. is easy to calculate the mean and standard
deviation of most scores
# Question Ans Page Comments
33 A probability of an event occurring which depends on 1 P36-37 In the formulation of the multiplicative rule given above we assume that the
something else occurring, such as passing a test when probabilities of the two events, A and B, are independent of one another.
you do not understand your course, can be described However, in some cases a particular probability is conditional on something
as ______ else happening. For example, the probability of event A occurring may be
conditional on the prior occurrence of event B. Conditional probabilities are
1. conditional probability written as p(B|A), where | indicates that a condition applies. p(B|A) is read as
2. an independent event 'the probability of B given A.' Likewise p(A|B) is read as 'the probability of A
3. mutually exclusive events given B', or equivalently, as 'the probability of A happening on condition that
4. a multiplicative probability B has occurred'.
Suppose we let A denote 'Marie wins the race' and B|A stand for 'Marie gets
a trophy given that she won the race'. We further assign a probability of 0.5 to
A and a probability of 0.6 to B|A. Therefore, the probability that Marie will win
the race and get a trophy is p(A and B) = (0.5) x (0.6) = 0.3.
Note that from the formula for conditional probability, using simple algebra,
we can derive formula 2 below.
p(B|A) = p(A and B) / p(A) (Formula 2)
Let us assume that we know that the chance of Marie winning the race and
also a trophy is 0.3. We also know that the probability of winning the race is
0.5. What is the conditional probability of her winning a trophy provided she
had won the race?
We use formula 2, insert the given probabilities and, therefore, have
p(B|A) = 0.3 / 0.5 = 0.6
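The Marie example can be traced in both directions with a few lines of Python: first the multiplicative rule for a conditional probability, then formula 2 to recover p(B|A). The variable names are illustrative.

```python
# Probabilities from the Marie example.
p_a = 0.5           # p(A): Marie wins the race
p_b_given_a = 0.6   # p(B|A): Marie gets a trophy given that she won

# Multiplicative rule with a conditional probability: p(A and B) = p(A) * p(B|A)
p_a_and_b = p_a * p_b_given_a

# Formula 2: p(B|A) = p(A and B) / p(A)
recovered_p_b_given_a = p_a_and_b / p_a

print(round(p_a_and_b, 2), round(recovered_p_b_given_a, 2))
```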
# Question Ans Page Comments
34 The sampling error of the mean will be smaller in cases 1 P61-62 We can estimate the size of the error we would make if we used the sample
where ______ mean as an estimate of the population mean. This is referred to as the
standard error, and it is specified in the central limit theorem.
1. the sample is larger and the standard deviation of
the population smaller The standard error is denoted by σẋ. The σ indicates that we are
2. the population is larger and the variability of the describing a population, and the subscript ẋ informs us that we are dealing
scores in the sample is smaller with a population of sample means. The standard error is given by dividing
3. the sample mean is smaller the population standard deviation by the square root of the sample size
4. a medium-size rather than a large sample is used σẋ = σ / √n
Like a standard deviation, the standard error of the mean tells us by what
average amount the sample means deviate from the mean of the sampling
distribution. It is an estimate of the size of the error we shall make if we
use the mean of the distribution of sample means as an estimate of the
true population mean, that is, if we use µẋ to estimate µ.
For instance:
Assume σ = 5 and n = 36: σẋ = σ / √n = 5 / √36 = 5/6 = 0.833
Assume σ = 5 and n = 49: σẋ = σ / √n = 5 / √49 = 5/7 = 0.714
By increasing the sample size (from 36 to 49), the standard error (σẋ) is
reduced
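The effect of sample size and population standard deviation on the standard error can be seen by evaluating σ / √n for a few cases. In this sketch the function name is illustrative, and the third case (σ = 3) is a hypothetical value added only to show the effect of a smaller standard deviation.

```python
from math import sqrt


def standard_error(sigma, n):
    """Standard error of the mean: population standard deviation / sqrt(sample size)."""
    return sigma / sqrt(n)


se_n36 = standard_error(5, 36)          # 5 / 6
se_n49 = standard_error(5, 49)          # 5 / 7: larger sample, smaller error
se_small_sigma = standard_error(3, 36)  # hypothetical smaller sigma, smaller error

print(round(se_n36, 3), round(se_n49, 3), round(se_small_sigma, 3))
```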
Suppose that the memory span of adults is normally distributed with a mean of 7 items and a standard deviation of 2 items. A researcher predicts that
'dyslexic adults have a shorter memory span than adults in general'
# Question Ans Page Comments
35 Which of the following is an appropriate null hypothesis 2 P73-75 The null hypothesis will always contain equal signs. In this case H0 : μ = 7.
for testing the above prediction? Since the hypothesis should verify dyslexic people's memory span, option 2 is
correct
1. The mean memory span of the population of
dyslexic adults is smaller than 7 H0 is defined as the hypothesis of no effect.
2. The mean memory span of the population of
dyslexic adults equals 7 • The null hypothesis (H0) represents the status quo or the current belief in
3. The mean memory span of the population of adults a situation. The null hypothesis will always contain equal signs.
equals 7 • The alternative hypothesis (H1) is the opposite of the null hypothesis and
4. The mean memory span of the population of adults represents a research claim or specific inference you would like to prove.
does not equal 7 This means that the alternative hypothesis takes the sign of the test
depending on the situation.
o If we are testing the difference, H1 is indicated with ≠.
o Otherwise we can use signs like less than (<) or greater than (>)
depending on the problem statement.
• If you reject H0, you have statistical evidence in favour of the alternative hypothesis.
• If you do not reject H0, you have failed to prove that the alternative
hypothesis is correct. Failure to prove the alternative hypothesis does not
necessarily mean that the null hypothesis is true.
• The null hypothesis (H0) always refers to a specific value of a parameter
(such as μ), not a statistic (such as ẋ). This value is always known or will
come from the given scenario.
36 Which of the following is an appropriate alternative 1 P73-75 The alternative will take the direction of the question. Hence, “The mean
hypothesis for testing the above prediction? memory span of the population of dyslexic adults is smaller than 7”. In this
case H1 : μ < 7
1. The mean memory span of the population of
dyslexic adults is smaller than 7
2. The mean memory span of the population of adults
is not equal to 7
3. The mean memory span of the population of
dyslexic adults equals 7
4. The mean memory span of the population of adults
does not equal 7
# Question Ans Page Comments
37 Testing the above prediction will require a ______ 3 P75 Directional because H1 : μ < 7. It is also one-tailed because it focuses only on values
statistical test smaller than 7, not on values larger than 7 as well.
1. non-directional Two-tailed is when H1 : μ ≠ 7. Now the focus will be on smaller than and larger
2. two-tailed than 7 results
3. directional
4. (1) and (2) are both correct
38 When applying a statistical test, the p-value represents 3 Tut201 The observed results are the values which you find in your sample(s) of data,
the probability of observing the ______ 2014 for example the sample mean and sample standard deviation, or (if it is
Q10 relevant), the correlation coefficient which you calculated.
1. sample statistic under the alternative hypothesis
2. population parameter under the null hypothesis P78-82 The p-value shows you the probability of seeing some relationship among
3. sample statistic under the null hypothesis these variables based on your calculations (such as a difference between
4. population parameter under the alternative means or a high correlation), if in fact this observed relationship is merely the
hypothesis consequence of chance (in other words, if the null hypothesis was true). You
are in fact comparing the observed relationships in the data with what you
would expect if the null hypothesis is true by calculating a relevant test
statistic.
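The p-value is not the probability that the null hypothesis is true. It is the probability of
obtaining the observed sample statistic (or a more extreme one) if H0 is true. You test H0 using SAMPLE data.
This test statistic can then be used to find the p-value if we know the
probability distribution of the test statistic. If this probability is small, the observed
result would be unlikely under the null hypothesis, which counts as evidence against H0.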
1. non-directional, one-tailed Two-tailed is when H1 : μ ≠ 30. Now the focus will be on results smaller than and
2. directional, two-tailed larger than 30
3. directional, one-tailed
4. non-directional, two-tailed
40 An alpha level of 0.05 indicates that ______ 1 P82-83 The decision rule for H0 is simply as follows:
If the probability (p-value) of the sample result is smaller than α (alpha) (i.e. if
1. if H0 is true, the probability of falsely rejecting it is the p-value < α), the null hypothesis is rejected. If the p-value is not smaller
limited to 0.05 than α (i.e if the p-value ≥ α), the null hypothesis is not rejected.
2. 95% of the time chance is operating
3. the probability of a Type II error is 0.05 The α-value specifies the maximum risk that we are willing to take of making
4. the probability of a correct decision is 0.05 an error if we reject the null hypothesis
41 If alpha is changed from 0.05 to 0.01, the ______ 4 When alpha reduces, the probability of Type I (α) error decreases and Type II
(β) increases.
1. probability of a Type II error decreases.
2. probability of a Type I error increases SG 82- An error of Type I is the error we make if we reject the null hypothesis when
3. error probabilities stay the same but the probability 86 we should not have done so, and the level of significance (α) represents the
that we will retain a false H0 increases greatest risk of doing this that we are willing to take.
4. probability that we will retain a false H0 increases
An error of Type II is the opposite of Type I. We fail to reject the null
hypothesis when it is in fact false.
P85 Generally, though, the smaller α, the larger β. If we wish to avoid Type I
errors, we set α to a small value such as 0.01 or even 0.001, but if we want to
avoid Type II errors, we could set α to a larger value.
# Question Ans Page Comments
42 If the alternative hypothesis states that alcohol affects 2 P73-75 The null hypothesis will always contain equal signs so in this case "alcohol
short-term memory, the null hypothesis states that has no effect on short-term memory"
1. alcohol does not decrease short-term memory H0 is defined as the hypothesis of no effect.
2. alcohol has no effect on short-term memory
3. alcohol decreases short-term memory • The null hypothesis (H0) represents the status quo or the current belief in
4. all of the above a situation. The null hypothesis will always contain equal signs.
• The alternative hypothesis (H1) is the opposite of the null hypothesis and
represents a research claim or specific inference you would like to prove.
This means that the alternative hypothesis takes the sign of the test
depending on the situation.
o If we are testing the difference, H1 is indicated with ≠.
o Otherwise we can use signs like less than (<) or greater than (>)
depending on the problem statement.
• If you reject H0, you have statistical evidence in favour of the alternative hypothesis.
• If you do not reject H0, you have failed to prove that the alternative
hypothesis is correct. Failure to prove the alternative hypothesis does not
necessarily mean that the null hypothesis is true.
• The null hypothesis (H0) always refers to a specific value of a parameter
(such as μ), not a statistic (such as ẋ). This value is always known or will
come from the given scenario.
43 When the results are statistically significant, this means 4 P82-83 This question examines judgement using the p-value (results are statistically
that ______ significant).
(a) the obtained probability is equal to or less than We generally would reject the null hypothesis when the p-value is less than
alpha the level of significance (α), therefore A and C are correct
(b) the independent variable has had a large effect
(c) we can reject H0 The dependent variable is the one that is predicted or explained, and the
independent variable is manipulated to see how it affects the dependent
1. (a) is correct but neither of the other statements variable. This has nothing to do with this question.
2. (b) and (c) are correct but not necessarily (a)
3. (a) and (b) are correct but not (c)
4. (a) and (c) are both correct but not necessarily (b)
# Question Ans Page Comments
44 A researcher draws a single random sample from a 1 P102- In this case the population standard deviation is unknown. So we use the t-
population to test his hypothesis about the mean 106 test (t).
population score on a psychological test. Scores on
this test are distributed normally in the general The important point is that - as in the case of the z-distribution - the t-
population with a known mean but an unknown distribution is a statistical distribution with a probability distribution that can be
standard deviation. Which test statistic should the determined, which means that we can use it to predict the chances of
researcher calculate to test his hypothesis? obtaining specific outcomes when testing for comparisons of means when the
population standard deviation σ is unknown.
1. The t-statistic for the mean of a single sample
2. The z-statistic for the mean of a single sample
3. The standard deviation of the sampling distribution
of the mean of a single sample
4. The t-statistic for independent groups
A researcher hypothesizes that chess-playing students are better at non-verbal reasoning than students in general. He draws a random sample of 25
students from the members of the chess clubs of South African universities and measures their non-verbal reasoning ability by means of a test developed
for this purpose. The scores of a large group of students on this test were found in earlier research to be distributed normally with a mean of 20. Suppose
the researcher finds that the mean score of his sample is 22.3 and the standard deviation of the scores is 6.0
45 Which research design did the researcher use? 1 P100- He drew one random sample which he is comparing to the general population
106
1. Single-sample groups design
2. Two-groups design
3. Two-groups design with a known population mean
4. A correlational design
1. 2.3/1.2
2. 2.3/5 where: ẋ = 22.3
3. -2.3/1.2 μ = 20
4. -2.3/5 s = 6
n = 25
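The answer options above (2.3/1.2 and 2.3/5) suggest that the missing question asks for the single-sample t statistic, t = (ẋ - μ) / (s / √n). That reading is an assumption, since the question text itself is not reproduced here. Under that assumption, a minimal Python sketch with the listed values is:

```python
from math import sqrt


def single_sample_t(sample_mean, mu_0, s, n):
    """Single-sample t statistic: (sample mean - hypothesised mean) / (s / sqrt(n))."""
    estimated_standard_error = s / sqrt(n)
    return (sample_mean - mu_0) / estimated_standard_error


# Values listed above: sample mean 22.3, mu = 20, s = 6, n = 25.
t_value = single_sample_t(22.3, 20, 6.0, 25)
degrees_of_freedom = 25 - 1

print(round(t_value, 3), degrees_of_freedom)
```

The estimated standard error is 6 / √25 = 1.2, so the statistic equals 2.3 / 1.2, which corresponds to option 1.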
49 Two samples can be considered independent when 1 P110 Samples are considered as comprising independent groups if the
______ composition of the one sample in no way affects, in any systematic way,
the composition of the other sample. The two samples come from two
1. the composition of one sample is not systematically groups that have no obvious relationship. For example, where one sample is
related to the composition of the other one measurements of a construct like 'self-esteem' among men, and the other
2. the samples are drawn under different experimental among women, but both groups were sampled purely randomly.
conditions
3. one sample comes from a treatment or On the other hand, the concept of dependent groups refers to situations
experimental group while the other comes from a where the samples are related, and it implies that each subject in one group
control group can be systematically paired off with a subject from the other group. For this
4. care was taken that the samples are drawn at reason, a dependent groups research design is often referred to as a
random matched-pairs design.
# Question Ans Page Comments
A researcher wants to validate a new depression scale where a high score indicates a high incidence of depression. She applies it to a sample of 40
patients diagnosed with depression and a control group of 40 persons who were judged not to suffer from depression by a panel of clinical psychologists
50 Which is an appropriate alternative hypothesis to test 2 A researcher wants to validate a new depression scale where a high score
the validity of the depression scale based on group indicates a high incidence of depression.
mean values?
So the mean score of the depression group must be larger than that of the control group
1. μDepression ≠ μControl
2. μDepression > μControl Therefore: μDepression > μControl
3. μDepression < μControl
4. The population mean of the difference scores
equals zero
51 Which of the following would be the most appropriate 3 P110 Samples are considered as comprising independent groups if the
statistical test to determine whether a significant composition of the one sample in no way affects, in any systematic way, the
difference exists between the scores for the two groups composition of the other sample. The two samples come from two groups
(measuring depression and non-depression scores)? that have no obvious relationship. For example, where one sample is
measurements of a construct like 'self-esteem' among men, and the other
1. A test for a correlation coefficient among women, but both groups were sampled purely randomly.
2. The t-test for dependent samples
3. The t-test for independent samples On the other hand, the concept of dependent groups refers to situations
4. The chi-square (x²) test where the samples are related, and it implies that each subject in one group
can be systematically paired off with a subject from the other group. For this
reason, a dependent groups research design is often referred to as a
matched-pairs design.
SG P140 The chi-square test is usually used when you have a cross tabulation of
frequency counts of events which are nominal scale measurements. This
Tut202 table is referred to as a contingency table. It is used to compare an observed
2014 frequency distribution (frequency counts based on a sample of observation)
Q22 with the frequency distribution which we would expect to find if the null
hypothesis of no relationship between two cross-tabulated variables were
true.
# Question Ans Page Comments
52 A researcher wants to determine whether a significant 4 The effect size is to assess whether a significant effect is meaningful from a
difference exists in the scores on a test of creativity practical point of view.
between a group of students who are studying for a
BSc degree and a group of students who are studying P87 The implication is that we have to be careful how we interpret significant
for a BA degree. She finds a mean score of ẋ1 = 20.4 results. A p-value smaller than our chosen level of significance (α) simply
for the BSc students and a mean score of ẋ2 = 22.3 for implies that, relative to this sample, it is improbable that the effect we see in
the BA students. A t-test shows that this difference is our observations is purely due to chance. It does not imply that the effect is
statistically significant. In spite of the significant big or important. This is something that we have to decide by looking at what
difference, the researcher feels that the difference the data means. One way that statisticians have suggested to deal with this
between the two means is too small to be really problem is by the notion of effect size. Different procedures exist to determine
important. What calculation can she do to confirm this the effect size of a result. In the case of a comparison between means, one
suspicion? way of calculating this is by the use of Cohen's d. We do this by expressing
the mean difference that we observed relative to the standard deviation: d = (ẋ1 - ẋ2) / s
1. Level of significance
2. Variance
3. Degrees of freedom
4. Effect size
53 A researcher plans to use the t-test to compare two 2 P115 In order to use the t-test (tc) statistic, we need to make two assumptions
independent samples of data with 20 individuals each. regarding the data:
• that the two populations being compared are normally distributed
Consider the following assumptions that may be • with the same variance (or standard deviation).
relevant here
a) The sample standard deviations have to be equal (Remember that the square root of the variance is equal to the standard
b) The data from both samples has to come from deviation.)
populations that are normally distributed
We can also assume that the samples are independent - since the samples
What minimum assumptions from the ones given were selected randomly, we can safely consider them to be independent of
above need to be met before she may proceed? each other. All of this makes the tc-test an appropriate test.
1. At least one of (a) or (b) must be true P116 Note: Even the most elementary statistics program makes provision for
2. (a) and (b) must both be true performing t-tests. Such programs usually require that we indicate which
3. Neither (a) nor (b) is relevant but other assumptions variable should be used to identify the two groups and which is the
exist that will have to be considered dependent variable. In addition, we have to choose between a tc test for
4. The t-test should never be used with such a small independent samples or a td test for dependent or correlated groups
sample
# Question Ans Page Comments
54 In which of the following cases should the scores being 2 P117- Often the two samples are not 'independent'. This happens when each
investigated be regarded as dependent when a test for 118 subject in one sample is matched with regard to some characteristic (usually
significance is selected? a nuisance or external variable that we wish to control) to a particular subject
in the other sample. The samples are dependent if each measurement of
1. The variables represent exam scores of children a variable for a particular case can be paired with the measurement of a
from two schools, matched on demographic criteria matching case in the other sample. The implication is that the two samples
like grade and gender will always have to be of the same size (that is, n1 = n2). This design is,
2. The variables represent scores from subjects on a therefore, often referred to as a matched-pairs design. This implicit matching
motivational scale, who were tested before and usually causes the scores to be correlated (see Topic 6 for the meaning of
after listening to a presentation by a motivational this term).
speaker
3. The scores on a test for mathematical ability and a A typical example would be if the same research participants are measured
test for attention span twice, once before and again after an intervention. From the point of view of
4. The variables represent frequency counts for research design, we would refer to this type of comparison as a two-sample
gender and favourite colour, cross-classified in a repeated measures design.
contingency table
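For a before-and-after (matched-pairs) design such as option 2, the td statistic is usually computed from the difference scores. The sketch below assumes the standard formula t = (mean of differences) / (s of differences / √n) with n - 1 degrees of freedom; the scores and the function name `t_dependent` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def t_dependent(before, after):
    """Return (t, degrees of freedom) for two dependent samples."""
    if len(before) != len(after):
        raise ValueError("a matched-pairs design needs n1 == n2")
    # One difference score per matched pair (after minus before).
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical motivation scores for five participants.
before = [20, 22, 19, 24, 25]
after = [23, 25, 20, 27, 30]
t_d, df = t_dependent(before, after)
print(round(t_d, 2), df)
```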
A psychologist develops a series of workshops providing assertiveness training to a group of persons who suffer from low self esteem. To test the efficacy
of the workshops, she applies a psychometric test which measures level of self esteem to 50 persons at the start and again after the end of the series of
workshops, predicting that the latter scores will be higher (reflecting higher self esteem). The self esteem scale was standardised on the general
population with a mean score of 30 and a standard deviation of 10.
55 Which constructs are related to one another by the 2 P117- To test the efficacy of the workshops, she applies a psychometric test which
research hypothesis? 118 measures level of self esteem to 50 persons at the start and again after the
end of the series of workshops
1. Attending a workshop of assertiveness training, self
esteem
2. Self esteem before a workshop; self esteem after a
workshop
3. Self esteem in the treatment group, self esteem in
the general population
4. Level of assertiveness; level of self esteem
# Question Ans Page Comments
56 Which is an appropriate null hypothesis for the analysis 1 If they give you POPULATION values you MUST use them!!!
of the results? (The self esteem scale was standardised on the general population with a
mean score of 30 and a standard deviation of 10.)
1. μ = 30
2. μ1 = μ2 So here you will need to do td first (for the pre-test post test design) and then
3. The population mean of the difference scores z (for the single sample groups design) to test your hypothesis.
equals zero
4. μ1 ≠ μ2 If they didn't give you the population mean then option 3 would be correct.
57 Which is the appropriate test statistic to calculate? 2 In the previous question both td and z tests were performed, but the test is
done on dependent samples, therefore options 1 and 3 are incorrect.
1. The z-statistic for the difference between the means
of two independent samples The appropriate test statistic to calculate is the t-statistic for the difference
2. The t-statistic for the difference between the means between the means of two dependent samples
of two dependent samples
3. The t-statistic for the difference between the means Having said that, option 1 was incorrectly phrased. It should have read "The
of two independent samples z-test for single sample groups design". In this case option 1 would be the
4. A test of the correlation coefficient for the two sets correct answer since this is a single group and the population standard
of scores deviation was given. If σ is given, you have to use it for your tests and
therefore the z test must be performed
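The single-sample z-test referred to in the comments above can be sketched as follows. The population mean (30), population standard deviation (10) and sample size (50) come from the scenario; the sample mean of 33 is a made-up value for illustration, and the function name `z_single_sample` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_single_sample(sample_mean, mu, sigma, n):
    """z = (sample mean - population mean) / standard error of the mean."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - mu) / standard_error


mu, sigma, n = 30, 10, 50   # given in the scenario
sample_mean = 33            # hypothetical post-workshop mean
z = z_single_sample(sample_mean, mu, sigma, n)
# One-tailed p-value, because the researcher predicts higher scores.
p_one_tailed = 1 - NormalDist().cdf(z)
print(round(z, 2), round(p_one_tailed, 3))
```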
58 A researcher wants to compare the cognitive 1 P114- The t-test is performed before the p-value can be determined. Only if the
development of two groups of children using the mean 116 p-value is smaller than the level of significance (α) should the null hypothesis
score of each group to test the following hypotheses (H0) be rejected.
H0: μ1 = μ2
H1: μ1 ˃ μ2 Therefore: She needs to find the relevant p-value before making any
conclusion
Her results, derived from a random sample from each
group of children, show that the mean sample score
on a scale which measures level of cognitive
development for the first group is less than the mean
sample score for group two (i.e. x̄1 < x̄2). What may
she conclude?
60 Correlation is used in data analysis when one 3 P129- Correlation: measuring the association between variables
investigates the relation between ______ 130
Correlation is a measurement of the extent to which a measurement on
1. the mean of a single sample of subjects and a one variable is related to a measurement on another variable for the
population mean same sample of individual cases.
2. two groups of subjects, with respect to a single
variable This can be visualised by way of a graphical representation called a scatter
3. two variables measured on the same group of plot. A scatter plot is a graph that represents the measurements of two
subjects variables on two perpendicular axes, usually called the x-axis (horizontal axis
4. two variables from independent samples or abscissa) and the y-axis (vertical axis or ordinate).
61 A positive correlation between variables X and Y 2 P133 If a correlation exists, the way in which one variable varies will be related to
implies that persons scoring low on X will generally variation on the other one.
score ______ on Y
A negative correlation implies that as one variable changes, the other
1. high changes in the opposite direction. A high value on X will imply a low value on
2. low Y, while a low value on X will be matched by a high value on Y.
3. either high or low
4. in a totally unpredictable way Conversely, if the correlation is positive, the variable values will
generally vary in the same direction (both high or both low).
# Question Ans Page Comments
62 Which of the values given below is the best estimate of 1 P132- As variable X increases (from -2 to 2), variable Y decreases (from 2 to -2).
the Pearson correlation coefficient between the 133 This implies a negative correlation.
following values of X and Y?
It also changes by exactly the same amount each time, so we have a perfect negative
X -2 -1 0 1 2 correlation, which is -1.
Y 2 1 0 -1 -2 We use 'r' as the symbol that represents a correlation coefficient (as in the
case of the Pearson product-moment correlation coefficient), and the
1. -1 following applies:
2. 0 • r = 1 implies a perfect positive linear relationship (the dots in a scatter plot
3. +1 will run from lower left to upper right in a perfectly straight line)
4. 0.5 • r = 0 implies no linear relationship at all (the dots may be scattered all
over the place)
• r = -1 implies a perfect negative linear relationship (the dots will run from
upper left to lower right in a straight line)
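The value of r for the X and Y values in question 62 can be checked with a short calculation. This sketch uses the usual product-moment formula (sum of cross-products of deviations divided by the square root of the product of the sums of squares); the function name `pearson_r` is just a label chosen for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)


x = [-2, -1, 0, 1, 2]
y = [2, 1, 0, -1, -2]
r = pearson_r(x, y)
print(r)  # -1.0
```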
63 A researcher hypothesizes that the drug treatment of 1 SG P137 The symbol ‘ρ’ (the Greek letter ‘rho’) is used to represent the population
hospitalised schizophrenic patients improves their parameter being tested when you calculate the Pearson’s correlation
mental alertness. He studies a random sample of 27 Tut202 coefficient ‘r.’ That is, you calculate r for the sample, then have to decide
such patients and finds a correlation coefficient of 0.6 2014 whether this is likely to represent a significant linear correlation between two
between the number of days of drug treatment and Q13 variables for the whole population (with this population correlation symbolised
patients’ scores on the Mental Alertness Test. Which is by ρ), by looking at the p-value associated with this calculated sample
an appropriate null hypothesis for this research? statistic r.
1. ρ=0 In a similar way ‘μ’ represents the population parameter for a mean,
2. μ=0 and ‘σ’ the population parameter for a standard deviation. These two are not
3. r=0 applicable in this question.
4. μ1 = μ2
# Question Ans Page Comments
64 The table below gives the number of persons observed 4 P143- It is important to note that the relation between the variables is described by
to be in each of the categories in a cross classification 144 the cell and not by the row or column frequencies. These cell frequencies
of gender (male/female) and place of residence represent the way the information is distributed relative to the two variables.
(rural/urban). What would the expected value be for These cell frequencies are often referred to as the observed or empirical cell
persons classified as both 'urban' and 'male’, if no frequencies.
relationship exists between gender and place of
residence? To find the expected frequency for a particular cell, the row total for that row
is multiplied by the column total for that column and this result is then divided
Male Female Row total by the overall total. These expected frequencies show what the results would
have been like if the distribution of frequencies through the cells were
Urban 6 4 10 homogeneous, in proportion to the respective row and column totals. If the
observed frequencies correspond precisely with the expected frequencies,
Rural 6 8 14 we know that the null hypothesis cannot be rejected. But the observed
frequencies will seldom be precisely equal to the expected frequencies - even
Column total 12 12 24 if H0 is not rejected - because of sampling error.
It is the differences between these expected and observed frequencies that
1. 24 interest us, that is, we want to know how far the actual (observed) results are
2. 6 removed from the expected situation, if there is no interaction effect.
3. 10 Row total (O.1) = 10
4. 5 Column total (O1.) = 12
Sample total (size) (O..) = 24
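Using the three totals listed above, the expected frequency for the 'urban' and 'male' cell works out as follows. The function name `expected_frequency` is only an illustrative label.

```python
def expected_frequency(row_total, column_total, grand_total):
    """Expected cell count if the two variables are unrelated."""
    return row_total * column_total / grand_total


urban_male = expected_frequency(row_total=10, column_total=12, grand_total=24)
print(urban_male)  # 5.0
```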
P144 The chi-square test is usually used when you have a cross tabulation of
frequency counts of events which are nominal scale measurements. This
table is referred to as a contingency table. It is used to compare an observed
frequency distribution (frequency counts based on a sample of observation)
with the frequency distribution which we would expect to find if the null
hypothesis of no relationship between two cross-tabulated variables were
true. The Pearson chi-square test statistic is a calculation of the difference
between the observed and expected frequencies and is qualitative of nature.
66 Which of the following is the appropriate formula for the 1 P144- The Pearson chi-square test statistic is a calculation of the difference
Chi square test? 145 between the observed and expected frequencies.
This means the expected value for each cell in the contingency table is
subtracted from the observed value for that cell, squared, and divided by the
expected value for that cell; these results are then summed over all cells.
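As a rough sketch, the calculation described above (observed minus expected, squared, divided by expected, then summed over all the cells) can be written out like this. The observed counts are the gender by place of residence table from question 64; the function name `chi_square` is hypothetical.

```python
def chi_square(observed):
    """Pearson chi-square statistic for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count: row total x column total / overall total.
            e = row_totals[i] * column_totals[j] / grand_total
            statistic += (o - e) ** 2 / e
    return statistic


observed = [[6, 4],   # Urban: male, female
            [6, 8]]   # Rural: male, female
chi2 = chi_square(observed)
print(round(chi2, 3))
```

The resulting value would still have to be compared with the chi-square distribution to obtain a p-value.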
A psychologist reads an article in which the author claims that playing computer games leads to higher levels of aggression in children. She decides to
test this by asking a sample of children to report the number of computer games they play per month and measuring the aggression level of each child
with an appropriate psychometric test. She expects to find that a positive correlation will exist in her sample between level of aggression and number of
computer games played
67 The researcher draws a graph of the relationship 1 A graph showing the position of each of a number of sampling units on each
between aggression and number of computer games. of two variables
Which of the scatter plots below give the most
probable representation of the data if the expected P130- A scatter plot is a graph showing the relationship between two numerical
relationship exists? 132 variables. In such a graph the data of the one variable are plotted on the
horizontal axis (usually referred to as the X axis), and the data of the other
variable on the vertical (or Y) axis. It is not a comparison of sample and
Tut202 population, nor has it to do with spread of data or the independence of
2014 variables
Q18
The closer the dots in the plot are to a straight line, the closer the correlation
coefficient is to 1 (it can be either a positive number (+1) or a negative
number (-1)). The more arbitrary or spread out the dots, the closer the
correlation coefficient is to 0. If the plot seems to form a line from lower left to
1. Graph A upper right, the correlation is positive. On the other hand, if the line runs from
2. Graph B upper left to lower right, the correlation is negative.
3. Graph C
4. Graph D She expects to find that a positive correlation will exist in her sample
between level of aggression and number of computer games played.
Therefore the plot must form a line from lower left to upper right
# Question Ans Page Comments
68 The researcher calculates the Pearson product 4 P130- She expects to find that a positive correlation will exist in her sample
moment correlation coefficient of the relationship 132 between level of aggression and number of computer games played.
between level of aggression and number of computer
games played. Which of following expressions best This implies that the relationship must be between 0 and 1 (or greater than 0)
represent the relationship if the expectations of the
researcher about the relationship are true?
Therefore: r > 0
1. r≠0
2. r=0
3. r<0
4. r>0
A sample of 300 clients is drawn from three community mental health centres (indicated in the table as A, B and C). Counts are made of those clients
who are diagnosed as having social adjustment problems, those with problems related to anxiety, and the remaining clients are classified under 'other
problems'. Counts of the number of clients from the different centres which fall in each of the categories are supplied below.
69 What is the type of arrangement of data above called? 2 SG A contingency table is a table indicating the number of individual objects
P142- falling in each cell of cross-tabulated data. In other words, it is a two-
1. Histogram 144 dimensional table in which each observation is classified in terms of two
2. Contingency table categories simultaneously.
3. Correlation matrix
4. Classification table
# Question Ans Page Comments
70 A researcher wants to establish whether the types of 1 P140 The chi-square test is usually used when you have a cross tabulation of
diagnoses made differ significantly among the P144- frequency counts of events which are nominal scale measurements. This
different mental health centres or not. Which of the 145 table is referred to as a contingency table. It is used to compare an observed
following would be the most appropriate statistical test frequency distribution (frequency counts based on a sample of observation)
to use? Tut202 with the frequency distribution which we would expect to find if the null
2014 hypothesis of no relationship between two cross-tabulated variables were
1. The Chi-square (χ²) test Q22 true.
2. A test of the correlation coefficient
3. The t-test for two samples
4. The z-test for two samples
Oct/Nov 2013
2 In science, including social science, the word ‘theory’ 2 SG P4 A theory is a framework for facts: it is the explanation of why the facts (i.e.
refers to ______ observations, measurements, phenomena) are as they are, or are related in
Tut201 the way in which they are related, based on empirical investigations.
1. a plausible guess based on one’s previous 2013 Q7
knowledge about a phenomenon Option 1 is a description of a hypothesis, but this is often how the word
2. an explanation of why a phenomenon appears as it ‘theory’ is used in informal conversation.
is observed to be
3. an explanation of the procedure by which a
construct should be measured
4. the process where independent variables are varied
to see how they affect the dependent variables
# Question Ans Page Comments
3 Inferential statistics refer to ______ 2 P2 An inference is a conclusion that follows from existing information, by
generalising from the specific information to the general type of phenomenon,
1. calculating statistics which summarise the data where the conclusion is not absolutely certain. So, in summary, inferential
2. using probability theory to make conclusions based statistics are techniques for making generalisations based on imperfect
on observations of data P10-11 numeric data, where the conclusions have a high probability of being true, but
3. the process of converting general research you can never be completely certain.
questions into specific formal hypotheses
4. the process of finding a way to measure an abstract A distinction exists between inferential statistics and descriptive statistics.
construct Descriptive statistics refers to a set of quantities used to summarise
aspects of numerical data. Examples that you may be familiar with are
means, range, variance and standard deviation (see Appendix C for a quick
introduction). These summary quantities are sometimes referred to as
parameters (when they refer to the whole collection or population of data; see
section 1.4.3 below).
4 When doing research, the term 'Operationalisation' is 3 P24-26 Operational definitions of psychological constructs should define constructs in
used to refer to the process of ______ terms of observable behaviour.
1. calculating a test statistic to test a particular "Operational" refers to practical procedures by which constructs are made
hypothesis visible.
2. converting a general research question into a
formal statistical hypothesis "Operationalisation" is where you make the construct (which is usually an
3. determining a way to get a numeric measurement abstract concept, so it is difficult to observe it clearly) visible by finding some
of a construct which is being measured suitable way to measure it.
4. converting a calculated test statistic into a
probability value called the p-value
5 In social science research, the total collection of 4 P10 When several measurements are collected from a number of people, the
measurements across a group of research participants collected information is referred to as the data (while a single item of
is referred to as ______ information is a datum). Data are all the variables for all the cases in the
research.
1. descriptive statistics
2. parameters
3. sample statistics
4. data
# Question Ans Page Comments
6 Psychological measurements are always imperfect. 1 P14-15 One of the consequences of using samples to represent populations is that
The way in which a measurement varies around its this always leads to a certain degree of measurement error, no matter how
‘true’ value is referred to as ______ rigorous our sampling procedure is. Another source of measurement error
lies in the fact that our measurements are imprecise, that the measurement of
1. measurement error a psychological construct is only more or less accurate. This measurement
2. variance error is a kind of hidden variable, which we always presume to exist in social
3. hidden variables scientific research. This is referred to as the error component or the error
4. standard deviation term.
This is one of the major reasons for using statistical probability theory in our
data analysis: we assume that any variable we measure contains a 'true'
element and an 'error' component. Furthermore, we assume that the mean of
the error component is zero. We can do this because it is reasonable to
assume that positive deviations and negative deviations from the perfect
score (measurements that are too high or too low) will cancel each other out.
We also need to make an additional assumption, namely, that these error
terms are distributed around this mean of zero in a normal distribution
Variance (s²) and standard deviation (s) express the same concept (the standard deviation is the square root of
the variance), so options 2 and 4 cannot both be correct.
7 A social science researcher is told by a grade 1 teacher 2 P3 Constructs and their interrelations (how they affect each other, their patterns
that some children are terribly shy while other children of interaction) are used in this way to develop theoretical explanations of
seem to be quite comfortable in the social group. The why people behave in certain ways in certain contexts, or why mental
researcher decides to investigate, using a test for phenomena appear to be as they are. Psychologists try to develop
shyness which was developed especially for young explanations for human experiences and behaviour. To do this, they often
children. In this study, ‘shyness’ would be a ______ have to make use of abstract concepts (also called constructs) that serve as
while the measurement of it is referred to as a ______ explanations for the behaviour they observe.
1. variable, construct P7 A construct that has been measured in some way produces a variable.
2. construct, test A variable refers to a number that can take on any one of a range of possible
3. construct, variable values. They can be discrete (when only whole numbers like 1, 2, 3 are
4. concept, scale allowed) or continuous (what mathematicians refer to as 'real numbers'). In
some cases variables also take on values smaller than zero to produce
negative numbers.
9 Numeric values which represent some kind of 2 P7 A construct that has been measured in some way produces a variable.
psychological measurement and which can change A variable refers to a number that can take on any one of a range of possible
from one measurement to the next are referred to as values. They can be discrete (when only whole numbers like 1, 2, 3 are
______ allowed) or continuous (what mathematicians refer to as 'real numbers'). In
some cases variables also take on values smaller than zero to produce
1. statistics negative numbers.
2. variables
3. parameters
4. constructs
10 A psychologist is conducting research into hypnosis. 1 P8-9 The dependent variable is the one that is predicted or explained, and the
She believes that a relationship exists between a P24 independent variable is manipulated to see how it affects the dependent
person's suggestibility (susceptibility to hypnosis) and variable.
his or her level of self esteem. In this design,
‘suggestibility’ is the ______ variable and ‘level of self The independent variable is that variable which affects the dependent
esteem’ is the ______ variable variable; or, conversely, the dependent variable depends on the independent
variable.
1. dependent, independent
2. latent, manifest When a researcher focuses on the interaction of only two variables at a time,
3. independent, dependent the dependent variable is usually the one that the researcher is interested in,
4. hidden, operational the variable that is the focus of the research. The independent variable is
something that the researcher manipulates, to see how this affects the
dependent variable (in other words, the dependent variable is dependent on
the independent variable).
# Question Ans Page Comments
11 The symbol ______ is usually used to indicate the 1 P161
mean of a sample, while the mean of the population
from which the sample comes is indicated by the
symbol ______
1. x̄, μ
2. s, σ
3. α,
4. μ, σ
Summary value | Population symbol (Parameter) | Sample symbol (Statistic)
Arithmetic mean | μ | x̄
Standard deviation | σ | s
Variance | σ² | s² (s=√s²)
Standard error of mean (also called standard deviation of the sampling distribution of the mean) | σx̄ (= σ/√n) | sx̄ (= s/√n)
Mean of sampling distribution of the mean (mean of all means; this is equal to the mean under H0) | µ
Z score for means | z
Correlation between two measurements (Pearson's r) | ρ | r
Proportions | P | p
12 Which best describes “research hypothesis"? 2 Tut201 A psychological hypothesis formulates a testable empirical claim (something
2012 Q8 that can in principle be observed), and this usually involves postulating a
1. A proven relation between two constructs relationship between two or more variables.
2. A proposed relation between two or more variables
3. A network of all the possible relations between P1 A research hypothesis is formed as a clear statement in terms of a
constructs P18-19 relationship among the constructs (and the variables by which they are
4. A scientific theory measured). It is a statement about a possible relationship among constructs
that may explain some set of observations that one intends to investigate.
# Question Ans Page Comments
13 Which of the following does NOT represent a possible 3 P33 The probability value tells us at a glance how frequent or infrequent the event
value for a probability? is, and what the likelihood is of obtaining a favourable outcome associated
with it.
1. 99% • Probabilities can be expressed as percentages (e.g. a 10% probability),
2. 0 as fractions (e.g. a 1/10 probability), or as decimals (e.g. a 0.10
3. -0.05 probability).
4. 1.0 • A probability value represents a proportion (i.e. the proportion of
outcomes supporting the event). A proportion is a decimal number
between 0 and 1 and indicates the fraction of the total.
• We often refer to the probability of an event (or statistic) as its p-value.
• When decimal notation is used to describe probabilities, they fall in a
range between 0 and 1, with values closer to 1 indicating a greater
likelihood (or chance of success) than values close to zero.
• Because probabilities fall in a range from 0.0 to 1.0 when expressed
decimally, a probability can never be higher than 1 or lower than 0. The
general rule is written symbolically as follows: 0 ≤ p ≤ 1.
14 A jar contains 5 red, 8 blue, 3 green and 4 yellow 4 P29 Number of possible outcomes = Total marbles = 20 (5+8+3+4)
marbles. What is the probability that a person who is Number of favourable events = Pick one blue marble = 8
blindfolded will choose a blue marble purely at
random? p(E) = Number of favourable events / Number of possible outcomes
= 8/20 = 2/5 = 0.40
1. 0.20
2. 0.25
3. 0.50
4. 0.40
15 Consider the same jar, filled with the same number of 2 P29 Number of possible outcomes = Total marbles = 20 (5+8+3+4)
marbles as in the previous question. What is the Number of favourable events = Pick one red marble = 5 OR
probability that a person would choose either a red Pick one yellow marble = 4
marble or a yellow one?
p(E) = Number of favourable events / Number of possible outcomes
= (5+4)/20 = 9/20 = 0.45
1. 0.50
2. 0.45
3. 0.25
4. 0.90
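The two marble calculations (questions 14 and 15) follow the same pattern of counting favourable outcomes and dividing by the number of possible outcomes. A small sketch, in which the dictionary `jar` and the helper `probability` are illustrative names only:

```python
from fractions import Fraction

# Contents of the jar as given in question 14.
jar = {"red": 5, "blue": 8, "green": 3, "yellow": 4}


def probability(jar, *colours):
    """p(E) = number of favourable outcomes / number of possible outcomes."""
    favourable = sum(jar[colour] for colour in colours)
    possible = sum(jar.values())
    return Fraction(favourable, possible)


p_blue = probability(jar, "blue")
p_red_or_yellow = probability(jar, "red", "yellow")
print(float(p_blue), float(p_red_or_yellow))  # 0.4 0.45
```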
# Question Ans Page Comments
16 A researcher applies a test of creativity on a sample of 1 P53 Score of 8 or more :
fine arts students. She creates the following graph Score of 8 = 3
based on the results, where the horizontal (x) axis Score of 9 = 5
represents the scores on the creativity test and the Score of 10 = 2
vertical axis (Y) are frequencies (counts for each score
are indicated on top of the bars in the graph) Number of participants (N) = 30 (2+4+8+6+3+5+2)
Formula is: p = (number of participants with a score of 8 or more) / N
So:
p = (3+5+2) / 30
= 10 / 30
= 0.33
1. 0.33
2. 0.25
3. About 50%
4. More information is needed, the p-value will have to
be calculated from the raw data
17 The expression “0.05 ≤ p ≤ 0.10" denotes a probability 4 P33-34 Larger than or equal to 0.05 and smaller than or equal to 0.10
value which is ______
Because probabilities fall in a range from 0.0 to 1.0 when expressed
1. a number halfway between 0.05 and 0.10 decimally, a probability can never be higher than 1 or lower than 0. The
2. larger than or equal to 0.10 or smaller than or equal general rule is written symbolically as follows: 0 ≤ p ≤ 1. Note that a
to 0.05 probability can be 0, but to say that a probability is 0 is actually the same as
3. larger than 0.05 and smaller than 0.10 saying that the event is impossible and can never happen. Likewise, to say
4. larger than or equal to 0.05 and smaller than or that the probability of an event is 1 is to assert that it is an absolute certainty.
equal to 0.10 In actual practice, probabilities fall within these two extremes.
You will typically encounter reference to probabilities in expressions such as
''p > 0.05''. This statement is interpreted as ''the probability value is higher
than 0.05''.
# Question Ans Page Comments
18 What would the z-score be for a soldier with a score of 4 P53 Formula is: z = (x - μ) / σ
5.5?
1. -0.25
2. 0 Where:
3. 0.5 x = 5.5 (score of soldier)
4. 1 μ = 3.5 (mean of the normally distributed scores)
σ = 2.0 (standard deviation)
So:
z = (x - μ) / σ = (5.5 - 3.5) / 2.0 = 2 / 2.0 = 1
19 What would the probability be of a soldier getting a 2 The z-score for a soldier getting 5.5 = 1 (1.00)
score of greater than 5.5?
Refer to the standard normal distribution table and lookup 1.00
1. 0.84 • The larger portion for z=1 is 0.8413 (0.84)
2. 0.16 • The smaller portion for z=1 is 0.1587 (0.16)
3. 0.34
4. The p-value will have to be calculated using the raw Since the score of 5.5 lies above the mean of 3.5, the area below 5.5 forms the larger
data portion. For a soldier to get a score greater than 5.5, we have to look at the
smaller portion, which is 0.1587 (0.16).
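The table lookup for questions 18 and 19 can be reproduced with Python's standard library instead of the printed z-table. The mean (3.5) and standard deviation (2.0) are the values quoted in the comments above; the variable names are illustrative only.

```python
from statistics import NormalDist

mu, sigma = 3.5, 2.0
score = 5.5

z = (score - mu) / sigma
larger_portion = NormalDist().cdf(z)   # area below z in the standard normal distribution
smaller_portion = 1 - larger_portion   # area above z, i.e. p(score > 5.5)
print(z, round(larger_portion, 4), round(smaller_portion, 4))  # 1.0 0.8413 0.1587
```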
# Question Ans Page Comments
20 A variable X is found to be normally distributed. If the 2 P53-54 Figure 2.7 above shows the approximate proportions of scores distributed
probability distribution of this variable is plotted, what under the area covered by the curve.
would the total size of the area under the curve be, to • The total area under the curve gives the probability of the interval between -∞
the left side of the sample mean? and +∞, and is equal to +1 (i.e., the probability of any value of z falling
between minus and plus infinity is equal to 1).
1. 100% • Because the distribution is symmetrical, 0.5 of the area lies to the
2. 0.5 left of the mean and the same proportion to the right of the mean.
3. 1 • Approximately 0.341 of the area lies between the mean and 1 standard
4. 0 deviation in each direction.
• Roughly two-thirds, or 0.682 (0.341 x 2) of the area of the curve lies within
one standard deviation of the mean.
• Approximately 0.477 (i.e. 0.3413 + 0.1359) of the area lies between the
mean and 2 standard deviations in each direction.
• Approximately 0.954 (i.e. 0.477 x 2) of the area lies within 2 standard
deviations from the mean.
• Approximately 0.997 (i.e. 0.954 + (0.0215 x 2)) of the area lies within
three standard deviations from the mean.
The researcher calculates descriptive statistics for this sample, and finds a mean of Y = 120 and a standard deviation of 10. Since the sample data is
roughly normally distributed, she draws the graph below.
# Question Ans Page Comments
21 Based on the data in the scenario, what would the 1 P68 The standard deviations in the graph are the square roots of the variances
variance of the distribution of the scores be? s² (s=√s²)
1. 100 P160 The variance is just the square of the standard deviation. Conversely, the
2. 10 standard deviation is the square root of the variance. Variance gives an
3. 12 indication of how much the data varies around the mean; the 'width' of the
4. 1 distribution (in both directions). The advantage of using standard deviation is
that it is expressed in the same units (the same measurement scale) as the
original data, while the variance represents a measurement in squares (x²).
• For a sample, the variance is s²
• For a population, the variance is σ²
1. 120
2. 10
3. 1
4. It is the range of values between 110 and 130
25 Which of the following expressions of the rule for 2 P34-36 The additive rule is p(A or B) = p(A) + p(B). This rule is used when two or
combining mutually exclusive probabilities is correct? more events are mutually exclusive. The additive rule is used to determine
the sum of two or more probabilities, and is signalled by the use of the word
P(A or B) = 'or' (i.e. the probability of A or B).
1. P(A) / P(B) The multiplicative rule states that p(A and B) = p(A) x p(B) where A and B
2. P(A) + P(B) are both independent events. This rule is used to determine the product of
3. P(A) x P(B) two or more probabilities and is indicated by the word 'and' (i.e. the probability
4. P(A) - P(B) of A and B).
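The two rules in Question 25 can be illustrated with a fair die. This is a small sketch only; the helper names `p_either` and `p_both` and the die example are assumptions chosen for illustration.

```python
from fractions import Fraction

def p_either(p_a, p_b):
    """Additive rule for mutually exclusive events: p(A or B) = p(A) + p(B)."""
    return p_a + p_b

def p_both(p_a, p_b):
    """Multiplicative rule for independent events: p(A and B) = p(A) x p(B)."""
    return p_a * p_b

one_sixth = Fraction(1, 6)  # probability of any single face of a fair die

# Rolling a 1 OR a 2 on one throw (mutually exclusive outcomes): 1/3
print(p_either(one_sixth, one_sixth))

# Rolling a 1 on the first die AND a 1 on the second die (independent throws): 1/36
print(p_both(one_sixth, one_sixth))
```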
28 When a statistical test is performed, the size of the p- 1 P84 The test statistic is a value with a known probability distribution: we can use it
value will be a consequence of ______ to determine what the probability is of finding an effect of a particular size,
which we refer to as the p-value. It is because of our knowledge of the
1. the value of the test statistic probability distribution of the test statistic that we can determine the p-value.
2. a choice made by the researcher
3. the null hypothesis We compare this p-value with a level of significance (α) that we chose before
4. the level of significance at which the test is we did the sampling and made the observation. This is chosen by the
performed researcher, based on the risk of being wrong when rejecting the null
hypothesis that he or she is willing to take. If the p-value associated with the
test statistic is smaller than this α-value, the null hypothesis is rejected and
the alternative hypothesis accepted. If not, the null hypothesis is not rejected.
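The comparison of the p-value with α described above amounts to a simple decision rule. The sketch below assumes the rule "reject H0 if p is smaller than α"; the function name `decide` and the example p-value of 0.03 are hypothetical.

```python
def decide(p_value, alpha):
    """Apply the decision rule: reject H0 only if the p-value is smaller than alpha."""
    if p_value < alpha:
        return "reject H0"
    return "do not reject H0"

# The same (hypothetical) p-value can lead to different decisions,
# depending on the level of significance chosen beforehand.
print(decide(0.03, alpha=0.05))  # reject H0
print(decide(0.03, alpha=0.01))  # do not reject H0
```

Note that α is chosen by the researcher before the data is collected, while the p-value follows from the test statistic.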
# Question Ans Page Comments
29 When a researcher sets the level of significance to α = 1 When alpha reduces, the probability of Type I (α) error decreases and Type II
0.01 during hypothesis testing, it implies that the (β) increases.
probability of making an error of ______ will be at
______ 1% SG 82- An error of Type I is the error we make if we reject the null hypothesis when
86 we should not have done so, and the level of significance (α) represents the
1. Type I, most greatest risk of doing this that we are willing to take.
2. Type I, least
3. Type II, most An error of Type II is the opposite of Type I. We fail to reject the null
4. Type II, least hypothesis when we were supposed to.
P85 Generally, though, the smaller α, the larger β. If we wish to avoid Type I
errors, we set α to a small value such as 0.01 or even 0.001, but if we
want to avoid Type II errors, we could set α to a larger value.
30 Before doing statistical testing, a researcher sets the 3 P85 See above comments
level of significance to 0.05. This is the ______
31 Which of the statements below are true? 4 P83 A test statistic is used to determine a p-value
A test statistic ______ P84 We calculate a test statistic that is an indication of how far the observed
(a) is used to determine a p-value effect - as reflected in the sample data - deviates from what the null
(b) is used to determine the value of α hypothesis tells us to expect (if it were true).
(c) shows how far an observed measurement The test statistic is a value with a known probability distribution: we can use it
deviates from what can be expected by chance to determine what the probability is of finding an effect of a particular size,
(d) indicates the probability of making an error if which we refer to as the p-value. It is because of our knowledge of the
the null hypothesis is rejected probability distribution of the test statistic that we can determine the p-value.
We compare this p-value with a level of significance (α) that we chose before
1. Only (a) we did the sampling and made the observation. This is chosen by the
2. Only (b) researcher, based on the risk of being wrong when rejecting the null
3. Both (c) and (d) hypothesis that he or she is willing to take. If the p-value associated with the
4. Both (a) and (c) test statistic is smaller than this α-value, the null hypothesis is rejected and the
alternative hypothesis accepted. If not, the null hypothesis is not rejected.
The p-value represents the probability of obtaining the observed effect if the
null hypothesis is true, that is, if the effect we see in our observation is due
to chance effects like measurement error.
# Question Ans Page Comments
32 When a statistical test yields a large p-value, which of 3 P83-84 The p-value represents the probability that the null hypothesis is true: that the
the following statements is most correct? effect we see in our observation is due to chance effects like measurement
error. If this probability is small, we conclude that H0 is not true, and we
1. The alternative hypothesis is probably true reject it. If this probability is large, we conclude that H0 is probably true,
2. The null hypothesis is probably false and we fail to reject it (the research hypothesis could not be confirmed)
3. The null hypothesis is probably true
4. The alternative hypothesis cannot be rejected This p-value is also a direct indication of the probability that the null
hypothesis is being mistakenly rejected. In other words, it shows the
probability that the researcher is rejecting a null hypothesis that is actually
true.
33 When applying a t-test to compare a sample mean 3 P110 Determine the p-value, which tells you what the probability of this observed
calculated from a measurement to a known population relationship (indicated by the test statistic) would be under the null
mean, the p-value represents ______ hypothesis.
1. the probability of correctly rejecting the null P77-78 Calculating the probability of the sample result under the null hypothesis
hypothesis
2. the probability of obtaining the sample mean under
the alternative hypothesis
3. the probability of obtaining the sample mean under
the null hypothesis
4. the largest risk of making an error by rejecting the
null hypothesis that one is willing to take
34 Under which of the circumstances below would you 3 P103 The t-distribution is a statistical distribution with a probability distribution that
make use of a t-test statistic? can be determined, which means that we can use it to predict the chances of
obtaining specific outcomes when testing for comparisons of means when
1. When comparing from two independent variables the population standard deviation σ is unknown.
2. The sample standard deviation is unknown
3. The population standard deviation is unknown App F Three types of t-tests:
4. The sample is not known to be normally distributed P177 t test - Difference between one group and a constant, σ is unknown
tc test - Difference between two independent groups, σ is unknown
td test - Difference between two dependent groups, σ is unknown
# Question Ans Page Comments
35 When a statistical test yields a very small p-value, we 4 P81 Here is a summary of the important points regarding the p-value:
know that the sample result is very ______ • The p-value gives the probability of obtaining the sample result under H0.
• If the p-value is very small, the probability is very small that the sample
1. likely under the null hypothesis result would occur under H0, and one should consider rejecting H0 in
2. unlikely under the alternative hypothesis favour of H1.
3. likely at a specific level of significance • The smaller the p-value, the more likely that the null hypothesis is false
4. unlikely under the null hypothesis and should be rejected in favour of the alternative hypothesis.
So, if the p-value is very large, the probability is very big that the sample
result would occur under H0, and one should not reject H0 in favour of H1.
The null hypothesis is then probably true.
36 A type I error occurs when the ______ 1 SG 82- An error of Type I is the error we make if we reject the null hypothesis
86 when we should not have done so, and the level of significance (α)
1. null hypothesis is wrongly rejected represents the greatest risk of doing this that we are willing to take.
2. null hypothesis is not rejected when it should be
3. alternative hypothesis is wrongly rejected An error of Type II is the opposite of Type I. We fail to reject the null
4. p-value exceeds the level of significance hypothesis when we were supposed to.
P85 Generally, though, the smaller α, the larger β. If we wish to avoid Type I
errors, we set α to a small value such as 0.01 or even 0.001, but if we want to
avoid Type II errors, we could set α to a larger value.
37 Which one of the following alternative hypotheses 4 P75 The alternative hypothesis can contain any of the symbols '>', '<' or '≠'
requires a non-directional test of significance? respectively, the symbols for 'larger than', 'smaller than' or 'not equal to'.
When a comparison is between a value that is greater (more) than another,
1. The mean anxiety score for boys is greater than we use the symbol '>' and when a comparison is between a value that is
that of girls smaller (less than) than another, we use '<'. The statistical test that must be
2. The mean verbal ability score for boys is lower than performed in either of these cases is a directional or one-tailed statistical
that of girls test (we use these expressions interchangeably).
3. There is no correlation between the test marks and
examination marks for a group of boys and girls When we do not specify what the direction of the difference should be,
4. There is not a significant correlation between the and both a larger and a smaller difference between means are
anxiety scores of boys and those of girls considered as relevant, the symbol '≠' must be used. The statistical test
to be performed will now be a non-directional or two-tailed test.
H0: μ = 100
H1: μ ≠ 100
Where both values of the mean, either greater than or smaller than 100 are to
be considered, a non-directional or two-tailed test is required.
# Question Ans Page Comments
38 Which of the following statements about the p-value 4 P81 Here is a summary of the important points regarding the p-value:
are true? • The p-value gives the probability of obtaining the sample result under H0.
• If the p-value is very small, the probability is very small that the sample
a) It gives the probability of making an error of result would occur under H0, and one should consider rejecting H0 in
Type I favour of H1.
b) It should exceed the level of significance • The smaller the p-value, the more likely that the null hypothesis is false
c) If it is relatively large the null hypothesis will and should be rejected in favour of the alternative hypothesis.
probably have to be rejected
d) If it is less than or equal to the level of So, if the p-value is very large, the probability is very big that the sample
significance, H0 should be rejected result would occur under H0, and one should not reject H0 in favour of H1.
The null hypothesis is then probably true.
1. (d) and none of the others
2. (b) and (c)
3. (a) and (b)
4. (a) and (d)
39 Which of the following are appropriate ways to express 4 P75 The alternative hypothesis (H1) can contain any of the symbols '>', '<' or '≠'
an alternative hypothesis when a formal statistical respectively, the symbols for 'larger than', 'smaller than' or 'not equal to'.
hypothesis is being formulated? When a comparison is between a value that is greater (more) than another,
we use the symbol '>' and when a comparison is between a value that is
a) μ1 = μ2 smaller (less than) than another, we use '<'. The statistical test that must be
b) μ1 ≠ μ2 performed in either of these cases is a directional or one-tailed statistical
c) μ1 > μ2 test (we use these expressions interchangeably).
d) μ1 < μ2
When we do not specify what the direction of the difference should be,
and both a larger and a smaller difference between means are
1. Only (a) considered as relevant, the symbol '≠' must be used. The statistical test
2. Only (b) to be performed will now be a non-directional or two-tailed test.
3. Both (c) and (d) only
4. (b), (c) and (d)
40 During statistical hypothesis testing, a p-value is 3 P82 The decision rule for H0 is simply as follows:
calculated based on a test statistic. If this p-value is If the p-value of the sample result is smaller (less) than α (level of
______ than the level of significance, the ______ significance), the null hypothesis is rejected. If the p-value is not smaller than
hypothesis should be ______ α, the null hypothesis (H0) is not rejected.
1. z-tables P84 We calculate a test statistic that is an indication of how far the observed
2. size of the test statistic effect - as reflected in the sample data - deviates from what the null
3. null hypothesis statement hypothesis tells us to expect (if it were true).
4. level of significance The test statistic is a value with a known probability distribution: we can use it
to determine what the probability is of finding an effect of a particular size,
which we refer to as the p-value. It is because of our knowledge of the
probability distribution of the test statistic that we can determine the p-value.
We compare this p-value with a level of significance (α) that we chose before
we did the sampling and made the observation. This is chosen by the
researcher, based on the risk of being wrong when rejecting the null
hypothesis that he or she is willing to take. If the p-value associated with the
test statistic is smaller than this α-value, the null hypothesis is rejected and the
alternative hypothesis accepted. If not, the null hypothesis is not rejected.
The p-value represents the probability of obtaining the observed effect if the
null hypothesis is true, that is, if the effect we see in our observation is due
to chance effects like measurement error.
# Question Ans Page Comments
43 A failure to reject H0 implies that a difference between 3 P83 We would say that our result does not enable us to conclude that H0 is false,
the calculated sample mean and its expected value or even that the result favours the null hypothesis, but that we cannot accept
under H0 is due to ______ it is literally true. Even if our sample mean is exactly 100, there is a remote
possibility that this is a chance event (due to measurement or sampling
1. the dependent variable error). What we do know in such a case is that there is no indication that H1
2. the independent variable can be true and no reason to do a test to confirm this.
3. chance
4. the test statistic P87 The implication is that we have to be careful how we interpret significant
results. A p-value of smaller than our chosen level of significance (α) simply
implies that, relative to this sample, it is improbable that the effect we see in
our observations is purely due to chance. It does not imply that the effect is
big or important. This is something that we have to decide by looking at what
the data means.
Base your answers to Questions 44 to 47 on the following scenario
Based on her experience, a developmental psychologist formulates a hypothesis that infants of younger than four months old tend to look at their mother's
faces for longer periods of time than at the faces of strangers. She also knows that past research has established that an infant will look at a picture of a
random human face for an average of ten seconds before looking away.
To test her hypothesis, the psychologist selects a sample of n = 25 infants of between one and four months old. She presents each infant with a picture of
their mother on a video screen and records for how long each infant attends to the image before looking away.
After collecting the data, she calculates that the infants in her sample spend a mean of 12.5 seconds looking at the images before looking away, with a
standard deviation of s = 5.5 seconds.
44 Which research design is the most appropriate? 2 P99 A mean based on a single sample is to be compared to a specific value - a
population mean that is treated as a given.
1. Correlational design
2. Single sample group design
3. Two-groups design for dependent samples
4. Two-groups design with a known population mean
45 Which is the most appropriate way of formulating the 1 "a developmental psychologist formulates a hypothesis that infants of
relevant statistical hypotheses? younger than four months old tend to look at their mother's faces for longer
periods of time than at the faces of strangers"
1. H0 μ = 10, H1 μ > 10
2. H0 μ = 10, H1 μ < 10 The term "longer" indicates a directional hypothesis (greater than ">")
3. H0 μ ≠ 10, H1 μ > 10
4. H0 μ = 10, H1 μ ≠ 10
# Question Ans Page Comments
46 Based on the information presented in the scenario, 4 P103 The t-distribution is a statistical distribution with a probability distribution that
which would be the most appropriate test statistic to can be determined, which means that we can use it to predict the chances of
use out of the following? obtaining specific outcomes when testing for comparisons of means when
the population standard deviation σ is unknown.
1. The z-statistic for the mean of a single sample (z)
2. The t-statistic for the difference between the means App F Three types of t-tests:
of two independent samples (tc) P177 t test - Difference between one group and a constant, σ is unknown
3. The t-statistic for the difference between the means tc test - Difference between two independent groups, σ is unknown
of two dependent samples (td) td test - Difference between two dependent groups, σ is unknown
4. The t-statistic for the mean of a single sample (t)
z test - Difference between one group and a constant, σ is known
47 Given the scenario above, what would the calculated 2 P105 Standard error = s/√n
value of the standard deviation of the distribution of the where: s = 5.5
means (the standard error) be? n = 25
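The calculation for Question 47 can be worked through with the values from the scenario (n = 25, sample mean 12.5, s = 5.5, population mean 10). This is a sketch only: the variable names are chosen for illustration, and the single-sample t formula t = (sample mean - μ) / standard error is the standard one rather than a quotation from this document.

```python
import math

n = 25              # sample size from the scenario
sample_mean = 12.5  # mean looking time in seconds
s = 5.5             # sample standard deviation
mu_0 = 10           # population mean under H0

# Standard error of the mean: s / sqrt(n) = 5.5 / 5
standard_error = s / math.sqrt(n)
print(round(standard_error, 2))  # 1.1

# Single-sample t statistic, with n - 1 degrees of freedom
t = (sample_mean - mu_0) / standard_error
df = n - 1
print(round(t, 2), df)  # 2.27 24
```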
48 The standard error of the mean for samples of a 3 P103- This is the standard deviation of the distribution of the means (or standard
specific size is the ______ 104 error of the mean), which we can calculate using the central limit theorem:
She draws a random sample of 100 children from a specific primary school and after investigation of their histories of television watching, allocates them
into two groups, a TV-Group of 45 children with a history of watching educational programmes and a Non-TV Group of 55 children with no such history.
At the end of the school year, she compares the final year marks of the two groups
# Question Ans Page Comments
51 Considering the scenario above, which of the following 4 Note:
statements are true? • The two groups are independent: the children are not paired or matched.
(a) The two groups are dependent because they come • The dependent variable will be their final year marks as this will be
from the same school influenced by the independent variable.
(b) Watching television is the dependent variable and • The independent variable is watching educational programmes on TV
year mark is the independent variable • She hypothesises that the primary school children who regularly watch
(c) A one-tailed test would be required educational programmes on television get better general grades in
primary school than those who do not
1. (a) and none of the others o This means the grades are greater than (>)
2. (a) and (c)
3. (b) and (c) Samples are considered as comprising independent groups if the
4. (c) and none of the others P112 composition of the one sample in no way affects, in any systematic way, the
composition of the other sample. The two samples come from two groups
that have no obvious relationship. For example, where one sample is
measurements of a construct like 'self-esteem' among men, and the other
among women, but both groups were sampled purely randomly.
52 Which of the following is an appropriate description of 1 The researcher only focuses on primary school children, so that will be the
the research population in the scenario? whole population from which she will take a sample.
54 A psychotherapist wants to test the effectiveness of a 1 P112 Samples are considered as comprising independent groups if the
programme of cognitive behavioural therapy on clients composition of the one sample in no way affects, in any systematic way, the
who were diagnosed as suffering from high social composition of the other sample. The two samples come from two groups
anxiety. She uses a sample of 50 persons who were that have no obvious relationship. For example, where one sample is
diagnosed as persons with high anxiety and tests them measurements of a construct like 'self-esteem' among men, and the other
on a Social Anxiety Scale before the commencement of among women, but both groups were sampled purely randomly.
the series of therapy sessions, and again afterwards.
The two sets of measurements should be regarded as On the other hand, the concept of dependent groups refers to situations
______ where the samples are related, and it implies that each subject in one group
can be systematically paired off with a subject from the other group. For this
1. dependent reason, a dependent groups research design is often referred to as a
2. independent matched-pairs design.
3. drawn from a single population
4. highly correlated Sometimes dependent samples are produced when the researcher
deliberately matches subjects into pairs, based on the value of some hidden
or 'nuisance' variable. Another example of such a design would be a repeated
measures design, where the same research participant is observed under
more than one treatment or experimental condition
# Question Ans Page Comments
55 A researcher wants to compare two group means by P81 There is no correct answer for this question.
testing the following hypotheses at a significance level
of α = 0.05 Remember that H1 μ1 > μ2 is a directional hypothesis and requires a one-tail
test.
H0 μ1 = μ2 The p-value provided by the computer is non-directional (two-tailed) and must
H1 μ1 > μ2 therefore be divided by 2 to give a one-tail value.
1. H0 can be rejected because α < p-value But also remember that the alpha value is normally indicated for one-tail.
2. H0 cannot be rejected because α < p-value Therefore we should also be able to multiply it by 2 to get it to a two tail
3. H0 can be rejected because α / 2 < p-value comparison with the two-tailed p-value.
4. H0 cannot be rejected because α x 2 > p-value So: α x 2 > p-value is the same as p-value / 2 < α
This means in the question that α=0.05 x 2 = 0.10 which is greater than p=0.07
So p / 2 = 0.035 < α = 0.05 and therefore H0 should be rejected.
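The reasoning for Question 55 can be sketched as follows, using the values quoted in the comments (α = 0.05 and a two-tailed p-value of 0.07). The function name `one_tailed_p` is hypothetical, and halving the p-value is only valid when the observed difference lies in the direction stated in H1.

```python
def one_tailed_p(two_tailed_p):
    """Convert a two-tailed p-value to a one-tailed p-value.

    Assumes the observed difference is in the direction predicted by H1.
    """
    return two_tailed_p / 2

alpha = 0.05
p_two_tailed = 0.07  # value quoted in the comments for this question

p_one_tailed = one_tailed_p(p_two_tailed)
print(round(p_one_tailed, 3))  # 0.035

# The two comparisons are equivalent: p/2 < alpha is the same as p < alpha x 2.
print(p_one_tailed < alpha)        # True
print(p_two_tailed < alpha * 2)    # True
```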
56 Two samples can be considered independent when 4 P112 Samples are considered as comprising independent groups if the
______ composition of the one sample in no way affects, in any systematic way, the
composition of the other sample. The two samples come from two groups
1. care was taken that there were no hidden variables that have no obvious relationship. For example, where one sample is
that could affect them measurements of a construct like 'self-esteem' among men, and the other
2. care was taken that the samples are drawn under among women, but both groups were sampled purely randomly.
different experimental or treatment conditions
3. the samples are drawn from more than a single On the other hand, the concept of dependent groups refers to situations
population of subjects where the samples are related, and it implies that each subject in one group
4. there is no systematic matching of individuals of can be systematically paired off with a subject from the other group. For this
one sample with individuals from the other one reason, a dependent groups research design is often referred to as a
matched-pairs design.
To develop this adjusted t-test, we use the two matched samples to create a
new variable called 'd'. We do this by computing a 'difference score' between
measurements 1 and 2, so that d̄ reflects the mean of the differences between the
measurements before and after (measurement 1 minus measurement 2).
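The difference scores described above can be computed as in the sketch below. The before and after scores are invented purely for illustration, and the formula t = mean difference / (s of differences / √n) is the usual dependent-samples t statistic, stated here as an assumption rather than quoted from the study guide.

```python
import math
import statistics

# Hypothetical anxiety scores for five clients, before and after therapy
before = [30, 28, 35, 32, 27]
after = [25, 27, 30, 29, 26]

# One difference score per matched pair (measurement 1 minus measurement 2)
d = [b - a for b, a in zip(before, after)]
print(d)  # [5, 1, 5, 3, 1]

d_bar = statistics.mean(d)   # mean of the difference scores
s_d = statistics.stdev(d)    # sample standard deviation of the difference scores
t_d = d_bar / (s_d / math.sqrt(len(d)))

print(d_bar, s_d, round(t_d, 2))
```

Because each client contributes one difference score, the test is effectively a single-sample t-test on d, with n - 1 degrees of freedom, where n is the number of pairs.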
# Question Ans Page Comments
59 Which of the following statements about the 1 P106 Note that the bigger the t-value the greater the likelihood of rejecting H0 (as is
relationship between the value of a t-test statistic and the case with z-statistics), because it refers to how far the observed value of
the p-value is true, if the sample size (n) remains the sample statistic differs from the population parameter that was provided
constant? and refers to the areas on the edges of the distribution.
1. The larger the value of the t-test statistic, the This implies, the bigger the t-value, the smaller the p-value
smaller p will be
2. The smaller the value of the t-test statistic, the
smaller p will be
3. If the sample size n remains the same, the
relationship between the test statistic and the p-
value will remain constant
4. There is no specific relationship between the p-
value and the t-test statistic
60 Which of the following gives the best description of a 4 P73-75 By convention, the null hypothesis is usually indicated with the symbol H0
null hypothesis? The null hypothesis is the hypothesis
that ______ This hypothesis is referred to as the 'null hypothesis' because it is the
hypothesis that implies no effect.
1. expresses the research hypothesis through the use
of appropriate symbols The null hypothesis always contains the 'equal to' symbol '='. The null
2. indicates the direction of the difference that is hypothesis is the hypothesis that no effect exists, and in cases where we
expected between two groups are testing a mean, this implies that two group means (or a group mean and
3. expresses the probability that observed relationship a specific constant value) do not differ.
will be significant
4. states that there is no relationship between the
variables
61 In correlational research one investigates the relation 3 P130 Correlation is a measurement of the extent to which a measurement on one
between ______ variable is related to a measurement on another variable for the same sample
of individual cases.
1. the mean of a single sample of subjects and a
population mean
2. two groups of subjects, with respect to a single
variable
3. two variables measured on the same group of
subjects
4. the difference scores of two groups of test
measurements
# Question Ans Page Comments
62 A researcher hypothesizes that the greater the number 1 P161 You should take careful note of the following important distinctions between
of books read by pupils over a specific school year, the samples and populations. Summary values for populations are called
greater their language comprehension will be at the 'parameters' and are usually denoted by Greek letters, while summary values
end of that year. He studies a random sample of 100 for samples are called 'statistics' and are denoted by Roman letters.
pupils in grades 10 to 12 in a specific school, collecting Symbol
information on the number of books they read in a Summary value Populations Samples
specific year and letting them do a reading (Parameter) (Statistic)
comprehension test at the end of the year Arithmetic mean μ
Standard deviation σ s
Variance σ² s² (s=√s²)
Which is an appropriate formal expression of the Standard error of mean
alternative hypothesis for this research? (Also called Standard deviation of the σ (= σ/√n) s (= s/√n)
sampling distribution of the mean)
1. ρ>0 Mean of sampling distribution of the mean
(Mean of all means. This is equal to the mean µ
2. μ>0 under H0)
3. r >0 Z score for means z
4. ρ≠0 Correlation between two measurements
(Pearson's R)
ρ r
Proportions P p
H1: r ≠ 0 This implies that a relationship that differs significantly from zero
does in fact exist, but we are making no 'educated guesses' as to whether it
is a positive or negative relationship: we just want to know whether there is in
fact a relationship.
H1: r > 0 This implies that we want to establish whether a significant
relationship of greater than zero exists, that is, a significant positive
relationship.
H1: r < 0 This implies that we want to establish whether a significant
relationship of less than zero exists, that is, a significant negative relationship.
# Question Ans Page Comments
63 Which of the following can never be exactly zero? 2 P83-85 If the significance level (α) = 0, then there would be no possibility of the p-
value being lower than α, so the null hypothesis H0 could never be rejected
1. a probability
2. a level of significance
3. a correlation coefficient
4. a t-test statistic
64 A graph showing the position of each of a number of 3 P130- A graph showing the position of each of a number of sampling units on each
measurements on each of two variables is called a 132 of two variables
______
Tut202 A scatter plot is a graph showing the relationship between two numerical
1. histogram 2014 variables. In such a graph the data of the one variable are plotted on the
2. contingency table Q18 horizontal axis (usually referred to as the X axis), and the data of the other
3. scatter plot variable on the vertical (or Y) axis. It is not a comparison of sample and
4. correlation coefficient population, nor has it to do with spread of data or the independence of
variables
65 For a larger sample size (n) ______ 4 P139 If you randomly put three dots on a blank square of paper, they may, purely
by chance, fall into something approximating a straight line. If you make a
1. a smaller value of a Pearson's correlation hundred marks on the same piece of paper, also in a totally random way, the
coefficient r will reach significance chance of them falling in a straight line is, however, a lot less. This tells you
2. a larger value of a Pearson's correlation coefficient something about the relationship between r (a measure of whether the dots
r is required before the result will be significant on a scatter plot fall in a straight line) and the number of dots (the sample
3. the size of Pearson's correlation coefficient r is size n): the smaller n, the more likely it is that the plot will represent a straight
likely to increase line purely by chance. Therefore, for a smaller sample n, the test must be
4. the size of Pearson’s correlation coefficient r is much more conservative. You must, therefore, put up a bigger hurdle to be
likely to move closer to zero crossed before you conclude that the result is not the consequence of
chance. You, therefore, require a larger value of r before you can conclude
that the result is not a chance event due to sampling or measurement error,
but an actual representation of the state of affairs in the population.
67 After finding the correlation of r = -0.71 (as indicated in 1 P139 The squared correlation (r²) measures the proportion of variance in one
the previous question), the researcher decides to also variable that can be determined from its relationship with the other, or how
calculate the size of the effect of one variable on the much variance they have in common. It can be used as an indication of the
other. Given the information at his disposal, what is he size of the effect.
likely to conclude?
P140 Evaluating r²
1. About half of the variance in one of the variables is • r² = 0.01 = 1% Small effect
accounted for by the other one • r² = 0.09 = 9% Medium effect
2. About a quarter of the variance in one of the • r² = 0.25 = 25% Large effect
variables is accounted for by the other
3. About three quarters of the variance in one of the r = -0.71
variables is accounted for by the other one r² = (-0.71 x -0.71) = 0.5041 = 50%
4. He decides that before he can find the size of the
effect between the two variables, the two group About 50% of the variance in one of the variables is accounted for by the other
means and standard deviations will first have to be one
calculated
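The effect size calculation for Question 67 can be sketched as follows. The value r = -0.71 comes from the question; the function `effect_label` is hypothetical, and treating the benchmark values 0.01, 0.09 and 0.25 as lower cut-offs is an assumption made for illustration.

```python
def effect_label(r_squared):
    """Label an r-squared value using the benchmarks 0.01, 0.09 and 0.25.

    Using the benchmarks as lower cut-offs is an assumption for this sketch.
    """
    if r_squared >= 0.25:
        return "large"
    if r_squared >= 0.09:
        return "medium"
    if r_squared >= 0.01:
        return "small"
    return "negligible"

r = -0.71
r_squared = r ** 2  # squaring removes the sign of the correlation

print(round(r_squared, 4))      # 0.5041, i.e. about half of the variance
print(effect_label(r_squared))  # large
```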
# Question Ans Page Comments
Base your answers to Questions 68 to 70 on the following scenario
A sample of clients are drawn from three community welfare centres (indicated as A, B and C). Based on interviews, the clients are categorised into one of
four categories
• Those that have psychological problems, that is, they may require intervention by psychotherapists,
• Those that have welfare problems, that is, those who may require intervention by social workers,
• Those with health-related problems, who may require interventions by health care givers
• Others - these are clients who do not fit into any of the other three groups
Counts are made of those clients from the different centres who fit into each of these categories, and this is reflected in the contingency table below
SG P140; Tut202 2014 Q22
The chi-square test is usually used when you have a cross tabulation of frequency counts of events which are nominal scale measurements. This table is referred to as a contingency table. It is used to compare an observed frequency distribution (frequency counts based on a sample of observations) with the frequency distribution which we would expect to find if the null hypothesis of no relationship between two cross-tabulated variables were true.
The formula is
χ² = Σ (O − E)² / E
This means that the expected value (E) for each cell in the contingency table is subtracted from the observed value (O) for that cell, squared, and divided by the expected value for that cell. Then all of these terms are added together to yield χ².
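The calculation described above (subtract, square, divide by the expected value, then add up) can be sketched in Python. The function name `chi_square` is hypothetical, and the two-cell observed and expected counts are made-up numbers used only to show the arithmetic; they do not come from the exam.

```python
def chi_square(observed, expected):
    """Sum (O - E)^2 / E over all cells.

    Both arguments are flat lists with one entry per cell, in the same order.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected need the same number of cells")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Made-up counts, for illustration only
observed = [10, 20]
expected = [15, 15]

# (10 - 15)^2 / 15 + (20 - 15)^2 / 15 = 25/15 + 25/15
statistic = chi_square(observed, expected)
print(round(statistic, 2))
```

The larger the gap between observed and expected counts, the larger the statistic becomes.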
# Question Ans Page Comments
70 Given the data above, what would be the expected value (if the null hypothesis is true) for health-related problems in community centre B?
1. 20
2. 23.3
3. 70
4. 100
Ans: 2
Mental Health Centre (columns) by Type of Problem (rows)
Type of problem     A           B           C           Row totals
'Psychological'     27 (O11)    35 (O12)    32 (O13)    94 (O1.)
'Welfare'           16 (O21)    28 (O22)    22 (O23)    66 (O2.)
'Health-related'    16 (O31)    20 (O32)    34 (O33)    70 (O3.)
Other               29 (O41)    17 (O42)    24 (O43)    70 (O4.)
Column totals       88 (O.1)    100 (O.2)   112 (O.3)   300 (O..)
P142
The cell frequencies represent the way the information is distributed relative to the variables. These cell frequencies are often referred to as the observed or empirical cell frequencies. The question now is: How would these cell frequencies be distributed under the null hypothesis, that is, if H0 is actually true? Asked differently: What are the expected frequencies if the two categorical variables are truly independent?
We can indicate these expected cell frequencies by Eij and they are computed as follows:
E11 = (O1. x O.1)/O.. = (94 x 88)/300 = 27.57 (row 1, column 1)
E12 = (O1. x O.2)/O.. = (94 x 100)/300 = 31.33 (row 1, column 2)
E13 = (O1. x O.3)/O.. = (94 x 112)/300 = 35.09 (row 1, column 3)
E21 = (O2. x O.1)/O.. = (66 x 88)/300 = 19.36 (row 2, column 1)
E22 = (O2. x O.2)/O.. = (66 x 100)/300 = 22.00 (row 2, column 2)
E23 = (O2. x O.3)/O.. = (66 x 112)/300 = 24.64 (row 2, column 3)
E31 = (O3. x O.1)/O.. = (70 x 88)/300 = 20.53 (row 3, column 1)
E32 = (O3. x O.2)/O.. = (70 x 100)/300 = 23.33 (row 3, column 2)
E33 = (O3. x O.3)/O.. = (70 x 112)/300 = 26.13 (row 3, column 3)
E41 = (O4. x O.1)/O.. = (70 x 88)/300 = 20.53 (row 4, column 1)
E42 = (O4. x O.2)/O.. = (70 x 100)/300 = 23.33 (row 4, column 2)
E43 = (O4. x O.3)/O.. = (70 x 112)/300 = 26.13 (row 4, column 3)
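The expected frequencies can also be reproduced in Python as a check on the hand calculations. This is a minimal sketch: the observed counts are those in the contingency table for this scenario, while the variable names (`observed`, `row_totals`, `col_totals`, `expected`) are chosen here for illustration.

```python
# Observed counts: rows are problem types, columns are centres A, B, C
problem_types = ["Psychological", "Welfare", "Health-related", "Other"]
observed = [
    [27, 35, 32],
    [16, 28, 22],
    [16, 20, 34],
    [29, 17, 24],
]

row_totals = [sum(row) for row in observed]         # O1. to O4.
col_totals = [sum(col) for col in zip(*observed)]   # O.1 to O.3
grand_total = sum(row_totals)                       # O..

# Eij = (row total i x column total j) / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

for name, row in zip(problem_types, expected):
    print(name, [round(value, 2) for value in row])

# Question 70: health-related problems (row 3) in centre B (column 2)
e32 = expected[2][1]
print("E32 =", round(e32, 2))
```

Python lists are indexed from 0, so E32 (row 3, column 2) is `expected[2][1]`.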