PYC3704 Past Exam Questions

This document provides a compilation of past exam questions and answers from UNISA's PYC3704 Psychological Research course from 2012-2013. The answers are informed by course materials like the study guide as well as tutor and student feedback. Key concepts covered in past exams include research design types (groups vs correlational), measurement levels, statistical analyses for different variable types, and symbols used for populations and samples. Flow charts and tables are provided to help explain statistical concepts and procedures.

Uploaded by

gavhumende

PYC3704 - Past exam questions

2012-2013
PYC3704 (PYC304C)
Psychological Research

EXAM PREPARATION

This document is a compilation of past UNISA exam papers and their answers.

The answers are motivated by a combination of:


• Page references to the UNISA Study Guide for PYC3704: Psychological Research
• Feedback in past UNISA Tutorial Letters
• Personal views and comments from the author, other tutors and students

Past exams covered are:


• May-Jun 2012
• Oct-Nov 2012
• May-Jun 2013
• Oct-Nov 2013

Please note:
This document is an additional tool for exam preparation. The author takes no responsibility for incorrect answers. Students must ensure that they learn the
prescribed material and understand the content.
This document was sold on Stuvia.co.za. You may not redistribute this document.
PYC3704 Flow Chart
Does the researcher plan a groups design or a correlational design?

Groups design (compare 2 or more population groups)
  What is the measurement level of the dependent variable?
  • Interval or ratio scale: statistical hypotheses are about the mean (μ)
      Will one or two samples be selected?
      • One sample: is σ (population) known?
          Yes: z
          No: t
      • Two samples: are the groups independent?
          Yes: t (independent groups)
          No: t (dependent groups)
  • Categorical: statistics and hypotheses are about P (population proportion) (no longer in syllabus)

Correlational design (relationship between 2 or more variables in a single sample)
  What is the measurement level of the variables?
  • Both variables are categorical (qualitative; nominal or ordinal): Chi-square test
      Display data on a contingency table
  • Both variables are continuous (quantitative; interval or ratio scale): Pearson's correlation coefficient (r), -1 ≤ r ≤ 1
      Display data on a scatter plot
      Statistical hypothesis for r: H0: ρ = 0

Every branch ends with: Probability (p-value)
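The decision path in the flow chart can be sketched as a small Python function. This is only an illustrative sketch: the function name `choose_test`, its parameter names and the returned labels are assumptions made for this example and do not come from the study guide.

```python
def choose_test(design, level, samples=1, sigma_known=False, independent=True):
    """Walk the flow chart and return the suggested statistic (labels are illustrative)."""
    if design not in ("groups", "correlational"):
        raise ValueError("design must be 'groups' or 'correlational'")
    if level not in ("categorical", "continuous"):
        raise ValueError("level must be 'categorical' or 'continuous'")

    if design == "correlational":
        # Both variables categorical -> chi-square; both continuous -> Pearson's r
        return "chi-square test" if level == "categorical" else "Pearson's r"

    # Groups design from here on
    if level == "categorical":
        return "test about a proportion P (no longer in syllabus)"
    if samples == 1:
        # One sample: the choice depends on whether the population sigma is known
        return "z-test" if sigma_known else "t-test"
    # Any other sample count is treated as the two-sample branch
    return "t-test (independent groups)" if independent else "t-test (dependent groups)"


print(choose_test("groups", "continuous", samples=1, sigma_known=True))
print(choose_test("correlational", "categorical"))
```

Here "continuous" stands for interval or ratio scale, and "categorical" for nominal or ordinal.
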
Symbols - Populations and Samples

Description | Population (parameter) | Sample (statistic)
Arithmetic mean | μ | x̄
Standard deviation | σ | s (s = √s²)
Variance | σ² | s²
Standard error of the mean (also called the standard deviation of the sampling distribution of the mean) | σ_x̄ (= σ/√n) | s_x̄ (= s/√n)
Mean of the sampling distribution of the mean (the mean of all means; equal to the mean under H0; the central value of the sampling distribution) | µ_x̄ |
Z score for means | z |
Difference between scores | |
Standard deviation of sample of difference scores | | s_D
Correlation between two measurements (Pearson's r) | ρ | r
Proportions | P | p

α: Level of significance, set by the researcher at the start of the project. It is the probability of making a Type I error (mistakenly rejecting H0 when it is true).
β: Probability of making a Type II error (not rejecting H0 when H0 is false and H1 is true).
r²: Squared correlation. Can be used as an indication of the size of an effect.
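The sample statistics in the table can be computed with Python's standard library. This is a minimal sketch: the scores are made up for illustration, and it assumes the usual n − 1 definition of the sample variance s², which is what `statistics.variance` uses.

```python
import math
import statistics

scores = [4, 8, 6, 5, 7]  # hypothetical sample, for illustration only

n = len(scores)
mean = statistics.mean(scores)            # sample mean (x-bar)
variance = statistics.variance(scores)    # s squared, divides by n - 1
std_dev = math.sqrt(variance)             # s = square root of s squared
std_error = std_dev / math.sqrt(n)        # standard error of the mean, s / sqrt(n)

print(mean, variance, round(std_dev, 4), round(std_error, 4))
```
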
Standard Normal Distribution (z)
Columns (repeated five times across each row): z, Mean to z, Larger portion, Smaller portion
0.00 0.0000 0.5000 0.5000 0.40 0.1554 0.6554 0.3446 0.80 0.2881 0.7881 0.2119 1.20 0.3849 0.8849 0.1151 1.60 0.4452 0.9452 0.0548
0.01 0.0040 0.5040 0.4960 0.41 0.1591 0.6591 0.3409 0.81 0.2910 0.7910 0.2090 1.21 0.3869 0.8869 0.1131 1.61 0.4463 0.9463 0.0537
0.02 0.0080 0.5080 0.4920 0.42 0.1628 0.6628 0.3372 0.82 0.2939 0.7939 0.2061 1.22 0.3888 0.8888 0.1112 1.62 0.4474 0.9474 0.0526
0.03 0.0120 0.5120 0.4880 0.43 0.1664 0.6664 0.3336 0.83 0.2967 0.7967 0.2033 1.23 0.3907 0.8907 0.1093 1.63 0.4484 0.9484 0.0516
0.04 0.0160 0.5160 0.4840 0.44 0.1700 0.6700 0.3300 0.84 0.2995 0.7995 0.2005 1.24 0.3925 0.8925 0.1075 1.64 0.4495 0.9495 0.0505
0.05 0.0199 0.5199 0.4801 0.45 0.1736 0.6736 0.3264 0.85 0.3023 0.8023 0.1977 1.25 0.3944 0.8944 0.1056 1.65 0.4505 0.9505 0.0495
0.06 0.0239 0.5239 0.4761 0.46 0.1772 0.6772 0.3228 0.86 0.3051 0.8051 0.1949 1.26 0.3962 0.8962 0.1038 1.66 0.4515 0.9515 0.0485
0.07 0.0279 0.5279 0.4721 0.47 0.1808 0.6808 0.3192 0.87 0.3078 0.8078 0.1922 1.27 0.3980 0.8980 0.1020 1.67 0.4525 0.9525 0.0475
0.08 0.0319 0.5319 0.4681 0.48 0.1844 0.6844 0.3156 0.88 0.3106 0.8106 0.1894 1.28 0.3997 0.8997 0.1003 1.68 0.4535 0.9535 0.0465
0.09 0.0359 0.5359 0.4641 0.49 0.1879 0.6879 0.3121 0.89 0.3133 0.8133 0.1867 1.29 0.4015 0.9015 0.0985 1.69 0.4545 0.9545 0.0455
0.10 0.0398 0.5398 0.4602 0.50 0.1915 0.6915 0.3085 0.90 0.3159 0.8159 0.1841 1.30 0.4032 0.9032 0.0968 1.70 0.4554 0.9554 0.0446
0.11 0.0438 0.5438 0.4562 0.51 0.1950 0.6950 0.3050 0.91 0.3186 0.8186 0.1814 1.31 0.4049 0.9049 0.0951 1.71 0.4564 0.9564 0.0436
0.12 0.0478 0.5478 0.4522 0.52 0.1985 0.6985 0.3015 0.92 0.3212 0.8212 0.1788 1.32 0.4066 0.9066 0.0934 1.72 0.4573 0.9573 0.0427
0.13 0.0517 0.5517 0.4483 0.53 0.2019 0.7019 0.2981 0.93 0.3238 0.8238 0.1762 1.33 0.4082 0.9082 0.0918 1.73 0.4582 0.9582 0.0418
0.14 0.0557 0.5557 0.4443 0.54 0.2054 0.7054 0.2946 0.94 0.3264 0.8264 0.1736 1.34 0.4099 0.9099 0.0901 1.74 0.4591 0.9591 0.0409
0.15 0.0596 0.5596 0.4404 0.55 0.2088 0.7088 0.2912 0.95 0.3289 0.8289 0.1711 1.35 0.4115 0.9115 0.0885 1.75 0.4599 0.9599 0.0401
0.16 0.0636 0.5636 0.4364 0.56 0.2123 0.7123 0.2877 0.96 0.3315 0.8315 0.1685 1.36 0.4131 0.9131 0.0869 1.76 0.4608 0.9608 0.0392
0.17 0.0675 0.5675 0.4325 0.57 0.2157 0.7157 0.2843 0.97 0.3340 0.8340 0.1660 1.37 0.4147 0.9147 0.0853 1.77 0.4616 0.9616 0.0384
0.18 0.0714 0.5714 0.4286 0.58 0.2190 0.7190 0.2810 0.98 0.3365 0.8365 0.1635 1.38 0.4162 0.9162 0.0838 1.78 0.4625 0.9625 0.0375
0.19 0.0753 0.5753 0.4247 0.59 0.2224 0.7224 0.2776 0.99 0.3389 0.8389 0.1611 1.39 0.4177 0.9177 0.0823 1.79 0.4633 0.9633 0.0367
0.20 0.0793 0.5793 0.4207 0.60 0.2257 0.7257 0.2743 1.00 0.3413 0.8413 0.1587 1.40 0.4192 0.9192 0.0808 1.80 0.4641 0.9641 0.0359
0.21 0.0832 0.5832 0.4168 0.61 0.2291 0.7291 0.2709 1.01 0.3438 0.8438 0.1562 1.41 0.4207 0.9207 0.0793 1.81 0.4649 0.9649 0.0351
0.22 0.0871 0.5871 0.4129 0.62 0.2324 0.7324 0.2676 1.02 0.3461 0.8461 0.1539 1.42 0.4222 0.9222 0.0778 1.82 0.4656 0.9656 0.0344
0.23 0.0910 0.5910 0.4090 0.63 0.2357 0.7357 0.2643 1.03 0.3485 0.8485 0.1515 1.43 0.4236 0.9236 0.0764 1.83 0.4664 0.9664 0.0336
0.24 0.0948 0.5948 0.4052 0.64 0.2389 0.7389 0.2611 1.04 0.3508 0.8508 0.1492 1.44 0.4251 0.9251 0.0749 1.84 0.4671 0.9671 0.0329
0.25 0.0987 0.5987 0.4013 0.65 0.2422 0.7422 0.2578 1.05 0.3531 0.8531 0.1469 1.45 0.4265 0.9265 0.0735 1.85 0.4678 0.9678 0.0322
0.26 0.1026 0.6026 0.3974 0.66 0.2454 0.7454 0.2546 1.06 0.3554 0.8554 0.1446 1.46 0.4279 0.9279 0.0721 1.86 0.4686 0.9686 0.0314
0.27 0.1064 0.6064 0.3936 0.67 0.2486 0.7486 0.2514 1.07 0.3577 0.8577 0.1423 1.47 0.4292 0.9292 0.0708 1.87 0.4693 0.9693 0.0307
0.28 0.1103 0.6103 0.3897 0.68 0.2517 0.7517 0.2483 1.08 0.3599 0.8599 0.1401 1.48 0.4306 0.9306 0.0694 1.88 0.4699 0.9699 0.0301
0.29 0.1141 0.6141 0.3859 0.69 0.2549 0.7549 0.2451 1.09 0.3621 0.8621 0.1379 1.49 0.4319 0.9319 0.0681 1.89 0.4706 0.9706 0.0294
0.30 0.1179 0.6179 0.3821 0.70 0.2580 0.7580 0.2420 1.10 0.3643 0.8643 0.1357 1.50 0.4332 0.9332 0.0668 1.90 0.4713 0.9713 0.0287
0.31 0.1217 0.6217 0.3783 0.71 0.2611 0.7611 0.2389 1.11 0.3665 0.8665 0.1335 1.51 0.4345 0.9345 0.0655 1.91 0.4719 0.9719 0.0281
0.32 0.1255 0.6255 0.3745 0.72 0.2642 0.7642 0.2358 1.12 0.3686 0.8686 0.1314 1.52 0.4357 0.9357 0.0643 1.92 0.4726 0.9726 0.0274
0.33 0.1293 0.6293 0.3707 0.73 0.2673 0.7673 0.2327 1.13 0.3708 0.8708 0.1292 1.53 0.4370 0.9370 0.0630 1.93 0.4732 0.9732 0.0268
0.34 0.1331 0.6331 0.3669 0.74 0.2704 0.7704 0.2296 1.14 0.3729 0.8729 0.1271 1.54 0.4382 0.9382 0.0618 1.94 0.4738 0.9738 0.0262
0.35 0.1368 0.6368 0.3632 0.75 0.2734 0.7734 0.2266 1.15 0.3749 0.8749 0.1251 1.55 0.4394 0.9394 0.0606 1.95 0.4744 0.9744 0.0256
0.36 0.1406 0.6406 0.3594 0.76 0.2764 0.7764 0.2236 1.16 0.3770 0.8770 0.1230 1.56 0.4406 0.9406 0.0594 1.96 0.4750 0.9750 0.0250
0.37 0.1443 0.6443 0.3557 0.77 0.2794 0.7794 0.2206 1.17 0.3790 0.8790 0.1210 1.57 0.4418 0.9418 0.0582 1.97 0.4756 0.9756 0.0244
0.38 0.1480 0.6480 0.3520 0.78 0.2823 0.7823 0.2177 1.18 0.3810 0.8810 0.1190 1.58 0.4429 0.9429 0.0571 1.98 0.4761 0.9761 0.0239
0.39 0.1517 0.6517 0.3483 0.79 0.2852 0.7852 0.2148 1.19 0.3830 0.8830 0.1170 1.59 0.4441 0.9441 0.0559 1.99 0.4767 0.9767 0.0233
Columns (repeated up to four times across each row): z, Mean to z, Larger portion, Smaller portion
2.00 0.4772 0.9772 0.0228 2.43 0.4925 0.9925 0.0075 2.86 0.4979 0.9979 0.0021 3.29 0.4995 0.9995 0.0005
2.01 0.4778 0.9778 0.0222 2.44 0.4927 0.9927 0.0073 2.87 0.4979 0.9979 0.0021 3.30 0.4995 0.9995 0.0005
2.02 0.4783 0.9783 0.0217 2.45 0.4929 0.9929 0.0071 2.88 0.4980 0.9980 0.0020 3.31 0.4995 0.9995 0.0005
2.03 0.4788 0.9788 0.0212 2.46 0.4931 0.9931 0.0069 2.89 0.4981 0.9981 0.0019 3.32 0.4995 0.9995 0.0005
2.04 0.4793 0.9793 0.0207 2.47 0.4932 0.9932 0.0068 2.90 0.4981 0.9981 0.0019 3.33 0.4996 0.9996 0.0004
2.05 0.4798 0.9798 0.0202 2.48 0.4934 0.9934 0.0066 2.91 0.4982 0.9982 0.0018 3.34 0.4996 0.9996 0.0004
2.06 0.4803 0.9803 0.0197 2.49 0.4936 0.9936 0.0064 2.92 0.4982 0.9982 0.0018 3.35 0.4996 0.9996 0.0004
2.07 0.4808 0.9808 0.0192 2.50 0.4938 0.9938 0.0062 2.93 0.4983 0.9983 0.0017 3.36 0.4996 0.9996 0.0004
2.08 0.4812 0.9812 0.0188 2.51 0.4940 0.9940 0.0060 2.94 0.4984 0.9984 0.0016 3.37 0.4996 0.9996 0.0004
2.09 0.4817 0.9817 0.0183 2.52 0.4941 0.9941 0.0059 2.95 0.4984 0.9984 0.0016 3.38 0.4996 0.9996 0.0004
2.10 0.4821 0.9821 0.0179 2.53 0.4943 0.9943 0.0057 2.96 0.4985 0.9985 0.0015 3.39 0.4997 0.9997 0.0003
2.11 0.4826 0.9826 0.0174 2.54 0.4945 0.9945 0.0055 2.97 0.4985 0.9985 0.0015 3.40 0.4997 0.9997 0.0003
2.12 0.4830 0.9830 0.0170 2.55 0.4946 0.9946 0.0054 2.98 0.4986 0.9986 0.0014 3.41 0.4997 0.9997 0.0003
2.13 0.4834 0.9834 0.0166 2.56 0.4948 0.9948 0.0052 2.99 0.4986 0.9986 0.0014 3.42 0.4997 0.9997 0.0003
2.14 0.4838 0.9838 0.0162 2.57 0.4949 0.9949 0.0051 3.00 0.4987 0.9987 0.0013 3.43 0.4997 0.9997 0.0003
2.15 0.4842 0.9842 0.0158 2.58 0.4951 0.9951 0.0049 3.01 0.4987 0.9987 0.0013 3.44 0.4997 0.9997 0.0003
2.16 0.4846 0.9846 0.0154 2.59 0.4952 0.9952 0.0048 3.02 0.4987 0.9987 0.0013 3.45 0.4997 0.9997 0.0003
2.17 0.4850 0.9850 0.0150 2.60 0.4953 0.9953 0.0047 3.03 0.4988 0.9988 0.0012 3.46 0.4997 0.9997 0.0003
2.18 0.4854 0.9854 0.0146 2.61 0.4955 0.9955 0.0045 3.04 0.4988 0.9988 0.0012 3.47 0.4997 0.9997 0.0003
2.19 0.4857 0.9857 0.0143 2.62 0.4956 0.9956 0.0044 3.05 0.4989 0.9989 0.0011 3.48 0.4997 0.9997 0.0003
2.20 0.4861 0.9861 0.0139 2.63 0.4957 0.9957 0.0043 3.06 0.4989 0.9989 0.0011 3.49 0.4998 0.9998 0.0002
2.21 0.4864 0.9864 0.0136 2.64 0.4959 0.9959 0.0041 3.07 0.4989 0.9989 0.0011 3.50 0.4998 0.9998 0.0002
2.22 0.4868 0.9868 0.0132 2.65 0.4960 0.9960 0.0040 3.08 0.4990 0.9990 0.0010 3.51 0.4998 0.9998 0.0002
2.23 0.4871 0.9871 0.0129 2.66 0.4961 0.9961 0.0039 3.09 0.4990 0.9990 0.0010 3.52 0.4998 0.9998 0.0002
2.24 0.4875 0.9875 0.0125 2.67 0.4962 0.9962 0.0038 3.10 0.4990 0.9990 0.0010 3.53 0.4998 0.9998 0.0002
2.25 0.4878 0.9878 0.0122 2.68 0.4963 0.9963 0.0037 3.11 0.4991 0.9991 0.0009 3.54 0.4998 0.9998 0.0002
2.26 0.4881 0.9881 0.0119 2.69 0.4964 0.9964 0.0036 3.12 0.4991 0.9991 0.0009 3.55 0.4998 0.9998 0.0002
2.27 0.4884 0.9884 0.0116 2.70 0.4965 0.9965 0.0035 3.13 0.4991 0.9991 0.0009 3.56 0.4998 0.9998 0.0002
2.28 0.4887 0.9887 0.0113 2.71 0.4966 0.9966 0.0034 3.14 0.4992 0.9992 0.0008 3.57 0.4998 0.9998 0.0002
2.29 0.4890 0.9890 0.0110 2.72 0.4967 0.9967 0.0033 3.15 0.4992 0.9992 0.0008 3.58 0.4998 0.9998 0.0002
2.30 0.4893 0.9893 0.0107 2.73 0.4968 0.9968 0.0032 3.16 0.4992 0.9992 0.0008 3.59 0.4998 0.9998 0.0002
2.31 0.4896 0.9896 0.0104 2.74 0.4969 0.9969 0.0031 3.17 0.4992 0.9992 0.0008 3.60 0.4998 0.9998 0.0002
2.32 0.4898 0.9898 0.0102 2.75 0.4970 0.9970 0.0030 3.18 0.4993 0.9993 0.0007 3.65 0.4999 0.9999 0.0001
2.33 0.4901 0.9901 0.0099 2.76 0.4971 0.9971 0.0029 3.19 0.4993 0.9993 0.0007 3.70 0.4999 0.9999 0.0001
2.34 0.4904 0.9904 0.0096 2.77 0.4972 0.9972 0.0028 3.20 0.4993 0.9993 0.0007 3.75 0.4999 0.9999 0.0001
2.35 0.4906 0.9906 0.0094 2.78 0.4973 0.9973 0.0027 3.21 0.4993 0.9993 0.0007 3.80 0.4999 0.9999 0.0001
2.36 0.4909 0.9909 0.0091 2.79 0.4974 0.9974 0.0026 3.22 0.4994 0.9994 0.0006 3.85 0.4999 0.9999 0.0001
2.37 0.4911 0.9911 0.0089 2.80 0.4974 0.9974 0.0026 3.23 0.4994 0.9994 0.0006 3.90 0.5000 1.0000 0.0000
2.38 0.4913 0.9913 0.0087 2.81 0.4975 0.9975 0.0025 3.24 0.4994 0.9994 0.0006 3.95 0.5000 1.0000 0.0000
2.39 0.4916 0.9916 0.0084 2.82 0.4976 0.9976 0.0024 3.25 0.4994 0.9994 0.0006 4.00 0.5000 1.0000 0.0000
2.40 0.4918 0.9918 0.0082 2.83 0.4977 0.9977 0.0023 3.26 0.4994 0.9994 0.0006
2.41 0.4920 0.9920 0.0080 2.84 0.4977 0.9977 0.0023 3.27 0.4995 0.9995 0.0005
2.42 0.4922 0.9922 0.0078 2.85 0.4978 0.9978 0.0022 3.28 0.4995 0.9995 0.0005
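The three portions in each row of the table follow from the standard normal cumulative distribution. A minimal sketch, assuming only Python's `math.erf`; the helper name `z_table_row` is made up for this example.

```python
import math

def z_table_row(z):
    """Return (mean to z, larger portion, smaller portion) for a non-negative z."""
    mean_to_z = 0.5 * math.erf(z / math.sqrt(2))  # area between the mean and z
    larger = 0.5 + mean_to_z                      # area below z
    smaller = 0.5 - mean_to_z                     # area beyond z
    return tuple(round(area, 4) for area in (mean_to_z, larger, smaller))

print(z_table_row(1.00))
print(z_table_row(1.96))
```

For any z, the larger and smaller portions add up to 1, and the larger portion is always 0.5 plus the "mean to z" area.
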
PYC3704 (PYC304C)

May/June 2012

# Question Ans Page Comments


Question 1
The term 'inference' in psychological research refers to ______
1. describing information in a precise way
2. making a prediction or generalization based on existing information
3. the procedures for making a construct visible so that a measurement can be made
4. the development of a hypothesis as a relationship among variables
Ans: 2
Page: P2
Comments: An inference is a conclusion that follows from existing information, by generalising from the specific information to the general type of phenomenon, where the conclusion is not absolutely certain. So in summary inferential statistics are techniques for making generalisations based on imperfect numeric data, where the conclusions have a high probability of being true, but you can never be completely certain.

Question 2
In psychological research, a construct may be a(n) ______
1. measurement based on the careful observation of aspects of humans or human behaviour
2. observation of an aspect of humans or human behaviour which was operationalised in some way
3. hypothetical aspect of humans or human behaviour which we wish to investigate
4. explanation of empirical observations based on the measurement of certain variables
Ans: 3
Page: P4
Comments: Constructs and their interrelations (how they affect each other, their patterns of interaction) are used in this way to develop theoretical explanations of why people behave in certain ways in certain contexts, or why mental phenomena appear to be as they are.

Question 3
Which of the options below provides the best description of the main purpose of quantitative research in psychology? Its purpose is to ______
1. develop theories that explain the relationships among observed aspects of human behaviour and mental processes
2. develop predictions about human behaviour which can be applied with absolute certainty
3. describe and classify aspects of humans and human behaviour
4. develop hypotheses about relationships that may exist among various constructs
Ans: 4
Page: P1-2, P2, P3, P4, P6, P21-26
Comments:
P1-2: Psychology is a discipline that endeavours to collect information and develop theories about human behaviour and mental processes. The aim is to establish facts that are related to psychological phenomena, that are valid and can be justified on scientific grounds.

P2: The act of simply observing phenomena and describing them or collecting facts about them is usually not sufficient. The next step in the scientific process is to go beyond the level of description by attempting to develop explanations for the things we observe: we want to know not only what the facts are, but also why they appear to be as they are. In other words, we want to develop theories, which explain why things are as they appear to be when we observe them.

P3: Psychologists try to develop explanations for human experiences and behaviour. To do this, they often have to make use of abstract concepts (also called constructs) that serve as explanations for the behaviour they observe.

P4: Psychologists are interested to find out which constructs are important (in the sense of being required or useful to explain human behaviour) and how they work together in a pattern, or what their interrelationships are. One of the objectives of psychology is not only to describe human behaviour, but also to find explanations for it. Constructs and how they interact fill the role of explanatory mechanisms in psychology. We try to find out which constructs offer an appropriate explanation of the behaviour or events we perceive, and what the pattern of their interactions with other constructs may be. In this sense, it can be said that constructs are the building blocks of theory.

P6: The link between observing a construct and measuring it is so close that when we talk about 'observation' in quantitative research, we often imply the process of measurement. The taking of a measurement is regarded as an act of observation.

P21-26, Q2: Research in psychology is primarily about testing theories of human behaviour.

Q3: The main purpose of psychological research is to test theories empirically.


Question 4
Operationalising a construct means to ______
1. find an explanation for the construct to explain why it appears as it is
2. make an educated guess on how it relates to other constructs
3. determine the correct level at which it should be measured
4. devise a systematic procedure to make the construct observable, in such a way that we can measure it
Ans: 4
Page: P25
Comments: 'Operationalisation' is where you make the construct (which is usually an abstract concept, so it is difficult to observe it clearly) visible by finding some suitable way to measure it. You need it to be able to test a hypothesis, but it is not in itself 'the process of forming an hypothesis'.

The primary aim of operationalisation is to describe a construct clearly and unambiguously so that it can be measured and tested in a research study.

Question 5
Empirical knowledge is knowledge that is based on ______
1. careful reasoning
2. appropriate theories
3. the observation of events
4. published research
Ans: 3
Page: P2
Comments: All scientific knowledge begins with description of the phenomena being studied, based on careful observation. Knowledge based on observation of physical events is referred to as empirical knowledge (as distinct from knowledge based on contemplation, unexplained insights, mystical experiences or claims by authority figures).

Use the following extract from a research proposal to answer Questions 6 to 8

“Generalised anxiety disorder (GAD) refers to a pattern of almost constant worry or tension, even when there is little or no apparent cause. Both genetic predisposition and stressors in the life of a particular patient is believed to contribute to this condition. The research will investigate whether the level of anxiety of persons diagnosed with GAD is actually reduced by psychotherapy. It is expected that patients receiving therapy will score lower on the Manifest Anxiety Scale than patients not receiving therapy.”

Question 6
“Both genetic predisposition and stressors in the life of a particular patient is believed to contribute to this condition” is ______
1. the research hypothesis
2. a theory about the causes of GAD
3. a postulated relation between two constructs
4. a description of the constructs in terms of which GAD can be observed
Ans: 2
Page: P4, P15, P18-19, P21-26
Comments: A theory is a well-established principle that has been developed to explain some aspect of the natural world. A theory arises from repeated observation and testing and incorporates facts, laws, predictions, and tested hypotheses that are widely accepted. In science, a theory is a framework for facts. It is some kind of description that tells you how the facts are connected, and why the facts are as they are (where the word 'facts' refers to things or events that were observed and described in a careful way). A theory is a network of relations among facts that were proposed to be true and explanations for observed phenomena in terms of constructs.

Constructs and their interrelations (how they affect each other, their patterns of interaction) are used in this way to develop theoretical explanations of why people behave in certain ways in certain contexts, or why mental phenomena appear to be as they are.

A hypothesis is a specific, testable prediction about what you expect to happen in your study. A hypothesis can be informally described as an educated guess. As we indicated above, research usually tries to establish relationships among constructs in order to develop a theory or to test an existing theory. Usually, the theory makes it possible for us to make some kind of prediction of how constructs should be interrelated. We formulate this relationship as an hypothesis, and we test the hypothesis (using statistical methods) to see if the prediction is true. If it is not true, there is something wrong with the theory, and we need to reconsider it.

Question 7
“Whether the level of anxiety of persons diagnosed with GAD is actually reduced by psychotherapy” describes ______
1. an observed relation between two variables
2. a theoretical prediction about the effect of psychotherapy
3. the operationalisation of the construct 'anxiety'
4. the hypothesis to be investigated
Ans: 4
Page: P4, P15, P18-19, P21-26
Comments: See comments above

Question 8
The dependent variable is ______ and the independent variable is ______
1. whether or not psychotherapy is received, the level of anxiety experienced by patients
2. the effectiveness of psychotherapy, the level of anxiety
3. the level of anxiety experienced by patients, whether or not psychotherapy is received
4. the anxiety score as measured on the Manifest Anxiety Scale, the presence of stressors in the life of the patient
Ans: 3
Page: P8-9, P24
Comments: The dependent variable is the one that is predicted or explained, and the independent variable is manipulated to see how it affects the dependent variable.

The independent variable is that variable which affects the dependent variable; or, conversely, the dependent variable depends on the independent variable.

When a researcher focuses on the interaction of only two variables at a time, the dependent variable is usually the one that the researcher is interested in, the variable that is the focus of the research. The independent variable is something that the researcher manipulates, to see how this affects the dependent variable (in other words, the dependent variable is dependent on the independent variable).

Hidden variables are effects on the dependent variable that we may be unaware of, or that we choose to ignore. Very often the events or behaviour that we observed are the consequence of many interacting factors, and we have to analyse the situation carefully to try and identify as many things as possible that may interfere with our ability to find a clear relationship between a dependent variable and some specific independent variable.

One of these hidden effects that researchers in psychology often have to contend with is that people change their behaviour when they realise that someone is paying extra attention to them (usually referred to as the 'Hawthorne effect').

Question 9
“The mental age of child number one is eight years”. In this statement “mental age” is a(n) ______, whereas “eight years” is a(n) ______
1. variable, specific value of that variable
2. construct, variable
3. independent variable, dependent variable
4. hidden variable, descriptive statistic
Ans: 2
Page: P7
Comments: A construct that has been measured in some way produces a variable. A variable refers to a number that can take on any one of a range of possible values. They can be discrete (when only whole numbers like 1, 2, 3 are allowed) or continuous (what mathematicians refer to as 'real numbers'). In some cases variables also take on values smaller than zero to produce negative numbers.

So the (visible) variable reflects the intensity of the underlying (invisible) construct, in terms of how it was measured. We say that the variable is manifest (it is visible in the sense that we can observe it) and the construct is latent (it is invisible in the sense that we need some way to make it appear). So the latent construct is made manifest by the use of an appropriate measurement procedure.

The dependent variable is the one that is predicted or explained, and the independent variable is manipulated to see how it affects the dependent variable.

Question 10
A researcher would use a ______ to make a(n) ______ about the nature of the ______
1. sample, inference, population
2. sample, hypothesis, population
3. variable, prediction, construct
4. population, inference, sample
Ans: 1
Page: P11
Comments: The entire collection of cases that you are interested in when you make your measurements for a particular construct is referred to as the population. The population depends on which people or objects or events you are interested in studying.

Because populations can be very large, and we rarely have access to them, we would draw a sample of observations from the population and use that sample to infer certain things about the population's characteristics. The most appropriate sample is usually a simple random sample, where each individual has the same chance of being included. If our samples are not random, they may lack external validity: it may not be possible to generalise beyond the group from which we drew the sample.

Question 11
A measurement that summarises an aspect of a population is called a ______ while a measurement that describes the same aspect of a sample is called ______
1. construct, variable
2. parameter, statistic
3. statistic, parameter
4. variable, construct
Ans: 2
Page: P14, P23
Comments: A statistic is a measured characteristic of a sample. A test statistic is the quantity you calculate (often by making use of sample statistics) to test a statistical hypothesis. When we refer to these test quantities, we always use the name in full ('test statistic'). When we use the term 'statistic' on its own, it refers to a descriptive statistic that describes an aspect of the sample data.

Parameters are values that summarise aspects of population data. While the word 'parameters' does refer to descriptive statistics, it does not refer to all descriptive statistics. It is used only for those descriptive statistics that relate to the population, not to those that describe aspects of the sample.
Question 12
A ______ is a speculative statement about the relationship among ______, based on observations or expectations
1. theory, constructs
2. hypothesis, statistics
3. theory, variables
4. hypothesis, constructs
Ans: 4
Page: P1, P18-19
Comments: A research hypothesis is formed as a clear statement in terms of a relationship among the constructs (and the variables by which they are measured). It is a statement about a possible relationship among constructs that may explain some set of observations that one intends to investigate.

Constructs: concepts that act as explanations for phenomena, events and behaviour and are abstracted from observations.

Theories: a theory is a frame of reference for facts that attempts to account for why things are as they are; a claim about how constructs are related to produce phenomena, which has been validated by research.

Question 13
A class of 10 boys and 11 girls, including Mary and her friend Elizabeth, chooses a class representative by writing their names on slips of paper, putting these into a box and asking their teacher to draw one name blindly. What is the probability that either Mary or Elizabeth will be selected?
1. 1/11
2. 1/21
3. 2/21
4. 2/11
Ans: 3
Page: P29
Comments: Number of possible outcomes = total number of children = 21
Number of favourable events = either Mary or Elizabeth = 2

p(E) = (number of favourable events) / (number of possible outcomes) = 2/21

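The calculation for Question 13 can be checked with exact fractions. A minimal sketch using Python's `fractions` module; the variable names are chosen for this example only.

```python
from fractions import Fraction

total_children = 10 + 11      # every name in the box is a possible outcome
favourable = 2                # Mary or Elizabeth
p = Fraction(favourable, total_children)

print(p, round(float(p), 3))
```
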
Question 14
A college student claims that she can identify three different types of cheese by taste. An experiment is set up to test her ability. She is blindfolded and given three pieces of cheese, each representing a different brand. What is the probability that she will correctly identify TWO particular pieces of cheese by chance?
1. 0.11
2. 0.16
3. 0.33
4. 0.67
Ans: 2
Page: P35-36
Comments: The multiplicative rule states that p(A and B) = p(A) x p(B), where A and B are independent events. This rule is used to determine the product of two or more probabilities and is indicated by the word 'and' (i.e. the probability of A and B).

Identifying 2 cheeses requires two favourable events and therefore two calculations under the multiplicative rule. First she needs to identify 1 correct cheese from 3 different types, then she needs to identify the 2nd correct cheese from the remaining 2 types.

p(E) = (number of favourable events / number of possible outcomes) x (number of favourable events / number of possible outcomes)
= 1/3 x 1/2 = 1/6 = 0.167

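The 1/6 result for Question 14 can be cross-checked by counting. The sketch below assumes the student guesses at random and uses each of the three labels exactly once; that assumption is made for this example and is not stated in the exam paper.

```python
from fractions import Fraction
from itertools import permutations

# Multiplicative rule: first piece right (1 of 3), then second right (1 of the remaining 2)
p_rule = Fraction(1, 3) * Fraction(1, 2)

# Cross-check: list every way of assigning the 3 labels to the 3 pieces
assignments = list(permutations(range(3)))
hits = [a for a in assignments if a[0] == 0 and a[1] == 1]  # pieces 0 and 1 named correctly
p_count = Fraction(len(hits), len(assignments))

print(p_rule, p_count, round(float(p_rule), 3))
```

Both routes give the same fraction, which rounds to 0.167.
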
Question 15
Which statement best represents an application of the law of large numbers? If I flip a coin 1000 times, it will fall heads-up ______ 500 times
1. approximately
2. exactly
3. at least
4. either much more or much less than
Ans: 1
Page: P31-32
Comments: The principle is called the law of large numbers, and it states the following: If an experiment is done repeatedly, and if the outcomes are independent of one another, the observed proportion of favourable occurrences of an event will eventually approach its theoretical probability.

What the law states is that a probability value should be seen as a theoretical limit on which the relative occurrence of an event (outcome) can be expected to converge over time in the long run. For example, in the above coin-flipping example, the probability of the coin coming up heads or tails on any flip is not influenced by the result of the previous flip. Each flip is independent of the other, and the theoretical probability of heads coming up remains the same, that is, p(heads) = 1/2 = 0.5.

In terms of the law of large numbers, we can make the following prediction: If we flip the coin repeatedly, even though we do not know whether heads or tails will come up on any particular flip, the actual proportion of heads will eventually get close to 0.5. Thus, as the experiment gets repeated over and over, the relative frequency or proportion of heads will approximate the theoretical probability of 0.5.

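The law of large numbers can be illustrated with a small simulation. This is only a sketch: it assumes a fair coin modelled with Python's pseudo-random generator, and the function name `proportion_heads` and the fixed seed are choices made for this example.

```python
import random

def proportion_heads(flips, seed=0):
    """Simulate fair coin flips and return the observed proportion of heads."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(flips))
    return heads / flips

for flips in (10, 100, 1000, 100_000):
    print(flips, proportion_heads(flips))
```

With few flips the proportion can be far from 0.5, and with many flips it tends to settle close to 0.5, although it is rarely exactly 0.5.
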
Question 16
The expression "0.05 < p ≤ 0.10" should be interpreted as a probability value ______
1. smaller than 0.05 and larger or equal to 0.10
2. halfway between 0.05 and 0.10
3. larger than 0.05 and smaller or equal to 0.10
4. smaller than 0.05 and equal to 0.10
Ans: 3
Page: P33-34
Comments: Because probabilities fall in a range from 0.0 to 1.0 when expressed decimally, a probability can never be higher than 1 or lower than 0. The general rule is written symbolically as follows: 0 ≤ p ≤ 1. Note that a probability can be 0, but to say that a probability is 0 is actually the same as saying that the event is impossible and can never happen. Likewise, to say that the probability of an event is 1 is to assert that it is an absolute certainty. In actual practice, probabilities fall within these two extremes.

You will typically encounter reference to probabilities in expressions such as "p > 0.05". This statement is interpreted as "the probability value is higher than 0.05".

17 Suppose that over the years 10 000 students wrote the 4 P35-36 Part 1:
examinations in PYC 3704-C and that 6000 of them p(E) = Number of favourable events / Number of possible outcomes
passed, of which 300 obtained exactly 50%. This = 300 / 10000 = 3 / 100 = 0.03
means that for randomly selected students the
probability of obtaining exactly 50% is ______ while the Part 2:
probability of obtaining 50% or more is ______ p(E) = Number of favourable events / Number of possible outcomes
= 6000 / 10000 = 6 / 10 = 0.6
1. 0.60, 0.03
2. 0.05, 0.60
3. 0.60, 0.03
4. 0.03, 0.60
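The two relative frequencies in Question 17 can be checked in a few lines. The variable names are chosen for this illustration only; the counts come from the question itself.

```python
total_students = 10000  # possible outcomes
passed = 6000           # students who obtained 50% or more
exactly_50 = 300        # students who obtained exactly 50%

# p(E) = number of favourable events / number of possible outcomes
p_exactly_50 = exactly_50 / total_students
p_50_or_more = passed / total_students

print(p_exactly_50, p_50_or_more)
```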
18 During the interpretation of psychological 2 P50-51 Many of the scores that we use are also clustered around the average, and tail
measurements the normal distribution is often ______ off to the ends of the distribution. Because it can be used to describe the
distribution of many naturally or 'normally' occurring continuous variables, this
1. adapted to fit the observed frequency distribution of type of symmetrical probability distribution is called a normal distribution. It is
scores also commonly referred to as the normal curve, because the distribution can
2. used as a theoretical model for interpreting the be plotted by a bell-shaped curve.
observed distribution of scores
3. used to calculate the relative frequency of observed The definition of the standard normal distribution (see section 2.3.3 on p52-53)
scores is that it has a mean (μ) of 0 and a standard deviation (σ) of 1.
4. used to derive the mean and standard deviation of a
sample

19 The scale along the x-axis of the standard normal 3 P52-53 Statisticians have derived a rather complicated-looking equation (or formula)
distribution indicates ______ which describes the normal curve, and have shown that it contains only two
variables, the mean (μ) and the standard deviation (σ), with the rest of its
1. probabilities terms being constants. The formula produces distributions that are all bell-
2. the mean of the distribution shaped, but the actual shape of the curve - how high it is or how spread out it
3. the number of standard deviations below and above is - depends only on the mean and the standard deviation of the distribution
the mean concerned.
4. the p-values

20 The mean and standard deviation of a set of test 3 P55 z = (x - μ) / σ = (14 - 20) / 8 = -6 / 8 = -0.75
scores are 20 and 8 respectively. What is the z-score
corresponding to a test score of 14?
Where:
1. 1.33
2. 0.75
x represents the variable (test score),
3. -0.75 μ is the population mean,
4. -1.33 σ the standard deviation of the population from which x was obtained.
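The z-score calculation for Question 20 can be written as a one-line function. The name `z_score` is an assumption for this sketch, not something taken from the study guide.

```python
def z_score(x, mu, sigma):
    """Return the number of standard deviations that x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# Test score of 14 in a distribution with mean 20 and standard deviation 8.
print(z_score(14, 20, 8))
```

A negative result means the score lies below the mean, which rules out options 1 and 2.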
21 Suppose the height of military recruits is distributed 2 P61-62 The standard error is an extremely valuable measure because we can use it to
normally with a mean of 1750 mm and a standard estimate how well a sample mean approximates its population mean in
deviation of 50 mm. Drawing repeated samples of 25 P109 general, that is, how much error you can expect on average between the
recruits each, we expect the standard deviation of the sample mean (ẋ) that you calculated from your sample and the population
sample means to be about ______ mm mean (μ) that you are trying to estimate.

1. 2 In other words, it is an indication of the size of the error that you make by
2. 10 using a sample of a particular size (n) to determine the population mean. This
3. 50 amount of error will decrease as the size of the sample increases.
4. 25
σ = σ/√n = 50/√25 = 50/5 = 10
22 Which of the following statements about population 4 P23 Parameters are values that summarise aspects of population data
parameters is the most accurate? While the word 'parameters' does refer to descriptive statistics, it does not
refer to all descriptive statistics. It is used only for those descriptive statistics
1. They are essential for making statements about that relate to the population, not to those that describe aspects of the sample.
probability distributions
2. They are always unknown but appropriate values P13 Population parameters are rarely known (usually unknown), since the only
can be estimated prior to sampling P65 Q10 way to determine them would be to collect the relevant data from the entire
3. They are essential, but cannot be estimated from population. Population parameters are usually unknown and have to be
sample information inferred from sample data. Since population parameters are unknown, they
4. They are always required prior to sampling because cannot be essential to make statements about probability. Option 1 is,
they are needed to calculate the sample statistics therefore, incorrect. Option 3 is also incorrect because it incorrectly states that
population parameters cannot be estimated from sampling information, but the
whole process of statistical inference is actually concerned with inferring
information about a population from sample data.

We use the sample to represent the population, and do our calculations on the
P60 sample data, but ultimately we want to determine the situation in the
population. To do this, we often have to estimate the (population) parameters
by using the (sample) statistics. A researcher seldom knows the values of the
population parameters, but the values of sample statistics can be calculated
by means of clearly formulated mathematical procedures and these can be
used as estimates of the parameters of the corresponding population.

Normally, we'll not know what our true population parameter is, and we would
have calculated the mean from only a single sample - but we can still apply
the basic principle: that our sample mean will be a reliable estimate of our
population mean.
23 What is the principal advantage of z scores? They 3 P53 This curve has a mean of µ = 0 and a standard deviation of σ = 1 and is
enable one to ______ known as the standard normal distribution, and is by convention indicated with
the letter 'z' (so it is also referred to as the z-distribution). The measures on
1. determine whether scores are normally distributed this distribution are referred to as standard scores or z-scores.
around the mean
2. transform a person's scores on tests with different
means and the same standard deviations into
comparable percentages
3. compare a person's scores on tests with different
means and standard deviations
4. determine frequency distributions for tests with
different means

FIGURE 2.7: The standard normal distribution

While its major use is in calculating probabilities, transforming a score from a


P55 normal distribution to its associated z-score has an additional benefit.
Transforming a set of measurements, each with a different mean and a
different standard deviation, into a z-score can be used to compare an
individual across different distributions. After transformation, all the scores
will fall on a common standard normal distribution with a mean of 0 and a
standard deviation of 1, which makes it possible to compare them directly.

Theorem 1 The 68-95-99.7 Rule: In every normal distribution with mean µ


and standard deviation σ, approximately 68% of the data falls within one
standard deviation of the mean. Approximately 95% of the data falls within two
standard deviations of the mean. And finally, approximately 99.7% (almost
everything) of the data falls within three standard deviations of the mean.

According to the standard normal distribution table (z-table), if z = 1 then the
area from the mean to z is 0.3413. Multiplying by 2 to cover both sides of the
mean gives 0.6826, or 68.26%.
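The areas quoted in the 68-95-99.7 rule can be reproduced with the `NormalDist` class in Python's standard `statistics` module instead of a printed z-table. The helper name `area_within` is an assumption for this sketch.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def area_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


# Area from the mean to z = 1 (the table value), then both sides of the mean.
print(round(standard_normal.cdf(1) - 0.5, 4))
for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
```

The computed values differ from the table-based 0.6826 only in the fourth decimal place, because the table entry 0.3413 is itself rounded.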
24 Consider the following Table 1 Tut201 The marks should first be converted to z-values, to make it possible to
2012 Q21 compare them across the different means and standard deviations:
Subject   Student X   Mean of class   Std. dev. of class
A         50%         40%             5%
B         55%         50%             5%
C         60%         50%             10%
D         65%         65%             5%

Z = (X - X̄) / S

ZSubjectA = (50 - 40) / 5 = 10/5 = 2
ZSubjectB = (55 - 50) / 5 = 5/5 = 1
ZSubjectC = (60 - 50) / 10 = 10/10 = 1
ZSubjectD = (65 - 65) / 5 = 0/5 = 0

In which subject did Student X do best, relative to his
class?
1. A So it is clear that in the case of subject A, the student’s marks are 2 standard
2. C deviations above the mean. In the other subjects the student’s marks are 1
3. D standard deviation or less above the mean.
4. B
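The comparison in Question 24 can be sketched as follows. The dictionary layout and variable names are assumptions for the example; the marks, class means and standard deviations are taken from the table in the question.

```python
# subject: (Student X's mark, class mean, class standard deviation)
marks = {
    "A": (50, 40, 5),
    "B": (55, 50, 5),
    "C": (60, 50, 10),
    "D": (65, 65, 5),
}

# Convert each mark to a z-score so that the subjects become comparable.
z_scores = {
    subject: (x - mean) / sd for subject, (x, mean, sd) in marks.items()
}

best_subject = max(z_scores, key=z_scores.get)
print(z_scores)
print(best_subject)
```

Note that the highest raw mark (65% in subject D) gives the lowest z-score, because it equals the class mean.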
25 Study the histogram below of the exam marks of a 2 P29 Possible outcomes = 10 + 20 + 40 + 10 + 20 = 100
group of students in the same class. Note that the
values on the horizontal axis are the class (category) Favourable events = score > 40 and < 60
limits = Exam mark of 50 with frequency = 10
= 10

P(40 < score < 60) = Number of favourable events / Number of possible outcomes


= 10 / 100
= 0.10

OR

Possible outcomes = 10 + 20 + 40 + 10 + 20 = 100

Assume we use this histogram as a basis for making Favourable events = score > 40
probability predictions. What is the probability that a = exam mark 50 (frequency 10) and exam mark 60 (frequency 20)
student's score will be between 40 and 60? = 10 + 20 = 30

1. 0.20 P(score > 40) = Number of favourable events / Number of possible outcomes
2. 0.10 = (10 + 20) / 100
3. 0.70 = 30 /100
4. 0.30 = 0.30

Use the scenario below to answer Questions 26 to 31

A researcher suspects that the addition of certain food supplements to the diet of elderly people will reduce the decline in cognitive functioning that comes
about because of aging. She decides to test this using a neuropsychological test that measures the speeds with which objects are identified (the
Neuropsychological Perceptual Speed or NPS test). It is known that the distribution of scores on this test is approximately normal and that a mean of µ =
80 and σ = 20 was found in the population of persons older than 65.

To investigate her hypothesis, she obtains a random sample of n=100 persons older than 65. Each member of this sample is given a daily dose of
supplements over a period of six months. At the end of this time, each person is tested on the NPS test and a mean of ẋ = 76 is found. The researcher
plans to test the hypothesis at α = 0.05.

26 The appropriate research hypothesis suggested by the 3 Tut201 A psychological hypothesis formulates a testable empirical claim (something
scenario above is as follows 2012 Q8 that can in principle be observed), and this usually involves postulating a
relationship between two or more variables.
1. Cognitive functioning declines with age
2. The cognitive functioning of elderly persons is
related to their perceptual speed
3. Cognitive functioning will be better for elderly
persons who take the dietary supplement than for
those who do not
4. The perceptual speed of elderly persons who take
the dietary supplement will be greater than for those
who do not

27 The appropriate alternative hypothesis to be tested 1 H0: μ = 80, which is the mean NPS score in the general population. For
is______ perceptual speed to improve, the NPS score must go down (treat the score as
the time taken). The alternative hypothesis is therefore that the mean NPS
1. H1: μ < 80 score is less than 80. So H1: μ < 80
2. H1: μ < 84
3. H1: μ > 80
4. H1: μ ≠ 80
28 The mean of the sampling distribution of the mean is 1 SG P60- The sampling distribution of means refers to the distribution of the means of all
______ 61 possible samples of a particular size randomly selected from the same
population
1. 80
2. 76 μẋ = μ = 80
3. 20
4. unknown

29 The standard error is ______ 2 SG P60 We can estimate the size of the error we would make if we used the sample
mean as an estimate of the population mean. This is referred to as the
1. 20 standard error, and it is specified in the central limit theorem.
2. 2
3. 0.05 SG P61 The standard error is denoted by σẋ. The σ indicates that we are describing
4. unknown a population, and the subscript ẋ informs us that we are dealing with a
population of sample means. The standard error is given by dividing the
population standard deviation by the square root of the sample size
σẋ = σ / √n
If σ = 20 and n = 100, then
σẋ = 20 / √100
= 20 / 10
= 2

30 With the information as given in the scenario, what 4 P100-106 The t-distribution is a statistical distribution with a probability distribution that
would be the appropriate statistical test to test can be determined, which means that we can use it to predict the chances of
hypothesis? obtaining specific outcomes when testing for comparisons of means when the
population standard deviation σ is unknown.
1. A one sample t-test So we have to use the t-test (t) when the population standard deviation
2. A two sample t-test (σ) is considered to be unknown - because the given standard deviation
3. A test of correlation r for relationship between comes from the sample.
variables
4. A one sample z-test
When the population standard deviation (σ) is known, as it is in this scenario (σ = 20), we use the z-test (z).
31 The test statistic is calculated and, based on this, a 1 A test statistic is calculated to determine how far the observed measurements
computer program is used to determine that the one deviate from what we may expect by chance. Calculating the test statistic is
sided p-value =0.022. What conclusion can be drawn? the first step in a process of comparing the observed data with what may be
expected by chance (i.e., if the null hypothesis were true).
1. The null hypothesis can be rejected, so the
supplement improves cognitive functioning P81 A computer program usually supplies a two-tailed p-value, but in this case the
2. The null hypothesis cannot be rejected, so the question states that the one-tailed p-value =0.022. This also means we are
supplement improves cognitive functioning referring to a directional alternative hypothesis.
3. The alternative hypothesis can be rejected, so the
supplement improves cognitive functioning We have already established that H0: μ = 80 and H1: μ ˂ 80. The researcher
4. Insufficient information is given to make a plans to test the hypothesis at α = 0.05. We can therefore compare the p-
conclusion without further calculations value (0.022) with the alpha (0.05). The p-value is smaller than the alpha
which means we have to reject the null hypothesis.
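The test described in Questions 27 to 31 can be sketched as a lower-tailed one-sample z-test. The function name is an assumption for this example, and the normal distribution from Python's `statistics` module stands in for the computer program mentioned in Question 31. The computed p-value (about 0.023) is close to, but not identical with, the 0.022 quoted in the question.

```python
import math
from statistics import NormalDist


def one_sample_z_test_lower(sample_mean, mu0, sigma, n):
    """Return (z, one-tailed p-value) for H0: mu = mu0 against H1: mu < mu0."""
    standard_error = sigma / math.sqrt(n)
    z = (sample_mean - mu0) / standard_error
    p_value = NormalDist().cdf(z)  # area in the lower tail below z
    return z, p_value


# Scenario values: sample mean 76, mu = 80, sigma = 20, n = 100, alpha = 0.05.
z, p = one_sample_z_test_lower(76, 80, 20, 100)
alpha = 0.05
print(z, round(p, 4))
print("reject H0" if p < alpha else "do not reject H0")
```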

32 When applying a statistical test, the probability of a 2 SG 82-86 An error of Type I is the error we make if we reject the null hypothesis when
type I error is equal to ______ we should not have done so, and the level of significance represents the
Tut202 greatest risk of doing this that we are willing to take.
1. 0.05 or 0.01 2014 Q5
2. the level of significance We know that the extent of the type I error that a researcher is willing to make
3. the calculated value of the test statistic is controlled by the researcher by setting the level of significance (α) in
4. the p-value of the test statistic under the alternative advance. The probability of a type II error (β) is not controlled in advance by
hypothesis the researcher except for the fact that we know that the lower (smaller)
the probability of a type I error (α) the greater (larger) the probability of a
type II error (β).

You could eliminate error of type I (rejecting H0 when you should not)
altogether by never rejecting the null hypothesis, irrespective of how small the
p-value becomes, but that would make the probability of not rejecting H0 when
you should reject it (error of type II) an absolute certainty.
33 A statistical hypothesis is a formal statement about 1 P18 The next step in the research process is to turn the research hypothesis into a
______ statistical hypothesis: a formal hypothesis that can be tested by
statistical techniques. (on the basis of sample observations, whether the
1. parameters relationship proposed in the research hypothesis indeed exists.)
2. statistics
3. level of significance P71 This statistical hypothesis is a formal expression of the research hypothesis,
4. p-values which enables us to test it.

P74 Take note that a research hypothesis always translates into two mutually
exclusive hypotheses (i.e. both cannot be true at the same time): a null and an
alternative hypothesis. Also remember that, in Topic 1, we referred to
quantities such as μ and σ as parameters (population parameters). These particular
statistical hypotheses are, thus, statements about the value of a
particular population parameter.
34 The sampling distribution of a statistic (e g of the 1 P58 The sampling distribution of a statistic is the set of all possible values of the
sample mean) can be calculated if we assume that the statistic when all possible samples of a fixed size are taken from the
______ hypothesis is true, but not if we assume that the population. The sampling distribution refers to the variation of a statistic, for
______ hypothesis is true example, the sample mean (ẋ), from sample to sample. Note that here we are
not concerned with the variation of individual elements in the sample, or
1. null, alternative individual elements in the population, but with the variation of a summary
2. alternative, null value (such as the mean) for a sample.
3. statistical, research
4. research, statistical P77-79 So what we do instead is to calculate how far from the expected mean our
observed mean is, and determine from this the probability that this difference
is not 'real' but just a consequence of chance (random error). In other words,
we determine the probability of getting this sample result, on a sample of this
size, if H0 were true. We use the expression 'under the null hypothesis' by
which we mean, 'assuming that the hypothesis H0 is true'. Similarly, the
phrase 'under H1' would mean, 'assuming that H1 is true'.
35 When a statistical test yields a large p-value, which of 3 P81 Here is a summary of the important points regarding the p-value:
the following statements is most correct? • The p-value gives the probability of obtaining the sample result under H0.
• If the p-value is very small, the probability is very small that the sample
1. The alternative hypothesis is probably true result would occur under H0, and one should consider rejecting H0 in
2. The null hypothesis is probably false favour of H1.
3. The null hypothesis is probably true • The smaller the p-value, the more likely that the null hypothesis is false
4. The probability of an error of Type I is small and should be rejected in favour of the alternative hypothesis.

So, if the p-value is very large, the probability is high that the sample result
would occur under H0, and one should not reject H0 in favour of H1.
The null hypothesis is then probably true.

36 The hypothesis "H1 µ < 50" is a ______ hypothesis and 4 P75-76 The alternative hypothesis can contain any of the symbols '>', '<' or '≠'
requires a ______ statistical test respectively, the symbols for 'larger than', 'smaller than' or 'not equal to'.

1. non-directional, one-tailed When a comparison is between a value that is greater (more) than another,
2. directional, two-tailed we use the symbol '>' and when a comparison is between a value that is
3. non-directional, two-tailed smaller (less) than another, we use '<'. The statistical test that must be
4. directional, one-tailed performed in either of these cases is a directional or one-tailed statistical
test (we use these expressions interchangeably).

When we do not specify what the direction of the difference should be, and
both a larger and a smaller difference between means are considered as
relevant, the symbol '≠' must be used. The statistical test to be performed will
now be a non-directional or two-tailed test.

P81 The relationship between one-tailed and two-tailed p-values can be


summarised as follows:
• One-tailed p-value = (two-tailed p-value) / 2
• Two-tailed p-value = (one-tailed p-value) x 2

The important point to remember is that the p-value indicates more or less
how likely the particular result we have observed in our data is if the null
hypothesis were true; or, as we say, 'under the null hypothesis'.
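The relationship between one-tailed and two-tailed p-values summarised under Question 36 can be written as two small helpers. The function names are assumptions for this sketch, and it assumes a symmetric test statistic with the observed effect lying in the direction stated by H1.

```python
def one_tailed_from_two_tailed(p_two_tailed):
    """Halve a two-tailed p-value (effect assumed to be in the predicted direction)."""
    return p_two_tailed / 2


def two_tailed_from_one_tailed(p_one_tailed):
    """Double a one-tailed p-value, capped at 1 because probabilities cannot exceed 1."""
    return min(1.0, p_one_tailed * 2)


print(one_tailed_from_two_tailed(0.04))   # half of the two-tailed value
print(two_tailed_from_one_tailed(0.022))  # double the one-tailed value
```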
37 When applying a z-test to compare a sample mean to a 3 Tut201 The observed results are the values which you find in your sample(s) of data,
known population mean, the p-value represents the 2014 for example the sample mean and sample standard deviation, or (if it is
probability of ______ Q10 relevant), the correlation coefficient which you calculated.

1. rejecting the null hypothesis if it is false The p-value shows you the probability of seeing some relationship among
2. obtaining the mean found in the sample of data these variables based on your calculations (such as a difference between
under the alternative hypothesis means or a high correlation), if in fact this observed relationship is merely the
3. obtaining the mean found in the sample of data consequence of chance (in other words, if the null hypothesis was true). You
under the null hypothesis are in fact comparing the observed relationships in the data with what you
4. failing to reject the null hypothesis when it is in fact would expect if the null hypothesis is true by calculating a relevant test
true statistic.

This test statistic can then be used to find the p-value if we know the
probability distribution of the test statistic. If this probability is small, it implies
the null hypothesis is probably not true.

38 When applying a statistical test a decision is reached 1 SG 82-86 An error of Type I is the error we make if we reject the null hypothesis when
by comparing the ______ to the ______ we should not have done so, and the level of significance represents the
Tut202 greatest risk of doing this that we are willing to take.
1. p-value, level of significance 2014 Q5
2. test statistic, population parameter We know that the extent of the type I error that a researcher is willing to make
3. test statistic, level of significance is controlled by the researcher by setting the level of significance (α) in
4. p-value, test statistic advance. The probability of a type II error (β) is not controlled in advance by
the researcher except for the fact that we know that the lower (smaller)
the probability of a type I error (α) the greater (larger) the probability of a
type II error (β).

You could eliminate error of type I (rejecting H0 when you should not)
altogether by never rejecting the null hypothesis, irrespective of how small the
p-value becomes, but that would make the probability of not rejecting H0 when
you should reject it (error of type II) an absolute certainty.
39 The lower we set the level of significance, the greater 2 SG 82-86 An error of Type I is the error we make if we reject the null hypothesis when
the probability of ______ we should not have done so, and the level of significance represents the
Tut202 greatest risk of doing this that we are willing to take.
1. rejecting the null hypothesis 2014 Q5
2. a type II error We know that the extent of the type I error that a researcher is willing to make
3. a type I error is controlled by the researcher by setting the level of significance (α) in
4. accepting the alternative hypothesis advance. The probability of a type II error (β) is not controlled in advance by
the researcher except for the fact that we know that the lower (smaller)
the probability of a type I error (α) the greater (larger) the probability of a
type II error (β).

You could eliminate error of type I (rejecting H0 when you should not)
altogether by never rejecting the null hypothesis, irrespective of how small the
p-value becomes, but that would make the probability of not rejecting H0 when
you should reject it (error of type II) an absolute certainty.
40 Which of the following assumptions do we make when 2 P77 The decision rule for H0 is simply as follows:
applying a statistical test? P82 If the p-value of the sample result is smaller (less) than α (level of
We assume that the ______ significance), the null hypothesis is rejected. If the p-value is not smaller than
α, the null hypothesis (H0) is not rejected.
1. level of significance is small
2. null hypothesis is true
3. alternative hypothesis is true
4. the null hypothesis is false
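The decision rule quoted under Question 40 can be expressed as a short function. The name `decide` and the returned strings are assumptions for this illustration.

```python
def decide(p_value, alpha):
    """Apply the decision rule: reject H0 only if the p-value is smaller than alpha."""
    if not 0 <= p_value <= 1:
        raise ValueError("a probability must lie between 0 and 1")
    return "reject H0" if p_value < alpha else "do not reject H0"


print(decide(0.022, 0.05))  # p-value below alpha
print(decide(0.04, 0.01))   # same kind of result, stricter alpha
```

The second call shows that the same data can lead to a different decision when the researcher chooses a smaller level of significance in advance.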
41 The size of the level of significance depends on ______ 1 Tut201 The level of significance (α) reflects the greatest risk that the researcher is
2012 willing to take of rejecting the null hypothesis in error. The researcher wants to
1. a choice made by the researcher Q29 establish that the observation which was made (as calculated from a sample
2. conventional rules of data) has a very small chance of being purely the result of chance
3. the calculation of a test statistic Tut202 variations in the data. He/she controls this by requiring that this calculated
4. the p-value under H0 2013 Q2 probability (the p-value) should be below a specific level (α) which is chosen in
advance.
Alternative 4 is false because the p-value refers to this calculated probability of
finding a test statistic of a particular size if the null hypothesis is true (i.e.
‘under the null hypothesis’), while the level of significance is the maximum
value of this p-value which the researcher is willing to consider if the null
hypothesis is to be rejected. The p-value must be less than this chosen level
of significance, or else the statistical relationship between the variables is
considered to be too small to be regarded as significant (the greater the p-
value, the greater the probability that the effect which was observed in the
sample data is purely the result of chance).
Alternative 3 is false because an appropriate test statistic has to be calculated
in order to find the p-value, but this test statistic is not called ‘the level of
significance.’
While values for α such as 0.01 or 0.05 are often used by convention, the
researcher can in fact use any value which he/she deems appropriate, so
alternative 2 is not strictly correct.
42 When two population means are compared, the p-value 1 P81 Here is a summary of the important points regarding the p-value:
expresses the probability of the difference between the • The p-value gives the probability of obtaining the sample result under H0.
sample means given that ______ • If the p-value is very small, the probability is very small that the sample
result would occur under H0, and one should consider rejecting H0 in
1. H0 is false favour of H1.
2. H1 is true • The smaller the p-value, the more likely that the null hypothesis is false
3. H1 is false and should be rejected in favour of the alternative hypothesis.
4. H0 is true
So, if the p-value is very large, the probability is high that the sample result
would occur under H0, and one should not reject H0 in favour of H1.
The null hypothesis is then probably true.

P76-77 Generally, we would compare the two population means because H0 seems to
be false, but H1 has not yet been proven to be true.
43 What does it mean to say “the difference between the 2 Tut202 The null hypothesis states that there is no difference in the means calculated
means of groups A and B is statistically significant”? 2014 Q8 from samples of data from each of groups A and B. When we calculate the
two means from sample data (which we regard as an observation) we may
1. It is unlikely that the alternative hypothesis will be find a difference in the two calculated means, but at least part of this
true difference could be due to measurement errors. We calculate the p-value
2. The sample result is more probable under the (based on a test statistic with a known probability distribution) to find out what
alternative hypothesis the probability is that these observed differences in the sample data are
3. The null hypothesis explains the sample result just a consequence of measurement error if the null hypothesis is assumed to
4. The alternative hypothesis should be rejected be true. If this probability is low (lower than a pre-determined cut-off level, α),
we conclude that the difference in the two means is statistically significant
because the probability that the null hypothesis is true is very small.

In other words, we conclude that the size of the difference between means
found in the sample data would not be likely if the null hypothesis were true.

Therefore: The sample result is more probable under the alternative hypothesis

44 When two means are compared, the p-value expresses 3 P76-77 In fact, we are not yet entitled to conclude that the alternative hypothesis is
the probability that a difference ______ true. This is because of the problem of sampling error. This error exists partly
because we are using a sample to make conclusions about a population, in
1. is statistically significant addition to which we are using a test that is only accurate to a certain degree.
2. which is found between the means is due to the It is because of this random error that we require the use of statistical tests to
alternative hypothesis see whether the result is in fact adequate for us to make a decision about the
3. which is found between the means is due to chance hypothesis. (See section 1.4.4 on the problem of the error term in
or sampling error measurement.)
4. will be found between the means

45 The power of a statistical test refers to the ______ 2 P85-86 The ability of a statistical test to detect a significant relationship between
variables when such a relationship does in fact exist, is referred to as its
1. test's ability to give small p-values power. This is the complement of the probability of a Type II error: it is the probability of rejecting H0
2. test's ability to detect significant results when, in fact, it is false and H1 is true. To put it succinctly, it is the probability
3. sample size of correctly rejecting a false null hypothesis
4. probability that an error of Type I will not be made The power of a test is calculated by subtracting the probability of a Type II
when the test is used error from one (i.e., power = 1 - β). It can be thought of as a measure of the
"accuracy" of the test. The power of a test is related to how sensitive the test
should be (see section 3.3.4 on effect size below) as well as the sample size
(n) that you are going to use.
In practice, we usually control only the α-level when we use a particular
statistical test. But, given a fixed α-level, there are ways of increasing the
power of a test even if we do not actually calculate the value of 1 - β.
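The relationship power = 1 - β, and the trade-off between α and β noted under Questions 32, 38 and 39, can be sketched for a lower-tailed z-test. This is a hypothetical illustration: it reuses μ = 80, σ = 20 and n = 100 from the scenario for Questions 26 to 31, and it assumes a true population mean of 76, which is not given anywhere in the source. The function name is also an assumption.

```python
import math
from statistics import NormalDist


def power_lower_tailed_z(mu0, mu_true, sigma, n, alpha):
    """Probability of rejecting H0: mu = mu0 (H1: mu < mu0) when the mean is mu_true."""
    standard_error = sigma / math.sqrt(n)
    z_critical = NormalDist().inv_cdf(alpha)          # negative cut-off in the lower tail
    critical_mean = mu0 + z_critical * standard_error  # reject H0 below this sample mean
    # Sample means are distributed around the true mean with the same standard error.
    return NormalDist(mu_true, standard_error).cdf(critical_mean)


for alpha in (0.05, 0.01):
    power = power_lower_tailed_z(80, 76, 20, 100, alpha)
    beta = 1 - power  # probability of a Type II error
    print(alpha, round(power, 3), round(beta, 3))
```

Under these assumptions, lowering α from 0.05 to 0.01 reduces the power and so increases β, in line with the comment that a smaller probability of a Type I error goes with a larger probability of a Type II error.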
46 The value that is conventionally indicated with the 4 P81-82 Small p-values would lead one to reject the null hypothesis, because it shows
symbol α refers to the ______ that the probability of H0 being true is not very high. But how small must the p-
value be? The practice in empirical research is to decide what size p-values
1. maximum probability of obtaining the observed would be considered small enough to justify rejecting the null hypothesis
results under H0 before the research is actually conducted. We do this by specifying a 'cut-off'
2. probability of making an error of Type II if the p-value so that, if the calculated p-value of our sample result is smaller than
rejection of H0 is in fact true this 'cut-off' p-value, the null hypothesis is rejected. This 'cut-off' p-value is
3. ability of the statistical test to detect whether an called the significance level of the statistical test procedure. We will use the
effect exists symbol 'α' to denote this significance level. The symbol 'α' is pronounced
4. maximum probability of making an error of Type I if 'alpha' and is the Greek letter equivalent to the normal 'a' in our (Roman)
the rejection of H0 is to be considered alphabet. By convention, this value is often set at either 0.05 or 0.01. The α-
value specifies the maximum risk that we are willing to take of making an
error if we reject the null hypothesis (see section 3.3.3 for more details on
this).
47 A researcher wants to test the hypothesis that the mean 2 P105 sẋ = s / √n
depression score on a depression scale for patients = 24 / √64
diagnosed with clinical depression is greater than 120. = 24 / 8
The statistical hypothesis to be tested is = 3
H0 µ = 120
H1 µ > 120

She uses a random sample of n=64 drawn from the


population of diagnosed patients and finds that ẋ = 127
and s = 24

Which of the values below is the closest to the correct


value of sẋ?

1. 0.37
2. 3.0
3. 0.61
4. sẋ cannot be calculated from the information that
was provided
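The calculation for Question 47 can be checked with a short sketch. The function names are illustrative only. The t-statistic at the end is an extra step that follows from the figures given in the question; it is not part of the original answer.

```python
import math

def standard_error_of_mean(s, n):
    """Estimated standard error of the mean: s divided by the square root of n."""
    return s / math.sqrt(n)

def one_sample_t(sample_mean, mu0, s, n):
    """t-statistic for testing a sample mean against the value stated in H0."""
    return (sample_mean - mu0) / standard_error_of_mean(s, n)

se = standard_error_of_mean(s=24, n=64)                  # 24 / 8
t = one_sample_t(sample_mean=127, mu0=120, s=24, n=64)   # (127 - 120) / se
print(se, round(t, 2))
```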
# Question Ans Page Comments
48 Suppose H0 μ = 100 is tested against H1 μ ≠ 100 with 4 p = 0.04
α=0.05. If the t-statistic is found to be -3.20 and the two- α = 0.05
tailed p-value is 0.04, what decision regarding the
statistical hypothesis can be taken? General rule: if p < α, reject H0 and accept H1

1. Do not reject H1 Remember H1 μ ≠ 100 (two-tailed), so α=0.05 is based on two-tailed


2. Reject H1 and accept H0 hypothesis. The p-value of 0.04 is also two-tailed, so we can compare the p-
3. Do not reject H0 value and α directly
4. Reject H0, and accept H1

49 Suppose the alternative hypothesis states that μ > 60. 1 P106 H1: μ > 60
The researcher should test H0 against H1 if the ______
μ = population mean
1. sample mean is larger than 60 μ > 60 is directional indicating larger than.
2. sample mean is smaller than 60
3. sample mean differs from 60 So if the sample mean is greater than 60, a test should be performed.
4. p-value is smaller than the level of significance

50 The following list contains a number of situations where 2 P123 One can use t-tests to compare two groups at a time until one has compared
a researcher may consider using a variation of the t-test all three groups with one another. It would probably be wise to use a smaller
a) To compare two group means level of significance since the probability of a Type I error increases as you do
b) To determine whether a relationship exists between more statistical tests on the same data.
two categorical (nominal scale) variables
c) To compare a group mean with a constant value P115-123 T-test does not test for relationships. It compares groups
d) To determine whether a relationship exists between
two continuous quantitative variables

Two of the statements above are true. Choose the


correct set of true statements from the list below

1. (a) and (b)


2. (a) and (c)
3. (b) and (d)
4. (c) and (d)
# Question Ans Page Comments
51 When applying a t-test for the difference between the 1 P86-87 With a t-test, the population standard deviation σ (and, therefore, the
means of two independent samples, the probability of population variance σ²) is unknown.
obtaining the calculated t-statistic under the null
hypothesis is compared to the ______ to reach a P105 You calculate the t-statistic using sample values to get a p-value which is then
decision compared with the alpha level of significance
P116
1. level of significance
2. degrees of freedom
3. two-tailed probability
4. effect size

52 Samples can be considered independent when ______ 4 P112 Samples are considered as comprising independent groups if the
composition of the one sample in no way affects, in any systematic way,
1. the sample comes from the assignment of subjects the composition of the other sample. The two samples come from two
to a treatment or experimental group and this is groups that have no obvious relationship. For example, where one sample is
varied to see how it affects certain measurements measurements of a construct like 'self-esteem' among men, and the other
2. care was taken that the samples are drawn under among women, but both groups were sampled purely randomly.
different experimental conditions
3. the samples are drawn from more than a single
population of subjects
4. the composition of one sample is not systematically
related to the composition of the other one
# Question Ans Page Comments
53 A social psychologist wants to test how long people will 3 P113-116 The dependent variable is the one that is predicted or explained, and the
wait before responding to cries of help from an independent variable is manipulated to see how it affects the dependent
unknown person. The psychologist wants to confirm his variable.
suspicion that people will take less time to react when
they hear a female voice than when they hear a male We have to perform a tc test
voice. He tests this on a sample of n=15 people who
are told (one at a time) to sit in a waiting room to be In order to use the t-test (tc) statistic, we need to make two assumptions
called for an interview. While they wait, each participant regarding the data:
hears a call for help from a male or female voice, which • that the two populations being compared are normally distributed
is actually a recording. The dependent variable is the • with the same variance (or standard deviation).
number of seconds that each participant waits until they
go to investigate or try to find help. The following Note: Even the most elementary statistics program makes provision for
sample statistics are calculated from the results. performing t-tests. Such programs usually require that we indicate which
variable should be used to identify the two groups and which is the dependent
Male voice ẋ1= 11.9 seconds, s1 = 3.5 variable. In addition, we have to choose between a tc test for independent
Female voice ẋ2 = 15.3 seconds, s2 = 4.1 samples or a td test for dependent or correlated groups

Given these sample statistics, what type of statistical


test is required to confirm the relevant statistical
hypothesis?

1. A one-tailed statistical test


2. A two-tailed statistical test
3. A test for independent samples
4. No statistical test is necessary
# Question Ans Page Comments
54 A researcher plans to use the t-test to compare two 2 P115 In order to use the t-test (tc) statistic, we need to make two assumptions
independent samples of data of only 15 individuals regarding the data:
each. Consider the following assumptions that may be • that the two populations being compared are normally distributed
relevant here • with the same variance (or standard deviation).
a) the sample standard deviations have to be equal
b) the data from both samples has to come from (Remember that the square root of the variance is equal to the standard
populations that are normally distributed deviation.)

What minimum assumptions from the ones given above We can also assume that the samples are independent - since the samples
need to be met before she may proceed? were selected randomly, we can safely consider them to be independent of
each other. All of this makes the tc-test an appropriate test.
1. At least one of (a) or (b) must be true
2. (a) and (b) must both be true P116 Note: Even the most elementary statistics program makes provision for
3. Neither (a) nor (b) is relevant but other assumptions performing t-tests. Such programs usually require that we indicate which
exist that will have to be considered variable should be used to identify the two groups and which is the dependent
4. The t-test should never be used with such a small variable. In addition, we have to choose between a tc test for independent
sample at all samples or a td test for dependent or correlated groups
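As a rough illustration of a t-test for two independent samples, the sketch below computes a pooled-variance t-statistic. It assumes that tc refers to the usual pooled-variance form, which relies on the equal-variance assumption listed above. The function name and the scores are hypothetical.

```python
import math
from statistics import mean, variance

def pooled_t(sample1, sample2):
    """t-statistic for two independent samples, assuming equal population variances."""
    n1, n2 = len(sample1), len(sample2)
    # Pooled variance: the two sample variances weighted by their degrees of freedom
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(sample1) - mean(sample2)) / standard_error
    return t, n1 + n2 - 2   # t-statistic and degrees of freedom

# Hypothetical scores for two independent groups
group1 = [12, 15, 11, 14, 13]
group2 = [10, 9, 12, 8, 11]
t, df = pooled_t(group1, group2)
print(round(t, 2), df)
```

The resulting t-value would still have to be converted to a p-value (from a t-table or a statistics program) before it can be compared with α.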

55 A researcher wants to test the following hypotheses 1 P78-81 The relationship between one-tailed and two-tailed p-values can be
summarised as follows:
H0 μ1 = μ2 • One-tailed p-value = (two-tailed p-value) / 2
H1 μ1 > μ2 • Two-tailed p-value = (one-tailed p-value) x 2

On the basis of data provided, the output from a The important point to remember is that the p-value indicates more or less
computer programme indicates that a t-value of t = 1.72 how likely the particular result we have observed in our data is if the null
was found, with the p-value for a two-tailed test given hypothesis were true; or, as we say, 'under the null hypothesis'.
as p = 0.056. What should the researcher do to
evaluate this result at a level of significance of α = P105 H1 μ1 > μ2 is a directional hypothesis and a one-tailed test should be
0.05? performed. Computer programs often only provide the p-value for non-
directional testing (i.e., for the two-tailed t-test).
1. Divide the p-value by 2 before comparing it with α
2. Multiply the p-value by 2 before comparing it with α In this case, the non-directional p-value (p = 0.056) should be divided by two to
3. Divide α by 2 before comparing p to α get the one-tailed value of p =0.028.
4. Compare the p-value as given with α
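The conversion and the decision rule for Question 55 can be sketched as follows. The function names are illustrative. Halving the p-value is only valid when the sample difference lies in the direction predicted by H1, which is the case here because t = 1.72 is positive.

```python
def one_tailed_p(two_tailed_p, effect_in_predicted_direction=True):
    """Convert a two-tailed p-value from a symmetric test (such as the t-test) to a one-tailed one."""
    half = two_tailed_p / 2
    # If the sample difference runs against H1, the one-tailed p-value is the large tail
    return half if effect_in_predicted_direction else 1 - half

def decide(p_value, alpha=0.05):
    """General rule: reject H0 when p < alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"

p_one = one_tailed_p(0.056)   # t = 1.72 is positive, as H1 (mu1 > mu2) predicts
print(p_one, decide(0.056), decide(p_one))
```

The two-tailed value of 0.056 would not lead to rejection at α = 0.05, whereas the one-tailed value of 0.028 does.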
# Question Ans Page Comments

Base your answers to Questions 56 - 58 on the following scenario

A researcher suspects that a relationship exists between colour perception and visual memory (i.e. the capacity to recall visual information). She suspects
that high ability to detect colours rapidly acts as an aid to the capacity of visual memory. A group of 100 research participants are divided into two groups,
based on the capacity of their visual memory, as determined by an appropriate test. One group (Group 1) of n1=44 displays high recollection of visual
images, the other group (Group 2) of n2=56 scores low on the test. Each participant from each of the groups is then tested on how many colours they
can recall of objects they see very briefly displayed on a computer screen

56 Which is the most appropriate research hypothesis for 3 "She suspects that high ability to detect colours rapidly acts as an aid to the
the researcher to test? capacity of visual memory"

1. The mean of the number of colours recalled by the


participants with a good visual memory will differ The mean of the number of colours recalled by the participants with a good
significantly from the mean number of colours visual memory will be significantly greater than the mean number of colours
recalled by those with a limited visual memory recalled by those with a limited visual memory
2. The mean of the number of colours recalled by the
participants with a good visual memory will be
significantly less than the mean number of colours
recalled by those with a limited visual memory
3. The mean of the number of colours recalled by the
participants with a good visual memory will be
significantly greater than the mean number of
colours recalled by those with a limited visual
memory
4. The mean of the differences between the number of
colours recalled by the participants with a good
visual memory and those with a limited visual
memory will be significantly greater than zero

57 Which is an appropriate way to formulate the alternative 3 H1 μ1 > μ2


hypothesis for the analysis of the results?

1. H1 μ1 < μ2
2. H1 ẋ1 > ẋ2
3. H1 μ1 > μ2
4. H1 μ1 ≠ μ2
# Question Ans Page Comments
58 Which is the appropriate test statistic to be calculated 4 P129-130 Correlation: measuring the association between variables
when analysing the results of this research?
Correlation is a measurement of the extent to which a measurement on
1. The t-statistic for the difference between the means one variable is related to a measurement on another variable for the
of two independent samples same sample of individual cases.
2. The t-statistic for the difference between the means
of two dependent samples This can be visualised by way of a graphical representation called a scatter
3. The t-statistic for the mean difference score of a plot. A scatter plot is a graph that represents the measurements of two
single sample variables on two perpendicular axes, usually called the x-axis (horizontal axis
4. The test statistic based on the correlation coefficient or abscissa) and the y-axis (vertical axis or ordinate).
r for the relationship between two variables (visual
memory and recall of colours)

Base your answers to Questions 59 and 60 on the following scenario.

To test the efficacy of a workshop aimed at improving people's interpersonal skills, a researcher applies a scale which rates the interpersonal skills of 20
participants before and after they participate in the workshop. Scores on his rating scale among the general population have a mean of 5 and a standard
deviation of 1.5

59 Which of the following is the most appropriate way to 2 "To test the efficacy of a workshop aimed at improving people's interpersonal
express the null hypothesis for an analysis of the skills, a researcher applies a scale"
results? (Interpret μ as a population mean and Ḋ as the
population mean of the difference scores) There is no direction indicated (greater, more, smaller, etc.)

1. H0: μ = 5 "participants before and after they participate in the workshop"
2. H0: μ1 = μ2
3. H0: Ḋ = 0 So two group means are compared.
4. H0: μ1 ≠ μ2
H0 is always "=" Therefore: H0: μ1= μ2
# Question Ans Page Comments
60 Which is the appropriate test statistic to calculate? 2 P112 "participants before and after they participate in the workshop"

1. The z-statistic for the mean of a sample So the same group is used, which makes the samples dependent.
2. The t-statistic for the difference between the means
of two dependent samples Samples are considered as comprising independent groups if the
3. The t-statistic for the difference between the means composition of the one sample in no way affects, in any systematic way, the
of two independent samples composition of the other sample. The two samples come from two groups that
4. The t-statistic for the mean of a single sample have no obvious relationship. For example, where one sample is
measurements of a construct like 'self-esteem' among men, and the other
among women, but both groups were sampled purely randomly.

On the other hand, the concept of dependent groups refers to situations


where the samples are related, and it implies that each subject in one group
can be systematically paired off with a subject from the other group. For this
reason, a dependent groups research design is often referred to as a
matched-pairs design.
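For a dependent (matched-pairs) design, the t-statistic is computed on the difference scores of each pair. The sketch below assumes the usual form, the mean difference divided by its standard error. The function name and the before/after ratings are hypothetical.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """t-statistic for two dependent samples, computed on the difference scores."""
    differences = [a - b for a, b in zip(after, before)]
    n = len(differences)
    # Standard error of the mean difference
    standard_error = stdev(differences) / math.sqrt(n)
    return mean(differences) / standard_error, n - 1

# Hypothetical ratings for five participants before and after a workshop
before = [4, 5, 6, 5, 4]
after = [6, 6, 7, 7, 5]
t, df = paired_t(before, after)
print(round(t, 2), df)
```

Because each participant is paired with himself or herself, the degrees of freedom are the number of pairs minus one.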
61 When studying correlations in research, one 3 P129-130 Correlation: measuring the association between variables
investigates the relation between ______
The notion of the relationship between two continuous variables and how the
1. the mean of a single sample of subjects and a size of the relationship can be expressed in terms of a correlation between
population mean them (the index of association is the Pearson product-moment correlation
2. two dependent groups of subjects, with respect to a coefficient). This coefficient can also be used as a test statistic.
single variable
3. two variables measured on the same group of Correlation is a measurement of the extent to which a measurement on
subjects one variable is related to a measurement on another variable for the
4. two independent groups of subjects, with respect to same sample of individual cases. This can be visualised by way of a
a single variable graphical representation called a scatter plot.

62 A scatter plot is a graphical representation of ______ 4 SG A graph showing the position of each of a number of sampling units on each
P130-132 of two variables
1. the relationship between two variables measured on
a nominal scale within a single group Tut202 A scatter plot is a graph showing the relationship between two numerical
2. the frequency distribution of a sample of 2014 variables. In such a graph the data of the one variable are plotted on the
measurements Q18 horizontal axis (usually referred to as the X axis), and the data of the other
3. relationship between two groups of subjects with variable on the vertical (or Y) axis.
regard to a single variable measured on an interval
or ratio scale It is not a comparison of sample and population, nor does it have to do with the spread
4. the relationship between two variables measured on of data or the independence of variables
a ratio or interval scale within a single group
# Question Ans Page Comments
63 A positive correlation between variables X and Y implies 2 P133 If a correlation exists, the way in which one variable varies will be related to
that persons scoring low on X will generally score variation on the other one. A negative correlation implies that as one variable
______ on Y changes, the other changes in the opposite direction. A high value on X will
imply a low value on Y, while a low value on X will be matched by a high value
1. high on Y. Conversely, if the correlation is positive, the variable values will
2. low generally vary in the same direction (both high or both low).
3. either high or low
4. in an indeterminate way When positive relationships occur, this implies that as one variable gets
larger, so does the other. When negative relationships occur, this implies that
as one variable gets larger, the other gets smaller.
64 Which of the following can take on a value of -0.5? 3 P132-133 Correlation coefficients that measure the linear relationship between two
variables, such as the Pearson product-moment correlation coefficient, can
1. A probability have a continuous value that ranges from -1 to 1 (a positive value is usually
2. A level of significance written without the sign, so '1' is presumed to mean '+1').
3. A correlation coefficient
4. A variance We use 'r' as the symbol that represents a correlation coefficient (as in the
case of the Pearson product-moment correlation coefficient), and the following
applies:
• r = +1 implies a perfect positive linear relationship (the dots in a scatter
plot will run from lower left to upper right in a perfectly straight line)
• r = 0 implies no linear relationship at all (the dots may be scattered all over
the place)
• r = -1 implies a perfect negative linear relationship (the dots will run from
upper left to lower right in a straight line)
# Question Ans Page Comments
65 What is the most likely value of the correlation 3 P132-133 [Scatter plot of Y against X; both axes run from 0 to 8]
coefficient between the following values of variables X
and Y?

X 2 7 4 5 1
Y 2 7 4 5 1

1. -1
2. 0
3. +1
4. 100

A perfect positive linear relationship exists (the dots in the scatter plot run from
lower left to upper right in a perfectly straight line)

We use 'r' as the symbol that represents a correlation coefficient (as in the
case of the Pearson product-moment correlation coefficient), and the following
applies:
• r = +1 implies a perfect positive linear relationship (the dots in a scatter
plot will run from lower left to upper right in a perfectly straight line)
• r = 0 implies no linear relationship at all (the dots may be scattered all over
the place)
• r = -1 implies a perfect negative linear relationship (the dots will run from
upper left to lower right in a straight line)
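The value of r for the data in Question 65 can be verified with a small sketch. The function below is an illustrative implementation of the standard Pearson formula (sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations), not code from the study guide.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient for paired measurements."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sum of cross-products of deviations, and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [2, 7, 4, 5, 1]
y = [2, 7, 4, 5, 1]
print(pearson_r(x, y))                   # every point lies on the line Y = X
print(pearson_r(x, [-v for v in y]))     # reversing Y flips the sign of r
```

Because every Y value equals its X value, the points fall on a straight line with a positive slope, so r is +1 (up to floating-point rounding).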
# Question Ans Page Comments
66 A researcher hypothesizes that a relationship should 4 SG P137 The symbol ‘ρ’ (the Greek letter ‘rho’) is used to represent the population
exist between spatial ability and general aptitude for parameter being tested when you calculate the Pearson’s correlation
mathematics. She collects the results of a sample of n = Tut202 coefficient ‘r.’ That is, you calculate r for the sample, then have to decide
100 school children for a mathematics test and measure 2014 whether this is likely to represent a significant linear correlation between two
the spatial ability of each with a test that represents a Q13 variables for the whole population (with this population correlation symbolised
person's ability to rotate objects mentally on a 10-point by ρ), by looking at the p-value associated with this calculated sample statistic
scale. r.

Which of the following is the most appropriate way to In a similar way ‘μ’ represents the population parameter (statistic) for a mean,
express the null hypothesis for this research? and ‘σ’ the population parameter for a standard deviation.

1. r=0
2. μ=0
3. ẋ=0
4. ρ=0

67 A number of psychiatric patients are classified into one 1 P142 After setting up the hypotheses to be tested for, the next step is to create a
of four categories as: schizophrenic, severely contingency table, which is a table indicating the number of individual objects
depressed, bipolar disorder and others. Which of the falling in each cell of cross-tabulated data. In other words, it is a two-
following is suitable for representing this information dimensional table in which each observation is classified in terms of two
versus the gender of these patients? categories simultaneously.

1. A contingency table
2. A scatter plot
3. A histogram
4. A spreadsheet

        Schizophrenic  Severely depressed  Bipolar disorder  Others
Male
Female
# Question Ans Page Comments
68 What is the expected frequency of observations in cell 2 P143-144 It is important to note that the relation between the variables is described by
AY if no interactions exist between the variables in the the cell and not by the row or column frequencies. These cell frequencies
rows and columns of the following contingency table? represent the way the information is distributed relative to the two variables.
These cell frequencies are often referred to as the observed or empirical cell
X Y frequencies.
A 6 4
B 4 6 To find the expected frequency for a particular cell, the row total for that row is
multiplied by the column total for that column and this result is then divided by
1. 4 the overall total. These expected frequencies show what the results would
2. 5 have been like if the distribution of frequencies through the cells were
3. 20 homogeneous, in proportion to the respective row and column totals. If the
4. It cannot be calculated from the information observed frequencies correspond precisely with the expected frequencies, we
provided know that the null hypothesis cannot be rejected. But the observed
frequencies will seldom be precisely equal to the expected frequencies - even
if H0 is not rejected - because of sampling error.

It is the differences between these expected and observed frequencies that


interest us, that is, we want to know how far the actual (observed) results are
removed from the expected situation, if there is no interaction effect.

X Y Total
A 6 4 10
B 4 6 10
Total 10 10 20

Row total for row A (O1.) = 10
Column total for column Y (O.2) = 10
Sample total (size) (O..) = 20

E12 = (Row total x Column total) / Sample total
E12 = (O1. x O.2) / O.. = (10 x 10) / 20 = 100 / 20 = 5
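The rule for expected frequencies (row total multiplied by column total, divided by the overall total) can be applied to every cell at once. The sketch below uses the table from Question 68. The function name is illustrative only.

```python
def expected_frequencies(observed):
    """Expected cell counts under H0: (row total x column total) / overall total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

observed = [[6, 4],   # row A: columns X, Y
            [4, 6]]   # row B: columns X, Y
expected = expected_frequencies(observed)
print(expected[0][1])   # expected frequency for cell AY
```

In this table all row and column totals are equal, so every cell has the same expected frequency.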
# Question Ans Page Comments
69 A researcher wants to establish whether a relationship 2 SG P140 The chi-square test is usually used when you have a cross tabulation of
exists between people’s religious affiliation and whether frequency counts of events which are nominal scale measurements. This
they are in favour of or against the death penalty (yes or Tut202 table is referred to as a contingency table. It is used to compare an observed
no). Which of the following would be the most 2014 frequency distribution (frequency counts based on a sample of observation)
appropriate test to use? Q22 with the frequency distribution which we would expect to find if the null
hypothesis of no relationship between two cross-tabulated variables were
1. The t-test for two independent samples true.
2. The Chi-square (χ²) test statistic
3. Pearson's correlation test statistic
4. The t-test for two dependent samples

70 Which of the following is the appropriate formula for the 4 P144 The Pearson chi-square test statistic is a calculation of the difference between
Chi square test? the observed and expected frequencies.

The formula is:


1.

2.
This means the expected value for each cell in the contingency table is
subtracted from the observed value for that cell, squared, and divided by the
expected value for that cell.
3.
Then all of these terms are added together to yield

4.
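The verbal description in the comments corresponds to the Pearson chi-square statistic, χ² = Σ (O - E)² / E, summed over all cells. A minimal sketch applies it to the 2 x 2 table from Question 68. The function name is illustrative, and the degrees of freedom, (rows - 1) x (columns - 1), are a standard result added here rather than something stated in the extract.

```python
def chi_square(observed):
    """Pearson chi-square statistic: the sum over all cells of (O - E)^2 / E."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / total   # expected frequency for this cell
            statistic += (o - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df

# The 2 x 2 table from Question 68
stat, df = chi_square([[6, 4], [4, 6]])
print(round(stat, 2), df)
```

Each of the four cells differs from its expected value of 5 by 1, so each contributes 1/5 to the total.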
Oct/Nov 2012

# Question Ans Page Comments


1 The entire collection of cases that you are interested in 1 P11 The entire collection of cases that you are interested in when you make your
when you do research is referred to as the ______ measurements for a particular construct is referred to as the population. The
population depends on which people or objects or events you are interested
1. population in studying.
2. range
3. sample
4. data

2 Mean, range, variance and standard deviation are 2 P10-11 A distinction exists between inferential statistics and descriptive statistics.
examples of______ Descriptive statistics refers to a set of quantities used to summarise
aspects of numerical data. Examples that you may be familiar with are
1. variables means, range, variance and standard deviation (see Appendix C for a quick
2. descriptive statistics introduction). These summary quantities are sometimes referred to as
3. test statistics parameters (when they refer to the whole collection or population of data; see
4. inferential statistics section 1.4.3 below).

Inferential statistics refers to the use of statistical techniques to make


generalisations about the relationships among (two or more) variables. Here
the patterns that may exist in the data are carefully investigated.
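The descriptive statistics named in Question 2 can be computed with Python's standard library. The scores below are hypothetical, and the sample formulas (dividing by n - 1) are assumed.

```python
from statistics import mean, variance, stdev

scores = [4, 8, 6, 5, 7]          # hypothetical sample of five scores

sample_mean = mean(scores)
sample_range = max(scores) - min(scores)
sample_variance = variance(scores)   # divides by n - 1 (sample variance, s squared)
sample_sd = stdev(scores)            # square root of the sample variance
print(sample_mean, sample_range, sample_variance, round(sample_sd, 2))
```

Each of these quantities summarises one aspect of the data (location or spread) without making any generalisation beyond the sample.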
# Question Ans Page Comments
3 Quantities that summarise aspects of a population are 2 P11 These summary quantities are sometimes referred to as parameters (when
called (a) ______, while (b) ______ do the same for they refer to the whole collection or population of data)
samples
P161 You should take careful note of the following important distinctions between
1. (a) statistics (b) parameters samples and populations. Summary values for populations are called
2. (a) parameters (b) statistics 'parameters' and are usually denoted by Greek letters, while summary values
3. (a) constructs (b) variables for samples are called 'statistics' and are denoted by Roman letters.
4. (a) variables (b) parameters

Summary value                                  Population (parameter)   Sample (statistic)
Arithmetic mean                                μ                        ẋ
Standard deviation                             σ                        s
Variance                                       σ²                       s² (s = √s²)
Standard error of the mean                     σẋ (= σ/√n)              sẋ (= s/√n)
  (also called the standard deviation of the
  sampling distribution of the mean)
Mean of the sampling distribution of the mean  µ
  (the mean of all means; equal to the mean
  under H0)
Z score for means                              z
Correlation between two measurements           ρ                        r
  (Pearson's r)
Proportions                                    P                        p
4 The process of selecting a subset of a population for a 3 P11-12 Because populations can be very large, and we rarely have access to them,
survey is known as ______ we would draw a sample of observations from the population and use that
sample to infer certain things about the population's characteristics. The most
1. survey research appropriate sample is usually a simple random sample, where each individual
2. triangulation has the same chance of being included.
3. sampling
4. operationalisation P12 One of the most effective methods of sampling is random sampling, which
involves selecting a subset in such a way that each member of the population
has an equal probability of being included in the sample. From a statistical
point of view, a more satisfactory definition of random sampling is that it is a
method of drawing a sample from a population in such a way that every
possible sample of a particular size has the same probability of being
selected.
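Simple random sampling without replacement can be sketched with the standard library. The population of 100 numbered cases, the sample size and the seed are all hypothetical, and the function name is illustrative.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n members without replacement so that every possible sample of size n is equally likely."""
    rng = random.Random(seed)      # a seed is used here only to make the example repeatable
    return rng.sample(population, n)

population = list(range(1, 101))   # hypothetical population of 100 numbered cases
sample = simple_random_sample(population, 10, seed=1)
print(sample)
```

No case can appear twice in the sample, and each member of the population has the same chance of being included.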
# Question Ans Page Comments
5 An inference is ______ 3 P2 An inference is a conclusion that follows from existing information, by
generalising from the specific information to the general type of phenomenon,
1. an explanation of why certain things are as they are where the conclusion is not absolutely certain. So in summary inferential
observed to be statistics are techniques for making generalisations based on imperfect
2. an educated guess about how certain phenomena numeric data, where the conclusions have a high probability of being true, but
may be interrelated you can never be completely certain.
3. a generalisation from a specific situation to the
phenomenon in general, which has a high
probability of being true
4. a conclusion which follows logically from certain
premises and which must be true if the premises
are true
6 "Empirically" means "based on ______" 3 P2 All scientific knowledge begins with description of the phenomena being
studied, based on careful observation. Knowledge based on observation of
1. theory physical events is referred to as empirical knowledge (as distinct from
2. statistical arguments knowledge based on contemplation, unexplained insights, mystical
3. observations experiences or claims by authority figures).
4. facts

7 Which of the following best describes “latent”? 3 P7 So the (visible) variable reflects the intensity of the underlying (invisible)
construct, in terms of how it was measured. We say that the variable is
1. observable manifest (it is visible in the sense that we can observe it) and the construct is
2. manifest latent (it is invisible in the sense that we need some way to make it
3. hidden appear). So the latent construct is made manifest by the use of an
4. independent appropriate measurement procedure.

P23 To say that a construct is 'latent' is another way of saying it is hidden from
direct observation
# Question Ans Page Comments
8 A psychologist has a theory that visual perceptual 2 P8-9 The dependent variable is the one that is predicted or explained, and the
ability influences the marks that learners will get in a P24 independent variable is manipulated to see how it affects the dependent
mathematics test. In this example, 'visual perceptual variable.
ability' is the ______ variable
The independent variable is that variable which affects the dependent
1. dependent variable; or, conversely, the dependent variable depends on the independent
2. independent variable.
3. manifest
4. hidden When a researcher focuses on the interaction of only two variables at a time,
the dependent variable is usually the one that the researcher is interested in,
the variable that is the focus of the research. The independent variable is
something that the researcher manipulates, to see how this affects the
dependent variable (in other words, the dependent variable is dependent on
the independent variable).

9 An operationally defined variable is ______ 4 P24-26 Operational definitions of psychological constructs should define constructs in
terms of observable behaviour.
1. abstract
2. latent "Operational" refers to practical procedures by which constructs are made
3. independent visible.
4. observable
"Operationalisation" is where you make the construct (which is usually an
abstract concept, so it is difficult to observe it clearly) visible by finding some
suitable way to measure it.
# Question Ans Page Comments
10 A psychologist is interested in studying the interaction 1 P15-16 We normally start with a research question. This could be an implication of
between small groups of four to five people in each a theory - something that seems to be implied by the theory or some kind of
group. He suspects that the interactions between such practical problem, which is stated in general terms. Using our existing
groups can be described in similar terms to the knowledge about plausible answers, we reformulate the research question in
interactions between individual persons. In order to be terms of a conjecture or supposition, which has the goal of helping the
able to do a scientific study of this (a) ______ question, researcher select what he or she has to observe in order to answer the
he would have to provide a(an) (b) ______ definition of research question. This is the research hypothesis (although there could be
the (c) ______ called "interaction" more than one), which expresses the problem in terms of very specific
relationships among constructs that we expect to find (if our guess is true). It
1. (a) research (b) operational (c) construct is important that this possible relationship should be clear and unambiguous.
2. (a) scientific (b) experimental (c) concept An hypothesis that is stated clearly and specifies exactly what is to be
3. (a) experimental (b) research (c) statistic observed and what should be true if it is valid, is often called an operational
4. (a) hypothetical (b) empirical (c) parameter hypothesis. However, this is just another name for a research hypothesis
where the relationship between the measurements (representing the
construct as variables) is written out in clear and explicit detail. You can
think of the research hypothesis as a description of relationships that
should hold among the constructs (two or more). The operational
hypothesis is then the way the research hypothesis is expressed in the form
of the relationships among the variables produced when the constructs are
measured. But the operational hypothesis is usually taken as equivalent to
the research hypothesis, so the distinction is rarely made in practice.
11 The variable manipulated by a researcher in an 2 P8-9 The dependent variable is the one that is predicted or explained, and the
experiment is called the ______ variable P24 independent variable is manipulated to see how it affects the dependent
variable.
1. hypothetical
2. independent The independent variable is that variable which affects the dependent
3. dependent variable; or, conversely, the dependent variable depends on the independent
4. empirical variable.

When a researcher focuses on the interaction of only two variables at a time,


the dependent variable is usually the one that the researcher is interested in,
the variable that is the focus of the research. The independent variable is
something that the researcher manipulates, to see how this affects the
dependent variable (in other words, the dependent variable is dependent on
the independent variable).
12 Which one of the definitions below is FALSE? 3 P3 Psychologists try to develop explanations for human experiences and
behaviour. To do this, they often have to make use of abstract concepts (also
1. The term construct is used to refer to a concept called constructs) that serve as explanations for the behaviour they observe.
which is of importance in psychological research Concepts such as these are sometimes referred to as 'constructs'. They are
2. Measurement is a process whereby numbers are in a sense 'made up' concepts that we use to explain things (like behavioural
allocated to a construct according to a rule patterns) that we can observe, but cannot see in themselves (at least, not
3. When a psychological variable is measured, the directly).
result is referred to as a statistic
4. When a construct is measured, the resulting P4 Psychologists are interested to find out which constructs are important (in the
quantity is referred to as a variable sense of being required or useful to explain human behaviour) and how they
work together in a pattern, or what their interrelationships are. One of the
objectives of psychology is not only to describe human behaviour, but also to
find explanations for it. Constructs and how they interact fill the role of
explanatory mechanisms in psychology. We try to find out which constructs
offer an appropriate explanation of the behaviour or events we perceive, and
what the pattern of their interactions with other constructs may be. In this
sense, it can be said that constructs are the building blocks of theory.

P4 Constructs and their interrelations (how they affect each other, their patterns
of interaction) are used in this way to develop theoretical explanations of
why people behave in certain ways in certain contexts, or why mental
phenomena appear to be as they are.

P5 'Measurement' refers to a process whereby numbers are allocated to


something according to a rule. So what we need to do is to find specific
standard procedures by which a specific construct can be observed, in such a
way that a numeric value can be allocated.

P7 A construct that has been measured in some way produces a variable.


A variable refers to a number that can take on any one of a range of possible
values. They can be discrete (when only whole numbers like 1, 2, 3 are
allowed) or continuous (what mathematicians refer to as 'real numbers'). In
some cases variables also take on values smaller than zero to produce
negative numbers.
13 Summary statistics that are used to summarize 1 P14 A statistic is a sample measurement characteristic.
information about a population are called ______ A test statistic is the quantity you calculate (often by making use of sample
P23 statistics) to test a statistical hypothesis.
1. parameters When we refer to these test quantities, we always refer to the name in full -
2. inferential statistics 'test statistic', and when we use the term 'statistic' on its own it refers to a
3. samples descriptive statistic that describes an aspect of the sample data.
4. descriptive statistics
Parameters are values that summarise aspects of population data
While the word 'parameters' does refer to descriptive statistics, it does not
refer to all descriptive statistics. It is used only for those descriptive statistics
that relate to the population, not to those that describe aspects of the sample.
14 A frequency distribution of the ages in months of a 2 P47 The expression ‘frequency’ implies the count of observations in each of a
class of Grade 1 children indicates for each age in number of categories. A frequency distribution of ages will represent the
months what the corresponding ______ is number of students falling in each of a number of age categories, which can
be represented graphically in a histogram.
1. variable
2. number of children of that age P140 The chi-square test is usually used when you have a cross tabulation of
3. z-score frequency counts of events which are nominal scale measurements. This
4. probability table is referred to as a contingency table. It is used to compare an observed
frequency distribution (frequency counts based on a sample of observation)
with the frequency distribution which we would expect to find if the null
hypothesis of no relationship between two cross-tabulated variables were
true.
15 A researcher investigating short term memory reads a 1 P53 Frequencies for 9 or more numbers remembered:
list of ten three digit numbers to a group of 100 9 numbers = 6
research participants. Each participant is then asked to 10 numbers = 0
write down as many of the numbers as they can recall.
The frequency table below shows the count of persons Number of participants (N) = 100
who can correctly recall a specific number of three-digit
numbers from the list Formula is: p = f / N

Number of items remembered 1 2 3 4 5 6 7 8 9 10
Frequency 0 4 11 13 22 18 17 9 6 0

So:
p(9 or more) = (f9 + f10) / N
= (6 + 0) / 100
= 6 / 100
= 0.06
Using this table as a basis, estimate the probability that
a specific person will remember nine or more three-
digit numbers.

1. 0.06
2. 0
3. 94%
4. 0.15
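
The calculation for Question 15 can be checked with a short Python sketch. It treats the probability as a relative frequency (count of favourable outcomes divided by N). The function name prob_at_least is only illustrative and does not come from the study guide.

```python
# Frequency table from Question 15: items remembered -> number of participants
frequencies = {1: 0, 2: 4, 3: 11, 4: 13, 5: 22, 6: 18, 7: 17, 8: 9, 9: 6, 10: 0}


def prob_at_least(freq, threshold):
    """Relative frequency of scores greater than or equal to threshold."""
    total = sum(freq.values())  # N, the number of participants
    favourable = sum(count for score, count in freq.items() if score >= threshold)
    return favourable / total


p_nine_or_more = prob_at_least(frequencies, 9)
print(p_nine_or_more)  # 0.06
```

The same function gives the probability for any other cut-off, for example prob_at_least(frequencies, 7) for seven or more numbers.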

16 Two class representatives, one boy and one girl, must 2 P34-35 The multiplicative rule states that p(A and B) = p(A) x p(B) where A and B
be selected from a class of 10 boys and 8 girls, which are both independent events. This rule is used to determine the product of
includes Mary and her friend John. The teacher writes two or more probabilities and is indicated by the word 'and' (i.e. the
the names of all the children on slips of paper. She first probability of A and B).
puts the girls' names into a box and then draws one of
their names blindly. Then she empties the box and Total number of kids = 18 (10 boys and 8 girls)
puts the names of all the boys inside, and one name is John selected = 1 out of 10
again drawn blindly. Mary selected = 1 out of 8

What is the probability that Mary and John will both be P(Mary and John) = P(Mary) x P(John) = 1/8 x 1/10 = 1/80 = 0.0125
selected?
The additive rule is p(A or B) = p(A) + p(B). This rule is used when two or
1. 2/80 more events are mutually exclusive. The additive rule is used to determine
2. 0.0125 the sum of two or more probabilities, and is signalled by the use of the word
3. 0.225 'or' (i.e. the probability of A or B).
4. 2/18
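
As a check on Question 16, the multiplicative rule can be sketched in Python with exact fractions. The variable names are illustrative; the sketch assumes, as the comments above do, that the two draws are independent.

```python
from fractions import Fraction

# One girl is drawn from 8 girls, one boy from 10 boys (independent draws)
p_mary = Fraction(1, 8)
p_john = Fraction(1, 10)

# Multiplicative rule: p(A and B) = p(A) x p(B)
p_both = p_mary * p_john
print(p_both, float(p_both))  # 1/80 0.0125
```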
Base your answers to Questions 17 and 18 on the following information:

Suppose the weights of the population of military recruits are distributed normally with a mean of 64 kg and a standard deviation of 8 kg. Different
samples of these recruits, each with a sample size of 16, are drawn repeatedly

17 We expect the standard deviation of the sample means 1 P60-61 Central limit theorem.
to be about ______ kg If a simple random sample of size n is selected from a population with mean
μ and standard deviation σ, the sampling distribution of means obtained from
1. 2 all possible samples is approximately normal with mean μ and standard
2. 3 deviation σ/√n. The central limit theorem gives a precise description of the
3. 4 distribution that you will obtain if you selected every possible sample,
4. 8 calculated every sample mean, and constructed the distribution of the sample
mean. The importance of the theorem lies in the fact that we can use it to
describe a sampling distribution without actually having to sample a
population of raw scores 'infinitely', and because of this we can calculate the
extent to which any sample mean approximates the mean of the population
from which it was drawn.

Just as the normal distribution is defined by its mean and standard deviation,
so the distribution of sample means is described by the same two quantities.
The central value of the sampling distribution equals the population mean (i.e.
the mean of the distribution of all possible means is the same as the mean of
the population from which the samples were drawn, or μx̄ = μ) while the
standard deviation of the sample means is estimated by a value we call
the standard error of the mean. Like a standard deviation, the standard
error of the mean tells us by what average amount the sample means deviate
from the mean of the sampling distribution. It is an estimate of the size of the
error we shall make if we use the mean of the distribution of sample means
as an estimate of the true population mean, that is, if we use μx̄ to estimate μ.

The standard error is denoted by σx̄. The σ indicates that we are describing a
population, and the subscript x̄ informs us that we are dealing with a
population of sample means. The standard error is given by dividing the
population standard deviation by the square root of the sample size:
σx̄ = σ / √n
where: μ = 64 (mean)
σ = 8 (standard deviation)
n = 16 (sample size)

So: σx̄ = σ / √n = 8 / √16 = 8 / 4 = 2
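
The standard error calculation for Question 17 can be sketched in Python as follows. The function name standard_error is an illustrative choice, not something defined in the study guide.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean: population standard deviation / sqrt(sample size)."""
    return sigma / math.sqrt(n)


# Question 17: sigma = 8 kg, n = 16
se = standard_error(8, 16)
print(se)  # 2.0
```

Note that the population mean (64 kg) plays no part in this calculation; it only determines the centre of the sampling distribution (Question 18).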


18 We expect the mean of the sample means to be about 3 P61 μx̄ = μ = 64
______ kg
Just as the normal distribution is defined by its mean and standard deviation,
1. 52 so the distribution of sample means is described by the same two quantities.
2. 72 The central value of the sampling distribution equals the population mean (i.e.
3. 64 the mean of the distribution of all possible means is the same as the
4. 62 mean of the population from which the samples were drawn, or μx̄ = μ)
while the standard deviation of the sample means is estimated by a value we
call the standard error of the mean.

Like a standard deviation, the standard error of the mean tells us by what
average amount the sample means deviate from the mean of the sampling
distribution. It is an estimate of the size of the error we shall make if we use
the mean of the distribution of sample means as an estimate of the true
population mean, that is, if we use μx̄ to estimate μ.

Base your answers to Questions 19 and 20 on the information in the table below:
Subject   Student X   Mean of class   Std. dev. of class
A         50%         46%             2%
B         55%         50%             4%
C         60%         50%             6%
D         66%         65%             3%

19 In which subject did Student X do best, relative to his 1 P55 Calculate the z-score for each subject. The subject with the highest z-score is
class? the one in which Student X did best.

1. A Formula is: z = (x - μ) / σ
2. C
3. D Where x = Student X score
4. B μ = Mean of class
σ = Std dev of class

Subject A: Z = (x- μ) / σ = (50%-46%) / 2% = 4% / 2% = 2


Subject B: Z = (x- μ) / σ = (55%-50%) / 4% = 5% / 4% = 1.25
Subject C: Z = (x- μ) / σ = (60%-50%) / 6% = 10% / 6% = 1.67
Subject D: Z = (x- μ) / σ = (66%-65%) / 3% = 1% / 3% = 0.33

Student X did best in Subject A
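
The comparison in Question 19 can be reproduced with a small Python sketch that computes the z-score for each subject and picks the largest. The dictionary layout and the function name z_score are assumptions made for illustration.

```python
# subject -> (Student X score, class mean, class standard deviation), all in %
results = {
    "A": (50, 46, 2),
    "B": (55, 50, 4),
    "C": (60, 50, 6),
    "D": (66, 65, 3),
}


def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


z_scores = {subject: z_score(*values) for subject, values in results.items()}
best_subject = max(z_scores, key=z_scores.get)

for subject, z in z_scores.items():
    print(subject, round(z, 2))
print("Best relative performance:", best_subject)
```

Although the raw score is highest in Subject D (66%), its z-score is the lowest, which is why raw scores cannot be compared directly across classes.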


20 What is the probability of getting a score of 66% or 4 App D Subject D: Z = (x- μ) / σ = (66%-65%) / 3% = 1% / 3% = 0.33
more in subject D? So Z=0.33

1. 0.33 P(score ≥ 66%) = P(X ≥ 66%)


2. 0.13
3. 0.63 So P(Z ≥ 0.33) = 0.3707 (Refer to standard normal distribution table where
4. 0.37 z=0.33. Since we are looking for ≥ 0.33, refer to the smaller portion column)
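
Instead of reading Appendix D, the tail area for Question 20 can be approximated in Python. This sketch uses the complementary error function from the standard library; the function name upper_tail is illustrative, and z is rounded to two decimals to mirror the table lookup.

```python
import math


def upper_tail(z):
    """P(Z >= z) for the standard normal distribution (the 'smaller portion' when z > 0)."""
    return 0.5 * math.erfc(z / math.sqrt(2))


z = round((66 - 65) / 3, 2)  # 0.33, as used in the table lookup
p = upper_tail(z)
print(z, round(p, 4))  # 0.33 0.3707
```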

21 A z-score is conventionally used to refer to a variable 4 P52-53 There is one form of the normal distribution that is of special importance. This
from which probability distribution? curve has a mean of μ = 0 and a standard deviation of σ = 1 and is known as
the standard normal distribution, and is by convention indicated with the
1. Any normal distribution letter 'z' (so it is also referred to as the z-distribution). The measures on this
2. The binomial distribution distribution are referred to as standard scores or z-scores.
3. The even distribution
4. The standardized normal distribution

FIGURE 2.7: The standard normal distribution


22 The total area under the standard normal curve equals 4 P53-54 Figure 2.7 above shows the approximate proportions of scores distributed
______ under the area covered by the curve.
• The total area under the curve gives the probability of the interval -∞
1. its mean and +∞, and is equal to +1 (i.e., the probability of any value of z falling
2. its standard deviation between minus and plus infinity is equal to 1).
3. the z-score • Because the distribution is symmetrical, 0.5 of the area lies to the left of
4. one the mean and the same proportion to the right of the mean.
• Approximately 0.341 of the area lies between the mean and 1 standard
deviation in each direction.
• Roughly two-thirds, or 0.682 (0.341 x 2) of the area of the curve lies within
one standard deviation of the mean.
• Approximately 0.477 (i.e. 0.3413 + 0.1359) of the area lies between the
mean and 2 standard deviations in each direction.
• Approximately 0.954 (i.e. 0.477 x 2) of the area lies within 2 standard
deviations from the mean.
• Approximately 0.997 (i.e. 0.954 + (0.0215 x 2)) of the area lies within
three standard deviations from the mean.
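
The proportions listed for Question 22 can be verified numerically. The following Python sketch uses the standard relationship between the normal distribution and the error function; the function name area_within is an illustrative choice.

```python
import math


def area_within(k):
    """Area under the standard normal curve between -k and +k standard deviations."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
```

The results are about 0.683, 0.954 and 0.997 for one, two and three standard deviations, and the area approaches 1 as k grows.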
23 The mean and standard deviation of a set of test 2 P55 z = (x - μ) / σ = (14 - 20) / 8 = -6 / 8 = -0.75
scores are 20 and 8 respectively. If the z-score which
corresponds to a test score of 14 is calculated, in
which of the intervals listed below would it fall? Where:
1. Smaller than -1.0
x represents the variable (test score),
2. Between -1.0 and 0 μ is the population mean,
3. Between 0 and 1.0 σ the standard deviation of the population from which x was obtained.
4. Larger than 1.0
24 Why is the central limit theorem of importance in 3 P60-61 Central limit theorem.
inferential statistics? It ______ If a simple random sample of size n is selected from a population with mean
μ and standard deviation σ, the sampling distribution of means obtained from
1. informs us how sampling error will increase as the all possible samples is approximately normal with mean μ and standard
population increases deviation σ/√n
2. tells us that sampling error will begin to The central limit theorem gives a precise description of the distribution that
approximate a normal distribution as samples grow you will obtain if you selected every possible sample, calculated every
larger sample mean, and constructed the distribution of the sample mean. The
3. shows that the sampling distributions of certain importance of the theorem lies in the fact that we can use it to describe a
sampling statistics will approach a normal sampling distribution without actually having to sample a population of raw
distribution as the sample sizes increase scores 'infinitely', and because of this we can calculate the extent to which
4. can be used to convert any measurement into an any sample mean approximates the mean of the population from which it was
equivalent z-score drawn.

Some interesting facts about this theorem should be noted:


• This theorem gives the sampling distribution of the sample means for any
population, irrespective of the shape, mean or standard deviation of the
original population.
• The distribution of sample means will become more normal as
sample size (n) increases, so that with larger and larger samples the
shape of the distribution of sample means will become increasingly
normal in form. In fact the distribution of sample means approximates a
normal distribution very rapidly: by the time the sample size reaches
n=30, the distribution is very close to perfectly normal.
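
The central limit theorem can be illustrated with a small simulation. The Python sketch below is only an illustration under stated assumptions: it draws samples from a strongly skewed exponential population (mean 1, standard deviation 1), which is a choice made here and not an example from the study guide, and the function name sample_means is hypothetical.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable


def sample_means(n, num_samples=5000):
    """Means of num_samples random samples of size n from an exponential population."""
    return [
        statistics.fmean([random.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]


n = 30
means = sample_means(n)
observed_mean = statistics.fmean(means)  # should be close to the population mean (1)
observed_sd = statistics.stdev(means)    # should be close to sigma / sqrt(n)
expected_se = 1.0 / math.sqrt(n)

print(round(observed_mean, 3), round(observed_sd, 3), round(expected_se, 3))
```

Even though the population itself is far from normal, the mean of the sample means lies close to the population mean and their standard deviation lies close to σ / √n.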

25 The asymptotic property of the normal curve refers to 2 P52 Normal curves share a number of key properties, such as the following:
the fact that ______ • They are bell-shaped. Most observations occur at the midpoint of the
curve.
1. the curve is bell-shaped • They are symmetrical. The left side is a mirror image of the right side.
2. the endpoints of the curve get continuously closer • They are continuous. Theoretically, the values which the variables can
to the x-axis without ever touching it assume are infinite and are measured on a truly continuous scale so that
3. the curve has a standardised variance the curve is smooth.
4. the curve is symmetrical • Their curves are asymptotic, which means that the two tails never touch
the horizontal axis but move ever closer to it as they extend towards infinity, because there is always
some probability that more extreme values will occur.
26 The standard error is a measurement of ______ 1 P60 We can estimate the size of the error we would make if we used the sample
mean as an estimate of the population mean. This is referred to as the
1. how well a sample mean approximates a population standard error, and it is specified in the central limit theorem.
mean
2. the extent to which a variable varies around its P61 The standard error is denoted by σx̄. The σ indicates that we are
mean describing a population, and the subscript x̄ informs us that we are dealing
3. the extent to which one variable changes as with a population of sample means. The standard error is given by dividing
another one changes the population standard deviation by the square root of the sample size:
4. the size of the error being made when you fail to σx̄ = σ / √n
reject a null hypothesis which is actually false
Like a standard deviation, the standard error of the mean tells us by what
average amount the sample means deviate from the mean of the sampling
distribution. It is an estimate of the size of the error we shall make if we
use the mean of the distribution of sample means as an estimate of the
true population mean, that is, if we use µx̄ to estimate µ.
27 Statistical hypotheses are statements about ______ 1 P74 Take note that a research hypothesis always translates into two mutually
exclusive hypotheses (i.e. both cannot be true at the same time): a null and
1. population parameters an alternative hypothesis. Also remember at this stage that, in Topic 1, we
2. sample statistics referred to quantities such as μ as parameters (population parameters). These
3. characteristics of statistical distributions particular statistical hypotheses are, thus, statements about the value of
4. all of the above a particular population parameter.

28 Suppose we have stated H0: µ = 10 and H1: µ < 10, and 3 App D


find that the sample mean corresponds to a z-score of
-3. This means that the corresponding p-value ______

1. need not be found to reach a decision


2. is 0.0026 Z < -3
3. is 0.0013 H1 µ < 10
4. can only be calculated if the sample standard
deviation is known
Because the standard normal curve is symmetrical, p(Z < -3) = p(Z > 3).
Since H1: µ < 10 specifies a direction, the test is one-tailed and we need
the p-value for one tail only.

The p-value is read from appendix D as p(Z < -3) = p(Z > 3) = 0.0013. We
read the 'smaller portion' column in appendix D because we are looking for
the area in the tail. It is a one-tailed probability, so do NOT multiply by 2;
0.0026 would be the p-value for a two-tailed test.
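
The one-tailed and two-tailed p-values for Question 28 can be approximated in Python as a cross-check on the table lookup. The function name lower_tail is illustrative; the sketch relies only on the standard library.

```python
import math


def lower_tail(z):
    """P(Z <= z) for the standard normal distribution."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


z = -3
p_one_tailed = lower_tail(z)     # appropriate for H1: mu < 10
p_two_tailed = 2 * p_one_tailed  # would apply only to H1: mu != 10

print(round(p_one_tailed, 5), round(p_two_tailed, 5))
```

The computed one-tailed value is about 0.00135, which the table in appendix D lists as 0.0013.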
29 The hypothesis “H1: µ > 50" is a (a) ______ hypothesis 4 H1: µ > 50
and requires a (b) ______ statistical test
The ">" indicates a directional hypothesis which requires a one-tailed test
1. (a) non-directional (b) one-tailed
2. (a) directional (b) two-tailed P75-76 The alternative hypothesis can contain any of the symbols '>', '<' or '≠'
3. (a) non-directional (b) two-tailed respectively, the symbols for 'larger than', 'smaller than' or 'not equal to'.
4. (a) directional (b) one-tailed
When a comparison is between a value that is greater (more) than another,
we use the symbol '>' and when a comparison is between a value that is
smaller (less than) than another, we use '<'. The statistical test that must be
performed in either of these cases is a directional or one-tailed statistical
test (we use these expressions interchangeably).

When we do not specify what the direction of the difference should be, and
both a larger and a smaller difference between means are considered as
relevant, the symbol '≠' must be used. The statistical test to be performed will
now be a non-directional or two-tailed test.

P81 The relationship between one-tailed and two-tailed p-values can be


summarised as follows:
• One-tailed p-value = (two-tailed p-value) / 2
• Two-tailed p-value = (one-tailed p-value) x 2

The important point to remember is that the p-value indicates more or less
how likely the particular result we have observed in our data is if the null
hypothesis were true; or, as we say, 'under the null hypothesis'.
30 The level of significance of a statistical test ______ 2 SG P82- The α-value specifies the maximum risk that we are willing to take of making
84 an error if we reject the null hypothesis.
1. refers to the p-value which is calculated from the test
statistic We know that the extent of the type I error that a researcher is willing to make
2. indicates the maximum risk that a researcher is is controlled by the researcher by setting the level of significance (α) in
willing to take of making an error of Type I advance. The probability of a type II error (β) is not controlled in advance by
3. is the probability of obtaining the sample statistic the researcher except for the fact that we know that the lower (smaller) the
under the null hypothesis probability of a type I error (α) the greater (larger) the probability of a type II
4. is used to indicate the probability of making an error error (β).
by not rejecting the null hypothesis
You could eliminate error of type I (rejecting H0 when you should not)
altogether by never rejecting the null hypothesis, irrespective of how small the
p-value becomes, but that would make the probability of not rejecting H0
when you should reject it (error of type II) an absolute certainty.
31 When applying a statistical test, if the p-value is larger 1 SG 82- An error of Type I is the error we make if we reject the null hypothesis when
than the level of significance we ______ the alternative 86 we should not have done so, and the level of significance represents the
hypothesis greatest risk of doing this that we are willing to take.
Tut202
1. do not accept 2014 Q5 We know that the extent of the type I error that a researcher is willing to make
2. fail to reject is controlled by the researcher by setting the level of significance (α) in
3. accept advance. The probability of a type II error (β) is not controlled in advance by
4. cannot make a conclusion about the researcher except for the fact that we know that the lower (smaller) the
probability of a type I error (α) the greater (larger) the probability of a type II
error (β).

You could eliminate error of type I (rejecting H0 when you should not)
altogether by never rejecting the null hypothesis, irrespective of how small the
p-value becomes, but that would make the probability of not rejecting H0
when you should reject it (error of type II) an absolute certainty.

The decision rule for H0 is simply as follows:


If the p-value of the sample result is smaller (less) than α (level of
significance), the null hypothesis is rejected. If the p-value is not
smaller than α, the null hypothesis (H0) is not rejected.

Similarly, the decision rule for H1 is simply as follows:


If the p-value of the sample result is smaller than α, the alternative
hypothesis is accepted. If the p-value is not smaller than α (as in this
question, where it is larger), the alternative hypothesis is not accepted.
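
The decision rule can be written as a short Python function. This is a minimal sketch; the function name decide and the wording of the returned strings are illustrative, not taken from the study guide.

```python
def decide(p_value, alpha):
    """Apply the decision rule: reject H0 only if the p-value is smaller than alpha."""
    if p_value < alpha:
        return "reject H0 (H1 accepted)"
    return "do not reject H0 (H1 not accepted)"


print(decide(0.036, 0.05))  # reject H0 (H1 accepted)
print(decide(0.036, 0.01))  # do not reject H0 (H1 not accepted)
```

The same p-value can lead to different decisions depending on the level of significance that was set in advance.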
32 A type II error occurs when ______ 2 SG 82- An error of Type I is the error we make if we reject the null hypothesis when
86 we should not have done so, and the level of significance represents the
1. the null hypothesis is rejected when it should not be greatest risk of doing this that we are willing to take.
rejected Tut202
2. the null hypothesis is not rejected when it should be 2014 Q5 We know that the extent of the type I error that a researcher is willing to make
rejected is controlled by the researcher by setting the level of significance (α) in
3. the null hypothesis is wrongly not rejected advance. The probability of a type II error (β) is not controlled in advance by
4. the alternative hypothesis is not accepted when it the researcher except for the fact that we know that the lower (smaller)
should be accepted the probability of a type I error (α) the greater (larger) the probability of a
type II error (β).

You could eliminate error of type I (rejecting H0 when you should not)
altogether by never rejecting the null hypothesis, irrespective of how small the
p-value becomes, but that would make the probability of not rejecting H0
when you should reject it (error of type II) an absolute certainty.
Base your answers to Questions 33 to 37 on the following scenario:

Rose is interested in the problem of depth perception. She wonders whether artists who practise visual arts, and who are known to have made a study of
the problem of perspective, would be better at judging depth than people in general. She decides to investigate this using a test for depth perception
which was standardized on the general population with a mean of 5, where a greater number implies better depth perception on a scale of 1 to 9. She
randomly draws 100 students who had graduated from a class on perspective at a school for fine arts and tests each of them on the depth perception
test. She finds that the mean depth perception score of her sample is 6.2 and the sample standard deviation is 1.7

33 How would you describe the population investigated in 3 She randomly draws 100 students who had graduated from a class on
this research? perspective at a school for fine arts

1. The general population


2. Artists who studied perspective
3. Artists who had studied perspective at a specific
school for fine arts
4. Artists who had completed the test for depth
perception

34 Which of the following best describes the research or 2 She wonders whether artists who practise visual arts, and who are known
theoretical hypothesis to be tested? to have made a study of the problem of perspective, would be better at
judging depth than people in general
1. Depth perception is related to artistic ability
2. Visual artists have a superior ability for depth
perception to people in general
3. Students from the school of visual arts have better
depth perception than the general population
4. The relationship between depth perception and
artistic ability is statistically significant

35 Which of the following are appropriate null and 3 H0 always has "=" sign so option 4 is incorrect.
alternative hypotheses?
The hypothesis Rose made was " She wonders whether artists who practise
visual arts, and who are known to have made a study of the problem of
1. H0: μ = 5 H1: μ < 5 perspective, would be better at judging depth than people in general"
2. H0: μ = 5 H1: μ ≠ 5
3. H0: μ = 5 H1: μ > 5 Therefore, H1 must state that the mean is greater than the population mean (5):
4. H0: μ ≠ 5 H1: μ > 5 H1: μ > 5

So the correct hypotheses are H0: μ = 5 H1: μ > 5


36 Which is the correct value of the standard deviation of 4 The given standard deviation was extracted from the sample of 100 so we
the sampling distribution of the mean of the depth use s and not σ.
perception scores?
The formula for standard deviation of the sampling distribution is the standard
1. 17 error which is sx̄ = s/√n
2. 2.0 = 1.7/√100
3. 0.017 = 1.7/10
4. 0.17 = 0.17

37 Which is the appropriate test statistic to calculate? 2 P102- The t-statistic for the mean of a single sample. This is because the standard
106 deviation is unknown. What is given was extracted from a sample of 100.
1. The t-statistic for the difference between the means
of two independent groups In this question the population standard deviation (σ) is considered to be
2. The t-statistic for the mean of a single group unknown because the given standard deviation comes from the sample. So
3. The z-statistic for the mean of a single group we have to use the t-test (t)
4. The t-statistic for the difference between the means
of two dependent groups The important point is that - as in the case of the z-distribution - the t-
distribution is a statistical distribution with a probability distribution that can be
determined, which means that we can use it to predict the chances of
obtaining specific outcomes when testing for comparisons of means when the
population standard deviation σ is unknown.
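
Using the figures from the scenario (sample mean 6.2, s = 1.7, n = 100, hypothesised mean 5), the single-sample t-statistic can be sketched in Python. The function name one_sample_t is illustrative, and the exam questions themselves do not ask for the value of t; this is only a worked illustration of the formula t = (x̄ - μ) / (s / √n).

```python
import math


def one_sample_t(sample_mean, mu0, s, n):
    """t-statistic for a single sample mean, with its degrees of freedom (n - 1)."""
    standard_error = s / math.sqrt(n)  # estimated from the sample standard deviation
    t = (sample_mean - mu0) / standard_error
    return t, n - 1


t, df = one_sample_t(sample_mean=6.2, mu0=5, s=1.7, n=100)
print(round(t, 2), df)  # 7.06 99
```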

38 When two population means are compared, the p- 1 P81 Here is a summary of the important points regarding the p-value:
value is calculated to represent the probability of • The p-value gives the probability of obtaining the sample result under H0.
observing a specific difference between the sample • If the p-value is very small, the probability is very small that the sample
means given that ______ result would occur under H0, and one should consider rejecting H0 in
favour of H1.
1. H0 is true • The smaller the p-value, the more likely that the null hypothesis is false
2. H1 is true and should be rejected in favour of the alternative hypothesis.
3. H0 is false
4. H1 is false So, if the p-value is very large, the probability is high that the sample
result would occur under H0, and one should not reject H0 in favour
of H1. The null hypothesis is then retained.

The important point to remember is that the p-value indicates more or


less how likely the particular result we have observed in our data is if
the null hypothesis were true; or, as we say, 'under the null hypothesis'.
39 A psychologically unimportant result may turn out to be 2 SG P84- By setting a low level of significance, we reduce the probability of a type I
statistically significant if the researcher ______ 85 error. This by implication makes the probability of accepting the alternative
hypothesis smaller. By setting a low level of significance we unfortunately
1. sets a low level of significance increase the probability of not rejecting a null hypothesis when it should be
2. uses a large sample rejected because it is false
3. reduces the probability of a type I error
4. changes the effect size P33 The main reason why such a seemingly slight difference could be so
statistically significant stems from the large sample size - the larger the
sample size, the closer the observed frequency can be expected to be to the
true probability.

P86 Effect size: A major determinant of the sensitivity or power of a statistical test
is sample size (which is why we can increase sample size to enhance
power). When the sample is large, even smaller effects will have statistical
significance. The reason is that the larger the sample, the less error variance
can be expected (variance purely due to randomness). This is due to a
principle called the law of large numbers, which states that on average the
result obtained from a large number of trials should be close to the expected
value, and will tend to become closer as more trials are performed (this law is
described in section 2.1.2). This implies that when sample sizes are large,
even sample effects that seem insignificant can produce small p-values,
leading to the rejection of H0.

P88 Effect size, power and sample size are interrelated; you can determine one if
you have information regarding the other two. For example, if you set your
desired effect size and know the power of the test, you can use this to
determine what an optimal sample size would be to use the test effectively.
40 The mean score of a sample of research participants is 1 p-value = 0.036
compared with a population mean of 20 for a particular α = 0.01
questionnaire which measures anxiety level. The
following hypothesis is set up to be tested p-value(0.036) > α(0.01)

H0: μ = 20 Since the p-value(0.036) is greater than the level of significance (0.01), we do
H1: μ ≠ 20 not reject the null hypothesis .

A researcher draws a random sample of 25 persons Therefore: H0: μ = 20 (or very close to 20)
and calculates the mean score and the standard
deviation of this sample. This is used to calculate a t-
test statistic to test the hypothesis at a significance The steps we would apply are firstly the decision rule based on the p-value.
level of α = 0.01. If a p-value of p = 0.036 is found, We decide that we will not reject H0. So we have already been given that H0:
which of the following statements about the mean = 20.
which was calculated from this sample is most likely to
be true? If we look at the options provided: if we choose option 2 we would be saying
that we will consider the alternate hypothesis, we have already decided not to
1. It is close to 20 do that. So it can't be option 2. The same concept applies to option 3. And
2. It differs significantly from 20 option 4 is not correct because we have been provided with all the
3. It is definitely not equal to 20 information required to apply the decision rule: we have the
4. There is not sufficient information given to estimate significance level and the p-value.
it
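
The decision rule applied in Question 40 can be sketched in a few lines of Python. This is only an illustration: the variable names are hypothetical and the values are those given in the question.

```python
# Decision rule for a significance test (values taken from Question 40).
p_value = 0.036   # p-value reported for the t-test
alpha = 0.01      # level of significance chosen beforehand

# H0 is rejected only when the p-value is smaller than alpha.
reject_h0 = p_value < alpha

if reject_h0:
    decision = "reject H0"
else:
    decision = "do not reject H0"

print(decision)
```

Because 0.036 is larger than 0.01, the sketch ends in the "do not reject H0" branch, which matches the reasoning above.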

41 When two means are compared, the p-value 4 Tut202 The null hypothesis states that there is no difference in the means calculated
expresses the probability that a difference between the 2014 Q8 from samples of data from each of groups A and B. When we calculate the
means ______ two means from sample data (which we regard as an observation) we may
find a difference in the two calculated means, but at least part of this
1. Will be significant difference could be due to measurement errors. We calculate the p-value
2. is due to the alternative hypothesis (based on a test statistic with a known probability distribution) to find
3. will be found between the means out what the probability is that these observed differences in the
4. is due to chance or sampling error sample data are just a consequence of measurement error if the null
hypothesis is assumed to be true. If this probability is low (lower than a
pre-determined cut-off level, α), we conclude that the difference in the two
means is statistically significant because the probability that the null
hypothesis is true is very small.

In other words, we conclude that the size of the difference between means
found in the sample data would not be likely if the null hypothesis were true.

Therefore: The sample result is more probable under the alternative
hypothesis
42 Which symbol is conventionally used to indicate the 1 P83 α = level of significance
value of the maximum probability that an error would P85 β = the error of failing to reject a null hypothesis that is really false
be made if the null hypothesis is rejected which a P29 p = probability
particular researcher is willing to allow? P79 σ = standard deviation

1. α α indicates the risk that a researcher is prepared to take of making a
2. β Type I error (rejecting H0 when the researcher should not do so),
3. p β is used to indicate the opposite risk - the risk he (or she) is taking of making
4. σ a Type II error (not rejecting H0 when in fact he should).

43 Cohen's d refers to the ______ 2 P87 One way that statisticians have suggested to deal with this problem is by the
notion of effect size. Different procedures exist to determine the effect size of
1. difference score when two means from dependent a result. In the case of a comparison between means, one way of calculating
samples are compared this is by the use of Cohen's d. We do this by expressing the mean difference
2. effect size that we observed relative to the standard deviation: d = (ẋ1 - ẋ2) / s
3. power of a test
4. amount of variance shared by two variables when
they are correlated

44 Effect size is calculated to determine ______ 4 The effect size is to assess whether a significant effect is meaningful from a
practical point of view.
1. whether an effect is statistically significant or not
2. the ability of a statistical test to detect a significant P87 The implication is that we have to be careful how we interpret significant
relationship between variables when such a results. A p-value of smaller than our chosen level of significance (α) simply
relationship does in fact exist implies that, relative to this sample, it is improbable that the effect we see in
3. the level of confidence one can reach that the test our observations is purely due to chance. It does not imply that the effect is
is valid big or important. This is something that we have to decide by looking at what
4. whether a significant effect is meaningful from a the data means. One way that statisticians have suggested to deal with this
practical point of view problem is by the notion of effect size. Different procedures exist to determine
the effect size of a result. In the case of a comparison between means, one
way of calculating this is by the use of Cohen's d. We do this by expressing
the mean difference that we observed relative to the standard deviation: d = (ẋ1 - ẋ2) / s
45 A random sample of n=100 people are tested to see 2 Probability (p-value) = ??
how many items they can recall from a list with pictures Mean () = 7
of 12 items. The distribution of the results is found to Std Dev (s) = 2.0
be more or less normal with a mean of ẋ = 7 and a Sample (n) = 100
standard deviation of s = 2.0. What is the probability Raw score (X) = 10
that a specific person, chosen at random from the
general population, will remember 10 or more items First calculate the z-score using
from the list?

1. Less than 0.05 We do not use z-transformation formula


2. Between 0.05 and 0.1
3. Between 0.1 and 0.5 P55-56
4. Greater than 0.5 In practice, we are rarely able to calculate the mean of a population of scores
and the standard deviation of the population σ, because we seldom have
population scores available. In such cases we can draw a representative
sample from the population and use the sample statistics ẋ and s to calculate
z, as follows:

z = (10 - 7) / 2
z=3/2
z = 1.5

Now you have standardised the normal distribution so the mean is 0 and the
std dev is 1. When you look up the z-score (1.5) in the standard normal
distribution tables (Appendix D) you will see the larger and smaller portion
values. Larger portion is 0.9332 (93%) and smaller is 0.0668 (7%)

So now you finally have enough information to answer the question.


What is the probability that a specific person, chosen at random from the
general population will remember 10 or more items from the list?
So 10 or more will be the smaller portion of the graph (to the right of the z-
score) so there is a 0.0668 (0.07 or 7%) chance (probability) they will get 10
or more items. The probability of 0.07 is between 0.05 and 0.1.
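
The table lookup in Question 45 can be checked with Python's standard library. This is a minimal sketch, assuming (as the explanation does) that the scores are approximately normal and that the sample values stand in for the population values. The variable names are hypothetical.

```python
from statistics import NormalDist

mean = 7.0   # sample mean, used as an estimate of the population mean
sd = 2.0     # sample standard deviation
x = 10.0     # raw score of interest

# Standardise the raw score.
z = (x - mean) / sd

# The area to the right of z is the "smaller portion" in the table.
p_upper = 1.0 - NormalDist().cdf(z)

print(f"z = {z:.2f}, P(X >= 10) = {p_upper:.4f}")
# prints: z = 1.50, P(X >= 10) = 0.0668
```

The result agrees with the smaller portion read from Appendix D and falls between 0.05 and 0.1.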
46 Under which condition would a researcher use a t- 1 P100- The t-distribution is a statistical distribution with a probability distribution that
statistic to test a hypothesis about an unknown 106 can be determined, which means that we can use it to predict the chances of
population mean µ? obtaining specific outcomes when testing for comparisons of means when the
The value of the (a) ______is (b) ______ population standard deviation σ is unknown.
So we have to use the t-test (t) when the population standard deviation
1. (a) population standard deviation σ, (σ) is considered to be unknown - because the given standard deviation
(b) unknown comes from the sample.
2. (a) standard error sẋ,
(b) unknown
3. (a) population standard deviation σ, When the population standard deviation (σ) is known we use the z-test
(b) known (z)
4. (a) standard error sẋ,
(b) known

Base your answers to Questions 47 to 49 on the following scenario

Suppose that the memory span of adults is normally distributed with a mean of 7 items and a standard deviation of 2 items. A researcher is investigating
the impairment of memory among persons who have been diagnosed as suffering from Korsakoff's syndrome (a neurological disorder linked to chronic
alcohol abuse). He intends to test his prediction on a sample of 50 persons who were diagnosed as suffering from this syndrome

47 Which of the following is an appropriate null hypothesis 2 P73-75 The null hypothesis will always contain equal signs. In this case H0 : μ = 7.
for testing the above prediction?
H0 is defined as the hypothesis of no effect.
1. The mean memory span of the population of
persons suffering from Korsakoff's syndrome is • The null hypothesis (H0) represents the status quo or the current belief in
smaller than 7 a situation. The null hypothesis will always contain equal signs.
2. The mean memory span of the population of • The alternative hypothesis (H1) is the opposite of the null hypothesis and
persons suffering from Korsakoff's syndrome is represents a research claim or specific inference you would like to prove.
equal to 7 This means that the alternative hypothesis takes the sign of the test
3. The mean memory span of the population of depending on the situation.
persons suffering from Korsakoff’s syndrome is not o If we are testing the difference, H1 is indicated with ≠.
equal to 7 o Otherwise we can use signs like less than (<) or greater than (>)
4. The mean memory span of the sample of persons depending on the problem statement.
suffering from Korsakoff's syndrome is equal to 7 • If you reject H0, you have statistical proof that the alternative is correct.
• If you do not reject H0, you have failed to prove that the alternative
hypothesis is correct. Failure to prove the alternative hypothesis does not
necessarily mean that the null hypothesis is true.
• The null hypothesis (H0) always refers to a specific value of a parameter
(such as μ), not a statistic (such as ẋ). This value is always known or will
come from the given scenario.
48 Which of the following is an appropriate alternative 1 A researcher is investigating the impairment of memory among persons who
hypothesis for testing the above prediction? have been diagnosed as suffering from Korsakoff's syndrome.

1. The mean memory span of the population of "Impairment" indicates less than or smaller than, so H1 : μ < 7
persons suffering from Korsakoff's syndrome is
smaller than 7 Therefore: The mean memory span of the population of persons suffering from
2. The mean memory span of the population of Korsakoff's syndrome is smaller than 7
persons suffering from Korsakoff's syndrome is
equal to 7
3. The mean memory span of the population of
persons suffering from Korsakoff's syndrome is not
equal to 7
4. The mean memory span of the sample of persons
suffering from Korsakoff's syndrome is smaller than 7

49 Testing the above prediction on a sample will require a 3 P75 Directional because H1 : μ < 7. It is also one-tailed because it focuses only on
______ statistical test smaller than 7 and not larger than 7 as well.

1. non-directional Two-tailed is when H1 : μ ≠ 7. Now the focus will be on smaller than and larger
2. two-tailed than 7 results
3. directional
4. non-parametric
50 A pharmaceutical company claims that a new sleeping 2 P81 The relationship between one-tailed and two-tailed p-values can be
pill which they are marketing will put people to sleep in summarised as follows:
less than 15 minutes. A researcher wants to test if the • One-tailed p-value = (two-tailed p-value) / 2
average time before people fall asleep after using this • Two-tailed p-value = (one-tailed p-value) x 2
pill matches this claim. She uses the following
hypothesis The important point to remember is that the p-value indicates more or less
how likely the particular result we have observed in our data is if the null
H0: μ = 15 hypothesis were true; or, as we say, 'under the null hypothesis'.
H1: μ ˂ 15

Suppose she tests this on a random sample of n = 40 Therefore:
research participants who suffer from insomnia. She One-tailed p-value = two-tailed p-value divided by two
finds that the mean time before members of the So one-tailed p-value = 0.03450 / 2 = 0.01725
sample falls asleep after using the pill is 14.3 minutes
with a standard deviation of 3.2. A subsequent t-test
produces a two-tailed p-value of 0.0345 and the level
of significance was set at 0.05. What is the value of the
one-tailed or directional p-value?

1. 0.03450
2. 0.01725
3. 0.06900
4. Insufficient information is given to determine this
value
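
The conversion used in Question 50 can be sketched as follows. The halving rule comes from the study guide. The direction check is an added assumption of this sketch: halving applies when the sample mean lies on the side predicted by H1, and for a symmetric test statistic the one-tailed p-value is 1 - (two-tailed p-value) / 2 otherwise. Variable names are hypothetical.

```python
two_tailed_p = 0.0345   # two-tailed p-value reported for the t-test
sample_mean = 14.3      # observed mean time to fall asleep (minutes)
h0_mean = 15.0          # value of mu under H0; H1 states mu < 15

if sample_mean < h0_mean:
    # The result lies in the direction predicted by H1: halve the p-value.
    one_tailed_p = two_tailed_p / 2
else:
    # Assumption of this sketch: result in the opposite direction.
    one_tailed_p = 1 - two_tailed_p / 2

print(round(one_tailed_p, 5))
```

Here the sample mean of 14.3 is below 15, so the one-tailed p-value is simply half of 0.0345.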

51 A researcher wants to compare the mean of the non- 4 P61-62 The standard error is an extremely valuable measure because we can use it
verbal reasoning scores of a sample of n=25 students to estimate how well a sample mean approximates its population mean in
with that of the general population. According to the general, that is, how much error you can expect on average between the
literature, the non-verbal reasoning test which she sample mean (ẋ) that you calculated from your sample and the population
uses was standardized to a population mean of μ = mean (μ) that you are trying to estimate.
100 and a population standard deviation of σ =10.
What is the value of the standard deviation of the In other words, it is an indication of the size of the error that you make by
sampling distribution of the mean which will be using a sample of a particular size (n) to determine the population mean. This
required to calculate the zẋ test statistic? amount of error will decrease as the size of the sample increases.

1. 2.5 σ = σ/√n = 10/√25 = 10/5 = 2


2. 0.4
3. 10
4. 2
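
The standard error calculation in Question 51 can be reproduced with a short Python sketch. The variable names are hypothetical. The values are those given in the question.

```python
import math

sigma = 10.0   # population standard deviation
n = 25         # sample size

# Standard deviation of the sampling distribution of the mean.
standard_error = sigma / math.sqrt(n)

print(standard_error)   # 2.0
```

Increasing n in this sketch shows the point made above: the standard error shrinks as the sample gets larger.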
52 What does it mean to say "the difference between the 2 Tut202 The null hypothesis states that there is no difference in the means calculated
means of groups A and B is statistically significant? 2014 Q8 from samples of data from each of groups A and B. When we calculate the
two means from sample data (which we regard as an observation) we may
1. The null hypothesis adequately explains the results find a difference in the two calculated means, but at least part of this
2. The alternative hypothesis is true difference could be due to measurement errors. We calculate the p-value
3. The alternative hypothesis should be rejected (based on a test statistic with a known probability distribution) to find out what
4. The null hypothesis cannot be rejected the probability is that these observed differences in the sample data are
just a consequence of measurement error if the null hypothesis is assumed to
be true. If this probability is low (lower than a pre-determined cut-off level, α),
we conclude that the difference in the two means is statistically significant
because the probability that the null hypothesis is true is very small.

In other words, we conclude that the size of the difference between means
found in the sample data would not be likely if the null hypothesis were true.

Therefore: The sample result is more probable under the alternative
hypothesis

Base your answers to Questions 53 and 54 on the following scenario

A market researcher is asked to conduct a study to examine people’s reaction to a movie trailer. He draws a random sample of 20 males and 20 females
who saw the trailer. He asks them to indicate how likely it is that they will go and see the movie on a 7-point scale, where 1 indicates 'not at all' and 7
indicates 'definitely’. He wants to compare to establish whether males and females differ in their intention to see the movie based on an exposure to the
trailer.

Suppose the researcher finds that the mean and standard deviations for each group in the sample is as follows

Males ẋM = 5.7 SM = 2.1
Females ẋF = 4.19 SF = 1.6
53 Which are the appropriate statistical hypotheses for 3 P11-18 He wants to compare to establish whether males and females differ in their
testing the researcher's hypothesis? P21-26 intention to see the movie based on an exposure to the trailer.

1. H0: ẋM = ẋF H1: ẋM ≠ ẋF The word "differ" does not indicate a direction and therefore the alternative
2. H0: μM = μF H1: μM ˃ μF hypothesis must have a "≠" sign.
3. H0: μM = μF H1: μM ≠ μF Hypotheses are tested on population parameters only, therefore only "μ","σ"
4. H0: μ = 0 H1: μ ≠ 0 and "p" can be used. A hypothesis is not stated for samples or statistics ("ẋ"
or "s").

A hypothesis is a statement of relationships among variables, not about the
nature of variables.

P161
Summary value                                    Population     Sample
                                                 (Parameter)    (Statistic)
Arithmetic mean                                  μ              ẋ
Standard deviation                               σ              s
Variance                                         σ²             s² (s = √s²)
Standard error of the mean (also called the      σẋ (= σ/√n)    sẋ (= s/√n)
  standard deviation of the sampling
  distribution of the mean)
Mean of the sampling distribution of the mean    μ
  (mean of all means; equal to the mean
  under H0)
Z score for means                                zẋ
Correlation between two measurements             ρ              r
  (Pearson's r)
Proportions                                      P              p
54 Which is the appropriate test statistic to calculate? 2 P110 Samples are considered as comprising independent groups if the
composition of the one sample in no way affects, in any systematic way, the
1. The z-statistic for the difference between the means composition of the other sample. The two samples come from two groups
of two samples that have no obvious relationship. For example, where one sample is
2. The t-statistic for the difference between the means measurements of a construct like 'self-esteem' among men, and the other
of two independent samples among women, but both groups were sampled purely randomly.
3. The t-statistic for the mean of a single sample
4. The t-statistic for the difference between the means On the other hand, the concept of dependent groups refers to situations
of two dependent samples where the samples are related, and it implies that each subject in one group
can be systematically paired off with a subject from the other group. For this
reason, a dependent groups research design is often referred to as a
matched-pairs design.

55 A researcher is asked by a motivational speaker to 2 P120 This t-test statistic (td) is used for the comparison of means from two matched
establish whether a workshop on assertiveness or dependent samples.
training is effective. The researcher decides to use a
particular questionnaire which tests an individual's P110 The concept of dependent groups refers to situations where the samples are
level of assertiveness. He presents the questionnaire related, and it implies that each subject in one group can be systematically
to each of a sample of 50 participants in the workshop paired off with a subject from the other group. For this reason, a dependent
before it begins and once again after it has ended. groups research design is often referred to as a matched-pairs design.
When analysing these results the researcher should
use a statistical test for the ______ Another example of such a design would be a repeated measures design,
where the same research participant is observed under more than one
1. comparison of means for a single group treatment or experimental condition. For example, to test the effectiveness of
2. comparison of means for two dependent groups a psychotherapy technique, people can be tested before the treatment
3. comparison of means for two independent groups begins, and again afterwards. The two sets of measurement (indicated by two
4. correlation of two variables variables) can be regarded as two samples of data, which are to be compared
to see whether some kind of change has taken place. Dependent samples
are also sometimes referred to as correlated samples

NOTE: Make sure that you do not confuse the notion of dependent versus
independent samples with the distinction between dependent and
independent variables (Topic 1, section 1.3.2). While the latter refers to the
relationships among variables - how one may affect the other - in the case of
samples it is a relationship among the groups from which the data were
collected (i.e., where the variables were measured) that is of concern.
56 The probability under the null hypothesis of obtaining a 2 SG P81 A one-tailed p-value (used in the case of a directional hypothesis) is half the
t-value of 2.0 or higher in the case of a two-tailed test size of a two-tailed probability.
is ______ that for a one-tailed test Tut201
2014 Conversely, a two-tailed p-value (used in the case of a non-directional
Q12 hypothesis) is twice the size of a one-tailed p-value.
1. the same as
2. twice The relationship between one-tailed and two-tailed p-values can be
3. half summarised as follows:
4. impossible to calculate from • One-tailed p-value = (two-tailed p-value) / 2
• Two-tailed p-value = (one-tailed p-value) x 2

57 When calculating the t-test for two independent 4 P113 In order to use the t-test (tc) statistic, we need to make two assumptions
samples, which of the following assumptions must be regarding the data:
made if the sample sizes are relatively small? • that the two populations being compared are normally distributed
• with the same variance (or standard deviation).
a) the value of σ is known for both populations
b) the two population means differ P116 Note: Even the most elementary statistics program makes provision for
c) the two populations have the same variance performing t-tests. Such programs usually require that we indicate which
d) the two samples come from normally distributed variable should be used to identify the two groups and which is the dependent
data variable. In addition, we have to choose between a tc test for independent
samples or a td test for dependent or correlated groups
1. (a) and (d)
2. (b) and (c)
3. (a) and (c)
4. (c) and (d)
58 A sample of 70 people is tested on a test for 4 P119- "70 people is tested on a test for assertiveness before and after a
assertiveness before and after a workshop in which 120 workshop."
they are given assertiveness training. Which of the
following is the most appropriate formula for comparing We therefore know that we are dealing with two matched or dependent
the mean assertiveness score before the training with samples. We have to use the td test
the one thereafter?

1.

2.

3.

4.

Base your answers to Questions 59 and 60 on the following scenario:

A researcher compares a sample of children from a special school for gifted children with a group of children randomly drawn from other schools on a test
which measures the creativity of the children on a 9-point scale. She finds the following:

Group 1 ('gifted' children) n1 = 40, ẋ1 = 5.5, s1 = 1.2
Group 2 (other children) n2 = 62, ẋ2 = 4.9, s2 = 0.8
All children pooled n = 102, ẋ = 5.1, s = 1.0
She calculates a t-test statistic of t=3.37 and finds that p=0.0006, which she finds to be significant on the level of α = 0.01
59 Even though the result described in the scenario is 1 P86 Effect size: A major determinant of the sensitivity or power of a statistical test
statistically significant, the researcher is unsure is sample size (which is why we can increase sample size to enhance
whether the difference between the means is large power). When the sample is large, even smaller effects will have statistical
enough to be of practical importance. Which of the significance. The reason is that the larger the sample, the less error variance
following strategies are the most appropriate to get a can be expected (variance purely due to randomness). This is due to a
better idea of the usefulness of the result? principle called the law of large numbers, which states that on average the
result obtained from a large number of trials should be close to the expected
1. Calculating the effect size value, and will tend to become closer as more trials are performed (this law is
2. Calculating the correlation coefficient described in section 2.1.2). This implies that when sample sizes are large,
3. Calculating the power of the test even sample effects that seem insignificant can produce small p-values,
4. The low p-value is sufficient to show that the result leading to the rejection of H0.
is important In practical terms

60 If the researcher calculates the value implied in the 3 P116 ẋ1 = 5.5
previous question, what would the absolute value ẋ2 = 4.9
(ignoring the sign) of the result be? sp = s = 1.0

1. Between 0.0 and 0.3 d = (ẋ1 - ẋ2) / sp = (5.5 - 4.9) / 1.0 = 0.6 / 1 = 0.6
2. Between 0.3 and 0.5
3. Between 0.5 and 0.8 d = 0.6 which lies between 0.5 and 0.8
4. Greater than 0.8
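
The Cohen's d calculation for Question 60 can be written out as a small Python sketch. It follows the explanation above in using the standard deviation of all children pooled (s = 1.0) as sp. The variable names are hypothetical.

```python
mean_gifted = 5.5   # mean creativity score, group 1
mean_other = 4.9    # mean creativity score, group 2
s_pooled = 1.0      # standard deviation of all children pooled

# Effect size: mean difference expressed in standard deviation units.
d = abs(mean_gifted - mean_other) / s_pooled

print(round(d, 2))   # 0.6
```

A value of 0.6 lies between 0.5 and 0.8, which corresponds to option 3.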

61 A scatter plot is a graphical representation of the 2 SG A graph showing the position of each of a number of sampling units on each
relation between ______ P130- of two variables
132
1. two variables measured on a nominal scale within a A scatter plot is a graph showing the relationship between two numerical
single group Tut202 variables. In such a graph the data of the one variable are plotted on the
2. two variables measured on a ratio or interval scale 2014 horizontal axis (usually referred to as the X axis), and the data of the other
within a single group Q18 variable on the vertical (or Y) axis. It is not a comparison of sample and
3. two groups of subjects measured on an interval or population, nor has it to do with spread of data or the independence of
ratio scale on a single variable variables
4. two groups of subjects measured on an interval or
ratio scale on two variables
62 A researcher obtains a correlation coefficient of 0.40 3 P130- Pearson product-moment correlation coefficient - the notion of the
between IQ scores and examination marks in a 140 relationship between two continuous variables and how the size of the
random sample of 10 PYC3704 students, and again a relationship can be expressed in terms of a correlation between them. This
correlation coefficient of 0.40 between the same two coefficient can also be used as a test statistic.
variables on another random sample of 100 PYC3704
students. Which of these two correlation coefficients is Correlation is a measurement of the extent to which a measurement on one
the more likely to differ significantly from zero under variable is related to a measurement on another variable for the same sample
the null hypothesis? of individual cases.

1. That obtained on the smaller sample P139 One should, however, be careful as to how one interprets a significant result.
2. Both are equally likely to be significant To clarify this, consider the relationship between the calculated significance
3. That obtained on the larger sample (the p-value) and the sample size (n).
4. There is no relationship between the size of the
correlation coefficient and significance For a smaller sample n, the test must be much more conservative. You must,
therefore, put up a bigger hurdle to be crossed before you conclude that the
result is not the consequence of chance. You, therefore, require a larger
value of r before you can conclude that the result is not a chance event due
to sampling or measurement error, but an actual representation of the state of
affairs in the population. The consequence of this is that, for a large sample,
a relatively modest correlation can turn out to be significant. For example, for
a sample of n = 40 (as in the HIV/AIDS research project in Appendix A), the
value of r must be at least r = 0.26 for α = 0.05 (a 5% level). If we increase
the sample size to 100, a smaller result of r = 0.16 would be significant at the
same level of α = 0.05. This shows that, for a large value of n, a very modest
r can be significant. The implication of this is that significance does not
indicate that a relationship is large. It merely tells you that some relationship
exists (perhaps a modest one), and that it is large enough not to be regarded
as purely due to the effect of chance, given the size of the sample.
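
The effect of sample size in Question 62 can be illustrated by converting each correlation to a t statistic. This sketch is not taken from the study guide: it assumes the standard conversion t = r√(n - 2) / √(1 - r²) with n - 2 degrees of freedom, and the function name is hypothetical.

```python
import math

def t_from_r(r, n):
    """Convert a Pearson correlation r from a sample of size n to a t statistic."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

t_small = t_from_r(0.40, 10)    # r = 0.40 in the sample of 10 students
t_large = t_from_r(0.40, 100)   # r = 0.40 in the sample of 100 students

print(f"n = 10:  t = {t_small:.2f}")
print(f"n = 100: t = {t_large:.2f}")
```

The same r of 0.40 gives a much larger t statistic for n = 100 (about 4.3) than for n = 10 (about 1.2), so the correlation from the larger sample is the more likely one to be significant.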
63 Which of the combinations of the options below can be 2 The question reads: "when a significant negative correlation is found"
substituted in the following sentence to describe the
situation when a significant negative correlation is P133 When positive relationships occur, this implies that as one variable gets
found between two variables X and Y? larger, so does the other.

A person who scores ______ on variable X is likely to When negative relationships occur, this implies that as one variable gets
have a ______ score on variable Y larger, the other gets smaller.
(a) low, low
(b) low, high
(c) high, low
(d) high, high

1. (a) and (d)


2. (b) and (c)
3. (a) and (c)
4. (c) and (d)

64 A researcher wants to establish whether the type of 4 SG P140 The chi-square test is usually used when you have a cross tabulation of
employment category that is filled by employees of a frequency counts of events which are nominal scale measurements. This
particular company (manager, middle manager, clerical Tut202 table is referred to as a contingency table. It is used to compare an observed
worker, or technical worker) is at all related to their 2014 frequency distribution (frequency counts based on a sample of observation)
gender (male or female). Which would be the most Q22 with the frequency distribution which we would expect to find if the null
appropriate test to use? hypothesis of no relationship between two cross-tabulated variables were
true. The variables involved are qualitative in nature.
1. The t-test for two independent samples
2. Pearson’s correlation test statistic
3. The t-test for two dependent samples
4. The Chi-square (χ²) test statistic

Base your answers to Questions 65 and 66 on the following scenario-

A group of hospitalized patients who have been diagnosed as suffering from schizophrenia are treated with certain drugs over a period of time. These
drugs were prescribed to improve their mental alertness. A researcher studies a random sample of 30 of these patients who have been on these drugs for
varying amounts of time, hoping to establish a relationship between the number of days of drug treatment and patients’ scores on a Mental Alertness Test
65 Which is an appropriate null hypothesis for this 1 P161 ρ = Correlation between two measurements for population parameters
research? r = Correlation between two measurements for sample statistics
μ = population mean
1. ρ=0
2. μ1 = 0 This is correlation/relationship between patients at various stages/days of
3. r=0 drug treatment and patient's scores on a mental alertness test. In this case,
4. μ1 = μ2 we cannot select r=0 because the hypothesis is tested on population
parameters.

Summary value                                    Population     Sample
                                                 (Parameter)    (Statistic)
Arithmetic mean                                  μ              ẋ
Standard deviation                               σ              s
Variance                                         σ²             s² (s = √s²)
Standard error of the mean (also called the      σẋ (= σ/√n)    sẋ (= s/√n)
  standard deviation of the sampling
  distribution of the mean)
Mean of the sampling distribution of the mean    μ
  (mean of all means; equal to the mean
  under H0)
Z score for means                                zẋ
Correlation between two measurements             ρ              r
  (Pearson's r)
Proportions                                      P              p
66 Which is an appropriate alternative hypothesis for this 1 P161 The question does not state "better than", "more than" or any other directional
research? alternative. It is merely trying to establish whether a relationship exists. It is
therefore non-directional (≠) and needs a two-tailed test.
1. ρ≠0
2. μ1 ≠ μ2 Since the hypothesis is tested on population parameters as established in the
3. p>0 previous question, option 1 must be correct.
4. r>0
67 What is the expected frequency in cell AX of the 2 P143- It is important to note that the relation between the variables is described by
following Contingency table? 144 the cell and not by the row or column frequencies. These cell frequencies
represent the way the information is distributed relative to the two variables.
X Y These cell frequencies are often referred to as the observed or empirical cell
A 7 3 frequencies.
B 3 7
To find the expected frequency for a particular cell, the row total for that row
1. 3 is multiplied by the column total for that column and this result is then divided
2. 5 by the overall total. These expected frequencies show what the results would
3. 7 have been like if the distribution of frequencies through the cells were
4. 20 homogeneous, in proportion to the respective row and column totals. If the
observed frequencies correspond precisely with the expected frequencies,
we know that the null hypothesis cannot be rejected. But the observed
frequencies will seldom be precisely equal to the expected frequencies - even
if H0 is not rejected - because of sampling error.

It is the differences between these expected and observed frequencies that


interest us, that is, we want to know how far the actual (observed) results are
removed from the expected situation, if there is no interaction effect.

X Y Total
A 7 3 10
B 3 7 10
Total 10 10 20
Row total (O1.) = 10
Column total (O.1) = 10
Sample total (size) (O..) = 20

E11 = (Row total x Column total) / Sample total

E11 = (O1. x O.1) / O.. = (10 x 10) / 20 = 100 / 20 = 5
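The expected-frequency rule described above (row total times column total, divided by the overall total) can be checked with a minimal Python sketch. This is only an illustration and not part of the study guide; the variable names are made up for the example.

```python
# Observed frequencies from the contingency table in Question 67
# (rows A and B, columns X and Y).
observed = [[7, 3],
            [3, 7]]

row_totals = [sum(row) for row in observed]        # totals for rows A and B
col_totals = [sum(col) for col in zip(*observed)]  # totals for columns X and Y
grand_total = sum(row_totals)                      # overall sample size

# Expected frequency of a cell = (row total x column total) / overall total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

print(expected[0][0])  # expected frequency for cell AX
```

Because every row and column total is 10, all four expected frequencies come out the same, which matches option 2.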
68 If there is no relationship at all between two variables, 3 P132- Correlation coefficients that measure the linear relationship between two
what would be the most likely value of Pearson’s 133 variables, such as the Pearson product-moment correlation coefficient, can
correlation coefficient r, out of the following? have a continuous value that ranges from -1 to 1 (a positive value is usually
written without the sign, so '1' is presumed to mean '+1'). We use 'r' as the
1. -1.0 symbol that represents a correlation coefficient (as in the case of the Pearson
2. 0.5 product-moment correlation coefficient), and the following applies:
3. 0.0 • r = 1 implies a perfect positive linear relationship (the dots in a scatter plot
4. 1.0 will run from lower left to upper right in a perfectly straight line)
• r = 0 implies no linear relationship at all (the dots may be scattered all
over the place)
• r = -1 implies a perfect negative linear relationship (the dots will run from
upper left to lower right in a straight line)

When positive relationships occur, this implies that as one variable gets
larger, so does the other. When negative relationships occur, this implies that
as one variable gets larger, the other gets smaller.

The relationship is called linear because Pearson’s correlation coefficient


measures the extent to which the relationship approximates a straight line.

69 A contingency table represents ______ 4 App B Contingency tables are used to represent frequency counts of data that
have been classified in terms of 2 nominal variables (for example, gender
1. the distribution of the frequencies for a variable P142- and occupational category). It is possible to fit ordinal, interval or ratio scale
2. the data used to plot the relationship between two 144 measurements into such a table, but they would first have to be transformed
variables into a classification system; that is, the data have to be treated as if they
3. frequency counts for each of a number of possible represent nominal scale measurements.
outcomes of an experiment
4. the frequency counts when two nominal-scale Tut202 A contingency table is a two dimensional table used to represent the cross
variables are cross-classified 2014 classification, or cross tabulation, of the responses relating to two nominal or
Q20 categorical variables. It is basically a way to display and record the
relationship between the two variables. The frequency counts of one variable
are presented in the rows of the table and the frequency counts of the other
variable in the columns, as shown in table 6.4 on page 142 and table 6.5 on
p. 144 of the PYC3704 Guide
70 Which of the values given below is the closest to the 1 P132- Correlation coefficients that measure the linear relationship between two
probable value of the Pearson's product moment 133 variables, such as the Pearson product-moment correlation coefficient, can
correlation coefficient for the variables X and Y? have a continuous value that ranges from -1 to 1 (a positive value is usually
written without the sign, so '1' is presumed to mean '+1'). We use 'r' as the
Variable X 1 2 3 4 5 6 7 8 symbol that represents a correlation coefficient (as in the case of the Pearson
Variable Y 16 14 12 10 8 6 4 2 product-moment correlation coefficient), and the following applies:
• r = 1 implies a perfect positive linear relationship (the dots in a scatter plot
1. -1.0 will run from lower left to upper right in a perfectly straight line)
2. 0.5 • r = 0 implies no linear relationship at all (the dots may be scattered all
3. 0 over the place)
4. 1.0 • r = -1 implies a perfect negative linear relationship (the dots will run from
upper left to lower right in a straight line)

When positive relationships occur, this implies that as one variable gets
larger, so does the other or as one variable gets smaller, so does the other.
Variable Y does the same as Variable X.

When negative relationships occur, this implies that as one variable gets
larger, the other gets smaller or as one variable gets smaller, the other gets
larger. Variable Y does the opposite to Variable X.

The relationship is called linear because Pearson’s correlation coefficient


measures the extent to which the relationship approximates a straight line.
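The value of r for the X and Y data in Question 70 can be verified with a short Python sketch. It uses the standard deviation-score formula for Pearson's r; the function name `pearson_r` is just an illustrative choice, not something from the study guide.

```python
import math

# Data from Question 70
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [16, 14, 12, 10, 8, 6, 4, 2]

def pearson_r(xs, ys):
    """Pearson product-moment correlation computed from deviation scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sum_xx = sum((a - mean_x) ** 2 for a in xs)
    sum_yy = sum((b - mean_y) ** 2 for b in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

r = pearson_r(x, y)
print(round(r, 3))
```

Every increase of 1 in X goes with a decrease of exactly 2 in Y, so the points lie on a perfectly straight descending line and r is -1.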
May/June 2013

# Question Ans Page Comments


1 The end goal of Psychological research is usually to 4 SG P3-4 Psychological research is mainly concerned with testing theories of human
_______ human behaviour behaviour.
1. collect data on Tut201
2. diagnose psychological problems in 2012 Q7
3. develop hypotheses about
4. test theories of

2 A researcher believes that there is a difference in the 4 P2 An inference is a conclusion that follows from existing information, by
reasoning strategies used to solve puzzles between generalising from the specific information to the general type of phenomenon,
students who study physical sciences such as physics where the conclusion is not absolutely certain. So in summary inferential
and chemistry and students who study social sciences statistics are techniques for making generalisations based on imperfect
such as psychology or sociology. She sets up a series P10-11 numeric data, where the conclusions have a high probability of being true, but
of puzzles to be solved by students from different you can never be completely certain.
colleges or faculties at a university. This kind of
research is referred to as ________ research A distinction exists between inferential statistics and descriptive statistics.
1. statistical Descriptive statistics refers to a set of quantities used to summarise
2. theoretical aspects of numerical data. Examples that you may be familiar with are
3. empirical means, range, variance and standard deviation (see Appendix C for a quick
4. inferential introduction). These summary quantities are sometimes referred to as
parameters (when they refer to the whole collection or population of data; see
section 1.4.3 below).

Inferential statistics refers to the use of statistical techniques to make


generalisations about the relationships among (two or more) variables. Here
the patterns that may exist in the data are carefully investigated.

You are INFERRING from your sample back to your population of all
students. If they had said experimental that would also have been correct.

We don't really use the word EMPIRICAL to refer to a TYPE of research; it is
used to describe the nature of the research, i.e. that your research should be
testable.
3 Which of the following definitions best describe the 1 P6 The taking of a measurement is regarded as an act of observation
meaning of ‘measurement’ in the context of
psychological research? Measurement means to P7 A construct that has been measured in some way produces a variable. A
________ variable refers to a number that can take on any one of a range of possible
1. find a way to observe a specific construct or values. They can be discrete (when only whole numbers like 1, 2, 3 are
phenomenon which is hidden allowed) or continuous (what mathematicians refer to as 'real numbers'). In
2. determine the extent to which a specific some cases variables also take on values smaller than zero to produce
phenomenon is present on a numeric scale negative numbers.
3. specify the relationship that is believed to exist
between two (or more) constructs or phenomena So the (visible) variable reflects the intensity of the underlying (invisible)
4. calculate a summary value which describes an construct, in terms of how it was measured. We say that the variable is
aspect of a specific construct or phenomenon manifest (it is visible in the sense that we can observe it) and the construct is
latent (it is invisible in the sense that we need some way to make it appear).
So the latent construct is made manifest by the use of an appropriate
measurement procedure.

4 A variable is described as ‘manifest’ because it is a[n] 4 P7 So the (visible) variable reflects the intensity of the underlying (invisible)
(a) ______ measurement of a construct which is (b) construct, in terms of how it was measured. We say that the variable is
______ manifest (it is visible in the sense that we can observe it) and the construct is
1. (a) latent (b) observable latent (it is invisible in the sense that we need some way to make it
2. (a) dependent (b) independent appear). So the latent construct is made manifest by the use of an
3. (a) independent (b) dependent appropriate measurement procedure.
4. (a) observable (b) latent
P23 To say that a construct is 'latent' is another way of saying it is hidden from
direct observation
5 When a specific psychological construct or 3 P6 The taking of a measurement is regarded as an act of observation
phenomenon is measured on a quantitative scale, the
resulting value is referred to as a ______ P7 A construct that has been measured in some way produces a variable.
A variable refers to a number that can take on any one of a range of possible
1. parameter values. They can be discrete (when only whole numbers like 1, 2, 3 are
2. descriptive statistic allowed) or continuous (what mathematicians refer to as 'real numbers'). In
3. variable some cases variables also take on values smaller than zero to produce
4. test statistic negative numbers.

So the (visible) variable reflects the intensity of the underlying (invisible)


construct, in terms of how it was measured. We say that the variable is
manifest (it is visible in the sense that we can observe it) and the construct is
latent (it is invisible in the sense that we need some way to make it appear).
So the latent construct is made manifest by the use of an appropriate
measurement procedure.
6 Operational definitions of a concept are definitions 2 P24-26 Operational definitions of psychological constructs should define constructs in
which define a concept in terms of ______ terms of observable behaviour.

1. other concepts "Operational'' refers to practical procedures by which constructs are made
2. observable instances visible.
3. latent variables
4. underlying constructs "Operationalisation" is where you make the construct (which is usually an
abstract concept, so it is difficult to observe it clearly) visible by finding some
suitable way to measure it.
7 Which of the following is appropriate as a research or 3 P9 a) H0: μM = μF H1: μM ≠ μF
operational hypothesis? Too many independent factors like job level or seniority etc

(a) Gender plays a role in determining employees’ b) H0: μM = μF H1: μM ˃ μF


salaries in Winston & Johnson Inc. Too many independent factors like job level or seniority, amount of male vs
(b) Male employees earn more than female employees female etc
in Winston & Johnson Inc
(c) Male employees at Winston & Johnson Inc earn
c) H0: μM = μF H1: μM ˃ μF
higher annual salaries than female employees at the
Compared at corresponding levels
same company, at corresponding post levels

1. (a), (b) and (c)


2. (b) and (c) but not (a)
3. (c) only, but not (a) and (b)
4. (a) only, but not (b) and (c)
8 Quantities that summarise aspects of a population are 2 P11 These summary quantities are sometimes referred to as parameters (when
called (a) ______, while (b) ______ do the same for they refer to the whole collection or population of data
samples
P161 You should take careful note of the following important distinctions between
1. (a) statistics (b) parameters samples and populations. Summary values for populations are called
2. (a) parameters (b) statistics 'parameters' and are usually denoted by Greek letters, while summary values
3. (a) constructs (b) variables for samples are called 'statistics' and are denoted by Roman letters.
4. (a) variables (b) parameters

Summary value                                   Population symbol    Sample symbol
                                                (Parameter)          (Statistic)
Arithmetic mean                                 μ                    x̄
Standard deviation                              σ                    s
Variance                                        σ²                   s² (s = √s²)
Standard error of the mean                      σx̄ (= σ/√n)          sx̄ (= s/√n)
  (also called the standard deviation of the
  sampling distribution of the mean)
Mean of the sampling distribution of the mean   μx̄
  (mean of all means; equal to the mean
  under H0)
Z score for means                               z
Correlation between two measurements            ρ                    r
  (Pearson's r)
Proportions                                     P                    p
9 Values which are calculated to test hypotheses about 1 P2 An inference is a conclusion that follows from existing information, by
relationships among variables are referred to as (a) generalising from the specific information to the general type of phenomenon,
______ statistics, while values which summarise where the conclusion is not absolutely certain. So in summary inferential
aspects of data such as the mean and standard statistics are techniques for making generalisations based on imperfect
deviation are referred to as (b) ______ statistics P10-11 numeric data, where the conclusions have a high probability of being true, but
you can never be completely certain.
1. (a) inferential (b) descriptive
2. (a) descriptive (b) correlational A distinction exists between inferential statistics and descriptive statistics.
3. (a) descriptive (b) inferential Descriptive statistics refers to a set of quantities used to summarise
4. (a) theoretical (b) empirical aspects of numerical data. Examples that you may be familiar with are
means, range, variance and standard deviation (see Appendix C for a quick
introduction). These summary quantities are sometimes referred to as
parameters (when they refer to the whole collection or population of data; see
section 1.4.3 below).

Inferential statistics refers to the use of statistical techniques to make


generalisations about the relationships among (two or more) variables. Here
the patterns that may exist in the data are carefully investigated.

10 A researcher believes that people who make eye 1 P8-9 The dependent variable is the one that is predicted or explained, and the
contact with others when they speak to them are P24 independent variable is manipulated to see how it affects the dependent
generally perceived to be more trustworthy than those variable.
who do not. She sets up an experiment where a group
of 100 research participants are each interviewed by a The independent variable is that variable which affects the dependent
research assistant. In half of the cases the interviewer variable; or, conversely, the dependent variable depends on the independent
makes a lot of eye contact with the participants during variable.
the interview and in half of the cases no or very little
eye contact is made. Afterwards participants are asked When a researcher focuses on the interaction of only two variables at a time,
to rate the research assistant for level of the dependent variable is usually the one that the researcher is interested in,
trustworthiness. In this scenario, whether eye contact the variable that is the focus of the research. The independent variable is
was made or not is the (a) ______ variable, while something that the researcher manipulates, to see how this affects the
perceived level of trustworthiness is the (b) ______ dependent variable (in other words, the dependent variable is dependent on
variable the independent variable).

1. (a) independent (b) dependent


2. (a) hidden (b) latent
3. (a) dependent (b) independent
4. (a) latent (b) manifest
11 A researcher conducts an experiment with two groups 3 Group 1 - 125ml alcohol
of university students. The students in the first group Group 2 - 350ml alcohol
are all given 125 ml of alcohol to drink, while the
students in the second group are required to drink 350 She finds that the subjects in the second group are significantly slower in
ml of alcohol each. She then tests their motor these tests than the subjects in the first group.
coordination in a series of tests and finds that the
subjects in the second group are significantly slower in Therefore: The level of alcohol consumption among the students has an effect
these tests than the subjects in the first group. on their motor performance

Which of the following is the most appropriate


formulation of the researcher's research hypothesis?

1. A study of the speed of motor coordination among


students who use alcohol
2. Comparing two groups of students on alcohol
consumption and motor performance
3. The level of alcohol consumption among the
students has an effect on their motor performance
4. Some students can consume more alcohol than
others before their motor coordination is affected
12 Which of the descriptions given below is the most 4 P4 A theory is a well-established principle that has been developed to explain
accurate description of the meaning of the word P15 some aspect of the natural world.
'theory' in scientific research? It refers to a ______ P18-19 A theory arises from repeated observation and testing and incorporates
P21-26 facts, laws, predictions, and tested hypotheses that are widely accepted. In
1. synonym for hypothesis science, a theory is a framework for facts. It is some kind of description that
2. reasonable guess about a relationship that may tells you how the facts are connected, and why the facts are as they are
exist among two or more variables (where the word 'facts' refers to things or events that were observed and
3. best explanation of why a specific relationship that described in a careful way).
is observed among variables is as it is observed to A theory is a network of relations among facts that were proposed to be true
be and explanations for observed phenomena in terms of constructs.
4. careful description of the facts that have been
observed in a specific situation P15-16 Option 3 refers more to an hypothesis.
• An hypothesis can be informally described as an educated guess. As we
indicated above, research usually tries to establish relationships among
constructs in order to develop a theory or to test an existing theory.
• This is the research hypothesis (although there could be more than one), which
expresses the problem in terms of very specific relationships among
constructs that we expect to find (if our guess is true). It is important that this
possible relationship should be clear and unambiguous. An hypothesis that is
stated clearly and specifies exactly what is to be observed and what should be
true if it is valid, is often called an operational hypothesis.
13 A famous hypnotist performs at Meanie Hall before a 1 So for probability questions the formula is P(event) = f/p (this is for a SINGLE
crowd of 350 students and 180 non-students. The event)
hypnotist knows from previous experience that one half
of the students and two-thirds of the non-students are Here the favourable outcome is being hypnotizable:
hypnotizable. What is the probability that a randomly
chosen person from the audience will be hypnotizable? The total possible outcome is all the audience members as they all have a
chance of being selected (p= 350 + 180 = 530)
1. 0.557
2. 0.340 So the only tricky part is the favourable outcomes:
3. 0.869
4. 0.670 1/2 of 350 + 2/3 of 180 = 175 + 120 = 295

So you can now go back to your formula:

P(hypnotizable) = 295/530 = 0.5566

Rounded off is P= 0.557
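The same calculation can be sketched in Python as a check. This is only an illustration; exact fractions are used so that no rounding happens until the last step, and the variable names are invented for the example.

```python
from fractions import Fraction

students, non_students = 350, 180

# Favourable outcomes: one half of the students plus two-thirds of the non-students
hypnotizable = students * Fraction(1, 2) + non_students * Fraction(2, 3)

# All possible outcomes: every audience member can be selected
audience = students + non_students

p_hypnotizable = hypnotizable / audience
print(hypnotizable, audience, round(float(p_hypnotizable), 3))
```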


14 The expression "0.05 ≤ p < 0.10" denotes a probability 3 P33-34 Larger than or equal to 0.05 and smaller than 0.10
value ______
Because probabilities fall in a range from 0.0 to 1.0 when expressed
1. smaller than or equal to 0.05 and smaller than 0.10 decimally, a probability can never be higher than 1 or lower than 0. The
2. larger than or equal to 0.10 or smaller than or equal general rule is written symbolically as follows: 0 ≤ p ≤ 1. Note that a
to 0.05 probability can be 0, but to say that a probability is 0 is actually the same as
3. larger than or equal to 0.05 and smaller than 0.10 saying that the event is impossible and can never happen. Likewise, to say
4. of exactly 0.05 that the probability of an event is 1 is to assert that it is an absolute certainty.
In actual practice, probabilities fall within these two extremes.
You will typically encounter reference to probabilities in expressions such as
''p > 0.05''. This statement is interpreted as ''the probability value is higher
than 0.05''.
15 The probability of correctly guessing a two-digit 2 P35-36 Two digit numbers can be from 01....99
number is Thus 99 possible outcomes

1. 0.100 A single correct guess = 1 favourable outcome


2. 0.010
3. 0.200 Prob(guessing) = (no. of favourable outcomes) / (total no. of possible outcomes)
4. 0.500
= 1 / 99 ≈ 0.010
16 Suppose that over the years 2 000 students wrote the 3 P35-36 Part 1:
examinations in PYC 304-C and that 1200 of them p(E) = (number of favourable events) / (number of possible outcomes)
passed, of which 600 obtained exactly 50%. This = 600 / 2000 = 3/10 = 0.30
means that for randomly selected students the
probability of obtaining exactly 50% is ______ while Part 2:
the probability of obtaining 50% or more is ______ p(E) = (number of favourable events) / (number of possible outcomes)
= 1200 / 2000 = 6/10 = 0.60
1. 0.60; 0.30
2. 0.05; 0.60
3. 0.30; 0.60
4. 0.60; 0.50

17 The probability value “p is larger than or equal to 0.2" 1 P33-34 p is larger than or equal to 0.2 (p ≥ 0.2)
is ______ the probability value "p is smaller than or p is smaller than or equal to 10% (p ≤ 10% or p ≤ 0.1)
equal to 10%"
So the probability value “p is larger than or equal to 0.2" is larger than the
1. larger than probability value "p is smaller than or equal to 10%"
2. larger than or equal to
3. smaller than or equal to
4. exactly the same as

The following scenario applies to Questions 18 and 19 below.

In a business management test the following sample scores were recorded:

Individual A B C D E F G H I J

Test score 12 12 7 10 9 12 13 8 9 8
18 The mean of the test scores is? 2 P59-60 Formula is: μ = ∑xi / N

1. 12.00
2. 10.00
3. 9.00 So:
4. 8.00 μ = ∑xi / N
= (12+12+7+10+9+12+13+8+9+8) / 10
= 100 / 10
= 10
19 The standard deviation of the distribution of sample 3 P53 Formula is: z = (x - μ) / σ
scores is 2.11. Therefore the z-score for individual E is

1. 0.47
2. 1.42 Where:
3. -0.47 X = 9 (test score for Student E)
4. -1.42 μ = 10 (calculated in previous question)
σ = 2.11 (standard deviation)

So:
Z = (x - μ) / σ = (9 - 10) / 2.11 = -1 / 2.11 = -0.474
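The mean, the standard deviation and the z-score for Questions 18 and 19 can be reproduced with Python's standard `statistics` module. This is a minimal sketch, assuming the stated standard deviation of 2.11 is the sample standard deviation (n - 1 in the denominator), which is what `statistics.stdev` computes.

```python
import statistics

# Test scores from the scenario for Questions 18 and 19
scores = {"A": 12, "B": 12, "C": 7, "D": 10, "E": 9,
          "F": 12, "G": 13, "H": 8, "I": 9, "J": 8}

values = list(scores.values())
mean = statistics.mean(values)   # arithmetic mean of the ten scores
sd = statistics.stdev(values)    # sample standard deviation (n - 1 denominator)

# z-score for individual E: how many standard deviations E lies from the mean
z_e = (scores["E"] - mean) / sd

print(mean, round(sd, 2), round(z_e, 2))
```

The negative sign shows that individual E scored below the mean, which is why option 3 is chosen rather than option 1.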
20 The proportion of scores less than z=0.00 is 2 App D P(Z < z) = 0.5000

1. 0.00 See standard normal distribution table in Appendix D for Smaller Portion of
2. 0.50 z=0.00
3. 1.00
4. -0.50 If z=0.00, then half the scores will be less and half will be more than the
mean.
21 In a normal distribution, approximately ______ of the 3 P53 The normal curve (also known as the bell curve) is the most common
scores fall within 1 standard deviation of the mean distribution of data. The normal curve is completely determined by two
parameters: the mean (μ) and the standard deviation (σ); for the standard normal curve, μ = 0 and σ = 1. The normal curve is
1. 14% symmetric about the mean which is also the median and the mode. Most data
2. 95% is clumped in close to the mean.
3. 68%
4. 83% Theorem 1 The 68-95-99.7 Rule: In every normal distribution with mean µ
and standard deviation σ, approximately 68% of the data falls within one
standard deviation of the mean. Approximately 95% of the data falls within
two standard deviations of the mean. And finally, approximately 99.7%
(almost everything) of the data falls within three standard deviations of the
mean.

According to the standard normal distribution table (z-table), if z = 1 then the
area from the mean to z is 0.3413. Multiply by 2 to cover both sides of the mean: 0.6826 or
68.26%

So:
68.27% of the values lie within one standard deviation of the mean.
(0.3413 + 0.3413 = 0.6826 = 68.26% - numbers were rounded)

95.45% of the values lie within two standard deviations of the mean.
(0.1359 + 0.3413 + 0.3413 + 0.1359 = 0.9544 = 95.44%)

99.73% of the values lie within three standard deviations of the mean.
(0.0215 + 0.1359 + 0.3413 + 0.3413 + 0.1359 + 0.0215 = 0.9974 =
99.74%)

0.0013 (0.13%) of the values lie beyond 3 standard deviations in each tail (about 0.27% in total)
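The percentages above come from the z-table, but they can also be computed directly. The sketch below is an illustration only: it uses the identity that the proportion of a normal distribution within k standard deviations of the mean equals erf(k / √2), and the helper name `proportion_within` is made up for the example.

```python
import math

def proportion_within(k):
    """Proportion of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(proportion_within(k) * 100, 2))
```

The small differences from the table-based sums (for example 68.27% versus 68.26%) arise because the z-table values are rounded to four decimals.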


22 The sampling distribution of a test statistic ______ 4 P57-60 The sampling distribution of a statistic gives all the values that the statistic
can take, (a) and the probability of each value occurring by chance alone. (b)
(a) gives all the values a test statistic can take
(b) gives the probability of getting each value of a test The sampling distribution tells us what values we might expect to obtain for a
statistic under the assumption the results are due to particular statistic if some predefined conditions are true (e.g., the conditions
chance alone stated by the null hypothesis).
(c) is a probability distribution
The sampling distribution assumes that the null hypothesis is true. When we
1. (a) and none of the other options compare an obtained test statistic to the sampling distribution, we’re asking
2. (b) and none of the other options how likely it is that we would get that statistic if we were sampling from a
3. (b) and (c) but not (a) population that has the null hypothesis characteristics (e.g., P = 0.50).
4. (a) (b) and (c)
The sampling distribution of a sample statistic of size n is defined as follows:
The experiment consists of choosing a sample of size n from the population
and measuring the statistic S. The sampling distribution is the resulting
probability distribution.

The sampling distribution of a statistic is the set of all possible values of the
statistic when all possible samples of a fixed size are taken from the
population. The sampling distribution refers to the variation of a statistic, for
example, the sample mean (), from sample to sample. Note that here we are
not concerned with the variation of individual elements in the sample, or
individual elements in the population, but with the variation of a summary
value (such as the mean) for a sample.

The sampling distribution refers to variation over a hypothetical set of all


possible samples. This may be a rather difficult concept to grasp. It is easy to
visualise the variation of individual elements in a sample because the values
are there for you to see. It is also easy to think of the variation of individual
elements in a population because you can picture the set of individual units.
But it is much more difficult to imagine the set of all possible samples
because (1) we typically deal with one or two samples so that the idea of a
sampling distribution is not really intuitive, and (2) the set of all possible
samples is typically extremely large (conceptually infinitely many samples).
23 Data analysis involving statistical inference basically 4 P55-56 First calculate the z-transformation scores to determine the standard
involves ______ deviation of the scores (z = ???). This is not standard deviation of the
population (σ)
(a) first determining the standard deviation of the
scores Then lookup the z-score in the z-table to determine the smaller or larger
(b) calculating the appropriate test statistic portion of the distribution
(c) evaluating the test statistic based on the sampling
distribution The standard deviation of the population (σ) is not required to perform a t-
test, therefore we do not necessarily require (a)
1. (a) but not (b) or (c)
2. (a) and then (c)
3. (b) but not (a) or (c)
4. (b) and (c) but not necessarily (a)

24 A normally distributed set of population scores has a 3 P61 μx̄ = μ = 65


mean of 65 and a standard deviation of 10.2. A
number of samples, each of size 48 is taken from this Just as the normal distribution is defined by its mean and standard deviation,
population. The mean of the sampling distribution so the distribution of sample means is described by the same two quantities.
of the mean for these samples equals ______ The central value of the sampling distribution equals the population mean (i.e.
the mean of the distribution of all possible means is the same as the
1. 4.71 mean of the population from which the samples were drawn, or μx̄ = μ),
2. 65/√48 while the standard deviation of the sample means is estimated by a value we
3. 65 call the standard error of the mean.
4. 10.2/√48
Like a standard deviation, the standard error of the mean tells us by what
average amount the sample means deviate from the mean of the sampling
distribution. It is an estimate of the size of the error we shall make if we use
the mean of the distribution of sample means as an estimate of the true
population mean, that is, if we use μx̄ to estimate μ.
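For the figures in Question 24, the two quantities that describe the sampling distribution of the mean can be sketched as follows. This is an illustration only; the variable names are invented, and the standard error uses the formula σ/√n from the symbol table.

```python
import math

mu = 65        # population mean
sigma = 10.2   # population standard deviation
n = 48         # size of each sample

# The mean of the sampling distribution of the mean equals the population mean
mean_of_sample_means = mu

# The standard error of the mean is the population standard deviation divided by sqrt(n)
standard_error = sigma / math.sqrt(n)

print(mean_of_sample_means, round(standard_error, 2))
```

Only the first quantity is asked for in the question; options 2 and 4 mix the mean or the standard deviation with √n, which applies to the standard error and not to the mean.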

25 An alpha level of 0.05 indicates that ______ 1 SG 82- An error of Type I is the error we make if we reject the null hypothesis when
86 we should not have done so, and the level of significance represents the
1. if H0 is true, the probability of falsely rejecting it is greatest risk of doing this that we are willing to take.
limited to 0.05
2. 95% of the time, chance is operating. The alpha level is the level of significance, in this case 0.05 or 5%.
3. the probability of a Type II error is 0.05
4. the probability of a correct decision is 0.05

Use the following scenario for Questions 26 - 28

A researcher believes that women today weigh less than in previous years. To investigate this belief she randomly samples 41 adult women and records
their weights. The scores have a mean of 51 kg and a standard deviation of 5.6. A local census taken several years ago shows the mean weight of adult
women was 52.6 kg at that time

26 Given the data above, what would be the most 2 P102- In this question the population standard deviation (σ) is considered to be
appropriate statistical approach to establish whether 106 unknown because the given standard deviation comes from the sample. So
there is a statistically significant difference between the we have to use the t-test (t)
average weight of the women in the sample and the
weight of the women recorded in the census? The important point is that - as in the case of the z-distribution - the t-
distribution is a statistical distribution with a probability distribution that can be
1. A correlational study focusing on the linear increase determined, which means that we can use it to predict the chances of
in weight obtaining specific outcomes when testing for comparisons of means when the
2. A study of the group differences using a single population standard deviation σ is unknown.
sample t-test
3. A study of the group differences using the t-test for
independent groups
4. The z-test

27 If the population standard deviation was available 4 P80 When the population standard deviation (σ) is known we use the z-test (z)
instead of the sample standard deviation, which
technique would then have been the most appropriate P100- In Topic 3 (section 3.2.2), in the process of explaining the logic of statistical
for the statistical analysis of the data? 102 testing in general, we introduced you to the z test for single-sample
comparisons. This is used when you have only one sample of data of a
1. A correlational study focusing on the linear increase variable from which a mean could be derived, and you want to compare this
in weight mean with a specific constant value.
2. A study of the group differences using a single
sample t-test
3. A study of the group differences using the t-test for
independent groups
4. The single-sample z-test
28 Suppose the obtained value of the appropriate statistic 4 p-value = 0.025
is -2.07, and subsequently a p-value of 0.025 was α = 0.01 (one-tailed)
found. What can be concluded based on these results
if a significance level of α = 0.01 (one-tailed) is used? Since p > α (0.025 > 0.01), do not reject H0

1. Accept H1
2. Do not accept H0
3. Reject H0
4. Do not reject H0

29 The normal distribution is useful for interpreting 1 P51 Many psychological and educational variables are distributed approximately
psychological measurements because ______ normally, so that the normal curve can be used as a theoretical model for
interpreting the distribution of these variables. The distributions relating to
1. many psychological variables are approximately psychological variables such as measures of reading ability, introversion, job
normally distributed satisfaction and memory can all be plotted on a normal curve, and
2. it has a mean of zero and a standard deviation of 1 psychometric tests are often standardised in such a way that they conform to
3. it represents an arbitrarily large population of this distribution. Almost all the statistical tests discussed in this module
scores assume normal distributions. Furthermore, many psychological
4. it is symmetrical in shape measurements work very well even if the distribution is only approximately
normally distributed. Some tests work well even with very wide deviations
from normality. Also, apart from its theoretical significance, the normal
distribution is useful because it is easy to work with in practice, and because
many kinds of statistical tests can be derived for normal distributions.

App B On a nominal scale, numbers show category membership, but are otherwise
P156 arbitrary. They do not represent a size or intensity of something, but are only
used as labels to distinguish among qualities or characteristics. They can
also be referred to as categorical variables, or qualitative variables. This is
because differences in the numbers represent differences in quality,
character or type, but not in amount.

For example, we could code a variable like 'region' into 1 = North; 2 = West; 3
= South; and 4 = East. But these four categories can be coded in a different
sequence if we choose, without any information being lost. Note that in the special
case where there are only two options, for example, when we code 'Gender'
as 1 = male and 2 = female, we refer to it as a dichotomy.

The important point about nominal scale measurements is that you cannot do
arithmetic with them. Adding them and obtaining an average makes no sense
(e.g. adding telephone numbers to obtain an 'average telephone number').
30 If examination scores are approximately normally 3 Calculate Pete's z-score, then look up in the standard normal table the
distributed with a mean of 60% and a standard proportion of scores that fall below it.
deviation of 8% and Pete’s score is 66%, he did better
than about ______ of the candidates Formula: z = (x - μ) / σ

1. 27% where x = 66% (Pete's score)
2. 23% μ = 60% (mean of the examination)
3. 77% σ = 8% (standard deviation)
4. 13%
Pete's score: z = (x - μ) / σ = (66% - 60%) / 8% = 6% / 8% = 0.75

P(score < 66%) = P(X < 66%)

So P(Z < 0.75) = 0.7734 = 77.34% ≈ 77%
(Refer to the standard normal distribution table where z = 0.75. Since we are
looking for the area below 0.75, refer to the larger portion column.)

Pete did better than about 77% of the candidates, namely those who scored less than 66%.

Alternatively:

P(score > 66%) = P(X > 66%)

So P(Z > 0.75) = 0.2266 = 22.66% ≈ 23%
(Refer to the standard normal distribution table where z = 0.75. Since we are
looking for the area above 0.75, refer to the smaller portion column.)

To summarize, about 23% of the candidates scored above 66%, so Pete did better than
the other 77% or so, who scored less than 66%.
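
The table lookup for Question 30 can be checked with a short script. This is only a sketch: it assumes Python 3.8 or later, where the standard library provides statistics.NormalDist, and the names z_score, pete_z, proportion_below and proportion_above are made up for this illustration.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Standardise a raw score: z = (x - mean) / sd."""
    return (x - mean) / sd


# Values from Question 30: mean 60%, standard deviation 8%, Pete's score 66%.
pete_z = z_score(66, 60, 8)

# Area of the standard normal curve below pete_z
# (the "larger portion" column of the table when z is positive).
proportion_below = NormalDist().cdf(pete_z)

# Area above pete_z (the "smaller portion" column).
proportion_above = 1 - proportion_below

print(f"z = {pete_z:.2f}")
print(f"P(Z < z) = {proportion_below:.4f}")
print(f"P(Z > z) = {proportion_above:.4f}")
```

In the exam you would still read these areas from the printed table; the script is only a way to confirm that you used the correct column.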
31 After finding that a significant difference exists 4 P86-87 Because of measurement error, especially in the standard error, a
between male and female participants on a test which statistical hypothesis test may indicate a significant relationship even
tests level of creativity, a researcher decides to also though such a relationship is questionable in real life. For example, a study on
calculate an effect size, using Cohen's d. This is used reckless driving may indicate that taxi drivers in Johannesburg are the most
to determine ______ careful drivers in South Africa, yet such a result is questionable in real life.

1. the size of the error that would be made if the null To determine whether a significant result is also meaningful in practice,
hypothesis is rejected we calculate an effect size using Cohen's d.
2. the ability of a statistical test to detect a significant
relationship between variables
3. the level of confidence one can have that the test is
valid
4. whether a significant effect is meaningful from a A result of d > 1 would imply a difference of greater than one standard
practical point of view deviation between the means, which is quite large.
As a rule of thumb, we can interpret the effect size as follows:
Around 0.2 “small”
Around 0.5 “medium”
Around 0.8 “large effect size”
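
The rule of thumb above can be sketched in a few lines of code. Note the assumptions: the cut-off points used to turn "around 0.2 / 0.5 / 0.8" into labels are one possible reading of the rule, the function names are made up for this illustration, and the standard deviation of 6.0 in the example is hypothetical (the two means are the ones quoted in Question 52).

```python
def cohens_d(mean_1, mean_2, sd):
    """Express the difference between two means in standard deviation units."""
    if sd <= 0:
        raise ValueError("the standard deviation must be positive")
    return (mean_1 - mean_2) / sd


def describe_effect(d):
    """Label an effect size; the exact cut-offs are an assumption for illustration."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


# Means from Question 52 (BA = 22.3, BSc = 20.4); the SD of 6.0 is hypothetical.
d = cohens_d(22.3, 20.4, 6.0)
print(f"d = {d:.2f} ({describe_effect(d)})")
```

With these illustrative numbers the difference is about a third of a standard deviation, which would count as a small effect even if the t-test is significant.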

32 Transforming variables to z-scores is useful because it 2 P55 Transforming a set of measurements, each with a different mean and a
______ different standard deviation, into a z-score can be used to compare an
individual across different distributions. After transformation, all the scores will
1. is used to calculate the test statistic fall on a common standard normal distribution with a mean of 0 and a
2. enables one to compare variables with different standard deviation of 1, which makes it possible to compare them directly.
means and standard deviations from scores with
different original units
3. can be used to test whether a score is normally
distributed
4. is easy to calculate the mean and standard
deviation of most scores
33 A probability of an event occurring which depends on 1 P36-37 In the formulation of the multiplicative rule given above we assume that the
something else occurring, such as passing a test when probabilities of the two events, A and B, are independent of one another.
you do not understand your course, can be described However, in some cases a particular probability is conditional on something
as ______ else happening. For example, the probability of event A occurring may be
conditional on the prior occurrence of event B. Conditional probabilities are
1. conditional probability written as p(B|A), where | indicates that a condition applies. p(B|A) is read as
2. an independent event 'the probability of B given A.' Likewise p(A|B) is read as 'the probability of A
3. mutually exclusive events given B', or equivalently, as 'the probability of A happening on condition that
4. a multiplicative probability B has occurred'.

The multiplicative rule that we use when we have conditional probabilities is


p(A and B) = p(A) x p(B|A) (Formula 1)

Suppose we let A denote 'Marie wins the race' and B|A stand for 'Marie gets
a trophy given that she won the race'. We further assign a probability of 0.5 to
A and a probability of 0.6 to B|A. Therefore, the probability that Marie will win
the race and get a trophy is p(A and B) = (0.5) x (0.6) = 0.3.

Note that from the formula for conditional probability, using simple algebra,
we can derive formula 2 below.
p(B|A) = p(A and B) / p(A) (Formula 2)

Let us assume that we know that the chance of Marie winning the race and
also a trophy is 0.3. We also know that the probability of winning the race is
0.5. What is the conditional probability of her winning a trophy provided she
had won the race?
We use formula 2, insert the given probabilities and, therefore, have
p(B|A) = 0.3 / 0.5 = 0.6
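
Formula 1 and Formula 2 can be checked against each other with the Marie example. This is a minimal sketch; the function and variable names are made up for this illustration.

```python
def joint_probability(p_a, p_b_given_a):
    """Formula 1: p(A and B) = p(A) x p(B|A)."""
    return p_a * p_b_given_a


def conditional_probability(p_a_and_b, p_a):
    """Formula 2: p(B|A) = p(A and B) / p(A)."""
    if p_a == 0:
        raise ValueError("p(A) must be greater than 0")
    return p_a_and_b / p_a


p_win = 0.5               # p(A): Marie wins the race
p_trophy_given_win = 0.6  # p(B|A): she gets a trophy given that she won

p_win_and_trophy = joint_probability(p_win, p_trophy_given_win)
recovered = conditional_probability(p_win_and_trophy, p_win)

print(round(p_win_and_trophy, 2))
print(round(recovered, 2))
```

Applying Formula 2 to the result of Formula 1 gives back the conditional probability we started with, which is a quick way to confirm that both formulas were applied correctly.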
34 The sampling error of the mean will be smaller in cases 1 P61-62 We can estimate the size of the error we would make if we used the sample
where ______ mean as an estimate of the population mean. This is referred to as the
standard error, and it is specified in the central limit theorem.
1. the sample is larger and the standard deviation of
the population smaller The standard error is denoted by σẋ. The σ indicates that we are
2. the population is larger and the variability of the describing a population, and the subscript ẋ informs us that we are dealing
scores in the sample is smaller with a population of sample means. The standard error is given by dividing
3. the sample mean is smaller the population standard deviation by the square root of the sample size
4. a medium-size rather than a large sample is used σẋ = σ / √n
Like a standard deviation, the standard error of the mean tells us by what
average amount the sample means deviate from the mean of the sampling
distribution. It is an estimate of the size of the error we shall make if we
use the mean of the distribution of sample means as an estimate of the
true population mean, that is, if we use µẋ to estimate µ.

The sampling error is given by σẋ = σ / √n. So for it to be smaller, the sample
size (n) must be larger and the standard deviation (σ) must be smaller.

For instance:
Assume σ = 5 and n=36: σẋ = σ / √n = 5 / √36 = 5/6 = 0.833

Now we increase n (n = 49) : σẋ = σ / √n = 5 / √49 = 5/7 = 0.714

Increasing the sample size (from 36 to 49) has reduced the standard error (σẋ).
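
The same comparison can be run as a short script. This is a sketch only: the function name standard_error is made up for this illustration, and the last line uses a hypothetical σ of 4 to show the effect of a smaller population standard deviation.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    if n <= 0:
        raise ValueError("the sample size must be positive")
    return sigma / math.sqrt(n)


se_n36 = standard_error(5, 36)  # the first example above
se_n49 = standard_error(5, 49)  # larger sample, same sigma
se_sigma4 = standard_error(4, 36)  # hypothetical smaller sigma, same n

print(f"n = 36, sigma = 5: {se_n36:.3f}")
print(f"n = 49, sigma = 5: {se_n49:.3f}")
print(f"n = 36, sigma = 4: {se_sigma4:.3f}")
```

Both a larger sample and a smaller population standard deviation shrink the standard error, which is why option 1 is the correct answer.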

Base your answers to Questions 35 to 37 on the following scenario.

Suppose that the memory span of adults is normally distributed with a mean of 7 items and a standard deviation of 2 items. A researcher predicts that
'dyslexic adults have a shorter memory span than adults in general'
35 Which of the following is an appropriate null hypothesis 2 P73-75 The null hypothesis will always contain equal signs. In this case H0 : μ = 7.
for testing the above prediction? Since the hypothesis should verify dyslexic people's memory span, option 2 is
correct
1. The mean memory span of the population of
dyslexic adults is smaller than 7 H0 is defined as the hypothesis of no effect.
2. The mean memory span of the population of
dyslexic adults equals 7 • The null hypothesis (H0) represents the status quo or the current belief in
3. The mean memory span of the population of adults a situation. The null hypothesis will always contain equal signs.
equals 7 • The alternative hypothesis (H1) is the opposite of the null hypothesis and
4. The mean memory span of the population of adults represents a research claim or specific inference you would like to prove.
does not equal 7 This means that the alternative hypothesis takes the sign of the test
depending on the situation.
o If we are testing the difference, H1 is indicated with ≠.
o Otherwise we can use signs like less than (<) or greater than (>)
depending on the problem statement.
• If you reject H0, you have statistical proof that the alternative is correct.
• If you do not reject H0, you have failed to prove that the alternative
hypothesis is correct. Failure to prove the alternative hypothesis does not
necessarily mean that the null hypothesis is true.
• The null hypothesis (H0) always refers to a specific value of a parameter
(such as μ), not a statistic (such as ẋ). This value is always known or will
come from the given scenario.

36 Which of the following is an appropriate alternative 1 P73-75 The alternative will take the direction of the question. Hence, “The mean
hypothesis for testing the above prediction? memory span of the population of dyslexic adults is smaller than 7”. In this
case H1 : μ < 7
1. The mean memory span of the population of
dyslexic adults is smaller than 7
2. The mean memory span of the population of adults
is not equal to 7
3. The mean memory span of the population of
dyslexic adults equals 7
4. The mean memory span of the population of adults
does not equal 7
37 Testing the above prediction will require a ______ 3 P75 Directional because H1 : μ < 7. It is also one-tailed because it only focuses on
statistical test values smaller than 7 and not on values larger than 7 as well.

1. non-directional Two-tailed is when H1 : μ ≠ 7. Now the focus will be on results both smaller than and
2. two-tailed larger than 7.
3. directional
4. (1) and (2) are both correct

38 When applying a statistical test, the p-value represents 3 Tut201 The observed results are the values which you find in your sample(s) of data,
the probability of observing the ______ 2014 for example the sample mean and sample standard deviation, or (if it is
Q10 relevant), the correlation coefficient which you calculated.
1. sample statistic under the alternative hypothesis
2. population parameter under the null hypothesis P78-82 The p-value shows you the probability of seeing some relationship among
3. sample statistic under the null hypothesis these variables based on your calculations (such as a difference between
4. population parameter under the alternative means or a high correlation), if in fact this observed relationship is merely the
hypothesis consequence of chance (in other words, if the null hypothesis was true). You
are in fact comparing the observed relationships in the data with what you
would expect if the null hypothesis is true by calculating a relevant test
statistic.

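The p-value is the probability of obtaining the observed sample result (or a more
extreme one) if the NULL hypothesis is true. It is not the probability that the null
hypothesis itself is true. You test the H0 using SAMPLE data.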

This test statistic can then be used to find the p-value if we know the
probability distribution of the test statistic. If this probability is small, the
sample result would be unlikely if the null hypothesis were true, which counts as
evidence against the null hypothesis.

Here is a summary of the important points regarding the p-value:


• The p-value gives the probability of obtaining the sample result under
H0.
• If the p-value is very small, the probability is very small that the sample
result would occur under H0, and one should consider rejecting H0 in
favour of H1.
• The smaller the p-value, the stronger the evidence against the null hypothesis
and in favour of the alternative hypothesis.
39 The hypothesis "H1 : μ < 30" is a ______ hypothesis and 3 P75 Directional because H1 : μ < 30. It is also one-tailed because it only focuses on
requires a ______ statistical test values smaller than 30 and not on values larger than 30 as well.

1. non-directional, one-tailed Two-tailed is when H1 : μ ≠ 30. Now the focus will be on results both smaller than and
2. directional, two-tailed larger than 30.
3. directional, one-tailed
4. non-directional, two-tailed

40 An alpha level of 0.05 indicates that ______ 1 P82-83 The decision rule for H0 is simply as follows:
If the probability (p-value) of the sample result is smaller than α (alpha) (i.e. if
1. if H0 is true, the probability of falsely rejecting it is the p-value < α), the null hypothesis is rejected. If the p-value is not smaller
limited to 0.05 than α (i.e. if the p-value ≥ α), the null hypothesis is not rejected.
2. 95% of the time chance is operating
3. the probability of a Type II error is 0.05 The α-value specifies the maximum risk that we are willing to take of making
4. the probability of a correct decision is 0.05 an error if we reject the null hypothesis
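
The decision rule can be written as a small function. This is a minimal sketch; the function name decide is made up for this illustration, and the first call uses the values from Question 28 (p-value = 0.025, α = 0.01).

```python
def decide(p_value, alpha):
    """Apply the decision rule: reject H0 only if the p-value is smaller than alpha."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must lie between 0 and 1")
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie between 0 and 1")
    return "reject H0" if p_value < alpha else "do not reject H0"


print(decide(0.025, 0.01))  # Question 28: p-value is not smaller than alpha
print(decide(0.025, 0.05))  # same p-value, less strict alpha
```

The same p-value leads to different decisions under different alpha levels, so the level of significance has to be chosen before the result is interpreted.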

41 If alpha is changed from 0.05 to 0.01, the ______ 4 When alpha is reduced, the probability of a Type I error (α) decreases and the
probability of a Type II error (β) increases.
1. probability of a Type II error decreases.
2. probability of a Type I error increases SG 82- An error of Type I is the error we make if we reject the null hypothesis when
3. error probabilities stay the same but the probability 86 we should not have done so, and the level of significance (α) represents the
that we will retain a false H0 increases greatest risk of doing this that we are willing to take.
4. probability that we will retain a false H0 increases
An error of Type II is the opposite of Type I: we fail to reject the null
hypothesis when we should have rejected it.

P85 Generally, though, the smaller α, the larger β. If we wish to avoid Type I
errors, we set α to a small value such as 0.01 or even 0.001, but if we want to
avoid Type II errors, we could set α to a larger value.
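
The trade-off between α and β can be illustrated with a one-tailed z-test. This is a sketch under stated assumptions: it borrows the memory span scenario (μ = 7, σ = 2) from Questions 35 to 37, but the true mean of 6 for dyslexic adults and the sample size of 16 are hypothetical values chosen only for illustration, and the function name type_ii_error is made up. It assumes Python 3.8 or later for statistics.NormalDist.

```python
from statistics import NormalDist


def type_ii_error(alpha, mu0, mu_true, sigma, n):
    """Beta for a one-tailed z-test of H0: mu = mu0 against H1: mu < mu0."""
    se = sigma / n ** 0.5
    # Sample means below this critical value lead to rejecting H0.
    critical_mean = NormalDist(mu0, se).inv_cdf(alpha)
    # Beta is the chance that the sample mean does NOT fall below the
    # critical value when the true population mean is mu_true.
    return 1 - NormalDist(mu_true, se).cdf(critical_mean)


# Hypothetical true mean of 6 and sample size of 16.
beta_at_05 = type_ii_error(0.05, mu0=7, mu_true=6, sigma=2, n=16)
beta_at_01 = type_ii_error(0.01, mu0=7, mu_true=6, sigma=2, n=16)

print(f"alpha = 0.05: beta = {beta_at_05:.3f}")
print(f"alpha = 0.01: beta = {beta_at_01:.3f}")
```

With everything else held constant, lowering α from 0.05 to 0.01 raises β, which is what option 4 describes: the probability of retaining a false H0 increases.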
42 If the alternative hypothesis states that alcohol affects 2 P73-75 The null hypothesis will always contain equal signs so in this case "alcohol
short-term memory, the null hypothesis states that has no effect on short-term memory"

1. alcohol does not decrease short-term memory H0 is defined as the hypothesis of no effect.
2. alcohol has no effect on short-term memory
3. alcohol decreases short-term memory • The null hypothesis (H0) represents the status quo or the current belief in
4. all of the above a situation. The null hypothesis will always contain equal signs.
• The alternative hypothesis (H1) is the opposite of the null hypothesis and
represents a research claim or specific inference you would like to prove.
This means that the alternative hypothesis takes the sign of the test
depending on the situation.
o If we are testing the difference, H1 is indicated with ≠.
o Otherwise we can use signs like less than (<) or greater than (>)
depending on the problem statement.
• If you reject H0, you have statistical proof that the alternative is correct.
• If you do not reject H0, you have failed to prove that the alternative
hypothesis is correct. Failure to prove the alternative hypothesis does not
necessarily mean that the null hypothesis is true.
• The null hypothesis (H0) always refers to a specific value of a parameter
(such as μ), not a statistic (such as ẋ). This value is always known or will
come from the given scenario.

43 When the results are statistically significant, this means 4 P82-83 This question examines judgement using the p-value (results are statistically
that ______ significant).

(a) the obtained probability is equal to or less than We generally would reject the null hypothesis when the p-value is less than
alpha the level of significance (α), therefore (a) and (c) are correct.
(b) the independent variable has had a large effect
(c) we can reject H0 The dependent variable is the one that is predicted or explained, and the
independent variable is manipulated to see how it affects the dependent
1. (a) is correct but neither of the other statements variable. A significant result does not show that this effect is large, so statement (b) need not be true.
2. (b) and (c) are correct but not necessarily (a)
3. (a) and (b) are correct but not (c)
4. (a) and (c) are both correct but not necessarily (b)
44 A researcher draws a single random sample from a 1 P102- In this case the population standard deviation is unknown. So we use the t-
population to test his hypothesis about the mean 106 test (t).
population score on a psychological test. Scores on
this test are distributed normally in the general The important point is that - as in the case of the z-distribution - the t-
population with a known mean but an unknown distribution is a statistical distribution with a probability distribution that can be
standard deviation. Which test statistic should the determined, which means that we can use it to predict the chances of
researcher calculate to test his hypothesis? obtaining specific outcomes when testing for comparisons of means when the
population standard deviation σ is unknown.
1. The t-statistic for the mean of a single sample
2. The z-statistic for the mean of a single sample
3. The standard deviation of the sampling distribution
of the mean of a single sample
4. The t-statistic for independent groups

Base your answers to Questions 45 to 48 on the following scenario:

A researcher hypothesizes that chess-playing students are better at non-verbal reasoning than students in general. He draws a random sample of 25
students from the members of the chess clubs of South African universities and measures their non-verbal reasoning ability by means of a test developed
for this purpose. The scores of a large group of students on this test were found in earlier research to be distributed normally with a mean of 20. Suppose
the researcher finds that the mean score of his sample is 22.3 and the standard deviation of the scores is 6.0

45 Which research design did the researcher use? 1 P100- He drew one random sample which he is comparing to the general population
106
1. Single-sample groups design
2. Two-groups design
3. Two-groups design with a known population mean
4. A correlational design

46 Which are the appropriate statistical hypotheses for 2 H0 : μ = 20
testing the researcher's hypothesis? H1 : μ > 20.
The researcher hypothesised that chess-playing students are better at
1. H0 μ is not equal to 20, H1 μ is larger than 20 non-verbal reasoning than students in general, therefore H1 : μ > 20
2. H0 μ equals 20, H1 μ is larger than 20
3. H0 μ equals 20, H1 μ is not equal to 20
4. H0 μ equals 20, H1 μ is smaller than 20
47 Which is the appropriate test statistic to calculate? 3 P102-106 The t-statistic for the mean of a single sample. This is because the population
standard deviation is unknown. What is given (s = 6) is the standard deviation of
1. The z-statistic for the mean of a single sample the scores in the sample of 25.
2. The t-statistic for the difference between the means
of two independent samples So we have to use the t-test (t)
3. The t-statistic for the mean of a single sample
4. The Chi Square test statistic The important point is that - as in the case of the z-distribution - the t-
distribution is a statistical distribution with a probability distribution that can be
determined, which means that we can use it to predict the chances of
obtaining specific outcomes when testing for comparisons of means when the
population standard deviation σ is unknown.

48 Which of the following values most accurately reflects 1 P104-105
the correct result when calculating the test statistic? Formula: t = (ẋ - μ) / (s / √n)

1. 2.3/1.2
2. 2.3/5 where: ẋ = 22.3
3. -2.3/1.2 μ = 20
4. -2.3/5 s = 6
n = 25

so: t = (22.3 - 20) / (6 / √25) = 2.3 / (6 / 5) = 2.3/1.2

49 Two samples can be considered independent when 1 P110 Samples are considered as comprising independent groups if the
______ composition of the one sample in no way affects, in any systematic way,
the composition of the other sample. The two samples come from two
1. the composition of one sample is not systematically groups that have no obvious relationship. For example, where one sample is
related to the composition of the other one measurements of a construct like 'self-esteem' among men, and the other
2. the samples are drawn under different experimental among women, but both groups were sampled purely randomly.
conditions
3. one sample comes from a treatment or On the other hand, the concept of dependent groups refers to situations
experimental group while the other comes from a where the samples are related, and it implies that each subject in one group
control group can be systematically paired off with a subject from the other group. For this
4. care was taken that the samples are drawn at reason, a dependent groups research design is often referred to as a
random matched-pairs design.

Base your answers to Questions 50 and 51 on the following scenario:

A researcher wants to validate a new depression scale where a high score indicates a high incidence of depression. She applies it to a sample of 40
patients diagnosed with depression and a control group of 40 persons who were judged not to suffer from depression by a panel of clinical psychologists.

50 Which is an appropriate alternative hypothesis to test 2 A researcher wants to validate a new depression scale where a high score
the validity of the depression scale based on group indicates a high incidence of depression.
mean values?
So the mean score of the depression group must be larger than that of the control group
1. μDepression ≠ μControl
2. μDepression > μControl Therefore: μDepression > μControl
3. μDepression < μControl
4. The population mean of the difference scores
equals zero

51 Which of the following would be the most appropriate 3 P110 Samples are considered as comprising independent groups if the
statistical test to determine whether a significant composition of the one sample in no way affects, in any systematic way, the
difference exists between the scores for the two groups composition of the other sample. The two samples come from two groups
(measuring depression and non-depression scores)? that have no obvious relationship. For example, where one sample is
measurements of a construct like 'self-esteem' among men, and the other
1. A test for a correlation coefficient among women, but both groups were sampled purely randomly.
2. The t-test for dependent samples
3. The t-test for independent samples On the other hand, the concept of dependent groups refers to situations
4. The chi-square (χ²) test where the samples are related, and it implies that each subject in one group
can be systematically paired off with a subject from the other group. For this
reason, a dependent groups research design is often referred to as a
matched-pairs design.

SG P140 The chi-square test is usually used when you have a cross tabulation of
frequency counts of events which are nominal scale measurements. This
Tut202 table is referred to as a contingency table. It is used to compare an observed
2014 frequency distribution (frequency counts based on a sample of observations)
Q22 with the frequency distribution which we would expect to find if the null
hypothesis of no relationship between two cross-tabulated variables were
true.
52 A researcher wants to determine whether a significant 4 The effect size is used to assess whether a significant effect is meaningful from a
difference exists in the scores on a test of creativity practical point of view.
between a group of students who are studying for a
BSc degree and a group of students who are studying P87 The implication is that we have to be careful how we interpret significant
for a BA degree. She finds a mean score of ẋ1 = 20.4 results. A p-value smaller than our chosen level of significance (α) simply
for the BSc students and a mean score of ẋ2 = 22.3 for implies that, relative to this sample, it is improbable that the effect we see in
the BA students. A t-test shows that this difference is our observations is purely due to chance. It does not imply that the effect is
statistically significant. In spite of the significant big or important. This is something that we have to decide by looking at what
difference, the researcher feels that the difference the data means. One way that statisticians have suggested to deal with this
between the two means is too small to be really problem is by the notion of effect size. Different procedures exist to determine
important. What calculation can she do to confirm this the effect size of a result. In the case of a comparison between means, one
suspicion? way of calculating this is by the use of Cohen's d. We do this by expressing
the mean difference that we observed relative to the standard deviation:
d = (mean difference) / (standard deviation)
1. Level of significance
2. Variance
3. Degrees of freedom
4. Effect size

53 A researcher plans to use the t-test to compare two 2 P115 In order to use the t-test (tc) statistic, we need to make two assumptions
independent samples of data with 20 individuals each. regarding the data:
• that the two populations being compared are normally distributed
Consider the following assumptions that may be • with the same variance (or standard deviation).
relevant here
a) The sample standard deviations have to be equal (Remember that the square root of the variance is equal to the standard
b) The data from both samples has to come from deviation.)
populations that are normally distributed
We can also assume that the samples are independent - since the samples
What minimum assumptions from the ones given were selected randomly, we can safely consider them to be independent of
above need to be met before she may proceed? each other. All of this makes the tc-test an appropriate test.

1. At least one of (a) or (b) must be true P116 Note: Even the most elementary statistics program makes provision for
2. (a) and (b) must both be true performing t-tests. Such programs usually require that we indicate which
3. Neither (a) nor (b) is relevant but other assumptions variable should be used to identify the two groups and which is the
exist that will have to be considered dependent variable. In addition, we have to choose between a tc test for
4. The t-test should never be used with such a small independent samples or a td test for dependent or correlated groups
sample
54 In which of the following cases should the scores being 2 P117- Often the two samples are not 'independent'. This happens when each
investigated be regarded as dependent when a test for 118 subject in one sample is matched with regard to some characteristic (usually
significance is selected? a nuisance or external variable that we wish to control) to a particular subject
in the other sample. The samples are dependent if each measurement of
1. The variables represent exam scores of children a variable for a particular case can be paired with the measurement of a
from two schools, matched on demographic criteria matching case in the other sample. The implication is that the two samples
like grade and gender will always have to be of the same size (that is, n1 = n2). This design is,
2. The variables represent scores from subjects on a therefore, often referred to as a matched-pairs design. This implicit matching
motivational scale, who were tested before and usually causes the scores to be correlated (see Topic 6 for the meaning of
after listening to a presentation by a motivational this term).
speaker
3. The scores on a test for mathematical ability and a A typical example would be if the same research participants are measured
test for attention span twice, once before and again after an intervention. From the point of view of
4. The variables represent frequency counts for research design, we would refer to this type of comparison as a two-sample
gender and favourite colour, cross-classified in a repeated measures design.
contingency table

Base your answers to Questions 55 to 57 on the following scenario:

A psychologist develops a series of workshops providing assertiveness training to a group of persons who suffer from low self esteem. To test the efficacy
of the workshops, she applies a psychometric test which measures level of self esteem to 50 persons at the start and again after the end of the series of
workshops, predicting that the latter scores will be higher (reflecting higher self esteem). The self esteem scale was standardised on the general
population with a mean score of 30 and a standard deviation of 10.

55 Which constructs are related to one another by the 2 P117- To test the efficacy of the workshops, she applies a psychometric test which
research hypothesis? 118 measures level of self esteem to 50 persons at the start and again after the
end of the series of workshops
1. Attending a workshop of assertiveness training, self
esteem
2. Self esteem before a workshop; self esteem after a
workshop
3. Self esteem in the treatment group, self esteem in
the general population
4. Level of assertiveness; level of self esteem
# Question Ans Page Comments
56 Which is an appropriate null hypothesis for the analysis 1 If they give you POPULATION values you MUST use them!!!
of the results? (The self esteem scale was standardised on the general population with a
mean score of 30 and a standard deviation of 10.)
1. μ = 30
2. μ1 = μ2 So here you will need to do td first (for the pre-test post test design) and then
3. The population mean of the difference scores z (for the single sample groups design) to test your hypothesis.
equals zero
4. μ1 ≠ μ2 If they didn't give you the population mean then option 3 would be correct.

57 Which is the appropriate test statistic to calculate? 2 In the previous question both td and z tests were performed, but the test is
done on dependent samples, therefore options 1 and 3 are incorrect.
1. The z-statistic for the difference between the means
of two independent samples The appropriate test statistic to calculate is the t-statistic for the difference
2. The t-statistic for the difference between the means between the means of two dependent samples
of two dependent samples
3. The t-statistic for the difference between the means Having said that, option 1 was incorrectly phrased. It should have read "The
of two independent samples z-test for single sample groups design". In this case option 1 would be the
4. A test of the correlation coefficient for the two sets correct answer since this is a single group and the population standard
of scores deviation was given. If σ is given, you have to use it for your tests and
therefore the z test must be performed

58 A researcher wants to compare the cognitive 1 P114- The t-test is performed before the p-value can be determined. Only if the p-
development of two groups of children using the mean 116 value is smaller than the level of significance (α) should the null hypothesis
score of each group to test the following hypotheses (H0) be rejected.
H0: μ1 = μ2
H1: μ1 ˃ μ2 Therefore: She needs to find the relevant p-value before making any
conclusion
Her results derived from a random sample from each
group of children shows that the mean sample score
on a scale which measures level of cognitive
development for the first group is less than the mean
sample score for group two (i.e. x̄1 < x̄2). What may
she conclude?

1. She needs to find the relevant p-value before
making any conclusion
2. She can reject H0
3. She will not be able to reject H0
4. She needs to calculate a t-test before any
conclusion is made
59 A researcher wants to test the following hypotheses 1 P78-81 The relationship between one-tailed and two-tailed p-values can be
H0: μ1 = μ2 summarised as follows:
H1: μ1 ˃ μ2 • One-tailed p-value = (two-tailed p-value) / 2
• Two-tailed p-value = (one-tailed p-value) x 2
On the basis of data provided, the output from a
computer programme indicates that a t-value of t = -2.3 The important point to remember is that the p-value indicates more or less
was found, with the p-value for a non-directional test how likely the particular result we have observed in our data is if the null
(two-tailed) given as p=0.07. What should the hypothesis were true; or, as we say, 'under the null hypothesis'.
researcher do to evaluate this result?
Because the p-value is for a non-directional (two-tailed) test, given as p=0.07, it
1. Divide 0.07 by 2 before comparing it with the pre- must be divided by 2 before comparing it with the pre-selected alpha level
selected alpha level
2. Multiply 0.07 by 2 before comparing it with the pre-
selected alpha level
3. Compare the computed p-value as given with the
pre-selected alpha level
4. A calculation error was made since a t-value cannot
be less than 0

60 Correlation is used in data analysis when one 3 P129- Correlation: measuring the association between variables
investigates the relation between ______ 130
Correlation is a measurement of the extent to which a measurement on
1. the mean of a single sample of subjects and a one variable is related to a measurement on another variable for the
population mean same sample of individual cases.
2. two groups of subjects, with respect to a single
variable This can be visualised by way of a graphical representation called a scatter
3. two variables measured on the same group of plot. A scatter plot is a graph that represents the measurements of two
subjects variables on two perpendicular axes, usually called the x-axis (horizontal axis
4. two variables from independent samples or abscissa) and the y-axis (vertical axis or ordinate).

61 A positive correlation between variables X and Y 2 P133 If a correlation exists, the way in which one variable varies will be related to
implies that persons scoring low on X will generally variation on the other one.
score ______ on Y
A negative correlation implies that as one variable changes, the other
1. high changes in the opposite direction. A high value on X will imply a low value on
2. low Y, while a low value on X will be matched by a high value on Y.
3. either high or low
4. in a totally unpredictable way Conversely, if the correlation is positive, the variable values will
generally vary in the same direction (both high or both low).
62 Which of the values given below is the best estimate of 1 P132- As variable X increases (from -2 to 2), variable Y decreases (from 2 to -2).
the Pearson correlation coefficient between the 133 This implies a negative correlation.
following values of X and Y?
Y also changes by exactly the same amounts as X, so we have a perfect negative
X -2 -1 0 1 2 correlation which is -1

Y 2 1 0 -1 -2 We use 'r' as the symbol that represents a correlation coefficient (as in the
case of the Pearson product-moment correlation coefficient), and the
1. -1 following applies:
2. 0 • r = 1 implies a perfect positive linear relationship (the dots in a scatter plot
3. +1 will run from lower left to upper right in a perfectly straight line)
4. 0.5 • r = 0 implies no linear relationship at all (the dots may be scattered all
over the place)
• r = -1 implies a perfect negative linear relationship (the dots will run from
upper left to lower right in a straight line)
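The value r = -1 for Question 62 can be checked numerically. The sketch below is only an illustration: the helper name `pearson_r` is an assumption (it is not from the study guide), and it applies the usual deviation-score form of the Pearson product-moment coefficient to the five X and Y values given in the question.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviation scores
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)

# Values taken from Question 62
x_scores = [-2, -1, 0, 1, 2]
y_scores = [2, 1, 0, -1, -2]

r = pearson_r(x_scores, y_scores)
print(r)  # a perfect negative linear relationship
```

Because every Y value is exactly the negative of its X value, the dots fall on a straight line from upper left to lower right, which is why the coefficient reaches its minimum.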
63 A researcher hypothesizes that the drug treatment of 1 SG P137 The symbol ‘ρ’ (the Greek letter ‘rho’) is used to represent the population
hospitalised schizophrenic patients improves their parameter being tested when you calculate the Pearson’s correlation
mental alertness. He studies a random sample of 27 Tut202 coefficient ‘r.’ That is, you calculate r for the sample, then have to decide
such patients and finds a correlation coefficient of 0.6 2014 whether this is likely to represent a significant linear correlation between two
between the number of days of drug treatment and Q13 variables for the whole population (with this population correlation symbolised
patients’ scores on the Mental Alertness Test. Which is by ρ), by looking at the p-value associated with this calculated sample
an appropriate null hypothesis for this research? statistic r.

1. ρ=0 In a similar way ‘μ’ represents the population parameter for a mean,
2. μ=0 and ‘σ’ the population parameter for a standard deviation. These two are not
3. r=0 applicable in this question.
4. μ1 = μ2
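As a rough illustration of how the sample value r = 0.6 (n = 27) from Question 63 could be tested against H0: ρ = 0, the sketch below converts r to a t-value using the standard textbook conversion t = r√(n − 2) / √(1 − r²) with n − 2 degrees of freedom. This conversion is an assumption added here for illustration; the study guide itself works from the p-value reported alongside r, and the function name `t_from_r` is hypothetical.

```python
from math import sqrt

def t_from_r(r, n):
    """Convert a sample correlation r (sample size n) to a t-value and its df."""
    df = n - 2
    t_value = r * sqrt(df) / sqrt(1 - r ** 2)
    return t_value, df

# Values taken from Question 63
t_value, df = t_from_r(0.6, 27)
print(round(t_value, 2), df)
```

The resulting t-value would then be turned into a p-value (from a table or software) and compared with the chosen alpha level.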
64 The table below gives the number of persons observed 4 P143-144
to be in each of the categories in a cross classification
of gender (male/female) and place of residence
(rural/urban). What would the expected value be for
persons classified as both 'urban' and 'male', if no
relationship exists between gender and place of
residence?

               Male   Female   Row Total
Urban          6      4        10
Rural          6      8        14
Column Total   12     12       24

1. 24
2. 6
3. 10
4. 5

It is important to note that the relation between the variables is described by
the cell frequencies and not by the row or column frequencies. These cell
frequencies represent the way the information is distributed relative to the two
variables. They are often referred to as the observed or empirical cell
frequencies.

To find the expected frequency for a particular cell, the row total for that row
is multiplied by the column total for that column and this result is then divided
by the overall total. These expected frequencies show what the results would
have been like if the distribution of frequencies through the cells were
homogeneous, in proportion to the respective row and column totals. If the
observed frequencies correspond precisely with the expected frequencies,
we know that the null hypothesis cannot be rejected. But the observed
frequencies will seldom be precisely equal to the expected frequencies - even
if H0 is not rejected - because of sampling error.

It is the differences between these expected and observed frequencies that
interest us, that is, we want to know how far the actual (observed) results are
removed from the expected situation, if there is no interaction effect.

Row total (O.1) = 10
Column total (O1.) = 12
Sample total (size) (O..) = 24

E11 = (Row total x Column total) / Sample total
E11 = (O.1 x O1.) / O.. = (10 x 12) / 24 = 120 / 24 = 5
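The expected-frequency rule (row total x column total / overall total) can be applied to every cell of the Question 64 table at once. The sketch below is a minimal illustration using plain dictionaries; the variable names are assumptions chosen for readability and do not come from the study guide.

```python
# Observed frequencies from Question 64 (rows: residence, columns: gender)
observed = {
    "Urban": {"Male": 6, "Female": 4},
    "Rural": {"Male": 6, "Female": 8},
}

# Marginal totals
row_totals = {row: sum(cells.values()) for row, cells in observed.items()}
col_totals = {}
for cells in observed.values():
    for col, count in cells.items():
        col_totals[col] = col_totals.get(col, 0) + count
grand_total = sum(row_totals.values())

# Expected frequency per cell = (row total x column total) / overall total
expected = {
    row: {col: row_totals[row] * col_totals[col] / grand_total for col in col_totals}
    for row in observed
}

print(expected["Urban"]["Male"])  # the cell asked for in Question 64
```

The same loop works unchanged for larger contingency tables, since it only relies on the row and column totals.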
65 A researcher wants to determine whether a 2 P130 Correlation is a measurement of the extent to which a measurement on one
relationship exists between students’ general level of variable is related to a measurement on another variable for the same
anxiety and their exam marks. He presents each sample of individual cases. This can be visualised by way of a graphical
student from a random sample with a general anxiety Tut202 representation called a scatter plot. A scatter plot is a graph that represents
scale just before they are to write an important exam. 2014 the measurements of two variables on two perpendicular axes, usually called
Which of the following is the most appropriate test Q22 the x-axis (horizontal axis or abscissa) and the y-axis (vertical axis or
statistic to use to determine whether a relationship ordinate).
exists between the two variables (anxiety level and
exam results)? P132 Correlation coefficients that measure the linear relationship between two
variables, such as the Pearson product-moment correlation coefficient, can
1. t-test for independent samples have a continuous value that ranges from -1 to 1 (a positive value is usually
2. Pearson's r test statistic written without the sign, so '1' is presumed to mean '+1'). We use 'r' as the
3. t-test for dependent samples symbol that represents a correlation coefficient
4. Chi-square test (χ²)
P136- The Pearson correlation coefficient is really a descriptive statistic: it describes
141 the relationship between two variables.

P144 The chi-square test is usually used when you have a cross tabulation of
frequency counts of events which are nominal scale measurements. This
table is referred to as a contingency table. It is used to compare an observed
frequency distribution (frequency counts based on a sample of observation)
with the frequency distribution which we would expect to find if the null
hypothesis of no relationship between two cross-tabulated variables were
true. The Pearson chi-square test statistic is a calculation of the difference
between the observed and expected frequencies, and is used with qualitative (categorical) data.

66 Which of the following is the appropriate formula for the 1 P144- The Pearson chi-square test statistic, is a calculation of the difference
Chi square test? 145 between the observed and expected frequencies.

The formula is: χ² = Σ [(O − E)² / E], where O is the observed and E the expected frequency of a cell.

This means the expected value for each cell in the contingency table is
subtracted from the observed value for that cell, squared, and divided by the
expected value for that cell.

Then all of these terms are added together to yield χ².
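The steps described for the chi-square statistic (subtract, square, divide by the expected value, then add up) can be sketched in a few lines. The example reuses the 2 x 2 gender by residence table from Question 64, with expected values worked out by the row total x column total / overall total rule; the function name `chi_square` is an assumption made for this illustration.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all cells, given two flat lists of cell values."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Cells in the order: urban-male, urban-female, rural-male, rural-female
observed_cells = [6, 4, 6, 8]
expected_cells = [5.0, 5.0, 7.0, 7.0]

chi2 = chi_square(observed_cells, expected_cells)
print(round(chi2, 3))
```

A value close to zero, as here, means the observed counts lie close to what would be expected if the two variables were unrelated.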



Base your answers to Questions 67 and 68 on the following scenario:

A psychologist reads an article in which the author claims that playing computer games leads to higher levels of aggression in children. She decides to
test this by asking a sample of children to report the number of computer games they play per month and measuring the aggression level of each child
with an appropriate psychometric test. She expects to find that a positive correlation will exist in her sample between level of aggression and number of
computer games played

67 The researcher draws a graph of the relationship 1 A graph showing the position of each of a number of sampling units on each
between aggression and number of computer games. of two variables
Which of the scatter plots below give the most
probable representation of the data if the expected P130- A scatter plot is a graph showing the relationship between two numerical
relationship exists? 132 variables. In such a graph the data of the one variable are plotted on the
horizontal axis (usually referred to as the X axis), and the data of the other
variable on the vertical (or Y) axis. It is not a comparison of sample and
Tut202 population, nor has it to do with spread of data or the independence of
2014 variables
Q18
The closer the dots in the plot are to a straight line, the closer the correlation
coefficient is to 1 (it can be either a positive number (+1) or a negative
number (-1)). The more arbitrary or spread out the dots, the closer the
correlation coefficient is to 0. If the plot seems to form a line from lower left to
1. Graph A upper right, the correlation is positive. On the other hand, if the line runs from
2. Graph B upper left to lower right, the correlation is negative.
3. Graph C
4. Graph D She expects to find that a positive correlation will exist in her sample
between level of aggression and number of computer games played.

Therefore the plot must form a line from lower left to upper right
68 The researcher calculates the Pearson product 4 P130- She expects to find that a positive correlation will exist in her sample
moment correlation coefficient of the relationship 132 between level of aggression and number of computer games played.
between level of aggression and number of computer
games played. Which of the following expressions best This implies that the correlation coefficient must be between 0 and 1 (that is, greater than 0)
represents the relationship if the expectations of the
researcher about the relationship are true?
Therefore: r > 0
1. r≠0
2. r=0
3. r<0
4. r>0

Base your answers to Questions 69 and 70 on the following scenario:

A sample of 300 clients is drawn from three community mental health centres (indicated in the table as A, B and C). Counts are made of those clients
who are diagnosed as having social adjustment problems, those with problems related to anxiety, and the remaining clients are classified under 'other
problems'. Counts of the number of clients from the different centres which fall in each of the categories are supplied below.

                             Mental Health Centre (columns)
By Type of Problem (rows)    A     B     C     Row Totals
Social adjustment problems   50    40    40    130
Anxiety related              26    34    20    80
Other                        24    26    40    90
Column totals                100   100   100   300

69 What is the type of arrangement of data above called? 2 SG A contingency table is a table indicating the number of individual objects
P142- falling in each cell of cross-tabulated data. In other words, it is a two-
1. Histogram 144 dimensional table in which each observation is classified in terms of two
2. Contingency table categories simultaneously.
3. Correlation matrix
4. Classification table
70 A researcher wants to establish whether the types of 1 P140 The chi-square test is usually used when you have a cross tabulation of
diagnoses made differ significantly among the P144- frequency counts of events which are nominal scale measurements. This
different mental health centres or not. Which of the 145 table is referred to as a contingency table. It is used to compare an observed
following would be the most appropriate statistical test frequency distribution (frequency counts based on a sample of observation)
to use? Tut202 with the frequency distribution which we would expect to find if the null
2014 hypothesis of no relationship between two cross-tabulated variables were
1. The Chi-square (χ²) test Q22 true.
2. A test of the correlation coefficient
3. The t-test for two samples
4. The z-test for two samples
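To make the chi-square answer for Question 70 concrete, the sketch below computes the statistic for the 3 x 3 table of problem type by mental health centre given in the scenario. It is a minimal illustration only: the degrees-of-freedom rule (rows − 1) x (columns − 1) is the standard one and is not quoted in this section, and deciding significance would still need a chi-square table or software.

```python
# Observed counts (rows: problem type; columns: centres A, B, C)
observed = [
    [50, 40, 40],  # social adjustment problems
    [26, 34, 20],  # anxiety related
    [24, 26, 40],  # other
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        # Expected count if problem type and centre were unrelated
        e = row_totals[i] * col_totals[j] / grand_total
        chi2 += (o - e) ** 2 / e

# Standard degrees of freedom for a contingency table
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(round(chi2, 2), df)
```

The computed value is then compared with the critical chi-square value for these degrees of freedom at the chosen alpha level.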
Oct/Nov 2013

# Question Ans Page Comments


1 Research is called empirical research when ______ 4 P2 All scientific knowledge begins with description of the phenomena being
studied, based on careful observation. Knowledge based on observation of
1. descriptive statistics are calculated from data physical events is referred to as empirical knowledge (as distinct from
2. use is made of inferential statistics knowledge based on contemplation, unexplained insights, mystical
3. hypotheses are carefully formulated and tested experiences or claims by authority figures).
using statistical tests
4. observations or measurements are made of objects
or entities being studied

2 In science, including social science, the word ‘theory’ 2 SG P4 A theory is a framework for facts: it is the explanation of why the facts (i.e.
refers to ______ observations, measurements, phenomena) are as they are, or are related in
Tut201 the way in which they are related, based on empirical investigations.
1. a plausible guess based on one’s previous 2013 Q7
knowledge about a phenomenon Option 1 is a description of a hypothesis, but this is often how the word
2. an explanation of why a phenomenon appears as it ‘theory’ is used in informal conversation.
is observed to be
3. an explanation of the procedure by which a
construct should be measured
4. the process where independent variables are varied
to see how they affect the dependent variables
3 Inferential statistics refer to ______ 2 P2 An inference is a conclusion that follows from existing information, by
generalising from the specific information to the general type of phenomenon,
1. calculating statistics which summarise the data where the conclusion is not absolutely certain. So in summary inferential
2. using probability theory to make conclusions based statistics are techniques for making generalisations based on imperfect
on observations of data P10-11 numeric data, where the conclusions have a high probability of being true, but
3. the process of converting general research you can never be completely certain.
questions into specific formal hypotheses
4. the process of finding a way to measure an abstract A distinction exists between inferential statistics and descriptive statistics.
construct Descriptive statistics refers to a set of quantities used to summarise
aspects of numerical data. Examples that you may be familiar with are
means, range, variance and standard deviation (see Appendix C for a quick
introduction). These summary quantities are sometimes referred to as
parameters (when they refer to the whole collection or population of data; see
section 1.4.3 below).

Inferential statistics refers to the use of statistical techniques to make
generalisations about the relationships among (two or more) variables. Here
the patterns that may exist in the data are carefully investigated.

4 When doing research, the term 'Operationalisation' is 3 P24-26 Operational definitions of psychological constructs should define constructs in
used to refer to the process of ______ terms of observable behaviour.

1. calculating a test statistic to test a particular "Operational'' refers to practical procedures by which constructs are made
hypothesis visible.
2. converting a general research question into a
formal statistical hypothesis "Operationalisation" is where you make the construct (which is usually an
3. determining a way to get a numeric measurement abstract concept, so it is difficult to observe it clearly) visible by finding some
of a construct which is being measured suitable way to measure it.
4. converting a calculated test statistic into a
probability value called the p-value

5 In social science research, the total collection of 4 P10 When several measurements are collected from a number of people, the
measurements across a group of research participants collected information is referred to as the data (while a single item of
is referred to as ______ information is a datum). Data are all the variables for all the cases in the
research.
1. descriptive statistics
2. parameters
3. sample statistics
4. data
6 Psychological measurements are always imperfect. 1 P14-15 One of the consequences of using samples to represent populations is that
The way in which a measurement varies around its this always leads to a certain degree of measurement error, no matter how
‘true’ value is referred to as ______ rigorous our sampling procedure is. Another source of measurement error
lies in the fact that our measurements are imprecise, that the measurement of
1. measurement error a psychological construct is only more or less accurate. This measurement
2. variance error is a kind of hidden variable, which we always presume to exist in social
3. hidden variables scientific research. This is referred to as the error component or the error
4. standard deviation term.

This is one of the major reasons for using statistical probability theory in our
data analysis: we assume that any variable we measure contains a 'true'
element and an 'error' component. Furthermore, we assume that the mean of
the error component is zero. We can do this because it is reasonable to
assume that positive deviations and negative deviations from the perfect
score (measurements that are too high or too low) will cancel each other out.
We also need to make an additional assumption, namely, that these error
terms are distributed around this mean of zero in a normal distribution

Variance (s²) and standard deviation (s) both describe the spread of scores (s is
the square root of s²), so options 2 and 4 cannot both be correct.
7 A social science researcher is told by a grade 1 teacher 2 P3 Constructs and their interrelations (how they affect each other, their patterns
that some children are terribly shy while other children of interaction) are used in this way to develop theoretical explanations of
seem to be quite comfortable in the social group. The why people behave in certain ways in certain contexts, or why mental
researcher decides to investigate, using a test for phenomena appear to be as they are. Psychologists try to develop
shyness which was developed especially for young explanations for human experiences and behaviour. To do this, they often
children. In this study, ‘shyness’ would be a ______ have to make use of abstract concepts (also called constructs) that serve as
while the measurement of it is referred to as a ______ explanations for the behaviour they observe.

1. variable, construct P7 A construct that has been measured in some way produces a variable.
2. construct, test A variable refers to a number that can take on any one of a range of possible
3. construct, variable values. They can be discrete (when only whole numbers like 1, 2, 3 are
4. concept, scale allowed) or continuous (what mathematicians refer to as 'real numbers'). In
some cases variables also take on values smaller than zero to produce
negative numbers.

So the (visible) variable reflects the intensity of the underlying (invisible)
construct, in terms of how it was measured. We say that the variable is
manifest (it is visible in the sense that we can observe it) and the construct is
latent (it is invisible in the sense that we need some way to make it appear).
So the latent construct is made manifest by the use of an appropriate
measurement procedure.
8 A researcher wants to study the attitude to safety of 4 P11 The entire collection of cases that you are interested in when you make your
workers in the construction industry. She randomly measurements for a particular construct is referred to as the population. The
selects 200 workers from the employment records of population depends on which people or objects or events you are interested
10 major construction companies in the Gauteng in studying.
Province of South Africa. The group which was
selected is referred to as the ______ and the general Because populations can be very large, and we rarely have access to them,
group of construction workers is the ______ we would draw a sample of observations from the population and use that
sample to infer certain things about the population's characteristics. The
1. population, sample most appropriate sample is usually a simple random sample, where each
2. dependent variable, independent variable individual has the same chance of being included. If our samples are not
3. independent variable, dependent variable random, they may lack external validity: it may not be possible to generalise
4. sample, population beyond the group from which we drew the sample.

9 Numeric values which represent some kind of 2 P7 A construct that has been measured in some way produces a variable.
psychological measurement and which can change A variable refers to a number that can take on any one of a range of possible
from one measurement to the next are referred to as values. They can be discrete (when only whole numbers like 1, 2, 3 are
______ allowed) or continuous (what mathematicians refer to as 'real numbers'). In
some cases variables also take on values smaller than zero to produce
1. statistics negative numbers.
2. variables
3. parameters
4. constructs

10 A psychologist is conducting research into hypnosis. 1 P8-9 The dependent variable is the one that is predicted or explained, and the
She believes that a relationship exists between a P24 independent variable is manipulated to see how it affects the dependent
person's suggestibility (susceptibility to hypnosis) and variable.
his or her level of self esteem. In this design,
‘suggestibility’ is the ______ variable and ‘level of self The independent variable is that variable which affects the dependent
esteem’ is the ______ variable variable; or, conversely, the dependent variable depends on the independent
variable.
1. dependent, independent
2. latent, manifest When a researcher focuses on the interaction of only two variables at a time,
3. independent, dependent the dependent variable is usually the one that the researcher is interested in,
4. hidden, operational the variable that is the focus of the research. The independent variable is
something that the researcher manipulates, to see how this affects the
dependent variable (in other words, the dependent variable is dependent on
the independent variable).
11 The symbol ______ is usually used to indicate the 1 P161
mean of a sample, while the mean of the population
from which the sample comes is indicated by the
symbol ______

1. x̄, μ
2. s, σ
3. α, x̄
4. μ, σ

Symbols for each summary value, given for populations (parameters) and samples (statistics):
• Arithmetic mean: μ (population), x̄ (sample)
• Standard deviation: σ (population), s (sample)
• Variance: σ² (population), s² (sample), where s = √s²
• Standard error of the mean (also called the standard deviation of the
  sampling distribution of the mean): σx̄ = σ/√n (population), sx̄ = s/√n (sample)
• Mean of the sampling distribution of the mean (the mean of all means; this
  is equal to the mean under H0): µ
• Z score for means: z
• Correlation between two measurements (Pearson's r): ρ (population), r (sample)
• Proportions: P (population), p (sample)

12 Which best describes “research hypothesis"? 2 Tut201 A psychological hypothesis formulates a testable empirical claim (something
2012 Q8 that can in principle be observed), and this usually involves postulating a
1. A proven relation between two constructs relationship between two or more variables.
2. A proposed relation between two or more variables
3. A network of all the possible relations between P1 A research hypothesis is formed as a clear statement in terms of a
constructs P18-19 relationship among the constructs (and the variables by which they are
4. A scientific theory measured). It is a statement about a possible relationship among constructs
that may explain some set of observations that one intends to investigate.
13 Which of the following does NOT represent a possible 3 P33 The probability value tells us at a glance how frequent or infrequent the event
value for a probability? is, and what the likelihood is of obtaining a favourable outcome associated
with it.
1. 99% • Probabilities can be expressed as percentages (e.g. a 10% probability),
2. 0 as fractions (e.g. a 1/10 probability), or as decimals (e.g. a 0.10
3. -0.05 probability).
4. 1.0 • A probability value represents a proportion (i.e. the proportion of
outcomes supporting the event). A proportion is a decimal number
between 0 and 1 and indicates the fraction of the total.
• We often refer to the probability of an event (or statistic) as its p-value.
• When decimal notation is used to describe probabilities, they fall in a
range between 0 and 1, with values closer to 1 indicating a greater
likelihood (or chance of success) than values close to zero.
• Because probabilities fall in a range from 0.0 to 1.0 when expressed
decimally, a probability can never be higher than 1 or lower than 0. The
general rule is written symbolically as follows: 0 ≤ p ≤ 1.
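The rule 0 ≤ p ≤ 1 can be applied mechanically to the four options in Question 13. The sketch below is only an illustration; the helper name `is_valid_probability` is hypothetical, and the percentage option (99%) has been converted to its decimal form by hand.

```python
def is_valid_probability(p):
    """A probability in decimal notation must lie between 0 and 1 inclusive."""
    return 0 <= p <= 1

# Options from Question 13, keyed by how they appear in the paper
options = {"99%": 0.99, "0": 0.0, "-0.05": -0.05, "1.0": 1.0}

invalid = [label for label, value in options.items() if not is_valid_probability(value)]
print(invalid)  # the option(s) that cannot be a probability
```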

14 A jar contains 5 red, 8 blue, 3 green and 4 yellow 4 P29 Number of possible outcomes = Total marbles = 20 (5+8+3+4)
marbles. What is the probability that a person who is Number of favourable events = Pick one blue marble = 8
blindfolded will choose a blue marble purely at
random?
p(E) = Number of favourable events / Number of possible outcomes
= 8 / 20 = 2/5 = 0.40

1. 0.20
2. 0.25
3. 0.50
4. 0.40

15 Consider the same jar, filled with the same number of 2 P29 Number of possible outcomes = Total marbles = 20 (5+8+3+4)
marbles as in the previous question. What is the Number of favourable events = Pick one red marble = 5 OR
probability that a person would choose either a red Pick one yellow marble = 4
marble or a yellow one?
p(E) = Number of favourable events / Number of possible outcomes
= (5+4) / 20 = 9/20 = 0.45

1. 0.50
2. 0.45
3. 0.25
4. 0.90
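Both marble questions (14 and 15) use the same rule: favourable events divided by possible outcomes. The sketch below works them through with exact fractions from Python's standard `fractions` module; the dictionary name `marbles` is simply an illustrative choice.

```python
from fractions import Fraction

# Contents of the jar as described in Question 14
marbles = {"red": 5, "blue": 8, "green": 3, "yellow": 4}
total = sum(marbles.values())

# Question 14: one favourable colour
p_blue = Fraction(marbles["blue"], total)

# Question 15: either of two mutually exclusive colours, so the counts are added
p_red_or_yellow = Fraction(marbles["red"] + marbles["yellow"], total)

print(p_blue, float(p_blue))
print(p_red_or_yellow, float(p_red_or_yellow))
```

Adding the counts in Question 15 is valid because a single marble cannot be both red and yellow.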
16 A researcher applies a test of creativity on a sample of 1 P53 Score of 8 or more:
fine arts students. She creates the following graph Score of 8 = 3
based on the results, where the horizontal (x) axis Score of 9 = 5
represents the scores on the creativity test and the Score of 10 = 2
vertical (y) axis represents frequencies (counts for each score
are indicated on top of the bars in the graph) Number of participants (N) = 30 (2+4+8+6+3+5+2)

Formula is: p(E) = Number of favourable events / Number of possible outcomes

So:
p(score ≥ 8) = (3+5+2) / 30
= 10 / 30
= 0.33

Based on this information, what is the probability that a
particular art student, chosen at random, would get a
score of 8 or greater on this test?

1. 0.33
2. 0.25
3. About 50%
4. More information is needed, the p-value will have to
be calculated from the raw data
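The calculation for Question 16 can be reproduced from the bar counts. In the sketch below, the counts for scores 8, 9 and 10 (3, 5 and 2) come from the comments; the score labels 4 to 7 attached to the remaining four bars (2, 4, 8, 6) are an assumption, since the graph itself is not reproduced here. Only the counts at 8 or above and the total of 30 affect the answer.

```python
# Frequency per score; labels 4-7 are assumed, labels 8-10 are from the comments
frequencies = {4: 2, 5: 4, 6: 8, 7: 6, 8: 3, 9: 5, 10: 2}

n = sum(frequencies.values())
favourable = sum(count for score, count in frequencies.items() if score >= 8)

p = favourable / n
print(favourable, n, round(p, 2))
```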

17 The expression “0.05 ≤ p ≤ 0.10" denotes a probability 4 P33-34 Larger than or equal to 0.05 and smaller than or equal to 0.10
value which is ______
Because probabilities fall in a range from 0.0 to 1.0 when expressed
1. a number halfway between 0.05 and 0.10 decimally, a probability can never be higher than 1 or lower than 0. The
2. larger than or equal to 0.10 or smaller than or equal general rule is written symbolically as follows: 0 ≤ p ≤ 1. Note that a
to 0.05 probability can be 0, but to say that a probability is 0 is actually the same as
3. larger than 0.05 and smaller than 0.10 saying that the event is impossible and can never happen. Likewise, to say
4. larger than or equal to 0.05 and smaller than or that the probability of an event is 1 is to assert that it is an absolute certainty.
equal to 0.10 In actual practice, probabilities fall within these two extremes.
You will typically encounter reference to probabilities in expressions such as
'p > 0.05'. This statement is interpreted as 'the probability value is higher
than 0.05'.

Use the scenario below to answer Questions 18 and 19


A researcher is studying post-traumatic stress among a number of soldiers who recently returned from a peace-keeping mission. She applies a
psychometric instrument which measures stress on each of a sample of 100 soldiers. She finds that their scores are approximately normally distributed
with a mean of 3.5 and a standard deviation of 2.0.

18 What would the z-score be for a soldier with a score of 4 P53 Formula is: z = (X - μ) / σ
5.5?

1. -0.25
2. 0 Where:
3. 0.5 X = 5.5 (score of soldier)
4. 1 μ = 3.5 (mean of the scores)
σ = 2.0 (standard deviation)

So:
z = (X - μ) / σ = (5.5 - 3.5) / 2.0 = 2 / 2.0 = 1

19 What would the probability be of a soldier getting a 2 The z-score for a soldier getting 5.5 = 1 (1.00)
score of greater than 5.5?
Refer to the standard normal distribution table and lookup 1.00
1. 0.84 • The larger portion for z=1 is 0.8413 (0.84)
2. 0.16 • The smaller portion for z=1 is 0.1587 (0.16)
3. 0.34
4. The p-value will have to be calculated using the raw Since a score of 5.5 lies above the mean of 3.5, the area below 5.5 is the larger
data portion. The probability of a score greater than 5.5 is therefore the
smaller portion, which is 0.1587 (0.16)
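
The table lookup for Questions 18 and 19 can be reproduced as a rough check. The sketch below assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the helper name `z_score` is made up for illustration.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Convert a raw score to a z-score."""
    return (x - mean) / sd


z = z_score(5.5, mean=3.5, sd=2.0)

# Area to the left of z under the standard normal curve (larger portion).
p_below = NormalDist().cdf(z)

# Area to the right of z (smaller portion): probability of scoring above 5.5.
p_above = 1 - p_below

print(z, round(p_below, 4), round(p_above, 4))
```

The two areas always sum to 1, which is why the z-table lists them as a larger and a smaller portion for the same z.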
20 A variable X is found to be normally distributed. If the 2 P53-54 Figure 2.7 above shows the approximate proportions of scores distributed
probability distribution of this variable is plotted, what under the area covered by the curve.
would the total size of the area under the curve be, to • The total area under the curve gives the probability of the interval -∞
the left side of the sample mean? and +∞, and is equal to +1 (i.e., the probability of any value of z falling
between minus and plus infinity is equal to 1).
1. 100% • Because the distribution is symmetrical, 0.5 of the area lies to the
2. 0.5 left of the mean and the same proportion to the right of the mean.
3. 1 • Approximately 0.341 of the area lies between the mean and 1 standard
4. 0 deviation in each direction.
• Roughly two-thirds, or 0.682 (0.341 x 2) of the area of the curve lies within
one standard deviation of the mean.
• Approximately 0.477 (i.e. 0.3413 + 0.1359) of the area lies between the
mean and 2 standard deviations in each direction.
• Approximately 0.954 (i.e. 0.477 x 2) of the area lies within 2 standard
deviations from the mean.
• Approximately 0.997 (i.e. 0.954 + (0.0215 x 2)) of the area lies within
three standard deviations from the mean.
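
The proportions listed for Question 20 can be recomputed from the standard normal distribution rather than read off a figure. This is only a sketch, assuming Python 3.8+ for `statistics.NormalDist`; `area_within` is a hypothetical helper name.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def area_within(k):
    """Area under the curve within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


# Half of the total area lies to the left of the mean.
left_of_mean = std_normal.cdf(0)
print(left_of_mean)

for k in (1, 2, 3):
    print(k, round(area_within(k), 3))
```

The computed areas can differ in the third decimal from values obtained by adding rounded table entries (for example 0.683 here against 0.341 x 2 = 0.682).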

Questions 21 to 24 are based on the scenario below


A psychologist is conducting research on xenophobia (hatred of foreigners). She makes use of a Xenophobia Scale which measures attitudes towards
foreign language speakers and which consists of 60 items, each one scored 0 - 4. This scale is applied to a random sample of n=100 citizens. The results
for each research participant are added to produce an overall xenophobia score which falls in a range from a minimum score of 0 to a maximum score of
240.

The researcher calculates descriptive statistics for this sample, and finds a mean of Y = 120 and a standard deviation of 10. Since the sample data is
roughly normally distributed, she draws the graph below.
21 Based on the data in the scenario, what would the 1 P68 The standard deviations in the graph are the square roots of the variances
variance of the distribution of the scores be? s² (s=√s²)

1. 100 P160 The variance is just the square of the standard deviation. Conversely, the
2. 10 standard deviation is the square root of the variance. Variance gives an
3. 12 indication of how much the data varies around the mean; the 'width' of the
4. 1 distribution (in both directions). The advantage of using standard deviation is
that it is expressed in the same units (the same measurement scale) as the
original data, while the variance represents a measurement in squares (x²).
• For a sample, the variance is s²
• For a population, the variance is σ²

So, the variance = s² and the standard deviation = 10

s=10 so s² = 10² (or 10 x 10) = 100

22 What would the z-score be for a xenophobia score of 2 P55-56 z = (X - x̄) / s
130?

1. 0.0 Where: X = 130 (xenophobia score)
2. 1.0 x̄ = 120 (sample mean)
3. 0.5 s = 10 (standard deviation)
4. It cannot be calculated based on given information
z = (130 - 120) / 10 = 10 / 10 = 1

23 If x is taken to represent a score on the xenophobia 4 P55-56 z = (X - x̄) / s
scale, which of the values below is the closest to the
value of p(x<100)?
Where: X = 100 (xenophobia score)
1. 0.500 x̄ = 120 (sample mean)
2. 0.159 s = 10 (standard deviation)
3. 0.977
4. 0.023 z = (100 - 120) / 10 = -20 / 10 = -2
App D Since we are looking for the proportion smaller than 100, i.e. p(x<100), refer to
the z-table where z is 2 and look at the smaller portion value. The value is
0.0228 (rounded to 0.023)
24 The way in which the mean is distributed can be 3 P60-62 s_x̄ = s / √n
estimated by finding the standard error. What would the = 10 / √100
standard error of the distribution of means be, based = 10 / 10
on the information in the scenario? = 1

1. 120
2. 10
3. 1
4. It is the range of values between 110 and 130
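
The four answers for the xenophobia scenario (Questions 21 to 24) follow from three numbers: the mean, the standard deviation and the sample size. The sketch below recomputes them as a cross-check, assuming Python 3.8+; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mean, sd, n = 120, 10, 100

variance = sd ** 2                 # Question 21: square of the standard deviation
z_130 = (130 - mean) / sd          # Question 22: z-score for a score of 130
standard_error = sd / sqrt(n)      # Question 24: standard deviation of the means

# Question 23: p(x < 100), taken directly from a normal distribution with
# this mean and standard deviation instead of converting to z first.
p_below_100 = NormalDist(mean, sd).cdf(100)

print(variance, z_130, standard_error, round(p_below_100, 3))
```

Note that the standard error (1) is much smaller than the standard deviation (10): means of samples of 100 vary far less than individual scores do.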

25 Which of the following expressions of the rule for 2 P34-36 The additive rule is p(A or B) = p(A) + p(B). This rule is used when two or
combining mutually exclusive probabilities is correct? more events are mutually exclusive. The additive rule is used to determine
the sum of two or more probabilities, and is signalled by the use of the word
P(A or B) = 'or' (i.e. the probability of A or B).

1. P(A) / P(B) The multiplicative rule states that p(A and B) = p(A) x p(B) where A and B
2. P(A) + P(B) are both independent events. This rule is used to determine the product of
3. P(A) x P(B) two or more probabilities and is indicated by the word 'and' (i.e. the probability
4. P(A) - P(B) of A and B).

The multiplicative rule that we use when we have conditional
probabilities is p(A and B) = p(A) x p(B|A)
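
The three rules can be illustrated with the marble numbers from the earlier question (5 red and 4 yellow marbles out of 20). The additive case reproduces that answer; the two multiplicative cases are hypothetical extensions added here for illustration and do not come from the exam paper.

```python
from fractions import Fraction

total = 20
p_red = Fraction(5, total)
p_yellow = Fraction(4, total)

# Additive rule (mutually exclusive events): p(A or B) = p(A) + p(B)
p_red_or_yellow = p_red + p_yellow

# Multiplicative rule (independent events): p(A and B) = p(A) x p(B)
# Hypothetical case: draw a red marble, put it back, then draw a yellow one.
p_red_then_yellow = p_red * p_yellow

# Multiplicative rule with conditional probability: p(A and B) = p(A) x p(B|A)
# Hypothetical case: draw two red marbles without replacement.
p_second_red_given_first_red = Fraction(4, total - 1)
p_two_reds = p_red * p_second_red_given_first_red

print(float(p_red_or_yellow), p_red_then_yellow, p_two_reds)
```

Exact fractions are used so that the results can be compared with hand calculations without rounding noise.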
26 Which of the following statements are true? 3 P11 These summary quantities are sometimes referred to as parameters (when
they refer to the whole collection or population of data).
1. Parameters describe sample characteristics and
statistics describe population characteristics P161 You should take careful note of the following important distinctions between
2. Statistics describe significance tests while samples and populations. Summary values for populations are called
parameters are measurements of samples or 'parameters' and are usually denoted by Greek letters, while summary
populations values for samples are called 'statistics' and are denoted by Roman
3. Parameters describe population characteristics and letters.
statistics describe sample characteristics
4. Statistics describe measurements of independent
variables while parameters are measurements of
dependent variables
27 A researcher in social science wants to compare a 1 P14 A test statistic is the quantity you calculate (often by making use of sample
sample mean to a known population mean and statistics) to test a statistical hypothesis.
chooses to calculate the value of z. What do we call
this calculated value? P82 When we transform a value such as 104 in this way to an equivalent z-score
so that we can use the z-tables to determine the p-value, this z-statistic is
1. A test statistic referred to as a 'test statistic'. We use special symbols to denote such test
2. A sample statistic statistics. In the present case, we use the symbol z_x̄, which indicates the z-test
3. A population parameter for a single sample mean.
4. An inferential statistic
One of the tasks a researcher faces is to decide on the appropriate test
statistic to use. We can refer to a test statistic as a variable that has a known
theoretical probability distribution. In other words, the probabilities of various
values for the test statistic can be calculated, although this usually requires
using appropriate computer programs. Examples of test statistics are the z,
the t and the x²

28 When a statistical test is performed, the size of the p- 1 P84 The test statistic is a value with a known probability distribution: we can use it
value will be a consequence of ______ to determine what the probability is of finding an effect of a particular size,
which we refer to as the p-value. It is because of our knowledge of the
1. the value of the test statistic probability distribution of the test statistic that we can determine the p-value.
2. a choice made by the researcher
3. the null hypothesis We compare this p-value with a level of significance (α) that we chose before
4. the level of significance at which the test is we did the sampling and made the observation. This is chosen by the
performed researcher, based on the risk of being wrong when rejecting the null
hypothesis that he or she is willing to take. If the p-value associated with the
test statistic is smaller than this α-value, the null hypothesis is rejected and
the alternative hypothesis accepted. If not, the null hypothesis is not rejected.
29 When a researcher sets the level of significance to α = 1 When alpha reduces, the probability of Type I (α) error decreases and Type II
0.01 during hypothesis testing, it implies that the (β) increases.
probability of making an error of ______ will be at
______ 1% SG 82- An error of Type I is the error we make if we reject the null hypothesis when
86 we should not have done so, and the level of significance (α) represents the
1. Type I, most greatest risk of doing this that we are willing to take.
2. Type I, least
3. Type II, most An error of Type II is the opposite of Type I. We fail to reject the null
4. Type II, least hypothesis when we were supposed to.

P85 Generally, though, the smaller α, the larger β. If we wish to avoid Type I
errors, we set α to a small value such as 0.01 or even 0.001, but if we
want to avoid Type II errors, we could set α to a larger value.

30 Before doing statistical testing, a researcher sets the 3 P85 See above comments
level of significance to 0.05. This is the ______

1. minimum probability of making an error of Type I
2. p-value at which the test is to be performed
3. maximum probability of making an error of Type I
4. probability of rejecting H0

31 Which of the statements below are true? 4 P83 A test statistic is used to determine a p-value

A test statistic ______ P84 We calculate a test statistic that is an indication of how far the observed
(a) is used to determine a p-value effect - as reflected in the sample data - deviates from what the null
(b) is used to determine the value of α hypothesis tells us to expect (if it were true).
(c) shows how far an observed measurement The test statistic is a value with a known probability distribution: we can use it
deviates from what can be expected by chance to determine what the probability is of finding an effect of a particular size,
(d) indicates the probability of making an error if which we refer to as the p-value. It is because of our knowledge of the
the null hypothesis is rejected probability distribution of the test statistic that we can determine the p-value.
We compare this p-value with a level of significance (α) that we chose before
1. Only (a) we did the sampling and made the observation. This is chosen by the
2. Only (b) researcher, based on the risk of being wrong when rejecting the null
3. Both (c) and (d) hypothesis that he or she is willing to take. If the p-value associated with the
4. Both (a) and (c) test statistic is smaller than this α-value, the null hypothesis is rejected and the
alternative hypothesis accepted. If not, the null hypothesis is not rejected.
The p-value represents the probability that the null hypothesis is true: that the
effect we see in our observation is due to chance effects like measurement
error.
32 When a statistical test yields a large p-value, which of 3 P83-84 The p-value represents the probability that the null hypothesis is true: that the
the following statements is most correct? effect we see in our observation is due to chance effects like measurement
error. If this probability is small, we conclude that H0 is not true, and we
1. The alternative hypothesis is probably true reject it. If this probability is large, we conclude that H0 is probably true,
2. The null hypothesis is probably false and we fail to reject it (the research hypothesis could not be confirmed)
3. The null hypothesis is probably true
4. The alternative hypothesis cannot be rejected This p-value is also a direct indication of the probability that the null
hypothesis is being mistakenly rejected. In other words, it shows the
probability that the researcher is rejecting a null hypothesis that is actually
true.

33 When applying a t-test to compare a sample mean 3 P110 Determine the p-value, which tells you what the probability of this observed
calculated from a measurement to a known population relationship (indicated by the test statistic) would be under the null
mean, the p-value represents ______ hypothesis.

1. the probability of correctly rejecting the null P77-78 Calculating the probability of the sample result under the null hypothesis
hypothesis
2. the probability of obtaining the sample mean under
the alternative hypothesis
3. the probability of obtaining the sample mean under
the null hypothesis
4. the largest risk of making an error by rejecting the
null hypothesis that one is willing to take

34 Under which of the circumstances below would you 3 P103 The t-distribution is a statistical distribution with a probability distribution that
make use of a t-test statistic? can be determined, which means that we can use it to predict the chances of
obtaining specific outcomes when testing for comparisons of means when
1. When comparing from two independent variables the population standard deviation σ is unknown.
2. The sample standard deviation is unknown
3. The population standard deviation is unknown App F Three types of t-tests:
4. The sample is not known to be normally distributed P177 t test - Difference between one group and a constant, σ is unknown
tc test - Difference between two independent groups, σ is unknown
td test - Difference between two dependent groups, σ is unknown
35 When a statistical test yields a very small p-value, we 4 P81 Here is a summary of the important points regarding the p-value:
know that the sample result is very ______ • The p-value gives the probability of obtaining the sample result under H0.
• If the p-value is very small, the probability is very small that the sample
1. likely under the null hypothesis result would occur under H0, and one should consider rejecting H0 in
2. unlikely under the alternative hypothesis favour of H1.
3. likely at a specific level of significance • The smaller the p-value, the more likely that the null hypothesis is false
4. unlikely under the null hypothesis and should be rejected in favour of the alternative hypothesis.

So, if the p-value is very large, the probability is very large that the sample
result would occur under H0, and one should not reject H0. The null
hypothesis is then probably true.

36 A type I error occurs when the ______ 1 SG 82- An error of Type I is the error we make if we reject the null hypothesis
86 when we should not have done so, and the level of significance (α)
1. null hypothesis is wrongly rejected represents the greatest risk of doing this that we are willing to take.
2. null hypothesis is not rejected when it should be
3. alternative hypothesis is wrongly rejected An error of Type II is the opposite of Type I. We fail to reject the null
4. p-value exceeds the level of significance hypothesis when we were supposed to.

P85 Generally, though, the smaller α, the larger β. If we wish to avoid Type I
errors, we set α to a small value such as 0.01 or even 0.001, but if we want to
avoid Type II errors, we could set α to a larger value.

37 Which one of the following alternative hypotheses 4 P75 The alternative hypothesis can contain any of the symbols '>', '<' or '≠'
requires a non-directional test of significance? respectively, the symbols for 'larger than', 'smaller than' or 'not equal to'.
When a comparison is between a value that is greater (more) than another,
1. The mean anxiety score for boys is greater than we use the symbol '>' and when a comparison is between a value that is
that of girls smaller (less) than another, we use '<'. The statistical test that must be
2. The mean verbal ability score for boys is lower than performed in either of these cases is a directional or one-tailed statistical
that of girls test (we use these expressions interchangeably).
3. There is no correlation between the test marks and
examination marks for a group of boys and girls When we do not specify what the direction of the difference should be,
4. There is not a significant correlation between the and both a larger and a smaller difference between means are
anxiety scores of boys and those of girls considered as relevant, the symbol '≠' must be used. The statistical test
to be performed will now be a non-directional or two-tailed test.

H0: μ = 100
H1: μ ≠ 100
Where both values of the mean, either greater than or smaller than 100 are to
be considered, a non-directional or two-tailed test is required.
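
The practical difference between a directional and a non-directional test shows up in how the p-value is read from the distribution of the test statistic. The sketch below uses a z statistic and a made-up value of z = 1.8; the function name `p_value` and its `alternative` labels are assumptions for illustration, not notation from the study guide.

```python
from statistics import NormalDist


def p_value(z, alternative):
    """p-value for a z statistic under H1: '>' , '<' or '≠'."""
    phi = NormalDist().cdf
    if alternative == ">":      # one-tailed, upper tail
        return 1 - phi(z)
    if alternative == "<":      # one-tailed, lower tail
        return phi(z)
    if alternative == "≠":      # two-tailed, both tails
        return 2 * (1 - phi(abs(z)))
    raise ValueError("alternative must be '>', '<' or '≠'")


z = 1.8  # hypothetical test statistic
print(round(p_value(z, ">"), 3))   # about 0.036
print(round(p_value(z, "≠"), 3))   # about 0.072
```

For the same z, the two-tailed p-value is twice the one-tailed value, so a result can be significant at α = 0.05 in a directional test and not in a non-directional one.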
38 Which of the following statements about the p-value 4 P81 Here is a summary of the important points regarding the p-value:
are true? • The p-value gives the probability of obtaining the sample result under H0.
• If the p-value is very small, the probability is very small that the sample
a) It gives the probability of making an error of result would occur under H0, and one should consider rejecting H0 in
Type I favour of H1.
b) It should exceed the level of significance • The smaller the p-value, the more likely that the null hypothesis is false
c) If it is relatively large the null hypothesis will and should be rejected in favour of the alternative hypothesis.
probably have to be rejected
d) If it is less than or equal to the level of So, if the p-value is very large, the probability is very large that the sample
significance, H0 should be rejected result would occur under H0, and one should not reject H0. The null
hypothesis is then probably true.
1. (d) and none of the others
2. (b) and (c)
3. (a) and (b)
4. (a) and (d)

39 Which of the following are appropriate ways to express 4 P75 The alternative hypothesis (H1) can contain any of the symbols '>', '<' or '≠'
an alternative hypothesis when a formal statistical respectively, the symbols for 'larger than', 'smaller than' or 'not equal to'.
hypothesis is being formulated? When a comparison is between a value that is greater (more) than another,
we use the symbol '>' and when a comparison is between a value that is
a) μ1 = μ2 smaller (less) than another, we use '<'. The statistical test that must be
b) μ1 ≠ μ2 performed in either of these cases is a directional or one-tailed statistical
c) μ1 > μ2 test (we use these expressions interchangeably).
d) μ1 < μ2
When we do not specify what the direction of the difference should be,
and both a larger and a smaller difference between means are
1. Only (a) considered as relevant, the symbol '≠' must be used. The statistical test
2. Only (b) to be performed will now be a non-directional or two-tailed test.
3. Both (c) and (d) only
4. (b), (c) and (d)

40 During statistical hypothesis testing, a p-value is 3 P82 The decision rule for H0 is simply as follows:
calculated based on a test statistic. If this p-value is If the p-value of the sample result is smaller (less) than α (level of
______ than the level of significance, the ______ significance), the null hypothesis is rejected. If the p-value is not smaller than
hypothesis should be ______ α, the null hypothesis (H0) is not rejected.

1. greater, null, rejected
2. less, alternative, rejected
3. less, null, rejected
4. less, null, accepted
41 An error of Type II is committed when ______ 2 SG 82- An error of Type I is the error we make if we reject the null hypothesis when
86 we should not have done so, and the level of significance (α) represents the
1. H0 is rejected in error greatest risk of doing this that we are willing to take.
2. H0 is not rejected when it should be
3. the p-value is found to exceed α An error of Type II is the opposite of Type I. We fail to reject the null
4. H0 is rejected based only on the p-value hypothesis (H0) when we were supposed to.
without looking at the effect size
P85-86 What if the p-value is not smaller than the α-level and we decide not to reject
H0? Now we run the risk of not rejecting H0 when - in fact - H0 is false and H1
is true. This is referred to as a Type II error. The decision not to reject H0 is
based on the test statistic, from which we determine a p-value that is not
smaller than our chosen level of significance (i.e., p-value > α). However, this
outcome may also be the result of measurement error. The effect may be
close to what H0 predicts purely by chance.
42 The size of the p-value depends on the ______ 2 P83 A test statistic is used to determine a p-value

1. z-tables P84 We calculate a test statistic that is an indication of how far the observed
2. size of the test statistic effect - as reflected in the sample data - deviates from what the null
3. null hypothesis statement hypothesis tells us to expect (if it were true).
4. level of significance The test statistic is a value with a known probability distribution: we can use it
to determine what the probability is of finding an effect of a particular size,
which we refer to as the p-value. It is because of our knowledge of the
probability distribution of the test statistic that we can determine the p-value.
We compare this p-value with a level of significance (α) that we chose before
we did the sampling and made the observation. This is chosen by the
researcher, based on the risk of being wrong when rejecting the null
hypothesis that he or she is willing to take. If the p-value associated with the
test statistic is smaller than this α-value, the null hypothesis is rejected and the
alternative hypothesis accepted. If not, the null hypothesis is not rejected.
The p-value represents the probability that the null hypothesis is true: that the
effect we see in our observation is due to chance effects like measurement
error.
43 A failure to reject H0 implies that a difference between 3 P83 We would say that our result does not enable us to conclude that H0 is false,
the calculated sample mean and its expected value or even that the result favours the null hypothesis, but that we cannot accept
under H0 is due to ______ it is literally true. Even if our sample mean is x̄ = 100 exactly, there is a remote
possibility that this is a chance event (due to measurement or sampling
1. the dependent variable error). What we do know in such a case is that there is no indication that H1
2. the independent variable can be true and no reason to do a test to confirm this.
3. chance
4. the test statistic P87 The implication is that we have to be careful how we interpret significant
results. A p-value of smaller than our chosen level of significance (α) simply
implies that, relative to this sample, it is improbable that the effect we see in
our observations is purely due to chance. It does not imply that the effect is
big or important. This is something that we have to decide by looking at what
the data means.
Base your answers to Questions 44 to 47 on the following scenario
Based on her experience, a developmental psychologist formulates a hypothesis that infants of younger than four months old tend to look at their mother's
faces for longer periods of time than at the faces of strangers. She also knows that past research has established that an infant will look at a picture of a
random human face for an average of ten seconds before looking away.

To test her hypothesis, the psychologist selects a sample of n = 25 infants of between one and four months old. She presents each infant with a picture of
their mother on a video screen and records for how long each infant attends to the image before looking away.

After collecting the data, she calculates that the infants in her sample spend a mean of x̄ = 12.5 seconds looking at the images before looking away, with a
standard deviation of s = 5.5 seconds.

44 Which research design is the most appropriate? 2 P99 A mean based on a single sample is to be compared to a specific value - a
population mean that is treated as a given.
1. Correlational design
2. Single sample group design
3. Two-groups design for dependent samples
4. Two-groups design with a known population mean

45 Which is the most appropriate way of formulating the 1 "a developmental psychologist formulates a hypothesis that infants of
relevant statistical hypotheses? younger than four months old tend to look at their mother's faces for longer
periods of time than at the faces of strangers"
1. H0: μ = 10, H1: μ > 10
2. H0: μ = 10, H1: μ < 10 The term "longer" indicates a directional hypothesis (greater than ">")
3. H0: μ ≠ 10, H1: μ > 10
4. H0: μ = 10, H1: μ ≠ 10
46 Based on the information presented in the scenario, 4 P103 The t-distribution is a statistical distribution with a probability distribution that
which would be the most appropriate test statistic to can be determined, which means that we can use it to predict the chances of
use out of the following? obtaining specific outcomes when testing for comparisons of means when
the population standard deviation σ is unknown.
1. The z-statistic for the mean of a single sample (z)
2. The t-statistic for the difference between the means App F Three types of t-tests:
of two independent samples (tc) P177 t test - Difference between one group and a constant, σ is unknown
3. The t-statistic for the difference between the means tc test - Difference between two independent groups, σ is unknown
of two dependent samples (td) td test - Difference between two dependent groups, σ is unknown
4. The t-statistic for the mean of a single sample (t)
z test - Difference between one group and a constant, σ is known

47 Given the scenario above, what would the calculated 2 P105 s_x̄ = s / √n
value of the standard deviation of the distribution of the where: s = 5.5
means (the standard error) be? n = 25

1. 5.5 s_x̄ = 5.5 / √25
2. 1.1 = 5.5 / 5
3. 0.2 = 1.1
4. There is insufficient information to calculate it
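
Question 47 only asks for the standard error, but the same numbers also give the single-sample t statistic chosen in Question 46. The sketch below goes one step beyond the exam answer and uses the standard formula t = (x̄ - μ) / (s / √n) with n - 1 degrees of freedom; looking up the p-value would still require a t-table or statistical software.

```python
from math import sqrt

sample_mean = 12.5   # mean looking time in the sample (seconds)
mu_0 = 10            # population mean under H0 (seconds)
s = 5.5              # sample standard deviation
n = 25               # sample size

standard_error = s / sqrt(n)                 # answer to Question 47
t = (sample_mean - mu_0) / standard_error    # single-sample t statistic
df = n - 1                                   # degrees of freedom

print(round(standard_error, 2), round(t, 2), df)
```

The t statistic expresses the observed difference of 2.5 seconds in units of the standard error.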

48 The standard error of the mean for samples of a 3 P103- This is the standard deviation of the distribution of the means (or standard
specific size is the ______ 104 error of the mean), which we can calculate using the central limit theorem:

1. standard deviation of the population mean s_x̄ = s / √n
2. mean of the standard deviations of repeated
samples of this specific size
3. standard deviation of the sampling distribution of
the mean
4. standard deviation of the sample mean
49 During the process of using statistical procedures to 1 P139 The squared correlation (r²) measures the proportion of variance in one
establish whether a relationship exists between the variable that can be determined from its relationship with the other, or how
variables x and y, a researcher considers the effect much variance they have in common. It can be used as an indication of the
size of the findings. What does this refer to? size of the effect.

1. It indicates the extent of a relationship among P140 Evaluating r²
variables irrespective of the significance of the • r² = 0.01 Small effect
statistical test • r² = 0.09 Medium effect
2. It is another way of saying that the statistical test • r² = 0.25 Large effect
was significant
3. It refers to the probability of making an error of
Type II if the null hypothesis is not rejected
4. It refers to the extent to which the obtained p-value
differs from the chosen level of significance
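
The r² guidelines from P140 can be written as a small lookup. The sketch below treats the three benchmark values as lower bounds of each category, which is an assumption made here (the study guide lists only the three points), and the function name and the "negligible" label are hypothetical.

```python
def effect_size_label(r_squared):
    """Label an r² value using the 0.01 / 0.09 / 0.25 benchmarks."""
    if not 0 <= r_squared <= 1:
        raise ValueError("r² must lie between 0 and 1")
    if r_squared >= 0.25:
        return "large"
    if r_squared >= 0.09:
        return "medium"
    if r_squared >= 0.01:
        return "small"
    return "negligible"  # assumed label for values below the smallest benchmark


for value in (0.005, 0.01, 0.09, 0.25):
    print(value, effect_size_label(value))
```

The label depends only on r², not on the p-value, which is the point of the correct option: effect size is judged separately from statistical significance.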

50 Suppose a test statistic is calculated, and based on this 3 p = 0.03
the p-value is determined to be 0.03. Which of the
following decisions should the researcher make? If α = 0.01, then do not reject H0
If α = 0.05, then reject H0
1. Reject H0 if the level of significance was set in
advance at 0.01
2. Since p = 0.03, set the level of significance to 0.05
to make it possible to reject H0
3. Reject H0 if the level of significance was set in
advance at 0.05
4. Do not reject H0 if the level of significance was set
in advance at 0.05
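
The decision rule applied in Question 50 can be sketched as a one-line comparison. The function name `decide` is made up for illustration, and the strict comparison (p smaller than α) follows the decision rule quoted from P82; the level of significance is passed in as a value fixed before the p-value is known.

```python
def decide(p_value, alpha):
    """Apply the decision rule for H0 with a level of significance set in advance."""
    if p_value < alpha:
        return "reject H0"
    return "do not reject H0"


p = 0.03
for alpha in (0.01, 0.05):
    print(alpha, decide(p, alpha))
```

Option 2 in the question is wrong precisely because it chooses α after seeing p, which this function does not allow for: α is an input, never derived from the p-value.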

Base your answers to Questions 51 to 53 on the following scenario


A researcher in educational psychology wants to investigate the possibility that primary school children who regularly watch educational programmes on
television get better general grades in primary school than those who do not.

She draws a random sample of 100 children from a specific primary school and after investigation of their histories of television watching, allocates them
into two groups, a TV-Group of 45 children with a history of watching educational programmes and a Non-TV Group of 55 children with no such history.

At the end of the school year, she compares the final year marks of the two groups.
51 Considering the scenario above, which of the following 2 Note:
statements are true? • The two groups are dependent.
(a) The two groups are dependent because they come • The dependent variable will be their final year marks as this will be
from the same school influenced by the independent variable.
(b) Watching television is the dependent variable and • The independent variable is watching educational programmes on TV
year mark is the independent variable • She hypothesises that the primary school children who regularly watch
(c) A one-tailed test would be required educational programmes on television get better general grades in
primary school than those who do not
1. (a) and none of the others o This means the grades are greater than (>)
2. (a) and (c)
3. (b) and (c) Samples are considered as comprising independent groups if the
4. (c) and none of the others P112 composition of the one sample in no way affects, in any systematic way, the
composition of the other sample. The two samples come from two groups
that have no obvious relationship. For example, where one sample is
measurements of a construct like 'self-esteem' among men, and the other
among women, but both groups were sampled purely randomly.

On the other hand, the concept of dependent groups refers to situations
where the samples are related, and it implies that each subject in one group
can be systematically paired off with a subject from the other group. For this
reason, a dependent groups research design is often referred to as a
matched-pairs design.

Sometimes dependent samples are produced when the researcher
deliberately matches subjects into pairs, based on the value of some hidden
or 'nuisance' variable. In this case, the groups are specific primary school
children. Another example of such a design would be a repeated measures
design, where the same research participant is observed under more than
one treatment or experimental condition

52 Which of the following is an appropriate description of 1 The researcher only focuses on primary school children, so that will be the
the research population in the scenario? whole population from which she will take a sample.

1. Primary school children
2. Primary school children who watch educational
television
3. Children in general
4. 100 children from a specific school
# Question Ans Page Comments
53 When comparing the year marks of the two groups of 2 She hypothesises that the primary school children who regularly watch
children (using ‘TV’ and ‘NoTV’ to indicate the group educational programmes on television get better general grades in primary
that watches educational TV and the one that does not school than those who do not
watch it, respectively), which of the options below is
the most appropriate way to express the formal This means the grades are greater than (>)
alternative hypothesis suggested by this scenario?

1. μTV < μNoTV
2. μTV > μNoTV
3. μTV ≠ μNoTV
4. TV > NoTV

54 A psychotherapist wants to test the effectiveness of a 1 P112 Samples are considered as comprising independent groups if the
programme of cognitive behavioural therapy on clients composition of the one sample in no way affects, in any systematic way, the
who were diagnosed as suffering from high social composition of the other sample. The two samples come from two groups
anxiety. She uses a sample of 50 persons who were that have no obvious relationship. For example, where one sample is
diagnosed as persons with high anxiety and tests them measurements of a construct like 'self-esteem' among men, and the other
on a Social Anxiety Scale before the commencement of among women, but both groups were sampled purely randomly.
the series of therapy sessions, and again afterwards.
The two sets of measurements should be regarded as On the other hand, the concept of dependent groups refers to situations
______ where the samples are related, and it implies that each subject in one group
can be systematically paired off with a subject from the other group. For this
1. dependent reason, a dependent groups research design is often referred to as a
2. independent matched-pairs design.
3. drawn from a single population
4. highly correlated Sometimes dependent samples are produced when the researcher
deliberately matches subjects into pairs, based on the value of some hidden
or 'nuisance' variable. Another example of such a design would be a repeated
measures design, where the same research participant is observed under
more than one treatment or experimental condition
# Question Ans Page Comments
55 A researcher wants to compare two group means by P81 There is no correct answer for this question.
testing the following hypotheses at a significance level
of α = 0.05 Remember that H1 μ1 > μ2 is a directional hypothesis and requires a one-tail
test.
H0 μ1 = μ2 The p-value provided by the computer is non-directional (two-tailed) and must
H1 μ1 > μ2 therefore be divided by 2 to give a one-tail value.

On the basis of data provided, the output from a So p=0.07 / 2 = 0.035
computer program indicates that a t-value of t = -1.9
was found. The computer program also indicates that a We know α = 0.05, so p < α (0.035 < 0.05) and therefore H0 should be
p-value for a non-directional (two-tailed) t-test would be rejected.
p=0.07. What conclusion can the researcher make, and
why? This would be the normal way of calculating it

1. H0 can be rejected because α < p-value But also remember that the alpha value is normally indicated for one-tail.
2. H0 cannot be rejected because α < p-value Therefore we should also be able to multiply it by 2 to get it to a two tail
3. H0 can be rejected because α / 2 < p-value comparison with the two-tailed p-value.
4. H0 cannot be rejected because α x 2 > p-value So: α x 2 > p-value is the same as p-value / 2 < α

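This means in the question that α = 0.05 x 2 = 0.10, which is greater than p = 0.07,
so p < α, which would suggest that H0 should be rejected.

Note, however, that halving the two-tailed p-value is only valid when the observed
difference lies in the direction predicted by H1. Here t = -1.9 is negative, while
H1: μ1 > μ2 predicts a positive t (assuming t is computed from the first mean minus
the second). The sample difference therefore points the wrong way, the one-tailed
p-value is 1 - 0.035 = 0.965, and H0 cannot be rejected.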
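The conversion from a two-tailed to a one-tailed p-value in Question 55 can be sketched as follows. This is a minimal illustration, not part of the study guide: the function name `one_tailed_p` and the `predicted_sign` parameter are hypothetical, and the sketch assumes that t is computed from the first group mean minus the second.

```python
def one_tailed_p(p_two_tailed: float, t: float, predicted_sign: int) -> float:
    """Convert a two-tailed p-value into a one-tailed p-value.

    predicted_sign is +1 when H1 predicts a positive t (for example mu1 > mu2,
    with t computed from mean1 - mean2) and -1 when H1 predicts a negative t.
    """
    half = p_two_tailed / 2
    # Halving is only valid when the observed t lies in the predicted direction.
    observed_matches_h1 = (t > 0) == (predicted_sign > 0)
    return half if observed_matches_h1 else 1 - half


alpha = 0.05
# Values from Question 55: two-tailed p = 0.07, t = -1.9, H1: mu1 > mu2.
p_q55 = one_tailed_p(0.07, t=-1.9, predicted_sign=+1)
reject_h0 = p_q55 < alpha
print(round(p_q55, 3), reject_h0)
```

With the values from the question, the one-tailed p-value comes out at roughly 0.965 rather than 0.035, because the observed t has the opposite sign to the one H1 predicts.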

56 Two samples can be considered independent when 4 P112 Samples are considered as comprising independent groups if the
______ composition of the one sample in no way affects, in any systematic way, the
composition of the other sample. The two samples come from two groups
1. care was taken that there were no hidden variables that have no obvious relationship. For example, where one sample is
that could affect them measurements of a construct like 'self-esteem' among men, and the other
2. care was taken that the samples are drawn under among women, but both groups were sampled purely randomly.
different experimental or treatment conditions
3. the samples are drawn from more than a single On the other hand, the concept of dependent groups refers to situations
population of subjects where the samples are related, and it implies that each subject in one group
4. there is no systematic matching of individuals of can be systematically paired off with a subject from the other group. For this
one sample with individuals from the other one reason, a dependent groups research design is often referred to as a
matched-pairs design.

Sometimes dependent samples are produced when the researcher
deliberately matches subjects into pairs, based on the value of some hidden
or 'nuisance' variable. Another example of such a design would be a repeated
measures design, where the same research participant is observed under
more than one treatment or experimental condition
# Question Ans Page Comments
57 A researcher wants to test the hypothesis that girls are 1 P103 The t-distribution is a statistical distribution with a probability distribution that
generally less assertive than boys. He draws a sample can be determined, which means that we can use it to predict the chances of
of 100 boys and a sample of 100 girls, and gives each obtaining specific outcomes when testing for comparisons of means when
child a test that measures their general level of the population standard deviation σ is unknown.
assertiveness. Which would be the most appropriate
statistical test to use, out of the following? App F Three types of t-tests:
P177 t test - Difference between one group and a constant, σ is unknown
1. The t-test for independent samples tc test - Difference between two independent groups, σ is unknown
2. The chi-square (x²) test td test - Difference between two dependent groups, σ is unknown
3. The t-test for dependent samples
4. The test statistic based on the Pearson product- z test - Difference between one group and a constant, σ is known
moment correlation (r)
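As a rough illustration of the independent-samples case in Question 57, the sketch below computes a t-statistic for two independent groups using a pooled variance. The pooled-variance formula is the standard textbook version and is an assumption here, since the study guide's exact formula for the tc test is not quoted in this document. The function name and the example scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group1, group2):
    """t-statistic for two independent samples, using a pooled variance."""
    n1, n2 = len(group1), len(group2)
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / standard_error


# Hypothetical assertiveness scores, purely for illustration.
girls = [1, 2, 3, 4, 5]
boys = [2, 4, 6, 8, 10]
t_girls_vs_boys = independent_t(girls, boys)
print(round(t_girls_vs_boys, 3))
```

A negative value here would be in line with a hypothesis that the first group (girls) scores lower than the second (boys); the degrees of freedom for this version of the test are n1 + n2 - 2.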
58 The difference score (đ = x2 - x1) is used in the 1 P112 Samples are considered as comprising independent groups if the
calculation of the t-test statistic in the case of ______ composition of the one sample in no way affects, in any systematic way, the
(a) dependent samples composition of the other sample. The two samples come from two groups
(b) independent samples that have no obvious relationship. For example, where one sample is
(c) random samples measurements of a construct like 'self-esteem' among men, and the other
among women, but both groups were sampled purely randomly.
1. only (a)
2. only (b) On the other hand, the concept of dependent groups refers to situations
3. both (a) and (b) where the samples are related, and it implies that each subject in one group
4. (c) can be systematically paired off with a subject from the other group. For this
reason, a dependent groups research design is often referred to as a
matched-pairs design. Sometimes dependent samples are produced when
the researcher deliberately matches subjects into pairs, based on the value of
some hidden or 'nuisance' variable.

Another example of such a design would be a repeated measures design,
where the same research participant is observed under more than one
treatment or experimental condition. Each subject is matched with himself or
herself, which is why the samples are regarded as dependent.

P118 It is to compensate for this existing relationship between the samples that we
require an adjustment in the t-statistic.

To develop this adjusted t-test, we use the two matched samples to create a
new variable called 'đ'. We do this by computing a 'difference score' between
x1 and x2 for each pair, so that đ reflects the mean of the differences between the
measurements before and after: đ = x1 - x2 (the question writes the difference as
x2 - x1, which only reverses the sign).
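The difference-score approach described above can be sketched in a few lines. This is an illustrative sketch only: the function name and the before/after scores are hypothetical, the difference is taken as the first measurement minus the second (an assumed sign convention), and the formula used is the standard paired t-statistic, the mean difference divided by its standard error.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """t-statistic for two dependent samples, based on difference scores."""
    # One difference score per matched pair (first measurement minus second).
    diffs = [x1 - x2 for x1, x2 in zip(first, second)]
    n = len(diffs)
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)  # sample standard deviation of the differences
    return mean_diff / (sd_diff / sqrt(n))


# Hypothetical anxiety scores before and after therapy for five clients.
before = [30, 28, 35, 32, 27]
after = [25, 27, 30, 29, 26]
t_paired = paired_t(before, after)
print(round(t_paired, 3))
```

Because each client is paired with himself or herself, the test works on a single column of differences and has n - 1 degrees of freedom, where n is the number of pairs.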
# Question Ans Page Comments
59 Which of the following statements about the 1 P106 Note that the bigger the t-value the greater the likelihood of rejecting H0 (as is
relationship between the value of a t-test statistic and the case with z-statistics), because it refers to how far the observed value of
the p-value is true, if the sample size (n) remains the sample statistic differs from the population parameter that was provided
constant? and refers to the areas on the edges of the distribution.

1. The larger the value of the t-test statistic, the This implies that the bigger the t-value, the smaller the p-value
smaller p will be
2. The smaller the value of the t-test statistic, the
smaller p will be
3. If the sample size n remains the same, the
relationship between the test statistic and the p-
value will remain constant
4. There is no specific relationship between the p-
value and the t-test statistic

60 Which of the following gives the best description of a 4 P73-75 By convention, the null hypothesis is usually indicated with the symbol H0
null hypothesis? The null hypothesis is the hypothesis
that ______ This hypothesis is referred to as the 'null hypothesis' because it is the
hypothesis that implies no effect.
1. expresses the research hypothesis through the use
of appropriate symbols The null hypothesis always contains the 'equal to' symbol '='. The null
2. indicates the direction of the difference that is hypothesis is the hypothesis that no effect exists, and in cases where we
expected between two groups are testing a mean, this implies that two group means (or a group mean and
3. expresses the probability that observed relationship a specific constant value) do not differ.
will be significant
4. states that there is no relationship between the
variables

61 In correlational research one investigates the relation 3 P130 Correlation is a measurement of the extent to which a measurement on one
between ______ variable is related to a measurement on another variable for the same sample
of individual cases.
1. the mean of a single sample of subjects and a
population mean
2. two groups of subjects, with respect to a single
variable
3. two variables measured on the same group of
subjects
4. the difference scores of two groups of test
measurements
# Question Ans Page Comments
62 A researcher hypothesizes that the greater the number of books read by pupils
over a specific school year, the greater their language comprehension will be at the
end of that year. He studies a random sample of 100 pupils in grades 10 to 12 in a
specific school, collecting information on the number of books they read in a
specific year and letting them do a reading comprehension test at the end of the year.

Which is an appropriate formal expression of the alternative hypothesis for this research?

1. ρ > 0
2. μ > 0
3. r > 0
4. ρ ≠ 0

Ans: 1 Page: P161

You should take careful note of the following important distinctions between
samples and populations. Summary values for populations are called
'parameters' and are usually denoted by Greek letters, while summary values
for samples are called 'statistics' and are denoted by Roman letters.

Summary value                                  Populations     Samples
                                               (Parameter)     (Statistic)
Arithmetic mean                                μ               x̄
Standard deviation                             σ               s
Variance                                       σ²              s² (s = √s²)
Standard error of mean (also called            σx̄ (= σ/√n)     sx̄ (= s/√n)
  standard deviation of the sampling
  distribution of the mean)
Mean of sampling distribution of the mean
  (mean of all means; this is equal to the
  mean µ under H0)
Z score for means                              z
Correlation between two measurements           ρ               r
  (Pearson's r)
Proportions                                    P               p

It is a correlational design where we are measuring a relationship between
two variables (populations).

P131 The population correlation value is indicated by ρ (Greek letter rho
corresponding to the sample r). This would be the correlation if the entire
population provided scores on the two measured variables you are interested
in. Remember we're going to state hypotheses in terms of our population
correlation, which here is stated as positive.

H1: ρ ≠ 0 This implies that a relationship that differs significantly from zero
does in fact exist, but we are making no 'educated guesses' as to whether it
is a positive or negative relationship: we just want to know whether there is in
fact a relationship.
H1: ρ > 0 This implies that we want to establish whether a significant
relationship of greater than zero exists, that is, a significant positive
relationship.
H1: ρ < 0 This implies that we want to establish whether a significant
relationship of less than zero exists, that is, a significant negative relationship.
# Question Ans Page Comments
63 Which of the following can never be exactly zero? 2 P83-85 If the significance level (α) = 0, then there would be no possibility of the p-
value being lower than α, so the null hypothesis H0 could never be rejected
1. a probability
2. a level of significance
3. a correlation coefficient
4. a t-test statistic

64 A graph showing the position of each of a number of 3 P130- A graph showing the position of each of a number of sampling units on each
measurements on each of two variables is called a 132 of two variables
______
Tut202 A scatter plot is a graph showing the relationship between two numerical
1. histogram 2014 variables. In such a graph the data of the one variable are plotted on the
2. contingency table Q18 horizontal axis (usually referred to as the X axis), and the data of the other
3. scatter plot variable on the vertical (or Y) axis. It is not a comparison of sample and
4. correlation coefficient population, nor has it to do with spread of data or the independence of
variables
65 For a larger sample size (n) ______ 4 P139 If you randomly put three dots on a blank square of paper, they may, purely
by chance, fall into something approximating a straight line. If you make a
1. a smaller value of a Pearson's correlation hundred marks on the same piece of paper, also in a totally random way, the
coefficient r will reach significance chance of them falling in a straight line is, however, a lot less. This tells you
2. a larger value of a Pearson's correlation coefficient something about the relationship between r (a measure of whether the dots
r is required before the result will be significant on a scatter plot fall in a straight line) and the number of dots (the sample
3. the size of Pearson's correlation coefficient r is size n): the smaller n, the more likely it is that the plot will represent a straight
likely to increase line purely by chance. Therefore, for a smaller sample n, the test must be
4. the size of Pearson’s correlation coefficient r is much more conservative. You must, therefore, put up a bigger hurdle to be
likely to move closer to zero crossed before you conclude that the result is not the consequence of
chance. You, therefore, require a larger value of r before you can conclude
that the result is not a chance event due to sampling or measurement error,
but an actual representation of the state of affairs in the population.

The consequence of this is that, for a large sample, a relatively modest
correlation can turn out to be significant. For example, for a sample of n =
40 (as in the HIV/AIDS research project in Appendix A), the value of r must
be at least r = 0.26 for α = 0.05 (a 5% level). If we increase the sample size to
100, a smaller result of r = 0.16 would be significant at the same level of α =
0.05. This shows that, for a large value of n, a very modest r can be
significant. The implication of this is that significance does not indicate that a
relationship is large. It merely tells you that some relationship exists (perhaps
a modest one), and that it is large enough not to be regarded as purely due to
the effect of chance, given the size of the sample.
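The link between sample size and the significance of r can be illustrated with the usual conversion of a correlation into a t-statistic, t = r√(n - 2) / √(1 - r²). This formula is the standard one for testing a Pearson correlation and is an assumption here, since the study guide's own test statistic is not quoted in this document. The function name is hypothetical.

```python
from math import sqrt


def t_from_r(r: float, n: int) -> float:
    """Convert a Pearson correlation r from a sample of size n to a t-statistic."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


# The same modest correlation evaluated at two sample sizes.
t_small_sample = t_from_r(0.26, 40)
t_large_sample = t_from_r(0.26, 100)
print(round(t_small_sample, 2), round(t_large_sample, 2))
```

For a fixed r, the t-statistic grows with n, which is why a modest correlation that falls short of significance in a small sample can be significant in a large one.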
# Question Ans Page Comments
66 A Pearson correlation of r = -0.71 is found when the 3 P132- Correlation coefficients that measure the linear relationship between two
linear correlation between two variables is calculated. 133 variables, such as the Pearson product-moment correlation coefficient, can
What kind of relationship between two variables X and have a continuous value that ranges from -1 to 1 (a positive value is usually
Y does this represent? written without the sign, so '1' is presumed to mean '+1'). We use 'r' as the
symbol that represents a correlation coefficient (as in the case of the Pearson
1. As one variable grows larger, so the other gets product-moment correlation coefficient), and the following applies:
larger • r = 1 implies a perfect positive linear relationship (the dots in a scatter plot
2. As one variable grows smaller, so the other gets will run from lower left to upper right in a perfectly straight line)
smaller • r = 0 implies no linear relationship at all (the dots may be scattered all
3. As one variable grows larger, the other grows over the place)
smaller • r = -1 implies a perfect negative linear relationship (the dots will run from
4. A correlation coefficient cannot get smaller than 0, upper left to lower right in a straight line)
since it implies a relationship of less than nothing
When positive relationships occur, this implies that as one variable gets
larger, so does the other. When negative relationships occur, this implies that
as one variable gets larger, the other gets smaller.

The relationship is called linear because Pearson’s correlation coefficient
measures the extent to which the relationship approximates a straight line.

67 After finding the correlation of r = -0.71 (as indicated in 1 P139 The squared correlation (r²) measures the proportion of variance in one
the previous question), the researcher decides to also variable that can be determined from its relationship with the other, or how
calculate the size of the effect of one variable on the much variance they have in common. It can be used as an indication of the
other. Given the information at his disposal, what is he size of the effect.
likely to conclude?
P140 Evaluating r²
1. About half of the variance in one of the variables is • r² = 0.01 = 1% Small effect
accounted for by the other one • r² = 0.09 = 9% Medium effect
2. About a quarter of the variance in one of the • r² = 0.25 = 25% Large effect
variables is accounted for by the other
3. About three quarters of the variance in one of the r = -0.71
variables is accounted for by the other one r² = (-0.71 x -0.71) = 0.5041 ≈ 50%
4. He decides that before he can find the size of the
effect between the two variables, the two group About 50% of the variance in one of the variables is accounted for by the other
means and standard deviations will first have to be one
calculated
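The effect-size calculation for Question 67 can be checked with a short sketch. The cut-off values for small, medium and large effects are the ones listed under P140 above; the function names, and the 'negligible' label for values below 0.01, are assumptions added for illustration.

```python
def r_squared(r: float) -> float:
    """Proportion of variance shared by the two variables."""
    return r ** 2


def effect_size_label(r2: float) -> str:
    """Label r squared using the cut-offs quoted from P140 (0.01, 0.09, 0.25)."""
    if r2 >= 0.25:
        return "large"
    if r2 >= 0.09:
        return "medium"
    if r2 >= 0.01:
        return "small"
    return "negligible"  # assumed label; the source gives no name for this range


shared_variance = r_squared(-0.71)
print(round(shared_variance, 4), effect_size_label(shared_variance))
```

The sign of r drops out when it is squared, so r = -0.71 and r = 0.71 both correspond to about half of the variance being shared.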
# Question Ans Page Comments
Base your answers to Questions 68 to 70 on the following scenario
A sample of clients are drawn from three community welfare centres (indicated as A, B and C). Based on interviews, the clients are categorised into one of
four categories
• Those that have psychological problems, that is, they may require intervention by psychotherapists,
• Those that have welfare problems, that is, those who may require intervention by social workers,
• Those with health-related problems, who may require interventions by health care givers
• Others - these are clients who do not fit into any of the other three groups

Counts are made of those clients from the different centres who fit into each of these categories, and this is reflected in the contingency table below

Mental Health Centre (columns)
by Type of Problem (rows)    A     B     C     Row totals
‘Psychological’              27    35    32    94
‘Welfare’                    16    28    22    66
‘Health-related’             16    20    34    70
Other                        29    17    24    70
Column totals                88    100   112   300
68 Which of the following is an appropriate null hypothesis 4 P137 1. H0 : ρ≠0
to test relationships given the data above? 2. H0 : ρ≠0
1. There is a correlation between the type of 3. H0 : ρ≠0
intervention which the clients need and the 4. H0 : ρ=0
particular community mental health centres that
they visit H0 can only be "=". Options 1, 2 and 3 are alternative hypothesis statements.
2. The particular community mental health centres
that the clients visit have no relationship to the type So the null hypothesis (H0) - the hypothesis of no effect - will state that no
of intervention which the clients may require relationship exists: H0 : ρ = 0
3. There is no significant difference among clients
needing psychological, welfare, health-related or H1: ρ ≠ 0 This implies that a relationship that differs significantly from zero
those with other problems does in fact exist, but we are making no `educated guesses' as to whether it
4. There is no correlation between the type of is a positive or negative relationship: we just want to know whether there is in
intervention which the clients need and the fact a relationship.
particular community mental health centres that H1: ρ > 0 This implies that we want to establish whether a significant
they visit relationship of greater than zero exists, that is, a significant positive
relationship.
H1: ρ < 0 This implies that we want to establish whether a significant
relationship of less than zero exists, that is, a significant negative relationship.
# Question Ans Page Comments
69 Given the data above, which of the options below 4 P110 Samples are considered as comprising independent groups if the
would be the most appropriate test statistic to use to composition of the one sample in no way affects, in any systematic way, the
test the null hypothesis above against an appropriate composition of the other sample. The two samples come from two groups
alternative hypothesis? that have no obvious relationship. For example, where one sample is
measurements of a construct like 'self-esteem' among men, and the other
1. The test statistic for the Pearson product-moment among women, but both groups were sampled purely randomly.
correlation coefficient (r)
2. The t-test statistic for two independent samples (tc) On the other hand, the concept of dependent groups refers to situations
3. The t-test statistic for two dependent samples (td) where the samples are related, and it implies that each subject in one group
4. The chi-square (X²) test statistic can be systematically paired off with a subject from the other group. For this
reason, a dependent groups research design is often referred to as a
matched-pairs design.

SG P140, Tut202 2014 Q22
The chi-square test is usually used when you have a cross tabulation of
frequency counts of events which are nominal scale measurements. This
table is referred to as a contingency table. It is used to compare an observed
frequency distribution (frequency counts based on a sample of observations)
with the frequency distribution which we would expect to find if the null
hypothesis of no relationship between two cross-tabulated variables were
true.

The Pearson chi-square test statistic, which we indicate with χ², is a calculation
of the difference between the observed and expected frequencies.

The formula is

χ² = Σ (O - E)² / E, summed over all cells of the contingency table

This means that the expected value (E) for each cell in the contingency table is
subtracted from the observed value (O) for that cell, squared, and divided by the
expected value for that cell. Then all of these terms are added together to
yield χ².
# Question Ans Page Comments
70 Given the data above, what would be the expected 2 Mental Health Centre (columns)
value (if the null hypothesis is true) for health-related by Type of Problem (rows) A B C Row totals
problems in community centre B? ‘Psychological’ 27 (O11) 35 (O12) 32 (O13) 94 (O1.)
‘Welfare’ 16 (O21) 28 (O22) 22 (O23) 66 (O2.)
1. 20 ‘Health-related’ 16 (O31) 20 (O32) 34 (O33) 70 (O3.)
2. 23.3 Other 29 (O41) 17 (O42) 24 (O43) 70 (O4.)
3. 70 Column totals 88 (O.1) 100 (O.2) 112 (O.3) 300 (O..)
4. 100
P142 The cell frequencies represent the way the information is distributed relative
to the variables. These cell frequencies are often referred to as the observed
or empirical cell frequencies. The question now is: How would these cell
frequencies be distributed under the null hypothesis, that is, if H0 is actually
true? Asked differently: What are the expected frequencies if the two
categorical variables are truly independent?

We can indicate these expected cell frequencies by Eij and they are
computed as follows:
E11 = (O1. x O.1)/O.. = (94 x 88)/300 = 27.57 (row 1, column 1)
E12 = (O1. x O.2)/O.. = (94 x 100)/300 = 31.33 (row 1, column 2)
E13 = (O1. x O.3)/O.. = (94 x 112)/300 = 35.09 (row 1, column 3)
E21 = (O2. x O.1)/O.. = (66 x 88)/300 = 19.36 (row 2, column 1)
E22 = (O2. x O.2)/O.. = (66 x 100)/300 = 22.00 (row 2, column 2)
E23 = (O2. x O.3)/O.. = (66 x 112)/300 = 24.64 (row 2, column 3)
E31 = (O3. x O.1)/O.. = (70 x 88)/300 = 20.53 (row 3, column 1)
E32 = (O3. x O.2)/O.. = (70 x 100)/300 = 23.33 (row 3, column 2)
E33 = (O3. x O.3)/O.. = (70 x 112)/300 = 26.13 (row 3, column 3)
E41 = (O4. x O.1)/O.. = (70 x 88)/300 = 20.53 (row 4, column 1)
E42 = (O4. x O.2)/O.. = (70 x 100)/300 = 23.33 (row 4, column 2)
E43 = (O4. x O.3)/O.. = (70 x 112)/300 = 26.13 (row 4, column 3)

Observed frequencies, with expected frequencies in brackets:

Mental Health Centre (columns)
by Type of Problem (rows)    A             B             C             Row totals
‘Psychological’              27 (27.57)    35 (31.33)    32 (35.09)    94
‘Welfare’                    16 (19.36)    28 (22.00)    22 (24.64)    66
‘Health-related’             16 (20.53)    20 (23.33)    34 (26.13)    70
Other                        29 (20.53)    17 (23.33)    24 (26.13)    70
Column totals                88            100           112           300
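The expected frequencies above, and the chi-square statistic they feed into, can be reproduced with a short sketch. The observed counts are taken from the contingency table in the scenario; the function names are hypothetical, and the degrees-of-freedom rule (rows - 1) x (columns - 1) is the standard one rather than something quoted from the study guide.

```python
# Observed counts: rows are problem types, columns are centres A, B and C.
observed = [
    [27, 35, 32],  # psychological
    [16, 28, 22],  # welfare
    [16, 20, 34],  # health-related
    [29, 17, 24],  # other
]


def expected_counts(table):
    """Expected cell counts under H0: (row total x column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square(table):
    """Pearson chi-square: sum over all cells of (O - E)^2 / E."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


expected = expected_counts(observed)
chi2 = chi_square(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Expected count for health-related problems (row 3) in centre B (column 2).
print(round(expected[2][1], 2), round(chi2, 2), df)
```

The expected count for health-related problems in centre B is (70 x 100)/300, which matches option 2 of Question 70.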
