JAMIE HALE - in Evidence We Trust - 2nd Edition
Contact information:
Jamie Hale
323 Calmes Blvd.
Winchester, KY 40391
www.knowledgesummit.net
www.maxcondition.com
Acknowledgements
Introduction
References
Appendices
Appendix A Practice Problems
Appendix B APA Style Citation and Reference Lists
Index
Introduction
alternative explanations). Experiences may be very
important in some contexts, and they may serve as
meaningful research questions. However, a meaningful
question or a possible future finding is not synonymous
with scientific evidence; scientific evidence depends on
converging evidence. That is, different research strategies,
weighed by the preponderance of evidence, converge on a
tentative finding. In the following pages science,
rationality/critical thinking (in cognitive science terms) and
statistics (of the frequentist type) are discussed.
Chapter two features short articles on rationality, that is,
rationality as defined by cognitive science. Some of the
same or similar information is contained across different
articles. There are at least a couple of advantages to
presenting information in this manner (refer to previously
mentioned advantages in chapter one). Some of the articles
focus on the rationality-intelligence dichotomy. Also
included in this chapter are interviews with Keith Stanovich
and the Stanovich Research Lab (Keith Stanovich, Richard
West and Maggie Toplak). In the interviews with
Stanovich, he discusses the development of an RQ Test. In
the interview with the Stanovich lab, rationality and
intelligence are discussed.
The content in this book may be difficult for some to
comprehend. However, with some effort and patience the
content is learnable for most people. In the words of Albert
Einstein, “Things should be made as simple as possible, but
not any simpler.” Science, rationality and statistics can be
simplified to a degree, but relative to most other topics
these topics are difficult. This book is not written for
cognitive misers (the cognitively lazy). This book is
written for individuals who are interested in separating
knowledge from nonsense and are willing to put forth at
least a moderate level of cognitive effort.
is mechanistic, materialistic and congruent with model-dependent
reality (Hawking & Mlodinow, 2010).
Chapter 1
The Skeptic
Skeptic or Cynic
sounding words dressed up as scientific words (in many
cases words that do not exist or words they cannot
accurately define).
“Coach Hale: No. The only people that will make this
claim are people that are not willing to look at truth and
people that promote quack science. Fitness Skepticism
(this includes the health, nutrition and supplement
industries) is an approach to claims that investigates reason
to any and all ideas. Skeptics do not go into an
investigation closed to the possibility that a claim might be
true. When I say “skeptical,” I mean that I need to see valid
evidence before believing a claim. Cynical on the other
hand means taking a negative view and not willing to
accept valid evidence for the claim. I think skepticism is
healthy and should be promoted in all fields.”
scientific method should be avoided. Science uses an array
of methods and modeling strategies, and they vary
tremendously within and between domains. A single
concept or simple definition doesn't suffice. Scientific
processes are systematic processes, and they are the best we
have for acquiring new knowledge that uses deductive and
inductive reasoning (Randall, 2012). The hypothetico-deductive
model suggests scientific inquiry proceeds by
formulating a hypothesis (predicted outcome) in a form that
can be tested and falsified, and then testing it against
observable data (direct or indirect) where the outcome is
not yet known. This model is often taught to students as
the scientific method, the way science is done. This is an
oversimplification; a lot of scientific data has been
acquired in the absence of this model. Research often
doesn't fit with this model. There are important scientific
findings that didn't happen via the hypothetico-deductive
model. This model often involves standard null hypothesis
statistical testing (as incorporated in the frequentist model),
which is often used in research, but it isn't always used.
generalizations. "Scientists inductively work from
observations to try to establish a consistent framework that
matches all the measured phenomena. Once the theory is
in place, scientists and detectives make deductions, too, in
order to predict other phenomena and relationships in the
world" (Randall, 2012, pp. 43-44). Some of the primary
goals of science include describing, predicting and
explaining phenomena. Science shines light on reality.
discouraged from critical thinking. Don’t question
authority (so they are told). When we were kids our
parents gave us advice and told us what to do. No
questions were asked. This continued throughout our
school years. The formal education system generally
discourages critical thinking (process of actively and
skillfully conceptualizing, applying, analyzing,
synthesizing, and evaluating information to reach an
answer or conclusion). You have probably often thought,
“My teacher said it, so it must be right.” This cycle
continues throughout most of our lives. We are constantly
exposed to newspapers, TV, so-called experts and other
sources of information that tell us what is right and wrong.
The lack of emphasis on critical thinking leads to various
problems in the decision making process. These problems
make it difficult to distinguish fact from fiction.
high levels of BCAAs) and so on. The point is there is no
way that it could be accurately determined that the BCAA
supplement was the reason the bodybuilder did not lose
muscle. This is also an example of confusing correlation
with causation. The illusion of cause is one of the most
common everyday cognitive errors. It is natural to see
patterns and infer causation even when there are no
causative patterns. High-quality research that rates high in
internal validity helps us get closer to determining cause
and effect.
Rumors everywhere
Ignoring failures
correlation), temporal precedence, and internal validity
(eliminate confounds, possible alternative explanations for
outcome).
Emotive words
Ad Hominem
Overreliance on authorities
Argumentum ad antiquitatem
words, “This is right because we've always done it this
way.” Consider the following example: Boxers have
traditionally performed moderate to high levels of
long-distance roadwork to optimize their endurance levels. So,
performing long-distance roadwork must be the best way
to optimize endurance, even though boxing relies primarily
on the anaerobic energy system and long-distance running
mainly utilizes the aerobic energy system. Many boxers
have never tried anything different with regard to enhancing
endurance. There are many things people have always
done a particular way. That does not mean that way is the
most effective method for achieving the desired outcome. I
am sure everyone reading this has heard, “Well, that’s the
way I have always done it.”
Argumentum ad novitatem
true unless proven otherwise. Shifting the burden of proof
is a common fallacy used in various industries. The quack
(bamboozler, charlatan, pretend fitness expert, etc.) often
lacks evidence for his or her claim and therefore shifts roles,
insisting that the skeptic disprove the claim.
Relativist Fallacy
Scientific & Nonscientific Approaches to Knowledge
Comparing Scientific & Nonscientific Approaches to
Knowledge
                   Scientific                   Nonscientific
General approach   Empirical (systematically)   Intuitive
Observation        Controlled                   Uncontrolled
Reporting          Unbiased                     Biased
Concepts           Clear definitions            Ambiguous definitions
Instruments        Accurate/precise             Inaccurate/imprecise
Measurement        Reliable/repeatable          Non-reliable
Hypotheses         Testable                     Untestable
Attitude           Critical                     Uncritical
*Based on Table 1.1 from Research Methods in Psychology
(Shaughnessy & Zechmeister, 1990, p.6)
General approach
in controlled experiments, some causal relationships are
supported while others are rejected. Extending these
observations, scientists propose general explanations that
will explain the observations. “We could observe endless
pieces of data, adding to the content of science, but our
observations would be of limited use without general
principles to structure them” (Myers & Hansen, 2002, p.
10).
excerpt is from The Demon-Haunted World (Sagan, 1996).
Observation
Determining causality is particularly hard with so many
confounds.
a slip of paper and drawing them from a hat (flipping a coin
or using a random number generator may also be used for
random assignment). This does not mean there will be no
differences in the subjects' characteristics; ideally, the
differences will be minor and generally have
minimal effect on the results.
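As a rough illustration of random assignment with a number generator, here is a minimal Python sketch. The function name randomly_assign, the group labels and the participant IDs are hypothetical, invented for this example; they are not taken from any study discussed in the book.

```python
import random

def randomly_assign(participants, groups=("treatment", "control"), seed=None):
    """Shuffle participants and deal them into groups in turn.

    This is the software equivalent of drawing names from a hat.
    A seed is accepted only so the example can be reproduced.
    """
    rng = random.Random(seed)
    shuffled = list(participants)   # copy so the caller's list is untouched
    rng.shuffle(shuffled)           # random order, like mixing slips of paper
    assignment = {group: [] for group in groups}
    for index, person in enumerate(shuffled):
        # Deal in rotation so the group sizes differ by at most one.
        assignment[groups[index % len(groups)]].append(person)
    return assignment

# Hypothetical participant IDs, for illustration only.
names = [f"P{i:02d}" for i in range(1, 21)]
result = randomly_assign(names, seed=42)
for group, members in result.items():
    print(group, len(members), members)
```

Random assignment does not make the groups identical. It only makes it unlikely that they differ systematically on characteristics that could affect the outcome.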
Reporting
How can two people witness the same event but see
different things? This often occurs due to personal biases
and subjective impressions. These characteristics are
common traits among non-scientists. Their reports often go
beyond what has just been observed and involve
speculation. In the book Research Methods in Psychology
(Shaughnessy & Zechmeister, 1990) an excellent example
is given demonstrating the difference between scientific
and non-scientific reporting. An illustration is provided
showing two people running along the street with one
person running in front of the other. The scientist would
report it in the way it was just described. The non-scientist
may take it a step further and report one person is chasing
the other or they are racing. A nonscientific approach is
generally more speculative than a scientific approach. This
type of reporting lacks objectivity.
susceptible to a wide range of conscious and unconscious
biases. Scientists may also be driven by incentives.
Concepts
Instruments
the bathroom scale says you weigh the same. Of course,
this weight difference is negligible, and probably has little
practical significance.
Measurement
report measures. It isn't as important when considering
performance based measures, tests or surveys. Each
question should be aimed at measuring the same thing.
Stability is often measured by test-retest reliability. The
same person takes the same test twice and the scores from
each test are compared. Interrater reliability is sometimes
used in assessing reliability. With interrater reliability,
different judges or raters (two or more) make observations,
record their findings and then compare their observations.
If the raters are reliable, then the percentage of agreement
should be high.
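The two reliability checks described above can be sketched in a few lines of Python. The ratings and scores are hypothetical, and the helper names percent_agreement and pearson_r are invented for this example. Using a Pearson correlation as the test-retest index is a common choice and is assumed here; the text does not prescribe it.

```python
import math

def percent_agreement(rater_a, rater_b):
    """Percentage of observations on which two raters gave the same code."""
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must rate the same observations")
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return 100 * matches / len(rater_a)

def pearson_r(x, y):
    """Pearson correlation, used here as a test-retest reliability index."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical codes from two raters observing the same five events.
rater_1 = ["on-task", "off-task", "on-task", "on-task", "off-task"]
rater_2 = ["on-task", "off-task", "off-task", "on-task", "off-task"]
agreement = percent_agreement(rater_1, rater_2)

# Hypothetical scores for five people who took the same test twice.
first_testing = [10, 12, 14, 16, 18]
second_testing = [11, 13, 15, 17, 19]
retest_r = pearson_r(first_testing, second_testing)

print(f"Interrater agreement: {agreement:.0f}%")
print(f"Test-retest correlation: {retest_r:.2f}")
```

Simple percentage agreement does not correct for agreement that would occur by chance, so it is best read as a first, rough check.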
It is possible to have a reliable but not valid measure.
However, a valid measure is always a reliable measure.
Hypotheses
difference or no statistically significant difference- predicts
that when different groups are compared there will be no
difference. The alternative hypothesis- there is a difference,
a statistically significant difference- predicts that when
groups are compared there will be a difference.
Attitude
can move left or right. A 0 appears on the far left end and a
1 appears on the far right end. The 0 corresponds with total
disbelief and the 1 corresponds with total belief (absolute
certainty). Lyttleton suggests that the bead should never
reach the far left or right end. The more that the evidence
suggests the belief is true the closer the bead should be to 1.
The more unlikely the belief is to be true the closer the
bead should be to 0.
Science or non-science
everyday language (Johnson, 2000). To a scientist, the
word theory represents that of which he or she is most
certain; in everyday language the word implies a guess (not
sure). This often causes confusion for those unfamiliar
with science. This confusion leads to the common
statement “It’s only a theory.”
Science Might Have it Wrong?
Lyttleton suggests that the bead should never reach the far
left or right end. The more that the evidence suggests the
belief is true the closer the bead should be to 1. The more
unlikely the belief is to be true the closer the bead should
be to 0.
The Common Sense Myth!
From Wikipedia:
sense is: commonly held belief, regardless of its truth-
value.
From Lilienfeld et al. (2010, p.6):
Correlational Studies are Important Even if They Don’t
Imply Causation!
disabilities). Other variables, such as birth
order, sex, and age are inherently
correlational because they cannot be
manipulated, and, therefore, the scientific
knowledge concerning them must be based
on correlation evidence.”
Why We Need Statistics!
Statistic: One number that summarizes a property or
characteristic of a set of numbers
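To make the definition concrete, the short Python sketch below computes three common statistics for one set of numbers. The scores are hypothetical, invented for illustration only.

```python
import statistics

# Hypothetical scores, invented for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean_score = statistics.mean(scores)      # central tendency (arithmetic average)
median_score = statistics.median(scores)  # middle value of the sorted scores
sd_score = statistics.pstdev(scores)      # spread (population standard deviation)

print("Mean:", mean_score)
print("Median:", median_score)
print("Standard deviation:", sd_score)
```

Each of these is a single number that summarizes one property of the whole set: where its center lies, or how spread out it is.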
become a better-informed consumer. If you understand
basic statistical concepts, you will be in a better position to
evaluate the information you have been given.
it is understood that the conclusion is tentative. This
willingness to admit fallibility and the need for change is
one of science’s biggest strengths. In virtually every other
area of knowledge acquisition, admitting fallibility is not a
virtue, but a weakness.
When Experts are Wrong
rules alone. For example, computers can automate clinical
judgments. The computer can be programmed to yield the
description “dependency traits” just as the clinical judge
would, whenever a certain response appears on a
psychological test. To be truly actuarial, interpretations
must be both automatic (that is, prespecified or routinized)
and based on empirically established relations” (Dawes et
al., 1989, p. 1668).
clinical assessments of dangerousness with
actuarial methods. "What we are advising is
not the addition of actuarial methods to
existing practice, but rather the complete
replacement of existing practice with
actuarial methods" (p. 171).
The supremacy of statistical prediction
Another type of investigation in the clinical-actuarial
prediction literature involves giving clinicians the
actuarial predictions and then asking them to make any
necessary changes based on their personal experience with
clients. When clinicians make changes to the actuarial
judgments, the adjustments lead to a decrease in the
accuracy of the predictions (Dawes, 1994).
approaches should be calibrated to recognition of its
limitations and need for control. Although they surpass
clinical methods, actuarial procedures are not infallible, often
achieving only moderate results. A procedure that proves
successful in one setting should be periodically reevaluated
within that context and shouldn’t be applied to new settings
mindlessly (Dawes et al., 1989).
Lloyd, M., 2006).
The use of clinical prediction relies on authority whose
assessments, precisely because these judgments are claimed
to be singular and idiosyncratic, are not subject to public
criticism. Thus, clinical predictions cannot be scrutinized
and evaluated at the same level as statistical predictions
(Stanovich, 2007).
Conclusion
The intent of this article is not to imply that experts are not
important or do not have a role in predicting outcomes.
Expert advice and information are useful in observation,
gathering data and sometimes making predictions (when
predictions are commensurate with available evidence).
However, once relevant variables have been determined
and we want to use them to make decisions, “measuring
them and using a statistical equation to determine the
predictions constitute the best procedure” (Stanovich,
2007, p. 181).
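To give a sense of what "using a statistical equation" can look like, here is a minimal Python sketch of an actuarial-style rule: a one-predictor least-squares line fitted to past cases and then applied mechanically to a new case. The data, the variable names and the choice of a single-predictor linear equation are all assumptions made for illustration. Actuarial rules in the literature typically combine several measured variables.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(model, x_new):
    """Apply the fitted equation to a new case, with no judgment calls."""
    intercept, slope = model
    return intercept + slope * x_new

# Hypothetical past cases: a measured predictor and the observed outcome.
predictor = [1, 2, 3, 4]
outcome = [3, 5, 7, 9]

model = fit_line(predictor, outcome)
forecast = predict(model, 5)
print(f"intercept = {model[0]:.2f}, slope = {model[1]:.2f}, forecast = {forecast:.2f}")
```

The same equation is applied to every case in the same way, which keeps the procedure open to public scrutiny and periodic reevaluation.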
I will leave you with these words (Meehl, 2003):
Understanding Scientific Research Methods
Recommended sources
The Rationality of Science
https://centerforinquiry.org/blog/rethinking-science-education/
The intent here is to discuss some different research
methods, utilized mostly by those involved with health,
social and biomedical sciences.
understanding.
Description
Prediction
tentative, testable predictions concerning the relationships
between or among variables. Hypotheses are frequently
derived from theories- interrelated sets of concepts that
explain a body of data and make predictions.
Explanation / Understanding
and effect, three prerequisites are essential- covariation of
events, proper time-order sequence (temporal precedence)
and the elimination of plausible alternative causes (internal
validity).
depends on the level of another variable.
qualitative methods? It depends on context. Patten
compares six scenarios in the book Understanding
Research Methods, 4th edition (Patten, 2004). As examples,
Patten states when little is known about a topic qualitative
research is usually favored; when hard numbers are
required, such as those required by funding agencies,
quantitative is usually preferred.
Peer Review
for evaluation. Ideas that survive this critical process have
begun to meet the criterion of public verifiability”
(Stanovich, 2007, p. 12).
been replicated, study design, sample size, and conflicts of
interest (design details and critiques will be discussed in
later articles). There are good studies that never get
published in peer-reviewed publications. The merits of the
paper should be weighed more heavily than the source of
publication. The peer review myth- the belief that peer
review automatically means quality, or that lack of
peer-reviewed publication indicates low quality- should be
abandoned. There is much more to quality science than peer
review. Retraction Watch publishes information on thousands
of reports that have been retracted, and most often these
reports were published in peer-reviewed publications.
Further reading
I’m well placed to answer this question as I’ve published
hundreds of peer-reviewed papers and written thousands of
referee reports for journals. And of course I’ve also done a
bit of post-publication review in recent years.
OK, let’s step back for a minute. What is peer review good
for? Peer reviewers can catch typos, they can catch certain
logical flaws in an argument, they can notice the absence of
references to the relevant literature—that is, the literature
that the peers are familiar with. That’s how the peer
reviewers for that psychology paper on ovulation and
voting didn’t catch the error of claiming that days 6-14
were the most fertile days of the cycle: these reviewers
were peers of the people who made the mistake in the first
place!
Peer review has its place. But peer reviewers have blind
spots. If you want to really review a paper, you need peer
reviewers who can tell you if you’re missing something
within the literature—and you need outside reviewers who
can rescue you from groupthink. If you’re writing a paper
on himmicanes and hurricanes, you want a peer reviewer
who can connect you to other literature on psychological
biases, and you also want an outside reviewer—someone
without a personal and intellectual stake in you being
right—who can point out all the flaws in your analysis and
can maybe talk you out of trying to publish it.
WHY SCIENCE MATTERS
By James Randi
they fail, they are either re-written or scrapped.
The Nonsense Detection Kit
“what only a limited few have discovered”, and so on.
These findings are not subject to criticism or replication.
That is not how science works. When conducting studies it
is imperative that researchers operationalize (provide
operational definition- precise observable operation used to
manipulate or measure a variable) variables so the specifics
can be criticized and replicated. Non-scientists are not
concerned with others being able to replicate their findings
because they know attempted replications will probably be
unsuccessful. If a finding cannot be replicated, this is a big
problem, and it is unreasonable to consider a single finding
as evidence. It is also problematic when only those making
the original finding have replicated it successfully. When
independent researchers using the same methods as those
used in the original study are not able to replicate the
finding, this is a sign that something was faulty with the
original research.
a good scientist goes out of their way to look for
disconfirmatory evidence. Why look for disconfirmatory
evidence? Because when discovering reality is the
objective it is necessary to look at all the available data, not
just the data supporting one’s own assertions.
Confirmation bias occurs when the only good evidence,
according to the claimant, is the evidence that supports
their claim. Often, perpetuators of nonsense may not even
be aware of disconfirmatory evidence. They have no
interest in even looking at it.
A large number of nonsense advocates do not even know
what the standard rules of reason and research are, let alone
adhere to them. They often lack any training in research
methodology, and are ignorant of the accepted rules of
scholarly work (Shermer, 2001). Consider the following
example provided by Shermer (2001, p.21).
read a manuscript before publication submission, or
formally when the manuscript is read and critiqued by
colleagues, or publicly after publication), such biases and
beliefs are rooted out, or the paper or book is rejected for
publication.” (Shermer, 2001, p. 22)
Thoughts on Authority
to certitude.” (Sagan, 1996, p.28)
Common fallacies:
Appeal to tradition- that is the way it has always been done
so that must be the right way to do it
Post hoc, ergo propter hoc- from Latin, “It happened after,
so it was caused by”; the illusion of cause
Nonsense indicator- extraordinary claims
I want to thank Sagan for his Baloney Detection Kit (1996),
Shermer for his Boundary Detection Kit (2001) and
Lilienfeld (2012) for his Pseudoscience Indicators.
Science Roundtable: Discussing Scientific Matters
theory. In a non-scientific context people use theory but
they mean hypothesis. People that are not scientifically
literate will go and say "theoretically speaking" but they
mean hypothetically speaking. Scientific theories are facts,
not a hypothesis. This reminds me of "Newspeak" in
George Orwell's 1984, which was pretty much the
deterioration of the English language in order to limit free
thinking.
impressive, but the study is sort of a 'perfect storm' of
events. Something that improves insulin sensitivity in obese
postmenopausal women with diabetes isn't going to be
something that necessarily pumps your muscles full of
glucose and should be taken after a workout (now, it may,
but the study on diabetic obese women isn't what you
should be looking at).
is one where we have investigated the particular question
and can use that evidence as proof that it should work as
the study says.
theoretical/ abstract considerations.
unless and until it’s been replicated by independent
investigators.
1- Jonathan Gore: The best tip I have would be to locate
someone in the field who is willing to walk you through the
elements of empirical papers. We have a course in our
department, "Information Literacy in Psychology,"
specifically devoted to teaching students that very skill. It is
difficult to try to absorb everything in a scientific
publication without some guidance.
advanced degrees in chemical engineering, environmental
studies, and social psychology. He teaches research
methods and statistics at Eastern Kentucky University in
Richmond, KY
for single examples; they look for big trends. And once you
identify the trends, then you are in position to use that
information wisely.
Lars Avemarie works full time as a certified personal
trainer in Sweden. His focus is on using evidence-based
methods to fulfill clients' goals. He works with a wide range
of clients, from weekend warriors to exercise enthusiasts of
all ages.
an opinion is not necessarily proof of anything.
up missing important aspects of the study or even get a
skewed impression of what the findings actually mean.
have many limitations. It is therefore imperative to view
each study in the context of the prevailing body of literature
and weigh its merit based on methodology employed and
its relevance to the individual.
Guidelines for Reading Research Reports
Read Abstracts
Once you have looked over the paper, you should have a
general idea of the specific areas that are relevant. It is
possible for the full paper to be relevant. As you focus in
on your areas of interest, take notes, highlight important
information and highlight references that may lead to
further reading. It is important to highlight references of
interest as you read through the paper. You probably won’t
remember the references by the time you have completed
reading. If you are reading on a computer or some other
electronic form that makes highlighting difficult, or doesn’t
allow highlighting, write the references on a piece of paper.
If this seems to you like a laborious task, you are right. A
thorough investigation of the research is painstaking.
However, in order to gain comprehensive understanding
and knowledge in your area of interest, laborious activity is
essential.
Association Between Scientific Cognition and Scientific
Literacy: Implications for Learning Science (Hale,
Sloss, & Lawson, 2017)
components. Science, although fallible, is the great reality
detector. Science is a systematic approach to knowledge.
Proper use of scientific processes leads to rationalism
(basing conclusions on intellect, logic and evidence).
Science combats dogmatism (adherence to doctrine over
rational and enlightened inquiry, or basing conclusions on
authority rather than evidence) and provides a better
understanding of the world. Scientific processes/ methods
are unmistakably the most successful processes available
for describing, predicting and explaining phenomena in the
observable universe.
The empirical approach (as used in everyday observation)
allows us to learn things about the world. However,
everyday observations are often made carelessly and
unsystematically. Thus, using everyday observations in an
attempt to describe, predict and explain phenomena is
problematic.
elements of cognitive style (thinking disposition).
Cognitive style reflects types of thinking that occur during
typical performance conditions. That is, thinking or
engaging in tasks when not being explicitly cued to
maximize performance (Stanovich, West & Toplak, 2016).
Various scales have been developed to measure scientific
thinking / reasoning / cognition. Kahan developed a scale
called the Ordinary Science Intelligence Scale (OSI_2.0,
Kahan, 2014) and Drummond and Fischhoff (2015)
developed the Scientific Reasoning Scale (SRS).
Drummond and Fischhoff found that measures of scientific
reasoning were distinct from measures of scientific literacy,
even though there was a positive association to measures of
scientific literacy. The OSI_2.0 scale is intended to be a
measure of the capacity to recognize and make use of
scientific evidence in everyday decision making. The
OSI_2.0 scale consists of 18 items that can be divided into
four sets: scientific fact items, scientific methodology
items, quantitative reasoning items and cognitive reflection
items. A measure of cognitive reflection is a measure of the
tendency to avoid providing an intuitive answer to a
problem that, on further analysis, is shown to be incorrect.
Reflective processing allows one to override an incorrect
fast response by critically processing the information. The
SRS is intended to assess the skills needed to evaluate the
quality of scientific findings. The final version of the SRS
consisted of 11 items derived from concepts taken from
research method textbooks. Thus, scientific reasoning is a
measure of knowledge in research methodology, according
to Drummond and Fischhoff's scale. Dunbar's (2000)
research on scientific thinking used a different strategy.
Dunbar's research mostly involves examining cognitive
processes underpinning thinking during the research
process, rather than assessing scientific thinking with
prescribed types of measure.
Method
Participants
survey that included a scientific cognition and a scientific
literacy assessment. The study was approved by the
university's Institutional Review Board.
Materials
A scientific theory is defined as: A) A
comprehensive explanation of some aspect of
nature that is supported by a vast body of evidence
B) An educated guess used to explain an aspect of
nature C) An explanation that has not been tested
Procedure
29 questions on the survey. The first 14 questions were
scientific literacy questions; the next 14 questions were
scientific cognition questions, and the last question was a
question asking if the participant was male or female.
Participants were given up to 25 minutes to complete the
study. Upon completion of the survey a debriefing
statement was provided.
Results
incorrectly was an astrology question, and the question
most often answered correctly was a question involving the
earth's orbit of the sun. The percentage of correct
answers for individual items on the scientific cognition
assessment ranged from 46.7% to 82.6%. The question
most often answered incorrectly was a question involving a
covariation task, and the question most often answered
correctly involved estimating chances of winning a dollar.
Table 1 gives the total percentages of correct answers for
each item.
Table 1
both assessments. The results of the chi-square tests
indicate an association between gender (men vs. women)
and responses (correct vs. incorrect) for three items from
the online survey; one of the items from the scientific
literacy assessment and two of the items from the scientific
cognition assessment. The results of a chi-square test using
gender (men vs. women) and responses to scientific literacy
question no. 9 (correct vs. incorrect) as factors were
statistically significant, χ²(1, N = 202) = 5.12, p < .05.
Men were more likely than expected to produce a correct
response for scientific literacy question no. 9.
Scientific literacy question no. 9 was "[w]hich of the
following are smaller than atoms a) proteins b) electrons c)
amino acids." The correct answer is b. The results of a
chi-square test using gender (men vs. women) and responses to
scientific cognition question no. 3 (correct vs. incorrect) as
factors were statistically significant, χ²(1, N = 202) = 5.57,
p < .05. Women were more likely than expected to produce a
correct response for scientific cognition question no. 3.
Scientific cognition question no. 3 was "[t]he
falsification criteria in the context of science suggests a) If
a scientific claim is proven then it is not false b) False
claims are not accepted c) In order for a claim to be
scientific it must be testable." The correct answer is c. The
results of a chi-square test using gender (men vs. women)
and responses to scientific cognition question no. 9 (correct
vs. incorrect) as factors were statistically significant, χ²(1,
N = 202) = 7.05, p < .05. Men were more likely than expected
to produce a correct response for scientific cognition
question no. 9. Scientific cognition question no. 9 was
"[i]n the universal lottery, the chances of winning a prize
are 1%. How many people do you think would win a prize
if 1000 people buy a single ticket?" The correct answer is
10.
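For readers who want to see how a chi-square test of independence on a 2 x 2 table (gender by correct/incorrect) can be computed, a minimal Python sketch follows. The cell counts are hypothetical and are NOT the study's data, since the report gives only the test statistics. The sketch uses the Pearson chi-square without a continuity correction; whether the reported analyses used a correction is not stated, so that choice is an assumption. The function name chi_square_2x2 is invented for this example.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2 x 2 table of counts.

    table is [[a, b], [c, d]]: rows are the two groups, columns are the two
    response categories. Returns (statistic, p_value) for 1 degree of
    freedom, with no continuity correction.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if group and response were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi-square > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts, invented for illustration (202 participants in total).
counts = [[40, 30],   # men: correct, incorrect
          [50, 82]]   # women: correct, incorrect
stat, p = chi_square_2x2(counts)
total = sum(sum(row) for row in counts)
print(f"chi-square(1, N = {total}) = {stat:.2f}, p = {p:.4f}")
```

A statistically significant result indicates only that the observed counts depart from what independence would predict. Which group did better than expected has to be read from the observed and expected counts themselves.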
Discussion
regarding research methodology. Similar to the USIS it
also involves questions regarding probability (quantitative
reasoning). In addition, the scientific cognition assessment
involves items that require knowledge in the philosophy of
science.
101
do you think would win a prize if 1000 people buy a single
ticket?" The correct answer is 10. A gender difference on
a task involving quantitative reasoning is in agreement with
the scientific literature that demonstrates better
performance of males regarding quantitative reasoning
(Friedman, 1989; Leahey & Guo, 2001). The only gender
difference on quantitative reasoning occurred for question
no. 9; there were no differences for other items involving
quantitative reasoning. Women scored better on an item
(question no. 3) involving the philosophy of science.
Question no. 3 was "[t]he falsification criteria in the
context of science suggests a) If a scientific claim is proven
then it is not false b) False claims are not accepted c) In
order for a claim to be scientific it must be testable." The
correct answer is c. The concept of falsification is one of
the most discussed concepts in the philosophy of science.
The concept is taught in lower-level research methods
courses and philosophy of science courses. We weren't able
to locate studies that investigated differences between
genders regarding philosophy of science. It is unclear why
the gender difference occurred on this task. There were
other philosophy of science questions on the scale, but
there were no gender differences on those tasks. Gender
differences are often found when comparing scoring for
individual items. A study investigating gender differences
among Hong Kong students did not find significant
differences in total scientific literacy scores, but
differences were found for components of scientific literacy
(Yan Yip, Ming Chiu, & Sui Chu Ho, 2004). Scientific
literacy as conceptualized in that study differed from the
conceptualization used in the current study. Scientific
literacy, in the study of Hong Kong students, consisted of
five components: "understanding concepts, recognizing
questions, identifying evidence, drawing conclusions,
communicating conclusions." Girls scored significantly
higher in "recognizing questions" and "identifying
evidence," while boys scored higher in "understanding
concepts." These components demonstrate various elements
involved with scientific thinking. To reiterate, our
conceptualization is that scientific literacy demonstrates
general scientific knowledge. Scientific literacy has a much
broader definition in the Hong Kong study than the
definition we used.
Another important finding in the current study was that
students confused science with pseudo-science. The
overwhelming majority of students (79%) in the current
study reported that astrology is scientific, or is at least
partly scientific. Only 21% of participants in the
study answered the following question correctly: "Which
of the following statements are true? A) Astrology is not at
all scientific B) Astrology is partly scientific C) Astrology
is a legitimate field of scientific study." The correct answer
is A. The astrology question is an item from the scientific
literacy assessment. A study conducted by
Sugarman and colleagues (2011) found that the majority of
students (78%) considered astrology at least sort of
scientific. Only 52% of science majors indicated that
astrology was “not at all” scientific. Those findings are
similar to what we found. Astrology has no scientific
validity, although at one time it was considered a science
by some. Newspapers and magazines dedicate sections to
horoscopes, and belief in astrology is prevalent in western
society. This exposure to astrology as a legitimate domain
probably has a strong influence on belief in the
scientific validity of astrology. Cognitive priming is often
powerful, and may modulate beliefs, even when priming is
used to promote pseudo-science. Some people may
confuse astrology with astronomy; astrology has origins
associated with positional astronomy. This confusion may
lead to an incorrect response regarding the scientific
validity of astrology. A high level of scientific literacy and
scientific cognition may serve as safeguards against these
sorts of pseudo-scientific beliefs.
104
were validly measured. A more comprehensive measure
may require assessments consisting of more items. It is also
possible that measures of these concepts may yield
different results inside and outside the laboratory. Different
conceptualizations of scientific literacy and scientific
cognition require different measuring devices.
105
theories, laws and principles. Scientific cognition is
essentially analytical thinking that can be used, and should
be used in a wide range of conditions. At the very least, in
an effort to develop better scientific cognition, students
should be educated in the areas of the philosophy of
science, research methodology, quantitative reasoning
(probabilistic reasoning) and logic. These components are
involved with scientific thinking. Science educators and the
media do a disservice when they promote science and its
wide range of relevant concepts as "just" being able to
remember scientifically derived information, or promote
science as if it is all about just having a sense of
"wonder." Being able to recollect scientific facts and
having a sense of wonder are important regarding science,
but those qualities alone do not ensure high levels of
scientific thinking. Assessment tools may help predict
scientific eminence and be used as screening tools when
hiring or considering admissions to college programs.
More research needs to be done regarding scientific literacy
and scientific cognition. Both of these concepts involve
related cognitive mechanisms, and being knowledgeable in
these areas will have positive consequences. Society is
heavily dependent on science and technology, and these
complex endeavors require complex thinking. We would
like to see future research indicating a high positive
association between scientific cognition and scientific
literacy. A moderate association is not satisfactory.
References
106
10.1177/0963662506070159.
107
Research, 1–22,
doi.org/10.1080/13669877.2016.1148067
Pew Research Center for the People & the Press. (2013).
Public's Knowledge of Science and Technology.
Pew Research Center, Washington D.C.
108
belief in and attitudes toward science.
Walding, R., Fogliani, C., Over, R., & Bain, J.D. (1994).
Gender differences in response to questions on the
Australian national chemistry quiz. Journal of
Research in Science Teaching, 31(8), 833-846.
109
Yan Yip, D., Ming Chiu, M., & Sui Chu Ho, E. (2004).
Hong Kong student achievement in OECD-PISA
Study: Gender differences in science content,
literacy skills, and test item formats. International
Journal of Science and Mathematics Education, 2,
91–106.
110
Analytical Reading: Primary Scientific Literature - Key
Points (Jones & Hale)
Abstract
The purpose of the current paper is to present a pedagogical
method for teaching students to read, analyze, and evaluate
research methodology and conclusions in primary scientific
literature. Analytical reading of primary scientific literature
is an essential skill for advanced undergraduate and
graduate students. Evaluating research involves healthy
criticism and debate. Students should be introduced to this
process of criticism and analysis early and throughout their
college careers. These are skills students can use for their
own research papers, theses, and dissertations, and can also
ensure future clinical practice is evidence-based. The
present method is grounded in research on cognitive and
learning psychology and provides a structure for
developing analytical reading skills in the classroom. Our
conclusions are supported primarily by teaching
evaluations, personal communications with students, and
experience. The method presented is a practical method for
utilizing findings from educational, teaching, and
psychological research in the classroom.
Key points:
111
The analytical reading method begins with a lecture on the
layout of a scientific paper along with readings to provide
additional information and to serve as reference materials
throughout the course. Students are taught to be able to
distinguish among different sources, methodologies and
elements of scientific papers.
112
in advanced undergraduate seminars at two different higher
education institutions. The first institution was a large
research institution with graduate and undergraduate
students and the other was a small liberal arts college.
Student feedback at both sites, from standardized
institutional student evaluations and through personal
communication with students, has been consistently positive.
113
Chapter 2
114
Developing The RQ Test
115
Your work shows that individuals can rate high in
intelligence, and at the same time, rate low in
rationality. Is it likely that an individual will rate low in
intelligence but high in rationality?
Yes, that is a very good question. It is important to realize
that those two outlier states will not occur with equal
frequency. By outlier states I mean people who are low in
rationality and high in intelligence, and then also the
converse state, people who are high in rationality and low
in intelligence. The former will be much more frequent
than the latter. For many types of rational thinking
subcomponents intelligence is necessary but not sufficient.
Also, with respect to many different rational thinking
components, there are at least mild to moderate correlations
with intelligence. Only on a few rational thinking
components–myside bias for example–is it the case that the
rational thinking component is totally disassociated from
intelligence. On those few tasks there will indeed be as
many individuals high in rationality and low in intelligence
as there are low in rationality and high in intelligence. But
that will be the minority of cases.
116
given statement about individual differences may vary
quite a bit across the subcomponents. The answer to your
question here will probably vary quite a bit across the
different subcomponents.
117
pro-rated IQ scores that were indicated. Thus, his
supporters missed the fact that Bush would excel on
something that was assessed by the tests. The supporters
assumed the tests measured only “school smarts” in the
trivial pursuit sense (“who wrote Hamlet?”) that is easily
mocked and dismissed as having nothing to do with “real
life.” That the tests would actually measure a quality that
cast Bush in a favorable light was something his supporters
never anticipated.
In the talks that I give on these topics, when I use the Bush
example I try to head off questions and negative
comments by pointing out that there is an absolute
consensus that there was something wrong with his
thinking style and that this fact is not in dispute–that even
his supporters acknowledge this fact. For example, in a
generally positive portrait of the President, David Frum
nonetheless notes, “he is impatient and quick to anger;
sometimes glib, even dogmatic; often uncurious and as a
result ill-informed”. Conservative commentator George
Will agrees, when he states that in making Supreme Court
appointments, the President “has neither the inclination nor
the ability to make sophisticated judgments about
competing approaches to construing the Constitution” (2005,
p. 23). In short, there is considerable agreement that
President Bush’s thinking has several problematic aspects:
lack of intellectual engagement, cognitive inflexibility,
need for closure, belief perseverance, confirmation bias,
overconfidence, and insensitivity to inconsistency.
118
mindware. Mindware refers to the rules, knowledge,
procedures, and strategies that a person can retrieve from
memory in order to aid decision making and problem
solving. Most mindware is helpful and good for us.
However, some acquired mindware can be the direct cause
of irrational actions that thwart our goals. This type of
mindware I have termed contaminated mindware.
119
Almost regardless of what a person’s future goals may be,
these goals will be better served if accompanied by beliefs
about the world which happen to be true. Obviously there
are situations where not tracking truth may (often only
temporarily) serve a particular goal. Nevertheless, other
things being equal, the presence of the desire to have true
beliefs will have the long-term effect of facilitating the
achievement of many goals.
120
able to give everyone an otherwise harmless drug that
increased their algorithmic-level cognitive capacities (for
example, discrimination speed, working memory
capacity)—in short, that increased their intelligence.
Imagine that everyone in North America took the pill
before retiring and then woke up the next morning with
more memory capacity and processing speed. Both Baron
and I believe that there is little likelihood that much would
change the next day in terms of human happiness. It is
very unlikely that people would be better able to fulfill their
wishes and desires the day after taking the pill. In fact, it is
quite likely that people would simply go about their usual
business—only more efficiently! If given more memory
capacity and processing speed, people would, I believe:
carry on using the same ineffective medical treatments
because of failure to think of alternative causes; keep
making the same poor financial decisions because of
overconfidence; keep misjudging environmental risks
because of vividness (Chapter 6); play host to contaminated
mindware of Ponzi and pyramid schemes; be wrongly
influenced in their jury decisions by incorrect testimony
about probabilities; and continue making many other of the
suboptimal decisions described in several of my books.
The only difference would be that they would be able to do
all of these things much more quickly! Instead, because of
inadequately developed rational thinking abilities—because
of the processing biases and mindware problems I have
discussed in my books—physicians choose less effective
medical treatments; people fail to accurately assess risks in
their environment; information is misused in legal
proceedings; millions of dollars are spent on unneeded
projects by government and private industry; parents fail to
vaccinate their children; unnecessary surgery is performed;
animals are hunted to extinction; billions of dollars are
wasted on quack medical remedies; and costly financial
misjudgments are made.
122
Good Thinking: More Than Just Intelligence
“Abstract
Two critical thinking skills—the tendency to avoid myside
bias and to avoid one-sided thinking—were examined in
three different experiments involving over 1200
participants and across two different paradigms. Robust
indications of myside bias were observed in all three
experiments. Participants gave higher evaluations to
arguments that supported their opinions than those that
refuted their prior positions. Likewise, substantial one-side
bias was observed—participants were more likely to prefer
a one-sided to a balanced argument. There was substantial
variation in both types of bias, but we failed to find that
participants of higher cognitive ability displayed less
myside bias or less one-side bias. Although cognitive ability
failed to associate with the magnitude of the myside bias,
the strength and content of the prior opinion did predict the
degree of myside bias shown. Our results indicate that
cognitive ability—as defined by traditional psychometric
indicators—turns out to be surprisingly independent of two
of the most important critical thinking tendencies discussed
in the literature.”
124
In Experiment 1, the researchers concluded, there was "no
evidence at all that myside bias effects are smaller for
students of higher cognitive ability" (p. 140). The main
purpose of Experiment 2 was to investigate the association
of cognitive abilities with myside and one-side bias. "The
results... were quite clear cut. SAT total scores displayed a
nonsignificant -.03 correlation with the degree of myside
bias and a correlation of .09 with the degree of one-side
bias (onebias1), which just missed significance on a
two-tailed test but in any case was in the unexpected
direction" (p. 147). It was also revealed that stronger beliefs
usually imply heavier myside bias. In Experiment 3 "the
degree of myside bias was uncorrelated with SAT scores",
and "[t]he degree of one-side bias was uncorrelated with
SAT scores" (p. 156). Myside bias was weakly correlated
with thinking dispositions. One-side bias showed no
correlation with thinking dispositions.
125
Intelligence and Rationality: different cognitive skills
126
purpose of Experiment 2 was to investigate the association
of cognitive abilities with myside and one-side bias. "The
results... were quite clear cut. SAT total scores displayed a
nonsignificant -.03 correlation with the degree of myside
bias and a correlation of .09 with the degree of one-side
bias (onebias1), which just missed significance on a
two-tailed test but in any case was in the unexpected
direction" (p. 147). It was also revealed that stronger beliefs
usually imply heavier myside bias. In Experiment 3 "the
degree of myside bias was uncorrelated with SAT scores",
and "[t]he degree of one-side bias was uncorrelated with
SAT scores" (p. 156). Myside bias was weakly correlated
with thinking dispositions. One-side bias showed no
correlation with thinking dispositions.
127
and behaving in a manner that optimizes one's ability to
achieve goals. Epistemic rationality can be defined as
holding beliefs that are in line with available evidence. This
type of rationality is concerned with how well our beliefs
map onto the structure of the world. In order to optimize
rationality one needs adequate knowledge in the domains of
logic, scientific thinking, probabilistic thinking, and causal
reasoning. A wide variety of cognitive skills fall within
these broad domains of knowledge. Many of these skills are
not assessed on IQ tests.
128
cognitive tools that must be retrieved from memory to think
rationally (Perkins, 1995; Stanovich, 2009). The absence of
knowledge in areas important to rational thought creates a
mindware gap. These important areas are not adequately
assessed by typical intelligence tests. Mindware necessary
for rational thinking is often missing from the formal
education curriculum. It is not unusual for individuals to
graduate from college with minimal knowledge in areas
that are crucial for the development of rational thinking.
129
following: “[a] good first start is education, which readers
have already started here by reading this blog entry. Having
an understanding of how cognitive scientists have
expanded what is meant by rationality is important, namely
that rationality is about two critical things: What is true and
what to do.”
130
The Ultimate Goal of Critical Thinking
131
function without interference from the context (prior
opinions, beliefs, vividness effects).
132
expressing the importance of critical thinking lack critical
thinking skills themselves. In fact, many educators are
simply in the business of repeating what others say:
"Critical thinking is important."
Understanding Rationality
133
regarding scientific and probabilistic thinking?
134
Man is an Irrational Animal!
What is rationality?
135
Characteristics of rational thought
Judicious decision-making
Reflectivity
136
behave rationally. David Perkins, Harvard cognitive
scientist, refers to "mindware" as rules, strategies, and other
cognitive tools that must be retrieved from memory to think
rationally (Perkins, 1995; Stanovich, 2009). The absence of
knowledge in areas important to rational thought creates a
mindware gap. These important areas are not adequately
assessed by typical intelligence tests. Mindware necessary
for rational thinking is often missing from the formal
education curriculum. It is not unusual for individuals to
graduate from college with minimal knowledge in areas
that are crucial for the development of rational thinking.
Another type of content problem, mindware contamination,
occurs when one has acquired mindware that thwarts one's
goals and causes irrational action.
137
Below is a list of rational thinking tasks and their
association with cognitive ability / intelligence from
Stanovich (2010, p.221):
138
al., 2002)
Probability matching (Stanovich & West, 2008; West &
Stanovich, 2003)
Hindsight bias (Stanovich & West, 1998c)
Ignoring P(D/NH) (Stanovich & West, 1998d, 1999)
Covariation detection (Stanovich & West, 1998c, 1998d;
Sá et al., 1999)
Belief bias in syllogistic reasoning (Stanovich & West,
1998c, 2008)
Belief bias in modus ponens (Stanovich & West, 2008)
Informal argument evaluation (Stanovich & West, 1997,
2008)
Four-card selection task (Stanovich & West, 1998a, 2008)
EV maximization in gambles (Frederick, 2005; Benjamin
& Shapiro, 2005)
139
Implications of Research and Future Research
140
strategy that can be taught (Reyna & Farley, 2006).
Considering alternative hypotheses is a relatively easy
strategy to teach, and it promotes rational thinking. To
prompt people to think about alternative hypotheses, a
simple instruction to “think of the opposite” is given.
Studies have demonstrated that this strategy can help
prevent the occurrence of various thinking errors (Sanna &
Schwartz, 2006). Probabilistic thinking has been shown to
be more difficult to teach than the previously mentioned
strategies, yet still teachable (Stanovich, 2009). Causal
reasoning, another important element in achieving
rationality, is teachable.
141
Common Myths About Rationality
142
In order to engage in actions that fulfill our goals, we need
to base those actions on beliefs that are supported by
evidence.
143
really needed is a more precise type of analytic thought.
More often than not, processes of emotional regulation
enhance rational thinking and behavior.
144
Dysrationalia: Intelligent People Behaving Irrationally
145
Of course one of the implications of this is that it will not
be uncommon to find people whose intelligence and
rationality are dissociated. That is, it will not be uncommon
to find people with high levels of intelligence and low
levels of rationality, and, to some extent, the converse. Or,
another way to put it is that we should not necessarily
expect the two mental characteristics to go together. The
correlations are low enough--or moderate enough--that
discrepancies between intelligence and rationality should
not be uncommon. For one type of discrepancy, that is for
people whose rationality is markedly below their
intelligence, we have coined the term dysrationalia by
analogy to many of the disabilities identified in the learning
disability literature:
http://en.wikipedia.org/wiki/Dysrationalia
146
In its stronger sense, the sense employed in cognitive
science and in this book by de Sousa (2007), rational
thought is a normative notion. Its opposite is irrationality,
and irrationality comes in degrees. Normative models of
optimal judgment and decision making define perfect
rationality in the noncategorical view employed in
cognitive science. Rationality and irrationality come in
degrees defined by the distance of the thought or behavior
from the optimum defined by a normative model. This
stronger sense is consistent with what recent cognitive
science studies have been demonstrating about rational
thought in humans.
147
scientists is termed epistemic rationality. This aspect of
rationality concerns how well beliefs map onto the actual
structure of the world. The two types of rationality are
related. In order to take actions that fulfill our goals, we
need to base those actions on beliefs that are properly
calibrated to the world.
148
What are some of the rational thinking skills that are
positively associated with intelligence? How about
rational thinking skills that are not associated with
intelligence?
149
The purpose of our work, and many of our recent
publications, has been to speed the development of an RQ
test along. We have done this by showing that there is no
impediment, theoretically, to designing such a measure.
The tasks that would be on such a measure have been
introduced into the recent literature. In several recent
publications we have been working on bringing them
together into a coherent structure. Of course there are
many, many, more steps that are needed before one has an
actual standardized test. Standardization samples would
need to be run and items would need to be piloted. In terms
of the corporations that produce mental tests, it’s an
endeavor that, if one were to measure it in dollars, would
be millions of dollars.
150
process. There clearly would be immediate practical uses of
less all-encompassing instruments that focused on
important components of rational thinking (e.g., economic
thinking, probabilistic thinking, scientific thinking, reduced
myside biased thinking).
151
A good first start is education, which readers have already
started here by reading this blog entry. Having an
understanding of how cognitive scientists have expanded
what is meant by rationality is important, namely that
rationality is about two critical things: What is true and
what to do.
152
edition is just out)
http://web.mac.com/kstanovich/iWeb/Site/Home.html
http://www.yorku.ca/mtoplak/
http://web.me.com/westrf1/Site_2/Welcome.html
153
Rationality Quotient
154
CART into an “in the box” standardized measure, but that
is a larger goal than we had for this book.
I think that, at least so far, most academics have understood
our goals and the feedback has been good. We wrote a
summary article on the CART in a 2016 issue of the journal
Educational Psychologist (51, 23-34) and the feedback
from that community has been good.
155
tendency to substitute affect for difficult evaluations; the
tendency to over-weight short-term rewards at the expense
of long-term well-being; the tendency to have choices
affected by vivid stimuli; and the tendency for decisions to
be affected by irrelevant context.
156
components of rational thinking do show considerable
dissociation from intelligence. Overconfidence (measured
by the Knowledge Calibration Subtest of the CART) shows
only a .38 correlation with intelligence. This represents a
substantial amount of dissociation for a key component of
rational thinking. Kahneman, for example, devoted
substantial portions of his best-selling book to this
component of rational thinking. Myside bias (measured by
our Argument Evaluation Subtest) likewise shows a
correlation of .38, indicating a substantial dissociation.
This thinking bias is at the center of many discussions of
what it means to be rational. Some of the subtests that
most directly measure the components of the axiomatic
approach to utility maximization show relatively mild
correlations with intelligence. For example, the Framing
Subtest shows a fairly low .28 correlation. Framing
measures a foundational aspect of rational thinking
according to the axiomatic approach.
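One common way to read correlations of this size is to square them, which gives the proportion of variance the two measures share. The sketch below applies that arithmetic to the three correlations quoted above. The function name and the dictionary labels are illustrative assumptions, and the calculation is offered as a reading aid, not as an analysis taken from the CART itself.

```python
def shared_variance(r: float) -> float:
    """Proportion of variance two measures share, given their correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return r * r

# Correlations with intelligence quoted in the passage above.
correlations = {
    "Knowledge Calibration (overconfidence)": 0.38,
    "Argument Evaluation (myside bias)": 0.38,
    "Framing": 0.28,
}

shared = {name: shared_variance(r) for name, r in correlations.items()}

for name, r in correlations.items():
    print(f"{name}: r = {r:.2f}, shared variance = {shared[name]:.1%}")
```

A correlation of .38 corresponds to roughly 14% shared variance and .28 to roughly 8%, which leaves most of the variation in these subtests unaccounted for by intelligence.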
Not in the near future, no. Our goal with the book was
more modest—to simply raise awareness of the importance
of rational thinking and the ability of modern cognitive
psychology to measure it. The result of our efforts will, we
hope, redress the imbalance between our tendency to value
intelligence versus rationality. In our society, what gets
measured gets valued. Our aim in developing the CART
was to draw attention to the skills of rational thought by
measuring them systematically. In the book, we are careful
to point out that we operationalized the construct of rational
thinking without making reference to any other construct in
psychology, most notably intelligence. Thus, we are not
trying to make a better intelligence test. Nor are we trying
to make a test with incremental validity over and above IQ
tests. Instead, we are trying to show how one would go
about measuring rational thinking as a psychological
construct in its own right. We wish to accentuate the
importance of a domain of thinking that has been obscured
because of the prominence of intelligence tests and their
proxies. It is long overdue that we had more systematic
ways of measuring these components of cognition, that are
important in their own right, but that are missing from IQ
tests. Rational thinking has a unique history grounded in
philosophy and psychology, and several of its
subcomponents are firmly identified with well-studied
paradigms. The story we tell in the book is of how we have
turned this literature into the first comprehensive device for
the assessment of rational thinking (the CART).
158
adequate degrees of instrumental rationality in our present
society the skills assessed by the CART are essential. In
Chapter 15 of The Rationality Quotient we include a table
showing that rational thinking tendencies are linked to real
life decision making. In that table, for each of the
paradigms and subtests of the CART, an association with a
real-life outcome is indicated. The associations are of two
types. Some studies represent investigations where a
laboratory measure of a bias was used as a predictor of a
real-world outcome. Others are reports of real-world
analogues of biases that were originally discovered in the
lab. Clearly more work remains to be done on tracing the
exact nature of the connections—that is, whether they are
causal. The sheer number of real-world connections,
however, serves to highlight the importance of the rational
thinking skills in our framework. Now that we have the
CART, we could, in theory, begin to assess rationality as
systematically as we do IQ. If not for professional inertia
and psychologists’ investment in the IQ concept, we could
choose tomorrow to more formally assess rational thinking
skills, focus more on teaching them, and redesign our
environment so that irrational thinking is not so costly.
Whereas just thirty years ago we knew vastly more about
intelligence than we knew about rational thinking, this
imbalance has been redressed in the last few decades
because of some remarkable work in behavioral decision
theory, cognitive science, and related areas of psychology.
In the past two decades cognitive scientists have developed
laboratory tasks and real-life performance indicators to
measure rational thinking tendencies such as sensible goal
prioritization, reflectivity, and the proper calibration of
evidence. People have been found to differ from each
other on these indicators. These indicators are structured
differently from the items used on intelligence tests. We
have brought this work together by producing here the first
comprehensive assessment measure for rational thinking,
the CART.
160
Chapter 3
161
When using frequency distribution tables, when is it
appropriate to use grouped frequency distribution tables?
162
the researcher get a quick view of the data. If there are too
few or too many intervals, the table might not reflect a clear
picture. The table should be relatively easy to read and
understand.
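A minimal sketch of how a grouped frequency distribution table can be built is shown below. The scores are hypothetical and invented purely for illustration, and the function name, interval width, and starting value are assumptions chosen to keep the number of intervals manageable. The sketch assumes whole-number scores that are all at or above the starting value.

```python
from collections import Counter

def grouped_frequency_table(scores, interval_width, start):
    """Count how many scores fall into each class interval.

    Intervals run [start, start + width - 1], [start + width, ...], and so
    on, which suits whole-number scores at or above `start`.
    """
    counts = Counter((score - start) // interval_width for score in scores)
    table = []
    for index in range(max(counts) + 1):
        lower = start + index * interval_width
        upper = lower + interval_width - 1
        table.append((lower, upper, counts.get(index, 0)))
    return table

# Hypothetical exam scores, invented purely for illustration.
scores = [52, 55, 61, 64, 67, 68, 70, 71, 73, 75,
          76, 78, 79, 82, 84, 85, 88, 91, 93, 97]

table = grouped_frequency_table(scores, interval_width=10, start=50)

for lower, upper, frequency in table:
    print(f"{lower}-{upper}: {frequency}")
```

Changing the interval width to 2 or 25 in this sketch shows the problem described above: too many or too few intervals make the table harder to read.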
163
Group-, which is a threat to internal validity. Researchers
attempt to balance group differences with random
assignment.
164
that differed between conditions. That is, an experiment is
internally valid if we can say with a high degree of
confidence that we properly determined causation, and the
experiment is not confounded (flawed, confused). There are
numerous threats to internal validity; some of the most
common include: nonequivalent control group, history
effect, maturation effect, testing effect, regression to the
mean, instrumentation effect, mortality or attrition,
diffusion of treatment, experimenter and participant effects,
and floor and ceiling effects.
165
may have possible alternative explanations. If it lacks
internal validity we cannot rule out other explanations, so
we cannot say the cause and effect was properly
determined.
166
a causal claim suggests that a variable(s) is causing a
change in another variable. Morling (2012, p. 64, Table 3.2)
provides verb phrases that can help you distinguish
between association and causal claims. Below is a list of
some of the verbs provided by Morling (2012).
Is linked to
Goes with
Predicts
Is tied to
Is at risk for
Promotes
Causes
Leads to
Helps
Increases
167
explanations for the relationship must be eliminated
(internal validity).
Suppose, on the other hand, you ask people who play video
games to participate in a study, and 60 people show up. You
then ask them what kind of video games they play (violent
or non-violent). If they say violent you put them into one
group, and if they say non-violent you put them into the
other. This is not really a true independent variable. It's a
quasi-independent variable (and this is not a great quasi-
independent design). In a sense, your participants put
themselves into the groups. The groups were already
formed; a specific characteristic was required to be a
member of either group. If the subjects who played the
violent video games scored higher on the aggressiveness
scale it could be because they had just played violent video
games, or it could be that they are more aggressive people
in the first place (a third-variable problem that could be an
alternative explanation). Another example of a quasi-
experiment is seen when investigating the effects of a new
reading program. One school is exposed to the new
program while another school is not. At the end of the year
the students at both schools are tested and compared on
reading comprehension. Participants were assigned to one of
the groups depending on the school they attend. Thus, a
characteristic placed them in one group or the other. To
reiterate, with random assignment participants have an equal
chance of being in either group.
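A minimal sketch of random assignment, assuming 60 hypothetical participant IDs and two conditions; the function name, the group labels and the seed are illustrative and not part of any study described here. Shuffling the list and splitting it in half gives each participant an equal chance of being in either group:

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into two equal-sized groups."""
    rng = random.Random(seed)      # the seed only makes the demo repeatable
    shuffled = list(participants)  # copy, so the original list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participant IDs 1 through 60
participants = list(range(1, 61))
violent_group, nonviolent_group = randomly_assign(participants, seed=42)
print(len(violent_group), len(nonviolent_group))
```

Here the researcher, not the participant, decides who ends up in which group, and chance does the deciding.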
169
operations used to measure or manipulate a variable.
Juxtapose this with a conceptual definition, which is more
abstract, and does not provide specified operations
concerning manipulation or measurement. Operational
definitions allow others to replicate studies, by using the
same operations used in the study they are attempting to
replicate. Operational definitions also allow studies to be
critiqued.
170
show how they are created. This allows researchers to
investigate construct validity, which asks (regarding the
IV): How well was the IV manipulated?
171
generally high achievers, while those showing up later may
not be so motivated or concerned with high levels of
achievement. This may be a threat to internal validity, and
result in non-equivalent groups.
172
really variables (in the context of the study) - because they
do not vary.
Cause: IV
Effect: DV
You need the mean score. Then you subtract the mean from
each individual score, square each difference, and then
add the squared deviations. The SS (sum of squares) is very
important regarding measures of dispersion (width,
variability). The SS is needed in order to calculate the variance.
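The steps above can be sketched in a few lines of Python. The scores are made up for illustration, and dividing SS by n - 1 for a sample variance (or by N for a population variance) is the standard convention rather than something spelled out in this answer:

```python
def sum_of_squares(scores):
    """Sum of squared deviations from the mean (SS)."""
    mean = sum(scores) / len(scores)             # step 1: the mean
    return sum((x - mean) ** 2 for x in scores)  # steps 2-4: subtract, square, add

scores = [1, 2, 3, 4, 5]   # hypothetical scores; the mean is 3
ss = sum_of_squares(scores)

sample_variance = ss / (len(scores) - 1)   # SS divided by n - 1
population_variance = ss / len(scores)     # SS divided by N

print(ss, sample_variance, population_variance)
```

The deviations are -2, -1, 0, 1 and 2, so the squared deviations add up to 10.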
174
represent characteristics of populations and samples. A
parameter is generally derived from measurements of
individuals in a population (entire group- people, non-
human animals or objects- a researcher is interested in). A
statistic is generally derived from measurements of
individuals in a sample (participants/ subjects in a study
used to represent population of interest).
175
When data consist of numerical scores measured on a
continuous scale, polygons or histograms are generally
used. When scores are measured on a nominal or ordinal
scale, bar graphs are used. Also, error bars are often
included to demonstrate the significance of the difference
between groups.
176
consider effect size. When interpreting effect size consider:
a small effect size might represent an important result, and
a large effect size might represent an unimportant
result.
177
How well does the sample represent a given population?
178
What is the difference between standard deviation and
standard error?
179
Two additional key points when considering the central limit
theorem:
1- it describes the distribution of sample means for any
population
2- the distribution of sample means approaches a normal
distribution relatively fast. By the time the sample size
reaches 30 the distribution is close to perfectly normal
180
a particular study); and n, to sample size (subjects /
participants selected from a population, generally intended
to represent the population of interest). It is not always
possible to measure everyone in the population, so samples
are used as representations of the population. It is possible
to determine what the distribution of sample means looks
like without acquiring thousands of samples. This can be
done by way of the central limit theorem. It provides a
precise description of the distribution that would be
obtained if we were to acquire every possible sample,
compute every sample mean, and construct the distribution
of sample means.
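For a very small population it is actually possible to acquire every possible sample, which makes a handy check on the idea. The sketch below assumes a made-up population of four scores and samples of size 2 drawn with replacement; the comparison value, sigma divided by the square root of n, is the usual standard error formula and is an addition not spelled out in this answer:

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [2, 4, 6, 8]   # hypothetical population of N = 4 scores
n = 2                       # sample size

# Every possible sample of size n, drawn with replacement (16 samples)
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

mu = mean(population)               # population mean
sigma = pstdev(population)          # population standard deviation
mean_of_means = mean(sample_means)  # center of the distribution of sample means
sd_of_means = pstdev(sample_means)  # spread of the distribution of sample means
expected_se = sigma / sqrt(n)       # standard error formula

print(mu, mean_of_means)
print(round(sd_of_means, 4), round(expected_se, 4))
```

The mean of the sample means equals the population mean, and the spread of the sample means matches the standard error formula, without anyone having to collect thousands of real samples.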
181
Alternative (research) hypothesis- there is a difference, an
effect (statistically significant difference)
What is Cohen’s d?
Values of d
.20 small
.50 medium
.80 large
1.10 very large
1.40+ extremely large
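A rough sketch of how d can be computed and read against the values above. It assumes the common pooled-standard-deviation version of the formula for two independent groups; the function names, the "below small" label and the scores are all illustrative, and the cutoffs are treated as lower bounds:

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group1, group2):
    """Mean difference divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) +
                  (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / sqrt(pooled_var)

def describe_d(d):
    """Label the size of d using the values listed above as lower bounds."""
    size = abs(d)
    if size >= 1.40:
        return "extremely large"
    if size >= 1.10:
        return "very large"
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "below small"   # hypothetical label, not from the list above

treatment = [6, 7, 8, 9, 10]   # hypothetical scores
control = [5, 6, 7, 8, 9]      # hypothetical scores
d = cohens_d(treatment, control)
print(round(d, 2), describe_d(d))   # 0.63 medium
```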
182
are relevant to your interests or questions.
183
with different levels of the trait being measured? If
measuring depression, you can compare scores on the test
for those who have depression with those not suffering
from depression. There should be differences in scores if
the measure is valid.
184
used.
185
What is the difference between an independent-measures
ANOVA and a two-factor ANOVA?
Possible effects:
Main effect: IV 1
Main effect: IV 2
Interaction effect: IV 1 * IV 2 (multiplicative effect)
If there are three to five authors, cite all the authors the
first time they are cited; for subsequent citations of the
work, use the last name of the first author followed by et
al. and the date. If there are one or two authors, cite one or
both of them each time they are cited.
If there are six or more authors, cite the first author
followed by et al. and the date for the first and all
subsequent citations.
187
“Repetition is the mother of all skills”
No, they are not the same. The confusion usually occurs
when one reads that the standard score is used to indicate
the number of standard deviations an individual score is
from the mean of the distribution. That is different from
saying the standard score and standard deviation are
synonymous.
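A small sketch may make the distinction concrete. It assumes a hypothetical test with a mean of 50 and a standard deviation of 10; the function name and numbers are illustrative only:

```python
def z_score(score, mean, sd):
    """Number of standard deviations a score lies from the mean."""
    return (score - mean) / sd

# Hypothetical distribution: mean = 50, standard deviation = 10
z = z_score(65, mean=50, sd=10)
print(z)   # 1.5
```

The standard deviation (10) describes the whole distribution, while the standard score (1.5) describes where one individual score sits within it.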
188
Why are z-scores important?
189
performed better on the English test.
190
Correct- failing to reject the null hypothesis when it is true
191
involves at least two variables, and the variables are
measured, but not manipulated. The correlation coefficient
is the measure of the degree of relationship between scores.
It can vary between –1.00 and +1.00.
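As an illustration, the Pearson correlation coefficient (one common correlation coefficient, assumed here) can be computed from two sets of measured scores. The variable names and data are invented for the example:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of products of deviations, and the two sums of squares
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

hours_studied = [1, 2, 3, 4, 5]   # hypothetical measured variable
quiz_score = [2, 4, 5, 4, 5]      # hypothetical measured variable
r = pearson_r(hours_studied, quiz_score)
print(round(r, 2))   # 0.77
```

Both variables are simply measured, not manipulated, so a value like this describes the degree of relationship and nothing more.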
192
make unobtrusive observations (they hide). Two, they wait
it out. That is, they let the participants or subjects get used
to their presence before recording observations. Three,
instead of measuring behavior researchers may measure
traces of behavior. As an example, in an art exhibit,
wear-and-tear on the floor tiles can tell you which areas are
most traveled.
193
The temporal precedence rule is sometimes referred to as
the directionality problem. This is because we don’t know
which variable came first. The internal validity rule is
sometimes referred to as the third-variable rule. When an
alternative explanation for the association between
variables is plausible, the alternative is the third variable.
Correlation alone cannot be used to establish cause and
effect. Two other criteria, temporal precedence and
internal validity, are also needed to establish a cause-and-
effect relationship.
194
exposure to each level of the independent variable.
What is a Meta-analysis?
195
Researchers may sort a group of studies into categories,
which allows them to compute separate effect size averages
for each category.
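A bare-bones sketch of that sorting step, assuming a handful of invented studies, two invented categories and a simple unweighted average (published meta-analyses typically weight studies, for example by sample size, which is left out here):

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (category, effect size) pairs, one per study
studies = [
    ("lab", 0.40),
    ("lab", 0.60),
    ("field", 0.10),
    ("field", 0.30),
]

# Sort the studies into categories
by_category = defaultdict(list)
for category, effect_size in studies:
    by_category[category].append(effect_size)

# Compute a separate average effect size for each category
average_effect = {cat: mean(sizes) for cat, sizes in by_category.items()}
print(average_effect)
```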
196
categorical variables because the measuring involves
dividing objects or individuals into different categories.
Examples of data measured on a nominal scale are gender
and ethnicity.
197
equal unit size and absolute zero.
198
in a distribution of scores that occurs with the greatest
frequency. With some distributions, all scores occur with
the same frequency, which means there is no mode in this
situation. In some distributions numerous scores may
occur with equal frequency, thus meaning more than one
mode. What is the mode for the following group of scores-
2,3,5,5,6? Answer: 5
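The three situations described above (one mode, no mode, more than one mode) can be checked with a short sketch. The function name is illustrative, and returning an empty list for "no mode" is a choice made for this example:

```python
from collections import Counter

def modes(scores):
    """Return the mode(s) of a list of scores; an empty list means no mode."""
    if not scores:
        return []
    counts = Counter(scores)
    highest = max(counts.values())
    # All scores occur equally often (and there is more than one value): no mode
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(value for value, c in counts.items() if c == highest)

print(modes([2, 3, 5, 5, 6]))   # [5]
print(modes([1, 2, 3, 4]))      # []
print(modes([1, 1, 2, 2, 3]))   # [1, 2]
```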
199
not used then the study is a quasi-experiment.
200
shows if there is a difference between sample means and
whether the difference is greater than would be expected by
chance.
201
design that resembles an experiment in some ways, but is
different in others. Ex post facto designs and experiments
involve comparison groups, but ex post facto designs do
not involve manipulation of independent variables. In ex
post facto designs researchers choose a variable of interest
and select participants who already differ on this variable.
What is epidemiology?
202
Scientific researchers are careful not to let personal biases
about the world blind them to reality. Scientists rely
primarily on objective information. History shows us that
subjective experience is confounded. Many people do not
understand the problems with subjective interpretations.
Instead of acknowledging our tendency to fool ourselves,
many people focus on what feels right.
203
collected a basic understanding of statistics is needed.
Understanding research methodology and statistics will
help you:
Be a better thinker
Be scientifically literate
204
outcome. There are plausible alternative
explanations for the outcome.
205
the video when the multi-color bowl was placed in front of
them on the left, and the other half watched the video while
the multi-color bowl was placed to the right. The single-
color bowl in both conditions was placed on the side
opposite the multi-color bowl.
They are similar, but they might not be exactly the same.
With a randomized controlled trial participants are
randomly assigned to two or more groups. The only
difference between the groups (regarding relevant
factors) is the level (condition) of the variable being
studied that they are exposed to. In a randomized
controlled trial at least one of the groups is a control group.
In experiments, the different groups might be exposed to
different treatments with no control group; an
experiment may or may not have a control group.
206
A randomized controlled trial, as with an experiment, must
avoid the problem of self-selection. This problem is
avoided by not allowing participants to choose which level
of the variable (being studied) they are exposed to. To
reiterate, they are randomly assigned.
207
the independent variable(s) (variable of interest) they are
exposed to. Some researchers suggest that if a treatment is
administered or arrangements are made for treatment
administration the study is labeled an experiment (Patten,
2004).
208
sometimes referred to as factors). In the most common
factorial design, researchers use two independent variables.
Researchers study each possible combination of the
independent variables.
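To see what "each possible combination" means in practice, here is a small sketch of a 2 x 2 factorial design. The two independent variables and their levels are invented for illustration:

```python
from itertools import product

# Two hypothetical independent variables (factors), each with two levels
caffeine = ["no caffeine", "caffeine"]
sleep = ["4 hours sleep", "8 hours sleep"]

# Every possible combination of levels is one condition (cell) in the design
conditions = list(product(caffeine, sleep))
for number, (caffeine_level, sleep_level) in enumerate(conditions, start=1):
    print(number, caffeine_level, "+", sleep_level)
```

Two factors with two levels each produce four conditions; adding a third level to either factor would produce six.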
What is a p-value?
209
which statistical methods are superior and the implications
of these stats. This is a big topic. Refer to the sources I
mentioned earlier in the section titled Why We Need
Statistics.
210
was insignificant; when I ran a statistical power analysis it
revealed I needed a larger sample, considering effect size
and p-value, to find significance. Statistically significant
and insignificant findings should be replicated, and the
replications should be of different types, using samples
with varying characteristics.
211
found by one research group. To reiterate, scientific
progress is cumulative; it develops as a product of the
work of, sometimes, many people. In some cases it is
necessary to repeat studies that didn't find significance.
The original study might be flawed.
212
individual correlations.
213
Recommended Sources
214
Myers, A., & Hansen, C. (2002). Experimental Psychology
(5th edition). Australia: Wadsworth Thomson
Learning.
215
References
216
Gilovich, T. (1991). How We Know What Isn’t So: The
Fallibility of Human Reason in Everyday Life. New
York: The Free Press.
217
Hall, G.C. (1988). Criminal Behavior as a Function of
Clinical and Actuarial Variables in a Sexual
Offender. Journal of Consulting and Clinical
Psychology, 56 (5), 773-775.
218
Human Behavior. Malden, MA: Wiley-Blackwell.
219
Mitchell, M.L., & Jolley, J.M. (2010). Research Design
Explained 7th Edition. Belmont, CA: Wadsworth,
Cengage Learning.
220
Reber, A.S. (1985). The Penguin Dictionary of Psychology.
London, England: Penguin Books.
221
Stanovich, K. (2007). How To Think Straight About
Psychology 8th Edition. New York, NY: Pearson.
222
Stanovich, K., West, R., & Toplak, M. (2016). The
Rationality Quotient: Toward A Test of Rational
Thinking. Cambridge, MA: The MIT Press.
223
Appendix A
IV:
DV:
Operationalization of DV:
IV:
DV:
Operationalization of DV:
224
3- A group of researchers wanted to determine whether
people would eat more food in a cool room than in a hot
room. Half the participants ate in a warm room (75 degrees
Fahrenheit) and half the participants ate in a cool room (65
degrees Fahrenheit). The researchers then measured how
much food was consumed in each of the two rooms.
IV:
DV:
Operationalization of DV:
IV:
DV:
Operationalization of DV:
IV:
DV:
Operationalization of DV:
225
An answer key is provided at:
http://jamiehalesblog.blogspot.com/2013/11/answer-key-in-evidence-we-trust.html
226
Practice Problems (Osbaldiston, 2011: Problems 1-8)
227
4- A researcher wanted to see if a new program would have
the same effect on workers paid hourly versus workers paid
on salary. What would be an appropriate outcome variable
for this research?
6- A confound is:
228
D- Deciding if the procedure will be single-blind or
double-blind
A- case study
B- correlational study
C- quasi-experimental study
D- experimental study
229
Answer key is provided at:
http://jamiehalesblog.blogspot.com/2013/11/answer-key-in-evidence-we-trust.html
230
Calculating mean, median and mode:
231
Z - scores
Suppose that the individuals that took the test had the
following scores:
Person Score
Roberto 49
Dellis 45
Pamelisa 41
Hennis 39
Roberto =
Dellis =
Pamelisa =
Hennis =
232
Appendix B
One study found that participants ate more rice when it was
presented in a large bowl (Davis & Bush, 2000). With this
example, be sure to use an ampersand (&) and place the
period outside of the parentheses.
When a source has one or two authors you will cite their
names and the date every time you refer to that source. When
a source has three to five authors, you will cite all of the
names and the date the first time. Subsequently, when citing
the source, you will use the first author’s last name followed
by “et al.” and the date:
When a source has six or more authors you will cite the
first author only, followed by “et al.”, with each citation:
The reference list contains a list of all the sources you cited
in your paper, in alphabetical order. Do not put sources in
the reference list that were not cited in the paper. Below
is an example of a journal article with one author:
234
Below is an example book:
235
Index
abstract .................... 78, 82, 84, 88, 123, 170, 183, 191, 224
Abstract ........................................................................... 123
actuarial judgment ............................................................. 49
Ad Hominem ..................................................................... 15
alpha level ............................................................... 177, 180
alternative hypothesis........................................ 29, 181, 186
anecdotes ................................................................. 2, 71, 82
Anecdotes .................................................................... 12, 71
ANOVA........................................................... 184, 186, 227
Argumentum ad antiquitatem............................................ 16
Argumentum ad novitatem ............................................... 17
Association claim ............................................................ 167
between-participants ....................................................... 185
Between-participants....................................................... 185
burden of proof ........................................................... 17, 18
causation ............. 13, 14, 37, 38, 71, 78, 165, 167, 193, 204
Causation......................................................... 2, 37, 38, 205
central tendency ........................................ 40, 176, 198, 207
cognitive ability .............. 123, 124, 125, 126, 138, 145, 222
Cognitive ability.............................................................. 125
COGNITIVE ABILITY .................................................. 138
cohen’s d ......................................................................... 182
Common Myths .................................................................. 3
common sense ................................................34, 35, 36, 117
Common sense .................................................... 34, 36, 223
Common Sense ............................................... 2, 34, 35, 221
concepts....................................... 25, 29, 41, 54, 55, 72, 131
Concepts ............................................................................ 25
confirmation bias .........................................66, 67, 118, 122
Confirmation bias........................................................ 66, 67
construct validity............................................. 171, 172, 205
Construct validity ............................................................ 172
contaminated mindware ................... 118, 119, 121, 122, 141
control group ................................... 164, 165, 171, 199, 206
Control Group ................................................................. 163
control variable ....................................................... 172, 173
correction factor .............................................................. 175
correlation .... 13, 15, 37, 38, 55, 71, 78, 125, 126, 127, 129,
145, 151, 166, 167, 183, 192, 193
Correlation ............................................ 14, 37, 55, 191, 194
correlational studies .......................................................... 37
Correlational studies ......................................................... 38
Correlational Studies ........................................................... 2
counterbalance ................................................................ 205
covariation..................................................... 14, 38, 56, 167
Covariation ................................................................ 56, 193
critical region .................................................................. 180
critical thinking .. 12, 83, 123, 124, 125, 126, 131, 132, 133,
214, 222
Critical thinking .......................................... 3, 131, 133, 134
Critical Thinking ............................................................. 218
cross-sectional designs .................................................... 201
crystallized intelligence .................................................. 127
Crystallized intelligence.................................................. 127
cynic ................................................................................ 8, 9
Cynic ................................................................................... 8
decontextualized reasoning ............................................. 131
dependent variable ............ 23, 166, 174, 182, 194, 202, 209
descriptive statistics ........................................................ 176
Descriptive statistics ................................................. 40, 176
Double Blind ..................................................................... 58
dysrationalia .....................................................117, 146, 217
Dysrationalia ............................................... 3, 128, 146, 217
effect size ........................ 176, 177, 178, 182, 192, 195, 196
Effect size................................................................ 176, 182
epidemiology................................................................... 202
epistemic rationality ...................33, 114, 133, 142, 148, 152
Epistemic rationality ............................... 128, 133, 135, 148
ex post facto design ......................................................... 201
experts ................................................. 12, 18, 43, 45, 50, 69
Experts .................................................................... 2, 43, 46
extreme scores ......................................................... 198, 208
Extreme scores ................................................................ 208
factorial designs ...................................................... 185, 209
Fluid intelligence ............................................................ 127
frequency distribution ............................................. 162, 207
hypotheses ............11, 28, 29, 37, 78, 79, 122, 141, 191, 204
Hypotheses ...................................................... 28, 29, 54, 55
hypothesis ...11, 28, 29, 37, 71, 75, 177, 180, 181, 182, 186,
190, 191, 206, 228
Hypothesis..........................................................................11
inferential statistics ................................................. 176, 190
Inferential statistics ................................... 40, 176, 182, 190
instrumental rationality ................................... 143, 147, 148
Instrumental rationality ........................... 127, 133, 135, 143
instruments .......................................................... 25, 26, 151
Instruments ........................................................................ 25
intelligence 4, 22, 55, 67, 114, 116, 117, 120, 121, 122, 123,
124, 125, 126, 127, 128, 129, 137, 138, 140, 145, 146, 149,
150, 151, 183, 220, 222
Intelligence .. 2, 114, 115, 117, 123, 128, 129, 136, 137, 139,
140, 150, 222
internal validity ...... 13, 15, 38, 56, 164, 165, 166, 168, 172,
173, 193, 194, 199, 200
Internal validity ................................................. 38, 166, 193
Internal Validity .............................................................. 164
interval .................................................... 163, 196, 197, 198
Interval ............................................................................ 198
IQ ............................... 67, 117, 118, 127, 128, 136, 220, 222
IRB board ........................................................................ 171
logical fallacies ........................................................... 34, 70
Longitudinal designs ....................................................... 201
mean ....2, 9, 13, 14, 17, 24, 28, 37, 64, 75, 78, 85, 116, 142,
165, 174, 177, 179, 180, 181, 188, 189, 191, 198, 205, 207,
208, 231, 232
Mean ............................................................................... 232
measurement ....................... 26, 85, 170, 176, 196, 197, 202
Measurement ..................................................................... 26
median ..................................................................... 198, 231
meta-analysis........................................................... 195, 196
Meta-analysis .................................................................. 195
mode.................................................. 21, 198, 199, 207, 231
myside bias.................59, 116, 123, 124, 125, 126, 127, 149
Myside bias ............................................. 124, 125, 126, 127
nominal ................................................... 176, 196, 197, 198
Nominal........................................................................... 198
non-science ....................................................................... 30
nonsense detection kit ......................................................... 3
Nonsense Detection Kit ............................................ 2, 3, 65
nonsense indicator ............................................................. 72
Nonsense indicator .................. 65, 66, 67, 68, 69, 70, 71, 72
null hypothesis .......................... 28, 180, 186, 190, 191, 228
Null hypothesis ............................................................... 181
observation ................................................ 20, 21, 50, 91, 92
Observation ........................................................... 20, 22, 91
operational definition ............................ 25, 29, 66, 169, 170
ordinal ..................................................... 176, 196, 197, 198
Ordinal ............................................................................ 198
parameter................................................................. 174, 175
Parameter ........................................................................ 174
participant effects .................................................... 165, 200
Participant effects ............................................................ 200
peer review .................................................................. 59, 60
Peer review................................................................ 58, 217
Peer Review ................................................................ 58, 59
personal beliefs ................................................................. 68
Person-who statistics ......................................................... 42
Person-Who Statistics ....................................................... 41
pseudoscientific............................................................... 141
Pseudoscientific .............................................................. 122
quasi-independent variable ..................................... 168, 169
random assignment .... 23, 24, 163, 164, 169, 171, 172, 193,
199, 207
Random assignment ................................ 164, 171, 172, 193
random sampling ..................................................... 193, 199
randomized controlled trial ..................................... 206, 207
ratio ................................................................. 196, 197, 198
Ratio ................................................................................ 198
rationality ......3, 4, 5, 33, 114, 116, 120, 123, 126, 127, 128,
129, 130, 132, 133, 135, 137, 139, 140, 141, 142, 143, 145,
146, 147, 148, 150, 151, 152, 153, 219, 221
Rationality .......2, 3, 114, 126, 128, 133, 135, 137, 139, 140,
142, 143, 147, 222
reactivity ......................................................................... 192
Relativist Fallacy .............................................................. 18
reporting .................................................................... 24, 197
Reporting........................................................................... 24
research reports ............................................. 5, 87, 188, 220
Research Reports .................................................. 2, 87, 215
RQ test ............................................................................ 150
RQ Test ....................................................................2, 4, 115
SAT ................................................................. 125, 126, 127
scientific methods ....................................................... 33, 83
Scientific methods ......................................................... 7, 33
Single Blind ...................................................................... 58
skeptic ................................................................... 8, 18, 221
Skeptic............................................................................. 2, 8
SS ............................................................................ 174, 175
standard deviation ........................... 176, 179, 188, 189, 232
Standard deviation .......................................................... 179
standard error .................................................................. 179
Standard error.................................................................. 179
statistical prediction .............. 44, 45, 46, 47, 48, 49, 50, 219
Statistical Prediction ....................................................... 217
statistical validity .................................................... 176, 192
survey research........................................................ 205, 206
systematic empiricism ................................................. 20, 91
Systematic Empiricism ..................................................... 20
testimonials ................................................................. 13, 72
Testimonials ........................................................................ 2
theory .................... 30, 31, 63, 68, 74, 75, 85, 120, 127, 152
Theory ................................................................................11
true independent variable ........................................ 168, 169
Tukey’s HSD ................................................................... 184
two-tailed hypothesis ...................................................... 181
type 1 error ...................................................................... 228
Type 1 error ..................................................................... 190
type 2 error ...................................................................... 228
Type 2 error ..................................................................... 190
within-participants .......................................... 185, 200, 205
Within-participants.......................................................... 185
z-score ..................................................................... 189, 232
241
About the Author
242
Polemics Applications, LLC
Kevin Akers, President
polemicsapp@yahoo.com
info@polemicsapps.com