Toward Comprehensive Perspectives On The Learning and Teaching of Proof
Guershon Harel
Department of Mathematics
University of California, San Diego
San Diego, CA 92093-0112, USA
(858)534-5273 (Fax)
Larry Sowder
Department of Mathematics and Statistics
San Diego State University
San Diego, CA 92182-7720, USA
(619)594-6746 (Fax)
One of the most remarkable gifts human civilization has inherited from ancient Greece is
the notion of mathematical proof. The basic scheme of Euclid’s Elements has proved
astoundingly durable over the millennia and, in spite of numerous revolutionary
innovations in mathematics, it still guides the patterns of mathematical communication
(Babai, 1992).
This chapter examines proof in mathematics, both informal justifications and the
types of justification usually called mathematical proofs. The introductory section below
calls for what we label a “comprehensive perspective” toward the examination of the
learning and teaching of proof, and identifies the various elements of such a
comprehensive perspective. Our viewpoint next centers on students’ outlooks on proof,
as described by the “proof schemes” evidenced in students’ work; the second section
elaborates on this proof-scheme notion and includes a description of various proof
schemes as well as a listing of the various roles proof can play in mathematics. The third
section then gives a brief overview of how the idea of proof in mathematics has evolved
historically, and why historical considerations could be a part of educational research on
the learning or teaching of proof. The fourth and fifth sections include a look at selected
studies dealing with proof, at both the precollege and the college levels, with an effort to
show the value of the proof-schemes idea. The final section offers some questions
prompted by the earlier sections.
need by raising on several occasions (e.g., 2002) the question: “Is there a shared meaning
of ‘mathematical proof’ among researchers in mathematics education?” Balacheff was
not referring to just the standard, more or less formal definition of mathematical proof.
Rather, his question, as he puts it, is “whether beyond the keywords, we had some
common understanding” (Balacheff, 2002, p. 23). By “common understanding,” we
believe he means agreed-upon parameters in terms of which one can formulate
differences among perspectives into research questions. We agree with Balacheff that
without such an understanding it is hard to envision real progress in our field. The
comprehensive perspective on proof presented in this paper delineates a set of such
Response 1
$\log(4 \cdot 3 \cdot 7) = \log 84 = 1.924$        $\log(4 \cdot 3 \cdot 6) = \log 72 = 1.857$
$\log 4 + \log 3 + \log 7 = 1.924$        $\log 4 + \log 3 + \log 6 = 1.857$
Since these work, then $\log(a_1 \cdot a_2 \cdots a_n) = \log a_1 + \log a_2 + \cdots + \log a_n$.
A probe into the reasoning of the students who provide responses of this kind reveals that
their conviction stems from the fact that the proposition is shown to be true in a few
instances, each with numbers that are randomly chosen—a behavior that is a
manifestation of the empirical proof scheme.
Response 2
(1) $\log(a_1 a_2) = \log a_1 + \log a_2$ by definition
(2) $\log(a_1 a_2 a_3) = \log a_1 + \log a_2 a_3$. Similar to $\log(ax)$ as in step (1), where
this time $x = a_2 a_3$.
$\log(a_1 a_2 a_3) = \log a_1 + \log a_2 + \log a_3$
(3) We can see from step (2) any $\log(a_1 a_2 a_3 \cdots a_n)$ can be repeatedly broken
down to
$\log a_1 + \log a_2 + \cdots + \log a_n$
It is important to point out that in Response 2 the student recognizes that the
process employed in the first and second cases constitutes a pattern that recursively
applies to the entire sequence of propositions, $\log(a_1 a_2 \cdots a_n) = \log a_1 + \log a_2 + \cdots + \log a_n$,
$n = 1, 2, 3, \ldots$.
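The recursive pattern in Response 2 can be written out as a worked derivation. What follows is only a sketch, under the assumptions that all the $a_i$ are positive real numbers (so that each logarithm is defined) and that the two-factor rule of step (1) is taken as given. The grouping of the first $n-1$ factors into a single quantity is our notation, not the student’s.

```latex
% Requires the amsmath package for the align* environment.
% Assumption: a_1, ..., a_n > 0, so every logarithm is defined.
% Two-factor rule (step (1) of Response 2): \log(ax) = \log a + \log x.
\begin{align*}
\log(a_1 a_2 \cdots a_n)
  &= \log\bigl((a_1 a_2 \cdots a_{n-1})\, a_n\bigr) \\
  &= \log(a_1 a_2 \cdots a_{n-1}) + \log a_n
     && \text{two-factor rule with } a = a_1 \cdots a_{n-1},\ x = a_n \\
  &= (\log a_1 + \log a_2 + \cdots + \log a_{n-1}) + \log a_n
     && \text{the same pattern applied to } n-1 \text{ factors.}
\end{align*}
```

Each line uses only the two-factor rule, which is why the step from $n-1$ factors to $n$ factors does not depend on the particular numbers involved.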
In both responses the generalizations are made from two cases. This may suggest,
therefore, that both are empirical. As is explained in Harel (2001), this is not so:
Response 2, unlike Response 1, is an expression of the transformational proof scheme. To
see why, one needs to examine the two responses against the definitions of the two
schemes. While both responses share the first characteristic—i.e., in both the students
respond to the “for all” condition in the log-identity problem statement—they differ in
the latter two: whereas the mental operations in Response 1 are incapable of anticipating
possible subsequent outcomes in the sequence and are devoid of general principles in the
evidencing process, the mental operations in Response 2 correctly predict, on the basis of
the general rule, $\log(ax) = \log a + \log x$, that the same outcome will be obtained in each
step of the sequence. Further, in Response 1 the inference rule that governs the
evidencing process is empirical; namely, $(\exists r \in R)(P(r)) \Rightarrow (\forall r \in R)(P(r))$. In Response
2, on the other hand, it is deductive; namely, it is based on the inference rule
$(\forall r \in R)(P(r)) \wedge (w \in R) \Rightarrow P(w)$. (Here $r$ is any pair of real numbers $a$ and $x$, $R$ is
the set of all pairs of real numbers, $P(r)$ is the statement “$\log(ax) = \log a + \log x$,” and
$w$ in step $n$ is a pair of real numbers $a_1 a_2 \cdots a_{n-1}$ and $a_n$.)
The axiomatic proof scheme too has the three characteristics that define the
transformational proof scheme, but it includes others. For now, it is sufficient to define it
as a transformational proof scheme by which one understands that in principle any
proving process must start from accepted principles (axioms). The situation is more
complex, however, as we will show in the section on historical and epistemological
considerations (third section). For the purpose of this chapter we will introduce only the
Greek axiomatic proof scheme and the modern axiomatic proof scheme—as manifested,
for example, in Euclid’s Elements and Hilbert’s Grundlagen, respectively. The
distinction between these two schemes is further discussed in the third section.
resulting expressions, with the intention to derive information relevant to the problem at
hand. We return to this scheme in the next section.
The above definitions and taxonomy are not explicit enough about many critical
functions of proof within mathematics. There is a need to point to these functions due to
their importance in mathematics in general and to their instructional implications in
particular. For this, we point to the work of other scholars in the field, particularly the
work by Hanna (1990), Balacheff (1988), Bell (1976), Hersh (1993), and de Villiers
(1999). de Villiers, who built on the work of the others mentioned here, raises two
important questions about the role of proof: (a) What different functions does proof have
within mathematics itself? and (b) How can these functions be effectively utilized in the
classroom to make proof a more meaningful activity? (p. 1). According to de Villiers,
mathematical proof has six not mutually exclusive roles: verification, explanation,
discovery, systematization, intellectual challenge, and communication. At the end of the
next section, after the relevant schemes are defined, we show that these functions are
describable in terms of the proof scheme construct.
In Greek mathematics, on the other hand, they are undefined terms referring to humans’
idealized physical reality. Treating primitive terms as undefined is fundamentally
different from treating them as variables. Wilder quotes Boole (from 1847) to stress this
The validity of the processes of analysis does not depend upon the interpretation
of the symbols which are employed, but solely upon the laws of their
combination. Every system of interpretations which does not affect the truth of
the relations supposed, is equally admissible… (in Wilder, 1967, p. 116).
Despite the monumental conceptual difference in the referential imageries between Greek
mathematics and modern mathematics, the essential condition in applying deductive
reasoning in both is the existence of primary terms and primary propositions (axioms).
Grundlagen characterizes a structure that fits different models. This obviously is not
unique to geometry. In algebra, a group or a vector space is defined to be any system of
objects satisfying certain axioms that specify the structure under consideration. To
reflect this fundamental conceptual difference, we refer to the Greeks’ proving means as
the Greek axiomatic proof scheme and to the proving means of modern mathematics as
the modern axiomatic proof scheme.
the nature of the operations underlying the Euclidean construction, and hence were
unable to understand the difference between bisecting an angle and trisecting an angle
and why they were able to perform the former but not the latter.
Thus, not until the 17th century with the invention of analytic geometry and
algebra did mathematicians begin to shift their attention from the result of mathematical
operations to the operations themselves. By means of analytic geometry, mathematicians
realized that all Euclidean geometry problems can be solved by a single approach, that of
reducing the problems to equations and applying algebraic techniques to solve them.
Euclidean straightedge-and-compass constructions were understood to be equivalent to
equations, and hence the solvability of a Euclidean problem became equivalent to the
solvability of its corresponding equation(s).
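As an illustration of this equivalence (our example, not one taken from the sources cited here), trisecting a $60^\circ$ angle amounts to constructing the length $x = \cos 20^\circ$, while bisecting an angle requires only a square root. The sketch below uses the standard triple-angle and half-angle identities; the symbols $\theta$ and $x$ are our notation.

```latex
% Requires the amsmath package for the align* environment.
% Illustrative example only; \theta and x are hypothetical names introduced here.
\begin{align*}
\text{Trisection:}\quad
  \cos 3\theta &= 4\cos^3\theta - 3\cos\theta \\
  \tfrac{1}{2} = \cos 60^\circ &= 4x^3 - 3x
     && \text{with } \theta = 20^\circ,\ x = \cos 20^\circ \\
  8x^3 - 6x - 1 &= 0
     && \text{a cubic with no rational root} \\[4pt]
\text{Bisection:}\quad
  \cos\tfrac{\theta}{2} &= \sqrt{\tfrac{1 + \cos\theta}{2}}
     && \text{for } 0 \le \theta \le 180^\circ \text{; a single square root.}
\end{align*}
```

Straightedge-and-compass steps correspond to solving linear and quadratic equations, so a length that satisfies a cubic with no rational root cannot be constructed, whereas the square root needed for bisection always can.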
I say that the exterior angle ACD equals the sum of the two interior and opposite
angles CAB and ABC, and the sum of the three interior angles of the triangle ABC,
BCA, and CAB equals two right angles.
Draw CE through the point C parallel to the straight line AB (by Proposition I.31).
Since AB is parallel to CE, and AC falls upon them, therefore the alternate angles
BAC and ACE equal one another (by Proposition I.29).
Again, since AB is parallel to CE, and the straight line BD falls upon them,
therefore the exterior angle ECD equals the interior and opposite angle ABC (by
Proposition I.29).
But the angle ACE was also proved equal to the angle BAC. Therefore the whole
angle ACD equals the sum of the two interior and opposite angles BAC and ABC.
Add the angle ACB to each. Then the sum of the angles ACD and ACB equals the
sum of the three angles ABC, BCA, and CAB (by Common Notion 2).
But the sum of the angles ACD and ACB equals two right angles. Therefore the
sum of the angles ABC, BCA, and CAB also equals two right angles (by
Proposition I.13 and Common Notion 1).
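The chain of equalities in this argument can be laid out compactly. The following is only a restatement of the quoted steps in modern angle notation: the $\angle$ symbols and the abbreviation $2R$ for “two right angles” are our shorthand, and the setting in the first comment is assumed from Euclid’s enunciation rather than quoted above.

```latex
% Requires the amsmath package for the align* environment.
% Assumed setting: triangle ABC, side BC produced to D, CE drawn through C parallel to AB.
% 2R abbreviates "two right angles" (our shorthand).
\begin{align*}
\angle ACE &= \angle BAC
   && \text{(I.29, alternate angles, transversal } AC\text{)} \\
\angle ECD &= \angle ABC
   && \text{(I.29, exterior and interior opposite angles, transversal } BD\text{)} \\
\angle ACD &= \angle ACE + \angle ECD = \angle BAC + \angle ABC \\
\angle ACD + \angle ACB &= \angle ABC + \angle BCA + \angle CAB
   && \text{(Common Notion 2)} \\
\angle ACD + \angle ACB &= 2R
   && \text{(I.13)} \\
\angle ABC + \angle BCA + \angle CAB &= 2R
   && \text{(Common Notion 1)}
\end{align*}
```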
This proof appeals to two facts, one about the auxiliary segment CE and the other
about the external angle ACD. Note that the property holds whether or not the segment
CE is produced and the angle ACD considered. One might then raise the question, what
is the true cause of the property proved? This question was a center of debate during the
16th-17th centuries about whether mathematics is a science. Philosophers of this period,
according to Mancosu (1996), used this proof to demonstrate their argument that
mathematics is not a perfect science because “implication” in mathematics is a mere
logical consequence rather than a demonstration of the cause of the conclusion. Their
argument was based on the Aristotelian definition of science, according to which one
does not understand something until he or she has grasped the why of it. “We suppose
ourselves to possess unqualified scientific knowledge of a thing, … , when we think that
we know the cause on which the fact depends as the cause of the fact and of no other …”
(Aristotle, pp. 111-112).
Mathematical statements of the form “A if and only if B” provided an additional
argument against the scientificness (i.e., causal nature) of mathematics, for—these
philosophers claimed—if mathematical proof were scientific (i.e., causal), then such a
statement would entail that A is the cause of B and B is the cause of A, which implies A
is the cause of itself—an absurdity.
This position entailed rejection of proof by contradiction, for such a proof does
not demonstrate the cause of the property that is being argued. When a statement “A
implies B” is proved by showing how not B (and A) leads logically to an absurdity, one
does not learn anything about the causality relationship between A and B. Nor does one
gain any insight into how the result was obtained. Consequently, proof by exhaustion
(e.g., Archimedes’ known method of proof for calculating volume, area, and perimeter of
different objects), which is necessarily based on proof by contradiction, also was
unsatisfactory to many mathematicians of the 16th and 17th centuries. They argued that
the ancients, who broadly used proof by exhaustion to avoid explicit use of infinity, failed
to convey their methods of discovery.
Not all philosophers of the time held this position. According to Mancosu (1996),
Barozzi, for example, argued that some parts of mathematics are more scientific (causal)
than others; but that a proof by contradiction is not a causal proof, and therefore it should
be eliminated from mathematics. Others, like Barrow, argued that all mathematical proofs
are causal, including proof by contradiction:
It seems to me … that Demonstrations, though some do outdo others in Brevity,
Elegance, Proximity to their first Principles, and the like Excellencies, yet are all
alike in Evidence, Certitude, Necessity, and the essential Connection and mutual
Dependence of the Terms one with another. Lastly, that Mathematical
Ratiocinations are the most perfect Demonstrations. (Quoted in Mancosu, 1996,
p. 23)
Of particular interest is the position held by Rivaltus on the issue of causal proof
in mathematics. Mancosu (1996) illustrates this position by Rivaltus’ commentary on
Archimedes’ proof for the theorem that the area of the surface of the sphere is four times
the area of a great circle of the sphere. In this proof, Archimedes inscribes and
circumscribes the sphere with auxiliary solids to show that the surface of the sphere can
be neither smaller nor greater than four times the great circle—a typical Archimedean
proof by exhaustion, which necessarily involves proof by contradiction. So, there are two
issues here: the use of proof by contradiction and the use of the auxiliary solids (as in the
case of the proof of Proposition I.32). Each of these two features renders, in the eyes of
some philosophers of the time, Archimedes’ proof non-causal. Rivaltus rejects this
possibility on the basis of a distinction between “cause” and “reason”:
Ostensive demonstrations in mathematics are not considered more perfect than the
ones by contradiction, since in these disciplines it is not made use of the cause of
the thing, but of the cause of the knowledge of the thing. … The figures drawn are
not truly the cause [italics added] of that equality but are reasons [italics added]
from which we know it. From whence it follows that whatever is more fit to
knowledge is more appropriate to the mathematician. But we know more easily
that absurdities are impossible, false and repugnant by reason that we know the
true things. Indeed the truths are concealed and conversely the errors are obvious
everywhere. … Again it is to be observed that the Geometers do not make use of
the cause of a thing, but of the cause from which the thing is known. Indeed it is
sufficient to them to show the thing to be so and they do not enquire by which
means it is so. (Rivaltus, 1615; quoted in Mancosu, 1996, pp. 26-27)
The Aristotelian theory of science and particularly its appeal to cause and effect
manifested itself in another aspect of mathematical practice during the 17th century—that
of the use of “genetic definitions.” These definitions appeal to the generation of
mathematical magnitude by motion; Euclid’s definition of a sphere as an object generated
by the rotation of a semicircle around a segment taken as axis is an example. An example
of a non-genetic definition is that of a circle defined as a set of all points in the plane that
are equidistant from a given point in that plane. According to Mancosu (1996), although
the insistence on the use of genetic definitions was not universal, important
mathematicians of the time emphasized their importance because they were viewed as
demonstrating cause and, hence, as conforming to Aristotle’s epistemological position of what
constitutes a science. For example, Barrow stated that of all the possible ways of
generating a magnitude, the most important is the method of local movements, and he
uses motion as a fundamental concept in his work in geometry” (Mancosu, 1996, p. 96),
and Hobbes and Spinoza, “emphasize the role of genetic definitions as the only causal
definitions, thereby excluding the nongenetic definitions from the realm of science” (p.
The Greeks’ motive for constructing the remarkable geometric edifice that we
now call Euclidean geometry was their desire to create a consistent system that is free
from paradoxes. Avoiding paradoxes constituted, in part, an intellectual need for the
transition from Greek to modern mathematics. According to Wilder (1967, based, in
part, on Freudenthal, 1962), against the customary view that attributes the change to the
introduction of non-Euclidean geometry,
there was little evidence of excitement or even interest in the mathematical
community regarding the work of Bolyai and Lobachevski, for at least 30 years
modern notion of number is a case in point. The emergence of negative numbers raised
questions as to the utility of symbols without a concrete referent and especially without a
geometrical referent. How is it possible, for example, to subtract a greater quantity from
a smaller one, where the mental image of “quantity” is nothing else but a physical amount
or a spatial capacity? Moreover, how is it possible to understand such a statement as
$(-1)/1 = 1/(-1)$, where the quantity 1 is larger than the quantity $-1$, and therefore, the division
of 1 by $-1$ must be greater than the division of $-1$ by 1? (See Mancosu, 1996.)
extent and in what ways is the nature of the entities intertwined with the nature of
proving? For example, students’ ability to construct an image of a point as a
dimensionless geometric entity might impact their ability to develop the Greek axiomatic
proof scheme, and vice versa. As far as we know, this interdependency has not been
explicitly addressed and its implications for instruction have not been considered.
The motive for the Greeks’ construction of their geometric edifice, according to
the historical account presented earlier, was their desire to create a consistent system that
was free from paradoxes such as those of Zeno. For example, to avoid Zeno’s paradoxes
the Greeks based their geometric proofs strictly in the context of static concepts. What
does this tell us about how to help students see a need for the construction of a geometric
structure, particularly that of Euclid? What is the cognitive or social mechanism by
which deductive proving can be necessitated for the students?
Post-Greek mathematics.
Symbolic algebra, which began with Vieta’s work, seems to have played a critical
role in the transition from Greek mathematics to modern mathematics, particularly in
relation to the reconceptualization of mathematical proof as a sequence of arguments
valid by virtue of their form, not content. In the new concept of proof, one would begin
with identities and by virtue of rules of symbolic definitional substitutions proceed
through a finite number of steps until the theorem is proved. With symbolic algebra,
mathematicians shifted their attention from results of operations (e.g., whether and how
an angle can be bisected or trisected) to the operations themselves (e.g., the underlying
difference between bisecting and trisecting an angle). A critical outcome of this shift was
the discovery that all Euclidean geometry problems can be solved by a single approach,
that of reducing the problems to equations and applying algebraic techniques to solve them.
The role of symbolic algebra in the reconceptualization of mathematics in general
and of proof in particular raises a critical question about the role of symbolic
manipulation skills in students’ conceptual development of mathematics, in general, and
of proof schemes, in particular. Can students develop the modern conception of proof
without computational fluency? And in view of the increasing use of electronic
technologies in schools, particularly computer algebra systems, one should also ask:
Might these tools deprive students of—or, alternatively, provide students with—the
opportunity to develop algebraic manipulation skills that might be needed for the
development of an advanced conception of proof? In addressing this question, it is
necessary, we believe, to distinguish between two kinds of symbolic proof schemes: non-
referential and referential. As we discussed earlier, in the former scheme, neither the
symbols nor the operations one performs on them represent a quantitative reality for the
students. Rather, students think of symbols and algebraic operations as if they possess a
life of their own without reference to their functional or quantitative meaning. By
contrast, in the referential symbolic proof scheme, to prove or refute an assertion or to
solve a problem, students learn to represent the statement algebraically and perform
symbol manipulations on the resulting expressions. The intention in these symbolic
representations and manipulations is to derive relevant information that deepens one’s
understanding of the statement, and that can potentially lead her or him to a proof or
refutation of the assertion or to a solution of the problem. In such an activity, one does
not necessarily form referential representations for each of the intermediate expressions
and relations that occur in the symbolic manipulation process, but has the ability to
attempt to do so in any stage in the process. It is only in critical stages—viewed as such
by the person who is carrying out the process—that one forms, or attempts to form, such
representations. This ability is potential rather than actual because in many cases the
attempt to form quantitative representations may not be successful. Nevertheless, a
significant feature of the referential symbolic proof scheme, which is absent from the
non-referential symbolic proof scheme, is that one possesses the ability to pause at will to
probe into the meaning (quantitative or geometric, for example) of the symbols.
Together with the emergence of symbolic algebra, a new conception of
mathematical entity, particularly that of number, began to emerge. A mathematical
entity, in this conception, is not necessarily dependent on its “natural” pre-scientific
experience but on its connection to other entities within a structure and its function within
that structure. For example, while the Greeks were highly selective in their choice of
numbers—they rejected irrational numbers, for example—the post-Greek mathematicians
began to accept them. This conceptual change was not without difficulty. For example,
some mathematicians of the 17th century rejected the utility of negative numbers, which
they viewed as symbols without real experiential referents. The conceptual attachment to
a context (whether it is the context of intuitive Euclidean space or that of $\mathbb{R}^n$) was
dubbed the contextual proof scheme. In this scheme, general statements, intended for
varying realities, are interpreted and proved in terms of a restricted context. The question
of the developmental inevitability of the contextual proof scheme has not been fully
addressed in mathematics education research. Some evidence exists to indicate that even
students in an advanced stage in their mathematical education have not developed this
scheme. For example, it has been shown that many mathematics majors enrolled in
advanced geometry courses have major difficulties dealing with any geometric structure
except the one corresponding to their spatial imageries, and that mathematics majors
enrolled in linear algebra courses interpret and justify general assertions about entities in
a general vector space in terms of $\mathbb{R}^2$ and $\mathbb{R}^3$ entities (Harel & Sowder, 1989; Harel,
1999). Such findings have major curricular implications. For example, they raise major
doubts as to the wisdom of the practice of starting off college geometry courses with
finite geometries or of introducing general vector spaces in the first course of linear
The debate among philosophers during the Renaissance about the scientificness of
mathematics and the mathematics practice that ensued is of particular pedagogical
significance. As we have outlined, the question was whether the mathematical practice in
which “implication” is a mere logical consequence, rather than a demonstration of the
cause of the conclusion, is scientifically acceptable. This, in turn, raises questions about
the acceptability of proof by contradiction and proof by exhaustion. Were these issues of
marginal concern to the mathematicians of the 16th and 17th centuries, or had they been
significantly affected by them? To what extent did the practice of mathematics in the 16th
and 17th centuries reflect global epistemological positions that can be traced back to
Aristotle's specifications for perfect science? These are important questions, if we are to
draw a parallel between the individual's epistemology of mathematics and that of the
community. As noted by Mancosu (1996), this debate had a deep and profound impact
on the practice of mathematics during the 15th to 18th centuries. For example, the
practices of Cavalieri, Guldin, Descartes, and Wallis reflected a deep concern with these
issues by, for example, explicitly avoiding proofs by contradiction in order to conform to
the Aristotelian position on what constitutes perfect science. This history shows that the
modern conception of proof was born out of an intellectual struggle—a struggle in which
Aristotelian causality seems to have played a significant role. Is it possible that the
development of students’ conception of proof includes some of these epistemological
obstacles (in the sense of Brousseau, 1997)—obstacles that may be unavoidable, for they
are inherent to the meaning of concepts in relation to humans’ current schemes?
We conclude this section on the relevance of history and philosophy of
mathematics to the learning and teaching of proof with questions pertaining to the idea of
“genetic definitions”—mathematical definitions that utilize motion to generate
magnitudes—from the 17th century. As we have indicated, the use of such definitions
was viewed by some important mathematicians of the time to conform to the Aristotelian
epistemological position on the centrality of causality in science. Can this account for the
positive impact that dynamic geometry environments might have on advancing students’
proof schemes? What exactly is the conceptual basis for the relationship between motion
and causal proofs? In a later section we will report on several studies that have examined
this effect.
Functions of Proof
Earlier, in the second section, we described a portion of our taxonomy of proof
schemes. The discussion that followed brought up several other schemes that emerged in
the history of mathematics. In the rest of this section, we depict all the schemes
mentioned in this paper (Table 1) and discuss their functions within mathematics. This
list is not complete; we only depict those schemes that are needed for the discussion in
this paper (for the complete taxonomy, see Harel & Sowder, 1998).
Table 1
Proof Schemes

External Conviction    Empirical     Deductive
-------------------    ---------     ----------------
Authoritative          Inductive     Transformational
Ritual                 Perceptual    Causality
Non-referential                      Greek axiomatic
                                     Modern axiomatic
As we indicated earlier (see the second section), de Villiers (1999) built on the
work of other scholars, particularly Hanna (1990), Balacheff (1988), Bell (1976), and
Hersh (1993), to address important questions about the role of proof. Specifically, what
different functions does proof have within mathematics itself and how can these functions
be effectively utilized in the classroom to make proof a more meaningful activity? de
Villiers suggests that mathematical proof has six not mutually exclusive roles:
• verification
• explanation
• discovery
• systematization
• intellectual challenge
• communication
In what follows, we will show that all of these functions but one (intellectual challenge)
are describable in terms of the proof scheme construct. Some of the proof schemes used
to interpret these functions appeared in the taxonomy presented above. A description of
each—with our additions and modifications—follows.
Verification refers to the role of proof as a means to demonstrate the truth of an
assertion according to a predetermined set of rules of logic and premises—the axiomatic
proof scheme.
Explanation is different from verification in that for a mathematician it is usually
insufficient to know only that a statement is true. He or she is likely to seek insight into
why the assertion is true. We referred to this as the causality proof scheme.
Discovery refers to the situations where through the process of proving, new
results may be discovered. For example, one might realize that some of the statement
conditions can be relaxed, thereby generalizing the statement to a larger class of cases.
Or, conversely, through the proving process, one might discover counterexamples to the
assertion, which, in turn, would lead to a refinement of the assertion by adding necessary
restrictions that would eliminate counterexamples. Lakatos’ (1976) thought experiment
on the proof of Euler’s theorem for polyhedra best illustrates this process. In some cases
one may ask whether a certain axiom is needed to establish a certain result, or what form
the result would have if a certain axiom is omitted. We considered this as a case of the
axiomatizing proof scheme.
Systematization refers to the presentation of verifications in organized forms,
where each result is derived sequentially from previously established results, definitions,
axioms, and primary terms. This too is a case of the axiomatic proof scheme. The
difference between systematization and verification is in the extent of formality.
Communication refers to the social interaction about the meaning, validity, and
importance of the mathematical knowledge offered by the proof produced.
Communication can be viewed in the context of the two subprocesses that define
proving: ascertaining and persuading.
Intellectual challenge refers to the mental state of self-realization and fulfillment
one can derive from constructing a proof. As we mentioned earlier, this role does not
correspond to any of our proof schemes.
With the notion of proof scheme as an organizing concept—appended with these
functions—we will now present selected findings reported in the literature that pertain to
students’ conceptions of proof. Of course, we are unable to describe without speculation
most of these findings in terms of the proof scheme construct, because these studies had
not been conceptualized or designed with our proof scheme construct in mind. However,
as we will see, much can be said about these research findings in relation to proof
NAEP studies
A first place to look for data is the periodic National Assessment of Educational
Progress in the United States [NAEP], involving, typically, students at ages 9, 13, and 17
(and now reported by grade: 4, 8, and 12). The wide geographic sampling and the large
sample sizes, usually many thousands of students, in a NAEP indicate the U.S. picture,
which can be further checked with smaller studies and contrasted with studies outside the
U.S. For our purposes, however, one limitation of the NAEP has been that only a few
items on proof or logic can be included, because of the scope of the tests. Another
limitation is that since all the NAEP questions considered here are multiple-choice items,
it is difficult to pinpoint the actual reasoning students employed in answering them.
In the planning stage, the first NAEP mathematics assessment (1972-1973)
included mathematical proof and logic as objectives to be evaluated, although only a few
items tested these areas (Carpenter, Coburn, Reys, & Wilson, 1978, p. 10). Later NAEPs
used different designs for setting objectives to be tested. The fourth NAEP (1985-1986),
for example, included items testing “mathematical methods,” with a few intended to
include “a general understanding of the nature of proof and axiomatic systems, and logic”
(Carpenter, 1989, p. 3). Analysts of the results concluded that “most 11th-grade students
demonstrated little understanding of the nature and methods of mathematical
argumentation and proof” (Silver & Carpenter, 1989, p. 11), citing results on items
requiring the recognition of counterexamples (with success rates of 31%-39%—see
Figure 1 for two items), on items testing understanding of the terms “axiom” and
“theorem” (fewer than one-fourth and about half correct, respectively), and on an
undisclosed but “straightforward” item on indirect proof (about one-third
correct) and on mathematical induction (similar results) (pp. 17-18). Even when only those
students who had taken geometry were considered, results were just slightly better than
those for students who had taken mathematics only through first-year algebra. The
overall performance led the analysts to conclude that “the generally poor performance on
these items dealing with proof and proof-related methods suggests the extent to which
students’ experiences in school mathematics, even for students in college-preparatory
courses, may often fail to acquaint them with the fundamental nature and methods of the
discipline” (p. 18).
A. Larry says that n² > n for all real numbers. Of the following, which value of n shows
the statement to be FALSE?
-1/2 (23%) 0 (29%) 1/10 (39%) 1 (9%) (9% not responding)
B. Jim says, “If a 4-sided figure has all equal sides, it is a square.” Which figure might
be used to prove that Jim is wrong?
Given: BD ≅ EC
Prove: AB ≅ EF
Fig. 2. A sample geometry proof item from Senk’s study (1985, p. 451).
The overall results were dismaying for the course in the U.S. in which deductive
proof schemes should be expected to develop: “[The] data suggest that approximately 30
percent of the students in full-year geometry courses that teach proof reach a 75-percent
mastery level in proof writing….29 percent of the sample could not write a single valid
proof” (Senk, 1985, p. 453). At the time, about half of high school graduates took a
course in geometry, so it is interesting to speculate about what more recent performances
might be, when roughly 80% of U.S. high schoolers take geometry (U.S. Department of
Education, 2000, p. 122).
Other, smaller studies in the U.S. have also involved students in geometry or later
courses, since virtually all formal work with proof has traditionally been introduced in the
geometry course (9th or 10th year is most common). But, for example, the interviewees in
Tinto’s (1988) study felt that proof was used only to verify facts that they already
knew—an antithesis to the discovery or explanation functions of proof discussed earlier.
Thompson’s study (1991) is of special interest since her subjects were all
advanced students taking the last course of a curriculum targeting university-bound
students and emphasizing reasoning and proof as a major strand. Yet Thompson
expressed concern about the number of students who “proved” a statement by providing a
specific example (a manifestation of the inductive proof scheme), and only about
one-third of her subjects could find a counterexample to a number theory statement (For all
integers a and b, if a² is divisible by b then a is divisible by b), in a “prove or disprove”
context. Thompson also referred to the “enormous difficulty that students had with
indirect proof” (1991, p. 23), with only 3% able to complete one indirect proof (that the
sum of a rational number and an irrational number is an irrational number). Difficulties
with indirect proof could well be related to the earlier discussion of the causal proof
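The number theory statement in Thompson's (1991) "prove or disprove" task can be disproved by a single counterexample. The following is a minimal sketch of a brute-force search. The function name, the search limit, and the restriction to positive integers are illustrative assumptions, not part of the study.

```python
def find_counterexamples(limit):
    """Return pairs (a, b), 1 <= a, b <= limit, where b divides a*a but not a."""
    return [
        (a, b)
        for a in range(1, limit + 1)
        for b in range(1, limit + 1)
        if (a * a) % b == 0 and a % b != 0
    ]


# Hypothetical search range; any pair found disproves the "for all" claim.
counterexamples = find_counterexamples(10)
print(counterexamples[0])  # (2, 4): 4 divides 2*2 = 4, but 4 does not divide 2
```

One pair is enough to refute the universal statement, whereas no number of confirming pairs would establish it.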
Knuth, Slaughter, Choppin, and Sutherland (2002) found that 70% of roughly 350
students in grades 6-8 used examples (the empirical proof scheme) in justifying the truth
of two statements (show that the sum of two consecutive numbers is always an odd
number; show that when you add any two even numbers, your answer is always even, p.
1696). Only a few students attempted general arguments. On a more positive note,
across grade levels, the students did show an increasing sensitivity to adhering to a given
Since the focus of these studies was proof performance, they provide even more
striking evidence than the NAEP studies did that most U.S. students, even those in
college-preparatory programs, do not seem to utilize deductive proof schemes.
Porteous (1986, 1990) gave questionnaires to about 400 British students, ages 11-16,
and interviewed 50 of them. From the questionnaires, he found that more than 40%
of the responses endorsed completely a generalization on the basis of examples only, and
only about 10% then offered proofs on their own when asked to explain their decisions
(1990, p. 591). Furthermore, 83% of those not offering a proof initially claimed to
understand a given proof of an assertion, but only 61% of all the students, including the
ones offering a proof on their own, were sure that the author was correct about the
assertion after seeing a proof (1986, p. 8). From the students’ reactions to given proofs,
Porteous concluded that “It is only when a pupil devises a proof for himself that he is
convinced of the truth of a statement. A proof provided by a teacher may have some
effect, but it is small in comparison with a d(i)scovered proof…we need to encourage
pupils to investigate relationships for themselves, in order to produce their own reasons,
or proof, for general statements. This is not to say, of course, that empirical work has no
real part to play in the learning process” (1986, pp. 21-22).
Williams (1980) interviewed 11th grade Canadian students in a college
preparatory program and concluded that fewer than 30% showed a grasp of the meaning
of mathematical proof, that about half of the students saw no need to prove a statement
that they regarded as obvious, roughly 70% did not distinguish between inductive and
deductive arguments, and fewer than 20% understood how indirect proof works.
Spanish researchers Recio and Godino (2001) conducted a study involving two
groups (n = 429 and n = 193) of beginning university students. They found that about a
third of the 429 students and fewer than a quarter of the 193 students could prove both of
two elementary statements (the difference between the squares of consecutive natural
numbers is odd and equal to the sum of the numbers, and the bisectors of adjacent
supplementary angles are perpendicular). Also, fewer than half of the group of 429 were
successful on each individual proof. Roughly 40% used empirical reasoning: “Empirical
inductive [proof] schemes were the spontaneous type of argumentation in a high
percentage of students when they were confronted with new problems, in which it was
necessary to develop new proof strategies, different from the learned procedures” (p. 91).
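For contrast with empirical reasoning, the first of the two statements (the difference between the squares of consecutive natural numbers is odd and equal to the sum of the numbers) admits a short deductive argument. The derivation below is one possible argument, written with an arbitrary natural number n as an assumed symbol; it is not necessarily the form of proof Recio and Godino expected.

```latex
\begin{align*}
(n+1)^2 - n^2 &= n^2 + 2n + 1 - n^2 \\
              &= 2n + 1 \\
              &= n + (n+1).
\end{align*}
```

Since 2n + 1 is odd for every natural number n, the difference is odd, and the last line shows that it equals the sum of the two consecutive numbers.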
In Israel, Fischbein and Kedem (1982) found that their high school students did
not appear to understand that no further examples need be checked, once a proof was
given, a finding confirmed by Vinner (1983), who also noted that many high school
students (35%), even high attaining ones (39%), seem to regard a given proof as the
method to examine and verify a later particular case.
Similar results were found in a recent Japanese study (cited in Fujita & Jones,
2003). The official curriculum of Japan calls for students to “understand the significance
and methodology of proof.” However, even though most 14 to 15 year old students are
successful at proof writing, “around 70% cannot understand why proofs are needed” (p.
Internationally, overall the most positive conclusion seems to be that the proof
glass is not completely empty but that it is by no means even close to full. Even “good”
performances may be tainted by little understanding or appreciation of the functions of
proof. The prevalence of empirical proof schemes for most students seems to be
domain’s proof techniques (p. 111), knowledge of which theorems are important and
when they are useful (p. 112), and knowledge of when, and when not, to use strategies
based on symbol manipulation rather than deeper knowledge (p. 113). In a similar vein,
Raman’s (2003) interviews of mathematics students and faculty suggested the importance
of a “key idea”—“an heuristic idea which one can map to a formal proof with appropriate
sense of rigor” (p. 323). She concluded, “For mathematicians, proof is essentially about
key ideas; for many students, it is not” (p. 324). In expert-novice studies, one often does
not know how the experts acquired their expertise (or whether there is a selection factor
involved), but knowing what differences exist may give ideas for instructional emphases.
But in questioning university mathematics faculty about university mathematics majors’
proof understanding, a “twice is nice” theme emerged: Exposure to the same material
twice allows the student, on the second exposure, to focus on proof methods (Sowder,
2004). Perhaps these second exposures are helpful in attaining other aspects of Weber’s
strategic knowledge and Raman’s key ideas, and in growing in deductive proof schemes.
Marty (1991) felt that his explicit attention to proof methods, rather than the common
focus on new mathematical content, helped his college students succeed in later
mathematics courses.
University students’ distinctions among axioms, definitions, and theorems are not
sharp. Vinner (1977), for example, found that only about half of a group of Berkeley
sophomores and juniors in mathematics could correctly identify all of three statements
about exponents as definitions, as opposed to theorems or laws or axioms. Since much of
algebra focuses on algorithms and these students had first studied the material in high
school, perhaps the lack of such distinctions is expected. Yet, when Brumfiel (1973)
questioned a class of University of Michigan juniors and seniors, nearly all of whom had
completed a university course in formal geometry, he found that their mastery of similar
distinctions was shockingly deficient. For example, collectively the students could
recall only one axiom (two points determine a line), about half called a definition of
isosceles triangle a theorem, and all were certain that, for two (given) independent
postulates about points and lines, one could be deduced from the other. In Israel,
Linchevsky, Vinner, and Karsenty (1992) found that only about one-fourth of their
university mathematics majors understood that it is possible to have alternate definitions
for concepts. These studies speak to about-proof topics. But about-proof topics cannot
be mastered without understanding the proof topics themselves. In this case the topics in
question involve the meaning and role of axioms and definitions. These studies, then,
suggest a weak, or even absent, axiomatic proof scheme among mathematics majors—an
observation that is consistent with findings of our study with mathematics majors (Harel
& Sowder, 1998).
Task 1. Given four envelopes with a letter on the front side and a number on the back,
select just the envelopes definitely needed to be turned over to find out whether they
violate the rule.
Separate envelopes show on front: D and C, and on back: 5 and 4.
Rule to test: If a letter has a D on one side, then it has a 5 on the other side.
Percent correct (D, 4): 8%

Task 2. Given four envelopes with a space for a stamp on one side and sealed or not,
select just the envelopes definitely needed to be turned over to find out whether they
violate the rule.
Envelopes show
(a) back of sealed envelope;
(b) unsealed envelope with flap up;
(c) front of an envelope with stamp;
(d) front of an unstamped envelope.
Rule to test: If a letter is sealed, then it has a 5 pence stamp on it.
Percent correct (a, d): 88%
Figure 4. Two logically isomorphic tasks with “abstract” [Task 1] and “concrete”
[Task 2] contexts (after Wason & Johnson-Laird, 1972, pp. 191-192).
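The logic behind the correct answer to Task 1 can be sketched in a few lines: an envelope needs to be turned over only if some possible hidden side would make it violate the rule. The names below (LETTERS, NUMBERS, violates, must_turn) are illustrative assumptions, and the hidden sides are assumed to be limited to the letters and numbers shown in the task.

```python
LETTERS = ["D", "C"]
NUMBERS = ["5", "4"]


def violates(letter, number):
    """The rule 'if D on one side, then 5 on the other' fails only for D with a non-5."""
    return letter == "D" and number != "5"


def must_turn(visible):
    """An envelope must be turned if some hidden side could reveal a violation."""
    if visible in LETTERS:
        return any(violates(visible, number) for number in NUMBERS)
    return any(violates(letter, visible) for letter in LETTERS)


to_turn = [face for face in ["D", "C", "5", "4"] if must_turn(face)]
print(to_turn)  # ['D', '4']
```

The envelope showing 5 never needs to be turned, because the rule says nothing about what must accompany a 5; choosing it amounts to testing the converse of the rule.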
Although our focus is mainly on studies dealing directly with mathematics, one
sobering study involving everyday contexts deserves special note, because some of the
data come from prospective secondary school mathematics teachers. Using items from
Eisenberg and McGinty (1974), Easterday and Galloway (1995) compared the
performance of last-year university students planning to teach middle school or high
school mathematics with that of 7-8th graders and 12th graders on a variety of reasoning
tasks. The tasks were based on everyday contexts that should be non-suggestive (e.g., “If
John is big, then Jane is big. John is big. Is Jane big? Yes/no/maybe”). In particular, on
modus tollens tasks, the college students scored 47%, about the same as the 12th grade
calculus students but 20 percentage points below the 7-8th graders' 67%. These particular 7-8th graders
were studying geometry and may therefore have been exposed to logic, but an earlier
comparison in 1986 with 7-8th graders in advanced sections had given a similar result,
with the 7-8th graders scoring 62% and the college students only 40% on the modus
tollens tasks (Easterday & Galloway, 1995, p. 433). The authors concluded, “College
students are barely performing better than children whom they may one day teach” (p.
435), and neither group was by any means topping out in performance.
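The argument forms behind such tasks can be checked mechanically with a truth table. The sketch below is an illustration only: the helper names are assumptions, and the two invalid forms (often called affirming the consequent and denying the antecedent) are included for contrast; they are the forms for which "maybe" is the appropriate answer.

```python
from itertools import product


def implies(p, q):
    """Material conditional: 'if p then q' is false only when p is true and q is false."""
    return (not p) or q


def is_valid(premises, conclusion):
    """A form is valid if the conclusion holds in every row where all premises hold."""
    return all(
        conclusion(p, q)
        for p, q in product([True, False], repeat=2)
        if all(premise(p, q) for premise in premises)
    )


forms = {
    "modus ponens": is_valid([implies, lambda p, q: p], lambda p, q: q),
    "modus tollens": is_valid([implies, lambda p, q: not q], lambda p, q: not p),
    "affirming the consequent": is_valid([implies, lambda p, q: q], lambda p, q: p),
    "denying the antecedent": is_valid([implies, lambda p, q: not p], lambda p, q: not q),
}

for name, valid in forms.items():
    print(name, "valid" if valid else "invalid")
```

With p as "John is big" and q as "Jane is big," the sample item quoted above is an instance of modus ponens, while a modus tollens item would give "Jane is not big" and ask about John.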
There seem to be few data on students’ abilities with specific ideas from
logic within mathematics. Mentioned earlier were the limited results from NAEP,
and there the picture was clouded by the use of a non-mathematical context. The large-
scale Longitudinal Proof Project in England has, however, looked at students’
performance on if-then statements involving number theory ideas (Hoyles & Kuchemann,
2002). Among their findings was that 62% of 14-year-olds thought that a given if-then
statement (for example, if the product of two numbers is odd, then the sum of the
numbers is even) “said the same thing” as its converse. However, the longitudinal nature
of the study also allowed Hoyles and Kuchemann to find a 7% improvement on this item
over that of the same students at age 13. (In passing, it may be worth noting that so many
theorems, especially in geometry, do have true converses, so it is easy to see how
students may be insensitive to the logical difference between an if-then statement and its
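The sample statement from the Longitudinal Proof Project and its converse can be compared on a small range of values. This is a sketch under stated assumptions: "numbers" is read as positive integers, the range 1 to 20 is arbitrary, and the helper names are hypothetical. A finite check can expose a false converse but does not by itself prove the original statement.

```python
def conditional_holds(hypothesis, conclusion, values):
    """Check 'if hypothesis then conclusion' for every pair drawn from values."""
    return all(
        conclusion(a, b)
        for a in values
        for b in values
        if hypothesis(a, b)
    )


def product_is_odd(a, b):
    return (a * b) % 2 == 1


def sum_is_even(a, b):
    return (a + b) % 2 == 0


values = range(1, 21)  # assumed test range of positive integers
statement_holds = conditional_holds(product_is_odd, sum_is_even, values)
converse_holds = conditional_holds(sum_is_even, product_is_odd, values)
converse_counterexample = next(
    (a, b)
    for a in values
    for b in values
    if sum_is_even(a, b) and not product_is_odd(a, b)
)

print(statement_holds, converse_holds, converse_counterexample)
```

The pair (2, 2) has an even sum but an even product, so the converse fails even though the original statement survives the check.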
Even advanced students have difficulty with quantifiers. Dubinsky and Yiparaki
(2000) found that university students at various levels, including some in an abstract
algebra course, had much greater trouble giving the mathematical meaning of a doubly
quantified statement when the existential quantifier appeared before the universal
quantifier (“There is a positive number b such that for every positive number a, b ≤ a”:
19% correct) than when the quantifiers were reversed, universal before existential (“For
every positive number a there is a positive number b such that b < a”: 59%). Selden and
Selden (1995) noted that university mathematics students may have difficulties in even
restating mathematical statements precisely, with their largely third- and fourth-year
students often giving incorrect responses when quantifiers were involved. Thus, although
there may be areas of apparent strength in the use of logic (for example, the use of modus
ponens), there appear to be many areas of weakness as well, across a wide range of
levels of schooling.
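The effect of quantifier order in the two statements used by Dubinsky and Yiparaki (2000) can be made explicit in symbols. The block below is a sketch that assumes "positive number" means positive real number; the witnesses b/2 and a/2 are illustrative choices.

```latex
\begin{align*}
&\text{(1)}\quad \exists\, b > 0 \;\; \forall\, a > 0 :\; b \le a
  && \text{false: for any } b > 0,\ a = b/2 \text{ gives } a < b, \\
&\text{(2)}\quad \forall\, a > 0 \;\; \exists\, b > 0 :\; b < a
  && \text{true: take } b = a/2 .
\end{align*}
```

The truth values depend on the assumed domain: over the positive integers, statement (1) would be true (take b = 1) and statement (2) would be false (take a = 1).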
An important point is that everyday usage of logical expressions may differ
considerably from the precise usage in mathematics. Epp (2003) has summarized the
differences in a compelling way, and O’Brien, Shapiro, and Reali (1971) have referred to
“child’s logic” in describing some of the differences. For example, “or” in everyday
usage is most often in the exclusive sense (“I’ll wear my sandals or my tennis shoes”), in
contrast with the inclusive convention common in mathematics. An everyday if-then
statement (for example, “If you finish your work, then you can watch the game”) often
connotes what would be an if-and-only-if statement in mathematics and to many children
seems to be an “and” statement. The disparities between everyday usages and
mathematical usages are so marked that explicit instruction in logic as used in
mathematics would seem to be necessary. Such instruction would point out contrasts with
the less precise everyday usages while, as Epp (2003) contends, perhaps also exploiting
those non-mathematical usages that do reflect the mathematically precise ones, as exemplars for the latter.
Another interesting disparity between everyday usage and mathematical usage is
that of indirect proof. According to Freudenthal (1973), indirect proof is a very common
activity. Seven- to eight-year-old children used contradiction in game playing and
checking conjectures (Reid & Dobbin, 1998). Antonini (2003) even found that students
spontaneously produced indirect arguments in his interviews with them about
mathematical assertions. Yet research has shown that students experience difficulties
with proof by contradiction in mathematics. Leron (1985), for example, observed that
despite the simple and elegant form of certain proofs by contradiction, students
experience what seem insurmountable difficulties. Lin, Lee, and Wu Yu (2003) see the
ability to negate a statement as a prerequisite ability for succeeding at a proof by
contradiction. They found that the difficulty levels of students’ negating a statement can
be ordered decreasingly as negating statements without quantifiers, negating “some,”
negating “all,” and negating “only one.”
In general, then, there are many weak spots in students’ likely grasp of the logical
reasoning used in advanced proof schemes. Is it a chicken-egg question, or can logical
thinking and proof performance grow together? Later in this section we summarize
several studies in which explicit instruction in logical principles was incorporated into
high school geometry courses.
Current Status
United States classes cries out for curriculum developers to address this aspect of
learning mathematics. (Manaster, 1998, p. 803)
It should be clear that only an authoritarian proof scheme is likely to be fostered in these
Despite Porter’s (1993) finding of little attention to proof in high school, one
would certainly expect more explicit attention to proof in the mathematics at those grade
levels where proof most often is a conscious part of the curriculum. Senk (1985) noted,
however, that there were consistent differences across schools in the geometry students’
performances on her proof tasks. Tinto (1988) too noted that one of her four teachers
seemed markedly different from the others in his approach to geometry. In their large-
scale study of proof in British classrooms, Healy and Hoyles (1998) found that students
who had been expected to write proofs and who had classes in which proof was taught as
a separate topic performed somewhat better on proof items than other students.
Thompson (1991), on the other hand, did not notice differences across her nine teachers
and schools; her sample, however, included three private schools and two magnet schools
and so was perhaps not representative. Overall, it appears that at least some of the
deficiencies in students’ acquisition of more sophisticated proof schemes may stem from
the lack of opportunity to engage in proof-fostering activities, even in courses where one
would expect much attention to proof.
The evidence from the status studies of university students’ proof knowledge
suggests that some, if not many, precollege teachers are unlikely to teach proof well,
perhaps because their own grasp of proof was probably limited in college and may not
have grown since then. Knuth (2002a) examined the conceptions of proof of 16
practicing secondary school mathematics teachers, most with backgrounds that would
pass a face-validity test for knowledge of mathematics. In interview settings, Knuth
asked the teachers to respond to general questions about proof (e.g., What purpose does
proof serve in mathematics?), to evaluate given arguments (both proofs and non-proofs),
and to identify the arguments that were most convincing. Although all of the teachers
endorsed the verification role of proof, none mentioned the explanatory role of proof (see
the section on the concept of proof scheme). Six of the 16 thought it might be possible to
find contradictory evidence of a (non-specified) statement that had been proved. Four of
the 16 tested a statement with a given, endorsed proof with further examples (cf.
Fischbein & Kedem, 1982), even though all of the teachers eventually acknowledged that
it would not be possible to find a counterexample. Even though the teachers collectively
correctly identified 93% of the correct arguments as being proofs, over a third of the non-
proof arguments were rated as being proofs! Ten of the 16 accepted the proof of the
converse of one statement as a proof of the statement. Thirteen of the 16 teachers found
arguments based on examples or visual presentations to be most convincing. Although
Knuth felt that their responses may have been directed toward personally convincing
rather than mathematically convincing arguments, the fact that mathematics teachers would be
convinced by such arguments more than by a mathematical proof is significant, because it reveals an
apparent dominance of the empirical proof schemes among the teachers. Knuth (2002b)
further examined these teachers’ ideas about proof in the context of school mathematics
(versus the earlier just-in-mathematics). In view of the NCTM (2000) recommendation
that reasoning and proof be considered fundamental aspects of the study of mathematics
at all levels of study, it is disappointing that the teachers in Knuth's study “…tended to
That mathematics curricula differ in their treatments of proof is by no means a
recent phenomenon. For example, in his study of students’ proof explanations, Bell
(1976) found that proof is the topic that shows the greatest variation in approaches
internationally. He noted that this variation can be attributed to the tension between teachers’
recognition that deduction is essential to mathematics and their recognition that only the
most capable students develop a good understanding of it. There is evidence that this
condition remains true today as well. For example, Fujita and Jones (2003) compared the
textbook treatments of geometry in lower secondary schools in Japan and Scotland and
concluded the following.
Our analysis indicates that…Japanese textbooks set out to develop students’
deductive reasoning skills through the explicit teaching of proof in geometry,
whereas comparative UK [United Kingdom] textbooks tend, at this level, to
concentrate on finding angles, measurement, drawing, and so on, coupled with a
modicum of opportunities for conjecturing and inductive reasoning. (p. 1)
It is, perhaps, natural to expect great variation in the treatment of curricular topics
within countries that do not have national curricular and educational guidelines. In the
U.S., for example, how geometry, the primary locus of proof efforts until recently, should
be handled has led to vastly different opinions and occasionally to different approaches or
emphases in school geometry (cf., e.g., Hoffer, 1981; NCTM, 1973, 1987; Usiskin,
defined rules (i.e., postulates) and target configurations (i.e., theorems) to be attained by
applying the rules (i.e., with proofs). Hence, the students were dealing implicitly with an
axiomatic system.
Lester (1975) sampled 19 students from each of four groups, one group from
grades 1-3, a second from grades 4-6, a third from grades 7-9, and the fourth from grades
10-12, gave them practice with the rules, and studied their performances on the target
tasks. His grades 7-9 students performed as well as the students from grades 10-12, and
the students from grades 4-6 solved about as many tasks as the older students, but took
somewhat longer. Lester suggested that “even students in the upper elementary grades
can be successful at mathematical activities that are closely related to proof” (p. 23).
Indeed, King (1970, 1973) thoroughly developed a 17-day unit dealing with some
elementary number theory results (e.g., a number which is a factor of two numbers is also
a factor of their sum). He found that a group of 10 above-average sixth graders could
reproduce the proofs initially developed with considerable teacher support (in contrast to
a non-equivalent control group), but the evidence also suggested that the proofs were
given from rote memory.
Fawcett’s study of the late 1930s deserves special mention. The title and sub-title
give a good summary: The Nature of Proof, A Description and Evaluation of Certain
Procedures Used in a Senior High School to Develop an Understanding of the Nature of
Proof. Whatever the reason—World War II, the usual inertia in curriculum—this study
seemed to have had little impact, even though it was reported in a yearbook of the NCTM
(Fawcett, 1938/1995). His approach was surprisingly modern in tone. Fawcett’s
summary includes the following.
The theorems [of geometry] are not important in themselves. It is the method
by which they are established that is important, and in this study geometric
theorems are used only for the purpose of illustrating this method. The
procedures used are derived from four basic assumptions:
1. That a senior high school student has reasoned and reasoned accurately before
he begins the study of demonstrative geometry.
2. That he should have the opportunity to reason about the subject matter of
geometry in his own way.
3. That the logical processes which should guide the development of the work
should be those of the student and not those of the teacher.
4. That opportunity be provided for the application of the postulational method
to non-mathematical material.
Non-mathematical situations of interest to the pupils were used to introduce
them to the importance of definition and to the fact that conclusions depend on
assumptions, many of which are often unrecognized. To make definitions and
assumptions and to investigate their implications is to have firsthand experience
with the method of mathematics… (p. 117)
Fawcett’s teaching experiment, with a non-equivalent control group, continued
through two school years, with the report covering just the first year. As the excerpt
above suggests, the students eventually composed, collectively, their lists of undefined
terms, definitions, and assumptions. The need for such elements arose in discussing
everyday situations, such as the importance of definition in discussing how the governor
of Ohio handled a particular bill (pp. 51-52). Of course the teacher played a major role in
initiating such discussions and in providing fruitful leads for particular results, but in the
large the students were responsible for conjecturing results and then proving them. The
evaluation of the experiment was based on a state geometry test and a test of the ability to
analyze non-mathematical material. Even though the experimental students, after one
year of a two-year treatment, had not covered the usual material in the standard course,
their performance (although not reported thoroughly) seemed satisfactory on the 80-point
state geometry test: Median 52.0, state median 36.5 (p. 102). More telling was the
experimental students’ performance on the analysis of non-mathematical material, where
they out-performed by far the control group (change score of 7.5 vs a change score of 1.0;
maximum possible not given) (p. 103). Fawcett also quoted the laudatory reactions of
visitors and of the students themselves, contrasting the students’ final remarks with the
largely indifferent attitudes expressed at the beginning of the experiment.
One can only conclude from these studies that upper elementary school children
can deal with proof ideas or actions, and that high school students can develop
meaningful understandings of proof if they are taught appropriately.
teacher help, led to clearly stated conjectures (e.g., If the sun’s rays belong to the vertical
plane of the oblique stick, then the shadows are parallel) that were then examined further,
with an eye toward establishing them “in general.” In the analysis of these subsequent
attempts at general arguments (i.e., proofs), the researchers noticed that in the successful
proofs, there were connections with key observations made during the conjecture-
forming stage. The researchers’ collective work has led to their hypothesis of “cognitive
unity,” emphasizing the close connection between the reasoning during the formation of a
conjecture and the reasoning in an eventual proof:
(D)uring the production of a conjecture, the student progressively works out
his/her statement through an intensive argumentative activity functionally
intermingling with the justification of the plausibility of his/her choices; during
the subsequent statement proving stage, the student links up with this process in a
coherent way, organizing some of the justifications (‘arguments’) produced
during the construction of the statements according to a logical chain (Boero,
Garuti, Lemut, & Mariotti, 1996, pp. 119-120).
They also argue that such a process is followed by many mathematicians (during the
conjecturing stage, the mathematician uses arguments that can later be adapted to support
his or her mathematical proof) and make the case that much more instruction in
mathematics should involve conjecturing.
One series of studies (Maher, 2002; Maher & Martino, 1996; Martino & Maher,
1999), carried out with instruction in a similar problem-based vein, is notable because of
its long-term nature (occasional sessions over 14 years, usually separate from the regular
mathematics classes) and because of proof behaviors—proof by contradiction, proof by
cases, proof by mathematical induction—that arose naturally, if informally, even in
elementary school, at least on the part of some students. The nature and flavor of the
sessions are communicated by these retrospections of a participant:
Well, we break up into groups…like five groups of three, say, and
everyone in their own groups would have their own ideas, and you’d argue
within your own group, about what you knew, what I thought the answer
was, what you thought the answer was and then from there, we’d all get
together and present our ideas, and then this group would argue with this
group about who was right with this…(Maher, 2002, p. 37)….You didn’t
come in and say, “this is what we were learning today and this is how
you’re going to figure out the problem.” We were figuring out how we
were going to figure out the problem. We weren’t attaching names to that
but we could see the commonness between what we were working on
there and maybe what we had done in school at some point in time and
been able to put those things together and come up with stuff and to do
these problems to come up with, what would be our own formulas because
we didn’t know that other people had done them before. We were just
kind of doing our own thing trying to come up with an answer that was
legitimate and that no matter how you tried to attack it, we could still
answer it… (Maher, 2002, p. 32).
As the excerpts illustrate, all of these student-centered, problem-based studies
have involved a way of teaching that is in stark contrast to the stereotype of a
mathematics class: Students check homework, teacher illustrates something new,
students then do seat-work or homework to practice the new material. The didactical
contract (Brousseau, 1997) in the experimental classes was obviously quite different from
that in the stereotypical one. In particular, the “social norms” were quite different:
Students were expected to work together rather than singly, students were to explain their
solution methods, and students were to listen carefully and evaluate the explanations of
other students, and hence perhaps learn different proof schemes.
Yackel and Cobb (1996) have sharpened the analysis of social norms to identify
“sociomathematical norms,” those social norms that refer specifically to mathematical
activity. For example, coming to accept that an explanation is expected might be a
general social norm, whereas what constitutes an acceptable mathematical explanation
would be a sociomathematical norm (Yackel & Cobb, 1996). Other sociomathematical
norms might include norms dealing with when different explanations are mathematically
different, or when a justification is acceptable, or when justifications or explanations
convey efficiency or elegance. Indeed, Yackel, Rasmussen, and King (2000) focused on
the norms that developed during a problem-based undergraduate differential equations
course. They found that students “frequently explained their reasoning without
prompting, offered alternative explanations and attempted to make sense of other
students’ reasoning and explanations, despite the fact that their prior experiences were
with traditional approaches to mathematics instruction” (p. 276) and, in particular, there
was evidence of the development of such sociomathematical norms as what is an
acceptable mathematical justification and what makes up a mathematically different
explanation. That the students, who had likely experienced mathematics classrooms with
quite different social norms, developed such norms is particularly encouraging. Yackel
and Cobb point out that sociomathematical norms should be examined in teaching modes
that are different from the inquiry mode in which they have been studied, because the
classroom conduct, whether stereotypic or not, will automatically convey what is an
acceptable norm. For example, one can wonder whether some students, young or
experienced, might be more comfortable with a more directed approach. Dweck (1999)
has identified different goal orientations on the part of students. Some students may be
guided primarily by performance goals like grades, parent/teacher approval, high marks,
or status with others, whereas others have what Dweck calls “learning goals”—their
primary interest is in understanding or mastery. A student’s extreme preference for
performance goals may be an unfortunate aspect of schooling or society as it exists, of
course, but it might also result in resistance to sociomathematical norms that do not
obviously support perceived performance. It is an interesting question as to whether
different sorts of teaching might shape a student’s learning goals.
In most of the studies above, outside support for the teacher was particularly
important. The question of what is feasible in classrooms without further teacher
preparation or researcher involvement is crucial, as Yackel and Hanna (2003), for example,
point out. It is nonetheless exciting to envision learners, starting in the primary grades
and continuing through high school and college, developing the social and
sociomathematical norms about proof, and the proof schemes, that one might wish, but it
is daunting to think of the changes needed in curricula and teachers (and testing
programs) to support the development of such norms. As an indication of areas of
teacher preparation that are important, Martino and Maher (1999) suggest, based on their
videotape analysis of their multi-year study, that there is a “strong relationship between
Use of technology.
The increasing use of dynamic geometry software during the last couple of
decades has provoked researchers to look closely into the learning benefits as well as the
potential risks of this tool. In this section we focus on studies conducted to investigate
the impact of dynamic geometry environments (DGEs) on the learning of proof, primarily
because those environments have been involved in several studies.
Generating and measuring many examples, as is now possible and easy with
DGEs, would seem only to support the idea that examples prove a result (the empirical
proof scheme) and hence interfere with any need for deductive proof. Chazan (1993a,b)
noted and corroborated that the ideas that evidence is proof and that proof is merely
evidence were widespread. He interviewed 17 students who were taught geometry in a
DGE (1993b) to see how students fared when they experienced a geometry
curriculum that involved both measurement of cases and deductive proof, with the
curriculum including attention to the different types of argumentation involved with the
two methods. He found both evidence-is-proof (the empirical proof scheme) and proof-
is-merely-evidence postures on the part of the students, and hence he noted that the
“comparison and contrast of verification and deductive proof certainly deserves explicit
attention in mathematics classrooms” (1993b, p. 382), with the teachers involved in the
study feeling that more time should have been spent in dealing with students’ doubts
about deductive proofs (1993a, p. 109). However, he also noticed that “fewer students
considered evidence to be proof…whereas more students were skeptical about the limits
of applicability of deductive proofs…Some students seemed to become more skeptical
about deductive proofs as a result of becoming more skeptical about measurement of
examples” (1993a, p. 109). He also suggested, based on interviewee comments, “that the
explanatory aspect of proofs is a useful starting point for a discussion of the value of
deductive proofs” (1993b, p. 383), that some students have no idea of what deductive
proofs are intended to do, and that some students resist the idea that a deductive argument
can assure that there cannot be counterexamples.
A special issue of the journal Educational Studies in Mathematics (2000, 44)
included four reports on teaching experiments involving DGEs, together with a
synthesizing paper. The four reports are by
Mariotti, by Jones, by Marrades and Gutierrez, and by Hadas, Hershkowitz, and
Schwarz. The last paper is by Laborde, and synthesizes the previous four, providing a
connecting theme among them by using Brousseau’s construct of “milieu.” We continue
with these studies.
One of the significant results of the four studies is that their findings address a
major concern regarding the use of DGEs in the teaching of geometry in school: “the
opportunity offered by [DGE] to ‘see’ mathematical properties so easily might reduce or
even kill any need for proof and thus any learning of how to develop a proof” (Laborde,
2000, p. 151).
Jones’ study was a teaching experiment with 12-year-old students. The
intervention involved students working in pairs or small groups on the classification of
quadrilaterals. The instructional activities involved tasks where the students were to
reproduce a figure that could not be “messed up” by dragging any of its components (a
vertex or segment). More advanced activities included tasks of producing a figure that
can be transformed into another specified figure by dragging (e.g., from a rectangle to a
square). In each case the students had to explain why their constructed figure was the
expected one. Laborde (2000) points out that this type of explanation consists of giving
the conditions that imply that the constructed figure is the expected type of quadrilateral,
a necessary activity to understand how proof works. From our perspective, these
explanations in the context of a DGE involve movement and therefore, as we have
discussed earlier in the section, are causal, and hence deductive. Accordingly,
Jones’ (2000) study suggests that the DGE does not necessarily eliminate the need for
proof in the students’ eyes but can enhance students’ deductive proof scheme. The
developmental path of students’ conception in this study started where students were.
Initially, according to Jones, students lacked the capability to describe or explain in
precise mathematical language. The instructional emphasis in this stage was on
description rather than explanation, where students utilized perception rather than
mathematical language to describe their observations. As the emphasis extended to
explanations, students’ language became more precise but was mediated by the DGE
terminology (e.g., the term “dragging”). By the end of the teaching experiment, students’
explanations related entirely to the mathematical context.
Mariotti’s study was a long-term teaching experiment with 15- to 16-year-old
students. Students were engaged in DGE activities through which the students
themselves constructed a geometric system of axioms and theorems as a system of Cabri
Geometry commands. Similar to Jones, Mariotti emphasized activities where the task is
for pairs of students to construct geometric figures, describe the construction procedure,
and justify why the procedure produces the expected figure. The basic conceptual change
that Mariotti’s (2000) study achieved was in students’ status of justification, which
transitioned from an “intuitive” geometry—a collection of self-evident properties—to a
theoretical geometry—a system of statements validated by proof. The theoretical
geometry that Mariotti’s students had constructed seems to be more than a deductive
system, in that students were not only constructing and proving theorems but also
establishing the axioms on which these theorems rest, and thereby laying the foundations
for their axiomatic proof scheme.
Hadas, Hershkowitz and Schwarz’ (2000) study was done in the context of a
geometry course that emphasized the concept of proof. They developed instructional
activities involving students making assertions about certain geometric relations and later
checking them with a DGE. The choice and sequence of activities were such that upon
checking their assertions with the DGE the students would find them to be false—a
realization that would make the students curious as to the reason for the falsity of the
conjecture. For example, in one of the activities the students began with two tasks. The
first task was to measure (with the software) the sum of the interior angles in polygons as
the number of sides increases, to generalize their observation, and then to explain their
conclusion. The second task was to measure (with the software) the sum of the exterior
angles of a quadrilateral. Following this, the students were asked to hypothesize the sum
of the exterior angles for polygons as the number of sides increases, and to check their
hypothesis by measuring (with the software) and explain what they found. Hadas,
Hershkowitz and Schwarz succeeded in creating in the students a need to find out the
cause for their assertion to be untrue. Laborde (2000) points out that such an
achievement would have been impossible without the use of a dynamic geometry system,
for “the false conjectures came after students were convinced of other properties thanks
to the DG system. … [The] interplay of conjectures and checks, of certainty and
uncertainty, was made possible by the exploration power and checking facilities offered
by the DG environment” (p. 154).
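The arithmetic behind these two tasks can be sketched as follows. The sketch is a standard result for convex polygons, offered here only as an illustration and not taken from the study itself; the symbols $n$, $S_{\text{int}}$, and $S_{\text{ext}}$ are introduced solely for this purpose.

```latex
% Assumption: a convex polygon with n sides (n >= 3), one exterior angle per vertex.
\[
S_{\text{int}}(n) = (n-2)\cdot 180^\circ
\]
% Each exterior angle is the supplement of the interior angle at the same vertex, so
\[
S_{\text{ext}}(n) = n\cdot 180^\circ - S_{\text{int}}(n)
                  = n\cdot 180^\circ - (n-2)\cdot 180^\circ
                  = 360^\circ .
\]
```

For a quadrilateral ($n = 4$) the interior sum is $360^\circ$ and for a pentagon ($n = 5$) it is $540^\circ$, while the exterior sum stays at $360^\circ$ in both cases. A hypothesis that the exterior sum grows with the number of sides, as the interior sum does, would therefore be contradicted by measurement.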
Marrades and Gutierrez investigated how DGEs can help secondary-school
students (aged 15-16 years) enhance their proof schemes. As in Hadas, Hershkowitz and
Schwarz’ study, Marrades and Gutierrez showed that a DGE can help students realize the
need for formal proofs in mathematics. Interpreted in terms of our
taxonomy of proof schemes, an important observation reported in their study is that
students’ transition to deductive proof schemes is very slow; the total teaching
experiment lasted 30 weeks, with two 55-minute classes per week. Of particular
importance is their finding that for this transition to take place instruction must not ignore
students’ current empirical proof schemes and must institute a didactical contract that
attempts to suppress the authoritative proof scheme. Their method was to repeatedly
emphasize “the need to organize justifications by using definitions and results (theorems)
previously known and accepted by the class” (p. 120). Finally, another significant
finding of this study is that the ability to produce deductive proof evolves hand in hand
with students’ understanding of subject matter: the concepts and properties related to the
topic being studied. This is consistent with other findings. Simon and Blume (1996), for
example, illustrated that a learner may not fully understand another’s proof because of a
limited grasp of the concepts addressed in the proof (p. 29). One can argue that exposure
to such a proof might lead to a disequilibrium and eventually a greater understanding of both
the concepts and the proof.
Hence, DGEs are a promising tool, but they do not automatically or easily lead to
improved proof schemes. Accomplishing that sort of growth apparently requires a
carefully laid-out curriculum (cf. de Villiers, 1999) and considerable adjustment by a
teacher accustomed only to telling as the mode of instruction. Lampert (1993), for
example, described some of the difficulties encountered by teachers who allowed
conjecturing with a DGE. (Making conjectures, itself, may be difficult for students new
to the expectation. Koedinger [1998], with an eye toward possible software activities,
noted that with his task of writing a conjecture about kites, given the definition of kite,
about a quarter of the roughly 60 geometry students could not come up with a non-trivial
conjecture within 20 minutes.) Lampert noted that the change from “sage on the stage”
to “guide on the side” required adjustments both for the teacher and the students, since a
different sort of didactical contract was involved. Teachers were also concerned about
the coverage of a standard body of content (for external testing purposes, or for later
courses), as well as the departure from the usual, familiar axiomatic development that
often eventuated.
In this chapter we have presented a comprehensive perspective on proof
learning—a perspective that addresses mathematical, historical-epistemological,
cognitive, sociological, and instructional factors. Comprehensive perspectives on proof
are needed, we argued, in order to better understand the nature and roots of students’
difficulties with proof so that effective instructional treatments can be designed and
implemented to advance students’ conceptions of and attitudes toward proof. Our
perspective grew out of a decade of investigations—empirical as well as theoretical—into
students’ conceptions of proof. In various periods and stages of these investigations we
have repeatedly confronted questions that collectively address a combination of the five
factors mentioned above.
The notion of “proof scheme” serves as the main lens of our comprehensive
perspective on proof. Through it, for example, we analyze and interpret students’
proving behavior—in their individual work as well as in their interaction with others—
and understand the development of proof in the history of mathematics. A proof scheme
consists of what constitutes ascertaining and persuading for a person (or community).
This definition was born out of cognitive, epistemological, and instructional
considerations. Specifically, a critical observation in our and other scholars’ work is that
proof schemes vary from person to person and from community to community—in the
classroom, with individual students as well as the class as a whole, and throughout
history. “Proof,” when viewed in this subjective sense, highlights the student as learner.
As a result, teachers must take into account what constitutes ascertainment and
persuasion for their students and offer, accordingly, instructional activities that can help
them gradually refine and modify their proof schemes into desirable ones. This
subjective view of proof emerged from our studies and impacted many of the conclusions
we drew from them. For example, it influenced our conclusions as to the implications of
the epistemology of proof in the history of mathematics for the conceptual development of
proof with students, the implications of the way mathematicians construct proofs for
instructional treatments of proof, and the implications of everyday justification and
argumentation for students’ proving behaviors in mathematical contexts. The subjective
notion of proof scheme is not in conflict with our insistence on unambiguous goals in the
teaching of proof—namely, to gradually help students develop an understanding of proof
that is consistent with that shared and practiced by the mathematicians of today. The
question of critical importance is: What instructional interventions can bring students to
see an intellectual need to refine and alter their current proof schemes into deductive
proof schemes (Harel, 2001)?
The status studies we have reviewed and presented in this paper show the absence
of the deductive proof scheme and the pervasiveness of the empirical proof scheme
among students at all levels. Students base their responses on the appearances in
drawings, and mental pictures alone constitute the meaning of geometric terms. They
prove mathematical statements by providing specific examples and are not able to distinguish
between inductive and deductive arguments. Even more able students may not
understand that no further examples are needed, once a proof has been given. Students’
preference for proof is ritualistically and authoritatively based. For example, when the
stated purpose was to get the best mark, they often felt that more formal—e.g.,
algebraic—arguments might be preferable to their first choices. These studies also show
a lack of understanding of the functions of proof in mathematics, often even among
students who had taken geometry and among students for whom the curriculum pays
special attention to conjecturing and explaining or justifying conclusions in both algebra
and geometry. Students believe proofs are used only to verify facts that they already
know, and have no sense of a purpose of proof or of its meaning. Students have
difficulty understanding the role of counterexamples; many do not understand that one
counterexample is sufficient to disprove a conjecture. Students do not see any need to
prove a mathematical proposition, especially those they consider to be intuitively
obvious. This is the case even in a country like Japan where the official curriculum
emphasizes proof. They view proof as the method to examine and verify a later
particular case. Finally, the studies show that students have difficulty writing valid
simple proofs and constructing, or even starting, simple proofs. They have difficulty with
indirect proofs, and only a few can complete an indirect proof that has been started.
We believe there is a need for more longitudinal studies regarding students’ proof
schemes. The most difficult studies to carry out, for both financial and design reasons, are
longitudinal studies. Yet we cannot gain a solid understanding of the effect on students
of continued attention to justification and proof throughout their studies in mathematics,
except through longitudinal studies. A one- or two-year exposure (let alone a one-
semester treatment) to instruction and curricula attentive to reason-giving can be dwarfed
by a multiple-year focus on instruction and curricula that, to use an extreme example,
emphasize rote skills likely to be useful in some external testing program. In the latter
case, being asked to give reasons and arguments might well be viewed as aberrations
and irrelevant to the perceived “really important” side of mathematics.
The findings from studies of teachers’ conceptions of proof do not look much
better than those with students. Overall, teachers seem to acknowledge the verification
role of proof, yet for many the empirical proof schemes seem to be the most dominant,
even in dealing with mathematical statements, and they do not seem to understand other
important roles of proof, most noticeably its explanatory role. Some teachers tend to
view proof as an appropriate goal only for the mathematics education of a minority of
students, not considering proof and justification to be a central concern in school
mathematics, as has been repeatedly called for by the mathematics education leadership
(e.g., NCTM, 1989, 2000). Studies show that little or no instructional time is allocated to
the development of deductive proof schemes, not even in geometry. In the U.S.,
explicit mathematical reasoning in mathematics classes is rare, and in algebra and
pre-algebra courses it is virtually absent. Many teachers are unlikely to teach proof well,
since their own grasp of proof is limited. It is important to determine better the extent to
which teachers are equipped to deliver a curriculum in which proof is central. Results
from studies like those of Knuth (2002a) and Manaster (1998), if indeed typical of a
widespread performance of mathematics teachers, demand attention both on the part of
university mathematics departments, which have a primary responsibility for the
preparation of mathematics teachers, and on the part of school districts, which support the continued
development of their existing faculty.
The bright side of the findings is that students who receive more instructional
time on developing analytical reasoning by solving unique problems fare noticeably
better on overall test scores. Likewise, students who have been expected to write proofs
and who have had classes that emphasized proof performed somewhat better than other
students. It also seems possible to establish desirable sociomathematical norms relevant
to proof, through careful instruction, often featuring the student role in proof-giving.
There has been a concern that the ease with which technology can generate a large
number of examples naturally could undercut any student-felt need for deductive proof
schemes. Fortunately, several studies have shown that with careful, non-trivial planning
and instruction over a period of time, progress toward deductive proof schemes is
possible in technology environments, where such desiderata as making conjectures and
definitions occur.
An important element in deductive proof schemes is of course the use of logical
reasoning. Yet there is evidence that many students, and possibly even many teachers, do
not have a good grasp or appreciation of some important principles of logic. Nor is it
clear how best to devise instruction to improve performance with logic in
mathematical (and non-mathematical) contexts. It is unclear to us how best to prepare
students to deal with the logical reasoning essential in mathematical proofs and valuable
in even informal justifications—osmosis, or explicit attention? And, in particular, the
knowledge of teachers of mathematics about logical reasoning may be a matter of
concern (e.g., Easterday & Galloway, 1995). We see a need for the incorporation of
items on proof or logic (even multiple-choice ones) into the periodic National
Assessments of Educational Progress. From the practical viewpoint, the NAEPs exist, and
they offer a view of performance across the U.S. Even more pleasing would be to see
large-scale efforts devoted explicitly to the study of performance in proof and logic, like
those in Great Britain (Healy & Hoyles, 1998, 2000; Hoyles & Kuchemann, 2002). A
deep look at students’ and teachers’ knowledge of proof, on the one hand, and at the
development of the deductive proof scheme in the history of mathematics, on the other,
has provided us with important insights as to what might account for students’ difficulties
in constructing this scheme and what instructional approaches can facilitate its
construction. In particular, considerations of historical-epistemological developments
have led us to new research questions with direct bearing on the learning and teaching of
proofs. For example:
1. To what extent and in what ways is the nature of the content intertwined with
the nature of proving? In geometry, for example, does students’ ability to
construct an image of a point as a dimensionless geometric entity impact their
ability to develop the Greek axiomatic proof scheme?
2. What are the cognitive and social mechanisms by which deductive proving
can be necessitated for the students? The Greeks’ construction of their
geometric edifice seems to have been a result of their desire to create a
consistent system that was free from paradoxes. Would paradoxes of the
same nature create a similar intellectual need with students?
3. Students encounter difficulties in moving between proof schemes, particularly
from the Greek axiomatic proof scheme (the one they construct in honors
high-school geometry, for example) to the modern axiomatic proof scheme
(the one they need to succeed in a real analysis course, for example). Exactly
what are these difficulties? What role does the emphasis on form rather than
content in modern mathematics (as opposed to Greek mathematics where
content is more prominent) play in this transition? Can students develop the
modern axiomatic proof scheme without computational fluency? What role does the
causality proof scheme play in this transition?
These questions are examples of what we have delineated as important
contributions from the history of mathematics to our thinking about students’ proof
schemes. But there are likely to be other valuable ideas from further study of the growth
or development of proof ideas in the history of mathematics. How mathematical proof
arose in other cultures—e.g., the basis for the Japanese temple drawings (Rothman &
Fukagawa, 1998)—would in itself be fascinating and potentially instructive about how
proof ideas might be introduced or developed today. In this respect, our effort to form a
comprehensive perspective on proof is an attempt to understand what might be called the
“proving conceptual field,” a term analogous to Vergnaud’s (1983, 1988) “multiplicative
conceptual field.” Like the multiplicative conceptual field, the proving conceptual field
may be thought of as a set of problems and situations for which closely connected
concepts, procedures, and representations are necessary.
We wish to acknowledge the helpful comments from Gila Hanna, Carolyn Maher, and
Erna Yackel. Preparation of this chapter was supported in part by National Science
Foundation Grant No. REC-0310128. Any opinions or conclusions expressed are those
of the authors and do not represent an official position of NSF.
To assure a degree of quality control and for practical reasons, we have restricted our
survey to writings that have undergone an external review process, except in the case of a
few doctoral dissertations that we examined. Examination of at least one case—the
teaching and learning of mathematical induction—was omitted here since it has been
treated thoroughly elsewhere (Harel, 2001). We regret that no doubt we have
unintentionally overlooked other pieces of research or commentary that could have
improved this chapter.
All aspects of proof addressed in this paper must always be understood in the context of
the learning and teaching of proof. Even when we address mathematical, historical, or
philosophical aspects of proof, the goal is to utilize knowledge of these aspects for the
purpose of better understanding the processes of learning and teaching of proof. Thus,
phrases such as “research on proof,” “perspective on proof,” and the like should always
be understood in the context of mathematics education, not of mathematics, history, or
philosophy per se.
We are restricting our discussion to classical Greek mathematics and the mathematical
developments that grew out of it. The development of deductive reasoning in China,
India, and other non-Western cultures is not considered in this paper.