CSC2130: Empirical Research Methods For Software Engineering
CSC2130:
Empirical Research Methods for
Software Engineering
Steve Easterbrook
sme@cs.toronto.edu
www.cs.toronto.edu/~sme/CSC2130/
© 2004-5 Steve Easterbrook. This presentation is available free for non-commercial use with attribution under a creative commons license.
Course Goals
Prepare students for advanced research (in SE):
Learn how to plan, conduct and report on empirical investigations.
Understand the key steps of a research project:
formulating research questions,
theory building,
data analysis (using both qualitative and quantitative methods),
building evidence,
assessing validity,
publishing.
University of Toronto Department of Computer Science
Intended Audience
This is an advanced software engineering course:
assumes a strong grasp of the key ideas of software engineering and the
common methods used in software practice.
Focus:
how do software developers work?
how do new tools and techniques affect their ability to construct high
quality software efficiently?
qualitative and quantitative techniques from behavioural sciences
© 2012 Steve Easterbrook. This presentation is available free for non-commercial use with attribution under a creative commons license.
Format
Seminars:
1 three-hour seminar per week
Mix of discussion, lecture, student presentations
Readings
Major component is discussion of weekly readings
Please read the set papers before the seminar
Assessment:
10% Class Participation
30% Oral Presentation - critique a published empirical study
60% Written paper - design an empirical study for an SE research question
Course Outline
1. Introduction & Orientation
2. What is Science?
Philosophy of Science
Sociology of Science
Meta-theories
Scientific Method
No single official scientific method
Somehow, scientists are supposed to do this:
[Figure: Theory and World, connected by Observation and Validation]
Observe!
Scientific Inquiry
Prior Knowledge (Initial Hypothesis)
Observe (what is wrong with the current theory?)
Theorize (refine/create a better theory)
Design (design empirical tests of the theory)
Experiment (manipulate the variables)
Creativity is important
Theories, hypotheses, experimental designs
Search for elegance, simplicity
E.g. Surveys
Self-selection of respondents biases the study
Respondents tell you what they think they ought to do, not what they
actually do
…etc...
Empirical Induction
Series of studies over time…
Each designed to probe more aspects of the theory
…together build evidence for a clearly stated theory
Source: http://xkcd.com/882/
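The linked comic is usually read as a warning about running many significance tests and reporting only the one that "works". A minimal sketch of the arithmetic behind that warning, assuming 20 independent tests at a significance level of 0.05 (both numbers, and the function names, are chosen here for illustration):

```python
def family_wise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across n_tests
    independent tests, each run at significance level alpha."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests


# Assumed values for illustration only.
alpha, n_tests = 0.05, 20
fwer = family_wise_error_rate(alpha, n_tests)
print(f"Chance of at least one spurious 'significant' result: {fwer:.2f}")
print(f"Bonferroni-corrected per-test threshold: {bonferroni_alpha(alpha, n_tests):.4f}")
```

With these assumed numbers, a spurious result somewhere in the series is more likely than not, which is one reason a series of studies should test a clearly stated theory rather than search for any effect.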
Advisor:
Prof. Helen Back
Topic:
Merging Stakeholder Views in Model-Driven Development
Status:
2 years into his PhD
Has built a tool
Needs an evaluation plan
Results
H1 accepted (strong evidence)
H2 & H3 rejected
Subjects found the tool unintuitive
Threats to Validity
Construct Validity
What do we mean by a merge? What is correctness?
5-point scale for subjective assessment - insufficient discriminatory power
(both tools scored very low)
Internal Validity
Confounding variables: Time taken to learn the tool; familiarity
Subjects were all familiar with RA, not with Stu-merge
External Validity
Task representativeness
class models were of a toy problem
Subject representativeness
Grad students as sample of what population?
Theoretical Reliability
Researcher bias
subjects knew Stu-merge was Stu's own tool
Why would we expect it to be better?
Some Definitions
A model is an abstract representation of a
phenomenon or set of related phenomena
Some details included, others excluded
A simpler definition
Stu's Theory
Background Assumptions
Large team projects, models contributed by many actors
Models are fragmentary, capture partial views
Partial views are inconsistent and incomplete most of the time
Basic Theory
(Brief summary:)
Model merging is an exploratory process, in which the aim is to discover
intended relationships between views. Goodness of a merge is a subjective
judgment. If an attempted merge doesn't seem "good", we may need to
change either the models or the way in which they were mapped together.
[Still needs some work]
Derived Hypotheses
Useful merge tools need to represent relationships explicitly
Useful merge tools need to be complete (work for any models, even if
inconsistent)
Descriptive-Process
How does X normally work?
By what process does X happen?
What are the steps as X evolves?
Design
What is an effective way to achieve X?
How can we improve X?
Exploratory
Description & Classification
What is X like?
What are its properties?
How can it be categorized?
How can we measure it?
What are its components?
Descriptive-Comparative
How does X differ from Y?
Base-rate
Frequency and Distribution
How often does X occur?
What is an average amount of X?
Descriptive-Process
How does X normally work?
By what process does X happen?
What are the steps as X evolves?
Relationship
Are X and Y related?
Do occurrences of X correlate with occurrences of Y?
Causality
Does X cause Y?
Does X prevent Y?
What causes X?
What effect does X have on Y?
Causality-Comparative
Does X cause more Y than does Z?
Is X better at preventing Y than is Z?
Does X cause more Y than does Z under one condition but not others?
Design
What is an effective way to achieve X?
How can we improve X?
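To make the base-rate and relationship question types concrete, here is a minimal sketch that answers "What is an average amount of X?" and "Do occurrences of X correlate with occurrences of Y?" for a small data set. The variables (team size, merge conflicts per week) and all values are hypothetical, invented for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical observations from five projects (not real data).
team_size = [2, 4, 6, 8, 10]        # X
merge_conflicts = [1, 3, 2, 5, 6]   # Y, conflicts per week

# Base-rate question: what is an average amount of Y?
average_conflicts = sum(merge_conflicts) / len(merge_conflicts)

# Relationship question: do X and Y vary together?
r = pearson_r(team_size, merge_conflicts)

print(f"average conflicts per week: {average_conflicts:.1f}")
print(f"correlation between team size and conflicts: {r:.2f}")
```

A high correlation here would answer only the relationship question. Whether larger teams cause more conflicts is a causality question and needs a different study design.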
Description/Classification
What are the different types of model merging that occur in practice on large scale systems?
Descriptive-Comparative
Causality
Causality-Comparative
[Figure: what shapes the research question and the methodological choices]
Critical theory, Pragmatist: What will you accept as valid truth?
Existing Theories: How does this relate to the established literature?
New Paradigms: What new perspectives are you bringing to this field?
The Research Question
Methodological Choices: Empirical Method, Data Collection Techniques, Data Analysis Techniques
Description/Classification
What are the different types of model merging that occur in practice on large scale systems?
Descriptive-Comparative
How does model merging with explicit representation of relationships differ from model merging without such representation?
Causality
Does an explicit representation of the relationship between models cause developers to explore different ways of merging models?
Causality-Comparative
Does the algebraic representation of relationships in Stu's tool lead developers to explore more than do pointcuts in AOM?
Candidate methods (?): Survey Research, Ethnography, Action Research, Controlled Experiment
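If the Causality-Comparative question above were studied with a controlled experiment, one possible analysis is a permutation test on the difference between the two groups. The sketch below is only an illustration under stated assumptions: the outcome measure (number of distinct merges each participant explored), the group sizes, and every data value are hypothetical, and the slides do not prescribe this test.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns an estimated p-value: the share of random relabellings whose
    mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(group_a)]) - mean(pooled[len(group_a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical counts of distinct merges explored per participant.
stu_merge_group = [5, 7, 6, 8, 7, 9]
aom_pointcut_group = [3, 4, 2, 5, 4, 3]

p_value = permutation_test(stu_merge_group, aom_pointcut_group)
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value would speak only to statistical conclusion validity; it would not address construct, internal, or external validity threats to such a study.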
Warning
No method is perfect
Okay, but…