CSC2130: Empirical Research Methods For Software Engineering

University of Toronto Department of Computer Science
CSC2130:
Empirical Research Methods for
Software Engineering
Steve Easterbrook
sme@cs.toronto.edu
www.cs.toronto.edu/~sme/CSC2130/
© 2004-5 Steve Easterbrook. This presentation is available free for non-commercial use with attribution under a creative commons license. 1
Course Goals
  Prepare students for advanced research (in SE):
 Learn how to plan, conduct and report on empirical investigations.
 Understand the key steps of a research project:
 formulating research questions,
  theory building,
  data analysis (using both qualitative and quantitative methods),
  building evidence,
  assessing validity,
  publishing.
  Motivate the need for an empirical basis for SE

  Cover all principal empirical methods applicable to SE:
 controlled experiments, case studies, surveys, archival analysis, action research, ethnographies, …
  Relate these methods to relevant meta-theories in the philosophy and sociology of science.
© 2012 Steve Easterbrook. This presentation is available free for non-commercial use with attribution under a creative commons license. 2

Intended Audience
  This is an advanced software engineering course:
 assumes a strong grasp of the key ideas of software engineering and the
common methods used in software practice.
  Focus:
 how do software developers work?
 how do new tools and techniques affect their ability to construct high
quality software efficiently?
 qualitative and quantitative techniques from behavioural sciences
  The course is aimed at students who:

 …plan to conduct SE research that demands some empirical validation
 …wish to establish an empirical basis for an existing SE research programme
 …wish to apply these techniques in related fields (e.g. HCI, Cog Sci)
  Note: we will *not* cover the kinds of experimental techniques used in CS systems areas.
Format
  Seminars:
 1 three-hour seminar per week
 Mix of discussion, lecture, student presentations
  Readings
 Major component is discussion of weekly readings
 Please read the set papers before the seminar
  Assessment:
 10% Class Participation
 30% Oral Presentation - *critique a published empirical study
 60% Written paper - design an empirical study for a SE research question
*As part of a mock conference program committee meeting

Course Outline
1.  Introduction & Orientation
2.  What is Science?
  Philosophy of Science
  Sociology of Science
  Meta-theories
3.  What is software engineering?

  Engineering & Design
  Disciplinary Analogies for SE
  Evidence-based software engineering
4.  Basics of Doing Research

  Finding good research questions
  Theory building
  Research Design
  Ethics
  Evidence and Measurement
  Peer Review Process
Course Outline (cont)

5.  Experiments
  Controlled Experiments
  Quasi-experiments
  Sampling
  Replication
6.  Case Studies
  Single and Multi-case
  Longitudinal Case Studies
  Approaches to Data Collection
7.  Histories and Simulations
  Artifact Analysis
  Archival Analysis and Post-mortems
  Simulation Techniques
8.  Survey and Observation
  Surveys
  Focus Groups
  Field Studies / Ethnographies
9.  Interventions
  Action Research
  Pilot Studies
  Benchmarking
10.  Qualitative Analysis
  Grounded Theory
  Phenomenography
  Mixed Methods Research
11.  Quantitative Analysis
  Stats, power analysis, meta-analysis
12.  Publishing and Reviewing
  (mock PC meeting)
13.  Replication and Beyond
  Internal and External Replication
  Biases and Influences
  Threats to Validity
  When to use empirical methods

Is this your research plan?

Step 1: Build a new tool
Step 2: ??
Step 3: Profit
Scientific Method
  No single official scientific method
  Somehow, scientists are supposed to do this:
(Diagram: Theory and World, linked by Observation and Validation)

Observe!
Scientific Inquiry
Prior Knowledge (Initial Hypothesis)
Observe (what is wrong with the current theory?)
Theorize (refine/create a better theory)
Design (design empirical tests of the theory)
Experiment (manipulate the variables)
(cycle: Experiment leads back to Observe)

Some Characteristics of Science

  Science seeks to improve our understanding of the
world.
  Explanations are based on observations
 Scientific truths must stand up to empirical scrutiny
 Sometimes scientific truth must be thrown out in the face of new findings
  Theory and observation affect one another:

 Our perceptions of the world affect how we understand it
 Our understanding of the world affects how we perceive it
  Creativity is important
 Theories, hypotheses, experimental designs
 Search for elegance, simplicity
All Methods are flawed

  E.g. Laboratory Experiments
 Cannot study large scale software development in the lab!
 Too many variables to control them all!
  E.g. Case Studies

 How do we know what's true in one project generalizes to others?
 Researcher chose what questions to ask, hence biased the study
  E.g. Surveys
 Self-selection of respondents biases the study
 Respondents tell you what they think they ought to do, not what they
actually do
  …etc...
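The survey flaw listed above (self-selection of respondents) can be illustrated with a small simulation. This is only a sketch under invented assumptions: the 1 to 5 satisfaction score, the population size, and the response model (happier developers are more likely to answer) are all hypothetical and do not come from any real study.

```python
import random
import statistics

def simulate_survey(n_developers=10_000, seed=0):
    """Return (population mean, respondent mean) of a hypothetical 1-5 score."""
    rng = random.Random(seed)
    # Hypothetical population: scores uniformly distributed over 1..5.
    population = [rng.randint(1, 5) for _ in range(n_developers)]
    # Assumed response model: the chance of answering grows with the score,
    # from 10% (score 1) up to 50% (score 5).
    respondents = [s for s in population if rng.random() < 0.1 * s]
    return statistics.mean(population), statistics.mean(respondents)

true_mean, survey_mean = simulate_survey()
print(f"population mean: {true_mean:.2f}, respondent mean: {survey_mean:.2f}")
```

Under this assumed response model the respondent mean is expected to sit near 55/15 ≈ 3.67 while the population mean is near 3.0, so the survey overstates satisfaction even though every individual answer is honest.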

Strategies to overcome weaknesses

  Theory-building
 Testing a hypothesis is pointless (single flawed study!)…
 …unless it builds evidence for a clearly stated theory
  Empirical Induction
 Series of studies over time…
 Each designed to probe more aspects of the theory
 …together build evidence for a clearly stated theory
  Mixed Methods Research

 Use multiple methods to investigate the same research question
 Each method compensates for the flaws of the others
 …together build evidence for a clearly stated theory
Source: http://xkcd.com/882/
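One way to see why a single significant result is weak evidence: if many independent hypotheses are tested at the same significance level and none of the effects is real, the chance that at least one test comes out "significant" grows quickly. The sketch below assumes independent tests and a conventional alpha of 0.05; the function name is made up for illustration.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive across independent tests
    when every null hypothesis is actually true."""
    return 1.0 - (1.0 - alpha) ** num_tests

for k in (1, 5, 20):
    print(f"{k:2d} tests -> {familywise_error_rate(k):.2f}")
# prints:
#  1 tests -> 0.05
#  5 tests -> 0.23
# 20 tests -> 0.64
```

With 20 tests, a spurious "finding" is more likely than not, which is why a series of studies tied to a clearly stated theory carries more weight than one isolated hypothesis test.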

What is a research contribution?

  A better understanding of how software engineers work?
  Identification of problems with the current state-of-the-art?
  A characterization of the properties of new tools/techniques?
  Evidence that approach A is better than approach B?
How will you validate your claims?
Meet Stuart Dent

  Name:
 Stuart Dent (a.k.a. "Stu")
  Advisor:
 Prof. Helen Back
  Topic:
 Merging Stakeholder views in Model
Driven Development
  Status:
 2 years into his PhD
 Has built a tool
 Needs an evaluation plan

Stu's Evaluation Plan

  Formal Experiment
 Independent Variable: Stu-Merge vs. Rational Architect
 Dependent Variables: Correctness, Speed, Subjective Assessment
 Task: Merging Class Diagrams from two different stakeholders' models
 Subjects: Grad Students in SE
 H1: Stu-Merge produces correct merges more often than RA
 H2: Subjects produce merges faster with Stu-Merge than with RA
 H3: Subjects prefer using Stu-Merge to RA
  Results
 H1 accepted (strong evidence)
 H2 & H3 rejected
 Subjects found the tool unintuitive
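As a rough illustration of how a hypothesis like H1 might be tested, the sketch below runs a one-sided Fisher exact test on a 2x2 table of correct and incorrect merges. The counts (9 of 10 correct with Stu-Merge, 3 of 10 with RA) are invented for the example and are not Stu's data; the slides do not say which test he used.

```python
from math import comb

def fisher_one_sided(a_success, a_total, b_success, b_total):
    """One-sided Fisher exact p-value for 'group A succeeds more often than B'.

    Sums the hypergeometric probabilities of seeing a_success or more
    successes in group A, given the fixed row and column totals.
    """
    total_success = a_success + b_success
    n = a_total + b_total
    max_k = min(a_total, total_success)
    tail = sum(
        comb(total_success, k) * comb(n - total_success, a_total - k)
        for k in range(a_success, max_k + 1)
    )
    return tail / comb(n, a_total)

# Hypothetical counts, for illustration only.
p_value = fisher_one_sided(a_success=9, a_total=10, b_success=3, b_total=10)
print(f"one-sided p = {p_value:.4f}")
```

A small p-value here says only that the difference in these counts is unlikely under chance. It says nothing about whether "correctness" was measured sensibly or whether the subjects and task were representative.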
Threats to Validity
  Construct Validity
 What do we mean by a merge? What is correctness?
 5-point scale for subjective assessment - insufficient discriminatory power
  (both tools scored very low)
  Internal Validity
 Confounding variables: Time taken to learn the tool; familiarity
  Subjects were all familiar with RA, not with Stu-Merge
  External Validity
 Task representativeness
  class models were of a toy problem
 Subject representativeness
  Grad students as sample of what population?
  Theoretical Reliability
 Researcher bias
  subjects knew Stu-Merge was Stu's own tool

What went wrong?

  What was the research question?
  Is tool A better than tool B?
  What would count as an answer?

  What use would the answer be?
 How is it a "contribution to knowledge"?
  How does this evaluation relate to the existing literature?
Experiments as Clinical Trials
Is drug A better than drug B?
 Why would we expect it to be better?
 Why do we need to know?
 What will we do with the answer?
 Better at doing what?
 Better in what situations?
 Better in what way?

Why would we
expect it to
be better?
You gotta have a theory!
Some Definitions
  A model is an abstract representation of a phenomenon or set of related phenomena
 Some details included, others excluded
  A theory is a set of statements that explain a set of phenomena
 Serves to explain and predict
 Precisely defined terminology
 Concepts, relationships, causal inferences
 (operational definitions for theoretical terms)
  A hypothesis is a testable statement derived from a theory
 A hypothesis is not a theory!
  In SE, we have mostly "folk theories"

A simpler definition
A Theory is the best explanation of all the available evidence
The Role of Theory Building

  Theories lie at the heart of what it means to do
science.
 Production of generalizable knowledge
  Theory provides orientation for data collection

 Cannot observe the world without a theoretical perspective
  Theories allow us to compare similar work

 Theories include precise definitions for the key terms
 Theories provide a rationale for which phenomena to measure
  Theories support analytical generalization

 Provide a deeper understanding of our empirical results
 …and hence how they apply more generally
 Much more powerful than statistical generalization

Stu's Theory
  Background Assumptions
 Large team projects, models contributed by many actors
 Models are fragmentary, capture partial views
 Partial views are inconsistent and incomplete most of the time
  Basic Theory
 (Brief summary:)
 Model merging is an exploratory process, in which the aim is to discover intended relationships between views. Goodness of a merge is a subjective judgment. If an attempted merge doesn't seem "good", may need to change either the models, or the way in which they were mapped together.
 [Still needs some work]
  Derived Hypotheses
 Useful merge tools need to represent relationships explicitly
 Useful merge tools need to be complete (work for any models, even if
inconsistent)
What type of question are you asking?

  Existence
 Does X exist?
  Description & Classification
 What is X like?
 What are its properties?
 How can it be categorized?
 How can we measure it?
 What are its components?
  Descriptive-Comparative
 How does X differ from Y?
  Frequency and Distribution
 How often does X occur?
 What is an average amount of X?
  Descriptive-Process
 How does X normally work?
 By what process does X happen?
 What are the steps as X evolves?
  Relationship
 Are X and Y related?
 Do occurrences of X correlate with occurrences of Y?
  Causality
 Does X cause Y?
 Does X prevent Y?
 What causes X?
 What effect does X have on Y?
  Causality-Comparative
 Does X cause more Y than does Z?
 Is X better at preventing Y than is Z?
 Does X cause more Y than does Z under one condition but not others?
  Design
 What is an effective way to achieve X?
 How can we improve X?

What type of question are you asking?

  Exploratory
 Existence
 Description & Classification
 Descriptive-Comparative
  Base rate
 Frequency and Distribution
 Descriptive-Process
  Correlation
 Relationship
  Causal Relationship
 Causality
 Causality-Comparative
  Design
 Design
Stu's Research Question(s)

  Existence
 Does model merging ever happen in practice?
  Description/Classification
 What are the different types of model merging that occur in practice on large scale systems?
  Descriptive-Comparative
 How does model merging with explicit representation of relationships differ from model merging without such representation?
  Causality
 Does an explicit representation of the relationship between models cause developers to explore different ways of merging models?
  Causality-Comparative
 Does the algebraic representation of relationships in Stu's tool lead developers to explore more than do pointcuts in AOM?
Pick just one for now…

Putting the Question in Context

The Research Question
  Philosophical Context: Positivist, Constructivist, Critical theory, Pragmatist
 What will you accept as valid truth?
  Existing Theories
 How does this relate to the established literature?
  New Paradigms
 What new perspectives are you bringing to this field?
  Methodological Choices: Empirical Method, Data Collection Techniques, Data Analysis Techniques
 What methods are appropriate for answering this question?

Many available methods…

  Common "in the lab" Methods
 Controlled Experiments
 Rational Reconstructions
 Exemplars
 Benchmarks
 Simulations
  Common "in the wild" Methods
 Quasi-Experiments
 Case Studies
 Survey Research
 Ethnographies
 Action Research
 Artifact/Archive Analysis ("mining"!)

Stu's Method(s) Selection…

  Existence
 Does model merging ever happen in practice?
  Description/Classification
 What are the different types of model merging that occur in practice on large scale systems?
  Descriptive-Comparative
 How does model merging with explicit representation of relationships differ from model merging without such representation?
  Causality
 Does an explicit representation of the relationship between models cause developers to explore different ways of merging models?
  Causality-Comparative
 Does the algebraic representation of relationships in Stu's tool lead developers to explore more than do pointcuts in AOM?
Candidate methods: Case study? Survey Research? Ethnography? Action Research? Controlled Experiment?

Warning
No method is perfect

Don't get hung up on methodological purity

Pick something and get on with it

Some knowledge is better than none


Okay, but…
Why Build a Tool?

  Build a Tool to Test a Theory
 Tool is part of the experimental materials needed to conduct your study
  Build a Tool to Develop a Theory

 Theory emerges as you explore the tool
  Build a Tool to Explain your Theory

 Tool as a concrete instantiation of (some aspect of) the theory