Untitled

The document provides an overview of statistical analysis methods, focusing on their application in business and economics. It covers key concepts such as descriptive and inferential statistics, types of data, data collection methods, and the role of statistics in decision-making. Additionally, it outlines the fundamental elements of statistics, including populations, samples, and variables, as well as common statistical processes and potential biases in sampling.

Uploaded by

Manal Amir
Copyright
© © All Rights Reserved

Statistical analysis methods

Dr. Mohamed Nada
4/15/2025
Statistics for Business and Economics

Chapter 1
Statistics, Data, &
Statistical Thinking
Contents
1. The Science of Statistics
2. Types of Statistical Applications in Business
3. Fundamental Elements of Statistics
4. Processes
5. Types of Data
6. Collecting Data
7. The Role of Statistics in Managerial Decision Making
Learning Objectives
1. Introduce the field of statistics
2. Demonstrate how statistics applies to business
3. Establish the link between statistics and data
4. Identify the different types of data and data-collection methods
5. Differentiate between population and sample data
6. Differentiate between descriptive and inferential statistics
1.1

The Science of Statistics


What Is Statistics?

1. Collecting Data
e.g., Survey
2. Presenting Data
e.g., Charts & Tables
3. Characterizing Data
e.g., Average

Why? Data Analysis → Decision-Making

What Is Statistics?

Statistics is the science of data. It involves collecting, classifying, summarizing, organizing, analyzing, and interpreting numerical information.
1.2

Types of Statistical Applications in Business
Application Areas

• Economics
– Forecasting
– Demographics
• Engineering
– Construction
– Materials
• Sports
– Individual & Team Performance
• Business
– Consumer Preferences
– Financial Trends
Statistics: Two Processes

Describing sets of data

and

Drawing conclusions (making estimates, decisions, predictions, etc.) about sets of data based on sampling
Statistical Methods
• Descriptive Statistics
• Inferential Statistics
Descriptive Statistics
1. Involves
• Collecting Data
• Presenting Data
• Characterizing Data
2. Purpose
• Describe Data

[Figure: bar chart of dollar values by quarter, Q1 to Q4, with $\bar{X} = 30.5$ and $S^2 = 113$]
Inferential Statistics
1. Involves
• Estimation
• Hypothesis Testing
2. Purpose
• Make decisions about population characteristics
1.3

Fundamental Elements
of Statistics
Fundamental Elements
1. Experimental unit
• Object upon which we collect data
2. Population
• All items of interest
3. Variable
• Characteristic of an individual experimental unit
4. Sample
• Subset of the units of a population

• P in Population & Parameter
• S in Sample & Statistic
Fundamental Elements
1. Statistical Inference
• Estimate or prediction or generalization about a
population based on information contained in a
sample
2. Measure of Reliability
• Statement (usually qualified) about the degree
of uncertainty associated with a statistical
inference
Four Elements of Descriptive
Statistical Problems
1. The population or sample of interest
2. One or more variables (characteristics of the
population or sample units) that are to be
investigated
3. Tables, graphs, or numerical summary tools
4. Identification of patterns in the data
Five Elements of Inferential
Statistical Problems
1. The population of interest
2. One or more variables (characteristics of the
population units) that are to be investigated
3. The sample of population units
4. The inference about the population based on
information contained in the sample
5. A measure of reliability for the inference
1.4

Processes
Process
A process is a series of actions or operations that
transforms inputs to outputs. A process produces or
generates output over time.
Process
A process whose operations or actions are unknown or
unspecified is called a black box.

Any set of output (object or numbers) produced by a


process is called a sample.
1.5

Types of Data
Types of Data

Quantitative data are measurements that are recorded on a naturally occurring numerical scale.

Qualitative data are measurements that cannot be measured on a natural numerical scale; they can only be classified into one of a group of categories.
Types of Data
• Quantitative Data
• Qualitative Data
Quantitative Data
Measured on a numeric scale.
• Number of defective items in a lot.
• Salaries of CEOs of oil companies.
• Ages of employees at a company.
Qualitative Data
Classified into categories.
• College major of each student in a class.
• Gender of each employee at a company.
• Method of payment (cash, check, credit card).
1.6

Collecting Data
Obtaining Data

1. Data from a published source
2. Data from a designed experiment
3. Data from a survey
4. Data collected observationally
Obtaining Data
Published source:
book, journal, newspaper, Web site
Designed experiment:
researcher exerts strict control over units
Survey:
a group of people are surveyed and their
responses are recorded
Observational study:
units are observed in natural setting and
variables of interest are recorded
Samples
A representative sample exhibits characteristics
typical of those possessed by the population of
interest.

A random sample of n experimental units is a sample selected from the population in such a way that every different sample of size n has an equal chance of selection.
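
The definition above can be sketched in a few lines of Python. This is only an illustration: the population of 20 numbered units, the sample size of 5, and the seed are hypothetical choices, not values from the slides. The standard-library function `random.sample` draws without replacement so that every subset of size n is equally likely.

```python
import random

# Hypothetical population: 20 experimental units labelled 1..20.
population = list(range(1, 21))
n = 5  # hypothetical sample size

rng = random.Random(42)             # fixed seed so the draw is reproducible
sample = rng.sample(population, n)  # n distinct units, drawn without replacement

print(sample)
```

Because the draw is without replacement, no unit can appear twice in the sample.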
Random Sample
Every sample of size n has an equal chance of
selection.
1.7

The Role of Statistics in Managerial Decision Making
Statistical Thinking
Statistical thinking involves applying rational
thought and the science of statistics to critically
assess data and inferences. Fundamental to the
thought process is that variation exists in
populations and process data.

Nonrandom Sample Errors
Selection bias results when a subset of the
experimental units in the population is excluded so
that these units have no chance of being selected for
the sample.
Nonresponse bias results when the researchers
conducting a survey or study are unable to obtain data
on all experimental units selected for the sample.
Measurement error refers to inaccuracies in the
values of the data recorded. In surveys, the error may
be due to ambiguous or leading questions and the
interviewer’s effect on the respondent.
Real-World Problem
Statistical
Computer Packages
1. Typical Software
• SPSS
• MINITAB
• Excel

2. Need Statistical
Understanding
• Assumptions
• Limitations
Key Ideas
Types of Statistical Applications

Descriptive
1. Identify population and sample (collection
of experimental units)
2. Identify variable(s)
3. Collect data
4. Describe data
Key Ideas
Types of Statistical Applications

Inferential
1. Identify population (collection of all
experimental units)
2. Identify variable(s)
3. Collect sample data (subset of population)
4. Inference about population based on sample
5. Measure of reliability for inference
Key Ideas

Types of Data

1. Quantitative (numerical in nature)


2. Qualitative (categorical in nature)
Key Ideas

Data-Collection Methods

1. Observational
2. Published source
3. Survey
4. Designed experiment
Key Ideas

Problems with Nonrandom Samples

1. Selection bias
2. Nonresponse bias
3. Measurement error
2. Descriptive Statistics

• Describing data with tables and graphs (quantitative or categorical variables)

• Numerical descriptions of center, variability, position (quantitative variables)

• Bivariate descriptions (In practice, most studies have several variables)
1. Tables and Graphs

Frequency distribution: Lists possible values of variable and number of times each occurs

Example: Student survey (n = 60)
www.stat.ufl.edu/~aa/social/data.html

“political ideology” measured as ordinal variable with 1 = very liberal, …, 4 = moderate, …, 7 = very conservative
Histogram: Bar graph of
frequencies or percentages
Shapes of histograms
(for quantitative variables)

• Bell-shaped (IQ, SAT, political ideology in all U.S.)
• Skewed right (annual income, no. times arrested)
• Skewed left (score on easy exam)
• Bimodal (polarized opinions)

Ex. GSS data on sex before marriage in Exercise 3.73: always wrong, almost always wrong, wrong only sometimes, not wrong at all
category counts 238, 79, 157, 409
Stem-and-leaf plot (John Tukey, 1977)

Example: Exam scores (n = 40 students)

Stem Leaf
3 6
4
5 37
6 235899
7 011346778999
8 00111233568889
9 02238
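
As a rough sketch of how such a plot is built, the snippet below reads the 40 scores back off the display above and regroups them by tens digit. The function name `stem_and_leaf` is an illustrative choice, and the code assumes two-digit non-negative integers.

```python
from collections import defaultdict

# Exam scores read back from the stem-and-leaf plot above (n = 40).
scores = [36, 53, 57, 62, 63, 65, 68, 69, 69,
          70, 71, 71, 73, 74, 76, 77, 77, 78, 79, 79, 79,
          80, 80, 81, 81, 81, 82, 83, 83, 85, 86, 88, 88, 88, 89,
          90, 92, 92, 93, 98]


def stem_and_leaf(values):
    """Map each stem (tens digit) to a string of its sorted leaves (units digits)."""
    leaves = defaultdict(str)
    for v in sorted(values):
        leaves[v // 10] += str(v % 10)
    # Include empty stems so gaps in the data stay visible.
    return {stem: leaves.get(stem, "") for stem in range(min(leaves), max(leaves) + 1)}


plot = stem_and_leaf(scores)
for stem, leaf in plot.items():
    print(stem, leaf)
```

Keeping the empty stem 4 in the output preserves the gap between the lowest score and the rest, which is part of what the plot is meant to show.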
2. Numerical descriptions
Let y denote a quantitative variable, with
observations y1 , y2 , y3 , … , yn

a. Describing the center

Median: Middle measurement of ordered sample

Mean: $\bar{y} = \frac{y_1 + y_2 + \dots + y_n}{n} = \frac{\sum y_i}{n}$
Example: Annual per capita carbon dioxide emissions (metric tons) for n = 8 largest nations in population size

Bangladesh 0.3, Brazil 1.8, China 2.3, India 1.2, Indonesia 1.4, Pakistan 0.7, Russia 9.9, U.S. 20.1

Ordered sample: 0.3, 0.7, 1.2, 1.4, 1.8, 2.3, 9.9, 20.1

Median = (1.4 + 1.8)/2 = 1.6

Mean $\bar{y}$ = (0.3 + 0.7 + 1.2 + … + 20.1)/8 = 4.7
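
The same calculation can be checked with Python's standard `statistics` module. This is a minimal sketch using the eight values from the example; the variable names are illustrative.

```python
import statistics

# Per capita CO2 emissions (metric tons) from the example above.
emissions = {
    "Bangladesh": 0.3, "Brazil": 1.8, "China": 2.3, "India": 1.2,
    "Indonesia": 1.4, "Pakistan": 0.7, "Russia": 9.9, "U.S.": 20.1,
}

values = sorted(emissions.values())
median = statistics.median(values)  # average of the two middle values when n is even
mean = statistics.mean(values)

print(values)
print(round(median, 1), round(mean, 1))
```

The mean comes out well above the median because the two largest values pull it toward the long right tail.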


Properties of mean and median
• For symmetric distributions, mean = median
• For skewed distributions, mean is drawn in
direction of longer tail, relative to median
• Mean valid for interval scales, median for
interval or ordinal scales
• Mean sensitive to “outliers” (median often
preferred for highly skewed distributions)
• When distribution symmetric or mildly skewed or
discrete with few values, mean preferred
because uses numerical values of observations
Examples:

• New York Yankees baseball team, 2006


mean salary = $7.0 million
median salary = $2.9 million

How possible? Direction of skew?

• Give an example for which you would expect

mean < median


b. Describing variability

Range: Difference between largest and smallest observations
(but highly sensitive to outliers, insensitive to shape)

Standard deviation: A “typical” distance from the mean

The deviation of observation i from the mean is $y_i - \bar{y}$

The variance of the n observations is

$s^2 = \frac{\sum (y_i - \bar{y})^2}{n - 1} = \frac{(y_1 - \bar{y})^2 + \dots + (y_n - \bar{y})^2}{n - 1}$

The standard deviation s is the square root of the variance,

$s = \sqrt{s^2}$
Example: Political ideology
• For those in the student sample who attend religious services at least once a week (n = 9 of the 60),
• y = 2, 3, 7, 5, 6, 7, 5, 6, 4

$\bar{y} = 5.0$

$s^2 = \frac{(2 - 5)^2 + (3 - 5)^2 + \dots + (4 - 5)^2}{9 - 1} = \frac{24}{8} = 3.0$

$s = \sqrt{3.0} = 1.7$
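
A short sketch of the same arithmetic, written out step by step so each term of the variance formula is visible. The variable names are illustrative; the final comparison against `statistics.variance` is just a cross-check.

```python
import math
import statistics

# Political ideology scores for the n = 9 weekly attendees in the example above.
y = [2, 3, 7, 5, 6, 7, 5, 6, 4]

n = len(y)
y_bar = sum(y) / n                               # sample mean
squared_devs = [(yi - y_bar) ** 2 for yi in y]   # (y_i - y_bar)^2 for each observation
s2 = sum(squared_devs) / (n - 1)                 # sample variance, dividing by n - 1
s = math.sqrt(s2)                                # sample standard deviation

print(y_bar, sum(squared_devs), s2, round(s, 1))

# Cross-check against the standard library, which also divides by n - 1.
print(statistics.variance(y))
```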

For the entire sample (n = 60), mean = 3.0 and standard deviation = 1.6; the entire sample tends to have similar variability but be more liberal
• Properties of the standard deviation:
• s ≥ 0, and only equals 0 if all observations are equal
• s increases with the amount of variation around the mean
• Division by n - 1 (not n) is due to technical reasons (later)
• s depends on the units of the data (e.g. measure euro vs $)
• Like mean, affected by outliers

• Empirical rule: If distribution is approx. bell-shaped,
– about 68% of data within 1 standard dev. of mean
– about 95% of data within 2 standard dev. of mean
– all or nearly all data within 3 standard dev. of mean
Example: SAT with mean = 500, s = 100
(sketch picture summarizing data)
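
One way to prepare the sketch for the SAT example is to compute the three intervals the empirical rule refers to. The helper below is an illustrative assumption (its name and return format are not from the slides), and it only applies when the distribution is approximately bell-shaped.

```python
def empirical_rule_intervals(mean, sd):
    """Return (lower, upper, share of data) for 1, 2 and 3 standard deviations."""
    coverage = {1: "about 68%", 2: "about 95%", 3: "all or nearly all"}
    return {k: (mean - k * sd, mean + k * sd, label) for k, label in coverage.items()}


# SAT example from the slide: mean = 500, s = 100.
intervals = empirical_rule_intervals(500, 100)
for k, (lower, upper, label) in intervals.items():
    print(f"within {k} sd: {lower} to {upper} ({label} of data)")
```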

Example: y = number of close friends you have


GSS: The variable ‘frinum’ has mean 7.4, s = 11.0

Probably highly skewed: right or left?

Empirical rule fails; in fact, median = 5, mode=4

Example: y = selling price of home in Syracuse, NY.


If mean = $130,000, which is realistic?

s = 0, s = 1000, s = 50,000, s = 1,000,000


c. Measures of position
pth percentile: p percent of observations below it, (100 - p)% above it.

• p = 50: median
• p = 25: lower quartile (LQ)
• p = 75: upper quartile (UQ)
• Interquartile range IQR = UQ - LQ

Quartiles portrayed graphically by box plots (John Tukey)
Example: weekly TV watching for n=60 from
student survey data file, 3 outliers
Box plots have box from LQ to UQ, with
median marked. They portray a five-
number summary of the data:
Minimum, LQ, Median, UQ, Maximum
except for outliers identified separately

Outlier = observation falling below LQ - 1.5(IQR) or above UQ + 1.5(IQR)

Ex. If LQ = 2, UQ = 10, then IQR = 8 and outliers above 10 + 1.5(8) = 22
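
The 1.5(IQR) rule can be written as a small helper. This is a sketch under the assumption that the quartiles are already known; the function names are illustrative.

```python
def outlier_fences(lq, uq):
    """Return the (lower, upper) cutoffs of the 1.5 * IQR outlier rule."""
    iqr = uq - lq
    return lq - 1.5 * iqr, uq + 1.5 * iqr


def is_outlier(value, lq, uq):
    """True if the value falls below the lower cutoff or above the upper cutoff."""
    lower, upper = outlier_fences(lq, uq)
    return value < lower or value > upper


# Quartiles from the example above: LQ = 2, UQ = 10.
lower, upper = outlier_fences(2, 10)
print(lower, upper)
print(is_outlier(25, 2, 10), is_outlier(22, 2, 10))
```

With these quartiles the lower cutoff is negative, so for a variable that cannot go below zero only large values can be flagged.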
3. Bivariate description
• Usually we want to study associations between two or
more variables (e.g., how does number of close
friends depend on gender, income, education, age,
working status, rural/urban, religiosity…)
• Response variable: the outcome variable
• Explanatory variable(s): defines groups to compare

Ex.: number of close friends is a response variable, while gender, income, … are explanatory variables

Response var. also called “dependent variable”
Explanatory var. also called “independent variable”
Summarizing associations:
• Categorical var’s: show data using contingency tables
• Quantitative var’s: show data using scatterplots
• Mixture of categorical var. and quantitative var. (e.g.,
number of close friends and gender) can give
numerical summaries (mean, standard deviation) or
side-by-side box plots for the groups

• Ex. General Social Survey (GSS) data


Men: mean = 7.0, s = 8.4
Women: mean = 5.9, s = 6.0
Shape? Inference questions for later chapters?
Example: Income by highest degree
Contingency Tables

• Cross classifications of categorical variables in which rows (typically) represent categories of explanatory variable and columns represent categories of response variable.

• Counts in “cells” of the table give the numbers of individuals at the corresponding combination of levels of the two variables
Happiness and Family Income
(GSS 2008 data: “happy,” “finrela”)

                        Happiness
Income          Very    Pretty    Not too    Total
--------------------------------------------------
Above Aver.      164       233         26      423
Average          293       473        117      883
Below Aver.      132       383        172      687
--------------------------------------------------
Total            589      1089        315     1993
Can summarize by percentages on response
variable (happiness)

Example: Percentage “very happy” is

39% for above aver. income (164/423 = 0.39)


33% for average income (293/883 = 0.33)
19% for below average income (??)
                        Happiness
Income        Very         Pretty       Not too      Total
----------------------------------------------------------
Above         164 (39%)    233 (55%)     26 (6%)       423
Average       293 (33%)    473 (54%)    117 (13%)      883
Below         132 (19%)    383 (56%)    172 (25%)      687
----------------------------------------------------------
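
The row percentages in the table can be reproduced from the counts with a short script. This is a sketch only; the dictionary layout and the function name `row_percentages` are illustrative choices.

```python
# Counts from the GSS 2008 table above: (very happy, pretty happy, not too happy).
table = {
    "Above average": (164, 233, 26),
    "Average": (293, 473, 117),
    "Below average": (132, 383, 172),
}


def row_percentages(counts):
    """Percentage of the row total in each response category, rounded to whole numbers."""
    total = sum(counts)
    return [round(100 * c / total) for c in counts]


percentages = {income: row_percentages(counts) for income, counts in table.items()}
for income, pcts in percentages.items():
    print(income, pcts)
```

Percentages are taken within each row because income is the explanatory variable and happiness is the response.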

Inference questions for later chapters? (i.e., what can


we conclude about the corresponding population?)
Scatterplots (for quantitative variables)
plot response variable on vertical axis,
explanatory variable on horizontal axis

Example: Table 9.13 (p. 294) shows UN data for several


nations on many variables, including fertility (births per
woman), contraceptive use, literacy, female economic
activity, per capita gross domestic product (GDP), cell-
phone use, CO2 emissions

Data available at
http://www.stat.ufl.edu/~aa/social/data.html
Example: Survey in Alachua County, Florida,
on predictors of mental health
(data for n = 40 on p. 327 of text and at
www.stat.ufl.edu/~aa/social/data.html)

y = measure of mental impairment (incorporates various dimensions of psychiatric symptoms, including aspects of depression and anxiety)
(min = 17, max = 41, mean = 27, s = 5)

x = life events score (events range from severe personal disruptions such as death in family, extramarital affair, to less severe events such as new job, birth of child, moving)
(min = 3, max = 97, mean = 44, s = 23)
Bivariate data from 2000 Presidential election
Butterfly ballot, Palm Beach County, FL, text p.290
Example: The Massachusetts Lottery
(data for 37 communities)

[Scatterplot: % income spent on lottery (vertical axis) versus per capita income (horizontal axis)]

Correlation describes strength of association
• Falls between -1 and +1, with sign indicating direction of association (formula later in Chapter 9)
• The larger the correlation in absolute value, the stronger the association (in terms of a straight line trend)

Examples: (positive or negative, how strong?)

Mental impairment and life events, correlation = 0.37
GDP and fertility, correlation = -0.56
GDP and percent using Internet, correlation = 0.89
Regression analysis gives line
predicting y using x
Example:
y = mental impairment, x = life events

Predicted y = 23.3 + 0.09x

e.g., at x = 0, predicted y = 23.3


at x = 100, predicted y = 23.3 + 0.09(100) = 32.3

Inference questions for later chapters?


(i.e., what can we conclude about the population?)
Example: student survey
y = college GPA, x = high school GPA
(data at www.stat.ufl.edu/~aa/social/data.html)

What is the correlation?

What is the estimated regression equation?

We’ll see later in course the formulas for finding


the correlation and the “best fitting” regression
equation (with possibly several explanatory
variables), but for now, try using software such
as SPSS to find the answers.
Sample statistics /
Population parameters

• We distinguish between summaries of samples (statistics) and summaries of populations (parameters).

• Common to denote statistics by Roman letters, parameters by Greek letters:

Population mean = μ, standard deviation = σ, proportion = π are parameters.

In practice, parameter values are unknown; we make inferences about their values using sample statistics.
• The sample mean $\bar{y}$ estimates the population mean μ (quantitative variable)

• The sample standard deviation s estimates the population standard deviation σ (quantitative variable)

• A sample proportion p estimates a population proportion π (categorical variable)
Statistics:
A Gentle Introduction
Overview
• What is statistics?
• What is a statistician?
• All statistics are not alike
• On the science of science
• Why do we need it?
• Good vs. shady science
• Learning a new language
What is statistics?
• Statistics:

– A way to organize information to make it


easier to understand what the information
might mean.
What is statistics?

– Provides a conceptual understanding so


results can be communicated to others in a
clear and accurate way.
What is a statistician?
The Curious Detective

• The Curious Detective:

– Examines the crime scene


• The crime scene is the experiment.

– Looks for clues


• Data from experiments are the clues.
What is a statistician?
The Curious Detective

– Develops suspicions about the culprit
• Questions (hypotheses) from the crime scene (experiment) determine how to answer the questions.

– Remains skeptical
• Relies on sound clues (good statistics), and
information from the crime scene (experiment), not
the “fad” of the day.
What is a statistician?
The Honest Attorney

• The Honest Attorney:

– Examines the facts of the case


• Examines the data.
• Is the data sound?
• What might the data mean?
What is a statistician?
The Honest Attorney

– Creates a legal argument using the facts

• Tries to come up with a reasonable explanation for


what happened.

• Is there another possible explanation?

• Do the data support the argument (hypotheses)?


What is a statistician?
The Honest Attorney

• The unscrupulous or naive attorney


– Either by choice or lack of experience, the
data are manipulated or forced to support the
hypothesis.

– Worst case:
• Ignore disconfirming data or make up the data.
What is a statistician?
A Good Storyteller

• A Good Storyteller:
– In order for the findings to be published, they
must be put together in a clear, coherent
manner that relates:
• What happened?
• What was found?
• Why is it important?
• What does it mean for the future?
All statistics are not alike
Conservative vs. Liberal statisticians

• Conservative
• Use the tried and true methods
• Prefer conventional rules & common practices
– Advantages:
• More accepted by peers and journal editors
• Guard against chance influencing the findings
– Disadvantages:
• New statistical methods are avoided
All statistics are not alike
Conservative vs. Liberal statisticians

• Liberal
• More likely to use new statistical methods
• Willing to question convention
– Advantages
• May be more likely to discover previously
undetected changes/causes/relationships
– Disadvantages
• More difficulty in having findings accepted by
publishers and peers
All statistics are not alike
Types of statistics

• Descriptive:
– Describing the information (parameters)
• How many (frequency)
• What does it look like (graphing)
• What types (tables)
All statistics are not alike
Types of statistics

• Inferential:
– Making educated guesses (inferences) about
a large group (population) based on what we
know about a smaller group (sample).
On the science of science
• The role of science

Science helps to build explanations of what we


experience that are consistent and predictive,
rather than changing, reactive, and biased.
On the science of science
• The need for scientific investigation

Scientific investigation provides a set of tools to


explore in a way that provides consistent
building blocks of information so that we can
better understand what we experience and
predict future events.
On the science of science
The scientific method

• The scientific method is a repetitive


process that:
– Uses observations to generate theories
– Uses theories to generate hypotheses
– Uses research methods to test hypotheses,
which generate new observations and/or
theories
On the science of science
The scientific method: Theories

• Theories
– What are they?

• An idea or set of ideas that attempt to explain an


important phenomenon.
– Theories of behavior
– Theory of relativity
On the science of science
The scientific method: Theories

– Where do they come from?

• They are generated from observations about the


phenomenon.

– Why might this happen?


– Is there something that consistently happens given a set
of initial conditions?
On the science of science
The scientific method: Theories

– How do we know if they are any good?

• Theories lead to guesses about what might happen if . . . (hypotheses).

• If the hypotheses are supported through experiments, then we put more belief in the usefulness of the theory.
On the science of science
The scientific method: Hypotheses

• Hypotheses:
– Usually generated by a theory.

– States what is predicted to happen as a result


of an experiment/event.
• I think “X” will happen as a result of “Y.”
• If “Y” occurs, then “X” will result.
On the science of science
The scientific method: Research

• Research:
– Provides the investigator with an opportunity
to examine an area of interest and/or
manipulate circumstances to observe the
outcome.

– Tests a theory/hypothesis.
On the science of science
The scientific method: Observations

• Observations:
– The results of an experiment.

– Observations can:
• Support or detract from a theory
• Suggest revision of a theory
• Generate a new theory
Why do we need it?
• Statistics help us to:
– Understand what was observed.
– Communicate what was found.
– Make an argument.
– Answer a question.
– Be better consumers of information.
Why do we need it?
Better consumers of information

• To be better consumers of information, we need to ask:
– Who was surveyed or studied?
• Are the participants like me or my interest group?
– All men
– All European American
– All twenty-something in age
• If not, might the information still be important?
Why do we need it?
Better consumers of information

– Why did the people participate in the study?


• Was it just for the money?
– If they were paid a lot, how might that influence their
performance/rating/reports?

• Were they desperate for a cure/treatment?

• Did the participants have something to prove?


Why do we need it?
Better consumers of information

– Was there a control group and did the control


group receive a placebo?
• If not, how do I know it worked?

• Did the participant know she or he received the


treatment?

• Was it the placebo effect (the belief in the


treatment) that caused the change?
Why do we need it?
Better consumers of information

– How many people participated in the study?


• Were there enough to detect a difference?
– Too few participants might result in not finding a
difference when there is one.

• Were there so many that any minor difference


would be detected?
– Too many participants will result in detecting almost any
tiny difference— even if it isn’t meaningful.
Why do we need it?
Better consumers of information

– How were the questions worded to the


participants in the study?
• Does the wording indicate the “expected” answer?
• Does the wording accurately reflect what is being
studied?
– The rape survey
• Was the wording at the appropriate level for the
participant?
Why do we need it?
Better consumers of information

– Was causation assumed from a correlational


study?
• Many of the studies we hear about from the media
are correlational studies (relationships only),

• But the results are reported as though they were


from an experiment (causation).
Why do we need it?
Better consumers of information

– Who paid for the study?


• Does the funding source have a reason for an
expected result of the study?
– Pharmaceutical companies
– Political party
– A specific interest group
Why do we need it?
Better consumers of information

– Was the study published in a peer-reviewed


journal?
• Peer-reviewed journals tend to be more rigorous in
the examination of the submission.

• Was it published in:


– Journal of Consulting and Clinical Psychology
– New England Journal of Medicine
– National Enquirer
Good vs. Shady science
• Good science
– To make sure what we get is useful:
• The sample of participants should be randomly
drawn from the population.
– Everyone has an equal chance of being selected.

• The sample should be relatively large.


– Able to detect differences
– Representative of the population
Good vs. Shady science
• Good science
– Random sample
– Random assignment
– Placebo studies
– Double-blind studies
– Control group studies
– Minimizing confounding variables
Good vs. Shady science
• Shady science
– 10% of the brain is used
– News surveys
– Does American Idol really pick America’s
favorite?
– Got any examples?
Learning a new language
• The words sound the same, but it is a
whole new game.

• The end of significance as you know it.

• Variable now means something more


stable.
Learning a new language
• Who is in control?
– Experimental control
– Statistical control

• The fly in the ointment


– Confounding variables
Learning a new language

• Independent variable (IV)
– Manipulated by experimenter
– Related to topic of curiosity
– Expected to influence the dependent variable

• Dependent variable
– Is measured in study
– Topic of curiosity
– Changes as a result of exposure to IV
Learning a new language
• What are you talking about?
– Operational definition

• Error is not a mistake


– Recognition of measurement imperfection
– Sources
• Participant
• Study conditions
Quantitative and Qualitative Variables
• Quantitative (numeric)
– Discrete
– Continuous
• Qualitative (non-numeric)
Explanation of Terms

• Quantitative data: data values that are numeric. Ex: math anxiety score
• Qualitative data: data values that can be placed into distinct categories according to some characteristic. Ex: eye color, hair color, gender, types of foods, drinks; typically either/or
Learning a new language
Types of variables

• How it can be measured matters


– Discrete variables
• What is measured belongs to unique and separate
categories
– Pets: dog, cat, goldfish, rats

• If there are only two categories, then it is called a


dichotomous variable
– Open or closed; male or female
Learning a new language
Types of variables

– Continuous variables
• What is measured varies along a line scale and can have small or large units of measure
• Can assume all values between any two given values
– Length
– Temperature
– Age
– Distance
– Time
Levels of Measurement

Nominal Level
• Symbols are assigned to a set of categories for purpose of naming, labeling, or classifying observations. Ex: gender. Other examples include political party, religion, and race. Numbering is arbitrary.

Ordinal Level
• Numbers are assigned to rank-ordered categories ranging from low to high. Example: social class (“upper class”, “middle class”). Middle class is
Learning a new language
Measurement scales: Nominal

• Measurement scales
– Nominal scales
• Separated into different categories
• All categories are equal
– Cats, dogs, rats
– NOT: 1st, 2nd, 3rd
• There is no magnitude within a category
– One dog is not more dog than another.
Learning a new language
Measurement scales: Nominal

• No intermediate categories
– No dog/cat or cat/fish categories

• Membership in only one category, not both


Learning a new language
Measurement scales: Ordinal

– Ordinal scales
• What is measured is placed in groups by a ranking
– 1st, 2nd, 3rd
Learning a new language
Measurement scales: Ordinal

• Although there is a ranking difference between the groups, the actual difference between the groups may vary.
– Marathon runners classified by finish order
» The times for each group will be different
» Top ten 4- to 5-hour times
» Bottom ten 4- to 5-week times

[Figure: time axis marking 1st place, 2nd place, 3rd place]

Interval-Ratio Level

• Categories can be rank ordered, and measurements for all cases are expressed in the same units. Examples include age, income, and SAT scores. Not only can we rank order as in ordinal level measurements, but we can also say how much larger or smaller one value is compared with another.
• Variables with a natural zero point are called ratio variables (e.g., income, number of friends). If it is meaningful to say “twice as much”, then it is a ratio variable.
Learning a new language
Measurement scales: Interval

– Interval scales
• Someone or thing is measured on a scale in which
interpretations can be made by knowing the
resulting measure.
• The difference between units of measure is
consistent.
– Height
– Speed

Length
Learning a new language
Measurement scales

– Ratio scale
• Just like an interval scale, and there is a definable
and reasonable zero point.
– Time, weight, length

• Seldom used in social sciences
[Figure: number line marked -20, -10, 0, +10, +20]
• All ratio scales are also interval scales, but not all
interval scales are ratio scales
Getting our toes wet
• Rounding numbers
– Less than 5, go down
– Greater than 5, go up

6.60 15.73 51.356


2.41 9.12 33.842
22.49 11.06 7.667
78.55 32.90 43.115
Getting our toes wet
Σ (sigma)

• Useful symbols
• Σ (sigma): used to indicate that the group
of numbers will be added together
x is 3, 78, 32, 15
Σx = 3 + 78 + 32 + 15
Σx = 128
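
In code, Σx is just a running total. A one-line sketch in Python using the values above (the variable names are illustrative):

```python
x = [3, 78, 32, 15]

sigma_x = sum(x)  # Σx: add every value in the group together
print(sigma_x)
```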
Getting our toes wet
Σ (sigma)

• Let’s try it
x = 7, 33, 10, 19
Σx =

x = 62, 21, 73, 4


Σx =
Getting our toes wet
(‘x’ bar)

• $\bar{x}$ (‘x’ bar): the mean or average
– Add all the data points together (Σx)
– Divide by the number of data points (N)

$\bar{x} = \frac{\sum x}{N}$
Getting our toes wet
(‘x’ bar)

Where: x = 3, 12, 6, 5, 11, 15, 1, 7
Σx = 60
N = 8

$\bar{x} = \frac{60}{8} = 7.5$
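
The two steps (add the data points, divide by how many there are) translate directly into Python. A minimal sketch with the values from this example; the variable names are illustrative.

```python
x = [3, 12, 6, 5, 11, 15, 1, 7]

sigma_x = sum(x)         # Σx: add all the data points together
n_points = len(x)        # N: the number of data points
x_bar = sigma_x / n_points

print(sigma_x, n_points, x_bar)
```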
Getting our toes wet
(‘x’ bar)

• Let’s try it
x = 3, 7, 1, 4, 4, 2
$\bar{x}$ =

x = 28, 36, 22, 40, 34, 29
$\bar{x}$ =
Getting our toes wet
Σx² (Sigma x squared)

• Σx² (Sigma x squared)
– Square each number, then
– Add them together

x = 2, 4, 6, 8
Σx² = (2)² + (4)² + (6)² + (8)²
Σx² = 4 + 16 + 36 + 64
Σx² = 120
Getting our toes wet
Σx² (Sigma x squared)

• Let’s try it
x = 1, 3, 5, 7
Σx² =

x = 4, 3, 9, 1
Σx² =
Getting our toes wet
(Σx)² (The square of Sigma x)

• (Σx)² (The square of Sigma x)
– Sum all the numbers, then
– Square the sum
x = 5, 7, 2, 3
(Σx)² = (5 + 7 + 2 + 3)²
(Σx)² = (17)²
(Σx)² = 289
Getting our toes wet
(Σx)² (The square of Sigma x)

• Let’s try it
x = 7, 7, 3, 2, 5
(Σx)² =

x = 3, 8, 1, 2
(Σx)² =
Getting our toes wet
Σx² versus (Σx)²

• Σx² versus (Σx)²: not the same

x = 4, 3, 2, 1
Σx² = (4)² + (3)² + (2)² + (1)²
Σx² = (16) + (9) + (4) + (1)
Σx² = 30
(Σx)² = (4 + 3 + 2 + 1)²
(Σx)² = (10)²
(Σx)² = 100
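
The difference between the two quantities is only the order of squaring and summing, which is easy to see side by side in code. A short sketch with the same four values (the variable names are illustrative):

```python
x = [4, 3, 2, 1]

sum_of_squares = sum(v ** 2 for v in x)  # Σx²: square each number, then add
square_of_sum = sum(x) ** 2              # (Σx)²: add the numbers, then square the total

print(sum_of_squares, square_of_sum)
```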
‫‪DR‬‬
‫‪Mohamed Nada‬‬