Lecture 2 Chapter 2
• An engineer is studying the formulation of a Portland cement mortar.
• He has added a polymer latex emulsion during mixing to determine if this
impacts the curing time and tension bond strength of the mortar.
• The experimenter prepared 10 samples of the original formulation and 10
samples of the modified formulation.
• We will refer to the two different formulations as two treatments or as two
levels of the factor formulations.
• When the cure process was completed, the experimenter found a very
large reduction in the cure time for the modified mortar formulation.
• Then he began to address the tension bond strength of the mortar. If the
new mortar formulation has an adverse effect on bond strength, this could
impact its usefulness.
• The tension bond strength data from
this experiment are shown in Table
2.1 and plotted in Figure 2.1.
• The plotted data suggest a difference in average bond strength between
the two formulations. However, it is not obvious that this difference is
large enough to imply that the two formulations really are different.
• A technique of statistical inference called hypothesis testing can be
used to assist the experimenter in comparing these two formulations.
• Hypothesis testing allows the comparison of the two formulations to
be made on objective terms, with knowledge of the risks associated
with reaching the wrong conclusion.
Basic Statistical Concepts
• Each of the observations in the Portland cement experiment described above would be called a run.
• Notice that the individual runs differ, so there is fluctuation, or noise, in the observed bond strengths.
• This noise is usually called experimental error or simply error.
• It is a statistical error, meaning that it arises from variation that is uncontrolled and generally
unavoidable.
• The presence of error or noise implies that the response variable, tension bond strength, is a random
variable.
• A random variable may be either discrete or continuous. If the set of all possible values of the random
variable is either finite or countably infinite, then the random variable is discrete, whereas if the set of all
possible values of the random variable is an interval, then the random variable is continuous.
• For example, the number of cars parked in a lot at any given time could be 0, 1, 2, 3, ..., up to the total
capacity of the lot. This set of values is countable, so the count is a discrete random variable.
• The speed of a car on a highway can take any value within a range, say from 0 km/h to 120 km/h.
Between any two speeds (e.g., 80 km/h and 81 km/h), there are infinitely many possible values. Hence, it
is a continuous random variable.
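The distinction between the two kinds of random variable can be illustrated with a short simulation; the lot capacity of 50 and the speed range below are illustrative assumptions, not values from the text:

```python
import random

random.seed(1)

# Discrete random variable: number of cars in a lot with capacity 50.
# Only the integer values 0, 1, ..., 50 are possible (countable set).
cars = random.randint(0, 50)

# Continuous random variable: speed in km/h. Any real value in the
# interval [0, 120] is possible; between any two speeds there are
# infinitely many others.
speed = random.uniform(0.0, 120.0)

print(cars, speed)
```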
Graphical Description of Variability
• The formula for converting any normally distributed variable y into a standard
normal variable z is
  z = (y - μ) / σ
• Here z represents the standard score (also known as a z-score), which measures how many
standard deviations a particular value lies from the mean μ; σ is the standard deviation.
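As a quick numeric illustration of standardization (the mean and standard deviation below are made-up values, not estimates from the mortar data):

```python
def z_score(y, mu, sigma):
    """Standard score: how many standard deviations y lies from the mean."""
    return (y - mu) / sigma

# Hypothetical bond-strength value with an assumed mean of 17.0
# and an assumed standard deviation of 0.3.
z = z_score(17.6, 17.0, 0.3)
print(z)  # about 2.0: the value lies two standard deviations above the mean
```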
• If the null hypothesis is true (the variances are equal), the ratio of the
variances should be close to 1. If the null hypothesis is false (the
variances are different), the ratio will deviate significantly from 1.
Inferences About the Differences in
Means, Randomized Designs
• Hypothesis Testing
• We now reconsider the Portland cement
experiment introduced in Section 2.1. Recall
that we are interested in comparing the
strength of two different formulations: an
unmodified mortar and a modified mortar.
• In general, we can think of these two
formulations as two levels of the factor
“formulations.” Let y11, y12, . . . , y1n1 represent the
n1 observations from the first factor level and
y21, y22, . . . , y2n2 represent the n2 observations
from the second factor level.
• We assume that the samples are drawn at
random from two independent normal
populations. Figure 2.9 illustrates the
situation.
• A statistical hypothesis is a statement either about the parameters of a
probability distribution or about the parameters of a model. The hypothesis
reflects some conjecture about the problem situation.
• For example, in the Portland cement experiment, we may think that the mean
tension bond strengths of the two mortar formulations are equal:
  H0: μ1 = μ2
  H1: μ1 ≠ μ2
• where μ1 is the mean tension bond strength of the modified mortar and μ2 is the
mean tension bond strength of the unmodified mortar.
• The statement H0: μ1 = μ2 is called the null hypothesis and H1: μ1 ≠ μ2 is called
the alternative hypothesis. The alternative hypothesis specified here is called a
two-sided alternative hypothesis because it would be true if μ1 < μ2 or if μ1 > μ2.
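The two-sided test of H0 above is carried out with the pooled two-sample t-statistic. A minimal stdlib sketch (the sample values below are illustrative, not the Table 2.1 data):

```python
from math import sqrt
from statistics import mean, variance

def two_sample_t(y1, y2):
    """Pooled two-sample t-statistic for testing H0: mu1 == mu2."""
    n1, n2 = len(y1), len(y2)
    # Pooled estimate of the common variance under H0.
    sp2 = ((n1 - 1) * variance(y1) + (n2 - 1) * variance(y2)) / (n1 + n2 - 2)
    return (mean(y1) - mean(y2)) / sqrt(sp2 * (1 / n1 + 1 / n2))

# Illustrative bond-strength samples for the two formulations.
modified = [16.85, 16.40, 17.21, 16.35, 16.52]
unmodified = [17.50, 17.63, 18.25, 18.00, 17.86]
t0 = two_sample_t(modified, unmodified)
print(t0)
```

A large |t0| relative to the t distribution with n1 + n2 - 2 degrees of freedom leads to rejection of H0.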
Minitab (software output not shown)
Choice of Sample Size
• Selecting an appropriate sample size is one of the most
important parts of any experimental design problem: it
ensures that the study has sufficient power to detect a
statistically significant difference between two groups.
• The sample size needed depends on several factors,
including the expected effect size, the desired level of
power, the significance level (alpha), and the variability in
the data. Steps are:
• Determine Effect Size
• The effect size is the magnitude of the difference you expect to detect
between the two groups. For two means it can be calculated as
  d = |μ1 - μ2| / σ
where σ is the (pooled) standard deviation.
• Set the Desired Power (1 - β)
• Power refers to the probability of correctly rejecting the null hypothesis (i.e., detecting
a true difference when one exists).
• A commonly used value is 0.80 (80%), meaning there is an 80% chance of detecting an
effect if it exists.
• For more rigorous studies, 0.90 or 0.95 (90% or 95%) power might be used.
• Higher power requires larger sample sizes.
• Set the Significance Level (α)
• The significance level (α) is the probability of rejecting the null hypothesis when it is
true (i.e., a Type I error). The most commonly used value is 0.05 (5%), meaning there is
a 5% chance of finding a false positive.
• More conservative significance levels, like 0.01, reduce the chance of false positives
but require larger sample sizes.
• Estimate the Standard Deviation (σ)
• If prior studies or pilot data are available, use them to estimate the pooled
standard deviation. Larger standard deviations imply more variability in the
data, which increases the required sample size.
• Use a Power Analysis Formula or Software
• We can calculate the sample size manually using the normal-approximation
formula for a two-sided, two-sample t-test:
  n ≈ 2 (z_(α/2) + z_β)² σ² / δ²   (per group)
where δ = μ1 - μ2 is the difference to be detected.
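The steps above can be sketched as a small function; `statistics.NormalDist` supplies the z quantiles, and the choices of α, power, σ, and δ in the example are assumptions for illustration:

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided two-sample t-test
    using the normal approximation n = 2((z_a/2 + z_b) * sigma / delta)^2."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = z(power)           # quantile corresponding to desired power
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)

# Example: detect a difference of 0.5 units when sigma = 1.0,
# with alpha = 0.05 and 80% power.
n = sample_size_per_group(delta=0.5, sigma=1.0)
print(n)  # 63 per group
```

Note how n grows as δ shrinks or σ grows, matching the points made above.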
Inferences About the Differences in Means,
Paired Comparison Designs
• The Paired Comparison Problem
• Consider a hardness testing machine that presses a rod with a pointed tip into a metal
specimen with a known force. By measuring the depth of the depression caused by the
tip, the hardness of the specimen is determined. Two different tips are available for this
machine, and although the precision (variability) of the measurements made by the two
tips seems to be the same, it is suspected that one tip produces different mean hardness
readings than the other.
• An experiment could be performed as follows. A number of metal
specimens (e.g., 20) could be randomly selected. Half of these
specimens could be tested by tip 1 and the other half by tip 2. The
exact assignment of specimens to tips would be randomly
determined. Because this is a completely randomized design, the
average hardness of the two samples could be compared using the t-
test described in Section 2.4.
• A little reflection will reveal a serious disadvantage in the completely
randomized design for this problem. Suppose the metal specimens
were cut from different bar stock that were produced in different heats
or that were not exactly homogeneous in some other way that might
affect the hardness. This lack of homogeneity between specimens will
contribute to the variability of the hardness measurements and will
tend to inflate the experimental error, thus making a true difference
between tips harder to detect.
• To protect against this possibility, consider an alternative experimental
design. Assume that each specimen is large enough so that two
hardness determinations may be made on it. This alternative design
would consist of dividing each specimen into two parts, then randomly
assigning one tip to one-half of each specimen and the other tip to the
remaining half. The order in which the tips are tested for a particular
specimen would also be randomly selected. The experiment, when
performed according to this design with 10 specimens, produced the
data shown in the accompanying table.
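With the paired design, the analysis is based on the within-specimen differences rather than the two raw samples; a minimal sketch (the hardness readings below are made-up values for illustration):

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(x, y):
    """Paired t-statistic based on within-pair differences d_j = x_j - y_j."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    # Pairing removes specimen-to-specimen variability: only the
    # differences, not the raw readings, enter the statistic.
    return mean(d) / (stdev(d) / sqrt(n))

# Hypothetical hardness readings from the two tips on the same 10 specimens.
tip1 = [7, 3, 3, 4, 8, 3, 2, 9, 5, 4]
tip2 = [6, 3, 5, 3, 8, 2, 4, 9, 4, 5]
t0 = paired_t(tip1, tip2)
print(t0)
```

The statistic is compared with the t distribution on n - 1 = 9 degrees of freedom; a small |t0| gives no evidence of a difference between tips.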
Inferences About the Variances of Normal
Distributions
• In many experiments, we are interested in possible differences in the mean
response for two treatments.
• However, in some experiments it is the comparison of variability in the data that is
important.
• In the food and beverage industry, for example, it is important that the variability
of filling equipment be small so that all packages have close to the nominal net
weight or volume of content.
• In chemical laboratories, we may wish to compare the variability of two analytical
methods.
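Comparing two variances leads to the ratio F0 = S1²/S2² discussed earlier: values near 1 support equal variances, values far from 1 do not. A minimal sketch (the two samples below are made-up measurements from two hypothetical analytical methods):

```python
from statistics import variance

def f_ratio(y1, y2):
    """Ratio of sample variances; compared with the F distribution
    on (n1 - 1, n2 - 1) degrees of freedom."""
    return variance(y1) / variance(y2)

# Hypothetical measurements of the same quantity by two methods.
method_a = [10.2, 10.5, 9.9, 10.1, 10.4, 10.0]   # less variable
method_b = [10.3, 9.6, 10.8, 9.9, 10.7, 9.5]     # more variable
F0 = f_ratio(method_a, method_b)
print(F0)
```

Here F0 is well below 1, reflecting that method A's readings are noticeably less variable than method B's.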