Week 01

Uploaded by

gibawav948

GE 204/ FENS 200

PROBABILITY & STATISTICS

Week 1: Introduction to Statistics

1
Dealing with Uncertainty

Everyday decisions are based on incomplete information.

Consider:
• The price of IBM stock will be higher in six months than it is now.
• If the federal budget deficit is as high as predicted, interest rates will remain high for the rest of the year.

2
Dealing with Uncertainty (continued)

Because of uncertainty, the statements should be modified:
• The price of IBM stock is likely to be higher in six months than it is now.
• If the federal budget deficit is as high as predicted, it is probable that interest rates will remain high for the rest of the year.

3
What does “Statistics” mean?
1. Numerical data
“According to statistics, this year’s exports have set a record!”
“We need to collect statistics for the productivity of this business.”

2. Collection of theories, rules and techniques

“This company uses statistics in their quality control system.”
“This faculty teaches statistics.”

3. Meaning for a “Statistician”

You will learn this once you excel at statistics.

4
Descriptive and Inferential Statistics

Two branches of statistics:

• Descriptive statistics
  - Collecting, summarizing, and processing data to transform data into information
• Inferential statistics
  - Provides the basis for predictions, forecasts, and estimates that are used to transform information into knowledge
5
Descriptive Statistics

• Collect data
  - e.g., survey
• Present data
  - e.g., tables and graphs
• Summarize data
  - e.g., sample mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
6
Inferential Statistics
• Estimation
  - e.g., estimate the population mean weight using the sample mean weight
• Hypothesis testing
  - e.g., test the claim that the population mean weight is 120 pounds

Inference is the process of drawing conclusions or making decisions about a population based on sample results.
7
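The Decision Making Process

(Flow diagram, starting from the problem and ending at the decision.)
• Begin here: identify the problem
• Data
  - turned into information using: descriptive statistics, probability, computers
• Information
  - turned into knowledge using: experience, theory, literature, inferential statistics, computers
• Knowledge
• Decision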
8
Key Definitions

• A population is the collection of all items of interest or under investigation
  - N represents the population size
• A sample is an observed subset of the population
  - n represents the sample size
• A parameter is a specific characteristic of a population
• A statistic is a specific characteristic of a sample

9
Population vs. Sample

Population: a b c d e f g h i j k l m n o p q r s t u v w x y z
Sample: b c g i n o r u y

• Values calculated using population data are called parameters
• Values computed from sample data are called statistics
10
Examples of Populations

• Names of all registered voters in Turkey
• Incomes of all families living in Fatih/Istanbul
• Annual returns of all stocks traded on the Istanbul Stock Exchange
• Grade point averages of all the students at Kadir Has University

11
Why “Sampling”?
• Less time consuming than a census
• Less costly to administer than a census
• It is possible to obtain statistical results of sufficiently high precision based on samples.

12
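Process of Statistical Data Analysis

(Flow diagram.)
1. Draw a random sample from the population.
2. Describe the sample with sample statistics.
3. Make inferences about the population from the sample statistics.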

13
Data Types

Data
• Qualitative (categorical): defined categories
  - Examples: marital status, political party, eye color
• Quantitative (numerical)
  - Discrete (counted items). Examples: number of children, defects per hour
  - Continuous (measured characteristics). Examples: weight, voltage
14
Data Types

• Time series data
  - Ordered data values observed over time
• Cross-section data
  - Data values observed at a fixed point in time

15
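Data Types

Sales (in $1000’s)

City        2003   2004   2005   2006
Atlanta      435    460    475    490
Boston       320    345    375    395
Cleveland    405    390    410    395
Denver       260    270    285    280

• Time series data: one row (a single city observed over 2003-2006)
• Cross-section data: one column (all cities observed in a single year)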
16
Measurement Levels

Quantitative data
• Ratio data: differences between measurements, true zero exists
• Interval data: differences between measurements but no true zero

Qualitative data
• Ordinal data: ordered categories (rankings, order, or scaling)
• Nominal data: categories (no ordering or direction)
17
Measurement Levels-EXAMPLES
• Nominal: sex, eye colour
  - Percentages, frequency, mode (most frequent value)
• Ordinal: socio-economic status: high (A), mid to high (B), low to mid (C), low (D)
  - Very commonly used as a Likert scale (e.g., in surveys, a statement followed by the options strongly agree, agree, neither agree nor disagree, disagree, strongly disagree)
  - In addition to nominal: median (middle) value, quartiles
• Addition, subtraction, multiplication and division are not meaningful for values measured at the nominal or ordinal level
18
Measurement Levels-EXAMPLES
• Interval: temperature, welfare, utility, IQ level
  - In addition to ordinal: mean and variance, but not any ratios (e.g., coefficient of variation)
  - Celsius degrees: 100 units between the freezing (0 °C) and boiling (100 °C) points of water
  - Fahrenheit degrees: 180 units between the freezing (32 °F) and boiling (212 °F) points of water
  - Different reference points (0 °C vs. 32 °F): $F = 32 + 1.8\,C$
  - 40 °C is NOT twice 20 °C, since the same two temperatures are 104 °F and 68 °F, and 104 is not twice 68.
  - Generally, $F_1 = 32 + 1.8\,C$ and $F_2 = 32 + 1.8\,(2C)$, hence $F_2 \neq 2F_1$.
19
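The interval-scale argument can be checked numerically. The following is a minimal sketch using the conversion F = 32 + 1.8 C from the slide; the function and variable names are illustrative assumptions, not part of the lecture.

```python
def celsius_to_fahrenheit(c: float) -> float:
    """Convert Celsius to Fahrenheit using F = 32 + 1.8 C."""
    return 32 + 1.8 * c


# The two temperatures used on the slide.
c1, c2 = 20.0, 40.0
f1, f2 = celsius_to_fahrenheit(c1), celsius_to_fahrenheit(c2)

# The ratio depends on the unit, so it carries no physical meaning
# on an interval scale.
ratio_celsius = c2 / c1
ratio_fahrenheit = f2 / f1

print(f"Celsius ratio:    {ratio_celsius:.3f}")
print(f"Fahrenheit ratio: {ratio_fahrenheit:.3f}")
```

The constant term 32 in the conversion is what breaks the ratio; a conversion without a constant term, such as km to miles, would leave it unchanged.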
Measurement Levels-EXAMPLES
• Ratio: units of kg, meter, TL
  - A true reference point exists (0 means the same thing regardless of the unit)
  - In addition to interval: any ratios (e.g., coefficient of variation)
  - 1 km = 0.6214 miles, 1 kg = 2.2046 pounds (there is no constant term in the conversion)
  - It may be discrete or continuous (mostly rounded numbers are used)

20
Measurement Levels-EXERCISE
• Occupation, City
• Education
• Price
• Likeness

21
Descriptive statistics
• Compute and interpret statistics describing the location of a set of values, such as the mean and median.
• Compute and interpret statistics describing the variability in a set of values, such as the range and standard deviation.
• Compute and interpret the measures of shape, skewness and kurtosis.
• Produce graphical displays of data.

22
Some Frequently Used Statistics and Parameters

                      SAMPLE STATISTICS   POPULATION PARAMETERS
MEAN                  $\bar{x}$           $\mu$
VARIANCE              $s^2$               $\sigma^2$
STANDARD DEVIATION    $s$                 $\sigma$
PROPORTION            $\hat{p}$           $p$

23
Measure of Location
• Descriptive statistics that locate the center of your data are called measures of central tendency
• Sample Mean
  - The sample mean of a set of n measurements (x1, x2, …, xn) is equal to the sum of the measurements divided by n.

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \dots + x_n}{n}$$
24
Measure of Location
• Median
  - Median: the “middle” value (also known as the 50th percentile)
  - The median of a set of n measurements (x1, x2, …, xn) is the value that falls in the middle position when the measurements are ordered from the smallest to the largest.

$$\tilde{x} = \begin{cases} x_{(n+1)/2} & \text{if } n \text{ is odd} \\ \dfrac{x_{n/2} + x_{n/2+1}}{2} & \text{if } n \text{ is even} \end{cases}$$

where x1, …, xn are arranged in increasing order of magnitude.
25

RULE FOR CALCULATING THE MEDIAN

1. Order the measurements from the smallest to the largest.
2. A) If the sample size is odd, the median is the middle measurement.
   B) If the sample size is even, the median is the average of the two middle measurements.
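The rule above can be sketched as a small function. This is an illustrative implementation, assuming a non-empty list of numbers; the function name and the two sample lists are chosen here for demonstration.

```python
def median(values):
    """Median by the two-step rule: sort, then take the middle value
    (odd n) or the average of the two middle values (even n)."""
    ordered = sorted(values)          # step 1: order the measurements
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sample is undefined")
    mid = n // 2
    if n % 2 == 1:                    # step 2A: odd sample size
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # step 2B: even sample size


odd_sample = [1, 3, 3, 4, 5, 8, 51]
even_sample = [7, 1, 10, 8, 4, 12]
print(median(odd_sample))
print(median(even_sample))
```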

26
Odd sample size: 1 3 3 4 5 8 51 (n=3 values on each side of the middle value), Median = 4
Even sample size: 1 3 3 4 5 8 (n=3 values in each half), Median = (3+4)/2 = 3.5

27
Example
A random sample of six values was taken from a population. These values were:

x1=7, x2=1, x3=10, x4=8, x5=4, and x6=12.

What are the sample mean and sample median for these data?

28
Example (cont’d)

$$\bar{x} = \frac{x_1 + x_2 + x_3 + x_4 + x_5 + x_6}{n} = \frac{7 + 1 + 10 + 8 + 4 + 12}{6} = 7$$

Ordered sample:

x2=1, x5=4, x1=7, x4=8, x3=10, x6=12

MEDIAN = (7 + 8) / 2 = 7.5

29
Example

Consider the following sample:

4 18 36 39 41 42 43 44 44 45
46 47 48 49 49 50 51 53 54 60

Which measure of central tendency best describes the central location of the data: THE SAMPLE MEAN OR SAMPLE MEDIAN? Why?

30
Example (cont’d)

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = 43.15$$

$$\tilde{x} = \frac{45 + 46}{2} = 45.5$$

The median best describes the central location.

31
Example (cont’d)
Why?
Because there is an outlier (extreme value), 4, in the data set, the mean is heavily influenced by this single outlier.
Solution:
Trimmed mean: drop the outlier and recalculate the mean.

$$\bar{x}_{\text{trim}} = \frac{\left(\sum_{i=1}^{n} x_i\right) - 4}{n - 1} = 45.21$$
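The three figures in this example can be reproduced with Python's standard `statistics` module. This is a sketch for checking the arithmetic; the variable names are illustrative, and "trimmed" here means only what the slide does (dropping the single outlier 4), not a general percentage-trimmed mean.

```python
from statistics import mean, median

# The 20-value sample from the example.
sample = [4, 18, 36, 39, 41, 42, 43, 44, 44, 45,
          46, 47, 48, 49, 49, 50, 51, 53, 54, 60]

sample_mean = mean(sample)
sample_median = median(sample)

# Drop the single outlier (the value 4) and recompute the mean.
outlier = 4
trimmed_mean = (sum(sample) - outlier) / (len(sample) - 1)

print(f"mean         = {sample_mean:.2f}")
print(f"median       = {sample_median:.2f}")
print(f"trimmed mean = {trimmed_mean:.2f}")
```

Removing one observation moves the mean by about two units, while the median of the remaining 19 values would barely move, which is why the median is the more robust summary here.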
32
Mode
• A measure of location
• The value that occurs most often
• Not affected by extreme values
• Used for either numerical or categorical data
• There may be no mode
• There may be several modes

(Two dot plots: one on a 0 to 14 axis with Mode = 5, one on a 0 to 6 axis with No Mode.)
33
Mode (cont’d)
• What is the mode for the previous example (slide 30)?
  - 44 (occurs twice)
  - 49 (occurs twice)

34
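A sketch of how the modes could be found by counting, including the "no mode" case where every value occurs once. The helper name `modes` and the convention of returning an empty list for "no mode" are assumptions made for this illustration.

```python
from collections import Counter


def modes(values):
    """Return all most-frequent values, or [] when no value repeats."""
    if not values:
        return []
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:                       # every value occurs once: no mode
        return []
    return sorted(v for v, c in counts.items() if c == top)


sample = [4, 18, 36, 39, 41, 42, 43, 44, 44, 45,
          46, 47, 48, 49, 49, 50, 51, 53, 54, 60]
print(modes(sample))                   # two modes
print(modes([0, 1, 2, 3, 4, 5, 6]))    # no mode
```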
Distributions
• When you examine the distribution of values, you can determine
  - the range of possible data values
  - the frequency of data values
  - whether the data values accumulate in the middle of the distribution or at one end.
• The median, mean and mode are related to the shape of the distribution.

35
Measures of Central Tendency: Shape of a Distribution
• Describes how data is distributed
• Symmetric or skewed
• The relations below apply to many unimodal distributions (not to multimodal ones)

• Left-skewed: Mean < Median < Mode (longer tail extends to left)
• Symmetric: Mode = Mean = Median
• Right-skewed: Mode < Median < Mean (longer tail extends to right)
36
Percentiles and Quartiles

Percentiles
The pth percentile in a data array (where 0 ≤ p ≤ 100):
• p% of the values are less than or equal to this value
• (100 – p)% of the values are greater than or equal to this value

Quartiles
• 1st quartile = 25th percentile
• 2nd quartile = 50th percentile = median
• 3rd quartile = 75th percentile

37
Percentiles and Quartiles

Data (ordered from largest to smallest): 98 95 92 90 85 81 79 70 63 55 47 42

• 75th percentile = 91 (third quartile)
• 50th percentile = 80 (median)
• 25th percentile = 59 (first quartile)

Quartiles break your data up into quarters.
38
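The quartile values on this slide are consistent with taking the median of the lower half and of the upper half of the ordered data. The sketch below assumes that convention; other conventions (including the defaults of some statistical packages) interpolate differently and can give slightly different quartiles. Function names are illustrative.

```python
def median_sorted(ordered):
    """Median of an already sorted, non-empty list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves convention.

    For an odd sample size the middle value is left out of both halves.
    Requires at least two values.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]
    return median_sorted(lower), median_sorted(ordered), median_sorted(upper)


data = [98, 95, 92, 90, 85, 81, 79, 70, 63, 55, 47, 42]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3)
```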
Weighted Mean
• Used when values are grouped by frequency or relative importance

Example: Sample of 26 Repair Projects

Days to Complete   Frequency
5                  4
6                  12
7                  8
8                  2

Weighted Mean Days to Complete:

$$\bar{X}_W = \frac{\sum w_i x_i}{\sum w_i} = \frac{(4 \times 5) + (12 \times 6) + (8 \times 7) + (2 \times 8)}{4 + 12 + 8 + 2} = \frac{164}{26} = 6.31 \text{ days}$$
39
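The weighted mean for the repair-project example can be computed as follows. This is a minimal sketch; the function name is illustrative, and the frequencies play the role of the weights.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


days_to_complete = [5, 6, 7, 8]
frequency = [4, 12, 8, 2]        # 26 repair projects in total

x_w = weighted_mean(days_to_complete, frequency)
print(f"weighted mean = {x_w:.2f} days")
```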
Measures of Variation

• Measures of variation give information on the spread or variability of the data values.

(Figure: two distributions with the same center but different variation.)
40
The Spread of a Distribution: Variation

Measure                    Definition
range                      the difference between the maximum and minimum data values
interquartile range        the difference between the 25th and 75th percentiles (IR or IQR)
variance                   a measure of dispersion of the data around the mean
standard deviation         a measure of dispersion expressed in the same units of measurement as your data (the square root of the variance)
coefficient of variation   standard deviation as a percentage of the mean

41
Range
• Simplest measure of variation
• Difference between the largest and the smallest observations:

Range = $x_{\text{maximum}} - x_{\text{minimum}}$

Example (dot plot on a 0 to 14 axis):

Range = 14 - 1 = 13
42
Disadvantages of the Range
• Ignores the way in which data are distributed

(Two dot plots on a 7 to 12 axis with different distributions; in both, Range = 12 - 7 = 5.)

• Sensitive to outliers

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
43
Interquartile Range

• Can eliminate some outlier problems by using the interquartile range
• Eliminate some high- and low-valued observations and calculate the range from the remaining values.
• Interquartile range = 3rd quartile – 1st quartile

44
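To see why the interquartile range resists outliers, the sketch below compares the range and the IQR on two data sets that differ only in their largest value (5 versus 120). The quartiles are computed as medians of the lower and upper halves, which is an assumed convention; other quartile conventions give somewhat different IQR values but the same qualitative result.

```python
def median_sorted(ordered):
    """Median of an already sorted, non-empty list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def data_range(values):
    return max(values) - min(values)


def interquartile_range(values):
    """Q3 - Q1 with quartiles taken as medians of the two halves
    (requires at least two values)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    return median_sorted(ordered[-half:]) - median_sorted(ordered[:half])


base = [1] * 11 + [2] * 8 + [3] * 4 + [4]
without_outlier = base + [5]
with_outlier = base + [120]

for name, data in [("max = 5", without_outlier), ("max = 120", with_outlier)]:
    print(name, "range =", data_range(data), "IQR =", interquartile_range(data))
```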
Variance and Standard Deviation
• The variance is a measure of variation ($\sigma^2$ or $s^2$).
• The square root of the variance, or standard deviation ($\sigma$ or $s$), is a measure of variation in terms of the original linear scale (most commonly used).
  - $\sigma = \sqrt{\sigma^2}$ is the population standard deviation.
  - $s = \sqrt{s^2}$ is the sample standard deviation.

45
Measures of Variability (Population)
• Population Range
  $X_{\max} - X_{\min}$
• Population Variance

$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} = \frac{\sum_{i=1}^{N} x_i^2}{N} - \mu^2$$

• Population Standard Deviation

$$\sigma = \sqrt{\sigma^2}$$

46
PROOF
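The content of this proof slide did not survive in the text. Assuming it refers to the equality of the two forms of the population variance, a standard derivation runs as follows, using only the definition $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$:

```latex
\begin{align*}
\sigma^2 &= \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2 \\
         &= \frac{1}{N}\sum_{i=1}^{N} \left( x_i^2 - 2\mu x_i + \mu^2 \right) \\
         &= \frac{1}{N}\sum_{i=1}^{N} x_i^2 \;-\; 2\mu \cdot \frac{1}{N}\sum_{i=1}^{N} x_i \;+\; \mu^2 \\
         &= \frac{\sum_{i=1}^{N} x_i^2}{N} - 2\mu^2 + \mu^2 \\
         &= \frac{\sum_{i=1}^{N} x_i^2}{N} - \mu^2
\end{align*}
```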

47
Measures of Variability (Sample)
• Sample Range
  $X_{\max} - X_{\min}$
• Sample Variance

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}$$

• Sample Standard Deviation

$$s = \sqrt{s^2}$$
48
Measures of Variability (Sample)

Obs.    $x_i$   $x_i - \bar{x}$   $(x_i - \bar{x})^2$   $x_i^2$
1       7       0                 0                     49
2       1       -6                36                    1
3       10      3                 9                     100
4       8       1                 1                     64
5       4       -3                9                     16
6       12      5                 25                    144
Total   42                        80                    374
49
Sample Variance

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{80}{5} = 16$$

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1} = \frac{374 - \frac{42^2}{6}}{5} = \frac{80}{5} = 16$$
50
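Both forms of the sample variance can be checked on the six-value sample 7, 1, 10, 8, 4, 12. This is a sketch for verifying the arithmetic; variable names are illustrative, and `statistics.variance` from the Python standard library (which divides by n - 1) is used as an independent cross-check.

```python
from math import sqrt
from statistics import variance

x = [7, 1, 10, 8, 4, 12]
n = len(x)
x_bar = sum(x) / n

# Definitional form: squared deviations from the mean, divided by n - 1.
sum_sq_dev = sum((xi - x_bar) ** 2 for xi in x)
s2_definition = sum_sq_dev / (n - 1)

# Computational form: sum of squares minus (sum)^2 / n, divided by n - 1.
sum_x = sum(x)
sum_x2 = sum(xi ** 2 for xi in x)
s2_shortcut = (sum_x2 - sum_x ** 2 / n) / (n - 1)

s = sqrt(s2_definition)

print(sum_sq_dev, sum_x, sum_x2)
print(s2_definition, s2_shortcut, variance(x), s)
```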
Sample Variance
• Calculate the sample variance by averaging with n-1 instead of n.

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

• n-1 is called the degrees of freedom associated with the variance estimate. It is the number of independent pieces of information available for computing variability.
51
Comparing Standard Deviations

Same mean, but different standard deviations (three dot plots on an 11 to 21 axis):

Data set   Mean   s
Data A     15.5   3.338
Data B     15.5   0.9258
Data C     15.5   4.57
52
Coefficient of Variation
• Measures relative variation
• Always in percentage (%)
• Shows variation relative to mean
• It is used to compare two or more sets of data measured in different units

Population: $CV = \left(\frac{\sigma}{\mu}\right) \cdot 100\%$

Sample: $CV = \left(\frac{s}{\bar{x}}\right) \cdot 100\%$
53
Comparing Coefficients of Variation
• Stock A:
  - Average price last year = $50
  - Standard deviation = $5

$$CV_A = \left(\frac{s}{\bar{x}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%$$

• Stock B:
  - Average price last year = $100
  - Standard deviation = $5

$$CV_B = \left(\frac{s}{\bar{x}}\right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\%$$

Both stocks have the same standard deviation, but stock B is less variable relative to its price.
54
Presentation of Data
• Tables
• Graphs
• Frequency displays and Histograms
• Stem-leaf display
Stem and Leaf Diagram

• A simple way to see distribution details from quantitative data

METHOD
1. Separate the sorted data series into leading digits (the stem) and the trailing digits (the leaves)
2. List all stems in a column from low to high
3. For each stem, list all associated leaves
Example:

Data sorted from low to high:

12, 13, 17, 21, 24, 24, 26, 27, 28, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

• Here, use the 10’s digit for the stem unit:

                     Stem | Leaf
• 12 is shown as        1 | 2
• 35 is shown as        3 | 5
Example:

Data in ordered array:

12, 13, 17, 21, 24, 24, 26, 27, 28, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

• Completed stem-and-leaf diagram:

Stem | Leaves
1    | 2 3 7
2    | 1 4 4 6 7 8
3    | 0 2 5 7 8
4    | 1 3 4 6
5    | 3 8
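The three-step method can be sketched for two-digit integer data, with the 10's digit as the stem and the 1's digit as the leaf. The function name and the dictionary representation are assumptions made for illustration.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Map each stem (10's digit) to its sorted list of leaves (1's digit).

    Assumes non-negative integers.
    """
    stems = defaultdict(list)
    for v in sorted(values):           # step 1: sort, split into stem and leaf
        stem, leaf = divmod(v, 10)
        stems[stem].append(leaf)
    return dict(sorted(stems.items())) # step 2: stems from low to high


data = [12, 13, 17, 21, 24, 24, 26, 27, 28, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

diagram = stem_and_leaf(data)
for stem, leaves in diagram.items():   # step 3: list the leaves for each stem
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```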
Using other stem units
• Using the 100’s digit as the stem:
  - Round off the 10’s digit to form the leaves

                          Stem | Leaf
• 613 would become           6 | 1
• 776 would become           7 | 8
• ...
• 1224 becomes              12 | 2
Construction of a Stem-Leaf Display
• List the stem values, in order, in a vertical column
• Draw a vertical line to the right of the stem values
• For each observation, record the leaf portion of the observation in the row corresponding to the appropriate stem
• Reorder the leaves from the lowest to highest within each stem row
• If the number of leaves appearing in each stem is too large, divide the stems into two groups, the first corresponding to leaves 0 through 4, and the second corresponding to leaves 5 through 9. (This subdivision can be increased to five groups if necessary.)
EXAMPLE: Car Battery Life

2.2 4.1 3.5 4.5 3.2 3.7 3.0 2.6
3.4 1.6 3.1 3.3 3.8 3.1 4.7 3.7
2.5 4.3 3.4 3.6 2.9 3.3 3.9 3.1
3.3 3.1 3.7 4.4 3.2 4.1 1.9 3.4
4.7 3.8 3.2 2.6 3.9 3.0 4.2 3.5

Stem and Leaf Plot of Battery Life

STEM | LEAF                      | Frequency
1    | 69                        | 2
2    | 25669                     | 5
3    | 0011112223334445567778899 | 25
4    | 11234577                  | 8
Relative Frequency Distribution
• Group data into different classes or intervals
  - Counting leaves belonging to each stem
  - Each stem defines a class interval
• Dividing each class frequency by the total number of observations gives the proportion of the set of observations in each of the classes.
Relative Frequency Distribution of Battery Life

Class Interval   Class midpoint   Frequency, f   Relative frequency
1.5-1.9          1.7              2              0.050
2.0-2.4          2.2              1              0.025
2.5-2.9          2.7              4              0.100
3.0-3.4          3.2              15             0.375
3.5-3.9          ?                ?              ?
4.0-4.4          ?                ?              ?
4.5-4.9          ?                ?              ?
Relative Frequency Distribution of Battery Life (cont’d)

Class Interval   Class midpoint   Frequency, f   Relative frequency
1.5-1.9          1.7              2              0.050
2.0-2.4          2.2              1              0.025
2.5-2.9          2.7              4              0.100
3.0-3.4          3.2              15             0.375
3.5-3.9          3.7              10             0.250
4.0-4.4          4.2              5              0.125
4.5-4.9          4.7              3              0.075
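The table can be rebuilt from the 40 battery-life observations. The sketch below works in tenths of a unit (integers) so that class membership does not depend on floating-point rounding; that choice, and all names used, are assumptions of this illustration rather than part of the lecture.

```python
battery_life = [
    2.2, 4.1, 3.5, 4.5, 3.2, 3.7, 3.0, 2.6,
    3.4, 1.6, 3.1, 3.3, 3.8, 3.1, 4.7, 3.7,
    2.5, 4.3, 3.4, 3.6, 2.9, 3.3, 3.9, 3.1,
    3.3, 3.1, 3.7, 4.4, 3.2, 4.1, 1.9, 3.4,
    4.7, 3.8, 3.2, 2.6, 3.9, 3.0, 4.2, 3.5,
]

n = len(battery_life)
tenths = [round(x * 10) for x in battery_life]   # e.g. 2.2 -> 22

table = []
for lower in range(15, 50, 5):                   # classes 1.5-1.9, ..., 4.5-4.9
    upper = lower + 4
    freq = sum(1 for t in tenths if lower <= t <= upper)
    midpoint = (lower + upper) / 20              # midpoint in original units
    table.append((lower / 10, upper / 10, midpoint, freq, freq / n))

print("Class      Midpoint  f   Relative f")
for lo, hi, mid, f, rel in table:
    print(f"{lo:.1f}-{hi:.1f}    {mid:.1f}      {f:<3} {rel:.3f}")
```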

EXERCISE: Compute the sample mean and standard deviation

Picturing Distributions: Histogram
• Each bar in the histogram represents a group of values (a bin).
• The height of the bar is the percent of values in the bin.

(Figure: histogram with PERCENT on the vertical axis and Bins on the horizontal axis.)

Relative Frequency Histogram of Battery Life
How Many Class Intervals?

• Many (narrow class intervals)
  - may yield a very jagged distribution with gaps from empty classes
  - can give a poor indication of how frequency varies across classes

(Figure: histogram of Frequency against Temperature with narrow classes, axis ticks 4, 8, ..., 60, More.)

• Few (wide class intervals)
  - may compress variation too much and yield a blocky distribution
  - can obscure important patterns of variation.

(Figure: histogram of Frequency against Temperature with wide classes, axis ticks 0, 30, 60, More.)
General Guidelines

Number of Data Points   Number of Classes
under 50                5 - 7
50 – 100                6 - 10
100 – 250               7 - 12
over 250                10 - 20

• Class widths can typically be reduced as the number of observations increases
• Distributions with numerous observations are more likely to be smooth and have gaps filled since data are plentiful
• Horizontal vs. vertical bars
Measures of Shape: Skewness

(Figure: three frequency curves labeled Skewed to Left, Symmetric, and Skewed to Right.)
Summary
• Basics of descriptive statistics
• Tables and graphs
• Inferential statistics
• Textbook Reading
  - Chapter 1 (pages 1-28)
  - Chapter 8 (pages 229-243)
• Motion Charts (link)
