
STAT 3022 Data Analysis Class Slides 1

This document outlines chapter 1 of the STAT 3022 course on drawing statistical conclusions. It introduces common summary statistics like the sample mean, median, and standard deviation. It also discusses using histograms and numerical summaries to analyze starting salary data from men and women at a bank, to determine if there was gender discrimination. The chapter then covers the normal distribution as the most important probability distribution, and how the central limit theorem shows that sample means are approximately normally distributed.

Uploaded by

Yang Yi
Copyright
© Attribution Non-Commercial (BY-NC)

Chapter 1 Drawing Statistical Conclusions

STAT 3022
School of Statistics, University of Minnesota

January 27, 2013

Outline

Some Basics
Summary statistics

sample mean: $\bar{X} = \sum_{i=1}^{n} X_i / n$

sample standard deviation: $s = \sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 / (n - 1)}$

median, $Q_1$, $Q_3$, interquartile range (IQR): $\mathrm{IQR} = Q_3 - Q_1$
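As a minimal sketch, the summary statistics on this slide can be computed in R both from their definitions and with the built-in functions. The vector `x` is a hypothetical sample chosen only for illustration; it is not data from the course. Note that R's `quantile()` uses one particular interpolation rule by default, so quartiles computed by hand with another convention may differ slightly.

```r
# Hypothetical sample, chosen only for illustration (not course data)
x <- c(2, 4, 4, 5, 7, 9, 11)
n <- length(x)

# Definitions written out explicitly
x_bar <- sum(x) / n                          # sample mean
s     <- sqrt(sum((x - x_bar)^2) / (n - 1))  # sample standard deviation

# Built-in equivalents
mean(x)
sd(x)
median(x)

# Q1 and Q3 under R's default quantile rule, and their difference
q   <- quantile(x, c(0.25, 0.75))
iqr <- unname(q[2] - q[1])
IQR(x)  # same value as iqr
```

The hand-written `x_bar` and `s` agree with `mean(x)` and `sd(x)`, which confirms that `sd()` divides by n - 1 rather than n.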

STAT 3022 | Chapter 1 Drawing Statistical Conclusions


Some Basics

Roll a six-sided die 3 times; outcome: 1, 2, 6

Sample mean: $\bar{X} = \frac{1 + 2 + 6}{3} = 3$

Population mean: $\mu = ?$

Law of large numbers: $\bar{X}_n \to \mu$ as $n \to \infty$
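A short simulation can illustrate the law of large numbers for this example. It is only a sketch, and it assumes a fair die, in which case the population mean is (1 + 2 + ... + 6)/6 = 3.5. The seed and the number of rolls are arbitrary choices.

```r
set.seed(3022)  # arbitrary seed, for reproducibility only

# Assumption: a fair six-sided die, so the population mean is 3.5
mu <- mean(1:6)

n_rolls <- 10000
rolls   <- sample(1:6, n_rolls, replace = TRUE)

# Running sample mean after 1, 2, ..., n_rolls rolls
running_mean <- cumsum(rolls) / seq_along(rolls)

# The running mean settles near mu as the number of rolls grows
running_mean[c(3, 10, 100, 1000, 10000)]
```

With only 3 rolls the sample mean can be far from 3.5 (as in the outcome 1, 2, 6 above), while after thousands of rolls it should be very close.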


Some Basics
Graphical summary


Case Study: Observational Experiment

Question: Did a bank discriminatorily pay higher starting salaries to men than to women?

Data: Beginning salaries for 32 men and 61 women, all skilled, entry-level employees hired between 1969 and 1977.

Perform exploratory data analysis using graphical and numerical summaries of the data.


Graphical Summary
[Figure: two histograms of starting salary, one panel for Male and one for Female. x-axis: Starting Salary (3000 to 9000); y-axis: Frequency.]


Interpreting Histograms

Relative frequency histograms allow us to visually display general characteristics of the data distribution of a particular variable:
Central tendency - Do men tend to be paid higher than women?
Spread - What is the range of most salaries?
Symmetry - Is there a skew in either distribution?
Are there any outliers?
Histograms are used to show broad features, not exquisite detail.
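The following sketch shows one way to obtain relative frequencies from a histogram in R. The `salary` vector is simulated and purely hypothetical; it is not the bank data from the case study, and its mean and spread are arbitrary assumptions.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Simulated, hypothetical salaries; NOT the bank data from the case study
salary <- round(rnorm(61, mean = 5100, sd = 500))

# Bin the data without drawing, using R's default choice of breaks
h <- hist(salary, plot = FALSE)

# Convert bin counts to relative frequencies (they sum to 1)
rel_freq <- h$counts / sum(h$counts)

# One row per bin: midpoint of the bin and its relative frequency
data.frame(midpoint = h$mids, rel_freq = round(rel_freq, 3))
```

Calling `hist(salary)` without `plot = FALSE` draws the histogram itself; the table above is just the numerical content behind such a plot.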


Numerical Summary

lec1_1.R lec1_2.R in-class


Normal Distribution
bell shaped, defined by the formula $\frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$

two parameters: mean $\mu$, variance $\sigma^2$ (standard deviation $\sigma = \sqrt{\sigma^2}$)


Normal Distribution
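Normal distribution $N(\mu, \sigma)$ is defined by

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

Standard normal distribution $N(0, 1)$

$$\phi(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2} x^2}$$

Why is the standard normal distribution important?

$$f(x) = \frac{1}{\sigma}\, \phi\!\left(\frac{x - \mu}{\sigma}\right)$$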
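The relation between a general normal density and the standard normal density can be checked numerically. This is a small sketch; the values of `mu`, `sigma` and `x` are arbitrary assumptions used for illustration.

```r
# Illustrative values only; mu, sigma and x are assumptions, not from the slides
mu    <- 2
sigma <- 3
x     <- c(-1, 0, 2, 4.5)

# Density of the normal distribution with mean mu and standard deviation sigma
f_direct <- dnorm(x, mean = mu, sd = sigma)

# Same density written through the standard normal: (1/sigma) * phi((x - mu)/sigma)
f_from_phi <- dnorm((x - mu) / sigma) / sigma

all.equal(f_direct, f_from_phi)

# The same standardization works for probabilities
p_direct   <- pnorm(x, mean = mu, sd = sigma)
p_from_std <- pnorm((x - mu) / sigma)
all.equal(p_direct, p_from_std)
```

This is why a single table (or function) for N(0, 1) is enough: any normal calculation can be reduced to it by standardizing.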


Normal Distribution

Why is the normal distribution so important?


Normal Distribution
What is this distribution?


Central Limit Theorem
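The content of this slide did not survive extraction, so the following is only an illustrative sketch of the central limit theorem, not a reconstruction of the slide. It assumes a fair six-sided die, whose population mean is 3.5 and whose variance is 35/12. The sample size and number of replications are arbitrary choices.

```r
set.seed(2013)  # arbitrary seed, for reproducibility only

# Assumption: a fair six-sided die
mu    <- mean(1:6)             # population mean, 3.5
sigma <- sqrt(mean((1:6 - mu)^2))  # population standard deviation, sqrt(35/12)

n      <- 30    # rolls per sample
n_reps <- 5000  # number of samples

# Draw many samples and keep the mean of each one
sample_means <- replicate(n_reps, mean(sample(1:6, n, replace = TRUE)))

# The sample means should be centred at mu with spread close to sigma / sqrt(n)
mean(sample_means)
sd(sample_means)
sigma / sqrt(n)

# Rough normality check: share of sample means within 1.96 standard errors of mu
mean(abs(sample_means - mu) <= 1.96 * sigma / sqrt(n))
```

Although a single roll is uniform on 1 to 6, a histogram of `sample_means` looks approximately bell shaped, and the last proportion should be close to 0.95.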


> dnorm(0, mean = 0, sd = 1) # density
[1] 0.3989423
> dnorm(0, mean = 0, sd = 2)
[1] 0.1994711
>
> pnorm(1, mean = 0, sd = 1) # distribution function
[1] 0.8413447
> pnorm(1, mean = 0, sd = 1, lower.tail = FALSE)
[1] 0.1586553
>
> qnorm(0.5, mean = 2, sd = 1) # quantile function
[1] 2
> qnorm(0, mean = 2, sd = 1)
[1] -Inf
>
> rnorm(5, mean = 0, sd = 1) # random generation
[1] 2.2867947 1.3311000 1.9408290 -0.5366956 1.1687528
> rnorm(5)
[1] -0.48693373 0.02950848 -1.03232990 -0.24314950 -0.42515522


???
