Business Statistics: Lecture 1: Course Introduction & Descriptive Statistics

This document provides an introduction to a business statistics course, including: - The professor's contact information and background. - An overview of the course goals, outline, texts, and software. Key topics to be covered are descriptive statistics, probability, confidence intervals, and hypothesis testing. - Descriptions of important statistical concepts like populations versus samples, types of data, notation, and measures of central tendency like the sample mean.

Uploaded by

tacamp da

Business Statistics

Lecture 1: Course Introduction & Descriptive Statistics

1
Goals for this Lecture

• Introduce professor & course


• Define some basic statistics terminology
• Populations vs. samples
• Descriptive vs. inferential statistics
• Numerical descriptive statistics
• Measures of location
• Measures of dispersion
• Short introduction to JMP

2
Contact Information

• Professor Ron Fricker


• Phone: 831-869-8414
• E-mail: rdfricker@nps.edu
• Located in Monterey

Call or e-mail anytime!


3
A Little Bit About Me...
• Academic credentials
• Ph.D. and M.A. in Statistics, Yale University
• M.S. in Ops Research, The George Washington University
• B.S. in Mathematics from the United States Naval Academy
• Teaching credentials
• Started teaching post-graduate courses in mid-80s
• Have taught at NPS, RAND Graduate School, and USC
• “Real world” credentials
• Former active duty naval officer
• Commercial managerial experience
• Two defense-related organizations
• One non-profit
• Can find out more at http://faculty.nps.edu/rdfricke/

4
Course Goals
• Be able to:
• Apply basic statistical methods to business
problems
• Understand more advanced statistical
techniques and how they are properly
applied
• Judge good statistics and statistical
practice from bad
• Know when to call in statistical experts

5
Course Outline
• Eleven lectures over nine class
meetings:
• Descriptive statistics
• Basic probability
• Confidence intervals
• Hypothesis testing

• See the course syllabus for class policies


• Course website:
http://faculty.nps.edu/rdfricke/Business_Stats.htm
6
Course Texts & Resources

• Course texts:

• Business Statistics by Downing and Clark


• Basic Business Statistics: A Casebook by Foster,
Stine and Waterman
• If supplemental reading is needed, I recommend The Cartoon Guide to Statistics by Gonick and Smith
• It’s a rigorous treatment of the material, but done in a very accessible style
• Course software: Excel & JMP
7
Descriptive Statistics

• Numerical
• Mean, median, mode
• Variance, standard deviation, range
• Graphical
• Histograms
• Boxplots
• Scatter plots

8
Probability
• Basic concepts
• Discrete distributions
• Continuous distributions
• Conditional probability

9
Inferential Statistics
• Point Estimation
• Interval Estimation
• E.g., confidence intervals
• Hypothesis testing
• Testing sample means and variances

10
How to Study Statistics
• Do the reading in multiple passes
• First skim for major ideas before the lecture
• After the lecture, go back for details
• Re-read as necessary to solidify concepts
• Do practice problems (homework)!
• Only after first completing reading assignment
• If necessary, make up simple data to see what
equations are doing
• Don’t just depend on your colleagues
to explain the concepts to you…
11
How Not to Study for this Course
Calvin & Hobbes by Bill Watterson

12
“Statistics”
• “Statistics” has two uses in English:
• Can mean “a collection of numerical data”
• Also refers to a branch of mathematics that
deals with the analysis of statistical data
• This class is all about the latter
• Though we must use “collections of
numerical data” to do our analyses

13
Why Study Statistics?
• The world is an uncertain place
• Your company is recruiting a new CEO.
What compensation should you offer?
• What GMAT score do you need to get into an MBA program?
• Statistics gives you the tools to make
informed decisions in uncertain
conditions

14
Statistics Uses Data

• Statistics attacks uncertainty with data


• CEO: Salaries of other CEOs
• GMAT: Other students’ scores

• Statistics turns raw data into information that speaks to your question

15
Variability
• Statistics is more than tabulating numbers
• Data exhibit variability
• CEOs have different backgrounds, work in different industries, etc.
• Students vary in ability and luck
• Standard statistics question: “Given the data I
have seen, what is the truth likely to be?”

Understanding and describing variability is one of the main jobs of statistics
16
Some Types of Variation
• Cross sectional
• Data are a snapshot in time
• Use one variable to explain another
• Time series (also called longitudinal)
• Trend (long run changes)
• Seasonality (retail sales up in December)
• Random
• Not explained by anything
• That’s why we call it random!

17
Samples versus Populations
• A population consists of all possible
observations
• Example: All students enrolled in an MBA
program
• A sample is a subset of the population
• Example: Global MBA students are a
sample of all MBA students
• A random sample is a subset drawn from the population by chance rather than in any systematic way
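As a rough illustration of drawing a random sample, the sketch below uses `random.sample` from Python's standard library. The population of scores is made up for illustration only, and the names `population` and `sample` are assumptions, not part of the course materials.

```python
import random

# Hypothetical population: made-up scores for 1,000 MBA students (not real data).
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [rng.randint(400, 800) for _ in range(1000)]

# Simple random sample without replacement:
# every subset of 25 students is equally likely to be chosen.
sample = rng.sample(population, 25)

print(len(population), len(sample))
```

Because the draw is left to chance, no feature of a student (such as program or score) influences who ends up in the sample.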
18
Samples versus Populations

Population: All possible CEO incomes
Sample: The CEO incomes we’ve observed

19
Why Sample?
If we could see these:
• The TV viewing preferences for every individual in the US
• The diameter of every shaft ever produced by a manufacturing process
• The proportion of potential customers who know of your product
We wouldn’t need these:
• Nielsen survey of a sample of US television viewers
• The diameters of 100 shafts produced by the same process
• The proportion of individuals in a survey claiming knowledge of your product

Collecting data for whole populations can be expensive and/or impossible
20
Two Roles of Statistics
• Descriptive: Describing a sample or
population
• Numbers: (mean, variance, mode)
• Pictures: (histogram, boxplot)
• Inferential: Using a sample to infer facts
about a population
• Making guesses (average income of MBAs)
• Testing theories (does an MBA increase your
income?)

21
A Descriptive Question:

Population: All possible CEO incomes
Sample: The CEO incomes we’ve observed

What is the average CEO income in our sample?


22
An Inferential Question:

Population: All possible CEO incomes
Sample: The CEO incomes we’ve observed

Given what we have observed, what can we say about the average CEO income for the population?
23
Types of Data
• Continuous: Can divide by any number
and result still makes sense
• Examples: Salary, height, weight, age, etc.
• Categorical:
• Nominal: unordered categories
• Example: Country of origin, product color
• Ordinal: ordered categories
• Example: Small, medium, large
• Different types described in different
ways
24
Types of Data

Data
• Qualitative
• Quantitative: Discrete or Continuous

25
Notation
• Capital roman letters usually represent an unknown quantity
• Example: What is the outcome of a die roll?
• Label this outcome “X”
• X can be 1, 2, 3, 4, 5, or 6
• A small i subscripted on a letter represents a series of observations
• Example: The die is rolled many times
• $X_i$ is the outcome from the ith roll
26
Notation
• A Greek capital letter sigma ($\Sigma$) means to sum up
• Subscripts tell what to sum
• Example: $\sum_{i=1}^{3} X_i = X_1 + X_2 + X_3$
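The summation notation maps directly onto a loop. Here is a minimal sketch in Python, where the three roll outcomes are made-up values chosen only to show the indexing.

```python
# Hypothetical outcomes of three dice rolls: X_1, X_2, X_3 (made-up values).
X = [4, 2, 6]

# Sum over i = 1..3. Python lists are 0-indexed, so X_i is stored at X[i - 1].
total = sum(X[i - 1] for i in range(1, 4))

print(total)  # 12
```

The subscript range under and over the sigma becomes the `range(1, 4)` in the code: start at 1, stop after 3.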

27
Continuous Data
• Numerical Summaries
• Location:
Mean, median
• Spread or variability:
Variance, standard deviation, range, percentiles,
quartiles, interquartile range
• Graphical Descriptions (next class)
• Histogram
• Boxplot
• Scatterplot
28
Sample Mean ($\bar{x}$)
• Sample average or sample mean
• Often denoted by $\bar{x}$ (spoken “x-bar”)
• From previous example: $\bar{x} = \frac{1}{3} \sum_{i=1}^{3} x_i = \frac{x_1 + x_2 + x_3}{3}$
• In general: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
Excel tip. Use the built-in function:
= AVERAGE ( cell reference )
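For readers working outside Excel, the same calculation can be sketched in Python. The data are the four sample numbers used in this lecture's variance example, and `statistics.mean` plays the role of AVERAGE here.

```python
from statistics import mean

x = [1, 3, 7, 9]      # sample numbers from the variance example in this lecture
n = len(x)            # sample size n

x_bar = sum(x) / n    # (1/n) * sum of x_i

print(x_bar)          # 5.0
print(mean(x) == x_bar)
```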
29
Population Mean ($\mu$)
• Population mean
• Often denoted by $\mu$ (Greek letter “mu”)
• In general: $\mu = \frac{1}{N} \sum_{i=1}^{N} x_i$

Excel tip. Built-in AVERAGE function works for both samples and populations

30
The Median
• The median is the “typical” value
• Steps to calculate the median:
• Order your data from smallest to largest
• If the number of observations is odd, the middle observation is the median
Example: 1 3 5 6 12 12 99 (median = 6)
• If the number is even, then the average of the two middle observations is the median
Example: 1 3 5 6 12 12 (median = (5 + 6)/2 = 5.5)
Excel tip. Built-in function:
= MEDIAN ( cell reference )
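The steps above can be sketched as a short Python function. The name `slide_median` is made up for this illustration; the standard library's `statistics.median` gives the same answers.

```python
from statistics import median

def slide_median(data):
    """Median by the steps on the slide: sort, then take the middle value(s)."""
    ordered = sorted(data)          # order from smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                  # odd count: the middle observation
        return ordered[mid]
    # even count: average of the two middle observations
    return (ordered[mid - 1] + ordered[mid]) / 2

odd = [1, 3, 5, 6, 12, 12, 99]
even = [1, 3, 5, 6, 12, 12]

print(slide_median(odd))    # 6
print(slide_median(even))   # 5.5
```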
31
Mean vs. Median

• Both are measures of location or “central tendency”
• But, median less affected by outliers
• Example:
• Imagine a sample of data: 0, 0, 0, 1, 1, 1, 2, 2, 2
• Median=mean=1
• Another sample of data: 0, 0, 0, 1, 1, 1, 2, 2, 83
• Median still equals 1, but mean=10!
• Which to use? Depends on whether you are:
• characterizing a “typical” observation (the median)
• or describing the average value (the mean)
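The outlier example can be checked quickly in Python with the two samples from the slide. The variable names `a` and `b` are just labels for this sketch.

```python
from statistics import mean, median

a = [0, 0, 0, 1, 1, 1, 2, 2, 2]    # first sample from the slide
b = [0, 0, 0, 1, 1, 1, 2, 2, 83]   # same sample with one outlier (83)

# In sample a the two measures agree: both equal 1.
print(mean(a), median(a))

# In sample b the median stays at 1, while the mean is pulled up to 10.
print(mean(b), median(b))
```

Changing a single observation moved the mean by a factor of ten and left the median untouched, which is what "less affected by outliers" means in practice.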
32
Sample Variance ($s^2$)
• Sample variance measures data variability
• For n observations, the sample variance is
$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
Excel tip. Built-in function for sample variance
= VAR ( cell reference )

33
Population Variance ($\sigma^2$)
• Population variance measures data variability too
• For N observations, the population variance is
$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$
Excel tip. Built-in function for population variance
= VARP ( cell reference )
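To see how the two divisors differ, here is a small Python sketch using the standard library's `statistics.variance` (divides by n - 1, like VAR) and `statistics.pvariance` (divides by N, like VARP). The data are borrowed from the mean vs. median slide. Treating them as a whole population in the second call is an assumption made only for illustration.

```python
from statistics import variance, pvariance

data = [0, 0, 0, 1, 1, 1, 2, 2, 2]   # sample from the mean vs. median slide
n = len(data)

s2 = variance(data)        # sample variance: sum of squares / (n - 1)
sigma2 = pvariance(data)   # population variance: sum of squares / N

print(s2)       # 0.75
print(sigma2)   # about 0.667

# Both are built from the same sum of squared deviations (here 6).
print(s2 * (n - 1), sigma2 * n)
```

The sample variance is always a little larger than the population variance computed from the same numbers, and the gap shrinks as n grows.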

34
Standard Deviation ($s$ or $\sigma$)
• The standard deviation is the square root of the variance: $s = \sqrt{s^2}$
• Also a measure of the variability
• It’s in the same units as the sample mean
• For populations, the standard deviation is denoted $\sigma = \sqrt{\sigma^2}$
Excel tip. Built-in functions for the sample and population standard deviations:
= STDEV ( cell reference ) or = STDEVP ( cell reference )
35
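Calculating Variance and SD
• Variance:
• Sample numbers ($X_i$): 1 3 7 9
• Mean ($\bar{X}$): (1+3+7+9)/4 = 5
• Deviations from mean ($X_i - \bar{X}$): -4 -2 2 4
• Squared ($(X_i - \bar{X})^2$): 16 4 4 16
• Summed ($\sum_{i=1}^{n} (X_i - \bar{X})^2$): 16+4+4+16 = 40
• Divide by n-1 ($\frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$): 40/3 = 13.3333
• Standard deviation:
• SD = $\sqrt{13.333} \approx 3.65$, that is, $\sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2}$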
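The same worked example can be followed step by step in Python. This is a sketch that mirrors the slide's steps. The variable names are chosen here for readability and are not from the course.

```python
import math

x = [1, 3, 7, 9]                          # sample numbers from the slide
n = len(x)

x_bar = sum(x) / n                        # mean: 5.0
deviations = [xi - x_bar for xi in x]     # -4, -2, 2, 4
squared = [d ** 2 for d in deviations]    # 16, 4, 4, 16
ss = sum(squared)                         # summed: 40.0
s2 = ss / (n - 1)                         # divide by n - 1: 13.333...
s = math.sqrt(s2)                         # standard deviation: about 3.65

print(x_bar, ss, round(s2, 4), round(s, 2))
```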
36
The Range
• Range is another measure of variability
• Denoted by R
• In words, it is the largest observation in
the sample minus the smallest
observation
• Example: Imagine we collect the ages of
students in the class
• Data: 21, 23, 23, 25, 25, 26, 27, 31, 33, 33, 35
• Range = 35 - 21 = 14
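In code, the range is one subtraction. A minimal Python check with the ages from the slide:

```python
ages = [21, 23, 23, 25, 25, 26, 27, 31, 33, 33, 35]   # ages from the slide

R = max(ages) - min(ages)   # largest observation minus smallest

print(R)   # 14
```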

37
Other Measures of Variation
• Percentiles
• pth percentile: value of x such that p% of
the data is less than or equal to x
• Special Percentiles:
• Max: 100th percentile
• Min: 0th percentile
• Median: 50th percentile
• Quartiles: 25th and 75th percentiles
• Interquartile Range (IQR):
IQR = 75th percentile - 25th percentile
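One simple way to turn the percentile definition above into code is the "nearest-rank" rule, sketched below in Python with the ages from the range example. This rule is an assumption chosen for clarity. Excel, JMP, and other packages typically interpolate between observations, so their quartiles may differ slightly from these.

```python
import math

def percentile(data, p):
    """Nearest-rank percentile: the smallest observation x such that
    at least p% of the data is less than or equal to x."""
    ordered = sorted(data)
    rank = max(1, math.ceil(p / 100 * len(ordered)))   # 1-based position
    return ordered[rank - 1]

ages = [21, 23, 23, 25, 25, 26, 27, 31, 33, 33, 35]   # ages from the range example

q1 = percentile(ages, 25)    # 23
q3 = percentile(ages, 75)    # 33
iqr = q3 - q1                # 10

print(percentile(ages, 0), q1, percentile(ages, 50), q3, percentile(ages, 100))
print(iqr)
```

With this rule the 0th, 50th, and 100th percentiles come out as the minimum, median, and maximum of the data, matching the special cases listed above.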
38
Categorical Data

• Numerical Measures:
• Mode: most commonly occurring value
• Frequency table: how often each value
occurs
• Graphics (next class):
• Bar chart of frequencies (histogram)
• Mosaic chart (stacked bar chart)
• Pareto chart

39
Mode
• Mode is the most frequently occurring
value in the sample or population
• It is the “typical” or “common” value
• For example, in the following data
1, 1, 1, 1, 2, 2, 2, 3, 4, 5, 5, 6, 7
the mode is “1”
• “1” occurs 4 times
• All other observations occur fewer than 4 times
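A frequency count makes the mode easy to read off. The sketch below uses `collections.Counter` from Python's standard library on the data from the slide.

```python
from collections import Counter

data = [1, 1, 1, 1, 2, 2, 2, 3, 4, 5, 5, 6, 7]   # data from the slide

counts = Counter(data)                    # frequency table: value -> count
mode, times = counts.most_common(1)[0]    # most frequently occurring value

print(mode, times)   # 1 4

# Print the full frequency table, one value per line.
for value in sorted(counts):
    print(value, counts[value])
```

If two values tie for the highest count, `most_common(1)` returns only one of them, so data with several modes need a closer look at the full table.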
40
Frequency Tables
• Tables of counts
by two or more
categorical
variables
• Example: Executive
compensation
(Forbes94.jmp)

41
Introduction to JMP

• Statistical analysis software


• More powerful than Excel for
statistical analyses
• Designed to facilitate analyses
and to do advanced statistics
• Particularly good at interactive analyses
• Interactive graphics
• Delete points and repeat analysis
• Conduct multiple analyses

42
Introduction to JMP
• Demonstration using GMAT case study (GMAT.jmp)

43
Remember the Notation
• Summation
• Σ notation and subscripts
• Size
• n denotes size of sample
• N denotes size of population

• Knowns vs. unknowns


• Small letters (e.g., “x”): quantity is known
• Capital letters (e.g., “X”): quantity unknown
• Later we will call these random variables
44
People Will Believe Any Statistic…

45
What We’ve Covered
• Introduced professor & course
• Defined some basic statistics
terminology
• Populations vs. samples
• Descriptive vs. inferential statistics
• Learned about some numerical
descriptive statistics
• Measures of location
• Measures of dispersion
• Introduced JMP
46