
MIDTERMS-STATS

The document provides an overview of basic concepts in statistics, including definitions, branches, types of data, and measurement scales. It discusses descriptive and inferential statistics, variables, sampling methods, and methods of data presentation. Additionally, it covers measures of central tendency and variability, along with guidelines for graphing data and using statistical software.

Uploaded by

addegocena1036qc

BASIC CONCEPTS OF STATISTICS

1 DEFINITION OF STATISTICS

STATISTICS
● Statistics is a set of mathematical procedures for organizing, summarizing, and interpreting data or information.
o organizing and summarizing data for clarity
o drawing conclusions about the broader population based on sample data

2 BRANCHES OF STATISTICS

DESCRIPTIVE STATISTICS
● statistical procedures used to organize, summarize, and display data
o tables, charts, averages
o Measures of Frequency: count, percent, frequency
o Measures of Central Tendency: Mean, Median, Mode
o Measures of Dispersion or Variation: Range, Variance, Standard Deviation
o Measures of Position: Percentile Ranks, Quartile Ranks

INFERENTIAL STATISTICS
● methods that allow us to study samples and make generalizations or predictions about a population based on sample data
o Hypothesis testing
o Confidence intervals
o Regression analysis
o Correlation

3 POPULATION AND SAMPLE

POPULATION
● complete set of individuals, objects, or scores that the investigator is interested in studying
o entire group you want to study

SAMPLE
● subset of the population
o smaller group from the population

4 VARIABLES

VARIABLES
● any property that may have different values at different times

INDEPENDENT VARIABLES
● The independent variables are the variables that are manipulated by the researchers; they involve conditions set before outcomes are observed
● predictive
o type of teaching method (traditional or digital)

DEPENDENT VARIABLES
● the observed outcome used to evaluate the effects of the treatment
● shows the result of the manipulation
o test scores

5 TYPES OF DATA

QUANTITATIVE
● is numerical. It's used to define information that can be counted or measured

QUALITATIVE
● refers to information about qualities, or information that cannot be measured. It's usually descriptive and textual
● categorical responses (data set is categorical)

DISCRETE VARIABLE
- described as one in which there are no possible values between adjacent units on the scale (Pagano, 2007)
- counted
- Ex: number of students in the class, number of apples in a crate

CONTINUOUS VARIABLE
- described as one that theoretically can have an infinite number of values between adjacent units on the scale (Pagano, 2007)
- measured
- Ex: weight, height, temperature

6 PROPERTIES OF MEASUREMENT

IDENTITY
● each value on the measurement scale has a unique meaning

EQUAL INTERVALS
● scale units along the scale are equal to one another

A MINIMUM VALUE OF ZERO
● the scale has a true zero point

MAGNITUDE
● values on the measurement scale have an ordered relationship to one another; that is, some values are larger and some are smaller

7 SCALE OF MEASUREMENT
● how variables are defined and categorized

NOMINAL SCALE
- for naming variables in no particular order
- category names

ORDINAL SCALE
- for variables in ranked order, but the differences between ranks are not determined

INTERVAL SCALE
- used for numerical variables with known equal intervals of the same distance
- Ex: temperature in Fahrenheit or Celsius

RATIO SCALE
- used for variables on a scale that has measurable intervals and a true zero point
- Ex: height

8 SAMPLING METHODS

SAMPLING FRAME
● the list from which the potential respondents are drawn

RANDOM SAMPLING

SIMPLE RANDOM SAMPLING
- method wherein every member of the population is given an equal chance of being chosen

SYSTEMATIC RANDOM SAMPLING
- method wherein the first sample is selected randomly and the rest are selected at a given interval

STRATIFIED RANDOM SAMPLING
- method wherein the population is divided into strata before a random sample is selected in each sub-group

NON-RANDOM SAMPLING

CONVENIENCE SAMPLING
- used to ease data collection
- method based on the researcher's convenience, using whatever samples are available

SNOWBALL SAMPLING
- referral method
- asking the first sample to refer other people who also meet the criteria
9 CATEGORIES OF SCIENTIFIC RESEARCH

OBSERVATIONAL STUDIES
- no variables are actively manipulated by the investigator
- this includes naturalistic observation, parameter estimation, and correlational studies

TRUE EXPERIMENTS
- there is an attempt to determine whether changes in one variable produce changes in another variable

10 USING COMPUTERS IN STATISTICS
- SPSS (Statistical Package for the Social Sciences)
- BMDP (Biomedical Computer Programs-P Series)
- SAS (Statistical Analysis System)
- SYSTAT
- MINITAB
- EXCEL
- R Project for Statistical Computing, PSPP
- Jamovi

KEY TERMS:
● Population
● Sample
● Variable (Independent and Dependent)
● Data
● Parameter: numerical value that summarizes data of a population
● Statistic: number calculated on sample data

FREQUENCY DISTRIBUTION

A WHAT IS FREQUENCY DISTRIBUTION
● An organized tabulation of the number of individuals located in each category on the scale of measurement.

B GROUPED FREQUENCY DISTRIBUTION
● Used to group scores together

STEP 1 ARRANGE DATA IN ASCENDING OR DESCENDING ORDER

STEP 2 DETERMINE THE CLASSES
● Find the highest and lowest value
● Find the range (difference between the highest value, HV, and the lowest value, LV, in the distribution)
● Determine the number of classes
o 2 to the k rule: choose k so that $2^k$ is greater than the number of observations
o $2^k >$ total number of observations
● Determine the class interval
o Distance between the lower class boundary and the upper class boundary
o Denoted by "i"
o $i = \frac{\text{range}}{\text{number of classes}} = \frac{HV - LV}{k}$
o Round the value of the interval up to the nearest whole number
● Select a starting point for the lower class limit (Apparent Limit)
o Class limits are the highest and lowest values of a class
o Set the individual class limits by adding the value of "i" until you reach the target number of classes
● Set the class boundaries in each class (Real Limits)
o Upper and lower values of a class that have one more decimal place than the class limits and end with the digit .5
o To obtain class boundaries, subtract 0.5 from each lower class limit and add 0.5 to each upper class limit

GROUPED FREQUENCY DISTRIBUTION
Class Limit | Class Boundaries

STEP 3 TALLY THE RAW DATA

STEP 4 CONVERT THE DATA INTO NUMERICAL FREQUENCIES
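
Steps 1 to 4 can be traced with a short Python sketch. It is a minimal illustration, not a definitive procedure: the 20 scores are hypothetical, the function name is made up, the data are assumed to be whole numbers, and the first lower class limit is assumed to start at the lowest value. If the highest value is still uncovered after k classes (which can happen when the range divides evenly by k), the sketch adds one more class; that choice is also an assumption.

```python
import math

def grouped_frequency(scores):
    """Build class limits, class boundaries, and frequencies for whole-number scores."""
    data = sorted(scores)                      # Step 1: arrange in ascending order
    n, lv, hv = len(data), data[0], data[-1]   # lowest and highest value

    k = 1
    while 2 ** k <= n:                         # 2^k rule: smallest k with 2^k > n
        k += 1

    i = max(1, math.ceil((hv - lv) / k))       # class interval, rounded up

    rows, lower = [], lv                       # assumed starting point: lowest value
    while len(rows) < k or lower <= hv:        # extra class only if HV is uncovered
        upper = lower + i - 1
        f = sum(lower <= x <= upper for x in data)   # Steps 3-4: tally as a count
        rows.append({"limits": (lower, upper),
                     "boundaries": (lower - 0.5, upper + 0.5),
                     "f": f})
        lower += i
    return k, i, rows

# Hypothetical raw scores (n = 20)
scores = [74, 52, 88, 61, 95, 70, 63, 79, 55, 71,
          84, 64, 72, 91, 58, 75, 81, 67, 77, 86]
k, i, rows = grouped_frequency(scores)
for row in rows:
    print(row["limits"], row["boundaries"], row["f"])
```

With these scores, n = 20 gives k = 5 because $2^5 = 32 > 20$, and the range 95 - 52 = 43 gives i = 43 / 5 = 8.6, rounded up to 9.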
GROUPED FREQUENCY DISTRIBUTION
Class Limit | Class Boundaries | f

STEP 5 DETERMINE THE RELATIVE FREQUENCY
● Rf is the value obtained when the frequency in each class is divided by the total number of values
o f divided by the number of observations

GROUPED FREQUENCY DISTRIBUTION
Class Limit | Class Boundaries | f | rf

STEP 6 DETERMINE THE PERCENTAGE
● Obtained by multiplying the relative frequency by 100%

GROUPED FREQUENCY DISTRIBUTION
Class Limit | Class Boundaries | f | rf | %

STEP 7 DETERMINE THE CUMULATIVE FREQUENCY
● The sum of frequencies accumulated up to the upper boundary of a class in a frequency distribution (zigzag method)

GROUPED FREQUENCY DISTRIBUTION
Class Limit | Class Boundaries | f | rf | % | cf

STEP 8 DETERMINE THE MIDPOINTS
● The midpoint is the point halfway between the class limits of each class and is representative of the data within the class
● Can be found by getting the average of the upper limit (UL) and lower limit (LL) in each class
o $\text{midpoint} = \frac{UL + LL}{2}$

GROUPED FREQUENCY DISTRIBUTION
Class Limit | Class Boundaries | f | rf | % | cf | Midpoints

C CATEGORICAL FREQUENCY DISTRIBUTION
● Used to organize nominal-level data

STEP 1 CONSTRUCT A TABLE
Category | Tally | Frequency

STEP 2 TALLY RAW DATA

STEP 3 CONVERT TALLIED DATA INTO NUMERICAL FREQUENCIES

STEP 4 DETERMINE THE PERCENTAGE
o $P = \frac{f}{n} \times 100\%$
o f = frequency
o n = sample size

Category | Tally | Frequency | Percentage
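
For a grouped table, the relative frequency, percentage, cumulative frequency, and midpoint columns can be computed as in the sketch below. The class limits and frequencies are hypothetical, the function name is made up, and the cumulative frequency is assumed to accumulate from the lowest class upward.

```python
def add_columns(classes):
    """classes: list of (lower_limit, upper_limit, f). Returns one dict per class."""
    n = sum(f for _, _, f in classes)       # total number of observations
    table, cf = [], 0
    for ll, ul, f in classes:
        cf += f                             # running total up to this class
        table.append({
            "limits": (ll, ul),
            "f": f,
            "rf": f / n,                    # relative frequency
            "pct": 100 * f / n,             # percentage = rf x 100%
            "cf": cf,                       # cumulative frequency
            "midpoint": (ul + ll) / 2,      # average of upper and lower limit
        })
    return table

# Hypothetical grouped data: (LL, UL, f)
classes = [(52, 60, 3), (61, 69, 4), (70, 78, 6), (79, 87, 4), (88, 96, 3)]
table = add_columns(classes)
for row in table:
    print(row)
```

The last cumulative frequency equals the number of observations, which is a quick way to check the table. The percentage line is the same calculation as $P = \frac{f}{n} \times 100\%$ for a categorical table.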


GRAPHICAL PRESENTATION

A GUIDELINES IN GRAPHING DATA
● The graph/chart should include a title
● The scales for all axes should be included
● The scale on the x and y-axis should start at zero
● The graph/chart should not distort the data
● The axes should be properly labeled
● The graph/chart should not contain unnecessary decorations

B TYPES OF GRAPH

BAR GRAPH
● Frequency distribution for nominal or ordinal data where the heights of the bars represent the frequency of members under each category
● Used to visually represent and compare data across different categories. Bar graphs make it easy to identify trends, differences, and patterns in the displayed data
o Space between bars
o For nominal data

PARETO CHART
● Used to represent a frequency distribution for categorical data; frequencies are displayed by the heights of vertical bars, which are arranged from highest to lowest

LINE GRAPH
- used to present correlations between quantitative variables when the independent variable has, or is organized into, a relatively small number of distinct levels

PIE CHART
- used when we have percentages in categories (nominal data). Each segment represents a percentage of the total.

HISTOGRAM
● A graph in which the classes are marked on the horizontal axis (x-axis) and the class frequencies on the vertical axis (y-axis)
● It focuses on the frequency of each class and sacrifices whatever information was contained in the actual observations
o No gaps between bars
o Equal-width intervals (bins)

FREQUENCY POLYGON
● A graph that displays data using points which are connected by lines
● Frequencies are represented by the heights of the points at the midpoints of the classes
● The vertical axis represents the frequency of the distribution and the horizontal axis represents the midpoints
o Connected line segments

CUMULATIVE FREQUENCY POLYGON (OGIVE)
● A graph that displays the cumulative frequencies for the classes in a frequency distribution
● The vertical axis represents the cumulative frequency of the distribution while the horizontal axis represents the upper class boundaries

TIME SERIES GRAPH
● Represents data that occur over a specific period of time under observation

SCATTER PLOT
● Used to examine possible relationships between 2 numerical variables

PICTOGRAPH
● Represents data through pictures arranged in a row/column

STEM-AND-LEAF DISPLAY
- a technique for organizing data that provides a simple alternative to a grouped frequency distribution table or graph
- Each score is separated into a stem (the first digit or digits) and a leaf (the last digit). The display consists of the stems listed in a column with the leaf for each score written beside its stem. A stem-and-leaf display is similar to a grouped frequency distribution table; however, the stem-and-leaf display identifies the exact value of each score and the grouped frequency distribution does not.

1 MEASURES OF CENTRAL TENDENCY

CENTRAL TENDENCY
● a statistical measure to determine a single score that defines the center of a distribution. The goal of central tendency is to find the single score that is most typical or most representative of the entire group.
● central tendency attempts to identify the "average" or "typical" individual

MEAN (AVERAGE)
● also known as the arithmetic average
● the sum of the scores divided by the number of scores
o The mean for a population is identified by the Greek letter mu, μ (pronounced "mew")
o The mean for a sample is identified by M or $\bar{X}$ (read "x-bar")
o formula for the population mean is $\mu = \frac{\Sigma X}{N}$
o formula for the sample mean uses symbols that signify sample values: $\text{sample mean} = M = \frac{\Sigma X}{n}$

MEDIAN (MIDPOINT)
● The goal of the median is to locate the midpoint of the distribution
● Defining the median as the midpoint of a distribution means that the scores are being divided into two equal-sized groups.
● If the scores in a distribution are listed in order from smallest to largest, the median is the midpoint of the list. More specifically, the median is the point on the measurement scale below which 50% of the scores in the distribution are located.

MODE
● the score or category that has the greatest frequency
o One mode (unimodal)
o Two modes (bimodal)
o More than two modes (multimodal)
o No mode (if no value repeats)

USE THE MEAN WHEN:
- the data is normally distributed (symmetric)
- there are no extreme outliers that can skew the data
- the measurements are interval or ratio
- higher statistical computations are wanted
- there are no extreme values in the distribution, since the mean is easily affected by extremely high or extremely low scores; thus, the distribution is approximately normal
- the greatest reliability of the measure of central tendency is wanted, since its computation includes all the given values

USE THE MEDIAN WHEN:
- the data is skewed or has outliers
- the middle value is needed for ordered data
- the measurements are ordinal or ranked
- there are extreme values, thus the distribution is markedly skewed
- we desire to know whether the cases fall

USE THE MODE WHEN:
- working with categorical data
- you want to determine the most frequent value
- determining the most popular or most typical case

2 MEASURES OF VARIABILITY
These are statistical tools used to describe the spread or dispersion of a dataset.
They indicate how much individual data points differ from the central value (e.g., mean or median).

VARIABILITY
● a quantitative measure of the differences between scores in a distribution; it describes the degree to which the scores are spread out or clustered together

PURPOSE OF VARIABILITY
- variability describes the distribution. It tells whether the scores are clustered close together or are spread out over a large distance
- variability measures how well an individual score (or group of scores) represents the entire distribution

RANGE
● the distance covered by the scores in a distribution, from the smallest score to the largest score
● the range simply measures the difference between the largest score (Xmax) and the smallest score (Xmin)
o $\text{range} = X_{max} - X_{min}$

DEVIATION
● how far a single number is from the mean
o $\text{deviation score} = X - \mu$

STANDARD DEVIATION
● The standard deviation is the most commonly used and the most important measure of variability. Standard deviation uses the mean of the distribution as a reference point and measures variability by considering the distance between each score and the mean
● the standard deviation provides a measure of the standard, or average, distance from the mean, and describes whether the scores are clustered closely around the mean or are widely scattered
● it is the square root of the variance
o $\text{standard deviation} = \sqrt{\text{variance}}$
● Population standard deviation is represented by the symbol σ and equals the square root of the population variance

VARIANCE
● Population variance equals the mean squared deviation. Variance is the average squared distance from the mean
● Population variance is represented by the symbol σ² and equals the mean squared distance from the mean. Population variance is obtained by dividing the sum of squares by N

3 DEVELOPING THE STANDARD DEVIATION

STEP 1: DETERMINE DEVIATION
● $\text{deviation score} = X - \mu$
● Notice that there are two parts to a deviation score: the sign (+ or -) and the number. The sign tells the direction from the mean, whether the score is located above (+) or below (-) the mean. The number gives the distance from the mean

STEP 2: CALCULATE MEAN OF DEVIATION SCORES
● To compute this mean, you first add up the deviation scores and then divide by N
● Deviations sum to 0 because the mean is the balance point of the distribution
● The mean deviation will always equal 0; another method must be found
o $\Sigma (X - \mu) = 0$
● the total of the distances above the mean is exactly equal to the total of the distances below the mean
● it is 0 if the scores are clustered close together and it is 0 if the scores are widely scattered

STEP 3: GET RID OF + AND - IN DEVIATIONS
● Square each deviation score: $(X - \mu)^2$
● Compute the mean squared deviation, known as the variance
o $\sigma^2 = \frac{\Sigma (X - \mu)^2}{N}$
● variability is now measured in squared units
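
Steps 1 to 3 can be followed numerically. The sketch below uses a small hypothetical population of five scores; the variable names are illustrative only.

```python
scores = [1, 9, 5, 8, 7]                    # hypothetical population, N = 5
N = len(scores)
mu = sum(scores) / N                        # population mean

# Step 1: deviation score = X - mu (the sign gives direction, the number gives distance)
deviations = [x - mu for x in scores]

# Step 2: the deviations always sum to 0, so their mean says nothing about spread
mean_deviation = sum(deviations) / N

# Step 3: square each deviation, then average the squares to get the variance
squared = [d ** 2 for d in deviations]
variance = sum(squared) / N

print(mu, deviations, mean_deviation, squared, variance)
```

Here the mean is 6, the deviations are -5, 3, -1, 2, and 1 (which sum to 0), and the squared deviations 25, 9, 1, 4, and 1 average to a variance of 8, expressed in squared units.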

STEP 4: COMPUTE A MEASURE OF THE STANDARD DISTANCE OF THE SCORES FROM THE MEAN
● Variance measures the average squared distance from the mean; not quite our goal
● Correct for having squared all the differences by taking the square root of the variance
o $\text{standard deviation} = \sqrt{\text{variance}}$

4 FORMULA FOR POPULATION VARIANCE AND STANDARD DEVIATION
● $\text{variance} = \frac{\text{sum of squared deviations}}{\text{number of scores}}$
● SS (sum of squares) is the sum of the squared deviations of scores from the mean

5 FINAL FORMULAS
● $\text{variance} = \sigma^2 = \frac{SS}{N}$
● $\text{standard deviation} = \sigma = \sqrt{\frac{SS}{N}}$
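
The measures of central tendency and the final population formulas can be checked with Python's standard `statistics` module. This is a minimal sketch: the eight scores are hypothetical, and `sum_of_squares` is an illustrative helper name, not something defined in the notes.

```python
import math
import statistics

def sum_of_squares(scores):
    """SS: sum of the squared deviations of the scores from their mean."""
    mu = sum(scores) / len(scores)
    return sum((x - mu) ** 2 for x in scores)

scores = [2, 4, 4, 4, 5, 5, 7, 9]           # hypothetical population, N = 8
N = len(scores)

mean = statistics.mean(scores)              # sum of scores / number of scores
median = statistics.median(scores)          # midpoint of the ordered scores
modes = statistics.multimode(scores)        # list of the most frequent value(s)

ss = sum_of_squares(scores)
variance = ss / N                           # population variance = SS / N
sd = math.sqrt(variance)                    # population standard deviation

print(mean, median, modes, ss, variance, sd)

# Cross-check against the library's population formulas
assert math.isclose(variance, statistics.pvariance(scores))
assert math.isclose(sd, statistics.pstdev(scores))
```

For these scores the mean is 5, the median is 4.5, the mode is 4, SS is 32, the variance is 4, and the standard deviation is 2. Note that `statistics.variance` and `statistics.stdev` divide by n - 1 (sample formulas), so the population versions `pvariance` and `pstdev` are the ones that match SS / N.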
