Mathematics Statistics

Statistics topic summary

STATISTICS

Key Concepts
1. Statistics:
• The study of data collection, analysis, interpretation, presentation, and organization.
• It helps make decisions based on data and predict trends.
• It involves summarizing large amounts of data in a way that is easy to understand
and analyze.
2. Types of Data:
• Qualitative (Categorical) Data: Data that describes qualities or characteristics (e.g.,
colors, gender, names).
• Quantitative (Numerical) Data: Data that represents numbers and can be measured
(e.g., height, weight, age).
3. Data Collection Methods:
• Primary Data: Data collected firsthand through surveys, experiments, or
observations.
• Secondary Data: Data collected by someone else and used by the researcher for
analysis (e.g., census data, research papers).
4. Sampling:
• Population: The entire group being studied.
• Sample: A subset of the population, often used when studying the whole population
is not feasible.
• Sampling Methods:
• Random Sampling: Every member of the population has an equal chance of
being selected.
• Systematic Sampling: Select every n-th member from a list.
• Stratified Sampling: Divide the population into subgroups (strata) and
sample from each subgroup.
• Convenience Sampling: Use data that is easiest to collect.
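
The first three sampling methods above can be sketched in Python as follows. This is a minimal illustration, not a prescribed procedure: the function names, the `population` list of ID numbers, and the odd/even strata are all hypothetical choices made for the example.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Draw k members without replacement; each member is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, k)

def systematic_sample(population, step, start=0):
    """Take every step-th member of the list, beginning at index start."""
    return population[start::step]

def stratified_sample(population, stratum_of, per_stratum, seed=None):
    """Group members by stratum_of(member), then sample from each group."""
    rng = random.Random(seed)
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample

# Hypothetical population: ID numbers 1 to 20.
population = list(range(1, 21))

random_ids = simple_random_sample(population, 5, seed=1)
every_fifth = systematic_sample(population, 5)
# Hypothetical strata: odd IDs and even IDs, three members from each.
by_parity = stratified_sample(population, lambda m: m % 2, 3, seed=1)
```

Passing a `seed` only makes the draw repeatable; it does not change the fact that every member has the same chance of selection.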

1. Measures of Central Tendency


These are used to describe the center or average of a data set.
• Mean (Average):
• The sum of all data points divided by the number of data points.
$\text{Mean} = \frac{\sum x}{n}$
Where ∑x is the sum of all data points, and n is the number of data points.
• Median:
• The middle value of the data when arranged in ascending or descending order.
• If there is an odd number of data points, the median is the middle value.
• If there is an even number, the median is the average of the two middle values.
• Mode:
• The value that appears most frequently in a data set.
• A data set can have no mode, one mode, or multiple modes (bimodal or multimodal).
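
As a quick check of the three definitions above, here is a small sketch using Python's standard `statistics` module. The data set is made up for illustration.

```python
import statistics

# Hypothetical data set with an even number of points.
data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)        # (2 + 3 + 3 + 5 + 7 + 10) / 6 = 5
median = statistics.median(data)    # average of the two middle values, 3 and 5
modes = statistics.multimode(data)  # list of every most-frequent value

# A data set with two equally frequent values is bimodal.
bimodal = statistics.multimode([1, 1, 2, 2, 3])
```

`multimode` returns a list, so it covers the one-mode and multiple-mode cases alike. When every value occurs exactly once it returns all of them, which corresponds to the "no mode" case described above.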

2. Measures of Dispersion (Spread)


These measure how spread out or varied the data is.
• Range:
• The difference between the highest and lowest values in the data set.
$\text{Range} = \text{Maximum Value} - \text{Minimum Value}$
• Variance:
• A measure of how much each data point differs from the mean. It is the average of
the squared differences from the mean.
• Formula for population variance σ²:
$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
Where x is each data point, μ is the mean, and N is the number of data points.
• Formula for sample variance s²:
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
Where x̄ is the sample mean and n is the number of data points in the sample.
• Standard Deviation:
• The square root of the variance. It is a measure of how spread out the data is around
the mean.
• Population standard deviation σ is the square root of population variance σ².
$\sigma = \sqrt{\sigma^2}$
• Sample standard deviation s is the square root of sample variance s².
$s = \sqrt{s^2}$
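
The difference between the population and sample formulas is easiest to see on a small example. The sketch below uses Python's standard `statistics` module on a made-up data set whose mean is 5 and whose squared deviations sum to 32.

```python
import statistics

# Hypothetical data set: mean is 5, squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)        # 9 - 2 = 7

pop_var = statistics.pvariance(data)      # divide by N:     32 / 8 = 4
pop_sd = statistics.pstdev(data)          # square root of the population variance

samp_var = statistics.variance(data)      # divide by n - 1: 32 / 7
samp_sd = statistics.stdev(data)          # square root of the sample variance
```

Because the sample formula divides by n − 1 instead of N, the sample variance is always slightly larger than the population variance computed from the same numbers.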

3. Probability
Probability is the measure of the likelihood that a certain event will occur.
• Basic Probability:
• When all outcomes are equally likely, the probability of an event A occurring is given by:
$P(A) = \frac{\text{Number of favorable outcomes}}{\text{Total number of outcomes}}$
• Complementary Events:
• The probability of the complement of event A, denoted A′, is:
P(A′)=1−P(A)
• Addition Rule (for two events):
• If events A and B are mutually exclusive (they cannot happen at the same time), then:
P(A∪B)=P(A)+P(B)
• If the events are not mutually exclusive (they can happen at the same time), then:
P(A∪B)=P(A)+P(B)−P(A∩B)
• Multiplication Rule (for independent events):
• If events A and B are independent, then:
P(A∩B)=P(A)×P(B)
• Conditional Probability:
• The probability of event A given that event B has occurred, provided P(B) > 0, is:
$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$
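
The rules above can be checked on a single roll of a fair six-sided die. This is an illustrative sketch only: the die, the events A and B, and the helper `prob` are assumptions chosen for the example, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die (assumed example).
outcomes = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Favorable outcomes divided by total outcomes, as an exact fraction."""
    return Fraction(len(event & outcomes), len(outcomes))

A = {2, 4, 6}   # the roll is even
B = {4, 5, 6}   # the roll is greater than 3

p_a = prob(A)                          # 3/6 = 1/2
p_not_a = 1 - p_a                      # complement rule: 1/2
p_a_and_b = prob(A & B)                # A ∩ B = {4, 6}, so 2/6 = 1/3
p_a_or_b = p_a + prob(B) - p_a_and_b   # general addition rule: 2/3
p_a_given_b = p_a_and_b / prob(B)      # conditional probability: 2/3

# A and B are not independent: P(A) * P(B) = 1/4, but P(A ∩ B) = 1/3.
independent = (p_a * prob(B) == p_a_and_b)
```

Since A and B overlap, the simpler rule P(A∪B) = P(A) + P(B) would give 1 here, which overcounts the outcomes 4 and 6.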

4. Data Representation
Various methods exist to represent data visually.
• Bar Graphs:
• Used for categorical data, where the height of each bar represents the frequency or
count of each category.
• Histograms:
• Used for continuous numerical data. The data is divided into bins, and the height of
each bar represents the frequency of data points in that bin.
• Pie Charts:
• Used for categorical data. The circle is divided into sections to represent different
categories or proportions.
• Line Graphs:
• Used to display data points in a continuous time series (e.g., stock prices over time).
• Box-and-Whisker Plots:
• Used to represent the distribution of data based on the five-number summary (minimum,
first quartile, median, third quartile, and maximum).
• Scatter Plots:
• Used to show the relationship between two variables by plotting data points on a
Cartesian plane.
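
The difference between bar graphs and histograms comes down to how the bar heights are counted. The sketch below computes those heights without drawing anything. Both data sets and the bin width of 10 are made up for illustration.

```python
from collections import Counter

# Bar graph: one bar per category, height = count of that category.
colors = ["red", "blue", "red", "green", "blue", "red"]  # hypothetical data
bar_heights = Counter(colors)

# Histogram: group continuous values into bins, height = count per bin.
heights_cm = [150, 152, 158, 161, 163, 167, 171, 174]    # hypothetical data
bin_width = 10                                           # assumed bin width
# Each key is the left edge of a bin, e.g. 150 stands for 150 to 159.
histogram = Counter((h // bin_width) * bin_width for h in heights_cm)
```

Changing `bin_width` changes the shape of the histogram, whereas the bar graph counts depend only on the categories themselves.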

5. Measures of Position
• Quartiles:
• Divide a data set into four equal parts.
• First Quartile (Q1): The median of the lower half of the data (25th percentile).
• Second Quartile (Q2): The median of the entire data set (50th percentile).
• Third Quartile (Q3): The median of the upper half of the data (75th percentile).
• Interquartile Range (IQR):
• The difference between the third and first quartile. It represents the spread of the
middle 50% of the data.
IQR=Q3−Q1
• Z-scores:
• The Z-score indicates how many standard deviations a data point is from the mean.
$Z = \frac{x - \mu}{\sigma}$
Where x is a data point, μ is the mean, and σ is the standard deviation.
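
The quartile definitions above (median of the lower and upper halves) can be sketched as below. One detail is an assumption: when the number of data points is odd, this sketch leaves the middle value out of both halves. Other conventions exist and can give slightly different quartiles. The function names and data are hypothetical.

```python
import statistics

def quartiles_median_of_halves(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + (n % 2):]  # skip the middle value when n is odd
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

# Hypothetical data set with seven points.
q1, q2, q3 = quartiles_median_of_halves([1, 3, 5, 7, 9, 11, 13])
iqr = q3 - q1

# A value of 9 in a data set with mean 5 and standard deviation 2.
z = z_score(9, 5, 2)
```

Here the lower half is 1, 3, 5 and the upper half is 9, 11, 13, so Q1 = 3, Q3 = 11 and the IQR is 8. The Z-score of 2 means the value 9 lies two standard deviations above the mean.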

6. Correlation and Regression


• Correlation:
• Measures the strength and direction of the linear relationship between two variables.
• Pearson’s Correlation Coefficient (r):
$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}$
• r ranges from -1 to 1:
• r=1 means a perfect positive correlation.
• r=−1 means a perfect negative correlation.
• r=0 means no linear correlation.
• Linear Regression:
• A method to model the relationship between two variables by fitting a straight line to
the data.
• Equation of the line:
y=mx+c
Where m is the slope and c is the y-intercept.
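
The sketch below computes Pearson's r from the sums in the formula above and fits a line y = mx + c. The notes do not say how the line is fitted, so ordinary least squares is assumed here. The function names and data points are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient from the raw sums."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    numerator = n * sxy - sx * sy
    denominator = math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    return numerator / denominator

def least_squares_line(xs, ys):
    """Slope m and intercept c of the least-squares line (assumed method)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    m = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    c = (sy - m * sx) / n
    return m, c

# Hypothetical paired data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
m, c = least_squares_line(xs, ys)
```

For this data the sums are ∑x = 15, ∑y = 20, ∑xy = 66, ∑x² = 55 and ∑y² = 86, giving m = 30/50 = 0.6, c = 2.2 and r = 30/√1500, roughly 0.77. That is a fairly strong positive linear relationship.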

7. Key Formulas Summary


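• Mean: $\bar{x} = \frac{\sum x}{n}$
• Variance (Population): $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
• Variance (Sample): $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
• Standard Deviation (Population): $\sigma = \sqrt{\sigma^2}$
• Standard Deviation (Sample): $s = \sqrt{s^2}$
• Probability: $P(A) = \frac{\text{favorable outcomes}}{\text{total outcomes}}$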
