Probability Lab
UNIVERSITY
Experiment - 0
Introduction to SPSS:
SPSS is a Windows-based program for data entry and analysis and for creating tables and graphs.
It can handle large amounts of data and can perform all of the analyses covered in the text, and much
more. SPSS is commonly used in the social sciences and in the business world, so familiarity with the
program should serve you well in the future. SPSS is updated often.
Experiment - 1
Objective: Importing a data set into the SPSS Data Editor
INPUT:
car_sales.xlsx
PROCEDURE:
1.) File > Read Text Data.
2.) Choose Excel as the file type and open the file.
OUTPUT:
INITIAL VIEW
DATA VIEW
VARIABLE VIEW
Experiment - 2
Objective: Merging of two files by cases and by variables
INPUT:
Book1.xlsx & Book2.xlsx
File-1
File-2
PROCEDURE:
(B) Merging of variables:
6.) Select the common variables using Ctrl + Click and move them to key variables list.
7.) Choose the radio button saying ‘Both files provide cases’ and click OK.
OUTPUT:
(A) Merging of cases
(B) Merging of variables
CONCLUSIONS: We conclude that in SPSS, two files can be merged either by cases or by
variables.
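The two kinds of merge can also be sketched outside SPSS. The following is a minimal Python illustration, not the SPSS implementation: the rows, the column names (id, name, score) and the helper functions merge_cases and merge_variables are all hypothetical stand-ins for the contents of Book1.xlsx and Book2.xlsx.

```python
# Hypothetical rows standing in for the two workbooks (not the real data).
file1 = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}]
file2_cases = [{"id": 3, "name": "C"}]                             # same variables, new cases
file2_vars = [{"id": 1, "score": 90}, {"id": 3, "score": 75}]      # new variable, shared key


def merge_cases(a, b):
    """Merge by cases: stack the rows of b underneath the rows of a."""
    return [dict(row) for row in a] + [dict(row) for row in b]


def merge_variables(a, b, key):
    """Merge by variables: combine rows that share the same key value.

    Cases found in either file are kept, which mirrors the
    'Both files provide cases' option only in spirit.
    """
    merged = {}
    for row in a + b:
        merged.setdefault(row[key], {}).update(row)
    return [merged[k] for k in sorted(merged)]


by_cases = merge_cases(file1, file2_cases)
by_variables = merge_variables(file1, file2_vars, "id")
print(by_cases)
print(by_variables)
```

In this sketch a case that is missing a variable simply lacks that key, whereas SPSS would show a system-missing value in the corresponding cell.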
Experiment - 3
Objective: Pictorial representation of data.
PROCEDURE:
1.) Graphs > chart builder.
2.) Select and drag the type of graph from gallery.
3.) Select and drag variables according to chart preview.
4.) Click on element properties.
5.) Edit any properties, if required.
6.) Press OK; the graph opens in the output window.
7.) Repeat the same steps for the pie chart.
OUTPUT:
LINE GRAPH
BAR GRAPH
PIE CHART:
CONCLUSION: We conclude that a data set can be represented easily with the help of line graphs,
bar graphs and pie charts.
Experiment - 4
Objective: Drawing of Histogram and distribution curve
OUTPUT:
DESCRIPTIVES VARIABLES=price
  /STATISTICS=MEAN STDDEV VARIANCE MIN MAX SKEWNESS.
Descriptives
Notes
Data: C:\Program Files\IBM\SPSS\Statistics\21\Samples\English\car_sales.sav
Active Dataset: DataSet1
Filter: <none>
Weight: <none>
Split File: <none>
N of Rows in Working Data File: 157
Processor Time: 00:00:00.00
Elapsed Time: 00:00:00.00
[DataSet1] C:\Program Files\IBM\SPSS\Statistics\21\Samples\English\car_sales.sav
Descriptive Statistics
(table columns: N, Minimum, Maximum, Mean, Std. Deviation, Variance, Skewness; rows: Price in thousands, Valid N (listwise))
FREQUENCIES VARIABLES=price
  /STATISTICS=STDDEV MEAN MEDIAN MODE SUM SKEWNESS SESKEW
  /HISTOGRAM
  /ORDER=ANALYSIS.
Frequencies
Notes
Output Created: 12-SEP-2023 10:31:12
Data: C:\Program Files\IBM\SPSS\Statistics\21\Samples\English\car_sales.sav
Active Dataset: DataSet1
Filter: <none>
Weight: <none>
N of Rows in Working Data File: 157
Processor Time: 00:00:00.66
Elapsed Time: 00:00:00.83
[DataSet1] C:\Program Files\IBM\SPSS\Statistics\21\Samples\English\car_sales.sav
Statistics
Price in thousands
N Valid           155
N Missing         2
Mean              27.39075
Median            22.79900
Mode              12.640a
Std. Deviation    14.351653
Skewness          1.766
Sum               4245.567
Conclusion: The car sales data set from the SPSS sample files is used, and a frequency table is created.
The histogram and frequency curve of the Price in thousands variable are then drawn successfully. The
positive skewness (1.766) and a mean (27.39) above the median (22.80) show a right-skewed distribution.
Discuss the class interval and the information revealed by the frequency table, histogram and frequency curve.
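To make the reported statistics concrete, the following Python sketch computes the mean, median, standard deviation and skewness of a small sample. The data values are invented for illustration, and the skewness formula used here (the adjusted sample skewness with n/((n-1)(n-2)) scaling) is assumed to correspond to the value SPSS prints; that correspondence is not verified against the car sales data.

```python
import statistics


def sample_skewness(values):
    """Adjusted sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s) ** 3)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least three observations")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / sd) ** 3 for x in values)


def describe(values):
    """Collect the statistics requested in the FREQUENCIES command above."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "skewness": sample_skewness(values),
        "sum": sum(values),
    }


# Invented prices with one large value, so the mean exceeds the median.
prices = [1, 1, 2, 2, 10]
print(describe(prices))
```

A long right tail pulls the mean above the median and gives a positive skewness, which is the same pattern seen in the price output.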
Experiment - 5
Objective: Descriptive statistics
PROCEDURE:
(A) Descriptive Statistics
1.) Analyze > descriptive statistics > frequencies.
2.) Move the variables to be analysed to the ‘Variables’ list on the right side.
3.) Click statistics and choose the required options to be displayed. Click on CONTINUE.
4.) Click on charts and select HISTOGRAM.
5.) Click OK.
OUTPUT:
Descriptive Statistics
GET
  FILE='C:\Program Files\IBM\SPSS\Statistics\21\Samples\English\Employee data.sav'.
DATASET NAME DataSet1 WINDOW=FRONT.
FREQUENCIES VARIABLES=salary
  /STATISTICS=STDDEV VARIANCE MEAN MEDIAN MODE
  /HISTOGRAM
  /ORDER=ANALYSIS.
Frequencies
Statistics
Current Salary
N Valid           474
N Missing         0
Mean              $34,419.57
Median            $28,875.00
Mode              $30,750
Std. Deviation    $17,075.661
Variance          291578214.5
DESCRIPTIVES VARIABLES=prevexp
/STATISTICS=MEAN STDDEV VARIANCE RANGE MIN MAX.
Descriptives
Descriptive Statistics
                               Variance
Previous Experience (months)   10938.281
Valid N (listwise)

Descriptives
[DataSet1] C:\Program Files\IBM\SPSS\Statistics\21\Samples\English\Employee data.sav
Descriptive Statistics (a)
                               N     Range   Minimum   Maximum   Mean    Std. Deviation   Variance
Previous Experience (months)   363   476     0         476       85.04   95.275           9077.258
Valid N (listwise)             363
a. Employment Category = Clerical

Descriptive Statistics (a)
                               Variance
Previous Experience (months)   10287.333
Valid N (listwise)
a. Employment Category = Custodial

Descriptive Statistics (a)
                               Variance
Previous Experience (months)   5367.010
Valid N (listwise)
a. Employment Category = Manager
CONCLUSION:
1.) Frequency tables give a clear summary of how the data are distributed.
2.) Frequency curves make the skewness and kurtosis of a distribution easy to see.
3.) Binning a continuous variable creates an additional categorical variable.
Experiment - 6
Objective: Correlation of two variables
PROCEDURE:
1.) Analyze > correlate > bivariate.
2.) Select the variables of your choice and move them to the Variables list.
3.) Select “Two-Tailed” and click OK.
Graph:
OUTPUT:
CORRELATIONS
/VARIABLES=horsepow mpg
/PRINT=TWOTAIL NOSIG
/STATISTICS DESCRIPTIVES
/MISSING=LISTWISE.
Correlations
[DataSet1] C:\Program Files\IBM\SPSS\Statistics\21\Samples\English\car_sales.sav
Descriptive Statistics
                   Mean     Std. Deviation   N
Horsepower         185.95   56.700           156
Fuel efficiency    23.88    4.271            156

Correlations
                                         Horsepower   Fuel efficiency
Horsepower        Pearson Correlation    1            -.605**
                  Sig. (2-tailed)                     .000
                  N                      156          156
Fuel efficiency   Pearson Correlation    -.605**      1
                  Sig. (2-tailed)        .000
                  N                      156          156
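CONCLUSION: From the above data, we conclude that horsepower and fuel efficiency are moderately and
negatively correlated (Pearson r = -.605, Sig. (2-tailed) = .000, N = 156): cars with higher horsepower
tend to have lower fuel efficiency.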
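The Pearson coefficient reported in the Correlations table can be computed directly from its definition. The following is a minimal Python sketch, with made-up horsepower and fuel-efficiency values rather than the car_sales.sav data; the function name pearson_r is our own.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance of x and y divided by the product of their spreads."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least two values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented values: higher horsepower paired with lower fuel efficiency.
horsepower = [100, 150, 200, 250, 300]
mpg = [32, 27, 25, 21, 18]
print(round(pearson_r(horsepower, mpg), 3))
```

A value close to -1 means an almost perfectly decreasing linear relationship, and a value close to 0 means no linear relationship.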
Experiment - 7
Objective: Regression
PROCEDURE:
1.) Analyze > regression > curve estimation.
2.) Select sales as the dependent variable.
3.) Select engine size as the independent variable.
4.) Under category ‘Models’, select Linear, Quadratic, Exponential.
5.) Click OK.
OUTPUT:
Part a:
Curve Fit
Notes
Syntax:
CURVEFIT
  /VARIABLES=sales WITH engine_s
  /CONSTANT
  /MODEL=LINEAR
  /PLOT FIT.
Equations Include: CONSTANT
[DataSet1] C:\Program Files\IBM\SPSS\Statistics\21\Samples\English\car_sales.sav
Model Description
                        N
Total Cases             157
Excluded Cases (a)      1
Forecasted Cases        0
Newly Created Cases     0
a. Cases with a missing value in any variable are excluded from the analysis.

Variable Processing Summary
                                             Dependent            Independent
                                             Sales in thousands   Engine size
Number of Positive Values                    157                  156
Number of Zeros                              0                    0
Number of Negative Values                    0                    0
Number of Missing Values   User-Missing      0                    0
                           System-Missing    0                    1
Regression
Variables Entered/Removed (a)
Model   Variables Entered   Variables Removed   Method
1       Horsepower (b)      .                   Enter
a. Dependent Variable: Sales in thousands
b. All requested variables entered.

Model Summary
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .198 (a)   .039       .033                67.117542
a. Predictors: (Constant), Horsepower

ANOVA (a)
(table columns: Model, Sum of Squares, df, Mean Square, F, Sig.)
Coefficients (a)
(table columns: Unstandardized Coefficients, Standardized Coefficients)
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT sales
  /METHOD=ENTER price resale.
Regression
Variables Entered/Removed (a)
Model   Variables Entered                               Variables Removed   Method
1       4-year resale value, Price in thousands (b)     .                   Enter
a. Dependent Variable: Sales in thousands
b. All requested variables entered.

Model Summary
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .281 (a)   .079       .063                72.176890

Coefficients (a)
(table columns: Unstandardized Coefficients, Standardized Coefficients)
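CONCLUSION: From the above data, both regression models fit sales weakly. With horsepower as the only
predictor, R Square = .039, so about 3.9% of the variation in sales is explained; with price and 4-year
resale value as predictors, R Square = .079, or about 7.9%.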
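The quantities in the Model Summary table can be illustrated with a small least-squares fit. The sketch below is a plain Python illustration of simple linear regression with one predictor; the data and the function name linear_fit are hypothetical, and it does not reproduce the SPSS REGRESSION procedure (no adjusted R Square, ANOVA or significance tests).

```python
def linear_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (intercept, slope, r_square), where r_square is the share
    of the variation in y explained by the fitted line.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least two values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r_square = 1 - ss_residual / ss_total
    return intercept, slope, r_square


# Invented predictor and response values with noticeable scatter.
x = [1, 2, 3, 4, 5, 6]
y = [5, 3, 8, 4, 9, 6]
intercept, slope, r_square = linear_fit(x, y)
print(intercept, slope, r_square)
```

With a single predictor, R Square equals the square of the Pearson correlation between the two variables, which is why R = .198 in the first Model Summary goes with R Square = .039.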
Experiment - 8
Objective: Hypothesis testing (t-test)
Procedure:
1. Open SPSS and load your data.
o One-Sample t-test: Analyze > Compare Means > One-Sample T Test. Select the test variable and set
the population mean (test value).
o Independent-Samples t-test: Analyze > Compare Means > Independent-Samples T Test. Select the
test variable and the grouping variable, then define the groups.
o Paired-Samples t-test: Analyze > Compare Means > Paired-Samples T Test. Select the paired
variables.
4. Interpret Results:
o Review the means and confidence intervals for context on the size of the difference.
OUTPUT:
T-TEST
  /TESTVAL=17.4
  /MISSING=ANALYSIS
  /VARIABLES=fuel_cap
  /CRITERIA=CI(.95).
T-Test
Notes
Data: C:\Program Files\IBM\SPSS\Statistics\21\Samples\English\car_sales.sav
Active Dataset: DataSet1
Filter: <none>
Weight: <none>
Missing Value Handling: User defined missing values are treated as missing.
Processor Time: 00:00:00.00
Elapsed Time: 00:00:00.00
One-Sample Statistics
One-Sample Test
(confidence interval columns: Lower, Upper)
T-TEST GROUPS=type(0 1)
  /MISSING=ANALYSIS
  /VARIABLES=length
  /CRITERIA=CI(.95).
T-Test
Notes
Output Created: 31-OCT-2023 11:00:44
Data: C:\Program Files\IBM\SPSS\Statistics\21\Samples\English\car_sales.sav
Active Dataset: DataSet1
Filter: <none>
Weight: <none>
Definition of Missing: User defined missing values are treated as missing.
Group Statistics
(test table columns: F, Sig., t, df; confidence interval columns: Lower, Upper)
  /CRITERIA=CI(.9500)
  /MISSING=ANALYSIS.
T-Test
Notes
Data: C:\Program Files\IBM\SPSS\Statistics\21\Samples\English\car_sales.sav
Active Dataset: DataSet1
Filter: <none>
Weight: <none>
Split File: <none>
N of Rows in Working Data File: 157
Missing Value Handling: User defined missing values are treated as missing.
(paired samples table columns: N, Correlation, Sig.)
Paired Differences
(columns: 95% Confidence Interval of the Difference, Lower and Upper)
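CONCLUSION:
T Test formula (one sample): t = (sample mean - test value) / (standard deviation / sqrt(N))
The mean difference is the sample mean minus the test value, so it changes whenever the test value
changes. The closer the test value is to the sample mean, the smaller the absolute mean difference and
the t statistic, and the weaker the evidence against the hypothesised mean.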
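The effect of the test value on the mean difference can be checked with a short calculation. The Python sketch below computes a one-sample t statistic as (sample mean - test value) / (s / sqrt(n)); the fuel-capacity values are invented and the function name one_sample_t is our own, so this is an illustration of the arithmetic rather than a reproduction of the SPSS output.

```python
import math
import statistics


def one_sample_t(values, test_value):
    """Return (mean difference, t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    mean_difference = mean - test_value
    return mean_difference, mean_difference / standard_error, n - 1


# Invented fuel capacities with a sample mean of 18.
fuel_cap = [16, 17, 18, 19, 20]
for test_value in (17.4, 18.0, 19.0):
    diff, t, df = one_sample_t(fuel_cap, test_value)
    print(test_value, round(diff, 3), round(t, 3), df)
```

When the test value equals the sample mean, both the mean difference and t are zero; moving the test value away from the sample mean in either direction increases their absolute size.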
Experiment - 9
Objective: Chi-square test
Procedure:
1. Open SPSS and load your data.
2. Choose Chi-Square Test:
o Go to Analyze > Descriptive Statistics > Crosstabs…
3. Set Variables:
o Select the row and column variables (categorical variables you want to test).
o Click Statistics…, check Chi-square, and then click Continue.
4. Run the Test by clicking OK.
5. Interpret Results:
o In the output, find the Chi-Square Tests table.
o Check the p-value for the Chi-Square statistic: If p < 0.05, there’s a significant association
between variables.
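The statistic behind the Chi-Square Tests table can be sketched in a few lines. The Python example below computes the Pearson chi-square for a table of observed counts, using expected count = row total x column total / grand total. The counts are invented, not taken from insurance_claims.sav, the function name is our own, and no continuity correction or exact test is included.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi_square += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_square, df


# Invented 2 x 2 counts: rows = fraudulent (no, yes), columns = gender (male, female).
observed = [[10, 20],
            [20, 10]]
chi_square, df = chi_square_independence(observed)
print(round(chi_square, 3), df)

# For df = 1, the 5% critical value of the chi-square distribution is 3.841.
print("significant at 0.05" if chi_square > 3.841 else "not significant at 0.05")
```

Comparing the statistic with the critical value for the given degrees of freedom leads to the same decision as checking whether p < 0.05 in the SPSS output.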
OUTPUT:
CROSSTABS
  /TABLES=fraudulent BY gender
  /FORMAT=AVALUE TABLES
  /STATISTICS=CHISQ
  /CELLS=COUNT
Notes
Data: C:\Program Files\IBM\SPSS\Statistics\21\Samples\English\insurance_claims.sav
Active Dataset: DataSet1
File Label: Insurance Claims
Weight: <none>
N of Rows in Working Data File: 4415
Cells Available: 174734
Case Processing Summary
(crosstabulation of counts by Gender: Male, Female, Total)
Chi-Square Tests
(table columns: Value, df, Asymp. Sig. (2-sided), Exact Sig. (2-sided), Exact Sig. (1-sided))
a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 228.09.
NPAR TESTS
  /CHISQUARE=claim_amount (1,30)
  /EXPECTED=EQUAL
  /STATISTICS DESCRIPTIVES
  /MISSING ANALYSIS.
NPar Tests
Notes
Data: C:\Program Files\IBM\SPSS\Statistics\21\Samples\English\insurance_claims.sav
Active Dataset: DataSet1
File Label: Insurance Claims
Filter: <none>
Weight: <none>
N of Rows in Working Data File: 4415
Definition of Missing: User-defined missing values are treated as missing.
Cases Used: Statistics for each test are based on all cases with valid data for the variable(s) used in that test.
Processor Time: 00:00:00.00
Descriptive Statistics
Chi-Square Test
Test Statistics
              Cost of claim in thousands
Chi-Square    1072.485a
df            29
NPAR TESTS
  /CHISQUARE=claim_amount (1,3)
  /EXPECTED=EQUAL
  /STATISTICS DESCRIPTIVES
  /MISSING ANALYSIS.
NPar Tests
Notes
Output Created: 31-OCT-2023 11:19:52
Data: C:\Program Files\IBM\SPSS\Statistics\21\Samples\English\insurance_claims.sav
Active Dataset: DataSet1
File Label: Insurance Claims
Filter: <none>
Weight: <none>
Definition of Missing: User-defined missing values are treated as missing.
Descriptive Statistics
Chi-Square Test
Frequencies
Total         405
Test Statistics
              Cost of claim in thousands
Chi-Square    168.193a
df            2
Asymp. Sig.   .000