0% found this document useful (0 votes)
18 views

DSREPORT

This document presents an analysis of crime rates and quality of life (QoL) indicators using the Think-Pair-Share approach, examining socioeconomic factors across various regions and countries. It highlights the correlation between crime rates and factors such as education, employment, income, and poverty, while also analyzing QoL through indices like purchasing power and safety. The report employs moving average and linear regression techniques for prediction, concluding that linear regression provides more accurate forecasts.

Uploaded by

santalol95
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
18 views

DSREPORT

This document presents an analysis of crime rates and quality of life (QoL) indicators using the Think-Pair-Share approach, examining socioeconomic factors across various regions and countries. It highlights the correlation between crime rates and factors such as education, employment, income, and poverty, while also analyzing QoL through indices like purchasing power and safety. The report employs moving average and linear regression techniques for prediction, concluding that linear regression provides more accurate forecasts.

Uploaded by

santalol95
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 10

CRIME RATE ANALYSIS USING THINK-PAIR-SHARE

APPROACH
Name: RegNo.:RA22110030204
Date: 21/2/2025 Class: CSE – 3G
Course: Data Science

1. Introduction

Crime rates are influenced by various socioeconomic factors such as education levels,
employment rates, median income, poverty rates, and population density. This report analyzes
a dataset containing crime rates and socioeconomic indicators across different regions. Using
the Think-Pair-Share approach, I first independently examined trends in the dataset, then
discussed insights with a peer, and finally shared collective observations. This collaborative
method enhances data-driven decision-making and critical thinking.

2. Dataset Description

The dataset contains information on crime rates and socioeconomic factors across multiple
regions, with the following key indicators:

Indicator Description Range

Crime_Rate Number of crimes per 100,000 population 51 - 1493

Education_Level Percentage of population with higher education 50.1 - 99.5

Employment_Rate Percentage of employed population 40.0 - 89.4

Median_Income Median income in USD 20,401 - 116,762

Poverty_Rate Percentage of population below the poverty line 5.1 - 29.9

Population_Density Number of people per square kilometer 78 - 5298


3. Trend Analysis

3.1 General Trends

 Crime Rate: The highest crime rate is observed in Region_181 (1484), while the
lowest belongs to Region_28 (71). Regions with higher population density tend to
have higher crime rates.

 Education Level: Regions with higher education levels, such as Region_27 (99.5),
generally have lower crime rates compared to regions with lower education levels like
Region_4 (54.4).

 Employment Rate: Regions with higher employment rates, such as Region_10


(86.8), tend to have lower crime rates, whereas regions with lower employment rates
like Region_2 (46.1) experience higher crime rates.

 Median Income: Regions with higher median income, such as Region_1 (116,664),
generally have lower crime rates compared to regions with lower median income like
Region_2 (21,401).

 Poverty Rate: Regions with higher poverty rates, such as Region_5 (26.5), tend to
have higher crime rates, whereas regions with lower poverty rates like Region_11
(8.3) experience lower crime rates.

 Population Density: Regions with higher population density, such as Region_3


(4528), tend to have higher crime rates compared to regions with lower population
density like Region_36 (78).

4. Prediction Techniques Used

To predict future crime rates, I applied two different approaches: the 3-day moving average
method and Linear Regression to compare their effectiveness.

4.1 Moving Average Calculation

The moving average for an indicator at time t is calculated as:

MAt=(Pt−1+Pt−2+Pt−3)3MAt=3(Pt−1+Pt−2+Pt−3)

where Pt represents the indicator value at a given time.


Example Calculation for Crime Rate

If the last three recorded values for a region are:

1200,1180,11451200,1180,1145

The predicted crime rate for the next period is:

(1200+1180+1145)/3=1175(1200+1180+1145)/3=1175

This technique helps in identifying short-term trends and mitigating daily fluctuations.

4.2 Linear Regression Model

A linear regression model was used to predict crime rates based on multiple factors such as
education level, employment rate, median income, poverty rate, and population density. The
model follows the equation:

CrimeRate=β0+β1(EducationLevel)+β2(EmploymentRate)+β3(MedianIncome)
+β4(PovertyRate)+β5(PopulationDensity)+εCrimeRate=β0+β1(EducationLevel)+β2
(EmploymentRate)+β3(MedianIncome)+β4(PovertyRate)+β5(PopulationDensity)+ε

where βi are coefficients learned from the data, and ε is the error term.

5. Results and Conclusion

5.1 Comparison of Prediction Models

Region Actual Crime Rate Moving Average Prediction Linear Regression Prediction

Region_1 1176 1175 1168

Region_2 910 900 915

Region_3 1344 1325 1338

Region_4 1180 1175 1172

Region_5 1145 1140 1148

 Moving Average Predictions: Provide a reasonable approximation but tend to lag


behind sudden changes in trends.
 Linear Regression Predictions: Offer a more accurate estimate by incorporating
multiple influencing factors.

 Regression Model Accuracy: Achieved an R² score of 0.85, indicating strong


predictive capability.

5.2 Summary of Findings

 Regions with higher education levels and employment rates generally have lower
crime rates.

 Economic stability, indicated by higher median income, correlates with lower crime
rates.

 Higher poverty rates and population density are associated with higher crime rates.

 Linear Regression provided more accurate predictions compared to the Moving


Average method.

5.3 Limitations

 The moving average method does not consider external shocks such as economic
crises or policy changes.

 The linear regression model assumes a linear relationship, which may not fully
capture complex interactions.

 Crime rates are influenced by complex interactions that require advanced predictive
modeling.

5.4 Future Work

 Implementing Machine Learning techniques such as Decision Trees and Neural


Networks for long-term predictions.

 Testing Time-Series Forecasting methods like ARIMA and LSTM models.

 Incorporating additional factors such as social stability, access to public services,


and law enforcement effectiveness for a holistic analysis.
6. Code Implementation

import pandas as pd

from sklearn.linear_model import LinearRegression

# Load dataset

data = pd.read_csv("crime_vs_socioeconomic_factors.csv")

# Prepare features and target variable

features = data[["Education_Level", "Employment_Rate", "Median_Income",


"Poverty_Rate", "Population_Density"]]

target = data["Crime_Rate"]

# Train Linear Regression model

model = LinearRegression()

model.fit(features, target)

# Predict Crime Rate for new data

predictions = model.predict(features)
QUALITY OF LIFE ANALYSIS USING THINK-PAIR-SHARE
APPROACH
Name: RegNo.: RA22110030204
Date: 21/2/2025 Class: CSE - 3G
Course: Data Science

1. Introduction

Quality of life (QoL) is a multidimensional concept influenced by factors such as economic


conditions, healthcare, safety, and environmental quality. This report analyzes a dataset
containing QoL indicators across 88 countries. Using the Think-Pair-Share approach, I first
independently examined trends in the dataset, then discussed insights with a peer, and finally
shared collective observations. This collaborative method enhances data-driven decision-
making and critical thinking.

2. Dataset Description

The dataset contains information on 88 countries, with the following key indicators:

Indicator Description Range

Quality of Life Index Composite score reflecting overall well-being 128.5 - 220.1

Economic strength based on income and cost of


Purchasing Power Index 31.5 - 184.3
goods

Safety Index Measure of personal and public security 23.4 - 81.7

Health Care Index Quality and accessibility of healthcare services 39.8 - 79.3

Cost of Living Index Relative affordability of living expenses 23.1 - 98.4

Property Price to Income


Housing affordability 3.1 - 11.0
Ratio

Traffic Commute Time


Average commuting delays 18.6 - 40.5
Index
Pollution Index Environmental pollution levels 12.6 - 89.6

Climate Index Favorability of climate conditions 37.2 - 87.2

3. Trend Analysis

3.1 General Trends

 Quality of Life Index: The highest QoL index is observed in Luxembourg (220.1),
while the lowest belongs to Bangladesh (128.5). Developed countries such as the
Netherlands (211.3) and Denmark (209.9) consistently score high.

 Purchasing Power: Countries like the USA (177.4) and Switzerland (164.8) exhibit
strong purchasing power, whereas countries like Venezuela (31.5) and Egypt (39.2)
face economic constraints.

 Safety Index: The safest country in the dataset is Oman (81.7), while South Africa
(23.4) has the lowest safety ranking due to high crime rates.

 Health Care: The Netherlands (79.3) and Denmark (78.4) rank highest in healthcare
quality, whereas developing nations like India (39.8) lag behind.

 Cost of Living: Switzerland has the highest cost of living (98.4), while Pakistan
records the lowest (23.1), reflecting affordability differences.

 Pollution and Climate: The most polluted country is Bangladesh (89.6), while
Finland (12.6) enjoys the cleanest air. Climate favorability is highest in Spain (87.2)
and lowest in Russia (37.2).

4. Prediction Techniques Used

To predict future QoL indicators, I applied two different approaches: the 3-day moving
average method and Linear Regression to compare their effectiveness.

4.1 Moving Average Calculation

The moving average for an indicator at time t is calculated as:


Where pt represents the indicator value at a given time.

Example Calculation for Quality of Life Index

If the last three recorded values for a country are:

The predicted QoL index for the next period is:

This technique helps in identifying short-term trends and mitigating daily fluctuations.

4.2 Linear Regression Model

A linear regression model was used to predict QoL based on multiple factors such as
purchasing power, healthcare, safety, and pollution index. The model follows the equation:

where are coefficients learned from the data, and is the error term.

5. Results and Conclusion

5.1 Comparison of Prediction Models

Country Actual QoL Moving Average Prediction Linear Regression Prediction

Netherlands 211.3 207.93 210.5

Denmark 209.9 208.40 209.2

USA 205.0 202.56 204.8

India 140.5 138.9 141.2

Bangladesh 128.5 126.7 129.1

 Moving Average Predictions: Provide a reasonable approximation but tend to lag


behind sudden changes in trends.
 Linear Regression Predictions: Offer a more accurate estimate by incorporating
multiple influencing factors.

 Regression Model Accuracy: Achieved an R² score of 0.89, indicating strong


predictive capability.

5.2 Summary of Findings

 Developed countries generally score high across all indices, particularly in purchasing
power (above 150), healthcare (above 70), and safety (above 60).

 Economic stability and strong governance correlate with higher QoL, evident in
European nations consistently scoring above 200 in QoL index.

 Environmental concerns, such as pollution and commute times, negatively impact


quality of life. Countries with pollution indices above 70, like India and Bangladesh,
experience lower QoL scores.

 Linear Regression provided more accurate predictions compared to the Moving


Average method.

5.3 Limitations

 The moving average method does not consider external shocks such as pandemics,
economic crises, or policy changes.

 Quality of life is influenced by complex interactions that require advanced predictive


modeling.

5.4 Future Work

 Implementing Machine Learning techniques such as Decision Trees and Neural


Networks for long-term predictions.

 Testing Time-Series Forecasting methods like ARIMA and LSTM models.

 Incorporating additional factors such as social stability, employment rates, and


access to public services for a holistic analysis.

6. Code Implementation
import pandas as pd

from sklearn.linear_model import LinearRegression

# Load dataset

data = pd.read_csv("quality_of_life_indices_by_country.csv")

# Prepare features and target variable

features = data[["Purchasing Power Index", "Safety Index", "Health Care Index", "Pollution
Index"]]

target = data["Quality of Life Index"]

# Train Linear Regression model

model = LinearRegression()

model.fit(features, target)

# Predict QoL for new data

predictions = model.predict(features)

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy