A comprehensive R-based project that analyzes the Gapminder dataset (1952–2007) to explore global trends in life expectancy, GDP per capita, and population. The analysis produces a suite of 12 visualizations, fits a regression model, and renders an HTML report.
- Project Overview
- Features & Charts
- Prerequisites
- Installation
- Usage
- File Structure
- Scripts & Automation
- Docker & Containerization
- Extending & Customizing
- Data Source & Citations
- License
This repository contains everything needed to:
- Fetch and prepare the Gapminder dataset via the gapminder R package.
- Compute key metrics (global averages, per-country snapshots).
- Visualize 12 charts illustrating temporal trends, economic relationships, continental comparisons, and population impacts.
- Model the relationship between life expectancy and GDP per capita.
- Automate the workflow via shell scripts, a Makefile, and Docker.
- Render a polished RMarkdown report (Gapminder_report.html).
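The "key metrics" step above could look roughly like the following. This is a minimal sketch, not the project's actual code: the object names (`global_by_year`, `snapshot_2007`) are hypothetical, and it assumes an unweighted mean across countries, whereas the real script may weight by population.

```r
library(dplyr)
library(gapminder)

# Global averages per year.
# Assumption: unweighted mean across countries (not population-weighted).
global_by_year <- gapminder %>%
  group_by(year) %>%
  summarise(
    mean_life_exp   = mean(lifeExp),
    mean_gdp_percap = mean(gdpPercap),
    total_pop       = sum(as.numeric(pop)),  # as.numeric avoids integer overflow
    .groups = "drop"
  )

# Per-country snapshot for the most recent year in the data.
snapshot_2007 <- gapminder %>%
  filter(year == max(year)) %>%
  select(country, continent, lifeExp, gdpPercap, pop)
```

Both results are ordinary tibbles, so they can be passed directly to ggplot2 for the time-series and 2007 charts.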
Ideal for students, educators, data scientists, and anyone interested in global development metrics.
- Global Average Life Expectancy Over Time
- Global Average GDP per Capita Over Time
- Global Total Population Over Time
- Scatter: Life Expectancy vs. GDP per Capita (2007)
- Regression: Life Expectancy ~ log(GDP per Capita)
- Top 10 Countries by Life Expectancy (2007)
- Boxplot: Life Expectancy by Continent (2007)
- Violin: GDP per Capita by Continent (2007)
- Line: Average Life Expectancy by Continent Over Time
- Density: Life Expectancy Distribution by Continent (2007)
- Heatmap: Average Life Expectancy by Year & Continent
- Bubble Plot: GDP vs. Life Expectancy Sized by Population (2007)
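The regression chart in the list above relates life expectancy to log GDP per capita. As an illustration only, the fit could be sketched as below; restricting it to the 2007 snapshot is an assumption, since the list does not say which years the model uses, and the names `gap_2007` and `fit` are hypothetical.

```r
library(gapminder)

# Assumption: fit on the 2007 snapshot only; the project may use all years.
gap_2007 <- subset(gapminder, year == 2007)

# Linear model of life expectancy on the natural log of GDP per capita.
fit <- lm(lifeExp ~ log(gdpPercap), data = gap_2007)

# Coefficients, standard errors, and R-squared.
print(summary(fit))
```

The slope is the change in life expectancy (in years) per one-unit increase in log GDP per capita.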
Each plot is saved to gap-<index>.png and displayed in the R session.
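One way to get this save-and-display behavior is a small helper around `ggplot2::ggsave()`. The sketch below is an assumption about how it might be done: the helper name `save_and_show` and the image dimensions are hypothetical and not taken from the project.

```r
library(ggplot2)

# Hypothetical helper: write a plot to gap-<index>.png and also print it.
save_and_show <- function(plot, index, dir = ".") {
  path <- file.path(dir, sprintf("gap-%d.png", index))
  # Width, height, and dpi are illustrative defaults, not the project's values.
  ggsave(path, plot = plot, width = 8, height = 5, dpi = 150)
  print(plot)      # display in the active graphics device
  invisible(path)  # return the file path without printing it
}
```

For example, `save_and_show(p, 1)` would write gap-1.png to the working directory for a ggplot object `p`.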
- R ≥ 4.0
- RStudio (optional, but recommended)
- Internet connection (to install R packages and fetch data)
ggplot2
dplyr
gapminder
scales
viridis
tidyr
forcats
zoo
rmarkdown (for report rendering)
The main script auto-installs any missing packages.
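A common pattern for this kind of auto-install is sketched below. It is not copied from Gapminder_Analysis.R: the function names `missing_pkgs` and `ensure_packages` are hypothetical, and the CRAN mirror URL is an assumption.

```r
# Packages listed in the prerequisites above.
required <- c("ggplot2", "dplyr", "gapminder", "scales", "viridis",
              "tidyr", "forcats", "zoo", "rmarkdown")

# Return the subset of `pkgs` that is not currently installed.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Install whatever is missing, then attach every package.
# Assumption: the CRAN cloud mirror is reachable.
ensure_packages <- function(pkgs, repos = "https://cloud.r-project.org") {
  to_install <- missing_pkgs(pkgs)
  if (length(to_install) > 0) install.packages(to_install, repos = repos)
  invisible(lapply(pkgs, library, character.only = TRUE))
}

# Usage (left commented so that sourcing this block installs nothing):
# ensure_packages(required)
```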
- Clone this repository:
  git clone https://github.com/yourusername/gapminder-analysis.git
  cd gapminder-analysis
- Install R (if not already installed).
- (Optional) Copy .env.example to .env to customize environment variables.
bash scripts/run_gapminder.sh
All 12 charts will print and save as gap-1.png … gap-12.png.
bash scripts/render_report.sh
This produces Gapminder_report.html.
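The contents of render_report.sh are not shown here, but the rendering step can also be run from an R session with `rmarkdown::render()`. The wrapper below is a hedged sketch: the function name `render_report` is hypothetical, and it assumes the .Rmd file sits in the working directory.

```r
# Hypothetical wrapper around rmarkdown::render().
render_report <- function(input = "Gapminder_Analysis.Rmd",
                          output = "Gapminder_report.html") {
  # Fail early with a clear message if the source file is missing.
  if (!file.exists(input)) {
    stop("Cannot find ", input, call. = FALSE)
  }
  # The output format is taken from the YAML header of the .Rmd file.
  rmarkdown::render(input, output_file = output)
}
```

Calling `render_report()` from the repository root would then write Gapminder_report.html next to the .Rmd file.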
make report
Equivalent to running the analysis and report steps in sequence.
.
├── Gapminder_Analysis.R      # Main R script
├── Gapminder_Analysis.Rmd    # RMarkdown report
├── Gapminder_report.html     # Generated HTML report
├── gap-*.png                 # Saved plot images
├── scripts/
│   ├── run_gapminder.sh      # Runs the analysis script
│   └── render_report.sh      # Renders the RMarkdown
├── Makefile                  # Make targets for automation
├── Dockerfile                # Container definition
├── docker-compose.yml        # Compose for containerized run
├── .env.example              # Sample environment config
└── README.md                 # This file
- scripts/run_gapminder.sh: runs Gapminder_Analysis.R to produce and save all plots.
- scripts/render_report.sh: renders Gapminder_Analysis.Rmd to Gapminder_report.html.
- Makefile
  - make analysis: run shell script to generate plots
  - make report: run analysis + render report
  - make clean: delete images & report
Build and run in a container:
docker-compose up --build
This will:
- Install system and R dependencies
- Run the analysis script
- Render the RMarkdown report
All files are shared via a bind mount.
- Add more plots: insert new code chunks in Gapminder_Analysis.R or .Rmd.
- Parameterize: use YAML or command-line args to filter years or continents.
- CI/CD: integrate with GitHub Actions to auto-build the report on push.
- Data sources: replace Gapminder with another tidy dataset for similar workflows.
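As one possible take on the "Parameterize" idea above, command-line arguments can be read with base R's `commandArgs()`. Everything project-specific here is hypothetical: the flag names `--continent` and `--min-year` and the helpers `parse_filters` and `apply_filters` are illustrative, not options the current scripts accept.

```r
# Parse hypothetical flags such as --continent=Asia --min-year=1990.
parse_filters <- function(args = commandArgs(trailingOnly = TRUE)) {
  get_opt <- function(name) {
    prefix <- paste0("--", name, "=")
    hit <- args[startsWith(args, prefix)]
    if (length(hit) == 0) return(NULL)
    substring(hit[1], nchar(prefix) + 1)  # text after the "="
  }
  min_year <- get_opt("min-year")
  list(
    continent = get_opt("continent"),
    min_year  = if (is.null(min_year)) NULL else as.integer(min_year)
  )
}

# Apply the parsed filters to any data frame with `continent` and `year` columns.
apply_filters <- function(data, filters) {
  if (!is.null(filters$continent)) {
    data <- data[data$continent == filters$continent, , drop = FALSE]
  }
  if (!is.null(filters$min_year)) {
    data <- data[data$year >= filters$min_year, , drop = FALSE]
  }
  data
}
```

A script using these helpers could be run as `Rscript Gapminder_Analysis.R --continent=Asia --min-year=1990` and call `apply_filters(gapminder::gapminder, parse_filters())` before plotting.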
- Gapminder dataset via R package: install.packages("gapminder")
- R Packages: see CRAN for ggplot2, dplyr, etc.
This project is licensed under the MIT License. See LICENSE for details.