
A project for analyzing the Gapminder dataset (1952–2007) using R, producing 12 visualizations that explore trends in life expectancy, GDP per capita, and population across continents. Includes a regression analysis, RMarkdown reporting, and automation via scripts, Makefile, and Docker.

hoangsonww/Gapminder-R-Analysis

Gapminder Data Analysis

A comprehensive R-based project that analyzes the Gapminder dataset (1952–2007) to explore global trends in life expectancy, GDP per capita, and population. The analysis produces a suite of 12 visualizations, fits a regression model, and renders an HTML report.

Table of Contents

  1. Project Overview
  2. Features & Charts
  3. Prerequisites
  4. Installation
  5. Usage
  6. File Structure
  7. Scripts & Automation
  8. Docker & Containerization
  9. Extending & Customizing
  10. Data Source & Citations
  11. License

Project Overview

This repository contains everything needed to:

  • Fetch and prepare the Gapminder dataset via the gapminder R package.
  • Compute key metrics (global averages, per-country snapshots).
  • Visualize 12 charts illustrating temporal trends, economic relationships, continental comparisons, and population impacts.
  • Model the relationship between life expectancy and GDP per capita.
  • Automate the workflow via shell scripts, a Makefile, and Docker.
  • Render a polished RMarkdown report (Gapminder_report.html).

Ideal for students, educators, data scientists, and anyone interested in global development metrics.
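
As a rough illustration of the "compute key metrics" step, the sketch below summarizes the dataset by year with dplyr. The function name `summarise_by_year` and the output column names are hypothetical, and the averages are unweighted means across countries, which is an assumption; the project's script may weight by population or use different names.

```r
library(dplyr)

# Hypothetical helper: per-year summary of a Gapminder-shaped data frame
# (columns year, lifeExp, gdpPercap, pop).
summarise_by_year <- function(df) {
  df %>%
    group_by(year) %>%
    summarise(
      mean_life_exp   = mean(lifeExp),       # unweighted mean across countries (assumption)
      mean_gdp_percap = mean(gdpPercap),     # unweighted mean across countries (assumption)
      total_pop       = sum(as.numeric(pop)), # as.numeric avoids integer overflow in the sum
      .groups = "drop"
    )
}

# The dataset ships with the gapminder package.
global_trends <- summarise_by_year(gapminder::gapminder)
print(global_trends)
```

The result has one row per survey year, which is the shape the three "global ... over time" line charts would need.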


Features & Charts

  1. Global Average Life Expectancy Over Time
  2. Global Average GDP per Capita Over Time
  3. Global Total Population Over Time
  4. Scatter: Life Expectancy vs. GDP per Capita (2007)
  5. Regression: Life Expectancy ~ log(GDP per Capita)
  6. Top 10 Countries by Life Expectancy (2007)
  7. Boxplot: Life Expectancy by Continent (2007)
  8. Violin: GDP per Capita by Continent (2007)
  9. Line: Average Life Expectancy by Continent Over Time
  10. Density: Life Expectancy Distribution by Continent (2007)
  11. Heatmap: Average Life Expectancy by Year & Continent
  12. Bubble Plot: GDP vs. Life Expectancy Sized by Population (2007)

Each plot is saved to gap-<index>.png and displayed in the R session.
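
A minimal sketch of how chart 5 might be produced and saved. Restricting the model to 2007, the plot styling, and the output location (a temporary directory, so the sketch does not overwrite project files) are all assumptions; the main script's own code may differ.

```r
library(ggplot2)
library(gapminder)

# Assumption: fit on the 2007 snapshot only.
gap_2007 <- subset(gapminder, year == 2007)

# Linear model of life expectancy on the natural log of GDP per capita.
fit <- lm(lifeExp ~ log(gdpPercap), data = gap_2007)
print(coef(fit))

# Scatter with a fitted line. Because the x axis is log10-scaled, geom_smooth
# fits on log10(gdpPercap); that is the same line as the natural-log model,
# expressed with a rescaled slope.
p <- ggplot(gap_2007, aes(x = gdpPercap, y = lifeExp)) +
  geom_point(alpha = 0.6) +
  geom_smooth(method = "lm", formula = y ~ x) +
  scale_x_log10(labels = scales::comma) +
  labs(
    title = "Life Expectancy ~ log(GDP per Capita), 2007",
    x = "GDP per capita (log scale)",
    y = "Life expectancy (years)"
  )

print(p)  # display in the R session

# Written to a temp directory here; the project saves to gap-<index>.png.
out_file <- file.path(tempdir(), "gap-5.png")
ggsave(out_file, plot = p, width = 8, height = 5, dpi = 150)
```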


Prerequisites

  • R ≥ 4.0
  • RStudio (optional, but recommended)
  • Internet connection (to install R packages and fetch data)

Required R Packages

  • ggplot2
  • dplyr
  • gapminder
  • scales
  • viridis
  • tidyr
  • forcats
  • zoo
  • rmarkdown (for report rendering)

The main script auto-installs any missing packages.
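
One common way to implement this kind of auto-install is sketched below. The helper names `missing_packages` and `ensure_packages` and the choice of CRAN mirror are hypothetical; the main script may do this differently.

```r
# Packages listed in this README.
required_packages <- c(
  "ggplot2", "dplyr", "gapminder", "scales", "viridis",
  "tidyr", "forcats", "zoo", "rmarkdown"
)

# Hypothetical helper: return the subset of pkgs that cannot be loaded.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: install whatever is missing (mirror URL is an assumption).
ensure_packages <- function(pkgs, repos = "https://cloud.r-project.org") {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install, repos = repos)
  }
  invisible(to_install)
}

# Report what is missing without installing anything.
message("Missing packages: ", paste(missing_packages(required_packages), collapse = ", "))

# Uncomment to install them:
# ensure_packages(required_packages)
```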


Installation

  1. Clone this repository:

    git clone https://github.com/hoangsonww/Gapminder-R-Analysis.git
    cd Gapminder-R-Analysis
  2. Install R (if not already installed).

  3. (Optional) Copy .env.example to .env to customize environment variables.


Usage

Run Analysis & Save Plots

bash scripts/run_gapminder.sh

All 12 charts are displayed and saved as gap-1.png … gap-12.png.

Render HTML Report

bash scripts/render_report.sh

This produces Gapminder_report.html.
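
The contents of render_report.sh are not shown here; a plausible R equivalent, assuming it wraps `rmarkdown::render`, is sketched below. The wrapper name `render_report` is hypothetical.

```r
# Hypothetical wrapper around rmarkdown::render(); file names follow this README.
render_report <- function(input = "Gapminder_Analysis.Rmd",
                          output = "Gapminder_report.html") {
  if (!file.exists(input)) {
    stop("RMarkdown source not found: ", input)
  }
  # The output format is taken from the Rmd's YAML header.
  rmarkdown::render(input = input, output_file = output)
}

# Only render when run from the repository root, where the Rmd lives.
if (file.exists("Gapminder_Analysis.Rmd")) {
  render_report()
}
```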

Combined via Make

make report

Equivalent to running the analysis and report steps in sequence.


File Structure

.
├── Gapminder_Analysis.R       # Main R script
├── Gapminder_Analysis.Rmd     # RMarkdown report
├── Gapminder_report.html      # Generated HTML report
├── gap-*.png                  # Saved plot images
├── scripts/
│   ├── run_gapminder.sh       # Runs the analysis script
│   └── render_report.sh       # Renders the RMarkdown
├── Makefile                   # Make targets for automation
├── Dockerfile                 # Container definition
├── docker-compose.yml         # Compose for containerized run
├── .env.example               # Sample environment config
└── README.md                  # This file

Scripts & Automation

  • scripts/run_gapminder.sh: runs Gapminder_Analysis.R to produce and save all plots.

  • scripts/render_report.sh: renders Gapminder_Analysis.Rmd to Gapminder_report.html.

  • Makefile

    • make analysis: run shell script to generate plots
    • make report: run analysis + render report
    • make clean: delete images & report

Docker & Containerization

Build and run in a container:

docker-compose up --build

This will:

  1. Install system and R dependencies
  2. Run the analysis script
  3. Render the RMarkdown report

All files are shared via a bind mount.


Extending & Customizing

  • Add more plots: insert new code chunks in Gapminder_Analysis.R or .Rmd.
  • Parameterize: use YAML or command-line args to filter years or continents.
  • CI/CD: integrate with GitHub Actions to auto-build the report on push.
  • Data sources: replace Gapminder with another tidy dataset for similar workflows.
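
The parameterization idea could be sketched with base R's `commandArgs()`. The flags `--continent=` and `--since=` and the helper names are hypothetical; the project does not currently define them.

```r
# Hypothetical flags: --continent=<name> and --since=<year>.
parse_filters <- function(args) {
  filters <- list(continent = NULL, since = NULL)
  for (arg in args) {
    if (startsWith(arg, "--continent=")) {
      filters$continent <- sub("^--continent=", "", arg)
    } else if (startsWith(arg, "--since=")) {
      filters$since <- as.integer(sub("^--since=", "", arg))
    }
  }
  filters
}

# Keep only rows matching the requested continent and starting year.
# Expects a data frame with `continent` and `year` columns, as in gapminder.
apply_filters <- function(df, filters) {
  if (!is.null(filters$continent)) {
    df <- df[df$continent == filters$continent, , drop = FALSE]
  }
  if (!is.null(filters$since)) {
    df <- df[df$year >= filters$since, , drop = FALSE]
  }
  df
}

# With no flags supplied, the data passes through unfiltered.
filters <- parse_filters(commandArgs(trailingOnly = TRUE))
```

With this in place, an invocation such as `Rscript Gapminder_Analysis.R --continent=Asia --since=1980` could restrict every chart to one continent and period, provided the script calls `apply_filters()` on the data before plotting.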

Data Source & Citations

  • Gapminder dataset via R package:

    install.packages("gapminder")
  • R Packages: see CRAN for ggplot2, dplyr, etc.


License

This project is licensed under the MIT License. See LICENSE for details.
