This is the master repository for the project. It pulls in all the other project repositories as Git submodules.
To clone this repository together with all of its submodules, run:
git clone --recursive https://github.com/spark-optimizations/join-optimizations.git
To generate the report, run:
make report
Below is a summary of the submodule repositories. For more details, see each submodule's README:
-
- Contains the code for building a Scala compiler plugin that optimizes Spark joins using column pruning.
-
- Contains an implementation of broadcast join on Spark RDDs.
-
- Contains benchmarking code for the Scala compiler plugin.
-
- Contains benchmarking code for the broadcast join.
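
For readers unfamiliar with the broadcast join mentioned above, here is a minimal sketch of the general idea in plain Python. It is not the code from the submodule and does not use Spark: the function name `broadcast_hash_join` and the sample data are hypothetical, and nested lists stand in for RDD partitions. It assumes the smaller dataset fits in memory, so it can be turned into a lookup table and shared with every partition of the larger dataset instead of shuffling both sides.

```python
from collections import defaultdict


def broadcast_hash_join(large_partitions, small_rows):
    """Inner-join (key, value) rows, assuming small_rows fits in memory.

    Hypothetical helper for illustration only; lists of lists stand in
    for the partitions of a large RDD.
    """
    # Build the lookup table once. In Spark this table would be the
    # value that is broadcast to every executor.
    lookup = defaultdict(list)
    for key, small_value in small_rows:
        lookup[key].append(small_value)

    joined = []
    # Each partition is joined locally against the lookup table,
    # so the large side is never shuffled.
    for partition in large_partitions:
        for key, large_value in partition:
            for small_value in lookup.get(key, []):
                joined.append((key, (large_value, small_value)))
    return joined


# Made-up sample data: two partitions of orders and a small user table.
orders = [[(1, "book"), (2, "pen")], [(1, "lamp"), (3, "desk")]]
users = [(1, "ann"), (2, "bob")]
result = broadcast_hash_join(orders, users)
print(result)
```

Keys that appear only on the large side (key 3 here) are dropped, which matches inner-join semantics. The trade-off is memory: the approach only pays off while the smaller dataset is small enough to copy to every worker.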