Commit 5e17778

adding data description
1 parent e4eafb0 commit 5e17778

1 file changed: 10 additions, 4 deletions

README.md

Lines changed: 10 additions & 4 deletions
@@ -5,9 +5,9 @@
 
 This example was created for the [2021 fall lecture series](https://datascience.stanford.edu/news/center-open-and-reproducible-science-cores-fall-lecture-series) of [Stanford's Center for Open and REproducible Science (CORES)](https://datascience.stanford.edu/cores).
 
-The goal of this analysis is to study the effect of varying different hyper-parameters of the training of a simple classification model on its performance in sklearn's handwritten digit dataset.
+The goal of this analysis is to study the effect of varying different hyper-parameters of the training of a simple classification model on its performance in scikit-learn's handwritten digit dataset.
 
-Specifically, we will study the effect of varying the learning rate, regularisation strength, number of gradient descent iterations, and random shuffling of the data on the cross-validated performance of [sklearn's default linear one-vs-rest SVM classifier](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDClassifier.html).
+Specifically, we will study the effect of varying the learning rate, regularisation strength, number of gradient descent iterations, and random shuffling of the data on the cross-validated performance of [scikit-learn's default linear one-vs-rest SVM classifier](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDClassifier.html).
 
 Each hyper-parameter is varied individually, while all other hyper-parameters are set to default values (see [scripts/evaluate_hyper_params_effect.py](scripts/evaluate_hyper_params_effect.py))
 
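For readers of this diff, the analysis described in the README text above could be sketched roughly as follows. This is a minimal illustration, not the repository's `scripts/evaluate_hyper_params_effect.py`: the function name `evaluate_hyper_param`, the example grid for `alpha`, the fixed seed, and the choice of 3 folds are all assumptions made for this sketch. It only relies on scikit-learn's `load_digits`, `SGDClassifier`, and `cross_val_score`.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import cross_val_score


def evaluate_hyper_param(name, values, cv=3, seed=0):
    """Return {value: mean cross-validated accuracy} for one hyper-parameter.

    Hypothetical helper: only `name` is varied; every other
    SGDClassifier hyper-parameter keeps scikit-learn's default.
    """
    # The digits dataset ships with scikit-learn, so no download is needed.
    X, y = load_digits(return_X_y=True)
    scores = {}
    for value in values:
        # The default loss is "hinge", i.e. a linear SVM fitted with SGD;
        # multi-class problems are handled one-vs-rest.
        clf = SGDClassifier(random_state=seed, **{name: value})
        scores[value] = cross_val_score(clf, X, y, cv=cv).mean()
    return scores


if __name__ == "__main__":
    # Assumed example grid for the regularisation strength (alpha).
    for alpha, acc in evaluate_hyper_param("alpha", [1e-4, 1e-2]).items():
        print(f"alpha={alpha:g}: mean accuracy {acc:.3f}")
```

The same helper could be called with `"max_iter"` or `"shuffle"` to vary the number of iterations or the shuffling. Varying the learning rate would additionally require a non-default `learning_rate` schedule, since `eta0` is not used by the default `"optimal"` schedule.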

@@ -21,7 +21,7 @@ Each hyper-parameter is varied individually, while all other hyper-parameters ar
 ├── pyproject.toml <- Lists all dependencies
 ├── README.md <- This README file.
 ├── data/
-| └── <- A copy of the handwritten digit dataset provided by sklearn
+| └── <- A copy of the handwritten digit dataset provided by scikit-learn
 |
 ├── results/
 | ├── estimates/
@@ -46,6 +46,12 @@ Each hyper-parameter is varied individually, while all other hyper-parameters ar
 └── setup.py <- makes project pip-installable (pip install -e .) so that 'src' can be imported
 ```
 
+## Data description
+
+We use the handwritten digits dataset provided by [scikit-learn](https://scikit-learn.org/stable/). For details on this dataset, see scikit-learn's documentation:
+
+https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html
+
 
 ## Installation
 

@@ -82,7 +88,7 @@ Our Makefile provides the following analysis targets:
 | Analysis target | Description |
 | --- | ----------- |
 | all | Runs the entire analysis pipeline |
-| load | Downloads sklearn's handwritten digit dataset |
+| load | Downloads scikit-learn's handwritten digit dataset |
 | evaluate | Runs our cross-validated hyper-parameter evaluation |
 | plot | Summarizes results of evaluation in a figure |
 
