The Fundamentals of Machine Learning
Preface. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
2. End-to-End Machine Learning Project. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Working with Real Data 39
Look at the Big Picture 41
Frame the Problem 41
Select a Performance Measure 43
Check the Assumptions 46
Get the Data 46
Running the Code Examples Using Google Colab 46
Saving Your Code Changes and Your Data 48
The Power and Danger of Interactivity 49
Book Code Versus Notebook Code 50
Download the Data 50
Take a Quick Look at the Data Structure 51
Create a Test Set 55
Explore and Visualize the Data to Gain Insights 60
Visualizing Geographical Data 61
Look for Correlations 63
Experiment with Attribute Combinations 66
Prepare the Data for Machine Learning Algorithms 67
Clean the Data 68
Handling Text and Categorical Attributes 71
Feature Scaling and Transformation 75
Custom Transformers 79
Transformation Pipelines 83
Select and Train a Model 88
Train and Evaluate on the Training Set 88
Better Evaluation Using Cross-Validation 89
Fine-Tune Your Model 91
Grid Search 91
Randomized Search 93
Ensemble Methods 95
Analyzing the Best Models and Their Errors 95
Evaluate Your System on the Test Set 96
Launch, Monitor, and Maintain Your System 97
Try It Out! 100
Exercises 101
3. Classification. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
MNIST 103
Training a Binary Classifier 106
Performance Measures 107
iv | Table of Contents
Measuring Accuracy Using Cross-Validation 107
Confusion Matrices 108
Precision and Recall 110
The Precision/Recall Trade-off 111
The ROC Curve 115
Multiclass Classification 119
Error Analysis 122
Multilabel Classification 125
Multioutput Classification 127
Exercises 129
SVM Regression 184
Under the Hood of Linear SVM Classifiers 186
The Dual Problem 189
Kernelized SVMs 190
Exercises 193
Preserving the Variance 243
Principal Components 244
Projecting Down to d Dimensions 245
Using Scikit-Learn 246
Explained Variance Ratio 246
Choosing the Right Number of Dimensions 247
PCA for Compression 249
Randomized PCA 250
Incremental PCA 250
Random Projection 252
LLE 254
Other Dimensionality Reduction Techniques 256
Exercises 257
The CategoryEncoding Layer 463
The StringLookup Layer 465
The Hashing Layer 466
Encoding Categorical Features Using Embeddings 466
Text Preprocessing 471
Using Pretrained Language Model Components 473
Image Preprocessing Layers 474
The TensorFlow Datasets Project 475
Exercises 477
15. Processing Sequences Using RNNs and CNNs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 537
Recurrent Neurons and Layers 538
Memory Cells 540
Input and Output Sequences 541
Training RNNs 542
Forecasting a Time Series 543
The ARMA Model Family 549
Preparing the Data for Machine Learning Models 552
Forecasting Using a Linear Model 555
Forecasting Using a Simple RNN 556
Forecasting Using a Deep RNN 557
Forecasting Multivariate Time Series 559
Forecasting Several Time Steps Ahead 560
Forecasting Using a Sequence-to-Sequence Model 562
Handling Long Sequences 565
Fighting the Unstable Gradients Problem 565
Tackling the Short-Term Memory Problem 568
Exercises 576
Stacked Autoencoders 640
Implementing a Stacked Autoencoder Using Keras 641
Visualizing the Reconstructions 642
Visualizing the Fashion MNIST Dataset 643
Unsupervised Pretraining Using Stacked Autoencoders 644
Tying Weights 645
Training One Autoencoder at a Time 646
Convolutional Autoencoders 648
Denoising Autoencoders 649
Sparse Autoencoders 651
Variational Autoencoders 654
Generating Fashion MNIST Images 658
Generative Adversarial Networks 659
The Difficulties of Training GANs 663
Deep Convolutional GANs 665
Progressive Growing of GANs 668
StyleGANs 671
Diffusion Models 673
Exercises 681
B. Autodiff. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 785
Index. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 811