Developing Machine Learning Solutions
SageMaker Environments
Amazon SageMaker Studio is the recommended option to access SageMaker. It is a
web-based UI that provides access to all SageMaker environments and resources.
SageMaker Studio
This web-based interface gives access to all the actions you can use to develop ML
applications, such as preparing data and training, deploying, and monitoring models.
Applications
SageMaker Studio offers various applications, including the following:
JupyterLab: A tool to work with Jupyter notebooks, code, and data
Amazon SageMaker Canvas: A no-code machine learning tool to
generate predictions without needing to write any code
RStudio: An integrated development environment for the R language
Code Editor (based on Visual Studio Code): Another option to develop
code and notebooks while getting access to thousands of VS Code
compatible extensions
Automated ML
SageMaker JumpStart provides pretrained open-source models for a
range of problem types to help you get started with machine learning.
AutoML is available in SageMaker Canvas. It simplifies ML development
by automating the process of building and deploying machine learning
models.
Model evaluations assess large language models (LLMs) and generative
artificial intelligence (generative AI) models for quality and responsibility.
Sources of ML Models
Model Implementations
SageMaker supports pre-trained models, built-in algorithms, and custom Docker
images.
The following are ways to use SageMaker to build your ML model:
Pre-trained models require the least effort and are models ready to deploy
or to fine-tune and deploy using SageMaker JumpStart.
Built-in algorithms available in SageMaker require more effort, and they scale
well when the dataset is large and significant resources are needed to train and
deploy the model.
If no built-in solution works, you can develop your own by using prebuilt
images for supported machine learning and deep learning frameworks such as
scikit-learn, TensorFlow, PyTorch, MXNet, or Chainer.
You can build your own custom Docker image that is configured to install
the necessary packages or software.
Supervised Learning
SageMaker provides several built-in general-purpose algorithms that you can use for
either classification or regression problems.
Unsupervised Learning
SageMaker provides several built-in algorithms that can be used for unsupervised
learning tasks such as clustering, dimensionality reduction, topic modeling, pattern
recognition, and anomaly detection.
Image Processing
SageMaker also provides image processing algorithms that are used for image
classification, object detection, and computer vision.
Text Analysis
SageMaker also provides algorithms tailored to the analysis of text, such as text
classification and topic modeling.
SageMaker JumpStart
With SageMaker JumpStart, you can deploy, fine-tune, and evaluate pre-trained
models from the most popular model hubs.
SageMaker JumpStart provides pretrained open source models from leading
providers for a range of problem types to help you get started with machine
learning. You can incrementally train and tune these models before deployment.
SageMaker JumpStart also provides solution templates that set up infrastructure for
common use cases and runnable example notebooks for machine learning with
SageMaker.
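The following is a minimal sketch of deploying a JumpStart model with the SageMaker
Python SDK. The model ID, instance type, and request payload are placeholders; browse
JumpStart in SageMaker Studio for real model IDs.

```python
# Minimal sketch (not an official example): deploying a JumpStart model.
from sagemaker.jumpstart.model import JumpStartModel

model = JumpStartModel(model_id="example-model-id")  # placeholder model_id

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",  # choose an instance type your account supports
)

# The payload format depends on the specific model you deploy.
response = predictor.predict({"inputs": "sample input"})

predictor.delete_endpoint()  # clean up when finished
```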
Model Fit
Model fit is important for understanding the root cause of poor model accuracy. This
understanding will guide you to take corrective steps. You can determine whether a
predictive model is underfitting or overfitting the training data by looking at the
prediction error on the training data and the evaluation data.
Overfitting
Overfitting is when the model performs well on the training data but does not
perform well on the evaluation data. This is because the model memorized the data
it has seen and is unable to generalize to unseen examples.
Underfitting
Underfitting is when the model performs poorly on the training data. This is because
the model is unable to capture the relationship between the input examples (often
called X) and the target values (often called Y).
If your model is underfitting and performing poorly on the training data, it could be
that the model is too simple (the input features are not expressive enough) to
describe the target well.
Balanced
The model is balanced when it is not overfit or underfit to the training data.
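The following is a minimal sketch of diagnosing model fit by comparing training and
validation accuracy, using scikit-learn with a synthetic dataset. The thresholds are
illustrative only.

```python
# Minimal sketch: diagnose fit by comparing training and validation accuracy.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

model = DecisionTreeClassifier().fit(X_train, y_train)
train_acc = model.score(X_train, y_train)
val_acc = model.score(X_val, y_val)

if train_acc < 0.8:                      # poor fit even on training data
    print("Likely underfitting: model too simple or features not expressive enough")
elif train_acc - val_acc > 0.1:          # large gap between training and validation
    print("Likely overfitting: model memorized the training data")
else:
    print("Model is reasonably balanced")
```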
Bias and Variance
When evaluating models, both bias and variance contribute to errors the model
makes on unseen data, which affects its generalization.
A bullseye is a nice analogy because, generally speaking, the center of the bullseye
is where you aim your darts. The center of the bullseye in this situation is the label,
or target value, that your model is trying to predict, and each dot is a result that your
model produced during training.
Think about bias as the gap between your predicted value and the actual value,
whereas variance describes how dispersed your predicted values are.
In ML, the ideal algorithm has low bias and can accurately model the true
relationship. The ideal algorithm also has low variability, producing consistent
predictions across different datasets.
Balanced models have low bias and low variance.
Accuracy
To calculate the model's accuracy, also known as its score, add up the correct
predictions and then divide that number by the total number of predictions.
Although accuracy is a widely used metric for classification problems, it
has limitations. This metric is less effective when there are a lot of true
negative cases in your dataset. This is why two other metrics are often
used in these situations: precision and recall.
Precision
Precision removes the negative predictions from the picture. Precision is the
proportion of positive predictions that are actually correct. You can calculate it by
taking the true positive count and dividing it by the total number of positive
predictions (true positives plus false positives).
When the cost of false positives is high in your particular business situation,
precision can be a good metric. Think about a classification model that identifies
emails as spam or not. In this case, you do not want your model labeling a
legitimate email as spam and preventing your users from seeing that email.
Recall
In addition to precision, there is also recall (or sensitivity). Recall is the proportion
of actual positive cases that are correctly identified as positive. Recall is calculated
by dividing the true positive count by the sum of the true positives and false
negatives. By looking at that ratio, you get an idea of how good the algorithm is at
detecting, for example, cats.
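The following is a minimal sketch that computes accuracy, precision, and recall with
scikit-learn on a small set of illustrative labels.

```python
# Minimal sketch: accuracy, precision, and recall with scikit-learn.
from sklearn.metrics import accuracy_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # actual labels (1 = positive class, e.g., "cat")
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # model predictions

# Accuracy: correct predictions / total predictions
print("accuracy:", accuracy_score(y_true, y_pred))

# Precision: true positives / all positive predictions
print("precision:", precision_score(y_true, y_pred))

# Recall: true positives / (true positives + false negatives)
print("recall:", recall_score(y_true, y_pred))
```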
Business Metrics
In the previous section, you saw how to evaluate the performance of an ML model.
But remember that when a project is initiated, the business sets goals, and KPIs are
the metrics used to evaluate whether those goals are met.
To validate and monitor model performance, establish numerical metrics that
directly relate to the KPIs. These KPIs are established in the business goal
identification phase. They can include goals such as increasing sales, cutting costs,
or decreasing customer churn.
Evaluate whether the performance metrics accurately reflect the business's
tolerance for error. For instance, false positives might lead to excessive
maintenance costs in predictive maintenance use cases. Another example is
deciding whether acquiring a new customer is more expensive than retaining one. A
business should focus on numerical metrics, such as precision and recall, that help
differentiate the business requirements and align more closely with business value.
Consider developing custom metrics that tune the model directly for the business
objectives. One way is to develop a cost function to evaluate the economic impact
of the model. For the cost function, you can specify the cost, or value, of correct
predictions and the cost of errors.
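The following is a minimal sketch of such a cost function, built on a scikit-learn
confusion matrix. The dollar values are hypothetical and should come from your own
business analysis.

```python
# Minimal sketch: weigh prediction outcomes by their (hypothetical) business impact.
from sklearn.metrics import confusion_matrix

def business_cost(y_true, y_pred,
                  cost_false_positive=50.0,    # e.g., an unnecessary maintenance visit
                  cost_false_negative=500.0,   # e.g., a missed equipment failure
                  value_true_positive=100.0):  # value of a correct detection
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return (fp * cost_false_positive
            + fn * cost_false_negative
            - tp * value_true_positive)

# Lower cost is better; use this to compare candidate models against business value.
print(business_cost([1, 0, 1, 0], [1, 1, 0, 0]))
```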
By using A/B testing or canary deployments, developers can experiment with two or
more variants of a model and help achieve the business goals.
Model Deployment
Model Deployment Types
Model deployment is the integration of the model and its resources into a
production environment so that it can be used to create predictions.
Self-Hosted APIs
In a self-hosted API approach, you deploy and host your ML models on your
own infrastructure, either on premises or in the cloud (using virtual machines
or containers). This approach involves setting up and managing the
necessary infrastructure, such as web servers, load balancers, and
databases, to serve your ML models as APIs.
Managed API
Managed API services are cloud-based services that provide a fully managed
environment for deploying and hosting your ML models as APIs. SageMaker is
an example. These services abstract away the underlying infrastructure
management so you can focus on building and deploying your models.
Advantages of self-hosted APIs include greater control over the infrastructure,
potential cost savings (depending on usage), and the ability to customize the
deployment environment. However, this approach requires more operational
overhead and responsibility for managing and maintaining the infrastructure.
The choice between a managed API service or a self-hosted API for ML deployment
depends on factors such as the specific requirements of your use case, the level of
control and customization needed, the available resources and expertise, and cost
considerations.
SageMaker
SageMaker is a fully managed ML service. With SageMaker, data scientists and
developers can quickly and confidently build, train, and deploy ML models into a
production-ready, hosted environment. Within a few steps, you can deploy a model
into a secure and scalable environment.
SageMaker provides the following:
Deployment with one click or a single API call
Automatic scaling
Model hosting services
HTTPS endpoints that can host multiple models
You can use SageMaker to deploy a model to get predictions in several ways.
Real-Time
Real-time inference is ideal for inference workloads where you have real-time,
interactive, and low latency requirements.
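The following is a minimal sketch of deploying a trained model artifact to a real-time
SageMaker endpoint with the SageMaker Python SDK. The container image URI, model
artifact path, IAM role, and instance type are placeholders.

```python
# Minimal sketch: real-time endpoint deployment with the SageMaker Python SDK.
from sagemaker.model import Model

model = Model(
    image_uri="<inference-container-image-uri>",      # placeholder container image
    model_data="s3://my-bucket/model/model.tar.gz",    # placeholder model artifact
    role="<execution-role-arn>",                       # placeholder IAM role
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
)
# The endpoint now serves low-latency requests through predictor.predict(...)
```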
Batch Transform
Use batch transform when you need to get inferences from large datasets and don't
need a persistent endpoint. You can also use it when you need to preprocess
datasets to remove noise or bias that interferes with training or inference from your
dataset.
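The following is a minimal sketch of a batch transform job that reuses the hypothetical
model object from the real-time sketch. The S3 paths are placeholders.

```python
# Minimal sketch: batch transform over a dataset in Amazon S3, no persistent endpoint.
transformer = model.transformer(            # 'model' from the real-time sketch above
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/batch-output/",
)

transformer.transform(
    data="s3://my-bucket/batch-input/",
    content_type="text/csv",
    split_type="Line",
)
transformer.wait()  # results are written to the S3 output path
```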
Asynchronous
SageMaker asynchronous inference is a capability in SageMaker that queues
incoming requests and processes them asynchronously. This option is ideal for
requests with large payload sizes (up to 1GB), long processing times (up to one
hour), and near real-time latency requirements.
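The following is a minimal sketch of deploying the same hypothetical model behind an
asynchronous endpoint. The output path is a placeholder.

```python
# Minimal sketch: asynchronous endpoint that queues requests and writes results to S3.
from sagemaker.async_inference import AsyncInferenceConfig

async_config = AsyncInferenceConfig(
    output_path="s3://my-bucket/async-results/",  # where responses are stored
)

async_predictor = model.deploy(                   # 'model' from the real-time sketch above
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
    async_inference_config=async_config,
)
```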
Serverless
On-demand serverless inference is ideal for workloads that have idle periods
between traffic spurts and can tolerate cold starts. It is a purpose-built inference
option that you can use to deploy and scale ML models without configuring or
managing any of the underlying infrastructure.
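The following is a minimal sketch of a serverless deployment of the same hypothetical
model. The memory size and concurrency values are illustrative.

```python
# Minimal sketch: serverless inference, where SageMaker manages the underlying compute.
from sagemaker.serverless import ServerlessInferenceConfig

serverless_config = ServerlessInferenceConfig(
    memory_size_in_mb=2048,   # memory allocated per worker
    max_concurrency=5,        # maximum concurrent invocations
)

serverless_predictor = model.deploy(              # 'model' from the real-time sketch above
    serverless_inference_config=serverless_config,
)
```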
Fundamental Concepts of MLOps
MLOps
MLOps combines people, technology, and processes to deliver collaborative ML
solutions.
MLOps refers to the practice of operationalizing and streamlining the end-to-end
machine learning lifecycle from model development and deployment to monitoring
and maintenance. It helps ensure that models are not just developed but also
deployed, monitored, and retrained systematically and repeatedly.
It is an extension of the DevOps principles and practices to the specific domain of
machine learning systems.
Like DevOps, MLOps relies on a collaborative and streamlined approach to the
machine learning development lifecycle. It is the intersection of people, process,
and technology that optimizes the end-to-end activities required to develop, build,
and operate machine learning workloads.
Using MLOps
Applications that expose trained models might have different hosting requirements
and strategies than standard applications. Trained models are sensitive to changes
in data; therefore, a model-based application that works well when first
implemented might not perform as well days, weeks, or months after being
implemented. To account for these differences, you need different processes and
procedures for managing applications that are based on ML.
MLOps accounts for the unique aspects of artificial intelligence and machine
learning (AI/ML) projects in project management, continuous integration and
delivery (CI/CD), and quality assurance. With it, you can improve delivery time,
reduce defects, and make data science more productive.
Goals of MLOps
A goal of MLOps is to get ML workloads into production and keep them operating. To
meet this goal, MLOps adopts many DevOps principles and practices for the
development, training, deployment, monitoring, and retraining of machine learning
models. The aim is to use MLOps to do the following:
Increase the pace of the model development lifecycle through automation.
Improve quality metrics through testing and monitoring.
Promote a culture of collaboration between data scientists, data engineers,
software engineers, and IT operations.
Provide transparency, explainability, auditability, and security of the models by
using model governance.
Benefits of MLOps
Adopting MLOps practices gives you faster time-to-market for ML projects by
delivering the following benefits.
Productivity
By providing self-service environments with access to curated datasets, data
engineers and data scientists can move faster and waste less time with
missing or invalid data.
Reliability
By incorporating CI/CD practices, developers can deploy quickly with
increased quality and consistency.
Repeatability
By automating all the steps in the machine learning development lifecycle,
you can ensure a repeatable process, including how the model is trained,
evaluated, versioned, and deployed.
Auditability
By versioning all inputs and outputs, from data science experiments to source
data to trained models, you can demonstrate exactly how the model was built
and where it was deployed.
Data and model quality
With MLOps, you can enforce policies that guard against model bias and track
changes to data statistical properties and model quality over time.
ML lifecycle
Managing code, data, and models throughout the ML lifecycle requires the following
touchpoints:
Processing code in data preparation
Training data and training code in model building
Candidate models, test, and validation data in model evaluation
Metadata during model selection
Deployment-ready models and inference code during deployment
Production code, models, and data for monitoring
With MLOps, you operationalize the processes around ML model development,
deployment, monitoring, and governance.
Implementing MLOps
The following is an example of an end-to-end automation process. A
productionized ML lifecycle typically contains separate training and deployment
pipelines.
Model build
The model building pipeline creates new models upon initiation, for example when
new data becomes available.
Model evaluation
When the model building pipeline completes, you can implement quality control
measures at the model registration step. The quality control step can be either
manual (human in the loop) or automated.
If a model meets baseline performance metrics, it can be registered with a model
registry.
Model approval
You can use the registry to approve or reject model versions. The model approval
action can act as an initiation to start the deployment pipeline.
Model deployment
The deployment pipeline is most similar to traditional CI/CD systems. This pipeline
includes steps such as the following:
Source
Build
Deployment to staging environment
Testing
Promotion to production environment
Model in production
As soon as the model is in production, you should get feedback from the live
system. For ML solutions, monitor the hosting infrastructure, data quality, and
model performance.
Prepare data
SageMaker Data Wrangler is a low-code no-code (LCNC) tool that provides an end-to-end solution to
import, prepare, transform, featurize, and analyze data by using a web interface.
By using the SageMaker Processing API, data scientists can run scripts and
notebooks to process, transform, and analyze datasets with various ML frameworks such
as scikit-learn, MXNet, or PyTorch while benefiting from fully managed machine
learning environments.
Store features
SageMaker Feature Store helps data scientists, machine learning engineers, and
general practitioners to create, share, and manage features for ML development.
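The following is a minimal sketch of creating a feature group and ingesting a small
pandas DataFrame with the SageMaker Python SDK. The feature group name, columns, S3
location, and role are placeholders.

```python
# Minimal sketch: create a feature group and ingest illustrative records.
from sagemaker.feature_store.feature_group import FeatureGroup
from sagemaker.session import Session
import pandas as pd
import time

# Illustrative feature data with a record identifier and an event-time column
df = pd.DataFrame({
    "customer_id": [1, 2],
    "total_purchases": [3, 7],
    "event_time": [time.time(), time.time()],
})

feature_group = FeatureGroup(name="customers-feature-group", sagemaker_session=Session())
feature_group.load_feature_definitions(data_frame=df)   # infer feature types from the DataFrame

feature_group.create(
    s3_uri="s3://my-bucket/feature-store/",     # offline store location (placeholder)
    record_identifier_name="customer_id",
    event_time_feature_name="event_time",
    role_arn="<execution-role-arn>",            # placeholder IAM role
    enable_online_store=True,
)
feature_group.ingest(data_frame=df, max_workers=1, wait=True)
```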
Train
SageMaker provides a training job feature to train models using built-in algorithms
or custom algorithms.
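The following is a minimal sketch of launching a training job with a generic Estimator.
The algorithm container image, hyperparameters, S3 paths, and role are placeholders.

```python
# Minimal sketch: a SageMaker training job using an algorithm container image.
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

estimator = Estimator(
    image_uri="<algorithm-container-image-uri>",   # placeholder container image
    role="<execution-role-arn>",                   # placeholder IAM role
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/model-artifacts/",
)
estimator.set_hyperparameters(num_round=100)       # hyperparameters depend on the algorithm

estimator.fit({"train": TrainingInput("s3://my-bucket/train/", content_type="text/csv")})
```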
Experiments
Use SageMaker Experiments to experiment with multiple combinations of data,
algorithms, and parameters, all while observing the impact of incremental changes
on model accuracy.
Processing job
SageMaker Processing refers to the capability to run data pre-processing and post-
processing, feature engineering, and model evaluation tasks on the SageMaker fully
managed infrastructure.
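The following is a minimal sketch of running a preprocessing script as a processing job
with the scikit-learn processor. The script name, framework version, S3 paths, and role
are placeholders.

```python
# Minimal sketch: a SageMaker Processing job that runs a preprocessing script.
from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.processing import ProcessingInput, ProcessingOutput

processor = SKLearnProcessor(
    framework_version="1.2-1",          # pick a version available in your Region
    role="<execution-role-arn>",        # placeholder IAM role
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

processor.run(
    code="preprocess.py",               # your processing script (placeholder)
    inputs=[ProcessingInput(source="s3://my-bucket/raw/",
                            destination="/opt/ml/processing/input")],
    outputs=[ProcessingOutput(source="/opt/ml/processing/output",
                              destination="s3://my-bucket/processed/")],
)
```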
Registry
With SageMaker Model Registry, you can catalog models, manage model versions,
manage the approval status of models, and deploy models to production.
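The following is a minimal sketch of registering a model version, reusing the
hypothetical model object from the deployment sketches. The model package group name is
a placeholder.

```python
# Minimal sketch: register a model version and leave it pending manual approval.
model_package = model.register(                 # 'model' from the deployment sketches above
    model_package_group_name="my-model-group",
    content_types=["text/csv"],
    response_types=["text/csv"],
    inference_instances=["ml.m5.xlarge"],
    transform_instances=["ml.m5.xlarge"],
    approval_status="PendingManualApproval",    # approve or reject later in the registry
)
```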
Deployments
With SageMaker, you can deploy your ML models to make predictions, also known
as inference. SageMaker provides a broad selection of ML infrastructure and model
deployment options to help meet all your ML inference needs.
Monitor model
With SageMaker Model Monitor, you can monitor the quality of SageMaker ML
models in production.
Pipelines
You can use Amazon SageMaker Model Building Pipelines to create end-to-end
workflows that manage and deploy SageMaker jobs.
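The following is a minimal sketch of a two-step pipeline that chains the hypothetical
processor and estimator objects from the earlier sketches. Step names and paths are
placeholders.

```python
# Minimal sketch: a SageMaker pipeline with a processing step followed by a training step.
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import ProcessingStep, TrainingStep

process_step = ProcessingStep(
    name="PrepareData",
    processor=processor,            # SKLearnProcessor from the processing sketch
    code="preprocess.py",           # placeholder processing script
)
train_step = TrainingStep(
    name="TrainModel",
    estimator=estimator,            # Estimator from the training sketch
    inputs={"train": "s3://my-bucket/processed/"},
)

pipeline = Pipeline(name="my-ml-pipeline", steps=[process_step, train_step])
pipeline.upsert(role_arn="<execution-role-arn>")   # create or update the pipeline definition
execution = pipeline.start()                       # run the workflow end to end
```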