Google AI - ML - Caltech

Program Management
Certificate Program
Developed for Applied Materials
by
California Institute of Technology
Center for Technology and Management Education
1200 East California Blvd., Mail Code 202-81
Pasadena, California 91125
Phone: 626.395.4042
©Caltech https://ctme.caltech.edu
Google Cloud AI/ML
11/16/2022
Patrick Alexander
Google - Customer Engineer
Ex-Microsoft - Cloud Solution Architect
Board Member of AACIT (Caltech Flying Club)
Based @LosAngeles
PatrickGCP@Google.com
CTME | Teaching Team (caltech.edu)

Google Office Spruce Goose
Playa Vista - California
Spruce Goose
Hughes H-4 Hercules
https://en.wikipedia.org/wiki/Hughes_H-4_Hercules
Agenda
● Data Science vs Data Engineering
● AI/ML
● Machine Learning Lifecycle
● Google toolset and decision tree to find proper tools
● Productionizing Machine Learning Models (MLOps)
● Q&A
Telling a story with Data
Data Engineer
● Develops, constructs, tests, and maintains architecture such as databases and large-scale processing systems.
● Ensures the continuous flow of data and pipelines.
● Provides data in usable formats to the data scientists who run queries and algorithms against the information for predictive analytics, machine learning, and data mining applications.
● Delivers aggregated data to business executives, analysts, and other end users to improve business operations.
Data Scientist
● Dives deep into data and provides meaningful business insights from it that are crucial for decision making in the company.
● Works on building and deploying AI-based algorithms in various aspects of the business to solve business problems.
Data Engineer Skills Data Scientist Skills
● Database Architecture ● Problem Solving
● Data warehousing ● Machine Learning
● ETL tooling ● Programming
● Big Data ● Mathematics
● Advanced analytics ● Statistical modeling
● Programming ● Business acumen
● Communications ● Communication
● Presentation
Data Engineer Tasks
● Data acquisition
● Manage and organize data
● Design, build and test data architecture
● Prepare data for modeling
● Discover tasks that can be automated
● Develop data set processes
Data Scientist Tasks
● Develop hypotheses
● Gather, organize and clean data
● Apply algorithms to test hypotheses
● Identify patterns
● Provide data-led business solutions to business challenges
Proprietary + Confidential
AI/ML 101
AI & ML
What is AI?
Simply put, AI is the creation of software that imitates human behaviors and capabilities. Key
elements include:
• Machine learning - This is often the foundation for an AI system, and is the way we "teach" a
computer model to make predictions and draw conclusions from data.
• Anomaly detection - The capability to automatically detect errors or unusual activity in a system.
• Computer vision - The capability of software to interpret the world visually through cameras,
video, and images.
• Natural language processing - The capability for a computer to interpret written or spoken
language and respond in kind.
• Conversational AI - The capability of a software "agent" to participate in a conversation.
How machine learning works
So how do machines learn?
The answer is, from data.
Understand anomaly detection
Imagine you're creating a software system to monitor credit card transactions and detect unusual
usage patterns that might indicate fraud. Or an application that tracks activity in an automated
production line and identifies failures. Or a racing car telemetry system that uses sensors to
proactively warn engineers about potential mechanical failures before they happen.
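As a rough illustration of the credit card scenario, the sketch below flags transaction amounts that sit far from the typical value. It uses the modified z-score (median and median absolute deviation), which is only one of many possible techniques and is not taken from the slides. The function name, the 3.5 threshold, and the sample amounts are all assumptions made for the example.

```python
from statistics import median


def find_anomalies(amounts, threshold=3.5):
    """Return (index, amount) pairs whose modified z-score exceeds threshold.

    The modified z-score uses the median and the median absolute deviation
    (MAD), so a single extreme value does not distort the baseline.
    """
    center = median(amounts)
    mad = median(abs(a - center) for a in amounts)
    if mad == 0:
        # All values are (almost) identical: nothing can be ranked as unusual.
        return []
    return [
        (i, a)
        for i, a in enumerate(amounts)
        if 0.6745 * abs(a - center) / mad > threshold
    ]


# Hypothetical card transactions in dollars; one is clearly out of pattern.
transactions = [42.0, 38.5, 51.2, 47.9, 40.3, 44.1, 39.8, 2500.0, 45.6, 43.2]
print(find_anomalies(transactions))
```

A production fraud system would normally learn from many features (merchant, location, time of day) rather than the amount alone; the single-feature version is kept only to show the idea of scoring each point against a learned notion of "normal".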
Understand computer vision
Computer Vision is an area of AI that deals with visual processing. Let's explore some of the
possibilities that computer vision brings.
Understand natural language processing
Natural language processing (NLP) is the area of AI that deals with creating software that
understands written and spoken language.
NLP enables you to create software that can:

• Analyze and interpret text in documents, email messages, and other sources.
• Interpret spoken language and synthesize speech responses.
• Automatically translate spoken or written phrases between languages.
• Interpret commands and determine appropriate actions.
Understand conversational AI
Conversational AI is the term used to describe solutions where AI agents participate in
conversations with humans. Most commonly, conversational AI solutions use bots to manage
dialogs with users. These dialogs can take place through web site interfaces, email, social media
platforms, messaging systems, phone calls, and other channels.
Bots can be the basis of AI solutions for:

• Customer support for products or services.
• Reservation systems for restaurants, airlines, cinemas, and other appointment-based
businesses.
• Health care consultations and self-diagnosis.
• Home automation and personal digital assistants.
Understand responsible AI
Artificial Intelligence is a powerful tool that can be used to greatly benefit the world. However,
like any tool, it must be used responsibly.
Fairness
AI systems should treat all people fairly.
Reliability and safety

AI systems should perform reliably and safely.
Privacy and security

AI systems should be secure and respect privacy.
Inclusiveness
AI systems should empower everyone and engage people.
Transparency
AI systems should be understandable.
Accountability
What is machine learning?
Machine learning is a technique that uses mathematics and statistics to create a model that can
predict unknown values.
Mathematically, you can think of machine learning as a way of defining a function (let's call it f) that
operates on one or more features of something (which we'll call x) to calculate a predicted label (y)
- like this:
f(x) = y
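To make f(x) = y concrete, here is a minimal sketch in which "training" produces the function f from example pairs of x and y. It assumes a single numeric feature and a straight-line relationship fitted by ordinary least squares; the helper name fit_line and the toy data are illustrative only.

```python
def fit_line(xs, ys):
    """Fit y = w * x + b by least squares and return the learned function f."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = sxy / sxx
    b = mean_y - w * mean_x

    def f(x):
        # The trained model: maps a feature value x to a predicted label y.
        return w * x + b

    return f


# Toy training data: features xs with known labels ys.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

f = fit_line(xs, ys)
print(f(5.0))  # prediction for an x the model has not seen
```

The point of the sketch is the shape of the workflow: known (x, y) pairs go in, a function f comes out, and f is then applied to new x values.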
The most common types of machine learning models are:

• Classification
• Regression
• Clustering
• Time series forecasting
Kinds of machine learning
There are three main categories of machine learning: supervised learning, unsupervised
learning, and reinforcement learning.
Supervised learning
In supervised learning, each data point is labeled or associated with a category or value of
interest.
Unsupervised learning
In unsupervised learning, data points have no labels associated with them.
Reinforcement learning
In reinforcement learning, the algorithm gets to choose an action in response to each data
point.
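The difference between labeled and unlabeled data can be seen in a small unsupervised example. The sketch below groups one-dimensional points with k-means, which receives no labels at all and only discovers which points sit close together. The function name, the starting centroids, and the data are assumptions for illustration.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Cluster 1-D points around the given starting centroids.

    Returns the final centroids and the points assigned to each one.
    """
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its points.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Unlabeled data: no category or target value is attached to any point.
points = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])
print(centroids)
print(clusters)
```

In a supervised setting the same points would arrive with a label attached, and the algorithm would learn to reproduce that label instead of inventing its own groups.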

Supervised Learning, Unsupervised Learning, Reinforcement Learning
Regression is a form of machine learning that is used to predict a numeric label based on an
item's features. For example, an automobile sales company might use the characteristics of a car
(such as engine size, number of seats, mileage, and so on) to predict its likely selling price. In
this case, the characteristics of the car are the features, and the selling price is the label.
Regression is an example of a supervised machine learning technique in which you train a model
using data that includes both the features and known values for the label, so that the model
learns to fit the feature combinations to the label. Then, after training has been completed, you
can use the trained model to predict labels for new items for which the label is unknown.
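The car example can be sketched in a few lines. The version below uses k-nearest-neighbours regression, chosen here only because it is short; the slides do not say which regression algorithm is meant. The feature layout, the prices, and the function name are hypothetical.

```python
import math


def predict_price(cars, query, k=2):
    """Predict a price as the average price of the k most similar cars.

    cars is a list of (features, price) pairs with numeric feature tuples.
    """
    by_distance = sorted(cars, key=lambda car: math.dist(car[0], query))
    nearest = by_distance[:k]
    return sum(price for _, price in nearest) / len(nearest)


# Hypothetical training data.
# Features: (engine size in litres, number of seats, mileage in 10,000 miles)
# Label: selling price in dollars.
cars = [
    ((1.2, 5, 8.0), 6000.0),
    ((1.6, 5, 4.0), 11000.0),
    ((2.0, 5, 2.0), 18000.0),
    ((3.0, 2, 1.0), 35000.0),
]

# A new car whose price (the label) is unknown.
print(predict_price(cars, (1.8, 5, 3.0)))
```

Real feature sets usually need scaling so that one feature (for example mileage) does not dominate the distance; that step is left out to keep the sketch small.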
Classification is a form of machine learning that is used to predict which category, or class, an
item belongs to. For example, a health clinic might use the characteristics of a patient (such as
age, weight, blood pressure, and so on) to predict whether the patient is at risk of diabetes. In
this case, the characteristics of the patient are the features, and the label is a classification of
either 0 or 1, representing non-diabetic or diabetic.
Classification is an example of a supervised machine learning technique in which you train a
model using data that includes both the features and known values for the label, so that the
model learns to fit the feature combinations to the label. Then, after training has been
completed, you can use the trained model to predict labels for new items for which the label is
unknown.
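The clinic example can be sketched with a nearest-centroid classifier: training averages the features of each class, and prediction picks the class whose average is closest. This is an illustrative stand-in, not the method used in the course, and the patient records below are invented for the example.

```python
import math


def train_centroids(samples):
    """Average the feature vectors of each class.

    samples is a list of (features, label) pairs; returns {label: centroid}.
    """
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(v / counts[label] for v in acc)
        for label, acc in sums.items()
    }


def classify(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


# Hypothetical patients: (age, weight in kg, systolic blood pressure).
# Label 0 = non-diabetic, 1 = diabetic.
patients = [
    ((25, 60, 110), 0),
    ((30, 65, 115), 0),
    ((35, 70, 120), 0),
    ((55, 90, 140), 1),
    ((60, 95, 150), 1),
    ((65, 100, 145), 1),
]

model = train_centroids(patients)
print(classify(model, (58, 92, 142)))
print(classify(model, (28, 62, 112)))
```

The structure mirrors the regression case: features and known labels go into training, and the trained model is then asked about patients whose label is unknown. Only the label type changes, from a number to a class.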
Advanced, embedded AI & ML capabilities
● Democratize AI/ML: AutoML and BigQuery ML enable teams without data scientists to use AI/ML with capabilities not offered by competitors
● Depth of AI/ML: Cloud Speech-to-Text supports 120 language variants, more than AWS, Azure
● Accelerate time to decision: Google TPUs 15-30x faster than contemporary GPUs and CPUs
Google is a Leader in Computer Vision Platforms
Mapping the business objective to an AI absorption strategy
● Use AI out of the box to maximize the value AI delivers into business workflows
● Deploy custom AI to extract value from your data to differentiate
● Build end-to-end AI to have material impact on competitive differentiation
Speed / Effort / Customization
AI Platform for every level of expertise
● Pre-trained APIs: No training data needed, get started right away
● Custom AI with AutoML: Easily create custom models (A no-code approach)
● End-to-end AI with core tools: Help data scientists and ML engineers build and deploy AI
Which capabilities to choose?
Design your ML workflow
BigQuery ML
● Descriptive and predictive modeling on structured data
● Hyper-parameter tuning
● Feature engineering
● Explainability
● Simple SQL code
AutoML Models in Vertex AI
● Predictive modeling on structured & unstructured data
● Hyper-parameter tuning
● Feature engineering
● Explainability
● No code
End-to-end AI with Vertex AI
● Custom models on pre-built frameworks
● NoOps, serverless training with hyperparameter tuning
● Explainability
● Custom code
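To give a feel for what "Simple SQL code" means for BigQuery ML, the sketch below builds the two statements typically involved: a CREATE MODEL statement that trains on a table, and an ML.PREDICT query that scores new rows. The SQL is only assembled as text here and is not sent to BigQuery. The helper names and the dataset, table, and column names are hypothetical.

```python
def create_model_sql(model, source_table, label, model_type="linear_reg"):
    """Build a BigQuery ML training statement as a string."""
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = '{model_type}', input_label_cols = ['{label}']) AS\n"
        f"SELECT * FROM `{source_table}`"
    )


def predict_sql(model, input_table):
    """Build a BigQuery ML prediction query as a string."""
    return f"SELECT * FROM ML.PREDICT(MODEL `{model}`, TABLE `{input_table}`)"


# Hypothetical names, used only to show the shape of the statements.
train_statement = create_model_sql(
    model="demo_dataset.car_price_model",
    source_table="demo_dataset.car_sales",
    label="selling_price",
)
score_statement = predict_sql(
    model="demo_dataset.car_price_model",
    input_table="demo_dataset.new_cars",
)
print(train_statement)
print(score_statement)
```

Either statement could be pasted into the BigQuery console or submitted through a client library; no separate training infrastructure needs to be set up.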
Google Cloud enables your AI journey
● Pre-packaged AI solutions: Early stage enterprises. Pre-built, pre-trained APIs
● A new middle pathway: Majority of enterprises. Cloud AutoML
● Custom ML models: Advanced stage enterprises. Vertex AI platform
TensorFlow: Open Source Machine Learning Model
Cloud TPUs: Hardware optimised for machine learning
Google Cloud
Machine Learning tools for everyone
● Business User (Insights and objectives): Interactive BI (Looker); EDW in a Spreadsheet (Connected Sheets)
● Data analyst (Query and analyze): Endless EDW (BigQuery); Self-managed data pipelines (Data Fusion, Dataflow)
● Data engineer (Get clean, useful data): Self-driving infra (BigQuery, Dataflow, Composer); Broad choice of tools/language (Dataproc, Dataflow)
● Data scientist (Models that work): Portable notebooks (Notebooks); Model eval and selection (Explainable AI, TensorBoard)
● ML developer (Intelligent apps): Images, videos (Vision, Video Intelligence); Sentiment analysis, entity extraction (NL, Translation)
● ML engineer (Models in production): Scalable model hosting (Prediction); ML CI/CD and orchestration (Pipelines)
Point-and-click
Data models, Data quality dev
Natural Language Chatbots, voice Provenance and
catalog /lineage AutoML
Query commands lineage
Looker, Data Vertex AI,
Data QnA Conversation ML Metadata
Catalog BigQuery, Dataflow Collaboration
Feature Store,
Real-time Pipelines Improvements
Machine Fleet routing,
capabilities and retraining
learning in SQL forecasting
BigQuery, Managed models Model
BigQuery ML Optimization
Dataflow Forecast Monitoring
Cloud AutoML
ML that creates ML for your problem
Dataset → Cloud AutoML (Train, Deploy, Serve) → Generate predictions with a REST API
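A prediction call over REST can be sketched as follows. The example assumes a model deployed to a Vertex AI endpoint and builds the request for that endpoint's predict method; the legacy Cloud AutoML API shown on the slide uses a different URL. The project, region, endpoint ID, token, and instance fields are placeholders, and the exact instance format depends on the model type. The request is built but not sent.

```python
import json
import urllib.request


def build_predict_request(project, location, endpoint_id, instances, access_token):
    """Build (but do not send) an HTTP request for online prediction."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project}/locations/{location}/endpoints/{endpoint_id}:predict"
    )
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values: replace with a real project, endpoint, and OAuth token.
request = build_predict_request(
    project="my-project",
    location="us-central1",
    endpoint_id="1234567890",
    instances=[{"feature": 1.0}],
    access_token="TOKEN",
)
print(request.get_method(), request.full_url)
```

With valid credentials, passing the request to urllib.request.urlopen would return a JSON document containing the predictions.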

AutoML Vision
Upload and label images Train your model Evaluate
AutoML Vision
Handbag Shoe Hat
Cloud AI
(Layers run from less ML expertise to more ML expertise)
Cloud AI solutions: Cloud Job Discovery, Contact Center AI, Document Understanding
ML professionals & service partners: Professional Services Organization, ASL
Cloud AI building blocks
● Vision: Cloud Video Intelligence, Cloud AutoML Vision, Cloud Vision
● Language: Cloud Natural Language, Cloud AutoML NL, Cloud Translation, Cloud AutoML Translation
● Conversation: Cloud Speech-to-Text, Dialogflow Enterprise, Cloud Text-to-Speech
Machine & Deep Learning
● Cloud AI platform: Cloud ML Engine, Cloud Dataflow, Cloud Dataproc
● ML accelerators: Cloud TPU, Cloud GPU
● ML libraries: TensorFlow, Kubeflow
● Kaggle/datasets: Datasets
DEMO
https://cloud.google.com/products/ai
https://cloud.google.com/vision#section-2
https://cloud.google.com/text-to-speech
Complete ML journeys with Vertex AI
Pipelines Orchestration
Stages: Data exploration, Data set creation, Training, Evaluation, Comparison & Selection, Deployment, Monitoring
● Training: BQML, Custom, AutoML, Manual
● Model Registry
● Deployment: BQML, Batch, Online, Manual
● Monitoring
Complete data platform for SaaS applications
Intelligent Data Fabric: Dataplex
Data sources: Corporate, Customer, Industry, SaaS, Commercial, IoT, Social, Geospatial
Stages: Collect, Process, Store, Analyze, Activate, Empower Value
● Collect: Pub/Sub (Streaming), Datastream, Database Migration Service, Data Transfer Service
● Process: Dataflow, Dataproc (Spark, Flink, Presto, MapReduce), Data Fusion (No code), Composer (Orchestration), Dataprep (Wrangling)
● Store: BigQuery, Relational Databases (Cloud SQL, Spanner), NoSQL Databases (Bigtable, Firestore), Memorystore, Google Storage, Multi Cloud
● Analyze: BI Engine, Data QnA, Vertex AI (Training, AutoML, BQML, Explainable, Prediction, MLOps), AI Solutions (CCAI, DocAI, Retail Search, Reco. AI)
● Activate: Business Intelligence (Looker, Google Sheets), APIs, App Platforms (AppSheet, Firebase), Data Sharing
● Empower Value: Visualize, Analyze, Recognize, Predict, Automate, Optimize, Personalize, Recommend, Speechify
Partner and Open Source Technology
● Open: Scale with open, flexible technology.
● Secure: Keep your customer’s data secure and compliant.
● Sustainable: Run on the cleanest cloud in the industry.
● Google-scale: Run on the same infrastructure as Google.
A Unified ML Platform for Solving All Business Problems
Vertex AI
● One unified experience to create, deploy, and manage models over time, at scale
● Tools for all levels of expertise and for all types of data
● Accuracy and fairness of predictions and resulting decisions
● Flexible and secure

What is included in Vertex AI?
Stages: Data Readiness, Feature Engineering, Training/HP-Tuning, Model serving, Understanding/Tuning, Edge, Model Monitoring, Model Management
● AutoML (no-code/low code workflow): Vision, Video, Language, Translation, Tables, Forecast, BigQuery ML
● Custom development workflow: Data Labeling, Datasets, Feature Store, Training, Vizier Optimization, Experiments, Prediction, Explainable AI, Hybrid AI, Continuous Monitoring, Metadata
● AI Accelerators
● Pipelines (Orchestration)
● Deep Learning Environment (VM + Container)
● Workbench/Notebooks
● Infrastructure services / Add-ons
Why organizations choose Vertex AI
Build on the Best of Google
● Access to Google’s continuously enhanced AI tools
● A platform created upon the foundation of Google’s pioneering AI research
Accelerate Time to Value
● Make AI more accessible & useful
● Flexible tools for streamlined and scalable collaboration across all levels of technical expertise
● MLOps capabilities make practitioners' jobs easier
Trust & Responsibility
● Responsible AI, Security & Sustainability
● Leverage Google Cloud’s secure and sustainable infrastructure
● Google’s AI Principles review process empowers users to trust that the tools are built with ethical governance
AutoML - Fastest path from data to value
Traditional Machine Learning Workflow
● Load dataset
● Prepare Data: Flatten arrays, Convert data types, Parse datetimes, ...
● Engineer Features: Encode categories, Create embeddings, Create N-Grams, ...
● Create Models: Select model architecture, Tune parameters, Train models, Ensemble models, ...
● Deploy model
● Make predictions
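Several of the manual steps named in the traditional workflow can be sketched with the standard library alone. The example below applies a few of them (flatten arrays, convert data types, parse datetimes, encode categories, create n-grams) to a single made-up record. The field names, the category list, and the helper names are assumptions for illustration.

```python
from datetime import datetime


def flatten(nested):
    """Flatten a list of lists into a single list."""
    return [item for inner in nested for item in inner]


def parse_datetime(text):
    """Parse a 'YYYY-MM-DD HH:MM:SS' string into a datetime object."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


def one_hot(value, categories):
    """Encode a category as a list with a single 1 at the matching position."""
    return [1 if value == c else 0 for c in categories]


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# One hypothetical raw record, as it might arrive from a source system.
record = {
    "ts": "2022-11-16 09:30:00",
    "color": "red",
    "qty": "3",
    "tags": [["a", "b"], ["c"]],
    "note": "fast path to value",
}

prepared = {
    "hour": parse_datetime(record["ts"]).hour,      # parse datetimes
    "qty": int(record["qty"]),                      # convert data types
    "tags": flatten(record["tags"]),                # flatten arrays
    "color": one_hot(record["color"], ["red", "green", "blue"]),  # encode categories
    "bigrams": ngrams(record["note"].split(), 2),   # create n-grams
}
print(prepared)
```

Each helper is small, but in a real project every column needs this kind of decision, which is the effort the AutoML workflow aims to remove.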
AutoML - Fastest path from data to value
AutoML Workflow
● Load dataset
● Set training budget
● Make predictions

AutoML
Low/No code
Point and click to build custom, high-quality models using the AutoML workflow in Vertex AI
AutoML workflow
● Define your data schema and target
● Analyze your input features
● Train your model: Feature engineering, Model selection, Hyperparameter tuning
● Evaluate your model behavior
● Deploy your model to get predictions
Automatically search through Google’s whole model zoo...
● Linear, logistic
● Feedforward DNN
● Wide and Deep NN
● Gradient Boosted Decision Tree (GBDT)
● DNN + GBDT Hybrid
● Adanet ensemble
● Neural + Tree Architecture Search
● ...and more!
MLOps on Vertex AI
● Experimentation and development: Notebooks
● ML Data assets: Vertex Datasets & Feature Store
● Code Repository: Cloud Source Repository
● CI/CD for Training Pipeline: Cloud Build
● Pipeline components: Container Registry
● CT pipeline: Vertex Pipelines (Vertex Training)
● Logged experiment: Vertex Experiments
● ML metadata: Vertex Metadata
● Visualizations: Vertex TensorBoard
● Model Registry: Vertex Models
● CI/CD for Model Serving: Cloud Build
● Prediction serving: Vertex Prediction
● Monitoring: Vertex Model Monitoring
Flows between components: training datasets, code & config changes, pipeline artifacts, trained model, model service, serving features, serving logs, alert, trigger
Where to start from?

A simple end to end ML workflow
● Define your data schema and target
● Analyze your input features
● Training
● Evaluate your model behavior
● Deploy your model to get predictions
Tools: BigQuery (prod data), Dataflow, Notebooks, Vertex Training, Vertex Prediction, Model Monitoring, Vertex Pipelines
Vertex AI Demo
Data engineering and smart analytics
https://cloud.google.com/training/data-engineering-and-analytics#data-engineer-learning-path
Data analytics design patterns

https://cloud.google.com/architecture/reference-patterns/overview
Professional Data Engineer

https://cloud.google.com/certification/data-engineer
