Google AI - ML - Caltech
Certificate Program
Developed for Applied Materials
by
California Institute of Technology
Center for Technology and Management Education
1200 East California Blvd., Mail Code 202-81
Pasadena, California 91125
Phone: 626.395.4042
©Caltech 1 https://ctme.caltech.edu
Google Cloud AI/ML
11/16/2022
Patrick Alexander
Google - Customer Engineer
Ex-Microsoft - Cloud Solution Architect
Board Member of AACIT (Caltech Flying Club)
Based @LosAngeles
PatrickGCP@Google.com
Spruce Goose
Hughes H-4 Hercules
https://en.wikipedia.org/wiki/Hughes_H-4_Hercules
Google AI/ML
11/16/2022
Agenda
● Data Science vs Data Engineering
● AI/ML
● Machine Learning Lifecycle
● Google toolset and decision tree to find proper tools
● Productionizing Machine Learning Models (MLOps)
● Q&A
Telling a story with Data
Data Engineer
● Develops, constructs, tests, and maintains architecture such as databases and large-scale processing systems.
● Ensures the continuous flow of data through pipelines.
● Provides data in usable formats to the data scientists who run queries and algorithms against the information for predictive analytics, machine learning, and data mining applications.
Data Scientist
● Dives deep into data and provides meaningful business insights that are crucial for decision making in the company.
● Works on building and deploying AI-based algorithms in various aspects of the business to solve business problems.
Data Engineer Skills
● Database Architecture
● Communications
Data Scientist Skills
● Problem Solving
● Communication
● Presentation
Data Engineer Tasks
● Data acquisition
● Design, build and test data architecture
Data Scientist Tasks
● Develop hypotheses
● Apply algorithms to test hypotheses
Proprietary + Confidential
AI/ML 101
AI & ML
What is AI?
Simply put, AI is the creation of software that imitates human behaviors and capabilities. Key
elements include:
• Machine learning - This is often the foundation for an AI system, and is the way we "teach" a
computer model to make predictions and draw conclusions from data.
• Anomaly detection - The capability to automatically detect errors or unusual activity in a system.
• Computer vision - The capability of software to interpret the world visually through cameras,
video, and images.
• Natural language processing - The capability for a computer to interpret written or spoken
language and respond in kind.
How machine learning works
So how do machines learn?
The answer is, from data.
Understand anomaly detection
Imagine you're creating a software system to monitor credit card transactions and detect unusual
usage patterns that might indicate fraud. Or an application that tracks activity in an automated
production line and identifies failures. Or a racing car telemetry system that uses sensors to
proactively warn engineers about potential mechanical failures before they happen.
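As a rough illustration of the credit card scenario above, the sketch below flags values that sit far from the mean of the other observations, using a z-score. This is only one simple technique among many, and it is an assumption for illustration; the transaction amounts, the function name `find_anomalies`, and the threshold of 2.0 are all hypothetical.

```python
from statistics import mean, stdev

def find_anomalies(values, threshold=2.0):
    """Return (index, value) pairs whose z-score exceeds the threshold."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [
        (i, v) for i, v in enumerate(values)
        if abs(v - mu) / sigma > threshold
    ]

# Hypothetical transaction amounts in USD; one is far larger than the rest.
amounts = [42.0, 38.5, 51.0, 45.2, 40.1, 39.9, 47.3, 950.0, 44.0, 41.7]

anomalies = find_anomalies(amounts)
print(anomalies)  # [(7, 950.0)]
```

With only ten samples a z-score cannot grow much beyond 2.8, which is why the threshold here is 2.0 rather than the more common 3.0. Production fraud systems typically rely on richer features and learned models.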
Understand computer vision
Computer Vision is an area of AI that deals with visual processing. Let's explore some of the
possibilities that computer vision brings.
Understand natural language processing
Natural language processing (NLP) is the area of AI that deals with creating software that
understands written and spoken language.
Understand conversational AI
Conversational AI is the term used to describe solutions where AI agents participate in
conversations with humans. Most commonly, conversational AI solutions use bots to manage
dialogs with users. These dialogs can take place through web site interfaces, email, social media
platforms, messaging systems, phone calls, and other channels.
Understand responsible AI
Artificial Intelligence is a powerful tool that can be used to greatly benefit the world. However,
like any tool, it must be used responsibly.
Fairness
AI systems should treat all people fairly.
Inclusiveness
AI systems should empower everyone and engage people.
Transparency
AI systems should be understandable.
Accountability
What is machine learning?
Machine learning is a technique that uses mathematics and statistics to create a model that can
predict unknown values.
Mathematically, you can think of machine learning as a way of defining a function (let's call it f) that
operates on one or more features of something (which we'll call x) to calculate a predicted label (y)
- like this:
f(x) = y
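To make the f(x) = y idea concrete, here is a minimal sketch in which "training" returns a function f that maps features x to a predicted label y. The model used, 1-nearest-neighbour, is an assumption chosen for brevity, and the data and the `train` helper are hypothetical.

```python
import math

def train(features, labels):
    """Return a function f such that f(x) predicts a label for x.

    Assumption for illustration: a 1-nearest-neighbour model, which
    simply memorises the training examples.
    """
    examples = list(zip(features, labels))

    def f(x):
        # Use the label of the closest training example.
        nearest = min(examples, key=lambda ex: math.dist(ex[0], x))
        return nearest[1]

    return f

# Hypothetical data: x = (temperature in C, is_weekend), y = ice creams sold.
X = [(15.0, 0), (22.0, 0), (30.0, 1), (35.0, 1)]
y = [20, 45, 120, 160]

f = train(X, y)
prediction = f((29.0, 1))
print(prediction)  # 120
```

Other algorithms produce a different f from the same data, but the shape of the interaction is the same: learn f from known (x, y) pairs, then apply it to a new x.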
Kinds of machine learning
There are three main categories of machine learning: supervised learning, unsupervised
learning, and reinforcement learning.
Supervised learning
In supervised learning, each data point is labeled or associated with a category or value of
interest.
Unsupervised learning
In unsupervised learning, data points have no labels associated with them.
Reinforcement learning
In reinforcement learning, the algorithm gets to choose an action in response to each data
point.
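To illustrate the unsupervised case, the sketch below groups unlabeled one-dimensional data into two clusters with a very small k-means loop. It is a simplified illustration under stated assumptions: the data are hypothetical, the number of clusters is fixed at two, and the starting centroids are simply the minimum and maximum values.

```python
def kmeans_1d(points, iterations=10):
    """Cluster 1-D points into two groups; returns (centroids, assignments)."""
    centroids = [min(points), max(points)]  # simple deterministic start
    assignments = []
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        assignments = [
            min(range(2), key=lambda c: abs(p - centroids[c])) for p in points
        ]
        # Move each centroid to the mean of its assigned points.
        for c in range(2):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return centroids, assignments

# Hypothetical monthly spend per customer; no labels are provided.
spend = [12.0, 15.0, 14.0, 90.0, 95.0, 88.0]

centroids, groups = kmeans_1d(spend)
print(groups)  # [0, 0, 0, 1, 1, 1]
```

No label was ever supplied; the two groups emerge from the structure of the data alone, which is what distinguishes this from supervised learning.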
Regression is a form of machine learning that is used to predict a numeric label based on an
item's features. For example, an automobile sales company might use the characteristics of a car
(such as engine size, number of seats, mileage, and so on) to predict its likely selling price. In
this case, the characteristics of the car are the features, and the selling price is the label.
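A minimal sketch of the car example, assuming a single feature (mileage) and ordinary least squares as the regression method. The numbers are hypothetical and deliberately lie on a straight line so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x

# Hypothetical data: mileage in thousands of miles vs. selling price in USD.
mileage = [10.0, 20.0, 30.0, 40.0, 50.0]
price = [18000.0, 16000.0, 14000.0, 12000.0, 10000.0]

slope, intercept = fit_line(mileage, price)
predicted_price = slope * 35.0 + intercept
print(round(predicted_price))  # 13000
```

Here mileage is the feature and price is the label. A real model would normally combine several features, such as engine size and number of seats.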
Classification is a form of machine learning that is used to predict which category, or class, an
item belongs to. For example, a health clinic might use the characteristics of a patient (such as
age, weight, blood pressure, and so on) to predict whether the patient is at risk of diabetes. In
this case, the characteristics of the patient are the features, and the label is a classification of
either 0 or 1, representing non-diabetic or diabetic.
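The sketch below shows the simplest possible classifier for the clinic example: it learns a single cut-off on one feature and predicts 1 at or above it and 0 below it. This threshold approach is an assumption chosen for clarity, and the readings and labels are hypothetical and carry no medical meaning.

```python
def fit_threshold(values, labels):
    """Find the cut-off on one feature that misclassifies the fewest examples.

    The resulting rule predicts 1 when value >= threshold, otherwise 0.
    """
    best_threshold, best_errors = None, len(values) + 1
    for candidate in sorted(set(values)):
        errors = sum(
            (1 if v >= candidate else 0) != label
            for v, label in zip(values, labels)
        )
        if errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold

def classify(value, threshold):
    """Apply the learned rule to a new value."""
    return 1 if value >= threshold else 0

# Hypothetical data: systolic blood pressure and an at-risk label (0 or 1).
blood_pressure = [110, 115, 120, 125, 140, 150, 155, 160]
at_risk = [0, 0, 0, 0, 1, 1, 1, 1]

threshold = fit_threshold(blood_pressure, at_risk)
print(threshold, classify(118, threshold), classify(152, threshold))  # 140 0 1
```

Unlike regression, the output is a class (0 or 1) rather than a number on a continuous scale.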
Advanced, embedded AI & ML capabilities
Democratize AI/ML
AutoML and BigQuery ML enable teams
without data scientists to use AI/ML with
capabilities not offered by competitors
Depth of AI/ML
Cloud Speech-to-Text supports 120 language
variants, more than AWS or Azure
Mapping the business objective to an AI absorption strategy
● Use AI out of the box to maximize the value AI delivers into business workflows
● Deploy custom AI to extract value from your data to differentiate
● Build end-to-end AI to have material impact on competitive differentiation
[Figure axes: Speed, Effort, Customization]
AI Platform for every level of expertise
Which capabilities to choose?
Design your ML workflow
Google Cloud enables your AI journey
Google Cloud
Machine Learning tools for everyone
Business User: Insights and objectives
● Interactive BI: Looker
● EDW in a spreadsheet: Connected Sheets
● Natural language query: Data QnA
Data analyst: Query and analyze
● Endless EDW: BigQuery
● Self-managed data pipelines: Data Fusion, Dataflow
● Data models, catalog: Looker, Data Catalog
● Machine learning in SQL: BigQuery ML
Data engineer: Get clean, useful data
● Self-driving infra: BigQuery, Dataflow, Composer
● Broad choice of tools/language: Dataproc, Dataflow
● Data quality/lineage: Vertex AI, BigQuery, Dataflow
● Real-time capabilities: BigQuery, Dataflow
Data scientist: Models that work
● Portable notebooks: Notebooks
● Model eval and selection: Explainable AI, Tensorboard
● Point-and-click dev: AutoML
● Collaboration: Feature Store, Pipelines
ML developer: Intelligent apps
● Images, videos: Vision, Video Intelligence
● Sentiment analysis, entity extraction: NL, Translation
● Chatbots, voice commands: Conversation
● Fleet routing, forecasting: Managed models, Optimization, Forecast
ML engineer: Models in production
● Scalable model hosting: Prediction
● ML CI/CD and orchestration: Pipelines
● Provenance and lineage: ML Metadata
● Improvements and retraining: Model Monitoring
Cloud AutoML
ML that creates ML for your problem
[Figure: AutoML Vision example with the labels Handbag, Shoe, and Hat]
Cloud AI
[Figure: product map with an axis labeled "Less ML expertise"; two products are marked "New"]
● Cloud Job Discovery, Contact Center AI, Document Understanding, Professional Services Organization, ASL
● Cloud Vision, Cloud Video Intelligence, Cloud AutoML Vision, Cloud Natural Language, Cloud AutoML NL, Cloud Translation, Cloud AutoML Translation, Cloud Speech-to-Text, Dialogflow Enterprise, Cloud Text-to-Speech
https://cloud.google.com/products/ai
https://cloud.google.com/vision#section-2
https://cloud.google.com/text-to-speech
Complete ML journeys with Vertex AI
[Figure: Vertex AI workflow diagram; legible labels: Pipelines, Orchestration, AutoML, Online, Manual]
Complete data platform for SaaS applications
Data sources: Corporate, Customer, Industry, SaaS, Commercial, IoT, Social
Stages: Collect, Process, Store, Analyze, Activate, Empower, Value
● Collect: Pub/Sub (Streaming), Datastream, Database Migration Service, Data Transfer Service
● Process: Dataflow, Dataproc (Spark, Flink, Presto, MapReduce), Data Fusion (No code), Composer (Orchestration), Dataprep (Wrangling)
● Store: BigQuery, Relational Databases (Cloud SQL, Spanner), NoSQL Databases (Bigtable, Firestore), Memorystore, Cloud Storage
● Analyze: Vertex AI (Training, AutoML, Explainable, Prediction, MLOps), BQML, AI Solutions (CCAI, DocAI, Retail Reco. AI, Search), Geospatial
● Activate: Business Intelligence (Looker, BI Engine, Data QnA, Google Sheets), APIs, App Platforms (AppSheet, Firebase), Data Sharing
● Empower and Value: Visualize, Analyze, Recognize, Predict, Automate, Optimize, Personalize, Recommend
Partner and Open Source Technology
Vertex AI
● No-code/low code workflow: AutoML (Vision, Video, Language, Translation, Tables, Forecast), BigQuery ML
● Custom development workflow: Vizier Optimization, Explainable AI, Datasets, Experiments, AI Accelerators
● Pipelines (Orchestration)
Access to Google’s continuously enhanced AI tools
● A platform created upon the foundation of Google’s pioneering AI research
Make AI more accessible & useful
● Flexible tools for streamlined and scalable collaboration across all levels of technical expertise
● MLOps capabilities make practitioners’ jobs easier
Responsible AI, Security & Sustainability
● Leverage Google Cloud’s secure and sustainable infrastructure
● Google’s AI Principles review process empowers users to trust that the tools are built with ethical governance
Load dataset
Prepare Data
● Flatten arrays
● ...
Engineer Features
● Encode categories
● Create embeddings
● ...
Create Models
● Train models
● Tune parameters
● Ensemble models
Deploy model
Make predictions
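As an illustration of the "Encode categories" step named in this workflow, the sketch below applies one-hot encoding, a common way to turn a categorical column into numeric features. The function name and the sample values are hypothetical, and this is only one of several possible encodings.

```python
def one_hot_encode(values):
    """Map each category to a 0/1 vector containing a single 1."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        encoded.append(row)
    return categories, encoded

# Hypothetical categorical column.
colors = ["red", "blue", "red", "green"]

categories, encoded = one_hot_encode(colors)
print(categories)  # ['blue', 'green', 'red']
print(encoded)     # [[0, 0, 1], [1, 0, 0], [0, 0, 1], [0, 1, 0]]
```

Each row now contains only numbers, so it can be passed to a model that cannot consume text categories directly.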
AutoML Workflow
Low/No code
Point and click to build custom, high-quality
models using the AutoML workflow in Vertex AI
Workflow steps, in order:
● Define your data schema and target
● Analyze your input features
● Feature engineering and model selection
● Evaluate your model behavior
● Deploy your model to get predictions
Model types considered during model selection:
● Feedforward DNN
● Wide and Deep NN
● Gradient Boosted Decision Tree (GBDT)
● ...and more!
MLOps on Vertex AI
[Figure: MLOps architecture diagram; legible labels: CT pipeline (Vertex Pipelines), Pipeline components, Container Registry, Vertex Training, Pipeline artifacts, ML metadata (Vertex Metadata), Logged experiment (Vertex Experiments), Trained model, Model service, Prediction serving (Vertex Prediction), serving features, Serving logs, Alert]
Workflow steps, in order:
● Define your data schema and target
● Analyze your input features
● Training
● Evaluate your model behavior
● Deploy your model to get predictions
Supporting services shown: Dataflow, Model Monitoring, Vertex Pipelines
Vertex AI Demo
Data engineering and smart analytics
https://cloud.google.com/training/data-engineering-and-analytics#data-engineer-learning-path