Google AI - ML - Caltech


Program Management

Certificate Program
Developed for Applied Materials
by
California Institute of Technology
Center for Technology and Management Education
1200 East California Blvd., Mail Code 202-81
Pasadena, California 91125
Phone: 626.395.4042

©Caltech 1 https://ctme.caltech.edu
Google Cloud AI/ML
11/16/2022

Patrick Alexander
Google - Customer Engineer
Ex-Microsoft - Cloud Solution Architect
Board Member of AACIT (Caltech Flying Club)
Based @LosAngeles

PatrickGCP@Google.com

CTME | Teaching Team (caltech.edu)


Google Office Spruce Goose
Playa Vista - California

Spruce Goose
Hughes H-4 Hercules

https://en.wikipedia.org/wiki/Hughes_H-4_Hercules

Google AI/ML
11/16/2022

Agenda
● Data Science vs Data Engineering
● AI/ML
● Machine Learning Lifecycle
● Google toolset and decision tree to find proper tools
● Productionizing Machine Learning Models (MLOps)
● Q&A

Telling a story with Data

Data Engineer
● Develops, constructs, tests, and maintains architecture such as databases and large-scale
processing systems.
● Ensures the continuous flow of data and pipelines.
● Provides data in usable formats to the data scientists who run queries and algorithms against
the information for predictive analytics, machine learning, and data mining applications.
● Delivers aggregated data to business executives, analysts, and other end users to improve
business operations.

Data Scientist
● Deep dives into data and provides meaningful business insights from it that are crucial for
decision making in the company.
● Works on building and deploying AI-based algorithms in various aspects of the business to
solve business problems.

Data Engineer Skills        Data Scientist Skills
● Database Architecture     ● Problem Solving
● Data warehousing          ● Machine Learning
● ETL tooling               ● Programming
● Big Data                  ● Mathematics
● Advanced analytics        ● Statistical modeling
● Programming               ● Business acumen
● Communications            ● Communication

● Presentation

Data Engineer Tasks                           Data Scientist Tasks
● Data acquisition                            ● Develop hypotheses
● Manage and organize data                    ● Gather, organize and clean data
● Design, build and test data architecture    ● Apply algorithms to test hypotheses
● Prepare data for modeling                   ● Identify patterns
● Discover tasks that can be automated        ● Provide data-led business solutions to business challenges
● Develop data set processes

Proprietary + Confidential

AI/ML 101
AI & ML

What is AI?
Simply put, AI is the creation of software that imitates human behaviors and capabilities. Key
elements include:

•Machine learning - This is often the foundation for an AI system, and is the way we "teach" a
computer model to make predictions and draw conclusions from data.

•Anomaly detection - The capability to automatically detect errors or unusual activity in a system.

•Computer vision - The capability of software to interpret the world visually through cameras,
video, and images.

•Natural language processing - The capability for a computer to interpret written or spoken
language and respond in kind.

•Conversational AI - The capability of a software "agent" to participate in a conversation.

How machine learning works
So how do machines learn?
The answer is, from data.

Understand anomaly detection
Imagine you're creating a software system to monitor credit card transactions and detect unusual
usage patterns that might indicate fraud. Or an application that tracks activity in an automated
production line and identifies failures. Or a racing car telemetry system that uses sensors to
proactively warn engineers about potential mechanical failures before they happen.
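
As a rough illustration of the credit card scenario, the sketch below flags transaction amounts that sit far from the typical value. It uses a modified z-score built on the median and the median absolute deviation (MAD), which is one common statistical technique and not necessarily what a production fraud system would use. The function name, the 3.5 cutoff, and the sample amounts are all assumptions made up for this example.

```python
from statistics import median

def find_anomalies(amounts, threshold=3.5):
    """Return (index, amount) pairs whose modified z-score exceeds the threshold.

    The modified z-score is 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation. It is less distorted by the outliers
    themselves than a mean/standard-deviation z-score.
    """
    center = median(amounts)
    mad = median(abs(a - center) for a in amounts)
    if mad == 0:
        # All values are (almost) identical, so nothing can be scored.
        return []
    return [
        (i, a)
        for i, a in enumerate(amounts)
        if 0.6745 * abs(a - center) / mad > threshold
    ]

# Hypothetical card transactions in dollars; one is clearly unusual.
transactions = [42.0, 38.5, 45.1, 40.2, 39.9, 41.7, 43.3, 37.8, 44.0, 950.0, 40.5, 42.9]
suspicious = find_anomalies(transactions)
print(suspicious)
```

A real system would look at many features per transaction (merchant, location, time of day), but the principle is the same: learn what "normal" looks like from data, then flag what deviates from it.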

Understand computer vision
Computer Vision is an area of AI that deals with visual processing. Let's explore some of the
possibilities that computer vision brings.

Understand natural language processing
Natural language processing (NLP) is the area of AI that deals with creating software that
understands written and spoken language.

NLP enables you to create software that can:


• Analyze and interpret text in documents, email messages, and other sources.
• Interpret spoken language and synthesize speech responses.
• Automatically translate spoken or written phrases between languages.
• Interpret commands and determine appropriate actions.

Understand conversational AI
Conversational AI is the term used to describe solutions where AI agents participate in
conversations with humans. Most commonly, conversational AI solutions use bots to manage
dialogs with users. These dialogs can take place through web site interfaces, email, social media
platforms, messaging systems, phone calls, and other channels.

Bots can be the basis of AI solutions for:


• Customer support for products or services.
• Reservation systems for restaurants, airlines, cinemas, and other appointment-based
businesses.
• Health care consultations and self-diagnosis.
• Home automation and personal digital assistants.

Understand responsible AI
Artificial Intelligence is a powerful tool that can be used to greatly benefit the world. However,
like any tool, it must be used responsibly.

Fairness
AI systems should treat all people fairly.

Reliability and safety
AI systems should perform reliably and safely.

Privacy and security
AI systems should be secure and respect privacy.

Inclusiveness
AI systems should empower everyone and engage people.

Transparency
AI systems should be understandable.

Accountability

What is machine learning?

Machine learning is a technique that uses mathematics and statistics to create a model that can
predict unknown values.

Mathematically, you can think of machine learning as a way of defining a function (let's call it f) that
operates on one or more features of something (which we'll call x) to calculate a predicted label (y)
- like this:

f(x) = y

The most common Machine Learning Models are:


• Classification
• Regression
• Clustering
• Time series forecasting

Kinds of machine learning
There are three main categories of machine learning: supervised learning, unsupervised
learning, and reinforcement learning.

Supervised learning
In supervised learning, each data point is labeled or associated with a category or value of
interest.

Unsupervised learning
In unsupervised learning, data points have no labels associated with them.

Reinforcement learning
In reinforcement learning, the algorithm gets to choose an action in response to each data
point, and it receives a reward signal indicating how good that choice was.
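
To make the "no labels" case concrete, here is a minimal sketch of unsupervised learning: a one-dimensional k-means that groups numbers into clusters without being told which group any point belongs to. The function name, the way the centroids are initialized (evenly spaced between the minimum and maximum), and the sample data are assumptions chosen to keep the example short and deterministic.

```python
def kmeans_1d(points, k=2, iterations=20):
    """Group 1-D points into k clusters (k >= 2). Returns (centroids, clusters)."""
    lo, hi = min(points), max(points)
    # Assumption: spread the starting centroids evenly across the data range.
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        # Assignment step: each point joins its nearest centroid.
        for p in points:
            nearest = min(range(k), key=lambda c: abs(p - centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        centroids = [
            sum(members) / len(members) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids, clusters

# Unlabeled measurements: no category is attached to any value.
readings = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centers, groups = kmeans_1d(readings, k=2)
print(centers, groups)
```

The algorithm discovers two groups on its own. In supervised learning, by contrast, each reading would arrive with a known label and the model would learn to reproduce it.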

[Figures: Supervised Learning; Unsupervised Learning; Reinforcement Learning]
Regression is a form of machine learning that is used to predict a numeric label based on an
item's features. For example, an automobile sales company might use the characteristics of a car
(such as engine size, number of seats, mileage, and so on) to predict its likely selling price. In
this case, the characteristics of the car are the features, and the selling price is the label.

Regression is an example of a supervised machine learning technique in which you train a model
using data that includes both the features and known values for the label, so that the model
learns to fit the feature combinations to the label. Then, after training has been completed, you
can use the trained model to predict labels for new items for which the label is unknown.
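
The train-then-predict cycle can be sketched with the simplest regression model, a straight line fitted by ordinary least squares on a single feature. The mileage and price figures below are invented for illustration, and the function names are hypothetical. A real car-price model would use several features.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    slope, intercept = model
    return slope * x + intercept

# Training data (made up): feature = mileage in thousands, label = selling price.
mileage = [10, 30, 50, 70, 90]
price = [27000, 23000, 19000, 15000, 11000]

model = fit_line(mileage, price)          # training: features plus known labels
estimate = predict(model, 60)             # inference: label is unknown
print(model, estimate)
```

Here the features and known prices play the role of the training data, and `predict` is the trained model applied to a car whose price is not yet known.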

Classification is a form of machine learning that is used to predict which category, or class, an
item belongs to. For example, a health clinic might use the characteristics of a patient (such as
age, weight, blood pressure, and so on) to predict whether the patient is at risk of diabetes. In
this case, the characteristics of the patient are the features, and the label is a classification of
either 0 or 1, representing non-diabetic or diabetic.

Classification is an example of a supervised machine learning technique in which you train a
model using data that includes both the features and known values for the label, so that the
model learns to fit the feature combinations to the label. Then, after training has been
completed, you can use the trained model to predict labels for new items for which the label is
unknown.
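
As a sketch of the same idea for categories, the example below uses a k-nearest-neighbors classifier, one of many possible classification algorithms, to assign a 0 or 1 label to a new patient. The patient records are fabricated toy values (age, weight in kg, systolic blood pressure) with no medical meaning, and the function name is an assumption. In practice the features would also be scaled so that no single one dominates the distance.

```python
import math
from collections import Counter

def knn_predict(train_features, train_labels, item, k=3):
    """Predict the majority label among the k training items closest to `item`."""
    by_distance = sorted(
        (math.dist(features, item), label)
        for features, label in zip(train_features, train_labels)
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy training set: (age, weight_kg, systolic_bp) -> 0 (non-diabetic) or 1 (diabetic).
patients = [
    (25, 62, 115), (31, 70, 120), (28, 58, 110),
    (52, 95, 145), (60, 102, 150), (47, 90, 140),
]
labels = [0, 0, 0, 1, 1, 1]

at_risk = knn_predict(patients, labels, (55, 98, 148))
not_at_risk = knn_predict(patients, labels, (27, 64, 112))
print(at_risk, not_at_risk)
```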

Advanced, embedded AI & ML capabilities

Democratize AI/ML
AutoML and BigQuery ML enable teams
without data scientists to use AI/ML with
capabilities not offered by competitors

Depth of AI/ML
Cloud Speech-to-Text supports 120 language
variants, more than AWS or Azure

Accelerate time to decision
Google TPUs 15-30x faster than
contemporary GPUs and CPUs

Google is a Leader in Computer Vision Platforms

Mapping the business objective to an AI absorption strategy

● Use AI out of the box to maximize the value AI delivers into business workflows
● Deploy custom AI to extract value from your data to differentiate
● Build end-to-end AI to have material impact on competitive differentiation

Dimensions compared across the three strategies: Speed, Effort, Customization

AI Platform for every level of expertise

● Pre-trained APIs: No training data needed, get started right away
● Custom AI with AutoML: Easily create custom models (a no-code approach)
● End-to-end AI with core tools: Help data scientists and ML engineers build and deploy AI

Which capabilities to choose?
Design your ML workflow

BigQuery ML
● Descriptive and predictive modeling on structured data
● Hyper-parameter tuning
● Feature engineering
● Explainability
● Simple SQL code

AutoML Models in Vertex AI
● Predictive modeling on structured & unstructured data
● Hyper-parameter tuning
● Feature engineering
● Explainability
● No code

End-to-end AI with Vertex AI
● Custom models on pre-built frameworks
● NoOps, serverless training with hyperparameter tuning
● Explainability
● Custom code
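
To give a feel for what "simple SQL code" means in BigQuery ML, the sketch below assembles a `CREATE MODEL` statement as a Python string. The dataset, table, and column names are hypothetical placeholders, the helper function is invented for this example, and the statement is only printed here, not sent to BigQuery.

```python
def build_create_model_sql(model, source_table, label, features, model_type="linear_reg"):
    """Assemble a BigQuery ML CREATE MODEL statement (names are placeholders)."""
    columns = ", ".join(list(features) + [label])
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS(model_type='{model_type}', input_label_cols=['{label}']) AS\n"
        f"SELECT {columns}\n"
        f"FROM `{source_table}`"
    )

# Hypothetical dataset and columns, echoing the car-price regression example.
sql = build_create_model_sql(
    model="demo_dataset.car_price_model",
    source_table="demo_dataset.car_sales",
    label="selling_price",
    features=["engine_size", "seats", "mileage"],
)
print(sql)
```

Running the printed statement in BigQuery would train the model inside the data warehouse, which is why this path suits analysts who already work in SQL.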

Google Cloud enables your AI journey

● Pre-packaged AI solutions (early-stage enterprises): Pre-built, pre-trained APIs
● A new middle pathway (majority of enterprises): Cloud AutoML
● Custom ML models (advanced-stage enterprises): Vertex AI platform

Underlying technology
● TensorFlow: Open-source machine learning framework
● Cloud TPUs: Hardware optimised for machine learning

Google Cloud
Machine Learning tools for everyone

Business User (Insights and objectives)
● Interactive BI: Looker
● EDW in a Spreadsheet: Connected Sheets
● Natural Language Query: Data QnA

Data analyst (Query and analyze)
● Endless EDW: BigQuery
● Self-managed data pipelines: Data Fusion, Dataflow
● Data models, catalog: Looker, Data Catalog
● Machine learning in SQL: BigQuery ML

Data engineer (Get clean, useful data)
● Self-driving infra: BigQuery, Dataflow, Composer
● Broad choice of tools/language: Dataproc, Dataflow
● Data quality/lineage: Vertex AI, BigQuery, Dataflow
● Real-time capabilities: BigQuery, Dataflow

Data scientist (Models that work)
● Portable notebooks: Notebooks
● Model eval and selection: Explainable AI, TensorBoard
● Point-and-click dev: AutoML
● Collaboration: Feature Store, Pipelines
● Managed models: Forecast

ML developer (Intelligent apps)
● Images, videos: Vision, Video Intelligence
● Sentiment analysis, entity extraction: NL, Translation
● Chatbots, voice commands: Conversation
● Fleet routing, forecasting: Optimization

ML engineer (Models in production)
● Scalable model hosting: Prediction
● ML CI/CD and orchestration: Pipelines
● Provenance and lineage: ML Metadata
● Improvements and retraining: Model Monitoring

Cloud AutoML
ML that creates ML for your problem

● Dataset
● Cloud AutoML: Train, Deploy, Serve
● Generate predictions with a REST API
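
"Generate predictions with a REST API" means that a deployed model is called over HTTPS like any other web service. The sketch below only builds such a request and does not send it. It assumes the model is served from a Vertex AI online prediction endpoint, whose URL follows the `projects/.../locations/.../endpoints/...:predict` pattern. The project, region, endpoint ID, token, and the shape of the instance payload are placeholders, and the exact instance schema depends on the model type.

```python
import json
import urllib.request

def build_predict_request(project, region, endpoint_id, instances, access_token):
    """Build (but do not send) an online prediction request for a deployed model."""
    url = (
        f"https://{region}-aiplatform.googleapis.com/v1/projects/{project}"
        f"/locations/{region}/endpoints/{endpoint_id}:predict"
    )
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )

# All identifiers below are placeholders, not real resources.
request = build_predict_request(
    project="my-project",
    region="us-central1",
    endpoint_id="1234567890",
    instances=[{"content": "<base64-encoded image bytes>"}],
    access_token="ACCESS_TOKEN",
)
print(request.get_method(), request.full_url)
```

Sending the request (for example with `urllib.request.urlopen`) would require a valid access token and a real deployed endpoint.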


AutoML Vision
● Upload and label images
● Train your model
● Evaluate

AutoML Vision
Labels shown: Handbag, Shoe, Hat
Cloud AI
(layers ordered from less ML expertise to more ML expertise)

Cloud AI solutions
● Cloud Job Discovery
● Contact Center AI
● Document Understanding

ML professionals & service partners
● ASL
● Professional Services Organization

Cloud AI building blocks
● Vision: Cloud Video Intelligence, Cloud AutoML Vision, Cloud Vision
● Language: Cloud Natural Language, Cloud AutoML NL, Cloud Translation, Cloud AutoML Translation
● Conversation: Cloud Speech-to-Text, Dialogflow Enterprise, Cloud Text-to-Speech

Machine & Deep Learning
● Cloud AI platform: Cloud ML Engine, Cloud Dataflow, Cloud Dataproc
● ML accelerators: Cloud GPU, Cloud TPU
● ML libraries: TensorFlow, Kubeflow
● Kaggle/datasets: Datasets

DEMO

https://cloud.google.com/products/ai

https://cloud.google.com/vision#section-2

https://cloud.google.com/text-to-speech

Complete ML journeys with Vertex AI

Pipelines (Orchestration) span the following stages:
● Data exploration
● Data set creation
● Training: BQML, Custom, AutoML, Manual
● Evaluation
● Comparison & Selection
● Model Registry
● Deployment: BQML, Batch, Online, Manual
● Monitoring

Complete data platform for SaaS applications

Intelligent Data Fabric: Dataplex
Stages: Collect, Process, Store, Analyze, Activate, Empower, Value

● Data sources: Corporate, Customer, Industry, SaaS, Commercial, IoT, Social
● Collect: Pub/Sub, Datastream, Database Migration Service, Data Transfer Service
● Process: Dataflow (Streaming), Dataproc (Spark, Flink, Presto, MapReduce), Data Fusion (No code),
Composer (Orchestration), Dataprep (Wrangling)
● Store: BigQuery (BI Engine, Data QnA); Relational Databases (Cloud SQL, Spanner);
NoSQL Databases (Bigtable, Firestore, Memorystore); Google Storage
● Analyze: Vertex AI (Training, AutoML, BQML, Explainable, Prediction, APIs, MLOps);
AI Solutions (CCAI, DocAI, Retail Search, Reco. AI)
● Activate: Business Intelligence (Looker, Google Sheets); App Platforms (AppSheet, Firebase);
Data Sharing
● Empower: Visualize, Analyze, Recognize, Predict, Automate, Optimize, Personalize, Recommend,
Speechify
● Also labeled in the diagram: Multi Cloud, Geospatial, Partner and Open Source Technology

Open: Scale with open, flexible technology.
Secure: Keep your customer’s data secure and compliant.
Sustainable: Run on the cleanest cloud in the industry.
Google-scale: Run on the same infrastructure as Google.

A Unified ML Platform for Solving All Business Problems

Vertex AI

● One unified experience to create, deploy, and manage models over time, at scale
● Tools for all levels of expertise and for all types of data
● Accuracy and fairness of predictions and resulting decisions
● Flexible and secure




What is included in Vertex AI?

Stages: Data Readiness, Feature Engineering, Training/HP-Tuning, Model serving,
Understanding/Tuning, Edge, Model Monitoring, Model Management

● AutoML (no-code/low-code workflow): Vision, Video, Language, Translation, Tables, Forecast,
BigQuery ML
● Custom development workflow: Datasets, Data Labeling, Feature Store, Training, Experiments,
Vizier Optimization, AI Accelerators, Prediction, Explainable AI, Hybrid AI, Continuous Monitoring,
Metadata
● Infrastructure services / Add-ons: Pipelines (Orchestration), Deep Learning Environment
(VM + Container), Workbench/Notebooks

Why organizations choose Vertex AI

Build on the Best of Google
● Access to Google’s continuously enhanced AI tools
● A platform created upon the foundation of Google’s pioneering AI research

Accelerate Time to Value
● Make AI more accessible & useful
● Flexible tools for streamlined and scalable collaboration across all levels of technical expertise
● MLOps capabilities make practitioners’ jobs easier

Trust & Responsibility
● Responsible AI, Security & Sustainability
● Leverage Google Cloud’s secure and sustainable infrastructure
● Google’s AI Principles review process empowers users to trust that the tools are built with
ethical governance

AutoML - Fastest path from data to value

Traditional Machine Learning Workflow

● Load dataset
● Prepare Data: Flatten arrays, Convert data types, Parse datetimes, ...
● Engineer Features: Encode categories, Create embeddings, Create N-Grams, ...
● Create Models: Select model architecture, Tune parameters, Train models, Ensemble models, ...
● Deploy model
● Make predictions
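
A few of the manual steps in the traditional workflow can be sketched in plain Python to show the kind of work AutoML is meant to take over: parsing a datetime, one-hot encoding a category, and creating word n-grams. The record, the field names, and the list of fuel types are invented for this illustration.

```python
from datetime import datetime

def parse_timestamp(text):
    """Parse datetimes: turn a raw string into a datetime object."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")

def one_hot(value, categories):
    """Encode categories: one position per known category, 1 where it matches."""
    return [1 if value == category else 0 for category in categories]

def ngrams(text, n=2):
    """Create N-grams: sliding windows of n consecutive lowercase words."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Hypothetical raw record and category vocabulary.
FUEL_TYPES = ["petrol", "diesel", "hybrid", "electric"]
row = {"sold_at": "2022-11-16 09:30:00", "fuel": "hybrid", "notes": "One careful owner"}

sold_at = parse_timestamp(row["sold_at"])
features = {
    "hour": sold_at.hour,
    "weekday": sold_at.weekday(),          # Monday is 0
    "fuel_onehot": one_hot(row["fuel"], FUEL_TYPES),
    "note_bigrams": ngrams(row["notes"], 2),
}
print(features)
```

Each of these transformations has to be written, tested, and kept in sync between training and serving when done by hand, which is the effort the next slide contrasts with the AutoML workflow.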

AutoML - Fastest path from data to value

AutoML Workflow

● Load dataset
● Set training budget
● Make predictions

Low/No code
Point and click to build custom, high-quality
models using the AutoML workflow in Vertex AI

AutoML workflow
● Define your data schema and target
● Analyze your input features
● Train your model: Feature engineering, Model selection, Hyperparameter tuning
● Evaluate your model behavior
● Deploy your model to get predictions

Automatically search through Google’s whole model zoo...
● Linear, logistic
● Feedforward DNN
● Wide and Deep NN
● Gradient Boosted Decision Tree (GBDT)
● DNN + GBDT Hybrid
● Adanet ensemble
● Neural + Tree Architecture Search
● ...and more!
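
The model selection and hyperparameter tuning steps boil down to trying candidates and keeping the one that scores best on held-out data. The sketch below shows that idea at toy scale by tuning `k` for a nearest-neighbors regressor. It is not how AutoML itself is implemented, which searches over architectures and ensembles. The data and function names are assumptions made up for the example.

```python
def knn_regress(train, x, k):
    """Predict y as the average label of the k training points nearest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def validation_mse(train, valid, k):
    """Mean squared error of the k-NN regressor on held-out data."""
    return sum((knn_regress(train, x, k) - y) ** 2 for x, y in valid) / len(valid)

def tune_k(train, valid, candidates):
    """Grid search: score every candidate k and return the best one."""
    scores = {k: validation_mse(train, valid, k) for k in candidates}
    best = min(scores, key=scores.get)
    return best, scores

# Toy data following y = 2x; validation points fall between training points.
train = [(x, 2 * x) for x in range(10)]
valid = [(2.5, 5.0), (6.5, 13.0)]

best_k, scores = tune_k(train, valid, candidates=[1, 2, 5])
print(best_k, scores)
```

The key design point is that candidates are compared on validation data the model did not train on. A training budget, as in AutoML, simply limits how many candidates the search is allowed to try.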
MLOps on Vertex AI

● Experimentation and development: Vertex Notebooks
● ML Data assets: Datasets & training datasets, Feature Store
● Code Repository: Cloud Source Repository
● CI/CD for Training Pipeline: Cloud Build
● Pipeline components: Container Registry
● CT pipeline: Vertex Pipelines (Vertex Training)
● Logged experiment: Vertex Experiments
● Visualizations: Vertex TensorBoard
● ML metadata: Vertex Metadata
● Model Registry: Vertex Models
● CI/CD for Model Serving: Cloud Build
● Prediction serving: Vertex Prediction
● Monitoring: Vertex Model Monitoring

Flow labels in the diagram: Code & config changes, Pipeline artifacts, Trained model,
Model service, serving features, Serving logs, Alert, Trigger
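
The monitoring, alert, and trigger parts of an MLOps loop can be illustrated with a very small drift check: compare recent serving data against the training baseline and raise a flag when they diverge. This is a simplified stand-in that measures the shift in the mean in units of the baseline standard deviation. It is not the Vertex Model Monitoring API, and the function names, threshold, and numbers are assumptions.

```python
from statistics import mean, stdev

def drift_score(baseline, recent):
    """Shift of the recent mean from the baseline mean, in baseline std devs."""
    return abs(mean(recent) - mean(baseline)) / stdev(baseline)

def should_alert(baseline, recent, threshold=2.0):
    """Signal that retraining may be needed when the feature has drifted."""
    return drift_score(baseline, recent) > threshold

# Hypothetical values of one input feature.
training_values = [50, 52, 48, 51, 49, 50, 53, 47]   # seen at training time
serving_stable = [51, 49, 50, 52]                     # recent traffic, similar
serving_drifted = [58, 60, 57, 61]                    # recent traffic, shifted

stable_alert = should_alert(training_values, serving_stable)
drifted_alert = should_alert(training_values, serving_drifted)
print(stable_alert, drifted_alert)
```

In a full pipeline, a positive result would be the alert that triggers the continuous training (CT) pipeline to retrain and redeploy the model.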

Where to start from?


A simple end to end ML workflow

● Define your data schema and target
● Analyze your input features
● Training
● Evaluate your model behavior
● Deploy your model to get predictions

Tools shown: BigQuery (prod data), Dataflow, Notebooks, Vertex Training, Vertex Prediction,
Model Monitoring, Vertex Pipelines

Vertex AI Demo
Data engineering and smart analytics
https://cloud.google.com/training/data-engineering-and-analytics#data-engineer-learning-path

Data analytics design patterns
https://cloud.google.com/architecture/reference-patterns/overview

Professional Data Engineer
https://cloud.google.com/certification/data-engineer
