AWS Data Analytics - Technical - Student
© 2022 Amazon Web Services, Inc. or its affiliates. All rights reserved. 4
Agenda
Module 2: AWS Data Analytics Portfolio
Customer challenges and opportunities for APN Partners
New realities
Common data analytics challenges
What challenges do you see when using big data analytics/technologies? (n=545)
https://bi-survey.com/challenges-big-data-analytics
AWS data analytics portfolio overview
Secure infrastructure for analytics
Amazon GuardDuty | AWS Identity and Access Management (IAM) | AWS Certificate Manager Private Certificate Authority (ACM Private CA) | AWS Artifact
Amazon Inspector | AWS Shield | AWS Single Sign-On | AWS Well-Architected Tool
AWS Key Management Service (AWS KMS) | AWS CloudHSM | AWS Organizations | Amazon Macie
Amazon Cognito | AWS Directory Service | Amazon Virtual Private Cloud (Amazon VPC) | AWS CloudTrail
Encryption at rest | Encryption in transit | Bring your own keys, hardware security module (HSM) support
AWS data analytics portfolio
Data visualization, engagement, and machine learning: AWS Data Exchange | Amazon QuickSight | Amazon Pinpoint | Amazon SageMaker | Amazon Comprehend | Amazon Polly | Amazon Lex | Amazon Rekognition | Amazon Translate
Analytics: Amazon Redshift | Amazon EMR (Spark and Presto) | AWS Glue (Spark and Python) | Amazon Athena | Amazon OpenSearch Service | Amazon Kinesis Data Analytics
Amazon Simple Storage Service (Amazon S3) | Amazon S3 Glacier | AWS Lake Formation | AWS Glue
Data movement: AWS Database Migration Service (AWS DMS) | AWS Snowball | AWS Snowmobile | Amazon Kinesis Data Firehose | Amazon Kinesis Data Streams | Amazon Managed Streaming for Apache Kafka
Data movement services
Help customers move data from on premises to the cloud
Data lake services
Analytics services
Help customers extract value out of their data
Data visualization, engagement, and machine learning services
Help customers understand and visualize their data, and use machine learning (ML) for advanced analytics and predictions
AWS value proposition
Standards, formats, and open source
Spark | Logstash | Kibana
Data analytics pipeline
Data management challenges
Data analytics pipeline
Collect | Store | Process and analyze | Visualize
Data in, insights out
Time-to-answer (latency)
Balance of throughput and cost
Data pipeline challenges
Building a data pipeline is challenging. Customers must:
• Manage updates, patches, and software integrations
• Handle increased overhead costs plus need for support
• Maintain focus on the core task of building applications that lead to data insights
AWS data analytics pipeline services
Collect: Amazon Kinesis Data Firehose | Amazon Kinesis Data Streams | AWS Snowball | AWS Direct Connect
Store: Amazon S3 | Amazon S3 Glacier | Amazon DynamoDB | Amazon RDS
Process and analyze: Amazon EMR | Amazon Athena | Amazon Kinesis Data Analytics
Visualize: Amazon QuickSight | Amazon SageMaker
Automate: AWS Database Migration Service | AWS Glue
Data Flywheel
Data Flywheel and customer journey
• Migrate data and workloads to the cloud
• Store and manage data
• Modernize data warehouse and build a data lake
• Build data-driven applications
• Innovate with machine learning
Data | Attract new customers | Generate more data
✓ Save time
✓ Save costs
✓ Agility
✓ Global distribution
✓ Scale and performance
✓ New and faster insights
✓ Broader access to analytics
✓ Better experiences
✓ Deeper engagement
✓ Efficient processes
https://pages.awscloud.com/data-flywheel.html
Module 3: Data Analytics Solutions on AWS – Part I
Objectives
In this module, you will learn how to:
• Explain data migration options from on premises to the AWS Cloud
• Describe two AWS data analytics technical solutions
• Modernizing a data warehouse with Amazon Redshift
• Data lakes
Evolution of data architecture
Data migration options
Journey to a modern data architecture
Evolution of data architecture
Types of data
AWS data migration options
AWS Direct Connect | AWS Storage Gateway (file gateway, tape gateway, volume gateway) | Amazon S3 Transfer Acceleration | AWS Snowball | Amazon Kinesis Data Firehose | AWS Database Migration Service
Solution 1: Modernizing a data warehouse with Amazon Redshift
Data warehouses
Data warehouse defined
Central repository of curated data from different sources
• Data optimized for reporting and data analysis
• Data extracted, cleaned, transformed
Benefits
• Better decision making
• Consolidated data from many sources
• Improved data quality, consistency, and accuracy
• Access to historical intelligence
• Improved performance
Diagram labels: Extract | Source 1 | Source 2 | Source 3 | (database) | Staging area | Data warehouse
Data warehouse concepts: https://aws.amazon.com/data-warehouse/
OLTP and OLAP comparison
Characteristic | Online Transactional Processing (OLTP) | Online Analytical Processing (OLAP)
System | Relational Database | Data Warehouse
Data Source | Application create, read, update, delete (CRUD), origin | OLTP and secondary source
Workloads | SQL INSERT, UPDATE, DELETE – short and fast queries | ETL focused, batch job to import, JOINs, run complex queries
Database Design | Highly normalized, many distinct tables to reduce duplication | Denormalized using fewer tables in STAR and snowflake schema with duplicated data for fast performance
Database Size | Depends on the amount of data, typically from MB to TB in size | Growth over time, typically ranges from TB to PB in size
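The two workload styles in the comparison can be contrasted in a small sketch. It uses Python's built-in sqlite3 module purely as a stand-in engine; the table and column names (orders, dim_region, fact_sales) are hypothetical and do not come from the course material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# OLTP style: short, fast single-row writes against a normalized table.
cur.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL, status TEXT)"
)
cur.execute(
    "INSERT INTO orders (customer, amount, status) VALUES (?, ?, ?)",
    ("alice", 20.0, "new"),
)
cur.execute("UPDATE orders SET status = ? WHERE id = ?", ("shipped", 1))
order_status = cur.execute("SELECT status FROM orders WHERE id = 1").fetchone()[0]

# OLAP style: a tiny star schema (one fact table, one dimension table)
# queried with a JOIN and an aggregate over many rows.
cur.execute("CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, region TEXT)")
cur.execute("CREATE TABLE fact_sales (region_id INTEGER, amount REAL)")
cur.executemany("INSERT INTO dim_region VALUES (?, ?)", [(1, "EMEA"), (2, "APAC")])
cur.executemany(
    "INSERT INTO fact_sales VALUES (?, ?)", [(1, 10.0), (1, 15.0), (2, 7.5)]
)
totals = cur.execute(
    """
    SELECT d.region, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_region AS d ON f.region_id = d.region_id
    GROUP BY d.region
    ORDER BY d.region
    """
).fetchall()

print(order_status)
print(totals)
```

A real data warehouse would run the second query over far larger fact tables; the sketch only shows the difference in query shape.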
Traditional architecture and on-premises
data warehouse challenges
• Difficult to scale
• Long lead times for hardware procurement
• Complex upgrades are the norm
• High overhead costs for administration
• Expensive licensing and support costs
• Proprietary formats do not support newer open data formats, which results in data silos
• Data not cataloged, unreliable quality
• Licensing cost limits number of users and how much data can be accommodated
• Difficult to integrate with services and tools
Amazon Redshift
Amazon Redshift
Secure data warehouse that extends seamlessly to a data lake
• Breaks a large job into smaller tasks, then distributes the tasks to multiple compute nodes. Result: Faster processing time
• Data from each column is stored together so the data can be accessed faster, without scanning and sorting all other columns. Result: Compression of stored data improves performance
• Independent and resilient nodes without any dependencies. Result: Improves scalability
Amazon Redshift architecture
Leader node
Compute node
Runs queries in parallel and returns the result to the leader node
• SQL running powerhouses
• Compute nodes can load, unload, back up, and restore data to and from Amazon S3.
• Clusters range from 1 to 128 compute nodes.
Compute node slices
Slices are a symmetric multiprocessing (SMP) mechanism.
• Each compute node is partitioned into slices.
• Slices work in parallel to complete operations.
Diagram labels: Compute node 1 (node slices) | Compute node 2 (node slices)
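The idea that slices each work on part of the data in parallel, with partial results combined afterwards, can be sketched in a few lines. This is an illustration only, not how Amazon Redshift is implemented: the round-robin distribution, the thread pool standing in for slices, and the function names are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_slices(rows, num_slices):
    """Distribute rows round-robin across slices (hypothetical distribution style)."""
    slices = [[] for _ in range(num_slices)]
    for index, row in enumerate(rows):
        slices[index % num_slices].append(row)
    return slices


def slice_sum(rows):
    """Work done by one slice: aggregate only its own portion of the data."""
    return sum(rows)


def parallel_sum(rows, num_slices=4):
    """Run the per-slice work in parallel, then combine the partial results."""
    slices = split_into_slices(rows, num_slices)
    with ThreadPoolExecutor(max_workers=num_slices) as pool:
        partials = list(pool.map(slice_sum, slices))
    # Combining partial results plays the role of the leader node here.
    return sum(partials)


print(parallel_sum(list(range(1, 101))))
```

The combined result matches a single-threaded sum; the point is only that each slice touches its own subset of rows.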
Amazon Redshift instance types
https://docs.aws.amazon.com/redshift/latest/gsg/getting-started.html
Management interfaces
https://us-west-2.console.aws.amazon.com/redshiftv2/home?region=us-west-2#query-editor
Solution 2: Data lakes
Extract more value from data
Diagram labels: Applications | IoT devices | Data lake | Application developers | Data analysts | Data engineers and data scientists
Reference architecture: Data lake on AWS
Catalog and search: AWS Glue | Amazon DynamoDB | Amazon ES
Access and user interface: Amazon API Gateway | IAM | Amazon Cognito
Cleansing data
After migration, data still presents challenges:
Amazon Athena
Amazon Redshift lake house
Amazon EMR
Use case: Log aggregation with ETL
Data services – AWS Data Exchange and Amazon Athena
AWS Data Exchange
Find and subscribe to third-party data in the cloud
Find diverse data in one place
• More than 1,000 data products
• More than 80 data providers
Analyze data
• Download or copy data to Amazon S3
• Combine, analyze, and model with existing data
Access third-party data
• Streamlined access to data
• Minimize legal reviews and negotiations
Amazon Athena
Interactive query service to analyze data in Amazon S3 using standard SQL
• Zero setup costs: point to Amazon S3 and start querying
• Pay only for queries run: save 30%–90% on per-query costs through compression
• ANSI SQL interface, JDBC/ODBC drivers, multiple formats, compression types, and complex joins and data types
• Serverless: zero infrastructure, zero administration, integrated with Amazon QuickSight
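The per-query savings from compression follow from simple arithmetic if the cost of a query is taken to be proportional to the amount of data it scans. The sketch below assumes that model; the price per terabyte and the data sizes are hypothetical inputs chosen for illustration, not figures from this course.

```python
TB = 1024 ** 4  # bytes in one tebibyte, used here as the billing unit (assumption)


def query_cost(bytes_scanned, price_per_tb):
    """Cost of one query when billing is proportional to data scanned."""
    return bytes_scanned / TB * price_per_tb


def savings_fraction(raw_bytes, compressed_bytes):
    """Fraction of the per-query cost saved by scanning compressed data."""
    return 1 - compressed_bytes / raw_bytes


# Hypothetical example: 1 TB of raw data that compresses to a quarter of its size.
price = 5.0                      # assumed price per TB scanned
raw = 1 * TB
compressed = raw // 4

print(query_cost(raw, price))             # cost when scanning uncompressed data
print(query_cost(compressed, price))      # cost when scanning compressed data
print(savings_fraction(raw, compressed))  # fraction saved
```

With these inputs the saving is 75%, which sits inside the 30%–90% range quoted on the slide; actual savings depend on the data and the compression format.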
AWS Lake Formation
Challenges of building a secure data lake
1. Set up storage
2. Move data
3. Cleanse, prepare, and catalog data
4. Configure and enforce security and compliance policies
5. Make data available for analytics
1. Ingest and organize: Automates creating data lake and data ingestion.
2. Secure and control: Sets up fine-grained access control and data governance.
3. Collaborate and use: Search and data discovery using Data Catalog metadata.
4. Monitor and audit: Based on data access and governance policies, alert notifications are raised on policy violation and logged.
To protect data, all access is checked against set policies.
AWS Lake Formation builds on AWS Glue
AWS Lake Formation
AWS Glue: AWS Glue ETL jobs | AWS Glue crawlers | Connections, databases, tables
AWS Lake Formation benefits
• Centralized management of fine-grained permissions empowers security officers.
• Simplified ingest and cleaning enables data engineers to build faster.
Diagram labels: AWS Lake Formation | AWS Glue | Blueprints | ML Transforms | Data Catalog | Access control
Data visualization with Amazon QuickSight
Amazon QuickSight
BI service built for the cloud with pay-per-session pricing and ML insights
Serverless data lakes and analytics
Diagram labels: Amazon RDS | Other databases | On-premises data | Streaming data | Amazon S3 | AWS Glue crawler | AWS Glue Data Catalog | Amazon EMR | Amazon Redshift Spectrum | Amazon QuickSight
Summary
Module 4: AWS Data Analytics Solutions – Part II
Objectives
In this module, you will learn about three key types of data analytics
technical solutions on AWS:
• Streaming and real-time analytics with Amazon Kinesis
• Data governance
• Extended solution: Insights and monetization with machine learning (ML)
Evolution of data architecture
Solution 3: Streaming and real-time analytics with Amazon Kinesis
Journey to a modern data architecture
Evolution of data architecture
Common use cases: Real-time analytics
The value of data diminishes over time
Enabling real-time analytics
Data streaming technology enables a customer to ingest, process, and analyze high
volumes of high-velocity data from a variety of sources, in real time.
Data streaming solution challenges
AWS streaming data solutions
Efficiently collect, process, and analyze data streams in real time
Data generators: Simple streaming data patterns
Data producers | Streaming services | Data consumers
Amazon Kinesis Data Streams
Massively scalable, highly durable data ingestion and processing service
optimized for real-time data streaming
• Data collected is available within 70 milliseconds
• Real-time analytics: dashboards, anomaly detection, dynamic pricing
• Synchronously replicates data across 3 Availability Zones in a Region
• Serverless, can scale dynamically to handle MB to TB each hour and thousands to millions of PutRecords each second
• No upfront cost; low, pay-as-you-go pricing
https://aws.amazon.com/kinesis/data-streams/faqs/?nc=sn&loc=5
How Kinesis Data Streams works
Input: Capture and send data
Amazon Kinesis Data Streams: Ingest and store data streams for processing
Amazon Kinesis Data Analytics | Amazon EC2 | AWS Lambda
Output: Analyze streaming data using BI tools
Amazon Kinesis Data Firehose
How Kinesis Data Firehose works
• Capture and send data
• Prepares and loads data continuously to the selected destinations
• Durably store the data for analytics: Amazon S3 | Amazon OpenSearch Service | Splunk
• Analyze streaming data using analytics tools
Kinesis Data Streams and Kinesis Data Firehose
Characteristics | Amazon Kinesis Data Streams | Amazon Kinesis Data Firehose
Stream storage and duration | In shards, default 24 hours and up to 365 days | Max buffer size 128 MB and max time 900 seconds
Data transformation and conversion | None | Uses AWS Lambda and AWS Glue
Data producer | Amazon Kinesis Agent, applications using Amazon Kinesis Producer Library (KPL), AWS SDK for Java, Amazon CloudWatch Logs and CloudWatch Events, AWS IoT (applies to both services)
Data consumer | AWS Lambda, Amazon Kinesis Data Analytics, Amazon Kinesis Data Firehose, applications using the Kinesis Client Library (KCL) and SDK for Java | AWS Lambda, Amazon Kinesis Data Analytics, and Kinesis Data Firehose, apps using the KCL and SDK for Java, Amazon S3, Amazon Redshift, Amazon ES, Splunk, and Amazon Kinesis Data Analytics
https://aws.amazon.com/kinesis/data-streams/faqs/?nc=sn&loc=5
https://aws.amazon.com/kinesis/data-firehose/faqs/?nc=sn&loc=5
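The buffer limits in the comparison (a maximum size and a maximum time) describe a common delivery pattern: records accumulate and the batch is delivered as soon as either limit is reached. The sketch below illustrates that pattern in plain Python. It is not the Kinesis Data Firehose implementation; the class name, the `deliver` callback, and the caller-supplied timestamps are hypothetical, and this simplified version only checks the limits when a new record arrives.

```python
class BufferedDelivery:
    """Buffer records and deliver a batch when a size or time limit is reached."""

    def __init__(self, max_bytes, max_seconds, deliver):
        self.max_bytes = max_bytes
        self.max_seconds = max_seconds
        self.deliver = deliver      # callback that receives a list of records
        self.buffer = []
        self.size = 0
        self.opened_at = None       # time the first buffered record arrived

    def put(self, record, now):
        """Add one record (bytes); `now` is a timestamp in seconds."""
        if not self.buffer:
            self.opened_at = now
        self.buffer.append(record)
        self.size += len(record)
        size_reached = self.size >= self.max_bytes
        time_reached = now - self.opened_at >= self.max_seconds
        if size_reached or time_reached:
            self.flush()

    def flush(self):
        """Deliver whatever is buffered and start a new, empty buffer."""
        if self.buffer:
            self.deliver(self.buffer)
        self.buffer = []
        self.size = 0
        self.opened_at = None


delivered = []
# Tiny limits for the demonstration; the slide lists 128 MB and 900 seconds as maximums.
stream = BufferedDelivery(max_bytes=10, max_seconds=900, deliver=delivered.append)
stream.put(b"aaaa", now=0)
stream.put(b"bbbb", now=1)
stream.put(b"cccc", now=2)    # 12 bytes buffered: size limit reached, batch delivered
stream.put(b"d", now=3)
stream.put(b"e", now=903)     # 900 seconds since the batch opened: time limit reached
print(delivered)
```

Larger limits mean fewer, bigger deliveries at the price of higher latency, which is the trade-off behind choosing between the two services.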
When to use Kinesis Data Streams and Kinesis Data Firehose
For data streaming applications that require near real-time responses in seconds
Amazon Kinesis Data Analytics
Amazon Kinesis Data Analytics
Input | Amazon Kinesis Data Analytics | Output
Use case: Clickstream analytics
Evolve from batch processing to real-time analytics
1. Websites send clickstream data
2. Collects the data and sends to Kinesis Data Analytics
3. Processes data in near-real time
4. Loads processed data into Amazon Redshift
5. Runs analytics models to identify content recommendations
6. Readers see personalized content suggestions and increase engagement
Solution 4: Data governance
Journey to a modern data architecture
Evolution of data architecture
• Securing data
• Auditing data usage
• Managing data access
• Safeguarding sensitive data and PII
• Maintaining regulations and mandates
Resolving PII dangers
Threats to personally identifiable information (PII): Consumer consent violation | External hacking | Data breach | Second-party misuse | Spyware | Unsecured devices | Espionage | Rogue agents
• Do these issues need to be resolved?
• Is there a solution architecture that solves all PII issues?
• What best practices can be used to mitigate PII dangers?
Amazon Macie
• Amazon Macie: Enable Amazon Macie with one click in the AWS Management Console or with a single API call
• Continually evaluate Amazon S3 environment: Automatically generates an inventory of Amazon S3 buckets and details on the bucket-level security and access controls
• Discover sensitive data: Analyzes buckets using ML and pattern matching to discover sensitive data, like PII (financial, personal, national, medical, credentials and secrets)
• Take action: Generates findings and sends to Amazon CloudWatch Events for integration into workflows and remediation actions
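Pattern matching for sensitive data can be illustrated with a few regular expressions. This is a toy sketch, not how Amazon Macie works internally: the three patterns, their names, and the sample text are hypothetical, and a real detector would need far more robust rules plus the ML-based classification the slide mentions.

```python
import re

# Hypothetical, deliberately simple patterns for three kinds of sensitive data.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def find_sensitive(text):
    """Return a dict mapping each pattern name to the matches found in text."""
    findings = {}
    for name, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            findings[name] = matches
    return findings


sample = "Contact jane@example.com, SSN 123-45-6789, key AKIAIOSFODNN7EXAMPLE"
findings = find_sensitive(sample)
print(findings)
```

Each finding could then be routed to a workflow, in the spirit of the "Take action" step above.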
Journey to a modern data architecture
Evolution of data architecture
Diagram labels: AI and machine learning | Business analytics | Data warehouse | Big data | Interactive queries | Real time processing | Data Catalog | Data warehouse | Data lake
Machine learning requires:
• More data: Collect all types of data
• Flexibility: Define schema during analysis
• Scalability: Scale storage and compute (CPU or GPU) independently
• Data transformation and processing: Run a broad set of processing and analytics on the
• Security: Networking, identity, encryption, and compliance
https://aws.amazon.com/machine-learning/?nc2=h_ql_prod_ml
Machine learning solutions on data lakes
Diagram labels: Amazon S3 | Processed data | Amazon S3 | AWS Database Migration Service | Amazon DynamoDB
Use case: Next Caller
https://www.youtube.com/watch?v=K27WjYwyqw8&list=PLhr1KZpdzukdeX8mQ2qO73bg6UKQHYsHb&index=1&did=ta_card&trk=ta_card
Summary
Module 5: AWS Technical Conversations and Engagement
AWS six-phase strategy for implementing a data analytics solution
Data analytics projects: A phased strategy
Phase 1: Data analytics in the cloud assessment
Phase 1: Data analytics in the cloud assessment: Identify challenges
• Provide AWS services overview
• Emphasize how the pieces fit together
• Introduce customer references and use cases
• Conduct differentiation conversations
Data analytics in the cloud assessment: AWS best practices
Do | Avoid
Phase 2: Use case identification
Use case identification: AWS best practices
Do | Avoid
Phase 3: Architecture and data migration
Do: Engage AWS
• AWS Partner Development Managers
• Partner Solutions Architects
• AWS Professional Services
Avoid: Engaging AWS Support too late in the process
Phase 4: Proof of concept delivery
Phase 4: Proof of concept
Phase 5: Application tuning and optimization
Goal
Objectives
Phase 6: Migration from POC to production
Do
Possible POC pitfalls
Delivering a successful POC
POC success factors
New products | 2 new products annually | 6 new products annually | 200% increase in products launched
Data availability | Batch only | Micro-batch, real-time streaming | 80% faster time to data visibility
Customer engagement | 30,000 page views | 37,500 page views | 25% increase in customer engagement
https://calculator.aws/#/addService
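The percentage figures in the success factors can be checked with the usual relative-change formula, (after - before) / before. The helper name below is hypothetical; the input values are the ones shown in the table.

```python
def percent_increase(before, after):
    """Relative change from `before` to `after`, expressed as a percentage."""
    return (after - before) / before * 100


products = percent_increase(2, 6)              # new products launched annually
engagement = percent_increase(30_000, 37_500)  # page views

print(products)
print(engagement)
```

Going from 2 to 6 products is a 200% increase, and 30,000 to 37,500 page views is a 25% increase, which matches the table.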
AWS Well-Architected review using the Analytics Lens
10 design principles: Analytics applications, 1–5
1. Automate data ingestion to handle big data
2. Design ingestion for failures and duplicates
3. Preserve original source data
4. Describe data with metadata
5. Establish data lineage
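Principle 2 (design ingestion for failures and duplicates) is often handled by making ingestion idempotent, so that a retried batch does not create duplicate records. The following is a minimal sketch under the assumption that every event carries a unique `id` field; the function name, the in-memory `store` list, and the `seen_ids` set are hypothetical stand-ins for durable storage.

```python
def ingest(events, store, seen_ids):
    """Append events to store, skipping any whose id was already ingested.

    Returns the number of newly accepted events.
    """
    accepted = 0
    for event in events:
        event_id = event["id"]
        if event_id in seen_ids:
            continue  # duplicate delivery or retried batch: ignore
        store.append(event)
        seen_ids.add(event_id)
        accepted += 1
    return accepted


store, seen_ids = [], set()
batch = [{"id": "a", "score": 1}, {"id": "b", "score": 2}, {"id": "a", "score": 1}]

first_attempt = ingest(batch, store, seen_ids)
retry = ingest(batch, store, seen_ids)  # simulate a producer retrying after a failure

print(first_attempt, retry, len(store))
```

Because the retry accepts nothing new, producers can safely resend a batch whenever they are unsure whether the first attempt succeeded.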
10 design principles: Analytics applications, 6–10
6. Use the right ETL tool for the job
7. Orchestrate ETL workflows
8. Tier storage appropriately
9. Secure, protect, and manage the entire analytics pipeline
10. Design for scalable and reliable analytics pipelines
Activity: Game Analytics Pipeline Architecture
Game analytics pipeline solution architecture
Role
You are a Partner solution engineer (SE) helping a cloud gaming architect at a hot new startup.
Goal
Whiteboard an AWS architecture for a game analytics pipeline for a multi-player game with over five million gamers worldwide.
Requirements
• Enable the ingestion of streaming data from millions of gamers playing from their desktop PCs now, and eventually mobile devices
• Enable customers to capture real-time analytics, monitoring the game and gamers to improve the gamer experience and the game, and for monetization.
• Enable internal teams to track activities such as key performance indicators (KPIs), system performance, user activity, gamer satisfaction reporting, and expenses.
Constraints
• Small IT staff
• Low budget
Whiteboard: Game analytics pipeline architecture
Whiteboard: Data producers and consumers
Data producers: PC clients (AWS SDK)
Data consumers: Live ops | Service teams | Data engineers
Whiteboard: Data producers and consumers
Data producers: PC clients (AWS SDK)
Events stream
Data consumers: Service teams | Data engineers
Whiteboard: Events stream
Data producers: PC clients (AWS SDK)
Events stream
Data consumers: Service teams | Data engineers | Data analysts
Which AWS services would be most suitable for processing the event streaming data?
Whiteboard: Streaming analytics
Pipeline stages: Events stream | Streaming analytics
Data producers: PC clients (AWS SDK)
Streaming ingestion: Kinesis Data Firehose | AWS Lambda
Data consumers: Service teams | Data engineers | Data analysts
How will the business users be able to view the data? Which AWS tools would you recommend?
Whiteboard: Data visualization and interactive analytics
Pipeline stages: Events stream | Streaming analytics | Metrics and notifications
Data producers: Servers and backend (AWS SDK)
Events stream: API Gateway (events) | Kinesis Data Firehose | AWS Lambda | Amazon S3 | AWS Glue
Configuration data: Admin | Configure apps | Configuration endpoints | Lambda authorizer | AWS Lambda | DynamoDB
Interactive analytics: Athena | QuickSight
Data consumers: Data engineers | Data analysts
Try it yourself
AWS Solution Implementation: Game Analytics Pipeline | Serverless Game Analytics Workshop | New Game Technology Learning Path
Demo - Serverless Data Lake
Real-time Data Streaming into Serverless Data Lake
Diagram labels: Glue ETL | Glue Crawler | Glue Data Catalog | Athena | QuickSight
Module 6: APN Partner Opportunities and Resources
APN Partners and AWS for Data Analytics
Discounting and funding programs
AWS Data and Analytics Competency categories
• Data Analytics Platforms: Provide a set of integrated tools to solve data analytics challenges within a standard framework
• Business Intelligence (BI) and Data Visualization: Help customers turn raw data into actionable business information, such as reporting, dashboards, and data visualization
Best practices after identifying an opportunity
• Register your opportunity through APN Partner Central
• Use existing Partner programs
• Achieve AWS Data and Analytics competency
• Cultivate strong relationships with AWS sales teams
Collaboration workflow
• Register an opportunity on APN Partner Central
• Receive approval from AWS PSM
• Engage AWS sales
• Engage AWS account or Partner SA
• Before SA involvement
AWS Professional Services
Data solutions in AWS Marketplace
AWS Marketplace
https://aws.amazon.com/marketplace/search/results?searchTerms=data+and+analytics
Call to action
Build a data analytics practice on AWS
• Participate in the AWS Data Lab: https://aws.amazon.com/aws-data-lab/
• Build relationships with APN teams for funding opportunities for your marketing and sales efforts
• Prepare for the AWS Data Analytics – Specialty certification
Thank you
© 2022 Amazon Web Services, Inc. or its affiliates. All rights reserved. This work may not be reproduced or redistributed, in whole or in part, without prior written permission
from Amazon Web Services, Inc. Commercial copying, lending, or selling is prohibited. Corrections, feedback, or other questions? Contact us at
https://support.aws.amazon.com/#/contacts/aws-training. All trademarks are the property of their owners.