ANT336 Building Data Mesh Architectures On AWS
© 2022, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Agenda
Modern data strategy
[Diagram labels: data sources; people, apps, and devices; data lakes; analytics; machine learning; databases; catalog; governance]
[Diagram: "on AWS". Services shown: Amazon S3, Amazon DynamoDB, Amazon EMR, Amazon OpenSearch Service, Amazon Aurora, Amazon Redshift, Amazon SageMaker]
Security, durability, availability
Price/performance
Data governance
Customer challenges
Sharing data in an enterprise can be challenging
[Diagram: multiple separate data lakes]
Common data lake challenges we hear from our customers
“I wish to focus on innovating with data, not on maintaining and administering a data lake”
“My data science team should easily find the datasets they seek and have the ability to share them with others”
“My team needs to own datasets, pipelines, and repositories that are isolated from other teams”
“There is a mismatch between executive leadership goals & business line deliverables, and incentives”
“Our internal policies on what can be shared are unclear & there is a lack of incentive to share”
“I need to create a model to support sharing from both producers and consumers of data”
Sharing data in an enterprise can be challenging
[Chart labels: data lake; time; outcomes; delivery velocity]
Centralized data lakes are complex to scale across business units
ANALYTICS, BUSINESS INTELLIGENCE, AND MACHINE LEARNING
Amazon Redshift (data warehousing) | Amazon EMR (Hadoop + Spark) | Amazon Athena (interactive analytics) | AWS Data Exchange | QuickSight (visualizations) | Amazon SageMaker (ML)
DATA LAKE
Lake Formation | Amazon S3 (data lake storage)
DATA MOVEMENT
AWS Database Migration Service | AWS Snowball | AWS Snowmobile | Amazon Kinesis Data Firehose | Amazon Kinesis Data Streams | Managed Streaming for Kafka
Sharing data in an enterprise can be challenging
ORGANIZATIONAL INCENTIVES AROUND DATA SHARING CAN BE MISALIGNED
“Everyone wants to be a consumer, no one wants to be a producer”
[Image: sign reading "FOR SALE: FRESH QUALITY DATA"]
Why data mesh?
Measure and invest in data products based on usage and business value
[Diagram labels: Data domain; Data mesh; AWS Console; AWS SDK; CloudFormation; Service catalog; CloudFormation template library; Organizations service control policies]
Data mesh: Design goals and four key principles
Data mesh: Four core principles
Modern data architecture consists of five parts
[Diagram labels: Data domain; Log analytics; Relational databases; Data warehousing; Machine learning; Data sharing; Unified governance; Data discovery]
Data mesh architecture
DECENTRALIZED, LIGHTWEIGHT FEDERATED GOVERNANCE ACROSS DOMAIN-ORIENTED DATA SYSTEMS TO DRIVE GOVERNED SHARING
[Diagram: PRODUCER 1 …. PRODUCER N]
Data mesh principle #1: Data domain ownership
DATA OWNERS ARE ACCOUNTABLE FOR THEIR DATA PRODUCTS TO BE RELIABLE, AVAILABLE, AND ACCURATE
[Diagram labels: account (three instances); atomic integrity]
Data mesh principle #2: Data as a product
DOMAIN-DRIVEN DESIGN TECHNIQUES TO FORMULATE AND ESTABLISH BOUNDED CONTEXTS FOR DATA PRODUCTS
› Data products are valuable on their own
[Diagram labels: metadata, lineage; sources: Devices, Web, Sensors, Social]
Data mesh principle #3: Self-serve sharing
ECOSYSTEM OF SELF-SERVE DATA INFRASTRUCTURE WITH OPEN PROTOCOLS
› Design for generalist majority (i.e., make it easy to use and adopt services with no specialist skills needed)
› Enable personas to discover, learn, understand, consume, and maintain data products
› Collection of interoperable data products, which enable cross-functional domains to produce and consume data easily and with autonomy and will allow it to scale
› Data products must include data, metadata, code, and policy, all as a single unit of value
› Abstract complexity through automation
[Diagram labels: Data consumer (Amazon Athena, AI/ML, Amazon Redshift, Amazon QuickSight); Authorize principal; Query from client; Federated data governance (Data Catalog, Data attributes, Policy control, AWS Lake Formation, Data permissions); Data domain (Data lake, Data warehouse, Data marketplace)]
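The bullet about bundling data, metadata, code, and policy as a single unit of value can be made concrete with a small descriptor. The following is a minimal Python sketch, not an AWS API: the `DataProduct` class, its field names, and the example values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor bundling the four parts named on the slide."""
    name: str
    domain: str
    data_location: str = ""                        # where the data lives
    metadata: dict = field(default_factory=dict)   # schema, owner, lineage notes
    code_ref: str = ""                             # pointer to the producing pipeline
    policy: dict = field(default_factory=dict)     # who may read the product

    def missing_parts(self) -> list:
        """Return the names of any of the four parts that are still empty."""
        parts = {
            "data": self.data_location,
            "metadata": self.metadata,
            "code": self.code_ref,
            "policy": self.policy,
        }
        return [part for part, value in parts.items() if not value]

    def is_complete(self) -> bool:
        """A product counts as one unit only when all four parts are present."""
        return not self.missing_parts()


# Example values are made up for illustration.
orders = DataProduct(
    name="orders_daily",
    domain="sales",
    data_location="s3://example-sales-bucket/orders_daily/",
    metadata={"owner": "sales-data-team", "schema": {"order_id": "string"}},
    code_ref="pipelines/orders_daily.py",
    policy={"readers": ["marketing"]},
)
draft = DataProduct(
    name="returns",
    domain="sales",
    data_location="s3://example-sales-bucket/returns/",
)

print(orders.is_complete())   # True
print(draft.missing_parts())  # ['metadata', 'code', 'policy']
```

A completeness check like this is one place where "abstract complexity through automation" could apply: a self-serve platform might refuse to publish a product until all four parts are present.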
Data mesh principle #4: Federated data governance
GOVERNANCE MODEL THAT EMBRACES DECENTRALIZATION AND DOMAIN SELF-SOVEREIGNTY THROUGH DECISION-MAKING MODEL LED BY FEDERATION OF DATA PRODUCT OWNERS
› Decentralized implementation of governance team and standard authorization
› Governance team = a guild consisting of representatives of all teams taking part in the data mesh
› They create global policies and standardization to achieve interoperability
› Automated execution of policies by the data domains (e.g., data classification and privacy, compliance, security, documentation, and interoperability)
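One way to read "global policies, automated execution by the data domains" is as policy-as-code: the governance guild publishes a shared set of checks, and each domain runs them against its own data products. The sketch below assumes that reading. The policy names, metadata keys, and the `evaluate` function are hypothetical and do not correspond to any AWS service.

```python
from typing import Callable, Dict, List

# A policy is a named check over a data product's metadata dictionary.
Policy = Callable[[dict], bool]

# Global policies agreed once by the federated governance team (hypothetical).
GLOBAL_POLICIES: Dict[str, Policy] = {
    "has_owner": lambda meta: bool(meta.get("owner")),
    "has_classification": lambda meta: meta.get("classification")
    in {"public", "internal", "confidential"},
    "is_documented": lambda meta: bool(meta.get("description")),
}


def evaluate(meta: dict, policies: Dict[str, Policy] = GLOBAL_POLICIES) -> List[str]:
    """Run every global policy and return the names of those that fail."""
    return [name for name, check in policies.items() if not check(meta)]


# Each domain runs the same checks locally before publishing a product.
compliant = {
    "owner": "payments-team",
    "classification": "confidential",
    "description": "Settled payment transactions, one row per transaction.",
}
non_compliant = {"owner": "care-team", "classification": "secret"}

print(evaluate(compliant))      # []
print(evaluate(non_compliant))  # ['has_classification', 'is_documented']
```

Keeping the policy definitions central while running them inside each domain mirrors the split on the slide: standardization is global, execution is local.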
Data mesh architecture pattern: Data lake data products sharing
[Diagram: Data domain organizations 1 … N (shown twice) around a central organization (federated governance account). Labels: Data lake; AWS Data Exchange; Amazon Athena; Amazon Redshift. Numbered steps that survive in the extracted text: 1 Register data location; 9 Share assets collections]
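The extracted text keeps only two of the numbered steps in this pattern (1 and 9), so the full flow is not recoverable. As a rough illustration of what "register data location" and a subsequent share might look like with AWS Lake Formation, the sketch below builds request parameters without calling AWS. The dictionary keys are meant to match the parameters of boto3's Lake Formation `register_resource` and `grant_permissions` calls and should be verified against the current API reference. Treating the share step as a Lake Formation grant is an assumption, and the bucket, database, table, and account ID are made-up examples.

```python
def register_location_request(bucket: str, prefix: str = "") -> dict:
    """Build parameters for registering an S3 location with Lake Formation."""
    arn = f"arn:aws:s3:::{bucket}"
    if prefix:
        arn += "/" + prefix.strip("/")
    return {"ResourceArn": arn, "UseServiceLinkedRole": True}


def share_table_request(database: str, table: str, consumer_account_id: str) -> dict:
    """Build parameters for granting a consumer account read access to a table."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": consumer_account_id},
        "Resource": {"Table": {"DatabaseName": database, "Name": table}},
        "Permissions": ["SELECT", "DESCRIBE"],
    }


# Hypothetical producer bucket, table, and consumer account.
register = register_location_request("example-producer-bucket", "sales/orders")
share = share_table_request("sales", "orders", "111122223333")

print(register["ResourceArn"])  # arn:aws:s3:::example-producer-bucket/sales/orders

# With credentials configured, the calls would look like this (not run here):
#   import boto3
#   lf = boto3.client("lakeformation")
#   lf.register_resource(**register)
#   lf.grant_permissions(**share)
```

Building the request dictionaries separately from the client calls keeps the sketch runnable without an AWS account and makes the parameters easy to review before anything is applied.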
Data mesh architecture pattern: Data lake and data warehouse data product sharing
[Diagram: Data domain organizations 1 … N (shown twice) around a central organization (federated governance account). Labels: Data lake; Data Catalog; Amazon Redshift; Data share; Populate metadata. Numbered steps that survive in the extracted text: 1 Register data location; 2 Register data share tables]
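For the data warehouse side of this pattern, a producer would typically define an Amazon Redshift datashare and make it available through the central catalog. The sketch below only generates the SQL text and does not connect to a cluster. The statements follow the Redshift `CREATE DATASHARE`, `ALTER DATASHARE ... ADD`, and `GRANT USAGE ON DATASHARE ... VIA DATA CATALOG` forms. Whether the original diagram used exactly these statements is an assumption, and the share, schema, table, and account names are hypothetical.

```python
import re

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _ident(name: str) -> str:
    """Allow only simple identifiers so names can be placed into SQL safely."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def datashare_statements(share: str, schema: str, tables: list,
                         catalog_account_id: str) -> list:
    """Return the producer-side SQL for sharing tables via the data catalog."""
    if not re.fullmatch(r"\d{12}", catalog_account_id):
        raise ValueError("expected a 12-digit AWS account ID")
    share, schema = _ident(share), _ident(schema)
    statements = [
        f"CREATE DATASHARE {share};",
        f"ALTER DATASHARE {share} ADD SCHEMA {schema};",
    ]
    statements += [
        f"ALTER DATASHARE {share} ADD TABLE {schema}.{_ident(table)};"
        for table in tables
    ]
    statements.append(
        f"GRANT USAGE ON DATASHARE {share} "
        f"TO ACCOUNT '{catalog_account_id}' VIA DATA CATALOG;"
    )
    return statements


# Hypothetical share of one table to a central governance account.
for statement in datashare_statements("sales_share", "public", ["orders"], "111122223333"):
    print(statement)
```

Validating identifiers before formatting them into SQL is a deliberate choice here, since these statements cannot use bind parameters for object names.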
GoDaddy’s journey with AWS
Our vision is to radically shift the global economy toward life-fulfilling entrepreneurial ventures
Our mission is to empower entrepreneurs everywhere, making opportunity more inclusive for all
Our strategy
We champion everyday entrepreneurs by empowering them with sage guidance set in seamlessly intuitive experiences to securely name, create, and grow their ventures in select markets; leveraging the exponential power of our community at global scale to deliver profitable revenue growth
At GoDaddy our goal is to partner with our customers at every point on this wheel
[Wheel diagram labels: Marketplaces and social media; Messaging commerce; Digital identity; Payments; Domain; Logo; Email; Connected commerce; Bio site; Create posts; Website; Online store; Hosting and security; Physical store]
The largest domain registrar in the world
21M+ customers
84M+ domains
15M+ human-guided moments*
GoDaddy & AWS – 5 years of strategic collaboration
Data mesh: Can a data mesh create customer clarity?
Multi-domain – insights & sharing
[Diagram: equation-style graphic ("+", "+", "=") with labels: Data discovery; Data governance; Data sharing; Unified data experience; Data consumer; Preparation; Discovery; Augmentation; Interaction; Exploration]
Multi-domain federated customer actions
[Diagram labels: Care; Cohort; First-time user; Learn; First milestone use; Churn risk; Manage products; Customer; High use; Low use; Active users; Product engagement; User conversion; Bot; SERP; Purchased; Repeat visitor; Upsell/cross-sell; Shop; Setup. Callout letters: B, C, D]
First milestone use: 10,480
Manage products: 11,202
Product engagement: 20,042
Entitlements: GoDaddy Payments, W+M, Office365
Upcoming renewal: Web+Marketing
Upcoming renewal date: 10/2/2023
Autopay: On
Top prod name: WAM
Total active time: 14hrs 22mins
Websites published: 23
Websites updated: 204
Multi-domain federated customer actions
[Diagram labels: Care; Churn risk; User conversion; Upsell/cross-sell; Cohort; First milestone use; High use; Low use; Other 1; Other 2; Purchased; Setup; Cart; Chat; Forums; Product engagement; Renew; Signup; Learn; Manage products; SERP; Shop; First-time user; Customer; Bot; Repeat visitor. Callout letters: A, C, D]
GoDaddy data mesh – Customer layers
Data owner: Data domain ownership
Data engineer: Data as a product
Data steward: Data governance council
Data consumer: Self-serve sharing
CUSTOMER TIERS (multi-domain products)
Visiting user (tier 0)
Prospective customer (tier 1)
Highly or lowly engaged (tier 2)
Conversion (tier 3)
Account (tier 4)
High-value account (tier 5)
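Because the customer tiers are ordered from 0 to 5, they map naturally onto an ordered enumeration that every domain could share. The sketch below is illustrative only: the member names are derived from the slide labels, and the `has_account` helper is a hypothetical example of a cross-domain rule, not something stated in the talk.

```python
from enum import IntEnum


class CustomerTier(IntEnum):
    """Customer tiers as listed on the slide, ordered from 0 to 5."""
    VISITING_USER = 0
    PROSPECTIVE_CUSTOMER = 1
    ENGAGED = 2              # "Highly or lowly engaged" on the slide
    CONVERSION = 3
    ACCOUNT = 4
    HIGH_VALUE_ACCOUNT = 5


def has_account(tier: CustomerTier) -> bool:
    """Hypothetical helper: treat tiers 4 and 5 as the account tiers."""
    return tier >= CustomerTier.ACCOUNT


print(CustomerTier(3).name)                          # CONVERSION
print(has_account(CustomerTier.HIGH_VALUE_ACCOUNT))  # True
print(has_account(CustomerTier.ENGAGED))             # False
```

Using `IntEnum` keeps the tier numbers from the slide as the stored values while still allowing ordered comparisons between tiers.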
How GoDaddy built a data mesh using AWS modern data architecture
Data mesh
Data owner: Data domain ownership
[Diagram labels: Producer domains; Utility domains; Systems of record; 1st party; 3rd party; Services; Products; Data ingress; S3; Self-service*]
Data mesh
Data consumer: Self-serve sharing
[Diagram labels: Producer domains; Consumer domains; Utility domains; Systems of record; 1st party; 3rd party; Services; Products; Data ingress; S3; Data products; Data egress; Business insights; Self-service* (shown twice)]
Data mesh
Data engineer: Data as a product
[Diagram labels: Producer domains; Consumer domains; Utility domains; Data processing; Systems of record; Data lake (storage & metadata); Self-service* (shown twice)]
Data mesh
Data steward: Data governance council
[Diagram labels: Producer domains; Consumer domains; Utility domains; Data processing; Systems of record; Data lake (storage & metadata); Enhanced governance; Data governance* and the data interfaces; Self-service* (shown twice)]
Data mesh
Self-service, modern cloud data platform – delivers reliable, secure, near real-time customer 360 data
[Diagram labels: Producer domains; Consumer domains; Utility domains; Notebook; Machine learning, data tooling, and experimentation; Processing (streaming, batch); Data processing (streaming & batch); Systems of record; 1st party; 3rd party; Services; Products; Data ingress; S3; Data lake (storage & metadata); Data products; Data egress; Business insights; Enhanced governance; Data governance* and the data interfaces; Self-service* (shown twice)]
Data mesh
Self-service, modern cloud data platform – delivers reliable, secure, near real-time customer 360 data
[Diagram labels: Utility domains; Processing (streaming, batch); Data processing (streaming & batch); Systems of Record; 1st party; 3rd party; Services; Products; Data ingress; S3; Data lake (storage & metadata); Data products; Data egress; Business insights; Enhanced governance; Data governance* and the data interfaces; Self-service* (shown twice). Numbered callouts that survive in the extracted text: 1, 2, 4, 5]
Data mesh
2: Producers register data in catalog
[Diagram labels: Producer domains; 1st party; 3rd party; Services; Products; Data ingress; DOMAIN*; Self-service*]
Data mesh
Self-service, modern cloud data platform – delivers reliable, secure, near real-time customer 360 data
[Diagram labels: Utility domains; Notebook; Machine learning, data tooling, and experimentation; Data processing (streaming & batch); Inferencing*; request access table (DB-API)]
Data mesh
[Diagram labels: Enhanced governance; Data governance* and the data interfaces]
Data mesh
5: Consumers use shared data
[Diagram labels: Consumer accounts; Data products; DOMAIN*; Data egress; Business insights; Self-service*]
GoDaddy conceptual/domain architecture
Business outcomes
Thank you!