
SQL Azure: Database-as-a-Service

What, how and why Cloud is different

Nigel Ellis <nigele@microsoft.com>


July 2010
Talk Outline
• Database-as-a-Service
• SQL Azure
– Overview
– Deployment and Monitoring
– High availability
– Scalability
• Lessons and insight

SQL Azure Database as a Service
• On-demand provisioning of SQL databases
• Familiar relational programming model
– Leverage existing skills and tools
• SLA for availability and performance
• Pay-as-you-go pricing model
• Full control over logical database administration
– No physical database administration headaches
• Large geo-presence
– 3 regions (US, Europe, Asia), each with 2 sub-regions

Challenges And Our Approach
• Challenges
– Scale – storage, processing, and delivery
– Consistency – transactions, replication, failures, HA
– Manageability – deployment and self-management
• Our approach
– SQL Server technology as node storage
– Distributed fabric for self-healing and scale
– Automated deployment and provisioning (low OpEx)
– Commodity hardware for reduced CapEx
– Software to achieve required reliability
SQL Azure model

• Account
– Each account has zero or more servers
– Azure-wide, provisioned in a common portal
– Billing instrument
• Server
– Each server has one or more databases
– Zone for authentication: userId + password
– Zone for administration and billing (metadata about the databases and usage)
– Network access control based on client IP
– Has a unique DNS name; unit of geo-location
• Database
– Each database has standard SQL objects
– Unit of consistency and high availability (autonomous replication)
– Contains Users, Tables, Views, Indices, etc.
– Most granular unit of usage reports
– Three SKUs available (1 GB, 10 GB and 50 GB)
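To make the model concrete, here is a minimal provisioning sketch in Python with pyodbc, assuming an existing server; the server name and credentials are placeholders, and the exact MAXSIZE option syntax may vary by service version.

```python
# A provisioning sketch, assuming pyodbc and an existing SQL Azure
# server; "myserver" and the credentials are placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={SQL Server Native Client 10.0};"
    "SERVER=tcp:myserver.database.windows.net,1433;"
    "DATABASE=master;"                # provisioning runs against master
    "UID=admin@myserver;PWD=secret;"
    "Encrypt=yes;",
    autocommit=True,                  # CREATE DATABASE cannot run in a transaction
)
# MAXSIZE selects one of the three SKUs (1 GB, 10 GB, 50 GB)
conn.cursor().execute("CREATE DATABASE inventory (MAXSIZE = 10 GB)")
```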
ARCHITECTURE

Network Topology

• Applications use standard SQL client libraries: ODBC, ADO.Net, PHP, JDBC, …
• TDS (tcp) flows from the application, across the internet, into the Azure cloud
• At the security boundary, a load balancer (LB) forwards 'sticky' sessions to the TDS protocol tier
• Gateway: TDS protocol gateway; enforces AUTHN/AUTHZ policy; proxy to the SQL tier
• The gateways forward TDS (tcp) on to the SQL nodes
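As a sketch of the client path, any standard TDS client connects through the load balancer and gateway unchanged; the example below uses Python with pyodbc, with a placeholder server name and credentials. The user@server login convention lets the gateway authenticate and route the session.

```python
# A connectivity sketch, assuming pyodbc; server name and credentials
# are placeholders. The connection is plain TDS over tcp port 1433,
# terminated at the gateway tier, which proxies it to a SQL node.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={SQL Server Native Client 10.0};"
    "SERVER=tcp:myserver.database.windows.net,1433;"  # server's unique DNS name
    "DATABASE=mydb;"
    "UID=appuser@myserver;PWD=secret;"  # user@server login convention
    "Encrypt=yes;"                      # traffic crosses the public internet
)
print(conn.cursor().execute("SELECT @@VERSION").fetchone()[0])
```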
HIGH AVAILABILITY
Scalability and Availability: Fabric, Failover, Replication, and Load Balancing
Concepts

• Storage Unit
– Supports CRUD operations
– e.g. a DB row
• Consistency Unit (aka Rowgroup)
– Set of storage units
– Specified by the "application"
– Range partitioned or entire DB
– SQL Azure uses entire DB only; the infrastructure supports both
• Failover Unit (aka Partition)
– Unit of management
– Group of consistency units
– Determined by the system
– Can be split or merged at consistency unit boundaries
Data Consistency
• Each Failover Unit is replicated for HA
– The desired replica count is configurable; the actual count is dynamic at runtime
• Clients must see the same linearized order of read and write operations
• The replica set is dynamically reconfigured to account for member arrivals and departures
– Read-write quorums are supported and are dynamically adjusted
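A minimal sketch of the quorum-intersection property behind this, assuming majority-style quorums (the deck does not state the exact sizes SQL Azure uses): any read quorum must overlap any write quorum, so a reader always encounters at least one replica holding the latest commit.

```python
def quorums_intersect(replica_count: int, write_quorum: int, read_quorum: int) -> bool:
    """Reads see the latest commit iff every read quorum overlaps
    every write quorum: R + W > N."""
    return read_quorum + write_quorum > replica_count

# Three replicas with majority quorums satisfy the property:
assert quorums_intersect(3, 2, 2)

# If a member departs (N drops to 2), the quorums must be
# readjusted dynamically to preserve the intersection:
assert quorums_intersect(2, 2, 1)
```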
Replication
• All reads are completed at the primary
• Writes are replicated to a write quorum of replicas
• Commit on the secondaries first, then on the primary
• Each transaction has a commit sequence number (epoch, num)

(Diagram: the primary P forwards each write to the secondaries S and acknowledges the client once a quorum of acks has been collected.)
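A minimal sketch of the commit path described above, assuming a hypothetical `send_write` transport on each secondary; the real protocol is asynchronous and pipelined, but the quorum accounting is the essence.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CommitSequenceNumber:
    epoch: int  # bumped on each reconfiguration/primary failover
    num: int    # monotonically increasing within an epoch

def replicate_write(payload: bytes, secondaries, write_quorum: int,
                    csn: CommitSequenceNumber) -> bool:
    """Replicate one write from the primary and commit once a write
    quorum (the primary plus enough secondaries) has acknowledged."""
    acks = 1  # the primary counts toward the quorum
    for secondary in secondaries:
        if secondary.send_write(csn, payload):  # hypothetical transport; True on ack
            acks += 1
    # Secondaries harden the write before the primary commits,
    # matching the "commit on secondaries first" rule above.
    return acks >= write_quorum
```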
Reconfiguration
• Types of reconfiguration
– Primary failover
– Removing a failed secondary
– Adding a recovered replica
– Building a new secondary
• Assumes
– A failure detector
– Leader election
– Both services are provided by the Fabric layer
• Safe in the presence of cascading failures

(Diagram: on a primary failure, a secondary is promoted to primary; on a secondary failure, the failed replica is removed from the set.)
Partition Management
• Partition Manager (PM) is a highly available
service running in the Master cluster
– Ensures all partitions are operational
– Places replicas across failure domains
(rack/switch/server)
– Ensures all partitions have target replica count
– Balances the load across all the nodes
• Each node manages multiple partitions
• Global state maintained by the PM can be recreated from the local node state in the event of disaster (GPM rebuild)
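A minimal placement sketch for the PM's failure-domain rule, assuming node objects with `.rack` and `.load` attributes (hypothetical); the real Partition Manager also spreads replicas across switches and servers and rebalances load continuously.

```python
def place_replicas(nodes, target_count: int):
    """Choose nodes for one partition so that no two replicas share
    a rack (failure domain), preferring lightly loaded nodes."""
    chosen, used_racks = [], set()
    for node in sorted(nodes, key=lambda n: n.load):  # least-loaded first
        if node.rack in used_racks:
            continue
        chosen.append(node)
        used_racks.add(node.rack)
        if len(chosen) == target_count:
            return chosen
    raise RuntimeError("not enough failure domains for the target replica count")
```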
System in Operation

(Diagram: the Primary Master node runs SQL Server and the Partition Manager — with a load balancer, global partition map, and partition placement advisor — plus a leader elector, all on top of the Fabric; data nodes 100–105 each run the Fabric and host a mix of primary (P) and secondary (S) replicas.)
SQL node Architecture
• Single physical DB for the entire node
• DB files and log are shared across every logical database/partition
– Allows better logging throughput with sequential IO/group commits
– No auto-growth-on-demand stalls
– Uniform manageability and backup
• Each partition is a "silo" with its own independent schema
• Local SQL backup guards against software bugs

(Diagram: one machine runs a SQL instance with master, tempdb, and msdb alongside a CloudNode database that hosts the logical databases, e.g. DB1–DB7.)
Recap
• Two kinds of nodes:
– Data nodes store application data
– Master nodes store cluster metadata
• Node failures are reliably detected
– On every node, SQL and Fabric processes monitor
each other
– Fabric processes monitor each other across nodes
• Local failures cause nodes to fail-fast
• Failures cause reconfiguration and placement
changes

DEPLOYMENT

Hardware Architecture
• Each rack hosts 2 pods of 20 machines each
• Each pod has a TOR mini-switch
– 10 Gb uplink to the L2 switch
• Each SQL Azure machine runs on a commodity box, for example:
– 8 cores
– 32 GB RAM
– 1 TB+ SATA drives
– Programmable power
– 1 Gb NIC
• Machine spec changes as hardware (pricing) evolves
Hardware Challenges
• SATA drives
– On-disk caches and the lack of true "write through" result in Write-Ahead Logging violations
• The DB requires in-order writes to be honored
• The cache can be force-flushed, but that degrades performance
– Disk failures happen daily (at scale); fail fast on those
• Bit-flips (page checksums enabled)
• Drives just disappear
• IOs are misdirected
• Faulty NICs
– Encountered message corruption; enabled message signing and checksums
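To illustrate the fail-fast posture on bit-flips, here is a minimal page-checksum sketch using CRC32 as a stand-in; SQL Server's actual page checksum algorithm is internal to the engine.

```python
import zlib

PAGE_SIZE = 8192       # SQL Server data page size
CHECKSUM_BYTES = 4

def seal_page(body: bytes) -> bytes:
    """Prefix a page body with its CRC32 before it goes to disk."""
    return zlib.crc32(body).to_bytes(CHECKSUM_BYTES, "little") + body

def verify_page(page: bytes) -> bool:
    """On read-back, fail fast on a mismatch instead of serving data
    corrupted by a bit-flip or a misdirected IO."""
    stored = int.from_bytes(page[:CHECKSUM_BYTES], "little")
    return stored == zlib.crc32(page[CHECKSUM_BYTES:])

assert verify_page(seal_page(b"\x00" * (PAGE_SIZE - CHECKSUM_BYTES)))
```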
Software Deployment
• OS is automatically imaged via deployment
• All the services are set up using file copy
– Guarantees on which version is running
– Provides fast switch to new version
– Minimal global state allows running side by side
– Yes, that includes the SQL Server DB engine
• Rollout is monitored to ensure high availability
– Knowledge of replica state health ensures the SLA is met
– Two-phase rollouts for data or protocol changes
• Leverages internal Autopilot technologies with SQL
Azure extensions

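A minimal sketch of a two-phase rollout, with a hypothetical cluster API: phase one ships binaries that understand both the old and new formats; only when every node runs them is the new data or protocol format enabled.

```python
def two_phase_rollout(cluster, new_version: str) -> None:
    """Roll out a data/protocol change without breaking mixed-version
    operation (hypothetical cluster API)."""
    # Phase 1: file-copy the new binaries node by node; replica state
    # health gates each step so the availability SLA holds.
    for node in cluster.nodes():
        node.install(new_version)
        cluster.wait_until_healthy(node)
    # Phase 2: every node can now read the new format, so writers
    # may start emitting it.
    cluster.enable_format(new_version)
```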
Software Challenges
• Lack of real-time OS features
– CPU priority
• High priority for Fabric lease traffic
– Page Faults/GC
• Locked pages for SQL and Fabric (in managed code)
• Fail fast or not?
– Yes, for corruption/AV
– No, for other issues unless centrally controlled
• What really counts as failed?
– Some failures are non-deterministic or manifest as hangs
– Multiple protocols/channels mean partial failures, too

Monitoring
• Health model with repair actions
– Reboot → Re-deploy → Re-image (OS) → RMA cycle
• Additional monitoring for SQL tier
– Connect / network probes
– Memory leaks / hung worker processes
– Database corruption detection
– Trace and performance stats capture
• Sourced from regular SQL trace and support mechanisms
• Stored locally and pushed to a global, cluster-wide store
• The global cluster is used for service insight and problem tracking

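A minimal sketch of the repair escalation, with hypothetical node-control helpers: each action is tried in order, and a node that never comes back healthy is RMA'd.

```python
def repair(node) -> str:
    """Escalate Reboot → Re-deploy → Re-image (OS) → RMA until the
    node reports healthy (hypothetical node-control helpers)."""
    for action in ("reboot", "redeploy", "reimage_os"):
        getattr(node, action)()
        if node.is_healthy():
            return action
    node.rma()  # hardware is presumed bad; return to manufacturer
    return "rma"
```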
LESSONS LEARNED

How is Cloud Different?
Minor differences:

• Cheap hardware
– No SANs, no SCSI, no Infiniband
– Iffy routers, network cards
– Relatively homogeneous
– Hardware not selected for the purpose

• Lots of it
– Not one machine, not 10 machines – think 1000+

• Public internet
– High latencies, sometimes
– All over the world
– Scary people (untrusted) lurking in the shadows

How is Cloud Different?
Real differences:

• You are responsible for the whole thing
– No such thing as "can you send us a repro"
– No such thing as "it's a hardware problem" (it's us)
– No such thing as "it's a network issue" (it's us)
– No such thing as "it's a configuration issue" (it's us)
– No such thing as "It's not us, it's DNS" (it's us)
– No such thing as "It's not us, it's AD" (it's us)

• User expectations: it's a utility!
– Utility of databases, not instances or servers
– Highly available (means "it's there", not "replication has been enabled")
– Elastic (you need more, you can have it right away)
– Load-balanced (automatically)
– And yet: symmetric ("give me cursors or give me death")
Design for Failure
• Common mistake #1: Failures can be eliminated
– Everybody fails! Hardware, software, universe
• Common mistake #2: All failures can be detected
– No watchdog is fast enough or good enough
• Common mistake #3: Failures can be enumerated
– Cannot deal with issues one at a time
– Must take a holistic, statistical approach
– Learn only as much as you need to take action
• Common mistake #4: Failures can be dealt with independently
– Local observation generates insufficient insight and leads to global disasters
Design for Failure

(Diagram: a feedback loop between local and centralized components — observe and detect, collect context, and send complaints locally; aggregate complaints and make decisions centrally; then commit the decisions and implement them back at the nodes — see the sketch below.)
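A minimal sketch of the centralized half of that loop: complaints arrive with local context, and the central service acts only on aggregated evidence, since a single local observation is insufficient insight. The names and the evidence threshold are illustrative.

```python
from collections import defaultdict

def aggregate(complaints):
    """Group (node_id, failure, context) complaints by target node."""
    grouped = defaultdict(list)
    for node_id, failure, context in complaints:
        grouped[node_id].append((failure, context))
    return grouped

def make_decisions(complaints, evidence_threshold: int = 2):
    """Decide centrally, and only on corroborated complaints; a lone
    local observation is not enough to act on."""
    return [(node_id, "reconfigure")
            for node_id, evidence in aggregate(complaints).items()
            if len(evidence) >= evidence_threshold]
```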
Design for Mediocre
• Network is not fast or slow, it varies
– Design for huge latency variance
– Machine independence is key
• Machines are not up or down, they are kind of slow
– Measure; it's never black-and-white
• People are not good or fired, they all make mistakes
– Tools and processes to minimize risk
• Environment is often iffy
– Integrated security? Not so fast…
• It's less important to succeed than to know the difference
Design for (appropriate) Simplicity
• There's no such thing as a "repro"
– Everything must be debuggable from logs (and dumps)
– This is much harder than it sounds – it takes time to log the right stuff
• System state must be externally examinable
– Not locked in internal data structures
• Fail-fast
– Is great! It is very hard to reason about partial failures, so we kill fast.
– Is awful! Cascading failures can kill the entire system if you are not careful.
– Principle: if you are sure it's local, kill it; if not, not so fast
• 'No workflows' is best
– Machine independence is a virtue
– Things that can safely be local should be
• Single-level workflows are next (reduce the number of moving parts)
– Resumable (not tied to a specific machine)
– Design with failure as the norm using distributed (persisted) state machines
Design for many
• Many machines is great!
– Reduces focus on machine reliability
• By the time an RDBMS runs recovery, the world has moved on
– Distribution enables load-balancing
• Focus on elasticity and flexibility
– HA with 100 machines is better than with 2
• Load distribution, parallelism of copy
• Many machines is hard!
– Elasticity needs to be built in
– All operations must be multi-machine
– Correlated failures are a fact of life
Design for multi-tenancy
• Customers like using many machines
– Enables load-balancing and elasticity
– But they don't like paying for many machines
• Solution: multi-tenancy!
– Everyone gets many slices
• Hard!
– Isolation for security and performance
– Many small databases? Costs…
– Many relationships (replication)
– Tradeoffs: isolation vs. elasticity?
Local vs. Global
Balance between local and global is key!

• "Normal case" decisions must be local
– Any global state (e.g. routes) must be cached
– The fewer parties involved, the better
– Otherwise: bottlenecks, single points of failure
• "Special case" decisions must be global
– How to react to an error?
– When to fail over?
– When and where to balance load?
– When and how to upgrade software?
– Otherwise: instability, chaos, low availability
• Data must be where it is needed
– Global data needed for local operations must be cached locally
– Local data needed for management must be aggregated globally
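A minimal sketch of caching global state for the normal case, assuming a hypothetical `fetch_route` callable that consults the global partition map: lookups stay purely local until an entry expires, keeping the master off the hot path.

```python
import time

class RouteCache:
    """Local cache of global routing state (partition -> node address)."""

    def __init__(self, fetch_route, ttl_seconds: float = 30.0):
        self._fetch = fetch_route   # hypothetical: partition_id -> address
        self._ttl = ttl_seconds
        self._cache = {}            # partition_id -> (address, fetched_at)

    def lookup(self, partition_id):
        hit = self._cache.get(partition_id)
        if hit and time.monotonic() - hit[1] < self._ttl:
            return hit[0]                     # normal case: purely local
        address = self._fetch(partition_id)   # special case: go global
        self._cache[partition_id] = (address, time.monotonic())
        return address
```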
Real Symmetry is End-to-End
• Symmetry is not just about surface area
– Too much focus on features
• It's not symmetric:
– If the syntax is the same but it works in subtly different ways
– If my connection drops too often
– If the latency causes me to put everything in SPs
– If operations unpredictably take 10x as long sometimes
• Customers want clarity, predictability, and a minimal learning curve
Summary
• Cloud is different
– Not a different place to host code
• Opportunities are great
– Customers want a utility approach to storage
– New businesses and abilities in scale, availability, etc.
• But the price must be paid
– Which is a good thing, otherwise everyone would be doing it!
Future Work and Challenges
• Performance SLAs
– Delivering on “guaranteed capacity” while consolidating diverse
workloads is hard
• Privacy, Governance and Compliance
– Perceptions and realities
– Private Cloud appliances
• Programming Models
– Support for loosely coupled scaleout patterns such as sharding
– Transparent multi-node scaleout
• Data Redundancy
– Point in time restore (backup knobs)
– Geo-availability for multiple points of presence
• Health Model for Applications
– Data tier is only part of the problem – support for hosting N-tier
apps and providing insight into health and performance

QUESTIONS?

SQL Azure Links
• SQL Azure
http://www.microsoft.com/windowsazure/sqlazure/
• SQL Azure "under the hood"
http://www.microsoftpdc.com/sessions/tags/sqlazure
• SQL Azure Fabric
http://channel9.msdn.com/pdc2008/BB03/
