Postgres-XC Tutorial


PGCon 2012 Tutorial

Configuring Write-Scalable PostgreSQL Cluster
Postgres-XC Primer and More
by Koichi Suzuki, Michael Paquier, Ashutosh Bapat
May 16th, 2012

Agenda

A. Postgres-XC introduction
B. Cluster design, navigation and configuration
C. Distributing data effectively
D. Backup and restore, high availability
E. Postgres-XC as a community

May 16th, 2012

Postgres-XC

Chapter A
Postgres-XC Introduction

A-1
What Postgres-XC is and what it is not

Summary (1)

PostgreSQL-based database cluster
  Binary-compatible applications
    Many core extensions
  Catches up with the latest PostgreSQL version
    At present based upon PG 9.1. Soon will be upgraded to PG 9.2.
Symmetric cluster
  No master, no slave
    Not just PostgreSQL replication.
    Application can read/write to any server
  Consistent database view to all the transactions
    Complete ACID property to all the transactions in the cluster
  Scales both for write and read

Summary (2)

Not just a replication
  Configured to provide parallel transaction/statement handling.
  HA configuration needs separate setups (explained later)

Symmetric Cluster (1)
PostgreSQL replication

Symmetric Cluster (2)
Postgres-XC Cluster

Read/Write Scalability
DBT-1 throughput scalability

Present Status

Project/Developer site
  http://postgres-xc.sourceforge.net/
  http://sourceforge.net/projects/postgres-xc/
Now Version 1.0 available
  Base PostgreSQL version: 9.1
    Promptly merged with PostgreSQL 9.2 when available
  64-bit Linux on Intel x86_64 architecture
    Tested on CentOS 5.8 and Ubuntu 10.04

How to achieve R/W scalability

Table distribution and replication
  Each table can be distributed or replicated
    Not a simple database replication
  Well suited to distribute a star-schema structured database
    Transaction tables -> Distributed
    Master tables -> Replicated
  Join pushdown
  Where-clause pushdown
  Parallel aggregates

DBT-1 Example

Distributed with Customer ID
  CUSTOMER: C_ID, C_UNAME, C_PASSWD, C_FNAME, C_LNAME, C_ADDR_ID, C_PHONE, C_EMAIL, C_SINCE, C_LAST_VISIT, C_LOGIN, C_EXPIRATION, C_DISCOUNT, C_BALANCE, C_YTD_PMT, C_BIRTHDATE, C_DATA
  ADDRESS: ADDR_ID, ADDR_STREET1, ADDR_STREET2, ADDR_CITY, ADDR_STATE, ADDR_ZIP, ADDR_CO_ID, ADDR_C_ID
  ORDERS: O_ID, O_C_ID, O_DATE, O_SUB_TOTAL, O_TAX, O_TOTAL, O_SHIP_TYPE, O_BILL_ADDR_ID, O_SHIP_ADDR_ID, O_STATUS
  ORDER_LINE: OL_ID, OL_O_ID, OL_I_ID, OL_QTY, OL_DISCOUNT, OL_COMMENTS, OL_C_ID
  CC_XACTS: CX_I_ID, CX_TYPE, CX_NUM, CX_NAME, CX_EXPIRY, CX_AUTH_ID, CX_XACT_AMT, CX_XACT_DATE, CX_CO_ID, CX_C_ID

Distributed with Shopping Cart ID
  SHOPPING_CART: SC_ID, SC_C_ID, SC_DATE, SC_SUB_TOTAL, SC_TAX, SC_SHIPPING_COST, SC_TOTAL, SC_C_FNAME, SC_C_LNAME, SC_C_DISCOUNT
  SHOPPING_CART_LINE: SCL_SC_ID, SCL_I_ID, SCL_QTY, SCL_COST, SCL_SRP, SCL_TITLE, SCL_BACKING, SCL_C_ID

Distributed with Item ID
  STOCK: ST_I_ID, ST_STOCK

Replicated
  ITEM: I_ID, I_TITLE, I_A_ID, I_PUB_DATE, I_PUBLISHER, I_SUBJECT, I_DESC, I_RELATED1, I_RELATED2, I_RELATED3, I_RELATED4, I_RELATED5, I_THUMBNAIL, I_IMAGE, I_SRP, I_COST, I_AVAIL, I_ISBN, I_PAGE, I_BACKING, I_DIMENSIONS
  AUTHOR: OL_ID, OL_O_ID, OL_I_ID, OL_QTY, OL_DISCOUNT, OL_COMMENTS, OL_C_ID
  COUNTRY: CO_ID, CO_NAME, CO_EXCHANGE, CO_CURRENCY

Major Difference from PostgreSQL

Table distribution/replication consideration
  CREATE TABLE tab (...) DISTRIBUTE BY HASH(col) | MODULO(col) | REPLICATION
Configuration
  Purpose of this tutorial
Some missing features
  WHERE CURRENT OF
  Trigger
  Savepoint ...

A-2
Postgres-XC Components

Summary

Coordinator
  Connection point from Apps
  SQL analysis and global planning
  Global SQL execution
Datanode (or simply NODE)
  Actual database store
  Local SQL execution
Coordinator and Datanode
  Postgres-XC kernel, based upon vanilla PostgreSQL
  Share the binary
  Recommended to configure as a pair in OLTP Apps.
GTM (Global Transaction Manager)
  Provides consistent database view to transactions
    GXID (Global Transaction ID)
    Snapshot (list of active transactions)
    Other global values such as SEQUENCE
  GTM Proxy, integrates server-local transaction requirement for performance
  Different binaries

How components work
(HA components excluded)

[Figure: Apps connect to any one of the coordinators. Each server machine (or VM) hosts a Coordinator with its global catalog, a Datanode with its local data, and a GTM Proxy. One GTM is shown for the whole cluster.]

Difference from vanilla PostgreSQL

More than one Postgres-XC kernel (almost PostgreSQL kernel)
GTM and GTM-Proxy
May connect to any one of the coordinators
  Provide single database view
  Full-fledged transaction ACID property
    Some restrictions in SSI (serializable)

How single database view is provided

Pick up vanilla PostgreSQL MVCC mechanism
  Transaction ID (transaction timestamp)
  Snapshot (list of active transactions)
  CLOG (whether given transaction is committed)
Made the former two global -> GTM
  CLOG is still locally maintained by coordinators and datanodes
  Every coordinator/datanode shares the same snapshot at any given time
2PC is used for transactions spanning over multiple coordinators and/or datanodes
  Has some performance penalty and may be improved in the future
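The visibility rule implied by the slide above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: `ToyGTM`, `Snapshot` and `is_visible` are hypothetical names made up for illustration, and the real GTM protocol and PostgreSQL visibility checks are considerably more involved. The point it tries to show is that GXIDs and snapshots come from one global source, while the commit status (CLOG) is looked up locally on each node.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    """Global snapshot as handed out by the (toy) GTM."""
    xmax: int            # first GXID not yet assigned when the snapshot was taken
    active: frozenset    # GXIDs still running when the snapshot was taken


@dataclass
class ToyGTM:
    """Hypothetical stand-in for the GTM: issues GXIDs and snapshots."""
    next_gxid: int = 1
    running: set = field(default_factory=set)

    def begin(self) -> int:
        gxid = self.next_gxid
        self.next_gxid += 1
        self.running.add(gxid)
        return gxid

    def end(self, gxid: int) -> None:
        self.running.discard(gxid)

    def snapshot(self) -> Snapshot:
        return Snapshot(xmax=self.next_gxid, active=frozenset(self.running))


def is_visible(row_gxid: int, snap: Snapshot, local_clog: dict) -> bool:
    """Visible only if the writer finished before the snapshot and committed."""
    if row_gxid >= snap.xmax or row_gxid in snap.active:
        return False
    # CLOG stays local to each coordinator/datanode.
    return local_clog.get(row_gxid) == "committed"


gtm = ToyGTM()
t1, t2 = gtm.begin(), gtm.begin()
gtm.end(t1)                        # t1 finishes, t2 is still running
snap = gtm.snapshot()              # the same snapshot is used on every node
clog_dn1 = {t1: "committed"}       # local CLOG of a first datanode
clog_dn2 = {t1: "committed"}       # local CLOG of a second datanode
print(is_visible(t1, snap, clog_dn1), is_visible(t2, snap, clog_dn2))
```

Because every node evaluates rows against the same snapshot, a row written by `t2` stays invisible on all nodes even if `t2` commits right after the snapshot was taken.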

How each component works

Coordinator (global catalog)
  Acts as just PostgreSQL
  Determines which datanode(s) to go
  Distributed query planning
  Provides GXID and snapshot to datanode(s)
Datanode (local data)
  Handles local statement(s) from coordinators
  More like just single PostgreSQL
GTM Proxy
  Reduces GTM interaction by grouping GTM request/response
  Takes care of connection/reconnection to GTM
GTM
  Provides global transaction ID and snapshot
  Provides sequence

Additional info for configuration

Both coordinator and datanode can have their own backups using PostgreSQL log-shipping replication.
GTM can be configured with a standby, a live backup
  Explained later

Chapter B
Cluster design, navigation and configuration

Friendly approach to XC

B-1
Design of cluster

About servers and applications

General advice

Avoid data materialization on Coordinator

YES!

    postgres=# explain (costs false) select * from aa, bb where aa.a = bb.a;
                    QUERY PLAN
     Data Node Scan on "__REMOTE_FQS_QUERY__"
       Node/s: dn1, dn2
    (2 rows)

NO!

    postgres=# explain (costs false) select * from aa, bb where aa.a = bb.a;
                    QUERY PLAN
     Nested Loop
       Join Filter: (aa.a = bb.a)
       ->  Data Node Scan on aa
             Node/s: dn1, dn2
       ->  Data Node Scan on bb
             Node/s: dn1
    (6 rows)

High availability design

Streaming replication on critical nodes
  Nodes having unique data
  Do not care about unlogged tables, for example
  [Figure: Dn master -> log shipping -> Dn slave]
GTM-Standby
  GTM is SPOF
  Need to fall back to a standby on failure
  [Figure: GTM -> status backup -> GTM-Standby]

Online transaction processing (OLTP) applications

Short read-write transactions
Coordinator/Datanode CPU on same machine: 30/70
1/3 ratio on separate servers/VMs
Master table: replicated table used for joins
  Warehouse table of DBT-2
[Figure: Clients -> Co1 -> Dn1, Dn2, Dn3]

Analytic applications

Long read-only transactions
1 Co / 1 Dn on same server/VM
Maximize local joins with preferred node
[Figure: Clients -> Co1/Dn1, Co2/Dn2, ..., CoN/DnN]

B-2
Code and binaries

Code navigation and deployment

Structure of code 1

Code navigation:
  Use of flags #ifdef PGXC .. #endif
GTM
  Folder: src/gtm/
  Contains GTM and GTM-Proxy code
  Postgres-side, GTM-side and clients
Node location management
  Folder: src/backend/pgxc/locator
  Node hashing calculation and determination of executing node list

Structure of code 2

Pooler
  Folder: src/backend/pgxc/pool/
  Pooler process, postgres-side management
Node manager
  Folder: src/backend/pgxc/nodemgr/
  Node and node group catalogs. Local node management
Planner
  Folder: src/backend/pgxc/plan/
  Fast-query shipping code
Documentation
  Folder: doc-xc/

Compilation and deployment

Same as vanilla PostgreSQL
  ./configure --prefix...
  make html/man/world and make install
Deployment methods:
  Install core binaries/packages on all the servers/VMs involved
  Or... compile once and copy binaries to all the servers/VMs
Configurator, automatic deployment through cluster
  Written in Ruby
  YAML configuration file
  Not supported since 0.9.4 :(

B-3
Everything about configuration

Settings and options

Initialization

Initialization of GTM: creation of gtm.conf
  Mandatory options:
    Data folder
    GTM or GTM-Proxy?
  Example: initgtm -Z gtm -D $DATA_FOLDER
Initialization of a node
  Mandatory options:
    Node name => --nodename
    Data folder => -D
  Example: initdb --nodename mynode -D $DATA_FOLDER

Configuration parameters

Basics are same as vanilla Postgres
Extra configuration for all the nodes
  GTM connection parameters: gtm_host/gtm_port
  Node name: pgxc_node_name for self identification
Coordinator only
  Pooler parameters: pooler_port, min_pool_size, max_pool_size
  persistent_datanode_connections: connections taken for a session are not sent back to the pool

Parameters exclusive to XC

enforce_two_phase_commit, default = on
  Control of autocommit temporary objects
  Turn to off to create temporary objects
max_coordinators, default 16: max Coordinators usable
max_datanodes, default 16: max Datanodes usable

Node start-up

Coordinator/Datanode
  Same options as vanilla Postgres
  Except... mandatory to choose if node starts up as a Coordinator (-C) or a Datanode (-X)
  Possible to set with pg_ctl -Z coordinator/datanode
GTM
  gtm -D $DATA_FOLDER
  gtm_ctl start -D $DATA_FOLDER -S gtm

Commands: cluster management

CREATE/ALTER/DROP NODE
CREATE NODE GROUP
System functions
  pgxc_pool_check()
  pgxc_pool_reload()

Demonstration

[Figure: psql client connecting to the demonstration cluster]
  GTM: port 6666
  Coordinator 1: port 5432
  Datanode 1: port 15432
  Datanode 2: port 15433

B-4
Data management and cluster-related commands

All the core mechanisms to manage and... check your cluster

CREATE TABLE extensions

Control of table distribution
  DISTRIBUTE BY
    REPLICATION
    HASH(column)
    ROUND ROBIN
Data repartition
  TO NODE node1, nodeN
  TO GROUP nodegroup

Postgres-XC commands: maintenance

EXECUTE DIRECT
  Connect directly to a node
  SELECT queries only
  Can be used to check connection to a remote node
  Local maintenance
CLEAN CONNECTION
  Drop connections in pool for given database or user

Point-in-time recovery (PITR)

CREATE BARRIER
  Addition of a barrier ID in Xlogs, consistent in cluster
  Block 2PC transactions to have a consistent transaction status
recovery.conf: recovery_target_barrier
  Specify a barrier ID in recovery.conf to recover a node up to a given barrier point

Cluster catalogs

pgxc_node: information of remote nodes
  Connection info: host/port
  Node type
  Node name
pgxc_group: node group information
pgxc_class: table distribution information
  Table Oid
  Distribution type, distribution key
  Node list where table data is distributed

Chapter C
Distributing data effectively

Distribution strategies
Choosing distribution strategy
Transaction management

Distributing the data

Replicated table
  Each row in the table is replicated to the datanodes
  Statement-based replication
Distributed table
  Each row of the table is stored on one datanode, decided by one of the following strategies
    Hash
    Round Robin
    Modulo
    Range and user-defined function: TBD
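How the strategies above map a row to a datanode can be illustrated with a short Python sketch. The function names are hypothetical, and `zlib.crc32` is only a stand-in hash chosen for the example; it is an assumption, not the hash function Postgres-XC actually uses.

```python
import zlib
from itertools import count


def hash_node(value, nodes):
    """HASH(col): hash the distribution value, then map it onto a node."""
    key = str(value).encode()
    return nodes[zlib.crc32(key) % len(nodes)]   # stand-in hash, not XC's


def modulo_node(value, nodes):
    """MODULO(col): integer distribution value modulo the number of nodes."""
    return nodes[value % len(nodes)]


def make_round_robin(nodes):
    """ROUND ROBIN: each new row goes to the next node in turn."""
    counter = count()

    def pick():
        return nodes[next(counter) % len(nodes)]

    return pick


def replicated_nodes(nodes):
    """REPLICATION: every row is written to every node."""
    return list(nodes)


nodes = ["datanode_1", "datanode_2", "datanode_3"]
next_node = make_round_robin(nodes)
print(modulo_node(7, nodes))                  # 7 % 3 == 1
print([next_node() for _ in range(4)])        # cycles through the node list
print(hash_node(42, nodes) == hash_node(42, nodes))
```

With hash and modulo, the target node is a pure function of the distribution column value, which is what lets a coordinator route a point read or write to a single datanode. Round robin balances writes but gives the coordinator no way to locate a row from its values.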

Replicated Tables

[Figure: writes go to every datanode; a read goes to one datanode. Every datanode holds the same rows (val, val2): (1, 2), (2, 10), (3, 4).]

Replicated Tables

Statement-level replication
Each write needs to be replicated
  Writes are costly
Read can happen on any node (where table is replicated)
  Reads are faster, since reads from different coordinators can be routed to different nodes
Useful for relatively static tables, with high read load

Querying replicated tables

Example: simple SELECT on replicated table

    CREATE TABLE tab1 (val int, val2 int)
      DISTRIBUTE BY REPLICATION TO NODE datanode_1, datanode_2;

    EXPLAIN VERBOSE SELECT * FROM tab1 WHERE val2 = 5;
                    QUERY PLAN
     Result
       Output: val, val2
       ->  Data Node Scan on tab1
             Output: val, val2
             Node/s: datanode_1
             Remote query: SELECT val, val2 FROM ONLY tab1 WHERE (val2 = 5)

  Data Node Scan on tab1: queries the datanode/s
  Node/s: datanode_1: one node out of two is chosen

Replicated tables: multi-row operations

Example: aggregation on replicated table

    EXPLAIN VERBOSE SELECT sum(val) FROM tab1 GROUP BY val2;
                    QUERY PLAN
     HashAggregate
       Output: sum(val), val2
       ->  Data Node Scan on tab1
             Output: val, val2
             Node/s: datanode_1
             Remote query: SELECT val, val2 FROM ONLY tab1 WHERE true

  HashAggregate: groups rows on coordinator, N(groups) < N(rows)
  Data Node Scan on tab1: brings all the rows from one datanode to coordinator.

Replicated tables: multi-row operation

Pushing aggregates to the datanodes for better performance

    EXPLAIN VERBOSE SELECT sum(val) FROM tab1 GROUP BY val2;
                    QUERY PLAN
     Data Node Scan on "__REMOTE_FQS_QUERY__"
       Output: sum(tab1.val), tab1.val2
       Node/s: datanode_1
       Remote query: SELECT sum(val) AS sum FROM tab1 GROUP BY val2

  Data Node Scan: get grouped and aggregated results from datanodes

Similarly push DISTINCT, ORDER BY etc.

Distributed Tables

[Figure: a write goes to one datanode; reads go to each datanode and a combiner merges the results. Each datanode holds different rows (val, val2): (1, 2), (2, 10), (3, 4) on the first; (11, 21), (21, 101), (31, 41) on the second; (10, 20), (20, 100), (30, 40) on the third.]

Distributed Table

Write to a single row is applied only on the node where the row resides
  Multiple rows can be written in parallel
Scanning rows spanning across the nodes (e.g. table scans) can hamper performance
Point reads and writes based on the distribution column value show good performance
  Datanode where the operation happens can be identified by the distribution column value

Querying distributed table

Example: simple SELECT on distributed table

    CREATE TABLE tab1 (val int, val2 int)
      DISTRIBUTE BY HASH(val)
      TO NODE datanode_1, datanode_2, datanode_3;   -- distributed table

    EXPLAIN VERBOSE SELECT * FROM tab1 WHERE val2 = 5;
                    QUERY PLAN
     Data Node Scan on "__REMOTE_FQS_QUERY__"
       Output: tab1.val, tab1.val2
       Node/s: datanode_1, datanode_2, datanode_3
       Remote query: SELECT val, val2 FROM tab1 WHERE (val2 = 5)

  Data Node Scan: gathers rows from the nodes where table is distributed

Distributed tables: multi-row operations (1)

Example: aggregation on distributed table

    EXPLAIN VERBOSE SELECT sum(val) FROM tab1 GROUP BY val2;
                    QUERY PLAN
     HashAggregate
       Output: sum(val), val2
       ->  Data Node Scan on tab1
             Output: val, val2
             Node/s: datanode_1, datanode_2, datanode_3
             Remote query: SELECT val, val2 FROM ONLY tab1 WHERE true

  HashAggregate: groups rows on coordinator, N(groups) < N(rows)
  Data Node Scan on tab1: brings all the rows from the datanodes to coordinator.

Distributed tables: multi-row operation (2)

Example: aggregation on distributed table

    EXPLAIN VERBOSE SELECT sum(val) FROM tab1 GROUP BY val2;
                    QUERY PLAN
     HashAggregate
       Output: pg_catalog.sum((sum(tab1.val))), tab1.val2
       ->  Data Node Scan on "__REMOTE_GROUP_QUERY__"
             Output: sum(tab1.val), tab1.val2
             Node/s: datanode_1, datanode_2, datanode_3
             Remote query: SELECT sum(group_1.val), group_1.val2
                           FROM (SELECT val, val2 FROM ONLY tab1
                                 WHERE true) group_1 GROUP BY 2

  HashAggregate: finalise the grouping and aggregation at coordinator
  Data Node Scan: get partially grouped and aggregated results from datanodes
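The two-step plan above (partial sums on each datanode, final sum on the coordinator) can be mimicked with a small Python sketch. The row data and the function names are hypothetical and serve only to show why summing the per-node partial sums gives the same answer as aggregating all rows in one place.

```python
from collections import defaultdict


def partial_sums(rows):
    """What each datanode computes: SELECT sum(val), val2 ... GROUP BY val2."""
    sums = defaultdict(int)
    for val, val2 in rows:
        sums[val2] += val
    return dict(sums)


def combine(partials):
    """What the coordinator computes: sum of the partial sums per group."""
    total = defaultdict(int)
    for partial in partials:
        for group, subtotal in partial.items():
            total[group] += subtotal
    return dict(total)


# Hypothetical (val, val2) rows of tab1, spread over three datanodes.
datanodes = {
    "datanode_1": [(1, 2), (2, 10), (3, 4)],
    "datanode_2": [(11, 2), (21, 10), (31, 4)],
    "datanode_3": [(10, 2), (20, 4), (30, 4)],
}

per_node = [partial_sums(rows) for rows in datanodes.values()]
result = combine(per_node)
print(result)
```

Each datanode ships at most one row per group instead of every base row, which is where the saving over the plan in part (1) comes from. The same decomposition works for `count`, `min` and `max`; an average needs the partial sum and the partial count to be shipped separately.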

JOINs

Example: join on distribution key

    EXPLAIN VERBOSE SELECT * FROM tab1, tab2 WHERE tab1.val = tab2.val;
                    QUERY PLAN
     Nested Loop
       Output: tab1.val, tab1.val2, tab2.val, tab2.val2
       Join Filter: (tab1.val = tab2.val)
       ->  Data Node Scan on tab1
             Output: tab1.val, tab1.val2
             Remote query: SELECT val, val2 FROM ONLY tab1 WHERE true
       ->  Data Node Scan on tab2
             Output: tab2.val, tab2.val2
             Remote query: SELECT val, val2 FROM ONLY tab2 WHERE true

  Nested Loop: perform JOIN on coordinator.
  Data Node Scans: queries to datanodes to fetch the rows from tab1 and tab2

JOINs

Performing JOIN on coordinator won't be efficient if
  Number of rows selected << size of cartesian product of the relations
  The size of the JOIN result tuple is not large compared to the rows from the joining relations
It will be efficient if
  Number of rows selected ~= size of cartesian product of the relations
  The size of JOIN result tuple is very large compared to the individual row sizes

Performing JOINs on datanodes

Example: join on distribution key

    EXPLAIN VERBOSE SELECT * FROM tab1, tab2 WHERE tab1.val = tab2.val;
                    QUERY PLAN
     Data Node Scan on "__REMOTE_FQS_QUERY__"
       Output: tab1.val, tab1.val2, tab2.val, tab2.val2
       Node/s: datanode_1
       Remote query: SELECT tab1.val, tab1.val2, tab2.val, tab2.val2
                     FROM tab1, tab2 WHERE (tab1.val = tab2.val)

  Data Node Scan: JOIN performed on datanode

Performing JOINs on datanodes

Indexes can help to perform JOIN faster
  Indexes available only on datanode
Aggregates, grouping, sorting can as well be performed on datanode
Always perform JOINs on the datanode
  In XC, coordinators do not have correct statistics, so can't predict the selectivity
  Proper costing model is yet to be implemented

Shippability of Joins

                          Hash/Modulo distributed   Round Robin   Replicated
Hash/Modulo distributed   (1)                       No            (2)
Round Robin               No                        No            (2)
Replicated                (2)                       (2)           All kinds of joins

(1) Inner join with equality condition on the distribution column, with same data type and same distribution strategy
(2) Inner join if replicated table's distribution list is a superset of distributed table's distribution list
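The shippability matrix can be written down as a small lookup function. This is a sketch of the rules as stated on the slide, not of the planner code; the constant names, the function name and its boolean parameters are hypothetical.

```python
HASH, RROBIN, REPL = "hash/modulo", "round robin", "replicated"


def join_shippable(left, right, *, inner=True, equi_on_dist_cols=False,
                   same_type_and_strategy=False, repl_covers_dist=False):
    """Return True if the slide's matrix says the join can run on datanodes.

    inner                  -- the join is an inner join
    equi_on_dist_cols      -- equality condition on both distribution columns
    same_type_and_strategy -- same data type and same distribution strategy
    repl_covers_dist       -- the replicated table's node list is a superset
                              of the distributed table's node list
    """
    kinds = {left, right}
    if kinds == {REPL}:
        return True                                   # all kinds of joins
    if REPL in kinds:
        return inner and repl_covers_dist             # replicated x distributed
    if kinds == {HASH}:
        return inner and equi_on_dist_cols and same_type_and_strategy
    return False                                      # round robin without a replicated side


print(join_shippable(HASH, HASH, equi_on_dist_cols=True,
                     same_type_and_strategy=True))
print(join_shippable(HASH, RROBIN, equi_on_dist_cols=True))
```

Reading the matrix this way makes the design advice concrete: replicating the small, frequently joined tables is what turns most of the "No" cells into shippable joins.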

Constraints

XC does not support global constraints, i.e. constraints across datanodes
Constraints within a datanode are supported

Distribution strategy: Replicated
  Unique, primary key constraints: Supported
  Foreign key constraints: Supported if the referenced table is also replicated on the same nodes
Distribution strategy: Hash/Modulo distributed
  Unique, primary key constraints: Supported if primary OR unique key is distribution key
  Foreign key constraints: Supported if the referenced table is replicated on same nodes OR it's distributed by primary key in the same manner and same nodes
Distribution strategy: Round Robin
  Unique, primary key constraints: Not supported
  Foreign key constraints: Supported if the referenced table is replicated on same nodes

Choosing distribution strategy (1)

Replication if
  The table is less frequently written to
  The table's primary key is referenced by many distributed tables
  The table needs to have a primary key/unique key which cannot be the distribution key
  The table is part of JOIN for many queries: makes it easier to push the JOIN to the datanode
  Data redundancy

Choosing distribution strategy (2)

Hash/Modulo distributed if
  There are high point-read/write loads
  The queries have equality conditions on the distribution key, such that the data comes from only one node (essentially it becomes equivalent to replicated table for that query)
Round Robin if
  There is no definite distribution key, but still want to balance the write load across the cluster

Example DBT-1 (1)

author, item
  Less frequently written
  Frequently read from
  Author and item are frequently JOINed
  Hence replicated on all nodes

Example DBT-1 (2)

customer, address, orders, order_line, cc_xacts
  Frequently written
    hence distributed
  Participate in JOINs amongst each other with customer_id as JOIN key, point SELECTs based on customer_id
    hence distributed by hash on customer_id so that JOINs are shippable
  Participate in JOINs with item
    Having item replicated helps pushing JOINs to datanode

Example DBT-1 (3)

Shopping_cart, shopping_cart_line
  Frequently written
    Hence distributed
  Point selects based on shopping_cart_id
    Hence distributed by hash on shopping_cart_id
  JOINs with item on item_id
    Having item replicated helps pushing JOINs to datanode

Example DBT-1 (4)

Distributed with Customer ID: CUSTOMER, ADDRESS, ORDERS, ORDER_LINE, CC_XACTS
Distributed with Shopping Cart ID: SHOPPING_CART, SHOPPING_CART_LINE
Distributed with Item ID: STOCK
Replicated: ITEM, AUTHOR, COUNTRY

DBT-1 scale-up

Old data, we will publish benchmarks for 1.0 soon.
DBT-1 (TPC-W) benchmark with some minor modification to the schema
1 server = 1 coordinator + 1 datanode on same machine
Coordinator is CPU bound
Datanode is I/O bound

Transaction management

2PC is used to guarantee transactional consistency across nodes
  When more than one node is involved OR
  When there are explicit 2PC transactions
Only those nodes where write activity has happened participate in 2PC
In PostgreSQL, 2PC cannot be applied if temporary tables are involved. The same restriction applies in Postgres-XC
When a single coordinator command needs multiple datanode commands, we encase those in a transaction block
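The commit rule on this slide (plain commit when at most one node was written, 2PC otherwise, and only over the nodes that were written) can be sketched as follows. `ToyNode` and `commit_transaction` are hypothetical names; the sketch records the SQL a coordinator would plausibly send (`PREPARE TRANSACTION`, `COMMIT PREPARED`, `ROLLBACK PREPARED` are standard PostgreSQL statements) and makes no claim about how the coordinator is actually implemented.

```python
class ToyNode:
    """Hypothetical datanode that only records the statements it receives."""

    def __init__(self, name, fail_prepare=False):
        self.name = name
        self.fail_prepare = fail_prepare
        self.state = "active"
        self.log = []

    def run(self, statement, new_state):
        self.log.append(statement)
        self.state = new_state

    def prepare(self, gid):
        if self.fail_prepare:
            raise RuntimeError(f"{self.name}: PREPARE failed")
        self.run(f"PREPARE TRANSACTION '{gid}'", "prepared")


def commit_transaction(gid, written_nodes):
    """Commit on the nodes that saw writes; use 2PC only if there are several."""
    if len(written_nodes) <= 1:
        for node in written_nodes:
            node.run("COMMIT", "committed")
        return "committed"

    prepared = []
    try:
        for node in written_nodes:          # phase 1: prepare everywhere
            node.prepare(gid)
            prepared.append(node)
    except RuntimeError:
        for node in written_nodes:          # undo: prepared or still active
            if node in prepared:
                node.run(f"ROLLBACK PREPARED '{gid}'", "aborted")
            else:
                node.run("ROLLBACK", "aborted")
        return "aborted"

    for node in written_nodes:              # phase 2: commit everywhere
        node.run(f"COMMIT PREPARED '{gid}'", "committed")
    return "committed"


dn1, dn2 = ToyNode("datanode1"), ToyNode("datanode2")
print(commit_transaction("T100", [dn1, dn2]), dn1.log)
```

If a node fails between the two phases, some nodes may be left in the prepared state, which is the situation the pgxc_clean utility mentioned later in the tutorial is meant to resolve.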

Chapter D
Backup, restore, recovery and high availability

Example configuration (1)

Coordinator x 2
  coord1
    -D /home/koichi/pgxc/nodes/coord1
    port: 20004
  coord2
    -D /home/koichi/pgxc/nodes/coord2
    port: 20005
Datanode x 2
  datanode1
    -D /home/koichi/pgxc/nodes/datanode1
    port: 20006
  datanode2
    -D /home/koichi/pgxc/nodes/datanode2
    port: 20007

Example configuration (2)

GTM
  -D /home/koichi/pgxc/nodes/gtm
  port: 20001
GTM-Standby
  -D /home/koichi/pgxc/nodes/gtm_standby
  port: 20000
GTM Proxy x 2
  gtm_pxy1
    -D /home/koichi/pgxc/nodes/gtm_pxy1
    port: 20002
    Connects: coord1, datanode1
  gtm_pxy2
    -D /home/koichi/pgxc/nodes/gtm_pxy2
    port: 20003
    Connects: coord2, datanode2

D-1
Backup

General

Coordinator and Datanode
  No difference in principle from vanilla PostgreSQL
  pg_dump, pg_dumpall work globally
  Streaming replication should be configured for each coordinator and datanode
  Restoration should be consistent in all the nodes
    Barrier
GTM proxy
  No dynamic data -> only static configuration files need backup
GTM
  Need to back up current status -> GTM Standby (explained later)

pg_dump, pg_dumpall, pg_restore

You don't have to back up each node.
Connect to one of the coordinators and issue pg_dump or pg_dumpall to back up.
Connect to one of the coordinators and issue pg_restore to restore.
pg_dump extended to support table distribution

Cold backup of each database cluster (static backup) (1)

After you stop all the components, you can back up all the physical files of each component.
Restore them and simply restart all the components.
To keep the whole cluster consistent, you must back them all up on the same occasion (after you stopped the whole cluster).

Cold backup of each database cluster (static backup) (2)

First, stop whole cluster

    $ pg_ctl stop -Z coordinator -D /home/koichi/pgxc/nodes/coord1 # -m fast
    $ pg_ctl stop -Z coordinator -D /home/koichi/pgxc/nodes/coord2 # -m fast
    $ pg_ctl stop -Z datanode -D /home/koichi/pgxc/nodes/datanode1 # -m fast
    $ pg_ctl stop -Z datanode -D /home/koichi/pgxc/nodes/datanode2 # -m fast
    $ gtm_ctl stop -S gtm_proxy -D /home/koichi/pgxc/nodes/gtm_pxy1
    $ gtm_ctl stop -S gtm_proxy -D /home/koichi/pgxc/nodes/gtm_pxy2
    $ gtm_ctl stop -S gtm -D /home/koichi/pgxc/nodes/gtm

You should run the above commands on each machine where a coordinator or datanode is running.
You may not need the -m fast option if you disconnect all the connections from coordinator to datanode with the CLEAN CONNECTION statement.
As simple as vanilla PostgreSQL, but you should take care of all the running components.

Physical backup of each database cluster (static backup) (3)

Then, back up everything (tar is used in this case)

    $ cd /home/koichi/pgxc/nodes
    $ tar cvzf somewhere/gtm.tgz gtm
    $ tar cvzf somewhere/gtm_pxy1.tgz gtm_pxy1
    $ tar cvzf somewhere/gtm_pxy2.tgz gtm_pxy2
    $ tar cvzf somewhere/coord1.tgz coord1
    $ tar cvzf somewhere/coord2.tgz coord2
    $ tar cvzf somewhere/datanode1.tgz datanode1
    $ tar cvzf somewhere/datanode2.tgz datanode2

Again, as simple as vanilla PostgreSQL
You can use your favorite backup tools: tar, rsync, whatsoever.
You just have to take care of all the components.
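Since the cold backup is the same pair of commands repeated for every component, the command list can be generated rather than typed. The sketch below is a hypothetical helper, not part of Postgres-XC: it only builds and prints the commands shown on the slides for the example configuration (the stop order and the tar invocation are taken from the slides, and `somewhere` is the slides' placeholder destination). Running the commands is left to the operator.

```python
import shlex

BASE = "/home/koichi/pgxc/nodes"   # directory layout from the example configuration

# (component kind, directory name), in the stop order used on the slides:
# coordinators, datanodes, GTM proxies, then GTM.
COMPONENTS = [
    ("coordinator", "coord1"),
    ("coordinator", "coord2"),
    ("datanode", "datanode1"),
    ("datanode", "datanode2"),
    ("gtm_proxy", "gtm_pxy1"),
    ("gtm_proxy", "gtm_pxy2"),
    ("gtm", "gtm"),
]


def stop_command(kind, name):
    """pg_ctl for coordinators/datanodes, gtm_ctl for GTM and GTM proxies."""
    data_dir = f"{BASE}/{name}"
    if kind in ("coordinator", "datanode"):
        return ["pg_ctl", "stop", "-Z", kind, "-D", data_dir]
    return ["gtm_ctl", "stop", "-S", kind, "-D", data_dir]


def backup_plan(dest="somewhere"):
    """All stop commands first, then one tar per component (run from BASE)."""
    stops = [stop_command(kind, name) for kind, name in COMPONENTS]
    tars = [["tar", "cvzf", f"{dest}/{name}.tgz", name] for _, name in COMPONENTS]
    return stops + tars


for command in backup_plan():
    print(shlex.join(command))
```

Keeping all stop commands ahead of all tar commands reflects the requirement that every component is stopped before any of them is copied.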

Hot backup (1): General

Similar to vanilla PostgreSQL
You need to synchronize the restoration point
  Barrier: CREATE BARRIER 'barrier_id'
    Advised to issue this command periodically from psql
    Will propagate to all the components
    Can be used as restoration point

Hot backup (2)

Other steps are similar to vanilla PostgreSQL
Again, you should take care of all the running nodes.
Set up hot backup (base backup and WAL archiving) for each coordinator and datanode.
GTM hot backup needs a dedicated standby
  Common to HA configuration

Coordinator/Datanode hot backup

Like standalone PG backup.
For each coordinator/datanode, you should
  1. Configure WAL archiving
  2. Take base backups
Again, simple, but you have to take care of all of them.

Hot backup: coord1 example

Configure WAL archiving

    $ cat /home/koichi/pgxc/nodes/coord1/postgresql.conf
    wal_level = archive
    archive_mode = on
    archive_command = 'cp -i %p /somewhere/%f </dev/null'

Take base backup

    $ pg_basebackup -D backup_dir -h hostname -p 20004 -F tar

-x option of pg_basebackup

This option includes WAL segments in the backup and enables restoring without WAL archiving.
In XC, we should restore all the nodes to the same timestamp.
You can use this option without WAL archiving
  If you are quite sure that the -x option includes your target barrier in the backups of all the coordinators/datanodes
If you are not, you should configure WAL archiving to be sure that the WAL files include your target barrier.

GTM hot backup: GTM Standby

GTM has a dedicated synchronous backup called GTM standby.
Just like PostgreSQL synchronous replication.
GTM standby shares the binary with GTM.
Start GTM, then GTM-Standby. GTM-Standby copies every request to GTM and maintains GTM's internal status as a backup.
GTM-Standby can be promoted to GTM with the gtm_ctl command.
GTM-proxy can reconnect to the promoted GTM.

Running GTM-Standby

Run GTM as

    $ cat gtm_data_dir/gtm.conf
    nodename = 'gtm'                              # node name of your choice
    listen_addresses = 'gtm_ip_address'
    port = 20001                                  # port number of your choice
    startup = ACT                                 # specify ACT mode to start
    $ gtm_ctl start -S gtm -D gtm_data_dir

Run GTM standby as

    $ cat gtm_sby_data_dir/gtm.conf
    nodename = 'gtm_standby'                      # node name of your choice
    listen_addresses = 'gtm_standby_ip_address'
    port = 20000                                  # port number of your choice
    startup = STANDBY                             # specify to start as standby
    active_port = 20001                           # ACT gtm port number
    active_host = 'gtm_ip_address'                # ACT gtm ip address
    $ gtm_ctl start -S gtm -D gtm_sby_data_dir

GTM Standby backs up

GTM standby backs up every status of GTM
To restore GTM with GTM-Standby, you should promote GTM standby (explained later)

D-2
Restore

Restore from pg_dump and pg_dumpall

Initialize whole Postgres-XC cluster
  Has been covered in the configuration part
Run pg_restore
  You don't have to run pg_restore for each coordinator/datanode
  Select one of the coordinators to connect to and run pg_restore, that's all.
  Takes care of table distribution/replication

Restore from the cold backup (1)

Restore all the cold backups

    $ cd /home/koichi/pgxc/nodes
    $ tar xvzf somewhere/gtm.tgz
    $ tar xvzf somewhere/gtm_pxy1.tgz
    $ tar xvzf somewhere/gtm_pxy2.tgz
    $ tar xvzf somewhere/coord1.tgz
    $ tar xvzf somewhere/coord2.tgz
    $ tar xvzf somewhere/datanode1.tgz
    $ tar xvzf somewhere/datanode2.tgz

Restore from the cold backup (2)

Start all the components again

    $ gtm_ctl start -S gtm -D /home/koichi/pgxc/nodes/gtm
    $ gtm_ctl start -S gtm_proxy -D /home/koichi/pgxc/nodes/gtm_pxy1
    $ gtm_ctl start -S gtm_proxy -D /home/koichi/pgxc/nodes/gtm_pxy2
    $ pg_ctl start -Z datanode -D /home/koichi/pgxc/nodes/datanode1 -o -i
    $ pg_ctl start -Z datanode -D /home/koichi/pgxc/nodes/datanode2 -o -i
    $ pg_ctl start -Z coordinator -D /home/koichi/pgxc/nodes/coord1 -o -i
    $ pg_ctl start -Z coordinator -D /home/koichi/pgxc/nodes/coord2 -o -i

Restoration is done.

Restoration from the hot backup: Coordinator and Datanode

Restore the hot backup of each coordinator and datanode just like single PostgreSQL
Make the restore point consistent among coordinators and datanodes.
  Specify Barrier ID as the restore point

recovery.conf settings: coord1 example

    $ cat /home/koichi/pgxc/nodes/coord1/recovery.conf
    restore_command = 'cp /somewhere/%f %p'
    recovery_target_barrier = 'barrier_id'
    $ pg_ctl start -Z coordinator -D /home/koichi/pgxc/nodes/coord1 -o -i

You can specify other recovery.conf options such as archive_cleanup_command if needed.
Configure a recovery.conf file for every coordinator and datanode and start them with pg_ctl.
  Specify the same barrier_id value.
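The requirement that every coordinator and datanode recovers to the same barrier can be sketched as a small planning step. Everything here is hypothetical helper code: how the list of barrier IDs available in each node's backup or WAL archive is obtained is outside the sketch and simply assumed as input, and the generated text mirrors only the two recovery.conf settings shown above.

```python
BASE = "/home/koichi/pgxc/nodes"   # directory layout from the example configuration


def latest_common_barrier(barriers_by_node):
    """Newest barrier ID that every node can recover to, or None.

    barriers_by_node maps a node name to its barrier IDs, oldest first
    (assumed input; collecting it is not shown here).
    """
    per_node = list(barriers_by_node.values())
    if not per_node:
        return None
    everywhere = set(per_node[0]).intersection(*per_node[1:])
    for barrier in reversed(per_node[0]):
        if barrier in everywhere:
            return barrier
    return None


def recovery_conf(barrier_id, archive_dir="/somewhere"):
    """Text of a recovery.conf using the settings from the slide."""
    return (
        f"restore_command = 'cp {archive_dir}/%f %p'\n"
        f"recovery_target_barrier = '{barrier_id}'\n"
    )


def recovery_files(barriers_by_node):
    """One recovery.conf per node, all pointing at the same barrier."""
    barrier = latest_common_barrier(barriers_by_node)
    if barrier is None:
        raise ValueError("no barrier is available on every node")
    return {f"{BASE}/{node}/recovery.conf": recovery_conf(barrier)
            for node in barriers_by_node}


example = {
    "coord1": ["b1", "b2", "b3"],
    "coord2": ["b1", "b2", "b3"],
    "datanode1": ["b1", "b2"],          # this backup misses the newest barrier
    "datanode2": ["b1", "b2", "b3"],
}
for path, text in recovery_files(example).items():
    print(path)
    print(text)
```

In the example, `b3` is missing from one datanode, so all four nodes are pointed at `b2`. Recovering three nodes to `b3` and one to `b2` would leave the cluster inconsistent.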

D-3
Recovery

When coordinator or datanode crashes

Any transaction involving the crashed coordinator or datanode will fail.
If crash recovery runs, do it as in vanilla PostgreSQL.
If crash recovery does not run, you need archive recovery.
  You should do archive recovery for every coordinator and datanode to maintain cluster-wide data consistency.

When GTM-Proxy crashes

GTM-Proxy does not have any dynamic data.
Restore backup configuration file and restart.
You may need to restart the coordinator/datanode connected to the failed gtm-proxy

If GTM crashes

You must have run gtm_standby to recover from GTM crash.
You must configure XC with gtm_proxy.
Promote gtm_standby as gtm.
Then reconnect gtm_proxy to the promoted gtm.
You don't stop coordinators and/or datanodes.
No transaction loss.

Running GTM-Standby (again)

Run GTM as

    $ cat gtm_data_dir/gtm.conf
    nodename = 'gtm'                              # node name of your choice
    listen_addresses = 'gtm_ip_address'
    port = 20001                                  # port number of your choice
    startup = ACT                                 # specify ACT mode to start
    $ gtm_ctl start -S gtm -D gtm_data_dir

Run GTM standby as

    $ cat gtm_sby_data_dir/gtm.conf
    nodename = 'gtm_standby'                      # node name of your choice
    listen_addresses = 'gtm_standby_ip_address'
    port = 20000                                  # port number of your choice
    startup = STANDBY                             # specify to start as standby
    active_port = 20001                           # ACT gtm port number
    active_host = 'gtm_ip_address'                # ACT gtm ip address
    $ gtm_ctl start -S gtm -D gtm_sby_data_dir

Run GTM Proxies

    $ cat /home/koichi/pgxc/nodes/gtm_pxy1/gtm_proxy.conf
    nodename = 'gtm_pxy1'
    listen_addresses = 'gtm_pxy1_ip_address'
    port = 20002
    gtm_host = 'gtm_ip_address'
    gtm_port = 20001
    $ gtm_ctl start -S gtm_proxy
    $ (do the same thing for gtm_pxy2 as well)

GTM recovery procedure

    $ (GTM crash found)
    $ gtm_ctl promote -S gtm \
          -D /home/koichi/pgxc/nodes/gtm_standby
    $ gtm_ctl reconnect -S gtm_proxy \
          -D /home/koichi/pgxc/nodes/gtm_pxy1 \
          -o -s gtm_standby_ip_addr -t 20000
    $ gtm_ctl reconnect -S gtm_proxy \
          -D /home/koichi/pgxc/nodes/gtm_pxy2 \
          -o -s gtm_standby_ip_addr -t 20000

Additional gtm_proxy options

Timer controls to deal with communication failure and wait for reconnect.

    err_wait_idle                # timer to wait for the first reconnect (in seconds)
    err_wait_count               # counts to wait for reconnect
    err_wait_interval            # timer to wait for the next reconnect (in seconds)
    gtm_connect_retry_idle       # timer to retry connect to the current gtm when error is detected (in seconds)
    gtm_connect_retry_count      # number of connect retries
    gtm_connect_retry_interval   # interval of connect retry to the current gtm when error is detected (in seconds)

RunningGTMStandby

RunGTMas
$ cat gtm_data_dir/gtm.conf nodename = 'gtm' # node name of your choice listen_addresses = 'gtm_ip_address' port = 20001 # port number of your choice startup = ACT # specify ACT mode to start $ gtm_ctl start -S gtm -D gtm_data_dir

RunGTMstandbyas
$ cat gtm_sby_data_dir/gtm.conf nodename = 'gtm_standby' # node name of your choice listen_addresses = 'gtm_standby_ip_address' port = 20000 # port # of your choice startup = STANDBY # specfy to start as standby active_port = 20001 # ACT gtm port numer active_host = 'gtm_ip_address' # ACT gtm ip address $ gtm_ctl start -S gtm -D gtm_sby_data_dir


D4 High Availability


HA General

Postgres-XC's HA feature should be achieved by integration with other HA middleware such as Pacemaker (resource agent).
GTM, GTM standby and GTM proxies provide the fundamental HA features.
Each coordinator and datanode should be configured with synchronous replication.
This tutorial will focus on Postgres-XC configuration for HA middleware integration.



GTM configuration

GTM should be configured with GTM standby and GTM proxy.
Monitoring GTM, GTM standby and GTM proxy can be done by process monitoring.
A dedicated monitoring command can be implemented using the direct interface to each of them.

Need to be familiar with their internal structure.
May be developed elsewhere.

Coordinator/datanode configuration

Again, same as vanilla PostgreSQL.
Configure synchronous replication for each coordinator and datanode, as in vanilla PostgreSQL.
At failover, the other coordinators/datanodes must be notified of the new connection point.

Use the ALTER NODE command after the failover:
ALTER NODE datanode1 WITH (HOST = 'new_ip_addr', PORT = 20010);
Then, SELECT pgxc_pool_reload();

Need to run pgxc_clean to clean up 2PC status.
Other procedures are the same as vanilla PostgreSQL standby server settings and operation.
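The notification step can be scripted so that no coordinator is forgotten. The sketch below only prints the psql commands to run. The function name notify_failover, the host:port argument format, and the database name postgres are assumptions made for illustration. It also assumes the node definition has to be updated on every coordinator separately. pgxc_clean should still be run afterwards. Its invocation is left out here because its options are not covered on this slide.

```shell
#!/bin/sh
# Sketch only: print the statements that tell each coordinator about the
# new address of a failed-over node. Names and the "postgres" database
# are illustrative assumptions.

# Usage: notify_failover NODE NEW_HOST NEW_PORT COORD_HOST:COORD_PORT...
notify_failover() {
    if [ "$#" -lt 4 ]; then
        echo "usage: notify_failover NODE NEW_HOST NEW_PORT COORD_HOST:COORD_PORT..." >&2
        return 2
    fi
    node=$1
    new_host=$2
    new_port=$3
    shift 3

    for coord in "$@"; do
        chost=${coord%%:*}   # part before the first colon
        cport=${coord##*:}   # part after the last colon
        # 1. Point the node definition at the promoted standby.
        printf '%s\n' "psql -h $chost -p $cport -d postgres -c \"ALTER NODE $node WITH (HOST = '$new_host', PORT = $new_port);\""
        # 2. Make the connection pooler pick up the new definition.
        printf '%s\n' "psql -h $chost -p $cport -d postgres -c \"SELECT pgxc_pool_reload();\""
    done
}
```

For example, notify_failover datanode1 new_ip_addr 20010 coord1:5432 coord2:5432 prints four commands, two per coordinator, which can be reviewed and then piped to sh.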



Why pgxc_clean?

When a node is failed over or recovered, 2PC status could be inconsistent.

At some nodes, a transaction has been committed or aborted, but at other nodes it might remain prepared.
pgxc_clean collects the status of such outstanding 2PC transactions and corrects them.


Cleaning up outstanding 2PC

Collects all the prepared transactions at all the coordinators and datanodes.
Checks whether the prepared transactions are committed or aborted at other nodes.
If committed/aborted, then commits/aborts the transaction at the nodes where it is still prepared.
If an implicit 2PC transaction is only prepared, it was intended to be committed, so pgxc_clean commits it.
If an outstanding 2PC transaction is committed at some nodes and aborted at others, it is an error. pgxc_clean notifies the operator and lets the operator resolve it manually.
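The rules above amount to a small decision table. The following sketch restates them as a shell function. It is an illustration of the rules on this slide, not the actual pgxc_clean implementation. The function name resolve_2pc, the status words, and the output words are all hypothetical. The slide does not say what happens to an explicit 2PC transaction that is prepared everywhere, so leaving it untouched ("keep") is an assumption of this sketch.

```shell
#!/bin/sh
# Sketch only: decide what to do with one global transaction, given its
# status on each node. Not the real pgxc_clean logic; names are made up.

# Usage: resolve_2pc IMPLICIT STATUS...
#   IMPLICIT: yes | no  (was 2PC started implicitly by the cluster?)
#   STATUS:   prepared | committed | aborted  (one per node)
resolve_2pc() {
    implicit=$1
    shift
    committed=0
    aborted=0
    prepared=0
    for s in "$@"; do
        case $s in
            committed) committed=1 ;;
            aborted)   aborted=1 ;;
            prepared)  prepared=1 ;;
            *) echo "unknown status: $s" >&2; return 2 ;;
        esac
    done

    if [ "$committed" -eq 1 ] && [ "$aborted" -eq 1 ]; then
        echo manual     # conflicting outcomes: report to the operator
    elif [ "$prepared" -eq 0 ]; then
        echo nothing    # no node is left in prepared state
    elif [ "$committed" -eq 1 ]; then
        echo commit     # finish the commit on the prepared nodes
    elif [ "$aborted" -eq 1 ]; then
        echo abort      # finish the abort on the prepared nodes
    elif [ "$implicit" = "yes" ]; then
        echo commit     # implicit 2PC, only prepared: meant to commit
    else
        echo keep       # explicit 2PC, only prepared: assumption, leave it
    fi
}
```

For instance, resolve_2pc no committed prepared covers a transaction committed on one node and still prepared on another, and resolve_2pc yes prepared prepared covers an implicit 2PC transaction that never got past the prepare phase.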

Chapter E Postgres-XC as a community

How to enter the sect...


Sites to know

Main management in SourceForge
URL: http://sourceforge.net/projects/postgres-xc/
Bug tracker, mailing lists...

Other GIT repository in Github
URL: https://github.com/postgres-xc/postgres-xc
Mirror of official repository in SourceForge

Project webpage
URL: http://postgres-xc.sourceforge.net/
Roadmap, mailing list details, members, docs


Mailing lists

All the details in project webpage
http://postgres-xc.sourceforge.net/
Section "Mailing list", with subscription links

postgres-xc-XXX@lists.sourceforge.net
General: postgres-xc-general
Hackers: postgres-xc-developers
GIT commits: postgres-xc-committers


Documentation

Published in http://postgres-xc.sourceforge.net
Section "Documentation"

Concerned release and branches
master (automatically uploaded daily)
Stable releases from 1.0


Daily snapshots and build farming

Published in http://postgres-xc.sourceforge.net

Tarball daily snapshots
Based on latest commit of master branch
Contains html and man pages

Regression and performance results

Regressions
Tests based on PostgreSQL + XC related tests
Daily automatic tests

Performance
Daily automatic tests based on DBT-1
On master and stable branches from 1.0


Release policy and code merge


How to contribute

As a tester - bug report

As a packager
Debian, RPM, pkg...

As a coder
Why not writing new stuff? => Feature requests in SF tracker
Bug correction and stabilization? => Bug tracker in SF

As a documentation correcter
Anything I am forgetting here...

What next?

First major release, based on Postgres 9.1 => 1.0

Next move
Merge with Postgres 9.2
Data redistribution
Node addition and deletion
Triggers
Global constraints

1.1 at the end of October 2012 (?)

Contact information

Koichi Suzuki
koichi.clarinet@gmail.com

Ashutosh Bapat
ashutosh.bapat@enterprisedb.com

Michael Paquier
michael.paquier@gmail.com
Twitter: @michaelpq


Thank You Very Much!
