What is Patroni, really?
Polina Bungina, Alexander Kukushkin
HOW 2025 PostgreSQL & IvorySQL Eco Conference

About us

Alexander Kukushkin
• Principal Software Engineer @Microsoft
• The Patroni guy
• akukushkin@

Polina Bungina
• Software Engineer @Zalando
• polina.bungina@zalando.de

Agenda
● What is it, really?
● Automatic failover done wrong
● Patroni overview
  ○ how it works?
  ○ notable features

What is it, really?
● Originated from Governor project by Compose, in 2015
● Main functions:
  ○ Automatic failover
  ○ Cluster creation and initial setup
  ○ Cluster management
  ○ ~ Monitoring

Automatic failover done wrong

Running two nodes only
[Diagram: the primary streams WAL to the standby, with a health-check between the two nodes.]
[Diagram: the same two nodes; the standby asks "Should I … promote?"]
[Diagram: both nodes are primary.]

STONITH
[Diagram: standby and primary with a health-check and a WAL stream between them. "Should I … promote?"]

Avoiding split-brain
● STONITH (shoot the other node in the head)
● Must use a secondary network
● Almost impossible to get it right

Single witness node
[Diagram: the primary streams WAL to the standby; a witness node (arbiter) health-checks both.]
[Diagram: the witness node (arbiter) decides "Promote standby!" and applies STONITH.]
[Diagram: witness node (arbiter), health-check, and two primaries.]

Things to consider
● Think about network partition
● Prevent split-brain → fencing
  ○ STONITH
    ● Shutdown
    ● Kill old connections, re-configure proxy
  ○ Self-fencing (locally)
    ● Watchdog

Local agents
[Diagram: primary and standby, each with a local agent, connected by a WAL stream, plus a witness node (arbiter).]
[Diagram: three data centers. DC1: primary with agent. DC2 and DC3: standby with agent. Plus a witness node (arbiter).]
[Diagram: the same layout with DC1 (isolated).]

But how to do it right?
Quorum
[Diagrams: standby and primary with local agents, connected by a WAL stream; the first diagram also shows the witness node (arbiter).]
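
The quorum idea behind the isolated-data-center scenario can be illustrated with a strict-majority check. This is only a minimal sketch, not Patroni code: `has_quorum` and its parameters are hypothetical names, and it assumes a cluster of three voters where each voter carries exactly one vote.

```python
def has_quorum(reachable_votes: int, total_votes: int) -> bool:
    """Return True only when a strict majority of all voters is reachable.

    reachable_votes counts the node itself plus every voter it can contact.
    """
    return reachable_votes > total_votes // 2


# Assumed layout: three voters, one per data center. An isolated node sees
# only itself, so it must not act as primary; the two nodes that still see
# each other keep the majority.
isolated_side = has_quorum(reachable_votes=1, total_votes=3)
connected_side = has_quorum(reachable_votes=2, total_votes=3)
```

With two voters only, a single surviving node never has a majority, which is one way to read the "running two nodes only" problem.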

Patroni: how it works?

General idea
● State stored in Distributed Configuration Store (DCS)
  ○ Etcd, ZooKeeper, Consul, Kubernetes control-plane
● Built-in distributed consensus (RAFT, Zab)
● Key-value store
● Atomic CAS (compare-and-swap) operations
● Lease/Session/TTL to expire data
  ○ /leader, /members/*
● Watches for keys

Patroni overview
[Diagram: primary node A UPDATES /leader, /status, …; standby node B UPDATES /members/B and WATCHES /leader. DCS holds /leader: "A", ttl: 30.]

Leader race
[Diagram: primary node A sends UPDATE("/leader", "A", ttl=30, prevValue="A") and gets SUCCESS; standby nodes B and C WATCH /leader. DCS holds /leader: "A", ttl: 30.]
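
The DCS primitives named in the general idea (key-value pairs, TTL-based expiry, atomic compare-and-swap) can be modelled in a few lines. The class below is a hypothetical in-memory stand-in written for illustration; `ToyDCS` and its method names are assumptions and do not correspond to the Etcd, ZooKeeper, Consul, or Patroni APIs. The clock is injectable so that expiry can be demonstrated without waiting.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ToyDCS:
    """In-memory stand-in for a DCS: keys with a TTL and compare-and-swap."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[str, float]] = {}  # key -> (value, expires_at)

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # The lease ran out: the key behaves as if it had been deleted.
            del self._data[key]
            return None
        return value

    def create(self, key: str, value: str, ttl: float) -> bool:
        """Succeed only if the key does not exist (like prevExists=False)."""
        if self.get(key) is not None:
            return False
        self._data[key] = (value, self._clock() + ttl)
        return True

    def update(self, key: str, value: str, ttl: float, prev_value: str) -> bool:
        """Succeed only if the current value equals prev_value (CAS)."""
        current = self.get(key)
        if current is None or current != prev_value:
            return False
        self._data[key] = (value, self._clock() + ttl)
        return True


# A fake clock makes the lease expiry deterministic.
now = [0.0]
dcs = ToyDCS(clock=lambda: now[0])
first_create = dcs.create("/leader", "A", ttl=30)
second_create = dcs.create("/leader", "B", ttl=30)
refresh = dcs.update("/leader", "A", ttl=30, prev_value="A")
now[0] = 31.0  # the holder stopped refreshing; the lease expires
after_expiry = dcs.get("/leader")
```

A real DCS performs these checks atomically across several servers through its consensus protocol; the sketch only shows the semantics a client relies on.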
[Diagram: primary node A; standby nodes B and C WATCH /leader. DCS holds /leader: "A", ttl: 1.]
[Diagram: the DCS sends NOTIFY("/leader", expired=true) to standby node B and to standby node C.]
[Diagram: both standbys probe the other members.
  standby node B: 1) GET A:8008/patroni -> timeout  2) GET C:8008/patroni -> wal_position: 100
  standby node C: 1) GET A:8008/patroni -> timeout  2) GET C:8008/patroni -> wal_position: 100]
[Diagram: both standbys try to create the leader key.
  standby node B: CREATE("/leader", "B", ttl=30, prevExists=False) -> SUCCESS
  standby node C: CREATE("/leader", "C", ttl=30, prevExists=False) -> FAIL
  DCS holds /leader: "B", ttl: 30.]
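
The leader race can be sketched as two steps: decide whether this node is allowed to run, then attempt an atomic create of the leader key. The code below is an illustration under stated assumptions, not Patroni's implementation: `may_race` and `create_leader_key` are hypothetical helpers, a plain dict stands in for the DCS, both standbys are assumed to be at WAL position 100, and node C is assumed to probe node B.

```python
from typing import Dict, Optional


def may_race(my_wal_position: int, peers: Dict[str, Optional[int]]) -> bool:
    """Allow the race only if no reachable peer is ahead of this node.

    peers maps a member name to its reported WAL position, or to None when
    the request to its REST API timed out.
    """
    reachable = [pos for pos in peers.values() if pos is not None]
    return all(my_wal_position >= pos for pos in reachable)


def create_leader_key(dcs: Dict[str, str], name: str) -> bool:
    """Create /leader only if it is absent (mimics prevExists=False)."""
    if "/leader" in dcs:
        return False
    dcs["/leader"] = name
    return True


# Hypothetical run: A is unreachable, B and C are equally advanced,
# and B happens to reach the DCS first.
dcs: Dict[str, str] = {}
results = {}
for name, peers in (("B", {"A": None, "C": 100}), ("C", {"A": None, "B": 100})):
    results[name] = may_race(100, peers) and create_leader_key(dcs, name)
```

Both nodes pass the health comparison, so the atomic create is what guarantees a single winner.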

Self-fencing
[Diagram: primary node A (UPDATE /leader), standby node B (UPDATE /members/B, WATCH /leader), read-only instance. DCS holds /leader: "A", ttl: 19.]
[Diagram: node A demotes and becomes a standby (UPDATE /leader); standby node B (UPDATE /members/B, WATCH /leader), read-only instance. DCS holds /leader: "A", ttl: 9.]
[Diagram: /leader: "A", ttl: 0. 1. The DCS NOTIFIES standby node B that /leader expired. 2. Node B CREATES /leader. 3. Node B promotes. Node A remains a standby.]

HA loop
Communication with DCS - leader (whole cluster)
[Diagram: one iteration of the loop: get /, update /leader, …, update /sync, write /failover, update /status (retriable), then sleep for loop_wait.]
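
The self-fencing behaviour and the HA loop can be summarised as a per-iteration decision. The function below is a simplified sketch under assumptions, not Patroni's actual logic: `next_action`, its parameters, and the returned strings are all hypothetical, and many real-world checks (watchdog, synchronous mode, manual failover) are left out.

```python
from typing import Optional


def next_action(name: str, role: str, leader: Optional[str], dcs_ok: bool) -> str:
    """Pick what one loop iteration should do.

    leader is the current value of /leader (None if absent or expired).
    dcs_ok is False when the DCS could not be reached in time.
    """
    if not dcs_ok:
        # A primary that cannot prove it still holds the lease fences itself.
        return "demote" if role == "primary" else "keep following"
    if leader is None:
        return "race for leader key"
    if leader == name:
        return "update leader key" if role == "primary" else "promote"
    # Somebody else holds the lease.
    return "demote" if role == "primary" else "keep following"


healthy_primary = next_action("A", "primary", leader="A", dcs_ok=True)
partitioned_primary = next_action("A", "primary", leader="A", dcs_ok=False)
standby_after_expiry = next_action("B", "replica", leader=None, dcs_ok=True)
standby_after_win = next_action("B", "replica", leader="B", dcs_ok=True)
```

The key property is that the old primary demotes on its own, without anyone having to reach it over the network.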

ttl, loop_wait, retry_timeout (whole cluster)
loop_wait (10) + 2 * retry_timeout (10) <= ttl (30)
[Diagram: get / and update /leader.]
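
The timing rule can be checked mechanically. A minimal sketch, assuming all three values are in seconds; `timing_is_consistent` is a hypothetical helper name, not a Patroni function.

```python
def timing_is_consistent(ttl: int, loop_wait: int, retry_timeout: int) -> bool:
    """Check the rule loop_wait + 2 * retry_timeout <= ttl."""
    return loop_wait + 2 * retry_timeout <= ttl


# Values from the slide: 10 + 2 * 10 = 30, which just fits into ttl = 30.
slide_values_ok = timing_is_consistent(ttl=30, loop_wait=10, retry_timeout=10)
# Raising retry_timeout alone breaks the rule: 10 + 2 * 11 = 32 > 30.
longer_retry_ok = timing_is_consistent(ttl=30, loop_wait=10, retry_timeout=11)
```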

Data stored in DCS
$ etcdctl get --keys-only --prefix /service/demo
/service/demo/config              /* global (dynamic) configuration */
/service/demo/initialize          /* cluster identifier */
/service/demo/leader              /* who is the primary? */
/service/demo/members/patroni1    /* discovery */
/service/demo/members/patroni2
/service/demo/members/patroni3
/service/demo/status
/service/demo/history             /* failover history */
/service/demo/failover            /* manual failover/switchover */
/service/demo/sync                /* synchronous mode */

Data retrieved from DCS
$ patronictl list
+ Cluster: demo (7497665970948870167) --+----+-----------+
| Member   | Host | Role    | State     | TL | Lag in MB |
+----------+------+---------+-----------+----+-----------+
| patroni1 |      | Leader  | running   |  1 |           |
| patroni2 |      | Replica | streaming |  1 |         0 |
| patroni3 |      | Replica | streaming |  1 |         0 |
+----------+------+---------+-----------+----+-----------+

$ etcdctl get --print-value-only --prefix /service/demo/leader
patroni1
$ etcdctl get --print-value-only --prefix /service/demo/initialize
7497665970948870167

$ etcdctl get --print-value-only --prefix /service/demo/members/patroni2
{
  "conn_url": "postgres://:5432/postgres",
  "api_url": ":8008/patroni",
  "state": "running",
  "role": "replica",
  "version": "4.0.5",
  "xlog_location": 67425896,    /* max(receive_lsn or 0, replay_lsn or 0) */
  "replication_state": "streaming",
  "timeline": 1
}
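
The `xlog_location` value is a plain byte position. The sketch below shows how the `max(receive_lsn or 0, replay_lsn or 0)` rule from the comment behaves when one side is missing, and how such an integer maps to PostgreSQL's textual LSN notation. Both helper names are hypothetical; only the formula and the sample value come from the slide.

```python
import json
from typing import Optional


def reported_wal_position(receive_lsn: Optional[int], replay_lsn: Optional[int]) -> int:
    """Apply max(receive_lsn or 0, replay_lsn or 0); None counts as 0."""
    return max(receive_lsn or 0, replay_lsn or 0)


def format_lsn(position: int) -> str:
    """Render a byte position as a PostgreSQL-style LSN (high/low 32 bits in hex)."""
    return f"{position >> 32:X}/{position & 0xFFFFFFFF:X}"


# A fragment of the member key shown on the slide.
member = json.loads('{"xlog_location": 67425896, "timeline": 1}')
lsn_text = format_lsn(member["xlog_location"])  # "0/404D668"
```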

$ etcdctl get --print-value-only --prefix /service/demo/status
{
  "optime": 67425896,               /* pg_current_wal_flush_lsn() */
  "slots": {
    "patroni1": 67425896,           /* members slots */
    "patroni2": 67425896,
    "patroni3": 67425896,
    "my_logical_slot": 67425700     /* permanent slots */
  },
  "retain_slots": ["patroni1", "patroni2", "patroni3"]    /* member_slots_ttl */
}
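
The `optime` in `/status` and a member's `xlog_location` are both byte positions, so their difference gives the replication lag in bytes. The sketch below compares that lag with a byte threshold such as Patroni's `maximum_lag_on_failover` setting (1048576 in this demo cluster's configuration). The function names are hypothetical, and the real eligibility check in Patroni involves more conditions than this one.

```python
def replication_lag_bytes(leader_optime: int, member_wal_position: int) -> int:
    """Bytes by which a member trails the leader's last known position."""
    return max(leader_optime - member_wal_position, 0)


def within_failover_lag(leader_optime: int, member_wal_position: int,
                        maximum_lag_on_failover: int) -> bool:
    """A member stays a failover candidate while its lag does not exceed the limit."""
    return replication_lag_bytes(leader_optime, member_wal_position) <= maximum_lag_on_failover


# Values from the slides: the replica is fully caught up.
caught_up = within_failover_lag(67425896, 67425896, 1048576)
# Hypothetical replica that trails by 2 MiB, above the 1 MiB limit.
too_far_behind = within_failover_lag(67425896, 67425896 - 2 * 1048576, 1048576)
```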

$ etcdctl get --print-value-only --prefix /service/demo/config
{
  "loop_wait": 10,
  "ttl": 30,
  "retry_timeout": 10,
  "maximum_lag_on_failover": 1048576,
  "postgresql": {
    "parameters": {"max_connections": 100},    /* applied to all members (global) */
    "use_pg_rewind": true
  },
  "synchronous_mode": "quorum"
}

What else?

Notable features
● Standby cluster - running cascading replication to a remote datacenter (reg