Building Petabyte-scale Postgres Clusters
Chris Travers
June 24, 2025 · 4b

Speaker Intro
(Introducing myself)

Agenda
▶ Why Bagger was Built at Adjust
▶ General Design of Adjust’s Version
▶ Shortcomings of the First Version
▶ Design of the Open Source Version
▶ Tradeoff lessons

Life Before Bagger
▶ We used ElasticSearch
▶ Size of 1PB for 30 days of data retention
▶ Velocity made the system unusable
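
To put the 1PB-per-30-days figure in perspective, here is a back-of-envelope calculation of the sustained volume it implies. It is only a sketch: it assumes decimal units (1 PB = 10^15 bytes) and that data arrives evenly across the retention window, neither of which the slides state. The function name `sustained_rate` is made up for this illustration.

```python
# Back-of-envelope only. Assumptions (not from the slides): decimal units,
# and the 1 PB is spread evenly over the 30-day retention window.
PETABYTE = 10**15
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_rate(total_bytes: int, retention_days: int) -> tuple:
    """Return (bytes per day, bytes per second) needed to accumulate
    total_bytes over retention_days at a constant rate."""
    per_day = total_bytes / retention_days
    per_second = per_day / SECONDS_PER_DAY
    return per_day, per_second


per_day, per_second = sustained_rate(PETABYTE, 30)
print(f"{per_day / 10**12:.1f} TB/day, {per_second / 10**6:.0f} MB/s")
# prints "33.3 TB/day, 386 MB/s"
```

Under those assumptions the system has to absorb roughly 33 TB per day, around the clock, before any indexing or replication overhead is counted.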

Specific ES Problems
▶ Noisy protocol led to poor performance
▶ At that scale, ES was extremely difficult to start/restart
▶ GC stop-the-world events would knock nodes out of the cluster
▶ Large queries could kill the entire cluster
▶ We had to use fixed ES schemas to prevent major problems.
Some of these may have gotten better, but I doubt all have.

Enter Bagger
▶ Minimalist design
▶ Schemaless
▶ Linearly write-scalable
▶ Built for write-heavy workloads
▶ However, read-scalability is static

Adjust’s Design
▶ Kala partitioning to Sharding
▶ Random Sharding
▶ Custom data shovel (Schaufel) for ingestion
▶ Patched Postgres
▶ Storing just JSONB docs
▶ C-language routing trigger
▶ Why Postgres?
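
The combination of random sharding and schemaless JSON documents listed above can be illustrated with a small in-memory model. This is not Bagger's code: the slides say routing is done by a C-language trigger inside a patched Postgres, whereas everything below (the `Shard` and `RandomRouter` classes, the list standing in for a JSONB table) is a hypothetical stand-in chosen only to show the write and read paths.

```python
import json
import random
from typing import Optional


class Shard:
    """Stand-in for one storage node; each list entry mimics a JSONB row."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.docs = []

    def insert(self, doc: dict) -> None:
        # No schema is enforced: any JSON-serializable document is accepted.
        self.docs.append(json.dumps(doc))

    def search(self, key: str, value) -> list:
        rows = (json.loads(d) for d in self.docs)
        return [r for r in rows if r.get(key) == value]


class RandomRouter:
    """Hypothetical router: a write goes to one random shard, a read asks all."""

    def __init__(self, shards: list, seed: Optional[int] = None) -> None:
        self.shards = shards
        self.rng = random.Random(seed)

    def write(self, doc: dict) -> Shard:
        shard = self.rng.choice(self.shards)  # exactly one node does the work
        shard.insert(doc)
        return shard

    def read(self, key: str, value) -> list:
        hits = []
        for shard in self.shards:  # the document could be on any node
            hits.extend(shard.search(key, value))
        return hits


router = RandomRouter([Shard(f"node{i}") for i in range(4)], seed=1)
for i in range(1000):
    router.write({"event": "click" if i % 2 else "install", "n": i})

total_docs = sum(len(s.docs) for s in router.shards)
clicks = router.read("event", "click")
print(total_docs, len(clicks))  # prints "1000 500"
```

In this model each write touches a single node, so adding nodes adds write capacity, while every read has to visit all nodes no matter how many there are. That is one plausible reading of the earlier "linearly write-scalable" and "read-scalability is static" bullets, offered as an interpretation rather than a description of the real system.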

We Wanted to Open Source It
When I was heading the department I put some effort into open sourcing Bagger. However:
▶ Too many assumptions about Adjust’s operations
▶ Statically defined partitioning criteria

Intermezzo
And then I left Adjust... fast forward 2 years

Restarting the Effort
▶ Several of us came together with experience on this system
▶ Designed a similar system
▶ Using the existing open source components

Design Strategy
▶ Go with what we know
▶ Minimize unknown unknowns
▶ Evaluate larger changes independently later
Remember: We already had experience on a prototype that reached 10PB without difficulty.

Initial Technology Choices

Going with What We Know
▶ Perl
▶ Moose
▶ PGObject
▶ Database-centered logic
▶ C-language triggers
▶ Perldancer
▶ Frontend framework to be decided
▶ Etcd for cluster state handling (though this is pluggable)

Architecture and Evolution
Will discuss the design and how it is different from Adjust’s version on the next few slides.

Data Flow

Superstructure

Current status
Node control infrastructure        Complete
Metadata server schema/functions   Complete
C-Language Trigger                 In Testing
PostgREST Config                   Done
Query Proxy                        In progress
Front End                          Deferred until after GA
Feel free to get involved!

Beyond GA
Here is a list of some changes we plan to evaluate:
▶ Better language for query proxy: Java? Golang? C?
▶ Integration of CitusDB? Pros? Cons?
▶ How can we scale reads?
▶ Replication instead of write-twice?
▶ Impact of compiled block size? Can we remove most TOASTing?
And many more

Tradeoff Lessons
▶ Scalability Tradeoffs (read/write)
▶ Benchmarking Everything
▶ Specialization vs. generalization

Bonus learnings
▶ Know your requirements
▶ Different architectures for different needs
▶ Linear scaling in one dimen