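We all know that most DBAs are very conservative: they would rather have their databases live only on dedicated servers. In the modern world of microservices, Kafka, and Kubernetes, the number of databases starts to grow in proportion to the size of the organization and quickly exceeds what can comfortably be managed manually or semi-automatically.
I have been working at Zalando for almost 7 years. How many of you have heard of Zalando?
For those who have not, it is a company similar to Lamoda in Russia.
We sell clothes and shoes, but we sell them in 17 European countries.
We have 7 logistics centers and warehouses of our own.
Zalando has more than 15,000 employees.
About 2,000 of them work in technology. The engineers are spread across roughly 200 teams that write applications.
Recently we have been deploying a lot on Kubernetes and working with Kubernetes a great deal.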
?
- , Kubernetes, , .
- , Postgres Kubernetes Spilo Patroni.
- , Postgres-Operator Kubernetes.
- – , , .
- Kubernetes . 140 . 50/50 production/test environment. . . cost unit 2 Kubernetes-. , . .
- production deployment CI/CD. docker image, , CI/CD.
- production Kubernetes- , . request, 4- , - , . -.
Postgres Kubernetes? . 10 Postgres- Kubernetes-.
Postgres-Operator Postgres Kubernetes , 140, .
Kubernetes, Postgres? . , , Kubernetes.
, , - .
- Kubernetes . tools.
- Kubernetes . .
. -. worker-, , Kubernetes , kubelet, docker, fluentd, kube-proxy . .
. , .
?
- . docker . Kubernetes . , PersistentVolumes PersistentVolumeClaim.
– StatefulSets, , -, , . . . , -, StatefulSets PersistentVolumeClaim PersistentVolumeClaim templates volume, volume , .
Postgres Kubernetes, . , Kubernetes docker. , - .
- docker image. Spilo. Spilo – . image Postgres, . . , 9.3 12.
- postgres’ extensions , pg_partman, pg_cron, postgis, etc, timescaledb.
- tools , pgq, pgbouncer, wal-e/wal-g. , , docker Kubernetes, , image Kubernetes EC2 instance Amazon.
- HA Patroni,
- .
Patroni? , , . Postgres, HA.
Patroni Python. Kubernetes. Postgres first class citizen Kubernetes, . . Postgres .
Patroni Postgres Kubernetes supervisor , . . .
Patroni – , , failover . Patroni , . . InitDB Postgres, Patroni point in time recovery, .
, , Patroni .
, Patroni, Postgres. - Postgres, Patroni: « ». .
? StatefulSet. . . PersistentVolume. StatefulSet, demo-0 demo-1.
, – Patroni. Patroni kubernetes’ endpoint. . . , Patroni , . , , , endpoint, IP.
-. , .
demo — repl. , labelSelector: role = replica. , labelSelector.
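The `labelSelector: role = replica` mentioned above picks pods by their labels. Here is a minimal Python sketch of that equality-based matching rule. It assumes pods are plain dictionaries with a `labels` mapping. The pod names `demo-0` and `demo-1` come from the talk. The `cluster-name` label, the `master` role value, and the helper name `select_pods` are illustrative assumptions, not taken from the source.

```python
from typing import Dict, List


def select_pods(pods: List[dict], selector: Dict[str, str]) -> List[dict]:
    """Return the pods whose labels contain every key/value pair in selector."""
    return [
        pod
        for pod in pods
        if all(pod["labels"].get(key) == value for key, value in selector.items())
    ]


# Two pods of a hypothetical "demo" cluster; the label values are assumptions.
pods = [
    {"name": "demo-0", "labels": {"cluster-name": "demo", "role": "master"}},
    {"name": "demo-1", "labels": {"cluster-name": "demo", "role": "replica"}},
]

replica_names = [pod["name"] for pod in select_pods(pods, {"role": "replica"})]
print(replica_names)
```

Because the selector only compares labels, whichever pod currently carries `role: replica` is matched, so nothing has to be reconfigured when the roles move between pods.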
?
, , YAML manifests. . , YAML. , .
Helm, . . CI/CD deployment. . rolling upgrade. minor Postgres, docker image, ? StatefulSet , StatefulSet, . . .
, , rolling upgrade. rolling upgrade Kubernetes-.
? , : 1, 2, 3. availability , . . -. , volumes .
Kubernetes upgrade, workers, . . . cloud environment AWS, - EC2 instance, . .
? , 3 , 3- . 2 availability .
Kubernetes , . Patroni . enter option , . . connections , . , .
.
Kubernetes rolling upgrade .
. . . .
, .
.
? – .
, 3 failover , . . 3 3 failover. B – 2, C – 2.
- , .
.
, , . . , : « Postgres». , pull request Git. kubectl Amazon. .
, - instance, .
.
, .
?
:
- Deployments. . .
- Upgrades clusters. rolling upgrade Postgres. rolling upgrade Kubernetes .
- : , , .
- failovers maintenance.
Postgres-Operator. Kubernetes, , . . , , . – , .
Postgres, YAML-. .
-, , ID , . . . Team, , ACID. ? , . . Atomicity, Consistency, Isolation, Durability.
-, volume. – 1 . – 2. Postgres. . : «, , . owner ».
?
DB deployer. , CI/CD. YAML- CRD-, . Postgres-operator event . StatefulSet - . endpoint, . Postgres, . . superuser , .
Kubernetes , .
rolling upgrade Kubernetes?
.
3 , 3 . , 3 , .
, . Kubernetes , .
. , , .
switchover.
, . . switchover = 1.
, .
Switchover . , , , , . . , downtime .
? issues ?
-, Kubernetes- AWS. .
AWS API , API. , - , AWS .
? Kubernetes AWS API , volumes, , , volumes , postgres’ . , . .
, deployment , . , .
EC2 instance Amazon. , , , , . Amazon, EBS volumes instances. ? , . . - , instances. , instance Amazon, volumes . . . 30 , . , .
Kubernetes, , Postgres, , . Postgres . Patroni . Postgres , Patroni . – crash loop. , .
partitions , -. volume . . volume, , throughput IOPS. volume .
auto-extend volumes? Amazon . API. volume 100 , .
, , , , , auto-extend. , , . . .
volumes , .
. , - jobs . .
? HA , Disaster Recovery , wal-e continuous archiving , basebackup.
wal-e – , - . pg_stat_statements 2- . , . , : UPDATE WHERE id IN 150 . . . Postgres – .
Pg_stat_statements 2- . pg_stat_statements , . Kubernetes , , , . .
wal-e , . , , postgres’ - label- . - reinitializing.
– - tools, , , wal-g, pgBackRest. . -, , Postgres 9.6, 9.5 . -, , , .
. wal-e, , basebackup wal-e.
. Out-Of-Memory? docker Kubernetes – . Postgres, , 9. , . production .
. dmesg. , Memory cgroup out of memory Postgres. , ?
? process ID, .
, , . dmesg -T -. OOM system control «oom_score_adj», . Patroni Postgres, . . , .
memory limit 8 , cgroup , 6 + postgres’ shared buffers 2 . 6 . postgres’ , , , .
. . , cgroup shared memory , - .
, shared buffers 25 % 20 %. , , . . .
Postgres 11- . production minor releases, . , , .
. , – , - , shared memory. docker shared memory 64 .
Postgres 11? Postgres 11 parallel hash join. ? worker hash, shared memory. 64 , hash .
? docker dev/shm, .
Kubernetes . . . – tmpfs volume dshm.
, . . volume – enableShmVolume. , , volume. , .
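The tmpfs volume `dshm` mentioned above can be sketched as a pod spec fragment. This is a minimal Python illustration, assuming the usual Kubernetes approach of an `emptyDir` volume with `medium: Memory` mounted at `/dev/shm`, which lifts Docker's default 64 MB shared-memory limit. The container name `postgres` and the helper name `with_shm_volume` are hypothetical. What the operator actually generates when `enableShmVolume` is set may differ in detail.

```python
import copy


def with_shm_volume(pod_spec: dict, container_name: str, volume_name: str = "dshm") -> dict:
    """Return a copy of pod_spec with a memory-backed emptyDir mounted at /dev/shm."""
    spec = copy.deepcopy(pod_spec)  # leave the caller's spec untouched
    # A tmpfs-backed volume: Kubernetes keeps emptyDir contents in RAM for medium "Memory".
    spec.setdefault("volumes", []).append(
        {"name": volume_name, "emptyDir": {"medium": "Memory"}}
    )
    # Mount it over /dev/shm in the named container only.
    for container in spec.get("containers", []):
        if container["name"] == container_name:
            container.setdefault("volumeMounts", []).append(
                {"name": volume_name, "mountPath": "/dev/shm"}
            )
    return spec


base_spec = {"containers": [{"name": "postgres", "image": "spilo"}]}
patched = with_shm_volume(base_spec, "postgres")
print(patched["volumes"])
```

Memory used by such a volume counts against the container's memory limit, so the limit has to leave room for it.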
Postgres . -, failover , . . Patroni, - events. Patroni failover , .
, , FATAL too many connections. . . 12- Postgres . max_wal_senders max_connections. wal_senders Postgres. .
Postgres – Built-in connection pooler.
– :
, cluster manifest, , . , : 100 . , , . , . OOM-Killer . , .
. , : 4 , 32 . , 5 64 , , Kubernetes’ . , - .
? production - ServiceAccount, Spilo. , , Postgres read only. ServiceAccount , , - , . .
YAML-.
.
, , , , array . .
tools, , Postgres , , 10.10, . 10. volume . .
tools . , , Git .
environment «». .
1 500 postgres’ . 100 Kubernetes-. . , on-Call , , , , . . - .
, . , , Patroni, Spilo, .
, open source. . Patroni Spilo .
! , .
Questions
availability ?
?
.
, anti-affinity, . . .
! . : production?
, . 600 1 400 production. . . 600 . , . , , environment . , . , production 2- .
, external volume, . . Host Path , . . - ?
, . . . i3-volume Amazon . ? EBS , . , . . , . , .
, IO-bound , ?
, . Amazon i3-instances. NVMe . instance , . , , . Kubernetes team , , , rolling upgrade , . . 1-2 . 1-2 - .
! ?
wal-e. docker cron, basebackup. archive_command, . . wal, , S3 Amazon. , basebackup + wal . retention – 5 , . . 5 .
! . 1 400? ? 2?
200 . , , , , . . Kafka. , . , . . , . , , . . . 80, . . .
, , Postgres ?
7 . . , . pets world cattle. Pet – , -, . – , . . - , .
?
, .
, ! EBS volumes ?
gp2 , . io1 – . 3 000 IOPS, io1 , , .
EBS gp2, 250 ?
. Kubernetes. – volumes, RAID. . Kubernetes . Kubernetes , EC2 i3-instance with NVMe, instance, EBS , stripe.
Kubernetes + AWS?
, . . . . CPU, memory limit request 100 millicore, 100, 10 . . . . , 101, – . . .
RPO, RTO Postgres ?
, Kafka. . . , .
, .
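Typically, 1-2 WAL segments of data are lost (in the case of a complete failure). Normally, replication does not lag behind for us.
1-2 segments, if the load is small, could be half a day.
Yes, if there is no load, the segments may not be rotated at all; that is, if there are no transactions, not even after the timeout.
Can a timeout be set there?
It should time out, but if there are no transactions, the segments are not rotated. I dealt with this recently.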
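The exchange above can be turned into a rough estimate of how much time 1-2 lost WAL segments represent. The sketch below assumes PostgreSQL's default 16 MiB segment size and an optional `archive_timeout` in seconds; the write rates and the function name `worst_case_loss_seconds` are illustrative assumptions, not figures from the talk.

```python
from typing import Optional

WAL_SEGMENT_BYTES = 16 * 1024 * 1024  # PostgreSQL default wal_segment_size


def worst_case_loss_seconds(
    segments_lost: int,
    wal_bytes_per_second: float,
    archive_timeout: Optional[int] = None,
) -> float:
    """Estimate the time window covered by the lost WAL segments."""
    if wal_bytes_per_second <= 0:
        # No WAL is being written, so there is nothing to lose.
        return 0.0
    # Time needed to fill one segment at the given write rate.
    seconds_per_segment = WAL_SEGMENT_BYTES / wal_bytes_per_second
    if archive_timeout:
        # With activity, a segment switch is forced at least this often.
        seconds_per_segment = min(seconds_per_segment, archive_timeout)
    return segments_lost * seconds_per_segment


slow = worst_case_loss_seconds(2, 1024)  # 1 KiB/s of WAL, no timeout
capped = worst_case_loss_seconds(2, 1024, archive_timeout=300)
print(slow, capped)
```

At 1 KiB/s a single segment takes 16384 seconds (about four and a half hours) to fill, which is how a lightly loaded cluster reaches the half-day figure raised in the question. Setting `archive_timeout` bounds the window only while transactions are still arriving.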