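Pavel Trukhanov, "Monitoring Postgres with USE and RED"
There are two approaches to performance monitoring: Brendan Gregg's USE (Utilization, Saturation, Errors) and Tom Wilkie's RED (Requests, Errors, Duration). In this talk I want to discuss how these methods guided us, and continue to guide us, as we implement Postgres monitoring at okmeter.io.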
. Okmeter. , , Postgres , . , , , , , USE RED , Postgres.
, . , .
, , , , . - . performance - , , , , .
, , , Postgres – , , . , - , , , . - , . - , , .
USE. , . , , . , , saturation , .
, ? , pg_stat_activity . ? , . , - , . , , . - , - , , .
? «», CPU Usage, , iostat – . , . , , . , , . .
, , , Postgres. , Postgres . . , , . Data Egret. , .
- ?
. . , Postgres , connection connection .
. . , . – - . , , . , , , .
: « ?». , SpinLock - , , , . CPU usage , .
– . , , , , - , , , , - .
? , capacity. 100 %, , , . .
. , . , . . , . , . . .
. - , capacity. , capacity ? . . saturation, . , . , .
Postgres.
pg_stat_activity. - . , . . : 300 connection . , - . , , - .
, . , , . - , , capacity , . . , Postgres max connections.
, state connection, , , idle, . . connection , . - idle in transaction. , , . active, - .
, , . , . ? . , - , . – pool connections, – , , , , . – , . - : locks - .
, , , .
- , , active 5 % connections. 95 % . . , .
, . , connections .
?
, . ? 100 connections, max connections , setting’, . , . , 100 %. , – . - . , . - , - .
saturation, util ? Saturation , utilization 100 % . , , , utilization 100? , .
, , CPU usage , load avarage . , 100 %, saturation . Load avarage — saturation, - . runnable , . . , , , .
, CPU usage . ? . load avarage. Load avarage , . , - . . response .
. - – idle in transaction.
. . - , . saturation .
idle. max connections, . , . -.
, select’ pg_stat_activity connections, waiting try. . . active state, - , -. waiting.
, . utilization connection pool 100 %.
, .
waiting ? . , - saturation , . . stack Postgres, , - - . .
– locks. , lock. , locks - , , connections. , locks.
. . . - lock , .
lock – space , – . , , lock . , , connections, locks, — saturation lock.
Postgres , connection . TCP-. TCP-. Post master . , , , «reset». time wait .
? , connections .
connections .
, connection pool . , , , , . ? - . ? -, . connections 5 000. Postgres . ? - connections. , , .
TCP . time wait, , - Postgres - , .
, connect? postmaster , connections backlog list . , search, backlog 100. . 100 %. – , - – saturation. – .
, backlog , reset.
, . Postgres , TCP «».
RED, USE? DBA, , , , - . , - . - , . . , Postgres .
RED, , , , :
- ,
- ,
- .
Postgres. , . , - . . - , .
rollbacks, , 6 , , , , , search , . . , - .
, RED . , . ? , . , , . , .
queries . - - . 8 , .
, - . . select , .
. , - , . . - . . - . , . . . : « , », , .
, . pg_stat_statements , . . , , . . , . – . . , , - , , . .
slow log. Slow log – durations . , . . , , - , .
, . , - , .
. , - . . , , . – , .
. - .
, , - . - , .
, . , . , , . .
, . USE, RED, ad-hoc , ad-hoc tools - , , , , .
.
Postgres, USE, RED ? . . .
Okmeter, . , - , . , , , , . , - , USE, RED. , . , , , saturation . , , , saturation . , . , - . , , , . , , .
! ! , 4- .
4 – USE RED. , USE, durations. errors . RED , requests durations. - , USE RED . . . - . , , .
– instance.
, ? – . – , requests . .
, !
! . – , - , , . , , . .? . . ?
, . , . , . . , , , , , USE . , , , , , selects, , , requests . , requests .
, , , , ?
. , . , , . , . . , . . . , . - , , . . , queries . - . , .
, , Postgres . , . , .
! , instance Postgres - . , ? , BD .
. – . , , , , , . , , - . , . .
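The second way we deal with this is optimization. We optimize our own work. In practice, Okmeter queries these views periodically but infrequently: once a minute.
So it isn't real time?
What counts as real time is a hard question. Let's discuss that separately. But the load is bounded by the number of queries you issue. These queries are not heavy at all, and there are only a few dozen of them. Even if you polled, in some sense, closer to real time than once a minute, the load would still be very limited. Here is an example of how many queries are being sent to the database: several thousand. So even if you polled those few dozen objects once a second, it would still be only a small fraction.
Got it, thanks!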
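The back-of-the-envelope argument in the last answer can be sketched in a few lines of Python. This is only an illustration: the talk says "a few dozen" monitoring queries and "several thousand" application queries, so the concrete values below (30 monitoring queries per poll, 3000 application queries per second) and the function name `monitoring_share` are assumptions, not figures from the talk. The sketch also counts queries only, on the speaker's premise that the monitoring queries are not heavy.

```python
def monitoring_share(monitor_queries_per_poll, poll_interval_s, app_queries_per_s):
    """Fraction of all queries hitting the database that come from monitoring."""
    # Monitoring queries per second, averaged over the polling interval.
    monitor_qps = monitor_queries_per_poll / poll_interval_s
    return monitor_qps / (app_queries_per_s + monitor_qps)


# Assumed values for illustration only (not taken from the talk).
MONITOR_QUERIES = 30   # "a few dozen" queries against the statistics views
APP_QPS = 3000.0       # "several thousand" application queries, assumed per second

for interval in (60.0, 1.0):  # once a minute vs. once a second
    share = monitoring_share(MONITOR_QUERIES, interval, APP_QPS)
    print(f"poll every {interval:>4.0f}s: {share:.4%} of all queries")
```

Under these assumed numbers, polling once a second still accounts for under 1% of the query count, which matches the speaker's point that the polling interval is not what limits the approach.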