Re: Increase Vacuum ring buffer.

Lists: pgsql-hackers
From: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Increase Vacuum ring buffer.
Date: 2017-07-18 10:09:41
Message-ID: 8737e9bddb82501da1134f021bf4929a@postgrespro.ru

Good day, every one.

I investigated autovacuum performance and found that it suffers a lot
from the small ring buffer. It suffers in the same way the bulk writer
suffered before Tom Lane's commit 6382448cf96:

> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> 2009-06-23 00:04:28
> For bulk write operations (eg COPY IN), use a ring buffer of 16MB
> instead of the 256KB limit originally enforced by a patch committed
> 2008-11-06. Per recent test results, the smaller size resulted in an
> undesirable decrease in bulk data loading speed, due to COPY
> processing frequently getting blocked for WAL flushing. This area
> might need more tweaking later, but this setting seems to be good
> enough for 8.4.

It is especially noticeable when the database doesn't fit in
shared_buffers but fits into the OS file cache, and data is intensively
updated (i.e. OLTP load). In this scenario, autovacuum with the current
256kB (32 pages) ring buffer takes 3-10 times longer than with a ring
buffer increased to 16MB.

I've tested with a synthetic load, with 256MB or 1GB shared buffers and
2-6GB (with indices) tables, with different load factors and with/without
secondary indices on updated columns. Tables were randomly updated with
HOT and non-HOT updates. Times before/after the buffer increase
(depending on load) were 7500sec/1200sec and 75000sec/11500sec, so the
benefit is consistently reproducible.

I didn't test cases where the database doesn't fit in the OS file cache.
Probably the benefit will be smaller in that case, because more time
will be spent on disk reads.
I also didn't test OLAP load intensively. I've seen once that the
increased buffer slightly slows scanning of an almost immutable huge
table, perhaps because of decreased CPU cache locality. But given that
such a scan is already fast, and autovacuum of an "almost immutable
table" runs rarely, I don't think it is very important.

Initially I wanted to make the BAS_BULKWRITE and BAS_VACUUM ring sizes
configurable, but after testing I don't see much gain from increasing
the ring buffer above 16MB. So I propose just a one-line change.

With regards,
--
Sokolov Yura aka funny_falcon
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company

Attachment Content-Type Size
0001-Set-vacuum-ring-buffer-16MB.patch text/x-diff 1.0 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-20 14:59:28
Message-ID: CA+TgmobCq4pXF7OoUxNcVjw6h66Fjh7XLfR8rLsf4UkdGXFvCw@mail.gmail.com

On Tue, Jul 18, 2017 at 6:09 AM, Sokolov Yura
<funny(dot)falcon(at)postgrespro(dot)ru> wrote:
> I investigated autovacuum performance, and found that it suffers a lot
> from small ring buffer. It suffers in a same way bulk writer suffered
> before Tom Lane's commit 6382448cf96:
>
>> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> 2009-06-23 00:04:28
>> For bulk write operations (eg COPY IN), use a ring buffer of 16MB
>> instead of the 256KB limit originally enforced by a patch committed
>> 2008-11-06. Per recent test results, the smaller size resulted in an
>> undesirable decrease in bulk data loading speed, due to COPY
>> processing frequently getting blocked for WAL flushing. This area
>> might need more tweaking later, but this setting seems to be good
>> enough for 8.4.
>
> It is especially noticable when database doesn't fit in shared_buffers
> but fit into OS file cache, and data is intensively updated (ie OLTP
> load). In this scenario autovacuum with current 256kB (32 pages) ring
> buffer lasts 3-10 times longer than with increased to 16MB ring buffer.
>
> I've tested with synthetic load with 256MB or 1GB shared buffers and
> 2-6GB (with indices) tables, with different load factor and with/without
> secondary indices on updated columns. Table were randomly updated with
> hot and non-hot updates. Times before/after buffer increase (depending
> on load) were 7500sec/1200sec, 75000sec/11500sec. So benefit is
> consistently reproducible.
>
> I didn't tested cases when database doesn't fit OS file cache. Probably
> in this case benefit will be smaller cause more time will be spent in
> disk read.
> I didn't test intensively OLAP load. I've seen once that increased
> buffer slows a bit scanning almost immutable huge table, perhaps cause
> of decreased CPU cache locality. But given such scan is already fast,
> and autovacuum of "almost immutable table" runs rarely, I don't think
> it is very important.
>
> Initially I wanted to make BAS_BULKWRITE and BAS_VACUUM ring sizes
> configurable, but after testing I don't see much gain from increasing
> ring buffer above 16MB. So I propose just 1 line change.

I think the question for this patch is "so, why didn't we do it this
way originally?".

It's no secret that making the ring buffer larger will improve
performance -- in fact, not having a ring buffer at all would improve
performance even more. But it would also increase the likelihood that
the background work of vacuum would impact the performance of
foreground operations, which is already a pretty serious problem that
we probably don't want to make worse. I'm not certain what the right
decision is here, but I think that a careful analysis of possible
downsides is needed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-20 15:09:49
Message-ID: CAGTBQpY7HQrbizNOxJsx-s=GAvWii0qug4d2SchqfQK4-0E5rg@mail.gmail.com

On Thu, Jul 20, 2017 at 11:59 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Jul 18, 2017 at 6:09 AM, Sokolov Yura
> <funny(dot)falcon(at)postgrespro(dot)ru> wrote:
>> I investigated autovacuum performance, and found that it suffers a lot
>> from small ring buffer. It suffers in a same way bulk writer suffered
>> before Tom Lane's commit 6382448cf96:
>>
>>> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> 2009-06-23 00:04:28
>>> For bulk write operations (eg COPY IN), use a ring buffer of 16MB
>>> instead of the 256KB limit originally enforced by a patch committed
>>> 2008-11-06. Per recent test results, the smaller size resulted in an
>>> undesirable decrease in bulk data loading speed, due to COPY
>>> processing frequently getting blocked for WAL flushing. This area
>>> might need more tweaking later, but this setting seems to be good
>>> enough for 8.4.
>>
>> It is especially noticable when database doesn't fit in shared_buffers
>> but fit into OS file cache, and data is intensively updated (ie OLTP
>> load). In this scenario autovacuum with current 256kB (32 pages) ring
>> buffer lasts 3-10 times longer than with increased to 16MB ring buffer.
>>
>> I've tested with synthetic load with 256MB or 1GB shared buffers and
>> 2-6GB (with indices) tables, with different load factor and with/without
>> secondary indices on updated columns. Table were randomly updated with
>> hot and non-hot updates. Times before/after buffer increase (depending
>> on load) were 7500sec/1200sec, 75000sec/11500sec. So benefit is
>> consistently reproducible.
>>
>> I didn't tested cases when database doesn't fit OS file cache. Probably
>> in this case benefit will be smaller cause more time will be spent in
>> disk read.
>> I didn't test intensively OLAP load. I've seen once that increased
>> buffer slows a bit scanning almost immutable huge table, perhaps cause
>> of decreased CPU cache locality. But given such scan is already fast,
>> and autovacuum of "almost immutable table" runs rarely, I don't think
>> it is very important.
>>
>> Initially I wanted to make BAS_BULKWRITE and BAS_VACUUM ring sizes
>> configurable, but after testing I don't see much gain from increasing
>> ring buffer above 16MB. So I propose just 1 line change.
>
> I think the question for this patch is "so, why didn't we do it this
> way originally?".
>
> It's no secret that making the ring buffer larger will improve
> performance -- in fact, not having a ring buffer at all would improve
> performance even more. But it would also increase the likelihood that
> the background work of vacuum would impact the performance of
> foreground operations, which is already a pretty serious problem that
> we probably don't want to make worse. I'm not certain what the right
> decision is here, but I think that a careful analysis of possible
> downsides is needed.

IIRC, originally, the default shared_buffers setting was tiny.


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-20 15:51:55
Message-ID: CA+Tgmob1TN80rCUUSBc3Sm_gYTcCu1HEWkVn4UieMx65zmZb0Q@mail.gmail.com

On Thu, Jul 20, 2017 at 11:09 AM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
>> It's no secret that making the ring buffer larger will improve
>> performance -- in fact, not having a ring buffer at all would improve
>> performance even more. But it would also increase the likelihood that
>> the background work of vacuum would impact the performance of
>> foreground operations, which is already a pretty serious problem that
>> we probably don't want to make worse. I'm not certain what the right
>> decision is here, but I think that a careful analysis of possible
>> downsides is needed.
>
> IIRC, originally, the default shared_buffers settings was tiny.

It is true that we increased the default shared_buffers value from
32MB to 128MB in f358428280953643313ee7756e0a8b8ccfde7660, but it's
also true ring buffers are capped at 1/8th of shared_buffers
regardless of anything else, so I don't think that's the explanation
here. Even if that weren't the case, how would a 4x increase in the
default size of shared_buffers (which is probably the most-commonly
changed GUC of any that PostgreSQL has) justify a 64x increase in the
size of the ring buffer?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-20 16:04:10
Message-ID: 3773.1500566650@sss.pgh.pa.us

Claudio Freire <klaussfreire(at)gmail(dot)com> writes:
> On Thu, Jul 20, 2017 at 11:59 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> I think the question for this patch is "so, why didn't we do it this
>> way originally?".
>>
>> It's no secret that making the ring buffer larger will improve
>> performance -- in fact, not having a ring buffer at all would improve
>> performance even more. But it would also increase the likelihood that
>> the background work of vacuum would impact the performance of
>> foreground operations, which is already a pretty serious problem that
>> we probably don't want to make worse. I'm not certain what the right
>> decision is here, but I think that a careful analysis of possible
>> downsides is needed.

> IIRC, originally, the default shared_buffers settings was tiny.

At the time we set the ring buffer size to 256K, the maximum
shared_buffers that initdb would configure was 32MB; and you often didn't
get that much due to SHMMAX. Now of course it's 128MB, and you'll pretty
much always get that. So there's certainly room to argue that it's time
to increase vacuum's ring buffer size, but that line of argument doesn't
justify more than ~10X increase at most.

Like Robert, I'm afraid of changing this number in a vacuum (ahem).
If you've got the default number of autovacuum workers going (3), you'd
have them thrashing a total of 3/8ths of shared memory by default, which
seems like a lot. We do need to look at the impact on foreground
processing, and not just at the speed of vacuum itself.

One idea for addressing this would be to raise the max values in the
switch, but tighten the fraction-of-shared-buffers limit just below.
I wouldn't have any objection to a 16MB ring buffer for vacuum when
it is coming out of a 1GB arena ... it just seems like a rather large
fraction of 128MB to give to a background process, especially to each
of several background processes.

Maybe the fraction-of-shared-buffers shouldn't be one size fits all,
but a different limit for each case?

regards, tom lane


From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-20 16:08:55
Message-ID: CAGTBQpZyB3YQDgO-zmMD9kcsdYSorWnNrfGzyL5Sd04RC-L5jw@mail.gmail.com

On Thu, Jul 20, 2017 at 12:51 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Jul 20, 2017 at 11:09 AM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
>>> It's no secret that making the ring buffer larger will improve
>>> performance -- in fact, not having a ring buffer at all would improve
>>> performance even more. But it would also increase the likelihood that
>>> the background work of vacuum would impact the performance of
>>> foreground operations, which is already a pretty serious problem that
>>> we probably don't want to make worse. I'm not certain what the right
>>> decision is here, but I think that a careful analysis of possible
>>> downsides is needed.
>>
>> IIRC, originally, the default shared_buffers settings was tiny.
>
> It is true that we increased the default shared_buffers value from
> 32MB to 128MB in f358428280953643313ee7756e0a8b8ccfde7660, but it's
> also true ring buffers are capped at 1/8th of shared_buffers
> regardless of anything else, so I don't think that's the explanation
> here. Even if that weren't the case, how would a 4x increase in the
> default size of shared_buffers (which is probably the most-commonly
> changed GUC of any that PostgreSQL has) justify a 64x increase in the
> size of the ring buffer?

I'm theorizing here, because I've not been involved in any of those
decisions. But I have been stracing and checking on vacuum quite
frequently lately, so my 2 cents:

The 4x increase in shared_buffers acknowledges increases in available
host memory over the years. It's not just about how much of
shared_buffers is dedicated to the ring buffer, but also whether we
can reasonably expect the whole ring to remain in the OS cache while
it's getting dirtied.

Vacuum will almost always dirty pages once and never again, and
flushing dirty pages back to the OS cache ASAP helps avoid a
read-modify-write cycle if the page didn't leave the OS cache. That's
more likely to happen with smaller rings than with bigger rings. But
as memory increases, the ring can be made bigger without fear of it
falling from the OS cache prematurely.

So, the 64x increase may be justifiable in absolute terms: it's not
unlikely that a 16MB buffer will be evicted from the OS cache before
vacuum is done with it, even in heavily throttled vacuums. Memory
pressure on the host would have to be insane to cause that, in modern
systems with GBs of RAM. That might not have been true in the early
days.

Now, whether autovacuum would suck up a big portion of shared_buffers
is another matter. Perhaps the ring buffer could tune itself to
whatever limit seems comfortable in that regard, the way it does with
other GUCs (like cost_limit): divide it among the number of workers?


From: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-20 16:16:44
Message-ID: 867b83703644720cb2a7f22e557ffc17@postgrespro.ru

On 2017-07-20 17:59, Robert Haas wrote:
> On Tue, Jul 18, 2017 at 6:09 AM, Sokolov Yura
> <funny(dot)falcon(at)postgrespro(dot)ru> wrote:
>> I investigated autovacuum performance, and found that it suffers a lot
>> from small ring buffer. It suffers in a same way bulk writer suffered
>> before Tom Lane's commit 6382448cf96:
>>
>>> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> 2009-06-23 00:04:28
>>> For bulk write operations (eg COPY IN), use a ring buffer of 16MB
>>> instead of the 256KB limit originally enforced by a patch committed
>>> 2008-11-06. Per recent test results, the smaller size resulted in an
>>> undesirable decrease in bulk data loading speed, due to COPY
>>> processing frequently getting blocked for WAL flushing. This area
>>> might need more tweaking later, but this setting seems to be good
>>> enough for 8.4.
>>
>> It is especially noticable when database doesn't fit in shared_buffers
>> but fit into OS file cache, and data is intensively updated (ie OLTP
>> load). In this scenario autovacuum with current 256kB (32 pages) ring
>> buffer lasts 3-10 times longer than with increased to 16MB ring
>> buffer.
>>
>> I've tested with synthetic load with 256MB or 1GB shared buffers and
>> 2-6GB (with indices) tables, with different load factor and
>> with/without
>> secondary indices on updated columns. Table were randomly updated with
>> hot and non-hot updates. Times before/after buffer increase (depending
>> on load) were 7500sec/1200sec, 75000sec/11500sec. So benefit is
>> consistently reproducible.
>>
>> I didn't tested cases when database doesn't fit OS file cache.
>> Probably
>> in this case benefit will be smaller cause more time will be spent in
>> disk read.
>> I didn't test intensively OLAP load. I've seen once that increased
>> buffer slows a bit scanning almost immutable huge table, perhaps cause
>> of decreased CPU cache locality. But given such scan is already fast,
>> and autovacuum of "almost immutable table" runs rarely, I don't think
>> it is very important.
>>
>> Initially I wanted to make BAS_BULKWRITE and BAS_VACUUM ring sizes
>> configurable, but after testing I don't see much gain from increasing
>> ring buffer above 16MB. So I propose just 1 line change.
>
> I think the question for this patch is "so, why didn't we do it this
> way originally?".
>
> It's no secret that making the ring buffer larger will improve
> performance -- in fact, not having a ring buffer at all would improve
> performance even more. But it would also increase the likelihood that
> the background work of vacuum would impact the performance of
> foreground operations, which is already a pretty serious problem that
> we probably don't want to make worse. I'm not certain what the right
> decision is here, but I think that a careful analysis of possible
> downsides is needed.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

Initially, the ring buffer was introduced for sequential scans.
Vacuum got it "for company", and before that vacuum used just one page,
so giving it 32 pages was a huge improvement:

d526575f893c1a4e05ebd Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> 2007-05-31 00:12:03
"Make large sequential scans and VACUUMs work in a limited-size "ring" of
buffers, rather than blowing out the whole shared-buffer arena. Aside from
avoiding cache spoliation, this fixes the problem that VACUUM formerly tended
to cause a WAL flush for every page it modified, because we had it hacked to
use only a single buffer."

Later, a ring buffer was added for the bulk writer, with the same 32 pages:

85e2cedf985bfecaf43a18ca Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> 2008-11-06 23:51:15
"Improve bulk-insert performance by keeping the current target buffer pinned
(but not locked, as that would risk deadlocks). Also, make it work in a small
ring of buffers to avoid having bulk inserts trash the whole buffer arena.
Robert Haas, after an idea of Simon Riggs'."

And finally, after some real-world usage, the bulk writer's buffer was
increased:

6382448cf96a9b88 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> 2009-06-23 00:04:28
"For bulk write operations (eg COPY IN), use a ring buffer of 16MB instead
of the 256KB limit originally enforced by a patch committed 2008-11-06.
Per recent test results, the smaller size resulted in an undesirable decrease
in bulk data loading speed, due to COPY processing frequently getting blocked
for WAL flushing. This area might need more tweaking later, but this setting
seems to be good enough for 8.4."

So, from my point of view, no one ever really evaluated the performance
of an increased ring buffer for vacuum.

It was discussed a year ago:
https://www.postgresql.org/message-id/flat/CA%2BTgmobmP%3DKE-z5f7-CegXMFGRbV%3DhC%2B%3DFxb2mbhpfD-ZD%3D-bA%40mail(dot)gmail(dot)com#CA+TgmobmP=KE-z5f7-CegXMFGRbV=hC+=Fxb2mbhpfD-ZD=-bA(at)mail(dot)gmail(dot)com

There was this wrong assumption of yours, Robert:
> But all that does is force the backend to write to the operating
system, which is where the real buffering happens.

But in fact, the vacuum process performs fsync! It happens because
vacuum evicts dirty pages from its ring buffer, and to evict a dirty
page, it has to be sure the WAL record about that page's modification
has been fsync-ed to WAL. Because the ring buffer is so small, vacuum
almost always has to perform the fsync by itself, and has to do it very
frequently.

With a larger ring buffer, there is a greater chance that the fsync was
already performed by the wal_writer process or another backend. And even
if vacuum does the fsync by itself, it syncs the records for more of the
modified pages in its ring, so evicting the next page is free.

It was also discussed in 2012-2013:
https://www.postgresql.org/message-id/flat/CA%2BTgmoZ3BOips7ot0tnSPO0yhKB3RUShDFoiYruoYXZDPr%3DptQ%40mail(dot)gmail(dot)com#CA+TgmoZ3BOips7ot0tnSPO0yhKB3RUShDFoiYruoYXZDPr=ptQ(at)mail(dot)gmail(dot)com

No decision was made, unfortunately.

If some fear that increasing the vacuum ring buffer will decrease
transaction performance, then why has it not been exhaustively tested?

I have no evidence that transactions suffer from autovacuum improved in
this way. Perhaps my testing was not precise enough.

People, let's collect more results! Please test this one-line change
with a load of your choice and share the results!

If improving autovacuum doesn't hurt performance much, then why should
we live with bloated databases?

If such an improvement really does hurt performance, then that should be
documented in messages on pgsql-hackers, and a comment should be put
into src/backend/storage/buffer/freelist.c .

And possibly the community will decide that it should be a GUC variable:
- if one prefers to keep their database unbloated, they can increase the
vacuum ring buffer,
- otherwise, just leave it at the "safe-but-slow" default.

With regards,
--
Sokolov Yura aka funny_falcon
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company


From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-20 16:22:05
Message-ID: CAGTBQpa_xNGgODe0adSsea4bE3nYR=wdFp+JrNnv6JAZvin+jg@mail.gmail.com

On Thu, Jul 20, 2017 at 1:08 PM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
> So, the 64x increase may be justifiable in absolute terms: it's not
> unlikely that a 16MB buffer will be evicted from the OS cache before
> vacuum is done with it, even in heavily throttled vacuums.

Sorry, that should read "It's not *likely* that a 16MB buffer will be evicted"


From: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Claudio Freire <klaussfreire(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-20 16:22:45
Message-ID: 1d87cf09e57286154a7415f183ee1f1d@postgrespro.ru

On 2017-07-20 19:04, Tom Lane wrote:
> Claudio Freire <klaussfreire(at)gmail(dot)com> writes:
>> On Thu, Jul 20, 2017 at 11:59 AM, Robert Haas <robertmhaas(at)gmail(dot)com>
>> wrote:
>>> I think the question for this patch is "so, why didn't we do it this
>>> way originally?".
>>>
>>> It's no secret that making the ring buffer larger will improve
>>> performance -- in fact, not having a ring buffer at all would improve
>>> performance even more. But it would also increase the likelihood
>>> that
>>> the background work of vacuum would impact the performance of
>>> foreground operations, which is already a pretty serious problem that
>>> we probably don't want to make worse. I'm not certain what the right
>>> decision is here, but I think that a careful analysis of possible
>>> downsides is needed.
>
>> IIRC, originally, the default shared_buffers settings was tiny.
>
> At the time we set the ring buffer size to 256K, the maximum
> shared_buffers that initdb would configure was 32MB; and you often
> didn't
> get that much due to SHMMAX. Now of course it's 128MB, and you'll
> pretty
> much always get that. So there's certainly room to argue that it's
> time
> to increase vacuum's ring buffer size, but that line of argument
> doesn't
> justify more than ~10X increase at most.
>
> Like Robert, I'm afraid of changing this number in a vacuum (ahem).
> If you've got the default number of autovacuum workers going (3), you'd
> have them thrashing a total of 3/8ths of shared memory by default,
> which
> seems like a lot. We do need to look at the impact on foreground
> processing, and not just at the speed of vacuum itself.
>
> One idea for addressing this would be to raise the max values in the
> switch, but tighten the fraction-of-shared-buffers limit just below.
> I wouldn't have any objection to a 16MB ring buffer for vacuum when
> it is coming out of a 1GB arena ... it just seems like a rather large
> fraction of 128MB to give to a background process, especially to each
> of several background processes.
>
> Maybe the fraction-of-shared-buffers shouldn't be one size fits all,
> but a different limit for each case?
>
> regards, tom lane

It could be 'min(shared_buffers/8/autovacuum_workers, 16MB)'.
People quite rarely leave shared_buffers at 128MB, and those who do
don't really care about other settings, so 5MB will be enough for
their autovacuum's ring buffer.

People who care about tuning their PostgreSQL will increase their
shared_buffers, and autovacuum will get its 16MB ring buffer.

With regards,
--
Sokolov Yura aka funny_falcon
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-20 16:52:09
Message-ID: CAMkU=1ziPhBJD-XJNGjLBcowddN3-ior3dz80dpVdqqgL342kQ@mail.gmail.com

On Thu, Jul 20, 2017 at 7:59 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> >
> > Initially I wanted to make BAS_BULKWRITE and BAS_VACUUM ring sizes
> > configurable, but after testing I don't see much gain from increasing
> > ring buffer above 16MB. So I propose just 1 line change.
>
> I think the question for this patch is "so, why didn't we do it this
> way originally?".
>
> It's no secret that making the ring buffer larger will improve
> performance -- in fact, not having a ring buffer at all would improve
> performance even more. But it would also increase the likelihood that
> the background work of vacuum would impact the performance of
> foreground operations, which is already a pretty serious problem that
> we probably don't want to make worse.

But having a very fast sequence of fdatasync calls is not terribly friendly
to the performance of the foreground operations, either.

I think the reason we didn't do it this way originally is tied to the
same reason that autovacuum_vacuum_cost_delay = 20ms by default. If we want it
to be heavily throttled, there isn't much point in using a larger ring
buffer. It is just wasted space. Maybe we could have it start out at
BAS_VACUUM's default size, then grow by one buffer every time it had to
issue a WAL sync, until it reached BAS_BULKWRITE's size where it would max
out.

Cheers,

Jeff


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-20 17:59:00
Message-ID: CA+TgmoY3Ye--95Ck49SOZaDkPnfZW5bCWjRNJ2WpqbF=L_BbqQ@mail.gmail.com

On Thu, Jul 20, 2017 at 12:16 PM, Sokolov Yura
<funny(dot)falcon(at)postgrespro(dot)ru> wrote:
> So, from my point of view, no one just evaluate performance of increased
> ring buffer for vacuum.

I think that argument is clearly incorrect. In commit
6382448cf96a9b88d418cbaf86027b63f465b5d8, which you cited, Tom even
added a note in the README file about why he didn't increase the value
for vacuum also. He knew it would have increased performance had he
also done it for BAS_VACUUM, and I knew it too, but it wasn't clear
that it was a good idea, and it's still not.

> But in fact, vacuum process performs FSYNC! It happens, cause vacuum
> evicts dirty pages from its ring buffer. And to evict dirty page, it
> has to be sure WAL record about its modification is FSYNC-ed to WAL.
> Because ring buffer is damn small, vacuum almost always have to perform
> FSYNC by itself and have to do it very frequently (cause ring is damn
> small).
>
> With greater ring buffer, there is greater chance that fsync were
> performed by wal_writer process, or other backend. And even if
> vacuum does fsync by itself, it syncs records about more modified
> pages from its ring, so evicting next page is free.

OK, but I have helped *many* customers whose problem was that vacuum
ran too fast and blew data out of the OS cache causing query response
times to go through the roof. That's a common problem. Making VACUUM
run faster will presumably make it more common. I've also run into
many customers whose problem was that vacuum ran too slowly, and generally
raising vacuum_cost_limit fixes that problem just fine. So I don't
think it's nearly as clear as you do that making VACUUM run faster is
desirable.

> If someone fears that increasing the vacuum ring buffer will decrease
> transaction performance, then why was it never exhaustively tested?

If you want something changed, it's your job to do that testing.
Asking why nobody else tested the effects of changing the thing you
want changed is like asking why nobody else wrote the patch you want
written.

> And possibly the community will decide that it should be a GUC variable:
> - if one prefers to keep the database unbloated, one could increase the
> vacuum ring buffer,
> - otherwise, just leave it at the "safe-but-slow" default.

That's a possible outcome, but I don't think this discussion is really
going anywhere unless you are willing to admit that increasing VACUUM
performance could have some downsides. If you're not willing to admit
that, there's not a lot to talk about here.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-20 19:04:05
Message-ID: 20170720190405.GM1769@tamriel.snowman.net
Lists: pgsql-hackers

Robert,

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Thu, Jul 20, 2017 at 12:16 PM, Sokolov Yura
> <funny(dot)falcon(at)postgrespro(dot)ru> wrote:
> > But in fact, the vacuum process performs fsync! It happens because
> > vacuum evicts dirty pages from its ring buffer, and to evict a dirty
> > page, it has to be sure the WAL record about the page's modification
> > has been fsync-ed to WAL. Because the ring buffer is so small, vacuum
> > almost always has to perform the fsync by itself, and has to do it
> > very frequently (because the ring is so small).
> >
> > With a greater ring buffer, there is a greater chance that the fsync
> > was already performed by the wal_writer process or another backend.
> > And even if vacuum does the fsync by itself, it syncs records about
> > more modified pages from its ring, so evicting the next page is free.
>
> OK, but I have helped *many* customers whose problem was that vacuum
> ran too fast and blew data out of the OS cache causing query response
> times to go through the roof. That's a common problem. Making VACUUM
> run faster will presumably make it more common. I've also run into
> many customers whose problem was that vacuum ran too slowly, and generally
> raising vacuum_cost_limit fixes that problem just fine. So I don't
> think it's nearly as clear as you do that making VACUUM run faster is
> desirable.

I agree that it's a common problem for VACUUM to go too fast, or for
VACUUM to go too slow, but that's really what the vacuum_cost_limit
mechanism is for.

I can see an argument that existing tuned systems which have been
expecting the small ring-buffer to help slow down VACUUM may have to be
adjusted to handle a change, though I would think that other changes
we've made might also require changes to vacuum costing, so I'm not sure
that this is particularly different in that regard.

What I don't agree with is holding off on improving VACUUM in the case
where cost delay is set to zero because we think people might be
depending on it only going so fast in that case. If the cost delay is
set to zero, then VACUUM really should be going as fast as it can and we
should welcome improvements in that area in much the same way that we
welcome performance improvements in sorting and other backend
algorithms.

> > If someone fears that increasing the vacuum ring buffer will decrease
> > transaction performance, then why was it never exhaustively tested?
>
> If you want something changed, it's your job to do that testing.
> Asking why nobody else tested the effects of changing the thing you
> want changed is like asking why nobody else wrote the patch you want
> written.

I do agree with this. Asking for others to also test is fine, but it's
the patch submitter who needs to ensure that said testing actually
happens and that results are provided to -hackers to support the change.

In particular, multiple different scenarios (DB all in shared_buffers,
DB all in OS cache, DB not able to fit in memory at all, etc) should be
tested.

> > And possibly the community will decide that it should be a GUC variable:
> > - if one prefers to keep the database unbloated, one could increase the
> > vacuum ring buffer,
> > - otherwise, just leave it at the "safe-but-slow" default.
>
> That's a possible outcome, but I don't think this discussion is really
> going anywhere unless you are willing to admit that increasing VACUUM
> performance could have some downsides. If you're not willing to admit
> that, there's not a lot to talk about here.

I'd rather we encourage people to use the existing knobs for tuning
VACUUM speed rather than adding another one that ends up being actually
only a proxy for speed. If there's a memory utilization concern here,
then having a knob for that might make sense, but it sounds like the
concern here is more about the speed and less about coming up with a
reasonable way to scale the size of the ring buffer.

Of course, I'm all for coming up with a good way to size the ring
buffer, and providing a knob if we aren't able to do so, I just don't
want to add unnecessary knobs if we don't need them.

Thanks!

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-20 19:34:59
Message-ID: CA+TgmoZvhX8mJCFHm+uLdYmpoz4sA6As1Y=g33XbKXoEwJG6=w@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jul 20, 2017 at 3:04 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> I agree that it's a common problem for VACUUM to go too fast, or for
> VACUUM to go too slow, but that's really what the vacuum_cost_limit
> mechanism is for.

I think that's a valid point. There are also other concerns here -
e.g. whether instead of adopting the patch as proposed we ought to (a)
use some smaller size, or (b) keep the size as-is but reduce the
maximum fraction of shared_buffers that can be consumed, or (c) divide
the ring buffer size through by autovacuum_max_workers. Personally,
of those approaches, I favor (b). I think a 16MB ring buffer is
probably just fine if you've got 8GB of shared_buffers but I'm
skeptical about it when you've got 128MB of shared_buffers.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-20 19:37:44
Message-ID: 20170720193744.GO1769@tamriel.snowman.net
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Thu, Jul 20, 2017 at 3:04 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > I agree that it's a common problem for VACUUM to go too fast, or for
> > VACUUM to go too slow, but that's really what the vacuum_cost_limit
> > mechanism is for.
>
> I think that's a valid point. There are also other concerns here -
> e.g. whether instead of adopting the patch as proposed we ought to (a)
> use some smaller size, or (b) keep the size as-is but reduce the
> maximum fraction of shared_buffers that can be consumed, or (c) divide
> the ring buffer size through by autovacuum_max_workers. Personally,
> of those approaches, I favor (b). I think a 16MB ring buffer is
> probably just fine if you've got 8GB of shared_buffers but I'm
> skeptical about it when you've got 128MB of shared_buffers.

Right, agreed on that and that (b) looks to be a good option there.

Thanks!

Stephen


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-20 19:51:59
Message-ID: 12790.1500580319@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> I think that's a valid point. There are also other concerns here -
> e.g. whether instead of adopting the patch as proposed we ought to (a)
> use some smaller size, or (b) keep the size as-is but reduce the
> maximum fraction of shared_buffers that can be consumed, or (c) divide
> the ring buffer size through by autovacuum_max_workers. Personally,
> of those approaches, I favor (b). I think a 16MB ring buffer is
> probably just fine if you've got 8GB of shared_buffers but I'm
> skeptical about it when you've got 128MB of shared_buffers.

WFM. I agree with *not* dividing the basic ring buffer size by
autovacuum_max_workers. If you have allocated more AV workers, I think
you expect AV to go faster, not for the workers to start fighting among
themselves.

It might, however, be reasonable for the fraction-of-shared-buffers
limitation to have something to do with autovacuum_max_workers, so that
you can't squeeze yourself out of shared_buffers if you set that number
really high. IOW, I think the upthread suggestion of
min(shared_buffers/8/autovacuum_workers, 16MB) is basically the right
idea, though we could debate the exact constants.
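The suggested rule works out as follows (an illustrative sketch; the function name is hypothetical, and the exact constants were still under debate in this thread):

```c
#include <assert.h>

#define BLCKSZ          8192                          /* 8kB pages */
#define MAX_RING_PAGES  (16 * 1024 * 1024 / BLCKSZ)   /* 16MB cap  */

/* min(shared_buffers / 8 / autovacuum_max_workers, 16MB),
 * with both quantities expressed in buffer pages. */
static int
vacuum_ring_size(int shared_buffers_pages, int autovacuum_max_workers)
{
    int pages = shared_buffers_pages / 8 / autovacuum_max_workers;

    if (pages > MAX_RING_PAGES)
        pages = MAX_RING_PAGES;
    return pages;
}
```

With the default 128MB of shared_buffers and 3 workers this yields roughly 5.3MB per worker, while at 8GB each worker hits the 16MB cap, matching the intent that all workers together cannot squeeze out more than 1/8 of shared_buffers.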

regards, tom lane


From: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-21 08:19:28
Message-ID: 029a38142f2fcd65e669680dedd23342@postgrespro.ru
Lists: pgsql-hackers

On 2017-07-20 20:59, Robert Haas wrote:
> If you want something changed, it's your job to do that testing.

I had been testing for two weeks before I wrote to pgsql-hackers, and I
gave some high-level results in the first letter.
I haven't noticed any transaction slowdown from the increased vacuum
ring buffer.
But:
- my workload could be too synthetic,
- I measured total transactions performed, and sampled the times of
random ones.
So my measurements were probably not exhaustive. And I certainly could
not imagine the whole set of interesting workloads. Given that every
test runs for at least 5 hours (and, in fact, the test on master ran for
20 hours, because autovacuum doesn't finish faster on that synthetic
workload), I would spend at least half a year testing all possible
workloads.

That is why I asked the community to test it on workloads people
consider interesting.

I can run measurements myself, if someone tells me which workload he
wants tested.

Two previous discussions of this topic were killed without any evidence
of testing at all, only with theoretical doubts. Is that fair? Why is
"probably it is bad" better than "probably it is good"?

You are one of the project's leaders. I know it is not your job to test
every tiny change a schoolboy proposes. But there are a lot of people
here who wait for your word. Instead of cooling the rush and closing
discussions, you could just say: "please, someone test it with that
particular workload".

> I don't think this discussion is really going anywhere unless you are
> willing to admit that increasing VACUUM performance could have some
> downsides. If you're not willing to admit that, there's not a lot to
> talk about here.

I can admit many things. I've seen how autovacuum drops pgbench
performance from 100000tps down to 1500tps because of contention on
CLogControlLock. (BTW, my LWLock patch improves that to 3000tps.)

But that is not a reason autovacuum should be intentionally slow. As
Stephen Frost said, that is what vacuum_cost_limit and
vacuum_cost_delay are for. (And, certainly, it is a reason to improve
CLog and SLRU.)

> OK, but I have helped *many* customers whose problem was that vacuum
> ran too fast and blew data out of the OS cache causing query response
> times to go through the roof.

When there is no garbage, increasing the autovacuum ring buffer changes
almost nothing. When there is garbage, the current small ring buffer
leads to a storm of fsyncs. Frequent fsyncs slow down an HDD a lot, and
then the HDD isn't capable of satisfying queries and refilling the OS
cache. Will you admit that?

> I've also run into many customers whose problem that vacuum ran too
> slowly, and generally raising vacuum_cost_limit fixes that problem just
> fine.

Probably with an increased ring buffer there is no need to raise
vacuum_cost_limit. Will you admit that?

With regards,
--
Sokolov Yura aka funny_falcon
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company


From: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers(at)postgresql(dot)org, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-21 15:47:05
Message-ID: 69da6b82faceaa8a0e4e07892e5fb6ca@postgrespro.ru
Lists: pgsql-hackers

On 2017-07-20 22:51, Tom Lane wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> I think that's a valid point. There are also other concerns here -
>> e.g. whether instead of adopting the patch as proposed we ought to (a)
>> use some smaller size, or (b) keep the size as-is but reduce the
>> maximum fraction of shared_buffers that can be consumed, or (c) divide
>> the ring buffer size through by autovacuum_max_workers. Personally,
>> of those approaches, I favor (b). I think a 16MB ring buffer is
>> probably just fine if you've got 8GB of shared_buffers but I'm
>> skeptical about it when you've got 128MB of shared_buffers.
>
> WFM. I agree with *not* dividing the basic ring buffer size by
> autovacuum_max_workers. If you have allocated more AV workers, I think
> you expect AV to go faster, not for the workers to start fighting among
> themselves.
>
> It might, however, be reasonable for the fraction-of-shared-buffers
> limitation to have something to do with autovacuum_max_workers, so that
> you can't squeeze yourself out of shared_buffers if you set that number
> really high. IOW, I think the upthread suggestion of
> min(shared_buffers/8/autovacuum_workers, 16MB) is basically the right
> idea, though we could debate the exact constants.
>
> regards, tom lane

Attached is a version with min(shared_buffers/8/autovacuum_workers, 16MB).

With regards
--
Sokolov Yura aka funny_falcon
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company

Attachment Content-Type Size
0001-Set-total-vacuum-ring-buffer-16MB-v2.patch text/x-diff 2.1 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-21 16:32:22
Message-ID: CA+TgmoamG3MZyqT9XFvy9J4zw=C-iiK=ZhsLbp8vKg2Ui6rQJA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jul 21, 2017 at 4:19 AM, Sokolov Yura
<funny(dot)falcon(at)postgrespro(dot)ru> wrote:
> You are one of the project's leaders. I know it is not your job to test
> every tiny change a schoolboy proposes. But there are a lot of people
> here who wait for your word. Instead of cooling the rush and closing
> discussions, you could just say: "please, someone test it with that
> particular workload".

I had no intention of cooling rush and closing discussions. I was
trying to help you understand what points you needed to address in
order to have a chance of getting this committed. I feel like I came
into this discussion to try to help you make some progress on this
issue, and instead of appreciating that, you're making me the bad guy.

> When there is no garbage, increasing the autovacuum ring buffer changes
> almost nothing. When there is garbage, the current small ring buffer
> leads to a storm of fsyncs. Frequent fsyncs slow down an HDD a lot, and
> then the HDD isn't capable of satisfying queries and refilling the OS
> cache. Will you admit that?

I haven't tested it, but it sounds believable.

>> I've also run into many customers whose problem was that vacuum ran
>> too slowly, and generally raising vacuum_cost_limit fixes that problem
>> just fine.
>
> Probably with an increased ring buffer there is no need to raise
> vacuum_cost_limit. Will you admit that?

No, I definitely won't admit that. With default settings autovacuum
won't write more than ~2.3MB/s if I remember the math correctly, so if
you've got a 1TB table you're probably going to need a bigger value.
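That ceiling can be worked out roughly as follows (a back-of-envelope sketch under assumed default costs: vacuum_cost_limit = 200, a 20ms delay, page-miss cost 10, page-dirty cost 20; which cost components each page actually incurs varies, so the exact figure Robert quotes depends on the assumptions):

```c
#include <assert.h>

/* Rough ceiling on autovacuum's write rate: each delay round permits
 * cost_limit units of work; a page that is both read (miss) and
 * dirtied charges page_miss_cost + page_dirty_cost units. */
static double
max_dirty_mb_per_sec(int cost_limit, double delay_ms,
                     int page_miss_cost, int page_dirty_cost)
{
    double rounds_per_sec = 1000.0 / delay_ms;
    double pages_per_round =
        (double) cost_limit / (page_miss_cost + page_dirty_cost);

    /* pages/sec times 8kB per page, converted to MB/sec */
    return rounds_per_sec * pages_per_round * 8192.0 / (1024.0 * 1024.0);
}
```

At roughly 2.6MB/s under these assumptions, a 1TB table takes on the order of 4-5 days, which is why raising vacuum_cost_limit remains necessary regardless of the ring size.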

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-21 17:41:30
Message-ID: 28346ef754737e67db261ec85a66400c@postgrespro.ru
Lists: pgsql-hackers

On 2017-07-21 19:32, Robert Haas wrote:
> On Fri, Jul 21, 2017 at 4:19 AM, Sokolov Yura
> <funny(dot)falcon(at)postgrespro(dot)ru> wrote:
>>
>> Probably with an increased ring buffer there is no need to raise
>> vacuum_cost_limit. Will you admit that?
>
> No, I definitely won't admit that. With default settings autovacuum
> won't write more than ~2.3MB/s if I remember the math correctly, so if
> you've got a 1TB table you're probably going to need a bigger value.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

I've seen the autovacuum process spending >50% of its time in fsync
(with the current ring buffer; I used autovacuum_cost_delay=2ms).
An fsync can last up to a second on an HDD if there is concurrent IO.
Even on an SSD, fsync can be quite noticeable.

But I agree that for a 1TB table autovacuum_cost_limit should still
be increased, even with a larger ring buffer.

A friend noticed that I didn't say why I bother with autovacuum.
Our customers suffer from table bloat. I made a synthetic bloat
test and started experimenting with modifying micro- and
auto-vacuum. My first attempt was to update the FSM early (in both
microvacuum and autovacuum), and to update it up to the root, not
only the lowest level.

Then I looked at an strace of the autovacuum process and noticed a
storm of fsyncs. I captured backtraces with gdb, breaking on fsync,
and found that evicting dirty pages from the small ring buffer is
the reason.

After some experiments combining my "early FSM update" with the
ring buffer size, I understood that increasing the ring buffer gives
most of the benefit: autovacuum runs faster, and bloat is greatly
reduced. In an extreme case, a 400MB table bloats to 17GB on master,
but only to 5GB with the faster autovacuum.

I used custom scripts, and that is why my statistics are not
complete. Still, I didn't find any performance reduction. In fact,
it looks like tests with the "larger autovacuum ring" did more
queries per hour than tests against master.

I will run pgbench over the weekend, so latencies and percentiles
will be collected.

With regards,
--
Sokolov Yura aka funny_falcon
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company


From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-21 21:27:42
Message-ID: CAGTBQpZsAGWr8bcLZAqLR87F=B9G4vGSkGXUrqQ9hqhNkYG=ZQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jul 21, 2017 at 2:41 PM, Sokolov Yura
<funny(dot)falcon(at)postgrespro(dot)ru> wrote:
>
> A friend noticed that I didn't say why I bother with autovacuum.
> Our customers suffer from table bloat. I made a synthetic bloat
> test and started experimenting with modifying micro- and
> auto-vacuum. My first attempt was to update the FSM early (in both
> microvacuum and autovacuum), and to update it up to the root, not
> only the lowest level.

This FSM thing is probably not a bad idea as well.

We're forced to run regular manual vacuums because for some tables
autovacuum never seems to be enough, no matter how it's configured,
mostly because it gets canceled all the time. These are high-churn,
huge tables, so vacuuming them takes hours or days, and there's always
someone with a conflicting lock at some point who ends up canceling
the autovacuum task.

The above paragraph triggered me to go check, and it seems in those
cases the FSM never gets vacuumed. That's probably not a good thing,
but I don't see how to vacuum the FSM after a cancel. So vacuuming the
FSM from time to time during long-running vacuums seems like a good
idea at this point.


From: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-24 09:37:25
Message-ID: 39c025edf2f8811e097f86b872562893@postgrespro.ru
Lists: pgsql-hackers

Good day, Claudio

On 2017-07-22 00:27, Claudio Freire wrote:
> On Fri, Jul 21, 2017 at 2:41 PM, Sokolov Yura
> <funny(dot)falcon(at)postgrespro(dot)ru> wrote:
>>
>> A friend noticed that I didn't say why I bother with autovacuum.
>> Our customers suffer from table bloat. I made a synthetic bloat
>> test and started experimenting with modifying micro- and
>> auto-vacuum. My first attempt was to update the FSM early (in both
>> microvacuum and autovacuum), and to update it up to the root, not
>> only the lowest level.
>
> This FSM thing is probably not a bad idea as well.
>
> We're forced to run regular manual vacuums because for some tables
> autovacuum never seems to be enough, no matter how it's configured,
> mostly because it gets canceled all the time. These are high-churn,
> huge tables, so vacuuming them takes hours or days, and there's always
> someone with a conflicting lock at some point who ends up canceling
> the autovacuum task.
>
> The above paragraph triggered me to go check, and it seems in those
> cases the FSM never gets vacuumed. That's probably not a good thing,
> but I don't see how to vacuum the FSM after a cancel. So vacuuming the
> FSM from time to time during long-running vacuums seems like a good
> idea at this point.

The attached patch changes the FSM update: instead of updating only the
lowest level, it propagates the space increase up to the root.

It slows autovacuum a bit, which is why I didn't propose it together
with the ring buffer increase.
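The "propagate up to root" idea can be modeled with a small binary max-tree (an illustrative sketch only; real FSM pages in PostgreSQL's freespace.c are organized differently):

```c
#include <assert.h>

/* Binary max-tree over per-heap-page free space, stored in an array:
 * node i has children 2i+1 and 2i+2; the last FSM_LEAVES slots are
 * leaves holding each page's free space. */
enum { FSM_LEAVES = 8 };
static int fsm[2 * FSM_LEAVES - 1];

/* Record new free space on a leaf and push the change up toward the
 * root, so a search starting at the root sees it immediately; stop
 * early once an ancestor is already correct. */
static void
fsm_set_and_propagate(int leaf, int avail)
{
    int i = (FSM_LEAVES - 1) + leaf;

    fsm[i] = avail;
    while (i > 0)
    {
        int parent = (i - 1) / 2;
        int l = fsm[2 * parent + 1];
        int r = fsm[2 * parent + 2];
        int best = l > r ? l : r;

        if (fsm[parent] == best)
            break;              /* ancestors already consistent */
        fsm[parent] = best;
        i = parent;
    }
}
```

Updating only the lowest level corresponds to writing the leaf and skipping the loop, which is why space freed by vacuum can stay invisible to searches that descend from the root.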

--
Sokolov Yura aka funny_falcon
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company

Attachment Content-Type Size
0001-fsm-vacuum-write-increasing-of-free-space-on-upper-l.patch text/x-diff 7.5 KB

From: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-24 09:41:08
Message-ID: c5d9f987c929cc8328be6f052fc8ac08@postgrespro.ru
Lists: pgsql-hackers

On 2017-07-21 20:41, Sokolov Yura wrote:
> On 2017-07-21 19:32, Robert Haas wrote:
>> On Fri, Jul 21, 2017 at 4:19 AM, Sokolov Yura
>> <funny(dot)falcon(at)postgrespro(dot)ru> wrote:
>>>
>>> Probably with an increased ring buffer there is no need to raise
>>> vacuum_cost_limit. Will you admit that?
>>
>> No, I definitely won't admit that. With default settings autovacuum
>> won't write more than ~2.3MB/s if I remember the math correctly, so if
>> you've got a 1TB table you're probably going to need a bigger value.
>>
>> --
>> Robert Haas
>> EnterpriseDB: http://www.enterprisedb.com
>> The Enterprise PostgreSQL Company
>
> I've seen the autovacuum process spending >50% of its time in fsync
> (with the current ring buffer; I used autovacuum_cost_delay=2ms).
> An fsync can last up to a second on an HDD if there is concurrent IO.
> Even on an SSD, fsync can be quite noticeable.
>
> But I agree that for a 1TB table autovacuum_cost_limit should still
> be increased, even with a larger ring buffer.
>
>
> A friend noticed that I didn't say why I bother with autovacuum.
> Our customers suffer from table bloat. I made a synthetic bloat
> test and started experimenting with modifying micro- and
> auto-vacuum. My first attempt was to update the FSM early (in both
> microvacuum and autovacuum), and to update it up to the root, not
> only the lowest level.
>
> Then I looked at an strace of the autovacuum process and noticed a
> storm of fsyncs. I captured backtraces with gdb, breaking on fsync,
> and found that evicting dirty pages from the small ring buffer is
> the reason.
>
> After some experiments combining my "early FSM update" with the
> ring buffer size, I understood that increasing the ring buffer gives
> most of the benefit: autovacuum runs faster, and bloat is greatly
> reduced. In an extreme case, a 400MB table bloats to 17GB on master,
> but only to 5GB with the faster autovacuum.
>
> I used custom scripts, and that is why my statistics are not
> complete. Still, I didn't find any performance reduction. In fact,
> it looks like tests with the "larger autovacuum ring" did more
> queries per hour than tests against master.
>
> I will run pgbench over the weekend, so latencies and percentiles
> will be collected.
>
> With regards,
> --
> Sokolov Yura aka funny_falcon
> Postgres Professional: https://postgrespro.ru
> The Russian Postgres Company

The default pgbench script wasn't able to trigger autovacuum of the
pgbench_accounts table in 8 hours (scale 400, 40 clients, 900tps
average), so the weekend testing was not useful.

I will re-run with a custom script over the next day or two.

--
Sokolov Yura aka funny_falcon
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company


From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-24 16:11:14
Message-ID: CAGTBQpbM-m3xcxEAaCLAJwG-L++cEkfRsHCx8Cwnqz-B-sPtWA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jul 24, 2017 at 6:37 AM, Sokolov Yura
<funny(dot)falcon(at)postgrespro(dot)ru> wrote:
> Good day, Claudio
>
>
> On 2017-07-22 00:27, Claudio Freire wrote:
>>
>> On Fri, Jul 21, 2017 at 2:41 PM, Sokolov Yura
>> <funny(dot)falcon(at)postgrespro(dot)ru> wrote:
>>>
>>>
>>> A friend noticed that I didn't say why I bother with autovacuum.
>>> Our customers suffer from table bloat. I made a synthetic bloat
>>> test and started experimenting with modifying micro- and
>>> auto-vacuum. My first attempt was to update the FSM early (in both
>>> microvacuum and autovacuum), and to update it up to the root, not
>>> only the lowest level.
>>
>>
>> This FSM thing is probably not a bad idea as well.
>>
>> We're forced to run regular manual vacuums because for some tables
>> autovacuum never seems to be enough, no matter how it's configured,
>> mostly because it gets canceled all the time. These are high-churn,
>> huge tables, so vacuuming them takes hours or days, and there's always
>> someone with a conflicting lock at some point who ends up canceling
>> the autovacuum task.
>>
>> The above paragraph triggered me to go check, and it seems in those
>> cases the FSM never gets vacuumed. That's probably not a good thing,
>> but I don't see how to vacuum the FSM after a cancel. So vacuuming the
>> FSM from time to time during long-running vacuums seems like a good
>> idea at this point.
>
>
> The attached patch changes the FSM update: instead of updating only
> the lowest level, it propagates the space increase up to the root.
>
> It slows autovacuum a bit, which is why I didn't propose it together
> with the ring buffer increase.

I was mostly thinking about something like the attached patch.

Simple, unintrusive, and shouldn't cause any noticeable slowdown.

Attachment Content-Type Size
0001-Vacuum-FSM-after-each-index-pass.patch text/x-patch 1.0 KB

From: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-24 17:10:48
Message-ID: e3e71845731b8b9437564d35ed2d9f1e@postgrespro.ru
Lists: pgsql-hackers

On 2017-07-24 19:11, Claudio Freire wrote:
> On Mon, Jul 24, 2017 at 6:37 AM, Sokolov Yura
> <funny(dot)falcon(at)postgrespro(dot)ru> wrote:
>> Good day, Claudio
>>
>>
>> On 2017-07-22 00:27, Claudio Freire wrote:
>>>
>>> On Fri, Jul 21, 2017 at 2:41 PM, Sokolov Yura
>>> <funny(dot)falcon(at)postgrespro(dot)ru> wrote:
>>>>
>>>>
>>>> A friend noticed that I didn't say why I bother with autovacuum.
>>>> Our customers suffer from table bloat. I made a synthetic bloat
>>>> test and started experimenting with modifying micro- and
>>>> auto-vacuum. My first attempt was to update the FSM early (in both
>>>> microvacuum and autovacuum), and to update it up to the root, not
>>>> only the lowest level.
>>>
>>>
>>> This FSM thing is probably not a bad idea as well.
>>>
>>> We're forced to run regular manual vacuums because for some tables
>>> autovacuum never seems to be enough, no matter how it's configured,
>>> mostly because it gets canceled all the time. These are high-churn,
>>> huge tables, so vacuuming them takes hours or days, and there's always
>>> someone with a conflicting lock at some point who ends up canceling
>>> the autovacuum task.
>>>
>>> The above paragraph triggered me to go check, and it seems in those
>>> cases the FSM never gets vacuumed. That's probably not a good thing,
>>> but I don't see how to vacuum the FSM after a cancel. So vacuuming
>>> the FSM from time to time during long-running vacuums seems like a
>>> good idea at this point.
>>
>>
>> The attached patch changes the FSM update: instead of updating only
>> the lowest level, it propagates the space increase up to the root.
>>
>> It slows autovacuum a bit, which is why I didn't propose it together
>> with the ring buffer increase.
>
> I was mostly thinking about something like the attached patch.
>
> Simple, unintrusive, and shouldn't cause any noticeable slowdown.

Your change is small, clear, and currently useful for huge tables under
high update load (until "allowing vacuum to use more than 1GB memory"
is merged).

But it still delays updating the FSM until the whole first batch of dead
tuples is cleared (i.e. all indices scanned, and all heap pages cleaned),
and on such a huge table that will take hours.

On the other hand, if "dead" tuples consume all usable item pointers
(MaxHeapTuplesPerPage ~ 290 on an 8kB page), then space that actually
exists on a page cannot be used until the "dead" tuples are converted
into "unused" tuples.
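As a back-of-the-envelope check of the ~290 figure: with the stock 8kB
block size and the usual header sizes (hard-coded below as assumptions,
not read from any server), the item-pointer limit works out as follows.

```python
# Rough check of MaxHeapTuplesPerPage for an 8kB page. Each heap tuple
# costs at least a MAXALIGN'd tuple header plus a 4-byte line pointer.
BLCKSZ = 8192        # default block size
PAGE_HEADER = 24     # SizeOfPageHeaderData
LINE_POINTER = 4     # sizeof(ItemIdData)
TUPLE_HEADER = 24    # SizeofHeapTupleHeader (23) rounded up by MAXALIGN

max_heap_tuples_per_page = (BLCKSZ - PAGE_HEADER) // (TUPLE_HEADER + LINE_POINTER)
print(max_heap_tuples_per_page)  # 291
```

So once ~291 line pointers on a page are "dead", no new tuple fits there
regardless of how much free byte-space the page still has.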

With my patch I've seen that writing the FSM before dead tuples are
cleared helps a little: while bloating is slowed somewhat by this change,
it stops only after the final cleanup of dead tuples.

--
Sokolov Yura aka funny_falcon
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-24 17:16:52
Message-ID: CAMkU=1wdhazYp6x60_qYSzhg=ohPaKkSWKZs6xoK9Ap7W3yWww@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jul 20, 2017 at 12:51 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > I think that's a valid point. There are also other concerns here -
> > e.g. whether instead of adopting the patch as proposed we ought to (a)
> > use some smaller size, or (b) keep the size as-is but reduce the
> > maximum fraction of shared_buffers that can be consumed, or (c) divide
> > the ring buffer size through by autovacuum_max_workers. Personally,
> > of those approaches, I favor (b). I think a 16MB ring buffer is
> > probably just fine if you've got 8GB of shared_buffers but I'm
> > skeptical about it when you've got 128MB of shared_buffers.
>
> WFM. I agree with *not* dividing the basic ring buffer size by
> autovacuum_max_workers. If you have allocated more AV workers, I think
> you expect AV to go faster, not for the workers to start fighting among
> themselves.
>

But fighting among themselves is just what they do regarding the
autovacuum_vacuum_cost_limit, so I don't see why it should be one way there
but different here. The reason for setting autovacuum_max_workers to N is
so that small tables aren't completely starved of vacuuming even if N-1
larger tables are already being vacuumed simultaneously. Now the small
tables get vacuumed at speed 1/N, which kind of sucks, but that is the
mechanism we currently have.

Of course just because we are in a hole with vacuum_cost_limit doesn't mean
we should dig ourselves deeper, but we are being inconsistent then.

Cheers,

Jeff


From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-24 17:20:49
Message-ID: CAGTBQpYxkbvEO3dqNdbMuvfY0FLdx1hDZebjVee52O=D5ryAbw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jul 24, 2017 at 2:10 PM, Sokolov Yura
<funny(dot)falcon(at)postgrespro(dot)ru> wrote:
> On 2017-07-24 19:11, Claudio Freire wrote:
>>
>> On Mon, Jul 24, 2017 at 6:37 AM, Sokolov Yura
>> <funny(dot)falcon(at)postgrespro(dot)ru> wrote:
>>>
>>> Good day, Claudio
>>>
>>>
>>> On 2017-07-22 00:27, Claudio Freire wrote:
>>>>
>>>>
>>>> On Fri, Jul 21, 2017 at 2:41 PM, Sokolov Yura
>>>> <funny(dot)falcon(at)postgrespro(dot)ru> wrote:
>>>>>
>>>>>
>>>>>
>>>>> My friend noticed, that I didn't said why I bother with autovacuum.
>>>>> Our customers suffers from table bloating. I've made synthetic
>>>>> bloating test, and started experiments with modifying micro- and
>>>>> auto-vacuum. My first attempts were to update FSM early (both in
>>>>> micro and autovacuum) and update it upto root, not only low level.
>>>>
>>>>
>>>>
>>>> This FSM thing is probably not a bad idea as well.
>>>>
>>>> We're forced to run regular manual vacuums because for some tables
>>>> autovacuums seems to never be enough, no matter how it's configured,
>>>> mostly because it gets canceled all the time. These are high-churn,
>>>> huge tables, so vacuuming them takes hours or days, there's always
>>>> someone with a conflicting lock at some point that ends up canceling
>>>> the autovacuum task.
>>>>
>>>> The above paragraph triggered me to go check, and it seems in those
>>>> cases the FSM never gets vacuumed. That's probably not a good thing,
>>>> but I don't see how to vacuum the FSM after a cancel. So vacuuming the
>>>> FSM from time to time during long-running vacuums seems like a good
>>>> idea at this point.
>>>
>>>
>>>
>>> Attached patch changes fsm update: instead of updating only lowest
>>> level, it propagates space increase up to root.
>>>
>>> It slows autovacuum a bit, so that I didn't propose it together with
>>> ring buffer increase.
>>
>>
>> I was mostly thinking about something like the attached patch.
>>
>> Simple, unintrusive, and shouldn't cause any noticeable slowdown.
>
>
> Your change is small, clear, and currently useful for huge tables under
> high update load (until "allowing vacuum to use more than 1GB memory"
> is merged).

In high-bloat conditions, it doesn't take long to accumulate 1GB of
dead tuples (which is about 178M tuples, btw).

The index scan takes way longer than the heap scan in that case.
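The 178M figure follows from vacuum storing one 6-byte TID per dead
tuple; a quick arithmetic check:

```python
# "1GB of dead tuples is about 178M tuples": vacuum's dead-tuple array
# holds one 6-byte ItemPointerData (TID) per dead tuple.
TID_BYTES = 6           # sizeof(ItemPointerData)
ONE_GB = 1024 ** 3
dead_tuples = ONE_GB // TID_BYTES
print(dead_tuples)      # 178956970, i.e. ~178M
```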

> But it still delays updating fsm until whole first batch of dead tuples
> cleared (ie all indices scanned, and all heap pages cleared), and on such
> huge table it will be hours.

So, true, it will get delayed considerably. But as you realized,
there's not much point in trying to vacuum the FSM sooner, since it
won't be accurate shortly afterwards anyway. Dead line pointers do use
up a fair bit of space, especially on narrow tables.

In a particular table I have that exhibits this problem, most of the
time is spent scanning the index. It performs dozens of index scans
before it's done, so it would vacuum the FSM quite often enough, even
if I were to increase the mwm setting n-fold.
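The "dozens of index scans" follows from vacuum making one full pass
over every index per maintenance_work_mem worth of dead-tuple TIDs. A
sketch with a hypothetical dead-tuple count (the 5 billion figure is an
invented example, not from the thread):

```python
import math

# One full scan of every index is needed per batch of
# maintenance_work_mem worth of 6-byte dead-tuple TIDs.
TID_BYTES = 6                    # sizeof(ItemPointerData)
mwm = 1024 ** 3                  # 1GB maintenance_work_mem
dead_tuples = 5_000_000_000      # hypothetical high-churn table

batch = mwm // TID_BYTES         # ~178.9M TIDs per pass
passes = math.ceil(dead_tuples / batch)
print(passes)                    # 28 passes over every index
```

Raising mwm n-fold divides the number of passes by n, but with churn
this heavy there are still plenty of passes at which to vacuum the FSM.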


From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-24 17:27:25
Message-ID: CAGTBQpYa_xxrSWMk308S1bQ4PE-cA+XhxXG5p9ucmeFKXT7wOA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jul 24, 2017 at 2:20 PM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
> On Mon, Jul 24, 2017 at 2:10 PM, Sokolov Yura
> <funny(dot)falcon(at)postgrespro(dot)ru> wrote:
>> On 2017-07-24 19:11, Claudio Freire wrote:
>>> I was mostly thinking about something like the attached patch.
>>>
>>> Simple, unintrusive, and shouldn't cause any noticeable slowdown.
>>
>>
>> Your change is small, clear, and currently useful for huge tables under
>> high update load (until "allowing vacuum to use more than 1GB memory"
>> is merged).
>
> In high-bloat conditions, it doesn't take long to accumulate 1GB of
> dead tuples (which is about 178M tuples, btw).
>
> The index scan takes way longer than the heap scan in that case.
>
>> But it still delays updating fsm until whole first batch of dead tuples
>> cleared (ie all indices scanned, and all heap pages cleared), and on such
>> huge table it will be hours.
>
> So, true, it will get delayed considerably. But as you realized,
> there's not much point in trying to vacuum the FSM sooner, since it
> won't be accurate shortly afterwards anyway. Dead line pointers do use
> up a fair bit of space, especially on narrow tables.
>
> In a particular table I have that exhibits this problem, most of the
> time is spent scanning the index. It performs dozens of index scans
> before it's done, so it would vacuum the FSM quite often enough, even
> if I were to increase the mwm setting n-fold.

I hate to reply to myself, but I wanted to add: in any case, the case
I'm trying to avoid is the case where the FSM *never* gets vacuumed.
That's bad. But it may not be the phenomenon you're experiencing in
your tests.


From: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-26 16:39:17
Message-ID: aca90251891edc3742254eac3f42aa90@postgrespro.ru
Lists: pgsql-hackers

On 2017-07-24 12:41, Sokolov Yura wrote:
> On 2017-07-21 20:41, Sokolov Yura wrote:
>> On 2017-07-21 19:32, Robert Haas wrote:
>>> On Fri, Jul 21, 2017 at 4:19 AM, Sokolov Yura
>>> <funny(dot)falcon(at)postgrespro(dot)ru> wrote:
>>>>
>>>> Probably with increased ring buffer there is no need in raising
>>>> vacuum_cost_limit. Will you admit it?
>>>
>>> No, I definitely won't admit that. With default settings autovacuum
>>> won't write more than ~2.3MB/s if I remember the math correctly, so
>>> if
>>> you've got a 1TB table you're probably going to need a bigger value.
>>>
>>> --
>>> Robert Haas
>>> EnterpriseDB: http://www.enterprisedb.com
>>> The Enterprise PostgreSQL Company
>>
>> I've seed autovacuum process spending >50% of its time in fsync
>> (with current ring buffer) (but I used autovacuum_cost_delay=2ms).
>> fsync could lasts up to second on hdd if there is concurrent IO.
>> Even on ssd fsync could be really noticeable.
>>
>> But, I agree that for 1TB table autovacuum_cost_limit still should
>> be increased, even with larger ring buffer.
>>
>>
>> My friend noticed, that I didn't said why I bother with autovacuum.
>> Our customers suffers from table bloating. I've made synthetic
>> bloating test, and started experiments with modifying micro- and
>> auto-vacuum. My first attempts were to update FSM early (both in
>> micro and autovacuum) and update it upto root, not only low level.
>>
>> Then I looked to strace of autovacuum process, and noticed storm
>> of fsync. I catched backtraces with gdb rooting on fsync, and
>> found that evicting dirty pages from small ring buffer it the
>> reason.
>>
>> After some experiments with combining my "early fsm update" and
>> size of ring buffer, I understood that increasing ring buffer
>> gives most of benefits: autovacuum runs faster, and bloating is
>> greatly reduced. On extreme case, 400mb table bloats to 17GB
>> on master, and only to 5GB with faster autovacuum.
>>
>> I used custom scripts, and that is why my statistic is not full.
>> Though, I didn't found performance reduction. In fact, it looks
>> like tests with "larger autovacuum ring" did more queries per hour
>> than tests against master.
>>
>> I will run pgbench for weekend, so latencies and percentiles
>> will be collected.
>>
>> With regards,
>> --
>> Sokolov Yura aka funny_falcon
>> Postgres Professional: https://postgrespro.ru
>> The Russian Postgres Company
>
> Default pgbench script wasn't able to trigger autovacuum of
> pgbench_accounts table in 8 hours (scale 400, 40 clients, 900tps
> average), so weekend testing were not useful.
>
> I will re-run with custom script for next day-two.
>
> --
> Sokolov Yura aka funny_falcon
> Postgres Professional: https://postgrespro.ru
> The Russian Postgres Company

I've made three 8-hour runs each with master and with the 16MB ring.
scale 400, 40 clients, query script:

\set aid1 random(1, 100000 * :scale)
\set aidd random(1, 3)
\set aid2 :aid1 + :aidd
\set aid3 :aid1 + 2 * :aidd
\set aid4 :aid1 + 3 * :aidd
\set aid5 :aid1 + 4 * :aidd
\set delta random(-5000, 5000)

update pgbench_accounts set abalance = abalance + :delta
where aid in (:aid1, :aid2, :aid3, :aid4, :aid5);
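The five updated ids are all derived from one random base id, so each
transaction touches rows clustered within a narrow window, which
concentrates updates on hot pages. A small Python sketch of the same ID
pattern (constants taken from the script; the bound is plain arithmetic):

```python
import random

# Mirror of the pgbench \set logic above: one base id, four offsets.
# With aidd in 1..3, the five ids span at most 4 * 3 = 12 positions.
scale = 400
aid1 = random.randint(1, 100000 * scale)
aidd = random.randint(1, 3)
aids = [aid1 + k * aidd for k in range(5)]  # aid1 .. aid5

assert len(aids) == 5
assert max(aids) - min(aids) <= 12
```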

postgresql.conf:

max_connections = 300
shared_buffers = 2GB
work_mem = 128MB
maintenance_work_mem = 512MB
bgwriter_lru_maxpages = 10
bgwriter_flush_after = 2MB
backend_flush_after = 2MB
wal_compression = on
wal_buffers = 32MB
checkpoint_flush_after = 2MB
autovacuum = on
log_autovacuum_min_duration = 0
autovacuum_vacuum_scale_factor = 0.05
autovacuum_vacuum_cost_delay = 2ms

(I had to slow down the bgwriter (bgwriter_lru_maxpages = 10),
because otherwise everything was too slow. Maybe I did it wrong.)
(I'm open to any suggestions about postgresql.conf.)

I've tried to produce pretty logs in testing5_pretty.tar.gz .
Files 'testing5_sum/test_master{,_ring16}_[123]/pretty.log' contain
combined, prettified logs from postgresql and pgbench.

Some excerpts:

test_master_1/pretty.log
time activity tps latency stddev min max
10980 av 1364 29ms 76ms 6ms 1170ms
11010 av 430 90ms 253ms 7ms 2472ms
11040 245 170ms 485ms 7ms 2821ms
11070 487 81ms 238ms 6ms 2404ms
11100 av 360 112ms 261ms 7ms 2549ms
11130 av+ch 198 198ms 374ms 7ms 1956ms
11160 av+ch 248 163ms 401ms 7ms 2601ms
11190 av+ch 321 125ms 363ms 7ms 2722ms
11220 av+ch 1155 35ms 123ms 7ms 2668ms
11250 av+ch 1390 29ms 79ms 7ms 1422ms
11280 av 624 64ms 176ms 6ms 1922ms
11310 av 454 87ms 236ms 7ms 2481ms
11340 av 524 77ms 223ms 6ms 2383ms
11370 av 414 96ms 267ms 7ms 2853ms
11400 av 380 103ms 235ms 7ms 2298ms
11430 av+ch 239 168ms 344ms 7ms 2482ms
11460 av+ch 138 293ms 516ms 7ms 2438ms
11490 av+ch 231 175ms 494ms 7ms 3150ms
11520 av+ch 1133 35ms 112ms 7ms 2378ms
11550 av+ch 1391 29ms 64ms 6ms 957ms

test_master_ring16_1/pretty.log
time activity tps latency stddev min max
10710 498 82ms 249ms 7ms 2801ms
10740 408 99ms 271ms 7ms 2793ms
10770 399 99ms 284ms 7ms 3233ms
10800 279 142ms 347ms 7ms 2641ms
10830 ch 245 164ms 436ms 7ms 2618ms
10860 ch 462 86ms 246ms 7ms 2497ms
10890 ch 78 506ms 905ms 6ms 3198ms
10920 ch 17 2407ms 217ms 1650ms 2678ms
10950 ch 652 64ms 272ms 7ms 2471ms
10980 av 976 41ms 126ms 6ms 2219ms
11010 av 379 104ms 257ms 7ms 2491ms
11040 av 381 107ms 274ms 7ms 2426ms
11070 av 325 123ms 294ms 6ms 2497ms
11100 av 226 173ms 387ms 7ms 1993ms
11130 av+ch 26 1575ms 635ms 101ms 2536ms
11160 av+ch 25 1552ms 648ms 58ms 2376ms
11190 av+ch 32 1275ms 726ms 16ms 2493ms
11220 av+ch 23 1584ms 674ms 48ms 2454ms
11250 av+ch 35 1235ms 777ms 22ms 3627ms
11280 av+ch 1301 30ms 145ms 6ms 2778ms
11310 av 903 46ms 125ms 7ms 2406ms
11340 av 395 100ms 291ms 7ms 2849ms
11370 av 377 103ms 255ms 7ms 2082ms
11400 av 340 114ms 309ms 7ms 3160ms

Archive testing5_all.tar.gz contains more raw logs.

My interpretation:
- autovacuum runs 3-4 times faster
  (2700sec unpatched vs 800sec patched);
- a faster autovacuum alone is not a big problem:
  concurrent transactions are a bit slower, but not
  catastrophically so;
- a faster autovacuum with a checkpoint running simultaneously is a
  big problem.

Maybe the checkpoint process should affect the autovacuum cost mechanism?
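To make the suggestion concrete, here is a purely illustrative sketch of
one way a checkpoint could feed into the cost mechanism: stretch the
vacuum cost delay while a checkpoint is writing. All names here
(checkpoint_in_progress, CHECKPOINT_COST_FACTOR) are invented for
illustration; nothing like this exists in PostgreSQL.

```python
import time

AUTOVACUUM_COST_DELAY = 0.002    # seconds (2ms, as configured above)
CHECKPOINT_COST_FACTOR = 4       # hypothetical multiplier

def vacuum_delay(cost_balance, cost_limit, checkpoint_in_progress):
    """Sleep longer when vacuum I/O competes with an active checkpoint."""
    if cost_balance < cost_limit:
        return 0.0               # budget not yet exhausted, no nap
    delay = AUTOVACUUM_COST_DELAY
    if checkpoint_in_progress:
        delay *= CHECKPOINT_COST_FACTOR
    time.sleep(delay)
    return delay
```

This is only a sketch of the trade-off: it would slow autovacuum back
down exactly during the windows where the benchmarks above show the
latency spikes.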

With regards,
--
Sokolov Yura aka funny_falcon
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company

Attachment Content-Type Size
testing5_pretty.tar.gz application/x-gzip 77.1 KB
testing5_all.tar.gz application/x-gzip 543.3 KB

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-26 16:46:25
Message-ID: CAGTBQpbp9kRhADpvyBsZtg8V_xvxULO34k0CM3zXkwi6Z5d66w@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jul 26, 2017 at 1:39 PM, Sokolov Yura
<funny(dot)falcon(at)postgrespro(dot)ru> wrote:
> On 2017-07-24 12:41, Sokolov Yura wrote:
> test_master_1/pretty.log
...
> time activity tps latency stddev min max
> 11130 av+ch 198 198ms 374ms 7ms 1956ms
> 11160 av+ch 248 163ms 401ms 7ms 2601ms
> 11190 av+ch 321 125ms 363ms 7ms 2722ms
> 11220 av+ch 1155 35ms 123ms 7ms 2668ms
> 11250 av+ch 1390 29ms 79ms 7ms 1422ms

vs

> test_master_ring16_1/pretty.log
> time activity tps latency stddev min max
> 11130 av+ch 26 1575ms 635ms 101ms 2536ms
> 11160 av+ch 25 1552ms 648ms 58ms 2376ms
> 11190 av+ch 32 1275ms 726ms 16ms 2493ms
> 11220 av+ch 23 1584ms 674ms 48ms 2454ms
> 11250 av+ch 35 1235ms 777ms 22ms 3627ms

That's a huge change in latency for the worse.

Are you sure that's the ring buffer's doing and not some methodology snafu?


From: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-26 17:28:28
Message-ID: aad2de81e3fe74db9772cea1e9bb543b@postgrespro.ru
Lists: pgsql-hackers

On 2017-07-26 19:46, Claudio Freire wrote:
> On Wed, Jul 26, 2017 at 1:39 PM, Sokolov Yura
> <funny(dot)falcon(at)postgrespro(dot)ru> wrote:
>> On 2017-07-24 12:41, Sokolov Yura wrote:
>> test_master_1/pretty.log
> ...
>> time activity tps latency stddev min max
>> 11130 av+ch 198 198ms 374ms 7ms 1956ms
>> 11160 av+ch 248 163ms 401ms 7ms 2601ms
>> 11190 av+ch 321 125ms 363ms 7ms 2722ms
>> 11220 av+ch 1155 35ms 123ms 7ms 2668ms
>> 11250 av+ch 1390 29ms 79ms 7ms 1422ms
>
> vs
>
>> test_master_ring16_1/pretty.log
>> time activity tps latency stddev min max
>> 11130 av+ch 26 1575ms 635ms 101ms 2536ms
>> 11160 av+ch 25 1552ms 648ms 58ms 2376ms
>> 11190 av+ch 32 1275ms 726ms 16ms 2493ms
>> 11220 av+ch 23 1584ms 674ms 48ms 2454ms
>> 11250 av+ch 35 1235ms 777ms 22ms 3627ms
>
> That's a very huge change in latency for the worse
>
> Are you sure that's the ring buffer's doing and not some methodology
> snafu?

Well, I tuned postgresql.conf so that there were no such
catastrophic slowdowns on the master branch (with default
settings such slowdowns happen quite frequently).
bgwriter_lru_maxpages = 10 (instead of the default 200) was one
part of that tuning.

Probably there is some magic "border" that triggers this
behavior. Tuning postgresql.conf shifted the master branch to the
"good side" of this border, and the faster autovacuum crossed it
back to the "bad side".

Probably, backend_flush_after = 2MB (instead of the default 0) is
also part of this border. I haven't tried benchmarking without this
option yet.

Anyway, given that checkpoint and autovacuum interference can be
this noticeable, the checkpoint clearly should affect the autovacuum
cost mechanism, imho.

With regards,
--
Sokolov Yura aka funny_falcon
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company


From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-27 08:30:51
Message-ID: CAD21AoAZb2dK6EutyV-4MGT1m=r-PtAwWpA4X91KT3XobGB-gQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jul 25, 2017 at 2:27 AM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
> On Mon, Jul 24, 2017 at 2:20 PM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
>> On Mon, Jul 24, 2017 at 2:10 PM, Sokolov Yura
>> <funny(dot)falcon(at)postgrespro(dot)ru> wrote:
>>> On 2017-07-24 19:11, Claudio Freire wrote:
>>>> I was mostly thinking about something like the attached patch.
>>>>
>>>> Simple, unintrusive, and shouldn't cause any noticeable slowdown.
>>>
>>>
>>> Your change is small, clear, and currently useful for huge tables under
>>> high update load (until "allowing vacuum to use more than 1GB memory"
>>> is merged).
>>
>> In high-bloat conditions, it doesn't take long to accumulate 1GB of
>> dead tuples (which is about 178M tuples, btw).
>>
>> The index scan takes way longer than the heap scan in that case.
>>
>>> But it still delays updating fsm until whole first batch of dead tuples
>>> cleared (ie all indices scanned, and all heap pages cleared), and on such
>>> huge table it will be hours.
>>
>> So, true, it will get delayed considerably. But as you realized,
>> there's not much point in trying to vacuum the FSM sooner, since it
>> won't be accurate shortly afterwards anyway. Dead line pointers do use
>> up a fair bit of space, especially on narrow tables.
>>
>> In a particular table I have that exhibits this problem, most of the
>> time is spent scanning the index. It performs dozens of index scans
>> before it's done, so it would vacuum the FSM quite often enough, even
>> if I were to increase the mwm setting n-fold.
>
> I hate to reply to myself, but I wanted to add: in any case, the case
> I'm trying to avoid is the case where the FSM *never* gets vacuumed.
> That's bad. But it may not be the phenomenon you're experiencing in
> your tests.
>

I think that vacuuming the FSM frequently during a long-running vacuum
would be worth having in order to avoid table bloat. The patch
triggers an FSM vacuum after the table and indexes have been vacuumed,
but I think we could have a similar mechanism for a table with no indexes.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Claudio Freire <klaussfreire(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-27 08:48:29
Message-ID: e8dfa0b39429781d1477c8722b6a66fe@postgrespro.ru
Lists: pgsql-hackers

On 2017-07-27 11:30, Masahiko Sawada wrote:
> On Tue, Jul 25, 2017 at 2:27 AM, Claudio Freire
> <klaussfreire(at)gmail(dot)com> wrote:
>> On Mon, Jul 24, 2017 at 2:20 PM, Claudio Freire
>> <klaussfreire(at)gmail(dot)com> wrote:
>>> On Mon, Jul 24, 2017 at 2:10 PM, Sokolov Yura
>>> <funny(dot)falcon(at)postgrespro(dot)ru> wrote:
>>>> On 2017-07-24 19:11, Claudio Freire wrote:
>>>>> I was mostly thinking about something like the attached patch.
>>>>>
>>>>> Simple, unintrusive, and shouldn't cause any noticeable slowdown.
>>>>
>>>>
>>>> Your change is small, clear, and currently useful for huge tables
>>>> under
>>>> high update load (until "allowing vacuum to use more than 1GB
>>>> memory"
>>>> is merged).
>>>
>>> In high-bloat conditions, it doesn't take long to accumulate 1GB of
>>> dead tuples (which is about 178M tuples, btw).
>>>
>>> The index scan takes way longer than the heap scan in that case.
>>>
>>>> But it still delays updating fsm until whole first batch of dead
>>>> tuples
>>>> cleared (ie all indices scanned, and all heap pages cleared), and on
>>>> such
>>>> huge table it will be hours.
>>>
>>> So, true, it will get delayed considerably. But as you realized,
>>> there's not much point in trying to vacuum the FSM sooner, since it
>>> won't be accurate shortly afterwards anyway. Dead line pointers do
>>> use
>>> up a fair bit of space, especially on narrow tables.
>>>
>>> In a particular table I have that exhibits this problem, most of the
>>> time is spent scanning the index. It performs dozens of index scans
>>> before it's done, so it would vacuum the FSM quite often enough, even
>>> if I were to increase the mwm setting n-fold.
>>
>> I hate to reply to myself, but I wanted to add: in any case, the case
>> I'm trying to avoid is the case where the FSM *never* gets vacuumed.
>> That's bad. But it may not be the phenomenon you're experiencing in
>> your tests.
>>
>
> I think the frequently vacuuming the FSM during long-time vacuum would
> be worth to have in order to avoid a table bloating. The patch
> triggers to vacuum the FSM after vacuumed the table and indexes but I
> think that we can have a similar mechanism for a table with no index.
>
> Regards,
>
> --
> Masahiko Sawada
> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
> NTT Open Source Software Center

I could be wrong, but it looks like a table without indexes doesn't
suffer that much. Since there are no indices, there is only one stage,
scanning the heap, and no quadratic behavior caused by a full dead-tuple
array and repeated index vacuuming.

--
Sokolov Yura aka funny_falcon
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company


From: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-27 08:53:47
Message-ID: 38e275a82b0d1db3a8397633f516b22f@postgrespro.ru
Lists: pgsql-hackers

On 2017-07-26 20:28, Sokolov Yura wrote:
> On 2017-07-26 19:46, Claudio Freire wrote:
>> On Wed, Jul 26, 2017 at 1:39 PM, Sokolov Yura
>> <funny(dot)falcon(at)postgrespro(dot)ru> wrote:
>>> On 2017-07-24 12:41, Sokolov Yura wrote:
>>> test_master_1/pretty.log
>> ...
>>> time activity tps latency stddev min max
>>> 11130 av+ch 198 198ms 374ms 7ms 1956ms
>>> 11160 av+ch 248 163ms 401ms 7ms 2601ms
>>> 11190 av+ch 321 125ms 363ms 7ms 2722ms
>>> 11220 av+ch 1155 35ms 123ms 7ms 2668ms
>>> 11250 av+ch 1390 29ms 79ms 7ms 1422ms
>>
>> vs
>>
>>> test_master_ring16_1/pretty.log
>>> time activity tps latency stddev min max
>>> 11130 av+ch 26 1575ms 635ms 101ms 2536ms
>>> 11160 av+ch 25 1552ms 648ms 58ms 2376ms
>>> 11190 av+ch 32 1275ms 726ms 16ms 2493ms
>>> 11220 av+ch 23 1584ms 674ms 48ms 2454ms
>>> 11250 av+ch 35 1235ms 777ms 22ms 3627ms
>>
>> That's a very huge change in latency for the worse
>>
>> Are you sure that's the ring buffer's doing and not some methodology
>> snafu?
>
> Well, I tuned postgresql.conf so that there is no such
> catastrophic slows down on master branch. (with default
> settings such slowdown happens quite frequently).
> bgwriter_lru_maxpages = 10 (instead of default 200) were one
> of such tuning.
>
> Probably there were some magic "border" that triggers this
> behavior. Tuning postgresql.conf shifted master branch on
> "good side" of this border, and faster autovacuum crossed it
> to "bad side" again.
>
> Probably, backend_flush_after = 2MB (instead of default 0) is
> also part of this border. I didn't try to bench without this
> option yet.
>
> Any way, given checkpoint and autovacuum interference could be
> such noticeable, checkpoint clearly should affect autovacuum
> cost mechanism, imho.
>
> With regards,
> --
> Sokolov Yura aka funny_falcon
> Postgres Professional: https://postgrespro.ru
> The Russian Postgres Company

I'll do two runs with the default postgresql.conf (except
shared_buffers and maintenance_work_mem) to find out the behavior
with default settings.

Then I'll try to investigate checkpoint cooperation with
autovacuum to fix the behavior with the "tuned" postgresql.conf.

--
Sokolov Yura aka funny_falcon
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company


From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
Cc: Claudio Freire <klaussfreire(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-27 09:16:50
Message-ID: CAD21AoDFYJTf9fC7tLnKaKxj7tg=4_9GY25bNYohE-r+_aMxew@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jul 27, 2017 at 5:48 PM, Sokolov Yura
<funny(dot)falcon(at)postgrespro(dot)ru> wrote:
> On 2017-07-27 11:30, Masahiko Sawada wrote:
>>
>> On Tue, Jul 25, 2017 at 2:27 AM, Claudio Freire <klaussfreire(at)gmail(dot)com>
>> wrote:
>>>
>>> On Mon, Jul 24, 2017 at 2:20 PM, Claudio Freire <klaussfreire(at)gmail(dot)com>
>>> wrote:
>>>>
>>>> On Mon, Jul 24, 2017 at 2:10 PM, Sokolov Yura
>>>> <funny(dot)falcon(at)postgrespro(dot)ru> wrote:
>>>>>
>>>>> On 2017-07-24 19:11, Claudio Freire wrote:
>>>>>>
>>>>>> I was mostly thinking about something like the attached patch.
>>>>>>
>>>>>> Simple, unintrusive, and shouldn't cause any noticeable slowdown.
>>>>>
>>>>>
>>>>>
>>>>> Your change is small, clear, and currently useful for huge tables under
>>>>> high update load (until "allowing vacuum to use more than 1GB memory"
>>>>> is merged).
>>>>
>>>>
>>>> In high-bloat conditions, it doesn't take long to accumulate 1GB of
>>>> dead tuples (which is about 178M tuples, btw).
>>>>
>>>> The index scan takes way longer than the heap scan in that case.
>>>>
>>>>> But it still delays updating fsm until whole first batch of dead tuples
>>>>> cleared (ie all indices scanned, and all heap pages cleared), and on
>>>>> such
>>>>> huge table it will be hours.
>>>>
>>>>
>>>> So, true, it will get delayed considerably. But as you realized,
>>>> there's not much point in trying to vacuum the FSM sooner, since it
>>>> won't be accurate shortly afterwards anyway. Dead line pointers do use
>>>> up a fair bit of space, especially on narrow tables.
>>>>
>>>> In a particular table I have that exhibits this problem, most of the
>>>> time is spent scanning the index. It performs dozens of index scans
>>>> before it's done, so it would vacuum the FSM quite often enough, even
>>>> if I were to increase the mwm setting n-fold.
>>>
>>>
>>> I hate to reply to myself, but I wanted to add: in any case, the case
>>> I'm trying to avoid is the case where the FSM *never* gets vacuumed.
>>> That's bad. But it may not be the phenomenon you're experiencing in
>>> your tests.
>>>
>>
>> I think the frequently vacuuming the FSM during long-time vacuum would
>> be worth to have in order to avoid a table bloating. The patch
>> triggers to vacuum the FSM after vacuumed the table and indexes but I
>> think that we can have a similar mechanism for a table with no index.
>>
>> Regards,
>>
>> --
>> Masahiko Sawada
>> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
>> NTT Open Source Software Center
>
>
> I could be wrong, but it looks like table without index doesn't
> suffer that much. Since there is no indices, there is only one stage -
> scanning heap, and no quadratic behavior cause of full dead-tuple array
> and repeating index vacuuming.
>

Vacuuming a very large table with no indexes can also take a long
time, since it scans and vacuums blocks one by one. So I imagined
that we could vacuum the FSM once a certain number of blocks have
been vacuumed. That would avoid table bloat during a long-running
vacuum.
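A minimal sketch of that idea: during the heap scan, vacuum the FSM
every fixed number of blocks rather than only once at the end. The
threshold and the two callbacks are invented names for illustration,
not PostgreSQL APIs.

```python
# Hypothetical threshold: vacuum the FSM every 8192 heap blocks
# (~64MB of 8kB pages), so freed space becomes visible early.
FSM_EVERY_BLOCKS = 8192

def lazy_scan_heap(nblocks, vacuum_page, vacuum_fsm):
    """Scan nblocks heap pages, vacuuming the FSM periodically."""
    since_fsm_vacuum = 0
    for blkno in range(nblocks):
        vacuum_page(blkno)           # per-page dead-tuple cleanup
        since_fsm_vacuum += 1
        if since_fsm_vacuum >= FSM_EVERY_BLOCKS:
            vacuum_fsm()             # propagate freed space up the FSM
            since_fsm_vacuum = 0
    vacuum_fsm()                     # final pass, as today
```

For a table with no indexes this would expose freed space to concurrent
inserters long before the scan finishes.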

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-27 16:42:55
Message-ID: CAGTBQpar1ix+TEn+PdAga-9HXxeacXFa3MFmLR4aq2_8dFiT4g@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jul 27, 2017 at 6:16 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> On Thu, Jul 27, 2017 at 5:48 PM, Sokolov Yura
> <funny(dot)falcon(at)postgrespro(dot)ru> wrote:
>> On 2017-07-27 11:30, Masahiko Sawada wrote:
>>>
>>> On Tue, Jul 25, 2017 at 2:27 AM, Claudio Freire <klaussfreire(at)gmail(dot)com>
>>> wrote:
>>>>
>>>> On Mon, Jul 24, 2017 at 2:20 PM, Claudio Freire <klaussfreire(at)gmail(dot)com>
>>>> wrote:
>>>>>
>>>>> On Mon, Jul 24, 2017 at 2:10 PM, Sokolov Yura
>>>>> <funny(dot)falcon(at)postgrespro(dot)ru> wrote:
>>>>>>
>>>>>> On 2017-07-24 19:11, Claudio Freire wrote:
>>>>>>>
>>>>>>> I was mostly thinking about something like the attached patch.
>>>>>>>
>>>>>>> Simple, unintrusive, and shouldn't cause any noticeable slowdown.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Your change is small, clear, and currently useful for huge tables under
>>>>>> high update load (until "allowing vacuum to use more than 1GB memory"
>>>>>> is merged).
>>>>>
>>>>>
>>>>> In high-bloat conditions, it doesn't take long to accumulate 1GB of
>>>>> dead tuples (which is about 178M tuples, btw).
>>>>>
>>>>> The index scan takes way longer than the heap scan in that case.
>>>>>
>>>>>> But it still delays updating fsm until whole first batch of dead tuples
>>>>>> cleared (ie all indices scanned, and all heap pages cleared), and on
>>>>>> such
>>>>>> huge table it will be hours.
>>>>>
>>>>>
>>>>> So, true, it will get delayed considerably. But as you realized,
>>>>> there's not much point in trying to vacuum the FSM sooner, since it
>>>>> won't be accurate shortly afterwards anyway. Dead line pointers do use
>>>>> up a fair bit of space, especially on narrow tables.
>>>>>
>>>>> In a particular table I have that exhibits this problem, most of the
>>>>> time is spent scanning the index. It performs dozens of index scans
>>>>> before it's done, so it would vacuum the FSM quite often enough, even
>>>>> if I were to increase the mwm setting n-fold.
>>>>
>>>>
>>>> I hate to reply to myself, but I wanted to add: in any case, the case
>>>> I'm trying to avoid is the case where the FSM *never* gets vacuumed.
>>>> That's bad. But it may not be the phenomenon you're experiencing in
>>>> your tests.
>>>>
>>>
>>> I think the frequently vacuuming the FSM during long-time vacuum would
>>> be worth to have in order to avoid a table bloating. The patch
>>> triggers to vacuum the FSM after vacuumed the table and indexes but I
>>> think that we can have a similar mechanism for a table with no index.
>>>
>>> Regards,
>>>
>>> --
>>> Masahiko Sawada
>>> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
>>> NTT Open Source Software Center
>>
>>
>> I could be wrong, but it looks like table without index doesn't
>> suffer that much. Since there is no indices, there is only one stage -
>> scanning heap, and no quadratic behavior cause of full dead-tuple array
>> and repeating index vacuuming.
>>
>
> The vacuuming the very large table with no index could also take a
> long time, and it scans and vacuums blocks one by one. So I imagined
> that we can vacuum the FSM once vacuumed a certain amount of blocks.
> And that can avoid bloating table during the long-time vacuum.

Could do that. I'll see about doing something of the sort and
submitting it as a separate patch.


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-27 16:46:37
Message-ID: 20170727164637.6b6kjuhlwvjsapsf@alvherre.pgsql
Lists: pgsql-hackers

Claudio Freire wrote:

> > The vacuuming the very large table with no index could also take a
> > long time, and it scans and vacuums blocks one by one. So I imagined
> > that we can vacuum the FSM once vacuumed a certain amount of blocks.
> > And that can avoid bloating table during the long-time vacuum.
>
> Could do that. I'll see about doing something of the sort and
> submitting it as a separate patch.

Vacuuming of the FSM is in no way strictly tied to vacuuming the heap
(it's not really "vacuuming", it's just about updating the upper layers
to match the data in the leaves). I think we could use the new autovac
"workitem" infrastructure and tack an item there once in a while for FSM
vacuuming ...

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-27 17:04:59
Message-ID: CAGTBQpZFJ4VpWatTezepX4sJqm+MDxuJSmCfNhL32S0pV_Oz1A@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jul 27, 2017 at 1:46 PM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> Claudio Freire wrote:
>
>> > The vacuuming the very large table with no index could also take a
>> > long time, and it scans and vacuums blocks one by one. So I imagined
>> > that we can vacuum the FSM once vacuumed a certain amount of blocks.
>> > And that can avoid bloating table during the long-time vacuum.
>>
>> Could do that. I'll see about doing something of the sort and
>> submitting it as a separate patch.
>
> Vacuuming of the FSM is in no way strictly tied to vacuuming the heap
> (it's not really "vacuuming", it's just about updating the upper layers
> to match the data in the leaves). I think we could use the new autovac
> "workitem" infrastructure and tack an item there once in a while for FSM
> vacuuming ...

Well, it *is* tied in the sense that vacuum is the one massively
adding free space.


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-27 17:10:01
Message-ID: 20170727171001.e6gl7oo4t6rgsh62@alvherre.pgsql
Lists: pgsql-hackers

Claudio Freire wrote:
> On Thu, Jul 27, 2017 at 1:46 PM, Alvaro Herrera
> <alvherre(at)2ndquadrant(dot)com> wrote:
> > Claudio Freire wrote:
> >
> >> > The vacuuming the very large table with no index could also take a
> >> > long time, and it scans and vacuums blocks one by one. So I imagined
> >> > that we can vacuum the FSM once vacuumed a certain amount of blocks.
> >> > And that can avoid bloating table during the long-time vacuum.
> >>
> >> Could do that. I'll see about doing something of the sort and
> >> submitting it as a separate patch.
> >
> > Vacuuming of the FSM is in no way strictly tied to vacuuming the heap
> > (it's not really "vacuuming", it's just about updating the upper layers
> > to match the data in the leaves). I think we could use the new autovac
> > "workitem" infrastructure and tack an item there once in a while for FSM
> > vacuuming ...
>
> Well, it *is* tied in the sense that vacuum is the one massively
> adding free space.

Yes, but if vacuum dies after releasing lots of space but before
vacuuming FSM, then it's not tied anymore and you could just as well run
it anytime.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-27 17:25:26
Message-ID: CAGTBQpbVFhzTSCh+-9vtewoRa=6gDS9B6wpSEGyPESk0aafxsA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jul 27, 2017 at 2:10 PM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> Claudio Freire wrote:
>> On Thu, Jul 27, 2017 at 1:46 PM, Alvaro Herrera
>> <alvherre(at)2ndquadrant(dot)com> wrote:
>> > Claudio Freire wrote:
>> >
>> >> > The vacuuming the very large table with no index could also take a
>> >> > long time, and it scans and vacuums blocks one by one. So I imagined
>> >> > that we can vacuum the FSM once vacuumed a certain amount of blocks.
>> >> > And that can avoid bloating table during the long-time vacuum.
>> >>
>> >> Could do that. I'll see about doing something of the sort and
>> >> submitting it as a separate patch.
>> >
>> > Vacuuming of the FSM is in no way strictly tied to vacuuming the heap
>> > (it's not really "vacuuming", it's just about updating the upper layers
>> > to match the data in the leaves). I think we could use the new autovac
>> > "workitem" infrastructure and tack an item there once in a while for FSM
>> > vacuuming ...
>>
>> Well, it *is* tied in the sense that vacuum is the one massively
>> adding free space.
>
> Yes, but if vacuum dies after releasing lots of space but before
> vacuuming FSM, then it's not tied anymore and you could just as well run
> it anytime.

I see your point.


From: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-07-31 17:11:25
Message-ID: 535409cd89927f551bd106cacd2149f4@postgrespro.ru
Lists: pgsql-hackers

On 2017-07-27 11:53, Sokolov Yura wrote:
> On 2017-07-26 20:28, Sokolov Yura wrote:
>> On 2017-07-26 19:46, Claudio Freire wrote:
>>> On Wed, Jul 26, 2017 at 1:39 PM, Sokolov Yura
>>> <funny(dot)falcon(at)postgrespro(dot)ru> wrote:
>>>> On 2017-07-24 12:41, Sokolov Yura wrote:
>>>> test_master_1/pretty.log
>>> ...
>>>> time activity tps latency stddev min max
>>>> 11130 av+ch 198 198ms 374ms 7ms 1956ms
>>>> 11160 av+ch 248 163ms 401ms 7ms 2601ms
>>>> 11190 av+ch 321 125ms 363ms 7ms 2722ms
>>>> 11220 av+ch 1155 35ms 123ms 7ms 2668ms
>>>> 11250 av+ch 1390 29ms 79ms 7ms 1422ms
>>>
>>> vs
>>>
>>>> test_master_ring16_1/pretty.log
>>>> time activity tps latency stddev min max
>>>> 11130 av+ch 26 1575ms 635ms 101ms 2536ms
>>>> 11160 av+ch 25 1552ms 648ms 58ms 2376ms
>>>> 11190 av+ch 32 1275ms 726ms 16ms 2493ms
>>>> 11220 av+ch 23 1584ms 674ms 48ms 2454ms
>>>> 11250 av+ch 35 1235ms 777ms 22ms 3627ms
>>>
>>> That's a very huge change in latency for the worse
>>>
>>> Are you sure that's the ring buffer's doing and not some methodology
>>> snafu?
>>
>> Well, I tuned postgresql.conf so that there is no such
>> catastrophic slows down on master branch. (with default
>> settings such slowdown happens quite frequently).
>> bgwriter_lru_maxpages = 10 (instead of default 200) were one
>> of such tuning.
>>
>> Probably there were some magic "border" that triggers this
>> behavior. Tuning postgresql.conf shifted master branch on
>> "good side" of this border, and faster autovacuum crossed it
>> to "bad side" again.
>>
>> Probably, backend_flush_after = 2MB (instead of default 0) is
>> also part of this border. I didn't try to bench without this
>> option yet.
>>
>> Any way, given checkpoint and autovacuum interference could be
>> such noticeable, checkpoint clearly should affect autovacuum
>> cost mechanism, imho.
>>
>> With regards,
>
> I'll run two times with default postgresql.conf (except
> shared_buffers and maintence_work_mem) to find out behavior on
> default setting.
>
> Then I'll try to investigate checkpoint co-operation with
> autovacuum to fix behavior with "tuned" postgresql.conf.
>

I've accidentally lost the results of this run, so I will rerun it.

This much I remember:
- Even with default settings, autovacuum runs 3 times faster:
  9000s on master, 3000s with the increased ring buffer.
  So xlog fsync really does slow autovacuum down.
- But concurrent transactions slow down (not as extremely as in the
  previous test, but still significantly).
I can't draw a pretty table now because I lost the results; I'll do
it after the re-run completes.

Could someone suggest how to make checkpoints cooperate with
autovacuum, so that autovacuum is slowed down a bit during a checkpoint?

With regards,
--
Sokolov Yura aka funny_falcon
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company


From: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Increase Vacuum ring buffer.
Date: 2017-08-15 15:00:38
Message-ID: 20170815180038.1f1953b2@falcon-work
Lists: pgsql-hackers

On Mon, 31 Jul 2017 20:11:25 +0300,
Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru> wrote:

> On 2017-07-27 11:53, Sokolov Yura wrote:
> > On 2017-07-26 20:28, Sokolov Yura wrote:
> >> On 2017-07-26 19:46, Claudio Freire wrote:
> >>> On Wed, Jul 26, 2017 at 1:39 PM, Sokolov Yura
> >>> <funny(dot)falcon(at)postgrespro(dot)ru> wrote:
> >>>> On 2017-07-24 12:41, Sokolov Yura wrote:
> >>>> test_master_1/pretty.log
> >>> ...
> >>>> time activity tps latency stddev min max
> >>>> 11130 av+ch 198 198ms 374ms 7ms 1956ms
> >>>> 11160 av+ch 248 163ms 401ms 7ms 2601ms
> >>>> 11190 av+ch 321 125ms 363ms 7ms 2722ms
> >>>> 11220 av+ch 1155 35ms 123ms 7ms 2668ms
> >>>> 11250 av+ch 1390 29ms 79ms 7ms 1422ms
> >>>
> >>> vs
> >>>
> >>>> test_master_ring16_1/pretty.log
> >>>> time activity tps latency stddev min max
> >>>> 11130 av+ch 26 1575ms 635ms 101ms 2536ms
> >>>> 11160 av+ch 25 1552ms 648ms 58ms 2376ms
> >>>> 11190 av+ch 32 1275ms 726ms 16ms 2493ms
> >>>> 11220 av+ch 23 1584ms 674ms 48ms 2454ms
> >>>> 11250 av+ch 35 1235ms 777ms 22ms 3627ms
> >>>
> >>> That's a very huge change in latency for the worse
> >>>
> >>> Are you sure that's the ring buffer's doing and not some
> >>> methodology snafu?
> >>
> >> Well, I tuned postgresql.conf so that there is no such
> >> catastrophic slows down on master branch. (with default
> >> settings such slowdown happens quite frequently).
> >> bgwriter_lru_maxpages = 10 (instead of default 200) were one
> >> of such tuning.
> >>
> >> Probably there were some magic "border" that triggers this
> >> behavior. Tuning postgresql.conf shifted master branch on
> >> "good side" of this border, and faster autovacuum crossed it
> >> to "bad side" again.
> >>
> >> Probably, backend_flush_after = 2MB (instead of default 0) is
> >> also part of this border. I didn't try to bench without this
> >> option yet.
> >>
> >> Any way, given checkpoint and autovacuum interference could be
> >> such noticeable, checkpoint clearly should affect autovacuum
> >> cost mechanism, imho.
> >>
> >> With regards,
> >
> > I'll run two times with default postgresql.conf (except
> > shared_buffers and maintence_work_mem) to find out behavior on
> > default setting.
> >
> I've accidentally lost results of this run, so I will rerun it.
>
> This I remembered:
> - even with default settings, autovacuum runs 3 times faster:
> 9000s on master, 3000s with increased ring buffer.
> So xlog-fsync really slows down autovacuum.
> - but concurrent transactions slows down (not so extremely as in
> previous test, but still significantly).
> I could not draw pretty table now, cause I lost results. I'll do
> it after re-run completes.
>
> With regards,

Excuse me for the long delay.

I ran with the default postgresql.conf.

First, the query was a bit different: instead of updating 5 close but
random points using `aid in (:aid1, :aid2, :aid3, :aid4, :aid5)`, the
condition was `aid between :aid1 and :aid1+9`. TPS is much lower
(330tps on master vs 540tps in the previous version), but it is hard
to tell whether that is due to the query difference or to the config
change. I'm sorry for this inconvenience :-( I will not repeat this
mistake in the future.

Overview:
master : 339 tps, average autovacuum 6000sec.
ring16 : 275 tps, average autovacuum 1100sec, first 2500sec.
ring16 + `vacuum_cost_page_dirty = 40` :
293 tps, average autovacuum 2100sec, first 4226sec.
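For reference, the third configuration above corresponds to a
postgresql.conf fragment like the following (the stock default for
vacuum_cost_page_dirty is 20; the exact shared_buffers and
maintenance_work_mem values used in the test setup are not restated here):

```
# ring16 run with an increased dirty-page cost, everything else at
# defaults apart from shared_buffers / maintenance_work_mem
vacuum_cost_page_dirty = 40
```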

Running with the default postgresql.conf doesn't show the catastrophic
tps decline when a checkpoint starts during autovacuum (seen in
previous test runs with `autovacuum_cost_delay = 2ms`). The overall
average tps over the 8-hour test is also much closer to master's.
Still, the increased ring buffer significantly improves autovacuum
performance, which means fsync consumes a lot of time, comparable with
autovacuum_cost_delay.

Runs with ring16 have occasional jumps in minimum and average
response latency:
test_master_ring16_2/pretty.log
time activity tps latency stddev min max
8475 av+ch 15 2689ms 659ms 1867ms 3575ms
27420 av 15 2674ms 170ms 2393ms 2926ms
Usually it happens close to the end of autovacuum.
What could it be? It is clearly bad behavior that the current small
ring buffer merely hides.

Runs with ring16 + `cost_page_dirty = 40` are much more stable in
terms of concurrent transaction performance. Only the first autovacuum
shows such a "latency jump"; later runs go smoothly.

So, increasing the ring buffer certainly improves autovacuum
performance. Its negative effects can be compensated for with
configuration. It also exposes some bad behavior in the current
implementation that should be investigated more closely.

--
With regards,
Sokolov Yura aka funny_falcon
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company

Attachment Content-Type Size
testing6_pretty.tar.gz application/gzip 486.9 KB