Allow Pin/UnpinBuffer to operate in a lockfree manner. · tonybelloni/postgres@4835458 · GitHub
http://github.com/tonybelloni/postgres/commit/48354581a49c30f5757c203415aa8412d85b0f70

Commit 4835458

Allow Pin/UnpinBuffer to operate in a lockfree manner.
Pinning/Unpinning a buffer is a very frequent operation, especially in read-mostly, cache-resident workloads. Benchmarking shows that in various scenarios the spinlock protecting a buffer header's state becomes a significant bottleneck. The problem can be reproduced with pgbench -S on larger machines, but can be considerably worse for queries which touch the same buffers over and over at a high frequency (e.g. nested loops over a small inner table).

To allow atomic operations to be used, cram BufferDesc's flags, usage_count, buf_hdr_lock, and refcount into a single 32bit atomic variable; that allows them to be manipulated together using 32bit compare-and-swap operations. This requires reducing MAX_BACKENDS to 2^18-1 (which could be lifted by using a 64bit field, but it's not a realistic configuration atm).

As not all operations can easily be implemented in a lockfree manner, implement the previous buf_hdr_lock via a flag bit in the atomic variable. That way we can continue to lock the header in places where it's needed, but can get away without acquiring it in the more frequent hot-paths. There are some additional operations which can be done without the lock, but aren't in this patch; the most important places are covered.

As bufmgr.c now essentially re-implements spinlocks, abstract the delay logic from s_lock.c into something more generic. It already has two users, and more are coming up; there's a follow-up patch for lwlock.c at least.

This patch is based on a proof-of-concept written by me, which Alexander Korotkov made into a fully working patch; the committed version is again revised by me. Benchmarking and testing has, amongst others, been provided by Dilip Kumar, Alexander Korotkov, Robert Haas.

On a large x86 system improvements for readonly pgbench, with a high client count, of a factor of 8 have been observed.

Author: Alexander Korotkov and Andres Freund
Discussion: 2400449.GjM57CE0Yg@dinodell
1 parent cf223c3 commit 4835458
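
The commit message describes packing the flags, usage count, header lock and refcount into one 32bit word that is updated with compare-and-swap. The following is a minimal standalone sketch of that idea using C11 atomics rather than PostgreSQL's own atomics API. Only the 18-bit refcount follows from the stated MAX_BACKENDS limit of 2^18-1; the 4-bit usage count, the position of the lock bit, the usage-count cap of 5, and every `SK_`/`sk_` name are assumptions made for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed layout: bits 0-17 refcount, bits 18-21 usage count, bits 22-31 flags. */
#define SK_REFCOUNT_BITS    18
#define SK_REFCOUNT_ONE     1u
#define SK_REFCOUNT_MASK    ((1u << SK_REFCOUNT_BITS) - 1)
#define SK_USAGECOUNT_SHIFT SK_REFCOUNT_BITS
#define SK_USAGECOUNT_ONE   (1u << SK_USAGECOUNT_SHIFT)
#define SK_USAGECOUNT_MASK  (0xFu << SK_USAGECOUNT_SHIFT)
#define SK_LOCKED           (1u << 22)  /* hypothetical header-lock flag bit */
#define SK_MAX_USAGE_COUNT  5u          /* hypothetical cap */

/* Hypothetical stand-in for a buffer header: just the packed state word. */
typedef struct SketchBufferDesc
{
    _Atomic uint32_t state;
} SketchBufferDesc;

static inline uint32_t
sk_get_refcount(uint32_t s)
{
    return s & SK_REFCOUNT_MASK;
}

static inline uint32_t
sk_get_usagecount(uint32_t s)
{
    return (s & SK_USAGECOUNT_MASK) >> SK_USAGECOUNT_SHIFT;
}

/* Re-read the state until the lock bit is clear (no backoff in this sketch). */
static uint32_t
sk_wait_unlocked(SketchBufferDesc *buf)
{
    uint32_t s = atomic_load(&buf->state);

    while (s & SK_LOCKED)
        s = atomic_load(&buf->state);
    return s;
}

/* Pin: bump refcount, and usage count up to the cap, in one CAS. */
static uint32_t
sk_pin(SketchBufferDesc *buf)
{
    uint32_t old_state = atomic_load(&buf->state);

    for (;;)
    {
        uint32_t new_state;

        /* Never modify the word while a lock holder owns it. */
        if (old_state & SK_LOCKED)
            old_state = sk_wait_unlocked(buf);

        new_state = old_state + SK_REFCOUNT_ONE;
        if (sk_get_usagecount(old_state) < SK_MAX_USAGE_COUNT)
            new_state += SK_USAGECOUNT_ONE;

        /* On failure old_state is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&buf->state, &old_state, new_state))
            return new_state;
    }
}

/* Unpin: drop one reference, again only while the lock bit is clear. */
static uint32_t
sk_unpin(SketchBufferDesc *buf)
{
    uint32_t old_state = atomic_load(&buf->state);

    for (;;)
    {
        uint32_t new_state;

        if (old_state & SK_LOCKED)
            old_state = sk_wait_unlocked(buf);

        assert(sk_get_refcount(old_state) > 0);
        new_state = old_state - SK_REFCOUNT_ONE;

        if (atomic_compare_exchange_weak(&buf->state, &old_state, new_state))
            return new_state;
    }
}
```

Because the lock-free paths back off whenever the lock bit is set, a lock holder can treat the word as its own until it releases the lock. A production version would replace the busy re-read in `sk_wait_unlocked` with a proper spin delay.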

10 files changed: +622 -357 lines changed

contrib/pg_buffercache/pg_buffercache_pages.c

+8 -7
@@ -148,33 +148,34 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
 		 */
 		for (i = 0; i < NBuffers; i++)
 		{
-			volatile BufferDesc *bufHdr;
+			BufferDesc *bufHdr;
+			uint32		buf_state;
 
 			bufHdr = GetBufferDescriptor(i);
 			/* Lock each buffer header before inspecting. */
-			LockBufHdr(bufHdr);
+			buf_state = LockBufHdr(bufHdr);
 
 			fctx->record[i].bufferid = BufferDescriptorGetBuffer(bufHdr);
 			fctx->record[i].relfilenode = bufHdr->tag.rnode.relNode;
 			fctx->record[i].reltablespace = bufHdr->tag.rnode.spcNode;
 			fctx->record[i].reldatabase = bufHdr->tag.rnode.dbNode;
 			fctx->record[i].forknum = bufHdr->tag.forkNum;
 			fctx->record[i].blocknum = bufHdr->tag.blockNum;
-			fctx->record[i].usagecount = bufHdr->usage_count;
-			fctx->record[i].pinning_backends = bufHdr->refcount;
+			fctx->record[i].usagecount = BUF_STATE_GET_USAGECOUNT(buf_state);
+			fctx->record[i].pinning_backends = BUF_STATE_GET_REFCOUNT(buf_state);
 
-			if (bufHdr->flags & BM_DIRTY)
+			if (buf_state & BM_DIRTY)
 				fctx->record[i].isdirty = true;
 			else
 				fctx->record[i].isdirty = false;
 
 			/* Note if the buffer is valid, and has storage created */
-			if ((bufHdr->flags & BM_VALID) && (bufHdr->flags & BM_TAG_VALID))
+			if ((buf_state & BM_VALID) && (buf_state & BM_TAG_VALID))
 				fctx->record[i].isvalid = true;
 			else
 				fctx->record[i].isvalid = false;
 
-			UnlockBufHdr(bufHdr);
+			UnlockBufHdr(bufHdr, buf_state);
 		}
 
 		/*
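
In the hunk above, LockBufHdr now returns the state word and UnlockBufHdr takes it back as an argument. A minimal sketch of how a header lock can live in a flag bit of the same atomic word is given below, written with C11 atomics. The bit positions and all `SK_`/`sk_` names are illustrative assumptions, and the real functions in bufmgr.c may differ in detail, for example by adding a spin delay.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SK_LOCKED (1u << 22)    /* hypothetical header-lock flag bit */
#define SK_DIRTY  (1u << 23)    /* hypothetical "dirty" flag bit */

/* Hypothetical stand-in for a buffer header: just the packed state word. */
typedef struct SketchBufferDesc
{
    _Atomic uint32_t state;
} SketchBufferDesc;

/*
 * Acquire the header lock by setting the lock bit. Returns the state as it
 * is while locked, so the caller can inspect flags and counts without
 * reading the atomic variable again.
 */
static uint32_t
sk_lock_hdr(SketchBufferDesc *buf)
{
    for (;;)
    {
        uint32_t old_state = atomic_fetch_or(&buf->state, SK_LOCKED);

        /* If the bit was clear before our OR, we now own the lock. */
        if (!(old_state & SK_LOCKED))
            return old_state | SK_LOCKED;

        /* Otherwise spin; a real implementation would back off here. */
    }
}

/*
 * Release the lock and publish the caller's (possibly modified) copy of the
 * state in a single store. This is only safe if every lock-free updater
 * refrains from changing the word while the lock bit is set.
 */
static void
sk_unlock_hdr(SketchBufferDesc *buf, uint32_t buf_state)
{
    assert(buf_state & SK_LOCKED);
    atomic_store(&buf->state, buf_state & ~SK_LOCKED);
}
```

Passing the state back on unlock lets a caller change flags in its local copy while holding the lock and write the result once, instead of issuing a separate atomic operation per change.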

src/backend/storage/buffer/buf_init.c

+2 -5
@@ -135,12 +135,9 @@ InitBufferPool(void)
 			BufferDesc *buf = GetBufferDescriptor(i);
 
 			CLEAR_BUFFERTAG(buf->tag);
-			buf->flags = 0;
-			buf->usage_count = 0;
-			buf->refcount = 0;
-			buf->wait_backend_pid = 0;
 
-			SpinLockInit(&buf->buf_hdr_lock);
+			pg_atomic_init_u32(&buf->state, 0);
+			buf->wait_backend_pid = 0;
 
 			buf->buf_id = i;
 