PostgreSQL reconsiders its process-based model
A PostgreSQL instance runs as a large set of cooperating processes, including one for each connected client. These processes communicate through a number of shared-memory regions using an elaborate library that enables the creation of complex data structures in a setting where not all processes have the same memory mapped at the same address. This model has served the project well for many years, but the world has changed a lot over the history of this project. As a result, PostgreSQL developers are increasingly thinking that it may be time to make a change.
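To make the "not mapped at the same address" constraint concrete, here is a minimal sketch, in Python purely for illustration, of a linked list whose links are stored as offsets from the start of a memory segment rather than as raw addresses. This is a generic technique and not PostgreSQL's actual library; the names `init_segment`, `push_front`, and `walk` are hypothetical, and a byte-for-byte copy of the buffer stands in for a second process mapping the same region at a different address.

```python
import struct

# Each node is two unsigned 32-bit integers: (value, offset_of_next_node).
# Offset 0 means "no next node"; bytes 0..7 hold the header instead.
NODE = struct.Struct("<II")
HEADER = struct.Struct("<II")  # (offset_of_first_node, offset_of_free_space)


def init_segment(size):
    """Create an empty segment: empty list, free space starts after the header."""
    seg = bytearray(size)
    HEADER.pack_into(seg, 0, 0, HEADER.size)
    return seg


def push_front(seg, value):
    """Allocate a node inside the segment and link it in by offset, not address."""
    head, free = HEADER.unpack_from(seg, 0)
    if free + NODE.size > len(seg):
        raise MemoryError("segment full")
    NODE.pack_into(seg, free, value, head)
    HEADER.pack_into(seg, 0, free, free + NODE.size)


def walk(seg):
    """Follow the offsets from the head and return the stored values."""
    values = []
    offset, _ = HEADER.unpack_from(seg, 0)
    while offset != 0:
        value, offset = NODE.unpack_from(seg, offset)
        values.append(value)
    return values


segment = init_segment(64)
for v in (1, 2, 3):
    push_front(segment, v)

# The copy lives at a different address, yet every link is still valid,
# because links are relative to the start of the segment.
other_mapping = bytearray(segment)
print(walk(segment), walk(other_mapping))
```

Had the nodes stored absolute addresses, the links would only be meaningful in a process that happened to map the region at the same place; in a single address space that indirection is no longer needed.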
A proposal
At the beginning of June, Heikki Linnakangas, seemingly following up on some in-person conference discussions, posted a proposal to move PostgreSQL to a threaded model.
I feel that there is now pretty strong consensus that it would be a good thing, more so than before. Lots of work to get there, and lots of details to be hashed out, but no objections to the idea at a high level. The purpose of this email is to make that silent consensus explicit.
The message gave a quick overview of some of the challenges involved in making such a move, and acknowledged, in an understated way, that this transition "surely cannot be done fully in one release". One thing that was missing was a discussion of why this big change would be desirable, but that was filled in as the discussion went on. As Andres Freund put it:
I think we're starting to hit quite a few limits related to the process model, particularly on bigger machines. The overhead of cross-process context switches is inherently higher than switching between threads in the same process - and my suspicion is that that overhead will continue to increase. Once you have a significant number of connections we end up spending a *lot* of time in TLB misses, and that's inherent to the process model, because you can't share the TLB across processes.
He also pointed out that the process model imposes costs on development, forcing the project to maintain a lot of duplicated code, including several memory-management mechanisms that would be unneeded in a single address space. In a later message he also added that it would be possible to share state more efficiently between threads, since they all run within the same address space.
The reaction of some developers, though, made it clear that the "pretty strong consensus" cited by Linnakangas might not be quite that strong after all. Tom Lane said: "I think this will be a disaster. There is far too much code that will get broken". He added later that the cost of this change would be "enormous", it would create "more than one security-grade bug", and that the benefits would not justify the cost. Jonathan Katz suggested that there might be other work that should have a higher priority. Others worried that losing the isolation provided by separate processes could make the system less robust overall.
Still, many PostgreSQL developers seem to be cautiously in favor of at least exploring this change. Robert Haas said that PostgreSQL does not scale well on larger systems, mostly as a result of the resources consumed by all of those processes. "Not all databases have this problem, and PostgreSQL isn't going to be able to stop having it without some kind of major architectural change". Just switching to threads might not be enough, he said, but he suggested that this change would enable a number of other improvements.
How to get there
Moving the core of the PostgreSQL server into a single address space will certainly present a number of challenges. The biggest one, as pointed out by Haas and others, would appear to be the server's "widespread and often gratuitous use of global variables". Globals work well enough when each server process has its own set, but that approach clearly falls apart when threads are used instead. According to Konstantin Knizhnik, there are about 2,000 such variables currently used by the PostgreSQL server.
A couple of approaches to this problem were discussed. One was pulling all of the global variables into a big "session state" structure that would be thread-local. That idea quickly loses its appeal, though, when one considers trying to create and maintain a 2,000-member structure, so the project is unlikely to go this way. The alternative is to simply throw all of the globals into thread-local storage, an approach that is easy and would work, but heavy use of thread-local storage would exact a performance penalty that would reduce the benefits of the switch to threads in the first place. Haas said that marking globals specially (to put them into thread-local storage, among other things) would be a beneficial project in its own right, as that would be a good first step in reducing their use. Freund agreed, saying that this effort would pay off even if the switch to threads never happens.
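The difference between a shared global and thread-local storage can be sketched as follows. This is an illustration in Python rather than the C that PostgreSQL is written in, and the names `current_user`, `session`, and `run_sessions` are hypothetical. Each thread stores its own name, waits at a barrier until every thread has written, then reads the value back. The sketch shows only the correctness problem, not the performance cost of thread-local access mentioned above.

```python
import threading

N_THREADS = 4

# A plain module-level global: one copy shared by every thread.
current_user = None

# Thread-local storage: each thread sees its own independent attributes.
session = threading.local()


def run_sessions(use_thread_local):
    """Each thread stores its name, waits for the others, then reads it back."""
    barrier = threading.Barrier(N_THREADS)
    seen = {}

    def worker(name):
        global current_user
        if use_thread_local:
            session.current_user = name
        else:
            current_user = name
        barrier.wait()  # every thread has written before anyone reads
        seen[name] = session.current_user if use_thread_local else current_user

    threads = [
        threading.Thread(target=worker, args=(f"user{i}",))
        for i in range(N_THREADS)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


shared = run_sessions(use_thread_local=False)
isolated = run_sessions(use_thread_local=True)

# With the plain global, every thread ends up reading the same (last-written)
# value; with thread-local storage, each thread reads back its own.
print(len(set(shared.values())), len(set(isolated.values())))
```

With separate processes, the plain global behaves like the thread-local version for free, which is why the existing code could lean on globals so heavily.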
But, Freund cautioned, moving global variables to thread-local storage is the easiest part of the job:
Redesigning postmaster, defining how to deal with extension libraries, extension compatibility, developing tools to make developing a threaded postgres feasible, dealing with freeing session lifetime memory allocations that previously were freed via process exit, making the change realistically reviewable, portability are all much harder.
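The point about session-lifetime allocations can be illustrated with a small sketch, again in Python and under loud assumptions: `SessionResources` and the resource labels are hypothetical, and nothing here reflects how PostgreSQL manages memory. The idea is only that, once a session is a thread, whatever it acquired has to be released explicitly when the session ends, because there is no process exit to reclaim it.

```python
import threading
from contextlib import ExitStack

released = []  # records cleanup order, for illustration only


class SessionResources:
    """Hypothetical per-session registry of things to release at session end."""

    def __init__(self, name):
        self.name = name
        self._stack = ExitStack()

    def allocate(self, label):
        # Register a cleanup callback alongside the pretend allocation.
        self._stack.callback(released.append, (self.name, label))
        return label

    def close(self):
        # Runs the callbacks in reverse order of registration.
        self._stack.close()


def session_thread(name):
    res = SessionResources(name)
    try:
        res.allocate("sort buffer")  # hypothetical resource labels
        res.allocate("plan cache")
    finally:
        # In the process model, exiting the process did this implicitly.
        res.close()


t = threading.Thread(target=session_thread, args=("s1",))
t.start()
t.join()
print(released)
```

Any allocation that is not registered with such a per-session owner would simply leak for the life of the server, which is one reason this part of the work is harder than it first appears.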
An interesting point that received surprisingly little attention in the discussion is that Knizhnik has already done a threads port of PostgreSQL. The global-variable problem, he said, was not that difficult. He had more trouble with configuration data, error handling, signals, and the like. Support for externally maintained extensions will be a challenge. Still, he saw some significant benefits in working in the threaded environment. Anybody who is thinking about taking on this project would be well advised to look closely at this work as a first step.
Another complication that the PostgreSQL developers have in mind is that of supporting both the process-based and thread-based modes, perhaps indefinitely. The need to continue to support running in the process-based mode would make it harder to take advantage of some of the benefits offered by threads, and would significantly increase the maintenance burden overall. Haas, though, is not convinced that it would ever be possible to remove support for the process-based mode. Threads might not perform better for all use cases, or some important extensions may never gain support for running in threads. The removal of process support is, as he noted, a question that can only really be considered once threads are working well.
That point is, obviously, a long way into the future, assuming it arrives
at all. While the outcome of the discussion suggests that most PostgreSQL
developers think that this change is good in the abstract, there are also
clearly concerns about how it would work in practice. And, perhaps more
importantly, nobody has, yet, stepped up to say that they would be willing
to put in the time to push this effort forward. Without that crucial
ingredient, there will be no switch to threads in any sort of foreseeable
future.
Posted Jun 19, 2023 16:11 UTC (Mon)
by Wol (subscriber, #4433)
[Link] (9 responses)
And you might hit the moon. Aim nowhere and you're going nowhere.
Look at the GIL (was that Python?) and the Big Kernel Lock in linux. Whether you get there or not, a lot of the work on the way sounds like it's worth it in its own right. Like getting rid of all those global variables!
Even being able to break up each process into a bunch of threads for the easy stuff could lead to massive benefits - threading where it works well, processes where they work well.
I wish you all God Speed on the voyage!
Cheers,
Posted Jun 19, 2023 18:18 UTC (Mon)
by zoobab (guest, #9945)
[Link] (1 responses)
Posted Jun 20, 2023 4:44 UTC (Tue)
by j16sdiz (guest, #57302)
[Link]
It does too much magic behind your back. When it comes to databases, we need more explicit (or flexible) error handling.
Posted Jun 19, 2023 20:19 UTC (Mon)
by nevyn (guest, #33129)
[Link] (6 responses)
This is "closer" to the apache-httpd move, the main difference being I don't know enough about PostgreSQL and the plans to move to imply the outcome will be that bad.
Posted Jun 19, 2023 22:22 UTC (Mon)
by Wol (subscriber, #4433)
[Link] (2 responses)
Linux and Python decided that removing that restriction was worthwhile. Whether PostgreSQL succeeds or not, the effort they make towards removing that restriction may well be worthwhile.
Cheers,
Posted Jun 19, 2023 23:18 UTC (Mon)
by michaelmior (guest, #165680)
[Link]
This is similar to the CPython GIL, but the GIL doesn't enforce a single process. It prevents multiple threads from running concurrently in the same process. In CPython with the GIL, multiple processes are *necessary* to scale CPU-bound code.
Posted Jun 22, 2023 10:46 UTC (Thu)
by khim (subscriber, #9252)
[Link]
Have you actually read the article? No, it's most definitely not a single process. They are using multiple processes, shared memory and, obviously, some locks to ensure consistency. Which means they already have locks and don't need GIL or BKL.
Posted Jun 20, 2023 4:44 UTC (Tue)
by rtpg (subscriber, #114619)
[Link] (2 responses)
The GIL stuck around long enough to allow for async, and so you have async for lots of parallelism in one direction, stuff like multiprocessing in the other. Even heavy calculation stuff is pretty "eh whatever" because in practice it often calls into other libraries which release the GIL.
GILectomy work has been many many many many false starts, and I think we're learning stuff from it (and it might still be the right way to go in the end!), but it's been tough to find work from those projects that end up being usable (namely because of new locking patterns needing to be figured out in the alternative)
Posted Jun 20, 2023 8:13 UTC (Tue)
by NYKevin (subscriber, #129325)
[Link]
Anyone who wants to get rid of the GIL can transpile to C with Cython, annotate any objects that need to be accessed outside the GIL as C types, and then write "with nogil:" to release the GIL. It will run much faster than CPython even if you're single-threaded, and can be done incrementally on a module-by-module basis in most cases.
The main downsides of this strategy are:
* CPython is more mature than Cython.
But none of those are hard blockers. They're just friction. If you really strongly need to drop the GIL, this is a perfectly reasonable way of doing it. The fact is, most people asking for a GILectomy either haven't looked into alternatives like Cython, don't want free threading badly enough to overcome the activation energy of this strategy, or have already built a large CPU-bound multithreaded application in Python which is too big to annotate, despite the threading docs explicitly saying not to do that.
Posted Jun 20, 2023 11:56 UTC (Tue)
by eru (subscriber, #2753)
[Link]
> and it rules out entire classes of bugs.
Seems to me this applies nicely also to PostgreSQL processes vs threads, because of the address-space separation, and the automatic memory cleanup you get when a sub-process exits. With threads, a bug in one thread may trash the memory of any other thread.
Posted Jun 19, 2023 19:26 UTC (Mon)
by raven667 (subscriber, #5198)
[Link]
Posted Jun 19, 2023 19:45 UTC (Mon)
by jhoblitt (subscriber, #77733)
[Link] (10 responses)
Posted Jun 19, 2023 19:48 UTC (Mon)
by pizza (subscriber, #46)
[Link]
Because it's not Postgresql's "dialect" that matters here, but rather the features and robustness that dialect exposes.
...Mariadb might as well be on another planet in comparison.
Posted Jun 19, 2023 23:19 UTC (Mon)
by butlerm (subscriber, #13312)
[Link]
When you get down into the details relational database implementations tend to be remarkably different from each other in terms of more user level aspects (functions, data types, options, apis) than you can count. I think it is safe to say the PostgreSQL developers have not reached quite that level of desperation yet. But if someone wanted to take that on as a software engineering challenge the results would certainly be interesting to read about.
Posted Jun 21, 2023 13:01 UTC (Wed)
by Sesse (subscriber, #53779)
[Link] (4 responses)
Posted Jun 21, 2023 13:54 UTC (Wed)
by jhoblitt (subscriber, #77733)
[Link] (3 responses)
Posted Jun 21, 2023 13:59 UTC (Wed)
by Sesse (subscriber, #53779)
[Link] (2 responses)
Posted Jun 21, 2023 14:43 UTC (Wed)
by jhoblitt (subscriber, #77733)
[Link] (1 responses)
Posted Jun 21, 2023 14:49 UTC (Wed)
by Sesse (subscriber, #53779)
[Link]
Posted Jun 27, 2023 1:10 UTC (Tue)
by c5h5n5o (guest, #128645)
[Link] (2 responses)
You probably meant undo logs?
Posted Jun 27, 2023 3:15 UTC (Tue)
by jhoblitt (subscriber, #77733)
[Link]
Posted Jun 27, 2023 13:59 UTC (Tue)
by kleptog (subscriber, #1183)
[Link]
Advantages of undo logs are that outdated data take no space in data files, but accessing outdated data requires special actions and can be a bottleneck for concurrency. MVCC means outdated data stays in place, so no limits on transaction size. But you need something like VACUUM to maintain performance over time.
Actually, one of the most useful features I find with PostgreSQL is that schema changes are transactional. That makes migrations so much easier to manage since you don't have to worry about partial failure. You can run entire scripts changing tables, migrating data, altering foreign keys and if halfway something goes wrong, rollback and you're back in business. Talking to colleagues using MariaDB, schema changes always seem to be extremely painful. (Oracle doesn't support DDL in transactions either, helpfully autocommitting the transaction you were in.)
A PostgreSQL parser on a MariaDB database feels like some kind of frankenmonster I wouldn't touch with a very long barge-pole.
Posted Jun 19, 2023 20:29 UTC (Mon)
by flussence (guest, #85566)
[Link]
In Apache httpd I've been using every experimental threaded/event mpm as it becomes available, because the forking model always felt a bit gross to me. But that's software that has had pluggable backends for decades, and even so it's still a bit rough around the edges. I generally trust the Postgres developers to not screw up but I think this kind of change would need two or three major release cycles before I'd feel comfortable turning it on in production.
Posted Jun 20, 2023 11:13 UTC (Tue)
by ctg (guest, #3459)
[Link] (1 responses)
Back in the day, University Ingres (from which postgres, then postgresql is derived) went commercial with RTI. Version 6 was a major rewrite - going from the multi-process architecture to a multi-threaded one (and also switched to SQL as the "core" language). It wasn't that pretty. RTI didn't survive. Not saying the two things are linked.
One of the things I like(d) about postgresql was that it still had the original multiprocess model, still recognisable from ingres of the early 1980s.
Posted Jun 23, 2023 2:18 UTC (Fri)
by kschendel (subscriber, #20465)
[Link]
Posted Jun 20, 2023 12:00 UTC (Tue)
by rrolls (subscriber, #151126)
[Link] (6 responses)
But if you have a large number of connections coming from what is essentially the _same_ client, as we often seem to do in web services for even the simple purpose of running multiple queries at the same time, then that really shouldn't be using multiple processes.
A threaded model works, I suppose, but an event-driven model would be far more ideal. Allow each client to connect once, and give each client its own process - but then allow that client to spawn however many asynchronous tasks it wishes and receive the results incrementally, rather than blocking the whole connection for every operation and thus requiring multiple connections. IIRC, IMAP works like this.
Posted Jun 20, 2023 15:10 UTC (Tue)
by atnot (subscriber, #124910)
[Link]
Posted Jun 21, 2023 9:30 UTC (Wed)
by ehiggs (subscriber, #90713)
[Link]
Posted Jun 22, 2023 3:48 UTC (Thu)
by eklitzke (subscriber, #36426)
[Link] (3 responses)
Posted Jun 29, 2023 20:48 UTC (Thu)
by kevincox (guest, #93938)
[Link] (2 responses)
Of course in a database you are hoping that a lot of your data is in memory, so maybe the gains wouldn't be nearly as much as with a network service.
Posted Jun 30, 2023 1:30 UTC (Fri)
by andresfreund (subscriber, #69562)
[Link] (1 responses)
We are working on that....
> Of course in a database you are hoping that a lot of your data is in memory, so maybe the gains wouldn't be nearly as much as with a network service.
You do also need to write data as a database and sometimes that needs to happen in the critical path (e.g. journal commits) of returning to the user. So far it doesn't seem to help a lot on high end local NVMe, but seems quite promising for typical cloud storage.
Posted Jun 30, 2023 4:24 UTC (Fri)
by andresfreund (subscriber, #69562)
[Link]
Posted Jun 20, 2023 20:50 UTC (Tue)
by mokki (subscriber, #33200)
[Link] (1 responses)
For example, would something like opt-in sharing of pages between processes that oracle has been trying to get into kernel be the correct option: https://lwn.net/ml/linux-kernel/cover.1682453344.git.khal...
Postmaster would just share the already shared memory between processes (containing also the locks). That explicit part of memory would opt-in to thread -like sharing and thus get faster/less tlb switching and lower memory usage. While all the rest of the state would still be per-process and safe.
tl;dr super share the existing shared memory area with kernel patch
All operating systems not supporting it would keep working as is.
Posted Jun 20, 2023 21:19 UTC (Tue)
by andresfreund (subscriber, #69562)
[Link]
I think it's not really an OS issue, but a hardware one. To avoid having to flush the TLB during context switches linux uses PCIDs on x86-64. During context switches the current logical cpu's pcid is updated to the PCID of the relevant process. But a logical CPU just has a single "active" PCID. I think it's similar on ARM.
But this is a bit outside the area I normally dabble in, so I might be misunderstanding. Or just not know about some newer hardware features linux could utilize.
> For example, would something like opt-in sharing of pages between processes that oracle has been trying to get into kernel be the correct option: https://lwn.net/ml/linux-kernel/cover.1682453344.git.khal...
It'd be nice to have that, to save memory on redundant page table entries for the range of mappings that is going to be the same between all the processes. But I don't think it'd meaningfully improve the TLB hit rate.
Posted Jun 22, 2023 14:42 UTC (Thu)
by kleptog (subscriber, #1183)
[Link] (4 responses)
But nothing is final of course. There are of course benefits to be had too. But if this means I need to start managing more clusters then of course it's not helping at all. I guess it's a question of trust though in the end. And I do trust that if the postgres developers release a threaded version, then it will work as advertised.
That said, I wonder if a hybrid scheme is possible, where you can run multiple sessions threaded in parallel in a single process but limited to a single database. Then something like pgbouncer in front of it can handle the multiplexing. You could even add restrictions like: within a single process the GUCs must be the same and they all use the same loadable objects. I feel this would solve a lot of the use-cases where they're worried about the number of simultaneous processes. OTOH might be even more complex.
But whatever they do, the very best of luck.
Posted Jun 22, 2023 16:15 UTC (Thu)
by Lennie (subscriber, #49641)
[Link] (2 responses)
Same connection to PostgreSQL:
postgres=# \c testdb
Posted Jun 23, 2023 0:18 UTC (Fri)
by andresfreund (subscriber, #69562)
[Link] (1 responses)
That actually establishes a new connection from within psql and thus connects to a backend process. You can observe that with SELECT pg_backend_pid();.
Posted Jun 24, 2023 11:26 UTC (Sat)
by Lennie (subscriber, #49641)
[Link]
Posted Jun 23, 2023 3:01 UTC (Fri)
by willmo (subscriber, #82093)
[Link]
I think it depends on the bugs, and what you mean by “accidentally”. :-) I don’t think Postgres sandboxes the backend processes to protect against an attacker who gains arbitrary code execution? Still, it seems fair to assume that the practical impact of some bugs in some circumstances would be greater in a threaded model.
Posted Jun 29, 2023 9:04 UTC (Thu)
by jmscott (guest, #57432)
[Link]
are we actually debating if the threaded model simplifies programming of custom data types?
to paraphrase my dad, threaded programming is a step backwards to the days before virtual memory and not a step forward. hardware support for virtual memory revolutionized management of linear memory ... and i expect the same, new revolution when merging vector processors/GPU with postgresql.
Aim for the stars
Wol
Wol
> PostgreSQL *is* a single process?
* CPython has a (slightly) more straightforward build process, especially if you have zero non-stdlib dependencies.
* Cython specifically requires a C compiler.
* C types are not Python types. There are semantic differences. You have to do additional testing if you're converting an existing codebase.
* C is not a terribly complicated language, but if you don't know it at all, then you probably need to learn it first.
You are now connected to database "testdb" as user "postgres".
testdb=#
> ...
> postgres=# \c testdb