Restartable sequences
Some workloads found at Google make use of per-thread free lists for the malloc() function, Paul said. This technique performs well, but it also eats up a lot of memory; that has led to an interest in using per-CPU free lists instead. The idea is to let threads detect if they have been interrupted while in a critical section and, if so, restart their operation from the beginning. Restartable sequences allow this kind of pattern with no locking and with no need for atomic variables. Paul suggested that it might prove useful for realtime developers as well.
Andy then said that he really didn't like the early attempts at support for restartable sequences. He likes it when debuggers work and context switches have sane semantics; the patches ran counter to both of those. He also was not a fan of accessing user-space memory during scheduling. The work has progressed, though, and could benefit from more review. In particular, there is an interest in having non-x86 developers look at the patches to see whether this functionality could be supported on their architectures.
Chris Mason noted that using restartable sequences cuts memory usage by 20% in a workload he has looked at; he described it as "a big deal." David Howells asked what was required from the kernel to support this functionality. Andy's answer was that user space needs to be able to register a critical section with the kernel. If a process is interrupted while executing within that region, it jumps to a specific recovery address when it resumes executing. That recovery code can then do whatever is needed, which usually is a matter of just restarting the operation from the beginning.
Ben Herrenschmidt asked whether it was possible to register more than one critical section; that would be important for libraries to be able to use this facility. The answer was that critical sections can be nested, so library use should be possible.
Paul noted that a new patch series had been posted that morning. Are there, he asked, any objections to the concept in general or to the patches in particular?
Andy responded that he still doesn't like the idea of context switches
having side effects. The current patches seem to be getting better in that
regard. Josh Triplett noted that restartable sequences could be useful for
timing sections of code; Paul agreed, and said they could be used for
user-space read-copy-update implementations as well. In general, objections
were scarce, but the real proof will be in how and when the patches are
accepted.
Index entries for this article | |
---|---|
Kernel | Restartable sequences |
Conference | Kernel Summit/2015 |
Posted Nov 6, 2015 0:58 UTC (Fri)
by ncm (guest, #165)
[Link]
It is finicky to get the mechanism and interface right, but the benefits could be huge.
Posted Nov 8, 2015 22:25 UTC (Sun)
by kleptog (subscriber, #1183)
[Link] (3 responses)
Mind you, this also seems really close to transactional memory. That seems a more generally useful feature which would solve the problem as well. In the example given, it doesn't actually matter if a block is accidentally pushed onto the wrong per-CPU list due to a reschedule, just as long as the in memory structures aren't corrupted.
But I don't know if you can efficiently do transactional memory without hardware support, even if the kernel is on board.
Posted Nov 8, 2015 23:14 UTC (Sun)
by andresfreund (subscriber, #69562)
[Link] (2 responses)
I can't immediately see how (in a useful way at least).
> know PostgreSQL under high load runs into issues where code gets rescheduled while holding a spinlock, leading to much joy all round.
I hope^Wthink we fixed most of these issues in PostgreSQL 9.5. The biggest problem was that spinlocks were used to protect the internal data structure of a read/write lock that supports queuing - which is horrible if you have dozens of processes acquiring locks in share mode. That's mostly been fixed.
There's a few places where spinlocks are still used (notably buffer pins/refcounts), but it's not easy to see a problem on 2 socket machines there. Hopefully we'll have that fixed in 9.6.
Posted Nov 10, 2015 21:02 UTC (Tue)
by eternaleye (guest, #67051)
[Link] (1 responses)
I'd suspect in much the same way as any adaptive spinlock - only instead of some fixed (or variable) number of spins before it falls back to blocking, it spins inside the restartable sequence and the recovery code blocks.
Posted Nov 10, 2015 21:43 UTC (Tue)
by andresfreund (subscriber, #69562)
[Link]
Posted Nov 9, 2015 0:38 UTC (Mon)
by npitre (subscriber, #5680)
[Link]
https://www.kernel.org/doc/Documentation/arm/kernel_user_...
Upon every kernel entry, the user space program counter is reset to
Restartable sequences
Restartable sequences
Restartable sequences
Restartable sequences
Restartable sequences
Restartable sequences
operations with CPUs lacking the necessary native instructions without
having to callinto the kernel to perform those operations.
the beginning of given operation if it is found to be within the critical
area for that operation. Given those operations are provided by thekernel,
finding out if user space was in the middle of a critical area is very
easy and quick.