Alternatives to fibrils
Linus Torvalds got inspired to create an
asynchronous system call patch of his own. Simplicity is the word to
describe this patch: it adds less than 200 lines of code to the kernel
("I even added comments, so a lot of the few new added lines aren't
even code!"). It works like this:
- The new async() system call takes a system call number,
arguments for the system call, and a pointer to a location for the
final status code.
- The process's register set is saved, then the system call is executed
as usual.
- Should the kernel call schedule(), meaning that the system
call is about to block, the process will fork before blocking.
- The new child process returns to user space and continues executing there. Meanwhile, the original process will finish out the asynchronous system call.
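The steps above can be pictured with a toy user-space model. This is only a sketch of the control flow, not code from Linus's patch: `async_model()`, `struct async_result`, the `would_block` parameter (standing in for "the kernel reached schedule()"), and the process IDs passed in are all hypothetical names invented for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; every name here is hypothetical, not kernel API. */
typedef long (*syscall_fn)(long arg);

struct async_result {
    int  user_pid;      /* which process continues in user space */
    bool status_ready;  /* has *status been written yet? */
};

/*
 * `would_block` stands in for "the system call reached schedule()".
 * `pid` is the calling process, `child_pid` the process a fork would create.
 */
static struct async_result async_model(syscall_fn fn, long arg, bool would_block,
                                       int pid, int child_pid, long *status)
{
    struct async_result r;

    if (!would_block) {
        /* Fast path: no fork, the call simply completes synchronously. */
        *status = fn(arg);
        r.user_pid = pid;
        r.status_ready = true;
    } else {
        /*
         * Slow path: a child is forked and returns to user space, while
         * the original process stays behind to finish the call. The
         * status is not available yet; user space would have to poll it.
         */
        r.user_pid = child_pid;
        r.status_ready = false;
    }
    return r;
}

/* Stand-in for a system call body. */
static long double_it(long x)
{
    return 2 * x;
}
```

In the non-blocking case the model does no extra work at all, which is the property the patch is meant to demonstrate; in the blocking case the caller resumes under a different process ID.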
The largest claimed advantage of this patch, beyond its simplicity, is that there is almost no overhead if the asynchronous system call can be completed without blocking. The fibril patch, by contrast, always runs asynchronous calls in independent fibrils. Linus claims that almost all asynchronous system calls can, in fact, be completed synchronously without blocking, so he would really rather see little or no up-front cost in that case.
There are various issues with Linus's patch. If the asynchronous call blocks, for example, the return to user space will happen in a different process - a change which could prove confusing to user space. Only one asynchronous operation can be outstanding at any given time. There is also no way to wait for an asynchronous operation to complete except to poll the exit status. But this patch was never meant to be a complete solution; as a proof of concept it is interesting.
For a rather more elaborate approach, Ingo Molnar's syslet patchset is worth a look. With syslets, a user-space program can run system calls asynchronously. Beyond that, however, it can load little programs into the kernel and let them run independently.
To use syslets, the application starts by filling in one of these structures:
    struct syslet_uatom {
        unsigned long flags;
        unsigned long nr;
        long *ret_ptr;
        struct syslet_uatom *next;
        unsigned long *arg_ptr[6];
        void *private;
    };
Here, nr is the number of the system call to run, arg_ptr holds pointers to the arguments, and ret_ptr tells the kernel where to put the final status from the call. The private field is not used by the kernel at all. We'll get to the other fields shortly.
Once the syslet_uatom structure is ready, the application can run it with:
long async_exec(struct syslet_uatom *atom);
This call will start on the requested system call immediately. If that system call never blocks, it will be run synchronously and the address of the atom will be returned from async_exec(). Otherwise the kernel will grab a thread from a pool and use that thread to return to user space, continuing the system call in the original thread. The application can then go off and do whatever makes sense - including running more syslets - while the system call runs to completion.
What actually happens when the system call completes is a little more complex and interesting, however. Unless user space has requested otherwise, the kernel does not just complete the syslet after the first system call runs; instead, it looks at the next field of the syslet_uatom structure. If that field is non-NULL, it is taken as the user-space address of the next syslet to be run by the kernel. In other words, an application is not restricted to running individual asynchronous system calls; it can chain up a whole series of them to run without ever exiting the kernel. The cost of fetching a new syslet atom is far less than a transition to user space and back, so there is a significant performance improvement to be had just by chaining two system calls together.
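As a rough illustration of chaining, the sketch below fills in two atoms and links them through the next field. The structure layout is the one given above; `atom_init()`, `atom_chain()`, and `chain_length()` are hypothetical helpers written for this example, and any system call numbers passed to them are placeholders. Nothing here invokes async_exec(); it only builds the data the kernel would walk.

```c
#include <stddef.h>

/* Structure layout as given in the article. */
struct syslet_uatom {
    unsigned long flags;
    unsigned long nr;
    long *ret_ptr;
    struct syslet_uatom *next;
    unsigned long *arg_ptr[6];
    void *private;
};

/* Hypothetical helper: fill in one atom with no arguments and no successor. */
static void atom_init(struct syslet_uatom *atom, unsigned long nr, long *ret)
{
    size_t i;

    atom->flags = 0;
    atom->nr = nr;          /* system call number (placeholder values in use) */
    atom->ret_ptr = ret;    /* where the final status should be stored */
    atom->next = NULL;      /* NULL ends the chain */
    for (i = 0; i < 6; i++)
        atom->arg_ptr[i] = NULL;
    atom->private = NULL;   /* not used by the kernel */
}

/* Hypothetical helper: make `second` run after `first`. */
static void atom_chain(struct syslet_uatom *first, struct syslet_uatom *second)
{
    first->next = second;
}

/* Count the atoms in a NULL-terminated chain (assumes the chain has no loop). */
static size_t chain_length(const struct syslet_uatom *atom)
{
    size_t n = 0;

    for (; atom != NULL; atom = atom->next)
        n++;
    return n;
}
```

Note that arg_ptr holds pointers to the arguments rather than the values themselves, so an argument can be changed between runs of an atom without rebuilding it.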
The remaining field in struct syslet_uatom is flags, which holds a set of flags controlling how syslets are executed. Four of these flags (SYSLET_STOP_ON_NONZERO, SYSLET_STOP_ON_ZERO, SYSLET_STOP_ON_NEGATIVE, and SYSLET_STOP_ON_NON_POSITIVE) will test the result of the current atom's system call and, possibly, terminate execution of the syslet. In this way, for example, a chain of system calls can be stopped early if one of them fails. It is also possible to create a kernel-space loop which reads a file until no more data is available.
The SYSLET_SKIP_TO_NEXT_ON_STOP flag modifies the above flags so that, rather than terminating the syslet, the kernel skips to an atom found immediately after the current one in the process's address space. This flag allows a syslet to terminate a loop and move on to further processing within the syslet. If an application knows that a syslet will block, it can request asynchronous execution from the outset with SYSLET_ASYNC. There is also a SYSLET_SYNC flag which causes the whole thing to run synchronously.
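The interplay of the stop flags, the next pointer, and SYSLET_SKIP_TO_NEXT_ON_STOP may be easier to follow in a small user-space model. The sketch below is an assumption-laden illustration, not the kernel's implementation: the numeric flag values are placeholders (the article does not give them), `struct model_atom` replaces the system call number with a function pointer so the example can run without the patch, and `run_model()`, `fake_read()`, and `fake_close()` are hypothetical.

```c
#include <stddef.h>

/* Placeholder values: the real numeric values are not given in the article. */
enum {
    SYSLET_STOP_ON_NONZERO      = 1 << 0,
    SYSLET_STOP_ON_ZERO         = 1 << 1,
    SYSLET_STOP_ON_NEGATIVE     = 1 << 2,
    SYSLET_STOP_ON_NON_POSITIVE = 1 << 3,
    SYSLET_SKIP_TO_NEXT_ON_STOP = 1 << 4
};

/* Model atom: a callback stands in for the system call number. */
struct model_atom {
    unsigned long flags;
    long (*call)(void *ctx);
    void *ctx;
    long *ret_ptr;
    struct model_atom *next;
};

/* Does the result of the call trigger one of the stop conditions? */
static int stop_condition(unsigned long flags, long ret)
{
    return ((flags & SYSLET_STOP_ON_NONZERO) && ret != 0) ||
           ((flags & SYSLET_STOP_ON_ZERO) && ret == 0) ||
           ((flags & SYSLET_STOP_ON_NEGATIVE) && ret < 0) ||
           ((flags & SYSLET_STOP_ON_NON_POSITIVE) && ret <= 0);
}

/* Walk a syslet the way the text describes; returns the last atom executed. */
static struct model_atom *run_model(struct model_atom *atom)
{
    struct model_atom *last = NULL;

    while (atom != NULL) {
        long ret = atom->call(atom->ctx);

        if (atom->ret_ptr != NULL)
            *atom->ret_ptr = ret;
        last = atom;

        if (stop_condition(atom->flags, ret)) {
            /* Either end the syslet or skip to the atom stored right after. */
            atom = (atom->flags & SYSLET_SKIP_TO_NEXT_ON_STOP) ? atom + 1 : NULL;
        } else {
            atom = atom->next;
        }
    }
    return last;
}

/* Hypothetical stand-ins for read() and close() on a file. */
struct fake_file { int chunks_left; int reads; int closed; };

static long fake_read(void *ctx)
{
    struct fake_file *f = ctx;

    f->reads++;
    if (f->chunks_left > 0) {
        f->chunks_left--;
        return 512;     /* pretend 512 bytes were read */
    }
    return 0;           /* end of file */
}

static long fake_close(void *ctx)
{
    struct fake_file *f = ctx;

    f->closed = 1;
    return 0;
}
```

A read loop in this model is an atom whose next field points back at itself, with SYSLET_STOP_ON_NON_POSITIVE and SYSLET_SKIP_TO_NEXT_ON_STOP set; when the read returns zero, execution falls through to the atom placed immediately after it in the array, which here would be the close.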
Syslets do not have any variables of their own. To help with the writing of useful programs, Ingo has added a new system call:
long umem_add(unsigned long *pointer, unsigned long increment);
This call simply adds the given increment to *pointer, returning the resulting value.
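In user-space terms, the described behavior amounts to the following. This is a plain C model of the semantics stated above, under the assumption that nothing else is involved; `umem_add_model()` is a hypothetical name, and the real system call would operate on user memory from within the kernel.

```c
/*
 * Model of the described umem_add() semantics: add `increment` to the
 * value at `pointer` and return the resulting value.
 */
static long umem_add_model(unsigned long *pointer, unsigned long increment)
{
    *pointer += increment;
    return (long)*pointer;
}
```

An atom running this call could, for example, advance a counter or an offset that a later atom reads through its arg_ptr entries, giving a syslet something like a variable.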
The application can register a ring buffer with the kernel using the async_register() system call. Whenever an atom completes, its address will be stored in the next ring buffer entry; the application can then use that address to find the system call status. The kernel will not overwrite non-NULL ring buffer entries, so the application must reset them as it consumes them. If the application needs to wait for syslet completion, it can call:
long async_wait(unsigned long min_events);
This call will block the process until at least min_events have been stored into the ring buffer.
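The consumer side of the ring buffer might look something like the sketch below. It follows only what the text states: completed atom addresses appear in successive entries, the status is found through the atom, and consumed entries must be reset to NULL. The function `drain_ring()`, the `completion_fn` callback type, and the `sum_status()` handler are hypothetical, and the sketch ignores memory-ordering concerns that real code sharing a buffer with the kernel would have to address.

```c
#include <stddef.h>

/* Structure layout as given in the article. */
struct syslet_uatom {
    unsigned long flags;
    unsigned long nr;
    long *ret_ptr;
    struct syslet_uatom *next;
    unsigned long *arg_ptr[6];
    void *private;
};

/* Hypothetical callback invoked once per completed atom. */
typedef void (*completion_fn)(struct syslet_uatom *atom, long status, void *data);

/*
 * Consume completed atoms starting at index *head. Each consumed entry is
 * reset to NULL so that the slot can be reused. Returns the number of
 * completions handled; at most `size` entries are examined per call.
 */
static size_t drain_ring(struct syslet_uatom **ring, size_t size, size_t *head,
                         completion_fn done, void *data)
{
    size_t n = 0;

    while (n < size && ring[*head] != NULL) {
        struct syslet_uatom *atom = ring[*head];
        long status = atom->ret_ptr ? *atom->ret_ptr : 0;

        ring[*head] = NULL;             /* hand the slot back */
        *head = (*head + 1) % size;     /* advance with wrap-around */
        done(atom, status, data);
        n++;
    }
    return n;
}

/* Example handler: accumulate the status values into a long. */
static void sum_status(struct syslet_uatom *atom, long status, void *data)
{
    (void)atom;
    *(long *)data += status;
}
```

An application would typically call async_wait() when such a drain pass finds nothing to consume and there is no other work to do.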
This patch set, too, presents a number of unanswered questions. Once
again, signal handling has been punted for now. There's no end of security
implications which must be thought out; in the end, a number
of system calls will probably be marked as being off-limits for asynchronous
execution. There has still been no discussion on how this sort of
interface would play with the kevent patches - kevents seem to be a concept
that nobody wants to talk about at the moment. 64/32-bit compatibility
could present interesting challenges of its own. And so on.
But the initial reaction to syslets appears to be positive (though Linus
hates it); syslets might just point to the form of the fibril idea which
eventually makes it into the mainline kernel.
Alternatives to fibrils
Posted Feb 15, 2007 19:11 UTC (Thu) by jospoortvliet (guest, #33164)
>But the initial reaction to syslets appears to be
>positive (though Linus hates it); syslets might
>just point to the form of the fibril idea which
>eventually makes it into the mainline kernel.
This is why free software is special - he just assumes it can get into the
kernel, even tough Linus doesn't like it. Consensus is so 'normal' in FOSS
development you don't notice it all the time, but it's a great thing. It's
the reason for creativity and the 'free spirit'. I love it. I wish more
things worked this way... Politics anyone? I'm liberal by nature, and I
think freedom will always find it's way. As long as there is a community
watching it and protecting it. Wiki's, blogs, podcasts, they are
transforming the world. I hope we can protect it from you-know-who (from
evil governments to RIAA, from terrorists and other fanatics to greedy
companies).

This isn't slashdot.
Posted Feb 16, 2007 4:55 UTC (Fri) by ncm (guest, #165)
The possessive for "it" is "its". The plural of "wiki" is "wikis". That is all.

This isn't slashdot.
Posted Feb 22, 2007 8:33 UTC (Thu) by irios (guest, #19838)
Right about "Wiki's", but both the "It's", meaning "It is" rather than the possesive, are correct.

This isn't slashdot.
Posted Feb 28, 2007 20:25 UTC (Wed) by tlw (guest, #31237)
... so that just leaves the third "it's"...
> I'm liberal by nature, and I think
> freedom will always find it's way.
which is incorrect. Freedom probably won't "find it is way".

Alternatives to fibrils
Posted Feb 16, 2007 20:46 UTC (Fri) by proski (subscriber, #104)
> This is why free software is special - he just assumes it can get into the
> kernel, even tough Linus doesn't like it.
Who is "he"? Ingo or our editor? If Ingo, he wrote the patch before Linus commented on it. If it's the editor, please note the word "eventually". I don't think anybody assumes that Ingo's patch can go to the kernel as is.
I think you are generalizing too much. Other projects are run in a different way. Consensus doesn't always work. Although if developers are motivated to stick together, they will look for a solution that doesn't alienate any of them.

Stupid question
Posted Feb 16, 2007 19:47 UTC (Fri) by spitzak (guest, #4593)
In both cases the system call is done by the parent process and the return is the child process. Why can't this be done the other way around, where the asyncrhonous call is being done by the child and it returns immediately to the parent? That would make a lot more sense, so I assumme there is a sensible reason that I can't figure out for how they are doing this.

Stupid question - that Linus anticipated
Posted Feb 17, 2007 2:05 UTC (Sat) by ds2horner (subscriber, #13438)
In his explanation it was so that the fork code could be reused with no modifications. He implies it could be done if it matters to callers.
Linus:
> Now, I agree that this is a bit ugly in some of the details: in
> particular, it means that if the system call blocks, we will literally
> return as a *different* thread to user space. If you care, you shouldn't
> use this interface, or come up with some way to make it work nicely (doing
> it this way meant that I could just re-use all the clone/fork code as-is).

Stupid question - that Linus anticipated
Posted Feb 20, 2007 0:28 UTC (Tue) by mikov (guest, #33179)
How could it _not_ matter to callers ? If the thread id can change arbitrarily based on factors outside of the applications control - e.g. if some driver buffer is empty - then the thread id becomes completely pointless.
To me this seems completely unacceptable. Am I missing something ?

Stupid question - that Linus anticipated
Posted Feb 22, 2007 21:02 UTC (Thu) by huaz (guest, #10168)
You are right, it's indeed unacceptable.

Cancellation of an operation
Posted Feb 20, 2007 2:09 UTC (Tue) by mikov (guest, #33179)
Has there been any discussion of canceling an asynchronous operation ? It seems to me that there may not be a structured way to do it using fibrils/kernel_threads because there are no formally defined states where a request sits and can be canceled.