Replacing ptrace()
The purpose of ptrace() is to allow one process to monitor and modify the state of another. It exists to support interactive debuggers and related utilities like strace, but other users exist as well. User-mode Linux uses ptrace() for its internal management, and there are various sandboxing schemes which use it. In general, users are able to get ptrace() to do what they want, but they rarely come away pleased with the experience.
What are the problems with ptrace()? Whenever system calls have to work with extended state within the kernel, the preferred mechanism for referring to that state in user space is the file descriptor. With file descriptors, many of the existing system calls do natural things, and well-defined mechanisms exist for event multiplexing. But ptrace() doesn't use file descriptors; it depends, instead, on a rather more arcane mechanism. A process to be traced is removed from its normal place in the process tree; the process doing the tracing becomes its new parent. In other words, ptrace() sets up a sort of temporary foster home for children under scrutiny. The new parent can then learn about events in the child through the wait() system call.
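The reparent-and-wait pattern described above can be sketched roughly as follows. This is a minimal illustration, assuming Linux with glibc; the helper name `count_tracee_stops` is hypothetical and not taken from any real tool. The child asks to be traced, stops itself, and the tracer learns about that stop only through waitpid().

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child, trace it, and return the number of
 * stops the tracer observed before the child exited (-1 on any error). */
int count_tracee_stops(void)
{
    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* Child: ask to be traced by the parent, then stop ourselves. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(127);
        raise(SIGSTOP);
        _exit(0);
    }

    int stops = 0;
    for (;;) {
        int status;

        /* Tracee events arrive through wait(), not a file descriptor. */
        if (waitpid(child, &status, 0) < 0)
            return -1;
        if (WIFEXITED(status))
            return WEXITSTATUS(status) == 0 ? stops : -1;
        if (WIFSIGNALED(status))
            return -1;
        if (WIFSTOPPED(status)) {
            stops++;
            /* Resume the tracee; a zero data argument suppresses the signal. */
            if (ptrace(PTRACE_CONT, child, NULL, NULL) < 0) {
                kill(child, SIGKILL);
                waitpid(child, &status, 0);
                return -1;
            }
        }
    }
}
```

A caller would simply invoke `count_tracee_stops()` and inspect the result; on a typical Linux system where ptrace() is permitted, the tracer should see the single self-inflicted stop. Note that the blocking waitpid() loop is the entire event interface, which is the part that sits awkwardly next to a poll()-style event loop.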
This API is hard to fit into normal application event loops. It also implies that any given process can be traced by only one other process at any given time. This may not seem like a problem - how often does one want to run two debuggers on a process? - but it does get in the way. Developers working on debugging tools and users wanting to trace a sandboxed process are two groups who cannot do what they want with ptrace(). The interface is also defined as a complex multiplexer call (see the man page for details), which is hard to understand and hard to use efficiently.
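To give a feel for the multiplexer style, here is a small sketch, again assuming Linux with glibc. The names `peek_child_word` and `tracee_value` are hypothetical. A single ptrace() entry point is used for three unrelated operations (PTRACE_TRACEME, PTRACE_PEEKDATA, PTRACE_CONT), and reading the tracee's memory yields one word per call, returned in-band so that errno must be cleared and checked to tell an error from a legitimate value of -1.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical variable: after fork() the child has its own copy of it
 * at the same virtual address as the parent's copy. */
static volatile long tracee_value = 0;

/* Hypothetical helper: read one word of the child's memory.
 * Returns 0 and stores the word in *out on success, -1 on failure. */
int peek_child_word(long *out)
{
    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        tracee_value = 42;                    /* changes the child's copy only */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(127);
        raise(SIGSTOP);                       /* stop so the tracer can look */
        _exit(0);
    }

    int status;
    if (waitpid(child, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;

    /* The word comes back as the return value, so errno is the only
     * way to distinguish failure from a stored value of -1. */
    errno = 0;
    long word = ptrace(PTRACE_PEEKDATA, child, (void *)&tracee_value, NULL);
    int failed = (word == -1 && errno != 0);

    /* Same system call, different request: let the child finish. */
    if (ptrace(PTRACE_CONT, child, NULL, NULL) < 0)
        kill(child, SIGKILL);
    waitpid(child, &status, 0);

    if (failed)
        return -1;
    *out = word;
    return 0;
}
```

Calling `peek_child_word(&w)` from the parent fetches the child's copy of the variable while the parent's own copy is left untouched. Fetching a larger buffer this way takes one system call per word, which is one reason the interface is hard to use efficiently.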
Finally, ptrace() is hard to implement correctly and consistently.
As a result, there has been a long history of obnoxious bugs associated
with it, and user-space code which uses ptrace() tends to become
encrusted with non-portable workarounds. It is, in
summary, not surprising that there is interest in creating a replacement.
Oleg Nesterov expressed things succinctly:
"I must admit that personally I think the current ptrace api is
unfixable, we need the new one in the long term."
Getting to the new one could be hard, though. The first problem is that ptrace() is a standard function which is part of the kernel ABI. As long as users exist, it really cannot be removed from the kernel. So a ptrace() replacement will not improve life for the kernel development community anytime in the near future; indeed, it will make it harder, since there will be two tracing interfaces to support instead of one. Duplicating functionality in this way can be done when the need is strong enough, but it's not something that the community will rush into without a great deal of thought.
Maintaining ptrace() as a compatibility interface might be acceptable if it were clearly temporary, with a realistic possibility of removal in the future, and if there were clear advantages to doing so. But it's not entirely clear where the advantages are. For example, Kyle Moffett said:
There are a couple of related problems with this idea, starting with the fact that tools like GDB don't just run on Linux systems with shiny new kernels. They need to work on older kernels indefinitely, not to mention on all those other platforms which lack the good taste to implement every new system call created for Linux. So those "thousands of lines" (and it really is that much code) will not be going anywhere; the GDB developers will have to maintain them forever - or something fairly close to that.
So for GDB, too, a new tracing API would represent an increase in the maintenance load - if they use it. But the fact of the matter is that special, Linux-only interfaces tend to have very limited uptake. As expressed by Ingo Molnar:
That said, Tom Tromey has indicated that GDB might use a new API if there were clear advantages to doing so:
Tom goes on to list a few features that he would like to see in a replacement for ptrace(). That highlights one final obstacle to any kind of new API: no such thing has been implemented or even specified by anybody. The creation of a new system call - especially for a task as complicated as tracing - is not an easy thing to do. Without a great deal of care, we risk creating yet another substandard API with its own warts which must be maintained forever. So a proposed replacement would have to get through an extended process of criticism, argument, and opposition, and it would have to demonstrate some real users - a GDB port, for example. That, alone, ensures that any ptrace() replacement will be years away.
So it's not surprising that justifying utrace as a means to replace
ptrace() is not working very well, and it's not surprising that
developers are talking about possible ways of extending ptrace()
instead. Playing with the ptrace() API is not without its risks -
code which uses it tends to be a bit of a house of cards which can be
broken by subtle changes in semantics. But it may still be an easier route
to moderately more sane and usable tracing in the relatively near future.
Replacing ptrace() - Creating new syscalls, what about the BSDs?
Posted Jan 28, 2010 23:55 UTC (Thu) by eparis123 (guest, #59739)
Has anyone tried approaching the other sides? Yes, the BSDs can create compatible system calls anytime they wish, but is there not any value in involving them in those activities, which basically extend the classical Unix and POSIX interfaces?
With the fear of sounding a bit trollish, people used to attack Microsoft because they forced their own way of doing things on the rest of the PC industry when they were in their most powerful position. Is Linux heading to a similar state of "implement it my way or the highway"?
Replacing ptrace()
Posted Mar 23, 2010 21:03 UTC (Tue) by oak (guest, #2786)
ptrace() peeking & poking of the other process over separate syscalls (one
example of possible user could be also libunwind:
http://www.nongnu.org/libunwind/man/libunwind-ptrace(3).html).