Another new ABI for fanotify
The first obstacle has been more-or-less overcome. Even developers who think that malware scanning is the worst sort of security snake oil can agree that having these utilities use a well-defined kernel interface is better than having them employ nasty tricks like hooking into the system call table. ABI difficulties can be harder to overcome, though. With the latest fanotify posting, developer Eric Paris may have resolved this issue for at least a portion of the fanotify functionality.
The new version does away with the novel interface using setsockopt() in favor of a couple of new system calls. The first of these is fanotify_init():
int fanotify_init(unsigned int flags, unsigned int event_f_flags, int priority);
This system call initializes the fanotify subsystem, returning a file descriptor which is used for further operations. Two flag values are implemented: FAN_NONBLOCK creates a nonblocking file descriptor, and FAN_CLOEXEC sets the close-on-exec flag. Currently, event_f_flags and priority are unused; they should be set to zero.
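As a rough sketch of how this call looks from user space: the code below uses the fanotify_init() wrapper that glibc later shipped in <sys/fanotify.h>. That wrapper takes only flags and event_f_flags, with no priority argument, so it is not the exact prototype from this posting. The helper name open_fanotify_group() is invented for illustration, and the call typically fails with EPERM for callers lacking CAP_SYS_ADMIN.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/fanotify.h>

/* Hypothetical helper (not part of any API): create a nonblocking,
 * close-on-exec fanotify descriptor.
 * Returns the descriptor, or -1 with errno set. */
int open_fanotify_group(void)
{
    int fd;

    /* The posting asks for event_f_flags to be zero; this sketch
     * assumes O_RDONLY is 0, as it is on Linux. */
    assert(O_RDONLY == 0);

    fd = fanotify_init(FAN_CLOEXEC | FAN_NONBLOCK, O_RDONLY);
    if (fd == -1)
        perror("fanotify_init");
    return fd;
}
```

A caller would keep the returned descriptor for later fanotify_mark() and read() calls, and close() it when done.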
Management of notification events is then done with fanotify_mark():
int fanotify_mark(int fanotify_fd, unsigned int flags, int dfd, const char *pathname, u64 mask, u64 ignored_mask);
This call is used to "mark" specific parts of the filesystem hierarchy, indicating an interest in events involving those files. fanotify_fd is the file descriptor returned by fanotify_init(). The flags parameter must be one of FAN_MARK_ADD or FAN_MARK_REMOVE, indicating whether this call adds new marks or removes existing ones; there are also a couple of flags to control following of symbolic links and the marking of directories (without their contents).
The file(s) to be marked are determined by dfd and pathname; these parameters work much like in any of the *at() system calls. If dfd is AT_FDCWD, the pathname is resolved using the current working directory. If, instead, dfd points to a directory, the pathname lookup starts at that directory. If pathname is NULL, though, then dfd is interpreted as the actual object to mark.
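The two ways of naming an object can be sketched as follows. This assumes the fanotify_mark() wrapper later provided by glibc, whose argument order differs from the prototype above: the mask comes before the directory descriptor, and there is no separate ignored_mask parameter. The helper names mark_path() and mark_fd() are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>          /* AT_FDCWD */
#include <stddef.h>
#include <sys/fanotify.h>

/* Hypothetical helper: request open and close-after-write events on
 * `pathname`, resolved relative to the current working directory.
 * Returns 0 on success, or -1 with errno set. */
int mark_path(int fanotify_fd, const char *pathname)
{
    assert(pathname != NULL);
    return fanotify_mark(fanotify_fd, FAN_MARK_ADD,
                         FAN_OPEN | FAN_CLOSE_WRITE,
                         AT_FDCWD, pathname);
}

/* Hypothetical helper: mark the object that `object_fd` itself refers
 * to, by passing a NULL pathname.
 * Returns 0 on success, or -1 with errno set. */
int mark_fd(int fanotify_fd, int object_fd)
{
    return fanotify_mark(fanotify_fd, FAN_MARK_ADD,
                         FAN_OPEN | FAN_CLOSE_WRITE,
                         object_fd, NULL);
}
```

Passing a directory descriptor instead of AT_FDCWD to the first form would start the lookup at that directory, as with the *at() calls.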
Finally, mask and ignored_mask control which events are reported. To generate a specific event, a file must have the appropriate flag set in mask and clear in ignored_mask. The flags are FAN_ACCESS (file access), FAN_MODIFY (a file is modified), FAN_CLOSE_WRITE (a writable file has been closed), FAN_CLOSE_NOWRITE (a read-only file has been closed), FAN_OPEN (a file has been opened), and FAN_EVENT_ON_CHILD (events on children of a directory). There is also a FAN_Q_OVERFLOW event for event queue overflows, but that is not currently implemented.
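The interaction between the two masks can be expressed as a small predicate. This is only an illustration of the rule stated above, not kernel code; the function name event_is_reported() is made up, and the FAN_* constants are taken from the <sys/fanotify.h> header that glibc later provided.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/fanotify.h>

/* Hypothetical helper mirroring the rule in the text: an event is
 * reported only if its bit is set in mask and clear in ignored_mask.
 * `event` must name exactly one event bit, such as FAN_OPEN. */
bool event_is_reported(uint64_t event, uint64_t mask, uint64_t ignored_mask)
{
    /* Precondition: exactly one bit set in `event`. */
    assert(event != 0 && (event & (event - 1)) == 0);

    return (event & mask & ~ignored_mask) != 0;
}
```

For example, with mask set to FAN_OPEN | FAN_MODIFY and ignored_mask set to FAN_OPEN, only FAN_MODIFY events would get through.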
Once files have been marked, the application can simply read from the fanotify file descriptor to get events. The events look like:
struct fanotify_event_metadata {
    __u32 event_len;
    __u32 vers;
    __s32 fd;
    __u64 mask;
};
Here, event_len is the length of the structure, vers indicates which version of fanotify generated the structure, fd is an open file descriptor for the object being accessed, and mask describes what is actually happening.
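A reader of the descriptor has to walk a buffer that may hold several such records. The sketch below assumes the structure and the FAN_EVENT_OK() and FAN_EVENT_NEXT() helper macros as they later appeared in <sys/fanotify.h>; that structure carries more fields than the one shown here (a pid, for instance), so the layout is not identical to this posting. The function name drain_events() is hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <sys/fanotify.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: walk `len` bytes of event records, as returned
 * by read() on the fanotify descriptor, and return how many were seen. */
int drain_events(struct fanotify_event_metadata *meta, ssize_t len)
{
    int seen = 0;

    assert(meta != NULL);
    while (FAN_EVENT_OK(meta, len)) {
        printf("event mask=0x%" PRIx64 " fd=%d\n",
               (uint64_t)meta->mask, meta->fd);
        /* Each event carries its own open descriptor, which the
         * reader is responsible for closing. */
        if (meta->fd >= 0)
            close(meta->fd);
        seen++;
        /* Advances by event_len bytes and shrinks len accordingly. */
        meta = FAN_EVENT_NEXT(meta, len);
    }
    return seen;
}
```

In a real program the buffer would come from a read() on the fanotify descriptor, with the return value of read() passed as len.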
There is one crucial component missing in these patches: there is no way
for the fanotify user to react to these events. In particular, the ability
to block an open() call, a core part of the malware-scanning
process, is missing. That, presumably, is to be added in a future
revision. Meanwhile, Eric has requested permission to put the notification
code into linux-next, presumably with a 2.6.33 merge in mind. As of this
writing, objections have not been forthcoming.
Another new ABI for fanotify
Posted Nov 12, 2009 3:57 UTC (Thu)
by mezcalero (subscriber, #45103)
[Link] (1 responses)
I.e., what inotify currently sucks at is being used for reading files or device nodes that have just been closed, in a loop such as "for (;;) { wait_until_someone_closes_a_file_after_writing(); check_what_changed(); }". Since the check_what_changed() call might itself open() and close() the file or device node, one would enter a loop here which is very hard to break, because one cannot distinguish between events that were triggered by the process itself and those triggered by someone else. An easy fix for this could be to include the PID of the process that triggered an event. That way programs could simply ignore all events triggered by themselves.
Another new ABI for fanotify
Posted Nov 12, 2009 15:18 UTC (Thu)
by eparis (guest, #33060)
[Link]
Also, you have an open fd which will not cause you to get events. So you can just operate on that fd and you won't hit the loop; if you open files yourself, you will get events for them.
Another new ABI for fanotify
Posted Nov 12, 2009 11:18 UTC (Thu)
by etienne_lorrain@yahoo.fr (guest, #38022)
[Link] (2 responses)
Some would say the other intended use case is malware-spreading utilities: it is better to "infect" executables which are often executed than those that lie dormant... and having a standard interface for viruses would greatly simplify their development.
Moreover, because it seems you should be able to use multiple independent virus checkers, you can hook "under" or "over" a virus checker, to hide your virus from upper layers, or to add it once the file has been certified clean.
Another new ABI for fanotify
Posted Nov 12, 2009 15:26 UTC (Thu)
by eparis (guest, #33060)
[Link]
Another new ABI for fanotify
Posted Nov 13, 2009 2:46 UTC (Fri)
by bronson (subscriber, #4806)
[Link]
That's an argument for keeping useful features out of the kernel? Are you kidding??
Pretty much all viruses are transferred via network. Does that mean that the networking stack should be removed from the kernel?
Another new ABI for fanotify
Posted Nov 12, 2009 16:06 UTC (Thu)
by xav (guest, #18536)
[Link] (3 responses)
I hardly see how both these approaches can coexist...
Another new ABI for fanotify
Posted Nov 13, 2009 2:41 UTC (Fri)
by bronson (subscriber, #4806)
[Link] (2 responses)
Another new ABI for fanotify
Posted Nov 13, 2009 10:13 UTC (Fri)
by xav (guest, #18536)
[Link] (1 responses)
Another new ABI for fanotify
Posted Dec 20, 2009 17:01 UTC (Sun)
by Blaisorblade (guest, #25465)
[Link]
Transactions have been invented in databases, and in that context it's obvious that part of a transaction may fail; and even in btrfs transactions allow for failures. So, what's the problem here?
A bigger problem is instead that during the transaction the filesystem is locked, so userspace needs to avoid modifying the fs during the check, if btrfs is used. That's possible, I guess; the atime change problem needs to be solved to perform reads, but that's doable. But if developers don't test this scenario, they won't notice.
I really want ... something that is almost this...
Posted Nov 13, 2009 2:28 UTC (Fri)
by knobunc (subscriber, #4678)
[Link]
Except, it looks like the interface does not generate events for file moves.
I know about the other notification mechanisms, but the tree is rather large and I do not want to have to add inotify_watches for all of the directories within... I assume (perhaps erroneously) that inotify does not scale to tens of thousands of directories.
-ben
Fanotify has a bug in 3.1 or below
Posted Oct 12, 2011 3:23 UTC (Wed)
by searockcliff (guest, #76465)
[Link]
Here is a patch for kernel 3.1:
http://marc.info/?l=linux-kernel&m=131822913806350&...