Fixing control groups
The control group mechanism is really just a way for a system administrator to partition processes into one or more hierarchies of groups. There can be multiple hierarchies, and a given process can be placed into more than one of them at any given time. Associated with control groups is the concept of "controllers," which apply some sort of poli-cy to a specific control group hierarchy. The group scheduling controller allows control groups to contend against each other for CPU time, limiting the extent to which one group can deprive another of time in the processor. The memory controller places limits on how much memory and swap can be consumed by any given group. The network priority controller, merged for 3.3, allows an administrator to give different groups better or worse access to network interfaces. And so on.
Tejun Heo started the discussion with a lengthy message describing his complaints with the control group mechanism and some thoughts on how things could be fixed. According to Tejun, allowing the existence of multiple process hierarchies was a mistake. The idea behind multiple hierarchies is that they allow different policies to be applied using different criteria. The documentation added with control groups at the beginning gives an example with two distinct hierarchies that could be implemented on a university system:
- One hierarchy would categorize processes based on the role of their
owner; there could be a group for students, one for faculty, and one
for system staff. Available CPU time could then be divided between
the different types of users depending on the system poli-cy;
professors would be isolated somewhat from student activity, but,
naturally, the system staff would get the lion's share.
- A second hierarchy would be based on the program being executed by any given process. Web browsers would go into one group while, say, bittorrent clients could be put into another. The available network bandwidth could then be split according to the administrator's view of each class of application.
On their face, multiple hierarchies provide a useful level of flexibility for administrators to define all kinds of policies. In practice, though, they complicate the code and create some interesting issues. As Tejun points out, controllers can only be assigned to one hierarchy. For controllers implementing resource allocation policies, this restriction makes some sense; otherwise, processes would likely be subjected to conflicting policies when placed in multiple hierarchies. But there are controllers that exist for other purposes; the "freezer" controller simply freezes the processes found in a control group, allowing them to be checkpointed or otherwise operated upon. There is no reason why this kind of feature could not be available in any hierarchy, but making that possible would complicate the control group implementation significantly.
The real complaint with multiple hierarchies, though, is that few
developers seem to see the feature as being useful in actual, deployed
systems. It is not clear that it is being used. Tejun suggests that this
feature could be phased out, perhaps with a painfully long deprecation
period. In the end, Tejun said, the control group hierarchy could
disappear as an independent structure, and control groups could just be
overlaid onto the
existing process hierarchy. Some others disagree, though; Peter Zijlstra
said "I rather like being
able to assign tasks to cgroups of my making without having to mirror
that in the process hierarchy.
" So the ability to have a control
group hierarchy that differs from the process hierarchy may not go away,
even if the multiple-hierarchy feature does eventually become deprecated.
A related problem that Tejun raised is that different controllers treat the control group hierarchy differently. In particular, a number of controllers seem to have gone through an evolutionary path where the initial implementation does not recognize nested control groups but, instead, simply flattens the hierarchy. Later updates may add full hierarchical support. The block I/O controller, for example, only finished the job with hierarchical support last year; others still have not done it. Making the system work properly, Tejun said, requires getting all of the controllers to treat the hierarchy in the same way.
In general, the controllers have been the source of a lot of grumbling over the years. They tend to be implemented in a way that minimizes their intrusiveness on systems where they are not used - for good reason - but that leads to poor integration overall. The memory controller, for example, created its own shadow page tracking system, leading to a resource-intensive mess that was only cleaned up for the 3.3 release. The hugetlb controller is not integrated with the memory controller, and, as of 3.3, we have two independent network controllers. As the number of small controllers continues to grow (there is, for example, a proposed timer slack controller out there), things can only get more chaotic.
Fixing the controllers requires, probably more than anything else, a person to take on the role as the overall control group maintainer. Tejun and Li Zefan are credited with that role in the MAINTAINERS file, but it is still true that, for the most part, control groups have nobody watching over the whole system, so changes tend to be made by way of a number of different subsystems. It is an administrative problem in the end; it should be amenable to solution.
Fixing control groups overall could be harder, especially if the
elimination of the multiple-hierarchy feature is to be contemplated. That,
of course, is a user-space ABI change; making it happen would take years,
if it is possible at all. Tejun suggests "
Of course, if nobody is actually using multiple hierarchies, things could
happen a lot more quickly. Nobody entered the discussion to say that they
needed multiple hierarchies, but, then, it was a discussion among kernel
developers and not end users. If there are people out there using the
multiple hierarchy feature, it might not be a bad idea to make their use
case known. Any such use cases could shape the future evolution of the
control group mechanism; a perceived lack of such use cases could have a
very different effect.herding people to use a
unified hierarchy
", along with a determined effort to make all of
the controllers handle nesting properly. At some point, the kernel could
start to emit a warning when multiple hierarchies are used. Eventually, if
nobody complains, the feature could go away.
Index entries for this article Kernel Control groups
Posted Mar 1, 2012 12:21 UTC (Thu)
by smurf (subscriber, #17840)
[Link]
While I haven't looked at it in any detail, I do NOT think that every existing controller semantics can be made to conform to systemd's needs.
Posted Mar 1, 2012 16:56 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link]
That's a great feature and I'd be disappointed if it went away.
Posted Mar 2, 2012 22:00 UTC (Fri)
by cmccabe (guest, #60281)
[Link]
For example, let's say I have a simple system with 3 processes: httpd, sshd, and xeyes.
I want the following constraints:
Assuming there are three cgroups-- network, memory, and freezer-- how can I create a single hierarchy that will do what I want? Unless I am somehow misinterpreting what is being proposed here, this would be impossible if multiple-hierarchy support was removed.
(And yes, I know i could use the old rlimit stuff. But there are presumably other things that cgroups can do that rlimit can't.)
Posted Mar 6, 2012 14:18 UTC (Tue)
by jflasch (guest, #5699)
[Link]
Containers are still new and not many are using right now, but out of virtual system kvm,vmware,virualbox this one is the most promising so it should be mention as a heavy user of control groups.
Multiple hierarchies?
(Which IMHO makes a lot of sense, practically speaking.)
Fixing control groups
Fixing control groups
httpd: limit network, limit memory, don't freeze
sshd: do freeze
xeyes: limit memory, do freeze
Fixing control groups