KS2010: Deadline scheduling
Deadline scheduling does away with the classic notion of process priorities. Instead, each process requests scheduling of a maximum amount of CPU time within a specific deadline. The scheduler can then either arrange things to ensure that the deadline will be met or reject the request if the CPU would be overcommitted. Dario Faggioli chose the case of a video player application as his main example for how this can be useful; with a deadline for each frame, the player can produce skip-free video even in the presence of significant contention for the CPU. Deadline scheduling can thus make a number of problems go away.
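For reference, here is a minimal sketch of what such a request looks like from user space, using the sched_setattr() interface under which this work eventually reached the mainline in Linux 3.14; the patches being discussed at the summit used an earlier, different ABI, and the 5ms-per-33ms-frame numbers are purely illustrative.

    /* Ask for 5ms of CPU time out of every 33ms frame period.  The
     * structure layout matches the sched_setattr(2) ABI; glibc does
     * not wrap this call, so the raw syscall is used. */
    #define _GNU_SOURCE
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    #ifndef SCHED_DEADLINE
    #define SCHED_DEADLINE 6
    #endif

    struct sched_attr {
        uint32_t size;
        uint32_t sched_policy;
        uint64_t sched_flags;
        int32_t  sched_nice;        /* used by SCHED_OTHER/BATCH */
        uint32_t sched_priority;    /* used by SCHED_FIFO/RR */
        uint64_t sched_runtime;     /* CPU time needed... */
        uint64_t sched_deadline;    /* ...before this deadline... */
        uint64_t sched_period;      /* ...in every period */
    };

    int main(void)
    {
        struct sched_attr attr = {
            .size           = sizeof(attr),
            .sched_policy   = SCHED_DEADLINE,
            .sched_runtime  =  5 * 1000 * 1000,   /* 5ms  */
            .sched_deadline = 33 * 1000 * 1000,   /* 33ms */
            .sched_period   = 33 * 1000 * 1000,
        };

        /* The scheduler runs its admission test here; the call fails
         * with EBUSY if the reservation would overcommit the CPU. */
        if (syscall(SYS_sched_setattr, 0, &attr, 0) < 0) {
            perror("sched_setattr");
            return 1;
        }
        /* ... decode frames, yielding when each one is done ... */
        return 0;
    }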
Linus wasn't convinced. He has never been entirely impressed by the realtime work (he pronounced that "realtime is bullshit" in the session) and does not see deadline scheduling as the right answer to this problem. He is more optimistic about group scheduling, and especially about the recently-posted per-tty task groups patch. Peter Zijlstra defended deadline scheduling, citing its ability to reject tasks which would overcommit the CPU, but Linus wasn't going for it. He said that multimedia people don't want the video player to be rejected if the scheduler can't guarantee the deadlines; they want a best-effort attempt to play the video.
Beyond that, he said, the problem with tasks like video playback is almost never the CPU scheduler. Video skips tend to be caused by I/O problems; the latencies appear in the I/O scheduler or somewhere in the virtual memory subsystem. Messing with the CPU scheduler is the wrong approach.
Peter said that, regardless, deadline scheduling is a feature that a number of people want. Should they press on with the work, or is it hopeless? Linus grumbled some more about how too much time goes into CPU scheduling and not enough into the other parts of the problem. But, he said, if the scheduler people want deadline scheduling in the end, he'll pull the patch.
Linus complained that, in practice, it's impossible to set the deadlines correctly. In the end, application developers have to request the absolute worst-case execution time, even though the application will almost never need that much CPU time. That's a problem because users often want to overcommit the resources - including the processor - on their systems. In practice, it almost always works because the worst-case CPU time is not actually needed.
Peter replied that deadline scheduling is safer because it can be made available to unprivileged applications. The maximum amount of CPU time can be bounded, as can the worst-case execution time; the scheduler can deny requests which would overcommit the available CPU time. Linus replied that, on a server with multiple users, that approach is still not safe; there is no way to keep users from interfering with each other. The truth of the matter is that multi-user servers are probably not a place where deadline scheduling will make sense.
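(For a single CPU, the admission test Peter described reduces, in its simplest form, to the classic earliest-deadline-first utilization bound: a set of reservations, each asking for runtime Q_i out of every period T_i, is feasible only if

    \sum_i \frac{Q_i}{T_i} \le 1

The real test has to be more conservative on multiprocessors, and the sum is typically capped below one to leave bandwidth for ordinary tasks, but that is the idea.)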
In general, Linus said, deadline scheduling makes him nervous. He has seen attempts to fix things with scheduler tweaks before; he's afraid that they can end up making things worse. The scenario of using deadline scheduling on desktop systems is, he said, not realistic; the real problems are elsewhere. Deadline scheduling will require system tuning, and that just doesn't work on desktop systems. There is no way to tune for everybody, and desktop users tend to be uninterested in and incapable of tuning their systems themselves.
Ted Ts'o said that video playback is an easy example with which to illustrate deadline scheduling but, perhaps, not the best use case. This theme was to return a couple of times in the session; video was almost certainly not the best example to choose for this particular crowd. He asked for an alternative use case - something which cannot be fixed with changes elsewhere in the system. Thomas Gleixner said that OS X is using deadline scheduling for desktop tasks and it works great. The desktop example was used because it is easy to understand, but there are also a lot of industrial applications which benefit from a non-priority-based scheduling mechanism. Deadline scheduling, he said, is all about describing the work that must be done instead of tweaking priorities.
Tim Bird said that, in his job, he has spent 18 months tweaking embedded systems for proper performance. CPU scheduling, he said, is never the real issue; the hard problems are elsewhere. So they are unlikely to look at deadline scheduling in the next ten years.
Ted closed the session with the note that it's important to better identify the users for deadline scheduling. Without that, it will be hard to know whether the associated ABI is right.
Index entries for this article:
Kernel - Realtime/Deadline scheduling
Kernel - Scheduler/Deadline scheduling
Realtime is not BS
Posted Nov 2, 2010 16:21 UTC (Tue) by daniel (guest, #3181) [Link] (1 responses)
> [Linus] pronounced that "realtime is bullshit" in the session
For that matter, SMP is bullshit and so is virtual memory. The common element shared by all three of these varieties of bullshit is that each is essential to some important things that we do with computers. If Linus means "bullshit" as in not useful or important, then he is just plain wrong, as he is from time to time (see the long, slow process of cluing him in to the value of kernel preemption). Perhaps Linus has never had to deal with the reality of programming a multi-ton intelligent machine that can kill you in an instant, or cause unbelievable amounts of monetary damage if certain realtime deadlines are ever missed. I have. The experience is purifying. There is absolutely no substitute for being sure that deadlines will be met, if it is important to meet them. There is no question about that, except perhaps in Linus's mind. The only valid question is how best to achieve it.
Until Linux has a usable CONFIG_RT compile option, just as it has CONFIG_SMP, it will not be useful for a large and important class of industrial and scientific applications. Believe it or not, this void is now filled largely by Windows or DOS systems, often with hardware offload to some black box with largely unknown behavior characteristics. Do these systems suck? Yes they do. Do they kill people? Yes they can, and most probably have. Can we do something about it? Yes we can, but it's hard if our fearless leader is busy pulling in a different direction from the people on the ground who actually have a clue about the importance of the application area.
Realtime is not BS
Posted Nov 2, 2010 19:38 UTC (Tue) by caitlinbestler (guest, #32532) [Link]
But there is still a valid discussion on whether it can be achieved in the same set of kernel code as more general scheduling.
And Linus's comment that addressing CPU resources alone is not enough is very much on target. A true deadline scheduler needs to schedule CPU, memory bandwidth, and I/O bandwidth. Whether that can be achieved side-by-side with a kernel that defaults to "best effort" is a major question.
KS2010: Deadline scheduling
Posted Nov 2, 2010 17:04 UTC (Tue) by daniel (guest, #3181) [Link] (3 responses)
> Linus complained that, in practice, it's impossible to set the deadlines correctly. In the end, application developers have to request the absolute worst-case execution time, even though the application will almost never need that much CPU time.
Where there's a will there's a way. Designate some subset of CPUs as realtime-only; then the worst that can happen is that a task which would overcommit one of those CPUs cannot be moved onto it.
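For illustration, the confinement half of that idea might look like the sketch below; the choice of CPU 3 is hypothetical, and keeping ordinary tasks off that CPU would be done separately (with the isolcpus= boot parameter, for example).

    /* Pin the current task to a CPU set aside for realtime work and
     * switch it to a realtime scheduling class. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        cpu_set_t set;
        struct sched_param sp = { .sched_priority = 50 };

        CPU_ZERO(&set);
        CPU_SET(3, &set);          /* the designated realtime-only CPU */
        if (sched_setaffinity(0, sizeof(set), &set) < 0) {
            perror("sched_setaffinity");
            return 1;
        }
        if (sched_setscheduler(0, SCHED_FIFO, &sp) < 0) {
            perror("sched_setscheduler");
            return 1;
        }
        /* ... realtime work, uncontended by best-effort tasks ... */
        return 0;
    }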
KS2010: Deadline scheduling
Posted Nov 3, 2010 19:30 UTC (Wed) by tcucinotta (guest, #69261) [Link] (2 responses)
If actually needed, given a certain allocation, one can compute the expected probability that the application will meet its deadlines; more generally, one can embrace stochastic analysis techniques in order to assess the expected performance of an application on a probabilistic basis.
Also, one can use adaptive scheduling techniques to dynamically change the allocation of computing power in response to fluctuations (foreseen and/or observed) in the application workload (as happens in multimedia, for example).
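A heavily simplified sketch of such an adaptive loop, assuming a set_reservation() wrapper around whatever call installs the deadline parameters, plus hypothetical decode_one_frame() and frame_runtime_ns() helpers (the latter reporting the CPU time the last frame actually consumed):

    #include <stdint.h>

    /* Hypothetical helpers, named here for illustration only. */
    void decode_one_frame(void);
    uint64_t frame_runtime_ns(void);
    int set_reservation(uint64_t runtime_ns, uint64_t period_ns);

    void adaptive_playback(void)
    {
        const uint64_t period = 33 * 1000 * 1000;  /* one 33ms frame */
        uint64_t budget = 5 * 1000 * 1000;         /* initial guess  */

        for (;;) {
            decode_one_frame();
            uint64_t used = frame_runtime_ns();

            /* Exponentially smoothed demand estimate; the fixed point
             * works out to roughly 120% of steady-state usage, leaving
             * about 20% headroom for workload fluctuations. */
            budget = (3 * budget + 2 * (used + used / 5)) / 5;
            if (budget > period / 2)   /* never claim more than half a CPU */
                budget = period / 2;
            if (set_reservation(budget, period) < 0)
                break;                 /* admission refused; fall back */
        }
    }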
KS2010: Deadline scheduling
Posted Nov 8, 2010 1:47 UTC (Mon) by vonbrand (guest, #4458) [Link] (1 responses)
That might work for soft real time (whatever that means); for the "multi-ton machine that could kill you in an instant" application mentioned above, this is very far from acceptable.
"Real time" is a system property (the kernel with all its areas, the computer on which it runs, and the application code). Trying to fix that by scheduling the CPU alone is certainly bullshit.
KS2010: Deadline scheduling
Posted Nov 8, 2010 20:40 UTC (Mon) by tcucinotta (guest, #69261) [Link]
I agree absolutely with all the statements. However, I also think we should take an incremental approach; otherwise we end up with nothing. For this round, we're discussing how to improve CPU scheduling. This is only one of the essential building blocks needed to support real-time applications and increase the predictability of software. Of course, a comprehensive approach to the problem needs to consider other resources as well, like networking and disk transfers, as well as the IRQ subsystem architecture. Still, improving CPU scheduling is a little step that is strongly needed, has a relevant impact on the problem, and can be done without subverting the way scheduling and resource management are handled inside the kernel.
KS2010: Deadline scheduling
Posted Nov 3, 2010 19:18 UTC (Wed) by tcucinotta (guest, #69261) [Link]
Applications may always have a fall-back mechanism: try to go (really) real-time; if that's not possible, then go best-effort, possibly with some UI trick to let the user know he's running too many RT applications. That already happens with RT priorities: applications (e.g., jack) try to acquire RT priority; if that fails, they log a warning or error and run anyway without it.
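That pattern is simple enough to show in full; a sketch of the jack-style fallback, using the long-standing sched_setscheduler() interface:

    /* Try for a realtime priority; if refused, warn and carry on as
     * an ordinary best-effort task. */
    #include <sched.h>
    #include <stdio.h>

    static void go_realtime_if_possible(void)
    {
        struct sched_param sp = { .sched_priority = 10 };

        if (sched_setscheduler(0, SCHED_FIFO, &sp) < 0) {
            /* Typically EPERM for an unprivileged process without a
             * suitable RLIMIT_RTPRIO; continue without realtime. */
            fprintf(stderr, "warning: realtime priority unavailable, "
                            "running best-effort\n");
        }
    }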
Or, you can have more elaborate user-space middleware that allows overload situations to be managed; e.g., have a look at the FRESCOR Application Level Contracts (ALC): http://www.frescor.org/index.php?page=publications.
Hard and soft real time
Posted Nov 4, 2010 9:36 UTC (Thu) by csimmonds (subscriber, #3130) [Link] (7 responses)
With hard real time, you must not miss the deadline, ever. An example from a system I have worked on recently: printing sell-by dates on bottles going along a production line. If one bottle is not printed, the whole production line has to be halted, the problem fixed, and then everything restarted, which is very expensive. This is hard real time.
You can achieve hard real-time behaviour with a priority-based scheduler, but it is not easy. Much better to use a deadline scheduler, which gives you the guarantees that you want. So I vote very energetically for inclusion of the deadline scheduler. It may not be useful on the desktop or server, but that is a small part of what Linux does.
Chris.
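To put numbers behind Chris's claim: for the textbook model of independent periodic tasks on a single CPU (Liu and Layland, 1973), a fixed-priority, rate-monotonic scheduler is only guaranteed to meet every deadline when the total utilization U stays below a bound that shrinks toward about 69%:

    U \le n\,(2^{1/n} - 1) \longrightarrow \ln 2 \approx 0.693 \quad (n \to \infty)

An earliest-deadline-first scheduler, by contrast, is guaranteed to meet every deadline for any U \le 1. That gap is the formal version of "gives you the guarantees that you want" without leaving CPU time unused.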
Hard and soft real time
Posted Nov 4, 2010 13:34 UTC (Thu) by zmower (subscriber, #3005) [Link] (2 responses)
So would you like to amend your statement to say that embedded hard realtime on unicore processors is a very small part of what Linux should do?
Hard and soft real time
Posted Nov 4, 2010 15:36 UTC (Thu) by Shewmaker (guest, #1126) [Link] (1 responses)
Here's a well-written paper from this year.
DP-FAIR: A Simple Model for Understanding Optimal Multiprocessor Scheduling
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5562...
A short overview presentation of that paper.
https://systems.soe.ucsc.edu/sites/default/files/webform/...
This same research group is also working on solving the other parts of the problem. Linus is correct: we can't solve these problems with just CPU scheduling.
Efficient Guaranteed Disk Request Scheduling with Fahrrad (2008)
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.1...
Work on memory and network resources is in earlier stages, but the result will hopefully be a coherent general theory of real-time performance management.
Hard and soft real time
Posted Nov 4, 2010 20:15 UTC (Thu) by zmower (subscriber, #3005) [Link]
As for Linux, even if you have optimal scheduling for all the subsystems, the combined effect is still chaotic.
Ada conference videos etc. are here: http://www.disca.upv.es/jorge/ae2010/outcome.html
Hard and soft real time
Posted Nov 4, 2010 19:20 UTC (Thu) by dlang (guest, #313) [Link] (3 responses)
You could also have a hard-real-time task that has a 5-second deadline to get its work done; non-real-time Linux can accomplish that today (except in the face of failing hardware).
Even things with fairly short deadlines can be made statistically reliable in many cases. It's only when you start getting into extremely short deadlines, or systems that also handle non-critical processes competing for resources, that you start having real problems.
"hard" real-time
Posted Nov 4, 2010 20:06 UTC (Thu) by dmarti (subscriber, #11625) [Link] (2 responses)
"Hard" real-time is where, if you miss a deadline, the robot chops off the user's head or the user's plane crashes.
"hard" real-time
Posted Nov 4, 2010 20:37 UTC (Thu) by dlang (guest, #313) [Link] (1 responses)
There are _very_ few situations where a single missed deadline proves fatal to the system (or the user :-)
In engineering, the assumption is that you may have unexpected loads or sub-par materials, so every design includes a safety margin, which makes it statistically unlikely that too many things will go wrong and the item will fail.
Even on the Space Shuttle, components as critical as the heat-resistant tiles are not individually critical; it's expected that some number of them will be damaged or fall off on any flight. When too many of them get damaged, you have the Columbia disintegrating, but that doesn't translate into zero tolerance of tile failure.
"hard" real-time
Posted Nov 4, 2010 22:26 UTC (Thu) by csimmonds (subscriber, #3130) [Link]
Chris.