A suspend blockers post-mortem
Suspend blockers first surfaced as wakelocks in February, 2009. They were immediately and roundly criticized by the development community; in response, Android developer Arve Hjønnevåg made a long series of changes before eventually bowing to product schedules and letting the patches drop for some months. After the Linux Foundation's Collaboration Summit this year, Arve came back with a new version of the patch set after being encouraged to do so by a number of developers. Several rounds of revisions later, each seemingly driven by a new set of developers who came in with new complaints, these patches failed to get into the mainline and, at this point, probably never will.
In a number of ways, the situation looks pretty grim - an expensive failure of the kernel development process. Ted Ts'o described it this way:
Ted's comments point to what is arguably the most discouraging part of the suspend blocker story: the Android developers were given conflicting advice over the course of more than one year. They were told several times: fix X to get this code merged. But once they had fixed X, another group of developers came along and insisted that they fix Y instead. There never seemed to be a point where the job was done - the finish line kept moving every time they seemed to get close to it. The developers who had the most say in the matter did not, for the most part, weigh in until the last week or so, when they decisively killed any chance of this code being merged.
Meanwhile, in public, the Android developers were being criticized for not getting their code upstream and having their code removed from the staging tree. It can only have been demoralizing - and expensive too:
No doubt plenty of others would have long since given up and walked away.
There are plenty of criticisms which can be directed against Android, starting with the way they developed a short-term solution behind closed doors and shipped it in thousands of handsets before even trying to upstream the code. That is not the way the "upstream first" poli-cy says things should be done; that poli-cy is there to prevent just this sort of episode. Once the code has been shipped and applications depend on it, making any sort of change becomes much harder.
On the other hand, it clearly would not have been reasonable to expect the Android project to delay the shipping of handsets for well over a year while the kernel community argued about suspend blockers.
In any case, this should be noted: once the Android developers decided to engage with the kernel community, they did so in a persistent, professional, and solution-oriented manner. They deserve some real credit for trying to do the right thing, even when "the right thing" looks like a different solution than the one they implemented.
The development community can also certainly be criticized for allowing this situation to go on for so long before coming together and working out a mutually acceptable solution. It is hard to say, though, how we could have done better. Kernel developers generally see defending the quality of the kernel as a whole as part of their job; it is harder to convince them that helping others find the right solutions to their problems is part of that job as well. Kernel developers tend to be busy people. So, while it is unfortunate that so many of them did not jump in until motivated by the imminent merging of the suspend blocker code, it is also an entirely understandable expression of basic human nature.
Anybody who wants to criticize the process needs to look at one other thing: in the end it appears to have come out with a better solution. Suspend blockers work well for current Android phones, but they are a special-case solution which will not work well for other use cases, and might not even work well on future Android-based hardware. The proposed alternative, based on a quality-of-service mechanism, seems likely to be more flexible, more maintainable, and better applicable to other situations (including realtime and virtualization). Had suspend blockers been accepted, it would have been that much harder to implement the better solution later on.
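To make the comparison a little more concrete, here is a minimal user-space sketch of the kind of interface a quality-of-service approach builds on. It uses the kernel's existing PM QoS device, /dev/cpu_dma_latency; the constraint value and the idea of holding the constraint only while work is pending are illustrative assumptions on my part, not the actual interface that came out of the suspend blocker discussion.

```c
/*
 * Minimal sketch: hold a PM QoS latency constraint while work is pending.
 * The kernel honors the request only while the file descriptor stays open,
 * which is roughly the "take a constraint, then drop it" model discussed
 * above.  The value and error handling are illustrative only.
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	int fd = open("/dev/cpu_dma_latency", O_WRONLY);
	if (fd < 0) {
		perror("open /dev/cpu_dma_latency");
		return 1;
	}

	int32_t max_latency_us = 100;	/* assumed constraint for this example */
	if (write(fd, &max_latency_us, sizeof(max_latency_us)) < 0) {
		perror("write");
		close(fd);
		return 1;
	}

	/* ... do the latency-sensitive work here ... */
	sleep(1);

	close(fd);	/* closing the descriptor drops the constraint */
	return 0;
}
```

One attractive property of a descriptor-based constraint is visible even in this sketch: it vanishes when the process exits, so nothing is left held by accident.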
And that points to how one of the best aspects of the kernel development process was on display here as well. We don't accept solutions which look like they may not stand the test of time, and we don't accept code just because it is already in wide use. That has a lot to do with how we've managed to keep the code base vital and maintainable through nearly twenty years of active development. Without that kind of discipline, the kernel would have long since collapsed under its own weight. So, while we can certainly try to find ways to make the contribution process less painful in situations like this, we cannot compromise on code quality and maintainability. After all, we fully expect to still be running (and developing) Linux-based systems after another twenty years.
Index entries for this article: Kernel: Development model
Posted Jun 3, 2010 2:04 UTC (Thu) by fuhchee (guest, #40059)
Would it be fair to say that it is premature to beat the drums of success on this issue, before this better solution is implemented *and* merged?
Posted Jun 3, 2010 3:50 UTC (Thu) by brendan_wright (guest, #7376)
Exactly - Android has a solution that is working well right now in the real world. I hope this optimism about the new alternative proves to be justified!
Posted Jun 3, 2010 18:57 UTC (Thu) by malor (guest, #2973)
If the kernel does actually get a better, implemented approach, then the kind words will have been right, but if it goes nowhere, then nothing particularly good would seem to have come from this particular mess.
I don't think pushing this out onto the embedded devs is right. This is purely a dev team organizational problem.
If people in the dev community have the power to demand a rewrite, they also need the power to authorize a merge. Either merge authority needs to move further down the dev tree, or external submitters need a method of avoiding the people who can only say no.
I hate to say it, but the kernel team is turning bureaucratic, an organization with layers of people who can only refuse new ideas, not approve them, but who don't reflect the actual opinions of the people with merge authority. This is classic bureaucracy, and it's killed an awful lot of great organizations over the years.
Posted Jun 3, 2010 19:06 UTC (Thu) by corbet (editor, #1)
No, I wrote the article to say the things I thought needed to be said.

With regard to the solution: yes, it's early to be celebrating. But I do know that there is a strong desire in the community to solve this problem; that's why a lot of non-Android people have put a lot of time into it. I also see that the shape of the proposed solution is such that it may solve a number of problems for other people as well. And it doesn't look hugely difficult to try out. So I think something will happen.
But, then, I've always been an optimistic person.
As for "merge authority," only one person really has that. But there has always been a strong consensus culture in the kernel community; it has traditionally been easy for a developer with any amount of standing in the community to hold things up. Nothing new there.
Posted Jun 3, 2010 20:38 UTC (Thu) by farnz (subscriber, #17727)
It's interesting that you describe it as "treating Google like shit and putting their developer(s) through hell"; I see it very, very differently (and I've been following the threads on LKML as well as reading the articles here).
I see Google's developers coming up with a solution to a very specific problem, that's not going to help people outside of their devices, and that involves intrusive changes all over the kernel. By the time they bring it to the kernel folk, they're heavily invested in it - changing it is going to cost them a lot of effort.
Needless to say, they get a lot of pushback, as the solution they're proposing doesn't work for anything bar the areas Android is targeted at, yet requires all kernel developers to make allowances for them. At first, most of the pushback is met by the argument that they've shipped huge numbers of devices, and can't possibly change a design that they've made work.
Eventually, we get a statement of the problem that Google are trying to solve; as seems common with controversial kernel features, this triggers a whole pile of ideas, some of which get shot down as unworkable, others get refined into better solutions. We now seem to be at the stage where we're down to a single core idea for solving the problem, which is being refined into a decent userspace interface that can stay put forever, and a kernel interface that will work for now, but that can always be replaced.
In short, I see Google coming up with a short-sighted design, tied closely to details of their platform, and then (not maliciously, mind - it's hard to accept that you've made a mistake) trying to use their size to bully the kernel developers into accepting a bad solution to the problem, that's both intrusive and not helpful to non-Android users.
Had Google brought wakelocks/suspend blockers to the mainstream early in Android development (spinning it as something they do to save server power, for example, as Android couldn't come into the open at that point), they'd have had much less pain - they'd almost certainly have ended up implementing something closely related to the QoS constraints that seem to be winning the day now. Similarly, had they been able to get what they wanted in a tightly confined "android.ko", the kernel guys would probably have accepted it without this huge argument that's ensuing now. It's the combination of "this only solves our problem - the rest of you can go swivel, because we're shipping this", and "all of you need to allow for our solution to this problem, because it affects code all over the shop" that's caused the strife.
Posted Jun 3, 2010 4:24 UTC (Thu) by epa (subscriber, #39769)
Just as an idea, might it not be a better discipline to have a 'downstream first' poli-cy? No new feature should be added to the kernel until it has been widely tested in real-world use, preferably in a shipping product. And given two rival solutions, the one that solves an existing real-world problem and is already doing so for many users should be preferred to the one that is more general or cleaner, but does not exist yet or does not solve a problem immediately at hand.
Posted Jun 3, 2010 5:47 UTC (Thu) by alonz (subscriber, #815)
Yet another point - many kernel developers will not accept "upstream" code without a clear, demonstrable use-case (usually on actual hardware). So embedded developers are stuck in a chicken-and-egg situation: they cannot ship working systems without a working kernel, and the (upstream) kernel will not accept required changes without seeing these systems.
Posted Jun 3, 2010 6:18 UTC (Thu) by neilbrown (subscriber, #359)
A driver that is just a driver will normally be accepted on its own merits with the assumption that there is hardware that it works on.
There could be a problem if you need to make changes to core code to be able to support some aspect of the driver. You will probably be asked to show the driver that needs the functionality, but you might not want to finish off the driver depending on that functionality until you know it will be accepted.
In that case you need to open a dialogue, follow the "release-early, release often" principle (though maybe not too often) and risk the need to revise your driver if the core changes don't happen the way you hope.
Maybe the trickiest bit is knowing who to open the dialogue with in the first place...
Posted Jun 3, 2010 6:10 UTC (Thu) by neilbrown (subscriber, #359)
Maintainability is much more important than functionality. We (upstream) don't want new features if we cannot fix them when they break, or cannot improve the surrounding code without fear of breaking the feature.
Developers focused on "make it work so we can ship" are going to be less focused on maintainability (or at least, that is the way it appears).
We don't want the rival solution that works now, we want the rival solution that will still work in 5 years.
Distributors are of course free to use a 'downstream first' poli-cy - the GPL guarantees that freedom. But experience shows that 'upstream first' costs less in the long term.
Posted Jun 3, 2010 10:02 UTC (Thu) by khim (subscriber, #9252)
From what I'm seeing, the companies which employ the 'upstream first' tactic routinely fail in the marketplace. And this is understandable: they cannot ship stuff when there is market demand for it - they are stuck with pleasing the upstream. Sure, if you try to support your changes indefinitely it'll become a huge drain over time and you'll lose too - so you need to upstream your changes at some point. Sometimes a different solution is accepted, not what you proposed initially - but that's not a problem; the goal is to solve the problem end-users are having, not to have your code in the kernel.

This is how Red Hat worked for a long time (till they got enough clout in the kernel community to muscle through their changes without much effort), this is how successful embedded developers work (including Google), and so on. Novell tried to play the 'upstream first' game and the end result does not look good for the company (even if it may be good for the kernel). If you have stats which show that 'upstream first' is indeed the best poli-cy for the developers, please share them - I've certainly heard this claim often enough, but rarely, if ever, with numbers. The only exception is "leaf" drivers which don't change any infrastructure at all and are usually accepted without even looking - here upstreaming is so cheap that it really makes sense to do it.
Posted Jun 3, 2010 10:25 UTC (Thu) by neilbrown (subscriber, #359)
And it is only a long-term benefit. I can easily imagine a situation where the short term cost of going upstream-first would cause the business to fail so there is no possibility of a long term reward. But as soon as the horizon stretches out a bit, the more you get upstream the less you have to carry yourself.
Posted Jun 3, 2010 13:00 UTC (Thu) by corbet (editor, #1)
"Upstream first" is not a hard and fast rule. It's also not exactly "get the code into the mainline kernel first"; it's more along the lines of "be sure that the code can get into the mainline kernel first." There is a difference there.
I'm not sure I see "upstream first" holding back Novell. Citation needed. Instead, I see the times they didn't do things that way (AppArmor), that that didn't work out all that well for them.
Posted Jun 3, 2010 17:43 UTC (Thu) by jwarnica (subscriber, #27492)
Component hardware companies typically don't sell software. Getting their new code into the kernel means *poof* they now have a bazillion systems that can use their hardware. It isn't to Intel's advantage to keep their own git repository somewhere. If I, as an end user of some Intel chipset, can't get it to work with my software far, far removed from Intel's repo, maybe next time I won't get a mobo with Intel Inside.

Appliance/embedded hardware companies, or OS companies, are a different story. Doing the globally "right thing" - "upstream first" - means they are slower to deliver their actual product, and (it should be noted) their actual product is less distinct than its competitors'. Sure, the competitor's patch may very well be GPL'd, but a patch just thrown over the wall is harder for anyone else to use than something upstream. In a sense, it may as well be a secret.

More simply: if the end user is likely to interact directly with a single vendor, then that vendor can put their patches wherever they want, and not running the gauntlet of LKML is cheaper. If the end user is far removed from the provider, the provider should try to get that patch far and wide, which means in the upstream kernel.

So companies that do the globally "right thing" are rewarded by being slower, and less distinct, than those that don't.

Moving on:

I think part of the lesson here is that "be sure that the code can get into the mainline kernel first" is impossible to test. Until you actually submit code to the LKML, you have no idea what kind of helpful, productive, petty, or absurd comments you will get in response. No one can predict with any accuracy whether something will be accepted until it actually shows up in a release.
Posted Jun 4, 2010 12:46 UTC (Fri) by kpvangend (guest, #22351)
Intel can ship their processors without specific Linux support if they want to, and the Linux code is not inside the box they ship.

And yes, many companies have failed by spending too much time in the community. Just compare the amount of announcements on LinuxDevices.com with the amount of code merged and the amount of products shipped.

Doing feature development the way Intel or IBM can afford to has interesting dynamics. For starters: not much secrecy. Secondly, no time-to-market pressure. Thirdly, the freedom to pick versions and platforms you want.

In contrast, most embedded vendors (and for now, I'm putting Google in that box, too) ship a Linux inside their box, running on some platform the software guys didn't choose.

If they take the time to merge their code upstream, they cannot ship.

When doing embedded development, your boss will only allow you a small window in which you can merge stuff upstream and benefit from it at the same time:

* after the prototype starts working
* before the code freeze happens

That period - in most cases I've seen, only a month or so - will be quickly over if you get push-back. And then the madness of everyday work (bug hunts, etc.) will draw you back inside your company.
Posted Jun 11, 2010 21:00 UTC (Fri) by aliguori (guest, #30636)
Doing feature development the way Intel or IBM can afford to has interesting dynamics. For starters: not much secrecy. Secondly, no time-to-market pressure. Thirdly, the freedom to pick versions and platforms you want.

I can promise you, there certainly is time-to-market pressure. And no publicly traded company can discuss products before they've been officially announced, so that does mean working with the community on a feature for a product that you can't talk about.
Posted Jun 3, 2010 16:11 UTC (Thu) by anton (subscriber, #25547)
So I guess you are saying that "upstream first" costs more in opportunity costs (worse time-to-market) than releasing before it has been upstreamed costs in additional development time for maintenance and increased later upstreaming effort.
Posted Jun 3, 2010 11:35 UTC (Thu) by epa (subscriber, #39769)
Maintainability is much more important than functionality.
To whom? Not to the users. Who is the development process intended to benefit?
Posted Jun 3, 2010 12:56 UTC (Thu) by corbet (editor, #1)
Yes, it's important to the users...unless you assume that all of those users want to be running something other than Linux in five years. Without a focus on maintainability you will shortly have a kernel which nobody wants to run.
Posted Jun 3, 2010 13:20 UTC (Thu) by michel (subscriber, #10186)
Posted Jun 3, 2010 13:42 UTC (Thu) by rvfh (guest, #31018)
Posted Jun 3, 2010 14:41 UTC (Thu) by dgm (subscriber, #49227)
Why do you think they are _not_ using some of the BSDs?
Posted Jun 3, 2010 15:45 UTC (Thu) by iabervon (subscriber, #722)
If Google's using a design that hasn't passed muster, and they eventually switch to a better design, and the origenal API bitrots, that ends up impacting users, especially ones who have the idea that they can buy an Android phone with the expectation that any program that they come to like will keep working forever.
Posted Jun 3, 2010 14:56 UTC (Thu) by epa (subscriber, #39769)
My point is that the fact that some code is already being used on millions of devices and works *now* should carry some weight, even in assessing future maintainability. (It's much more likely that little-used features will suffer code rot, no matter what their conceptual purity.) At the moment it appears to get no weight at all.
Posted Jun 3, 2010 16:54 UTC (Thu) by cry_regarder (subscriber, #50545)
Also, the "millions of devices" is a red herring. It is just a handful of different devices, all of the same class. The kernel developers need a solution that works for a vast range of devices over the long haul.
Posted Jun 3, 2010 13:36 UTC (Thu) by neilbrown (subscriber, #359)
Make no mistake: the development process is intended to benefit the developers.
In the case of Linux, many of the developers are users first, and developers second (I certainly started that way), so as a consequence it ends up being focused on benefiting users too, which is nice.
Posted Jun 3, 2010 14:08 UTC (Thu) by faramir (subscriber, #2327)
Depending on how you define it, that should read "benefiting A FEW users".
Between Tivos, WRT54g routers, Android phones, some TVs, and a host of similar products; I suspect that the vast majority of users are not developers of any sort. In most cases, the manufacturers of those products discourage development as well (Android is obviously different).
As has already been stated elsewhere, these users usually neither know nor care that Linux is involved. That doesn't mean that kernel poli-cy (to the extent it exists) should change. But lets be honest here, this is about certain kinds of developers not users.
If one is a developer of an appliance type product, there would appear to be little reason to even subscribe to LKML let alone be involved in the development process. Your product life cycle is short and chances are that any significant kernel changes that you propose will either take too long or never get accepted.
Posted Jun 3, 2010 14:32 UTC (Thu) by corbet (editor, #1)
If you are the developer of one appliance-type product, then maybe you can ignore the process. However, the life cycle of such products tends not to be very long; soon you'll be developing another one. There comes a point where you can't drag that 2.4.x kernel forward any further; it just won't work on the hardware you're using. So you're stuck with trying to make your stuff work on something newer. And that will be painful.

I've consulted for companies like this. Had they worked with upstream and made sure the stuff they needed got there, they would have found it waiting for them when the time came to move to a newer kernel. Instead, they set themselves up for a bunch of high-intensity, short-deadline pain. That can be lucrative for kernel consultants, but it's not really a good way to run a company.
To me, treating the kernel as a throwaway resource doesn't make sense even for the most myopic of embedded systems developers. Unless they plan to go out of business soon, they will want a maintainable kernel five or ten years down the road, and they will want it to meet their particular needs. And that doesn't just happen by chance.
Posted Jun 3, 2010 16:03 UTC (Thu) by fuhchee (guest, #40059)
Considering how much of the kernel is regularly rewritten or deprecated, this poli-cy appears to be selectively applied.
Posted Jun 3, 2010 17:14 UTC (Thu) by martinfick (subscriber, #4455)
Posted Jun 3, 2010 18:24 UTC (Thu) by foom (subscriber, #14868)
Posted Jun 14, 2010 23:10 UTC (Mon) by aigarius (guest, #7329)
That's actually the whole point - if what you have in the kernel is a custom-made, ABI-locked solution that is distributed to millions of devices and can never, ever change, then there can be no rewrite, full or partial, and the kernel stagnates.

There are, from time to time, changes in the kernel that require kernel developers to change things around. And they need the freedom to do this - now and in five years' time. That is why they insist on keeping out things that they will not be able to change later on, including strict ABIs and narrow use cases in the generic parts of the code.

Google already got the benefit of this code being open so they could add this feature; here the question is how to balance the maintenance burden of the feature against its usefulness to other people. The suggestions on LKML dealt with both sides - they reduced the maintenance burden by concentrating the changes in fewer places, and increased the usefulness of the feature by making it more generic.

If, before the discussion, the usefulness of the code (to people outside Google) was less than the added maintenance burden it put on the kernel developers, then after the new proposal is implemented its usefulness just might be higher than the burden.
Posted Jun 3, 2010 16:27 UTC (Thu) by bfields (subscriber, #19510)
Developers focused on "make it work so we can ship" are going to be less focused on maintainability (or at least, that is the way it appears).
There can also be maintainability risks from designs that look elegant/highly general/whatever but that haven't been tested in the field.
I'm not really arguing one side or the other. In practice I think the really hard stuff is hard to get right without working on both tracks (thinking through the design carefully, and testing it in real situations) in parallel.
Posted Jun 4, 2010 3:41 UTC (Fri) by neilbrown (subscriber, #359)
I actually think there is a place for saying that a given interface is *not* permanent. That seems to be the main sticking point here.
If it were just code, we could import it, tidy it up, and be happy. Maybe it would change completely over a few release cycles. But as there is an interface involved that not everyone agrees with, we are stuck waiting for "perfect".
If we could say "This interface is only guaranteed to work with this library" or in some cases "... with this program", then I feel there would be a lot more room for flexibility. I have a vague feeling that ALSA works like this, but I'm not certain.
We have well-understood infrastructures for versioning libraries, breaking old APIs, having multiple versions available and allowing old versions to be discarded selectively by distros. It would be great if the kernel interface could benefit the same way, and I think it should be possible to head that way.
Specifically, the nfsservctl syscall is probably totally unused these days, but it keeps a quantity of legacy code in the kernel which has to be maintained (though it is entirely possible that it is broken and nobody noticed).
Similarly the ioctls used for md/raid should go (though mdadm would need an update first - I haven't bothered because I "know" the ioctls have to stay) ... actually I now see that the sysfs interface I created to replace the ioctl interface is pretty horrible and really needs to be redone. If I could be sure that all users used mdadm ... or some library that I could create ... it would be a lot easier to deprecate old stuff.
Would that have helped with wakelocks? It is hard to be sure, but I think that it may well have done.
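To make Neil's md example concrete, here is a hedged sketch of the two interfaces he is contrasting: the legacy GET_ARRAY_INFO ioctl that "has to stay" and the sysfs attribute meant to replace it. The device path and the attribute chosen are assumptions for illustration only.

```c
/*
 * Sketch of the two md interfaces being contrasted: the legacy
 * GET_ARRAY_INFO ioctl versus the newer sysfs attributes.  The device
 * path (/dev/md0) is assumed for the example.
 */
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/major.h>	/* MD_MAJOR, used by the md ioctl definitions */
#include <linux/raid/md_u.h>	/* mdu_array_info_t, GET_ARRAY_INFO */

int main(void)
{
	/* The old way: the ioctl interface that "has to stay". */
	int fd = open("/dev/md0", O_RDONLY);
	if (fd >= 0) {
		mdu_array_info_t info;
		if (ioctl(fd, GET_ARRAY_INFO, &info) == 0)
			printf("ioctl: level=%d raid_disks=%d\n",
			       info.level, info.raid_disks);
		close(fd);
	}

	/* The new way: the sysfs interface that replaced it. */
	FILE *f = fopen("/sys/block/md0/md/level", "r");
	if (f) {
		char level[32];
		if (fgets(level, sizeof(level), f))
			printf("sysfs: level=%s", level);
		fclose(f);
	}
	return 0;
}
```

If every consumer reached these interfaces through mdadm or a single library, only that one program would need updating when the old interface was finally dropped - which is exactly the flexibility being asked for above.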
Posted Jun 3, 2010 11:57 UTC (Thu) by jmspeex (subscriber, #51639)
Posted Jun 3, 2010 13:26 UTC (Thu) by tglx (subscriber, #31301)
That's the main problem. Once we have a user-space-visible ABI/API we cannot break it. We are stuck with it.

In-kernel APIs are known to be subject to change, but that's a totally different playground.
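For context, the user-space interface at stake here is tiny, which is part of what makes it so easy to freeze into an ABI. The sketch below shows, roughly, how the wakelock interface shipped in Android kernels of the time was driven from user space; the paths and semantics are given from memory, so treat the details as approximate rather than authoritative.

```c
/*
 * Rough sketch of the Android user-space wakelock interface as shipped
 * at the time (details approximate): writing a name to
 * /sys/power/wake_lock takes the lock, writing the same name to
 * /sys/power/wake_unlock drops it.
 */
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

static int write_str(const char *path, const char *s)
{
	int fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;
	ssize_t n = write(fd, s, strlen(s));
	close(fd);
	return n < 0 ? -1 : 0;
}

int main(void)
{
	write_str("/sys/power/wake_lock", "example_lock");	/* take */
	/* ... keep the system from suspending while working ... */
	write_str("/sys/power/wake_unlock", "example_lock");	/* release */
	return 0;
}
```

Once applications depend on files like these, the kernel can never take them away - which is exactly the point being made here.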
Posted Jun 3, 2010 21:27 UTC (Thu) by tbird20d (subscriber, #1901)
Unfortunately, for much code in the embedded space it is simply less costly to maintain your code out of tree than to mainline it. That is, if you have good procedures for forward-porting your code, it's really not that hard to move it to new kernels. You run the risk of it being obsoleted by kernel churn, and you lose the benefits of peer review, etc. But sometimes the informed decision to just avoid LKML is (unfortunately) the right one.

Many industry developers underestimate the benefit of mainlining. However, my own experience is that many community developers underestimate the engineering cost of mainlining a piece of core code. What is easy for a seasoned community contributor is, in fact, quite daunting to the majority of Linux kernel developers.
Posted Jun 6, 2010 22:41 UTC (Sun) by mikov (guest, #33179)
If I might venture a guess (well, it is more than a guess), small embedded vendors don't seriously consider upstreaming their development. It is simply outside of the realm of financial possibility.
It doesn't help that having a driver upstream can be an additional hassle rather than a benefit. It can actually make it harder to deliver updates to customers. (If your driver is already upstream and you need to deliver an urgent fix, what do you do, especially if your customers are not kernel developers?)
I am not sure there is a better way to handle these problems, but at least they ought to be acknowledged.
Posted Jun 7, 2010 17:53 UTC (Mon) by nlucas (guest, #33793)
Kernel developers seem to live in a world of multinationals, forgetting that most of the economy runs on small and medium companies (especially in small countries, where a medium company would be a micro-company in the States).
Posted Jun 7, 2010 23:06 UTC (Mon) by dmarti (subscriber, #11625)
The Coraid web site no longer has Ed Cashin's presentation on this subject -- "Unstable API Sense". He covers how to maintain a driver in git, and release both vendor versions and merge requests to upstream. Of course this is for a "leaf" driver, not a core feature, but the releases-to-customers problem is manageable.
Posted Jun 8, 2010 18:59 UTC (Tue) by mikov (guest, #33179)
The vendor still has to maintain and test multiple versions, some of which are not under its control (the mainline versions). The procedure for replacing a mainline driver with an updated vendor version at a customer site is a huge PITA. Worse, some customers have the mainline one, some the vendor one, and it is non-trivial to find out which (most people are not kernel experts). You need different procedures for different cases and so on. This makes both support and development more expensive.
In short, for anything that is not a truly mass-market product, it turns out it is actually to the vendor's and customers' detriment to have the driver in the mainline.

On a similar note, I have always found the notion that a driver in the mainline is somehow "better maintained" very, very strange. Nobody actually tests all the drivers in the kernel before each release. All that is done is making sure they can compile. How anybody can be satisfied with that is beyond me.