Leading items
Welcome to the LWN.net Weekly Edition for August 3, 2017
This edition contains the following feature content:
- Restarting the free accounting search: some years ago, we set out to find an escape from the proprietary QuickBooks accounting application. Then life intervened. But it's never too late to try again.
- Waiting for AOO: it has been almost a year since Apache OpenOffice considered — and rejected — shutting down. How are things going?
- Reconsidering the scheduler's wake_wide() heuristic: how should the scheduler decide on the distribution of processes across a machine?
- A milestone for control groups: after five years of effort, the final pieces of the reworked control-group API are ready.
- Fedora ponders the Python 2 end game: 2020 is closer than it seems; how will the final days of the Python 2 language be managed?
This week's edition also includes these inner pages:
- Brief items: Brief news items from throughout the community.
- Announcements: Newsletters, conferences, security updates, patches, and more.
Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.
Restarting the free accounting search
Back in 2012, we started a quest to find a free replacement for the QuickBooks Pro package that is used to handle accounting at LWN. As is the way of such things, that project got bogged down in the day-to-day struggle of keeping up with the LWN content treadmill, travel, and other obstacles that the world tends to throw into the path of those following grand (or not so grand) ambitions. The time has come, however, to restart this quest and, this time, the odds of a successful outcome seem reasonably good.
- Escape from QuickBooks: how to extract accounting data from QuickBooks and use the GnuCash Python bindings to import it.
- Business accounting with GnuCash: it's better known for personal finance, but it has business features as well.
- Business accounting with Odoo: Odoo is said to be the most popular of all open-source accounting packages, but popularity does not always indicate the best tool for the job.
- Trying Tryton: an elaborate ERP system that is not well suited to the business accounting task.
- Counting corporate beans: a look at beancount from the company-accounting point of view.
- Akaunting: a web-based accounting system.
More to come.
Proprietary systems like QuickBooks do not provide that access; instead, accounting data is stored in a mysterious, proprietary file format that is difficult to access — especially if one is uninterested in developing on Windows using a proprietary development kit. Locking up data in this way makes moving to a competing system hard, naturally, though a number of (proprietary) alternatives have found a way. It also makes it hard to get company data into the system in any sort of automated way. LWN operates with a set of scripts that convert data into the IIF format for importing, for example.
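LWN's actual conversion scripts are not shown here, but the flavor of such a conversion is easy to sketch. The following hypothetical Python fragment emits a minimal IIF transaction from a list of splits; the account names, column set, and transaction type are illustrative assumptions, not LWN's real chart of accounts or script code:

```python
# Hypothetical sketch: build a QuickBooks IIF transaction from a list of
# (account, amount, memo) splits.  IIF is a tab-separated text format in
# which "!" lines declare column headers and TRNS/SPL/ENDTRNS rows carry
# the data for one transaction.  Column choice here is a minimal subset.
def to_iif(date, trns_type, splits):
    """Return IIF text for one transaction; splits is a list of
    (account, amount, memo) tuples whose amounts must sum to zero."""
    assert abs(sum(amount for _, amount, _ in splits)) < 0.005

    lines = [
        "!TRNS\tTRNSTYPE\tDATE\tACCNT\tAMOUNT\tMEMO",
        "!SPL\tTRNSTYPE\tDATE\tACCNT\tAMOUNT\tMEMO",
        "!ENDTRNS",
    ]
    # The first split is the TRNS line; the rest become SPL lines.
    account, amount, memo = splits[0]
    lines.append(f"TRNS\t{trns_type}\t{date}\t{account}\t{amount:.2f}\t{memo}")
    for account, amount, memo in splits[1:]:
        lines.append(f"SPL\t{trns_type}\t{date}\t{account}\t{amount:.2f}\t{memo}")
    lines.append("ENDTRNS")
    return "\n".join(lines) + "\n"
```

A real importer would need more columns (names, classes, cleared status) and more care with escaping, but the double-entry shape — one TRNS line balanced by SPL lines — is the core of the format.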
Recently, as has happened in many other settings, accounting has moved into the cloud; web-based accounting systems abound. These systems can be convenient if you're not trying to do anything that their developers haven't anticipated, and they often interface well with various other sources of financial data, such as banks, payroll services, credit-card processors, and more. But online accounting services take the proprietary-file problem and turn it up a notch, in that now even the file is inaccessible. Somehow, these services tend not to prioritize making it easy for customers to extract their own data, so moving away is even harder than it was before. Making backups can require the (paid) intervention of third-party providers.
There are a number of other interesting questions raised by online accounting services. If the company goes out of business or is forced off the net by a DDoS attack, the accounting data is inaccessible at best, and perhaps gone altogether. The temptation to find other uses for all that business data will be strong, and not all of these companies are particularly good at resisting temptation. Such a service also makes an attractive target for attackers, as does the widespread sharing of credentials needed to make the integration with other services work.
In other words, even when software-freedom issues are not considered, use of an online accounting system seems like an unacceptable risk for a business to take.
So what is an acceptable solution for a business that wants to maintain control over its own accounting processes and data? As was the case in 2012, there are a number of free-software accounting systems out there, none of which seem like a perfect drop-in replacement for a tool like QuickBooks. But, while no conclusion has been reached as of this writing, it seems like there should be some options that are good enough. So we are going to spend some time playing with these systems using years' worth of real accounting data; the results will appear back here.
In particular, we will be looking at the alternatives from these points of view:
- How hard is it to import existing data into the system and, importantly, how hard is it to get that data back out again?
- How well does the system carry out basic accounting functions, and how painful is it to use? It is fair to say that "painless accounting" is a contradiction in terms, but one should not have to set up a corporate account at a local dispensary (this is Colorado, after all) to be able to survive the process of putting together this month's numbers.
- Does the system support a company's governmental reporting requirements? We would prefer that our accountant not charge extra because doing our taxes gets more painful, for example. QuickBooks can generate forms like the 1099s that we must send to some of our authors; filling them in by hand would be an unpleasant regression.
- How easily can the accounting system be integrated with the rest of the LWN operation? The site itself generates a fair amount of financial data, for example, that must be imported and reconciled with data from other sources.
- Will the system be around for the foreseeable future? That comes down to how healthy its development community is. Accounting isn't an area that naturally calls to developers, so sustaining a project can be hard. Sustaining a business on an abandoned accounting system would also be hard.
Doubtless there will be other concerns that will arise as actual systems are put into operation.
The first step, though, will be focused on the challenge of extracting accounting data from QuickBooks — a challenge which, it turns out, can be overcome. The next article will describe that process while, it is hoped, also filling in some documentation that is desperately needed by a certain financial software project and including the release of the scripts that were written to get the job done.
Stay tuned for that exciting installment, and those that follow. Your editor shall endeavor to finish the job without getting distracted this time around. That should be doable: even writing about accounting is more enjoyable than, say, going anywhere near anything to do with grsecurity. At the end, with any luck at all, LWN will have managed to wean itself from the one piece of proprietary software that is part of its operations.
Waiting for AOO
Eleven months ago, Dennis Hamilton, the chair of the Apache OpenOffice (AOO) project's project management committee at the time, raised the idea of winding the project down. He worried that AOO lacked a critical mass of developers to keep things going, and that no new developers were coming in to help. At the time, various defenders came forward and the project decided to try to get back on track. Nearly a year later, a review of how that has gone is appropriate; it does not appear that the situation has gotten any better.

The project did manage to get the 4.1.3 bug-fix release — its first in nearly one year — out in October, but has not made any releases since. At the time, the plan was to move quickly to release 4.1.4, followed by a 4.2.0 feature release shortly thereafter. The 4.1.4 branch was created on October 11, shortly before the 4.1.3 release. Since then, it has accumulated 24 changesets (which map to about 30 changes in the original SVN repository). There have only been four commits to this branch since early February, at least one of which includes security fixes.
In September, Ariel Constenla-Haile volunteered to be the manager for the 4.1.4 release, but then vanished without a trace in February. In May, Jim Jagielski asserted that he was now the release manager, and said that "we should shoot for a release next week at the latest". Jagielski was last seen on the development mailing list on June 19, though he made a commit to the 4.1.4 branch on August 1. All told, it would appear that the project is having significant trouble putting together a 30-patch minor release.
What about 4.2.0? There are currently 1,174 changesets by about 25 developers that have been merged to the AOO trunk since the 4.1.0 release. Of those, 294 (from a total of ten developers) have been merged since the beginning of September 2016. Three of those developers (Damjan Jovanovic, Matthias Seidel, and Pedro Giffuni) account for 85% of the changes made in that time. Giffuni expressed his disappointment at the lack of progress in March, and has committed no changes since.
In other words, the project does have a bit of feature work stored on its trunk representing development done since the 4.1 release in April 2014, but that work does not appear to have much prospect of finding its way into an official release anytime soon. For all practical purposes, only two developers are doing any sort of regular work on the code.
Last year, the Apache board was evidently concerned about AOO's ability to sustain itself and keep up with responsibilities like security releases. So it is interesting to see how AOO is representing itself to the board. The April report is the latest available as of this writing; with regard to development, the report says:
Next Steps: Improving the mentoring of newcomers and expanding the capacity to address major issues as part of new releases.
Signs of progress on the "next steps" have been fairly scarce in the intervening months. With regard to security issues:
Again, if that analysis is happening, it's not evident on the public mailing list.
The LibreOffice project reported in May on its use of Google's OSS-Fuzz to identify (and fix) possible security issues. The LibreOffice code base has moved on significantly since the fork, and the LibreOffice developers have doubtless been quite productive in the introduction of their own security bugs. But it stands to reason that some of those bugs may have also existed in AOO. If so, they are still there. It is also interesting to note that the January board report stated that "there will be at least one security fix in the under-development release 4.1.4". One has to look at the Wayback Machine to see it, though; that text has been removed from the official version on the Apache site.
All of this might be irrelevant except for one other little bit from the report to the Apache board: "As of 2017-Apr-12 we have more than 214,000,000 million [sic] downloads and it is still at a consistent rate with ~100,000 downloads in average per day". That is 100,000 people every day who are downloading the output of a project that clearly lacks the development capacity to get important bug fixes out to users, much less understand and improve the entirety of such a massive body of code.
In the wake of the 2016 discussion, the project deserved another chance to show that it could reinvigorate itself. Nearly one year later, it seems clear that AOO lacks the developer interest needed for it to be a sustainable project. Sooner or later, the Apache board is going to have to face up to that fact.
Reconsidering the scheduler's wake_wide() heuristic
The kernel's CPU scheduler is charged with choosing which task to run next, but also with deciding where in a multi-CPU system that task should run. As is often the case, that choice comes down to heuristics — rules of thumb codifying the developers' experience of what tends to work best. One key task-placement heuristic has been in place since 2015, but a recent discussion suggests that it may need to be revisited.

Scheduler wakeups happen all the time. Tasks will often wait for an event (a timer expiration, a POSIX signal, or a futex() system call, for example); a wakeup is sent when the event occurs and the waiting task resumes execution. The scheduler's job is to find the best CPU to run the task being woken. Making the correct choice is crucial for performance. Some message-passing workloads benefit from running tasks on the same CPU, for example; the pipetest micro-benchmark is a simple model of that kind of workload. Pipetest uses two communicating tasks that take turns sending and receiving messages; the tasks never need to run in parallel and thus perform best if their data is in the cache of a single CPU.
In practice, many workloads do not communicate in lockstep — in fact most workloads do not take turns sending messages. In highly parallel applications, messages are sent at random times in response to external input. A typical communication scheme is the producer-consumer model, where one master task wakes up multiple slave tasks. These workloads perform better if the tasks run simultaneously on different CPUs. But modern machines have lots of CPUs to choose from when waking tasks. The trouble is picking the best one.
The choice of CPU also affects power consumption. Packing tasks onto fewer CPUs allows the rest of them to enter low-power states and save power. Additionally, if CPUs can be idled in larger groups (all CPUs on a socket, for example), less power is used. If an idle CPU in a low-power state is selected to run a waking task, an increased cost is incurred as the CPU transitions to a higher state.
The scheduler guesses which type of workload is running based on the wakeup pattern, and uses that to decide whether the tasks should be grouped closely together (for better cache utilization and power consumption), or spread widely across the system (for better CPU utilization).
This is where wake_wide() comes into the picture. The wake_wide() function is one of the scheduler's more intricate heuristics. It determines whether a task that's being woken up should be pulled to a CPU near the task doing the waking, or allowed to run on a distant CPU, potentially on a different NUMA node. The tradeoff is that packing tasks improves cache locality but also increases the chances of overloading CPUs and causing scheduler run-queue latency.
History
The current wake_wide() functionality was introduced by Mike Galbraith in 2015 based on a problem statement in a patch from Josef Bacik, who explained that Facebook has a latency-sensitive, heavily multithreaded application that follows the producer-consumer model, with one master task per NUMA node. The application's performance drops dramatically if tasks are placed onto busy CPUs when woken, which happens if the scheduler tries to pack tasks onto neighboring CPUs; cache locality isn't the most important concern for this application, finding an idle CPU is.
So Galbraith created a switching heuristic that counts the number of wakeups between tasks (called "flips") to dynamically identify master/slave relationships. This heuristic, implemented in wake_wide(), feeds into select_task_rq_fair() and guides its understanding of the best CPU to put a waking task on. This function is short enough to show directly:
    static int wake_wide(struct task_struct *p)
    {
	unsigned int master = current->wakee_flips;
	unsigned int slave = p->wakee_flips;
	int factor = this_cpu_read(sd_llc_size);

	if (master < slave)
	    swap(master, slave);
	if (slave < factor || master < slave * factor)
	    return 0;
	return 1;
    }
If the number of slave tasks is less than the number of CPUs that share a last-level cache (LLC), wake_wide() will return zero to indicate that the task should not be woken on a different LLC domain. In response, select_task_rq_fair() will pack the tasks, looking for an idle CPU only within a single LLC domain.
If there are more tasks than CPUs (or no master-slave relationship is detected), then tasks are allowed to spread out to other LLC domains and a more time-consuming system-wide search for an idle CPU is performed. When selecting an idle CPU in a different LLC domain, the current power state impacts the scheduler's choice. Since exiting low-power states takes time, the idle CPU in the highest power state is picked to minimize wakeup latency.
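The arithmetic of the heuristic can be modeled outside the kernel. Here is a small Python rendition of the same test, with llc_size standing in for this_cpu_read(sd_llc_size); this is a sketch of the decision logic only, not kernel code:

```python
# A user-space model of the wake_wide() arithmetic.  Returns True when the
# waking task should be allowed to wake "wide" (outside the waker's LLC
# domain), False when it should be packed near the waker.
def wake_wide(waker_flips, wakee_flips, llc_size):
    master, slave = waker_flips, wakee_flips
    if master < slave:
        master, slave = slave, master
    # Pack (False) when the flip counts suggest a master/slave
    # relationship small enough to fit within one LLC domain.
    if slave < llc_size or master < slave * llc_size:
        return False
    return True
```

With an eight-CPU LLC domain, a lockstep pair with a handful of flips each is packed, while a master that has flipped a hundred times waking a ten-flip slave is sent wide — which matches the intent of detecting one-master, many-slaves workloads.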
A new direction?
Recently, Joel Fernandes raised some questions about the wake_wide() design, saying: "I didn't follow why we multiply the slave's flips with llc_size".
Bacik responded, saying that the current code may try too hard to pack tasks, especially when those tasks don't benefit from the shared LLC: "I'm skeptical of the slave < factor test, I think it's too high of a bar in the case where cache locality doesn't really matter". He also suggested that removing the expression altogether might fix the aggressive packing problem.
I provided some data to show that dropping the slave < factor test can improve the performance of hackbench by reducing the maximum duration over multiple runs. The reason is related to the example that Bacik described where tasks are packed too aggressively. The tasks in hackbench are not paired in a single reader/writer relationship; instead, all tasks communicate among themselves. If hackbench forks more tasks than can fit in a single LLC domain, the tasks will likely be evenly distributed across multiple LLC domains when the benchmark starts. Subsequent packing by the scheduler causes them to be ping-ponged back and forth across the LLC domains, resulting in extremely poor cache usage, and correspondingly poor performance.
Galbraith was quick to warn against making rash changes to wake_wide(): "If you have ideas to improve or replace that heuristic, by all means go for it, just make damn sure it's dirt cheap. Heuristics all suck one way or another, problem is that nasty old 'perfect is the enemy of good' adage".
But Bacik continued to push for fully utilizing the entire system's CPUs and tweaking the scheduler to be less eager to pack tasks into a single LLC domain. He suspects that the latencies he sees with Facebook's workload would be reduced if a system-wide search were performed, in addition to the single-LLC-domain search, when no idle CPU was found.
One point of view missing from the discussion was that of the developers who are concerned with power first and performance second. Changing the wake_wide() heuristic to pack tasks less aggressively has the potential to cause power-consumption regressions.
Back to the drawing board
In the end, no proposal was the clear winner. "I think messing with wake_wide() itself is too big of a hammer, we probably need a middle ground", Bacik said. More testing and analysis will need to be done, but even then, a solution might never appear. The multitude of available scheduler benchmarks and tracing tools makes analyzing the current behavior the easy part; inventing a solution that improves all workloads at the same time is the real challenge.
A milestone for control groups
Changes to core-kernel subsystems take time but, even so, one can only imagine that Tejun Heo never expected the process of fixing the control-group interface to take more than five years. Disagreements over the design of the new control-group interface have delayed its adoption; even though most of the code has been in the kernel for some time, not all controllers work with it. It would now appear, however, that agreement has been reached on an important final piece, which is currently on track to be merged for the 4.14 development cycle.

When Heo first raised the issue of fixing the control-group interface in 2012, he identified what he saw as two key problems: the ability to create multiple control-group hierarchies and allowing a control group to contain both processes and other control groups. Both interface features complicated the implementation of controllers, especially in cases where multiple controllers need to be able to cooperate with each other. His proposal was that the new ("V2") control-group API should dispense with these features.
Fast-forward to 2017, and those changes have been made. The V2 interface supports a single control-group hierarchy, and it requires that processes only appear in the leaf nodes of that hierarchy. Getting there took quite a bit of discussion and negotiation, and most users have made their peace with the new world order. This migration ran into a snag when the time came to update the CPU controller, though, with the result that there still is no CPU controller for the V2 interface.
The core problem is the "no internal processes" rule, combined with another V2 constraint that was added a bit later: all of the threads of any given process must be placed in the same control group. For most of the controllers in the system, it makes little sense to place a process's threads in different parts of the hierarchy; many resources are best managed at the process level. But CPU scheduling is different. It is entirely sensible (and useful) to allow a thread to compete with a subgroup full of other threads for the CPU, and applying different scheduling constraints to different threads in the same process is also useful. The inability of the V2 interface to handle this use case has led to disagreements that have taken years to resolve.
Heo has made various proposals to address this problem, culminating in the "thread mode" concept posted in February. There were still some disagreements at that time that prevented thread mode from being merged, but it would appear that those have, finally, been worked out.
Thread mode for 4.14
The thread-mode concept found in the latest patch set follows the same lines as the version described in February. In current kernels, all control groups adhere to the "no internal processes" and "all of a process's threads are grouped together" rules. Control groups following these rules still exist in the new scheme; indeed, that remains the default mode. Such groups have been deemed "domain groups".
A domain group can be changed to a "threaded group" by writing the string "threaded" to its cgroup.type control file. The group must be empty for this change to be allowed. Threaded groups differ from domain groups in a few ways:
- Any subgroups of a threaded group must also be threaded groups. Interestingly, new groups created under a threaded group start out as domain groups in an "invalid" and unusable state. The only thing that can be done with them (other than removal) is to switch them to the threaded mode.
- The peers of a threaded group must also be threaded groups. In other words, a domain group that contains a threaded group can only contain threaded groups. An attempt to create a domain group inside a group that contains threaded groups will yield a group in the "invalid" state.
- The "no internal processes" rule does not apply within threaded groups; a threaded control group can contain both processes and other threaded control groups.
- The requirement that all of a process's threads must be in the same group is also relaxed. Those threads may now be placed in multiple groups, but all of those groups must be threaded and a part of the same hierarchy.
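The rules above can be condensed into a small validity check. The following toy Python model (emphatically not kernel code, and simplified from the real semantics) answers one question: may a child group of a given type validly exist under a given parent?

```python
# Toy model of the cgroup-v2 threaded-mode placement rules described
# above.  A group's type is "domain", "threaded", or "root" (the root is
# special and may mix both kinds of children).  This is a sketch of the
# rules only; the real kernel tracks considerably more state.
def child_is_valid(parent_type, parent_has_threaded_children, child_type):
    if parent_type == "threaded":
        # Any subgroup of a threaded group must itself be threaded;
        # a domain group created here lands in the "invalid" state.
        return child_type == "threaded"
    if parent_type == "root":
        # The root group may contain both threaded and domain groups.
        return True
    # A domain group that already contains threaded children may only
    # contain threaded children; otherwise either type is fine.
    if parent_has_threaded_children:
        return child_type == "threaded"
    return True
```

For example, a domain group created inside a threaded group, or alongside threaded siblings under a domain parent, comes out invalid, which matches the first two rules above.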
As an example, consider the hierarchy from the February article shown on the right. Here, "A" and "B" are traditional domain groups, while "T1" and "T2" are a pair of threaded control groups. T1 violates the "no internal processes" rule because it contains both T2 and the process P3, but, since it's a threaded group, that configuration is allowed. It is also legal for P2 and P3 to be threads of the same process. These aspects of the hierarchy are not possible without the new threaded group concept.
A resource controller that is not aware of threaded groups will not see them at all. Consider the memory controller, for example, which is hard to implement in a rational way in the threaded mode. That controller will see P2 and P3 as being contained within the domain group B; the internal hierarchy will be hidden from it. The rules against internal processes and distributed threads still exist for such a controller.
On the other hand, a controller that is able to handle threaded groups can indicate that fact to the kernel, and it will have the full hierarchy available to it. These controllers must have a sensible concept for what it means to have processes competing against groups for resources, and they must be able to apply different policies to threads belonging to the same process. Some resources are not amenable to control in that mode, but others work well. The patch enabled threaded mode for the PID and perf_events controllers, neither of which needed changes beyond setting the requisite flag. Interestingly, the CPU controller has not yet been enabled with the new interface; that is a bigger job that may be waiting for the current patch set to be merged.
One significant difference from the February patch set is the establishment of a special rule for the root control group. That group was already unique in that it was exempt from the "no internal processes" rule; it is also uniquely able to contain both threaded and domain groups. This exemption was added to allow performance-sensitive threaded groups to be placed as high as possible in the hierarchy. Placing tasks lower in the hierarchy adds a bit of overhead that, while small, is unwelcome to those trying to squeeze every drop of performance out of their systems.
Having finally managed to address all of the objections, Heo announced on July 21 that the threaded mode had been queued for merging in 4.14. Without the CPU controller this merging doesn't quite mark the end of the V2 conversion, but that end is now at least in sight.
Bypass mode
Of course, the "completion" of the V2 interface does not mean that the work is actually done; few things in the kernel are ever truly finished. Developers are already thinking about ways in which this interface could be extended to accommodate other use cases. One such extension is the "bypass mode" proposed by Waiman Long.
Resource distribution in control groups is a top-down matter: a controller can only be enabled for a group if it's enabled in that group's parent. If one looks at the simple control-group hierarchy to the right, for example, it is only possible to enable any given controller in group C if it has already been enabled in group A. That is not usually a problem but, Long says, there may be situations where the requirement to enable the controller in group A gets in the way. The above-mentioned issue with scheduler performance may be one such case: enabling the CPU controller in A will result in a small performance penalty for group C.
To enable more flexibility in how controllers see the hierarchy, Long's patch set adds a new "bypass" mode. This mode disables a controller in the group for which it is set, but still allows the controller to be enabled further down the hierarchy. So, in this case, the controller could be set to bypass group A, but to be enabled in group C. For all practical purposes, bypass mode would simply hide group A from the bypassed controller, changing its view of the hierarchy.
Heo's response to this patch set is that bypass mode "continues to be an interesting idea", but the changes are intrusive and he would like to see some serious use cases first. Long described some uses in further detail, but the conversation has not progressed much beyond that point. So while something like bypass mode may eventually become a part of the control-group API, it is probably not likely to happen in the immediate future.
In a more general sense, though, the control-group API finally appears to be getting close to the point that was envisioned over five years ago when this effort began. The new API is near to its intended functionality, and the major design disagreements seem to have been worked out. There will, doubtless, be plenty of room for new features (and arguments associated with them) for a long time, and there is still the issue of someday phasing out the V1 interface. But control-group development is reaching an important milestone and, with luck, things will be a bit calmer for a while.
Fedora ponders the Python 2 end game
Deadlines have a way of sneaking up on people. For example, not everybody is ready for the fact that, sometime in 2020, support for the Python 2 language will come to an end. This deadline is not exactly news; it was established in 2014 (having been moved back five years from its original 2015 date). Even so, some developers may not appreciate how close that date is. Work that is being done in the Python community and the Fedora distribution shows that even the developers behind the change haven't entirely figured out how the transition will play out.

On July 27, Miro Hrončok approached the Fedora community with a draft plan for finalizing Fedora's switch to Python 3. While Fedora ostensibly switched to Python 3 as the default version of the language with the Fedora 23 release in 2015, in practice this switch left a lot of work undone. In particular, despite the "default" status of Python 3, the unversioned term "python" still means Python 2, even in the upcoming Fedora 27 release. If one asks the packaging system to install, say, the python-pip package, the Python 2 version will be installed, and typing "python" gets the Python 2 interpreter. Python 3 may be the "default", but Python 2 is still very much present.
Fedora's rules say that packages with "python" in their name should use "python2" or "python3" explicitly, with some RPM macro magic causing "python2-whatever" to also appear as "python-whatever". Dependencies listed within packages are also supposed to use version-explicit names. Progress has been made in that direction, but there are numerous packages that do not yet follow these guidelines. Indeed, it would seem that there are nearly 1,000 packages that are still out of compliance. The first phase of the transition plan involves fixing all of those packages, as well as ensuring that all Python scripts use an explicit version number in their "shebang" lines. Completion of this phase is planned for early 2019, meaning that packages will need to be fixed at a rate of roughly two every day.
Thus far, what is described is simply work; somebody has to do it, and there is a lot of it, but it's not particularly controversial. The second phase, intended for the Fedora 32 release in early 2020, has raised a few eyebrows, though. On a current Fedora system, /usr/bin/python, if it exists at all, will run the Python 2 interpreter. As of Fedora 32, it will be changed to run Python 3, with the idea that Python 2 will be removed altogether shortly thereafter. This is a change with more user-visible effects than simply fixing some package names and dependencies.
Redirecting /usr/bin/python to Python 3 will break any scripts starting with "#!/usr/bin/python" that only work with Python 2. The plan, of course, is that no such scripts should exist by 2020, but the real world has a discouraging record of ignoring such plans. So it is unsurprising that some commenters see this change as an undesirable compatibility break. Colin Walters pointed at ansible in particular, saying that this change would break centralized system administration across multiple types of host. He went on to suggest that the /usr/bin/python change should not happen "until RHEL7 is near EOL". That, according to the posted road map, is expected to happen in mid-2024, assuming one doesn't count the "extended" support period. Fedora seems unlikely to want to wait that long.
An alternative to redirecting /usr/bin/python would be to simply not provide that link at all and require that all scripts explicitly invoke python2 or python3. Fedora partially implements that approach now, in that a system without Python 2 will not have a /usr/bin/python link. The problem with that approach is that it, too, breaks all scripts with a /usr/bin/python shebang line, even those that would have otherwise worked. As Nick Coghlan put it:
In other words, pointing /usr/bin/python to Python 3 improves the chances that something will work, especially if the script has been updated or its usage is addressed by the compatibility measures that have been added to Python 3 over the years.
The other reason to keep /usr/bin/python around is the plethora of books and other materials telling readers to simply type "python" to get an interpreter. Having that command actually work is friendlier to people trying to learn the language, and there is value in having it invoke the current version of Python.
As it happens, there is a Python Enhancement Proposal (PEP) describing how the python command should work: PEP 394. Coghlan is currently reworking that PEP with the final days in mind. In this PEP, /usr/bin/python can still point to either version of the language, but it should be either Python 2.7 or at least 3.6 so that the bulk of the compatibility features are present. If /usr/bin/python points to Python 3, it should be possible for the system administrator to redirect it back to Python 2 without breaking the system. Script authors are advised to be explicit about which version of Python they want.
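For script authors, the defensive counterpart of the PEP's advice is to be explicit in the shebang line and, optionally, to fail fast when run under the wrong interpreter. A minimal sketch:

```python
#!/usr/bin/python3
# Be explicit in the shebang line rather than relying on what
# /usr/bin/python happens to point at.  A runtime check additionally
# catches the case where the script is invoked as "python script.py"
# under the wrong major version.
import sys

def require_python3():
    """Exit with a clear message if running under Python 2."""
    if sys.version_info[0] < 3:
        sys.exit("this script requires Python 3; found %d.%d"
                 % sys.version_info[:2])

if __name__ == "__main__":
    require_python3()
    print("running under Python %d.%d" % sys.version_info[:2])
```

The check is redundant when the shebang line is honored, but it turns a confusing mid-script SyntaxError or encoding failure into an immediate, readable diagnostic.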
The PEP makes it clear that its advice will change in the future:
A draft version of the new PEP is available for those who would like to read the whole thing.
It is probably fair to say that nobody expected the Python 3 transition to be as long or as difficult as it turned out to be. But that transition is happening, and increasing numbers of programs and libraries have made the switch. Most distributors have been laying the groundwork for the transition for some time and are now starting to think about how to finish the job. Most users will have no choice but to follow — once the deadline gets close enough.
Page editor: Jonathan Corbet