Getting the message from the kernel

[Posted June 19, 2007 by corbet]

As a general rule, Linux users would rather not hear from their kernel. If all is well, devices are working, applications are running, and the kernel just quietly makes it all happen. When things go wrong, however, it may become necessary to dig through the messages that the kernel puts out. These messages sometimes make sense to the developers who created them, but they are not always clear to the rest of the world. Neal Stephenson, in his In the Beginning was the Command Line, describes Linux kernel messages as having "the semi-inscrutable menace of graffiti tags." For a kernel developer, often as not, the main value of a kernel message is to pinpoint the location of the complaining code - from which the real problem can be determined.

Non-developers have a harder time using kernel messages in that way, though, and people who are not native English speakers are at even more of a disadvantage. So it is not surprising that the topic of fixing up kernel messages has popped up occasionally. It's back, possibly in a more serious form this time around.

People who would reform kernel messages generally have two goals in mind:

They would like for every message to have a unique identifier attached to it. This idea brings back memories of VMS or most IBM operating systems, which have used message identifiers for decades. The main purpose behind message identifiers is to allow the system administrator (or the support person they have called) to look up the identifier in a manual and figure out what the message is really saying. Various legacy operating systems have come with message manuals which take up significant amounts of shelf space; they contain a (relatively) detailed explanation of the problem and suggestions for how to make the problem go away.
It is much easier to maintain translations for messages which have unique identifiers attached to them. A Linux system which could output messages in multiple languages would be more approachable for much of the potential user base.

The problem, of course, is that attaching identifiers to messages is a significant job. There are tens of thousands of printk() calls in the kernel; each of them would need to have an identifier assigned and the code changed. New messages are added - in large numbers - with every kernel release; it's easy to imagine that the overhead of putting identifiers onto all of those messages would irritate developers in a hurry. For these reasons, Linus has, in the past, rejected schemes aimed at improving kernel messaging.

The idea has come back anyway. A new approach has been proposed by users in Japan who are having trouble supporting Linux as well as they would like. In this scheme, every kernel message would be assigned a component name and a message number. The component would be a per-file define:

    #define KMSG_COMPONENT "railgun"

Then printk calls would be modified to include the message number:

    printk(KMSG_ERR(100) "Rail gun fired accidentally - sorry\n");

The end result would be a message prepended with the string "railgun.100:", enabling the message to be translated or looked up in a manual. To help ensure that there is a manual, the proposal requires kerneldoc-style documentation of messages within the source; something like:

    /**
     * message
     * @100: 
     *
     * Description:
     * The rail gun fired accidentally in the absence of a specific 
     * user request.  
     *
     * User Response:
     * Operator should be sure to stand to the side.
     */

The kerneldoc scripts would be upgraded to collect all of these message descriptions and turn them into a printable manual. Another tool would check source files and complain about messages which lack accompanying descriptions.
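A prefix like "railgun.100:" can be produced entirely at compile time, since adjacent C string literals are concatenated and the # operator turns the message number into a string. What follows is a minimal userspace sketch of that idea, not the proposal's actual macro: the KERN_ERR value is a stand-in for the kernel's log-level prefix, and railgun_report() is a hypothetical helper that formats into a buffer so the result can be inspected.

```c
#include <stdio.h>

/* Stand-in for the kernel's log-level prefix (assumption for this sketch). */
#define KERN_ERR "<3>"

/* Per-file component name, as in the proposal. */
#define KMSG_COMPONENT "railgun"

/*
 * Adjacent string literals are joined by the compiler and #nr turns the
 * message number into a string, so KMSG_ERR(100) expands to
 * "<3>" "railgun" "." "100" ": ".
 */
#define KMSG_ERR(nr) KERN_ERR KMSG_COMPONENT "." #nr ": "

/* Hypothetical helper: format one tagged message into buf. */
int railgun_report(char *buf, size_t len)
{
    return snprintf(buf, len,
                    KMSG_ERR(100) "Rail gun fired accidentally - sorry\n");
}
```

Because the tag is part of the format string itself, a scheme along these lines would add no run-time cost beyond the few extra bytes in each message.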

Schemes like this have been greeted with complaints in the past, and the same happened this time around. The overhead of documenting messages in this way is more than many developers want to take on; David Miller expressed this feeling well:

"I think my general response to something like this, if it goes in, would be to stop emitting useful kernel log messages in the code I write because having to document it too on top of that is just too much extra work to be worthwhile."

Keeping the message descriptions current would also be a challenge - code is often changed without updating the neighboring comments; there is no reason to believe that message descriptions would get a higher level of attention.

Andrew Morton has come back with a counter proposal designed for easier developer acceptance. His scheme would add a new form of printk() which would take a message ID in some as-yet-undetermined format. That ID would be output with the message, but everything else - translations, descriptions, condolences, etc. - would be kept in a database outside of the kernel.

The key point is that developers would not be expected to do much of anything with this database - or even with their kernel messages. Instead, there would be a "kernel messages team" charged with maintaining this information. Occasionally somebody from that team would look over new code, add message IDs where needed, and send a patch to the maintainer. Unless they were personally interested in helping, developers would not have to worry about the new mechanism at all.
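Since the ID format in this scheme was still undetermined, any code can only be a guess. As a rough userspace sketch, assuming a plain string ID, a printk()-like variant might simply put the ID in front of the formatted text and leave everything else to the external database. The name format_with_id() and its signature are hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical sketch: write "<id>: <formatted message>" into buf.
 * Returns the length the full message needs (as snprintf() does),
 * or a negative value on a formatting error.
 */
int format_with_id(char *buf, size_t len, const char *id,
                   const char *fmt, ...)
{
    va_list ap;
    int used, rest;

    used = snprintf(buf, len, "%s: ", id);
    if (used < 0 || (size_t)used >= len)
        return used;            /* error, or no room left for the message */

    va_start(ap, fmt);
    rest = vsnprintf(buf + used, len - (size_t)used, fmt, ap);
    va_end(ap);

    return rest < 0 ? rest : used + rest;
}
```

The caller's format string is untouched here, which is what would let a separate team add IDs by patch without rewriting the messages themselves.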

There are a few gaps in this proposal; how the kernel message team would be funded (or otherwise motivated) is one of them. But it may be sufficiently low-impact to be accepted by the rest of the development community. Someday soon, Linux users, too, may have to make room on their shelves for a hefty messages manual.

Index entries for this article

Kernel Messages


Kernel message team

Posted Jun 21, 2007 2:25 UTC (Thu) by emgrasso (guest, #4029) [Link]

I'm a CM tool specialist, mostly working in Perl these days, but I have
programmed in C.

Scanning the kernel source for printk statements and providing patches to
fix the ones that don't match a desired format is the sort of work that I
could contribute to the Linux kernel.

I will contact Andrew Morton and volunteer.

Getting the message from the kernel

Posted Jun 21, 2007 3:04 UTC (Thu) by pj (subscriber, #4506) [Link] (6 responses)

What about the good ol' C macros __FILE__ and __LINE__? They would seem to pinpoint the problem rather exactly (given a particular kernel version).
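For illustration, here is a small sketch of how the two macros can be folded into a message prefix at compile time. LOG_HERE and link_down_message() are hypothetical names invented for this example, not kernel interfaces. The two-level STR macro is needed so that __LINE__ is expanded to its number before being turned into a string.

```c
/* Two-step stringification: STR(__LINE__) yields e.g. "42", not "__LINE__". */
#define STR_(x) #x
#define STR(x) STR_(x)

/* Hypothetical helper: prefix a literal message with "file:line: ". */
#define LOG_HERE(msg) (__FILE__ ":" STR(__LINE__) ": " msg)

/* Returns a string such as "drivers/net/b44.c:123: Link is down\n". */
const char *link_down_message(void)
{
    return LOG_HERE("Link is down\n");
}
```

The resulting tag is only meaningful for one exact version of the source, since any edit above the call shifts the line number.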

Getting the message from the kernel

Posted Jun 21, 2007 6:44 UTC (Thu) by pfavr (guest, #38205) [Link] (2 responses)

Yes! This is the way to go.

If you get messages from the kernel, then __FILE__ and __LINE__ are the easiest way to get people to grok the source.

People interested in looking up numbers on a list are probably building their own kernel anyway.

Using __FILE__ and __LINE__ will make sure the references are updated with changes to the kernel.

(and the source is the real documentation anyway :-)

Best regards,

Peter

Getting the message from the kernel

Posted Jun 21, 2007 20:57 UTC (Thu) by jordanb (guest, #45668) [Link] (1 responses)

The problem is that line numbers are volatile so it'd be difficult to keep a manual or (more likely) a translation table attached to the proper message.

Getting the message from the kernel

Posted Jul 2, 2007 9:19 UTC (Mon) by alext (guest, #7589) [Link]

Besides, the error message is already going to have to be unique, and therefore anyone capable of looking at the code meaningfully won't have much trouble locating it.

The big use must be to allow full external explanations to exist, so that admins etc. can apply any known configuration changes that get around the problem.

And as the start of the thread suggested, why is there so much resistance to potentially having something as simple as a call with a string and that string having a unique ID attached? It sounds like Linus being a bit precious rather than treating it like just adding a new hook to the code base for others to use in documenting behavior.

Getting the message from the kernel

Posted Jun 21, 2007 6:45 UTC (Thu) by tzafrir (subscriber, #11501) [Link] (2 responses)

To a developer: sure. To a user (system administrator): no.

Also note that emitting messages in a language other than English reduces the usefulness of a search engine as a reference guide for those cases.

(I'm not a native English speaker, but fluent enough)

Getting the message from the kernel

Posted Jun 21, 2007 9:24 UTC (Thu) by james (subscriber, #1325) [Link] (1 responses)

This is where unique message IDs really come in handy -- they're great for Googling. The message itself can be localised, but the message ID can be used to find descriptions and fixes in whatever language you like.

In this day and age, I don't see why "making life easy for search engines and their users" shouldn't be a major design point.

Getting the message from the kernel

Posted Jun 21, 2007 17:13 UTC (Thu) by cpeterso (guest, #305) [Link]

Definitely! If message IDs are just integers, users will never find them in Google.

For a good example, Microsoft's compiler errors have IDs such as C2097 and linker errors have IDs such as LNK2019. Googling those error codes usually brings up exactly what you were looking for.

Getting the message from the kernel

Posted Jun 21, 2007 4:40 UTC (Thu) by error27 (subscriber, #8346) [Link] (9 responses)

Is there any need to put docbook-style comments on a printk? Shouldn't the printk itself be self-explanatory, like "b44: eth0: Link is down."? If users don't understand the printk, they probably aren't going to understand the comment either.

Getting the message from the kernel

Posted Jun 21, 2007 6:44 UTC (Thu) by thedevil (guest, #32913) [Link] (3 responses)

Right, exactly my sentiment. The whole _idea_ of adding a "unique ID" seems rubbish to me: isn't the message string _itself_ already unique? If it isn't, it's just a handful of cases and it can easily be checked mechanically at each release. And how is some cryptic thing like RLGNERR100 better than "Railgun error 100" ??

Getting the message from the kernel

Posted Jun 21, 2007 7:47 UTC (Thu) by dlang (guest, #313) [Link]

no, the message itself is not always unique.

remember that messages are formed through printf, where you give a message format with variables and then have the variables fill in the blanks

it's not at all uncommon to see something like "error %s happened when doing %s"

depending on what the variables are filled in with you could have this happen anywhere in the kernel.

on a lot of my programs I add a number to the front of the message, even if it isn't unique it at least limits the number of places I need to look. the line and file macros mentioned above sound like exactly the right thing to use.

the main purpose of these tags is to look in the right place in the kernel, not to try and translate all possible kernel errors into multiple languages.

Getting the message from the kernel

Posted Jun 21, 2007 10:12 UTC (Thu) by ayeomans (guest, #1848) [Link] (1 responses)

Why not just do a hash function of the message string? Into (say) a 32-bit number. Any duplicate hashes could be treated as a bug and modified.

Should be a fully automatic job to scan the entire source for the printk strings to get the hash values, source file name (and line number if you wish). The catalogue could be used for translations, documentation, etc. And would not in itself create any extra work for kernel maintainers, apart from the occasional change to fix duplicate hashes.
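As a sketch of what such a scan might compute, the function below hashes a format string to 32 bits. The choice of FNV-1a is an assumption made for this example; the comment does not name a hash function. Hashing the format string rather than the rendered message keeps the ID the same whatever values get substituted in. The name msg_hash() is hypothetical.

```c
#include <stdint.h>

/*
 * 32-bit FNV-1a over a NUL-terminated format string.
 * 2166136261 is the FNV offset basis and 16777619 the FNV prime.
 */
uint32_t msg_hash(const char *fmt)
{
    uint32_t h = 2166136261u;

    while (*fmt != '\0') {
        h ^= (unsigned char)*fmt++;   /* mix in one byte */
        h *= 16777619u;               /* wraps modulo 2^32 */
    }
    return h;
}
```

Note that any rewording of a message, even a typo fix, would change its hash, so a catalogue built this way has to be regenerated for each release.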

Getting the message from the kernel

Posted Jun 21, 2007 10:18 UTC (Thu) by ayeomans (guest, #1848) [Link]

And having subsequently read the thread, that's just what is being proposed by many there.

Getting the message from the kernel

Posted Jun 21, 2007 17:07 UTC (Thu) by cpeterso (guest, #305) [Link]

Even if all printks were unique and self-explanatory, they are written in English. Many users would prefer localized messages in their native language. An id # allows that.

Getting the message from the kernel

Posted Jun 22, 2007 7:48 UTC (Fri) by adi (guest, #7892) [Link] (3 responses)

Well, if you are the specialist on a particular topic the message itself might be sufficient and sometimes indeed be self-explanatory to you, perhaps even knowing/understanding the source code, having read it so many times.

However, if you are "just" a sys admin managing a large variety of systems, you appreciate any help the system can give you, given the plethora of things that can go wrong, in identifying problems, understanding what caused them, evaluating their respective impact, and proposing possible remedies without your having to dig into the Linux kernel sources first. Except in rare cases the message itself can't give you all this information, and we certainly don't want novels to be issued as messages to cure this inherent deficiency.

I understand that such an approach asks kernel programmers to accept that messages define a form of event mechanism that others depend on for problem determination and automation. You may go so far as to say that they become a form of committed interface, and hence that printk calls should be written more consciously and prudently ...

Getting the message from the kernel

Posted Jun 22, 2007 18:17 UTC (Fri) by giraffedata (guest, #1954) [Link] (2 responses)

"Except in rare cases the message itself can't give you all this information, and we certainly don't want novels to be issued as messages to cure this inherent deficiency."

The right length of a message is somewhere between the traditional length and a novel. And it's the same length as developers would write in the "documentation" comment or database or whatever under the message ID proposals. The message manual will not have a novel -- it will contain a few sentences. And they will be fairly inaccurate.

The article talks about the long history of message IDs, but fails to put it in its historical context. Those first message manuals went with systems where storage (including disk space) was so precious you couldn't afford to put a text description of an error in it. The actual error messages had less than 12 characters of text, so they also had a message ID, which was an address in cheap tertiary storage: the paper manual.

Technology has progressed to where it is now a waste of resources to have a person look up a message. It's more efficient to have the computer just tell you what's wrong. But we've stuck with the tradition of terse error messages. They're usually one sentence or less, and in the Unix world, 3-4 words is considered ideal. Only part of this can be explained by programmer laziness; the rest must be just custom.

Other benefits of message IDs have been given here: enabling translation and searching problem databases. But enabling error messages to remain coy and withhold the majority of the information from you isn't one.

Getting the message from the kernel

Posted Jun 22, 2007 21:24 UTC (Fri) by jzbiciak (guest, #5246) [Link] (1 responses)

Hmmm... there's a tradeoff. Verbose error messages are very useful for the beginner, or for obscure error messages that happen very rarely. Terse error messages are more efficient, especially for errors that occur often.

Compare "permission denied" to "Your currently active user id, 'im14u2c', does not have write permission on the file '/tmp/xyzpdq'. This file is owned by 'im14u2c', but the user write permission bit on the file is not set. Please consult the 'chmod' man page."

The latter is very friendly to a new user. Just awesome. But, it would get real old real quick. And, depending on the context, the advice implied by the error message (in this case, chmod +w is implied) might be wrong advice. (For example, what if the file in question is an RCS controlled file that isn't checked out?) Perhaps a settable "user expert level" needs to be specified to indicate how chatty the system should be?

Getting the message from the kernel

Posted Jun 23, 2007 2:16 UTC (Sat) by giraffedata (guest, #1954) [Link]

I don't think it gets old like you think it would. To know, you'd have to try it for a while. I use a lot of software that prints out 5 line error messages on a terminal (because I wrote the code) and it really doesn't bother me. And consider that a lot of programs respond to the most casual error of them all -- fat-fingering -- by not issuing an error message at all but just dumping the full command syntax on the terminal. This seems to be quite popular.

Incidentally, I've found it's rarely a good idea to give advice on how to fix it in the message; the best you can do is to describe the problem. Same is true for a message manual -- the complete set of advice would be a textbook.

And I've tried the expert/novice thing (as a user), and that doesn't work. You're never expert enough that you know all the errors. But maybe something that avoids issuing the same verbose message frequently.

But really, that's all beside the point because we're talking about kernel messages. The kernel isn't interactive -- these things go primarily in a log thousands of lines long.

Getting the message from the kernel

Posted Jun 21, 2007 6:49 UTC (Thu) by mjthayer (guest, #39183) [Link] (1 responses)

What about a gettext-style scheme, with the actual translation done in the userspace logger? That way, all that needs to be changed in the kernel is adding tr() macros around the text in question, probably with a context parameter (there could also be a global per-file context to save typing time).
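The userspace half of such a scheme could be as simple as a catalogue lookup keyed on the English text. The sketch below does not use the real gettext API; tr_lookup() and the catalogue are hypothetical, and the single German entry is invented for illustration.

```c
#include <stddef.h>
#include <string.h>

struct translation {
    const char *source;   /* English text as emitted by the kernel */
    const char *target;   /* translated text */
};

/* Illustrative catalogue; a real one would be loaded from a file. */
static const struct translation catalogue[] = {
    { "Link is down", "Verbindung ist unterbrochen" },
};

/* Return the translation of source, or source itself if none is known. */
const char *tr_lookup(const char *source)
{
    size_t i;

    for (i = 0; i < sizeof catalogue / sizeof catalogue[0]; i++) {
        if (strcmp(catalogue[i].source, source) == 0)
            return catalogue[i].target;
    }
    return source;
}
```

Falling back to the original text means an untranslated or reworded message still reaches the log, just in English.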

Getting the message from the kernel

Posted Jun 24, 2007 21:20 UTC (Sun) by k8to (guest, #15413) [Link]

Sure, and I think that is what is being considered, but it won't work unless the "source" messages are stable enough to look them up. Thus numbers are discussed because there is a problem with attempting to enforce message text stability.

Getting the message from the kernel

Posted Jun 21, 2007 11:31 UTC (Thu) by buendgen (subscriber, #35298) [Link]

There is actually one more reason why customers like messages with committed semantics and an ID:
Some want to automate reactions to certain events, thus resolving problems before they turn fatal to the system.

BTW, Michael, the submitter of "A new approach", is not from Japan.
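To make the automation point concrete, here is a hedged sketch of how a monitoring tool might pick a "component.number:" tag off the front of a log line before deciding how to react. The tag format follows the article's "railgun.100:" example; parse_msg_id() and its signature are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Parse a leading "component.number:" tag from a log line.
 * On success, copies the component name into component (at most len bytes
 * including the terminator), stores the number, and returns 0.
 * Returns -1 if the line does not start with such a tag.
 */
int parse_msg_id(const char *line, char *component, size_t len, long *number)
{
    const char *dot = strchr(line, '.');
    char *end;
    size_t n;

    if (dot == NULL || dot == line)
        return -1;                      /* no component name */
    n = (size_t)(dot - line);
    if (n >= len)
        return -1;                      /* name does not fit */

    *number = strtol(dot + 1, &end, 10);
    if (end == dot + 1 || *end != ':')
        return -1;                      /* no digits, or no closing colon */

    memcpy(component, line, n);
    component[n] = '\0';
    return 0;
}
```

A tool could then switch on the component and number to trigger a specific response, and would keep working even if the human-readable text after the tag were reworded or translated.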

Getting the message from the kernel

Posted Jun 22, 2007 11:04 UTC (Fri) by AndyBurns (guest, #27521) [Link]

VMS %FACILITY-SEVERITY-IDENT-TEXT anyone?

Getting the message from the kernel

Posted Jul 3, 2007 17:45 UTC (Tue) by Blaisorblade (guest, #25465) [Link]

I was thinking of an entirely different approach altogether: communicating important messages via hal and dbus.

Most printk calls are for debugging purposes (for instance):

[22156.206781] sd 6:0:0:0: Attached scsi removable disk sdb
[22156.207067] Device driver target6:0:1 lacks bus and class support for being resumed.
[...repeated for all partitions...]
[22156.215831] usb-storage: device scan complete
[22261.183379] usb 1-3: USB disconnect, address 8

"Interface is up / down" messages are already reported to the user via tons of graphical applets.

What is important, and is not reported, are things like I/O errors on internal or external media. On Windows, such messages pop up from the systray. The same should happen on a Linux desktop.

Such messages would be just a selection, and creating unique IDs for them (and adding the message in userspace) would be a task for userspace developers at that point.
