LKML Archive on
help / color / mirror / Atom feed
From: Ingo Molnar <>
To: Greg KH <>
Cc:, Yinghai Lu <>,
	Jacob Shin <>,
	Linus Torvalds <>,
	Alexander Viro <>
Subject: Re: [PATCH 2/5] x86: fix runtime error in arch/x86/kernel/cpu/mcheck/mce_amd_64.c
Date: Mon, 28 Jan 2008 20:01:49 +0100	[thread overview]
Message-ID: <> (raw)
In-Reply-To: <>

* Greg KH <> wrote:

> > Are kobjects protected against accidental copying? If not add &kobj 
> > to the 'magic value' too, and check that - it becomes 
> > copying-resistent that way and has the same cost to check. (which is 
> > negligible anyway)
> Oh, that's a very cool idea, I like it :)

hey, you are welcome :-)

[ I guess i should not mention that i've implemented list debugging for 
  Linux that checksums the struct list contents and stores the checksum 
  in it (offset by a magic value plus to address of the list head), and 
  thus protects it against accidental corruption? It was capable of 
  reliably detecting mixed up list_add() arguments for example, it 
  detected list corruption of _every_ sort, it detected double
  list_del() and list_add() of an already active list member as well. It
  was even capable of detecting SMP races: two parallel unserialized
  list_del()'s on the same list head were detected and warned about as 
  well. I guess i should release it one of these days? =B-) ]

> So I guess the complaint is that CONFIG_KOBJECT is too verbose?  I can 
> understand that, I'll look to clean things up as we have a lot of 
> "tracing" type functions in the debug output.  Hm, I wonder if the new 
> marker code could help with that...

that would certainly be helpful and nice as a second line of defense. 
(as a second line of defense the DEBUG_KOBJECT stuff was pretty 
effective as well, as you have shown it with this bug.)

but the "first line of defense" must be an as automatically and as 
transparently fault-resilient data structure as humanly possible.

"please enable DEBUG_KOBJECT and send me the logs" reduces the active 
tester base to somewhere around 0.1% of the full tester base - and if 
the bug is in any way spurious, the tester base goes down to 0.001% or 
down to outright zero.

i really think that in terms of central kernel data structures, no 
amount of debugging infrastructure is "too much" (within taste limits) - 
as long as it's cleanly structured. Even 10,000 lines of debugging code 
dwarves the 9,000,000 lines of code that it ultimately covers.

> > For example locks currently have 4000 lines of code of debugging 
> > infrastructure and 1500 lines of code of self-tests. The total 
> > amount of core locking code is less 1000 lines of code. So for every 
> > line of locking code there's more than 5 lines of debugging 
> > infrastructure (!).
> Heh, good point, I'll look into fixing this up to be more useful to 
> the "normal" developer, and not just the two of us doing kobject core 
> development :)

you are right, that is another important argument: meaningful debugging 
checks that catch difficult to debug problems speed up development and 
allows one to iterate the codebase faster. As long as the debugging 
checks are transparent (i.e. they have no outrageous false positive rate 
and they do not spam the logs), distros and testers will happily enable 
it and you'll get much more feedback than you ever imagined would be 

Time is precious, even if you know what you are doing, so the more of 
the debugging you are capable of pushing to the computer, the better. 
Even if some debugging levels are as outrageously expensive as "iterate 
the linear list of all kobjects in the system, for every kobject op", 
people will still be happy to turn it on, if it catches bugs. Attach a 
nice WARN_ON_ONCE() to it and will pick it up and 
categorize it for you to check. And i think that's an important effort 
for core subsystem maintainers: for example i spent far more time 
writing debugging code for mutexes and other locks than i spent time 
writing the facilities themselves. Preventing a certain category of bugs 
is far more important to users than shaving off 1 cycle from a fastpath.


  reply	other threads:[~2008-01-28 19:02 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-01-27 23:37 [GIT PATCH] driver core fixes against 2.6.24 Greg KH
2008-01-27 23:38 ` [PATCH 1/5] Driver core: Fix up build when CONFIG_BLOCK=N Greg Kroah-Hartman
2008-01-27 23:38 ` [PATCH 2/5] x86: fix runtime error in arch/x86/kernel/cpu/mcheck/mce_amd_64.c Greg Kroah-Hartman
2008-01-28 12:24   ` Ingo Molnar
2008-01-28 17:37     ` Greg KH
2008-01-28 17:57       ` Ingo Molnar
2008-01-28 18:32         ` Greg KH
2008-01-28 19:01           ` Ingo Molnar [this message]
2008-01-28 19:29             ` Cyrill Gorcunov
2008-01-28 19:42             ` Cyrill Gorcunov
2008-01-27 23:38 ` [PATCH 3/5] Module: check to see if we have a built in module with the same name Greg Kroah-Hartman
2008-01-28 23:54   ` Jan Engelhardt
2008-01-29  6:20   ` Rusty Russell
2008-01-27 23:38 ` [PATCH 4/5] Driver core: add bus_find_device_by_name function Greg Kroah-Hartman
2008-01-27 23:38 ` [PATCH 5/5] PPC: Fix powerpc vio_find_name to not use devices_subsys Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \ \ \ \
    --subject='Re: [PATCH 2/5] x86: fix runtime error in arch/x86/kernel/cpu/mcheck/mce_amd_64.c' \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).