LKML Archive on lore.kernel.org
From: Ingo Molnar <mingo@elte.hu>
To: "Ahmed S. Darwish" <darwish.07@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Ingo Molnar" <mingo@redhat.com>, X86-ML <x86@kernel.org>,
	"Tony Luck" <tony.luck@intel.com>,
	"Dave Jones" <davej@redhat.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Randy Dunlap" <rdunlap@xenotime.net>,
	"Willy Tarreau" <wtarreau@hera.kernel.org>,
	"Willy Tarreau" <w@1wt.eu>,
	"Dirk Hohndel" <hohndel@infradead.org>,
	Dirk.Hohndel@intel.com, IDE-ML <linux-ide@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	"Linus Torvalds" <torvalds@linux-foundation.org>,
	"Peter Zijlstra" <a.p.zijlstra@chello.nl>,
	"Frédéric Weisbecker" <fweisbec@gmail.com>,
	"Borislav Petkov" <bp@alien8.de>,
	"Arjan van de Ven" <arjan@infradead.org>
Subject: Re: [PATCH 0/2][concept RFC] x86: BIOS-save kernel log to disk upon panic
Date: Tue, 25 Jan 2011 15:09:48 +0100
Message-ID: <20110125140948.GA26762@elte.hu>
In-Reply-To: <20110125134748.GA10051@laptop>


* Ahmed S. Darwish <darwish.07@gmail.com> wrote:

> Hi,
> 
> I've faced some very early panics in the latest kernel. Being a run-of-the-mill
> x86 laptop, the machine is devoid of debugging aids like serial ports or
> network boot.
> 
> As a possible solution, the patches below prototype the idea of persistently
> storing the kernel log ring to a hard disk partition using the enhanced BIOS
> INT 0x13 services.
> 
> The used BIOS INT 0x13 functions are the same ones originally used by all
> contemporary bootloaders to load the Linux kernel. If the kernel code is
> already loaded to RAM and being executed, such parts of the BIOS should be
> stable enough.
> 
> The basic idea is to switch from 64-bit long mode all the way down to 16-bit
> real-mode. Once in real-mode, we reset the disk controller and write the log
> buffer to disk using a user-supplied absolute disk block address (LBA).
> 
> Doing so, we can capture very early panics (along with earlier log messages)
> reliably since the writing mechanism has minimal dependency on any Linux code.
> 
> Unfortunately, there are problems on some machines.
> 
> On my laptop, when calling the BIOS with the "Reset Disk Controllers" command
> or even issuing a direct "Extended Write" without a controller reset, the BIOS
> hangs for around __5 minutes__. Afterwards, it returns with a 'Timeout' error
> code.
> 
> The main problem, it seems, is that the BIOS "Reset controller" command is not
> enough to restore disk hardware to a state understandable by the BIOS code.
> 
> So:
> 
>  - Is it possible to re-initialize the disk hardware to its POST state (thus
>    make the BIOS services work reliably) while keeping system RAM unmodified?
>  - If not, can we do it manually by reprogramming the controllers?
> 
> The first patch (#1) implements the longMode -> realMode switch and invokes
> the BIOS. The second reserves needed low-memory areas for such code and
> registers a panic logger using the kmsg_dump interface.
> 
> Both patches are based on '-next' and include XXX marks where further help is
> also appreciated. Please remember that these patches, while tested, are for now
> only meant to prototype the technical feasibility of the idea.
> 
> Diffstat:
> 
>  arch/x86/kernel/saveoops-rmode.S |  483 ++++++++++++++++++++++++++++++++++++++
>  arch/x86/include/asm/saveoops.h  |   15 ++
>  arch/x86/kernel/saveoops.c       |  219 +++++++++++++++++
>  arch/x86/kernel/setup.c          |    9 +
>  arch/x86/kernel/Makefile         |    3 +
>  lib/Kconfig.debug                |   15 ++
>  6 files changed, 744 insertions(+), 0 deletions(-)

Ok, I have to admit that while I'm a rabid BIOS-hater I find this debug feature very
interesting, for the plain reason that if it's implemented in a robust and
clever way then it has a chance to improve the debuggability of pretty much any Linux
laptop quite enormously!

While we generally thoroughly hate BIOSes from beginning to end, one thing can be
said in their favor: a BIOS bootstraps very early during bootup, and it's relatively
simple to trigger as well.

Also, since recent kernels no longer stomp on BIOS data structures (low RAM),
there's a good chance the BIOS is still functional at the point of the crash - be
that an early crash or a later crash.

I think the biggest areas of practical concern would be:

 - Can this mechanism ever, under any circumstance, corrupt any real data, destroy
   the MBR or do other nasties? Can you think of any additional fail-safe measures
   where you could _further robustify the BIOS calls_ to make sure it can never go
   to the wrong sector(s)? I really do not want to think of trusting a BIOS to
   _write to my disk_.

 - Is there some hidden disk area somewhere on PCs, or somewhere on a real partition
   on typical Linux distributions, which we could use without having to reinstall
   the box? This would increase utility and availability enormously. I'm thinking of
   partition _ends_ which sometimes get rounded in an awkward way and which are
   potentially skipped by most Linux filesystems. Even a very small area of 512
   bytes would be extremely useful for debugging weird suspend hangs ...

 - Could we automate the recovery of the dump, and just put it into the regular 
   kernel log on the next (successful) bootup (on a feature-enabled kernel)? That 
   would make the log of the 'previous crash' very conveniently available in dmesg 
   and the syslog. Tools like kerneloops could make use of it immediately.
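
To make the first and third points a bit more concrete, here is a rough userspace C
sketch, not taken from the patches. It assumes the dump sector was stamped once with
a signature at install time, that the panic path first reads the target sector back
and refuses to write unless that signature is present, and that the next boot only
accepts a dump whose checksum matches. All names (DUMP_MAGIC, struct dump_header,
format_sector(), pack_dump(), unpack_dump()) are hypothetical, and the actual BIOS
read/write of the 512-byte buffer is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512u
#define DUMP_MAGIC  0x4f4f5053u	/* hypothetical signature, not from the patches */

/* Hypothetical header at the start of the reserved sector (host byte order). */
struct dump_header {
	uint32_t magic;		/* marks the sector as ours */
	uint32_t len;		/* bytes of log text after the header */
	uint32_t checksum;	/* checksum over those bytes */
};

#define DUMP_PAYLOAD_MAX (SECTOR_SIZE - sizeof(struct dump_header))

/* Simple multiplicative checksum; a real implementation might prefer a CRC. */
static uint32_t log_checksum(const uint8_t *p, size_t n)
{
	uint32_t sum = 0;

	while (n--)
		sum = sum * 31 + *p++;
	return sum;
}

/* One-time setup: stamp an empty, valid dump into the sector image. */
static void format_sector(uint8_t sector[SECTOR_SIZE])
{
	struct dump_header h = { DUMP_MAGIC, 0, 0 };

	memset(sector, 0, SECTOR_SIZE);
	memcpy(sector, &h, sizeof(h));
}

/* Fail-safe: the sector as read back from disk must already carry our magic. */
static int sector_is_ours(const uint8_t sector[SECTOR_SIZE])
{
	struct dump_header h;

	memcpy(&h, sector, sizeof(h));
	return h.magic == DUMP_MAGIC;
}

/* Panic side: build the new sector image, or return -1 and touch nothing. */
static int pack_dump(uint8_t sector[SECTOR_SIZE], const char *log, size_t len)
{
	struct dump_header h;

	if (!sector_is_ours(sector))
		return -1;
	if (len > DUMP_PAYLOAD_MAX) {
		/* keep the tail: the most recent messages matter most */
		log += len - DUMP_PAYLOAD_MAX;
		len = DUMP_PAYLOAD_MAX;
	}
	assert(len <= DUMP_PAYLOAD_MAX);

	h.magic = DUMP_MAGIC;
	h.len = (uint32_t)len;
	h.checksum = log_checksum((const uint8_t *)log, len);
	memset(sector, 0, SECTOR_SIZE);
	memcpy(sector, &h, sizeof(h));
	memcpy(sector + sizeof(h), log, len);
	return 0;
}

/* Next-boot side: return the log length, or -1 if the dump is not trustworthy. */
static long unpack_dump(const uint8_t sector[SECTOR_SIZE], char *out, size_t out_size)
{
	struct dump_header h;

	memcpy(&h, sector, sizeof(h));
	if (h.magic != DUMP_MAGIC || h.len > DUMP_PAYLOAD_MAX || h.len > out_size)
		return -1;
	if (log_checksum(sector + sizeof(h), h.len) != h.checksum)
		return -1;
	memcpy(out, sector + sizeof(h), h.len);
	return (long)h.len;
}
```

The idea would be that pack_dump() runs on the buffer just read from the target LBA,
so a mistyped LBA pointing at the MBR or at filesystem data gets rejected before any
write is issued, and that unpack_dump() runs on the next successful boot before the
text is fed into the regular kernel log. This only guards against a wrong target
address; it cannot protect against a BIOS that writes somewhere other than where it
was told to.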

All in all, a very intriguing idea IMO, and the hardest bits (the lowlevel x86
transition) are all implemented already.

Thanks,

	Ingo

Thread overview: 35+ messages
2011-01-25 13:47 Ahmed S. Darwish
2011-01-25 13:51 ` [PATCH -next 1/2][RFC] x86: Saveoops: Switch to real-mode and call BIOS Ahmed S. Darwish
2011-01-25 17:26   ` H. Peter Anvin
2011-01-25 13:53 ` [PATCH -next 2/2][RFC] x86: Saveoops: Reserve low memory and register code Ahmed S. Darwish
2011-01-25 17:29   ` H. Peter Anvin
2011-01-26  9:04     ` Ahmed S. Darwish
2011-01-25 14:09 ` Ingo Molnar [this message]
2011-01-25 15:08   ` [PATCH 0/2][concept RFC] x86: BIOS-save kernel log to disk upon panic Tejun Heo
2011-01-25 17:33     ` H. Peter Anvin
2011-01-26 11:44       ` Ahmed S. Darwish
2011-02-03 14:36     ` Pavel Machek
2011-02-03 15:28       ` H. Peter Anvin
2011-02-03 17:57         ` Ingo Molnar
2011-02-03 21:07           ` H. Peter Anvin
2011-01-25 15:36   ` Ahmed S. Darwish
2011-01-25 16:02     ` James Bottomley
2011-01-25 17:05       ` Ahmed S. Darwish
2011-01-25 17:20         ` James Bottomley
2011-01-25 22:10         ` Mark Lord
2011-01-25 22:16           ` Randy Dunlap
2011-01-25 22:45             ` Jeff Garzik
2011-01-25 22:58               ` H. Peter Anvin
2011-01-26  0:26                 ` Jeff Garzik
2011-01-31  2:59                 ` Rusty Russell
2011-01-31 10:45                   ` Ingo Molnar
2011-01-25 17:32     ` Tony Luck
2011-01-25 17:36       ` H. Peter Anvin
2011-01-25 19:04       ` Jeff Garzik
2011-01-25 14:49 ` Tejun Heo
2011-01-28  7:59   ` Jan Ceuleers
2011-01-25 20:25 ` Linus Torvalds
     [not found]   ` <20110126124954.GC24527@laptop>
2011-01-26 23:07     ` Luck, Tony
     [not found]       ` <20110126231620.GA14807@redhat.com>
     [not found]         ` <987664A83D2D224EAE907B061CE93D53019438EB02@orsmsx505.amr.corp.intel.com>
     [not found]           ` <20110126233033.GB14807@redhat.com>
     [not found]             ` <987664A83D2D224EAE907B061CE93D53019438EBB6@orsmsx505.amr.corp.intel.com>
     [not found]               ` <4D40F7F1.3020509@zytor.com>
     [not found]                 ` <20110127120039.GD20279@elte.hu>
2011-01-27 18:35                   ` Luck, Tony
     [not found]                   ` <4D4197CB.9070201@zytor.com>
     [not found]                     ` <20110127162429.GB26437@elte.hu>
2011-01-27 18:56                       ` Luck, Tony
     [not found]     ` <20110127021338.GA20334@redhat.com>
     [not found]       ` <4D40F81E.1030009@zytor.com>
     [not found]         ` <20110127052639.GA16289@laptop>
     [not found]           ` <m1sjweyeax.fsf@fess.ebiederm.org>
2011-02-02 11:13             ` Ahmed S. Darwish
