LKML Archive on lore.kernel.org
* [PATCH 0/2][concept RFC] x86: BIOS-save kernel log to disk upon panic
@ 2011-01-25 13:47 Ahmed S. Darwish
From: Ahmed S. Darwish @ 2011-01-25 13:47 UTC
  To: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, X86-ML
  Cc: Tony Luck, Dave Jones, Andrew Morton, Randy Dunlap,
	Willy Tarreau, Willy Tarreau, Dirk Hohndel, Dirk.Hohndel, IDE-ML,
	LKML

Hi,

I've hit some very early panics in the latest kernel. The machine is a
run-of-the-mill x86 laptop, so it lacks debugging aids like serial ports or
network boot.

As a possible solution, the patches below prototype the idea of persistently
storing the kernel log ring buffer to a hard disk partition using the enhanced
BIOS INT 0x13 services.

The BIOS INT 0x13 functions used here are the same ones that contemporary
bootloaders use to load the Linux kernel. If the kernel code has already been
loaded into RAM and is executing, those parts of the BIOS should be stable
enough.

The basic idea is to switch from 64-bit long mode all the way down to 16-bit
real-mode. Once in real-mode, we reset the disk controller and write the log
buffer to disk using a user-supplied absolute disk block address (LBA).
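
For reference, the enhanced INT 0x13 write (AH=0x43) takes its parameters from
a 16-byte Disk Address Packet that DS:SI points to. Below is a minimal C sketch
of that packet, assuming the standard EDD layout. The helper make_dap() and its
argument names are hypothetical and are not taken from the patches. The packed
attribute is GCC/Clang syntax.

```c
#include <assert.h>
#include <stdint.h>

/* Disk Address Packet consumed by INT 0x13, AH=0x43 (Extended Write). */
struct disk_address_packet {
	uint8_t  size;        /* packet size in bytes: 0x10 */
	uint8_t  reserved;    /* must be zero */
	uint16_t sectors;     /* number of sectors to transfer */
	uint16_t buf_offset;  /* real-mode offset of the data buffer */
	uint16_t buf_segment; /* real-mode segment of the data buffer */
	uint64_t lba;         /* absolute starting block address */
} __attribute__((packed));

_Static_assert(sizeof(struct disk_address_packet) == 16,
	       "the BIOS expects a 16-byte packet");

/*
 * Hypothetical helper: describe a write of `sectors` 512-byte sectors from
 * the buffer at physical address `phys` to disk block `lba`.
 */
static struct disk_address_packet make_dap(uint32_t phys, uint16_t sectors,
					    uint64_t lba)
{
	struct disk_address_packet dap;

	/* Real-mode code can only address a buffer below 1 MiB ... */
	assert(phys < 0x100000u);
	/* ... and the whole transfer must fit inside one 64 KiB segment. */
	assert((phys & 0xFu) + (uint32_t)sectors * 512u <= 0x10000u);

	dap.size        = sizeof(dap);
	dap.reserved    = 0;
	dap.sectors     = sectors;
	dap.buf_segment = (uint16_t)(phys >> 4);  /* segment * 16 + offset */
	dap.buf_offset  = (uint16_t)(phys & 0xFu);
	dap.lba         = lba;
	return dap;
}
```

Normalizing the address so that the offset stays below 16 leaves nearly the
full 64 KiB segment available for the transfer. This is also why the log
buffer has to be copied into reserved low memory before dropping to real mode.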

Doing so, we can capture very early panics (along with earlier log messages)
reliably since the writing mechanism has minimal dependency on any Linux code.

Unfortunately, there are problems on some machines.

On my laptop, when calling the BIOS with the "Reset Disk Controllers" command
or even issuing a direct "Extended Write" without a controller reset, the BIOS
hangs for around __5 minutes__. Afterwards, it returns with a 'Timeout' error
code.

The main problem, it seems, is that the BIOS "Reset controller" command is not
enough to restore disk hardware to a state understandable by the BIOS code.

So:

 - Is it possible to re-initialize the disk hardware to its POST state (thus
   making the BIOS services work reliably) while keeping system RAM unmodified?
 - If not, can we do it manually by reprogramming the controllers?

The first patch (#1) implements the long mode -> real mode switch and invokes
the BIOS. The second reserves the low-memory areas that code needs and
registers a panic logger using the kmsg_dump interface.
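
To illustrate what such a panic logger has to do before calling into real
mode, here is a user-space sketch. It assumes the log arrives as two segments
(older part first), as a wrapped ring buffer would, and that it must be copied
into a fixed low-memory buffer in whole 512-byte sectors. The function name
flatten_log() and the drop-oldest policy are assumptions made for
illustration, not the code from patch #2.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE 512u

/*
 * Hypothetical helper: copy the two log segments into `out` in chronological
 * order, zero-pad the tail to a sector boundary and return the number of
 * sectors to write. If the log does not fit, the oldest bytes are dropped.
 * Both segment pointers must be valid even when their length is zero.
 */
static size_t flatten_log(const char *s1, size_t l1,
			  const char *s2, size_t l2,
			  char *out, size_t out_size)
{
	size_t cap, total, skip, padded, n = 0;

	assert(out != NULL && out_size >= SECTOR_SIZE);

	cap   = out_size - out_size % SECTOR_SIZE; /* whole sectors only */
	total = l1 + l2;
	skip  = total > cap ? total - cap : 0;     /* oldest bytes to drop */

	if (skip < l1) {
		memcpy(out, s1 + skip, l1 - skip);
		n = l1 - skip;
		skip = 0;
	} else {
		skip -= l1;                        /* first segment dropped */
	}
	memcpy(out + n, s2 + skip, l2 - skip);
	n += l2 - skip;

	padded = (n + SECTOR_SIZE - 1) / SECTOR_SIZE * SECTOR_SIZE;
	memset(out + n, 0, padded - n);
	return padded / SECTOR_SIZE;
}
```

The returned sector count is what would go into the BIOS write request.
Dropping the oldest bytes keeps the messages closest to the panic, which are
usually the ones worth saving.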

Both patches are against '-next' and include XXX marks where further help
would be appreciated. Please remember that these patches, while tested, are
for now only a prototype of the technical feasibility of the idea.

Diffstat:

 arch/x86/kernel/saveoops-rmode.S |  483 ++++++++++++++++++++++++++++++++++++++
 arch/x86/include/asm/saveoops.h  |   15 ++
 arch/x86/kernel/saveoops.c       |  219 +++++++++++++++++
 arch/x86/kernel/setup.c          |    9 +
 arch/x86/kernel/Makefile         |    3 +
 lib/Kconfig.debug                |   15 ++
 6 files changed, 744 insertions(+), 0 deletions(-)

Related work and discussions:

 - Tony Luck, persistent store: http://article.gmane.org/gmane.linux.kernel.cross-arch/8495
 - Dirk Hohndel, hpa, Japan Symposium, 2D barcode: http://video.linux.com/video/1661
 - akpm, Dave Jones, oops pauser: http://article.gmane.org/gmane.linux.kernel/369739
 - Willy Tarreau, Randy Dunlap, kmsgdump: http://www.xenotime.net/linux/kmsgdump/

Thanks,

--
Darwish
http://darwish.07.googlepages.com



end of thread, other threads:[~2011-02-03 21:10 UTC | newest]

Thread overview: 35+ messages
2011-01-25 13:47 [PATCH 0/2][concept RFC] x86: BIOS-save kernel log to disk upon panic Ahmed S. Darwish
2011-01-25 13:51 ` [PATCH -next 1/2][RFC] x86: Saveoops: Switch to real-mode and call BIOS Ahmed S. Darwish
2011-01-25 17:26   ` H. Peter Anvin
2011-01-25 13:53 ` [PATCH -next 2/2][RFC] x86: Saveoops: Reserve low memory and register code Ahmed S. Darwish
2011-01-25 17:29   ` H. Peter Anvin
2011-01-26  9:04     ` Ahmed S. Darwish
2011-01-25 14:09 ` [PATCH 0/2][concept RFC] x86: BIOS-save kernel log to disk upon panic Ingo Molnar
2011-01-25 15:08   ` Tejun Heo
2011-01-25 17:33     ` H. Peter Anvin
2011-01-26 11:44       ` Ahmed S. Darwish
2011-02-03 14:36     ` Pavel Machek
2011-02-03 15:28       ` H. Peter Anvin
2011-02-03 17:57         ` Ingo Molnar
2011-02-03 21:07           ` H. Peter Anvin
2011-01-25 15:36   ` Ahmed S. Darwish
2011-01-25 16:02     ` James Bottomley
2011-01-25 17:05       ` Ahmed S. Darwish
2011-01-25 17:20         ` James Bottomley
2011-01-25 22:10         ` Mark Lord
2011-01-25 22:16           ` Randy Dunlap
2011-01-25 22:45             ` Jeff Garzik
2011-01-25 22:58               ` H. Peter Anvin
2011-01-26  0:26                 ` Jeff Garzik
2011-01-31  2:59                 ` Rusty Russell
2011-01-31 10:45                   ` Ingo Molnar
2011-01-25 17:32     ` Tony Luck
2011-01-25 17:36       ` H. Peter Anvin
2011-01-25 19:04       ` Jeff Garzik
2011-01-25 14:49 ` Tejun Heo
2011-01-28  7:59   ` Jan Ceuleers
2011-01-25 20:25 ` Linus Torvalds
     [not found]   ` <20110126124954.GC24527@laptop>
2011-01-26 23:07     ` Luck, Tony
     [not found]       ` <20110126231620.GA14807@redhat.com>
     [not found]         ` <987664A83D2D224EAE907B061CE93D53019438EB02@orsmsx505.amr.corp.intel.com>
     [not found]           ` <20110126233033.GB14807@redhat.com>
     [not found]             ` <987664A83D2D224EAE907B061CE93D53019438EBB6@orsmsx505.amr.corp.intel.com>
     [not found]               ` <4D40F7F1.3020509@zytor.com>
     [not found]                 ` <20110127120039.GD20279@elte.hu>
2011-01-27 18:35                   ` Luck, Tony
     [not found]                   ` <4D4197CB.9070201@zytor.com>
     [not found]                     ` <20110127162429.GB26437@elte.hu>
2011-01-27 18:56                       ` Luck, Tony
     [not found]     ` <20110127021338.GA20334@redhat.com>
     [not found]       ` <4D40F81E.1030009@zytor.com>
     [not found]         ` <20110127052639.GA16289@laptop>
     [not found]           ` <m1sjweyeax.fsf@fess.ebiederm.org>
2011-02-02 11:13             ` Ahmed S. Darwish
