LKML Archive on lore.kernel.org
* DRM-based Oops viewer
@ 2019-03-10  1:31 Ahmed S. Darwish
  2019-03-10  8:44 ` Martin Steigerwald
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Ahmed S. Darwish @ 2019-03-10  1:31 UTC (permalink / raw)
  To: David Airlie, Daniel Vetter, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, Alex Deucher, Christian König, David Zhou,
	Ard Biesheuvel, Matt Fleming
  Cc: Linus Torvalds, Greg Kroah-Hartman, John Ogness, dri-devel, linux-kernel

Hello DRM/UEFI maintainers,

Several years ago, I wrote a set of patches to dump the kernel
log to disk upon panic -- through BIOS INT 0x13 services. [1]

The overwhelming response was that it's unsafe to do this in a
generic manner. Linus proposed a video-based viewer instead: [2]

    If you want to do the BIOS services thing, do it for video: copy the
    oops to low RAM, return to real mode, re-run the graphics card POST
    routines to initialize text-mode, and use the BIOS to print out the
    oops.  That is WAY less scary than writing to disk.

Of course, it's 2019 now, and it's well known that Intel is
officially obsoleting the PC/AT BIOS by 2020. [3]

While researching whether this can be done from UEFI, it also
became clear that UEFI "Runtime Services" do not provide any
re-initialization routines. [4]

The most that UEFI can provide is a GOP-provided framebuffer
that's ready for the OS to use -- even after the UEFI boot phase
is marked as done through ExitBootServices(). [5]

Of course, once native drivers like i915 or radeon take over,
such a framebuffer is toast... [6]

Thus, a possible remaining option is to display the oops through
"minimal" DRM drivers provided for each HW variant. Since these
special drivers will run only, and entirely, in a panic()
context, several constraints exist:

  - The code should be fully synchronous (irqs are disabled)
  - It should not allocate any dynamic memory
  - It should make minimal assumptions about HW state
  - It should not chain into any other kernel subsystem
  - It has ample freedom to use delay-based loops and the
    like, since the kernel is already dead.

How feasible is it to have such a special "DRM viewoops"
framework + its minimal drivers in the kernel?

The target is to start from i915, since that's what's in my
laptop now, and work from there.

Some final notes:

  - The NT kernel has a similar concept, but for storage instead:
    "Miniport storage drivers", which are used to dump core in
    kernel panic() situations. [7]

  - Since Windows 7+, a very fancy Blue Screen of Death is
    displayed, with Unicode and whatnot, implying GPU drivers
    involvement. [8]

  - Mac OS X also does something similar [9]

  - On Linux laptops, the current situation is _really_ bad.

    In any graphical session, type "echo c > /proc/sysrq-trigger";
    the screen will just completely freeze...

    Desired first goal: just print the panic() log

Thanks a lot,

[1] https://lore.kernel.org/lkml/20110125134748.GA10051@laptop
[2] https://lore.kernel.org/lkml/AANLkTinU0KYiCd4p=z+=ojbkeEoT2G+CAYvdRU02KJEn@mail.gmail.com

[3] https://uefi.org/sites/default/files/resources/Brian_Richardson_Intel_Final.pdf

[4] UEFI v2.7 spec, Chapter 8, "Services — Runtime Services"
[5] UEFI v2.7 spec, Section 12.9, "Graphics Output Protocol"
    "The Graphics Output Protocol supports this capability by
     providing the EFI OS loader access to a hardware frame buffer
     and enough information to allow the OS to draw directly to
     the graphics output device."

[6] linux/drivers/gpu/drm/i915/i915_drv.c::i915_kick_out_firmware_fb()
    linux/drivers/gpu/drm/radeon/radeon_drv.c::radeon_pci_probe()

[7] https://docs.microsoft.com/en-us/windows-hardware/drivers/storage/restrictions-on-miniport-drivers-that-manage-the-boot-drive

[8] https://upload.wikimedia.org/wikipedia/commons/archive/5/56/20181019151937%21Bsodwindows10.png
[9] https://upload.wikimedia.org/wikipedia/commons/4/4a/Mac_OS_X_10.2_Kernel_Panic.jpg

--darwi
http://darwish.chasingpointers.com


* Re: DRM-based Oops viewer
  2019-03-10  1:31 DRM-based Oops viewer Ahmed S. Darwish
@ 2019-03-10  8:44 ` Martin Steigerwald
  2019-03-11  9:04 ` Jani Nikula
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Martin Steigerwald @ 2019-03-10  8:44 UTC (permalink / raw)
  To: Ahmed S. Darwish
  Cc: David Airlie, Daniel Vetter, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, Alex Deucher, Christian König, David Zhou,
	Ard Biesheuvel, Matt Fleming, Linus Torvalds, Greg Kroah-Hartman,
	John Ogness, dri-devel, linux-kernel

Hello Ahmed.

Ahmed S. Darwish - 10.03.19, 02:31:
> Hello DRM/UEFI maintainers,
> 
> Several years ago, I wrote a set of patches to dump the kernel
> log to disk upon panic -- through BIOS INT 0x13 services. [1]
> 
> The overwhelming response was that it's unsafe to do this in a
> generic manner. Linus proposed a video-based viewer instead: [2]
[…]
> Of course it's 2019 now though, and it's quite known that
> Intel is officially obsoleting the PC/AT BIOS by 2020.. [3]
[…]
> The maximum possible that UEFI can provide is a GOP-provided
> framebuffer that's ready to use by the OS -- even after the UEFI
> boot phase is marked as done through ExitBootServices(). [5]
> 
> Of course, once native drivers like i915 or radeon take over,
> such a framebuffer is toast... [6]
> 
> Thus a possible remaining option, is to display the oops through
> "minimal" DRM drivers provided for each HW variant... Since
> these special drivers will run only and fully under a panic()
> context though, several constraints exist:

Thank you for your idea and willingness to work on something like this.

As a user, I'd very much favor a solution that works not only with
UEFI but also with other firmware. I still dream of being able to buy
a laptop with up-to-date hardware and Coreboot/Libreboot someday.

While this would not solve all "I just freeze" kinds of crashes, it may
at least give some information about some of them. When testing rc
kernels, I often enough faced "I just freeze" crashes that happened
only *sometimes*. On a machine that I also use for production work, I
find it infeasible to debug them, as bisecting could take a long, long
time. And the machine could crash at any moment… even while I'm doing
important work on it.

In my ideal world an operating system would never ever crash or hang 
without telling why. Well it would not crash or hang at all… but there 
you go. Maybe some time with a widely usable micro kernel based OS that 
can restart device drivers in a broken state – at least almost. No 
discussion of that micro kernel topic required here. :)

Thanks,
-- 
Martin




* Re: DRM-based Oops viewer
  2019-03-10  1:31 DRM-based Oops viewer Ahmed S. Darwish
  2019-03-10  8:44 ` Martin Steigerwald
@ 2019-03-11  9:04 ` Jani Nikula
  2019-03-11 13:49   ` Daniel Vetter
  2019-03-11 22:12   ` Ahmed S. Darwish
  2019-03-11 12:10 ` Joonas Lahtinen
  2019-03-11 17:47 ` Noralf Trønnes
  3 siblings, 2 replies; 9+ messages in thread
From: Jani Nikula @ 2019-03-11  9:04 UTC (permalink / raw)
  To: Ahmed S. Darwish, David Airlie, Daniel Vetter, Joonas Lahtinen,
	Rodrigo Vivi, Alex Deucher, Christian König, David Zhou,
	Ard Biesheuvel, Matt Fleming
  Cc: Linus Torvalds, Greg Kroah-Hartman, John Ogness, dri-devel, linux-kernel

On Sun, 10 Mar 2019, "Ahmed S. Darwish" <darwish.07@gmail.com> wrote:
> Hello DRM/UEFI maintainers,
>
> Several years ago, I wrote a set of patches to dump the kernel
> log to disk upon panic -- through BIOS INT 0x13 services. [1]
>
> The overwhelming response was that it's unsafe to do this in a
> generic manner. Linus proposed a video-based viewer instead: [2]
>
>     If you want to do the BIOS services thing, do it for video: copy the
>     oops to low RAM, return to real mode, re-run the graphics card POST
>     routines to initialize text-mode, and use the BIOS to print out the
>     oops.  That is WAY less scary than writing to disk.
>
> Of course it's 2019 now though, and it's quite known that
> Intel is officially obsoleting the PC/AT BIOS by 2020.. [3]
>
> Researching whether this can be done from UEFI, it was also clear
> that UEFI "Runtime Services" do not provide any re-initialization
> routines. [4]
>
> The maximum possible that UEFI can provide is a GOP-provided
> framebuffer that's ready to use by the OS -- even after the UEFI
> boot phase is marked as done through ExitBootServices(). [5]
>
> Of course, once native drivers like i915 or radeon take over,
> such a framebuffer is toast... [6]
>
> Thus a possible remaining option, is to display the oops through
> "minimal" DRM drivers provided for each HW variant... Since
> these special drivers will run only and fully under a panic()
> context though, several constraints exist:
>
>   - The code should be fully synchronous (irqs are disabled)
>   - It should not allocate any dynamic memory
>   - It should make minimal assumptions about HW state
>   - It should not chain into any other kernel subsystem
>   - It has ample freedom to use delay-based loops and the
>     like, the kernel is already dead.
>
> How feasible is it to have such a special "DRM viewoops"
> framework + its minimal drivers in the kernel?

Please first better define what you want to achieve.

Do you want to store the dmesg or oops (like your original series
suggests) or do you want to display the oops? Do you want the facility
to be functioning at all times, or only when specifically requested in
advance by the user? If you want to display the oops, do you want it to
also work when the display is disabled at the time of the oops? What if
the display is attached to a port on a dock?

There's at least kdump, ramoops, and netconsole that can be used to
achieve some of what you want. How do they fall short for you?

BR,
Jani.



-- 
Jani Nikula, Intel Open Source Graphics Center


* Re: DRM-based Oops viewer
  2019-03-10  1:31 DRM-based Oops viewer Ahmed S. Darwish
  2019-03-10  8:44 ` Martin Steigerwald
  2019-03-11  9:04 ` Jani Nikula
@ 2019-03-11 12:10 ` Joonas Lahtinen
  2019-03-11 17:47 ` Noralf Trønnes
  3 siblings, 0 replies; 9+ messages in thread
From: Joonas Lahtinen @ 2019-03-11 12:10 UTC (permalink / raw)
  To: Ahmed S. Darwish, Alex Deucher, Ard Biesheuvel,
	Christian König, Daniel Vetter, David Airlie, David Zhou,
	Jani Nikula, Matt Fleming, Rodrigo Vivi
  Cc: Linus Torvalds, Greg Kroah-Hartman, John Ogness, dri-devel, linux-kernel

Quoting Ahmed S. Darwish (2019-03-10 03:31:42)
> Hello DRM/UEFI maintainers,
> 
> Several years ago, I wrote a set of patches to dump the kernel
> log to disk upon panic -- through BIOS INT 0x13 services. [1]
> 
> The overwhelming response was that it's unsafe to do this in a
> generic manner. Linus proposed a video-based viewer instead: [2]
> 
>     If you want to do the BIOS services thing, do it for video: copy the
>     oops to low RAM, return to real mode, re-run the graphics card POST
>     routines to initialize text-mode, and use the BIOS to print out the
>     oops.  That is WAY less scary than writing to disk.
> 
> Of course it's 2019 now though, and it's quite known that
> Intel is officially obsoleting the PC/AT BIOS by 2020.. [3]
> 
> Researching whether this can be done from UEFI, it was also clear
> that UEFI "Runtime Services" do not provide any re-initialization
> routines. [4]
> 
> The maximum possible that UEFI can provide is a GOP-provided
> framebuffer that's ready to use by the OS -- even after the UEFI
> boot phase is marked as done through ExitBootServices(). [5]
> 
> Of course, once native drivers like i915 or radeon take over,
> such a framebuffer is toast... [6]
> 
> Thus a possible remaining option, is to display the oops through
> "minimal" DRM drivers provided for each HW variant... Since
> these special drivers will run only and fully under a panic()
> context though, several constraints exist:
> 
>   - The code should be fully synchronous (irqs are disabled)
>   - It should not allocate any dynamic memory
>   - It should make minimal assumptions about HW state
>   - It should not chain into any other kernel subsystem
>   - It has ample freedom to use delay-based loops and the
>     like, the kernel is already dead.
> 
> How feasible is it to have such a special "DRM viewoops"
> framework + its minimal drivers in the kernel?

There is already (efi-)pstore, and I believe it could already be
integrated into distros so that the captured error is displayed on
the next boot.

So the user experience difference would only be seeing the error
immediately vs. on boot?

For that gain, it might be a bit too much work to create a failsafe
mode for each driver, but others may disagree.

Regards, Joonas



* Re: DRM-based Oops viewer
  2019-03-11  9:04 ` Jani Nikula
@ 2019-03-11 13:49   ` Daniel Vetter
  2019-03-11 23:39     ` Ahmed S. Darwish
  2019-03-11 22:12   ` Ahmed S. Darwish
  1 sibling, 1 reply; 9+ messages in thread
From: Daniel Vetter @ 2019-03-11 13:49 UTC (permalink / raw)
  To: Jani Nikula
  Cc: Ahmed S. Darwish, David Airlie, Daniel Vetter, Joonas Lahtinen,
	Rodrigo Vivi, Alex Deucher, Christian König, David Zhou,
	Ard Biesheuvel, Matt Fleming, Linus Torvalds, Greg Kroah-Hartman,
	John Ogness, dri-devel, linux-kernel

On Mon, Mar 11, 2019 at 11:04:19AM +0200, Jani Nikula wrote:
> On Sun, 10 Mar 2019, "Ahmed S. Darwish" <darwish.07@gmail.com> wrote:
> > Hello DRM/UEFI maintainers,
> >
> > Several years ago, I wrote a set of patches to dump the kernel
> > log to disk upon panic -- through BIOS INT 0x13 services. [1]
> >
> > The overwhelming response was that it's unsafe to do this in a
> > generic manner. Linus proposed a video-based viewer instead: [2]
> >
> >     If you want to do the BIOS services thing, do it for video: copy the
> >     oops to low RAM, return to real mode, re-run the graphics card POST
> >     routines to initialize text-mode, and use the BIOS to print out the
> >     oops.  That is WAY less scary than writing to disk.
> >
> > Of course it's 2019 now though, and it's quite known that
> > Intel is officially obsoleting the PC/AT BIOS by 2020.. [3]
> >
> > Researching whether this can be done from UEFI, it was also clear
> > that UEFI "Runtime Services" do not provide any re-initialization
> > routines. [4]
> >
> > The maximum possible that UEFI can provide is a GOP-provided
> > framebuffer that's ready to use by the OS -- even after the UEFI
> > boot phase is marked as done through ExitBootServices(). [5]
> >
> > Of course, once native drivers like i915 or radeon take over,
> > such a framebuffer is toast... [6]
> >
> > Thus a possible remaining option, is to display the oops through
> > "minimal" DRM drivers provided for each HW variant... Since
> > these special drivers will run only and fully under a panic()
> > context though, several constraints exist:
> >
> >   - The code should be fully synchronous (irqs are disabled)
> >   - It should not allocate any dynamic memory
> >   - It should make minimal assumptions about HW state
> >   - It should not chain into any other kernel subsystem
> >   - It has ample freedom to use delay-based loops and the
> >     like, the kernel is already dead.
> >
> > How feasible is it to have such a special "DRM viewoops"
> > framework + its minimal drivers in the kernel?
> 
> Please first better define what you want to achieve.
> 
> Do you want to store the dmesg or oops (like your original series
> suggests) or do you want to display the oops? Do you want the facility
> to be functioning at all times, or only when specifically requested in
> advance by the user? If you want to display the oops, do you want it to
> also work when the display is disabled at the time of the oops? What if
> the display is attached to a port on a dock?
> 
> There's at least kdump, ramoops, and netconsole that can be used to
> achieve some of what you want. How do they fall short for you?

Assuming the use-case is to get an oops to display on a kms driver, we do
have a fairly comprehensive plan of what that should look like:

https://dri.freedesktop.org/docs/drm/gpu/todo.html#make-panic-handling-work

This takes into account all the failed previous attempts at trying to get
an oops to display. It's conceptually a match with your viewoops framework
I think.
-Daniel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: DRM-based Oops viewer
  2019-03-10  1:31 DRM-based Oops viewer Ahmed S. Darwish
                   ` (2 preceding siblings ...)
  2019-03-11 12:10 ` Joonas Lahtinen
@ 2019-03-11 17:47 ` Noralf Trønnes
  3 siblings, 0 replies; 9+ messages in thread
From: Noralf Trønnes @ 2019-03-11 17:47 UTC (permalink / raw)
  To: Ahmed S. Darwish, David Airlie, Daniel Vetter, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, Alex Deucher,
	Christian König, David Zhou, Ard Biesheuvel, Matt Fleming
  Cc: Greg Kroah-Hartman, Linus Torvalds, dri-devel, John Ogness, linux-kernel



On 10.03.2019 02.31, Ahmed S. Darwish wrote:
> Hello DRM/UEFI maintainers,
> 
> Several years ago, I wrote a set of patches to dump the kernel
> log to disk upon panic -- through BIOS INT 0x13 services. [1]
> 
> The overwhelming response was that it's unsafe to do this in a
> generic manner. Linus proposed a video-based viewer instead: [2]
> 
>     If you want to do the BIOS services thing, do it for video: copy the
>     oops to low RAM, return to real mode, re-run the graphics card POST
>     routines to initialize text-mode, and use the BIOS to print out the
>     oops.  That is WAY less scary than writing to disk.
> 
> Of course it's 2019 now though, and it's quite known that
> Intel is officially obsoleting the PC/AT BIOS by 2020.. [3]
> 
> Researching whether this can be done from UEFI, it was also clear
> that UEFI "Runtime Services" do not provide any re-initialization
> routines. [4]
> 
> The maximum possible that UEFI can provide is a GOP-provided
> framebuffer that's ready to use by the OS -- even after the UEFI
> boot phase is marked as done through ExitBootServices(). [5]
> 
> Of course, once native drivers like i915 or radeon take over,
> such a framebuffer is toast... [6]
> 
> Thus a possible remaining option, is to display the oops through
> "minimal" DRM drivers provided for each HW variant... Since
> these special drivers will run only and fully under a panic()
> context though, several constraints exist:
> 
>   - The code should be fully synchronous (irqs are disabled)
>   - It should not allocate any dynamic memory
>   - It should make minimal assumptions about HW state
>   - It should not chain into any other kernel subsystem
>   - It has ample freedom to use delay-based loops and the
>     like, the kernel is already dead.
> 
> How feasible is it to have such a special "DRM viewoops"
> framework + its minimal drivers in the kernel?
> 
> The target is to start from i915, since that's what in my
> laptop now, and work from there..
> 
> Some final notes:
> 
>   - The NT kernel has a similar concept, but for storage instead.
>     They're used to dump core under kernel panic() situations,
>     and are called "Minoport storage drivers". [7]
> 
>   - Since Windows 7+, a very fancy Blue Screen of Death is
>     displayed, with Unicode and whatnot, implying GPU drivers
>     involvement. [8]
> 
>   - Mac OS X also does something similar [9]
> 
>   - On Linux laptops, the current situation is _really_ bad.
> 
>     In any graphical session, type "echo c > /proc/sysrq-trigger";
>     the screen will just completely freeze...
> 
>     Desired first goal: just print the panic() log
> 
> Thanks a lot,
> 

I just sent out a patchset I had lying around that tries to solve this:
https://patchwork.freedesktop.org/series/57849/

Noralf.



* Re: DRM-based Oops viewer
  2019-03-11  9:04 ` Jani Nikula
  2019-03-11 13:49   ` Daniel Vetter
@ 2019-03-11 22:12   ` Ahmed S. Darwish
  2019-03-12 10:20     ` Jani Nikula
  1 sibling, 1 reply; 9+ messages in thread
From: Ahmed S. Darwish @ 2019-03-11 22:12 UTC (permalink / raw)
  To: Jani Nikula
  Cc: David Airlie, Daniel Vetter, Joonas Lahtinen, Rodrigo Vivi,
	Alex Deucher, Christian König, David Zhou, Ard Biesheuvel,
	Matt Fleming, Linus Torvalds, Greg Kroah-Hartman, John Ogness,
	dri-devel, linux-kernel

Hello Jani,

On Mon, Mar 11, 2019 at 11:04:19AM +0200, Jani Nikula wrote:
> On Sun, 10 Mar 2019, "Ahmed S. Darwish" <darwish.07@gmail.com> wrote:
> > Hello DRM/UEFI maintainers,
> >
> > Several years ago, I wrote a set of patches to dump the kernel
> > log to disk upon panic -- through BIOS INT 0x13 services. [1]
> >
> > The overwhelming response was that it's unsafe to do this in a
> > generic manner. Linus proposed a video-based viewer instead: [2]
> >
> >     If you want to do the BIOS services thing, do it for video: copy the
> >     oops to low RAM, return to real mode, re-run the graphics card POST
> >     routines to initialize text-mode, and use the BIOS to print out the
> >     oops.  That is WAY less scary than writing to disk.
> >
> > Of course it's 2019 now though, and it's quite known that
> > Intel is officially obsoleting the PC/AT BIOS by 2020.. [3]
> >
> > Researching whether this can be done from UEFI, it was also clear
> > that UEFI "Runtime Services" do not provide any re-initialization
> > routines. [4]
> >
> > The maximum possible that UEFI can provide is a GOP-provided
> > framebuffer that's ready to use by the OS -- even after the UEFI
> > boot phase is marked as done through ExitBootServices(). [5]
> >
> > Of course, once native drivers like i915 or radeon take over,
> > such a framebuffer is toast... [6]
> >
> > Thus a possible remaining option, is to display the oops through
> > "minimal" DRM drivers provided for each HW variant... Since
> > these special drivers will run only and fully under a panic()
> > context though, several constraints exist:
> >
> >   - The code should be fully synchronous (irqs are disabled)
> >   - It should not allocate any dynamic memory
> >   - It should make minimal assumptions about HW state
> >   - It should not chain into any other kernel subsystem
> >   - It has ample freedom to use delay-based loops and the
> >     like, the kernel is already dead.
> >
> > How feasible is it to have such a special "DRM viewoops"
> > framework + its minimal drivers in the kernel?
>
> Please first better define what you want to achieve.
>

Oh I thought this was clear..

What I want to achieve is:

  - for normal day-to-day x86 laptop users
  - properly inform the user when a kernel panic happens during
    a running graphical session (e.g. wayland/gnome/kde/...).

The current situation is dismal: the screen _just freezes_, and
users are left wondering what the heck has really happened to
their system (?)

Some out-of-the-box notification mechanism, for everyday distros
like Fedora and Ubuntu that can be enabled by default, would
improve the situation considerably..

Yes, there are many _developer_ features that can mitigate the
issue somewhat, but they're not really useful for everyday "normal"
usage:

  - netconsole is definitely not an option. It implies a lab
    setting where two machines are always connected through a
    network.

  - ramoops is _completely irrelevant_ for normal users. It's
    mostly for embedded systems and the like; it requires intimate
    knowledge of the hardware by the user, translated into DT
    bindings or a special platform_data struct.

  - kexec/kdump partially solves the save-to-disk problem, but
    doesn't solve the user notification part..

    Maybe a new "kexec/kview" solution can be useful, but
    distributions don't enable kexec/k* solutions _by default_.

    Maybe they fear the extra burden of maintaining two kernels
    at the same time? or the requirement of reserving memory
    for the crashkernel beforehand?

    Linux should not just "freeze the screen" upon panic, even
    if a crashkernel is not present .. _some_ sane default
    built-in user notification mechanism should be there.

  - efivars are neat; they partially solve the save-to-disk
    problem, but do not solve the user notification problem.

    Moreover, they always carry the risk of bricking laptops..

> Do you want to store the dmesg or oops (like your original series
> suggests) or do you want to display the oops?

The original save-to-disk series was only shown for context.
This is a pure display solution; no disk is involved at all.

> Do you want the facility to be functioning at all times, or only
> when specifically requested in advance by the user?

At all times, as a basic "sane default" for laptop-oriented
distributions to enable (think ubuntu, fedora, mint, etc.)

> If you want to display the oops, do you want it to
> also work when the display is disabled at the time of the oops?

If the screen is disabled, then this is definitely out of scope.

This only deals with classical laptop usage scenarios, where we
want to notify the user that something went wrong at the kernel
level.

> What if the display is attached to a port on a dock?
>

This is a much bigger scope that's not important at this stage.

If I'm attaching my laptop to a projector and the kernel panics,
notifying the user only on the main laptop screen should be
enough.

> There's at least kdump, ramoops, and netconsole that can be used to
> achieve some of what you want. How do they fall short for you?
>

Hopefully my answers above provided more insight into why these
solutions fall short..

> BR,
> Jani.
>

Thanks!
--darwi

>
> >
> > The target is to start from i915, since that's what's in my
> > laptop now, and work from there..
> >
> > Some final notes:
> >
> >   - The NT kernel has a similar concept, but for storage instead.
> >     They're used to dump core under kernel panic() situations,
> >     and are called "Miniport storage drivers". [7]
> >
> >   - Since Windows 7+, a very fancy Blue Screen of Death is
> >     displayed, with Unicode and whatnot, implying GPU drivers
> >     involvement. [8]
> >
> >   - Mac OS X also does something similar [9]
> >
> >   - On Linux laptops, the current situation is _really_ bad.
> >
> >     In any graphical session, type "echo c > /proc/sysrq-trigger";
> >     the screen will just completely freeze...
> >
> >     Desired first goal: just print the panic() log
> >
> > Thanks a lot,
> >
> > [1] https://lore.kernel.org/lkml/20110125134748.GA10051@laptop
> > [2] https://lore.kernel.org/lkml/AANLkTinU0KYiCd4p=z+=ojbkeEoT2G+CAYvdRU02KJEn@mail.gmail.com
> >
> > [3] https://uefi.org/sites/default/files/resources/Brian_Richardson_Intel_Final.pdf
> >
> > [4] UEFI v2.7 spec, Chapter 8, "Services — Runtime Services"
> > [5] UEFI v2.7 spec, Section 12.9, "Graphics Output Protocol"
> >     "The Graphics Output Protocol supports this capability by
> >      providing the EFI OS loader access to a hardware frame buffer
> >      and enough information to allow the OS to draw directly to
> >      the graphics output device."
> >
> > [6] linux/drivers/gpu/drm/i915/i915_drv.c::i915_kick_out_firmware_fb()
> >     linux/drivers/gpu/drm/radeon/radeon_drv.c::radeon_pci_probe()
> >
> > [7] https://docs.microsoft.com/en-us/windows-hardware/drivers/storage/restrictions-on-miniport-drivers-that-manage-the-boot-drive
> >
> > [8] https://upload.wikimedia.org/wikipedia/commons/archive/5/56/20181019151937%21Bsodwindows10.png
> > [9] https://upload.wikimedia.org/wikipedia/commons/4/4a/Mac_OS_X_10.2_Kernel_Panic.jpg
> >
> > --darwi
> > http://darwish.chasingpointers.com
>
> --
> Jani Nikula, Intel Open Source Graphics Center


* Re: DRM-based Oops viewer
  2019-03-11 13:49   ` Daniel Vetter
@ 2019-03-11 23:39     ` Ahmed S. Darwish
  0 siblings, 0 replies; 9+ messages in thread
From: Ahmed S. Darwish @ 2019-03-11 23:39 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Jani Nikula, David Airlie, Joonas Lahtinen, Rodrigo Vivi,
	Alex Deucher, Christian König, David Zhou, Ard Biesheuvel,
	Matt Fleming, Linus Torvalds, Greg Kroah-Hartman, John Ogness,
	dri-devel, linux-kernel

On Mon, Mar 11, 2019 at 02:49:41PM +0100, Daniel Vetter wrote:
> On Mon, Mar 11, 2019 at 11:04:19AM +0200, Jani Nikula wrote:
> > On Sun, 10 Mar 2019, "Ahmed S. Darwish" <darwish.07@gmail.com> wrote:
> > > Hello DRM/UEFI maintainers,
> > >
> > > Several years ago, I wrote a set of patches to dump the kernel
> > > log to disk upon panic -- through BIOS INT 0x13 services. [1]
> > >
> > > The overwhelming response was that it's unsafe to do this in a
> > > generic manner. Linus proposed a video-based viewer instead: [2]
> > >
> > >     If you want to do the BIOS services thing, do it for video: copy the
> > >     oops to low RAM, return to real mode, re-run the graphics card POST
> > >     routines to initialize text-mode, and use the BIOS to print out the
> > >     oops.  That is WAY less scary than writing to disk.
> > >
> > > Of course it's 2019 now though, and it's quite known that
> > > Intel is officially obsoleting the PC/AT BIOS by 2020.. [3]
> > >
> > > Researching whether this can be done from UEFI, it was also clear
> > > that UEFI "Runtime Services" do not provide any re-initialization
> > > routines. [4]
> > >
> > > The maximum possible that UEFI can provide is a GOP-provided
> > > framebuffer that's ready to use by the OS -- even after the UEFI
> > > boot phase is marked as done through ExitBootServices(). [5]
> > >
> > > Of course, once native drivers like i915 or radeon take over,
> > > such a framebuffer is toast... [6]
> > >
> > > Thus a possible remaining option, is to display the oops through
> > > "minimal" DRM drivers provided for each HW variant... Since
> > > these special drivers will run only and fully under a panic()
> > > context though, several constraints exist:
> > >
> > >   - The code should be fully synchronous (irqs are disabled)
> > >   - It should not allocate any dynamic memory
> > >   - It should make minimal assumptions about HW state
> > >   - It should not chain into any other kernel subsystem
> > >   - It has ample freedom to use delay-based loops and the
> > >     like, the kernel is already dead.
> > >
> > > How feasible is it to have such a special "DRM viewoops"
> > > framework + its minimal drivers in the kernel?
> >
> > Please first better define what you want to achieve.
> >
> > Do you want to store the dmesg or oops (like your original series
> > suggests) or do you want to display the oops? Do you want the facility
> > to be functioning at all times, or only when specifically requested in
> > advance by the user? If you want to display the oops, do you want it to
> > also work when the display is disabled at the time of the oops? What if
> > the display is attached to a port on a dock?
> >
> > There's at least kdump, ramoops, and netconsole that can be used to
> > achieve some of what you want. How do they fall short for you?
>
> Assuming the use-case is to get an oops to display on a kms driver, we do
> have a fairly comprehensive plan of what that should look like:
>
> https://dri.freedesktop.org/docs/drm/gpu/todo.html#make-panic-handling-work
>
> This takes into account all the failed previous attempts at trying to get
> an oops to display. It's conceptually a match with your viewoops framework
> I think.

Thanks a lot Daniel for the reference! Yup, this is a conceptual
match indeed!

It's great to know that at the maintainer level there is some
agreement on, awareness of, and plans for, the general topic...

I'll jump to Noralf Trønnes's just-posted patches then and see how
to move from there :)

all the best,

--darwi
http://darwish.chasingpointers.com


* Re: DRM-based Oops viewer
  2019-03-11 22:12   ` Ahmed S. Darwish
@ 2019-03-12 10:20     ` Jani Nikula
  0 siblings, 0 replies; 9+ messages in thread
From: Jani Nikula @ 2019-03-12 10:20 UTC (permalink / raw)
  To: Ahmed S. Darwish
  Cc: David Airlie, Daniel Vetter, Joonas Lahtinen, Rodrigo Vivi,
	Alex Deucher, Christian König, David Zhou, Ard Biesheuvel,
	Matt Fleming, Linus Torvalds, Greg Kroah-Hartman, John Ogness,
	dri-devel, linux-kernel

On Mon, 11 Mar 2019, "Ahmed S. Darwish" <darwish.07@gmail.com> wrote:
> Hello Jani,
>
> On Mon, Mar 11, 2019 at 11:04:19AM +0200, Jani Nikula wrote:
>> On Sun, 10 Mar 2019, "Ahmed S. Darwish" <darwish.07@gmail.com> wrote:
>> Please first better define what you want to achieve.
>>
>
> Oh I thought this was clear..
>
> What I want to achieve is:
>
>   - for normal day-to-day x86 laptop users
>   - properly inform the user when a kernel panic happens during
>     a running graphical session (e.g. wayland/gnome/kde/...).

Thanks for the detailed reply. It always helps to start with the problem
you want to solve or end goal you want to reach.

BR,
Jani.



-- 
Jani Nikula, Intel Open Source Graphics Center



Thread overview: 9+ messages
2019-03-10  1:31 DRM-based Oops viewer Ahmed S. Darwish
2019-03-10  8:44 ` Martin Steigerwald
2019-03-11  9:04 ` Jani Nikula
2019-03-11 13:49   ` Daniel Vetter
2019-03-11 23:39     ` Ahmed S. Darwish
2019-03-11 22:12   ` Ahmed S. Darwish
2019-03-12 10:20     ` Jani Nikula
2019-03-11 12:10 ` Joonas Lahtinen
2019-03-11 17:47 ` Noralf Trønnes
