LKML Archive on lore.kernel.org
* [PATCH] mm: trigger panic on bad page or PTE states if panic_on_oops
@ 2015-03-16  8:37 Christian Borntraeger
  2015-03-16 11:00 ` Kirill A. Shutemov
  0 siblings, 1 reply; 6+ messages in thread
From: Christian Borntraeger @ 2015-03-16  8:37 UTC
  To: linux-mm; +Cc: linux-kernel, Christian Borntraeger

While debugging a memory management problem, it helped a lot to
get a system dump as early as possible for bad page states.

Let's assume that if panic_on_oops is set, then the system should
not continue with broken mm data structures.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 mm/memory.c     | 2 ++
 mm/page_alloc.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index 2c3536c..bdbf9cc 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -696,6 +696,8 @@ static void print_bad_pte(struct vm_area_struct *vma, unsigned long addr,
 		printk(KERN_ALERT "vma->vm_file->f_op->mmap: %pSR\n",
 		       vma->vm_file->f_op->mmap);
 	dump_stack();
+	if (panic_on_oops)
+		panic("Fatal exception");
 	add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
 }
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8e20f9c..8c19db3 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -337,6 +337,8 @@ static void bad_page(struct page *page, const char *reason,
 
 	print_modules();
 	dump_stack();
+	if (panic_on_oops)
+		panic("Fatal exception");
 out:
 	/* Leave bad fields for debug, except PageBuddy could make trouble */
 	page_mapcount_reset(page); /* remove PageBuddy */
-- 
2.3.0



* Re: [PATCH] mm: trigger panic on bad page or PTE states if panic_on_oops
  2015-03-16  8:37 [PATCH] mm: trigger panic on bad page or PTE states if panic_on_oops Christian Borntraeger
@ 2015-03-16 11:00 ` Kirill A. Shutemov
  2015-03-16 11:12   ` Christian Borntraeger
  0 siblings, 1 reply; 6+ messages in thread
From: Kirill A. Shutemov @ 2015-03-16 11:00 UTC
  To: Christian Borntraeger; +Cc: linux-mm, linux-kernel

On Mon, Mar 16, 2015 at 09:37:01AM +0100, Christian Borntraeger wrote:
> While debugging a memory management problem, it helped a lot to
> get a system dump as early as possible for bad page states.
> 
> Let's assume that if panic_on_oops is set, then the system should
> not continue with broken mm data structures.

bad_pte is not an oops.

Probably we should consider putting VM_BUG() at the end of these
functions instead.

-- 
 Kirill A. Shutemov


* Re: [PATCH] mm: trigger panic on bad page or PTE states if panic_on_oops
  2015-03-16 11:00 ` Kirill A. Shutemov
@ 2015-03-16 11:12   ` Christian Borntraeger
  2015-03-16 12:15     ` Kirill A. Shutemov
  0 siblings, 1 reply; 6+ messages in thread
From: Christian Borntraeger @ 2015-03-16 11:12 UTC
  To: Kirill A. Shutemov; +Cc: linux-mm, linux-kernel

Am 16.03.2015 um 12:00 schrieb Kirill A. Shutemov:
> On Mon, Mar 16, 2015 at 09:37:01AM +0100, Christian Borntraeger wrote:
>> While debugging a memory management problem, it helped a lot to
>> get a system dump as early as possible for bad page states.
>>
>> Let's assume that if panic_on_oops is set, then the system should
>> not continue with broken mm data structures.
> 
> bad_pte is not an oops.

I know that this is not an oops, but semantically it is like one. I certainly
want a way to hard stop the system if something like that happens.

Would something like panic_on_mm_error be better?

> 
> Probably we should consider putting VM_BUG() at the end of these
> functions instead.

That is probably also a workable solution if I can reproduce the issue on
my system, but VM_BUG defaults to off on many production systems (RHEL, SLES, ...).
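
To illustrate why a debug-only check does not help on such systems, here is a
minimal userspace sketch. It assumes the check behaves like the kernel's
VM_BUG_ON(), which is only fatal when CONFIG_DEBUG_VM is enabled and is
otherwise compiled out. SKETCH_DEBUG_VM, SKETCH_VM_BUG_ON() and
sketch_bad_page() are hypothetical names used for illustration only; this is
not kernel code.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * SKETCH_DEBUG_VM is a hypothetical stand-in for CONFIG_DEBUG_VM.
 * Build with -DSKETCH_DEBUG_VM to make the check fatal.
 */
#ifdef SKETCH_DEBUG_VM
#define SKETCH_VM_BUG_ON(cond)                                   \
	do {                                                     \
		if (cond) {                                      \
			fprintf(stderr, "sketch: VM bug hit\n"); \
			abort();                                 \
		}                                                \
	} while (0)
#else
/* Check compiled out: the condition is type-checked but never evaluated. */
#define SKETCH_VM_BUG_ON(cond) ((void)sizeof(!!(cond)))
#endif

/* Hypothetical stand-in for bad_page(): report the problem, then hit the check. */
static int sketch_bad_page(const char *reason)
{
	fprintf(stderr, "BUG: Bad page state: %s\n", reason);
	SKETCH_VM_BUG_ON(1);
	return 1; /* reached only when the check is compiled out */
}
```

Without -DSKETCH_DEBUG_VM, sketch_bad_page() just prints the message and
returns, which is the situation on a kernel built without the debug option.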

Any other suggestion?

Christian



* Re: [PATCH] mm: trigger panic on bad page or PTE states if panic_on_oops
  2015-03-16 11:12   ` Christian Borntraeger
@ 2015-03-16 12:15     ` Kirill A. Shutemov
  2015-03-17 17:19       ` Konstantin Khlebnikov
  0 siblings, 1 reply; 6+ messages in thread
From: Kirill A. Shutemov @ 2015-03-16 12:15 UTC
  To: Christian Borntraeger; +Cc: linux-mm, linux-kernel

On Mon, Mar 16, 2015 at 12:12:54PM +0100, Christian Borntraeger wrote:
> Am 16.03.2015 um 12:00 schrieb Kirill A. Shutemov:
> > On Mon, Mar 16, 2015 at 09:37:01AM +0100, Christian Borntraeger wrote:
> >> While debugging a memory management problem, it helped a lot to
> >> get a system dump as early as possible for bad page states.
> >>
> >> Let's assume that if panic_on_oops is set, then the system should
> >> not continue with broken mm data structures.
> > 
> > bad_pte is not an oops.
> 
> I know that this is not an oops, but semantically it is like one. I certainly
> want a way to hard stop the system if something like that happens.
> 
> Would something like panic_on_mm_error be better?

Or panic_on_taint=<mask>, where <mask> is a bit-mask of TAINT_* values.

The problem is that TAINT_* will effectively become part of the kernel ABI,
and I'm not sure it's a good idea.

Oopsing on any taint will have limited usefulness, I think.
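
A minimal userspace sketch of the panic_on_taint=<mask> idea, under stated
assumptions: the taint path checks the new bit against a configured mask and
panics on a match. All names below (sketch_add_taint(), sketch_panic_on_taint
and so on) are hypothetical, and the function returns 1 where a real
implementation would call panic(), so that it can be exercised safely. The bit
numbers are meant to mirror the kernel's taint bits (bad page = 5, out-of-tree
module = 12); treat them as assumptions too.

```c
#include <stdio.h>

/* Illustrative subset of taint bits; the values are assumptions. */
enum sketch_taint {
	SKETCH_TAINT_PROPRIETARY_MODULE = 0,
	SKETCH_TAINT_BAD_PAGE = 5,
	SKETCH_TAINT_OOT_MODULE = 12,
};

static unsigned long sketch_tainted_mask;   /* taints seen so far */
static unsigned long sketch_panic_on_taint; /* hypothetical panic_on_taint=<mask> */

/*
 * Record the taint, then report whether the configured mask asks for a
 * panic. Returns 1 where the kernel would panic, 0 otherwise.
 */
static int sketch_add_taint(unsigned int flag)
{
	unsigned long bit = 1UL << flag;

	sketch_tainted_mask |= bit;
	if (sketch_panic_on_taint & bit) {
		fprintf(stderr, "sketch: would panic on taint bit %u\n", flag);
		return 1;
	}
	return 0;
}
```

With sketch_panic_on_taint set to 1UL << SKETCH_TAINT_BAD_PAGE, an out-of-tree
module taint is only recorded, while a bad page taint reports a panic. This is
also where the ABI concern shows up: the numeric bit positions become something
administrators put on the command line.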

-- 
 Kirill A. Shutemov


* Re: [PATCH] mm: trigger panic on bad page or PTE states if panic_on_oops
  2015-03-16 12:15     ` Kirill A. Shutemov
@ 2015-03-17 17:19       ` Konstantin Khlebnikov
  2015-03-17 19:40         ` Kirill A. Shutemov
  0 siblings, 1 reply; 6+ messages in thread
From: Konstantin Khlebnikov @ 2015-03-17 17:19 UTC
  To: Kirill A. Shutemov, Christian Borntraeger; +Cc: linux-mm, linux-kernel

On 16.03.2015 15:15, Kirill A. Shutemov wrote:
> On Mon, Mar 16, 2015 at 12:12:54PM +0100, Christian Borntraeger wrote:
>> Am 16.03.2015 um 12:00 schrieb Kirill A. Shutemov:
>>> On Mon, Mar 16, 2015 at 09:37:01AM +0100, Christian Borntraeger wrote:
>>>> While debugging a memory management problem, it helped a lot to
>>>> get a system dump as early as possible for bad page states.
>>>>
>>>> Let's assume that if panic_on_oops is set, then the system should
>>>> not continue with broken mm data structures.
>>>
>>> bad_pte is not an oops.
>>
>> I know that this is not an oops, but semantically it is like one. I certainly
>> want a way to hard stop the system if something like that happens.
>>
>> Would something like panic_on_mm_error be better?
>
> Or panic_on_taint=<mask>, where <mask> is a bit-mask of TAINT_* values.
>
> The problem is that TAINT_* will effectively become part of the kernel ABI,
> and I'm not sure it's a good idea.

Taint bits have associated letters: for example, panic_on_taint=OP would
panic on out-of-tree or proprietary modules =)
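
A minimal userspace sketch of the letter-based form, assuming the letters are
simply translated into the corresponding taint bits. The table covers only
three letters ('P' proprietary, 'B' bad page, 'O' out-of-tree), and the bit
numbers are assumptions meant to mirror the kernel's; the struct and function
names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical letter-to-bit table; only a small illustrative subset. */
struct sketch_taint_letter {
	char letter;
	unsigned int bit;
};

static const struct sketch_taint_letter sketch_taint_letters[] = {
	{ 'P', 0 },  /* proprietary module */
	{ 'B', 5 },  /* bad page */
	{ 'O', 12 }, /* out-of-tree module */
};

/*
 * Translate a string such as "OP" into a bit-mask.
 * Returns 0 on success, or -1 (leaving *mask untouched) on an unknown letter.
 */
static int sketch_parse_taint_letters(const char *s, unsigned long *mask)
{
	const size_t n = sizeof(sketch_taint_letters) /
			 sizeof(sketch_taint_letters[0]);
	unsigned long result = 0;

	for (; *s != '\0'; s++) {
		size_t i;

		for (i = 0; i < n; i++) {
			if (sketch_taint_letters[i].letter == *s)
				break;
		}
		if (i == n)
			return -1; /* unknown letter */
		result |= 1UL << sketch_taint_letters[i].bit;
	}
	*mask = result;
	return 0;
}
```

Under these assumptions, "OP" resolves to the mask 0x1001. Using letters
rather than raw bit numbers keeps the numeric positions out of the
user-visible interface, which addresses part of the ABI concern.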

>
> Oopsing on any taint will have limited usefulness, I think.
>



* Re: [PATCH] mm: trigger panic on bad page or PTE states if panic_on_oops
  2015-03-17 17:19       ` Konstantin Khlebnikov
@ 2015-03-17 19:40         ` Kirill A. Shutemov
  0 siblings, 0 replies; 6+ messages in thread
From: Kirill A. Shutemov @ 2015-03-17 19:40 UTC
  To: Konstantin Khlebnikov; +Cc: Christian Borntraeger, linux-mm, linux-kernel

On Tue, Mar 17, 2015 at 08:19:19PM +0300, Konstantin Khlebnikov wrote:
> On 16.03.2015 15:15, Kirill A. Shutemov wrote:
> >On Mon, Mar 16, 2015 at 12:12:54PM +0100, Christian Borntraeger wrote:
> >>Am 16.03.2015 um 12:00 schrieb Kirill A. Shutemov:
> >>>On Mon, Mar 16, 2015 at 09:37:01AM +0100, Christian Borntraeger wrote:
> >>>>While debugging a memory management problem, it helped a lot to
> >>>>get a system dump as early as possible for bad page states.
> >>>>
> >>>>Let's assume that if panic_on_oops is set, then the system should
> >>>>not continue with broken mm data structures.
> >>>
> >>>bad_pte is not an oops.
> >>
> >>I know that this is not an oops, but semantically it is like one. I certainly
> >>want a way to hard stop the system if something like that happens.
> >>
> >>Would something like panic_on_mm_error be better?
> >
> >Or panic_on_taint=<mask>, where <mask> is a bit-mask of TAINT_* values.
> >
> >The problem is that TAINT_* will effectively become part of the kernel ABI,
> >and I'm not sure it's a good idea.
> 
> Taint bits have associated letters: for example, panic_on_taint=OP would
> panic on out-of-tree or proprietary modules =)

Works for me.

-- 
 Kirill A. Shutemov
