LKML Archive on lore.kernel.org
From: Nicolas Pitre <nicolas.pitre@linaro.org>
To: Guenter Roeck <linux@roeck-us.net>
Cc: Tejun Heo <tj@kernel.org>, Christoph Lameter <cl@linux.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Mikael Starvik <starvik@axis.com>,
	Jesper Nilsson <jesper.nilsson@axis.com>,
	linux-cris-kernel@axis.com
Subject: Re: mm/percpu.c: use smarter memory allocation for struct pcpu_alloc_info (crisv32 hang)
Date: Mon, 20 Nov 2017 13:18:38 -0500 (EST)
Message-ID: <nycvar.YSQ.7.76.1711201305160.16045@knanqh.ubzr>
In-Reply-To: <62a3b680-6dde-d308-3da8-9c9a2789b114@roeck-us.net>

On Sun, 19 Nov 2017, Guenter Roeck wrote:

> On 11/19/2017 08:08 PM, Nicolas Pitre wrote:
> > On Sun, 19 Nov 2017, Guenter Roeck wrote:
> > > On 11/19/2017 12:36 PM, Nicolas Pitre wrote:
> > > > On Sat, 18 Nov 2017, Guenter Roeck wrote:
> > > > > On Tue, Oct 03, 2017 at 06:29:49PM -0400, Nicolas Pitre wrote:
> > > > > > @@ -2295,6 +2295,7 @@ void __init setup_per_cpu_areas(void)
> > > > > >      	if (pcpu_setup_first_chunk(ai, fc) < 0)
> > > > > >    		panic("Failed to initialize percpu areas.");
> > > > > > +	pcpu_free_alloc_info(ai);
> > > > > 
> > > > > This is the culprit. Everything works fine if I remove this line.
> > > > 
> > > > Without this line, the memory at the ai pointer is leaked. Maybe this is
> > > > modifying the memory allocation pattern and that triggers a bug later on
> > > > in your case.
> > > > 
> > > > At that point the console driver is not yet initialized and any error
> > > > message won't be printed. You should enable the early console mechanism
> > > > in your kernel (see arch/cris/arch-v32/kernel/debugport.c) and see what
> > > > that might tell you.
> > > > 
> > > 
> > > The problem is that BUG() on crisv32 does not yield useful output.
> > > Anyway, here is the culprit.
> > > 
> > > diff --git a/mm/bootmem.c b/mm/bootmem.c
> > > index 6aef64254203..2bcc8901450c 100644
> > > --- a/mm/bootmem.c
> > > +++ b/mm/bootmem.c
> > > @@ -382,7 +382,8 @@ static int __init mark_bootmem(unsigned long start,
> > > unsigned long end,
> > >                          return 0;
> > >                  pos = bdata->node_low_pfn;
> > >          }
> > > -       BUG();
> > > > +       WARN(1, "mark_bootmem(): memory range 0x%lx-0x%lx not found\n", start, end);
> > > +       return -ENOMEM;
> > >   }
> > > 
> > >   /**
> > > diff --git a/mm/percpu.c b/mm/percpu.c
> > > index 79e3549cab0f..c75622d844f1 100644
> > > --- a/mm/percpu.c
> > > +++ b/mm/percpu.c
> > > @@ -1881,6 +1881,7 @@ struct pcpu_alloc_info * __init
> > > pcpu_alloc_alloc_info(int nr_groups,
> > >    */
> > >   void __init pcpu_free_alloc_info(struct pcpu_alloc_info *ai)
> > >   {
> > > +       printk("pcpu_free_alloc_info(%p (0x%lx))\n", ai, __pa(ai));
> > >          memblock_free_early(__pa(ai), ai->__ai_size);
> > 
> > > The problem here is that there are two possible definitions of
> > > memblock_free_early(). From include/linux/bootmem.h:
> > 
> > #if defined(CONFIG_HAVE_MEMBLOCK) && defined(CONFIG_NO_BOOTMEM)
> > 
> > static inline void __init memblock_free_early(
> >                                          phys_addr_t base, phys_addr_t size)
> > {
> >          __memblock_free_early(base, size);
> > }
> > 
> > #else
> > 
> > static inline void __init memblock_free_early(
> >                                          phys_addr_t base, phys_addr_t size)
> > {
> >          free_bootmem(base, size);
> > }
> > 
> > #endif
> > 
> > It looks like most architectures use the memblock variant, including all
> > the ones I have access to.
> > 
> > > results in:
> > > 
> > > pcpu_free_alloc_info(c0534000 (0x40534000))
> > > ------------[ cut here ]------------
> > > WARNING: CPU: 0 PID: 0 at mm/bootmem.c:385 mark_bootmem+0x9a/0xaa
> > > mark_bootmem(): memory range 0x2029a-0x2029b not found
> > 
> > Well... PFN_UP(0x40534000) should give 0x40534. How you might end up
> > with 0x2029a in mark_bootmem(), let alone not exit on the first "if (max
> > == end) return 0;" within the loop is rather weird.
> > 
> pcpu_free_alloc_info: ai=c0536000, __pa(ai)=0x40536000,
> PFN_UP(__pa(ai))=0x2029b, PFN_UP(ai)=0x6029b
> 
> bootmem range is 0x60000..0x61000. It doesn't get to "if (max == end)"
> because "pos (=0x2029b) < bdata->node_min_pfn (=0x60000)".

OK. The 0x2029b is the result of PAGE_SIZE being 8192 in your case.
However, the bootmem allocator deals with physical addresses, not virtual
ones, so it shouldn't give you a 0x60000..0x61000 range.

It would be interesting to see what result you get on line 860 of
mm/bootmem.c.


Nicolas

Thread overview: 29+ messages
2017-10-03 20:57 [PATCH] mm/percpu.c: use smarter memory allocation for struct pcpu_alloc_info Nicolas Pitre
2017-10-03 21:05 ` Tejun Heo
2017-10-03 22:29   ` Nicolas Pitre
2017-10-03 22:36     ` Tejun Heo
2017-10-03 23:48       ` Dennis Zhou
2017-10-04  0:13         ` Nicolas Pitre
2017-10-04 14:15     ` Tejun Heo
2017-11-18 18:25     ` mm/percpu.c: use smarter memory allocation for struct pcpu_alloc_info (crisv32 hang) Guenter Roeck
2017-11-19 20:36       ` Nicolas Pitre
2017-11-20  2:03         ` Guenter Roeck
2017-11-20  4:08           ` Nicolas Pitre
2017-11-20  5:05             ` Guenter Roeck
2017-11-20 18:18               ` Nicolas Pitre [this message]
2017-11-20 18:51                 ` Guenter Roeck
2017-11-20 20:21                   ` Nicolas Pitre
2017-11-20 21:11                     ` Guenter Roeck
2017-11-21  0:28                       ` Nicolas Pitre
2017-11-21  1:48                         ` Guenter Roeck
2017-11-21  3:50                           ` Nicolas Pitre
2017-11-22 15:34                             ` Jesper Nilsson
2017-11-22 20:17                               ` Nicolas Pitre
2017-11-23  7:56                                 ` Jesper Nilsson
2017-11-27 19:41       ` Tejun Heo
2017-11-27 20:31         ` Nicolas Pitre
2017-11-27 20:33           ` Tejun Heo
2017-11-27 20:51             ` Nicolas Pitre
2017-11-27 20:54               ` Tejun Heo
2017-11-27 21:11                 ` Guenter Roeck
2017-11-28  8:19               ` Jesper Nilsson
