From: Toshi Kani <toshi.kani@hp.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: "hpa@zytor.com" <hpa@zytor.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"arnd@arndb.de" <arnd@arndb.de>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"dave.hansen@intel.com" <dave.hansen@intel.com>,
	"Elliott, Robert (Server Storage)" <Elliott@hp.com>
Subject: Re: [PATCH v3 6/6] x86, mm: Support huge KVA mappings on x86
Date: Wed, 04 Mar 2015 09:23:36 -0700
Message-ID: <1425486216.17007.236.camel@misato.fc.hp.com>
In-Reply-To: <20150303170035.85e94c87.akpm@linux-foundation.org>

On Wed, 2015-03-04 at 01:00 +0000, Andrew Morton wrote:
> On Tue, 03 Mar 2015 16:14:32 -0700 Toshi Kani <toshi.kani@hp.com> wrote:
> 
> > On Tue, 2015-03-03 at 14:44 -0800, Andrew Morton wrote:
> > > On Tue,  3 Mar 2015 10:44:24 -0700 Toshi Kani <toshi.kani@hp.com> wrote:
> >  :
> > > > +
> > > > +#ifdef CONFIG_HAVE_ARCH_HUGE_VMAP
> > > > +int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
> > > > +{
> > > > +	u8 mtrr;
> > > > +
> > > > +	/*
> > > > +	 * Do not use a huge page when the range is covered by non-WB type
> > > > +	 * of MTRRs.
> > > > +	 */
> > > > +	mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE);
> > > > +	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF))
> > > > +		return 0;
> > > 
> > > It would be good to notify the operator in some way when this happens. 
> > > Otherwise the kernel will run more slowly and there's no way of knowing
> > > why.  I guess slap a pr_info() in there.  Or maybe pr_warn()?
> > 
> > We only use 4KB mappings today, so this case will not make it run
> > slowly, i.e. it will be the same as today.
> 
> Yes, but it would be slower than it would be if the operator fixed the
> mtrr settings!  How do we let the operator know this?
> 
> >  Also, adding a message here
> > can generate a lot of messages when MTRRs cover a large area.
> 
> Really?  This is only going to happen when a device driver requests a
> huge io mapping, isn't it?  That's rare.  We could emit a warning,
> return an error code and fall all the way back to the top-level ioremap
> code which can then retry with 4k mappings.  Or something similar -
> somehow record the fact that this warning has been emitted or use
> printk ratelimiting (bad option).

Yes, an IO device with a huge MMIO space that is covered by MTRRs is a
rare case.  BIOS does not need to use MTRRs to specify how the MMIO of
each card is accessed (and it should not, since the MMIO address of
each card is configurable).

However, PCIe has the MMCONFIG space, i.e. the PCIe config space, which
is also memory-mapped and must be accessed as UC.  The PCI subsystem
calls ioremap_nocache() to map the entire MMCONFIG space, which covers
the PCIe config space of all possible cards.  Here are the boot
messages on my test system.

  :
PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0xc0000000-0xcfffffff] (base 0xc0000000)
PCI: MMCONFIG at [mem 0xc0000000-0xcfffffff] reserved in E820
  :

And MTRRs cover this MMCONFIG space with UC to ensure that the range is
always accessed as UC.

# cat /proc/mtrr
reg00: base=0x0c0000000 ( 3072MB), size= 1024MB, count=1: uncachable

So, if we add a message to the code, it will be displayed many times
during this ioremap_nocache() call from PCI.
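
To put a rough number on "many times", here is a small userspace
sketch, not kernel code.  It assumes a 2MB PMD size and that a huge
mapping is attempted once per PMD-aligned chunk of the range.
covered_by_uc_mtrr() and count_rejected_pmds() are hypothetical names
that only model the single UC entry from /proc/mtrr above.  It also
shows a warn-once flag as one possible way to "record the fact that
this warning has been emitted".

```c
#include <stdint.h>
#include <stdio.h>

#define PMD_SIZE (2ULL << 20)	/* assumed 2MB PMD mapping size */

/* Hypothetical model of the one UC MTRR: base 0xc0000000, size 1024MB. */
static int covered_by_uc_mtrr(uint64_t start, uint64_t end)
{
	const uint64_t base = 0xc0000000ULL;
	const uint64_t size = 1024ULL << 20;

	/* True when [start, end) overlaps the UC range. */
	return start < base + size && end > base;
}

/*
 * Walk [start, end) in PMD-sized steps and count the chunks that the
 * MTRR check would reject.  *printed returns how many messages were
 * emitted when a warn-once flag guards the printf().
 */
static unsigned int count_rejected_pmds(uint64_t start, uint64_t end,
					unsigned int *printed)
{
	unsigned int rejected = 0;
	int warned = 0;
	uint64_t addr;

	*printed = 0;
	for (addr = start; addr + PMD_SIZE <= end; addr += PMD_SIZE) {
		if (!covered_by_uc_mtrr(addr, addr + PMD_SIZE))
			continue;
		rejected++;
		if (!warned) {
			warned = 1;
			(*printed)++;
			printf("huge mapping rejected at 0x%llx\n",
			       (unsigned long long)addr);
		}
	}
	return rejected;
}
```

Under these assumptions, calling count_rejected_pmds(0xc0000000ULL,
0xd0000000ULL, &printed) for the 256MB MMCONFIG range counts 128
rejected chunks, so an unguarded message would appear 128 times, while
the guarded one appears once.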

Ideally, pud_set_huge() and pmd_set_huge() should allow using a huge
page mapping when the entire map range is covered by a single MTRR
entry, which is the case with MMCONFIG.  But I did not include such
handling in the patch because a UC mapping is slow by itself, MMCONFIG
is only accessed at boot time, and mtrr_type_lookup() does not provide
the level of info necessary.
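
For illustration only, the "covered by a single MTRR entry" condition
could be modeled in userspace roughly as below.  struct mtrr_entry and
range_is_uniform() are hypothetical and not part of the patch, and the
sketch ignores fixed-range MTRRs, the default type, and the precedence
rules for overlapping entries.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one variable-range MTRR. */
struct mtrr_entry {
	uint64_t base;
	uint64_t size;
	int type;	/* e.g. 0 for UC, 6 for WB */
};

/*
 * Return 1 when [start, end) is "uniform": every entry either fully
 * contains the range or does not overlap it at all.  Return 0 when
 * some entry covers only part of the range, in which case a single
 * huge mapping could not carry one consistent memory type.
 */
static int range_is_uniform(const struct mtrr_entry *e, size_t n,
			    uint64_t start, uint64_t end)
{
	size_t i;

	for (i = 0; i < n; i++) {
		uint64_t lo = e[i].base;
		uint64_t hi = e[i].base + e[i].size;
		int overlaps = start < hi && end > lo;
		int contains = start >= lo && end <= hi;

		if (overlaps && !contains)
			return 0;
	}
	return 1;
}
```

With the single entry { 0xc0000000, 1024MB, UC } from /proc/mtrr, a 2MB
chunk inside MMCONFIG is uniform under this model, while a range that
straddles 0xc0000000 is not.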

Thanks,
-Toshi



Thread overview: 15+ messages
2015-03-03 17:44 [PATCH v3 0/6] Kernel huge I/O mapping support Toshi Kani
2015-03-03 17:44 ` [PATCH v3 1/6] mm: Change __get_vm_area_node() to use fls_long() Toshi Kani
2015-03-03 17:44 ` [PATCH v3 2/6] lib: Add huge I/O map capability interfaces Toshi Kani
2015-03-03 17:44 ` [PATCH v3 3/6] mm: Change ioremap to set up huge I/O mappings Toshi Kani
2015-03-04 22:09   ` Ingo Molnar
2015-03-04 23:15     ` Toshi Kani
2015-03-03 17:44 ` [PATCH v3 4/6] mm: Change vunmap to tear down huge KVA mappings Toshi Kani
2015-03-03 17:44 ` [PATCH v3 5/6] x86, mm: Support huge I/O mapping capability I/F Toshi Kani
2015-03-03 17:44 ` [PATCH v3 6/6] x86, mm: Support huge KVA mappings on x86 Toshi Kani
2015-03-03 22:44   ` Andrew Morton
2015-03-03 23:14     ` Toshi Kani
2015-03-04  1:00       ` Andrew Morton
2015-03-04 16:23         ` Toshi Kani [this message]
2015-03-04 20:17           ` Ingo Molnar
2015-03-04 21:16             ` Toshi Kani
