From: Jack Steiner <steiner@sgi.com>
To: Nick Piggin <npiggin@suse.de>
Cc: Andrea Arcangeli <andrea@qumranet.com>,
	akpm@linux-foundation.org, Robin Holt <holt@sgi.com>,
	Avi Kivity <avi@qumranet.com>, Izik Eidus <izike@qumranet.com>,
	kvm-devel@lists.sourceforge.net,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	general@lists.openfabrics.org,
	Steve Wise <swise@opengridcomputing.com>,
	Roland Dreier <rdreier@cisco.com>,
	Kanoj Sarcar <kanojsarcar@yahoo.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	daniel.blueman@quadrics.com, Christoph Lameter <clameter@sgi.com>
Subject: Re: [PATCH] mmu notifiers #v8
Date: Mon, 3 Mar 2008 13:15:40 -0600
Message-ID: <20080303191540.GB11156@sgi.com>
In-Reply-To: <20080303184517.GA4951@wotan.suse.de>

On Mon, Mar 03, 2008 at 07:45:17PM +0100, Nick Piggin wrote:
> On Mon, Mar 03, 2008 at 12:06:05PM -0600, Jack Steiner wrote:
> > On Mon, Mar 03, 2008 at 05:59:10PM +0100, Nick Piggin wrote:
> > > > Maintaining a long-term reference on a page is a problem. The GRU does not
> > > > currently maintain tables to track the pages for which dropins have been done.
> > > > 
> > > > The GRU has a large internal TLB and is designed to reference up to 8PB of
> > > > memory. The size of the tables to track this many referenced pages would be
> > > > a problem (at best).
> > > 
> > > Is it any worse a problem than the pagetables of the processes which have
> > > their virtual memory exported to GRU? AFAIKS, no; it is on the same
> > > magnitude of difficulty. So you could do it without introducing any
> > > fundamental problem (memory usage might be increased by some constant
> > > factor, but I think we can cope with that in order to make the core patch
> > > really nice and simple).
> > 
> > Functionally, the GRU is very close to what I would consider to be the
> > "standard TLB" model. Dropins and flushes map closely to processor dropins
> > and flushes for cpus.  The internal structure of the GRU TLB is identical to
> > the TLB of existing cpus.  Requiring the GRU driver to track dropins with
> > long term page references seems to me a deviation from having the basic
> > mmuops support a "standard TLB" model. AFAIK, no other processor requires
> > this.
> 
> That is because the CPU TLBs have the mmu_gather batching APIs which
> avoid the problem. It would be possible to do something similar for
> GRU which would involve taking a reference for each page-to-be-invalidated
> in invalidate_page, and release them when you invalidate_range. Or else
> do some other scheme which makes mmu notifiers work similarly to the
> mmu gather API. But not just go and invent something completely different
> in the form of this invalidate_begin, clear linux pte, invalidate_end API.

Correct. If the mmu_gather were passed on the mmuops callout and the callout were
done at the same point as the tlb_finish_mmu(), the GRU could
efficiently work w/o the range invalidates. A range invalidate might still
be slightly more efficient, but not measurably so. The net difference is
not worth the extra complexity of range callouts.
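
To make the batching idea concrete, here is a rough userspace model of the
scheme suggested above: take a reference on each page as its pte is cleared,
then flush the external TLB once and drop all the references together. This
is only a sketch under stated assumptions, not kernel code. Every name in it
(page_model, gather_model, gather_add, gather_flush, GATHER_MAX) is
hypothetical, and it is not the real mmu_gather or mmu notifier API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only a reference count. */
struct page_model {
	int refcount;
};

#define GATHER_MAX 8	/* assumed batch size, chosen arbitrarily */

/* Hypothetical batch of pages pinned until the external TLB is flushed. */
struct gather_model {
	struct page_model *pages[GATHER_MAX];
	size_t nr;
	unsigned long flushes;	/* counts simulated external TLB flushes */
};

static void gather_init(struct gather_model *g)
{
	g->nr = 0;
	g->flushes = 0;
}

/* One simulated TLB flush for the whole batch, then drop the references. */
static void gather_flush(struct gather_model *g)
{
	size_t i;

	if (g->nr == 0)
		return;
	g->flushes++;	/* stands in for the device TLB flush */
	for (i = 0; i < g->nr; i++)
		g->pages[i]->refcount--;
	g->nr = 0;
}

/*
 * Called as a pte is cleared: pin the page so it cannot be freed while
 * a stale external TLB entry may still point at it.
 */
static void gather_add(struct gather_model *g, struct page_model *p)
{
	if (g->nr == GATHER_MAX)
		gather_flush(g);
	p->refcount++;
	g->pages[g->nr++] = p;
}
```

In this model the extra references live only for the duration of one batch,
so nothing has to track long-term references per dropin.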


> 
> 
> > Tracking TLB dropins (and long term page references) could be done but it
> > adds significant complexity and scaling issues. The size of the tables to
> > track many TB (to PB) of memory can get large. If the memory is being
> > referenced by highly threaded applications, then the problem becomes even
> > more complex. Either tables must be replicated per-thread (and require even
> > more memory), or the table structure becomes even more complex to deal with
> > node locality, cacheline bouncing, etc.
> 
> I don't think it would be that significant in terms of complexity or
> scaling.
> 
> For a quick solution, you could stick a radix tree in each of your mmu
> notifiers registered (ie. one per mm), which is indexed on virtual address
> >> PAGE_SHIFT, and returns the struct page *. Size is no different than
> page tables, and locking is pretty scalable.
> 
> After that, I would really like to see whether the numbers justify
> larger changes.

I'm still concerned about performance. Each dropin would first have to access
an additional data structure that would most likely be non-node-local and
non-cache-resident. The net effect would be measurable but not a killer.

I haven't thought about locking requirements for the radix tree. Most accesses
would be read-only & updates infrequent. Any chance of an RCU-based radix tree
implementation?  Otherwise, don't we add the potential for hot locks/cachelines
for threaded applications?
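
For reference, the per-mm tracking structure proposed above (indexed on
virtual address >> PAGE_SHIFT, returning the page) might look roughly like
the following userspace sketch. It is a fixed two-level table, not the
kernel radix tree. All names (track_root, track_insert, track_lookup,
track_remove, track_destroy) and the 4 KiB page size and 9-bit levels are
assumptions for illustration. It has no locking and no RCU, so it says
nothing about the hot lock/cacheline question.

```c
#include <assert.h>
#include <stdlib.h>

#define PAGE_SHIFT_MODEL 12			/* assumed 4 KiB pages */
#define LEVEL_BITS 9
#define LEVEL_SIZE (1UL << LEVEL_BITS)
#define LEVEL_MASK (LEVEL_SIZE - 1)

/* Two levels of 512 slots cover 2^18 pages, i.e. 1 GiB of address space. */
struct track_leaf {
	void *slot[LEVEL_SIZE];
};

struct track_root {
	struct track_leaf *leaf[LEVEL_SIZE];	/* must start zeroed */
};

/* Record 'page' for the page containing vaddr. Returns 0, or -1 on error. */
static int track_insert(struct track_root *r, unsigned long vaddr, void *page)
{
	unsigned long idx = vaddr >> PAGE_SHIFT_MODEL;
	unsigned long hi = idx >> LEVEL_BITS;
	unsigned long lo = idx & LEVEL_MASK;

	if (hi >= LEVEL_SIZE)
		return -1;	/* outside the range this sketch covers */
	if (!r->leaf[hi]) {
		r->leaf[hi] = calloc(1, sizeof(struct track_leaf));
		if (!r->leaf[hi])
			return -1;
	}
	r->leaf[hi]->slot[lo] = page;
	return 0;
}

/* Return the tracked page for vaddr, or NULL if none is recorded. */
static void *track_lookup(const struct track_root *r, unsigned long vaddr)
{
	unsigned long idx = vaddr >> PAGE_SHIFT_MODEL;
	unsigned long hi = idx >> LEVEL_BITS;

	if (hi >= LEVEL_SIZE || !r->leaf[hi])
		return NULL;
	return r->leaf[hi]->slot[idx & LEVEL_MASK];
}

/* Forget the entry for vaddr and return what was stored there. */
static void *track_remove(struct track_root *r, unsigned long vaddr)
{
	void *old = track_lookup(r, vaddr);

	if (old)
		r->leaf[(vaddr >> PAGE_SHIFT_MODEL) >> LEVEL_BITS]
			->slot[(vaddr >> PAGE_SHIFT_MODEL) & LEVEL_MASK] = NULL;
	return old;
}

/* Free all leaves; the root is left empty and reusable. */
static void track_destroy(struct track_root *r)
{
	unsigned long i;

	for (i = 0; i < LEVEL_SIZE; i++) {
		free(r->leaf[i]);
		r->leaf[i] = NULL;
	}
}
```

Each dropin would do one track_insert() and each invalidate one
track_remove(), which is the extra, possibly non-node-local access
mentioned above.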
