LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Robin Holt <holt@sgi.com>
To: Andrea Arcangeli <andrea@qumranet.com>
Cc: Christoph Lameter <clameter@sgi.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Robin Holt <holt@sgi.com>, Avi Kivity <avi@qumranet.com>,
	Izik Eidus <izike@qumranet.com>,
	kvm-devel@lists.sourceforge.net,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	general@lists.openfabrics.org,
	Steve Wise <swise@opengridcomputing.com>,
	Roland Dreier <rdreier@cisco.com>,
	Kanoj Sarcar <kanojsarcar@yahoo.com>,
	steiner@sgi.com, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, daniel.blueman@quadrics.com
Subject: Re: [patch 1/6] mmu_notifier: Core code
Date: Sun, 17 Feb 2008 06:24:38 -0600	[thread overview]
Message-ID: <20080217122438.GB11391@sgi.com> (raw)
In-Reply-To: <20080217030120.GO11732@v2.random>

On Sun, Feb 17, 2008 at 04:01:20AM +0100, Andrea Arcangeli wrote:
> On Sat, Feb 16, 2008 at 11:21:07AM -0800, Christoph Lameter wrote:
> > On Fri, 15 Feb 2008, Andrew Morton wrote:
> > 
> > > What is the status of getting infiniband to use this facility?
> > 
> > Well we are talking about this it seems.
> 
> It seems the IB folks think allowing RDMA over virtual memory is not
> interesting, their argument seem to be that RDMA is only interesting
> on RAM (and they seem not interested in allowing RDMA over a ram+swap
> backed _virtual_ memory allocation). They've just to decide if
> ram+swap allocation for RDMA is useful or not.

I don't think that is a completely fair characterization.  It would be
more fair to say that the changes required to their library/user api
would be too significant to allow an adaptation to any scheme which
allowed removal of physical memory below a virtual mapping.

I agree with the IB folks when they say it is impossible with their
current scheme.  The fact that any consumer of their endpoint identifier
can use any identifier without notifying the kernel prior to its use
certainly makes any implementation under any scheme impossible.

I guess we could possibly make things work for IB if we did some heavy
work.  Let's assume, instead of passing around the physical endpoint
identifiers, they passed around a handle.  In order for any IB endpoint
to commuicate, it would need to request the kernel translate a handle
into an endpoint identifier.  In order for the kernel to put a TLB
entry into the processes address space allowing the process access to
the _CARD_, it would need to ensure all the current endpoint identifiers
for this process were "active" meaning we have verified with the other
endpoint that all pages are faulted and TLB/PFN information is in the
owning card's TLB/PFN tables.  Once all of a processes endoints are
"active" we would drop in the PFN for the adapter into the pages tables.
Any time pages are being revoked from under an active handle, we would
shoot-down the IB adapter card TLB entries for all the remote users of
this handle and quiesce the cards state to ensure transfers are either
complete or terminated.  When their are no active transfers, we would
respond back to the owner and they could complete the source process
page table cleaning.  Any time all of the pages for a handle can not be
mapped from virtual to physical, the remote process would be SIGBUS'd
instead of having it IB adapter TLB installed.

This is essentially how XPMEM does it except we have the benefit of
working on individual pages.

Again, not knowing what I am talking about, but under the assumption that
MPI IB use is contained to a library, I would hope the changes could be
contained under the MPI-to-IB library interface and would not need any
changes at the MPI-user library interface.

We do keep track of the virtual address ranges within a handle that
are being used.  I assume the IB folks will find that helpful as well.
Otherwise, I think they could make things operate this way.  XPMEM has
the advantage of not needing to have virtual-to-physical at all times,
but otherwise it is essentially the same.

Thanks,
Robin

  reply	other threads:[~2008-02-17 12:32 UTC|newest]

Thread overview: 116+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-02-15  6:48 [patch 0/6] MMU Notifiers V7 Christoph Lameter
2008-02-15  6:49 ` [patch 1/6] mmu_notifier: Core code Christoph Lameter
2008-02-16  3:37   ` Andrew Morton
2008-02-16  8:45     ` Avi Kivity
2008-02-16  8:56       ` Andrew Morton
2008-02-16  9:21         ` Avi Kivity
2008-02-16 10:41     ` Brice Goglin
2008-02-16 10:58       ` Andrew Morton
2008-02-16 19:31         ` Christoph Lameter
2008-02-16 19:21     ` Christoph Lameter
2008-02-17  3:01       ` Andrea Arcangeli
2008-02-17 12:24         ` Robin Holt [this message]
2008-02-17  5:04     ` Doug Maxey
2008-02-18 22:33   ` Roland Dreier
2008-02-15  6:49 ` [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges Christoph Lameter
2008-02-16  3:37   ` Andrew Morton
2008-02-16 19:26     ` Christoph Lameter
2008-02-19  8:54   ` Nick Piggin
2008-02-19 13:34     ` Andrea Arcangeli
2008-02-27 22:23       ` Christoph Lameter
2008-02-27 23:57         ` Andrea Arcangeli
2008-02-19 23:08   ` Nick Piggin
2008-02-20  1:00     ` Andrea Arcangeli
2008-02-20  3:00       ` Robin Holt
2008-02-20  3:11         ` Nick Piggin
2008-02-20  3:19           ` Robin Holt
2008-02-27 22:39       ` Christoph Lameter
2008-02-28  0:38         ` Andrea Arcangeli
2008-02-27 22:35     ` Christoph Lameter
2008-02-27 22:42       ` Jack Steiner
2008-02-28  0:10       ` Christoph Lameter
2008-02-28  0:11       ` Andrea Arcangeli
2008-02-28  0:14         ` Christoph Lameter
2008-02-28  0:52           ` Andrea Arcangeli
2008-02-28  1:03             ` Christoph Lameter
2008-02-28  1:10               ` Andrea Arcangeli
2008-02-28 18:43                 ` Christoph Lameter
2008-02-29  0:55                   ` Andrea Arcangeli
2008-02-29  0:59                     ` Christoph Lameter
2008-02-29 13:13                       ` Andrea Arcangeli
2008-02-29 19:55                         ` Christoph Lameter
2008-02-29 20:17                           ` Andrea Arcangeli
2008-02-29 21:03                             ` Christoph Lameter
2008-02-29 21:23                               ` Andrea Arcangeli
2008-02-29 21:29                                 ` Christoph Lameter
2008-02-29 21:34                                 ` Christoph Lameter
2008-02-29 21:48                                   ` Andrea Arcangeli
2008-02-29 22:12                                     ` Christoph Lameter
2008-02-29 22:41                                       ` Andrea Arcangeli
2008-02-28 10:53             ` Robin Holt
2008-03-03  5:11       ` Nick Piggin
2008-03-03 19:28         ` Christoph Lameter
2008-03-03 19:50           ` Nick Piggin
2008-03-04 18:58             ` Christoph Lameter
2008-03-05  0:52               ` Nick Piggin
2008-02-15  6:49 ` [patch 3/6] mmu_notifier: invalidate_page callbacks Christoph Lameter
2008-02-16  3:37   ` Andrew Morton
2008-02-16 11:07     ` Andrea Arcangeli
2008-02-16 19:22     ` Christoph Lameter
2008-02-16 19:54       ` Avi Kivity
2008-02-19  8:46       ` Nick Piggin
2008-02-19 13:30         ` Andrea Arcangeli
2008-02-18  1:51     ` Nick Piggin
2008-02-15  6:49 ` [patch 4/6] mmu_notifier: Skeleton driver for a simple mmu_notifier Christoph Lameter
2008-02-15  6:49 ` [patch 5/6] mmu_notifier: Support for drivers with revers maps (f.e. for XPmem) Christoph Lameter
2008-02-16  3:37   ` Andrew Morton
2008-02-16 19:28     ` Christoph Lameter
2008-02-19 23:55   ` Nick Piggin
2008-02-20  3:12     ` Robin Holt
2008-02-20  3:51       ` Nick Piggin
2008-02-20  9:00         ` Robin Holt
2008-02-20  9:05           ` Robin Holt
2008-02-21  4:20           ` Nick Piggin
2008-02-21 10:58             ` Robin Holt
2008-02-26  6:11               ` Nick Piggin
2008-02-26  7:21                 ` [ofa-general] " Gleb Natapov
2008-02-26  8:52                   ` Nick Piggin
2008-02-26  9:38                     ` Gleb Natapov
2008-02-26  9:52                       ` KOSAKI Motohiro
2008-02-26 12:28                     ` Robin Holt
2008-02-26 12:29                 ` Robin Holt
2008-02-27 22:43     ` Christoph Lameter
2008-02-28  0:42       ` Andrea Arcangeli
2008-02-28  1:01         ` Christoph Lameter
2008-02-15  6:49 ` [patch 6/6] mmu_rmap_notifier: Skeleton for complex driver that uses its own rmaps Christoph Lameter
2008-02-16 10:48 ` [PATCH] KVM swapping with MMU Notifiers V7 Andrea Arcangeli
2008-02-16 11:08   ` Andrew Morton
2008-02-18 12:17     ` Andrea Arcangeli
2008-02-16 11:51   ` Robin Holt
2008-02-18 12:35     ` Andrea Arcangeli
  -- strict thread matches above, loose matches on Subject: below --
2008-02-08 22:06 [patch 0/6] MMU Notifiers V6 Christoph Lameter
2008-02-08 22:06 ` [patch 1/6] mmu_notifier: Core code Christoph Lameter
2008-01-30  2:29 [patch 0/6] [RFC] MMU Notifiers V3 Christoph Lameter
2008-01-30  2:29 ` [patch 1/6] mmu_notifier: Core code Christoph Lameter
2008-01-30 15:37   ` Andrea Arcangeli
2008-01-30 15:53     ` Jack Steiner
2008-01-30 16:38       ` Andrea Arcangeli
2008-01-30 19:19       ` Christoph Lameter
2008-01-30 22:20         ` Robin Holt
2008-01-30 23:38           ` Andrea Arcangeli
2008-01-30 23:55             ` Christoph Lameter
2008-01-30 17:10     ` Peter Zijlstra
2008-01-30 19:28       ` Christoph Lameter
2008-01-30 18:02   ` Robin Holt
2008-01-30 19:08     ` Christoph Lameter
2008-01-30 19:14     ` Christoph Lameter
2008-01-28 20:28 [patch 0/6] [RFC] MMU Notifiers V2 Christoph Lameter
2008-01-28 20:28 ` [patch 1/6] mmu_notifier: Core code Christoph Lameter
2008-01-28 22:06   ` Christoph Lameter
2008-01-29  0:05   ` Robin Holt
2008-01-29  1:19     ` Christoph Lameter
2008-01-29 13:59   ` Andrea Arcangeli
2008-01-29 14:34     ` Andrea Arcangeli
2008-01-29 19:49     ` Christoph Lameter
2008-01-29 20:41       ` Avi Kivity
2008-01-29 16:07   ` Robin Holt
2008-02-05 18:05   ` Andy Whitcroft
2008-02-05 18:17     ` Peter Zijlstra
2008-02-05 18:19     ` Christoph Lameter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080217122438.GB11391@sgi.com \
    --to=holt@sgi.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=andrea@qumranet.com \
    --cc=avi@qumranet.com \
    --cc=clameter@sgi.com \
    --cc=daniel.blueman@quadrics.com \
    --cc=general@lists.openfabrics.org \
    --cc=izike@qumranet.com \
    --cc=kanojsarcar@yahoo.com \
    --cc=kvm-devel@lists.sourceforge.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=rdreier@cisco.com \
    --cc=steiner@sgi.com \
    --cc=swise@opengridcomputing.com \
    --subject='Re: [patch 1/6] mmu_notifier: Core code' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).