LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Christoph Lameter <clameter@sgi.com>
To: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Cc: Roland Dreier <rdreier@cisco.com>, Rik van Riel <riel@redhat.com>,
	steiner@sgi.com, Andrea Arcangeli <andrea@qumranet.com>,
	a.p.zijlstra@chello.nl, izike@qumranet.com,
	linux-kernel@vger.kernel.org, avi@qumranet.com,
	linux-mm@kvack.org, daniel.blueman@quadrics.com,
	Robin Holt <holt@sgi.com>,
	general@lists.openfabrics.org,
	Andrew Morton <akpm@linux-foundation.org>,
	kvm-devel@lists.sourceforge.net
Subject: Re: [ofa-general] Re: Demand paging for memory regions
Date: Tue, 12 Feb 2008 18:35:09 -0800 (PST)	[thread overview]
Message-ID: <Pine.LNX.4.64.0802121819530.12328@schroedinger.engr.sgi.com> (raw)
In-Reply-To: <20080213012638.GD31435@obsidianresearch.com>

On Tue, 12 Feb 2008, Jason Gunthorpe wrote:

> The problem is that the existing wire protocols do not have a
> provision for doing an 'are you ready' or 'I am not ready' exchange
> and they are not designed to store page tables on both sides as you
> propose. The remote side can send RDMA WRITE traffic at any time after
> the RDMA region is established. The local side must be able to handle
> it. There is no way to signal that a page is not ready and the remote
> should not send.
> 
> This means the only possible implementation is to stall/discard at the
> local adaptor when a RDMA WRITE is recieved for a page that has been
> reclaimed. This is what leads to deadlock/poor performance..

You would only use the wire protocols *after* having established the RDMA 
region. The notifier chains allows a RDMA region (or parts thereof) to be 
down on demand by the VM. The region can be reestablished if one of 
the side accesses it. I hope I got that right. Not much exposure to 
Infiniband so far.

Lets say you have a two systems A and B. Each has their memory region MemA 
and MemB. Each side also has page tables for this region PtA and PtB.

Now you establish a RDMA connection between both side. The pages in both
MemB and MemA are present and so are entries in PtA and PtB. RDMA 
traffic can proceed.

The VM on system A now gets into a situation in which memory becomes 
heavily used by another (maybe non RDMA process) and after checking that 
there was no recent reference to MemA and MemB (via a notifier aging 
callback) decides to reclaim the memory from MemA.

In that case it will notify the RDMA subsystem on A that it is trying to
reclaim a certain page.

The RDMA subsystem on A will then send a message to B notifying it that 
the memory will be going away. B now has to remove its corresponding page 
from memory (and drop the entry in PtB) and confirm to A that this has 
happened. RDMA traffic is then stopped for this page. Then A can also 
remove its page, the corresponding entry in PtA and the page is reclaimed 
or pushed out to swap completing the page reclaim.

If either side then accesses the page again then the reverse process 
happens. If B accesses the page then it wil first of all incur a page 
fault because the entry in PtB is missing. The fault will then cause a 
message to be send to A to establish the page again. A will create an 
entry in PtA and will then confirm to B that the page was established. At 
that point RDMA operations can occur again.

So the whole scheme does not really need a hardware page table in the RDMA 
hardware. The page tables of the two systems A and B are sufficient.

The scheme can also be applied to a larger range than only a single page. 
The RDMA subsystem could tear down a large section when reclaim is 
pushing on it and then reestablish it as needed.

Swapping and page reclaim is certainly not something that improves the 
speed of the application affected by swapping and page reclaim but it 
allows the VM to manage memory effectively if multiple loads are runing on 
a system.


  parent reply	other threads:[~2008-02-13  2:35 UTC|newest]

Thread overview: 81+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-02-08 22:06 [patch 0/6] MMU Notifiers V6 Christoph Lameter
2008-02-08 22:06 ` [patch 1/6] mmu_notifier: Core code Christoph Lameter
2008-02-08 22:06 ` [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges Christoph Lameter
2008-02-08 22:06 ` [patch 3/6] mmu_notifier: invalidate_page callbacks Christoph Lameter
2008-02-08 22:06 ` [patch 4/6] mmu_notifier: Skeleton driver for a simple mmu_notifier Christoph Lameter
2008-02-08 22:06 ` [patch 5/6] mmu_notifier: Support for drivers with revers maps (f.e. for XPmem) Christoph Lameter
2008-02-08 22:06 ` [patch 6/6] mmu_rmap_notifier: Skeleton for complex driver that uses its own rmaps Christoph Lameter
2008-02-08 22:23 ` [patch 0/6] MMU Notifiers V6 Andrew Morton
2008-02-08 23:32   ` Christoph Lameter
2008-02-08 23:36     ` Robin Holt
2008-02-08 23:41       ` Christoph Lameter
2008-02-08 23:43         ` Robin Holt
2008-02-08 23:56           ` Andrew Morton
2008-02-09  0:05             ` Christoph Lameter
2008-02-09  0:12               ` [ofa-general] " Roland Dreier
2008-02-09  0:16                 ` Christoph Lameter
2008-02-09  0:22                   ` Roland Dreier
2008-02-09  0:36                     ` Christoph Lameter
2008-02-09  1:24                       ` Andrea Arcangeli
2008-02-09  1:27                         ` Christoph Lameter
2008-02-09  1:56                           ` Andrea Arcangeli
2008-02-09  2:16                             ` Christoph Lameter
2008-02-09 12:55                               ` Rik van Riel
2008-02-09 21:46                                 ` Christoph Lameter
2008-02-11 22:40                                   ` Demand paging for memory regions (was Re: MMU Notifiers V6) Roland Dreier
2008-02-12 22:01                                     ` Steve Wise
2008-02-12 22:10                                       ` Christoph Lameter
2008-02-12 22:41                                         ` [ofa-general] Re: Demand paging for memory regions Roland Dreier
2008-02-12 23:14                                           ` Felix Marti
2008-02-13  0:57                                             ` Christoph Lameter
2008-02-14 15:09                                             ` Steve Wise
2008-02-14 15:53                                               ` Robin Holt
2008-02-14 16:23                                                 ` Steve Wise
2008-02-14 17:48                                                   ` Caitlin Bestler
2008-02-14 19:39                                               ` Christoph Lameter
2008-02-14 20:17                                                 ` Caitlin Bestler
2008-02-14 20:20                                                   ` Christoph Lameter
2008-02-14 22:43                                                     ` Caitlin Bestler
2008-02-14 22:48                                                       ` Christoph Lameter
2008-02-15  1:26                                                         ` Caitlin Bestler
2008-02-15  2:37                                                           ` Christoph Lameter
2008-02-15 18:09                                                             ` Caitlin Bestler
2008-02-15 18:45                                                               ` Christoph Lameter
2008-02-15 18:53                                                                 ` Caitlin Bestler
2008-02-15 20:02                                                                   ` Christoph Lameter
2008-02-15 20:14                                                                     ` Caitlin Bestler
2008-02-15 22:50                                                                       ` Christoph Lameter
2008-02-15 23:50                                                                         ` Caitlin Bestler
2008-02-12 23:23                                           ` Jason Gunthorpe
2008-02-13  1:01                                             ` Christoph Lameter
2008-02-13  1:26                                               ` Jason Gunthorpe
2008-02-13  1:45                                                 ` Steve Wise
2008-02-13  2:35                                                 ` Christoph Lameter [this message]
2008-02-13  3:25                                                   ` Jason Gunthorpe
2008-02-13  3:56                                                     ` Patrick Geoffray
2008-02-13  4:26                                                       ` Jason Gunthorpe
2008-02-13  4:47                                                         ` Patrick Geoffray
2008-02-13 18:51                                                     ` Christoph Lameter
2008-02-13 19:51                                                       ` Jason Gunthorpe
2008-02-13 20:36                                                         ` Christoph Lameter
2008-02-13  4:09                                                   ` Christian Bell
2008-02-13 19:00                                                     ` Christoph Lameter
2008-02-13 19:46                                                       ` Christian Bell
2008-02-13 20:32                                                         ` Christoph Lameter
2008-02-13 22:44                                                           ` Kanoj Sarcar
2008-02-13 23:02                                                             ` Christoph Lameter
2008-02-13 23:43                                                               ` Kanoj Sarcar
2008-02-13 23:48                                                                 ` Jesse Barnes
2008-02-14  0:56                                                                 ` [ofa-general] " Andrea Arcangeli
2008-02-14 19:35                                                                 ` Christoph Lameter
2008-02-13 23:23                                                     ` Pete Wyckoff
2008-02-14  0:01                                                       ` Jason Gunthorpe
2008-02-27 22:11                                                         ` Christoph Lameter
2008-02-13  1:55                                               ` Christian Bell
2008-02-13  2:19                                                 ` Christoph Lameter
2008-02-13  0:56                                           ` Christoph Lameter
2008-02-13 12:11                                           ` Christoph Raisch
2008-02-13 19:02                                             ` Christoph Lameter
2008-02-09  0:12               ` [patch 0/6] MMU Notifiers V6 Andrew Morton
2008-02-09  0:18                 ` Christoph Lameter
2008-02-13 14:31 ` Jack Steiner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.64.0802121819530.12328@schroedinger.engr.sgi.com \
    --to=clameter@sgi.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=andrea@qumranet.com \
    --cc=avi@qumranet.com \
    --cc=daniel.blueman@quadrics.com \
    --cc=general@lists.openfabrics.org \
    --cc=holt@sgi.com \
    --cc=izike@qumranet.com \
    --cc=jgunthorpe@obsidianresearch.com \
    --cc=kvm-devel@lists.sourceforge.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=rdreier@cisco.com \
    --cc=riel@redhat.com \
    --cc=steiner@sgi.com \
    --subject='Re: [ofa-general] Re: Demand paging for memory regions' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).