LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, netdev@vger.kernel.org
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Trond Myklebust <trond.myklebust@fys.uio.no>,
	Thomas Graf <tgraf@suug.ch>, David Miller <davem@davemloft.net>
Subject: [PATCH 00/29] swap over networked storage -v11
Date: Wed, 21 Feb 2007 15:43:04 +0100	[thread overview]
Message-ID: <20070221144304.512721000@taijtu.programming.kicks-ass.net> (raw)

(patches against 2.6.20-mm1)

There is a fundamental deadlock associated with paging; when writing out a page
to free memory requires free memory to complete. The usually solution is to
keep a small amount of memory available at all times so we can overcome this
problem. This however assumes the amount of memory needed for writeout is
(constant and) smaller than the provided reserve.

It is this latter assumption that breaks when doing writeout over network.
Network can take up an unspecified amount of memory while waiting for a reply
to our write request. This re-introduces the deadlock; we might never complete
the writeout, for we might not have enough memory to receive the completion
message.

The proposed solution is simple, only allow traffic servicing the VM to make
use of the reserves. Since the VM is always present to service, this limited
amount of memory can sustain a full connection; after a packet has been
processed its memory can be re-used for the next packet.

This however implies you know what packets are for whom, which generally
speaking you don't. Hence we need to receive all packets but discard them as
soon as we encounter a non VM bound packet allocated from the reserves.

Also knowing it is headed towards the VM needs a little help, hence we
introduce the socket flag SOCK_VMIO to mark sockets with.

Of course, since we are paging all this has to happen in kernel-space, since
user-space might just not be there.

Since packet processing might also require memory, this all also implies that
those auxiliary allocations may use the reserves when an emergency packet is
processed. This is accomplished by using PF_MEMALLOC.

How much memory is to be reserved is also an issue, enough memory to saturate
both the route cache and IP fragment reassembly, along with various constants.

This patch-set comes in 5 parts:

1) introduce the memory reserve and make the SLAB allocator play nice with it.
   patches 01-09

2) add some needed infrastructure to the network code
   patches 10-12

3) implement the idea outlined above
   patches 13-19

4) teach the swap machinery to use generic address_spaces
   patches 20-23

5) implement swap over NFS using all the new stuff
   patches 24-29
-- 


             reply	other threads:[~2007-02-21 15:34 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-02-21 14:43 Peter Zijlstra [this message]
2007-02-21 14:43 ` [PATCH 01/29] mm: page allocation rank Peter Zijlstra
2007-02-21 14:43 ` [PATCH 02/29] mm: slab allocation fairness Peter Zijlstra
2007-02-21 15:33   ` Pekka Enberg
2007-02-21 14:43 ` [PATCH 03/29] mm: allow PF_MEMALLOC from softirq context Peter Zijlstra
2007-02-21 15:53   ` Arjan van de Ven
2007-02-22  9:16     ` Peter Zijlstra
2007-02-22  9:48       ` Arjan van de Ven
2007-02-21 14:43 ` [PATCH 04/29] mm: serialize access to min_free_kbytes Peter Zijlstra
2007-02-21 14:43 ` [PATCH 05/29] mm: emergency pool Peter Zijlstra
2007-02-21 14:43 ` [PATCH 06/29] mm: __GFP_EMERGENCY Peter Zijlstra
2007-02-21 14:43 ` [PATCH 07/29] mm: allow mempool to fall back to memalloc reserves Peter Zijlstra
2007-02-21 14:43 ` [PATCH 08/29] mm: kmem_cache_objs_to_pages() Peter Zijlstra
2007-02-21 15:47   ` Pekka Enberg
2007-02-22  9:28     ` Peter Zijlstra
2007-02-22  9:45       ` Pekka Enberg
2007-02-22  9:49         ` Pekka Enberg
2007-02-21 14:43 ` [PATCH 09/29] selinux: tag avc cache alloc as non-critical Peter Zijlstra
2007-02-21 15:22   ` James Morris
2007-02-21 14:43 ` [PATCH 10/29] net: wrap sk->sk_backlog_rcv() Peter Zijlstra
2007-02-21 14:43 ` [PATCH 11/29] net: packet split receive api Peter Zijlstra
2007-02-21 14:43 ` [PATCH 12/29] net: remove alloc_skb_from_cache Peter Zijlstra
2007-02-21 14:43 ` [PATCH 13/29] netvm: link network to vm layer Peter Zijlstra
2007-02-21 14:43 ` [PATCH 14/29] netvm: INET reserves Peter Zijlstra
2007-02-21 14:43 ` [PATCH 15/29] netvm: hook skb allocation to reserves Peter Zijlstra
2007-02-21 14:43 ` [PATCH 16/29] netvm: filter emergency skbs Peter Zijlstra
2007-02-21 14:43 ` [PATCH 17/29] netvm: prevent a TCP specific deadlock Peter Zijlstra
2007-02-21 14:43 ` [PATCH 18/29] netfilter: notify about NF_QUEUE vs emergency skbs Peter Zijlstra
2007-02-24 15:27   ` Patrick McHardy
2007-02-24 15:46     ` Peter Zijlstra
2007-02-24 16:17       ` Patrick McHardy
2007-02-24 16:18         ` Peter Zijlstra
2007-02-24 16:40           ` Patrick McHardy
2007-02-24 16:55             ` Peter Zijlstra
2007-02-21 14:43 ` [PATCH 19/29] netvm: skb processing Peter Zijlstra
2007-02-21 14:43 ` [PATCH 20/29] uml: rename arch/um remove_mapping() Peter Zijlstra
2007-02-21 14:43 ` [PATCH 21/29] mm: prepare swap entry methods for use in page methods Peter Zijlstra
2007-02-21 14:43 ` [PATCH 22/29] mm: add support for non block device backed swap files Peter Zijlstra
2007-02-21 14:43 ` [PATCH 23/29] mm: methods for teaching filesystems about PG_swapcache pages Peter Zijlstra
2007-02-21 14:43 ` [PATCH 24/29] nfs: remove mempools Peter Zijlstra
2007-02-21 14:43 ` [PATCH 25/29] nfs: only use stable storage for swap Peter Zijlstra
2007-02-21 14:43 ` [PATCH 26/29] nfs: teach the NFS client how to treat PG_swapcache pages Peter Zijlstra
2007-02-21 14:43 ` [PATCH 27/29] nfs: disable data cache revalidation for swapfiles Peter Zijlstra
2007-02-21 14:43 ` [PATCH 28/29] nfs: enable swap on NFS Peter Zijlstra
2007-02-21 14:43 ` [PATCH 29/29] balance_dirty_pages() vs throttle_vm_writeout() deadlock Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070221144304.512721000@taijtu.programming.kicks-ass.net \
    --to=a.p.zijlstra@chello.nl \
    --cc=davem@davemloft.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=netdev@vger.kernel.org \
    --cc=tgraf@suug.ch \
    --cc=trond.myklebust@fys.uio.no \
    --subject='Re: [PATCH 00/29] swap over networked storage -v11' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).