Subject: Re: [RFC][PATCH] split file and anonymous page queues #2
From: Lee Schermerhorn
To: Rik van Riel
Cc: linux-mm, linux-kernel
In-Reply-To: <45FF3052.0@redhat.com>
References: <45FF3052.0@redhat.com>
Organization: HP/OSLO
Date: Tue, 20 Mar 2007 12:24:57 -0400
Message-Id: <1174407897.5664.38.camel@localhost>

On Mon, 2007-03-19 at 20:52 -0400, Rik van Riel wrote:
> Split the anonymous and file backed pages out onto their own pageout
> queues. This way we do not unnecessarily churn through lots of anonymous
> pages when we do not want to swap them out anyway.
>
> This should (with additional tuning) be a great step forward in
> scalability, allowing Linux to run well on very large systems where
> scanning through the anonymous memory (on our way to the page cache
> memory we do want to evict) is slowing systems down significantly.
>
> This patch has been stress tested and seems to work, but has not
> been fine tuned or benchmarked yet. For now the swappiness parameter
> can be used to tweak swap aggressiveness up and down as desired, but
> in the long run we may want to simply measure IO cost of page cache
> and anonymous memory and auto-adjust.
>
> We apply pressure to each of the sets of pageout queues based on:
> - the size of each queue
> - the fraction of recently referenced pages in each queue,
>   not counting used-once file pages
> - swappiness (file IO is more efficient than swap IO)
>
> Please take this patch for a spin and let me know what goes well
> and what goes wrong.

Rik:

Which tree is the patch against? The diffs say 2.6.20.x86_64, but the
patch doesn't apply to 2.6.20, which doesn't use __inc_zone_state() for
things like nr_active, nr_inactive, ...

Also, in the snippet:

>--- linux-2.6.20.x86_64/mm/swap_state.c.vmsplit 2007-02-04 13:44:54.000000000 -0500
>+++ linux-2.6.20.x86_64/mm/swap_state.c 2007-03-19 12:00:23.000000000 -0400
>@@ -354,7 +354,7 @@ struct page *read_swap_cache_async(swp_e
> 		/*
> 		 * Initiate read into locked page and return.
> 		 */
>-		lru_cache_add_active(new_page);
>+		lru_cache_add_anon(new_page);
> 		swap_readpage(NULL, new_page);
> 		return new_page;
> 	}

Should that be lru_cache_add_active_anon()? Or did you intend to add it
to the inactive anon list?

Finally, could you [should you?] skip scanning the anon lists--or at
least the inactive anon list--when nr_swap_pages == 0? The anon pages
aren't going anywhere, right? I think this would obviate Christoph L's
patch to exclude anon pages from the LRU when there is no swap.

Lee
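
A rough user-space sketch of the last question above, skipping the anon
lists when nr_swap_pages == 0. This is an illustration only, not code
from the patch or from any kernel tree: struct toy_zone, enum lru_kind,
lru_is_anon() and toy_scan_targets() are hypothetical names, and sizing
each target as "list size >> priority" is an assumed simplification.
Only nr_swap_pages comes from the discussion.

```c
#include <stdbool.h>

/* Hypothetical list indices; the patch may name or order these differently. */
enum lru_kind {
	LRU_INACTIVE_ANON,
	LRU_ACTIVE_ANON,
	LRU_INACTIVE_FILE,
	LRU_ACTIVE_FILE,
	NR_LRU_KINDS
};

/* Toy stand-in for per-zone list sizes; this is not the kernel's struct zone. */
struct toy_zone {
	unsigned long nr_pages[NR_LRU_KINDS];
};

static bool lru_is_anon(int kind)
{
	return kind == LRU_INACTIVE_ANON || kind == LRU_ACTIVE_ANON;
}

/*
 * Compute how many pages to scan on each list for one reclaim pass.
 * Assumption: the target is simply the list size shifted by priority.
 * With no swap space left, anon pages cannot be written out, so both
 * anon lists get a target of zero and the pass touches file pages only.
 */
static void toy_scan_targets(const struct toy_zone *zone, long nr_swap_pages,
			     int priority, unsigned long nr_to_scan[NR_LRU_KINDS])
{
	for (int kind = 0; kind < NR_LRU_KINDS; kind++) {
		if (nr_swap_pages == 0 && lru_is_anon(kind)) {
			nr_to_scan[kind] = 0;	/* nowhere to put these pages */
			continue;
		}
		nr_to_scan[kind] = zone->nr_pages[kind] >> priority;
	}
}
```

The sketch skips both anon lists. The narrower variant raised in the
question would test only for LRU_INACTIVE_ANON and keep scanning the
active anon list.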