From: Jes Sorensen <jes@trained-monkey.org>
To: Alexander Viro <viro@math.psu.edu>
Cc: Andrew Morton <akpm@osdl.org>,
	"William Lee Irwin, III" <wli@holomorphy.com>,
	linux-kernel@vger.kernel.org, jbarnes@sgi.com, steiner@sgi.com
Subject: hash table sizes
Date: Tue, 25 Nov 2003 08:35:49 -0500
Message-ID: <16323.23221.835676.999857@gargle.gargle.HOWL>

Hi,

On NUMA systems with way too much memory, the current algorithms for
determining the size of the inode and dentry hash tables end up trying
to allocate tables so big that they may not fit within the physical
memory of a single node. For example, on a 256-node system with 512GB of
RAM and 16KB pages, booting basically eats up all the memory on a node
before it completes. The inode and dentry hashes are 256MB each and the
IP routing table hash is 128MB.

I have tried changing the algorithm as below and it seems to produce
reasonable results and almost identical numbers for the smaller /
mid-sized configs I looked at.

This is not meant to be a final patch; any input or opinions are welcome.

Thanks,
Jes

--- orig/linux-2.6.0-test10/fs/dcache.c	Sat Oct 25 11:42:58 2003
+++ linux-2.6.0-test10/fs/dcache.c	Tue Nov 25 05:33:04 2003
@@ -1549,9 +1549,8 @@
 static void __init dcache_init(unsigned long mempages)
 {
 	struct hlist_head *d;
-	unsigned long order;
 	unsigned int nr_hash;
-	int i;
+	int i, order;
 
 	/* 
 	 * A constructor could be added for stable state like the lists,
@@ -1571,12 +1570,17 @@
 	
 	set_shrinker(DEFAULT_SEEKS, shrink_dcache_memory);
 
+#if 0
 #if PAGE_SHIFT < 13
 	mempages >>= (13 - PAGE_SHIFT);
 #endif
 	mempages *= sizeof(struct hlist_head);
 	for (order = 0; ((1UL << order) << PAGE_SHIFT) < mempages; order++)
 		;
+#endif
+	mempages >>= (23 - (PAGE_SHIFT - 1));
+	order = max(2, fls(mempages));
+	order = min(12, order);
 
 	do {
 		unsigned long tmp;
@@ -1594,7 +1598,7 @@
 			__get_free_pages(GFP_ATOMIC, order);
 	} while (dentry_hashtable == NULL && --order >= 0);
 
-	printk(KERN_INFO "Dentry cache hash table entries: %d (order: %ld, %ld bytes)\n",
+	printk(KERN_INFO "Dentry cache hash table entries: %d (order: %d, %ld bytes)\n",
 			nr_hash, order, (PAGE_SIZE << order));
 
 	if (!dentry_hashtable)
--- orig/linux-2.6.0-test10/fs/inode.c	Sat Oct 25 11:44:53 2003
+++ linux-2.6.0-test10/fs/inode.c	Tue Nov 25 05:33:27 2003
@@ -1333,17 +1333,21 @@
 void __init inode_init(unsigned long mempages)
 {
 	struct hlist_head *head;
-	unsigned long order;
 	unsigned int nr_hash;
-	int i;
+	int i, order;
 
 	for (i = 0; i < ARRAY_SIZE(i_wait_queue_heads); i++)
 		init_waitqueue_head(&i_wait_queue_heads[i].wqh);
 
+#if 0
 	mempages >>= (14 - PAGE_SHIFT);
 	mempages *= sizeof(struct hlist_head);
 	for (order = 0; ((1UL << order) << PAGE_SHIFT) < mempages; order++)
 		;
+#endif
+	mempages >>= (23 - (PAGE_SHIFT - 1));
+	order = max(2, fls(mempages));
+	order = min(12, order);
 
 	do {
 		unsigned long tmp;
