LKML Archive on lore.kernel.org
* Re: [PATCH] shrink hash sizes on small machines, take 2
@ 2004-05-18 12:32 Gerald Schaefer
[not found] ` <20040518174210.GD28735@waste.org>
From: Gerald Schaefer @ 2004-05-18 12:32 UTC (permalink / raw)
To: linux-kernel; +Cc: mpm
On Sat, Apr 10, 2004, Matt Mackall wrote:
> Base hash sizes on available memory rather than total memory. An
> additional 50% above current used memory is considered reserved for
> the purposes of hash sizing to compensate for the hashes themselves
> and the remainder of kernel and userspace initialization.
>
> Index: tiny/fs/dcache.c
> ===================================================================
> --- tiny.orig/fs/dcache.c 2004-03-25 13:36:09.000000000 -0600
> +++ tiny/fs/dcache.c 2004-04-10 18:14:42.000000000 -0500
> @@ -28,6 +28,7 @@
> #include <asm/uaccess.h>
> #include <linux/security.h>
> #include <linux/seqlock.h>
> +#include <linux/swap.h>
>
> #define DCACHE_PARANOIA 1
> /* #define DCACHE_DEBUG 1 */
> @@ -1619,13 +1620,21 @@
>
> void __init vfs_caches_init(unsigned long mempages)
> {
> - names_cachep = kmem_cache_create("names_cache",
> - PATH_MAX, 0,
> + unsigned long reserve;
> +
> + /* Base hash sizes on available memory, with a reserve equal to
> + 150% of current kernel size */
> +
> + reserve = (mempages - nr_free_pages()) * 3/2;
> + mempages -= reserve;
This calculation doesn't look right. The value of mempages is set to
num_physpages by the calling function start_kernel(), and that also
includes pages from a potential "mem=" kernel parameter (which we use
to reserve memory for loading DCSS segments on z/VM).
Whenever the number of reserved pages exceeds a certain limit (for
whatever reason, not only with "mem="), this calculation produces a
negative value for mempages, which wraps around to a very large value
because mempages is an unsigned long.
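A minimal userspace sketch of the wraparound, with made-up page counts
(assuming 4K pages; this is only an illustration, not the kernel code):

#include <stdio.h>

int main(void)
{
	/* Hypothetical numbers: 64MB in num_physpages, but most of it
	 * already reserved/used, so few pages are actually free when
	 * vfs_caches_init() runs. */
	unsigned long mempages = 16384;		/* 64MB in 4K pages */
	unsigned long nr_free  = 2048;		/* only 8MB really free */

	unsigned long reserve = (mempages - nr_free) * 3/2;	/* 21504 */
	mempages -= reserve;		/* underflows: 16384 - 21504 */

	printf("reserve  = %lu pages\n", reserve);
	printf("mempages = %lu pages (wrapped around)\n", mempages);
	return 0;
}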
I am not on the mailing list, so please put me on cc if you answer.
--
Gerald
* Re: [PATCH] shrink hash sizes on small machines, take 2
[not found] ` <20040518174210.GD28735@waste.org>
@ 2004-05-19 17:40 ` Gerald Schaefer
From: Gerald Schaefer @ 2004-05-19 17:40 UTC (permalink / raw)
To: Matt Mackall; +Cc: akpm, arnd, schwidefsky, linux-kernel
On Tuesday 18 May 2004 19:42, you wrote:
> num_physpages should represent the memory available to the kernel for
> normal use and should not reflect any other reserved memory. Otherwise
> it'll unduly influence the hash sizes both with and without this
> patch.
This is a very good point. Arnd and I have been looking at our "mem="
hack, and it does look a little bit fishy...
There are possibly more things in the s390 kernel affected by the "mem="
parameter than just num_physpages (max_low_pfn, etc.), so we will have
to take a closer look and see what Martin thinks about it.
> Index: mm/fs/dcache.c
> ===================================================================
> --- mm.orig/fs/dcache.c 2004-05-18 12:29:28.000000000 -0500
> +++ mm/fs/dcache.c 2004-05-18 12:37:42.000000000 -0500
> @@ -1625,7 +1625,7 @@
> /* Base hash sizes on available memory, with a reserve equal to
> 150% of current kernel size */
>
> - reserve = (mempages - nr_free_pages()) * 3/2;
> + reserve = min((mempages - nr_free_pages()) * 3/2, mempages - 1);
> mempages -= reserve;
>
> names_cachep = kmem_cache_create("names_cache",
This patch is OK for us; in our case (with "mem=") it would more or less
restore the previous situation (without the calculation, but still with
a potential memory problem on our side).
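For reference, the same made-up numbers as before with the min() clamp
applied (userspace sketch; the kernel's min() is a type-checked macro,
a plain one is used here):

#include <stdio.h>

#define min(a, b) ((a) < (b) ? (a) : (b))

int main(void)
{
	unsigned long mempages = 16384;		/* same hypothetical box */
	unsigned long nr_free  = 2048;

	/* Clamp the reserve so at least one page is left for sizing. */
	unsigned long reserve = min((mempages - nr_free) * 3/2, mempages - 1);
	mempages -= reserve;

	printf("reserve  = %lu pages\n", reserve);	/* 16383 */
	printf("mempages = %lu page(s), no wraparound\n", mempages);	/* 1 */
	return 0;
}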
--
Gerald
* Re: [PATCH] shrink hash sizes on small machines, take 2
2004-04-10 23:27 Matt Mackall
@ 2004-04-15 15:16 ` Marcelo Tosatti
From: Marcelo Tosatti @ 2004-04-15 15:16 UTC (permalink / raw)
To: Matt Mackall; +Cc: Andrew Morton, arjanv, linux-kernel
On Sat, Apr 10, 2004 at 06:27:07PM -0500, Matt Mackall wrote:
> The following attempts to cleanly address the low end of the problem;
> something like my calc_hash_order or Marcelo's approach should be used
> to attack the upper end.
>
> 8<
>
> Shrink hashes on small systems
>
> Base hash sizes on available memory rather than total memory. An
> additional 50% above current used memory is considered reserved for
> the purposes of hash sizing to compensate for the hashes themselves
> and the remainder of kernel and userspace initialization.
Hi Matt,
As far as I remember from my tests, booting with 8MB yields 0-order
(one page) dentry/inode hash tables, and 16MB yields a 1-order dentry
hash and a 0-order inode hash.
So we can't go lower than one page on <8MB anyway (and we don't). What
is the problem you are seeing?
Does your patch change the 16MB case to a 0-order dentry hash table?
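For context, a rough userspace reconstruction of how (as far as I can
tell) the dentry hash order currently falls out of mempages in
dcache_init(); an approximation of the heuristic, assuming 4K pages and
4-byte hash buckets, not the exact kernel code:

#include <stdio.h>

#define PAGE_SHIFT	12	/* assume 4K pages */
#define BUCKET_SIZE	4UL	/* assume 32-bit hlist_head */

/* Approximation: about one hash bucket per 8K of memory, rounded up
 * to a power-of-two number of pages. */
static unsigned int dhash_order(unsigned long mempages)
{
	unsigned long bytes = (mempages >> (13 - PAGE_SHIFT)) * BUCKET_SIZE;
	unsigned int order = 0;

	while (((1UL << order) << PAGE_SHIFT) < bytes)
		order++;
	return order;
}

int main(void)
{
	printf("8MB  -> order %u\n", dhash_order((8UL << 20) >> PAGE_SHIFT));	/* 0 */
	printf("16MB -> order %u\n", dhash_order((16UL << 20) >> PAGE_SHIFT));	/* 1 */
	return 0;
}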
On the higher end, we still need to figure out whether the "huge" hash
tables (1MB dentry/512K inode on a 4GB box, up to an 8MB dentry hash on
a 16GB box) are really worth it.
Arjan seems to be clipping the dentry hash to 128K on Red Hat's kernels.
I couldn't find much of a difference in dbench performance between a
1MB, 512K, or 128K dhash. Is someone willing to help with SDET or
different tests?
Thanks!
* [PATCH] shrink hash sizes on small machines, take 2
@ 2004-04-10 23:27 Matt Mackall
2004-04-15 15:16 ` Marcelo Tosatti
From: Matt Mackall @ 2004-04-10 23:27 UTC (permalink / raw)
To: Andrew Morton, linux-kernel
The following attempts to cleanly address the low end of the problem;
something like my calc_hash_order or Marcelo's approach should be used
to attack the upper end.
8<
Shrink hashes on small systems
Base hash sizes on available memory rather than total memory. An
additional 50% above current used memory is considered reserved for
the purposes of hash sizing to compensate for the hashes themselves
and the remainder of kernel and userspace initialization.
Index: tiny/fs/dcache.c
===================================================================
--- tiny.orig/fs/dcache.c 2004-03-25 13:36:09.000000000 -0600
+++ tiny/fs/dcache.c 2004-04-10 18:14:42.000000000 -0500
@@ -28,6 +28,7 @@
#include <asm/uaccess.h>
#include <linux/security.h>
#include <linux/seqlock.h>
+#include <linux/swap.h>
#define DCACHE_PARANOIA 1
/* #define DCACHE_DEBUG 1 */
@@ -1619,13 +1620,21 @@
void __init vfs_caches_init(unsigned long mempages)
{
- names_cachep = kmem_cache_create("names_cache",
- PATH_MAX, 0,
+ unsigned long reserve;
+
+ /* Base hash sizes on available memory, with a reserve equal to
+ 150% of current kernel size */
+
+ reserve = (mempages - nr_free_pages()) * 3/2;
+ mempages -= reserve;
+
+ names_cachep = kmem_cache_create("names_cache",
+ PATH_MAX, 0,
SLAB_HWCACHE_ALIGN, NULL, NULL);
if (!names_cachep)
panic("Cannot create names SLAB cache");
- filp_cachep = kmem_cache_create("filp",
+ filp_cachep = kmem_cache_create("filp",
sizeof(struct file), 0,
SLAB_HWCACHE_ALIGN, filp_ctor, filp_dtor);
if(!filp_cachep)
@@ -1633,7 +1642,7 @@
dcache_init(mempages);
inode_init(mempages);
- files_init(mempages);
+ files_init(mempages);
mnt_init(mempages);
bdev_cache_init();
chrdev_init();
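For illustration, roughly how the new sizing works out in a normal
(non-pathological) case, with made-up numbers for a 16MB box
(userspace sketch, assuming 4K pages; not the kernel code itself):

#include <stdio.h>

int main(void)
{
	/* Purely illustrative: a 16MB box with roughly 2MB already in
	 * use when vfs_caches_init() runs. */
	unsigned long mempages = 4096;		/* 16MB in 4K pages */
	unsigned long nr_free  = 3584;		/* ~14MB free */

	unsigned long reserve = (mempages - nr_free) * 3/2;	/* 768 */
	mempages -= reserve;					/* 3328 */

	printf("hashes sized for %lu pages (~%luMB) instead of 16MB\n",
	       mempages, mempages * 4096 / (1024 * 1024));
	return 0;
}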
--
Matt Mackall : http://www.selenic.com : Linux development and consulting