LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: David Howells <dhowells@redhat.com>
To: linux-cachefs@redhat.com
Cc: dhowells@redhat.com, Anna Schumaker <anna.schumaker@netapp.com>,
Steve French <sfrench@samba.org>,
Dominique Martinet <asmadeus@codewreck.org>,
Jeff Layton <jlayton@redhat.com>,
David Wysochanski <dwysocha@redhat.com>,
linux-afs@lists.infradead.org, linux-nfs@vger.kernel.org,
linux-cifs@vger.kernel.org, ceph-devel@vger.kernel.org,
v9fs-developer@lists.sourceforge.net,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 10/12] fscache: Fix cookie key hashing
Date: Mon, 21 Jun 2021 22:46:58 +0100 [thread overview]
Message-ID: <162431201844.2908479.8293647220901514696.stgit@warthog.procyon.org.uk> (raw)
In-Reply-To: <162431188431.2908479.14031376932042135080.stgit@warthog.procyon.org.uk>
The current hash algorithm used for hashing cookie keys is really bad,
producing almost no dispersion (after a test kernel build, ~30000 files
were split over just 18 out of the 32768 hash buckets).
Borrow the full_name_hash() hash function into fscache to do the hashing
for cookie keys and, in the future, volume keys.
I don't want to use full_name_hash() as-is because I want the hash value to
be consistent across arches and over time as the hash value produced may
get used on disk.
I can also optimise parts of it away as the key will always be a padded
array of aligned 32-bit words.
Signed-off-by: David Howells <dhowells@redhat.com>
---
fs/fscache/cookie.c | 14 +-------------
fs/fscache/internal.h | 2 ++
fs/fscache/main.c | 39 +++++++++++++++++++++++++++++++++++++++
3 files changed, 42 insertions(+), 13 deletions(-)
diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index ec9bce33085f..2558814193e9 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -87,10 +87,8 @@ void fscache_free_cookie(struct fscache_cookie *cookie)
static int fscache_set_key(struct fscache_cookie *cookie,
const void *index_key, size_t index_key_len)
{
- unsigned long long h;
u32 *buf;
int bufs;
- int i;
bufs = DIV_ROUND_UP(index_key_len, sizeof(*buf));
@@ -104,17 +102,7 @@ static int fscache_set_key(struct fscache_cookie *cookie,
}
memcpy(buf, index_key, index_key_len);
-
- /* Calculate a hash and combine this with the length in the first word
- * or first half word
- */
- h = (unsigned long)cookie->parent;
- h += index_key_len + cookie->type;
-
- for (i = 0; i < bufs; i++)
- h += buf[i];
-
- cookie->key_hash = h ^ (h >> 32);
+ cookie->key_hash = fscache_hash(0, buf, bufs);
return 0;
}
diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
index 200082cafdda..a49136c63e4b 100644
--- a/fs/fscache/internal.h
+++ b/fs/fscache/internal.h
@@ -74,6 +74,8 @@ extern struct workqueue_struct *fscache_object_wq;
extern struct workqueue_struct *fscache_op_wq;
DECLARE_PER_CPU(wait_queue_head_t, fscache_object_cong_wait);
+extern unsigned int fscache_hash(unsigned int salt, unsigned int *data, unsigned int n);
+
static inline bool fscache_object_congested(void)
{
return workqueue_congested(WORK_CPU_UNBOUND, fscache_object_wq);
diff --git a/fs/fscache/main.c b/fs/fscache/main.c
index c1e6cc9091aa..4207f98e405f 100644
--- a/fs/fscache/main.c
+++ b/fs/fscache/main.c
@@ -93,6 +93,45 @@ static struct ctl_table fscache_sysctls_root[] = {
};
#endif
+/*
+ * Mixing scores (in bits) for (7,20):
+ * Input delta: 1-bit 2-bit
+ * 1 round: 330.3 9201.6
+ * 2 rounds: 1246.4 25475.4
+ * 3 rounds: 1907.1 31295.1
+ * 4 rounds: 2042.3 31718.6
+ * Perfect: 2048 31744
+ * (32*64) (32*31/2 * 64)
+ */
+#define HASH_MIX(x, y, a) \
+ ( x ^= (a), \
+ y ^= x, x = rol32(x, 7),\
+ x += y, y = rol32(y,20),\
+ y *= 9 )
+
+static inline unsigned int fold_hash(unsigned long x, unsigned long y)
+{
+ /* Use arch-optimized multiply if one exists */
+ return __hash_32(y ^ __hash_32(x));
+}
+
+/*
+ * Generate a hash. This is derived from full_name_hash(), but we want to be
+ * sure it is arch independent and that it doesn't change as bits of the
+ * computed hash value might appear on disk. The caller also guarantees that
+ * the hashed data will be a series of aligned 32-bit words.
+ */
+unsigned int fscache_hash(unsigned int salt, unsigned int *data, unsigned int n)
+{
+ unsigned int a, x = 0, y = salt;
+
+ for (; n; n--) {
+ a = *data++;
+ HASH_MIX(x, y, a);
+ }
+ return fold_hash(x, y);
+}
+
/*
* initialise the fs caching module
*/
next prev parent reply other threads:[~2021-06-21 21:47 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-21 21:44 [PATCH 00/12] fscache: Some prep work for fscache rewrite David Howells
2021-06-21 21:44 ` [PATCH 01/12] fscache: Select netfs stats if fscache stats are enabled David Howells
2021-06-21 21:45 ` [PATCH 02/12] netfs: Move cookie debug ID to struct netfs_cache_resources David Howells
2021-06-21 21:45 ` [PATCH 03/12] cachefiles: Use file_inode() rather than accessing ->f_inode David Howells
2021-06-21 21:45 ` [PATCH 04/12] fscache: Add a cookie debug ID and use that in traces David Howells
2021-06-21 21:45 ` [PATCH 05/12] fscache: Procfile to display cookies David Howells
2021-06-21 21:45 ` [PATCH 06/12] fscache, cachefiles: Remove the histogram stuff David Howells
2021-06-21 21:46 ` [PATCH 07/12] fscache: Remove the object list procfile David Howells
2021-06-21 21:46 ` [PATCH 08/12] fscache: Change %p in format strings to something else David Howells
2021-06-21 21:46 ` [PATCH 09/12] cachefiles: " David Howells
2021-06-21 21:46 ` David Howells [this message]
2021-08-24 16:11 ` [PATCH 10/12] fscache: Fix cookie key hashing Jeff Layton
2021-08-25 14:04 ` David Howells
2021-06-21 21:47 ` [PATCH 11/12] fscache: Fix fscache_cookie_put() to not deref after dec David Howells
2021-08-24 14:24 ` Jeff Layton
2021-08-25 14:05 ` David Howells
2021-06-21 21:47 ` [PATCH 12/12] fscache: Use refcount_t for the cookie refcount instead of atomic_t David Howells
2021-08-24 14:25 ` [PATCH 00/12] fscache: Some prep work for fscache rewrite Jeff Layton
2021-08-27 12:31 ` [PATCH v2 04/12] fscache: Add a cookie debug ID and use that in traces David Howells
2021-08-27 12:49 ` [Linux-cachefs] " Marc Dionne
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=162431201844.2908479.8293647220901514696.stgit@warthog.procyon.org.uk \
--to=dhowells@redhat.com \
--cc=anna.schumaker@netapp.com \
--cc=asmadeus@codewreck.org \
--cc=ceph-devel@vger.kernel.org \
--cc=dwysocha@redhat.com \
--cc=jlayton@redhat.com \
--cc=linux-afs@lists.infradead.org \
--cc=linux-cachefs@redhat.com \
--cc=linux-cifs@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=sfrench@samba.org \
--cc=v9fs-developer@lists.sourceforge.net \
--subject='Re: [PATCH 10/12] fscache: Fix cookie key hashing' \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).