LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: NeilBrown <neilb@suse.com>
To: Paolo Abeni <pabeni@redhat.com>, Thomas Graf <tgraf@suug.ch>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	David Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [4/4] rhashtable: improve rhashtable_walk stability when stop/start used.
Date: Fri, 06 Jul 2018 08:56:03 +1000	[thread overview]
Message-ID: <87bmblz3zw.fsf@notabene.neil.brown.name> (raw)
In-Reply-To: <86f305ff238d7cdac7ab20b0d6395cc6571cf4e0.camel@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 6361 bytes --]

On Thu, Jul 05 2018, Paolo Abeni wrote:

> On Tue, 2018-04-24 at 08:29 +1000, NeilBrown wrote:
>> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
>> index 81edf1ab38ab..9427b5766134 100644
>> --- a/lib/rhashtable.c
>> +++ b/lib/rhashtable.c
>> @@ -727,6 +727,7 @@ int rhashtable_walk_start_check(struct rhashtable_iter *iter)
>>  	__acquires(RCU)
>>  {
>>  	struct rhashtable *ht = iter->ht;
>> +	bool rhlist = ht->rhlist;
>>  
>>  	rcu_read_lock();
>>  
>> @@ -735,13 +736,52 @@ int rhashtable_walk_start_check(struct rhashtable_iter *iter)
>>  		list_del(&iter->walker.list);
>>  	spin_unlock(&ht->lock);
>>  
>> -	if (!iter->walker.tbl && !iter->end_of_table) {
>> +	if (iter->end_of_table)
>> +		return 0;
>> +	if (!iter->walker.tbl) {
>>  		iter->walker.tbl = rht_dereference_rcu(ht->tbl, ht);
>>  		iter->slot = 0;
>>  		iter->skip = 0;
>>  		return -EAGAIN;
>>  	}
>>  
>> +	if (iter->p && !rhlist) {
>> +		/*
>> +		 * We need to validate that 'p' is still in the table, and
>> +		 * if so, update 'skip'
>> +		 */
>> +		struct rhash_head *p;
>> +		int skip = 0;
>> +		rht_for_each_rcu(p, iter->walker.tbl, iter->slot) {
>> +			skip++;
>> +			if (p == iter->p) {
>> +				iter->skip = skip;
>> +				goto found;
>> +			}
>> +		}
>> +		iter->p = NULL;
>> +	} else if (iter->p && rhlist) {
>> +		/* Need to validate that 'list' is still in the table, and
>> +		 * if so, update 'skip' and 'p'.
>> +		 */
>> +		struct rhash_head *p;
>> +		struct rhlist_head *list;
>> +		int skip = 0;
>> +		rht_for_each_rcu(p, iter->walker.tbl, iter->slot) {
>> +			for (list = container_of(p, struct rhlist_head, rhead);
>> +			     list;
>> +			     list = rcu_dereference(list->next)) {
>> +				skip++;
>> +				if (list == iter->list) {
>> +					iter->p = p;
>> +					skip = skip;
>> +					goto found;
>> +				}
>> +			}
>> +		}
>> +		iter->p = NULL;
>> +	}
>> +found:
>>  	return 0;
>>  }
>>  EXPORT_SYMBOL_GPL(rhashtable_walk_start_check);
>
> While testing new code that uses the rhashtable walker, I'm obeserving
> an use after free, that is apparently caused by the above:
>
> [  146.834815] ==================================================================
> [  146.842933] BUG: KASAN: use-after-free in inet_frag_worker+0x9f/0x210
> [  146.850120] Read of size 4 at addr ffff881b6b9342d8 by task kworker/13:1/177
> [  146.857984] 
> [  146.859645] CPU: 13 PID: 177 Comm: kworker/13:1 Not tainted 4.18.0-rc3.mirror_unclone_6_frag_dbg+ #1974
> [  146.870128] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.1.7 06/16/2016
> [  146.878478] Workqueue: events inet_frag_worker
> [  146.883433] Call Trace:
> [  146.886162]  dump_stack+0x90/0xe3
> [  146.889861]  print_address_description+0x6a/0x2a0
> [  146.895109]  kasan_report+0x176/0x2d0
> [  146.899193]  ? inet_frag_worker+0x9f/0x210
> [  146.903762]  inet_frag_worker+0x9f/0x210
> [  146.908142]  process_one_work+0x24f/0x6e0
> [  146.912614]  ? process_one_work+0x1a6/0x6e0
> [  146.917285]  worker_thread+0x4e/0x3d0
> [  146.921373]  kthread+0x106/0x140
> [  146.924970]  ? process_one_work+0x6e0/0x6e0
> [  146.929637]  ? kthread_bind+0x10/0x10
> [  146.933723]  ret_from_fork+0x3a/0x50
> [  146.937717] 
> [  146.939377] Allocated by task 177:
> [  146.943170]  kasan_kmalloc+0x86/0xb0
> [  146.947158]  __kmalloc_node+0x181/0x430
> [  146.951438]  kvmalloc_node+0x4f/0x70
> [  146.955427]  alloc_bucket_spinlocks+0x34/0xa0
> [  146.960286]  bucket_table_alloc.isra.13+0x61/0x180
> [  146.965630]  rhashtable_rehash_alloc+0x26/0x80
> [  146.970585]  rht_deferred_worker+0x394/0x810
> [  146.975348]  process_one_work+0x24f/0x6e0
> [  146.979819]  worker_thread+0x4e/0x3d0
> [  146.983902]  kthread+0x106/0x140
> [  146.987502]  ret_from_fork+0x3a/0x50
> [  146.991487] 
> [  146.993146] Freed by task 90:
> [  146.996455]  __kasan_slab_free+0x11d/0x180
> [  147.001025]  kfree+0x113/0x350
> [  147.004431]  bucket_table_free+0x1c/0x70
> [  147.008806]  rcu_process_callbacks+0x6c8/0x880
> [  147.013762]  __do_softirq+0xd5/0x505
> [  147.017747] 
> [  147.019407] The buggy address belongs to the object at ffff881b6b934200
> [  147.019407]  which belongs to the cache kmalloc-8192 of size 8192
> [  147.033574] The buggy address is located 216 bytes inside of
> [  147.033574]  8192-byte region [ffff881b6b934200, ffff881b6b936200)
> [  147.046773] The buggy address belongs to the page:
> [  147.052116] page:ffffea006dae4c00 count:1 mapcount:0 mapping:ffff880107c0e400 index:0x0 compound_mapcount: 0
> [  147.063086] flags: 0x17ffffc0008100(slab|head)
> [  147.068043] raw: 0017ffffc0008100 dead000000000100 dead000000000200 ffff880107c0e400
> [  147.076684] raw: 0000000000000000 0000000000030003 00000001ffffffff 0000000000000000
> [  147.085324] page dumped because: kasan: bad access detected
> [  147.091540] 
> [  147.093210] Memory state around the buggy address:
> [  147.098553]  ffff881b6b934180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [  147.106613]  ffff881b6b934200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [  147.114670] >ffff881b6b934280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [  147.122730]                                                     ^
> [  147.129527]  ffff881b6b934300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [  147.137587]  ffff881b6b934380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [  147.145646] ==================================================================
>
> Reverting the above change avoid the issue. 
>
> I *think* that reusing iter->p after a
> rcu_read_lock()/rcu_read_unlock() is unsafe:
> if the table has been reashed we can still and reuse 'p', but the
> related grace period could be already expired.

Thanks for the report.
It would be unsafe to dereference iter->p, but the code doesn't.
At least, it doesn't dereference it until it has searched through the
table and confirmed that the pointer is still in the table.

Could you please use scripts/faddr2line to identify exactly where the
error is occurring?
e.g
   ./scripts/faddr2line vmlinux.o inet_frag_worker+0x9f/0x210

(any .o which contains inet_frag_worker should work).
Thanks,
NeilBrown

>
> I can't think of any better fix than a revert. Other opinions welcome!
>
> Paolo

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

  reply	other threads:[~2018-07-05 22:56 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-23 22:29 [PATCH 0/4] A few rhashtables cleanups NeilBrown
2018-04-23 22:29 ` [PATCH 4/4] rhashtable: improve rhashtable_walk stability when stop/start used NeilBrown
2018-07-05 11:20   ` [4/4] " Paolo Abeni
2018-07-05 22:56     ` NeilBrown [this message]
2018-07-07 22:11     ` NeilBrown
2018-04-23 22:29 ` [PATCH 2/4] rhashtable: Revise incorrect comment on r{hl, hash}table_walk_enter() NeilBrown
2018-04-23 22:29 ` [PATCH 3/4] rhashtable: reset iter when rhashtable_walk_start sees new table NeilBrown
2018-04-23 22:29 ` [PATCH 1/4] rhashtable: remove outdated comments about grow_decision etc NeilBrown
2018-04-24 17:38 ` [PATCH 0/4] A few rhashtables cleanups David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87bmblz3zw.fsf@notabene.neil.brown.name \
    --to=neilb@suse.com \
    --cc=davem@davemloft.net \
    --cc=herbert@gondor.apana.org.au \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=tgraf@suug.ch \
    --subject='Re: [4/4] rhashtable: improve rhashtable_walk stability when stop/start used.' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).