Netdev Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Yonghong Song <yhs@fb.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: <bpf@vger.kernel.org>, <netdev@vger.kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>, <kernel-team@fb.com>,
	"Paul E . McKenney" <paulmck@kernel.org>
Subject: Re: [PATCH bpf v2 1/3] bpf: fix a rcu_sched stall issue with bpf task/task_file iterator
Date: Tue, 18 Aug 2020 17:30:37 -0700	[thread overview]
Message-ID: <ac2c7081-8e6b-76c6-e032-ed2be3727e4d@fb.com> (raw)
In-Reply-To: <20200819000547.7qv32me2fxviwdkx@ast-mbp.dhcp.thefacebook.com>



On 8/18/20 5:05 PM, Alexei Starovoitov wrote:
> On Tue, Aug 18, 2020 at 03:23:09PM -0700, Yonghong Song wrote:
>>
>> We did not use cond_resched() since for some iterators, e.g.,
>> netlink iterator, where rcu read_lock critical section spans between
>> consecutive seq_ops->next(), which makes impossible to do cond_resched()
>> in the key while loop of function bpf_seq_read().
> 
> but after this patch we can, right?

We can do cond_resched() after seq->op->stop(). See more below.

> 
>>   
>> +/* maximum visited objects before bailing out */
>> +#define MAX_ITER_OBJECTS	1000000
>> +
>>   /* bpf_seq_read, a customized and simpler version for bpf iterator.
>>    * no_llseek is assumed for this file.
>>    * The following are differences from seq_read():
>> @@ -79,7 +82,7 @@ static ssize_t bpf_seq_read(struct file *file, char __user *buf, size_t size,
>>   {
>>   	struct seq_file *seq = file->private_data;
>>   	size_t n, offs, copied = 0;
>> -	int err = 0;
>> +	int err = 0, num_objs = 0;
>>   	void *p;
>>   
>>   	mutex_lock(&seq->lock);
>> @@ -135,6 +138,7 @@ static ssize_t bpf_seq_read(struct file *file, char __user *buf, size_t size,
>>   	while (1) {
>>   		loff_t pos = seq->index;
>>   
>> +		num_objs++;
>>   		offs = seq->count;
>>   		p = seq->op->next(seq, p, &seq->index);
>>   		if (pos == seq->index) {
>> @@ -153,6 +157,15 @@ static ssize_t bpf_seq_read(struct file *file, char __user *buf, size_t size,
>>   		if (seq->count >= size)
>>   			break;
>>   
>> +		if (num_objs >= MAX_ITER_OBJECTS) {
>> +			if (offs == 0) {
>> +				err = -EAGAIN;
>> +				seq->op->stop(seq, p);
>> +				goto done;
>> +			}
>> +			break;
>> +		}
>> +
> 
> should this block be after op->show() and error processing?
> Otherwise bpf_iter_inc_seq_num() will be incorrectly incremented?

The purpose of op->next() is to calculate the "next" object position,
stored in the seq private data. So for next read() syscall, start()
will try to fetch the data based on the info in seq private data.

This is true for conditions "if (seq->count >= size) break"
in the above so next op->start() can try to locate the correct
object. The same is for this -EAGAIN thing.

> 
>>   		err = seq->op->show(seq, p);
>>   		if (err > 0) {
>>   			bpf_iter_dec_seq_num(seq);
> 
> After op->stop() we can do cond_resched() in all cases,
> since rhashtable walk does rcu_unlock in stop() callback, right?

Yes, we can. I am thinking since we return to user space,
cond_resched() might not be needed since returning to user space
will trigger some kind of scheduling. This patch fixed
the rcu stall issue. But if my understanding is incorrect,
I am happy to add cond_reched().

> I think copy_to_user() and mutex_unlock() don't do cond_resched()
> equivalent work.
> 

  reply	other threads:[~2020-08-19  0:31 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-18 22:23 [PATCH bpf v2 0/3] bpf: two fixes for bpf iterators Yonghong Song
2020-08-18 22:23 ` [PATCH bpf v2 1/3] bpf: fix a rcu_sched stall issue with bpf task/task_file iterator Yonghong Song
2020-08-19  0:05   ` Alexei Starovoitov
2020-08-19  0:30     ` Yonghong Song [this message]
2020-08-19  0:49       ` Alexei Starovoitov
2020-08-18 22:23 ` [PATCH bpf v2 2/3] bpf: avoid visit same object multiple times Yonghong Song
2020-08-18 22:23 ` [PATCH bpf v2 3/3] bpftool: handle EAGAIN error code properly in pids collection Yonghong Song
2020-08-18 22:30   ` Andrii Nakryiko
2020-08-18 22:28 ` [PATCH bpf v2 0/3] bpf: two fixes for bpf iterators Yonghong Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ac2c7081-8e6b-76c6-e032-ed2be3727e4d@fb.com \
    --to=yhs@fb.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=kernel-team@fb.com \
    --cc=netdev@vger.kernel.org \
    --cc=paulmck@kernel.org \
    --subject='Re: [PATCH bpf v2 1/3] bpf: fix a rcu_sched stall issue with bpf task/task_file iterator' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).