From: Yonghong Song <yhs@fb.com>
To: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Cc: <andrii@kernel.org>, <ast@kernel.org>, <benh@amazon.com>,
	<bpf@vger.kernel.org>, <daniel@iogearbox.net>,
	<davem@davemloft.net>, <john.fastabend@gmail.com>, <kafai@fb.com>,
	<kpsingh@kernel.org>, <kuba@kernel.org>, <kuni1840@gmail.com>,
	<netdev@vger.kernel.org>, <songliubraving@fb.com>
Subject: Re: [PATCH bpf-next 1/2] bpf: af_unix: Implement BPF iterator for UNIX domain socket.
Date: Fri, 30 Jul 2021 00:09:08 -0700
Message-ID: <65fa9a82-6e1b-da0f-9cad-9b26771980fd@fb.com>
In-Reply-To: <20210730065359.43302-1-kuniyu@amazon.co.jp>



On 7/29/21 11:53 PM, Kuniyuki Iwashima wrote:
> From:   Yonghong Song <yhs@fb.com>
> Date:   Thu, 29 Jul 2021 23:24:41 -0700
>> On 7/29/21 4:36 PM, Kuniyuki Iwashima wrote:
>>> This patch implements the BPF iterator for the UNIX domain socket.
>>>
>>> Currently, the batch optimization introduced for the TCP iterator in the
>>> commit 04c7820b776f ("bpf: tcp: Bpf iter batching and lock_sock") is not
>>> applied.  It will require replacing the big lock for the hash table with
>>> small locks for each hash list not to block other processes.
>>
>> Thanks for the contribution. The patch looks okay except
>> missing seq_ops->stop implementation, see below for more explanation.
>>
>>>
>>> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
>>> ---
>>>    include/linux/btf_ids.h |  3 +-
>>>    net/unix/af_unix.c      | 78 +++++++++++++++++++++++++++++++++++++++++
>>>    2 files changed, 80 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h
>>> index 57890b357f85..bed4b9964581 100644
>>> --- a/include/linux/btf_ids.h
>>> +++ b/include/linux/btf_ids.h
>>> @@ -172,7 +172,8 @@ extern struct btf_id_set name;
>>>    	BTF_SOCK_TYPE(BTF_SOCK_TYPE_TCP_TW, tcp_timewait_sock)		\
>>>    	BTF_SOCK_TYPE(BTF_SOCK_TYPE_TCP6, tcp6_sock)			\
>>>    	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UDP, udp_sock)			\
>>> -	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UDP6, udp6_sock)
>>> +	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UDP6, udp6_sock)			\
>>> +	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UNIX, unix_sock)
>>>    
>>>    enum {
>>>    #define BTF_SOCK_TYPE(name, str) name,
>>> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
>>> index 89927678c0dc..d45ad87e3a49 100644
>>> --- a/net/unix/af_unix.c
>>> +++ b/net/unix/af_unix.c
>>> @@ -113,6 +113,7 @@
>>>    #include <linux/security.h>
>>>    #include <linux/freezer.h>
>>>    #include <linux/file.h>
>>> +#include <linux/btf_ids.h>
>>>    
>>>    #include "scm.h"
>>>    
>>> @@ -2935,6 +2936,49 @@ static const struct seq_operations unix_seq_ops = {
>>>    	.stop   = unix_seq_stop,
>>>    	.show   = unix_seq_show,
>>>    };
>>> +
>>> +#ifdef CONFIG_BPF_SYSCALL
>>> +struct bpf_iter__unix {
>>> +	__bpf_md_ptr(struct bpf_iter_meta *, meta);
>>> +	__bpf_md_ptr(struct unix_sock *, unix_sk);
>>> +	uid_t uid __aligned(8);
>>> +};
>>> +
>>> +static int unix_prog_seq_show(struct bpf_prog *prog, struct bpf_iter_meta *meta,
>>> +			      struct unix_sock *unix_sk, uid_t uid)
>>> +{
>>> +	struct bpf_iter__unix ctx;
>>> +
>>> +	meta->seq_num--;  /* skip SEQ_START_TOKEN */
>>> +	ctx.meta = meta;
>>> +	ctx.unix_sk = unix_sk;
>>> +	ctx.uid = uid;
>>> +	return bpf_iter_run_prog(prog, &ctx);
>>> +}
>>> +
>>> +static int bpf_iter_unix_seq_show(struct seq_file *seq, void *v)
>>> +{
>>> +	struct bpf_iter_meta meta;
>>> +	struct bpf_prog *prog;
>>> +	struct sock *sk = v;
>>> +	uid_t uid;
>>> +
>>> +	if (v == SEQ_START_TOKEN)
>>> +		return 0;
>>> +
>>> +	uid = from_kuid_munged(seq_user_ns(seq), sock_i_uid(sk));
>>> +	meta.seq = seq;
>>> +	prog = bpf_iter_get_info(&meta, false);
>>> +	return unix_prog_seq_show(prog, &meta, v, uid);
>>> +}
>>> +
>>> +static const struct seq_operations bpf_iter_unix_seq_ops = {
>>> +	.start	= unix_seq_start,
>>> +	.next	= unix_seq_next,
>>> +	.stop	= unix_seq_stop,
>>
>> Although it is not required for /proc/net/unix, we should still
>> implement bpf_iter version of seq_ops->stop here. The main purpose
>> of bpf_iter specific seq_ops->stop is to call bpf program one
>> more time after ALL elements have been traversed. Such
>> functionality is implemented in all other bpf_iter variants.
> 
> Thanks for your review!
> I will implement the extra call in the next spin.
> 
> Just out of curiosity, is there a specific use case for the last call?

We don't have a use case for it in dumps similar to /proc/net/... etc.
The original thinking was to permit in-kernel aggregation: the bpf
program run from seq_ops->stop() gets an indication that this is the
last invocation for the iterator, at which point it may wrap up the
aggregation and send/signal the result to user space.
I am not sure whether people already use this feature, or whether
they do it a different way (e.g., checking the map value directly
from user space once read() returns a length of 0). But the bpf
seq_ops->stop() provides an in-kernel way for a bpf program to
respond to the end of iteration.
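
The aggregation pattern described above can be sketched in plain
userspace C. This is only a mock, not kernel or BPF code: mock_sock,
mock_ctx, agg_state, count_prog and mock_iterate are hypothetical names
made up for illustration, and the sketch assumes the extra call is
signalled by a NULL socket pointer in the context, as the existing
bpf_iter variants appear to do.

```c
#include <stddef.h>

/* Hypothetical stand-in for a socket; not a kernel type. */
struct mock_sock {
	unsigned int uid;
};

/* Hypothetical stand-in for the iterator context handed to the program. */
struct mock_ctx {
	const struct mock_sock *sk;	/* NULL on the final, post-traversal call */
	unsigned long seq_num;		/* number of elements visited so far */
};

/* Aggregation state, playing the role a BPF map would play. */
struct agg_state {
	unsigned long total;		/* sockets counted during traversal */
	unsigned long reported;		/* result published on the last call */
	int done;			/* set once the last call has been seen */
};

typedef int (*mock_prog_fn)(const struct mock_ctx *ctx, struct agg_state *st);

/* The "program": count sockets, then publish the count on the last call. */
static int count_prog(const struct mock_ctx *ctx, struct agg_state *st)
{
	if (!ctx->sk) {
		/* Assumed end-of-iteration signal: wrap up the aggregation. */
		st->reported = st->total;
		st->done = 1;
		return 0;
	}

	st->total++;
	return 0;
}

/* The "iterator": one call per element, then one extra call from "stop". */
static void mock_iterate(const struct mock_sock *socks, size_t n,
			 mock_prog_fn prog, struct agg_state *st)
{
	struct mock_ctx ctx = { .sk = NULL, .seq_num = 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		ctx.sk = &socks[i];
		prog(&ctx, st);
		ctx.seq_num++;
	}

	/* Equivalent of the bpf_iter specific seq_ops->stop() call. */
	ctx.sk = NULL;
	prog(&ctx, st);
}
```

With three entries in socks, mock_iterate() calls count_prog() four
times, and only the last call sets st.done and copies the total into
st.reported. Without that extra call the program would have no
in-kernel point at which it knows the traversal is complete.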

> 

