From: David Ahern <dsahern@gmail.com>
To: Dmitry Safonov <0x7f454c46@gmail.com>,
	Leonard Crestez <cdleonard@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>,
	"David S. Miller" <davem@davemloft.net>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Kuniyuki Iwashima <kuniyu@amazon.co.jp>,
	David Ahern <dsahern@kernel.org>,
	Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>,
	Jakub Kicinski <kuba@kernel.org>,
	Yuchung Cheng <ycheng@google.com>,
	Francesco Ruggeri <fruggeri@arista.com>,
	Mat Martineau <mathew.j.martineau@linux.intel.com>,
	Christoph Paasch <cpaasch@apple.com>,
	Ivan Delalande <colona@arista.com>,
	Priyaranjan Jha <priyarjha@google.com>,
	Menglong Dong <dong.menglong@zte.com.cn>,
	open list <linux-kernel@vger.kernel.org>,
	linux-crypto@vger.kernel.org,
	Network Development <netdev@vger.kernel.org>,
	Dmitry Safonov <dima@arista.com>
Subject: Re: [RFCv2 1/9] tcp: authopt: Initial support and key management
Date: Wed, 11 Aug 2021 11:15:52 -0600
Message-ID: <ac911d47-eef7-c97b-9a77-f386546b56e8@gmail.com>
In-Reply-To: <68749e37-8e29-7a51-2186-7692f5fd6a79@gmail.com>

On 8/11/21 8:31 AM, Dmitry Safonov wrote:
> On 8/11/21 9:29 AM, Leonard Crestez wrote:
>> On 8/10/21 11:41 PM, Dmitry Safonov wrote:
> [..]
>>>> +       u32 flags;
>>>> +       /* Wire identifiers */
>>>> +       u8 send_id, recv_id;
>>>> +       u8 alg_id;
>>>> +       u8 keylen;
>>>> +       u8 key[TCP_AUTHOPT_MAXKEYLEN];
>>>> +       struct rcu_head rcu;
>>>
>>> This is unaligned and will add padding.
>>
>> It's not clear that the padding matters. Moving rcu_head higher would
>> avoid it; is that what you're suggesting?
> 
> Yes.
> 
>>> I wonder if it's also worth saving some bytes by something like
>>> struct tcp_ao_key *key;
>>>
>>> With
>>> struct tcp_ao_key {
>>>        u8 keylen;
>>>        u8 key[0];
>>> };
>>>
>>> Hmm?
>>
>> This increases memory management complexity for very minor gains. Very
>> few tcp_authopt_key will ever be created.
> 
> The change doesn't seem to be big, like:
> --- a/net/ipv4/tcp_authopt.c
> +++ b/net/ipv4/tcp_authopt.c
> @@ -422,8 +422,16 @@ int tcp_set_authopt_key(struct sock *sk, sockptr_t
> optval, unsig>
>         key_info = __tcp_authopt_key_info_lookup(sk, info, opt.local_id);
>         if (key_info)
>                 tcp_authopt_key_del(sk, info, key_info);
> +
> +       key = sock_kmalloc(sk, sizeof(*key) + opt.keylen, GFP_KERNEL | __GFP_ZERO);
> +       if (!key) {
> +               tcp_authopt_alg_release(alg);
> +               return -ENOMEM;
> +       }
> +
>         key_info = sock_kmalloc(sk, sizeof(*key_info), GFP_KERNEL | __GFP_ZERO);
>         if (!key_info) {
> +               sock_kfree_s(sk, key, sizeof(*key) + opt.keylen);
>                 tcp_authopt_alg_release(alg);
>                 return -ENOMEM;
>         }
> 
> I don't know; a key length limit of 80 is probably enough for every
> user, but if one day it isn't, this ABI will be painful to extend.
> 
> [..]
>>>> +#define TCP_AUTHOPT                    38      /* TCP Authentication Option (RFC2385) */
>>>> +#define TCP_AUTHOPT_KEY                39      /* TCP Authentication Option update key (RFC2385) */
>>>
>>> RFC2385 is md5 one.
>>> Also, should there be TCP_AUTHOPT_ADD_KEY, TCP_AUTHOPT_DELETE_KEY
>>> instead of using flags inside setsockopt()? (no hard feelings)
>>
>> Fixed RFC reference.
>>
>> TCP_AUTHOPT_DELETE_KEY could be clearer, I just wanted to avoid bloating
>> the sockopt space. But there's not any scarcity.
>>
>> For reference tcp_md5 handles key deletion based on keylen == 0. This
>> seems wrong to me, an empty key is in fact valid though not realistic.
>>
>> If local_id is removed in favor of "full match on id and binding" then
>> the delete sockopt would still need most of a full struct
>> tcp_authopt_key anyway.
> 
> Sounds like a plan.
> 
> [..]
>>> I'm not sure what's the use of enum here, probably:
>>> #define TCP_AUTHOPT_FLAG_REJECT_UNEXPECTED BIT(2)
>>
>> This is an enum because it looks nice in kernel-doc. I couldn't find a
>> way to attach docs to a macro and include it somewhere else.
> 
> Yeah, ok, seems like good justification.
> 
>> BTW, the enum gains more members later.
>>
>> As for BIT(), it doesn't seem to be allowed in uapi and there were
>> recent changes removing such usage.
> 
> Ok, I just saw it's still used in include/uapi, but I wasn't aware of
> the removal.
> 
>>
>>> [..]
>>>> +struct tcp_authopt_key {
>>>> +       /** @flags: Combination of &enum tcp_authopt_key_flag */
>>>> +       __u32   flags;
>>>> +       /** @local_id: Local identifier */
>>>> +       __u32   local_id;
>>>> +       /** @send_id: keyid value for send */
>>>> +       __u8    send_id;
>>>> +       /** @recv_id: keyid value for receive */
>>>> +       __u8    recv_id;
>>>> +       /** @alg: One of &enum tcp_authopt_alg */
>>>> +       __u8    alg;
>>>> +       /** @keylen: Length of the key buffer */
>>>> +       __u8    keylen;
>>>> +       /** @key: Secret key */
>>>> +       __u8    key[TCP_AUTHOPT_MAXKEYLEN];
>>>> +       /**
>>>> +        * @addr: Key is only valid for this address
>>>> +        *
>>>> +        * Ignored unless TCP_AUTHOPT_KEY_ADDR_BIND flag is set
>>>> +        */
>>>> +       struct __kernel_sockaddr_storage addr;
>>>> +};
>>>
>>> It'll be an ABI if this is accepted. As it is, it can't support
>>> RFC5925 fully.
>>> Extending a syscall ABI is painful. I think even the initial ABI *must*
>>> support all possible features of the RFC.
>>> In other words, there must be src_addr, src_port, dst_addr and dst_port.
>>> I can see from the docs you've written that you don't want to support a
>>> mix of different addr/port MKTs, so you can return -EINVAL or -ENOTSUPP
>>> for any value but zero.
>>> This will create an ABI that can later be extended to fully support the
>>> RFC.
>>
>> RFC states that MKT connection identifiers can be specified using ranges
>> and wildcards and the details are up to the implementation. Keys are
>> *NOT* just bound to a classical TCP 4-tuple.
>>
>> * src_addr and src_port are implicit in socket binding. Maybe in theory
>> src_addr could apply for a server socket bound to 0.0.0.0:PORT but
>> userspace can just open more sockets.
>> * dst_port is not supported by MD5 and I can't think of any useful
>> use case. This is either well known (179 for BGP) or auto-allocated.
>> * tcp_md5 was recently enhanced to allow a "prefixlen" for addr and
>> "l3mdev" ifindex binding.
>>
>> This last point shows that the binding features people require can't be
>> easily predicted in advance so it's better to allow the rules to be
>> extended.
> 
> Yeah, I see both changes you mention went in the easy way, as they
> reused existing padding in the ABI structures.
> Ok, if you don't want to reserve src_addr/src_port/dst_addr/dst_port,
> then how about reserving some space for those instead?
> 
>>> The same is about key: I don't think you need to define/use
>>> tcp_authopt_alg.
>>> Just use algo name - that way TCP-AO will automatically be able to use
>>> any algo supported by crypto engine.
>>> See how xfrm does it, e.g.:
>>> struct xfrm_algo_auth {
>>>      char        alg_name[64];
>>>      unsigned int    alg_key_len;    /* in bits */
>>>      unsigned int    alg_trunc_len;  /* in bits */
>>>      char        alg_key[0];
>>> };
>>>
>>> So you can let a user choose maclen instead of hard-coding it.
>>> Much more extensible than what you propose.
>>
>> This complicates ABI and implementation with features that are not
>> needed. I'd much rather only expose an enum of real-world tcp-ao
>> algorithms.
> 
> I see it exactly the opposite way: a new enum unnecessarily complicates
> the ABI, compared with passing alg_name[] to the crypto engine. No need
> to add any support in tcp-ao, as the algorithms are already provided by
> the kernel.
> That is how it transparently works for ipsec, why not for tcp-ao?
> 
>>
>>> [..]
>>>> +#ifdef CONFIG_TCP_AUTHOPT
>>>> +       case TCP_AUTHOPT: {
>>>> +               struct tcp_authopt info;
>>>> +
>>>> +               if (get_user(len, optlen))
>>>> +                       return -EFAULT;
>>>> +
>>>> +               lock_sock(sk);
>>>> +               tcp_get_authopt_val(sk, &info);
>>>> +               release_sock(sk);
>>>> +
>>>> +               len = min_t(unsigned int, len, sizeof(info));
>>>> +               if (put_user(len, optlen))
>>>> +                       return -EFAULT;
>>>> +               if (copy_to_user(optval, &info, len))
>>>> +                       return -EFAULT;
>>>> +               return 0;
>>>> +       }
>>>
>>> I'm pretty sure it's not a good choice to write a partial tcp_authopt.
>>> And a user has no way to check what's the correct len on this kernel.
>>> Instead of len = min_t(unsigned int, len, sizeof(info)), it should be
>>> if (len != sizeof(info))
>>>      return -EINVAL;
>>
>> Purpose is to allow sockopts to grow as md5 has grown.
> 
> md5 has not grown. See above.

The MD5 uapi has grown; see, e.g., 8917a777be3ba and 6b102db50cdde. We
want similar capabilities for growth with this API.
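
For the getsockopt direction, the truncating copy quoted above can be
modelled in plain userspace C. This is only a sketch under assumptions:
struct demo_info and demo_get_info() are hypothetical names, and memcpy()
stands in for copy_to_user()/put_user(). It shows that a caller built
against a shorter struct still gets the fields it knows about, and learns
how many bytes were actually written.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical option struct: send_keyid stands for a later addition. */
struct demo_info {
	unsigned int flags;
	unsigned int send_keyid;
};

/*
 * Userspace model of a truncating getsockopt copy.
 * On entry *len is the caller's buffer size; on return it holds the
 * number of bytes written, which is never more than the caller asked for.
 */
static int demo_get_info(const struct demo_info *info, void *buf, size_t *len)
{
	size_t n = *len < sizeof(*info) ? *len : sizeof(*info);

	memcpy(buf, info, n);	/* stands in for copy_to_user() */
	*len = n;		/* stands in for put_user(len, optlen) */
	return 0;
}
```

An old binary that passes only sizeof(unsigned int) receives just flags,
while a newer one that passes a larger buffer gets the whole struct and a
length telling it how much this "kernel" filled in.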

> 
> Another issue with your approach
> 
> +       /* If userspace optlen is too short fill the rest with zeros */
> +       if (optlen > sizeof(opt))
> +               return -EINVAL;
> +       memset(&opt, 0, sizeof(opt));
> +       if (copy_from_sockptr(&opt, optval, optlen))
> +               return -EFAULT;
> 
> is that userspace compiled against an updated (larger) structure will
> fail on an older kernel. So, no extension is possible without breaking
> something.
> Which also reminds me that currently you don't validate opt.flags for
> flags unknown to the kernel.
> 
> Extending syscalls is impossible without breaking userspace if ABI is
> not designed with extensibility in mind. That was quite a big problem,
> and still is. Please, read this carefully:
> https://lwn.net/Articles/830666/
> 
> That is why I'm suggesting all these changes, which will be harder to
> make when/if your patches get accepted.
> For an example of how it should work, see copy_clone_args_from_user().
> 

Look at how TCP_ZEROCOPY_RECEIVE has grown over releases as an example
of how to properly handle this.
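
As a rough illustration of that kind of size-versioned handling on the
setsockopt side, here is a userspace model. It is a sketch under
assumptions, not the kernel code: struct demo_opt, DEMO_OPT_MIN_SIZE and
demo_copy_opt() are hypothetical, memcpy() stands in for copy_from_user(),
and the error codes are illustrative. A shorter (older) struct is
zero-extended; a longer (newer) one is accepted only if the bytes this
"kernel" does not know about are all zero.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option struct; the first revision had only flags and keylen. */
struct demo_opt {
	unsigned int flags;
	unsigned int keylen;
	unsigned int prefixlen;	/* field added in a later revision */
};

/* Size of the first revision: everything up to and including keylen. */
#define DEMO_OPT_MIN_SIZE \
	(offsetof(struct demo_opt, keylen) + sizeof(unsigned int))

/*
 * Userspace model of a size-versioned copy. dst is the struct as this
 * "kernel" understands it; src/src_len is what the caller passed.
 */
static int demo_copy_opt(struct demo_opt *dst, const void *src, size_t src_len)
{
	const unsigned char *p = src;
	size_t i;

	if (src_len < DEMO_OPT_MIN_SIZE)
		return -EINVAL;	/* shorter than the first revision */

	/* Newer caller: accept only if the part we don't know is all zero. */
	for (i = sizeof(*dst); i < src_len; i++)
		if (p[i] != 0)
			return -E2BIG;

	/* Older caller: fields it did not supply read as zero. */
	memset(dst, 0, sizeof(*dst));
	memcpy(dst, src, src_len < sizeof(*dst) ? src_len : sizeof(*dst));
	return 0;
}
```

With this shape, old binaries keep working on new kernels, and new
binaries work on old kernels as long as they leave the newer fields zero.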

