Netdev Archive on lore.kernel.org
* [PATCH net] af_unix: fix garbage collect vs. MSG_PEEK
From: Greg Kroah-Hartman @ 2021-07-26 15:36 UTC
To: netdev
Cc: linux-kernel, davem, kuba, Miklos Szeredi, stable, Greg Kroah-Hartman
From: Miklos Szeredi <mszeredi@redhat.com>
Gc assumes that in-flight sockets that don't have an external ref can't
gain one while unix_gc_lock is held. That is true because
unix_notinflight() will be called before detaching fds, which takes
unix_gc_lock.
Only MSG_PEEK was somehow overlooked. That one also clones the fds, also
keeping them in the skb. But through MSG_PEEK an external reference can
definitely be gained without ever touching unix_gc_lock.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
net/unix/af_unix.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
Note, this is a resend of this old submission that somehow fell through
the cracks:
https://lore.kernel.org/netdev/CAOssrKcfncAYsQWkfLGFgoOxAQJVT2hYVWdBA6Cw7hhO8RJ_wQ@mail.gmail.com/
and was never submitted "properly" and this issue never seemed to get
resolved properly.
I've cleaned it up and made the change much smaller and localized to
only one file. I kept Miklos's authorship as he did the hard work on
this, I just removed lines and fixed a formatting issue :)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 23c92ad15c61..cdea997aa5bf 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1526,6 +1526,18 @@ static int unix_getname(struct socket *sock, struct sockaddr *uaddr, int peer)
 	return err;
 }
 
+static void unix_peek_fds(struct scm_cookie *scm, struct sk_buff *skb)
+{
+	scm->fp = scm_fp_dup(UNIXCB(skb).fp);
+
+	/* During garbage collection it is assumed that in-flight sockets don't
+	 * get a new external reference. So we need to wait until current run
+	 * finishes.
+	 */
+	spin_lock(&unix_gc_lock);
+	spin_unlock(&unix_gc_lock);
+}
+
 static int unix_scm_to_skb(struct scm_cookie *scm, struct sk_buff *skb, bool send_fds)
 {
 	int err = 0;
@@ -2175,7 +2187,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg,
 		sk_peek_offset_fwd(sk, size);
 
 		if (UNIXCB(skb).fp)
-			scm.fp = scm_fp_dup(UNIXCB(skb).fp);
+			unix_peek_fds(&scm, skb);
 	}
 	err = (flags & MSG_TRUNC) ? skb->len - skip : size;
 
@@ -2418,7 +2430,7 @@ static int unix_stream_read_generic(struct unix_stream_read_state *state,
 			/* It is questionable, see note in unix_dgram_recvmsg.
 			 */
 			if (UNIXCB(skb).fp)
-				scm.fp = scm_fp_dup(UNIXCB(skb).fp);
+				unix_peek_fds(&scm, skb);
 
 			sk_peek_offset_fwd(sk, chunk);
 
--
2.32.0
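The comment in unix_peek_fds() says the empty spin_lock()/spin_unlock()
pair is there to wait until the current garbage-collection run finishes.
A minimal userspace sketch of that idiom follows, using pthreads rather
than kernel spinlocks. Every name in it (model_gc_lock, model_gc_run,
model_wait_for_gc, model_barrier_demo) is hypothetical, and it assumes the
collector holds the lock for the part of the run that must not be
disturbed. It models only the waiting, not the collector itself.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names are kernel API. */
static pthread_mutex_t model_gc_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_bool model_gc_started;	/* set while the collector holds the lock */
static bool model_gc_finished;		/* written only under model_gc_lock */

/* Collector thread: the whole (modelled) run happens under the lock. */
static void *model_gc_run(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&model_gc_lock);
	atomic_store(&model_gc_started, true);
	/* ... the candidate scan would happen here ... */
	model_gc_finished = true;
	pthread_mutex_unlock(&model_gc_lock);
	return NULL;
}

/* Empty critical section: returns only after any current holder unlocks. */
static void model_wait_for_gc(void)
{
	pthread_mutex_lock(&model_gc_lock);
	pthread_mutex_unlock(&model_gc_lock);
}

/* Returns true if the run in progress had finished once the barrier returned. */
static bool model_barrier_demo(void)
{
	pthread_t gc;
	bool seen;

	if (pthread_create(&gc, NULL, model_gc_run, NULL) != 0)
		return false;
	/* Wait until the collector has taken the lock (it may already be done). */
	while (!atomic_load(&model_gc_started))
		sched_yield();
	model_wait_for_gc();
	seen = model_gc_finished;	/* ordered after the collector's unlock */
	pthread_join(gc, NULL);
	return seen;
}
```

Calling model_barrier_demo() should return true: the collector sets
model_gc_started only after taking the lock, so the barrier cannot return
before the collector has unlocked. Older toolchains may need -pthread when
linking.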
* Re: [PATCH net] af_unix: fix garbage collect vs. MSG_PEEK
From: Kees Cook @ 2021-07-26 19:27 UTC
To: Greg Kroah-Hartman
Cc: netdev, linux-kernel, davem, kuba, Miklos Szeredi, stable
On Mon, Jul 26, 2021 at 05:36:21PM +0200, Greg Kroah-Hartman wrote:
> From: Miklos Szeredi <mszeredi@redhat.com>
>
> Gc assumes that in-flight sockets that don't have an external ref can't
I think this commit log could be expanded. I had to really study things
to even begin to understand what was going on. I assume "Gc" here means
specifically unix_gc()?
> gain one while unix_gc_lock is held. That is true because
> unix_notinflight() will be called before detaching fds, which takes
> unix_gc_lock.
In reading the code, I *think* what is being protected by unix_gc_lock is
user->unix_inflight, u->inflight, unix_tot_inflight, and gc_inflight_list?
I note that unix_tot_inflight isn't an atomic but is read outside of
locking by unix_release_sock() and wait_for_unix_gc(), which seems wrong
(or at least inefficient).
But regardless, are the "external references" the f_count (i.e. get_file()
of u->sk.sk_socket->file) being changed by scm_fp_dup() and read by
unix_gc() (i.e. file_count())? It seems the test in unix_gc() is for
making sure f_count isn't out of sync with u->inflight (is this the
corresponding "internal" reference?):
total_refs = file_count(u->sk.sk_socket->file);
inflight_refs = atomic_long_read(&u->inflight);
BUG_ON(inflight_refs < 1);
BUG_ON(total_refs < inflight_refs);
if (total_refs == inflight_refs) {
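To make the comparison quoted above concrete, here is a minimal userspace
sketch in C. It follows the reading proposed in this message, namely that a
socket has no external reference exactly when its total count equals its
in-flight count. The struct and function names are hypothetical stand-ins,
not kernel API, and the asserts mirror the two BUG_ON() lines.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the two counters being compared. */
struct model_sock {
	long total_refs;	/* models file_count(): every reference to the file */
	long inflight_refs;	/* models u->inflight: references held by queued skbs */
};

/* True when every reference comes from an in-flight skb (no external ref). */
static bool model_is_gc_candidate(const struct model_sock *s)
{
	assert(s->inflight_refs >= 1);			/* BUG_ON(inflight_refs < 1) */
	assert(s->total_refs >= s->inflight_refs);	/* BUG_ON(total_refs < inflight_refs) */
	return s->total_refs == s->inflight_refs;
}
```

With total_refs = 1 and inflight_refs = 1 the socket is a candidate. One
extra reference taken without touching inflight_refs (total_refs = 2) makes
it look externally referenced.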
> Only MSG_PEEK was somehow overlooked. That one also clones the fds, also
> keeping them in the skb. But through MSG_PEEK an external reference can
> definitely be gained without ever touching unix_gc_lock.
The idea appears to be that all scm_fp_dup() callers need to refresh the
u->inflight counts which is what unix_attach_fds() and unix_detach_fds()
do. Why is lock/unlock sufficient for unix_peek_fds()?
I assume the rationale is because MSG_PEEK uses a temporary scm, which
only gets fput() clean-up on destroy ("inflight" is neither incremented
nor decremented at any point in the scm lifetime).
But I don't see why any of this helps.
unix_attach_fds():
	fget(), spin_lock(), inflight++, spin_unlock()
unix_detach_fds():
	spin_lock(), inflight--, spin_unlock(), fput()
unix_peek_fds():
	fget(), spin_lock(), spin_unlock()
unix_gc():
	spin_lock(), "total_refs == inflight_refs" to hitlist,
	spin_unlock(), free hitlist skbs
Doesn't this mean total_refs and inflight_refs can still get out of
sync? What keeps an skb from being "visible" to unix_peek_fds() between
the unix_gc() spin_unlock() and the unix_peek_fds() fget()?
A: unix_gc():
	spin_lock()
	find "total_refs == inflight_refs", add to hitlist
	spin_unlock()
B: unix_peek_fds():
	fget()
A: unix_gc():
	walk hitlist and free(skb)
B: unix_peek_fds():
	*use freed skb*
I feel like I must be missing something since the above race would
appear to exist even for unix_attach_fds()/unix_detach_fds():
A: unix_gc():
	spin_lock()
	find "total_refs == inflight_refs", add to hitlist
	spin_unlock()
B: unix_attach_fds():
	fget()
A: unix_gc():
	walk hitlist and free(skb)
B: unix_attach_fds():
	*use freed skb*
I'm assuming I'm missing a top-level usage count on skb that is held by
callers, which means the skb isn't actually freed by unix_gc(). But I
return to not understanding why adding the lock/unlock helps.
What are the expected locking semantics here?
-Kees
>
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
> net/unix/af_unix.c | 16 ++++++++++++++--
> 1 file changed, 14 insertions(+), 2 deletions(-)
>
> Note, this is a resend of this old submission that somehow fell through
> the cracks:
> https://lore.kernel.org/netdev/CAOssrKcfncAYsQWkfLGFgoOxAQJVT2hYVWdBA6Cw7hhO8RJ_wQ@mail.gmail.com/
> and was never submitted "properly" and this issue never seemed to get
> resolved properly.
>
> I've cleaned it up and made the change much smaller and localized to
> only one file. I kept Miklos's authorship as he did the hard work on
> this, I just removed lines and fixed a formatting issue :)
>
>
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index 23c92ad15c61..cdea997aa5bf 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -1526,6 +1526,18 @@ static int unix_getname(struct socket *sock, struct sockaddr *uaddr, int peer)
>  	return err;
>  }
>
> +static void unix_peek_fds(struct scm_cookie *scm, struct sk_buff *skb)
> +{
> +	scm->fp = scm_fp_dup(UNIXCB(skb).fp);
> +
> +	/* During garbage collection it is assumed that in-flight sockets don't
> +	 * get a new external reference. So we need to wait until current run
> +	 * finishes.
> +	 */
> +	spin_lock(&unix_gc_lock);
> +	spin_unlock(&unix_gc_lock);
> +}
--
Kees Cook
* Re: [PATCH net] af_unix: fix garbage collect vs. MSG_PEEK
From: Miklos Szeredi @ 2021-07-29 14:29 UTC
To: Kees Cook; +Cc: Greg Kroah-Hartman, netdev, lkml, David Miller, kuba, stable
On Mon, Jul 26, 2021 at 9:27 PM Kees Cook <keescook@chromium.org> wrote:
>
> On Mon, Jul 26, 2021 at 05:36:21PM +0200, Greg Kroah-Hartman wrote:
> > From: Miklos Szeredi <mszeredi@redhat.com>
> >
> > Gc assumes that in-flight sockets that don't have an external ref can't
>
> I think this commit log could be expanded. I had to really study things
> to even begin to understand what was going on. I assume "Gc" here means
> specifically unix_gc()?
Yeah, the original description was not too good. Commit cbcf01128d0a
("af_unix: fix garbage collect vs MSG_PEEK") now in Linus' tree has a
much expanded description.
> I note that unix_tot_inflight isn't an atomic but is read outside of
> locking by unix_release_sock() and wait_for_unix_gc(), which seems wrong
> (or at least inefficient).
I don't think it matters in practice. Do you have specific worries?
> Doesn't this mean total_refs and inflight_refs can still get out of
> sync? What keeps an skb from being "visible" to unix_peek_fds() between
> the unix_gc() spin_unlock() and the unix_peek_fds() fget()?
>
> A: unix_gc():
>	spin_lock()
>	find "total_refs == inflight_refs", add to hitlist
>	spin_unlock()
> B: unix_peek_fds():
>	fget()
> A: unix_gc():
>	walk hitlist and free(skb)
> B: unix_peek_fds():
>	*use freed skb*
>
> I feel like I must be missing something since the above race would
> appear to exist even for unix_attach_fds()/unix_detach_fds():
What you are missing is that anything that could have been peeked must
not have been garbage collected. I.e. the garbage collection
algorithm will find that there's an external in-flight reference to
the peeked socket and so it will not add it to the hitlist.
Thanks,
Miklos