LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH net] tuntap: raise EPOLLOUT on device up
@ 2018-05-18 13:00 Jason Wang
  2018-05-18 13:13 ` Michael S. Tsirkin
  2018-05-21 15:47 ` David Miller
  0 siblings, 2 replies; 15+ messages in thread
From: Jason Wang @ 2018-05-18 13:00 UTC (permalink / raw)
  To: netdev, linux-kernel; +Cc: mst, Jason Wang, Hannes Frederic Sowa, Eric Dumazet

We return -EIO on device down but can not raise EPOLLOUT after it was
up. This may confuse user like vhost which expects tuntap to raise
EPOLLOUT to re-enable its TX routine after tuntap is down. This could
be easily reproduced by transmitting packets from VM while down and up
the tap device. Fixing this by set SOCKWQ_ASYNC_NOSPACE on -EIO.

Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Eric Dumazet <edumazet@google.com>
Fixes: 1bd4978a88ac2 ("tun: honor IFF_UP in tun_get_user()")
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/net/tun.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index d45ac37..1b29761 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1734,8 +1734,10 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 	int skb_xdp = 1;
 	bool frags = tun_napi_frags_enabled(tun);
 
-	if (!(tun->dev->flags & IFF_UP))
+	if (!(tun->dev->flags & IFF_UP)) {
+		set_bit(SOCKWQ_ASYNC_NOSPACE, &tfile->socket.flags);
 		return -EIO;
+	}
 
 	if (!(tun->flags & IFF_NO_PI)) {
 		if (len < sizeof(pi))
-- 
2.7.4

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] tuntap: raise EPOLLOUT on device up
  2018-05-18 13:00 [PATCH net] tuntap: raise EPOLLOUT on device up Jason Wang
@ 2018-05-18 13:13 ` Michael S. Tsirkin
  2018-05-18 13:26   ` Jason Wang
  2018-05-21 15:47 ` David Miller
  1 sibling, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2018-05-18 13:13 UTC (permalink / raw)
  To: Jason Wang; +Cc: netdev, linux-kernel, Hannes Frederic Sowa, Eric Dumazet

On Fri, May 18, 2018 at 09:00:43PM +0800, Jason Wang wrote:
> We return -EIO on device down but can not raise EPOLLOUT after it was
> up. This may confuse user like vhost which expects tuntap to raise
> EPOLLOUT to re-enable its TX routine after tuntap is down. This could
> be easily reproduced by transmitting packets from VM while down and up
> the tap device. Fixing this by set SOCKWQ_ASYNC_NOSPACE on -EIO.
> 
> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
> Cc: Eric Dumazet <edumazet@google.com>
> Fixes: 1bd4978a88ac2 ("tun: honor IFF_UP in tun_get_user()")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/net/tun.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index d45ac37..1b29761 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1734,8 +1734,10 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
>  	int skb_xdp = 1;
>  	bool frags = tun_napi_frags_enabled(tun);
>  
> -	if (!(tun->dev->flags & IFF_UP))
> +	if (!(tun->dev->flags & IFF_UP)) {

Isn't this racy?  What if flag is cleared at this point?

> +		set_bit(SOCKWQ_ASYNC_NOSPACE, &tfile->socket.flags);
>  		return -EIO;
> +	}
>  
>  	if (!(tun->flags & IFF_NO_PI)) {
>  		if (len < sizeof(pi))
> -- 
> 2.7.4

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] tuntap: raise EPOLLOUT on device up
  2018-05-18 13:13 ` Michael S. Tsirkin
@ 2018-05-18 13:26   ` Jason Wang
  2018-05-18 14:00     ` Jason Wang
  0 siblings, 1 reply; 15+ messages in thread
From: Jason Wang @ 2018-05-18 13:26 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, linux-kernel, Hannes Frederic Sowa, Eric Dumazet



On 2018年05月18日 21:13, Michael S. Tsirkin wrote:
> On Fri, May 18, 2018 at 09:00:43PM +0800, Jason Wang wrote:
>> We return -EIO on device down but can not raise EPOLLOUT after it was
>> up. This may confuse user like vhost which expects tuntap to raise
>> EPOLLOUT to re-enable its TX routine after tuntap is down. This could
>> be easily reproduced by transmitting packets from VM while down and up
>> the tap device. Fixing this by set SOCKWQ_ASYNC_NOSPACE on -EIO.
>>
>> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
>> Cc: Eric Dumazet <edumazet@google.com>
>> Fixes: 1bd4978a88ac2 ("tun: honor IFF_UP in tun_get_user()")
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>>   drivers/net/tun.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>> index d45ac37..1b29761 100644
>> --- a/drivers/net/tun.c
>> +++ b/drivers/net/tun.c
>> @@ -1734,8 +1734,10 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
>>   	int skb_xdp = 1;
>>   	bool frags = tun_napi_frags_enabled(tun);
>>   
>> -	if (!(tun->dev->flags & IFF_UP))
>> +	if (!(tun->dev->flags & IFF_UP)) {
> Isn't this racy?  What if flag is cleared at this point?

I think you mean "set at this point"? Then yes, so we probably need to 
set the bit during tun_net_close().

Thanks

>> +		set_bit(SOCKWQ_ASYNC_NOSPACE, &tfile->socket.flags);
>>   		return -EIO;
>> +	}
>>   
>>   	if (!(tun->flags & IFF_NO_PI)) {
>>   		if (len < sizeof(pi))
>> -- 
>> 2.7.4

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] tuntap: raise EPOLLOUT on device up
  2018-05-18 13:26   ` Jason Wang
@ 2018-05-18 14:00     ` Jason Wang
  2018-05-18 14:06       ` Michael S. Tsirkin
  0 siblings, 1 reply; 15+ messages in thread
From: Jason Wang @ 2018-05-18 14:00 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, linux-kernel, Hannes Frederic Sowa, Eric Dumazet



On 2018年05月18日 21:26, Jason Wang wrote:
>
>
> On 2018年05月18日 21:13, Michael S. Tsirkin wrote:
>> On Fri, May 18, 2018 at 09:00:43PM +0800, Jason Wang wrote:
>>> We return -EIO on device down but can not raise EPOLLOUT after it was
>>> up. This may confuse user like vhost which expects tuntap to raise
>>> EPOLLOUT to re-enable its TX routine after tuntap is down. This could
>>> be easily reproduced by transmitting packets from VM while down and up
>>> the tap device. Fixing this by set SOCKWQ_ASYNC_NOSPACE on -EIO.
>>>
>>> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
>>> Cc: Eric Dumazet <edumazet@google.com>
>>> Fixes: 1bd4978a88ac2 ("tun: honor IFF_UP in tun_get_user()")
>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>> ---
>>>   drivers/net/tun.c | 4 +++-
>>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>>> index d45ac37..1b29761 100644
>>> --- a/drivers/net/tun.c
>>> +++ b/drivers/net/tun.c
>>> @@ -1734,8 +1734,10 @@ static ssize_t tun_get_user(struct tun_struct 
>>> *tun, struct tun_file *tfile,
>>>       int skb_xdp = 1;
>>>       bool frags = tun_napi_frags_enabled(tun);
>>>   -    if (!(tun->dev->flags & IFF_UP))
>>> +    if (!(tun->dev->flags & IFF_UP)) {
>> Isn't this racy?  What if flag is cleared at this point?
>
> I think you mean "set at this point"? Then yes, so we probably need to 
> set the bit during tun_net_close().
>
> Thanks 

Looks no need, vhost will poll socket after it see EIO. So we are ok here?

Thanks

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] tuntap: raise EPOLLOUT on device up
  2018-05-18 14:00     ` Jason Wang
@ 2018-05-18 14:06       ` Michael S. Tsirkin
  2018-05-18 14:11         ` Jason Wang
  0 siblings, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2018-05-18 14:06 UTC (permalink / raw)
  To: Jason Wang; +Cc: netdev, linux-kernel, Hannes Frederic Sowa, Eric Dumazet

On Fri, May 18, 2018 at 10:00:31PM +0800, Jason Wang wrote:
> 
> 
> On 2018年05月18日 21:26, Jason Wang wrote:
> > 
> > 
> > On 2018年05月18日 21:13, Michael S. Tsirkin wrote:
> > > On Fri, May 18, 2018 at 09:00:43PM +0800, Jason Wang wrote:
> > > > We return -EIO on device down but can not raise EPOLLOUT after it was
> > > > up. This may confuse user like vhost which expects tuntap to raise
> > > > EPOLLOUT to re-enable its TX routine after tuntap is down. This could
> > > > be easily reproduced by transmitting packets from VM while down and up
> > > > the tap device. Fixing this by set SOCKWQ_ASYNC_NOSPACE on -EIO.
> > > > 
> > > > Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
> > > > Cc: Eric Dumazet <edumazet@google.com>
> > > > Fixes: 1bd4978a88ac2 ("tun: honor IFF_UP in tun_get_user()")
> > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > ---
> > > >   drivers/net/tun.c | 4 +++-
> > > >   1 file changed, 3 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> > > > index d45ac37..1b29761 100644
> > > > --- a/drivers/net/tun.c
> > > > +++ b/drivers/net/tun.c
> > > > @@ -1734,8 +1734,10 @@ static ssize_t tun_get_user(struct
> > > > tun_struct *tun, struct tun_file *tfile,
> > > >       int skb_xdp = 1;
> > > >       bool frags = tun_napi_frags_enabled(tun);
> > > >   -    if (!(tun->dev->flags & IFF_UP))
> > > > +    if (!(tun->dev->flags & IFF_UP)) {
> > > Isn't this racy?  What if flag is cleared at this point?
> > 
> > I think you mean "set at this point"? Then yes, so we probably need to
> > set the bit during tun_net_close().
> > 
> > Thanks
> 
> Looks no need, vhost will poll socket after it see EIO. So we are ok here?
> 
> Thanks

In fact I don't even understand why does this help any longer.

-- 
MST

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] tuntap: raise EPOLLOUT on device up
  2018-05-18 14:06       ` Michael S. Tsirkin
@ 2018-05-18 14:11         ` Jason Wang
  2018-05-18 14:46           ` Michael S. Tsirkin
  0 siblings, 1 reply; 15+ messages in thread
From: Jason Wang @ 2018-05-18 14:11 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, linux-kernel, Hannes Frederic Sowa, Eric Dumazet



On 2018年05月18日 22:06, Michael S. Tsirkin wrote:
> On Fri, May 18, 2018 at 10:00:31PM +0800, Jason Wang wrote:
>>
>> On 2018年05月18日 21:26, Jason Wang wrote:
>>>
>>> On 2018年05月18日 21:13, Michael S. Tsirkin wrote:
>>>> On Fri, May 18, 2018 at 09:00:43PM +0800, Jason Wang wrote:
>>>>> We return -EIO on device down but can not raise EPOLLOUT after it was
>>>>> up. This may confuse user like vhost which expects tuntap to raise
>>>>> EPOLLOUT to re-enable its TX routine after tuntap is down. This could
>>>>> be easily reproduced by transmitting packets from VM while down and up
>>>>> the tap device. Fixing this by set SOCKWQ_ASYNC_NOSPACE on -EIO.
>>>>>
>>>>> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
>>>>> Cc: Eric Dumazet <edumazet@google.com>
>>>>> Fixes: 1bd4978a88ac2 ("tun: honor IFF_UP in tun_get_user()")
>>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>>> ---
>>>>>    drivers/net/tun.c | 4 +++-
>>>>>    1 file changed, 3 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>>>>> index d45ac37..1b29761 100644
>>>>> --- a/drivers/net/tun.c
>>>>> +++ b/drivers/net/tun.c
>>>>> @@ -1734,8 +1734,10 @@ static ssize_t tun_get_user(struct
>>>>> tun_struct *tun, struct tun_file *tfile,
>>>>>        int skb_xdp = 1;
>>>>>        bool frags = tun_napi_frags_enabled(tun);
>>>>>    -    if (!(tun->dev->flags & IFF_UP))
>>>>> +    if (!(tun->dev->flags & IFF_UP)) {
>>>> Isn't this racy?  What if flag is cleared at this point?
>>> I think you mean "set at this point"? Then yes, so we probably need to
>>> set the bit during tun_net_close().
>>>
>>> Thanks
>> Looks no need, vhost will poll socket after it see EIO. So we are ok here?
>>
>> Thanks
> In fact I don't even understand why does this help any longer.
>

We disable tx polling and only enable it on demand for a better rx 
performance. You may want to have a look at :

commit feb8892cb441c742d4220cf7ced001e7fa070731
Author: Jason Wang <jasowang@redhat.com>
Date:   Mon Nov 13 11:45:34 2017 +0800

     vhost_net: conditionally enable tx polling

Thanks

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] tuntap: raise EPOLLOUT on device up
  2018-05-18 14:11         ` Jason Wang
@ 2018-05-18 14:46           ` Michael S. Tsirkin
  2018-05-19  1:09             ` Jason Wang
  0 siblings, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2018-05-18 14:46 UTC (permalink / raw)
  To: Jason Wang; +Cc: netdev, linux-kernel, Hannes Frederic Sowa, Eric Dumazet

On Fri, May 18, 2018 at 10:11:54PM +0800, Jason Wang wrote:
> 
> 
> On 2018年05月18日 22:06, Michael S. Tsirkin wrote:
> > On Fri, May 18, 2018 at 10:00:31PM +0800, Jason Wang wrote:
> > > 
> > > On 2018年05月18日 21:26, Jason Wang wrote:
> > > > 
> > > > On 2018年05月18日 21:13, Michael S. Tsirkin wrote:
> > > > > On Fri, May 18, 2018 at 09:00:43PM +0800, Jason Wang wrote:
> > > > > > We return -EIO on device down but can not raise EPOLLOUT after it was
> > > > > > up. This may confuse user like vhost which expects tuntap to raise
> > > > > > EPOLLOUT to re-enable its TX routine after tuntap is down. This could
> > > > > > be easily reproduced by transmitting packets from VM while down and up
> > > > > > the tap device. Fixing this by set SOCKWQ_ASYNC_NOSPACE on -EIO.
> > > > > > 
> > > > > > Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
> > > > > > Cc: Eric Dumazet <edumazet@google.com>
> > > > > > Fixes: 1bd4978a88ac2 ("tun: honor IFF_UP in tun_get_user()")
> > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > ---
> > > > > >    drivers/net/tun.c | 4 +++-
> > > > > >    1 file changed, 3 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> > > > > > index d45ac37..1b29761 100644
> > > > > > --- a/drivers/net/tun.c
> > > > > > +++ b/drivers/net/tun.c
> > > > > > @@ -1734,8 +1734,10 @@ static ssize_t tun_get_user(struct
> > > > > > tun_struct *tun, struct tun_file *tfile,
> > > > > >        int skb_xdp = 1;
> > > > > >        bool frags = tun_napi_frags_enabled(tun);
> > > > > >    -    if (!(tun->dev->flags & IFF_UP))
> > > > > > +    if (!(tun->dev->flags & IFF_UP)) {
> > > > > Isn't this racy?  What if flag is cleared at this point?
> > > > I think you mean "set at this point"? Then yes, so we probably need to
> > > > set the bit during tun_net_close().
> > > > 
> > > > Thanks
> > > Looks no need, vhost will poll socket after it see EIO. So we are ok here?
> > > 
> > > Thanks
> > In fact I don't even understand why does this help any longer.
> > 
> 
> We disable tx polling and only enable it on demand for a better rx
> performance. You may want to have a look at :
> 
> commit feb8892cb441c742d4220cf7ced001e7fa070731
> Author: Jason Wang <jasowang@redhat.com>
> Date:   Mon Nov 13 11:45:34 2017 +0800
> 
>     vhost_net: conditionally enable tx polling
> 
> Thanks


Question is, what looks at SOCKWQ_ASYNC_NOSPACE.
I think it's tested when packet is transmitted,
but there is no guarantee here any packet will
ever be transmitted.

-- 
MST

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] tuntap: raise EPOLLOUT on device up
  2018-05-18 14:46           ` Michael S. Tsirkin
@ 2018-05-19  1:09             ` Jason Wang
  2018-05-21 22:06               ` Michael S. Tsirkin
  0 siblings, 1 reply; 15+ messages in thread
From: Jason Wang @ 2018-05-19  1:09 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, linux-kernel, Hannes Frederic Sowa, Eric Dumazet



On 2018年05月18日 22:46, Michael S. Tsirkin wrote:
> On Fri, May 18, 2018 at 10:11:54PM +0800, Jason Wang wrote:
>>
>> On 2018年05月18日 22:06, Michael S. Tsirkin wrote:
>>> On Fri, May 18, 2018 at 10:00:31PM +0800, Jason Wang wrote:
>>>> On 2018年05月18日 21:26, Jason Wang wrote:
>>>>> On 2018年05月18日 21:13, Michael S. Tsirkin wrote:
>>>>>> On Fri, May 18, 2018 at 09:00:43PM +0800, Jason Wang wrote:
>>>>>>> We return -EIO on device down but can not raise EPOLLOUT after it was
>>>>>>> up. This may confuse user like vhost which expects tuntap to raise
>>>>>>> EPOLLOUT to re-enable its TX routine after tuntap is down. This could
>>>>>>> be easily reproduced by transmitting packets from VM while down and up
>>>>>>> the tap device. Fixing this by set SOCKWQ_ASYNC_NOSPACE on -EIO.
>>>>>>>
>>>>>>> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
>>>>>>> Cc: Eric Dumazet <edumazet@google.com>
>>>>>>> Fixes: 1bd4978a88ac2 ("tun: honor IFF_UP in tun_get_user()")
>>>>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>>>>> ---
>>>>>>>     drivers/net/tun.c | 4 +++-
>>>>>>>     1 file changed, 3 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>>>>>>> index d45ac37..1b29761 100644
>>>>>>> --- a/drivers/net/tun.c
>>>>>>> +++ b/drivers/net/tun.c
>>>>>>> @@ -1734,8 +1734,10 @@ static ssize_t tun_get_user(struct
>>>>>>> tun_struct *tun, struct tun_file *tfile,
>>>>>>>         int skb_xdp = 1;
>>>>>>>         bool frags = tun_napi_frags_enabled(tun);
>>>>>>>     -    if (!(tun->dev->flags & IFF_UP))
>>>>>>> +    if (!(tun->dev->flags & IFF_UP)) {
>>>>>> Isn't this racy?  What if flag is cleared at this point?
>>>>> I think you mean "set at this point"? Then yes, so we probably need to
>>>>> set the bit during tun_net_close().
>>>>>
>>>>> Thanks
>>>> Looks no need, vhost will poll socket after it see EIO. So we are ok here?
>>>>
>>>> Thanks
>>> In fact I don't even understand why does this help any longer.
>>>
>> We disable tx polling and only enable it on demand for a better rx
>> performance. You may want to have a look at :
>>
>> commit feb8892cb441c742d4220cf7ced001e7fa070731
>> Author: Jason Wang <jasowang@redhat.com>
>> Date:   Mon Nov 13 11:45:34 2017 +0800
>>
>>      vhost_net: conditionally enable tx polling
>>
>> Thanks
>
> Question is, what looks at SOCKWQ_ASYNC_NOSPACE.
> I think it's tested when packet is transmitted,
> but there is no guarantee here any packet will
> ever be transmitted.
>

Well, actually, I do plan to disable vq polling from the beginning. But 
looks like you do not want this:

See https://patchwork.kernel.org/patch/10034025/

Thanks

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] tuntap: raise EPOLLOUT on device up
  2018-05-18 13:00 [PATCH net] tuntap: raise EPOLLOUT on device up Jason Wang
  2018-05-18 13:13 ` Michael S. Tsirkin
@ 2018-05-21 15:47 ` David Miller
  2018-05-21 22:08   ` Michael S. Tsirkin
  1 sibling, 1 reply; 15+ messages in thread
From: David Miller @ 2018-05-21 15:47 UTC (permalink / raw)
  To: jasowang; +Cc: netdev, linux-kernel, mst, hannes, edumazet

From: Jason Wang <jasowang@redhat.com>
Date: Fri, 18 May 2018 21:00:43 +0800

> We return -EIO on device down but can not raise EPOLLOUT after it was
> up. This may confuse user like vhost which expects tuntap to raise
> EPOLLOUT to re-enable its TX routine after tuntap is down. This could
> be easily reproduced by transmitting packets from VM while down and up
> the tap device. Fixing this by set SOCKWQ_ASYNC_NOSPACE on -EIO.
> 
> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
> Cc: Eric Dumazet <edumazet@google.com>
> Fixes: 1bd4978a88ac2 ("tun: honor IFF_UP in tun_get_user()")
> Signed-off-by: Jason Wang <jasowang@redhat.com>

I'm no so sure what to do with this patch.

Like Michael says, this flag bit is only checks upon transmit which
may or may not happen after this point.  It doesn't seem to be
guaranteed.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] tuntap: raise EPOLLOUT on device up
  2018-05-19  1:09             ` Jason Wang
@ 2018-05-21 22:06               ` Michael S. Tsirkin
  0 siblings, 0 replies; 15+ messages in thread
From: Michael S. Tsirkin @ 2018-05-21 22:06 UTC (permalink / raw)
  To: Jason Wang; +Cc: netdev, linux-kernel, Hannes Frederic Sowa, Eric Dumazet

On Sat, May 19, 2018 at 09:09:11AM +0800, Jason Wang wrote:
> 
> 
> On 2018年05月18日 22:46, Michael S. Tsirkin wrote:
> > On Fri, May 18, 2018 at 10:11:54PM +0800, Jason Wang wrote:
> > > 
> > > On 2018年05月18日 22:06, Michael S. Tsirkin wrote:
> > > > On Fri, May 18, 2018 at 10:00:31PM +0800, Jason Wang wrote:
> > > > > On 2018年05月18日 21:26, Jason Wang wrote:
> > > > > > On 2018年05月18日 21:13, Michael S. Tsirkin wrote:
> > > > > > > On Fri, May 18, 2018 at 09:00:43PM +0800, Jason Wang wrote:
> > > > > > > > We return -EIO on device down but can not raise EPOLLOUT after it was
> > > > > > > > up. This may confuse user like vhost which expects tuntap to raise
> > > > > > > > EPOLLOUT to re-enable its TX routine after tuntap is down. This could
> > > > > > > > be easily reproduced by transmitting packets from VM while down and up
> > > > > > > > the tap device. Fixing this by set SOCKWQ_ASYNC_NOSPACE on -EIO.
> > > > > > > > 
> > > > > > > > Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
> > > > > > > > Cc: Eric Dumazet <edumazet@google.com>
> > > > > > > > Fixes: 1bd4978a88ac2 ("tun: honor IFF_UP in tun_get_user()")
> > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > ---
> > > > > > > >     drivers/net/tun.c | 4 +++-
> > > > > > > >     1 file changed, 3 insertions(+), 1 deletion(-)
> > > > > > > > 
> > > > > > > > diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> > > > > > > > index d45ac37..1b29761 100644
> > > > > > > > --- a/drivers/net/tun.c
> > > > > > > > +++ b/drivers/net/tun.c
> > > > > > > > @@ -1734,8 +1734,10 @@ static ssize_t tun_get_user(struct
> > > > > > > > tun_struct *tun, struct tun_file *tfile,
> > > > > > > >         int skb_xdp = 1;
> > > > > > > >         bool frags = tun_napi_frags_enabled(tun);
> > > > > > > >     -    if (!(tun->dev->flags & IFF_UP))
> > > > > > > > +    if (!(tun->dev->flags & IFF_UP)) {
> > > > > > > Isn't this racy?  What if flag is cleared at this point?
> > > > > > I think you mean "set at this point"? Then yes, so we probably need to
> > > > > > set the bit during tun_net_close().
> > > > > > 
> > > > > > Thanks
> > > > > Looks no need, vhost will poll socket after it see EIO. So we are ok here?
> > > > > 
> > > > > Thanks
> > > > In fact I don't even understand why does this help any longer.
> > > > 
> > > We disable tx polling and only enable it on demand for a better rx
> > > performance. You may want to have a look at :
> > > 
> > > commit feb8892cb441c742d4220cf7ced001e7fa070731
> > > Author: Jason Wang <jasowang@redhat.com>
> > > Date:   Mon Nov 13 11:45:34 2017 +0800
> > > 
> > >      vhost_net: conditionally enable tx polling
> > > 
> > > Thanks
> > 
> > Question is, what looks at SOCKWQ_ASYNC_NOSPACE.
> > I think it's tested when packet is transmitted,
> > but there is no guarantee here any packet will
> > ever be transmitted.
> > 
> 
> Well, actually, I do plan to disable vq polling from the beginning. But
> looks like you do not want this:
> 
> See https://patchwork.kernel.org/patch/10034025/
> 
> Thanks

Not sure I understand what you are saying, it's enabling polling we are
talking about.

-- 
MST

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] tuntap: raise EPOLLOUT on device up
  2018-05-21 15:47 ` David Miller
@ 2018-05-21 22:08   ` Michael S. Tsirkin
  2018-05-22  3:22     ` Jason Wang
  0 siblings, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2018-05-21 22:08 UTC (permalink / raw)
  To: David Miller; +Cc: jasowang, netdev, linux-kernel, hannes, edumazet

On Mon, May 21, 2018 at 11:47:42AM -0400, David Miller wrote:
> From: Jason Wang <jasowang@redhat.com>
> Date: Fri, 18 May 2018 21:00:43 +0800
> 
> > We return -EIO on device down but can not raise EPOLLOUT after it was
> > up. This may confuse user like vhost which expects tuntap to raise
> > EPOLLOUT to re-enable its TX routine after tuntap is down. This could
> > be easily reproduced by transmitting packets from VM while down and up
> > the tap device. Fixing this by set SOCKWQ_ASYNC_NOSPACE on -EIO.
> > 
> > Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
> > Cc: Eric Dumazet <edumazet@google.com>
> > Fixes: 1bd4978a88ac2 ("tun: honor IFF_UP in tun_get_user()")
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> 
> I'm no so sure what to do with this patch.
> 
> Like Michael says, this flag bit is only checks upon transmit which
> may or may not happen after this point.  It doesn't seem to be
> guaranteed.

Jason, can't we detect a link up transition and respond accordingly?
What do you think?

-- 
MST

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] tuntap: raise EPOLLOUT on device up
  2018-05-21 22:08   ` Michael S. Tsirkin
@ 2018-05-22  3:22     ` Jason Wang
  2018-05-22  3:45       ` Michael S. Tsirkin
  2018-05-22  3:46       ` Michael S. Tsirkin
  0 siblings, 2 replies; 15+ messages in thread
From: Jason Wang @ 2018-05-22  3:22 UTC (permalink / raw)
  To: Michael S. Tsirkin, David Miller; +Cc: netdev, linux-kernel, hannes, edumazet



On 2018年05月22日 06:08, Michael S. Tsirkin wrote:
> On Mon, May 21, 2018 at 11:47:42AM -0400, David Miller wrote:
>> From: Jason Wang <jasowang@redhat.com>
>> Date: Fri, 18 May 2018 21:00:43 +0800
>>
>>> We return -EIO on device down but can not raise EPOLLOUT after it was
>>> up. This may confuse user like vhost which expects tuntap to raise
>>> EPOLLOUT to re-enable its TX routine after tuntap is down. This could
>>> be easily reproduced by transmitting packets from VM while down and up
>>> the tap device. Fixing this by set SOCKWQ_ASYNC_NOSPACE on -EIO.
>>>
>>> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
>>> Cc: Eric Dumazet <edumazet@google.com>
>>> Fixes: 1bd4978a88ac2 ("tun: honor IFF_UP in tun_get_user()")
>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> I'm no so sure what to do with this patch.
>>
>> Like Michael says, this flag bit is only checks upon transmit which
>> may or may not happen after this point.  It doesn't seem to be
>> guaranteed.

The flag is checked in tun_chr_poll() as well.

> Jason, can't we detect a link up transition and respond accordingly?
> What do you think?
>

I think we've already tried to do this, in tun_net_open() we call 
write_space(). But the problem is the bit may not be set at that time.

A second thought is to set the bit in tun_chr_poll() instead of -EIO like:

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index d45ac37..46a1573 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1423,6 +1423,13 @@ static void tun_net_init(struct net_device *dev)
         dev->max_mtu = MAX_MTU - dev->hard_header_len;
  }

+static bool tun_sock_writeable(struct tun_struct *tun, struct tun_file 
*tfile)
+{
+       struct sock *sk = tfile->socket.sk;
+
+       return (tun->dev->flags & IFF_UP) && sock_writeable(sk);
+}
+
  /* Character device part */

  /* Poll */
@@ -1445,10 +1452,9 @@ static __poll_t tun_chr_poll(struct file *file, 
poll_table *wait)
         if (!ptr_ring_empty(&tfile->tx_ring))
                 mask |= EPOLLIN | EPOLLRDNORM;

-       if (tun->dev->flags & IFF_UP &&
-           (sock_writeable(sk) ||
-            (!test_and_set_bit(SOCKWQ_ASYNC_NOSPACE, 
&sk->sk_socket->flags) &&
-             sock_writeable(sk))))
+       if (tun_sock_writeable(tun, tfile) ||
+           (!test_and_set_bit(SOCKWQ_ASYNC_NOSPACE, 
&sk->sk_socket->flags) &&
+            tun_sock_writeable(tun, tfile)));
                 mask |= EPOLLOUT | EPOLLWRNORM;

         if (tun->dev->reg_state != NETREG_REGISTERED)

Does this make more sense?

Thanks

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] tuntap: raise EPOLLOUT on device up
  2018-05-22  3:22     ` Jason Wang
@ 2018-05-22  3:45       ` Michael S. Tsirkin
  2018-05-22  3:46       ` Michael S. Tsirkin
  1 sibling, 0 replies; 15+ messages in thread
From: Michael S. Tsirkin @ 2018-05-22  3:45 UTC (permalink / raw)
  To: Jason Wang; +Cc: David Miller, netdev, linux-kernel, hannes, edumazet

On Tue, May 22, 2018 at 11:22:11AM +0800, Jason Wang wrote:
> 
> 
> On 2018年05月22日 06:08, Michael S. Tsirkin wrote:
> > On Mon, May 21, 2018 at 11:47:42AM -0400, David Miller wrote:
> > > From: Jason Wang <jasowang@redhat.com>
> > > Date: Fri, 18 May 2018 21:00:43 +0800
> > > 
> > > > We return -EIO on device down but can not raise EPOLLOUT after it was
> > > > up. This may confuse user like vhost which expects tuntap to raise
> > > > EPOLLOUT to re-enable its TX routine after tuntap is down. This could
> > > > be easily reproduced by transmitting packets from VM while down and up
> > > > the tap device. Fixing this by set SOCKWQ_ASYNC_NOSPACE on -EIO.
> > > > 
> > > > Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
> > > > Cc: Eric Dumazet <edumazet@google.com>
> > > > Fixes: 1bd4978a88ac2 ("tun: honor IFF_UP in tun_get_user()")
> > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > I'm no so sure what to do with this patch.
> > > 
> > > Like Michael says, this flag bit is only checks upon transmit which
> > > may or may not happen after this point.  It doesn't seem to be
> > > guaranteed.
> 
> The flag is checked in tun_chr_poll() as well.
> 
> > Jason, can't we detect a link up transition and respond accordingly?
> > What do you think?
> > 
> 
> I think we've already tried to do this, in tun_net_open() we call
> write_space(). But the problem is the bit may not be set at that time.

Which bit? __dev_change_flags seems to set IFF_UP before calling
ndo_open. The issue  I think is that tun_sock_write_space
exits if SOCKWQ_ASYNC_NOSPACE is clear.

And now I think I understand what is going on:

	When link is down, writes to the device might fail with -EIO.
	Userspace needs an indication when the status is resolved.  As a fix,
	tun_net_open attempts to wake up writers - but that is only effective if
	SOCKWQ_ASYNC_NOSPACE has been set in the past.  As a quick hack, set
	SOCKWQ_ASYNC_NOSPACE when write fails because of the link down status.
	If no writes failed, userspace does not know that interface
	was down so should not care that it's going up.


does this describe what this line of code does?
If yes feel free to include this info in a code comment and commit log.



> A second thought is to set the bit in tun_chr_poll() instead of -EIO like:
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index d45ac37..46a1573 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1423,6 +1423,13 @@ static void tun_net_init(struct net_device *dev)
>         dev->max_mtu = MAX_MTU - dev->hard_header_len;
>  }
> 
> +static bool tun_sock_writeable(struct tun_struct *tun, struct tun_file
> *tfile)
> +{
> +       struct sock *sk = tfile->socket.sk;
> +
> +       return (tun->dev->flags & IFF_UP) && sock_writeable(sk);
> +}
> +
>  /* Character device part */
> 
>  /* Poll */
> @@ -1445,10 +1452,9 @@ static __poll_t tun_chr_poll(struct file *file,
> poll_table *wait)
>         if (!ptr_ring_empty(&tfile->tx_ring))
>                 mask |= EPOLLIN | EPOLLRDNORM;
> 
> -       if (tun->dev->flags & IFF_UP &&
> -           (sock_writeable(sk) ||
> -            (!test_and_set_bit(SOCKWQ_ASYNC_NOSPACE, &sk->sk_socket->flags)
> &&
> -             sock_writeable(sk))))
> +       if (tun_sock_writeable(tun, tfile) ||
> +           (!test_and_set_bit(SOCKWQ_ASYNC_NOSPACE, &sk->sk_socket->flags)
> &&
> +            tun_sock_writeable(tun, tfile)));
>                 mask |= EPOLLOUT | EPOLLWRNORM;
> 
>         if (tun->dev->reg_state != NETREG_REGISTERED)
> 
> Does this make more sense?
> 
> Thanks

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] tuntap: raise EPOLLOUT on device up
  2018-05-22  3:22     ` Jason Wang
  2018-05-22  3:45       ` Michael S. Tsirkin
@ 2018-05-22  3:46       ` Michael S. Tsirkin
  2018-05-22  4:00         ` Jason Wang
  1 sibling, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2018-05-22  3:46 UTC (permalink / raw)
  To: Jason Wang; +Cc: David Miller, netdev, linux-kernel, hannes, edumazet

On Tue, May 22, 2018 at 11:22:11AM +0800, Jason Wang wrote:
> 
> 
> On 2018年05月22日 06:08, Michael S. Tsirkin wrote:
> > On Mon, May 21, 2018 at 11:47:42AM -0400, David Miller wrote:
> > > From: Jason Wang <jasowang@redhat.com>
> > > Date: Fri, 18 May 2018 21:00:43 +0800
> > > 
> > > > We return -EIO on device down but can not raise EPOLLOUT after it was
> > > > up. This may confuse user like vhost which expects tuntap to raise
> > > > EPOLLOUT to re-enable its TX routine after tuntap is down. This could
> > > > be easily reproduced by transmitting packets from VM while down and up
> > > > the tap device. Fixing this by set SOCKWQ_ASYNC_NOSPACE on -EIO.
> > > > 
> > > > Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
> > > > Cc: Eric Dumazet <edumazet@google.com>
> > > > Fixes: 1bd4978a88ac2 ("tun: honor IFF_UP in tun_get_user()")
> > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > I'm no so sure what to do with this patch.
> > > 
> > > Like Michael says, this flag bit is only checks upon transmit which
> > > may or may not happen after this point.  It doesn't seem to be
> > > guaranteed.
> 
> The flag is checked in tun_chr_poll() as well.
> 
> > Jason, can't we detect a link up transition and respond accordingly?
> > What do you think?
> > 
> 
> I think we've already tried to do this, in tun_net_open() we call
> write_space(). But the problem is the bit may not be set at that time.
> 
> A second thought is to set the bit in tun_chr_poll() instead of -EIO like:
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index d45ac37..46a1573 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1423,6 +1423,13 @@ static void tun_net_init(struct net_device *dev)
>         dev->max_mtu = MAX_MTU - dev->hard_header_len;
>  }
> 
> +static bool tun_sock_writeable(struct tun_struct *tun, struct tun_file
> *tfile)
> +{
> +       struct sock *sk = tfile->socket.sk;
> +
> +       return (tun->dev->flags & IFF_UP) && sock_writeable(sk);
> +}
> +
>  /* Character device part */
> 
>  /* Poll */
> @@ -1445,10 +1452,9 @@ static __poll_t tun_chr_poll(struct file *file,
> poll_table *wait)
>         if (!ptr_ring_empty(&tfile->tx_ring))
>                 mask |= EPOLLIN | EPOLLRDNORM;
> 
> -       if (tun->dev->flags & IFF_UP &&
> -           (sock_writeable(sk) ||
> -            (!test_and_set_bit(SOCKWQ_ASYNC_NOSPACE, &sk->sk_socket->flags)
> &&
> -             sock_writeable(sk))))
> +       if (tun_sock_writeable(tun, tfile) ||
> +           (!test_and_set_bit(SOCKWQ_ASYNC_NOSPACE, &sk->sk_socket->flags)
> &&
> +            tun_sock_writeable(tun, tfile)));
>                 mask |= EPOLLOUT | EPOLLWRNORM;
> 
>         if (tun->dev->reg_state != NETREG_REGISTERED)
> 
> Does this make more sense?
> 
> Thanks

I just understood the motivation for doing it on EIO.
Maybe there's a reason it makes sense here as well,
but it's far from obvious. I suggest you repost adding
an explanation in the comment. The original patch will
be fine with an explanation as well.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] tuntap: raise EPOLLOUT on device up
  2018-05-22  3:46       ` Michael S. Tsirkin
@ 2018-05-22  4:00         ` Jason Wang
  0 siblings, 0 replies; 15+ messages in thread
From: Jason Wang @ 2018-05-22  4:00 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: David Miller, netdev, linux-kernel, hannes, edumazet



On 2018年05月22日 11:46, Michael S. Tsirkin wrote:
> On Tue, May 22, 2018 at 11:22:11AM +0800, Jason Wang wrote:
>>
>> On 2018年05月22日 06:08, Michael S. Tsirkin wrote:
>>> On Mon, May 21, 2018 at 11:47:42AM -0400, David Miller wrote:
>>>> From: Jason Wang <jasowang@redhat.com>
>>>> Date: Fri, 18 May 2018 21:00:43 +0800
>>>>
>>>>> We return -EIO on device down but can not raise EPOLLOUT after it was
>>>>> up. This may confuse user like vhost which expects tuntap to raise
>>>>> EPOLLOUT to re-enable its TX routine after tuntap is down. This could
>>>>> be easily reproduced by transmitting packets from VM while down and up
>>>>> the tap device. Fixing this by set SOCKWQ_ASYNC_NOSPACE on -EIO.
>>>>>
>>>>> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
>>>>> Cc: Eric Dumazet <edumazet@google.com>
>>>>> Fixes: 1bd4978a88ac2 ("tun: honor IFF_UP in tun_get_user()")
>>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>> I'm no so sure what to do with this patch.
>>>>
>>>> Like Michael says, this flag bit is only checks upon transmit which
>>>> may or may not happen after this point.  It doesn't seem to be
>>>> guaranteed.
>> The flag is checked in tun_chr_poll() as well.
>>
>>> Jason, can't we detect a link up transition and respond accordingly?
>>> What do you think?
>>>
>> I think we've already tried to do this, in tun_net_open() we call
>> write_space(). But the problem is the bit may not be set at that time.
>>
>> A second thought is to set the bit in tun_chr_poll() instead of -EIO like:
>>
>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>> index d45ac37..46a1573 100644
>> --- a/drivers/net/tun.c
>> +++ b/drivers/net/tun.c
>> @@ -1423,6 +1423,13 @@ static void tun_net_init(struct net_device *dev)
>>          dev->max_mtu = MAX_MTU - dev->hard_header_len;
>>   }
>>
>> +static bool tun_sock_writeable(struct tun_struct *tun, struct tun_file
>> *tfile)
>> +{
>> +       struct sock *sk = tfile->socket.sk;
>> +
>> +       return (tun->dev->flags & IFF_UP) && sock_writeable(sk);
>> +}
>> +
>>   /* Character device part */
>>
>>   /* Poll */
>> @@ -1445,10 +1452,9 @@ static __poll_t tun_chr_poll(struct file *file,
>> poll_table *wait)
>>          if (!ptr_ring_empty(&tfile->tx_ring))
>>                  mask |= EPOLLIN | EPOLLRDNORM;
>>
>> -       if (tun->dev->flags & IFF_UP &&
>> -           (sock_writeable(sk) ||
>> -            (!test_and_set_bit(SOCKWQ_ASYNC_NOSPACE, &sk->sk_socket->flags)
>> &&
>> -             sock_writeable(sk))))
>> +       if (tun_sock_writeable(tun, tfile) ||
>> +           (!test_and_set_bit(SOCKWQ_ASYNC_NOSPACE, &sk->sk_socket->flags)
>> &&
>> +            tun_sock_writeable(tun, tfile)));
>>                  mask |= EPOLLOUT | EPOLLWRNORM;
>>
>>          if (tun->dev->reg_state != NETREG_REGISTERED)
>>
>> Does this make more sense?
>>
>> Thanks
> I just understood the motivation for doing it on EIO.
> Maybe there's a reason it makes sense here as well,
> but it's far from obvious. I suggest you repost adding
> an explanation in the comment. The original patch will
> be fine with an explanation as well.
>

Ok, let me add explanation on both code and commit log.

Thanks

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-05-22  4:01 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-18 13:00 [PATCH net] tuntap: raise EPOLLOUT on device up Jason Wang
2018-05-18 13:13 ` Michael S. Tsirkin
2018-05-18 13:26   ` Jason Wang
2018-05-18 14:00     ` Jason Wang
2018-05-18 14:06       ` Michael S. Tsirkin
2018-05-18 14:11         ` Jason Wang
2018-05-18 14:46           ` Michael S. Tsirkin
2018-05-19  1:09             ` Jason Wang
2018-05-21 22:06               ` Michael S. Tsirkin
2018-05-21 15:47 ` David Miller
2018-05-21 22:08   ` Michael S. Tsirkin
2018-05-22  3:22     ` Jason Wang
2018-05-22  3:45       ` Michael S. Tsirkin
2018-05-22  3:46       ` Michael S. Tsirkin
2018-05-22  4:00         ` Jason Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).