Netdev Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH v2] ixgbe: Fix NULL pointer dereference in ixgbe_xdp_setup
@ 2021-09-03  6:40 Feng zhou
  2021-09-06  6:37 ` [Intel-wired-lan] " Paul Menzel
  0 siblings, 1 reply; 5+ messages in thread
From: Feng zhou @ 2021-09-03  6:40 UTC (permalink / raw)
  To: jesse.brandeburg, anthony.l.nguyen, davem, kuba, ast, daniel,
	hawk, john.fastabend, jeffrey.t.kirsher, magnus.karlsson,
	maciej.fijalkowski
  Cc: intel-wired-lan, netdev, linux-kernel, bpf, duanxiongchun,
	songmuchun, zhouchengming, chenying.kernel, zhengqi.arch,
	wangdongdong.6, zhoufeng.zf

From: Feng Zhou <zhoufeng.zf@bytedance.com>

The ixgbe driver currently generates a NULL pointer dereference with
some machine (online cpus < 63). This is due to the fact that the
maximum value of num_xdp_queues is nr_cpu_ids. Code is in
"ixgbe_set_rss_queues"".

Here's how the problem repeats itself:
Some machine (online cpus < 63), And user set num_queues to 63 through
ethtool. Code is in the "ixgbe_set_channels",
adapter->ring_feature[RING_F_FDIR].limit = count;
It becames 63.
When user use xdp, "ixgbe_set_rss_queues" will set queues num.
adapter->num_rx_queues = rss_i;
adapter->num_tx_queues = rss_i;
adapter->num_xdp_queues = ixgbe_xdp_queues(adapter);
And rss_i's value is from
f = &adapter->ring_feature[RING_F_FDIR];
rss_i = f->indices = f->limit;
So "num_rx_queues" > "num_xdp_queues", when run to "ixgbe_xdp_setup",
for (i = 0; i < adapter->num_rx_queues; i++)
	if (adapter->xdp_ring[i]->xsk_umem)
lead to panic.
Call trace:
[exception RIP: ixgbe_xdp+368]
RIP: ffffffffc02a76a0  RSP: ffff9fe16202f8d0  RFLAGS: 00010297
RAX: 0000000000000000  RBX: 0000000000000020  RCX: 0000000000000000
RDX: 0000000000000000  RSI: 000000000000001c  RDI: ffffffffa94ead90
RBP: ffff92f8f24c0c18   R8: 0000000000000000   R9: 0000000000000000
R10: ffff9fe16202f830  R11: 0000000000000000  R12: ffff92f8f24c0000
R13: ffff9fe16202fc01  R14: 000000000000000a  R15: ffffffffc02a7530
ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 7 [ffff9fe16202f8f0] dev_xdp_install at ffffffffa89fbbcc
 8 [ffff9fe16202f920] dev_change_xdp_fd at ffffffffa8a08808
 9 [ffff9fe16202f960] do_setlink at ffffffffa8a20235
10 [ffff9fe16202fa88] rtnl_setlink at ffffffffa8a20384
11 [ffff9fe16202fc78] rtnetlink_rcv_msg at ffffffffa8a1a8dd
12 [ffff9fe16202fcf0] netlink_rcv_skb at ffffffffa8a717eb
13 [ffff9fe16202fd40] netlink_unicast at ffffffffa8a70f88
14 [ffff9fe16202fd80] netlink_sendmsg at ffffffffa8a71319
15 [ffff9fe16202fdf0] sock_sendmsg at ffffffffa89df290
16 [ffff9fe16202fe08] __sys_sendto at ffffffffa89e19c8
17 [ffff9fe16202ff30] __x64_sys_sendto at ffffffffa89e1a64
18 [ffff9fe16202ff38] do_syscall_64 at ffffffffa84042b9
19 [ffff9fe16202ff50] entry_SYSCALL_64_after_hwframe at ffffffffa8c0008c

Fixes: 4a9b32f30f80 ("ixgbe: fix potential RX buffer starvation for
AF_XDP")
Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com>
---
Updates since v1:
- Fix "ixgbe_max_channels" callback so that it will not allow a setting of 
queues to be higher than the num_online_cpus().
more details can be seen from here:
https://patchwork.ozlabs.org/project/intel-wired-lan/patch/20210817075407.11961-1-zhoufeng.zf@bytedance.com/
Thanks to Maciej Fijalkowski for your advice.

 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c | 2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    | 8 ++++++--
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
index 4ceaca0f6ce3..21321d164708 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
@@ -3204,7 +3204,7 @@ static unsigned int ixgbe_max_channels(struct ixgbe_adapter *adapter)
 		max_combined = ixgbe_max_rss_indices(adapter);
 	}
 
-	return max_combined;
+	return min_t(int, max_combined, num_online_cpus());
 }
 
 static void ixgbe_get_channels(struct net_device *dev,
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 14aea40da50f..5db496cc5070 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -10112,6 +10112,7 @@ static int ixgbe_xdp_setup(struct net_device *dev, struct bpf_prog *prog)
 	struct ixgbe_adapter *adapter = netdev_priv(dev);
 	struct bpf_prog *old_prog;
 	bool need_reset;
+	int num_queues;
 
 	if (adapter->flags & IXGBE_FLAG_SRIOV_ENABLED)
 		return -EINVAL;
@@ -10161,11 +10162,14 @@ static int ixgbe_xdp_setup(struct net_device *dev, struct bpf_prog *prog)
 	/* Kick start the NAPI context if there is an AF_XDP socket open
 	 * on that queue id. This so that receiving will start.
 	 */
-	if (need_reset && prog)
-		for (i = 0; i < adapter->num_rx_queues; i++)
+	if (need_reset && prog) {
+		num_queues = min_t(int, adapter->num_rx_queues,
+			adapter->num_xdp_queues);
+		for (i = 0; i < num_queues; i++)
 			if (adapter->xdp_ring[i]->xsk_pool)
 				(void)ixgbe_xsk_wakeup(adapter->netdev, i,
 						       XDP_WAKEUP_RX);
+	}
 
 	return 0;
 }
-- 
2.11.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Intel-wired-lan] [PATCH v2] ixgbe: Fix NULL pointer dereference in ixgbe_xdp_setup
  2021-09-03  6:40 [PATCH v2] ixgbe: Fix NULL pointer dereference in ixgbe_xdp_setup Feng zhou
@ 2021-09-06  6:37 ` Paul Menzel
  2021-09-06  7:49   ` [External] " Feng Zhou
  0 siblings, 1 reply; 5+ messages in thread
From: Paul Menzel @ 2021-09-06  6:37 UTC (permalink / raw)
  To: Feng Zhou
  Cc: duanxiongchun, netdev, linux-kernel, zhengqi.arch,
	chenying.kernel, intel-wired-lan, songmuchun, bpf,
	wangdongdong.6, zhouchengming, jesse.brandeburg,
	anthony.l.nguyen, davem, kuba, ast, daniel, hawk, john.fastabend,
	jeffrey.t.kirsher, magnus.karlsson, maciej.fijalkowski

Dear Feng,


Am 03.09.21 um 08:40 schrieb Feng zhou:

(If you care, in your email client, your last name does not start with a 
capital letter.)

> From: Feng Zhou <zhoufeng.zf@bytedance.com>
> 
> The ixgbe driver currently generates a NULL pointer dereference with
> some machine (online cpus < 63). This is due to the fact that the
> maximum value of num_xdp_queues is nr_cpu_ids. Code is in
> "ixgbe_set_rss_queues"".
> 
> Here's how the problem repeats itself:
> Some machine (online cpus < 63), And user set num_queues to 63 through
> ethtool. Code is in the "ixgbe_set_channels",
> adapter->ring_feature[RING_F_FDIR].limit = count;

For better legibility, you might want to indent code (blocks) by four 
spaces and add blank lines around it (also below).

> It becames 63.

becomes

> When user use xdp, "ixgbe_set_rss_queues" will set queues num.
> adapter->num_rx_queues = rss_i;
> adapter->num_tx_queues = rss_i;
> adapter->num_xdp_queues = ixgbe_xdp_queues(adapter);
> And rss_i's value is from
> f = &adapter->ring_feature[RING_F_FDIR];
> rss_i = f->indices = f->limit;
> So "num_rx_queues" > "num_xdp_queues", when run to "ixgbe_xdp_setup",
> for (i = 0; i < adapter->num_rx_queues; i++)
> 	if (adapter->xdp_ring[i]->xsk_umem)
> lead to panic.

lead*s*?

> Call trace:
> [exception RIP: ixgbe_xdp+368]
> RIP: ffffffffc02a76a0  RSP: ffff9fe16202f8d0  RFLAGS: 00010297
> RAX: 0000000000000000  RBX: 0000000000000020  RCX: 0000000000000000
> RDX: 0000000000000000  RSI: 000000000000001c  RDI: ffffffffa94ead90
> RBP: ffff92f8f24c0c18   R8: 0000000000000000   R9: 0000000000000000
> R10: ffff9fe16202f830  R11: 0000000000000000  R12: ffff92f8f24c0000
> R13: ffff9fe16202fc01  R14: 000000000000000a  R15: ffffffffc02a7530
> ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>   7 [ffff9fe16202f8f0] dev_xdp_install at ffffffffa89fbbcc
>   8 [ffff9fe16202f920] dev_change_xdp_fd at ffffffffa8a08808
>   9 [ffff9fe16202f960] do_setlink at ffffffffa8a20235
> 10 [ffff9fe16202fa88] rtnl_setlink at ffffffffa8a20384
> 11 [ffff9fe16202fc78] rtnetlink_rcv_msg at ffffffffa8a1a8dd
> 12 [ffff9fe16202fcf0] netlink_rcv_skb at ffffffffa8a717eb
> 13 [ffff9fe16202fd40] netlink_unicast at ffffffffa8a70f88
> 14 [ffff9fe16202fd80] netlink_sendmsg at ffffffffa8a71319
> 15 [ffff9fe16202fdf0] sock_sendmsg at ffffffffa89df290
> 16 [ffff9fe16202fe08] __sys_sendto at ffffffffa89e19c8
> 17 [ffff9fe16202ff30] __x64_sys_sendto at ffffffffa89e1a64
> 18 [ffff9fe16202ff38] do_syscall_64 at ffffffffa84042b9
> 19 [ffff9fe16202ff50] entry_SYSCALL_64_after_hwframe at ffffffffa8c0008c

Please describe the fix in the commit message.

> Fixes: 4a9b32f30f80 ("ixgbe: fix potential RX buffer starvation for
> AF_XDP")
> Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com>
> ---
> Updates since v1:
> - Fix "ixgbe_max_channels" callback so that it will not allow a setting of
> queues to be higher than the num_online_cpus().
> more details can be seen from here:
> https://patchwork.ozlabs.org/project/intel-wired-lan/patch/20210817075407.11961-1-zhoufeng.zf@bytedance.com/
> Thanks to Maciej Fijalkowski for your advice.
> 
>   drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c | 2 +-
>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    | 8 ++++++--
>   2 files changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
> index 4ceaca0f6ce3..21321d164708 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
> @@ -3204,7 +3204,7 @@ static unsigned int ixgbe_max_channels(struct ixgbe_adapter *adapter)
>   		max_combined = ixgbe_max_rss_indices(adapter);
>   	}
>   
> -	return max_combined;
> +	return min_t(int, max_combined, num_online_cpus());
>   }
>   
>   static void ixgbe_get_channels(struct net_device *dev,
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index 14aea40da50f..5db496cc5070 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -10112,6 +10112,7 @@ static int ixgbe_xdp_setup(struct net_device *dev, struct bpf_prog *prog)
>   	struct ixgbe_adapter *adapter = netdev_priv(dev);
>   	struct bpf_prog *old_prog;
>   	bool need_reset;
> +	int num_queues;
>   
>   	if (adapter->flags & IXGBE_FLAG_SRIOV_ENABLED)
>   		return -EINVAL;
> @@ -10161,11 +10162,14 @@ static int ixgbe_xdp_setup(struct net_device *dev, struct bpf_prog *prog)
>   	/* Kick start the NAPI context if there is an AF_XDP socket open
>   	 * on that queue id. This so that receiving will start.
>   	 */
> -	if (need_reset && prog)
> -		for (i = 0; i < adapter->num_rx_queues; i++)
> +	if (need_reset && prog) {
> +		num_queues = min_t(int, adapter->num_rx_queues,
> +			adapter->num_xdp_queues);
> +		for (i = 0; i < num_queues; i++)
>   			if (adapter->xdp_ring[i]->xsk_pool)
>   				(void)ixgbe_xsk_wakeup(adapter->netdev, i,
>   						       XDP_WAKEUP_RX);
> +	}
>   
>   	return 0;
>   }
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [External] Re: [Intel-wired-lan] [PATCH v2] ixgbe: Fix NULL pointer dereference in ixgbe_xdp_setup
  2021-09-06  6:37 ` [Intel-wired-lan] " Paul Menzel
@ 2021-09-06  7:49   ` Feng Zhou
  2021-09-06 11:31     ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 5+ messages in thread
From: Feng Zhou @ 2021-09-06  7:49 UTC (permalink / raw)
  To: Paul Menzel
  Cc: duanxiongchun, netdev, linux-kernel, zhengqi.arch,
	chenying.kernel, intel-wired-lan, songmuchun, bpf,
	wangdongdong.6, zhouchengming, jesse.brandeburg,
	anthony.l.nguyen, davem, kuba, ast, daniel, hawk, john.fastabend,
	jeffrey.t.kirsher, magnus.karlsson, maciej.fijalkowski


在 2021/9/6 下午2:37, Paul Menzel 写道:
> Dear Feng,
>
>
> Am 03.09.21 um 08:40 schrieb Feng zhou:
>
> (If you care, in your email client, your last name does not start with 
> a capital letter.)
>
>> From: Feng Zhou <zhoufeng.zf@bytedance.com>
>>
>> The ixgbe driver currently generates a NULL pointer dereference with
>> some machine (online cpus < 63). This is due to the fact that the
>> maximum value of num_xdp_queues is nr_cpu_ids. Code is in
>> "ixgbe_set_rss_queues"".
>>
>> Here's how the problem repeats itself:
>> Some machine (online cpus < 63), And user set num_queues to 63 through
>> ethtool. Code is in the "ixgbe_set_channels",
>> adapter->ring_feature[RING_F_FDIR].limit = count;
>
> For better legibility, you might want to indent code (blocks) by four 
> spaces and add blank lines around it (also below).
>
>> It becames 63.
>
> becomes
>
>> When user use xdp, "ixgbe_set_rss_queues" will set queues num.
>> adapter->num_rx_queues = rss_i;
>> adapter->num_tx_queues = rss_i;
>> adapter->num_xdp_queues = ixgbe_xdp_queues(adapter);
>> And rss_i's value is from
>> f = &adapter->ring_feature[RING_F_FDIR];
>> rss_i = f->indices = f->limit;
>> So "num_rx_queues" > "num_xdp_queues", when run to "ixgbe_xdp_setup",
>> for (i = 0; i < adapter->num_rx_queues; i++)
>>     if (adapter->xdp_ring[i]->xsk_umem)
>> lead to panic.
>
> lead*s*?
>
>> Call trace:
>> [exception RIP: ixgbe_xdp+368]
>> RIP: ffffffffc02a76a0  RSP: ffff9fe16202f8d0  RFLAGS: 00010297
>> RAX: 0000000000000000  RBX: 0000000000000020  RCX: 0000000000000000
>> RDX: 0000000000000000  RSI: 000000000000001c  RDI: ffffffffa94ead90
>> RBP: ffff92f8f24c0c18   R8: 0000000000000000   R9: 0000000000000000
>> R10: ffff9fe16202f830  R11: 0000000000000000  R12: ffff92f8f24c0000
>> R13: ffff9fe16202fc01  R14: 000000000000000a  R15: ffffffffc02a7530
>> ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>>   7 [ffff9fe16202f8f0] dev_xdp_install at ffffffffa89fbbcc
>>   8 [ffff9fe16202f920] dev_change_xdp_fd at ffffffffa8a08808
>>   9 [ffff9fe16202f960] do_setlink at ffffffffa8a20235
>> 10 [ffff9fe16202fa88] rtnl_setlink at ffffffffa8a20384
>> 11 [ffff9fe16202fc78] rtnetlink_rcv_msg at ffffffffa8a1a8dd
>> 12 [ffff9fe16202fcf0] netlink_rcv_skb at ffffffffa8a717eb
>> 13 [ffff9fe16202fd40] netlink_unicast at ffffffffa8a70f88
>> 14 [ffff9fe16202fd80] netlink_sendmsg at ffffffffa8a71319
>> 15 [ffff9fe16202fdf0] sock_sendmsg at ffffffffa89df290
>> 16 [ffff9fe16202fe08] __sys_sendto at ffffffffa89e19c8
>> 17 [ffff9fe16202ff30] __x64_sys_sendto at ffffffffa89e1a64
>> 18 [ffff9fe16202ff38] do_syscall_64 at ffffffffa84042b9
>> 19 [ffff9fe16202ff50] entry_SYSCALL_64_after_hwframe at ffffffffa8c0008c
>
> Please describe the fix in the commit message.
>
>> Fixes: 4a9b32f30f80 ("ixgbe: fix potential RX buffer starvation for
>> AF_XDP")
>> Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com>
>> ---
>> Updates since v1:
>> - Fix "ixgbe_max_channels" callback so that it will not allow a 
>> setting of
>> queues to be higher than the num_online_cpus().
>> more details can be seen from here:
>> https://patchwork.ozlabs.org/project/intel-wired-lan/patch/20210817075407.11961-1-zhoufeng.zf@bytedance.com/ 
>>
>> Thanks to Maciej Fijalkowski for your advice.
>>
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c | 2 +-
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    | 8 ++++++--
>>   2 files changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c 
>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>> index 4ceaca0f6ce3..21321d164708 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>> @@ -3204,7 +3204,7 @@ static unsigned int ixgbe_max_channels(struct 
>> ixgbe_adapter *adapter)
>>           max_combined = ixgbe_max_rss_indices(adapter);
>>       }
>>   -    return max_combined;
>> +    return min_t(int, max_combined, num_online_cpus());
>>   }
>>     static void ixgbe_get_channels(struct net_device *dev,
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c 
>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> index 14aea40da50f..5db496cc5070 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> @@ -10112,6 +10112,7 @@ static int ixgbe_xdp_setup(struct net_device 
>> *dev, struct bpf_prog *prog)
>>       struct ixgbe_adapter *adapter = netdev_priv(dev);
>>       struct bpf_prog *old_prog;
>>       bool need_reset;
>> +    int num_queues;
>>         if (adapter->flags & IXGBE_FLAG_SRIOV_ENABLED)
>>           return -EINVAL;
>> @@ -10161,11 +10162,14 @@ static int ixgbe_xdp_setup(struct 
>> net_device *dev, struct bpf_prog *prog)
>>       /* Kick start the NAPI context if there is an AF_XDP socket open
>>        * on that queue id. This so that receiving will start.
>>        */
>> -    if (need_reset && prog)
>> -        for (i = 0; i < adapter->num_rx_queues; i++)
>> +    if (need_reset && prog) {
>> +        num_queues = min_t(int, adapter->num_rx_queues,
>> +            adapter->num_xdp_queues);
>> +        for (i = 0; i < num_queues; i++)
>>               if (adapter->xdp_ring[i]->xsk_pool)
>>                   (void)ixgbe_xsk_wakeup(adapter->netdev, i,
>>                                  XDP_WAKEUP_RX);
>> +    }
>>         return 0;
>>   }
>>
Thanks for your advice. I will modify the commit message in v3


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [External] Re: [Intel-wired-lan] [PATCH v2] ixgbe: Fix NULL pointer dereference in ixgbe_xdp_setup
  2021-09-06  7:49   ` [External] " Feng Zhou
@ 2021-09-06 11:31     ` Jesper Dangaard Brouer
  2021-09-08  2:36       ` Jason Xing
  0 siblings, 1 reply; 5+ messages in thread
From: Jesper Dangaard Brouer @ 2021-09-06 11:31 UTC (permalink / raw)
  To: Feng Zhou, Paul Menzel, Jason Xing
  Cc: brouer, duanxiongchun, netdev, linux-kernel, zhengqi.arch,
	chenying.kernel, intel-wired-lan, songmuchun, bpf,
	wangdongdong.6, zhouchengming, jesse.brandeburg,
	anthony.l.nguyen, davem, kuba, ast, daniel, hawk, john.fastabend,
	jeffrey.t.kirsher, magnus.karlsson, maciej.fijalkowski,
	Tony Nguyen

Hi Feng and Jason,

Please notice that you are both developing patches that change the ixgbe 
driver in related areas.

Jason's patch:
  Subject: [PATCH v7] ixgbe: let the xdpdrv work with more than 64 cpus
 
https://lore.kernel.org/all/20210901101206.50274-1-kerneljasonxing@gmail.com/

We might need both as this patch looks like a fix to a panic, and 
Jason's patch allows XDP on ixgbe to work on machines with more than 64 
CPUs.

-Jesper

On 06/09/2021 09.49, Feng Zhou wrote:
> 
> 在 2021/9/6 下午2:37, Paul Menzel 写道:
>> Dear Feng,
>>
>>
>> Am 03.09.21 um 08:40 schrieb Feng zhou:
>>
>> (If you care, in your email client, your last name does not start with 
>> a capital letter.)
>>
>>> From: Feng Zhou <zhoufeng.zf@bytedance.com>
>>>
>>> The ixgbe driver currently generates a NULL pointer dereference with
>>> some machine (online cpus < 63). This is due to the fact that the
>>> maximum value of num_xdp_queues is nr_cpu_ids. Code is in
>>> "ixgbe_set_rss_queues"".
>>>
>>> Here's how the problem repeats itself:
>>> Some machine (online cpus < 63), And user set num_queues to 63 through
>>> ethtool. Code is in the "ixgbe_set_channels",
>>> adapter->ring_feature[RING_F_FDIR].limit = count;
>>
>> For better legibility, you might want to indent code (blocks) by four 
>> spaces and add blank lines around it (also below).
>>
>>> It becames 63.
>>
>> becomes
>>
>>> When user use xdp, "ixgbe_set_rss_queues" will set queues num.
>>> adapter->num_rx_queues = rss_i;
>>> adapter->num_tx_queues = rss_i;
>>> adapter->num_xdp_queues = ixgbe_xdp_queues(adapter);
>>> And rss_i's value is from
>>> f = &adapter->ring_feature[RING_F_FDIR];
>>> rss_i = f->indices = f->limit;
>>> So "num_rx_queues" > "num_xdp_queues", when run to "ixgbe_xdp_setup",
>>> for (i = 0; i < adapter->num_rx_queues; i++)
>>>     if (adapter->xdp_ring[i]->xsk_umem)
>>> lead to panic.
>>
>> lead*s*?
>>
>>> Call trace:
>>> [exception RIP: ixgbe_xdp+368]
>>> RIP: ffffffffc02a76a0  RSP: ffff9fe16202f8d0  RFLAGS: 00010297
>>> RAX: 0000000000000000  RBX: 0000000000000020  RCX: 0000000000000000
>>> RDX: 0000000000000000  RSI: 000000000000001c  RDI: ffffffffa94ead90
>>> RBP: ffff92f8f24c0c18   R8: 0000000000000000   R9: 0000000000000000
>>> R10: ffff9fe16202f830  R11: 0000000000000000  R12: ffff92f8f24c0000
>>> R13: ffff9fe16202fc01  R14: 000000000000000a  R15: ffffffffc02a7530
>>> ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>>>   7 [ffff9fe16202f8f0] dev_xdp_install at ffffffffa89fbbcc
>>>   8 [ffff9fe16202f920] dev_change_xdp_fd at ffffffffa8a08808
>>>   9 [ffff9fe16202f960] do_setlink at ffffffffa8a20235
>>> 10 [ffff9fe16202fa88] rtnl_setlink at ffffffffa8a20384
>>> 11 [ffff9fe16202fc78] rtnetlink_rcv_msg at ffffffffa8a1a8dd
>>> 12 [ffff9fe16202fcf0] netlink_rcv_skb at ffffffffa8a717eb
>>> 13 [ffff9fe16202fd40] netlink_unicast at ffffffffa8a70f88
>>> 14 [ffff9fe16202fd80] netlink_sendmsg at ffffffffa8a71319
>>> 15 [ffff9fe16202fdf0] sock_sendmsg at ffffffffa89df290
>>> 16 [ffff9fe16202fe08] __sys_sendto at ffffffffa89e19c8
>>> 17 [ffff9fe16202ff30] __x64_sys_sendto at ffffffffa89e1a64
>>> 18 [ffff9fe16202ff38] do_syscall_64 at ffffffffa84042b9
>>> 19 [ffff9fe16202ff50] entry_SYSCALL_64_after_hwframe at ffffffffa8c0008c
>>
>> Please describe the fix in the commit message.
>>
>>> Fixes: 4a9b32f30f80 ("ixgbe: fix potential RX buffer starvation for
>>> AF_XDP")
>>> Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com>
>>> ---
>>> Updates since v1:
>>> - Fix "ixgbe_max_channels" callback so that it will not allow a 
>>> setting of
>>> queues to be higher than the num_online_cpus().
>>> more details can be seen from here:
>>> https://patchwork.ozlabs.org/project/intel-wired-lan/patch/20210817075407.11961-1-zhoufeng.zf@bytedance.com/ 
>>>
>>> Thanks to Maciej Fijalkowski for your advice.
>>>
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c | 2 +-
>>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    | 8 ++++++--
>>>   2 files changed, 7 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c 
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>>> index 4ceaca0f6ce3..21321d164708 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>>> @@ -3204,7 +3204,7 @@ static unsigned int ixgbe_max_channels(struct 
>>> ixgbe_adapter *adapter)
>>>           max_combined = ixgbe_max_rss_indices(adapter);
>>>       }
>>>   -    return max_combined;
>>> +    return min_t(int, max_combined, num_online_cpus());
>>>   }
>>>     static void ixgbe_get_channels(struct net_device *dev,
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c 
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> index 14aea40da50f..5db496cc5070 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> @@ -10112,6 +10112,7 @@ static int ixgbe_xdp_setup(struct net_device 
>>> *dev, struct bpf_prog *prog)
>>>       struct ixgbe_adapter *adapter = netdev_priv(dev);
>>>       struct bpf_prog *old_prog;
>>>       bool need_reset;
>>> +    int num_queues;
>>>         if (adapter->flags & IXGBE_FLAG_SRIOV_ENABLED)
>>>           return -EINVAL;
>>> @@ -10161,11 +10162,14 @@ static int ixgbe_xdp_setup(struct 
>>> net_device *dev, struct bpf_prog *prog)
>>>       /* Kick start the NAPI context if there is an AF_XDP socket open
>>>        * on that queue id. This so that receiving will start.
>>>        */
>>> -    if (need_reset && prog)
>>> -        for (i = 0; i < adapter->num_rx_queues; i++)
>>> +    if (need_reset && prog) {
>>> +        num_queues = min_t(int, adapter->num_rx_queues,
>>> +            adapter->num_xdp_queues);
>>> +        for (i = 0; i < num_queues; i++)
>>>               if (adapter->xdp_ring[i]->xsk_pool)
>>>                   (void)ixgbe_xsk_wakeup(adapter->netdev, i,
>>>                                  XDP_WAKEUP_RX);
>>> +    }
>>>         return 0;
>>>   }
>>>
> Thanks for your advice. I will modify the commit message in v3
> 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [External] Re: [Intel-wired-lan] [PATCH v2] ixgbe: Fix NULL pointer dereference in ixgbe_xdp_setup
  2021-09-06 11:31     ` Jesper Dangaard Brouer
@ 2021-09-08  2:36       ` Jason Xing
  0 siblings, 0 replies; 5+ messages in thread
From: Jason Xing @ 2021-09-08  2:36 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Feng Zhou, Paul Menzel, Jason Xing, brouer, duanxiongchun,
	netdev, LKML, zhengqi.arch, chenying.kernel, intel-wired-lan,
	songmuchun, bpf, wangdongdong.6, zhouchengming, Jesse Brandeburg,
	David Miller, kuba, ast, daniel, hawk, john.fastabend,
	jeffrey.t.kirsher, magnus.karlsson, Maciej Fijalkowski,
	Tony Nguyen

On Mon, Sep 6, 2021 at 7:32 PM Jesper Dangaard Brouer
<jbrouer@redhat.com> wrote:
>
> Hi Feng and Jason,
>
> Please notice that you are both developing patches that change the ixgbe
> driver in related areas.

Thanks for noticing. We're doing different things as they are both
related to XDP on ixgbe actually.

Jason

>
> Jason's patch:
>   Subject: [PATCH v7] ixgbe: let the xdpdrv work with more than 64 cpus
>
> https://lore.kernel.org/all/20210901101206.50274-1-kerneljasonxing@gmail.com/
>
> We might need both as this patch looks like a fix to a panic, and
> Jason's patch allows XDP on ixgbe to work on machines with more than 64
> CPUs.
>
> -Jesper
>
> On 06/09/2021 09.49, Feng Zhou wrote:
> >
> > 在 2021/9/6 下午2:37, Paul Menzel 写道:
> >> Dear Feng,
> >>
> >>
> >> Am 03.09.21 um 08:40 schrieb Feng zhou:
> >>
> >> (If you care, in your email client, your last name does not start with
> >> a capital letter.)
> >>
> >>> From: Feng Zhou <zhoufeng.zf@bytedance.com>
> >>>
> >>> The ixgbe driver currently generates a NULL pointer dereference with
> >>> some machine (online cpus < 63). This is due to the fact that the
> >>> maximum value of num_xdp_queues is nr_cpu_ids. Code is in
> >>> "ixgbe_set_rss_queues"".
> >>>
> >>> Here's how the problem repeats itself:
> >>> Some machine (online cpus < 63), And user set num_queues to 63 through
> >>> ethtool. Code is in the "ixgbe_set_channels",
> >>> adapter->ring_feature[RING_F_FDIR].limit = count;
> >>
> >> For better legibility, you might want to indent code (blocks) by four
> >> spaces and add blank lines around it (also below).
> >>
> >>> It becames 63.
> >>
> >> becomes
> >>
> >>> When user use xdp, "ixgbe_set_rss_queues" will set queues num.
> >>> adapter->num_rx_queues = rss_i;
> >>> adapter->num_tx_queues = rss_i;
> >>> adapter->num_xdp_queues = ixgbe_xdp_queues(adapter);
> >>> And rss_i's value is from
> >>> f = &adapter->ring_feature[RING_F_FDIR];
> >>> rss_i = f->indices = f->limit;
> >>> So "num_rx_queues" > "num_xdp_queues", when run to "ixgbe_xdp_setup",
> >>> for (i = 0; i < adapter->num_rx_queues; i++)
> >>>     if (adapter->xdp_ring[i]->xsk_umem)
> >>> lead to panic.
> >>
> >> lead*s*?
> >>
> >>> Call trace:
> >>> [exception RIP: ixgbe_xdp+368]
> >>> RIP: ffffffffc02a76a0  RSP: ffff9fe16202f8d0  RFLAGS: 00010297
> >>> RAX: 0000000000000000  RBX: 0000000000000020  RCX: 0000000000000000
> >>> RDX: 0000000000000000  RSI: 000000000000001c  RDI: ffffffffa94ead90
> >>> RBP: ffff92f8f24c0c18   R8: 0000000000000000   R9: 0000000000000000
> >>> R10: ffff9fe16202f830  R11: 0000000000000000  R12: ffff92f8f24c0000
> >>> R13: ffff9fe16202fc01  R14: 000000000000000a  R15: ffffffffc02a7530
> >>> ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> >>>   7 [ffff9fe16202f8f0] dev_xdp_install at ffffffffa89fbbcc
> >>>   8 [ffff9fe16202f920] dev_change_xdp_fd at ffffffffa8a08808
> >>>   9 [ffff9fe16202f960] do_setlink at ffffffffa8a20235
> >>> 10 [ffff9fe16202fa88] rtnl_setlink at ffffffffa8a20384
> >>> 11 [ffff9fe16202fc78] rtnetlink_rcv_msg at ffffffffa8a1a8dd
> >>> 12 [ffff9fe16202fcf0] netlink_rcv_skb at ffffffffa8a717eb
> >>> 13 [ffff9fe16202fd40] netlink_unicast at ffffffffa8a70f88
> >>> 14 [ffff9fe16202fd80] netlink_sendmsg at ffffffffa8a71319
> >>> 15 [ffff9fe16202fdf0] sock_sendmsg at ffffffffa89df290
> >>> 16 [ffff9fe16202fe08] __sys_sendto at ffffffffa89e19c8
> >>> 17 [ffff9fe16202ff30] __x64_sys_sendto at ffffffffa89e1a64
> >>> 18 [ffff9fe16202ff38] do_syscall_64 at ffffffffa84042b9
> >>> 19 [ffff9fe16202ff50] entry_SYSCALL_64_after_hwframe at ffffffffa8c0008c
> >>
> >> Please describe the fix in the commit message.
> >>
> >>> Fixes: 4a9b32f30f80 ("ixgbe: fix potential RX buffer starvation for
> >>> AF_XDP")
> >>> Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com>
> >>> ---
> >>> Updates since v1:
> >>> - Fix "ixgbe_max_channels" callback so that it will not allow a
> >>> setting of
> >>> queues to be higher than the num_online_cpus().
> >>> more details can be seen from here:
> >>> https://patchwork.ozlabs.org/project/intel-wired-lan/patch/20210817075407.11961-1-zhoufeng.zf@bytedance.com/
> >>>
> >>> Thanks to Maciej Fijalkowski for your advice.
> >>>
> >>>   drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c | 2 +-
> >>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    | 8 ++++++--
> >>>   2 files changed, 7 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
> >>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
> >>> index 4ceaca0f6ce3..21321d164708 100644
> >>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
> >>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
> >>> @@ -3204,7 +3204,7 @@ static unsigned int ixgbe_max_channels(struct
> >>> ixgbe_adapter *adapter)
> >>>           max_combined = ixgbe_max_rss_indices(adapter);
> >>>       }
> >>>   -    return max_combined;
> >>> +    return min_t(int, max_combined, num_online_cpus());
> >>>   }
> >>>     static void ixgbe_get_channels(struct net_device *dev,
> >>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> >>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> >>> index 14aea40da50f..5db496cc5070 100644
> >>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> >>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> >>> @@ -10112,6 +10112,7 @@ static int ixgbe_xdp_setup(struct net_device
> >>> *dev, struct bpf_prog *prog)
> >>>       struct ixgbe_adapter *adapter = netdev_priv(dev);
> >>>       struct bpf_prog *old_prog;
> >>>       bool need_reset;
> >>> +    int num_queues;
> >>>         if (adapter->flags & IXGBE_FLAG_SRIOV_ENABLED)
> >>>           return -EINVAL;
> >>> @@ -10161,11 +10162,14 @@ static int ixgbe_xdp_setup(struct
> >>> net_device *dev, struct bpf_prog *prog)
> >>>       /* Kick start the NAPI context if there is an AF_XDP socket open
> >>>        * on that queue id. This so that receiving will start.
> >>>        */
> >>> -    if (need_reset && prog)
> >>> -        for (i = 0; i < adapter->num_rx_queues; i++)
> >>> +    if (need_reset && prog) {
> >>> +        num_queues = min_t(int, adapter->num_rx_queues,
> >>> +            adapter->num_xdp_queues);
> >>> +        for (i = 0; i < num_queues; i++)
> >>>               if (adapter->xdp_ring[i]->xsk_pool)
> >>>                   (void)ixgbe_xsk_wakeup(adapter->netdev, i,
> >>>                                  XDP_WAKEUP_RX);
> >>> +    }
> >>>         return 0;
> >>>   }
> >>>
> > Thanks for your advice. I will modify the commit message in v3
> >
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-09-08  2:37 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-03  6:40 [PATCH v2] ixgbe: Fix NULL pointer dereference in ixgbe_xdp_setup Feng zhou
2021-09-06  6:37 ` [Intel-wired-lan] " Paul Menzel
2021-09-06  7:49   ` [External] " Feng Zhou
2021-09-06 11:31     ` Jesper Dangaard Brouer
2021-09-08  2:36       ` Jason Xing

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).