Netdev Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH net 0/4] sfc: more EF100 fixes
@ 2020-08-18 12:41 Edward Cree
2020-08-18 12:43 ` [PATCH net 1/4] sfc: really check hash is valid before using it Edward Cree
` (4 more replies)
0 siblings, 5 replies; 11+ messages in thread
From: Edward Cree @ 2020-08-18 12:41 UTC (permalink / raw)
To: linux-net-drivers, davem; +Cc: netdev
Fix up some bugs in the initial EF100 submission, and re-fix
the hash_valid fix which was incomplete.
The reset bugs are currently hard to trigger; they were found
with an in-progress patch adding ethtool support, whereby
ethtool --reset reliably reproduces them.
Edward Cree (4):
sfc: really check hash is valid before using it
sfc: take correct lock in ef100_reset()
sfc: null out channel->rps_flow_id after freeing it
sfc: don't free_irq()s if they were never requested
drivers/net/ethernet/sfc/ef100_nic.c | 10 ++++++----
drivers/net/ethernet/sfc/net_driver.h | 2 ++
drivers/net/ethernet/sfc/nic.c | 4 ++++
drivers/net/ethernet/sfc/rx_common.c | 1 +
4 files changed, 13 insertions(+), 4 deletions(-)
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH net 1/4] sfc: really check hash is valid before using it
2020-08-18 12:41 [PATCH net 0/4] sfc: more EF100 fixes Edward Cree
@ 2020-08-18 12:43 ` Edward Cree
2020-08-18 18:54 ` Jesse Brandeburg
2020-08-18 12:43 ` [PATCH net 2/4] sfc: take correct lock in ef100_reset() Edward Cree
` (3 subsequent siblings)
4 siblings, 1 reply; 11+ messages in thread
From: Edward Cree @ 2020-08-18 12:43 UTC (permalink / raw)
To: linux-net-drivers, davem; +Cc: netdev
Actually hook up the .rx_buf_hash_valid method in EF100's nic_type.
Fixes: 068885434ccb ("sfc: check hash is valid before using it")
Reported-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
---
drivers/net/ethernet/sfc/ef100_nic.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/sfc/ef100_nic.c b/drivers/net/ethernet/sfc/ef100_nic.c
index 206d70f9d95b..b8a7e9ed7913 100644
--- a/drivers/net/ethernet/sfc/ef100_nic.c
+++ b/drivers/net/ethernet/sfc/ef100_nic.c
@@ -739,6 +739,7 @@ const struct efx_nic_type ef100_pf_nic_type = {
.rx_remove = efx_mcdi_rx_remove,
.rx_write = ef100_rx_write,
.rx_packet = __ef100_rx_packet,
+ .rx_buf_hash_valid = ef100_rx_buf_hash_valid,
.fini_dmaq = efx_fini_dmaq,
.max_rx_ip_filters = EFX_MCDI_FILTER_TBL_ROWS,
.filter_table_probe = ef100_filter_table_up,
@@ -820,6 +821,7 @@ const struct efx_nic_type ef100_vf_nic_type = {
.rx_remove = efx_mcdi_rx_remove,
.rx_write = ef100_rx_write,
.rx_packet = __ef100_rx_packet,
+ .rx_buf_hash_valid = ef100_rx_buf_hash_valid,
.fini_dmaq = efx_fini_dmaq,
.max_rx_ip_filters = EFX_MCDI_FILTER_TBL_ROWS,
.filter_table_probe = ef100_filter_table_up,
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH net 2/4] sfc: take correct lock in ef100_reset()
2020-08-18 12:41 [PATCH net 0/4] sfc: more EF100 fixes Edward Cree
2020-08-18 12:43 ` [PATCH net 1/4] sfc: really check hash is valid before using it Edward Cree
@ 2020-08-18 12:43 ` Edward Cree
2020-08-18 18:55 ` Jesse Brandeburg
2020-08-18 12:44 ` [PATCH net 3/4] sfc: null out channel->rps_flow_id after freeing it Edward Cree
` (2 subsequent siblings)
4 siblings, 1 reply; 11+ messages in thread
From: Edward Cree @ 2020-08-18 12:43 UTC (permalink / raw)
To: linux-net-drivers, davem; +Cc: netdev
When downing and upping the ef100 filter table, we need to take a write
lock on efx->filter_sem, not just a read lock, because we may kfree()
the table pointers.
Without this, resets cause a WARN_ON from efx_rwsem_assert_write_locked().
Fixes: a9dc3d5612ce ("sfc_ef100: RX filter table management and related gubbins")
Signed-off-by: Edward Cree <ecree@solarflare.com>
---
drivers/net/ethernet/sfc/ef100_nic.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/sfc/ef100_nic.c b/drivers/net/ethernet/sfc/ef100_nic.c
index b8a7e9ed7913..19fe86b3b316 100644
--- a/drivers/net/ethernet/sfc/ef100_nic.c
+++ b/drivers/net/ethernet/sfc/ef100_nic.c
@@ -431,18 +431,18 @@ static int ef100_reset(struct efx_nic *efx, enum reset_type reset_type)
/* A RESET_TYPE_ALL will cause filters to be removed, so we remove filters
* and reprobe after reset to avoid removing filters twice
*/
- down_read(&efx->filter_sem);
+ down_write(&efx->filter_sem);
ef100_filter_table_down(efx);
- up_read(&efx->filter_sem);
+ up_write(&efx->filter_sem);
rc = efx_mcdi_reset(efx, reset_type);
if (rc)
return rc;
netif_device_attach(efx->net_dev);
- down_read(&efx->filter_sem);
+ down_write(&efx->filter_sem);
rc = ef100_filter_table_up(efx);
- up_read(&efx->filter_sem);
+ up_write(&efx->filter_sem);
if (rc)
return rc;
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH net 3/4] sfc: null out channel->rps_flow_id after freeing it
2020-08-18 12:41 [PATCH net 0/4] sfc: more EF100 fixes Edward Cree
2020-08-18 12:43 ` [PATCH net 1/4] sfc: really check hash is valid before using it Edward Cree
2020-08-18 12:43 ` [PATCH net 2/4] sfc: take correct lock in ef100_reset() Edward Cree
@ 2020-08-18 12:44 ` Edward Cree
2020-08-18 18:58 ` Jesse Brandeburg
2020-08-18 12:44 ` [PATCH net 4/4] sfc: don't free_irq()s if they were never requested Edward Cree
2020-08-18 19:49 ` [PATCH net 0/4] sfc: more EF100 fixes David Miller
4 siblings, 1 reply; 11+ messages in thread
From: Edward Cree @ 2020-08-18 12:44 UTC (permalink / raw)
To: linux-net-drivers, davem; +Cc: netdev
If an ef100_net_open() fails, ef100_net_stop() may be called without
channel->rps_flow_id having been written; thus it may hold the address
freed by a previous ef100_net_stop()'s call to efx_remove_filters().
This then causes a double-free when efx_remove_filters() is called
again, leading to a panic.
To prevent this, after freeing it, overwrite it with NULL.
Fixes: a9dc3d5612ce ("sfc_ef100: RX filter table management and related gubbins")
Signed-off-by: Edward Cree <ecree@solarflare.com>
---
drivers/net/ethernet/sfc/rx_common.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/sfc/rx_common.c b/drivers/net/ethernet/sfc/rx_common.c
index ef9bca92b0b7..5e29284c89c9 100644
--- a/drivers/net/ethernet/sfc/rx_common.c
+++ b/drivers/net/ethernet/sfc/rx_common.c
@@ -849,6 +849,7 @@ void efx_remove_filters(struct efx_nic *efx)
efx_for_each_channel(channel, efx) {
cancel_delayed_work_sync(&channel->filter_work);
kfree(channel->rps_flow_id);
+ channel->rps_flow_id = NULL;
}
#endif
down_write(&efx->filter_sem);
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH net 4/4] sfc: don't free_irq()s if they were never requested
2020-08-18 12:41 [PATCH net 0/4] sfc: more EF100 fixes Edward Cree
` (2 preceding siblings ...)
2020-08-18 12:44 ` [PATCH net 3/4] sfc: null out channel->rps_flow_id after freeing it Edward Cree
@ 2020-08-18 12:44 ` Edward Cree
2020-08-18 19:03 ` Jesse Brandeburg
2020-08-18 19:49 ` [PATCH net 0/4] sfc: more EF100 fixes David Miller
4 siblings, 1 reply; 11+ messages in thread
From: Edward Cree @ 2020-08-18 12:44 UTC (permalink / raw)
To: linux-net-drivers, davem; +Cc: netdev
If efx_nic_init_interrupt fails, or was never run (e.g. due to an earlier
failure in ef100_net_open), freeing irqs in efx_nic_fini_interrupt is not
needed and will cause error messages and stack traces.
So instead, only do this if efx_nic_init_interrupt successfully completed,
as indicated by the new efx->irqs_hooked flag.
Fixes: 965b549f3c20 ("sfc_ef100: implement ndo_open/close and EVQ probing")
Signed-off-by: Edward Cree <ecree@solarflare.com>
---
drivers/net/ethernet/sfc/net_driver.h | 2 ++
drivers/net/ethernet/sfc/nic.c | 4 ++++
2 files changed, 6 insertions(+)
diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index dcb741d8bd11..062462a13847 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -846,6 +846,7 @@ struct efx_async_filter_insertion {
* @timer_quantum_ns: Interrupt timer quantum, in nanoseconds
* @timer_max_ns: Interrupt timer maximum value, in nanoseconds
* @irq_rx_adaptive: Adaptive IRQ moderation enabled for RX event queues
+ * @irqs_hooked: Channel interrupts are hooked
* @irq_rx_mod_step_us: Step size for IRQ moderation for RX event queues
* @irq_rx_moderation_us: IRQ moderation time for RX event queues
* @msg_enable: Log message enable flags
@@ -1004,6 +1005,7 @@ struct efx_nic {
unsigned int timer_quantum_ns;
unsigned int timer_max_ns;
bool irq_rx_adaptive;
+ bool irqs_hooked;
unsigned int irq_mod_step_us;
unsigned int irq_rx_moderation_us;
u32 msg_enable;
diff --git a/drivers/net/ethernet/sfc/nic.c b/drivers/net/ethernet/sfc/nic.c
index d994d136bb03..d1e908846f5d 100644
--- a/drivers/net/ethernet/sfc/nic.c
+++ b/drivers/net/ethernet/sfc/nic.c
@@ -129,6 +129,7 @@ int efx_nic_init_interrupt(struct efx_nic *efx)
#endif
}
+ efx->irqs_hooked = true;
return 0;
fail2:
@@ -154,6 +155,8 @@ void efx_nic_fini_interrupt(struct efx_nic *efx)
efx->net_dev->rx_cpu_rmap = NULL;
#endif
+ if (!efx->irqs_hooked)
+ return;
if (EFX_INT_MODE_USE_MSI(efx)) {
/* Disable MSI/MSI-X interrupts */
efx_for_each_channel(channel, efx)
@@ -163,6 +166,7 @@ void efx_nic_fini_interrupt(struct efx_nic *efx)
/* Disable legacy interrupt */
free_irq(efx->legacy_irq, efx);
}
+ efx->irqs_hooked = false;
}
/* Register dump */
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH net 1/4] sfc: really check hash is valid before using it
2020-08-18 12:43 ` [PATCH net 1/4] sfc: really check hash is valid before using it Edward Cree
@ 2020-08-18 18:54 ` Jesse Brandeburg
0 siblings, 0 replies; 11+ messages in thread
From: Jesse Brandeburg @ 2020-08-18 18:54 UTC (permalink / raw)
To: Edward Cree; +Cc: linux-net-drivers, davem, netdev
On Tue, 18 Aug 2020 13:43:30 +0100
Edward Cree <ecree@solarflare.com> wrote:
> Actually hook up the .rx_buf_hash_valid method in EF100's nic_type.
>
> Fixes: 068885434ccb ("sfc: check hash is valid before using it")
> Reported-by: Martin Habets <mhabets@solarflare.com>
> Signed-off-by: Edward Cree <ecree@solarflare.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net 2/4] sfc: take correct lock in ef100_reset()
2020-08-18 12:43 ` [PATCH net 2/4] sfc: take correct lock in ef100_reset() Edward Cree
@ 2020-08-18 18:55 ` Jesse Brandeburg
0 siblings, 0 replies; 11+ messages in thread
From: Jesse Brandeburg @ 2020-08-18 18:55 UTC (permalink / raw)
To: Edward Cree; +Cc: linux-net-drivers, davem, netdev
On Tue, 18 Aug 2020 13:43:57 +0100
Edward Cree <ecree@solarflare.com> wrote:
> When downing and upping the ef100 filter table, we need to take a
> write lock on efx->filter_sem, not just a read lock, because we may
> kfree() the table pointers.
> Without this, resets cause a WARN_ON from
> efx_rwsem_assert_write_locked().
>
> Fixes: a9dc3d5612ce ("sfc_ef100: RX filter table management and
> related gubbins")
> Signed-off-by: Edward Cree <ecree@solarflare.com>
Fix makes sense
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net 3/4] sfc: null out channel->rps_flow_id after freeing it
2020-08-18 12:44 ` [PATCH net 3/4] sfc: null out channel->rps_flow_id after freeing it Edward Cree
@ 2020-08-18 18:58 ` Jesse Brandeburg
2020-08-18 19:01 ` Jesse Brandeburg
0 siblings, 1 reply; 11+ messages in thread
From: Jesse Brandeburg @ 2020-08-18 18:58 UTC (permalink / raw)
To: Edward Cree; +Cc: linux-net-drivers, davem, netdev
Edward Cree wrote:
> If an ef100_net_open() fails, ef100_net_stop() may be called without
> channel->rps_flow_id having been written; thus it may hold the address
> freed by a previous ef100_net_stop()'s call to efx_remove_filters().
> This then causes a double-free when efx_remove_filters() is called
> again, leading to a panic.
> To prevent this, after freeing it, overwrite it with NULL.
>
> Fixes: a9dc3d5612ce ("sfc_ef100: RX filter table management and related gubbins")
> Signed-off-by: Edward Cree <ecree@solarflare.com>
> ---
> drivers/net/ethernet/sfc/rx_common.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/ethernet/sfc/rx_common.c b/drivers/net/ethernet/sfc/rx_common.c
> index ef9bca92b0b7..5e29284c89c9 100644
> --- a/drivers/net/ethernet/sfc/rx_common.c
> +++ b/drivers/net/ethernet/sfc/rx_common.c
> @@ -849,6 +849,7 @@ void efx_remove_filters(struct efx_nic *efx)
> efx_for_each_channel(channel, efx) {
> cancel_delayed_work_sync(&channel->filter_work);
> kfree(channel->rps_flow_id);
> + channel->rps_flow_id = NULL;
> }
> #endif
> down_write(&efx->filter_sem);
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net 3/4] sfc: null out channel->rps_flow_id after freeing it
2020-08-18 18:58 ` Jesse Brandeburg
@ 2020-08-18 19:01 ` Jesse Brandeburg
0 siblings, 0 replies; 11+ messages in thread
From: Jesse Brandeburg @ 2020-08-18 19:01 UTC (permalink / raw)
To: Edward Cree; +Cc: linux-net-drivers, davem, netdev
Jesse Brandeburg wrote:
> Edward Cree wrote:
>
> > If an ef100_net_open() fails, ef100_net_stop() may be called without
> > channel->rps_flow_id having been written; thus it may hold the address
> > freed by a previous ef100_net_stop()'s call to efx_remove_filters().
> > This then causes a double-free when efx_remove_filters() is called
> > again, leading to a panic.
> > To prevent this, after freeing it, overwrite it with NULL.
> >
> > Fixes: a9dc3d5612ce ("sfc_ef100: RX filter table management and related gubbins")
> > Signed-off-by: Edward Cree <ecree@solarflare.com>
My mailer messed up my previous reply, but this is what I meant to say:
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net 4/4] sfc: don't free_irq()s if they were never requested
2020-08-18 12:44 ` [PATCH net 4/4] sfc: don't free_irq()s if they were never requested Edward Cree
@ 2020-08-18 19:03 ` Jesse Brandeburg
0 siblings, 0 replies; 11+ messages in thread
From: Jesse Brandeburg @ 2020-08-18 19:03 UTC (permalink / raw)
To: Edward Cree; +Cc: linux-net-drivers, davem, netdev
Edward Cree wrote:
> If efx_nic_init_interrupt fails, or was never run (e.g. due to an earlier
> failure in ef100_net_open), freeing irqs in efx_nic_fini_interrupt is not
> needed and will cause error messages and stack traces.
> So instead, only do this if efx_nic_init_interrupt successfully completed,
> as indicated by the new efx->irqs_hooked flag.
>
> Fixes: 965b549f3c20 ("sfc_ef100: implement ndo_open/close and EVQ probing")
> Signed-off-by: Edward Cree <ecree@solarflare.com>
Makes sense.
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net 0/4] sfc: more EF100 fixes
2020-08-18 12:41 [PATCH net 0/4] sfc: more EF100 fixes Edward Cree
` (3 preceding siblings ...)
2020-08-18 12:44 ` [PATCH net 4/4] sfc: don't free_irq()s if they were never requested Edward Cree
@ 2020-08-18 19:49 ` David Miller
4 siblings, 0 replies; 11+ messages in thread
From: David Miller @ 2020-08-18 19:49 UTC (permalink / raw)
To: ecree; +Cc: linux-net-drivers, netdev
From: Edward Cree <ecree@solarflare.com>
Date: Tue, 18 Aug 2020 13:41:49 +0100
> Fix up some bugs in the initial EF100 submission, and re-fix
> the hash_valid fix which was incomplete.
>
> The reset bugs are currently hard to trigger; they were found
> with an in-progress patch adding ethtool support, whereby
> ethtool --reset reliably reproduces them.
Series applied.
Thanks for the review Jesse.
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2020-08-18 19:49 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-18 12:41 [PATCH net 0/4] sfc: more EF100 fixes Edward Cree
2020-08-18 12:43 ` [PATCH net 1/4] sfc: really check hash is valid before using it Edward Cree
2020-08-18 18:54 ` Jesse Brandeburg
2020-08-18 12:43 ` [PATCH net 2/4] sfc: take correct lock in ef100_reset() Edward Cree
2020-08-18 18:55 ` Jesse Brandeburg
2020-08-18 12:44 ` [PATCH net 3/4] sfc: null out channel->rps_flow_id after freeing it Edward Cree
2020-08-18 18:58 ` Jesse Brandeburg
2020-08-18 19:01 ` Jesse Brandeburg
2020-08-18 12:44 ` [PATCH net 4/4] sfc: don't free_irq()s if they were never requested Edward Cree
2020-08-18 19:03 ` Jesse Brandeburg
2020-08-18 19:49 ` [PATCH net 0/4] sfc: more EF100 fixes David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).