LKML Archive on lore.kernel.org
* [PATCH] Fix crash in tg3 when using irqpoll
@ 2007-03-22 20:46 Bernhard Walle
2007-03-22 22:04 ` Michael Chan
0 siblings, 1 reply; 3+ messages in thread
From: Bernhard Walle @ 2007-03-22 20:46 UTC (permalink / raw)
To: netdev; +Cc: linux-kernel
With irqpoll enabled, the kernel crashed when I loaded the tg3 network driver
(on IA64). The stack trace was:
ia64_leave_kernel
[tg3]tg3_interrupt_tagged
note_interrupt
__do_IRQ
ia64_handle_irq
ia64_leave_kernel
_spin_unlock_irqrestore
pci_bus_read_config_dword
pci_restore_state
[tg3]tg3_chip_reset
[tg3]tg3_reset_hw
[tg3]tg3_init_hw
[tg3]tg3_open
dev_open
dev_change_flags
devinet_ioctl
inet_ioctl
sock_ioctl
do_ioctl
vfs_ioctl
sys_ioctl
ia64_ret_from_syscall
__kernel_syscall_via_break
I also got an MCA that pointed to a read from a PCI address belonging to the
tg3 device.
This patch makes sure that even the tr32() call in the interrupt handler, which
accesses PCI memory, is not executed. Accessing PCI memory while
pci_restore_state() is running is a bad idea because that function modifies
the BARs of the PCI device.
I think the problem could also happen with shared interrupts, not only with
irqpoll.
Signed-off-by: Bernhard Walle <bwalle@suse.de>
---
drivers/net/tg3.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)
Index: mainline-msi-init/drivers/net/tg3.c
===================================================================
--- mainline-msi-init.orig/drivers/net/tg3.c
+++ mainline-msi-init/drivers/net/tg3.c
@@ -3561,7 +3561,10 @@ static irqreturn_t tg3_interrupt(int irq
 	struct net_device *dev = dev_id;
 	struct tg3 *tp = netdev_priv(dev);
 	struct tg3_hw_status *sblk = tp->hw_status;
-	unsigned int handled = 1;
+	unsigned int handled = 0;
+
+	if (tg3_irq_sync(tp))
+		goto out;
 
 	/* In INTx mode, it is possible for the interrupt to arrive at
 	 * the CPU before the status block posted prior to the interrupt.
@@ -3579,8 +3582,6 @@ static irqreturn_t tg3_interrupt(int irq
 		 */
 		tw32_mailbox(MAILBOX_INTERRUPT_0 + TG3_64BIT_REG_LOW,
 			     0x00000001);
-		if (tg3_irq_sync(tp))
-			goto out;
 		sblk->status &= ~SD_STATUS_UPDATED;
 		if (likely(tg3_has_work(tp))) {
 			prefetch(&tp->rx_rcb[tp->rx_rcb_ptr]);
@@ -3592,8 +3593,7 @@ static irqreturn_t tg3_interrupt(int irq
 			tw32_mailbox_f(MAILBOX_INTERRUPT_0 + TG3_64BIT_REG_LOW,
 				       0x00000000);
 		}
-	} else { /* shared interrupt */
-		handled = 0;
+		handled = 1;
 	}
 out:
 	return IRQ_RETVAL(handled);
@@ -3604,7 +3604,10 @@ static irqreturn_t tg3_interrupt_tagged(
 	struct net_device *dev = dev_id;
 	struct tg3 *tp = netdev_priv(dev);
 	struct tg3_hw_status *sblk = tp->hw_status;
-	unsigned int handled = 1;
+	unsigned int handled = 0;
+
+	if (tg3_irq_sync(tp))
+		goto out;
 
 	/* In INTx mode, it is possible for the interrupt to arrive at
 	 * the CPU before the status block posted prior to the interrupt.
@@ -3622,8 +3625,6 @@ static irqreturn_t tg3_interrupt_tagged(
 		 */
 		tw32_mailbox(MAILBOX_INTERRUPT_0 + TG3_64BIT_REG_LOW,
 			     0x00000001);
-		if (tg3_irq_sync(tp))
-			goto out;
 		if (netif_rx_schedule_prep(dev)) {
 			prefetch(&tp->rx_rcb[tp->rx_rcb_ptr]);
 			/* Update last_tag to mark that this status has been
@@ -3634,8 +3635,7 @@ static irqreturn_t tg3_interrupt_tagged(
 			tp->last_tag = sblk->status_tag;
 			__netif_rx_schedule(dev);
 		}
-	} else { /* shared interrupt */
-		handled = 0;
+		handled = 1;
 	}
 out:
 	return IRQ_RETVAL(handled);
@@ -7052,7 +7052,7 @@ static int tg3_open(struct net_device *d
 		return err;
 	}
 
-	tg3_full_lock(tp, 0);
+	tg3_full_lock(tp, 1);
 
 	err = tg3_init_hw(tp, 1);
 	if (err) {
* Re: [PATCH] Fix crash in tg3 when using irqpoll
From: Michael Chan @ 2007-03-22 22:04 UTC (permalink / raw)
To: Bernhard Walle; +Cc: netdev, linux-kernel
On Thu, 2007-03-22 at 21:46 +0100, Bernhard Walle wrote:
>
> This patch makes sure that even the tr32() call in the interrupt handler, which
> accesses PCI memory, is not executed. Accessing PCI memory while
> pci_restore_state() is running is a bad idea because that function modifies
> the BARs of the PCI device.
It is not caused by the BAR as it doesn't get changed in this case. The
pci_restore_state() call is to restore the memory enable bit in the PCI
command register. The tr32() call in tg3_interrupt() will cause a
master abort if it is called before the memory enable bit has been
restored.
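
To make the point about the memory enable bit concrete, here is a minimal user-space sketch, not kernel or tg3 code. `PCI_COMMAND_MEMORY` (0x2) is the real bit value from the PCI command register; `struct model_dev`, `model_mmio_read()` and `MASTER_ABORT_VALUE` are hypothetical names made up for the illustration. It assumes that a read to a device whose memory decoding is off goes unclaimed (a master abort) and that the platform reports this by returning all ones; other platforms may instead raise a machine check, which would be consistent with the MCA reported above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Real bit value of the PCI command register (see <linux/pci_regs.h>). */
#define PCI_COMMAND_MEMORY 0x2  /* device responds to memory space accesses */

/* Assumption: the platform returns all ones for a master-aborted read. */
#define MASTER_ABORT_VALUE 0xFFFFFFFFu

/* Hypothetical model of a PCI device; not a kernel structure. */
struct model_dev {
	uint16_t command;  /* PCI command register */
	uint32_t reg;      /* one memory-mapped device register */
};

/*
 * Models a tr32()-style MMIO read. While the memory enable bit is clear,
 * the device does not claim the cycle, so the read is reported as aborted.
 */
static uint32_t model_mmio_read(const struct model_dev *dev, bool *aborted)
{
	assert(dev != NULL && aborted != NULL);

	if (!(dev->command & PCI_COMMAND_MEMORY)) {
		*aborted = true;
		return MASTER_ABORT_VALUE;
	}
	*aborted = false;
	return dev->reg;
}
```

In this model, an interrupt handler that runs between the chip reset (command register cleared) and the restore of the command register sees an aborted read; once the bit is set again, the same read succeeds.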
> --- mainline-msi-init.orig/drivers/net/tg3.c
> +++ mainline-msi-init/drivers/net/tg3.c
> @@ -3561,7 +3561,10 @@ static irqreturn_t tg3_interrupt(int irq
>  	struct net_device *dev = dev_id;
>  	struct tg3 *tp = netdev_priv(dev);
>  	struct tg3_hw_status *sblk = tp->hw_status;
> -	unsigned int handled = 1;
> +	unsigned int handled = 0;
> +
> +	if (tg3_irq_sync(tp))
> +		goto out;
This will break other things. When we disable interrupts, we set the
irq_sync flag but allow one more interrupt to be generated. The irq
handler will simply mask off the interrupt when it sees the irq_sync
flag. With the above change, the irq handler can no longer mask off
this trailing interrupt and you may get screaming interrupts as a
result.
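
The difference between the two orderings can be sketched with a small user-space model, not tg3 code. All names here (`struct model_nic`, `handler_mask_first()`, `handler_sync_first()`, `deliveries_until_quiet()`) are hypothetical, and the model assumes a level-triggered INTx line that keeps re-delivering the interrupt for as long as the device asserts it. Clearing `line_asserted` stands in for the mailbox write that masks the interrupt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; not a tg3 structure. */
struct model_nic {
	bool irq_sync;       /* set by the driver while it disables interrupts */
	bool line_asserted;  /* device is asserting its level-triggered line */
};

typedef bool (*model_handler_fn)(struct model_nic *nic);

/* Existing ordering: mask the interrupt first, then look at irq_sync. */
static bool handler_mask_first(struct model_nic *nic)
{
	if (!nic->line_asserted)
		return false;            /* not our interrupt */
	nic->line_asserted = false;      /* stands in for the mailbox write */
	if (nic->irq_sync)
		return true;             /* masked, no polling scheduled */
	/* Normal path: polling would be scheduled here. */
	return true;
}

/* Ordering from the patch: return on irq_sync before touching the device. */
static bool handler_sync_first(struct model_nic *nic)
{
	if (nic->irq_sync)
		return false;            /* trailing interrupt is never masked */
	if (!nic->line_asserted)
		return false;
	nic->line_asserted = false;
	return true;
}

/* Counts handler invocations until the line goes quiet, up to `cap`. */
static int deliveries_until_quiet(struct model_nic *nic,
				  model_handler_fn handler, int cap)
{
	int deliveries = 0;

	assert(nic != NULL && handler != NULL && cap > 0);
	while (nic->line_asserted && deliveries < cap) {
		handler(nic);
		deliveries++;
	}
	return deliveries;
}
```

With `irq_sync` set and one trailing interrupt pending, the mask-first handler quiets the line after a single delivery, while the sync-first handler is re-entered until the cap is reached. That is the screaming-interrupt case described above.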
Thanks for reporting this. I'll try to come up with a good solution.
* Re: [PATCH] Fix crash in tg3 when using irqpoll
From: Bernhard Walle @ 2007-03-22 23:20 UTC (permalink / raw)
To: Michael Chan; +Cc: netdev, linux-kernel
Hello Michael,
* Michael Chan <mchan@broadcom.com> [2007-03-22 23:04]:
> On Thu, 2007-03-22 at 21:46 +0100, Bernhard Walle wrote:
>
> > This patch makes sure that even the tr32() call in the interrupt handler, which
> > accesses PCI memory, is not executed. Accessing PCI memory while
> > pci_restore_state() is running is a bad idea because that function modifies
> > the BARs of the PCI device.
>
> It is not caused by the BAR as it doesn't get changed in this case. The
> pci_restore_state() call is to restore the memory enable bit in the PCI
> command register. The tr32() call in tg3_interrupt() will cause a
> master abort if it is called before the memory enable bit has been
> restored.
OK, thanks for the explanation. I wondered why you call
pci_restore_state() here; normally that's only called from .resume
handlers.
> > --- mainline-msi-init.orig/drivers/net/tg3.c
> > +++ mainline-msi-init/drivers/net/tg3.c
> > @@ -3561,7 +3561,10 @@ static irqreturn_t tg3_interrupt(int irq
> >  	struct net_device *dev = dev_id;
> >  	struct tg3 *tp = netdev_priv(dev);
> >  	struct tg3_hw_status *sblk = tp->hw_status;
> > -	unsigned int handled = 1;
> > +	unsigned int handled = 0;
> > +
> > +	if (tg3_irq_sync(tp))
> > +		goto out;
>
>
> This will break other things. When we disable interrupts, we set the
> irq_sync flag but allow one more interrupt to be generated. The irq
> handler will simply mask off the interrupt when it sees the irq_sync
> flag. With the above change, the irq handler can no longer mask off
> this trailing interrupt and you may get screaming interrupts as a
> result.
You're right. I only had in mind the case where the network device
doesn't generate any interrupts (in the initialisation and shutdown
phases) because they are disabled in the device, and the interrupt
handler is called only because of IRQ sharing or irqpoll.
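
As an illustration of that case, here is a small user-space model of dispatch on a shared line, not kernel code. The `MODEL_IRQ_NONE` / `MODEL_IRQ_HANDLED` values mirror the kernel's IRQ_NONE / IRQ_HANDLED convention, but every name in the block is hypothetical. It shows only that each handler registered on the line is entered, including the one whose device raised nothing, so that handler may run at any point of its device's initialisation or shutdown.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the kernel's IRQ_NONE / IRQ_HANDLED convention (model only). */
enum model_irqreturn { MODEL_IRQ_NONE = 0, MODEL_IRQ_HANDLED = 1 };

/* Hypothetical device state for the model. */
struct model_shared_dev {
	int pending;  /* nonzero if this device raised the interrupt */
	int calls;    /* how often its handler was entered */
};

/* One registered handler on the shared line. */
struct model_action {
	enum model_irqreturn (*handler)(void *dev_id);
	void *dev_id;
};

static enum model_irqreturn model_handler(void *dev_id)
{
	struct model_shared_dev *dev = dev_id;

	dev->calls++;                  /* entered even if nothing is pending */
	if (!dev->pending)
		return MODEL_IRQ_NONE; /* someone else's interrupt */
	dev->pending = 0;
	return MODEL_IRQ_HANDLED;
}

/* Calls every handler on the line; returns how many claimed the interrupt. */
static int model_dispatch(struct model_action *actions, size_t count)
{
	int handled = 0;

	assert(actions != NULL);
	for (size_t i = 0; i < count; i++) {
		if (actions[i].handler(actions[i].dev_id) == MODEL_IRQ_HANDLED)
			handled++;
	}
	return handled;
}
```

In the model, a device with `pending == 0` still has its handler called once per dispatch. In a real driver, deciding that the interrupt belongs to someone else may itself require reading device state.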
> Thanks for reporting this. I'll try to come up with a good solution.
Could you please CC me? I'd like to test it here.
Thanks,
Bernhard