Message-ID: <47EEB154.9060605@garzik.org>
Date: Sat, 29 Mar 2008 17:15:00 -0400
From: Jeff Garzik
To: yhlu.kernel@gmail.com
CC: Andrew Morton, David Miller, Greg KH, Ingo Molnar, kernel list, netdev@vger.kernel.org, linux-pci@atrey.karlin.mff.cuni.cz
Subject: Re: [PATCH] e1000: fix IRQx nobody cared for shared irq with INTx
References: <200803291403.23479.yhlu.kernel@gmail.com>
In-Reply-To: <200803291403.23479.yhlu.kernel@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Yinghai Lu wrote:
> When trying to kexec a recent kernel from kernel.org from RHEL 5.1, I got:
>
> ACPI: PCI Interrupt 0000:02:00.0[A] -> Link [LNKA] -> GSI 19 (level, low) -> IRQ 19
> acpi->mptable 2 : Int: type 0, pol 1, trig 1, bus 02, IRQ 00, APIC ID 0, APIC INT 13
> PCI: Setting latency timer of device 0000:02:00.0 to 64
> PCI: Enabling Mem-Wr-Inval for device 0000:02:00.0
> scsi0 : on PCI bus 02 device 00 irq 19
> irq 19: nobody cared (try booting with the "irqpoll" option)
> Pid: 1, comm: swapper Not tainted 2.6.24-smp-07682-g551e4fb-dirty #19
>
> Call Trace:
> [] __report_bad_irq+0x30/0x72
> [] note_interrupt+0x224/0x26f
> [] handle_fasteoi_irq+0xa5/0xc8
> [] call_softirq+0x1c/0x28
> [] do_IRQ+0xf1/0x15f
> [] ret_from_intr+0x0/0xa
> [] pci_mmcfg_write+0x0/0xb0
> [] native_read_tsc+0xd/0x1d
> [] __delay+0x17/0x22
> [] lpfc_sli_brdrestart+0x14c/0x16b
> [] lpfc_do_config_port+0x9c/0x3e4
> [] sysfs_link_sibling+0x17/0x31
> [] lpfc_sli_hba_setup+0xc8/0x4a2
> [] lpfc_pci_probe_one+0x750/0x914
> [] pci_device_probe+0xb3/0xfb
> [] driver_probe_device+0xb5/0x132
> [] __driver_attach+0x0/0x93
> [] __driver_attach+0x5a/0x93
> [] bus_for_each_dev+0x44/0x6f
> [] bus_add_driver+0xae/0x1f5
> [] driver_register+0x59/0xce
> [] __pci_register_driver+0x4a/0x7c
> [] lpfc_init+0x98/0xba
> [] kernel_init+0x175/0x2e1
> [] child_rip+0xa/0x12
> [] kernel_init+0x0/0x2e1
> [] child_rip+0x0/0x12
>
> handlers:
> [] (lpfc_intr_handler+0x0/0x4c6)
> Disabling IRQ #19
>
> Root cause: there is one Intel card that shares an io apic pin and irq
> with lpfc.
>
> The e1000_probe path only uses pci_enable_device to set up the irq entry,
> which stays masked; e1000_open later uses request_irq/setup_irq to install
> the action and enable/unmask that io apic entry.
>
> But the lpfc driver calls its probe and request_irq/setup_irq, so it
> enables/unmasks that io apic entry, and only lpfc's action,
> lpfc_intr_handler, is installed.
>
> In some cases the e1000 sends out an irq (hw bug, or the first kernel
> doesn't call e1000_irq_disable?). That irq confuses the handler, since
> it is not for lpfc_intr_handler.
>
> So try to call pci_intx(dev, 0) in e1000_probe, and later call
> pci_intx(dev, 1) after request_irq in the e1000_open path, if the irq
> is using INTx.
>
> Even if e1000 is using MSI, this patch is still needed: pci_enable_msi in
> the e1000_open path does call pci_intx(dev, 0), but that is too late when
> the lpfc driver is loaded before ifconfig is used to bring up the network
> connection.
>
> Other drivers may need to be updated in the same way if they have the
> same "nobody cared" problem with a shared INTx irq.
>
> Signed-off-by: Yinghai Lu
>
> Index: linux-2.6/drivers/net/e1000/e1000_main.c
> ===================================================================
> --- linux-2.6.orig/drivers/net/e1000/e1000_main.c
> +++ linux-2.6/drivers/net/e1000/e1000_main.c
> @@ -324,6 +324,9 @@ static int e1000_request_irq(struct e100
>  		pci_disable_msi(adapter->pdev);
>  		DPRINTK(PROBE, ERR,
>  		        "Unable to allocate interrupt Error: %d\n", err);
> +	} else if (!adapter->have_msi) {
> +		/* enable INTx before if not using MSI */
> +		pci_intx(adapter->pdev, 1);
>  	}
>
>  	return err;
> @@ -934,6 +937,8 @@ e1000_probe(struct pci_dev *pdev,
>  	uint16_t eeprom_apme_mask = E1000_EEPROM_APME;
>  	DECLARE_MAC_BUF(mac);
>
> +	/* disable INTx at first */
> +	pci_intx(pdev, 0);
>  	if ((err = pci_enable_device(pdev)))
>  		return err;
>
> Index: linux-2.6/drivers/net/e1000e/netdev.c
> ===================================================================
> --- linux-2.6.orig/drivers/net/e1000e/netdev.c
> +++ linux-2.6/drivers/net/e1000e/netdev.c
> @@ -960,6 +960,9 @@ static int e1000_request_irq(struct e100
>  			err);
>  		if (adapter->flags & FLAG_MSI_ENABLED)
>  			pci_disable_msi(adapter->pdev);
> +	} else if (!(adapter->flags & FLAG_MSI_ENABLED)) {
> +		/* enable INTx before if not using MSI */
> +		pci_intx(adapter->pdev, 1);
>  	}
>
>  	return err;

These seem sane.

> @@ -3726,6 +3729,8 @@ static int __devinit e1000_probe(struct
>  	u16 eeprom_apme_mask = E1000_EEPROM_APME;
>
>  	e1000e_disable_l1aspm(pdev);
> +	/* disable INTx at first */
> +	pci_intx(pdev, 0);
>  	err = pci_enable_device(pdev);
>  	if (err)
>  		return err;

Any pci_* call before pci_enable_device() is questionable.  I would put
it after pci_enable_device(), unless there is a _strong_ reason.  PCI
devices are not considered available, with resources assigned, until
pci_enable_device().

I am also curious what irq events are being raised?  That seems like
another problem area to address, since pci_intx() is just a band-aid
hiding that behavior.

	Jeff
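
For illustration only, the ordering suggested above (pci_enable_device()
first, INTx masked right after it, and INTx unmasked only once a handler is
installed and MSI is not in use) can be sketched as a small userspace mock.
Everything prefixed mock_ or MOCK_ is hypothetical and is not the kernel
API; the one hardware detail assumed is that the INTx disable bit is bit 10
(0x0400) of the PCI command register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed: bit 10 of the PCI command register masks INTx when set. */
#define MOCK_PCI_COMMAND_INTX_DISABLE 0x0400u

/* Hypothetical stand-in for struct pci_dev; only what the sketch needs. */
struct mock_pci_dev {
	bool enabled;      /* set once mock_pci_enable_device() has run */
	uint16_t command;  /* simulated PCI command register */
	bool msi_enabled;  /* true if the open path chose MSI */
};

static int mock_pci_enable_device(struct mock_pci_dev *dev)
{
	dev->enabled = true;
	return 0;
}

/* Mirrors the intent of pci_intx(): enable != 0 clears the disable bit. */
static void mock_pci_intx(struct mock_pci_dev *dev, int enable)
{
	if (enable)
		dev->command &= (uint16_t)~MOCK_PCI_COMMAND_INTX_DISABLE;
	else
		dev->command |= MOCK_PCI_COMMAND_INTX_DISABLE;
}

static bool mock_intx_masked(const struct mock_pci_dev *dev)
{
	return (dev->command & MOCK_PCI_COMMAND_INTX_DISABLE) != 0;
}

/* Probe path: enable the device first, then mask INTx. */
static int mock_probe(struct mock_pci_dev *dev)
{
	int err = mock_pci_enable_device(dev);

	if (err)
		return err;
	mock_pci_intx(dev, 0);
	return 0;
}

/* Open path: once the handler is in place, unmask INTx unless MSI is used. */
static int mock_request_irq(struct mock_pci_dev *dev, bool use_msi)
{
	if (!dev->enabled)
		return -1; /* device was never probed */
	dev->msi_enabled = use_msi;
	if (!use_msi)
		mock_pci_intx(dev, 1);
	return 0;
}

/* Walks the legacy INTx case end to end; returns 0 when all checks hold. */
static int mock_demo(void)
{
	struct mock_pci_dev dev = { 0 };

	assert(mock_probe(&dev) == 0);
	assert(mock_intx_masked(&dev));   /* quiet between probe and open */
	assert(mock_request_irq(&dev, false) == 0);
	assert(!mock_intx_masked(&dev));  /* unmasked once a handler exists */
	return 0;
}
```

Calling mock_demo() exercises the INTx case. With use_msi set to true, the
mock leaves the disable bit set after open, which matches the patch's intent
of only re-enabling INTx when MSI is not in use.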