LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Ingo Molnar <mingo@elte.hu>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"Rafael J. Wysocki" <rjw@sisk.pl>, Len Brown <lenb@kernel.org>
Subject: [patch] fix ACPI boot regression (was: Re: Linux 2.6.25-rc5)
Date: Mon, 10 Mar 2008 18:04:34 +0100	[thread overview]
Message-ID: <20080310170434.GA22430@elte.hu> (raw)
In-Reply-To: <alpine.LFD.1.00.0803092231580.5896@woody.linux-foundation.org>


> I'd like developers and testers alike to take a good look at the 
> regression email that Rafael sends out, and please help him keep the 
> entries up-to-date.

figured out another ACPI related regression today.

randconfig testing triggered an early boot-time hang on a laptop of mine 
(32-bit x86, config attached) - the screen was scrolling ACPI AML 
exceptions [with no serial port and no early debugging available].
 
v2.6.24 works fine on that laptop with the same .config, so after a few 
hours of bisection (had to restart it 3 times - other regressions 
interacted), it honed in on this commit:

| 10270d4838bdc493781f5a1cf2e90e9c34c9142f is first bad commit
|
| Author: Linus Torvalds <torvalds@woody.linux-foundation.org>
| Date:   Wed Feb 13 09:56:14 2008 -0800
|
|     acpi: fix acpi_os_read_pci_configuration() misuse of raw_pci_read()

reverting this commit ontop of -rc5 gave a correctly booting kernel.

but this commit fixes a real bug so the real question is, why did it 
break the bootup?

after quite some head-scratching, the following change stood out:

-                               pci_id->bus = tu8;
+                               pci_id->bus = val;

pci_id->bus is defined as u16:

   struct acpi_pci_id {
           u16 segment;
           u16 bus;
   ...

and 'tu8' changed from u8 to u32. So previously we'd unconditionally 
mask the return value of acpi_os_read_pci_configuration() 
(raw_pci_read()) to 8 bits, but now we just trust whatever comes back 
from the PCI access routines and only crop it to 16 bits.

But if the high 8 bits of that result contains any noise then we'll 
write that into ACPI's PCI ID descriptor and confuse the heck out of the 
rest of ACPI.

So lets check the PCI-BIOS code on that theory. We have this codepath 
for 8-bit accesses (arch/x86/pci/pcbios.c:pci_bios_read()):

        switch (len) {
        case 1:
                __asm__("lcall *(%%esi); cld\n\t"
                        "jc 1f\n\t"
                        "xor %%ah, %%ah\n"
                        "1:"
                        : "=c" (*value),
                          "=a" (result)
                        : "1" (PCIBIOS_READ_CONFIG_BYTE),
                          "b" (bx),
                          "D" ((long)reg),
                          "S" (&pci_indirect));

aha! The "=a" output constraint puts the full 32 bits of EAX into 
*value. But if the BIOS's routines set any of the high bits to nonzero, 
we'll return a value with more set in it than intended.

the other, more common PCI access methods (v1 and v2 PCI reads) clear 
out the high bits already, for example pci_conf1_read() does:

        switch (len) {
        case 1:
                *value = inb(0xCFC + (reg & 3));

which explicitly converts the return byte up to 32 bits and zero-extends 
it.

so zero-extending the result in the PCI-BIOS read routine fixes the 
regression on my laptop. ( It might fix some other long-standing issues 
we had with PCI-BIOS during the past decade ... ) Both 8-bit and 16-bit 
accesses were buggy.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 arch/x86/pci/pcbios.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

Index: linux/arch/x86/pci/pcbios.c
===================================================================
--- linux.orig/arch/x86/pci/pcbios.c
+++ linux/arch/x86/pci/pcbios.c
@@ -198,6 +198,11 @@ static int pci_bios_read(unsigned int se
 			  "b" (bx),
 			  "D" ((long)reg),
 			  "S" (&pci_indirect));
+		/*
+		 * Zero-extend the result beyond 8 bits, do not trust the
+		 * BIOS having done it:
+		 */
+		*value &= 0xff;
 		break;
 	case 2:
 		__asm__("lcall *(%%esi); cld\n\t"
@@ -210,6 +215,11 @@ static int pci_bios_read(unsigned int se
 			  "b" (bx),
 			  "D" ((long)reg),
 			  "S" (&pci_indirect));
+		/*
+		 * Zero-extend the result beyond 16 bits, do not trust the
+		 * BIOS having done it:
+		 */
+		*value &= 0xffff;
 		break;
 	case 4:
 		__asm__("lcall *(%%esi); cld\n\t"

  parent reply	other threads:[~2008-03-10 17:05 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-03-10  5:42 Linux 2.6.25-rc5 Linus Torvalds
2008-03-10  6:05 ` charles gagalac
2008-03-10 14:45   ` Linus Torvalds
2008-03-10 17:04 ` Ingo Molnar [this message]
2008-03-10 17:14   ` [patch] fix ACPI boot regression (was: Re: Linux 2.6.25-rc5) Linus Torvalds
2008-03-10 17:24     ` Ingo Molnar
2008-03-11  6:04       ` Len Brown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080310170434.GA22430@elte.hu \
    --to=mingo@elte.hu \
    --cc=lenb@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rjw@sisk.pl \
    --cc=torvalds@linux-foundation.org \
    --subject='[patch] fix ACPI boot regression (was: Re: Linux 2.6.25-rc5)' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).