LKML Archive on lore.kernel.org
* Re: IO-APIC on nforce2   
@ 2004-04-13  1:17 Ross Dickson
  2004-04-13  4:01 ` really bensoo_at_soo_dot_com
  2004-04-13  5:08 ` Len Brown
  0 siblings, 2 replies; 93+ messages in thread
From: Ross Dickson @ 2004-04-13  1:17 UTC (permalink / raw)
  To: christian.kroener; +Cc: linux-kernel, Maciej W. Rozycki, Len Brown

[-- Attachment #1: Type: text/plain, Size: 11113 bytes --]

Christian Kröner wrote

> I have a problem using the local APIC and IO-APIC on my uniprocessor nforce2 board.
> With recent kernels (latest -mm and 2.6.5-linus) the timer irq gets set to
> XT-PIC, which results in a constant hi-load of 15% (after booting) to
> about 25% (after the system has run for about 12 h). Earlier versions of -mm
> set the timer irq to IO-APIC-level (or edge, I don't remember exactly) and I
> never had any constant hi-load with those versions. Since mainline kernel
> versions never set the timer irq to IO-APIC-{level,edge}, I used to patch
> them with Ross' nforce patches so that the timer irq became
> IO-APIC-edge, which worked even though the patch applied with an offset. Anyway,
> with the latest -mm kernels these patches don't work anymore. I can apply
> them with an offset, but it seems the code isn't used or something else is wrong,
> since the timer irq stays XT-PIC, which results in the problems above. Could
> anyone point out how to resolve this problem, or tell me what I could do to
> get my timer irq right? I'm willing to test patches...
> Thanks in advance, christian.


Hi Christian

I don't know why the load is high with XT-PIC, except maybe heaps of spurious irqs
under the hood.

I am working with 2.4.26-rc2 and have noticed a change with the recent acpi?
update. The recent fix to stop unnecessary ioapic irq routing entries puts the
following if statement into io_apic_set_pci_routing() in io_apic.c:

	/*
	 * IRQs < 16 are already in the irq_2_pin[] map
	 */
	if (irq >= 16)
		add_pin_to_irq(irq, ioapic, pin);

which prevents my io-apic patch from using that function to reprogram the
io-apic pin on irq0 from pin2 to pin0. 

As a quick fix you could drop the "if (irq >= 16)".
I don't know what harm if any that would do other than create unwanted
irq mapping entries as in the past.
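
To make the effect of that guard concrete, here is a small user-space model. It is only a sketch and not the kernel code: `struct irq_pins`, `model_init()`, `set_routing_guarded()` and `set_routing_unguarded()` are names invented for this illustration, and the map is assumed to start with IRQ0 on pin 2 as the BIOS reports on these boards.

```c
#include <assert.h>

#define MODEL_NR_IRQS  24
#define MODEL_MAX_PINS 4

/* Hypothetical stand-in for the kernel's irq_2_pin[] map: a short pin list per IRQ. */
struct irq_pins {
	int count;
	int pin[MODEL_MAX_PINS];
};

/* Append a pin to an IRQ's list, loosely modelling add_pin_to_irq(). */
static void model_add_pin(struct irq_pins *map, int irq, int pin)
{
	assert(irq >= 0 && irq < MODEL_NR_IRQS);
	assert(map[irq].count < MODEL_MAX_PINS);
	map[irq].pin[map[irq].count++] = pin;
}

/* Assumed starting state: legacy IRQn on pin n, except IRQ0, reported on pin 2. */
static void model_init(struct irq_pins *map)
{
	int irq;

	for (irq = 0; irq < MODEL_NR_IRQS; irq++)
		map[irq].count = 0;
	model_add_pin(map, 0, 2);
	for (irq = 1; irq < 16; irq++)
		if (irq != 2)	/* IRQ2 is the cascade, not routed */
			model_add_pin(map, irq, irq);
}

/* With the guard: legacy IRQs keep whatever the tables gave them. */
static void set_routing_guarded(struct irq_pins *map, int irq, int pin)
{
	if (irq >= 16)
		model_add_pin(map, irq, pin);
}

/* Without the guard: every call appends, including for legacy IRQs. */
static void set_routing_unguarded(struct irq_pins *map, int irq, int pin)
{
	model_add_pin(map, irq, pin);
}
```

In this model a guarded request to move IRQ0 to pin 0 leaves the map at pin 2, while the unguarded variant leaves IRQ0 with two entries (pin 2 and pin 0), which is the kind of extra mapping entry the guard was added to avoid.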

As a better solution to work with the new code I have created a function to
change the pin an irq comes into the io-apic on and also re-initialise the 
io-apic to deal with the change. 

Here is the function for 2.4.26-rc2.

/*
 * reroute irq to different pin clearing old and enabling new
 */
static void __init replace_IO_APIC_pin_at_irq(unsigned int irq,
						int oldapic, int oldpin,
						int newapic, int newpin)
{
	struct IO_APIC_route_entry entry;
	unsigned long flags;
	/*
	 * read oldapic entry
	 */
	spin_lock_irqsave(&ioapic_lock, flags);
	*(((int*)&entry) + 0) = io_apic_read(oldapic, 0x10 + 2 * oldpin);
	*(((int*)&entry) + 1) = io_apic_read(oldapic, 0x11 + 2 * oldpin);
	spin_unlock_irqrestore(&ioapic_lock, flags);
	/*
	 * Check delivery_mode to be sure we're not clearing an SMI pin
	 */
	if (entry.delivery_mode == dest_SMI)
		return;
	/*
	 * clear oldpin on oldapic
	 */
	clear_IO_APIC_pin(oldapic, oldpin);
	/*
	 * reroute irq to newpin on newapic
	 */
	replace_pin_at_irq(irq, oldapic, oldpin, newapic, newpin);
	/*
	 * Enable newpin on newapic
	 */
	spin_lock_irqsave(&ioapic_lock, flags);
	io_apic_write(newapic, 0x10 + 2*newpin, *(((int *)&entry) + 0));
	io_apic_write(newapic, 0x11 + 2*newpin, *(((int *)&entry) + 1));
	spin_unlock_irqrestore(&ioapic_lock, flags);
}

I am now using this instead of the io_apic_set_pci_routing().
My modified check_timer() to work with it is as follows.

/*
 * This code may look a bit paranoid, but it's supposed to cooperate with
 * a wide range of boards and BIOS bugs.  Fortunately only the timer IRQ
 * is so screwy.  Thanks to Brian Perkins for testing/hacking this beast
 * fanatically on his truly buggy board.
 */
static inline void check_timer(void)
{
	extern int timer_ack;
	int pin1, pin2;
	int vector, i;

	/*
	 * get/set the timer IRQ vector:
	 */
	disable_8259A_irq(0);
	vector = assign_irq_vector(0);
	set_intr_gate(vector, interrupt[0]);

	/*
	 * Subtle, code in do_timer_interrupt() expects an AEOI
	 * mode for the 8259A whenever interrupts are routed
	 * through I/O APICs.  Also IRQ0 has to be enabled in
	 * the 8259A which implies the virtual wire has to be
	 * disabled in the local APIC.
	 */
	apic_write_around(APIC_LVT0, APIC_LVT_MASKED | APIC_DM_EXTINT);
	init_8259A(1);
	timer_ack = 1;
	enable_8259A_irq(0);

	pin1 = find_isa_irq_pin(0, mp_INT);
	pin2 = find_isa_irq_pin(0, mp_ExtINT);

	printk(KERN_INFO "..TIMER: vector=0x%02X pin1=%d pin2=%d\n", vector, pin1, pin2);

	if (pin1 != -1) {
		for(i=0;i<2;i++) {
			/*
			 * Ok, does IRQ0 through the IOAPIC work?
			 */
			unmask_IO_APIC_irq(0);
			if (timer_irq_works()) {
				if (nmi_watchdog == NMI_IO_APIC) {
					disable_8259A_irq(0);
					setup_nmi();
					enable_8259A_irq(0);
					check_nmi_watchdog();
				}
				printk(KERN_INFO "..TIMER: works OK on IO-APIC irq0\n" );
				return;
			}
			mask_IO_APIC_irq(0);
			if(!i) { /* try INTIN0 if INTIN2 failed */
				printk(KERN_ERR "..MP-BIOS bug: 8254 timer not connected to IO-APIC INTIN%d\n",pin1);
				printk(KERN_INFO "..TIMER: Check if 8254 timer connected to IO-APIC INTIN0? ...\n");
				replace_IO_APIC_pin_at_irq(0, 0, pin1, 0, 0);
				timer_ack=0;
				disable_8259A_irq(0);
			} else { /* restore settings */
				clear_IO_APIC_pin(0, 0);
				printk(KERN_ERR "..TIMER: 8254 timer not connected to IO-APIC INTIN0\n");
				timer_ack=1;
				enable_8259A_irq(0);
			}
		}
	}

	printk(KERN_INFO "...trying to set up timer (IRQ0) through the 8259A ... ");
	if (pin2 != -1) {
		printk("\n..... (found pin %d) ...", pin2);
		/*
		 * legacy devices should be connected to IO APIC #0
		 */
		setup_ExtINT_IRQ0_pin(pin2, vector);
		if (timer_irq_works()) {
			printk("works.\n");
			if (pin1 != -1)
				replace_pin_at_irq(0, 0, pin1, 0, pin2);
			else
				add_pin_to_irq(0, 0, pin2);
			if (nmi_watchdog == NMI_IO_APIC) {
				setup_nmi();
				check_nmi_watchdog();
			}
			return;
		}
		/*
		 * Cleanup, just in case ...
		 */
		clear_IO_APIC_pin(0, pin2);
	}
	printk(" failed.\n");

	if (nmi_watchdog) {
		printk(KERN_WARNING "timer doesn't work through the IO-APIC - disabling NMI Watchdog!\n");
		nmi_watchdog = 0;
	}

	printk(KERN_INFO "...trying to set up timer as Virtual Wire IRQ...");

	disable_8259A_irq(0);
	irq_desc[0].handler = &lapic_irq_type;
	apic_write_around(APIC_LVT0, APIC_DM_FIXED | vector);	/* Fixed mode */
	enable_8259A_irq(0);

	if (timer_irq_works()) {
		printk(" works.\n");
		return;
	}
	apic_write_around(APIC_LVT0, APIC_LVT_MASKED | APIC_DM_FIXED | vector);
	printk(" failed.\n");

	printk(KERN_INFO "...trying to set up timer as ExtINT IRQ...");

	init_8259A(0);
	make_8259A_irq(0);
	apic_write_around(APIC_LVT0, APIC_DM_EXTINT);

	unlock_ExtINT_logic();

	if (timer_irq_works()) {
		printk(" works.\n");
		return;
	}
	printk(" failed :(.\n");
	panic("IO-APIC + timer doesn't work! pester mingo@redhat.com");
}

This version loops twice on the "pin1" attempt, first trying the BIOS-assigned
pin, then trying pin0 with no timer acks and the 8259 XT-PIC disabled.
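
Stripped of the hardware details, the control flow of that loop can be sketched as below. This is a simplified model under stated assumptions, not the patch itself: `timer_probe_fn` and `probe_timer_pin()` are hypothetical names, the probe callback stands in for timer_irq_works(), and the timer_ack and 8259A handling is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical probe callback: true if timer ticks arrive on `pin`. */
typedef bool (*timer_probe_fn)(int pin);

/*
 * Two-pass probe: try the BIOS-assigned pin first, then INTIN0.
 * Returns the pin that worked, or -1 so the caller can fall back
 * to the 8259A / virtual wire / ExtINT attempts.
 */
static int probe_timer_pin(int bios_pin, timer_probe_fn works)
{
	int candidate = bios_pin;
	int pass;

	if (bios_pin < 0)
		return -1;	/* no mp_INT entry for IRQ0 at all */

	for (pass = 0; pass < 2; pass++) {
		if (works(candidate))
			return candidate;
		if (pass == 0)
			candidate = 0;	/* second pass: try INTIN0 */
	}
	return -1;
}

/* Example probes for three imagined boards. */
static bool timer_on_pin0(int pin) { return pin == 0; }
static bool timer_on_pin2(int pin) { return pin == 2; }
static bool timer_nowhere(int pin) { (void)pin; return false; }
```

On a board like the nforce2 described here (BIOS reports pin 2, timer really on pin 0) the model returns 0 on the second pass; on a board with correct tables it returns the BIOS pin on the first pass.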

I have not yet downloaded 2.6.5xxx.
From memory, this 2.4.26-rc2 code should be very similar to 2.6.5-linus
but a bit different from the -mm series. For the -mm series I think you can drop
the "timer_ack=" lines from my changes, as it still has Maciej Rozycki's 8259
ack patch? The timer ack should already have been correctly set to off by its
check of whether the apic is an integrated one.

Here are the changes as a diff on the io_apic.c in 2.4.26-rc2

--- io_apic.c.orig	2004-04-08 15:56:53.000000000 +1000
+++ io_apic.c	2004-04-10 02:33:02.000000000 +1000
@@ -197,10 +197,48 @@ static void clear_IO_APIC (void)
 		for (pin = 0; pin < nr_ioapic_registers[apic]; pin++)
 			clear_IO_APIC_pin(apic, pin);
 }

 /*
+ * reroute irq to different pin clearing old and enabling new
+ */
+static void __init replace_IO_APIC_pin_at_irq(unsigned int irq,
+						int oldapic, int oldpin,
+						int newapic, int newpin)
+{
+	struct IO_APIC_route_entry entry;
+	unsigned long flags;
+	/*
+	 * read oldapic entry
+	 */
+	spin_lock_irqsave(&ioapic_lock, flags);
+	*(((int*)&entry) + 0) = io_apic_read(oldapic, 0x10 + 2 * oldpin);
+	*(((int*)&entry) + 1) = io_apic_read(oldapic, 0x11 + 2 * oldpin);
+	spin_unlock_irqrestore(&ioapic_lock, flags);
+	/*
+	 * Check delivery_mode to be sure we're not clearing an SMI pin
+	 */
+	if (entry.delivery_mode == dest_SMI)
+		return;
+	/*
+	 * clear oldpin on oldapic
+	 */
+	clear_IO_APIC_pin(oldapic, oldpin);
+	/*
+	 * reroute irq to newpin on newapic
+	 */
+	replace_pin_at_irq(irq, oldapic, oldpin, newapic, newpin);
+	/*
+	 * Enable newpin on newapic
+	 */
+	spin_lock_irqsave(&ioapic_lock, flags);
+	io_apic_write(newapic, 0x10 + 2*newpin, *(((int *)&entry) + 0));
+	io_apic_write(newapic, 0x11 + 2*newpin, *(((int *)&entry) + 1));
+	spin_unlock_irqrestore(&ioapic_lock, flags);
+}
+
+/*
  * support for broken MP BIOSs, enables hand-redirection of PIRQ0-7 to
  * specific CPU-side IRQs.
  */

 #define MAX_PIRQS 8
@@ -1582,11 +1620,11 @@ static inline void unlock_ExtINT_logic(v
  */
 static inline void check_timer(void)
 {
 	extern int timer_ack;
 	int pin1, pin2;
-	int vector;
+	int vector, i;

 	/*
 	 * get/set the timer IRQ vector:
 	 */
 	disable_8259A_irq(0);
@@ -1609,25 +1647,39 @@ static inline void check_timer(void)
 	pin2 = find_isa_irq_pin(0, mp_ExtINT);

 	printk(KERN_INFO "..TIMER: vector=0x%02X pin1=%d pin2=%d\n", vector, pin1, pin2);

 	if (pin1 != -1) {
-		/*
-		 * Ok, does IRQ0 through the IOAPIC work?
-		 */
-		unmask_IO_APIC_irq(0);
-		if (timer_irq_works()) {
-			if (nmi_watchdog == NMI_IO_APIC) {
+		for(i=0;i<2;i++) {
+			/*
+			 * Ok, does IRQ0 through the IOAPIC work?
+			 */
+			unmask_IO_APIC_irq(0);
+			if (timer_irq_works()) {
+				if (nmi_watchdog == NMI_IO_APIC) {
+					disable_8259A_irq(0);
+					setup_nmi();
+					enable_8259A_irq(0);
+					check_nmi_watchdog();
+				}
+				printk(KERN_INFO "..TIMER: works OK on IO-APIC irq0\n" );
+				return;
+			}
+			mask_IO_APIC_irq(0);
+			if(!i) { /* try INTIN0 if INTIN2 failed */
+				printk(KERN_ERR "..MP-BIOS bug: 8254 timer not connected to IO-APIC INTIN%d\n",pin1);
+				printk(KERN_INFO "..TIMER: Check if 8254 timer connected to IO-APIC INTIN0? ...\n");
+				replace_IO_APIC_pin_at_irq(0, 0, pin1, 0, 0);
+				timer_ack=0;
 				disable_8259A_irq(0);
-				setup_nmi();
+			} else { /* restore settings */
+				clear_IO_APIC_pin(0, 0);
+				printk(KERN_ERR "..TIMER: 8254 timer not connected to IO-APIC INTIN0\n");
+				timer_ack=1;
 				enable_8259A_irq(0);
-				check_nmi_watchdog();
 			}
-			return;
 		}
-		clear_IO_APIC_pin(0, pin1);
-		printk(KERN_ERR "..MP-BIOS bug: 8254 timer not connected to IO-APIC\n");
 	}

 	printk(KERN_INFO "...trying to set up timer (IRQ0) through the 8259A ... ");
 	if (pin2 != -1) {
 		printk("\n..... (found pin %d) ...", pin2);

Also attached as a tarball in case of whitespace problems.
Hope this helps. Please cc me on responses.
Ross Dickson


[-- Attachment #2: nforce2-ioapic-rd-2.4.26-rc2.patch.tgz --]
[-- Type: application/x-tgz, Size: 1420 bytes --]


* Re: IO-APIC on nforce2
  2004-04-13  1:17 IO-APIC on nforce2 Ross Dickson
@ 2004-04-13  4:01 ` really bensoo_at_soo_dot_com
  2004-04-13  4:55   ` Ross Dickson
  2004-04-13  5:08 ` Len Brown
  1 sibling, 1 reply; 93+ messages in thread
From: really bensoo_at_soo_dot_com @ 2004-04-13  4:01 UTC (permalink / raw)
  To: linux-kernel, Ross Dickson

Very odd.  i'm using plain 2.6.5 with your 2.6.3
APIC patches, and left all this io_apic_set_pci_routing()
stuff in.  And, for the first time in who knows
how long i seem to have a stable computer.  Actually
been up more than eight days.

This is an old overclocked MSI K7N2 with the first
revision of the nForce2 chipset, the one that's only
supposed to have UDMA100 (dunno if that's the chipset
or the MSI mboard: the 2.6.X kernels have always said
during bootup that it's running UDMA133).  i use an
old Tulip ethercard instead of the onboard LAN.

This machine is the beater box: an HTPC and a 24/7
file share client, compile and test stuff, play music
thru an Audigy sound card, burn DVD's, play video
files, many of these things at the same time.

Before this kernel i was lucky to have uptimes over
two days.

b

On Tue, Apr 13, 2004 at 11:17:31AM +1000, Ross Dickson wrote:
> I am working with 2.4.26-rc2 and have noticed a change with the recent acpi?
> update. The recent fix to stop unnecessary ioapic irq routing entries puts the 
> following if statement into io_apic.c, io_apic_set_pci_routing()
> 
> 	/*
> 	 * IRQs < 16 are already in the irq_2_pin[] map
> 	 */
> 	if (irq >= 16)
> 		add_pin_to_irq(irq, ioapic, pin);
> 
> which prevents my io-apic patch from using that function to reprogram the
> io-apic pin on irq0 from pin2 to pin0. 



* Re: IO-APIC on nforce2
  2004-04-13  4:01 ` really bensoo_at_soo_dot_com
@ 2004-04-13  4:55   ` Ross Dickson
  2004-04-13 17:22     ` Christian Kröner
  2004-04-13 21:18     ` really bensoo_at_soo_dot_com
  0 siblings, 2 replies; 93+ messages in thread
From: Ross Dickson @ 2004-04-13  4:55 UTC (permalink / raw)
  To: really bensoo_at_soo_dot_com, linux-kernel
  Cc: Maciej W. Rozycki, Len Brown, christian.kroener

On Tuesday 13 April 2004 14:01, really bensoo_at_soo_dot_com wrote:
> Very odd.  i'm using plain 2.6.5 with your 2.6.3

Yes odd, it's the first report of a "hi-load" XT-PIC issue I know of.

> APIC patches, and left all this io_apic_set_pci_routing()
> stuff in.  And, for the first time in who knows
> how long i seem to have a stable computer.  Actually
> been up more than eight days.

Sounds Very Good.

Are you using my io-apic patch with the apic ack delay or with the
C1idle version?
i.e. patched io_apic.c and apic.c and using kernel arg "apic_tack="
or patched io_apic.c and process.c and using kernel arg "idle=C1halt"?

My cat /proc/cmdline:
root=/dev/hdb2 idle=C1halt nmi_watchdog=1

Could you please cat /proc/interrupts?
I would like to see how irq0 is routed.
Mine looks like this:

           CPU0
  0:     229404    IO-APIC-edge  timer
  1:        376    IO-APIC-edge  keyboard
  2:          0          XT-PIC  cascade
  9:          0   IO-APIC-level  acpi
 12:      13499    IO-APIC-edge  PS/2 Mouse
 14:      10482    IO-APIC-edge  ide0
 15:         73    IO-APIC-edge  ide1
 16:      27055   IO-APIC-level  nvidia
 20:      46913   IO-APIC-level  eth0, usb-ohci
 21:       3660   IO-APIC-level  ehci_hcd, NVIDIA nForce Audio
 22:          0   IO-APIC-level  usb-ohci
NMI:     229547
LOC:     229340
ERR:          0
MIS:          0

And from boot log
with my new timer setup

ENABLING IO-APIC IRQs
init IO_APIC IRQs
 IO-APIC (apicid-pin) 2-0, 2-16, 2-17, 2-18, 2-19, 2-20, 2-21, 2-22, 2-23 not connected.
..TIMER: vector=0x31 pin1=2 pin2=-1
..MP-BIOS bug: 8254 timer not connected to IO-APIC INTIN2
..TIMER: Check if 8254 timer connected to IO-APIC INTIN0? ...
activating NMI Watchdog ... done.
testing NMI watchdog ... OK.
..TIMER: works OK on IO-APIC irq0
Using local APIC timer interrupts.
calibrating APIC timer ...

and my ioapic routing

number of MP IRQ sources: 15.
number of IO-APIC #2 registers: 24.
testing the IO APIC.......................

IO APIC #2......
.... register #00: 02000000
.......    : physical APIC id: 02
.......    : Delivery Type: 0
.......    : LTS          : 0
.... register #01: 00170011
.......     : max redirection entries: 0017
.......     : PRQ implemented: 0
.......     : IO APIC version: 0011
.... register #02: 00000000
.......     : arbitration: 00
.... IRQ redirection table:
 NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
 00 001 01  0    0    0   0   0    1    1    31
 01 001 01  0    0    0   0   0    1    1    39
 02 000 00  1    0    0   0   0    0    0    00
 03 001 01  0    0    0   0   0    1    1    41
 04 001 01  0    0    0   0   0    1    1    49
 05 001 01  0    0    0   0   0    1    1    51
 06 001 01  0    0    0   0   0    1    1    59
 07 001 01  0    0    0   0   0    1    1    61
 08 001 01  0    0    0   0   0    1    1    69
 09 001 01  0    1    0   0   0    1    1    71
 0a 001 01  0    0    0   0   0    1    1    79
 0b 001 01  0    0    0   0   0    1    1    81
 0c 001 01  0    0    0   0   0    1    1    89
 0d 001 01  0    0    0   0   0    1    1    91
 0e 001 01  0    0    0   0   0    1    1    99
 0f 001 01  0    0    0   0   0    1    1    A1
 10 001 01  1    1    0   0   0    1    1    D9
 11 001 01  1    1    0   0   0    1    1    E1
 12 001 01  1    1    0   0   0    1    1    C9
 13 001 01  1    1    0   0   0    1    1    D1
 14 001 01  1    1    0   0   0    1    1    B1
 15 001 01  1    1    0   0   0    1    1    C1
 16 001 01  1    1    0   0   0    1    1    B9
 17 001 01  1    1    0   0   0    1    1    A9
IRQ to pin mappings:
IRQ0 -> 0:0
IRQ1 -> 0:1
IRQ3 -> 0:3
IRQ4 -> 0:4
IRQ5 -> 0:5
IRQ6 -> 0:6
IRQ7 -> 0:7
IRQ8 -> 0:8
IRQ9 -> 0:9
IRQ10 -> 0:10
IRQ11 -> 0:11
IRQ12 -> 0:12
IRQ13 -> 0:13
IRQ14 -> 0:14
IRQ15 -> 0:15
IRQ16 -> 0:16
IRQ17 -> 0:17
IRQ18 -> 0:18
IRQ19 -> 0:19
IRQ20 -> 0:20
IRQ21 -> 0:21
IRQ22 -> 0:22
IRQ23 -> 0:23
.................................... done.
PCI: Using ACPI for IRQ routing
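
For readers matching the redirection table columns to register contents, here is a small decoder sketch. The bit positions follow the standard I/O APIC redirection entry layout; `struct rte_fields`, `rte_decode()`, `rte_low_reg()` and `rte_high_reg()` are names made up for this illustration, and the raw words used with it are reconstructed from the printed fields rather than read from hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of one 64-bit redirection entry, named after the table columns. */
struct rte_fields {
	unsigned vector;		/* Vect */
	unsigned delivery_mode;		/* Deli */
	unsigned dest_mode;		/* Dest: 0 = physical, 1 = logical */
	unsigned delivery_status;	/* Stat */
	unsigned polarity;		/* Pol  */
	unsigned remote_irr;		/* IRR  */
	unsigned trigger;		/* Trig: 0 = edge, 1 = level */
	unsigned mask;			/* Mask */
	unsigned dest;			/* Log/Phy destination */
};

/* Register indices of a pin's entry: 0x10 + 2*pin (low), 0x11 + 2*pin (high). */
static unsigned rte_low_reg(unsigned pin)  { return 0x10 + 2 * pin; }
static unsigned rte_high_reg(unsigned pin) { return 0x11 + 2 * pin; }

/* Split the two 32-bit register words into their fields. */
static struct rte_fields rte_decode(uint32_t low, uint32_t high)
{
	struct rte_fields f;

	f.vector          = low & 0xff;
	f.delivery_mode   = (low >> 8) & 0x7;
	f.dest_mode       = (low >> 11) & 0x1;
	f.delivery_status = (low >> 12) & 0x1;
	f.polarity        = (low >> 13) & 0x1;
	f.remote_irr      = (low >> 14) & 0x1;
	f.trigger         = (low >> 15) & 0x1;
	f.mask            = (low >> 16) & 0x1;
	f.dest            = (high >> 24) & 0xff;
	return f;
}
```

Under this layout, row 00 of the table above (unmasked, edge, logical destination 01, delivery mode 1, vector 31) corresponds to a low word of 0x00000931 and a high word of 0x01000000, and the masked row 02 to a low word of 0x00010000.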


> 
> This is an old overclocked MSI K7N2 with the first
> revision of the nForce2 chipset, the one that's only
> supposed to have UDMA100 (dunno if that's the chipset
> or the MSI mboard: the 2.6.X kernels have always said
> during bootup that it's running UDMA133).  i use an
> old Tulip ethercard instead of the onboard LAN.
>
 
I am now using forcedeth for onboard ether. It works well and is
convenient when rebuilding and testing kernels and modules.

> This machine is the beater box: an HTPC and a 24/7
> file share client, compile and test stuff, play music
> thru an Audigy sound card, burn DVD's, play video
> files, many of these things at the same time.
> 
> Before this kernel i was lucky to have uptimes over
> two days.

Yes I remember how frustrating it felt to have linux regularly die and 
even fail to boot properly.

> 
> b
> 
> On Tue, Apr 13, 2004 at 11:17:31AM +1000, Ross Dickson wrote:
> > I am working with 2.4.26-rc2 and have noticed a change with the recent acpi?
> > update. The recent fix to stop unnecessary ioapic irq routing entries puts the 
> > following if statement into io_apic.c, io_apic_set_pci_routing()
> > 
> > 	/*
> > 	 * IRQs < 16 are already in the irq_2_pin[] map
> > 	 */
> > 	if (irq >= 16)
> > 		add_pin_to_irq(irq, ioapic, pin);
> > 
> > which prevents my io-apic patch from using that function to reprogram the
> > io-apic pin on irq0 from pin2 to pin0. 
> 
> 
> 
> 

I did some more reading on kernel versions re Maciej's 8259 ack patch.
Ignore my comments in the previous posting, as the patch was fully pulled from all
kernels at the end of 2.6.3, i.e. it never appeared in 2.6.4 or later.

>>I have not as yet downloaded 2.6.5xxx
>>From memory this 2.4.26-rc2 code should be very similar to the (2.6.5-linus)
>>but a bit different to the -mm series. For the -mm series I think you can drop
>>the "timer_ack=" lines from my changes as it still has Maciej Rozycki's 8259 
>>ack patch? The timer ack should already have been correctly set to off by its
>>checking if the apic is an integrated one.

Shame as it seemed theoretically correct to me to not ack.
http://linux.derkeiler.com/Mailing-Lists/Kernel/2004-04/2143.html

Regards
Ross.





* Re: IO-APIC on nforce2
  2004-04-13  1:17 IO-APIC on nforce2 Ross Dickson
  2004-04-13  4:01 ` really bensoo_at_soo_dot_com
@ 2004-04-13  5:08 ` Len Brown
  2004-04-13  7:03   ` Ross Dickson
  1 sibling, 1 reply; 93+ messages in thread
From: Len Brown @ 2004-04-13  5:08 UTC (permalink / raw)
  To: ross; +Cc: christian.kroener, linux-kernel, Maciej W. Rozycki

On Mon, 2004-04-12 at 21:17, Ross Dickson wrote:

> I am working with 2.4.26-rc2 and have noticed a change with the recent acpi?
> update. The recent fix to stop unnecessary ioapic irq routing entries puts the 
> following if statement into io_apic.c, io_apic_set_pci_routing()
> 
> 	/*
> 	 * IRQs < 16 are already in the irq_2_pin[] map
> 	 */
> 	if (irq >= 16)
> 		add_pin_to_irq(irq, ioapic, pin);
> 
> which prevents my io-apic patch from using that function to reprogram the
> io-apic pin on irq0 from pin2 to pin0. 
> 
> As a quick fix you could drop the "if (irq >= 16)".
> I don't know what harm if any that would do other than create unwanted
> irq mapping entries as in the past.

I made that change -- sorry I broke your patch.
No, I doubt it would matter if you hacked out "if (irq >=16)"
for the time being.

I haven't been following this thread closely, but
http://bugme.osdl.org/show_bug.cgi?id=1203 says I should;-)

I understand that these boards have the timer attached to pin0
in APIC mode, but that the BIOS says it is connected to pin2:

ACPI: INT_SRC_OVR (bus[0] irq[0x0] global_irq[0x2] polarity[0x0]
trigger[0x0])

Wouldn't it be a simpler patch to recognize this board and simply
disable this bogus BIOS INT_SRC_OVR?
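
The idea of ignoring the override on affected boards could look roughly like the sketch below. It is purely illustrative: `struct int_src_ovr`, `should_skip_override()` and `timer_pin()` are hypothetical names, and how a board would be recognised is left as a plain boolean supplied by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The fields printed in the INT_SRC_OVR line above. */
struct int_src_ovr {
	unsigned bus;
	unsigned irq;
	unsigned global_irq;
};

/*
 * Hypothetical policy: on a board known to report a bogus timer
 * override, ignore an ISA IRQ0 entry that points away from pin 0.
 */
static bool should_skip_override(const struct int_src_ovr *ovr,
				 bool board_has_bogus_timer_ovr)
{
	return board_has_bogus_timer_ovr &&
	       ovr->bus == 0 && ovr->irq == 0 && ovr->global_irq != 0;
}

/* Pin the timer would be routed to after applying (or skipping) the override. */
static unsigned timer_pin(const struct int_src_ovr *ovr,
			  bool board_has_bogus_timer_ovr)
{
	if (ovr != NULL && ovr->irq == 0 &&
	    !should_skip_override(ovr, board_has_bogus_timer_ovr))
		return ovr->global_irq;
	return 0;	/* identity mapping: IRQ0 -> pin 0 */
}
```

With the override from the log (irq 0 to global_irq 2) the model yields pin 2 on an unflagged board and pin 0 on a flagged one.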

Also, what is the symptom of the XT-PIC timer?  Is it the source
of the nForce2 hangs, or something else?  The latest message
suggested that it caused a background load on the system, but
I don't recall hearing that one on this thread before.

thanks,
-Len




* Re: IO-APIC on nforce2
  2004-04-13  5:08 ` Len Brown
@ 2004-04-13  7:03   ` Ross Dickson
  2004-04-13 13:46     ` Maciej W. Rozycki
  2004-04-14  1:02     ` IO-APIC on nforce2 [PATCH] Len Brown
  0 siblings, 2 replies; 93+ messages in thread
From: Ross Dickson @ 2004-04-13  7:03 UTC (permalink / raw)
  To: Len Brown; +Cc: christian.kroener, linux-kernel, Maciej W. Rozycki

On Tuesday 13 April 2004 15:08, Len Brown wrote:
> On Mon, 2004-04-12 at 21:17, Ross Dickson wrote:
> 
> > I am working with 2.4.26-rc2 and have noticed a change with the recent acpi?
> > update. The recent fix to stop unnecessary ioapic irq routing entries puts the 
> > following if statement into io_apic.c, io_apic_set_pci_routing()
> > 
> > 	/*
> > 	 * IRQs < 16 are already in the irq_2_pin[] map
> > 	 */
> > 	if (irq >= 16)
> > 		add_pin_to_irq(irq, ioapic, pin);
> > 
> > which prevents my io-apic patch from using that function to reprogram the
> > io-apic pin on irq0 from pin2 to pin0. 
> > 
> > As a quick fix you could drop the "if (irq >= 16)".
> > I don't know what harm if any that would do other than create unwanted
> > irq mapping entries as in the past.
> 
> I made that change -- sorry I broke your patch.
> No, I doubt it would matter if you hacked out "if (irq >=16)"
> for the time being.

Thanks Len, my patch was a bit of a quick hack anyway.

> 
> I haven't been following this thread closely, but
> http://bugme.osdl.org/show_bug.cgi?id=1203 says I should;-)
> 
> I understand that these boards have the timer attached to pin0
> in APIC mode, but that the BIOS says it is connected to pin2:
> 
> ACPI: INT_SRC_OVR (bus[0] irq[0x0] global_irq[0x2] polarity[0x0]
> trigger[0x0])
> 
> Wouldn't it be a simpler patch to recognize this board and simply
> disable this bogus BIOS INT_SRC_OVR?

I will go with you on this one as I have read the intel spec docs 
but have not yet learnt the acpi code base. 

Maciej forwarded me an override patch he developed for another
architecture, where one could specify MP info as kernel args. That worked, but
we still had no nmi_watchdog=1 in the timer_ack=1 situation, which he then
fixed in 2.6.3-mm3, but it got pulled for 2.6.4.

Maciej, is that override code good to go on latest kernels? I am a novice to 
acpi parsing etc.

Also some users reported clock skew with timer routed via io-apic pin0.
We never got to the bottom of that so I don't know if doing a pci quirk
on nforce2 would satisfy all for widespread use.

> 
> Also, what is the symptom of the XT-PIC timer?  Is it the source
> of the nForce2 hangs, or something else?  The latest message
> suggested that it caused a background load on the system, but
> I don't recall hearing that one on this thread before.

Christian could we have more detail on "hi-load" XTPIC please?

Source of nforce2 hang is officially not commented on by Nvidia or 
AMD. 

From what I know it now appears to be an Athlon-to-chipset problem, as it has
also occurred on an SIS-740 board. It seems to have less to do with the
interrupt routing and everything to do with the timing of back-to-back C1
disconnect cycles when those cycles are occurring at a high rate.

Unfortunately spurious interrupts contribute to disconnect rate - and there
are lots of those in XT-PIC mode. I hacked the proc/interrupts code to view
them on irq7 and it was really bad if I used local apic without io-apic.

What I think the mechanism is...
After a C1 cycle has occurred, if the HLT instruction (to disconnect again) is
executed sooner than about 1us after the interrupt that pulled the cpu out
of the C1 cycle, then we likely die. The probability of this happening
greatly increases with the rate of C1 cycles, as is evident from the 1000Hz timer
ticks of 2.6 showing the problem more than the 100Hz ticks of 2.4.
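
As a rough illustration of that hypothesis only, the two quantities involved can be written down as below. The ~1us window is the estimate given above, not a documented figure, and `tick_period_us()` and `halt_too_soon()` are hypothetical helpers invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Timer tick period in microseconds for a given HZ value. */
static uint32_t tick_period_us(uint32_t hz)
{
	assert(hz > 0);
	return 1000000u / hz;
}

/*
 * True if HLT would be issued less than min_gap_ns after the
 * interrupt that woke the cpu, i.e. inside the suspected window.
 */
static bool halt_too_soon(uint64_t now_ns, uint64_t last_irq_ns,
			  uint64_t min_gap_ns)
{
	assert(now_ns >= last_irq_ns);
	return (now_ns - last_irq_ns) < min_gap_ns;
}
```

At HZ=1000 the tick period is 1000us against 10000us at HZ=100, so an otherwise idle 2.6 system goes through roughly ten times as many timer-driven wake and halt cycles per second as a 2.4 one, giving ten times as many chances to land inside the window.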

Also acpi support for nforce2 in apic with io-apic mode is not widespread 
amongst major 2.4 distros to my knowledge - they stick with
XTPIC on install. Also in XTPIC mode the southbridge accesses provide
the delay time needed for stability in most cases but of course NVIDIA to my
knowledge have not published PCI irq routing registers to be able to manually 
route irqs so devices get stuck unnecessarily sharing a single irq in XTPIC 
mode. I tried a kernel hacked with the AMD 76x registers but they were 
obviously different. 

I think this is going to be a major headache when 2.6 is the main stream distro
as there is a lot of cheap nforce2 out there.
Judging from the silence from AMD hardware vendors- they seem prepared to
wait it out - maybe hoping everyone will go 64bit before it hits the fan?

-Ross.

> 
> thanks,
> -Len
> 
> 
> 
> 
> 



* Re: IO-APIC on nforce2
  2004-04-13  7:03   ` Ross Dickson
@ 2004-04-13 13:46     ` Maciej W. Rozycki
  2004-04-14  1:02     ` IO-APIC on nforce2 [PATCH] Len Brown
  1 sibling, 0 replies; 93+ messages in thread
From: Maciej W. Rozycki @ 2004-04-13 13:46 UTC (permalink / raw)
  To: Ross Dickson; +Cc: Len Brown, christian.kroener, linux-kernel

On Tue, 13 Apr 2004, Ross Dickson wrote:

> Maciej forwarded me an override patch he developed for another
> architecture, where one could specify MP info as kernel args. That worked, but
> we still had no nmi_watchdog=1 in the timer_ack=1 situation, which he then
> fixed in 2.6.3-mm3, but it got pulled for 2.6.4.
> 
> Maciej, is that override code good to go on latest kernels? I am a novice to 
> acpi parsing etc.

 I suppose it should be fine.

> Unfortunately spurious interrupts contribute to disconnect rate - and there
> are lots of those in XT-PIC mode. I hacked the proc/interrupts code to view
> them on irq7 and it was really bad if I used local apic without io-apic.

 Spurious interrupts are normally recorded in the "ERR" entry in
/proc/interrupts, so you shouldn't have to record them separately.  And
there should be none counted, except perhaps a few arriving upon 
initialization of the local APIC.
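
A quick way to check that counter from user space is to pull the "ERR" line out of the /proc/interrupts text. The helper below is a minimal sketch that works on a buffer the caller has already read; `parse_err_count()` is a made-up name and the parsing assumes the line format shown in the listings in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the count on the "ERR:" line of a /proc/interrupts-style
 * buffer, or -1 if there is no such line.
 */
static long parse_err_count(const char *text)
{
	const char *line = text;

	while (line != NULL && *line != '\0') {
		long count;

		/* Leading blanks are skipped, then "ERR:" must match literally. */
		if (sscanf(line, " ERR: %ld", &count) == 1)
			return count;

		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line != NULL)
			line++;
	}
	return -1;
}
```

Fed the /proc/interrupts listings posted in this thread, it would report 0, matching the "ERR: 0" lines.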

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +


* Re: IO-APIC on nforce2
  2004-04-13  4:55   ` Ross Dickson
@ 2004-04-13 17:22     ` Christian Kröner
  2004-04-13 21:18     ` really bensoo_at_soo_dot_com
  1 sibling, 0 replies; 93+ messages in thread
From: Christian Kröner @ 2004-04-13 17:22 UTC (permalink / raw)
  To: linux-kernel; +Cc: Maciej W. Rozycki, Len Brown, ross

Here is some other (maybe useful) info I can give:

This is part of my system log from kernel 2.6.5-mm4 (no other patches than -mm).
The irq0 gets set to XT-PIC with this kernel version...

timer setup:

ENABLING IO-APIC IRQs
init IO_APIC IRQs
IO-APIC (apicid-pin) 2-0, 2-16, 2-17, 2-18, 2-19, 2-20, 2-21, 2-22, 2-23 not connected.
 ..TIMER: vector=0x31 pin1=2 pin2=-1
..MP-BIOS bug: 8254 timer not connected to IO-APIC
...trying to set up timer (IRQ0) through the 8259A ...  failed.
...trying to set up timer as Virtual Wire IRQ... failed.
...trying to set up timer as ExtINT IRQ... works.
Using local APIC timer interrupts.

interrupt routing:

number of MP IRQ sources: 15.
number of IO-APIC #2 registers: 24.
testing the IO APIC.......................
  IO APIC #2......
  .... register #00: 02000000
  .......    : physical APIC id: 02
  .......    : Delivery Type: 0
  .......    : LTS          : 0
  .... register #01: 00170011
  .......     : max redirection entries: 0017
  .......     : PRQ implemented: 0
  .......     : IO APIC version: 0011
  .... register #02: 00000000
  .......     : arbitration: 00
  .... IRQ redirection table:
   NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
   00 000 00  1    0    0   0   0    0    0    00
   01 001 01  0    0    0   0   0    1    1    39
   02 000 00  1    0    0   0   0    0    0    00
   03 001 01  0    0    0   0   0    1    1    41
   04 001 01  0    0    0   0   0    1    1    49
   05 001 01  0    0    0   0   0    1    1    51
   06 001 01  0    0    0   0   0    1    1    59
   07 001 01  1    0    0   0   0    1    1    61
   08 001 01  0    0    0   0   0    1    1    69
   09 001 01  0    1    0   0   0    1    1    71
   0a 001 01  0    0    0   0   0    1    1    79
   0b 001 01  0    0    0   0   0    1    1    81
   0c 001 01  0    0    0   0   0    1    1    89
   0d 001 01  0    0    0   0   0    1    1    91
   0e 001 01  0    0    0   0   0    1    1    99
   0f 001 01  0    0    0   0   0    1    1    A1
   10 001 01  1    1    0   0   0    1    1    D1
   11 001 01  1    1    0   0   0    1    1    D9
   12 001 01  1    1    0   0   0    1    1    E1
   13 001 01  1    1    0   0   0    1    1    C9
   14 001 01  1    1    0   0   0    1    1    B1
   15 001 01  1    1    0   0   0    1    1    C1
   16 001 01  1    1    0   0   0    1    1    B9
   17 001 01  1    1    0   0   0    1    1    A9


Now, with 2.6.5-mm5-1, patched by hand with the io_apic.c-patch I got from Ross
(removing the declaration of extern int timer_ack in check_timer(), changing nothing else),
I get the following:

timer setup:

ENABLING IO-APIC IRQs
init IO_APIC IRQs
 IO-APIC (apicid-pin) 2-0, 2-16, 2-17, 2-18, 2-19, 2-20, 2-21, 2-22, 2-23 not connected.
..TIMER: vector=0x31 pin1=2 pin2=-1
..MP-BIOS bug: 8254 timer not connected to IO-APIC INTIN2
..TIMER: Check if 8254 timer connected to IO-APIC INTIN0? ...
..TIMER: works OK on IO-APIC irq0
Using local APIC timer interrupts.

irq routing:

number of MP IRQ sources: 15.
number of IO-APIC #2 registers: 24.
testing the IO APIC.......................
IO APIC #2......
.... register #00: 02000000
.......    : physical APIC id: 02
.......    : Delivery Type: 0
.......    : LTS          : 0
.... register #01: 00170011
.......     : max redirection entries: 0017
.......     : PRQ implemented: 0
.......     : IO APIC version: 0011
.... register #02: 00000000
.......     : arbitration: 00
.... IRQ redirection table:
 NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
 00 001 01  0    0    0   0   0    1    1    31
 01 001 01  0    0    0   0   0    1    1    39
 02 000 00  1    0    0   0   0    0    0    00
 03 001 01  0    0    0   0   0    1    1    41
 04 001 01  0    0    0   0   0    1    1    49
 05 001 01  0    0    0   0   0    1    1    51
 06 001 01  0    0    0   0   0    1    1    59
 07 001 01  0    0    0   0   0    1    1    61
 08 001 01  0    0    0   0   0    1    1    69
 09 001 01  0    1    0   0   0    1    1    71
 0a 001 01  0    0    0   0   0    1    1    79
 0b 001 01  0    0    0   0   0    1    1    81
 0c 001 01  0    0    0   0   0    1    1    89
 0d 001 01  0    0    0   0   0    1    1    91
 0e 001 01  0    0    0   0   0    1    1    99
 0f 001 01  0    0    0   0   0    1    1    A1
 10 001 01  1    1    0   0   0    1    1    D1
 11 001 01  1    1    0   0   0    1    1    D9
 12 001 01  1    1    0   0   0    1    1    E1
 13 001 01  1    1    0   0   0    1    1    C9
 14 001 01  1    1    0   0   0    1    1    B1
 15 001 01  1    1    0   0   0    1    1    C1
 16 001 01  1    1    0   0   0    1    1    B9
 17 001 01  1    1    0   0   0    1    1    A9
IRQ to pin mappings:
IRQ0 -> 0:0
IRQ1 -> 0:1
IRQ3 -> 0:3
IRQ4 -> 0:4
IRQ5 -> 0:5
IRQ6 -> 0:6
IRQ7 -> 0:7
IRQ8 -> 0:8
IRQ9 -> 0:9
IRQ10 -> 0:10
IRQ11 -> 0:11
IRQ12 -> 0:12
IRQ13 -> 0:13
IRQ14 -> 0:14
IRQ15 -> 0:15
IRQ16 -> 0:16
IRQ17 -> 0:17
IRQ18 -> 0:18
IRQ19 -> 0:19
IRQ20 -> 0:20
IRQ21 -> 0:21
IRQ22 -> 0:22
IRQ23 -> 0:23
.................................... done.
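
For anyone comparing these dumps by eye, here is a minimal user-space sketch of pulling the fields out of one row of the redirection table above. It is not kernel code: the struct and function names are made up for illustration, and it assumes every column is printed in hex, as the dump appears to be.

```c
#include <stdio.h>

/* One row of the "IRQ redirection table" dump (hypothetical layout). */
struct rte_row {
    unsigned pin;       /* NR   */
    unsigned log_dest;  /* Log  */
    unsigned phy_dest;  /* Phy  */
    unsigned mask;      /* Mask: 1 = masked */
    unsigned trig;      /* Trig: 1 = level, 0 = edge */
    unsigned irr, pol, stat, dest_mode, deli;
    unsigned vector;    /* Vect */
};

/* Parse a row such as " 00 001 01  0    0    0   0   0    1    1    31".
 * Returns 1 on success, 0 if the line does not have all 11 columns. */
static int parse_rte_row(const char *line, struct rte_row *r)
{
    int n = sscanf(line, "%x %x %x %x %x %x %x %x %x %x %x",
                   &r->pin, &r->log_dest, &r->phy_dest, &r->mask,
                   &r->trig, &r->irr, &r->pol, &r->stat,
                   &r->dest_mode, &r->deli, &r->vector);
    return n == 11;
}

/* A pin carries the timer if it is unmasked and holds the timer vector. */
static int pin_is_live_timer(const struct rte_row *r, unsigned timer_vector)
{
    return r->mask == 0 && r->vector == timer_vector;
}
```

Fed the rows above with the timer vector 0x31 from the "..TIMER:" line, pin 00 qualifies as the live timer pin, while pin 02 is masked and unused.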

cat /proc/interrupts gets me:

          CPU0
  0:     568313    IO-APIC-edge  timer
  1:       1359    IO-APIC-edge  i8042
  2:          0          XT-PIC  cascade
  7:          0    IO-APIC-edge  parport0
  8:          4    IO-APIC-edge  rtc
  9:          0   IO-APIC-level  acpi
 12:      14859    IO-APIC-edge  i8042
 14:      17983    IO-APIC-edge  ide0
 15:         92    IO-APIC-edge  ide1
 16:       2335   IO-APIC-level  ide2, saa7134[0]
 17:        142   IO-APIC-level  CMI8738
 19:      31779   IO-APIC-level  nvidia
 20:      72619   IO-APIC-level  ohci_hcd, eth0
 21:   86626041   IO-APIC-level  ehci_hcd
 22:         78   IO-APIC-level  ohci_hcd
NMI:          0
LOC:     566374
ERR:          0
MIS:          0

There is NO constant hi-load anymore, cool!

thanks, christian.

P.S: Ross, could you send a patch that could be applied using the patch-utility?


* Re: IO-APIC on nforce2
  2004-04-13  4:55   ` Ross Dickson
  2004-04-13 17:22     ` Christian Kröner
@ 2004-04-13 21:18     ` really bensoo_at_soo_dot_com
  2004-04-14  4:24       ` really bensoo_at_soo_dot_com
  1 sibling, 1 reply; 93+ messages in thread
From: really bensoo_at_soo_dot_com @ 2004-04-13 21:18 UTC (permalink / raw)
  To: Ross Dickson, linux-kernel

On Tue, Apr 13, 2004 at 02:55:52PM +1000, Ross Dickson wrote:
> Are you using my io-apic patch with the apic ack delay or with the
> C1idle version?
> i.e. patched io_apic.c and apic.c and using kernel arg "apic_tack="
> or patched io_apic.c and process.c and using kernel arg "idle=C1halt"?

i'm using your C1idle patches:

nforce2-idleC1halt-rd-2.6.3.patch
nforce2-ioapic-rd-2.6.3.patch

cat /proc/cmdline
BOOT_IMAGE=linux-test ro root=305 idebus=33 acpi=on idle=C1halt

> Could you please cat /proc/interrupts.
> I would like to see how irq0 is routed.

My irq0 says XT-PIC.  i'm not complaining, box's still
very stable and since the last post i've burned a few
DVDs on it while running the file share client and
playing music.

cat /proc/interrupts

           CPU0       
  0:  759809583          XT-PIC  timer
  1:     382279    IO-APIC-edge  i8042
  2:          0          XT-PIC  cascade
  8:          1    IO-APIC-edge  rtc
  9:          0   IO-APIC-level  acpi
 12:    6386931    IO-APIC-edge  i8042
 14:    2117474    IO-APIC-edge  ide0
 15:    5575006    IO-APIC-edge  ide1
201:    6425958   IO-APIC-level  EMU10K1
209:  167929203   IO-APIC-level  eth0
NMI:          0 
LOC:  759718637 
ERR:          0
MIS:          0
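
The question of how irq0 is routed can also be answered mechanically. The following is a minimal sketch, assuming the single-CPU /proc/interrupts layout shown above; the helper names are hypothetical and nothing here comes from the kernel source.

```c
#include <stdio.h>
#include <string.h>

/* Extract the controller name ("XT-PIC", "IO-APIC-edge", ...) for one IRQ
 * from a single-CPU /proc/interrupts line such as
 * "  0:  759809583          XT-PIC  timer".
 * Returns 1 and fills 'ctrl' if the line is for 'irq', else 0. */
static int irq_controller(const char *line, int irq, char *ctrl, size_t len)
{
    int num;
    unsigned long count;
    char name[32];

    if (sscanf(line, "%d: %lu %31s", &num, &count, name) != 3)
        return 0;               /* header, NMI/LOC/ERR/MIS or short line */
    if (num != irq)
        return 0;
    snprintf(ctrl, len, "%s", name);
    return 1;
}

/* True if the timer still goes through the legacy 8259A (XT-PIC). */
static int timer_on_xt_pic(const char *line)
{
    char ctrl[32];

    return irq_controller(line, 0, ctrl, sizeof ctrl) &&
           strcmp(ctrl, "XT-PIC") == 0;
}
```

Calling timer_on_xt_pic() on each line of the listing above flags only the IRQ 0 line; on a box where the timer is routed through the IO-APIC the controller reads IO-APIC-edge and nothing is flagged.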


> 
> And from boot log
> with my new timer setup

my boot dmesg timer setup:

ENABLING IO-APIC IRQs
init IO_APIC IRQs
 IO-APIC (apicid-pin) 2-0, 2-16, 2-17, 2-18, 2-19, 2-20, 2-21, 2-22, 2-23 not connected.
..TIMER: vector=0x31 pin1=2 pin2=-1
..MP-BIOS bug: 8254 timer not connected to IO-APIC INTIN2
..TIMER: Is timer irq0 connected to IO-APIC INTIN0? ...
IOAPIC[0]: Set PCI routing entry (2-0 -> 0x31 -> IRQ 0 Mode:0 Active:0)
IOAPIC[0]: Set PCI routing entry (2-2 -> 0x31 -> IRQ 0 Mode:0 Active:0)
..MP-BIOS: 8254 timer not connected to IO-APIC INTIN0
...trying to set up timer (IRQ0) through the 8259A ...  failed.
...trying to set up timer as Virtual Wire IRQ... failed.
...trying to set up timer as ExtINT IRQ... works.
Using local APIC timer interrupts.
calibrating APIC timer ...
..... CPU clock speed is 2135.0772 MHz.
..... host bus clock speed is 388.0322 MHz.

-------------------------------------------------------------
ioapic routing:

number of MP IRQ sources: 15.
number of IO-APIC #2 registers: 24.
testing the IO APIC.......................

IO APIC #2......
.... register #00: 02000000
.......    : physical APIC id: 02
.......    : Delivery Type: 0
.......    : LTS          : 0
.... register #01: 00170011
.......     : max redirection entries: 0017
.......     : PRQ implemented: 0
.......     : IO APIC version: 0011
.... register #02: 00000000
.......     : arbitration: 00
.... IRQ redirection table:
 NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:   
 00 000 00  1    0    0   0   0    0    0    00
 01 001 01  0    0    0   0   0    1    1    39
 02 001 01  1    0    0   0   0    1    1    31
 03 001 01  0    0    0   0   0    1    1    41
 04 001 01  0    0    0   0   0    1    1    49
 05 001 01  0    0    0   0   0    1    1    51
 06 001 01  0    0    0   0   0    1    1    59
 07 001 01  1    0    0   0   0    1    1    61
 08 001 01  0    0    0   0   0    1    1    69
 09 001 01  0    1    0   0   0    1    1    71
 0a 001 01  0    0    0   0   0    1    1    79
 0b 001 01  0    0    0   0   0    1    1    81
 0c 001 01  0    0    0   0   0    1    1    89
 0d 001 01  0    0    0   0   0    1    1    91
 0e 001 01  0    0    0   0   0    1    1    99
 0f 001 01  0    0    0   0   0    1    1    A1
 10 001 01  1    1    0   0   0    1    1    D1
 11 001 01  1    1    0   0   0    1    1    D9
 12 001 01  1    1    0   0   0    1    1    E1
 13 001 01  1    1    0   0   0    1    1    C9
 14 001 01  1    1    0   0   0    1    1    B1
 15 001 01  1    1    0   0   0    1    1    C1
 16 001 01  1    1    0   0   0    1    1    B9
 17 001 01  1    1    0   0   0    1    1    A9
IRQ to pin mappings:
IRQ0 -> 0:2
IRQ1 -> 0:1
IRQ3 -> 0:3
IRQ4 -> 0:4
IRQ5 -> 0:5
IRQ6 -> 0:6
IRQ7 -> 0:7
IRQ8 -> 0:8
IRQ9 -> 0:9
IRQ10 -> 0:10
IRQ11 -> 0:11
IRQ12 -> 0:12
IRQ13 -> 0:13
IRQ14 -> 0:14
IRQ15 -> 0:15
IRQ16 -> 0:16
IRQ17 -> 0:17
IRQ18 -> 0:18
IRQ19 -> 0:19
IRQ20 -> 0:20
IRQ21 -> 0:21
IRQ22 -> 0:22
IRQ23 -> 0:23
.................................... done.
PCI: Using ACPI for IRQ routing

> I am now using forcedeth for onboard ether. It works well and is
> convenient when rebuilding and testing kernels and modules.

i would too but am still on a coax network here...
What?  Upgrade?  What?

b


* Re: IO-APIC on nforce2 [PATCH]
  2004-04-13  7:03   ` Ross Dickson
  2004-04-13 13:46     ` Maciej W. Rozycki
@ 2004-04-14  1:02     ` Len Brown
  2004-04-14  5:02       ` Ross Dickson
                         ` (2 more replies)
  1 sibling, 3 replies; 93+ messages in thread
From: Len Brown @ 2004-04-14  1:02 UTC (permalink / raw)
  To: ross; +Cc: christian.kroener, linux-kernel, Maciej W. Rozycki

[-- Attachment #1: Type: text/plain, Size: 2294 bytes --]

Re: IRQ0 XT-PIC timer issue

Since the hardware is connected to APIC pin0, it is a BIOS bug
that an ACPI interrupt source override from pin2 to IRQ0 exists.

With this simple 2.6.5 patch you can specify "acpi_skip_timer_override"
to ignore that bogus BIOS directive.  The result is with your
ACPI-enabled APIC-enabled kernel, you'll get IRQ0 IO-APIC-edge timer.

Probably there is a more clever way to trigger this workaround
automatically instead of via boot parameter.
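
The logic of the workaround can be modelled outside the kernel. This is a minimal sketch, not the patch itself: struct irq_override is a simplified stand-in for the ACPI interrupt source override entry, the function names are hypothetical, and the command line is matched on whole words here, whereas the patch uses a 24-byte memcmp() prefix match.

```c
#include <string.h>

/* Simplified stand-in for an ACPI interrupt source override entry. */
struct irq_override {
    unsigned bus_irq;     /* legacy ISA IRQ number */
    unsigned global_irq;  /* IO-APIC pin the BIOS claims it is wired to */
};

static int skip_timer_override;  /* set from the boot command line */

/* Scan a space-separated command line for the flag (whole-word match). */
static void parse_cmdline(const char *cmdline)
{
    static const char flag[] = "acpi_skip_timer_override";
    const size_t n = sizeof flag - 1;  /* 24 characters */
    const char *p = cmdline;

    while (*p) {
        size_t len = strcspn(p, " ");

        if (len == n && memcmp(p, flag, n) == 0)
            skip_timer_override = 1;
        p += len;
        p += strspn(p, " ");
    }
}

/* Returns 1 if the override should be applied, 0 if it is ignored. */
static int override_applies(const struct irq_override *o)
{
    if (skip_timer_override && o->bus_irq == 0 && o->global_irq == 2)
        return 0;  /* bogus IRQ0 -> pin2 entry */
    return 1;
}
```

Only the IRQ0-to-pin2 entry is filtered, and only when the flag was given; every other override, such as the one for the ACPI SCI, still applies.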

cheers,
-Len

===== Documentation/kernel-parameters.txt 1.44 vs edited =====
--- 1.44/Documentation/kernel-parameters.txt	Mon Mar 22 16:03:22 2004
+++ edited/Documentation/kernel-parameters.txt	Tue Apr 13 17:47:11 2004
@@ -122,6 +122,10 @@
 
 	acpi_serialize	[HW,ACPI] force serialization of AML methods
 
+	acpi_skip_timer_override [HW,ACPI]
+			Recognize IRQ0/pin2 Interrupt Source Override
+			and ignore it -- for broken nForce2 BIOS.
+
 	ad1816=		[HW,OSS]
 			Format: <io>,<irq>,<dma>,<dma2>
 			See also Documentation/sound/oss/AD1816.
===== arch/i386/kernel/setup.c 1.115 vs edited =====
--- 1.115/arch/i386/kernel/setup.c	Fri Apr  2 07:21:43 2004
+++ edited/arch/i386/kernel/setup.c	Tue Apr 13 17:41:31 2004
@@ -614,6 +614,12 @@
 		else if (!memcmp(from, "acpi_sci=low", 12))
 			acpi_sci_flags.polarity = 3;
 
+		else if (!memcmp(from, "acpi_skip_timer_override", 24)) {
+			extern int acpi_skip_timer_override;
+
+			acpi_skip_timer_override = 1;
+		}
+
 #ifdef CONFIG_X86_LOCAL_APIC
 		/* disable IO-APIC */
 		else if (!memcmp(from, "noapic", 6))
===== arch/i386/kernel/acpi/boot.c 1.57 vs edited =====
--- 1.57/arch/i386/kernel/acpi/boot.c	Tue Mar 30 17:05:19 2004
+++ edited/arch/i386/kernel/acpi/boot.c	Tue Apr 13 17:50:14 2004
@@ -62,6 +62,7 @@
 
 acpi_interrupt_flags acpi_sci_flags __initdata;
 int acpi_sci_override_gsi __initdata;
+int acpi_skip_timer_override __initdata;
 
 #ifdef CONFIG_X86_LOCAL_APIC
 static u64 acpi_lapic_addr __initdata = APIC_DEFAULT_PHYS_BASE;
@@ -327,6 +328,12 @@
 		acpi_sci_ioapic_setup(intsrc->global_irq,
 			intsrc->flags.polarity, intsrc->flags.trigger);
 		return 0;
+	}
+
+	if (acpi_skip_timer_override &&
+		intsrc->bus_irq == 0 && intsrc->global_irq == 2) {
+			printk(PREFIX "BIOS IRQ0 pin2 override ignored.\n");
+			return 0;
 	}
 
 	mp_override_legacy_irq (





* Re: IO-APIC on nforce2
  2004-04-13 21:18     ` really bensoo_at_soo_dot_com
@ 2004-04-14  4:24       ` really bensoo_at_soo_dot_com
  0 siblings, 0 replies; 93+ messages in thread
From: really bensoo_at_soo_dot_com @ 2004-04-14  4:24 UTC (permalink / raw)
  To: Ross Dickson, linux-kernel

i must add that i've been using your patches for
the nForce chipset since they first appeared on
this mailing list, and while they've all helped
this box to last a bit longer between lockups
none of them cured it.  Once the IO-APIC code was
compiled in and the Athlon idle powersaving
turned on it would inevitably lock up in a day
or two.

This incorrect result from the mismatch between
your 2.6.3 patches and the current IO-APIC
code is the first time this box seems to be
free from lockup.

b

On Tue, Apr 13, 2004 at 05:18:24PM -0400, really bensoo_at_soo_dot_com wrote:
> My irq0 says XT-PIC.  i'm not complaining, box's still
> very stable and since the last post i've burned a few
> DVDs on it while running the file share client and
> playing music.
> 
> cat /proc/interrupts
> 
>            CPU0       
>   0:  759809583          XT-PIC  timer
>   1:     382279    IO-APIC-edge  i8042
>   2:          0          XT-PIC  cascade
>   8:          1    IO-APIC-edge  rtc
>   9:          0   IO-APIC-level  acpi
>  12:    6386931    IO-APIC-edge  i8042
>  14:    2117474    IO-APIC-edge  ide0
>  15:    5575006    IO-APIC-edge  ide1
> 201:    6425958   IO-APIC-level  EMU10K1
> 209:  167929203   IO-APIC-level  eth0
> NMI:          0 
> LOC:  759718637 
> ERR:          0
> MIS:          0


* Re: IO-APIC on nforce2 [PATCH]
  2004-04-14  1:02     ` IO-APIC on nforce2 [PATCH] Len Brown
@ 2004-04-14  5:02       ` Ross Dickson
  2004-04-14  6:30         ` Jamie Lokier
                           ` (2 more replies)
  2004-04-15 15:10       ` IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5 Ross Dickson
  2004-04-15 15:21       ` IO-APIC on nforce2 [PATCH] Zwane Mwaikambo
  2 siblings, 3 replies; 93+ messages in thread
From: Ross Dickson @ 2004-04-14  5:02 UTC (permalink / raw)
  To: Len Brown; +Cc: christian.kroener, linux-kernel, Maciej W. Rozycki

On Wednesday 14 April 2004 11:02, Len Brown wrote:
> Re: IRQ0 XT-PIC timer issue
> 
> Since the hardware is connected to APIC pin0, it is a BIOS bug
> that an ACPI interrupt source override from pin2 to IRQ0 exists.
> 
> With this simple 2.6.5 patch you can specify "acpi_skip_timer_override"
> to ignore that bogus BIOS directive.  The result is with your
> ACPI-enabled APIC-enabled kernel, you'll get IRQ0 IO-APIC-edge timer.
> 
> Probably there is a more clever way to trigger this workaround
> automatically instead of via boot parameter.
> 
> cheers,
> -Len

Many thanks Len,
I cannot try it just yet (rebuilding a car engine, greasy mess;
hopefully I will get to it tonight).

I would just like to add that if we cannot get Maciej's 8259 ack patch
back into the distro then we need an if statement in check_timer()
to turn off timer_ack for nforce2, or Christian might get his hi-load back
and nmi_watchdog=1 certainly won't work.

e.g. for 2.4.26-rc2 io_apic.c line 1613 or 2.6.5 line 2180 
	if (pin1 != -1) {
		/*
		 * Ok, does IRQ0 through the IOAPIC work?
		 */
+		if(acpi_skip_timer_override)
+			timer_ack=0;
		unmask_IO_APIC_irq(0);

I might also grab the pci quirk source from the old nforce2 disconnect bit
patch and try it as a means of detection for an automatic trigger, i.e. rather
than writing the pci config bit, set acpi_skip_timer_override. But then
if someone gets clock skew we would need the kern arg to turn it off,
unless the potential for clock skew is fixed.
 
The clock skew is an interesting one. I think the clock uses the tsc, if
available, to interpolate between timer ints; if so, should it not also be
used to validate the timer ints in case of noise? Apparently the clock speeds
up rather than slows down in those cases?

Regards
Ross.

> 
> ===== Documentation/kernel-parameters.txt 1.44 vs edited =====
> --- 1.44/Documentation/kernel-parameters.txt	Mon Mar 22 16:03:22 2004
> +++ edited/Documentation/kernel-parameters.txt	Tue Apr 13 17:47:11 2004
> @@ -122,6 +122,10 @@
>  
>  	acpi_serialize	[HW,ACPI] force serialization of AML methods
>  
> +	acpi_skip_timer_override [HW,ACPI]
> +			Recognize IRQ0/pin2 Interrupt Source Override
> +			and ignore it -- for broken nForce2 BIOS.
> +
>  	ad1816=		[HW,OSS]
>  			Format: <io>,<irq>,<dma>,<dma2>
>  			See also Documentation/sound/oss/AD1816.
> ===== arch/i386/kernel/setup.c 1.115 vs edited =====
> --- 1.115/arch/i386/kernel/setup.c	Fri Apr  2 07:21:43 2004
> +++ edited/arch/i386/kernel/setup.c	Tue Apr 13 17:41:31 2004
> @@ -614,6 +614,12 @@
>  		else if (!memcmp(from, "acpi_sci=low", 12))
>  			acpi_sci_flags.polarity = 3;
>  
> +		else if (!memcmp(from, "acpi_skip_timer_override", 24)) {
> +			extern int acpi_skip_timer_override;
> +
> +			acpi_skip_timer_override = 1;
> +		}
> +
>  #ifdef CONFIG_X86_LOCAL_APIC
>  		/* disable IO-APIC */
>  		else if (!memcmp(from, "noapic", 6))
> ===== arch/i386/kernel/acpi/boot.c 1.57 vs edited =====
> --- 1.57/arch/i386/kernel/acpi/boot.c	Tue Mar 30 17:05:19 2004
> +++ edited/arch/i386/kernel/acpi/boot.c	Tue Apr 13 17:50:14 2004
> @@ -62,6 +62,7 @@
>  
>  acpi_interrupt_flags acpi_sci_flags __initdata;
>  int acpi_sci_override_gsi __initdata;
> +int acpi_skip_timer_override __initdata;
>  
>  #ifdef CONFIG_X86_LOCAL_APIC
>  static u64 acpi_lapic_addr __initdata = APIC_DEFAULT_PHYS_BASE;
> @@ -327,6 +328,12 @@
>  		acpi_sci_ioapic_setup(intsrc->global_irq,
>  			intsrc->flags.polarity, intsrc->flags.trigger);
>  		return 0;
> +	}
> +
> +	if (acpi_skip_timer_override &&
> +		intsrc->bus_irq == 0 && intsrc->global_irq == 2) {
> +			printk(PREFIX "BIOS IRQ0 pin2 override ignored.\n");
> +			return 0;
>  	}
>  
>  	mp_override_legacy_irq (
> 
> 
> 



* Re: IO-APIC on nforce2 [PATCH]
  2004-04-14  5:02       ` Ross Dickson
@ 2004-04-14  6:30         ` Jamie Lokier
  2004-04-14 10:37         ` Maciej W. Rozycki
  2004-04-14 19:57         ` Christian Kröner
  2 siblings, 0 replies; 93+ messages in thread
From: Jamie Lokier @ 2004-04-14  6:30 UTC (permalink / raw)
  To: Ross Dickson
  Cc: Len Brown, christian.kroener, linux-kernel, Maciej W. Rozycki

Ross Dickson wrote:
> The clock skew is an interesting one, I think the clock uses tsc if available
> to interpolate between timer ints and if so should it not also be used to 
> validate the timer ints in case of noise? Apparently the clock speeds up not
> slows down in those cases?

If the clock is speeding up due to spurious extra timer interrupts,
how about reading the timer chip to validate the interrupts?  Doesn't
sound unreasonable to me :)

The problem with using the tsc is that the tsc frequency isn't
constant on some systems.  If it slows down, it would make valid timer
interrupts appear to be spurious.
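
One way the timer-chip check could look, as a minimal sketch under stated assumptions: counter 0 of the 8254 counts down from the reload value and reloads when it fires, the handler can read the current count back, and a 'slack' of a few ticks covers interrupt latency. The function names and the slack parameter are hypothetical; no hardware is touched here, the count is simply passed in.

```c
/* 8254 input clock in Hz. */
#define PIT_TICK_RATE 1193182u

/* Reload value programmed into counter 0 for a given tick frequency. */
static unsigned pit_latch(unsigned hz)
{
    return (PIT_TICK_RATE + hz / 2) / hz;
}

/* 'count' is the value read back from counter 0 in the interrupt handler.
 * Right after a genuine interrupt the counter has just reloaded, so at
 * most 'slack' ticks should have been consumed.  An interrupt that shows
 * up in the middle of a period looks spurious. */
static int timer_irq_plausible(unsigned latch, unsigned count, unsigned slack)
{
    if (count > latch)
        return 0;  /* not a value the counter can hold */
    return (latch - count) <= slack;
}
```

With HZ=1000 the reload value works out to 1193, so a readback of 1180 passes with a slack of 100 while a readback of 600 does not. The weak point is the slack: if interrupts are held off for longer than that, a valid tick would be rejected, which is the same kind of false positive described above for the tsc.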

-- Jamie


* Re: IO-APIC on nforce2 [PATCH]
  2004-04-14  5:02       ` Ross Dickson
  2004-04-14  6:30         ` Jamie Lokier
@ 2004-04-14 10:37         ` Maciej W. Rozycki
  2004-04-15 19:28           ` Len Brown
  2004-04-14 19:57         ` Christian Kröner
  2 siblings, 1 reply; 93+ messages in thread
From: Maciej W. Rozycki @ 2004-04-14 10:37 UTC (permalink / raw)
  To: Ross Dickson; +Cc: Len Brown, christian.kroener, linux-kernel

On Wed, 14 Apr 2004, Ross Dickson wrote:

> e.g. for 2.4.26-rc2 io_apic.c line 1613 or 2.6.5 line 2180 
> 	if (pin1 != -1) {
> 		/*
> 		 * Ok, does IRQ0 through the IOAPIC work?
> 		 */
> +		if(acpi_skip_timer_override)
> +			timer_ack=0;
> 		unmask_IO_APIC_irq(0);
> 
> I might also grab the pci quirk source from the old nforce2 disconnect bit
> patch and try it as a means of detection for automatic trigger. i.e. instead
> of writing the pci config bit, set acpi_skip_timer_override instead - but then
> if someone gets clock skew we would need the kern arg to turn it off - 
> unless the potential for clock skew is fixed.

 Well, the question is whether the timer->INTIN0 routing is hardwired
inside the nforce2 chipset or is it external and thus board-dependent.  
Any way to get this clarified by the chipset's manufacturer?

> The clock skew is an interesting one, I think the clock uses tsc if available
> to interpolate between timer ints and if so should it not also be used to 
> validate the timer ints in case of noise? Apparently the clock speeds up not
> slows down in those cases?

 With real hardware perhaps it can be debugged.  The interaction between
the 8254, the 8259As and the APICs seems interesting in the chipset.  
Perhaps the override to INTIN2 is to tell the timer is really unavailable
directly?  I can't see a way to have an ACPI override that specifies an
ISA interrupt is not connected to the I/O APIC (unlike with the MPS).

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +


* Re: IO-APIC on nforce2 [PATCH]
  2004-04-14  5:02       ` Ross Dickson
  2004-04-14  6:30         ` Jamie Lokier
  2004-04-14 10:37         ` Maciej W. Rozycki
@ 2004-04-14 19:57         ` Christian Kröner
  2004-04-15  0:17           ` Len Brown
  2 siblings, 1 reply; 93+ messages in thread
From: Christian Kröner @ 2004-04-14 19:57 UTC (permalink / raw)
  To: linux-kernel; +Cc: Len Brown, Maciej W. Rozycki, ross

> Just would like to add that if we cannot get Maciej's 8259 ack patch
> back into the distro then we need an if statement in the check_timer()
> to turn off timer_ack for nforce2 or Christian might get his hi-load back
> and certainly nmi_watchdog=1 won't work.
>
> e.g. for 2.4.26-rc2 io_apic.c line 1613 or 2.6.5 line 2180
> 	if (pin1 != -1) {
> 		/*
> 		 * Ok, does IRQ0 through the IOAPIC work?
> 		 */
> +		if(acpi_skip_timer_override)
> +			timer_ack=0;
> 		unmask_IO_APIC_irq(0);
>

Well, it seems that at least on -mm this isn't necessary.

Len, I simply applied your patch against 2.6.5-mm5-1 and it just works, great 
work! Having finally read http://bugme.osdl.org/show_bug.cgi?id=1203 I must 
say that my nforce2 board (MSI K7N2-Delta) never ever hung; whether I had the 
wrong timer setup or APIC on or off made no difference.

What I get now:

cat /proc/interrupts

           CPU0
  0:   25978569    IO-APIC-edge  timer
  1:       2102    IO-APIC-edge  i8042
  2:          0          XT-PIC  cascade
  7:          0    IO-APIC-edge  parport0
  8:          4    IO-APIC-edge  rtc
  9:          0   IO-APIC-level  acpi
 12:     147962    IO-APIC-edge  i8042
 14:     405977    IO-APIC-edge  ide0
 15:         93    IO-APIC-edge  ide1
 16:      60192   IO-APIC-level  ide2, saa7134[0]
 17:        155   IO-APIC-level  CMI8738
 19:    2209002   IO-APIC-level  nvidia
 20:    7538158   IO-APIC-level  ohci_hcd, eth0
 21:          0   IO-APIC-level  ehci_hcd
 22:         78   IO-APIC-level  ohci_hcd
NMI:          0
LOC:   25946237
ERR:          0
MIS:          0


timer setup:

ENABLING IO-APIC IRQs
init IO_APIC IRQs
 IO-APIC (apicid-pin) 2-2, 2-16, 2-17, 2-18, 2-19, 2-20, 2-21, 2-22, 2-23 not connected.
..TIMER: vector=0x31 pin1=0 pin2=-1
Using local APIC timer interrupts.
calibrating APIC timer ...


irq routing:

IO APIC #2......
.... register #00: 02000000
.......    : physical APIC id: 02
.......    : Delivery Type: 0
.......    : LTS          : 0
.... register #01: 00170011
.......     : max redirection entries: 0017
.......     : PRQ implemented: 0
.......     : IO APIC version: 0011
.... register #02: 00000000
.......     : arbitration: 00
.... IRQ redirection table:
 NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
 00 001 01  0    0    0   0   0    1    1    31
 01 001 01  0    0    0   0   0    1    1    39
 02 000 00  1    0    0   0   0    0    0    00
 03 001 01  0    0    0   0   0    1    1    41
 04 001 01  0    0    0   0   0    1    1    49
 05 001 01  0    0    0   0   0    1    1    51
 06 001 01  0    0    0   0   0    1    1    59
 07 001 01  0    0    0   0   0    1    1    61
 08 001 01  0    0    0   0   0    1    1    69
 09 001 01  0    1    0   0   0    1    1    71
 0a 001 01  0    0    0   0   0    1    1    79
 0b 001 01  0    0    0   0   0    1    1    81
 0c 001 01  0    0    0   0   0    1    1    89
 0d 001 01  0    0    0   0   0    1    1    91
 0e 001 01  0    0    0   0   0    1    1    99
 0f 001 01  0    0    0   0   0    1    1    A1
 10 001 01  1    1    0   0   0    1    1    D1
 11 001 01  1    1    0   0   0    1    1    D9
 12 001 01  1    1    0   0   0    1    1    E1
 13 001 01  1    1    0   0   0    1    1    C9
 14 001 01  1    1    0   0   0    1    1    B1
 15 001 01  1    1    0   0   0    1    1    C1
 16 001 01  1    1    0   0   0    1    1    B9
 17 001 01  1    1    0   0   0    1    1    A9
IRQ to pin mappings:
IRQ0 -> 0:0
IRQ1 -> 0:1
IRQ3 -> 0:3
IRQ4 -> 0:4
IRQ5 -> 0:5
IRQ6 -> 0:6
IRQ7 -> 0:7
IRQ8 -> 0:8
IRQ9 -> 0:9
IRQ10 -> 0:10
IRQ11 -> 0:11
IRQ12 -> 0:12
IRQ13 -> 0:13
IRQ14 -> 0:14
IRQ15 -> 0:15
IRQ16 -> 0:16
IRQ17 -> 0:17
IRQ18 -> 0:18
IRQ19 -> 0:19
IRQ20 -> 0:20
IRQ21 -> 0:21
IRQ22 -> 0:22
IRQ23 -> 0:23
.................................... done.


This is simply great, any uncommon hi-load disappeared.
Will something like this get into mainline soon, maybe with automatic chipset 
detection?

Once again, thanks, christian.

P.S.: I will test the patch against mainline 2.6.5 kernel now and post the 
results later.


* Re: IO-APIC on nforce2 [PATCH]
  2004-04-14 19:57         ` Christian Kröner
@ 2004-04-15  0:17           ` Len Brown
  2004-04-15  1:48             ` Ross Dickson
  0 siblings, 1 reply; 93+ messages in thread
From: Len Brown @ 2004-04-15  0:17 UTC (permalink / raw)
  To: Christian Kröner; +Cc: linux-kernel, Maciej W. Rozycki, ross

On Wed, 2004-04-14 at 15:57, Christian Kröner wrote:

> This is simply great, any uncommon hi-load disappeared.
> Will something like this get into mainline soon, maybe with automatic chipset 
> detection?

I'm okay putting the bootparam and the workaround into the kernel,
for it is generic and we may find other platforms need it.

But I don't have a clean way to make it automatic.
This is a BIOS bug, so chipset ID will not always work.

We could list the BIOS in dmi_scan(), but I hate doing
that b/c then the vendor releases a new version of their
broken BIOS and the automatic workaround no longer works...

-Len




* Re: IO-APIC on nforce2 [PATCH]
  2004-04-15  0:17           ` Len Brown
@ 2004-04-15  1:48             ` Ross Dickson
  2004-04-15 17:09               ` Christian Kröner
  0 siblings, 1 reply; 93+ messages in thread
From: Ross Dickson @ 2004-04-15  1:48 UTC (permalink / raw)
  To: Len Brown, Christian Kröner, linux-nforce-bugs
  Cc: linux-kernel, Maciej W. Rozycki

On Thursday 15 April 2004 10:17, Len Brown wrote:
> On Wed, 2004-04-14 at 15:57, Christian Kröner wrote:
> 
> > This is simply great, any uncommon hi-load disappeared.
> > Will something like this get into mainline soon, maybe with automatic chipset 
> > detection?
> 
> I'm okay putting the bootparam and the workaround into the kernel,
> for it is generic and we may find other platforms need it.

Great, it sure is simpler and cleaner than my workaround. Thanks.

> 
> But I don't have a clean way to make it automatic.
> This is a BIOS bug, so chipset ID will not always work.

True it is a bios thing but I have yet to see an nforce2 MOBO that is not 
routed in this way. I am thinking it is internal to the chipset. I have seen
none route it into io-apic pin2.

> Maciej wrote
>  Well, the question is whether the timer->INTIN0 routing is hardwired
> inside the nforce2 chipset or is it external and thus board-dependent.  
> Any way to get this clarified by the chipset's manufacturer?

Nvidia is the first Company in my 20+ years of working life to totally not 
respond to my attempts to communicate and I have had dealings with
numerous semiconductor firms and agents. I doubt that my email source 
would be blocked and I have also tried their form mail. Do real people
work there? Maybe I have to phone or fax them from here in Australia?
-or place an order for 10,000 chips? Maybe we need a worldwide union of
Linux support staff to exhibit collective sales pressure. Enough ranting....

I am also cautioned by Maciej's comments indicating that maybe the 
override appears in the nforce2 bios because there is no other way of 
saying this is a feature that nvidia could not get to work properly?...

On the flip side, in favour of this routing, the clock skew may be restricted
to 2.6.1 kernels only. I do not have it on my patched 2.4 kernels, and it may
be fine on 2.6.5.

Here is a link to the old thread with the skew issues.
http://linux.derkeiler.com/Mailing-Lists/Kernel/2004-01/3129.html
Christian - would you please check if you get clock skew as described in
that thread?

> 
> We could list the BIOS in dmi_scan(), but I hate doing
> that b/c then the vendor releases a new version of their
> broken BIOS and the automatic workaround no longer works...
> 
> -Len
> 

Unfortunately the hard lockups in the BUG report won't be fixed by this io-apic
work. I think Shuttle is the only manufacturer to ship a bios update which has
taken a board with existing lockup problems and fixed it. So far nobody has
posted how this magic was done?
http://linux.derkeiler.com/Mailing-Lists/Kernel/2004-01/5003.html

In the mean time I and others with lockups have had success with my C1 idle 
patch but I have left it manual with kern arg for the same reason - no clean
way to automate it. Some nforce2 need it, others don't. Want me to finish 
cleaning it up for possible inclusion?
http://linux.derkeiler.com/Mailing-Lists/Kernel/2004-04/1707.html

-Ross.






* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-14  1:02     ` IO-APIC on nforce2 [PATCH] Len Brown
  2004-04-14  5:02       ` Ross Dickson
@ 2004-04-15 15:10       ` Ross Dickson
  2004-04-15 20:17         ` Len Brown
  2004-04-15 15:21       ` IO-APIC on nforce2 [PATCH] Zwane Mwaikambo
  2 siblings, 1 reply; 93+ messages in thread
From: Ross Dickson @ 2004-04-15 15:10 UTC (permalink / raw)
  To: Len Brown
  Cc: christian.kroener, linux-kernel, Maciej W. Rozycki, Jamie Lokier,
	Prakash K. Cheemplavam, Craig Bradney, Daniel Drake, Ian Kumlien,
	Jesse Allen, a.verweij

[-- Attachment #1: Type: text/plain, Size: 7087 bytes --]

On Wednesday 14 April 2004 11:02, Len Brown wrote:
> Re: IRQ0 XT-PIC timer issue
> 
> Since the hardware is connected to APIC pin0, it is a BIOS bug
> that an ACPI interrupt source override from pin2 to IRQ0 exists.
> 
> With this simple 2.6.5 patch you can specify "acpi_skip_timer_override"
> to ignore that bogus BIOS directive.  The result is with your
> ACPI-enabled APIC-enabled kernel, you'll get IRQ0 IO-APIC-edge timer.
> 
> Probably there is a more clever way to trigger this workaround
> automatically instead of via boot parameter.
> 
> cheers,
> -Len
> 
> ===== Documentation/kernel-parameters.txt 1.44 vs edited =====
> --- 1.44/Documentation/kernel-parameters.txt	Mon Mar 22 16:03:22 2004
> +++ edited/Documentation/kernel-parameters.txt	Tue Apr 13 17:47:11 2004
> @@ -122,6 +122,10 @@
>  
>  	acpi_serialize	[HW,ACPI] force serialization of AML methods
>  
> +	acpi_skip_timer_override [HW,ACPI]
> +			Recognize IRQ0/pin2 Interrupt Source Override
> +			and ignore it -- for broken nForce2 BIOS.
> +
>  	ad1816=		[HW,OSS]
>  			Format: <io>,<irq>,<dma>,<dma2>
>  			See also Documentation/sound/oss/AD1816.
> ===== arch/i386/kernel/setup.c 1.115 vs edited =====
> --- 1.115/arch/i386/kernel/setup.c	Fri Apr  2 07:21:43 2004
> +++ edited/arch/i386/kernel/setup.c	Tue Apr 13 17:41:31 2004
> @@ -614,6 +614,12 @@
>  		else if (!memcmp(from, "acpi_sci=low", 12))
>  			acpi_sci_flags.polarity = 3;
>  
> +		else if (!memcmp(from, "acpi_skip_timer_override", 24)) {
> +			extern int acpi_skip_timer_override;
> +
> +			acpi_skip_timer_override = 1;
> +		}
> +
>  #ifdef CONFIG_X86_LOCAL_APIC
>  		/* disable IO-APIC */
>  		else if (!memcmp(from, "noapic", 6))
> ===== arch/i386/kernel/acpi/boot.c 1.57 vs edited =====
> --- 1.57/arch/i386/kernel/acpi/boot.c	Tue Mar 30 17:05:19 2004
> +++ edited/arch/i386/kernel/acpi/boot.c	Tue Apr 13 17:50:14 2004
> @@ -62,6 +62,7 @@
>  
>  acpi_interrupt_flags acpi_sci_flags __initdata;
>  int acpi_sci_override_gsi __initdata;
> +int acpi_skip_timer_override __initdata;
>  
>  #ifdef CONFIG_X86_LOCAL_APIC
>  static u64 acpi_lapic_addr __initdata = APIC_DEFAULT_PHYS_BASE;
> @@ -327,6 +328,12 @@
>  		acpi_sci_ioapic_setup(intsrc->global_irq,
>  			intsrc->flags.polarity, intsrc->flags.trigger);
>  		return 0;
> +	}
> +
> +	if (acpi_skip_timer_override &&
> +		intsrc->bus_irq == 0 && intsrc->global_irq == 2) {
> +			printk(PREFIX "BIOS IRQ0 pin2 override ignored.\n");
> +			return 0;
>  	}
>  
>  	mp_override_legacy_irq (
> 
> 
> 

Hi Len, I have updated my nforce2 patches for 2.6.5 to work with your patch.
I have tested them only on one nforce2 board Epox 8Rga+ but as little has
changed in core functionality from past releases I think all will be OK....
Hopefully no clock skew. I saw none on my system but that's no guarantee.
 
I tried your above patch with the timer_ack on as is default in 2.6.5 and
nmi_watchdog=1 failed as expected. I still think Maciej's 8259 ack patch 
is a more complete solution to the ack issue but this one gets watchdog going for
nforce2. I cannot see anyone using your above patch without an integrated
apic and tsc so I cannot see a problem triggering it off your kern arg.

The second patch is the C1halt update I suggested in another posting.
http://linux.derkeiler.com/Mailing-Lists/Kernel/2004-04/1707.html

Both patches in attached tarball.
Regards
Ross.

Here is my revised patch for use with "acpi_skip_timer_override" to get 
nmi_watchdog=1 working with the above patch from Len Brown.

--- linux-2.6.5/arch/i386/kernel/io_apic.c.orig	2004-04-16 00:20:54.000000000 +1000
+++ linux-2.6.5/arch/i386/kernel/io_apic.c	2004-04-15 20:24:18.000000000 +1000
@@ -2179,10 +2179,13 @@ static inline void check_timer(void)
 
 	if (pin1 != -1) {
 		/*
 		 * Ok, does IRQ0 through the IOAPIC work?
 		 */
+		extern int acpi_skip_timer_override;
+		if(acpi_skip_timer_override)
+			timer_ack=0;
 		unmask_IO_APIC_irq(0);
 		if (timer_irq_works()) {
 			if (nmi_watchdog == NMI_IO_APIC) {
 				disable_8259A_irq(0);
 				setup_nmi();

Here is my revised patch for "idle=C1halt" to prevent nforce2 hard lockups.
It is now more robust and better tested: with an apm config, without the x86
apic config, and with nolapic, noapic and acpi=off. All gave my usual 38C CPU
temp when idle and no hard lockups. Temp was measured by leaving the machine
idle on run level 3 for several minutes and then reading the bios temp on reboot.

--- linux-2.6.5/arch/i386/kernel/process.c.orig	2004-04-04 13:36:10.000000000 +1000
+++ linux-2.6.5/arch/i386/kernel/process.c	2004-04-15 20:41:13.000000000 +1000
@@ -47,10 +47,13 @@
 #include <asm/irq.h>
 #include <asm/desc.h>
 #ifdef CONFIG_MATH_EMULATION
 #include <asm/math_emu.h>
 #endif
+#if defined(CONFIG_X86_UP_APIC)
+#include <asm/apic.h>
+#endif
 
 #include <linux/irq.h>
 #include <linux/err.h>
 
 asmlinkage void ret_from_fork(void) __asm__("ret_from_fork");
@@ -98,10 +101,34 @@ void default_idle(void)
 			local_irq_enable();
 	}
 }
 
 /*
+ * We use this to avoid nforce2 lockups
+ * Reduces frequency of C1 disconnects
+ */
+static void c1halt_idle(void)
+{
+	if (!hlt_counter && current_cpu_data.hlt_works_ok) {
+		local_irq_disable();
+#if defined(CONFIG_X86_UP_APIC)
+		/* only hlt disconnect if more than 1.6% of apic interval remains */
+		extern int enable_local_apic;
+		if(!need_resched() && (enable_local_apic < 0 ||
+			(apic_read(APIC_TMCCT) > (apic_read(APIC_TMICT)>>6)))) {
+#else
+		/* just adds a little delay to assist in back to back disconnects */
+		if(!need_resched()) {
+#endif
+			ndelay(600); /* helps nforce2 but adds 0.6us hard int latency */
+			safe_halt(); /* nothing better to do until we wake up */
+		} else {
+			local_irq_enable();
+		}
+	}
+}
+/*
  * On SMP it's slightly faster (but much more power-consuming!)
  * to poll the ->work.need_resched flag instead of waiting for the
  * cross-CPU IPI to arrive. Use this option with caution.
  */
 static void poll_idle (void)
@@ -135,20 +162,18 @@ static void poll_idle (void)
  * The idle thread. There's no useful work to be
  * done, so just try to conserve power and have a
  * low exit latency (ie sit in a loop waiting for
  * somebody to say that they'd like to reschedule)
  */
+static void (*idle)(void);
 void cpu_idle (void)
 {
 	/* endless idle loop with no priority at all */
 	while (1) {
 		while (!need_resched()) {
-			void (*idle)(void) = pm_idle;
-
 			if (!idle)
-				idle = default_idle;
-
+				idle = pm_idle ? pm_idle : default_idle;
 			irq_stat[smp_processor_id()].idle_timestamp = jiffies;
 			idle();
 		}
 		schedule();
 	}
@@ -199,16 +224,18 @@ void __init select_idle_routine(const st
 
 static int __init idle_setup (char *str)
 {
 	if (!strncmp(str, "poll", 4)) {
 		printk("using polling idle threads.\n");
-		pm_idle = poll_idle;
+		idle = poll_idle;
 	} else if (!strncmp(str, "halt", 4)) {
 		printk("using halt in idle threads.\n");
-		pm_idle = default_idle;
+		idle = default_idle;
+	} else if (!strncmp(str, "C1halt", 6)) {
+		printk("using C1 halt disconnect friendly idle threads.\n");
+		idle = c1halt_idle;
 	}
-
 	return 1;
 }
 
 __setup("idle=", idle_setup);
 

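As an aside, the halt condition in c1halt_idle() above can be tried out in
user space. The sketch below is only an illustration under assumptions:
should_halt() and enough_interval_left() are made-up names, the two counter
arguments stand in for the values the patch reads from APIC_TMICT and
APIC_TMCCT, and lapic_disabled stands in for the enable_local_apic < 0 test.
It shows that the shift by 6 means 1/64 of the timer interval, which is where
the 1.6% figure in the comment comes from.

```c
#include <assert.h>
#include <stdint.h>

/*
 * User-space sketch of the halt decision in c1halt_idle().
 * "initial" and "current" are illustrative stand-ins for the
 * APIC timer initial count and current (down-counting) count.
 */
static int enough_interval_left(uint32_t initial, uint32_t current)
{
	/* the current count runs down from the initial count */
	assert(current <= initial);

	/* initial >> 6 is initial / 64, i.e. about 1.6% of the interval */
	return current > (initial >> 6);
}

/* Return 1 if the idle loop would halt, 0 if it would just re-enable irqs. */
static int should_halt(int need_resched, int lapic_disabled,
		       uint32_t initial, uint32_t current)
{
	if (need_resched)
		return 0;

	/* with the local APIC disabled there is no counter to consult */
	return lapic_disabled || enough_interval_left(initial, current);
}
```

With an initial count of 6400 the threshold is 100, so a current count of 101
still halts while 100 does not.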


[-- Attachment #2: nforce2-lockup-patches-rd-2.6.5.tgz --]
[-- Type: application/x-tgz, Size: 1671 bytes --]


* Re: IO-APIC on nforce2 [PATCH]
  2004-04-14  1:02     ` IO-APIC on nforce2 [PATCH] Len Brown
  2004-04-14  5:02       ` Ross Dickson
  2004-04-15 15:10       ` IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5 Ross Dickson
@ 2004-04-15 15:21       ` Zwane Mwaikambo
  2 siblings, 0 replies; 93+ messages in thread
From: Zwane Mwaikambo @ 2004-04-15 15:21 UTC (permalink / raw)
  To: Len Brown
  Cc: ross, christian.kroener, Linux Kernel, Maciej W. Rozycki,
	Protasevich, Natalie

On Tue, 13 Apr 2004, Len Brown wrote:

> Re: IRQ0 XT-PIC timer issue
>
> Since the hardware is connected to APIC pin0, it is a BIOS bug
> that an ACPI interrupt source override from pin2 to IRQ0 exists.
>
> With this simple 2.6.5 patch you can specify "acpi_skip_timer_override"
> to ignore that bogus BIOS directive.  The result is with your
> ACPI-enabled APIC-enabled kernel, you'll get IRQ0 IO-APIC-edge timer.
>
> Probably there is a more clever way to trigger this workaround
> automatically instead of via boot parameter.

Nice, this is the problem which broke Andrew's system and the systems on which
I tested my adaptation of Natalie's mp_override_legacy_irq() change. Whacking
out previous mp_irq entries would have worked if the BIOS had not forced the
pin2 override.


* Re: IO-APIC on nforce2 [PATCH]
  2004-04-15  1:48             ` Ross Dickson
@ 2004-04-15 17:09               ` Christian Kröner
  0 siblings, 0 replies; 93+ messages in thread
From: Christian Kröner @ 2004-04-15 17:09 UTC (permalink / raw)
  To: linux-kernel; +Cc: Len Brown, ross, Maciej W. Rozycki

On Thursday 15 April 2004 10:17, Len Brown wrote:
>  I'm okay putting the bootparam and the workaround into the kernel,
>  for it is generic and we may find other platforms need it.

That's more than I could have expected, thanks.


Ross, I tested my kernel with nmi_watchdog=1 and nmi_watchdog=2; only 2
works.

output: nmi_watchdog=1

activating NMI Watchdog ... done.
testing NMI watchdog ... CPU#0: NMI appears to be stuck!


output: nmi_watchdog=2

testing NMI watchdog ... OK.


This is on 2.6.5-mm5-1.

Concerning the timer, well I tested it against my radio-controlled clock, 
setting it with ntpdate first and letting the system run (with ntpd off) and 
my system is kinda faster than my radio-clock. After about one hour my system 
was off by +14s compared to the radio-clock. I don't know if that is pretty 
shitty or simply normal for these bad pc-clocks...
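
To put the number in perspective, the offset can be converted to parts per
million. The helper below is a hypothetical sketch (the name drift_ppm is not
from any tool mentioned here); for +14 s over roughly 3600 s it works out to
about 3900 ppm, or about 0.39%.

```c
#include <assert.h>

/*
 * Clock drift in parts per million: seconds gained (or lost, if
 * negative) divided by the elapsed reference time, scaled by 1e6.
 */
static double drift_ppm(double offset_s, double elapsed_s)
{
	assert(elapsed_s > 0.0);
	return offset_s / elapsed_s * 1e6;
}
```

Calling drift_ppm(14.0, 3600.0) gives the figure quoted above; the one hour is
itself only approximate, so the result is a rough estimate.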

I'm now compiling 2.6.6-rc1 with the nmi_watchdog=1 workaround. One question 
about the C1halt patch: does it add a new feature, or is it just a 
workaround for nforce2 systems that lock up (since I never experienced lockups 
on my system, I wouldn't need it then)?

thanks for now, christian.


* Re: IO-APIC on nforce2 [PATCH]
  2004-04-14 10:37         ` Maciej W. Rozycki
@ 2004-04-15 19:28           ` Len Brown
  0 siblings, 0 replies; 93+ messages in thread
From: Len Brown @ 2004-04-15 19:28 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Ross Dickson, christian.kroener, linux-kernel

On Wed, 2004-04-14 at 06:37, Maciej W. Rozycki wrote:
> On Wed, 14 Apr 2004, Ross Dickson wrote:

> > The clock skew is an interesting one, I think the clock uses tsc if available
> > to interpolate between timer ints and if so should it not also be used to 
> > validate the timer ints in case of noise? Apparently the clock speeds up not
> > slows down in those cases?
> 
>  With real hardware perhaps it can be debugged.  The interaction between
> the 8254, the 8259As and the APICs seems interesting in the chipset.

> Perhaps the override to INTIN2 is to tell the timer is really unavailable
> directly?

That would be way too subtle for a BIOS writer;-)

> I can't see a way to have an ACPI override that specifies an
> ISA interrupt is not connected to the I/O APIC (unlike with the MPS).

I agree.  And I think the existence of this /proc/interrupts
entry on an ACPI-enabled system should probably go away.

           CPU0       CPU1
  2:          0          0          XT-PIC  cascade

ACPI also doesn't support sharing more than 1 pin on an IRQ.
So if you see a construct like this below, it is also a bug:

IRQ to pin mappings:
IRQ23 -> 0:23-> 0:7

cheers,
-Len




* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-15 15:10       ` IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5 Ross Dickson
@ 2004-04-15 20:17         ` Len Brown
  2004-04-15 21:04           ` Craig Bradney
  2004-04-15 21:56           ` Arjen Verweij
  0 siblings, 2 replies; 93+ messages in thread
From: Len Brown @ 2004-04-15 20:17 UTC (permalink / raw)
  To: ross
  Cc: christian.kroener, linux-kernel, Maciej W. Rozycki, Jamie Lokier,
	Prakash K. Cheemplavam, Craig Bradney, Daniel Drake, Ian Kumlien,
	Jesse Allen, a.verweij, Allen Martin

On Thu, 2004-04-15 at 11:10, Ross Dickson wrote:
> On Wednesday 14 April 2004 11:02, Len Brown wrote:
> > Re: IRQ0 XT-PIC timer issue
> > 
> > Since the hardware is connected to APIC pin0, it is a BIOS bug
> > that an ACPI interrupt source override from pin2 to IRQ0 exists.
> > 
> > With this simple 2.6.5 patch you can specify "acpi_skip_timer_override"
> > to ignore that bogus BIOS directive.  The result is with your
> > ACPI-enabled APIC-enabled kernel, you'll get IRQ0 IO-APIC-edge timer.
> > 
> > Probably there is a more clever way to trigger this workaround
> > automatically instead of via boot parameter.

> Hi Len, I have updated my nforce2 patches for 2.6.5 to work with your patch.
> I have tested them only on one nforce2 board Epox 8Rga+ but as little has
> changed in core functionality from past releases I think all will be OK....
> Hopefully no clock skew. I saw none on my system but that's no guarantee.

While I don't want to get into the business of maintaining
a dmi_scan entry for every system with this issue, I think
it might be a good idea to add a couple of example entries
for high volume systems for which there is no BIOS fix available.

Got any opinions on which system to use as the example?
I'll need the output from dmidecode for them.
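
To illustrate what such an entry would need to do, here is a rough user-space
sketch of matching DMI board strings against a small quirk table. This is not
the kernel's dmi_scan code: the struct, the function name and the
vendor/product strings are all placeholders made up for the example, and real
entries would use the strings reported by dmidecode.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical quirk table entry; not the kernel's dmi_scan structures. */
struct board_quirk {
	const char *vendor;	/* substring of the DMI board vendor */
	const char *product;	/* substring of the DMI board product name */
};

static const struct board_quirk timer_override_quirks[] = {
	{ "Example Vendor", "EXAMPLE-BOARD" },	/* placeholder strings */
};

/* Return 1 if the reported strings match an entry in the table. */
static int skip_timer_override_for(const char *vendor, const char *product)
{
	size_t n = sizeof(timer_override_quirks) /
		   sizeof(timer_override_quirks[0]);
	size_t i;

	assert(vendor != NULL && product != NULL);

	for (i = 0; i < n; i++) {
		const struct board_quirk *q = &timer_override_quirks[i];

		/* both substrings must be present for a match */
		if (strstr(vendor, q->vendor) && strstr(product, q->product))
			return 1;
	}
	return 0;
}
```

Substring matching is used here so that a revision suffix in the product
string does not defeat the match; whether that is loose enough or too loose
for real boards is an open question.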
 
> I tried your above patch with the timer_ack on as is default in 2.6.5 and
> nmi_watchdog=1 failed as expected. I still think Maciej's 8259 ack patch 
> is a more complete solution to the ack issue but this one gets watchdog going for
> nforce2. I cannot see anyone using your above patch without an integrated
> apic and tsc so I cannot see a problem triggering it off your kern arg.

"acpi_skip_timer_override" is specific to IOAPIC mode,
since that is the only place that the bogus interrupt
source override is used.
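
The test the workaround applies is small enough to sketch outside the kernel.
The struct and function names below are invented for illustration; only the
condition (bus IRQ 0 routed to pin 2, with the boot flag set) is taken from
the acpi_skip_timer_override patch earlier in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for the two override fields the patch tests. */
struct irq_override {
	unsigned int bus_irq;		/* ISA IRQ named by the BIOS */
	unsigned int global_irq;	/* IO-APIC pin it is routed to */
};

/* Return 1 when the override should be dropped, 0 when it is applied. */
static int ignore_override(int skip_timer_override,
			   const struct irq_override *ov)
{
	assert(ov != NULL);

	/* only the IRQ0 -> pin2 timer override is affected by the flag */
	return skip_timer_override &&
	       ov->bus_irq == 0 && ov->global_irq == 2;
}
```

Any other override, and the timer override itself when the boot parameter is
absent, passes through unchanged.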

I'm not clued-in on the nmi_watchdog and 8259 ack issues.
My focus is primarily the ACPI issues involved in getting
these systems up and running in IOAPIC mode.

> The second patch is the C1halt update I suggested in another posting.
> http://linux.derkeiler.com/Mailing-Lists/Kernel/2004-04/1707.html

Clearly this hang issue is more important than the timer issue.
I'm impressed that you built such a sophisticated patch without
any support from the vendors.  But it would be a "really good thing"
if we got some input from the vendors before considering putting
a workaround into the upstream kernel -- for they may have
guidance which would either simplify it, or make it unnecessary.
Perhaps Allen Martin at nVidia can comment?

-Len





* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-15 20:17         ` Len Brown
@ 2004-04-15 21:04           ` Craig Bradney
  2004-04-21 20:22             ` Len Brown
  2004-04-15 21:56           ` Arjen Verweij
  1 sibling, 1 reply; 93+ messages in thread
From: Craig Bradney @ 2004-04-15 21:04 UTC (permalink / raw)
  To: Len Brown
  Cc: ross, christian.kroener, linux-kernel, Maciej W. Rozycki,
	Jamie Lokier, Prakash K. Cheemplavam, Daniel Drake, Ian Kumlien,
	Jesse Allen, a.verweij, Allen Martin

[-- Attachment #1: Type: text/plain, Size: 3121 bytes --]

On Thu, 2004-04-15 at 22:17, Len Brown wrote:
> On Thu, 2004-04-15 at 11:10, Ross Dickson wrote:
> > On Wednesday 14 April 2004 11:02, Len Brown wrote:
> > > Re: IRQ0 XT-PIC timer issue
> > > 
> > > Since the hardware is connected to APIC pin0, it is a BIOS bug
> > > that an ACPI interrupt source override from pin2 to IRQ0 exists.
> > > 
> > > With this simple 2.6.5 patch you can specify "acpi_skip_timer_override"
> > > to ignore that bogus BIOS directive.  The result is with your
> > > ACPI-enabled APIC-enabled kernel, you'll get IRQ0 IO-APIC-edge timer.
> > > 
> > > Probably there is a more clever way to trigger this workaround
> > > automatically instead of via boot parameter.
> 
> > Hi Len, I have updated my nforce2 patches for 2.6.5 to work with your patch.
> > I have tested them only on one nforce2 board Epox 8Rga+ but as little has
> > changed in core functionality from past releases I think all will be OK....
> > Hopefully no clock skew. I saw none on my system but that's no guarantee.
> 
> While I don't want to get into the business of maintaining
> a dmi_scan entry for every system with this issue, I think
> it might be a good idea to add a couple of example entries
> for high volume systems for which there is no BIOS fix available.
> 
> Got any opinions on which system to use as the example?
> I'll need the output from dmidecode for them.

I have an A7N8X Deluxe v2 BIOS v1007 that I can give u whatever numbers
u need. IOAPIC and APIC are on.

It's running gentoo-dev-sources 2.6.3-r1 plus the idle=C1halt patch and
the nmi patch from Ross. I guess the kernel doesn't matter too much if it's
just board details.

More details of my 2.6.1 days are at 
http://atlas.et.tudelft.nl/verwei90/nforce2/

Craig


> > I tried your above patch with the timer_ack on as is default in 2.6.5 and
> > nmi_watchdog=1 failed as expected. I still think Maciej's 8259 ack patch 
> > is a more complete solution to the ack issue but this one gets watchdog going for
> > nforce2. I cannot see anyone using your above patch without an integrated
> > apic and tsc so I cannot see a problem triggering it off your kern arg.
> 
> "acpi_skip_timer_override" is specific to IOAPIC mode,
> since that is the only place that the bogus interrupt
> source override is used.
> 
> I'm not clued-in on the nmi_watchdog and 8259 ack issues.
> My focus is primarily the ACPI issues involved in getting
> these systems up and running in IOAPIC mode.
> 
> > The second patch is the C1halt update I suggested in another posting.
> > http://linux.derkeiler.com/Mailing-Lists/Kernel/2004-04/1707.html
> 
> Clearly this hang issue is more important than the timer issue.
> I'm impressed that you built such a sophisticated patch without
> any support from the vendors.  But it would be a "really good thing"
> if we got some input from the vendors before considering putting
> a workaround into the upstream kernel -- for they may have
> guidance which would either simplify it, or make it unnecessary.
> Perhaps Allen Martin at nVidia can comment?
> 
> -Len
> 
> 
> 

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]


* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-15 20:17         ` Len Brown
  2004-04-15 21:04           ` Craig Bradney
@ 2004-04-15 21:56           ` Arjen Verweij
  1 sibling, 0 replies; 93+ messages in thread
From: Arjen Verweij @ 2004-04-15 21:56 UTC (permalink / raw)
  To: Len Brown
  Cc: ross, christian.kroener, linux-kernel, Maciej W. Rozycki,
	Jamie Lokier, Prakash K. Cheemplavam, Craig Bradney,
	Daniel Drake, Ian Kumlien, Jesse Allen, Allen Martin

On 15 Apr 2004, Len Brown wrote:

> On Thu, 2004-04-15 at 11:10, Ross Dickson wrote:
> > On Wednesday 14 April 2004 11:02, Len Brown wrote:
> > > Re: IRQ0 XT-PIC timer issue
> > >
> > > Since the hardware is connected to APIC pin0, it is a BIOS bug
> > > that an ACPI interrupt source override from pin2 to IRQ0 exists.
> > >
> > > With this simple 2.6.5 patch you can specify "acpi_skip_timer_override"
> > > to ignore that bogus BIOS directive.  The result is with your
> > > ACPI-enabled APIC-enabled kernel, you'll get IRQ0 IO-APIC-edge timer.
> > >
> > > Probably there is a more clever way to trigger this workaround
> > > automatically instead of via boot parameter.
>
> > Hi Len, I have updated my nforce2 patches for 2.6.5 to work with your patch.
> > I have tested them only on one nforce2 board Epox 8Rga+ but as little has
> > changed in core functionality from past releases I think all will be OK....
> > Hopefully no clock skew. I saw none on my system but that's no guarantee.
>
> While I don't want to get into the business of maintaining
> a dmi_scan entry for every system with this issue, I think
> it might be a good idea to add a couple of example entries
> for high volume systems for which there is no BIOS fix available.
>
> Got any opinions on which system to use as the example?
> I'll need the output from dmidecode for them.
>
> > I tried your above patch with the timer_ack on as is default in 2.6.5 and
> > nmi_watchdog=1 failed as expected. I still think Maciej's 8259 ack patch
> > is a more complete solution to the ack issue but this one gets watchdog going for
> > nforce2. I cannot see anyone using your above patch without an integrated
> > apic and tsc so I cannot see a problem triggering it off your kern arg.
>
> "acpi_skip_timer_override" is specific to IOAPIC mode,
> since that is the only place that the bogus interrupt
> source override is used.
>
> I'm not clued-in on the nmi_watchdog and 8259 ack issues.
> My focus is primarily the ACPI issues involved in getting
> these systems up and running in IOAPIC mode.
>
> > The second patch is the C1halt update I suggested in another posting.
> > http://linux.derkeiler.com/Mailing-Lists/Kernel/2004-04/1707.html
>
> Clearly this hang issue is more important than the timer issue.
> I'm impressed that you built such a sophisticated patch without
> any support from the vendors.  But it would be a "really good thing"
> if we got some input from the vendors before considering putting
> a workaround into the upstream kernel -- for they may have
> guidance which would either simplify it, or make it unnecessary.
> Perhaps Allen Martin at nVidia can comment?

Yes, this sounds like a marvellous idea, since every board, except some
Shuttle board after a BIOS update, still suffers from these hangs.
Unfortunately, Allen Martin already commented on this once:

"Likely the root of the problem has to do with the way the Linux kernel is
using the ACPI methods to setup the interrupts which is different from win
9x/2k/XP. I can help track this down, unfortunately so far I've been
unable to reproduce the hangs on any of the boards I have."
-Allen

http://lkml.org/lkml/2003/12/5/156

Maybe he can find useful hints on how to crash his box with an nforce2
chipset here:

http://atlas.et.tudelft.nl/verwei90/nforce2/

Basically just enable APIC in the kernel and start pushing the HDD or
anything related to I/O really. The crashes come more regularly in 2.6
kernels because of the increased HZ value.

Regards,

Arjen



* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-15 21:04           ` Craig Bradney
@ 2004-04-21 20:22             ` Len Brown
  2004-04-21 20:33               ` Ian Kumlien
                                 ` (2 more replies)
  0 siblings, 3 replies; 93+ messages in thread
From: Len Brown @ 2004-04-21 20:22 UTC (permalink / raw)
  To: Craig Bradney
  Cc: ross, christian.kroener, linux-kernel, Maciej W. Rozycki,
	Jamie Lokier, Prakash K. Cheemplavam, Daniel Drake, Ian Kumlien,
	Jesse Allen, a.verweij, Allen Martin

On Thu, 2004-04-15 at 17:04, Craig Bradney wrote:

> > While I don't want to get into the business of maintaining
> > a dmi_scan entry for every system with this issue, I think
> > it might be a good idea to add a couple of example entries
> > for high volume systems for which there is no BIOS fix available.
> > 
> > Got any opinions on which system to use as the example?
> > I'll need the output from dmidecode for them.
> 
> I have an A7N8X Deluxe v2 BIOS v1007 that I can give u whatever numbers
> u need. IOAPIC and APIC are on.

Please send me the output from dmidecode, available in /usr/sbin/, or
here:
http://www.nongnu.org/dmidecode/
or
http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/

thanks,
-Len




* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-21 20:22             ` Len Brown
@ 2004-04-21 20:33               ` Ian Kumlien
  2004-04-21 20:45               ` Craig Bradney
  2004-04-21 21:28               ` Prakash K. Cheemplavam
  2 siblings, 0 replies; 93+ messages in thread
From: Ian Kumlien @ 2004-04-21 20:33 UTC (permalink / raw)
  To: Len Brown
  Cc: Craig Bradney, ross, christian.kroener, linux-kernel,
	Maciej W. Rozycki, Jamie Lokier, Prakash K. Cheemplavam,
	Daniel Drake, Jesse Allen, a.verweij, Allen Martin

[-- Attachment #1: Type: text/plain, Size: 667 bytes --]

On Wed, 2004-04-21 at 22:22, Len Brown wrote:
> On Thu, 2004-04-15 at 17:04, Craig Bradney wrote:
> > > Got any opinions on which system to use as the example?
> > > I'll need the output from dmidecode for them.
> > 
> > I have an A7N8X Deluxe v2 BIOS v1007 that I can give u whatever numbers
> > u need. IOAPIC and APIC are on.
> 
> Please send me the output from dmidecode, available in /usr/sbin/, or

I sent a dmidecode off-list from my ASUS A7N8X-X 2.xx (BIOS: 1007)
motherboard.

I have also TRIED to send some complaints to ASUS, but that's harder than
you might expect... =P

-- 
Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]


* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-21 20:22             ` Len Brown
  2004-04-21 20:33               ` Ian Kumlien
@ 2004-04-21 20:45               ` Craig Bradney
  2004-04-21 21:28               ` Prakash K. Cheemplavam
  2 siblings, 0 replies; 93+ messages in thread
From: Craig Bradney @ 2004-04-21 20:45 UTC (permalink / raw)
  To: Len Brown
  Cc: ross, christian.kroener, linux-kernel, Maciej W. Rozycki,
	Jamie Lokier, Prakash K. Cheemplavam, Daniel Drake, Ian Kumlien,
	Jesse Allen, a.verweij, Allen Martin


[-- Attachment #1.1: Type: text/plain, Size: 843 bytes --]

On Wed, 2004-04-21 at 22:22, Len Brown wrote:
> On Thu, 2004-04-15 at 17:04, Craig Bradney wrote:
> 
> > > While I don't want to get into the business of maintaining
> > > a dmi_scan entry for every system with this issue, I think
> > > it might be a good idea to add a couple of example entries
> > > for high volume systems for which there is no BIOS fix available.
> > > 
> > > Got any opinions on which system to use as the example?
> > > I'll need the output from dmidecode for them.
> > 
> > I have an A7N8X Deluxe v2 BIOS v1007 that I can give u whatever numbers
> > u need. IOAPIC and APIC are on.
> 
> Please send me the output from dmidecode, available in /usr/sbin/, or
> here:
> http://www.nongnu.org/dmidecode/
> or
> http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/

Enjoy :) & thanks

Craig

[-- Attachment #1.2: AsusA7N8Xv2BIOS1007_CraigBradney.txt --]
[-- Type: text/plain, Size: 14625 bytes --]

# dmidecode 2.4
SMBIOS 2.2 present.
48 structures occupying 1418 bytes.
Table at 0x000F0000.
Handle 0x0000
	DMI type 0, 19 bytes.
	BIOS Information
		Vendor: Phoenix Technologies, LTD
		Version: ASUS A7N8X2.0 Deluxe ACPI BIOS Rev 1007
		Release Date: 10/06/2003
		Address: 0xF0000
		Runtime Size: 64 kB
		ROM Size: 512 kB
		Characteristics:
			PCI is supported
			PNP is supported
			APM is supported
			BIOS is upgradeable
			BIOS shadowing is allowed
			Boot from CD is supported
			Selectable boot is supported
			BIOS ROM is socketed
			EDD is supported
			5.25"/360 KB floppy services are supported (int 13h)
			5.25"/1.2 MB floppy services are supported (int 13h)
			3.5"/720 KB floppy services are supported (int 13h)
			3.5"/2.88 MB floppy services are supported (int 13h)
			Print screen service is supported (int 5h)
			8042 keyboard services are supported (int 9h)
			Serial services are supported (int 14h)
			Printer services are supported (int 17h)
			CGA/mono video services are supported (int 10h)
			ACPI is supported
			USB legacy is supported
			AGP is supported
			LS-120 boot is supported
			ATAPI Zip drive boot is supported
Handle 0x0001
	DMI type 1, 25 bytes.
	System Information
		Manufacturer: ASUSTeK Computer INC.
		Product Name: A7N8X2.0
		Version: REV 2.xx
		Serial Number: xxxxxxxxxxx
		UUID: Not Present
		Wake-up Type: Power Switch
Handle 0x0002
	DMI type 2, 8 bytes.
	Base Board Information
		Manufacturer: ASUSTeK Computer INC.
		Product Name: A7N8X2.0
		Version: REV 2.xx
		Serial Number: xxxxxxxxxxx
Handle 0x0003
	DMI type 3, 13 bytes.
	Chassis Information
		Manufacturer: Chassis Manufactture
		Type: Desktop
		Lock: Not Present
		Version: Chassis Version
		Serial Number: Chassis serial Number
		Asset Tag: Asset-1234567890
		Boot-up State: Safe
		Power Supply State: Safe
		Thermal State: Safe
		Security Status: None
Handle 0x0004
	DMI type 4, 32 bytes.
	Processor Information
		Socket Designation: Socket A
		Type: Central Processor
		Family: Duron
		Manufacturer: AMD
		ID: A0 06 00 00 FF FB 83 03
		Signature: Family 6, Model A, Stepping 0
		Flags:
			FPU (Floating-point unit on-chip)
			VME (Virtual mode extension)
			DE (Debugging extension)
			PSE (Page size extension)
			TSC (Time stamp counter)
			MSR (Model specific registers)
			PAE (Physical address extension)
			MCE (Machine check exception)
			CX8 (CMPXCHG8 instruction supported)
			APIC (On-chip APIC hardware supported)
			SEP (Fast system call)
			MTRR (Memory type range registers)
			PGE (Page global enable)
			MCA (Machine check architecture)
			CMOV (Conditional move instruction supported)
			PAT (Page attribute table)
			PSE-36 (36-bit page size extension)
			MMX (MMX technology supported)
			FXSR (Fast floating-point save and restore)
			SSE (Streaming SIMD extensions)
		Version: AMD Athlon(tm) XP
		Voltage: 1.6 V
		External Clock: 166 MHz
		Max Speed: 3000 MHz
		Current Speed: 1916 MHz
		Status: Populated, Enabled
		Upgrade: ZIF Socket
		L1 Cache Handle: 0x0009
		L2 Cache Handle: 0x000A
		L3 Cache Handle: No L3 Cache
Handle 0x0005
	DMI type 5, 22 bytes.
	Memory Controller Information
		Error Detecting Method: None
		Error Correcting Capabilities:
			Other
		Supported Interleave: Unknown
		Current Interleave: Unknown
		Maximum Memory Module Size: 1024 MB
		Maximum Total Memory Size: 3072 MB
		Supported Speeds:
			70 ns
			60 ns
			50 ns
		Supported Memory Types:
			DIMM
			SDRAM
		Memory Module Voltage: 3.3 V
		Associated Memory Slots: 3
			0x0006
			0x0007
			0x0008
		Enabled Error Correcting Capabilities:
			None
Handle 0x0006
	DMI type 6, 12 bytes.
	Memory Module Information
		Socket Designation: DDR1
		Bank Connections: 0
		Current Speed: Unknown
		Type: DIMM
		Installed Size: 256 MB (Single-bank Connection)
		Enabled Size: 256 MB (Single-bank Connection)
		Error Status: OK
Handle 0x0007
	DMI type 6, 12 bytes.
	Memory Module Information
		Socket Designation: DDR2
		Bank Connections: 2
		Current Speed: Unknown
		Type: DIMM
		Installed Size: 256 MB (Single-bank Connection)
		Enabled Size: 256 MB (Single-bank Connection)
		Error Status: OK
Handle 0x0008
	DMI type 6, 12 bytes.
	Memory Module Information
		Socket Designation: DDR3
		Bank Connections: 4 5
		Current Speed: Unknown
		Type: DIMM
		Installed Size: 512 MB (Double-bank Connection)
		Enabled Size: 512 MB (Double-bank Connection)
		Error Status: OK
Handle 0x0009
	DMI type 7, 19 bytes.
	Cache Information
		Socket Designation: L1 Cache
		Configuration: Enabled, Not Socketed, Level 1
		Operational Mode: Write Back
		Location: Internal
		Installed Size: 128 KB
		Maximum Size: 128 KB
		Supported SRAM Types:
			Pipeline Burst
			Synchronous
		Installed SRAM Type: Pipeline Burst Synchronous
		Speed: Unknown
		Error Correction Type: Unknown
		System Type: Data
		Associativity: 4-way Set-associative
Handle 0x000A
	DMI type 7, 19 bytes.
	Cache Information
		Socket Designation: L2 Cache
		Configuration: Enabled, Not Socketed, Level 2
		Operational Mode: Write Back
		Location: External
		Installed Size: 512 KB
		Maximum Size: 512 KB
		Supported SRAM Types:
			Pipeline Burst
			Synchronous
		Installed SRAM Type: Pipeline Burst Synchronous
		Speed: Unknown
		Error Correction Type: Unknown
		System Type: Data
		Associativity: 4-way Set-associative
Handle 0x000B
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: PRIMARY IDE/HDD
		Internal Connector Type: On Board IDE
		External Reference Designator: Not Specified
		External Connector Type: None
		Port Type: None
Handle 0x000C
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: SECONDARY IDE/HDD
		Internal Connector Type: On Board IDE
		External Reference Designator: Not Specified
		External Connector Type: None
		Port Type: None
Handle 0x000D
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: FLOPPY
		Internal Connector Type: On Board Floppy
		External Reference Designator: Not Specified
		External Connector Type: None
		Port Type: None
Handle 0x000E
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: Serial Port 1
		Internal Connector Type: None
		External Reference Designator: Serial Port 1
		External Connector Type: DB-9 male
		Port Type: Serial Port 16550 Compatible
Handle 0x000F
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: Serial Port 2
		Internal Connector Type: None
		External Reference Designator: Serial Port 2
		External Connector Type: DB-9 male
		Port Type: Serial Port 16550 Compatible
Handle 0x0010
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: Parallel Port
		Internal Connector Type: None
		External Reference Designator: Parallel Port
		External Connector Type: DB-25 female
		Port Type: Parallel Port ECP/EPP
Handle 0x0011
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: PS/2 Keyboard
		Internal Connector Type: None
		External Reference Designator: PS/2 Keyboard
		External Connector Type: PS/2
		Port Type: Keyboard Port
Handle 0x0012
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: PS/2 Mouse
		Internal Connector Type: None
		External Reference Designator: PS/2 Mouse
		External Connector Type: PS/2
		Port Type: Mouse Port
Handle 0x0013
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: Not Specified
		Internal Connector Type: None
		External Reference Designator: USB1
		External Connector Type: Access Bus (USB)
		Port Type: USB
Handle 0x0014
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: Not Specified
		Internal Connector Type: None
		External Reference Designator: USB2
		External Connector Type: Access Bus (USB)
		Port Type: USB
Handle 0x0015
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: Not Specified
		Internal Connector Type: None
		External Reference Designator: USB3
		External Connector Type: Access Bus (USB)
		Port Type: USB
Handle 0x0016
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: Not Specified
		Internal Connector Type: None
		External Reference Designator: USB4
		External Connector Type: Access Bus (USB)
		Port Type: USB
Handle 0x0017
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: Not Specified
		Internal Connector Type: None
		External Reference Designator: USB5
		External Connector Type: Access Bus (USB)
		Port Type: USB
Handle 0x0018
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: Not Specified
		Internal Connector Type: None
		External Reference Designator: USB6
		External Connector Type: Access Bus (USB)
		Port Type: USB
Handle 0x0019
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: Not Specified
		Internal Connector Type: None
		External Reference Designator: ETHERNET
		External Connector Type: RJ-45
		Port Type: Network Port
Handle 0x001A
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: Not Specified
		Internal Connector Type: None
		External Reference Designator: ETHERNET
		External Connector Type: RJ-45
		Port Type: Network Port
Handle 0x001B
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: Not Specified
		Internal Connector Type: None
		External Reference Designator: Joystic Port
		External Connector Type: DB-15 female
		Port Type: Joystick Port
Handle 0x001C
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: Not Specified
		Internal Connector Type: None
		External Reference Designator: MIDI Port
		External Connector Type: DB-15 female
		Port Type: MIDI Port
Handle 0x001D
	DMI type 9, 13 bytes.
	System Slot Information
		Designation: PCI1
		Type: 32-bit PCI
		Current Usage: Available
		Length: Short
		ID: 1
		Characteristics:
			5.0 V is provided
			3.3 V is provided
			PME signal is supported
Handle 0x001E
	DMI type 9, 13 bytes.
	System Slot Information
		Designation: PCI2
		Type: 32-bit PCI
		Current Usage: Available
		Length: Short
		ID: 2
		Characteristics:
			5.0 V is provided
			3.3 V is provided
			PME signal is supported
Handle 0x001F
	DMI type 9, 13 bytes.
	System Slot Information
		Designation: PCI3
		Type: 32-bit PCI
		Current Usage: Available
		Length: Short
		ID: 3
		Characteristics:
			5.0 V is provided
			3.3 V is provided
			PME signal is supported
Handle 0x0020
	DMI type 9, 13 bytes.
	System Slot Information
		Designation: PCI4
		Type: 32-bit PCI
		Current Usage: Available
		Length: Short
		ID: 4
		Characteristics:
			5.0 V is provided
			3.3 V is provided
			PME signal is supported
Handle 0x0021
	DMI type 9, 13 bytes.
	System Slot Information
		Designation: PCI5
		Type: 32-bit PCI
		Current Usage: Available
		Length: Short
		ID: 5
		Characteristics:
			5.0 V is provided
			3.3 V is provided
			PME signal is supported
Handle 0x0022
	DMI type 9, 13 bytes.
	System Slot Information
		Designation: AGP
		Type: 32-bit AGP
		Current Usage: In Use
		Length: Short
		ID: 6
		Characteristics:
			3.3 V is provided
Handle 0x0023
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: Not Specified
		Internal Connector Type: None
		External Reference Designator: Onboard 1394
		External Connector Type: IEEE 1394
		Port Type: Firewire (IEEE P1394)
Handle 0x0024
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: Not Specified
		Internal Connector Type: None
		External Reference Designator: Line In Jack Port
		External Connector Type: Mini Jack (headphones)
		Port Type: Audio Port
Handle 0x0025
	DMI type 13, 22 bytes.
	BIOS Language Information
		Installable Languages: 3
			n|US|iso8859-1
			n|US|iso8859-1
			r|CA|iso8859-1
		Currently Installed Language: n|US|iso8859-1
Handle 0x0026
	DMI type 16, 15 bytes.
	Physical Memory Array
		Location: System Board Or Motherboard
		Use: System Memory
		Error Correction Type: None
		Maximum Capacity: 1536 MB
		Error Information Handle: Not Provided
		Number Of Devices: 3
Handle 0x0027
	DMI type 17, 21 bytes.
	Memory Device
		Array Handle: 0x0026
		Error Information Handle: Not Provided
		Total Width: 72 bits
		Data Width: 64 bits
		Size: 256 MB
		Form Factor: DIMM
		Set: None
		Locator: DDR1
		Bank Locator: Bank0/1
		Type: DRAM
		Type Detail: Synchronous
Handle 0x0028
	DMI type 17, 21 bytes.
	Memory Device
		Array Handle: 0x0026
		Error Information Handle: Not Provided
		Total Width: 72 bits
		Data Width: 64 bits
		Size: 256 MB
		Form Factor: DIMM
		Set: None
		Locator: DDR2
		Bank Locator: Bank2/3
		Type: DRAM
		Type Detail: Synchronous
Handle 0x0029
	DMI type 17, 21 bytes.
	Memory Device
		Array Handle: 0x0026
		Error Information Handle: Not Provided
		Total Width: 72 bits
		Data Width: 64 bits
		Size: 512 MB
		Form Factor: DIMM
		Set: None
		Locator: DDR3
		Bank Locator: Bank4/5
		Type: DRAM
		Type Detail: Synchronous
Handle 0x002A
	DMI type 19, 15 bytes.
	Memory Array Mapped Address
		Starting Address: 0x00000000000
		Ending Address: 0x0003FFFFFFF
		Range Size: 1 GB
		Physical Array Handle: 0x0026
		Partition Width: 0
Handle 0x002B
	DMI type 20, 19 bytes.
	Memory Device Mapped Address
		Starting Address: 0x00000000000
		Ending Address: 0x0000FFFFFFF
		Range Size: 256 MB
		Physical Device Handle: 0x0027
		Memory Array Mapped Address Handle: 0x002A
		Partition Row Position: 1
Handle 0x002C
	DMI type 20, 19 bytes.
	Memory Device Mapped Address
		Starting Address: 0x00010000000
		Ending Address: 0x0001FFFFFFF
		Range Size: 256 MB
		Physical Device Handle: 0x0028
		Memory Array Mapped Address Handle: 0x002A
		Partition Row Position: 1
Handle 0x002D
	DMI type 20, 19 bytes.
	Memory Device Mapped Address
		Starting Address: 0x00020000000
		Ending Address: 0x0003FFFFFFF
		Range Size: 512 MB
		Physical Device Handle: 0x0029
		Memory Array Mapped Address Handle: 0x002A
		Partition Row Position: 1
Handle 0x002E
	DMI type 32, 11 bytes.
	System Boot Information
		Status: No errors detected
Handle 0x002F
	DMI type 127, 4 bytes.
	End Of Table

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-21 20:22             ` Len Brown
  2004-04-21 20:33               ` Ian Kumlien
  2004-04-21 20:45               ` Craig Bradney
@ 2004-04-21 21:28               ` Prakash K. Cheemplavam
  2004-04-21 22:41                 ` Len Brown
  2 siblings, 1 reply; 93+ messages in thread
From: Prakash K. Cheemplavam @ 2004-04-21 21:28 UTC (permalink / raw)
  To: Len Brown
  Cc: Craig Bradney, ross, christian.kroener, linux-kernel,
	Maciej W. Rozycki, Jamie Lokier, Daniel Drake, Ian Kumlien,
	Jesse Allen, a.verweij, Allen Martin

[-- Attachment #1: Type: text/plain, Size: 873 bytes --]

Len Brown wrote:
> On Thu, 2004-04-15 at 17:04, Craig Bradney wrote:
> 
> 
>>>While I don't want to get into the business of maintaining
>>>a dmi_scan entry for every system with this issue, I think
>>>it might be a good idea to add a couple of example entries
>>>for high volume systems for which there is no BIOS fix available.
>>>
>>>Got any opinions on which system to use as the example?
>>>I'll need the output from dmidecode for them.
>>
>>I have an A7N8X Deluxe v2 BIOS v1007 that I can give u whatever numbers
>>u need. IOAPIC and APIC are on.
> 
> 
> Please send me the output from dmidecode, available in /usr/sbin/, or
> here:
> http://www.nongnu.org/dmidecode/
> or
> http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/


Hi,

this is the output for Abit NF7-S Rev20 using bios d23. I have NOT 
activated APIC for this. Is it needed?

bye,

Prakash

[-- Attachment #2: dmiabitnf7sv2d23.txt --]
[-- Type: text/plain, Size: 11288 bytes --]

# dmidecode 2.3
SMBIOS 2.2 present.
37 structures occupying 981 bytes.
Table at 0x000F0800.
Handle 0x0000
	DMI type 0, 19 bytes.
	BIOS Information
		Vendor: Phoenix Technologies, LTD
		Version: 6.00 PG
		Release Date: 03/24/2004
		Address: 0xE0000
		Runtime Size: 128 kB
		ROM Size: 512 kB
		Characteristics:
			ISA is supported
			PCI is supported
			PNP is supported
			APM is supported
			BIOS is upgradeable
			BIOS shadowing is allowed
			ESCD support is available
			Boot from CD is supported
			Selectable boot is supported
			BIOS ROM is socketed
			EDD is supported
			5.25"/360 KB floppy services are supported (int 13h)
			5.25"/1.2 MB floppy services are supported (int 13h)
			3.5"/720 KB floppy services are supported (int 13h)
			3.5"/2.88 MB floppy services are supported (int 13h)
			Print screen service is supported (int 5h)
			8042 keyboard services are supported (int 9h)
			Serial services are supported (int 14h)
			Printer services are supported (int 17h)
			CGA/mono video services are supported (int 10h)
			ACPI is supported
			USB legacy is supported
			AGP is supported
			LS-120 boot is supported
			ATAPI Zip drive boot is supported
Handle 0x0001
	DMI type 1, 25 bytes.
	System Information
		Manufacturer:  
		Product Name:  
		Version:  
		Serial Number:  
		UUID: 00000000-0000-0000-0000-00508DF1FBE3
		Wake-up Type: Power Switch
Handle 0x0002
	DMI type 2, 8 bytes.
	Base Board Information
		Manufacturer: http://www.abit.com.tw/
		Product Name: NF7-S/NF7,NF7-V (nVidia-nForce2)
		Version: 2.X,1.0
		Serial Number:  
Handle 0x0003
	DMI type 3, 13 bytes.
	Chassis Information
		Manufacturer:  
		Type: Desktop
		Lock: Not Present
		Version:  
		Serial Number:  
		Asset Tag:  
		Boot-up State: Unknown
		Power Supply State: Unknown
		Thermal State: Unknown
		Security Status: Unknown
Handle 0x0004
	DMI type 4, 32 bytes.
	Processor Information
		Socket Designation: Socket A
		Type: Central Processor
		Family: Athlon
		Manufacturer: AMD
		ID: 81 06 00 00 FF FB 83 03
		Signature: Type 0, Family 6, Model 8, Stepping 1
		Flags:
			FPU (Floating-point unit on-chip)
			VME (Virtual mode extension)
			DE (Debugging extension)
			PSE (Page size extension)
			TSC (Time stamp counter)
			MSR (Model specific registers)
			PAE (Physical address extension)
			MCE (Machine check exception)
			CX8 (CMPXCHG8 instruction supported)
			APIC (On-chip APIC hardware supported)
			SEP (Fast system call)
			MTRR (Memory type range registers)
			PGE (Page global enable)
			MCA (Machine check architecture)
			CMOV (Conditional move instruction supported)
			PAT (Page attribute table)
			PSE-36 (36-bit page size extension)
			MMX (MMX technology supported)
			FXSR (Fast floating-point save and restore)
			SSE (Streaming SIMD extensions)
		Version: AMD Athlon(tm) XP
		Voltage: 1.6 V
		External Clock: 200 MHz
		Max Speed: 3000 MHz
		Current Speed: 2100 MHz
		Status: Populated, Enabled
		Upgrade: ZIF Socket
		L1 Cache Handle: 0x0009
		L2 Cache Handle: 0x000A
		L3 Cache Handle: No L3 Cache
Handle 0x0005
	DMI type 5, 22 bytes.
	Memory Controller Information
		Error Detecting Method: 8-bit Parity
		Error Correcting Capabilities:
			None
		Supported Interleave: One-way Interleave
		Current Interleave: One-way Interleave
		Maximum Memory Module Size: 1024 MB
		Maximum Total Memory Size: 3072 MB
		Supported Speeds:
			Other
		Supported Memory Types:
			Other
			DIMM
			SDRAM
		Memory Module Voltage: 2.9 V
		Associated Memory Slots: 3
			0x0006
			0x0007
			0x0008
		Enabled Error Correcting Capabilities: None
Handle 0x0006
	DMI type 6, 12 bytes.
	Memory Module Information
		Socket Designation: A0
		Bank Connections: 0 1
		Current Speed: 10 ns
		Type: Other DIMM SDRAM
		Installed Size: 512 MB (Double-bank Connection)
		Enabled Size: 512 MB (Double-bank Connection)
		Error Status: OK
Handle 0x0007
	DMI type 6, 12 bytes.
	Memory Module Information
		Socket Designation: A1
		Bank Connections: None
		Current Speed: 10 ns
		Type: Other DIMM SDRAM
		Installed Size: Not Installed (Single-bank Connection)
		Enabled Size: Not Installed (Single-bank Connection)
		Error Status: OK
Handle 0x0008
	DMI type 6, 12 bytes.
	Memory Module Information
		Socket Designation: A2
		Bank Connections: 4 5
		Current Speed: 10 ns
		Type: Other DIMM SDRAM
		Installed Size: 512 MB (Double-bank Connection)
		Enabled Size: 512 MB (Double-bank Connection)
		Error Status: OK
Handle 0x0009
	DMI type 7, 19 bytes.
	Cache Information
		Socket Designation: Internal Cache
		Configuration: Enabled, Not Socketed, Level 1
		Operational Mode: Write Back
		Location: Internal
		Installed Size: 128 KB
		Maximum Size: 128 KB
		Supported SRAM Types:
			Synchronous
		Installed SRAM Type: Synchronous
		Speed: Unknown
		Error Correction Type: Unknown
		System Type: Unknown
		Associativity: Unknown
Handle 0x000A
	DMI type 7, 19 bytes.
	Cache Information
		Socket Designation: External Cache
		Configuration: Enabled, Not Socketed, Level 2
		Operational Mode: Write Back
		Location: External
		Installed Size: 256 KB
		Maximum Size: 256 KB
		Supported SRAM Types:
			Synchronous
		Installed SRAM Type: Synchronous
		Speed: Unknown
		Error Correction Type: Unknown
		System Type: Unknown
		Associativity: Unknown
Handle 0x000B
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: PRIMARY IDE
		Internal Connector Type: On Board IDE
		External Reference Designator: Not Specified
		External Connector Type: None
		Port Type: Other
Handle 0x000C
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: SECONDARY IDE
		Internal Connector Type: On Board IDE
		External Reference Designator: Not Specified
		External Connector Type: None
		Port Type: Other
Handle 0x000D
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: FDD
		Internal Connector Type: On Board Floppy
		External Reference Designator: Not Specified
		External Connector Type: None
		Port Type: 8251 FIFO Compatible
Handle 0x000E
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: COM1
		Internal Connector Type: 9 Pin Dual Inline (pin 10 cut)
		External Reference Designator:  
		External Connector Type: DB-9 male
		Port Type: Serial Port 16450 Compatible
Handle 0x000F
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: COM2
		Internal Connector Type: 9 Pin Dual Inline (pin 10 cut)
		External Reference Designator:  
		External Connector Type: DB-9 male
		Port Type: Serial Port 16450 Compatible
Handle 0x0010
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: LPT1
		Internal Connector Type: DB-25 female
		External Reference Designator:  
		External Connector Type: DB-25 female
		Port Type: Parallel Port ECP/EPP
Handle 0x0011
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: Keyboard
		Internal Connector Type: PS/2
		External Reference Designator:  
		External Connector Type: PS/2
		Port Type: Keyboard Port
Handle 0x0012
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: PS/2 Mouse
		Internal Connector Type: PS/2
		External Reference Designator:  
		External Connector Type: PS/2
		Port Type: Mouse Port
Handle 0x0013
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: Not Specified
		Internal Connector Type: None
		External Reference Designator: USB
		External Connector Type: Other
		Port Type: USB
Handle 0x0014
	DMI type 9, 13 bytes.
	System Slot Information
		Designation: PCI0
		Type: 32-bit PCI
		Current Usage: In Use
		Length: Long
		ID: 1
		Characteristics:
			5.0 V is provided
			3.3 V is provided
			PME signal is supported
Handle 0x0015
	DMI type 9, 13 bytes.
	System Slot Information
		Designation: PCI1
		Type: 32-bit PCI
		Current Usage: Available
		Length: Long
		ID: 2
		Characteristics:
			5.0 V is provided
			3.3 V is provided
			PME signal is supported
Handle 0x0016
	DMI type 9, 13 bytes.
	System Slot Information
		Designation: PCI2
		Type: 32-bit PCI
		Current Usage: In Use
		Length: Long
		ID: 3
		Characteristics:
			5.0 V is provided
			3.3 V is provided
			PME signal is supported
Handle 0x0017
	DMI type 9, 13 bytes.
	System Slot Information
		Designation: PCI3
		Type: 32-bit PCI
		Current Usage: In Use
		Length: Long
		ID: 4
		Characteristics:
			5.0 V is provided
			3.3 V is provided
			PME signal is supported
Handle 0x0018
	DMI type 9, 13 bytes.
	System Slot Information
		Designation: PCI4
		Type: 32-bit PCI
		Current Usage: Available
		Length: Long
		ID: 5
		Characteristics:
			5.0 V is provided
			3.3 V is provided
			PME signal is supported
Handle 0x0019
	DMI type 9, 13 bytes.
	System Slot Information
		Designation: AGP
		Type: 32-bit AGP
		Current Usage: Available
		Length: Long
		ID: 240
		Characteristics:
			5.0 V is provided
			3.3 V is provided
Handle 0x001A
	DMI type 13, 22 bytes.
	BIOS Language Information
		Installable Languages: 3
			n|US|iso8859-1
			n|US|iso8859-1
			r|CA|iso8859-1
		Currently Installed Language: n|US|iso8859-1
Handle 0x001B
	DMI type 16, 15 bytes.
	Physical Memory Array
		Location: System Board Or Motherboard
		Use: System Memory
		Error Correction Type: None
		Maximum Capacity: 1536 MB
		Error Information Handle: Not Provided
		Number Of Devices: 3
Handle 0x001C
	DMI type 17, 21 bytes.
	Memory Device
		Array Handle: 0x001B
		Error Information Handle: Not Provided
		Total Width: Unknown
		Data Width: Unknown
		Size: 512 MB
		Form Factor: DIMM
		Set: None
		Locator: A0
		Bank Locator: Bank0/1
		Type: Unknown
		Type Detail: None
Handle 0x001D
	DMI type 17, 21 bytes.
	Memory Device
		Array Handle: 0x001B
		Error Information Handle: Not Provided
		Total Width: Unknown
		Data Width: Unknown
		Size: No Module Installed
		Form Factor: DIMM
		Set: None
		Locator: A1
		Bank Locator: Bank2/3
		Type: Unknown
		Type Detail: None
Handle 0x001E
	DMI type 17, 21 bytes.
	Memory Device
		Array Handle: 0x001B
		Error Information Handle: Not Provided
		Total Width: Unknown
		Data Width: Unknown
		Size: 512 MB
		Form Factor: DIMM
		Set: None
		Locator: A2
		Bank Locator: Bank4/5
		Type: Unknown
		Type Detail: None
Handle 0x001F
	DMI type 19, 15 bytes.
	Memory Array Mapped Address
		Starting Address: 0x00000000000
		Ending Address: 0x0003FFFFFFF
		Range Size: 1 GB
		Physical Array Handle: 0x001B
		Partition Width: 0
Handle 0x0020
	DMI type 20, 19 bytes.
	Memory Device Mapped Address
		Starting Address: 0x00000000000
		Ending Address: 0x0001FFFFFFF
		Range Size: 512 MB
		Physical Device Handle: 0x001C
		Memory Array Mapped Address Handle: 0x001F
		Partition Row Position: 1
Handle 0x0021
	DMI type 20, 19 bytes.
	Memory Device Mapped Address
		Starting Address: 0x00000000000
		Ending Address: 0x000000003FF
		Range Size: 1 kB
		Physical Device Handle: 0x001D
		Memory Array Mapped Address Handle: 0x001F
		Partition Row Position: 1
Handle 0x0022
	DMI type 20, 19 bytes.
	Memory Device Mapped Address
		Starting Address: 0x00020000000
		Ending Address: 0x0003FFFFFFF
		Range Size: 512 MB
		Physical Device Handle: 0x001E
		Memory Array Mapped Address Handle: 0x001F
		Partition Row Position: 1
Handle 0x0023
	DMI type 32, 11 bytes.
	System Boot Information
		Status: No errors detected
Handle 0x0024
	DMI type 127, 4 bytes.
	End Of Table

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-21 21:28               ` Prakash K. Cheemplavam
@ 2004-04-21 22:41                 ` Len Brown
  2004-04-22  7:26                   ` Prakash K. Cheemplavam
                                     ` (4 more replies)
  0 siblings, 5 replies; 93+ messages in thread
From: Len Brown @ 2004-04-21 22:41 UTC (permalink / raw)
  To: Prakash K. Cheemplavam
  Cc: Craig Bradney, ross, christian.kroener, linux-kernel,
	Maciej W. Rozycki, Jamie Lokier, Daniel Drake, Ian Kumlien,
	Jesse Allen, a.verweij, Allen Martin

[-- Attachment #1: Type: text/plain, Size: 5145 bytes --]

> > Please send me the output from dmidecode, available in /usr/sbin/, or
> > here:
> > http://www.nongnu.org/dmidecode/
> > or
> > http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/

On Wed, 2004-04-21 at 17:28, Prakash K. Cheemplavam wrote:

> this is the output for Abit NF7-S Rev20 using bios d23. I have NOT 
> activated APIC for this. Is it needed?

Yes, you need to enable ACPI and IOAPIC.  The goal of this patch
is to address the XT-PIC timer issue in IOAPIC mode.

Here's the latest (vs 2.6.5).

I've got 1 Abit, 2 Asus, and 1 Shuttle DMI entry.  Let me know if the
product names (1st line of dmidecode entry) are correct,
these are not from DMI, but are supposed to be human-readable titles.

I'm interested only in the latest BIOS -- if it is still broken.
The assumption is that if a fixed BIOS is available, the users
should upgrade.

thanks,
-Len

ps. latest BIOS on my shuttle has a C1 disconnect enable setting,
(curiously, it is disabled by default) so I'll try to reproduce the hang
on it...

===== Documentation/kernel-parameters.txt 1.44 vs edited =====
--- 1.44/Documentation/kernel-parameters.txt	Mon Mar 22 16:03:22 2004
+++ edited/Documentation/kernel-parameters.txt	Wed Apr 21 15:28:12 2004
@@ -122,6 +122,10 @@
 
 	acpi_serialize	[HW,ACPI] force serialization of AML methods
 
+	acpi_skip_timer_override [HW,ACPI]
+			Recognize and ignore IRQ0/pin2 Interrupt Override.
+			For broken nForce2 BIOS resulting in XT-PIC timer.
+
 	ad1816=		[HW,OSS]
 			Format: <io>,<irq>,<dma>,<dma2>
 			See also Documentation/sound/oss/AD1816.
===== arch/i386/kernel/dmi_scan.c 1.57 vs edited =====
--- 1.57/arch/i386/kernel/dmi_scan.c	Fri Apr 16 22:03:06 2004
+++ edited/arch/i386/kernel/dmi_scan.c	Wed Apr 21 18:29:35 2004
@@ -540,6 +540,19 @@
 #endif
 
 /*
+ * early nForce2 reference BIOS shipped with a
+ * bogus ACPI IRQ0 -> pin2 interrupt override -- ignore it
+ */
+static __init int ignore_timer_override(struct dmi_blacklist *d)
+{
+	extern int acpi_skip_timer_override;
+	printk(KERN_NOTICE "%s detected: BIOS IRQ0 pin2 override"
+		" will be ignored\n", d->ident); 	
+
+	acpi_skip_timer_override = 1;
+	return 0;
+}
+/*
  *	Process the DMI blacklists
  */
  
@@ -944,6 +957,37 @@
 			MATCH(DMI_BOARD_VENDOR, "IBM"),
 			MATCH(DMI_PRODUCT_NAME, "eserver xSeries 440"),
 			NO_MATCH, NO_MATCH }},
+
+/*
+ * Systems with nForce2 BIOS timer override bug
+ * add Albatron KM18G Pro
+ * add DFI NFII 400-AL
+ * add Epox 8RGA+
+ * add Shuttle AN35N
+ */
+	{ ignore_timer_override, "Abit NF7-S v2", {
+			MATCH(DMI_BOARD_VENDOR, "http://www.abit.com.tw/"),
+			MATCH(DMI_BOARD_NAME, "NF7-S/NF7,NF7-V (nVidia-nForce2)"),
+			MATCH(DMI_BIOS_VERSION, "6.00 PG"),
+			MATCH(DMI_BIOS_DATE, "03/24/2004") }},
+
+	{ ignore_timer_override, "Asus A7N8X v2", {
+			MATCH(DMI_BOARD_VENDOR, "ASUSTeK Computer INC."),
+			MATCH(DMI_BOARD_NAME, "A7N8X2.0"),
+			MATCH(DMI_BIOS_VERSION, "ASUS A7N8X2.0 Deluxe ACPI BIOS Rev 1007"),
+			MATCH(DMI_BIOS_DATE, "10/06/2003") }},
+
+	{ ignore_timer_override, "Asus A7N8X-X", {
+			MATCH(DMI_BOARD_VENDOR, "ASUSTeK Computer INC."),
+			MATCH(DMI_BOARD_NAME, "A7N8X-X"),
+			MATCH(DMI_BIOS_VERSION, "ASUS A7N8X-X ACPI BIOS Rev 1007"),
+			MATCH(DMI_BIOS_DATE, "10/07/2003") }},
+
+	{ ignore_timer_override, "Shuttle SN41G2", {
+			MATCH(DMI_BOARD_VENDOR, "Shuttle Inc"),
+			MATCH(DMI_BOARD_NAME, "FN41"),
+			MATCH(DMI_BIOS_VERSION, "6.00 PG"),
+			MATCH(DMI_BIOS_DATE, "01/14/2004") }},
 #endif	// CONFIG_ACPI_BOOT
 
 #ifdef	CONFIG_ACPI_PCI
===== arch/i386/kernel/setup.c 1.115 vs edited =====
--- 1.115/arch/i386/kernel/setup.c	Fri Apr  2 07:21:43 2004
+++ edited/arch/i386/kernel/setup.c	Wed Apr 21 15:28:12 2004
@@ -614,6 +614,9 @@
 		else if (!memcmp(from, "acpi_sci=low", 12))
 			acpi_sci_flags.polarity = 3;
 
+		else if (!memcmp(from, "acpi_skip_timer_override", 24))
+			acpi_skip_timer_override = 1;
+
 #ifdef CONFIG_X86_LOCAL_APIC
 		/* disable IO-APIC */
 		else if (!memcmp(from, "noapic", 6))
===== arch/i386/kernel/acpi/boot.c 1.58 vs edited =====
--- 1.58/arch/i386/kernel/acpi/boot.c	Tue Apr 20 20:54:03 2004
+++ edited/arch/i386/kernel/acpi/boot.c	Wed Apr 21 15:28:13 2004
@@ -62,6 +62,7 @@
 
 acpi_interrupt_flags acpi_sci_flags __initdata;
 int acpi_sci_override_gsi __initdata;
+int acpi_skip_timer_override __initdata;
 
 #ifdef CONFIG_X86_LOCAL_APIC
 static u64 acpi_lapic_addr __initdata = APIC_DEFAULT_PHYS_BASE;
@@ -327,6 +328,12 @@
 		acpi_sci_ioapic_setup(intsrc->global_irq,
 			intsrc->flags.polarity, intsrc->flags.trigger);
 		return 0;
+	}
+
+	if (acpi_skip_timer_override &&
+		intsrc->bus_irq == 0 && intsrc->global_irq == 2) {
+			printk(PREFIX "BIOS IRQ0 pin2 override ignored.\n");
+			return 0;
 	}
 
 	mp_override_legacy_irq (
===== include/asm-i386/acpi.h 1.18 vs edited =====
--- 1.18/include/asm-i386/acpi.h	Tue Mar 30 17:05:19 2004
+++ edited/include/asm-i386/acpi.h	Wed Apr 21 15:28:14 2004
@@ -118,6 +118,7 @@
 #ifdef CONFIG_X86_IO_APIC
 extern int skip_ioapic_setup;
 extern int acpi_irq_to_vector(u32 irq);	/* deprecated in favor of acpi_gsi_to_irq */
+extern int acpi_skip_timer_override;
 
 static inline void disable_ioapic_setup(void)
 {



[-- Attachment #2: wip.patch --]
[-- Type: text/plain, Size: 4121 bytes --]

===== Documentation/kernel-parameters.txt 1.44 vs edited =====
--- 1.44/Documentation/kernel-parameters.txt	Mon Mar 22 16:03:22 2004
+++ edited/Documentation/kernel-parameters.txt	Wed Apr 21 15:28:12 2004
@@ -122,6 +122,10 @@
 
 	acpi_serialize	[HW,ACPI] force serialization of AML methods
 
+	acpi_skip_timer_override [HW,ACPI]
+			Recognize and ignore IRQ0/pin2 Interrupt Override.
+			For broken nForce2 BIOS resulting in XT-PIC timer.
+
 	ad1816=		[HW,OSS]
 			Format: <io>,<irq>,<dma>,<dma2>
 			See also Documentation/sound/oss/AD1816.
===== arch/i386/kernel/dmi_scan.c 1.57 vs edited =====
--- 1.57/arch/i386/kernel/dmi_scan.c	Fri Apr 16 22:03:06 2004
+++ edited/arch/i386/kernel/dmi_scan.c	Wed Apr 21 18:29:35 2004
@@ -540,6 +540,19 @@
 #endif
 
 /*
+ * early nForce2 reference BIOS shipped with a
+ * bogus ACPI IRQ0 -> pin2 interrupt override -- ignore it
+ */
+static __init int ignore_timer_override(struct dmi_blacklist *d)
+{
+	extern int acpi_skip_timer_override;
+	printk(KERN_NOTICE "%s detected: BIOS IRQ0 pin2 override"
+		" will be ignored\n", d->ident); 	
+
+	acpi_skip_timer_override = 1;
+	return 0;
+}
+/*
  *	Process the DMI blacklists
  */
  
@@ -944,6 +957,37 @@
 			MATCH(DMI_BOARD_VENDOR, "IBM"),
 			MATCH(DMI_PRODUCT_NAME, "eserver xSeries 440"),
 			NO_MATCH, NO_MATCH }},
+
+/*
+ * Systems with nForce2 BIOS timer override bug
+ * add Albatron KM18G Pro
+ * add DFI NFII 400-AL
+ * add Epox 8RGA+
+ * add Shuttle AN35N
+ */
+	{ ignore_timer_override, "Abit NF7-S v2", {
+			MATCH(DMI_BOARD_VENDOR, "http://www.abit.com.tw/"),
+			MATCH(DMI_BOARD_NAME, "NF7-S/NF7,NF7-V (nVidia-nForce2)"),
+			MATCH(DMI_BIOS_VERSION, "6.00 PG"),
+			MATCH(DMI_BIOS_DATE, "03/24/2004") }},
+
+	{ ignore_timer_override, "Asus A7N8X v2", {
+			MATCH(DMI_BOARD_VENDOR, "ASUSTeK Computer INC."),
+			MATCH(DMI_BOARD_NAME, "A7N8X2.0"),
+			MATCH(DMI_BIOS_VERSION, "ASUS A7N8X2.0 Deluxe ACPI BIOS Rev 1007"),
+			MATCH(DMI_BIOS_DATE, "10/06/2003") }},
+
+	{ ignore_timer_override, "Asus A7N8X-X", {
+			MATCH(DMI_BOARD_VENDOR, "ASUSTeK Computer INC."),
+			MATCH(DMI_BOARD_NAME, "A7N8X-X"),
+			MATCH(DMI_BIOS_VERSION, "ASUS A7N8X-X ACPI BIOS Rev 1007"),
+			MATCH(DMI_BIOS_DATE, "10/07/2003") }},
+
+	{ ignore_timer_override, "Shuttle SN41G2", {
+			MATCH(DMI_BOARD_VENDOR, "Shuttle Inc"),
+			MATCH(DMI_BOARD_NAME, "FN41"),
+			MATCH(DMI_BIOS_VERSION, "6.00 PG"),
+			MATCH(DMI_BIOS_DATE, "01/14/2004") }},
 #endif	// CONFIG_ACPI_BOOT
 
 #ifdef	CONFIG_ACPI_PCI
===== arch/i386/kernel/setup.c 1.115 vs edited =====
--- 1.115/arch/i386/kernel/setup.c	Fri Apr  2 07:21:43 2004
+++ edited/arch/i386/kernel/setup.c	Wed Apr 21 15:28:12 2004
@@ -614,6 +614,9 @@
 		else if (!memcmp(from, "acpi_sci=low", 12))
 			acpi_sci_flags.polarity = 3;
 
+		else if (!memcmp(from, "acpi_skip_timer_override", 24))
+			acpi_skip_timer_override = 1;
+
 #ifdef CONFIG_X86_LOCAL_APIC
 		/* disable IO-APIC */
 		else if (!memcmp(from, "noapic", 6))
===== arch/i386/kernel/acpi/boot.c 1.58 vs edited =====
--- 1.58/arch/i386/kernel/acpi/boot.c	Tue Apr 20 20:54:03 2004
+++ edited/arch/i386/kernel/acpi/boot.c	Wed Apr 21 15:28:13 2004
@@ -62,6 +62,7 @@
 
 acpi_interrupt_flags acpi_sci_flags __initdata;
 int acpi_sci_override_gsi __initdata;
+int acpi_skip_timer_override __initdata;
 
 #ifdef CONFIG_X86_LOCAL_APIC
 static u64 acpi_lapic_addr __initdata = APIC_DEFAULT_PHYS_BASE;
@@ -327,6 +328,12 @@
 		acpi_sci_ioapic_setup(intsrc->global_irq,
 			intsrc->flags.polarity, intsrc->flags.trigger);
 		return 0;
+	}
+
+	if (acpi_skip_timer_override &&
+		intsrc->bus_irq == 0 && intsrc->global_irq == 2) {
+			printk(PREFIX "BIOS IRQ0 pin2 override ignored.\n");
+			return 0;
 	}
 
 	mp_override_legacy_irq (
===== include/asm-i386/acpi.h 1.18 vs edited =====
--- 1.18/include/asm-i386/acpi.h	Tue Mar 30 17:05:19 2004
+++ edited/include/asm-i386/acpi.h	Wed Apr 21 15:28:14 2004
@@ -118,6 +118,7 @@
 #ifdef CONFIG_X86_IO_APIC
 extern int skip_ioapic_setup;
 extern int acpi_irq_to_vector(u32 irq);	/* deprecated in favor of acpi_gsi_to_irq */
+extern int acpi_skip_timer_override;
 
 static inline void disable_ioapic_setup(void)
 {

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-21 22:41                 ` Len Brown
@ 2004-04-22  7:26                   ` Prakash K. Cheemplavam
  2004-04-22 14:58                     ` Len Brown
  2004-04-22  8:45                   ` Craig Bradney
                                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 93+ messages in thread
From: Prakash K. Cheemplavam @ 2004-04-22  7:26 UTC (permalink / raw)
  To: Len Brown
  Cc: Craig Bradney, ross, christian.kroener, linux-kernel,
	Maciej W. Rozycki, Jamie Lokier, Daniel Drake, Ian Kumlien,
	Jesse Allen, a.verweij, Allen Martin

[-- Attachment #1: Type: text/plain, Size: 1500 bytes --]

Len Brown wrote:
>>>Please send me the output from dmidecode, available in /usr/sbin/, or
>>>here:
>>>http://www.nongnu.org/dmidecode/
>>>or
>>>http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/
> 
> 
> On Wed, 2004-04-21 at 17:28, Prakash K. Cheemplavam wrote:
> 
> 
>>this is the output for Abit NF7-S Rev20 using bios d23. I have NOT 
>>activated APIC for this. Is it needed?
> 
> 
> Yes, you need to enable ACPI and IOAPIC.  The goal of this patch
> is to address the XT-PIC timer issue in IOAPIC mode.

Ok, I recompiled using your (former) patch and Ross' apic tack patch. I 
attached the new dmidecode Text.

> I've got 1 Abit, 2 Asus, and 1 Shuttle DMI entry.  Let me know if the
> product names (1st line of dmidecode entry) are correct,
> these are not from DMI, but are supposed to be human-readable titles.

Are you referring to (as the first line doesn't say much):

Product Name: NF7-S/NF7,NF7-V (nVidia-nForce2)
Version: 2.X,1.0

Seems pretty much OK, though I don't understand why 1.0 is in the 
Version string. Furthermore, I don't understand why "Phoenix" appears as 
the BIOS vendor. It should be Award, AFAIK.

> I'm interested only in the latest BIOS -- if it is still broken.

It is the latest (d23). And I guess it is broken, as without your patch 
the timer gets connected to XT-PIC.

> The assumption is that if a fixed BIOS is available, the users
> should upgrade.

Well, I posted in Abit's Forum, but I don't know whether it will have an 
effect.

bye,

Prakash

[-- Attachment #2: dmiabitnf7sv2d23apic.txt --]
[-- Type: text/plain, Size: 11288 bytes --]

# dmidecode 2.3
SMBIOS 2.2 present.
37 structures occupying 981 bytes.
Table at 0x000F0800.
Handle 0x0000
	DMI type 0, 19 bytes.
	BIOS Information
		Vendor: Phoenix Technologies, LTD
		Version: 6.00 PG
		Release Date: 03/24/2004
		Address: 0xE0000
		Runtime Size: 128 kB
		ROM Size: 512 kB
		Characteristics:
			ISA is supported
			PCI is supported
			PNP is supported
			APM is supported
			BIOS is upgradeable
			BIOS shadowing is allowed
			ESCD support is available
			Boot from CD is supported
			Selectable boot is supported
			BIOS ROM is socketed
			EDD is supported
			5.25"/360 KB floppy services are supported (int 13h)
			5.25"/1.2 MB floppy services are supported (int 13h)
			3.5"/720 KB floppy services are supported (int 13h)
			3.5"/2.88 MB floppy services are supported (int 13h)
			Print screen service is supported (int 5h)
			8042 keyboard services are supported (int 9h)
			Serial services are supported (int 14h)
			Printer services are supported (int 17h)
			CGA/mono video services are supported (int 10h)
			ACPI is supported
			USB legacy is supported
			AGP is supported
			LS-120 boot is supported
			ATAPI Zip drive boot is supported
Handle 0x0001
	DMI type 1, 25 bytes.
	System Information
		Manufacturer:  
		Product Name:  
		Version:  
		Serial Number:  
		UUID: 00000000-0000-0000-0000-00508DF1FBE3
		Wake-up Type: Power Switch
Handle 0x0002
	DMI type 2, 8 bytes.
	Base Board Information
		Manufacturer: http://www.abit.com.tw/
		Product Name: NF7-S/NF7,NF7-V (nVidia-nForce2)
		Version: 2.X,1.0
		Serial Number:  
Handle 0x0003
	DMI type 3, 13 bytes.
	Chassis Information
		Manufacturer:  
		Type: Desktop
		Lock: Not Present
		Version:  
		Serial Number:  
		Asset Tag:  
		Boot-up State: Unknown
		Power Supply State: Unknown
		Thermal State: Unknown
		Security Status: Unknown
Handle 0x0004
	DMI type 4, 32 bytes.
	Processor Information
		Socket Designation: Socket A
		Type: Central Processor
		Family: Athlon
		Manufacturer: AMD
		ID: 81 06 00 00 FF FB 83 03
		Signature: Type 0, Family 6, Model 8, Stepping 1
		Flags:
			FPU (Floating-point unit on-chip)
			VME (Virtual mode extension)
			DE (Debugging extension)
			PSE (Page size extension)
			TSC (Time stamp counter)
			MSR (Model specific registers)
			PAE (Physical address extension)
			MCE (Machine check exception)
			CX8 (CMPXCHG8 instruction supported)
			APIC (On-chip APIC hardware supported)
			SEP (Fast system call)
			MTRR (Memory type range registers)
			PGE (Page global enable)
			MCA (Machine check architecture)
			CMOV (Conditional move instruction supported)
			PAT (Page attribute table)
			PSE-36 (36-bit page size extension)
			MMX (MMX technology supported)
			FXSR (Fast floating-point save and restore)
			SSE (Streaming SIMD extensions)
		Version: AMD Athlon(tm) XP
		Voltage: 1.6 V
		External Clock: 200 MHz
		Max Speed: 3000 MHz
		Current Speed: 2100 MHz
		Status: Populated, Enabled
		Upgrade: ZIF Socket
		L1 Cache Handle: 0x0009
		L2 Cache Handle: 0x000A
		L3 Cache Handle: No L3 Cache
Handle 0x0005
	DMI type 5, 22 bytes.
	Memory Controller Information
		Error Detecting Method: 8-bit Parity
		Error Correcting Capabilities:
			None
		Supported Interleave: One-way Interleave
		Current Interleave: One-way Interleave
		Maximum Memory Module Size: 1024 MB
		Maximum Total Memory Size: 3072 MB
		Supported Speeds:
			Other
		Supported Memory Types:
			Other
			DIMM
			SDRAM
		Memory Module Voltage: 2.9 V
		Associated Memory Slots: 3
			0x0006
			0x0007
			0x0008
		Enabled Error Correcting Capabilities: None
Handle 0x0006
	DMI type 6, 12 bytes.
	Memory Module Information
		Socket Designation: A0
		Bank Connections: 0 1
		Current Speed: 10 ns
		Type: Other DIMM SDRAM
		Installed Size: 512 MB (Double-bank Connection)
		Enabled Size: 512 MB (Double-bank Connection)
		Error Status: OK
Handle 0x0007
	DMI type 6, 12 bytes.
	Memory Module Information
		Socket Designation: A1
		Bank Connections: None
		Current Speed: 10 ns
		Type: Other DIMM SDRAM
		Installed Size: Not Installed (Single-bank Connection)
		Enabled Size: Not Installed (Single-bank Connection)
		Error Status: OK
Handle 0x0008
	DMI type 6, 12 bytes.
	Memory Module Information
		Socket Designation: A2
		Bank Connections: 4 5
		Current Speed: 10 ns
		Type: Other DIMM SDRAM
		Installed Size: 512 MB (Double-bank Connection)
		Enabled Size: 512 MB (Double-bank Connection)
		Error Status: OK
Handle 0x0009
	DMI type 7, 19 bytes.
	Cache Information
		Socket Designation: Internal Cache
		Configuration: Enabled, Not Socketed, Level 1
		Operational Mode: Write Back
		Location: Internal
		Installed Size: 128 KB
		Maximum Size: 128 KB
		Supported SRAM Types:
			Synchronous
		Installed SRAM Type: Synchronous
		Speed: Unknown
		Error Correction Type: Unknown
		System Type: Unknown
		Associativity: Unknown
Handle 0x000A
	DMI type 7, 19 bytes.
	Cache Information
		Socket Designation: External Cache
		Configuration: Enabled, Not Socketed, Level 2
		Operational Mode: Write Back
		Location: External
		Installed Size: 256 KB
		Maximum Size: 256 KB
		Supported SRAM Types:
			Synchronous
		Installed SRAM Type: Synchronous
		Speed: Unknown
		Error Correction Type: Unknown
		System Type: Unknown
		Associativity: Unknown
Handle 0x000B
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: PRIMARY IDE
		Internal Connector Type: On Board IDE
		External Reference Designator: Not Specified
		External Connector Type: None
		Port Type: Other
Handle 0x000C
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: SECONDARY IDE
		Internal Connector Type: On Board IDE
		External Reference Designator: Not Specified
		External Connector Type: None
		Port Type: Other
Handle 0x000D
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: FDD
		Internal Connector Type: On Board Floppy
		External Reference Designator: Not Specified
		External Connector Type: None
		Port Type: 8251 FIFO Compatible
Handle 0x000E
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: COM1
		Internal Connector Type: 9 Pin Dual Inline (pin 10 cut)
		External Reference Designator:  
		External Connector Type: DB-9 male
		Port Type: Serial Port 16450 Compatible
Handle 0x000F
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: COM2
		Internal Connector Type: 9 Pin Dual Inline (pin 10 cut)
		External Reference Designator:  
		External Connector Type: DB-9 male
		Port Type: Serial Port 16450 Compatible
Handle 0x0010
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: LPT1
		Internal Connector Type: DB-25 female
		External Reference Designator:  
		External Connector Type: DB-25 female
		Port Type: Parallel Port ECP/EPP
Handle 0x0011
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: Keyboard
		Internal Connector Type: PS/2
		External Reference Designator:  
		External Connector Type: PS/2
		Port Type: Keyboard Port
Handle 0x0012
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: PS/2 Mouse
		Internal Connector Type: PS/2
		External Reference Designator:  
		External Connector Type: PS/2
		Port Type: Mouse Port
Handle 0x0013
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: Not Specified
		Internal Connector Type: None
		External Reference Designator: USB
		External Connector Type: Other
		Port Type: USB
Handle 0x0014
	DMI type 9, 13 bytes.
	System Slot Information
		Designation: PCI0
		Type: 32-bit PCI
		Current Usage: In Use
		Length: Long
		ID: 1
		Characteristics:
			5.0 V is provided
			3.3 V is provided
			PME signal is supported
Handle 0x0015
	DMI type 9, 13 bytes.
	System Slot Information
		Designation: PCI1
		Type: 32-bit PCI
		Current Usage: Available
		Length: Long
		ID: 2
		Characteristics:
			5.0 V is provided
			3.3 V is provided
			PME signal is supported
Handle 0x0016
	DMI type 9, 13 bytes.
	System Slot Information
		Designation: PCI2
		Type: 32-bit PCI
		Current Usage: In Use
		Length: Long
		ID: 3
		Characteristics:
			5.0 V is provided
			3.3 V is provided
			PME signal is supported
Handle 0x0017
	DMI type 9, 13 bytes.
	System Slot Information
		Designation: PCI3
		Type: 32-bit PCI
		Current Usage: In Use
		Length: Long
		ID: 4
		Characteristics:
			5.0 V is provided
			3.3 V is provided
			PME signal is supported
Handle 0x0018
	DMI type 9, 13 bytes.
	System Slot Information
		Designation: PCI4
		Type: 32-bit PCI
		Current Usage: Available
		Length: Long
		ID: 5
		Characteristics:
			5.0 V is provided
			3.3 V is provided
			PME signal is supported
Handle 0x0019
	DMI type 9, 13 bytes.
	System Slot Information
		Designation: AGP
		Type: 32-bit AGP
		Current Usage: Available
		Length: Long
		ID: 240
		Characteristics:
			5.0 V is provided
			3.3 V is provided
Handle 0x001A
	DMI type 13, 22 bytes.
	BIOS Language Information
		Installable Languages: 3
			n|US|iso8859-1
			n|US|iso8859-1
			r|CA|iso8859-1
		Currently Installed Language: n|US|iso8859-1
Handle 0x001B
	DMI type 16, 15 bytes.
	Physical Memory Array
		Location: System Board Or Motherboard
		Use: System Memory
		Error Correction Type: None
		Maximum Capacity: 1536 MB
		Error Information Handle: Not Provided
		Number Of Devices: 3
Handle 0x001C
	DMI type 17, 21 bytes.
	Memory Device
		Array Handle: 0x001B
		Error Information Handle: Not Provided
		Total Width: Unknown
		Data Width: Unknown
		Size: 512 MB
		Form Factor: DIMM
		Set: None
		Locator: A0
		Bank Locator: Bank0/1
		Type: Unknown
		Type Detail: None
Handle 0x001D
	DMI type 17, 21 bytes.
	Memory Device
		Array Handle: 0x001B
		Error Information Handle: Not Provided
		Total Width: Unknown
		Data Width: Unknown
		Size: No Module Installed
		Form Factor: DIMM
		Set: None
		Locator: A1
		Bank Locator: Bank2/3
		Type: Unknown
		Type Detail: None
Handle 0x001E
	DMI type 17, 21 bytes.
	Memory Device
		Array Handle: 0x001B
		Error Information Handle: Not Provided
		Total Width: Unknown
		Data Width: Unknown
		Size: 512 MB
		Form Factor: DIMM
		Set: None
		Locator: A2
		Bank Locator: Bank4/5
		Type: Unknown
		Type Detail: None
Handle 0x001F
	DMI type 19, 15 bytes.
	Memory Array Mapped Address
		Starting Address: 0x00000000000
		Ending Address: 0x0003FFFFFFF
		Range Size: 1 GB
		Physical Array Handle: 0x001B
		Partition Width: 0
Handle 0x0020
	DMI type 20, 19 bytes.
	Memory Device Mapped Address
		Starting Address: 0x00000000000
		Ending Address: 0x0001FFFFFFF
		Range Size: 512 MB
		Physical Device Handle: 0x001C
		Memory Array Mapped Address Handle: 0x001F
		Partition Row Position: 1
Handle 0x0021
	DMI type 20, 19 bytes.
	Memory Device Mapped Address
		Starting Address: 0x00000000000
		Ending Address: 0x000000003FF
		Range Size: 1 kB
		Physical Device Handle: 0x001D
		Memory Array Mapped Address Handle: 0x001F
		Partition Row Position: 1
Handle 0x0022
	DMI type 20, 19 bytes.
	Memory Device Mapped Address
		Starting Address: 0x00020000000
		Ending Address: 0x0003FFFFFFF
		Range Size: 512 MB
		Physical Device Handle: 0x001E
		Memory Array Mapped Address Handle: 0x001F
		Partition Row Position: 1
Handle 0x0023
	DMI type 32, 11 bytes.
	System Boot Information
		Status: No errors detected
Handle 0x0024
	DMI type 127, 4 bytes.
	End Of Table


* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-21 22:41                 ` Len Brown
  2004-04-22  7:26                   ` Prakash K. Cheemplavam
@ 2004-04-22  8:45                   ` Craig Bradney
  2004-04-22 15:03                     ` Len Brown
  2004-04-22  8:50                   ` Arjen Verweij
                                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 93+ messages in thread
From: Craig Bradney @ 2004-04-22  8:45 UTC (permalink / raw)
  To: Len Brown
  Cc: Prakash K. Cheemplavam, ross, christian.kroener, linux-kernel,
	Maciej W. Rozycki, Jamie Lokier, Daniel Drake, Ian Kumlien,
	Jesse Allen, a.verweij, Allen Martin

[-- Attachment #1: Type: text/plain, Size: 6447 bytes --]

On Thu, 2004-04-22 at 00:41, Len Brown wrote:
> > Please send me the output from dmidecode, available in /usr/sbin/, or
> > > here:
> > > http://www.nongnu.org/dmidecode/
> > > or
> > > http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/
> 
> On Wed, 2004-04-21 at 17:28, Prakash K. Cheemplavam wrote:
> 
> > this is the output for Abit NF7-S Rev20 using bios d23. I have NOT 
> > activated APIC for this. Is it needed?
> 
> Yes, you need to enable ACPI and IOAPIC.  The goal of this patch
> is to address the XT-PIC timer issue in IOAPIC mode.
> 
> Here's the latest (vs 2.6.5).


Do we need any other patch? eg the idlec1halt patch? My Athlon still has
2.6.3 on it.

> I've got 1 Abit, 2 Asus, and 1 Shuttle DMI entry.  Let me know if the
> product names (1st line of dmidecode entry) are correct,
> these are not from DMI, but are supposed to be human-readable titles.

+ { ignore_timer_override, "Asus A7N8X v2", { 
> +			MATCH(DMI_BOARD_VENDOR, "ASUSTeK Computer INC."),
> +			MATCH(DMI_BOARD_NAME, "A7N8X2.0"),
> +			MATCH(DMI_BIOS_VERSION, "ASUS A7N8X2.0 Deluxe ACPI BIOS Rev 1007"),
> +			MATCH(DMI_BIOS_DATE, "10/06/2003") }},

My dmidecode output also shows (in the first BIOS information section):
Vendor: Phoenix Technologies, LTD
although the Manufacturer is ASUSTek Computer INC. from the Base Board
and System sections.

Not really sure about the code. If it matches on all of the above then it
might not work. I'll try a new kernel later today and see the result.

> I'm interested only in the latest BIOS -- if it is still broken.
> The assumption is that if a fixed BIOS is available, the users
> should upgrade.
> 

Yes, I just checked yesterday and there was nothing new.

thanks
Craig

> thanks,
> -Len
> 
> ps. latest BIOS on my shuttle has a C1 disconnect enable setting,
> (curiously, it is disabled by default) so I'll try to reproduce the hang
> on it...
> 
> ===== Documentation/kernel-parameters.txt 1.44 vs edited =====
> --- 1.44/Documentation/kernel-parameters.txt	Mon Mar 22 16:03:22 2004
> +++ edited/Documentation/kernel-parameters.txt	Wed Apr 21 15:28:12 2004
> @@ -122,6 +122,10 @@
>  
>  	acpi_serialize	[HW,ACPI] force serialization of AML methods
>  
> +	acpi_skip_timer_override [HW,ACPI]
> +			Recognize and ignore IRQ0/pin2 Interrupt Override.
> +			For broken nForce2 BIOS resulting in XT-PIC timer.
> +
>  	ad1816=		[HW,OSS]
>  			Format: <io>,<irq>,<dma>,<dma2>
>  			See also Documentation/sound/oss/AD1816.
> ===== arch/i386/kernel/dmi_scan.c 1.57 vs edited =====
> --- 1.57/arch/i386/kernel/dmi_scan.c	Fri Apr 16 22:03:06 2004
> +++ edited/arch/i386/kernel/dmi_scan.c	Wed Apr 21 18:29:35 2004
> @@ -540,6 +540,19 @@
>  #endif
>  
>  /*
> + * early nForce2 reference BIOS shipped with a
> + * bogus ACPI IRQ0 -> pin2 interrupt override -- ignore it
> + */
> +static __init int ignore_timer_override(struct dmi_blacklist *d)
> +{
> +	extern int acpi_skip_timer_override;
> +	printk(KERN_NOTICE "%s detected: BIOS IRQ0 pin2 override"
> +		" will be ignored\n", d->ident); 	
> +
> +	acpi_skip_timer_override = 1;
> +	return 0;
> +}
> +/*
>   *	Process the DMI blacklists
>   */
>   
> @@ -944,6 +957,37 @@
>  			MATCH(DMI_BOARD_VENDOR, "IBM"),
>  			MATCH(DMI_PRODUCT_NAME, "eserver xSeries 440"),
>  			NO_MATCH, NO_MATCH }},
> +
> +/*
> + * Systems with nForce2 BIOS timer override bug
> + * add Albatron KM18G Pro
> + * add DFI NFII 400-AL
> + * add Epox 8RGA+
> + * add Shuttle AN35N
> + */
> +	{ ignore_timer_override, "Abit NF7-S v2", {
> +			MATCH(DMI_BOARD_VENDOR, "http://www.abit.com.tw/"),
> +			MATCH(DMI_BOARD_NAME, "NF7-S/NF7,NF7-V (nVidia-nForce2)"),
> +			MATCH(DMI_BIOS_VERSION, "6.00 PG"),
> +			MATCH(DMI_BIOS_DATE, "03/24/2004") }},
> +
> +	{ ignore_timer_override, "Asus A7N8X v2", {
> +			MATCH(DMI_BOARD_VENDOR, "ASUSTeK Computer INC."),
> +			MATCH(DMI_BOARD_NAME, "A7N8X2.0"),
> +			MATCH(DMI_BIOS_VERSION, "ASUS A7N8X2.0 Deluxe ACPI BIOS Rev 1007"),
> +			MATCH(DMI_BIOS_DATE, "10/06/2003") }},
> +
> +	{ ignore_timer_override, "Asus A7N8X-X", {
> +			MATCH(DMI_BOARD_VENDOR, "ASUSTeK Computer INC."),
> +			MATCH(DMI_BOARD_NAME, "A7N8X-X"),
> +			MATCH(DMI_BIOS_VERSION, "ASUS A7N8X-X ACPI BIOS Rev 1007"),
> +			MATCH(DMI_BIOS_DATE, "10/07/2003") }},
> +
> +	{ ignore_timer_override, "Shuttle SN41G2", {
> +			MATCH(DMI_BOARD_VENDOR, "Shuttle Inc"),
> +			MATCH(DMI_BOARD_NAME, "FN41"),
> +			MATCH(DMI_BIOS_VERSION, "6.00 PG"),
> +			MATCH(DMI_BIOS_DATE, "01/14/2004") }},
>  #endif	// CONFIG_ACPI_BOOT
>  
>  #ifdef	CONFIG_ACPI_PCI
> ===== arch/i386/kernel/setup.c 1.115 vs edited =====
> --- 1.115/arch/i386/kernel/setup.c	Fri Apr  2 07:21:43 2004
> +++ edited/arch/i386/kernel/setup.c	Wed Apr 21 15:28:12 2004
> @@ -614,6 +614,9 @@
>  		else if (!memcmp(from, "acpi_sci=low", 12))
>  			acpi_sci_flags.polarity = 3;
>  
> +		else if (!memcmp(from, "acpi_skip_timer_override", 24))
> +			acpi_skip_timer_override = 1;
> +
>  #ifdef CONFIG_X86_LOCAL_APIC
>  		/* disable IO-APIC */
>  		else if (!memcmp(from, "noapic", 6))
> ===== arch/i386/kernel/acpi/boot.c 1.58 vs edited =====
> --- 1.58/arch/i386/kernel/acpi/boot.c	Tue Apr 20 20:54:03 2004
> +++ edited/arch/i386/kernel/acpi/boot.c	Wed Apr 21 15:28:13 2004
> @@ -62,6 +62,7 @@
>  
>  acpi_interrupt_flags acpi_sci_flags __initdata;
>  int acpi_sci_override_gsi __initdata;
> +int acpi_skip_timer_override __initdata;
>  
>  #ifdef CONFIG_X86_LOCAL_APIC
>  static u64 acpi_lapic_addr __initdata = APIC_DEFAULT_PHYS_BASE;
> @@ -327,6 +328,12 @@
>  		acpi_sci_ioapic_setup(intsrc->global_irq,
>  			intsrc->flags.polarity, intsrc->flags.trigger);
>  		return 0;
> +	}
> +
> +	if (acpi_skip_timer_override &&
> +		intsrc->bus_irq == 0 && intsrc->global_irq == 2) {
> +			printk(PREFIX "BIOS IRQ0 pin2 override ignored.\n");
> +			return 0;
>  	}
>  
>  	mp_override_legacy_irq (
> ===== include/asm-i386/acpi.h 1.18 vs edited =====
> --- 1.18/include/asm-i386/acpi.h	Tue Mar 30 17:05:19 2004
> +++ edited/include/asm-i386/acpi.h	Wed Apr 21 15:28:14 2004
> @@ -118,6 +118,7 @@
>  #ifdef CONFIG_X86_IO_APIC
>  extern int skip_ioapic_setup;
>  extern int acpi_irq_to_vector(u32 irq);	/* deprecated in favor of acpi_gsi_to_irq */
> +extern int acpi_skip_timer_override;
>  
>  static inline void disable_ioapic_setup(void)
>  {
> 
> 

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]


* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-21 22:41                 ` Len Brown
  2004-04-22  7:26                   ` Prakash K. Cheemplavam
  2004-04-22  8:45                   ` Craig Bradney
@ 2004-04-22  8:50                   ` Arjen Verweij
  2004-04-22 16:39                   ` Jesse Allen
  2004-05-01  6:51                   ` Prakash K. Cheemplavam
  4 siblings, 0 replies; 93+ messages in thread
From: Arjen Verweij @ 2004-04-22  8:50 UTC (permalink / raw)
  To: Len Brown
  Cc: Prakash K. Cheemplavam, Craig Bradney, ross, christian.kroener,
	linux-kernel, Maciej W. Rozycki, Jamie Lokier, Daniel Drake,
	Ian Kumlien, Jesse Allen, Allen Martin

Len,

Please bear in mind that the people from Shuttle are the only ones that
have seemingly fixed it, allegedly late in December. I only have data
for one Shuttle board, but I assume that if they (Shuttle) fixed it, they
fixed it for all boards. For Shuttle AN35N rev 1.1 there is a BIOS update
from 05-Dec-2003 that has probably addressed this issue.

So if you are looking to reproduce this hang, don't update your BIOS :)

Regards,

Arjen

On 21 Apr 2004, Len Brown wrote:

> > Please send me the output from dmidecode, available in /usr/sbin/, or
> > > here:
> > > http://www.nongnu.org/dmidecode/
> > > or
> > > http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/
>
> On Wed, 2004-04-21 at 17:28, Prakash K. Cheemplavam wrote:
>
> > this is the output for Abit NF7-S Rev20 using bios d23. I have NOT
> > activated APIC for this. Is it needed?
>
> Yes, you need to enable ACPI and IOAPIC.  The goal of this patch
> is to address the XT-PIC timer issue in IOAPIC mode.
>
> Here's the latest (vs 2.6.5).
>
> I've got 1 Abit, 2 Asus, and 1 Shuttle DMI entry.  Let me know if the
> product names (1st line of dmidecode entry) are correct,
> these are not from DMI, but are supposed to be human-readable titles.
>
> I'm interested only in the latest BIOS -- if it is still broken.
> The assumption is that if a fixed BIOS is available, the users
> should upgrade.
>
> thanks,
> -Len
>
> ps. latest BIOS on my shuttle has a C1 disconnect enable setting,
> (curiously, it is disabled by default) so I'll try to reproduce the hang
> on it...
>
> ===== Documentation/kernel-parameters.txt 1.44 vs edited =====
> --- 1.44/Documentation/kernel-parameters.txt	Mon Mar 22 16:03:22 2004
> +++ edited/Documentation/kernel-parameters.txt	Wed Apr 21 15:28:12 2004
> @@ -122,6 +122,10 @@
>
>  	acpi_serialize	[HW,ACPI] force serialization of AML methods
>
> +	acpi_skip_timer_override [HW,ACPI]
> +			Recognize and ignore IRQ0/pin2 Interrupt Override.
> +			For broken nForce2 BIOS resulting in XT-PIC timer.
> +
>  	ad1816=		[HW,OSS]
>  			Format: <io>,<irq>,<dma>,<dma2>
>  			See also Documentation/sound/oss/AD1816.
> ===== arch/i386/kernel/dmi_scan.c 1.57 vs edited =====
> --- 1.57/arch/i386/kernel/dmi_scan.c	Fri Apr 16 22:03:06 2004
> +++ edited/arch/i386/kernel/dmi_scan.c	Wed Apr 21 18:29:35 2004
> @@ -540,6 +540,19 @@
>  #endif
>
>  /*
> + * early nForce2 reference BIOS shipped with a
> + * bogus ACPI IRQ0 -> pin2 interrupt override -- ignore it
> + */
> +static __init int ignore_timer_override(struct dmi_blacklist *d)
> +{
> +	extern int acpi_skip_timer_override;
> +	printk(KERN_NOTICE "%s detected: BIOS IRQ0 pin2 override"
> +		" will be ignored\n", d->ident);
> +
> +	acpi_skip_timer_override = 1;
> +	return 0;
> +}
> +/*
>   *	Process the DMI blacklists
>   */
>
> @@ -944,6 +957,37 @@
>  			MATCH(DMI_BOARD_VENDOR, "IBM"),
>  			MATCH(DMI_PRODUCT_NAME, "eserver xSeries 440"),
>  			NO_MATCH, NO_MATCH }},
> +
> +/*
> + * Systems with nForce2 BIOS timer override bug
> + * add Albatron KM18G Pro
> + * add DFI NFII 400-AL
> + * add Epox 8RGA+
> + * add Shuttle AN35N
> + */
> +	{ ignore_timer_override, "Abit NF7-S v2", {
> +			MATCH(DMI_BOARD_VENDOR, "http://www.abit.com.tw/"),
> +			MATCH(DMI_BOARD_NAME, "NF7-S/NF7,NF7-V (nVidia-nForce2)"),
> +			MATCH(DMI_BIOS_VERSION, "6.00 PG"),
> +			MATCH(DMI_BIOS_DATE, "03/24/2004") }},
> +
> +	{ ignore_timer_override, "Asus A7N8X v2", {
> +			MATCH(DMI_BOARD_VENDOR, "ASUSTeK Computer INC."),
> +			MATCH(DMI_BOARD_NAME, "A7N8X2.0"),
> +			MATCH(DMI_BIOS_VERSION, "ASUS A7N8X2.0 Deluxe ACPI BIOS Rev 1007"),
> +			MATCH(DMI_BIOS_DATE, "10/06/2003") }},
> +
> +	{ ignore_timer_override, "Asus A7N8X-X", {
> +			MATCH(DMI_BOARD_VENDOR, "ASUSTeK Computer INC."),
> +			MATCH(DMI_BOARD_NAME, "A7N8X-X"),
> +			MATCH(DMI_BIOS_VERSION, "ASUS A7N8X-X ACPI BIOS Rev 1007"),
> +			MATCH(DMI_BIOS_DATE, "10/07/2003") }},
> +
> +	{ ignore_timer_override, "Shuttle SN41G2", {
> +			MATCH(DMI_BOARD_VENDOR, "Shuttle Inc"),
> +			MATCH(DMI_BOARD_NAME, "FN41"),
> +			MATCH(DMI_BIOS_VERSION, "6.00 PG"),
> +			MATCH(DMI_BIOS_DATE, "01/14/2004") }},
>  #endif	// CONFIG_ACPI_BOOT
>
>  #ifdef	CONFIG_ACPI_PCI
> ===== arch/i386/kernel/setup.c 1.115 vs edited =====
> --- 1.115/arch/i386/kernel/setup.c	Fri Apr  2 07:21:43 2004
> +++ edited/arch/i386/kernel/setup.c	Wed Apr 21 15:28:12 2004
> @@ -614,6 +614,9 @@
>  		else if (!memcmp(from, "acpi_sci=low", 12))
>  			acpi_sci_flags.polarity = 3;
>
> +		else if (!memcmp(from, "acpi_skip_timer_override", 24))
> +			acpi_skip_timer_override = 1;
> +
>  #ifdef CONFIG_X86_LOCAL_APIC
>  		/* disable IO-APIC */
>  		else if (!memcmp(from, "noapic", 6))
> ===== arch/i386/kernel/acpi/boot.c 1.58 vs edited =====
> --- 1.58/arch/i386/kernel/acpi/boot.c	Tue Apr 20 20:54:03 2004
> +++ edited/arch/i386/kernel/acpi/boot.c	Wed Apr 21 15:28:13 2004
> @@ -62,6 +62,7 @@
>
>  acpi_interrupt_flags acpi_sci_flags __initdata;
>  int acpi_sci_override_gsi __initdata;
> +int acpi_skip_timer_override __initdata;
>
>  #ifdef CONFIG_X86_LOCAL_APIC
>  static u64 acpi_lapic_addr __initdata = APIC_DEFAULT_PHYS_BASE;
> @@ -327,6 +328,12 @@
>  		acpi_sci_ioapic_setup(intsrc->global_irq,
>  			intsrc->flags.polarity, intsrc->flags.trigger);
>  		return 0;
> +	}
> +
> +	if (acpi_skip_timer_override &&
> +		intsrc->bus_irq == 0 && intsrc->global_irq == 2) {
> +			printk(PREFIX "BIOS IRQ0 pin2 override ignored.\n");
> +			return 0;
>  	}
>
>  	mp_override_legacy_irq (
> ===== include/asm-i386/acpi.h 1.18 vs edited =====
> --- 1.18/include/asm-i386/acpi.h	Tue Mar 30 17:05:19 2004
> +++ edited/include/asm-i386/acpi.h	Wed Apr 21 15:28:14 2004
> @@ -118,6 +118,7 @@
>  #ifdef CONFIG_X86_IO_APIC
>  extern int skip_ioapic_setup;
>  extern int acpi_irq_to_vector(u32 irq);	/* deprecated in favor of acpi_gsi_to_irq */
> +extern int acpi_skip_timer_override;
>
>  static inline void disable_ioapic_setup(void)
>  {
>
>
>



* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-22  7:26                   ` Prakash K. Cheemplavam
@ 2004-04-22 14:58                     ` Len Brown
  0 siblings, 0 replies; 93+ messages in thread
From: Len Brown @ 2004-04-22 14:58 UTC (permalink / raw)
  To: Prakash K. Cheemplavam
  Cc: Craig Bradney, ross, christian.kroener, linux-kernel,
	Maciej W. Rozycki, Jamie Lokier, Daniel Drake, Ian Kumlien,
	Jesse Allen, a.verweij, Allen Martin

On Thu, 2004-04-22 at 03:26, Prakash K. Cheemplavam wrote:
> Len Brown wrote:

> > Yes, you need to enable ACPI and IOAPIC.  The goal of this patch
> > is to address the XT-PIC timer issue in IOAPIC mode.
> 
> Ok, I recompiled using your (former) patch and Ross' apic tack patch. I 
> attached the new dmidecode Text.

Actually dmidecode dumps hard-coded BIOS data, so it will not change
unless you upgrade your BIOS.

> > I've got 1 Abit, 2 Asus, and 1 Shuttle DMI entry.  Let me know if the
> > product names (1st line of dmidecode entry) are correct,
> > these are not from DMI, but are supposed to be human-readable titles.
> 
> Are you referring to (as the first line doesn't say much):
> 
> Product Name: NF7-S/NF7,NF7-V (nVidia-nForce2)
> Version: 2.X,1.0


+       { ignore_timer_override, "Abit NF7-S v2", {

This one is for humans and anything can be in the string.

+                       MATCH(DMI_BOARD_VENDOR, "http://www.abit.com.tw/"),
+                       MATCH(DMI_BOARD_NAME, "NF7-S/NF7,NF7-V (nVidia-nForce2)"),
+                       MATCH(DMI_BIOS_VERSION, "6.00 PG"),
+                       MATCH(DMI_BIOS_DATE, "03/24/2004") }},

These are keys in the DMI table, and have to match the BIOS (as seen in
dmidecode) exactly.

> Seems pretty much OK, though I don't understand why 1.0 is in the 
> Version string. Furthermore, I don't understand why "Phoenix" appears as 
> the BIOS vendor. It should be Award, AFAIK.

Phoenix and Award merged.
It doesn't really matter what it says; to Linux it is just a string
compare.  Also, I chose not to look at the BIOS vendor in this example
because it adds no value. Here we're just looking at BOARD vendor & name,
plus BIOS version and date.

Thanks for confirming that the entry matched your system and that the
patch triggered automatically.

-Len




* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-22  8:45                   ` Craig Bradney
@ 2004-04-22 15:03                     ` Len Brown
  2004-04-22 20:50                       ` Craig Bradney
  0 siblings, 1 reply; 93+ messages in thread
From: Len Brown @ 2004-04-22 15:03 UTC (permalink / raw)
  To: Craig Bradney
  Cc: Prakash K. Cheemplavam, ross, christian.kroener, linux-kernel,
	Maciej W. Rozycki, Jamie Lokier, Daniel Drake, Ian Kumlien,
	Jesse Allen, a.verweij, Allen Martin

On Thu, 2004-04-22 at 04:45, Craig Bradney wrote:

> > Yes, you need to enable ACPI and IOAPIC.  The goal of this patch
> > is to address the XT-PIC timer issue in IOAPIC mode.
> > 
> > Here's the latest (vs 2.6.5).
> 
> 
> Do we need any other patch? eg the idlec1halt patch? My Athlon still has
> 2.6.3 on it.

If you needed idlec1halt before, you still need it.
This patch just addresses the XT-PIC timer issue.

> 
> > I've got 1 Abit, 2 Asus, and 1 Shuttle DMI entry.  Let me know if the
> > product names (1st line of dmidecode entry) are correct,
> > these are not from DMI, but are supposed to be human-readable titles.
> 
> + { ignore_timer_override, "Asus A7N8X v2", { 
> > +			MATCH(DMI_BOARD_VENDOR, "ASUSTeK Computer INC."),
> > +			MATCH(DMI_BOARD_NAME, "A7N8X2.0"),
> > +			MATCH(DMI_BIOS_VERSION, "ASUS A7N8X2.0 Deluxe ACPI BIOS Rev 1007"),
> > +			MATCH(DMI_BIOS_DATE, "10/06/2003") }},
> 
> My dmidecode output also shows (in the first BIOS information section):
> Vendor: Phoenix Technologies, LTD
> although the Manufacturer is ASUSTek Computer INC. from the Base Board
> and System sections.

Right, DMI has separate sections for System, Board, BIOS, and we're
using two pieces from the BOARD and two pieces from the BIOS sections.

> Not really sure about the code. If it matches on all of the above then it
> might not work. I'll try a new kernel later today and see the result.

The workaround is triggered only if all the MATCH()'s above match.
If it doesn't trigger, then either I munged it on copy out of dmidecode
or you've got a different BIOS and we need a new dmidecode...
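
To make the all-or-nothing behaviour concrete, here is a minimal standalone
sketch in C. It is not the code from dmi_scan.c: the names dmi_slot,
dmi_strings, match, entry and entry_matches are hypothetical, and the use of
a substring test (strstr) is an assumption of the sketch. What it shows is
that the ident title is never compared, and that a single key that fails to
match leaves the workaround disabled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical field slots, loosely modelled on the DMI_* keys in the patch. */
enum dmi_slot {
	SLOT_NONE,
	SLOT_BIOS_VERSION,
	SLOT_BIOS_DATE,
	SLOT_BOARD_VENDOR,
	SLOT_BOARD_NAME,
	SLOT_COUNT
};

/* Strings reported by the running system, indexed by slot. */
struct dmi_strings {
	const char *field[SLOT_COUNT];
};

/* One key of a blacklist entry: which field to look at, and what to find. */
struct match {
	enum dmi_slot slot;
	const char *substr;
};

struct entry {
	const char *ident;		/* human-readable title, never compared */
	struct match matches[4];
};

/* Return 1 only if every key in use is found in the system's strings. */
int entry_matches(const struct entry *e, const struct dmi_strings *sys)
{
	size_t i;

	assert(e != NULL && sys != NULL);

	for (i = 0; i < 4; i++) {
		enum dmi_slot s = e->matches[i].slot;

		if (s == SLOT_NONE)
			continue;	/* unused key: nothing to check */
		if (sys->field[s] == NULL ||
		    strstr(sys->field[s], e->matches[i].substr) == NULL)
			return 0;	/* one miss: workaround stays off */
	}
	return 1;
}
```

With an entry keyed on board vendor, board name, BIOS version and BIOS date,
a system whose BIOS date differs (for instance after a BIOS upgrade) no
longer matches, which is why a new dmidecode dump is needed in that case.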

> > I'm interested only in the latest BIOS -- if it is still broken.
> > The assumption is that if a fixed BIOS is available, the users
> > should upgrade.
> > 
> 
> Yes, I just checked yesterday and there was nothing new.

thanks,
-Len




* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-21 22:41                 ` Len Brown
                                     ` (2 preceding siblings ...)
  2004-04-22  8:50                   ` Arjen Verweij
@ 2004-04-22 16:39                   ` Jesse Allen
  2004-04-22 17:21                     ` Len Brown
  2004-05-01  6:51                   ` Prakash K. Cheemplavam
  4 siblings, 1 reply; 93+ messages in thread
From: Jesse Allen @ 2004-04-22 16:39 UTC (permalink / raw)
  To: Len Brown
  Cc: Prakash K. Cheemplavam, Craig Bradney, ross, christian.kroener,
	linux-kernel, Maciej W. Rozycki, Jamie Lokier, Daniel Drake,
	Ian Kumlien, a.verweij, Allen Martin

[-- Attachment #1: Type: text/plain, Size: 1221 bytes --]

On Wed, Apr 21, 2004 at 06:41:38PM -0400, Len Brown wrote:
> > Please send me the output from dmidecode, available in /usr/sbin/, or
> 
> I've got 1 Abit, 2 Asus, and 1 Shuttle DMI entry.  Let me know if the
> product names (1st line of dmidecode entry) are correct,
> these are not from DMI, but are supposed to be human-readable titles.
> 
> I'm interested only in the latest BIOS -- if it is still broken.
> The assumption is that if a fixed BIOS is available, the users
> should upgrade.
> 
> thanks,
> -Len
> 
> ps. latest BIOS on my shuttle has a C1 disconnect enable setting,
> (curiously, it is disabled by default) so I'll try to reproduce the hang
> on it...
> 


On the Shuttle AN35N, the C1 disconnect option default is auto.  If you're
talking about this board, or another board Shuttle seemingly fixed, then I
can tell you that I haven't been able to get mine to hang with vanilla kernels.

As for your patch, I get a fast timer, and gain about 1 sec per 5 minutes.
The only patch that seemed to work without a fast timer so far was the one 
removed by Linus in a testing version.  The AN35N has the timer override 
bug.

Attached is the dmidecode for the AN35N.  Note: onboard sound may be disabled.

Jesse


[-- Attachment #2: junk --]
[-- Type: text/plain, Size: 11071 bytes --]

# dmidecode 2.4
SMBIOS 2.2 present.
37 structures occupying 937 bytes.
Table at 0x000F0000.
Handle 0x0000
	DMI type 0, 19 bytes.
	BIOS Information
		Vendor: Phoenix Technologies, LTD
		Version: 6.00 PG
		Release Date: 12/05/2003
		Address: 0xE0000
		Runtime Size: 128 kB
		ROM Size: 256 kB
		Characteristics:
			ISA is supported
			PCI is supported
			PNP is supported
			APM is supported
			BIOS is upgradeable
			BIOS shadowing is allowed
			Boot from CD is supported
			Selectable boot is supported
			BIOS ROM is socketed
			EDD is supported
			5.25"/360 KB floppy services are supported (int 13h)
			5.25"/1.2 MB floppy services are supported (int 13h)
			3.5"/720 KB floppy services are supported (int 13h)
			3.5"/2.88 MB floppy services are supported (int 13h)
			Print screen service is supported (int 5h)
			8042 keyboard services are supported (int 9h)
			Serial services are supported (int 14h)
			Printer services are supported (int 17h)
			CGA/mono video services are supported (int 10h)
			ACPI is supported
			USB legacy is supported
			AGP is supported
			LS-120 boot is supported
			ATAPI Zip drive boot is supported
Handle 0x0001
	DMI type 1, 25 bytes.
	System Information
		Manufacturer:  
		Product Name:  
		Version:  
		Serial Number:  
		UUID: 1297A535-FFFF-FFFF-FFFF-FFFFFFFFFFFF
		Wake-up Type: Power Switch
Handle 0x0002
	DMI type 2, 8 bytes.
	Base Board Information
		Manufacturer: Shuttle Inc
		Product Name: AN35 
		Version:  
		Serial Number:  
Handle 0x0003
	DMI type 3, 13 bytes.
	Chassis Information
		Manufacturer:  
		Type: Desktop
		Lock: Not Present
		Version:  
		Serial Number:  
		Asset Tag:  
		Boot-up State: Unknown
		Power Supply State: Unknown
		Thermal State: Unknown
		Security Status: Unknown
Handle 0x0004
	DMI type 4, 32 bytes.
	Processor Information
		Socket Designation: Socket A
		Type: Central Processor
		Family: Duron
		Manufacturer: AMD
		ID: A0 06 00 00 FF FB 83 03
		Signature: Family 6, Model A, Stepping 0
		Flags:
			FPU (Floating-point unit on-chip)
			VME (Virtual mode extension)
			DE (Debugging extension)
			PSE (Page size extension)
			TSC (Time stamp counter)
			MSR (Model specific registers)
			PAE (Physical address extension)
			MCE (Machine check exception)
			CX8 (CMPXCHG8 instruction supported)
			APIC (On-chip APIC hardware supported)
			SEP (Fast system call)
			MTRR (Memory type range registers)
			PGE (Page global enable)
			MCA (Machine check architecture)
			CMOV (Conditional move instruction supported)
			PAT (Page attribute table)
			PSE-36 (36-bit page size extension)
			MMX (MMX technology supported)
			FXSR (Fast floating-point save and restore)
			SSE (Streaming SIMD extensions)
		Version: AMD Athlon(tm) XP
		Voltage: 1.6 V
		External Clock: 166 MHz
		Max Speed: 2000 MHz
		Current Speed: 1916 MHz
		Status: Populated, Enabled
		Upgrade: ZIF Socket
		L1 Cache Handle: 0x0009
		L2 Cache Handle: 0x000A
		L3 Cache Handle: No L3 Cache
Handle 0x0005
	DMI type 5, 22 bytes.
	Memory Controller Information
		Error Detecting Method: 8-bit Parity
		Error Correcting Capabilities:
			None
		Supported Interleave: One-way Interleave
		Current Interleave: One-way Interleave
		Maximum Memory Module Size: 32 MB
		Maximum Total Memory Size: 96 MB
		Supported Speeds:
			70 ns
			60 ns
		Supported Memory Types:
			Standard
			EDO
		Memory Module Voltage: 5.0 V
		Associated Memory Slots: 3
			0x0006
			0x0007
			0x0008
		Enabled Error Correcting Capabilities: None
Handle 0x0006
	DMI type 6, 12 bytes.
	Memory Module Information
		Socket Designation: A0
		Bank Connections: None
		Current Speed: 10 ns
		Type: Other
		Installed Size: Not Installed (Single-bank Connection)
		Enabled Size: Not Installed (Single-bank Connection)
		Error Status: OK
Handle 0x0007
	DMI type 6, 12 bytes.
	Memory Module Information
		Socket Designation: A1
		Bank Connections: 2
		Current Speed: 10 ns
		Type: Other
		Installed Size: 256 MB (Single-bank Connection)
		Enabled Size: 256 MB (Single-bank Connection)
		Error Status: OK
Handle 0x0008
	DMI type 6, 12 bytes.
	Memory Module Information
		Socket Designation: A2
		Bank Connections: None
		Current Speed: 10 ns
		Type: Other
		Installed Size: Not Installed (Single-bank Connection)
		Enabled Size: Not Installed (Single-bank Connection)
		Error Status: OK
Handle 0x0009
	DMI type 7, 19 bytes.
	Cache Information
		Socket Designation: Internal Cache
		Configuration: Enabled, Not Socketed, Level 1
		Operational Mode: Write Back
		Location: Internal
		Installed Size: 128 KB
		Maximum Size: 128 KB
		Supported SRAM Types:
			Synchronous
		Installed SRAM Type: Synchronous
		Speed: Unknown
		Error Correction Type: Unknown
		System Type: Unknown
		Associativity: Unknown
Handle 0x000A
	DMI type 7, 19 bytes.
	Cache Information
		Socket Designation: External Cache
		Configuration: Enabled, Not Socketed, Level 2
		Operational Mode: Write Back
		Location: External
		Installed Size: 512 KB
		Maximum Size: 512 KB
		Supported SRAM Types:
			Synchronous
		Installed SRAM Type: Synchronous
		Speed: Unknown
		Error Correction Type: Unknown
		System Type: Unknown
		Associativity: Unknown
Handle 0x000B
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: PRIMARY IDE
		Internal Connector Type: On Board IDE
		External Reference Designator: Not Specified
		External Connector Type: None
		Port Type: Other
Handle 0x000C
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: SECONDARY IDE
		Internal Connector Type: On Board IDE
		External Reference Designator: Not Specified
		External Connector Type: None
		Port Type: Other
Handle 0x000D
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: FDD
		Internal Connector Type: On Board Floppy
		External Reference Designator: Not Specified
		External Connector Type: None
		Port Type: 8251 FIFO Compatible
Handle 0x000E
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: COM1
		Internal Connector Type: 9 Pin Dual Inline (pin 10 cut)
		External Reference Designator:  
		External Connector Type: DB-9 male
		Port Type: Serial Port 16450 Compatible
Handle 0x000F
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: COM2
		Internal Connector Type: 9 Pin Dual Inline (pin 10 cut)
		External Reference Designator:  
		External Connector Type: DB-9 male
		Port Type: Serial Port 16450 Compatible
Handle 0x0010
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: LPT1
		Internal Connector Type: DB-25 female
		External Reference Designator:  
		External Connector Type: DB-25 female
		Port Type: Parallel Port ECP/EPP
Handle 0x0011
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: Keyboard
		Internal Connector Type: PS/2
		External Reference Designator:  
		External Connector Type: PS/2
		Port Type: Keyboard Port
Handle 0x0012
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: PS/2 Mouse
		Internal Connector Type: PS/2
		External Reference Designator:  
		External Connector Type: PS/2
		Port Type: Mouse Port
Handle 0x0013
	DMI type 8, 9 bytes.
	Port Connector Information
		Internal Reference Designator: Not Specified
		Internal Connector Type: None
		External Reference Designator: USB0
		External Connector Type: Other
		Port Type: USB
Handle 0x0014
	DMI type 9, 13 bytes.
	System Slot Information
		Designation: PCI0
		Type: 32-bit PCI
		Current Usage: In Use
		Length: Long
		ID: 1
		Characteristics:
			5.0 V is provided
			PME signal is supported
Handle 0x0015
	DMI type 9, 13 bytes.
	System Slot Information
		Designation: PCI1
		Type: 32-bit PCI
		Current Usage: In Use
		Length: Long
		ID: 2
		Characteristics:
			5.0 V is provided
			PME signal is supported
Handle 0x0016
	DMI type 9, 13 bytes.
	System Slot Information
		Designation: PCI2
		Type: 32-bit PCI
		Current Usage: Available
		Length: Long
		ID: 3
		Characteristics:
			5.0 V is provided
			PME signal is supported
Handle 0x0017
	DMI type 9, 13 bytes.
	System Slot Information
		Designation: PCI3
		Type: 32-bit PCI
		Current Usage: Available
		Length: Long
		ID: 4
		Characteristics:
			5.0 V is provided
			PME signal is supported
Handle 0x0018
	DMI type 9, 13 bytes.
	System Slot Information
		Designation: PCI4
		Type: 32-bit PCI
		Current Usage: In Use
		Length: Long
		ID: 5
		Characteristics:
			5.0 V is provided
			PME signal is supported
Handle 0x0019
	DMI type 9, 13 bytes.
	System Slot Information
		Designation: AGP
		Type: 32-bit AGP
		Current Usage: Available
		Length: Long
		ID: 240
		Characteristics:
			5.0 V is provided
Handle 0x001A
	DMI type 13, 22 bytes.
	BIOS Language Information
		Installable Languages: 3
			n|US|iso8859-1
			n|US|iso8859-1
			r|CA|iso8859-1
		Currently Installed Language: n|US|iso8859-1
Handle 0x001B
	DMI type 16, 15 bytes.
	Physical Memory Array
		Location: System Board Or Motherboard
		Use: System Memory
		Error Correction Type: None
		Maximum Capacity: 1536 MB
		Error Information Handle: Not Provided
		Number Of Devices: 3
Handle 0x001C
	DMI type 17, 21 bytes.
	Memory Device
		Array Handle: 0x001B
		Error Information Handle: Not Provided
		Total Width: Unknown
		Data Width: Unknown
		Size: No Module Installed
		Form Factor: DIMM
		Set: None
		Locator: A0
		Bank Locator: Bank0/1
		Type: Unknown
		Type Detail: None
Handle 0x001D
	DMI type 17, 21 bytes.
	Memory Device
		Array Handle: 0x001B
		Error Information Handle: Not Provided
		Total Width: Unknown
		Data Width: Unknown
		Size: 256 MB
		Form Factor: DIMM
		Set: None
		Locator: A1
		Bank Locator: Bank2/3
		Type: Unknown
		Type Detail: None
Handle 0x001E
	DMI type 17, 21 bytes.
	Memory Device
		Array Handle: 0x001B
		Error Information Handle: Not Provided
		Total Width: Unknown
		Data Width: Unknown
		Size: No Module Installed
		Form Factor: DIMM
		Set: None
		Locator: A2
		Bank Locator: Bank4/5
		Type: Unknown
		Type Detail: None
Handle 0x001F
	DMI type 19, 15 bytes.
	Memory Array Mapped Address
		Starting Address: 0x00000000000
		Ending Address: 0x0000FFFFFFF
		Range Size: 256 MB
		Physical Array Handle: 0x001B
		Partition Width: 0
Handle 0x0020
	DMI type 20, 19 bytes.
	Memory Device Mapped Address
		Starting Address: 0x00000000000
		Ending Address: 0x000000003FF
		Range Size: 1 kB
		Physical Device Handle: 0x001C
		Memory Array Mapped Address Handle: 0x001F
		Partition Row Position: 1
Handle 0x0021
	DMI type 20, 19 bytes.
	Memory Device Mapped Address
		Starting Address: 0x00000000000
		Ending Address: 0x0000FFFFFFF
		Range Size: 256 MB
		Physical Device Handle: 0x001D
		Memory Array Mapped Address Handle: 0x001F
		Partition Row Position: 1
Handle 0x0022
	DMI type 20, 19 bytes.
	Memory Device Mapped Address
		Starting Address: 0x00000000000
		Ending Address: 0x000000003FF
		Range Size: 1 kB
		Physical Device Handle: 0x001E
		Memory Array Mapped Address Handle: 0x001F
		Partition Row Position: 1
Handle 0x0023
	DMI type 32, 11 bytes.
	System Boot Information
		Status: No errors detected
Handle 0x0024
	DMI type 127, 4 bytes.
	End Of Table

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-22 16:39                   ` Jesse Allen
@ 2004-04-22 17:21                     ` Len Brown
  2004-04-22 21:29                       ` Len Brown
  2004-04-26 11:41                       ` IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5 Ross Dickson
  0 siblings, 2 replies; 93+ messages in thread
From: Len Brown @ 2004-04-22 17:21 UTC (permalink / raw)
  To: Jesse Allen
  Cc: Prakash K. Cheemplavam, Craig Bradney, ross, christian.kroener,
	linux-kernel, Maciej W. Rozycki, Jamie Lokier, Daniel Drake,
	Ian Kumlien, a.verweij, Allen Martin

On Thu, 2004-04-22 at 12:39, Jesse Allen wrote:

> On the Shuttle AN35N, the C1 disconnect option default is auto.  If you're
> talking about this board, or another board Shuttle seemingly fixed, then I
> can tell you that I haven't been able to get mine to hang with vanilla kernels.

Have you been able to hang the AN35N under any conditions?
Old BIOS, non-vanilla kernel?

> As for your patch, I get a fast timer, and gain about 1 sec per 5 minutes.
> The only patch that seemed to work without a fast timer so far was the one 
> removed by Linus in a testing version.  The AN35N has the timer override 
> bug.

Hmm, I didn't notice fast time on my FN41, I'll look for it.

I'm not familiar with the "one removed by Linus in a testing version",
perhaps you could point me to that?

> Attached is the dmidecode for the AN35N.

applied.

thanks,
-Len



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-22 15:03                     ` Len Brown
@ 2004-04-22 20:50                       ` Craig Bradney
  0 siblings, 0 replies; 93+ messages in thread
From: Craig Bradney @ 2004-04-22 20:50 UTC (permalink / raw)
  To: Len Brown
  Cc: Prakash K. Cheemplavam, ross, christian.kroener, linux-kernel,
	Maciej W. Rozycki, Jamie Lokier, Daniel Drake, Ian Kumlien,
	Jesse Allen, a.verweij, Allen Martin

[-- Attachment #1: Type: text/plain, Size: 2943 bytes --]

On Thu, 2004-04-22 at 17:03, Len Brown wrote:
> On Thu, 2004-04-22 at 04:45, Craig Bradney wrote:
> 
> > > Yes, you need to enable ACPI and IOAPIC.  The goal of this patch
> > > is to address the XT-PIC timer issue in IOAPIC mode.
> > > 
> > > Here's the latest (vs 2.6.5).
> > 
> > 
> > Do we need any other patch? eg the idlec1halt patch? My Athlon still has
> > 2.6.3 on it.
> 
> If you needed idlec1halt before, you still need it.
> This patch just addresses the XT-PIC timer issue.
> 
> > 
> > > I've got 1 Abit, 2 Asus, and 1 Shuttle DMI entry.  Let me know if the
> > > product names (1st line of dmidecode entry) are correct,
> > > these are not from DMI, but are supposed to be human-readable titles.
> > 
> > + { ignore_timer_override, "Asus A7N8X v2", { 
> > > +			MATCH(DMI_BOARD_VENDOR, "ASUSTeK Computer INC."),
> > > +			MATCH(DMI_BOARD_NAME, "A7N8X2.0"),
> > > +			MATCH(DMI_BIOS_VERSION, "ASUS A7N8X2.0 Deluxe ACPI BIOS Rev 1007"),
> > > +			MATCH(DMI_BIOS_DATE, "10/06/2003") }},
> > 
> > my dmidecode output also shows (in the first BIOS information section):
> > Vendor: Phoenix Technologies, LTD
> > although the Manufacturer is ASUSTek Computer INC. from the Base Board
> > and System sections.
> 
> Right, DMI has separate sections for System, Board, BIOS, and we're
> using two pieces from the BOARD and two pieces from the BIOS sections.
> 
> > Not really sure about the code. If it matches on all of above then it
> > might not work. I'll try a new kernel later today and see the result.
> 
> The workaround is triggered only if all the MATCH()'s above match.
> If it doesn't trigger, then either I munged it on copy out of dmidecode
> or you've got a different BIOS and we need a new dmidecode...
> 
> > > I'm interested only in the latest BIOS -- if it is still broken.
> > > The assumption is that if a fixed BIOS is available, the users
> > > should upgrade.
> > > 
> > 
> > Yes, I just checked yesterday and there was nothing new.

[Have sent this email with attachments directly to Len, attachments are
just /proc/interrupts and dmesg output. If someone is interested, please
ask for them]

Hi Len

Please find attached /proc/interrupts and dmesg from 3 boots, 2 with new
kernel.

263 : gentoo-dev-sources-r1 2.6.3 kernel with Ross Dickson's idleC1halt
and IOAPIC patches.

265: gentoo-dev-sources-r1 2.6.5 kernel with Ross Dickson's idleC1halt
for 2.6.5 kernel only. Note in 265pi (/proc/interrupts):
0:      54821          XT-PIC  timer

265-lb: gentoo-dev-sources-r1 2.6.5 kernel with Ross Dickson's
idleC1halt for 2.6.5 kernel and your patch for the interrupt. 
Note in 265pi-lb (/proc/interrupts):
0:      51144    IO-APIC-edge  timer

so.. looks good here. :) I was surprised to see this effect with no boot
kernel option though. Having read the code I see you set the value to 1
and therefore on. Seems fine to me.
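
The check done by eye on the two /proc/interrupts lines above can be
sketched in C. This is only an illustration: the function names are
hypothetical, and the parser assumes the uniprocessor layout shown above
(a single count column between the IRQ number and the controller name).

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one uniprocessor /proc/interrupts line such as
 *   "  0:      54821          XT-PIC  timer"
 * ctrl and name must each have room for 32 bytes.
 * Returns 0 on success, -1 if the line does not have that shape.
 */
int parse_irq_line(const char *line, int *irq, unsigned long *count,
		   char *ctrl, char *name)
{
	if (sscanf(line, " %d: %lu %31s %31s", irq, count, ctrl, name) != 4)
		return -1;
	return 0;
}

/* Returns 1 if the line shows IRQ 0 routed through the legacy PIC. */
int timer_on_xt_pic(const char *line)
{
	int irq;
	unsigned long count;
	char ctrl[32], name[32];

	if (parse_irq_line(line, &irq, &count, ctrl, name) != 0)
		return 0;
	return irq == 0 && strcmp(ctrl, "XT-PIC") == 0;
}
```

Fed the 265pi line, timer_on_xt_pic() returns 1; fed the 265pi-lb line it
returns 0, since the controller column reads IO-APIC-edge.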

regards
Craig

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-22 17:21                     ` Len Brown
@ 2004-04-22 21:29                       ` Len Brown
  2004-04-23  8:48                         ` Prakash K. Cheemplavam
                                           ` (2 more replies)
  2004-04-26 11:41                       ` IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5 Ross Dickson
  1 sibling, 3 replies; 93+ messages in thread
From: Len Brown @ 2004-04-22 21:29 UTC (permalink / raw)
  To: Jesse Allen
  Cc: Prakash K. Cheemplavam, Craig Bradney, ross, christian.kroener,
	linux-kernel, Maciej W. Rozycki, Jamie Lokier, Daniel Drake,
	Ian Kumlien, a.verweij, Allen Martin

On Thu, 2004-04-22 at 13:21, Len Brown wrote:

> > As for your patch, I get a fast timer, and gain about 1 sec per 5 minutes.
> > The only patch that seemed to work without a fast timer so far was the one 
> > removed by Linus in a testing version.  The AN35N has the timer override 
> > bug.
> 
> Hmm, I didn't notice fast time on my FN41, i'll look for it.
> 
> I'm not familiar with the "one removed by Linux in a testing version",
> perhaps you could point me to that?

date seems to gain 9sec/hour on my Shuttle/SN41G2/FN41 when using IOAPIC
timer.
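
To compare the two drift figures in this thread (about 1 sec per 5 minutes
on the AN35N, 9 sec per hour here), they can be put on a common scale.
The helper names below are made up for illustration; the arithmetic is
just seconds gained divided by seconds elapsed.

```c
#include <stdio.h>

/* Drift in parts per million: seconds gained over seconds elapsed. */
double drift_ppm(double gained_s, double elapsed_s)
{
	return gained_s / elapsed_s * 1e6;
}

/* Seconds gained per day at a given drift rate. */
double drift_s_per_day(double ppm)
{
	return ppm * 1e-6 * 86400.0;
}

/* Print one report line for a labelled measurement. */
void report_drift(const char *label, double gained_s, double elapsed_s)
{
	double ppm = drift_ppm(gained_s, elapsed_s);

	printf("%-10s %7.1f ppm  %6.1f s/day\n",
	       label, ppm, drift_s_per_day(ppm));
}
```

Calling report_drift("AN35N", 1, 300) and report_drift("SN41G2", 9, 3600)
gives roughly 3333 ppm (288 s/day) and 2500 ppm (216 s/day), so the two
boards drift by a similar order of magnitude but not by the same amount.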

booted with "noapic" for XT-PIC timer, it stays locked
onto my wristwatch after an hour.  If the workaround is disabled,
and XT-PIC timer is used, it matches the "noapic" behaviour -- no drift.

I can't explain it.  I think it is a timer problem independent of the
IRQ routing.

-Len

ps. when i ran in XT-PIC mode there were lots of ERR's registered in
/proc/interrupts -- doesn't look healthy.




^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-22 21:29                       ` Len Brown
@ 2004-04-23  8:48                         ` Prakash K. Cheemplavam
  2004-04-23  9:01                           ` Arjen Verweij
  2004-04-23 12:18                         ` Maciej W. Rozycki
  2004-04-27  7:57                         ` ACPI broken on nforce2? Prakash K. Cheemplavam
  2 siblings, 1 reply; 93+ messages in thread
From: Prakash K. Cheemplavam @ 2004-04-23  8:48 UTC (permalink / raw)
  To: Len Brown
  Cc: Jesse Allen, Craig Bradney, ross, christian.kroener,
	linux-kernel, Maciej W. Rozycki, Jamie Lokier, Daniel Drake,
	Ian Kumlien, a.verweij, Allen Martin

Len Brown wrote:
> On Thu, 2004-04-22 at 13:21, Len Brown wrote:
> 
> 
>>>As for your patch, I get a fast timer, and gain about 1 sec per 5 minutes.
>>>The only patch that seemed to work without a fast timer so far was the one 
>>>removed by Linus in a testing version.  The AN35N has the timer override 
>>>bug.
>>
>>Hmm, I didn't notice fast time on my FN41, i'll look for it.
>>
>>I'm not familiar with the "one removed by Linux in a testing version",
>>perhaps you could point me to that?
> 
> 
> date seems to gain 9sec/hour on my Shuttle/SN41G2/FN41 when using IOAPIC
> timer.

Do you get lock-ups without the timer_ack/C1halt patch? If yes, this may 
be the cause. I remember someone finding out that Ross' patch made the 
timer actually slower, which resulted in stable operation. Maciej found 
out that not connecting the timer at all made it stable as well. So is 
there a possibility to sync both timers?

According to a recent post, building the kernel with SMP makes it stable 
as well, but I haven't tested it.

> booted with "noapic" for XT-PIC timer, it stays locked
> onto my wristwatch after an hour.  If the workaround is disabled,
> and XT-PIC timer is used, it matches the "noapic" behaviour -- no drift.
> 
> I can't explain it.  I think it is a timer problem independent of the
> IRQ routing.
> 
> -Len
> 
> ps. when i ran in XT-PIC mode there were lots of ERR's registered in
> /proc/interrupts -- doesn't look healthy.
> 
> 
> 
> 


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-23  8:48                         ` Prakash K. Cheemplavam
@ 2004-04-23  9:01                           ` Arjen Verweij
  2004-04-23  9:08                             ` Prakash K. Cheemplavam
  2004-04-23  9:11                             ` Prakash K. Cheemplavam
  0 siblings, 2 replies; 93+ messages in thread
From: Arjen Verweij @ 2004-04-23  9:01 UTC (permalink / raw)
  To: Prakash K. Cheemplavam
  Cc: Len Brown, Jesse Allen, Craig Bradney, ross, christian.kroener,
	linux-kernel, Maciej W. Rozycki, Jamie Lokier, Daniel Drake,
	Ian Kumlien, Allen Martin

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: TEXT/PLAIN; charset=X-UNKNOWN, Size: 1779 bytes --]

He even filed a bug report:

http://bugme.osdl.org/show_bug.cgi?id=2552

I don't have access to my box atm, but I will certainly be trying a
vanilla kernel built with SMP to see what's going on.

Regards,

Arjen

On Fri, 23 Apr 2004, Prakash K. Cheemplavam wrote:

> Len Brown wrote:
> > On Thu, 2004-04-22 at 13:21, Len Brown wrote:
> >
> >
> >>>As for your patch, I get a fast timer, and gain about 1 sec per 5 minutes.
> >>>The only patch that seemed to work without a fast timer so far was the one
> >>>removed by Linus in a testing version.  The AN35N has the timer override
> >>>bug.
> >>
> >>Hmm, I didn't notice fast time on my FN41, i'll look for it.
> >>
> >>I'm not familiar with the "one removed by Linux in a testing version",
> >>perhaps you could point me to that?
> >
> >
> > date seems to gain 9sec/hour on my Shuttle/SN41G2/FN41 when using IOAPIC
> > timer.
>
> Do you get lock-ups without the timer_ack/C1halt patch? If yes, this may
> be the cause. I remember someone finding out that Ross' patch made the
> timer actually slower, which resulted in stable operation. Maciej found
> out that not connecting the timer at all made it stable as well. So is
> there a possibility to sync both timers?
>
> According to a recent post, building the kernel with SMP makes it stable
> as well, but I haven't tested it.
>
> > booted with "noapic" for XT-PIC timer, it stays locked
> > onto my wristwatch after an hour.  If the workaround is disabled,
> > and XT-PIC timer is used, it matches the "noapic" behaviour -- no drift.
> >
> > I can't explain it.  I think it is a timer problem independent of the
> > IRQ routing.
> >
> > -Len
> >
> > ps. when i ran in XT-PIC mode there were lots of ERR's registered in
> > /proc/interrupts -- doesn't look healthy.
> >
> >
> >
> >
>
>


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-23  9:01                           ` Arjen Verweij
@ 2004-04-23  9:08                             ` Prakash K. Cheemplavam
  2004-04-23  9:11                             ` Prakash K. Cheemplavam
  1 sibling, 0 replies; 93+ messages in thread
From: Prakash K. Cheemplavam @ 2004-04-23  9:08 UTC (permalink / raw)
  To: a.verweij
  Cc: Len Brown, Jesse Allen, Craig Bradney, ross, christian.kroener,
	linux-kernel, Maciej W. Rozycki, Jamie Lokier, Daniel Drake,
	Ian Kumlien, Allen Martin

Arjen Verweij wrote:
> He even filed a bug report:
> 
> http://bugme.osdl.org/show_bug.cgi?id=2552
> 
> I don't have access to my box atm, but I will certainly be trying a
> vanilla kernel built with SMP to see what's going on.

Hmm, well, I just tried it with 2.6.6-rc2-mm1 and it did NOT succeed, i.e. 
it locked up. Maybe I need to use the exact kernel version and 
configuration to find out what's going on.

Prakash

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-23  9:01                           ` Arjen Verweij
  2004-04-23  9:08                             ` Prakash K. Cheemplavam
@ 2004-04-23  9:11                             ` Prakash K. Cheemplavam
  1 sibling, 0 replies; 93+ messages in thread
From: Prakash K. Cheemplavam @ 2004-04-23  9:11 UTC (permalink / raw)
  To: a.verweij
  Cc: Len Brown, Jesse Allen, Craig Bradney, ross, christian.kroener,
	linux-kernel, Maciej W. Rozycki, Jamie Lokier, Daniel Drake,
	Ian Kumlien, Allen Martin

Arjen Verweij wrote:
> He even filed a bug report:
> 
> http://bugme.osdl.org/show_bug.cgi?id=2552
> 
> I don't have access to my box atm, but I will certainly be trying a
> vanilla kernel built with SMP to see what's going on.

OK, I read the bug report, so it seems it will still lock up when using my 
Silicon Image SATA controller, but not with the internal PATA IDE. Well, I 
only tried the SATA, but I don't quite understand what makes the 
difference... at least it's a no-go for me.

Prakash



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-22 21:29                       ` Len Brown
  2004-04-23  8:48                         ` Prakash K. Cheemplavam
@ 2004-04-23 12:18                         ` Maciej W. Rozycki
  2004-04-27  7:57                         ` ACPI broken on nforce2? Prakash K. Cheemplavam
  2 siblings, 0 replies; 93+ messages in thread
From: Maciej W. Rozycki @ 2004-04-23 12:18 UTC (permalink / raw)
  To: Len Brown
  Cc: Jesse Allen, Prakash K. Cheemplavam, Craig Bradney, ross,
	christian.kroener, linux-kernel, Jamie Lokier, Daniel Drake,
	Ian Kumlien, a.verweij, Allen Martin

On Thu, 22 Apr 2004, Len Brown wrote:

> date seems to gain 9sec/hour on my Shuttle/SN41G2/FN41 when using IOAPIC
> timer.
> 
> booted with "noapic" for XT-PIC timer, it stays locked
> onto my wristwatch after an hour.  If the workaround is disabled,
> and XT-PIC timer is used, it matches the "noapic" behaviour -- no drift.
> 
> I can't explain it.  I think it is a timer problem independent of the
> IRQ routing.
> 
> -Len
> 
> ps. when i ran in XT-PIC mode there were lots of ERR's registered in
> /proc/interrupts -- doesn't look healthy.

 It looks like noise on the timer IRQ line causing spurious interrupt
edges.  In the XT-PIC mode it gets ignored -- at the time the CPU issues
an ack, the request is already gone and the PIC signals a spurious
interrupt.  In the APIC mode the interrupt is delivered as a regular one
as edge interrupt events are persistent for the APICs -- if a falling edge
happens before an interrupt is acked it's not assumed to be gone and is
delivered as a real one.
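
 The difference described above can be shown with a toy model.  This is
only a sketch under the noise hypothesis, not a simulation of the real
8259A or APIC hardware; all names are hypothetical.  Each observed edge is
reduced to one fact: whether the line is still asserted when the CPU acks.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy 8259A (XT-PIC) model: if the request has already gone away when the
 * CPU acknowledges, the PIC reports a spurious interrupt, not a timer tick.
 */
bool pic_delivers_tick(bool line_still_high_at_ack)
{
	return line_still_high_at_ack;
}

/*
 * Toy APIC model: an edge is latched when it is seen, so it is delivered
 * as a real interrupt whatever the line does before the ack.
 */
bool apic_delivers_tick(bool line_still_high_at_ack)
{
	(void)line_still_high_at_ack;
	return true;
}

/* Count how many of the observed edges end up as timer ticks. */
size_t count_ticks(const bool *high_at_ack, size_t n_edges,
		   bool (*delivers)(bool))
{
	size_t ticks = 0;

	for (size_t i = 0; i < n_edges; i++)
		if (delivers(high_at_ack[i]))
			ticks++;
	return ticks;
}
```

 For a sequence of three genuine timer edges and two short glitches, the
PIC model counts three ticks and the APIC model counts five.  If the noise
theory holds, that surplus would account for the clock running fast in
APIC mode while the XT-PIC mode logs the glitches as ERR instead.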

 Another possibility is there's a bug in our APIC interrupt setup, leading
to the timer interrupt being enabled both in the APIC and in the PIC.  
You can verify that by calling debug functions for dumping states of the
controllers from io_apic.c.  They are print_IO_APIC(), print_local_APIC()  
and print_PIC() -- you may call them from an ad-hoc written small module,
although the first one is (accidentally?) marked __init, so you'd have to
remove the mark first.  You need to call all of them to get a complete
view.

  Maciej

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-22 17:21                     ` Len Brown
  2004-04-22 21:29                       ` Len Brown
@ 2004-04-26 11:41                       ` Ross Dickson
  2004-04-27 17:02                         ` Arjen Verweij
  2004-04-27 21:31                         ` Prakash K. Cheemplavam
  1 sibling, 2 replies; 93+ messages in thread
From: Ross Dickson @ 2004-04-26 11:41 UTC (permalink / raw)
  To: Len Brown, Jesse Allen
  Cc: Prakash K. Cheemplavam, Craig Bradney, christian.kroener,
	linux-kernel, Maciej W. Rozycki, Jamie Lokier, Daniel Drake,
	Ian Kumlien, a.verweij, Allen Martin

On Friday 23 April 2004 03:21, Len Brown wrote:
> On Thu, 2004-04-22 at 12:39, Jesse Allen wrote:
> 
> > On the Shuttle AN35N, the C1 disconnect option default is auto.  If you're
> > talking about this board, or another board Shuttle seemingly fixed, then I
> > can tell you that I haven't been able to get mine to hang with vanilla kernels.
> 
> Have you been able to hang the AN35N under any conditions?
> Old BIOS, non-vanilla kernel?
> 
> > As for your patch, I get a fast timer, and gain about 1 sec per 5 minutes.
> > The only patch that seemed to work without a fast timer so far was the one 
> > removed by Linus in a testing version.  The AN35N has the timer override 
> > bug.
> 
> Hmm, I didn't notice fast time on my FN41, i'll look for it.
> 
> I'm not familiar with the "one removed by Linux in a testing version",
> perhaps you could point me to that?

This is Maciej's patch - latest posting of it that I have seen,
http://linux.derkeiler.com/Mailing-Lists/Kernel/2004-04/3174.html

His fix up of the 8259 ack issue (when used without routing 8254 pit into
io-apic INTIN0) successfully establishes a virtual wire mode input of the timer
which the nforce2 seems happy with albeit without being able to use
"nmi_debug=1"

It is that timer ack issue tied up with the integrated apic.
http://linux.derkeiler.com/Mailing-Lists/Kernel/2004-04/2143.html

This refers to when it was in the 2.6.3-rc1-mm1
http://linux.derkeiler.com/Mailing-Lists/Kernel/2004-02/2658.html 

Regards
Ross.

> 
> > Attached is the dmidecode for the AN35N.
> 
> applied.
> 
> thanks,
> -Len
> 
> 
> 
> 
> 


^ permalink raw reply	[flat|nested] 93+ messages in thread

* ACPI broken on nforce2?
  2004-04-22 21:29                       ` Len Brown
  2004-04-23  8:48                         ` Prakash K. Cheemplavam
  2004-04-23 12:18                         ` Maciej W. Rozycki
@ 2004-04-27  7:57                         ` Prakash K. Cheemplavam
  2 siblings, 0 replies; 93+ messages in thread
From: Prakash K. Cheemplavam @ 2004-04-27  7:57 UTC (permalink / raw)
  To: Len Brown, linux-kernel

Hi,

We discussed this subject a bit once, but it doesn't seem to be fully
resolved. It is still about the C1 halt state. Perhaps you remember me
having trouble getting low idle temps with my nforce2 and Athlon XP.
With a previous kernel I could get them back using agpgart and the nvidia
binary driver. But now (2.6.6-rc2-mm1), even using the open source nvidia
driver, idle temps seem to do whatever they like (no matter whether PIC or
APIC is used). I really think that the C1 state isn't called properly.
(CPU disconnect is activated.)

cat /proc/acpi/processor/CPU0/power
active state:            C1
default state:           C1
bus master activity:     00000000
states:
    *C1:                  promotion[--] demotion[--] latency[000] usage[00000000]
     C2:                  <not supported>
     C3:                  <not supported>

You said that the usage probably stays at 0 because it is not counted. But
this makes me wonder: yesterday I tried acpi=force on a board with a VIA
MVP3 chipset. The BIOS is from 2000 and, guess what, here C1 and even C2
seem to be used properly and the usage is even counted. ACPI seems to
work better there than on my nforce2...

So I wonder why C1 usage isn't counted on nforce2. I now have the strong
feeling that it isn't called properly under some circumstances.
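
To check whether the C1 usage counter ever moves, the output shown above
can be parsed and compared between two samples. What follows is only a
sketch: it assumes the text layout pasted in this mail, assumes the usage
field is a decimal number, and the names parse_usage and SAMPLE are made
up for the illustration.

```python
import re

# Sample text in the layout shown in this mail (C1 line unwrapped).
SAMPLE = """\
active state:            C1
default state:           C1
bus master activity:     00000000
states:
    *C1:                  promotion[--] demotion[--] latency[000] usage[00000000]
     C2:                  <not supported>
     C3:                  <not supported>
"""

# A state line starts with optional whitespace, an optional '*', then "Cn:".
STATE_RE = re.compile(r"^\s*\*?(C\d+):\s*(.*)$")
USAGE_RE = re.compile(r"usage\[(\d+)\]")


def parse_usage(text):
    """Map each C-state name to its usage count, or None if no count is shown."""
    usage = {}
    for line in text.splitlines():
        match = STATE_RE.match(line)
        if not match:
            continue
        state, rest = match.groups()
        counter = USAGE_RE.search(rest)
        # Assumption: the counter is printed in decimal with leading zeros.
        usage[state] = int(counter.group(1)) if counter else None
    return usage


print(parse_usage(SAMPLE))
```

Reading /proc/acpi/processor/CPU0/power twice, a few seconds apart on an
idle box, and comparing the two results would show whether the counter
changes at all.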

Should I open a bug report? If yes, what files do you need?

Thanks,

Prakash

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-26 11:41                       ` IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5 Ross Dickson
@ 2004-04-27 17:02                         ` Arjen Verweij
  2004-04-27 17:35                           ` Ian Kumlien
  2004-04-27 18:00                           ` Len Brown
  2004-04-27 21:31                         ` Prakash K. Cheemplavam
  1 sibling, 2 replies; 93+ messages in thread
From: Arjen Verweij @ 2004-04-27 17:02 UTC (permalink / raw)
  To: Ross Dickson
  Cc: Len Brown, Jesse Allen, Prakash K. Cheemplavam, Craig Bradney,
	christian.kroener, linux-kernel, Maciej W. Rozycki, Jamie Lokier,
	Daniel Drake, Ian Kumlien, Allen Martin

Hello,

I'm sorry for the small interlude in this thread, but I just want to get
something clear.

Basically we have a problem that is all around, except for (some) Shuttle
boards. No one really knows what's going on, or at least if they know they
are not vocal about it.

In comes Ross Dickson. He starts poking at the problem until he comes up
with two patches. Near the end of 2003, an NVIDIA engineer (Allen Martin)
states that he (or maybe NVIDIA as a whole?) has been unable to reproduce
this weird problem with hard locks, seemingly related to APIC and IO.

He can tell us there was a bug in a reference BIOS that NVIDIA sent out
into the world, but that it has been fixed in a follow-up. Somewhere at
the start of December, Shuttle updates its BIOS for the AN35. Jesse Allen
flashes the new BIOS into his board and for reasons unknown his hard lock
problem has vanished. The importance of the update of NVIDIA's reference
BIOS in relation to the Shuttle update of the BIOS for their product(s) is
unknown as well.

Meanwhile, Ross Dickson drops requests for support tickets at AMD and
NVIDIA. Until this day, no reply yet. Unaffected by the deafening silence
he keeps improving his patches which seem to work(tm).

Without Ross' hard labor one can avoid the hard locks by banning APIC
support from the kernel, or turn off the C1 disconnect feature in the
BIOS, which is misinterpreted by one ACPI developer as running the CPU
"out of spec."

Recently Len Brown, the ACPI Linux kernel maintainer and Intel employee -
can you spot the irony? - agreed to attempt to reproduce the problem.
After having his box run with cat /dev/hda > /dev/null for a night
straight, no lockup has occurred. The brand of his motherboard is Shuttle.
Did I mention irony...?

Although this topic is primarily about nforce2 chipsets, similar problems
have been reported with SiS chipsets for AMD CPUs. Other chipsets capable
of having the CPU disconnect include VIA KT266(A), KT333 and KT400. For
Linux, a tool like athcool can set the bits for the disconnect and the HLT
instruction. It is unconfirmed whether these chipsets suffer from the same
symptoms as nforce2 chipsets.

Does anyone have some input on how to tackle this problem? The only things
I can come up with are mailing all the motherboard manufacturers I can
think of, harassing NVIDIA and/or AMD some more through proper channels (i.e.
filing a "bug report", but I don't expect much from this, sorry Allen) or
buying Len the cheapest broken nforce2 board I can find at pricewatch.com and
having it shipped to his house :)

Best regards,

Arjen


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-27 17:02                         ` Arjen Verweij
@ 2004-04-27 17:35                           ` Ian Kumlien
  2004-04-27 18:00                           ` Len Brown
  1 sibling, 0 replies; 93+ messages in thread
From: Ian Kumlien @ 2004-04-27 17:35 UTC (permalink / raw)
  To: a.verweij
  Cc: Ross Dickson, Len Brown, Jesse Allen, Prakash K. Cheemplavam,
	Craig Bradney, christian.kroener, linux-kernel,
	Maciej W. Rozycki, Jamie Lokier, Daniel Drake, Allen Martin

[-- Attachment #1: Type: text/plain, Size: 4209 bytes --]

On Tue, 2004-04-27 at 19:02, Arjen Verweij wrote:
> Hello,
> 
> I'm sorry for the small interlude in this thread, but I just want to get
> something clear.

IMHO it was a nice summary of the situation, and it might be welcome
for people who have just started reading about this.

> Basically we have a problem that is all around, except for (some) Shuttle
> boards. Noone really knows what's going on, or at least if they know they
> are not vocal about it.

Yep, and Asus seems to only add support for new RAM manufacturers in dual
DDR mode.

> In comes Ross Dickson. He starts poking at the problem until he comes up
> with two patches. Near the end of 2003, an NVIDIA engineer (Allen Martin)
> states that he (or maybe NVIDIA as a whole?) has been unable to reproduce
> this weird problem with hard locks, seemingly related to APIC and IO.



> He can tell us there was a bug in a reference BIOS that NVIDIA sent out
> into the world, but that it has been fixed in a follow-up. Somewhere at
> the start of December, Shuttle updates its BIOS for the AN35. Jesse Allen
> flashes the new BIOS into his board and for reasons unknown his hard lock
> problem has vanished. The importance of the update of NVIDIA's reference
> BIOS in relation to the Shuttle update of the BIOS for their product(s) is
> unknown as well.



> Meanwhile, Ross Dickson drops requests for support tickets at AMD and
> NVIDIA. Until this day, no reply yet. Unaffected by the deafening silence
> he keeps improving his patches which seem to work(tm).

Yep, and we are all grateful for that =), thanks Ross.

> Without Ross' hard labor one can avoid the hard locks by banning APIC
> support from the kernel, or turn off the C1 disconnect feature in the
> BIOS, which is misinterpreted by one ACPI developer as running the CPU
> "out of spec."

Well, it gets hot... like hell.

> Recently Len Brown, the ACPI Linux kernel maintainer and Intel employee -
> can you spot the irony? - agrees to attempt to reproduce the problem.
> After having his box run with cat /dev/hda > /dev/null for a night
> straight no lockup has occured. The brand of his motherboard is Shuttle.
> Did I mention irony...?

Heh.

> Although this topic is primarily about nforce2 chipsets, similar problems
> have been reported with SiS chipsets for AMD cpus. Other chipsets capable
> of having the CPU disconnect include VIA KT266(A), KT333 and KT400. For
> linux a tool like athcool can set the bits for the disconnect and the HLT
> instruction. It is unconfirmed that these chipsets suffer from the same
> symptoms as nforce2 chipsets.

There are several other things that can nuke machines though.
A friend has problems with DMA on an Intel chipset (I keep monitoring the
changelogs for fixes) and he has trouble getting more than 20 days of uptime.
(It crashes faster with DMA enabled.)

My firewall, a VIA Samuel 2 (microitx), dies after a few hours if you
enable cpu freq. But it also seems like it changes CPU speed too often.

The common denominator between my firewall and my desktop is 'too often'.
Which leads me to suspect that the HZ change from 100 -> 1000 could be
somewhat responsible. Could it be that we just run it too often and thus
worsen the impact? And C1 disconnect shouldn't be run that often IMHO.
Neither should cpu freq.

Perhaps some throttling would have about the same effect as Ross's patches
(which is what his original patches did, but not to the C1 disconnect or
the HLT instruction). Could it be that some kernel code isn't well
adapted to the 100 -> 1000 change?
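
To put numbers on the 100 -> 1000 suspicion, here is a small sketch. It
assumes that an otherwise idle CPU is woken by every timer tick and then
goes back into halt, so the timer alone causes roughly HZ halt entries per
second; the function names are made up for the illustration and this is
not a measurement.

```python
def tick_period_ms(hz):
    """Length of one timer tick in milliseconds for a given HZ value."""
    return 1000.0 / hz


def idle_wakeups_per_second(hz):
    """Assumed rate: an idle CPU is woken once per tick and then halts again."""
    return hz


for hz in (100, 1000):
    print("HZ=%d: tick every %.1f ms, about %d halt entries/s when idle"
          % (hz, tick_period_ms(hz), idle_wakeups_per_second(hz)))
```

Under that assumption an idle box enters halt ten times as often at
HZ=1000 as at HZ=100, which would give a marginal disconnect sequence ten
times as many chances per second to go wrong.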

Anyways, that's my 0.2 eur

> Does anyone have some input on how to tackle this problem? The only things
> I can come up with is mailing all the motherboard manufacturers I can
> think of, harass NVIDIA and/or AMD some more through proper channels (i.e.
> file a "bug report", but I don't expect much from this, sorry Allen) or
> buy Len the cheapest broken nforce2 board I can find at pricewatch.com and
> have it shipped to his house :)

Heh, that would be fun if he's willing to do the work/research =).

PS. CC, since i'm not on this list.
-- 
Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-27 17:02                         ` Arjen Verweij
  2004-04-27 17:35                           ` Ian Kumlien
@ 2004-04-27 18:00                           ` Len Brown
  2004-04-27 18:24                             ` Arjen Verweij
                                               ` (2 more replies)
  1 sibling, 3 replies; 93+ messages in thread
From: Len Brown @ 2004-04-27 18:00 UTC (permalink / raw)
  To: a.verweij
  Cc: Ross Dickson, Jesse Allen, Prakash K. Cheemplavam, Craig Bradney,
	christian.kroener, linux-kernel, Maciej W. Rozycki, Jamie Lokier,
	Daniel Drake, Ian Kumlien, Allen Martin

On Tue, 2004-04-27 at 13:02, Arjen Verweij wrote:

> After having his box run with cat /dev/hda > /dev/null for a night
> straight no lockup has occured. The brand of his motherboard is Shuttle.

My shuttle is a FN41 board in a SN41G2 system.

I found "rev 1.0" BIOS (FN41S00X of 12/18/2002) on Shuttle's ftp site
and downgraded to that, but still no hang.

It may be this board never hangs no matter what,
or perhaps C1 disconnect was simply disabled in that BIOS
b/c there was no option for it in Advanced Chipset Features
like there is for the most recent BIOS.

Other things about my board.
I run "optimized defaults", I don't overclock anything.
Processor is an AMD XP 2200+
Does anybody else see the hang with this processor model?
I wonder if the hang is processor model or speed dependent?

> Does anyone have some input on how to tackle this problem?

Unfortunately I don't have tools for debugging nvidia + amd hardware.
I would expect that those companies do, however.  So encouraging them
to reproduce the hang internally may be the best way to go.

> buy Len the cheapest broken nforce2 board I can find at pricewatch.com and
> have it shipped to his house :)

I got tangled in this b/c this board (actually, the reference BIOS for
this chipset) had some unusual ACPI related failures.  If the failures
turn out to be related to ACPI, I'll do what I can to help.  But I
expect that hardware debugging tools may be necessary before the
hang issue is completely explained and solved.

-Len



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-27 18:00                           ` Len Brown
@ 2004-04-27 18:24                             ` Arjen Verweij
  2004-04-27 18:51                             ` Jussi Laako
  2004-04-28 11:33                             ` Ross Dickson
  2 siblings, 0 replies; 93+ messages in thread
From: Arjen Verweij @ 2004-04-27 18:24 UTC (permalink / raw)
  To: Len Brown
  Cc: Ross Dickson, Jesse Allen, Prakash K. Cheemplavam, Craig Bradney,
	christian.kroener, linux-kernel, Maciej W. Rozycki, Jamie Lokier,
	Daniel Drake, Ian Kumlien, Allen Martin

Len,

I don't think that the CPU model has much to do with anything, it is
pretty much chipset related. My last remark about buying you a mobo for
debugging purposes was a vain attempt at humor.

I'm just surprised at the lack of support the vendors and NVIDIA/AMD are
giving. I realise that Linux may be only a marginal part of the market for
those companies so it is not commercially justifiable to invest a lot of
time in this.

We all appreciate whatever input you may have, because in-depth knowledge
of how the ACPI/APIC code handles things is probably needed to tackle
this issue.

All I can do is gather info, and I'm currently thinking of a plan to get
the info we need.

Regards,

Arjen


On 27 Apr 2004, Len Brown wrote:

> On Tue, 2004-04-27 at 13:02, Arjen Verweij wrote:
>
> > After having his box run with cat /dev/hda > /dev/null for a night
> > straight no lockup has occured. The brand of his motherboard is Shuttle.
>
> My shuttle is a FN41 board in a SN41G2 system.
>
> I found "rev 1.0" BIOS (FN41S00X of 12/18/2002) on Shuttle's ftp site
> and downgraded to that, but still no hang.
>
> It may be this board never hangs no matter what,
> or perhaps C1 disconnect was simply disabled in that BIOS
> b/c there was no option for it in Advanced Chipset Features
> like there is for the most recent BIOS.
>
> Other things about my board.
> I run "optimized defaults", I don't overclock anything.
> Processor is an AMD XP 2200+
> Does anybody else see the hang with this processor model?
> I wonder if the hang is processor model or speed dependent?
>
> > Does anyone have some input on how to tackle this problem?
>
> Unfortunately I don't have tools for debugging nvidia + amd hardware.
> I would expect that those companies do, however.  So encouraging them
> to reproduce the hang internally may be the best way to go.
>
> > buy Len the cheapest broken nforce2 board I can find at pricewatch.com and
> > have it shipped to his house :)
>
> I got tangled in this b/c this board (actually, the reference BIOS for
> this chipset) had some unusual ACPI related failures.  If the failures
> turn out to be related to ACPI, I'll do what I can to help.  But I
> expect that hardware debugging tools may be necessary before the
> hang issue is completely explained and solved.
>
> -Len
>
>
>


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-27 18:00                           ` Len Brown
  2004-04-27 18:24                             ` Arjen Verweij
@ 2004-04-27 18:51                             ` Jussi Laako
  2004-04-28 11:33                             ` Ross Dickson
  2 siblings, 0 replies; 93+ messages in thread
From: Jussi Laako @ 2004-04-27 18:51 UTC (permalink / raw)
  To: linux-kernel

On Tue, 2004-04-27 at 21:00, Len Brown wrote:

> I run "optimized defaults", I don't overclock anything.
> Processor is an AMD XP 2200+
> Does anybody else see the hang with this processor model?
> I wonder if the hang is processor model or speed dependent?

Have people run memtest86 over a weekend without errors?

I found nForce2 (newer ones, rev2? A7N8X rev 2 and K7N2 Delta) to be
very picky about DDR400 RAM. I was able to find 2 properly working memory
modules out of 6. I also tested with several different brands. However,
the A7N8X rev 1 runs fine without the need to carefully pick memory modules.

The memory test may need to be run for over 48 hours to detect errors. But
the time required may be lower when running the Linux kernel.


-- 
Jussi Laako <jussi@sonarnerd.net>


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-26 11:41                       ` IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5 Ross Dickson
  2004-04-27 17:02                         ` Arjen Verweij
@ 2004-04-27 21:31                         ` Prakash K. Cheemplavam
  2004-04-28 11:26                           ` Prakash K. Cheemplavam
  1 sibling, 1 reply; 93+ messages in thread
From: Prakash K. Cheemplavam @ 2004-04-27 21:31 UTC (permalink / raw)
  To: ross
  Cc: Len Brown, Jesse Allen, Craig Bradney, christian.kroener,
	linux-kernel, Maciej W. Rozycki, Jamie Lokier, Daniel Drake,
	Ian Kumlien, a.verweij, Allen Martin

Hi all,

I have just had an interesting experience. It seems Len's timer
routing patch (or whatever you wanna call it) stabilizes my system to a
certain degree, or NOT using AGP stabilizes it to some degree...

The whole story: I am using Ross's C1halt patch to make the system stable
in APIC mode, but due to a recent change I borked my kernel parameters
and just had idle=halt instead of idle=C1halt as a parameter, so I had
not activated Ross's patch by accident. Nevertheless, the system survived
a whole day! Usually it locks up within minutes, but this time it didn't. I
even did some heavy copying from DVD to HD and from one HD to another
with peaks of about 40 MB/s. Finally the system crashed when I recorded
from DVB to HD (but only after 20 minutes). Then after a reboot (still
NOT using Ross's patch) it survived DVB recording for about 30 minutes.

I only manage to instantly lock the system when doing a hdparm (or rather a
second hdparm; the first one gives just about 20 MB/s, hello Jeff? What
is libata doing here?) which goes up to >60 MB/s.

So Len, maybe try using a faster hd to crash your shuttle if it is one 
of the borked bioses...

As I used the open source nv driver the whole time, AGP probably wasn't in
use (or someone tell me how to use AGP with the nv driver...). Usually,
without Ross's patches, using AGP gets me fast lock-ups; or, as stated
above, Len's patch makes it a bit better. In fact I would need to try Len's
patch with AGP on (with the nvidia binary) to find out whether AGP or Len's
patch makes the difference. But currently I am too tired and not in the
mood to further patch the current mm-kernel to get Nvidia's binary running
again...

Does anybody know a tool to generate a certain amount of traffic on the PCI
bus? Then I could test at which point the system wants to lock up now.
The only idea I have right now is to put an older HD into the system and test.
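
One rough way to get a chosen amount of disk traffic, instead of swapping
in a slower drive, is a reader that paces itself to a target rate. The
following is only a sketch: paced_read and its parameters are invented for
this illustration, and whether a given rate actually triggers the lock-up
is untested.

```python
import io
import time


def paced_read(stream, target_bytes_per_s, chunk_size=1 << 20,
               clock=time.monotonic, sleep=time.sleep):
    """Read stream to EOF, sleeping as needed to stay near the target rate.

    Returns the total number of bytes read.
    """
    start = clock()
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return total
        total += len(chunk)
        # Point in time at which `total` bytes are due at the target rate.
        due = start + total / target_bytes_per_s
        delay = due - clock()
        if delay > 0:
            sleep(delay)


# Small self-contained demonstration on an in-memory buffer.
print(paced_read(io.BytesIO(b"x" * 4096), 1_000_000, chunk_size=1024))
```

Opened on a raw disk device (for example /dev/hda, opened in binary mode)
and called with rates such as 20e6, 40e6 or 60e6 bytes per second, this
could help narrow down the throughput at which the hang starts. Data
already in the page cache would not generate bus traffic, so the reads
should cover more data than fits in RAM.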

bye,

Prakash

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-27 21:31                         ` Prakash K. Cheemplavam
@ 2004-04-28 11:26                           ` Prakash K. Cheemplavam
  0 siblings, 0 replies; 93+ messages in thread
From: Prakash K. Cheemplavam @ 2004-04-28 11:26 UTC (permalink / raw)
  Cc: ross, Len Brown, Jesse Allen, Craig Bradney, christian.kroener,
	linux-kernel, Maciej W. Rozycki, Jamie Lokier, Daniel Drake,
	Ian Kumlien, a.verweij, Allen Martin

Prakash K. Cheemplavam wrote:
> Hi all,
> 
> I have just made soem interesting experience. It seems Len's timer 
> routing patch (or whatever you wanna call it) stabilizes my system to a 
> certain amount or NOT using AGP stabilizes it to an amount...

[snip]

Btw, I found another possible reason for this behaviour, which would fit
the idle temp problem I am experiencing again with the 2.6.6-rc2-mm1
kernel (unless, it seems, I use Ross's C1halt idle patch): perhaps this
kernel uses the disconnect feature less often, so the probability of a
lock-up goes down. That would explain my higher temps...

Prakash

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-27 18:00                           ` Len Brown
  2004-04-27 18:24                             ` Arjen Verweij
  2004-04-27 18:51                             ` Jussi Laako
@ 2004-04-28 11:33                             ` Ross Dickson
  2004-04-28 20:59                               ` Jesse Allen
  2 siblings, 1 reply; 93+ messages in thread
From: Ross Dickson @ 2004-04-28 11:33 UTC (permalink / raw)
  To: Len Brown, a.verweij
  Cc: Jesse Allen, Prakash K. Cheemplavam, Craig Bradney,
	christian.kroener, linux-kernel, Maciej W. Rozycki, Jamie Lokier,
	Daniel Drake, Ian Kumlien, Allen Martin

On Wednesday 28 April 2004 04:00, Len Brown wrote:
> On Tue, 2004-04-27 at 13:02, Arjen Verweij wrote:
> 
> > After having his box run with cat /dev/hda > /dev/null for a night
> > straight no lockup has occured. The brand of his motherboard is Shuttle.
> 
> My shuttle is a FN41 board in a SN41G2 system.
>

I have had three Albatron KM18G pro boards and one Epox 8RGA+.
 
> I found "rev 1.0" BIOS (FN41S00X of 12/18/2002) on Shuttle's ftp site
> and downgraded to that, but still no hang.

My Albatrons hang with BIOS R1.01, R1.01a and R1.04 (the latest); they
probably also hang with earlier BIOSes but I have not tried. I emailed
Albatron in the last couple of weeks about Allen's comments on the Nvidia
reference BIOS and about the lockups but have had no response as yet.

My Epox hangs but does not have the latest BIOS - I don't have a floppy
hooked up in that box to flash it to the latest BIOS as yet.

> 
> It may be this board never hangs no matter what,
> or perhaps C1 disconnect was simply disabled in that BIOS
> b/c there was no option for it in Advanced Chipset Features
> like there is for the most recent BIOS.

Maybe other MOBO manufacturers skimp on filter caps and regulator damping
ability and a resonance occurs in the on-board supply rails? Do Shuttle make
any claims to using an improved on board regulator? Or Shuttle may have 
always programmed more time in C1 cycle handshakes if such is 
configurable? 

> 
> Other things about my board.
> I run "optimized defaults", I don't overclock anything.
> Processor is an AMD XP 2200+
> Does anybody else see the hang with this processor model?
> I wonder if the hang is processor model or speed dependent?

I have tried XP2200, XP2400, XP2500, I know I get lockups with both t'bred
and barton cores. Epox mobo has been tried with both Aopen H-500A and 
Elanvital full size case and power supplies. My albatron are all in Aopen m-atx
H-400A cases.

> 
> > Does anyone have some input on how to tackle this problem?
> 
> Unfortunately I don't have tools for debugging nvidia + amd hardware.
> I would expect that those companies do, however.  So encouraging them
> to reproduce the hang internally may be the best way to go.

Ditto. I figured out early on that it could do with an emulator or bond-out
cpu/chipset and tried to draw in Nvidia and AMD starting in December last year.
http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-12/2549.html

It was cc'd to Mr Allen Martin of Nvidia as were other emails on topic.
His reply was "\0" so I assumed he was on a long holiday or no longer worked
there. It has been good to hear from him on the topic some 4 months later.
Don't scare him off! -we appear to be making some progress. 

I also spoke to Mr Michael Apthorpe of AMD in Australia in December and
forwarded him the support request email. He replied "Thanks Ross I will
forwards it on and see what comes back." But nothing has to date.

In January I spotted Mr Richard Brunner of AMD had previously corresponded
with the LKML so I emailed him and he was interested at the time but said
whilst he could not promise anything he would forward my query to the hardware
certification labs. And guess what - he was right to promise nothing as I have 
received "\0" to date.

I followed up with the AMD guys in February this year but again received "\0".

> 
> > buy Len the cheapest broken nforce2 board I can find at pricewatch.com and
> > have it shipped to his house :)
> 
> I got tangled in this b/c this board (actually, the reference BIOS for
> this chipset) had some unusual ACPI related failures.  If the failures
> turn out to be related to ACPI, I'll do what I can to help.  But I
> expect that hardware debugging tools may be necessary before the
> hang issue is completely explained and solved.

I have had good (100%) success in reproducing the fault with the Albatron
KM18G pro MOBO. I needed the m-atx form factor and the distributor was local
to me. It makes a very nice, cheap and stable system, but only with the
lockup workaround.

I also recollect that Windows had lockups with nforce2 for a while, depending
on whether you ran the Nvidia or the Microsoft driver.
http://lkml.org/lkml/2003/12/13/5
Anybody got the inside running on that one and what was different between the 
two drivers?

Regards
Ross.

> 
> -Len
> 
> 
> 
> 
> 


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-28 11:33                             ` Ross Dickson
@ 2004-04-28 20:59                               ` Jesse Allen
  2004-04-29 11:44                                 ` Ross Dickson
  0 siblings, 1 reply; 93+ messages in thread
From: Jesse Allen @ 2004-04-28 20:59 UTC (permalink / raw)
  To: Ross Dickson
  Cc: Len Brown, a.verweij, Prakash K. Cheemplavam, Craig Bradney,
	christian.kroener, linux-kernel, Maciej W. Rozycki, Jamie Lokier,
	Daniel Drake, Ian Kumlien, Allen Martin

On Wed, Apr 28, 2004 at 09:33:34PM +1000, Ross Dickson wrote:
> > 
> > It may be this board never hangs no matter what,
> > or perhaps C1 disconnect was simply disabled in that BIOS
> > b/c there was no option for it in Advanced Chipset Features
> > like there is for the most recent BIOS.
> 
> Maybe other MOBO manufacturers skimp on filter caps and regulator damping
> ability and a resonance occurs in the on-board supply rails? Do Shuttle make
> any claims to using an improved on board regulator? Or Shuttle may have 
> always programmed more time in C1 cycle handshakes if such is 
> configurable? 

Do you really think so?  I think there may be a resonance occurring, even with
this new BIOS.  I plugged new headphones into my nforce2 onboard sound, and
get a high pitched noise.  Now here is where it gets weird:  This noise does
not occur on boot until sometime after the IDE driver is loaded.  I also
believe it varies under a high load.  If you disable C1 disconnect, it's gone.
Also I've heard a high pitched noise at certain times coming right from the
computer (very faint, but I do have very good hearing, I can even hear a hush
sounding from my router.  My brother was quite astonished when I pointed that
out).  I try to distinguish what's doing it.  It could be the hard drive.  But
when I found the other sound in the headphones, I found that the sound varies
almost in unison with the sound coming from the computer.  Maybe the IDE or
hard drive is related, but it is too closely related to C1 disconnect.

Whether it is really possible that my board can generate this sound, I
don't know.  Though, I have once determined that resonance was occurring in an
old system, causing unstable CPU operation.  It wasn't that I heard a sound
coming from it =).  But I thought the case was causing it, and pulled the
board out of the case.  I ran it on the table and found it to be stable.  That
was the only thing wrong.  I've also studied resonance a bit before.  I know
resonance can break systems.  But to think that my board is emitting
noise like that is pretty bizarre.

It may be true that this Shuttle board may have resonance problems.  So that 
would indicate that they did something much like you describe by changing the 
C1 handshake time?  Isn't that much like what your patch does?


> 
> > hang issue is completely explained and solved.
> 
> I have had good (100%) success in reproducing the fault with the Albatron 
> KM18G pro MOBO. I needed m-atx form factor and distributor was local to me.
> Makes very nice - cheap and stable system but only with the lockup workaround.
> 
> I also recollect that Windows had lockups with nforce2 for a while depending 
> whether you ran the Nvidia or Microsoft driver.
> http://lkml.org/lkml/2003/12/13/5
> Anybody got the inside running on that one and what was different between the 
> two drivers?
> 

Yeah, unfortunately, I didn't save a link to the message board that I found
that on.  But the issue is pretty common.  I'm sure more info can be found on
the Windows side.

Jesse


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-28 20:59                               ` Jesse Allen
@ 2004-04-29 11:44                                 ` Ross Dickson
  2004-04-29 11:54                                   ` Maciej W. Rozycki
                                                     ` (3 more replies)
  0 siblings, 4 replies; 93+ messages in thread
From: Ross Dickson @ 2004-04-29 11:44 UTC (permalink / raw)
  To: Jesse Allen
  Cc: Len Brown, a.verweij, Prakash K. Cheemplavam, Craig Bradney,
	christian.kroener, linux-kernel, Maciej W. Rozycki, Jamie Lokier,
	Daniel Drake, Ian Kumlien, Allen Martin

On Thursday 29 April 2004 06:59, Jesse Allen wrote:
> On Wed, Apr 28, 2004 at 09:33:34PM +1000, Ross Dickson wrote:
> > > 
> > > It may be this board never hangs no matter what,
> > > or perhaps C1 disconnect was simply disabled in that BIOS
> > > b/c there was no option for it in Advanced Chipset Features
> > > like there is for the most recent BIOS.
> > 
> > Maybe other MOBO manufacturers skimp on filter caps and regulator damping
> > ability and a resonance occurs in the on-board supply rails? Do Shuttle make
> > any claims to using an improved on board regulator? Or Shuttle may have 
> > always programmed more time in C1 cycle handshakes if such is 
> > configurable? 
> 
> Do you really think so?  I think there may be a resonance occuring, even with 
> this new BIOS.  I plugged in new headphones into my nforce2 onboard sound, and 
> get a high pitched noise.  Now here is where it gets weird:  This noise does 
> not occur on boot until sometime after the IDE driver is loaded.  I also 
> believe it varies under a high load.  If you disable C1 disconnect, it's gone.  
> Also I've heard a high pitched noise at certain times coming right from the 
> copmuter (very faint, but I do have very good hearing, I can even hear a hush 
> sounding from my router.  my brother was quite astonished when I pointed that 
> out)  I try to distinguish whats doing it.  It could be the hard drive.  But 
> when I found the other sound in the head phones, I found that the sound varies 
> almost in unison with the sound coming from the computer.  Maybe the IDE or 
> hard drive is related, but it is too much related to C1 disconnect.

I think I might break out my oscilloscope this weekend and have a look at how 
clean the supply rails are around the cpu and northbridge and southbridge. 
Who knows I might get lucky and see some unexpected ripple or spikes.

> 
> Whether it is really possible that my board can really generate this sound, I 
> don't know.  Though, I have once determined that resonance was occuring in an 
> old system, causing unstable CPU operation.  It wasn't that I heard a sound 
> coming from it =).  But what I thought was the case was causing it, and pulled 
> it out of the case.  I ran it on the table and found it to be stable.  That 
> was the only thing wrong.  I've also studied resonance before a bit.  I know 
> resonance can break systems.  But to think that my board is doing emmitting 
> noise like that is pretty bizarre.

Not as bizarre as you may think. I have heard coils and even capacitors "sing"
in years past whilst servicing electronics.

> 
> It may be true that this Shuttle board may have resonance problems.  So that 
> would indicate that they did something much like you describe by changing the 
> C1 handshake time?  Isn't that much like what your patch does?

I had not really thought about it from that perspective. Whilst my patch cannot 
alter the handshake times it does prevent consecutive C1 cycles from occurring
too close together. Too close together I think being less than about 800ns. I 
guess I could look at that with a CRO too - use an appropriate pin as the trigger
source and see if supply rails have load dump voltage rises when going into
disconnect. Maybe rail voltage rings for about 700ns and might be out of 
tolerance inside Athlon during that time. Would be very interesting if a 
few hundred picofarad of low ESR decoupling cap placed on a supply rail near a
chip makes a difference? A pinout of the nforce2 chipset would help a great deal
here but I do not have one. Can anyone oblige me?
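
The spacing idea described above can be sketched as a small guard. This is
only an illustration under stated assumptions, not the actual patch: the
function name, the nanosecond timestamps, and the constant (taken from the
"about 800ns" estimate above) are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed minimum gap between two C1 cycles, from the ~800 ns estimate. */
#define MIN_C1_GAP_NS 800u

/*
 * Hypothetical helper: returns true when enough time has passed since the
 * previous C1 cycle ended for another one to be started.
 */
static bool c1_entry_allowed(uint64_t now_ns, uint64_t last_c1_exit_ns)
{
	/* A timestamp that went backwards is treated as "too soon". */
	if (now_ns < last_c1_exit_ns)
		return false;

	return (now_ns - last_c1_exit_ns) >= MIN_C1_GAP_NS;
}
```

An idle loop built on this would simply skip the disconnect and spin or
halt normally whenever the helper returns false.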

> 
> 
> > 
> > > hang issue is completely explained and solved.
> > 
> > I have had good (100%) success in reproducing the fault with the Albatron 
> > KM18G pro MOBO. I needed m-atx form factor and distributor was local to me.
> > Makes very nice - cheap and stable system but only with the lockup workaround.
> > 
> > I also recollect that Windows had lockups with nforce2 for a while depending 
> > whether you ran the Nvidia or Microsoft driver.
> > http://lkml.org/lkml/2003/12/13/5
> > Anybody got the inside running on that one and what was different between the 
> > two drivers?
> > 
> 
> Yeah, unfortunately, I didn't save a link to the message board that I found 
> that on.  But the issue is pretty common.  I'm sure more info can be found on
> the windows side.

No tech info, but this link shows a user had lockups with Nvidia's IDE driver but
was OK with the MS one.
http://club.cdfreaks.com/showthread/t-91381.html

-Ross

> 
> Jesse
> 
> 
> 
> 



* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-29 11:44                                 ` Ross Dickson
@ 2004-04-29 11:54                                   ` Maciej W. Rozycki
  2004-04-29 12:00                                     ` Jamie Lokier
  2004-04-29 11:57                                   ` Jamie Lokier
                                                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 93+ messages in thread
From: Maciej W. Rozycki @ 2004-04-29 11:54 UTC (permalink / raw)
  To: Ross Dickson
  Cc: Jesse Allen, Len Brown, a.verweij, Prakash K. Cheemplavam,
	Craig Bradney, christian.kroener, linux-kernel, Jamie Lokier,
	Daniel Drake, Ian Kumlien, Allen Martin

On Thu, 29 Apr 2004, Ross Dickson wrote:

> > Do you really think so?  I think there may be a resonance occurring, even with 
> > this new BIOS.  I plugged in new headphones into my nforce2 onboard sound, and 
> > get a high pitched noise.  Now here is where it gets weird:  This noise does 
> > not occur on boot until sometime after the IDE driver is loaded.  I also 
> > believe it varies under a high load.  If you disable C1 disconnect, it's gone.  
> > Also I've heard a high pitched noise at certain times coming right from the 
> > computer (very faint, but I do have very good hearing, I can even hear a hush 
> > sounding from my router.  My brother was quite astonished when I pointed that 
> > out).  I try to distinguish what's doing it.  It could be the hard drive.  But 
> > when I found the other sound in the head phones, I found that the sound varies 
> > almost in unison with the sound coming from the computer.  Maybe the IDE or 
> > hard drive is related, but it is too much related to C1 disconnect.
> 
> I think I might break out my oscilloscope this weekend and have a look at how 
> clean the supply rails are around the cpu and northbridge and southbridge. 
> Who knows I might get lucky and see some unexpected ripple or spikes.

 Not necessarily related to the PSU, but the noise may actually be the
cause of spurious timer interrupts.

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +


* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-29 11:44                                 ` Ross Dickson
  2004-04-29 11:54                                   ` Maciej W. Rozycki
@ 2004-04-29 11:57                                   ` Jamie Lokier
  2004-04-29 12:16                                   ` Craig Bradney
  2004-04-29 20:24                                   ` Jesse Allen
  3 siblings, 0 replies; 93+ messages in thread
From: Jamie Lokier @ 2004-04-29 11:57 UTC (permalink / raw)
  To: Ross Dickson
  Cc: Jesse Allen, Len Brown, a.verweij, Prakash K. Cheemplavam,
	Craig Bradney, christian.kroener, linux-kernel,
	Maciej W. Rozycki, Daniel Drake, Ian Kumlien, Allen Martin

Ross Dickson wrote:
> Not as bizarre as you may think. I have heard coils and even capacitors "sing"
> in years past whilst servicing electronics.

See the thread "Increasing HZ (patch for HZ > 1000)" for something
along these lines.  The change of HZ from 100 to 1000 causes some
notebooks to make a noise.

(Mine makes a noise with both, though).

-- Jamie


* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-29 11:54                                   ` Maciej W. Rozycki
@ 2004-04-29 12:00                                     ` Jamie Lokier
  2004-04-29 12:26                                       ` Maciej W. Rozycki
  0 siblings, 1 reply; 93+ messages in thread
From: Jamie Lokier @ 2004-04-29 12:00 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: Ross Dickson, Jesse Allen, Len Brown, a.verweij,
	Prakash K. Cheemplavam, Craig Bradney, christian.kroener,
	linux-kernel, Daniel Drake, Ian Kumlien, Allen Martin

Maciej W. Rozycki wrote:
>  Not necessarily related to the PSU, but the noise may actually be the
> cause of spurious timer interrupts.

With most device interrupts, additional spurious ones don't cause any
malfunction because the driver's handler checks whether the device
actually has a condition pending.

This is the basis of shared interrupts, of course.
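
The check described above can be sketched as follows. This is a minimal
model, not kernel code: the device structure, the status bit, and the
handler name are hypothetical stand-ins for a real driver's register
access.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status bit meaning "this device raised the interrupt". */
#define DEV_STATUS_IRQ_PENDING 0x1u

/* Stand-in for a device with a readable status register. */
struct demo_device {
	uint32_t status;
	unsigned int handled;	/* interrupts actually serviced */
};

/*
 * Returns true if the device had a condition pending and it was serviced,
 * false if the interrupt was spurious or belonged to another device on
 * the same line.
 */
static bool demo_irq_handler(struct demo_device *dev)
{
	if (!(dev->status & DEV_STATUS_IRQ_PENDING))
		return false;	/* nothing pending: ignore harmlessly */

	dev->status &= ~DEV_STATUS_IRQ_PENDING;	/* acknowledge */
	dev->handled++;
	return true;
}
```

A spurious call leaves the device state untouched, which is why extra
interrupts are harmless for devices that expose such a status bit.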

Is there any way we can check the timer itself to see whether an
interrupt was caused by it, so that spurious timer interrupts are ignored?

-- Jamie


* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-29 11:44                                 ` Ross Dickson
  2004-04-29 11:54                                   ` Maciej W. Rozycki
  2004-04-29 11:57                                   ` Jamie Lokier
@ 2004-04-29 12:16                                   ` Craig Bradney
  2004-04-29 20:24                                   ` Jesse Allen
  3 siblings, 0 replies; 93+ messages in thread
From: Craig Bradney @ 2004-04-29 12:16 UTC (permalink / raw)
  To: ross
  Cc: Jesse Allen, Len Brown, a.verweij, Prakash K. Cheemplavam,
	christian.kroener, linux-kernel, Maciej W. Rozycki, Jamie Lokier,
	Daniel Drake, Ian Kumlien, Allen Martin


On Thu, 2004-04-29 at 13:44, Ross Dickson wrote:
> On Thursday 29 April 2004 06:59, Jesse Allen wrote:
> > On Wed, Apr 28, 2004 at 09:33:34PM +1000, Ross Dickson wrote:
> > > > 
> > > > It may be this board never hangs no matter what,
> > > > or perhaps C1 disconnect was simply disabled in that BIOS
> > > > b/c there was no option for it in Advanced Chipset Features
> > > > like there is for the most recent BIOS.
> > > 
> > > Maybe other MOBO manufacturers skimp on filter caps and regulator damping
> > > ability and a resonance occurs in the on-board supply rails? Do Shuttle make
> > > any claims to using an improved on board regulator? Or Shuttle may have 
> > > always programmed more time in C1 cycle handshakes if such is 
> > > configurable? 
> > 
> > > Do you really think so?  I think there may be a resonance occurring, even with 
> > this new BIOS.  I plugged in new headphones into my nforce2 onboard sound, and 
> > get a high pitched noise.  Now here is where it gets weird:  This noise does 
> > not occur on boot until sometime after the IDE driver is loaded.  I also 
> > believe it varies under a high load.  If you disable C1 disconnect, it's gone.  
> > Also I've heard a high pitched noise at certain times coming right from the 
> > > computer (very faint, but I do have very good hearing, I can even hear a hush 
> > > sounding from my router.  My brother was quite astonished when I pointed that 
> > > out).  I try to distinguish what's doing it.  It could be the hard drive.  But 
> > when I found the other sound in the head phones, I found that the sound varies 
> > almost in unison with the sound coming from the computer.  Maybe the IDE or 
> > hard drive is related, but it is too much related to C1 disconnect.
> 
> I think I might break out my oscilloscope this weekend and have a look at how 
> clean the supply rails are around the cpu and northbridge and southbridge. 
> Who knows I might get lucky and see some unexpected ripple or spikes.
> 
> > 
> > Whether it is really possible that my board can really generate this sound, I 
> > > don't know.  Though, I have once determined that resonance was occurring in an 
> > old system, causing unstable CPU operation.  It wasn't that I heard a sound 
> > coming from it =).  But what I thought was the case was causing it, and pulled 
> > it out of the case.  I ran it on the table and found it to be stable.  That 
> > was the only thing wrong.  I've also studied resonance before a bit.  I know 
> > > resonance can break systems.  But to think that my board is emitting 
> > noise like that is pretty bizarre.
> 
> Not as bizarre as you may think. I have heard coils and even capacitors "sing"
> in years past whilst servicing electronics.
> 
> > 
> > It may be true that this Shuttle board may have resonance problems.  So that 
> > would indicate that they did something much like you describe by changing the 
> > C1 handshake time?  Isn't that much like what your patch does?
> 
> I had not really thought about it from that perspective. Whilst my patch cannot 
> alter the handshake times it does prevent consecutive C1 cycles from occurring
> too close together. Too close together I think being less than about 800ns. I 
> guess I could look at that with a CRO too - use an appropriate pin as the trigger
> source and see if supply rails have load dump voltage rises when going into
> disconnect. Maybe rail voltage rings for about 700ns and might be out of 
> tolerance inside Athlon during that time. Would be very interesting if a 
> few hundred picofarad of low ESR decoupling cap placed on a supply rail near a
> chip makes a difference? A pinout of the nforce2 chipset would help a great deal
> here but I do not have one. Can anyone oblige me?
> 
> > 
> > 
> > > 
> > > > hang issue is completely explained and solved.
> > > 
> > > I have had good (100%) success in reproducing the fault with the Albatron 
> > > KM18G pro MOBO. I needed m-atx form factor and distributor was local to me.
> > > Makes very nice - cheap and stable system but only with the lockup workaround.
> > > 
> > > I also recollect that Windows had lockups with nforce2 for a while depending 
> > > whether you ran the Nvidia or Microsoft driver.
> > > http://lkml.org/lkml/2003/12/13/5
> > > Anybody got the inside running on that one and what was different between the 
> > > two drivers?
> > > 
> > 
> > Yeah, unfortunately, I didn't save a link to the message board that I found 
> > that on.  But the issue is pretty common.  I'm sure more info can be found on
> > the windows side.
> 
> No tech info, but this link shows a user had lockups with Nvidia's IDE driver but
> was OK with the MS one.
> http://club.cdfreaks.com/showthread/t-91381.html
> 
> -

This has become a rather interesting problem to watch from afar. The
Athlon here seems to have no issues with the NForce driver under Windows
(I don't burn a lot of DVDs on it, though). Whenever it's in Linux, it's
mainly a testing machine these days.

It will be interesting to see if there's a real hardware problem and then
if it can be worked around in software (can't imagine a single product
recall happening).


* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-29 12:00                                     ` Jamie Lokier
@ 2004-04-29 12:26                                       ` Maciej W. Rozycki
  0 siblings, 0 replies; 93+ messages in thread
From: Maciej W. Rozycki @ 2004-04-29 12:26 UTC (permalink / raw)
  To: Jamie Lokier
  Cc: Ross Dickson, Jesse Allen, Len Brown, a.verweij,
	Prakash K. Cheemplavam, Craig Bradney, christian.kroener,
	linux-kernel, Daniel Drake, Ian Kumlien, Allen Martin

On Thu, 29 Apr 2004, Jamie Lokier wrote:

> >  Not necessarily related to the PSU, but the noise may actually be the
> > cause of spurious timer interrupts.
> 
> With most device interrupts, additional spurious ones don't cause any
> malfunction because the driver's handler checks whether the device
> actually has a condition pending.

 Note the 8254 timer uses edge-triggered interrupts and is just a square
wave signal.  There's no acking to deassert the interrupt -- it goes away
spontaneously after a predefined time.

> This is the basis of shared interrupts, of course.

 Yep, but the timer is non-shareable by definition.

> Is there any way we can check the timer itself to see whether an
> interrupt was caused by it, so that spurious timer interrupts are ignored?

 This may be possible, but complicated and likely unreliable -- an I/O
APIC may deliver a spurious interrupt at the time a real one would be
probable and you can't check if a period between two consecutive timer
interrupts is appropriate without an additional time reference, which may
be unavailable (like the TSC).

 Note the timer is special -- we don't really do any device handling, but
we want to get periodic interrupts at the right times to have a time
reference.  Coalescing interrupts or discarding spurious ones, which is
normal and acceptable for regular devices, doesn't work here.
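
 To make the difficulty concrete, here is a rough sketch of the interval
check discussed above.  It is an illustration under assumptions only: the
function and parameter names are hypothetical, and it presumes a
free-running reference counter of known frequency is available, which as
noted may not be the case.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: given two readings of a free-running reference
 * counter (ticking at ref_hz) taken at consecutive timer interrupts,
 * decide whether the gap is within tolerance_pct percent of one tick
 * period at timer_hz.
 */
static bool tick_gap_plausible(uint64_t prev_ref, uint64_t now_ref,
			       uint64_t ref_hz, uint32_t timer_hz,
			       uint32_t tolerance_pct)
{
	uint64_t expected, slack, gap;

	if (timer_hz == 0 || now_ref < prev_ref)
		return false;

	expected = ref_hz / timer_hz;		/* counts per tick */
	slack = expected * tolerance_pct / 100;
	if (slack > expected)
		slack = expected;		/* avoid underflow below */
	gap = now_ref - prev_ref;

	return gap >= expected - slack && gap <= expected + slack;
}
```

 Even with such a check, a spurious interrupt that happens to arrive close
to the time a real one is due passes the test, so it cannot tell the two
apart reliably.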

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +


* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-29 11:44                                 ` Ross Dickson
                                                     ` (2 preceding siblings ...)
  2004-04-29 12:16                                   ` Craig Bradney
@ 2004-04-29 20:24                                   ` Jesse Allen
  2004-04-29 20:31                                     ` Prakash K. Cheemplavam
  3 siblings, 1 reply; 93+ messages in thread
From: Jesse Allen @ 2004-04-29 20:24 UTC (permalink / raw)
  To: Ross Dickson
  Cc: Len Brown, a.verweij, Prakash K. Cheemplavam, Craig Bradney,
	christian.kroener, linux-kernel, Maciej W. Rozycki, Jamie Lokier,
	Daniel Drake, Ian Kumlien, Allen Martin

On Thu, Apr 29, 2004 at 09:44:37PM +1000, Ross Dickson wrote:
> On Thursday 29 April 2004 06:59, Jesse Allen wrote:
> > almost in unison with the sound coming from the computer.  Maybe the IDE or 
> > hard drive is related, but it is too much related to C1 disconnect.
> 
> I think I might break out my oscilloscope this weekend and have a look at how 
> clean the supply rails are around the cpu and northbridge and southbridge. 
> Who knows I might get lucky and see some unexpected ripple or spikes.

I'd be interested in knowing the results.

> > resonance can break systems.  But to think that my board is emitting 
> > noise like that is pretty bizarre.
> 
> Not as bizarre as you may think. I have heard coils and even capacitors "sing"
> in years past whilst servicing electronics.

Yes, I know that these things can theoretically happen.  But when it happens
to me, it's a surprise.  An electronics genius probably encounters it 
more often. =)

> > C1 handshake time?  Isn't that much like what your patch does?
> 
> I had not really thought about it from that perspective. Whilst my patch cannot 
> alter the handshake times it does prevent consecutive C1 cycles from occurring
> too close together. Too close together I think being less than about 800ns. I 

ah, ok.

> guess I could look at that with a CRO too - use an appropriate pin as the 
> trigger source and see if supply rails have load dump voltage rises when 
> going into disconnect. Maybe rail voltage rings for about 700ns and might be 
> out of tolerance inside Athlon during that time. Would be very interesting if
> a few hundred picofarad of low ESR decoupling cap placed on a supply rail 
> near a chip makes a difference? A pinout of the nforce2 chipset would help a 
> great deal here but I do not have one. Can anyone oblige me?


What I'd like to know is where the sound chip is really at on my board.  I've 
tried looking before, but find myself confused.

A pic:
http://us.shuttle.com/images/productimages/AN35.jpg

According to a diagram that I have, it points to an AC'97 6-CH AUDIO as a chip
near the top of the board in the image that I link to, above the 2nd PCI slot 
left of the AGP.  But I am also left thinking, how does the NForce2 MCP come 
into play.  Specs would help.  Maybe if we can figure out how the sound is 
wired on the board, we could also trace the source of noise to the exact 
component.

Jesse


* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-29 20:24                                   ` Jesse Allen
@ 2004-04-29 20:31                                     ` Prakash K. Cheemplavam
  2004-05-03 20:45                                       ` Jesse Allen
  0 siblings, 1 reply; 93+ messages in thread
From: Prakash K. Cheemplavam @ 2004-04-29 20:31 UTC (permalink / raw)
  To: Jesse Allen
  Cc: Ross Dickson, Len Brown, a.verweij, Craig Bradney,
	christian.kroener, linux-kernel, Maciej W. Rozycki, Jamie Lokier,
	Daniel Drake, Ian Kumlien, Allen Martin

Jesse Allen wrote:
> What I'd like to know is where the sound chip is really at on my board.  I've 
> tried looking before, but find myself confused.
> 
> A pic:
> http://us.shuttle.com/images/productimages/AN35.jpg
> 
> According to a diagram that I have, it points to an AC'97 6-CH AUDIO as a chip
> near the top of the board in the image that I link to, above the 2nd PCI slot 
> left of the AGP.  But I am also left thinking, how does the NForce2 MCP come 
> into play.  Specs would help.  Maybe if we can figure out how the sound is 
> wired on the board, we could also trace the source of noise to the exact 
> component.

Yes, I also think the chip above the 2nd PCI slot is the right one. You can 
see the Realtek logo. It is only an AC'97 codec (basically not more than a 
DAC and ADC) and Linux currently only has drivers for this. The MCP-T 
has an APU, which could do DSP stuff in hardware, but there are still no 
drivers (Hello Nvidia?), so all of this is done in software. (The APU has 
even more functionality, like DD5.1 realtime encoding, fx, and whatever). In 
our case, the APU shouldn't cause any trouble, as it is not used. With 
the APU, the nforce2 chipset behaves like a "real" soundcard. Without it, its 
sound abilities are no better than the average mainboard's onboard sound.

Prakash


* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-21 22:41                 ` Len Brown
                                     ` (3 preceding siblings ...)
  2004-04-22 16:39                   ` Jesse Allen
@ 2004-05-01  6:51                   ` Prakash K. Cheemplavam
  4 siblings, 0 replies; 93+ messages in thread
From: Prakash K. Cheemplavam @ 2004-05-01  6:51 UTC (permalink / raw)
  To: Len Brown
  Cc: Craig Bradney, ross, christian.kroener, linux-kernel,
	Maciej W. Rozycki, Jamie Lokier, Daniel Drake, Ian Kumlien,
	Jesse Allen, a.verweij, Allen Martin

Hi Len,

don't you want to change your wip.patch so that it always 
activates on nforce2? Allen said that "The 8254 PIT is hardwired to IRQ0 
on all nForce chipsets, it can't be routed.", which I guess is what you 
needed to know. If this statement doesn't apply to the timer fix, here is 
the dmidecode change for the newest BIOS for Abit NF7-S v2. You just need to
change

MATCH(DMI_BIOS_DATE, "03/24/2004")

to

MATCH(DMI_BIOS_DATE, "04/22/2004")

Prakash



> +	{ ignore_timer_override, "Abit NF7-S v2", {
> +			MATCH(DMI_BOARD_VENDOR, "http://www.abit.com.tw/"),
> +			MATCH(DMI_BOARD_NAME, "NF7-S/NF7,NF7-V (nVidia-nForce2)"),
> +			MATCH(DMI_BIOS_VERSION, "6.00 PG"),
> +			MATCH(DMI_BIOS_DATE, "03/24/2004") }},
> +
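
The reason the date has to be edited for every BIOS release can be shown
with a small model of such a match table. This is a sketch under
assumptions, not the kernel's DMI code: the structures, the substring
comparison, and the "NULL means any" convention are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the identification strings a board reports. */
struct board_info {
	const char *board_name;
	const char *bios_date;
};

/* One quirk entry; a NULL field means "match anything". */
struct quirk_entry {
	const char *board_name;
	const char *bios_date;
};

/* Hypothetical matcher: every non-NULL field must match as a substring. */
static bool quirk_matches(const struct quirk_entry *q,
			  const struct board_info *b)
{
	if (q->board_name && !strstr(b->board_name, q->board_name))
		return false;
	if (q->bios_date && !strstr(b->bios_date, q->bios_date))
		return false;
	return true;
}
```

An entry that pins the BIOS date stops matching as soon as the board is
flashed with a newer BIOS, while an entry without the date keeps matching,
which is the argument for activating the workaround on the chipset rather
than on individual BIOS builds.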


* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-29 20:31                                     ` Prakash K. Cheemplavam
@ 2004-05-03 20:45                                       ` Jesse Allen
  2004-05-17 15:26                                         ` Prakash K. Cheemplavam
  0 siblings, 1 reply; 93+ messages in thread
From: Jesse Allen @ 2004-05-03 20:45 UTC (permalink / raw)
  To: Prakash K. Cheemplavam
  Cc: Ross Dickson, Len Brown, a.verweij, Craig Bradney,
	christian.kroener, linux-kernel, Maciej W. Rozycki, Jamie Lokier,
	Daniel Drake, Ian Kumlien, Allen Martin

On Thu, Apr 29, 2004 at 10:31:52PM +0200, Prakash K. Cheemplavam wrote:
> Jesse Allen wrote:
> >What I'd like to know is where the sound chip is really at on my board.  
> 
> Yes, I also think the chip above the 2nd PCI slot is the right one. You can 
> see the Realtek logo. It is only an AC'97 codec (basically not more than a 
> DAC and ADC) and Linux currently only has drivers for this. The MCP-T 
> has an APU, which could do DSP stuff in hardware, but there are still no 
> drivers (Hello Nvidia?), so all of this is done in software. (The APU has 
> even more functionality, like DD5.1 realtime encoding, fx, and whatever). In 
> our case, the APU shouldn't cause any trouble, as it is not used. With 
> the APU, the nforce2 chipset behaves like a "real" soundcard. Without it, its 
> sound abilities are no better than the average mainboard's onboard sound.
> 
> Prakash
> 

Thanks.  I've also had someone report the same problem to me
with the Asus A7N8X board.  Also note that I don't have the intel8x0 driver
loaded and it will still do it.  I can even disable the onboard sound in the
BIOS and it will _still_ have the sound on the speaker out.  Want to know how
the sound varies?  Try compiling a Linux kernel.  Between executing make
processes the sound will vary a lot (from nothing to the annoying pitch).
The sound is probably too faint to be heard on speakers, but on headphones
you probably will hear it.  To me, this is indicative of the C1 disconnects.

Jesse



* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-05-03 20:45                                       ` Jesse Allen
@ 2004-05-17 15:26                                         ` Prakash K. Cheemplavam
  2004-05-17 19:32                                           ` Craig Bradney
  0 siblings, 1 reply; 93+ messages in thread
From: Prakash K. Cheemplavam @ 2004-05-17 15:26 UTC (permalink / raw)
  To: Jesse Allen
  Cc: Ross Dickson, Len Brown, a.verweij, Craig Bradney,
	christian.kroener, linux-kernel, Maciej W. Rozycki, Jamie Lokier,
	Daniel Drake, Ian Kumlien, Allen Martin

Hi all,

I just made an interesting finding and would like to have comments from 
NVidia:

Chip   Current Value   New Value
C17       1F0FFF01     1F01FF01
C18D      9F0FFF01     9F01FF01

In fact I have the newer chip revision (lspci says c1), but due to a 
post at Abit Forums I tried to use the value for the older revision on 
my board, and guess what: I never had such low idle temps! I am 
currently even using nvidia binary graphics driver and usually I would 
be having around 49-51°C idle temp, but now it is around 45°C, and it 
was not the first boot (then the mobo usually shows 5°C less). Instead 
the temp steadily fell from >50°C to 45°C.

(esp @nvidia:) Is there anything evil using the old chip's value for the 
new chip? So far I haven't noticed any bad thing about it. Perhaps some 
daring nforce2 user with the new revision should try as well.


bye,

Prakash


* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-05-17 15:26                                         ` Prakash K. Cheemplavam
@ 2004-05-17 19:32                                           ` Craig Bradney
  2004-05-17 19:37                                             ` Prakash K. Cheemplavam
  0 siblings, 1 reply; 93+ messages in thread
From: Craig Bradney @ 2004-05-17 19:32 UTC (permalink / raw)
  To: Prakash K. Cheemplavam
  Cc: Jesse Allen, Ross Dickson, Len Brown, a.verweij,
	christian.kroener, linux-kernel, Maciej W. Rozycki, Jamie Lokier,
	Daniel Drake, Ian Kumlien, Allen Martin


On Mon, 2004-05-17 at 17:26, Prakash K. Cheemplavam wrote:
> Hi all,
> 
> I just made an interesting finding and would like to have comments from 
> NVidia:
> 
> Chip   Current Value   New Value
> C17       1F0FFF01     1F01FF01
> C18D      9F0FFF01     9F01FF01
> 
> In fact I have the newer chip revision (lspci says c1), but due to a 
> post at Abit Forums I tried to use the value for the older revision on 
> my board, and guess what: I never had such low idle temps! I am 
> currently even using nvidia binary graphics driver and usually I would 
> be having around 49-51°C idle temp, but now it is around 45°C, and it 
> was not the first boot (then the mobo usually shows 5°C less). Instead 
> the temp steadily fell from >50°C to 45°C.
> 
> (esp @nvidia:) Is there anything evil using the old chip's value for the 
> new chip? So far I haven't noticed any bad thing about it. Perhaps some 
> daring nforce2 user with the new revision should try as well.
> 

Isn't it the case that that change is the one that brings about
stability? It was indicated before to be the main thing causing the C1halt
crashes.

Craig


* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-05-17 19:32                                           ` Craig Bradney
@ 2004-05-17 19:37                                             ` Prakash K. Cheemplavam
  2004-05-17 19:57                                               ` Craig Bradney
  0 siblings, 1 reply; 93+ messages in thread
From: Prakash K. Cheemplavam @ 2004-05-17 19:37 UTC (permalink / raw)
  To: Craig Bradney
  Cc: Jesse Allen, Ross Dickson, Len Brown, a.verweij,
	christian.kroener, linux-kernel, Maciej W. Rozycki, Jamie Lokier,
	Daniel Drake, Ian Kumlien, Allen Martin

Craig Bradney wrote:
> On Mon, 2004-05-17 at 17:26, Prakash K. Cheemplavam wrote:
> 
>>Hi all,
>>
>>I just made an interesting finding and would like to have comments from 
>>NVidia:
>>
>>Chip   Current Value   New Value
>>C17       1F0FFF01     1F01FF01
>>C18D      9F0FFF01     9F01FF01
>>
>>In fact I have the newer chip revision (lspci says c1), but due to a 
>>post at Abit Forums I tried to use the value for the older revision on 
>>my board, and guess what: I never had such low idle temps! I am 
>>currently even using nvidia binary graphics driver and usually I would 
>>be having around 49-51°C idle temp, but now it is around 45°C, and it 
>>was not the first boot (then the mobo usually shows 5°C less). Instead 
>>the temp steadily fell from >50°C to 45°C.
>>
>>(esp @nvidia:) Is there anything evil using the old chip's value for the 
>>new chip? So far I haven't noticed any bad thing about it. Perhaps some 
>>daring nforce2 user with the new revision should try as well.
>>
> 
> 
> Isn't it the case that that change is the one that brings about
> stability? It was indicated before to be the main thing causing the C1halt
> crashes.

Nope, I am changing the 9F to 1F. The "stability byte" was changing the 
0F to 01. I am now using 1F01FF01 instead of 9F01FF01. I guess I wasn't 
clear enough.
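
The two changes are easy to tell apart by XORing the values from the table.
The sketch below only shows that arithmetic; the helper name is
hypothetical, and what the individual bits mean in the chipset is not
stated in this thread.

```c
#include <stdint.h>

/* Returns a mask of the bits that differ between two register values. */
static uint32_t changed_bits(uint32_t before, uint32_t after)
{
	return before ^ after;
}

/*
 * Earlier stability change (0F -> 01 in the second byte):
 *   changed_bits(0x9F0FFF01, 0x9F01FF01) == 0x000E0000
 * Change tried here (9F -> 1F in the top byte):
 *   changed_bits(0x9F01FF01, 0x1F01FF01) == 0x80000000
 */
```

So the earlier stability change touched bits 17 to 19, while the change
tried here clears only bit 31.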

Prakash


* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-05-17 19:37                                             ` Prakash K. Cheemplavam
@ 2004-05-17 19:57                                               ` Craig Bradney
  0 siblings, 0 replies; 93+ messages in thread
From: Craig Bradney @ 2004-05-17 19:57 UTC (permalink / raw)
  To: Prakash K. Cheemplavam
  Cc: Jesse Allen, Ross Dickson, Len Brown, a.verweij,
	christian.kroener, linux-kernel, Maciej W. Rozycki, Jamie Lokier,
	Daniel Drake, Ian Kumlien, Allen Martin


On Mon, 2004-05-17 at 21:37, Prakash K. Cheemplavam wrote:
> Craig Bradney wrote:
> > On Mon, 2004-05-17 at 17:26, Prakash K. Cheemplavam wrote:
> > 
> >>Hi all,
> >>
> >>I just made an interesting finding and would like to have comments from 
> >>NVidia:
> >>
> >>Chip   Current Value   New Value
> >>C17       1F0FFF01     1F01FF01
> >>C18D      9F0FFF01     9F01FF01
> >>
> >>In fact I have the newer chip revision (lspci says c1), but due to a 
> >>post at Abit Forums I tried to use the value for the older revision on 
> >>my board, and guess what: I never had such low idle temps! I am 
> >>currently even using nvidia binary graphics driver and usually I would 
> >>be having around 49-51°C idle temp, but now it is around 45°C, and it 
> >>was not the first boot (then the mobo usually shows 5°C less). Instead 
> >>the temp steadily fell from >50°C to 45°C.
> >>
> >>(esp @nvidia:) Is there anything evil using the old chip's value for the 
> >>new chip? So far I haven't noticed any bad thing about it. Perhaps some 
> >>daring nforce2 user with the new revision should try as well.
> >>
> > 
> > 
> > Isn't it the case that that change is the one that brings about
> > stability? It was indicated before to be the main thing causing the C1halt
> > crashes.
> 
> Nope, I am changing the 9F to 1F. The "stability byte" was changing the 
> 0F to 01. I am now using 1F01FF01 instead of 9F01FF01. I guess I wasn't 
> clear enough.

And I wasn't looking hard enough at those characters. :) Interesting
find. Allen, any comments?

Craig


* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-05-07  4:47 ` Richard James
  2004-05-07  7:13   ` Craig Bradney
@ 2004-05-08  5:33   ` Richard James
  1 sibling, 0 replies; 93+ messages in thread
From: Richard James @ 2004-05-08  5:33 UTC (permalink / raw)
  To: Richard James; +Cc: Jesse Allen, linux-kernel

Richard James wrote:

> ASUS have now supplied a BIOS update for the A7N8X-X which fixes the 
> C1 halt crash.
> dated the 2004/04/21.  So I assume that they will supply a patch for 
> all nforce2 motherboards.


No, this is wrong: after retesting with a clean kernel the machine still 
locks up. BIOS 1009 does nothing for us.

Richard James.


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-05-07  4:47 ` Richard James
@ 2004-05-07  7:13   ` Craig Bradney
  2004-05-08  5:33   ` Richard James
  1 sibling, 0 replies; 93+ messages in thread
From: Craig Bradney @ 2004-05-07  7:13 UTC (permalink / raw)
  To: Richard James; +Cc: Jesse Allen, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 936 bytes --]

On Fri, 2004-05-07 at 06:47, Richard James wrote:
> Jesse Allen wrote:
> 
> >Len Brown wrote:
> >  
> >
> >>Have you been able to hang the AN35N under any conditions?
> >>Old BIOS, non-vanilla kernel?
> >>    
> >>
> >
> >Yes, and I described that it will hang under the pre-Dec 5th BIOS in another 
> >mail.
> >
> >I still have images of the buggy BIOS, and the fixed one on my hard drive.
> >They are also available at ftp://ftp.shuttle.com/BIOS/an35_n/ as
> >an35s00j.bin (Oct 2003)
> >an35s00l.bin (Dec 5th 2003)
> >
> >  
> >
> ASUS have now supplied a BIOS update for the A7N8X-X which fixes the C1 
> halt crash.
> dated the 2004/04/21.  So I assume that they will supply a patch for all 
> nforce2 motherboards.

Only for the A7N8X-X though. I like their description of the fixes:
1. Improve some memory modules stability.

Did you apply it and then run lspci -xxx -vvv on it to find out?
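
The value to look for in that dump is the dword at config offset 0x6c of
the nForce2 host bridge, the register the fixup patch in this thread
reads and writes. In lspci -xxx output that is the last four bytes of the
"60:" row, least significant byte first. A rough sketch for pulling it out
of a raw config-space image follows; the function names are made up, and
the sysfs path in the note below is an assumption about where the host
bridge sits on a given board.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define NFORCE2_C1_REG 0x6c	/* offset used by the fixup patch */

/* Assemble a little-endian dword from a PCI config-space image. */
static uint32_t config_dword(const uint8_t *cfg, size_t len, size_t offset)
{
	assert(offset + 4 <= len);
	return (uint32_t)cfg[offset] |
	       ((uint32_t)cfg[offset + 1] << 8) |
	       ((uint32_t)cfg[offset + 2] << 16) |
	       ((uint32_t)cfg[offset + 3] << 24);
}

/* Read `len` bytes of config space from a file; 0 on success, -1 on error. */
static int read_config_file(const char *path, uint8_t *cfg, size_t len)
{
	FILE *f = fopen(path, "rb");
	size_t got;

	if (!f)
		return -1;
	got = fread(cfg, 1, len, f);
	fclose(f);
	return got == len ? 0 : -1;
}
```

On a 2.6 kernel with sysfs one could try read_config_file() on something
like /sys/bus/pci/devices/0000:00:00.0/config (assumed address, and root is
usually needed to read past the standard header), then compare
config_dword(cfg, 256, NFORCE2_C1_REG) before and after the BIOS update.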

Craig

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-23  1:30 Jesse Allen
@ 2004-05-07  4:47 ` Richard James
  2004-05-07  7:13   ` Craig Bradney
  2004-05-08  5:33   ` Richard James
  0 siblings, 2 replies; 93+ messages in thread
From: Richard James @ 2004-05-07  4:47 UTC (permalink / raw)
  To: Jesse Allen; +Cc: linux-kernel

Jesse Allen wrote:

>Len Brown wrote:
>  
>
>>Have you been able to hang the AN35N under any conditions?
>>Old BIOS, non-vanilla kernel?
>>    
>>
>
>Yes, and I described that it will hang under the pre-Dec 5th BIOS in another 
>mail.
>
>I still have images of the buggy BIOS, and the fixed one on my hard drive.
>They are also available at ftp://ftp.shuttle.com/BIOS/an35_n/ as
>an35s00j.bin (Oct 2003)
>an35s00l.bin (Dec 5th 2003)
>
>  
>
ASUS have now supplied a BIOS update for the A7N8X-X which fixes the C1 
halt crash.
dated the 2004/04/21.  So I assume that they will supply a patch for all 
nforce2 motherboards.

Richard James


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-05-05 13:08       ` Ian Kumlien
@ 2004-05-06  1:50         ` Jesse Allen
  0 siblings, 0 replies; 93+ messages in thread
From: Jesse Allen @ 2004-05-06  1:50 UTC (permalink / raw)
  To: Ian Kumlien; +Cc: ross, linux-kernel

On Wed, May 05, 2004 at 03:08:01PM +0200, Ian Kumlien wrote:
> On Wed, 2004-05-05 at 14:52, Ross Dickson wrote:
> > On Wednesday 05 May 2004 22:18, Ian Kumlien wrote:
> > > On Wed, 2004-05-05 at 13:24, Ross Dickson wrote:
> > > <snip>
> > > > They can't see through their Windows.??!@@#$$%%&*&
> > > > 
> > > > ML1-0505-19 Re: Cause of lockups with KM-18G Pro is incorrect pci reg values in bios -please update bios
> > > > 
> > > > From: 
> > > > "dr.pro" <dr.pro@albatron.com.tw>
> > > > 
> > > > To: 
> > > > <ross@datscreative.com.au>
> > > > 
> > > > Date: 
> > > > Today 17:38:08
> > > > 
> > > >   Dear Ross,
> > > > 
> > > >   Thank you very much for contacting Albatron technical support.
> > > > 
> > > > >   KM18G Pro has been proved under Windows 98SE/ME/2000/XP but Linux, so
> > > > > you may encounter problems with it under Linux. We suggest you use 
> > > > Windows 98SE/ME/2000/XP for the stable performance. Sorry for the 
> > > > inconvenience and please kindly understand it.
> > > > 
> > > >   Please let us know if you have any question.

!!!

> > > 
> > > Please kindly understand it? I wouldn't... I'm about to bash asus, so...
> > > This information gets me in the moood to do some real bashing =)
> > > 
> > > Btw, does windows do a C1 disconnect? And if so how often?
> > 
> > I think it does as temps are lower than linux without disconnect.
> > Here are some temperatures from my machine read from the bios on reboot.
> > I gave it minimal activity for the minutes prior to reboot.
> > 
> >  Win98, 47C
> >  XPHome, 42C
> >  Patched Linux 2.4.24 (1000Hz), 40C
> >  Linux 2.6.3-rc1-mm1, 53C  with no disconnect

Patched AN35N Bios w/ Linux, C1 Disconnect on:
idle system, 39-41 C
heavy activity, 50-51 C

Though I have since added two additional fans to my system.  When it is under
heavy activity, it will obviously go up to 51 C.  When it is finished and 
becomes idle again, then the CPU temp will quickly go back down to 39-41 C
because the additional fans remove the heat quite effectively.

> >  
> > I think the disconnect happens for less time percentage. With slower
> > ticks one might assume less often than linux. 
> > -Ross
> 
> Which means that the problem isn't as likely to occur under Windows,
> which also explains why mb-manuf ppl are lazy =P.
>  


Ross, you should reply to them and say the problem affects windows as well.  I
can't imagine it immune.  Although, windows is not that aggressive, I think 
it's still affected.  It doesn't matter that people don't think windows has
this hang.  It still has it.  It doesn't matter that they can't reproduce it 
under what they think are normal circumstances.  It _still_ has it.

Actually it isn't an OS problem, it's a BIOS problem.  It has nothing to do with
the OS.  It's the quality of their boards we're talking about.  If only they
get that...

Jesse



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-05-05 12:48   ` Patrick Dreker
@ 2004-05-05 13:34     ` Patrick Dreker
  0 siblings, 0 replies; 93+ messages in thread
From: Patrick Dreker @ 2004-05-05 13:34 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Allen Martin, linux-kernel, Ross Dickson, Len Brown

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wednesday, 5 May 2004 14:48, Patrick Dreker wrote:
> As a side note: the idle CPU temperature reported by ACPI on my Shuttle
> iDeq200N barebone has gone from approx. 50 degrees down to approx. 43
Make that *Biostar* iDeq200N...

Patrick
- -- 
Patrick Dreker

GPG KeyID  : 0xFCC2F7A7 (Patrick Dreker)
Fingerprint: 7A21 FC7F 707A C498 F370  1008 7044 66DA FCC2 F7A7
Key available from keyservers
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)

iD8DBQFAmO1ccERm2vzC96cRAom8AJoDAOZ9aiTVoxfbr88BptRt29yHAwCghl9j
IM91QHnnHlTnuOJ1sf/i3Jw=
=XCJ4
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-05-05 13:12       ` Ross Dickson
@ 2004-05-05 13:23         ` Ian Kumlien
  0 siblings, 0 replies; 93+ messages in thread
From: Ian Kumlien @ 2004-05-05 13:23 UTC (permalink / raw)
  To: ross
  Cc: Bartlomiej Zolnierkiewicz, Allen Martin, linux-kernel, Len Brown,
	Jesse Allen, Prakash K. Cheemplavam, Craig Bradney,
	christian.kroener, Maciej W. Rozycki, Jamie Lokier, Daniel Drake

[-- Attachment #1: Type: text/plain, Size: 2665 bytes --]

On Wed, 2004-05-05 at 15:12, Ross Dickson wrote:
> On Wednesday 05 May 2004 22:27, Ian Kumlien wrote:
> > On Wed, 2004-05-05 at 14:14, Ross Dickson wrote:
> > > To my knowledge the only thing left to sort out for the normal kernel
> > > distro is what to do about the timer_ack issue in check_timer().
> > > 
> > > We need it off for nforce2 to get nmi_watchdog=1 working with ioapic
> > > 8254 timer pin0  timer override patch routing. I vote to revisit Maciej's
> > > patch that was dropped by Linus after appearing in 2.6.3-mm3. 
> > > For those with problems of clock skew with the timer into pin0 routing, 
> > > that patch gave a virtual wire timer routing which worked well for those
> > > users.
> > 
> > What's the real difference between nmi_watchdog=1 and =2? Since
> > nmi_watchdog=2 works here:
> > 
> > NMI:       9884
> > LOC:   80297310
> > ERR:          0
> > MIS:          0
> 
> From memory 2 uses resources that code profiling tools need to use so
> if you can use 1 then you can have your watchdog and profile too.

Ahh ouch... 

> > Also, wouldn't it be better to not depend on bioses and bios versions
> > atm, ie hardcode pin0 since Allen Martin stated that it's hardwired on
> > pin0?
> > 
> > ie, just:
> > if(pin2 && nforce2_chip)
> > {
> > 	printk("ALERT: Known defect in bios, mail your manufacturer. Using
> > pin0\n");
> > 	<whateverisneededtousepin0>
> > }
> 
> It should be OK, but those with mobos that get clock skew on pin 0 would
> then demand a clock skew fix for their noisy hardware. I don't have a
> motherboard with skew problems.

Like: cat ntp.drift
-12.282
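
For scale, assuming the drift file follows the usual ntpd convention of a
frequency offset in parts per million, the figure converts to seconds per
day as in this small sketch (the function name is made up):

```c
#include <assert.h>

/* Convert a frequency offset in parts per million to seconds per day. */
static double drift_seconds_per_day(double ppm)
{
	assert(ppm > -1e6 && ppm < 1e6);	/* sanity bound only */
	return ppm * 1e-6 * 86400.0;
}
```

With -12.282 ppm that works out to a little over one second per day in
magnitude, which ntpd corrects for continuously.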

> Personally I think that the clock system should be made immune to noise
> generated timer interrupts just as it has been coded to detect missing
> timer interrupts. I am pretty sure on nforce2 athlon mobos the tsc is used
> in detecting missing pulses. Kind of like a digital phase locked loop? so
> should it not also debounce the interrupts given that the ioapic interrupt
> hardware cannot?
> http://linux.derkeiler.com/Mailing-Lists/Kernel/2004-04/6385.html
> Obviously the pc hardware design is flawed in this respect.

x86 is flawed in many ways, but it's cheap and you get what you pay for
=).

But wouldn't that cause problems with cpu freq scaling?

> Anyone know how to modify the existing timer tsc code to do this? And
> offer to do it? Any brand/type of mobo is open to clock speed up due
> to this effect, so I think it should be fixed, debouncing is fundamental
> to input transitions that need to be counted.

-- 
Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-05-05 12:27     ` Ian Kumlien
@ 2004-05-05 13:12       ` Ross Dickson
  2004-05-05 13:23         ` Ian Kumlien
  0 siblings, 1 reply; 93+ messages in thread
From: Ross Dickson @ 2004-05-05 13:12 UTC (permalink / raw)
  To: Ian Kumlien
  Cc: Bartlomiej Zolnierkiewicz, Allen Martin, linux-kernel, Len Brown,
	Jesse Allen, Prakash K. Cheemplavam, Craig Bradney,
	christian.kroener, Maciej W. Rozycki, Jamie Lokier, Daniel Drake

On Wednesday 05 May 2004 22:27, Ian Kumlien wrote:
> On Wed, 2004-05-05 at 14:14, Ross Dickson wrote:
> > To my knowledge the only thing left to sort out for the normal kernel
> > distro is what to do about the timer_ack issue in check_timer().
> > 
> > We need it off for nforce2 to get nmi_watchdog=1 working with ioapic
> > 8254 timer pin0  timer override patch routing. I vote to revisit Maciej's
> > patch that was dropped by Linus after appearing in 2.6.3-mm3. 
> > For those with problems of clock skew with the timer into pin0 routing, 
> > that patch gave a virtual wire timer routing which worked well for those
> > users.
> 
> What's the real difference between nmi_watchdog=1 and =2? Since
> nmi_watchdog=2 works here:
> 
> NMI:       9884
> LOC:   80297310
> ERR:          0
> MIS:          0

From memory, 2 uses resources that code profiling tools need to use, so
if you can use 1 then you can have your watchdog and profile too.
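
Whichever mode is used, one quick way to see whether the watchdog is
actually ticking is to read the NMI line of /proc/interrupts twice and
check that the count has gone up. A minimal sketch of the parsing step,
with a made-up helper name and reading only the first CPU column:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a /proc/interrupts summary line such as "NMI:       9884".
 * Stores the first counter in *count; returns 0 on success, -1 otherwise.
 */
static int parse_counter(const char *line, const char *label,
			 unsigned long *count)
{
	size_t n = strlen(label);
	const char *digits;
	char *end;

	assert(count != NULL);
	while (*line == ' ')
		line++;
	if (strncmp(line, label, n) != 0 || line[n] != ':')
		return -1;
	digits = line + n + 1;
	*count = strtoul(digits, &end, 10);	/* skips leading blanks */
	return end == digits ? -1 : 0;
}
```

Feeding it the "NMI:" line from two reads a few seconds apart and
comparing the two counts is enough to tell a live watchdog from a dead one.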

> 
> Also, wouldn't it be better to not depend on bioses and bios versions
> atm, ie hardcode pin0 since Allen Martin stated that it's hardwired on
> pin0?
> 
> ie, just:
> if(pin2 && nforce2_chip)
> {
> 	printk("ALERT: Known defect in bios, mail your manufacturer. Using
> pin0\n");
> 	<whateverisneededtousepin0>
> }

It should be OK, but those with mobos that get clock skew on pin 0 would
then demand a clock skew fix for their noisy hardware. I don't have a
motherboard with skew problems.

Personally I think that the clock system should be made immune to noise
generated timer interrupts just as it has been coded to detect missing
timer interrupts. I am pretty sure on nforce2 athlon mobos the tsc is used
in detecting missing pulses. Kind of like a digital phase locked loop? so
should it not also debounce the interrupts given that the ioapic interrupt
hardware cannot?
http://linux.derkeiler.com/Mailing-Lists/Kernel/2004-04/6385.html
Obviously the pc hardware design is flawed in this respect.

Anyone know how to modify the existing timer tsc code to do this? And
offer to do it? Any brand/type of mobo is open to clock speed up due
to this effect, so I think it should be fixed, debouncing is fundamental
to input transitions that need to be counted.
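
To make the idea concrete, here is a rough user-space sketch of such a
debounce: a timer interrupt is only counted if enough TSC cycles have
passed since the last accepted one. This is not existing kernel code; the
struct, the function names and the half-period threshold are all
assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

struct tick_filter {
	uint64_t expected_cycles;	/* TSC cycles per timer tick */
	uint64_t last_tsc;		/* TSC at the last accepted tick */
	unsigned long accepted;
	unsigned long rejected;
};

static void tick_filter_init(struct tick_filter *f, uint64_t cpu_hz,
			     unsigned int hz, uint64_t now_tsc)
{
	assert(hz > 0);
	f->expected_cycles = cpu_hz / hz;
	f->last_tsc = now_tsc;
	f->accepted = 0;
	f->rejected = 0;
}

/* Return 1 if this interrupt should count as a tick, 0 if it looks like noise. */
static int tick_filter_accept(struct tick_filter *f, uint64_t now_tsc)
{
	uint64_t elapsed = now_tsc - f->last_tsc;

	/* Anything sooner than half a period after the last tick is a bounce. */
	if (elapsed < f->expected_cycles / 2) {
		f->rejected++;
		return 0;
	}
	f->last_tsc = now_tsc;
	f->accepted++;
	return 1;
}
```

The obvious catch is that expected_cycles depends on the TSC rate, so on
hardware where the TSC follows CPU frequency changes it would have to be
recalibrated whenever the frequency changes.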
 
-Ross

> 
> Since this whole problem is pissing me off... It would be much better if
> one had some kind of access to the information from nvidia so you can
> just point at it, telling the mb-manuf. that they are morons and go fix
> =). (Did i mention that i have had this problem for quite some time and
> would have gone postal if it wasn't for Ross Dickson's fixes =))
> 
> -- 
> Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net
> 


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-05-05 12:52     ` Ross Dickson
@ 2004-05-05 13:08       ` Ian Kumlien
  2004-05-06  1:50         ` Jesse Allen
  0 siblings, 1 reply; 93+ messages in thread
From: Ian Kumlien @ 2004-05-05 13:08 UTC (permalink / raw)
  To: ross
  Cc: Allen Martin, linux-kernel, Len Brown, Jesse Allen,
	Prakash K. Cheemplavam, Craig Bradney, christian.kroener,
	Maciej W. Rozycki, Jamie Lokier, Daniel Drake

[-- Attachment #1: Type: text/plain, Size: 1858 bytes --]

On Wed, 2004-05-05 at 14:52, Ross Dickson wrote:
> On Wednesday 05 May 2004 22:18, Ian Kumlien wrote:
> > On Wed, 2004-05-05 at 13:24, Ross Dickson wrote:
> > <snip>
> > > They can't see through their Windows.??!@@#$$%%&*&
> > > 
> > > ML1-0505-19 Re: Cause of lockups with KM-18G Pro is incorrect pci reg values in bios -please update bios
> > > 
> > > From: 
> > > "dr.pro" <dr.pro@albatron.com.tw>
> > > 
> > > To: 
> > > <ross@datscreative.com.au>
> > > 
> > > Date: 
> > > Today 17:38:08
> > > 
> > >   Dear Ross,
> > > 
> > >   Thank you very much for contacting Albatron technical support.
> > > 
> > >   KM18G Pro has been proved under Windows 98SE/ME/2000/XP but Linux, so you
> > > may encounter problems with it under Linux. We suggest you use Windows
> > > 98SE/ME/2000/XP for the stable performance. Sorry for the inconvenience and
> > > please kindly understand it.
> > > 
> > >   Please let us know if you have any question.
> > 
> > Please kindly understand it? I wouldn't... I'm about to bash asus, so...
> > This information gets me in the moood to do some real bashing =)
> > 
> > Btw, does windows do a C1 disconnect? And if so how often?
> 
> I think it does as temps are lower than linux without disconnect.
> Here are some temperatures from my machine read from the bios on reboot.
> I gave it minimal activity for the minutes prior to reboot.
> 
>  Win98, 47C
>  XPHome, 42C
>  Patched Linux 2.4.24 (1000Hz), 40C
>  Linux 2.6.3-rc1-mm1, 53C  with no disconnect
>  
> I think the disconnect happens for less time percentage. With slower
> ticks one might assume less often than linux. 
> -Ross

Which means that the problem isn't as likely to occur under Windows,
which also explains why mb-manuf ppl are lazy =P.
 
-- 
Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-05-05 12:14   ` Ross Dickson
  2004-05-05 12:27     ` Ian Kumlien
@ 2004-05-05 12:58     ` Maciej W. Rozycki
  1 sibling, 0 replies; 93+ messages in thread
From: Maciej W. Rozycki @ 2004-05-05 12:58 UTC (permalink / raw)
  To: Ross Dickson
  Cc: Bartlomiej Zolnierkiewicz, Allen Martin, linux-kernel, Len Brown,
	Jesse Allen, Prakash K. Cheemplavam, Craig Bradney,
	christian.kroener, Jamie Lokier, Daniel Drake, Ian Kumlien

On Wed, 5 May 2004, Ross Dickson wrote:

> We need it off for nforce2 to get nmi_watchdog=1 working with ioapic
> 8254 timer pin0  timer override patch routing. I vote to revisit Maciej's
> patch that was dropped by Linus after appearing in 2.6.3-mm3. 
> For those with problems of clock skew with the timer into pin0 routing, 
> that patch gave a virtual wire timer routing which worked well for those
> users.
> 
> It also works around problems for ibm users.
> http://linux.derkeiler.com/Mailing-Lists/Kernel/2004-04/4421.html
> 
> That patch is last posted here (Maciej please correct me if i'm wrong)
> http://linux.derkeiler.com/Mailing-Lists/Kernel/2004-04/3174.html

 That's my latest version, although the one in -mm had a minor readability 
improvement, so you may use that one instead.

 BTW, can you please check if the chipset fixup cures the timer IRQ line
noise?

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-05-05 12:18   ` Ian Kumlien
@ 2004-05-05 12:52     ` Ross Dickson
  2004-05-05 13:08       ` Ian Kumlien
  0 siblings, 1 reply; 93+ messages in thread
From: Ross Dickson @ 2004-05-05 12:52 UTC (permalink / raw)
  To: Ian Kumlien
  Cc: Allen Martin, linux-kernel, Len Brown, Jesse Allen,
	Prakash K. Cheemplavam, Craig Bradney, christian.kroener,
	Maciej W. Rozycki, Jamie Lokier, Daniel Drake

On Wednesday 05 May 2004 22:18, Ian Kumlien wrote:
> On Wed, 2004-05-05 at 13:24, Ross Dickson wrote:
> <snip>
> > They can't see through their Windows.??!@@#$$%%&*&
> > 
> > ML1-0505-19 Re: Cause of lockups with KM-18G Pro is incorrect pci reg values in bios -please update bios
> > 
> > From: 
> > "dr.pro" <dr.pro@albatron.com.tw>
> > 
> > To: 
> > <ross@datscreative.com.au>
> > 
> > Date: 
> > Today 17:38:08
> > 
> >   Dear Ross,
> > 
> >   Thank you very much for contacting Albatron technical support.
> > 
> >   KM18G Pro has been proved under Windows 98SE/ME/2000/XP but Linux, so you
> > may encounter problems with it under Linux. We suggest you use Windows
> > 98SE/ME/2000/XP for the stable performance. Sorry for the inconvenience and
> > please kindly understand it.
> > 
> >   Please let us know if you have any question.
> 
> Please kindly understand it? I wouldn't... I'm about to bash asus, so...
> This information gets me in the moood to do some real bashing =)
> 
> Btw, does windows do a C1 disconnect? And if so how often?

I think it does as temps are lower than linux without disconnect.
Here are some temperatures from my machine read from the bios on reboot.
I gave it minimal activity for the minutes prior to reboot.

 Win98, 47C
 XPHome, 42C
 Patched Linux 2.4.24 (1000Hz), 40C
 Linux 2.6.3-rc1-mm1, 53C  with no disconnect
 
I think under windows the disconnect is active for a smaller percentage of
the time. With its slower ticks one might assume it happens less often than
under linux. 
-Ross

> 
> -- 
> Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net
> 


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-05-03 23:11 ` Bartlomiej Zolnierkiewicz
                     ` (2 preceding siblings ...)
  2004-05-05 12:14   ` Ross Dickson
@ 2004-05-05 12:48   ` Patrick Dreker
  2004-05-05 13:34     ` Patrick Dreker
  3 siblings, 1 reply; 93+ messages in thread
From: Patrick Dreker @ 2004-05-05 12:48 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Allen Martin, linux-kernel, Ross Dickson, Len Brown

[-- Attachment #1: Type: text/plain, Size: 1300 bytes --]

On Tuesday, 4 May 2004 01:11, Bartlomiej Zolnierkiewicz wrote:
> On Tuesday 04 of May 2004 00:09, Allen Martin wrote:
> > I'm happy to be able to make this information public to the Linux
> > community.  This information has been previously released to BIOS /
> > board vendors as an appnote, but in the interest of getting a workaround
> > into the hands of users the quickest we're making it public for possible
> > inclusion into the Linux kernel.
>
> This is great news!  Below is an untested patch to address this issue.
The patch also applies cleanly to kernel 2.6.5, which is what I am running. 
The machine has now been running for more than 21 hours with APIC enabled and it 
seems it is completely stable now. Without the patch I was able to lock the 
system solid in less than a minute by just pushing some MB of data across the 
LAN. I have been running continuous network copies of a 400MB file for about 
8 hours and experienced no problems.

As a side note: the idle CPU temperature reported by ACPI on my Shuttle 
iDeq200N barebone has gone from approx. 50 degrees down to approx. 43 
degrees.

Patrick
-- 
Patrick Dreker

GPG KeyID  : 0xFCC2F7A7 (Patrick Dreker)
Fingerprint: 7A21 FC7F 707A C498 F370  1008 7044 66DA FCC2 F7A7
Key available from keyservers

[-- Attachment #2: signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-05-05 12:14   ` Ross Dickson
@ 2004-05-05 12:27     ` Ian Kumlien
  2004-05-05 13:12       ` Ross Dickson
  2004-05-05 12:58     ` Maciej W. Rozycki
  1 sibling, 1 reply; 93+ messages in thread
From: Ian Kumlien @ 2004-05-05 12:27 UTC (permalink / raw)
  To: ross
  Cc: Bartlomiej Zolnierkiewicz, Allen Martin, linux-kernel, Len Brown,
	Jesse Allen, Prakash K. Cheemplavam, Craig Bradney,
	christian.kroener, Maciej W. Rozycki, Jamie Lokier, Daniel Drake

[-- Attachment #1: Type: text/plain, Size: 1473 bytes --]

On Wed, 2004-05-05 at 14:14, Ross Dickson wrote:
> To my knowledge the only thing left to sort out for the normal kernel
> distro is what to do about the timer_ack issue in check_timer().
> 
> We need it off for nforce2 to get nmi_watchdog=1 working with ioapic
> 8254 timer pin0  timer override patch routing. I vote to revisit Maciej's
> patch that was dropped by Linus after appearing in 2.6.3-mm3. 
> For those with problems of clock skew with the timer into pin0 routing, 
> that patch gave a virtual wire timer routing which worked well for those
> users.

What's the real difference between nmi_watchdog=1 and =2? Since
nmi_watchdog=2 works here:

NMI:       9884
LOC:   80297310
ERR:          0
MIS:          0

Also, wouldn't it be better to not depend on bioses and bios versions
atm, ie hardcode pin0 since Allen Martin stated that it's hardwired on
pin0?

ie, just:
if(pin2 && nforce2_chip)
{
	printk("ALERT: Known defect in bios, mail your manufacturer. Using pin0\n");
	<whateverisneededtousepin0>
}

Since this whole problem is pissing me off... It would be much better if
one had some kind of access to the information from nvidia so you can
just point at it, telling the mb-manuf. that they are morons and go fix
=). (Did i mention that i have had this problem for quite some time and
would have gone postal if it wasn't for Ross Dickson's fixes =))

-- 
Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-05-05 11:24 ` Ross Dickson
@ 2004-05-05 12:18   ` Ian Kumlien
  2004-05-05 12:52     ` Ross Dickson
  0 siblings, 1 reply; 93+ messages in thread
From: Ian Kumlien @ 2004-05-05 12:18 UTC (permalink / raw)
  To: ross
  Cc: Allen Martin, linux-kernel, Len Brown, Jesse Allen,
	Prakash K. Cheemplavam, Craig Bradney, christian.kroener,
	Maciej W. Rozycki, Jamie Lokier, Daniel Drake

[-- Attachment #1: Type: text/plain, Size: 1039 bytes --]

On Wed, 2004-05-05 at 13:24, Ross Dickson wrote:
<snip>
> They can't see through their Windows.??!@@#$$%%&*&
> 
> ML1-0505-19 Re: Cause of lockups with KM-18G Pro is incorrect pci reg values in bios -please update bios
> 
> From: 
> "dr.pro" <dr.pro@albatron.com.tw>
> 
> To: 
> <ross@datscreative.com.au>
> 
> Date: 
> Today 17:38:08
> 
>   Dear Ross,
> 
>   Thank you very much for contacting Albatron technical support.
> 
>   KM18G Pro has been proved under Windows 98SE/ME/2000/XP but Linux, so you
> may encounter problems with it under Linux. We suggest you use Windows
> 98SE/ME/2000/XP for the stable performance. Sorry for the inconvenience and
> please kindly understand it.
> 
>   Please let us know if you have any question.

Please kindly understand it? I wouldn't... I'm about to bash asus, so...
This information gets me in the moood to do some real bashing =)

Btw, does windows do a C1 disconnect? And if so how often?

-- 
Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-05-03 23:11 ` Bartlomiej Zolnierkiewicz
  2004-05-04  8:28   ` Prakash K. Cheemplavam
  2004-05-04 21:10   ` Jeff Garzik
@ 2004-05-05 12:14   ` Ross Dickson
  2004-05-05 12:27     ` Ian Kumlien
  2004-05-05 12:58     ` Maciej W. Rozycki
  2004-05-05 12:48   ` Patrick Dreker
  3 siblings, 2 replies; 93+ messages in thread
From: Ross Dickson @ 2004-05-05 12:14 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz, Allen Martin, linux-kernel
  Cc: Len Brown, Jesse Allen, Prakash K. Cheemplavam, Craig Bradney,
	christian.kroener, Maciej W. Rozycki, Jamie Lokier, Daniel Drake,
	Ian Kumlien

On Tuesday 04 May 2004 09:11, Bartlomiej Zolnierkiewicz wrote:
> On Tuesday 04 of May 2004 00:09, Allen Martin wrote:
> > I'm happy to be able to make this information public to the Linux
> > community.  This information has been previously released to BIOS /
> > board vendors as an appnote, but in the interest of getting a workaround
> > into the hands of users the quickest we're making it public for possible
> > inclusion into the Linux kernel.
> 
> This is great news!  Below is an untested patch to address this issue.
> 
> Cheers.
> 
> 
> [PATCH] fixup for C1 Halt Disconnect problem on nForce2 chipsets
> 
> Based on information provided by "Allen Martin" <AMartin@nvidia.com>.
> 
>  linux-2.6.6-rc3-bk2-bzolnier/arch/i386/pci/fixup.c |   39 +++++++++++++++++++++
>  1 files changed, 39 insertions(+)
> 
> diff -puN arch/i386/pci/fixup.c~nforce2_fix arch/i386/pci/fixup.c
> --- linux-2.6.6-rc3-bk2/arch/i386/pci/fixup.c~nforce2_fix	2004-05-04 00:27:18.114421672 +0200
> +++ linux-2.6.6-rc3-bk2-bzolnier/arch/i386/pci/fixup.c	2004-05-04 01:02:29.821393416 +0200
> @@ -187,6 +187,39 @@ static void __devinit pci_fixup_transpar
>  		dev->transparent = 1;
>  }
>  
> +/*
> + * Fixup for C1 Halt Disconnect problem on nForce2 systems.
> + *
> + * From information provided by "Allen Martin" <AMartin@nvidia.com>:
> + *
> + * A hang is caused when the CPU generates a very fast CONNECT/HALT cycle
> + * sequence.  Workaround is to set the SYSTEM_IDLE_TIMEOUT to 80 ns.
> + * This allows the state-machine and timer to return to a proper state within
> + * 80 ns of the CONNECT and probe appearing together.  Since the CPU will not
> + * issue another HALT within 80 ns of the initial HALT, the failure condition
> + * is avoided.
> + */
> +static void __devinit pci_fixup_nforce2(struct pci_dev *dev)
> +{
> +	u32 val, fixed_val;
> +	u8 rev;
> +
> +	pci_read_config_byte(dev, PCI_REVISION_ID, &rev);
> +
> +	/*
> +	 * Chip  Old value   New value
> +	 * C17   0x1F0FFF01  0x1F01FF01
> +	 * C18D  0x9F0FFF01  0x9F01FF01
> +	 */
> +	fixed_val = rev < 0xC1 ? 0x1F01FF01 : 0x9F01FF01;
> +
> +	pci_read_config_dword(dev, 0x6c, &val);
> +	if (val != fixed_val) {
> +		printk(KERN_WARNING "PCI: nForce2 C1 Halt Disconnet fixup\n");
> +		pci_write_config_dword(dev, 0x6c, fixed_val);
> +	}
> +}
> +
>  struct pci_fixup pcibios_fixups[] = {
>  	{
>  		.pass		= PCI_FIXUP_HEADER,
> @@ -290,5 +323,11 @@ struct pci_fixup pcibios_fixups[] = {
>  		.device		= PCI_ANY_ID,
>  		.hook		= pci_fixup_transparent_bridge
>  	},
> +	{
> +		.pass		= PCI_FIXUP_HEADER,
> +		.vendor		= PCI_VENDOR_ID_NVIDIA,
> +		.device		= PCI_DEVICE_ID_NVIDIA_NFORCE2,
> +		.hook		= pci_fixup_nforce2
> +	},
>  	{ .pass = 0 }
>  };
> 
> _
> 
> 
> 
> 

Minor typo 
printk(KERN_WARNING "PCI: nForce2 C1 Halt Disconnet fixup\n");
should be
printk(KERN_WARNING "PCI: nForce2 C1 Halt Disconnect fixup\n");

For 2.4.26 follows a rediffed patch. 
Note as per other postings this 
+static void __devinit pci_fixup_nforce2(struct pci_dev *dev)
should be
+static void __init pci_fixup_nforce2(struct pci_dev *dev)
which would match other 2.4 fixups e.g.
static void __init pci_fixup_transparent_bridge(struct pci_dev *dev)

Works well for 2.4.26 on my epox 8rga+. Yet to try on my albatron KM18G-pros.
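
For anyone who wants to sanity-check the logic before flashing a BIOS or
booting a patched kernel, the decision the fixup makes can be exercised in
user space. This is only a sketch of the same comparison as the patch; the
function and macro names are made up here.

```c
#include <assert.h>
#include <stdint.h>

#define NFORCE2_REV_C18D 0xC1	/* lower revisions are treated as C17 */

/* Value the fixup wants in config register 0x6c for a given revision. */
static uint32_t nforce2_fixed_value(uint8_t rev)
{
	return rev < NFORCE2_REV_C18D ? 0x1F01FF01u : 0x9F01FF01u;
}

/* Return 1 if the register would be rewritten by the fixup, 0 if not. */
static int nforce2_needs_fixup(uint8_t rev, uint32_t current)
{
	assert(rev != 0);	/* revision 0 is not expected on these chips */
	return current != nforce2_fixed_value(rev);
}
```

A rev c1 board whose BIOS left 0x9F0FFF01 in the register gets it
rewritten, and one that already holds 0x9F01FF01 is left alone, so the
warning line in dmesg only shows up on boards with an unfixed BIOS.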

To my knowledge the only thing left to sort out for the normal kernel
distro is what to do about the timer_ack issue in check_timer().

We need it off for nforce2 to get nmi_watchdog=1 working with ioapic
8254 timer pin0  timer override patch routing. I vote to revisit Maciej's
patch that was dropped by Linus after appearing in 2.6.3-mm3. 
For those with problems of clock skew with the timer into pin0 routing, 
that patch gave a virtual wire timer routing which worked well for those
users.

It also works around problems for ibm users.
http://linux.derkeiler.com/Mailing-Lists/Kernel/2004-04/4421.html

That patch is last posted here (Maciej please correct me if i'm wrong)
http://linux.derkeiler.com/Mailing-Lists/Kernel/2004-04/3174.html

Here is rediffed nforce2 patch for 2.4.26

--- linux-2.4.26/arch/i386/kernel/pci-pc.c.orig 2003-11-29 04:26:19.000000000 +1000
+++ linux/arch/i386/kernel/pci-pc.c     2004-05-04 22:54:32.000000000 +1000
@@ -1326,10 +1326,43 @@ static void __init pci_fixup_transparent
        if ((dev->class >> 8) == PCI_CLASS_BRIDGE_PCI &&
            (dev->device & 0xff00) == 0x2400)
                dev->transparent = 1;
 }
 
+/*
+ * Fixup for C1 Halt Disconnect problem on nForce2 systems.
+ *
+ * From information provided by "Allen Martin" <AMartin@nvidia.com>:
+ *
+ * A hang is caused when the CPU generates a very fast CONNECT/HALT cycle
+ * sequence.  Workaround is to set the SYSTEM_IDLE_TIMEOUT to 80 ns.
+ * This allows the state-machine and timer to return to a proper state within
+ * 80 ns of the CONNECT and probe appearing together.  Since the CPU will not
+ * issue another HALT within 80 ns of the initial HALT, the failure condition
+ * is avoided.
+ */
+static void __init pci_fixup_nforce2(struct pci_dev *dev)
+{
+       u32 val, fixed_val;
+       u8 rev;
+
+       pci_read_config_byte(dev, PCI_REVISION_ID, &rev);
+
+       /*
+       * Chip  Old value       New value
+       * C17   0x1F0FFF01      0x1F01FF01
+       * C18D  0x9F0FFF01      0x9F01FF01
+       */
+       fixed_val = rev < 0xC1 ? 0x1F01FF01 : 0x9F01FF01;
+
+       pci_read_config_dword(dev, 0x6c, &val);
+       if (val != fixed_val) {
+               printk(KERN_WARNING "PCI: nForce2 C1 Halt Disconnect fixup\n");
+               pci_write_config_dword(dev, 0x6c, fixed_val);
+       }
+}
+
 struct pci_fixup pcibios_fixups[] = {
        { PCI_FIXUP_HEADER,     PCI_VENDOR_ID_INTEL,    PCI_DEVICE_ID_INTEL_82451NX,    pci_fixup_i450nx },
        { PCI_FIXUP_HEADER,     PCI_VENDOR_ID_INTEL,    PCI_DEVICE_ID_INTEL_82454GX,    pci_fixup_i450gx },
        { PCI_FIXUP_HEADER,     PCI_VENDOR_ID_UMC,      PCI_DEVICE_ID_UMC_UM8886BF,     pci_fixup_umc_ide },
        { PCI_FIXUP_HEADER,     PCI_VENDOR_ID_SI,       PCI_DEVICE_ID_SI_5513,          pci_fixup_ide_trash },
@@ -1341,10 +1374,11 @@ struct pci_fixup pcibios_fixups[] = {
        { PCI_FIXUP_HEADER,     PCI_VENDOR_ID_VIA,      PCI_DEVICE_ID_VIA_8622,         pci_fixup_via_northbridge_bug },
        { PCI_FIXUP_HEADER,     PCI_VENDOR_ID_VIA,      PCI_DEVICE_ID_VIA_8361,         pci_fixup_via_northbridge_bug },
        { PCI_FIXUP_HEADER,     PCI_VENDOR_ID_VIA,      PCI_DEVICE_ID_VIA_8367_0,       pci_fixup_via_northbridge_bug },
        { PCI_FIXUP_HEADER,     PCI_VENDOR_ID_NCR,      PCI_DEVICE_ID_NCR_53C810,       pci_fixup_ncr53c810 },
        { PCI_FIXUP_HEADER,     PCI_VENDOR_ID_INTEL,    PCI_ANY_ID,                     pci_fixup_transparent_bridge },
+       { PCI_FIXUP_HEADER,     PCI_VENDOR_ID_NVIDIA,   PCI_DEVICE_ID_NVIDIA_NFORCE2,   pci_fixup_nforce2},
        { 0 }
 };
 
 /*
  *  Called after each bus is probed, but before its children


Regards
Ross.






^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-05-03 22:09 Allen Martin
  2004-05-03 23:11 ` Bartlomiej Zolnierkiewicz
@ 2004-05-05 11:24 ` Ross Dickson
  2004-05-05 12:18   ` Ian Kumlien
  1 sibling, 1 reply; 93+ messages in thread
From: Ross Dickson @ 2004-05-05 11:24 UTC (permalink / raw)
  To: Allen Martin, linux-kernel
  Cc: Len Brown, Jesse Allen, Prakash K. Cheemplavam, Craig Bradney,
	christian.kroener, Maciej W. Rozycki, Jamie Lokier, Daniel Drake,
	Ian Kumlien

On Tuesday 04 May 2004 08:09, Allen Martin wrote:
> I'm happy to be able to make this information public to the Linux
> community.  This information has been previously released to BIOS /
> board vendors as an appnote, but in the interest of getting a workaround
> into the hands of users the quickest we're making it public for possible
> inclusion into the Linux kernel.
> 
<snip>

Thank you very much Allen for being involved in linux development.
Obsolescence is the best ending a temporary workaround could have.

I think I have found the problem with the Albatron KM18G-pro Mobos I have been
using.

They can't see through their Windows.??!@@#$$%%&*&

ML1-0505-19 Re: Cause of lockups with KM-18G Pro is incorrect pci reg values in bios -please update bios

From: "dr.pro" <dr.pro@albatron.com.tw>
To: <ross@datscreative.com.au>
Date: Today 17:38:08

  Dear Ross,

  Thank you very much for contacting Albatron technical support.

  KM18G Pro has been proved under Windows 98SE/ME/2000/XP but Linux, so you
may encounter problems with it under Linux. We suggest you use Windows
98SE/ME/2000/XP for the stable performance. Sorry for the inconvenience and
please kindly understand it.

  Please let us know if you have any question.

  Best regards,
  Dr.Pro
  ----- Original Message ----- 
  From: "Ross Dickson" <ross@datscreative.com.au>
  To: <dr.pro@albatron.com.tw>
  Sent: Tuesday, May 04, 2004 8:19 PM
  Subject: Cause of lockups with KM-18G Pro is incorrect pci reg values in
  bios -please update bios


  > Greetings,
  >
  > The following is required for Linux to function correctly on your
  > KM-18G Pro.
  >
  > Allen Martin of Nvidia explains.
  >
  > I'm happy to be able to make this information public to the Linux
  > community.  This information has been previously released to BIOS /
  > board vendors as an appnote, but in the interest of getting a workaround
  > into the hands of users the quickest we're making it public for possible
  > inclusion into the Linux kernel.
  >
  >
  > Problem:
  > C1 Halt Disconnect problem on nForce2 systems
  >
  > Description:
  > A hang is caused when the CPU generates a very fast CONNECT/HALT cycle
  > sequence.
  >
  > Workaround:
  > Set the SYSTEM_IDLE_TIMEOUT to 80 ns. This allows the state-machine and
  > timer to return to a proper state within 80 ns of the CONNECT and probe
  > appearing together.
  >
  > Since the CPU will not issue another HALT within 80 ns of the initial
  > HALT, the failure condition is avoided.
  >
  > This will require changing the value for register at bus:0 dev:0 func:0
  > offset 6c.
  >
  > Chip   Current Value   New Value
  > C17       1F0FFF01     1F01FF01
  > C18D      9F0FFF01     9F01FF01
  >
  > Northbridge chip version may be determined by reading the PCI revision
  > ID (offset 8) of the host bridge at bus:0 dev:0 func:0.  C1 or greater
  > is C18D.
  >
  > Regards
  > Ross Dickson
  >
  >


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-05-04 21:10   ` Jeff Garzik
@ 2004-05-04 21:29     ` Bartlomiej Zolnierkiewicz
  0 siblings, 0 replies; 93+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2004-05-04 21:29 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Allen Martin, linux-kernel, Ross Dickson, Len Brown

On Tuesday 04 of May 2004 23:10, Jeff Garzik wrote:
> Bartlomiej Zolnierkiewicz wrote:
> > +/*
> > + * Fixup for C1 Halt Disconnect problem on nForce2 systems.
> > + *
> > + * From information provided by "Allen Martin" <AMartin@nvidia.com>:
> > + *
> > + * A hang is caused when the CPU generates a very fast CONNECT/HALT cycle
> > + * sequence.  Workaround is to set the SYSTEM_IDLE_TIMEOUT to 80 ns.
> > + * This allows the state-machine and timer to return to a proper state within
> > + * 80 ns of the CONNECT and probe appearing together.  Since the CPU will not
> > + * issue another HALT within 80 ns of the initial HALT, the failure condition
> > + * is avoided.
> > + */
> > +static void __devinit pci_fixup_nforce2(struct pci_dev *dev)
>
> Minor nit:  is __devinit really needed?

No, it's not needed.

I was misled by the fact that all fixups there are marked with __devinit.

> You're changing a northbridge or a southbridge, not a PCI card, I
> presume...?  That would only need to be done once, when the kernel is
> booted, regardless of CONFIG_HOTPLUG AFAICS.

Yep, the same is probably true for:

static void __devinit pci_fixup_i450nx(struct pci_dev *d)
static void __devinit pci_fixup_i450gx(struct pci_dev *d)
static void __devinit  pci_fixup_umc_ide(struct pci_dev *d)
static void __devinit  pci_fixup_latency(struct pci_dev *d)
static void __devinit pci_fixup_piix4_acpi(struct pci_dev *d)
static void __devinit pci_fixup_via_northbridge_bug(struct pci_dev *d)
static void __devinit pci_fixup_transparent_bridge(struct pci_dev *dev)

Bartlomiej


^ permalink raw reply	[flat|nested] 93+ messages in thread

* RE: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-05-04 20:38 Jesse Allen
@ 2004-05-04 21:14 ` Craig Bradney
  0 siblings, 0 replies; 93+ messages in thread
From: Craig Bradney @ 2004-05-04 21:14 UTC (permalink / raw)
  To: Jesse Allen; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1068 bytes --]

On Tue, 2004-05-04 at 22:38, Jesse Allen wrote:
> Allen Martin wrote:
> > This will require changing the value for register at bus:0 dev:0 func:0
> > offset 6c.
> >
> > Chip   Current Value   New Value
> > C17       1F0FFF01     1F01FF01
> > C18D      9F0FFF01     9F01FF01
> >
> > Northbridge chip version may be determined by reading the PCI revision
> > ID (offset 8) of the host bridge at bus:0 dev:0 func:0.  C1 or greater
> > is C18D.
> 
> I believe I have confirmed that the Shuttle AN35N BIOS indeed has this fix as
> of Dec 5th 03 version.
> 
> lspci -xxx -vvv
> 00:00.0 Host bridge: nVidia Corporation nForce2 AGP (different version?) 
> (rev c1)
> ...
> 60: 08 00 01 20 20 00 88 80 10 00 00 00 01 ff 01 9f

I can confirm that my Asus A7N8X Deluxe v2.0 with BIOS 1007 (the latest)
is reporting

60: 08 00 01 20 20 00 88 80 10 00 00 00 01 ff 0f 9f

So, it looks like it needs the new value as per the note.
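
For anyone comparing their own dump: the lspci row labelled "60:" covers
config offsets 0x60 to 0x6f, so its last four bytes are the little-endian
dword at offset 0x6c. Below is a minimal C sketch of that decoding. The
helper name and the two arrays are made up for illustration; the byte
values are copied from the two dumps quoted in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: assemble a little-endian 32-bit value from one
 * 16-byte row of "lspci -xxx" output.  offset_in_row is the dword's
 * offset relative to the start of the row (0x6c - 0x60 = 12 here).
 */
static uint32_t dword_from_lspci_row(const uint8_t row[16], size_t offset_in_row)
{
	/* Only the four dword-aligned positions inside a row are valid. */
	assert(offset_in_row <= 12 && offset_in_row % 4 == 0);

	return  (uint32_t)row[offset_in_row]
	     | ((uint32_t)row[offset_in_row + 1] << 8)
	     | ((uint32_t)row[offset_in_row + 2] << 16)
	     | ((uint32_t)row[offset_in_row + 3] << 24);
}

/* Row 0x60 as reported for the Asus A7N8X Deluxe v2.0, BIOS 1007. */
static const uint8_t asus_a7n8x_row60[16] = {
	0x08, 0x00, 0x01, 0x20, 0x20, 0x00, 0x88, 0x80,
	0x10, 0x00, 0x00, 0x00, 0x01, 0xff, 0x0f, 0x9f
};

/* Row 0x60 as reported for the Shuttle AN35N, Dec 5th 2003 BIOS. */
static const uint8_t shuttle_an35n_row60[16] = {
	0x08, 0x00, 0x01, 0x20, 0x20, 0x00, 0x88, 0x80,
	0x10, 0x00, 0x00, 0x00, 0x01, 0xff, 0x01, 0x9f
};
```

Calling dword_from_lspci_row(row, 0x6c - 0x60) on the Asus row gives
0x9F0FFF01 (the value the note lists as current for C18D), and on the
Shuttle row gives 0x9F01FF01 (the value the note lists as new).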

I haven't applied the patch that was posted yesterday; I am awaiting more
reaction from Len, Ross and all.

Craig

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-05-03 23:11 ` Bartlomiej Zolnierkiewicz
  2004-05-04  8:28   ` Prakash K. Cheemplavam
@ 2004-05-04 21:10   ` Jeff Garzik
  2004-05-04 21:29     ` Bartlomiej Zolnierkiewicz
  2004-05-05 12:14   ` Ross Dickson
  2004-05-05 12:48   ` Patrick Dreker
  3 siblings, 1 reply; 93+ messages in thread
From: Jeff Garzik @ 2004-05-04 21:10 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Allen Martin, linux-kernel, Ross Dickson, Len Brown

Bartlomiej Zolnierkiewicz wrote:
> +/*
> + * Fixup for C1 Halt Disconnect problem on nForce2 systems.
> + *
> + * From information provided by "Allen Martin" <AMartin@nvidia.com>:
> + *
> + * A hang is caused when the CPU generates a very fast CONNECT/HALT cycle
> + * sequence.  Workaround is to set the SYSTEM_IDLE_TIMEOUT to 80 ns.
> + * This allows the state-machine and timer to return to a proper state within
> + * 80 ns of the CONNECT and probe appearing together.  Since the CPU will not
> + * issue another HALT within 80 ns of the initial HALT, the failure condition
> + * is avoided.
> + */
> +static void __devinit pci_fixup_nforce2(struct pci_dev *dev)


Minor nit:  is __devinit really needed?

You're changing a northbridge or a southbridge, not a PCI card, I 
presume...?  That would only need to be done once, when the kernel is 
booted, regardless of CONFIG_HOTPLUG AFAICS.

	Jeff




^ permalink raw reply	[flat|nested] 93+ messages in thread

* RE: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
@ 2004-05-04 20:38 Jesse Allen
  2004-05-04 21:14 ` Craig Bradney
  0 siblings, 1 reply; 93+ messages in thread
From: Jesse Allen @ 2004-05-04 20:38 UTC (permalink / raw)
  To: linux-kernel

Allen Martin wrote:
> This will require changing the value for register at bus:0 dev:0 func:0
> offset 6c.
>
> Chip   Current Value   New Value
> C17       1F0FFF01     1F01FF01
> C18D      9F0FFF01     9F01FF01
>
> Northbridge chip version may be determined by reading the PCI revision
> ID (offset 8) of the host bridge at bus:0 dev:0 func:0.  C1 or greater
> is C18D.

I believe I have confirmed that the Shuttle AN35N BIOS indeed has this fix as
of Dec 5th 03 version.

lspci -xxx -vvv
00:00.0 Host bridge: nVidia Corporation nForce2 AGP (different version?) 
(rev c1)
...
60: 08 00 01 20 20 00 88 80 10 00 00 00 01 ff 01 9f

Jesse


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-05-03 23:11 ` Bartlomiej Zolnierkiewicz
@ 2004-05-04  8:28   ` Prakash K. Cheemplavam
  2004-05-04 21:10   ` Jeff Garzik
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 93+ messages in thread
From: Prakash K. Cheemplavam @ 2004-05-04  8:28 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Allen Martin, linux-kernel, Ross Dickson, Len Brown

Bartlomiej Zolnierkiewicz wrote:
> On Tuesday 04 of May 2004 00:09, Allen Martin wrote:
> 
>>I'm happy to be able to make this information public to the Linux
>>community.  This information has been previously released to BIOS /
>>board vendors as an appnote, but in the interest of getting a workaround
>>into the hands of users the quickest we're making it public for possible
>>inclusion into the Linux kernel.
> 
> 
> This is great news!  Below is an untested patch to address this issue.
> 

Yes it works!!!! Finally the nforce2 issue has been fixed. I still can't 
believe it.

Dear Allen, it is nice that after all Nvidia decided to give out 
information about this issue. It would have been so nice if this had 
been done about 6 months ago, when I originally discovered the 
connection between apic instability and cpu disconnect. But I guess I 
shouldn't scold Nvidia but the mainboard manufacturers who were still 
sleeping, like in my case: Abit. To this day they haven't managed to fix 
this issue and the timer issue (and they released a new bios a few days 
ago...)

Maybe Nvidia should scold the board manufacturers to keep their bioses 
updated. After all it is Nvidia getting a bad image if everybody thinks 
"Nvidia boards are unstable and they don't care to resolve it." So it 
would be in Nvidia's own interest to push the manufacturers to integrate 
such critical fixes ASAP.

The only issues left for me are

a) semi-stable nvidia binary driver
b) higher idle temperature with nvidia driver (I guess). It may also be a 
sensors problem, as Abit's reading is known not to be very precise and 
reads something else after every reboot thanks to recalibration.
c) missing driver for nforce2 apu...

Thanks after all.

Prakash

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-05-03 22:09 Allen Martin
@ 2004-05-03 23:11 ` Bartlomiej Zolnierkiewicz
  2004-05-04  8:28   ` Prakash K. Cheemplavam
                     ` (3 more replies)
  2004-05-05 11:24 ` Ross Dickson
  1 sibling, 4 replies; 93+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2004-05-03 23:11 UTC (permalink / raw)
  To: Allen Martin, linux-kernel; +Cc: Ross Dickson, Len Brown

On Tuesday 04 of May 2004 00:09, Allen Martin wrote:
> I'm happy to be able to make this information public to the Linux
> community.  This information has been previously released to BIOS /
> board vendors as an appnote, but in the interest of getting a workaround
> into the hands of users the quickest we're making it public for possible
> inclusion into the Linux kernel.

This is great news!  Below is an untested patch to address this issue.

Cheers.


[PATCH] fixup for C1 Halt Disconnect problem on nForce2 chipsets

Based on information provided by "Allen Martin" <AMartin@nvidia.com>.

 linux-2.6.6-rc3-bk2-bzolnier/arch/i386/pci/fixup.c |   39 +++++++++++++++++++++
 1 files changed, 39 insertions(+)

diff -puN arch/i386/pci/fixup.c~nforce2_fix arch/i386/pci/fixup.c
--- linux-2.6.6-rc3-bk2/arch/i386/pci/fixup.c~nforce2_fix	2004-05-04 00:27:18.114421672 +0200
+++ linux-2.6.6-rc3-bk2-bzolnier/arch/i386/pci/fixup.c	2004-05-04 01:02:29.821393416 +0200
@@ -187,6 +187,39 @@ static void __devinit pci_fixup_transpar
 		dev->transparent = 1;
 }
 
+/*
+ * Fixup for C1 Halt Disconnect problem on nForce2 systems.
+ *
+ * From information provided by "Allen Martin" <AMartin@nvidia.com>:
+ *
+ * A hang is caused when the CPU generates a very fast CONNECT/HALT cycle
+ * sequence.  Workaround is to set the SYSTEM_IDLE_TIMEOUT to 80 ns.
+ * This allows the state-machine and timer to return to a proper state within
+ * 80 ns of the CONNECT and probe appearing together.  Since the CPU will not
+ * issue another HALT within 80 ns of the initial HALT, the failure condition
+ * is avoided.
+ */
+static void __devinit pci_fixup_nforce2(struct pci_dev *dev)
+{
+	u32 val, fixed_val;
+	u8 rev;
+
+	pci_read_config_byte(dev, PCI_REVISION_ID, &rev);
+
+	/*
+	 * Chip  Old value   New value
+	 * C17   0x1F0FFF01  0x1F01FF01
+	 * C18D  0x9F0FFF01  0x9F01FF01
+	 */
+	fixed_val = rev < 0xC1 ? 0x1F01FF01 : 0x9F01FF01;
+
+	pci_read_config_dword(dev, 0x6c, &val);
+	if (val != fixed_val) {
+		printk(KERN_WARNING "PCI: nForce2 C1 Halt Disconnect fixup\n");
+		pci_write_config_dword(dev, 0x6c, fixed_val);
+	}
+}
+
 struct pci_fixup pcibios_fixups[] = {
 	{
 		.pass		= PCI_FIXUP_HEADER,
@@ -290,5 +323,11 @@ struct pci_fixup pcibios_fixups[] = {
 		.device		= PCI_ANY_ID,
 		.hook		= pci_fixup_transparent_bridge
 	},
+	{
+		.pass		= PCI_FIXUP_HEADER,
+		.vendor		= PCI_VENDOR_ID_NVIDIA,
+		.device		= PCI_DEVICE_ID_NVIDIA_NFORCE2,
+		.hook		= pci_fixup_nforce2
+	},
 	{ .pass = 0 }
 };

_


^ permalink raw reply	[flat|nested] 93+ messages in thread

* RE: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
@ 2004-05-03 22:09 Allen Martin
  2004-05-03 23:11 ` Bartlomiej Zolnierkiewicz
  2004-05-05 11:24 ` Ross Dickson
  0 siblings, 2 replies; 93+ messages in thread
From: Allen Martin @ 2004-05-03 22:09 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ross Dickson, Len Brown

I'm happy to be able to make this information public to the Linux
community.  This information has been previously released to BIOS /
board vendors as an appnote, but in the interest of getting a workaround
into the hands of users the quickest we're making it public for possible
inclusion into the Linux kernel.


Problem:
C1 Halt Disconnect problem on nForce2 systems

Description:
A hang is caused when the CPU generates a very fast CONNECT/HALT cycle
sequence.

Workaround:
Set the SYSTEM_IDLE_TIMEOUT to 80 ns. This allows the state-machine and
timer to return to a proper state within 80 ns of the CONNECT and probe
appearing together.

Since the CPU will not issue another HALT within 80 ns of the initial
HALT, the failure condition is avoided.

This will require changing the value for register at bus:0 dev:0 func:0
offset 6c.

Chip   Current Value   New Value
C17       1F0FFF01     1F01FF01
C18D      9F0FFF01     9F01FF01

Northbridge chip version may be determined by reading the PCI revision
ID (offset 8) of the host bridge at bus:0 dev:0 func:0.  C1 or greater
is C18D.
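
The selection rule in this note can be sketched in plain C as follows.
This is only an illustration under stated assumptions: the macro and
function names are hypothetical, it runs in userspace rather than as a
kernel fixup, and it takes the revision ID and current register value as
arguments instead of reading them from PCI config space.

```c
#include <assert.h>
#include <stdint.h>

/* Recommended values for offset 0x6c, taken from the table above. */
#define NFORCE2_C17_FIXED     0x1F01FF01u
#define NFORCE2_C18D_FIXED    0x9F01FF01u
/* "C1 or greater is C18D" (PCI revision ID, config offset 8). */
#define NFORCE2_C18D_MIN_REV  0xC1u

/* Hypothetical helper: recommended register value for a revision ID. */
static uint32_t nforce2_fixed_value(uint8_t rev)
{
	return rev < NFORCE2_C18D_MIN_REV ? NFORCE2_C17_FIXED
					  : NFORCE2_C18D_FIXED;
}

/* Hypothetical helper: nonzero if the register should be rewritten. */
static int nforce2_needs_fixup(uint8_t rev, uint32_t current)
{
	uint32_t fixed = nforce2_fixed_value(rev);

	/*
	 * In the table, the old and new values differ only in bits 16-19
	 * (0xF becomes 0x1); check that the constants agree with that.
	 */
	assert((fixed & 0x000F0000u) == 0x00010000u);

	return current != fixed;
}
```

For example, a C18D part (revision 0xC1) still holding 0x9F0FFF01 would
need the write, while one already holding 0x9F01FF01 would not.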


^ permalink raw reply	[flat|nested] 93+ messages in thread

* RE: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
@ 2004-05-03  8:08 Allen Martin
  0 siblings, 0 replies; 93+ messages in thread
From: Allen Martin @ 2004-05-03  8:08 UTC (permalink / raw)
  To: ross, Len Brown, a.verweij
  Cc: Jesse Allen, Prakash K. Cheemplavam, Craig Bradney,
	christian.kroener, linux-kernel, Maciej W. Rozycki, Jamie Lokier,
	Daniel Drake, Ian Kumlien

 
> I also recollect that Windows had lockups with nforce2 for a while,
> depending on whether you ran the Nvidia or Microsoft driver.
> http://lkml.org/lkml/2003/12/13/5
> Anybody got the inside running on that one and what was different between
> the two drivers?


There were some ATAPI device detection problems in some of the earlier
Windows nForce IDE drivers that would cause lockups during boot for some
people depending on what type of devices were attached.  

There have never been any reports of Windows hangs root-caused to
C1 disconnects that I'm aware of.

-Allen

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
@ 2004-04-23  1:30 Jesse Allen
  2004-05-07  4:47 ` Richard James
  0 siblings, 1 reply; 93+ messages in thread
From: Jesse Allen @ 2004-04-23  1:30 UTC (permalink / raw)
  To: linux-kernel

Len Brown wrote:
> Have you been able to hang the AN35N under any conditions?
> Old BIOS, non-vanilla kernel?

Yes, and I described that it will hang under the pre-Dec 5th BIOS in another 
mail.

I still have images of the buggy BIOS, and the fixed one on my hard drive.
They are also available at ftp://ftp.shuttle.com/BIOS/an35_n/ as
an35s00j.bin (Oct 2003)
an35s00l.bin (Dec 5th 2003)

XT-PIC timer bug still remains in both versions.

> I'm not familiar with the "one removed by Linus in a testing version",
> perhaps you could point me to that?

I had forgotten the name and hadn't looked it up.  But it is the 8259 timer ack 
workaround.  You can see the removal here:
http://linux.bkbits.net:8080/linux-2.5/cset@1.1608.99.1


Jesse


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5
  2004-04-15 23:07         ` IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5 Andi Kleen
@ 2004-04-21 22:00           ` Len Brown
  0 siblings, 0 replies; 93+ messages in thread
From: Len Brown @ 2004-04-21 22:00 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel, Allen Martin

On Thu, 2004-04-15 at 19:07, Andi Kleen wrote:
> Len Brown <len.brown@intel.com> writes:
> 
> > While I don't want to get into the business of maintaining
> > a dmi_scan entry for every system with this issue, I think
> > it might be a good idea to add a couple of example entries
> > for high volume systems for which there is no BIOS fix available.
> 
> Or do a generic fix: check for the PCI-ID of the Nforce2 and when
> it is true and the timer is wrong just correct it. That's ugly,
> but it's probably the best solution for such a common issue
> (and the IO-APIC code is already filled with workarounds anyways)

IMO the fact that the IOAPIC code is full of workarounds is a reason NOT
to add another one.

> One problem is that this likely must happen before the PCI quirks
> run. In the x86-64 code I have special "early PCI scanning" code 
> for this that could be copied. I don't have a Nforce2, but when
> someone is willing to test I can do a patch for this.

If this issue had no other fix, I'd agree that the complexity is worth
it.  But a BIOS upgrade fixes this -- so I think dmi-scan simplicity
is the way to go.

-Len



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH]  for idle=C1halt, 2.6.5
       [not found]       ` <1LlEY-36q-11@gated-at.bofh.it>
@ 2004-04-15 23:07         ` Andi Kleen
  2004-04-21 22:00           ` Len Brown
  0 siblings, 1 reply; 93+ messages in thread
From: Andi Kleen @ 2004-04-15 23:07 UTC (permalink / raw)
  To: Len Brown; +Cc: linux-kernel, Allen Martin

Len Brown <len.brown@intel.com> writes:

> While I don't want to get into the business of maintaining
> a dmi_scan entry for every system with this issue, I think
> it might be a good idea to add a couple of example entries
> for high volume systems for which there is no BIOS fix available.

Or do a generic fix: check for the PCI-ID of the Nforce2 and when
it is true and the timer is wrong just correct it. That's ugly,
but it's probably the best solution for such a common issue
(and the IO-APIC code is already filled with workarounds anyways)

One problem is that this likely must happen before the PCI quirks
run. In the x86-64 code I have special "early PCI scanning" code 
for this that could be copied. I don't have a Nforce2, but when
someone is willing to test I can do a patch for this.

-Andi

P.S.: This problem of reference BIOS bugs haunting Linux
even long after they are fixed is unfortunately common :-(
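
As background for the "early PCI scanning" idea: on x86, config space can
be reached before the PCI subsystem is up through the standard
configuration mechanism #1, where an address dword is written to I/O port
0xCF8 and the data is then read or written at port 0xCFC. The C sketch
below only builds that address dword; the function name is hypothetical,
and this is not claimed to be how the x86-64 early scanning code mentioned
above is written.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: build the CONFIG_ADDRESS value for PCI
 * configuration mechanism #1.
 *
 *   bit  31     enable
 *   bits 23-16  bus
 *   bits 15-11  device
 *   bits 10-8   function
 *   bits  7-2   dword-aligned register offset
 */
static uint32_t pci_conf1_address(uint8_t bus, uint8_t dev,
				  uint8_t func, uint8_t reg)
{
	assert(dev < 32 && func < 8);

	return 0x80000000u
	     | ((uint32_t)bus  << 16)
	     | ((uint32_t)dev  << 11)
	     | ((uint32_t)func << 8)
	     | ((uint32_t)reg & 0xFCu);
}
```

For the nForce2 host bridge at bus 0, dev 0, func 0, the register at
offset 0x6c corresponds to address dword 0x8000006C, and the dword holding
the revision ID at offset 8 corresponds to 0x80000008.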


^ permalink raw reply	[flat|nested] 93+ messages in thread

end of thread, other threads:[~2004-05-17 19:57 UTC | newest]

Thread overview: 93+ messages
2004-04-13  1:17 IO-APIC on nforce2 Ross Dickson
2004-04-13  4:01 ` really bensoo_at_soo_dot_com
2004-04-13  4:55   ` Ross Dickson
2004-04-13 17:22     ` Christian Kröner
2004-04-13 21:18     ` really bensoo_at_soo_dot_com
2004-04-14  4:24       ` really bensoo_at_soo_dot_com
2004-04-13  5:08 ` Len Brown
2004-04-13  7:03   ` Ross Dickson
2004-04-13 13:46     ` Maciej W. Rozycki
2004-04-14  1:02     ` IO-APIC on nforce2 [PATCH] Len Brown
2004-04-14  5:02       ` Ross Dickson
2004-04-14  6:30         ` Jamie Lokier
2004-04-14 10:37         ` Maciej W. Rozycki
2004-04-15 19:28           ` Len Brown
2004-04-14 19:57         ` Christian Kröner
2004-04-15  0:17           ` Len Brown
2004-04-15  1:48             ` Ross Dickson
2004-04-15 17:09               ` Christian Kröner
2004-04-15 15:10       ` IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5 Ross Dickson
2004-04-15 20:17         ` Len Brown
2004-04-15 21:04           ` Craig Bradney
2004-04-21 20:22             ` Len Brown
2004-04-21 20:33               ` Ian Kumlien
2004-04-21 20:45               ` Craig Bradney
2004-04-21 21:28               ` Prakash K. Cheemplavam
2004-04-21 22:41                 ` Len Brown
2004-04-22  7:26                   ` Prakash K. Cheemplavam
2004-04-22 14:58                     ` Len Brown
2004-04-22  8:45                   ` Craig Bradney
2004-04-22 15:03                     ` Len Brown
2004-04-22 20:50                       ` Craig Bradney
2004-04-22  8:50                   ` Arjen Verweij
2004-04-22 16:39                   ` Jesse Allen
2004-04-22 17:21                     ` Len Brown
2004-04-22 21:29                       ` Len Brown
2004-04-23  8:48                         ` Prakash K. Cheemplavam
2004-04-23  9:01                           ` Arjen Verweij
2004-04-23  9:08                             ` Prakash K. Cheemplavam
2004-04-23  9:11                             ` Prakash K. Cheemplavam
2004-04-23 12:18                         ` Maciej W. Rozycki
2004-04-27  7:57                         ` ACPI broken on nforce2? Prakash K. Cheemplavam
2004-04-26 11:41                       ` IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5 Ross Dickson
2004-04-27 17:02                         ` Arjen Verweij
2004-04-27 17:35                           ` Ian Kumlien
2004-04-27 18:00                           ` Len Brown
2004-04-27 18:24                             ` Arjen Verweij
2004-04-27 18:51                             ` Jussi Laako
2004-04-28 11:33                             ` Ross Dickson
2004-04-28 20:59                               ` Jesse Allen
2004-04-29 11:44                                 ` Ross Dickson
2004-04-29 11:54                                   ` Maciej W. Rozycki
2004-04-29 12:00                                     ` Jamie Lokier
2004-04-29 12:26                                       ` Maciej W. Rozycki
2004-04-29 11:57                                   ` Jamie Lokier
2004-04-29 12:16                                   ` Craig Bradney
2004-04-29 20:24                                   ` Jesse Allen
2004-04-29 20:31                                     ` Prakash K. Cheemplavam
2004-05-03 20:45                                       ` Jesse Allen
2004-05-17 15:26                                         ` Prakash K. Cheemplavam
2004-05-17 19:32                                           ` Craig Bradney
2004-05-17 19:37                                             ` Prakash K. Cheemplavam
2004-05-17 19:57                                               ` Craig Bradney
2004-04-27 21:31                         ` Prakash K. Cheemplavam
2004-04-28 11:26                           ` Prakash K. Cheemplavam
2004-05-01  6:51                   ` Prakash K. Cheemplavam
2004-04-15 21:56           ` Arjen Verweij
2004-04-15 15:21       ` IO-APIC on nforce2 [PATCH] Zwane Mwaikambo
     [not found] <1KkKQ-2v9-9@gated-at.bofh.it>
     [not found] ` <1Kqdx-6E1-5@gated-at.bofh.it>
     [not found]   ` <1KH4I-3W9-11@gated-at.bofh.it>
     [not found]     ` <1LgOQ-7px-3@gated-at.bofh.it>
     [not found]       ` <1LlEY-36q-11@gated-at.bofh.it>
2004-04-15 23:07         ` IO-APIC on nforce2 [PATCH] + [PATCH] for nmi_debug=1 + [PATCH] for idle=C1halt, 2.6.5 Andi Kleen
2004-04-21 22:00           ` Len Brown
2004-04-23  1:30 Jesse Allen
2004-05-07  4:47 ` Richard James
2004-05-07  7:13   ` Craig Bradney
2004-05-08  5:33   ` Richard James
2004-05-03  8:08 Allen Martin
2004-05-03 22:09 Allen Martin
2004-05-03 23:11 ` Bartlomiej Zolnierkiewicz
2004-05-04  8:28   ` Prakash K. Cheemplavam
2004-05-04 21:10   ` Jeff Garzik
2004-05-04 21:29     ` Bartlomiej Zolnierkiewicz
2004-05-05 12:14   ` Ross Dickson
2004-05-05 12:27     ` Ian Kumlien
2004-05-05 13:12       ` Ross Dickson
2004-05-05 13:23         ` Ian Kumlien
2004-05-05 12:58     ` Maciej W. Rozycki
2004-05-05 12:48   ` Patrick Dreker
2004-05-05 13:34     ` Patrick Dreker
2004-05-05 11:24 ` Ross Dickson
2004-05-05 12:18   ` Ian Kumlien
2004-05-05 12:52     ` Ross Dickson
2004-05-05 13:08       ` Ian Kumlien
2004-05-06  1:50         ` Jesse Allen
2004-05-04 20:38 Jesse Allen
2004-05-04 21:14 ` Craig Bradney
