LKML Archive on lore.kernel.org
* Re: SATA exceptions with 2.6.20-rc5
       [not found]         ` <fa.rI60BGlFbSyfLyumqmgiOfDqCI4@ifi.uio.no>
@ 2007-01-23 23:18           ` Robert Hancock
  2007-01-24  0:39             ` Björn Steinbrink
  2007-01-24  8:24             ` Ian Kumlien
  0 siblings, 2 replies; 9+ messages in thread
From: Robert Hancock @ 2007-01-23 23:18 UTC (permalink / raw)
  To: linux-kernel
  Cc: Larry Walton, B.Steinbrink, s0348365, pomac, chunkeey, Jeff Garzik

Larry Walton wrote:
> The last patch (sata_nv-force-int-dev-in-interrupt.patch)
> seems to have fixed the problem.  Much appreciated,
> thank you. I'd consider it a must-have in 2.6.20.

Can any of the rest of you that have been seeing this problem also 
confirm that this fixes it?

-- 
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from hancockr@nospamshaw.ca
Home Page: http://www.roberthancock.com/



* Re: SATA exceptions with 2.6.20-rc5
  2007-01-23 23:18           ` SATA exceptions with 2.6.20-rc5 Robert Hancock
@ 2007-01-24  0:39             ` Björn Steinbrink
  2007-01-24  2:09               ` [PATCH 2.6.20] sata_nv: don't rely on NV_INT_DEV indication with ADMA Robert Hancock
  2007-02-03  1:42               ` SATA exceptions with 2.6.20-rc5 Björn Steinbrink
  2007-01-24  8:24             ` Ian Kumlien
  1 sibling, 2 replies; 9+ messages in thread
From: Björn Steinbrink @ 2007-01-24  0:39 UTC (permalink / raw)
  To: Robert Hancock
  Cc: linux-kernel, Larry Walton, s0348365, pomac, chunkeey, Jeff Garzik

On 2007.01.23 17:18:43 -0600, Robert Hancock wrote:
> Larry Walton wrote:
> >The last patch (sata_nv-force-int-dev-in-interrupt.patch)
> >seems to have fixed the problem.  Much appreciated,
> >thank you. I'd consider it a must-have in 2.6.20.
> 
> Can any of the rest of you that have been seeing this problem also 
> confirm that this fixes it?

Seems to work for me, uptime is about an hour now and no exception yet.
Had the stress test running for only about 10 minutes, but I usually got
an exception within an hour even during plain irssi usage, so I'm quite
confident that the patch fixes it.

Thanks,
Björn


* [PATCH 2.6.20] sata_nv: don't rely on NV_INT_DEV indication with ADMA
  2007-01-24  0:39             ` Björn Steinbrink
@ 2007-01-24  2:09               ` Robert Hancock
  2007-02-03  1:42               ` SATA exceptions with 2.6.20-rc5 Björn Steinbrink
  1 sibling, 0 replies; 9+ messages in thread
From: Robert Hancock @ 2007-01-24  2:09 UTC (permalink / raw)
  To: Björn Steinbrink, Robert Hancock, linux-kernel,
	Larry Walton, s0348365, pomac, chunkeey, Jeff Garzik

[-- Attachment #1: Type: text/plain, Size: 996 bytes --]

OK, here it is in full signed-off glory. Hopefully we can get this in 
for 2.6.20.

---

Several people reported issues with certain drive commands timing out on 
sata_nv controllers running in ADMA mode. The commands in question were 
non-DMA-mapped commands, usually FLUSH CACHE or FLUSH CACHE EXT.

From experimentation, it appears that the NV_INT_DEV indication isn't
always set when a legitimate command completion interrupt is received on
a legacy-mode command, at least not on these controllers in ADMA mode.
When a command is pending on the port, always force the flag on in the
irq_stat value before calling nv_host_intr, so that the drive busy state
is always checked by ata_host_intr.

This also fixes some questionable code in nv_host_intr which called 
ata_check_status when a command was pending and ata_host_intr returned 
"unhandled". If the device interrupted at just the wrong time this could 
cause interrupts to be lost.

Signed-off-by: Robert Hancock <hancockr@shaw.ca>
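
The lost-interrupt window described above can be illustrated with a small
user-space model. This is only a sketch, not driver code: struct fake_dev,
device_finish() and the two handler functions are hypothetical names, and
the model assumes only that reading the drive's Status register
acknowledges a pending interrupt, as ata_check_status does on real
hardware.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a drive attached to one port. */
struct fake_dev {
	bool busy;      /* command still executing */
	bool intrq;     /* interrupt raised and not yet acknowledged */
	int completed;  /* completions the host has actually processed */
};

/* The drive finishes its command and raises its interrupt. */
static void device_finish(struct fake_dev *d)
{
	assert(d->busy);  /* only a busy drive can finish */
	d->busy = false;
	d->intrq = true;
}

/* Reading Status acknowledges (clears) any pending interrupt. */
static void read_status(struct fake_dev *d)
{
	d->intrq = false;
}

/* Stand-in for the generic handler: returns 1 if it completed a command. */
static int generic_intr(struct fake_dev *d)
{
	if (d->busy)
		return 0;  /* drive still busy, nothing to complete yet */
	read_status(d);
	d->completed++;
	return 1;
}

/* Old behaviour: treat "unhandled" as spurious and clear it. */
static int old_handler(struct fake_dev *d, bool finishes_in_window)
{
	int handled = generic_intr(d);

	if (finishes_in_window)
		device_finish(d);  /* drive completes at just the wrong time */
	if (!handled)
		read_status(d);    /* acknowledges the brand-new interrupt */
	return 1;
}

/* New behaviour: just report what the generic handler did. */
static int new_handler(struct fake_dev *d, bool finishes_in_window)
{
	int handled = generic_intr(d);

	if (finishes_in_window)
		device_finish(d);
	return handled;
}
```

With a drive that finishes inside the window, old_handler() returns 1 but
leaves no interrupt pending and no completion recorded, while
new_handler() returns 0 and leaves the interrupt pending for the next
pass.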


[-- Attachment #2: sata_nv-force-int-dev-in-interrupt.patch --]
[-- Type: text/plain, Size: 1315 bytes --]

--- linux-2.6.20-rc5/drivers/ata/sata_nv.c	2007-01-19 19:18:53.000000000 -0600
+++ linux-2.6.20-rc5debug/drivers/ata/sata_nv.c	2007-01-22 22:33:43.000000000 -0600
@@ -700,7 +700,6 @@ static void nv_adma_check_cpb(struct ata
 static int nv_host_intr(struct ata_port *ap, u8 irq_stat)
 {
 	struct ata_queued_cmd *qc = ata_qc_from_tag(ap, ap->active_tag);
-	int handled;
 
 	/* freeze if hotplugged */
 	if (unlikely(irq_stat & (NV_INT_ADDED | NV_INT_REMOVED))) {
@@ -719,13 +718,7 @@ static int nv_host_intr(struct ata_port 
 	}
 
 	/* handle interrupt */
-	handled = ata_host_intr(ap, qc);
-	if (unlikely(!handled)) {
-		/* spurious, clear it */
-		ata_check_status(ap);
-	}
-
-	return 1;
+	return ata_host_intr(ap, qc);
 }
 
 static irqreturn_t nv_adma_interrupt(int irq, void *dev_instance)
@@ -752,6 +745,11 @@ static irqreturn_t nv_adma_interrupt(int
 			if (pp->flags & NV_ADMA_PORT_REGISTER_MODE) {
 				u8 irq_stat = readb(host->mmio_base + NV_INT_STATUS_CK804)
 					>> (NV_INT_PORT_SHIFT * i);
+				if (ata_tag_valid(ap->active_tag))
+					/* NV_INT_DEV indication seems unreliable at times,
+					 * at least in ADMA mode. Force it on always when a
+					 * command is active, to prevent losing interrupts. */
+					irq_stat |= NV_INT_DEV;
 				handled += nv_host_intr(ap, irq_stat);
 				continue;
 			}
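
As a rough illustration of the last hunk, the per-port status handling
can be modelled outside the kernel. This is a sketch under stated
assumptions: port_irq_stat() and NUM_PORTS are hypothetical, and the
values given for NV_INT_DEV and NV_INT_PORT_SHIFT are assumed here rather
than taken from the patch; check sata_nv.c for the real definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Values assumed for illustration only. */
#define NV_INT_DEV        0x01u  /* device raised a completion interrupt */
#define NV_INT_PORT_SHIFT 4      /* status bits per port in the shared byte */
#define NUM_PORTS         2      /* hypothetical port count for this model */

/*
 * Extract one port's interrupt bits from the shared status byte and,
 * if a command is outstanding on that port, force NV_INT_DEV on so the
 * caller always goes on to check the drive's busy state.
 */
static uint8_t port_irq_stat(uint8_t shared_stat, unsigned int port,
			     int cmd_active)
{
	uint8_t irq_stat;

	assert(port < NUM_PORTS);
	irq_stat = (uint8_t)(shared_stat >> (NV_INT_PORT_SHIFT * port));
	if (cmd_active)
		irq_stat |= NV_INT_DEV;  /* do not trust the hardware bit */
	return irq_stat;
}
```

For example, port_irq_stat(0x40, 1, 1) yields 0x05: the second port's own
0x04 bit plus the forced NV_INT_DEV bit. With no command active, the
hardware value is passed through unchanged.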


* Re: SATA exceptions with 2.6.20-rc5
  2007-01-23 23:18           ` SATA exceptions with 2.6.20-rc5 Robert Hancock
  2007-01-24  0:39             ` Björn Steinbrink
@ 2007-01-24  8:24             ` Ian Kumlien
  2007-01-24 14:41               ` Björn Steinbrink
  1 sibling, 1 reply; 9+ messages in thread
From: Ian Kumlien @ 2007-01-24  8:24 UTC (permalink / raw)
  To: Robert Hancock
  Cc: linux-kernel, Larry Walton, B.Steinbrink, s0348365, chunkeey,
	Jeff Garzik

[-- Attachment #1: Type: text/plain, Size: 1197 bytes --]

On Tue, 2007-01-23 at 17:18 -0600, Robert Hancock wrote:
> Larry Walton wrote:
> > The last patch (sata_nv-force-int-dev-in-interrupt.patch)
> > seems to have fixed the problem.  Much appreciated,
> > thank you. I'd consider it a must-have in 2.6.20.
> 
> Can any of the rest of you that have been seeing this problem also 
> confirm that this fixes it?

I applied it yesterday and today my dmesg contains three:
BUG: at mm/truncate.c:60 cancel_dirty_page()

Call Trace:
 [<ffffffff8029f3e5>] cancel_dirty_page+0x43/0x71
 [<ffffffff802ec1ab>] reiserfs_cut_from_item+0x5f8/0x61d
 [<ffffffff802074fc>] find_get_page+0x21/0x47
 [<ffffffff802ec51d>] reiserfs_do_truncate+0x34d/0x495
 [<ffffffff802d9d47>] reiserfs_truncate_file+0x199/0x2aa
 [<ffffffff802df9c5>] reiserfs_file_release+0x261/0x281
 [<ffffffff80211b02>] __fput+0xb1/0x17d
 [<ffffffff802218e0>] filp_close+0x5d/0x65
 [<ffffffff8021bef5>] sys_close+0x8c/0xcf
 [<ffffffff8025725e>] system_call+0x7e/0x83

These never appeared before I applied the patch. I don't know whether
they are related, though.

(It does fix the timeout problem)

-- 
Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net



* Re: SATA exceptions with 2.6.20-rc5
  2007-01-24  8:24             ` Ian Kumlien
@ 2007-01-24 14:41               ` Björn Steinbrink
  0 siblings, 0 replies; 9+ messages in thread
From: Björn Steinbrink @ 2007-01-24 14:41 UTC (permalink / raw)
  To: Ian Kumlien
  Cc: Robert Hancock, linux-kernel, Larry Walton, s0348365, chunkeey,
	Jeff Garzik

On 2007.01.24 09:24:00 +0100, Ian Kumlien wrote:
> On Tue, 2007-01-23 at 17:18 -0600, Robert Hancock wrote:
> > Larry Walton wrote:
> > > The last patch (sata_nv-force-int-dev-in-interrupt.patch)
> > > seems to have fixed the problem.  Much appreciated,
> > > thank you. I'd consider it a must-have in 2.6.20.
> > 
> > Can any of the rest of you that have been seeing this problem also 
> > confirm that this fixes it?
> 
> I applied it yesterday and today my dmesg contains three:
> BUG: at mm/truncate.c:60 cancel_dirty_page()

David Chinner sent two patches regarding that bug yesterday.
http://lkml.org/lkml/2007/1/23/190
http://lkml.org/lkml/2007/1/23/192

Björn


* Re: SATA exceptions with 2.6.20-rc5
  2007-01-24  0:39             ` Björn Steinbrink
  2007-01-24  2:09               ` [PATCH 2.6.20] sata_nv: don't rely on NV_INT_DEV indication with ADMA Robert Hancock
@ 2007-02-03  1:42               ` Björn Steinbrink
  2007-02-03  5:48                 ` Robert Hancock
  1 sibling, 1 reply; 9+ messages in thread
From: Björn Steinbrink @ 2007-02-03  1:42 UTC (permalink / raw)
  To: Robert Hancock, linux-kernel, Larry Walton, s0348365, pomac,
	chunkeey, Jeff Garzik

On 2007.01.24 01:39:23 +0100, Björn Steinbrink wrote:
> On 2007.01.23 17:18:43 -0600, Robert Hancock wrote:
> > Larry Walton wrote:
> > >The last patch (sata_nv-force-int-dev-in-interrupt.patch)
> > >seems to have fixed the problem.  Much appreciated,
> > >thank you. I'd consider it a must-have in 2.6.20.
> > 
> > Can any of the rest of you that have been seeing this problem also 
> > confirm that this fixes it?
> 
> Seems to work for me, uptime is about an hour now and no exception yet.
> Had the stress test running for only about 10 minutes, but I usually got
> an exception within an hour even during plain irssi usage, so I'm quite
> confident that the patch fixes it.

Or maybe not :( Just got an exception on 2.6.20-rc6. Took 4 days of
uptime to trigger, so it's just a lot harder to trigger now.

Björn


* Re: SATA exceptions with 2.6.20-rc5
  2007-02-03  1:42               ` SATA exceptions with 2.6.20-rc5 Björn Steinbrink
@ 2007-02-03  5:48                 ` Robert Hancock
  2007-02-04  1:13                   ` Björn Steinbrink
  0 siblings, 1 reply; 9+ messages in thread
From: Robert Hancock @ 2007-02-03  5:48 UTC (permalink / raw)
  To: Björn Steinbrink, Robert Hancock, linux-kernel,
	Larry Walton, s0348365, pomac, chunkeey, Jeff Garzik

Björn Steinbrink wrote:
> On 2007.01.24 01:39:23 +0100, Björn Steinbrink wrote:
>> On 2007.01.23 17:18:43 -0600, Robert Hancock wrote:
>>> Larry Walton wrote:
>>>> The last patch (sata_nv-force-int-dev-in-interrupt.patch)
>>>> seems to have fixed the problem.  Much appreciated,
>>>> thank you. I'd consider it a must-have in 2.6.20.
>>> Can any of the rest of you that have been seeing this problem also 
>>> confirm that this fixes it?
>> Seems to work for me, uptime is about an hour now and no exception yet.
>> Had the stress test running for only about 10 minutes, but I usually got
>> an exception within an hour even during plain irssi usage, so I'm quite
>> confident that the patch fixes it.
> 
> Or maybe not :( Just got an exception on 2.6.20-rc6. Took 4 days of
> uptime to trigger, so it's just a lot harder to trigger now.

Same exception details as before?

There's a patch in -mm (sata_nv-use-adma-for-nodata-commands.patch) 
which should hopefully avoid this problem for the cache flush commands, 
at least - can you try that one out? You'll have to apply the other 
sata_nv patches in -mm first, i.e. this order:

http://www2.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20-rc6/2.6.20-rc6-mm3/broken-out/sata_nv-cleanup-adma-error-handling-v2.patch
http://www2.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20-rc6/2.6.20-rc6-mm3/broken-out/sata_nv-cleanup-adma-error-handling-v2-cleanup.patch
http://www2.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20-rc6/2.6.20-rc6-mm3/broken-out/sata_nv-use-adma-for-nodata-commands.patch

-- 
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from hancockr@nospamshaw.ca
Home Page: http://www.roberthancock.com/



* Re: SATA exceptions with 2.6.20-rc5
  2007-02-03  5:48                 ` Robert Hancock
@ 2007-02-04  1:13                   ` Björn Steinbrink
  2007-02-09 12:03                     ` Björn Steinbrink
  0 siblings, 1 reply; 9+ messages in thread
From: Björn Steinbrink @ 2007-02-04  1:13 UTC (permalink / raw)
  To: Robert Hancock
  Cc: linux-kernel, Larry Walton, s0348365, pomac, chunkeey, Jeff Garzik

On 2007.02.02 23:48:14 -0600, Robert Hancock wrote:
> Björn Steinbrink wrote:
> >On 2007.01.24 01:39:23 +0100, Björn Steinbrink wrote:
> >>On 2007.01.23 17:18:43 -0600, Robert Hancock wrote:
> >>>Larry Walton wrote:
> >>>>The last patch (sata_nv-force-int-dev-in-interrupt.patch)
> >>>>seems to have fixed the problem.  Much appreciated,
> >>>>thank you. I'd consider it a must-have in 2.6.20.
> >>>Can any of the rest of you that have been seeing this problem also 
> >>>confirm that this fixes it?
> >>Seems to work for me, uptime is about an hour now and no exception yet.
> >>Had the stress test running for only about 10 minutes, but I usually got
> >>an exception within an hour even during plain irssi usage, so I'm quite
> >>confident that the patch fixes it.
> >
> >Or maybe not :( Just got an exception on 2.6.20-rc6. Took 4 days of
> >uptime to trigger, so it's just a lot harder to trigger now.
> 
> Same exception details as before?

Yes, exactly the same.

> There's a patch in -mm (sata_nv-use-adma-for-nodata-commands.patch) 
> which should hopefully avoid this problem for the cache flush commands, 
> at least - can you try that one out? You'll have to apply the other 
> sata_nv patches in -mm first, i.e. this order:
> 
> http://www2.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20-rc6/2.6.20-rc6-mm3/broken-out/sata_nv-cleanup-adma-error-handling-v2.patch
> http://www2.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20-rc6/2.6.20-rc6-mm3/broken-out/sata_nv-cleanup-adma-error-handling-v2-cleanup.patch
> http://www2.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20-rc6/2.6.20-rc6-mm3/broken-out/sata_nv-use-adma-for-nodata-commands.patch

Got 2.6.20-rc7 with them applied now (the rejects seemed trivial enough
for me to fix them). Let's see how that works out...

Björn


* Re: SATA exceptions with 2.6.20-rc5
  2007-02-04  1:13                   ` Björn Steinbrink
@ 2007-02-09 12:03                     ` Björn Steinbrink
  0 siblings, 0 replies; 9+ messages in thread
From: Björn Steinbrink @ 2007-02-09 12:03 UTC (permalink / raw)
  To: Robert Hancock, linux-kernel, Larry Walton, s0348365, pomac,
	chunkeey, Jeff Garzik

On 2007.02.04 02:13:51 +0100, Björn Steinbrink wrote:
> On 2007.02.02 23:48:14 -0600, Robert Hancock wrote:
> > There's a patch in -mm (sata_nv-use-adma-for-nodata-commands.patch) 
> > which should hopefully avoid this problem for the cache flush commands, 
> > at least - can you try that one out? You'll have to apply the other 
> > sata_nv patches in -mm first, i.e. this order:
> > 
> > http://www2.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20-rc6/2.6.20-rc6-mm3/broken-out/sata_nv-cleanup-adma-error-handling-v2.patch
> > http://www2.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20-rc6/2.6.20-rc6-mm3/broken-out/sata_nv-cleanup-adma-error-handling-v2-cleanup.patch
> > http://www2.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20-rc6/2.6.20-rc6-mm3/broken-out/sata_nv-use-adma-for-nodata-commands.patch
> 
> Got 2.6.20-rc7 with them applied now (the rejects seemed trivial enough
> for me to fix them). Let's see how that works out...

After about 1.5 days of uptime, an involuntary reboot and another 3
days of uptime, no sign of an exception. No stress testing was done,
but a few disk intensive actions did happen, at least more than with
that -rc6 that did throw an exception at me.

Björn

