LKML Archive on lore.kernel.org
* Panic in multiple kernels: IA64 SBA IOMMU: Culprit commit on Mar 28, 2008
@ 2008-11-03 23:23 Shehjar Tikoo
  2008-11-04 22:13 ` Luck, Tony
  2008-11-05 18:26 ` FUJITA Tomonori
  0 siblings, 2 replies; 8+ messages in thread
From: Shehjar Tikoo @ 2008-11-03 23:23 UTC (permalink / raw)
  To: fujita.tomonori, akpm, tony.luck, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 941 bytes --]

Hi All

I've been observing kernel panics for the past week on
kernel versions 2.6.26, 2.6.27 but not on 2.6.24 and 2.6.25.

The panic message says:

arch/ia64/hp/common/sba_iommu.c: I/O MMU is out of mapping resources

Using git-bisect, I've zeroed in on the commit that introduced this.
Please see the attached file for the commit.

The workload consists of 2 tests:
1. Single fio process writing a 1 TB file.
2. 15 fio processes writing 15GB files each.

The panic happens on both workloads. There is no stack trace after
the above message.

Other info:
System is an HP RX6600 (16 GB RAM, 16 processors w/ dual cores and HT)
20 SATA disks under software RAID0 with 6 TB capacity.
Silicon Image 3124 controller.
File system is XFS.

I'd much appreciate some help in fixing this because this panic has
basically stalled my own work. I'd be willing to run more tests on my
setup to test any patches that possibly fix this issue.

Regards
Shehjar

[-- Attachment #2: iommu-panic-culprit-commit.patch --]
[-- Type: text/x-patch, Size: 6793 bytes --]

commit b34eb53cdcb4f49fd31d78d0e385240820ed9063
Author: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Date:   Fri Mar 28 14:27:03 2008 -0700

    [IA64] make IOMMU respect the segment boundary limits
    
    IA64's IOMMU implementation allocates memory areas spanning LLD's segment
    boundary limit.  It forces low level drivers to have a workaround to adjust
    scatter lists that the IOMMU builds.
    
    We are in the process of making all the IOMMUs respect the segment boundary
    limits to remove such work around in LLDs.  This patch is for IA64's IOMMU.
    
    Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Tony Luck <tony.luck@intel.com>

diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 8fa3faf..1b73ffe 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -611,6 +611,9 @@ config IRQ_PER_CPU
 	bool
 	default y
 
+config IOMMU_HELPER
+	def_bool (IA64_HP_ZX1 || IA64_HP_ZX1_SWIOTLB || IA64_GENERIC)
+
 source "arch/ia64/hp/sim/Kconfig"
 
 source "arch/ia64/Kconfig.debug"
diff --git a/arch/ia64/hp/common/sba_iommu.c b/arch/ia64/hp/common/sba_iommu.c
index 523eae6..9409de5 100644
--- a/arch/ia64/hp/common/sba_iommu.c
+++ b/arch/ia64/hp/common/sba_iommu.c
@@ -35,6 +35,7 @@
 #include <linux/nodemask.h>
 #include <linux/bitops.h>         /* hweight64() */
 #include <linux/crash_dump.h>
+#include <linux/iommu-helper.h>
 
 #include <asm/delay.h>		/* ia64_get_itc() */
 #include <asm/io.h>
@@ -460,6 +461,13 @@ get_iovp_order (unsigned long size)
 	return order;
 }
 
+static unsigned long ptr_to_pide(struct ioc *ioc, unsigned long *res_ptr,
+				 unsigned int bitshiftcnt)
+{
+	return (((unsigned long)res_ptr - (unsigned long)ioc->res_map) << 3)
+		+ bitshiftcnt;
+}
+
 /**
  * sba_search_bitmap - find free space in IO PDIR resource bitmap
  * @ioc: IO MMU structure which owns the pdir we are interested in.
@@ -471,15 +479,25 @@ get_iovp_order (unsigned long size)
  * Cool perf optimization: search for log2(size) bits at a time.
  */
 static SBA_INLINE unsigned long
-sba_search_bitmap(struct ioc *ioc, unsigned long bits_wanted, int use_hint)
+sba_search_bitmap(struct ioc *ioc, struct device *dev,
+		  unsigned long bits_wanted, int use_hint)
 {
 	unsigned long *res_ptr;
 	unsigned long *res_end = (unsigned long *) &(ioc->res_map[ioc->res_size]);
-	unsigned long flags, pide = ~0UL;
+	unsigned long flags, pide = ~0UL, tpide;
+	unsigned long boundary_size;
+	unsigned long shift;
+	int ret;
 
 	ASSERT(((unsigned long) ioc->res_hint & (sizeof(unsigned long) - 1UL)) == 0);
 	ASSERT(res_ptr < res_end);
 
+	boundary_size = (unsigned long long)dma_get_seg_boundary(dev) + 1;
+	boundary_size = ALIGN(boundary_size, 1ULL << iovp_shift) >> iovp_shift;
+
+	BUG_ON(ioc->ibase & ~iovp_mask);
+	shift = ioc->ibase >> iovp_shift;
+
 	spin_lock_irqsave(&ioc->res_lock, flags);
 
 	/* Allow caller to force a search through the entire resource space */
@@ -504,9 +522,7 @@ sba_search_bitmap(struct ioc *ioc, unsigned long bits_wanted, int use_hint)
 			if (likely(*res_ptr != ~0UL)) {
 				bitshiftcnt = ffz(*res_ptr);
 				*res_ptr |= (1UL << bitshiftcnt);
-				pide = ((unsigned long)res_ptr - (unsigned long)ioc->res_map);
-				pide <<= 3;	/* convert to bit address */
-				pide += bitshiftcnt;
+				pide = ptr_to_pide(ioc, res_ptr, bitshiftcnt);
 				ioc->res_bitshift = bitshiftcnt + bits_wanted;
 				goto found_it;
 			}
@@ -535,11 +551,13 @@ sba_search_bitmap(struct ioc *ioc, unsigned long bits_wanted, int use_hint)
 			DBG_RES("    %p %lx %lx\n", res_ptr, mask, *res_ptr);
 			ASSERT(0 != mask);
 			for (; mask ; mask <<= o, bitshiftcnt += o) {
-				if(0 == ((*res_ptr) & mask)) {
+				tpide = ptr_to_pide(ioc, res_ptr, bitshiftcnt);
+				ret = iommu_is_span_boundary(tpide, bits_wanted,
+							     shift,
+							     boundary_size);
+				if ((0 == ((*res_ptr) & mask)) && !ret) {
 					*res_ptr |= mask;     /* mark resources busy! */
-					pide = ((unsigned long)res_ptr - (unsigned long)ioc->res_map);
-					pide <<= 3;	/* convert to bit address */
-					pide += bitshiftcnt;
+					pide = tpide;
 					ioc->res_bitshift = bitshiftcnt + bits_wanted;
 					goto found_it;
 				}
@@ -560,6 +578,11 @@ sba_search_bitmap(struct ioc *ioc, unsigned long bits_wanted, int use_hint)
 		end = res_end - qwords;
 
 		for (; res_ptr < end; res_ptr++) {
+			tpide = ptr_to_pide(ioc, res_ptr, 0);
+			ret = iommu_is_span_boundary(tpide, bits_wanted,
+						     shift, boundary_size);
+			if (ret)
+				goto next_ptr;
 			for (i = 0 ; i < qwords ; i++) {
 				if (res_ptr[i] != 0)
 					goto next_ptr;
@@ -572,8 +595,7 @@ sba_search_bitmap(struct ioc *ioc, unsigned long bits_wanted, int use_hint)
 				res_ptr[i] = ~0UL;
 			res_ptr[i] |= RESMAP_MASK(bits);
 
-			pide = ((unsigned long)res_ptr - (unsigned long)ioc->res_map);
-			pide <<= 3;	/* convert to bit address */
+			pide = tpide;
 			res_ptr += qwords;
 			ioc->res_bitshift = bits;
 			goto found_it;
@@ -605,7 +627,7 @@ found_it:
  * resource bit map.
  */
 static int
-sba_alloc_range(struct ioc *ioc, size_t size)
+sba_alloc_range(struct ioc *ioc, struct device *dev, size_t size)
 {
 	unsigned int pages_needed = size >> iovp_shift;
 #ifdef PDIR_SEARCH_TIMING
@@ -622,9 +644,9 @@ sba_alloc_range(struct ioc *ioc, size_t size)
 	/*
 	** "seek and ye shall find"...praying never hurts either...
 	*/
-	pide = sba_search_bitmap(ioc, pages_needed, 1);
+	pide = sba_search_bitmap(ioc, dev, pages_needed, 1);
 	if (unlikely(pide >= (ioc->res_size << 3))) {
-		pide = sba_search_bitmap(ioc, pages_needed, 0);
+		pide = sba_search_bitmap(ioc, dev, pages_needed, 0);
 		if (unlikely(pide >= (ioc->res_size << 3))) {
 #if DELAYED_RESOURCE_CNT > 0
 			unsigned long flags;
@@ -653,7 +675,7 @@ sba_alloc_range(struct ioc *ioc, size_t size)
 			}
 			spin_unlock_irqrestore(&ioc->saved_lock, flags);
 
-			pide = sba_search_bitmap(ioc, pages_needed, 0);
+			pide = sba_search_bitmap(ioc, dev, pages_needed, 0);
 			if (unlikely(pide >= (ioc->res_size << 3)))
 				panic(__FILE__ ": I/O MMU @ %p is out of mapping resources\n",
 				      ioc->ioc_hpa);
@@ -936,7 +958,7 @@ sba_map_single(struct device *dev, void *addr, size_t size, int dir)
 	spin_unlock_irqrestore(&ioc->res_lock, flags);
 #endif
 
-	pide = sba_alloc_range(ioc, size);
+	pide = sba_alloc_range(ioc, dev, size);
 
 	iovp = (dma_addr_t) pide << iovp_shift;
 
@@ -1373,7 +1395,7 @@ sba_coalesce_chunks(struct ioc *ioc, struct device *dev,
 		dma_len = (dma_len + dma_offset + ~iovp_mask) & iovp_mask;
 		ASSERT(dma_len <= DMA_CHUNK_SIZE);
 		dma_sg->dma_address = (dma_addr_t) (PIDE_FLAG
-			| (sba_alloc_range(ioc, dma_len) << iovp_shift)
+			| (sba_alloc_range(ioc, dev, dma_len) << iovp_shift)
 			| dma_offset);
 		n_mappings++;
 	}



* RE: Panic in multiple kernels: IA64 SBA IOMMU: Culprit commit on Mar 28, 2008
  2008-11-03 23:23 Panic in multiple kernels: IA64 SBA IOMMU: Culprit commit on Mar 28, 2008 Shehjar Tikoo
@ 2008-11-04 22:13 ` Luck, Tony
  2008-11-06  3:01   ` Shehjar Tikoo
  2008-11-05 18:26 ` FUJITA Tomonori
  1 sibling, 1 reply; 8+ messages in thread
From: Luck, Tony @ 2008-11-04 22:13 UTC (permalink / raw)
  To: Shehjar Tikoo, fujita.tomonori, akpm, linux-kernel; +Cc: linux-ia64

Added Cc: linux-ia64 ... more likely to attract attention of HP
ia64 experts there.

> arch/ia64/hp/common/sba_iommu.c: I/O MMU is out of mapping resources

Odd ... the code (back to the dawn of git time in 2.6.12-rc1) looks like

        panic(__FILE__ ": I/O MMU @ %p is out of mapping resources\n",
                ioc->ioc_hpa);

I wonder why you don't see the "@ HEXADDRESS"?

> Using git-bisect, I've zeroed in on the commit that introduced this.
> Please see the attached file for the commit.

Did you confirm that reverting this commit on a recent kernel
fixes the problem? Once in a while git bisect can point to
the wrong commit. It seems very likely that it got the
right one here, but it is always good to check. When I
tried to use "patch -R" to revert this, it got confused on
the Kconfig file because the lines that were added were
subsequently changed, so you may need to revert that part
by hand (the sba_iommu.c changes apparently reverted OK).

> Other info:
> System is HP RX6600(16Gb RAM, 16 processors w/ dual cores and HT)
> 20 SATA disks under software RAID0 with 6 TB capacity.
> Silicon Image 3124 controller.
> File system is XFS.

My HP test system is way too small to attempt to recreate
this (just 2 cpus & 1 disk).  How long does each of your
tests take to hit the problems ... a few minutes? Or hours?

> I'd much appreciate some help in fixing this because this panic has
> basically stalled my own work. I'd be willing to run more tests on my
> setup to test any patches that possibly fix this issue.

Adding some printk() before the panic might give a clue as to what
is going wrong.  Either a bogus call is trying to allocate far
too much space, or the bitmap is leaking, or we have a totally
messed up "ioc" structure.

Printing "pages_needed", the address of "ioc", and some interesting
fields from ioc (at least ioc->res_size) would help.  I assume
the return value from sba_search_bitmap() is ~0x0 ... but
you should print "pide" just to be sure.

-Tony


* Re: Panic in multiple kernels: IA64 SBA IOMMU: Culprit commit on Mar 28, 2008
  2008-11-03 23:23 Panic in multiple kernels: IA64 SBA IOMMU: Culprit commit on Mar 28, 2008 Shehjar Tikoo
  2008-11-04 22:13 ` Luck, Tony
@ 2008-11-05 18:26 ` FUJITA Tomonori
  2008-11-06  3:06   ` Shehjar Tikoo
  1 sibling, 1 reply; 8+ messages in thread
From: FUJITA Tomonori @ 2008-11-05 18:26 UTC (permalink / raw)
  To: shehjart; +Cc: fujita.tomonori, akpm, tony.luck, linux-kernel, linux-parisc

Sorry for the delay.

CC'ed linux-parisc since the same problem could happen to parisc.

On Tue, 04 Nov 2008 10:23:58 +1100
Shehjar Tikoo <shehjart@cse.unsw.edu.au> wrote:

> I've been observing kernel panics for the past week on
> kernel versions 2.6.26, 2.6.27 but not on 2.6.24 and 2.6.25.
> 
> The panic message says:
> 
> arch/ia64/hp/common/sba_iommu.c: I/O MMU is out of mapping resources
> 
> Using git-bisect, I've zeroed in on the commit that introduced this.
> Please see the attached file for the commit.
> 
> The workload consists of 2 tests:
> 1. Single fio process writing a 1 TB file.
> 2. 15 fio processes writing 15GB files each.
> 
> The panic happens on both workloads. There is no stack trace after
> the above message.
> 
> Other info:
> System is HP RX6600(16Gb RAM, 16 processors w/ dual cores and HT)
> 20 SATA disks under software RAID0 with 6 TB capacity.
> Silicon Image 3124 controller.
> File system is XFS.
> 
> I'd much appreciate some help in fixing this because this panic has
> basically stalled my own work. I'd be willing to run more tests on my
> setup to test any patches that possibly fix this issue.

This patch modified the sba IOMMU driver to support LLDs' segment
boundary limits properly.

ATA hardware has a restrictive segment boundary limit of 64KB. In
addition, the sba IOMMU driver uses a size-aligned allocation
algorithm. Together these make it difficult for the IOMMU driver to
find a suitable I/O address range. I think you hit the allocation
failure because of this (of course, it's possible that my change
broke the IOMMU driver, but I can't find a problem so far).
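
As a rough illustration of the constraint described above, the following
userspace sketch is modeled on my understanding of the check the patch
performs through iommu_is_span_boundary(). The function names
boundary_pages(), spans_boundary() and round_up_pow2() are hypothetical,
and the 4 KB IOMMU page size used in the note below is an assumption; the
driver derives the real value from iovp_shift.

```c
#include <assert.h>

/*
 * Convert a device segment boundary mask (e.g. 0xffff for 64KB) into a
 * boundary expressed in IOMMU pages, mirroring the two boundary_size
 * lines the patch adds to sba_search_bitmap().
 */
unsigned long long boundary_pages(unsigned long long seg_boundary_mask,
				  unsigned int page_shift)
{
	unsigned long long page_size = 1ULL << page_shift;
	unsigned long long size = seg_boundary_mask + 1;

	/* Round up to a whole number of IOMMU pages, then convert to pages. */
	size = (size + page_size - 1) & ~(page_size - 1);
	return size >> page_shift;
}

/*
 * Return 1 if a mapping of nr pages starting at page index would cross a
 * segment boundary, 0 otherwise. shift is the page offset of the IOVA
 * base; boundary_size is in pages and must be a power of two.
 */
int spans_boundary(unsigned long long index, unsigned long long nr,
		   unsigned long long shift, unsigned long long boundary_size)
{
	unsigned long long offset;

	assert(boundary_size != 0 &&
	       (boundary_size & (boundary_size - 1)) == 0);

	offset = (shift + index) & (boundary_size - 1);
	return offset + nr > boundary_size;
}

/* Smallest power of two >= n: how a size-aligned allocator rounds a request. */
unsigned long long round_up_pow2(unsigned long long n)
{
	unsigned long long p = 1;

	while (p < n)
		p <<= 1;
	return p;
}
```

With a 64KB boundary and the assumed 4 KB pages, the boundary is 16 pages.
A 5-page request is rounded up to an 8-page slot by a size-aligned
allocator; starting at page 8 it fits, while the same request starting at
page 12 would cross the boundary and has to be rejected.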

To make matters worse, the sba IOMMU driver panics when the allocation
fails. IIRC, only the IA64 and parisc IOMMU drivers panic by default
on allocation failure. I think that we need to change them to handle
the failure properly.

Can you try this? I've not fixed the map_single failure path yet, but
I think that you hit the allocation failure in the map_sg path.


diff --git a/arch/ia64/hp/common/sba_iommu.c b/arch/ia64/hp/common/sba_iommu.c
index d98f0f4..8f44dc8 100644
--- a/arch/ia64/hp/common/sba_iommu.c
+++ b/arch/ia64/hp/common/sba_iommu.c
@@ -676,12 +676,19 @@ sba_alloc_range(struct ioc *ioc, struct device *dev, size_t size)
 			spin_unlock_irqrestore(&ioc->saved_lock, flags);
 
 			pide = sba_search_bitmap(ioc, dev, pages_needed, 0);
-			if (unlikely(pide >= (ioc->res_size << 3)))
-				panic(__FILE__ ": I/O MMU @ %p is out of mapping resources\n",
-				      ioc->ioc_hpa);
+			if (unlikely(pide >= (ioc->res_size << 3))) {
+				printk(KERN_WARNING "%s: I/O MMU @ %p is "
+				       "out of mapping resources, %u %u %lx\n",
+				       __func__, ioc->ioc_hpa, ioc->res_size,
+				       pages_needed, dma_get_seg_boundary(dev));
+				return -1;
+			}
 #else
-			panic(__FILE__ ": I/O MMU @ %p is out of mapping resources\n",
-			      ioc->ioc_hpa);
+			printk(KERN_WARNING "%s: I/O MMU @ %p is "
+			       "out of mapping resources, %u %u %lx\n",
+			       __func__, ioc->ioc_hpa, ioc->res_size,
+			       pages_needed, dma_get_seg_boundary(dev));
+			return -1;
 #endif
 		}
 	}
@@ -962,6 +969,7 @@ sba_map_single_attrs(struct device *dev, void *addr, size_t size, int dir,
 #endif
 
 	pide = sba_alloc_range(ioc, dev, size);
+	BUG_ON(pide < 0);
 
 	iovp = (dma_addr_t) pide << iovp_shift;
 
@@ -1304,6 +1312,7 @@ sba_coalesce_chunks(struct ioc *ioc, struct device *dev,
 	unsigned long dma_offset, dma_len; /* start/len of DMA stream */
 	int n_mappings = 0;
 	unsigned int max_seg_size = dma_get_max_seg_size(dev);
+	int idx;
 
 	while (nents > 0) {
 		unsigned long vaddr = (unsigned long) sba_sg_address(startsg);
@@ -1402,9 +1411,13 @@ sba_coalesce_chunks(struct ioc *ioc, struct device *dev,
 		vcontig_sg->dma_length = vcontig_len;
 		dma_len = (dma_len + dma_offset + ~iovp_mask) & iovp_mask;
 		ASSERT(dma_len <= DMA_CHUNK_SIZE);
-		dma_sg->dma_address = (dma_addr_t) (PIDE_FLAG
-			| (sba_alloc_range(ioc, dev, dma_len) << iovp_shift)
-			| dma_offset);
+		idx = sba_alloc_range(ioc, dev, dma_len);
+		if (idx < 0) {
+			dma_sg->dma_length = 0;
+			return -1;
+		}
+		dma_sg->dma_address = (dma_addr_t)(PIDE_FLAG | (idx << iovp_shift)
+						   | dma_offset);
 		n_mappings++;
 	}
 
@@ -1476,6 +1489,10 @@ int sba_map_sg_attrs(struct device *dev, struct scatterlist *sglist, int nents,
 	** Access to the virtual address is what forces a two pass algorithm.
 	*/
 	coalesced = sba_coalesce_chunks(ioc, dev, sglist, nents);
+	if (coalesced < 0) {
+		sba_unmap_sg_attrs(dev, sglist, nents, dir, attrs);
+		return 0;
+	}
 
 	/*
 	** Program the I/O Pdir



* Re: Panic in multiple kernels: IA64 SBA IOMMU: Culprit commit on Mar 28, 2008
  2008-11-04 22:13 ` Luck, Tony
@ 2008-11-06  3:01   ` Shehjar Tikoo
  0 siblings, 0 replies; 8+ messages in thread
From: Shehjar Tikoo @ 2008-11-06  3:01 UTC (permalink / raw)
  To: Luck, Tony; +Cc: fujita.tomonori, akpm, linux-kernel, linux-ia64, linux-parisc

Luck, Tony wrote:
> Added Cc: linux-ia64 ... more likely to attract attention of HP
> ia64 experts there.
> 
>> arch/ia64/hp/common/sba_iommu.c: I/O MMU is out of mapping resources
> 
> Odd ... the code (back to the dawn of git time in 2.6.12-rc1) looks like
> 
>         panic(__FILE__ ": I/O MMU @ %p is out of mapping resources\n",
>                 ioc->ioc_hpa);
> 
> I wonder why you don't see the "@ HEXADDRESS"?

That was copy paste from memory. You're right. There is a hex address.
I've copied a full message at the end of the email.

> 
>> Using git-bisect, I've zeroed in on the commit that introduced this.
>> Please see the attached file for the commit.
> 
> Did you confirm that reverting this commit on a recent kernel
> fixes the problem (once in a while git bisect can point to
> the wrong commit ... it seems very likely that it got the
> right one here, but it is always good to check).  When I
> tried to use "patch -R" to revert this it got confused on
> the Kconfig file because the lines that were added were
> subsequently changed ... so you may need to revert that
> by hand ... the sba_iommu.c apparently reverted ok).


Yes, reverting this commit in 2.6.27 prevents kernel panic on both
workloads.

> 
>> Other info:
>> System is HP RX6600(16Gb RAM, 16 processors w/ dual cores and HT)
>> 20 SATA disks under software RAID0 with 6 TB capacity.
>> Silicon Image 3124 controller.
>> File system is XFS.
> 
> My HP test system is way too small to attempt to recreate
> this (just 2 cpus & 1 disk).  How long does each of your
> tests take to hit the problems ... a few minutes? Or hours?

The point at which the panic occurs varies for both tests, but
generally I felt the panics were occurring nearer to the end of the
750G to 1TB writes.

> 
>> I'd much appreciate some help in fixing this because this panic has
>> basically stalled my own work. I'd be willing to run more tests on my
>> setup to test any patches that possibly fix this issue.
> 
> Adding some printk() before the panic might give a clue as to what
> is going wrong.  Either a bogus call is trying to allocate far
> too much space, or the bitmap is leaking, or we have a totally
> messed up "ioc" structure.
> 
> Printing "pages_needed", the address of "ioc", and some interesting
> fields from ioc (at least ioc->res_size) would help.  I assume
> the return value from sba_search_bitmap() is ~0x0 ... but
> you should print "pide" just to be sure.


Here's some more info from a printk:

Kernel panic - not syncing: arch/ia64/hp/common/sba_iommu.c: I/O MMU @ 
c0000000fed01000 is out of mapping resources: pide: 
18446744073709551615, pages_needed: 5, iocres_size: 8192
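
To help read these numbers, here is a small sketch of the failure test
that sba_alloc_range() applies to the value returned by
sba_search_bitmap(). The helper names res_map_bits() and search_failed()
are hypothetical; only the comparison pide >= (ioc->res_size << 3) is
taken from the driver code quoted in this thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of bits (one per IOMMU page slot) tracked by a resource map
 * that is res_size bytes long.
 */
uint64_t res_map_bits(uint64_t res_size)
{
	/* Guard against overflow when converting bytes to bits. */
	assert(res_size <= (UINT64_MAX >> 3));
	return res_size << 3;
}

/*
 * Mirrors the test in sba_alloc_range(): a returned index at or beyond
 * the end of the bitmap means the search found no free range.
 */
int search_failed(uint64_t pide, uint64_t res_size)
{
	return pide >= res_map_bits(res_size);
}
```

The reported pide, 18446744073709551615, equals 2^64 - 1, which is the
~0UL that sba_search_bitmap() initializes pide to on a 64-bit system, so
the search found nothing. With a res_size of 8192 bytes the map tracks
65536 page slots, and the failing request needed only 5 of them.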

> 
> -Tony



* Re: Panic in multiple kernels: IA64 SBA IOMMU: Culprit commit on Mar 28, 2008
  2008-11-05 18:26 ` FUJITA Tomonori
@ 2008-11-06  3:06   ` Shehjar Tikoo
  2008-11-07  3:49     ` FUJITA Tomonori
  0 siblings, 1 reply; 8+ messages in thread
From: Shehjar Tikoo @ 2008-11-06  3:06 UTC (permalink / raw)
  To: FUJITA Tomonori; +Cc: akpm, tony.luck, linux-kernel, linux-parisc

FUJITA Tomonori wrote:
> Sorry for the delay.
> 
> CC'ed linux-parisc since the same problem could happen to parisc.
> 
> On Tue, 04 Nov 2008 10:23:58 +1100
> Shehjar Tikoo <shehjart@cse.unsw.edu.au> wrote:
> 
>> I've been observing kernel panics for the past week on
>> kernel versions 2.6.26, 2.6.27 but not on 2.6.24 and 2.6.25.
>>
>> The panic message says:
>>
>> arch/ia64/hp/common/sba_iommu.c: I/O MMU is out of mapping resources
>>
>> Using git-bisect, I've zeroed in on the commit that introduced this.
>> Please see the attached file for the commit.
>>
>> The workload consists of 2 tests:
>> 1. Single fio process writing a 1 TB file.
>> 2. 15 fio processes writing 15GB files each.
>>
>> The panic happens on both workloads. There is no stack trace after
>> the above message.
>>
>> Other info:
>> System is HP RX6600(16Gb RAM, 16 processors w/ dual cores and HT)
>> 20 SATA disks under software RAID0 with 6 TB capacity.
>> Silicon Image 3124 controller.
>> File system is XFS.
>>
>> I'd much appreciate some help in fixing this because this panic has
>> basically stalled my own work. I'd be willing to run more tests on my
>> setup to test any patches that possibly fix this issue.
> 
> This patch modified the sba IOMMU driver to support LLDs' segment
> boundary limits properly.
> 
> ATA hardware has poor segment boundary limit, 64KB. In addition, sba
> IOMMU driver uses size-aligned allocation algorithm. It means that
> it's difficult for the IOMMU driver to find an appropriate I/O address
> space. I think that you hit the allocation failure due to this problem
> (of course, it's possible that my change breaks the IOMMU driver but I
> can't find a problem so far).
> 
> To make matters worse, sba IOMMU driver panic when the allocation
> fails. IIRC, only IA64 and parisc IOMMU drivers panic by default in
> the case of the allocation failure. I think that we need to change
> them to handle the failure properly.
> 
> Can you try this? I've not fixed map_single failure yet but I think
> that you hit the failure allocation in map_sg path.
> 

On 2.6.27, this patch seems to prevent the panic from happening for
both the tests I had described earlier. Do you need more info to 
validate this? I will be running more tests with this patch over
the next few days, so we'll find out anyway.

Thanks
Shehjar

> 
> diff --git a/arch/ia64/hp/common/sba_iommu.c b/arch/ia64/hp/common/sba_iommu.c
> index d98f0f4..8f44dc8 100644
> --- a/arch/ia64/hp/common/sba_iommu.c
> +++ b/arch/ia64/hp/common/sba_iommu.c
> @@ -676,12 +676,19 @@ sba_alloc_range(struct ioc *ioc, struct device *dev, size_t size)
>  			spin_unlock_irqrestore(&ioc->saved_lock, flags);
>  
>  			pide = sba_search_bitmap(ioc, dev, pages_needed, 0);
> -			if (unlikely(pide >= (ioc->res_size << 3)))
> -				panic(__FILE__ ": I/O MMU @ %p is out of mapping resources\n",
> -				      ioc->ioc_hpa);
> +			if (unlikely(pide >= (ioc->res_size << 3))) {
> +				printk(KERN_WARNING "%s: I/O MMU @ %p is"
> +				       "out of mapping resources, %u %u %lx\n",
> +				       __func__, ioc->ioc_hpa, ioc->res_size,
> +				       pages_needed, dma_get_seg_boundary(dev));
> +				return -1;
> +			}
>  #else
> -			panic(__FILE__ ": I/O MMU @ %p is out of mapping resources\n",
> -			      ioc->ioc_hpa);
> +			printk(KERN_WARNING "%s: I/O MMU @ %p is"
> +			       "out of mapping resources, %u %u %lx\n",
> +			       __func__, ioc->ioc_hpa, ioc->res_size,
> +			       pages_needed, dma_get_seg_boundary(dev));
> +			return -1;
>  #endif
>  		}
>  	}
> @@ -962,6 +969,7 @@ sba_map_single_attrs(struct device *dev, void *addr, size_t size, int dir,
>  #endif
>  
>  	pide = sba_alloc_range(ioc, dev, size);
> +	BUG_ON(pide < 0);
>  
>  	iovp = (dma_addr_t) pide << iovp_shift;
>  
> @@ -1304,6 +1312,7 @@ sba_coalesce_chunks(struct ioc *ioc, struct device *dev,
>  	unsigned long dma_offset, dma_len; /* start/len of DMA stream */
>  	int n_mappings = 0;
>  	unsigned int max_seg_size = dma_get_max_seg_size(dev);
> +	int idx;
>  
>  	while (nents > 0) {
>  		unsigned long vaddr = (unsigned long) sba_sg_address(startsg);
> @@ -1402,9 +1411,13 @@ sba_coalesce_chunks(struct ioc *ioc, struct device *dev,
>  		vcontig_sg->dma_length = vcontig_len;
>  		dma_len = (dma_len + dma_offset + ~iovp_mask) & iovp_mask;
>  		ASSERT(dma_len <= DMA_CHUNK_SIZE);
> -		dma_sg->dma_address = (dma_addr_t) (PIDE_FLAG
> -			| (sba_alloc_range(ioc, dev, dma_len) << iovp_shift)
> -			| dma_offset);
> +		idx = sba_alloc_range(ioc, dev, dma_len);
> +		if (idx < 0) {
> +			dma_sg->dma_length = 0;
> +			return -1;
> +		}
> +		dma_sg->dma_address = (dma_addr_t)(PIDE_FLAG | (idx << iovp_shift)
> +						   | dma_offset);
>  		n_mappings++;
>  	}
>  
> @@ -1476,6 +1489,10 @@ int sba_map_sg_attrs(struct device *dev, struct scatterlist *sglist, int nents,
>  	** Access to the virtual address is what forces a two pass algorithm.
>  	*/
>  	coalesced = sba_coalesce_chunks(ioc, dev, sglist, nents);
> +	if (coalesced < 0) {
> +		sba_unmap_sg_attrs(dev, sglist, nents, dir, attrs);
> +		return 0;
> +	}
>  
>  	/*
>  	** Program the I/O Pdir



* Re: Panic in multiple kernels: IA64 SBA IOMMU: Culprit commit on Mar 28, 2008
  2008-11-06  3:06   ` Shehjar Tikoo
@ 2008-11-07  3:49     ` FUJITA Tomonori
  2008-11-07 16:58       ` Luck, Tony
  0 siblings, 1 reply; 8+ messages in thread
From: FUJITA Tomonori @ 2008-11-07  3:49 UTC (permalink / raw)
  To: shehjart; +Cc: fujita.tomonori, akpm, tony.luck, linux-kernel, linux-parisc

On Thu, 06 Nov 2008 14:06:09 +1100
Shehjar Tikoo <shehjart@cse.unsw.edu.au> wrote:

> FUJITA Tomonori wrote:
> > Sorry for the delay.
> > 
> > CC'ed linux-parisc since the same problem could happen to parisc.
> > 
> > On Tue, 04 Nov 2008 10:23:58 +1100
> > Shehjar Tikoo <shehjart@cse.unsw.edu.au> wrote:
> > 
> >> I've been observing kernel panics for the past week on
> >> kernel versions 2.6.26, 2.6.27 but not on 2.6.24 and 2.6.25.
> >>
> >> The panic message says:
> >>
> >> arch/ia64/hp/common/sba_iommu.c: I/O MMU is out of mapping resources
> >>
> >> Using git-bisect, I've zeroed in on the commit that introduced this.
> >> Please see the attached file for the commit.
> >>
> >> The workload consists of 2 tests:
> >> 1. Single fio process writing a 1 TB file.
> >> 2. 15 fio processes writing 15GB files each.
> >>
> >> The panic happens on both workloads. There is no stack trace after
> >> the above message.
> >>
> >> Other info:
> >> System is HP RX6600(16Gb RAM, 16 processors w/ dual cores and HT)
> >> 20 SATA disks under software RAID0 with 6 TB capacity.
> >> Silicon Image 3124 controller.
> >> File system is XFS.
> >>
> >> I'd much appreciate some help in fixing this because this panic has
> >> basically stalled my own work. I'd be willing to run more tests on my
> >> setup to test any patches that possibly fix this issue.
> > 
> > This patch modified the sba IOMMU driver to support LLDs' segment
> > boundary limits properly.
> > 
> > ATA hardware has poor segment boundary limit, 64KB. In addition, sba
> > IOMMU driver uses size-aligned allocation algorithm. It means that
> > it's difficult for the IOMMU driver to find an appropriate I/O address
> > space. I think that you hit the allocation failure due to this problem
> > (of course, it's possible that my change breaks the IOMMU driver but I
> > can't find a problem so far).
> > 
> > To make matters worse, sba IOMMU driver panic when the allocation
> > fails. IIRC, only IA64 and parisc IOMMU drivers panic by default in
> > the case of the allocation failure. I think that we need to change
> > them to handle the failure properly.
> > 
> > Can you try this? I've not fixed map_single failure yet but I think
> > that you hit the failure allocation in map_sg path.
> > 
> 
> On 2.6.27, this patch seems to prevent the panic from happening for
> both the tests I had described earlier.

Thanks!

> Do you need more info to 
> validate this? I will be running more tests with this patch over
> the next few days, so we'll find out anyway.

Can you check that no data corruption happens during the tests?


Tony, is it fine with you to change the sba IOMMU driver to return an
error instead of panicking in the case of allocation failure?


* RE: Panic in multiple kernels: IA64 SBA IOMMU: Culprit commit on Mar 28, 2008
  2008-11-07  3:49     ` FUJITA Tomonori
@ 2008-11-07 16:58       ` Luck, Tony
  2008-11-11  6:06         ` FUJITA Tomonori
  0 siblings, 1 reply; 8+ messages in thread
From: Luck, Tony @ 2008-11-07 16:58 UTC (permalink / raw)
  To: FUJITA Tomonori, shehjart; +Cc: akpm, linux-kernel, linux-parisc

> Can you check if data corruption doesn't happen during the tests?

Very important!!!

> Tony, changing the sba IOMMU driver to return an error instead of
> panic in the case of allocation failure is fine with you?

This is fine ... but we do need to audit the callers to make
sure that they check for and handle this new error.

-Tony


* RE: Panic in multiple kernels: IA64 SBA IOMMU: Culprit commit on Mar 28, 2008
  2008-11-07 16:58       ` Luck, Tony
@ 2008-11-11  6:06         ` FUJITA Tomonori
  0 siblings, 0 replies; 8+ messages in thread
From: FUJITA Tomonori @ 2008-11-11  6:06 UTC (permalink / raw)
  To: tony.luck; +Cc: fujita.tomonori, shehjart, akpm, linux-kernel, linux-parisc

On Fri, 7 Nov 2008 08:58:28 -0800
"Luck, Tony" <tony.luck@intel.com> wrote:

> > Tony, changing the sba IOMMU driver to return an error instead of
> > panic in the case of allocation failure is fine with you?
> 
> This is fine ... but we do need to audit the callers to make
> sure that they check for and handle this new error.

Well, this issue has been discussed several times in the past...

Most SCSI drivers are fine; they either handle IOMMU mapping failure
properly or panic. But there are some network drivers that don't even
check for the failure. Fixing these network drivers has been on my
todo list... But as I said before, we currently ignore this problem;
only swiotlb and SBA panic in the case of IOMMU mapping failure.

