LKML Archive on lore.kernel.org
* Re: block layer / FS question: x86_32bit with LBD, 20 TB RAID volume => funny issues
From: Frantisek Rysanek @ 2008-03-10  9:13 UTC
  To: linux-kernel

> On 7 Mar 2008 at 10:30, Andi Kleen wrote:
> [...snip...]
> > BTW your problems mostly sound like driver issues. Some drivers 
> > (and some controller firmwares) have problems with large block numbers
> > 
> thanks for that hint, I'll investigate that too.
> 
> The HBA is a Qlogic QLA-2460. Based on some past experience with 
> other brands of FC HBA's, I tend to swear on Qlogic as the reference 
> implementation of FC hardware.
> 
And indeed it was the firmware:
ftp://ftp.qlogic.com/outgoing/linux/firmware/

Until now, I've been using v4.00.27, the last numbered version
in that FTP directory.

Upon closer inspection, I picked the file called
ql2400_fw.bin_mid
with a timestamp of 12 February 2008.
It turns out to be v4.03.01, and it SOLVES THE PROBLEM :-)

It seems to work with qla2xxx v8.01.07-k7 (2.6.22.6) and
v8.02.00-k5 (2.6.24.2).

My testbed server has passed some 5 loops of dd over the weekend.

Thanks for your help :-)

I'm installing a 64bit Fedora to give XFS another try...

Frank Rysanek



* Re: block layer / FS question: x86_32bit with LBD, 20 TB RAID volume => funny issues
From: Frantisek Rysanek @ 2008-03-13 10:14 UTC
  To: linux-kernel

On 10 Mar 2008 at 10:13, Frantisek.Rysanek@post.cz wrote:
>
> My testbed server has passed some 5 loops of dd over the weekend.
> 
> Thanks for your help :-)
> 
> I'm installing a 64bit Fedora to give XFS another try...
> 
So I installed 64-bit Fedora 8, compiled a 64-bit 2.6.24.2,
and XFS mounts without a word of objection :-)
even with my 32-bit user space on that old CD :-)

Interestingly, XFS even survives looped+parallel
Bonnie++ - except that when I let it run overnight,
by morning the box was stuck with a
Machine Check Exception.
This is a dual Xeon (Irwindale) at 3 GHz; so far it's always
been rock-solid under 32-bit operating systems.
It's difficult for me to say whether the CPUs indeed have a problem
or whether this is some sort of compatibility bug...
Anyway, it's unlikely to be an XFS-related issue.
I've checked the thermal paste on my heatsinks and I'll try
again tonight with "nomce"...

Frank Rysanek



* Re: block layer / FS question: x86_32bit with LBD, 20 TB RAID volume => funny issues
From: Frantisek Rysanek @ 2008-03-10  4:32 UTC
  To: linux-kernel

On 9 Mar 2008 at 23:05, David Chinner wrote:
>
> Sure. the largest address space that can be used on a 32bit platform
> with 4k pages is 16TB (2^32 * 2^12 = 2^44 = 16TB). For XFS, that means
> metadata can't be placed higher in the filesystem than 16TB, and seeing
> as we only have a single address space for metadata, the filesystem is
> limited to 16TB. It could be fixed with software changes, but really
> there's no excuse for using x86 given how cheap x86_64 is now..... 
>
[...] 
> 
> Yes, switching to 64 bit machines will fix this problem as the
> address space will now hold 2^64*2^12 bytes.....
> 
Wow, thanks for such a precise answer from such an authoritative
source :-)

Frank Rysanek



* Re: block layer / FS question: x86_32bit with LBD, 20 TB RAID volume => funny issues
From: David Chinner @ 2008-03-09 22:05 UTC
  To: Frantisek Rysanek; +Cc: linux-kernel

On Thu, Mar 06, 2008 at 10:25:59PM +0100, Frantisek Rysanek wrote:
> A few days ago, I've had my first opportunity to put my hands on a 
> 24bay RAID unit - configured for RAID 60, that's 20 TB of space in a 
> single chunk. I know that RAID units capable of this sort of capacity 
> have been on the market for some time now, so I was somewhat 
> surprised to discover that there are pending issues against Linux...
> 
> The block device is detected/reported just fine.
> I didn't even try Ext3, I know it's not appropriate for this sort of 
> capacity. I've tried Reiser3, and already mkfs.reiserfs (user-space 
> util) refused to create such a big FS. Then I tried XFS. The user-
> space mkfs.xfs had no objections - so far so good. But when I tried 
> to mount the volume thus created, the kernel-space XFS driver 
> (including the one in 2.6.24.2) refused to mount the FS, complaining 
> about the FS being too big to be mounted on this platform.

Sure. The largest address space that can be used on a 32bit platform
with 4k pages is 16TB (2^32 * 2^12 = 2^44 = 16TB). For XFS, that
means metadata can't be placed higher in the filesystem than 16TB,
and seeing as we only have a single address space for metadata, the
filesystem is limited to 16TB. It could be fixed with software
changes, but really there's no excuse for using x86 given how
cheap x86_64 is now.....
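
To make the arithmetic concrete, here is a minimal user-space sketch
of this limit; the 4 KiB page size and the 32-bit page index are just
the figures from the paragraph above, not values read out of kernel
headers:

  /*
   * The page cache on x86_32 indexes each file or block device with
   * a 32-bit page index; with 4 KiB pages that caps a single address
   * space at 2^32 * 2^12 = 2^44 bytes = 16 TiB.
   */
  #include <stdio.h>
  #include <stdint.h>

  int main(void)
  {
      uint64_t max_pages = 1ULL << 32;   /* 32-bit page index */
      uint64_t page_size = 1ULL << 12;   /* 4 KiB pages */
      uint64_t limit = max_pages * page_size;

      printf("largest address space: %llu bytes = %llu TiB\n",
             (unsigned long long)limit,
             (unsigned long long)(limit >> 40));   /* prints 16 */
      return 0;
  }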

> So far I've been using kernels compiled for 32bit mode x86.
> Obviously I have LBD support enabled, and it's always worked 
> flawlessly. Would it be any help if I switched to 64bit mode?
> My machines have been capable of that for a few years now, but so far 
> I had no reason to switch, as the memory capacities installed hardly 
> ever reached 4 GB...

Yes, switching to 64 bit machines will fix this problem as the
address space will now hold 2^64*2^12 bytes.....

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


* Re: block layer / FS question: x86_32bit with LBD, 20 TB RAID volume => funny issues
From: Lee Revell @ 2008-03-09  4:26 UTC
  To: Frantisek Rysanek; +Cc: linux-kernel

On Fri, Mar 7, 2008 at 1:10 AM, Frantisek Rysanek
<Frantisek.Rysanek@post.cz> wrote:
> On 7 Mar 2008 at 4:05, Lee Revell wrote:
>  > > Would it be any help if I switched to 64bit mode?
>  >
>  > Yes, it would be worth a try.
>  >
>  :-/ Okay, time to install 64bit Fedora 8 or something :-)
>
>  Anyway, thanks very much for your response :-)

You could try a Live CD - easier than installing a new distro...

Lee


* Re: block layer / FS question: x86_32bit with LBD, 20 TB RAID volume => funny issues
From: Frantisek Rysanek @ 2008-03-07 10:48 UTC
  To: linux-kernel

On 7 Mar 2008 at 10:30, Andi Kleen wrote:
[...snip...]
> BTW your problems mostly sound like driver issues. Some drivers 
> (and some controller firmwares) have problems with large block numbers
> 
Thanks for that hint - I'll investigate that too.

The HBA is a Qlogic QLA-2460. Based on some past experience with
other brands of FC HBAs, I tend to swear by Qlogic as the reference
implementation of FC hardware.

The driver I've tried so far is the vanilla version in 2.6.22.6 and
2.6.24.2. The firmware that I load at runtime is something I
downloaded from the Qlogic web site maybe four months ago... time to
update my firmware :-)

Frank Rysanek


* Re: block layer / FS question: x86_32bit with LBD, 20 TB RAID volume => funny issues
From: Andi Kleen @ 2008-03-07  9:30 UTC
  To: Frantisek Rysanek; +Cc: linux-kernel

"Frantisek Rysanek" <Frantisek.Rysanek@post.cz> writes:

> On 7 Mar 2008 at 4:05, Lee Revell wrote:
> > >  I didn't even try Ext3, I know it's not appropriate for this sort of
> > >  capacity.
> > 
> > Where did you get that idea?
> >
> Hmm... Google can find sources on the 'net claiming that Ext3 has a 
> maximum of 2 or 4 TB. Nice to know that I'm wrong, 

You're not wrong (for 4K-block ext2/3). Only ext4 lifted that limit,
but it is still experimental.

BTW your problems mostly sound like driver issues. Some drivers
(and some controller firmwares) have problems with large block
numbers.

-Andi


* Re: block layer / FS question: x86_32bit with LBD, 20 TB RAID volume => funny issues
From: Frantisek Rysanek @ 2008-03-07  6:10 UTC
  To: linux-kernel

On 7 Mar 2008 at 4:05, Lee Revell wrote:
> >  I didn't even try Ext3, I know it's not appropriate for this sort of
> >  capacity.
> 
> Where did you get that idea?
>
Hmm... Google can find sources on the 'net claiming that Ext3 has a
maximum of 2 or 4 TB. Nice to know that I'm wrong; I'll test this
right away :-)
 
> Are you sure the hardware is not faulty?
>
I'm pretty sure. 
I know what it looks like when one of those 1TB drives has a problem 
it would prefer not to talk about in SMART (hint: only one disk 
activity LED out of 24 keeps blinking). Nowadays I have to handle 
several such drives in every new RAID unit delivered. 
When such a drive times out past some margin, say half a minute, the
Linux kernel reports a SCSI CMD timeout - or rather, the RAID
controller runs out of patience sooner than the kernel does: the
array gets degraded and keeps going in degraded mode. So if the RAID
controller itself goes out for lunch, Linux definitely complains.
I've also seen a number of sly SCSI parity errors / general bus 
impedance problems, all of which yielded a proper error within half a 
minute or so. I am well equipped to debug such problems. Besides, 
this is FC. 
Once upon a time I saw some ugly low-level incompatibility in FC
too - that resulted in some really nice messages from the Linux
kernel.
I also know a brand of controllers that claims support for a TCQ
depth of 255 but actually hangs with anything over 192 or so. This
also yields a proper error in Linux, and the RAID controller goes
toes up.

None of this happens in my case... all is happy and calm.
Hmm... maybe I should try some modern FreeBSD for comparison :-)))

Apologies for not mentioning specific hardware brands - I don't want 
to get in trouble...

> > Would it be any help if I switched to 64bit mode?
> 
> Yes, it would be worth a try.
> 
:-/ Okay, time to install 64bit Fedora 8 or something :-)

Anyway, thanks very much for your response :-)

Frank Rysanek



* block layer / FS question: x86_32bit with LBD, 20 TB RAID volume => funny issues
From: Frantisek Rysanek @ 2008-03-06 21:25 UTC
  To: linux-kernel

Dear everyone,

I've got another silly question, rather vaguely formulated...

I have a homebrew Fedora 5-based live CD with some basic system
utilities and FS support that I'm using to test various sorts of
hardware and subsystems. Call it hardware debugging, in a PC hardware
assembly shop...
Today I have "external storage" under the magnifying glass.

The biggest block devices I've met so far have been some 14 TB RAID
volumes (16x 1 TB disks in RAID 6). These are connected to the host
PC via a SCSI/SAS/FC HBA and essentially appear as a single huge SCSI
disk. Such RAID units work pretty well with Linux, preferably using
LBA64/CDB16. Nonstandard sector sizes are less appropriate, but they
also seem to work with some filesystems.

A few days ago I had my first opportunity to put my hands on a
24-bay RAID unit configured for RAID 60 - that's 20 TB of space in a
single chunk. I know that RAID units capable of this sort of capacity
have been on the market for some time now, so I was somewhat
surprised to discover that there are pending issues with Linux...

The block device is detected and reported just fine.
I didn't even try Ext3; I know it's not appropriate for this sort of
capacity. I tried Reiser3, and already mkfs.reiserfs (the user-space
util) refused to create such a big FS. Then I tried XFS. The user-
space mkfs.xfs had no objections - so far so good. But when I tried
to mount the volume thus created, the kernel-space XFS driver
(including the one in 2.6.24.2) refused to mount the FS, complaining
that the FS was too big to be mounted on this platform.

Okay, those are FS quirks, some of them even implied by the spec 
(=RTFM) or at least located in user space. Hang on a second,
there's more.

If I try
  dd if=/dev/zero of=/dev/sda bs=4096
it never runs to the very end. It always seems to hang somewhere
halfway through, sometimes at 7 TB, sometimes at 4 TB... it's weird.
The dd process stays alive, but data transfer stops (as seen in
iostat and on the RAID unit's LEDs), and the write() or whatever
syscall dd is in just stays blocked, sleeping forever. The RAID seems
perfectly happy; there are no timeout messages from the SCSI layer.
If I terminate the dd process (Ctrl+C) and start it again, everything
looks perfectly all right again.
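
For reference, a minimal C sketch of the same kind of sequential
write loop, with the offset printed so the stall point can be pinned
down; the /dev/sda name and 4096-byte block size simply mirror the dd
command above, and this is a sketch, not the exact tool used here:

  /* Zero-fill a block device sequentially, like the dd run above,
   * reporting progress so a stall or error can be tied to an offset.
   * Assumes O_LARGEFILE for 64-bit offsets on a 32-bit build. */
  #define _GNU_SOURCE
  #include <stdio.h>
  #include <string.h>
  #include <errno.h>
  #include <fcntl.h>
  #include <unistd.h>

  int main(void)
  {
      static char buf[4096];            /* stays zero-filled */
      unsigned long long off = 0;
      int fd = open("/dev/sda", O_WRONLY | O_LARGEFILE);

      if (fd < 0) { perror("open"); return 1; }
      for (;;) {
          ssize_t n = write(fd, buf, sizeof buf);
          if (n <= 0) {                 /* end of device, or an error */
              fprintf(stderr, "write stopped at offset %llu: %s\n",
                      off, n < 0 ? strerror(errno) : "end of device");
              break;
          }
          off += (unsigned long long)n;
          if ((off & ((1ULL << 30) - 1)) == 0)   /* every 1 GiB */
              fprintf(stderr, "%llu GiB written\n", off >> 30);
      }
      close(fd);
      return 0;
  }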

I also have a simple test app that runs in a loop, reading a whole 
raw block device (/dev/sda) from start to end. It open()s the device 
node with O_LARGEFILE and just read()s 64kB chunks until EOF.
This one always terminates halfway through, and the cause seems to be 
that the read() call returns EINVAL.
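
A minimal sketch of such a reader, assuming nothing beyond what is
described above (an O_LARGEFILE open, 64 kB read()s until EOF, and
reporting the errno when read() fails):

  /* Read a raw block device start to end in 64 kB chunks and report
   * the offset and errno if read() fails (e.g. the EINVAL described
   * above) instead of reaching a clean EOF. */
  #define _GNU_SOURCE
  #include <stdio.h>
  #include <string.h>
  #include <errno.h>
  #include <fcntl.h>
  #include <unistd.h>

  int main(void)
  {
      static char buf[64 * 1024];
      unsigned long long off = 0;
      int fd = open("/dev/sda", O_RDONLY | O_LARGEFILE);

      if (fd < 0) { perror("open"); return 1; }
      for (;;) {
          ssize_t n = read(fd, buf, sizeof buf);
          if (n == 0)                   /* clean EOF */
              break;
          if (n < 0) {                  /* e.g. EINVAL halfway through */
              fprintf(stderr, "read failed at offset %llu: %s\n",
                      off, strerror(errno));
              close(fd);
              return 1;
          }
          off += (unsigned long long)n;
      }
      printf("read %llu bytes OK\n", off);
      close(fd);
      return 0;
  }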


So far I've been using kernels compiled for 32-bit x86.
Obviously I have LBD support enabled, and it has always worked
flawlessly. Would it be any help if I switched to 64-bit mode?
My machines have been capable of that for a few years now, but so far
I've had no reason to switch, as the memory capacities installed
hardly ever reached 4 GB...

Any ideas would be welcome :-)

Frank Rysanek


