LKML Archive on lore.kernel.org
* Re: change strip_cache_size freeze the whole raid
       [not found] <001801c73e14$c3177170$28df0f3d@kylecea1512a3f>
@ 2007-01-22 12:18 ` Justin Piszcz
  2007-01-22 13:09   ` kyle
                     ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Justin Piszcz @ 2007-01-22 12:18 UTC (permalink / raw)
  To: kyle; +Cc: linux-raid, linux-kernel



On Mon, 22 Jan 2007, kyle wrote:

> Hi,
> 
> Yesterday I tried increasing stripe_cache_size to see whether I could get
> better performance. I raised the value from 2048 to something like 16384.
> After that, the raid5 array froze. Any process reading from or writing to
> it got stuck in D state. I tried changing it back to 2048, reading
> stripe_cache_active, cat /proc/mdstat, mdadm stop, etc. None of them
> returned. I could not even shut down the machine. In the end I had to
> press the reset button to regain control.
> 
> Kernel is 2.6.17.8 x86-64, running on an AMD Athlon 3000+, 2GB RAM, 8 x
> Seagate 8200.10 250GB HDD, nvidia chipset.
> 
> cat /proc/mdstat (after reboot):
> Personalities : [raid1] [raid5] [raid4]
> md1 : active raid1 hdc2[1] hda2[0]
>      6144768 blocks [2/2] [UU]
> 
> md2 : active raid5 sdf1[7] sde1[6] sdd1[5] sdc1[4] sdb1[3] sda1[2] hdc4[1] hda4[0]
>      1664893440 blocks level 5, 512k chunk, algorithm 2 [8/8] [UUUUUUUU]
> 
> md0 : active raid1 hdc1[1] hda1[0]
>      104320 blocks [2/2] [UU]
> 
> Kyle
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

Yes, I noticed this bug too. If you change it too many times, or change it
at the 'wrong' time, it hangs when you echo <number> >
/sys/block/mdX/md/stripe_cache_size.

Basically, don't run it more than once and don't run it at the 'wrong'
time, and it works.  I'm not sure where the bug lies, but I've seen it on
3 different machines!

Justin.
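
For readers who want to pull the array layout out of /proc/mdstat output
like the one quoted above, here is a minimal Python sketch. The name
parse_mdstat and the returned dictionary layout are illustrative
assumptions, not part of any md tooling, and only the simple line format
shown in this thread is handled.

```python
import re

# Matches lines such as "md2 : active raid5 sdf1[7] sde1[6] ..."
ARRAY_LINE = re.compile(r"^(md\d+)\s*:\s*(\w+)\s+(raid\d+)\s+(.*)$")
# Matches one member entry such as "sdf1[7]"
MEMBER = re.compile(r"(\w+)\[(\d+)\]")


def parse_mdstat(text):
    """Return {array_name: {"state", "level", "members"}} for each md array.

    Hypothetical helper: only the simple format shown in this thread is handled.
    """
    arrays = {}
    for line in text.splitlines():
        match = ARRAY_LINE.match(line.strip())
        if not match:
            continue
        name, state, level, rest = match.groups()
        # Sort members by their slot number so slot 0 comes first.
        members = sorted(MEMBER.findall(rest), key=lambda m: int(m[1]))
        arrays[name] = {
            "state": state,
            "level": level,
            "members": [dev for dev, _slot in members],
        }
    return arrays


SAMPLE = """\
Personalities : [raid1] [raid5] [raid4]
md1 : active raid1 hdc2[1] hda2[0]
      6144768 blocks [2/2] [UU]

md2 : active raid5 sdf1[7] sde1[6] sdd1[5] sdc1[4] sdb1[3] sda1[2] hdc4[1] hda4[0]
      1664893440 blocks level 5, 512k chunk, algorithm 2 [8/8] [UUUUUUUU]

md0 : active raid1 hdc1[1] hda1[0]
      104320 blocks [2/2] [UU]
"""

if __name__ == "__main__":
    for name, info in parse_mdstat(SAMPLE).items():
        print(name, info["level"], len(info["members"]), "members")
```

The member count matters later in the thread, because the stripe cache
holds one page per member device for every cache entry.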



* Re: change strip_cache_size freeze the whole raid
  2007-01-22 12:18 ` change strip_cache_size freeze the whole raid Justin Piszcz
@ 2007-01-22 13:09   ` kyle
  2007-01-22 14:56     ` Justin Piszcz
  2007-01-22 14:57   ` Steve Cousins
  2007-01-22 16:10   ` Liang Yang
  2 siblings, 1 reply; 10+ messages in thread
From: kyle @ 2007-01-22 13:09 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: linux-raid, linux-kernel

>
> On Mon, 22 Jan 2007, kyle wrote:
>
>> Hi,
>>
>> Yesterday I tried increasing stripe_cache_size to see whether I could get
>> better performance. I raised the value from 2048 to something like 16384.
>> After that, the raid5 array froze. Any process reading from or writing to
>> it got stuck in D state. I tried changing it back to 2048, reading
>> stripe_cache_active, cat /proc/mdstat, mdadm stop, etc. None of them
>> returned. I could not even shut down the machine. In the end I had to
>> press the reset button to regain control.

> Yes, I noticed this bug too. If you change it too many times, or change it
> at the 'wrong' time, it hangs when you echo <number> >
> /sys/block/mdX/md/stripe_cache_size.
>
> Basically, don't run it more than once and don't run it at the 'wrong'
> time, and it works.  I'm not sure where the bug lies, but I've seen it on
> 3 different machines!
>
> Justin.
>
>

I changed it just once, and it froze. It's hard to find the 'right time'.

Actually, I had tried it several times before. As I remember, it once
froze for around 1 or 2 minutes and then returned to normal operation.
This is the first time it froze completely; I waited around 10 minutes and
it still didn't wake up.

Kyle
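
To tell a temporary stall from a full hang, it can help to list which tasks
are sitting in uninterruptible sleep (the D state mentioned in the first
message). The following is a minimal sketch that reads /proc/<pid>/stat;
the function names are hypothetical, and on a badly wedged machine the
script itself may not get to run.

```python
import os


def parse_state(stat_line):
    """Extract (comm, state) from one /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or ')', so split on the LAST closing parenthesis.
    """
    head, _, tail = stat_line.rpartition(")")
    comm = head.split("(", 1)[1]
    state = tail.split()[0]
    return comm, state


def find_d_state(proc_root="/proc"):
    """Return [(pid, comm)] for tasks in uninterruptible sleep ('D')."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        stat_path = os.path.join(proc_root, entry, "stat")
        try:
            with open(stat_path, errors="replace") as handle:
                comm, state = parse_state(handle.read())
        except (OSError, IndexError):
            # The process exited while we were scanning, or the line is malformed.
            continue
        if state == "D":
            blocked.append((int(entry), comm))
    return blocked


if __name__ == "__main__":
    for pid, comm in find_d_state():
        print(pid, comm)
```

This only observes the symptom; it does nothing to release the stuck tasks.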



* Re: change strip_cache_size freeze the whole raid
  2007-01-22 13:09   ` kyle
@ 2007-01-22 14:56     ` Justin Piszcz
  2007-01-22 15:18       ` kyle
  0 siblings, 1 reply; 10+ messages in thread
From: Justin Piszcz @ 2007-01-22 14:56 UTC (permalink / raw)
  To: kyle; +Cc: linux-raid, linux-kernel



On Mon, 22 Jan 2007, kyle wrote:

> >
> > On Mon, 22 Jan 2007, kyle wrote:
> >
> > > Hi,
> > >
> > > Yesterday I tried increasing stripe_cache_size to see whether I could
> > > get better performance. I raised the value from 2048 to something like
> > > 16384. After that, the raid5 array froze. Any process reading from or
> > > writing to it got stuck in D state. I tried changing it back to 2048,
> > > reading stripe_cache_active, cat /proc/mdstat, mdadm stop, etc. None of
> > > them returned. I could not even shut down the machine. In the end I had
> > > to press the reset button to regain control.
> 
> > Yes, I noticed this bug too. If you change it too many times, or change
> > it at the 'wrong' time, it hangs when you echo <number> >
> > /sys/block/mdX/md/stripe_cache_size.
> >
> > Basically, don't run it more than once and don't run it at the 'wrong'
> > time, and it works.  I'm not sure where the bug lies, but I've seen it on
> > 3 different machines!
> >
> > Justin.
> >
> >
> 
> I changed it just once, and it froze. It's hard to find the 'right time'.
> 
> Actually, I had tried it several times before. As I remember, it once
> froze for around 1 or 2 minutes and then returned to normal operation.
> This is the first time it froze completely; I waited around 10 minutes and
> it still didn't wake up.
> 
> Kyle
> 

What kernel version are you using?  It normally works the first time for 
me, I put it in my startup scripts, as one of the last items.  However, if 
I change it a few times, it will hang and there is no way to reboot except 
via SYSRQ or pressing the reboot button on the machine.

This seems to be true of 2.6.19.1 and 2.6.19.2, I did not try under 
2.6.20-rc5 because I am tired of hanging my machine :)

Justin.


* Re: change strip_cache_size freeze the whole raid
  2007-01-22 12:18 ` change strip_cache_size freeze the whole raid Justin Piszcz
  2007-01-22 13:09   ` kyle
@ 2007-01-22 14:57   ` Steve Cousins
  2007-01-22 15:01     ` Justin Piszcz
                       ` (2 more replies)
  2007-01-22 16:10   ` Liang Yang
  2 siblings, 3 replies; 10+ messages in thread
From: Steve Cousins @ 2007-01-22 14:57 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: kyle, linux-raid, linux-kernel



Justin Piszcz wrote:
> Yes, I noticed this bug too. If you change it too many times, or change it
> at the 'wrong' time, it hangs when you echo <number> >
> /sys/block/mdX/md/stripe_cache_size.
> 
> Basically, don't run it more than once and don't run it at the 'wrong'
> time, and it works.  I'm not sure where the bug lies, but I've seen it on
> 3 different machines!

Can you tell us when the "right" time is or maybe what the "wrong" time 
is?  Also, is this kernel specific?  Does it (increasing 
stripe_cache_size) work with RAID6 too?

Thanks,

Steve
-- 
______________________________________________________________________
  Steve Cousins, Ocean Modeling Group    Email: cousins@umit.maine.edu
  Marine Sciences, 452 Aubert Hall       http://rocky.umeoce.maine.edu
  Univ. of Maine, Orono, ME 04469        Phone: (207) 581-4302




* Re: change strip_cache_size freeze the whole raid
  2007-01-22 14:57   ` Steve Cousins
@ 2007-01-22 15:01     ` Justin Piszcz
  2007-01-23 14:22       ` kyle
  2007-01-22 15:10     ` Justin Piszcz
  2007-01-22 15:13     ` kyle
  2 siblings, 1 reply; 10+ messages in thread
From: Justin Piszcz @ 2007-01-22 15:01 UTC (permalink / raw)
  To: Steve Cousins; +Cc: kyle, linux-raid, linux-kernel



On Mon, 22 Jan 2007, Steve Cousins wrote:

> 
> 
> Justin Piszcz wrote:
> > Yes, I noticed this bug too. If you change it too many times, or change
> > it at the 'wrong' time, it hangs when you echo <number> >
> > /sys/block/mdX/md/stripe_cache_size.
> > 
> > Basically, don't run it more than once and don't run it at the 'wrong'
> > time, and it works.  I'm not sure where the bug lies, but I've seen it on
> > 3 different machines!
> 
> Can you tell us when the "right" time is or maybe what the "wrong" time is?
> Also, is this kernel specific?  Does it (increasing stripe_cache_size) work
> with RAID6 too?
> 
> Thanks,
> 
> Steve
> -- 
> ______________________________________________________________________
>  Steve Cousins, Ocean Modeling Group    Email: cousins@umit.maine.edu
>  Marine Sciences, 452 Aubert Hall       http://rocky.umeoce.maine.edu
>  Univ. of Maine, Orono, ME 04469        Phone: (207) 581-4302
> 
> 
> 

The wrong time (for me, anyway) is at or around the time the kernel is
auto-detecting arrays and udev starts. When I put it there, I get oopses
all over the screen and it gets really nasty.  Basically, the best time
appears to be right after the system has started up but before I/O has
started hitting the array.  Tricky, I know.

Justin.



* Re: change strip_cache_size freeze the whole raid
  2007-01-22 14:57   ` Steve Cousins
  2007-01-22 15:01     ` Justin Piszcz
@ 2007-01-22 15:10     ` Justin Piszcz
  2007-01-22 15:13     ` kyle
  2 siblings, 0 replies; 10+ messages in thread
From: Justin Piszcz @ 2007-01-22 15:10 UTC (permalink / raw)
  To: Steve Cousins; +Cc: kyle, linux-raid, linux-kernel



On Mon, 22 Jan 2007, Steve Cousins wrote:

> 
> 
> Justin Piszcz wrote:
> > Yes, I noticed this bug too. If you change it too many times, or change
> > it at the 'wrong' time, it hangs when you echo <number> >
> > /sys/block/mdX/md/stripe_cache_size.
> > 
> > Basically, don't run it more than once and don't run it at the 'wrong'
> > time, and it works.  I'm not sure where the bug lies, but I've seen it on
> > 3 different machines!
> 
> Can you tell us when the "right" time is or maybe what the "wrong" time is?
> Also, is this kernel specific?  Does it (increasing stripe_cache_size) work
> with RAID6 too?
> 
> Thanks,
> 
> Steve
> -- 
> ______________________________________________________________________
>  Steve Cousins, Ocean Modeling Group    Email: cousins@umit.maine.edu
>  Marine Sciences, 452 Aubert Hall       http://rocky.umeoce.maine.edu
>  Univ. of Maine, Orono, ME 04469        Phone: (207) 581-4302
> 
> 

Also, I have not tested stripe_cache_size under RAID6, so I am unsure.

Justin.


* Re: change strip_cache_size freeze the whole raid
  2007-01-22 14:57   ` Steve Cousins
  2007-01-22 15:01     ` Justin Piszcz
  2007-01-22 15:10     ` Justin Piszcz
@ 2007-01-22 15:13     ` kyle
  2 siblings, 0 replies; 10+ messages in thread
From: kyle @ 2007-01-22 15:13 UTC (permalink / raw)
  To: Steve Cousins, Justin Piszcz; +Cc: linux-raid, linux-kernel


> Justin Piszcz wrote:
>> Yes, I noticed this bug too. If you change it too many times, or change it
>> at the 'wrong' time, it hangs when you echo <number> >
>> /sys/block/mdX/md/stripe_cache_size.
>>
>> Basically, don't run it more than once and don't run it at the 'wrong'
>> time, and it works.  I'm not sure where the bug lies, but I've seen it on
>> 3 different machines!
>
> Can you tell us when the "right" time is or maybe what the "wrong" time 
> is?  Also, is this kernel specific?  Does it (increasing 
> stripe_cache_size) work with RAID6 too?
>
> Thanks,
>
> Steve

I think if your /sys/block/md_your_raid6/md/ directory has a file
"stripe_cache_size", then it should work with raid6 too.

Kyle
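
The check described above can be sketched in Python as follows. The helper
names and the sys_root parameter are hypothetical, added only so the
lookup can be tried against a test directory. The sketch is deliberately
read-only, given the hangs reported in this thread when the value is
written.

```python
import os


def stripe_cache_path(md_name, sys_root="/sys"):
    """Build the sysfs path of stripe_cache_size for an md array such as 'md2'."""
    return os.path.join(sys_root, "block", md_name, "md", "stripe_cache_size")


def read_stripe_cache_size(md_name, sys_root="/sys"):
    """Return the current value as an int, or None if the array has no such file."""
    path = stripe_cache_path(md_name, sys_root)
    if not os.path.exists(path):
        return None
    with open(path) as handle:
        return int(handle.read().strip())


if __name__ == "__main__":
    # "md2" is the raid5 array from the first message; adjust for your system.
    print(read_stripe_cache_size("md2"))
```

A return value of None means the attribute is not exposed for that array,
for example on a raid1 device.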



* Re: change strip_cache_size freeze the whole raid
  2007-01-22 14:56     ` Justin Piszcz
@ 2007-01-22 15:18       ` kyle
  0 siblings, 0 replies; 10+ messages in thread
From: kyle @ 2007-01-22 15:18 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: linux-raid, linux-kernel

>>
>> > Yes, I noticed this bug too. If you change it too many times, or change
>> > it at the 'wrong' time, it hangs when you echo <number> >
>> > /sys/block/mdX/md/stripe_cache_size.
>> >
>> > Basically, don't run it more than once and don't run it at the 'wrong'
>> > time, and it works.  I'm not sure where the bug lies, but I've seen it
>> > on 3 different machines!
>> >
>> > Justin.
>> >
>> >
>>
>> I changed it just once, and it froze. It's hard to find the 'right time'.
>>
>> Actually, I had tried it several times before. As I remember, it once
>> froze for around 1 or 2 minutes and then returned to normal operation.
>> This is the first time it froze completely; I waited around 10 minutes and
>> it still didn't wake up.
>>
>> Kyle
>>
>
> What kernel version are you using?  It normally works the first time for
> me, I put it in my startup scripts, as one of the last items.  However, if
> I change it a few times, it will hang and there is no way to reboot except
> via SYSRQ or pressing the reboot button on the machine.
>
> This seems to be true of 2.6.19.1 and 2.6.19.2, I did not try under
> 2.6.20-rc5 because I am tired of hanging my machine :)
>
> Justin.
>

It was 2.6.17.8. Now it's 2.6.17.13, but I won't touch it now! The machine
is around 15 km away from me!



* Re: change strip_cache_size freeze the whole raid
  2007-01-22 12:18 ` change strip_cache_size freeze the whole raid Justin Piszcz
  2007-01-22 13:09   ` kyle
  2007-01-22 14:57   ` Steve Cousins
@ 2007-01-22 16:10   ` Liang Yang
  2 siblings, 0 replies; 10+ messages in thread
From: Liang Yang @ 2007-01-22 16:10 UTC (permalink / raw)
  To: Justin Piszcz, kyle; +Cc: linux-raid, linux-kernel

Do we need to consider the chunk size when we adjust the value of
stripe_cache_size for the MD-RAID5 array?

Liang

----- Original Message ----- 
From: "Justin Piszcz" <jpiszcz@lucidpixels.com>
To: "kyle" <kylewong@southa.com>
Cc: <linux-raid@vger.kernel.org>; <linux-kernel@vger.kernel.org>
Sent: Monday, January 22, 2007 5:18 AM
Subject: Re: change strip_cache_size freeze the whole raid


>
>
> On Mon, 22 Jan 2007, kyle wrote:
>
>> Hi,
>>
>> Yesterday I tried increasing stripe_cache_size to see whether I could get
>> better performance. I raised the value from 2048 to something like 16384.
>> After that, the raid5 array froze. Any process reading from or writing to
>> it got stuck in D state. I tried changing it back to 2048, reading
>> stripe_cache_active, cat /proc/mdstat, mdadm stop, etc. None of them
>> returned. I could not even shut down the machine. In the end I had to
>> press the reset button to regain control.
>>
>> Kernel is 2.6.17.8 x86-64, running on an AMD Athlon 3000+, 2GB RAM, 8 x
>> Seagate 8200.10 250GB HDD, nvidia chipset.
>>
>> cat /proc/mdstat (after reboot):
>> Personalities : [raid1] [raid5] [raid4]
>> md1 : active raid1 hdc2[1] hda2[0]
>>      6144768 blocks [2/2] [UU]
>>
>> md2 : active raid5 sdf1[7] sde1[6] sdd1[5] sdc1[4] sdb1[3] sda1[2] hdc4[1] hda4[0]
>>      1664893440 blocks level 5, 512k chunk, algorithm 2 [8/8] [UUUUUUUU]
>>
>> md0 : active raid1 hdc1[1] hda1[0]
>>      104320 blocks [2/2] [UU]
>>
>> Kyle
>>
>
> Yes, I noticed this bug too. If you change it too many times, or change it
> at the 'wrong' time, it hangs when you echo <number> >
> /sys/block/mdX/md/stripe_cache_size.
>
> Basically, don't run it more than once and don't run it at the 'wrong'
> time, and it works.  I'm not sure where the bug lies, but I've seen it on
> 3 different machines!
>
> Justin.
>



* Re: change strip_cache_size freeze the whole raid
  2007-01-22 15:01     ` Justin Piszcz
@ 2007-01-23 14:22       ` kyle
  0 siblings, 0 replies; 10+ messages in thread
From: kyle @ 2007-01-23 14:22 UTC (permalink / raw)
  To: Justin Piszcz, Steve Cousins; +Cc: linux-raid, linux-kernel

> I can try and do this later this week possibly.

> Justin.
>>
>> alt-sysrq-T or "echo t > /proc/sysrq-trigger" can be really helpful to
>> diagnose this sort of problem (providing the system isn't so badly
>> stuck that the kernel logs don't get stored).
>>
>> It is probably hitting a memory-allocation deadlock, though I cannot
>> see exactly where the deadlock would be.  If you are able to reproduce
>> it and can get the kernel logs after 'alt-sysrq-T' I would really
>> appreciate it.

Justin,

Maybe you can try freezing it once more and get the kernel logs before
trying Neil's patch ...... :D

Kyle
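
The memory-allocation theory quoted above is easier to weigh with numbers.
The md documentation gives the stripe cache footprint as page size times
number of member devices times stripe_cache_size. The sketch below applies
that to the 8-device array from this thread, assuming 4096-byte pages (the
base page size on x86-64, but an assumption about this particular
machine). The function name is hypothetical.

```python
def stripe_cache_bytes(stripe_cache_size, nr_disks, page_size=4096):
    """Approximate memory held by the stripe cache.

    Each cache entry holds one page per member device, so the total is
    page_size * nr_disks * stripe_cache_size.
    """
    return page_size * nr_disks * stripe_cache_size


if __name__ == "__main__":
    for entries in (2048, 16384):
        mib = stripe_cache_bytes(entries, nr_disks=8) / (1024 * 1024)
        print(f"{entries} entries on 8 disks: {mib:.0f} MiB")
```

Under these assumptions, 2048 entries come to 64 MiB and 16384 entries to
512 MiB, a quarter of the 2GB of RAM mentioned in the first message. Chunk
size does not appear in the formula.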



end of thread, other threads:[~2007-01-23 14:23 UTC | newest]

Thread overview: 10+ messages
     [not found] <001801c73e14$c3177170$28df0f3d@kylecea1512a3f>
2007-01-22 12:18 ` change strip_cache_size freeze the whole raid Justin Piszcz
2007-01-22 13:09   ` kyle
2007-01-22 14:56     ` Justin Piszcz
2007-01-22 15:18       ` kyle
2007-01-22 14:57   ` Steve Cousins
2007-01-22 15:01     ` Justin Piszcz
2007-01-23 14:22       ` kyle
2007-01-22 15:10     ` Justin Piszcz
2007-01-22 15:13     ` kyle
2007-01-22 16:10   ` Liang Yang
