From: "Marc Marais" <marcm@liquid-nexus.net>
To: Neil Brown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: md: md6_raid5 crash 2.6.20
Date: Mon, 12 Feb 2007 08:03:57 +0800
Message-ID: <20070212000042.M73586@liquid-nexus.net>
In-Reply-To: <17871.37497.786198.834303@notabene.brown>

On Mon, 12 Feb 2007 09:02:33 +1100, Neil Brown wrote
> On Sunday February 11, marcm@liquid-nexus.net wrote:
> > Greetings,
> > 
> > I've been running md on my server for some time now and a few days ago one of
> > the (3) drives in the raid5 array started giving read errors. The result was
> > usually system hangs and this was with kernel 2.6.17.13. I upgraded to the
> > latest production 2.6.20 kernel and experienced the same behaviour.
> 
> System hangs suggest a problem with the drive controller.  However
> this "kernel BUG" is something newly introduced in 2.6.20 which 
> should be fixed in 2.6.20.1.  Patch is below.
> 
> If you still get hangs with this patch installed, then please report
> detail, and probably copy to linux-ide@vger.kernel.org.
> 
> NeilBrown
> 
> Fix various bugs with aligned reads in RAID5.
> 
> It is possible for raid5 to be sent a bio that is too big
> for an underlying device.  So if it is a READ that we
> pass straight down to a device, it will fail and confuse
> RAID5.
> 
> So in 'chunk_aligned_read' we check that the bio fits within the
> parameters for the target device and if it doesn't fit, fall back
> on reading through the stripe cache and making lots of one-page
> requests.
> 
> Note that this is the earliest time we can check against the device
> because earlier we don't have a lock on the device, so it could 
> change underneath us.
> 
> Also, the code for handling a retry through the cache when a read
> fails has not been tested and was badly broken.  This patch fixes 
> that code.
> 
> Signed-off-by: Neil Brown <neilb@suse.de>
> 

Thanks for the quick response, Neil. Unfortunately, the kernel doesn't build
with this patch due to a missing symbol:

WARNING: "blk_recount_segments" [drivers/md/raid456.ko] undefined!

Is that in another file that needs patching or within raid5.c?

Marc
