From: noah <noah123@gmail.com>
To: linux-kernel@vger.kernel.org
Subject: Data corruption with raid5/dm-crypt/lvm/reiserfs on 2.6.19.2
Date: Thu, 18 Jan 2007 21:11:58 +0100
Message-ID: <d00698fb0701181211l480966e5o6f10db9300d4c8aa@mail.gmail.com>

Hi!

I'm experiencing data corruption in the following setup:

1. mdadm --create /dev/md0 -n3 -lraid5 /dev/hda1 /dev/hdc1 /dev/hde1
2. cryptsetup -c aes-cbc-essiv:sha256 luksFormat /dev/md0 mykey
3. cryptsetup -d mykey luksOpen /dev/md0 cryptvol
4. pvcreate /dev/mapper/cryptvol
5. vgcreate vg0 /dev/mapper/cryptvol
6. lvcreate -n root  -L10G vg0
7. mkreiserfs -q /dev/vg0/root
8. mkdir /.newroot; mount /dev/vg0/root /.newroot
9. mkdir /.realroot; mount -o bind / /.realroot
10. tar cf - -C /.realroot . | tar xvpf - -C /.newroot
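
The steps above can be collected into a small script.  What follows is
only a sketch: DRY_RUN, run and build_stack are illustrative names that
do not appear in the original procedure.  It assumes the cipher spec is
meant to be aes-cbc-essiv:sha256, that the physical volume is addressed
as /dev/mapper/cryptvol throughout, and that the first tar archives "."
under /.realroot.  By default it only prints the commands, since running
them destroys whatever is on the member partitions.

```shell
#!/bin/sh
# Sketch only. DRY_RUN, run and build_stack are illustrative names.
# With DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@" || exit 1
    fi
}

build_stack() {
    # RAID5 over three partitions, then LUKS, LVM and reiserfs on top
    run mdadm --create /dev/md0 -n3 -lraid5 /dev/hda1 /dev/hdc1 /dev/hde1
    run cryptsetup -c aes-cbc-essiv:sha256 luksFormat /dev/md0 mykey
    run cryptsetup -d mykey luksOpen /dev/md0 cryptvol
    run pvcreate /dev/mapper/cryptvol
    run vgcreate vg0 /dev/mapper/cryptvol
    run lvcreate -n root -L10G vg0
    run mkreiserfs -q /dev/vg0/root
    # Copy the running root filesystem onto the new volume
    run mkdir /.newroot
    run mount /dev/vg0/root /.newroot
    run mkdir /.realroot
    run mount -o bind / /.realroot
    run sh -c 'tar cf - -C /.realroot . | tar xvpf - -C /.newroot'
}
```

Calling build_stack with DRY_RUN left at 1 lists the commands for
review; setting DRY_RUN=0 would execute them as root.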

With Linux 2.6.18 (known to be broken, OK, but there's still something
wrong even in 2.6.19.2, so keep on reading) I started getting warnings
from ReiserFS indicating severe data corruption.  reiserfsck confirmed
this.  It usually happened while extracting the Linux source tree.

So after asking around I found out dm-crypt had a bug[1] that was fixed
in early December.  The fix went into 2.6.19 and was backported to
2.6.18.6[2].

Fine, so I upgraded to 2.6.18.6, rebuilt the array from scratch and
did the whole procedure again.
No messages from reiserfs in dmesg this time, but reiserfsck still
revealed severe data corruption.
I also found that compressed archives and ISO images for which I had
md5sums were corrupt.
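
A checksum comparison of this kind can be scripted so that corruption
shows up independently of reiserfsck.  The following is a minimal
sketch, assuming GNU coreutils and findutils; snapshot_md5 and
verify_md5 are hypothetical helper names and are not taken from the
original report.

```shell
#!/bin/sh
# Record an md5sum for every regular file under a directory.
snapshot_md5() {
    # $1: directory to scan, $2: output list (should live outside $1)
    ( cd "$1" && find . -xdev -type f -print0 | sort -z | xargs -0 -r md5sum ) > "$2"
}

# Re-check a list against a (possibly different) copy of the same tree.
verify_md5() {
    # $1: directory holding the copy, $2: list written by snapshot_md5
    _list=$(cd "$(dirname "$2")" && pwd)/$(basename "$2")
    # --quiet prints only the files that fail; exit status is non-zero on any mismatch
    ( cd "$1" && md5sum -c --quiet "$_list" )
}
```

For example, "snapshot_md5 /.realroot /tmp/root.md5" before the copy
and "verify_md5 /.newroot /tmp/root.md5" afterwards would list any file
whose contents changed on the way to the new volume.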

I then upgraded to 2.6.19.2 with the exact same result as with 2.6.18.6.
I even verified this on a fairly new computer with different hardware
(Intel CPU and chipset).

I figured it might be some kind of race condition, so on my second try
on 2.6.19.2, when recreating the array, I let md finish resyncing it
before copying over the files.
This time, reiserfsck didn't complain.
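
Waiting for the initial sync can be automated by polling /proc/mdstat.
A minimal sketch, assuming the usual mdstat progress lines ("resync =
..." or "recovery = ...", plus "resync=DELAYED" and similar); md_busy
and wait_for_md are hypothetical names and the 30 second default poll
interval is an arbitrary choice.

```shell
#!/bin/sh
# Return 0 if the given mdstat-format file shows a resync or recovery
# that is running or still queued.
md_busy() {
    # $1: path to an mdstat-format file (defaults to /proc/mdstat)
    grep -Eq '(resync|recovery) *=' "${1:-/proc/mdstat}"
}

# Block until no array listed in the file is resyncing or recovering.
wait_for_md() {
    # $1: mdstat path (optional), $2: poll interval in seconds (optional)
    while md_busy "${1:-/proc/mdstat}"; do
        sleep "${2:-30}"
    done
}
```

Running wait_for_md between creating the array and starting the copy
reproduces the "let md finish first" variant of the test.  Where the
installed mdadm supports it, "mdadm --wait /dev/md0" serves the same
purpose.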

Just for fun, I did the whole thing again: I rebuilt the array from
scratch, let md resync the third drive and then started to copy over
all the files again.  Thinking the cause of the problem was heavy disk
I/O, I tried to stress the other LVM volumes residing on md0 using tar
during the copy.  Everything seemed fine; no problems arose.

Did a few reboots and confirmed that reiserfsck didn't have any
complaints on any of the filesystems residing on the LVM volumes on
md0.

I started using the machine as normal, and half a day later I unmounted
the filesystems and ran reiserfsck just to make sure everything was
still OK.  Unfortunately, it wasn't.


The drives in the array are three brand new IDE drives, 2x250GB and one
200GB.
According to SMART there are no problems with them.  And they worked
fine in my previous RAID1 setup with dm-crypt and LVM, by the way.
The computer itself is an Athlon XP with less than 1GB of RAM on an M/B
with nForce2 chipset FWIW.  No memory errors were detected with
memtest86+ (I completed the full test).
I haven't tried using another filesystem as I've got quite a lot of
faith in reiserfs's stability.

Is anybody else experiencing these problems?
Unfortunately I'm only able to do limited testing due to busy days,
but I'd love to help if I can.


[1] Here's a thread on the recently fixed data corruption bug in dm-crypt
http://article.gmane.org/gmane.linux.kernel.device-mapper.dm-crypt/1974

[2] The backport of the dm-crypt fix for 2.6.18.6 is here
http://uwsg.iu.edu/hypermail/linux/kernel/0612.1/2299.html
