LKML Archive on lore.kernel.org
* Re: ext4: Fix data corruption with multi-block writepages support
@ 2011-02-04 22:40 Matt
2011-02-07 17:45 ` Ted Ts'o
0 siblings, 1 reply; 7+ messages in thread
From: Matt @ 2011-02-04 22:40 UTC
To: Ted Ts'o; +Cc: Linux Kernel, linux-ext4
>Thanks, added to the ext4 patch queue.
>I modified the commit description slightly to give credit to Jon
>Nelson, who reported the bug and really helped by devising a
>reproduceable test case. Many thanks, Jon!!
> - Ted
So that means that the file-corruption which existed until 2.6.37-rc6
and got triggered (for me) more easily via "dm crypt: scale to
multiple CPUs"
is fixed now ?
That should give ext4 a nice speedup for >=2.6.38 :)
Could you also please add the following tag?
Reported-by: Matthias Bayer <jackdachef <at> gmail <dot> com >
I mainly found it through testing with the mentioned dm-crypt scaling
patch and >=2.6.36-git*
Thanks & Regards !
Matt
* Re: ext4: Fix data corruption with multi-block writepages support
2011-02-04 22:40 ext4: Fix data corruption with multi-block writepages support Matt
@ 2011-02-07 17:45 ` Ted Ts'o
2011-02-07 18:29 ` Milan Broz
2011-02-07 18:56 ` Matt
0 siblings, 2 replies; 7+ messages in thread
From: Ted Ts'o @ 2011-02-07 17:45 UTC
To: Matt; +Cc: Linux Kernel, linux-ext4
On Fri, Feb 04, 2011 at 10:40:47PM +0000, Matt wrote:
>
> So that means that the file-corruption which existed until 2.6.37-rc6
> and got triggered (for me) more easily via "dm crypt: scale to
> multiple CPUs"
> is fixed now ?
Well, a patch exists for it that will be merged into 2.6.38.
> That should give ext4 a nice speedup for >=2.6.38 :)
I'm not going to make it be the default for 2.6.38, since it's fairly
late in the -rc features. People who want it can explicitly enable it
using the mount option mblk_io_submit, though. (And let me know your
success stories! :-) I will be enabling it as the default in
2.6.39-rc1.
> Reported-by: Matthias Bayer <jackdachef <at> gmail <dot> com >
Sure!
- Ted
* Re: ext4: Fix data corruption with multi-block writepages support
2011-02-07 17:45 ` Ted Ts'o
@ 2011-02-07 18:29 ` Milan Broz
2011-02-07 18:44 ` Matt
2011-02-07 20:44 ` Ted Ts'o
2011-02-07 18:56 ` Matt
1 sibling, 2 replies; 7+ messages in thread
From: Milan Broz @ 2011-02-07 18:29 UTC
To: Ted Ts'o, Matt, Linux Kernel, linux-ext4
On 02/07/2011 06:45 PM, Ted Ts'o wrote:
> On Fri, Feb 04, 2011 at 10:40:47PM +0000, Matt wrote:
>>
>> So that means that the file-corruption which existed until 2.6.37-rc6
>> and got triggered (for me) more easily via "dm crypt: scale to
>> multiple CPUs"
>> is fixed now ?
>
> Well, a patch exists for it that will be merged into 2.6.38.
>
>> That should give ext4 a nice speedup for >=2.6.38 :)
>
> I'm not going to make it be the default for 2.6.38, since it's fairly
> late in the -rc features. People who want it can explicitly enable it
> using the mount option mblk_io_submit, though. (And let me know your
> success stories! :-) I will be enabling it as the default in
> 2.6.39-rc1.
So it was ext4 only bug in ext4_end_bio(),
dm-crypt per-cpu code was just trigger here, right?
Milan
* Re: ext4: Fix data corruption with multi-block writepages support
2011-02-07 18:29 ` Milan Broz
@ 2011-02-07 18:44 ` Matt
2011-02-07 20:44 ` Ted Ts'o
1 sibling, 0 replies; 7+ messages in thread
From: Matt @ 2011-02-07 18:44 UTC
To: Milan Broz; +Cc: Ted Ts'o, Linux Kernel, linux-ext4
On Mon, Feb 7, 2011 at 6:29 PM, Milan Broz <mbroz@redhat.com> wrote:
> On 02/07/2011 06:45 PM, Ted Ts'o wrote:
>> On Fri, Feb 04, 2011 at 10:40:47PM +0000, Matt wrote:
>>>
>>> So that means that the file-corruption which existed until 2.6.37-rc6
>>> and got triggered (for me) more easily via "dm crypt: scale to
>>> multiple CPUs"
>>> is fixed now ?
>>
>> Well, a patch exists for it that will be merged into 2.6.38.
>>
>>> That should give ext4 a nice speedup for >=2.6.38 :)
>>
>> I'm not going to make it be the default for 2.6.38, since it's fairly
>> late in the -rc features. People who want it can explicitly enable it
>> using the mount option mblk_io_submit, though. (And let me know your
>> success stories! :-) I will be enabling it as the default in
>> 2.6.39-rc1.
>
> So it was ext4 only bug in ext4_end_bio(),
> dm-crypt per-cpu code was just trigger here, right?
>
> Milan
>
Hi Milan,
Well, that was at least my experience:
ext4: after Ted had disabled support for multiple page-io submission,
I no longer observed any data corruption (it had only appeared on the
system partition; on /home, where ext4 is also used, and on my backup
partitions there was no problem as far as I can tell)
XFS: no corruption observed
reiserfs: I can't say for sure since I'm only using it on my /boot partition :P
For other filesystems I can't say anything, since I didn't use any
others at that time.
Regards
Matt
* Re: ext4: Fix data corruption with multi-block writepages support
2011-02-07 17:45 ` Ted Ts'o
2011-02-07 18:29 ` Milan Broz
@ 2011-02-07 18:56 ` Matt
1 sibling, 0 replies; 7+ messages in thread
From: Matt @ 2011-02-07 18:56 UTC
To: Ted Ts'o; +Cc: Linux Kernel, linux-ext4
On Mon, Feb 7, 2011 at 5:45 PM, Ted Ts'o <tytso@mit.edu> wrote:
> On Fri, Feb 04, 2011 at 10:40:47PM +0000, Matt wrote:
>>
>> So that means that the file-corruption which existed until 2.6.37-rc6
>> and got triggered (for me) more easily via "dm crypt: scale to
>> multiple CPUs"
>> is fixed now ?
>
> Well, a patch exists for it that will be merged into 2.6.38.
>
>> That should give ext4 a nice speedup for >=2.6.38 :)
>
> I'm not going to make it be the default for 2.6.38, since it's fairly
> late in the -rc features. People who want it can explicitly enable it
> using the mount option mblk_io_submit, though. (And let me know your
> success stories! :-) I will be enabling it as the default in
> 2.6.39-rc1.
>
Hi Ted,
I guess it should be safe to enable it with 2.6.37, the dm-crypt multi-cpu
patch and the following patch?
"ext4: Fix data corruption with multi-block writepages support"
(of course that's the minimum - it would be better to pull in the ext4
changes for 2.6.38)
For a short time I had it activated (via the additional mblk_io_submit
mount option) on my portage-partition (where the portage-ball of my
Gentoo system is).
I was curious to see what messages I would get and wondered why
nothing about mballoc was mentioned.
If I recall correctly there were always messages in the past, like:
EXT4-fs: delayed allocation enabled
EXT4-fs: file extents enabled
EXT4-fs: mballoc enabled
These are from 2.6.28.
Now I'm only getting:
EXT4-fs (dm-3): mounted filesystem with ordered data mode.
or
EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: commit=60,barrier=1
(I like to set the barriers / flushes explicitly).
Sorry if I haven't followed development closely, but have these
messages been gradually silenced?
Thanks !
>> Reported-by: Matthias Bayer <jackdachef <at> gmail <dot> com >
>
> Sure!
Thanks !
>
> - Ted
>
Regards
Matt
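
Editorial sketch, not part of the original message: the mount messages quoted above are one way to see which options a filesystem was mounted with. The following Python snippet is a minimal illustration that parses a line of the form shown in the message and checks whether a given option (for example mblk_io_submit) appears in the "Opts:" list. The function names and the regular expression are hypothetical helpers written for this illustration, and it is an assumption that the option would be echoed in "Opts:" exactly as it was passed to mount.

```python
import re

# Matches kernel log lines of the form quoted in the message, e.g.
#   EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: commit=60,barrier=1
# The " Opts: ..." suffix is optional, as in the first quoted variant.
MOUNT_RE = re.compile(
    r"EXT4-fs \((?P<dev>[^)]+)\): mounted filesystem with (?P<mode>\w+) data mode\."
    r"(?: Opts: (?P<opts>\S*))?"
)


def parse_ext4_mount_line(line):
    """Return device, data mode and option list, or None if the line does not match."""
    m = MOUNT_RE.search(line)
    if m is None:
        return None
    opts = m.group("opts") or ""
    return {
        "device": m.group("dev"),
        "data_mode": m.group("mode"),
        "options": [o for o in opts.split(",") if o],
    }


def has_option(line, name):
    """True if the mount message lists the option `name` in its Opts field."""
    info = parse_ext4_mount_line(line)
    return info is not None and name in info["options"]


if __name__ == "__main__":
    quoted = ("EXT4-fs (dm-3): mounted filesystem with ordered data mode. "
              "Opts: commit=60,barrier=1")
    print(parse_ext4_mount_line(quoted))
    print(has_option(quoted, "mblk_io_submit"))
```

Such a helper could be fed lines from the kernel log; it only inspects the text of the message and says nothing about whether the feature is actually active in the kernel.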
* Re: ext4: Fix data corruption with multi-block writepages support
2011-02-07 18:29 ` Milan Broz
2011-02-07 18:44 ` Matt
@ 2011-02-07 20:44 ` Ted Ts'o
2011-02-07 20:51 ` Milan Broz
1 sibling, 1 reply; 7+ messages in thread
From: Ted Ts'o @ 2011-02-07 20:44 UTC
To: Milan Broz; +Cc: Matt, Linux Kernel, linux-ext4
On Mon, Feb 07, 2011 at 07:29:26PM +0100, Milan Broz wrote:
>
> So it was ext4 only bug in ext4_end_bio(),
> dm-crypt per-cpu code was just trigger here, right?
There appeared to be two bugs that people were discussing on that
particular dm_crypt mail thread. Some people were complaining about
issues with dm_crypt even when ext4 was not involved.
So I think it's fair to say that there was definitely _a_ ext4 bug
which was most easily seen when dm_crypt was in play, but which was
definitely not dm_crypt specific (it was possible to see it on an
hdd-only system, but the workload was much more severe). In any case,
as soon as the problem was found, we disabled the ext4 optimization
in 2.6.37-rc5.
So the fact that we found and fixed an ext4 bug that was triggered by
dm_crypt should not be taken as a statement (one way or the other)
that dm_crypt is Bug-Free(tm). :-)
- Ted
* Re: ext4: Fix data corruption with multi-block writepages support
2011-02-07 20:44 ` Ted Ts'o
@ 2011-02-07 20:51 ` Milan Broz
0 siblings, 0 replies; 7+ messages in thread
From: Milan Broz @ 2011-02-07 20:51 UTC
To: Ted Ts'o, Matt, Linux Kernel, linux-ext4
On 02/07/2011 09:44 PM, Ted Ts'o wrote:
> So the fact that we found and fixed an ext4 bug that was triggered by
> dm_crypt should not be taken as a statement (one way or the other)
> that dm_crypt is Bug-Free(tm). :-)
Really? Sigh. ;-)
(There is a rule that if a dm-crypt+XFS bug appears, the problem
is always in dm-crypt. So I am quite surprised that this time there
was NO bug in dm-crypt... yet :-)
Anyway, I would like to know if any problem still remains...
Thanks,
Milan
Thread overview: 7+ messages
2011-02-04 22:40 ext4: Fix data corruption with multi-block writepages support Matt
2011-02-07 17:45 ` Ted Ts'o
2011-02-07 18:29 ` Milan Broz
2011-02-07 18:44 ` Matt
2011-02-07 20:44 ` Ted Ts'o
2011-02-07 20:51 ` Milan Broz
2011-02-07 18:56 ` Matt