LKML Archive on lore.kernel.org
* Re: ext4: Fix data corruption with multi-block writepages support
@ 2011-02-04 22:40 Matt
  2011-02-07 17:45 ` Ted Ts'o
  0 siblings, 1 reply; 7+ messages in thread
From: Matt @ 2011-02-04 22:40 UTC (permalink / raw)
  To: Ted Ts'o; +Cc: Linux Kernel, linux-ext4

>Thanks, added to the ext4 patch queue.

>I modified the commit description slightly to give credit to Jon
>Nelson, who reported the bug and really helped by devising a
>reproducible test case.  Many thanks, Jon!!

>							- Ted

So that means that the file-corruption which existed until 2.6.37-rc6
and got triggered (for me) more easily via "dm crypt: scale to
multiple CPUs"
is fixed now ?

That should give ext4 a nice speedup for >=2.6.38 :)

Could you also please add a Reported-by tag?

Reported-by: Matthias Bayer <jackdachef <at> gmail <dot> com >

I mainly found it through testing with the mentioned dm-crypt scaling
patch and >=2.6.36-git*

Thanks & Regards !

Matt


* Re: ext4: Fix data corruption with multi-block writepages support
  2011-02-04 22:40 ext4: Fix data corruption with multi-block writepages support Matt
@ 2011-02-07 17:45 ` Ted Ts'o
  2011-02-07 18:29   ` Milan Broz
  2011-02-07 18:56   ` Matt
  0 siblings, 2 replies; 7+ messages in thread
From: Ted Ts'o @ 2011-02-07 17:45 UTC (permalink / raw)
  To: Matt; +Cc: Linux Kernel, linux-ext4

On Fri, Feb 04, 2011 at 10:40:47PM +0000, Matt wrote:
> 
> So that means that the file-corruption which existed until 2.6.37-rc6
> and got triggered (for me) more easily via "dm crypt: scale to
> multiple CPUs"
> is fixed now ?

Well, a patch exists for it that will be merged into 2.6.38.

> That should give ext4 a nice speedup for >=2.6.38 :)

I'm not going to make it be the default for 2.6.38, since it's fairly
late in the -rc features.  People who want it can explicitly enable it
using the mount option mblk_io_submit, though.  (And let me know your
success stories!  :-) I will be enabling it as the default in
2.6.39-rc1.

> Reported-by: Matthias Bayer <jackdachef <at> gmail <dot> com >

Sure!

						- Ted


* Re: ext4: Fix data corruption with multi-block writepages support
  2011-02-07 17:45 ` Ted Ts'o
@ 2011-02-07 18:29   ` Milan Broz
  2011-02-07 18:44     ` Matt
  2011-02-07 20:44     ` Ted Ts'o
  2011-02-07 18:56   ` Matt
  1 sibling, 2 replies; 7+ messages in thread
From: Milan Broz @ 2011-02-07 18:29 UTC (permalink / raw)
  To: Ted Ts'o, Matt, Linux Kernel, linux-ext4

On 02/07/2011 06:45 PM, Ted Ts'o wrote:
> On Fri, Feb 04, 2011 at 10:40:47PM +0000, Matt wrote:
>>
>> So that means that the file-corruption which existed until 2.6.37-rc6
>> and got triggered (for me) more easily via "dm crypt: scale to
>> multiple CPUs"
>> is fixed now ?
> 
> Well, a patch exists for it that will be merged into 2.6.38.
> 
>> That should give ext4 a nice speedup for >=2.6.38 :)
> 
> I'm not going to make it be the default for 2.6.38, since it's fairly
> late in the -rc features.  People who want it can explicitly enable it
> using the mount option mblk_io_submit, though.  (And let me know your
> success stories!  :-) I will be enabling it as the default in
> 2.6.39-rc1.

So it was ext4 only bug in ext4_end_bio(),
dm-crypt per-cpu code was just trigger here, right?

Milan


* Re: ext4: Fix data corruption with multi-block writepages support
  2011-02-07 18:29   ` Milan Broz
@ 2011-02-07 18:44     ` Matt
  2011-02-07 20:44     ` Ted Ts'o
  1 sibling, 0 replies; 7+ messages in thread
From: Matt @ 2011-02-07 18:44 UTC (permalink / raw)
  To: Milan Broz; +Cc: Ted Ts'o, Linux Kernel, linux-ext4

On Mon, Feb 7, 2011 at 6:29 PM, Milan Broz <mbroz@redhat.com> wrote:
> On 02/07/2011 06:45 PM, Ted Ts'o wrote:
>> On Fri, Feb 04, 2011 at 10:40:47PM +0000, Matt wrote:
>>>
>>> So that means that the file-corruption which existed until 2.6.37-rc6
>>> and got triggered (for me) more easily via "dm crypt: scale to
>>> multiple CPUs"
>>> is fixed now ?
>>
>> Well, a patch exists for it that will be merged into 2.6.38.
>>
>>> That should give ext4 a nice speedup for >=2.6.38 :)
>>
>> I'm not going to make it be the default for 2.6.38, since it's fairly
>> late in the -rc features.  People who want it can explicitly enable it
>> using the mount option mblk_io_submit, though.  (And let me know your
>> success stories!  :-) I will be enabling it as the default in
>> 2.6.39-rc1.
>
> So it was ext4 only bug in ext4_end_bio(),
> dm-crypt per-cpu code was just trigger here, right?
>
> Milan
>

Hi Milan,

Well, that was at least my experience:

ext4: after Ted had disabled support for multiple-page I/O submission,
I no longer observed any data corruption. (It had only ever appeared on
the system partition; on /home, which also uses ext4, and on my backup
partitions there was no problem as far as I can tell.)

XFS: no corruption observed.

reiserfs: I can't say for sure, since I only use it on my /boot partition :P

Other filesystems: I can't say anything, since I wasn't using any others
at that time.

Regards

Matt


* Re: ext4: Fix data corruption with multi-block writepages support
  2011-02-07 17:45 ` Ted Ts'o
  2011-02-07 18:29   ` Milan Broz
@ 2011-02-07 18:56   ` Matt
  1 sibling, 0 replies; 7+ messages in thread
From: Matt @ 2011-02-07 18:56 UTC (permalink / raw)
  To: Ted Ts'o; +Cc: Linux Kernel, linux-ext4

On Mon, Feb 7, 2011 at 5:45 PM, Ted Ts'o <tytso@mit.edu> wrote:
> On Fri, Feb 04, 2011 at 10:40:47PM +0000, Matt wrote:
>>
>> So that means that the file-corruption which existed until 2.6.37-rc6
>> and got triggered (for me) more easily via "dm crypt: scale to
>> multiple CPUs"
>> is fixed now ?
>
> Well, a patch exists for it that will be merged into 2.6.38.
>
>> That should give ext4 a nice speedup for >=2.6.38 :)
>
> I'm not going to make it be the default for 2.6.38, since it's fairly
> late in the -rc features.  People who want it can explicitly enable it
> using the mount option mblk_io_submit, though.  (And let me know your
> success stories!  :-) I will be enabling it as the default in
> 2.6.39-rc1.
>

Hi Ted,

I guess it should be safe to enable it with 2.6.37, the dm-crypt multi-CPU
patch, and the following patch?

"ext4: Fix data corruption with multi-block writepages support"
(of course that's the minimum; it would be better to pull in the ext4
changes for 2.6.38)


For a short time I had it activated (via the additional mblk_io_submit
mount option) on my portage-partition (where the portage-ball of my
Gentoo system is).
I was curious to see what messages I would get, and wondered why
nothing about mballoc was mentioned.

If I recall correctly there were always messages in the past, like:

EXT4-fs: delayed allocation enabled
EXT4-fs: file extents enabled
EXT4-fs: mballoc enabled

(these are from 2.6.28)

I'm only getting:

EXT4-fs (dm-3): mounted filesystem with ordered data mode.

or

EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: commit=60,barrier=1

(I like to set the barriers / flushes explicitly).

Sorry if I didn't follow development closely, but have these messages
been gradually silenced?


Thanks !


>> Reported-by: Matthias Bayer <jackdachef <at> gmail <dot> com >
>
> Sure!

Thanks !


>
>                                                - Ted
>

Regards

Matt
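
[Editorial sketch, not part of the original message.] The mount messages
quoted above can be checked mechanically to see which options a
filesystem was mounted with, for example whether mblk_io_submit was
passed. The following is a minimal Python sketch under stated
assumptions: the regular expression is modeled only on the two log lines
quoted in this message, and the helper names parse_mount_message and
has_option are hypothetical. It assumes the text after "Opts:" is the
comma-separated option string given at mount time.

```python
import re

# Pattern modeled on the log lines quoted in the message above.
# The exact kernel message format is an assumption and may differ
# between kernel versions.
MOUNT_MSG = re.compile(
    r"EXT4-fs \((?P<dev>[^)]+)\): mounted filesystem with "
    r"(?P<mode>\w+) data mode\."
    r"(?:\s*Opts:\s*(?P<opts>\S*))?"
)


def parse_mount_message(line):
    """Return (device, data_mode, options) or None if the line does not match.

    options is a list of the comma-separated entries after "Opts:",
    or an empty list when no "Opts:" part is present.
    """
    match = MOUNT_MSG.search(line)
    if match is None:
        return None
    opts = match.group("opts") or ""
    return (
        match.group("dev"),
        match.group("mode"),
        [opt for opt in opts.split(",") if opt],
    )


def has_option(line, option):
    """True if the mount message on this line lists the given option."""
    parsed = parse_mount_message(line)
    return parsed is not None and option in parsed[2]


if __name__ == "__main__":
    sample = ("EXT4-fs (dm-3): mounted filesystem with ordered data mode. "
              "Opts: commit=60,barrier=1")
    print(parse_mount_message(sample))
    print(has_option(sample, "mblk_io_submit"))
```

Feeding the output of dmesg through has_option line by line would show
whether a given mount listed the option. Options that are in effect by
default would not appear in such a line, so the absence of an option
here says nothing about defaults.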


* Re: ext4: Fix data corruption with multi-block writepages support
  2011-02-07 18:29   ` Milan Broz
  2011-02-07 18:44     ` Matt
@ 2011-02-07 20:44     ` Ted Ts'o
  2011-02-07 20:51       ` Milan Broz
  1 sibling, 1 reply; 7+ messages in thread
From: Ted Ts'o @ 2011-02-07 20:44 UTC (permalink / raw)
  To: Milan Broz; +Cc: Matt, Linux Kernel, linux-ext4

On Mon, Feb 07, 2011 at 07:29:26PM +0100, Milan Broz wrote:
> 
> So it was ext4 only bug in ext4_end_bio(),
> dm-crypt per-cpu code was just trigger here, right?

There appeared to be two bugs that people were discussing on that
particular dm_crypt mail thread.  Some people were complaining about
issues with dm_crypt even when ext4 was not involved.

So I think it's fair to say that there was definitely _an_ ext4 bug
which was most easily seen when dm_crypt was in play, but which was
definitely not dm_crypt specific (it was possible to see it on an
hdd-only system, but the workload was much more severe).  In any case,
as soon as the problem was found, we disabled the ext4 optimization
in 2.6.37-rc5.

So the fact that we found and fixed an ext4 bug that was triggered by
dm_crypt should not be taken as a statement (one way or the other)
that dm_crypt is Bug-Free(tm).  :-)

					- Ted


* Re: ext4: Fix data corruption with multi-block writepages support
  2011-02-07 20:44     ` Ted Ts'o
@ 2011-02-07 20:51       ` Milan Broz
  0 siblings, 0 replies; 7+ messages in thread
From: Milan Broz @ 2011-02-07 20:51 UTC (permalink / raw)
  To: Ted Ts'o, Matt, Linux Kernel, linux-ext4

On 02/07/2011 09:44 PM, Ted Ts'o wrote:
> So the fact that we found and fixed an ext4 bug that was triggered by
> dm_crypt should not be taken as a statement (one way or the other)
> that dm_crypt is Bug-Free(tm).  :-)

Really? Sigh. ;-)

(There is a rule that if a dm-crypt+XFS bug appears, the problem
is always in dm-crypt. So I am quite surprised that this time there
was NO bug in dm-crypt... yet :-)

Anyway, I would like to know if any problem still remains...

Thanks,
Milan


