LKML Archive on lore.kernel.org
* btrfs: compression breaks cp and cross-FS mv, FS_IOC_FIEMAP bug?
@ 2011-02-13 15:49 Marti Raudsepp
  2011-02-13 15:57 ` Josef Bacik
  2011-02-13 16:31 ` Hugo Mills
  0 siblings, 2 replies; 10+ messages in thread
From: Marti Raudsepp @ 2011-02-13 15:49 UTC
  To: btrfs hackers, Kernel hackers

Hi list!

It seems I have found a serious regression in compressed btrfs in
kernel 2.6.37. When creating a small file (less than the block size)
and then cp/mv it to *another* file system, an appropriate number of
zeroes gets written to the destination file. Case in point:

% echo foobar > foobar
% hexdump -C foobar
00000000  66 6f 6f 62 61 72 0a                              |foobar.|
00000007
% mv foobar /tmp
% hexdump -C /tmp/foobar
00000000  00 00 00 00 00 00 00                              |.......|
00000007
% cp foobar foobar2
% hexdump -C foobar2
00000000  00 00 00 00 00 00 00                              |.......|
00000007

Via strace I found that mv doesn't even attempt to read anything:

open("foobar", O_RDONLY|O_NOFOLLOW)     = 3
fstat(3, {st_mode=S_IFREG|0664, st_size=7, ...}) = 0
open("/tmp/foobar", O_WRONLY|O_CREAT|O_EXCL, 0600) = 4
fstat(4, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
ioctl(3, FS_IOC_FIEMAP, 0x7fff62f6bfa0) = 0
write(4, "\0\0\0\0\0\0\0", 7)           = 7

What's that, is FS_IOC_FIEMAP telling it that it's a sparse file?
Compare with ext4:

ioctl(3, FS_IOC_FIEMAP, 0x7fff2c576a90) = 0
lseek(3, 0, SEEK_SET)                   = 0
read(3, "foobar\n", 4096)               = 7
write(4, "foobar\n", 7)                 = 7
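To see what the FS_IOC_FIEMAP call in the traces above actually returns, the ioctl can be issued by hand. The following is a minimal sketch, not what cp itself does: the helper names (build_request, parse_reply, fiemap) are made up for illustration, the struct layouts and the ioctl number are taken from <linux/fiemap.h> and <linux/fs.h>, and the code assumes Linux on a file system that implements FIEMAP.

```python
import fcntl
import struct

# Values from <linux/fs.h> and <linux/fiemap.h>.
FS_IOC_FIEMAP = 0xC020660B        # _IOWR('f', 11, struct fiemap)
FIEMAP_FLAG_SYNC = 0x1            # ask the kernel to sync the file before mapping
FIEMAP_MAX_OFFSET = 0xFFFFFFFFFFFFFFFF

# struct fiemap header: fm_start, fm_length, fm_flags,
# fm_mapped_extents, fm_extent_count, fm_reserved (32 bytes).
FIEMAP_HDR = struct.Struct("=QQIIII")
# struct fiemap_extent: fe_logical, fe_physical, fe_length,
# fe_reserved64[2], fe_flags, fe_reserved[3] (56 bytes).
FIEMAP_EXTENT = struct.Struct("=QQQ2QI3I")


def build_request(max_extents=32, flags=FIEMAP_FLAG_SYNC):
    """Return a zeroed buffer asking for the extents of the whole file."""
    buf = bytearray(FIEMAP_HDR.size + max_extents * FIEMAP_EXTENT.size)
    FIEMAP_HDR.pack_into(buf, 0, 0, FIEMAP_MAX_OFFSET, flags, 0, max_extents, 0)
    return buf


def parse_reply(buf):
    """Decode the buffer into (logical, physical, length, flags) tuples."""
    mapped = FIEMAP_HDR.unpack_from(buf, 0)[3]   # fm_mapped_extents
    extents = []
    for i in range(mapped):
        offset = FIEMAP_HDR.size + i * FIEMAP_EXTENT.size
        fields = FIEMAP_EXTENT.unpack_from(buf, offset)
        # fields[3:5] are reserved; fields[5] is fe_flags.
        extents.append((fields[0], fields[1], fields[2], fields[5]))
    return extents


def fiemap(path, max_extents=32):
    """Query the extent map of `path`; raises OSError if FIEMAP is unsupported."""
    buf = build_request(max_extents)
    with open(path, "rb") as f:
        fcntl.ioctl(f.fileno(), FS_IOC_FIEMAP, buf)   # kernel fills buf in place
    return parse_reply(buf)
```

Calling fiemap("foobar") right after creating the file and again after a sync would show whether the kernel reports an empty extent list for freshly written data.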

I'm currently running on 2.6.37, x86_64 using Arch Linux -testing with
coreutils 8.10. Filesystem is mounted from LVM2 to /usr/src with -o
noatime,compress

This only seems to occur with compressed file systems (either zlib or
LZO). A person on IRC also reproduced the same problem in 2.6.38-rc.
I'm pretty sure this used to work correctly around 2.6.35 or 2.6.36.

This is 100% reproducible here. If anyone has trouble reproducing
this, I can dig further and provide information as needed.

Regards,
Marti

* Re: btrfs: compression breaks cp and cross-FS mv, FS_IOC_FIEMAP bug?
  2011-02-13 15:49 btrfs: compression breaks cp and cross-FS mv, FS_IOC_FIEMAP bug? Marti Raudsepp
@ 2011-02-13 15:57 ` Josef Bacik
  2011-02-13 16:07   ` Marti Raudsepp
  2011-02-13 16:31 ` Hugo Mills
  1 sibling, 1 reply; 10+ messages in thread
From: Josef Bacik @ 2011-02-13 15:57 UTC
  To: Marti Raudsepp; +Cc: btrfs hackers, Kernel hackers

On Sun, Feb 13, 2011 at 05:49:42PM +0200, Marti Raudsepp wrote:
> Hi list!
> 
> It seems I have found a serious regression in compressed btrfs in
> kernel 2.6.37. When creating a small file (less than the block size)
> and then cp/mv it to *another* file system, an appropriate number of
> zeroes gets written to the destination file. Case in point:
> 
> % echo foobar > foobar
> % hexdump -C foobar
> 00000000  66 6f 6f 62 61 72 0a                              |foobar.|
> 00000007
> % mv foobar /tmp
> % hexdump -C /tmp/foobar
> 00000000  00 00 00 00 00 00 00                              |.......|
> 00000007
> % cp foobar foobar2
> % hexdump -C foobar2
> 00000000  00 00 00 00 00 00 00                              |.......|
> 00000007
> 
> Via strace I found that mv doesn't even attempt to read anything:
> 
> open("foobar", O_RDONLY|O_NOFOLLOW)     = 3
> fstat(3, {st_mode=S_IFREG|0664, st_size=7, ...}) = 0
> open("/tmp/foobar", O_WRONLY|O_CREAT|O_EXCL, 0600) = 4
> fstat(4, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
> ioctl(3, FS_IOC_FIEMAP, 0x7fff62f6bfa0) = 0
> write(4, "\0\0\0\0\0\0\0", 7)           = 7
> 
> What's that, is FS_IOC_FIEMAP telling it that it's a sparse file?
> Compare with ext4:
> 
> ioctl(3, FS_IOC_FIEMAP, 0x7fff2c576a90) = 0
> lseek(3, 0, SEEK_SET)                   = 0
> read(3, "foobar\n", 4096)               = 7
> write(4, "foobar\n", 7)                 = 7
> 
> I'm currently running on 2.6.37, x86_64 using Arch Linux -testing with
> coreutils 8.10. Filesystem is mounted from LVM2 to /usr/src with -o
> noatime,compress
> 
> This only seems to occur with compressed file systems (either zlib or
> LZO). A person on IRC also reproduced the same problem in 2.6.38-rc.
> I'm pretty sure this used to work correctly around 2.6.35 or 2.6.36.
> 
> This is 100% reproducible here. If anyone has trouble reproducing
> this, I can dig further and provide information as needed.
>

Does the same problem happen when you use cp --sparse=never?  Thanks,

Josef 

* Re: btrfs: compression breaks cp and cross-FS mv, FS_IOC_FIEMAP bug?
  2011-02-13 15:57 ` Josef Bacik
@ 2011-02-13 16:07   ` Marti Raudsepp
  2011-02-13 16:13     ` Josef Bacik
  0 siblings, 1 reply; 10+ messages in thread
From: Marti Raudsepp @ 2011-02-13 16:07 UTC
  To: Josef Bacik; +Cc: btrfs hackers, Kernel hackers

On Sun, Feb 13, 2011 at 17:57, Josef Bacik <josef@redhat.com> wrote:
> Does the same problem happen when you use cp --sparse=never?

You are right. cp --sparse=never does not cause data loss.

Regards,
Marti

* Re: btrfs: compression breaks cp and cross-FS mv, FS_IOC_FIEMAP bug?
  2011-02-13 16:07   ` Marti Raudsepp
@ 2011-02-13 16:13     ` Josef Bacik
  2011-02-14 15:01       ` Chris Mason
  0 siblings, 1 reply; 10+ messages in thread
From: Josef Bacik @ 2011-02-13 16:13 UTC
  To: Marti Raudsepp; +Cc: Josef Bacik, btrfs hackers, Kernel hackers

On Sun, Feb 13, 2011 at 06:07:36PM +0200, Marti Raudsepp wrote:
> On Sun, Feb 13, 2011 at 17:57, Josef Bacik <josef@redhat.com> wrote:
> > Does the same problem happen when you use cp --sparse=never?
> 
> You are right. cp --sparse=never does not cause data loss.
>

So fiemap probably isn't doing the right thing when compression is enabled,
which doesn't surprise me since we don't do the right thing with delalloc either.
I will try and get to this soon.  Thanks,

Josef 
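
Why a wrong extent map turns into a file full of zeroes can be shown with a toy model. This is only a sketch of the general idea behind an extent-driven sparse copy, not the coreutils implementation; copy_using_extent_map is a hypothetical helper and the extents are plain (logical_offset, length) pairs.

```python
def copy_using_extent_map(src: bytes, extents):
    """Mimic an extent-driven copy: read only mapped ranges, zero-fill the rest.

    `extents` is a list of (logical_offset, length) pairs, as a copier might
    derive them from FIEMAP. Anything not covered by an extent is treated as
    a hole and comes out as zero bytes in the destination.
    """
    dst = bytearray(len(src))                    # starts out as all zeroes
    for logical, length in extents:
        end = min(logical + length, len(src))    # extents may run past EOF
        dst[logical:end] = src[logical:end]
    return bytes(dst)


data = b"foobar\n"
# Extent map reports nothing (e.g. data still in delalloc): all zeroes.
print(copy_using_extent_map(data, []))
# Extent map reports one 4096-byte extent at offset 0: data is copied.
print(copy_using_extent_map(data, [(0, 4096)]))
```

With an empty map the destination gets seven zero bytes, which matches the hexdump in the original report; with --sparse=never the copier would ignore the map and read the whole file.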

* Re: btrfs: compression breaks cp and cross-FS mv, FS_IOC_FIEMAP bug?
  2011-02-13 15:49 btrfs: compression breaks cp and cross-FS mv, FS_IOC_FIEMAP bug? Marti Raudsepp
  2011-02-13 15:57 ` Josef Bacik
@ 2011-02-13 16:31 ` Hugo Mills
  1 sibling, 0 replies; 10+ messages in thread
From: Hugo Mills @ 2011-02-13 16:31 UTC
  To: Marti Raudsepp; +Cc: btrfs hackers, Kernel hackers

On Sun, Feb 13, 2011 at 05:49:42PM +0200, Marti Raudsepp wrote:
> Hi list!
> 
> It seems I have found a serious regression in compressed btrfs in
> kernel 2.6.37. When creating a small file (less than the block size)
> and then cp/mv it to *another* file system, an appropriate number of
> zeroes gets written to the destination file. Case in point:

[snip]

> I'm currently running on 2.6.37, x86_64 using Arch Linux -testing with
> coreutils 8.10. Filesystem is mounted from LVM2 to /usr/src with -o
> noatime,compress
> 
> This only seems to occur with compressed file systems (either zlib or
> LZO). A person on IRC also reproduced the same problem in 2.6.38-rc.
> I'm pretty sure this used to work correctly around 2.6.35 or 2.6.36.

   This would seem to be the same effect that we've had reported on
IRC by at least two Gentoo users, of files full of zeroes in their
build system. We'll follow up with them over there and see if it's the
same bug.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
           --- I must be musical:  I've got *loads* of CDs ---           

* Re: btrfs: compression breaks cp and cross-FS mv, FS_IOC_FIEMAP bug?
  2011-02-13 16:13     ` Josef Bacik
@ 2011-02-14 15:01       ` Chris Mason
  2011-02-14 17:58         ` Marti Raudsepp
  0 siblings, 1 reply; 10+ messages in thread
From: Chris Mason @ 2011-02-14 15:01 UTC
  To: Josef Bacik; +Cc: Marti Raudsepp, btrfs hackers, Kernel hackers

Excerpts from Josef Bacik's message of 2011-02-13 11:13:30 -0500:
> On Sun, Feb 13, 2011 at 06:07:36PM +0200, Marti Raudsepp wrote:
> > On Sun, Feb 13, 2011 at 17:57, Josef Bacik <josef@redhat.com> wrote:
> > > Does the same problem happen when you use cp --sparse=never?
> > 
> > You are right. cp --sparse=never does not cause data loss.
> >
> 
> So fiemap probably isn't doing the right thing when compression is enabled,
> which doesn't surprise me since we don't do the right thing with delalloc either.
> I will try and get to this soon.  Thanks,

This might be a bug in the cp code.  We're setting the disk extent to
zero but setting different flags to say we're inline and compressed.
The cp fiemap code might be ignoring the flags?

Or, it could just be delalloc ;)

-chris
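
The flags in question are the FIEMAP_EXTENT_* bits from <linux/fiemap.h>. A rough sketch of how a consumer could decode them and decide whether fe_physical means anything is given below. The constant values come from the kernel header; describe and physical_offset_is_meaningful are hypothetical helpers, and the choice of which bits make the physical offset untrustworthy is an assumption for illustration, not cp's actual policy.

```python
# Flag values from <linux/fiemap.h>.
FIEMAP_EXTENT_LAST = 0x0001            # last extent in the file
FIEMAP_EXTENT_UNKNOWN = 0x0002         # data location unknown
FIEMAP_EXTENT_DELALLOC = 0x0004        # location still pending (delayed allocation)
FIEMAP_EXTENT_ENCODED = 0x0008         # data is compressed or otherwise encoded
FIEMAP_EXTENT_DATA_ENCRYPTED = 0x0080  # data is encrypted
FIEMAP_EXTENT_NOT_ALIGNED = 0x0100     # offsets are not block aligned
FIEMAP_EXTENT_DATA_INLINE = 0x0200     # data is stored inside metadata
FIEMAP_EXTENT_DATA_TAIL = 0x0400       # tail-packed data
FIEMAP_EXTENT_UNWRITTEN = 0x0800       # allocated but reads back as zeroes

FLAG_NAMES = {
    FIEMAP_EXTENT_LAST: "last",
    FIEMAP_EXTENT_UNKNOWN: "unknown",
    FIEMAP_EXTENT_DELALLOC: "delalloc",
    FIEMAP_EXTENT_ENCODED: "encoded",
    FIEMAP_EXTENT_DATA_ENCRYPTED: "encrypted",
    FIEMAP_EXTENT_NOT_ALIGNED: "not_aligned",
    FIEMAP_EXTENT_DATA_INLINE: "inline",
    FIEMAP_EXTENT_DATA_TAIL: "tail",
    FIEMAP_EXTENT_UNWRITTEN: "unwritten",
}

# Assumption: with any of these bits set, a physical offset of 0 must not
# be read as "no data here".
PHYSICAL_UNTRUSTED = (
    FIEMAP_EXTENT_UNKNOWN
    | FIEMAP_EXTENT_DELALLOC
    | FIEMAP_EXTENT_ENCODED
    | FIEMAP_EXTENT_DATA_ENCRYPTED
    | FIEMAP_EXTENT_NOT_ALIGNED
    | FIEMAP_EXTENT_DATA_INLINE
)


def describe(flags):
    """Return the names of all known bits set in fe_flags."""
    return [name for bit, name in FLAG_NAMES.items() if flags & bit]


def physical_offset_is_meaningful(flags):
    """False when fe_physical should not be interpreted by the caller."""
    return not (flags & PHYSICAL_UNTRUSTED)
```

An inline extent would typically carry NOT_ALIGNED and DATA_INLINE together, so a copier that only looks at the physical offset and skips the flags could mistake it for a hole.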

* Re: btrfs: compression breaks cp and cross-FS mv, FS_IOC_FIEMAP bug?
  2011-02-14 15:01       ` Chris Mason
@ 2011-02-14 17:58         ` Marti Raudsepp
  2011-02-14 18:01           ` Chris Mason
  2011-02-15 11:30           ` Pádraig Brady
  0 siblings, 2 replies; 10+ messages in thread
From: Marti Raudsepp @ 2011-02-14 17:58 UTC
  To: Chris Mason; +Cc: Josef Bacik, btrfs hackers, Kernel hackers

On Mon, Feb 14, 2011 at 17:01, Chris Mason <chris.mason@oracle.com> wrote:
> Or, it could just be delalloc ;)

I suspect delalloc. After creating the file, filefrag reports "1
extent found", but for some reason it doesn't actually print out
details of the extent.

After a "sync" call, the extent appears and "cp" starts working as expected:

% rm -f foo bar
% echo foo > foo
% sync
% filefrag -v foo
Filesystem type is: 9123683e
File size of foo is 4 (1 block, blocksize 4096)
 ext logical physical expected length flags
   0       0        0            4096 not_aligned,inline,eof
foo: 1 extent found
% cp foo bar
% hexdump bar
0000000 6f66 0a6f
0000004

Without sync:

% rm -f foo bar
% echo foo > foo
% filefrag -v foo
Filesystem type is: 9123683e
File size of foo is 4 (1 block, blocksize 4096)
 ext logical physical expected length flags
foo: 1 extent found
% cp foo bar
% hexdump bar
0000000 0000 0000
0000004

Regards,
Marti

* Re: btrfs: compression breaks cp and cross-FS mv, FS_IOC_FIEMAP bug?
  2011-02-14 17:58         ` Marti Raudsepp
@ 2011-02-14 18:01           ` Chris Mason
  2011-02-15 11:30           ` Pádraig Brady
  1 sibling, 0 replies; 10+ messages in thread
From: Chris Mason @ 2011-02-14 18:01 UTC
  To: Marti Raudsepp; +Cc: Josef Bacik, btrfs hackers, Kernel hackers

Excerpts from Marti Raudsepp's message of 2011-02-14 12:58:17 -0500:
> On Mon, Feb 14, 2011 at 17:01, Chris Mason <chris.mason@oracle.com> wrote:
> > Or, it could just be delalloc ;)
> 
> I suspect delalloc. After creating the file, filefrag reports "1
> extent found", but for some reason it doesn't actually print out
> details of the extent.
> 
> After a "sync" call, the extent appears and "cp" starts working as expected:

Great, that's a ton easier than fixing cp.

-chris

* Re: btrfs: compression breaks cp and cross-FS mv, FS_IOC_FIEMAP bug?
  2011-02-14 17:58         ` Marti Raudsepp
  2011-02-14 18:01           ` Chris Mason
@ 2011-02-15 11:30           ` Pádraig Brady
  2011-02-15 13:18             ` Josef Bacik
  1 sibling, 1 reply; 10+ messages in thread
From: Pádraig Brady @ 2011-02-15 11:30 UTC
  To: Marti Raudsepp
  Cc: Chris Mason, Josef Bacik, btrfs hackers, Kernel hackers, 8001

On 14/02/11 17:58, Marti Raudsepp wrote:
> On Mon, Feb 14, 2011 at 17:01, Chris Mason <chris.mason@oracle.com> wrote:
>> Or, it could just be delalloc ;)
> 
> I suspect delalloc. After creating the file, filefrag reports "1
> extent found", but for some reason it doesn't actually print out
> details of the extent.

That's a bug in `filefrag -v` that I noticed independently yesterday.
Without -v it will correctly report 0 extents.
I've already suggested a patch to fix upstream.

> After a "sync" call, the extent appears and "cp" starts working as expected:

About that sync.
I've noticed on ext4 loopback at least (and I suspect BTRFS is the same)
that specifying FIEMAP_FLAG_SYNC (which cp does) is ineffective.
I worked around this for cp tests by explicitly syncing with:
dd if=/dev/null of=foo conv=notrunc,fdatasync
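
The same per-file flush can be done from a script. This is a small sketch of what the dd invocation above amounts to (open without truncating, write nothing, fdatasync); flush_file_data is a hypothetical helper name, and the code assumes a POSIX system where os.fdatasync is available.

```python
import os


def flush_file_data(path):
    """Force the data of `path` to disk without modifying it.

    Mirrors `dd if=/dev/null of=PATH conv=notrunc,fdatasync`: the file is
    opened for writing without O_TRUNC, nothing is written, and
    fdatasync() is called on the descriptor.
    """
    fd = os.open(path, os.O_WRONLY)   # no O_TRUNC, contents stay intact
    try:
        os.fdatasync(fd)
    finally:
        os.close(fd)
```

Unlike a global sync, this only flushes the one file, which keeps a test suite from stalling on unrelated dirty data.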

> % rm -f foo bar
> % echo foo > foo
> % sync
> % filefrag -v foo
> Filesystem type is: 9123683e
> File size of foo is 4 (1 block, blocksize 4096)
>  ext logical physical expected length flags
>    0       0        0            4096 not_aligned,inline,eof
> foo: 1 extent found
> % cp foo bar
> % hexdump bar
> 0000000 6f66 0a6f
> 0000004

OK that's fine for normal files.
cp (from coreutils >= 8.10) may still do the wrong thing
as it currently ignores FIEMAP_EXTENT_DATA_ENCRYPTED and FIEMAP_EXTENT_ENCODED
as I've already reported:
http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg08356.html
I'd appreciate some `filefrag -v` output from a large compressed file.

cheers,
Pádraig.

* Re: btrfs: compression breaks cp and cross-FS mv, FS_IOC_FIEMAP bug?
  2011-02-15 11:30           ` Pádraig Brady
@ 2011-02-15 13:18             ` Josef Bacik
  0 siblings, 0 replies; 10+ messages in thread
From: Josef Bacik @ 2011-02-15 13:18 UTC
  To: Pádraig Brady
  Cc: Marti Raudsepp, Chris Mason, Josef Bacik, btrfs hackers,
	Kernel hackers, 8001

On Tue, Feb 15, 2011 at 11:30:38AM +0000, Pádraig Brady wrote:
> On 14/02/11 17:58, Marti Raudsepp wrote:
> > On Mon, Feb 14, 2011 at 17:01, Chris Mason <chris.mason@oracle.com> wrote:
> >> Or, it could just be delalloc ;)
> > 
> > I suspect delalloc. After creating the file, filefrag reports "1
> > extent found", but for some reason it doesn't actually print out
> > details of the extent.
> 
> That's a bug in `filefrag -v` that I noticed independently yesterday.
> Without -v it will correctly report 0 extents.
> I've already suggested a patch to fix upstream.
> 
> > After a "sync" call, the extent appears and "cp" starts working as expected:
> 
> About that sync.
> I've noticed on ext4 loopback at least (and I suspect BTRFS is the same)
> that specifying FIEMAP_FLAG_SYNC (which cp does) is ineffective.
> I worked around this for cp tests by explicitly syncing with:
> dd if=/dev/null of=foo conv=notrunc,fdatasync
>

Well, that's not good; that's all taken care of in the generic code before it gets
to the fs. I'll take a look at that when I try to fix delalloc fiemap for
btrfs.  Thanks,

Josef 
