LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH 1/2] fs: truncate blocks outside i_size after generic_file_direct_write error
@ 2008-10-19 14:03 Dmitri Monakhov
  2008-10-19 14:03 ` [PATCH 2/2] ocfs2: truncate outstanding block after direct io failure Dmitri Monakhov
  2008-10-20 12:47 ` [PATCH 1/2] fs: truncate blocks outside i_size after generic_file_direct_write error Jeff Moyer
  0 siblings, 2 replies; 4+ messages in thread
From: Dmitri Monakhov @ 2008-10-19 14:03 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, Dmitri Monakhov

We need to remove block that was allocated in generic_file_direct_write()
if we fail. We have to do it *regardless* to blocksize. At least ext2,
ext3 and reiserfs interpret (i_size < biggest block) condition as error.
Fsck will complain about wrong i_size. Then fsck will fix the error
by changing i_size according to the biggest block. This is bad because
this blocks contain gurbage from previous write attempt. And result in
data corruption.

In order to call vmtruncate() we have to hold host->i_mutex. This is true
for generic_file_aio_write(). In fact occasionally it is also true for all
generic_file_aio_write_nolock() callers except blockdev. But this situation
may change someday. This patch fix only generic_write_aio_write() case.
BTW: update generic_file_direct_write's comment with currently outdated.

Signed-off-by: Dmitri Monakhov <dmonakhov@openvz.org>
---
 mm/filemap.c |   26 ++++++++++++++++++++------
 1 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 3d5a2e7..1e911e5 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2186,8 +2186,10 @@ generic_file_direct_write(struct kiocb *iocb, const struct iovec *iov,
 	/*
 	 * Sync the fs metadata but not the minor inode changes and
 	 * of course not the data as we did direct DMA for the IO.
-	 * i_mutex is held, which protects generic_osync_inode() from
-	 * livelocking.  AIO O_DIRECT ops attempt to sync metadata here.
+	 * i_mutex is held in case of DIO_LOCKING, which protects
+	 * generic_osync_inode() from livelocking. If it is not held, then
+	 * the filesystem must prevent this livelock by its own meaner.
+	 * AIO O_DIRECT ops attempt to sync metadata here.
 	 */
 out:
 	if ((written >= 0 || written == -EIOCBQUEUED) &&
@@ -2529,7 +2531,8 @@ EXPORT_SYMBOL(generic_file_buffered_write);
 
 static ssize_t
 __generic_file_aio_write_nolock(struct kiocb *iocb, const struct iovec *iov,
-				unsigned long nr_segs, loff_t *ppos)
+				unsigned long nr_segs, loff_t *ppos,
+				int lock_type)
 {
 	struct file *file = iocb->ki_filp;
 	struct address_space * mapping = file->f_mapping;
@@ -2574,7 +2577,18 @@ __generic_file_aio_write_nolock(struct kiocb *iocb, const struct iovec *iov,
 
 		written = generic_file_direct_write(iocb, iov, &nr_segs, pos,
 							ppos, count, ocount);
-		if (written < 0 || written == count)
+		if (written < 0) {
+			if (lock_type == DIO_LOCKING) {
+				/*
+				 * direct write may have instantiated a few
+				 * blocks outside i_size. Trim these off again.
+				 */
+				if (pos + count > inode->i_size)
+					vmtruncate(inode, inode->i_size);
+			}
+			goto out;
+		}
+		if (written == count)
 			goto out;
 		/*
 		 * direct-io write to a hole: fall through to buffered I/O
@@ -2638,7 +2652,7 @@ ssize_t generic_file_aio_write_nolock(struct kiocb *iocb,
 	BUG_ON(iocb->ki_pos != pos);
 
 	ret = __generic_file_aio_write_nolock(iocb, iov, nr_segs,
-			&iocb->ki_pos);
+					&iocb->ki_pos, DIO_OWN_LOCKING);
 
 	if (ret > 0 && ((file->f_flags & O_SYNC) || IS_SYNC(inode))) {
 		ssize_t err;
@@ -2663,7 +2677,7 @@ ssize_t generic_file_aio_write(struct kiocb *iocb, const struct iovec *iov,
 
 	mutex_lock(&inode->i_mutex);
 	ret = __generic_file_aio_write_nolock(iocb, iov, nr_segs,
-			&iocb->ki_pos);
+					&iocb->ki_pos, DIO_LOCKING);
 	mutex_unlock(&inode->i_mutex);
 
 	if (ret > 0 && ((file->f_flags & O_SYNC) || IS_SYNC(inode))) {
-- 
1.5.4.3


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 2/2] ocfs2: truncate outstanding block after direct io failure.
  2008-10-19 14:03 [PATCH 1/2] fs: truncate blocks outside i_size after generic_file_direct_write error Dmitri Monakhov
@ 2008-10-19 14:03 ` Dmitri Monakhov
  2008-10-20 12:47 ` [PATCH 1/2] fs: truncate blocks outside i_size after generic_file_direct_write error Jeff Moyer
  1 sibling, 0 replies; 4+ messages in thread
From: Dmitri Monakhov @ 2008-10-19 14:03 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, Dmitri Monakhov


Signed-off-by: Dmitri Monakhov <dmonakhov@openvz.org>
---
 fs/ocfs2/file.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
index 8d3225a..0b68b30 100644
--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -1867,6 +1867,13 @@ relock:
 		written = generic_file_direct_write(iocb, iov, &nr_segs, *ppos,
 						    ppos, count, ocount);
 		if (written < 0) {
+			/*
+			 * direct write may have instantiated a few
+			 * blocks outside i_size. Trim these off again.
+			 * Don't need i_size_read because we hold i_mutex.
+			 */
+			if (*ppos + count > inode->i_size)
+				vmtruncate(inode, inode->i_size);
 			ret = written;
 			goto out_dio;
 		}
-- 
1.5.4.3


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH 1/2] fs: truncate blocks outside i_size after generic_file_direct_write error
  2008-10-19 14:03 [PATCH 1/2] fs: truncate blocks outside i_size after generic_file_direct_write error Dmitri Monakhov
  2008-10-19 14:03 ` [PATCH 2/2] ocfs2: truncate outstanding block after direct io failure Dmitri Monakhov
@ 2008-10-20 12:47 ` Jeff Moyer
  2008-10-20 13:43   ` Dmitri Monakhov
  1 sibling, 1 reply; 4+ messages in thread
From: Jeff Moyer @ 2008-10-20 12:47 UTC (permalink / raw)
  To: Dmitri Monakhov; +Cc: linux-kernel, linux-fsdevel

Dmitri Monakhov <dmonakhov@openvz.org> writes:

> We need to remove block that was allocated in generic_file_direct_write()
> if we fail. We have to do it *regardless* to blocksize. At least ext2,
> ext3 and reiserfs interpret (i_size < biggest block) condition as error.
> Fsck will complain about wrong i_size. Then fsck will fix the error
> by changing i_size according to the biggest block. This is bad because
> this blocks contain gurbage from previous write attempt. And result in
> data corruption.
>
> In order to call vmtruncate() we have to hold host->i_mutex. This is true
> for generic_file_aio_write(). In fact occasionally it is also true for all
> generic_file_aio_write_nolock() callers except blockdev. But this situation
> may change someday. This patch fix only generic_write_aio_write() case.
> BTW: update generic_file_direct_write's comment with currently outdated.

Do you have a test case for this, by any chance?

Cheers,

Jeff

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH 1/2] fs: truncate blocks outside i_size after generic_file_direct_write error
  2008-10-20 12:47 ` [PATCH 1/2] fs: truncate blocks outside i_size after generic_file_direct_write error Jeff Moyer
@ 2008-10-20 13:43   ` Dmitri Monakhov
  0 siblings, 0 replies; 4+ messages in thread
From: Dmitri Monakhov @ 2008-10-20 13:43 UTC (permalink / raw)
  To: Jeff Moyer; +Cc: linux-kernel, linux-fsdevel

Jeff Moyer <jmoyer@redhat.com> writes:

> Dmitri Monakhov <dmonakhov@openvz.org> writes:
>
>> We need to remove block that was allocated in generic_file_direct_write()
>> if we fail. We have to do it *regardless* to blocksize. At least ext2,
>> ext3 and reiserfs interpret (i_size < biggest block) condition as error.
>> Fsck will complain about wrong i_size. Then fsck will fix the error
>> by changing i_size according to the biggest block. This is bad because
>> this blocks contain gurbage from previous write attempt. And result in
>> data corruption.
>>
>> In order to call vmtruncate() we have to hold host->i_mutex. This is true
>> for generic_file_aio_write(). In fact occasionally it is also true for all
>> generic_file_aio_write_nolock() callers except blockdev. But this situation
>> may change someday. This patch fix only generic_write_aio_write() case.
>> BTW: update generic_file_direct_write's comment with currently outdated.
>
> Do you have a test case for this, by any chance?
Off courre.
####TESTCASE_BEGIN
$touch /mnt/test/BIG_FILE
## at this moment /mnt/test/BIG_FILE size and blocks equal to zero
open("/mnt/test/BIG_FILE", O_WRONLY|O_CREAT|O_DIRECT, 0666) = 3
write(3, "aaaaaaaaaaaa"..., 104857600) = -1 ENOSPC (No space left on device)
## size and block sould't be changed because write op failed.
$stat /mnt/test/BIG_FILE
File: `/mnt/test/BIG_FILE'
Size: 0 Blocks: 110896 IO Block: 1024 regular empty file
<<<<<<<<^^^^^^^^^^^^^^^^^^^^^^^^^^^^^file size is less than biggest block idx
Device: fe07h/65031d Inode: 14 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2007-01-24 20:03:38.000000000 +0300
Modify: 2007-01-24 20:03:38.000000000 +0300
Change: 2007-01-24 20:03:39.000000000 +0300

#fsck.ext3 -f /dev/VG/test
e2fsck 1.39 (29-May-2006)
Pass 1: Checking inodes, blocks, and sizes
Inode 14, i_size is 0, should be 56556544. Fix<y>? yes
Pass 2: Checking directory structure
....
#####TESTCASE_END

>
> Cheers,
>
> Jeff

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2008-10-20 13:44 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-10-19 14:03 [PATCH 1/2] fs: truncate blocks outside i_size after generic_file_direct_write error Dmitri Monakhov
2008-10-19 14:03 ` [PATCH 2/2] ocfs2: truncate outstanding block after direct io failure Dmitri Monakhov
2008-10-20 12:47 ` [PATCH 1/2] fs: truncate blocks outside i_size after generic_file_direct_write error Jeff Moyer
2008-10-20 13:43   ` Dmitri Monakhov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).