LKML Archive on lore.kernel.org
* [PATCH] fs: fix lost error code in dio_complete
@ 2018-10-30 21:57 Maximilian Heyne
  2018-10-31  5:46 ` Christoph Hellwig
  2018-10-31  9:24 ` Shah, Amit
  0 siblings, 2 replies; 5+ messages in thread
From: Maximilian Heyne @ 2018-10-30 21:57 UTC
  Cc: Christoph Hellwig, Maximilian Heyne, stable, Torsten Mehlan,
	Uwe Dannowski, Amit Shah, David Woodhouse, Alexander Viro,
	linux-fsdevel, linux-kernel

Commit e259221763a40403d5bb232209998e8c45804ab8 ("fs: simplify the
generic_write_sync prototype") reworked the callers of
generic_write_sync() and, in doing so, dropped the error return for the
direct I/O path. Before that commit, an error passed to dio_complete()
was propagated up the stack; after it, such errors were silently
discarded.

This was reported on the list earlier, and a fix was proposed in
https://lore.kernel.org/lkml/20160921141539.GA17898@infradead.org/, but
it was never followed up on.  We recently hit this bug in our testing:
fencing I/O errors, which previously failed with EIO, were reported as
successful operations after that commit.

The fix proposed on the list earlier did not go far enough: it would
still have called generic_write_sync() when `ret` already contained an
error.  This fix ensures that generic_write_sync() is called only when
the write has no pending error.
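
The difference can be illustrated outside the kernel. The following is a
minimal userspace sketch, not the kernel code: `model_write_sync()` is a
hypothetical stand-in for generic_write_sync() that assumes the sync
itself succeeds and simply returns the byte count, and the two
`complete_*` helpers are illustrative names that model only the last
step of dio_complete().

```c
#include <errno.h>
#include <sys/types.h>

/*
 * Hypothetical stand-in for generic_write_sync(). It assumes the sync
 * succeeds and reports the number of bytes written.
 */
ssize_t model_write_sync(ssize_t count)
{
	return count;
}

/*
 * Model of the behaviour before the fix: for writes, the sync result
 * overwrites `ret` unconditionally, so a pending error is lost.
 */
ssize_t complete_before_fix(ssize_t ret, ssize_t transferred, int is_write)
{
	if (is_write)
		ret = model_write_sync(transferred);
	return ret;
}

/*
 * Model of the behaviour after the fix: the sync is attempted only when
 * `ret` holds a positive byte count, so a negative errno passes through.
 */
ssize_t complete_after_fix(ssize_t ret, int is_write)
{
	if (ret > 0 && is_write)
		ret = model_write_sync(ret);
	return ret;
}
```

With a pending -EIO and nothing transferred, the pre-fix model returns 0
(success), while the post-fix model returns -EIO unchanged.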

CC: stable@vger.kernel.org
Reported-by: Ravi Nankani <rnankani@amazon.com>
Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
Signed-off-by: Torsten Mehlan <tomeh@amazon.de>
Signed-off-by: Uwe Dannowski <uwed@amazon.de>
Signed-off-by: Amit Shah <aams@amazon.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 fs/direct-io.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/direct-io.c b/fs/direct-io.c
index 093fb54cd316..199146036093 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -325,8 +325,8 @@ static ssize_t dio_complete(struct dio *dio, ssize_t ret, unsigned int flags)
 		 */
 		dio->iocb->ki_pos += transferred;
 
-		if (dio->op == REQ_OP_WRITE)
-			ret = generic_write_sync(dio->iocb,  transferred);
+		if (ret > 0 && dio->op == REQ_OP_WRITE)
+			ret = generic_write_sync(dio->iocb, ret);
 		dio->iocb->ki_complete(dio->iocb, ret, 0);
 	}
 
-- 
2.16.2

Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Managing directors: Dr. Ralf Herbrich, Christian Schlaeger
VAT ID: DE289237879
Registered at Amtsgericht Charlottenburg under HRB 149173 B



* Re: [PATCH] fs: fix lost error code in dio_complete
  2018-10-30 21:57 [PATCH] fs: fix lost error code in dio_complete Maximilian Heyne
@ 2018-10-31  5:46 ` Christoph Hellwig
  2018-10-31  9:24 ` Shah, Amit
  1 sibling, 0 replies; 5+ messages in thread
From: Christoph Hellwig @ 2018-10-31  5:46 UTC
  To: Maximilian Heyne
  Cc: Christoph Hellwig, stable, Torsten Mehlan, Uwe Dannowski,
	Amit Shah, David Woodhouse, Alexander Viro, linux-fsdevel,
	linux-kernel

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH] fs: fix lost error code in dio_complete
  2018-10-30 21:57 [PATCH] fs: fix lost error code in dio_complete Maximilian Heyne
  2018-10-31  5:46 ` Christoph Hellwig
@ 2018-10-31  9:24 ` Shah, Amit
  2018-11-01  8:03   ` Maximilian Heyne
  1 sibling, 1 reply; 5+ messages in thread
From: Shah, Amit @ 2018-10-31  9:24 UTC
  To: Heyne, Maximilian
  Cc: linux-kernel, Woodhouse, David, hch, stable, viro, linux-fsdevel,
	Dannowski, Uwe, Mehlan, Torsten

On Tue, 2018-10-30 at 21:57 +0000, Maximilian Heyne wrote:
> commit e259221763a40403d5bb232209998e8c45804ab8 ("fs: simplify the
> generic_write_sync prototype") reworked callers of generic_write_sync(),
> and ended up dropping the error return for the directio path. Prior to
> that commit, in dio_complete(), an error would be bubbled up the stack,
> but after that commit, errors passed on to dio_complete were eaten up.
> 
> This was reported on the list earlier, and a fix was proposed in
> https://lore.kernel.org/lkml/20160921141539.GA17898@infradead.org/, but
> never followed up with.  We recently hit this bug in our testing where
> fencing io errors, which were previously erroring out with EIO, were
> being returned as success operations after this commit.
> 
> The fix proposed on the list earlier was a little short -- it would have
> still called generic_write_sync() in case `ret` already contained an
> error.  This fix ensures generic_write_sync() is only called when
> there's no pending error in the write.
> 
> CC: stable@vger.kernel.org
> Reported-by: Ravi Nankani <rnankani@amazon.com>
> Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
> Signed-off-by: Torsten Mehlan <tomeh@amazon.de>
> Signed-off-by: Uwe Dannowski <uwed@amazon.de>
> Signed-off-by: Amit Shah <aams@amazon.de>
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>  fs/direct-io.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/direct-io.c b/fs/direct-io.c
> index 093fb54cd316..199146036093 100644
> --- a/fs/direct-io.c
> +++ b/fs/direct-io.c
> @@ -325,8 +325,8 @@ static ssize_t dio_complete(struct dio *dio, ssize_t ret, unsigned int flags)
>  		 */
>  		dio->iocb->ki_pos += transferred;
>  
> -		if (dio->op == REQ_OP_WRITE)
> -			ret = generic_write_sync(dio->iocb,  transferred);
> +		if (ret > 0 && dio->op == REQ_OP_WRITE)
> +			ret = generic_write_sync(dio->iocb, ret);

Is the s/transferred/ret/ change necessary?  Needs explaining, at least.

>  		dio->iocb->ki_complete(dio->iocb, ret, 0);
>  	}
>  

Thanks,



				Amit

* Re: [PATCH] fs: fix lost error code in dio_complete
  2018-10-31  9:24 ` Shah, Amit
@ 2018-11-01  8:03   ` Maximilian Heyne
  2018-11-01  9:06     ` Shah, Amit
  0 siblings, 1 reply; 5+ messages in thread
From: Maximilian Heyne @ 2018-11-01  8:03 UTC
  To: Shah, Amit
  Cc: linux-kernel, Woodhouse, David, hch, stable, viro, linux-fsdevel,
	Dannowski, Uwe, Mehlan, Torsten

On 10/31/18 10:24 AM, Shah, Amit wrote:
> On Tue, 2018-10-30 at 21:57 +0000, Maximilian Heyne wrote:
>> [...]
>>
>> diff --git a/fs/direct-io.c b/fs/direct-io.c
>> index 093fb54cd316..199146036093 100644
>> --- a/fs/direct-io.c
>> +++ b/fs/direct-io.c
>> @@ -325,8 +325,8 @@ static ssize_t dio_complete(struct dio *dio, ssize_t ret, unsigned int flags)
>>   		 */
>>   		dio->iocb->ki_pos += transferred;
>>   
>> -		if (dio->op == REQ_OP_WRITE)
>> -			ret = generic_write_sync(dio->iocb,  transferred);
>> +		if (ret > 0 && dio->op == REQ_OP_WRITE)
>> +			ret = generic_write_sync(dio->iocb, ret);
> Is the s/transferred/ret/ change necessary?  Needs explaining, at least.

A line earlier in the function sets `ret` to `transferred`, so the
change is a no-op. In my opinion, though, the construct looks cleaner
this way.
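
For illustration, a minimal userspace sketch of that relationship,
assuming the incoming `ret` is either 0 or a negative errno.
`resolve_ret()` and `pending_error` are hypothetical names, not taken
from fs/direct-io.c; the sketch only models the idea that `ret` falls
back to `transferred` when no error is pending.

```c
#include <errno.h>
#include <sys/types.h>

/*
 * Hypothetical model of how the reported value is resolved before the
 * write-sync step. `pending_error` stands for any error recorded during
 * the I/O (0 if none); `transferred` is the number of bytes moved.
 */
ssize_t resolve_ret(ssize_t ret, ssize_t pending_error, ssize_t transferred)
{
	if (ret == 0)
		ret = pending_error;	/* a recorded I/O error wins, if any */
	if (ret == 0)
		ret = transferred;	/* otherwise report the byte count */
	return ret;
}
```

Under that assumption, a positive result can only have come from
`transferred`, so passing `ret` instead of `transferred` to
generic_write_sync() behaves the same inside the `ret > 0` branch.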

>>   		dio->iocb->ki_complete(dio->iocb, ret, 0);
>>   	}
>>   
> Thanks,
>
>
>
> 				Amit




* Re: [PATCH] fs: fix lost error code in dio_complete
  2018-11-01  8:03   ` Maximilian Heyne
@ 2018-11-01  9:06     ` Shah, Amit
  0 siblings, 0 replies; 5+ messages in thread
From: Shah, Amit @ 2018-11-01  9:06 UTC
  To: Heyne, Maximilian
  Cc: linux-kernel, Woodhouse, David, hch, stable, viro, linux-fsdevel,
	Dannowski, Uwe, Mehlan, Torsten


On Thu, 2018-11-01 at 09:03 +0100, Maximilian Heyne wrote:
> On 10/31/18 10:24 AM, Shah, Amit wrote:
> > 
> > On Tue, 2018-10-30 at 21:57 +0000, Maximilian Heyne wrote:
> > > 
> > > [...]
> > > 
> > > diff --git a/fs/direct-io.c b/fs/direct-io.c
> > > index 093fb54cd316..199146036093 100644
> > > --- a/fs/direct-io.c
> > > +++ b/fs/direct-io.c
> > > @@ -325,8 +325,8 @@ static ssize_t dio_complete(struct dio *dio, ssize_t ret, unsigned int flags)
> > >   		 */
> > >   		dio->iocb->ki_pos += transferred;
> > >   
> > > -		if (dio->op == REQ_OP_WRITE)
> > > -			ret = generic_write_sync(dio->iocb,  transferred);
> > > +		if (ret > 0 && dio->op == REQ_OP_WRITE)
> > > +			ret = generic_write_sync(dio->iocb, ret);
> > Is the s/transferred/ret/ change necessary?  Needs explaining, at least.
> In an above code line `ret` is set to `transferred`. So the change is
> a no op. However, in my opinion the construct then looks cleaner.

Yes, it also brings this in line with the other callers, so this is good. Thanks.


