Linux-Fsdevel Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Andreas Dilger <adilger@dilger.ca>,
	linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	"Darrick J . Wong" <darrick.wong@oracle.com>,
	linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 9/9] iomap: Change calling convention for zeroing
Date: Tue, 25 Aug 2020 13:40:24 +0100	[thread overview]
Message-ID: <20200825124024.GN17456@casper.infradead.org> (raw)
In-Reply-To: <20200825042711.GL12131@dread.disaster.area>

On Tue, Aug 25, 2020 at 02:27:11PM +1000, Dave Chinner wrote:
> On Mon, Aug 24, 2020 at 09:35:59PM -0600, Andreas Dilger wrote:
> > On Aug 24, 2020, at 9:26 PM, Matthew Wilcox <willy@infradead.org> wrote:
> > > 
> > > On Tue, Aug 25, 2020 at 10:27:35AM +1000, Dave Chinner wrote:
> > >>> 	do {
> > >>> -		unsigned offset, bytes;
> > >>> -
> > >>> -		offset = offset_in_page(pos);
> > >>> -		bytes = min_t(loff_t, PAGE_SIZE - offset, count);
> > >>> +		loff_t bytes;
> > >>> 
> > >>> 		if (IS_DAX(inode))
> > >>> -			status = dax_iomap_zero(pos, offset, bytes, iomap);
> > >>> +			bytes = dax_iomap_zero(pos, length, iomap);
> > >> 
> > >> Hmmm. everything is loff_t here, but the callers are defining length
> > >> as u64, not loff_t. Is there a potential sign conversion problem
> > >> here? (sure 64 bit is way beyond anything we'll pass here, but...)
> > > 
> > > I've gone back and forth on the correct type for 'length' a few times.
> > > size_t is too small (not for zeroing, but for seek()).  An unsigned type
> > > seems right -- a length can't be negative, and we don't want to give
> > > the impression that it can.  But the return value from these functions
> > > definitely needs to be signed so we can represent an error.  So a u64
> > > length with an loff_t return type feels like the best solution.  And
> > > the upper layers have to promise not to pass in a length that's more
> > > than 2^63-1.
> > 
> > The problem with allowing a u64 as the length is that it leads to the
> > possibility of an argument value that cannot be returned.  Checking
> > length < 0 is not worse than checking length > 0x7ffffffffffffff,
> > and has the benefit of consistency with the other argument types and
> > signs...

The callee should just trust that the caller isn't going to do that.
File sizes can't be more than 2^63-1 bytes, so an extent of a file also
can't be more than 2^63-1 bytes.

> I think the problem here is that we have no guaranteed 64 bit size
> type. when that was the case with off_t, we created loff_t to always
> represent a 64 bit offset value. However, we never created one for
> the count/size that is passed alongside loff_t in many places - it
> was said that "syscalls are limited to 32 bit sizes" and
> "size_t is 64 bit on 64 bit platforms" and so on and so we still
> don't have a clean way to pass 64 bit sizes through the IO path.
> 
> We've been living with this shitty situation for a long time now, so
> perhaps it's time for us to define lsize_t for 64 bit lengths and
> start using that everywhere that needs a 64 bit clean path
> through the code, regardless of whether the arch is 32 or 64 bit...
> 
> Thoughts?

I don't think the THP patches should be blocked on this expedition.

We have a guaranteed 64 bit type -- it's u64.  I don't think defining
lsize_t is going to fix anything.  The next big problem to fix will be
supporting storage >16EiB, but I think that's a project that can start
in ten years and still be here before anyone but the TLAs have that much
storage in a single device.

Any objection to leaving this patch as-is with a u64 length?

  reply	other threads:[~2020-08-25 12:40 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-24 14:55 [PATCH 0/9] THP iomap patches for 5.10 Matthew Wilcox (Oracle)
2020-08-24 14:55 ` [PATCH 1/9] iomap: Fix misplaced page flushing Matthew Wilcox (Oracle)
2020-08-24 23:51   ` Dave Chinner
2020-08-25 20:47   ` Darrick J. Wong
2020-08-27  8:24   ` Christoph Hellwig
2020-08-24 14:55 ` [PATCH 2/9] fs: Introduce i_blocks_per_page Matthew Wilcox (Oracle)
2020-08-24 23:55   ` Dave Chinner
2020-08-25 20:49   ` Darrick J. Wong
2020-08-25 22:21     ` Dave Chinner
2020-08-24 14:55 ` [PATCH 3/9] iomap: Use kzalloc to allocate iomap_page Matthew Wilcox (Oracle)
2020-08-24 23:56   ` Dave Chinner
2020-08-25 20:49   ` Darrick J. Wong
2020-08-24 14:55 ` [PATCH 4/9] iomap: Use bitmap ops to set uptodate bits Matthew Wilcox (Oracle)
2020-08-24 23:56   ` Dave Chinner
2020-08-25 20:50   ` Darrick J. Wong
2020-08-24 14:55 ` [PATCH 5/9] iomap: Support arbitrarily many blocks per page Matthew Wilcox (Oracle)
2020-08-24 23:59   ` Dave Chinner
2020-08-25  0:22     ` Matthew Wilcox
2020-08-25 21:02   ` Darrick J. Wong
2020-08-26  2:26     ` Matthew Wilcox
2020-08-26  3:32       ` Darrick J. Wong
2020-08-27  8:26   ` Christoph Hellwig
2020-08-24 14:55 ` [PATCH 6/9] iomap: Convert read_count to byte count Matthew Wilcox (Oracle)
2020-08-25  0:09   ` Dave Chinner
2020-08-25 22:14     ` Darrick J. Wong
2020-08-27  8:35   ` Christoph Hellwig
2020-08-24 14:55 ` [PATCH 7/9] iomap: Convert write_count " Matthew Wilcox (Oracle)
2020-08-27  8:36   ` Christoph Hellwig
2020-08-24 14:55 ` [PATCH 8/9] iomap: Convert iomap_write_end types Matthew Wilcox (Oracle)
2020-08-25  0:12   ` Dave Chinner
2020-08-25  1:06     ` Matthew Wilcox
2020-08-25  1:33       ` Dave Chinner
2020-08-27  8:41   ` Christoph Hellwig
2020-08-24 14:55 ` [PATCH 9/9] iomap: Change calling convention for zeroing Matthew Wilcox (Oracle)
2020-08-25  0:27   ` Dave Chinner
2020-08-25  3:26     ` Matthew Wilcox
2020-08-25  3:35       ` Andreas Dilger
2020-08-25  4:27         ` Dave Chinner
2020-08-25 12:40           ` Matthew Wilcox [this message]
2020-08-25 22:05             ` Dave Chinner
2020-08-25 22:23   ` Darrick J. Wong
2020-08-27  8:39     ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200825124024.GN17456@casper.infradead.org \
    --to=willy@infradead.org \
    --cc=adilger@dilger.ca \
    --cc=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=linux-xfs@vger.kernel.org \
    --subject='Re: [PATCH 9/9] iomap: Change calling convention for zeroing' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).