LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Gao Xiang <hsiangkao@linux.alibaba.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>,
	linux-erofs@lists.ozlabs.org, linux-fsdevel@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>,
	Christoph Hellwig <hch@lst.de>,
	Andreas Gruenbacher <andreas.gruenbacher@gmail.com>
Subject: Re: [PATCH v4] iomap: support tail packing inline read
Date: Wed, 21 Jul 2021 08:33:12 +0800	[thread overview]
Message-ID: <YPdrSN6Vso98bLzB@B-P7TQMD6M-0146.local> (raw)
In-Reply-To: <20210721001720.GS22357@magnolia>

On Tue, Jul 20, 2021 at 05:17:20PM -0700, Darrick J. Wong wrote:
> On Wed, Jul 21, 2021 at 08:03:44AM +0800, Gao Xiang wrote:
> > On Tue, Jul 20, 2021 at 10:18:54PM +0100, Matthew Wilcox wrote:
> > > On Tue, Jul 20, 2021 at 01:42:24PM -0700, Darrick J. Wong wrote:
> > > > > -	BUG_ON(page_has_private(page));
> > > > > -	BUG_ON(page->index);
> > > > > -	BUG_ON(size > PAGE_SIZE - offset_in_page(iomap->inline_data));
> > > > > +	/* inline source data must be inside a single page */
> > > > > +	BUG_ON(iomap->length > PAGE_SIZE - offset_in_page(iomap->inline_data));
> > > > 
> > > > Can we reduce the strength of these checks to a warning and an -EIO
> > > > return?
> > > 
> > > I'm not entirely sure that we need this check, tbh.
> > 
> > I'm fine to get rid of this check, it just inherited from:
> >  - BUG_ON(size > PAGE_SIZE - offset_in_page(iomap->inline_data));
> > 
> > It has no real effect, but when reading INLINE extent, its .iomap_begin()
> > does:
> > 	iomap->private = erofs_get_meta_page()	/* get meta page */
> > 
> > and in the .iomap_end(), it does:
> > 	struct page *ipage = iomap->private;
> > 	if (ipage) {
> > 		unlock_page(ipage);
> > 		put_page(ipage);
> > 	}
> > 
> > > 
> > > > > +	/* handle tail-packing blocks cross the current page into the next */
> > > > > +	size = min_t(unsigned int, iomap->length + pos - iomap->offset,
> > > > > +		     PAGE_SIZE - poff);
> > > > >  
> > > > >  	addr = kmap_atomic(page);
> > > > > -	memcpy(addr, iomap->inline_data, size);
> > > > > -	memset(addr + size, 0, PAGE_SIZE - size);
> > > > > +	memcpy(addr + poff, iomap->inline_data - iomap->offset + pos, size);
> > > > > +	memset(addr + poff + size, 0, PAGE_SIZE - poff - size);
> > > > 
> > > > Hmm, so I guess the point of this is to support reading data from a
> > > > tail-packing block, where each file gets some arbitrary byte range
> > > > within the tp-block, and the range isn't aligned to an fs block?  Hence
> > > > you have to use the inline data code to read the relevant bytes and copy
> > > > them into the pagecache?
> > > 
> > > I think there are two distinct cases for IOMAP_INLINE.  One is
> > > where the tail of the file is literally embedded into the inode.
> > > Like ext4 fast symbolic links.  Taking the ext4 i_blocks layout
> > > as an example, you could have a 4kB block stored in i_block[0]
> > > and then store bytes 4096-4151 in i_block[1-14] (although reading
> > > https://www.kernel.org/doc/html/latest/filesystems/ext4/dynamic.html
> > > makes me think that ext4 only supports storing 0-59 in the i_blocks;
> > > it doesn't support 0-4095 in i_block[0] and then 4096-4151 in i_blocks)
> > > 
> > > The other is what I think erofs is doing where, for example, you'd
> > > specify in i_block[1] the block which contains the tail and then in
> > > i_block[2] what offset of the block the tail starts at.
> > 
> > Nope, EROFS inline data is embedded into the inode in order to save
> > I/O as well as space (maybe I didn't express clear before [1]). 
> > 
> > I understand the other one, but it can only save storage space but
> > cannot save I/O (we still need another independent I/O to read its
> > meta buffered page).
> > 
> > In the view of INLINE extent itself, I think both ways can be
> > supported with this approach.
> 
> OH, I see, so you need the multi-page inline data support because the
> ondisk layout is something like this:
> 
> +----------- page one ---------+----------- page two...
> V                              V
> +-------+-----------------------------+---------
> | inode |   inline data               | inode...
> +-------+-----------------------------+---------
> 
> And since you can only kmap one page at a time, an inline read grabs the
> first part of the data in "page one" and then we have to call
> iomap_begin a second time get a new address so that we can read the rest
> from "page two"?

Nope, currently EROFS inline data won't cross page like this.

But in principle, yes, I don't want to limit it to the current
EROFS or gfs2 usage. I think we could make this iomap function
more generally (I mean, I'd like to make the INLINE extent
functionity as general as possible, my v1 original approach
in principle can support any inline extent in the middle of
file rather than just tail blocks, but zeroing out post-EOF
needs another iteration) and I don't see it add more code and
complexity.

Thanks,
Gao Xiang

> 
> --D
> 
> > 
> > [1] https://www.kernel.org/doc/html/latest/filesystems/erofs.html
> >     "On-disk details" section.
> > 
> > Thanks,
> > Gao Xiang

  reply	other threads:[~2021-07-21  0:33 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-20 13:35 Gao Xiang
2021-07-20 20:42 ` Darrick J. Wong
2021-07-20 21:18   ` Matthew Wilcox
2021-07-21  0:03     ` Gao Xiang
2021-07-21  0:17       ` Darrick J. Wong
2021-07-21  0:33         ` Gao Xiang [this message]
2021-07-21  2:26           ` Andreas Grünbacher
2021-07-21  2:53             ` Gao Xiang
2021-07-21  6:43               ` Andreas Grünbacher
2021-07-21  7:28                 ` Gao Xiang
2021-07-21  6:33     ` Andreas Grünbacher
2021-07-20 23:46   ` Gao Xiang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YPdrSN6Vso98bLzB@B-P7TQMD6M-0146.local \
    --to=hsiangkao@linux.alibaba.com \
    --cc=andreas.gruenbacher@gmail.com \
    --cc=djwong@kernel.org \
    --cc=hch@lst.de \
    --cc=linux-erofs@lists.ozlabs.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=willy@infradead.org \
    --subject='Re: [PATCH v4] iomap: support tail packing inline read' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).