LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Vladislav Bolkhovitin <vst@vlnb.net>
Cc: Bart Van Assche <bart.vanassche@gmail.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>,
	linux-scsi@vger.kernel.org, scst-devel@lists.sourceforge.net,
	linux-kernel@vger.kernel.org
Subject: Re: Integration of SCST in the mainstream Linux kernel
Date: Mon, 04 Feb 2008 12:54:53 -0600	[thread overview]
Message-ID: <1202151293.3096.80.camel@localhost.localdomain> (raw)
In-Reply-To: <47A75B8A.3020503@vlnb.net>

On Mon, 2008-02-04 at 21:38 +0300, Vladislav Bolkhovitin wrote:
> James Bottomley wrote:
> > On Mon, 2008-02-04 at 20:56 +0300, Vladislav Bolkhovitin wrote:
> > 
> >>James Bottomley wrote:
> >>
> >>>On Mon, 2008-02-04 at 20:16 +0300, Vladislav Bolkhovitin wrote:
> >>>
> >>>
> >>>>James Bottomley wrote:
> >>>>
> >>>>
> >>>>>>>>So, James, what is your opinion on the above? Or the overall SCSI target 
> >>>>>>>>project simplicity doesn't matter much for you and you think it's fine 
> >>>>>>>>to duplicate Linux page cache in the user space to keep the in-kernel 
> >>>>>>>>part of the project as small as possible?
> >>>>>>>
> >>>>>>>
> >>>>>>>The answers were pretty much contained here
> >>>>>>>
> >>>>>>>http://marc.info/?l=linux-scsi&m=120164008302435
> >>>>>>>
> >>>>>>>and here:
> >>>>>>>
> >>>>>>>http://marc.info/?l=linux-scsi&m=120171067107293
> >>>>>>>
> >>>>>>>Weren't they?
> >>>>>>
> >>>>>>No, sorry, it doesn't look so for me. They are about performance, but 
> >>>>>>I'm asking about the overall project's architecture, namely about one 
> >>>>>>part of it: simplicity. Particularly, what do you think about 
> >>>>>>duplicating Linux page cache in the user space to have zero-copy cached 
> >>>>>>I/O? Or can you suggest another architectural solution for that problem 
> >>>>>>in the STGT's approach?
> >>>>>
> >>>>>
> >>>>>Isn't that an advantage of a user space solution?  It simply uses the
> >>>>>backing store of whatever device supplies the data.  That means it takes
> >>>>>advantage of the existing mechanisms for caching.
> >>>>
> >>>>No, please reread this thread, especially this message: 
> >>>>http://marc.info/?l=linux-kernel&m=120169189504361&w=2. This is one of 
> >>>>the advantages of the kernel space implementation. The user space 
> >>>>implementation has to have data copied between the cache and user space 
> >>>>buffer, but the kernel space one can use pages in the cache directly, 
> >>>>without extra copy.
> >>>
> >>>
> >>>Well, you've said it thrice (the bellman cried) but that doesn't make it
> >>>true.
> >>>
> >>>The way a user space solution should work is to schedule mmapped I/O
> >>>from the backing store and then send this mmapped region off for target
> >>>I/O.  For reads, the page gather will ensure that the pages are up to
> >>>date from the backing store to the cache before sending the I/O out.
> >>>For writes, You actually have to do a msync on the region to get the
> >>>data secured to the backing store. 
> >>
> >>James, have you checked how fast is mmaped I/O if work size > size of 
> >>RAM? It's several times slower comparing to buffered I/O. It was many 
> >>times discussed in LKML and, seems, VM people consider it unavoidable. 
> > 
> > 
> > Erm, but if you're using the case of work size > size of RAM, you'll
> > find buffered I/O won't help because you don't have the memory for
> > buffers either.
> 
> James, just check and you will see, buffered I/O is a lot faster.

So in an out of memory situation the buffers you don't have are a lot
faster than the pages I don't have?

> >>So, using mmaped IO isn't an option for high performance. Plus, mmaped 
> >>IO isn't an option for high reliability requirements, since it doesn't 
> >>provide a practical way to handle I/O errors.
> > 
> > I think you'll find it does ... the page gather returns -EFAULT if
> > there's an I/O error in the gathered region. 
> 
> Err, to whom return? If you try to read from a mmaped page, which can't 
> be populated due to I/O error, you will get SIGBUS or SIGSEGV, I don't 
> remember exactly. It's quite tricky to get back to the faulted command 
> from the signal handler.
> 
> Or do you mean mmap(MAP_POPULATE)/munmap() for each command? Do you 
> think that such mapping/unmapping is good for performance?
> 
> > msync does something
> > similar if there's a write failure.
> > 
> >>>You also have to pull tricks with
> >>>the mmap region in the case of writes to prevent useless data being read
> >>>in from the backing store.
> >>
> >>Can you be more exact and specify what kind of tricks should be done for 
> >>that?
> > 
> > Actually, just avoid touching it seems to do the trick with a recent
> > kernel.
> 
> Hmm, how can one write to an mmaped page and don't touch it?

I meant from user space ... the writes are done inside the kernel.

However, as Linus has pointed out, this discussion is getting a bit off
topic.  There's no actual evidence that copy problems are causing any
performatince issues issues for STGT.  In fact, there's evidence that
they're not for everything except IB networks.

James



  reply	other threads:[~2008-02-04 18:55 UTC|newest]

Thread overview: 147+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-01-23 14:22 Bart Van Assche
2008-01-23 17:11 ` Vladislav Bolkhovitin
2008-01-29 20:42 ` James Bottomley
2008-01-29 21:31   ` Roland Dreier
2008-01-29 23:32     ` FUJITA Tomonori
2008-01-30  1:15       ` [Scst-devel] " Vu Pham
2008-01-30  8:38       ` Bart Van Assche
2008-01-30 10:56         ` FUJITA Tomonori
2008-01-30 11:40           ` Vladislav Bolkhovitin
2008-01-30 13:10           ` Bart Van Assche
2008-01-30 13:54             ` FUJITA Tomonori
2008-01-31  7:48               ` Bart Van Assche
2008-01-31 13:25           ` Nicholas A. Bellinger
2008-01-31 14:34             ` Bart Van Assche
2008-01-31 14:44               ` Nicholas A. Bellinger
2008-01-31 15:50               ` Vladislav Bolkhovitin
2008-01-31 16:25                 ` [Scst-devel] " Joe Landman
2008-01-31 17:08                   ` Bart Van Assche
2008-01-31 17:13                     ` Joe Landman
2008-01-31 18:12                     ` David Dillow
2008-02-01 11:50                       ` Vladislav Bolkhovitin
2008-02-01 11:50                     ` Vladislav Bolkhovitin
2008-02-01 12:25                       ` Vladislav Bolkhovitin
2008-01-31 17:14                 ` Nicholas A. Bellinger
2008-01-31 17:40                   ` Bart Van Assche
2008-01-31 18:15                     ` Nicholas A. Bellinger
2008-02-01  9:08                       ` Bart Van Assche
2008-02-01  8:11             ` Bart Van Assche
2008-02-01 10:39               ` Nicholas A. Bellinger
2008-02-01 11:04                 ` Bart Van Assche
2008-02-01 12:05                   ` Nicholas A. Bellinger
2008-02-01 13:25                     ` Bart Van Assche
2008-02-01 14:36                       ` Nicholas A. Bellinger
2008-01-30 16:34         ` James Bottomley
2008-01-30 16:50           ` Bart Van Assche
2008-02-02 15:32           ` Pete Wyckoff
2008-02-05 17:01         ` Erez Zilber
2008-02-06 12:16           ` Bart Van Assche
2008-02-06 16:45             ` Benny Halevy
2008-02-06 17:06             ` Roland Dreier
2008-02-18  9:43             ` Erez Zilber
2008-02-18 11:01               ` Bart Van Assche
2008-02-20  7:34                 ` Erez Zilber
2008-02-20  8:41                   ` Bart Van Assche
2008-01-30 11:18       ` Vladislav Bolkhovitin
2008-01-30  8:29   ` Bart Van Assche
2008-01-30 16:22     ` James Bottomley
2008-01-30 17:03       ` Bart Van Assche
2008-02-05  7:14       ` [Scst-devel] " Tomasz Chmielewski
2008-02-05 13:38         ` FUJITA Tomonori
2008-02-05 16:07           ` Tomasz Chmielewski
2008-02-05 16:21             ` Ming Zhang
2008-02-05 16:43             ` FUJITA Tomonori
2008-02-05 17:09           ` Matteo Tescione
2008-02-06  1:29             ` FUJITA Tomonori
2008-02-06  2:01               ` Nicholas A. Bellinger
2008-01-30 11:17   ` Vladislav Bolkhovitin
2008-02-04 12:27     ` Vladislav Bolkhovitin
2008-02-04 13:53       ` Bart Van Assche
2008-02-04 17:00         ` David Dillow
2008-02-04 17:08         ` Vladislav Bolkhovitin
2008-02-05 16:25         ` Bart Van Assche
2008-02-05 18:18           ` Linus Torvalds
2008-02-04 15:30       ` James Bottomley
2008-02-04 16:25         ` Vladislav Bolkhovitin
2008-02-04 17:06           ` James Bottomley
2008-02-04 17:16             ` Vladislav Bolkhovitin
2008-02-04 17:25               ` James Bottomley
2008-02-04 17:56                 ` Vladislav Bolkhovitin
2008-02-04 18:22                   ` James Bottomley
2008-02-04 18:38                     ` Vladislav Bolkhovitin
2008-02-04 18:54                       ` James Bottomley [this message]
2008-02-05 18:59                         ` Vladislav Bolkhovitin
2008-02-05 19:13                           ` James Bottomley
2008-02-06 18:07                             ` Vladislav Bolkhovitin
2008-02-07 13:13                             ` [Scst-devel] " Bart Van Assche
2008-02-07 13:45                               ` Vladislav Bolkhovitin
2008-02-07 22:51                                 ` david
2008-02-08 10:37                                   ` Vladislav Bolkhovitin
2008-02-09  7:40                                     ` david
2008-02-08 11:33                                   ` Nicholas A. Bellinger
2008-02-08 14:36                                     ` Vladislav Bolkhovitin
2008-02-08 23:53                                       ` Nicholas A. Bellinger
2008-02-15 15:02                                 ` Bart Van Assche
2008-02-07 15:38                               ` [Scst-devel] " Nicholas A. Bellinger
2008-02-07 20:37                                 ` Luben Tuikov
2008-02-08 10:32                                   ` Vladislav Bolkhovitin
2008-02-09  7:32                                     ` Luben Tuikov
2008-02-11 10:02                                       ` Vladislav Bolkhovitin
2008-02-08 11:53                                   ` [Scst-devel] " Nicholas A. Bellinger
2008-02-08 14:42                                     ` Vladislav Bolkhovitin
2008-02-09  0:00                                       ` Nicholas A. Bellinger
2008-02-04 18:29                 ` Linus Torvalds
2008-02-04 18:49                   ` James Bottomley
2008-02-04 19:06                   ` Nicholas A. Bellinger
2008-02-04 19:19                     ` Nicholas A. Bellinger
2008-02-04 19:44                     ` Linus Torvalds
2008-02-04 20:06                       ` [Scst-devel] " 4news
2008-02-04 20:24                       ` Nicholas A. Bellinger
2008-02-04 21:01                       ` J. Bruce Fields
2008-02-04 21:24                         ` Linus Torvalds
2008-02-04 22:00                           ` Nicholas A. Bellinger
2008-02-04 22:57                           ` Jeff Garzik
2008-02-04 23:45                             ` Linus Torvalds
2008-02-05  0:08                               ` Jeff Garzik
2008-02-05  1:20                                 ` Linus Torvalds
2008-02-05  8:38                             ` Bart Van Assche
2008-02-05 17:50                               ` Jeff Garzik
2008-02-06 10:22                                 ` Bart Van Assche
2008-02-06 14:21                                   ` Jeff Garzik
2008-02-05 13:05                             ` Olivier Galibert
2008-02-05 18:08                               ` Jeff Garzik
2008-02-05 19:01                           ` Vladislav Bolkhovitin
2008-02-04 22:43                       ` Alan Cox
2008-02-04 17:30                         ` Douglas Gilbert
2008-02-05  2:07                           ` [Scst-devel] " Chris Weiss
2008-02-05 14:19                             ` FUJITA Tomonori
2008-02-04 22:59                         ` Nicholas A. Bellinger
2008-02-04 23:00                         ` James Bottomley
2008-02-04 23:12                           ` Nicholas A. Bellinger
2008-02-04 23:16                             ` Nicholas A. Bellinger
2008-02-05 18:37                             ` James Bottomley
2008-02-04 23:04                         ` Jeff Garzik
2008-02-04 23:27                           ` Linus Torvalds
2008-02-05 19:01                           ` Vladislav Bolkhovitin
2008-02-05 19:12                             ` Jeff Garzik
2008-02-05 19:21                               ` Vladislav Bolkhovitin
2008-02-06  0:11                                 ` Nicholas A. Bellinger
2008-02-06  1:43                                   ` Nicholas A. Bellinger
2008-02-12 16:05                                   ` [Scst-devel] " Bart Van Assche
2008-02-13  3:44                                     ` Nicholas A. Bellinger
2008-02-13  6:18                                       ` CONFIG_SLUB and reproducable general protection faults on 2.6.2x Nicholas A. Bellinger
2008-02-13 16:37                                         ` Nicholas A. Bellinger
2008-02-06  0:17                               ` Integration of SCST in the mainstream Linux kernel Nicholas A. Bellinger
2008-02-06  0:48                             ` Nicholas A. Bellinger
2008-02-06  0:51                               ` Nicholas A. Bellinger
2008-02-05  0:07                         ` Matt Mackall
2008-02-05  0:24                           ` Linus Torvalds
2008-02-05  0:42                             ` Jeff Garzik
2008-02-05  0:45                             ` Matt Mackall
2008-02-05  4:43                             ` [Scst-devel] " Matteo Tescione
2008-02-05  5:07                               ` James Bottomley
2008-02-05 13:38                               ` FUJITA Tomonori
2008-02-05 19:00                       ` Vladislav Bolkhovitin
2008-02-05 17:10 ` Erez Zilber
2008-02-05 19:02   ` Bart Van Assche
2008-02-05 19:02   ` Vladislav Bolkhovitin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1202151293.3096.80.camel@localhost.localdomain \
    --to=james.bottomley@hansenpartnership.com \
    --cc=akpm@linux-foundation.org \
    --cc=bart.vanassche@gmail.com \
    --cc=fujita.tomonori@lab.ntt.co.jp \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=scst-devel@lists.sourceforge.net \
    --cc=torvalds@linux-foundation.org \
    --cc=vst@vlnb.net \
    --subject='Re: Integration of SCST in the mainstream Linux kernel' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).