LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
To: Vladislav Bolkhovitin <vst@vlnb.net>
Cc: Jeff Garzik <jeff@garzik.org>,
	Alan Cox <alan@lxorguk.ukuu.org.uk>,
	Mike Christie <michaelc@cs.wisc.edu>,
	linux-scsi@vger.kernel.org,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	James Bottomley <James.Bottomley@HansenPartnership.com>,
	scst-devel@lists.sourceforge.net,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>,
	Julian Satran <Julian_Satran@il.ibm.com>
Subject: Re: Integration of SCST in the mainstream Linux kernel
Date: Tue, 05 Feb 2008 16:11:07 -0800	[thread overview]
Message-ID: <1202256667.2220.83.camel@haakon2.linux-iscsi.org> (raw)
In-Reply-To: <47A8B757.10101@vlnb.net>

On Tue, 2008-02-05 at 22:21 +0300, Vladislav Bolkhovitin wrote:
> Jeff Garzik wrote:
> >>> iSCSI is way, way too complicated. 
> >>
> >> I fully agree. From one side, all that complexity is unavoidable for 
> >> case of multiple connections per session, but for the regular case of 
> >> one connection per session it must be a lot simpler.
> > 
> > Actually, think about those multiple connections...  we already had to 
> > implement fast-failover (and load bal) SCSI multi-pathing at a higher 
> > level.  IMO that portion of the protocol is redundant:   You need the 
> > same capability elsewhere in the OS _anyway_, if you are to support 
> > multi-pathing.
> 
> I'm thinking about MC/S as about a way to improve performance using 
> several physical links. There's no other way, except MC/S, to keep 
> commands processing order in that case. So, it's really valuable 
> property of iSCSI, although with a limited application.
> 
> Vlad
> 

Greetings,

I have always observed the case with LIO SE/iSCSI target mode (as well
as with other software initiators we can leave out of the discussion for
now, and congrats to the open/iscsi on folks recent release. :-) that
execution core hardware thread and inter-nexus per 1 Gb/sec ethernet
port performance scales up to 4x and 2x core x86_64 very well with
MC/S).  I have been seeing 450 MB/sec using 2x socket 4x core x86_64 for
a number of years with MC/S.  Using MC/S on 10 Gb/sec (on PCI-X v2.0
266mhz as well, which was the first transport that LIO Target ran on
that was able to reach handle duplex ~1200 MB/sec with 3 initiators and
MC/S.  In the point to point 10 GB/sec tests on IBM p404 machines, the
initiators where able to reach ~910 MB/sec with MC/S.  Open/iSCSI was
able to go a bit faster (~950 MB/sec) because it uses struct sk_buff
directly. 

A good rule to keep in mind here while considering performance is that
context switching overhead and pipeline <-> bus stalling (along with
other legacy OS specific storage stack limitations with BLOCK and VFS
with O_DIRECT, et al and I will leave out of the discussion for iSCSI
and SE engine target mode) is that a initiator will scale roughly 1/2 as
well as a target, given comparable hardware and virsh output.  The
software target case target case also depends, in great regard in many
cases, if we are talking about something something as simple as doing
contiguous DMA memory allocations in from a SINGLE kernel thread, and
handling direction execution to a storage hardware DMA ring that may
have not been allocated in the current kernel thread.  In MC/S mode this
breaks down to:

1) Sorting logic that handles pre execution statemachine for transport
from local RDMA memory and OS specific data buffers.   TCP application
data buffer, struct sk_buff, or RDMA struct page or SG.  This should be
generic between iSCSI and iSER.

2) Allocation of said memory buffers to OS subsystem dependent code that
can be queued up to these drivers.  It breaks down to what you can get
drivers and OS subsystem folks to agree to implement, and can be made
generic in a Transport / BLOCK / VFS layered storage stack.  In the
"allocate thread DMA ring and use OS supported software and vendor
available hardware" I don't think the kernel space requirement will
every completely be able to go away.

Without diving into RFC-3720 specifics, the statemachine for MC/S side
for memory allocation, login and logout generic to iSCSi and ISER, and
ERL=2 recovery.  My plan is to post the locations in the LIO code where
this has been implemented, and where we where can make this easier, etc.
In the early in the development of what eventually became LIO Target
code, ERL was broken into separete files and separete function
prefixes. 

iscsi_target_erl0, iscsi_target_erl1 and iscsi_target_erl2.

The statemachine for ERL=0 and ERL=2 is pretty simple in RFC-3720 (have
a look for those interested in the discussion)

7.1.1.  State Descriptions for Initiators and Targets

The LIO target code is also pretty simple for this:

[root@ps3-cell target]# wc -l iscsi_target_erl*
  1115 iscsi_target_erl0.c
    45 iscsi_target_erl0.h
   526 iscsi_target_erl0.o
  1426 iscsi_target_erl1.c
    51 iscsi_target_erl1.h
  1253 iscsi_target_erl1.o
   605 iscsi_target_erl2.c
    45 iscsi_target_erl2.h
   447 iscsi_target_erl2.o
  5513 total

erl1.c is a bit larger than the others because it contains the MC/S
statemachine functions. iscsi_target_erl1.c:iscsi_execute_cmd() and
iscsi_target_util.c:iscsi_check_received_cmdsn() do most of the work for
LIO MC/S state machine.  I would  probably benefit from being in broken
up into say iscsi_target_mcs.c.  Note that all of this code is MC/S
safe, with the exception of the specific SCSI TMR functions.  For the
SCSI TMR pieces, I have always hoped to use SCST code for doing this...

Most of the login/logout code is done in iscsi_target.c, which is could
probably also benefit fot getting broken out...

--nab



  reply	other threads:[~2008-02-06  0:11 UTC|newest]

Thread overview: 147+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-01-23 14:22 Bart Van Assche
2008-01-23 17:11 ` Vladislav Bolkhovitin
2008-01-29 20:42 ` James Bottomley
2008-01-29 21:31   ` Roland Dreier
2008-01-29 23:32     ` FUJITA Tomonori
2008-01-30  1:15       ` [Scst-devel] " Vu Pham
2008-01-30  8:38       ` Bart Van Assche
2008-01-30 10:56         ` FUJITA Tomonori
2008-01-30 11:40           ` Vladislav Bolkhovitin
2008-01-30 13:10           ` Bart Van Assche
2008-01-30 13:54             ` FUJITA Tomonori
2008-01-31  7:48               ` Bart Van Assche
2008-01-31 13:25           ` Nicholas A. Bellinger
2008-01-31 14:34             ` Bart Van Assche
2008-01-31 14:44               ` Nicholas A. Bellinger
2008-01-31 15:50               ` Vladislav Bolkhovitin
2008-01-31 16:25                 ` [Scst-devel] " Joe Landman
2008-01-31 17:08                   ` Bart Van Assche
2008-01-31 17:13                     ` Joe Landman
2008-01-31 18:12                     ` David Dillow
2008-02-01 11:50                       ` Vladislav Bolkhovitin
2008-02-01 11:50                     ` Vladislav Bolkhovitin
2008-02-01 12:25                       ` Vladislav Bolkhovitin
2008-01-31 17:14                 ` Nicholas A. Bellinger
2008-01-31 17:40                   ` Bart Van Assche
2008-01-31 18:15                     ` Nicholas A. Bellinger
2008-02-01  9:08                       ` Bart Van Assche
2008-02-01  8:11             ` Bart Van Assche
2008-02-01 10:39               ` Nicholas A. Bellinger
2008-02-01 11:04                 ` Bart Van Assche
2008-02-01 12:05                   ` Nicholas A. Bellinger
2008-02-01 13:25                     ` Bart Van Assche
2008-02-01 14:36                       ` Nicholas A. Bellinger
2008-01-30 16:34         ` James Bottomley
2008-01-30 16:50           ` Bart Van Assche
2008-02-02 15:32           ` Pete Wyckoff
2008-02-05 17:01         ` Erez Zilber
2008-02-06 12:16           ` Bart Van Assche
2008-02-06 16:45             ` Benny Halevy
2008-02-06 17:06             ` Roland Dreier
2008-02-18  9:43             ` Erez Zilber
2008-02-18 11:01               ` Bart Van Assche
2008-02-20  7:34                 ` Erez Zilber
2008-02-20  8:41                   ` Bart Van Assche
2008-01-30 11:18       ` Vladislav Bolkhovitin
2008-01-30  8:29   ` Bart Van Assche
2008-01-30 16:22     ` James Bottomley
2008-01-30 17:03       ` Bart Van Assche
2008-02-05  7:14       ` [Scst-devel] " Tomasz Chmielewski
2008-02-05 13:38         ` FUJITA Tomonori
2008-02-05 16:07           ` Tomasz Chmielewski
2008-02-05 16:21             ` Ming Zhang
2008-02-05 16:43             ` FUJITA Tomonori
2008-02-05 17:09           ` Matteo Tescione
2008-02-06  1:29             ` FUJITA Tomonori
2008-02-06  2:01               ` Nicholas A. Bellinger
2008-01-30 11:17   ` Vladislav Bolkhovitin
2008-02-04 12:27     ` Vladislav Bolkhovitin
2008-02-04 13:53       ` Bart Van Assche
2008-02-04 17:00         ` David Dillow
2008-02-04 17:08         ` Vladislav Bolkhovitin
2008-02-05 16:25         ` Bart Van Assche
2008-02-05 18:18           ` Linus Torvalds
2008-02-04 15:30       ` James Bottomley
2008-02-04 16:25         ` Vladislav Bolkhovitin
2008-02-04 17:06           ` James Bottomley
2008-02-04 17:16             ` Vladislav Bolkhovitin
2008-02-04 17:25               ` James Bottomley
2008-02-04 17:56                 ` Vladislav Bolkhovitin
2008-02-04 18:22                   ` James Bottomley
2008-02-04 18:38                     ` Vladislav Bolkhovitin
2008-02-04 18:54                       ` James Bottomley
2008-02-05 18:59                         ` Vladislav Bolkhovitin
2008-02-05 19:13                           ` James Bottomley
2008-02-06 18:07                             ` Vladislav Bolkhovitin
2008-02-07 13:13                             ` [Scst-devel] " Bart Van Assche
2008-02-07 13:45                               ` Vladislav Bolkhovitin
2008-02-07 22:51                                 ` david
2008-02-08 10:37                                   ` Vladislav Bolkhovitin
2008-02-09  7:40                                     ` david
2008-02-08 11:33                                   ` Nicholas A. Bellinger
2008-02-08 14:36                                     ` Vladislav Bolkhovitin
2008-02-08 23:53                                       ` Nicholas A. Bellinger
2008-02-15 15:02                                 ` Bart Van Assche
2008-02-07 15:38                               ` [Scst-devel] " Nicholas A. Bellinger
2008-02-07 20:37                                 ` Luben Tuikov
2008-02-08 10:32                                   ` Vladislav Bolkhovitin
2008-02-09  7:32                                     ` Luben Tuikov
2008-02-11 10:02                                       ` Vladislav Bolkhovitin
2008-02-08 11:53                                   ` [Scst-devel] " Nicholas A. Bellinger
2008-02-08 14:42                                     ` Vladislav Bolkhovitin
2008-02-09  0:00                                       ` Nicholas A. Bellinger
2008-02-04 18:29                 ` Linus Torvalds
2008-02-04 18:49                   ` James Bottomley
2008-02-04 19:06                   ` Nicholas A. Bellinger
2008-02-04 19:19                     ` Nicholas A. Bellinger
2008-02-04 19:44                     ` Linus Torvalds
2008-02-04 20:06                       ` [Scst-devel] " 4news
2008-02-04 20:24                       ` Nicholas A. Bellinger
2008-02-04 21:01                       ` J. Bruce Fields
2008-02-04 21:24                         ` Linus Torvalds
2008-02-04 22:00                           ` Nicholas A. Bellinger
2008-02-04 22:57                           ` Jeff Garzik
2008-02-04 23:45                             ` Linus Torvalds
2008-02-05  0:08                               ` Jeff Garzik
2008-02-05  1:20                                 ` Linus Torvalds
2008-02-05  8:38                             ` Bart Van Assche
2008-02-05 17:50                               ` Jeff Garzik
2008-02-06 10:22                                 ` Bart Van Assche
2008-02-06 14:21                                   ` Jeff Garzik
2008-02-05 13:05                             ` Olivier Galibert
2008-02-05 18:08                               ` Jeff Garzik
2008-02-05 19:01                           ` Vladislav Bolkhovitin
2008-02-04 22:43                       ` Alan Cox
2008-02-04 17:30                         ` Douglas Gilbert
2008-02-05  2:07                           ` [Scst-devel] " Chris Weiss
2008-02-05 14:19                             ` FUJITA Tomonori
2008-02-04 22:59                         ` Nicholas A. Bellinger
2008-02-04 23:00                         ` James Bottomley
2008-02-04 23:12                           ` Nicholas A. Bellinger
2008-02-04 23:16                             ` Nicholas A. Bellinger
2008-02-05 18:37                             ` James Bottomley
2008-02-04 23:04                         ` Jeff Garzik
2008-02-04 23:27                           ` Linus Torvalds
2008-02-05 19:01                           ` Vladislav Bolkhovitin
2008-02-05 19:12                             ` Jeff Garzik
2008-02-05 19:21                               ` Vladislav Bolkhovitin
2008-02-06  0:11                                 ` Nicholas A. Bellinger [this message]
2008-02-06  1:43                                   ` Nicholas A. Bellinger
2008-02-12 16:05                                   ` [Scst-devel] " Bart Van Assche
2008-02-13  3:44                                     ` Nicholas A. Bellinger
2008-02-13  6:18                                       ` CONFIG_SLUB and reproducable general protection faults on 2.6.2x Nicholas A. Bellinger
2008-02-13 16:37                                         ` Nicholas A. Bellinger
2008-02-06  0:17                               ` Integration of SCST in the mainstream Linux kernel Nicholas A. Bellinger
2008-02-06  0:48                             ` Nicholas A. Bellinger
2008-02-06  0:51                               ` Nicholas A. Bellinger
2008-02-05  0:07                         ` Matt Mackall
2008-02-05  0:24                           ` Linus Torvalds
2008-02-05  0:42                             ` Jeff Garzik
2008-02-05  0:45                             ` Matt Mackall
2008-02-05  4:43                             ` [Scst-devel] " Matteo Tescione
2008-02-05  5:07                               ` James Bottomley
2008-02-05 13:38                               ` FUJITA Tomonori
2008-02-05 19:00                       ` Vladislav Bolkhovitin
2008-02-05 17:10 ` Erez Zilber
2008-02-05 19:02   ` Bart Van Assche
2008-02-05 19:02   ` Vladislav Bolkhovitin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1202256667.2220.83.camel@haakon2.linux-iscsi.org \
    --to=nab@linux-iscsi.org \
    --cc=James.Bottomley@HansenPartnership.com \
    --cc=Julian_Satran@il.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=alan@lxorguk.ukuu.org.uk \
    --cc=fujita.tomonori@lab.ntt.co.jp \
    --cc=jeff@garzik.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=michaelc@cs.wisc.edu \
    --cc=scst-devel@lists.sourceforge.net \
    --cc=torvalds@linux-foundation.org \
    --cc=vst@vlnb.net \
    --subject='Re: Integration of SCST in the mainstream Linux kernel' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).