LKML Archive on lore.kernel.org
* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
@ 2008-02-09  7:44 Luben Tuikov
  0 siblings, 0 replies; 29+ messages in thread
From: Luben Tuikov @ 2008-02-09  7:44 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Bart Van Assche, James Bottomley, Vladislav Bolkhovitin,
	FUJITA Tomonori, linux-scsi, linux-kernel, scst-devel,
	Andrew Morton, Linus Torvalds, Ming Zhang

--- On Fri, 2/8/08, Nicholas A. Bellinger <nab@linux-iscsi.org> wrote:
> > Is there an open iSCSI Target implementation which does NOT
> > issue commands to sub-target devices via the SCSI mid-layer, but
> > bypasses it completely?
> > 
> >    Luben
> > 
> 
> Hi Luben,
> 
> I am guessing you mean further down the stack, which I don't know this to

Yes, that's what I meant.

> be the case.  Going further up the layers is the design of v2.9 LIO-SE.
> There is a diagram explaining the basic concepts from a 10,000 foot
> level.
> 
> http://linux-iscsi.org/builds/user/nab/storage-engine-concept.pdf

Thanks!

   Luben

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-02-12 16:05                 ` [Scst-devel] " Bart Van Assche
@ 2008-02-13  3:44                   ` Nicholas A. Bellinger
  0 siblings, 0 replies; 29+ messages in thread
From: Nicholas A. Bellinger @ 2008-02-13  3:44 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Vladislav Bolkhovitin, FUJITA Tomonori, Mike Christie,
	linux-scsi, Linux Kernel Mailing List, James Bottomley,
	scst-devel, Andrew Morton, Christoph Hellwig, Rik van Riel,
	Chris Weiss, Linus Torvalds

Greetings all,

On Tue, 2008-02-12 at 17:05 +0100, Bart Van Assche wrote:
> On Feb 6, 2008 1:11 AM, Nicholas A. Bellinger <nab@linux-iscsi.org> wrote:
> > I have always observed the case with LIO SE/iSCSI target mode ...
> 
> Hello Nicholas,
> 
> Are you sure that the LIO-SE kernel module source code is ready for
> inclusion in the mainstream Linux kernel ? As you know I tried to test
> the LIO-SE iSCSI target. Already while configuring the target I
> encountered a kernel crash that froze the whole system. I can
> reproduce this kernel crash easily, and I reported it 11 days ago on
> the LIO-SE mailing list (February 4, 2008). One of the call stacks I
> posted shows a crash in mempool_alloc() called from jbd. In other
> words: the crash is most likely the result of memory corruption caused
> by LIO-SE.
> 

So I was able to FINALLY track this down to:

-# CONFIG_SLUB_DEBUG is not set
-# CONFIG_SLAB is not set
-CONFIG_SLUB=y
+CONFIG_SLAB=y

in both your and Chris Weiss's configs that was causing the
reproducible general protection faults.  I also disabled
CONFIG_RELOCATABLE and crash dump because I was debugging using kdb in
an x86_64 VM on 2.6.24 with your config.  I am pretty sure you can leave
this (crash dump) in your config for testing.

This can take a while to compile and takes up a lot of space, esp. with
all of the kernel debug options enabled, which on 2.6.24 really amounts
to a lot of CPU time when building.  Also with your original config, I
was seeing some strange undefined module objects after the Stage 2 Link
of iscsi_target_mod with modpost, and with SLUB the lockups (which are
not random btw, and are tracked back to __kmalloc()).  Also, at module
load time with the original config, there were some warnings about
symbol objects (I believe they were SCSI related, same as the ones with
modpost).

In any event, the dozen 1000-loop discovery tests are now working fine
(as well as IPoIB) with the above config change, and you should be ready
to go for your testing.

Tomo, Vlad, Andrew and Co:

Do you have any ideas why this would be the case with LIO-Target?  Is
anyone else seeing something similar to this with their target mode
(maybe it's all out-of-tree code?) that is having an issue?  I am
using Debian x86_64, and Bart and Chris are using Ubuntu x86_64, and we
both have this problem with CONFIG_SLUB on >= 2.6.22 kernel.org
kernels. 

Also, I will recompile some of my non-x86 machines with the above
enabled and see if I can reproduce.  Here is Bart's config again:

http://groups.google.com/group/linux-iscsi-target-dev/browse_thread/thread/30835aede1028188


> Because I was curious to know why it took so long to fix such a severe
> crash, I started browsing through the LIO-SE source code. Analysis of
> the LIO-SE kernel module source code taught me that this crash is not
> a coincidence. Dynamic memory allocation (kmalloc()/kfree()) in the
> LIO-SE kernel module is complex and hard to verify.

What the LIO-SE Target module does is complex. :P  Sorry for taking so
long; I had to track this down CONFIG_ option by CONFIG_ option with
your config on an x86_64 VM. 

>  There are 412
> memory allocation/deallocation calls in the current version of the
> LIO-SE kernel module source code, which is a lot. Additionally,
> because of the complexity of the memory handling in LIO-SE, it is not
> possible to verify the correctness of the memory handling by analyzing
> a single function at a time. In my opinion this makes the LIO-SE
> source code hard to maintain.
> Furthermore, the LIO-SE kernel module source code does not follow
> conventions that have proven their value in the past like grouping all
> error handling at the end of a function. As could be expected, the
> consequence is that error handling is not correct in several
> functions, resulting in memory leaks in case of an error.

I would be more than happy to point out the release paths for the iSCSI
Target and LIO-SE to show that these are not actual memory leaks (as I
mentioned, this code has been stable for a number of years) for any
particular SE or iSCSI Target logic you are interested in.

Also, if we are talking about a target mode storage engine that should
be going upstream, it needs to provide the API to current stable and
future storage systems, and of course the Mem->SG and SG->Mem mapping
that handles all possible cases of max_sectors and sector_size, past,
present, and future.  I am really glad that you have been taking a look
at this, because some of the code (as you mention) can get very complex
to make this a reality, as it has been with LIO-Target since v2.2.

>  Some
> examples of functions in which error handling is clearly incorrect:
> * transport_allocate_passthrough().
> * iscsi_do_build_list().
> 

You did find the one in transport_allocate_passthrough(), and the
strncpy() + strlen() one in userspace.  Also, thanks for pointing me to
the missing sg_init_table() and sg_mark_end() usage for 2.6.24.  I will
post an update to my thread about how to do this for other drivers.

I will have a look at your new changes and post them on LIO-Target-Dev
for your review.  Please feel free to Ack them when I post.

(Thanks Bart !!)

PS:  Sometimes it takes a while when you are on the bleeding edge of
development to track these types of issues down. :-)

--nab



* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-02-06  0:11               ` Nicholas A. Bellinger
@ 2008-02-12 16:05                 ` Bart Van Assche
  2008-02-13  3:44                   ` Nicholas A. Bellinger
  0 siblings, 1 reply; 29+ messages in thread
From: Bart Van Assche @ 2008-02-12 16:05 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Vladislav Bolkhovitin, FUJITA Tomonori, Mike Christie,
	linux-scsi, Linux Kernel Mailing List, James Bottomley,
	scst-devel, Andrew Morton

On Feb 6, 2008 1:11 AM, Nicholas A. Bellinger <nab@linux-iscsi.org> wrote:
> I have always observed the case with LIO SE/iSCSI target mode ...

Hello Nicholas,

Are you sure that the LIO-SE kernel module source code is ready for
inclusion in the mainstream Linux kernel ? As you know I tried to test
the LIO-SE iSCSI target. Already while configuring the target I
encountered a kernel crash that froze the whole system. I can
reproduce this kernel crash easily, and I reported it 11 days ago on
the LIO-SE mailing list (February 4, 2008). One of the call stacks I
posted shows a crash in mempool_alloc() called from jbd. In other words:
the crash is most likely the result of memory corruption caused by
LIO-SE.

Because I was curious to know why it took so long to fix such a severe
crash, I started browsing through the LIO-SE source code. Analysis of
the LIO-SE kernel module source code taught me that this crash is not
a coincidence. Dynamic memory allocation (kmalloc()/kfree()) in the
LIO-SE kernel module is complex and hard to verify. There are 412
memory allocation/deallocation calls in the current version of the
LIO-SE kernel module source code, which is a lot. Additionally,
because of the complexity of the memory handling in LIO-SE, it is not
possible to verify the correctness of the memory handling by analyzing
a single function at a time. In my opinion this makes the LIO-SE
source code hard to maintain.
Furthermore, the LIO-SE kernel module source code does not follow
conventions that have proven their value in the past like grouping all
error handling at the end of a function. As could be expected, the
consequence is that error handling is not correct in several
functions, resulting in memory leaks in case of an error. Some
examples of functions in which error handling is clearly incorrect:
* transport_allocate_passthrough().
* iscsi_do_build_list().

Bart Van Assche.


* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-02-07 20:37                         ` Luben Tuikov
@ 2008-02-08 11:53                           ` Nicholas A. Bellinger
  0 siblings, 0 replies; 29+ messages in thread
From: Nicholas A. Bellinger @ 2008-02-08 11:53 UTC (permalink / raw)
  To: ltuikov
  Cc: Bart Van Assche, James Bottomley, Vladislav Bolkhovitin,
	FUJITA Tomonori, linux-scsi, linux-kernel, scst-devel,
	Andrew Morton, Linus Torvalds, Ming Zhang

On Thu, 2008-02-07 at 12:37 -0800, Luben Tuikov wrote:
> Is there an open iSCSI Target implementation which does NOT
> issue commands to sub-target devices via the SCSI mid-layer, but
> bypasses it completely?
> 
>    Luben
> 

Hi Luben,

I am guessing you mean further down the stack, which I don't know to be
the case.  Going further up the layers is the design of v2.9 LIO-SE.
There is a diagram explaining the basic concepts from a 10,000 foot
level.

http://linux-iscsi.org/builds/user/nab/storage-engine-concept.pdf

Note that only the traditional iSCSI target is currently implemented in
the v2.9 LIO-SE codebase, in the list of target mode fabrics on the left
side of the layout.  The API between the protocol headers that does
encoding/decoding of target mode storage packets is probably the least
mature area of the LIO stack (because it has always been iSCSI looking
towards iSER :).  I don't know who has the most mature API between the
storage engine and target storage protocol for doing this, between SCST
and STGT; I am guessing SCST because of the difference in age of the
projects.  Could someone be so kind as to fill me in on this?

Also note, the storage engine plugin for doing userspace passthrough on
the right is also currently not implemented.  Userspace passthrough in
this context is a target engine I/O path that enforces max_sector and
sector_size limitations, and encodes/decodes target storage protocol
packets all out of view of userspace.  The addressing will be completely
different if we are pointing SE target packets at non-SCSI target ports
in userspace.

--nab




* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-02-07 15:38                       ` Nicholas A. Bellinger
@ 2008-02-07 20:37                         ` Luben Tuikov
  2008-02-08 11:53                           ` Nicholas A. Bellinger
  0 siblings, 1 reply; 29+ messages in thread
From: Luben Tuikov @ 2008-02-07 20:37 UTC (permalink / raw)
  To: Bart Van Assche, Nicholas A. Bellinger
  Cc: James Bottomley, Vladislav Bolkhovitin, FUJITA Tomonori,
	linux-scsi, linux-kernel, scst-devel, Andrew Morton,
	Linus Torvalds, Ming Zhang

Is there an open iSCSI Target implementation which does NOT
issue commands to sub-target devices via the SCSI mid-layer, but
bypasses it completely?

   Luben



* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-02-07 13:13                     ` [Scst-devel] " Bart Van Assche
@ 2008-02-07 15:38                       ` Nicholas A. Bellinger
  2008-02-07 20:37                         ` Luben Tuikov
  0 siblings, 1 reply; 29+ messages in thread
From: Nicholas A. Bellinger @ 2008-02-07 15:38 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: James Bottomley, Vladislav Bolkhovitin, FUJITA Tomonori,
	linux-scsi, linux-kernel, scst-devel, Andrew Morton,
	Linus Torvalds, Ming Zhang

On Thu, 2008-02-07 at 14:13 +0100, Bart Van Assche wrote: 
> Since the focus of this thread shifted somewhat in the last few
> messages, I'll try to summarize what has been discussed so far:
> - There were a number of participants who joined this discussion
> spontaneously. This suggests that there is considerable interest in
> networked storage and iSCSI.
> - It has been motivated why iSCSI makes sense as a storage protocol
> (compared to ATA over Ethernet and Fibre Channel over Ethernet).
> - The direct I/O performance results for block transfer sizes below 64
> KB are a meaningful benchmark for storage target implementations.
> - It has been discussed whether an iSCSI target should be implemented
> in user space or in kernel space. It is clear now that an
> implementation in the kernel can be made faster than a user space
> implementation (http://kerneltrap.org/mailarchive/linux-kernel/2008/2/4/714804).
> Regarding existing implementations, measurements have a.o. shown that
> SCST is faster than STGT (30% with the following setup: iSCSI via
> IPoIB and direct I/O block transfers with a size of 512 bytes).
> - It has been discussed which iSCSI target implementation should be in
> the mainstream Linux kernel. There is no agreement on this subject
> yet. The short-term options are as follows:
> 1) Do not integrate any new iSCSI target implementation in the
> mainstream Linux kernel.
> 2) Add one of the existing in-kernel iSCSI target implementations to
> the kernel, e.g. SCST or PyX/LIO.
> 3) Create a new in-kernel iSCSI target implementation that combines
> the advantages of the existing iSCSI kernel target implementations
> (iETD, STGT, SCST and PyX/LIO).
> 
> As an iSCSI user, I prefer option (3). The big question is whether the
> various storage target authors agree with this ?
> 

I think the other data point here would be that the final target design
needs to be as generic as possible.  Generic in the sense that the
engine eventually needs to be able to accept NBD and other Ethernet-based
target mode storage configurations to an abstracted device object
(struct scsi_device, struct block_device, or struct file), just as it
would for an IP Storage based request.

We know that NBD and *oE will have their own naming and discovery, and
the first set of IO tasks to be completed would be those using
(iscsi_cmd_t->cmd_flags & ICF_SCSI_DATA_SG_IO_CDB) in
iscsi_target_transport.c in the current code.  These are single READ_*
and WRITE_* codepaths that perform DMA memory pre-processing in v2.9
LIO-SE. 

Also, by being able to tell the engine to accelerate to DMA ring
operation (say to an underlying struct scsi_device or struct
block_device) instead of fileio, in some cases you will see better
performance when using hardware (ie: not an underlying kernel thread
queueing IO into block).  But I have found FILEIO with sendpage on MD to
be faster in single threaded tests than struct block_device.  I am
currently using IBLOCK on LVM for core LIO operation (which actually
sits on software MD raid6).  I do this because using submit_bio() with
se_mem_t mapped arrays of struct scatterlist -> struct bio_vec can
handle power failures properly, and does not send back StatSN Acks to
the Initiator, who would otherwise think that everything has already
made it to disk.  That is the case with doing IO to a struct file in the
kernel today without a kernel-level O_DIRECT.

Also, for proper kernel-level target mode support, using struct file
with O_DIRECT for storage blocks and emulating control path CDBs is one
of the work items.  This can be made generic, or obtained from the
underlying storage object (anything that can be exported from an LIO
Subsystem TPI) for real hardware (struct scsi_device in just about all
the cases these days).  Last time I looked, this was due to
fs/direct-io.c:dio_refill_pages() using get_user_pages()...

For the really transport-specific CDB and control code, which in a good
number of cases we are eventually going to be expected to emulate in
software, I really like how STGT breaks this up into per-device-type
code segments: spc.c, sbc.c, mmc.c, ssc.c, smc.c, etc.  Having all of
these split out properly is one strong point of STGT IMHO, and really
makes learning things much easier.  Also valuable is being able to queue
these IOs into userspace and receive an asynchronous response back up
the storage stack.  I think this is actually a pretty interesting
potential for passing storage protocol packets into userspace apps while
leaving the protocol state machines and recovery paths in the kernel
with a generic target engine.

Also, I know that the SCST folks have put a lot of time into getting the
very SCSI-hardware-specific target mode control modes to work.  I
personally own a bunch of these adapters, and would really like to see
better support for target mode on non-iSCSI type adapters with a single
target mode storage engine that abstracts storage subsystems and wire
protocol fabrics.

--nab



* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-02-05 19:13                   ` James Bottomley
@ 2008-02-07 13:13                     ` Bart Van Assche
  2008-02-07 15:38                       ` Nicholas A. Bellinger
  0 siblings, 1 reply; 29+ messages in thread
From: Bart Van Assche @ 2008-02-07 13:13 UTC (permalink / raw)
  To: James Bottomley, Nicholas A. Bellinger, Vladislav Bolkhovitin,
	FUJITA Tomonori
  Cc: linux-scsi, linux-kernel, scst-devel, Andrew Morton, Linus Torvalds

Since the focus of this thread shifted somewhat in the last few
messages, I'll try to summarize what has been discussed so far:
- There were a number of participants who joined this discussion
spontaneously. This suggests that there is considerable interest in
networked storage and iSCSI.
- It has been motivated why iSCSI makes sense as a storage protocol
(compared to ATA over Ethernet and Fibre Channel over Ethernet).
- The direct I/O performance results for block transfer sizes below 64
KB are a meaningful benchmark for storage target implementations.
- It has been discussed whether an iSCSI target should be implemented
in user space or in kernel space. It is clear now that an
implementation in the kernel can be made faster than a user space
implementation (http://kerneltrap.org/mailarchive/linux-kernel/2008/2/4/714804).
Regarding existing implementations, measurements have a.o. shown that
SCST is faster than STGT (30% with the following setup: iSCSI via
IPoIB and direct I/O block transfers with a size of 512 bytes).
- It has been discussed which iSCSI target implementation should be in
the mainstream Linux kernel. There is no agreement on this subject
yet. The short-term options are as follows:
1) Do not integrate any new iSCSI target implementation in the
mainstream Linux kernel.
2) Add one of the existing in-kernel iSCSI target implementations to
the kernel, e.g. SCST or PyX/LIO.
3) Create a new in-kernel iSCSI target implementation that combines
the advantages of the existing iSCSI kernel target implementations
(iETD, STGT, SCST and PyX/LIO).

As an iSCSI user, I prefer option (3). The big question is whether the
various storage target authors agree with this ?

Bart Van Assche.


* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-02-06  1:29             ` FUJITA Tomonori
@ 2008-02-06  2:01               ` Nicholas A. Bellinger
  0 siblings, 0 replies; 29+ messages in thread
From: Nicholas A. Bellinger @ 2008-02-06  2:01 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: matteo, tomof, mangoo, vst, linux-scsi, linux-kernel,
	James.Bottomley, scst-devel, akpm, torvalds

On Wed, 2008-02-06 at 10:29 +0900, FUJITA Tomonori wrote:
> On Tue, 05 Feb 2008 18:09:15 +0100
> Matteo Tescione <matteo@rmnet.it> wrote:
> 
> > On 5-02-2008 14:38, "FUJITA Tomonori" <tomof@acm.org> wrote:
> > 
> > > On Tue, 05 Feb 2008 08:14:01 +0100
> > > Tomasz Chmielewski <mangoo@wpkg.org> wrote:
> > > 
> > >> James Bottomley schrieb:
> > >> 
> > >>> These are both features being independently worked on, are they not?
> > >>> Even if they weren't, the combination of the size of SCST in kernel plus
> > >>> the problem of having to find a migration path for the current STGT
> > >>> users still looks to me to involve the greater amount of work.
> > >> 
> > >> I don't want to be mean, but does anyone actually use STGT in
> > >> production? Seriously?
> > >> 
> > >> In the latest development version of STGT, it's only possible to stop
> > >> the tgtd target daemon using KILL / 9 signal - which also means all
> > >> iSCSI initiator connections are corrupted when tgtd target daemon is
> > >> started again (kernel upgrade, target daemon upgrade, server reboot etc.).
> > > 
> > > I don't know what "iSCSI initiator connections are corrupted"
> > > mean. But if you reboot a server, how can an iSCSI target
> > > implementation keep iSCSI tcp connections?
> > > 
> > > 
> > >> Imagine you have to reboot all your NFS clients when you reboot your NFS
> > >> server. Not only that - your data is probably corrupted, or at least the
> > >> filesystem deserves checking...
> > 

The TCP connection will drop, remember that the TCP connection state for
one side has completely vanished.  Depending on iSCSI/iSER
ErrorRecoveryLevel that is set, this will mean:

1) Session Recovery, ERL=0 - Restarting the entire nexus and all
connections across all of the possible subnets or comm-links.  All
outstanding commands whose StatSN was not acknowledged will be returned
back to the SCSI subsystem with RETRY status.  Once a single connection
has been reestablished to restart the nexus, the CDBs will be resent.

2) Connection Recovery, ERL=2 - CDBs from the failed connection(s) will
be retried (nothing changes in the PDU) to fill the iSCSI CmdSN ordering
gap, or be explicitly retried with TMR TASK_REASSIGN for ones already
acknowledged by the ExpCmdSN that are returned to the initiator in
response packets or by way of unsolicited NopINs.

> > Don't know if matters, but in my setup (iscsi on top of drbd+heartbeat)
> > rebooting the primary server doesn't affect my iscsi traffic, SCST correctly
> > manages stop/crash, by sending unit attention to clients on reconnect.
> > Drbd+heartbeat correctly manages those things too.
> > Still from an end-user POV, I was able to reboot/survive a crash only with
> > SCST; IETD still has reconnect problems and STGT is even worse.
> 
> Please tell us on stgt-devel mailing list if you see problems. We will
> try to fix them.
> 

FYI, the LIO code also supports rmmod'ing iscsi_target_mod while at full
10 Gb/sec speed.  I think it should be a requirement to be able to
control per initiator, per portal group, per LUN, per device, and per
HBA in the design without restarting any other objects.

--nab

> Thanks,
> --



* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-02-05 17:09           ` Matteo Tescione
@ 2008-02-06  1:29             ` FUJITA Tomonori
  2008-02-06  2:01               ` Nicholas A. Bellinger
  0 siblings, 1 reply; 29+ messages in thread
From: FUJITA Tomonori @ 2008-02-06  1:29 UTC (permalink / raw)
  To: matteo
  Cc: tomof, mangoo, vst, linux-scsi, linux-kernel, James.Bottomley,
	scst-devel, akpm, torvalds, fujita.tomonori

On Tue, 05 Feb 2008 18:09:15 +0100
Matteo Tescione <matteo@rmnet.it> wrote:

> On 5-02-2008 14:38, "FUJITA Tomonori" <tomof@acm.org> wrote:
> 
> > On Tue, 05 Feb 2008 08:14:01 +0100
> > Tomasz Chmielewski <mangoo@wpkg.org> wrote:
> > 
> >> James Bottomley schrieb:
> >> 
> >>> These are both features being independently worked on, are they not?
> >>> Even if they weren't, the combination of the size of SCST in kernel plus
> >>> the problem of having to find a migration path for the current STGT
> >>> users still looks to me to involve the greater amount of work.
> >> 
> >> I don't want to be mean, but does anyone actually use STGT in
> >> production? Seriously?
> >> 
> >> In the latest development version of STGT, it's only possible to stop
> >> the tgtd target daemon using KILL / 9 signal - which also means all
> >> iSCSI initiator connections are corrupted when tgtd target daemon is
> >> started again (kernel upgrade, target daemon upgrade, server reboot etc.).
> > 
> > I don't know what "iSCSI initiator connections are corrupted"
> > mean. But if you reboot a server, how can an iSCSI target
> > implementation keep iSCSI tcp connections?
> > 
> > 
> >> Imagine you have to reboot all your NFS clients when you reboot your NFS
> >> server. Not only that - your data is probably corrupted, or at least the
> >> filesystem deserves checking...
> 
> Don't know if matters, but in my setup (iscsi on top of drbd+heartbeat)
> rebooting the primary server doesn't affect my iscsi traffic, SCST correctly
> manages stop/crash, by sending unit attention to clients on reconnect.
> Drbd+heartbeat correctly manages those things too.
> Still from an end-user POV, I was able to reboot/survive a crash only with
> SCST; IETD still has reconnect problems and STGT is even worse.

Please tell us on stgt-devel mailing list if you see problems. We will
try to fix them.

Thanks,


* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-02-05 13:38         ` FUJITA Tomonori
  2008-02-05 16:07           ` Tomasz Chmielewski
@ 2008-02-05 17:09           ` Matteo Tescione
  2008-02-06  1:29             ` FUJITA Tomonori
  1 sibling, 1 reply; 29+ messages in thread
From: Matteo Tescione @ 2008-02-05 17:09 UTC (permalink / raw)
  To: FUJITA Tomonori, mangoo
  Cc: vst, linux-scsi, linux-kernel, James.Bottomley, scst-devel, akpm,
	torvalds, fujita.tomonori

On 5-02-2008 14:38, "FUJITA Tomonori" <tomof@acm.org> wrote:

> On Tue, 05 Feb 2008 08:14:01 +0100
> Tomasz Chmielewski <mangoo@wpkg.org> wrote:
> 
>> James Bottomley schrieb:
>> 
>>> These are both features being independently worked on, are they not?
>>> Even if they weren't, the combination of the size of SCST in kernel plus
>>> the problem of having to find a migration path for the current STGT
>>> users still looks to me to involve the greater amount of work.
>> 
>> I don't want to be mean, but does anyone actually use STGT in
>> production? Seriously?
>> 
>> In the latest development version of STGT, it's only possible to stop
>> the tgtd target daemon using KILL / 9 signal - which also means all
>> iSCSI initiator connections are corrupted when tgtd target daemon is
>> started again (kernel upgrade, target daemon upgrade, server reboot etc.).
> 
> I don't know what "iSCSI initiator connections are corrupted"
> mean. But if you reboot a server, how can an iSCSI target
> implementation keep iSCSI tcp connections?
> 
> 
>> Imagine you have to reboot all your NFS clients when you reboot your NFS
>> server. Not only that - your data is probably corrupted, or at least the
>> filesystem deserves checking...

Don't know if it matters, but in my setup (iSCSI on top of drbd+heartbeat)
rebooting the primary server doesn't affect my iSCSI traffic; SCST correctly
manages stop/crash by sending unit attention to clients on reconnect.
Drbd+heartbeat correctly manages those things too.
Still from an end-user POV, I was able to reboot/survive a crash only with
SCST; IETD still has reconnect problems and STGT is even worse.

Regards,
--matteo




* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-02-05 16:07           ` Tomasz Chmielewski
  2008-02-05 16:21             ` Ming Zhang
@ 2008-02-05 16:43             ` FUJITA Tomonori
  1 sibling, 0 replies; 29+ messages in thread
From: FUJITA Tomonori @ 2008-02-05 16:43 UTC (permalink / raw)
  To: mangoo
  Cc: tomof, James.Bottomley, bart.vanassche, vst, linux-scsi,
	linux-kernel, fujita.tomonori, scst-devel, akpm, torvalds,
	stgt-devel, fujita.tomonori

On Tue, 05 Feb 2008 17:07:07 +0100
Tomasz Chmielewski <mangoo@wpkg.org> wrote:

> FUJITA Tomonori schrieb:
> > On Tue, 05 Feb 2008 08:14:01 +0100
> > Tomasz Chmielewski <mangoo@wpkg.org> wrote:
> > 
> >> James Bottomley schrieb:
> >>
> >>> These are both features being independently worked on, are they not?
> >>> Even if they weren't, the combination of the size of SCST in kernel plus
> >>> the problem of having to find a migration path for the current STGT
> >>> users still looks to me to involve the greater amount of work.
> >> I don't want to be mean, but does anyone actually use STGT in
> >> production? Seriously?
> >>
> >> In the latest development version of STGT, it's only possible to stop
> >> the tgtd target daemon using KILL / 9 signal - which also means all
> >> iSCSI initiator connections are corrupted when tgtd target daemon is
> >> started again (kernel upgrade, target daemon upgrade, server reboot etc.).
> > 
> > I don't know what "iSCSI initiator connections are corrupted"
> > mean. But if you reboot a server, how can an iSCSI target
> > implementation keep iSCSI tcp connections?
> 
> The problem with tgtd is that you can't start it (configured) in an
> "atomic" way.
> Usually, one will start tgtd and its configuration in a script (I 
> replaced some parameters with "..." to make it shorter and more readable):

Thanks for the details. So the way to stop the daemon is not related
with your problem.

It's easily fixable. Can you start a new thread about this on
stgt-devel mailing list? When we agree on the interface to start the
daemon, I'll implement it.


> tgtd
> tgtadm --op new ...
> tgtadm --lld iscsi --op new ...

(snip)

> So the only way to start/restart tgtd reliably is to do hacks which are 
> needed with yet another iSCSI kernel implementation (IET): use iptables.
> 
> iptables <block iSCSI traffic>
> tgtd
> sleep 1
> tgtadm --op new ...
> tgtadm --lld iscsi --op new ...
> iptables <unblock iSCSI traffic>
> 
> 
> A bit ugly, isn't it?
> Having to tinker with a firewall in order to start a daemon is by no 
> means a sign of a well-tested and mature project.
> 
> That's why I asked how many people use stgt in a production environment 
> - James was worried about a potential migration path for current users.

I don't know how many people use stgt in a production environment but
I'm not sure that this problem prevents many people from using it in a
production environment.

You want to reboot a server running target devices while initiators are
connected to it.  Rebooting the target server behind the initiators
seldom works.  System administrators in my workplace reboot storage
devices once a year and tell us to shut down the initiator machines
that use them before that.


* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-02-05 16:07           ` Tomasz Chmielewski
@ 2008-02-05 16:21             ` Ming Zhang
  2008-02-05 16:43             ` FUJITA Tomonori
  1 sibling, 0 replies; 29+ messages in thread
From: Ming Zhang @ 2008-02-05 16:21 UTC (permalink / raw)
  To: Tomasz Chmielewski
  Cc: FUJITA Tomonori, vst, linux-scsi, linux-kernel, James.Bottomley,
	scst-devel, stgt-devel, akpm, torvalds, fujita.tomonori

On Tue, 2008-02-05 at 17:07 +0100, Tomasz Chmielewski wrote:
> FUJITA Tomonori schrieb:
> > On Tue, 05 Feb 2008 08:14:01 +0100
> > Tomasz Chmielewski <mangoo@wpkg.org> wrote:
> > 
> >> James Bottomley schrieb:
> >>
> >>> These are both features being independently worked on, are they not?
> >>> Even if they weren't, the combination of the size of SCST in kernel plus
> >>> the problem of having to find a migration path for the current STGT
> >>> users still looks to me to involve the greater amount of work.
> >> I don't want to be mean, but does anyone actually use STGT in
> >> production? Seriously?
> >>
> >> In the latest development version of STGT, it's only possible to stop
> >> the tgtd target daemon using KILL / 9 signal - which also means all
> >> iSCSI initiator connections are corrupted when tgtd target daemon is
> >> started again (kernel upgrade, target daemon upgrade, server reboot etc.).
> > 
> > I don't know what "iSCSI initiator connections are corrupted"
> > means. But if you reboot a server, how can an iSCSI target
> > implementation keep iSCSI TCP connections?
> 
> The problem with tgtd is that you can't start it (configured) in an
> "atomic" way.
> Usually, one will start tgtd and its configuration in a script (I
> replaced some parameters with "..." to make it shorter and more readable):
> 
> 
> tgtd
> tgtadm --op new ...
> tgtadm --lld iscsi --op new ...
> 
> 
> However, this won't work - tgtd goes into the background immediately,
> while it is still starting, and the first tgtadm commands will fail:

this should be an easy fix: start tgtd, get the port set up in the
forked process, then signal the parent that it is ready so the parent
can exit. Or set the port up in the parent, then fork and hand it to
the daemon.
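The first suggestion (a forked child that finishes its setup before telling the parent it may exit) can be sketched in shell with a FIFO. This is an illustrative wrapper, not tgtd's actual code; the daemon body is a placeholder:

```shell
#!/bin/sh
# Readiness handshake: the backgrounded "daemon" writes to a FIFO only
# after its setup is complete; the parent blocks on the FIFO and only
# returns once the management side would be safe to talk to.
fifo=$(mktemp -u) && mkfifo "$fifo"

(
    # ... real daemon setup would go here (bind management port, etc.) ...
    echo ready > "$fifo"    # signal the parent: setup is done
    # ... daemon main loop would continue here ...
) &

read -r status < "$fifo"    # blocks until the child reports readiness
rm -f "$fifo"
[ "$status" = ready ] && echo "safe to run tgtadm now"
```

With this handshake the wrapper only finishes once setup is complete, so the first tgtadm in a startup script can no longer race the daemon.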


> 
> # bash -x tgtd-start
> + tgtd
> + tgtadm --op new --mode target ...
> tgtadm: can't connect to the tgt daemon, Connection refused
> tgtadm: can't send the request to the tgt daemon, Transport endpoint is 
> not connected
> + tgtadm --lld iscsi --op new --mode account ...
> tgtadm: can't connect to the tgt daemon, Connection refused
> tgtadm: can't send the request to the tgt daemon, Transport endpoint is 
> not connected
> + tgtadm --lld iscsi --op bind --mode account --tid 1 ...
> tgtadm: can't find the target
> + tgtadm --op new --mode logicalunit --tid 1 --lun 1 ...
> tgtadm: can't find the target
> + tgtadm --op bind --mode target --tid 1 -I ALL
> tgtadm: can't find the target
> + tgtadm --op new --mode target --tid 2 ...
> + tgtadm --op new --mode logicalunit --tid 2 --lun 1 ...
> + tgtadm --op bind --mode target --tid 2 -I ALL
> 
> 
> OK, if tgtd takes longer to start, perhaps it's a good idea to sleep a 
> second right after tgtd?
> 
> tgtd
> sleep 1
> tgtadm --op new ...
> tgtadm --lld iscsi --op new ...
> 
> 
> No, it is not a good idea - if tgtd listens on port 3260 *and* is
> not yet configured, any reconnecting initiator will fail, like below:

this is another easy fix: start tgtd in an unconfigured state, let
tgtadm configure it, and only then switch it to a ready state.
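This offline-then-ready scheme is essentially the interface later tgt releases adopted. A sketch of the sequence, with command syntax recalled from later tgt versions and parameters abbreviated with "..." as in the examples above (treat it as illustrative, not authoritative):

```shell
tgtd                                                 # management socket up, targets still offline
tgtadm --op new --mode target --tid 1 ...
tgtadm --op new --mode logicalunit --tid 1 --lun 1 ...
tgtadm --op update --mode sys --name State -v ready  # only now accept initiator logins
```

Initiators that reconnect before the final command are refused cleanly instead of seeing a half-configured target.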


those are really minor usability issues (i know they are painful for
users, i agree).


the major problem here is to discuss, architecture-wise, which one is
better... the linux kernel should have one implementation that is good
from the foundation...





> 
> end_request: I/O error, dev sdb, sector 7045192
> Buffer I/O error on device sdb, logical block 880649
> lost page write due to I/O error on sdb
> Aborting journal on device sdb.
> ext3_abort called.
> EXT3-fs error (device sdb): ext3_journal_start_sb: Detected aborted journal
> Remounting filesystem read-only
> end_request: I/O error, dev sdb, sector 7045880
> Buffer I/O error on device sdb, logical block 880735
> lost page write due to I/O error on sdb
> end_request: I/O error, dev sdb, sector 6728
> Buffer I/O error on device sdb, logical block 841
> lost page write due to I/O error on sdb
> end_request: I/O error, dev sdb, sector 7045192
> Buffer I/O error on device sdb, logical block 880649
> lost page write due to I/O error on sdb
> end_request: I/O error, dev sdb, sector 7045880
> Buffer I/O error on device sdb, logical block 880735
> lost page write due to I/O error on sdb
> __journal_remove_journal_head: freeing b_frozen_data
> __journal_remove_journal_head: freeing b_frozen_data
> 
> 
> Ouch.
> 
> So the only way to start/restart tgtd reliably is to do hacks which are 
> needed with yet another iSCSI kernel implementation (IET): use iptables.
> 
> iptables <block iSCSI traffic>
> tgtd
> sleep 1
> tgtadm --op new ...
> tgtadm --lld iscsi --op new ...
> iptables <unblock iSCSI traffic>
> 
> 
> A bit ugly, isn't it?
> Having to tinker with a firewall in order to start a daemon is by no 
> means a sign of a well-tested and mature project.
> 
> That's why I asked how many people use stgt in a production environment 
> - James was worried about a potential migration path for current users.
> 
> 
> 
> -- 
> Tomasz Chmielewski
> http://wpkg.org
> 
> 
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Microsoft
> Defy all challenges. Microsoft(R) Visual Studio 2008.
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> _______________________________________________
> Scst-devel mailing list
> Scst-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scst-devel
-- 
Ming Zhang


@#$%^ purging memory... (*!%
http://blackmagic02881.wordpress.com/
http://www.linkedin.com/in/blackmagic02881
--------------------------------------------


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-02-05 13:38         ` FUJITA Tomonori
@ 2008-02-05 16:07           ` Tomasz Chmielewski
  2008-02-05 16:21             ` Ming Zhang
  2008-02-05 16:43             ` FUJITA Tomonori
  2008-02-05 17:09           ` Matteo Tescione
  1 sibling, 2 replies; 29+ messages in thread
From: Tomasz Chmielewski @ 2008-02-05 16:07 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: James.Bottomley, bart.vanassche, vst, linux-scsi, linux-kernel,
	fujita.tomonori, scst-devel, akpm, torvalds, stgt-devel

FUJITA Tomonori schrieb:
> On Tue, 05 Feb 2008 08:14:01 +0100
> Tomasz Chmielewski <mangoo@wpkg.org> wrote:
> 
>> James Bottomley schrieb:
>>
>>> These are both features being independently worked on, are they not?
>>> Even if they weren't, the combination of the size of SCST in kernel plus
>>> the problem of having to find a migration path for the current STGT
>>> users still looks to me to involve the greater amount of work.
>> I don't want to be mean, but does anyone actually use STGT in
>> production? Seriously?
>>
>> In the latest development version of STGT, it's only possible to stop
>> the tgtd target daemon using KILL / 9 signal - which also means all
>> iSCSI initiator connections are corrupted when tgtd target daemon is
>> started again (kernel upgrade, target daemon upgrade, server reboot etc.).
> 
> I don't know what "iSCSI initiator connections are corrupted"
> means. But if you reboot a server, how can an iSCSI target
> implementation keep iSCSI TCP connections?

The problem with tgtd is that you can't start it (configured) in an
"atomic" way.
Usually, one will start tgtd and its configuration in a script (I
replaced some parameters with "..." to make it shorter and more readable):


tgtd
tgtadm --op new ...
tgtadm --lld iscsi --op new ...


However, this won't work - tgtd goes into the background immediately,
while it is still starting, and the first tgtadm commands will fail:

# bash -x tgtd-start
+ tgtd
+ tgtadm --op new --mode target ...
tgtadm: can't connect to the tgt daemon, Connection refused
tgtadm: can't send the request to the tgt daemon, Transport endpoint is 
not connected
+ tgtadm --lld iscsi --op new --mode account ...
tgtadm: can't connect to the tgt daemon, Connection refused
tgtadm: can't send the request to the tgt daemon, Transport endpoint is 
not connected
+ tgtadm --lld iscsi --op bind --mode account --tid 1 ...
tgtadm: can't find the target
+ tgtadm --op new --mode logicalunit --tid 1 --lun 1 ...
tgtadm: can't find the target
+ tgtadm --op bind --mode target --tid 1 -I ALL
tgtadm: can't find the target
+ tgtadm --op new --mode target --tid 2 ...
+ tgtadm --op new --mode logicalunit --tid 2 --lun 1 ...
+ tgtadm --op bind --mode target --tid 2 -I ALL


OK, if tgtd takes longer to start, perhaps it's a good idea to sleep a 
second right after tgtd?

tgtd
sleep 1
tgtadm --op new ...
tgtadm --lld iscsi --op new ...


No, it is not a good idea - if tgtd listens on port 3260 *and* is
not yet configured, any reconnecting initiator will fail, like below:

end_request: I/O error, dev sdb, sector 7045192
Buffer I/O error on device sdb, logical block 880649
lost page write due to I/O error on sdb
Aborting journal on device sdb.
ext3_abort called.
EXT3-fs error (device sdb): ext3_journal_start_sb: Detected aborted journal
Remounting filesystem read-only
end_request: I/O error, dev sdb, sector 7045880
Buffer I/O error on device sdb, logical block 880735
lost page write due to I/O error on sdb
end_request: I/O error, dev sdb, sector 6728
Buffer I/O error on device sdb, logical block 841
lost page write due to I/O error on sdb
end_request: I/O error, dev sdb, sector 7045192
Buffer I/O error on device sdb, logical block 880649
lost page write due to I/O error on sdb
end_request: I/O error, dev sdb, sector 7045880
Buffer I/O error on device sdb, logical block 880735
lost page write due to I/O error on sdb
__journal_remove_journal_head: freeing b_frozen_data
__journal_remove_journal_head: freeing b_frozen_data


Ouch.

So the only way to start/restart tgtd reliably is to do hacks which are 
needed with yet another iSCSI kernel implementation (IET): use iptables.

iptables <block iSCSI traffic>
tgtd
sleep 1
tgtadm --op new ...
tgtadm --lld iscsi --op new ...
iptables <unblock iSCSI traffic>


A bit ugly, isn't it?
Having to tinker with a firewall in order to start a daemon is by no 
means a sign of a well-tested and mature project.
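A middle ground between `sleep 1` and the firewall dance is to poll the management interface until the daemon answers. The generic helper below is runnable as-is; the tgtd/tgtadm usage underneath it is hypothetical and, as the comment notes, it still leaves the listening-but-unconfigured window open, which is the real complaint here:

```shell
# wait_ready CMD [ARGS...]: poll until CMD succeeds, with a bounded
# timeout (about 5 seconds here); returns non-zero on timeout.
wait_ready() {
    tries=0
    until "$@" >/dev/null 2>&1; do
        tries=$((tries + 1))
        [ "$tries" -ge 50 ] && return 1
        sleep 0.1
    done
}

# Hypothetical usage (fixes the tgtadm race, but the iSCSI port is
# still open while unconfigured - exactly the window iptables closes):
#   tgtd
#   wait_ready tgtadm --op show --mode target || exit 1
#   tgtadm --op new ...

wait_ready true && echo ok
```

So polling removes the arbitrary sleep, but only an offline-until-configured daemon state removes the need for the firewall.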

That's why I asked how many people use stgt in a production environment 
- James was worried about a potential migration path for current users.



-- 
Tomasz Chmielewski
http://wpkg.org


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-02-05  2:07         ` [Scst-devel] " Chris Weiss
@ 2008-02-05 14:19           ` FUJITA Tomonori
  0 siblings, 0 replies; 29+ messages in thread
From: FUJITA Tomonori @ 2008-02-05 14:19 UTC (permalink / raw)
  To: cweiss
  Cc: dougg, alan, michaelc, vst, linux-scsi, linux-kernel, nab,
	James.Bottomley, scst-devel, akpm, torvalds, fujita.tomonori,
	fujita.tomonori

On Mon, 4 Feb 2008 20:07:01 -0600
"Chris Weiss" <cweiss@gmail.com> wrote:

> On Feb 4, 2008 11:30 AM, Douglas Gilbert <dougg@torque.net> wrote:
> > Alan Cox wrote:
> > >> better. So for example, I personally suspect that ATA-over-ethernet is way
> > >> better than some crazy SCSI-over-TCP crap, but I'm biased for simple and
> > >> low-level, and against those crazy SCSI people to begin with.
> > >
> > > Current ATAoE isn't. It can't support NCQ. A variant that did NCQ and IP
> > > would probably trash iSCSI for latency if nothing else.
> >
> > And a variant that doesn't do ATA or IP:
> > http://www.fcoe.com/
> >
> 
> however, and interestingly enough, the open-fcoe software target
> depends on scst (for now anyway)

STGT also supports a software FCoE target driver, though it's still
experimental.

http://www.mail-archive.com/linux-scsi@vger.kernel.org/msg12705.html

It works in user space like STGT's iSCSI (and iSER) target driver
(i.e. no kernel/user space interaction).

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-02-05  7:14       ` [Scst-devel] " Tomasz Chmielewski
@ 2008-02-05 13:38         ` FUJITA Tomonori
  2008-02-05 16:07           ` Tomasz Chmielewski
  2008-02-05 17:09           ` Matteo Tescione
  0 siblings, 2 replies; 29+ messages in thread
From: FUJITA Tomonori @ 2008-02-05 13:38 UTC (permalink / raw)
  To: mangoo
  Cc: James.Bottomley, bart.vanassche, vst, linux-scsi, linux-kernel,
	fujita.tomonori, scst-devel, akpm, torvalds, fujita.tomonori

On Tue, 05 Feb 2008 08:14:01 +0100
Tomasz Chmielewski <mangoo@wpkg.org> wrote:

> James Bottomley schrieb:
> 
> > These are both features being independently worked on, are they not?
> > Even if they weren't, the combination of the size of SCST in kernel plus
> > the problem of having to find a migration path for the current STGT
> > users still looks to me to involve the greater amount of work.
> 
> I don't want to be mean, but does anyone actually use STGT in
> production? Seriously?
> 
> In the latest development version of STGT, it's only possible to stop
> the tgtd target daemon using KILL / 9 signal - which also means all
> iSCSI initiator connections are corrupted when tgtd target daemon is
> started again (kernel upgrade, target daemon upgrade, server reboot etc.).

I don't know what "iSCSI initiator connections are corrupted"
means. But if you reboot a server, how can an iSCSI target
implementation keep iSCSI TCP connections?


> Imagine you have to reboot all your NFS clients when you reboot your NFS
> server. Not only that - your data is probably corrupted, or at least the
> filesystem deserves checking...

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-02-05  4:43 ` [Scst-devel] " Matteo Tescione
  2008-02-05  5:07   ` James Bottomley
@ 2008-02-05 13:38   ` FUJITA Tomonori
  1 sibling, 0 replies; 29+ messages in thread
From: FUJITA Tomonori @ 2008-02-05 13:38 UTC (permalink / raw)
  To: matteo
  Cc: torvalds, mpm, michaelc, vst, linux-scsi, linux-kernel, nab,
	James.Bottomley, scst-devel, akpm, fujita.tomonori, alan,
	fujita.tomonori

On Tue, 05 Feb 2008 05:43:10 +0100
Matteo Tescione <matteo@rmnet.it> wrote:

> Hi all,
> And sorry for the intrusion; I am not a developer, but I work every day
> with iSCSI and I find it fantastic.
> Although AoE, FCoE and so on could be better, we have to look at what
> real-world implementations need *now*, and if we look at the VMware
> world, Virtual Iron, Microsoft clustering etc., the answer is iSCSI.
> And now, SCST is the best open-source iSCSI target. So, from an end-user
> point of view, what are the real problems preventing the integration of
> SCST into the mainstream kernel?

Currently, the best open-source iSCSI target implementation in Linux is
Nicholas's LIO, I guess.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-01-30 16:22     ` James Bottomley
@ 2008-02-05  7:14       ` Tomasz Chmielewski
  2008-02-05 13:38         ` FUJITA Tomonori
  0 siblings, 1 reply; 29+ messages in thread
From: Tomasz Chmielewski @ 2008-02-05  7:14 UTC (permalink / raw)
  To: James Bottomley
  Cc: Bart Van Assche, Vladislav Bolkhovitin, linux-scsi, linux-kernel,
	FUJITA Tomonori, scst-devel, Andrew Morton, Linus Torvalds

James Bottomley schrieb:

> These are both features being independently worked on, are they not?
> Even if they weren't, the combination of the size of SCST in kernel plus
> the problem of having to find a migration path for the current STGT
> users still looks to me to involve the greater amount of work.

I don't want to be mean, but does anyone actually use STGT in
production? Seriously?

In the latest development version of STGT, it's only possible to stop
the tgtd target daemon using KILL / 9 signal - which also means all
iSCSI initiator connections are corrupted when tgtd target daemon is
started again (kernel upgrade, target daemon upgrade, server reboot etc.).

Imagine you have to reboot all your NFS clients when you reboot your NFS
server. Not only that - your data is probably corrupted, or at least the
filesystem deserves checking...


-- 
Tomasz Chmielewski
http://wpkg.org




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-02-05  4:43 ` [Scst-devel] " Matteo Tescione
@ 2008-02-05  5:07   ` James Bottomley
  2008-02-05 13:38   ` FUJITA Tomonori
  1 sibling, 0 replies; 29+ messages in thread
From: James Bottomley @ 2008-02-05  5:07 UTC (permalink / raw)
  To: Matteo Tescione
  Cc: Linus Torvalds, Matt Mackall, Mike Christie,
	Vladislav Bolkhovitin, linux-scsi, Linux Kernel Mailing List,
	Nicholas A. Bellinger, scst-devel, Andrew Morton,
	FUJITA Tomonori, Alan Cox


On Tue, 2008-02-05 at 05:43 +0100, Matteo Tescione wrote:
> Hi all,
> And sorry for the intrusion; I am not a developer, but I work every day
> with iSCSI and I find it fantastic.
> Although AoE, FCoE and so on could be better, we have to look at what
> real-world implementations need *now*, and if we look at the VMware
> world, Virtual Iron, Microsoft clustering etc., the answer is iSCSI.
> And now, SCST is the best open-source iSCSI target. So, from an end-user
> point of view, what are the real problems preventing the integration of
> SCST into the mainstream kernel?

The fact that your last statement is conjecture.  It's definitely untrue
for non-IB networks, and the jury is still out on IB networks.

James



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-02-05  0:24 Linus Torvalds
@ 2008-02-05  4:43 ` Matteo Tescione
  2008-02-05  5:07   ` James Bottomley
  2008-02-05 13:38   ` FUJITA Tomonori
  0 siblings, 2 replies; 29+ messages in thread
From: Matteo Tescione @ 2008-02-05  4:43 UTC (permalink / raw)
  To: Linus Torvalds, Matt Mackall
  Cc: Mike Christie, Vladislav Bolkhovitin, linux-scsi,
	Linux Kernel Mailing List, Nicholas A. Bellinger,
	James Bottomley, scst-devel, Andrew Morton, FUJITA Tomonori,
	Alan Cox

Hi all,
And sorry for the intrusion; I am not a developer, but I work every day
with iSCSI and I find it fantastic.
Although AoE, FCoE and so on could be better, we have to look at what
real-world implementations need *now*, and if we look at the VMware
world, Virtual Iron, Microsoft clustering etc., the answer is iSCSI.
And now, SCST is the best open-source iSCSI target. So, from an end-user
point of view, what are the real problems preventing the integration of
SCST into the mainstream kernel?

Just my two cent,
--
So long and thank for all the fish
--
#Matteo Tescione
#RMnet srl


> 
> 
> On Mon, 4 Feb 2008, Matt Mackall wrote:
>> 
>> But ATAoE is boring because it's not IP. Which means no routing,
>> firewalls, tunnels, congestion control, etc.
> 
> The thing is, that's often an advantage. Not just for performance.
> 
>> NBD and iSCSI (for all its hideous growths) can take advantage of these
>> things.
> 
> .. and all this could equally well be done by a simple bridging protocol
> (completely independently of any AoE code).
> 
> The thing is, iSCSI does things at the wrong level. It *forces* people to
> use the complex protocols, when it's a known that a lot of people don't
> want it. 
> 
> Which is why these AoE and FCoE things keep popping up.
> 
> It's easy to bridge ethernet and add a new layer on top of AoE if you need
> it. In comparison, it's *impossible* to remove an unnecessary layer from
> iSCSI.
> 
> This is why "simple and low-level is good". It's always possible to build
> on top of low-level protocols, while it's generally never possible to
> simplify overly complex ones.
> 
> Linus
> 
> 



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-02-04 17:30       ` Douglas Gilbert
@ 2008-02-05  2:07         ` Chris Weiss
  2008-02-05 14:19           ` FUJITA Tomonori
  0 siblings, 1 reply; 29+ messages in thread
From: Chris Weiss @ 2008-02-05  2:07 UTC (permalink / raw)
  To: dougg
  Cc: Alan Cox, Mike Christie, Vladislav Bolkhovitin, linux-scsi,
	Linux Kernel Mailing List, Nicholas A. Bellinger,
	James Bottomley, scst-devel, Andrew Morton, Linus Torvalds,
	FUJITA Tomonori

On Feb 4, 2008 11:30 AM, Douglas Gilbert <dougg@torque.net> wrote:
> Alan Cox wrote:
> >> better. So for example, I personally suspect that ATA-over-ethernet is way
> >> better than some crazy SCSI-over-TCP crap, but I'm biased for simple and
> >> low-level, and against those crazy SCSI people to begin with.
> >
> > Current ATAoE isn't. It can't support NCQ. A variant that did NCQ and IP
> > would probably trash iSCSI for latency if nothing else.
>
> And a variant that doesn't do ATA or IP:
> http://www.fcoe.com/
>

however, and interestingly enough, the open-fcoe software target
depends on scst (for now anyway)

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-02-04 19:44   ` Linus Torvalds
@ 2008-02-04 20:06     ` 4news
  2008-02-04 22:43     ` Alan Cox
  1 sibling, 0 replies; 29+ messages in thread
From: 4news @ 2008-02-04 20:06 UTC (permalink / raw)
  To: scst-devel
  Cc: Linus Torvalds, Nicholas A. Bellinger, Mike Christie,
	Vladislav Bolkhovitin, linux-scsi, Linux Kernel Mailing List,
	James Bottomley, Andrew Morton, FUJITA Tomonori

On lunedì 4 febbraio 2008, Linus Torvalds wrote:
> So from a purely personal standpoint, I'd like to say that I'm not really
> interested in iSCSI (and I don't quite know why I've been cc'd on this
> whole discussion) and think that other approaches are potentially *much*
> better. So for example, I personally suspect that ATA-over-ethernet is way
> better than some crazy SCSI-over-TCP crap, but I'm biased for simple and
> low-level, and against those crazy SCSI people to begin with.

surely aoe beats iscsi on performance because of its smaller protocol
stack:
iscsi -> scsi - tcp/ip - eth
aoe   -> ata - eth

but surely iscsi is more of a standard than aoe and is more actively
used in the real world.

Other really useful features are that:
- iscsi can move scsi devices onto an ip-based san by routing them (i've
some tape changers routed by scst to systems that don't have any other
way to see a tape).
- because it works on the ip layer it can be routed over long distances,
so given the needed bandwidth you can have a really remote block device
speaking a standard protocol between heterogeneous systems.
- iscsi is now the cheapest san available.

bye,
marco.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-02-01 11:50                     ` Vladislav Bolkhovitin
@ 2008-02-01 12:25                       ` Vladislav Bolkhovitin
  0 siblings, 0 replies; 29+ messages in thread
From: Vladislav Bolkhovitin @ 2008-02-01 12:25 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: landman, fujita.tomonori, linux-scsi, rdreier, linux-kernel,
	Nicholas A. Bellinger, James.Bottomley, scst-devel, akpm,
	FUJITA Tomonori, torvalds

Vladislav Bolkhovitin wrote:
> Bart Van Assche wrote:
> 
>> On Jan 31, 2008 5:25 PM, Joe Landman <landman@scalableinformatics.com> 
>> wrote:
>>
>>> Vladislav Bolkhovitin wrote:
>>>
>>>> Actually, I don't know what conclusions can be drawn from disktest's
>>>> results (maybe only how throughput gets bigger or slower with an
>>>> increasing number of threads?); it's a good stress test tool, but
>>>> not more.
>>>
>>>
>>> Unfortunately, I agree.  Bonnie++, dd tests, and a few others seem
>>> far closer to "real world" tests than disktest and iozone, the
>>> latter of which does more to test the speed of RAM cache and system call
>>> performance than actual IO.
>>
>>
>>
>> I have run some tests with Bonnie++, but found out that on a fast
>> network like IB the filesystem used for the test has a really big
>> impact on the test results.
>>
>> If anyone has a suggestion for a better test than dd to compare the
>> performance of SCSI storage protocols, please let me know.
> 
> 
> I would suggest you try something from real life, like:
> 
>  - Copying large file tree over a single or multiple IB links
> 
>  - Measure of some DB engine's TPC
> 
>  - etc.

Forgot to mention. During those tests make sure that devices imported
from both SCST and STGT report the same write cache and FUA capabilities
in the kernel log, since these significantly affect the initiator's
behavior. Like:

sd 4:0:0:5: [sdf] Write cache: enabled, read cache: enabled, supports 
DPO and FUA

For SCST the fastest mode is NV_CACHE, refer to its README file for details.

>> Bart Van Assche.
>>
>>
> 
> -- 
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-01-31 17:08                   ` Bart Van Assche
  2008-01-31 17:13                     ` Joe Landman
  2008-01-31 18:12                     ` David Dillow
@ 2008-02-01 11:50                     ` Vladislav Bolkhovitin
  2008-02-01 12:25                       ` Vladislav Bolkhovitin
  2 siblings, 1 reply; 29+ messages in thread
From: Vladislav Bolkhovitin @ 2008-02-01 11:50 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: landman, fujita.tomonori, linux-scsi, rdreier, linux-kernel,
	Nicholas A. Bellinger, James.Bottomley, scst-devel, akpm,
	FUJITA Tomonori, torvalds

Bart Van Assche wrote:
> On Jan 31, 2008 5:25 PM, Joe Landman <landman@scalableinformatics.com> wrote:
> 
>>Vladislav Bolkhovitin wrote:
>>
>>>Actually, I don't know what conclusions can be drawn from disktest's
>>>results (maybe only how throughput gets bigger or slower with an
>>>increasing number of threads?); it's a good stress test tool, but
>>>not more.
>>
>>Unfortunately, I agree.  Bonnie++, dd tests, and a few others seem
>>far closer to "real world" tests than disktest and iozone, the
>>latter of which does more to test the speed of RAM cache and system call
>>performance than actual IO.
> 
> 
> I have run some tests with Bonnie++, but found out that on a fast
> network like IB the filesystem used for the test has a really big
> impact on the test results.
> 
> If anyone has a suggestion for a better test than dd to compare the
> performance of SCSI storage protocols, please let me know.

I would suggest you try something from real life, like:

  - Copying large file tree over a single or multiple IB links

  - Measure of some DB engine's TPC

  - etc.

> Bart Van Assche.
> 
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-01-31 18:12                     ` David Dillow
@ 2008-02-01 11:50                       ` Vladislav Bolkhovitin
  0 siblings, 0 replies; 29+ messages in thread
From: Vladislav Bolkhovitin @ 2008-02-01 11:50 UTC (permalink / raw)
  To: David Dillow
  Cc: Bart Van Assche, landman, James.Bottomley, linux-scsi, rdreier,
	linux-kernel, Nicholas A. Bellinger, fujita.tomonori, scst-devel,
	akpm, FUJITA Tomonori, torvalds

David Dillow wrote:
> On Thu, 2008-01-31 at 18:08 +0100, Bart Van Assche wrote:
> 
>>If anyone has a suggestion for a better test than dd to compare the
>>performance of SCSI storage protocols, please let me know.
> 
> 
> xdd on /dev/sda, sdb, etc. using -dio to do direct IO seems to work
> decently, though it is hard (i.e., impossible) to get a repeatable
> sequence of IO when using higher queue depths, as it uses threads to
> generate multiple requests.

This utility seems to be a good one, but it's basically the same as 
disktest, although much more advanced.

> You may also look at sgpdd_survey from Lustre's iokit, but I've not done
> much with that -- it uses the sg devices to send lowlevel SCSI commands.

Yes, it might be worth trying. Since fundamentally it's the same as
O_DIRECT dd, but with a bit less overhead on the initiator side (hence
less initiator-side latency), it will most likely show an even bigger
difference than dd does.
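The O_DIRECT dd baseline being discussed looks like the commented commands below; `/dev/sdX` is a placeholder for the imported device, and the runnable part targets a scratch file instead so nothing real is overwritten:

```shell
# Sequential read of an imported SCSI device, bypassing the page cache:
#   dd if=/dev/sdX of=/dev/null bs=1M count=1024 iflag=direct
# Sequential write (DESTROYS DATA on /dev/sdX):
#   dd if=/dev/zero of=/dev/sdX bs=1M count=1024 oflag=direct

# Harmless stand-in against a scratch file. O_DIRECT needs filesystem
# support; fall back to conv=fsync where it is unavailable (e.g. tmpfs).
img=$(mktemp)
dd if=/dev/zero of="$img" bs=1M count=4 oflag=direct 2>/dev/null ||
    dd if=/dev/zero of="$img" bs=1M count=4 conv=fsync 2>/dev/null
rm -f "$img"
echo done
```

O_DIRECT keeps the initiator's page cache out of the measurement, so the reported throughput reflects the target path rather than local RAM.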

> I've been playing around with some benchmark code using libaio, but it's
> not in generally usable shape.
> 
> xdd:
> http://www.ioperformance.com/products.htm
> 
> Lustre IO Kit:
> http://manual.lustre.org/manual/LustreManual16_HTML/DynamicHTML-20-1.html


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-01-31 17:08                   ` Bart Van Assche
  2008-01-31 17:13                     ` Joe Landman
@ 2008-01-31 18:12                     ` David Dillow
  2008-02-01 11:50                       ` Vladislav Bolkhovitin
  2008-02-01 11:50                     ` Vladislav Bolkhovitin
  2 siblings, 1 reply; 29+ messages in thread
From: David Dillow @ 2008-01-31 18:12 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: landman, Vladislav Bolkhovitin, James.Bottomley, linux-scsi,
	rdreier, linux-kernel, Nicholas A. Bellinger, fujita.tomonori,
	scst-devel, akpm, FUJITA Tomonori, torvalds


On Thu, 2008-01-31 at 18:08 +0100, Bart Van Assche wrote:
> If anyone has a suggestion for a better test than dd to compare the
> performance of SCSI storage protocols, please let me know.

xdd on /dev/sda, sdb, etc. using -dio to do direct IO seems to work
decently, though it is hard (i.e., impossible) to get a repeatable
sequence of IO when using higher queue depths, as it uses threads to
generate multiple requests.

You may also look at sgpdd_survey from Lustre's iokit, but I've not done
much with that -- it uses the sg devices to send lowlevel SCSI commands.

I've been playing around with some benchmark code using libaio, but it's
not in generally usable shape.

xdd:
http://www.ioperformance.com/products.htm

Lustre IO Kit:
http://manual.lustre.org/manual/LustreManual16_HTML/DynamicHTML-20-1.html
-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-01-31 17:08                   ` Bart Van Assche
@ 2008-01-31 17:13                     ` Joe Landman
  2008-01-31 18:12                     ` David Dillow
  2008-02-01 11:50                     ` Vladislav Bolkhovitin
  2 siblings, 0 replies; 29+ messages in thread
From: Joe Landman @ 2008-01-31 17:13 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Vladislav Bolkhovitin, James.Bottomley, linux-scsi, rdreier,
	linux-kernel, Nicholas A. Bellinger, fujita.tomonori, scst-devel,
	akpm, FUJITA Tomonori, torvalds

Bart Van Assche wrote:

> I have run some tests with Bonnie++, but found out that on a fast
> network like IB the filesystem used for the test has a really big
> impact on the test results.

This is true of file systems when physically connected directly to 
the unit as well.  Some file systems are designed with high performance 
in mind, some are not.

> If anyone has a suggestion for a better test than dd to compare the
> performance of SCSI storage protocols, please let me know.

Hmmm... if you care about the protocol side, I can't help.  Our users 
are more concerned with the file system side, so this is where we focus 
our tuning attention.

> 
> Bart Van Assche.

Joe

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman@scalableinformatics.com
web  : http://www.scalableinformatics.com
        http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615


* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-01-31 16:25                 ` [Scst-devel] " Joe Landman
@ 2008-01-31 17:08                   ` Bart Van Assche
  2008-01-31 17:13                     ` Joe Landman
                                       ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Bart Van Assche @ 2008-01-31 17:08 UTC (permalink / raw)
  To: landman
  Cc: Vladislav Bolkhovitin, James.Bottomley, linux-scsi, rdreier,
	linux-kernel, Nicholas A. Bellinger, fujita.tomonori, scst-devel,
	akpm, FUJITA Tomonori, torvalds

On Jan 31, 2008 5:25 PM, Joe Landman <landman@scalableinformatics.com> wrote:
> Vladislav Bolkhovitin wrote:
> > Actually, I don't know what conclusions can be drawn from disktest's
> > results (maybe only how throughput scales with the number of threads?);
> > it's a good stress-test tool, but not more.
>
> Unfortunately, I agree.  Bonnie++, dd tests, and a few others seem to
> come far closer to "real world" tests than disktest and iozone, the
> latter of which does more to test the speed of RAM cache and system-call
> performance than actual IO.

I have run some tests with Bonnie++, but found out that on a fast
network like IB the filesystem used for the test has a really big
impact on the test results.

If anyone has a suggestion for a better test than dd to compare the
performance of SCSI storage protocols, please let me know.

Bart Van Assche.


* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-01-31 15:50               ` Vladislav Bolkhovitin
@ 2008-01-31 16:25                 ` Joe Landman
  2008-01-31 17:08                   ` Bart Van Assche
  0 siblings, 1 reply; 29+ messages in thread
From: Joe Landman @ 2008-01-31 16:25 UTC (permalink / raw)
  To: Vladislav Bolkhovitin
  Cc: Bart Van Assche, James.Bottomley, linux-scsi, rdreier,
	linux-kernel, Nicholas A. Bellinger, fujita.tomonori, scst-devel,
	akpm, FUJITA Tomonori, torvalds

Vladislav Bolkhovitin wrote:
> Bart Van Assche wrote:

[...]

>> I can run disktest on the same setups I ran dd on. This will take some
>> time however.
> 
> Disktest was already referenced in the beginning of the performance 
> comparison thread, but its results are not very interesting if we want 
> to find out which implementation is more effective, because in the modes 
> in which people usually run this utility it produces a latency-insensitive 
> workload (multiple threads working in parallel). So, such 

There are other issues with disktest, in that you can easily specify 
option combinations that generate apparently 5+ GB/s of IO, though 
actual traffic over the link to storage is very low.  Caveat disktest 
emptor.

> multithreaded disktest results will differ between STGT and SCST 
> only if STGT's implementation becomes target-CPU bound. If the CPU on the 
> target is powerful enough, even extra busy loops in the STGT or SCST hot 
> path code will change nothing.
> 
> Additionally, multithreaded disktest over a RAM disk is a good example of 
> a synthetic benchmark, which has almost no relation to real-life 
> workloads. But people like it, because it produces nice-looking results.

I agree.  The backing store should be a disk for it to have meaning, 
though please note my caveat above.

> 
> Actually, I don't know what conclusions can be drawn from disktest's 
> results (maybe only how throughput scales with the number of threads?); 
> it's a good stress-test tool, but not more.

Unfortunately, I agree.  Bonnie++, dd tests, and a few others seem to 
come far closer to "real world" tests than disktest and iozone, the 
latter of which does more to test the speed of RAM cache and system-call 
performance than actual IO.


>> Disktest is new to me -- any hints with regard to suitable
>> combinations of command line parameters are welcome. The most recent
>> version I could find on http://ltp.sourceforge.net/ is ltp-20071231.
>>
>> Bart Van Assche.

Here is what I have run:

disktest -K 8 -B 256k  -I F -N 20000000 -P A -w /big/file
disktest -K 8 -B 64k   -I F -N 20000000 -P A -w /big/file
disktest -K 8 -B 1k    -I B -N 2000000  -P A  /dev/sdb2

and many others.



Joe


-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman@scalableinformatics.com
web  : http://www.scalableinformatics.com
        http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615


* Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
  2008-01-29 23:32     ` FUJITA Tomonori
@ 2008-01-30  1:15       ` Vu Pham
  2008-01-30  8:38       ` Bart Van Assche
  1 sibling, 0 replies; 29+ messages in thread
From: Vu Pham @ 2008-01-30  1:15 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: rdreier, vst, linux-scsi, linux-kernel, James.Bottomley,
	scst-devel, akpm, torvalds

FUJITA Tomonori wrote:
> On Tue, 29 Jan 2008 13:31:52 -0800
> Roland Dreier <rdreier@cisco.com> wrote:
> 
>>  > .                           .   STGT read     SCST read    .    STGT read      SCST read    .
>>  > .                           .  performance   performance   . performance    performance   .
>>  > .                           .  (0.5K, MB/s)  (0.5K, MB/s)  .   (1 MB, MB/s)   (1 MB, MB/s)  .
>>  > . iSER     (8 Gb/s network) .     250            N/A       .       360           N/A       .
>>  > . SRP      (8 Gb/s network) .     N/A            421       .       N/A           683       .
>>
>>  > On the comparable figures, which only seem to be IPoIB they're showing a
>>  > 13-18% variance, aren't they?  Which isn't an incredible difference.
>>
>> Maybe I'm all wet, but I think iSER vs. SRP should be roughly
>> comparable.  The exact formatting of various messages etc. is
>> different but the data path using RDMA is pretty much identical.  So
>> the big difference between STGT iSER and SCST SRP hints at some big
>> difference in the efficiency of the two implementations.
> 
> iSER has parameters to limit the maximum size of RDMA (it needs to
> repeat RDMA with a poor configuration)?
> 
> 
> Anyway, here's the results from Robin Humble:
> 
> iSER to 7G ramfs, x86_64, centos4.6, 2.6.22 kernels, git tgtd,
> initiator end booted with mem=512M, target with 8G ram
> 
>  direct i/o dd
>   write/read  800/751 MB/s
>     dd if=/dev/zero of=/dev/sdc bs=1M count=5000 oflag=direct
>     dd of=/dev/null if=/dev/sdc bs=1M count=5000 iflag=direct
> 

Both Robin (iSER/STGT) and Bart (SCST/SRP) were using ramfs

Robin's numbers come from DDR IB HCAs

Bart's numbers come from SDR IB HCAs:
Results with /dev/ram0 configured as backing store on the
target (buffered I/O):
                   Read          Write          Read          Write
                   performance   performance    performance   performance
                   (0.5K, MB/s)  (0.5K, MB/s)   (1 MB, MB/s)  (1 MB, MB/s)
STGT + iSER           250            48            349           781
SCST + SRP            411            66            659           746

Results with /dev/ram0 configured as backing store on the
target (direct I/O):
                   Read          Write          Read          Write
                   performance   performance    performance   performance
                   (0.5K, MB/s)  (0.5K, MB/s)   (1 MB, MB/s)  (1 MB, MB/s)
STGT + iSER             7.9           9.8          589           647
SCST + SRP             12.3          9.7           811           794

http://www.mail-archive.com/linux-scsi@vger.kernel.org/msg13514.html

Here are my numbers with DDR IB HCAs, SCST/SRP 5G /dev/ram0 
block_io mode, RHEL5 2.6.18-8.el5

direct i/o dd
    write/read  1100/895 MB/s
      dd if=/dev/zero of=/dev/sdc bs=1M count=5000 oflag=direct
      dd of=/dev/null if=/dev/sdc bs=1M count=5000 iflag=direct

buffered i/o dd
    write/read  950/770 MB/s
      dd if=/dev/zero of=/dev/sdc bs=1M count=5000
      dd of=/dev/null if=/dev/sdc bs=1M count=5000

So when using DDR IB HCAs:

               stgt/iser   scst/srp
direct I/O     800/751     1100/895
buffered I/O   1109/350    950/770
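The direct-I/O dd pattern above is easy to reproduce without an SRP/iSER target attached. Here is a rough equivalent in Python against a scratch file (a sketch only: block size and count are arbitrary stand-ins for the bs=1M count=5000 runs, and O_DIRECT falls back to buffered I/O on filesystems that reject it):

```python
import mmap
import os
import tempfile

BS = 1024 * 1024        # 1 MiB blocks, matching the dd bs=1M runs
COUNT = 4               # small stand-in for dd count=5000

# O_DIRECT requires a sector-aligned buffer; an anonymous mmap is
# page-aligned, which satisfies the alignment on common sector sizes.
buf = mmap.mmap(-1, BS)

path = tempfile.mktemp()
flags = os.O_WRONLY | os.O_CREAT
try:
    fd = os.open(path, flags | getattr(os, "O_DIRECT", 0))
except OSError:
    # tmpfs and some other filesystems reject O_DIRECT; fall back
    fd = os.open(path, flags)

for _ in range(COUNT):
    os.write(fd, buf)   # each write moves one aligned 1 MiB block
os.close(fd)

size = os.path.getsize(path)
os.unlink(path)
print(size)
```

Timing such a loop (and the matching read direction) gives the same throughput figure the dd invocations report, minus dd's own overhead.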


-vu
> http://www.mail-archive.com/linux-scsi@vger.kernel.org/msg13502.html
> 
> I think that STGT is pretty fast with fast backing storage.
> 
> 
> I don't think that there is a notable performance difference between
> kernel-space and user-space SRP (or iSER) implementations when moving
> data between hosts. IB is expected to enable user-space applications
> to move data between hosts quickly (if not, what can IB provide us?).
> 
> I think that the question is how fast user-space applications can do
> I/Os compared with I/Os in kernel space. STGT is eager for the advent
> of good asynchronous I/O and event notification interfaces.
> 
> 
> One more possible optimization for STGT is zero-copy data
> transfer. STGT uses pre-registered buffers and moves data between page
> cache and these buffers, and then does RDMA transfer. If we implement
> our own caching mechanism to use pre-registered buffers directly (with
> AIO and O_DIRECT), then STGT can move data without data copies.
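The zero-copy idea above -- allocate an aligned buffer once, "register" it, and have I/O land in it directly instead of bouncing through the page cache -- can be sketched as follows (Python for brevity; a real target would register the region with the HCA via ibv_reg_mr() and drive the I/O asynchronously with libaio, neither of which is shown here):

```python
import mmap
import os
import tempfile

BS = 512 * 1024

# Stand-in for a pre-registered RDMA buffer: allocated once, page-aligned,
# and reused for every transfer instead of allocating and copying per request.
registered = mmap.mmap(-1, BS)

# Scratch "backing store" to read from.
path = tempfile.mktemp()
with open(path, "wb") as f:
    f.write(os.urandom(BS))

try:
    fd = os.open(path, os.O_RDONLY | getattr(os, "O_DIRECT", 0))
except OSError:
    fd = os.open(path, os.O_RDONLY)   # filesystem may not support O_DIRECT

# readv() fills the registered buffer in place -- the data ends up where
# the RDMA engine expects it, with no intermediate bounce-buffer copy.
n = os.readv(fd, [registered])
os.close(fd)
os.unlink(path)
print(n)
```

With O_DIRECT the read bypasses the page cache entirely, which is exactly the copy this optimization is trying to eliminate.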
> 



end of thread, other threads:[~2008-02-13  3:58 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-02-09  7:44 [Scst-devel] Integration of SCST in the mainstream Linux kernel Luben Tuikov
  -- strict thread matches above, loose matches on Subject: below --
2008-02-05  0:24 Linus Torvalds
2008-02-05  4:43 ` [Scst-devel] " Matteo Tescione
2008-02-05  5:07   ` James Bottomley
2008-02-05 13:38   ` FUJITA Tomonori
2008-01-23 14:22 Bart Van Assche
2008-01-29 20:42 ` James Bottomley
2008-01-29 21:31   ` Roland Dreier
2008-01-29 23:32     ` FUJITA Tomonori
2008-01-30  1:15       ` [Scst-devel] " Vu Pham
2008-01-30  8:38       ` Bart Van Assche
2008-01-30 10:56         ` FUJITA Tomonori
2008-01-31 13:25           ` Nicholas A. Bellinger
2008-01-31 14:34             ` Bart Van Assche
2008-01-31 15:50               ` Vladislav Bolkhovitin
2008-01-31 16:25                 ` [Scst-devel] " Joe Landman
2008-01-31 17:08                   ` Bart Van Assche
2008-01-31 17:13                     ` Joe Landman
2008-01-31 18:12                     ` David Dillow
2008-02-01 11:50                       ` Vladislav Bolkhovitin
2008-02-01 11:50                     ` Vladislav Bolkhovitin
2008-02-01 12:25                       ` Vladislav Bolkhovitin
2008-01-30  8:29   ` Bart Van Assche
2008-01-30 16:22     ` James Bottomley
2008-02-05  7:14       ` [Scst-devel] " Tomasz Chmielewski
2008-02-05 13:38         ` FUJITA Tomonori
2008-02-05 16:07           ` Tomasz Chmielewski
2008-02-05 16:21             ` Ming Zhang
2008-02-05 16:43             ` FUJITA Tomonori
2008-02-05 17:09           ` Matteo Tescione
2008-02-06  1:29             ` FUJITA Tomonori
2008-02-06  2:01               ` Nicholas A. Bellinger
2008-02-04 16:25 ` Vladislav Bolkhovitin
2008-02-04 17:06   ` James Bottomley
2008-02-04 17:16     ` Vladislav Bolkhovitin
2008-02-04 17:25       ` James Bottomley
2008-02-04 17:56         ` Vladislav Bolkhovitin
2008-02-04 18:22           ` James Bottomley
2008-02-04 18:38             ` Vladislav Bolkhovitin
2008-02-04 18:54               ` James Bottomley
2008-02-05 18:59                 ` Vladislav Bolkhovitin
2008-02-05 19:13                   ` James Bottomley
2008-02-07 13:13                     ` [Scst-devel] " Bart Van Assche
2008-02-07 15:38                       ` Nicholas A. Bellinger
2008-02-07 20:37                         ` Luben Tuikov
2008-02-08 11:53                           ` Nicholas A. Bellinger
2008-02-04 18:29         ` Linus Torvalds
2008-02-04 19:06 ` Nicholas A. Bellinger
2008-02-04 19:44   ` Linus Torvalds
2008-02-04 20:06     ` [Scst-devel] " 4news
2008-02-04 22:43     ` Alan Cox
2008-02-04 17:30       ` Douglas Gilbert
2008-02-05  2:07         ` [Scst-devel] " Chris Weiss
2008-02-05 14:19           ` FUJITA Tomonori
2008-02-04 23:04       ` Jeff Garzik
2008-02-05 19:01         ` Vladislav Bolkhovitin
2008-02-05 19:12           ` Jeff Garzik
2008-02-05 19:21             ` Vladislav Bolkhovitin
2008-02-06  0:11               ` Nicholas A. Bellinger
2008-02-12 16:05                 ` [Scst-devel] " Bart Van Assche
2008-02-13  3:44                   ` Nicholas A. Bellinger
