LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Jarod Wilson <jwilson@redhat.com>
To: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: linux1394-devel@lists.sourceforge.net, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 9/9] firewire: fw-sbp2: fix I/O errors during reconnect
Date: Tue, 12 Feb 2008 00:07:25 -0500	[thread overview]
Message-ID: <47B1298D.2000600@redhat.com> (raw)
In-Reply-To: <47B0AE54.8090602@s5r6.in-berlin.de>

Stefan Richter wrote:
> Jarod Wilson wrote:
>> Stefan Richter wrote:
>>> +static void sbp2_conditionally_block(struct sbp2_logical_unit *lu)
>>> +{
>>> +    struct fw_card *card =
>>> fw_device(lu->tgt->unit->device.parent)->card;
>>> +
>>> +    if (!atomic_read(&lu->tgt->dont_block) &&
>>> +        lu->generation != card->generation &&
>>> +        atomic_cmpxchg(&lu->blocked, 0, 1) == 0) {
>> Just to be absolutely sure, we don't need any barriers here to ensure we
>> get the right generations, do we?
> 
> I didn't think so.  But I will carefully look at it again later this
> week.  The function definitely must not block the device when the
> generation is current.  We look at two data fields here which makes this
> even more problematic.  Could be that we need locks after all.

I didn't see anything else earlier on that guaranteed we got the current 
generation in both cases, but I didn't look exhaustively, and you know 
this code a lot better than I do, so I definitely defer to your better 
judgment, just wanted to make sure it had been considered.


>> Also, this isn't expected to let I/O survive a disk being unplugged
>> briefly, then plugged back in, is it?
> 
> No, this only tells the SCSI core to not bother fw-sbp2's
> .queuecommand() with new commands before reconnect.  This will
> mysteriously convince the SCSI core to not put the device offline too
> quickly and will stabilize application client behavior thanks to
> considerably fewer command retries.

Okay, that's what I thought and what I observed in operation.

> To survive real or perceived temporary unplugs ("perceived" unplugs can
> happen if a third node is slowly plugged in or out), we need to do
> something in fw-device.c.  We have to keep the fw_device around after
> node removal event until a timeout, to check newly added devices whether
> they are in fact one of the undead devices, and to revive that one
> rather than creating a new one.

Gotcha. So basically, a temporary table of devices recently "removed", 
which will expire to full/actual removal after some relatively short 
interval, but which we'll also check for matches of "newly" connected 
devices to see if we should cancel removing the device and just pretend 
like it never actually went away. Right? I wonder if there's any sort of 
guidance on this sort of thing in the firewire specs... I'll make an 
effort to search for relevant info, unless you've already got it.

>> (I recall that being discussed, but I think it was as a 'would be
>> nice to do in the future' thing).
> 
> I realized now that it is a 'need it sooner than later' thing because of
> these "perceived" unplugs.  We need this feature at least with a minimal
> timeout, otherwise people will sometimes lose connection to their
> devices (the scsi_device will be destroyed and a new one created) when
> they plug a 3rd or 4th or nth node.  As mentioned in another post, this
> is an actual regression for those who migrated from ieee1394 to fw-core.
> But fear not, it looks like I will have a prolonged weekend.  :-)

Heh, sounds good. I missed out on my entire past weekend (and half a 
week of work) due to family illnesses. Hoping to throw a bunch of time 
at further firewire work this week though.

-- 
Jarod Wilson
jwilson@redhat.com

  reply	other threads:[~2008-02-12  5:07 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-02-03 22:00 [PATCH 0/9] firewire-sbp2: misc hotplug related patches Stefan Richter
2008-02-03 22:03 ` [PATCH 1/9] firewire: log GUID of new devices Stefan Richter
2008-02-04  8:14   ` Stefan Richter
2008-02-11 16:53   ` Jarod Wilson
2008-02-03 22:04 ` [PATCH 2/9] firewire: fw-sbp2: add INQUIRY delay workaround Stefan Richter
2008-02-11 17:01   ` Jarod Wilson
2008-02-03 22:07 ` [PATCH 3/9] ieee1394: sbp2: " Stefan Richter
2008-02-11 17:03   ` Jarod Wilson
2008-02-03 22:08 ` [PATCH 4/9] firewire: fw-sbp2: wait for completion of fetch agent reset Stefan Richter
2008-02-04  8:11   ` Stefan Richter
2008-02-03 22:09 ` [PATCH 5/9] firewire: fw-sbp2: log bus_id at management request failures Stefan Richter
2008-02-11 17:16   ` Jarod Wilson
2008-02-03 22:10 ` [PATCH 6/9] firewire: fw-sbp2: don't add scsi_device twice Stefan Richter
2008-02-11 17:19   ` Jarod Wilson
2008-02-11 19:42     ` Stefan Richter
2008-02-12  8:55       ` Stefan Richter
2008-02-03 22:11 ` [PATCH 7/9] firewire: fw-sbp2: logout and login after failed reconnect Stefan Richter
2008-02-11 17:32   ` Jarod Wilson
2008-02-03 22:12 ` [PATCH 8/9] firewire: fw-sbp2: sort includes Stefan Richter
2008-02-03 22:13 ` [PATCH 9/9] firewire: fw-sbp2: fix I/O errors during reconnect Stefan Richter
2008-02-11 18:09   ` Jarod Wilson
2008-02-11 20:21     ` Stefan Richter
2008-02-12  5:07       ` Jarod Wilson [this message]
2008-02-12  8:01         ` Stefan Richter
2008-02-16 15:37       ` Stefan Richter
2008-02-16 15:51         ` Stefan Richter
2008-02-04 15:54 ` [PATCH 0/9] firewire-sbp2: misc hotplug related patches John Stoffel
2008-02-04 17:48   ` Stefan Richter
2008-02-04 18:51     ` John Stoffel
2008-02-06  5:17 ` Jarod Wilson
2008-02-06 18:27   ` Stefan Richter
2008-02-06 21:09     ` [PATCH 11/9] firewire: fw-sbp2: enforce a retry of __scsi_add_device if bus generation changed Stefan Richter
2008-02-08 18:54       ` Jarod Wilson
2008-02-08 19:58         ` Stefan Richter
2008-02-08 21:33           ` [PATCH 11/9 update] " Stefan Richter
2008-02-10 18:36             ` Jarod Wilson
2008-02-16 15:01               ` Stefan Richter
2008-02-06 21:07 ` [PATCH 10/9] firewire: fw-sbp2: preemptively block sdev Stefan Richter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=47B1298D.2000600@redhat.com \
    --to=jwilson@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux1394-devel@lists.sourceforge.net \
    --cc=stefanr@s5r6.in-berlin.de \
    --subject='Re: [PATCH 9/9] firewire: fw-sbp2: fix I/O errors during reconnect' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).