LKML Archive on lore.kernel.org
From: Pete Zaitcev <zaitcev@redhat.com>
To: ebuddington@wesleyan.edu
Cc: Eric Buddington <ebuddington@verizon.net>,
	linux-kernel@vger.kernel.org, zaitcev@redhat.com
Subject: Re: USB misbehavior causes system hang
Date: Tue, 27 Feb 2007 13:49:40 -0800
Message-ID: <20070227134940.63736039.zaitcev@redhat.com>
In-Reply-To: <20070227140610.GC6850@pool-71-123-99-133.spfdma.east.verizon.net>

On Tue, 27 Feb 2007 09:06:21 -0500, Eric Buddington <ebuddington@verizon.net> wrote:

> sd 1:0:0:0: rejecting I/O to offline device
> ...
> SoftDog: Initiating system reboot.

> Now, the USB problem may well be a device or cabling issue, but I
> don't think that this drive failure should trigger a reboot - I assume
> the drive failure is somehow constipating the entire disk I/O system,
> and preventing my softdog-patting script from running.

Have you tried ub? In theory, its threadless design is supposed to
help with just this kind of problem. Please let me know, I'm very
curious.

However, the main issue here is the OOM with all the dirty data.
We saw that before. For some weird reason, ext3 is especially good
at producing immense amounts of write-out. Are you on ext3 or
VFAT on that drive?
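
One rough way to watch the dirty data build up while the drive misbehaves
is to sample the Dirty and Writeback fields of /proc/meminfo. The sketch
below is only an illustration: the function names parse_dirty_kb and
read_dirty_kb are made up for this example, and it assumes the usual
"Name:   value kB" layout of that file.

```python
def parse_dirty_kb(meminfo_text):
    """Return the Dirty and Writeback values (in kB) found in /proc/meminfo text."""
    wanted = {"Dirty", "Writeback"}
    values = {}
    for line in meminfo_text.splitlines():
        # Lines look like "Dirty:              1234 kB".
        name, sep, rest = line.partition(":")
        if sep and name in wanted:
            values[name] = int(rest.split()[0])
    return values


def read_dirty_kb(path="/proc/meminfo"):
    """Read the live values; the path is a parameter so a saved copy can be parsed too."""
    with open(path) as meminfo:
        return parse_dirty_kb(meminfo.read())
```

Calling read_dirty_kb() every few seconds and logging the result should
show whether the amount of dirty memory keeps growing before the hang.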

Please try to find the CPU traces by hitting SysRq-w, SysRq-p. The CPU
is looping under a lock somewhere and eventually it causes the watchdog
to trigger. It may be a USB issue, maybe a VM issue. I can't tell
until we get stack traces.
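
If the keyboard combination is not available (for example on a remote
box), the same SysRq-w and SysRq-p requests can be issued by writing the
command letters to /proc/sysrq-trigger, provided the machine still
responds well enough to run a command as root. The following is a minimal
sketch, not a tested tool: send_sysrq is a hypothetical helper name, and
the traces themselves end up in the kernel log, not in the return value.

```python
SYSRQ_TRIGGER = "/proc/sysrq-trigger"


def send_sysrq(keys, trigger_path=SYSRQ_TRIGGER):
    """Write each SysRq command letter to the trigger file, one letter per write.

    Needs root when pointed at the real /proc/sysrq-trigger. Returns the
    list of letters that were written.
    """
    sent = []
    for key in keys:
        # Reopen for every letter so each write carries exactly one command.
        with open(trigger_path, "w") as trigger:
            trigger.write(key)
        sent.append(key)
    return sent
```

Run as root, send_sysrq("wp") requests the blocked-task dump and the
register dump in turn; the output can then be collected with dmesg or
from a serial console.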

This does not help you deal with the unreliable drive, I'm afraid,
but it would be a great service if you pinned down the reason for the
looping.

-- Pete
