From: "Michael K. Edwards" <>
To: "Michael K. Edwards" <>,
	"Jose Goncalves" <>,
	"Frederik Deweerdt" <>,,
Subject: Re: Serial related oops
Date: Mon, 19 Feb 2007 18:17:47 -0800
Message-ID: <>
In-Reply-To: <>

On 2/19/07, Russell King <> wrote:
> This can't happen because when __do_irq unmasks the interrupt source,
> the CPU mask is set, thereby preventing any further interrupt exceptions
> being taken.  This is done precisely to prevent this situation happening.
> If you are seeing recursion for the same interrupt (two or more stack
> frames containing asm_do_IRQ for that very same IRQ) then your interrupt
> handling is buggy, plain and simple.

Imaginable.  I'll look at the mask/unmask code.  Thanks.

> I don't doubt that it is on the same IRQ line - I have such setups here
> and it works perfectly - multiple 8250 UARTs connected to a single
> level-triggered interrupt input which also happens to be shared with
> a SCSI host chip as well.  Absolutely no problems.

Can you do me a favor?  In the sys_open("/dev/console") path, turn on
the right bits in that second uart's IER, then insert a sleep in
request_irq or something (wherever seems best based on that
backtrace), and feed enough characters into the second UART during
that sleep to generate an IRQ.  Do you not get the same soft lockup?

> I still say that your understanding is completely flawed.  Moreover,
> you haven't read what I've said about the ordering of initialisation,
> the stress on when we disable interrupts for the ports, etc.

Well, all I can say is that that's a real backtrace and it shouldn't
be hard to reproduce if it's anything other than a broken interrupt
controller or broken code called by the __do_irq postamble.  I don't
see any platform-provided unmask routines in that backtrace, but maybe
it got inlined; I'll go back and check.

> You're actually *not* helping.  You're causing utter confusion through
> misunderstanding, but it seems you're not open to the possibility that
> your understanding is flawed.

Still open, though it's a pity you're more interested in my flawed
understanding than in the possibility that the kernel could be
systematically made more robust against hardware bugs and coding
errors by the simple expedient of putting all the ISRs in before
turning on any IRQ that might be shared.  Or are you telling me that's
already been done?  (Yes, I am aware that this interacts
entertainingly with hot-plug PCI.  Yes, I am aware that there is a
limit to how much software can fix stupid hardware.  But surely there
is room for an emergency IRQ suppressor to let chip initialization
code kick in and force the hardware to a known state.)
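
To make the ordering argument concrete, here is a toy user-space
model of a shared, level-triggered line.  It is only a sketch and not
kernel code: struct sim_dev, line_asserted, run_handlers and
service_line are names invented for illustration, and the max_passes
bound stands in for the hypothetical "emergency IRQ suppressor", i.e.
the point at which something would give up and mask the line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEVS 4   /* arbitrary bound for this toy model */

/* Hypothetical model of one device on a shared, level-triggered line. */
struct sim_dev {
    bool irq_enabled;   /* e.g. interrupt bits set in a UART's IER */
    bool irq_pending;   /* device has an unserviced interrupt condition */
    bool has_handler;   /* an ISR has been installed for this device */
};

/* Level-triggered: the line is high while any enabled device is pending. */
static bool line_asserted(const struct sim_dev *devs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (devs[i].irq_enabled && devs[i].irq_pending)
            return true;
    return false;
}

/* One dispatch pass: each installed handler quiets its own device. */
static void run_handlers(struct sim_dev *devs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (devs[i].has_handler)
            devs[i].irq_pending = false;
}

/*
 * Dispatch until the line drops.  Returns the number of passes taken,
 * or -1 if the line is still asserted after max_passes (the simulated
 * stuck IRQ, where a suppressor would have to mask the line).
 */
static int service_line(struct sim_dev *devs, size_t n, int max_passes)
{
    int passes = 0;

    assert(n <= MAX_DEVS);
    while (line_asserted(devs, n)) {
        if (passes == max_passes)
            return -1;
        run_handlers(devs, n);
        passes++;
    }
    return passes;
}
```

In this model, two devices that are both enabled and pending clear in
one pass when both have handlers.  If the second device is enabled
before its handler is installed, service_line never sees the line drop
and returns -1.  If that device's interrupt is left disabled until its
handler is in place, it cannot hold the line and the first device
clears normally.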

> I'm offering to look through your code and point you at the source of
> your issue for free.  Please don't throw that offer away without first
> considering that maybe I have a clue about what's going on here.

I appreciate that offer, and I hope to take advantage of it as soon as
I have the source code at my fingertips (not just the chat log where I
recorded the backtrace).

> ... which showed the port being opened well after system initialisation
> of devices, including all serial ports - including disabling of their
> interrupt source at the IER, has been completed.

Now that you mention it, the backtrace I sent is the
serial8250_startup one, not the serial8250_init one.  Sorry, this
one's probably an artifact of brain damage specific to this UART.  I
need to dig through a different account to find the init-path example;
but in either case, we're getting a new interrupt during the __do_irq
postamble.  If you're telling me that that shouldn't happen, what
should the backtrace for a soft lockup due to a stuck level-triggered
IRQ look like on ARM?

> Yes, and it's the same for any serial console with functioning break
> support.  You'll find it in Documentation/sysrq.txt, though it does
> misleadingly say "PC style standard serial ports only" whereas the
> reality is "where possible".

Thank you very much; this will help me get to the bottom of some other
chip-support nastiness on this device.

- Michael
