LKML Archive on
From: (Eric W. Biederman)
Cc: Zwane Mwaikambo <>,
	Ashok Raj <>, Ingo Molnar <>,
	Andrew Morton <>, "Lu, Yinghai" <>,
	Natalie Protasevich <>, Andi Kleen <>,
	"Siddha, Suresh B" <>,
	Linus Torvalds <>
Subject: Conclusions from my investigation about ioapic programming
Date: Fri, 23 Feb 2007 03:51:04 -0700	[thread overview]
Message-ID: <> (raw)
In-Reply-To: <> (Eric W. Biederman's message of "Sun, 11 Feb 2007 21:51:05 -0700")

Ok. This is just an email to summarize my findings after investigating
the ioapic programming.

The ioapics on the E75xx chipset do have issues if you attempt to
reprogram them outside of the irq handler.  On several occasions I
caused the state machine to get stuck such that an individual ioapic
entry was no longer capable of delivering interrupts.  I suspect the
remote IRR bit was stuck set, such that switching the irq to edge
triggered and back to level triggered would not clear it, but I did
not confirm this.  I just know that I was switching the irq between
level and edge triggered with the irq masked and the irq did not
fire.

The ioapics on the AMD 8xxx chipset do have issues if you attempt
to reprogram them outside of the irq handler.  I wound up with
remote IRR set and never clearing.  But by temporarily switching
the irq to edge triggered while it was masked I could clear
this condition.
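
To make the AMD 8xxx workaround concrete, here is a minimal sketch of
the write sequence, assuming the standard 82093AA layout for the low
word of a redirection table entry (remote IRR in bit 14, trigger mode
in bit 15, mask in bit 16).  The function name and the idea of
returning the values in an array are mine, purely for illustration;
actually writing the values through the ioapic register window and
unmasking afterwards are left to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Low 32 bits of an ioapic redirection table entry (82093AA layout). */
#define RTE_REMOTE_IRR  (1u << 14)  /* status bit, read-only on real hardware */
#define RTE_TRIGGER_LVL (1u << 15)  /* 1 = level triggered, 0 = edge triggered */
#define RTE_MASKED      (1u << 16)  /* 1 = interrupt masked */

/*
 * Hypothetical helper: compute the three values to write, in order, to
 * clear a stuck remote IRR on a level triggered entry:
 *   seq[0]  mask the entry, trigger mode unchanged
 *   seq[1]  still masked, switched to edge triggered
 *   seq[2]  still masked, switched back to level triggered
 */
static void rte_clear_irr_sequence(uint32_t rte_low, uint32_t seq[3])
{
    assert(seq);

    /* Never write the status bit back. */
    uint32_t base = rte_low & ~RTE_REMOTE_IRR;

    seq[0] = base | RTE_MASKED;
    seq[1] = (base | RTE_MASKED) & ~RTE_TRIGGER_LVL;
    seq[2] = base | RTE_MASKED | RTE_TRIGGER_LVL;
}
```

Whether the edge/level round trip really clears remote IRR is chipset
behaviour; it worked for me on the AMD 8xxx and apparently not on the
E75xx.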

I could not hit verifiable bugs in the ioapics on the Nforce4
chipset.  Amazingly, that is the one part of that chipset that I
can't find issues with.

I did find an algorithm that will work successfully for migrating
IRQs in process context if you have an ioapic that will follow pci
ordering rules.  In particular, the properties that the algorithm
depends on are reads guaranteeing that outstanding writes are flushed,
and in this context irqs in flight are considered writes.  I have
assumed that to devices outside of the cpu asic the cpu and the local
apic appear as the same device.

The algorithm was:
- Be running with interrupts enabled in process context.
- Mask the ioapic.
- Read the ioapic to flush outstanding irqs to the local apic.
- Read the local apic to flush outstanding irqs to be sent to the cpu.

- Now that all of the irqs have been delivered and the irq is masked
  that irq is finally quiescent.

- With the irq quiescent it is safe to reprogram the interrupt controller
  and the irq reception data structures.

There were a lot more details but that was the essence.
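
The steps above can be sketched as a toy user-space model.  This is
not kernel code: the struct, the counters and all function names are
invented for illustration, and the two "flush" functions simply encode
the pci ordering assumption that a read completes only after the irq
messages ahead of it have arrived.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: how many irq messages sit at each stage of delivery. */
struct irq_path {
    bool masked;        /* ioapic entry is masked */
    int  in_flight;     /* sent by the ioapic, not yet at the local apic */
    int  lapic_pending; /* at the local apic, not yet handled by the cpu */
    int  delivered;     /* handled by the cpu */
};

/* The device asserts its line; a masked entry sends nothing. */
static void device_raise(struct irq_path *p)
{
    if (!p->masked)
        p->in_flight++;
}

/* ASSUMPTION: an ioapic read pushes messages already sent to the local apic. */
static void ioapic_read_flush(struct irq_path *p)
{
    p->lapic_pending += p->in_flight;
    p->in_flight = 0;
}

/* ASSUMPTION: a local apic read, with irqs enabled, lets pending irqs run. */
static void lapic_read_flush(struct irq_path *p)
{
    p->delivered += p->lapic_pending;
    p->lapic_pending = 0;
}

static bool irq_quiescent(const struct irq_path *p)
{
    return p->masked && p->in_flight == 0 && p->lapic_pending == 0;
}

/* Mask, read the ioapic, read the local apic; true if safe to reprogram. */
static bool quiesce_irq(struct irq_path *p)
{
    assert(p);
    p->masked = true;
    ioapic_read_flush(p);
    lapic_read_flush(p);
    return irq_quiescent(p);
}
```

In the model quiesce_irq() always succeeds, because the ordering
assumption is built into the flush functions.  The point of the sketch
is only to show which guarantee each read is supposed to provide.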

What I discovered was that, except on the nforce chipset, masking the
ioapic and then issuing a read did not behave as if the interrupts
were flushed to the local apic.

I did not look closely enough to tell if local apics suffered from this
issue.  With local apics at least a read was necessary before you
could guarantee the local apic would deliver pending irqs.  A
workaround on the local apics is to simply issue a low priority interrupt
as an IPI and wait for it to be processed.  This guarantees that all
higher priority interrupts have been flushed from the apic, and that
the local apic has processed interrupts.
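
A small model of why the low priority IPI works, assuming the usual
local apic rule that a numerically higher vector has higher priority
and is delivered first.  The struct and function names are made up for
this sketch, and the model ignores the ISR/TPR details of a real local
apic.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_VECTORS 256

struct lapic_model {
    bool pending[NR_VECTORS]; /* models the IRR: vectors awaiting delivery */
    int  handled;             /* number of interrupts the cpu has handled */
};

/* Deliver the highest pending vector; returns it, or -1 if none pending. */
static int lapic_deliver_next(struct lapic_model *l)
{
    for (int v = NR_VECTORS - 1; v >= 0; v--) {
        if (l->pending[v]) {
            l->pending[v] = false;
            l->handled++;
            return v;
        }
    }
    return -1;
}

/*
 * Post a self IPI on flush_vector and run until it has been handled.
 * Returns how many other interrupts were handled first.  Every vector
 * above flush_vector is guaranteed to have been handled on return;
 * vectors below it may still be pending.
 */
static int lapic_flush_with_ipi(struct lapic_model *l, int flush_vector)
{
    int before = 0;

    assert(flush_vector >= 0 && flush_vector < NR_VECTORS);
    l->pending[flush_vector] = true;
    while (lapic_deliver_next(l) != flush_vector)
        before++;
    return before;
}
```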

For ioapics, because they cannot be stimulated to send an irq from
the cpu side, no similar workaround was possible.

** Conclusions.

* IRQs must be reprogrammed in interrupt context.

The result of this investigation is that I am convinced we need
to perform the irq migration activities in interrupt context although
I am not convinced it is completely safe.  I suspect multiple irqs
firing closely enough to each other may hit the same issues as
migrating irqs from process context.  However the odds are on our
side, when we are in irq context.

The reasoning for this is simply that:
- A level triggered irq's remote irr bit must be cleared, by the irq
  being acknowledged, before the irq can be safely reprogrammed.

- There is no generally effective way short of receiving an additional
  irq to ensure that the irq handler has run.  Polling the ioapic's
  remote irr bit does not work.
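
A rough sketch of what "reprogram in interrupt context" means in
practice, again as a user-space model with invented names: process
context only records the request, and the retarget happens in the irq
path after the acknowledge has cleared remote irr.  This illustrates
the ordering only and is not the code in the patches.

```c
#include <assert.h>
#include <stdbool.h>

struct irq_state {
    bool remote_irr;   /* set when a level irq is accepted, cleared by the ack */
    bool move_pending; /* a migration was requested from process context */
    int  dest_cpu;     /* where the ioapic entry currently points */
    int  new_dest_cpu; /* requested destination */
};

/* Process context: only record the request, touch no hardware state. */
static void request_migration(struct irq_state *s, int cpu)
{
    s->new_dest_cpu = cpu;
    s->move_pending = true;
}

/* The ioapic delivers a level triggered irq: remote irr becomes set. */
static void ioapic_fire(struct irq_state *s)
{
    s->remote_irr = true;
}

/* Interrupt context: acknowledge first, then apply any pending move. */
static void handle_irq(struct irq_state *s)
{
    /* ... the device handler would run here ... */
    s->remote_irr = false;            /* models the ack clearing remote irr */

    if (s->move_pending) {
        assert(!s->remote_irr);       /* precondition for reprogramming */
        s->dest_cpu = s->new_dest_cpu;
        s->move_pending = false;
    }
}
```

The cost is visible in the model: nothing moves until the next irq
fires, however long that takes.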

* CPU hotplug is currently very buggy.

Irq migration in the cpu hotplug case is a serious problem.  If we can
only safely migrate irqs from interrupt context and we cannot control
when those interrupts fire, then we cannot bound the amount of time it
will take to migrate the irqs away from a cpu.  The cpu hotplug code
currently calls chip->set_affinity directly, which is wrong, as it
does not take the necessary locks, and it does not attempt to delay
execution until we are in interrupt context.

* Only an additional irq can signal the completion of an irq movement.

The attempt to rebuild the irq migration code from first principles
did bear some fruit.  I asked the question: "When is it safe to tear
down the data structures for irq movement?".  The only answer I have
is: when I have received an irq that provably arrived after the irq was
reprogrammed.  This is because the only way I can reliably synchronize
with irq delivery from an apic is to receive an additional irq.

Currently this is a problem both for cpu hotplug on x86_64 and i386
and for general irq migration on x86_64.
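
The "only an additional irq signals completion" rule can be sketched
as follows.  This is a hypothetical model, not the patch itself: the
old vector's bookkeeping stays alive after the ioapic is reprogrammed,
and is torn down only when the first irq shows up on the new vector.

```c
#include <assert.h>
#include <stdbool.h>

struct irq_move {
    int  old_vector; /* vector the irq used before the move */
    int  new_vector; /* vector programmed into the ioapic by the move */
    bool old_live;   /* old vector's data structures are still in place */
};

/* Begin a move: both vectors are valid until completion is proven. */
static void irq_move_start(struct irq_move *m, int old_vector, int new_vector)
{
    assert(old_vector != new_vector);
    m->old_vector = old_vector;
    m->new_vector = new_vector;
    m->old_live = true;
}

/*
 * Called for each irq received.  Returns true if the vector was one we
 * could handle.  An irq on the new vector proves the ioapic has seen
 * the reprogramming, so only then is the old vector torn down.
 */
static bool irq_move_on_irq(struct irq_move *m, int vector)
{
    if (vector == m->new_vector) {
        m->old_live = false;      /* safe: nothing can arrive on old now */
        return true;
    }
    if (vector == m->old_vector && m->old_live)
        return true;              /* straggler sent before the reprogram */
    return false;                 /* no handler for this vector */
}
```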

Patches to follow shortly.

