From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Thomas Gleixner <tglx@linutronix.de>
Subject: [PATCH 3.14 34/98] genirq: Prevent proc race against freeing of irq descriptors
Date: Sun, 25 Jan 2015 10:06:52 -0800
Message-ID: <20150125180714.355153726@linuxfoundation.org>
In-Reply-To: <20150125180712.859646324@linuxfoundation.org>

3.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

commit c291ee622165cb2c8d4e7af63fffd499354a23be upstream.

Since the rework of the sparse interrupt code to actually free the
unused interrupt descriptors there exists a race between the /proc
interfaces to the irq subsystem and the code which frees the interrupt
descriptor.

CPU0				CPU1
				show_interrupts()
				  desc = irq_to_desc(X);
free_desc(desc)
  remove_from_radix_tree();
  kfree(desc);
				  raw_spin_lock_irqsave(&desc->lock, flags);

/proc/interrupts is the only interface which can actively corrupt
kernel memory via the lock access. /proc/stat can only read from freed
memory. Extremely hard to trigger, but possible.

The interfaces in /proc/irq/N/ are not affected by this because the
removal of the proc file is serialized in procfs against concurrent
readers/writers. The removal happens before the descriptor is freed.

For architectures which have CONFIG_SPARSE_IRQ=n this is a non-issue
as the descriptor is never freed. It's merely cleared out with the irq
descriptor lock held. So any concurrent proc access will see either
the old, correct values or the cleared-out ones.

Protect the lookup and access to the irq descriptor in
show_interrupts() with the sparse_irq_lock.

Provide kstat_irqs_usr(), which protects the lookup and access
with sparse_irq_lock, and switch /proc/stat to use it.

Document the existing kstat_irqs interfaces so it's clear that the
caller needs to take care of protection. The users of these
interfaces are either not affected due to SPARSE_IRQ=n or already
protected against removal.

Fixes: 1f5a5b87f78f "genirq: Implement a sane sparse_irq allocator"
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/proc/stat.c              |    2 -
 include/linux/kernel_stat.h |    1 
 kernel/irq/internals.h      |    8 ++++++
 kernel/irq/irqdesc.c        |   52 ++++++++++++++++++++++++++++++++++++++++++++
 kernel/irq/proc.c           |   22 +++++++++++++++++-
 5 files changed, 83 insertions(+), 2 deletions(-)

--- a/fs/proc/stat.c
+++ b/fs/proc/stat.c
@@ -159,7 +159,7 @@ static int show_stat(struct seq_file *p,
 
 	/* sum again ? it could be updated? */
 	for_each_irq_nr(j)
-		seq_put_decimal_ull(p, ' ', kstat_irqs(j));
+		seq_put_decimal_ull(p, ' ', kstat_irqs_usr(j));
 
 	seq_printf(p,
 		"\nctxt %llu\n"
--- a/include/linux/kernel_stat.h
+++ b/include/linux/kernel_stat.h
@@ -74,6 +74,7 @@ static inline unsigned int kstat_softirq
  * Number of interrupts per specific IRQ source, since bootup
  */
 extern unsigned int kstat_irqs(unsigned int irq);
+extern unsigned int kstat_irqs_usr(unsigned int irq);
 
 /*
  * Number of interrupts per cpu, since bootup
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -74,6 +74,14 @@ extern void irq_percpu_disable(struct ir
 extern void mask_irq(struct irq_desc *desc);
 extern void unmask_irq(struct irq_desc *desc);
 
+#ifdef CONFIG_SPARSE_IRQ
+extern void irq_lock_sparse(void);
+extern void irq_unlock_sparse(void);
+#else
+static inline void irq_lock_sparse(void) { }
+static inline void irq_unlock_sparse(void) { }
+#endif
+
 extern void init_kstat_irqs(struct irq_desc *desc, int node, int nr);
 
 irqreturn_t handle_irq_event_percpu(struct irq_desc *desc, struct irqaction *action);
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -131,6 +131,16 @@ static void free_masks(struct irq_desc *
 static inline void free_masks(struct irq_desc *desc) { }
 #endif
 
+void irq_lock_sparse(void)
+{
+	mutex_lock(&sparse_irq_lock);
+}
+
+void irq_unlock_sparse(void)
+{
+	mutex_unlock(&sparse_irq_lock);
+}
+
 static struct irq_desc *alloc_desc(int irq, int node, struct module *owner)
 {
 	struct irq_desc *desc;
@@ -167,6 +177,12 @@ static void free_desc(unsigned int irq)
 
 	unregister_irq_proc(irq, desc);
 
+	/*
+	 * sparse_irq_lock protects also show_interrupts() and
+	 * kstat_irq_usr(). Once we deleted the descriptor from the
+	 * sparse tree we can free it. Access in proc will fail to
+	 * lookup the descriptor.
+	 */
 	mutex_lock(&sparse_irq_lock);
 	delete_irq_desc(irq);
 	mutex_unlock(&sparse_irq_lock);
@@ -489,6 +505,15 @@ void dynamic_irq_cleanup(unsigned int ir
 	raw_spin_unlock_irqrestore(&desc->lock, flags);
 }
 
+/**
+ * kstat_irqs_cpu - Get the statistics for an interrupt on a cpu
+ * @irq:	The interrupt number
+ * @cpu:	The cpu number
+ *
+ * Returns the sum of interrupt counts on @cpu since boot for
+ * @irq. The caller must ensure that the interrupt is not removed
+ * concurrently.
+ */
 unsigned int kstat_irqs_cpu(unsigned int irq, int cpu)
 {
 	struct irq_desc *desc = irq_to_desc(irq);
@@ -497,6 +522,14 @@ unsigned int kstat_irqs_cpu(unsigned int
 			*per_cpu_ptr(desc->kstat_irqs, cpu) : 0;
 }
 
+/**
+ * kstat_irqs - Get the statistics for an interrupt
+ * @irq:	The interrupt number
+ *
+ * Returns the sum of interrupt counts on all cpus since boot for
+ * @irq. The caller must ensure that the interrupt is not removed
+ * concurrently.
+ */
 unsigned int kstat_irqs(unsigned int irq)
 {
 	struct irq_desc *desc = irq_to_desc(irq);
@@ -509,3 +542,22 @@ unsigned int kstat_irqs(unsigned int irq
 		sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
 	return sum;
 }
+
+/**
+ * kstat_irqs_usr - Get the statistics for an interrupt
+ * @irq:	The interrupt number
+ *
+ * Returns the sum of interrupt counts on all cpus since boot for
+ * @irq. Contrary to kstat_irqs() this can be called from any
+ * preemptible context. It's protected against concurrent removal of
+ * an interrupt descriptor when sparse irqs are enabled.
+ */
+unsigned int kstat_irqs_usr(unsigned int irq)
+{
+	int sum;
+
+	irq_lock_sparse();
+	sum = kstat_irqs(irq);
+	irq_unlock_sparse();
+	return sum;
+}
--- a/kernel/irq/proc.c
+++ b/kernel/irq/proc.c
@@ -15,6 +15,23 @@
 
 #include "internals.h"
 
+/*
+ * Access rules:
+ *
+ * procfs protects read/write of /proc/irq/N/ files against a
+ * concurrent free of the interrupt descriptor. remove_proc_entry()
+ * immediately prevents new read/writes to happen and waits for
+ * already running read/write functions to complete.
+ *
+ * We remove the proc entries first and then delete the interrupt
+ * descriptor from the radix tree and free it. So it is guaranteed
+ * that irq_to_desc(N) is valid as long as the read/writes are
+ * permitted by procfs.
+ *
+ * The read from /proc/interrupts is a different problem because there
+ * is no protection. So the lookup and the access to irqdesc
+ * information must be protected by sparse_irq_lock.
+ */
 static struct proc_dir_entry *root_irq_dir;
 
 #ifdef CONFIG_SMP
@@ -437,9 +454,10 @@ int show_interrupts(struct seq_file *p,
 		seq_putc(p, '\n');
 	}
 
+	irq_lock_sparse();
 	desc = irq_to_desc(i);
 	if (!desc)
-		return 0;
+		goto outsparse;
 
 	raw_spin_lock_irqsave(&desc->lock, flags);
 	for_each_online_cpu(j)
@@ -479,6 +497,8 @@ int show_interrupts(struct seq_file *p,
 	seq_putc(p, '\n');
 out:
 	raw_spin_unlock_irqrestore(&desc->lock, flags);
+outsparse:
+	irq_unlock_sparse();
 	return 0;
 }
 #endif
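
Illustration only, not part of the patch: a minimal userspace sketch of
the locking rule the changelog describes, assuming a pthread mutex in
place of sparse_irq_lock and a fixed array in place of the radix tree.
All demo_* names and NR_SLOTS are hypothetical. The reader does the
lookup and the access under the lock; the free path unlinks the
descriptor under the same lock and frees it only afterwards, so a reader
sees either a valid descriptor or NULL.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NR_SLOTS 16	/* hypothetical size; stands in for the radix tree */

struct demo_desc {
	unsigned int count;	/* stands in for the summed kstat_irqs */
};

/* Stand-ins for sparse_irq_lock and the sparse irq tree. */
static pthread_mutex_t demo_sparse_lock = PTHREAD_MUTEX_INITIALIZER;
static struct demo_desc *demo_tree[NR_SLOTS];

/* Allocate a descriptor and publish it under the lock. */
int demo_alloc_desc(unsigned int irq, unsigned int count)
{
	struct demo_desc *desc;

	assert(irq < NR_SLOTS);
	desc = malloc(sizeof(*desc));
	if (!desc)
		return -1;
	desc->count = count;

	pthread_mutex_lock(&demo_sparse_lock);
	if (demo_tree[irq]) {
		/* Slot already in use: do not overwrite a live descriptor. */
		pthread_mutex_unlock(&demo_sparse_lock);
		free(desc);
		return -1;
	}
	demo_tree[irq] = desc;
	pthread_mutex_unlock(&demo_sparse_lock);
	return 0;
}

/* Unlink under the lock, free afterwards (mirrors free_desc()). */
void demo_free_desc(unsigned int irq)
{
	struct demo_desc *desc;

	assert(irq < NR_SLOTS);
	pthread_mutex_lock(&demo_sparse_lock);
	desc = demo_tree[irq];
	demo_tree[irq] = NULL;
	pthread_mutex_unlock(&demo_sparse_lock);

	/*
	 * Readers only hold the pointer while they hold the lock, so no
	 * reader can still reference desc here. free(NULL) is a no-op.
	 */
	free(desc);
}

/* Lookup and access under the lock (mirrors kstat_irqs_usr()). */
unsigned int demo_kstat_usr(unsigned int irq)
{
	struct demo_desc *desc;
	unsigned int sum = 0;

	assert(irq < NR_SLOTS);
	pthread_mutex_lock(&demo_sparse_lock);
	desc = demo_tree[irq];
	if (desc)
		sum = desc->count;
	pthread_mutex_unlock(&demo_sparse_lock);
	return sum;
}
```

A reader thread calling demo_kstat_usr() concurrently with
demo_free_desc() gets either the old count or 0, never a freed pointer.
Taking the lock only around the lookup, and dereferencing after
dropping it, would reintroduce the race shown in the changelog.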


