LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Chris Wright <chrisw@sous-sol.org>
To: linux-kernel@vger.kernel.org, stable@kernel.org
Cc: Justin Forbes <jmforbes@linuxtx.org>,
	Zwane Mwaikambo <zwane@arm.linux.org.uk>,
	"Theodore Ts'o" <tytso@mit.edu>,
	Randy Dunlap <rdunlap@xenotime.net>,
	Dave Jones <davej@redhat.com>,
	Chuck Wolber <chuckw@quantumlinux.com>,
	Chris Wedgwood <reviews@ml.cw.f00f.org>,
	Michael Krufky <mkrufky@linuxtv.org>,
	torvalds@linux-foundation.org, akpm@linux-foundation.org,
	alan@lxorguk.ukuu.org.uk, NeilBrown <neilb@suse.de>
Subject: [patch 41/59] md: fix potential memalloc deadlock in md
Date: Fri, 02 Feb 2007 18:35:45 -0800	[thread overview]
Message-ID: <20070203024415.110184000@sous-sol.org> (raw)
In-Reply-To: <20070203023504.435051000@sous-sol.org>

[-- Attachment #1: md-fix-potential-memalloc-deadlock-in-md.patch --]
[-- Type: text/plain, Size: 3351 bytes --]

-stable review patch.  If anyone has any objections, please let us know.
------------------

From: NeilBrown <neilb@suse.de>

If a GFP_KERNEL allocation is attempted in md while the mddev_lock is
held, it is possible for a deadlock to eventuate.
This happens if the array was marked 'clean', and the memalloc triggers 
a write-out to the md device.
For the writeout to succeed, the array must be marked 'dirty', and that 
requires getting the mddev_lock.

So, before attempting a GFP_KERNEL alloction while holding the lock,
make sure the array is marked 'dirty' (unless it is currently read-only).

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
---
 drivers/md/md.c         |   29 +++++++++++++++++++++++++++++
 drivers/md/raid1.c      |    2 ++
 drivers/md/raid5.c      |    3 +++
 include/linux/raid/md.h |    2 +-
 4 files changed, 35 insertions(+), 1 deletion(-)

--- linux-2.6.19.2.orig/drivers/md/md.c
+++ linux-2.6.19.2/drivers/md/md.c
@@ -3561,6 +3561,8 @@ static int get_bitmap_file(mddev_t * mdd
 	char *ptr, *buf = NULL;
 	int err = -ENOMEM;
 
+	md_allow_write(mddev);
+
 	file = kmalloc(sizeof(*file), GFP_KERNEL);
 	if (!file)
 		goto out;
@@ -5029,6 +5031,33 @@ void md_write_end(mddev_t *mddev)
 	}
 }
 
+/* md_allow_write(mddev)
+ * Calling this ensures that the array is marked 'active' so that writes
+ * may proceed without blocking.  It is important to call this before
+ * attempting a GFP_KERNEL allocation while holding the mddev lock.
+ * Must be called with mddev_lock held.
+ */
+void md_allow_write(mddev_t *mddev)
+{
+	if (!mddev->pers)
+		return;
+	if (mddev->ro)
+		return;
+
+	spin_lock_irq(&mddev->write_lock);
+	if (mddev->in_sync) {
+		mddev->in_sync = 0;
+		set_bit(MD_CHANGE_CLEAN, &mddev->flags);
+		if (mddev->safemode_delay &&
+		    mddev->safemode == 0)
+			mddev->safemode = 1;
+		spin_unlock_irq(&mddev->write_lock);
+		md_update_sb(mddev, 0);
+	} else
+		spin_unlock_irq(&mddev->write_lock);
+}
+EXPORT_SYMBOL_GPL(md_allow_write);
+
 static DECLARE_WAIT_QUEUE_HEAD(resync_wait);
 
 #define SYNC_MARKS	10
--- linux-2.6.19.2.orig/drivers/md/raid1.c
+++ linux-2.6.19.2/drivers/md/raid1.c
@@ -2104,6 +2104,8 @@ static int raid1_reshape(mddev_t *mddev)
 		return -EINVAL;
 	}
 
+	md_allow_write(mddev);
+
 	raid_disks = mddev->raid_disks + mddev->delta_disks;
 
 	if (raid_disks < conf->raid_disks) {
--- linux-2.6.19.2.orig/drivers/md/raid5.c
+++ linux-2.6.19.2/drivers/md/raid5.c
@@ -403,6 +403,8 @@ static int resize_stripes(raid5_conf_t *
 	if (newsize <= conf->pool_size)
 		return 0; /* never bother to shrink */
 
+	md_allow_write(conf->mddev);
+
 	/* Step 1 */
 	sc = kmem_cache_create(conf->cache_name[1-conf->active_name],
 			       sizeof(struct stripe_head)+(newsize-1)*sizeof(struct r5dev),
@@ -3045,6 +3047,7 @@ raid5_store_stripe_cache_size(mddev_t *m
 		else
 			break;
 	}
+	md_allow_write(mddev);
 	while (new > conf->max_nr_stripes) {
 		if (grow_one_stripe(conf))
 			conf->max_nr_stripes++;
--- linux-2.6.19.2.orig/include/linux/raid/md.h
+++ linux-2.6.19.2/include/linux/raid/md.h
@@ -94,7 +94,7 @@ extern int sync_page_io(struct block_dev
 			struct page *page, int rw);
 extern void md_do_sync(mddev_t *mddev);
 extern void md_new_event(mddev_t *mddev);
-
+extern void md_allow_write(mddev_t *mddev);
 
 #endif /* CONFIG_MD */
 #endif 

--

  parent reply	other threads:[~2007-02-03  2:39 UTC|newest]

Thread overview: 69+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-02-03  2:35 [patch 00/59] -stable review Chris Wright
2007-02-03  2:35 ` [patch 01/59] i2c-mv64xxx: Fix random oops at boot Chris Wright
2007-02-03  2:35 ` [patch 02/59] i2c/m41t00: Do not forget to write year Chris Wright
2007-02-03  2:35 ` [patch 03/59] Check for populated zone in __drain_pages Chris Wright
2007-02-03  2:35 ` [patch 04/59] Fix HWRNG built-in initcalls priority Chris Wright
2007-02-03  2:35 ` [patch 05/59] md: pass down BIO_RW_SYNC in raid{1,10} Chris Wright
2007-02-03  2:35 ` [patch 06/59] NETFILTER: Fix routing of REJECT target generated packets in output chain Chris Wright
2007-02-03  2:35 ` [patch 07/59] NETFILTER: nf_conntrack_ipv6: fix crash when handling fragments Chris Wright
2007-02-03  2:35 ` [patch 08/59] NETFILTER: tcp conntrack: fix IP_CT_TCP_FLAG_CLOSE_INIT value Chris Wright
2007-02-03  2:35 ` [patch 09/59] NETFILTER: arp_tables: fix userspace compilation Chris Wright
2007-02-03  2:35 ` [patch 10/59] Repair snd-usb-usx2y over OHCI Chris Wright
2007-02-03  2:35 ` [patch 11/59] [stable] [PATCH] IB/mthca: Fix off-by-one in FMR handling on memfree Chris Wright
2007-02-03  2:35 ` [patch 12/59] [PATCH] Fix reparenting to the same thread group. (take 2) Chris Wright
2007-02-03  2:35 ` [patch 13/59] ieee1394: sbp2: fix probing of some DVD-ROM/RWs Chris Wright
2007-02-03  2:35 ` [patch 14/59] sched: tasks cannot run on cpus onlined after boot Chris Wright
2007-02-03  2:35 ` [patch 15/59] Fix up CIFS for "test_clear_page_dirty()" removal Chris Wright
2007-02-03  2:35 ` [patch 16/59] start_kernel: test if irqs got enabled early, barf, and disable them again Chris Wright
2007-02-03  2:35 ` [patch 17/59] PCI: prevent down_read when pci_devices is empty Chris Wright
2007-02-03 16:16   ` Kumar Gala
2007-02-05 17:04     ` Chris Wright
2007-02-03  2:35 ` [patch 18/59] IPV6 MCAST: Fix joining all-node multicast group on device initialization Chris Wright
2007-02-03  2:35 ` [patch 19/59] NETFILTER: ctnetlink: check for status attribute existence on conntrack creation Chris Wright
2007-02-03  2:35 ` [patch 20/59] NETFILTER: ctnetlink: fix leak in ctnetlink_create_conntrack error path Chris Wright
2007-02-03  2:35 ` [patch 21/59] IPSEC: Policy list disorder Chris Wright
2007-02-03  2:35 ` [patch 22/59] ALSA hda-codec - Fix NULL dereference in generic hda code Chris Wright
2007-02-03  2:35 ` [patch 23/59] SELinux: fix an oops with NetLabel and non-MLS SELinux policy Chris Wright
2007-02-03  2:35 ` [patch 24/59] IB/iser: return error code when PDUs may not be sent Chris Wright
2007-02-03  2:35 ` [patch 25/59] Fix UML on non-standard VM split hosts Chris Wright
2007-02-04  1:06   ` Randy Dunlap
2007-02-04  3:11     ` Jeff Dike
2007-02-04  4:19       ` Randy Dunlap
2007-02-03  2:35 ` [patch 26/59] ACPI: fix cpufreq regression Chris Wright
2007-02-03  2:35 ` [patch 27/59] x86: Work around gcc 4.2 over aggressive optimizer Chris Wright
2007-02-03  2:35 ` [patch 28/59] NETFILTER: Fix iptables ABI breakage on (at least) CRIS Chris Wright
2007-02-03  2:35 ` [patch 29/59] elevator: move clearing of unplug flag earlier Chris Wright
2007-02-03  2:35 ` [patch 30/59] Revert "[PATCH] Fix up mmap_kmem" Chris Wright
2007-02-03  2:35 ` [patch 31/59] remove __devinit markings from rtc_sysfs_add_device() Chris Wright
2007-02-03  2:35 ` [patch 32/59] SPARC64: Set g4/g5 properly in sun4v dtlb-prot handling Chris Wright
2007-02-03  2:35 ` [patch 33/59] sis190: failure to set the MAC address from EEPROM Chris Wright
2007-02-03  2:35 ` [patch 34/59] knfsd: fix setting of ACL server versions Chris Wright
2007-02-03  2:35 ` [patch 35/59] knfsd: fix an NFSD bug with full sized, non-page-aligned reads Chris Wright
2007-02-03  2:35 ` [patch 36/59] knfsd: fix type mismatch with filldir_t used by nfsd Chris Wright
2007-02-03  2:35 ` [patch 37/59] knfsd: fix up some bit-rot in exp_export Chris Wright
2007-02-03  2:35 ` [patch 38/59] md: assorted md and raid1 one-liners Chris Wright
2007-02-03  2:35 ` [patch 39/59] md: make repair actually work for raid1 Chris Wright
2007-02-03  2:35 ` [patch 40/59] md: fix a few problems with the interface (sysfs and ioctl) to md Chris Wright
2007-02-03  2:35 ` Chris Wright [this message]
2007-02-03  2:35 ` [patch 42/59] libata: use kmap_atomic(KM_IRQ0) in SCSI simulator Chris Wright
2007-02-03  2:35 ` [patch 43/59] Dont allow the stack to grow into hugetlb reserved regions Chris Wright
2007-02-03  2:35 ` [patch 44/59] uml: fix signal frame alignment Chris Wright
2007-02-03  2:35 ` [patch 45/59] bonding: ARP monitoring broken on x86_64 Chris Wright
2007-02-03  2:35 ` [patch 46/59] jmicron: 40/80pin primary detection Chris Wright
2007-02-03  2:35 ` [patch 47/59] DECNET: Handle a failure in neigh_parms_alloc (take 2) Chris Wright
2007-02-03  2:35 ` [patch 48/59] SPARC32: Fix over-optimization by GCC near ip_fast_csum Chris Wright
2007-02-03  2:35 ` [patch 49/59] IPV4: Fix the fib trie iterator to work with a single entry routing tables Chris Wright
2007-02-03  2:35 ` [patch 50/59] IPV4: Fix single-entry /proc/net/fib_trie output Chris Wright
2007-02-03  2:35 ` [patch 51/59] AF_PACKET: Fix BPF handling Chris Wright
2007-02-03  2:35 ` [patch 52/59] AF_PACKET: Check device down state before hard header callbacks Chris Wright
2007-02-03  2:35 ` [patch 53/59] TCP: rare bad TCP checksum with 2.6.19 Chris Wright
2007-02-03  2:35 ` [patch 54/59] TCP: Fix sorting of SACK blocks Chris Wright
2007-02-03  2:35 ` [patch 55/59] TCP: skb is unexpectedly freed Chris Wright
2007-02-03  2:36 ` [patch 56/59] NETFILTER: xt_connbytes: fix division by zero Chris Wright
2007-02-03  2:36 ` [patch 57/59] SUNRPC: Give cloned RPC clients their own rpc_pipefs directory Chris Wright
2007-02-03  2:36 ` [patch 58/59] move_task_off_dead_cpu() should be called with disabled ints Chris Wright
2007-02-03  2:36 ` [patch 59/59] sched: fix cond_resched_softirq() offset Chris Wright
2007-02-03  2:59 ` [stable] [patch 00/59] -stable review Chris Wright
2007-02-04  6:08   ` Randy Dunlap
2007-02-04 13:36     ` Dave Jones
2007-02-04 17:30       ` Randy Dunlap

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070203024415.110184000@sous-sol.org \
    --to=chrisw@sous-sol.org \
    --cc=akpm@linux-foundation.org \
    --cc=alan@lxorguk.ukuu.org.uk \
    --cc=chuckw@quantumlinux.com \
    --cc=davej@redhat.com \
    --cc=jmforbes@linuxtx.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mkrufky@linuxtv.org \
    --cc=neilb@suse.de \
    --cc=rdunlap@xenotime.net \
    --cc=reviews@ml.cw.f00f.org \
    --cc=stable@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=tytso@mit.edu \
    --cc=zwane@arm.linux.org.uk \
    --subject='Re: [patch 41/59] md: fix potential memalloc deadlock in md' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).