From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Liu Bo <bo.liu@linux.alibaba.com>,
	Filipe Manana <fdmanana@suse.com>, Qu Wenruo <wqu@suse.com>,
	David Sterba <dsterba@suse.com>,
	Nikolay Borisov <nborisov@suse.com>
Subject: [PATCH 4.4 70/92] btrfs: fix reading stale metadata blocks after degraded raid1 mounts
Date: Thu, 24 May 2018 11:38:47 +0200
Message-ID: <20180524093206.062830692@linuxfoundation.org>
In-Reply-To: <20180524093159.286472249@linuxfoundation.org>

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Liu Bo <bo.liu@linux.alibaba.com>

commit 02a3307aa9c20b4f6626255b028f07f6cfa16feb upstream.

If a btree block (aka extent buffer) is not available in the extent
buffer cache, it is read from disk instead, i.e.

btrfs_search_slot()
  read_block_for_search()  # hold parent and its lock, go to read child
    btrfs_release_path()
    read_tree_block()  # read child

Unfortunately, the parent lock is released before the child is read, so
commit 5bdd3536cbbe ("Btrfs: Fix block generation verification race")
used 0 as the parent transid when reading the child block.  This forces
read_tree_block() to skip checking whether the parent transid differs
from the generation of the child that it reads from disk.
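
The effect of passing 0 can be illustrated with a minimal user-space
sketch.  This is not the kernel code: struct toy_block and
toy_transid_ok() are hypothetical names, and the only behaviour taken
from the changelog is that a parent transid of 0 disables the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an extent buffer header. */
struct toy_block {
	uint64_t generation;	/* transid that last wrote this block */
};

/*
 * Toy model of the generation check.  Returns true when the block is
 * accepted.  A parent_transid of 0 means the caller has no expectation,
 * so the check is skipped and any generation passes.
 */
static bool toy_transid_ok(const struct toy_block *blk,
			   uint64_t parent_transid)
{
	assert(blk != NULL);

	if (parent_transid == 0)
		return true;

	return blk->generation == parent_transid;
}
```

With a block whose generation is 5, toy_transid_ok(&blk, 0) accepts it
regardless of what the parent recorded, while toy_transid_ok(&blk, 9)
rejects it.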

A simple PoC is included in btrfs/124:

0. Create a two-disk raid1 btrfs.

1. Right after mkfs.btrfs, block A is allocated to be the device tree's
   root.

2. Mount this filesystem and put it in use.  After a while, the device
   tree's root gets COWed, but block A has not been allocated/overwritten
   yet.

3. Umount it and reload the btrfs module to remove both disks from the
   global @fs_devices list.

4. mount -odegraded dev1 and write some data, so that block A is now
   allocated to be a leaf in the checksum tree.  Note that only dev1 has
   the latest metadata of this filesystem.

5. Umount it and mount it again normally (with both disks).  Since raid1
   can pick a disk by the writer task's pid, if btrfs_search_slot()
   needs to read block A, it may read it from dev2, which does NOT have
   the latest metadata, and so get a stale block A.

6. As the parent transid is not checked, the stale block A is marked
   uptodate and put into the extent buffer cache, so future searches
   won't read it from disk again.  They will instead modify the stale
   copy, mark it dirty and flush it to disk.
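
Steps 4 to 6 can be modelled in user space roughly as follows.  This is
a sketch under stated assumptions, not the btrfs read path: struct
toy_copy, toy_read_block() and the "pid modulo number of mirrors" choice
are hypothetical simplifications, and trying the next mirror on a
mismatch is an assumption of this model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NUM_MIRRORS 2

/* Hypothetical per-device copy of block A. */
struct toy_copy {
	uint64_t generation;	/* transid that last wrote this copy */
};

/*
 * Toy read: start at the mirror selected by pid and accept the first
 * copy whose generation matches expected_gen.  An expected_gen of 0
 * accepts whatever the first mirror returns.  Returns the index of the
 * accepted mirror, or -1 if no copy matches.
 */
static int toy_read_block(const struct toy_copy copies[TOY_NUM_MIRRORS],
			  int pid, uint64_t expected_gen)
{
	assert(copies != NULL && pid >= 0);

	for (int i = 0; i < TOY_NUM_MIRRORS; i++) {
		int mirror = (pid + i) % TOY_NUM_MIRRORS;

		if (expected_gen == 0 ||
		    copies[mirror].generation == expected_gen)
			return mirror;
	}
	return -1;
}
```

If index 0 stands for dev1 (generation 9, written during the degraded
mount) and index 1 for dev2 (generation 5, stale), a reader whose pid
selects dev2 gets the stale copy when expected_gen is 0, but ends up
with the dev1 copy when the parent's generation 9 is passed in.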

To avoid the problem, parent transid needs to be passed to
read_tree_block().

In order to get a valid parent transid, we need to hold the parent's
lock until the child has been read.

This patch needs to be slightly adapted for stable kernels: the
&first_key parameter added to read_tree_block() is from 4.16+
(581c1760415c4).  The fix is to replace 0 with 'gen'.

Fixes: 5bdd3536cbbe ("Btrfs: Fix block generation verification race")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/ctree.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -2497,10 +2497,8 @@ read_block_for_search(struct btrfs_trans
 	if (p->reada)
 		reada_for_search(root, p, level, slot, key->objectid);
 
-	btrfs_release_path(p);
-
 	ret = -EAGAIN;
-	tmp = read_tree_block(root, blocknr, 0);
+	tmp = read_tree_block(root, blocknr, gen);
 	if (!IS_ERR(tmp)) {
 		/*
 		 * If the read above didn't mark this buffer up to date,
@@ -2512,6 +2510,8 @@ read_block_for_search(struct btrfs_trans
 			ret = -EIO;
 		free_extent_buffer(tmp);
 	}
+
+	btrfs_release_path(p);
 	return ret;
 }
 

