LKML Archive on lore.kernel.org
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Andreas Gruenbacher <agruenba@redhat.com>,
	Ross Lagerwall <ross.lagerwall@citrix.com>,
	Bob Peterson <rpeterso@redhat.com>,
	Sasha Levin <sashal@kernel.org>,
	cluster-devel@redhat.com
Subject: [PATCH AUTOSEL 4.14 005/167] gfs2: Fix occasional glock use-after-free
Date: Wed, 22 May 2019 15:26:00 -0400
Message-ID: <20190522192842.25858-5-sashal@kernel.org>
In-Reply-To: <20190522192842.25858-1-sashal@kernel.org>

From: Andreas Gruenbacher <agruenba@redhat.com>

[ Upstream commit 9287c6452d2b1f24ea8e84bd3cf6f3c6f267f712 ]

This patch has to do with the life cycle of glocks and buffers.  When
gfs2 metadata or journaled data is queued to be written, a gfs2_bufdata
object is assigned to track the buffer, and that is queued to various
lists, including the glock's gl_ail_list to indicate it's on the active
items list.  Once the page associated with the buffer has been written,
it is removed from the ail list, but its life isn't over until a revoke
has been successfully written.

So after the block is written, its bufdata object is moved from the
glock's gl_ail_list to a file-system-wide list of pending revokes,
sd_log_le_revoke.  At that point the glock still needs to track how many
revokes it contributed to that list (in gl_revokes) so that things like
glock go_sync can ensure all the metadata has been not only written, but
also revoked before the glock is granted to a different node.  This is
to guarantee journal replay doesn't replay the block once the glock has
been granted to another node.

Ross Lagerwall recently discovered a race in which an inode could be
evicted, and its glock freed after its ail list had been synced, but
while it still had unwritten revokes on the sd_log_le_revoke list.  The
evict decremented the glock reference count to zero, which allowed the
glock to be freed.  After the revoke was written, function
revoke_lo_after_commit tried to adjust the glock's gl_revokes counter
and clear its GLF_LFLUSH flag, at which time it referenced the freed
glock.

This patch fixes the problem by incrementing the glock reference count
in gfs2_add_revoke when the glock's first bufdata object is moved from
the glock to the global revokes list. Later, when the glock's last such
bufdata object is freed, the reference count is decremented. This
guarantees that whichever process finishes last (the revoke writing or
the evict) will properly free the glock, and neither will reference the
glock after it has been freed.

Reported-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
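Editorial note (not part of the upstream commit): the reference-counting
rule described in the commit message can be sketched as a small userspace
model. This is only an illustration under stated assumptions. The names
toy_glock, toy_init, toy_hold, toy_put, toy_add_revoke and
toy_revoke_written are hypothetical and do not exist in gfs2; C11 atomics
stand in for the kernel's atomic_t helpers, and the "free" is modelled as a
flag so it can be inspected. The real revoke path drops its reference with
gfs2_glock_queue_put(), which defers the put, whereas the model drops it
directly.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct gfs2_glock; only the counters matter. */
struct toy_glock {
	atomic_int ref;      /* models the glock reference count */
	atomic_int revokes;  /* models gl_revokes */
	bool freed;          /* set instead of freeing memory, for inspection */
};

/* Start with one reference, as if held by the inode that owns the glock. */
static void toy_init(struct toy_glock *gl)
{
	atomic_init(&gl->ref, 1);
	atomic_init(&gl->revokes, 0);
	gl->freed = false;
}

static void toy_hold(struct toy_glock *gl)
{
	atomic_fetch_add(&gl->ref, 1);
}

/* Drop one reference; whoever drops the last one "frees" the glock. */
static void toy_put(struct toy_glock *gl)
{
	if (atomic_fetch_sub(&gl->ref, 1) == 1) {
		/* Mirrors the BUG_ON added to gfs2_glock_free(). */
		assert(atomic_load(&gl->revokes) == 0);
		gl->freed = true;
	}
}

/* Models gfs2_add_revoke(): the first pending revoke pins the glock. */
static void toy_add_revoke(struct toy_glock *gl)
{
	/* atomic_fetch_add returns the old value, so 0 means "first". */
	if (atomic_fetch_add(&gl->revokes, 1) == 0)
		toy_hold(gl);
}

/* Models revoke_lo_after_commit(): the last written revoke unpins it. */
static void toy_revoke_written(struct toy_glock *gl)
{
	/* atomic_fetch_sub returns the old value, so 1 means "last". */
	if (atomic_fetch_sub(&gl->revokes, 1) == 1)
		toy_put(gl);
}
```

In this model, calling toy_add_revoke() twice, then toy_put() (the evict),
leaves the glock alive; it is only marked freed after the second
toy_revoke_written() call. Reversing the order (revokes written first,
evict last) frees it in toy_put() instead, so whichever side finishes last
performs the free.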
 fs/gfs2/glock.c | 1 +
 fs/gfs2/log.c   | 3 ++-
 fs/gfs2/lops.c  | 6 ++++--
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index cd6a64478a026..aea1ed0aebd0f 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -140,6 +140,7 @@ void gfs2_glock_free(struct gfs2_glock *gl)
 {
 	struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
 
+	BUG_ON(atomic_read(&gl->gl_revokes));
 	rhashtable_remove_fast(&gl_hash_table, &gl->gl_node, ht_parms);
 	smp_mb();
 	wake_up_glock(gl);
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index f72c442314062..483b82e2be923 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -588,7 +588,8 @@ void gfs2_add_revoke(struct gfs2_sbd *sdp, struct gfs2_bufdata *bd)
 	bd->bd_bh = NULL;
 	bd->bd_ops = &gfs2_revoke_lops;
 	sdp->sd_log_num_revoke++;
-	atomic_inc(&gl->gl_revokes);
+	if (atomic_inc_return(&gl->gl_revokes) == 1)
+		gfs2_glock_hold(gl);
 	set_bit(GLF_LFLUSH, &gl->gl_flags);
 	list_add(&bd->bd_list, &sdp->sd_log_le_revoke);
 }
diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index c8ff7b7954f05..049f8c6721b4a 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -660,8 +660,10 @@ static void revoke_lo_after_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
 		bd = list_entry(head->next, struct gfs2_bufdata, bd_list);
 		list_del_init(&bd->bd_list);
 		gl = bd->bd_gl;
-		atomic_dec(&gl->gl_revokes);
-		clear_bit(GLF_LFLUSH, &gl->gl_flags);
+		if (atomic_dec_return(&gl->gl_revokes) == 0) {
+			clear_bit(GLF_LFLUSH, &gl->gl_flags);
+			gfs2_glock_queue_put(gl);
+		}
 		kmem_cache_free(gfs2_bufdata_cachep, bd);
 	}
 }
-- 
2.20.1

