LKML Archive on lore.kernel.org
* [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue
@ 2018-04-04 19:17 jglisse
  2018-04-04 19:17 ` [RFC PATCH 04/79] pipe: add inode field to struct pipe_inode_info jglisse
                   ` (42 more replies)
  0 siblings, 43 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:17 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Andrea Arcangeli,
	Alexander Viro, Tim Chen, Theodore Ts'o, Tejun Heo, Jan Kara,
	Josef Bacik, Mel Gorman, Jeff Layton

From: Jérôme Glisse <jglisse@redhat.com>

https://cgit.freedesktop.org/~glisse/linux/log/?h=generic-write-protection-rfc

This is an RFC for LSF/MM discussions. It impacts the file subsystem,
the block subsystem and the mm subsystem. Hence it would benefit from
a cross sub-system discussion.

The patchset is not fully baked, so take it with a grain of salt. I use
it to illustrate the fact that this is doable, and now that I have done
it once I believe I have a better and cleaner plan in my head on how to
do this. I intend to share and discuss it at LSF/MM (I still need to
write it down). That plan leads to quite different individual steps
than this patchset takes and is also easier to split up into more
manageable pieces.

I also want to apologize for the size and number of patches (and I am
not even sending them all).

----------------------------------------------------------------------
The Why ?

I have two objectives: duplicate memory read-only across nodes and/or
devices, and work around PCIE atomic limitations. More on each of those
objectives below. I also want to put forward that it can solve the page
wait list issue, ie having each page with its own wait list and thus
avoiding the long wait list traversal latency recently reported [1].

It also allows KSM for file-backed pages (truly generic KSM, even
between anonymous and file-backed pages). I am not sure how useful this
can be; it was not an objective I pursued, it is just a free feature
(see below).

[1] https://groups.google.com/forum/#!topic/linux.kernel/Iit1P5BNyX8

----------------------------------------------------------------------
Per page wait list, so long page_waitqueue() !

Not implemented in this RFC, but the logic is below and the pseudo code
is at the bottom of this email.

When there is contention on the struct page lock bit, the caller trying
to lock the page adds itself to a waitqueue. The issue here is that
multiple pages share the same wait queue, and on a large system with a
lot of RAM this means we can quickly get a long list of waiters for
different pages (or for the same page) on the same list [1].

The present patchset virtually kills all places that need to access the
page->mapping field; only a handful are left, namely for testing page
truncation and for vmscan. The former can be removed if we reuse the
PG_waiters flag for a new PG_truncate flag set on truncation; then we
can virtually kill all dereferences of page->mapping (this patchset
proves it is doable). NOTE THIS DOES NOT MEAN THAT MAPPING IS FREE TO
BE USED BY ANYONE TO STORE WHATEVER IN STRUCT PAGE. SORRY, NO !

What this means is that whenever a thread wants to spin on a page until
it can lock it, it can carefully replace the page->mapping with a
waiter struct for a wait list. Thus each page under contention will
have its own wait list.

The fact that there are not many places that dereference page->mapping
is important, because it means that any dereference must now be done
with preemption disabled (inside an RCU read section) so that the
waiter can free the waiter struct without fear of hazard (the struct is
on the stack, like today). Pseudo code is at the end of this mail.

The devil is in the details, but after long meditation and pondering on
this I believe it is a doable solution. Note it does not rely on the
write protection, nor does it technically need to kill all struct page
mapping dereferences. But the latter can really hurt performance if
they have to be done under the RCU read lock, with the corresponding
grace period needed before freeing the waiter struct.

----------------------------------------------------------------------
KSM for everyone !

With generic write protection you can do KSM for file-backed pages too
(even if they have different offset, mapping or buffer_head). While I
believe page sharing for containers is already solved with overlayfs,
this might still be an interesting feature for some.

Oh, and crazy upon crazy, you can merge private anonymous pages and
file-backed pages together ... Probably totally useless, but cool like
crazy.

----------------------------------------------------------------------
KDM (Kernel Duplicate Memory)

Most kernel development, especially in the mm sub-system, is about how
to save resources, how to share as many of them as possible so that we
maximize their availability for all processes.

The objective here is slightly different. Some users favor performance
and already have properly sized systems (ie they have enough resources
for the task at hand). For performance it is sometimes better to use
more resources to improve other parameters of the performance equation.

This is especially true for big systems that either use several devices
or spread across several nodes, or both. For those, sharing memory
means peer to peer traffic. This can become a bottleneck and saturate
the interconnect between those peers.

If some data set under consideration is accessed read-only then we can
duplicate the memory backing it on multiple nodes/devices. Access is
then local to each node/device, which greatly improves both latency and
bandwidth while also saving inter-connect bandwidth for other uses.

Note that KDM across NUMA nodes means that we need to duplicate the
CPU page table and have a special copy for each node. So, in honesty,
here I am focusing on devices; I am not sure the amount of work to do
that for the CPU page table is sane ...

----------------------------------------------------------------------
PCIE atomic limitation

PCIE atomics only offer three atomic operations: Fetch And Add (32, 64
bits), Swap (32, 64 bits) and Compare And Swap aka CAS (32, 64 and 128
bits). The address must be aligned on the type size (32, 64 or 128 bits
alignment) and it can not cross a 4KBytes boundary, ie it must be
inside a single page. Note that the alignment constraint gives the
boundary crossing for free (but those are two distinct constraints in
the specification).

PCIE atomic operations have a lower throughput than regular PCIE memory
write operations. A regular PCIE memory transaction has a maximum
payload which depends on the CPU/chipset and is often a multiple of the
cacheline size. So one regular PCIE memory write can write multiple
bytes, while a PCIE atomic operation can write only 4, 8 or 16 bytes
(32, 64 or 128 bits). Note that each PCIE transaction involves an
acknowledge answer packet from receiver to transmitter.

If we write protect a page on the CPU then the device can cache and
write combine as many bytes as possible, improving its throughput for
atomic and regular writes. The write protection on the CPU allows us to
ascertain that any write by the device behaves as if it was atomic from
the CPU point of view. So generic write protection allows us to improve
the overall performance of atomic operations to system memory while it
is only updated by a device.


There is another PCIE limitation which generic write protection would
help to work around. If a device offers more atomic operations than the
above (FetchAdd, Swap, CAS) to the program it runs (like a GPU for
instance, which has roughly the same list of atomic operations as a
CPU), then it has to emulate those using a CAS loop:

  AtomicOpEmulate64(void *dst, u64 src, u64 (*op)(u64 val, u64 src))
  {
    u64 val, tmp;
    do {
        val = PCIE_Read64(dst);
        tmp = op(val, src);
    } while (!PCIE_AtomicCompareAndSwap(dst, val, tmp));
  }

The hardware has to implement this as part of its MMU or of its PCIE
bridge. Not only can this require quite a bit of die space (ie having
to implement each of the atomic operations the hardware supports), but
it is also prone to starvation by the host CPU. If the host CPU is also
modifying the same address in a tight loop, then the device, which has
a higher latency (as it is on the PCIE bus), is unlikely to win the CAS
race. CPU interrupts and other CPU latencies likely mean this can not
turn into an infinite loop for the hardware. It can however quickly
monopolize the PCIE bandwidth of the device and severely impact its
performance (the device might have to stall multiple of its threads,
each waiting on an atomic operation's completion).

With generic write protection the device can force serialization with
the CPU. This does however slow down the overall process, as the
generic write protection might require expensive CPU TLB flushes and be
prone to lock contention. But sometimes forward progress is more
important than maximizing throughput for one or another component in
the system (here the CPU or the device).

----------------------------------------------------------------------
The What ?

The aim of this patch series is to introduce generic page write
protection for any kind of regular page in a process (private anonymous
or backed by a regular file). This feature already exists, in one form,
for private anonymous pages, as part of KSM (Kernel Samepage Merging).

So this patch series is twofold. First it factors out the page write
protection of KSM into a generic write protection mechanism, which KSM
becomes the first user of. Then it adds support for regular file-backed
page memory (regular file or shared memory aka shmem). To achieve this
I need to cut the dependency a lot of code has on page->mapping, so
that I can set page->mapping to point to a special structure when write
protected.

----------------------------------------------------------------------
The How ?

The cornerstone assumption in this patch series is that page->mapping
is always the same as vma->vm_file->f_mapping (modulo when a page is
truncated). The one exception is with respect to swapping with an nfs
file.

Am I fundamentally wrong in my assumption ?

I believe this is a doable plan because virtually every place knows the
address_space a page belongs to, or someone in the call chain does.
Hence this patchset is all about passing down that information. The
only exception I am aware of is page reclamation (vmscan), but this can
be handled as a special case, as there we are not interested in the
page mapping per se but in reclaiming memory.

Once you have both the struct page and the mapping (without relying on
the struct page to get the latter) you can use the mapping as a unique
key to look up the page->private/page->index values. So all
dereferences of those fields become:
    page_offset(page) -> page_offset(page, mapping)
    page_buffers(page) -> page_buffers(page, mapping)

Note that this only needs special handling for write protected pages,
ie it is the same as before if the page is not write protected, so it
just adds a test each time code calls either helper.

Sinful function (all existing usages are removed in this patchset):
    page_mapping(page)

You can also use the page's buffer head as a unique key. So the
following helpers are added (though I do not use them):
    page_mapping_with_buffers(page, (struct buffer_head *)bh)
    page_offset_with_buffers(page, (struct buffer_head *)bh)

A write protected page has page->mapping pointing to a structure like
struct rmap_item for KSM. So this structure has a list entry for each
unique combination:
    struct write_protect {
        struct list_head *mappings; /* write_protect_mapping list */
        ...
    };

    struct write_protect_mapping {
        struct list_head list;
        struct address_space *mapping;
        unsigned long offset;
        unsigned long private;
        ...
    };

----------------------------------------------------------------------
Methodology:

I have tried to avoid any functional change within patches that add a
new argument to a function, simply by never using the new argument. So
only the function signature changes. Doing so means that each
individual filesystem maintainer should not need to pay too much
attention to the big patches that touch everything; they can focus on
the individual patches to their particular filesystem.

WHAT MUST BE CAREFULLY REVIEWED IS EACH CALL SITE OF AN ADDRESS SPACE
CALLBACK, TO ASCERTAIN THAT I AM USING THE RIGHT MAPPING, THE ONE THE
PAGE BELONGS TO (in some code paths, like pipe, splice, compression,
encryption or symlink, this is not always obvious).

Conversion of common helpers has been done in the same way: I add the
argument, but at all call sites I use page->mapping, so that each patch
does not change behavior for any filesystem. Removing page->mapping is
left to the individual filesystem patches.

As this is an RFC I am not posting the individual filesystem changes
(some are in the git repository).

----------------------------------------------------------------------
Patch grouping

Patch 1 to 3 are changes to individual fs to simplify the rest of the
patchset and especially help when I re-order things. They literally can
not regress (I would be amazed if they did). They are just shuffling
things around a bit in each fs. Getting those in early would probably
help.

Patch 4 deals with pipe; we just need to keep the inode|mapping when
data are piped (to|from|to and from a file).

Patch 5 to 8 add helpers used later.

Patch 9 to 19 each add struct address_space to one of the callbacks
(address_space_operations). As per the methodology this does not make
use of the new argument and modifies call sites conservatively.

Patch 20 to 27 each add struct address_space or struct inode to various
fs and block sub-system helpers/generic implementations. As per the
methodology this does not make use of the new argument and modifies
call sites conservatively.

Patch 29 to 32 add either address_space, inode or buffer_head to block
sub-system helpers.

Patch 35 to 49 deal with the buffer_head infrastructure (again adding
new arguments so that each function knows either the mapping or the
buffer_head pointer without relying on struct page).

Patch 53 to 62 each update a single filesystem in one chunk to remove
all usage of page->mapping, page->private, page->offset and use helpers
or contextual information to get those values. REGRESSIONS, IF ANY, ARE
THERE !

Patch 65 to 68 deal with the swap code path (page_swap_info()).

Patch 76 to 79 factor out the KSM write protection into a generic write
protection mechanism, turning KSM into its first user. This is mostly
shuffling code around, renaming structs and constants and updating any
existing mm code to use the write protection callbacks instead of
calling into KSM. I have tried to be careful, but if it regresses then
it should only regress for KSM users.

----------------------------------------------------------------------

Thank you for reaching this point; feel free to throw electrons at me.


ANNEX (seriously there is more)
----------------------------------------------------------------------
Page wait list code

So here is a brain dump, with all the expected syntax errors, reversed
logic and toddler mistakes. However I believe it is correct overall.

    void __lock_page(struct page *page)
    {
        struct page_waiter waiter;
        struct address_space *special_wait;
        struct address_space *mapping;
        bool again = false;

        special_wait = make_mapping_special_wait(&waiter);
        /* Store a waitqueue in task_struct ? */
        page_waiter_init(&waiter, current);
        spin_lock(&waiter.lock);

        do {
            if (trylock_page(page)) {
                spin_unlock(&waiter.lock);
                /* Our struct was never exposed to the outside world */
                return;
            }

            mapping = READ_ONCE(page->mapping);
            if (mapping_special_wait(mapping)) {
                struct page_waiter *tmp = mapping_to_waiter(mapping);
                if (spin_trylock(&tmp->lock)) {
                    list_add_tail(&waiter.list, &tmp->list);
                    waiter.mapping = tmp->mapping;
                    spin_unlock(&tmp->lock);
                    again = false;
                } else {
                    /* MAYBE kref and spin_lock() ? */
                    again = true;
                }
            } else {
                void *old;
                waiter.mapping = mapping;
                old = atomic64_cmpxchg(&page->mapping, mapping,
                                       special_wait);
                again = old != mapping;
            }
        } while (again);

        /*
         * The nightmare here is a racing unlock_page() that did not
         * see our updated mapping, and another thread locking the page
         * just freshly unlocked from under us. This means someone got
         * in front of the line before us ! That's rude; however the
         * next unlock_page() will not miss us ! So here the trylock
         * is just to avoid waiting for nothing. The rude lucky locker
         * will be ahead of us ...
         */
        if (trylock_page(page)) {
            struct page_waiter *tmp;

            mapping = READ_ONCE(page->mapping);
            if (mapping == special_wait) {
                /*
                 * Ok we are first inline and nobody can add itself
                 * to our list.
                 */
                BUG_ON(!list_empty(&waiter.list));
                page->mapping = waiter.mapping;
                spin_unlock(&waiter.lock);
                goto exit;
            }
            /*
             * We got in front of line and someone else was already
             * waiting, be nice.
             */
            tmp = mapping_to_waiter(mapping);
            tmp->you_own_it_dummy = 1;
            spin_unlock(&waiter.lock);
            wake_up(tmp->queue);
        } else
            spin_unlock(&waiter.lock);

        /* Wait queue in task_struct */
        wait_event(waiter.queue, waiter.you_own_it_dummy);

        /* Lock to serialize page->mapping update */
        spin_lock(&waiter.lock);
        if (list_empty(&waiter.list)) {
            page->mapping = waiter.mapping;
        } else {
            struct page_waiter *tmp;
            tmp = list_first_entry(&waiter.list...);
            page->mapping = make_mapping_special_wait(tmp);
        }
        spin_unlock(&waiter.lock);

    exit:
        /*
         * Do we need an rcu quiesce here, because of a racing
         * spin_trylock ? See the call to page_mapping() below for
         * that function. Waiting for an rcu grace period would be
         * bad I think; maybe we can keep the waiter struct at the
         * top of the stack (dunno if that trick already exists in
         * the kernel, ie preallocating the top of the stack for
         * common structs like waiters) ?
         */
    }

    void unlock_page(struct page *page)
    {
        struct address_space *mapping;

        mapping = READ_ONCE(page->mapping);
        if (mapping_special_wait(mapping)) {
                struct page_waiter *tmp = mapping_to_waiter(mapping);
                tmp->you_own_it_dummy = 1;
                wake_up(tmp->queue);
        } else {
            /* The race is handled in the slow path __lock_page */
            clear_bit_unlock(PG_locked, &page->flags);
        }
    }

    struct address_space *page_mapping(struct page *page)
    {
        struct address_space *mapping;

        rcu_read_lock();
        mapping = READ_ONCE(page->mapping);
        if (mapping_special_wait(mapping)) {
                struct page_waiter *tmp = mapping_to_waiter(mapping);
                mapping = tmp->mapping;
        }
        rcu_read_unlock();
        return mapping;
    }
----------------------------------------------------------------------
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-block@vger.kernel.org

Jérôme Glisse (79):
  fs/fscache: remove fscache_alloc_page()
  fs/ufs: add struct super_block to ubh_bforget() arguments.
  fs/ext4: prepare move_extent_per_page() for the holy crusade
  pipe: add inode field to struct pipe_inode_info
  mm/swap: add an helper to get address_space from swap_entry_t
  mm/page: add helpers to dereference struct page index field
  mm/page: add helpers to find mapping give a page and buffer head
  mm/page: add helpers to find page mapping and private given a bio
  fs: add struct address_space to read_cache_page() callback argument
  fs: add struct address_space to is_dirty_writeback() callback argument
  fs: add struct address_space to is_partially_uptodate() callback
    argument
  fs: add struct address_space to launder_page() callback argument
  fs: add struct address_space to putback_page() callback argument
  fs: add struct address_space to isolate_page() callback argument
  fs: add struct address_space to releasepage() callback argument
  fs: add struct address_space to invalidatepage() callback argument
  fs: add struct address_space to set_page_dirty() callback argument
  fs: add struct address_space to readpage() callback argument
  fs: add struct address_space to writepage() callback argument
  fs: add struct address_space to write_cache_pages() callback argument
  fs: add struct inode to block_write_full_page() arguments
  fs: add struct inode to block_read_full_page() arguments
  fs: add struct inode to map_buffer_to_page() arguments
  fs: add struct inode to nobh_writepage() arguments
  fs: add struct address_space to mpage_writepage() arguments
  fs: add struct address_space to mpage_readpage() arguments
  fs: add struct address_space to fscache_read*() callback arguments
  fs: introduce page_is_truncated() helper
  fs/block: add struct address_space to bdev_write_page() arguments
  fs/block: add struct address_space to __block_write_begin() arguments
  fs/block: add struct address_space to __block_write_begin_int() args
  fs/block: do not rely on page->mapping get it from the context
  fs/journal: add struct super_block to jbd2_journal_forget() arguments.
  fs/journal: add struct inode to jbd2_journal_revoke() arguments.
  fs/buffer: add struct address_space and struct page to end_io callback
  fs/buffer: add struct super_block to bforget() arguments
  fs/buffer: add struct super_block to __bforget() arguments
  fs/buffer: add first buffer flag for first buffer_head in a page
  fs/buffer: add struct address_space to clean_page_buffers() arguments
  fs/buffer: add helper to dereference page's buffers with given mapping
  fs/buffer: add struct address_space to init_page_buffers() args
  fs/buffer: add struct address_space to drop_buffers() args
  fs/buffer: add struct address_space to page_zero_new_buffers() args
  fs/buffer: add struct address_space to create_empty_buffers() args
  fs/buffer: add struct address_space to page_seek_hole_data() args
  fs/buffer: add struct address_space to try_to_free_buffers() args
  fs/buffer: add struct address_space to attach_nobh_buffers() args
  fs/buffer: add struct address_space to mark_buffer_write_io_error()
    args
  fs/buffer: add struct address_space to block_commit_write() arguments
  fs: stop relying on mapping field of struct page, get it from context
  fs: stop relying on mapping field of struct page, get it from context
  fs/buffer: use _page_has_buffers() instead of page_has_buffers()
  fs/lustre: do not rely on page->mapping get it from the context
  fs/nfs: do not rely on page->mapping get it from the context
  fs/ext2: do not rely on page->mapping get it from the context
  fs/ext2: convert page's index lookup to be against specific mapping
  fs/ext4: do not rely on page->mapping get it from the context
  fs/ext4: convert page's index lookup to be against specific mapping
  fs/ext4: convert page's buffers lookup to be against specific mapping
  fs/xfs: do not rely on page->mapping get it from the context
  fs/xfs: convert page's index lookup to be against specific mapping
  fs/xfs: convert page's buffers lookup to be against specific mapping
  mm/page: convert page's index lookup to be against specific mapping
  mm/buffer: use _page_has_buffers() instead of page_has_buffers()
  mm/swap: add struct swap_info_struct swap_readpage() arguments
  mm/swap: add struct address_space to __swap_writepage() arguments
  mm/swap: add struct swap_info_struct *sis to swap_slot_free_notify()
    args
  mm/vma_address: convert page's index lookup to be against specific
    mapping
  fs/journal: add struct address_space to
    jbd2_journal_try_to_free_buffers() arguments
  mm: add struct address_space to mark_buffer_dirty()
  mm: add struct address_space to set_page_dirty()
  mm: add struct address_space to set_page_dirty_lock()
  mm: pass down struct address_space to set_page_dirty()
  mm/page_ronly: add config option for generic read only page framework.
  mm/page_ronly: add page read only core structure and helpers.
  mm/ksm: have ksm select PAGE_RONLY config.
  mm/ksm: hide set_page_stable_node() and page_stable_node()
  mm/ksm: rename PAGE_MAPPING_KSM to PAGE_MAPPING_RONLY
  mm/ksm: set page->mapping to page_ronly struct instead of stable_node.

 Documentation/filesystems/caching/netfs-api.txt    |  19 --
 arch/cris/arch-v32/drivers/cryptocop.c             |   2 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c             |   2 +-
 arch/powerpc/kvm/e500_mmu.c                        |   3 +-
 arch/s390/kvm/interrupt.c                          |   4 +-
 arch/x86/kvm/svm.c                                 |   2 +-
 block/bio.c                                        |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c            |   2 +-
 drivers/gpu/drm/drm_gem.c                          |   2 +-
 drivers/gpu/drm/exynos/exynos_drm_g2d.c            |   2 +-
 drivers/gpu/drm/i915/i915_gem.c                    |   6 +-
 drivers/gpu/drm/i915/i915_gem_fence_reg.c          |   2 +-
 drivers/gpu/drm/i915/i915_gem_userptr.c            |   2 +-
 drivers/gpu/drm/radeon/radeon_ttm.c                |   2 +-
 drivers/gpu/drm/ttm/ttm_tt.c                       |   2 +-
 drivers/infiniband/core/umem.c                     |   2 +-
 drivers/infiniband/core/umem_odp.c                 |   2 +-
 drivers/infiniband/hw/hfi1/user_pages.c            |   2 +-
 drivers/infiniband/hw/qib/qib_user_pages.c         |   2 +-
 drivers/infiniband/hw/usnic/usnic_uiom.c           |   2 +-
 drivers/md/md-bitmap.c                             |   3 +-
 .../media/common/videobuf2/videobuf2-dma-contig.c  |   2 +-
 drivers/media/common/videobuf2/videobuf2-dma-sg.c  |   2 +-
 drivers/media/common/videobuf2/videobuf2-vmalloc.c |   2 +-
 drivers/misc/genwqe/card_utils.c                   |   2 +-
 drivers/misc/vmw_vmci/vmci_queue_pair.c            |   2 +-
 drivers/mtd/devices/block2mtd.c                    |   4 +-
 drivers/platform/goldfish/goldfish_pipe.c          |   2 +-
 drivers/sbus/char/oradax.c                         |   2 +-
 .../lustre/include/lustre_patchless_compat.h       |   2 +-
 drivers/staging/lustre/lustre/llite/dir.c          |   4 +-
 .../staging/lustre/lustre/llite/llite_internal.h   |  15 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c    |  11 +-
 drivers/staging/lustre/lustre/llite/llite_mmap.c   |   7 +-
 drivers/staging/lustre/lustre/llite/rw.c           |   8 +-
 drivers/staging/lustre/lustre/llite/rw26.c         |  20 +-
 drivers/staging/lustre/lustre/llite/vvp_dev.c      |   8 +-
 drivers/staging/lustre/lustre/llite/vvp_io.c       |   8 +-
 drivers/staging/lustre/lustre/llite/vvp_page.c     |  15 +-
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |  16 +-
 drivers/staging/ncpfs/symlink.c                    |   4 +-
 .../interface/vchiq_arm/vchiq_2835_arm.c           |   2 +-
 drivers/vhost/vhost.c                              |   2 +-
 drivers/video/fbdev/core/fb_defio.c                |   3 +-
 fs/9p/cache.c                                      |   7 +-
 fs/9p/cache.h                                      |  11 +-
 fs/9p/vfs_addr.c                                   |  36 ++-
 fs/9p/vfs_file.c                                   |   2 +-
 fs/adfs/dir_f.c                                    |   2 +-
 fs/adfs/inode.c                                    |  11 +-
 fs/affs/bitmap.c                                   |   6 +-
 fs/affs/file.c                                     |  14 +-
 fs/affs/super.c                                    |   2 +-
 fs/affs/symlink.c                                  |   3 +-
 fs/afs/file.c                                      |  31 ++-
 fs/afs/internal.h                                  |   9 +-
 fs/afs/write.c                                     |  11 +-
 fs/aio.c                                           |   8 +-
 fs/befs/linuxvfs.c                                 |  16 +-
 fs/bfs/file.c                                      |  15 +-
 fs/bfs/inode.c                                     |   4 +-
 fs/block_dev.c                                     |  26 +-
 fs/btrfs/ctree.h                                   |   3 +-
 fs/btrfs/disk-io.c                                 |  20 +-
 fs/btrfs/extent_io.c                               |   9 +-
 fs/btrfs/file.c                                    |   6 +-
 fs/btrfs/free-space-cache.c                        |   2 +-
 fs/btrfs/inode.c                                   |  35 +--
 fs/btrfs/ioctl.c                                   |  12 +-
 fs/btrfs/relocation.c                              |   4 +-
 fs/btrfs/scrub.c                                   |   2 +-
 fs/btrfs/send.c                                    |   2 +-
 fs/buffer.c                                        | 267 ++++++++++++---------
 fs/cachefiles/rdwr.c                               |   6 +-
 fs/ceph/addr.c                                     |  28 ++-
 fs/ceph/cache.c                                    |  10 +-
 fs/cifs/file.c                                     |  27 ++-
 fs/cifs/fscache.c                                  |   6 +-
 fs/coda/symlink.c                                  |   3 +-
 fs/cramfs/inode.c                                  |   3 +-
 fs/direct-io.c                                     |   2 +-
 fs/ecryptfs/mmap.c                                 |   7 +-
 fs/efs/inode.c                                     |   5 +-
 fs/efs/symlink.c                                   |   4 +-
 fs/exofs/dir.c                                     |   2 +-
 fs/exofs/inode.c                                   |  30 ++-
 fs/ext2/balloc.c                                   |   6 +-
 fs/ext2/dir.c                                      |  52 ++--
 fs/ext2/ext2.h                                     |   4 +-
 fs/ext2/ialloc.c                                   |   8 +-
 fs/ext2/inode.c                                    |  21 +-
 fs/ext2/namei.c                                    |   4 +-
 fs/ext2/super.c                                    |   4 +-
 fs/ext2/xattr.c                                    |  12 +-
 fs/ext4/ext4.h                                     |   4 +-
 fs/ext4/ext4_jbd2.c                                |  10 +-
 fs/ext4/ialloc.c                                   |   3 +-
 fs/ext4/inline.c                                   |  16 +-
 fs/ext4/inode.c                                    | 234 ++++++++++--------
 fs/ext4/mballoc.c                                  |  40 +--
 fs/ext4/mballoc.h                                  |   1 +
 fs/ext4/mmp.c                                      |   2 +-
 fs/ext4/move_extent.c                              |  35 +--
 fs/ext4/page-io.c                                  |  22 +-
 fs/ext4/readpage.c                                 |  11 +-
 fs/ext4/resize.c                                   |   2 +-
 fs/ext4/super.c                                    |  14 +-
 fs/f2fs/checkpoint.c                               |  16 +-
 fs/f2fs/data.c                                     |  38 +--
 fs/f2fs/dir.c                                      |  10 +-
 fs/f2fs/f2fs.h                                     |  10 +-
 fs/f2fs/file.c                                     |  12 +-
 fs/f2fs/gc.c                                       |   6 +-
 fs/f2fs/inline.c                                   |  18 +-
 fs/f2fs/inode.c                                    |   6 +-
 fs/f2fs/node.c                                     |  28 ++-
 fs/f2fs/node.h                                     |   2 +-
 fs/f2fs/recovery.c                                 |   2 +-
 fs/f2fs/segment.c                                  |  12 +-
 fs/f2fs/super.c                                    |   2 +-
 fs/f2fs/xattr.c                                    |   6 +-
 fs/fat/dir.c                                       |   4 +-
 fs/fat/inode.c                                     |  15 +-
 fs/fat/misc.c                                      |   2 +-
 fs/freevxfs/vxfs_immed.c                           |   7 +-
 fs/freevxfs/vxfs_subr.c                            |  10 +-
 fs/fscache/page.c                                  |  94 +-------
 fs/fuse/dev.c                                      |   2 +-
 fs/fuse/file.c                                     |  20 +-
 fs/gfs2/aops.c                                     |  47 ++--
 fs/gfs2/bmap.c                                     |  10 +-
 fs/gfs2/file.c                                     |   6 +-
 fs/gfs2/inode.h                                    |   3 +-
 fs/gfs2/lops.c                                     |   8 +-
 fs/gfs2/meta_io.c                                  |   7 +-
 fs/gfs2/quota.c                                    |   2 +-
 fs/hfs/bnode.c                                     |  12 +-
 fs/hfs/btree.c                                     |   6 +-
 fs/hfs/inode.c                                     |  16 +-
 fs/hfs/mdb.c                                       |  10 +-
 fs/hfsplus/bitmap.c                                |   8 +-
 fs/hfsplus/bnode.c                                 |  30 +--
 fs/hfsplus/btree.c                                 |   6 +-
 fs/hfsplus/inode.c                                 |  17 +-
 fs/hfsplus/xattr.c                                 |   2 +-
 fs/hostfs/hostfs_kern.c                            |   8 +-
 fs/hpfs/anode.c                                    |  34 +--
 fs/hpfs/buffer.c                                   |   8 +-
 fs/hpfs/dnode.c                                    |   4 +-
 fs/hpfs/ea.c                                       |   4 +-
 fs/hpfs/file.c                                     |  11 +-
 fs/hpfs/inode.c                                    |   2 +-
 fs/hpfs/namei.c                                    |  14 +-
 fs/hpfs/super.c                                    |   6 +-
 fs/hugetlbfs/inode.c                               |   3 +-
 fs/iomap.c                                         |   6 +-
 fs/isofs/compress.c                                |   3 +-
 fs/isofs/inode.c                                   |   5 +-
 fs/isofs/rock.c                                    |   4 +-
 fs/jbd2/commit.c                                   |   5 +-
 fs/jbd2/recovery.c                                 |   2 +-
 fs/jbd2/revoke.c                                   |   4 +-
 fs/jbd2/transaction.c                              |  14 +-
 fs/jffs2/file.c                                    |  12 +-
 fs/jffs2/fs.c                                      |   2 +-
 fs/jffs2/os-linux.h                                |   3 +-
 fs/jfs/inode.c                                     |  11 +-
 fs/jfs/jfs_imap.c                                  |   2 +-
 fs/jfs/jfs_metapage.c                              |  23 +-
 fs/jfs/jfs_mount.c                                 |   2 +-
 fs/jfs/resize.c                                    |   8 +-
 fs/jfs/super.c                                     |   2 +-
 fs/libfs.c                                         |  10 +-
 fs/minix/bitmap.c                                  |  10 +-
 fs/minix/inode.c                                   |  26 +-
 fs/minix/itree_common.c                            |   6 +-
 fs/mpage.c                                         |  70 +++---
 fs/nfs/dir.c                                       |   5 +-
 fs/nfs/direct.c                                    |  12 +-
 fs/nfs/file.c                                      |  25 +-
 fs/nfs/fscache.c                                   |  18 +-
 fs/nfs/fscache.h                                   |   3 +-
 fs/nfs/pagelist.c                                  |  12 +-
 fs/nfs/read.c                                      |  12 +-
 fs/nfs/symlink.c                                   |   6 +-
 fs/nfs/write.c                                     | 122 +++++-----
 fs/nilfs2/alloc.c                                  |  12 +-
 fs/nilfs2/btnode.c                                 |   4 +-
 fs/nilfs2/btree.c                                  |  38 +--
 fs/nilfs2/cpfile.c                                 |  24 +-
 fs/nilfs2/dat.c                                    |   4 +-
 fs/nilfs2/dir.c                                    |   3 +-
 fs/nilfs2/file.c                                   |   2 +-
 fs/nilfs2/gcinode.c                                |   2 +-
 fs/nilfs2/ifile.c                                  |   4 +-
 fs/nilfs2/inode.c                                  |  13 +-
 fs/nilfs2/ioctl.c                                  |   2 +-
 fs/nilfs2/mdt.c                                    |   7 +-
 fs/nilfs2/page.c                                   |   5 +-
 fs/nilfs2/segment.c                                |   7 +-
 fs/nilfs2/sufile.c                                 |  26 +-
 fs/ntfs/aops.c                                     |  25 +-
 fs/ntfs/attrib.c                                   |   8 +-
 fs/ntfs/bitmap.c                                   |   4 +-
 fs/ntfs/file.c                                     |  13 +-
 fs/ntfs/lcnalloc.c                                 |   4 +-
 fs/ntfs/mft.c                                      |   4 +-
 fs/ntfs/super.c                                    |   2 +-
 fs/ntfs/usnjrnl.c                                  |   2 +-
 fs/ocfs2/alloc.c                                   |   2 +-
 fs/ocfs2/aops.c                                    |  30 ++-
 fs/ocfs2/file.c                                    |   4 +-
 fs/ocfs2/inode.c                                   |   2 +-
 fs/ocfs2/mmap.c                                    |   2 +-
 fs/ocfs2/refcounttree.c                            |   3 +-
 fs/ocfs2/symlink.c                                 |   4 +-
 fs/omfs/bitmap.c                                   |   6 +-
 fs/omfs/dir.c                                      |   8 +-
 fs/omfs/file.c                                     |  15 +-
 fs/omfs/inode.c                                    |   4 +-
 fs/orangefs/inode.c                                |   9 +-
 fs/pipe.c                                          |   2 +
 fs/proc/page.c                                     |   2 +-
 fs/qnx4/inode.c                                    |   5 +-
 fs/qnx6/inode.c                                    |   5 +-
 fs/ramfs/inode.c                                   |   8 +-
 fs/reiserfs/file.c                                 |   2 +-
 fs/reiserfs/inode.c                                |  42 ++--
 fs/reiserfs/journal.c                              |  20 +-
 fs/reiserfs/resize.c                               |   4 +-
 fs/romfs/super.c                                   |   3 +-
 fs/splice.c                                        |   3 +-
 fs/squashfs/file.c                                 |   3 +-
 fs/squashfs/symlink.c                              |   4 +-
 fs/sysv/balloc.c                                   |   2 +-
 fs/sysv/ialloc.c                                   |   2 +-
 fs/sysv/inode.c                                    |   8 +-
 fs/sysv/itree.c                                    |  18 +-
 fs/sysv/sysv.h                                     |   4 +-
 fs/ubifs/file.c                                    |  20 +-
 fs/udf/balloc.c                                    |   6 +-
 fs/udf/file.c                                      |   9 +-
 fs/udf/inode.c                                     |  15 +-
 fs/udf/partition.c                                 |   4 +-
 fs/udf/super.c                                     |   8 +-
 fs/udf/symlink.c                                   |   3 +-
 fs/ufs/balloc.c                                    |   4 +-
 fs/ufs/ialloc.c                                    |   4 +-
 fs/ufs/inode.c                                     |  27 ++-
 fs/ufs/util.c                                      |   8 +-
 fs/ufs/util.h                                      |   2 +-
 fs/xfs/xfs_aops.c                                  |  84 ++++---
 fs/xfs/xfs_aops.h                                  |   2 +-
 fs/xfs/xfs_trace.h                                 |   5 +-
 include/linux/balloon_compaction.h                 |  13 -
 include/linux/blkdev.h                             |   5 +-
 include/linux/buffer_head.h                        |  96 +++++---
 include/linux/fs.h                                 |  29 ++-
 include/linux/fscache-cache.h                      |   2 +-
 include/linux/fscache.h                            |  39 +--
 include/linux/jbd2.h                               |  10 +-
 include/linux/ksm.h                                |  12 -
 include/linux/mm-page.h                            | 157 ++++++++++++
 include/linux/mm.h                                 |  11 +-
 include/linux/mpage.h                              |   7 +-
 include/linux/nfs_fs.h                             |   5 +-
 include/linux/nfs_page.h                           |   2 +
 include/linux/page-flags.h                         |  30 ++-
 include/linux/page_ronly.h                         | 169 +++++++++++++
 include/linux/pagemap.h                            |  18 +-
 include/linux/pipe_fs_i.h                          |   2 +
 include/linux/swap.h                               |  20 +-
 include/linux/writeback.h                          |   4 +-
 mm/Kconfig                                         |   4 +
 mm/balloon_compaction.c                            |   7 +-
 mm/filemap.c                                       |  58 ++---
 mm/gup.c                                           |   2 +-
 mm/huge_memory.c                                   |   2 +-
 mm/hugetlb.c                                       |   2 +-
 mm/internal.h                                      |   4 +-
 mm/khugepaged.c                                    |   2 +-
 mm/ksm.c                                           |  26 +-
 mm/memory-failure.c                                |   2 +-
 mm/memory.c                                        |  15 +-
 mm/migrate.c                                       |  18 +-
 mm/mprotect.c                                      |   2 +-
 mm/page-writeback.c                                |  54 +++--
 mm/page_idle.c                                     |   2 +-
 mm/page_io.c                                       |  57 +++--
 mm/process_vm_access.c                             |   2 +-
 mm/readahead.c                                     |   6 +-
 mm/rmap.c                                          |  12 +-
 mm/shmem.c                                         |  46 ++--
 mm/swap_state.c                                    |  14 +-
 mm/swapfile.c                                      |  13 +-
 mm/truncate.c                                      |  32 +--
 mm/vmscan.c                                        |   7 +-
 mm/zsmalloc.c                                      |   8 +-
 mm/zswap.c                                         |   4 +-
 net/ceph/pagevec.c                                 |   2 +-
 net/rds/ib_rdma.c                                  |   2 +-
 net/rds/rdma.c                                     |   4 +-
 302 files changed, 2468 insertions(+), 1691 deletions(-)
 create mode 100644 include/linux/mm-page.h
 create mode 100644 include/linux/page_ronly.h

-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 04/79] pipe: add inode field to struct pipe_inode_info
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
@ 2018-04-04 19:17 ` jglisse
  2018-04-04 19:17 ` [RFC PATCH 05/79] mm/swap: add an helper to get address_space from swap_entry_t jglisse
                   ` (41 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:17 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Eric Biggers, Kees Cook,
	Joe Lawrence, Willy Tarreau, Andrew Morton, Tejun Heo, Jan Kara,
	Josef Bacik, Mel Gorman, Jeff Layton

From: Jérôme Glisse <jglisse@redhat.com>

Pipes are associated with a file and thus an inode. Store a pointer
back to the inode in struct pipe_inode_info; this will be used when
testing whether pages have been truncated.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-fsdevel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Jeff Layton <jlayton@redhat.com>
---
 fs/pipe.c                 | 2 ++
 fs/splice.c               | 1 +
 include/linux/pipe_fs_i.h | 2 ++
 3 files changed, 5 insertions(+)

diff --git a/fs/pipe.c b/fs/pipe.c
index 7b1954caf388..41e115b0bde7 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -715,6 +715,7 @@ static struct inode * get_pipe_inode(void)
 
 	inode->i_pipe = pipe;
 	pipe->files = 2;
+	pipe->inode = inode;
 	pipe->readers = pipe->writers = 1;
 	inode->i_fop = &pipefifo_fops;
 
@@ -903,6 +904,7 @@ static int fifo_open(struct inode *inode, struct file *filp)
 		pipe = alloc_pipe_info();
 		if (!pipe)
 			return -ENOMEM;
+		pipe->inode = inode;
 		pipe->files = 1;
 		spin_lock(&inode->i_lock);
 		if (unlikely(inode->i_pipe)) {
diff --git a/fs/splice.c b/fs/splice.c
index 39e2dc01ac12..acab52a7fe56 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -927,6 +927,7 @@ ssize_t splice_direct_to_actor(struct file *in, struct splice_desc *sd,
 		 * PIPE_READERS appropriately.
 		 */
 		pipe->readers = 1;
+		pipe->inode = file_inode(in);
 
 		current->splice_pipe = pipe;
 	}
diff --git a/include/linux/pipe_fs_i.h b/include/linux/pipe_fs_i.h
index 5a3bb3b7c9ad..171aa78ebbf0 100644
--- a/include/linux/pipe_fs_i.h
+++ b/include/linux/pipe_fs_i.h
@@ -44,6 +44,7 @@ struct pipe_buffer {
  *	@fasync_writers: writer side fasync
  *	@bufs: the circular array of pipe buffers
  *	@user: the user who created this pipe
+ *	@inode: inode this pipe is associated to
  **/
 struct pipe_inode_info {
 	struct mutex mutex;
@@ -60,6 +61,7 @@ struct pipe_inode_info {
 	struct fasync_struct *fasync_writers;
 	struct pipe_buffer *bufs;
 	struct user_struct *user;
+	struct inode *inode;
 };
 
 /*
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread
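The back-pointer added by patch 04 follows a simple pattern: record the owning inode at every site that allocates or binds a pipe, so later code can check page ownership without any extra lookup. A minimal userspace sketch of that pattern (stub types for illustration only, not the kernel's real struct inode or struct pipe_inode_info):

```c
#include <assert.h>
#include <stddef.h>

/* Stub stand-ins for the kernel types, for illustration only. */
struct inode { unsigned long i_ino; };

struct pipe_inode_info {
	struct inode *inode;	/* back-pointer set at creation time */
	int readers, writers;
};

/* Mimics get_pipe_inode()/fifo_open(): record the owning inode. */
static void pipe_bind_inode(struct pipe_inode_info *pipe, struct inode *inode)
{
	pipe->inode = inode;
}

/* A later check (e.g. "has this page been truncated?") can compare a
 * page's mapping host against the recorded inode. */
static int pipe_inode_matches(const struct pipe_inode_info *pipe,
			      const struct inode *inode)
{
	return pipe->inode == inode;
}
```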

* [RFC PATCH 05/79] mm/swap: add an helper to get address_space from swap_entry_t
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
  2018-04-04 19:17 ` [RFC PATCH 04/79] pipe: add inode field to struct pipe_inode_info jglisse
@ 2018-04-04 19:17 ` jglisse
  2018-04-04 19:17 ` [RFC PATCH 06/79] mm/page: add helpers to dereference struct page index field jglisse
                   ` (40 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:17 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Michal Hocko,
	Johannes Weiner, Andrew Morton

From: Jérôme Glisse <jglisse@redhat.com>

Each swap entry is associated with a file and thus an address_space.
That address_space is used for reading from and writing to swap
storage. This patch adds a helper to get the address_space from a
swp_entry_t.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 include/linux/swap.h | 1 +
 mm/swapfile.c        | 7 +++++++
 2 files changed, 8 insertions(+)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index a1a3f4ed94ce..e2155df84d77 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -475,6 +475,7 @@ extern int __swp_swapcount(swp_entry_t entry);
 extern int swp_swapcount(swp_entry_t entry);
 extern struct swap_info_struct *page_swap_info(struct page *);
 extern struct swap_info_struct *swp_swap_info(swp_entry_t entry);
+struct address_space *swap_entry_to_address_space(swp_entry_t swap);
 extern bool reuse_swap_page(struct page *, int *);
 extern int try_to_free_swap(struct page *);
 struct backing_dev_info;
diff --git a/mm/swapfile.c b/mm/swapfile.c
index c7a33717d079..a913d4b45866 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3467,6 +3467,13 @@ struct swap_info_struct *swp_swap_info(swp_entry_t entry)
 	return swap_info[swp_type(entry)];
 }
 
+struct address_space *swap_entry_to_address_space(swp_entry_t swap)
+{
+	struct swap_info_struct *sis = swp_swap_info(swap);
+
+	return sis->swap_file->f_mapping;
+}
+
 struct swap_info_struct *page_swap_info(struct page *page)
 {
 	swp_entry_t entry = { .val = page_private(page) };
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread
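The lookup chain the new helper encapsulates (swap entry -> swap_info_struct -> swap file -> f_mapping) can be sketched with stub userspace types; note the bit split in swp_type() below is illustrative only, not the kernel's actual swp_entry_t encoding:

```c
#include <assert.h>
#include <stddef.h>

/* Stub stand-ins for the kernel types, for illustration only. */
struct address_space { int id; };
struct file { struct address_space *f_mapping; };
struct swap_info_struct { struct file *swap_file; };

typedef struct { unsigned long val; } swp_entry_t;

#define MAX_SWAPFILES 32
static struct swap_info_struct *swap_info[MAX_SWAPFILES];

/* Illustrative encoding: swap type in the high bits, offset below. */
static unsigned int swp_type(swp_entry_t entry)
{
	return (unsigned int)(entry.val >> 24);
}

static struct swap_info_struct *swp_swap_info(swp_entry_t entry)
{
	return swap_info[swp_type(entry)];
}

/* Mirrors the new helper: the mapping used for swap storage I/O. */
static struct address_space *swap_entry_to_address_space(swp_entry_t swap)
{
	return swp_swap_info(swap)->swap_file->f_mapping;
}
```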

* [RFC PATCH 06/79] mm/page: add helpers to dereference struct page index field
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
  2018-04-04 19:17 ` [RFC PATCH 04/79] pipe: add inode field to struct pipe_inode_info jglisse
  2018-04-04 19:17 ` [RFC PATCH 05/79] mm/swap: add an helper to get address_space from swap_entry_t jglisse
@ 2018-04-04 19:17 ` jglisse
  2018-04-04 19:17 ` [RFC PATCH 07/79] mm/page: add helpers to find mapping give a page and buffer head jglisse
                   ` (39 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:17 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Andrew Morton,
	Alexander Viro, Tejun Heo, Jan Kara, Josef Bacik, Mel Gorman

From: Jérôme Glisse <jglisse@redhat.com>

Regroup all helpers that dereference the struct page.index field into
one place and require the address_space (mapping) against which the
caller is looking up the index (offset, pgoff, ...).

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: linux-mm@kvack.org
CC: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 include/linux/mm-page.h | 136 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/mm.h      |   5 ++
 2 files changed, 141 insertions(+)
 create mode 100644 include/linux/mm-page.h

diff --git a/include/linux/mm-page.h b/include/linux/mm-page.h
new file mode 100644
index 000000000000..2981db45eeef
--- /dev/null
+++ b/include/linux/mm-page.h
@@ -0,0 +1,136 @@
+/*
+ * Copyright 2018 Red Hat Inc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Authors: Jérôme Glisse <jglisse@redhat.com>
+ */
+/*
+ * This header file regroups everything that deals with struct page and has no
+ * outside dependencies except basic type header files.
+ */
+/* Protected against rogue include ... do not include this file directly */
+#ifdef DOT_NOT_INCLUDE___INSIDE_MM
+#ifndef MM_PAGE_H
+#define MM_PAGE_H
+
+/* External struct dependencies: */
+struct address_space;
+
+/* External function dependencies: */
+extern pgoff_t __page_file_index(struct page *page);
+
+
+/*
+ * _page_index() - return page index value (with special case for swap)
+ * @page: page struct pointer for which we want the index value
+ * @mapping: mapping against which we want the page index
+ * Returns: index value for the page in the given mapping
+ *
+ * The index value of a page is relative to a given mapping, and pages in the
+ * swap cache need special handling. For a swap cache page what we want is the
+ * swap offset, which is stored encoded with other fields in page->private.
+ */
+static inline unsigned long _page_index(struct page *page,
+		struct address_space *mapping)
+{
+	if (unlikely(PageSwapCache(page)))
+		return __page_file_index(page);
+	return page->index;
+}
+
+/*
+ * _page_set_index() - set page index value against a given mapping
+ * @page: page struct pointer for which we want the index value
+ * @mapping: mapping against which we want the page index
+ * @index: index value to set
+ */
+static inline void _page_set_index(struct page *page,
+		struct address_space *mapping,
+		unsigned long index)
+{
+	page->index = index;
+}
+
+/*
+ * _page_to_index() - page index value against a given mapping
+ * @page: page struct pointer for which we want the index value
+ * @mapping: mapping against which we want the page index
+ * Returns: index value for the page in the given mapping
+ *
+ * The index value of a page is relative to a given mapping. THP pages need
+ * special handling as the index is only set in the head page; the final index
+ * value is the head page index plus the current page's offset from the head.
+ */
+static inline unsigned long _page_to_index(struct page *page,
+		struct address_space *mapping)
+{
+	unsigned long pgoff;
+
+	if (likely(!PageTransTail(page)))
+		return page->index;
+
+	/*
+	 *  We don't initialize ->index for tail pages: calculate based on
+	 *  head page
+	 */
+	pgoff = compound_head(page)->index;
+	pgoff += page - compound_head(page);
+	return pgoff;
+}
+
+/*
+ * _page_to_pgoff() - page pgoff value against a given mapping
+ * @page: page struct pointer for which we want the index value
+ * @mapping: mapping against which we want the page index
+ * Returns: pgoff value for the page in the given mapping
+ *
+ * The pgoff value of a page is relative to a given mapping. Hugetlb pages need
+ * special handling as their page->index is in units of the huge page size
+ * (PMD_SIZE or PUD_SIZE), not in PAGE_SIZE units as for other types of pages.
+ *
+ * FIXME convert hugetlb to multi-order entries.
+ */
+static inline unsigned long _page_to_pgoff(struct page *page,
+		struct address_space *mapping)
+{
+	if (unlikely(PageHeadHuge(page)))
+		return page->index << compound_order(page);
+
+	return _page_to_index(page, mapping);
+}
+
+/*
+ * _page_offset() - page offset (in bytes) against a given mapping
+ * @page: page struct pointer for which we want the index value
+ * @mapping: mapping against which we want the page index
+ * Returns: page offset (in bytes) for the page in the given mapping
+ */
+static inline unsigned long _page_offset(struct page *page,
+		struct address_space *mapping)
+{
+	return page->index << PAGE_SHIFT;
+}
+
+/*
+ * _page_file_offset() - page offset (in bytes) against a given mapping
+ * @page: page struct pointer for which we want the index value
+ * @mapping: mapping against which we want the page index
+ * Returns: page offset (in bytes) for the page in the given mapping
+ */
+static inline unsigned long _page_file_offset(struct page *page,
+		struct address_space *mapping)
+{
+	return page->index << PAGE_SHIFT;
+}
+
+#endif /* MM_PAGE_H */
+#endif /* DOT_NOT_INCLUDE___INSIDE_MM */
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ad06d42adb1a..874a10f011ee 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2673,5 +2673,10 @@ void __init setup_nr_node_ids(void);
 static inline void setup_nr_node_ids(void) {}
 #endif
 
+/* Include here while header consolidation process is in progress */
+#define DOT_NOT_INCLUDE___INSIDE_MM
+#include <linux/mm-page.h>
+#undef DOT_NOT_INCLUDE___INSIDE_MM
+
 #endif /* __KERNEL__ */
 #endif /* _LINUX_MM_H */
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread
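The compound-page arithmetic in _page_to_index() above — tail pages carry no index of their own, so it is derived from the head page — can be checked with a small userspace model (a toy struct page with an explicit head pointer standing in for compound_head(); not the kernel's real types):

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a compound page; not the kernel's struct page. */
struct page {
	unsigned long index;	/* meaningful on head pages only */
	struct page *head;	/* NULL for head pages, else the head */
};

static struct page *compound_head(struct page *page)
{
	return page->head ? page->head : page;
}

static int PageTransTail(const struct page *page)
{
	return page->head != NULL;
}

/* Mirrors _page_to_index(): tail pages derive their index from the
 * head page index plus their distance from the head page. */
static unsigned long page_to_index(struct page *page)
{
	unsigned long pgoff;

	if (!PageTransTail(page))
		return page->index;

	pgoff = compound_head(page)->index;
	pgoff += page - compound_head(page);
	return pgoff;
}
```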

* [RFC PATCH 07/79] mm/page: add helpers to find mapping give a page and buffer head
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (2 preceding siblings ...)
  2018-04-04 19:17 ` [RFC PATCH 06/79] mm/page: add helpers to dereference struct page index field jglisse
@ 2018-04-04 19:17 ` jglisse
  2018-04-04 19:17 ` [RFC PATCH 08/79] mm/page: add helpers to find page mapping and private given a bio jglisse
                   ` (38 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:17 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Andrew Morton,
	Alexander Viro, Tejun Heo, Jan Kara, Josef Bacik, Mel Gorman

From: Jérôme Glisse <jglisse@redhat.com>

For now this simply uses the existing page_mapping() inline. Later it
will use the buffer head pointer as a key to look up the mapping for a
write protected page.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: linux-mm@kvack.org
CC: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 include/linux/mm-page.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/include/linux/mm-page.h b/include/linux/mm-page.h
index 2981db45eeef..647a8a8cf9ba 100644
--- a/include/linux/mm-page.h
+++ b/include/linux/mm-page.h
@@ -132,5 +132,17 @@ static inline unsigned long _page_file_offset(struct page *page,
 	return page->index << PAGE_SHIFT;
 }
 
+/*
+ * fs_page_mapping_get_with_bh() - page mapping given a buffer_head
+ * @page: page struct pointer for which we want the mapping
+ * @bh: buffer_head associated with the page for the mapping
+ * Returns: page mapping for the given buffer head
+ */
+static inline struct address_space *fs_page_mapping_get_with_bh(
+		struct page *page, struct buffer_head *bh)
+{
+	return page_mapping(page);
+}
+
 #endif /* MM_PAGE_H */
 #endif /* DOT_NOT_INCLUDE___INSIDE_MM */
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 08/79] mm/page: add helpers to find page mapping and private given a bio
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (3 preceding siblings ...)
  2018-04-04 19:17 ` [RFC PATCH 07/79] mm/page: add helpers to find mapping give a page and buffer head jglisse
@ 2018-04-04 19:17 ` jglisse
  2018-04-04 19:17 ` [RFC PATCH 09/79] fs: add struct address_space to read_cache_page() callback argument jglisse
                   ` (37 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:17 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Andrew Morton,
	Alexander Viro, Tejun Heo, Jan Kara, Josef Bacik, Mel Gorman

From: Jérôme Glisse <jglisse@redhat.com>

When a page undergoes I/O it is associated with a unique bio, and thus
we can use that bio to look up other page fields which are relevant
only for the bio under consideration.

Note this only applies when the page is special, i.e. page->mapping
points to some special structure which is not a valid struct
address_space.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: linux-mm@kvack.org
CC: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 include/linux/mm-page.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/include/linux/mm-page.h b/include/linux/mm-page.h
index 647a8a8cf9ba..6ec3ba19b1a4 100644
--- a/include/linux/mm-page.h
+++ b/include/linux/mm-page.h
@@ -24,6 +24,7 @@
 
 /* External struct dependencies: */
 struct address_space;
+struct bio;
 
 /* External function dependencies: */
 extern pgoff_t __page_file_index(struct page *page);
@@ -144,5 +145,13 @@ static inline struct address_space *fs_page_mapping_get_with_bh(
 	return page_mapping(page);
 }
 
+static inline void bio_page_mapping_and_private(struct page *page,
+		struct bio *bio, struct address_space **mappingp,
+		unsigned long *privatep)
+{
+	*mappingp = page->mapping;
+	*privatep = page_private(page);
+}
+
 #endif /* MM_PAGE_H */
 #endif /* DOT_NOT_INCLUDE___INSIDE_MM */
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 09/79] fs: add struct address_space to read_cache_page() callback argument
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (4 preceding siblings ...)
  2018-04-04 19:17 ` [RFC PATCH 08/79] mm/page: add helpers to find page mapping and private given a bio jglisse
@ 2018-04-04 19:17 ` jglisse
  2018-04-04 19:17 ` [RFC PATCH 20/79] fs: add struct address_space to write_cache_pages() " jglisse
                   ` (36 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:17 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Alexander Viro, Tejun Heo,
	Jan Kara, Josef Bacik, Mel Gorman, Jeff Layton

From: Jérôme Glisse <jglisse@redhat.com>

Add struct address_space to the callback arguments of read_cache_page()
and read_cache_pages(). Note this patch only adds the argument and
modifies the callback function signatures; it does not make use of the
new argument and thus should be regression free.

One step toward dropping reliance on page->mapping.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Jeff Layton <jlayton@redhat.com>
---
 drivers/staging/lustre/lustre/mdc/mdc_request.c |  3 ++-
 fs/9p/vfs_addr.c                                | 13 ++++++++++++-
 fs/afs/file.c                                   |  7 ++++---
 fs/afs/internal.h                               |  2 +-
 fs/exofs/inode.c                                |  5 +++--
 fs/fuse/file.c                                  |  3 ++-
 fs/gfs2/aops.c                                  |  5 +++--
 fs/jffs2/file.c                                 |  6 ++++--
 fs/jffs2/fs.c                                   |  2 +-
 fs/jffs2/os-linux.h                             |  3 ++-
 fs/nfs/dir.c                                    |  3 ++-
 fs/nfs/read.c                                   |  3 ++-
 fs/nfs/symlink.c                                |  6 ++++--
 include/linux/pagemap.h                         |  8 ++++++--
 mm/filemap.c                                    | 14 +++++++++++---
 mm/readahead.c                                  |  4 ++--
 16 files changed, 61 insertions(+), 26 deletions(-)

diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index 03e55bca4ada..4814ef083824 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -1122,7 +1122,8 @@ struct readpage_param {
  * in PAGE_SIZE (if PAGE_SIZE greater than LU_PAGE_SIZE), and the
  * lu_dirpage for this integrated page will be adjusted.
  **/
-static int mdc_read_page_remote(void *data, struct page *page0)
+static int mdc_read_page_remote(void *data, struct address_space *mapping,
+				struct page *page0)
 {
 	struct readpage_param *rp = data;
 	struct page **page_pool;
diff --git a/fs/9p/vfs_addr.c b/fs/9p/vfs_addr.c
index e1cbdfdb7c68..61f70e63a525 100644
--- a/fs/9p/vfs_addr.c
+++ b/fs/9p/vfs_addr.c
@@ -99,6 +99,17 @@ static int v9fs_vfs_readpage(struct file *filp, struct page *page)
 	return v9fs_fid_readpage(filp->private_data, page);
 }
 
+/*
+ * This wrapper is needed to avoid a callback cast on read_cache_pages(),
+ * which would defeat the compiler's type checking of the signature.
+ */
+static int v9fs_vfs_readpage_filler(void *data, struct address_space *mapping,
+				    struct page *page)
+{
+	return v9fs_vfs_readpage(data, page);
+}
+
+
 /**
  * v9fs_vfs_readpages - read a set of pages from 9P
  *
@@ -122,7 +133,7 @@ static int v9fs_vfs_readpages(struct file *filp, struct address_space *mapping,
 	if (ret == 0)
 		return ret;
 
-	ret = read_cache_pages(mapping, pages, (void *)v9fs_vfs_readpage, filp);
+	ret = read_cache_pages(mapping, pages, v9fs_vfs_readpage_filler, filp);
 	p9_debug(P9_DEBUG_VFS, "  = %d\n", ret);
 	return ret;
 }
diff --git a/fs/afs/file.c b/fs/afs/file.c
index a39192ced99e..f457b0144946 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -247,7 +247,8 @@ int afs_fetch_data(struct afs_vnode *vnode, struct key *key, struct afs_read *de
 /*
  * read page from file, directory or symlink, given a key to use
  */
-int afs_page_filler(void *data, struct page *page)
+int afs_page_filler(void *data, struct address_space *mapping,
+		    struct page *page)
 {
 	struct inode *inode = page->mapping->host;
 	struct afs_vnode *vnode = AFS_FS_I(inode);
@@ -373,14 +374,14 @@ static int afs_readpage(struct file *file, struct page *page)
 	if (file) {
 		key = afs_file_key(file);
 		ASSERT(key != NULL);
-		ret = afs_page_filler(key, page);
+		ret = afs_page_filler(key, page->mapping, page);
 	} else {
 		struct inode *inode = page->mapping->host;
 		key = afs_request_key(AFS_FS_S(inode->i_sb)->cell);
 		if (IS_ERR(key)) {
 			ret = PTR_ERR(key);
 		} else {
-			ret = afs_page_filler(key, page);
+			ret = afs_page_filler(key, page->mapping, page);
 			key_put(key);
 		}
 	}
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index f38d6a561a84..4c449145f668 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -656,7 +656,7 @@ extern void afs_put_wb_key(struct afs_wb_key *);
 extern int afs_open(struct inode *, struct file *);
 extern int afs_release(struct inode *, struct file *);
 extern int afs_fetch_data(struct afs_vnode *, struct key *, struct afs_read *);
-extern int afs_page_filler(void *, struct page *);
+extern int afs_page_filler(void *, struct address_space *, struct page *);
 extern void afs_put_read(struct afs_read *);
 
 /*
diff --git a/fs/exofs/inode.c b/fs/exofs/inode.c
index 0ac62811b341..e7d25af1cf8a 100644
--- a/fs/exofs/inode.c
+++ b/fs/exofs/inode.c
@@ -377,7 +377,8 @@ static int read_exec(struct page_collect *pcol)
  * and will start a new collection. Eventually caller must submit the last
  * segment if present.
  */
-static int readpage_strip(void *data, struct page *page)
+static int readpage_strip(void *data, struct address_space *mapping,
+			  struct page *page)
 {
 	struct page_collect *pcol = data;
 	struct inode *inode = pcol->inode;
@@ -499,7 +500,7 @@ static int _readpage(struct page *page, bool read_4_write)
 	_pcol_init(&pcol, 1, page->mapping->host);
 
 	pcol.read_4_write = read_4_write;
-	ret = readpage_strip(&pcol, page);
+	ret = readpage_strip(&pcol, page->mapping, page);
 	if (ret) {
 		EXOFS_ERR("_readpage => %d\n", ret);
 		return ret;
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index a201fb0ac64f..5f342cbbf015 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -837,7 +837,8 @@ struct fuse_fill_data {
 	unsigned nr_pages;
 };
 
-static int fuse_readpages_fill(void *_data, struct page *page)
+static int fuse_readpages_fill(void *_data, struct address_space *mapping,
+			       struct page *page)
 {
 	struct fuse_fill_data *data = _data;
 	struct fuse_req *req = data->req;
diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index 2f725b4a386b..45fa202b5fbc 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -509,7 +509,8 @@ static int stuffed_readpage(struct gfs2_inode *ip, struct page *page)
  * called by gfs2_readpage() once the required lock has been granted.
  */
 
-static int __gfs2_readpage(void *file, struct page *page)
+static int __gfs2_readpage(void *file, struct address_space *mapping,
+			   struct page *page)
 {
 	struct gfs2_inode *ip = GFS2_I(page->mapping->host);
 	struct gfs2_sbd *sdp = GFS2_SB(page->mapping->host);
@@ -553,7 +554,7 @@ static int gfs2_readpage(struct file *file, struct page *page)
 	error = AOP_TRUNCATED_PAGE;
 	lock_page(page);
 	if (page->mapping == mapping && !PageUptodate(page))
-		error = __gfs2_readpage(file, page);
+		error = __gfs2_readpage(file, mapping, page);
 	else
 		unlock_page(page);
 	gfs2_glock_dq(&gh);
diff --git a/fs/jffs2/file.c b/fs/jffs2/file.c
index bd0428bebe9b..e2faea96bce5 100644
--- a/fs/jffs2/file.c
+++ b/fs/jffs2/file.c
@@ -109,8 +109,10 @@ static int jffs2_do_readpage_nolock (struct inode *inode, struct page *pg)
 	return ret;
 }
 
-int jffs2_do_readpage_unlock(struct inode *inode, struct page *pg)
+int jffs2_do_readpage_unlock (void *data, struct address_space *mapping,
+			      struct page *pg)
 {
+	struct inode *inode = data;
 	int ret = jffs2_do_readpage_nolock(inode, pg);
 	unlock_page(pg);
 	return ret;
@@ -123,7 +125,7 @@ static int jffs2_readpage (struct file *filp, struct page *pg)
 	int ret;
 
 	mutex_lock(&f->sem);
-	ret = jffs2_do_readpage_unlock(pg->mapping->host, pg);
+	ret = jffs2_do_readpage_unlock(pg->mapping->host, pg->mapping, pg);
 	mutex_unlock(&f->sem);
 	return ret;
 }
diff --git a/fs/jffs2/fs.c b/fs/jffs2/fs.c
index eab04eca95a3..7fbe8a7843b9 100644
--- a/fs/jffs2/fs.c
+++ b/fs/jffs2/fs.c
@@ -686,7 +686,7 @@ unsigned char *jffs2_gc_fetch_page(struct jffs2_sb_info *c,
 	struct page *pg;
 
 	pg = read_cache_page(inode->i_mapping, offset >> PAGE_SHIFT,
-			     (void *)jffs2_do_readpage_unlock, inode);
+			     jffs2_do_readpage_unlock, inode);
 	if (IS_ERR(pg))
 		return (void *)pg;
 
diff --git a/fs/jffs2/os-linux.h b/fs/jffs2/os-linux.h
index c2fbec19c616..843a1d61ad73 100644
--- a/fs/jffs2/os-linux.h
+++ b/fs/jffs2/os-linux.h
@@ -154,7 +154,8 @@ extern const struct file_operations jffs2_file_operations;
 extern const struct inode_operations jffs2_file_inode_operations;
 extern const struct address_space_operations jffs2_file_address_operations;
 int jffs2_fsync(struct file *, loff_t, loff_t, int);
-int jffs2_do_readpage_unlock (struct inode *inode, struct page *pg);
+int jffs2_do_readpage_unlock (void *data, struct address_space *mapping,
+			      struct page *pg);
 
 /* ioctl.c */
 long jffs2_ioctl(struct file *, unsigned int, unsigned long);
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 2f3f86726f5b..1d988a0e91ee 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -664,7 +664,8 @@ int nfs_readdir_xdr_to_array(nfs_readdir_descriptor_t *desc, struct page *page,
  * We only need to convert from xdr once so future lookups are much simpler
  */
 static
-int nfs_readdir_filler(nfs_readdir_descriptor_t *desc, struct page* page)
+int nfs_readdir_filler(nfs_readdir_descriptor_t *desc,
+		       struct address_space *mapping, struct page* page)
 {
 	struct inode	*inode = file_inode(desc->file);
 	int ret;
diff --git a/fs/nfs/read.c b/fs/nfs/read.c
index 48d7277c60a9..2da6c62b1d3d 100644
--- a/fs/nfs/read.c
+++ b/fs/nfs/read.c
@@ -354,7 +354,8 @@ struct nfs_readdesc {
 };
 
 static int
-readpage_async_filler(void *data, struct page *page)
+readpage_async_filler(void *data, struct address_space *mapping,
+		      struct page *page)
 {
 	struct nfs_readdesc *desc = (struct nfs_readdesc *)data;
 	struct nfs_page *new;
diff --git a/fs/nfs/symlink.c b/fs/nfs/symlink.c
index 06eb44b47885..c0358f77222e 100644
--- a/fs/nfs/symlink.c
+++ b/fs/nfs/symlink.c
@@ -26,8 +26,10 @@
  * and straight-forward than readdir caching.
  */
 
-static int nfs_symlink_filler(struct inode *inode, struct page *page)
+static int nfs_symlink_filler(void *data, struct address_space *mapping,
+			      struct page *page)
 {
+	struct inode *inode = data;
 	int error;
 
 	error = NFS_PROTO(inode)->readlink(inode, page, 0, PAGE_SIZE);
@@ -66,7 +68,7 @@ static const char *nfs_get_link(struct dentry *dentry,
 		if (err)
 			return err;
 		page = read_cache_page(&inode->i_data, 0,
-					(filler_t *)nfs_symlink_filler, inode);
+				       nfs_symlink_filler, inode);
 		if (IS_ERR(page))
 			return ERR_CAST(page);
 	}
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 34ce3ebf97d5..89f5b1db4993 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -239,7 +239,7 @@ static inline gfp_t readahead_gfp_mask(struct address_space *x)
 	return mapping_gfp_mask(x) | __GFP_NORETRY | __GFP_NOWARN;
 }
 
-typedef int filler_t(void *, struct page *);
+typedef int filler_t(void *, struct address_space *, struct page *);
 
 pgoff_t page_cache_next_hole(struct address_space *mapping,
 			     pgoff_t index, unsigned long max_scan);
@@ -395,10 +395,14 @@ extern struct page * read_cache_page_gfp(struct address_space *mapping,
 extern int read_cache_pages(struct address_space *mapping,
 		struct list_head *pages, filler_t *filler, void *data);
 
+int read_mapping_page_readpage_wrapper(void *data,
+				       struct address_space *mapping,
+				       struct page *page);
+
 static inline struct page *read_mapping_page(struct address_space *mapping,
 				pgoff_t index, void *data)
 {
-	filler_t *filler = (filler_t *)mapping->a_ops->readpage;
+	filler_t *filler = read_mapping_page_readpage_wrapper;
 	return read_cache_page(mapping, index, filler, data);
 }
 
diff --git a/mm/filemap.c b/mm/filemap.c
index 693f62212a59..007e0aca723f 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2779,7 +2779,7 @@ static struct page *wait_on_page_read(struct page *page)
 
 static struct page *do_read_cache_page(struct address_space *mapping,
 				pgoff_t index,
-				int (*filler)(void *, struct page *),
+				filler_t filler,
 				void *data,
 				gfp_t gfp)
 {
@@ -2801,7 +2801,7 @@ static struct page *do_read_cache_page(struct address_space *mapping,
 		}
 
 filler:
-		err = filler(data, page);
+		err = filler(data, mapping, page);
 		if (err < 0) {
 			put_page(page);
 			return ERR_PTR(err);
@@ -2872,6 +2872,14 @@ static struct page *do_read_cache_page(struct address_space *mapping,
 	return page;
 }
 
+int read_mapping_page_readpage_wrapper(void *data,
+				       struct address_space *mapping,
+				       struct page *page)
+{
+	return mapping->a_ops->readpage(data, page);
+}
+EXPORT_SYMBOL(read_mapping_page_readpage_wrapper);
+
 /**
  * read_cache_page - read into page cache, fill it if needed
  * @mapping:	the page's address_space
@@ -2886,7 +2894,7 @@ static struct page *do_read_cache_page(struct address_space *mapping,
  */
 struct page *read_cache_page(struct address_space *mapping,
 				pgoff_t index,
-				int (*filler)(void *, struct page *),
+				filler_t filler,
 				void *data)
 {
 	return do_read_cache_page(mapping, index, filler, data, mapping_gfp_mask(mapping));
diff --git a/mm/readahead.c b/mm/readahead.c
index c4ca70239233..a20d3992525c 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -81,7 +81,7 @@ static void read_cache_pages_invalidate_pages(struct address_space *mapping,
  * Hides the details of the LRU cache etc from the filesystems.
  */
 int read_cache_pages(struct address_space *mapping, struct list_head *pages,
-			int (*filler)(void *, struct page *), void *data)
+			filler_t filler, void *data)
 {
 	struct page *page;
 	int ret = 0;
@@ -96,7 +96,7 @@ int read_cache_pages(struct address_space *mapping, struct list_head *pages,
 		}
 		put_page(page);
 
-		ret = filler(data, page);
+		ret = filler(data, mapping, page);
 		if (unlikely(ret)) {
 			read_cache_pages_invalidate_pages(mapping, pages);
 			break;
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 20/79] fs: add struct address_space to write_cache_pages() callback argument
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (5 preceding siblings ...)
  2018-04-04 19:17 ` [RFC PATCH 09/79] fs: add struct address_space to read_cache_page() callback argument jglisse
@ 2018-04-04 19:17 ` jglisse
  2018-04-04 19:17 ` [RFC PATCH 22/79] fs: add struct inode to block_read_full_page() arguments jglisse
                   ` (35 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:17 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Alexander Viro, Tejun Heo,
	Jan Kara, Josef Bacik, Mel Gorman, Jeff Layton

From: Jérôme Glisse <jglisse@redhat.com>

Add struct address_space to the callback arguments of write_cache_pages().
Note this patch only adds the argument and modifies all callback function
signatures; it does not make use of the new argument and thus should be
regression free.

One step toward dropping reliance on page->mapping.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Jeff Layton <jlayton@redhat.com>
---
 fs/exofs/inode.c          | 2 +-
 fs/ext4/inode.c           | 7 +++----
 fs/fuse/file.c            | 1 +
 fs/mpage.c                | 6 +++---
 fs/nfs/write.c            | 4 +++-
 fs/xfs/xfs_aops.c         | 3 ++-
 include/linux/writeback.h | 4 ++--
 mm/page-writeback.c       | 9 ++++-----
 8 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/fs/exofs/inode.c b/fs/exofs/inode.c
index 41f6b04cbfca..54d6b7dbd4e7 100644
--- a/fs/exofs/inode.c
+++ b/fs/exofs/inode.c
@@ -691,7 +691,7 @@ static int write_exec(struct page_collect *pcol)
  * previous segment and will start a new collection.
  * Eventually caller must submit the last segment if present.
  */
-static int writepage_strip(struct page *page,
+static int writepage_strip(struct page *page, struct address_space *mapping,
 			   struct writeback_control *wbc_unused, void *data)
 {
 	struct page_collect *pcol = data;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 96dcae1937c8..63bf0160c579 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2697,10 +2697,9 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
 	return err;
 }
 
-static int __writepage(struct page *page, struct writeback_control *wbc,
-		       void *data)
+static int __writepage(struct page *page, struct address_space *mapping,
+		       struct writeback_control *wbc, void *data)
 {
-	struct address_space *mapping = data;
 	int ret = ext4_writepage(mapping, page, wbc);
 	mapping_set_error(mapping, ret);
 	return ret;
@@ -2746,7 +2745,7 @@ static int ext4_writepages(struct address_space *mapping,
 		struct blk_plug plug;
 
 		blk_start_plug(&plug);
-		ret = write_cache_pages(mapping, wbc, __writepage, mapping);
+		ret = write_cache_pages(mapping, wbc, __writepage, NULL);
 		blk_finish_plug(&plug);
 		goto out_writepages;
 	}
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 3c602632b33a..e0562d04d84f 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1794,6 +1794,7 @@ static bool fuse_writepage_in_flight(struct fuse_req *new_req,
 }
 
 static int fuse_writepages_fill(struct page *page,
+		struct address_space *mapping,
 		struct writeback_control *wbc, void *_data)
 {
 	struct fuse_fill_wb_data *data = _data;
diff --git a/fs/mpage.c b/fs/mpage.c
index b03a82d5b908..d25f08f46090 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -479,8 +479,8 @@ void clean_page_buffers(struct page *page)
 	clean_buffers(page, ~0U);
 }
 
-static int __mpage_writepage(struct page *page, struct writeback_control *wbc,
-		      void *data)
+static int __mpage_writepage(struct page *page, struct address_space *_mapping,
+			     struct writeback_control *wbc, void *data)
 {
 	struct mpage_data *mpd = data;
 	struct bio *bio = mpd->bio;
@@ -734,7 +734,7 @@ int mpage_writepage(struct page *page, get_block_t get_block,
 		.get_block = get_block,
 		.use_writepage = 0,
 	};
-	int ret = __mpage_writepage(page, wbc, &mpd);
+	int ret = __mpage_writepage(page, page->mapping, wbc, &mpd);
 	if (mpd.bio) {
 		int op_flags = (wbc->sync_mode == WB_SYNC_ALL ?
 			  REQ_SYNC : 0);
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 1f7723eff542..ffab026b9632 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -693,7 +693,9 @@ int nfs_writepage(struct address_space *mapping, struct page *page,
 	return ret;
 }
 
-static int nfs_writepages_callback(struct page *page, struct writeback_control *wbc, void *data)
+static int nfs_writepages_callback(struct page *page,
+				   struct address_space *mapping,
+				   struct writeback_control *wbc, void *data)
 {
 	int ret;
 
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 981a2a4e00e5..00922a82ede6 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -1060,6 +1060,7 @@ xfs_writepage_map(
 STATIC int
 xfs_do_writepage(
 	struct page		*page,
+	struct address_space	*mapping,
 	struct writeback_control *wbc,
 	void			*data)
 {
@@ -1179,7 +1180,7 @@ xfs_vm_writepage(
 	};
 	int			ret;
 
-	ret = xfs_do_writepage(page, wbc, &wpc);
+	ret = xfs_do_writepage(page, mapping, wbc, &wpc);
 	if (wpc.ioend)
 		ret = xfs_submit_ioend(wbc, wpc.ioend, ret);
 	return ret;
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index fdfd04e348f6..70361cc0ff54 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -358,8 +358,8 @@ void wb_update_bandwidth(struct bdi_writeback *wb, unsigned long start_time);
 void balance_dirty_pages_ratelimited(struct address_space *mapping);
 bool wb_over_bg_thresh(struct bdi_writeback *wb);
 
-typedef int (*writepage_t)(struct page *page, struct writeback_control *wbc,
-				void *data);
+typedef int (*writepage_t)(struct page *page, struct address_space *mapping,
+			   struct writeback_control *wbc, void *data);
 
 int generic_writepages(struct address_space *mapping,
 		       struct writeback_control *wbc);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 5bb8804967ca..67b857ee1a1c 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2236,7 +2236,7 @@ int write_cache_pages(struct address_space *mapping,
 				goto continue_unlock;
 
 			trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
-			ret = (*writepage)(page, wbc, data);
+			ret = (*writepage)(page, mapping, wbc, data);
 			if (unlikely(ret)) {
 				if (ret == AOP_WRITEPAGE_ACTIVATE) {
 					unlock_page(page);
@@ -2294,10 +2294,9 @@ EXPORT_SYMBOL(write_cache_pages);
  * Function used by generic_writepages to call the real writepage
  * function and set the mapping flags on error
  */
-static int __writepage(struct page *page, struct writeback_control *wbc,
-		       void *data)
+static int __writepage(struct page *page, struct address_space *mapping,
+		       struct writeback_control *wbc, void *data)
 {
-	struct address_space *mapping = data;
 	int ret = mapping->a_ops->writepage(mapping, page, wbc);
 	mapping_set_error(mapping, ret);
 	return ret;
@@ -2322,7 +2321,7 @@ int generic_writepages(struct address_space *mapping,
 		return 0;
 
 	blk_start_plug(&plug);
-	ret = write_cache_pages(mapping, wbc, __writepage, mapping);
+	ret = write_cache_pages(mapping, wbc, __writepage, NULL);
 	blk_finish_plug(&plug);
 	return ret;
 }
-- 
2.14.3


* [RFC PATCH 22/79] fs: add struct inode to block_read_full_page() arguments
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (6 preceding siblings ...)
  2018-04-04 19:17 ` [RFC PATCH 20/79] fs: add struct address_space to write_cache_pages() " jglisse
@ 2018-04-04 19:17 ` jglisse
  2018-04-04 19:17 ` [RFC PATCH 24/79] fs: add struct inode to nobh_writepage() arguments jglisse
                   ` (34 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:17 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Alexander Viro, Tejun Heo,
	Jan Kara, Josef Bacik, Mel Gorman, Jeff Layton

From: Jérôme Glisse <jglisse@redhat.com>

Add struct inode to the arguments of block_read_full_page(). Note this
patch only adds the argument and converts call sites conservatively
using page->mapping, so the end result is the same as before this patch.

One step toward dropping reliance on page->mapping.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Jeff Layton <jlayton@redhat.com>
---
 fs/adfs/inode.c             | 2 +-
 fs/affs/file.c              | 2 +-
 fs/befs/linuxvfs.c          | 3 ++-
 fs/bfs/file.c               | 2 +-
 fs/block_dev.c              | 2 +-
 fs/buffer.c                 | 4 ++--
 fs/efs/inode.c              | 2 +-
 fs/ext4/readpage.c          | 3 ++-
 fs/freevxfs/vxfs_subr.c     | 2 +-
 fs/hfs/inode.c              | 2 +-
 fs/hfsplus/inode.c          | 3 ++-
 fs/minix/inode.c            | 2 +-
 fs/mpage.c                  | 2 +-
 fs/ocfs2/aops.c             | 3 ++-
 fs/ocfs2/refcounttree.c     | 3 ++-
 fs/omfs/file.c              | 2 +-
 fs/qnx4/inode.c             | 2 +-
 fs/reiserfs/inode.c         | 3 ++-
 fs/sysv/itree.c             | 2 +-
 fs/ufs/inode.c              | 3 ++-
 include/linux/buffer_head.h | 2 +-
 21 files changed, 29 insertions(+), 22 deletions(-)

diff --git a/fs/adfs/inode.c b/fs/adfs/inode.c
index 1100d5da84d0..2270ab3d5392 100644
--- a/fs/adfs/inode.c
+++ b/fs/adfs/inode.c
@@ -45,7 +45,7 @@ static int adfs_writepage(struct address_space *mapping, struct page *page,
 static int adfs_readpage(struct file *file, struct address_space *mapping,
 			 struct page *page)
 {
-	return block_read_full_page(page, adfs_get_block);
+	return block_read_full_page(page->mapping->host, page, adfs_get_block);
 }
 
 static void adfs_write_failed(struct address_space *mapping, loff_t to)
diff --git a/fs/affs/file.c b/fs/affs/file.c
index 55ab72c1b228..136cb90f332f 100644
--- a/fs/affs/file.c
+++ b/fs/affs/file.c
@@ -379,7 +379,7 @@ static int affs_writepage(struct address_space *mapping, struct page *page,
 static int affs_readpage(struct file *file, struct address_space *mapping,
 			 struct page *page)
 {
-	return block_read_full_page(page, affs_get_block);
+	return block_read_full_page(page->mapping->host, page, affs_get_block);
 }
 
 static void affs_write_failed(struct address_space *mapping, loff_t to)
diff --git a/fs/befs/linuxvfs.c b/fs/befs/linuxvfs.c
index f6844b4ae77f..4436123674d3 100644
--- a/fs/befs/linuxvfs.c
+++ b/fs/befs/linuxvfs.c
@@ -112,7 +112,8 @@ static int
 befs_readpage(struct file *file, struct address_space *mapping,
 	      struct page *page)
 {
-	return block_read_full_page(page, befs_get_block);
+	return block_read_full_page(page->mapping->host, page,
+				    befs_get_block);
 }
 
 static sector_t
diff --git a/fs/bfs/file.c b/fs/bfs/file.c
index 1c4593429f7d..b1255ee4cd75 100644
--- a/fs/bfs/file.c
+++ b/fs/bfs/file.c
@@ -160,7 +160,7 @@ static int bfs_writepage(struct address_space *mapping, struct page *page,
 static int bfs_readpage(struct file *file, struct address_space *mapping,
 			struct page *page)
 {
-	return block_read_full_page(page, bfs_get_block);
+	return block_read_full_page(page->mapping->host, page, bfs_get_block);
 }
 
 static void bfs_write_failed(struct address_space *mapping, loff_t to)
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 2bf1b17aeff3..9ac6bf760272 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -571,7 +571,7 @@ static int blkdev_writepage(struct address_space *mapping, struct page *page,
 static int blkdev_readpage(struct file * file, struct address_space *mapping,
 			   struct page * page)
 {
-	return block_read_full_page(page, blkdev_get_block);
+	return block_read_full_page(page->mapping->host,page,blkdev_get_block);
 }
 
 static int blkdev_readpages(struct file *file, struct address_space *mapping,
diff --git a/fs/buffer.c b/fs/buffer.c
index 99818e876ad8..aa7d9be68581 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2231,9 +2231,9 @@ EXPORT_SYMBOL(block_is_partially_uptodate);
  * set/clear_buffer_uptodate() functions propagate buffer state into the
  * page struct once IO has completed.
  */
-int block_read_full_page(struct page *page, get_block_t *get_block)
+int block_read_full_page(struct inode *inode, struct page *page,
+			 get_block_t *get_block)
 {
-	struct inode *inode = page->mapping->host;
 	sector_t iblock, lblock;
 	struct buffer_head *bh, *head, *arr[MAX_BUF_PER_PAGE];
 	unsigned int blocksize, bbits;
diff --git a/fs/efs/inode.c b/fs/efs/inode.c
index 05aab4a5e8a1..a2f47227124e 100644
--- a/fs/efs/inode.c
+++ b/fs/efs/inode.c
@@ -16,7 +16,7 @@
 static int efs_readpage(struct file *file, struct address_space *mapping,
 			struct page *page)
 {
-	return block_read_full_page(page,efs_get_block);
+	return block_read_full_page(page->mapping->host, page,efs_get_block);
 }
 static sector_t _efs_bmap(struct address_space *mapping, sector_t block)
 {
diff --git a/fs/ext4/readpage.c b/fs/ext4/readpage.c
index 9ffa6fad18db..e43dc995f978 100644
--- a/fs/ext4/readpage.c
+++ b/fs/ext4/readpage.c
@@ -280,7 +280,8 @@ int ext4_mpage_readpages(struct address_space *mapping,
 			bio = NULL;
 		}
 		if (!PageUptodate(page))
-			block_read_full_page(page, ext4_get_block);
+			block_read_full_page(page->mapping->host, page,
+					     ext4_get_block);
 		else
 			unlock_page(page);
 	next_page:
diff --git a/fs/freevxfs/vxfs_subr.c b/fs/freevxfs/vxfs_subr.c
index 25f15ce143b5..91c5a39083c0 100644
--- a/fs/freevxfs/vxfs_subr.c
+++ b/fs/freevxfs/vxfs_subr.c
@@ -162,7 +162,7 @@ static int
 vxfs_readpage(struct file *file, struct address_space *mapping,
 	      struct page *page)
 {
-	return block_read_full_page(page, vxfs_getblk);
+	return block_read_full_page(page->mapping->host, page, vxfs_getblk);
 }
  
 /**
diff --git a/fs/hfs/inode.c b/fs/hfs/inode.c
index 17c96905191d..3851e95e9625 100644
--- a/fs/hfs/inode.c
+++ b/fs/hfs/inode.c
@@ -38,7 +38,7 @@ static int hfs_writepage(struct address_space *mapping, struct page *page,
 static int hfs_readpage(struct file *file, struct address_space *mapping,
 			struct page *page)
 {
-	return block_read_full_page(page, hfs_get_block);
+	return block_read_full_page(page->mapping->host, page, hfs_get_block);
 }
 
 static void hfs_write_failed(struct address_space *mapping, loff_t to)
diff --git a/fs/hfsplus/inode.c b/fs/hfsplus/inode.c
index d3a1ae620a14..a39d6114375a 100644
--- a/fs/hfsplus/inode.c
+++ b/fs/hfsplus/inode.c
@@ -26,7 +26,8 @@
 static int hfsplus_readpage(struct file *file, struct address_space *mapping,
 			    struct page *page)
 {
-	return block_read_full_page(page, hfsplus_get_block);
+	return block_read_full_page(page->mapping->host, page,
+				    hfsplus_get_block);
 }
 
 static int hfsplus_writepage(struct address_space *mapping, struct page *page,
diff --git a/fs/minix/inode.c b/fs/minix/inode.c
index 218697f38375..2a151fa6b013 100644
--- a/fs/minix/inode.c
+++ b/fs/minix/inode.c
@@ -391,7 +391,7 @@ static int minix_writepage(struct address_space *mapping, struct page *page,
 static int minix_readpage(struct file *file, struct address_space *mapping,
 			  struct page *page)
 {
-	return block_read_full_page(page,minix_get_block);
+	return block_read_full_page(page->mapping->host,page,minix_get_block);
 }
 
 int minix_prepare_chunk(struct page *page, loff_t pos, unsigned len)
diff --git a/fs/mpage.c b/fs/mpage.c
index d25f08f46090..c40ed2aa9bee 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -309,7 +309,7 @@ do_mpage_readpage(struct bio *bio, struct page *page, unsigned nr_pages,
 	if (bio)
 		bio = mpage_bio_submit(REQ_OP_READ, 0, bio);
 	if (!PageUptodate(page))
-	        block_read_full_page(page, get_block);
+	        block_read_full_page(page->mapping->host, page, get_block);
 	else
 		unlock_page(page);
 	goto out;
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index c1d3b33e8676..9942ee775e08 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -343,7 +343,8 @@ static int ocfs2_readpage(struct file *file, struct address_space *mapping,
 	if (oi->ip_dyn_features & OCFS2_INLINE_DATA_FL)
 		ret = ocfs2_readpage_inline(inode, page);
 	else
-		ret = block_read_full_page(page, ocfs2_get_block);
+		ret = block_read_full_page(page->mapping->host, page,
+					   ocfs2_get_block);
 	unlock = 0;
 
 out_alloc:
diff --git a/fs/ocfs2/refcounttree.c b/fs/ocfs2/refcounttree.c
index ab156e35ec00..163f639caf5e 100644
--- a/fs/ocfs2/refcounttree.c
+++ b/fs/ocfs2/refcounttree.c
@@ -2961,7 +2961,8 @@ int ocfs2_duplicate_clusters_by_page(handle_t *handle,
 			BUG_ON(PageDirty(page));
 
 		if (!PageUptodate(page)) {
-			ret = block_read_full_page(page, ocfs2_get_block);
+			ret = block_read_full_page(page->mapping->host, page,
+						   ocfs2_get_block);
 			if (ret) {
 				mlog_errno(ret);
 				goto unlock;
diff --git a/fs/omfs/file.c b/fs/omfs/file.c
index 71e9b27ee89d..ac27a4b2186a 100644
--- a/fs/omfs/file.c
+++ b/fs/omfs/file.c
@@ -287,7 +287,7 @@ static int omfs_get_block(struct inode *inode, sector_t block,
 static int omfs_readpage(struct file *file, struct address_space *mapping,
 			 struct page *page)
 {
-	return block_read_full_page(page, omfs_get_block);
+	return block_read_full_page(page->mapping->host, page, omfs_get_block);
 }
 
 static int omfs_readpages(struct file *file, struct address_space *mapping,
diff --git a/fs/qnx4/inode.c b/fs/qnx4/inode.c
index efc60096dd75..429f9295ec95 100644
--- a/fs/qnx4/inode.c
+++ b/fs/qnx4/inode.c
@@ -246,7 +246,7 @@ static void qnx4_kill_sb(struct super_block *sb)
 static int qnx4_readpage(struct file *file, struct address_space *mapping,
 			 struct page *page)
 {
-	return block_read_full_page(page,qnx4_get_block);
+	return block_read_full_page(page->mapping->host, page,qnx4_get_block);
 }
 
 static sector_t qnx4_bmap(struct address_space *mapping, sector_t block)
diff --git a/fs/reiserfs/inode.c b/fs/reiserfs/inode.c
index cc2dfbe8e31b..d4ab2d45f846 100644
--- a/fs/reiserfs/inode.c
+++ b/fs/reiserfs/inode.c
@@ -2734,7 +2734,8 @@ static int reiserfs_write_full_page(struct page *page,
 static int reiserfs_readpage(struct file *f, struct address_space *mapping,
 			     struct page *page)
 {
-	return block_read_full_page(page, reiserfs_get_block);
+	return block_read_full_page(page->mapping->host, page,
+				    reiserfs_get_block);
 }
 
 static int reiserfs_writepage(struct address_space *mapping, struct page *page,
diff --git a/fs/sysv/itree.c b/fs/sysv/itree.c
index d50dfd8a4465..7cec1e024dc3 100644
--- a/fs/sysv/itree.c
+++ b/fs/sysv/itree.c
@@ -460,7 +460,7 @@ static int sysv_writepage(struct address_space *mapping, struct page *page,
 static int sysv_readpage(struct file *file, struct address_space *mapping,
 			 struct page *page)
 {
-	return block_read_full_page(page,get_block);
+	return block_read_full_page(page->mapping->host, page,get_block);
 }
 
 int sysv_prepare_chunk(struct page *page, loff_t pos, unsigned len)
diff --git a/fs/ufs/inode.c b/fs/ufs/inode.c
index d04c6ed42be5..8589b934be09 100644
--- a/fs/ufs/inode.c
+++ b/fs/ufs/inode.c
@@ -477,7 +477,8 @@ static int ufs_writepage(struct address_space *mapping, struct page *page,
 static int ufs_readpage(struct file *file, struct address_space *mapping,
 			struct page *page)
 {
-	return block_read_full_page(page,ufs_getfrag_block);
+	return block_read_full_page(page->mapping->host, page,
+				    ufs_getfrag_block);
 }
 
 int ufs_prepare_chunk(struct page *page, loff_t pos, unsigned len)
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 052f7a8aa7cf..cab143668834 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -222,7 +222,7 @@ int block_write_full_page(struct inode *inode, struct page *page,
 int __block_write_full_page(struct inode *inode, struct page *page,
 			get_block_t *get_block, struct writeback_control *wbc,
 			bh_end_io_t *handler);
-int block_read_full_page(struct page*, get_block_t*);
+int block_read_full_page(struct inode *inode, struct page*, get_block_t*);
 int block_is_partially_uptodate(struct page *page,
 	struct address_space *mapping, unsigned long from,
 	unsigned long count);
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 24/79] fs: add struct inode to nobh_writepage() arguments
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (7 preceding siblings ...)
  2018-04-04 19:17 ` [RFC PATCH 22/79] fs: add struct inode to block_read_full_page() arguments jglisse
@ 2018-04-04 19:17 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 26/79] fs: add struct address_space to mpage_readpage() arguments jglisse
                   ` (33 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:17 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Alexander Viro, Tejun Heo,
	Jan Kara, Josef Bacik, Mel Gorman, Jeff Layton

From: Jérôme Glisse <jglisse@redhat.com>

Add struct inode to nobh_writepage(). Note this patch only adds an
argument and modifies call sites conservatively using page->mapping,
so the end result is the same as before this patch.

One step toward dropping reliance on page->mapping.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Jeff Layton <jlayton@redhat.com>
---
 fs/buffer.c                 | 5 ++---
 fs/ext2/inode.c             | 2 +-
 fs/gfs2/aops.c              | 3 ++-
 include/linux/buffer_head.h | 4 ++--
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index aa7d9be68581..31298f4f0300 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2730,10 +2730,9 @@ EXPORT_SYMBOL(nobh_write_end);
  * that it tries to operate without attaching bufferheads to
  * the page.
  */
-int nobh_writepage(struct page *page, get_block_t *get_block,
-			struct writeback_control *wbc)
+int nobh_writepage(struct inode *inode, struct page *page,
+		get_block_t *get_block, struct writeback_control *wbc)
 {
-	struct inode * const inode = page->mapping->host;
 	loff_t i_size = i_size_read(inode);
 	const pgoff_t end_index = i_size >> PAGE_SHIFT;
 	unsigned offset;
diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index 37439d1e544c..11b3c3e7ea65 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -926,7 +926,7 @@ static int ext2_nobh_writepage(struct address_space *mapping,
 			struct page *page,
 			struct writeback_control *wbc)
 {
-	return nobh_writepage(page, ext2_get_block, wbc);
+	return nobh_writepage(page->mapping->host, page, ext2_get_block, wbc);
 }
 
 static sector_t ext2_bmap(struct address_space *mapping, sector_t block)
diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index 8cfd4c7d884c..ff02313b86e6 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -142,7 +142,8 @@ static int gfs2_writepage(struct address_space *mapping, struct page *page,
 	if (ret <= 0)
 		return ret;
 
-	return nobh_writepage(page, gfs2_get_block_noalloc, wbc);
+	return nobh_writepage(page->mapping->host, page,
+			      gfs2_get_block_noalloc, wbc);
 }
 
 /* This is the same as calling block_write_full_page, but it also
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index cab143668834..fb68a3358330 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -265,8 +265,8 @@ int nobh_write_end(struct file *, struct address_space *,
 				loff_t, unsigned, unsigned,
 				struct page *, void *);
 int nobh_truncate_page(struct address_space *, loff_t, get_block_t *);
-int nobh_writepage(struct page *page, get_block_t *get_block,
-                        struct writeback_control *wbc);
+int nobh_writepage(struct inode *inode, struct page *page,
+		get_block_t *get_block, struct writeback_control *wbc);
 
 void buffer_init(void);
 
-- 
2.14.3


* [RFC PATCH 26/79] fs: add struct address_space to mpage_readpage() arguments
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (8 preceding siblings ...)
  2018-04-04 19:17 ` [RFC PATCH 24/79] fs: add struct inode to nobh_writepage() arguments jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 27/79] fs: add struct address_space to fscache_read*() callback arguments jglisse
                   ` (32 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Alexander Viro, Tejun Heo,
	Jan Kara, Josef Bacik, Mel Gorman, Jeff Layton

From: Jérôme Glisse <jglisse@redhat.com>

Add struct address_space to mpage_readpage(). Note this patch only adds
an argument and modifies call sites conservatively using page->mapping,
so the end result is the same as before this patch.

One step toward dropping reliance on page->mapping.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Jeff Layton <jlayton@redhat.com>
---
 fs/ext2/inode.c       |  2 +-
 fs/fat/inode.c        |  2 +-
 fs/gfs2/aops.c        |  2 +-
 fs/hpfs/file.c        |  2 +-
 fs/isofs/inode.c      |  2 +-
 fs/jfs/inode.c        |  2 +-
 fs/mpage.c            | 14 ++++++++------
 fs/nilfs2/inode.c     |  2 +-
 fs/qnx6/inode.c       |  2 +-
 fs/udf/inode.c        |  2 +-
 fs/xfs/xfs_aops.c     |  2 +-
 include/linux/mpage.h |  3 ++-
 12 files changed, 20 insertions(+), 17 deletions(-)

diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index 11b3c3e7ea65..33873c0a4c14 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -872,7 +872,7 @@ static int ext2_writepage(struct address_space *mapping, struct page *page,
 static int ext2_readpage(struct file *file, struct address_space *mapping,
 			 struct page *page)
 {
-	return mpage_readpage(page, ext2_get_block);
+	return mpage_readpage(page->mapping, page, ext2_get_block);
 }
 
 static int
diff --git a/fs/fat/inode.c b/fs/fat/inode.c
index 4b70dcbcd192..9e6bc6364468 100644
--- a/fs/fat/inode.c
+++ b/fs/fat/inode.c
@@ -197,7 +197,7 @@ static int fat_writepages(struct address_space *mapping,
 static int fat_readpage(struct file *file, struct address_space *mapping,
 			struct page *page)
 {
-	return mpage_readpage(page, fat_get_block);
+	return mpage_readpage(page->mapping, page, fat_get_block);
 }
 
 static int fat_readpages(struct file *file, struct address_space *mapping,
diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index ff02313b86e6..b42775bba6a1 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -524,7 +524,7 @@ static int __gfs2_readpage(void *file, struct address_space *mapping,
 		error = stuffed_readpage(ip, page);
 		unlock_page(page);
 	} else {
-		error = mpage_readpage(page, gfs2_block_map);
+		error = mpage_readpage(page->mapping, page, gfs2_block_map);
 	}
 
 	if (unlikely(test_bit(SDF_SHUTDOWN, &sdp->sd_flags)))
diff --git a/fs/hpfs/file.c b/fs/hpfs/file.c
index 3f2cc3fcee80..620dd9709a2c 100644
--- a/fs/hpfs/file.c
+++ b/fs/hpfs/file.c
@@ -118,7 +118,7 @@ static int hpfs_get_block(struct inode *inode, sector_t iblock, struct buffer_he
 static int hpfs_readpage(struct file *file, struct address_space *mapping,
 			 struct page *page)
 {
-	return mpage_readpage(page, hpfs_get_block);
+	return mpage_readpage(page->mapping, page, hpfs_get_block);
 }
 
 static int hpfs_writepage(struct address_space *mapping, struct page *page,
diff --git a/fs/isofs/inode.c b/fs/isofs/inode.c
index 541d89e0621a..7d73b1036321 100644
--- a/fs/isofs/inode.c
+++ b/fs/isofs/inode.c
@@ -1171,7 +1171,7 @@ struct buffer_head *isofs_bread(struct inode *inode, sector_t block)
 static int isofs_readpage(struct file *file, struct address_space *mapping,
 			  struct page *page)
 {
-	return mpage_readpage(page, isofs_get_block);
+	return mpage_readpage(page->mapping, page, isofs_get_block);
 }
 
 static int isofs_readpages(struct file *file, struct address_space *mapping,
diff --git a/fs/jfs/inode.c b/fs/jfs/inode.c
index be71214f4937..be6da161bc81 100644
--- a/fs/jfs/inode.c
+++ b/fs/jfs/inode.c
@@ -297,7 +297,7 @@ static int jfs_writepages(struct address_space *mapping,
 static int jfs_readpage(struct file *file, struct address_space *mapping,
 			struct page *page)
 {
-	return mpage_readpage(page, jfs_get_block);
+	return mpage_readpage(page->mapping, page, jfs_get_block);
 }
 
 static int jfs_readpages(struct file *file, struct address_space *mapping,
diff --git a/fs/mpage.c b/fs/mpage.c
index 8800bcde5f4e..52a6028e2066 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -143,12 +143,13 @@ map_buffer_to_page(struct inode *inode, struct page *page,
  * get_block() call.
  */
 static struct bio *
-do_mpage_readpage(struct bio *bio, struct page *page, unsigned nr_pages,
+do_mpage_readpage(struct bio *bio, struct address_space *mapping,
+		struct page *page, unsigned nr_pages,
 		sector_t *last_block_in_bio, struct buffer_head *map_bh,
 		unsigned long *first_logical_block, get_block_t get_block,
 		gfp_t gfp)
 {
-	struct inode *inode = page->mapping->host;
+	struct inode *inode = mapping->host;
 	const unsigned blkbits = inode->i_blkbits;
 	const unsigned blocks_per_page = PAGE_SIZE >> blkbits;
 	const unsigned blocksize = 1 << blkbits;
@@ -381,7 +382,7 @@ mpage_readpages(struct address_space *mapping, struct list_head *pages,
 		if (!add_to_page_cache_lru(page, mapping,
 					page->index,
 					gfp)) {
-			bio = do_mpage_readpage(bio, page,
+			bio = do_mpage_readpage(bio, mapping, page,
 					nr_pages - page_idx,
 					&last_block_in_bio, &map_bh,
 					&first_logical_block,
@@ -399,17 +400,18 @@ EXPORT_SYMBOL(mpage_readpages);
 /*
  * This isn't called much at all
  */
-int mpage_readpage(struct page *page, get_block_t get_block)
+int mpage_readpage(struct address_space *mapping, struct page *page,
+		   get_block_t get_block)
 {
 	struct bio *bio = NULL;
 	sector_t last_block_in_bio = 0;
 	struct buffer_head map_bh;
 	unsigned long first_logical_block = 0;
-	gfp_t gfp = mapping_gfp_constraint(page->mapping, GFP_KERNEL);
+	gfp_t gfp = mapping_gfp_constraint(mapping, GFP_KERNEL);
 
 	map_bh.b_state = 0;
 	map_bh.b_size = 0;
-	bio = do_mpage_readpage(bio, page, 1, &last_block_in_bio,
+	bio = do_mpage_readpage(bio, mapping, page, 1, &last_block_in_bio,
 			&map_bh, &first_logical_block, get_block, gfp);
 	if (bio)
 		mpage_bio_submit(REQ_OP_READ, 0, bio);
diff --git a/fs/nilfs2/inode.c b/fs/nilfs2/inode.c
index 86d12a7822a9..7cc0268d68ce 100644
--- a/fs/nilfs2/inode.c
+++ b/fs/nilfs2/inode.c
@@ -152,7 +152,7 @@ int nilfs_get_block(struct inode *inode, sector_t blkoff,
 static int nilfs_readpage(struct file *file, struct address_space *mapping,
 			  struct page *page)
 {
-	return mpage_readpage(page, nilfs_get_block);
+	return mpage_readpage(page->mapping, page, nilfs_get_block);
 }
 
 /**
diff --git a/fs/qnx6/inode.c b/fs/qnx6/inode.c
index 98cb4671405e..c7f3623fd5f4 100644
--- a/fs/qnx6/inode.c
+++ b/fs/qnx6/inode.c
@@ -96,7 +96,7 @@ static int qnx6_check_blockptr(__fs32 ptr)
 static int qnx6_readpage(struct file *file, struct address_space *mapping,
 			 struct page *page)
 {
-	return mpage_readpage(page, qnx6_get_block);
+	return mpage_readpage(page->mapping, page, qnx6_get_block);
 }
 
 static int qnx6_readpages(struct file *file, struct address_space *mapping,
diff --git a/fs/udf/inode.c b/fs/udf/inode.c
index 63264d72999a..56cf8e70d298 100644
--- a/fs/udf/inode.c
+++ b/fs/udf/inode.c
@@ -188,7 +188,7 @@ static int udf_writepages(struct address_space *mapping,
 static int udf_readpage(struct file *file, struct address_space *mapping,
 			struct page *page)
 {
-	return mpage_readpage(page, udf_get_block);
+	return mpage_readpage(page->mapping, page, udf_get_block);
 }
 
 static int udf_readpages(struct file *file, struct address_space *mapping,
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 00922a82ede6..bed27e59720a 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -1415,7 +1415,7 @@ xfs_vm_readpage(
 	struct page		*page)
 {
 	trace_xfs_vm_readpage(page->mapping->host, 1);
-	return mpage_readpage(page, xfs_get_blocks);
+	return mpage_readpage(page->mapping, page, xfs_get_blocks);
 }
 
 STATIC int
diff --git a/include/linux/mpage.h b/include/linux/mpage.h
index e7f489fc090c..1708caae2640 100644
--- a/include/linux/mpage.h
+++ b/include/linux/mpage.h
@@ -16,7 +16,8 @@ struct writeback_control;
 
 int mpage_readpages(struct address_space *mapping, struct list_head *pages,
 				unsigned nr_pages, get_block_t get_block);
-int mpage_readpage(struct page *page, get_block_t get_block);
+int mpage_readpage(struct address_space *mapping, struct page *page,
+		   get_block_t get_block);
 int mpage_writepages(struct address_space *mapping,
 		struct writeback_control *wbc, get_block_t get_block);
 int mpage_writepage(struct address_space *mapping, struct page *page,
-- 
2.14.3


* [RFC PATCH 27/79] fs: add struct address_space to fscache_read*() callback arguments
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (9 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 26/79] fs: add struct address_space to mpage_readpage() arguments jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 28/79] fs: introduce page_is_truncated() helper jglisse
                   ` (31 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, David Howells,
	linux-cachefs, Alexander Viro, Tejun Heo, Jan Kara, Josef Bacik,
	Mel Gorman, Jeff Layton

From: Jérôme Glisse <jglisse@redhat.com>

Add struct address_space to the fscache_read*() callback arguments. Note
this patch only adds arguments and modifies call sites conservatively
using page->mapping, so the end result is the same as before this patch.

One step toward dropping reliance on page->mapping.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: linux-cachefs@redhat.com
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Jeff Layton <jlayton@redhat.com>
---
 fs/9p/cache.c                 |  4 +++-
 fs/afs/file.c                 |  4 +++-
 fs/ceph/cache.c               | 10 ++++++----
 fs/cifs/fscache.c             |  6 ++++--
 fs/fscache/page.c             |  1 +
 fs/nfs/fscache.c              |  4 +++-
 include/linux/fscache-cache.h |  2 +-
 include/linux/fscache.h       |  9 ++++++---
 8 files changed, 27 insertions(+), 13 deletions(-)

diff --git a/fs/9p/cache.c b/fs/9p/cache.c
index 8185bfe4492f..3f122d35c54d 100644
--- a/fs/9p/cache.c
+++ b/fs/9p/cache.c
@@ -273,7 +273,8 @@ void __v9fs_fscache_invalidate_page(struct address_space *mapping,
 	}
 }
 
-static void v9fs_vfs_readpage_complete(struct page *page, void *data,
+static void v9fs_vfs_readpage_complete(struct address_space *mapping,
+				       struct page *page, void *data,
 				       int error)
 {
 	if (!error)
@@ -299,6 +300,7 @@ int __v9fs_readpage_from_fscache(struct inode *inode, struct page *page)
 		return -ENOBUFS;
 
 	ret = fscache_read_or_alloc_page(v9inode->fscache,
+					 page->mapping,
 					 page,
 					 v9fs_vfs_readpage_complete,
 					 NULL,
diff --git a/fs/afs/file.c b/fs/afs/file.c
index f87e997b9df9..23ff51343dd3 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -203,7 +203,8 @@ void afs_put_read(struct afs_read *req)
 /*
  * deal with notification that a page was read from the cache
  */
-static void afs_file_readpage_read_complete(struct page *page,
+static void afs_file_readpage_read_complete(struct address_space *mapping,
+					    struct page *page,
 					    void *data,
 					    int error)
 {
@@ -271,6 +272,7 @@ int afs_page_filler(void *data, struct address_space *mapping,
 	/* is it cached? */
 #ifdef CONFIG_AFS_FSCACHE
 	ret = fscache_read_or_alloc_page(vnode->cache,
+					 page->mapping,
 					 page,
 					 afs_file_readpage_read_complete,
 					 NULL,
diff --git a/fs/ceph/cache.c b/fs/ceph/cache.c
index a3ab265d3215..14438f1ed7e0 100644
--- a/fs/ceph/cache.c
+++ b/fs/ceph/cache.c
@@ -266,7 +266,9 @@ void ceph_fscache_file_set_cookie(struct inode *inode, struct file *filp)
 	}
 }
 
-static void ceph_readpage_from_fscache_complete(struct page *page, void *data, int error)
+static void ceph_readpage_from_fscache_complete(struct address_space *mapping,
+						struct page *page, void *data,
+						int error)
 {
 	if (!error)
 		SetPageUptodate(page);
@@ -293,9 +295,9 @@ int ceph_readpage_from_fscache(struct inode *inode, struct page *page)
 	if (!cache_valid(ci))
 		return -ENOBUFS;
 
-	ret = fscache_read_or_alloc_page(ci->fscache, page,
-					 ceph_readpage_from_fscache_complete, NULL,
-					 GFP_KERNEL);
+	ret = fscache_read_or_alloc_page(ci->fscache, page->mapping, page,
+					 ceph_readpage_from_fscache_complete,
+					 NULL, GFP_KERNEL);
 
 	switch (ret) {
 		case 0: /* Page found */
diff --git a/fs/cifs/fscache.c b/fs/cifs/fscache.c
index 8d4b7bc8ae91..25f259a83fe0 100644
--- a/fs/cifs/fscache.c
+++ b/fs/cifs/fscache.c
@@ -140,7 +140,8 @@ int cifs_fscache_release_page(struct page *page, gfp_t gfp)
 	return 1;
 }
 
-static void cifs_readpage_from_fscache_complete(struct page *page, void *ctx,
+static void cifs_readpage_from_fscache_complete(struct address_space *mapping,
+						struct page *page, void *ctx,
 						int error)
 {
 	cifs_dbg(FYI, "%s: (0x%p/%d)\n", __func__, page, error);
@@ -158,7 +159,8 @@ int __cifs_readpage_from_fscache(struct inode *inode, struct page *page)
 
 	cifs_dbg(FYI, "%s: (fsc:%p, p:%p, i:0x%p\n",
 		 __func__, CIFS_I(inode)->fscache, page, inode);
-	ret = fscache_read_or_alloc_page(CIFS_I(inode)->fscache, page,
+	ret = fscache_read_or_alloc_page(CIFS_I(inode)->fscache,
+					 page->mapping, page,
 					 cifs_readpage_from_fscache_complete,
 					 NULL,
 					 GFP_KERNEL);
diff --git a/fs/fscache/page.c b/fs/fscache/page.c
index 7112b42ad8c5..0c3d322a7b52 100644
--- a/fs/fscache/page.c
+++ b/fs/fscache/page.c
@@ -408,6 +408,7 @@ int fscache_wait_for_operation_activation(struct fscache_object *object,
  *   0		- dispatched a read - it'll call end_io_func() when finished
  */
 int __fscache_read_or_alloc_page(struct fscache_cookie *cookie,
+				 struct address_space *mapping,
 				 struct page *page,
 				 fscache_rw_complete_t end_io_func,
 				 void *context,
diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
index d63bea8bbfbb..e1cf607f8959 100644
--- a/fs/nfs/fscache.c
+++ b/fs/nfs/fscache.c
@@ -301,7 +301,8 @@ void __nfs_fscache_invalidate_page(struct page *page, struct inode *inode)
  * Handle completion of a page being read from the cache.
  * - Called in process (keventd) context.
  */
-static void nfs_readpage_from_fscache_complete(struct page *page,
+static void nfs_readpage_from_fscache_complete(struct address_space *mapping,
+					       struct page *page,
 					       void *context,
 					       int error)
 {
@@ -334,6 +335,7 @@ int __nfs_readpage_from_fscache(struct nfs_open_context *ctx,
 		 nfs_i_fscache(inode), page, page->index, page->flags, inode);
 
 	ret = fscache_read_or_alloc_page(nfs_i_fscache(inode),
+					 page->mapping,
 					 page,
 					 nfs_readpage_from_fscache_complete,
 					 ctx,
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 4c467ef50159..7ae49d0306d5 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -468,7 +468,7 @@ void fscache_set_store_limit(struct fscache_object *object, loff_t i_size)
 static inline void fscache_end_io(struct fscache_retrieval *op,
 				  struct page *page, int error)
 {
-	op->end_io_func(page, op->context, error);
+	op->end_io_func(op->mapping, page, op->context, error);
 }
 
 static inline void __fscache_use_cookie(struct fscache_cookie *cookie)
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index 13db0098d3a9..f62df8c68e7a 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -50,7 +50,8 @@ struct fscache_cache_tag;
 struct fscache_cookie;
 struct fscache_netfs;
 
-typedef void (*fscache_rw_complete_t)(struct page *page,
+typedef void (*fscache_rw_complete_t)(struct address_space *mapping,
+				      struct page *page,
 				      void *context,
 				      int error);
 
@@ -216,6 +217,7 @@ extern int __fscache_attr_changed(struct fscache_cookie *);
 extern void __fscache_invalidate(struct fscache_cookie *);
 extern void __fscache_wait_on_invalidate(struct fscache_cookie *);
 extern int __fscache_read_or_alloc_page(struct fscache_cookie *,
+					struct address_space *mapping,
 					struct page *,
 					fscache_rw_complete_t,
 					void *,
@@ -530,14 +532,15 @@ int fscache_reserve_space(struct fscache_cookie *cookie, loff_t size)
  */
 static inline
 int fscache_read_or_alloc_page(struct fscache_cookie *cookie,
+			       struct address_space *mapping,
 			       struct page *page,
 			       fscache_rw_complete_t end_io_func,
 			       void *context,
 			       gfp_t gfp)
 {
 	if (fscache_cookie_valid(cookie) && fscache_cookie_enabled(cookie))
-		return __fscache_read_or_alloc_page(cookie, page, end_io_func,
-						    context, gfp);
+		return __fscache_read_or_alloc_page(cookie, mapping, page,
+						    end_io_func, context, gfp);
 	else
 		return -ENOBUFS;
 }
-- 
2.14.3


* [RFC PATCH 28/79] fs: introduce page_is_truncated() helper
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (10 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 27/79] fs: add struct address_space to fscache_read*() callback arguments jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 29/79] fs/block: add struct address_space to bdev_write_page() arguments jglisse
                   ` (30 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Alexander Viro, Tejun Heo,
	Jan Kara, Josef Bacik, Mel Gorman, Jeff Layton

From: Jérôme Glisse <jglisse@redhat.com>

Introduce a simple helper to unify all truncation tests under one logic.
This also unifies checks that were slightly different in various places.

The conversion was done using the following coccinelle spatch on the fs
and mm directories:
---------------------------------------------------------------------
@@
struct page * ppage;
@@
-!ppage->mapping
+page_is_truncated(ppage, mapping)

@@
struct page * ppage;
@@
-ppage->mapping != mapping
+page_is_truncated(ppage, mapping)

@@
struct page * ppage;
@@
-ppage->mapping != inode->i_mapping
+page_is_truncated(ppage, inode->i_mapping)
---------------------------------------------------------------------

Followed by:
git checkout mm/migrate.c mm/huge_memory.c mm/memory-failure.c
git checkout mm/memcontrol.c fs/ext4/page-io.c fs/reiserfs/journal.c

Hand editing:
    mm/memory.c do_page_mkwrite()
    fs/splice.c splice_to_pipe()
    fs/nfs/dir.c cache_page_release()
    fs/xfs/xfs_aops.c xfs_check_page_type()
    fs/xfs/xfs_aops.c xfs_vm_set_page_dirty()
    fs/buffer.c mark_buffer_write_io_error()
    fs/buffer.c page_cache_seek_hole_data()

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Jeff Layton <jlayton@redhat.com>

---
 drivers/staging/lustre/lustre/llite/llite_mmap.c |  7 +++++--
 fs/9p/vfs_file.c                                 |  2 +-
 fs/afs/write.c                                   |  2 +-
 fs/btrfs/extent_io.c                             |  4 ++--
 fs/btrfs/file.c                                  |  2 +-
 fs/btrfs/inode.c                                 |  7 ++++---
 fs/btrfs/ioctl.c                                 |  6 +++---
 fs/btrfs/scrub.c                                 |  2 +-
 fs/buffer.c                                      |  8 ++++----
 fs/ceph/addr.c                                   |  6 +++---
 fs/cifs/file.c                                   |  2 +-
 fs/ext4/inode.c                                  | 10 +++++-----
 fs/ext4/mballoc.c                                |  8 ++++----
 fs/f2fs/checkpoint.c                             |  4 ++--
 fs/f2fs/data.c                                   |  8 ++++----
 fs/f2fs/file.c                                   |  2 +-
 fs/f2fs/super.c                                  |  2 +-
 fs/fuse/file.c                                   |  2 +-
 fs/gfs2/aops.c                                   |  2 +-
 fs/gfs2/file.c                                   |  4 ++--
 fs/iomap.c                                       |  2 +-
 fs/nfs/dir.c                                     |  2 +-
 fs/nilfs2/file.c                                 |  2 +-
 fs/ocfs2/aops.c                                  |  2 +-
 fs/ocfs2/mmap.c                                  |  2 +-
 fs/splice.c                                      |  2 +-
 fs/ubifs/file.c                                  |  2 +-
 fs/xfs/xfs_aops.c                                |  8 +++++---
 include/linux/pagemap.h                          | 16 ++++++++++++++++
 mm/filemap.c                                     | 12 ++++++------
 mm/memory.c                                      |  5 ++++-
 mm/page-writeback.c                              |  2 +-
 mm/truncate.c                                    | 12 ++++++------
 33 files changed, 92 insertions(+), 67 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/llite_mmap.c b/drivers/staging/lustre/lustre/llite/llite_mmap.c
index c0533bd6f352..6a9d310a7bfd 100644
--- a/drivers/staging/lustre/lustre/llite/llite_mmap.c
+++ b/drivers/staging/lustre/lustre/llite/llite_mmap.c
@@ -191,7 +191,7 @@ static int ll_page_mkwrite0(struct vm_area_struct *vma, struct page *vmpage,
 		struct ll_inode_info *lli = ll_i2info(inode);
 
 		lock_page(vmpage);
-		if (!vmpage->mapping) {
+		if (page_is_truncated(vmpage, inode->i_mapping)) {
 			unlock_page(vmpage);
 
 			/* page was truncated and lock was cancelled, return
@@ -341,10 +341,13 @@ static int ll_fault(struct vm_fault *vmf)
 	LASSERT(!(result & VM_FAULT_LOCKED));
 	if (result == 0) {
 		struct page *vmpage = vmf->page;
+		struct address_space *mapping;
+
+		mapping = vmf->vma->vm_file ? vmf->vma->vm_file->f_mapping : NULL;
 
 		/* check if this page has been truncated */
 		lock_page(vmpage);
-		if (unlikely(!vmpage->mapping)) { /* unlucky */
+		if (unlikely(page_is_truncated(vmpage, mapping))) { /* unlucky */
 			unlock_page(vmpage);
 			put_page(vmpage);
 			vmf->page = NULL;
diff --git a/fs/9p/vfs_file.c b/fs/9p/vfs_file.c
index 03c9e325bfbc..bf71ea1d7ff6 100644
--- a/fs/9p/vfs_file.c
+++ b/fs/9p/vfs_file.c
@@ -553,7 +553,7 @@ v9fs_vm_page_mkwrite(struct vm_fault *vmf)
 	v9fs_fscache_wait_on_page_write(inode, page);
 	BUG_ON(!v9inode->writeback_fid);
 	lock_page(page);
-	if (page->mapping != inode->i_mapping)
+	if (page_is_truncated(page, inode->i_mapping))
 		goto out_unlock;
 	wait_for_stable_page(page);
 
diff --git a/fs/afs/write.c b/fs/afs/write.c
index b0757ca87bfc..9c5bdad0bd72 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -583,7 +583,7 @@ static int afs_writepages_region(struct address_space *mapping,
 			return ret;
 		}
 
-		if (page->mapping != mapping || !PageDirty(page)) {
+		if (page_is_truncated(page, mapping) || !PageDirty(page)) {
 			unlock_page(page);
 			put_page(page);
 			continue;
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 2232a2c224e3..3c145b353873 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1718,7 +1718,7 @@ static int __process_pages_contig(struct address_space *mapping,
 			if (page_ops & PAGE_LOCK) {
 				lock_page(pages[i]);
 				if (!PageDirty(pages[i]) ||
-				    pages[i]->mapping != mapping) {
+				    page_is_truncated(pages[i], mapping)) {
 					unlock_page(pages[i]);
 					put_page(pages[i]);
 					err = -EAGAIN;
@@ -3970,7 +3970,7 @@ static int extent_write_cache_pages(struct address_space *mapping,
 				lock_page(page);
 			}
 
-			if (unlikely(page->mapping != mapping)) {
+			if (unlikely(page_is_truncated(page, mapping))) {
 				unlock_page(page);
 				continue;
 			}
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 8430df155af6..989735cd751c 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1406,7 +1406,7 @@ static int prepare_uptodate_page(struct inode *inode,
 			unlock_page(page);
 			return -EIO;
 		}
-		if (page->mapping != inode->i_mapping) {
+		if (page_is_truncated(page, inode->i_mapping)) {
 			unlock_page(page);
 			return -EAGAIN;
 		}
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 7d6b22a8791b..968640312537 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2087,7 +2087,8 @@ static void btrfs_writepage_fixup_worker(struct btrfs_work *work)
 	page = fixup->page;
 again:
 	lock_page(page);
-	if (!page->mapping || !PageDirty(page) || !PageChecked(page)) {
+	if (page_is_truncated(page, page->mapping) ||
+	    !PageDirty(page) || !PageChecked(page)) {
 		ClearPageChecked(page);
 		goto out_page;
 	}
@@ -4815,7 +4816,7 @@ int btrfs_truncate_block(struct inode *inode, loff_t from, loff_t len,
 	if (!PageUptodate(page)) {
 		ret = btrfs_readpage(NULL, mapping, page);
 		lock_page(page);
-		if (page->mapping != mapping) {
+		if (page_is_truncated(page, mapping)) {
 			unlock_page(page);
 			put_page(page);
 			goto again;
@@ -9019,7 +9020,7 @@ int btrfs_page_mkwrite(struct vm_fault *vmf)
 	lock_page(page);
 	size = i_size_read(inode);
 
-	if ((page->mapping != inode->i_mapping) ||
+	if ((page_is_truncated(page, inode->i_mapping)) ||
 	    (page_start >= size)) {
 		/* page got truncated out from underneath us */
 		goto out_unlock;
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index b1186c15f293..c57e9ce8204d 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1141,7 +1141,7 @@ static int cluster_pages_for_defrag(struct inode *inode,
 			 * we unlocked the page above, so we need check if
 			 * it was released or not.
 			 */
-			if (page->mapping != inode->i_mapping) {
+			if (page_is_truncated(page, inode->i_mapping)) {
 				unlock_page(page);
 				put_page(page);
 				goto again;
@@ -1159,7 +1159,7 @@ static int cluster_pages_for_defrag(struct inode *inode,
 			}
 		}
 
-		if (page->mapping != inode->i_mapping) {
+		if (page_is_truncated(page, inode->i_mapping)) {
 			unlock_page(page);
 			put_page(page);
 			goto again;
@@ -2834,7 +2834,7 @@ static struct page *extent_same_get_page(struct inode *inode, pgoff_t index)
 			put_page(page);
 			return ERR_PTR(-EIO);
 		}
-		if (page->mapping != inode->i_mapping) {
+		if (page_is_truncated(page, inode->i_mapping)) {
 			unlock_page(page);
 			put_page(page);
 			return ERR_PTR(-EAGAIN);
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index ec56f33feea9..e621b79a90b3 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -4574,7 +4574,7 @@ static int copy_nocow_pages_for_inode(u64 inum, u64 offset, u64 root,
 			 * old one, the new data may be written into the new
 			 * page in the page cache.
 			 */
-			if (page->mapping != inode->i_mapping) {
+			if (page_is_truncated(page, inode->i_mapping)) {
 				unlock_page(page);
 				put_page(page);
 				goto again;
diff --git a/fs/buffer.c b/fs/buffer.c
index 6790bb4ebc07..8b2eb3dfb539 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -602,7 +602,7 @@ static void __set_page_dirty(struct page *page, struct address_space *_mapping,
 	struct address_space *mapping = page_mapping(page);
 
 	spin_lock_irqsave(&mapping->tree_lock, flags);
-	if (page->mapping) {	/* Race with truncate? */
+	if (!page_is_truncated(page, mapping)) {	/* Race with truncate? */
 		WARN_ON_ONCE(warn && !PageUptodate(page));
 		account_page_dirtied(page, mapping);
 		radix_tree_tag_set(&mapping->page_tree,
@@ -1138,7 +1138,7 @@ void mark_buffer_write_io_error(struct buffer_head *bh)
 {
 	set_buffer_write_io_error(bh);
 	/* FIXME: do we need to set this in both places? */
-	if (bh->b_page && bh->b_page->mapping)
+	if (bh->b_page && !page_is_truncated(bh->b_page, bh->b_page->mapping))
 		mapping_set_error(bh->b_page->mapping, -EIO);
 	if (bh->b_assoc_map)
 		mapping_set_error(bh->b_assoc_map, -EIO);
@@ -2482,7 +2482,7 @@ int block_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf,
 
 	lock_page(page);
 	size = i_size_read(inode);
-	if ((page->mapping != inode->i_mapping) ||
+	if ((page_is_truncated(page, inode->i_mapping)) ||
 	    (page_offset(page) > size)) {
 		/* We overload EFAULT to mean page got truncated */
 		ret = -EFAULT;
@@ -3538,7 +3538,7 @@ page_cache_seek_hole_data(struct inode *inode, loff_t offset, loff_t length,
 				goto check_range;
 
 			lock_page(page);
-			if (likely(page->mapping == inode->i_mapping) &&
+			if (likely(!page_is_truncated(page, inode->i_mapping)) &&
 			    page_has_buffers(page)) {
 				lastoff = page_seek_hole_data(page, lastoff, whence);
 				if (lastoff >= 0) {
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 1ebb146dc890..c274d8a32479 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -126,7 +126,7 @@ static int ceph_set_page_dirty(struct address_space *_mapping,
 
 	ret = __set_page_dirty_nobuffers(page);
 	WARN_ON(!PageLocked(page));
-	WARN_ON(!page->mapping);
+	WARN_ON(page_is_truncated(page, mapping));
 
 	return ret;
 }
@@ -899,7 +899,7 @@ static int ceph_writepages_start(struct address_space *mapping,
 
 			/* only dirty pages, or our accounting breaks */
 			if (unlikely(!PageDirty(page)) ||
-			    unlikely(page->mapping != mapping)) {
+			    unlikely(page_is_truncated(page, mapping))) {
 				dout("!dirty or !mapping %p\n", page);
 				unlock_page(page);
 				continue;
@@ -1586,7 +1586,7 @@ static int ceph_page_mkwrite(struct vm_fault *vmf)
 	do {
 		lock_page(page);
 
-		if ((off > size) || (page->mapping != inode->i_mapping)) {
+		if ((off > size) || (page_is_truncated(page, inode->i_mapping))) {
 			unlock_page(page);
 			ret = VM_FAULT_NOPAGE;
 			break;
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 01f2c3852eea..017fe16ae993 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -1999,7 +1999,7 @@ wdata_prepare_pages(struct cifs_writedata *wdata, unsigned int found_pages,
 		else if (!trylock_page(page))
 			break;
 
-		if (unlikely(page->mapping != mapping)) {
+		if (unlikely(page_is_truncated(page, mapping))) {
 			unlock_page(page);
 			break;
 		}
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 63bf0160c579..394fed206138 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1295,7 +1295,7 @@ static int ext4_write_begin(struct file *file, struct address_space *mapping,
 	}
 
 	lock_page(page);
-	if (page->mapping != mapping) {
+	if (page_is_truncated(page, mapping)) {
 		/* The page got truncated from under us */
 		unlock_page(page);
 		put_page(page);
@@ -2023,7 +2023,7 @@ static int __ext4_journalled_writepage(struct page *page,
 
 	lock_page(page);
 	put_page(page);
-	if (page->mapping != mapping) {
+	if (page_is_truncated(page, mapping)) {
 		/* The page got truncated from under us */
 		ext4_journal_stop(handle);
 		ret = 0;
@@ -2667,7 +2667,7 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
 			if (!PageDirty(page) ||
 			    (PageWriteback(page) &&
 			     (mpd->wbc->sync_mode == WB_SYNC_NONE)) ||
-			    unlikely(page->mapping != mapping)) {
+			    unlikely(page_is_truncated(page, mapping))) {
 				unlock_page(page);
 				continue;
 			}
@@ -3066,7 +3066,7 @@ static int ext4_da_write_begin(struct file *file, struct address_space *mapping,
 	}
 
 	lock_page(page);
-	if (page->mapping != mapping) {
+	if (page_is_truncated(page, mapping)) {
 		/* The page got truncated from under us */
 		unlock_page(page);
 		put_page(page);
@@ -6123,7 +6123,7 @@ int ext4_page_mkwrite(struct vm_fault *vmf)
 	lock_page(page);
 	size = i_size_read(inode);
 	/* Page got truncated from under us? */
-	if (page->mapping != mapping || page_offset(page) > size) {
+	if (page_is_truncated(page, mapping) || page_offset(page) > size) {
 		unlock_page(page);
 		ret = VM_FAULT_NOPAGE;
 		goto out;
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 769a62708b1c..38deb97705c4 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -991,7 +991,7 @@ static int ext4_mb_get_buddy_page_lock(struct super_block *sb,
 	page = find_or_create_page(inode->i_mapping, pnum, gfp);
 	if (!page)
 		return -ENOMEM;
-	BUG_ON(page->mapping != inode->i_mapping);
+	BUG_ON(page_is_truncated(page, inode->i_mapping));
 	e4b->bd_bitmap_page = page;
 	e4b->bd_bitmap = page_address(page) + (poff * sb->s_blocksize);
 
@@ -1005,7 +1005,7 @@ static int ext4_mb_get_buddy_page_lock(struct super_block *sb,
 	page = find_or_create_page(inode->i_mapping, pnum, gfp);
 	if (!page)
 		return -ENOMEM;
-	BUG_ON(page->mapping != inode->i_mapping);
+	BUG_ON(page_is_truncated(page, inode->i_mapping));
 	e4b->bd_buddy_page = page;
 	return 0;
 }
@@ -1156,7 +1156,7 @@ ext4_mb_load_buddy_gfp(struct super_block *sb, ext4_group_t group,
 			put_page(page);
 		page = find_or_create_page(inode->i_mapping, pnum, gfp);
 		if (page) {
-			BUG_ON(page->mapping != inode->i_mapping);
+			BUG_ON(page_is_truncated(page, inode->i_mapping));
 			if (!PageUptodate(page)) {
 				ret = ext4_mb_init_cache(page, NULL, gfp);
 				if (ret) {
@@ -1192,7 +1192,7 @@ ext4_mb_load_buddy_gfp(struct super_block *sb, ext4_group_t group,
 			put_page(page);
 		page = find_or_create_page(inode->i_mapping, pnum, gfp);
 		if (page) {
-			BUG_ON(page->mapping != inode->i_mapping);
+			BUG_ON(page_is_truncated(page, inode->i_mapping));
 			if (!PageUptodate(page)) {
 				ret = ext4_mb_init_cache(page, e4b->bd_bitmap,
 							 gfp);
diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index c694c504d673..b218fcacd395 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -89,7 +89,7 @@ static struct page *__get_meta_page(struct f2fs_sb_info *sbi, pgoff_t index,
 	}
 
 	lock_page(page);
-	if (unlikely(page->mapping != mapping)) {
+	if (unlikely(page_is_truncated(page, mapping))) {
 		f2fs_put_page(page, 1);
 		goto repeat;
 	}
@@ -337,7 +337,7 @@ long sync_meta_pages(struct f2fs_sb_info *sbi, enum page_type type,
 
 			lock_page(page);
 
-			if (unlikely(page->mapping != mapping)) {
+			if (unlikely(page_is_truncated(page, mapping))) {
 continue_unlock:
 				unlock_page(page);
 				continue;
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index d15ccbd69bf2..c1a8dd623444 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -725,7 +725,7 @@ struct page *get_lock_data_page(struct inode *inode, pgoff_t index,
 
 	/* wait for read completion */
 	lock_page(page);
-	if (unlikely(page->mapping != mapping)) {
+	if (unlikely(page_is_truncated(page, mapping))) {
 		f2fs_put_page(page, 1);
 		goto repeat;
 	}
@@ -1899,7 +1899,7 @@ static int f2fs_write_cache_pages(struct address_space *mapping,
 retry_write:
 			lock_page(page);
 
-			if (unlikely(page->mapping != mapping)) {
+			if (unlikely(page_is_truncated(page, mapping))) {
 continue_unlock:
 				unlock_page(page);
 				continue;
@@ -2189,7 +2189,7 @@ static int f2fs_write_begin(struct file *file, struct address_space *mapping,
 		unlock_page(page);
 		f2fs_balance_fs(sbi, true);
 		lock_page(page);
-		if (page->mapping != mapping) {
+		if (page_is_truncated(page, mapping)) {
 			/* The page got truncated from under us */
 			f2fs_put_page(page, 1);
 			goto repeat;
@@ -2219,7 +2219,7 @@ static int f2fs_write_begin(struct file *file, struct address_space *mapping,
 			goto fail;
 
 		lock_page(page);
-		if (unlikely(page->mapping != mapping)) {
+		if (unlikely(page_is_truncated(page, mapping))) {
 			f2fs_put_page(page, 1);
 			goto repeat;
 		}
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 672a542e5464..5e9ac31240bb 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -78,7 +78,7 @@ static int f2fs_vm_page_mkwrite(struct vm_fault *vmf)
 	file_update_time(vmf->vma->vm_file);
 	down_read(&F2FS_I(inode)->i_mmap_sem);
 	lock_page(page);
-	if (unlikely(page->mapping != inode->i_mapping ||
+	if (unlikely(page_is_truncated(page, inode->i_mapping) ||
 			page_offset(page) > i_size_read(inode) ||
 			!PageUptodate(page))) {
 		unlock_page(page);
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 8173ae688814..af855b563de0 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1467,7 +1467,7 @@ static ssize_t f2fs_quota_read(struct super_block *sb, int type, char *data,
 
 		lock_page(page);
 
-		if (unlikely(page->mapping != mapping)) {
+		if (unlikely(page_is_truncated(page, mapping))) {
 			f2fs_put_page(page, 1);
 			goto repeat;
 		}
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index e0562d04d84f..e63be7831f4d 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -2059,7 +2059,7 @@ static int fuse_page_mkwrite(struct vm_fault *vmf)
 
 	file_update_time(vmf->vma->vm_file);
 	lock_page(page);
-	if (page->mapping != inode->i_mapping) {
+	if (page_is_truncated(page, inode->i_mapping)) {
 		unlock_page(page);
 		return VM_FAULT_NOPAGE;
 	}
diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index b42775bba6a1..21cb6bc98645 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -290,7 +290,7 @@ static int gfs2_write_jdata_pagevec(struct address_space *mapping,
 
 		lock_page(page);
 
-		if (unlikely(page->mapping != mapping)) {
+		if (unlikely(page_is_truncated(page, mapping))) {
 continue_unlock:
 			unlock_page(page);
 			continue;
diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index 4f88e201b3f0..2c4584deb077 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -422,7 +422,7 @@ static int gfs2_page_mkwrite(struct vm_fault *vmf)
 
 	if (!gfs2_write_alloc_required(ip, pos, PAGE_SIZE)) {
 		lock_page(page);
-		if (!PageUptodate(page) || page->mapping != inode->i_mapping) {
+		if (!PageUptodate(page) || page_is_truncated(page, inode->i_mapping)) {
 			ret = -EAGAIN;
 			unlock_page(page);
 		}
@@ -465,7 +465,7 @@ static int gfs2_page_mkwrite(struct vm_fault *vmf)
 	/* If truncated, we must retry the operation, we may have raced
 	 * with the glock demotion code.
 	 */
-	if (!PageUptodate(page) || page->mapping != inode->i_mapping)
+	if (!PageUptodate(page) || page_is_truncated(page, inode->i_mapping))
 		goto out_trans_end;
 
 	/* Unstuff, if required, and allocate backing blocks for page */
diff --git a/fs/iomap.c b/fs/iomap.c
index afd163586aa0..3801abd93e4d 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -453,7 +453,7 @@ int iomap_page_mkwrite(struct vm_fault *vmf, const struct iomap_ops *ops)
 
 	lock_page(page);
 	size = i_size_read(inode);
-	if ((page->mapping != inode->i_mapping) ||
+	if (page_is_truncated(page, inode->i_mapping) ||
 	    (page_offset(page) > size)) {
 		/* We overload EFAULT to mean page got truncated */
 		ret = -EFAULT;
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 1d988a0e91ee..9e23eab3d0df 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -689,7 +689,7 @@ int nfs_readdir_filler(nfs_readdir_descriptor_t *desc,
 static
 void cache_page_release(nfs_readdir_descriptor_t *desc)
 {
-	if (!desc->page->mapping)
+	if (page_is_truncated(desc->page, desc->file->f_mapping))
 		nfs_readdir_clear_array(desc->page);
 	put_page(desc->page);
 	desc->page = NULL;
diff --git a/fs/nilfs2/file.c b/fs/nilfs2/file.c
index c5fa3dee72fc..6d061b78a0d8 100644
--- a/fs/nilfs2/file.c
+++ b/fs/nilfs2/file.c
@@ -64,7 +64,7 @@ static int nilfs_page_mkwrite(struct vm_fault *vmf)
 
 	sb_start_pagefault(inode->i_sb);
 	lock_page(page);
-	if (page->mapping != inode->i_mapping ||
+	if (page_is_truncated(page, inode->i_mapping) ||
 	    page_offset(page) >= i_size_read(inode) || !PageUptodate(page)) {
 		unlock_page(page);
 		ret = -EFAULT;	/* make the VM retry the fault */
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index 9942ee775e08..2d1d3afc9664 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -1103,7 +1103,7 @@ static int ocfs2_grab_pages_for_write(struct address_space *mapping,
 			lock_page(mmap_page);
 
 			/* Exit and let the caller retry */
-			if (mmap_page->mapping != mapping) {
+			if (page_is_truncated(mmap_page, mapping)) {
 				WARN_ON(mmap_page->mapping);
 				unlock_page(mmap_page);
 				ret = -EAGAIN;
diff --git a/fs/ocfs2/mmap.c b/fs/ocfs2/mmap.c
index fb9a20e3d608..2144c5343d08 100644
--- a/fs/ocfs2/mmap.c
+++ b/fs/ocfs2/mmap.c
@@ -87,7 +87,7 @@ static int __ocfs2_page_mkwrite(struct file *file, struct buffer_head *di_bh,
 	 *
 	 * Let VM retry with these cases.
 	 */
-	if ((page->mapping != inode->i_mapping) ||
+	if ((page_is_truncated(page, inode->i_mapping)) ||
 	    (!PageUptodate(page)) ||
 	    (page_offset(page) >= size))
 		goto out;
diff --git a/fs/splice.c b/fs/splice.c
index acab52a7fe56..a9b70ab19be3 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -112,7 +112,7 @@ static int page_cache_pipe_buf_confirm(struct pipe_inode_info *pipe,
 		 * Page got truncated/unhashed. This will cause a 0-byte
 		 * splice, if this is the first page.
 		 */
-		if (!page->mapping) {
+		if (page_is_truncated(page, pipe->inode->i_mapping)) {
 			err = -ENODATA;
 			goto error;
 		}
diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
index 4d7d10aadbba..6dcc98351b28 100644
--- a/fs/ubifs/file.c
+++ b/fs/ubifs/file.c
@@ -1570,7 +1570,7 @@ static int ubifs_vm_page_mkwrite(struct vm_fault *vmf)
 	}
 
 	lock_page(page);
-	if (unlikely(page->mapping != inode->i_mapping ||
+	if (unlikely(page_is_truncated(page, inode->i_mapping) ||
 		     page_offset(page) > i_size_read(inode))) {
 		/* Page got truncated out from underneath us */
 		err = -EINVAL;
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index bed27e59720a..1e2b461b8772 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -719,6 +719,7 @@ xfs_map_at_offset(
 STATIC bool
 xfs_check_page_type(
 	struct page		*page,
+	struct inode		*inode,
 	unsigned int		type,
 	bool			check_all_buffers)
 {
@@ -727,7 +728,7 @@ xfs_check_page_type(
 
 	if (PageWriteback(page))
 		return false;
-	if (!page->mapping)
+	if (page_is_truncated(page, inode->i_mapping))
 		return false;
 	if (!page_has_buffers(page))
 		return false;
@@ -798,7 +799,7 @@ xfs_aops_discard_page(
 	struct buffer_head	*bh, *head;
 	loff_t			offset = page_offset(page);
 
-	if (!xfs_check_page_type(page, XFS_IO_DELALLOC, true))
+	if (!xfs_check_page_type(page, inode, XFS_IO_DELALLOC, true))
 		goto out_invalidate;
 
 	if (XFS_FORCED_SHUTDOWN(ip->i_mount))
@@ -1483,7 +1484,8 @@ xfs_vm_set_page_dirty(
 		unsigned long flags;
 
 		spin_lock_irqsave(&mapping->tree_lock, flags);
-		if (page->mapping) {	/* Race with truncate? */
+		/* Race with truncate? */
+		if (!page_is_truncated(page, mapping)) {
 			WARN_ON_ONCE(!PageUptodate(page));
 			account_page_dirtied(page, mapping);
 			radix_tree_tag_set(&mapping->page_tree,
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index c20f00b3321a..e937f493365e 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -229,6 +229,22 @@ static inline struct page *__page_cache_alloc(gfp_t gfp)
 }
 #endif
 
+/**
+ * page_is_truncated() - test whether a page has been truncated
+ * @page: page to test
+ * @mapping: address_space the page should belong to
+ * Return: true if truncated, false otherwise
+ *
+ * When a page is truncated its mapping is set to NULL. Truncation is a
+ * multi-step process, so a truncated page may still be undergoing activity,
+ * which is why truncation checks are scattered throughout fs and mm code.
+ */
+static inline bool page_is_truncated(struct page *page,
+				     struct address_space *mapping)
+{
+	return !page->mapping || page->mapping != mapping;
+}
+
 static inline struct page *page_cache_alloc(struct address_space *x)
 {
 	return __page_cache_alloc(mapping_gfp_mask(x));
diff --git a/mm/filemap.c b/mm/filemap.c
index 3a980e2128ad..876e7e8c8a3e 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1554,7 +1554,7 @@ struct page *pagecache_get_page(struct address_space *mapping, pgoff_t offset,
 		}
 
 		/* Has the page been truncated? */
-		if (unlikely(page->mapping != mapping)) {
+		if (unlikely(page_is_truncated(page, mapping))) {
 			unlock_page(page);
 			put_page(page);
 			goto repeat;
@@ -2129,7 +2129,7 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb,
 			if (!trylock_page(page))
 				goto page_not_up_to_date;
 			/* Did it get truncated before we got the lock? */
-			if (!page->mapping)
+			if (page_is_truncated(page, mapping))
 				goto page_not_up_to_date_locked;
 			if (!mapping->a_ops->is_partially_uptodate(page,
 						mapping, offset, iter->count))
@@ -2208,7 +2208,7 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb,
 
 page_not_up_to_date_locked:
 		/* Did it get truncated before we got the lock? */
-		if (!page->mapping) {
+		if (page_is_truncated(page, mapping)) {
 			unlock_page(page);
 			put_page(page);
 			continue;
@@ -2535,7 +2535,7 @@ int filemap_fault(struct vm_fault *vmf)
 	}
 
 	/* Did it get truncated? */
-	if (unlikely(page->mapping != mapping)) {
+	if (unlikely(page_is_truncated(page, mapping))) {
 		unlock_page(page);
 		put_page(page);
 		goto retry_find;
@@ -2663,7 +2663,7 @@ void filemap_map_pages(struct vm_fault *vmf,
 		if (!trylock_page(page))
 			goto skip;
 
-		if (page->mapping != mapping || !PageUptodate(page))
+		if (page_is_truncated(page, mapping) || !PageUptodate(page))
 			goto unlock;
 
 		max_idx = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE);
@@ -2854,7 +2854,7 @@ static struct page *do_read_cache_page(struct address_space *mapping,
 	lock_page(page);
 
 	/* Case c or d, restart the operation */
-	if (!page->mapping) {
+	if (page_is_truncated(page, mapping)) {
 		unlock_page(page);
 		put_page(page);
 		goto repeat;
diff --git a/mm/memory.c b/mm/memory.c
index 5fcfc24904d1..1311599a164b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2365,6 +2365,9 @@ static int do_page_mkwrite(struct vm_fault *vmf)
 	int ret;
 	struct page *page = vmf->page;
 	unsigned int old_flags = vmf->flags;
+	struct address_space *mapping;
+
+	mapping = vmf->vma->vm_file ? vmf->vma->vm_file->f_mapping : NULL;
 
 	vmf->flags = FAULT_FLAG_WRITE|FAULT_FLAG_MKWRITE;
 
@@ -2375,7 +2378,7 @@ static int do_page_mkwrite(struct vm_fault *vmf)
 		return ret;
 	if (unlikely(!(ret & VM_FAULT_LOCKED))) {
 		lock_page(page);
-		if (!page->mapping) {
+		if (page_is_truncated(page, mapping)) {
 			unlock_page(page);
 			return 0; /* retry */
 		}
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 67b857ee1a1c..3c14d44639c8 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2213,7 +2213,7 @@ int write_cache_pages(struct address_space *mapping,
 			 * even if there is now a new, dirty page at the same
 			 * pagecache address.
 			 */
-			if (unlikely(page->mapping != mapping)) {
+			if (unlikely(page_is_truncated(page, mapping))) {
 continue_unlock:
 				unlock_page(page);
 				continue;
diff --git a/mm/truncate.c b/mm/truncate.c
index 497a895db341..a9415c96c966 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -209,7 +209,7 @@ invalidate_complete_page(struct address_space *mapping, struct page *page)
 {
 	int ret;
 
-	if (page->mapping != mapping)
+	if (page_is_truncated(page, mapping))
 		return 0;
 
 	if (page_has_private(page) && !try_to_release_page(page, 0))
@@ -224,7 +224,7 @@ int truncate_inode_page(struct address_space *mapping, struct page *page)
 {
 	VM_BUG_ON_PAGE(PageTail(page), page);
 
-	if (page->mapping != mapping)
+	if (page_is_truncated(page, mapping))
 		return -EIO;
 
 	truncate_cleanup_page(mapping, page);
@@ -358,7 +358,7 @@ void truncate_inode_pages_range(struct address_space *mapping,
 				unlock_page(page);
 				continue;
 			}
-			if (page->mapping != mapping) {
+			if (page_is_truncated(page, mapping)) {
 				unlock_page(page);
 				continue;
 			}
@@ -622,7 +622,7 @@ invalidate_complete_page2(struct address_space *mapping, struct page *page)
 {
 	unsigned long flags;
 
-	if (page->mapping != mapping)
+	if (page_is_truncated(page, mapping))
 		return 0;
 
 	if (page_has_private(page) && !try_to_release_page(page, GFP_KERNEL))
@@ -650,7 +650,7 @@ static int do_launder_page(struct address_space *mapping, struct page *page)
 {
 	if (!PageDirty(page))
 		return 0;
-	if (page->mapping != mapping || mapping->a_ops->launder_page == NULL)
+	if (page_is_truncated(page, mapping) || mapping->a_ops->launder_page == NULL)
 		return 0;
 	return mapping->a_ops->launder_page(mapping, page);
 }
@@ -702,7 +702,7 @@ int invalidate_inode_pages2_range(struct address_space *mapping,
 
 			lock_page(page);
 			WARN_ON(page_to_index(page) != index);
-			if (page->mapping != mapping) {
+			if (page_is_truncated(page, mapping)) {
 				unlock_page(page);
 				continue;
 			}
-- 
2.14.3
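As an aside for reviewers, the semantics of the new helper are small enough to model in plain userspace C. This is a sketch with stand-in structs (only the `mapping` field is modeled; none of this is kernel code), showing the three cases the predicate distinguishes:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel structures; only the one field the
 * check reads is modeled. */
struct address_space { int id; };
struct page { struct address_space *mapping; };

/* Mirror of the helper introduced above: a page counts as truncated when
 * its mapping pointer has been cleared, or when the page has been recycled
 * into a different mapping than the one the caller expects. */
static bool page_is_truncated(struct page *page, struct address_space *mapping)
{
	return !page->mapping || page->mapping != mapping;
}
```

Callers lock the page first and then test; a true result means "retry or bail out", exactly as in the converted call sites.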

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 29/79] fs/block: add struct address_space to bdev_write_page() arguments
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (11 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 28/79] fs: introduce page_is_truncated() helper jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 30/79] fs/block: add struct address_space to __block_write_begin() arguments jglisse
                   ` (29 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Jens Axboe, Andrew Morton,
	Alexander Viro, Tejun Heo, Jan Kara, Josef Bacik, Mel Gorman

From: Jérôme Glisse <jglisse@redhat.com>

Add struct address_space to bdev_write_page() arguments.

One step toward dropping reliance on page->mapping.
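For context, a call site changes like this (illustrative fragment only, mirroring the mm/page_io.c hunk in this patch; not compilable on its own):

```c
/* before: bdev_write_page() had to trust page->mapping internally */
ret = bdev_write_page(sis->bdev, swap_page_sector(page), page, wbc);

/* after: the caller resolves the expected mapping and passes it down */
ret = bdev_write_page(sis->bdev, swap_page_sector(page),
		      mapping, page, wbc);
```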

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
CC: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 fs/block_dev.c         | 4 +++-
 fs/mpage.c             | 2 +-
 include/linux/blkdev.h | 5 +++--
 mm/page_io.c           | 7 ++++---
 4 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 9ac6bf760272..502b6643bc74 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -678,6 +678,7 @@ EXPORT_SYMBOL_GPL(bdev_read_page);
  * bdev_write_page() - Start writing a page to a block device
  * @bdev: The device to write the page to
  * @sector: The offset on the device to write the page to (need not be aligned)
+ * @mapping: The address space the page belongs to
  * @page: The page to write
  * @wbc: The writeback_control for the write
  *
@@ -694,7 +695,8 @@ EXPORT_SYMBOL_GPL(bdev_read_page);
  * Return: negative errno if an error occurs, 0 if submission was successful.
  */
 int bdev_write_page(struct block_device *bdev, sector_t sector,
-			struct page *page, struct writeback_control *wbc)
+			struct address_space *mapping, struct page *page,
+			struct writeback_control *wbc)
 {
 	int result;
 	const struct block_device_operations *ops = bdev->bd_disk->fops;
diff --git a/fs/mpage.c b/fs/mpage.c
index 52a6028e2066..a75cea232f1a 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -619,7 +619,7 @@ static int __mpage_writepage(struct page *page, struct address_space *_mapping,
 	if (bio == NULL) {
 		if (first_unmapped == blocks_per_page) {
 			if (!bdev_write_page(bdev, blocks[0] << (blkbits - 9),
-								page, wbc))
+						mapping, page, wbc))
 				goto out;
 		}
 		bio = mpage_alloc(bdev, blocks[0] << (blkbits - 9),
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index ed63f3b69c12..0cf66b6993f4 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -2053,8 +2053,9 @@ struct block_device_operations {
 extern int __blkdev_driver_ioctl(struct block_device *, fmode_t, unsigned int,
 				 unsigned long);
 extern int bdev_read_page(struct block_device *, sector_t, struct page *);
-extern int bdev_write_page(struct block_device *, sector_t, struct page *,
-						struct writeback_control *);
+extern int bdev_write_page(struct block_device *bdev, sector_t sector,
+			struct address_space *mapping, struct page *page,
+			struct writeback_control *wbc);
 
 #ifdef CONFIG_BLK_DEV_ZONED
 bool blk_req_needs_zone_write_lock(struct request *rq);
diff --git a/mm/page_io.c b/mm/page_io.c
index 402231dd1286..6e548b588490 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -282,12 +282,12 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc,
 	struct bio *bio;
 	int ret;
 	struct swap_info_struct *sis = page_swap_info(page);
+	struct file *swap_file = sis->swap_file;
+	struct address_space *mapping = swap_file->f_mapping;
 
 	VM_BUG_ON_PAGE(!PageSwapCache(page), page);
 	if (sis->flags & SWP_FILE) {
 		struct kiocb kiocb;
-		struct file *swap_file = sis->swap_file;
-		struct address_space *mapping = swap_file->f_mapping;
 		struct bio_vec bv = {
 			.bv_page = page,
 			.bv_len  = PAGE_SIZE,
@@ -325,7 +325,8 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc,
 		return ret;
 	}
 
-	ret = bdev_write_page(sis->bdev, swap_page_sector(page), page, wbc);
+	ret = bdev_write_page(sis->bdev, swap_page_sector(page),
+			      mapping, page, wbc);
 	if (!ret) {
 		count_swpout_vm_event(page);
 		return 0;
-- 
2.14.3



* [RFC PATCH 30/79] fs/block: add struct address_space to __block_write_begin() arguments
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (12 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 29/79] fs/block: add struct address_space to bdev_write_page() arguments jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 31/79] fs/block: add struct address_space to __block_write_begin_int() args jglisse
                   ` (28 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Jens Axboe, Andrew Morton,
	Alexander Viro, Tejun Heo, Jan Kara, Josef Bacik, Mel Gorman

From: Jérôme Glisse <jglisse@redhat.com>

Add struct address_space to __block_write_begin() arguments.

One step toward dropping reliance on page->mapping.

----------------------------------------------------------------------
@exists@
identifier M;
expression E1, E2, E3, E4;
@@
struct address_space *M;
...
-__block_write_begin(E1, E2, E3, E4)
+__block_write_begin(M, E1, E2, E3, E4)

@exists@
identifier M, F;
expression E1, E2, E3, E4;
@@
F(..., struct address_space *M, ...) {...
-__block_write_begin(E1, E2, E3, E4)
+__block_write_begin(M, E1, E2, E3, E4)
...}

@exists@
identifier I;
expression E1, E2, E3, E4;
@@
struct inode *I;
...
-__block_write_begin(E1, E2, E3, E4)
+__block_write_begin(I->i_mapping, E1, E2, E3, E4)

@exists@
identifier I, F;
expression E1, E2, E3, E4;
@@
F(..., struct inode *I, ...) {...
-__block_write_begin(E1, E2, E3, E4)
+__block_write_begin(I->i_mapping, E1, E2, E3, E4)
...}

@exists@
identifier P;
expression E1, E2, E3, E4;
@@
struct page *P;
...
-__block_write_begin(E1, E2, E3, E4)
+__block_write_begin(P->mapping, E1, E2, E3, E4)

@exists@
identifier P, F;
expression E1, E2, E3, E4;
@@
F(..., struct page *P, ...) {...
-__block_write_begin(E1, E2, E3, E4)
+__block_write_begin(P->mapping, E1, E2, E3, E4)
...}
----------------------------------------------------------------------
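Concretely, the first rule above matches any function where a `struct address_space *` variable is already in scope and threads it through as the new first argument. A hypothetical caller (names invented for illustration) is rewritten like this:

```c
/* before */
err = __block_write_begin(page, pos, len, get_block);

/* after: the in-scope mapping is passed explicitly */
err = __block_write_begin(mapping, page, pos, len, get_block);
```

The later rules fall back to `inode->i_mapping` or `page->mapping` when no such variable exists, which is why the `page->mapping` form still appears in fs/ext2/dir.c below.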

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
CC: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 fs/buffer.c                 | 10 +++++-----
 fs/ext2/dir.c               |  3 ++-
 fs/ext4/inline.c            |  7 ++++---
 fs/ext4/inode.c             |  8 +++++---
 fs/gfs2/aops.c              |  2 +-
 fs/minix/inode.c            |  3 ++-
 fs/nilfs2/dir.c             |  3 ++-
 fs/ocfs2/file.c             |  2 +-
 fs/reiserfs/inode.c         |  8 +++++---
 fs/sysv/itree.c             |  2 +-
 fs/ufs/inode.c              |  3 ++-
 include/linux/buffer_head.h |  4 ++--
 12 files changed, 32 insertions(+), 23 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 8b2eb3dfb539..de16588d7f7f 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2028,8 +2028,8 @@ int __block_write_begin_int(struct page *page, loff_t pos, unsigned len,
 	return err;
 }
 
-int __block_write_begin(struct page *page, loff_t pos, unsigned len,
-		get_block_t *get_block)
+int __block_write_begin(struct address_space *mapping, struct page *page,
+		loff_t pos, unsigned len, get_block_t *get_block)
 {
 	return __block_write_begin_int(page, pos, len, get_block, NULL);
 }
@@ -2090,7 +2090,7 @@ int block_write_begin(struct address_space *mapping, loff_t pos, unsigned len,
 	if (!page)
 		return -ENOMEM;
 
-	status = __block_write_begin(page, pos, len, get_block);
+	status = __block_write_begin(mapping, page, pos, len, get_block);
 	if (unlikely(status)) {
 		unlock_page(page);
 		put_page(page);
@@ -2495,7 +2495,7 @@ int block_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf,
 	else
 		end = PAGE_SIZE;
 
-	ret = __block_write_begin(page, 0, end, get_block);
+	ret = __block_write_begin(inode->i_mapping, page, 0, end, get_block);
 	if (!ret)
 		ret = block_commit_write(page, 0, end);
 
@@ -2579,7 +2579,7 @@ int nobh_write_begin(struct address_space *mapping,
 	*fsdata = NULL;
 
 	if (page_has_buffers(page)) {
-		ret = __block_write_begin(page, pos, len, get_block);
+		ret = __block_write_begin(mapping, page, pos, len, get_block);
 		if (unlikely(ret))
 			goto out_release;
 		return ret;
diff --git a/fs/ext2/dir.c b/fs/ext2/dir.c
index 3b8114def693..0d116d4e923c 100644
--- a/fs/ext2/dir.c
+++ b/fs/ext2/dir.c
@@ -453,7 +453,8 @@ ino_t ext2_inode_by_name(struct inode *dir, const struct qstr *child)
 
 static int ext2_prepare_chunk(struct page *page, loff_t pos, unsigned len)
 {
-	return __block_write_begin(page, pos, len, ext2_get_block);
+	return __block_write_begin(page->mapping, page, pos, len,
+				   ext2_get_block);
 }
 
 /* Releases the page */
diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index 70cf4c7b268a..ffdbd443c67a 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -580,10 +580,11 @@ static int ext4_convert_inline_data_to_extent(struct address_space *mapping,
 		goto out;
 
 	if (ext4_should_dioread_nolock(inode)) {
-		ret = __block_write_begin(page, from, to,
+		ret = __block_write_begin(mapping, page, from, to,
 					  ext4_get_block_unwritten);
 	} else
-		ret = __block_write_begin(page, from, to, ext4_get_block);
+		ret = __block_write_begin(mapping, page, from, to,
+					  ext4_get_block);
 
 	if (!ret && ext4_should_journal_data(inode)) {
 		ret = ext4_walk_page_buffers(handle, page_buffers(page),
@@ -808,7 +809,7 @@ static int ext4_da_convert_inline_data_to_extent(struct address_space *mapping,
 			goto out;
 	}
 
-	ret = __block_write_begin(page, 0, inline_size,
+	ret = __block_write_begin(mapping, page, 0, inline_size,
 				  ext4_da_get_block_prep);
 	if (ret) {
 		up_read(&EXT4_I(inode)->xattr_sem);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 394fed206138..1947aac3e8ee 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1314,10 +1314,11 @@ static int ext4_write_begin(struct file *file, struct address_space *mapping,
 					     ext4_get_block);
 #else
 	if (ext4_should_dioread_nolock(inode))
-		ret = __block_write_begin(page, pos, len,
+		ret = __block_write_begin(mapping, page, pos, len,
 					  ext4_get_block_unwritten);
 	else
-		ret = __block_write_begin(page, pos, len, ext4_get_block);
+		ret = __block_write_begin(mapping, page, pos, len,
+					  ext4_get_block);
 #endif
 	if (!ret && ext4_should_journal_data(inode)) {
 		ret = ext4_walk_page_buffers(handle, page_buffers(page),
@@ -3080,7 +3081,8 @@ static int ext4_da_write_begin(struct file *file, struct address_space *mapping,
 	ret = ext4_block_write_begin(page, pos, len,
 				     ext4_da_get_block_prep);
 #else
-	ret = __block_write_begin(page, pos, len, ext4_da_get_block_prep);
+	ret = __block_write_begin(mapping, page, pos, len,
+				  ext4_da_get_block_prep);
 #endif
 	if (ret < 0) {
 		unlock_page(page);
diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index 21cb6bc98645..466f2f909108 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -744,7 +744,7 @@ static int gfs2_write_begin(struct file *file, struct address_space *mapping,
 	}
 
 prepare_write:
-	error = __block_write_begin(page, from, len, gfs2_block_map);
+	error = __block_write_begin(mapping, page, from, len, gfs2_block_map);
 out:
 	if (error == 0)
 		return 0;
diff --git a/fs/minix/inode.c b/fs/minix/inode.c
index 2a151fa6b013..450aa4e87cd9 100644
--- a/fs/minix/inode.c
+++ b/fs/minix/inode.c
@@ -396,7 +396,8 @@ static int minix_readpage(struct file *file, struct address_space *mapping,
 
 int minix_prepare_chunk(struct page *page, loff_t pos, unsigned len)
 {
-	return __block_write_begin(page, pos, len, minix_get_block);
+	return __block_write_begin(page->mapping, page, pos, len,
+				   minix_get_block);
 }
 
 static void minix_write_failed(struct address_space *mapping, loff_t to)
diff --git a/fs/nilfs2/dir.c b/fs/nilfs2/dir.c
index 582831ab3eb9..837d7eb9a920 100644
--- a/fs/nilfs2/dir.c
+++ b/fs/nilfs2/dir.c
@@ -98,7 +98,8 @@ static int nilfs_prepare_chunk(struct page *page, unsigned int from,
 {
 	loff_t pos = page_offset(page) + from;
 
-	return __block_write_begin(page, pos, to - from, nilfs_get_block);
+	return __block_write_begin(page->mapping, page, pos, to - from,
+				   nilfs_get_block);
 }
 
 static void nilfs_commit_chunk(struct page *page,
diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
index 5d1784a365a3..fe1d542def25 100644
--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -810,7 +810,7 @@ static int ocfs2_write_zero_page(struct inode *inode, u64 abs_from,
 		 * __block_write_begin and block_commit_write to zero the
 		 * whole block.
 		 */
-		ret = __block_write_begin(page, block_start + 1, 0,
+		ret = __block_write_begin(mapping, page, block_start + 1, 0,
 					  ocfs2_get_block);
 		if (ret < 0) {
 			mlog_errno(ret);
diff --git a/fs/reiserfs/inode.c b/fs/reiserfs/inode.c
index d4ab2d45f846..aec309175fd0 100644
--- a/fs/reiserfs/inode.c
+++ b/fs/reiserfs/inode.c
@@ -2210,7 +2210,8 @@ static int grab_tail_page(struct inode *inode,
 	/* start within the page of the last block in the file */
 	start = (offset / blocksize) * blocksize;
 
-	error = __block_write_begin(page, start, offset - start,
+	error = __block_write_begin(inode->i_mapping, page, start,
+				    offset - start,
 				    reiserfs_get_block_create_0);
 	if (error)
 		goto unlock;
@@ -2788,7 +2789,7 @@ static int reiserfs_write_begin(struct file *file,
 		old_ref = th->t_refcount;
 		th->t_refcount++;
 	}
-	ret = __block_write_begin(page, pos, len, reiserfs_get_block);
+	ret = __block_write_begin(mapping, page, pos, len, reiserfs_get_block);
 	if (ret && reiserfs_transaction_running(inode->i_sb)) {
 		struct reiserfs_transaction_handle *th = current->journal_info;
 		/*
@@ -2848,7 +2849,8 @@ int __reiserfs_write_begin(struct page *page, unsigned from, unsigned len)
 		th->t_refcount++;
 	}
 
-	ret = __block_write_begin(page, from, len, reiserfs_get_block);
+	ret = __block_write_begin(inode->i_mapping, page, from, len,
+				  reiserfs_get_block);
 	if (ret && reiserfs_transaction_running(inode->i_sb)) {
 		struct reiserfs_transaction_handle *th = current->journal_info;
 		/*
diff --git a/fs/sysv/itree.c b/fs/sysv/itree.c
index 7cec1e024dc3..3b7d27e07e31 100644
--- a/fs/sysv/itree.c
+++ b/fs/sysv/itree.c
@@ -465,7 +465,7 @@ static int sysv_readpage(struct file *file, struct address_space *mapping,
 
 int sysv_prepare_chunk(struct page *page, loff_t pos, unsigned len)
 {
-	return __block_write_begin(page, pos, len, get_block);
+	return __block_write_begin(page->mapping, page, pos, len, get_block);
 }
 
 static void sysv_write_failed(struct address_space *mapping, loff_t to)
diff --git a/fs/ufs/inode.c b/fs/ufs/inode.c
index 8589b934be09..fcaa60bfad49 100644
--- a/fs/ufs/inode.c
+++ b/fs/ufs/inode.c
@@ -483,7 +483,8 @@ static int ufs_readpage(struct file *file, struct address_space *mapping,
 
 int ufs_prepare_chunk(struct page *page, loff_t pos, unsigned len)
 {
-	return __block_write_begin(page, pos, len, ufs_getfrag_block);
+	return __block_write_begin(page->mapping, page, pos, len,
+				   ufs_getfrag_block);
 }
 
 static void ufs_truncate_blocks(struct inode *);
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index fb68a3358330..dca0d3eb789a 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -228,8 +228,8 @@ int block_is_partially_uptodate(struct page *page,
 	unsigned long count);
 int block_write_begin(struct address_space *mapping, loff_t pos, unsigned len,
 		unsigned flags, struct page **pagep, get_block_t *get_block);
-int __block_write_begin(struct page *page, loff_t pos, unsigned len,
-		get_block_t *get_block);
+int __block_write_begin(struct address_space *mapping, struct page *page,
+		loff_t pos, unsigned len, get_block_t *get_block);
 int block_write_end(struct file *, struct address_space *,
 				loff_t, unsigned, unsigned,
 				struct page *, void *);
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 31/79] fs/block: add struct address_space to __block_write_begin_int() args
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (13 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 30/79] fs/block: add struct address_space to __block_write_begin() arguments jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 32/79] fs/block: do not rely on page->mapping, get it from the context jglisse
                   ` (27 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Jens Axboe, Andrew Morton,
	Alexander Viro, Tejun Heo, Jan Kara, Josef Bacik, Mel Gorman

From: Jérôme Glisse <jglisse@redhat.com>

Add struct address_space to __block_write_begin_int() arguments.

One step toward dropping reliance on page->mapping.

----------------------------------------------------------------------
@exists@
identifier M;
expression E1, E2, E3, E4, E5;
@@
struct address_space *M;
...
-__block_write_begin_int(E1, E2, E3, E4, E5)
+__block_write_begin_int(M, E1, E2, E3, E4, E5)

@exists@
identifier M, F;
expression E1, E2, E3, E4, E5;
@@
F(..., struct address_space *M, ...) {...
-__block_write_begin_int(E1, E2, E3, E4, E5)
+__block_write_begin_int(M, E1, E2, E3, E4, E5)
...}
----------------------------------------------------------------------

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
CC: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 fs/buffer.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index de16588d7f7f..c83878d0a4c0 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1943,8 +1943,9 @@ iomap_to_bh(struct inode *inode, sector_t block, struct buffer_head *bh,
 	}
 }
 
-int __block_write_begin_int(struct page *page, loff_t pos, unsigned len,
-		get_block_t *get_block, struct iomap *iomap)
+int __block_write_begin_int(struct address_space *mapping, struct page *page,
+		loff_t pos, unsigned len, get_block_t *get_block,
+		struct iomap *iomap)
 {
 	unsigned from = pos & (PAGE_SIZE - 1);
 	unsigned to = from + len;
@@ -2031,7 +2032,8 @@ int __block_write_begin_int(struct page *page, loff_t pos, unsigned len,
 int __block_write_begin(struct address_space *mapping, struct page *page,
 		loff_t pos, unsigned len, get_block_t *get_block)
 {
-	return __block_write_begin_int(page, pos, len, get_block, NULL);
+	return __block_write_begin_int(mapping, page, pos, len, get_block,
+				       NULL);
 }
 EXPORT_SYMBOL(__block_write_begin);
 
-- 
2.14.3


* [RFC PATCH 32/79] fs/block: do not rely on page->mapping, get it from the context
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (14 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 31/79] fs/block: add struct address_space to __block_write_begin_int() args jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 33/79] fs/journal: add struct super_block to jbd2_journal_forget() arguments jglisse
                   ` (26 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Jens Axboe, Andrew Morton,
	Alexander Viro, Tejun Heo, Jan Kara, Josef Bacik, Mel Gorman

From: Jérôme Glisse <jglisse@redhat.com>

This patch removes most dereferences of page->mapping and instead gets the
mapping from the call context (either already available in the function or
by adding it to the function arguments).
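
As an illustration of the pattern (not part of the patch, and with
simplified stand-in types rather than the real kernel definitions): a
helper that used to reach through page->mapping now takes the mapping as
an argument from a caller that already has it in hand.

```c
/* Illustrative stand-ins only; not the real kernel structures. */
struct address_space { int host_id; };
struct page { struct address_space *mapping; };

/* Before: the helper dereferences page->mapping itself. */
static int writepage_old(struct page *page)
{
	return page->mapping->host_id;
}

/* After: the caller, which already knows the mapping, passes it down,
 * so the helper never touches page->mapping. */
static int writepage_new(struct address_space *mapping, struct page *page)
{
	(void)page; /* the page is still available, but mapping comes from context */
	return mapping->host_id;
}
```

Both forms return the same result today; the point is that the new form
keeps working once page->mapping can no longer be trusted as the sole
source of truth.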

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
CC: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 fs/block_dev.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 502b6643bc74..dd9da97615e3 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -564,14 +564,14 @@ EXPORT_SYMBOL(thaw_bdev);
 static int blkdev_writepage(struct address_space *mapping, struct page *page,
 			    struct writeback_control *wbc)
 {
-	return block_write_full_page(page->mapping->host, page,
+	return block_write_full_page(mapping->host, page,
 				     blkdev_get_block, wbc);
 }
 
 static int blkdev_readpage(struct file * file, struct address_space *mapping,
 			   struct page * page)
 {
-	return block_read_full_page(page->mapping->host,page,blkdev_get_block);
+	return block_read_full_page(mapping->host,page,blkdev_get_block);
 }
 
 static int blkdev_readpages(struct file *file, struct address_space *mapping,
@@ -1941,7 +1941,7 @@ EXPORT_SYMBOL_GPL(blkdev_read_iter);
 static int blkdev_releasepage(struct address_space *mapping,
 			      struct page *page, gfp_t wait)
 {
-	struct super_block *super = BDEV_I(page->mapping->host)->bdev.bd_super;
+	struct super_block *super = BDEV_I(mapping->host)->bdev.bd_super;
 
 	if (super && super->s_op->bdev_try_to_free_page)
 		return super->s_op->bdev_try_to_free_page(super, page, wait);
-- 
2.14.3


* [RFC PATCH 33/79] fs/journal: add struct super_block to jbd2_journal_forget() arguments.
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (15 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 32/79] fs/block: do not rely on page->mapping get it from the context jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 34/79] fs/journal: add struct super_block to jbd2_journal_revoke() arguments jglisse
                   ` (25 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Theodore Ts'o,
	Jan Kara, linux-ext4, Alexander Viro

From: Jérôme Glisse <jglisse@redhat.com>

For the holy crusade to stop relying on the struct page mapping field, add
struct super_block to the jbd2_journal_forget() arguments.

spatch --sp-file zemantic-010a.spatch --in-place --dir fs/
----------------------------------------------------------------------
@exists@
expression E1, E2;
identifier I;
@@
struct super_block *I;
...
-jbd2_journal_forget(E1, E2)
+jbd2_journal_forget(E1, I, E2)

@exists@
expression E1, E2;
identifier F, I;
@@
F(..., struct super_block *I, ...) {
...
-jbd2_journal_forget(E1, E2)
+jbd2_journal_forget(E1, I, E2)
...
}

@exists@
expression E1, E2;
identifier I;
@@
struct block_device *I;
...
-jbd2_journal_forget(E1, E2)
+jbd2_journal_forget(E1, I->bd_super, E2)

@exists@
expression E1, E2;
identifier F, I;
@@
F(..., struct block_device *I, ...) {
...
-jbd2_journal_forget(E1, E2)
+jbd2_journal_forget(E1, I->bd_super, E2)
...
}

@exists@
expression E1, E2;
identifier I;
@@
struct inode *I;
...
-jbd2_journal_forget(E1, E2)
+jbd2_journal_forget(E1, I->i_sb, E2)

@exists@
expression E1, E2;
identifier F, I;
@@
F(..., struct inode *I, ...) {
...
-jbd2_journal_forget(E1, E2)
+jbd2_journal_forget(E1, I->i_sb, E2)
...
}
----------------------------------------------------------------------
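
The four Coccinelle rules above differ only in how each call site derives
the superblock. A sketch with hypothetical stand-in types (not the real
kernel definitions) shows the three derivations the semantic patch
encodes:

```c
/* Illustrative stand-ins only; not the real kernel structures. */
struct super_block { int s_id; };
struct block_device { struct super_block *bd_super; };
struct inode { struct super_block *i_sb; };

/* New-style forget: the superblock is now an explicit argument. */
static int journal_forget(struct super_block *sb)
{
	return sb->s_id;
}

/* Each caller passes whatever superblock handle its context provides,
 * mirroring the E2 -> I / I->bd_super / I->i_sb rules above. */
static int forget_from_sb(struct super_block *sb)
{
	return journal_forget(sb);
}

static int forget_from_bdev(struct block_device *bdev)
{
	return journal_forget(bdev->bd_super);
}

static int forget_from_inode(struct inode *inode)
{
	return journal_forget(inode->i_sb);
}
```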

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Jan Kara <jack@suse.com>
Cc: linux-ext4@vger.kernel.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
---
 fs/ext4/ext4_jbd2.c   | 2 +-
 fs/jbd2/revoke.c      | 2 +-
 fs/jbd2/transaction.c | 3 ++-
 include/linux/jbd2.h  | 3 ++-
 4 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
index 2d593201cf7a..0804d564b529 100644
--- a/fs/ext4/ext4_jbd2.c
+++ b/fs/ext4/ext4_jbd2.c
@@ -224,7 +224,7 @@ int __ext4_forget(const char *where, unsigned int line, handle_t *handle,
 	    (!is_metadata && !ext4_should_journal_data(inode))) {
 		if (bh) {
 			BUFFER_TRACE(bh, "call jbd2_journal_forget");
-			err = jbd2_journal_forget(handle, bh);
+			err = jbd2_journal_forget(handle, inode->i_sb, bh);
 			if (err)
 				ext4_journal_abort_handle(where, line, __func__,
 							  bh, handle, err);
diff --git a/fs/jbd2/revoke.c b/fs/jbd2/revoke.c
index 696ef15ec942..b6e2fd52acd6 100644
--- a/fs/jbd2/revoke.c
+++ b/fs/jbd2/revoke.c
@@ -381,7 +381,7 @@ int jbd2_journal_revoke(handle_t *handle, unsigned long long blocknr,
 		set_buffer_revokevalid(bh);
 		if (bh_in) {
 			BUFFER_TRACE(bh_in, "call jbd2_journal_forget");
-			jbd2_journal_forget(handle, bh_in);
+			jbd2_journal_forget(handle, bdev->bd_super, bh_in);
 		} else {
 			BUFFER_TRACE(bh, "call brelse");
 			__brelse(bh);
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index ac311037d7a5..e8c50bb5822c 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -1482,7 +1482,8 @@ int jbd2_journal_dirty_metadata(handle_t *handle, struct buffer_head *bh)
  * Allow this call even if the handle has aborted --- it may be part of
  * the caller's cleanup after an abort.
  */
-int jbd2_journal_forget (handle_t *handle, struct buffer_head *bh)
+int jbd2_journal_forget (handle_t *handle, struct super_block *sb,
+			 struct buffer_head *bh)
 {
 	transaction_t *transaction = handle->h_transaction;
 	journal_t *journal;
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index b708e5169d1d..d89749a179eb 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -1358,7 +1358,8 @@ extern int	 jbd2_journal_get_undo_access(handle_t *, struct buffer_head *);
 void		 jbd2_journal_set_triggers(struct buffer_head *,
 					   struct jbd2_buffer_trigger_type *type);
 extern int	 jbd2_journal_dirty_metadata (handle_t *, struct buffer_head *);
-extern int	 jbd2_journal_forget (handle_t *, struct buffer_head *);
+extern int	 jbd2_journal_forget (handle_t *, struct super_block *sb,
+					struct buffer_head *);
 extern void	 journal_sync_buffer (struct buffer_head *);
 extern int	 jbd2_journal_invalidatepage(journal_t *,
 				struct page *, unsigned int, unsigned int);
-- 
2.14.3


* [RFC PATCH 34/79] fs/journal: add struct super_block to jbd2_journal_revoke() arguments.
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (16 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 33/79] fs/journal: add struct super_block to jbd2_journal_forget() arguments jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 35/79] fs/buffer: add struct address_space and struct page to end_io callback jglisse
                   ` (24 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Theodore Ts'o,
	Jan Kara, linux-ext4, Alexander Viro

From: Jérôme Glisse <jglisse@redhat.com>

For the holy crusade to stop relying on the struct page mapping field, add
struct super_block to the jbd2_journal_revoke() arguments.

spatch --sp-file zemantic-011a.spatch --in-place --dir fs/
----------------------------------------------------------------------
@exists@
expression E1, E2, E3;
identifier I;
@@
struct super_block *I;
...
-jbd2_journal_revoke(E1, E2, E3)
+jbd2_journal_revoke(E1, E2, I, E3)

@exists@
expression E1, E2, E3;
identifier F, I;
@@
F(..., struct super_block *I, ...) {
...
-jbd2_journal_revoke(E1, E2, E3)
+jbd2_journal_revoke(E1, E2, I, E3)
...
}

@exists@
expression E1, E2, E3;
identifier I;
@@
struct inode *I;
...
-jbd2_journal_revoke(E1, E2, E3)
+jbd2_journal_revoke(E1, E2, I->i_sb, E3)

@exists@
expression E1, E2, E3;
identifier F, I;
@@
F(..., struct inode *I, ...) {
...
-jbd2_journal_revoke(E1, E2, E3)
+jbd2_journal_revoke(E1, E2, I->i_sb, E3)
...
}
----------------------------------------------------------------------

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Jan Kara <jack@suse.com>
Cc: linux-ext4@vger.kernel.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org

---
 fs/ext4/ext4_jbd2.c  | 2 +-
 fs/jbd2/revoke.c     | 2 +-
 include/linux/jbd2.h | 3 ++-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
index 0804d564b529..5529badca994 100644
--- a/fs/ext4/ext4_jbd2.c
+++ b/fs/ext4/ext4_jbd2.c
@@ -237,7 +237,7 @@ int __ext4_forget(const char *where, unsigned int line, handle_t *handle,
 	 * data!=journal && (is_metadata || should_journal_data(inode))
 	 */
 	BUFFER_TRACE(bh, "call jbd2_journal_revoke");
-	err = jbd2_journal_revoke(handle, blocknr, bh);
+	err = jbd2_journal_revoke(handle, blocknr, inode->i_sb, bh);
 	if (err) {
 		ext4_journal_abort_handle(where, line, __func__,
 					  bh, handle, err);
diff --git a/fs/jbd2/revoke.c b/fs/jbd2/revoke.c
index b6e2fd52acd6..71e690ad9d44 100644
--- a/fs/jbd2/revoke.c
+++ b/fs/jbd2/revoke.c
@@ -320,7 +320,7 @@ void jbd2_journal_destroy_revoke(journal_t *journal)
  */
 
 int jbd2_journal_revoke(handle_t *handle, unsigned long long blocknr,
-		   struct buffer_head *bh_in)
+			struct super_block *sb, struct buffer_head *bh_in)
 {
 	struct buffer_head *bh = NULL;
 	journal_t *journal;
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index d89749a179eb..c5133df80fd4 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -1450,7 +1450,8 @@ extern void	   jbd2_journal_destroy_revoke_caches(void);
 extern int	   jbd2_journal_init_revoke_caches(void);
 
 extern void	   jbd2_journal_destroy_revoke(journal_t *);
-extern int	   jbd2_journal_revoke (handle_t *, unsigned long long, struct buffer_head *);
+extern int	   jbd2_journal_revoke (handle_t *, unsigned long long,
+				struct super_block *, struct buffer_head *);
 extern int	   jbd2_journal_cancel_revoke(handle_t *, struct journal_head *);
 extern void	   jbd2_journal_write_revoke_records(transaction_t *transaction,
 						     struct list_head *log_bufs);
-- 
2.14.3


* [RFC PATCH 35/79] fs/buffer: add struct address_space and struct page to end_io callback
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (17 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 34/79] fs/journal: add struct super_block to jbd2_journal_revoke() arguments jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 36/79] fs/buffer: add struct super_block to bforget() arguments jglisse
                   ` (23 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Andrew Morton,
	Alexander Viro, Jens Axboe, Tejun Heo, Jan Kara, Josef Bacik,
	Mel Gorman

From: Jérôme Glisse <jglisse@redhat.com>

For the holy crusade to stop relying on the struct page mapping field, add
struct address_space and struct page to the end_io callback of the buffer
head. Callers of this callback then have more context with which to find
the matching page and mapping.
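
A minimal sketch of the new callback shape (simplified stand-in types, not
the kernel's real definitions): the completion path resolves the page and
mapping once and hands them to the handler, instead of each handler
reading bh->b_page->mapping itself.

```c
/* Illustrative stand-ins only; not the real kernel structures. */
struct address_space { int host_id; };
struct page { struct address_space *mapping; };

struct buffer_head {
	struct page *b_page;
	void (*b_end_io)(struct address_space *mapping, struct page *page,
			 struct buffer_head *bh, int uptodate);
	int b_uptodate;
};

/* New-style handler: mapping and page arrive as arguments. */
static void end_io_sample(struct address_space *mapping, struct page *page,
			  struct buffer_head *bh, int uptodate)
{
	(void)mapping;
	(void)page;
	bh->b_uptodate = uptodate;
}

/* The completion path resolves the context once and passes it down,
 * as end_bio_bh_io_sync() does in the patch below. */
static void complete_io(struct buffer_head *bh, int status_ok)
{
	struct page *page = bh->b_page;

	bh->b_end_io(page->mapping, page, bh, status_ok);
}
```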

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
CC: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 drivers/md/md-bitmap.c      |  3 ++-
 fs/btrfs/disk-io.c          |  3 ++-
 fs/buffer.c                 | 26 +++++++++++++++++---------
 fs/ext4/ext4.h              |  3 ++-
 fs/ext4/ialloc.c            |  3 ++-
 fs/gfs2/meta_io.c           |  2 +-
 fs/jbd2/commit.c            |  3 ++-
 fs/ntfs/aops.c              |  9 ++++++---
 fs/reiserfs/journal.c       |  6 ++++--
 include/linux/buffer_head.h | 12 ++++++++----
 10 files changed, 46 insertions(+), 24 deletions(-)

diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
index 239c7bb3929b..717e99eabce9 100644
--- a/drivers/md/md-bitmap.c
+++ b/drivers/md/md-bitmap.c
@@ -313,7 +313,8 @@ static void write_page(struct bitmap *bitmap, struct page *page, int wait)
 		bitmap_file_kick(bitmap);
 }
 
-static void end_bitmap_write(struct buffer_head *bh, int uptodate)
+static void end_bitmap_write(struct address_space *mapping, struct page *page,
+			     struct buffer_head *bh, int uptodate)
 {
 	struct bitmap *bitmap = bh->b_private;
 
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index a976ccc6036b..df789cfdebd7 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3123,7 +3123,8 @@ int open_ctree(struct super_block *sb,
 }
 ALLOW_ERROR_INJECTION(open_ctree, ERRNO);
 
-static void btrfs_end_buffer_write_sync(struct buffer_head *bh, int uptodate)
+static void btrfs_end_buffer_write_sync(struct address_space *mapping,
+		struct page *page, struct buffer_head *bh, int uptodate)
 {
 	if (uptodate) {
 		set_buffer_uptodate(bh);
diff --git a/fs/buffer.c b/fs/buffer.c
index c83878d0a4c0..9f2c5e90b64d 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -159,14 +159,16 @@ static void __end_buffer_read_notouch(struct buffer_head *bh, int uptodate)
  * Default synchronous end-of-IO handler..  Just mark it up-to-date and
  * unlock the buffer. This is what ll_rw_block uses too.
  */
-void end_buffer_read_sync(struct buffer_head *bh, int uptodate)
+void end_buffer_read_sync(struct address_space *mapping, struct page *page,
+			  struct buffer_head *bh, int uptodate)
 {
 	__end_buffer_read_notouch(bh, uptodate);
 	put_bh(bh);
 }
 EXPORT_SYMBOL(end_buffer_read_sync);
 
-void end_buffer_write_sync(struct buffer_head *bh, int uptodate)
+void end_buffer_write_sync(struct address_space *mapping, struct page *page,
+			   struct buffer_head *bh, int uptodate)
 {
 	if (uptodate) {
 		set_buffer_uptodate(bh);
@@ -250,12 +252,12 @@ __find_get_block_slow(struct block_device *bdev, sector_t block)
  * I/O completion handler for block_read_full_page() - pages
  * which come unlocked at the end of I/O.
  */
-static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
+static void end_buffer_async_read(struct address_space *mapping,
+		struct page *page, struct buffer_head *bh, int uptodate)
 {
 	unsigned long flags;
 	struct buffer_head *first;
 	struct buffer_head *tmp;
-	struct page *page;
 	int page_uptodate = 1;
 
 	BUG_ON(!buffer_async_read(bh));
@@ -311,12 +313,12 @@ static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
  * Completion handler for block_write_full_page() - pages which are unlocked
  * during I/O, and which have PageWriteback cleared upon I/O completion.
  */
-void end_buffer_async_write(struct buffer_head *bh, int uptodate)
+void end_buffer_async_write(struct address_space *mapping, struct page *page,
+			    struct buffer_head *bh, int uptodate)
 {
 	unsigned long flags;
 	struct buffer_head *first;
 	struct buffer_head *tmp;
-	struct page *page;
 
 	BUG_ON(!buffer_async_write(bh));
 
@@ -2311,7 +2313,7 @@ int block_read_full_page(struct inode *inode, struct page *page,
 	for (i = 0; i < nr; i++) {
 		bh = arr[i];
 		if (buffer_uptodate(bh))
-			end_buffer_async_read(bh, 1);
+			end_buffer_async_read(inode->i_mapping, page, bh, 1);
 		else
 			submit_bh(REQ_OP_READ, 0, bh);
 	}
@@ -2517,7 +2519,8 @@ EXPORT_SYMBOL(block_page_mkwrite);
  * immediately, while under the page lock.  So it needs a special end_io
  * handler which does not touch the bh after unlocking it.
  */
-static void end_buffer_read_nobh(struct buffer_head *bh, int uptodate)
+static void end_buffer_read_nobh(struct address_space *mapping,
+		struct page *page, struct buffer_head *bh, int uptodate)
 {
 	__end_buffer_read_notouch(bh, uptodate);
 }
@@ -2989,11 +2992,16 @@ EXPORT_SYMBOL(generic_block_bmap);
 static void end_bio_bh_io_sync(struct bio *bio)
 {
 	struct buffer_head *bh = bio->bi_private;
+	struct address_space *mapping;
+	struct page *page;
+
+	page = bh->b_page;
+	mapping = fs_page_mapping_get_with_bh(page, bh);
 
 	if (unlikely(bio_flagged(bio, BIO_QUIET)))
 		set_bit(BH_Quiet, &bh->b_state);
 
-	bh->b_end_io(bh, !bio->bi_status);
+	bh->b_end_io(mapping, page, bh, !bio->bi_status);
 	bio_put(bio);
 }
 
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 3241475a1733..3be14beacd9c 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2389,7 +2389,8 @@ extern void ext4_check_inodes_bitmap(struct super_block *);
 extern void ext4_mark_bitmap_end(int start_bit, int end_bit, char *bitmap);
 extern int ext4_init_inode_table(struct super_block *sb,
 				 ext4_group_t group, int barrier);
-extern void ext4_end_bitmap_read(struct buffer_head *bh, int uptodate);
+extern void ext4_end_bitmap_read(struct address_space *mapping,
+		struct page *page, struct buffer_head *bh, int uptodate);
 
 /* mballoc.c */
 extern const struct file_operations ext4_seq_mb_groups_fops;
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index 7830d28df331..1475eb54b30e 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -104,7 +104,8 @@ static int ext4_init_inode_bitmap(struct super_block *sb,
 	return 0;
 }
 
-void ext4_end_bitmap_read(struct buffer_head *bh, int uptodate)
+void ext4_end_bitmap_read(struct address_space *mapping, struct page *page,
+			  struct buffer_head *bh, int uptodate)
 {
 	if (uptodate) {
 		set_buffer_uptodate(bh);
diff --git a/fs/gfs2/meta_io.c b/fs/gfs2/meta_io.c
index 1f1e9c330e9a..e1942636e7e8 100644
--- a/fs/gfs2/meta_io.c
+++ b/fs/gfs2/meta_io.c
@@ -202,7 +202,7 @@ static void gfs2_meta_read_endio(struct bio *bio)
 		do {
 			struct buffer_head *next = bh->b_this_page;
 			len -= bh->b_size;
-			bh->b_end_io(bh, !bio->bi_status);
+			bh->b_end_io(page->mapping, page, bh, !bio->bi_status);
 			bh = next;
 		} while (bh && len);
 	}
diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index 8de0e7723316..2ab9edd17ea7 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -29,7 +29,8 @@
 /*
  * IO end handler for temporary buffer_heads handling writes to the journal.
  */
-static void journal_end_buffer_io_sync(struct buffer_head *bh, int uptodate)
+static void journal_end_buffer_io_sync(struct address_space *mapping,
+		struct page *page, struct buffer_head *bh, int uptodate)
 {
 	struct buffer_head *orig_bh = bh->b_private;
 
diff --git a/fs/ntfs/aops.c b/fs/ntfs/aops.c
index abd945849395..048c40786dc7 100644
--- a/fs/ntfs/aops.c
+++ b/fs/ntfs/aops.c
@@ -42,6 +42,8 @@
 
 /**
  * ntfs_end_buffer_async_read - async io completion for reading attributes
+ * @mapping:	address space for the page of buffer head
+ * @page:	page the buffer head belongs to
  * @bh:		buffer head on which io is completed
  * @uptodate:	whether @bh is now uptodate or not
  *
@@ -56,11 +58,11 @@
  * record size, and index_block_size_bits, to the log(base 2) of the ntfs
  * record size.
  */
-static void ntfs_end_buffer_async_read(struct buffer_head *bh, int uptodate)
+static void ntfs_end_buffer_async_read(struct address_space *mapping,
+		struct page *page, struct buffer_head *bh, int uptodate)
 {
 	unsigned long flags;
 	struct buffer_head *first, *tmp;
-	struct page *page;
 	struct inode *vi;
 	ntfs_inode *ni;
 	int page_uptodate = 1;
@@ -365,7 +367,8 @@ static int ntfs_read_block(struct page *page)
 			if (likely(!buffer_uptodate(tbh)))
 				submit_bh(REQ_OP_READ, 0, tbh);
 			else
-				ntfs_end_buffer_async_read(tbh, 1);
+				ntfs_end_buffer_async_read(page->mapping,
+							   page, tbh, 1);
 		}
 		return 0;
 	}
diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
index 70057359fbaf..230cb2a2309a 100644
--- a/fs/reiserfs/journal.c
+++ b/fs/reiserfs/journal.c
@@ -617,7 +617,8 @@ static void release_buffer_page(struct buffer_head *bh)
 	}
 }
 
-static void reiserfs_end_buffer_io_sync(struct buffer_head *bh, int uptodate)
+static void reiserfs_end_buffer_io_sync(struct address_space *mapping,
+		struct page *page, struct buffer_head *bh, int uptodate)
 {
 	if (buffer_journaled(bh)) {
 		reiserfs_warning(NULL, "clm-2084",
@@ -633,7 +634,8 @@ static void reiserfs_end_buffer_io_sync(struct buffer_head *bh, int uptodate)
 	release_buffer_page(bh);
 }
 
-static void reiserfs_end_ordered_io(struct buffer_head *bh, int uptodate)
+static void reiserfs_end_ordered_io(struct address_space *mapping,
+		struct page *page, struct buffer_head *bh, int uptodate)
 {
 	if (uptodate)
 		set_buffer_uptodate(bh);
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index dca0d3eb789a..61db6d5e7d85 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -49,7 +49,8 @@ enum bh_state_bits {
 struct page;
 struct buffer_head;
 struct address_space;
-typedef void (bh_end_io_t)(struct buffer_head *bh, int uptodate);
+typedef void (bh_end_io_t)(struct address_space *mapping, struct page *page,
+			   struct buffer_head *bh, int uptodate);
 
 /*
  * Historically, a buffer_head was used to map a single block
@@ -163,9 +164,12 @@ struct buffer_head *alloc_page_buffers(struct page *page, unsigned long size,
 		bool retry);
 void create_empty_buffers(struct page *, unsigned long,
 			unsigned long b_state);
-void end_buffer_read_sync(struct buffer_head *bh, int uptodate);
-void end_buffer_write_sync(struct buffer_head *bh, int uptodate);
-void end_buffer_async_write(struct buffer_head *bh, int uptodate);
+void end_buffer_read_sync(struct address_space *mapping, struct page *page,
+			  struct buffer_head *bh, int uptodate);
+void end_buffer_write_sync(struct address_space *mapping, struct page *page,
+			   struct buffer_head *bh, int uptodate);
+void end_buffer_async_write(struct address_space *mapping, struct page *page,
+			    struct buffer_head *bh, int uptodate);
 
 /* Things to do with buffers at mapping->private_list */
 void mark_buffer_dirty_inode(struct buffer_head *bh, struct inode *inode);
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 36/79] fs/buffer: add struct super_block to bforget() arguments
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (18 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 35/79] fs/buffer: add struct address_space and struct page to end_io callback jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 37/79] fs/buffer: add struct super_block to __bforget() arguments jglisse
                   ` (22 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Andrew Morton,
	Alexander Viro, Jens Axboe, Tejun Heo, Jan Kara, Josef Bacik,
	Mel Gorman

From: Jérôme Glisse <jglisse@redhat.com>

For the holy crusade to stop relying on the mapping field of struct page,
add struct super_block to the bforget() arguments.

spatch --sp-file zemantic-012a.spatch --in-place --dir fs/
----------------------------------------------------------------------
@exists@
expression E1;
identifier I;
@@
struct super_block *I;
...
-bforget(E1)
+bforget(I, E1)

@exists@
expression E1;
identifier F, I;
@@
F(..., struct super_block *I, ...) {
...
-bforget(E1)
+bforget(I, E1)
...
}

@exists@
expression E1;
identifier I;
@@
struct inode *I;
...
-bforget(E1)
+bforget(I->i_sb, E1)

@exists@
expression E1;
identifier F, I;
@@
F(..., struct inode *I, ...) {
...
-bforget(E1)
+bforget(I->i_sb, E1)
...
}
----------------------------------------------------------------------

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
CC: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 fs/bfs/file.c               | 2 +-
 fs/ext2/inode.c             | 4 ++--
 fs/ext2/xattr.c             | 4 ++--
 fs/ext4/ext4_jbd2.c         | 2 +-
 fs/fat/dir.c                | 4 ++--
 fs/jfs/resize.c             | 2 +-
 fs/minix/itree_common.c     | 6 +++---
 fs/reiserfs/journal.c       | 2 +-
 fs/reiserfs/resize.c        | 2 +-
 fs/sysv/itree.c             | 6 +++---
 fs/ufs/util.c               | 2 +-
 include/linux/buffer_head.h | 2 +-
 12 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/fs/bfs/file.c b/fs/bfs/file.c
index b1255ee4cd75..6d66cc137bc3 100644
--- a/fs/bfs/file.c
+++ b/fs/bfs/file.c
@@ -41,7 +41,7 @@ static int bfs_move_block(unsigned long from, unsigned long to,
 	new = sb_getblk(sb, to);
 	memcpy(new->b_data, bh->b_data, bh->b_size);
 	mark_buffer_dirty(new);
-	bforget(bh);
+	bforget(sb, bh);
 	brelse(new);
 	return 0;
 }
diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index 33873c0a4c14..83ea6ad2cefa 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -536,7 +536,7 @@ static int ext2_alloc_branch(struct inode *inode,
 
 failed:
 	for (i = 1; i < n; i++)
-		bforget(branch[i].bh);
+		bforget(inode->i_sb, branch[i].bh);
 	for (i = 0; i < indirect_blks; i++)
 		ext2_free_blocks(inode, new_blocks[i], 1);
 	ext2_free_blocks(inode, new_blocks[i], num);
@@ -1167,7 +1167,7 @@ static void ext2_free_branches(struct inode *inode, __le32 *p, __le32 *q, int de
 					   (__le32*)bh->b_data,
 					   (__le32*)bh->b_data + addr_per_block,
 					   depth);
-			bforget(bh);
+			bforget(inode->i_sb, bh);
 			ext2_free_blocks(inode, nr, 1);
 			mark_inode_dirty(inode);
 		}
diff --git a/fs/ext2/xattr.c b/fs/ext2/xattr.c
index 62d9a659a8ff..c77edf9afbce 100644
--- a/fs/ext2/xattr.c
+++ b/fs/ext2/xattr.c
@@ -733,7 +733,7 @@ ext2_xattr_set2(struct inode *inode, struct buffer_head *old_bh,
 			/* We let our caller release old_bh, so we
 			 * need to duplicate the buffer before. */
 			get_bh(old_bh);
-			bforget(old_bh);
+			bforget(sb, old_bh);
 		} else {
 			/* Decrement the refcount only. */
 			le32_add_cpu(&HDR(old_bh)->h_refcount, -1);
@@ -802,7 +802,7 @@ ext2_xattr_delete_inode(struct inode *inode)
 				      bh->b_blocknr);
 		ext2_free_blocks(inode, EXT2_I(inode)->i_file_acl, 1);
 		get_bh(bh);
-		bforget(bh);
+		bforget(inode->i_sb, bh);
 		unlock_buffer(bh);
 	} else {
 		le32_add_cpu(&HDR(bh)->h_refcount, -1);
diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
index 5529badca994..60fbf5336059 100644
--- a/fs/ext4/ext4_jbd2.c
+++ b/fs/ext4/ext4_jbd2.c
@@ -211,7 +211,7 @@ int __ext4_forget(const char *where, unsigned int line, handle_t *handle,
 
 	/* In the no journal case, we can just do a bforget and return */
 	if (!ext4_handle_valid(handle)) {
-		bforget(bh);
+		bforget(inode->i_sb, bh);
 		return 0;
 	}
 
diff --git a/fs/fat/dir.c b/fs/fat/dir.c
index 8e100c3bf72c..b801f3d0220b 100644
--- a/fs/fat/dir.c
+++ b/fs/fat/dir.c
@@ -1126,7 +1126,7 @@ static int fat_zeroed_cluster(struct inode *dir, sector_t blknr, int nr_used,
 
 error:
 	for (i = 0; i < n; i++)
-		bforget(bhs[i]);
+		bforget(sb, bhs[i]);
 	return err;
 }
 
@@ -1266,7 +1266,7 @@ static int fat_add_new_entries(struct inode *dir, void *slots, int nr_slots,
 	n = 0;
 error_nomem:
 	for (i = 0; i < n; i++)
-		bforget(bhs[i]);
+		bforget(sb, bhs[i]);
 	fat_free_clusters(dir, cluster[0]);
 error:
 	return err;
diff --git a/fs/jfs/resize.c b/fs/jfs/resize.c
index 7ddcb445a3d9..c1f417b94fe6 100644
--- a/fs/jfs/resize.c
+++ b/fs/jfs/resize.c
@@ -114,7 +114,7 @@ int jfs_extendfs(struct super_block *sb, s64 newLVSize, int newLogSize)
 			rc = -EINVAL;
 			goto out;
 		}
-		bforget(bh);
+		bforget(sb, bh);
 	}
 
 	/* Can't extend write-protected drive */
diff --git a/fs/minix/itree_common.c b/fs/minix/itree_common.c
index 043c3fdbc8e7..86a3c3a4e767 100644
--- a/fs/minix/itree_common.c
+++ b/fs/minix/itree_common.c
@@ -100,7 +100,7 @@ static int alloc_branch(struct inode *inode,
 
 	/* Allocation failed, free what we already allocated */
 	for (i = 1; i < n; i++)
-		bforget(branch[i].bh);
+		bforget(inode->i_sb, branch[i].bh);
 	for (i = 0; i < n; i++)
 		minix_free_block(inode, block_to_cpu(branch[i].key));
 	return -ENOSPC;
@@ -137,7 +137,7 @@ static inline int splice_branch(struct inode *inode,
 changed:
 	write_unlock(&pointers_lock);
 	for (i = 1; i < num; i++)
-		bforget(where[i].bh);
+		bforget(inode->i_sb, where[i].bh);
 	for (i = 0; i < num; i++)
 		minix_free_block(inode, block_to_cpu(where[i].key));
 	return -EAGAIN;
@@ -283,7 +283,7 @@ static void free_branches(struct inode *inode, block_t *p, block_t *q, int depth
 				continue;
 			free_branches(inode, (block_t*)bh->b_data,
 				      block_end(bh), depth);
-			bforget(bh);
+			bforget(inode->i_sb, bh);
 			minix_free_block(inode, nr);
 			mark_inode_dirty(inode);
 		}
diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
index 230cb2a2309a..ee5b1d1b3a3d 100644
--- a/fs/reiserfs/journal.c
+++ b/fs/reiserfs/journal.c
@@ -1132,7 +1132,7 @@ static int flush_commit_list(struct super_block *s,
 #endif
 		retval = -EIO;
 	}
-	bforget(jl->j_commit_bh);
+	bforget(s, jl->j_commit_bh);
 	if (journal->j_last_commit_id != 0 &&
 	    (jl->j_trans_id - journal->j_last_commit_id) != 1) {
 		reiserfs_warning(s, "clm-2200", "last commit %lu, current %lu",
diff --git a/fs/reiserfs/resize.c b/fs/reiserfs/resize.c
index 6052d323bc9a..2196afda6e28 100644
--- a/fs/reiserfs/resize.c
+++ b/fs/reiserfs/resize.c
@@ -51,7 +51,7 @@ int reiserfs_resize(struct super_block *s, unsigned long block_count_new)
 		printk("reiserfs_resize: can\'t read last block\n");
 		return -EINVAL;
 	}
-	bforget(bh);
+	bforget(s, bh);
 
 	/*
 	 * old disk layout detection; those partitions can be mounted, but
diff --git a/fs/sysv/itree.c b/fs/sysv/itree.c
index 3b7d27e07e31..61a2a0deba75 100644
--- a/fs/sysv/itree.c
+++ b/fs/sysv/itree.c
@@ -159,7 +159,7 @@ static int alloc_branch(struct inode *inode,
 
 	/* Allocation failed, free what we already allocated */
 	for (i = 1; i < n; i++)
-		bforget(branch[i].bh);
+		bforget(inode->i_sb, branch[i].bh);
 	for (i = 0; i < n; i++)
 		sysv_free_block(inode->i_sb, branch[i].key);
 	return -ENOSPC;
@@ -194,7 +194,7 @@ static inline int splice_branch(struct inode *inode,
 changed:
 	write_unlock(&pointers_lock);
 	for (i = 1; i < num; i++)
-		bforget(where[i].bh);
+		bforget(inode->i_sb, where[i].bh);
 	for (i = 0; i < num; i++)
 		sysv_free_block(inode->i_sb, where[i].key);
 	return -EAGAIN;
@@ -353,7 +353,7 @@ static void free_branches(struct inode *inode, sysv_zone_t *p, sysv_zone_t *q, i
 				continue;
 			free_branches(inode, (sysv_zone_t*)bh->b_data,
 					block_end(bh), depth);
-			bforget(bh);
+			bforget(sb, bh);
 			sysv_free_block(sb, nr);
 			mark_inode_dirty(inode);
 		}
diff --git a/fs/ufs/util.c b/fs/ufs/util.c
index 596f576b2061..7b599af21858 100644
--- a/fs/ufs/util.c
+++ b/fs/ufs/util.c
@@ -132,7 +132,7 @@ void ubh_bforget (struct super_block *sb, struct ufs_buffer_head * ubh)
 	if (!ubh) 
 		return;
 	for ( i = 0; i < ubh->count; i++ ) if ( ubh->bh[i] ) 
-		bforget (ubh->bh[i]);
+		bforget(sb, ubh->bh[i]);
 }
  
 int ubh_buffer_dirty (struct ufs_buffer_head * ubh)
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 61db6d5e7d85..82faae102ba2 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -303,7 +303,7 @@ static inline void brelse(struct buffer_head *bh)
 		__brelse(bh);
 }
 
-static inline void bforget(struct buffer_head *bh)
+static inline void bforget(struct super_block *sb, struct buffer_head *bh)
 {
 	if (bh)
 		__bforget(bh);
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 37/79] fs/buffer: add struct super_block to __bforget() arguments
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (19 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 36/79] fs/buffer: add struct super_block to bforget() arguments jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 38/79] fs/buffer: add first buffer flag for first buffer_head in a page jglisse
                   ` (21 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Andrew Morton,
	Alexander Viro, Jens Axboe, Tejun Heo, Jan Kara, Josef Bacik,
	Mel Gorman

From: Jérôme Glisse <jglisse@redhat.com>

For the holy crusade to stop relying on the mapping field of struct page,
add struct super_block to the __bforget() arguments.

spatch --sp-file zemantic-013a.spatch --in-place --dir fs/
spatch --sp-file zemantic-013a.spatch --in-place --dir include/ --include-headers
----------------------------------------------------------------------
@exists@
expression E1;
identifier I;
@@
struct super_block *I;
...
-__bforget(E1)
+__bforget(I, E1)

@exists@
expression E1;
identifier F, I;
@@
F(..., struct super_block *I, ...) {
...
-__bforget(E1)
+__bforget(I, E1)
...
}

@exists@
expression E1;
identifier I;
@@
struct inode *I;
...
-__bforget(E1)
+__bforget(I->i_sb, E1)

@exists@
expression E1;
identifier F, I;
@@
F(..., struct inode *I, ...) {
...
-__bforget(E1)
+__bforget(I->i_sb, E1)
...
}
----------------------------------------------------------------------

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
CC: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 fs/buffer.c                 | 2 +-
 fs/jbd2/transaction.c       | 2 +-
 include/linux/buffer_head.h | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 9f2c5e90b64d..422204701a3b 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1168,7 +1168,7 @@ EXPORT_SYMBOL(__brelse);
  * bforget() is like brelse(), except it discards any
  * potentially dirty data.
  */
-void __bforget(struct buffer_head *bh)
+void __bforget(struct super_block *sb, struct buffer_head *bh)
 {
 	clear_buffer_dirty(bh);
 	if (bh->b_assoc_map) {
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index e8c50bb5822c..177616eb793c 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -1560,7 +1560,7 @@ int jbd2_journal_forget (handle_t *handle, struct super_block *sb,
 			if (!buffer_jbd(bh)) {
 				spin_unlock(&journal->j_list_lock);
 				jbd_unlock_bh_state(bh);
-				__bforget(bh);
+				__bforget(sb, bh);
 				goto drop;
 			}
 		}
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 82faae102ba2..7ae60f59f27e 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -192,7 +192,7 @@ struct buffer_head *__find_get_block(struct block_device *bdev, sector_t block,
 struct buffer_head *__getblk_gfp(struct block_device *bdev, sector_t block,
 				  unsigned size, gfp_t gfp);
 void __brelse(struct buffer_head *);
-void __bforget(struct buffer_head *);
+void __bforget(struct super_block *, struct buffer_head *);
 void __breadahead(struct block_device *, sector_t block, unsigned int size);
 struct buffer_head *__bread_gfp(struct block_device *,
 				sector_t block, unsigned size, gfp_t gfp);
@@ -306,7 +306,7 @@ static inline void brelse(struct buffer_head *bh)
 static inline void bforget(struct super_block *sb, struct buffer_head *bh)
 {
 	if (bh)
-		__bforget(bh);
+		__bforget(sb, bh);
 }
 
 static inline struct buffer_head *
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 38/79] fs/buffer: add first buffer flag for first buffer_head in a page
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (20 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 37/79] fs/buffer: add struct super_block to __bforget() arguments jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 39/79] fs/buffer: add struct address_space to clean_page_buffers() arguments jglisse
                   ` (20 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Jens Axboe, Andrew Morton,
	Alexander Viro, Tejun Heo, Jan Kara, Josef Bacik, Mel Gorman

From: Jérôme Glisse <jglisse@redhat.com>

A common pattern in the code is that we have a buffer_head and we want
the first buffer_head in the buffer_head list of its page. Before this
patch that was simply done with page_buffers(bh->b_page).

This patch introduces a helper, bh_first_for_page(struct buffer_head *),
which can use a new flag (also introduced in this patch) to find the
first buffer_head for a given page.

The helper still uses page_buffers(bh->b_page) for now, but a later
patch can update it to handle special pages differently and instead
scan the buffer_head list until a buffer_head with the first_for_page
flag set is found.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
CC: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 fs/buffer.c                 |  4 ++--
 include/linux/buffer_head.h | 18 ++++++++++++++++++
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 422204701a3b..44beba15c38d 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -276,7 +276,7 @@ static void end_buffer_async_read(struct address_space *mapping,
 	 * two buffer heads end IO at almost the same time and both
 	 * decide that the page is now completely done.
 	 */
-	first = page_buffers(page);
+	first = bh_first_for_page(bh);
 	local_irq_save(flags);
 	bit_spin_lock(BH_Uptodate_Lock, &first->b_state);
 	clear_buffer_async_read(bh);
@@ -332,7 +332,7 @@ void end_buffer_async_write(struct address_space *mapping, struct page *page,
 		SetPageError(page);
 	}
 
-	first = page_buffers(page);
+	first = bh_first_for_page(bh);
 	local_irq_save(flags);
 	bit_spin_lock(BH_Uptodate_Lock, &first->b_state);
 
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 7ae60f59f27e..22e79307c055 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -39,6 +39,12 @@ enum bh_state_bits {
 	BH_Prio,	/* Buffer should be submitted with REQ_PRIO */
 	BH_Defer_Completion, /* Defer AIO completion to workqueue */
 
+	/*
+	 * First buffer_head for a page ie page->private is pointing to this
+	 * buffer_head struct.
+	 */
+	BH_FirstForPage,
+
 	BH_PrivateStart,/* not a state bit, but the first bit available
 			 * for private allocation by other entities
 			 */
@@ -135,6 +141,7 @@ BUFFER_FNS(Unwritten, unwritten)
 BUFFER_FNS(Meta, meta)
 BUFFER_FNS(Prio, prio)
 BUFFER_FNS(Defer_Completion, defer_completion)
+BUFFER_FNS(FirstForPage, first_for_page)
 
 #define bh_offset(bh)		((unsigned long)(bh)->b_data & ~PAGE_MASK)
 
@@ -278,11 +285,22 @@ void buffer_init(void);
  * inline definitions
  */
 
+/*
+ * bh_first_for_page - return first buffer_head for a page
+ * @bh: buffer_head for which we want the first buffer_head for same page
+ * Returns: first buffer_head within the same page as given buffer_head
+ */
+static inline struct buffer_head *bh_first_for_page(struct buffer_head *bh)
+{
+	return page_buffers(bh->b_page);
+}
+
 static inline void attach_page_buffers(struct page *page,
 		struct buffer_head *head)
 {
 	get_page(page);
 	SetPagePrivate(page);
+	set_buffer_first_for_page(head);
 	set_page_private(page, (unsigned long)head);
 }
 
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 39/79] fs/buffer: add struct address_space to clean_page_buffers() arguments
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (21 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 38/79] fs/buffer: add first buffer flag for first buffer_head in a page jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 50/79] fs: stop relying on mapping field of struct page, get it from context jglisse
                   ` (19 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Jens Axboe, Andrew Morton,
	Alexander Viro, Tejun Heo, Jan Kara, Josef Bacik, Mel Gorman

From: Jérôme Glisse <jglisse@redhat.com>

Add struct address_space to clean_page_buffers() arguments.

One step toward dropping reliance on page->mapping.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
CC: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 fs/block_dev.c              | 2 +-
 fs/mpage.c                  | 9 +++++----
 include/linux/buffer_head.h | 2 +-
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index dd9da97615e3..b653cd8fd1e3 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -712,7 +712,7 @@ int bdev_write_page(struct block_device *bdev, sector_t sector,
 	if (result) {
 		end_page_writeback(page);
 	} else {
-		clean_page_buffers(page);
+		clean_page_buffers(mapping, page);
 		unlock_page(page);
 	}
 	blk_queue_exit(bdev->bd_queue);
diff --git a/fs/mpage.c b/fs/mpage.c
index a75cea232f1a..624995c333e0 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -447,7 +447,8 @@ struct mpage_data {
  * We have our BIO, so we can now mark the buffers clean.  Make
  * sure to only clean buffers which we know we'll be writing.
  */
-static void clean_buffers(struct page *page, unsigned first_unmapped)
+static void clean_buffers(struct address_space *mapping, struct page *page,
+			  unsigned first_unmapped)
 {
 	unsigned buffer_counter = 0;
 	struct buffer_head *bh, *head;
@@ -477,9 +478,9 @@ static void clean_buffers(struct page *page, unsigned first_unmapped)
  * We don't need to calculate how many buffers are attached to the page,
  * we just need to specify a number larger than the maximum number of buffers.
  */
-void clean_page_buffers(struct page *page)
+void clean_page_buffers(struct address_space *mapping, struct page *page)
 {
-	clean_buffers(page, ~0U);
+	clean_buffers(mapping, page, ~0U);
 }
 
 static int __mpage_writepage(struct page *page, struct address_space *_mapping,
@@ -643,7 +644,7 @@ static int __mpage_writepage(struct page *page, struct address_space *_mapping,
 		goto alloc_new;
 	}
 
-	clean_buffers(page, first_unmapped);
+	clean_buffers(mapping, page, first_unmapped);
 
 	BUG_ON(PageWriteback(page));
 	set_page_writeback(page);
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 22e79307c055..f3baf88a251b 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -248,7 +248,7 @@ int generic_write_end(struct file *, struct address_space *,
 				loff_t, unsigned, unsigned,
 				struct page *, void *);
 void page_zero_new_buffers(struct page *page, unsigned from, unsigned to);
-void clean_page_buffers(struct page *page);
+void clean_page_buffers(struct address_space *mapping, struct page *page);
 int cont_write_begin(struct file *, struct address_space *, loff_t,
 			unsigned, unsigned, struct page **, void **,
 			get_block_t *, loff_t *);
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 50/79] fs: stop relying on mapping field of struct page, get it from context
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (22 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 39/79] fs/buffer: add struct address_space to clean_page_buffers() arguments jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 51/79] " jglisse
                   ` (18 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Alexander Viro,
	Jens Axboe, Tejun Heo, Jan Kara, Josef Bacik, Mel Gorman

From: Jérôme Glisse <jglisse@redhat.com>

Holy grail: remove all usage of the mapping field of struct page inside
common fs code.

spatch --sp-file zemantic-015a.spatch --in-place fs/*.c
----------------------------------------------------------------------
@exists@
struct page * P;
identifier I;
@@
struct address_space *I;
...
-P->mapping
+I

@exists@
identifier F, I;
struct page * P;
@@
F(..., struct address_space *I, ...) {
...
-P->mapping
+I
...
}

@@
@@
-mapping = mapping;

@@
@@
-struct address_space *mapping = _mapping;
----------------------------------------------------------------------

Hand edit:
    fs/mpage.c __mpage_writepage(): the Coccinelle semantic patch is too hard ...

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 fs/buffer.c | 11 +++++------
 fs/libfs.c  |  2 +-
 fs/mpage.c  |  9 ++++-----
 3 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index b968ac0b65e8..39d8c7315b55 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -327,7 +327,7 @@ void end_buffer_async_write(struct address_space *mapping, struct page *page,
 		set_buffer_uptodate(bh);
 	} else {
 		buffer_io_error(bh, ", lost async page write");
-		mark_buffer_write_io_error(page->mapping, page, bh);
+		mark_buffer_write_io_error(mapping, page, bh);
 		clear_buffer_uptodate(bh);
 		SetPageError(page);
 	}
@@ -597,11 +597,10 @@ EXPORT_SYMBOL(mark_buffer_dirty_inode);
  *
  * The caller must hold lock_page_memcg().
  */
-static void __set_page_dirty(struct page *page, struct address_space *_mapping,
+static void __set_page_dirty(struct page *page, struct address_space *mapping,
 			     int warn)
 {
 	unsigned long flags;
-	struct address_space *mapping = page_mapping(page);
 
 	spin_lock_irqsave(&mapping->tree_lock, flags);
 	if (page_is_truncated(page, mapping)) {	/* Race with truncate? */
@@ -1954,7 +1953,7 @@ int __block_write_begin_int(struct address_space *mapping, struct page *page,
 {
 	unsigned from = pos & (PAGE_SIZE - 1);
 	unsigned to = from + len;
-	struct inode *inode = page->mapping->host;
+	struct inode *inode = mapping->host;
 	unsigned block_start, block_end;
 	sector_t block;
 	int err = 0;
@@ -2456,7 +2455,7 @@ EXPORT_SYMBOL(cont_write_begin);
 int block_commit_write(struct address_space *mapping, struct page *page,
 		       unsigned from, unsigned to)
 {
-	struct inode *inode = page->mapping->host;
+	struct inode *inode = mapping->host;
 	__block_commit_write(inode,page,from,to);
 	return 0;
 }
@@ -2705,7 +2704,7 @@ int nobh_write_end(struct file *file, struct address_space *mapping,
 			loff_t pos, unsigned len, unsigned copied,
 			struct page *page, void *fsdata)
 {
-	struct inode *inode = page->mapping->host;
+	struct inode *inode = mapping->host;
 	struct buffer_head *head = fsdata;
 	struct buffer_head *bh;
 	BUG_ON(fsdata != NULL && page_has_buffers(page));
diff --git a/fs/libfs.c b/fs/libfs.c
index ac76b269bbb7..585ef1f37d54 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -475,7 +475,7 @@ int simple_write_end(struct file *file, struct address_space *mapping,
 			loff_t pos, unsigned len, unsigned copied,
 			struct page *page, void *fsdata)
 {
-	struct inode *inode = page->mapping->host;
+	struct inode *inode = mapping->host;
 	loff_t last_pos = pos + copied;
 
 	/* zero the stale part of the page if we did a short copy */
diff --git a/fs/mpage.c b/fs/mpage.c
index 1eec9d0df23e..ecdef63f464e 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -231,7 +231,7 @@ do_mpage_readpage(struct bio *bio, struct address_space *mapping,
 		 * so readpage doesn't have to repeat the get_block call
 		 */
 		if (buffer_uptodate(map_bh)) {
-			map_buffer_to_page(page->mapping->host, page,
+			map_buffer_to_page(mapping->host, page,
 					   map_bh, page_block);
 			goto confused;
 		}
@@ -312,7 +312,7 @@ do_mpage_readpage(struct bio *bio, struct address_space *mapping,
 	if (bio)
 		bio = mpage_bio_submit(REQ_OP_READ, 0, bio);
 	if (!PageUptodate(page))
-	        block_read_full_page(page->mapping->host, page, get_block);
+	        block_read_full_page(mapping->host, page, get_block);
 	else
 		unlock_page(page);
 	goto out;
@@ -484,13 +484,12 @@ void clean_page_buffers(struct address_space *mapping, struct page *page)
 	clean_buffers(mapping, page, ~0U);
 }
 
-static int __mpage_writepage(struct page *page, struct address_space *_mapping,
+static int __mpage_writepage(struct page *page, struct address_space *mapping,
 			     struct writeback_control *wbc, void *data)
 {
 	struct mpage_data *mpd = data;
 	struct bio *bio = mpd->bio;
-	struct address_space *mapping = page->mapping;
-	struct inode *inode = page->mapping->host;
+	struct inode *inode = mapping->host;
 	const unsigned blkbits = inode->i_blkbits;
 	unsigned long end_index;
 	const unsigned blocks_per_page = PAGE_SIZE >> blkbits;
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 51/79] fs: stop relying on mapping field of struct page, get it from context
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (23 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 50/79] fs: stop relying on mapping field of struct page, get it from context jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 52/79] fs/buffer: use _page_has_buffers() instead of page_has_buffers() jglisse
                   ` (17 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Alexander Viro,
	Jens Axboe, Tejun Heo, Jan Kara, Josef Bacik, Mel Gorman

From: Jérôme Glisse <jglisse@redhat.com>

Holy grail: remove all usage of the mapping field of struct page inside
common fs code. This is the manual conversion patch (only so much can be
done with Coccinelle).

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 fs/buffer.c | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 39d8c7315b55..3c424b7af5af 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -570,7 +570,9 @@ void write_boundary_block(struct block_device *bdev,
 void mark_buffer_dirty_inode(struct buffer_head *bh, struct inode *inode)
 {
 	struct address_space *mapping = inode->i_mapping;
-	struct address_space *buffer_mapping = bh->b_page->mapping;
+	struct address_space *buffer_mapping;
+
+	buffer_mapping = fs_page_mapping_get_with_bh(bh->b_page, bh);
 
 	mark_buffer_dirty(bh);
 	if (!mapping->private_data) {
@@ -1138,10 +1140,13 @@ EXPORT_SYMBOL(mark_buffer_dirty);
 void mark_buffer_write_io_error(struct address_space *mapping,
 		struct page *page, struct buffer_head *bh)
 {
+	BUG_ON(page != bh->b_page);
+	BUG_ON(mapping != bh->b_page->mapping);
+
 	set_buffer_write_io_error(bh);
 	/* FIXME: do we need to set this in both places? */
-	if (bh->b_page && !page_is_truncated(bh->b_page, bh->b_page->mapping))
-		mapping_set_error(bh->b_page->mapping, -EIO);
+	if (bh->b_page && !page_is_truncated(page, mapping))
+		mapping_set_error(mapping, -EIO);
 	if (bh->b_assoc_map)
 		mapping_set_error(bh->b_assoc_map, -EIO);
 }
@@ -1172,7 +1177,10 @@ void __bforget(struct super_block *sb, struct buffer_head *bh)
 {
 	clear_buffer_dirty(bh);
 	if (bh->b_assoc_map) {
-		struct address_space *buffer_mapping = bh->b_page->mapping;
+		struct address_space *buffer_mapping;
+
+		buffer_mapping = sb->s_bdev->bd_inode->i_mapping;
+		BUG_ON(buffer_mapping != bh->b_page->mapping);
 
 		spin_lock(&buffer_mapping->private_lock);
 		list_del_init(&bh->b_assoc_buffers);
@@ -1543,7 +1551,7 @@ void create_empty_buffers(struct address_space *mapping, struct page *page,
 	} while (bh);
 	tail->b_this_page = head;
 
-	spin_lock(&page->mapping->private_lock);
+	spin_lock(&mapping->private_lock);
 	if (PageUptodate(page) || PageDirty(page)) {
 		bh = head;
 		do {
@@ -1555,7 +1563,7 @@ void create_empty_buffers(struct address_space *mapping, struct page *page,
 		} while (bh != head);
 	}
 	attach_page_buffers(page, head);
-	spin_unlock(&page->mapping->private_lock);
+	spin_unlock(&mapping->private_lock);
 }
 EXPORT_SYMBOL(create_empty_buffers);
 
@@ -1833,7 +1841,7 @@ int __block_write_full_page(struct inode *inode, struct page *page,
 	} while ((bh = bh->b_this_page) != head);
 	SetPageError(page);
 	BUG_ON(PageWriteback(page));
-	mapping_set_error(page->mapping, err);
+	mapping_set_error(inode->i_mapping, err);
 	set_page_writeback(page);
 	do {
 		struct buffer_head *next = bh->b_this_page;
@@ -2541,7 +2549,7 @@ static void attach_nobh_buffers(struct address_space *mapping,
 
 	BUG_ON(!PageLocked(page));
 
-	spin_lock(&page->mapping->private_lock);
+	spin_lock(&mapping->private_lock);
 	bh = head;
 	do {
 		if (PageDirty(page))
@@ -2551,7 +2559,7 @@ static void attach_nobh_buffers(struct address_space *mapping,
 		bh = bh->b_this_page;
 	} while (bh != head);
 	attach_page_buffers(page, head);
-	spin_unlock(&page->mapping->private_lock);
+	spin_unlock(&mapping->private_lock);
 }
 
 /*
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 52/79] fs/buffer: use _page_has_buffers() instead of page_has_buffers()
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (24 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 51/79] " jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 63/79] mm/page: convert page's index lookup to be against specific mapping jglisse
                   ` (16 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Andrew Morton,
	Alexander Viro, Jens Axboe, Tejun Heo, Jan Kara, Josef Bacik,
	Mel Gorman

From: Jérôme Glisse <jglisse@redhat.com>

The former needs the address_space against which the buffer_head is
being looked up.

----------------------------------------------------------------------
@exists@
identifier M;
expression E;
@@
struct address_space *M;
...
-page_buffers(E)
+_page_buffers(E, M)

@exists@
identifier M, F;
expression E;
@@
F(..., struct address_space *M, ...) {...
-page_buffers(E)
+_page_buffers(E, M)
...}

@exists@
identifier M;
expression E;
@@
struct address_space *M;
...
-page_has_buffers(E)
+_page_has_buffers(E, M)

@exists@
identifier M, F;
expression E;
@@
F(..., struct address_space *M, ...) {...
-page_has_buffers(E)
+_page_has_buffers(E, M)
...}

@exists@
identifier I;
expression E;
@@
struct inode *I;
...
-page_buffers(E)
+_page_buffers(E, I->i_mapping)

@exists@
identifier I, F;
expression E;
@@
F(..., struct inode *I, ...) {...
-page_buffers(E)
+_page_buffers(E, I->i_mapping)
...}

@exists@
identifier I;
expression E;
@@
struct inode *I;
...
-page_has_buffers(E)
+_page_has_buffers(E, I->i_mapping)

@exists@
identifier I, F;
expression E;
@@
F(..., struct inode *I, ...) {...
-page_has_buffers(E)
+_page_has_buffers(E, I->i_mapping)
...}
----------------------------------------------------------------------

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
CC: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 fs/buffer.c | 60 ++++++++++++++++++++++++++++++------------------------------
 fs/mpage.c  | 14 +++++++-------
 2 files changed, 37 insertions(+), 37 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 3c424b7af5af..27b19c629308 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -89,13 +89,13 @@ void buffer_check_dirty_writeback(struct page *page,
 
 	BUG_ON(!PageLocked(page));
 
-	if (!page_has_buffers(page))
+	if (!_page_has_buffers(page, mapping))
 		return;
 
 	if (PageWriteback(page))
 		*writeback = true;
 
-	head = page_buffers(page);
+	head = _page_buffers(page, mapping);
 	bh = head;
 	do {
 		if (buffer_locked(bh))
@@ -211,9 +211,9 @@ __find_get_block_slow(struct block_device *bdev, sector_t block)
 		goto out;
 
 	spin_lock(&bd_mapping->private_lock);
-	if (!page_has_buffers(page))
+	if (!_page_has_buffers(page, bd_mapping))
 		goto out_unlock;
-	head = page_buffers(page);
+	head = _page_buffers(page, bd_mapping);
 	bh = head;
 	do {
 		if (!buffer_mapped(bh))
@@ -648,8 +648,8 @@ int __set_page_dirty_buffers(struct address_space *mapping,
 		return !TestSetPageDirty(page);
 
 	spin_lock(&mapping->private_lock);
-	if (page_has_buffers(page)) {
-		struct buffer_head *head = page_buffers(page);
+	if (_page_has_buffers(page, mapping)) {
+		struct buffer_head *head = _page_buffers(page, mapping);
 		struct buffer_head *bh = head;
 
 		do {
@@ -913,7 +913,7 @@ static sector_t
 init_page_buffers(struct address_space *buffer, struct page *page,
 		  struct block_device *bdev, sector_t block, int size)
 {
-	struct buffer_head *head = page_buffers(page);
+	struct buffer_head *head = _page_buffers(page, buffer);
 	struct buffer_head *bh = head;
 	int uptodate = PageUptodate(page);
 	sector_t end_block = blkdev_max_block(I_BDEV(bdev->bd_inode), size);
@@ -969,8 +969,8 @@ grow_dev_page(struct block_device *bdev, sector_t block,
 
 	BUG_ON(!PageLocked(page));
 
-	if (page_has_buffers(page)) {
-		bh = page_buffers(page);
+	if (_page_has_buffers(page, inode->i_mapping)) {
+		bh = _page_buffers(page, inode->i_mapping);
 		if (bh->b_size == size) {
 			end_block = init_page_buffers(inode->i_mapping, page,
 					bdev, (sector_t)index << sizebits,
@@ -1490,7 +1490,7 @@ void block_invalidatepage(struct address_space *mapping, struct page *page,
 	unsigned int stop = length + offset;
 
 	BUG_ON(!PageLocked(page));
-	if (!page_has_buffers(page))
+	if (!_page_has_buffers(page, mapping))
 		goto out;
 
 	/*
@@ -1498,7 +1498,7 @@ void block_invalidatepage(struct address_space *mapping, struct page *page,
 	 */
 	BUG_ON(stop > PAGE_SIZE || stop < length);
 
-	head = page_buffers(page);
+	head = _page_buffers(page, mapping);
 	bh = head;
 	do {
 		unsigned int next_off = curr_off + bh->b_size;
@@ -1605,7 +1605,7 @@ void clean_bdev_aliases(struct block_device *bdev, sector_t block, sector_t len)
 		for (i = 0; i < count; i++) {
 			struct page *page = pvec.pages[i];
 
-			if (!page_has_buffers(page))
+			if (!_page_has_buffers(page, bd_mapping))
 				continue;
 			/*
 			 * We use page lock instead of bd_mapping->private_lock
@@ -1614,9 +1614,9 @@ void clean_bdev_aliases(struct block_device *bdev, sector_t block, sector_t len)
 			 */
 			lock_page(page);
 			/* Recheck when the page is locked which pins bhs */
-			if (!page_has_buffers(page))
+			if (!_page_has_buffers(page, bd_inode->i_mapping))
 				goto unlock_page;
-			head = page_buffers(page);
+			head = _page_buffers(page, bd_mapping);
 			bh = head;
 			do {
 				if (!buffer_mapped(bh) || (bh->b_blocknr < block))
@@ -1658,11 +1658,11 @@ static struct buffer_head *create_page_buffers(struct page *page, struct inode *
 {
 	BUG_ON(!PageLocked(page));
 
-	if (!page_has_buffers(page))
+	if (!_page_has_buffers(page, inode->i_mapping))
 		create_empty_buffers(inode->i_mapping, page,
 				     1 << READ_ONCE(inode->i_blkbits),
 				     b_state);
-	return page_buffers(page);
+	return _page_buffers(page, inode->i_mapping);
 }
 
 /*
@@ -1870,10 +1870,10 @@ void page_zero_new_buffers(struct address_space *buffer, struct page *page,
 	struct buffer_head *head, *bh;
 
 	BUG_ON(!PageLocked(page));
-	if (!page_has_buffers(page))
+	if (!_page_has_buffers(page, buffer))
 		return;
 
-	bh = head = page_buffers(page);
+	bh = head = _page_buffers(page, buffer);
 	block_start = 0;
 	do {
 		block_end = block_start + bh->b_size;
@@ -2057,7 +2057,7 @@ static int __block_commit_write(struct inode *inode, struct page *page,
 	unsigned blocksize;
 	struct buffer_head *bh, *head;
 
-	bh = head = page_buffers(page);
+	bh = head = _page_buffers(page, inode->i_mapping);
 	blocksize = bh->b_size;
 
 	block_start = 0;
@@ -2209,10 +2209,10 @@ int block_is_partially_uptodate(struct page *page,
 	struct buffer_head *bh, *head;
 	int ret = 1;
 
-	if (!page_has_buffers(page))
+	if (!_page_has_buffers(page, mapping))
 		return 0;
 
-	head = page_buffers(page);
+	head = _page_buffers(page, mapping);
 	blocksize = head->b_size;
 	to = min_t(unsigned, PAGE_SIZE - from, count);
 	to = from + to;
@@ -2596,7 +2596,7 @@ int nobh_write_begin(struct address_space *mapping,
 	*pagep = page;
 	*fsdata = NULL;
 
-	if (page_has_buffers(page)) {
+	if (_page_has_buffers(page, mapping)) {
 		ret = __block_write_begin(mapping, page, pos, len, get_block);
 		if (unlikely(ret))
 			goto out_release;
@@ -2715,7 +2715,7 @@ int nobh_write_end(struct file *file, struct address_space *mapping,
 	struct inode *inode = mapping->host;
 	struct buffer_head *head = fsdata;
 	struct buffer_head *bh;
-	BUG_ON(fsdata != NULL && page_has_buffers(page));
+	BUG_ON(fsdata != NULL && _page_has_buffers(page, inode->i_mapping));
 
 	if (unlikely(copied < len) && head)
 		attach_nobh_buffers(mapping, page, head);
@@ -2822,7 +2822,7 @@ int nobh_truncate_page(struct address_space *mapping,
 	if (!page)
 		goto out;
 
-	if (page_has_buffers(page)) {
+	if (_page_has_buffers(page, mapping)) {
 has_buffers:
 		unlock_page(page);
 		put_page(page);
@@ -2857,7 +2857,7 @@ int nobh_truncate_page(struct address_space *mapping,
 			err = -EIO;
 			goto unlock;
 		}
-		if (page_has_buffers(page))
+		if (_page_has_buffers(page, inode->i_mapping))
 			goto has_buffers;
 	}
 	zero_user(page, offset, length);
@@ -2900,11 +2900,11 @@ int block_truncate_page(struct address_space *mapping,
 	if (!page)
 		goto out;
 
-	if (!page_has_buffers(page))
+	if (!_page_has_buffers(page, mapping))
 		create_empty_buffers(mapping, page, blocksize, 0);
 
 	/* Find the buffer that contains "offset" */
-	bh = page_buffers(page);
+	bh = _page_buffers(page, mapping);
 	pos = blocksize;
 	while (offset >= pos) {
 		bh = bh->b_this_page;
@@ -3260,7 +3260,7 @@ static int
 drop_buffers(struct address_space *mapping, struct page *page,
 	     struct buffer_head **buffers_to_free)
 {
-	struct buffer_head *head = page_buffers(page);
+	struct buffer_head *head = _page_buffers(page, mapping);
 	struct buffer_head *bh;
 
 	bh = head;
@@ -3491,7 +3491,7 @@ page_seek_hole_data(struct address_space *mapping, struct page *page,
 	if (lastoff < offset)
 		lastoff = offset;
 
-	bh = head = page_buffers(page);
+	bh = head = _page_buffers(page, mapping);
 	do {
 		offset += bh->b_size;
 		if (lastoff >= offset)
@@ -3563,7 +3563,7 @@ page_cache_seek_hole_data(struct inode *inode, loff_t offset, loff_t length,
 
 			lock_page(page);
 			if (likely(!page_is_truncated(page, inode->i_mapping)) &&
-			    page_has_buffers(page)) {
+			    _page_has_buffers(page, inode->i_mapping)) {
 				lastoff = page_seek_hole_data(inode->i_mapping,
 							page, lastoff, whence);
 				if (lastoff >= 0) {
diff --git a/fs/mpage.c b/fs/mpage.c
index ecdef63f464e..8141010b9f4c 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -107,7 +107,7 @@ map_buffer_to_page(struct inode *inode, struct page *page,
 	struct buffer_head *page_bh, *head;
 	int block = 0;
 
-	if (!page_has_buffers(page)) {
+	if (!_page_has_buffers(page, inode->i_mapping)) {
 		/*
 		 * don't make any buffers if there is only one buffer on
 		 * the page and the page just needs to be set up to date
@@ -120,7 +120,7 @@ map_buffer_to_page(struct inode *inode, struct page *page,
 		create_empty_buffers(inode->i_mapping, page,
 				     i_blocksize(inode), 0);
 	}
-	head = page_buffers(page);
+	head = _page_buffers(page, inode->i_mapping);
 	page_bh = head;
 	do {
 		if (block == page_block) {
@@ -166,7 +166,7 @@ do_mpage_readpage(struct bio *bio, struct address_space *mapping,
 	unsigned nblocks;
 	unsigned relative_block;
 
-	if (page_has_buffers(page))
+	if (_page_has_buffers(page, mapping))
 		goto confused;
 
 	block_in_file = (sector_t)page->index << (PAGE_SHIFT - blkbits);
@@ -453,9 +453,9 @@ static void clean_buffers(struct address_space *mapping, struct page *page,
 {
 	unsigned buffer_counter = 0;
 	struct buffer_head *bh, *head;
-	if (!page_has_buffers(page))
+	if (!_page_has_buffers(page, mapping))
 		return;
-	head = page_buffers(page);
+	head = _page_buffers(page, mapping);
 	bh = head;
 
 	do {
@@ -508,8 +508,8 @@ static int __mpage_writepage(struct page *page, struct address_space *mapping,
 	int ret = 0;
 	int op_flags = wbc_to_write_flags(wbc);
 
-	if (page_has_buffers(page)) {
-		struct buffer_head *head = page_buffers(page);
+	if (_page_has_buffers(page, mapping)) {
+		struct buffer_head *head = _page_buffers(page, mapping);
 		struct buffer_head *bh = head;
 
 		/* If they're all mapped and dirty, do it */
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 63/79] mm/page: convert page's index lookup to be against specific mapping
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (25 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 52/79] fs/buffer: use _page_has_buffers() instead of page_has_buffers() jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 64/79] mm/buffer: use _page_has_buffers() instead of page_has_buffers() jglisse
                   ` (15 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Andrew Morton, Mel Gorman,
	Alexander Viro

From: Jérôme Glisse <jglisse@redhat.com>

This patch switches mm to look up a page's index or offset value
against a specific mapping. A page index value only has meaning
relative to a mapping.

Using coccinelle:
---------------------------------------------------------------------
@@
struct page *P;
expression E;
@@
-P->index = E
+page_set_index(P, E)

@@
struct page *P;
@@
-P->index
+page_index(P)

@@
struct page *P;
@@
-page_index(P) << PAGE_SHIFT
+page_offset(P)

@@
expression E;
@@
-page_index(E)
+_page_index(E, mapping)

@@
expression E1, E2;
@@
-page_set_index(E1, E2)
+_page_set_index(E1, mapping, E2)

@@
expression E;
@@
-page_to_index(E)
+_page_to_index(E, mapping)

@@
expression E;
@@
-page_to_pgoff(E)
+_page_to_pgoff(E, mapping)

@@
expression E;
@@
-page_offset(E)
+_page_offset(E, mapping)

@@
expression E;
@@
-page_file_offset(E)
+_page_file_offset(E, mapping)
---------------------------------------------------------------------

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: linux-mm@kvack.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
---
 mm/filemap.c        | 26 ++++++++++++++------------
 mm/page-writeback.c | 16 +++++++++-------
 mm/shmem.c          | 11 +++++++----
 mm/truncate.c       | 11 ++++++-----
 4 files changed, 36 insertions(+), 28 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 012a53964215..a41c7cfb6351 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -118,7 +118,8 @@ static int page_cache_tree_insert(struct address_space *mapping,
 	void **slot;
 	int error;
 
-	error = __radix_tree_create(&mapping->page_tree, page->index, 0,
+	error = __radix_tree_create(&mapping->page_tree,
+				    _page_index(page, mapping), 0,
 				    &node, &slot);
 	if (error)
 		return error;
@@ -155,7 +156,8 @@ static void page_cache_tree_delete(struct address_space *mapping,
 		struct radix_tree_node *node;
 		void **slot;
 
-		__radix_tree_lookup(&mapping->page_tree, page->index + i,
+		__radix_tree_lookup(&mapping->page_tree,
+				    _page_index(page, mapping) + i,
 				    &node, &slot);
 
 		VM_BUG_ON_PAGE(!node && nr != 1, page);
@@ -791,12 +793,12 @@ int replace_page_cache_page(struct page *old, struct page *new, gfp_t gfp_mask)
 		void (*freepage)(struct page *);
 		unsigned long flags;
 
-		pgoff_t offset = old->index;
+		pgoff_t offset = _page_index(old, mapping);
 		freepage = mapping->a_ops->freepage;
 
 		get_page(new);
 		new->mapping = mapping;
-		new->index = offset;
+		_page_set_index(new, mapping, offset);
 
 		spin_lock_irqsave(&mapping->tree_lock, flags);
 		__delete_from_page_cache(old, NULL);
@@ -850,7 +852,7 @@ static int __add_to_page_cache_locked(struct page *page,
 
 	get_page(page);
 	page->mapping = mapping;
-	page->index = offset;
+	_page_set_index(page, mapping, offset);
 
 	spin_lock_irq(&mapping->tree_lock);
 	error = page_cache_tree_insert(mapping, page, shadowp);
@@ -1500,7 +1502,7 @@ struct page *find_lock_entry(struct address_space *mapping, pgoff_t offset)
 			put_page(page);
 			goto repeat;
 		}
-		VM_BUG_ON_PAGE(page_to_pgoff(page) != offset, page);
+		VM_BUG_ON_PAGE(_page_to_pgoff(page, mapping) != offset, page);
 	}
 	return page;
 }
@@ -1559,7 +1561,7 @@ struct page *pagecache_get_page(struct address_space *mapping, pgoff_t offset,
 			put_page(page);
 			goto repeat;
 		}
-		VM_BUG_ON_PAGE(page->index != offset, page);
+		VM_BUG_ON_PAGE(_page_index(page, mapping) != offset, page);
 	}
 
 	if (page && (fgp_flags & FGP_ACCESSED))
@@ -1751,7 +1753,7 @@ unsigned find_get_pages_range(struct address_space *mapping, pgoff_t *start,
 
 		pages[ret] = page;
 		if (++ret == nr_pages) {
-			*start = pages[ret - 1]->index + 1;
+			*start = _page_index(pages[ret - 1], mapping) + 1;
 			goto out;
 		}
 	}
@@ -1837,7 +1839,7 @@ unsigned find_get_pages_contig(struct address_space *mapping, pgoff_t index,
 		 * otherwise we can get both false positives and false
 		 * negatives, which is just confusing to the caller.
 		 */
-		if (page->mapping == NULL || page_to_pgoff(page) != iter.index) {
+		if (page->mapping == NULL || _page_to_pgoff(page, mapping) != iter.index) {
 			put_page(page);
 			break;
 		}
@@ -1923,7 +1925,7 @@ unsigned find_get_pages_range_tag(struct address_space *mapping, pgoff_t *index,
 
 		pages[ret] = page;
 		if (++ret == nr_pages) {
-			*index = pages[ret - 1]->index + 1;
+			*index = _page_index(pages[ret - 1], mapping) + 1;
 			goto out;
 		}
 	}
@@ -2540,7 +2542,7 @@ int filemap_fault(struct vm_fault *vmf)
 		put_page(page);
 		goto retry_find;
 	}
-	VM_BUG_ON_PAGE(page->index != offset, page);
+	VM_BUG_ON_PAGE(_page_index(page, mapping) != offset, page);
 
 	/*
 	 * We have a locked page in the page cache, now we need to check
@@ -2667,7 +2669,7 @@ void filemap_map_pages(struct vm_fault *vmf,
 			goto unlock;
 
 		max_idx = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE);
-		if (page->index >= max_idx)
+		if (_page_index(page, mapping) >= max_idx)
 			goto unlock;
 
 		if (file->f_ra.mmap_miss > 0)
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 3c14d44639c8..ed9424f84715 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2201,7 +2201,7 @@ int write_cache_pages(struct address_space *mapping,
 		for (i = 0; i < nr_pages; i++) {
 			struct page *page = pvec.pages[i];
 
-			done_index = page->index;
+			done_index = _page_index(page, mapping);
 
 			lock_page(page);
 
@@ -2251,7 +2251,8 @@ int write_cache_pages(struct address_space *mapping,
 					 * not be suitable for data integrity
 					 * writeout).
 					 */
-					done_index = page->index + 1;
+					done_index = _page_index(page,
+								 mapping) + 1;
 					done = 1;
 					break;
 				}
@@ -2470,7 +2471,8 @@ int __set_page_dirty_nobuffers(struct page *page)
 		BUG_ON(page_mapping(page) != mapping);
 		WARN_ON_ONCE(!PagePrivate(page) && !PageUptodate(page));
 		account_page_dirtied(page, mapping);
-		radix_tree_tag_set(&mapping->page_tree, page_index(page),
+		radix_tree_tag_set(&mapping->page_tree,
+				   _page_index(page, mapping),
 				   PAGECACHE_TAG_DIRTY);
 		spin_unlock_irqrestore(&mapping->tree_lock, flags);
 		unlock_page_memcg(page);
@@ -2732,7 +2734,7 @@ int test_clear_page_writeback(struct page *page)
 		ret = TestClearPageWriteback(page);
 		if (ret) {
 			radix_tree_tag_clear(&mapping->page_tree,
-						page_index(page),
+						_page_index(page, mapping),
 						PAGECACHE_TAG_WRITEBACK);
 			if (bdi_cap_account_writeback(bdi)) {
 				struct bdi_writeback *wb = inode_to_wb(inode);
@@ -2785,7 +2787,7 @@ int __test_set_page_writeback(struct page *page, bool keep_write)
 						   PAGECACHE_TAG_WRITEBACK);
 
 			radix_tree_tag_set(&mapping->page_tree,
-						page_index(page),
+						_page_index(page, mapping),
 						PAGECACHE_TAG_WRITEBACK);
 			if (bdi_cap_account_writeback(bdi))
 				inc_wb_stat(inode_to_wb(inode), WB_WRITEBACK);
@@ -2800,11 +2802,11 @@ int __test_set_page_writeback(struct page *page, bool keep_write)
 		}
 		if (!PageDirty(page))
 			radix_tree_tag_clear(&mapping->page_tree,
-						page_index(page),
+						_page_index(page, mapping),
 						PAGECACHE_TAG_DIRTY);
 		if (!keep_write)
 			radix_tree_tag_clear(&mapping->page_tree,
-						page_index(page),
+						_page_index(page, mapping),
 						PAGECACHE_TAG_TOWRITE);
 		spin_unlock_irqrestore(&mapping->tree_lock, flags);
 	} else {
diff --git a/mm/shmem.c b/mm/shmem.c
index 7fee65df10b4..7f3168d547c8 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -588,7 +588,7 @@ static int shmem_add_to_page_cache(struct page *page,
 
 	page_ref_add(page, nr);
 	page->mapping = mapping;
-	page->index = index;
+	_page_set_index(page, mapping, index);
 
 	spin_lock_irq(&mapping->tree_lock);
 	if (PageTransHuge(page)) {
@@ -644,7 +644,9 @@ static void shmem_delete_from_page_cache(struct page *page, void *radswap)
 	VM_BUG_ON_PAGE(PageCompound(page), page);
 
 	spin_lock_irq(&mapping->tree_lock);
-	error = shmem_radix_tree_replace(mapping, page->index, page, radswap);
+	error = shmem_radix_tree_replace(mapping, _page_index(page, mapping),
+					 page,
+					 radswap);
 	page->mapping = NULL;
 	mapping->nrpages--;
 	__dec_node_page_state(page, NR_FILE_PAGES);
@@ -822,7 +824,8 @@ static void shmem_undo_range(struct inode *inode, loff_t lstart, loff_t lend,
 				continue;
 			}
 
-			VM_BUG_ON_PAGE(page_to_pgoff(page) != index, page);
+			VM_BUG_ON_PAGE(_page_to_pgoff(page, mapping) != index,
+				       page);
 
 			if (!trylock_page(page))
 				continue;
@@ -1267,7 +1270,7 @@ static int shmem_writepage(struct address_space *_mapping, struct page *page,
 	VM_BUG_ON_PAGE(PageCompound(page), page);
 	BUG_ON(!PageLocked(page));
 	mapping = page->mapping;
-	index = page->index;
+	index = _page_index(page, mapping);
 	inode = mapping->host;
 	info = SHMEM_I(inode);
 	if (info->flags & VM_LOCKED)
diff --git a/mm/truncate.c b/mm/truncate.c
index a9415c96c966..57d4d0948f40 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -181,7 +181,8 @@ truncate_cleanup_page(struct address_space *mapping, struct page *page)
 {
 	if (page_mapped(page)) {
 		pgoff_t nr = PageTransHuge(page) ? HPAGE_PMD_NR : 1;
-		unmap_mapping_pages(mapping, page->index, nr, false);
+		unmap_mapping_pages(mapping, _page_index(page, mapping), nr,
+				    false);
 	}
 
 	if (page_has_private(page))
@@ -353,7 +354,7 @@ void truncate_inode_pages_range(struct address_space *mapping,
 
 			if (!trylock_page(page))
 				continue;
-			WARN_ON(page_to_index(page) != index);
+			WARN_ON(_page_to_index(page, mapping) != index);
 			if (PageWriteback(page)) {
 				unlock_page(page);
 				continue;
@@ -447,7 +448,7 @@ void truncate_inode_pages_range(struct address_space *mapping,
 				continue;
 
 			lock_page(page);
-			WARN_ON(page_to_index(page) != index);
+			WARN_ON(_page_to_index(page, mapping) != index);
 			wait_on_page_writeback(page);
 			truncate_inode_page(mapping, page);
 			unlock_page(page);
@@ -571,7 +572,7 @@ unsigned long invalidate_mapping_pages(struct address_space *mapping,
 			if (!trylock_page(page))
 				continue;
 
-			WARN_ON(page_to_index(page) != index);
+			WARN_ON(_page_to_index(page, mapping) != index);
 
 			/* Middle of THP: skip */
 			if (PageTransTail(page)) {
@@ -701,7 +702,7 @@ int invalidate_inode_pages2_range(struct address_space *mapping,
 			}
 
 			lock_page(page);
-			WARN_ON(page_to_index(page) != index);
+			WARN_ON(_page_to_index(page, mapping) != index);
 			if (page_is_truncated(page, mapping)) {
 				unlock_page(page);
 				continue;
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 64/79] mm/buffer: use _page_has_buffers() instead of page_has_buffers()
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (26 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 63/79] mm/page: convert page's index lookup to be against specific mapping jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 65/79] mm/swap: add struct swap_info_struct swap_readpage() arguments jglisse
                   ` (14 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Andrew Morton,
	Alexander Viro, Jens Axboe, Tejun Heo, Jan Kara, Josef Bacik,
	Mel Gorman

From: Jérôme Glisse <jglisse@redhat.com>

The former needs the address_space against which the buffer_head is
being looked up.

----------------------------------------------------------------------
@exists@
identifier M;
expression E;
@@
struct address_space *M;
...
-page_buffers(E)
+_page_buffers(E, M)

@exists@
identifier M, F;
expression E;
@@
F(..., struct address_space *M, ...) {...
-page_buffers(E)
+_page_buffers(E, M)
...}

@exists@
identifier M;
expression E;
@@
struct address_space *M;
...
-page_has_buffers(E)
+_page_has_buffers(E, M)

@exists@
identifier M, F;
expression E;
@@
F(..., struct address_space *M, ...) {...
-page_has_buffers(E)
+_page_has_buffers(E, M)
...}

@exists@
identifier I;
expression E;
@@
struct inode *I;
...
-page_buffers(E)
+_page_buffers(E, I->i_mapping)

@exists@
identifier I, F;
expression E;
@@
F(..., struct inode *I, ...) {...
-page_buffers(E)
+_page_buffers(E, I->i_mapping)
...}

@exists@
identifier I;
expression E;
@@
struct inode *I;
...
-page_has_buffers(E)
+_page_has_buffers(E, I->i_mapping)

@exists@
identifier I, F;
expression E;
@@
F(..., struct inode *I, ...) {...
-page_has_buffers(E)
+_page_has_buffers(E, I->i_mapping)
...}
----------------------------------------------------------------------

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
CC: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 mm/migrate.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index c2a613283fa2..e4b20ac6cf36 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -768,10 +768,10 @@ int buffer_migrate_page(struct address_space *mapping,
 	struct buffer_head *bh, *head;
 	int rc;
 
-	if (!page_has_buffers(page))
+	if (!_page_has_buffers(page, mapping))
 		return migrate_page(mapping, newpage, page, mode);
 
-	head = page_buffers(page);
+	head = _page_buffers(page, mapping);
 
 	rc = migrate_page_move_mapping(mapping, newpage, page, head, mode, 0);
 
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 65/79] mm/swap: add struct swap_info_struct swap_readpage() arguments
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (27 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 64/79] mm/buffer: use _page_has_buffers() instead of page_has_buffers() jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 68/79] mm/vma_address: convert page's index lookup to be against specific mapping jglisse
                   ` (13 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Andrew Morton,
	Alexander Viro, Tejun Heo, Jan Kara, Josef Bacik, Mel Gorman

From: Jérôme Glisse <jglisse@redhat.com>

Add a struct swap_info_struct argument to swap_readpage(). This is one
step toward dropping reliance on page->private during swap read-back.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
CC: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 include/linux/swap.h |  6 ++++--
 mm/memory.c          |  2 +-
 mm/page_io.c         |  4 ++--
 mm/swap_state.c      | 12 ++++++++----
 4 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 2f6abe9652f6..90c26ec2997c 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -383,7 +383,8 @@ extern void kswapd_stop(int nid);
 #include <linux/blk_types.h> /* for bio_end_io_t */
 
 /* linux/mm/page_io.c */
-extern int swap_readpage(struct page *page, bool do_poll);
+extern int swap_readpage(struct swap_info_struct *sis, struct page *page,
+			 bool do_poll);
 extern int swap_writepage(struct address_space *mapping, struct page *page,
 			  struct writeback_control *wbc);
 extern void end_swap_bio_write(struct bio *bio);
@@ -486,7 +487,8 @@ extern void exit_swap_address_space(unsigned int type);
 
 #else /* CONFIG_SWAP */
 
-static inline int swap_readpage(struct page *page, bool do_poll)
+static inline int swap_readpage(struct swap_info_struct *sis, struct page *page,
+				bool do_poll)
 {
 	return 0;
 }
diff --git a/mm/memory.c b/mm/memory.c
index 1311599a164b..6ffd76528e7b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2949,7 +2949,7 @@ int do_swap_page(struct vm_fault *vmf)
 				__SetPageSwapBacked(page);
 				set_page_private(page, entry.val);
 				lru_cache_add_anon(page);
-				swap_readpage(page, true);
+				swap_readpage(si, page, true);
 			}
 		} else {
 			if (vma_readahead)
diff --git a/mm/page_io.c b/mm/page_io.c
index 6e548b588490..f4e05c90c87e 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -349,11 +349,11 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc,
 	return ret;
 }
 
-int swap_readpage(struct page *page, bool synchronous)
+int swap_readpage(struct swap_info_struct *sis, struct page *page,
+		  bool synchronous)
 {
 	struct bio *bio;
 	int ret = 0;
-	struct swap_info_struct *sis = page_swap_info(page);
 	blk_qc_t qc;
 	struct gendisk *disk;
 
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 39ae7cfad90f..40a2437e3c34 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -466,8 +466,10 @@ struct page *read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 	struct page *retpage = __read_swap_cache_async(entry, gfp_mask,
 			vma, addr, &page_was_allocated);
 
-	if (page_was_allocated)
-		swap_readpage(retpage, do_poll);
+	if (page_was_allocated) {
+		struct swap_info_struct *sis = swp_swap_info(entry);
+		swap_readpage(sis, retpage, do_poll);
+	}
 
 	return retpage;
 }
@@ -585,7 +587,8 @@ struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask,
 		if (!page)
 			continue;
 		if (page_allocated) {
-			swap_readpage(page, false);
+			struct swap_info_struct *sis = swp_swap_info(entry);
+			swap_readpage(sis, page, false);
 			if (offset != entry_offset &&
 			    likely(!PageTransCompound(page))) {
 				SetPageReadahead(page);
@@ -748,7 +751,8 @@ struct page *do_swap_page_readahead(swp_entry_t fentry, gfp_t gfp_mask,
 		if (!page)
 			continue;
 		if (page_allocated) {
-			swap_readpage(page, false);
+			struct swap_info_struct *sis = swp_swap_info(entry);
+			swap_readpage(sis, page, false);
 			if (i != swap_ra->offset &&
 			    likely(!PageTransCompound(page))) {
 				SetPageReadahead(page);
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 68/79] mm/vma_address: convert page's index lookup to be against specific mapping
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (28 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 65/79] mm/swap: add struct swap_info_struct swap_readpage() arguments jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 69/79] fs/journal: add struct address_space to jbd2_journal_try_to_free_buffers() arguments jglisse
                   ` (12 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Andrew Morton, Mel Gorman,
	Alexander Viro

From: Jérôme Glisse <jglisse@redhat.com>

Pass the mapping down so that the page's index is looked up against that
specific mapping instead of being read back from page->mapping.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: linux-mm@kvack.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
---
 mm/internal.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/internal.h b/mm/internal.h
index e6bd35182dae..43e9ed27362f 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -336,7 +336,9 @@ extern pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma);
 static inline unsigned long
 __vma_address(struct page *page, struct vm_area_struct *vma)
 {
-	pgoff_t pgoff = page_to_pgoff(page);
+	struct address_space *mapping = vma->vm_file ? vma->vm_file->f_mapping : NULL;
+
+	pgoff_t pgoff = _page_to_pgoff(page, mapping);
 	return vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
 }
 
-- 
2.14.3


* [RFC PATCH 69/79] fs/journal: add struct address_space to jbd2_journal_try_to_free_buffers() arguments
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (29 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 68/79] mm/vma_address: convert page's index lookup to be against specific mapping jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 70/79] mm: add struct address_space to mark_buffer_dirty() jglisse
                   ` (11 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Theodore Ts'o,
	Jan Kara, linux-ext4, Alexander Viro

From: Jérôme Glisse <jglisse@redhat.com>

As part of the holy crusade to stop relying on the struct page mapping
field, add a struct address_space argument to
jbd2_journal_try_to_free_buffers().

<---------------------------------------------------------------------
@@
type T1, T2, T3;
@@
int
-jbd2_journal_try_to_free_buffers(T1 journal, T2 page, T3 gfp_mask)
+jbd2_journal_try_to_free_buffers(T1 journal, struct address_space *mapping, T2 page, T3 gfp_mask)
{...}

@@
type T1, T2, T3;
@@
int
-jbd2_journal_try_to_free_buffers(T1, T2, T3)
+jbd2_journal_try_to_free_buffers(T1, struct address_space *, T2, T3)
;

@@
expression E1, E2, E3;
@@
-jbd2_journal_try_to_free_buffers(E1, E2, E3)
+jbd2_journal_try_to_free_buffers(E1, NULL, E2, E3)
--------------------------------------------------------------------->
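
The block above is a Coccinelle semantic patch; with the `spatch` tool
installed it can be replayed tree-wide. The sketch below only writes the
call-site rule (quoted from the description above) to a file and shows
the invocation as a comment, so it runs even without Coccinelle:

```shell
# Save the call-site rewrite rule from the semantic patch above.
cat > try_free_buffers.cocci <<'EOF'
@@
expression E1, E2, E3;
@@
-jbd2_journal_try_to_free_buffers(E1, E2, E3)
+jbd2_journal_try_to_free_buffers(E1, NULL, E2, E3)
EOF

# With Coccinelle installed, rewriting call sites in a kernel tree would be:
#   spatch --sp-file try_free_buffers.cocci --in-place --dir fs/
echo "rule written: try_free_buffers.cocci"
```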

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Jan Kara <jack@suse.com>
Cc: linux-ext4@vger.kernel.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
---
 fs/ext4/inode.c       | 3 ++-
 fs/ext4/super.c       | 4 ++--
 fs/jbd2/transaction.c | 3 ++-
 include/linux/jbd2.h  | 4 +++-
 4 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 1a44d9acde53..ef53a57d9768 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3413,7 +3413,8 @@ static int ext4_releasepage(struct address_space *mapping,
 	if (PageChecked(page))
 		return 0;
 	if (journal)
-		return jbd2_journal_try_to_free_buffers(journal, page, wait);
+		return jbd2_journal_try_to_free_buffers(journal, NULL, page,
+						        wait);
 	else
 		return try_to_free_buffers(mapping, page);
 }
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 8f98bc886569..cf2b74137fb2 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1138,8 +1138,8 @@ static int bdev_try_to_free_page(struct super_block *sb, struct page *page,
 	if (!_page_has_buffers(page, mapping))
 		return 0;
 	if (journal)
-		return jbd2_journal_try_to_free_buffers(journal, page,
-						wait & ~__GFP_DIRECT_RECLAIM);
+		return jbd2_journal_try_to_free_buffers(journal, NULL, page,
+							wait & ~__GFP_DIRECT_RECLAIM);
 	return try_to_free_buffers(mapping, page);
 }
 
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index bf673b33d436..6899e7b4036d 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -1984,7 +1984,8 @@ __journal_try_to_free_buffer(journal_t *journal, struct buffer_head *bh)
  * Return 0 on failure, 1 on success
  */
 int jbd2_journal_try_to_free_buffers(journal_t *journal,
-				struct page *page, gfp_t gfp_mask)
+				     struct address_space *mapping,
+				     struct page *page, gfp_t gfp_mask)
 {
 	struct buffer_head *head;
 	struct buffer_head *bh;
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index c5133df80fd4..658a0d2f758f 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -1363,7 +1363,9 @@ extern int	 jbd2_journal_forget (handle_t *, struct super_block *sb,
 extern void	 journal_sync_buffer (struct buffer_head *);
 extern int	 jbd2_journal_invalidatepage(journal_t *,
 				struct page *, unsigned int, unsigned int);
-extern int	 jbd2_journal_try_to_free_buffers(journal_t *, struct page *, gfp_t);
+extern int	 jbd2_journal_try_to_free_buffers(journal_t *,
+						    struct address_space *,
+						    struct page *, gfp_t);
 extern int	 jbd2_journal_stop(handle_t *);
 extern int	 jbd2_journal_flush (journal_t *);
 extern void	 jbd2_journal_lock_updates (journal_t *);
-- 
2.14.3


* [RFC PATCH 70/79] mm: add struct address_space to mark_buffer_dirty()
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (30 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 69/79] fs/journal: add struct address_space to jbd2_journal_try_to_free_buffers() arguments jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 71/79] mm: add struct address_space to set_page_dirty() jglisse
                   ` (10 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Andrew Morton,
	Alexander Viro, Tejun Heo, Jan Kara, Josef Bacik, Mel Gorman

From: Jérôme Glisse <jglisse@redhat.com>

As part of the holy crusade to stop relying on the struct page mapping
field, add a struct address_space argument to mark_buffer_dirty().

<---------------------------------------------------------------------
@@
identifier I1;
type T1;
@@
void
-mark_buffer_dirty(T1 I1)
+mark_buffer_dirty(struct address_space *_mapping, T1 I1)
{...}

@@
type T1;
@@
void
-mark_buffer_dirty(T1)
+mark_buffer_dirty(struct address_space *, T1)
;

@@
identifier I1;
type T1;
@@
void
-mark_buffer_dirty(T1 I1)
+mark_buffer_dirty(struct address_space *, T1)
;

@@
expression E1;
@@
-mark_buffer_dirty(E1)
+mark_buffer_dirty(NULL, E1)
--------------------------------------------------------------------->

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
CC: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 fs/adfs/dir_f.c             |  2 +-
 fs/affs/bitmap.c            |  6 +++---
 fs/affs/super.c             |  2 +-
 fs/bfs/file.c               |  2 +-
 fs/bfs/inode.c              |  4 ++--
 fs/buffer.c                 | 12 ++++++------
 fs/ext2/balloc.c            |  6 +++---
 fs/ext2/ialloc.c            |  8 ++++----
 fs/ext2/inode.c             |  2 +-
 fs/ext2/super.c             |  4 ++--
 fs/ext2/xattr.c             |  8 ++++----
 fs/ext4/ext4_jbd2.c         |  4 ++--
 fs/ext4/inode.c             |  4 ++--
 fs/ext4/mmp.c               |  2 +-
 fs/ext4/resize.c            |  2 +-
 fs/ext4/super.c             |  2 +-
 fs/fat/inode.c              |  4 ++--
 fs/fat/misc.c               |  2 +-
 fs/gfs2/bmap.c              |  4 ++--
 fs/gfs2/lops.c              |  6 +++---
 fs/hfs/mdb.c                | 10 +++++-----
 fs/hpfs/anode.c             | 34 +++++++++++++++++-----------------
 fs/hpfs/buffer.c            |  8 ++++----
 fs/hpfs/dnode.c             |  4 ++--
 fs/hpfs/ea.c                |  4 ++--
 fs/hpfs/inode.c             |  2 +-
 fs/hpfs/namei.c             | 10 +++++-----
 fs/hpfs/super.c             |  6 +++---
 fs/jbd2/recovery.c          |  2 +-
 fs/jbd2/transaction.c       |  2 +-
 fs/jfs/jfs_imap.c           |  2 +-
 fs/jfs/jfs_mount.c          |  2 +-
 fs/jfs/resize.c             |  6 +++---
 fs/jfs/super.c              |  2 +-
 fs/minix/bitmap.c           | 10 +++++-----
 fs/minix/inode.c            | 12 ++++++------
 fs/nilfs2/alloc.c           | 12 ++++++------
 fs/nilfs2/btnode.c          |  4 ++--
 fs/nilfs2/btree.c           | 38 +++++++++++++++++++-------------------
 fs/nilfs2/cpfile.c          | 24 ++++++++++++------------
 fs/nilfs2/dat.c             |  4 ++--
 fs/nilfs2/gcinode.c         |  2 +-
 fs/nilfs2/ifile.c           |  4 ++--
 fs/nilfs2/inode.c           |  2 +-
 fs/nilfs2/ioctl.c           |  2 +-
 fs/nilfs2/mdt.c             |  2 +-
 fs/nilfs2/segment.c         |  4 ++--
 fs/nilfs2/sufile.c          | 26 +++++++++++++-------------
 fs/ntfs/file.c              |  8 ++++----
 fs/ntfs/super.c             |  2 +-
 fs/ocfs2/alloc.c            |  2 +-
 fs/ocfs2/aops.c             |  4 ++--
 fs/ocfs2/inode.c            |  2 +-
 fs/omfs/bitmap.c            |  6 +++---
 fs/omfs/dir.c               |  8 ++++----
 fs/omfs/file.c              |  4 ++--
 fs/omfs/inode.c             |  4 ++--
 fs/reiserfs/file.c          |  2 +-
 fs/reiserfs/inode.c         |  4 ++--
 fs/reiserfs/journal.c       | 10 +++++-----
 fs/reiserfs/resize.c        |  2 +-
 fs/sysv/balloc.c            |  2 +-
 fs/sysv/ialloc.c            |  2 +-
 fs/sysv/inode.c             |  8 ++++----
 fs/sysv/sysv.h              |  4 ++--
 fs/udf/balloc.c             |  6 +++---
 fs/udf/inode.c              |  2 +-
 fs/udf/partition.c          |  4 ++--
 fs/udf/super.c              |  8 ++++----
 fs/ufs/balloc.c             |  4 ++--
 fs/ufs/ialloc.c             |  4 ++--
 fs/ufs/inode.c              |  8 ++++----
 fs/ufs/util.c               |  2 +-
 include/linux/buffer_head.h |  2 +-
 74 files changed, 220 insertions(+), 220 deletions(-)

diff --git a/fs/adfs/dir_f.c b/fs/adfs/dir_f.c
index 0fbfd0b04ae0..3d92f8d187bc 100644
--- a/fs/adfs/dir_f.c
+++ b/fs/adfs/dir_f.c
@@ -434,7 +434,7 @@ adfs_f_update(struct adfs_dir *dir, struct object_info *obj)
 	}
 #endif
 	for (i = dir->nr_buffers - 1; i >= 0; i--)
-		mark_buffer_dirty(dir->bh[i]);
+		mark_buffer_dirty(NULL, dir->bh[i]);
 
 	ret = 0;
 out:
diff --git a/fs/affs/bitmap.c b/fs/affs/bitmap.c
index 5ba9ef2742f6..59b352075505 100644
--- a/fs/affs/bitmap.c
+++ b/fs/affs/bitmap.c
@@ -79,7 +79,7 @@ affs_free_block(struct super_block *sb, u32 block)
 	tmp = be32_to_cpu(*(__be32 *)bh->b_data);
 	*(__be32 *)bh->b_data = cpu_to_be32(tmp - mask);
 
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	affs_mark_sb_dirty(sb);
 	bm->bm_free++;
 
@@ -223,7 +223,7 @@ affs_alloc_block(struct inode *inode, u32 goal)
 	tmp = be32_to_cpu(*(__be32 *)bh->b_data);
 	*(__be32 *)bh->b_data = cpu_to_be32(tmp + mask);
 
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	affs_mark_sb_dirty(sb);
 
 	mutex_unlock(&sbi->s_bmlock);
@@ -338,7 +338,7 @@ int affs_init_bitmap(struct super_block *sb, int *flags)
 		((__be32 *)bh->b_data)[offset] = 0;
 	((__be32 *)bh->b_data)[0] = 0;
 	((__be32 *)bh->b_data)[0] = cpu_to_be32(-affs_checksum_block(sb, bh));
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 
 	/* recalculate bitmap count for last block */
 	bm--;
diff --git a/fs/affs/super.c b/fs/affs/super.c
index e602619aed9d..515388985607 100644
--- a/fs/affs/super.c
+++ b/fs/affs/super.c
@@ -40,7 +40,7 @@ affs_commit_super(struct super_block *sb, int wait)
 	affs_fix_checksum(sb, bh);
 	unlock_buffer(bh);
 
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	if (wait)
 		sync_dirty_buffer(bh);
 }
diff --git a/fs/bfs/file.c b/fs/bfs/file.c
index 6d66cc137bc3..e74e1b72df80 100644
--- a/fs/bfs/file.c
+++ b/fs/bfs/file.c
@@ -40,7 +40,7 @@ static int bfs_move_block(unsigned long from, unsigned long to,
 		return -EIO;
 	new = sb_getblk(sb, to);
 	memcpy(new->b_data, bh->b_data, bh->b_size);
-	mark_buffer_dirty(new);
+	mark_buffer_dirty(NULL, new);
 	bforget(sb, bh);
 	brelse(new);
 	return 0;
diff --git a/fs/bfs/inode.c b/fs/bfs/inode.c
index 9a69392f1fb3..a41edad61187 100644
--- a/fs/bfs/inode.c
+++ b/fs/bfs/inode.c
@@ -149,7 +149,7 @@ static int bfs_write_inode(struct inode *inode, struct writeback_control *wbc)
 	di->i_eblock = cpu_to_le32(BFS_I(inode)->i_eblock);
 	di->i_eoffset = cpu_to_le32(i_sblock * BFS_BSIZE + inode->i_size - 1);
 
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	if (wbc->sync_mode == WB_SYNC_ALL) {
 		sync_dirty_buffer(bh);
 		if (buffer_req(bh) && !buffer_uptodate(bh))
@@ -185,7 +185,7 @@ static void bfs_evict_inode(struct inode *inode)
 	mutex_lock(&info->bfs_lock);
 	/* clear on-disk inode */
 	memset(di, 0, sizeof(struct bfs_inode));
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	brelse(bh);
 
         if (bi->i_dsk_ino) {
diff --git a/fs/buffer.c b/fs/buffer.c
index 27b19c629308..24872b077269 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -574,7 +574,7 @@ void mark_buffer_dirty_inode(struct buffer_head *bh, struct inode *inode)
 
 	buffer_mapping = fs_page_mapping_get_with_bh(bh->b_page, bh);
 
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	if (!mapping->private_data) {
 		mapping->private_data = buffer_mapping;
 	} else {
@@ -1102,7 +1102,7 @@ __getblk_slow(struct block_device *bdev, sector_t block,
  * mark_buffer_dirty() is atomic.  It takes bh->b_page->mapping->private_lock,
  * mapping->tree_lock and mapping->host->i_lock.
  */
-void mark_buffer_dirty(struct buffer_head *bh)
+void mark_buffer_dirty(struct address_space *_mapping, struct buffer_head *bh)
 {
 	WARN_ON_ONCE(!buffer_uptodate(bh));
 
@@ -1891,7 +1891,7 @@ void page_zero_new_buffers(struct address_space *buffer, struct page *page,
 				}
 
 				clear_buffer_new(bh);
-				mark_buffer_dirty(bh);
+				mark_buffer_dirty(NULL, bh);
 			}
 		}
 
@@ -2006,7 +2006,7 @@ int __block_write_begin_int(struct address_space *mapping, struct page *page,
 				if (PageUptodate(page)) {
 					clear_buffer_new(bh);
 					set_buffer_uptodate(bh);
-					mark_buffer_dirty(bh);
+					mark_buffer_dirty(NULL, bh);
 					continue;
 				}
 				if (block_end > to || block_start < from)
@@ -2068,7 +2068,7 @@ static int __block_commit_write(struct inode *inode, struct page *page,
 				partial = 1;
 		} else {
 			set_buffer_uptodate(bh);
-			mark_buffer_dirty(bh);
+			mark_buffer_dirty(NULL, bh);
 		}
 		clear_buffer_new(bh);
 
@@ -2937,7 +2937,7 @@ int block_truncate_page(struct address_space *mapping,
 	}
 
 	zero_user(page, offset, length);
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	err = 0;
 
 unlock:
diff --git a/fs/ext2/balloc.c b/fs/ext2/balloc.c
index 33db13365c5e..e9e4a9f477fe 100644
--- a/fs/ext2/balloc.c
+++ b/fs/ext2/balloc.c
@@ -172,7 +172,7 @@ static void group_adjust_blocks(struct super_block *sb, int group_no,
 		free_blocks = le16_to_cpu(desc->bg_free_blocks_count);
 		desc->bg_free_blocks_count = cpu_to_le16(free_blocks + count);
 		spin_unlock(sb_bgl_lock(sbi, group_no));
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 	}
 }
 
@@ -547,7 +547,7 @@ void ext2_free_blocks (struct inode * inode, unsigned long block,
 		}
 	}
 
-	mark_buffer_dirty(bitmap_bh);
+	mark_buffer_dirty(NULL, bitmap_bh);
 	if (sb->s_flags & SB_SYNCHRONOUS)
 		sync_dirty_buffer(bitmap_bh);
 
@@ -1423,7 +1423,7 @@ ext2_fsblk_t ext2_new_blocks(struct inode *inode, ext2_fsblk_t goal,
 	group_adjust_blocks(sb, group_no, gdp, gdp_bh, -num);
 	percpu_counter_sub(&sbi->s_freeblocks_counter, num);
 
-	mark_buffer_dirty(bitmap_bh);
+	mark_buffer_dirty(NULL, bitmap_bh);
 	if (sb->s_flags & SB_SYNCHRONOUS)
 		sync_dirty_buffer(bitmap_bh);
 
diff --git a/fs/ext2/ialloc.c b/fs/ext2/ialloc.c
index 6484199b35d1..c444c3c1ebcb 100644
--- a/fs/ext2/ialloc.c
+++ b/fs/ext2/ialloc.c
@@ -82,7 +82,7 @@ static void ext2_release_inode(struct super_block *sb, int group, int dir)
 	spin_unlock(sb_bgl_lock(EXT2_SB(sb), group));
 	if (dir)
 		percpu_counter_dec(&EXT2_SB(sb)->s_dirs_counter);
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 }
 
 /*
@@ -144,7 +144,7 @@ void ext2_free_inode (struct inode * inode)
 			      "bit already cleared for inode %lu", ino);
 	else
 		ext2_release_inode(sb, block_group, is_directory);
-	mark_buffer_dirty(bitmap_bh);
+	mark_buffer_dirty(NULL, bitmap_bh);
 	if (sb->s_flags & SB_SYNCHRONOUS)
 		sync_dirty_buffer(bitmap_bh);
 
@@ -516,7 +516,7 @@ struct inode *ext2_new_inode(struct inode *dir, umode_t mode,
 	err = -ENOSPC;
 	goto fail;
 got:
-	mark_buffer_dirty(bitmap_bh);
+	mark_buffer_dirty(NULL, bitmap_bh);
 	if (sb->s_flags & SB_SYNCHRONOUS)
 		sync_dirty_buffer(bitmap_bh);
 	brelse(bitmap_bh);
@@ -547,7 +547,7 @@ struct inode *ext2_new_inode(struct inode *dir, umode_t mode,
 	}
 	spin_unlock(sb_bgl_lock(sbi, group));
 
-	mark_buffer_dirty(bh2);
+	mark_buffer_dirty(NULL, bh2);
 	if (test_opt(sb, GRPID)) {
 		inode->i_mode = mode;
 		inode->i_uid = current_fsuid();
diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index bc12273e393a..4c1782d0d0c0 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -1621,7 +1621,7 @@ static int __ext2_write_inode(struct inode *inode, int do_sync)
 		}
 	} else for (n = 0; n < EXT2_N_BLOCKS; n++)
 		raw_inode->i_block[n] = ei->i_data[n];
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	if (do_sync) {
 		sync_dirty_buffer(bh);
 		if (buffer_req(bh) && !buffer_uptodate(bh)) {
diff --git a/fs/ext2/super.c b/fs/ext2/super.c
index 7666c065b96f..62cab57b448f 100644
--- a/fs/ext2/super.c
+++ b/fs/ext2/super.c
@@ -1247,7 +1247,7 @@ void ext2_sync_super(struct super_block *sb, struct ext2_super_block *es,
 	es->s_wtime = cpu_to_le32(get_seconds());
 	/* unlock before we do IO */
 	spin_unlock(&EXT2_SB(sb)->s_lock);
-	mark_buffer_dirty(EXT2_SB(sb)->s_sbh);
+	mark_buffer_dirty(NULL, EXT2_SB(sb)->s_sbh);
 	if (wait)
 		sync_dirty_buffer(EXT2_SB(sb)->s_sbh);
 }
@@ -1562,7 +1562,7 @@ static ssize_t ext2_quota_write(struct super_block *sb, int type,
 		memcpy(bh->b_data+offset, data, tocopy);
 		flush_dcache_page(bh->b_page);
 		set_buffer_uptodate(bh);
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		unlock_buffer(bh);
 		brelse(bh);
 		offset = 0;
diff --git a/fs/ext2/xattr.c b/fs/ext2/xattr.c
index c77edf9afbce..8f3b1950248b 100644
--- a/fs/ext2/xattr.c
+++ b/fs/ext2/xattr.c
@@ -344,7 +344,7 @@ static void ext2_xattr_update_super_block(struct super_block *sb)
 	spin_lock(&EXT2_SB(sb)->s_lock);
 	EXT2_SET_COMPAT_FEATURE(sb, EXT2_FEATURE_COMPAT_EXT_ATTR);
 	spin_unlock(&EXT2_SB(sb)->s_lock);
-	mark_buffer_dirty(EXT2_SB(sb)->s_sbh);
+	mark_buffer_dirty(NULL, EXT2_SB(sb)->s_sbh);
 }
 
 /*
@@ -683,7 +683,7 @@ ext2_xattr_set2(struct inode *inode, struct buffer_head *old_bh,
 			
 			ext2_xattr_update_super_block(sb);
 		}
-		mark_buffer_dirty(new_bh);
+		mark_buffer_dirty(NULL, new_bh);
 		if (IS_SYNC(inode)) {
 			sync_dirty_buffer(new_bh);
 			error = -EIO;
@@ -739,7 +739,7 @@ ext2_xattr_set2(struct inode *inode, struct buffer_head *old_bh,
 			le32_add_cpu(&HDR(old_bh)->h_refcount, -1);
 			dquot_free_block_nodirty(inode, 1);
 			mark_inode_dirty(inode);
-			mark_buffer_dirty(old_bh);
+			mark_buffer_dirty(NULL, old_bh);
 			ea_bdebug(old_bh, "refcount now=%d",
 				le32_to_cpu(HDR(old_bh)->h_refcount));
 		}
@@ -809,7 +809,7 @@ ext2_xattr_delete_inode(struct inode *inode)
 		ea_bdebug(bh, "refcount now=%d",
 			le32_to_cpu(HDR(bh)->h_refcount));
 		unlock_buffer(bh);
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		if (IS_SYNC(inode))
 			sync_dirty_buffer(bh);
 		dquot_free_block_nodirty(inode, 1);
diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
index 60fbf5336059..72209e854a19 100644
--- a/fs/ext4/ext4_jbd2.c
+++ b/fs/ext4/ext4_jbd2.c
@@ -302,7 +302,7 @@ int __ext4_handle_dirty_metadata(const char *where, unsigned int line,
 		if (inode)
 			mark_buffer_dirty_inode(bh, inode);
 		else
-			mark_buffer_dirty(bh);
+			mark_buffer_dirty(NULL, bh);
 		if (inode && inode_needs_sync(inode)) {
 			sync_dirty_buffer(bh);
 			if (buffer_req(bh) && !buffer_uptodate(bh)) {
@@ -334,6 +334,6 @@ int __ext4_handle_dirty_super(const char *where, unsigned int line,
 			ext4_journal_abort_handle(where, line, __func__,
 						  bh, handle, err);
 	} else
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 	return err;
 }
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index ef53a57d9768..c0ae0dc7af58 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1202,7 +1202,7 @@ static int ext4_block_write_begin(struct address_space *mapping,
 				if (PageUptodate(page)) {
 					clear_buffer_new(bh);
 					set_buffer_uptodate(bh);
-					mark_buffer_dirty(bh);
+					mark_buffer_dirty(NULL, bh);
 					continue;
 				}
 				if (block_end > to || block_start < from)
@@ -4070,7 +4070,7 @@ static int __ext4_block_zero_page_range(handle_t *handle,
 		err = ext4_handle_dirty_metadata(handle, inode, bh);
 	} else {
 		err = 0;
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		if (ext4_should_order_data(inode))
 			err = ext4_jbd2_inode_add_write(handle, inode);
 	}
diff --git a/fs/ext4/mmp.c b/fs/ext4/mmp.c
index 27b9a76a0dfa..6fcd8d624ef7 100644
--- a/fs/ext4/mmp.c
+++ b/fs/ext4/mmp.c
@@ -49,7 +49,7 @@ static int write_mmp_block(struct super_block *sb, struct buffer_head *bh)
 	 */
 	sb_start_write(sb);
 	ext4_mmp_csum_set(sb, mmp);
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	lock_buffer(bh);
 	bh->b_end_io = end_buffer_write_sync;
 	get_bh(bh);
diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index b6bec270a8e4..398cd8c7dd40 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -1160,7 +1160,7 @@ static void update_backups(struct super_block *sb, sector_t blk_off, char *data,
 			     "forcing fsck on next reboot", group, err);
 		sbi->s_mount_state &= ~EXT4_VALID_FS;
 		sbi->s_es->s_state &= cpu_to_le16(~EXT4_VALID_FS);
-		mark_buffer_dirty(sbi->s_sbh);
+		mark_buffer_dirty(NULL, sbi->s_sbh);
 	}
 }
 
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index cf2b74137fb2..ebef69e45f74 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4738,7 +4738,7 @@ static int ext4_commit_super(struct super_block *sb, int sync)
 		clear_buffer_write_io_error(sbh);
 		set_buffer_uptodate(sbh);
 	}
-	mark_buffer_dirty(sbh);
+	mark_buffer_dirty(NULL, sbh);
 	if (sync) {
 		unlock_buffer(sbh);
 		error = __sync_dirty_buffer(sbh,
diff --git a/fs/fat/inode.c b/fs/fat/inode.c
index 9e6bc6364468..a5cac466caf2 100644
--- a/fs/fat/inode.c
+++ b/fs/fat/inode.c
@@ -695,7 +695,7 @@ static void fat_set_state(struct super_block *sb,
 			b->fat16.state &= ~FAT_STATE_DIRTY;
 	}
 
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	sync_dirty_buffer(bh);
 	brelse(bh);
 }
@@ -875,7 +875,7 @@ static int __fat_write_inode(struct inode *inode, int wait)
 				  &raw_entry->adate, NULL);
 	}
 	spin_unlock(&sbi->inode_hash_lock);
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	err = 0;
 	if (wait)
 		err = sync_dirty_buffer(bh);
diff --git a/fs/fat/misc.c b/fs/fat/misc.c
index f9bdc1e01c98..2f1a027684c4 100644
--- a/fs/fat/misc.c
+++ b/fs/fat/misc.c
@@ -85,7 +85,7 @@ int fat_clusters_flush(struct super_block *sb)
 			fsinfo->free_clusters = cpu_to_le32(sbi->free_clusters);
 		if (sbi->prev_free != -1)
 			fsinfo->next_cluster = cpu_to_le32(sbi->prev_free);
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 	}
 	brelse(bh);
 
diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 12e10758b0f2..32028225306a 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -90,7 +90,7 @@ static int gfs2_unstuffer_page(struct gfs2_inode *ip, struct buffer_head *dibh,
 
 	set_buffer_uptodate(bh);
 	if (!gfs2_is_jdata(ip))
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 	if (!gfs2_is_writeback(ip))
 		gfs2_trans_add_data(ip->i_gl, bh);
 
@@ -955,7 +955,7 @@ static int gfs2_block_zero_range(struct inode *inode, loff_t from,
 		gfs2_trans_add_data(ip->i_gl, bh);
 
 	zero_user(page, offset, length);
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 unlock:
 	unlock_page(page);
 	put_page(page);
diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index 3b672378d358..1c7dbf3b1227 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -105,7 +105,7 @@ static void gfs2_unpin(struct gfs2_sbd *sdp, struct buffer_head *bh,
 	BUG_ON(!buffer_pinned(bh));
 
 	lock_buffer(bh);
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	clear_buffer_pinned(bh);
 
 	if (buffer_is_rgrp(bd))
@@ -558,7 +558,7 @@ static int buf_lo_scan_elements(struct gfs2_jdesc *jd, unsigned int start,
 		if (gfs2_meta_check(sdp, bh_ip))
 			error = -EIO;
 		else
-			mark_buffer_dirty(bh_ip);
+			mark_buffer_dirty(NULL, bh_ip);
 
 		brelse(bh_log);
 		brelse(bh_ip);
@@ -797,7 +797,7 @@ static int databuf_lo_scan_elements(struct gfs2_jdesc *jd, unsigned int start,
 			__be32 *eptr = (__be32 *)bh_ip->b_data;
 			*eptr = cpu_to_be32(GFS2_MAGIC);
 		}
-		mark_buffer_dirty(bh_ip);
+		mark_buffer_dirty(NULL, bh_ip);
 
 		brelse(bh_log);
 		brelse(bh_ip);
diff --git a/fs/hfs/mdb.c b/fs/hfs/mdb.c
index 460281b1299e..6be25d949b5d 100644
--- a/fs/hfs/mdb.c
+++ b/fs/hfs/mdb.c
@@ -218,7 +218,7 @@ int hfs_mdb_get(struct super_block *sb)
 		be32_add_cpu(&mdb->drWrCnt, 1);
 		mdb->drLsMod = hfs_mtime();
 
-		mark_buffer_dirty(HFS_SB(sb)->mdb_bh);
+		mark_buffer_dirty(NULL, HFS_SB(sb)->mdb_bh);
 		sync_dirty_buffer(HFS_SB(sb)->mdb_bh);
 	}
 
@@ -274,7 +274,7 @@ void hfs_mdb_commit(struct super_block *sb)
 		mdb->drDirCnt = cpu_to_be32(HFS_SB(sb)->folder_count);
 
 		/* write MDB to disk */
-		mark_buffer_dirty(HFS_SB(sb)->mdb_bh);
+		mark_buffer_dirty(NULL, HFS_SB(sb)->mdb_bh);
 	}
 
 	/* write the backup MDB, not returning until it is written.
@@ -293,7 +293,7 @@ void hfs_mdb_commit(struct super_block *sb)
 		HFS_SB(sb)->alt_mdb->drAtrb &= cpu_to_be16(~HFS_SB_ATTRIB_INCNSTNT);
 		unlock_buffer(HFS_SB(sb)->alt_mdb_bh);
 
-		mark_buffer_dirty(HFS_SB(sb)->alt_mdb_bh);
+		mark_buffer_dirty(NULL, HFS_SB(sb)->alt_mdb_bh);
 		sync_dirty_buffer(HFS_SB(sb)->alt_mdb_bh);
 	}
 
@@ -320,7 +320,7 @@ void hfs_mdb_commit(struct super_block *sb)
 			memcpy(bh->b_data + off, ptr, len);
 			unlock_buffer(bh);
 
-			mark_buffer_dirty(bh);
+			mark_buffer_dirty(NULL, bh);
 			brelse(bh);
 			block++;
 			off = 0;
@@ -338,7 +338,7 @@ void hfs_mdb_close(struct super_block *sb)
 		return;
 	HFS_SB(sb)->mdb->drAtrb |= cpu_to_be16(HFS_SB_ATTRIB_UNMNT);
 	HFS_SB(sb)->mdb->drAtrb &= cpu_to_be16(~HFS_SB_ATTRIB_INCNSTNT);
-	mark_buffer_dirty(HFS_SB(sb)->mdb_bh);
+	mark_buffer_dirty(NULL, HFS_SB(sb)->mdb_bh);
 }
 
 /*
diff --git a/fs/hpfs/anode.c b/fs/hpfs/anode.c
index c14c9a035ee0..38944a8cc677 100644
--- a/fs/hpfs/anode.c
+++ b/fs/hpfs/anode.c
@@ -86,7 +86,7 @@ secno hpfs_add_sector_to_btree(struct super_block *s, secno node, int fnod, unsi
 	if (bp_internal(btree)) {
 		a = le32_to_cpu(btree->u.internal[n].down);
 		btree->u.internal[n].file_secno = cpu_to_le32(-1);
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		brelse(bh);
 		if (hpfs_sb(s)->sb_chk)
 			if (hpfs_stop_cycles(s, a, &c1, &c2, "hpfs_add_sector_to_btree #1")) return -1;
@@ -104,7 +104,7 @@ secno hpfs_add_sector_to_btree(struct super_block *s, secno node, int fnod, unsi
 		}
 		if (hpfs_alloc_if_possible(s, se = le32_to_cpu(btree->u.external[n].disk_secno) + le32_to_cpu(btree->u.external[n].length))) {
 			le32_add_cpu(&btree->u.external[n].length, 1);
-			mark_buffer_dirty(bh);
+			mark_buffer_dirty(NULL, bh);
 			brelse(bh);
 			return se;
 		}
@@ -141,7 +141,7 @@ secno hpfs_add_sector_to_btree(struct super_block *s, secno node, int fnod, unsi
 			btree->first_free = cpu_to_le16((char *)&(btree->u.internal[1]) - (char *)btree);
 			btree->u.internal[0].file_secno = cpu_to_le32(-1);
 			btree->u.internal[0].down = cpu_to_le32(na);
-			mark_buffer_dirty(bh);
+			mark_buffer_dirty(NULL, bh);
 		} else if (!(ranode = hpfs_alloc_anode(s, /*a*/0, &ra, &bh2))) {
 			brelse(bh);
 			brelse(bh1);
@@ -158,7 +158,7 @@ secno hpfs_add_sector_to_btree(struct super_block *s, secno node, int fnod, unsi
 	btree->u.external[n].disk_secno = cpu_to_le32(se);
 	btree->u.external[n].file_secno = cpu_to_le32(fs);
 	btree->u.external[n].length = cpu_to_le32(1);
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	brelse(bh);
 	if ((a == node && fnod) || na == -1) return se;
 	c2 = 0;
@@ -179,7 +179,7 @@ secno hpfs_add_sector_to_btree(struct super_block *s, secno node, int fnod, unsi
 			btree->u.internal[n].file_secno = cpu_to_le32(-1);
 			btree->u.internal[n].down = cpu_to_le32(na);
 			btree->u.internal[n-1].file_secno = cpu_to_le32(fs);
-			mark_buffer_dirty(bh);
+			mark_buffer_dirty(NULL, bh);
 			brelse(bh);
 			brelse(bh2);
 			hpfs_free_sectors(s, ra, 1);
@@ -189,14 +189,14 @@ secno hpfs_add_sector_to_btree(struct super_block *s, secno node, int fnod, unsi
 					anode->btree.flags |= BP_fnode_parent;
 				else
 					anode->btree.flags &= ~BP_fnode_parent;
-				mark_buffer_dirty(bh);
+				mark_buffer_dirty(NULL, bh);
 				brelse(bh);
 			}
 			return se;
 		}
 		up = up != node ? le32_to_cpu(anode->up) : -1;
 		btree->u.internal[btree->n_used_nodes - 1].file_secno = cpu_to_le32(/*fs*/-1);
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		brelse(bh);
 		a = na;
 		if ((new_anode = hpfs_alloc_anode(s, a, &na, &bh))) {
@@ -208,11 +208,11 @@ secno hpfs_add_sector_to_btree(struct super_block *s, secno node, int fnod, unsi
 			anode->btree.first_free = cpu_to_le16(16);
 			anode->btree.u.internal[0].down = cpu_to_le32(a);
 			anode->btree.u.internal[0].file_secno = cpu_to_le32(-1);
-			mark_buffer_dirty(bh);
+			mark_buffer_dirty(NULL, bh);
 			brelse(bh);
 			if ((anode = hpfs_map_anode(s, a, &bh))) {
 				anode->up = cpu_to_le32(na);
-				mark_buffer_dirty(bh);
+				mark_buffer_dirty(NULL, bh);
 				brelse(bh);
 			}
 		} else na = a;
@@ -221,7 +221,7 @@ secno hpfs_add_sector_to_btree(struct super_block *s, secno node, int fnod, unsi
 		anode->up = cpu_to_le32(node);
 		if (fnod)
 			anode->btree.flags |= BP_fnode_parent;
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		brelse(bh);
 	}
 	if (!fnod) {
@@ -247,7 +247,7 @@ secno hpfs_add_sector_to_btree(struct super_block *s, secno node, int fnod, unsi
 		if ((unode = hpfs_map_anode(s, le32_to_cpu(ranode->u.internal[n].down), &bh1))) {
 			unode->up = cpu_to_le32(ra);
 			unode->btree.flags &= ~BP_fnode_parent;
-			mark_buffer_dirty(bh1);
+			mark_buffer_dirty(NULL, bh1);
 			brelse(bh1);
 		}
 	}
@@ -259,9 +259,9 @@ secno hpfs_add_sector_to_btree(struct super_block *s, secno node, int fnod, unsi
 	btree->u.internal[0].down = cpu_to_le32(ra);
 	btree->u.internal[1].file_secno = cpu_to_le32(-1);
 	btree->u.internal[1].down = cpu_to_le32(na);
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	brelse(bh);
-	mark_buffer_dirty(bh2);
+	mark_buffer_dirty(NULL, bh2);
 	brelse(bh2);
 	return se;
 }
@@ -375,7 +375,7 @@ int hpfs_ea_write(struct super_block *s, secno a, int ano, unsigned pos,
 			return -1;
 		l = 0x200 - (pos & 0x1ff); if (l > len) l = len;
 		memcpy(data + (pos & 0x1ff), buf, l);
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		brelse(bh);
 		buf += l; pos += l; len -= l;
 	}
@@ -419,7 +419,7 @@ void hpfs_truncate_btree(struct super_block *s, secno f, int fno, unsigned secs)
 			btree->n_used_nodes = 0;
 			btree->first_free = cpu_to_le16(8);
 			btree->flags &= ~BP_internal;
-			mark_buffer_dirty(bh);
+			mark_buffer_dirty(NULL, bh);
 		} else hpfs_free_sectors(s, f, 1);
 		brelse(bh);
 		return;
@@ -437,7 +437,7 @@ void hpfs_truncate_btree(struct super_block *s, secno f, int fno, unsigned secs)
 		btree->n_used_nodes = i + 1;
 		btree->n_free_nodes = nodes - btree->n_used_nodes;
 		btree->first_free = cpu_to_le16(8 + 8 * btree->n_used_nodes);
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		if (btree->u.internal[i].file_secno == cpu_to_le32(secs)) {
 			brelse(bh);
 			return;
@@ -471,7 +471,7 @@ void hpfs_truncate_btree(struct super_block *s, secno f, int fno, unsigned secs)
 	btree->n_used_nodes = i + 1;
 	btree->n_free_nodes = nodes - btree->n_used_nodes;
 	btree->first_free = cpu_to_le16(8 + 12 * btree->n_used_nodes);
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	brelse(bh);
 }
 
diff --git a/fs/hpfs/buffer.c b/fs/hpfs/buffer.c
index e285d6b3bba4..dbb0a4d72bc8 100644
--- a/fs/hpfs/buffer.c
+++ b/fs/hpfs/buffer.c
@@ -225,8 +225,8 @@ void hpfs_mark_4buffers_dirty(struct quad_buffer_head *qbh)
 		memcpy(qbh->bh[2]->b_data, qbh->data + 2 * 512, 512);
 		memcpy(qbh->bh[3]->b_data, qbh->data + 3 * 512, 512);
 	}
-	mark_buffer_dirty(qbh->bh[0]);
-	mark_buffer_dirty(qbh->bh[1]);
-	mark_buffer_dirty(qbh->bh[2]);
-	mark_buffer_dirty(qbh->bh[3]);
+	mark_buffer_dirty(NULL, qbh->bh[0]);
+	mark_buffer_dirty(NULL, qbh->bh[1]);
+	mark_buffer_dirty(NULL, qbh->bh[2]);
+	mark_buffer_dirty(NULL, qbh->bh[3]);
 }
diff --git a/fs/hpfs/dnode.c b/fs/hpfs/dnode.c
index a4ad18afbdec..6bc5449d0fd1 100644
--- a/fs/hpfs/dnode.c
+++ b/fs/hpfs/dnode.c
@@ -359,7 +359,7 @@ static int hpfs_add_to_dnode(struct inode *i, dnode_secno dno,
 		return 1;
 	}
 	fnode->u.external[0].disk_secno = cpu_to_le32(rdno);
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	brelse(bh);
 	hpfs_i(i)->i_dno = rdno;
 	d->up = ad->up = cpu_to_le32(rdno);
@@ -562,7 +562,7 @@ static void delete_empty_dnode(struct inode *i, dnode_secno dno)
 			}
 			if ((fnode = hpfs_map_fnode(i->i_sb, up, &bh))) {
 				fnode->u.external[0].disk_secno = cpu_to_le32(down);
-				mark_buffer_dirty(bh);
+				mark_buffer_dirty(NULL, bh);
 				brelse(bh);
 			}
 			hpfs_inode->i_dno = down;
diff --git a/fs/hpfs/ea.c b/fs/hpfs/ea.c
index 102ba18e561f..a600bac34172 100644
--- a/fs/hpfs/ea.c
+++ b/fs/hpfs/ea.c
@@ -278,7 +278,7 @@ void hpfs_set_ea(struct inode *inode, struct fnode *fnode, const char *key,
 		fnode->ea_size_s = cpu_to_le16(0);
 		fnode->ea_secno = cpu_to_le32(n);
 		fnode->flags &= ~FNODE_anode;
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		brelse(bh);
 	}
 	pos = le32_to_cpu(fnode->ea_size_l) + 5 + strlen(key) + size;
@@ -331,7 +331,7 @@ void hpfs_set_ea(struct inode *inode, struct fnode *fnode, const char *key,
 					}
 					memcpy(b2, b1, 512);
 					brelse(bh1);
-					mark_buffer_dirty(bh2);
+					mark_buffer_dirty(NULL, bh2);
 					brelse(bh2);
 				}
 				hpfs_free_sectors(s, le32_to_cpu(fnode->ea_secno), len);
diff --git a/fs/hpfs/inode.c b/fs/hpfs/inode.c
index eb8b4baf0f2e..134313fa85fe 100644
--- a/fs/hpfs/inode.c
+++ b/fs/hpfs/inode.c
@@ -253,7 +253,7 @@ void hpfs_write_inode_nolock(struct inode *i)
 				"directory %08lx doesn't have '.' entry",
 				(unsigned long)i->i_ino);
 	}
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	brelse(bh);
 }
 
diff --git a/fs/hpfs/namei.c b/fs/hpfs/namei.c
index e79bd8760f3f..75e2b5597bd6 100644
--- a/fs/hpfs/namei.c
+++ b/fs/hpfs/namei.c
@@ -96,7 +96,7 @@ static int hpfs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
 	de->first = de->directory = 1;
 	/*de->hidden = de->system = 0;*/
 	de->fnode = cpu_to_le32(fno);
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	brelse(bh);
 	hpfs_mark_4buffers_dirty(&qbh0);
 	hpfs_brelse4(&qbh0);
@@ -187,7 +187,7 @@ static int hpfs_create(struct inode *dir, struct dentry *dentry, umode_t mode, b
 	fnode->len = len;
 	memcpy(fnode->name, name, len > 15 ? 15 : len);
 	fnode->up = cpu_to_le32(dir->i_ino);
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	brelse(bh);
 
 	insert_inode_hash(result);
@@ -269,7 +269,7 @@ static int hpfs_mknod(struct inode *dir, struct dentry *dentry, umode_t mode, de
 	fnode->len = len;
 	memcpy(fnode->name, name, len > 15 ? 15 : len);
 	fnode->up = cpu_to_le32(dir->i_ino);
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 
 	insert_inode_hash(result);
 
@@ -348,7 +348,7 @@ static int hpfs_symlink(struct inode *dir, struct dentry *dentry, const char *sy
 	memcpy(fnode->name, name, len > 15 ? 15 : len);
 	fnode->up = cpu_to_le32(dir->i_ino);
 	hpfs_set_ea(result, fnode, "SYMLINK", symlink, strlen(symlink));
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	brelse(bh);
 
 	insert_inode_hash(result);
@@ -604,7 +604,7 @@ static int hpfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 		fnode->len = new_len;
 		memcpy(fnode->name, new_name, new_len>15?15:new_len);
 		if (new_len < 15) memset(&fnode->name[new_len], 0, 15 - new_len);
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		brelse(bh);
 	}
 end1:
diff --git a/fs/hpfs/super.c b/fs/hpfs/super.c
index f2c3ebcd309c..93b9908f300f 100644
--- a/fs/hpfs/super.c
+++ b/fs/hpfs/super.c
@@ -27,7 +27,7 @@ static void mark_dirty(struct super_block *s, int remount)
 		if ((sb = hpfs_map_sector(s, 17, &bh, 0))) {
 			sb->dirty = 1;
 			sb->old_wrote = 0;
-			mark_buffer_dirty(bh);
+			mark_buffer_dirty(NULL, bh);
 			sync_dirty_buffer(bh);
 			brelse(bh);
 		}
@@ -46,7 +46,7 @@ static void unmark_dirty(struct super_block *s)
 	if ((sb = hpfs_map_sector(s, 17, &bh, 0))) {
 		sb->dirty = hpfs_sb(s)->sb_chkdsk > 1 - hpfs_sb(s)->sb_was_error;
 		sb->old_wrote = hpfs_sb(s)->sb_chkdsk >= 2 && !hpfs_sb(s)->sb_was_error;
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		sync_dirty_buffer(bh);
 		brelse(bh);
 	}
@@ -667,7 +667,7 @@ static int hpfs_fill_super(struct super_block *s, void *options, int silent)
 	if (!sb_rdonly(s)) {
 		spareblock->dirty = 1;
 		spareblock->old_wrote = 0;
-		mark_buffer_dirty(bh2);
+		mark_buffer_dirty(NULL, bh2);
 	}
 
 	if (le32_to_cpu(spareblock->n_dnode_spares) != le32_to_cpu(spareblock->n_dnode_spares_free)) {
diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c
index f99910b69c78..936d74bf4c40 100644
--- a/fs/jbd2/recovery.c
+++ b/fs/jbd2/recovery.c
@@ -631,7 +631,7 @@ static int do_one_pass(journal_t *journal,
 
 					BUFFER_TRACE(nbh, "marking dirty");
 					set_buffer_uptodate(nbh);
-					mark_buffer_dirty(nbh);
+					mark_buffer_dirty(NULL, nbh);
 					BUFFER_TRACE(nbh, "marking uptodate");
 					++info->nr_replays;
 					/* ll_rw_block(WRITE, 1, &nbh); */
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index 6899e7b4036d..01c31d021b47 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -1885,7 +1885,7 @@ static void __jbd2_journal_temp_unlink_buffer(struct journal_head *jh)
 	if (transaction && is_journal_aborted(transaction->t_journal))
 		clear_buffer_jbddirty(bh);
 	else if (test_clear_buffer_jbddirty(bh))
-		mark_buffer_dirty(bh);	/* Expose it to the VM */
+		mark_buffer_dirty(NULL, bh);	/* Expose it to the VM */
 }
 
 /*
diff --git a/fs/jfs/jfs_imap.c b/fs/jfs/jfs_imap.c
index f36ef68905a7..69a10c8d9605 100644
--- a/fs/jfs/jfs_imap.c
+++ b/fs/jfs/jfs_imap.c
@@ -3009,7 +3009,7 @@ static void duplicateIXtree(struct super_block *sb, s64 blkno,
 		j_sb = (struct jfs_superblock *)bh->b_data;
 		j_sb->s_flag |= cpu_to_le32(JFS_BAD_SAIT);
 
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		sync_dirty_buffer(bh);
 		brelse(bh);
 		return;
diff --git a/fs/jfs/jfs_mount.c b/fs/jfs/jfs_mount.c
index d8658607bf46..3cf9d794d08a 100644
--- a/fs/jfs/jfs_mount.c
+++ b/fs/jfs/jfs_mount.c
@@ -448,7 +448,7 @@ int updateSuper(struct super_block *sb, uint state)
 			j_sb->s_flag |= cpu_to_le32(JFS_DASD_PRIME);
 	}
 
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	sync_dirty_buffer(bh);
 	brelse(bh);
 
diff --git a/fs/jfs/resize.c b/fs/jfs/resize.c
index c1f417b94fe6..7d800d80a9d1 100644
--- a/fs/jfs/resize.c
+++ b/fs/jfs/resize.c
@@ -247,7 +247,7 @@ int jfs_extendfs(struct super_block *sb, s64 newLVSize, int newLogSize)
 		PXDlength(&j_sb->s_xlogpxd, newLogSize);
 
 		/* synchronously update superblock */
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		sync_dirty_buffer(bh);
 		brelse(bh);
 
@@ -523,13 +523,13 @@ int jfs_extendfs(struct super_block *sb, s64 newLVSize, int newLogSize)
 		j_sb2 = (struct jfs_superblock *)bh2->b_data;
 		memcpy(j_sb2, j_sb, sizeof (struct jfs_superblock));
 
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		sync_dirty_buffer(bh2);
 		brelse(bh2);
 	}
 
 	/* write primary superblock */
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	sync_dirty_buffer(bh);
 	brelse(bh);
 
diff --git a/fs/jfs/super.c b/fs/jfs/super.c
index 1b9264fd54b6..96cc8c79f0d7 100644
--- a/fs/jfs/super.c
+++ b/fs/jfs/super.c
@@ -838,7 +838,7 @@ static ssize_t jfs_quota_write(struct super_block *sb, int type,
 		memcpy(bh->b_data+offset, data, tocopy);
 		flush_dcache_page(bh->b_page);
 		set_buffer_uptodate(bh);
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		unlock_buffer(bh);
 		brelse(bh);
 		offset = 0;
diff --git a/fs/minix/bitmap.c b/fs/minix/bitmap.c
index f4e5e5181a14..61c7f0d4d00a 100644
--- a/fs/minix/bitmap.c
+++ b/fs/minix/bitmap.c
@@ -64,7 +64,7 @@ void minix_free_block(struct inode *inode, unsigned long block)
 		printk("minix_free_block (%s:%lu): bit already cleared\n",
 		       sb->s_id, block);
 	spin_unlock(&bitmap_lock);
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	return;
 }
 
@@ -83,7 +83,7 @@ int minix_new_block(struct inode * inode)
 		if (j < bits_per_zone) {
 			minix_set_bit(j, bh->b_data);
 			spin_unlock(&bitmap_lock);
-			mark_buffer_dirty(bh);
+			mark_buffer_dirty(NULL, bh);
 			j += i * bits_per_zone + sbi->s_firstdatazone-1;
 			if (j < sbi->s_firstdatazone || j >= sbi->s_nzones)
 				break;
@@ -175,7 +175,7 @@ static void minix_clear_inode(struct inode *inode)
 		}
 	}
 	if (bh) {
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		brelse (bh);
 	}
 }
@@ -207,7 +207,7 @@ void minix_free_inode(struct inode * inode)
 	if (!minix_test_and_clear_bit(bit, bh->b_data))
 		printk("minix_free_inode: bit %lu already cleared\n", bit);
 	spin_unlock(&bitmap_lock);
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 }
 
 struct inode *minix_new_inode(const struct inode *dir, umode_t mode, int *error)
@@ -246,7 +246,7 @@ struct inode *minix_new_inode(const struct inode *dir, umode_t mode, int *error)
 		return NULL;
 	}
 	spin_unlock(&bitmap_lock);
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	j += i * bits_per_zone;
 	if (!j || j > sbi->s_ninodes) {
 		iput(inode);
diff --git a/fs/minix/inode.c b/fs/minix/inode.c
index 450aa4e87cd9..e8550a58fe83 100644
--- a/fs/minix/inode.c
+++ b/fs/minix/inode.c
@@ -45,7 +45,7 @@ static void minix_put_super(struct super_block *sb)
 	if (!sb_rdonly(sb)) {
 		if (sbi->s_version != MINIX_V3)	 /* s_state is now out from V3 sb */
 			sbi->s_ms->s_state = sbi->s_mount_state;
-		mark_buffer_dirty(sbi->s_sbh);
+		mark_buffer_dirty(NULL, sbi->s_sbh);
 	}
 	for (i = 0; i < sbi->s_imap_blocks; i++)
 		brelse(sbi->s_imap[i]);
@@ -134,7 +134,7 @@ static int minix_remount (struct super_block * sb, int * flags, char * data)
 		/* Mounting a rw partition read-only. */
 		if (sbi->s_version != MINIX_V3)
 			ms->s_state = sbi->s_mount_state;
-		mark_buffer_dirty(sbi->s_sbh);
+		mark_buffer_dirty(NULL, sbi->s_sbh);
 	} else {
 	  	/* Mount a partition which is read-only, read-write. */
 		if (sbi->s_version != MINIX_V3) {
@@ -143,7 +143,7 @@ static int minix_remount (struct super_block * sb, int * flags, char * data)
 		} else {
 			sbi->s_mount_state = MINIX_VALID_FS;
 		}
-		mark_buffer_dirty(sbi->s_sbh);
+		mark_buffer_dirty(NULL, sbi->s_sbh);
 
 		if (!(sbi->s_mount_state & MINIX_VALID_FS))
 			printk("MINIX-fs warning: remounting unchecked fs, "
@@ -296,7 +296,7 @@ static int minix_fill_super(struct super_block *s, void *data, int silent)
 	if (!sb_rdonly(s)) {
 		if (sbi->s_version != MINIX_V3) /* s_state is now out from V3 sb */
 			ms->s_state &= ~MINIX_VALID_FS;
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 	}
 	if (!(sbi->s_mount_state & MINIX_VALID_FS))
 		printk("MINIX-fs: mounting unchecked file system, "
@@ -570,7 +570,7 @@ static struct buffer_head * V1_minix_update_inode(struct inode * inode)
 		raw_inode->i_zone[0] = old_encode_dev(inode->i_rdev);
 	else for (i = 0; i < 9; i++)
 		raw_inode->i_zone[i] = minix_inode->u.i1_data[i];
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	return bh;
 }
 
@@ -599,7 +599,7 @@ static struct buffer_head * V2_minix_update_inode(struct inode * inode)
 		raw_inode->i_zone[0] = old_encode_dev(inode->i_rdev);
 	else for (i = 0; i < 10; i++)
 		raw_inode->i_zone[i] = minix_inode->u.i2_data[i];
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	return bh;
 }
 
diff --git a/fs/nilfs2/alloc.c b/fs/nilfs2/alloc.c
index 03b8ba933eb2..7e3d3a8dc3f9 100644
--- a/fs/nilfs2/alloc.c
+++ b/fs/nilfs2/alloc.c
@@ -591,8 +591,8 @@ int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
 void nilfs_palloc_commit_alloc_entry(struct inode *inode,
 				     struct nilfs_palloc_req *req)
 {
-	mark_buffer_dirty(req->pr_bitmap_bh);
-	mark_buffer_dirty(req->pr_desc_bh);
+	mark_buffer_dirty(NULL, req->pr_bitmap_bh);
+	mark_buffer_dirty(NULL, req->pr_desc_bh);
 	nilfs_mdt_mark_dirty(inode);
 
 	brelse(req->pr_bitmap_bh);
@@ -632,8 +632,8 @@ void nilfs_palloc_commit_free_entry(struct inode *inode,
 	kunmap(req->pr_bitmap_bh->b_page);
 	kunmap(req->pr_desc_bh->b_page);
 
-	mark_buffer_dirty(req->pr_desc_bh);
-	mark_buffer_dirty(req->pr_bitmap_bh);
+	mark_buffer_dirty(NULL, req->pr_desc_bh);
+	mark_buffer_dirty(NULL, req->pr_bitmap_bh);
 	nilfs_mdt_mark_dirty(inode);
 
 	brelse(req->pr_bitmap_bh);
@@ -810,7 +810,7 @@ int nilfs_palloc_freev(struct inode *inode, __u64 *entry_nrs, size_t nitems)
 		} while (true);
 
 		kunmap(bitmap_bh->b_page);
-		mark_buffer_dirty(bitmap_bh);
+		mark_buffer_dirty(NULL, bitmap_bh);
 		brelse(bitmap_bh);
 
 		for (k = 0; k < nempties; k++) {
@@ -828,7 +828,7 @@ int nilfs_palloc_freev(struct inode *inode, __u64 *entry_nrs, size_t nitems)
 			inode, group, desc_bh, desc_kaddr);
 		nfree = nilfs_palloc_group_desc_add_entries(desc, lock, n);
 		kunmap_atomic(desc_kaddr);
-		mark_buffer_dirty(desc_bh);
+		mark_buffer_dirty(NULL, desc_bh);
 		nilfs_mdt_mark_dirty(inode);
 		brelse(desc_bh);
 
diff --git a/fs/nilfs2/btnode.c b/fs/nilfs2/btnode.c
index c21e0b4454a6..a2d7844b1ff4 100644
--- a/fs/nilfs2/btnode.c
+++ b/fs/nilfs2/btnode.c
@@ -249,7 +249,7 @@ void nilfs_btnode_commit_change_key(struct address_space *btnc,
 				       "invalid oldkey %lld (newkey=%lld)",
 				       (unsigned long long)oldkey,
 				       (unsigned long long)newkey);
-		mark_buffer_dirty(obh);
+		mark_buffer_dirty(NULL, obh);
 
 		spin_lock_irq(&btnc->tree_lock);
 		radix_tree_delete(&btnc->page_tree, oldkey);
@@ -261,7 +261,7 @@ void nilfs_btnode_commit_change_key(struct address_space *btnc,
 		unlock_page(opage);
 	} else {
 		nilfs_copy_buffer(nbh, obh);
-		mark_buffer_dirty(nbh);
+		mark_buffer_dirty(NULL, nbh);
 
 		nbh->b_blocknr = newkey;
 		ctxt->bh = nbh;
diff --git a/fs/nilfs2/btree.c b/fs/nilfs2/btree.c
index 16a7a67a11c9..39edf8a267d7 100644
--- a/fs/nilfs2/btree.c
+++ b/fs/nilfs2/btree.c
@@ -792,7 +792,7 @@ static void nilfs_btree_promote_key(struct nilfs_bmap *btree,
 				nilfs_btree_get_nonroot_node(path, level),
 				path[level].bp_index, key);
 			if (!buffer_dirty(path[level].bp_bh))
-				mark_buffer_dirty(path[level].bp_bh);
+				mark_buffer_dirty(NULL, path[level].bp_bh);
 		} while ((path[level].bp_index == 0) &&
 			 (++level < nilfs_btree_height(btree) - 1));
 	}
@@ -817,7 +817,7 @@ static void nilfs_btree_do_insert(struct nilfs_bmap *btree,
 		nilfs_btree_node_insert(node, path[level].bp_index,
 					*keyp, *ptrp, ncblk);
 		if (!buffer_dirty(path[level].bp_bh))
-			mark_buffer_dirty(path[level].bp_bh);
+			mark_buffer_dirty(NULL, path[level].bp_bh);
 
 		if (path[level].bp_index == 0)
 			nilfs_btree_promote_key(btree, path, level + 1,
@@ -855,9 +855,9 @@ static void nilfs_btree_carry_left(struct nilfs_bmap *btree,
 	nilfs_btree_node_move_left(left, node, n, ncblk, ncblk);
 
 	if (!buffer_dirty(path[level].bp_bh))
-		mark_buffer_dirty(path[level].bp_bh);
+		mark_buffer_dirty(NULL, path[level].bp_bh);
 	if (!buffer_dirty(path[level].bp_sib_bh))
-		mark_buffer_dirty(path[level].bp_sib_bh);
+		mark_buffer_dirty(NULL, path[level].bp_sib_bh);
 
 	nilfs_btree_promote_key(btree, path, level + 1,
 				nilfs_btree_node_get_key(node, 0));
@@ -901,9 +901,9 @@ static void nilfs_btree_carry_right(struct nilfs_bmap *btree,
 	nilfs_btree_node_move_right(node, right, n, ncblk, ncblk);
 
 	if (!buffer_dirty(path[level].bp_bh))
-		mark_buffer_dirty(path[level].bp_bh);
+		mark_buffer_dirty(NULL, path[level].bp_bh);
 	if (!buffer_dirty(path[level].bp_sib_bh))
-		mark_buffer_dirty(path[level].bp_sib_bh);
+		mark_buffer_dirty(NULL, path[level].bp_sib_bh);
 
 	path[level + 1].bp_index++;
 	nilfs_btree_promote_key(btree, path, level + 1,
@@ -946,9 +946,9 @@ static void nilfs_btree_split(struct nilfs_bmap *btree,
 	nilfs_btree_node_move_right(node, right, n, ncblk, ncblk);
 
 	if (!buffer_dirty(path[level].bp_bh))
-		mark_buffer_dirty(path[level].bp_bh);
+		mark_buffer_dirty(NULL, path[level].bp_bh);
 	if (!buffer_dirty(path[level].bp_sib_bh))
-		mark_buffer_dirty(path[level].bp_sib_bh);
+		mark_buffer_dirty(NULL, path[level].bp_sib_bh);
 
 	if (move) {
 		path[level].bp_index -= nilfs_btree_node_get_nchildren(node);
@@ -992,7 +992,7 @@ static void nilfs_btree_grow(struct nilfs_bmap *btree,
 	nilfs_btree_node_set_level(root, level + 1);
 
 	if (!buffer_dirty(path[level].bp_sib_bh))
-		mark_buffer_dirty(path[level].bp_sib_bh);
+		mark_buffer_dirty(NULL, path[level].bp_sib_bh);
 
 	path[level].bp_bh = path[level].bp_sib_bh;
 	path[level].bp_sib_bh = NULL;
@@ -1267,7 +1267,7 @@ static void nilfs_btree_do_delete(struct nilfs_bmap *btree,
 		nilfs_btree_node_delete(node, path[level].bp_index,
 					keyp, ptrp, ncblk);
 		if (!buffer_dirty(path[level].bp_bh))
-			mark_buffer_dirty(path[level].bp_bh);
+			mark_buffer_dirty(NULL, path[level].bp_bh);
 		if (path[level].bp_index == 0)
 			nilfs_btree_promote_key(btree, path, level + 1,
 				nilfs_btree_node_get_key(node, 0));
@@ -1299,9 +1299,9 @@ static void nilfs_btree_borrow_left(struct nilfs_bmap *btree,
 	nilfs_btree_node_move_right(left, node, n, ncblk, ncblk);
 
 	if (!buffer_dirty(path[level].bp_bh))
-		mark_buffer_dirty(path[level].bp_bh);
+		mark_buffer_dirty(NULL, path[level].bp_bh);
 	if (!buffer_dirty(path[level].bp_sib_bh))
-		mark_buffer_dirty(path[level].bp_sib_bh);
+		mark_buffer_dirty(NULL, path[level].bp_sib_bh);
 
 	nilfs_btree_promote_key(btree, path, level + 1,
 				nilfs_btree_node_get_key(node, 0));
@@ -1331,9 +1331,9 @@ static void nilfs_btree_borrow_right(struct nilfs_bmap *btree,
 	nilfs_btree_node_move_left(node, right, n, ncblk, ncblk);
 
 	if (!buffer_dirty(path[level].bp_bh))
-		mark_buffer_dirty(path[level].bp_bh);
+		mark_buffer_dirty(NULL, path[level].bp_bh);
 	if (!buffer_dirty(path[level].bp_sib_bh))
-		mark_buffer_dirty(path[level].bp_sib_bh);
+		mark_buffer_dirty(NULL, path[level].bp_sib_bh);
 
 	path[level + 1].bp_index++;
 	nilfs_btree_promote_key(btree, path, level + 1,
@@ -1362,7 +1362,7 @@ static void nilfs_btree_concat_left(struct nilfs_bmap *btree,
 	nilfs_btree_node_move_left(left, node, n, ncblk, ncblk);
 
 	if (!buffer_dirty(path[level].bp_sib_bh))
-		mark_buffer_dirty(path[level].bp_sib_bh);
+		mark_buffer_dirty(NULL, path[level].bp_sib_bh);
 
 	nilfs_btnode_delete(path[level].bp_bh);
 	path[level].bp_bh = path[level].bp_sib_bh;
@@ -1388,7 +1388,7 @@ static void nilfs_btree_concat_right(struct nilfs_bmap *btree,
 	nilfs_btree_node_move_left(node, right, n, ncblk, ncblk);
 
 	if (!buffer_dirty(path[level].bp_bh))
-		mark_buffer_dirty(path[level].bp_bh);
+		mark_buffer_dirty(NULL, path[level].bp_bh);
 
 	nilfs_btnode_delete(path[level].bp_sib_bh);
 	path[level].bp_sib_bh = NULL;
@@ -1818,7 +1818,7 @@ nilfs_btree_commit_convert_and_insert(struct nilfs_bmap *btree,
 		nilfs_btree_node_init(node, 0, 1, n, ncblk, keys, ptrs);
 		nilfs_btree_node_insert(node, n, key, dreq->bpr_ptr, ncblk);
 		if (!buffer_dirty(bh))
-			mark_buffer_dirty(bh);
+			mark_buffer_dirty(NULL, bh);
 		if (!nilfs_bmap_dirty(btree))
 			nilfs_bmap_set_dirty(btree);
 
@@ -1896,7 +1896,7 @@ static int nilfs_btree_propagate_p(struct nilfs_bmap *btree,
 {
 	while ((++level < nilfs_btree_height(btree) - 1) &&
 	       !buffer_dirty(path[level].bp_bh))
-		mark_buffer_dirty(path[level].bp_bh);
+		mark_buffer_dirty(NULL, path[level].bp_bh);
 
 	return 0;
 }
@@ -2339,7 +2339,7 @@ static int nilfs_btree_mark(struct nilfs_bmap *btree, __u64 key, int level)
 	}
 
 	if (!buffer_dirty(bh))
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 	brelse(bh);
 	if (!nilfs_bmap_dirty(btree))
 		nilfs_bmap_set_dirty(btree);
diff --git a/fs/nilfs2/cpfile.c b/fs/nilfs2/cpfile.c
index a15a1601e931..9e4558a8e4da 100644
--- a/fs/nilfs2/cpfile.c
+++ b/fs/nilfs2/cpfile.c
@@ -258,14 +258,14 @@ int nilfs_cpfile_get_checkpoint(struct inode *cpfile,
 		if (!nilfs_cpfile_is_in_first(cpfile, cno))
 			nilfs_cpfile_block_add_valid_checkpoints(cpfile, cp_bh,
 								 kaddr, 1);
-		mark_buffer_dirty(cp_bh);
+		mark_buffer_dirty(NULL, cp_bh);
 
 		kaddr = kmap_atomic(header_bh->b_page);
 		header = nilfs_cpfile_block_get_header(cpfile, header_bh,
 						       kaddr);
 		le64_add_cpu(&header->ch_ncheckpoints, 1);
 		kunmap_atomic(kaddr);
-		mark_buffer_dirty(header_bh);
+		mark_buffer_dirty(NULL, header_bh);
 		nilfs_mdt_mark_dirty(cpfile);
 	}
 
@@ -370,7 +370,7 @@ int nilfs_cpfile_delete_checkpoints(struct inode *cpfile,
 		}
 		if (nicps > 0) {
 			tnicps += nicps;
-			mark_buffer_dirty(cp_bh);
+			mark_buffer_dirty(NULL, cp_bh);
 			nilfs_mdt_mark_dirty(cpfile);
 			if (!nilfs_cpfile_is_in_first(cpfile, cno)) {
 				count =
@@ -402,7 +402,7 @@ int nilfs_cpfile_delete_checkpoints(struct inode *cpfile,
 		header = nilfs_cpfile_block_get_header(cpfile, header_bh,
 						       kaddr);
 		le64_add_cpu(&header->ch_ncheckpoints, -(u64)tnicps);
-		mark_buffer_dirty(header_bh);
+		mark_buffer_dirty(NULL, header_bh);
 		nilfs_mdt_mark_dirty(cpfile);
 		kunmap_atomic(kaddr);
 	}
@@ -720,10 +720,10 @@ static int nilfs_cpfile_set_snapshot(struct inode *cpfile, __u64 cno)
 	le64_add_cpu(&header->ch_nsnapshots, 1);
 	kunmap_atomic(kaddr);
 
-	mark_buffer_dirty(prev_bh);
-	mark_buffer_dirty(curr_bh);
-	mark_buffer_dirty(cp_bh);
-	mark_buffer_dirty(header_bh);
+	mark_buffer_dirty(NULL, prev_bh);
+	mark_buffer_dirty(NULL, curr_bh);
+	mark_buffer_dirty(NULL, cp_bh);
+	mark_buffer_dirty(NULL, header_bh);
 	nilfs_mdt_mark_dirty(cpfile);
 
 	brelse(prev_bh);
@@ -823,10 +823,10 @@ static int nilfs_cpfile_clear_snapshot(struct inode *cpfile, __u64 cno)
 	le64_add_cpu(&header->ch_nsnapshots, -1);
 	kunmap_atomic(kaddr);
 
-	mark_buffer_dirty(next_bh);
-	mark_buffer_dirty(prev_bh);
-	mark_buffer_dirty(cp_bh);
-	mark_buffer_dirty(header_bh);
+	mark_buffer_dirty(NULL, next_bh);
+	mark_buffer_dirty(NULL, prev_bh);
+	mark_buffer_dirty(NULL, cp_bh);
+	mark_buffer_dirty(NULL, header_bh);
 	nilfs_mdt_mark_dirty(cpfile);
 
 	brelse(prev_bh);
diff --git a/fs/nilfs2/dat.c b/fs/nilfs2/dat.c
index dffedb2f8817..8db180ed9812 100644
--- a/fs/nilfs2/dat.c
+++ b/fs/nilfs2/dat.c
@@ -56,7 +56,7 @@ static int nilfs_dat_prepare_entry(struct inode *dat,
 static void nilfs_dat_commit_entry(struct inode *dat,
 				   struct nilfs_palloc_req *req)
 {
-	mark_buffer_dirty(req->pr_entry_bh);
+	mark_buffer_dirty(NULL, req->pr_entry_bh);
 	nilfs_mdt_mark_dirty(dat);
 	brelse(req->pr_entry_bh);
 }
@@ -362,7 +362,7 @@ int nilfs_dat_move(struct inode *dat, __u64 vblocknr, sector_t blocknr)
 	entry->de_blocknr = cpu_to_le64(blocknr);
 	kunmap_atomic(kaddr);
 
-	mark_buffer_dirty(entry_bh);
+	mark_buffer_dirty(NULL, entry_bh);
 	nilfs_mdt_mark_dirty(dat);
 
 	brelse(entry_bh);
diff --git a/fs/nilfs2/gcinode.c b/fs/nilfs2/gcinode.c
index 853a831dcde0..ba2af926b39a 100644
--- a/fs/nilfs2/gcinode.c
+++ b/fs/nilfs2/gcinode.c
@@ -164,7 +164,7 @@ int nilfs_gccache_wait_and_mark_dirty(struct buffer_head *bh)
 		clear_buffer_uptodate(bh);
 		return -EIO;
 	}
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	return 0;
 }
 
diff --git a/fs/nilfs2/ifile.c b/fs/nilfs2/ifile.c
index b8fa45c20c63..11262c2f46f4 100644
--- a/fs/nilfs2/ifile.c
+++ b/fs/nilfs2/ifile.c
@@ -82,7 +82,7 @@ int nilfs_ifile_create_inode(struct inode *ifile, ino_t *out_ino,
 		return ret;
 	}
 	nilfs_palloc_commit_alloc_entry(ifile, &req);
-	mark_buffer_dirty(req.pr_entry_bh);
+	mark_buffer_dirty(NULL, req.pr_entry_bh);
 	nilfs_mdt_mark_dirty(ifile);
 	*out_ino = (ino_t)req.pr_entry_nr;
 	*out_bh = req.pr_entry_bh;
@@ -130,7 +130,7 @@ int nilfs_ifile_delete_inode(struct inode *ifile, ino_t ino)
 	raw_inode->i_flags = 0;
 	kunmap_atomic(kaddr);
 
-	mark_buffer_dirty(req.pr_entry_bh);
+	mark_buffer_dirty(NULL, req.pr_entry_bh);
 	brelse(req.pr_entry_bh);
 
 	nilfs_palloc_commit_free_entry(ifile, &req);
diff --git a/fs/nilfs2/inode.c b/fs/nilfs2/inode.c
index 7cc0268d68ce..811e4d952511 100644
--- a/fs/nilfs2/inode.c
+++ b/fs/nilfs2/inode.c
@@ -968,7 +968,7 @@ int __nilfs_mark_inode_dirty(struct inode *inode, int flags)
 		return err;
 	}
 	nilfs_update_inode(inode, ibh, flags);
-	mark_buffer_dirty(ibh);
+	mark_buffer_dirty(NULL, ibh);
 	nilfs_mdt_mark_dirty(NILFS_I(inode)->i_root->ifile);
 	brelse(ibh);
 	return 0;
diff --git a/fs/nilfs2/ioctl.c b/fs/nilfs2/ioctl.c
index 1d2c3d7711fe..18b550252ff2 100644
--- a/fs/nilfs2/ioctl.c
+++ b/fs/nilfs2/ioctl.c
@@ -801,7 +801,7 @@ static int nilfs_ioctl_mark_blocks_dirty(struct the_nilfs *nilfs,
 				WARN_ON(ret == -ENOENT);
 				return ret;
 			}
-			mark_buffer_dirty(bh);
+			mark_buffer_dirty(NULL, bh);
 			nilfs_mdt_mark_dirty(nilfs->ns_dat);
 			put_bh(bh);
 		} else {
diff --git a/fs/nilfs2/mdt.c b/fs/nilfs2/mdt.c
index ca7bc0fba624..ad41b67eb25f 100644
--- a/fs/nilfs2/mdt.c
+++ b/fs/nilfs2/mdt.c
@@ -64,7 +64,7 @@ nilfs_mdt_insert_new_block(struct inode *inode, unsigned long block,
 	kunmap_atomic(kaddr);
 
 	set_buffer_uptodate(bh);
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	nilfs_mdt_mark_dirty(inode);
 
 	trace_nilfs2_mdt_insert_new_block(inode, inode->i_ino, block);
diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index 0952d0acab4a..8dc544e3fe6a 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -883,7 +883,7 @@ static int nilfs_segctor_create_checkpoint(struct nilfs_sc_info *sci)
 		 * needed to collect the checkpoint even if it was not newly
 		 * created.
 		 */
-		mark_buffer_dirty(bh_cp);
+		mark_buffer_dirty(NULL, bh_cp);
 		nilfs_mdt_mark_dirty(nilfs->ns_cpfile);
 		nilfs_cpfile_put_checkpoint(
 			nilfs->ns_cpfile, nilfs->ns_cno, bh_cp);
@@ -1964,7 +1964,7 @@ static int nilfs_segctor_collect_dirty_files(struct nilfs_sc_info *sci,
 		}
 
 		// Always redirty the buffer to avoid race condition
-		mark_buffer_dirty(ii->i_bh);
+		mark_buffer_dirty(NULL, ii->i_bh);
 		nilfs_mdt_mark_dirty(ifile);
 
 		clear_bit(NILFS_I_QUEUED, &ii->i_state);
diff --git a/fs/nilfs2/sufile.c b/fs/nilfs2/sufile.c
index c7fa139d50e8..707f2231d348 100644
--- a/fs/nilfs2/sufile.c
+++ b/fs/nilfs2/sufile.c
@@ -122,7 +122,7 @@ static void nilfs_sufile_mod_counter(struct buffer_head *header_bh,
 	le64_add_cpu(&header->sh_ndirtysegs, ndirtyadd);
 	kunmap_atomic(kaddr);
 
-	mark_buffer_dirty(header_bh);
+	mark_buffer_dirty(NULL, header_bh);
 }
 
 /**
@@ -383,8 +383,8 @@ int nilfs_sufile_alloc(struct inode *sufile, __u64 *segnump)
 			kunmap_atomic(kaddr);
 
 			sui->ncleansegs--;
-			mark_buffer_dirty(header_bh);
-			mark_buffer_dirty(su_bh);
+			mark_buffer_dirty(NULL, header_bh);
+			mark_buffer_dirty(NULL, su_bh);
 			nilfs_mdt_mark_dirty(sufile);
 			brelse(su_bh);
 			*segnump = segnum;
@@ -431,7 +431,7 @@ void nilfs_sufile_do_cancel_free(struct inode *sufile, __u64 segnum,
 	nilfs_sufile_mod_counter(header_bh, -1, 1);
 	NILFS_SUI(sufile)->ncleansegs--;
 
-	mark_buffer_dirty(su_bh);
+	mark_buffer_dirty(NULL, su_bh);
 	nilfs_mdt_mark_dirty(sufile);
 }
 
@@ -462,7 +462,7 @@ void nilfs_sufile_do_scrap(struct inode *sufile, __u64 segnum,
 	nilfs_sufile_mod_counter(header_bh, clean ? (u64)-1 : 0, dirty ? 0 : 1);
 	NILFS_SUI(sufile)->ncleansegs -= clean;
 
-	mark_buffer_dirty(su_bh);
+	mark_buffer_dirty(NULL, su_bh);
 	nilfs_mdt_mark_dirty(sufile);
 }
 
@@ -489,7 +489,7 @@ void nilfs_sufile_do_free(struct inode *sufile, __u64 segnum,
 	sudirty = nilfs_segment_usage_dirty(su);
 	nilfs_segment_usage_set_clean(su);
 	kunmap_atomic(kaddr);
-	mark_buffer_dirty(su_bh);
+	mark_buffer_dirty(NULL, su_bh);
 
 	nilfs_sufile_mod_counter(header_bh, 1, sudirty ? (u64)-1 : 0);
 	NILFS_SUI(sufile)->ncleansegs++;
@@ -511,7 +511,7 @@ int nilfs_sufile_mark_dirty(struct inode *sufile, __u64 segnum)
 
 	ret = nilfs_sufile_get_segment_usage_block(sufile, segnum, 0, &bh);
 	if (!ret) {
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		nilfs_mdt_mark_dirty(sufile);
 		brelse(bh);
 	}
@@ -546,7 +546,7 @@ int nilfs_sufile_set_segment_usage(struct inode *sufile, __u64 segnum,
 	su->su_nblocks = cpu_to_le32(nblocks);
 	kunmap_atomic(kaddr);
 
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	nilfs_mdt_mark_dirty(sufile);
 	brelse(bh);
 
@@ -625,7 +625,7 @@ void nilfs_sufile_do_set_error(struct inode *sufile, __u64 segnum,
 		nilfs_sufile_mod_counter(header_bh, -1, 0);
 		NILFS_SUI(sufile)->ncleansegs--;
 	}
-	mark_buffer_dirty(su_bh);
+	mark_buffer_dirty(NULL, su_bh);
 	nilfs_mdt_mark_dirty(sufile);
 }
 
@@ -711,7 +711,7 @@ static int nilfs_sufile_truncate_range(struct inode *sufile,
 		}
 		kunmap_atomic(kaddr);
 		if (nc > 0) {
-			mark_buffer_dirty(su_bh);
+			mark_buffer_dirty(NULL, su_bh);
 			ncleaned += nc;
 		}
 		brelse(su_bh);
@@ -790,7 +790,7 @@ int nilfs_sufile_resize(struct inode *sufile, __u64 newnsegs)
 	header->sh_ncleansegs = cpu_to_le64(sui->ncleansegs);
 	kunmap_atomic(kaddr);
 
-	mark_buffer_dirty(header_bh);
+	mark_buffer_dirty(NULL, header_bh);
 	nilfs_mdt_mark_dirty(sufile);
 	nilfs_set_nsegments(nilfs, newnsegs);
 
@@ -984,13 +984,13 @@ ssize_t nilfs_sufile_set_suinfo(struct inode *sufile, void *buf,
 			continue;
 
 		/* get different block */
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		put_bh(bh);
 		ret = nilfs_mdt_get_block(sufile, blkoff, 1, NULL, &bh);
 		if (unlikely(ret < 0))
 			goto out_mark;
 	}
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	put_bh(bh);
 
  out_mark:
diff --git a/fs/ntfs/file.c b/fs/ntfs/file.c
index 860b3b2ff47d..bf07c0ca127e 100644
--- a/fs/ntfs/file.c
+++ b/fs/ntfs/file.c
@@ -743,7 +743,7 @@ static int ntfs_prepare_pages_for_non_resident_write(struct page **pages,
 					/* We allocated the buffer. */
 					clean_bdev_bh_alias(bh);
 					if (bh_end <= pos || bh_pos >= end)
-						mark_buffer_dirty(bh);
+						mark_buffer_dirty(NULL, bh);
 					else
 						set_buffer_new(bh);
 				}
@@ -799,7 +799,7 @@ static int ntfs_prepare_pages_for_non_resident_write(struct page **pages,
 							blocksize);
 					set_buffer_uptodate(bh);
 				}
-				mark_buffer_dirty(bh);
+				mark_buffer_dirty(NULL, bh);
 				continue;
 			}
 			set_buffer_new(bh);
@@ -1365,7 +1365,7 @@ static int ntfs_prepare_pages_for_non_resident_write(struct page **pages,
 					set_buffer_uptodate(bh);
 				}
 			}
-			mark_buffer_dirty(bh);
+			mark_buffer_dirty(NULL, bh);
 		} while ((bh = bh->b_this_page) != head);
 	} while (++u <= nr_pages);
 	ntfs_error(vol->sb, "Failed.  Returning error code %i.", err);
@@ -1434,7 +1434,7 @@ static inline int ntfs_commit_pages_after_non_resident_write(
 					partial = true;
 			} else {
 				set_buffer_uptodate(bh);
-				mark_buffer_dirty(bh);
+				mark_buffer_dirty(NULL, bh);
 			}
 		} while (bh_pos += blocksize, (bh = bh->b_this_page) != head);
 		/*
diff --git a/fs/ntfs/super.c b/fs/ntfs/super.c
index bb7159f697f2..0d21744db6ba 100644
--- a/fs/ntfs/super.c
+++ b/fs/ntfs/super.c
@@ -737,7 +737,7 @@ static struct buffer_head *read_ntfs_boot_sector(struct super_block *sb,
 					"boot sector from backup copy.");
 			memcpy(bh_primary->b_data, bh_backup->b_data,
 					NTFS_BLOCK_SIZE);
-			mark_buffer_dirty(bh_primary);
+			mark_buffer_dirty(NULL, bh_primary);
 			sync_dirty_buffer(bh_primary);
 			if (buffer_uptodate(bh_primary)) {
 				brelse(bh_backup);
diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c
index 9a876bb07cac..7ab61d67e89e 100644
--- a/fs/ocfs2/alloc.c
+++ b/fs/ocfs2/alloc.c
@@ -6815,7 +6815,7 @@ static int ocfs2_cache_extent_block_free(struct ocfs2_cached_dealloc_ctxt *ctxt,
 static int ocfs2_zero_func(handle_t *handle, struct buffer_head *bh)
 {
 	set_buffer_uptodate(bh);
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	return 0;
 }
 
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index 515e0a00839b..64002c13bdd1 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -701,7 +701,7 @@ int ocfs2_map_page_blocks(struct page *page, u64 *p_blkno,
 
 		zero_user(page, block_start, bh->b_size);
 		set_buffer_uptodate(bh);
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 
 next_bh:
 		block_start = block_end;
@@ -930,7 +930,7 @@ static void ocfs2_zero_new_buffers(struct page *page, unsigned from, unsigned to
 				}
 
 				clear_buffer_new(bh);
-				mark_buffer_dirty(bh);
+				mark_buffer_dirty(NULL, bh);
 			}
 		}
 
diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
index d51b80edd972..7a4e5e9db53d 100644
--- a/fs/ocfs2/inode.c
+++ b/fs/ocfs2/inode.c
@@ -1568,7 +1568,7 @@ static int ocfs2_filecheck_repair_inode_block(struct super_block *sb,
 
 	if (changed || ocfs2_validate_meta_ecc(sb, bh->b_data, &di->i_check)) {
 		ocfs2_compute_meta_ecc(sb, bh->b_data, &di->i_check);
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		mlog(ML_ERROR,
 		     "Filecheck: reset dinode #%llu: compute meta ecc\n",
 		     (unsigned long long)bh->b_blocknr);
diff --git a/fs/omfs/bitmap.c b/fs/omfs/bitmap.c
index 7147ba6a6afc..0a4a756e2865 100644
--- a/fs/omfs/bitmap.c
+++ b/fs/omfs/bitmap.c
@@ -63,7 +63,7 @@ static int set_run(struct super_block *sb, int map,
 			bit = 0;
 			map++;
 
-			mark_buffer_dirty(bh);
+			mark_buffer_dirty(NULL, bh);
 			brelse(bh);
 			bh = sb_bread(sb,
 				clus_to_blk(sbi, sbi->s_bitmap_ino) + map);
@@ -78,7 +78,7 @@ static int set_run(struct super_block *sb, int map,
 			clear_bit(bit, (unsigned long *)bh->b_data);
 		}
 	}
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	brelse(bh);
 	err = 0;
 out:
@@ -111,7 +111,7 @@ int omfs_allocate_block(struct super_block *sb, u64 block)
 			goto out;
 
 		set_bit(bit, (unsigned long *)bh->b_data);
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		brelse(bh);
 	}
 	ret = 1;
diff --git a/fs/omfs/dir.c b/fs/omfs/dir.c
index b7146526afff..a5c6f506a0a0 100644
--- a/fs/omfs/dir.c
+++ b/fs/omfs/dir.c
@@ -103,7 +103,7 @@ int omfs_make_empty(struct inode *inode, struct super_block *sb)
 	oi->i_head.h_self = cpu_to_be64(inode->i_ino);
 	oi->i_sibling = ~cpu_to_be64(0ULL);
 
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	brelse(bh);
 	return 0;
 }
@@ -127,7 +127,7 @@ static int omfs_add_link(struct dentry *dentry, struct inode *inode)
 	entry = (__be64 *) &bh->b_data[ofs];
 	block = be64_to_cpu(*entry);
 	*entry = cpu_to_be64(inode->i_ino);
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	brelse(bh);
 
 	/* now set the sibling and parent pointers on the new inode */
@@ -140,7 +140,7 @@ static int omfs_add_link(struct dentry *dentry, struct inode *inode)
 	memset(oi->i_name + namelen, 0, OMFS_NAMELEN - namelen);
 	oi->i_sibling = cpu_to_be64(block);
 	oi->i_parent = cpu_to_be64(dir->i_ino);
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	brelse(bh);
 
 	dir->i_ctime = current_time(dir);
@@ -196,7 +196,7 @@ static int omfs_delete_entry(struct dentry *dentry)
 	}
 
 	*entry = next;
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 
 	if (prev != ~0) {
 		dirty = omfs_iget(dir->i_sb, prev);
diff --git a/fs/omfs/file.c b/fs/omfs/file.c
index ac27a4b2186a..4dbfb35b1a40 100644
--- a/fs/omfs/file.c
+++ b/fs/omfs/file.c
@@ -80,7 +80,7 @@ int omfs_shrink_inode(struct inode *inode)
 			entry++;
 		}
 		omfs_make_empty_table(bh, (char *) oe - bh->b_data);
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		brelse(bh);
 
 		if (last != inode->i_ino)
@@ -272,7 +272,7 @@ static int omfs_get_block(struct inode *inode, sector_t block,
 	if (create) {
 		ret = omfs_grow_extent(inode, oe, &new_block);
 		if (ret == 0) {
-			mark_buffer_dirty(bh);
+			mark_buffer_dirty(NULL, bh);
 			mark_inode_dirty(inode);
 			map_bh(bh_result, inode->i_sb,
 					clus_to_blk(sbi, new_block));
diff --git a/fs/omfs/inode.c b/fs/omfs/inode.c
index ee14af9e26f2..4213ef2cd088 100644
--- a/fs/omfs/inode.c
+++ b/fs/omfs/inode.c
@@ -140,7 +140,7 @@ static int __omfs_write_inode(struct inode *inode, int wait)
 
 	omfs_update_checksums(oi);
 
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	if (wait) {
 		sync_dirty_buffer(bh);
 		if (buffer_req(bh) && !buffer_uptodate(bh))
@@ -154,7 +154,7 @@ static int __omfs_write_inode(struct inode *inode, int wait)
 			goto out_brelse;
 
 		memcpy(bh2->b_data, bh->b_data, bh->b_size);
-		mark_buffer_dirty(bh2);
+		mark_buffer_dirty(NULL, bh2);
 		if (wait) {
 			sync_dirty_buffer(bh2);
 			if (buffer_req(bh2) && !buffer_uptodate(bh2))
diff --git a/fs/reiserfs/file.c b/fs/reiserfs/file.c
index 843aadcc123c..804b13c45f53 100644
--- a/fs/reiserfs/file.c
+++ b/fs/reiserfs/file.c
@@ -214,7 +214,7 @@ int reiserfs_commit_page(struct inode *inode, struct page *page,
 				reiserfs_prepare_for_journal(s, bh, 1);
 				journal_mark_dirty(&th, bh);
 			} else if (!buffer_dirty(bh)) {
-				mark_buffer_dirty(bh);
+				mark_buffer_dirty(NULL, bh);
 				/*
 				 * do data=ordered on any page past the end
 				 * of file and any buffer marked BH_New.
diff --git a/fs/reiserfs/inode.c b/fs/reiserfs/inode.c
index bc64ca190848..e438aed1b622 100644
--- a/fs/reiserfs/inode.c
+++ b/fs/reiserfs/inode.c
@@ -999,7 +999,7 @@ int reiserfs_get_block(struct inode *inode, sector_t block,
 				 * VM (which was also the case with
 				 * __mark_buffer_dirty())
 				 */
-				mark_buffer_dirty(unbh);
+				mark_buffer_dirty(NULL, unbh);
 			}
 		} else {
 			/*
@@ -2336,7 +2336,7 @@ int reiserfs_truncate_file(struct inode *inode, int update_timestamps)
 			length = blocksize - length;
 			zero_user(page, offset, length);
 			if (buffer_mapped(bh) && bh->b_blocknr != 0) {
-				mark_buffer_dirty(bh);
+				mark_buffer_dirty(NULL, bh);
 			}
 		}
 		unlock_page(page);
diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
index ee74c6cddbbe..68cbaba7b2e6 100644
--- a/fs/reiserfs/journal.c
+++ b/fs/reiserfs/journal.c
@@ -1111,7 +1111,7 @@ static int flush_commit_list(struct super_block *s,
 	if (likely(!retval && !reiserfs_is_journal_aborted (journal))) {
 		if (buffer_dirty(jl->j_commit_bh))
 			BUG();
-		mark_buffer_dirty(jl->j_commit_bh) ;
+		mark_buffer_dirty(NULL, jl->j_commit_bh);
 		depth = reiserfs_write_unlock_nested(s);
 		if (reiserfs_barrier_flush(s))
 			__sync_dirty_buffer(jl->j_commit_bh,
@@ -1712,7 +1712,7 @@ static int dirty_one_transaction(struct super_block *s,
 				set_buffer_journal_restore_dirty(cn->bh);
 			} else {
 				set_buffer_journal_test(cn->bh);
-				mark_buffer_dirty(cn->bh);
+				mark_buffer_dirty(NULL, cn->bh);
 			}
 		}
 		cn = cn->next;
@@ -3935,7 +3935,7 @@ void reiserfs_restore_prepared_buffer(struct super_block *sb,
 					  bh->b_blocknr);
 		if (cn && can_dirty(cn)) {
 			set_buffer_journal_test(bh);
-			mark_buffer_dirty(bh);
+			mark_buffer_dirty(NULL, bh);
 		}
 		reiserfs_write_unlock(sb);
 	}
@@ -4183,7 +4183,7 @@ static int do_journal_end(struct reiserfs_transaction_handle *th, int flags)
 	 * dirty now too.  Don't mark the commit block dirty until all the
 	 * others are on disk
 	 */
-	mark_buffer_dirty(d_bh);
+	mark_buffer_dirty(NULL, d_bh);
 
 	/*
 	 * first data block is j_start + 1, so add one to
@@ -4212,7 +4212,7 @@ static int do_journal_end(struct reiserfs_transaction_handle *th, int flags)
 			       addr + offset_in_page(cn->bh->b_data),
 			       cn->bh->b_size);
 			kunmap(page);
-			mark_buffer_dirty(tmp_bh);
+			mark_buffer_dirty(NULL, tmp_bh);
 			jindex++;
 			set_buffer_journal_dirty(cn->bh);
 			clear_buffer_journaled(cn->bh);
diff --git a/fs/reiserfs/resize.c b/fs/reiserfs/resize.c
index 2196afda6e28..f80e8a7e7ac4 100644
--- a/fs/reiserfs/resize.c
+++ b/fs/reiserfs/resize.c
@@ -156,7 +156,7 @@ int reiserfs_resize(struct super_block *s, unsigned long block_count_new)
 			reiserfs_cache_bitmap_metadata(s, bh, bitmap + i);
 
 			set_buffer_uptodate(bh);
-			mark_buffer_dirty(bh);
+			mark_buffer_dirty(NULL, bh);
 			depth = reiserfs_write_unlock_nested(s);
 			sync_dirty_buffer(bh);
 			reiserfs_write_lock_nested(s, depth);
diff --git a/fs/sysv/balloc.c b/fs/sysv/balloc.c
index 0e69dbdf7277..bfee0260f869 100644
--- a/fs/sysv/balloc.c
+++ b/fs/sysv/balloc.c
@@ -84,7 +84,7 @@ void sysv_free_block(struct super_block * sb, sysv_zone_t nr)
 		memset(bh->b_data, 0, sb->s_blocksize);
 		*(__fs16*)bh->b_data = cpu_to_fs16(sbi, count);
 		memcpy(get_chunk(sb,bh), blocks, count * sizeof(sysv_zone_t));
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		set_buffer_uptodate(bh);
 		brelse(bh);
 		count = 0;
diff --git a/fs/sysv/ialloc.c b/fs/sysv/ialloc.c
index 6c9801986af6..c3f09d9761a8 100644
--- a/fs/sysv/ialloc.c
+++ b/fs/sysv/ialloc.c
@@ -128,7 +128,7 @@ void sysv_free_inode(struct inode * inode)
 	fs16_add(sbi, sbi->s_sb_total_free_inodes, 1);
 	dirty_sb(sb);
 	memset(raw_inode, 0, sizeof(struct sysv_inode));
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	mutex_unlock(&sbi->s_lock);
 	brelse(bh);
 }
diff --git a/fs/sysv/inode.c b/fs/sysv/inode.c
index bec9f79adb25..d7494995157a 100644
--- a/fs/sysv/inode.c
+++ b/fs/sysv/inode.c
@@ -49,7 +49,7 @@ static int sysv_sync_fs(struct super_block *sb, int wait)
 		if (*sbi->s_sb_state == cpu_to_fs32(sbi, 0x7c269d38 - old_time))
 			*sbi->s_sb_state = cpu_to_fs32(sbi, 0x7c269d38 - time);
 		*sbi->s_sb_time = cpu_to_fs32(sbi, time);
-		mark_buffer_dirty(sbi->s_bh2);
+		mark_buffer_dirty(NULL, sbi->s_bh2);
 	}
 
 	mutex_unlock(&sbi->s_lock);
@@ -73,9 +73,9 @@ static void sysv_put_super(struct super_block *sb)
 
 	if (!sb_rdonly(sb)) {
 		/* XXX ext2 also updates the state here */
-		mark_buffer_dirty(sbi->s_bh1);
+		mark_buffer_dirty(NULL, sbi->s_bh1);
 		if (sbi->s_bh1 != sbi->s_bh2)
-			mark_buffer_dirty(sbi->s_bh2);
+			mark_buffer_dirty(NULL, sbi->s_bh2);
 	}
 
 	brelse(sbi->s_bh1);
@@ -265,7 +265,7 @@ static int __sysv_write_inode(struct inode *inode, int wait)
 	for (block = 0; block < 10+1+1+1; block++)
 		write3byte(sbi, (u8 *)&si->i_data[block],
 			&raw_inode->i_data[3*block]);
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	if (wait) {
                 sync_dirty_buffer(bh);
                 if (buffer_req(bh) && !buffer_uptodate(bh)) {
diff --git a/fs/sysv/sysv.h b/fs/sysv/sysv.h
index e913698779c0..3a3b5c16095a 100644
--- a/fs/sysv/sysv.h
+++ b/fs/sysv/sysv.h
@@ -116,9 +116,9 @@ static inline void dirty_sb(struct super_block *sb)
 {
 	struct sysv_sb_info *sbi = SYSV_SB(sb);
 
-	mark_buffer_dirty(sbi->s_bh1);
+	mark_buffer_dirty(NULL, sbi->s_bh1);
 	if (sbi->s_bh1 != sbi->s_bh2)
-		mark_buffer_dirty(sbi->s_bh2);
+		mark_buffer_dirty(NULL, sbi->s_bh2);
 }
 
 
diff --git a/fs/udf/balloc.c b/fs/udf/balloc.c
index 1b961b1d9699..9fa31c59835f 100644
--- a/fs/udf/balloc.c
+++ b/fs/udf/balloc.c
@@ -157,7 +157,7 @@ static void udf_bitmap_free_blocks(struct super_block *sb,
 			}
 		}
 		udf_add_free_space(sb, sbi->s_partition, count);
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		if (overflow) {
 			block += count;
 			count = overflow;
@@ -209,7 +209,7 @@ static int udf_bitmap_prealloc_blocks(struct super_block *sb,
 			bit++;
 			block++;
 		}
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 	} while (block_count > 0);
 
 out:
@@ -332,7 +332,7 @@ static udf_pblk_t udf_bitmap_new_block(struct super_block *sb,
 		goto repeat;
 	}
 
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 
 	udf_add_free_space(sb, partition, -1);
 	mutex_unlock(&sbi->s_alloc_mutex);
diff --git a/fs/udf/inode.c b/fs/udf/inode.c
index 56cf8e70d298..6194f4c4bf12 100644
--- a/fs/udf/inode.c
+++ b/fs/udf/inode.c
@@ -1817,7 +1817,7 @@ static int udf_update_inode(struct inode *inode, int do_sync)
 	unlock_buffer(bh);
 
 	/* write the data blocks */
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	if (do_sync) {
 		sync_dirty_buffer(bh);
 		if (buffer_write_io_error(bh)) {
diff --git a/fs/udf/partition.c b/fs/udf/partition.c
index 090baff83990..0e4e05a41bee 100644
--- a/fs/udf/partition.c
+++ b/fs/udf/partition.c
@@ -204,7 +204,7 @@ int udf_relocate_blocks(struct super_block *sb, long old_block, long *new_block)
 						  reallocationTableLen *
 						  sizeof(struct sparingEntry);
 						udf_update_tag((char *)st, len);
-						mark_buffer_dirty(bh);
+						mark_buffer_dirty(NULL, bh);
 					}
 					*new_block = le32_to_cpu(
 							entry->mappedLocation) +
@@ -250,7 +250,7 @@ int udf_relocate_blocks(struct super_block *sb, long old_block, long *new_block)
 						sizeof(struct sparingTable) +
 						reallocationTableLen *
 						sizeof(struct sparingEntry));
-					mark_buffer_dirty(bh);
+					mark_buffer_dirty(NULL, bh);
 				}
 				*new_block =
 					le32_to_cpu(
diff --git a/fs/udf/super.c b/fs/udf/super.c
index f73239a9a97d..79c2bfd32986 100644
--- a/fs/udf/super.c
+++ b/fs/udf/super.c
@@ -2001,7 +2001,7 @@ static void udf_open_lvid(struct super_block *sb)
 			le16_to_cpu(lvid->descTag.descCRCLength)));
 
 	lvid->descTag.tagChecksum = udf_tag_checksum(&lvid->descTag);
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	sbi->s_lvid_dirty = 0;
 	mutex_unlock(&sbi->s_alloc_mutex);
 	/* Make opening of filesystem visible on the media immediately */
@@ -2047,7 +2047,7 @@ static void udf_close_lvid(struct super_block *sb)
 	 * the buffer as !uptodate
 	 */
 	set_buffer_uptodate(bh);
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	sbi->s_lvid_dirty = 0;
 	mutex_unlock(&sbi->s_alloc_mutex);
 	/* Make closing of filesystem visible on the media immediately */
@@ -2076,7 +2076,7 @@ u64 lvid_get_unique_id(struct super_block *sb)
 		uniqueID += 16;
 	lvhd->uniqueID = cpu_to_le64(uniqueID);
 	mutex_unlock(&sbi->s_alloc_mutex);
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 
 	return ret;
 }
@@ -2351,7 +2351,7 @@ static int udf_sync_fs(struct super_block *sb, int wait)
 		 * Blockdevice will be synced later so we don't have to submit
 		 * the buffer for IO
 		 */
-		mark_buffer_dirty(sbi->s_lvid_bh);
+		mark_buffer_dirty(NULL, sbi->s_lvid_bh);
 		sbi->s_lvid_dirty = 0;
 	}
 	mutex_unlock(&sbi->s_alloc_mutex);
diff --git a/fs/ufs/balloc.c b/fs/ufs/balloc.c
index e727ee07dbe4..3648422218dc 100644
--- a/fs/ufs/balloc.c
+++ b/fs/ufs/balloc.c
@@ -311,7 +311,7 @@ static void ufs_change_blocknr(struct inode *inode, sector_t beg,
 
 			bh->b_blocknr = newb + pos;
 			clean_bdev_bh_alias(bh);
-			mark_buffer_dirty(bh);
+			mark_buffer_dirty(NULL, bh);
 			++j;
 			bh = bh->b_this_page;
 		} while (bh != head);
@@ -333,7 +333,7 @@ static void ufs_clear_frags(struct inode *inode, sector_t beg, unsigned int n,
 		lock_buffer(bh);
 		memset(bh->b_data, 0, inode->i_sb->s_blocksize);
 		set_buffer_uptodate(bh);
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		unlock_buffer(bh);
 		if (IS_SYNC(inode) || sync)
 			sync_dirty_buffer(bh);
diff --git a/fs/ufs/ialloc.c b/fs/ufs/ialloc.c
index e1ef0f0a1353..3a6dd9eea6a9 100644
--- a/fs/ufs/ialloc.c
+++ b/fs/ufs/ialloc.c
@@ -144,7 +144,7 @@ static void ufs2_init_inodes_chunk(struct super_block *sb,
 		lock_buffer(bh);
 		memset(bh->b_data, 0, sb->s_blocksize);
 		set_buffer_uptodate(bh);
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		unlock_buffer(bh);
 		if (sb->s_flags & SB_SYNCHRONOUS)
 			sync_dirty_buffer(bh);
@@ -328,7 +328,7 @@ struct inode *ufs_new_inode(struct inode *dir, umode_t mode)
 		ktime_get_real_ts64(&ts);
 		ufs2_inode->ui_birthtime = cpu_to_fs64(sb, ts.tv_sec);
 		ufs2_inode->ui_birthnsec = cpu_to_fs32(sb, ts.tv_nsec);
-		mark_buffer_dirty(bh);
+		mark_buffer_dirty(NULL, bh);
 		unlock_buffer(bh);
 		if (sb->s_flags & SB_SYNCHRONOUS)
 			sync_dirty_buffer(bh);
diff --git a/fs/ufs/inode.c b/fs/ufs/inode.c
index fcaa60bfad49..c96630059d9e 100644
--- a/fs/ufs/inode.c
+++ b/fs/ufs/inode.c
@@ -375,7 +375,7 @@ ufs_inode_getblock(struct inode *inode, u64 ind_block,
 	if (new)
 		*new = 1;
 
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	if (IS_SYNC(inode))
 		sync_dirty_buffer(bh);
 	inode->i_ctime = current_time(inode);
@@ -829,7 +829,7 @@ static int ufs_update_inode(struct inode * inode, int do_sync)
 		ufs1_update_inode(inode, ufs_inode + ufs_inotofsbo(inode->i_ino));
 	}
 
-	mark_buffer_dirty(bh);
+	mark_buffer_dirty(NULL, bh);
 	if (do_sync)
 		sync_dirty_buffer(bh);
 	brelse (bh);
@@ -1095,7 +1095,7 @@ static int ufs_alloc_lastblock(struct inode *inode, loff_t size)
 		* if it maped to hole, it already contains zeroes
 		*/
 	       set_buffer_uptodate(bh);
-	       mark_buffer_dirty(bh);
+	       mark_buffer_dirty(NULL, bh);
 	       set_page_dirty(lastpage);
        }
 
@@ -1107,7 +1107,7 @@ static int ufs_alloc_lastblock(struct inode *inode, loff_t size)
 		       lock_buffer(bh);
 		       memset(bh->b_data, 0, sb->s_blocksize);
 		       set_buffer_uptodate(bh);
-		       mark_buffer_dirty(bh);
+		       mark_buffer_dirty(NULL, bh);
 		       unlock_buffer(bh);
 		       sync_dirty_buffer(bh);
 		       brelse(bh);
diff --git a/fs/ufs/util.c b/fs/ufs/util.c
index e8b3d6b70ca9..131f6ad2311f 100644
--- a/fs/ufs/util.c
+++ b/fs/ufs/util.c
@@ -96,7 +96,7 @@ void ubh_mark_buffer_dirty (struct ufs_buffer_head * ubh)
 	if (!ubh)
 		return;
 	for ( i = 0; i < ubh->count; i++ )
-		mark_buffer_dirty (ubh->bh[i]);
+		mark_buffer_dirty(NULL, ubh->bh[i]);
 }
 
 void ubh_mark_buffer_uptodate (struct ufs_buffer_head * ubh, int flag)
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 6c355f43b46b..5e77654f8e81 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -175,7 +175,7 @@ void buffer_check_dirty_writeback(struct page *page,
  * Declarations
  */
 
-void mark_buffer_dirty(struct buffer_head *bh);
+void mark_buffer_dirty(struct address_space *, struct buffer_head *);
 void mark_buffer_write_io_error(struct address_space *mapping,
 		struct page *page, struct buffer_head *bh);
 void touch_buffer(struct buffer_head *bh);
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 71/79] mm: add struct address_space to set_page_dirty()
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (31 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 70/79] mm: add struct address_space to mark_buffer_dirty() jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 72/79] mm: add struct address_space to set_page_dirty_lock() jglisse
                   ` (9 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Andrew Morton,
	Alexander Viro, Tejun Heo, Jan Kara, Josef Bacik, Mel Gorman

From: Jérôme Glisse <jglisse@redhat.com>

For the holy crusade to stop relying on the struct page mapping field, add
struct address_space to the arguments of set_page_dirty(). Call sites that
do not have the mapping at hand pass NULL for now. The conversion was
generated with the following Coccinelle semantic patch:

<---------------------------------------------------------------------
@@
identifier I1;
type T1;
@@
int
-set_page_dirty(T1 I1)
+set_page_dirty(struct address_space *_mapping, T1 I1)
{...}

@@
type T1;
@@
int
-set_page_dirty(T1)
+set_page_dirty(struct address_space *, T1)
;

@@
identifier I1;
type T1;
@@
int
-set_page_dirty(T1 I1)
+set_page_dirty(struct address_space *, T1)
;

@@
expression E1;
@@
-set_page_dirty(E1)
+set_page_dirty(NULL, E1)
--------------------------------------------------------------------->

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
CC: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c            |  2 +-
 drivers/gpu/drm/drm_gem.c                          |  2 +-
 drivers/gpu/drm/i915/i915_gem.c                    |  6 ++---
 drivers/gpu/drm/i915/i915_gem_fence_reg.c          |  2 +-
 drivers/gpu/drm/i915/i915_gem_userptr.c            |  2 +-
 drivers/gpu/drm/radeon/radeon_ttm.c                |  2 +-
 drivers/gpu/drm/ttm/ttm_tt.c                       |  2 +-
 drivers/infiniband/core/umem_odp.c                 |  2 +-
 drivers/misc/vmw_vmci/vmci_queue_pair.c            |  2 +-
 drivers/mtd/devices/block2mtd.c                    |  4 +--
 drivers/platform/goldfish/goldfish_pipe.c          |  2 +-
 drivers/sbus/char/oradax.c                         |  2 +-
 drivers/staging/lustre/lustre/llite/rw26.c         |  2 +-
 drivers/staging/lustre/lustre/llite/vvp_io.c       |  4 +--
 .../interface/vchiq_arm/vchiq_2835_arm.c           |  2 +-
 fs/9p/vfs_addr.c                                   |  2 +-
 fs/afs/write.c                                     |  2 +-
 fs/btrfs/extent_io.c                               |  2 +-
 fs/btrfs/file.c                                    |  2 +-
 fs/btrfs/inode.c                                   |  6 ++---
 fs/btrfs/ioctl.c                                   |  2 +-
 fs/btrfs/relocation.c                              |  2 +-
 fs/buffer.c                                        |  6 ++---
 fs/ceph/addr.c                                     |  4 +--
 fs/cifs/file.c                                     |  4 +--
 fs/exofs/dir.c                                     |  2 +-
 fs/exofs/inode.c                                   |  4 +--
 fs/f2fs/checkpoint.c                               |  4 +--
 fs/f2fs/data.c                                     |  6 ++---
 fs/f2fs/dir.c                                      | 10 ++++----
 fs/f2fs/file.c                                     | 10 ++++----
 fs/f2fs/gc.c                                       |  6 ++---
 fs/f2fs/inline.c                                   | 18 ++++++-------
 fs/f2fs/inode.c                                    |  6 ++---
 fs/f2fs/node.c                                     | 20 +++++++--------
 fs/f2fs/node.h                                     |  2 +-
 fs/f2fs/recovery.c                                 |  2 +-
 fs/f2fs/segment.c                                  | 12 ++++-----
 fs/f2fs/xattr.c                                    |  6 ++---
 fs/fuse/file.c                                     |  2 +-
 fs/gfs2/file.c                                     |  2 +-
 fs/hfs/bnode.c                                     | 12 ++++-----
 fs/hfs/btree.c                                     |  6 ++---
 fs/hfsplus/bitmap.c                                |  8 +++---
 fs/hfsplus/bnode.c                                 | 30 +++++++++++-----------
 fs/hfsplus/btree.c                                 |  6 ++---
 fs/hfsplus/xattr.c                                 |  2 +-
 fs/iomap.c                                         |  2 +-
 fs/jfs/jfs_metapage.c                              |  4 +--
 fs/libfs.c                                         |  2 +-
 fs/nfs/direct.c                                    |  2 +-
 fs/ntfs/attrib.c                                   |  8 +++---
 fs/ntfs/bitmap.c                                   |  4 +--
 fs/ntfs/file.c                                     |  2 +-
 fs/ntfs/lcnalloc.c                                 |  4 +--
 fs/ntfs/mft.c                                      |  4 +--
 fs/ntfs/usnjrnl.c                                  |  2 +-
 fs/udf/file.c                                      |  2 +-
 fs/ufs/inode.c                                     |  2 +-
 include/linux/mm.h                                 |  2 +-
 mm/filemap.c                                       |  2 +-
 mm/gup.c                                           |  2 +-
 mm/huge_memory.c                                   |  2 +-
 mm/hugetlb.c                                       |  2 +-
 mm/khugepaged.c                                    |  2 +-
 mm/ksm.c                                           |  2 +-
 mm/memory.c                                        |  4 +--
 mm/page-writeback.c                                |  6 ++---
 mm/page_io.c                                       |  6 ++---
 mm/rmap.c                                          |  2 +-
 mm/shmem.c                                         | 18 ++++++-------
 mm/swap_state.c                                    |  2 +-
 mm/truncate.c                                      |  2 +-
 net/rds/ib_rdma.c                                  |  2 +-
 net/rds/rdma.c                                     |  4 +--
 75 files changed, 172 insertions(+), 172 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index e4bb435e614b..9602a7dfbc7b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -769,7 +769,7 @@ void amdgpu_ttm_tt_mark_user_pages(struct ttm_tt *ttm)
 			continue;
 
 		if (!(gtt->userflags & AMDGPU_GEM_USERPTR_READONLY))
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 
 		mark_page_accessed(page);
 	}
diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 01f8d9481211..b0ef6cd6ce7a 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -607,7 +607,7 @@ void drm_gem_put_pages(struct drm_gem_object *obj, struct page **pages,
 
 	for (i = 0; i < npages; i++) {
 		if (dirty)
-			set_page_dirty(pages[i]);
+			set_page_dirty(NULL, pages[i]);
 
 		if (accessed)
 			mark_page_accessed(pages[i]);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 6ff5d655c202..4ad397254c42 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -288,7 +288,7 @@ i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj,
 			memcpy(dst, vaddr, PAGE_SIZE);
 			kunmap_atomic(dst);
 
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 			if (obj->mm.madv == I915_MADV_WILLNEED)
 				mark_page_accessed(page);
 			put_page(page);
@@ -2279,7 +2279,7 @@ i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj,
 
 	for_each_sgt_page(page, sgt_iter, pages) {
 		if (obj->mm.dirty)
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 
 		if (obj->mm.madv == I915_MADV_WILLNEED)
 			mark_page_accessed(page);
@@ -5788,7 +5788,7 @@ i915_gem_object_get_dirty_page(struct drm_i915_gem_object *obj,
 
 	page = i915_gem_object_get_page(obj, n);
 	if (!obj->mm.dirty)
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 
 	return page;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_fence_reg.c b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
index 012250f25255..e9a75a3d588c 100644
--- a/drivers/gpu/drm/i915/i915_gem_fence_reg.c
+++ b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
@@ -760,7 +760,7 @@ i915_gem_object_do_bit_17_swizzle(struct drm_i915_gem_object *obj,
 		char new_bit_17 = page_to_phys(page) >> 17;
 		if ((new_bit_17 & 0x1) != (test_bit(i, obj->bit_17) != 0)) {
 			i915_gem_swizzle_page(page);
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 		}
 		i++;
 	}
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 382a77a1097e..9d29b00055d7 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -685,7 +685,7 @@ i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj,
 
 	for_each_sgt_page(page, sgt_iter, pages) {
 		if (obj->mm.dirty)
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 
 		mark_page_accessed(page);
 		put_page(page);
diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index a0a839bc39bf..a7f156941448 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -621,7 +621,7 @@ static void radeon_ttm_tt_unpin_userptr(struct ttm_tt *ttm)
 	for_each_sg_page(ttm->sg->sgl, &sg_iter, ttm->sg->nents, 0) {
 		struct page *page = sg_page_iter_page(&sg_iter);
 		if (!(gtt->userflags & RADEON_GEM_USERPTR_READONLY))
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 
 		mark_page_accessed(page);
 		put_page(page);
diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index 5a046a3c543a..3138fc73c06d 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -359,7 +359,7 @@ int ttm_tt_swapout(struct ttm_tt *ttm, struct file *persistent_swap_storage)
 			goto out_err;
 		}
 		copy_highpage(to_page, from_page);
-		set_page_dirty(to_page);
+		set_page_dirty(NULL, to_page);
 		mark_page_accessed(to_page);
 		put_page(to_page);
 	}
diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index 2aadf5813a40..6a8077cbfc61 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -774,7 +774,7 @@ void ib_umem_odp_unmap_dma_pages(struct ib_umem *umem, u64 virt,
 				 * continuing and allowing the page mapping to
 				 * be removed.
 				 */
-				set_page_dirty(head_page);
+				set_page_dirty(NULL, head_page);
 			}
 			/* on demand pinning support */
 			if (!umem->context->invalidate_range)
diff --git a/drivers/misc/vmw_vmci/vmci_queue_pair.c b/drivers/misc/vmw_vmci/vmci_queue_pair.c
index 0339538c182d..4b1374ae5375 100644
--- a/drivers/misc/vmw_vmci/vmci_queue_pair.c
+++ b/drivers/misc/vmw_vmci/vmci_queue_pair.c
@@ -643,7 +643,7 @@ static void qp_release_pages(struct page **pages,
 
 	for (i = 0; i < num_pages; i++) {
 		if (dirty)
-			set_page_dirty(pages[i]);
+			set_page_dirty(NULL, pages[i]);
 
 		put_page(pages[i]);
 		pages[i] = NULL;
diff --git a/drivers/mtd/devices/block2mtd.c b/drivers/mtd/devices/block2mtd.c
index 62fd6905c648..1581a00cf770 100644
--- a/drivers/mtd/devices/block2mtd.c
+++ b/drivers/mtd/devices/block2mtd.c
@@ -69,7 +69,7 @@ static int _block2mtd_erase(struct block2mtd_dev *dev, loff_t to, size_t len)
 			if (*p != -1UL) {
 				lock_page(page);
 				memset(page_address(page), 0xff, PAGE_SIZE);
-				set_page_dirty(page);
+				set_page_dirty(NULL, page);
 				unlock_page(page);
 				balance_dirty_pages_ratelimited(mapping);
 				break;
@@ -160,7 +160,7 @@ static int _block2mtd_write(struct block2mtd_dev *dev, const u_char *buf,
 		if (memcmp(page_address(page)+offset, buf, cpylen)) {
 			lock_page(page);
 			memcpy(page_address(page) + offset, buf, cpylen);
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 			unlock_page(page);
 			balance_dirty_pages_ratelimited(mapping);
 		}
diff --git a/drivers/platform/goldfish/goldfish_pipe.c b/drivers/platform/goldfish/goldfish_pipe.c
index 3e32a4c14d5f..91b9a2045697 100644
--- a/drivers/platform/goldfish/goldfish_pipe.c
+++ b/drivers/platform/goldfish/goldfish_pipe.c
@@ -338,7 +338,7 @@ static void release_user_pages(struct page **pages, int pages_count,
 
 	for (i = 0; i < pages_count; i++) {
 		if (!is_write && consumed_size > 0)
-			set_page_dirty(pages[i]);
+			set_page_dirty(NULL, pages[i]);
 		put_page(pages[i]);
 	}
 }
diff --git a/drivers/sbus/char/oradax.c b/drivers/sbus/char/oradax.c
index 03dc04739225..43798fa51061 100644
--- a/drivers/sbus/char/oradax.c
+++ b/drivers/sbus/char/oradax.c
@@ -423,7 +423,7 @@ static void dax_unlock_pages(struct dax_ctx *ctx, int ccb_index, int nelem)
 			if (p) {
 				dax_dbg("freeing page %p", p);
 				if (j == OUT)
-					set_page_dirty(p);
+					set_page_dirty(NULL, p);
 				put_page(p);
 				ctx->pages[i][j] = NULL;
 			}
diff --git a/drivers/staging/lustre/lustre/llite/rw26.c b/drivers/staging/lustre/lustre/llite/rw26.c
index 366ba0afbd0e..969f4dad2f82 100644
--- a/drivers/staging/lustre/lustre/llite/rw26.c
+++ b/drivers/staging/lustre/lustre/llite/rw26.c
@@ -237,7 +237,7 @@ ssize_t ll_direct_rw_pages(const struct lu_env *env, struct cl_io *io,
 			 * cl_io_submit()->...->vvp_page_prep_write().
 			 */
 			if (rw == WRITE)
-				set_page_dirty(vmpage);
+				set_page_dirty(NULL, vmpage);
 
 			if (rw == READ) {
 				/* do not issue the page for read, since it
diff --git a/drivers/staging/lustre/lustre/llite/vvp_io.c b/drivers/staging/lustre/lustre/llite/vvp_io.c
index aaa06ba38b4c..301fd4d10499 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_io.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_io.c
@@ -797,7 +797,7 @@ static void write_commit_callback(const struct lu_env *env, struct cl_io *io,
 	struct page *vmpage = page->cp_vmpage;
 
 	SetPageUptodate(vmpage);
-	set_page_dirty(vmpage);
+	set_page_dirty(NULL, vmpage);
 
 	cl_page_disown(env, io, page);
 
@@ -1055,7 +1055,7 @@ static int vvp_io_kernel_fault(struct vvp_fault_io *cfio)
 static void mkwrite_commit_callback(const struct lu_env *env, struct cl_io *io,
 				    struct cl_page *page)
 {
-	set_page_dirty(page->cp_vmpage);
+	set_page_dirty(NULL, page->cp_vmpage);
 }
 
 static int vvp_io_fault_start(const struct lu_env *env,
diff --git a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
index b59ef14890aa..1846ae06ce50 100644
--- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
+++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
@@ -636,7 +636,7 @@ free_pagelist(struct vchiq_pagelist_info *pagelistinfo,
 		unsigned int i;
 
 		for (i = 0; i < num_pages; i++)
-			set_page_dirty(pages[i]);
+			set_page_dirty(NULL, pages[i]);
 	}
 
 	cleanup_pagelistinfo(pagelistinfo);
diff --git a/fs/9p/vfs_addr.c b/fs/9p/vfs_addr.c
index 1f4d49e7f811..835bd52f6215 100644
--- a/fs/9p/vfs_addr.c
+++ b/fs/9p/vfs_addr.c
@@ -342,7 +342,7 @@ static int v9fs_write_end(struct file *filp, struct address_space *mapping,
 		inode_add_bytes(inode, last_pos - inode->i_size);
 		i_size_write(inode, last_pos);
 	}
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 out:
 	unlock_page(page);
 	put_page(page);
diff --git a/fs/afs/write.c b/fs/afs/write.c
index 9c5bdad0bd72..20d5a3388012 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -203,7 +203,7 @@ int afs_write_end(struct file *file, struct address_space *mapping,
 		SetPageUptodate(page);
 	}
 
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	if (PageDirty(page))
 		_debug("dirtied");
 	ret = copied;
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 3c145b353873..5b12578ca5fb 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -5198,7 +5198,7 @@ int set_extent_buffer_dirty(struct extent_buffer *eb)
 	WARN_ON(!test_bit(EXTENT_BUFFER_TREE_REF, &eb->bflags));
 
 	for (i = 0; i < num_pages; i++)
-		set_page_dirty(eb->pages[i]);
+		set_page_dirty(NULL, eb->pages[i]);
 	return was_dirty;
 }
 
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 989735cd751c..2630be84bdca 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -574,7 +574,7 @@ int btrfs_dirty_pages(struct inode *inode, struct page **pages,
 		struct page *p = pages[i];
 		SetPageUptodate(p);
 		ClearPageChecked(p);
-		set_page_dirty(p);
+		set_page_dirty(NULL, p);
 	}
 
 	/*
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 968640312537..e6fdd6095579 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2134,7 +2134,7 @@ static void btrfs_writepage_fixup_worker(struct btrfs_work *work)
 	}
 
 	ClearPageChecked(page);
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	btrfs_delalloc_release_extents(BTRFS_I(inode), PAGE_SIZE);
 out:
 	unlock_extent_cached(&BTRFS_I(inode)->io_tree, page_start, page_end,
@@ -4869,7 +4869,7 @@ int btrfs_truncate_block(struct inode *inode, loff_t from, loff_t len,
 		kunmap(page);
 	}
 	ClearPageChecked(page);
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	unlock_extent_cached(io_tree, block_start, block_end, &cached_state);
 
 out_unlock:
@@ -9090,7 +9090,7 @@ int btrfs_page_mkwrite(struct vm_fault *vmf)
 		kunmap(page);
 	}
 	ClearPageChecked(page);
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	SetPageUptodate(page);
 
 	BTRFS_I(inode)->last_trans = fs_info->generation;
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index c57e9ce8204d..3ec8d50799ff 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1211,7 +1211,7 @@ static int cluster_pages_for_defrag(struct inode *inode,
 		clear_page_dirty_for_io(pages[i]);
 		ClearPageChecked(pages[i]);
 		set_page_extent_mapped(pages[i]);
-		set_page_dirty(pages[i]);
+		set_page_dirty(NULL, pages[i]);
 		unlock_page(pages[i]);
 		put_page(pages[i]);
 	}
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 6a530c59b519..454c1dd523ea 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -3284,7 +3284,7 @@ static int relocate_file_extent_cluster(struct inode *inode,
 			goto out;
 
 		}
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 
 		unlock_extent(&BTRFS_I(inode)->io_tree,
 			      page_start, page_end);
diff --git a/fs/buffer.c b/fs/buffer.c
index 24872b077269..343b8b3837e7 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2517,7 +2517,7 @@ int block_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf,
 
 	if (unlikely(ret < 0))
 		goto out_unlock;
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	wait_for_stable_page(page);
 	return 0;
 out_unlock:
@@ -2724,7 +2724,7 @@ int nobh_write_end(struct file *file, struct address_space *mapping,
 					copied, page, fsdata);
 
 	SetPageUptodate(page);
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	if (pos+copied > inode->i_size) {
 		i_size_write(inode, pos+copied);
 		mark_inode_dirty(inode);
@@ -2861,7 +2861,7 @@ int nobh_truncate_page(struct address_space *mapping,
 			goto has_buffers;
 	}
 	zero_user(page, offset, length);
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	err = 0;
 
 unlock:
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index c274d8a32479..8497c198e76e 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -1382,7 +1382,7 @@ static int ceph_write_end(struct file *file, struct address_space *mapping,
 	if (pos+copied > i_size_read(inode))
 		check_cap = ceph_inode_set_size(inode, pos+copied);
 
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 
 out:
 	unlock_page(page);
@@ -1595,7 +1595,7 @@ static int ceph_page_mkwrite(struct vm_fault *vmf)
 		ret = ceph_update_writeable_page(vma->vm_file, off, len, page);
 		if (ret >= 0) {
 			/* success.  we'll keep the page locked. */
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 			ret = VM_FAULT_LOCKED;
 		}
 	} while (ret == -EAGAIN);
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 017fe16ae993..d460feb43595 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -2297,7 +2297,7 @@ static int cifs_write_end(struct file *file, struct address_space *mapping,
 	} else {
 		rc = copied;
 		pos += copied;
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 	}
 
 	if (rc > 0) {
@@ -3220,7 +3220,7 @@ collect_uncached_read_data(struct cifs_aio_ctx *ctx)
 
 	for (i = 0; i < ctx->npages; i++) {
 		if (ctx->should_dirty)
-			set_page_dirty(ctx->bv[i].bv_page);
+			set_page_dirty(NULL, ctx->bv[i].bv_page);
 		put_page(ctx->bv[i].bv_page);
 	}
 
diff --git a/fs/exofs/dir.c b/fs/exofs/dir.c
index f0138674c1ed..e07ec3f0dfc3 100644
--- a/fs/exofs/dir.c
+++ b/fs/exofs/dir.c
@@ -70,7 +70,7 @@ static int exofs_commit_chunk(struct page *page, loff_t pos, unsigned len)
 		i_size_write(dir, pos+len);
 		mark_inode_dirty(dir);
 	}
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 
 	if (IS_DIRSYNC(dir))
 		err = write_one_page(page);
diff --git a/fs/exofs/inode.c b/fs/exofs/inode.c
index 54d6b7dbd4e7..137f1d8c13e8 100644
--- a/fs/exofs/inode.c
+++ b/fs/exofs/inode.c
@@ -832,7 +832,7 @@ static int exofs_writepages(struct address_space *mapping,
 			struct page *page = pcol.pages[i];
 
 			end_page_writeback(page);
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 			unlock_page(page);
 		}
 	}
@@ -931,7 +931,7 @@ static int exofs_write_end(struct file *file, struct address_space *mapping,
 		i_size_write(inode, last_pos);
 		mark_inode_dirty(inode);
 	}
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 out:
 	unlock_page(page);
 	put_page(page);
diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index b218fcacd395..d859c5682a1e 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -708,7 +708,7 @@ static void write_orphan_inodes(struct f2fs_sb_info *sbi, block_t start_blk)
 			orphan_blk->blk_addr = cpu_to_le16(index);
 			orphan_blk->blk_count = cpu_to_le16(orphan_blocks);
 			orphan_blk->entry_count = cpu_to_le32(nentries);
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 			f2fs_put_page(page, 1);
 			index++;
 			nentries = 0;
@@ -720,7 +720,7 @@ static void write_orphan_inodes(struct f2fs_sb_info *sbi, block_t start_blk)
 		orphan_blk->blk_addr = cpu_to_le16(index);
 		orphan_blk->blk_count = cpu_to_le16(orphan_blocks);
 		orphan_blk->entry_count = cpu_to_le32(nentries);
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 		f2fs_put_page(page, 1);
 	}
 }
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index c1a8dd623444..4e6894169d0e 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -540,7 +540,7 @@ void set_data_blkaddr(struct dnode_of_data *dn)
 {
 	f2fs_wait_on_page_writeback(dn->node_page, NODE, true);
 	__set_data_blkaddr(dn);
-	if (set_page_dirty(dn->node_page))
+	if (set_page_dirty(NULL, dn->node_page))
 		dn->node_changed = true;
 }
 
@@ -580,7 +580,7 @@ int reserve_new_blocks(struct dnode_of_data *dn, blkcnt_t count)
 		}
 	}
 
-	if (set_page_dirty(dn->node_page))
+	if (set_page_dirty(NULL, dn->node_page))
 		dn->node_changed = true;
 	return 0;
 }
@@ -2261,7 +2261,7 @@ static int f2fs_write_end(struct file *file,
 	if (!copied)
 		goto unlock_out;
 
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 
 	if (pos + copied > i_size_read(inode))
 		f2fs_i_size_write(inode, pos + copied);
diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index f00b5ed8c011..a6d560f57933 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -303,7 +303,7 @@ void f2fs_set_link(struct inode *dir, struct f2fs_dir_entry *de,
 	de->ino = cpu_to_le32(inode->i_ino);
 	set_de_type(de, inode->i_mode);
 	f2fs_dentry_kunmap(dir, page);
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 
 	dir->i_mtime = dir->i_ctime = current_time(dir);
 	f2fs_mark_inode_dirty_sync(dir, false);
@@ -320,7 +320,7 @@ static void init_dent_inode(const struct qstr *name, struct page *ipage)
 	ri = F2FS_INODE(ipage);
 	ri->i_namelen = cpu_to_le32(name->len);
 	memcpy(ri->i_name, name->name, name->len);
-	set_page_dirty(ipage);
+	set_page_dirty(NULL, ipage);
 }
 
 void do_make_empty_dir(struct inode *inode, struct inode *parent,
@@ -357,7 +357,7 @@ static int make_empty_dir(struct inode *inode,
 
 	kunmap_atomic(dentry_blk);
 
-	set_page_dirty(dentry_page);
+	set_page_dirty(NULL, dentry_page);
 	f2fs_put_page(dentry_page, 1);
 	return 0;
 }
@@ -576,7 +576,7 @@ int f2fs_add_regular_entry(struct inode *dir, const struct qstr *new_name,
 	make_dentry_ptr_block(NULL, &d, dentry_blk);
 	f2fs_update_dentry(ino, mode, &d, new_name, dentry_hash, bit_pos);
 
-	set_page_dirty(dentry_page);
+	set_page_dirty(NULL, dentry_page);
 
 	if (inode) {
 		f2fs_i_pino_write(inode, dir->i_ino);
@@ -731,7 +731,7 @@ void f2fs_delete_entry(struct f2fs_dir_entry *dentry, struct page *page,
 			NR_DENTRY_IN_BLOCK,
 			0);
 	kunmap(page); /* kunmap - pair of f2fs_find_entry */
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 
 	dir->i_ctime = dir->i_mtime = current_time(dir);
 	f2fs_mark_inode_dirty_sync(dir, false);
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 5e9ac31240bb..d4f253a4cb3c 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -99,7 +99,7 @@ static int f2fs_vm_page_mkwrite(struct vm_fault *vmf)
 		offset = i_size_read(inode) & ~PAGE_MASK;
 		zero_user_segment(page, offset, PAGE_SIZE);
 	}
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	if (!PageUptodate(page))
 		SetPageUptodate(page);
 
@@ -561,7 +561,7 @@ static int truncate_partial_data_page(struct inode *inode, u64 from,
 	/* An encrypted inode should have a key and truncate the last page. */
 	f2fs_bug_on(F2FS_I_SB(inode), cache_only && f2fs_encrypted_inode(inode));
 	if (!cache_only)
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 	f2fs_put_page(page, 1);
 	return 0;
 }
@@ -855,7 +855,7 @@ static int fill_zero(struct inode *inode, pgoff_t index,
 
 	f2fs_wait_on_page_writeback(page, DATA, true);
 	zero_user(page, start, len);
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	f2fs_put_page(page, 1);
 	return 0;
 }
@@ -1084,7 +1084,7 @@ static int __clone_blkaddrs(struct inode *src_inode, struct inode *dst_inode,
 				return PTR_ERR(pdst);
 			}
 			f2fs_copy_page(psrc, pdst);
-			set_page_dirty(pdst);
+			set_page_dirty(NULL, pdst);
 			f2fs_put_page(pdst, 1);
 			f2fs_put_page(psrc, 1);
 
@@ -2205,7 +2205,7 @@ static int f2fs_defragment_range(struct f2fs_sb_info *sbi,
 				goto clear_out;
 			}
 
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 			f2fs_put_page(page, 1);
 
 			idx++;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index aa720cc44509..86e387af01ac 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -678,7 +678,7 @@ static void move_data_block(struct inode *inode, block_t bidx,
 		goto put_page_out;
 	}
 
-	set_page_dirty(fio.encrypted_page);
+	set_page_dirty(NULL, fio.encrypted_page);
 	f2fs_wait_on_page_writeback(fio.encrypted_page, DATA, true);
 	if (clear_page_dirty_for_io(fio.encrypted_page))
 		dec_page_count(fio.sbi, F2FS_DIRTY_META);
@@ -739,7 +739,7 @@ static void move_data_page(struct inode *inode, block_t bidx, int gc_type,
 	if (gc_type == BG_GC) {
 		if (PageWriteback(page))
 			goto out;
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 		set_cold_data(page);
 	} else {
 		struct f2fs_io_info fio = {
@@ -759,7 +759,7 @@ static void move_data_page(struct inode *inode, block_t bidx, int gc_type,
 		int err;
 
 retry:
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 		f2fs_wait_on_page_writeback(page, DATA, true);
 		if (clear_page_dirty_for_io(page)) {
 			inode_dec_dirty_pages(inode);
diff --git a/fs/f2fs/inline.c b/fs/f2fs/inline.c
index 90e38d8ea688..b25425068168 100644
--- a/fs/f2fs/inline.c
+++ b/fs/f2fs/inline.c
@@ -75,7 +75,7 @@ void truncate_inline_inode(struct inode *inode, struct page *ipage, u64 from)
 
 	f2fs_wait_on_page_writeback(ipage, NODE, true);
 	memset(addr + from, 0, MAX_INLINE_DATA(inode) - from);
-	set_page_dirty(ipage);
+	set_page_dirty(NULL, ipage);
 
 	if (from == 0)
 		clear_inode_flag(inode, FI_DATA_EXIST);
@@ -132,7 +132,7 @@ int f2fs_convert_inline_page(struct dnode_of_data *dn, struct page *page)
 	f2fs_bug_on(F2FS_P_SB(page), PageWriteback(page));
 
 	read_inline_data(page, dn->inode_page);
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 
 	/* clear dirty state */
 	dirty = clear_page_dirty_for_io(page);
@@ -224,7 +224,7 @@ int f2fs_write_inline_data(struct inode *inode, struct page *page)
 	dst_addr = inline_data_addr(inode, dn.inode_page);
 	memcpy(dst_addr, src_addr, MAX_INLINE_DATA(inode));
 	kunmap_atomic(src_addr);
-	set_page_dirty(dn.inode_page);
+	set_page_dirty(NULL, dn.inode_page);
 
 	spin_lock_irqsave(&mapping->tree_lock, flags);
 	radix_tree_tag_clear(&mapping->page_tree, page_index(page),
@@ -272,7 +272,7 @@ bool recover_inline_data(struct inode *inode, struct page *npage)
 		set_inode_flag(inode, FI_INLINE_DATA);
 		set_inode_flag(inode, FI_DATA_EXIST);
 
-		set_page_dirty(ipage);
+		set_page_dirty(NULL, ipage);
 		f2fs_put_page(ipage, 1);
 		return true;
 	}
@@ -334,7 +334,7 @@ int make_empty_inline_dir(struct inode *inode, struct inode *parent,
 	make_dentry_ptr_inline(inode, &d, inline_dentry);
 	do_make_empty_dir(inode, parent, &d);
 
-	set_page_dirty(ipage);
+	set_page_dirty(NULL, ipage);
 
 	/* update i_size to MAX_INLINE_DATA */
 	if (i_size_read(inode) < MAX_INLINE_DATA(inode))
@@ -389,7 +389,7 @@ static int f2fs_move_inline_dirents(struct inode *dir, struct page *ipage,
 	kunmap_atomic(dentry_blk);
 	if (!PageUptodate(page))
 		SetPageUptodate(page);
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 
 	/* clear inline dir and flag after data writeback */
 	truncate_inline_inode(dir, ipage, 0);
@@ -485,7 +485,7 @@ static int f2fs_move_rehashed_dirents(struct inode *dir, struct page *ipage,
 	memcpy(inline_dentry, backup_dentry, MAX_INLINE_DATA(dir));
 	f2fs_i_depth_write(dir, 0);
 	f2fs_i_size_write(dir, MAX_INLINE_DATA(dir));
-	set_page_dirty(ipage);
+	set_page_dirty(NULL, ipage);
 	f2fs_put_page(ipage, 1);
 
 	kfree(backup_dentry);
@@ -546,7 +546,7 @@ int f2fs_add_inline_entry(struct inode *dir, const struct qstr *new_name,
 	name_hash = f2fs_dentry_hash(new_name, NULL);
 	f2fs_update_dentry(ino, mode, &d, new_name, name_hash, bit_pos);
 
-	set_page_dirty(ipage);
+	set_page_dirty(NULL, ipage);
 
 	/* we don't need to mark_inode_dirty now */
 	if (inode) {
@@ -582,7 +582,7 @@ void f2fs_delete_inline_entry(struct f2fs_dir_entry *dentry, struct page *page,
 	for (i = 0; i < slots; i++)
 		__clear_bit_le(bit_pos + i, d.bitmap);
 
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	f2fs_put_page(page, 1);
 
 	dir->i_ctime = dir->i_mtime = current_time(dir);
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index 205add3d0f3a..920de42398f9 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -107,7 +107,7 @@ static void __recover_inline_status(struct inode *inode, struct page *ipage)
 
 			set_inode_flag(inode, FI_DATA_EXIST);
 			set_raw_inline(inode, F2FS_INODE(ipage));
-			set_page_dirty(ipage);
+			set_page_dirty(NULL, ipage);
 			return;
 		}
 	}
@@ -231,7 +231,7 @@ static int do_read_inode(struct inode *inode)
 	fi->i_dir_level = ri->i_dir_level;
 
 	if (f2fs_init_extent_tree(inode, &ri->i_ext))
-		set_page_dirty(node_page);
+		set_page_dirty(NULL, node_page);
 
 	get_inline_info(inode, ri);
 
@@ -375,7 +375,7 @@ void update_inode(struct inode *inode, struct page *node_page)
 	struct extent_tree *et = F2FS_I(inode)->extent_tree;
 
 	f2fs_wait_on_page_writeback(node_page, NODE, true);
-	set_page_dirty(node_page);
+	set_page_dirty(NULL, node_page);
 
 	f2fs_inode_synced(inode);
 
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 67737885cad5..e4e5798e9271 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -130,7 +130,7 @@ static struct page *get_next_nat_page(struct f2fs_sb_info *sbi, nid_t nid)
 	src_addr = page_address(src_page);
 	dst_addr = page_address(dst_page);
 	memcpy(dst_addr, src_addr, PAGE_SIZE);
-	set_page_dirty(dst_page);
+	set_page_dirty(NULL, dst_page);
 	f2fs_put_page(src_page, 1);
 
 	set_to_next_nat(nm_i, nid);
@@ -966,7 +966,7 @@ int truncate_inode_blocks(struct inode *inode, pgoff_t from)
 			BUG_ON(page->mapping != NODE_MAPPING(sbi));
 			f2fs_wait_on_page_writeback(page, NODE, true);
 			ri->i_nid[offset[0] - NODE_DIR1_BLOCK] = 0;
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 			unlock_page(page);
 		}
 		offset[1] = 0;
@@ -1079,7 +1079,7 @@ struct page *new_node_page(struct dnode_of_data *dn, unsigned int ofs)
 	set_cold_node(dn->inode, page);
 	if (!PageUptodate(page))
 		SetPageUptodate(page);
-	if (set_page_dirty(page))
+	if (set_page_dirty(NULL, page))
 		dn->node_changed = true;
 
 	if (f2fs_has_xattr_block(ofs))
@@ -1253,7 +1253,7 @@ static void flush_inline_data(struct f2fs_sb_info *sbi, nid_t ino)
 	inode_dec_dirty_pages(inode);
 	remove_dirty_inode(inode);
 	if (ret)
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 page_out:
 	f2fs_put_page(page, 1);
 iput_out:
@@ -1412,7 +1412,7 @@ void move_node_page(struct page *node_page, int gc_type)
 			.for_reclaim = 0,
 		};
 
-		set_page_dirty(node_page);
+		set_page_dirty(NULL, node_page);
 		f2fs_wait_on_page_writeback(node_page, NODE, true);
 
 		f2fs_bug_on(F2FS_P_SB(node_page), PageWriteback(node_page));
@@ -1426,7 +1426,7 @@ void move_node_page(struct page *node_page, int gc_type)
 	} else {
 		/* set page dirty and write it */
 		if (!PageWriteback(node_page))
-			set_page_dirty(node_page);
+			set_page_dirty(NULL, node_page);
 	}
 out_page:
 	unlock_page(node_page);
@@ -1514,7 +1514,7 @@ int fsync_node_pages(struct f2fs_sb_info *sbi, struct inode *inode,
 				}
 				/*  may be written by other thread */
 				if (!PageDirty(page))
-					set_page_dirty(page);
+					set_page_dirty(NULL, page);
 			}
 
 			if (!clear_page_dirty_for_io(page))
@@ -1550,7 +1550,7 @@ int fsync_node_pages(struct f2fs_sb_info *sbi, struct inode *inode,
 					ino, last_page->index);
 		lock_page(last_page);
 		f2fs_wait_on_page_writeback(last_page, NODE, true);
-		set_page_dirty(last_page);
+		set_page_dirty(NULL, last_page);
 		unlock_page(last_page);
 		goto retry;
 	}
@@ -2263,7 +2263,7 @@ int recover_xattr_data(struct inode *inode, struct page *page)
 	/* 3: update and set xattr node page dirty */
 	memcpy(F2FS_NODE(xpage), F2FS_NODE(page), VALID_XATTR_BLOCK_SIZE);
 
-	set_page_dirty(xpage);
+	set_page_dirty(NULL, xpage);
 	f2fs_put_page(xpage, 1);
 
 	return 0;
@@ -2324,7 +2324,7 @@ int recover_inode_page(struct f2fs_sb_info *sbi, struct page *page)
 		WARN_ON(1);
 	set_node_addr(sbi, &new_ni, NEW_ADDR, false);
 	inc_valid_inode_count(sbi);
-	set_page_dirty(ipage);
+	set_page_dirty(NULL, ipage);
 	f2fs_put_page(ipage, 1);
 	return 0;
 }
diff --git a/fs/f2fs/node.h b/fs/f2fs/node.h
index 081ef0d672bf..6945269e35ae 100644
--- a/fs/f2fs/node.h
+++ b/fs/f2fs/node.h
@@ -364,7 +364,7 @@ static inline int set_nid(struct page *p, int off, nid_t nid, bool i)
 		rn->i.i_nid[off - NODE_DIR1_BLOCK] = cpu_to_le32(nid);
 	else
 		rn->in.nid[off] = cpu_to_le32(nid);
-	return set_page_dirty(p);
+	return set_page_dirty(NULL, p);
 }
 
 static inline nid_t get_nid(struct page *p, int off, bool i)
diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
index 337f3363f48f..d29eb2bda530 100644
--- a/fs/f2fs/recovery.c
+++ b/fs/f2fs/recovery.c
@@ -529,7 +529,7 @@ static int do_recover_data(struct f2fs_sb_info *sbi, struct inode *inode,
 	copy_node_footer(dn.node_page, page);
 	fill_node_footer(dn.node_page, dn.nid, ni.ino,
 					ofs_of_node(page), false);
-	set_page_dirty(dn.node_page);
+	set_page_dirty(NULL, dn.node_page);
 err:
 	f2fs_put_dnode(&dn);
 out:
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index b16a8e6625aa..e188e241e4c2 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -367,7 +367,7 @@ static int __commit_inmem_pages(struct inode *inode,
 		if (page->mapping == inode->i_mapping) {
 			trace_f2fs_commit_inmem_page(page, INMEM);
 
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 			f2fs_wait_on_page_writeback(page, DATA, true);
 			if (clear_page_dirty_for_io(page)) {
 				inode_dec_dirty_pages(inode);
@@ -2002,7 +2002,7 @@ void update_meta_page(struct f2fs_sb_info *sbi, void *src, block_t blk_addr)
 	struct page *page = grab_meta_page(sbi, blk_addr);
 
 	memcpy(page_address(page), src, PAGE_SIZE);
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	f2fs_put_page(page, 1);
 }
 
@@ -2033,7 +2033,7 @@ static void write_current_sum_page(struct f2fs_sb_info *sbi,
 
 	mutex_unlock(&curseg->curseg_mutex);
 
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	f2fs_put_page(page, 1);
 }
 
@@ -3041,13 +3041,13 @@ static void write_compacted_summaries(struct f2fs_sb_info *sbi, block_t blkaddr)
 							SUM_FOOTER_SIZE)
 				continue;
 
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 			f2fs_put_page(page, 1);
 			page = NULL;
 		}
 	}
 	if (page) {
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 		f2fs_put_page(page, 1);
 	}
 }
@@ -3119,7 +3119,7 @@ static struct page *get_next_sit_page(struct f2fs_sb_info *sbi,
 	page = grab_meta_page(sbi, dst_off);
 	seg_info_to_sit_page(sbi, page, start);
 
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	set_to_next_sit(sit_i, start);
 
 	return page;
diff --git a/fs/f2fs/xattr.c b/fs/f2fs/xattr.c
index ae2dfa709f5d..9532139fa223 100644
--- a/fs/f2fs/xattr.c
+++ b/fs/f2fs/xattr.c
@@ -424,7 +424,7 @@ static inline int write_all_xattrs(struct inode *inode, __u32 hsize,
 				return err;
 			}
 			memcpy(inline_addr, txattr_addr, inline_size);
-			set_page_dirty(ipage ? ipage : in_page);
+			set_page_dirty(NULL, ipage ? ipage : in_page);
 			goto in_page_out;
 		}
 	}
@@ -457,8 +457,8 @@ static inline int write_all_xattrs(struct inode *inode, __u32 hsize,
 	memcpy(xattr_addr, txattr_addr + inline_size, VALID_XATTR_BLOCK_SIZE);
 
 	if (inline_size)
-		set_page_dirty(ipage ? ipage : in_page);
-	set_page_dirty(xpage);
+		set_page_dirty(NULL, ipage ? ipage : in_page);
+	set_page_dirty(NULL, xpage);
 
 	f2fs_put_page(xpage, 1);
 in_page_out:
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index e63be7831f4d..8a4a84f3657a 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -2007,7 +2007,7 @@ static int fuse_write_end(struct file *file, struct address_space *mapping,
 	}
 
 	fuse_write_update_size(inode, pos + copied);
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 
 unlock:
 	unlock_page(page);
diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index 2c4584deb077..bcb75335c711 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -488,7 +488,7 @@ static int gfs2_page_mkwrite(struct vm_fault *vmf)
 out_uninit:
 	gfs2_holder_uninit(&gh);
 	if (ret == 0) {
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 		wait_for_stable_page(page);
 	}
 out:
diff --git a/fs/hfs/bnode.c b/fs/hfs/bnode.c
index b63a4df7327b..9de3a2f9796d 100644
--- a/fs/hfs/bnode.c
+++ b/fs/hfs/bnode.c
@@ -67,7 +67,7 @@ void hfs_bnode_write(struct hfs_bnode *node, void *buf, int off, int len)
 
 	memcpy(kmap(page) + off, buf, len);
 	kunmap(page);
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 }
 
 void hfs_bnode_write_u16(struct hfs_bnode *node, int off, u16 data)
@@ -92,7 +92,7 @@ void hfs_bnode_clear(struct hfs_bnode *node, int off, int len)
 
 	memset(kmap(page) + off, 0, len);
 	kunmap(page);
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 }
 
 void hfs_bnode_copy(struct hfs_bnode *dst_node, int dst,
@@ -111,7 +111,7 @@ void hfs_bnode_copy(struct hfs_bnode *dst_node, int dst,
 	memcpy(kmap(dst_page) + dst, kmap(src_page) + src, len);
 	kunmap(src_page);
 	kunmap(dst_page);
-	set_page_dirty(dst_page);
+	set_page_dirty(NULL, dst_page);
 }
 
 void hfs_bnode_move(struct hfs_bnode *node, int dst, int src, int len)
@@ -128,7 +128,7 @@ void hfs_bnode_move(struct hfs_bnode *node, int dst, int src, int len)
 	ptr = kmap(page);
 	memmove(ptr + dst, ptr + src, len);
 	kunmap(page);
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 }
 
 void hfs_bnode_dump(struct hfs_bnode *node)
@@ -427,11 +427,11 @@ struct hfs_bnode *hfs_bnode_create(struct hfs_btree *tree, u32 num)
 	pagep = node->page;
 	memset(kmap(*pagep) + node->page_offset, 0,
 	       min((int)PAGE_SIZE, (int)tree->node_size));
-	set_page_dirty(*pagep);
+	set_page_dirty(NULL, *pagep);
 	kunmap(*pagep);
 	for (i = 1; i < tree->pages_per_bnode; i++) {
 		memset(kmap(*++pagep), 0, PAGE_SIZE);
-		set_page_dirty(*pagep);
+		set_page_dirty(NULL, *pagep);
 		kunmap(*pagep);
 	}
 	clear_bit(HFS_BNODE_NEW, &node->flags);
diff --git a/fs/hfs/btree.c b/fs/hfs/btree.c
index 374b5688e29e..91e7bdb5ecbb 100644
--- a/fs/hfs/btree.c
+++ b/fs/hfs/btree.c
@@ -181,7 +181,7 @@ void hfs_btree_write(struct hfs_btree *tree)
 	head->depth = cpu_to_be16(tree->depth);
 
 	kunmap(page);
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	hfs_bnode_put(node);
 }
 
@@ -271,7 +271,7 @@ struct hfs_bnode *hfs_bmap_alloc(struct hfs_btree *tree)
 					if (!(byte & m)) {
 						idx += i;
 						data[off] |= m;
-						set_page_dirty(*pagep);
+						set_page_dirty(NULL, *pagep);
 						kunmap(*pagep);
 						tree->free_nodes--;
 						mark_inode_dirty(tree->inode);
@@ -362,7 +362,7 @@ void hfs_bmap_free(struct hfs_bnode *node)
 		return;
 	}
 	data[off] = byte & ~m;
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	kunmap(page);
 	hfs_bnode_put(node);
 	tree->free_nodes++;
diff --git a/fs/hfsplus/bitmap.c b/fs/hfsplus/bitmap.c
index cebce0cfe340..f9685c1a207d 100644
--- a/fs/hfsplus/bitmap.c
+++ b/fs/hfsplus/bitmap.c
@@ -126,7 +126,7 @@ int hfsplus_block_allocate(struct super_block *sb, u32 size,
 			*curr++ = cpu_to_be32(0xffffffff);
 			len -= 32;
 		}
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 		kunmap(page);
 		offset += PAGE_CACHE_BITS;
 		page = read_mapping_page(mapping, offset / PAGE_CACHE_BITS,
@@ -150,7 +150,7 @@ int hfsplus_block_allocate(struct super_block *sb, u32 size,
 	}
 done:
 	*curr = cpu_to_be32(n);
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	kunmap(page);
 	*max = offset + (curr - pptr) * 32 + i - start;
 	sbi->free_blocks -= *max;
@@ -214,7 +214,7 @@ int hfsplus_block_free(struct super_block *sb, u32 offset, u32 count)
 		}
 		if (!count)
 			break;
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 		kunmap(page);
 		page = read_mapping_page(mapping, ++pnr, NULL);
 		if (IS_ERR(page))
@@ -230,7 +230,7 @@ int hfsplus_block_free(struct super_block *sb, u32 offset, u32 count)
 		*curr &= cpu_to_be32(mask);
 	}
 out:
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	kunmap(page);
 	sbi->free_blocks += len;
 	hfsplus_mark_mdb_dirty(sb);
diff --git a/fs/hfsplus/bnode.c b/fs/hfsplus/bnode.c
index 177fae4e6581..8531709f667e 100644
--- a/fs/hfsplus/bnode.c
+++ b/fs/hfsplus/bnode.c
@@ -83,14 +83,14 @@ void hfs_bnode_write(struct hfs_bnode *node, void *buf, int off, int len)
 
 	l = min_t(int, len, PAGE_SIZE - off);
 	memcpy(kmap(*pagep) + off, buf, l);
-	set_page_dirty(*pagep);
+	set_page_dirty(NULL, *pagep);
 	kunmap(*pagep);
 
 	while ((len -= l) != 0) {
 		buf += l;
 		l = min_t(int, len, PAGE_SIZE);
 		memcpy(kmap(*++pagep), buf, l);
-		set_page_dirty(*pagep);
+		set_page_dirty(NULL, *pagep);
 		kunmap(*pagep);
 	}
 }
@@ -113,13 +113,13 @@ void hfs_bnode_clear(struct hfs_bnode *node, int off, int len)
 
 	l = min_t(int, len, PAGE_SIZE - off);
 	memset(kmap(*pagep) + off, 0, l);
-	set_page_dirty(*pagep);
+	set_page_dirty(NULL, *pagep);
 	kunmap(*pagep);
 
 	while ((len -= l) != 0) {
 		l = min_t(int, len, PAGE_SIZE);
 		memset(kmap(*++pagep), 0, l);
-		set_page_dirty(*pagep);
+		set_page_dirty(NULL, *pagep);
 		kunmap(*pagep);
 	}
 }
@@ -144,14 +144,14 @@ void hfs_bnode_copy(struct hfs_bnode *dst_node, int dst,
 		l = min_t(int, len, PAGE_SIZE - src);
 		memcpy(kmap(*dst_page) + src, kmap(*src_page) + src, l);
 		kunmap(*src_page);
-		set_page_dirty(*dst_page);
+		set_page_dirty(NULL, *dst_page);
 		kunmap(*dst_page);
 
 		while ((len -= l) != 0) {
 			l = min_t(int, len, PAGE_SIZE);
 			memcpy(kmap(*++dst_page), kmap(*++src_page), l);
 			kunmap(*src_page);
-			set_page_dirty(*dst_page);
+			set_page_dirty(NULL, *dst_page);
 			kunmap(*dst_page);
 		}
 	} else {
@@ -172,7 +172,7 @@ void hfs_bnode_copy(struct hfs_bnode *dst_node, int dst,
 			l = min(len, l);
 			memcpy(dst_ptr, src_ptr, l);
 			kunmap(*src_page);
-			set_page_dirty(*dst_page);
+			set_page_dirty(NULL, *dst_page);
 			kunmap(*dst_page);
 			if (!dst)
 				dst_page++;
@@ -204,7 +204,7 @@ void hfs_bnode_move(struct hfs_bnode *node, int dst, int src, int len)
 			while (src < len) {
 				memmove(kmap(*dst_page), kmap(*src_page), src);
 				kunmap(*src_page);
-				set_page_dirty(*dst_page);
+				set_page_dirty(NULL, *dst_page);
 				kunmap(*dst_page);
 				len -= src;
 				src = PAGE_SIZE;
@@ -215,7 +215,7 @@ void hfs_bnode_move(struct hfs_bnode *node, int dst, int src, int len)
 			memmove(kmap(*dst_page) + src,
 				kmap(*src_page) + src, len);
 			kunmap(*src_page);
-			set_page_dirty(*dst_page);
+			set_page_dirty(NULL, *dst_page);
 			kunmap(*dst_page);
 		} else {
 			void *src_ptr, *dst_ptr;
@@ -235,7 +235,7 @@ void hfs_bnode_move(struct hfs_bnode *node, int dst, int src, int len)
 				l = min(len, l);
 				memmove(dst_ptr - l, src_ptr - l, l);
 				kunmap(*src_page);
-				set_page_dirty(*dst_page);
+				set_page_dirty(NULL, *dst_page);
 				kunmap(*dst_page);
 				if (dst == PAGE_SIZE)
 					dst_page--;
@@ -254,7 +254,7 @@ void hfs_bnode_move(struct hfs_bnode *node, int dst, int src, int len)
 			memmove(kmap(*dst_page) + src,
 				kmap(*src_page) + src, l);
 			kunmap(*src_page);
-			set_page_dirty(*dst_page);
+			set_page_dirty(NULL, *dst_page);
 			kunmap(*dst_page);
 
 			while ((len -= l) != 0) {
@@ -262,7 +262,7 @@ void hfs_bnode_move(struct hfs_bnode *node, int dst, int src, int len)
 				memmove(kmap(*++dst_page),
 					kmap(*++src_page), l);
 				kunmap(*src_page);
-				set_page_dirty(*dst_page);
+				set_page_dirty(NULL, *dst_page);
 				kunmap(*dst_page);
 			}
 		} else {
@@ -284,7 +284,7 @@ void hfs_bnode_move(struct hfs_bnode *node, int dst, int src, int len)
 				l = min(len, l);
 				memmove(dst_ptr, src_ptr, l);
 				kunmap(*src_page);
-				set_page_dirty(*dst_page);
+				set_page_dirty(NULL, *dst_page);
 				kunmap(*dst_page);
 				if (!dst)
 					dst_page++;
@@ -595,11 +595,11 @@ struct hfs_bnode *hfs_bnode_create(struct hfs_btree *tree, u32 num)
 	pagep = node->page;
 	memset(kmap(*pagep) + node->page_offset, 0,
 	       min_t(int, PAGE_SIZE, tree->node_size));
-	set_page_dirty(*pagep);
+	set_page_dirty(NULL, *pagep);
 	kunmap(*pagep);
 	for (i = 1; i < tree->pages_per_bnode; i++) {
 		memset(kmap(*++pagep), 0, PAGE_SIZE);
-		set_page_dirty(*pagep);
+		set_page_dirty(NULL, *pagep);
 		kunmap(*pagep);
 	}
 	clear_bit(HFS_BNODE_NEW, &node->flags);
diff --git a/fs/hfsplus/btree.c b/fs/hfsplus/btree.c
index de14b2b6881b..985123b314eb 100644
--- a/fs/hfsplus/btree.c
+++ b/fs/hfsplus/btree.c
@@ -304,7 +304,7 @@ int hfs_btree_write(struct hfs_btree *tree)
 	head->depth = cpu_to_be16(tree->depth);
 
 	kunmap(page);
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	hfs_bnode_put(node);
 	return 0;
 }
@@ -394,7 +394,7 @@ struct hfs_bnode *hfs_bmap_alloc(struct hfs_btree *tree)
 					if (!(byte & m)) {
 						idx += i;
 						data[off] |= m;
-						set_page_dirty(*pagep);
+						set_page_dirty(NULL, *pagep);
 						kunmap(*pagep);
 						tree->free_nodes--;
 						mark_inode_dirty(tree->inode);
@@ -490,7 +490,7 @@ void hfs_bmap_free(struct hfs_bnode *node)
 		return;
 	}
 	data[off] = byte & ~m;
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	kunmap(page);
 	hfs_bnode_put(node);
 	tree->free_nodes++;
diff --git a/fs/hfsplus/xattr.c b/fs/hfsplus/xattr.c
index e538b758c448..c00a14bf43d0 100644
--- a/fs/hfsplus/xattr.c
+++ b/fs/hfsplus/xattr.c
@@ -235,7 +235,7 @@ static int hfsplus_create_attributes_file(struct super_block *sb)
 			min_t(size_t, PAGE_SIZE, node_size - written));
 		kunmap_atomic(kaddr);
 
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 		put_page(page);
 	}
 
diff --git a/fs/iomap.c b/fs/iomap.c
index 557d990c26ea..dd86d5ca6fe5 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -477,7 +477,7 @@ int iomap_page_mkwrite(struct vm_fault *vmf, const struct iomap_ops *ops)
 		length -= ret;
 	}
 
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	wait_for_stable_page(page);
 	return VM_FAULT_LOCKED;
 out_unlock:
diff --git a/fs/jfs/jfs_metapage.c b/fs/jfs/jfs_metapage.c
index 9071b4077108..84060e65e102 100644
--- a/fs/jfs/jfs_metapage.c
+++ b/fs/jfs/jfs_metapage.c
@@ -718,7 +718,7 @@ void force_metapage(struct metapage *mp)
 	clear_bit(META_sync, &mp->flag);
 	get_page(page);
 	lock_page(page);
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	if (write_one_page(page))
 		jfs_error(mp->sb, "write_one_page() failed\n");
 	clear_bit(META_forcewrite, &mp->flag);
@@ -762,7 +762,7 @@ void release_metapage(struct metapage * mp)
 	}
 
 	if (test_bit(META_dirty, &mp->flag)) {
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 		if (test_bit(META_sync, &mp->flag)) {
 			clear_bit(META_sync, &mp->flag);
 			if (write_one_page(page))
diff --git a/fs/libfs.c b/fs/libfs.c
index 585ef1f37d54..360a64a454ab 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -494,7 +494,7 @@ int simple_write_end(struct file *file, struct address_space *mapping,
 	if (last_pos > inode->i_size)
 		i_size_write(inode, last_pos);
 
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	unlock_page(page);
 	put_page(page);
 
diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
index b752f5d8d5f4..58bdf005b877 100644
--- a/fs/nfs/direct.c
+++ b/fs/nfs/direct.c
@@ -413,7 +413,7 @@ static void nfs_direct_read_completion(struct nfs_pgio_header *hdr)
 		struct page *page = req->wb_page;
 
 		if (!PageCompound(page) && bytes < hdr->good_bytes)
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 		bytes += req->wb_bytes;
 		nfs_list_remove_request(req);
 		nfs_release_request(req);
diff --git a/fs/ntfs/attrib.c b/fs/ntfs/attrib.c
index 44a39a099b54..5b4f444fd080 100644
--- a/fs/ntfs/attrib.c
+++ b/fs/ntfs/attrib.c
@@ -1746,7 +1746,7 @@ int ntfs_attr_make_non_resident(ntfs_inode *ni, const u32 data_size)
 	unmap_mft_record(base_ni);
 	up_write(&ni->runlist.lock);
 	if (page) {
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 		unlock_page(page);
 		put_page(page);
 	}
@@ -2543,7 +2543,7 @@ int ntfs_attr_set(ntfs_inode *ni, const s64 ofs, const s64 cnt, const u8 val)
 		memset(kaddr + start_ofs, val, size - start_ofs);
 		flush_dcache_page(page);
 		kunmap_atomic(kaddr);
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 		put_page(page);
 		balance_dirty_pages_ratelimited(mapping);
 		cond_resched();
@@ -2582,7 +2582,7 @@ int ntfs_attr_set(ntfs_inode *ni, const s64 ofs, const s64 cnt, const u8 val)
 		 * Set the page and all its buffers dirty and mark the inode
 		 * dirty, too.  The VM will write the page later on.
 		 */
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 		/* Finally unlock and release the page. */
 		unlock_page(page);
 		put_page(page);
@@ -2601,7 +2601,7 @@ int ntfs_attr_set(ntfs_inode *ni, const s64 ofs, const s64 cnt, const u8 val)
 		memset(kaddr, val, end_ofs);
 		flush_dcache_page(page);
 		kunmap_atomic(kaddr);
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 		put_page(page);
 		balance_dirty_pages_ratelimited(mapping);
 		cond_resched();
diff --git a/fs/ntfs/bitmap.c b/fs/ntfs/bitmap.c
index ec130c588d2b..ee92820b7b8a 100644
--- a/fs/ntfs/bitmap.c
+++ b/fs/ntfs/bitmap.c
@@ -122,7 +122,7 @@ int __ntfs_bitmap_set_bits_in_run(struct inode *vi, const s64 start_bit,
 
 		/* Update @index and get the next page. */
 		flush_dcache_page(page);
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 		ntfs_unmap_page(page);
 		page = ntfs_map_page(mapping, ++index);
 		if (IS_ERR(page))
@@ -158,7 +158,7 @@ int __ntfs_bitmap_set_bits_in_run(struct inode *vi, const s64 start_bit,
 done:
 	/* We are done.  Unmap the page and return success. */
 	flush_dcache_page(page);
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	ntfs_unmap_page(page);
 	ntfs_debug("Done.");
 	return 0;
diff --git a/fs/ntfs/file.c b/fs/ntfs/file.c
index bf07c0ca127e..c551defedaab 100644
--- a/fs/ntfs/file.c
+++ b/fs/ntfs/file.c
@@ -247,7 +247,7 @@ static int ntfs_attr_extend_initialized(ntfs_inode *ni, const s64 new_init_size)
 			ni->initialized_size = new_init_size;
 		write_unlock_irqrestore(&ni->size_lock, flags);
 		/* Set the page dirty so it gets written out. */
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 		put_page(page);
 		/*
 		 * Play nice with the vm and the rest of the system.  This is
diff --git a/fs/ntfs/lcnalloc.c b/fs/ntfs/lcnalloc.c
index 27a24a42f712..50a568f77a25 100644
--- a/fs/ntfs/lcnalloc.c
+++ b/fs/ntfs/lcnalloc.c
@@ -277,7 +277,7 @@ runlist_element *ntfs_cluster_alloc(ntfs_volume *vol, const VCN start_vcn,
 			if (need_writeback) {
 				ntfs_debug("Marking page dirty.");
 				flush_dcache_page(page);
-				set_page_dirty(page);
+				set_page_dirty(NULL, page);
 				need_writeback = 0;
 			}
 			ntfs_unmap_page(page);
@@ -745,7 +745,7 @@ switch_to_data1_zone:		search_zone = 2;
 		if (need_writeback) {
 			ntfs_debug("Marking page dirty.");
 			flush_dcache_page(page);
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 			need_writeback = 0;
 		}
 		ntfs_unmap_page(page);
diff --git a/fs/ntfs/mft.c b/fs/ntfs/mft.c
index 2831f495a674..52757378b39d 100644
--- a/fs/ntfs/mft.c
+++ b/fs/ntfs/mft.c
@@ -1217,7 +1217,7 @@ static int ntfs_mft_bitmap_find_and_alloc_free_rec_nolock(ntfs_volume *vol,
 					}
 					*byte |= 1 << b;
 					flush_dcache_page(page);
-					set_page_dirty(page);
+					set_page_dirty(NULL, page);
 					ntfs_unmap_page(page);
 					ntfs_debug("Done.  (Found and "
 							"allocated mft record "
@@ -1342,7 +1342,7 @@ static int ntfs_mft_bitmap_extend_allocation_nolock(ntfs_volume *vol)
 		/* Next cluster is free, allocate it. */
 		*b |= tb;
 		flush_dcache_page(page);
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 		up_write(&vol->lcnbmp_lock);
 		ntfs_unmap_page(page);
 		/* Update the mft bitmap runlist. */
diff --git a/fs/ntfs/usnjrnl.c b/fs/ntfs/usnjrnl.c
index b2bc0d55b036..3f35649fc3f6 100644
--- a/fs/ntfs/usnjrnl.c
+++ b/fs/ntfs/usnjrnl.c
@@ -72,7 +72,7 @@ bool ntfs_stamp_usnjrnl(ntfs_volume *vol)
 				cpu_to_sle64(i_size_read(vol->usnjrnl_j_ino));
 		uh->journal_id = stamp;
 		flush_dcache_page(page);
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 		ntfs_unmap_page(page);
 		/* Set the flag so we do not have to do it again on remount. */
 		NVolSetUsnJrnlStamped(vol);
diff --git a/fs/udf/file.c b/fs/udf/file.c
index 0f6a1de6b272..413f09b17136 100644
--- a/fs/udf/file.c
+++ b/fs/udf/file.c
@@ -122,7 +122,7 @@ static int udf_adinicb_write_end(struct file *file, struct address_space *mappin
 	loff_t last_pos = pos + copied;
 	if (last_pos > inode->i_size)
 		i_size_write(inode, last_pos);
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	unlock_page(page);
 	put_page(page);
 	return copied;
diff --git a/fs/ufs/inode.c b/fs/ufs/inode.c
index c96630059d9e..abe8d36be626 100644
--- a/fs/ufs/inode.c
+++ b/fs/ufs/inode.c
@@ -1096,7 +1096,7 @@ static int ufs_alloc_lastblock(struct inode *inode, loff_t size)
 		*/
 	       set_buffer_uptodate(bh);
 	       mark_buffer_dirty(NULL, bh);
-	       set_page_dirty(lastpage);
+	       set_page_dirty(NULL, lastpage);
        }
 
        if (lastfrag >= UFS_IND_FRAGMENT) {
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1793b2e4f6b1..da847c874f9f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1463,7 +1463,7 @@ int redirty_page_for_writepage(struct writeback_control *wbc,
 void account_page_dirtied(struct page *page, struct address_space *mapping);
 void account_page_cleaned(struct page *page, struct address_space *mapping,
 			  struct bdi_writeback *wb);
-int set_page_dirty(struct page *page);
+int set_page_dirty(struct address_space *, struct page *);
 int set_page_dirty_lock(struct page *page);
 void __cancel_dirty_page(struct page *page);
 static inline void cancel_dirty_page(struct page *page)
diff --git a/mm/filemap.c b/mm/filemap.c
index a41c7cfb6351..c1ee7431bc4d 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2717,7 +2717,7 @@ int filemap_page_mkwrite(struct vm_fault *vmf)
 	 * progress, we are guaranteed that writeback during freezing will
 	 * see the dirty page and writeprotect it again.
 	 */
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	wait_for_stable_page(page);
 out:
 	sb_end_pagefault(inode->i_sb);
diff --git a/mm/gup.c b/mm/gup.c
index 6afae32571ca..5b9cee21d9dd 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -164,7 +164,7 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
 	if (flags & FOLL_TOUCH) {
 		if ((flags & FOLL_WRITE) &&
 		    !pte_dirty(pte) && !PageDirty(page))
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 		/*
 		 * pte_mkyoung() would be more correct here, but atomic care
 		 * is needed to avoid losing the dirty bit: it is easier to use
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 5a68730eebd6..9d628ab218ce 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2892,7 +2892,7 @@ void set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw,
 	pmdval = *pvmw->pmd;
 	pmdp_invalidate(vma, address, pvmw->pmd);
 	if (pmd_dirty(pmdval))
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 	entry = make_migration_entry(page, pmd_write(pmdval));
 	pmdswp = swp_entry_to_pmd(entry);
 	if (pmd_soft_dirty(pmdval))
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 976bbc5646fe..b4595b509d6e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3387,7 +3387,7 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
 		pte = huge_ptep_get_and_clear(mm, address, ptep);
 		tlb_remove_huge_tlb_entry(h, tlb, ptep, address);
 		if (huge_pte_dirty(pte))
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 
 		hugetlb_count_sub(pages_per_huge_page(h), mm);
 		page_remove_rmap(page, true);
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index e42568284e06..ccd5da4e855f 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1513,7 +1513,7 @@ static void collapse_shmem(struct mm_struct *mm,
 		retract_page_tables(mapping, start);
 
 		/* Everything is ready, let's unfreeze the new_page */
-		set_page_dirty(new_page);
+		set_page_dirty(NULL, new_page);
 		SetPageUptodate(new_page);
 		page_ref_unfreeze(new_page, HPAGE_PMD_NR);
 		mem_cgroup_commit_charge(new_page, memcg, false, true);
diff --git a/mm/ksm.c b/mm/ksm.c
index 293721f5da70..1c16a4309c1d 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1061,7 +1061,7 @@ static int write_protect_page(struct vm_area_struct *vma, struct page *page,
 			goto out_unlock;
 		}
 		if (pte_dirty(entry))
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 
 		if (pte_protnone(entry))
 			entry = pte_mkclean(pte_clear_savedwrite(entry));
diff --git a/mm/memory.c b/mm/memory.c
index 6ffd76528e7b..22906aab3922 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1327,7 +1327,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 			if (!PageAnon(page)) {
 				if (pte_dirty(ptent)) {
 					force_flush = 1;
-					set_page_dirty(page);
+					set_page_dirty(NULL, page);
 				}
 				if (pte_young(ptent) &&
 				    likely(!(vma->vm_flags & VM_SEQ_READ)))
@@ -2400,7 +2400,7 @@ static void fault_dirty_shared_page(struct vm_area_struct *vma,
 	bool dirtied;
 	bool page_mkwrite = vma->vm_ops && vma->vm_ops->page_mkwrite;
 
-	dirtied = set_page_dirty(page);
+	dirtied = set_page_dirty(NULL, page);
 	VM_BUG_ON_PAGE(PageAnon(page), page);
 	/*
 	 * Take a local copy of the address_space - page.mapping may be zeroed
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index ed9424f84715..d8856be8cc70 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2548,7 +2548,7 @@ EXPORT_SYMBOL(redirty_page_for_writepage);
  * If the mapping doesn't provide a set_page_dirty a_op, then
  * just fall through and assume that it wants buffer_heads.
  */
-int set_page_dirty(struct page *page)
+int set_page_dirty(struct address_space *_mapping, struct page *page)
 {
 	struct address_space *mapping = page_mapping(page);
 
@@ -2599,7 +2599,7 @@ int set_page_dirty_lock(struct page *page)
 	int ret;
 
 	lock_page(page);
-	ret = set_page_dirty(page);
+	ret = set_page_dirty(NULL, page);
 	unlock_page(page);
 	return ret;
 }
@@ -2693,7 +2693,7 @@ int clear_page_dirty_for_io(struct page *page)
 		 * threads doing their things.
 		 */
 		if (page_mkclean(page))
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 		/*
 		 * We carefully synchronise fault handlers against
 		 * installing a dirty pte and marking the page dirty
diff --git a/mm/page_io.c b/mm/page_io.c
index b4a4c52bb4e9..5afc8b8a6b97 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -62,7 +62,7 @@ void end_swap_bio_write(struct bio *bio)
 		 *
 		 * Also clear PG_reclaim to avoid rotate_reclaimable_page()
 		 */
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 		pr_alert("Write-error on swap-device (%u:%u:%llu)\n",
 			 MAJOR(bio_dev(bio)), MINOR(bio_dev(bio)),
 			 (unsigned long long)bio->bi_iter.bi_sector);
@@ -329,7 +329,7 @@ int __swap_writepage(struct address_space *mapping, struct page *page,
 			 * the normal direct-to-bio case as it could
 			 * be temporary.
 			 */
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 			ClearPageReclaim(page);
 			pr_err_ratelimited("Write error on dio swapfile (%llu)\n",
 					   page_file_offset(page));
@@ -348,7 +348,7 @@ int __swap_writepage(struct address_space *mapping, struct page *page,
 	ret = 0;
 	bio = get_swap_bio(GFP_NOIO, page, end_write_func);
 	if (bio == NULL) {
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 		unlock_page(page);
 		ret = -ENOMEM;
 		goto out;
diff --git a/mm/rmap.c b/mm/rmap.c
index 47db27f8049e..822a3a0cd51c 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1465,7 +1465,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 
 		/* Move the dirty bit to the page. Now the pte is gone. */
 		if (pte_dirty(pteval))
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 
 		/* Update high watermark before we lower rss */
 		update_hiwater_rss(mm);
diff --git a/mm/shmem.c b/mm/shmem.c
index 7f3168d547c8..cb09fea4a9ce 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -874,7 +874,7 @@ static void shmem_undo_range(struct inode *inode, loff_t lstart, loff_t lend,
 				partial_end = 0;
 			}
 			zero_user_segment(page, partial_start, top);
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 			unlock_page(page);
 			put_page(page);
 		}
@@ -884,7 +884,7 @@ static void shmem_undo_range(struct inode *inode, loff_t lstart, loff_t lend,
 		shmem_getpage(inode, end, &page, SGP_READ);
 		if (page) {
 			zero_user_segment(page, 0, partial_end);
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 			unlock_page(page);
 			put_page(page);
 		}
@@ -1189,7 +1189,7 @@ static int shmem_unuse_inode(struct shmem_inode_info *info,
 		 * only does trylock page: if we raced, best clean up here.
 		 */
 		delete_from_swap_cache(*pagep);
-		set_page_dirty(*pagep);
+		set_page_dirty(NULL, *pagep);
 		if (!error) {
 			spin_lock_irq(&info->lock);
 			info->swapped--;
@@ -1364,7 +1364,7 @@ static int shmem_writepage(struct address_space *_mapping, struct page *page,
 free_swap:
 	put_swap_page(page, swap);
 redirty:
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	if (wbc->for_reclaim)
 		return AOP_WRITEPAGE_ACTIVATE;	/* Return with page locked */
 	unlock_page(page);
@@ -1738,7 +1738,7 @@ static int shmem_getpage_gfp(struct inode *inode, pgoff_t index,
 			mark_page_accessed(page);
 
 		delete_from_swap_cache(page);
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 		swap_free(swap);
 
 	} else {
@@ -2416,7 +2416,7 @@ shmem_write_end(struct file *file, struct address_space *mapping,
 		}
 		SetPageUptodate(head);
 	}
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	unlock_page(page);
 	put_page(page);
 
@@ -2469,7 +2469,7 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 		}
 		if (page) {
 			if (sgp == SGP_CACHE)
-				set_page_dirty(page);
+				set_page_dirty(NULL, page);
 			unlock_page(page);
 		}
 
@@ -2970,7 +2970,7 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset,
 		 * than free the pages we are allocating (and SGP_CACHE pages
 		 * might still be clean: we now need to mark those dirty too).
 		 */
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 		unlock_page(page);
 		put_page(page);
 		cond_resched();
@@ -3271,7 +3271,7 @@ static int shmem_symlink(struct inode *dir, struct dentry *dentry, const char *s
 		inode->i_op = &shmem_symlink_inode_operations;
 		memcpy(page_address(page), symname, len);
 		SetPageUptodate(page);
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 		unlock_page(page);
 		put_page(page);
 	}
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 40a2437e3c34..3fede4bc753e 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -249,7 +249,7 @@ int add_to_swap(struct page *page)
 	 * is swap in later. Always setting the dirty bit for the page solves
 	 * the problem.
 	 */
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 
 	return 1;
 
diff --git a/mm/truncate.c b/mm/truncate.c
index 57d4d0948f40..78d907008367 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -874,7 +874,7 @@ void pagecache_isize_extended(struct inode *inode, loff_t from, loff_t to)
 	 * is needed.
 	 */
 	if (page_mkclean(page))
-		set_page_dirty(page);
+		set_page_dirty(NULL, page);
 	unlock_page(page);
 	put_page(page);
 }
diff --git a/net/rds/ib_rdma.c b/net/rds/ib_rdma.c
index e678699268a2..91b2cb759bf9 100644
--- a/net/rds/ib_rdma.c
+++ b/net/rds/ib_rdma.c
@@ -252,7 +252,7 @@ void __rds_ib_teardown_mr(struct rds_ib_mr *ibmr)
 			/* FIXME we need a way to tell a r/w MR
 			 * from a r/o MR */
 			WARN_ON(!page->mapping && irqs_disabled());
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 			put_page(page);
 		}
 		kfree(ibmr->sg);
diff --git a/net/rds/rdma.c b/net/rds/rdma.c
index 634cfcb7bba6..0bc9839c2c01 100644
--- a/net/rds/rdma.c
+++ b/net/rds/rdma.c
@@ -461,7 +461,7 @@ void rds_rdma_free_op(struct rm_rdma_op *ro)
 		 * to local memory */
 		if (!ro->op_write) {
 			WARN_ON(!page->mapping && irqs_disabled());
-			set_page_dirty(page);
+			set_page_dirty(NULL, page);
 		}
 		put_page(page);
 	}
@@ -478,7 +478,7 @@ void rds_atomic_free_op(struct rm_atomic_op *ao)
 	/* Mark page dirty if it was possibly modified, which
 	 * is the case for a RDMA_READ which copies from remote
 	 * to local memory */
-	set_page_dirty(page);
+	set_page_dirty(NULL, page);
 	put_page(page);
 
 	kfree(ao->op_notifier);
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 72/79] mm: add struct address_space to set_page_dirty_lock()
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (32 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 71/79] mm: add struct address_space to set_page_dirty() jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 73/79] mm: pass down struct address_space to set_page_dirty() jglisse
                   ` (8 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Andrew Morton,
	Alexander Viro, Tejun Heo, Jan Kara, Josef Bacik, Mel Gorman

From: Jérôme Glisse <jglisse@redhat.com>

For the holy crusade to stop relying on the struct page mapping field,
add a struct address_space argument to set_page_dirty_lock().

<---------------------------------------------------------------------
@@
identifier I1;
type T1;
@@
int
-set_page_dirty_lock(T1 I1)
+set_page_dirty_lock(struct address_space *_mapping, T1 I1)
{...}

@@
type T1;
@@
int
-set_page_dirty_lock(T1)
+set_page_dirty_lock(struct address_space *, T1)
;

@@
identifier I1;
type T1;
@@
int
-set_page_dirty_lock(T1 I1)
+set_page_dirty_lock(struct address_space *, T1)
;

@@
expression E1;
@@
-set_page_dirty_lock(E1)
+set_page_dirty_lock(NULL, E1)
--------------------------------------------------------------------->
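As a sketch (the exact invocation is not part of this patch, and the file
name below is hypothetical), a semantic patch like the one above is
typically applied tree-wide with Coccinelle's spatch tool:

```shell
# Save the rule above as set_page_dirty_lock.cocci, then rewrite the
# matching definitions, declarations and call sites in place across
# the current directory tree.
spatch --sp-file set_page_dirty_lock.cocci --in-place --dir .
```

The call-site heavy hunks below are the mechanical output of such a run;
sites needing a real mapping (rather than NULL) are fixed up by the later
patches in the series.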

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
CC: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 arch/cris/arch-v32/drivers/cryptocop.c                | 2 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c                | 2 +-
 arch/powerpc/kvm/e500_mmu.c                           | 3 ++-
 arch/s390/kvm/interrupt.c                             | 4 ++--
 arch/x86/kvm/svm.c                                    | 2 +-
 block/bio.c                                           | 4 ++--
 drivers/gpu/drm/exynos/exynos_drm_g2d.c               | 2 +-
 drivers/infiniband/core/umem.c                        | 2 +-
 drivers/infiniband/hw/hfi1/user_pages.c               | 2 +-
 drivers/infiniband/hw/qib/qib_user_pages.c            | 2 +-
 drivers/infiniband/hw/usnic/usnic_uiom.c              | 2 +-
 drivers/media/common/videobuf2/videobuf2-dma-contig.c | 2 +-
 drivers/media/common/videobuf2/videobuf2-dma-sg.c     | 2 +-
 drivers/media/common/videobuf2/videobuf2-vmalloc.c    | 2 +-
 drivers/misc/genwqe/card_utils.c                      | 2 +-
 drivers/staging/lustre/lustre/llite/rw26.c            | 2 +-
 drivers/vhost/vhost.c                                 | 2 +-
 fs/block_dev.c                                        | 2 +-
 fs/direct-io.c                                        | 2 +-
 fs/fuse/dev.c                                         | 2 +-
 fs/fuse/file.c                                        | 2 +-
 include/linux/mm.h                                    | 2 +-
 mm/memory.c                                           | 2 +-
 mm/page-writeback.c                                   | 2 +-
 mm/process_vm_access.c                                | 2 +-
 net/ceph/pagevec.c                                    | 2 +-
 26 files changed, 29 insertions(+), 28 deletions(-)

diff --git a/arch/cris/arch-v32/drivers/cryptocop.c b/arch/cris/arch-v32/drivers/cryptocop.c
index a3c353472a8c..5cb42555c90b 100644
--- a/arch/cris/arch-v32/drivers/cryptocop.c
+++ b/arch/cris/arch-v32/drivers/cryptocop.c
@@ -2930,7 +2930,7 @@ static int cryptocop_ioctl_process(struct inode *inode, struct file *filp, unsig
 	for (i = 0; i < nooutpages; i++){
 		int spdl_err;
 		/* Mark output pages dirty. */
-		spdl_err = set_page_dirty_lock(outpages[i]);
+		spdl_err = set_page_dirty_lock(NULL, outpages[i]);
 		DEBUG(if (spdl_err < 0)printk("cryptocop_ioctl_process: set_page_dirty_lock returned %d\n", spdl_err));
 	}
 	for (i = 0; i < nooutpages; i++){
diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index 5cb4e4687107..8daefabe650e 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -482,7 +482,7 @@ int kvmppc_book3s_radix_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
 
 	if (page) {
 		if (!ret && (pgflags & _PAGE_WRITE))
-			set_page_dirty_lock(page);
+			set_page_dirty_lock(NULL, page);
 		put_page(page);
 	}
 
diff --git a/arch/powerpc/kvm/e500_mmu.c b/arch/powerpc/kvm/e500_mmu.c
index ddbf8f0284c0..364ee7a5b268 100644
--- a/arch/powerpc/kvm/e500_mmu.c
+++ b/arch/powerpc/kvm/e500_mmu.c
@@ -556,7 +556,8 @@ static void free_gtlb(struct kvmppc_vcpu_e500 *vcpu_e500)
 					  PAGE_SIZE)));
 
 		for (i = 0; i < vcpu_e500->num_shared_tlb_pages; i++) {
-			set_page_dirty_lock(vcpu_e500->shared_tlb_pages[i]);
+			set_page_dirty_lock(NULL,
+					    vcpu_e500->shared_tlb_pages[i]);
 			put_page(vcpu_e500->shared_tlb_pages[i]);
 		}
 
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index b04616b57a94..6db8d4f5c74f 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -2616,7 +2616,7 @@ static int adapter_indicators_set(struct kvm *kvm,
 	set_bit(bit, map);
 	idx = srcu_read_lock(&kvm->srcu);
 	mark_page_dirty(kvm, info->guest_addr >> PAGE_SHIFT);
-	set_page_dirty_lock(info->page);
+	set_page_dirty_lock(NULL, info->page);
 	info = get_map_info(adapter, adapter_int->summary_addr);
 	if (!info) {
 		srcu_read_unlock(&kvm->srcu, idx);
@@ -2627,7 +2627,7 @@ static int adapter_indicators_set(struct kvm *kvm,
 			  adapter->swap);
 	summary_set = test_and_set_bit(bit, map);
 	mark_page_dirty(kvm, info->guest_addr >> PAGE_SHIFT);
-	set_page_dirty_lock(info->page);
+	set_page_dirty_lock(NULL, info->page);
 	srcu_read_unlock(&kvm->srcu, idx);
 	return summary_set ? 0 : 1;
 }
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index be9c839e2c89..f26f1ce478ab 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -6271,7 +6271,7 @@ static int sev_launch_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
 e_unpin:
 	/* content of memory is updated, mark pages dirty */
 	for (i = 0; i < npages; i++) {
-		set_page_dirty_lock(inpages[i]);
+		set_page_dirty_lock(NULL, inpages[i]);
 		mark_page_accessed(inpages[i]);
 	}
 	/* unlock the user pages */
diff --git a/block/bio.c b/block/bio.c
index e1708db48258..28cd15314235 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1376,7 +1376,7 @@ static void __bio_unmap_user(struct bio *bio)
 	 */
 	bio_for_each_segment_all(bvec, bio, i) {
 		if (bio_data_dir(bio) == READ)
-			set_page_dirty_lock(bvec->bv_page);
+			set_page_dirty_lock(NULL, bvec->bv_page);
 
 		put_page(bvec->bv_page);
 	}
@@ -1581,7 +1581,7 @@ void bio_set_pages_dirty(struct bio *bio)
 		struct page *page = bvec->bv_page;
 
 		if (page && !PageCompound(page))
-			set_page_dirty_lock(page);
+			set_page_dirty_lock(NULL, page);
 	}
 }
 
diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
index f68ef1b3a28c..28480c603f7b 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
@@ -406,7 +406,7 @@ static void g2d_userptr_put_dma_addr(struct drm_device *drm_dev,
 		int i;
 
 		for (i = 0; i < frame_vector_count(g2d_userptr->vec); i++)
-			set_page_dirty_lock(pages[i]);
+			set_page_dirty_lock(NULL, pages[i]);
 	}
 	put_vaddr_frames(g2d_userptr->vec);
 	frame_vector_destroy(g2d_userptr->vec);
diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 9a4e899d94b3..e0d776983a46 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -59,7 +59,7 @@ static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int d
 
 		page = sg_page(sg);
 		if (!PageDirty(page) && umem->writable && dirty)
-			set_page_dirty_lock(page);
+			set_page_dirty_lock(NULL, page);
 		put_page(page);
 	}
 
diff --git a/drivers/infiniband/hw/hfi1/user_pages.c b/drivers/infiniband/hw/hfi1/user_pages.c
index e341e6dcc388..98d11ee5853a 100644
--- a/drivers/infiniband/hw/hfi1/user_pages.c
+++ b/drivers/infiniband/hw/hfi1/user_pages.c
@@ -125,7 +125,7 @@ void hfi1_release_user_pages(struct mm_struct *mm, struct page **p,
 
 	for (i = 0; i < npages; i++) {
 		if (dirty)
-			set_page_dirty_lock(p[i]);
+			set_page_dirty_lock(NULL, p[i]);
 		put_page(p[i]);
 	}
 
diff --git a/drivers/infiniband/hw/qib/qib_user_pages.c b/drivers/infiniband/hw/qib/qib_user_pages.c
index ce83ba9a12ef..39273c68bd54 100644
--- a/drivers/infiniband/hw/qib/qib_user_pages.c
+++ b/drivers/infiniband/hw/qib/qib_user_pages.c
@@ -44,7 +44,7 @@ static void __qib_release_user_pages(struct page **p, size_t num_pages,
 
 	for (i = 0; i < num_pages; i++) {
 		if (dirty)
-			set_page_dirty_lock(p[i]);
+			set_page_dirty_lock(NULL, p[i]);
 		put_page(p[i]);
 	}
 }
diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
index 4381c0a9a873..5bab9930cf89 100644
--- a/drivers/infiniband/hw/usnic/usnic_uiom.c
+++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
@@ -89,7 +89,7 @@ static void usnic_uiom_put_pages(struct list_head *chunk_list, int dirty)
 			page = sg_page(sg);
 			pa = sg_phys(sg);
 			if (dirty)
-				set_page_dirty_lock(page);
+				set_page_dirty_lock(NULL, page);
 			put_page(page);
 			usnic_dbg("pa: %pa\n", &pa);
 		}
diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
index f1178f6f434d..0628ed526e80 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
@@ -437,7 +437,7 @@ static void vb2_dc_put_userptr(void *buf_priv)
 		if (buf->dma_dir == DMA_FROM_DEVICE ||
 		    buf->dma_dir == DMA_BIDIRECTIONAL)
 			for (i = 0; i < frame_vector_count(buf->vec); i++)
-				set_page_dirty_lock(pages[i]);
+				set_page_dirty_lock(NULL, pages[i]);
 		sg_free_table(sgt);
 		kfree(sgt);
 	}
diff --git a/drivers/media/common/videobuf2/videobuf2-dma-sg.c b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
index 753ed3138dcc..ed63b47e0cfa 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-sg.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
@@ -295,7 +295,7 @@ static void vb2_dma_sg_put_userptr(void *buf_priv)
 	if (buf->dma_dir == DMA_FROM_DEVICE ||
 	    buf->dma_dir == DMA_BIDIRECTIONAL)
 		while (--i >= 0)
-			set_page_dirty_lock(buf->pages[i]);
+			set_page_dirty_lock(NULL, buf->pages[i]);
 	vb2_destroy_framevec(buf->vec);
 	kfree(buf);
 }
diff --git a/drivers/media/common/videobuf2/videobuf2-vmalloc.c b/drivers/media/common/videobuf2/videobuf2-vmalloc.c
index 3a7c80cd1a17..300179a028f9 100644
--- a/drivers/media/common/videobuf2/videobuf2-vmalloc.c
+++ b/drivers/media/common/videobuf2/videobuf2-vmalloc.c
@@ -141,7 +141,7 @@ static void vb2_vmalloc_put_userptr(void *buf_priv)
 		if (buf->dma_dir == DMA_FROM_DEVICE ||
 		    buf->dma_dir == DMA_BIDIRECTIONAL)
 			for (i = 0; i < n_pages; i++)
-				set_page_dirty_lock(pages[i]);
+				set_page_dirty_lock(NULL, pages[i]);
 	} else {
 		iounmap((__force void __iomem *)buf->vaddr);
 	}
diff --git a/drivers/misc/genwqe/card_utils.c b/drivers/misc/genwqe/card_utils.c
index 8f2e6442d88b..09e16bb00412 100644
--- a/drivers/misc/genwqe/card_utils.c
+++ b/drivers/misc/genwqe/card_utils.c
@@ -540,7 +540,7 @@ static int genwqe_free_user_pages(struct page **page_list,
 	for (i = 0; i < nr_pages; i++) {
 		if (page_list[i] != NULL) {
 			if (dirty)
-				set_page_dirty_lock(page_list[i]);
+				set_page_dirty_lock(NULL, page_list[i]);
 			put_page(page_list[i]);
 		}
 	}
diff --git a/drivers/staging/lustre/lustre/llite/rw26.c b/drivers/staging/lustre/lustre/llite/rw26.c
index 969f4dad2f82..e5d8a91c3dda 100644
--- a/drivers/staging/lustre/lustre/llite/rw26.c
+++ b/drivers/staging/lustre/lustre/llite/rw26.c
@@ -168,7 +168,7 @@ static void ll_free_user_pages(struct page **pages, int npages, int do_dirty)
 
 	for (i = 0; i < npages; i++) {
 		if (do_dirty)
-			set_page_dirty_lock(pages[i]);
+			set_page_dirty_lock(NULL, pages[i]);
 		put_page(pages[i]);
 	}
 	kvfree(pages);
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 1b3e8d2d5c8b..d1f3eaec0f49 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1656,7 +1656,7 @@ static int set_bit_to_user(int nr, void __user *addr)
 	base = kmap_atomic(page);
 	set_bit(bit, base);
 	kunmap_atomic(base);
-	set_page_dirty_lock(page);
+	set_page_dirty_lock(NULL, page);
 	put_page(page);
 	return 0;
 }
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 50752935681e..bae849d647d0 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -244,7 +244,7 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
 
 	bio_for_each_segment_all(bvec, &bio, i) {
 		if (should_dirty && !PageCompound(bvec->bv_page))
-			set_page_dirty_lock(bvec->bv_page);
+			set_page_dirty_lock(NULL, bvec->bv_page);
 		put_page(bvec->bv_page);
 	}
 
diff --git a/fs/direct-io.c b/fs/direct-io.c
index 1357ef563893..d9a634e239e0 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -557,7 +557,7 @@ static blk_status_t dio_bio_complete(struct dio *dio, struct bio *bio)
 
 			if (dio->op == REQ_OP_READ && !PageCompound(page) &&
 					dio->should_dirty)
-				set_page_dirty_lock(page);
+				set_page_dirty_lock(NULL, page);
 			put_page(page);
 		}
 		bio_put(bio);
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 5d06384c2cae..c7baaa15a072 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -707,7 +707,7 @@ static void fuse_copy_finish(struct fuse_copy_state *cs)
 	} else if (cs->pg) {
 		if (cs->write) {
 			flush_dcache_page(cs->pg);
-			set_page_dirty_lock(cs->pg);
+			set_page_dirty_lock(NULL, cs->pg);
 		}
 		put_page(cs->pg);
 	}
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 8a4a84f3657a..011c56abc772 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -533,7 +533,7 @@ static void fuse_release_user_pages(struct fuse_req *req, bool should_dirty)
 	for (i = 0; i < req->num_pages; i++) {
 		struct page *page = req->pages[i];
 		if (should_dirty)
-			set_page_dirty_lock(page);
+			set_page_dirty_lock(NULL, page);
 		put_page(page);
 	}
 }
diff --git a/include/linux/mm.h b/include/linux/mm.h
index da847c874f9f..a8d4a859d6ad 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1464,7 +1464,7 @@ void account_page_dirtied(struct page *page, struct address_space *mapping);
 void account_page_cleaned(struct page *page, struct address_space *mapping,
 			  struct bdi_writeback *wb);
 int set_page_dirty(struct address_space *, struct page *);
-int set_page_dirty_lock(struct page *page);
+int set_page_dirty_lock(struct address_space *, struct page *);
 void __cancel_dirty_page(struct page *page);
 static inline void cancel_dirty_page(struct page *page)
 {
diff --git a/mm/memory.c b/mm/memory.c
index 22906aab3922..20443ebf9c42 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4466,7 +4466,7 @@ int __access_remote_vm(struct task_struct *tsk, struct mm_struct *mm,
 			if (write) {
 				copy_to_user_page(vma, page, addr,
 						  maddr + offset, buf, bytes);
-				set_page_dirty_lock(page);
+				set_page_dirty_lock(NULL, page);
 			} else {
 				copy_from_user_page(vma, page, addr,
 						    buf, maddr + offset, bytes);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index d8856be8cc70..eaa6c23ba752 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2594,7 +2594,7 @@ EXPORT_SYMBOL(set_page_dirty);
  *
  * In other cases, the page should be locked before running set_page_dirty().
  */
-int set_page_dirty_lock(struct page *page)
+int set_page_dirty_lock(struct address_space *_mapping, struct page *page)
 {
 	int ret;
 
diff --git a/mm/process_vm_access.c b/mm/process_vm_access.c
index a447092d4635..5a8ffa34c9e7 100644
--- a/mm/process_vm_access.c
+++ b/mm/process_vm_access.c
@@ -48,7 +48,7 @@ static int process_vm_rw_pages(struct page **pages,
 
 		if (vm_write) {
 			copied = copy_page_from_iter(page, offset, copy, iter);
-			set_page_dirty_lock(page);
+			set_page_dirty_lock(NULL, page);
 		} else {
 			copied = copy_page_to_iter(page, offset, copy, iter);
 		}
diff --git a/net/ceph/pagevec.c b/net/ceph/pagevec.c
index a3d0adc828e6..67ef02363a16 100644
--- a/net/ceph/pagevec.c
+++ b/net/ceph/pagevec.c
@@ -49,7 +49,7 @@ void ceph_put_page_vector(struct page **pages, int num_pages, bool dirty)
 
 	for (i = 0; i < num_pages; i++) {
 		if (dirty)
-			set_page_dirty_lock(pages[i]);
+			set_page_dirty_lock(NULL, pages[i]);
 		put_page(pages[i]);
 	}
 	kvfree(pages);
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 73/79] mm: pass down struct address_space to set_page_dirty()
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (33 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 72/79] mm: add struct address_space to set_page_dirty_lock() jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 74/79] mm/page_ronly: add config option for generic read only page framework jglisse
                   ` (7 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Andrew Morton,
	Alexander Viro, Tejun Heo, Jan Kara, Josef Bacik, Mel Gorman

From: Jérôme Glisse <jglisse@redhat.com>

Pass down struct address_space to set_page_dirty() everywhere it is
already available.

<---------------------------------------------------------------------
@exists@
expression E;
identifier F, M;
@@
F(..., struct address_space * M, ...) {
...
-set_page_dirty(NULL, E)
+set_page_dirty(M, E)
...
}

@exists@
expression E;
identifier M;
@@
struct address_space * M;
...
-set_page_dirty(NULL, E)
+set_page_dirty(M, E)

@exists@
expression E;
identifier F, I;
@@
F(..., struct inode * I, ...) {
...
-set_page_dirty(NULL, E)
+set_page_dirty(I->i_mapping, E)
...
}

@exists@
expression E;
identifier I;
@@
struct inode * I;
...
-set_page_dirty(NULL, E)
+set_page_dirty(I->i_mapping, E)
--------------------------------------------------------------------->

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
CC: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 mm/filemap.c        |  2 +-
 mm/khugepaged.c     |  2 +-
 mm/memory.c         |  2 +-
 mm/page-writeback.c |  4 ++--
 mm/page_io.c        |  4 ++--
 mm/shmem.c          | 18 +++++++++---------
 mm/truncate.c       |  2 +-
 7 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index c1ee7431bc4d..a15c29350a6a 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2717,7 +2717,7 @@ int filemap_page_mkwrite(struct vm_fault *vmf)
 	 * progress, we are guaranteed that writeback during freezing will
 	 * see the dirty page and writeprotect it again.
 	 */
-	set_page_dirty(NULL, page);
+	set_page_dirty(inode->i_mapping, page);
 	wait_for_stable_page(page);
 out:
 	sb_end_pagefault(inode->i_sb);
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index ccd5da4e855f..b9a968172fb9 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1513,7 +1513,7 @@ static void collapse_shmem(struct mm_struct *mm,
 		retract_page_tables(mapping, start);
 
 		/* Everything is ready, let's unfreeze the new_page */
-		set_page_dirty(NULL, new_page);
+		set_page_dirty(mapping, new_page);
 		SetPageUptodate(new_page);
 		page_ref_unfreeze(new_page, HPAGE_PMD_NR);
 		mem_cgroup_commit_charge(new_page, memcg, false, true);
diff --git a/mm/memory.c b/mm/memory.c
index 20443ebf9c42..fbd80bb7a50a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2400,7 +2400,7 @@ static void fault_dirty_shared_page(struct vm_area_struct *vma,
 	bool dirtied;
 	bool page_mkwrite = vma->vm_ops && vma->vm_ops->page_mkwrite;
 
-	dirtied = set_page_dirty(NULL, page);
+	dirtied = set_page_dirty(mapping, page);
 	VM_BUG_ON_PAGE(PageAnon(page), page);
 	/*
 	 * Take a local copy of the address_space - page.mapping may be zeroed
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index eaa6c23ba752..59dc9a12efc7 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2599,7 +2599,7 @@ int set_page_dirty_lock(struct address_space *_mapping, struct page *page)
 	int ret;
 
 	lock_page(page);
-	ret = set_page_dirty(NULL, page);
+	ret = set_page_dirty(_mapping, page);
 	unlock_page(page);
 	return ret;
 }
@@ -2693,7 +2693,7 @@ int clear_page_dirty_for_io(struct page *page)
 		 * threads doing their things.
 		 */
 		if (page_mkclean(page))
-			set_page_dirty(NULL, page);
+			set_page_dirty(mapping, page);
 		/*
 		 * We carefully synchronise fault handlers against
 		 * installing a dirty pte and marking the page dirty
diff --git a/mm/page_io.c b/mm/page_io.c
index 5afc8b8a6b97..fd3133cd50d4 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -329,7 +329,7 @@ int __swap_writepage(struct address_space *mapping, struct page *page,
 			 * the normal direct-to-bio case as it could
 			 * be temporary.
 			 */
-			set_page_dirty(NULL, page);
+			set_page_dirty(mapping, page);
 			ClearPageReclaim(page);
 			pr_err_ratelimited("Write error on dio swapfile (%llu)\n",
 					   page_file_offset(page));
@@ -348,7 +348,7 @@ int __swap_writepage(struct address_space *mapping, struct page *page,
 	ret = 0;
 	bio = get_swap_bio(GFP_NOIO, page, end_write_func);
 	if (bio == NULL) {
-		set_page_dirty(NULL, page);
+		set_page_dirty(mapping, page);
 		unlock_page(page);
 		ret = -ENOMEM;
 		goto out;
diff --git a/mm/shmem.c b/mm/shmem.c
index cb09fea4a9ce..eae03f684869 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -874,7 +874,7 @@ static void shmem_undo_range(struct inode *inode, loff_t lstart, loff_t lend,
 				partial_end = 0;
 			}
 			zero_user_segment(page, partial_start, top);
-			set_page_dirty(NULL, page);
+			set_page_dirty(mapping, page);
 			unlock_page(page);
 			put_page(page);
 		}
@@ -884,7 +884,7 @@ static void shmem_undo_range(struct inode *inode, loff_t lstart, loff_t lend,
 		shmem_getpage(inode, end, &page, SGP_READ);
 		if (page) {
 			zero_user_segment(page, 0, partial_end);
-			set_page_dirty(NULL, page);
+			set_page_dirty(mapping, page);
 			unlock_page(page);
 			put_page(page);
 		}
@@ -1189,7 +1189,7 @@ static int shmem_unuse_inode(struct shmem_inode_info *info,
 		 * only does trylock page: if we raced, best clean up here.
 		 */
 		delete_from_swap_cache(*pagep);
-		set_page_dirty(NULL, *pagep);
+		set_page_dirty(mapping, *pagep);
 		if (!error) {
 			spin_lock_irq(&info->lock);
 			info->swapped--;
@@ -1364,7 +1364,7 @@ static int shmem_writepage(struct address_space *_mapping, struct page *page,
 free_swap:
 	put_swap_page(page, swap);
 redirty:
-	set_page_dirty(NULL, page);
+	set_page_dirty(_mapping, page);
 	if (wbc->for_reclaim)
 		return AOP_WRITEPAGE_ACTIVATE;	/* Return with page locked */
 	unlock_page(page);
@@ -1738,7 +1738,7 @@ static int shmem_getpage_gfp(struct inode *inode, pgoff_t index,
 			mark_page_accessed(page);
 
 		delete_from_swap_cache(page);
-		set_page_dirty(NULL, page);
+		set_page_dirty(mapping, page);
 		swap_free(swap);
 
 	} else {
@@ -2416,7 +2416,7 @@ shmem_write_end(struct file *file, struct address_space *mapping,
 		}
 		SetPageUptodate(head);
 	}
-	set_page_dirty(NULL, page);
+	set_page_dirty(mapping, page);
 	unlock_page(page);
 	put_page(page);
 
@@ -2469,7 +2469,7 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 		}
 		if (page) {
 			if (sgp == SGP_CACHE)
-				set_page_dirty(NULL, page);
+				set_page_dirty(mapping, page);
 			unlock_page(page);
 		}
 
@@ -2970,7 +2970,7 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset,
 		 * than free the pages we are allocating (and SGP_CACHE pages
 		 * might still be clean: we now need to mark those dirty too).
 		 */
-		set_page_dirty(NULL, page);
+		set_page_dirty(inode->i_mapping, page);
 		unlock_page(page);
 		put_page(page);
 		cond_resched();
@@ -3271,7 +3271,7 @@ static int shmem_symlink(struct inode *dir, struct dentry *dentry, const char *s
 		inode->i_op = &shmem_symlink_inode_operations;
 		memcpy(page_address(page), symname, len);
 		SetPageUptodate(page);
-		set_page_dirty(NULL, page);
+		set_page_dirty(dir->i_mapping, page);
 		unlock_page(page);
 		put_page(page);
 	}
diff --git a/mm/truncate.c b/mm/truncate.c
index 78d907008367..f4f018f35552 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -874,7 +874,7 @@ void pagecache_isize_extended(struct inode *inode, loff_t from, loff_t to)
 	 * is needed.
 	 */
 	if (page_mkclean(page))
-		set_page_dirty(NULL, page);
+		set_page_dirty(inode->i_mapping, page);
 	unlock_page(page);
 	put_page(page);
 }
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 74/79] mm/page_ronly: add config option for generic read only page framework.
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (34 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 73/79] mm: pass down struct address_space to set_page_dirty() jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 75/79] mm/page_ronly: add page read only core structure and helpers jglisse
                   ` (6 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Andrea Arcangeli

From: Jérôme Glisse <jglisse@redhat.com>

It's really just a config option patch.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
---
 mm/Kconfig | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/Kconfig b/mm/Kconfig
index c782e8fb7235..aeffb6e8dd21 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -149,6 +149,9 @@ config NO_BOOTMEM
 config MEMORY_ISOLATION
 	bool
 
+config PAGE_RONLY
+	bool
+
 #
 # Only be set on architectures that have completely implemented memory hotplug
 # feature. If you are not sure, don't touch it.
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 75/79] mm/page_ronly: add page read only core structure and helpers.
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (35 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 74/79] mm/page_ronly: add config option for generic read only page framework jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 76/79] mm/ksm: have ksm select PAGE_RONLY config jglisse
                   ` (5 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Andrea Arcangeli

From: Jérôme Glisse <jglisse@redhat.com>

Page read only is a generic framework for page write protection.
It reuses the same mechanism as KSM, using the lower bits of the
page->mapping field, and KSM is converted to use this generic
framework.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
---
 include/linux/page_ronly.h | 169 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 169 insertions(+)
 create mode 100644 include/linux/page_ronly.h

diff --git a/include/linux/page_ronly.h b/include/linux/page_ronly.h
new file mode 100644
index 000000000000..6312d4f015ea
--- /dev/null
+++ b/include/linux/page_ronly.h
@@ -0,0 +1,169 @@
+/*
+ * Copyright 2015 Red Hat Inc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Authors: Jérôme Glisse <jglisse@redhat.com>
+ */
+/*
+ * Page read only generic wrapper. This is the common struct used to write
+ * protect a page by forbidding anyone from inserting a pte (page table entry)
+ * with the write flag set. It reuses the KSM mechanism (which uses the lower
+ * bits of the mapping field of struct page).
+ */
+#ifndef LINUX_PAGE_RONLY_H
+#define LINUX_PAGE_RONLY_H
+#ifdef CONFIG_PAGE_RONLY
+
+#include <linux/types.h>
+#include <linux/page-flags.h>
+#include <linux/buffer_head.h>
+#include <linux/mm_types.h>
+
+
+/* enum page_ronly_event - Event that triggers a call to unprotect().
+ *
+ * @PAGE_RONLY_SWAPIN: Page fault at an address with a swap entry pte.
+ * @PAGE_RONLY_WFAULT: Write page fault.
+ * @PAGE_RONLY_GUP: Get user page.
+ */
+enum page_ronly_event {
+	PAGE_RONLY_SWAPIN,
+	PAGE_RONLY_WFAULT,
+	PAGE_RONLY_GUP,
+};
+
+/* struct page_ronly_ops - Page read only operations.
+ *
+ * @unprotect: Callback to unprotect a page (mandatory).
+ * @rmap_walk: Callback to walk reverse mapping of a page (mandatory).
+ *
+ * Kernel users that want to use the page write protection mechanism have to
+ * provide a number of callbacks.
+ */
+struct page_ronly_ops {
+	struct page *(*unprotect)(struct page *page,
+				  unsigned long addr,
+				  struct vm_area_struct *vma,
+				  enum page_ronly_event event);
+	int (*rmap_walk)(struct page *page, struct rmap_walk_control *rwc);
+};
+
+/* struct page_ronly - Replace page->mapping when a page is write protected.
+ *
+ * @ops: Pointer to page read only operations.
+ *
+ * Pages that are write protected have their page->mapping field pointing to
+ * this wrapper structure. It must be allocated by the page read only user and
+ * must be freed (if needed) inside the unprotect() callback.
+ */
+struct page_ronly {
+	const struct page_ronly_ops	*ops;
+};
+
+
+/* page_ronly() - Return page_ronly struct if any or NULL.
+ *
+ * @page: The page whose page_ronly struct (if any) to look up.
+ */
+static inline struct page_ronly *page_ronly(struct page *page)
+{
+	return PageReadOnly(page) ? page_rmapping(page) : NULL;
+}
+
+/* page_ronly_set() - Replace page->mapping with ptr to page_ronly struct.
+ *
+ * @page: The page for which to replace the page->mapping field.
+ * @ronly: The page_ronly structure to set.
+ *
+ * Page must be locked.
+ */
+static inline void page_ronly_set(struct page *page, struct page_ronly *ronly)
+{
+	VM_BUG_ON_PAGE(!PageLocked(page), page);
+
+	page->mapping = (void *)ronly + (PAGE_MAPPING_ANON|PAGE_MAPPING_RONLY);
+}
+
+/* page_ronly_unprotect() - Unprotect a read only protected page.
+ *
+ * @page: The page to unprotect.
+ * @addr: Fault address that triggered the unprotect.
+ * @vma: The vma of the fault address.
+ * @event: Event which triggered the unprotect.
+ *
+ * Page must be locked and must be a read only page.
+ */
+static inline struct page *page_ronly_unprotect(struct page *page,
+						unsigned long addr,
+						struct vm_area_struct *vma,
+						enum page_ronly_event event)
+{
+	struct page_ronly *pageronly;
+
+	VM_BUG_ON_PAGE(!PageLocked(page), page);
+	/*
+	 * Rely on the page lock to protect against concurrent modifications
+	 * to that page's node of the stable tree.
+	 */
+	VM_BUG_ON_PAGE(!PageReadOnly(page), page);
+	pageronly = page_ronly(page);
+	if (pageronly)
+		return pageronly->ops->unprotect(page, addr, vma, event);
+	/* Safest fallback. */
+	return page;
+}
+
+/* page_ronly_rmap_walk() - Walk all CPU page table mappings of a page.
+ *
+ * @page: The page whose CPU page table mappings to walk.
+ * @rwc: Private control variable for each reverse walk.
+ *
+ * Page must be locked and must be a read only page.
+ */
+static inline void page_ronly_rmap_walk(struct page *page,
+					struct rmap_walk_control *rwc)
+{
+	struct page_ronly *pageronly;
+
+	VM_BUG_ON_PAGE(!PageLocked(page), page);
+	/*
+	 * Rely on the page lock to protect against concurrent modifications
+	 * to that page's node of the stable tree.
+	 */
+	VM_BUG_ON_PAGE(!PageReadOnly(page), page);
+	pageronly = page_ronly(page);
+	if (pageronly)
+		pageronly->ops->rmap_walk(page, rwc);
+}
+
+#else /* CONFIG_PAGE_RONLY */
+
+static inline struct page *page_ronly_unprotect(struct page *page,
+						unsigned long addr,
+						struct vm_area_struct *vma,
+						enum page_ronly_event event)
+{
+	/* This should not happen! */
+	VM_BUG_ON_PAGE(1, page);
+	return page;
+}
+
+static inline int page_ronly_rmap_walk(struct page *page,
+				       struct rmap_walk_control *rwc)
+{
+	/* This should not happen! */
+	BUG();
+	return 0;
+}
+
+#endif /* CONFIG_PAGE_RONLY */
+#endif /* LINUX_PAGE_RONLY_H */
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 76/79] mm/ksm: have ksm select PAGE_RONLY config.
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (36 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 75/79] mm/page_ronly: add page read only core structure and helpers jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 77/79] mm/ksm: hide set_page_stable_node() and page_stable_node() jglisse
                   ` (4 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Andrea Arcangeli

From: Jérôme Glisse <jglisse@redhat.com>

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
---
 mm/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/Kconfig b/mm/Kconfig
index aeffb6e8dd21..6994a1fdf847 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -308,6 +308,7 @@ config MMU_NOTIFIER
 config KSM
 	bool "Enable KSM for page merging"
 	depends on MMU
+	select PAGE_RONLY
 	help
 	  Enable Kernel Samepage Merging: KSM periodically scans those areas
 	  of an application's address space that an app has advised may be
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 77/79] mm/ksm: hide set_page_stable_node() and page_stable_node()
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (37 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 76/79] mm/ksm: have ksm select PAGE_RONLY config jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 78/79] mm/ksm: rename PAGE_MAPPING_KSM to PAGE_MAPPING_RONLY jglisse
                   ` (3 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Andrea Arcangeli

From: Jérôme Glisse <jglisse@redhat.com>

Hide these two functions as a preparatory step for generalizing KSM
write protection to other users. Moreover, those two helpers cannot
be used meaningfully outside ksm.c, as the struct they deal with is
defined inside ksm.c.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
---
 include/linux/ksm.h | 12 ------------
 mm/ksm.c            | 11 +++++++++++
 2 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 44368b19b27e..83c664080798 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -15,7 +15,6 @@
 #include <linux/sched.h>
 #include <linux/sched/coredump.h>
 
-struct stable_node;
 struct mem_cgroup;
 
 #ifdef CONFIG_KSM
@@ -37,17 +36,6 @@ static inline void ksm_exit(struct mm_struct *mm)
 		__ksm_exit(mm);
 }
 
-static inline struct stable_node *page_stable_node(struct page *page)
-{
-	return PageKsm(page) ? page_rmapping(page) : NULL;
-}
-
-static inline void set_page_stable_node(struct page *page,
-					struct stable_node *stable_node)
-{
-	page->mapping = (void *)((unsigned long)stable_node | PAGE_MAPPING_KSM);
-}
-
 /*
  * When do_swap_page() first faults in from swap what used to be a KSM page,
  * no problem, it will be assigned to this vma's anon_vma; but thereafter,
diff --git a/mm/ksm.c b/mm/ksm.c
index 1c16a4309c1d..f9bd1251c288 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -316,6 +316,17 @@ static void __init ksm_slab_free(void)
 	mm_slot_cache = NULL;
 }
 
+static inline struct stable_node *page_stable_node(struct page *page)
+{
+	return PageKsm(page) ? page_rmapping(page) : NULL;
+}
+
+static inline void set_page_stable_node(struct page *page,
+					struct stable_node *stable_node)
+{
+	page->mapping = (void *)((unsigned long)stable_node | PAGE_MAPPING_KSM);
+}
+
 static __always_inline bool is_stable_node_chain(struct stable_node *chain)
 {
 	return chain->rmap_hlist_len == STABLE_NODE_CHAIN;
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 78/79] mm/ksm: rename PAGE_MAPPING_KSM to PAGE_MAPPING_RONLY
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (38 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 77/79] mm/ksm: hide set_page_stable_node() and page_stable_node() jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-04 19:18 ` [RFC PATCH 79/79] mm/ksm: set page->mapping to page_ronly struct instead of stable_node jglisse
                   ` (2 subsequent siblings)
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Andrea Arcangeli

From: Jérôme Glisse <jglisse@redhat.com>

This just renames all KSM-specific helpers to generic page read only
names. No functional change.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
---
 fs/proc/page.c             |  2 +-
 include/linux/page-flags.h | 30 +++++++++++++++++-------------
 mm/ksm.c                   | 12 ++++++------
 mm/memory-failure.c        |  2 +-
 mm/memory.c                |  2 +-
 mm/migrate.c               |  6 +++---
 mm/mprotect.c              |  2 +-
 mm/page_idle.c             |  2 +-
 mm/rmap.c                  | 10 +++++-----
 mm/swapfile.c              |  2 +-
 10 files changed, 37 insertions(+), 33 deletions(-)

diff --git a/fs/proc/page.c b/fs/proc/page.c
index 1491918a33c3..00cc037758ef 100644
--- a/fs/proc/page.c
+++ b/fs/proc/page.c
@@ -110,7 +110,7 @@ u64 stable_page_flags(struct page *page)
 		u |= 1 << KPF_MMAP;
 	if (PageAnon(page))
 		u |= 1 << KPF_ANON;
-	if (PageKsm(page))
+	if (PageReadOnly(page))
 		u |= 1 << KPF_KSM;
 
 	/*
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 50c2b8786831..0338fb5dde8d 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -374,12 +374,12 @@ PAGEFLAG(Idle, idle, PF_ANY)
  * page->mapping points to its anon_vma, not to a struct address_space;
  * with the PAGE_MAPPING_ANON bit set to distinguish it.  See rmap.h.
  *
- * On an anonymous page in a VM_MERGEABLE area, if CONFIG_KSM is enabled,
+ * On an anonymous page in a VM_MERGEABLE area, if CONFIG_PAGE_RONLY is enabled,
  * the PAGE_MAPPING_MOVABLE bit may be set along with the PAGE_MAPPING_ANON
  * bit; and then page->mapping points, not to an anon_vma, but to a private
- * structure which KSM associates with that merged page.  See ksm.h.
+ * structure which the read only user associates with that merged page.  See
+ * page_ronly.h.
  *
- * PAGE_MAPPING_KSM without PAGE_MAPPING_ANON is used for non-lru movable
+ * PAGE_MAPPING_RONLY without PAGE_MAPPING_ANON is used for non-lru movable
  * page and then page->mapping points a struct address_space.
  *
  * Please note that, confusingly, "page_mapping" refers to the inode
@@ -388,7 +388,7 @@ PAGEFLAG(Idle, idle, PF_ANY)
  */
 #define PAGE_MAPPING_ANON	0x1
 #define PAGE_MAPPING_MOVABLE	0x2
-#define PAGE_MAPPING_KSM	(PAGE_MAPPING_ANON | PAGE_MAPPING_MOVABLE)
+#define PAGE_MAPPING_RONLY	(PAGE_MAPPING_ANON | PAGE_MAPPING_MOVABLE)
 #define PAGE_MAPPING_FLAGS	(PAGE_MAPPING_ANON | PAGE_MAPPING_MOVABLE)
 
 static __always_inline int PageMappingFlags(struct page *page)
@@ -408,21 +408,25 @@ static __always_inline int __PageMovable(struct page *page)
 				PAGE_MAPPING_MOVABLE;
 }
 
-#ifdef CONFIG_KSM
-/*
- * A KSM page is one of those write-protected "shared pages" or "merged pages"
- * which KSM maps into multiple mms, wherever identical anonymous page content
- * is found in VM_MERGEABLE vmas.  It's a PageAnon page, pointing not to any
- * anon_vma, but to that page's node of the stable tree.
+#ifdef CONFIG_PAGE_RONLY
+/* PageReadOnly() - Returns true if the page is read only, false otherwise.
+ *
+ * @page: Page under test.
+ *
+ * A read only page is a write-protected one. Currently only KSM write
+ * protects pages, as the "shared pages" or "merged pages" which KSM maps
+ * into multiple mms wherever identical anonymous page content is found in
+ * VM_MERGEABLE vmas.  It's a PageAnon page, pointing not to any anon_vma,
+ * but to that page's node of the stable tree.
  */
-static __always_inline int PageKsm(struct page *page)
+static __always_inline int PageReadOnly(struct page *page)
 {
 	page = compound_head(page);
 	return ((unsigned long)page->mapping & PAGE_MAPPING_FLAGS) ==
-				PAGE_MAPPING_KSM;
+				PAGE_MAPPING_RONLY;
 }
 #else
-TESTPAGEFLAG_FALSE(Ksm)
+TESTPAGEFLAG_FALSE(ReadOnly)
 #endif
 
 u64 stable_page_flags(struct page *page);
diff --git a/mm/ksm.c b/mm/ksm.c
index f9bd1251c288..6085068fb8b3 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -318,13 +318,13 @@ static void __init ksm_slab_free(void)
 
 static inline struct stable_node *page_stable_node(struct page *page)
 {
-	return PageKsm(page) ? page_rmapping(page) : NULL;
+	return PageReadOnly(page) ? page_rmapping(page) : NULL;
 }
 
 static inline void set_page_stable_node(struct page *page,
 					struct stable_node *stable_node)
 {
-	page->mapping = (void *)((unsigned long)stable_node | PAGE_MAPPING_KSM);
+	page->mapping = (void *)((unsigned long)stable_node | PAGE_MAPPING_RONLY);
 }
 
 static __always_inline bool is_stable_node_chain(struct stable_node *chain)
@@ -470,7 +470,7 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr)
 				FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE);
 		if (IS_ERR_OR_NULL(page))
 			break;
-		if (PageKsm(page))
+		if (PageReadOnly(page))
 			ret = handle_mm_fault(vma, addr,
 					FAULT_FLAG_WRITE | FAULT_FLAG_REMOTE);
 		else
@@ -684,7 +684,7 @@ static struct page *get_ksm_page(struct stable_node *stable_node, bool lock_it)
 	unsigned long kpfn;
 
 	expected_mapping = (void *)((unsigned long)stable_node |
-					PAGE_MAPPING_KSM);
+					PAGE_MAPPING_RONLY);
 again:
 	kpfn = READ_ONCE(stable_node->kpfn); /* Address dependency. */
 	page = pfn_to_page(kpfn);
@@ -2490,7 +2490,7 @@ struct page *ksm_might_need_to_copy(struct page *page,
 	struct anon_vma *anon_vma = page_anon_vma(page);
 	struct page *new_page;
 
-	if (PageKsm(page)) {
+	if (PageReadOnly(page)) {
 		if (page_stable_node(page) &&
 		    !(ksm_run & KSM_RUN_UNMERGE))
 			return page;	/* no need to copy it */
@@ -2521,7 +2521,7 @@ void rmap_walk_ksm(struct page *page, struct rmap_walk_control *rwc)
 	struct rmap_item *rmap_item;
 	int search_new_forks = 0;
 
-	VM_BUG_ON_PAGE(!PageKsm(page), page);
+	VM_BUG_ON_PAGE(!PageReadOnly(page), page);
 
 	/*
 	 * Rely on the page lock to protect against concurrent modifications
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 8291b75f42c8..18efefc20e67 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -947,7 +947,7 @@ static bool hwpoison_user_mappings(struct page *p, unsigned long pfn,
 	if (!page_mapped(hpage))
 		return true;
 
-	if (PageKsm(p)) {
+	if (PageReadOnly(p)) {
 		pr_err("Memory failure: %#lx: can't handle KSM pages.\n", pfn);
 		return false;
 	}
diff --git a/mm/memory.c b/mm/memory.c
index fbd80bb7a50a..b565db41400f 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2733,7 +2733,7 @@ static int do_wp_page(struct vm_fault *vmf)
 	 * Take out anonymous pages first, anonymous shared vmas are
 	 * not dirty accountable.
 	 */
-	if (PageAnon(vmf->page) && !PageKsm(vmf->page)) {
+	if (PageAnon(vmf->page) && !PageReadOnly(vmf->page)) {
 		int total_map_swapcount;
 		if (!trylock_page(vmf->page)) {
 			get_page(vmf->page);
diff --git a/mm/migrate.c b/mm/migrate.c
index e4b20ac6cf36..b73b31f6d2fd 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -214,7 +214,7 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
 
 	VM_BUG_ON_PAGE(PageTail(page), page);
 	while (page_vma_mapped_walk(&pvmw)) {
-		if (PageKsm(page))
+		if (PageReadOnly(page))
 			new = page;
 		else
 			new = page - pvmw.page->index +
@@ -1038,7 +1038,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 	 * because that implies that the anon page is no longer mapped
 	 * (and cannot be remapped so long as we hold the page lock).
 	 */
-	if (PageAnon(page) && !PageKsm(page))
+	if (PageAnon(page) && !PageReadOnly(page))
 		anon_vma = page_get_anon_vma(page);
 
 	/*
@@ -1077,7 +1077,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 		}
 	} else if (page_mapped(page)) {
 		/* Establish migration ptes */
-		VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !anon_vma,
+		VM_BUG_ON_PAGE(PageAnon(page) && !PageReadOnly(page) && !anon_vma,
 				page);
 		try_to_unmap(page,
 			TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index e3309fcf586b..ab2f2e4961d8 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -81,7 +81,7 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 				struct page *page;
 
 				page = vm_normal_page(vma, addr, oldpte);
-				if (!page || PageKsm(page))
+				if (!page || PageReadOnly(page))
 					continue;
 
 				/* Also skip shared copy-on-write pages */
diff --git a/mm/page_idle.c b/mm/page_idle.c
index 0a49374e6931..7e5258e4d2ad 100644
--- a/mm/page_idle.c
+++ b/mm/page_idle.c
@@ -104,7 +104,7 @@ static void page_idle_clear_pte_refs(struct page *page)
 	    !page_rmapping(page))
 		return;
 
-	need_lock = !PageAnon(page) || PageKsm(page);
+	need_lock = !PageAnon(page) || PageReadOnly(page);
 	if (need_lock && !trylock_page(page))
 		return;
 
diff --git a/mm/rmap.c b/mm/rmap.c
index 822a3a0cd51c..70d37f77e7a4 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -855,7 +855,7 @@ int page_referenced(struct page *page,
 	if (!page_rmapping(page))
 		return 0;
 
-	if (!is_locked && (!PageAnon(page) || PageKsm(page))) {
+	if (!is_locked && (!PageAnon(page) || PageReadOnly(page))) {
 		we_locked = trylock_page(page);
 		if (!we_locked)
 			return 1;
@@ -1122,7 +1122,7 @@ void do_page_add_anon_rmap(struct page *page,
 			__inc_node_page_state(page, NR_ANON_THPS);
 		__mod_node_page_state(page_pgdat(page), NR_ANON_MAPPED, nr);
 	}
-	if (unlikely(PageKsm(page)))
+	if (unlikely(PageReadOnly(page)))
 		return;
 
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
@@ -1660,7 +1660,7 @@ bool try_to_unmap(struct page *page, enum ttu_flags flags)
 	 * temporary VMAs until after exec() completes.
 	 */
 	if ((flags & (TTU_MIGRATION|TTU_SPLIT_FREEZE))
-	    && !PageKsm(page) && PageAnon(page))
+	    && !PageReadOnly(page) && PageAnon(page))
 		rwc.invalid_vma = invalid_migration_vma;
 
 	if (flags & TTU_RMAP_LOCKED)
@@ -1842,7 +1842,7 @@ static void rmap_walk_file(struct page *page, struct rmap_walk_control *rwc,
 
 void rmap_walk(struct page *page, struct rmap_walk_control *rwc)
 {
-	if (unlikely(PageKsm(page)))
+	if (unlikely(PageReadOnly(page)))
 		rmap_walk_ksm(page, rwc);
 	else if (PageAnon(page))
 		rmap_walk_anon(page, rwc, false);
@@ -1854,7 +1854,7 @@ void rmap_walk(struct page *page, struct rmap_walk_control *rwc)
 void rmap_walk_locked(struct page *page, struct rmap_walk_control *rwc)
 {
 	/* no ksm support for now */
-	VM_BUG_ON_PAGE(PageKsm(page), page);
+	VM_BUG_ON_PAGE(PageReadOnly(page), page);
 	if (PageAnon(page))
 		rmap_walk_anon(page, rwc, true);
 	else
diff --git a/mm/swapfile.c b/mm/swapfile.c
index c429c19e5d5d..83c73cca9e21 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1552,7 +1552,7 @@ bool reuse_swap_page(struct page *page, int *total_map_swapcount)
 	int count, total_mapcount, total_swapcount;
 
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
-	if (unlikely(PageKsm(page)))
+	if (unlikely(PageReadOnly(page)))
 		return false;
 	count = page_trans_huge_map_swapcount(page, &total_mapcount,
 					      &total_swapcount);
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH 79/79] mm/ksm: set page->mapping to page_ronly struct instead of stable_node.
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (39 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 78/79] mm/ksm: rename PAGE_MAPPING_KSM to PAGE_MAPPING_RONLY jglisse
@ 2018-04-04 19:18 ` jglisse
  2018-04-18 14:13 ` [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue Jan Kara
  2018-04-20 19:57 ` Tim Chen
  42 siblings, 0 replies; 50+ messages in thread
From: jglisse @ 2018-04-04 19:18 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Jérôme Glisse, Andrea Arcangeli

From: Jérôme Glisse <jglisse@redhat.com>

Set page->mapping to the page_ronly struct instead of the stable_node
struct. There is no functional change, as page_ronly is just a field
of stable_node.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
---
 mm/ksm.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index 6085068fb8b3..52b0ae291d23 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -39,6 +39,7 @@
 #include <linux/freezer.h>
 #include <linux/oom.h>
 #include <linux/numa.h>
+#include <linux/page_ronly.h>
 
 #include <asm/tlbflush.h>
 #include "internal.h"
@@ -126,6 +127,7 @@ struct ksm_scan {
 
 /**
  * struct stable_node - node of the stable rbtree
+ * @ronly: Page read only struct wrapper (see include/linux/page_ronly.h).
  * @node: rb node of this ksm page in the stable tree
  * @head: (overlaying parent) &migrate_nodes indicates temporarily on that list
  * @hlist_dup: linked into the stable_node->hlist with a stable_node chain
@@ -137,6 +139,7 @@ struct ksm_scan {
  * @nid: NUMA node id of stable tree in which linked (may not match kpfn)
  */
 struct stable_node {
+	struct page_ronly ronly;
 	union {
 		struct rb_node node;	/* when node of stable tree */
 		struct {		/* when listed for migration */
@@ -318,13 +321,15 @@ static void __init ksm_slab_free(void)
 
 static inline struct stable_node *page_stable_node(struct page *page)
 {
-	return PageReadOnly(page) ? page_rmapping(page) : NULL;
+	struct page_ronly *ronly = page_ronly(page);
+
+	return ronly ? container_of(ronly, struct stable_node, ronly) : NULL;
 }
 
 static inline void set_page_stable_node(struct page *page,
 					struct stable_node *stable_node)
 {
-	page->mapping = (void *)((unsigned long)stable_node | PAGE_MAPPING_RONLY);
+	page_ronly_set(page, stable_node ? &stable_node->ronly : NULL);
 }
 
 static __always_inline bool is_stable_node_chain(struct stable_node *chain)
-- 
2.14.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (40 preceding siblings ...)
  2018-04-04 19:18 ` [RFC PATCH 79/79] mm/ksm: set page->mapping to page_ronly struct instead of stable_node jglisse
@ 2018-04-18 14:13 ` Jan Kara
  2018-04-18 15:54   ` Jerome Glisse
  2018-04-20 19:57 ` Tim Chen
  42 siblings, 1 reply; 50+ messages in thread
From: Jan Kara @ 2018-04-18 14:13 UTC (permalink / raw)
  To: jglisse
  Cc: linux-mm, linux-fsdevel, linux-block, linux-kernel,
	Andrea Arcangeli, Alexander Viro, Tim Chen, Theodore Ts'o,
	Tejun Heo, Jan Kara, Josef Bacik, Mel Gorman, Jeff Layton

Hello,

so I finally got to this :)

On Wed 04-04-18 15:17:50, jglisse@redhat.com wrote:
> From: Jérôme Glisse <jglisse@redhat.com>
> 
> https://cgit.freedesktop.org/~glisse/linux/log/?h=generic-write-protection-rfc
> 
> This is an RFC for LSF/MM discussions. It impacts the file subsystem,
> the block subsystem and the mm subsystem. Hence it would benefit from
> a cross sub-system discussion.
> 
> Patchset is not fully bake so take it with a graint of salt. I use it
> to illustrate the fact that it is doable and now that i did it once i
> believe i have a better and cleaner plan in my head on how to do this.
> I intend to share and discuss it at LSF/MM (i still need to write it
> down). That plan lead to quite different individual steps than this
> patchset takes and his also easier to split up in more manageable
> pieces.
> 
> I also want to apologize for the size and number of patches (and i am
> not even sending them all).
> 
> ----------------------------------------------------------------------
> The Why ?
> 
> I have two objectives: duplicate memory read only accross nodes and or
> devices and work around PCIE atomic limitations. More on each of those
> objective below. I also want to put forward that it can solve the page
> wait list issue ie having each page with its own wait list and thus
> avoiding long wait list traversale latency recently reported [1].
> 
> It does allow KSM for file back pages (truely generic KSM even between
> both anonymous and file back page). I am not sure how useful this can
> be, this was not an objective i did pursue, this is just a for free
> feature (see below).

I know some people (Matthew Wilcox?) wanted to do something like KSM for
file pages - not all virtualization schemes use overlayfs and e.g. if you
use reflinks (essentially shared on-disk extents among files) for your
container setup, you could save significant amounts of memory with the
ability to share pages in page cache among files that are reflinked.

> [1] https://groups.google.com/forum/#!topic/linux.kernel/Iit1P5BNyX8
> 
> ----------------------------------------------------------------------
> Per page wait list, so long page_waitqueue() !
> 
> Not implemented in this RFC but below is the logic and pseudo code
> at bottom of this email.
> 
> When there is a contention on struct page lock bit, the caller which
> is trying to lock the page will add itself to a waitqueue. The issues
> here is that multiple pages share the same wait queue and on large
> system with a lot of ram this means we can quickly get to a long list
> of waiters for differents pages (or for the same page) on the same
> list [1].
> 
> The present patchset virtualy kills all places that need to access the
> page->mapping field and only a handfull are left, namely for testing
> page truncation and for vmscan. The former can be remove if we reuse
> the PG_waiters flag for a new PG_truncate flag set on truncation then
> we can virtualy kill all derefence of page->mapping (this patchset
> proves it is doable). NOTE THIS DOES NOT MEAN THAT MAPPING is FREE TO
> BE USE BY ANYONE TO STORE WHATEVER IN STRUCT PAGE. SORRY NO !

It is interesting that you can get rid of page->mapping uses in most
places. For page reclaim (vmscan) you'll still need a way to get from a
page to an address_space so that you can reclaim the page, so you can
hardly get rid of page->mapping completely, but you're right that with
such limited use that transition could be more complex / expensive.

What I wonder though is what the cost of this is (in terms of code size
and speed) - propagating the mapping down the stack costs something... Also
in terms of maintainability, code readability suffers a bit.

This could be helped though. In some cases it seems we just use the mapping
because it was easily available but could get away without it. In other
case (e.g. lot of fs/buffer.c) we could make bh -> mapping transition easy
by storing the mapping in the struct buffer_head - possibly it could
replace b_bdev pointer as we could get to that from the mapping with a bit
of magic and pointer chasing and accessing b_bdev is not very performance
critical. OTOH such optimizations would make rather complex patches out of
a mostly mechanical replacement, so I can see why you didn't go that route.

Overall I think you'd need to make a good benchmarking comparison showing
how much this helps some real workloads (your motivation) and also how
other loads on lower end machines are affected.

> ----------------------------------------------------------------------
> The What ?
> 
> Aim of this patch serie is to introduce generic page write protection
> for any kind of regular page in a process (private anonymous or back
> by regular file). This feature already exist, in one form, for private
> anonymous page, as part of KSM (Kernel Share Memory).
> 
> So this patch serie is two fold. First it factors out the page write
> protection of KSM into a generic write protection mechanim which KSM
> becomes the first user of. Then it add support for regular file back
> page memory (regular file or share memory aka shmem). To achieve this
> i need to cut the dependency lot of code have on page->mapping so i
> can set page->mapping to point to special structure when write
> protected.

So I'm interested in this write protection mechanism but I didn't find much
about it in the series. How does it work? I can see KSM writeprotects pages
in page tables so that works for userspace mappings but what about
in-kernel users modifying pages - e.g. pages in page cache carrying
filesystem metadata do get modified a lot like this.

> ----------------------------------------------------------------------
> The How ?
> 
> The corner stone assumption in this patch serie is that page->mapping
> is always the same as vma->vm_file->f_mapping (modulo when a page is
> truncated). The one exception is in respect to swaping with nfs file.
> 
> Am i fundamentaly wrong in my assumption ?

AFAIK you're right.

> I believe this is a do-able plan because virtually all place do know
> the address_space a page belongs to, or someone in the callchain do.
> Hence this patchset is all about passing down that information. The
> only exception i am aware of is page reclamation (vmscan) but this can
> be handled as a special case as there we not interested in the page
> mapping per say but in reclaiming memory.
> 
> Once you have both struct page and mapping (without relying on the
> struct page to get the latter) you can use mapping that as a unique
> key to lookup page->private/page->index value. So all dereference of
> those fields become:
>     page_offset(page) -> page_offset(page, mapping)
>     page_buffers(page) -> page_buffers(page, mapping)
> 
> Note than this only need special handling for write protected page ie
> it is the same as before if page is not write protected so it just add
> a test each time code call either helper.
> 
> Sinful function (all existing usage are remove in this patchset):
>     page_mapping(page)
> 
> You can also use the page buffer head as a unique key. So following
> helpers are added (thought i do not use them):
>     page_mapping_with_buffers(page, (struct buffer_head *)bh)
>     page_offset_with_buffers(page, (struct buffer_head *)bh)
> 
> A write protected page has page->mapping pointing to a structure like
> struct rmap_item for KSM. So this structure has a list for each unique
> combination:
>     struct write_protect {
>         struct list_head *mappings; /* write_protect_mapping list */
>         ...
>     };
> 
>     struct write_protect_mapping {
>         struct list_head list
>         struct address_space *mapping;
>         unsigned long offset;
>         unsigned long private;
>         ...
>     };

Ouch, the fact that we could share a page as data storage for several
inode+offset combinations that are not sharing underlying storage just
looks viciously twisted ;) But is it really that useful to warrant the
complications? In particular I'm afraid that filesystems expect consistency
between their internal state (attached to page->private) and page state
(e.g. page->flags), and when there are multiple internal states attached to
the same page this could easily go wrong...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue
  2018-04-18 14:13 ` [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue Jan Kara
@ 2018-04-18 15:54   ` Jerome Glisse
  2018-04-18 16:20     ` Darrick J. Wong
  2018-04-19 10:32     ` Jan Kara
  0 siblings, 2 replies; 50+ messages in thread
From: Jerome Glisse @ 2018-04-18 15:54 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-mm, linux-fsdevel, linux-block, linux-kernel,
	Andrea Arcangeli, Alexander Viro, Tim Chen, Theodore Ts'o,
	Tejun Heo, Josef Bacik, Mel Gorman, Jeff Layton

On Wed, Apr 18, 2018 at 04:13:37PM +0200, Jan Kara wrote:
> Hello,
> 
> so I finally got to this :)
> 
> On Wed 04-04-18 15:17:50, jglisse@redhat.com wrote:
> > From: Jérôme Glisse <jglisse@redhat.com>

[...]

> > ----------------------------------------------------------------------
> > The Why ?
> > 
> > I have two objectives: duplicate memory read only accross nodes and or
> > devices and work around PCIE atomic limitations. More on each of those
> > objective below. I also want to put forward that it can solve the page
> > wait list issue ie having each page with its own wait list and thus
> > avoiding long wait list traversale latency recently reported [1].
> > 
> > It does allow KSM for file back pages (truely generic KSM even between
> > both anonymous and file back page). I am not sure how useful this can
> > be, this was not an objective i did pursue, this is just a for free
> > feature (see below).
> 
> I know some people (Matthew Wilcox?) wanted to do something like KSM for
> file pages - not all virtualization schemes use overlayfs and e.g. if you
> use reflinks (essentially shared on-disk extents among files) for your
> container setup, you could save significant amounts of memory with the
> ability to share pages in page cache among files that are reflinked.

Yes, I believe there are still use cases where KSM with file backed pages
makes sense; I am just not familiar enough with those workloads to know how
big of a deal this is.

> > [1] https://groups.google.com/forum/#!topic/linux.kernel/Iit1P5BNyX8
> > 
> > ----------------------------------------------------------------------
> > Per page wait list, so long page_waitqueue() !
> > 
> > Not implemented in this RFC but below is the logic and pseudo code
> > at bottom of this email.
> > 
> > When there is a contention on struct page lock bit, the caller which
> > is trying to lock the page will add itself to a waitqueue. The issues
> > here is that multiple pages share the same wait queue and on large
> > system with a lot of ram this means we can quickly get to a long list
> > of waiters for differents pages (or for the same page) on the same
> > list [1].
> > 
> > The present patchset virtualy kills all places that need to access the
> > page->mapping field and only a handfull are left, namely for testing
> > page truncation and for vmscan. The former can be remove if we reuse
> > the PG_waiters flag for a new PG_truncate flag set on truncation then
> > we can virtualy kill all derefence of page->mapping (this patchset
> > proves it is doable). NOTE THIS DOES NOT MEAN THAT MAPPING is FREE TO
> > BE USE BY ANYONE TO STORE WHATEVER IN STRUCT PAGE. SORRY NO !
> 
> It is interesting that you can get rid of page->mapping uses in most
> places. For page reclaim (vmscan) you'll still need a way to get from a
> page to an address_space so that you can reclaim the page so you can hardly
> get rid of page->mapping completely but you're right that with such limited
> use that transition could be more complex / expensive.

The idea for vmscan is that you either have the regular mapping pointer
stored in page->mapping, or a pointer to a special struct which has a
function pointer to a reclaim/walker function (like rmap_walk_ksm).

> What I wonder though is what is the cost of this (in the terms of code size
> and speed) - propagating the mapping down the stack costs something... Also
> in terms of maintainability, code readability suffers a bit.

I haven't checked that yet, but I will. I was not so concerned because in
the vast majority of places there is already a struct address_space on the
stack frame (ie a local variable in the function being called), so moving
it to a function argument shouldn't impact that. However, I expect this
will be merged over multiple kernel release cycles, and the intermediary
steps will see an increase in stack size. The code size should only grow
marginally, I expect. I will provide numbers with my next posting after
LSF/MM.


> This could be helped though. In some cases it seems we just use the mapping
> because it was easily available but could get away without it. In other
> case (e.g. lot of fs/buffer.c) we could make bh -> mapping transition easy
> by storing the mapping in the struct buffer_head - possibly it could
> replace b_bdev pointer as we could get to that from the mapping with a bit
> of magic and pointer chasing and accessing b_bdev is not very performance
> critical. OTOH such optimizations make a rather complex patches from mostly
> mechanical replacement so I can see why you didn't go that route.

I am willing to do the buffer_head change. I remember considering it, but
I don't remember why I didn't do it (I failed to take note of that).


> Overall I think you'd need to make a good benchmarking comparison showing
> how much this helps some real workloads (your motivation) and also how
> other loads on lower end machines are affected.

Do you have any specific benchmark you would like to see? My list was:
  https://github.com/01org/lkp-tests
  https://github.com/gormanm/mmtests
  https://github.com/akopytov/sysbench/
  http://git.infradead.org/users/dhowells/unionmount-testsuite.git

For the workloads I care about, these will be CUDA workloads. We are still
working on the OpenCL open source stack, but I don't expect we will have
something that can show the same performance improvement with OpenCL as
soon as with CUDA.

> > ----------------------------------------------------------------------
> > The What ?
> > 
> > Aim of this patch serie is to introduce generic page write protection
> > for any kind of regular page in a process (private anonymous or back
> > by regular file). This feature already exist, in one form, for private
> > anonymous page, as part of KSM (Kernel Share Memory).
> > 
> > So this patch serie is two fold. First it factors out the page write
> > protection of KSM into a generic write protection mechanim which KSM
> > becomes the first user of. Then it add support for regular file back
> > page memory (regular file or share memory aka shmem). To achieve this
> > i need to cut the dependency lot of code have on page->mapping so i
> > can set page->mapping to point to special structure when write
> > protected.
> 
> So I'm interested in this write protection mechanism but I didn't find much
> about it in the series. How does it work? I can see KSM writeprotects pages
> in page tables so that works for userspace mappings but what about
> in-kernel users modifying pages - e.g. pages in page cache carrying
> filesystem metadata do get modified a lot like this.

So I only care about pages which are mmaped into a process address space.
At first I only want to intercept CPU write access through mmap of a file,
but I also intend to extend the write syscall to also "fault" on the write
protection, ie it will call a callback to unprotect the page, allowing the
write protector to take proper action while the write syscall is happening.

I am afraid truly generic write protection for metadata pages is a bit
out of scope of what I am doing. However, the mechanism I am proposing
can be extended for that too. The issue is that all places that want to
write to those pages need to be converted to something where the write
happens between a write_begin and write_end section (mmap and CPU ptes
give this implicitly through the page fault, and so does the write
syscall). Basically there is a need to make sure that writes and write
protection can be ordered against one another without complex locking.


> > ----------------------------------------------------------------------
> > The How ?
> > 
> > The corner stone assumption in this patch serie is that page->mapping
> > is always the same as vma->vm_file->f_mapping (modulo when a page is
> > truncated). The one exception is in respect to swaping with nfs file.
> > 
> > Am i fundamentaly wrong in my assumption ?
> 
> AFAIK you're right.
> 
> > I believe this is a do-able plan because virtually all place do know
> > the address_space a page belongs to, or someone in the callchain do.
> > Hence this patchset is all about passing down that information. The
> > only exception i am aware of is page reclamation (vmscan) but this can
> > be handled as a special case as there we not interested in the page
> > mapping per say but in reclaiming memory.
> > 
> > Once you have both struct page and mapping (without relying on the
> > struct page to get the latter) you can use mapping that as a unique
> > key to lookup page->private/page->index value. So all dereference of
> > those fields become:
> >     page_offset(page) -> page_offset(page, mapping)
> >     page_buffers(page) -> page_buffers(page, mapping)
> > 
> > Note than this only need special handling for write protected page ie
> > it is the same as before if page is not write protected so it just add
> > a test each time code call either helper.
> > 
> > Sinful function (all existing usage are remove in this patchset):
> >     page_mapping(page)
> > 
> > You can also use the page buffer head as a unique key. So following
> > helpers are added (thought i do not use them):
> >     page_mapping_with_buffers(page, (struct buffer_head *)bh)
> >     page_offset_with_buffers(page, (struct buffer_head *)bh)
> > 
> > A write protected page has page->mapping pointing to a structure like
> > struct rmap_item for KSM. So this structure has a list for each unique
> > combination:
> >     struct write_protect {
> >         struct list_head *mappings; /* write_protect_mapping list */
> >         ...
> >     };
> > 
> >     struct write_protect_mapping {
> >         struct list_head list
> >         struct address_space *mapping;
> >         unsigned long offset;
> >         unsigned long private;
> >         ...
> >     };
> 
> Auch, the fact that we could share a page as data storage for several
> inode+offset combinations that are not sharing underlying storage just
> looks viciously twisted ;) But is it really that useful to warrant
> complications? In particular I'm afraid that filesystems expect consistency
> between their internal state (attached to page->private) and page state
> (e.g. page->flags) and when there are multiple internal states attached to
> the same page this could go easily wrong...

So at first I want to limit this to write protection (not KSM), thus
page->flags will stay consistent (ie a page is only ever associated with a
single mapping). For KSM, yes, the page->flags can be problematic; however,
here we can assume that the page is clean (and uptodate) and not under
writeback. So the problematic flags for KSM are:
  - private (page_has_buffers() or PagePrivate (nfs, metadata, ...))
  - private_2 (FsCache)
  - mappedtodisk
  - swapcache
  - error

The idea again would be a PageFlagsWithMapping(page, mapping) so that for
a non-KSM write protected page you test the usual page->flags, and for a
write protected page you find the flag value using the mapping as a lookup
index. Usually those flags are seldom changed/accessed. Again the overhead
(ignoring code size) would only be for pages which are KSM. So maybe KSM
will not make sense because of the perf overhead it has on page->flags
access (I don't think so, but I haven't tested this).


Thank you for taking time to read over all this.

Cheers,
Jérôme

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue
  2018-04-18 15:54   ` Jerome Glisse
@ 2018-04-18 16:20     ` Darrick J. Wong
  2018-04-19 10:32     ` Jan Kara
  1 sibling, 0 replies; 50+ messages in thread
From: Darrick J. Wong @ 2018-04-18 16:20 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: Jan Kara, linux-mm, linux-fsdevel, linux-block, linux-kernel,
	Andrea Arcangeli, Alexander Viro, Tim Chen, Theodore Ts'o,
	Tejun Heo, Josef Bacik, Mel Gorman, Jeff Layton

On Wed, Apr 18, 2018 at 11:54:30AM -0400, Jerome Glisse wrote:
> On Wed, Apr 18, 2018 at 04:13:37PM +0200, Jan Kara wrote:
> > Hello,
> > 
> > so I finally got to this :)
> > 
> > On Wed 04-04-18 15:17:50, jglisse@redhat.com wrote:
> > > From: Jérôme Glisse <jglisse@redhat.com>
> 
> [...]
> 
> > > ----------------------------------------------------------------------
> > > The Why ?
> > > 
> > > I have two objectives: duplicate memory read only accross nodes and or
> > > devices and work around PCIE atomic limitations. More on each of those
> > > objective below. I also want to put forward that it can solve the page
> > > wait list issue ie having each page with its own wait list and thus
> > > avoiding long wait list traversale latency recently reported [1].
> > > 
> > > It does allow KSM for file back pages (truely generic KSM even between
> > > both anonymous and file back page). I am not sure how useful this can
> > > be, this was not an objective i did pursue, this is just a for free
> > > feature (see below).
> > 
> > I know some people (Matthew Wilcox?) wanted to do something like KSM for
> > file pages - not all virtualization schemes use overlayfs and e.g. if you
> > use reflinks (essentially shared on-disk extents among files) for your
> > container setup, you could save significant amounts of memory with the
> > ability to share pages in page cache among files that are reflinked.
> 
> Yes i believe they are still use case where KSM with file back page make
> senses, i am just not familiar enough with those workload to know how big
> of a deal this is.

Imagine container farms where they deploy the base OS image via cp --reflink.
This would be a huge win for btrfs/xfs but we've all been too terrified
of the memory manager to try it. :)

For those following at home, we had a track at LSFMM2017 (and hallway
bofs about this in previous years):
https://lwn.net/Articles/717950/

/me starts to look at this big series, having sent his own yesterday. :)

--D

> > > [1] https://groups.google.com/forum/#!topic/linux.kernel/Iit1P5BNyX8
> > > 
> > > ----------------------------------------------------------------------
> > > Per page wait list, so long page_waitqueue() !
> > > 
> > > Not implemented in this RFC but below is the logic and pseudo code
> > > at bottom of this email.
> > > 
> > > When there is a contention on struct page lock bit, the caller which
> > > is trying to lock the page will add itself to a waitqueue. The issues
> > > here is that multiple pages share the same wait queue and on large
> > > system with a lot of ram this means we can quickly get to a long list
> > > of waiters for differents pages (or for the same page) on the same
> > > list [1].
> > > 
> > > The present patchset virtualy kills all places that need to access the
> > > page->mapping field and only a handfull are left, namely for testing
> > > page truncation and for vmscan. The former can be remove if we reuse
> > > the PG_waiters flag for a new PG_truncate flag set on truncation then
> > > we can virtualy kill all derefence of page->mapping (this patchset
> > > proves it is doable). NOTE THIS DOES NOT MEAN THAT MAPPING is FREE TO
> > > BE USE BY ANYONE TO STORE WHATEVER IN STRUCT PAGE. SORRY NO !
> > 
> > It is interesting that you can get rid of page->mapping uses in most
> > places. For page reclaim (vmscan) you'll still need a way to get from a
> > page to an address_space so that you can reclaim the page so you can hardly
> > get rid of page->mapping completely but you're right that with such limited
> > use that transition could be more complex / expensive.
> 
> Idea for vmscan is that you either have regular mapping pointer store in
> page->mapping or you have a pointer to special struct which has a function
> pointer to a reclaim/walker function (rmap_walk_ksm)
> 
> > What I wonder though is what is the cost of this (in the terms of code size
> > and speed) - propagating the mapping down the stack costs something... Also
> > in terms of maintainability, code readability suffers a bit.
> 
> I haven't checked that, i will, i was not so concern because in the vast
> majority of places there is already struct address_space on the stack
> frame (ie local variable in function being call) so moving it to function
> argument shouldn't impact that. However as i expect this will be merge
> over multiple kernel release cycle and the intermediary step will see an
> increase in stack size. The code size should only grow marginaly i expect.
> I will provide numbers with my next posting after LSF/MM.
> 
> 
> > This could be helped though. In some cases it seems we just use the mapping
> > because it was easily available but could get away without it. In other
> > case (e.g. lot of fs/buffer.c) we could make bh -> mapping transition easy
> > by storing the mapping in the struct buffer_head - possibly it could
> > replace b_bdev pointer as we could get to that from the mapping with a bit
> > of magic and pointer chasing and accessing b_bdev is not very performance
> > critical. OTOH such optimizations make a rather complex patches from mostly
> > mechanical replacement so I can see why you didn't go that route.
> 
> I am willing to do the buffer_head change, i remember considering it but
> i don't remember why not doing it (i failed to take note of that).
> 
> 
> > Overall I think you'd need to make a good benchmarking comparison showing
> > how much this helps some real workloads (your motivation) and also how
> > other loads on lower end machines are affected.
> 
> Do you have any specific benchmark you would like to see ? My list was:
>   https://github.com/01org/lkp-tests
>   https://github.com/gormanm/mmtests
>   https://github.com/akopytov/sysbench/
>   http://git.infradead.org/users/dhowells/unionmount-testsuite.git
> 
> For workload i care this will be CUDA workload. We are still working on
> the OpenCL open source stack but i don't expect we will have someting
> that can shows the same performance improvement with OpenCL as soon as
> with CUDA.
> 
> > > ----------------------------------------------------------------------
> > > The What ?
> > > 
> > > Aim of this patch serie is to introduce generic page write protection
> > > for any kind of regular page in a process (private anonymous or back
> > > by regular file). This feature already exist, in one form, for private
> > > anonymous page, as part of KSM (Kernel Share Memory).
> > > 
> > > So this patch serie is two fold. First it factors out the page write
> > > protection of KSM into a generic write protection mechanim which KSM
> > > becomes the first user of. Then it add support for regular file back
> > > page memory (regular file or share memory aka shmem). To achieve this
> > > i need to cut the dependency lot of code have on page->mapping so i
> > > can set page->mapping to point to special structure when write
> > > protected.
> > 
> > So I'm interested in this write protection mechanism but I didn't find much
> > about it in the series. How does it work? I can see KSM writeprotects pages
> > in page tables so that works for userspace mappings but what about
> > in-kernel users modifying pages - e.g. pages in page cache carrying
> > filesystem metadata do get modified a lot like this.
> 
> So i only care about page which are mmaped into a process address space.
> At first i only want to intercept CPU write access through mmap of file
> but i also intend to extend write syscall to also "fault" on the write
> protection ie it will call a callback to unprotect the page allowing the
> write protector to take proper action while write syscall is happening.
> 
> I am affraid truely generic write protection for metadata pages is bit
> out of scope of what i am doing. However the mechanism i am proposing
> can be extended for that too. Issue is that all place that want to write
> to those page need to be converted to something where write happens
> between write_begin and write_end section (mmap and CPU pte does give
> this implicitly through page fault, so does write syscall). Basicly
> there is a need to make sure that write and write protection can be
> ordered against one another without complex locking.
> 
> 
> > > ----------------------------------------------------------------------
> > > The How ?
> > > 
> > > The corner stone assumption in this patch serie is that page->mapping
> > > is always the same as vma->vm_file->f_mapping (modulo when a page is
> > > truncated). The one exception is in respect to swaping with nfs file.
> > > 
> > > Am i fundamentaly wrong in my assumption ?
> > 
> > AFAIK you're right.
> > 
> > > I believe this is a do-able plan because virtually all place do know
> > > the address_space a page belongs to, or someone in the callchain do.
> > > Hence this patchset is all about passing down that information. The
> > > only exception i am aware of is page reclamation (vmscan) but this can
> > > be handled as a special case as there we not interested in the page
> > > mapping per say but in reclaiming memory.
> > > 
> > > Once you have both struct page and mapping (without relying on the
> > > struct page to get the latter) you can use mapping that as a unique
> > > key to lookup page->private/page->index value. So all dereference of
> > > those fields become:
> > >     page_offset(page) -> page_offset(page, mapping)
> > >     page_buffers(page) -> page_buffers(page, mapping)
> > > 
> > > Note than this only need special handling for write protected page ie
> > > it is the same as before if page is not write protected so it just add
> > > a test each time code call either helper.
> > > 
> > > Sinful function (all existing usage are remove in this patchset):
> > >     page_mapping(page)
> > > 
> > > You can also use the page buffer head as a unique key. So following
> > > helpers are added (thought i do not use them):
> > >     page_mapping_with_buffers(page, (struct buffer_head *)bh)
> > >     page_offset_with_buffers(page, (struct buffer_head *)bh)
> > > 
> > > A write protected page has page->mapping pointing to a structure like
> > > struct rmap_item for KSM. So this structure has a list for each unique
> > > combination:
> > >     struct write_protect {
> > >         struct list_head *mappings; /* write_protect_mapping list */
> > >         ...
> > >     };
> > > 
> > >     struct write_protect_mapping {
> > >         struct list_head list
> > >         struct address_space *mapping;
> > >         unsigned long offset;
> > >         unsigned long private;
> > >         ...
> > >     };
> > 
> > Auch, the fact that we could share a page as data storage for several
> > inode+offset combinations that are not sharing underlying storage just
> > looks viciously twisted ;) But is it really that useful to warrant
> > complications? In particular I'm afraid that filesystems expect consistency
> > between their internal state (attached to page->private) and page state
> > (e.g. page->flags) and when there are multiple internal states attached to
> > the same page this could go easily wrong...
> 
> So at first i want to limit to write protect (not KSM) thus page->flags
> will stay consistent (ie page is only ever associated with a single
> mapping). For KSM yes the page->flags can be problematic, however here
> we can assume that page is clean (and uptodate) and not under write
> back. So problematic flags for KSM:
>   - private (page_has_buffers() or PagePrivate (nfs, metadata, ...))
>   - private_2 (FsCache)
>   - mappedtodisk
>   - swapcache
>   - error
> 
> Idea again would be to PageFlagsWithMapping(page, mapping) so that for
> non KSM write protected page you test the usual page->flags and for
> write protected page you find the flag value using mapping as lookup
> index. Usualy those flag are seldomly changed/accessed. Again the
> overhead (ignoring code size) would only be for page which are KSM.
> So maybe KSM will not make sense because perf overhead it has with
> page->flags access (i don't think so but i haven't tested this).
> 
> 
> Thank you for taking time to read over all this.
> 
> Cheers,
> Jérôme

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue
  2018-04-18 15:54   ` Jerome Glisse
  2018-04-18 16:20     ` Darrick J. Wong
@ 2018-04-19 10:32     ` Jan Kara
  2018-04-19 14:52       ` Jerome Glisse
  1 sibling, 1 reply; 50+ messages in thread
From: Jan Kara @ 2018-04-19 10:32 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: Jan Kara, linux-mm, linux-fsdevel, linux-block, linux-kernel,
	Andrea Arcangeli, Alexander Viro, Tim Chen, Theodore Ts'o,
	Tejun Heo, Josef Bacik, Mel Gorman, Jeff Layton

On Wed 18-04-18 11:54:30, Jerome Glisse wrote:
> > Overall I think you'd need to make a good benchmarking comparison showing
> > how much this helps some real workloads (your motivation) and also how
> > other loads on lower end machines are affected.
> 
> Do you have any specific benchmark you would like to see ? My list was:
>   https://github.com/01org/lkp-tests
>   https://github.com/gormanm/mmtests

So e.g. mmtests has a *lot* of different tests so it's probably not
realistic for you to run them all. I'd look at bonnie++ (file & dir tests),
dbench, reaim - these are crappy IO benchmarks because they mostly fit into
the page cache, but for your purposes that is exactly what you want in
order to see differences in CPU overhead :).

> > > ----------------------------------------------------------------------
> > > The What ?
> > > 
> > > Aim of this patch serie is to introduce generic page write protection
> > > for any kind of regular page in a process (private anonymous or back
> > > by regular file). This feature already exist, in one form, for private
> > > anonymous page, as part of KSM (Kernel Share Memory).
> > > 
> > > So this patch serie is two fold. First it factors out the page write
> > > protection of KSM into a generic write protection mechanim which KSM
> > > becomes the first user of. Then it add support for regular file back
> > > page memory (regular file or share memory aka shmem). To achieve this
> > > i need to cut the dependency lot of code have on page->mapping so i
> > > can set page->mapping to point to special structure when write
> > > protected.
> > 
> > So I'm interested in this write protection mechanism but I didn't find much
> > about it in the series. How does it work? I can see KSM writeprotects pages
> > in page tables so that works for userspace mappings but what about
> > in-kernel users modifying pages - e.g. pages in page cache carrying
> > filesystem metadata do get modified a lot like this.
> 
> So i only care about page which are mmaped into a process address space.
> At first i only want to intercept CPU write access through mmap of file
> but i also intend to extend write syscall to also "fault" on the write
> protection ie it will call a callback to unprotect the page allowing the
> write protector to take proper action while write syscall is happening.
> 
> I am affraid truely generic write protection for metadata pages is bit
> out of scope of what i am doing. However the mechanism i am proposing
> can be extended for that too. Issue is that all place that want to write
> to those page need to be converted to something where write happens
> between write_begin and write_end section (mmap and CPU pte does give
> this implicitly through page fault, so does write syscall). Basicly
> there is a need to make sure that write and write protection can be
> ordered against one another without complex locking.

I understand metadata pages are not interesting for your use case. However
from the mm point of view these are page cache pages like any other. So maybe
my question should have been: How do we make sure this mechanism will not be
used for pages for which it cannot work?

> > > A write protected page has page->mapping pointing to a structure like
> > > struct rmap_item for KSM. So this structure has a list for each unique
> > > combination:
> > >     struct write_protect {
> > >         struct list_head *mappings; /* write_protect_mapping list */
> > >         ...
> > >     };
> > > 
> > >     struct write_protect_mapping {
> > >         struct list_head list
> > >         struct address_space *mapping;
> > >         unsigned long offset;
> > >         unsigned long private;
> > >         ...
> > >     };
> > 
> > Auch, the fact that we could share a page as data storage for several
> > inode+offset combinations that are not sharing underlying storage just
> > looks viciously twisted ;) But is it really that useful to warrant
> > complications? In particular I'm afraid that filesystems expect consistency
> > between their internal state (attached to page->private) and page state
> > (e.g. page->flags) and when there are multiple internal states attached to
> > the same page this could go easily wrong...
> 
> So at first i want to limit to write protect (not KSM) thus page->flags
> will stay consistent (ie page is only ever associated with a single
> mapping). For KSM yes the page->flags can be problematic, however here
> we can assume that page is clean (and uptodate) and not under write
> back. So problematic flags for KSM:
>   - private (page_has_buffers() or PagePrivate (nfs, metadata, ...))
>   - private_2 (FsCache)
>   - mappedtodisk
>   - swapcache
>   - error
> 
> Idea again would be to PageFlagsWithMapping(page, mapping) so that for
> non KSM write protected page you test the usual page->flags and for
> write protected page you find the flag value using mapping as lookup
> index. Usualy those flag are seldomly changed/accessed. Again the
> overhead (ignoring code size) would only be for page which are KSM.
> So maybe KSM will not make sense because perf overhead it has with
> page->flags access (i don't think so but i haven't tested this).

Yeah, sure, page->flags could be dealt with in a similar way but at this
point I don't think it's worth it. And without page->flags I don't think
abstracting page->private makes much sense - or am I missing something about
why you need page->private to depend on the mapping? So what I wanted to
suggest is that we leave page->private as it is currently and just
concentrate on the page->mapping hacks...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue
  2018-04-19 10:32     ` Jan Kara
@ 2018-04-19 14:52       ` Jerome Glisse
  0 siblings, 0 replies; 50+ messages in thread
From: Jerome Glisse @ 2018-04-19 14:52 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-mm, linux-fsdevel, linux-block, linux-kernel,
	Andrea Arcangeli, Alexander Viro, Tim Chen, Theodore Ts'o,
	Tejun Heo, Josef Bacik, Mel Gorman, Jeff Layton

On Thu, Apr 19, 2018 at 12:32:50PM +0200, Jan Kara wrote:
> On Wed 18-04-18 11:54:30, Jerome Glisse wrote:

[...]

> > I am affraid truely generic write protection for metadata pages is bit
> > out of scope of what i am doing. However the mechanism i am proposing
> > can be extended for that too. Issue is that all place that want to write
> > to those page need to be converted to something where write happens
> > between write_begin and write_end section (mmap and CPU pte does give
> > this implicitly through page fault, so does write syscall). Basicly
> > there is a need to make sure that write and write protection can be
> > ordered against one another without complex locking.
> 
> I understand metadata pages are not interesting for your use case. However
> from mm point of view these are page cache pages as any other. So maybe my
> question should have been: How do we make sure this mechanism will not be
> used for pages for which it cannot work?

Oh that one is easy, the API takes a vma + addr, or rather an mm struct +
addr (ie more or less like KSM today). I will change the wording in v1 to
"almost generic write protection" :) or "process page write protection"
(but this would not work for special pfn/vma so it is not fully generic
there either).
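
A minimal userspace sketch of what i mean (every name here is
hypothetical, this is not the kernel API; lookup_page() stands in for
a follow_page()-style page table walk):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Callers name a page by (mm, address), never by struct page, so
 * kernel-internal pages such as filesystem metadata can never be
 * handed to the mechanism: they map to no (mm, address) pair. */
struct mm_struct;                     /* opaque */
struct page { bool write_protected; };

struct vma_entry {                    /* toy address-space lookup table */
	struct mm_struct *mm;
	unsigned long addr;
	struct page *page;
};

static struct page *lookup_page(struct vma_entry *tbl, size_t n,
				struct mm_struct *mm, unsigned long addr)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].mm == mm && tbl[i].addr == addr)
			return tbl[i].page;
	return NULL;
}

/* wp_protect(): hypothetical entry point; fails unless (mm, addr)
 * resolves to a page mapped in that process. */
static int wp_protect(struct vma_entry *tbl, size_t n,
		      struct mm_struct *mm, unsigned long addr)
{
	struct page *page = lookup_page(tbl, n, mm, addr);

	if (!page)
		return -1;            /* not a process page: refuse */
	page->write_protected = true; /* kernel: also clear pte write bit */
	return 0;
}
```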

> > > > A write protected page has page->mapping pointing to a structure like
> > > > struct rmap_item for KSM. So this structure has a list for each unique
> > > > combination:
> > > >     struct write_protect {
> > > >         struct list_head *mappings; /* write_protect_mapping list */
> > > >         ...
> > > >     };
> > > > 
> > > >     struct write_protect_mapping {
> > > >         struct list_head list
> > > >         struct address_space *mapping;
> > > >         unsigned long offset;
> > > >         unsigned long private;
> > > >         ...
> > > >     };
> > > 
> > > Auch, the fact that we could share a page as data storage for several
> > > inode+offset combinations that are not sharing underlying storage just
> > > looks viciously twisted ;) But is it really that useful to warrant
> > > complications? In particular I'm afraid that filesystems expect consistency
> > > between their internal state (attached to page->private) and page state
> > > (e.g. page->flags) and when there are multiple internal states attached to
> > > the same page this could go easily wrong...
> > 
> > So at first i want to limit to write protect (not KSM) thus page->flags
> > will stay consistent (ie page is only ever associated with a single
> > mapping). For KSM yes the page->flags can be problematic, however here
> > we can assume that page is clean (and uptodate) and not under write
> > back. So problematic flags for KSM:
> >   - private (page_has_buffers() or PagePrivate (nfs, metadata, ...))
> >   - private_2 (FsCache)
> >   - mappedtodisk
> >   - swapcache
> >   - error
> > 
> > Idea again would be to PageFlagsWithMapping(page, mapping) so that for
> > non KSM write protected page you test the usual page->flags and for
> > write protected page you find the flag value using mapping as lookup
> > index. Usualy those flag are seldomly changed/accessed. Again the
> > overhead (ignoring code size) would only be for page which are KSM.
> > So maybe KSM will not make sense because perf overhead it has with
> > page->flags access (i don't think so but i haven't tested this).
> 
> Yeah, sure, page->flags could be dealt with in a similar way but at this
> point I don't think it's worth it. And without page->flags I don't think
> abstracting page->private makes much sense - or am I missing something why
> you need page->private depend on the mapping? So what I wanted to suggest
> is that we leave page->private as is currently and just concentrate on
> page->mapping hacks...

Well i wanted to go all the way to KSM, or at least as close as possible
to KSM, for file backed pages. But i can focus on page->mapping first,
do write protection with that and also do the per page wait queue for
the page lock, which i believe are both nice features. This will also
make the patchset smaller and easier to review (less scary).

KSM can be done on top of that later and i will be happy to help. I
have a bunch of coccinelle patches for page->private and page->index
and i can do some for page->flags.

Cheers,
Jérôme

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue
  2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
                   ` (41 preceding siblings ...)
  2018-04-18 14:13 ` [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue Jan Kara
@ 2018-04-20 19:57 ` Tim Chen
  2018-04-20 22:19   ` Jerome Glisse
  42 siblings, 1 reply; 50+ messages in thread
From: Tim Chen @ 2018-04-20 19:57 UTC (permalink / raw)
  To: jglisse, linux-mm, linux-fsdevel, linux-block
  Cc: linux-kernel, Andrea Arcangeli, Alexander Viro,
	Theodore Ts'o, Tejun Heo, Jan Kara, Josef Bacik, Mel Gorman,
	Jeff Layton

On 04/04/2018 12:17 PM, jglisse@redhat.com wrote:
> From: Jérôme Glisse <jglisse@redhat.com>
> 
> https://cgit.freedesktop.org/~glisse/linux/log/?h=generic-write-protection-rfc
> 
> This is an RFC for LSF/MM discussions. It impacts the file subsystem,
> the block subsystem and the mm subsystem. Hence it would benefit from
> a cross sub-system discussion.
> 
> Patchset is not fully bake so take it with a graint of salt. I use it
> to illustrate the fact that it is doable and now that i did it once i
> believe i have a better and cleaner plan in my head on how to do this.
> I intend to share and discuss it at LSF/MM (i still need to write it
> down). That plan lead to quite different individual steps than this
> patchset takes and his also easier to split up in more manageable
> pieces.
> 
> I also want to apologize for the size and number of patches (and i am
> not even sending them all).
> 
> ----------------------------------------------------------------------
> The Why ?
> 
> I have two objectives: duplicate memory read only accross nodes and or
> devices and work around PCIE atomic limitations. More on each of those
> objective below. I also want to put forward that it can solve the page
> wait list issue ie having each page with its own wait list and thus
> avoiding long wait list traversale latency recently reported [1].
> 
> It does allow KSM for file back pages (truely generic KSM even between
> both anonymous and file back page). I am not sure how useful this can
> be, this was not an objective i did pursue, this is just a for free
> feature (see below).
> 
> [1] https://groups.google.com/forum/#!topic/linux.kernel/Iit1P5BNyX8
> 
> ----------------------------------------------------------------------
> Per page wait list, so long page_waitqueue() !
> 
> Not implemented in this RFC but below is the logic and pseudo code
> at bottom of this email.
> 
> When there is a contention on struct page lock bit, the caller which
> is trying to lock the page will add itself to a waitqueue. The issues
> here is that multiple pages share the same wait queue and on large
> system with a lot of ram this means we can quickly get to a long list
> of waiters for differents pages (or for the same page) on the same
> list [1].

Your approach seems useful if there are lots of locked pages sharing
the same wait queue.

That said, in the original workload from our customer with the long wait queue
problem, there was a single super hot page getting migrated, and it
was being accessed by all threads, which caused the big log jam while they
waited for the migration to complete.
With your approach, we will still likely end up with a long queue
in that workload even with a per page wait queue.
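
To make the shared-queue effect concrete, here is a toy userspace model
of a hashed wait table (the constant and hash are made up for
illustration; the real table in mm/filemap.c is larger and uses the
kernel's pointer hash): many distinct pages collapse onto a few shared
lists, which per page wait queues would fix, while a crowd waiting on
one hot page stays long either way.

```c
#include <assert.h>
#include <stdint.h>

/* Every page hashes into one of a fixed number of shared wait queues,
 * so waiters for unrelated pages can land on the same list. */
#define WAIT_TABLE_BITS 4                 /* deliberately tiny */
#define WAIT_TABLE_SIZE (1u << WAIT_TABLE_BITS)

static unsigned int queue_len[WAIT_TABLE_SIZE];

static unsigned int page_waitqueue_idx(const void *page)
{
	/* stand-in for a real pointer hash (Knuth multiplicative) */
	return (unsigned int)((((uintptr_t)page >> 6) * 2654435761u)
			      % WAIT_TABLE_SIZE);
}

static void wait_on_page(const void *page)
{
	queue_len[page_waitqueue_idx(page)]++;   /* join the shared list */
}
```

With 4096 pages and 16 queues, pigeonhole alone guarantees some queue
holds at least 256 waiters even though every waiter waits on a
different page.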

Thanks.

Tim

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue
  2018-04-20 19:57 ` Tim Chen
@ 2018-04-20 22:19   ` Jerome Glisse
  2018-04-20 23:48     ` Tim Chen
  0 siblings, 1 reply; 50+ messages in thread
From: Jerome Glisse @ 2018-04-20 22:19 UTC (permalink / raw)
  To: Tim Chen
  Cc: linux-mm, linux-fsdevel, linux-block, linux-kernel,
	Andrea Arcangeli, Michal Hocko, Alexander Viro,
	Theodore Ts'o, Tejun Heo, Jan Kara, Josef Bacik, Mel Gorman,
	Jeff Layton

On Fri, Apr 20, 2018 at 12:57:41PM -0700, Tim Chen wrote:
> On 04/04/2018 12:17 PM, jglisse@redhat.com wrote:
> > From: Jérôme Glisse <jglisse@redhat.com>
> > 
> > https://cgit.freedesktop.org/~glisse/linux/log/?h=generic-write-protection-rfc
> > 
> > This is an RFC for LSF/MM discussions. It impacts the file subsystem,
> > the block subsystem and the mm subsystem. Hence it would benefit from
> > a cross sub-system discussion.
> > 
> > Patchset is not fully bake so take it with a graint of salt. I use it
> > to illustrate the fact that it is doable and now that i did it once i
> > believe i have a better and cleaner plan in my head on how to do this.
> > I intend to share and discuss it at LSF/MM (i still need to write it
> > down). That plan lead to quite different individual steps than this
> > patchset takes and his also easier to split up in more manageable
> > pieces.
> > 
> > I also want to apologize for the size and number of patches (and i am
> > not even sending them all).
> > 
> > ----------------------------------------------------------------------
> > The Why ?
> > 
> > I have two objectives: duplicate memory read only accross nodes and or
> > devices and work around PCIE atomic limitations. More on each of those
> > objective below. I also want to put forward that it can solve the page
> > wait list issue ie having each page with its own wait list and thus
> > avoiding long wait list traversale latency recently reported [1].
> > 
> > It does allow KSM for file back pages (truely generic KSM even between
> > both anonymous and file back page). I am not sure how useful this can
> > be, this was not an objective i did pursue, this is just a for free
> > feature (see below).
> > 
> > [1] https://groups.google.com/forum/#!topic/linux.kernel/Iit1P5BNyX8
> > 
> > ----------------------------------------------------------------------
> > Per page wait list, so long page_waitqueue() !
> > 
> > Not implemented in this RFC but below is the logic and pseudo code
> > at bottom of this email.
> > 
> > When there is a contention on struct page lock bit, the caller which
> > is trying to lock the page will add itself to a waitqueue. The issues
> > here is that multiple pages share the same wait queue and on large
> > system with a lot of ram this means we can quickly get to a long list
> > of waiters for differents pages (or for the same page) on the same
> > list [1].
> 
> Your approach seems useful if there are lots of locked pages sharing
> the same wait queue.  
> 
> That said, in the original workload from our customer with the long wait queue
> problem, there was a single super hot page getting migrated, and it
> is being accessed by all threads which caused the big log jam while they wait for
> the migration to get completed.  
> With your approach, we will still likely end up with a long queue 
> in that workload even if we have per page wait queue.
> 
> Thanks.

Ok so i re-read the thread, i was writing this cover letter from memory
and i had a bad recollection of your issue, so sorry.

First, do you have a way to reproduce the issue ? Something easy would
be nice :)

So what i am proposing for the per page wait queue would only marginally
help you (it might not even be measurable in your workload). It would
certainly make the code smaller and easier to understand, i believe.

Now that i have looked back at your issue i think there are 2 things we
should do. First, keep the page being migrated mapped read only; this
would at least avoid CPU read faults. In the trace you captured i wasn't
able to ascertain whether these were read or write faults.

The second idea i have is about NUMA: every time we NUMA migrate a page
we could attach a temporary struct to the page (using page->mapping).
Then if we scan that page again we can inspect information about its
previous migrations and see whether we are over migrating that page
(ie bouncing it all over). If so we can mark the page (maybe with a
page flag if we can find one) to protect it from further migration.
That temporary struct would be removed after a while, ie autonuma would
preallocate a bunch of those, keep them on an LRU, and recycle the
oldest when it needs a new one to migrate another page.
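
A toy userspace sketch of that recycling scheme (all names and numbers
are made up for illustration; a real version would hang the struct off
page->mapping and use a proper LRU rather than this round-robin pool):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE    4     /* preallocated history records */
#define BOUNCE_LIMIT 3     /* migrations before the page is pinned */

struct numa_hist;
struct page { struct numa_hist *hist; bool no_migrate; };

struct numa_hist {
	struct page *page;          /* owner, so recycling can detach it */
	unsigned int migrations;
};

static struct numa_hist pool[POOL_SIZE];
static unsigned int next_victim;    /* round-robin stands in for LRU */

static struct numa_hist *hist_alloc(struct page *page)
{
	struct numa_hist *h = &pool[next_victim];

	next_victim = (next_victim + 1) % POOL_SIZE;
	if (h->page)
		h->page->hist = NULL;   /* steal the oldest record */
	h->page = page;
	h->migrations = 0;
	page->hist = h;
	return h;
}

/* Called on every NUMA migration decision for @page; returns true if
 * the migration should proceed. */
static bool numa_migrate_allowed(struct page *page)
{
	struct numa_hist *h;

	if (page->no_migrate)
		return false;
	h = page->hist ? page->hist : hist_alloc(page);
	if (++h->migrations >= BOUNCE_LIMIT) {
		page->no_migrate = true;    /* page is bouncing: pin it */
		return false;
	}
	return true;
}
```

A page that loses its record (because the pool recycled it) simply
starts with a fresh history on its next migration, which bounds the
memory cost to the preallocated pool.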


LSF/MM slots:

Michal, can i get 2 slots to talk about this ? MM only discussion: one
to talk about doing migration with the page mapped read only (write
protected) while migration is happening, the other to talk about
attaching an auto NUMA tracking struct to the page.

Cheers,
Jérôme

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue
  2018-04-20 22:19   ` Jerome Glisse
@ 2018-04-20 23:48     ` Tim Chen
  0 siblings, 0 replies; 50+ messages in thread
From: Tim Chen @ 2018-04-20 23:48 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: linux-mm, linux-fsdevel, linux-block, linux-kernel,
	Andrea Arcangeli, Michal Hocko, Alexander Viro,
	Theodore Ts'o, Tejun Heo, Jan Kara, Josef Bacik, Mel Gorman,
	Jeff Layton

On 04/20/2018 03:19 PM, Jerome Glisse wrote:
> On Fri, Apr 20, 2018 at 12:57:41PM -0700, Tim Chen wrote:
>> On 04/04/2018 12:17 PM, jglisse@redhat.com wrote:
>>
>>
>> Your approach seems useful if there are lots of locked pages sharing
>> the same wait queue.  
>>
>> That said, in the original workload from our customer with the long wait queue
>> problem, there was a single super-hot page being migrated, and it
>> was being accessed by all threads, which caused the big logjam while
>> they waited for the migration to complete.
>> With your approach, we will still likely end up with a long queue
>> in that workload even with a per-page wait queue.
>>
>> Thanks.
> 
> Ok so I re-read the thread; I was writing this cover letter from memory
> and had a bad recollection of your issue, sorry about that.
> 
> First, do you have a way to reproduce the issue? Something easy would
> be nice :)

Unfortunately it is a customer workload that they guard closely and
won't let us look at the source code.  We had to profile and backtrace
its behavior.

Mel made a quick attempt to reproduce the behavior with a hot-page
migration, but he wasn't quite able to duplicate the pathological
behavior.

> 
> So what I am proposing for per-page wait queues would only marginally
> help you (it might not even be measurable in your workload). It would
> certainly make the code smaller and easier to understand, I believe.

In cases where lots of pages share a page wait queue, your solution
would help, and we wouldn't be wasting time checking waiters that are
not waiting on the page being unlocked.  Though I don't have a
specific workload that has such behavior.

> 
> Now that I have looked back at your issue I think there are two things
> we should do. First, keep the page mapped read-only during migration;
> this would at least avoid CPU read faults. In the trace you captured I
> wasn't able to ascertain whether these were read or write faults.
> 
> The second idea I have is about NUMA: every time we NUMA-migrate a page
> we could attach a temporary struct to the page (using page->mapping).
> Then if we scan that page again we can inspect information about
> previous migrations and see whether we are over-migrating that page
> (i.e. bouncing it all over). If so we can mark the page (maybe with a
> page flag, if we can find one) to protect it from further migration.
> That temporary struct would be removed after a while, i.e. autonuma
> would preallocate a bunch of them, keep an LRU of them, and recycle the
> oldest when it needs a new one to migrate another page.

The goal of migrating a hot page with care, or avoiding bouncing it
around frequently, makes sense.  If it is a hot page shared by many
threads running on different NUMA nodes, and moving it would only
mildly improve NUMA locality, we should avoid the migration.

Tim

> 
> 
> LSF/MM slots:
> 
> Michal, can I get 2 slots to talk about this? MM-only discussion: one
> to talk about keeping the page mapped read-only (write protected)
> while migration is happening, the other to talk about attaching an
> auto-NUMA tracking struct to the page.
> 
> Cheers,
> Jérôme
> 



Thread overview: 50+ messages
2018-04-04 19:17 [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue jglisse
2018-04-04 19:17 ` [RFC PATCH 04/79] pipe: add inode field to struct pipe_inode_info jglisse
2018-04-04 19:17 ` [RFC PATCH 05/79] mm/swap: add an helper to get address_space from swap_entry_t jglisse
2018-04-04 19:17 ` [RFC PATCH 06/79] mm/page: add helpers to dereference struct page index field jglisse
2018-04-04 19:17 ` [RFC PATCH 07/79] mm/page: add helpers to find mapping give a page and buffer head jglisse
2018-04-04 19:17 ` [RFC PATCH 08/79] mm/page: add helpers to find page mapping and private given a bio jglisse
2018-04-04 19:17 ` [RFC PATCH 09/79] fs: add struct address_space to read_cache_page() callback argument jglisse
2018-04-04 19:17 ` [RFC PATCH 20/79] fs: add struct address_space to write_cache_pages() " jglisse
2018-04-04 19:17 ` [RFC PATCH 22/79] fs: add struct inode to block_read_full_page() arguments jglisse
2018-04-04 19:17 ` [RFC PATCH 24/79] fs: add struct inode to nobh_writepage() arguments jglisse
2018-04-04 19:18 ` [RFC PATCH 26/79] fs: add struct address_space to mpage_readpage() arguments jglisse
2018-04-04 19:18 ` [RFC PATCH 27/79] fs: add struct address_space to fscache_read*() callback arguments jglisse
2018-04-04 19:18 ` [RFC PATCH 28/79] fs: introduce page_is_truncated() helper jglisse
2018-04-04 19:18 ` [RFC PATCH 29/79] fs/block: add struct address_space to bdev_write_page() arguments jglisse
2018-04-04 19:18 ` [RFC PATCH 30/79] fs/block: add struct address_space to __block_write_begin() arguments jglisse
2018-04-04 19:18 ` [RFC PATCH 31/79] fs/block: add struct address_space to __block_write_begin_int() args jglisse
2018-04-04 19:18 ` [RFC PATCH 32/79] fs/block: do not rely on page->mapping get it from the context jglisse
2018-04-04 19:18 ` [RFC PATCH 33/79] fs/journal: add struct super_block to jbd2_journal_forget() arguments jglisse
2018-04-04 19:18 ` [RFC PATCH 34/79] fs/journal: add struct inode to jbd2_journal_revoke() arguments jglisse
2018-04-04 19:18 ` [RFC PATCH 35/79] fs/buffer: add struct address_space and struct page to end_io callback jglisse
2018-04-04 19:18 ` [RFC PATCH 36/79] fs/buffer: add struct super_block to bforget() arguments jglisse
2018-04-04 19:18 ` [RFC PATCH 37/79] fs/buffer: add struct super_block to __bforget() arguments jglisse
2018-04-04 19:18 ` [RFC PATCH 38/79] fs/buffer: add first buffer flag for first buffer_head in a page jglisse
2018-04-04 19:18 ` [RFC PATCH 39/79] fs/buffer: add struct address_space to clean_page_buffers() arguments jglisse
2018-04-04 19:18 ` [RFC PATCH 50/79] fs: stop relying on mapping field of struct page, get it from context jglisse
2018-04-04 19:18 ` [RFC PATCH 51/79] " jglisse
2018-04-04 19:18 ` [RFC PATCH 52/79] fs/buffer: use _page_has_buffers() instead of page_has_buffers() jglisse
2018-04-04 19:18 ` [RFC PATCH 63/79] mm/page: convert page's index lookup to be against specific mapping jglisse
2018-04-04 19:18 ` [RFC PATCH 64/79] mm/buffer: use _page_has_buffers() instead of page_has_buffers() jglisse
2018-04-04 19:18 ` [RFC PATCH 65/79] mm/swap: add struct swap_info_struct swap_readpage() arguments jglisse
2018-04-04 19:18 ` [RFC PATCH 68/79] mm/vma_address: convert page's index lookup to be against specific mapping jglisse
2018-04-04 19:18 ` [RFC PATCH 69/79] fs/journal: add struct address_space to jbd2_journal_try_to_free_buffers() arguments jglisse
2018-04-04 19:18 ` [RFC PATCH 70/79] mm: add struct address_space to mark_buffer_dirty() jglisse
2018-04-04 19:18 ` [RFC PATCH 71/79] mm: add struct address_space to set_page_dirty() jglisse
2018-04-04 19:18 ` [RFC PATCH 72/79] mm: add struct address_space to set_page_dirty_lock() jglisse
2018-04-04 19:18 ` [RFC PATCH 73/79] mm: pass down struct address_space to set_page_dirty() jglisse
2018-04-04 19:18 ` [RFC PATCH 74/79] mm/page_ronly: add config option for generic read only page framework jglisse
2018-04-04 19:18 ` [RFC PATCH 75/79] mm/page_ronly: add page read only core structure and helpers jglisse
2018-04-04 19:18 ` [RFC PATCH 76/79] mm/ksm: have ksm select PAGE_RONLY config jglisse
2018-04-04 19:18 ` [RFC PATCH 77/79] mm/ksm: hide set_page_stable_node() and page_stable_node() jglisse
2018-04-04 19:18 ` [RFC PATCH 78/79] mm/ksm: rename PAGE_MAPPING_KSM to PAGE_MAPPING_RONLY jglisse
2018-04-04 19:18 ` [RFC PATCH 79/79] mm/ksm: set page->mapping to page_ronly struct instead of stable_node jglisse
2018-04-18 14:13 ` [RFC PATCH 00/79] Generic page write protection and a solution to page waitqueue Jan Kara
2018-04-18 15:54   ` Jerome Glisse
2018-04-18 16:20     ` Darrick J. Wong
2018-04-19 10:32     ` Jan Kara
2018-04-19 14:52       ` Jerome Glisse
2018-04-20 19:57 ` Tim Chen
2018-04-20 22:19   ` Jerome Glisse
2018-04-20 23:48     ` Tim Chen
