From: Jerome Glisse <jglisse@redhat.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Yang Shi <yang.shi@linux.alibaba.com>,
	Michal Hocko <mhocko@kernel.org>,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	Laurent Dufour <ldufour@linux.vnet.ibm.com>
Subject: Re: [RFC PATCH 1/8] mm: mmap: unmap large mapping by section
Date: Sat, 24 Mar 2018 14:24:00 -0400
Message-ID: <20180324182359.GB4928@redhat.com>
In-Reply-To: <20180321172932.GE4780@bombadil.infradead.org>

On Wed, Mar 21, 2018 at 10:29:32AM -0700, Matthew Wilcox wrote:
> On Wed, Mar 21, 2018 at 09:31:22AM -0700, Yang Shi wrote:
> > On 3/21/18 6:08 AM, Michal Hocko wrote:
> > > Yes, this definitely sucks. One way to work that around is to split the
> > > unmap to two phases. One to drop all the pages. That would only need
> > > mmap_sem for read and then tear down the mapping with the mmap_sem for
> > > write. This wouldn't help for parallel mmap_sem writers but those really
> > > need a different approach (e.g. the range locking).
> > 
> > page fault might sneak in to map a page which has been unmapped before?
> > 
> > range locking should help a lot on manipulating small sections of a large
> > mapping in parallel or multiple small mappings. It may not achieve too much
> > for single large mapping.
> 
> I don't think we need range locking.  What if we do munmap this way:
> 
> Take the mmap_sem for write
> Find the VMA
>   If the VMA is large(*)
>     Mark the VMA as deleted
>     Drop the mmap_sem
>     zap all of the entries
>     Take the mmap_sem
>   Else
>     zap all of the entries
> Continue finding VMAs
> Drop the mmap_sem
> 
> Now we need to change everywhere which looks up a VMA to see if it needs
> to care that the VMA is deleted (page faults, e.g., will need to SIGBUS; mmap
> does not care; munmap will need to wait for the existing munmap operation
> to complete), but it gives us the atomicity, at least on a per-VMA basis.
> 

What about something that should fix all issues:
    struct list_head to_free_puds;
    ...
    down_write(&mm->mmap_sem);
    ...
    /* Detach fully covered PUDs onto to_free_puds instead of freeing them */
    unmap_vmas(&tlb, vma, start, end, &to_free_puds);
    arch_unmap(mm, vma, start, end);
    /* Fix up all other VM information */
    remove_vma_list(mm, vma);
    ...
    up_write(&mm->mmap_sem);
    ...
    /* Lock dropped: free the collected PUDs, then account for them */
    zap_pud_list(rss_update_info, to_free_puds);
    update_rss(rss_update_info);

We collect the PUDs that need to be freed/zapped: under the write sem
and the CPU page table lock, we update each page table PUD entry to
pud_none and add the PUD to the list of those that need zapping. We only
collect PUDs that are fully covered by the range; partially covered PUDs
are handled as usual.
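
To make "fully covered" concrete, here is a minimal userspace sketch of
how an unmap range splits around PUD boundaries. It is not kernel code:
struct pud_split and split_range_by_pud() are hypothetical names, and
the 1 GiB PUD size is an assumption that matches x86-64 with 4 KiB pages.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one PUD entry maps 1 GiB (x86-64, 4 KiB base pages). */
#define PUD_SHIFT 30
#define PUD_SIZE  (UINT64_C(1) << PUD_SHIFT)
#define PUD_MASK  (~(PUD_SIZE - 1))

/* Hypothetical result: the part of [start, end) made of whole PUDs. */
struct pud_split {
	uint64_t full_start;	/* first address covered by whole PUDs */
	uint64_t full_end;	/* one past the last such address */
	uint64_t nr_full;	/* number of fully covered PUD entries */
};

static struct pud_split split_range_by_pud(uint64_t start, uint64_t end)
{
	struct pud_split s = { 0, 0, 0 };
	uint64_t first, last;

	assert(start <= end);

	/* Rounding start up would overflow: no whole PUD can fit. */
	if (start > UINT64_MAX - (PUD_SIZE - 1))
		return s;

	first = (start + PUD_SIZE - 1) & PUD_MASK;	/* round start up */
	last = end & PUD_MASK;				/* round end down */

	if (first < last) {
		s.full_start = first;
		s.full_end = last;
		s.nr_full = (last - first) >> PUD_SHIFT;
	}
	return s;
}
```

Only [full_start, full_end) would go on the list; the head and tail
outside that window are the partially covered PUDs that take the usual
path.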

Everything behaves as it does today, except that we do not free the
memory yet. Care must be taken with the anon vma, and we should probably
not free the vma struct before the zap either, but all other mm
structures can be updated. The rss_counter would also need to be updated
after the PUDs are zapped.

We would need special code to zap the PUD list, but there is no need to
take locks or do special arch tlb flushing, ptep_get_clear, ... when
walking down those PUDs. So it should scale a lot better too.
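
The overall shape (detach under the lock, free and account afterwards)
can be modelled in userspace roughly as below. This is only a sketch
under stated assumptions: struct toy_mm, struct detached_pud and the
three helpers are hypothetical, and a plain flag stands in for mmap_sem
so the "no lock held while freeing" property can be asserted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a detached PUD and everything below it. */
struct detached_pud {
	struct detached_pud *next;
	long nr_pages;		/* pages still accounted under this PUD */
};

/* Hypothetical stand-in for the mm; write_locked models mmap_sem. */
struct toy_mm {
	int write_locked;
	long rss;
	struct detached_pud *live;	/* PUDs reachable from the page table */
};

/* Populate the toy mm with one PUD worth of pages. */
static int add_pud(struct toy_mm *mm, long nr_pages)
{
	struct detached_pud *p = malloc(sizeof(*p));

	if (!p)
		return -1;
	p->nr_pages = nr_pages;
	p->next = mm->live;
	mm->live = p;
	mm->rss += nr_pages;
	return 0;
}

/* Phase 1: under the write lock, only unhook the entries. */
static struct detached_pud *detach_all(struct toy_mm *mm)
{
	struct detached_pud *list;

	mm->write_locked = 1;	/* models down_write(&mm->mmap_sem) */
	list = mm->live;
	mm->live = NULL;	/* models setting the entries to pud_none */
	mm->write_locked = 0;	/* models up_write(&mm->mmap_sem) */
	return list;
}

/* Phase 2: lock dropped; free the list, then do the deferred rss update. */
static long zap_detached(struct toy_mm *mm, struct detached_pud *list)
{
	long freed = 0;

	assert(!mm->write_locked);
	while (list) {
		struct detached_pud *next = list->next;

		freed += list->nr_pages;
		free(list);
		list = next;
	}
	mm->rss -= freed;
	return freed;
}
```

Between detach_all() and zap_detached() the rss value is stale, which is
the same window in which the rss_counter would lag in the proposal above.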

Did I miss something?

Cheers,
Jérôme
