LKML Archive on lore.kernel.org
From: Peter Xu <peterx@redhat.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Alistair Popple <apopple@nvidia.com>,
	Tiberiu Georgescu <tiberiu.georgescu@nutanix.com>,
	ivan.teterevkov@nutanix.com,
	Mike Rapoport <rppt@linux.vnet.ibm.com>,
	Hugh Dickins <hughd@google.com>,
	peterx@redhat.com,
	Matthew Wilcox <willy@infradead.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	David Hildenbrand <david@redhat.com>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mike Kravetz <mike.kravetz@oracle.com>
Subject: [PATCH RFC 3/4] mm: Handle PTE_MARKER page faults
Date: Fri, 6 Aug 2021 23:25:20 -0400
Message-ID: <20210807032521.7591-4-peterx@redhat.com>
In-Reply-To: <20210807032521.7591-1-peterx@redhat.com>

handle_pte_marker() is the function that parses and handles all pte
marker faults.

For the PAGEOUT marker, it is as simple as dropping the pte and handling
the fault just like a none pte.

The alternative would be to clear the pte to a none pte and retry the
fault; however, that would be slower than handling it right away.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 mm/memory.c | 41 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index 7288f585544a..47f8ca064459 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -98,6 +98,8 @@ struct page *mem_map;
 EXPORT_SYMBOL(mem_map);
 #endif
 
+static vm_fault_t do_fault(struct vm_fault *vmf);
+
 /*
  * A number of key systems in x86 including ioremap() rely on the assumption
  * that high_memory defines the upper bound on direct map memory, then end
@@ -1394,6 +1396,10 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 
 				put_page(page);
 				continue;
+			} else if (is_pte_marker_entry(entry)) {
+				/* Drop PTE_MARKER_PAGEOUT when zapped */
+				pte_clear_not_present_full(mm, addr, pte, tlb->fullmm);
+				continue;
 			}
 
 			/* If details->check_mapping, we leave swap entries. */
@@ -3467,6 +3473,39 @@ static vm_fault_t remove_device_exclusive_entry(struct vm_fault *vmf)
 	return 0;
 }
 
+/*
+ * This function parses PTE markers and handle the faults.  Returns true if we
+ * finished the fault, and we should have put the return value into "*ret".
+ * Otherwise it means we want to continue the swap path, and "*ret" untouched.
+ */
+static vm_fault_t handle_pte_marker(struct vm_fault *vmf)
+{
+	swp_entry_t entry = pte_to_swp_entry(vmf->orig_pte);
+	unsigned long marker;
+
+	marker = pte_marker_get(entry);
+
+	/*
+	 * PTE markers should always be with file-backed memories, and the
+	 * marker should never be empty.  If anything weird happened, the best
+	 * thing to do is to kill the process along with its mm.
+	 */
+	if (WARN_ON_ONCE(vma_is_anonymous(vmf->vma) || !marker))
+		return VM_FAULT_SIGBUS;
+
+#ifdef CONFIG_PTE_MARKER_PAGEOUT
+	if (marker == PTE_MARKER_PAGEOUT)
+		/*
+		 * This pte is previously zapped for swap, the PAGEOUT is only
+		 * a flag before it's accessed again.  Safe to drop it now.
+		 */
+		return do_fault(vmf);
+#endif
+
+	/* We see some marker that we can't handle */
+	return VM_FAULT_SIGBUS;
+}
+
 /*
  * We enter with non-exclusive mmap_lock (to exclude vma changes,
  * but allow concurrent faults), and pte mapped but not yet locked.
@@ -3503,6 +3542,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 			ret = vmf->page->pgmap->ops->migrate_to_ram(vmf);
 		} else if (is_hwpoison_entry(entry)) {
 			ret = VM_FAULT_HWPOISON;
+		} else if (is_pte_marker_entry(entry)) {
+			ret = handle_pte_marker(vmf);
 		} else {
 			print_bad_pte(vma, vmf->address, vmf->orig_pte, NULL);
 			ret = VM_FAULT_SIGBUS;
-- 
2.32.0
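
To illustrate the dispatch that the commit message describes, here is a minimal userspace sketch in C. It is only a toy model, not kernel code: struct model_fault, model_do_fault(), model_handle_marker(), the MODEL_FAULT_* results and the MODEL_MARKER_PAGEOUT bit value are all hypothetical names invented for this illustration. The sketch assumes that handling a PAGEOUT marker amounts to treating the entry like a none pte and running the normal fault path, and that anonymous memory, an empty marker, or an unrecognized marker all end in SIGBUS, as in handle_pte_marker() in the patch.

```c
/* Userspace toy model; none of these names exist in the kernel. */

enum model_fault_result {
	MODEL_FAULT_OK = 0,
	MODEL_FAULT_SIGBUS = 1,
};

/* Hypothetical marker bit, standing in for PTE_MARKER_PAGEOUT. */
#define MODEL_MARKER_PAGEOUT 0x1UL

struct model_fault {
	int vma_is_anonymous;	/* nonzero if the fault hit anonymous memory */
	unsigned long marker;	/* marker bits decoded from the faulting entry */
	int pte_populated;	/* set once the normal fault path installed a page */
};

/*
 * Stand-in for the regular fault path: the marker is dropped and the
 * entry is populated as if it had been a none pte.
 */
enum model_fault_result model_do_fault(struct model_fault *f)
{
	f->marker = 0;
	f->pte_populated = 1;
	return MODEL_FAULT_OK;
}

/* Mirrors the decision order of handle_pte_marker() in the patch. */
enum model_fault_result model_handle_marker(struct model_fault *f)
{
	/* Markers are only expected on file-backed memory and never empty. */
	if (f->vma_is_anonymous || !f->marker)
		return MODEL_FAULT_SIGBUS;

	/* PAGEOUT is only a flag: drop it and fault as for a none pte. */
	if (f->marker == MODEL_MARKER_PAGEOUT)
		return model_do_fault(f);

	/* Any other marker is one this model cannot handle. */
	return MODEL_FAULT_SIGBUS;
}
```

In this model, calling model_handle_marker() on a file-backed fault whose marker equals MODEL_MARKER_PAGEOUT clears the marker and reports success in the same pass, which corresponds to the "handle it right away" choice in the commit message rather than clearing the pte and retrying the fault.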
Thread overview: 21+ messages
2021-08-07  3:25 [PATCH RFC 0/4] mm: Enable PM_SWAP for shmem with PTE_MARKER Peter Xu
2021-08-07  3:25 ` [PATCH RFC 1/4] mm: Introduce PTE_MARKER swap entry Peter Xu
2021-08-07  3:25 ` [PATCH RFC 2/4] mm: Check against orig_pte for finish_fault() Peter Xu
2021-08-07  3:25 ` Peter Xu [this message]
2021-08-07  3:25 ` [PATCH RFC 4/4] mm: Install marker pte when page out for shmem pages Peter Xu
2021-08-13 15:18 ` Tiberiu Georgescu
2021-08-13 16:01 ` Peter Xu
2021-08-18 18:02 ` Tiberiu Georgescu
2021-08-17  9:04 ` [PATCH RFC 0/4] mm: Enable PM_SWAP for shmem with PTE_MARKER David Hildenbrand
2021-08-17 17:09 ` Peter Xu
2021-08-17 18:46 ` David Hildenbrand
2021-08-17 20:24 ` Peter Xu
2021-08-18  8:24 ` David Hildenbrand
2021-08-18 17:52 ` Tiberiu Georgescu
2021-08-18 18:13 ` David Hildenbrand
2021-08-19 14:54 ` Tiberiu Georgescu
2021-08-19 17:26 ` David Hildenbrand
2021-08-20 16:49 ` Tiberiu Georgescu
2021-08-20 19:12 ` Peter Xu
2021-08-25 13:40 ` Tiberiu Georgescu
2021-08-25 14:59 ` Peter Xu