From: Dave Chinner <david@fromorbit.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>,
Andrew Morton <akpm@linux-foundation.org>,
Ingo Molnar <mingo@kernel.org>, Matt B <jackdachef@gmail.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>,
xfs@oss.sgi.com
Subject: Re: [regression v4.0-rc1] mm: IPIs from TLB flushes causing significant performance degradation.
Date: Tue, 3 Mar 2015 16:20:04 +1100
Message-ID: <20150303052004.GM18360@dastard>
In-Reply-To: <CA+55aFw+fb=Fh4M2wA4dVskgqN7PhZRGZS6JTMx4Rb1Qn++oaA@mail.gmail.com>
On Mon, Mar 02, 2015 at 06:37:47PM -0800, Linus Torvalds wrote:
> On Mon, Mar 2, 2015 at 6:22 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> There might be some other case where the new "just change the
> protection" doesn't do the "oh, but if the protection didn't change,
> don't bother flushing". I don't see it.
>
> Hmm. I wonder.. In change_pte_range(), we just unconditionally change
> the protection bits.
>
> But the old numa code used to do
>
>         if (!pte_numa(oldpte)) {
>                 ptep_set_numa(mm, addr, pte);
>
> so it would actually avoid the pte update if a numa-prot page was
> marked numa-prot again.
>
> But are those migrate-page calls really common enough to make these
> things happen often enough on the same pages for this all to matter?
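To spell out the two behaviours being compared, here is a simplified,
from-memory sketch of change_pte_range() (not the exact kernel source;
names are approximate):

    /* 3.19-era change_pte_range(), NUMA hinting case: skip ptes that
     * are already marked numa, so re-marking an already-protected
     * page does no pte write and hence needs no TLB flush. */
    if (prot_numa) {
            if (!pte_numa(oldpte)) {
                    ptep_set_numa(mm, addr, pte);
                    pages++;
            }
    }

    /* 4.0-rc1 change_pte_range(): the protection change is applied
     * unconditionally, even when the pte already carries the new
     * protection, so every re-mark writes the pte. */
    ptent = ptep_modify_prot_start(mm, addr, pte);
    ptent = pte_modify(ptent, newprot);
    ptep_modify_prot_commit(mm, addr, pte, ptent);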
It's looking like that's a possibility. I am running a fake-numa=4
config on this test VM, so it's got 4 nodes of 4p/4GB RAM each.
Both kernels are running through the same page fault path, and that
path goes straight through migrate_pages().
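(For reference, that topology comes from x86 NUMA emulation; booting
the guest with something like

    numa=fake=4

on the kernel command line carves the VM into 4 emulated nodes. The
exact command line used here isn't shown, so treat it as an
assumption.)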
3.19:

     13.70%  0.01%  [kernel]  [k] native_flush_tlb_others
   - native_flush_tlb_others
      - 98.58% flush_tlb_page
           ptep_clear_flush
           try_to_unmap_one
           rmap_walk
           try_to_unmap
           migrate_pages
           migrate_misplaced_page
         - handle_mm_fault
            - 96.88% __do_page_fault
                 trace_do_page_fault
                 do_async_page_fault
               + async_page_fault
            + 3.12% __get_user_pages
      + 1.40% flush_tlb_mm_range
4.0-rc1:

   - 67.12%  0.04%  [kernel]  [k] native_flush_tlb_others
   - native_flush_tlb_others
      - 99.80% flush_tlb_page
           ptep_clear_flush
           try_to_unmap_one
           rmap_walk
           try_to_unmap
           migrate_pages
           migrate_misplaced_page
         - handle_mm_fault
            - 99.50% __do_page_fault
                 trace_do_page_fault
                 do_async_page_fault
               - async_page_fault
Same call chain, just a lot more CPU used further down the stack.
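(The call chains above are the shape of output produced by something
like

    perf record -g -a -- sleep 10
    perf report --stdio

though the exact invocation isn't shown here, so treat those commands
as an assumption.)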
> Odd.
>
> So it would be good if your profiles just show "there's suddenly a
> *lot* more calls to flush_tlb_page() from XYZ" and the culprit is
> obvious that way..
Ok, I did a simple 'perf stat -e tlb:tlb_flush -a -r 6 sleep 10' to
count all the tlb flush events from the kernel. I then pulled the
full events for a 30s period to get a sampling of the reason
associated with each flush event.
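(A plausible way to "pull the full events", since the exact commands
aren't shown: record the raw tracepoint hits and dump them, e.g.

    perf record -e tlb:tlb_flush -a -- sleep 30
    perf script

then tally the reason field from the perf script output.)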
4.0-rc1:

 Performance counter stats for 'system wide' (6 runs):

         2,190,503 tlb:tlb_flush              ( +- 8.30% )

      10.001970663 seconds time elapsed       ( +- 0.00% )
The reason breakdown:
81% TLB_REMOTE_SHOOTDOWN
19% TLB_FLUSH_ON_TASK_SWITCH
3.19:

 Performance counter stats for 'system wide' (6 runs):

           467,151 tlb:tlb_flush              ( +- 25.50% )

      10.002021491 seconds time elapsed       ( +- 0.00% )
The reason breakdown:
6% TLB_REMOTE_SHOOTDOWN
94% TLB_FLUSH_ON_TASK_SWITCH
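(The reason names come from the kernel's tlb_flush tracepoint. Quoted
from memory, the set of reasons around this time, in
include/linux/mm_types.h, was:

    enum tlb_flush_reason {
            TLB_FLUSH_ON_TASK_SWITCH,
            TLB_REMOTE_SHOOTDOWN,
            TLB_LOCAL_SHOOTDOWN,
            TLB_LOCAL_MM_SHOOTDOWN,
            NR_TLB_FLUSH_REASONS,
    };

Check the tree before relying on the exact list.)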
The difference would appear to be the number of remote TLB
shootdowns that are occurring from otherwise identical page fault
paths.
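For context on why the remote case is so much more expensive: a
remote shootdown means flush_tlb_page() found other CPUs in the mm's
cpumask and had to IPI them. Roughly, as a simplified sketch of the
x86 path for this era (not the exact source):

    void flush_tlb_page(struct vm_area_struct *vma, unsigned long start)
    {
            struct mm_struct *mm = vma->vm_mm;

            preempt_disable();
            /* Invalidate the local TLB entry for this address... */
            if (current->active_mm == mm)
                    __flush_tlb_one(start);
            /* ...then IPI every other CPU that may be caching this mm.
             * This is where the native_flush_tlb_others() time in the
             * profiles above is spent. */
            if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids)
                    flush_tlb_others(mm_cpumask(mm), mm, start,
                                     start + PAGE_SIZE);
            preempt_enable();
    }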
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com