LKML Archive on lore.kernel.org
From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Hugh Dickins <hugh@veritas.com>,
Mike Stroyan <mike.stroyan@hp.com>,
"Luck, Tony" <tony.luck@intel.com>,
linux-ia64@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Fw: [PATCH] ia64: race flushing icache in do_no_page path
Date: Thu, 26 Apr 2007 17:53:49 +1000
Message-ID: <46305A8D.2080003@yahoo.com.au>
In-Reply-To: <20070425205548.fd51b301.akpm@linux-foundation.org>

Hi,

I had a couple of questions which I'm hoping someone would be kind
enough to explain :)

Andrew Morton wrote:
> guys, application crashes on million-dollar machines aren't nice. Please review carefully
> and urgently?
>
>
> Begin forwarded message:
>
> Date: Wed, 25 Apr 2007 18:16:15 -0600
> From: Mike Stroyan <mike.stroyan@hp.com>
> To: "Luck, Tony" <tony.luck@intel.com>
> Cc: linux-ia64@vger.kernel.org, linux-kernel@vger.kernel.org
> Subject: [PATCH] ia64: race flushing icache in do_no_page path
>
>
> This is a very similar problem to a copy-on-write cache flushing problem
> that Tony Luck fixed in July 2006. In this case the do_no_page function
> handles a fault in an executable or library that is mmapped from an
> NFS file system. The code is copied into a newly reallocated page.
> The lazy_mmu_prot_update() function should be used to flush old entries
> from the icache for that page on ia64 processors. But that call is made
> after a set_pte_at call that makes the page accessible to other threads
> executing the same code. This was seen to cause application crashes
> when an OpenMP application ran many threads calling the same functions at
> the same time. The first thread to reach a page starts to fault in the
> new code. One of the other threads overtakes the first and executes old
> data from the icache. That could result in bad instructions. It is more
> obvious when an old cache line contains prefetched non-instruction bits
> that result in an illegal instruction trap.

I wonder how this is different from all the other code which calls
lazy_mmu_prot_update() after set_pte_at(). do_swap_page, for example,
_could_ fault in executable code, couldn't it?

Is it because do_swap_page uses flush_icache_page()? So why doesn't
the flush_icache_page() work in do_no_page as well? (It seems to be
a superset of lazy_mmu_prot_update on ia64?!?)

And while we're looking at flush_icache_page, why is there none in
do_wp_page? (I admit I'm not really up to scratch on d/i cache aliasing
handling, but cachetlb.txt seems to suggest that cow_user_page fits the
description.) That is, if we're already trying to cover our butts wrt
SMC, then do_wp_page _could_ be cow'ing executable code, couldn't it?

And for that matter, I admit I don't understand how the icache flushing
can be done lazily, only at change-protection time. Why is any
flush_dcache_page() site not a problem for an _existing_ executable pte
wrt d/i cache aliases?

BTW, while I'm ranting, I hope all this stuff has become so complex for a
reason: namely, that the simpler alternative (more flushes, less lazy,
less complex, less buggy) was tested and found to be noticeably
slower... :)

>
> The problem has only been seen on Montecito processors, which have
> separate level 2 icache and dcache. This dcache to icache coherency
> problem is more likely to occur there because of the much larger level
> 2 icache. I suspect that the non-NFS case is working because direct
> DMA into the new page is making the instruction cache coherent. Any
> file system that uses a non-DMA copy into the text page could show the
> same problem.
>
> Signed-off-by: Mike Stroyan <mike.stroyan@hp.com>
>
> diff --git a/mm/memory.c b/mm/memory.c
> index e7066e7..50c8848 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2291,6 +2291,7 @@ retry:
>  		entry = mk_pte(new_page, vma->vm_page_prot);
>  		if (write_access)
>  			entry = maybe_mkwrite(pte_mkdirty(entry), vma);
> +		lazy_mmu_prot_update(entry);
>  		set_pte_at(mm, address, page_table, entry);
>  		if (anon) {
>  			inc_mm_counter(mm, anon_rss);
> @@ -2312,7 +2313,6 @@ retry:
>  
>  	/* no need to invalidate: a not-present page shouldn't be cached */
>  	update_mmu_cache(vma, address, entry);
> -	lazy_mmu_prot_update(entry);
>  unlock:
>  	pte_unmap_unlock(page_table, ptl);
>  	if (dirty_page) {
>
--
SUSE Labs, Novell Inc.