From: Nick Piggin <>
To: William Lee Irwin III <>
Cc: Adam Litke <>,
	Andrew Morton <>,
	Arjan van de Ven <>,
	Christoph Hellwig <>,
	Ken Chen <>,
	Linus Torvalds <>
Subject: Re: [PATCH 1/7] Introduce the pagetable_operations and associated helper macros.
Date: Wed, 21 Mar 2007 17:51:23 +1100
Message-ID: <>
In-Reply-To: <>

William Lee Irwin III wrote:
> William Lee Irwin III wrote:
>>>ISTR potential ppc64 users coming out of the woodwork for something I
>>>didn't recognize the name of, but I may be confusing that with your
>>>patch. I can implement additional users (and useful ones at that)
>>>needing this in particular if desired.
> On Wed, Mar 21, 2007 at 04:07:43PM +1100, Nick Piggin wrote:
>>Yes I would be interested in seeing useful additional users of this
>>that cannot use our regular virtual memory, before making it a general
>>I just don't want to see proliferation of these things, if possible.
> I'm tied up elsewhere so I won't get to it in a timely fashion. Maybe
> in a few weeks I can start up on the first two of the bunch.

Care to give us a hint? :)

> William Lee Irwin III wrote:
>>>Two fault handling methods callbacks raise an eyebrow over here at least.
>>>I was vaguely hoping for unification of the fault handling callbacks.
> On Wed, Mar 21, 2007 at 04:07:43PM +1100, Nick Piggin wrote:
>>I don't know if it would be so clean to do that as they are at different 
>>Adam's fault is before the VM translation (and bypasses it), and mine is 
> Not much of a VM translation; it's just a lookup through the software
> mocked-up structures on everything save i386, x86_64, and some m68k where
> they're the same thing only with hardware walkers (ISTR ia64's being
> firmware a la Alpha despite the "HPW" name, though I could be wrong)

Well the vma+pagetables *are* our VM translation data structure. It is
a good data structure. The Gelato/UNSW guys experimenting with changing
this have basically said they haven't yet got anything that beats it.

I would be opposed to anything that bypasses that unless a) it is not
applicable to the VM as a whole, and b) it is really worth it
(hugepages was a reasonable exception).

> reliant on them. The drivers/etc. could just as easily use helper
> functions to carry out the lookup, thereby accomplishing the
> unification. There's nothing particularly fundamental about a pte
> lookup.

Yeah you could, but it looks back to front to me.

The VM tells the filesystem that the machine took a fault at virtual
address X, then the filesystem asks the VM what pgoff that is, then
tells the VM to install the corresponding page to vaddr X.

With my ->fault, the VM asks the filesystem to give the page that
corresponds to vaddr X, then installs it into that vaddr.
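
To make the two call directions concrete, here is a rough userspace mock
(not kernel code). Every name in it (mock_vma, vma_pgoff, vm_install,
fs_fault_backwards, fs_fault) is hypothetical and invented for this sketch.
In particular, having the VM compute the page offset before calling the
filesystem is an assumption of the sketch, not a description of the actual
->fault patch.

```c
#include <assert.h>

#define PAGE_SHIFT 12
#define VMA_PAGES  4                    /* pages covered by the toy mapping */

/* Hypothetical stand-ins; these are not the kernel's real types. */
struct page { int id; };

struct mock_vma {
	unsigned long vm_start;             /* first virtual address mapped */
	unsigned long vm_pgoff;             /* file offset of vm_start, in pages */
	struct page *file_pages;            /* the "file", indexed by page offset */
	struct page *installed[VMA_PAGES];  /* toy page table for this mapping */
};

/* VM side: translate a virtual address into a file page offset. */
static unsigned long vma_pgoff(const struct mock_vma *vma, unsigned long vaddr)
{
	return ((vaddr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
}

/* VM side: install a page at a virtual address. */
static void vm_install(struct mock_vma *vma, unsigned long vaddr,
		       struct page *pg)
{
	unsigned long idx = (vaddr - vma->vm_start) >> PAGE_SHIFT;

	assert(idx < VMA_PAGES);
	vma->installed[idx] = pg;
}

/*
 * Style A ("back to front"): the filesystem is handed the faulting
 * address, asks the VM which page offset that is, then tells the VM
 * to install the page.
 */
static void fs_fault_backwards(struct mock_vma *vma, unsigned long vaddr)
{
	unsigned long pgoff = vma_pgoff(vma, vaddr);        /* fs asks the VM */

	vm_install(vma, vaddr, &vma->file_pages[pgoff]);    /* fs tells the VM */
}

static void vm_handle_fault_a(struct mock_vma *vma, unsigned long vaddr)
{
	fs_fault_backwards(vma, vaddr);
}

/*
 * Style B: the VM asks the filesystem for the page and does the
 * install itself. The filesystem never touches the page table.
 */
static struct page *fs_fault(struct mock_vma *vma, unsigned long pgoff)
{
	return &vma->file_pages[pgoff];
}

static void vm_handle_fault_b(struct mock_vma *vma, unsigned long vaddr)
{
	struct page *pg = fs_fault(vma, vma_pgoff(vma, vaddr));

	vm_install(vma, vaddr, pg);
}
```

Both handlers leave the same page installed; the difference is only in who
calls whom, and in style B the filesystem needs no knowledge of virtual
addresses or of how pages get installed.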

> Normal arches that do software TLB refill could just as easily
> consult the radix trees dangled off struct address_space or any old
> data structure floating around the kernel with enough information to
> translate user virtual addresses to the physical addresses they need to
> fill the TLB with, and there are other kernels that literally do things
> like that.

Sure it *could* be done, but it may not be very nice, given Linux's
design. And you definitely need _something_ other than just the
pagecache radix-tree, because the VM needs to know who maps the page.

So if, for your backing store, you use a small hash table and evict old
entries like powerpc, you'll constantly be faulting in and out pages
from the VM's high level view of the address space. That isn't a really
cheap operation. It takes at least:



Compared to our current page table walk which is just a single locked
op + barrier for the spinlock + radix tree walk.

If you had a very large hash table (ia64 long mode, maybe?), then you
may have slightly fewer high level faults, but range based operations
are going to take a whole lot of cache misses, aren't they? Especially
for small processes.
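
A toy model of the refault argument, as a sketch only: a direct-mapped
table of translations that evicts on collision, with a counter for how
often a lookup has to fall back to the high level view. The structure and
every name here (xlate_cache, cache_touch, sweep) are made up for
illustration; this is not how powerpc's hashed page table or ia64's
long-format VHPT is actually organised, and it says nothing about the
cost of each refill.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLOTS 1024

/* Hypothetical direct-mapped translation table with eviction. */
struct xlate_cache {
	size_t nslots;                  /* power of two, at most MAX_SLOTS */
	unsigned long tag[MAX_SLOTS];   /* virtual page number held per slot */
	int valid[MAX_SLOTS];
	unsigned long refills;          /* fallbacks to the high level view */
};

static void cache_init(struct xlate_cache *c, size_t nslots)
{
	assert(nslots > 0 && nslots <= MAX_SLOTS);
	assert((nslots & (nslots - 1)) == 0);   /* must be a power of two */

	c->nslots = nslots;
	c->refills = 0;
	for (size_t i = 0; i < nslots; i++)
		c->valid[i] = 0;
}

/* Look up one virtual page; on a miss, evict whatever is in the slot. */
static void cache_touch(struct xlate_cache *c, unsigned long vpn)
{
	size_t slot = vpn & (c->nslots - 1);

	if (c->valid[slot] && c->tag[slot] == vpn)
		return;                     /* hit: no high level fault */

	c->tag[slot] = vpn;             /* miss: old entry is thrown away */
	c->valid[slot] = 1;
	c->refills++;
}

/* Touch pages 0..npages-1 in order, `passes` times; return total refills. */
static unsigned long sweep(struct xlate_cache *c, unsigned long npages,
			   int passes)
{
	for (int p = 0; p < passes; p++)
		for (unsigned long vpn = 0; vpn < npages; vpn++)
			cache_touch(c, vpn);
	return c->refills;
}
```

With a 64-page working set swept 4 times, a 16-slot table refills on every
one of the 256 touches, while a 64-slot table refills only on the first
pass. That is the "small table, constant refaulting" case versus the "large
table, fewer high level faults" case; the cache-miss cost of range
operations over a large table is not modelled here.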

Not that I wouldn't be happy to be proven wrong, but I don't think it
should be something that sneaks in under these pagetable operations.

SUSE Labs, Novell Inc.

