LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH 0/4] SGI Altix cross partition memory (XPMEM)
@ 2007-08-27 15:56 Dean Nelson
  2007-08-27 15:59 ` [PATCH 1/4] export __put_task_struct for XPMEM Dean Nelson
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Dean Nelson @ 2007-08-27 15:56 UTC (permalink / raw)
  To: tony.luck, akpm; +Cc: linux-ia64, linux-kernel, linux-mm, jes


    Terminology

The term 'partition', adopted by the SGI hardware designers and which
perculated up into the software, is used in reference to a single SSI
when multiple SSIs are running on a single Altix. An Altix running
multiple SSIs is said to be 'partitioned', whereas one that is running
only a single SSI is said to be 'unpartitioned'.

The term '[a]cross partition' refers to a functionality that spans between
two SSIs on a multi-SSI Altix. ('XP' is its abbreviation.)

    Introduction

This feature provides cross partition access to user memory (XPMEM) when
running multiple partitions on a single SGI Altix. XPMEM, like XPNET,
utilizes XPC to communicate between the partitions.

XPMEM allows a user process to identify portion(s) of its address space
that other user processes can attach (i.e. map) into their own address
spaces. These processes can be running on the same or a different
partition from the one whose memory they are attaching.

    Known Issues

XPMEM is not currently using the kthread API (which is also true for XPC)
because it was in the process of being changed to require a kthread_stop()
be done for every kthread_create() and the kthread_stop() couldn't be called
for a thread that had already exited. In talking with Eric Biederman, there
was some thought of creating a kthread_orphan() which would eliminate the
need for a call to kthread_stop() being required.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/4] export __put_task_struct for XPMEM
  2007-08-27 15:56 [PATCH 0/4] SGI Altix cross partition memory (XPMEM) Dean Nelson
@ 2007-08-27 15:59 ` Dean Nelson
  2007-08-27 16:13   ` Al Viro
  2007-08-27 16:01 ` [PATCH 2/4] export zap_page_range " Dean Nelson
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Dean Nelson @ 2007-08-27 15:59 UTC (permalink / raw)
  To: akpm; +Cc: linux-ia64, linux-kernel, linux-mm, tony.luck, jes

This patch exports __put_task_struct as it is needed by XPMEM.

Signed-off-by: Dean Nelson <dcn@sgi.com>

---

One struct file_operations registered by XPMEM, xpmem_open(), calls
'get_task_struct(current->group_leader)' and another, xpmem_flush(), calls
'put_task_struct(tg->group_leader)'. The reason for this is given in the
comment block that appears in xpmem_open().

	/*
	 * Increment 'usage' and 'mm->mm_users' for the current task's thread
	 * group leader. This ensures that both its task_struct and mm_struct
	 * will still be around when our thread group exits. (The Linux kernel
	 * normally tears down the mm_struct prior to calling a module's
	 * 'flush' function.) Since all XPMEM thread groups must go through
	 * this path, this extra reference to mm_users also allows us to
	 * directly inc/dec mm_users in xpmem_ensure_valid_PFNs() and avoid
	 * mmput() which has a scaling issue with the mmlist_lock.
	 */

Index: linux-2.6/kernel/fork.c
===================================================================
--- linux-2.6.orig/kernel/fork.c	2007-08-09 07:07:55.426611601 -0500
+++ linux-2.6/kernel/fork.c	2007-08-09 07:15:43.246391700 -0500
@@ -127,6 +127,7 @@
 	if (!profile_handoff_task(tsk))
 		free_task(tsk);
 }
+EXPORT_SYMBOL_GPL(__put_task_struct);
 
 void __init fork_init(unsigned long mempages)
 {

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 2/4] export zap_page_range for XPMEM
  2007-08-27 15:56 [PATCH 0/4] SGI Altix cross partition memory (XPMEM) Dean Nelson
  2007-08-27 15:59 ` [PATCH 1/4] export __put_task_struct for XPMEM Dean Nelson
@ 2007-08-27 16:01 ` Dean Nelson
  2007-08-27 16:02 ` [PATCH 3/4] add new lock ordering rule to mm/filemap.c Dean Nelson
  2007-08-27 16:41 ` [PATCH 4/4] add SGI Altix cross partition memory (XPMEM) driver Dean Nelson
  3 siblings, 0 replies; 17+ messages in thread
From: Dean Nelson @ 2007-08-27 16:01 UTC (permalink / raw)
  To: akpm; +Cc: linux-ia64, linux-kernel, linux-mm, tony.luck, jes

This patch exports zap_page_range as it is needed by XPMEM.

Signed-off-by: Dean Nelson <dcn@sgi.com>

---

XPMEM would have used sys_madvise() except that madvise_dontneed()
madvise_dontneed() returns an -EINVAL if VM_PFNMAP is set, which is
always true for the pages XPMEM imports from other partitions and is
also true for uncached pages allocated locally via the mspec allocator.
XPMEM needs zap_page_range() functionality for these types of pages as
well as 'normal' pages.

Index: linux-2.6/mm/memory.c
===================================================================
--- linux-2.6.orig/mm/memory.c	2007-08-09 07:07:55.762651612 -0500
+++ linux-2.6/mm/memory.c	2007-08-09 07:15:43.226389312 -0500
@@ -894,6 +894,7 @@
 		tlb_finish_mmu(tlb, address, end);
 	return end;
 }
+EXPORT_SYMBOL_GPL(zap_page_range);
 
 /*
  * Do a quick page-table lookup for a single page.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 3/4] add new lock ordering rule to mm/filemap.c
  2007-08-27 15:56 [PATCH 0/4] SGI Altix cross partition memory (XPMEM) Dean Nelson
  2007-08-27 15:59 ` [PATCH 1/4] export __put_task_struct for XPMEM Dean Nelson
  2007-08-27 16:01 ` [PATCH 2/4] export zap_page_range " Dean Nelson
@ 2007-08-27 16:02 ` Dean Nelson
  2007-08-27 16:41 ` [PATCH 4/4] add SGI Altix cross partition memory (XPMEM) driver Dean Nelson
  3 siblings, 0 replies; 17+ messages in thread
From: Dean Nelson @ 2007-08-27 16:02 UTC (permalink / raw)
  To: akpm; +Cc: linux-ia64, linux-kernel, linux-mm, tony.luck, jes

This patch adds a lock ordering rule to avoid a potential deadlock when
multiple mmap_sems need to be locked.

Signed-off-by: Dean Nelson <dcn@sgi.com>

Index: linux-2.6/mm/filemap.c
===================================================================
--- linux-2.6.orig/mm/filemap.c	2007-08-09 19:18:15.000000000 -0500
+++ linux-2.6/mm/filemap.c	2007-08-27 09:13:47.435717670 -0500
@@ -78,6 +78,9 @@
  *  ->i_mutex			(generic_file_buffered_write)
  *    ->mmap_sem		(fault_in_pages_readable->do_page_fault)
  *
+ *    When taking multiple mmap_sems, one should lock the lowest-addressed
+ *    one first proceeding on up to the highest-addressed one.
+ *
  *  ->i_mutex
  *    ->i_alloc_sem             (various)
  *

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] export __put_task_struct for XPMEM
  2007-08-27 15:59 ` [PATCH 1/4] export __put_task_struct for XPMEM Dean Nelson
@ 2007-08-27 16:13   ` Al Viro
  2007-08-27 18:10     ` Dean Nelson
  0 siblings, 1 reply; 17+ messages in thread
From: Al Viro @ 2007-08-27 16:13 UTC (permalink / raw)
  To: Dean Nelson; +Cc: akpm, linux-ia64, linux-kernel, linux-mm, tony.luck, jes

On Mon, Aug 27, 2007 at 10:59:33AM -0500, Dean Nelson wrote:
> This patch exports __put_task_struct as it is needed by XPMEM.
> 
> Signed-off-by: Dean Nelson <dcn@sgi.com>
> 
> ---
> 
> One struct file_operations registered by XPMEM, xpmem_open(), calls
> 'get_task_struct(current->group_leader)' and another, xpmem_flush(), calls
> 'put_task_struct(tg->group_leader)'.

Does it?  Well, then open the file in question and start doing close(dup(fd))
in a loop.  Won't take long for an oops...

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 4/4] add SGI Altix cross partition memory (XPMEM) driver
  2007-08-27 15:56 [PATCH 0/4] SGI Altix cross partition memory (XPMEM) Dean Nelson
                   ` (2 preceding siblings ...)
  2007-08-27 16:02 ` [PATCH 3/4] add new lock ordering rule to mm/filemap.c Dean Nelson
@ 2007-08-27 16:41 ` Dean Nelson
  2007-08-27 20:00   ` Adrian Bunk
  2007-08-28 18:02   ` Christoph Hellwig
  3 siblings, 2 replies; 17+ messages in thread
From: Dean Nelson @ 2007-08-27 16:41 UTC (permalink / raw)
  To: tony.luck; +Cc: linux-ia64, linux-kernel, linux-mm, akpm, jes

[-- Attachment #1: Type: text/plain, Size: 71 bytes --]

This patch has been bzip2'd and included as an attachment due to size.

[-- Attachment #2: xpmem-module.v002.bz2 --]
[-- Type: application/x-bzip, Size: 37264 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] export __put_task_struct for XPMEM
  2007-08-27 16:13   ` Al Viro
@ 2007-08-27 18:10     ` Dean Nelson
  2007-08-27 18:15       ` Al Viro
  2007-08-28 18:01       ` Christoph Hellwig
  0 siblings, 2 replies; 17+ messages in thread
From: Dean Nelson @ 2007-08-27 18:10 UTC (permalink / raw)
  To: Al Viro; +Cc: akpm, linux-ia64, linux-kernel, linux-mm, tony.luck, jes

On Mon, Aug 27, 2007 at 05:13:28PM +0100, Al Viro wrote:
> On Mon, Aug 27, 2007 at 10:59:33AM -0500, Dean Nelson wrote:
> > This patch exports __put_task_struct as it is needed by XPMEM.
> > 
> > Signed-off-by: Dean Nelson <dcn@sgi.com>
> > 
> > ---
> > 
> > One struct file_operations registered by XPMEM, xpmem_open(), calls
> > 'get_task_struct(current->group_leader)' and another, xpmem_flush(), calls
> > 'put_task_struct(tg->group_leader)'.
> 
> Does it?  Well, then open the file in question and start doing close(dup(fd))
> in a loop.  Won't take long for an oops...

Actually it won't oops. And that's because when the file is opened,
xpmem_open() creates a structure for that thread group, and when
xpmem_flush() is called on the close() it first looks for that structure
and if it finds it then it does what it needs to do (which includes the
put_task_struct() call) and then finishes off by destroying the structure.
So for subsequent closes xpmem_flush() returns without calling
put_task_struct().

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] export __put_task_struct for XPMEM
  2007-08-27 18:10     ` Dean Nelson
@ 2007-08-27 18:15       ` Al Viro
  2007-08-27 19:19         ` Dean Nelson
  2007-08-28 18:01       ` Christoph Hellwig
  1 sibling, 1 reply; 17+ messages in thread
From: Al Viro @ 2007-08-27 18:15 UTC (permalink / raw)
  To: Dean Nelson; +Cc: akpm, linux-ia64, linux-kernel, linux-mm, tony.luck, jes

On Mon, Aug 27, 2007 at 01:10:56PM -0500, Dean Nelson wrote:
> On Mon, Aug 27, 2007 at 05:13:28PM +0100, Al Viro wrote:
> > On Mon, Aug 27, 2007 at 10:59:33AM -0500, Dean Nelson wrote:
> > > This patch exports __put_task_struct as it is needed by XPMEM.
> > > 
> > > Signed-off-by: Dean Nelson <dcn@sgi.com>
> > > 
> > > ---
> > > 
> > > One struct file_operations registered by XPMEM, xpmem_open(), calls
> > > 'get_task_struct(current->group_leader)' and another, xpmem_flush(), calls
> > > 'put_task_struct(tg->group_leader)'.
> > 
> > Does it?  Well, then open the file in question and start doing close(dup(fd))
> > in a loop.  Won't take long for an oops...
> 
> Actually it won't oops. And that's because when the file is opened,
> xpmem_open() creates a structure for that thread group, and when
> xpmem_flush() is called on the close() it first looks for that structure
> and if it finds it then it does what it needs to do (which includes the
> put_task_struct() call) and then finishes off by destroying the structure.
> So for subsequent closes xpmem_flush() returns without calling
> put_task_struct().

Then what kind of protection does it get you?  It can be called immediately
after the call of ->open(), so you can't rely on it being there for any
operations.  Makes no sense...

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] export __put_task_struct for XPMEM
  2007-08-27 18:15       ` Al Viro
@ 2007-08-27 19:19         ` Dean Nelson
  2007-08-27 19:35           ` Al Viro
  0 siblings, 1 reply; 17+ messages in thread
From: Dean Nelson @ 2007-08-27 19:19 UTC (permalink / raw)
  To: Al Viro; +Cc: akpm, linux-ia64, linux-kernel, linux-mm, tony.luck, jes

On Mon, Aug 27, 2007 at 07:15:44PM +0100, Al Viro wrote:
> On Mon, Aug 27, 2007 at 01:10:56PM -0500, Dean Nelson wrote:
> > On Mon, Aug 27, 2007 at 05:13:28PM +0100, Al Viro wrote:
> > > On Mon, Aug 27, 2007 at 10:59:33AM -0500, Dean Nelson wrote:
> > > > This patch exports __put_task_struct as it is needed by XPMEM.
> > > > 
> > > > Signed-off-by: Dean Nelson <dcn@sgi.com>
> > > > 
> > > > ---
> > > > 
> > > > One struct file_operations registered by XPMEM, xpmem_open(), calls
> > > > 'get_task_struct(current->group_leader)' and another, xpmem_flush(), calls
> > > > 'put_task_struct(tg->group_leader)'.
> > > 
> > > Does it?  Well, then open the file in question and start doing close(dup(fd))
> > > in a loop.  Won't take long for an oops...
> > 
> > Actually it won't oops. And that's because when the file is opened,
> > xpmem_open() creates a structure for that thread group, and when
> > xpmem_flush() is called on the close() it first looks for that structure
> > and if it finds it then it does what it needs to do (which includes the
> > put_task_struct() call) and then finishes off by destroying the structure.
> > So for subsequent closes xpmem_flush() returns without calling
> > put_task_struct().
> 
> Then what kind of protection does it get you?  It can be called immediately
> after the call of ->open(), so you can't rely on it being there for any
> operations.  Makes no sense...

No operations can be done once it's closed, only while it's opened.

At fault time we need to be able to fault pages that belong to another
thread group. To call get_user_pages() we need to have the task_struct
and the mm_struct for the other thread group.

At one time we did look up the task_struct by pid, but that brought the
system to its knees due to heavy lock contention.

Do you know of a better way of doing this?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] export __put_task_struct for XPMEM
  2007-08-27 19:19         ` Dean Nelson
@ 2007-08-27 19:35           ` Al Viro
  2007-08-27 20:24             ` Robin Holt
  0 siblings, 1 reply; 17+ messages in thread
From: Al Viro @ 2007-08-27 19:35 UTC (permalink / raw)
  To: Dean Nelson; +Cc: akpm, linux-ia64, linux-kernel, linux-mm, tony.luck, jes

On Mon, Aug 27, 2007 at 02:19:06PM -0500, Dean Nelson wrote:

> No operations can be done once it's closed, only while it's opened.

What the hell do you mean, can't be done?

	fd = open(...);
	fp = popen("/bin/date", "r");
	/* read from fp */
	fclose(fp);
	do operations on fd

And you've got yourself
	open
	fork duplicating descriptor
	exit closing that duplicate
	IO on original, after call of flush

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] add SGI Altix cross partition memory (XPMEM) driver
  2007-08-27 16:41 ` [PATCH 4/4] add SGI Altix cross partition memory (XPMEM) driver Dean Nelson
@ 2007-08-27 20:00   ` Adrian Bunk
  2007-08-28 18:02   ` Christoph Hellwig
  1 sibling, 0 replies; 17+ messages in thread
From: Adrian Bunk @ 2007-08-27 20:00 UTC (permalink / raw)
  To: Dean Nelson; +Cc: tony.luck, linux-ia64, linux-kernel, linux-mm, akpm, jes

On Mon, Aug 27, 2007 at 11:41:12AM -0500, Dean Nelson wrote:
> This patch has been bzip2'd and included as an attachment due to size.

200 kB shouldn't be a problem
(AFAIR the current limit on linux-kernel is 400 kB).

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] export __put_task_struct for XPMEM
  2007-08-27 19:35           ` Al Viro
@ 2007-08-27 20:24             ` Robin Holt
  2007-08-27 20:47               ` Al Viro
  0 siblings, 1 reply; 17+ messages in thread
From: Robin Holt @ 2007-08-27 20:24 UTC (permalink / raw)
  To: Al Viro
  Cc: Dean Nelson, akpm, linux-ia64, linux-kernel, linux-mm, tony.luck, jes

On Mon, Aug 27, 2007 at 08:35:10PM +0100, Al Viro wrote:
> On Mon, Aug 27, 2007 at 02:19:06PM -0500, Dean Nelson wrote:
> 
> > No operations can be done once it's closed, only while it's opened.
> 
> What the hell do you mean, can't be done?
> 
> 	fd = open(...);
> 	fp = popen("/bin/date", "r");
> 	/* read from fp */
> 	fclose(fp);

But this will operate on the dup'd fd.  We detect that in the flush
(ignore) and ioctl (return errors) operations.  All other operations
are not handled by xpmem.

If you look at the fourth patch, at the beginning of the xpmem_flush
function, we have:

+       tg = xpmem_tg_ref_by_tgid(xpmem_my_part, current->tgid);
+       if (IS_ERR(tg))
+               return 0;  /* probably child process who inherited fd */

This will protect the xpmem structures of the parent from closes by
the child.

Thanks,
Robin

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] export __put_task_struct for XPMEM
  2007-08-27 20:24             ` Robin Holt
@ 2007-08-27 20:47               ` Al Viro
  0 siblings, 0 replies; 17+ messages in thread
From: Al Viro @ 2007-08-27 20:47 UTC (permalink / raw)
  To: Robin Holt
  Cc: Dean Nelson, akpm, linux-ia64, linux-kernel, linux-mm, tony.luck, jes

On Mon, Aug 27, 2007 at 03:24:20PM -0500, Robin Holt wrote:
> On Mon, Aug 27, 2007 at 08:35:10PM +0100, Al Viro wrote:
> > On Mon, Aug 27, 2007 at 02:19:06PM -0500, Dean Nelson wrote:
> > 
> > > No operations can be done once it's closed, only while it's opened.
> > 
> > What the hell do you mean, can't be done?
> > 
> > 	fd = open(...);
> > 	fp = popen("/bin/date", "r");
> > 	/* read from fp */
> > 	fclose(fp);
> 
> But this will operate on the dup'd fd.  We detect that in the flush
> (ignore) and ioctl (return errors) operations.  All other operations
> are not handled by xpmem.

How the hell do you detect dup'd fd?  It's identical to the original
in every respect and it doesn't have to be held by a different task.
Seriously, what you are proposing makes no sense whatsoever...

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] export __put_task_struct for XPMEM
  2007-08-27 18:10     ` Dean Nelson
  2007-08-27 18:15       ` Al Viro
@ 2007-08-28 18:01       ` Christoph Hellwig
  1 sibling, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2007-08-28 18:01 UTC (permalink / raw)
  To: Dean Nelson
  Cc: Al Viro, akpm, linux-ia64, linux-kernel, linux-mm, tony.luck, jes

On Mon, Aug 27, 2007 at 01:10:56PM -0500, Dean Nelson wrote:
> > Does it?  Well, then open the file in question and start doing close(dup(fd))
> > in a loop.  Won't take long for an oops...
> 
> Actually it won't oops. And that's because when the file is opened,
> xpmem_open() creates a structure for that thread group, and when
> xpmem_flush() is called on the close() it first looks for that structure
> and if it finds it then it does what it needs to do (which includes the
> put_task_struct() call) and then finishes off by destroying the structure.
> So for subsequent closes xpmem_flush() returns without calling
> put_task_struct().

Your refcounting is totally broken.  fds can be passed around in the same
process group (which btw is not something driver should look at because
there are variants of different kinds of process groups depending on clone
flags), and your driver is going boom in most of the case.

We don't export the routines to acquire reference to task structs for
a reason, and this piece of junk called xpmem is not going to change it.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] add SGI Altix cross partition memory (XPMEM) driver
  2007-08-27 16:41 ` [PATCH 4/4] add SGI Altix cross partition memory (XPMEM) driver Dean Nelson
  2007-08-27 20:00   ` Adrian Bunk
@ 2007-08-28 18:02   ` Christoph Hellwig
  2007-08-28 19:00     ` Robin Holt
  1 sibling, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2007-08-28 18:02 UTC (permalink / raw)
  To: Dean Nelson; +Cc: tony.luck, linux-ia64, linux-kernel, linux-mm, akpm, jes

Big fat NACK, for dirty VM tricks, playing with task_struct lifetimes,
and last but not least the horrible ioctl "API".

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] add SGI Altix cross partition memory (XPMEM) driver
  2007-08-28 18:02   ` Christoph Hellwig
@ 2007-08-28 19:00     ` Robin Holt
  2007-08-28 19:05       ` Christoph Hellwig
  0 siblings, 1 reply; 17+ messages in thread
From: Robin Holt @ 2007-08-28 19:00 UTC (permalink / raw)
  To: Christoph Hellwig, Dean Nelson, tony.luck, linux-ia64,
	linux-kernel, linux-mm, akpm, jes

On Tue, Aug 28, 2007 at 07:02:35PM +0100, Christoph Hellwig wrote:
> Big fat NACK, for dirty VM tricks, playing with task_struct lifetimes,
> and last but not least the horrible ioctl "API".

The ioctl is sort of historical.  IIRC, in ProPack 3 (RHEL4 based 2.4
kernel), we added system calls.  When the community started making noise
about system calls being bad, we went to a device special file with a
read/write (couldn't get the needed performance from the ioctl() interface
which used to acquire the BKL).  Now that the community fixed the ioctl
issues, we went to using an ioctl, but are completely open to change.

If you want to introduce system calls, we would expect to need, IIRC, 8.
We also pondered an xpmem filesystem today.  It really felt wrong,
but we could pursue that as an alternative.

What is the correct direction to go with this?  get_user_pages() does
currently require the task_struct.  Are you proposing we develop a way
to fault pages without the task_struct of the owning process/thread group?


Thanks,
Robin

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] add SGI Altix cross partition memory (XPMEM) driver
  2007-08-28 19:00     ` Robin Holt
@ 2007-08-28 19:05       ` Christoph Hellwig
  0 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2007-08-28 19:05 UTC (permalink / raw)
  To: Robin Holt
  Cc: Christoph Hellwig, Dean Nelson, tony.luck, linux-ia64,
	linux-kernel, linux-mm, akpm, jes

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=unknown-8bit, Size: 1325 bytes --]

On Tue, Aug 28, 2007 at 02:00:43PM -0500, Robin Holt wrote:
> The ioctl is sort of historical.  IIRC, in ProPack 3 (RHEL4 based 2.4
> kernel), we added system calls.  When the community started making noise
> about system calls being bad, we went to a device special file with a
> read/write (couldn't get the needed performance from the ioctl() interface
> which used to acquire the BKL).  Now that the community fixed the ioctl
> issues, we went to using an ioctl, but are completely open to change.
> 
> If you want to introduce system calls, we would expect to need, IIRC, 8.
> We also pondered an xpmem filesystem today.  It really felt wrong,
> but we could pursue that as an alternative.

The problem is not ioctls per sæ, but the kind of operation you
export.

> What is the correct direction to go with this?  get_user_pages() does
> currently require the task_struct.  Are you proposing we develop a way
> to fault pages without the task_struct of the owning process/thread group?

Stop trying to mess with vmas and get_user_pages on processes entirely.
The only region of virtual memory a driver can deal with is the one it
got a mmap request for, or when using get_user_pages the one it's got
a read/write request for.  You're doing a worse variant of the rdma page
pinning scheme we're rejected countless times.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2007-08-28 19:05 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-08-27 15:56 [PATCH 0/4] SGI Altix cross partition memory (XPMEM) Dean Nelson
2007-08-27 15:59 ` [PATCH 1/4] export __put_task_struct for XPMEM Dean Nelson
2007-08-27 16:13   ` Al Viro
2007-08-27 18:10     ` Dean Nelson
2007-08-27 18:15       ` Al Viro
2007-08-27 19:19         ` Dean Nelson
2007-08-27 19:35           ` Al Viro
2007-08-27 20:24             ` Robin Holt
2007-08-27 20:47               ` Al Viro
2007-08-28 18:01       ` Christoph Hellwig
2007-08-27 16:01 ` [PATCH 2/4] export zap_page_range " Dean Nelson
2007-08-27 16:02 ` [PATCH 3/4] add new lock ordering rule to mm/filemap.c Dean Nelson
2007-08-27 16:41 ` [PATCH 4/4] add SGI Altix cross partition memory (XPMEM) driver Dean Nelson
2007-08-27 20:00   ` Adrian Bunk
2007-08-28 18:02   ` Christoph Hellwig
2007-08-28 19:00     ` Robin Holt
2007-08-28 19:05       ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).