LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* rfc, leader_pid_type()
@ 2008-03-18 15:38 Oleg Nesterov
       [not found] ` <m18x0fvh5d.fsf@ebiederm.dsl.xmission.com>
  0 siblings, 1 reply; 4+ messages in thread
From: Oleg Nesterov @ 2008-03-18 15:38 UTC (permalink / raw)
  To: Eric W. Biederman, Pavel Emelyanov; +Cc: linux-kernel

Eric, Pavel.

Without tasklist lock held, task_tgid/task_pgrp/task_session can return the
bogus NULL. Note that the last 2 can return NULL even if task == current.

What do you think if we add yet another helper?

	struct pid *leader_pid_type(struct task_struct *task, enum pid_type type)
	{
		struct pid *ret;
	retry:
		ret = task->group_leader->pids[type].pid;
		if (likely(ret != NULL) || !pid_alive(task))
			return ret;
		/*
		 * We hit the old leader in the middle of de_thread(),
		 * or setsid/setpgrp is in progress.
		 */
		cpu_relax();
		goto retry;
	}

Yes, we already have a lot helpers... The one potential user is
check_kill_permission(), but it can live without it.

What do you think, do you see other possible users? Say, do_task_stat() may
report sid = 0, but this is minor of course...

Oleg.	


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: rfc, leader_pid_type()
       [not found] ` <m18x0fvh5d.fsf@ebiederm.dsl.xmission.com>
@ 2008-03-18 22:15   ` Oleg Nesterov
       [not found]     ` <m13aqnvdnk.fsf@ebiederm.dsl.xmission.com>
  0 siblings, 1 reply; 4+ messages in thread
From: Oleg Nesterov @ 2008-03-18 22:15 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Pavel Emelyanov, linux-kernel

On 03/18, Eric W. Biederman wrote:
>
> Oleg Nesterov <oleg@tv-sign.ru> writes:
> 
> > Eric, Pavel.
> >
> > Without tasklist lock held, task_tgid/task_pgrp/task_session can return the
> > bogus NULL. Note that the last 2 can return NULL even if task == current.
> >
> > What do you think if we add yet another helper?
> 
> My current inclination is this places the cost for de_thread in the
> wrong place.  exec on a threaded binary should be rare.
> Any chance we can make de_thread rcu safe?
> 
> We are very close.
> 
> It would take a double check but I believe all we need to do is to
> modify detach_pid to remove link->pid.  This of course messes up
> pid_alive but otherwise we should be ok if we have a big fat comment.

Not sure I understand... detach_pid(type) already sets

	task->pids[type].pid = NULL;

> We might need to replace the detach_pid, attach_pid sequence in
> __set_special_pids with an optimized sequence like transfer_pid
> call it replace_pid where we guarantee there is always a valid pid
> pointer in the group_leader.

OK... I think you are right... good point.

> It just feels wrong to me to put cost (and worse complexity) for
> handling the very rare cases in much more common code paths. 

Absolutely agreed.

> > Yes, we already have a lot helpers... The one potential user is
> > check_kill_permission(), but it can live without it.
> 
> I think it is worth removing the pain of using de_thread if we can.

Agreed, but how? de_thread() must remove the old leader from
->pids[type].node before it does release_task(leader).

(i do remember your idea to _not_ switch ->group_leader on exec,
 it solves sooo many problems...)

Oleg.


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: rfc, leader_pid_type()
       [not found]     ` <m13aqnvdnk.fsf@ebiederm.dsl.xmission.com>
@ 2008-03-19  0:31       ` Oleg Nesterov
  2008-03-19 14:36         ` Oleg Nesterov
  0 siblings, 1 reply; 4+ messages in thread
From: Oleg Nesterov @ 2008-03-19  0:31 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Pavel Emelyanov, linux-kernel

On 03/18, Eric W. Biederman wrote:
>
> Oleg Nesterov <oleg@tv-sign.ru> writes:
> 
> > On 03/18, Eric W. Biederman wrote:
> >>
> >> Oleg Nesterov <oleg@tv-sign.ru> writes:
> >> 
> >> > Eric, Pavel.
> >> >
> >> > Without tasklist lock held, task_tgid/task_pgrp/task_session can return the
> >> > bogus NULL. Note that the last 2 can return NULL even if task == current.
> >> >
> >> > What do you think if we add yet another helper?
> >> 
> >> My current inclination is this places the cost for de_thread in the
> >> wrong place.  exec on a threaded binary should be rare.
> >> Any chance we can make de_thread rcu safe?
> >> 
> >> We are very close.
> >> 
> >> It would take a double check but I believe all we need to do is to
> >> modify detach_pid to remove link->pid.  This of course messes up
> >> pid_alive but otherwise we should be ok if we have a big fat comment.
> >
> > Not sure I understand... detach_pid(type) already sets
> >
> > 	task->pids[type].pid = NULL;
> 
> Sorry.  I meant to remove that line that clears task->pids[type].pid 
> If we don't set .pid to NULL we can still dereference while the task_struct
> is rcu reachable.

!!!!! Yes you are right.

No, we don't need to change detach_pid(), but we can change transfer_pid()
and kill this line:

	old->pids[type].pid = NULL;

this can't break de_thread()->release_task(leader)->__unhash_process()
because leader is not group_leader() any longer.

And with this change we can't see task_session() == NULL under rcu_read_lock()
during exec, great!

And. this doesn't break do_each_pid_task(PIDTYPE_SID), we can't see both
old and new leaders.

> >> We might need to replace the detach_pid, attach_pid sequence in
> >> __set_special_pids with an optimized sequence like transfer_pid
> >> call it replace_pid where we guarantee there is always a valid pid
> >> pointer in the group_leader.
> >
> > OK... I think you are right... good point.

Yes, good point. Except we need a couple of subtle but straightforward
changes, but I think this change worth it.

Perhaps I (or you ;) missed something, I'll re-check... but this all
looks like a good plan, thanks!

Oleg.


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: rfc, leader_pid_type()
  2008-03-19  0:31       ` Oleg Nesterov
@ 2008-03-19 14:36         ` Oleg Nesterov
  0 siblings, 0 replies; 4+ messages in thread
From: Oleg Nesterov @ 2008-03-19 14:36 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Pavel Emelyanov, linux-kernel

On 03/19, Oleg Nesterov wrote:
>
> On 03/18, Eric W. Biederman wrote:
> >
> > Oleg Nesterov <oleg@tv-sign.ru> writes:
> > 
> > > On 03/18, Eric W. Biederman wrote:
> > >>
> > >> Any chance we can make de_thread rcu safe?
> > >> 
> > >> We are very close.
> > >> 
> > >> It would take a double check but I believe all we need to do is to
> > >> modify detach_pid to remove link->pid.  This of course messes up
> > >> pid_alive but otherwise we should be ok if we have a big fat comment.
> > >
> > > Not sure I understand... detach_pid(type) already sets
> > >
> > > 	task->pids[type].pid = NULL;
> > 
> > Sorry.  I meant to remove that line that clears task->pids[type].pid 
> > If we don't set .pid to NULL we can still dereference while the task_struct
> > is rcu reachable.

To clarify: we can't change detach_pid() and remove the line that clears
task->pids[type].pid right now. Look for example at get_task_pid(), it
could be used without rcu, using the previously pinned task_struct.

But in the long term this might be right.

(As for pid_alive(). Can't we rename it? Its name is very misleading, and it
 shouldn't play with ->pids, this looks confusing. Just do p->sighand != 0)

> !!!!! Yes you are right.
> 
> No, we don't need to change detach_pid(), but we can change transfer_pid()
> and kill this line:
> 
> 	old->pids[type].pid = NULL;
> 
> this can't break de_thread()->release_task(leader)->__unhash_process()
> because leader is not group_leader() any longer.
> 
> And with this change we can't see task_session() == NULL under rcu_read_lock()
> during exec, great!
> 
> And. this doesn't break do_each_pid_task(PIDTYPE_SID), we can't see both
> old and new leaders.
> 
> > >> We might need to replace the detach_pid, attach_pid sequence in
> > >> __set_special_pids with an optimized sequence like transfer_pid
> > >> call it replace_pid where we guarantee there is always a valid pid
> > >> pointer in the group_leader.
> > >
> > > OK... I think you are right... good point.
> 
> Yes, good point. Except we need a couple of subtle but straightforward
> changes, but I think this change worth it.

Something like the patch below. What do you think? setsid/setpgrp now
can use change_pid(). The change in transfer_pid() needs a separate
patch with the fat changelog.

This doesn't solve the problem with task_tgid() == NULL, but at least
this is not possible with task == current.

Oleg.

--- 25/kernel/pid.c~1_PID_CHANGE	2008-02-16 17:51:17.000000000 +0300
+++ 25/kernel/pid.c	2008-03-19 17:15:15.000000000 +0300
@@ -317,29 +317,25 @@ EXPORT_SYMBOL_GPL(find_pid);
 /*
  * attach_pid() must be called with the tasklist_lock write-held.
  */
-int attach_pid(struct task_struct *task, enum pid_type type,
-		struct pid *pid)
+int attach_pid(struct task_struct *task, enum pid_type type, struct pid *pid)
 {
-	struct pid_link *link;
+	struct pid_link *link = &task->pids[type];
 
-	link = &task->pids[type];
 	link->pid = pid;
-	hlist_add_head_rcu(&link->node, &pid->tasks[type]);
+	if (likely(pid != NULL))
+		hlist_add_head_rcu(&link->node, &pid->tasks[type]);
 
 	return 0;
 }
 
-void detach_pid(struct task_struct *task, enum pid_type type)
+void change_pid(struct task_struct *task, enum pid_type type, struct pid *new)
 {
-	struct pid_link *link;
-	struct pid *pid;
+	struct pid_link *link = &task->pids[type];
+	struct pid *pid = link->pid;
 	int tmp;
 
-	link = &task->pids[type];
-	pid = link->pid;
-
 	hlist_del_rcu(&link->node);
-	link->pid = NULL;
+	attach_pid(task, type, new);
 
 	for (tmp = PIDTYPE_MAX; --tmp >= 0; )
 		if (!hlist_empty(&pid->tasks[tmp]))
@@ -348,13 +344,17 @@ void detach_pid(struct task_struct *task
 	free_pid(pid);
 }
 
+void detach_pid(struct task_struct *task, enum pid_type type)
+{
+	change_pid(task, type, NULL);
+}
+
 /* transfer_pid is an optimization of attach_pid(new), detach_pid(old) */
 void transfer_pid(struct task_struct *old, struct task_struct *new,
 			   enum pid_type type)
 {
 	new->pids[type].pid = old->pids[type].pid;
 	hlist_replace_rcu(&old->pids[type].node, &new->pids[type].node);
-	old->pids[type].pid = NULL;
 }
 
 struct task_struct *pid_task(struct pid *pid, enum pid_type type)


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2008-03-19 23:00 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-03-18 15:38 rfc, leader_pid_type() Oleg Nesterov
     [not found] ` <m18x0fvh5d.fsf@ebiederm.dsl.xmission.com>
2008-03-18 22:15   ` Oleg Nesterov
     [not found]     ` <m13aqnvdnk.fsf@ebiederm.dsl.xmission.com>
2008-03-19  0:31       ` Oleg Nesterov
2008-03-19 14:36         ` Oleg Nesterov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).