LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* rfc, leader_pid_type()
@ 2008-03-18 15:38 Oleg Nesterov
[not found] ` <m18x0fvh5d.fsf@ebiederm.dsl.xmission.com>
0 siblings, 1 reply; 4+ messages in thread
From: Oleg Nesterov @ 2008-03-18 15:38 UTC (permalink / raw)
To: Eric W. Biederman, Pavel Emelyanov; +Cc: linux-kernel
Eric, Pavel.
Without tasklist lock held, task_tgid/task_pgrp/task_session can return the
bogus NULL. Note that the last 2 can return NULL even if task == current.
What do you think if we add yet another helper?
struct pid *leader_pid_type(struct task_struct *task, enum pid_type type)
{
struct pid *ret;
retry:
ret = task->group_leader->pids[type].pid;
if (likely(ret != NULL) || !pid_alive(task))
return ret;
/*
* We hit the old leader in the middle of de_thread(),
* or setsid/setpgrp is in progress.
*/
cpu_relax();
goto retry;
}
Yes, we already have a lot helpers... The one potential user is
check_kill_permission(), but it can live without it.
What do you think, do you see other possible users? Say, do_task_stat() may
report sid = 0, but this is minor of course...
Oleg.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: rfc, leader_pid_type()
[not found] ` <m18x0fvh5d.fsf@ebiederm.dsl.xmission.com>
@ 2008-03-18 22:15 ` Oleg Nesterov
[not found] ` <m13aqnvdnk.fsf@ebiederm.dsl.xmission.com>
0 siblings, 1 reply; 4+ messages in thread
From: Oleg Nesterov @ 2008-03-18 22:15 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: Pavel Emelyanov, linux-kernel
On 03/18, Eric W. Biederman wrote:
>
> Oleg Nesterov <oleg@tv-sign.ru> writes:
>
> > Eric, Pavel.
> >
> > Without tasklist lock held, task_tgid/task_pgrp/task_session can return the
> > bogus NULL. Note that the last 2 can return NULL even if task == current.
> >
> > What do you think if we add yet another helper?
>
> My current inclination is this places the cost for de_thread in the
> wrong place. exec on a threaded binary should be rare.
> Any chance we can make de_thread rcu safe?
>
> We are very close.
>
> It would take a double check but I believe all we need to do is to
> modify detach_pid to remove link->pid. This of course messes up
> pid_alive but otherwise we should be ok if we have a big fat comment.
Not sure I understand... detach_pid(type) already sets
task->pids[type].pid = NULL;
> We might need to replace the detach_pid, attach_pid sequence in
> __set_special_pids with an optimized sequence like transfer_pid
> call it replace_pid where we guarantee there is always a valid pid
> pointer in the group_leader.
OK... I think you are right... good point.
> It just feels wrong to me to put cost (and worse complexity) for
> handling the very rare cases in much more common code paths.
Absolutely agreed.
> > Yes, we already have a lot helpers... The one potential user is
> > check_kill_permission(), but it can live without it.
>
> I think it is worth removing the pain of using de_thread if we can.
Agreed, but how? de_thread() must remove the old leader from
->pids[type].node before it does release_task(leader).
(i do remember your idea to _not_ switch ->group_leader on exec,
it solves sooo many problems...)
Oleg.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: rfc, leader_pid_type()
[not found] ` <m13aqnvdnk.fsf@ebiederm.dsl.xmission.com>
@ 2008-03-19 0:31 ` Oleg Nesterov
2008-03-19 14:36 ` Oleg Nesterov
0 siblings, 1 reply; 4+ messages in thread
From: Oleg Nesterov @ 2008-03-19 0:31 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: Pavel Emelyanov, linux-kernel
On 03/18, Eric W. Biederman wrote:
>
> Oleg Nesterov <oleg@tv-sign.ru> writes:
>
> > On 03/18, Eric W. Biederman wrote:
> >>
> >> Oleg Nesterov <oleg@tv-sign.ru> writes:
> >>
> >> > Eric, Pavel.
> >> >
> >> > Without tasklist lock held, task_tgid/task_pgrp/task_session can return the
> >> > bogus NULL. Note that the last 2 can return NULL even if task == current.
> >> >
> >> > What do you think if we add yet another helper?
> >>
> >> My current inclination is this places the cost for de_thread in the
> >> wrong place. exec on a threaded binary should be rare.
> >> Any chance we can make de_thread rcu safe?
> >>
> >> We are very close.
> >>
> >> It would take a double check but I believe all we need to do is to
> >> modify detach_pid to remove link->pid. This of course messes up
> >> pid_alive but otherwise we should be ok if we have a big fat comment.
> >
> > Not sure I understand... detach_pid(type) already sets
> >
> > task->pids[type].pid = NULL;
>
> Sorry. I meant to remove that line that clears task->pids[type].pid
> If we don't set .pid to NULL we can still dereference while the task_struct
> is rcu reachable.
!!!!! Yes you are right.
No, we don't need to change detach_pid(), but we can change transfer_pid()
and kill this line:
old->pids[type].pid = NULL;
this can't break de_thread()->release_task(leader)->__unhash_process()
because leader is not group_leader() any longer.
And with this change we can't see task_session() == NULL under rcu_read_lock()
during exec, great!
And. this doesn't break do_each_pid_task(PIDTYPE_SID), we can't see both
old and new leaders.
> >> We might need to replace the detach_pid, attach_pid sequence in
> >> __set_special_pids with an optimized sequence like transfer_pid
> >> call it replace_pid where we guarantee there is always a valid pid
> >> pointer in the group_leader.
> >
> > OK... I think you are right... good point.
Yes, good point. Except we need a couple of subtle but straightforward
changes, but I think this change worth it.
Perhaps I (or you ;) missed something, I'll re-check... but this all
looks like a good plan, thanks!
Oleg.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: rfc, leader_pid_type()
2008-03-19 0:31 ` Oleg Nesterov
@ 2008-03-19 14:36 ` Oleg Nesterov
0 siblings, 0 replies; 4+ messages in thread
From: Oleg Nesterov @ 2008-03-19 14:36 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: Pavel Emelyanov, linux-kernel
On 03/19, Oleg Nesterov wrote:
>
> On 03/18, Eric W. Biederman wrote:
> >
> > Oleg Nesterov <oleg@tv-sign.ru> writes:
> >
> > > On 03/18, Eric W. Biederman wrote:
> > >>
> > >> Any chance we can make de_thread rcu safe?
> > >>
> > >> We are very close.
> > >>
> > >> It would take a double check but I believe all we need to do is to
> > >> modify detach_pid to remove link->pid. This of course messes up
> > >> pid_alive but otherwise we should be ok if we have a big fat comment.
> > >
> > > Not sure I understand... detach_pid(type) already sets
> > >
> > > task->pids[type].pid = NULL;
> >
> > Sorry. I meant to remove that line that clears task->pids[type].pid
> > If we don't set .pid to NULL we can still dereference while the task_struct
> > is rcu reachable.
To clarify: we can't change detach_pid() and remove the line that clears
task->pids[type].pid right now. Look for example at get_task_pid(), it
could be used without rcu, using the previously pinned task_struct.
But in the long term this might be right.
(As for pid_alive(). Can't we rename it? Its name is very misleading, and it
shouldn't play with ->pids, this looks confusing. Just do p->sighand != 0)
> !!!!! Yes you are right.
>
> No, we don't need to change detach_pid(), but we can change transfer_pid()
> and kill this line:
>
> old->pids[type].pid = NULL;
>
> this can't break de_thread()->release_task(leader)->__unhash_process()
> because leader is not group_leader() any longer.
>
> And with this change we can't see task_session() == NULL under rcu_read_lock()
> during exec, great!
>
> And. this doesn't break do_each_pid_task(PIDTYPE_SID), we can't see both
> old and new leaders.
>
> > >> We might need to replace the detach_pid, attach_pid sequence in
> > >> __set_special_pids with an optimized sequence like transfer_pid
> > >> call it replace_pid where we guarantee there is always a valid pid
> > >> pointer in the group_leader.
> > >
> > > OK... I think you are right... good point.
>
> Yes, good point. Except we need a couple of subtle but straightforward
> changes, but I think this change worth it.
Something like the patch below. What do you think? setsid/setpgrp now
can use change_pid(). The change in transfer_pid() needs a separate
patch with the fat changelog.
This doesn't solve the problem with task_tgid() == NULL, but at least
this is not possible with task == current.
Oleg.
--- 25/kernel/pid.c~1_PID_CHANGE 2008-02-16 17:51:17.000000000 +0300
+++ 25/kernel/pid.c 2008-03-19 17:15:15.000000000 +0300
@@ -317,29 +317,25 @@ EXPORT_SYMBOL_GPL(find_pid);
/*
* attach_pid() must be called with the tasklist_lock write-held.
*/
-int attach_pid(struct task_struct *task, enum pid_type type,
- struct pid *pid)
+int attach_pid(struct task_struct *task, enum pid_type type, struct pid *pid)
{
- struct pid_link *link;
+ struct pid_link *link = &task->pids[type];
- link = &task->pids[type];
link->pid = pid;
- hlist_add_head_rcu(&link->node, &pid->tasks[type]);
+ if (likely(pid != NULL))
+ hlist_add_head_rcu(&link->node, &pid->tasks[type]);
return 0;
}
-void detach_pid(struct task_struct *task, enum pid_type type)
+void change_pid(struct task_struct *task, enum pid_type type, struct pid *new)
{
- struct pid_link *link;
- struct pid *pid;
+ struct pid_link *link = &task->pids[type];
+ struct pid *pid = link->pid;
int tmp;
- link = &task->pids[type];
- pid = link->pid;
-
hlist_del_rcu(&link->node);
- link->pid = NULL;
+ attach_pid(task, type, new);
for (tmp = PIDTYPE_MAX; --tmp >= 0; )
if (!hlist_empty(&pid->tasks[tmp]))
@@ -348,13 +344,17 @@ void detach_pid(struct task_struct *task
free_pid(pid);
}
+void detach_pid(struct task_struct *task, enum pid_type type)
+{
+ change_pid(task, type, NULL);
+}
+
/* transfer_pid is an optimization of attach_pid(new), detach_pid(old) */
void transfer_pid(struct task_struct *old, struct task_struct *new,
enum pid_type type)
{
new->pids[type].pid = old->pids[type].pid;
hlist_replace_rcu(&old->pids[type].node, &new->pids[type].node);
- old->pids[type].pid = NULL;
}
struct task_struct *pid_task(struct pid *pid, enum pid_type type)
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2008-03-19 23:00 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-03-18 15:38 rfc, leader_pid_type() Oleg Nesterov
[not found] ` <m18x0fvh5d.fsf@ebiederm.dsl.xmission.com>
2008-03-18 22:15 ` Oleg Nesterov
[not found] ` <m13aqnvdnk.fsf@ebiederm.dsl.xmission.com>
2008-03-19 0:31 ` Oleg Nesterov
2008-03-19 14:36 ` Oleg Nesterov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).