LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* Question about ftrace, dynamically allocated trampolines and dynamic fops
@ 2015-01-29  9:40 Miroslav Benes
  2015-01-29 14:41 ` Steven Rostedt
  0 siblings, 1 reply; 4+ messages in thread
From: Miroslav Benes @ 2015-01-29  9:40 UTC (permalink / raw)
  To: rostedt; +Cc: Linux Kernel Mailing List, jkosina


Hi,

solving a possible race condition in kGraft and thinking about the same in 
klp live patching I looked quite a lot at ftrace code. One thing about 
recent dynamic trampolines seems a bit odd. For dynamic fops 
(FTRACE_OPS_FL_DYNAMIC is set in ops->flags) arch_ftrace_update_trampoline 
is called only for nonpreemptive kernels in ftrace_update_trampoline. The 
reason is obvious and well described in the comment there. However the 
actual callback function in arch_ftrace_update_trampoline is 
determined by call to ftrace_ops_get_func which gives generic 
ftrace_ops_list_func for dynamic ops. This function disables preemption 
(because of traversing rcu protected list), so it should be safe to use 
dynamic trampolines even for dynamic fops in preemptive kernels. Is this 
correct? 

Or maybe the problem is the opposite. Why does the ftrace use 
ftrace_ops_list_func in such situation? Even for nonpreemptive kernel and 
dynamic fops ftrace_ops_list_func has unnecessary overhead.

I'm almost sure I must have missed something but I cannot see what. Could 
you point me at the right direction?

Thanks a lot in advance,
--
Miroslav Benes
SUSE Labs

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Question about ftrace, dynamically allocated trampolines and dynamic fops
  2015-01-29  9:40 Question about ftrace, dynamically allocated trampolines and dynamic fops Miroslav Benes
@ 2015-01-29 14:41 ` Steven Rostedt
  2015-01-30 10:08   ` Miroslav Benes
  0 siblings, 1 reply; 4+ messages in thread
From: Steven Rostedt @ 2015-01-29 14:41 UTC (permalink / raw)
  To: Miroslav Benes; +Cc: Linux Kernel Mailing List, jkosina

On Thu, 29 Jan 2015 10:40:58 +0100 (CET)
Miroslav Benes <mbenes@suse.cz> wrote:

> 
> Hi,
> 
> solving a possible race condition in kGraft and thinking about the same in 
> klp live patching I looked quite a lot at ftrace code. One thing about 
> recent dynamic trampolines seems a bit odd. For dynamic fops 
> (FTRACE_OPS_FL_DYNAMIC is set in ops->flags) arch_ftrace_update_trampoline 
> is called only for nonpreemptive kernels in ftrace_update_trampoline. The 
> reason is obvious and well described in the comment there. However the 
> actual callback function in arch_ftrace_update_trampoline is 
> determined by call to ftrace_ops_get_func which gives generic 
> ftrace_ops_list_func for dynamic ops. This function disables preemption 
> (because of traversing rcu protected list), so it should be safe to use 
> dynamic trampolines even for dynamic fops in preemptive kernels. Is this 
> correct? 

No, the dynamic trampoline itself is not safe. I explained this at the
LinuxConEU/Plumbers conference, although the slides don't explain it
well :-/, otherwise I would have just pointed you to them.

Basically what the issue is, if a task jumps to the dynamic trampoline
and gets preempted, how would you ever free that trampoline again?

Now for live kernel patching, we could add another flag that says
permanent, that means the trampoline and the ops will never be free or
change, and then it would be safe to use dynamic trampolines.

> 
> Or maybe the problem is the opposite. Why does the ftrace use 
> ftrace_ops_list_func in such situation? Even for nonpreemptive kernel and 
> dynamic fops ftrace_ops_list_func has unnecessary overhead.

It points the dynamic ops (in non-preempt) to ftrace_ops_list_func? Are
you sure about that. One way to verify is to
cat /sys/kernel/debug/tracing/enabled_functions, which should show you
want the dynamic ops points to.

-- Steve

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Question about ftrace, dynamically allocated trampolines and dynamic fops
  2015-01-29 14:41 ` Steven Rostedt
@ 2015-01-30 10:08   ` Miroslav Benes
  2015-01-30 14:12     ` Steven Rostedt
  0 siblings, 1 reply; 4+ messages in thread
From: Miroslav Benes @ 2015-01-30 10:08 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Linux Kernel Mailing List, jkosina

On Thu, 29 Jan 2015, Steven Rostedt wrote:

> On Thu, 29 Jan 2015 10:40:58 +0100 (CET)
> Miroslav Benes <mbenes@suse.cz> wrote:
> 
> > 
> > Hi,
> > 
> > solving a possible race condition in kGraft and thinking about the same in 
> > klp live patching I looked quite a lot at ftrace code. One thing about 
> > recent dynamic trampolines seems a bit odd. For dynamic fops 
> > (FTRACE_OPS_FL_DYNAMIC is set in ops->flags) arch_ftrace_update_trampoline 
> > is called only for nonpreemptive kernels in ftrace_update_trampoline. The 
> > reason is obvious and well described in the comment there. However the 
> > actual callback function in arch_ftrace_update_trampoline is 
> > determined by call to ftrace_ops_get_func which gives generic 
> > ftrace_ops_list_func for dynamic ops. This function disables preemption 
> > (because of traversing rcu protected list), so it should be safe to use 
> > dynamic trampolines even for dynamic fops in preemptive kernels. Is this 
> > correct? 
> 
> No, the dynamic trampoline itself is not safe. I explained this at the
> LinuxConEU/Plumbers conference, although the slides don't explain it
> well :-/, otherwise I would have just pointed you to them.
> 
> Basically what the issue is, if a task jumps to the dynamic trampoline
> and gets preempted, how would you ever free that trampoline again?

Thanks for the reply. I know there is the problem with preemption and 
trampolines. In such case schedule_on_each_cpu in ftrace_shutdown is not 
enough. But it seems I didn't describe the current problem clearly. So 
for x86...

The function ftrace_ops_get_func returns the function which the trampoline 
should call. For FTRACE_OPS_FL_DYNAMIC it always returns 
ftrace_ops_list_func

ftrace_func_t ftrace_ops_get_func(struct ftrace_ops *ops)                                                              
{
        /*
         * If this is a dynamic ops or we force list func,
         * then it needs to call the list anyway.                                                                      
         */                                                                                                            
        if (ops->flags & FTRACE_OPS_FL_DYNAMIC || FTRACE_FORCE_LIST_FUNC)                                              
                return ftrace_ops_list_func;
[...]

This function is called in arch_ftrace_update_trampoline after the 
trampoline is created (it is also called in update_ftrace_function, but 
that is not important for dynamic trampolines)...

void arch_ftrace_update_trampoline(struct ftrace_ops *ops)                                                             
{       
[...]
        if (ops->trampoline) {
                /*
                 * The ftrace_ops caller may set up its own trampoline.
                 * In such a case, this code must not modify it.
                 */                                                                                                    
                if (!(ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP))
                        return;
        } else {
                ops->trampoline = create_trampoline(ops, &size);                                                       
                if (!ops->trampoline)
                        return; 
                ops->trampoline_size = size;
        }

        offset = calc_trampoline_call_offset(ops->flags & 
FTRACE_OPS_FL_SAVE_REGS);
        ip = ops->trampoline + offset;

        func = ftrace_ops_get_func(ops);

        /* Do a safe modify in case the trampoline is executing */
        new = ftrace_call_replace(ip, (unsigned long)func);
        ret = update_ftrace_func(ip, new);
[...]
}

Thus the dynamic trampoline for FTRACE_OPS_FL_DYNAMIC ops always calls 
ftrace_ops_list_func.

Now, arch_ftrace_update_trampoline is called by 
ftrace_update_trampoline...

static void ftrace_update_trampoline(struct ftrace_ops *ops)
{       
/*              
 * Currently there's no safe way to free a trampoline when the kernel
 * is configured with PREEMPT. That is because a task could be preempted
 * when it jumped to the trampoline, it may be preempted for a long time
 * depending on the system load, and currently there's no way to know
 * when it will be off the trampoline. If the trampoline is freed
 * too early, when the task runs again, it will be executing on freed
 * memory and crash.
 */
#ifdef CONFIG_PREEMPT
        /* Currently, only non dynamic ops can have a trampoline */
        if (ops->flags & FTRACE_OPS_FL_DYNAMIC)
                return;
#endif          
                                             
        arch_ftrace_update_trampoline(ops);
}

The comment is theoretically correct. But since for FTRACE_OPS_FL_DYNAMIC 
ftrace_ops_list_func is always called it is also not valid. 
ftrace_ops_list_func disables preemption so schedule_on_each_cpu in 
ftrace_shutdown does the trick.

Anyway I think the problem is the opposite. There should be no 
ftrace_ops_list_func for dynamic trampoline. See below, please...

> Now for live kernel patching, we could add another flag that says
> permanent, that means the trampoline and the ops will never be free or
> change, and then it would be safe to use dynamic trampolines.

Yes, that would be the solution. Or could we use call_rcu_tasks? If I 
remember correctly you wrote somewhere that you had such patch already 
prepared but wanted to test it more. Is that correct?

> > 
> > Or maybe the problem is the opposite. Why does the ftrace use 
> > ftrace_ops_list_func in such situation? Even for nonpreemptive kernel and 
> > dynamic fops ftrace_ops_list_func has unnecessary overhead.
> 
> It points the dynamic ops (in non-preempt) to ftrace_ops_list_func? Are
> you sure about that. One way to verify is to
> cat /sys/kernel/debug/tracing/enabled_functions, which should show you
> want the dynamic ops points to.

I think so. One thing is the code above. And verification in 
next-20150130 in qemu, CONFIG_PREEMPT=n,...

I livepatched /proc/cmdline and there is no other tracing involved

dsc1:~ # cat /sys/kernel/debug/tracing/enabled_functions
cmdline_proc_show (1) R I	tramp: 0xffffffffa0004000 ->ftrace_ops_list_func+0x0/0x1a0
dsc1:~

I think that there should be direct call to the callback function (new 
cmdline_proc_show). 

Miroslav

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Question about ftrace, dynamically allocated trampolines and dynamic fops
  2015-01-30 10:08   ` Miroslav Benes
@ 2015-01-30 14:12     ` Steven Rostedt
  0 siblings, 0 replies; 4+ messages in thread
From: Steven Rostedt @ 2015-01-30 14:12 UTC (permalink / raw)
  To: Miroslav Benes; +Cc: Linux Kernel Mailing List, jkosina

On Fri, 30 Jan 2015 11:08:19 +0100 (CET)
Miroslav Benes <mbenes@suse.cz> wrote:

> Anyway I think the problem is the opposite. There should be no 
> ftrace_ops_list_func for dynamic trampoline. See below, please...
> 


> 
> I think so. One thing is the code above. And verification in 
> next-20150130 in qemu, CONFIG_PREEMPT=n,...
> 
> I livepatched /proc/cmdline and there is no other tracing involved
> 
> dsc1:~ # cat /sys/kernel/debug/tracing/enabled_functions
> cmdline_proc_show (1) R I	tramp: 0xffffffffa0004000 ->ftrace_ops_list_func+0x0/0x1a0
> dsc1:~
> 
> I think that there should be direct call to the callback function (new 
> cmdline_proc_show). 
> 

Ah, OK I see the issue. Yeah, it shouldn't be that way. I'll have to
fix that.

Thanks,

-- Steve

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-01-30 14:12 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-29  9:40 Question about ftrace, dynamically allocated trampolines and dynamic fops Miroslav Benes
2015-01-29 14:41 ` Steven Rostedt
2015-01-30 10:08   ` Miroslav Benes
2015-01-30 14:12     ` Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).