From: Peter Zijlstra <peterz@infradead.org>
To: Song Liu <songliubraving@fb.com>
Cc: "open list:BPF (Safe dynamic programs and tools)" 
	<bpf@vger.kernel.org>, LKML <linux-kernel@vger.kernel.org>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Ingo Molnar <mingo@redhat.com>, kajoljain <kjain@linux.ibm.com>,
	Kernel Team <Kernel-team@fb.com>
Subject: Re: [PATCH v2 bpf-next 1/3] perf: enable branch record for software events
Date: Mon, 30 Aug 2021 18:06:30 +0200
Message-ID: <YS0CBphTuIdTWEXF@hirez.programming.kicks-ass.net>
In-Reply-To: <F70BD5BE-C698-4C53-9ECD-A4805CB2D659@fb.com>

On Mon, Aug 30, 2021 at 03:25:44PM +0000, Song Liu wrote:
> Thanks for this information! I did get confused by these macros for quite a
> while. Let me try with the _RET0 version.

Does your kernel have:

  9ae6ab27f44e ("static_call: Update API documentation")

?

With that included, the comment at the top of static_call.h reads like
the below. Please let me know where you think this can be improved.


/*
 * Static call support
 *
 * Static calls use code patching to hard-code function pointers into direct
 * branch instructions. They give the flexibility of function pointers, but
 * with improved performance. This is especially important for cases where
 * retpolines would otherwise be used, as retpolines can significantly impact
 * performance.
 *
 *
 * API overview:
 *
 *   DECLARE_STATIC_CALL(name, func);
 *   DEFINE_STATIC_CALL(name, func);
 *   DEFINE_STATIC_CALL_NULL(name, typename);
 *   DEFINE_STATIC_CALL_RET0(name, typename);
 *
 *   __static_call_return0;
 *
 *   static_call(name)(args...);
 *   static_call_cond(name)(args...);
 *   static_call_update(name, func);
 *   static_call_query(name);
 *
 *   EXPORT_STATIC_CALL{,_TRAMP}{,_GPL}()
 *
 * Usage example:
 *
 *   # Start with the following functions (with identical prototypes):
 *   int func_a(int arg1, int arg2);
 *   int func_b(int arg1, int arg2);
 *
 *   # Define a 'my_name' reference, associated with func_a() by default
 *   DEFINE_STATIC_CALL(my_name, func_a);
 *
 *   # Call func_a()
 *   static_call(my_name)(arg1, arg2);
 *
 *   # Update 'my_name' to point to func_b()
 *   static_call_update(my_name, &func_b);
 *
 *   # Call func_b()
 *   static_call(my_name)(arg1, arg2);
 *
 *
 * Implementation details:
 *
 *   This requires some arch-specific code (CONFIG_HAVE_STATIC_CALL).
 *   Otherwise basic indirect calls are used (with function pointers).
 *
 *   Each static_call() site calls into a trampoline associated with the name.
 *   The trampoline has a direct branch to the default function.  Updates to a
 *   name will modify the trampoline's branch destination.
 *
 *   If the arch has CONFIG_HAVE_STATIC_CALL_INLINE, then the call sites
 *   themselves will be patched at runtime to call the functions directly,
 *   rather than calling through the trampoline.  This requires objtool or a
 *   compiler plugin to detect all the static_call() sites and annotate them
 *   in the .static_call_sites section.
 *
 *
 * Notes on NULL function pointers:
 *
 *   Static_call()s support NULL functions, with many of the caveats that
 *   regular function pointers have.
 *
 *   Clearly calling a NULL function pointer is 'BAD', so too for
 *   static_call()s (although when HAVE_STATIC_CALL it might not be immediately
 *   fatal). A NULL static_call can be the result of:
 *
 *     DEFINE_STATIC_CALL_NULL(my_static_call, void (*)(int));
 *
 *   which is equivalent to declaring a NULL function pointer with just a
 *   typename:
 *
 *     void (*my_func_ptr)(int arg1) = NULL;
 *
 *   or using static_call_update() with a NULL function. In both cases the
 *   HAVE_STATIC_CALL implementation will patch the trampoline with a RET
 *   instruction, instead of an immediate tail-call JMP. HAVE_STATIC_CALL_INLINE
 *   architectures can patch the trampoline call to a NOP.
 *
 *   In all cases, any argument evaluation is unconditional. Unlike a regular
 *   conditional function pointer call:
 *
 *     if (my_func_ptr)
 *         my_func_ptr(arg1)
 *
 *   where the argument evaluation also depends on the pointer value.
 *
 *   When calling a static_call that can be NULL, use:
 *
 *     static_call_cond(name)(arg1);
 *
 *   which will include the required value tests to avoid NULL-pointer
 *   dereferences.
 *
 *   To query which function is currently set to be called, use:
 *
 *     func = static_call_query(name);
 *
 *
 * DEFINE_STATIC_CALL_RET0 / __static_call_return0:
 *
 *   Just like how DEFINE_STATIC_CALL_NULL() / static_call_cond() optimize the
 *   conditional void function call, DEFINE_STATIC_CALL_RET0 /
 *   __static_call_return0 optimize the do-nothing 'return 0' function.
 *
 *   This feature is strictly UB per the C standard (since it casts a function
 *   pointer to a different signature) and relies on the architecture ABI to
 *   make things work. In particular it relies on Caller Stack-cleanup and the
 *   whole return register being clobbered for short return values. All normal
 *   CDECL style ABIs conform.
 *
 *   In particular the x86_64 implementation replaces the 5 byte CALL
 *   instruction at the callsite with a 5 byte clear of the RAX register,
 *   completely eliding any function call overhead.
 *
 *   Notably argument setup is unconditional.
 *
 *
 * EXPORT_STATIC_CALL() vs EXPORT_STATIC_CALL_TRAMP():
 *
 *   The difference is that the _TRAMP variant tries to only export the
 *   trampoline with the result that a module can use static_call{,_cond}() but
 *   not static_call_update().
 *
 */
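
To make the usage example concrete, here is a minimal userspace sketch of the
fallback case described above, where plain function pointers are used instead
of code patching. The model_* macros and the two helper functions are
hypothetical stand-ins invented for this illustration; they are not the kernel
macros, and the func_a()/func_b() bodies are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-ins, NOT the kernel macros: a "static call"
 * is modeled as a plain function pointer, as in the non-patching fallback.
 * __typeof__ is a GNU extension (gcc/clang).
 */
#define MODEL_DEFINE_STATIC_CALL(name, func) \
	static __typeof__(&func) model_key_##name = &func
#define model_static_call(name)            (*model_key_##name)
#define model_static_call_update(name, fn) (model_key_##name = (fn))
#define model_static_call_query(name)      (model_key_##name)

/* Two functions with identical prototypes (bodies are made up). */
static int func_a(int arg1, int arg2) { return arg1 + arg2; }
static int func_b(int arg1, int arg2) { return arg1 * arg2; }

/* Define a 'my_name' reference, associated with func_a() by default. */
MODEL_DEFINE_STATIC_CALL(my_name, func_a);

/* Call through 'my_name', whichever target is currently installed. */
static int call_my_name(int arg1, int arg2)
{
	return model_static_call(my_name)(arg1, arg2);
}

/* Retarget 'my_name' to func_b() and check what the query reports. */
static void switch_to_func_b(void)
{
	model_static_call_update(my_name, &func_b);
	assert(model_static_call_query(my_name) == &func_b);
}
```

Calling call_my_name() before switch_to_func_b() reaches func_a(), and
afterwards the same call site reaches func_b(). The real implementation gets
the same effect by rewriting the trampoline or the call site rather than
loading a pointer.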

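The point about unconditional argument evaluation can also be sketched in
userspace. This is only a model under stated assumptions: the stubs below are
hypothetical and correctly typed, so the sketch sidesteps the signature cast
(and the UB) that the real _RET0 support relies on. All names except
my_func_ptr are invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>

static int evaluations;	/* how many times the argument expression ran */
static int last_seen;	/* last value passed to the real handler */

static int next_arg(void) { return ++evaluations; }
static void handler(int arg1) { last_seen = arg1; }

/* Hypothetical do-nothing stubs, typed to match their callers. */
static void nop_stub(int arg1) { (void)arg1; }
static int return0_stub(int arg1) { (void)arg1; return 0; }

static void (*my_func_ptr)(int arg1) = NULL;
static int (*my_ret_ptr)(int arg1) = return0_stub;

/* Regular conditional call: next_arg() only runs if the pointer is set. */
static void call_guarded(void)
{
	if (my_func_ptr)
		my_func_ptr(next_arg());
}

/*
 * Model of a conditional static call: a NULL target is replaced by a
 * do-nothing stub, so next_arg() runs even though nothing useful is called.
 */
static void call_cond_model(void)
{
	void (*func)(int) = my_func_ptr ? my_func_ptr : nop_stub;

	assert(func != NULL);
	func(next_arg());
}

/* Model of the return-0 default: the argument is still set up. */
static int call_ret0_model(void)
{
	return my_ret_ptr(next_arg());
}
```

With my_func_ptr left NULL, call_guarded() never evaluates next_arg(), while
call_cond_model() and call_ret0_model() each evaluate it once. Arguments with
side effects or non-trivial cost therefore behave differently between the
open-coded NULL test and the static call variants.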