From: Nadav Amit <namit@vmware.com>
To: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	the arch/x86 maintainers <x86@kernel.org>,
	Alok Kataria <akataria@vmware.com>,
	Christopher Li <sparse@chrisli.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
	Jan Beulich <JBeulich@suse.com>, Juergen Gross <jgross@suse.com>,
	Kate Stewart <kstewart@linuxfoundation.org>,
	Kees Cook <keescook@chromium.org>,
	"linux-sparse@vger.kernel.org" <linux-sparse@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Philippe Ombredanne <pombredanne@nexb.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"virtualization@lists.linux-foundation.org"
	<virtualization@lists.linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH v2 0/9] x86: macrofying inline asm for better compilation
Date: Mon, 4 Jun 2018 19:56:04 +0000
Message-ID: <AC3A577C-3DB6-4F12-8392-A67F0461CBCB@vmware.com>
In-Reply-To: <20180604190552.hm5e6zcabeyxt26u@treble>

Josh Poimboeuf <jpoimboe@redhat.com> wrote:

> On Mon, Jun 04, 2018 at 04:21:22AM -0700, Nadav Amit wrote:
>> This patch-set deals with an interesting yet stupid problem: kernel code
>> that does not get inlined despite its simplicity. There are several
>> causes for this behavior: "cold" attribute on __init, different function
>> optimization levels; conditional constant computations based on
>> __builtin_constant_p(); and finally large inline assembly blocks.
>> 
>> This patch-set deals with the inline assembly problem. I separated these
>> patches from the others (that were sent in the RFC) for easier
>> inclusion. I also separated the removal of unnecessary new-lines which
>> would be sent separately.
>> 
>> The problem with inline assembly is that inline assembly is often used
>> by the kernel for things that are other than code - for example,
>> assembly directives and data. GCC however is oblivious to the content of
>> the blocks and assumes their cost in space and time is proportional to
>> the number of the perceived assembly "instruction", according to the
>> number of newlines and semicolons. Alternatives, paravirt and other
>> mechanisms are affected, causing code not to be inlined, and degrading
>> compilation quality in general.
>> 
>> The solution that this patch-set carries for this problem is to create
>> an assembly macro, and then call it from the inline assembly block.  As
>> a result, the compiler sees a single "instruction" and assigns the more
>> appropriate cost to the code.
>> 
>> To avoid uglification of the code, as many noted, the macros are first
>> precompiled into an assembly file, which is later assembled together
>> with the C files. This also avoids the duplicate implementations that
>> previously existed for the asm and C code, as can be seen in the
>> exception table changes.
>> 
>> Overall this patch-set slightly increases the kernel size (my build was
>> done using my Ubuntu 18.04 config + localyesconfig for the record):
>> 
>>   text	   data	    bss	    dec	    hex	filename
>> 18140829 10224724 2957312 31322865 1ddf2f1 ./vmlinux before
>> 18163608 10227348 2957312 31348268 1de562c ./vmlinux after (+0.1%)
>> 
>> The number of static functions in the image is reduced by 379, but
>> the actual inlining is even better, which does not always show in these
>> numbers: a function may be inlined, causing the calling function not to
>> be inlined.
>> 
>> The Makefile stuff may not be too clean. Ideas for improvements are
>> welcome.
>> 
>> v1->v2:	* Compiling the macros into a separate .s file, improving
>> 	  readability (Linus)
>> 	* Improving assembly formatting, applying most of the comments
>> 	  according to my judgment (Jan)
>> 	* Adding exception-table, cpufeature and jump-labels
>> 	* Removing new-line cleanup; to be submitted separately
> 
> How did you find these issues?  Is there some way to find them
> automatically in the future?  Perhaps with a GCC plugin?

Initially I found it while developing something unrelated and seeing the
disassembly going crazy for no good reason.

One way to spot problematic functions is to look for duplicate static
functions, which mostly appear when an inline function in a header is not
inlined:

	nm ./vmlinux | grep ' t ' | cut -d' ' -f3 | uniq -c | sort | \
	grep -v '      1'

But for all kinds of reasons (duplicate function names, inline functions
that are assigned to function pointers), it still requires manual work to
filter out the false positives.
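
As a rough sketch of the same idea, the filter can be wrapped in a small
shell function that reads nm output on stdin and prints only the names
that occur more than once. The function name dup_static_funcs is
hypothetical and not part of the patch-set, and counting in awk instead
of with uniq is only a convenience:

```shell
# Hypothetical helper (not part of the patch-set): read `nm` output on
# stdin and print "<count> <name>" for every local text symbol (type 't')
# whose name appears more than once, most frequent first.
dup_static_funcs() {
    awk '$2 == "t" { count[$3]++ }
         END {
             for (name in count)
                 if (count[name] > 1)
                     print count[name], name
         }' |
    sort -k1,1nr -k2,2
}

# Usage, assuming a vmlinux in the current directory:
#   nm ./vmlinux | dup_static_funcs
```

The output still contains the false positives mentioned above, so it is
only a starting point for manual inspection.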

Another way is to look at small functions, doing something like:
	nm --print-size ./vmlinux | grep ' t ' | cut -d' ' -f2- | sort | \
	head -n 10000

But again, there are many false positives, so I only looked at functions
that I know, or only considered those that are marked as “inline”.
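
A variation on the command above, again only as a sketch: instead of
taking the first 10000 lines, filter on an explicit size threshold. The
function name small_static_funcs and the default of 32 bytes are
arbitrary assumptions chosen for illustration:

```shell
# Hypothetical helper (not part of the patch-set): read `nm --print-size`
# output on stdin and print "<size-in-bytes> <name>" for local text
# symbols (type 't') that are no larger than the given threshold.
# The default threshold of 32 bytes is an arbitrary assumption.
small_static_funcs() {
    limit="${1:-32}"
    awk -v limit="$limit" '
        # Convert a hexadecimal string to a number using portable awk only.
        function hex2dec(h,    i, n) {
            n = 0
            h = tolower(h)
            for (i = 1; i <= length(h); i++)
                n = n * 16 + index("0123456789abcdef", substr(h, i, 1)) - 1
            return n
        }
        # Lines with a size have four fields: address, size, type, name.
        NF == 4 && $3 == "t" {
            size = hex2dec($2)
            if (size <= limit)
                print size, $4
        }' |
    sort -k1,1n -k2,2
}

# Usage, assuming a vmlinux in the current directory:
#   nm --print-size ./vmlinux | small_static_funcs 16
```

This does not tell whether a function was meant to be inline, so the
results need the same manual filtering.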

I don’t know how this process can be fully automated.

Regards,
Nadav
