LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Borislav Petkov <bp@alien8.de>
To: "H. Peter Anvin (Intel)" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Andy Lutomirski <luto@kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] x86/asm: pessimize the pre-initialization case in static_cpu_has()
Date: Thu, 9 Sep 2021 19:01:15 +0200	[thread overview]
Message-ID: <YTo92+0ruBlkcaDf@zn.tnic> (raw)
In-Reply-To: <20210908171716.3340120-1-hpa@zytor.com>

On Wed, Sep 08, 2021 at 10:17:16AM -0700, H. Peter Anvin (Intel) wrote:

> Subject: Re: [PATCH] x86/asm: pessimize the pre-initialization case in static_cpu_has()

"pessimize" huh? :)

Why not simply

"Do not waste registers in the pre-initialization case... "

?

> gcc will sometimes manifest the address of boot_cpu_data in a register
> as part of constant propagation. When multiple static_cpu_has() are
> used this may foul the mainline code with a register load which will
> only be used on the fallback path, which is unused after
> initialization.

So a before-after thing looks like this here:

before:

ffffffff89696517 <.altinstr_aux>:
ffffffff89696517:       f6 05 cb 09 cb ff 80    testb  $0x80,-0x34f635(%rip)        # ffffffff89346ee9 <boot_cpu_data+0x69>
ffffffff8969651e:       0f 85 fc 3e fb ff       jne    ffffffff8964a420 <intel_pmu_init+0x14e7>
ffffffff89696524:       e9 ee 3e fb ff          jmp    ffffffff8964a417 <intel_pmu_init+0x14de>
ffffffff89696529:       f6 45 6a 08             testb  $0x8,0x6a(%rbp)
ffffffff8969652d:       0f 85 45 b9 97 f7       jne    ffffffff81011e78 <intel_pmu_lbr_filter+0x68>
ffffffff89696533:       e9 95 b9 97 f7          jmp    ffffffff81011ecd <intel_pmu_lbr_filter+0xbd>
ffffffff89696538:       41 f6 44 24 6a 08       testb  $0x8,0x6a(%r12)
ffffffff8969653e:       0f 85 d3 bc 97 f7       jne    ffffffff81012217 <intel_pmu_store_lbr+0x77>
ffffffff89696544:       e9 d9 bc 97 f7          jmp    ffffffff81012222 <intel_pmu_store_lbr+0x82>
ffffffff89696549:       41 f6 44 24 6a 08       testb  $0x8,0x6a(%r12)

after:

ffffffff89696517 <.altinstr_aux>:
ffffffff89696517:       f6 04 25 e9 6e 34 89    testb  $0x80,0xffffffff89346ee9
ffffffff8969651e:       80 
ffffffff8969651f:       0f 85 fb 3e fb ff       jne    ffffffff8964a420 <intel_pmu_init+0x14e7>
ffffffff89696525:       e9 ed 3e fb ff          jmp    ffffffff8964a417 <intel_pmu_init+0x14de>
ffffffff8969652a:       f6 04 25 ea 6e 34 89    testb  $0x8,0xffffffff89346eea
ffffffff89696531:       08 
ffffffff89696532:       0f 85 37 b9 97 f7       jne    ffffffff81011e6f <intel_pmu_lbr_filter+0x5f>
ffffffff89696538:       e9 89 b9 97 f7          jmp    ffffffff81011ec6 <intel_pmu_lbr_filter+0xb6>
ffffffff8969653d:       f6 04 25 ea 6e 34 89    testb  $0x8,0xffffffff89346eea
ffffffff89696544:       08 
ffffffff89696545:       0f 85 b5 bc 97 f7       jne    ffffffff81012200 <intel_pmu_store_lbr+0x70>
ffffffff8969654b:       e9 bb bc 97 f7          jmp    ffffffff8101220b <intel_pmu_store_lbr+0x7b>
ffffffff89696550:       f6 04 25 ea 6e 34 89    testb  $0x8,0xffffffff89346eea

so you're basically forcing an immediate thing.

And you wanna get rid of the (%<reg>) relative addressing and force it
to be rip-relative.

> Explicitly force gcc to use immediate (rip-relative) addressing for

Right, the rip-relative addressing doesn't happen here:

--- /tmp/before	2021-09-09 18:18:28.693009433 +0200
+++ /tmp/after	2021-09-09 18:19:06.285009113 +0200
@@ -1,5 +1,5 @@
-# ./arch/x86/include/asm/cpufeature.h:179: 	asm_volatile_goto(
-# 179 "./arch/x86/include/asm/cpufeature.h" 1
+# ./arch/x86/include/asm/cpufeature.h:184: 	asm_volatile_goto(
+# 184 "./arch/x86/include/asm/cpufeature.h" 1
 	# ALT: oldinstr2
 661:
 	jmp 6f
@@ -29,12 +29,12 @@
 	
 6652:
 .popsection
-.section .altinstr_aux,"ax"
+.pushsection .altinstr_aux,"ax"
 6:
- testb $1,boot_cpu_data+62(%rip)	#, MEM[(const char *)&boot_cpu_data + 62B]
+ testb $1,boot_cpu_data+62	#,
  jnz .L99	#
  jmp .L100	#
-.previous
+.popsection
 
 # 0 "" 2


.vminstr_aux even on an allyesconfig build is solely immediate
addressing in the TEST insn.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

  reply	other threads:[~2021-09-09 17:01 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-08 17:17 H. Peter Anvin (Intel)
2021-09-09 17:01 ` Borislav Petkov [this message]
2021-09-09 21:28   ` H. Peter Anvin
2021-09-09 21:53     ` Borislav Petkov
2021-09-09 22:17     ` H. Peter Anvin
2021-09-10  9:14       ` Borislav Petkov
2021-09-10 19:25         ` H. Peter Anvin
2021-09-09 22:08 ` [PATCH v2 0/2] x86/asm: avoid register pressure from static_cpu_has() H. Peter Anvin (Intel)
2021-09-09 22:08   ` [PATCH v2 1/2] x86/asm: add _ASM_RIP() macro for x86-64 (%rip) suffix H. Peter Anvin (Intel)
2021-09-09 22:08   ` [PATCH v2 2/2] x86/asm: pessimize the pre-initialization case in static_cpu_has() H. Peter Anvin (Intel)
2021-09-10  9:16   ` [PATCH v2 0/2] x86/asm: avoid register pressure from static_cpu_has() Borislav Petkov
2021-09-10 13:24     ` Borislav Petkov
2021-09-10 19:59 ` [PATCH v3 0/2] x86/asm: avoid register pressure from the init case in static_cpu_has() H. Peter Anvin (Intel)
2021-09-10 19:59   ` [PATCH] drm/bochs: add Bochs PCI ID for Simics model H. Peter Anvin (Intel)
2021-09-10 19:59   ` [PATCH v3 1/2] x86/asm: add _ASM_RIP() macro for x86-64 (%rip) suffix H. Peter Anvin (Intel)
2021-09-13 19:39     ` [tip: x86/cpu] x86/asm: Add " tip-bot2 for H. Peter Anvin (Intel)
2021-09-10 19:59   ` [PATCH v3 2/2] x86/asm: avoid adding register pressure for the init case in static_cpu_has() H. Peter Anvin (Intel)
2021-09-13 19:39     ` [tip: x86/cpu] x86/asm: Avoid " tip-bot2 for H. Peter Anvin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YTo92+0ruBlkcaDf@zn.tnic \
    --to=bp@alien8.de \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=tglx@linutronix.de \
    --subject='Re: [PATCH] x86/asm: pessimize the pre-initialization case in static_cpu_has()' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).