LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Andy Lutomirski <luto@amacapital.net>
To: Ingo Molnar <mingo@kernel.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Borislav Petkov <bp@alien8.de>, "H. Peter Anvin" <hpa@zytor.com>,
	Oleg Nesterov <oleg@redhat.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Alexei Starovoitov <ast@plumgrid.com>,
	Will Drewry <wad@chromium.org>, Kees Cook <keescook@chromium.org>,
	X86 ML <x86@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath
Date: Tue, 10 Mar 2015 06:26:47 -0700	[thread overview]
Message-ID: <CALCETrW6-ToMwhfSzA79K2zz6iPNoz0EjPa_wHK7jiJLip8zag@mail.gmail.com> (raw)
In-Reply-To: <20150310132147.GB26185@gmail.com>

On Tue, Mar 10, 2015 at 6:21 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Denys Vlasenko <dvlasenk@redhat.com> wrote:
>
>> > So there are now +2 instructions (5 instead of 3) in the
>> > system_call path, but there are -2 instructions in the SYSRETQ
>> > path,
>>
>> Unfortunately, no. [...]
>
> So I assumed that it was an equivalent transformation, given that none
> of the changelogs spelled out the increase in overhead ...
>
>> [...] There is only this change in SYSRETQ path, which simply
>> changes where we get RSP from:
>>
>> @@ -293,7 +289,7 @@ ret_from_sys_call:
>>       CFI_REGISTER    rip,rcx
>>       movq    EFLAGS(%rsp),%r11
>>       /*CFI_REGISTER  rflags,r11*/
>> -     movq    PER_CPU_VAR(old_rsp), %rsp
>> +     movq    RSP(%rsp),%rsp
>>       /*
>>        * 64bit SYSRET restores rip from rcx,
>>        * rflags from r11 (but RF and VM bits are forced to 0),
>>
>> Most likely, no change in execution speed here.
>> At best, it is one cycle faster somewhere in address generation unit
>> because for PER_CPU_VAR() address evaluation, GS base is nonzero.
>>
>> Since this patch does add two extra MOVs,
>> I did benchmark these patches. They add exactly one cycle
>> to system call code path on my Sandy Bridge CPU.
>
> Hm, but that's the wrong direction, we should try to make it faster,
> and to clean it up - but making it slower without really good reasons
> isn't good.
>
> Is 'usersp' really that much of a complication?

usersp is IMO tolerable.  The nasty thing is the FIXUP_TOP_OF_STACK /
RESTORE_TOP_OF_STACK garbage, and this patch is the main step toward
killing that off completely.  I've still never convinced myself that
there aren't ptrace-related info leaks in there.

Denys, did you ever benchmark what happens if we use push instead of
mov?  I bet that we get that cycle back and more, not to mention much
less icache usage.

The reason I think that is that this patch changes us from writing
nothing to the RIP slot to writing something there.  If we push our
stack frame instead of moving it into place, we must write to every
slot, and writing the right value is probably no more expensive than
writing, say, zero (the load / forwarding units are unlikely to be
very busy at that point).  So this change could actually be a step
toward getting faster.

--Andy

>
> Thanks,
>
>         Ingo



-- 
Andy Lutomirski
AMA Capital Management, LLC

  reply	other threads:[~2015-03-10 13:27 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-09 18:39 [PATCH 0/4 v2] x86: entry_64.S: steps towards simpler iret frame handling Denys Vlasenko
2015-03-09 18:39 ` [PATCH 1/4] x86: save r11 into pt_regs->flags on SYSCALL64 fastpath Denys Vlasenko
2015-03-09 20:02   ` Andy Lutomirski
2015-03-16 12:04   ` [tip:x86/asm] x86/asm/entry/64: Save R11 into pt_regs-> flags " tip-bot for Denys Vlasenko
2015-03-09 18:39 ` [PATCH 3/4] x86: save user rsp in pt_regs->sp " Denys Vlasenko
2015-03-09 20:11   ` Andy Lutomirski
2015-03-09 20:32     ` Denys Vlasenko
2015-03-09 20:43       ` Andy Lutomirski
2015-03-10 12:51   ` Ingo Molnar
2015-03-10 13:10     ` Andy Lutomirski
2015-03-10 13:18     ` Denys Vlasenko
2015-03-10 13:20       ` Andy Lutomirski
2015-03-10 13:26         ` Ingo Molnar
2015-03-10 13:21       ` Ingo Molnar
2015-03-10 13:26         ` Andy Lutomirski [this message]
2015-03-10 14:00           ` Denys Vlasenko
2015-03-10 14:02             ` Andy Lutomirski
2015-03-10 14:09               ` Denys Vlasenko
2015-03-10 13:28         ` Ingo Molnar
2015-03-10 13:50         ` Denys Vlasenko
2015-03-16  9:44           ` Ingo Molnar
2015-03-16 12:05   ` [tip:x86/asm] x86/asm/entry/64: Save user RSP in pt_regs-> sp " tip-bot for Denys Vlasenko
2015-03-10  6:00 ` [PATCH 0/4 v2] x86: entry_64.S: steps towards simpler iret frame handling Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CALCETrW6-ToMwhfSzA79K2zz6iPNoz0EjPa_wHK7jiJLip8zag@mail.gmail.com \
    --to=luto@amacapital.net \
    --cc=ast@plumgrid.com \
    --cc=bp@alien8.de \
    --cc=dvlasenk@redhat.com \
    --cc=fweisbec@gmail.com \
    --cc=hpa@zytor.com \
    --cc=keescook@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=torvalds@linux-foundation.org \
    --cc=wad@chromium.org \
    --cc=x86@kernel.org \
    --subject='Re: [PATCH 3/4] x86: save user rsp in pt_regs->sp on SYSCALL64 fastpath' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).