LKML Archive on lore.kernel.org
* [GIT PULL] x86 FPU changes for 5.2
@ 2019-05-07 13:26 Borislav Petkov
2019-05-07 17:35 ` Linus Torvalds
2019-05-07 17:55 ` pr-tracker-bot
0 siblings, 2 replies; 6+ messages in thread
From: Borislav Petkov @ 2019-05-07 13:26 UTC
To: Linus Torvalds; +Cc: Rik van Riel, Sebastian Andrzej Siewior, x86-ml, lkml
Hi Linus,
please pull the latest x86-fpu-for-linus tree from:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86-fpu-for-linus
This branch contains work started by Rik van Riel and brought to
fruition by Sebastian Andrzej Siewior with the main goal to optimize
when to load FPU registers: only when returning to userspace and not on
every context switch (while the task remains in the kernel).
In addition, this optimization makes kernel_fpu_begin() cheaper by
requiring the registers to be saved only on the first invocation and
skipping the save on subsequent ones.
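As a rough illustration of the two points above, here is a toy userspace model, not kernel code. Only TIF_NEED_FPU_LOAD and kernel_fpu_begin() are names taken from this series; the toy_* types and functions are invented for the sketch, and the real save/restore logic is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these types or helpers exist in the kernel. */
struct toy_cpu {
    int regs;   /* stands in for the FPU register file */
    int loads;  /* memory -> register restores performed */
    int saves;  /* register -> memory saves performed */
};

struct toy_task {
    int saved_state;     /* in-memory copy of the task's FPU state */
    bool need_fpu_load;  /* models TIF_NEED_FPU_LOAD */
};

/* Save the live registers at most once; afterwards memory is authoritative. */
void toy_save_if_live(struct toy_cpu *cpu, struct toy_task *t)
{
    if (!t->need_fpu_load) {
        t->saved_state = cpu->regs;
        cpu->saves++;
        t->need_fpu_load = true;
    }
}

/* Context switch: save the outgoing state, but do NOT load the incoming one. */
void toy_switch(struct toy_cpu *cpu, struct toy_task *prev, struct toy_task *next)
{
    toy_save_if_live(cpu, prev);
    next->need_fpu_load = true;
}

/* In-kernel FPU use: only the first call has to save the user state. */
void toy_kernel_fpu_begin(struct toy_cpu *cpu, struct toy_task *cur)
{
    toy_save_if_live(cpu, cur);
    assert(cur->need_fpu_load);
    cpu->regs = 0;  /* the kernel is now free to clobber the registers */
}

/* Return to userspace: the only place where registers are reloaded. */
void toy_return_to_user(struct toy_cpu *cpu, struct toy_task *cur)
{
    if (cur->need_fpu_load) {
        cpu->regs = cur->saved_state;
        cpu->loads++;
        cur->need_fpu_load = false;
    }
}
```

In this model, any number of back-to-back toy_switch() calls between tasks that stay in the kernel costs no loads at all; the restore happens only in toy_return_to_user(), and a second toy_kernel_fpu_begin() without an intervening return to userspace does not save again.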
What is more, this series cleans up and streamlines many aspects of the
already complex FPU code, hopefully making it more palatable for future
improvements and simplifications.
Finally, there's a __user annotations fix from Jann Horn.
Thx.
---
The following changes since commit 79a3aaa7b82e3106be97842dedfd8429248896e6:
Linux 5.1-rc3 (2019-03-31 14:39:29 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86-fpu-for-linus
for you to fetch changes up to d9c9ce34ed5c892323cbf5b4f9a4c498e036316a:
x86/fpu: Fault-in user stack if copy_fpstate_to_sigframe() fails (2019-05-06 09:49:40 +0200)
----------------------------------------------------------------
Jann Horn (1):
x86/fpu: Fix __user annotations
Rik van Riel (5):
x86/fpu: Add an __fpregs_load_activate() internal helper
x86/fpu: Eager switch PKRU state
x86/fpu: Always store the registers in copy_fpstate_to_sigframe()
x86/fpu: Prepare copy_fpstate_to_sigframe() for TIF_NEED_FPU_LOAD
x86/fpu: Defer FPU state load until return to userspace
Sebastian Andrzej Siewior (23):
x86/fpu: Remove fpu->initialized usage in __fpu__restore_sig()
x86/fpu: Remove fpu__restore()
x86/fpu: Remove preempt_disable() in fpu__clear()
x86/fpu: Always init the state in fpu__clear()
x86/fpu: Remove fpu->initialized usage in copy_fpstate_to_sigframe()
x86/fpu: Don't save fxregs for ia32 frames in copy_fpstate_to_sigframe()
x86/fpu: Remove fpu->initialized
x86/fpu: Remove user_fpu_begin()
x86/fpu: Make __raw_xsave_addr() use a feature number instead of mask
x86/fpu: Use a feature number instead of mask in two more helpers
x86/pkeys: Provide *pkru() helpers
x86/fpu: Only write PKRU if it is different from current
x86/pkeys: Don't check if PKRU is zero before writing it
x86/entry: Add TIF_NEED_FPU_LOAD
x86/fpu: Update xstate's PKRU value on write_pkru()
x86/fpu: Inline copy_user_to_fpregs_zeroing()
x86/fpu: Restore from kernel memory on the 64-bit path too
x86/fpu: Merge the two code paths in __fpu__restore_sig()
x86/fpu: Add a fastpath to __fpu__restore_sig()
x86/fpu: Add a fastpath to copy_fpstate_to_sigframe()
x86/fpu: Restore regs in copy_fpstate_to_sigframe() in order to use the fastpath
x86/pkeys: Add PKRU value to init_fpstate
x86/fpu: Fault-in user stack if copy_fpstate_to_sigframe() fails
Documentation/preempt-locking.txt | 1 -
arch/x86/entry/common.c | 10 +-
arch/x86/ia32/ia32_signal.c | 17 ++-
arch/x86/include/asm/fpu/api.h | 31 ++++++
arch/x86/include/asm/fpu/internal.h | 133 +++++++++++++++++------
arch/x86/include/asm/fpu/signal.h | 2 +-
arch/x86/include/asm/fpu/types.h | 9 --
arch/x86/include/asm/fpu/xstate.h | 8 +-
arch/x86/include/asm/pgtable.h | 29 ++++-
arch/x86/include/asm/special_insns.h | 19 +++-
arch/x86/include/asm/thread_info.h | 2 +
arch/x86/include/asm/trace/fpu.h | 13 +--
arch/x86/kernel/cpu/common.c | 5 +
arch/x86/kernel/fpu/core.c | 195 ++++++++++++++++-----------------
arch/x86/kernel/fpu/init.c | 2 -
arch/x86/kernel/fpu/regset.c | 24 ++---
arch/x86/kernel/fpu/signal.c | 202 ++++++++++++++++++++++-------------
arch/x86/kernel/fpu/xstate.c | 42 ++++----
arch/x86/kernel/process.c | 2 +-
arch/x86/kernel/process_32.c | 11 +-
arch/x86/kernel/process_64.c | 11 +-
arch/x86/kernel/signal.c | 21 ++--
arch/x86/kernel/traps.c | 2 +-
arch/x86/kvm/vmx/vmx.c | 2 +-
arch/x86/kvm/x86.c | 48 +++++----
arch/x86/math-emu/fpu_entry.c | 3 -
arch/x86/mm/mpx.c | 6 +-
arch/x86/mm/pkeys.c | 21 ++--
28 files changed, 512 insertions(+), 359 deletions(-)
--
Regards/Gruss,
Boris.
SUSE Linux GmbH, GF: Felix Imendörffer, Mary Higgins, Sri Rasiah, HRB 21284 (AG Nürnberg)
* Re: [GIT PULL] x86 FPU changes for 5.2
2019-05-07 13:26 [GIT PULL] x86 FPU changes for 5.2 Borislav Petkov
@ 2019-05-07 17:35 ` Linus Torvalds
2019-05-07 17:59 ` Borislav Petkov
` (2 more replies)
2019-05-07 17:55 ` pr-tracker-bot
1 sibling, 3 replies; 6+ messages in thread
From: Linus Torvalds @ 2019-05-07 17:35 UTC
To: Borislav Petkov; +Cc: Rik van Riel, Sebastian Andrzej Siewior, x86-ml, lkml
On Tue, May 7, 2019 at 6:26 AM Borislav Petkov <bp@suse.de> wrote:
>
> This branch contains work started by Rik van Riel and brought to
> fruition by Sebastian Andrzej Siewior with the main goal to optimize
> when to load FPU registers: only when returning to userspace and not on
> every context switch (while the task remains in the kernel).
I love this and we should have done it long ago, but I also worry that
every time we've messed with the FP state, we've had interesting bugs.
Which is obviously why we didn't do this long ago.
Has this gone through lots of testing, particularly with things like
FP signal handling and old machines that don't necessarily have
anything but the most basic FP state (ie Pentium class etc)?
I've pulled it, but I'd still like to feel safer about it after-the-fact ;)
Linus
* Re: [GIT PULL] x86 FPU changes for 5.2
2019-05-07 13:26 [GIT PULL] x86 FPU changes for 5.2 Borislav Petkov
2019-05-07 17:35 ` Linus Torvalds
@ 2019-05-07 17:55 ` pr-tracker-bot
1 sibling, 0 replies; 6+ messages in thread
From: pr-tracker-bot @ 2019-05-07 17:55 UTC
To: Borislav Petkov
Cc: Linus Torvalds, Rik van Riel, Sebastian Andrzej Siewior, x86-ml, lkml
The pull request you sent on Tue, 7 May 2019 15:26:32 +0200:
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86-fpu-for-linus
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/8ff468c29e9a9c3afe9152c10c7b141343270bf3
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker
* Re: [GIT PULL] x86 FPU changes for 5.2
2019-05-07 17:35 ` Linus Torvalds
@ 2019-05-07 17:59 ` Borislav Petkov
2019-05-07 20:12 ` Ingo Molnar
2019-05-07 21:59 ` Sebastian Andrzej Siewior
2 siblings, 0 replies; 6+ messages in thread
From: Borislav Petkov @ 2019-05-07 17:59 UTC
To: Linus Torvalds; +Cc: Rik van Riel, Sebastian Andrzej Siewior, x86-ml, lkml
On Tue, May 07, 2019 at 10:35:25AM -0700, Linus Torvalds wrote:
> I love this and we should have done it long ago, but I also worry that
> every time we've messed with the FP state, we've had interesting bugs.
Tell me about it. Our FPU code is one helluva contraption.
> Which is obviously why we didn't do this long ago.
>
> Has this gone through lots of testing, particularly with things like
> FP signal handling and old machines that don't necessarily have
> anything but the most basic FP state (ie Pentium class etc)?
Right, so I ran it on the bunch of boxes I have here, the oldest is a K8
which has:
[ 0.000000] x86/fpu: x87 FPU will use FXSAVE
and it looked ok. Ingo ran it on his fleet too, AFAIR.
Lemme see if I can dig out something older at work.
> I've pulled it, but I'd still like to feel safer about it after-the-fact ;)
Yeah.
--
Regards/Gruss,
Boris.
SUSE Linux GmbH, GF: Felix Imendörffer, Mary Higgins, Sri Rasiah, HRB 21284 (AG Nürnberg)
* Re: [GIT PULL] x86 FPU changes for 5.2
2019-05-07 17:35 ` Linus Torvalds
2019-05-07 17:59 ` Borislav Petkov
@ 2019-05-07 20:12 ` Ingo Molnar
2019-05-07 21:59 ` Sebastian Andrzej Siewior
2 siblings, 0 replies; 6+ messages in thread
From: Ingo Molnar @ 2019-05-07 20:12 UTC
To: Linus Torvalds
Cc: Borislav Petkov, Rik van Riel, Sebastian Andrzej Siewior, x86-ml, lkml
* Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Tue, May 7, 2019 at 6:26 AM Borislav Petkov <bp@suse.de> wrote:
> >
> > This branch contains work started by Rik van Riel and brought to
> > fruition by Sebastian Andrzej Siewior with the main goal to optimize
> > when to load FPU registers: only when returning to userspace and not
> > on every context switch (while the task remains in the kernel).
>
> I love this and we should have done it long ago, but I also worry that
> every time we've messed with the FP state, we've had interesting bugs.
> Which is obviously why we didn't do this long ago.
>
> Has this gone through lots of testing, particularly with things like FP
> signal handling and old machines that don't necessarily have anything
> but the most basic FP state (ie Pentium class etc)?
>
> I've pulled it, but I'd still like to feel safer about it
> after-the-fact ;)
Most of the x86/fpu commits except the final one are several weeks old,
but I have to admit that our old-systems testing is hit and miss, and FPU
bugs do tend to have an additional 'bug report latency' multiplier of a
factor of 3 or so ...
I've been running all this (modulo the final commit) on my primary
desktop and other systems as well, FWIW.
Thanks,
Ingo
* Re: [GIT PULL] x86 FPU changes for 5.2
2019-05-07 17:35 ` Linus Torvalds
2019-05-07 17:59 ` Borislav Petkov
2019-05-07 20:12 ` Ingo Molnar
@ 2019-05-07 21:59 ` Sebastian Andrzej Siewior
2 siblings, 0 replies; 6+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-05-07 21:59 UTC
To: Linus Torvalds; +Cc: Borislav Petkov, Rik van Riel, x86-ml, lkml
On 2019-05-07 10:35:25 [-0700], Linus Torvalds wrote:
> Has this gone through lots of testing, particularly with things like
> FP signal handling and old machines that don't necessarily have
> anything but the most basic FP state (ie Pentium class etc)?
I tried it in qemu with incremental FPU capabilities so it went through
all the possible load/restore variants during signal handling. I had a
testcase which did SHA1 computation and received a SIGALRM every
second. This caught some bugs, most of them xsave/xsaves related,
whenever the SHA1 result did not match. I tried something similar on the
8087 FPU.
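A test of this kind could look roughly like the following userspace sketch. It is a stand-in under stated assumptions, not the actual testcase: a floating-point sum replaces the SHA1 computation, the timer interval is a parameter rather than a fixed second, and run_fp_signal_check(), fp_workload() and on_alarm() are hypothetical names.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>

static volatile sig_atomic_t alarms_seen;
/* Read through a volatile so the compiler cannot fold repeated runs. */
static volatile int workload_len = 200000;

/* The handler does FP math on purpose so that it dirties FPU registers. */
static void on_alarm(int sig)
{
    volatile double scratch = 1.0;

    (void)sig;
    for (int i = 1; i <= 64; i++)
        scratch = scratch * 1.0001 + (double)i;
    alarms_seen++;
}

/* Deterministic FP workload: identical runs must give identical bits. */
static double fp_workload(void)
{
    double acc = 0.0;
    int n = workload_len;

    for (int i = 1; i <= n; i++)
        acc += 1.0 / (double)i;
    return acc;
}

/*
 * Run the workload `rounds` times while SIGALRM fires every
 * `interval_usec` microseconds. Returns the number of rounds whose
 * result differed from the undisturbed reference, or -1 on setup failure.
 */
int run_fp_signal_check(long interval_usec, int rounds)
{
    struct sigaction sa;
    struct itimerval timer, off;
    int mismatches = 0;

    assert(interval_usec > 0 && rounds >= 0);

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    /* Reference value, computed before the timer is armed. */
    double reference = fp_workload();

    memset(&timer, 0, sizeof(timer));
    timer.it_interval.tv_sec = interval_usec / 1000000;
    timer.it_interval.tv_usec = interval_usec % 1000000;
    timer.it_value = timer.it_interval;
    if (setitimer(ITIMER_REAL, &timer, NULL) != 0)
        return -1;

    for (int r = 0; r < rounds; r++) {
        double got = fp_workload();

        /* Bitwise comparison: any corrupted register state shows up. */
        if (memcmp(&got, &reference, sizeof(got)) != 0)
            mismatches++;
    }

    /* Disarm the timer before returning. */
    memset(&off, 0, sizeof(off));
    setitimer(ITIMER_REAL, &off, NULL);
    return mismatches;
}
```

Calling run_fp_signal_check(1000000, rounds) approximates the one-signal-per-second setup described above; a nonzero return value would mean that FP state was not preserved across signal delivery.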
I had the series in the v5.0-RT tree which resulted in the most recent
patch in that branch.
> Linus
Sebastian