From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S932192AbbCQMvc (ORCPT );
        Tue, 17 Mar 2015 08:51:32 -0400
Received: from mx1.redhat.com ([209.132.183.28]:33412 "EHLO mx1.redhat.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1752681AbbCQMvb (ORCPT );
        Tue, 17 Mar 2015 08:51:31 -0400
Message-ID: <5508234D.4040000@redhat.com>
Date: Tue, 17 Mar 2015 13:51:25 +0100
From: Denys Vlasenko
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0
MIME-Version: 1.0
To: Borislav Petkov , Ingo Molnar
CC: linux-tip-commits@vger.kernel.org, linux-kernel@vger.kernel.org,
        keescook@chromium.org, ast@plumgrid.com, fweisbec@gmail.com,
        oleg@redhat.com, tglx@linutronix.de, torvalds@linux-foundation.org,
        hpa@zytor.com, wad@chromium.org, rostedt@goodmis.org
Subject: Re: [tip:x86/asm] x86/asm/entry/64: Remove unused thread_struct::usersp
References: <1425984307-2143-2-git-send-email-dvlasenk@redhat.com>
        <20150316164707.GB23015@pd.tnic> <55075736.7030003@redhat.com>
        <20150317070830.GA19645@pd.tnic> <20150317071316.GA22758@gmail.com>
        <20150317072118.GA26864@gmail.com> <20150317073901.GC19645@pd.tnic>
        <55081C6D.6090609@redhat.com>
In-Reply-To: <55081C6D.6090609@redhat.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 03/17/2015 01:22 PM, Denys Vlasenko wrote:
> On 03/17/2015 08:39 AM, Borislav Petkov wrote:
>> On Tue, Mar 17, 2015 at 08:21:18AM +0100, Ingo Molnar wrote:
>>> Assuming this does not fix the regression, could you apply the minimal
>>> patch below - which reverts the old_rsp handling change.
>>>
>>> (The rest of the commit are in a third patch, but those are only
>>> comment changes.)
>>>
>>> So my theory is that this change is what will revert the regression.
>>
>> Yep, it does. Below is the diff that works (it is the rough revert
>> without the comments :-)):
>>
> ...
>> @@ -395,6 +398,8 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
>>          /*
>>           * Switch the PDA and FPU contexts.
>>           */
>> +        prev->usersp = this_cpu_read(old_rsp);
>> +        this_cpu_write(old_rsp, next->usersp);
>
> I have a theory. There is a time window when user's sp
> is in PER_CPU_VAR(old_rsp) but not yet in pt_regs->sp,
> and *interrupts are enabled*:
>
> ENTRY(system_call)
>         SWAPGS_UNSAFE_STACK
>         movq    %rsp,PER_CPU_VAR(old_rsp)
>         movq    PER_CPU_VAR(kernel_stack),%rsp
>         ENABLE_INTERRUPTS(CLBR_NONE)
>         ALLOC_PT_GPREGS_ON_STACK 8              /* +8: space for orig_ax */
>         movq    %rcx,RIP(%rsp)
>         movq    PER_CPU_VAR(old_rsp),%rcx
>         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>         movq    %r11,EFLAGS(%rsp)
>         movq    %rcx,RSP(%rsp)
>
> Before indicated insn, interrupts are already enabled.
> If preempt would hit now, next task can clobber PER_CPU_VAR(old_rsp).
> Then, when we return to this task, a bogus user's sp will be stored
> in pt_regs, restores on exit to userspace, and next attempt
> to, say, execute RETQ will try to pop a bogus, likely noncanonical
> address into RIP -> #GP -> SEGV!
>
> The theory can be tested by just moving interrupt enable a bit down:
>
> ENTRY(system_call)
>         SWAPGS_UNSAFE_STACK
>         movq    %rsp,PER_CPU_VAR(old_rsp)
>         movq    PER_CPU_VAR(kernel_stack),%rsp
> -       ENABLE_INTERRUPTS(CLBR_NONE)
>         ALLOC_PT_GPREGS_ON_STACK 8              /* +8: space for orig_ax */
>         movq    %rcx,RIP(%rsp)
>         movq    PER_CPU_VAR(old_rsp),%rcx
>         movq    %r11,EFLAGS(%rsp)
>         movq    %rcx,RSP(%rsp)
> +       ENABLE_INTERRUPTS(CLBR_NONE)
>
> If I'm right, segfaults should be gone.
> Borislav, can you try this?

I managed to reproduce the segfault, and the fix shown above works.

I see that Ingo removed the failing commit from his tree.
I'll send two patches: one which moves ENABLE_INTERRUPTS(CLBR_NONE) down,
and another which tries to remove thread_struct::usersp again.
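
For readers who want to see the window in isolation, here is a toy
user-space model in C. It is only a sketch under loud assumptions: it is
not kernel code, preemption is modelled as a plain function call, and the
names struct task, syscall_entry() and run_other_task() are invented for
illustration. The static variable old_rsp stands in for
PER_CPU_VAR(old_rsp), saved_sp for pt_regs->sp, and usersp for
thread_struct::usersp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one task; all field names are hypothetical. */
struct task {
    uint64_t user_sp;   /* stack pointer the task had in user mode */
    uint64_t saved_sp;  /* stands in for pt_regs->sp */
    uint64_t usersp;    /* stands in for thread_struct::usersp */
};

/* Stands in for PER_CPU_VAR(old_rsp): one scratch slot shared per CPU. */
static uint64_t old_rsp;

/*
 * Model a preemption: switch to `other`, let it run a complete syscall
 * entry on the same CPU, then switch back to `self`.
 */
static void run_other_task(struct task *self, struct task *other,
                           bool switch_saves_old_rsp)
{
    if (switch_saves_old_rsp) {        /* the two lines of the revert */
        self->usersp = old_rsp;
        old_rsp = other->usersp;
    }

    old_rsp = other->user_sp;          /* other task's syscall entry */
    other->saved_sp = old_rsp;

    if (switch_saves_old_rsp) {        /* switching back to self */
        other->usersp = old_rsp;
        old_rsp = self->usersp;
    }
}

/* Returns the sp that ends up in the model pt_regs of `self`. */
uint64_t syscall_entry(struct task *self, struct task *other,
                       bool irqs_enabled_early, bool switch_saves_old_rsp)
{
    assert(self != other);

    old_rsp = self->user_sp;           /* movq %rsp,PER_CPU_VAR(old_rsp) */

    if (irqs_enabled_early)            /* preemption inside the window */
        run_other_task(self, other, switch_saves_old_rsp);

    self->saved_sp = old_rsp;          /* movq PER_CPU_VAR(old_rsp),%rcx
                                          movq %rcx,RSP(%rsp) */

    if (!irqs_enabled_early)           /* preemption only after the copy */
        run_other_task(self, other, switch_saves_old_rsp);

    return self->saved_sp;
}
```

In this model, calling syscall_entry() with irqs_enabled_early set and
switch_saves_old_rsp clear returns the other task's user sp, which is the
bogus value described above. Clearing irqs_enabled_early (the
ENABLE_INTERRUPTS move) or setting switch_saves_old_rsp (the revert in
__switch_to) both return the task's own user sp.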