LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH] riscv: Don't use va_pa_offset on kdump
@ 2021-10-02 12:20 Nick Kossifidis
  2021-10-06 11:13 ` Alexandre Ghiti
  0 siblings, 1 reply; 5+ messages in thread
From: Nick Kossifidis @ 2021-10-02 12:20 UTC (permalink / raw)
  To: palmer, paul.walmsley, aou; +Cc: linux-riscv, linux-kernel, Nick Kossifidis

On kdump instead of using an intermediate step to relocate the kernel, that
lives in a "control buffer" outside the current kernel's mapping, we jump
to the crash kernel directly by calling riscv_kexec_norelocate(). The
current implementation uses va_pa_offset while switching to physical
addressing, however since we moved the kernel outside the linear mapping
this won't work anymore since riscv_kexec_norelocate() is part of the
kernel mapping and we should use kernel_map.va_kernel_pa_offset, and also
take XIP kernel into account.

We don't really need to use va_pa_offset on riscv_kexec_norelocate, we can
just set STVEC to the physical address of the new kernel instead and let
the hart jump to the new kernel on the next instruction after setting
SATP to zero. This fixes kdump and is also simpler/cleaner.

Signed-off-by: Nick Kossifidis <mick@ics.forth.gr>
---
 arch/riscv/kernel/kexec_relocate.S | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/arch/riscv/kernel/kexec_relocate.S b/arch/riscv/kernel/kexec_relocate.S
index a80b52a74..e2f34196e 100644
--- a/arch/riscv/kernel/kexec_relocate.S
+++ b/arch/riscv/kernel/kexec_relocate.S
@@ -159,25 +159,15 @@ SYM_CODE_START(riscv_kexec_norelocate)
 	 * s0: (const) Phys address to jump to
 	 * s1: (const) Phys address of the FDT image
 	 * s2: (const) The hartid of the current hart
-	 * s3: (const) kernel_map.va_pa_offset, used when switching MMU off
 	 */
 	mv	s0, a1
 	mv	s1, a2
 	mv	s2, a3
-	mv	s3, a4
 
 	/* Disable / cleanup interrupts */
 	csrw	CSR_SIE, zero
 	csrw	CSR_SIP, zero
 
-	/* Switch to physical addressing */
-	la	s4, 1f
-	sub	s4, s4, s3
-	csrw	CSR_STVEC, s4
-	csrw	CSR_SATP, zero
-
-.align 2
-1:
 	/* Pass the arguments to the next kernel  / Cleanup*/
 	mv	a0, s2
 	mv	a1, s1
@@ -214,6 +204,11 @@ SYM_CODE_START(riscv_kexec_norelocate)
 	csrw	CSR_SCAUSE, zero
 	csrw	CSR_SSCRATCH, zero
 
+	/* Switch to physical addressing */
+	csrw	CSR_STVEC, a2
+	csrw	CSR_SATP, zero
+
+	/* This will trigger a jump to CSR_STVEC anyway */
 	jalr	zero, a2, 0
 SYM_CODE_END(riscv_kexec_norelocate)
 
-- 
2.32.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] riscv: Don't use va_pa_offset on kdump
  2021-10-02 12:20 [PATCH] riscv: Don't use va_pa_offset on kdump Nick Kossifidis
@ 2021-10-06 11:13 ` Alexandre Ghiti
  2021-10-09 13:18   ` Nick Kossifidis
  0 siblings, 1 reply; 5+ messages in thread
From: Alexandre Ghiti @ 2021-10-06 11:13 UTC (permalink / raw)
  To: Nick Kossifidis
  Cc: Palmer Dabbelt, Paul Walmsley, Albert Ou, linux-riscv,
	linux-kernel@vger.kernel.org List

On Sat, Oct 2, 2021 at 2:23 PM Nick Kossifidis <mick@ics.forth.gr> wrote:
>
> On kdump instead of using an intermediate step to relocate the kernel, that
> lives in a "control buffer" outside the current kernel's mapping, we jump
> to the crash kernel directly by calling riscv_kexec_norelocate(). The
> current implementation uses va_pa_offset while switching to physical
> addressing, however since we moved the kernel outside the linear mapping
> this won't work anymore since riscv_kexec_norelocate() is part of the
> kernel mapping and we should use kernel_map.va_kernel_pa_offset, and also
> take XIP kernel into account.
>
> We don't really need to use va_pa_offset on riscv_kexec_norelocate, we can
> just set STVEC to the physical address of the new kernel instead and let
> the hart jump to the new kernel on the next instruction after setting
> SATP to zero. This fixes kdump and is also simpler/cleaner.
>
> Signed-off-by: Nick Kossifidis <mick@ics.forth.gr>
> ---
>  arch/riscv/kernel/kexec_relocate.S | 15 +++++----------
>  1 file changed, 5 insertions(+), 10 deletions(-)
>
> diff --git a/arch/riscv/kernel/kexec_relocate.S b/arch/riscv/kernel/kexec_relocate.S
> index a80b52a74..e2f34196e 100644
> --- a/arch/riscv/kernel/kexec_relocate.S
> +++ b/arch/riscv/kernel/kexec_relocate.S
> @@ -159,25 +159,15 @@ SYM_CODE_START(riscv_kexec_norelocate)
>          * s0: (const) Phys address to jump to
>          * s1: (const) Phys address of the FDT image
>          * s2: (const) The hartid of the current hart
> -        * s3: (const) kernel_map.va_pa_offset, used when switching MMU off
>          */
>         mv      s0, a1
>         mv      s1, a2
>         mv      s2, a3
> -       mv      s3, a4
>
>         /* Disable / cleanup interrupts */
>         csrw    CSR_SIE, zero
>         csrw    CSR_SIP, zero
>
> -       /* Switch to physical addressing */
> -       la      s4, 1f
> -       sub     s4, s4, s3
> -       csrw    CSR_STVEC, s4
> -       csrw    CSR_SATP, zero
> -
> -.align 2
> -1:
>         /* Pass the arguments to the next kernel  / Cleanup*/
>         mv      a0, s2
>         mv      a1, s1
> @@ -214,6 +204,11 @@ SYM_CODE_START(riscv_kexec_norelocate)
>         csrw    CSR_SCAUSE, zero
>         csrw    CSR_SSCRATCH, zero
>
> +       /* Switch to physical addressing */
> +       csrw    CSR_STVEC, a2
> +       csrw    CSR_SATP, zero
> +
> +       /* This will trigger a jump to CSR_STVEC anyway */
>         jalr    zero, a2, 0

The last jump to a2 can be removed since the fault will be triggered
before even reaching this instruction.

>  SYM_CODE_END(riscv_kexec_norelocate)
>
> --
> 2.32.0
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv

This patch fixes a regression introduced when moving the kernel to the
end of the address space, so we should add:
Fixes: 2bfc6cd81bd1 ("riscv: Move kernel mapping outside of linear mapping")

And it should be backported to 5.13 and 5.14. It seems that the
following tags should be enough:

Cc: <stable@vger.kernel.org> # 5.13
Cc: <stable@vger.kernel.org> # 5.14

And finally, you can add:

Reviewed-by: Alexandre Ghiti <alex@ghiti.fr>

Thanks,

Alex

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] riscv: Don't use va_pa_offset on kdump
  2021-10-06 11:13 ` Alexandre Ghiti
@ 2021-10-09 13:18   ` Nick Kossifidis
  2021-10-23 20:14     ` Palmer Dabbelt
  0 siblings, 1 reply; 5+ messages in thread
From: Nick Kossifidis @ 2021-10-09 13:18 UTC (permalink / raw)
  To: Alexandre Ghiti
  Cc: Nick Kossifidis, Palmer Dabbelt, Paul Walmsley, Albert Ou,
	linux-riscv, linux-kernel@vger.kernel.org List

Στις 2021-10-06 14:13, Alexandre Ghiti έγραψε:
>> +
>> +       /* This will trigger a jump to CSR_STVEC anyway */
>>         jalr    zero, a2, 0
> 
> The last jump to a2 can be removed since the fault will be triggered
> before even reaching this instruction.
> 

Just switching SATP to zero doesn't generate a trap unless mstatus.TVM 
is set (for visualization purposes). The hart will try and execute the 
next instruction but it's not clear in the spec what happens in case the 
code is cached, I don't want to rely solely on STVEC. I prefer having 
this instruction there, note that some earlier QEMU versions also had 
this behavior (the original kdump patch didn't set STVEC and it worked 
fine after setting SATP to zero).

> 
> This patch fixes a regression introduced when moving the kernel to the
> end of the address space, so we should add:
> Fixes: 2bfc6cd81bd1 ("riscv: Move kernel mapping outside of linear 
> mapping")
> 
> And it should be backported to 5.13 and 5.14. It seems that the
> following tags should be enough:
> 
> Cc: <stable@vger.kernel.org> # 5.13
> Cc: <stable@vger.kernel.org> # 5.14
> 
> And finally, you can add:
> 
> Reviewed-by: Alexandre Ghiti <alex@ghiti.fr>
> 

ACK, thanks ! I'll resend the patch with the tags you mentioned.

Regards,
Nick

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] riscv: Don't use va_pa_offset on kdump
  2021-10-09 13:18   ` Nick Kossifidis
@ 2021-10-23 20:14     ` Palmer Dabbelt
  2021-10-25  1:00       ` Nick Kossifidis
  0 siblings, 1 reply; 5+ messages in thread
From: Palmer Dabbelt @ 2021-10-23 20:14 UTC (permalink / raw)
  To: mick; +Cc: alexandre.ghiti, mick, Paul Walmsley, aou, linux-riscv, linux-kernel

On Sat, 09 Oct 2021 06:18:48 PDT (-0700), mick@ics.forth.gr wrote:
> Στις 2021-10-06 14:13, Alexandre Ghiti έγραψε:
>>> +
>>> +       /* This will trigger a jump to CSR_STVEC anyway */
>>>         jalr    zero, a2, 0
>>
>> The last jump to a2 can be removed since the fault will be triggered
>> before even reaching this instruction.
>>
>
> Just switching SATP to zero doesn't generate a trap unless mstatus.TVM
> is set (for visualization purposes). The hart will try and execute the
> next instruction but it's not clear in the spec what happens in case the
> code is cached, I don't want to rely solely on STVEC. I prefer having
> this instruction there, note that some earlier QEMU versions also had
> this behavior (the original kdump patch didn't set STVEC and it worked
> fine after setting SATP to zero).

IIRC this came down to some very specific wording in the spec.  
Something along the lines of the 0 in SATP meaning "no translation", 
SFENCE.VMA ordering translations, and the general "if the spec doesn't 
mention it then it has to work" logic.  I thought I opened a spec issue 
about this for clarification, but I can't find it.

That said, I'm perfectly fine taking the safe approach here as it's not 
like the performance matters here.  Warrants a comment, though.

>
>>
>> This patch fixes a regression introduced when moving the kernel to the
>> end of the address space, so we should add:
>> Fixes: 2bfc6cd81bd1 ("riscv: Move kernel mapping outside of linear
>> mapping")
>>
>> And it should be backported to 5.13 and 5.14. It seems that the
>> following tags should be enough:
>>
>> Cc: <stable@vger.kernel.org> # 5.13
>> Cc: <stable@vger.kernel.org> # 5.14
>>
>> And finally, you can add:
>>
>> Reviewed-by: Alexandre Ghiti <alex@ghiti.fr>
>>
>
> ACK, thanks ! I'll resend the patch with the tags you mentioned.

I don't have a v2 in my inbox, did I miss something?  Also, if it's just 
the tags then it's generally not necessary to re-send something.  The 
comment does, though.

LMK if you want me to deal with this, or if there's going to be a v2.

Thanks!

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] riscv: Don't use va_pa_offset on kdump
  2021-10-23 20:14     ` Palmer Dabbelt
@ 2021-10-25  1:00       ` Nick Kossifidis
  0 siblings, 0 replies; 5+ messages in thread
From: Nick Kossifidis @ 2021-10-25  1:00 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: mick, alexandre.ghiti, Paul Walmsley, aou, linux-riscv, linux-kernel

Στις 2021-10-23 23:14, Palmer Dabbelt έγραψε:
> On Sat, 09 Oct 2021 06:18:48 PDT (-0700), mick@ics.forth.gr wrote:
>> Στις 2021-10-06 14:13, Alexandre Ghiti έγραψε:
>>>> +
>>>> +       /* This will trigger a jump to CSR_STVEC anyway */
>>>>         jalr    zero, a2, 0
>>> 
>>> The last jump to a2 can be removed since the fault will be triggered
>>> before even reaching this instruction.
>>> 
>> 
>> Just switching SATP to zero doesn't generate a trap unless mstatus.TVM
>> is set (for visualization purposes). The hart will try and execute the
>> next instruction but it's not clear in the spec what happens in case 
>> the
>> code is cached, I don't want to rely solely on STVEC. I prefer having
>> this instruction there, note that some earlier QEMU versions also had
>> this behavior (the original kdump patch didn't set STVEC and it worked
>> fine after setting SATP to zero).
> 
> IIRC this came down to some very specific wording in the spec.
> Something along the lines of the 0 in SATP meaning "no translation",
> SFENCE.VMA ordering translations, and the general "if the spec doesn't
> mention it then it has to work" logic.  I thought I opened a spec
> issue about this for clarification, but I can't find it.
> 

I guess you mean this one:
https://github.com/riscv/riscv-isa-manual/issues/538

I couldn't find anything though regarding cached code, it's not that 
there's going to be a load after setting satp to 0 if the code has been 
cached, so even if the translation is cached we don't have a guarantee 
that the next instruction will result a trap.

> That said, I'm perfectly fine taking the safe approach here as it's
> not like the performance matters here.  Warrants a comment, though.
> 

ACK

> 
> I don't have a v2 in my inbox, did I miss something?  Also, if it's
> just the tags then it's generally not necessary to re-send something.
> The comment does, though.
> 
> LMK if you want me to deal with this, or if there's going to be a v2.
> 
> Thanks!

I'll send a v2 with the tags and the comment.

Regards,
Nick

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-10-25  1:00 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-02 12:20 [PATCH] riscv: Don't use va_pa_offset on kdump Nick Kossifidis
2021-10-06 11:13 ` Alexandre Ghiti
2021-10-09 13:18   ` Nick Kossifidis
2021-10-23 20:14     ` Palmer Dabbelt
2021-10-25  1:00       ` Nick Kossifidis

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).