LKML Archive on lore.kernel.org
* [PATCH v3 0/2] add support for new persistent memory instructions
@ 2015-01-27 16:53 Ross Zwisler
  2015-01-27 16:53 ` [PATCH v3 1/2] x86: Add support for the pcommit instruction Ross Zwisler
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Ross Zwisler @ 2015-01-27 16:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ross Zwisler, H Peter Anvin, Ingo Molnar, Thomas Gleixner, Borislav Petkov

This patch set adds support for two new persistent memory instructions,
pcommit and clwb. These instructions were announced in the document
"Intel Architecture Instruction Set Extensions Programming Reference"
with reference number 319433-022.

https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf

These patches apply cleanly to v3.19-rc6.

Changes from v2:

 - Added instruction descriptions and flows to the patch descriptions.
 - Added needed sfence to pcommit alternatives assembly. The inline
   function is now called pcommit_sfence(). If pcommit is not supported
   on the platform both the pcommit and the sfence will be nops.

Cc: H Peter Anvin <h.peter.anvin@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>

Ross Zwisler (2):
  x86: Add support for the pcommit instruction
  x86: Add support for the clwb instruction

 arch/x86/include/asm/cpufeature.h    |  2 ++
 arch/x86/include/asm/special_insns.h | 22 ++++++++++++++++++++++
 2 files changed, 24 insertions(+)

-- 
1.9.3

^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v3 1/2] x86: Add support for the pcommit instruction
  2015-01-27 16:53 [PATCH v3 0/2] add support for new persistent memory instructions Ross Zwisler
@ 2015-01-27 16:53 ` Ross Zwisler
  2015-01-28 10:58   ` Borislav Petkov
                     ` (3 more replies)
  2015-01-27 16:53 ` [PATCH v3 2/2] x86: Add support for the clwb instruction Ross Zwisler
  2015-02-05 16:24 ` [PATCH v3 0/2] add support for new persistent memory instructions Ross Zwisler
  2 siblings, 4 replies; 20+ messages in thread
From: Ross Zwisler @ 2015-01-27 16:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ross Zwisler, H Peter Anvin, Ingo Molnar, Thomas Gleixner, Borislav Petkov

Add support for the new pcommit (persistent commit) instruction. This
instruction was announced in the document "Intel Architecture
Instruction Set Extensions Programming Reference" with reference number
319433-022.

https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf

The pcommit instruction ensures that data that has been flushed from the
processor's cache hierarchy with clwb, clflushopt or clflush is accepted to
memory and is durable on the DIMM. The primary use case for this is persistent
memory.

This function shows how to properly use clwb/clflushopt/clflush and
pcommit with appropriate fencing:

void flush_and_commit_buffer(void *vaddr, unsigned int size)
{
	void *vend = vaddr + size - 1;

	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
		clwb(vaddr);

	/* Flush any possible final partial cacheline */
	clwb(vend);

	/*
	 * sfence to order clwb/clflushopt/clflush cache flushes
	 * mfence via mb() also works
	 */
	wmb();

	/* pcommit and the required sfence for ordering */
	pcommit_sfence();
}

After this function completes the data pointed to by vaddr has been
accepted to memory and will be durable if the vaddr points to
persistent memory.

Pcommit must always be ordered by an mfence or sfence, so to help
simplify things we include both the pcommit and the required sfence in
the alternatives generated by pcommit_sfence(). The other option is to
keep them separated, but on platforms that don't support pcommit this
would then turn into:

void flush_and_commit_buffer(void *vaddr, unsigned int size)
{
	void *vend = vaddr + size - 1;

	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
		clwb(vaddr);

	/* Flush any possible final partial cacheline */
	clwb(vend);

	/*
	 * sfence to order clwb/clflushopt/clflush cache flushes
	 * mfence via mb() also works
	 */
	wmb();

	nop(); /* from pcommit(), via alternatives */

	/*
	 * sfence to order pcommit
	 * mfence via mb() also works
	 */
	wmb();
}

This is still correct, but now you've got two fences separated by only a
nop. With the commit and the fence together in pcommit_sfence() you
avoid the final unneeded fence.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: H Peter Anvin <h.peter.anvin@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
---
 arch/x86/include/asm/cpufeature.h    | 1 +
 arch/x86/include/asm/special_insns.h | 8 ++++++++
 2 files changed, 9 insertions(+)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index bb9b258..dfdd689 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -220,6 +220,7 @@
 #define X86_FEATURE_RDSEED	( 9*32+18) /* The RDSEED instruction */
 #define X86_FEATURE_ADX	( 9*32+19) /* The ADCX and ADOX instructions */
 #define X86_FEATURE_SMAP	( 9*32+20) /* Supervisor Mode Access Prevention */
+#define X86_FEATURE_PCOMMIT	( 9*32+22) /* PCOMMIT instruction */
 #define X86_FEATURE_CLFLUSHOPT	( 9*32+23) /* CLFLUSHOPT instruction */
 #define X86_FEATURE_AVX512PF	( 9*32+26) /* AVX-512 Prefetch */
 #define X86_FEATURE_AVX512ER	( 9*32+27) /* AVX-512 Exponential and Reciprocal */
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index e820c08..d686f9b 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -199,6 +199,14 @@ static inline void clflushopt(volatile void *__p)
 		       "+m" (*(volatile char __force *)__p));
 }
 
+static inline void pcommit_sfence(void)
+{
+	alternative(ASM_NOP7,
+		    ".byte 0x66, 0x0f, 0xae, 0xf8\n\t" /* pcommit */
+		    "sfence",
+		    X86_FEATURE_PCOMMIT);
+}
+
 #define nop() asm volatile ("nop")
 
-- 
1.9.3

^ permalink raw reply [flat|nested] 20+ messages in thread
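The loop in flush_and_commit_buffer() steps from a possibly unaligned vaddr in cache-line-sized strides and then issues one extra clwb(vend). The userspace sketch below is only an illustration of why that extra flush matters; it is not kernel code. It assumes a 64-byte flush granularity, and the names CACHELINE, fake_clwb(), flush_range() and covers_exactly() are hypothetical stand-ins that record which lines would have been written back. The fences are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHELINE 64u	/* assumed value of boot_cpu_data.x86_clflush_size */
#define MAX_LINES 64u	/* size of the simulated address space, in lines */

static unsigned char flushed[MAX_LINES];	/* flushed[i] == 1: line i was written back */

/* Hypothetical stand-in for clwb(): note which cache line the address hits. */
static void fake_clwb(uintptr_t addr)
{
	size_t line = addr / CACHELINE;

	assert(line < MAX_LINES);
	flushed[line] = 1;
}

/* Same iteration as flush_and_commit_buffer(), without wmb()/pcommit_sfence(). */
static void flush_range(uintptr_t vaddr, unsigned int size)
{
	uintptr_t vend;

	assert(size > 0);	/* size == 0 would make vend wrap below vaddr */
	vend = vaddr + size - 1;

	for (; vaddr < vend; vaddr += CACHELINE)
		fake_clwb(vaddr);

	/* Flush any possible final partial cacheline */
	fake_clwb(vend);
}

/* Returns 1 if exactly the lines overlapping [vaddr, vaddr + size) were flushed. */
static int covers_exactly(uintptr_t vaddr, unsigned int size)
{
	size_t first = vaddr / CACHELINE;
	size_t last = (vaddr + size - 1) / CACHELINE;
	size_t i;

	memset(flushed, 0, sizeof(flushed));
	flush_range(vaddr, size);

	for (i = 0; i < MAX_LINES; i++) {
		int expected = (i >= first && i <= last);

		if (flushed[i] != expected)
			return 0;
	}
	return 1;
}
```

For a 10-byte buffer starting at offset 60, the loop body only touches the line containing offset 60; the trailing flush of vend (offset 69) is what reaches the second line, so covers_exactly(60, 10) holds only because of it.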
* Re: [PATCH v3 1/2] x86: Add support for the pcommit instruction 2015-01-27 16:53 ` [PATCH v3 1/2] x86: Add support for the pcommit instruction Ross Zwisler @ 2015-01-28 10:58 ` Borislav Petkov 2015-01-28 17:10 ` Elliott, Robert (Server Storage) ` (2 subsequent siblings) 3 siblings, 0 replies; 20+ messages in thread From: Borislav Petkov @ 2015-01-28 10:58 UTC (permalink / raw) To: Ross Zwisler; +Cc: linux-kernel, H Peter Anvin, Ingo Molnar, Thomas Gleixner On Tue, Jan 27, 2015 at 09:53:50AM -0700, Ross Zwisler wrote: > Add support for the new pcommit (persistent commit) instruction. This > instruction was announced in the document "Intel Architecture > Instruction Set Extensions Programming Reference" with reference number > 319433-022. > > https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf > > The pcommit instruction ensures that data that has been flushed from the > processor's cache hierarchy with clwb, clflushopt or clflush is accepted to > memory and is durable on the DIMM. The primary use case for this is persistent > memory. > > This function shows how to properly use clwb/clflushopt/clflush and > pcommit with appropriate fencing: ... > This is still correct, but now you've got two fences separated by only a > nop. With the commit and the fence together in pcommit_sfence() you > avoid the final unneeded fence. > > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> > Cc: H Peter Anvin <h.peter.anvin@intel.com> > Cc: Ingo Molnar <mingo@kernel.org> > Cc: Thomas Gleixner <tglx@linutronix.de> > Cc: Borislav Petkov <bp@alien8.de> Acked-by: Borislav Petkov <bp@suse.de> -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. -- ^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH v3 1/2] x86: Add support for the pcommit instruction
  2015-01-27 16:53 ` [PATCH v3 1/2] x86: Add support for the pcommit instruction Ross Zwisler
  2015-01-28 10:58   ` Borislav Petkov
@ 2015-01-28 17:10   ` Elliott, Robert (Server Storage)
  2015-01-28 17:21     ` Borislav Petkov
  2015-02-11 22:24   ` H. Peter Anvin
  2015-02-19 0:29   ` [tip:x86/asm] " tip-bot for Ross Zwisler
  3 siblings, 1 reply; 20+ messages in thread
From: Elliott, Robert (Server Storage) @ 2015-01-28 17:10 UTC (permalink / raw)
  To: Ross Zwisler, linux-kernel
  Cc: H Peter Anvin, Ingo Molnar, Thomas Gleixner, Borislav Petkov, Kani, Toshimitsu, Knippers, Linda

> -----Original Message-----
> From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Ross Zwisler
> Sent: Tuesday, 27 January, 2015 10:54 AM
> To: linux-kernel@vger.kernel.org
> Cc: Ross Zwisler; H Peter Anvin; Ingo Molnar; Thomas Gleixner; Borislav Petkov
> Subject: [PATCH v3 1/2] x86: Add support for the pcommit instruction
>
> Add support for the new pcommit (persistent commit) instruction. This
> instruction was announced in the document "Intel Architecture
> Instruction Set Extensions Programming Reference" with reference number
> 319433-022.
>
> https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
>
...
> ---
>  arch/x86/include/asm/cpufeature.h    | 1 +
>  arch/x86/include/asm/special_insns.h | 8 ++++++++
>  2 files changed, 9 insertions(+)

Should this patch series also add defines for the virtual
machine control data structure changes?

1. Add the new VM-Execution Controls bit 21 as
   SECONDARY_EXEC_PCOMMIT_EXITING 0x00200000
   to arch/x86/include/asm/vmx.h.

2. Add the new exit reason of 65 (0x41) as
   EXIT_REASON_PCOMMIT 65
   to arch/x86/include/uapi/asm/vmx.h and (with a
   VMX_EXIT_REASONS string) to usr/include/asm/vmx.h.

3. Add a kvm_vmx_exit_handler to arch/x86/kvm/vmx.c.

---
Rob Elliott    HP Server Storage

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v3 1/2] x86: Add support for the pcommit instruction
  2015-01-28 17:10   ` Elliott, Robert (Server Storage)
@ 2015-01-28 17:21     ` Borislav Petkov
  2015-01-28 17:27       ` Ross Zwisler
  0 siblings, 1 reply; 20+ messages in thread
From: Borislav Petkov @ 2015-01-28 17:21 UTC (permalink / raw)
  To: Elliott, Robert (Server Storage)
  Cc: Ross Zwisler, linux-kernel, H Peter Anvin, Ingo Molnar, Thomas Gleixner, Kani, Toshimitsu, Knippers, Linda

On Wed, Jan 28, 2015 at 05:10:46PM +0000, Elliott, Robert (Server Storage) wrote:
> Should this patch series also add defines for the virtual
> machine control data structure changes?
>
> 1. Add the new VM-Execution Controls bit 21 as
>    SECONDARY_EXEC_PCOMMIT_EXITING 0x00200000
>    to arch/x86/include/asm/vmx.h.
>
> 2. Add the new exit reason of 65 (0x41) as
>    EXIT_REASON_PCOMMIT 65
>    to arch/x86/include/uapi/asm/vmx.h and (with a
>    VMX_EXIT_REASONS string) to usr/include/asm/vmx.h.
>
> 3. Add a kvm_vmx_exit_handler to arch/x86/kvm/vmx.c.

These look like a separate patchset for kvm enablement to me.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v3 1/2] x86: Add support for the pcommit instruction
  2015-01-28 17:21     ` Borislav Petkov
@ 2015-01-28 17:27       ` Ross Zwisler
  0 siblings, 0 replies; 20+ messages in thread
From: Ross Zwisler @ 2015-01-28 17:27 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Elliott, Robert (Server Storage), linux-kernel, H Peter Anvin, Ingo Molnar, Thomas Gleixner, Kani, Toshimitsu, Knippers, Linda

On Wed, 2015-01-28 at 18:21 +0100, Borislav Petkov wrote:
> On Wed, Jan 28, 2015 at 05:10:46PM +0000, Elliott, Robert (Server Storage) wrote:
> > Should this patch series also add defines for the virtual
> > machine control data structure changes?
> >
> > 1. Add the new VM-Execution Controls bit 21 as
> >    SECONDARY_EXEC_PCOMMIT_EXITING 0x00200000
> >    to arch/x86/include/asm/vmx.h.
> >
> > 2. Add the new exit reason of 65 (0x41) as
> >    EXIT_REASON_PCOMMIT 65
> >    to arch/x86/include/uapi/asm/vmx.h and (with a
> >    VMX_EXIT_REASONS string) to usr/include/asm/vmx.h.
> >
> > 3. Add a kvm_vmx_exit_handler to arch/x86/kvm/vmx.c.
>
> These look like a separate patchset for kvm enablement to me.

Agreed, I think they are a separate patch set.

^ permalink raw reply [flat|nested] 20+ messages in thread
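As a quick sanity check of the numbers in the KVM suggestion above, the following userspace sketch verifies that the mask 0x00200000 is bit 21 and that the hexadecimal value 0x41 corresponds to decimal 65. The two macro names come from the suggestion in the thread; the helper single_bit_index() is hypothetical, and none of this claims to be what any later KVM patch actually contains.

```c
#include <assert.h>
#include <stdint.h>

/* Names as proposed in the thread; the values here are illustrative only. */
#define SECONDARY_EXEC_PCOMMIT_EXITING	0x00200000u
#define EXIT_REASON_PCOMMIT		65

/*
 * Hypothetical helper: return the index of the only set bit in mask,
 * or -1 if mask does not have exactly one bit set.
 */
static int single_bit_index(uint32_t mask)
{
	int idx = 0;

	if (mask == 0 || (mask & (mask - 1)) != 0)
		return -1;

	while ((mask & 1u) == 0) {
		mask >>= 1;
		idx++;
	}
	return idx;
}
```

Writing the control bit as (1u << 21) instead of a literal mask would make the bit number visible at the definition site; the literal form is used here only because that is how the thread spells it.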
* Re: [PATCH v3 1/2] x86: Add support for the pcommit instruction
  2015-01-27 16:53 ` [PATCH v3 1/2] x86: Add support for the pcommit instruction Ross Zwisler
  2015-01-28 10:58   ` Borislav Petkov
  2015-01-28 17:10   ` Elliott, Robert (Server Storage)
@ 2015-02-11 22:24   ` H. Peter Anvin
  2015-02-19 0:29   ` [tip:x86/asm] " tip-bot for Ross Zwisler
  3 siblings, 0 replies; 20+ messages in thread
From: H. Peter Anvin @ 2015-02-11 22:24 UTC (permalink / raw)
  To: Ross Zwisler, linux-kernel; +Cc: Ingo Molnar, Thomas Gleixner, Borislav Petkov

On 01/27/2015 08:53 AM, Ross Zwisler wrote:
> Add support for the new pcommit (persistent commit) instruction. This
> instruction was announced in the document "Intel Architecture
> Instruction Set Extensions Programming Reference" with reference number
> 319433-022.
>
> https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
>
> The pcommit instruction ensures that data that has been flushed from the
> processor's cache hierarchy with clwb, clflushopt or clflush is accepted to
> memory and is durable on the DIMM. The primary use case for this is persistent
> memory.
>
> This function shows how to properly use clwb/clflushopt/clflush and
> pcommit with appropriate fencing:
>
> void flush_and_commit_buffer(void *vaddr, unsigned int size)
> {
> 	void *vend = vaddr + size - 1;
>
> 	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
> 		clwb(vaddr);
>
> 	/* Flush any possible final partial cacheline */
> 	clwb(vend);
>
> 	/*
> 	 * sfence to order clwb/clflushopt/clflush cache flushes
> 	 * mfence via mb() also works
> 	 */
> 	wmb();
>
> 	/* pcommit and the required sfence for ordering */
> 	pcommit_sfence();
> }
>
> After this function completes the data pointed to by vaddr has been
> accepted to memory and will be durable if the vaddr points to
> persistent memory.
>
> Pcommit must always be ordered by an mfence or sfence, so to help
> simplify things we include both the pcommit and the required sfence in
> the alternatives generated by pcommit_sfence(). The other option is to
> keep them separated, but on platforms that don't support pcommit this
> would then turn into:
>
> void flush_and_commit_buffer(void *vaddr, unsigned int size)
> {
> 	void *vend = vaddr + size - 1;
>
> 	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
> 		clwb(vaddr);
>
> 	/* Flush any possible final partial cacheline */
> 	clwb(vend);
>
> 	/*
> 	 * sfence to order clwb/clflushopt/clflush cache flushes
> 	 * mfence via mb() also works
> 	 */
> 	wmb();
>
> 	nop(); /* from pcommit(), via alternatives */
>
> 	/*
> 	 * sfence to order pcommit
> 	 * mfence via mb() also works
> 	 */
> 	wmb();
> }
>
> This is still correct, but now you've got two fences separated by only a
> nop. With the commit and the fence together in pcommit_sfence() you
> avoid the final unneeded fence.

Acked-by: H. Peter Anvin <hpa@linux.intel.com>

^ permalink raw reply [flat|nested] 20+ messages in thread
* [tip:x86/asm] x86: Add support for the pcommit instruction
  2015-01-27 16:53 ` [PATCH v3 1/2] x86: Add support for the pcommit instruction Ross Zwisler
                     ` (2 preceding siblings ...)
  2015-02-11 22:24   ` H. Peter Anvin
@ 2015-02-19 0:29   ` tip-bot for Ross Zwisler
  2015-02-19 1:15     ` Ingo Molnar
  3 siblings, 1 reply; 20+ messages in thread
From: tip-bot for Ross Zwisler @ 2015-02-19 0:29 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, linux-kernel, bp, tglx, torvalds, ross.zwisler, hpa, hpa

Commit-ID:  a71ef01336f2228dc9d47320492360d6848e591e
Gitweb:     http://git.kernel.org/tip/a71ef01336f2228dc9d47320492360d6848e591e
Author:     Ross Zwisler <ross.zwisler@linux.intel.com>
AuthorDate: Tue, 27 Jan 2015 09:53:50 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 19 Feb 2015 00:06:37 +0100

x86: Add support for the pcommit instruction

Add support for the new pcommit (persistent commit) instruction.
This instruction was announced in the document "Intel
Architecture Instruction Set Extensions Programming Reference"
with reference number 319433-022.

  https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf

The pcommit instruction ensures that data that has been flushed
from the processor's cache hierarchy with clwb, clflushopt or
clflush is accepted to memory and is durable on the DIMM. The
primary use case for this is persistent memory.

This function shows how to properly use clwb/clflushopt/clflush
and pcommit with appropriate fencing:

void flush_and_commit_buffer(void *vaddr, unsigned int size)
{
	void *vend = vaddr + size - 1;

	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
		clwb(vaddr);

	/* Flush any possible final partial cacheline */
	clwb(vend);

	/*
	 * sfence to order clwb/clflushopt/clflush cache flushes
	 * mfence via mb() also works
	 */
	wmb();

	/* pcommit and the required sfence for ordering */
	pcommit_sfence();
}

After this function completes the data pointed to by vaddr has
been accepted to memory and will be durable if the vaddr points
to persistent memory.

Pcommit must always be ordered by an mfence or sfence, so to
help simplify things we include both the pcommit and the
required sfence in the alternatives generated by
pcommit_sfence(). The other option is to keep them separated,
but on platforms that don't support pcommit this would then turn
into:

void flush_and_commit_buffer(void *vaddr, unsigned int size)
{
	void *vend = vaddr + size - 1;

	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
		clwb(vaddr);

	/* Flush any possible final partial cacheline */
	clwb(vend);

	/*
	 * sfence to order clwb/clflushopt/clflush cache flushes
	 * mfence via mb() also works
	 */
	wmb();

	nop(); /* from pcommit(), via alternatives */

	/*
	 * sfence to order pcommit
	 * mfence via mb() also works
	 */
	wmb();
}

This is still correct, but now you've got two fences separated
by only a nop. With the commit and the fence together in
pcommit_sfence() you avoid the final unneeded fence.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1422377631-8986-2-git-send-email-ross.zwisler@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/cpufeature.h    | 1 +
 arch/x86/include/asm/special_insns.h | 8 ++++++++
 2 files changed, 9 insertions(+)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 90a5485..d6428ea 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -231,6 +231,7 @@
 #define X86_FEATURE_RDSEED	( 9*32+18) /* The RDSEED instruction */
 #define X86_FEATURE_ADX	( 9*32+19) /* The ADCX and ADOX instructions */
 #define X86_FEATURE_SMAP	( 9*32+20) /* Supervisor Mode Access Prevention */
+#define X86_FEATURE_PCOMMIT	( 9*32+22) /* PCOMMIT instruction */
 #define X86_FEATURE_CLFLUSHOPT	( 9*32+23) /* CLFLUSHOPT instruction */
 #define X86_FEATURE_AVX512PF	( 9*32+26) /* AVX-512 Prefetch */
 #define X86_FEATURE_AVX512ER	( 9*32+27) /* AVX-512 Exponential and Reciprocal */
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index e820c08..d686f9b 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -199,6 +199,14 @@ static inline void clflushopt(volatile void *__p)
 		       "+m" (*(volatile char __force *)__p));
 }
 
+static inline void pcommit_sfence(void)
+{
+	alternative(ASM_NOP7,
+		    ".byte 0x66, 0x0f, 0xae, 0xf8\n\t" /* pcommit */
+		    "sfence",
+		    X86_FEATURE_PCOMMIT);
+}
+
 #define nop() asm volatile ("nop")

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [tip:x86/asm] x86: Add support for the pcommit instruction 2015-02-19 0:29 ` [tip:x86/asm] " tip-bot for Ross Zwisler @ 2015-02-19 1:15 ` Ingo Molnar 2015-02-19 17:21 ` Ross Zwisler 0 siblings, 1 reply; 20+ messages in thread From: Ingo Molnar @ 2015-02-19 1:15 UTC (permalink / raw) To: hpa, ross.zwisler, torvalds, tglx, hpa, linux-kernel, bp Cc: linux-tip-commits * tip-bot for Ross Zwisler <tipbot@zytor.com> wrote: > Commit-ID: a71ef01336f2228dc9d47320492360d6848e591e > Gitweb: http://git.kernel.org/tip/a71ef01336f2228dc9d47320492360d6848e591e > Author: Ross Zwisler <ross.zwisler@linux.intel.com> > AuthorDate: Tue, 27 Jan 2015 09:53:50 -0700 > Committer: Ingo Molnar <mingo@kernel.org> > CommitDate: Thu, 19 Feb 2015 00:06:37 +0100 > > x86: Add support for the pcommit instruction So this breaks the UML build: /home/mingo/tip/arch/x86/include/asm/special_insns.h: In function ‘pcommit_sfence’: /home/mingo/tip/arch/x86/include/asm/special_insns.h:218:14: error: expected ‘:’ or ‘)’ before ‘ASM_NOP7’ Thanks, Ingo ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [tip:x86/asm] x86: Add support for the pcommit instruction 2015-02-19 1:15 ` Ingo Molnar @ 2015-02-19 17:21 ` Ross Zwisler 2015-02-19 17:33 ` Borislav Petkov 0 siblings, 1 reply; 20+ messages in thread From: Ross Zwisler @ 2015-02-19 17:21 UTC (permalink / raw) To: Ingo Molnar; +Cc: hpa, torvalds, tglx, hpa, linux-kernel, bp, linux-tip-commits On Thu, 2015-02-19 at 02:15 +0100, Ingo Molnar wrote: > * tip-bot for Ross Zwisler <tipbot@zytor.com> wrote: > > > Commit-ID: a71ef01336f2228dc9d47320492360d6848e591e > > Gitweb: http://git.kernel.org/tip/a71ef01336f2228dc9d47320492360d6848e591e > > Author: Ross Zwisler <ross.zwisler@linux.intel.com> > > AuthorDate: Tue, 27 Jan 2015 09:53:50 -0700 > > Committer: Ingo Molnar <mingo@kernel.org> > > CommitDate: Thu, 19 Feb 2015 00:06:37 +0100 > > > > x86: Add support for the pcommit instruction > > So this breaks the UML build: > > /home/mingo/tip/arch/x86/include/asm/special_insns.h: In function ‘pcommit_sfence’: > /home/mingo/tip/arch/x86/include/asm/special_insns.h:218:14: error: expected ‘:’ or ‘)’ before ‘ASM_NOP7’ > > Thanks, > > Ingo Interesting, it looks like I need to include <asm/nops.h> explicitly for UML. New patch on the way. Thanks, - Ross ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [tip:x86/asm] x86: Add support for the pcommit instruction 2015-02-19 17:21 ` Ross Zwisler @ 2015-02-19 17:33 ` Borislav Petkov 2015-02-19 17:41 ` Ross Zwisler 0 siblings, 1 reply; 20+ messages in thread From: Borislav Petkov @ 2015-02-19 17:33 UTC (permalink / raw) To: Ross Zwisler Cc: Ingo Molnar, hpa, torvalds, tglx, hpa, linux-kernel, bp, linux-tip-commits On Thu, Feb 19, 2015 at 10:21:53AM -0700, Ross Zwisler wrote: > Interesting, it looks like I need to include <asm/nops.h> explicitly for > UML. New patch on the way. You'd need to do an incremental fix ontop, though. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. -- ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [tip:x86/asm] x86: Add support for the pcommit instruction 2015-02-19 17:33 ` Borislav Petkov @ 2015-02-19 17:41 ` Ross Zwisler 0 siblings, 0 replies; 20+ messages in thread From: Ross Zwisler @ 2015-02-19 17:41 UTC (permalink / raw) To: Borislav Petkov Cc: Ingo Molnar, hpa, torvalds, tglx, hpa, linux-kernel, bp, linux-tip-commits On Thu, 2015-02-19 at 18:33 +0100, Borislav Petkov wrote: > On Thu, Feb 19, 2015 at 10:21:53AM -0700, Ross Zwisler wrote: > > Interesting, it looks like I need to include <asm/nops.h> explicitly for > > UML. New patch on the way. > > You'd need to do an incremental fix ontop, though. Oh, instead of just sending out a new patch that does the include? Sorry, didn't see this before I sent out v4 of the patch that added the include - Ingo, if you'd rather have a separate patch that adds the include to fix the compile error, please let me know & I can send one out. ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v3 2/2] x86: Add support for the clwb instruction
  2015-01-27 16:53 [PATCH v3 0/2] add support for new persistent memory instructions Ross Zwisler
  2015-01-27 16:53 ` [PATCH v3 1/2] x86: Add support for the pcommit instruction Ross Zwisler
@ 2015-01-27 16:53 ` Ross Zwisler
  2015-01-28 10:58   ` Borislav Petkov
                     ` (3 more replies)
  2015-02-05 16:24 ` [PATCH v3 0/2] add support for new persistent memory instructions Ross Zwisler
  2 siblings, 4 replies; 20+ messages in thread
From: Ross Zwisler @ 2015-01-27 16:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ross Zwisler, H Peter Anvin, Ingo Molnar, Thomas Gleixner, Borislav Petkov

Add support for the new clwb (cache line write back) instruction. This
instruction was announced in the document "Intel Architecture
Instruction Set Extensions Programming Reference" with reference number
319433-022.

https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf

The clwb instruction is used to write back the contents of dirtied cache
lines to memory without evicting the cache lines from the processor's
cache hierarchy. This should be used in favor of clflushopt or clflush
in cases where you require the cache line to be written to memory but
plan to access the data again in the near future.

One of the main use cases for this is with persistent memory where clwb
can be used with pcommit to ensure that data has been accepted to memory
and is durable on the DIMM.

This function shows how to properly use clwb/clflushopt/clflush and
pcommit with appropriate fencing:

void flush_and_commit_buffer(void *vaddr, unsigned int size)
{
	void *vend = vaddr + size - 1;

	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
		clwb(vaddr);

	/* Flush any possible final partial cacheline */
	clwb(vend);

	/*
	 * sfence to order clwb/clflushopt/clflush cache flushes
	 * mfence via mb() also works
	 */
	wmb();

	/* pcommit and the required sfence for ordering */
	pcommit_sfence();
}

After this function completes the data pointed to by vaddr has been
accepted to memory and will be durable if the vaddr points to
persistent memory.

Regarding the details of how the alternatives assembly is set up, we
need one additional byte at the beginning of the clflush so that we can
flip it into a clflushopt by changing that byte into a 0x66 prefix. Two
options are to either insert a 1 byte ASM_NOP1, or to add a 1 byte
NOP_DS_PREFIX. Both have no functional effect with the plain clflush,
but I've been told that executing a clflush + prefix should be faster
than executing a clflush + NOP.

We had to hard code the assembly for clwb because, lacking the ability
to assemble the clwb instruction itself, the next closest thing is to
have an xsaveopt instruction with a 0x66 prefix. Unfortunately xsaveopt
itself is also relatively new, and isn't included by all the GCC
versions that the kernel needs to support.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: H Peter Anvin <h.peter.anvin@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
---
 arch/x86/include/asm/cpufeature.h    |  1 +
 arch/x86/include/asm/special_insns.h | 14 ++++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index dfdd689..dc91747 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -222,6 +222,7 @@
 #define X86_FEATURE_SMAP	( 9*32+20) /* Supervisor Mode Access Prevention */
 #define X86_FEATURE_PCOMMIT	( 9*32+22) /* PCOMMIT instruction */
 #define X86_FEATURE_CLFLUSHOPT	( 9*32+23) /* CLFLUSHOPT instruction */
+#define X86_FEATURE_CLWB	( 9*32+24) /* CLWB instruction */
 #define X86_FEATURE_AVX512PF	( 9*32+26) /* AVX-512 Prefetch */
 #define X86_FEATURE_AVX512ER	( 9*32+27) /* AVX-512 Exponential and Reciprocal */
 #define X86_FEATURE_AVX512CD	( 9*32+28) /* AVX-512 Conflict Detection */
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index d686f9b..0772365 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -199,6 +199,20 @@ static inline void clflushopt(volatile void *__p)
 		       "+m" (*(volatile char __force *)__p));
 }
 
+static inline void clwb(volatile void *__p)
+{
+	volatile struct { char x[64]; } *p = __p;
+
+	asm volatile(ALTERNATIVE_2(
+		".byte " __stringify(NOP_DS_PREFIX) "; clflush (%[pax])",
+		".byte 0x66; clflush (%[pax])", /* clflushopt (%%rax) */
+		X86_FEATURE_CLFLUSHOPT,
+		".byte 0x66, 0x0f, 0xae, 0x30",  /* clwb (%%rax) */
+		X86_FEATURE_CLWB)
+		: [p] "+m" (*p)
+		: [pax] "a" (p));
+}
+
 static inline void pcommit_sfence(void)
 {
 	alternative(ASM_NOP7,
-- 
1.9.3

^ permalink raw reply [flat|nested] 20+ messages in thread
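The byte-level relationship between these instructions can be checked with a small userspace sketch. It rests on assumptions that are not stated in the patch itself: that NOP_DS_PREFIX is the DS segment override byte 0x3e, that clflush (%rax) assembles to 0f ae 38, and that sfence is 0f ae f8. The array names and the modrm_reg()/modrm_mod() helpers are hypothetical. Under those assumptions all three clwb() alternatives are 4 bytes long and differ only in the prefix byte and the ModRM reg field (/7 for clflush and clflushopt, /6 for clwb, the same opcode extension xsaveopt uses), and pcommit followed by sfence is 7 bytes, which matches the ASM_NOP7 placeholder.

```c
#include <assert.h>
#include <stdint.h>

/* Encodings with a (%rax) memory operand; see the assumptions in the lead-in. */
static const uint8_t enc_clflush_ds[] = { 0x3e, 0x0f, 0xae, 0x38 }; /* DS prefix + clflush */
static const uint8_t enc_clflushopt[] = { 0x66, 0x0f, 0xae, 0x38 }; /* 0x66 + clflush */
static const uint8_t enc_clwb[]       = { 0x66, 0x0f, 0xae, 0x30 }; /* 0x66 + 0f ae /6 */
static const uint8_t enc_pcommit[]    = { 0x66, 0x0f, 0xae, 0xf8 }; /* bytes from the patch */
static const uint8_t enc_sfence[]     = { 0x0f, 0xae, 0xf8 };

/* ModRM layout: mod (bits 7-6), reg (bits 5-3), rm (bits 2-0). */
static unsigned int modrm_mod(uint8_t modrm)
{
	return modrm >> 6;
}

static unsigned int modrm_reg(uint8_t modrm)
{
	return (modrm >> 3) & 0x7;
}

/* Total length of the pcommit + sfence sequence that replaces ASM_NOP7. */
static unsigned int pcommit_sfence_len(void)
{
	return (unsigned int)(sizeof(enc_pcommit) + sizeof(enc_sfence));
}
```

Keeping every alternative the same length as the instruction it replaces means the patched site needs no extra padding, which is consistent with the one-byte prefix placed in front of the plain clflush.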
* Re: [PATCH v3 2/2] x86: Add support for the clwb instruction
  2015-01-27 16:53 ` [PATCH v3 2/2] x86: Add support for the clwb instruction Ross Zwisler
@ 2015-01-28 10:58   ` Borislav Petkov
  2015-02-11 22:25   ` H. Peter Anvin
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 20+ messages in thread
From: Borislav Petkov @ 2015-01-28 10:58 UTC (permalink / raw)
  To: Ross Zwisler; +Cc: linux-kernel, H Peter Anvin, Ingo Molnar, Thomas Gleixner

On Tue, Jan 27, 2015 at 09:53:51AM -0700, Ross Zwisler wrote:
> Add support for the new clwb (cache line write back) instruction. This
> instruction was announced in the document "Intel Architecture
> Instruction Set Extensions Programming Reference" with reference number
> 319433-022.

...

> After this function completes the data pointed to by vaddr has been
> accepted to memory and will be durable if the vaddr points to
> persistent memory.
>
> Regarding the details of how the alternatives assembly is set up, we
> need one additional byte at the beginning of the clflush so that we can
> flip it into a clflushopt by changing that byte into a 0x66 prefix. Two
> options are to either insert a 1 byte ASM_NOP1, or to add a 1 byte
> NOP_DS_PREFIX. Both have no functional effect with the plain clflush,
> but I've been told that executing a clflush + prefix should be faster
> than executing a clflush + NOP.
>
> We had to hard code the assembly for clwb because, lacking the ability
> to assemble the clwb instruction itself, the next closest thing is to
> have an xsaveopt instruction with a 0x66 prefix. Unfortunately xsaveopt
> itself is also relatively new, and isn't included by all the GCC
> versions that the kernel needs to support.
>
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> Cc: H Peter Anvin <h.peter.anvin@intel.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Borislav Petkov <bp@alien8.de>

Acked-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v3 2/2] x86: Add support for the clwb instruction 2015-01-27 16:53 ` [PATCH v3 2/2] x86: Add support for the clwb instruction Ross Zwisler 2015-01-28 10:58 ` Borislav Petkov @ 2015-02-11 22:25 ` H. Peter Anvin 2015-02-19 0:29 ` [tip:x86/asm] " tip-bot for Ross Zwisler 2015-04-03 5:10 ` [tip:x86/asm] x86/asm: Add support for the CLWB instruction tip-bot for Ross Zwisler 3 siblings, 0 replies; 20+ messages in thread From: H. Peter Anvin @ 2015-02-11 22:25 UTC (permalink / raw) To: Ross Zwisler, linux-kernel; +Cc: Ingo Molnar, Thomas Gleixner, Borislav Petkov On 01/27/2015 08:53 AM, Ross Zwisler wrote: > Add support for the new clwb (cache line write back) instruction. This > instruction was announced in the document "Intel Architecture > Instruction Set Extensions Programming Reference" with reference number > 319433-022. > > https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf > > The clwb instruction is used to write back the contents of dirtied cache > lines to memory without evicting the cache lines from the processor's > cache hierarchy. This should be used in favor of clflushopt or clflush > in cases where you require the cache line to be written to memory but > plan to access the data again in the near future. > > One of the main use cases for this is with persistent memory where clwb > can be used with pcommit to ensure that data has been accepted to memory > and is durable on the DIMM. Acked-by: H. Peter Anvin <hpa@linux.intel.com> ^ permalink raw reply [flat|nested] 20+ messages in thread
* [tip:x86/asm] x86: Add support for the clwb instruction
  2015-01-27 16:53 ` [PATCH v3 2/2] x86: Add support for the clwb instruction Ross Zwisler
  2015-01-28 10:58   ` Borislav Petkov
  2015-02-11 22:25   ` H. Peter Anvin
@ 2015-02-19  0:29   ` tip-bot for Ross Zwisler
  2015-04-02 20:31     ` Ross Zwisler
  2015-04-03  5:10   ` [tip:x86/asm] x86/asm: Add support for the CLWB instruction tip-bot for Ross Zwisler
  3 siblings, 1 reply; 20+ messages in thread
From: tip-bot for Ross Zwisler @ 2015-02-19  0:29 UTC
  To: linux-tip-commits
  Cc: hpa, hpa, mingo, ross.zwisler, torvalds, linux-kernel, bp, tglx

Commit-ID:  3b68983dc66c61da3ab4191b891084a7ab09e3e1
Gitweb:     http://git.kernel.org/tip/3b68983dc66c61da3ab4191b891084a7ab09e3e1
Author:     Ross Zwisler <ross.zwisler@linux.intel.com>
AuthorDate: Tue, 27 Jan 2015 09:53:51 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 19 Feb 2015 00:06:38 +0100

x86: Add support for the clwb instruction

Add support for the new clwb (cache line write back) instruction. This
instruction was announced in the document "Intel Architecture
Instruction Set Extensions Programming Reference" with reference number
319433-022.

https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf

The clwb instruction is used to write back the contents of dirtied cache
lines to memory without evicting the cache lines from the processor's
cache hierarchy. This should be used in favor of clflushopt or clflush
in cases where you require the cache line to be written to memory but
plan to access the data again in the near future.

One of the main use cases for this is with persistent memory where clwb
can be used with pcommit to ensure that data has been accepted to memory
and is durable on the DIMM.

This function shows how to properly use clwb/clflushopt/clflush and
pcommit with appropriate fencing:

    void flush_and_commit_buffer(void *vaddr, unsigned int size)
    {
	void *vend = vaddr + size - 1;

	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
		clwb(vaddr);

	/* Flush any possible final partial cacheline */
	clwb(vend);

	/*
	 * sfence to order clwb/clflushopt/clflush cache flushes
	 * mfence via mb() also works
	 */
	wmb();

	/* pcommit and the required sfence for ordering */
	pcommit_sfence();
    }

After this function completes the data pointed to by vaddr has been
accepted to memory and will be durable if the vaddr points to
persistent memory.

Regarding the details of how the alternatives assembly is set up, we
need one additional byte at the beginning of the clflush so that we can
flip it into a clflushopt by changing that byte into a 0x66 prefix. Two
options are to either insert a 1 byte ASM_NOP1, or to add a 1 byte
NOP_DS_PREFIX. Both have no functional effect with the plain clflush,
but I've been told that executing a clflush + prefix should be faster
than executing a clflush + NOP.

We had to hard code the assembly for clwb because, lacking the ability
to assemble the clwb instruction itself, the next closest thing is to
have an xsaveopt instruction with a 0x66 prefix. Unfortunately xsaveopt
itself is also relatively new, and isn't included by all the GCC
versions that the kernel needs to support.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1422377631-8986-3-git-send-email-ross.zwisler@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/cpufeature.h    |  1 +
 arch/x86/include/asm/special_insns.h | 14 ++++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index d6428ea..bc96e78 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -233,6 +233,7 @@
 #define X86_FEATURE_SMAP	( 9*32+20) /* Supervisor Mode Access Prevention */
 #define X86_FEATURE_PCOMMIT	( 9*32+22) /* PCOMMIT instruction */
 #define X86_FEATURE_CLFLUSHOPT	( 9*32+23) /* CLFLUSHOPT instruction */
+#define X86_FEATURE_CLWB	( 9*32+24) /* CLWB instruction */
 #define X86_FEATURE_AVX512PF	( 9*32+26) /* AVX-512 Prefetch */
 #define X86_FEATURE_AVX512ER	( 9*32+27) /* AVX-512 Exponential and Reciprocal */
 #define X86_FEATURE_AVX512CD	( 9*32+28) /* AVX-512 Conflict Detection */
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index d686f9b..0772365 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -199,6 +199,20 @@ static inline void clflushopt(volatile void *__p)
 		       "+m" (*(volatile char __force *)__p));
 }
 
+static inline void clwb(volatile void *__p)
+{
+	volatile struct { char x[64]; } *p = __p;
+
+	asm volatile(ALTERNATIVE_2(
+		".byte " __stringify(NOP_DS_PREFIX) "; clflush (%[pax])",
+		".byte 0x66; clflush (%[pax])", /* clflushopt (%%rax) */
+		X86_FEATURE_CLFLUSHOPT,
+		".byte 0x66, 0x0f, 0xae, 0x30", /* clwb (%%rax) */
+		X86_FEATURE_CLWB)
+		: [p] "+m" (*p)
+		: [pax] "a" (p));
+}
+
 static inline void pcommit_sfence(void)
 {
 	alternative(ASM_NOP7,
* Re: [tip:x86/asm] x86: Add support for the clwb instruction
  2015-02-19  0:29 ` [tip:x86/asm] " tip-bot for Ross Zwisler
@ 2015-04-02 20:31   ` Ross Zwisler
  2015-04-03  5:04     ` Ingo Molnar
  0 siblings, 1 reply; 20+ messages in thread
From: Ross Zwisler @ 2015-04-02 20:31 UTC
  To: tglx, torvalds, linux-kernel, bp, mingo, hpa, hpa; +Cc: linux-tip-commits

On Wed, 2015-02-18 at 16:29 -0800, tip-bot for Ross Zwisler wrote:
> Commit-ID:  3b68983dc66c61da3ab4191b891084a7ab09e3e1
> Gitweb:     http://git.kernel.org/tip/3b68983dc66c61da3ab4191b891084a7ab09e3e1
> Author:     Ross Zwisler <ross.zwisler@linux.intel.com>
> AuthorDate: Tue, 27 Jan 2015 09:53:51 -0700
> Committer:  Ingo Molnar <mingo@kernel.org>
> CommitDate: Thu, 19 Feb 2015 00:06:38 +0100
>
> x86: Add support for the clwb instruction
>
> Add support for the new clwb (cache line write back)
> instruction. This instruction was announced in the document
> "Intel Architecture Instruction Set Extensions Programming
> Reference" with reference number 319433-022.
>
> https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
>
> The clwb instruction is used to write back the contents of
> dirtied cache lines to memory without evicting the cache lines
> from the processor's cache hierarchy. This should be used in
> favor of clflushopt or clflush in cases where you require the
> cache line to be written to memory but plan to access the data
> again in the near future.
>
> One of the main use cases for this is with persistent memory
> where clwb can be used with pcommit to ensure that data has been
> accepted to memory and is durable on the DIMM.
>
> This function shows how to properly use clwb/clflushopt/clflush
> and pcommit with appropriate fencing:
>
>     void flush_and_commit_buffer(void *vaddr, unsigned int size)
>     {
> 	void *vend = vaddr + size - 1;
>
> 	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
> 		clwb(vaddr);
>
> 	/* Flush any possible final partial cacheline */
> 	clwb(vend);
>
> 	/*
> 	 * sfence to order clwb/clflushopt/clflush cache flushes
> 	 * mfence via mb() also works
> 	 */
> 	wmb();
>
> 	/* pcommit and the required sfence for ordering */
> 	pcommit_sfence();
>     }
>
> After this function completes the data pointed to by vaddr
> has been accepted to memory and will be durable if the vaddr
> points to persistent memory.
>
> Regarding the details of how the alternatives assembly is set
> up, we need one additional byte at the beginning of the clflush
> so that we can flip it into a clflushopt by changing that byte
> into a 0x66 prefix. Two options are to either insert a 1 byte
> ASM_NOP1, or to add a 1 byte NOP_DS_PREFIX. Both have no
> functional effect with the plain clflush, but I've been told
> that executing a clflush + prefix should be faster than
> executing a clflush + NOP.
>
> We had to hard code the assembly for clwb because, lacking the
> ability to assemble the clwb instruction itself, the next
> closest thing is to have an xsaveopt instruction with a 0x66
> prefix. Unfortunately xsaveopt itself is also relatively new,
> and isn't included by all the GCC versions that the kernel needs
> to support.
>
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> Acked-by: Borislav Petkov <bp@suse.de>
> Acked-by: H. Peter Anvin <hpa@linux.intel.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Link: http://lkml.kernel.org/r/1422377631-8986-3-git-send-email-ross.zwisler@linux.intel.com
> Signed-off-by: Ingo Molnar <mingo@kernel.org>

Ping on this patch - it looks like the pcommit patch is in the tip tree,
but this one is missing?

I'm looking at the tree as of:
9a760fbbdc7 "Merge branch 'tools/kvm'"
* Re: [tip:x86/asm] x86: Add support for the clwb instruction
  2015-04-02 20:31 ` Ross Zwisler
@ 2015-04-03  5:04   ` Ingo Molnar
  0 siblings, 0 replies; 20+ messages in thread
From: Ingo Molnar @ 2015-04-03  5:04 UTC
  To: Ross Zwisler
  Cc: tglx, torvalds, linux-kernel, bp, hpa, hpa, linux-tip-commits

* Ross Zwisler <ross.zwisler@linux.intel.com> wrote:

> On Wed, 2015-02-18 at 16:29 -0800, tip-bot for Ross Zwisler wrote:
> > Commit-ID:  3b68983dc66c61da3ab4191b891084a7ab09e3e1
> > Gitweb:     http://git.kernel.org/tip/3b68983dc66c61da3ab4191b891084a7ab09e3e1
> > Author:     Ross Zwisler <ross.zwisler@linux.intel.com>
> > AuthorDate: Tue, 27 Jan 2015 09:53:51 -0700
> > Committer:  Ingo Molnar <mingo@kernel.org>
> > CommitDate: Thu, 19 Feb 2015 00:06:38 +0100
> >
> > x86: Add support for the clwb instruction
> >
> > Add support for the new clwb (cache line write back)
> > instruction. This instruction was announced in the document
> > "Intel Architecture Instruction Set Extensions Programming
> > Reference" with reference number 319433-022.
> >
> > https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
> >
> > The clwb instruction is used to write back the contents of
> > dirtied cache lines to memory without evicting the cache lines
> > from the processor's cache hierarchy. This should be used in
> > favor of clflushopt or clflush in cases where you require the
> > cache line to be written to memory but plan to access the data
> > again in the near future.
> >
> > One of the main use cases for this is with persistent memory
> > where clwb can be used with pcommit to ensure that data has been
> > accepted to memory and is durable on the DIMM.
> >
> > This function shows how to properly use clwb/clflushopt/clflush
> > and pcommit with appropriate fencing:
> >
> >     void flush_and_commit_buffer(void *vaddr, unsigned int size)
> >     {
> > 	void *vend = vaddr + size - 1;
> >
> > 	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
> > 		clwb(vaddr);
> >
> > 	/* Flush any possible final partial cacheline */
> > 	clwb(vend);
> >
> > 	/*
> > 	 * sfence to order clwb/clflushopt/clflush cache flushes
> > 	 * mfence via mb() also works
> > 	 */
> > 	wmb();
> >
> > 	/* pcommit and the required sfence for ordering */
> > 	pcommit_sfence();
> >     }
> >
> > After this function completes the data pointed to by vaddr
> > has been accepted to memory and will be durable if the vaddr
> > points to persistent memory.
> >
> > Regarding the details of how the alternatives assembly is set
> > up, we need one additional byte at the beginning of the clflush
> > so that we can flip it into a clflushopt by changing that byte
> > into a 0x66 prefix. Two options are to either insert a 1 byte
> > ASM_NOP1, or to add a 1 byte NOP_DS_PREFIX. Both have no
> > functional effect with the plain clflush, but I've been told
> > that executing a clflush + prefix should be faster than
> > executing a clflush + NOP.
> >
> > We had to hard code the assembly for clwb because, lacking the
> > ability to assemble the clwb instruction itself, the next
> > closest thing is to have an xsaveopt instruction with a 0x66
> > prefix. Unfortunately xsaveopt itself is also relatively new,
> > and isn't included by all the GCC versions that the kernel needs
> > to support.
> >
> > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> > Acked-by: Borislav Petkov <bp@suse.de>
> > Acked-by: H. Peter Anvin <hpa@linux.intel.com>
> > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Link: http://lkml.kernel.org/r/1422377631-8986-3-git-send-email-ross.zwisler@linux.intel.com
> > Signed-off-by: Ingo Molnar <mingo@kernel.org>
>
> Ping on this patch - it looks like the pcommit patch is in the tip tree,
> but this one is missing?

Yeah, I applied it initially, then had some reservations about it -
but those are now resolved so I've applied it to tip:x86/asm again.

Thanks,

	Ingo
* [tip:x86/asm] x86/asm: Add support for the CLWB instruction
  2015-01-27 16:53 ` [PATCH v3 2/2] x86: Add support for the clwb instruction Ross Zwisler
                     ` (2 preceding siblings ...)
  2015-02-19  0:29   ` [tip:x86/asm] " tip-bot for Ross Zwisler
@ 2015-04-03  5:10   ` tip-bot for Ross Zwisler
  3 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Ross Zwisler @ 2015-04-03  5:10 UTC
  To: linux-tip-commits
  Cc: luto, ross.zwisler, linux-kernel, bp, brgerst, bp, mingo, hpa, hpa,
	torvalds, dvlasenk, tglx

Commit-ID:  d9dc64f30abe42f71bc7e9eb9d38c41006cf39f9
Gitweb:     http://git.kernel.org/tip/d9dc64f30abe42f71bc7e9eb9d38c41006cf39f9
Author:     Ross Zwisler <ross.zwisler@linux.intel.com>
AuthorDate: Tue, 27 Jan 2015 09:53:51 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 3 Apr 2015 06:56:38 +0200

x86/asm: Add support for the CLWB instruction

Add support for the new CLWB (cache line write back) instruction. This
instruction was announced in the document "Intel Architecture
Instruction Set Extensions Programming Reference" with reference number
319433-022.

https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf

The CLWB instruction is used to write back the contents of dirtied cache
lines to memory without evicting the cache lines from the processor's
cache hierarchy. This should be used in favor of clflushopt or clflush
in cases where you require the cache line to be written to memory but
plan to access the data again in the near future.

One of the main use cases for this is with persistent memory where CLWB
can be used with PCOMMIT to ensure that data has been accepted to memory
and is durable on the DIMM.

This function shows how to properly use CLWB/CLFLUSHOPT/CLFLUSH and
PCOMMIT with appropriate fencing:

    void flush_and_commit_buffer(void *vaddr, unsigned int size)
    {
	void *vend = vaddr + size - 1;

	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
		clwb(vaddr);

	/* Flush any possible final partial cacheline */
	clwb(vend);

	/*
	 * Use SFENCE to order CLWB/CLFLUSHOPT/CLFLUSH cache flushes.
	 * (MFENCE via mb() also works)
	 */
	wmb();

	/* PCOMMIT and the required SFENCE for ordering */
	pcommit_sfence();
    }

After this function completes the data pointed to by vaddr has been
accepted to memory and will be durable if the vaddr points to
persistent memory.

Regarding the details of how the alternatives assembly is set up, we
need one additional byte at the beginning of the CLFLUSH so that we can
flip it into a CLFLUSHOPT by changing that byte into a 0x66 prefix. Two
options are to either insert a 1 byte ASM_NOP1, or to add a 1 byte
NOP_DS_PREFIX. Both have no functional effect with the plain CLFLUSH,
but I've been told that executing a CLFLUSH + prefix should be faster
than executing a CLFLUSH + NOP.

We had to hard code the assembly for CLWB because, lacking the ability
to assemble the CLWB instruction itself, the next closest thing is to
have an xsaveopt instruction with a 0x66 prefix. Unfortunately XSAVEOPT
itself is also relatively new, and isn't included by all the GCC
versions that the kernel needs to support.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1422377631-8986-3-git-send-email-ross.zwisler@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/cpufeature.h    |  1 +
 arch/x86/include/asm/special_insns.h | 14 ++++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 0f7a5a1..854c04b 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -233,6 +233,7 @@
 #define X86_FEATURE_SMAP	( 9*32+20) /* Supervisor Mode Access Prevention */
 #define X86_FEATURE_PCOMMIT	( 9*32+22) /* PCOMMIT instruction */
 #define X86_FEATURE_CLFLUSHOPT	( 9*32+23) /* CLFLUSHOPT instruction */
+#define X86_FEATURE_CLWB	( 9*32+24) /* CLWB instruction */
 #define X86_FEATURE_AVX512PF	( 9*32+26) /* AVX-512 Prefetch */
 #define X86_FEATURE_AVX512ER	( 9*32+27) /* AVX-512 Exponential and Reciprocal */
 #define X86_FEATURE_AVX512CD	( 9*32+28) /* AVX-512 Conflict Detection */
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 2ec1a53..aeb4666e 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -201,6 +201,20 @@ static inline void clflushopt(volatile void *__p)
 		       "+m" (*(volatile char __force *)__p));
 }
 
+static inline void clwb(volatile void *__p)
+{
+	volatile struct { char x[64]; } *p = __p;
+
+	asm volatile(ALTERNATIVE_2(
+		".byte " __stringify(NOP_DS_PREFIX) "; clflush (%[pax])",
+		".byte 0x66; clflush (%[pax])", /* clflushopt (%%rax) */
+		X86_FEATURE_CLFLUSHOPT,
+		".byte 0x66, 0x0f, 0xae, 0x30", /* clwb (%%rax) */
+		X86_FEATURE_CLWB)
+		: [p] "+m" (*p)
+		: [pax] "a" (p));
+}
+
 static inline void pcommit_sfence(void)
 {
 	alternative(ASM_NOP7,
* Re: [PATCH v3 0/2] add support for new persistent memory instructions
  2015-01-27 16:53 [PATCH v3 0/2] add support for new persistent memory instructions Ross Zwisler
  2015-01-27 16:53 ` [PATCH v3 1/2] x86: Add support for the pcommit instruction Ross Zwisler
  2015-01-27 16:53 ` [PATCH v3 2/2] x86: Add support for the clwb instruction Ross Zwisler
@ 2015-02-05 16:24 ` Ross Zwisler
  2 siblings, 0 replies; 20+ messages in thread
From: Ross Zwisler @ 2015-02-05 16:24 UTC
  To: linux-kernel; +Cc: H Peter Anvin, Ingo Molnar, Thomas Gleixner, Borislav Petkov

On Tue, 2015-01-27 at 09:53 -0700, Ross Zwisler wrote:
> This patch set adds support for two new persistent memory instructions, pcommit
> and clwb. These instructions were announced in the document "Intel
> Architecture Instruction Set Extensions Programming Reference" with reference
> number 319433-022.
>
> https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
>
> These patches apply cleanly to v3.19-rc6.
>
> Changes from v2:
>
>  - Added instruction descriptions and flows to the patch descriptions.
>  - Added needed sfence to pcommit alternatives assembly. The inline function
>    is now called pcommit_sfence(). If pcommit is not supported on the platform
>    both the pcommit and the sfence will be nops.
>
> Cc: H Peter Anvin <h.peter.anvin@intel.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Borislav Petkov <bp@alien8.de>
>
> Ross Zwisler (2):
>   x86: Add support for the pcommit instruction
>   x86: Add support for the clwb instruction
>
>  arch/x86/include/asm/cpufeature.h    |  2 ++
>  arch/x86/include/asm/special_insns.h | 22 ++++++++++++++++++++++
>  2 files changed, 24 insertions(+)

Ping? :)
end of thread, other threads:[~2015-04-03 5:11 UTC]

Thread overview: 20+ messages
2015-01-27 16:53 [PATCH v3 0/2] add support for new persistent memory instructions Ross Zwisler
2015-01-27 16:53 ` [PATCH v3 1/2] x86: Add support for the pcommit instruction Ross Zwisler
2015-01-28 10:58 ` Borislav Petkov
2015-01-28 17:10 ` Elliott, Robert (Server Storage)
2015-01-28 17:21 ` Borislav Petkov
2015-01-28 17:27 ` Ross Zwisler
2015-02-11 22:24 ` H. Peter Anvin
2015-02-19 0:29 ` [tip:x86/asm] " tip-bot for Ross Zwisler
2015-02-19 1:15 ` Ingo Molnar
2015-02-19 17:21 ` Ross Zwisler
2015-02-19 17:33 ` Borislav Petkov
2015-02-19 17:41 ` Ross Zwisler
2015-01-27 16:53 ` [PATCH v3 2/2] x86: Add support for the clwb instruction Ross Zwisler
2015-01-28 10:58 ` Borislav Petkov
2015-02-11 22:25 ` H. Peter Anvin
2015-02-19 0:29 ` [tip:x86/asm] " tip-bot for Ross Zwisler
2015-04-02 20:31 ` Ross Zwisler
2015-04-03 5:04 ` Ingo Molnar
2015-04-03 5:10 ` [tip:x86/asm] x86/asm: Add support for the CLWB instruction tip-bot for Ross Zwisler
2015-02-05 16:24 ` [PATCH v3 0/2] add support for new persistent memory instructions Ross Zwisler