LKML Archive on lore.kernel.org
* [PATCH v3 0/2] add support for new persistent memory instructions
@ 2015-01-27 16:53 Ross Zwisler
  2015-01-27 16:53 ` [PATCH v3 1/2] x86: Add support for the pcommit instruction Ross Zwisler
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Ross Zwisler @ 2015-01-27 16:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ross Zwisler, H Peter Anvin, Ingo Molnar, Thomas Gleixner,
	Borislav Petkov

This patch set adds support for two new persistent memory instructions, pcommit
and clwb.  These instructions were announced in the document "Intel
Architecture Instruction Set Extensions Programming Reference" with reference
number 319433-022.

https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf

These patches apply cleanly to v3.19-rc6.

Changes from v2:

 - Added instruction descriptions and flows to the patch descriptions.
 - Added needed sfence to pcommit alternatives assembly.  The inline function
   is now called pcommit_sfence().  If pcommit is not supported on the platform
   both the pcommit and the sfence will be nops.

Cc: H Peter Anvin <h.peter.anvin@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>

Ross Zwisler (2):
  x86: Add support for the pcommit instruction
  x86: Add support for the clwb instruction

 arch/x86/include/asm/cpufeature.h    |  2 ++
 arch/x86/include/asm/special_insns.h | 22 ++++++++++++++++++++++
 2 files changed, 24 insertions(+)

-- 
1.9.3



* [PATCH v3 1/2] x86: Add support for the pcommit instruction
  2015-01-27 16:53 [PATCH v3 0/2] add support for new persistent memory instructions Ross Zwisler
@ 2015-01-27 16:53 ` Ross Zwisler
  2015-01-28 10:58   ` Borislav Petkov
                     ` (3 more replies)
  2015-01-27 16:53 ` [PATCH v3 2/2] x86: Add support for the clwb instruction Ross Zwisler
  2015-02-05 16:24 ` [PATCH v3 0/2] add support for new persistent memory instructions Ross Zwisler
  2 siblings, 4 replies; 20+ messages in thread
From: Ross Zwisler @ 2015-01-27 16:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ross Zwisler, H Peter Anvin, Ingo Molnar, Thomas Gleixner,
	Borislav Petkov

Add support for the new pcommit (persistent commit) instruction.  This
instruction was announced in the document "Intel Architecture
Instruction Set Extensions Programming Reference" with reference number
319433-022.

https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf

The pcommit instruction ensures that data that has been flushed from the
processor's cache hierarchy with clwb, clflushopt or clflush is accepted to
memory and is durable on the DIMM.  The primary use case for this is persistent
memory.

This function shows how to properly use clwb/clflushopt/clflush and
pcommit with appropriate fencing:

void flush_and_commit_buffer(void *vaddr, unsigned int size)
{
	void *vend = vaddr + size - 1;

	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
		clwb(vaddr);

	/* Flush any possible final partial cacheline */
	clwb(vend);

	/*
	 * sfence to order clwb/clflushopt/clflush cache flushes
	 * mfence via mb() also works
	 */
	wmb();

	/* pcommit and the required sfence for ordering */
	pcommit_sfence();
}

After this function completes, the data pointed to by vaddr has been
accepted to memory and will be durable if vaddr points to persistent
memory.
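
To make the role of the trailing clwb(vend) concrete, here is a minimal
userspace model of the loop above.  It is only a sketch and not part of the
patch: it records which cache lines would be written back instead of
executing clwb, it assumes a 64-byte line in place of
boot_cpu_data.x86_clflush_size, and the names model_flush_lines() and
MODEL_LINE_SIZE are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_LINE_SIZE 64u	/* assumed cache line size for this model */

/*
 * Model of the loop in flush_and_commit_buffer(): rather than executing
 * clwb, store the base address of each cache line that would be written
 * back.  Returns the number of clwb calls the loop would issue.
 * Assumes size > 0, as the original loop does.
 */
size_t model_flush_lines(uintptr_t vaddr, size_t size,
			 uintptr_t *lines, size_t max_lines)
{
	const uintptr_t line_mask = ~(uintptr_t)(MODEL_LINE_SIZE - 1);
	uintptr_t vend = vaddr + size - 1;
	size_t n = 0;

	for (; vaddr < vend; vaddr += MODEL_LINE_SIZE)
		if (n < max_lines)
			lines[n++] = vaddr & line_mask;

	/* Stands in for clwb(vend): covers a final partial cache line. */
	if (n < max_lines)
		lines[n++] = vend & line_mask;

	return n;
}

/* Unaligned 64-byte buffer: it straddles two lines, so two are flushed. */
void model_flush_selfcheck(void)
{
	uintptr_t lines[4];

	assert(model_flush_lines(0x1030, 0x40, lines, 4) == 2);
	assert(lines[0] == 0x1000 && lines[1] == 0x1040);
}
```

In this model a 64-byte buffer starting at 0x1030 ends at 0x106f.  The loop
body runs once (line 0x1000), and only the final call picks up line 0x1040.
For an aligned buffer the final call repeats the last line, which is
redundant but harmless.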

Pcommit must always be ordered by an mfence or sfence, so to help
simplify things we include both the pcommit and the required sfence in
the alternatives generated by pcommit_sfence().  The other option is to
keep them separated, but on platforms that don't support pcommit this
would then turn into:

void flush_and_commit_buffer(void *vaddr, unsigned int size)
{
        void *vend = vaddr + size - 1;

        for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
                clwb(vaddr);

        /* Flush any possible final partial cacheline */
        clwb(vend);

        /*
         * sfence to order clwb/clflushopt/clflush cache flushes
         * mfence via mb() also works
         */
        wmb();

        nop(); /* from pcommit(), via alternatives */

        /*
         * sfence to order pcommit
         * mfence via mb() also works
         */
        wmb();
}

This is still correct, but now you've got two fences separated by only a
nop.  With the commit and the fence together in pcommit_sfence() you
avoid the final unneeded fence.
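
As a side note on the sizes involved, the sketch below adds up the bytes of
the replacement sequence used in pcommit_sfence().  It is an illustration
only and not part of the patch.  The pcommit bytes are the ones hard coded
in the patch, the sfence bytes are the standard 0f ae f8 encoding, and the
helper names are hypothetical.  My understanding is that the original
instruction passed to alternative() has to leave room for the replacement,
which is why a 7-byte NOP is used.

```c
#include <assert.h>
#include <stddef.h>

/* pcommit as hard coded in the patch; sfence in its standard encoding. */
static const unsigned char pcommit_bytes[] = { 0x66, 0x0f, 0xae, 0xf8 };
static const unsigned char sfence_bytes[]  = { 0x0f, 0xae, 0xf8 };

/* Total length of the "pcommit; sfence" replacement sequence. */
size_t pcommit_sfence_len(void)
{
	return sizeof(pcommit_bytes) + sizeof(sfence_bytes);
}

/* Returns 1 if pcommit is exactly the sfence encoding behind a 0x66 prefix. */
int pcommit_is_prefixed_sfence(void)
{
	size_t i;

	if (pcommit_bytes[0] != 0x66)
		return 0;
	for (i = 0; i < sizeof(sfence_bytes); i++)
		if (pcommit_bytes[i + 1] != sfence_bytes[i])
			return 0;
	return 1;
}

/* 4 bytes of pcommit plus 3 bytes of sfence fill the 7 bytes of ASM_NOP7. */
void pcommit_sfence_selfcheck(void)
{
	assert(pcommit_sfence_len() == 7);
	assert(pcommit_is_prefixed_sfence());
}
```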

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: H Peter Anvin <h.peter.anvin@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
---
 arch/x86/include/asm/cpufeature.h    | 1 +
 arch/x86/include/asm/special_insns.h | 8 ++++++++
 2 files changed, 9 insertions(+)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index bb9b258..dfdd689 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -220,6 +220,7 @@
 #define X86_FEATURE_RDSEED	( 9*32+18) /* The RDSEED instruction */
 #define X86_FEATURE_ADX		( 9*32+19) /* The ADCX and ADOX instructions */
 #define X86_FEATURE_SMAP	( 9*32+20) /* Supervisor Mode Access Prevention */
+#define X86_FEATURE_PCOMMIT	( 9*32+22) /* PCOMMIT instruction */
 #define X86_FEATURE_CLFLUSHOPT	( 9*32+23) /* CLFLUSHOPT instruction */
 #define X86_FEATURE_AVX512PF	( 9*32+26) /* AVX-512 Prefetch */
 #define X86_FEATURE_AVX512ER	( 9*32+27) /* AVX-512 Exponential and Reciprocal */
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index e820c08..d686f9b 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -199,6 +199,14 @@ static inline void clflushopt(volatile void *__p)
 		       "+m" (*(volatile char __force *)__p));
 }
 
+static inline void pcommit_sfence(void)
+{
+	alternative(ASM_NOP7,
+		    ".byte 0x66, 0x0f, 0xae, 0xf8\n\t" /* pcommit */
+		    "sfence",
+		    X86_FEATURE_PCOMMIT);
+}
+
 #define nop() asm volatile ("nop")
 
 
-- 
1.9.3



* [PATCH v3 2/2] x86: Add support for the clwb instruction
  2015-01-27 16:53 [PATCH v3 0/2] add support for new persistent memory instructions Ross Zwisler
  2015-01-27 16:53 ` [PATCH v3 1/2] x86: Add support for the pcommit instruction Ross Zwisler
@ 2015-01-27 16:53 ` Ross Zwisler
  2015-01-28 10:58   ` Borislav Petkov
                     ` (3 more replies)
  2015-02-05 16:24 ` [PATCH v3 0/2] add support for new persistent memory instructions Ross Zwisler
  2 siblings, 4 replies; 20+ messages in thread
From: Ross Zwisler @ 2015-01-27 16:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ross Zwisler, H Peter Anvin, Ingo Molnar, Thomas Gleixner,
	Borislav Petkov

Add support for the new clwb (cache line write back) instruction.  This
instruction was announced in the document "Intel Architecture
Instruction Set Extensions Programming Reference" with reference number
319433-022.

https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf

The clwb instruction is used to write back the contents of dirtied cache
lines to memory without evicting the cache lines from the processor's
cache hierarchy.  This should be used in favor of clflushopt or clflush
in cases where you require the cache line to be written to memory but
plan to access the data again in the near future.

One of the main use cases for this is with persistent memory where clwb
can be used with pcommit to ensure that data has been accepted to memory
and is durable on the DIMM.

This function shows how to properly use clwb/clflushopt/clflush and
pcommit with appropriate fencing:

void flush_and_commit_buffer(void *vaddr, unsigned int size)
{
	void *vend = vaddr + size - 1;

	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
		clwb(vaddr);

	/* Flush any possible final partial cacheline */
	clwb(vend);

	/*
	 * sfence to order clwb/clflushopt/clflush cache flushes
	 * mfence via mb() also works
	 */
	wmb();

	/* pcommit and the required sfence for ordering */
	pcommit_sfence();
}

After this function completes, the data pointed to by vaddr has been
accepted to memory and will be durable if vaddr points to persistent
memory.

Regarding the details of how the alternatives assembly is set up, we
need one additional byte at the beginning of the clflush so that we can
flip it into a clflushopt by changing that byte into a 0x66 prefix.  Two
options are to either insert a 1 byte ASM_NOP1, or to add a 1 byte
NOP_DS_PREFIX.  Both have no functional effect with the plain clflush,
but I've been told that executing a clflush + prefix should be faster
than executing a clflush + NOP.

We had to hard code the assembly for clwb because, lacking the ability
to assemble the clwb instruction itself, the next closest thing is to
have an xsaveopt instruction with a 0x66 prefix.  Unfortunately xsaveopt
itself is also relatively new, and isn't included by all the GCC
versions that the kernel needs to support.
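
To illustrate the byte flipping described above, the sketch below lays out
the three 4-byte sequences for a (%rax) operand.  It is an illustration
only and not part of the patch.  The clflush bytes 0f ae 38 and the 0x3e
value for NOP_DS_PREFIX are what I expect the assembler to emit rather than
something shown in this patch, and the MODEL_ and helper names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NOP_DS_PREFIX 0x3e	/* DS segment override used as padding */
#define MODEL_SEQ_LEN 4

/* All three sequences address memory through (%rax). */
static const unsigned char clflush_seq[MODEL_SEQ_LEN] =
	{ MODEL_NOP_DS_PREFIX, 0x0f, 0xae, 0x38 };
static const unsigned char clflushopt_seq[MODEL_SEQ_LEN] =
	{ 0x66, 0x0f, 0xae, 0x38 };
static const unsigned char clwb_seq[MODEL_SEQ_LEN] =
	{ 0x66, 0x0f, 0xae, 0x30 };

/* Number of byte positions at which two sequences differ. */
size_t seq_diff_count(const unsigned char *a, const unsigned char *b)
{
	size_t i, n = 0;

	for (i = 0; i < MODEL_SEQ_LEN; i++)
		if (a[i] != b[i])
			n++;
	return n;
}

/* ModRM reg field (bits 5:3), which acts as the opcode extension here. */
unsigned int modrm_reg(unsigned char modrm)
{
	return (modrm >> 3) & 7;
}

void clwb_encoding_selfcheck(void)
{
	/* clflush -> clflushopt: only the leading prefix byte changes. */
	assert(seq_diff_count(clflush_seq, clflushopt_seq) == 1);
	assert(clflush_seq[0] != clflushopt_seq[0]);
	/* clflush and clflushopt use /7; clwb uses /6, like xsaveopt. */
	assert(modrm_reg(clflushopt_seq[3]) == 7);
	assert(modrm_reg(clwb_seq[3]) == 6);
}
```

Keeping all three alternatives the same length is what lets the prefix
byte, and for clwb the ModRM byte, be changed in place.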

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: H Peter Anvin <h.peter.anvin@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
---
 arch/x86/include/asm/cpufeature.h    |  1 +
 arch/x86/include/asm/special_insns.h | 14 ++++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index dfdd689..dc91747 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -222,6 +222,7 @@
 #define X86_FEATURE_SMAP	( 9*32+20) /* Supervisor Mode Access Prevention */
 #define X86_FEATURE_PCOMMIT	( 9*32+22) /* PCOMMIT instruction */
 #define X86_FEATURE_CLFLUSHOPT	( 9*32+23) /* CLFLUSHOPT instruction */
+#define X86_FEATURE_CLWB	( 9*32+24) /* CLWB instruction */
 #define X86_FEATURE_AVX512PF	( 9*32+26) /* AVX-512 Prefetch */
 #define X86_FEATURE_AVX512ER	( 9*32+27) /* AVX-512 Exponential and Reciprocal */
 #define X86_FEATURE_AVX512CD	( 9*32+28) /* AVX-512 Conflict Detection */
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index d686f9b..0772365 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -199,6 +199,20 @@ static inline void clflushopt(volatile void *__p)
 		       "+m" (*(volatile char __force *)__p));
 }
 
+static inline void clwb(volatile void *__p)
+{
+	volatile struct { char x[64]; } *p = __p;
+
+	asm volatile(ALTERNATIVE_2(
+		".byte " __stringify(NOP_DS_PREFIX) "; clflush (%[pax])",
+		".byte 0x66; clflush (%[pax])", /* clflushopt (%%rax) */
+		X86_FEATURE_CLFLUSHOPT,
+		".byte 0x66, 0x0f, 0xae, 0x30",  /* clwb (%%rax) */
+		X86_FEATURE_CLWB)
+		: [p] "+m" (*p)
+		: [pax] "a" (p));
+}
+
 static inline void pcommit_sfence(void)
 {
 	alternative(ASM_NOP7,
-- 
1.9.3



* Re: [PATCH v3 1/2] x86: Add support for the pcommit instruction
  2015-01-27 16:53 ` [PATCH v3 1/2] x86: Add support for the pcommit instruction Ross Zwisler
@ 2015-01-28 10:58   ` Borislav Petkov
  2015-01-28 17:10   ` Elliott, Robert (Server Storage)
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 20+ messages in thread
From: Borislav Petkov @ 2015-01-28 10:58 UTC (permalink / raw)
  To: Ross Zwisler; +Cc: linux-kernel, H Peter Anvin, Ingo Molnar, Thomas Gleixner

On Tue, Jan 27, 2015 at 09:53:50AM -0700, Ross Zwisler wrote:
> Add support for the new pcommit (persistent commit) instruction.  This
> instruction was announced in the document "Intel Architecture
> Instruction Set Extensions Programming Reference" with reference number
> 319433-022.
> 
> https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
> 
> The pcommit instruction ensures that data that has been flushed from the
> processor's cache hierarchy with clwb, clflushopt or clflush is accepted to
> memory and is durable on the DIMM.  The primary use case for this is persistent
> memory.
> 
> This function shows how to properly use clwb/clflushopt/clflush and
> pcommit with appropriate fencing:

...

> This is still correct, but now you've got two fences separated by only a
> nop.  With the commit and the fence together in pcommit_sfence() you
> avoid the final unneeded fence.
> 
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> Cc: H Peter Anvin <h.peter.anvin@intel.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Borislav Petkov <bp@alien8.de>

Acked-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--


* Re: [PATCH v3 2/2] x86: Add support for the clwb instruction
  2015-01-27 16:53 ` [PATCH v3 2/2] x86: Add support for the clwb instruction Ross Zwisler
@ 2015-01-28 10:58   ` Borislav Petkov
  2015-02-11 22:25   ` H. Peter Anvin
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 20+ messages in thread
From: Borislav Petkov @ 2015-01-28 10:58 UTC (permalink / raw)
  To: Ross Zwisler; +Cc: linux-kernel, H Peter Anvin, Ingo Molnar, Thomas Gleixner

On Tue, Jan 27, 2015 at 09:53:51AM -0700, Ross Zwisler wrote:
> Add support for the new clwb (cache line write back) instruction.  This
> instruction was announced in the document "Intel Architecture
> Instruction Set Extensions Programming Reference" with reference number
> 319433-022.

...

> After this function completes, the data pointed to by vaddr has been
> accepted to memory and will be durable if vaddr points to persistent
> memory.
> 
> Regarding the details of how the alternatives assembly is set up, we
> need one additional byte at the beginning of the clflush so that we can
> flip it into a clflushopt by changing that byte into a 0x66 prefix.  Two
> options are to either insert a 1 byte ASM_NOP1, or to add a 1 byte
> NOP_DS_PREFIX.  Both have no functional effect with the plain clflush,
> but I've been told that executing a clflush + prefix should be faster
> than executing a clflush + NOP.
> 
> We had to hard code the assembly for clwb because, lacking the ability
> to assemble the clwb instruction itself, the next closest thing is to
> have an xsaveopt instruction with a 0x66 prefix.  Unfortunately xsaveopt
> itself is also relatively new, and isn't included by all the GCC
> versions that the kernel needs to support.
> 
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> Cc: H Peter Anvin <h.peter.anvin@intel.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Borislav Petkov <bp@alien8.de>

Acked-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--


* RE: [PATCH v3 1/2] x86: Add support for the pcommit instruction
  2015-01-27 16:53 ` [PATCH v3 1/2] x86: Add support for the pcommit instruction Ross Zwisler
  2015-01-28 10:58   ` Borislav Petkov
@ 2015-01-28 17:10   ` Elliott, Robert (Server Storage)
  2015-01-28 17:21     ` Borislav Petkov
  2015-02-11 22:24   ` H. Peter Anvin
  2015-02-19  0:29   ` [tip:x86/asm] " tip-bot for Ross Zwisler
  3 siblings, 1 reply; 20+ messages in thread
From: Elliott, Robert (Server Storage) @ 2015-01-28 17:10 UTC (permalink / raw)
  To: Ross Zwisler, linux-kernel
  Cc: H Peter Anvin, Ingo Molnar, Thomas Gleixner, Borislav Petkov,
	Kani, Toshimitsu, Knippers, Linda



> -----Original Message-----
> From: linux-kernel-owner@vger.kernel.org
> [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Ross Zwisler
> Sent: Tuesday, 27 January, 2015 10:54 AM
> To: linux-kernel@vger.kernel.org
> Cc: Ross Zwisler; H Peter Anvin; Ingo Molnar; Thomas Gleixner; Borislav
> Petkov
> Subject: [PATCH v3 1/2] x86: Add support for the pcommit instruction
> 
> Add support for the new pcommit (persistent commit) instruction.  This
> instruction was announced in the document "Intel Architecture
> Instruction Set Extensions Programming Reference" with reference number
> 319433-022.
> 
> https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
> 
...
> ---
>  arch/x86/include/asm/cpufeature.h    | 1 +
>  arch/x86/include/asm/special_insns.h | 8 ++++++++
>  2 files changed, 9 insertions(+)

Should this patch series also add defines for the virtual 
machine control data structure changes?

1. Add the new VM-Execution Controls bit 21 as
    SECONDARY_EXEC_PCOMMIT_EXITING 0x00200000
to arch/x86/include/asm/vmx.h.

2. Add the new exit reason of 65 (0x41) as
	EXIT_REASON_PCOMMIT  65
to arch/x86/include/uapi/asm/vmx.h and (with a
VMX_EXIT_REASONS string) to usr/include/asm/vmx.h.

3. Add a kvm_vmx_exit_handler to arch/x86/kvm/vmx.c.
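
As a quick check of the mask value in item 1, here is a minimal sketch (the
helper name ctl_bit_mask() is hypothetical and not an existing kernel
function) showing that bit 21 of a 32-bit control field corresponds to
0x00200000:

```c
#include <assert.h>
#include <stdint.h>

/* Mask for a single bit position (0..31) in a 32-bit control field. */
uint32_t ctl_bit_mask(unsigned int bit)
{
	return (uint32_t)1 << bit;
}

void ctl_bit_mask_selfcheck(void)
{
	/* Bit 21 of the secondary VM-execution controls. */
	assert(ctl_bit_mask(21) == 0x00200000u);
}
```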


---
Rob Elliott    HP Server Storage




* Re: [PATCH v3 1/2] x86: Add support for the pcommit instruction
  2015-01-28 17:10   ` Elliott, Robert (Server Storage)
@ 2015-01-28 17:21     ` Borislav Petkov
  2015-01-28 17:27       ` Ross Zwisler
  0 siblings, 1 reply; 20+ messages in thread
From: Borislav Petkov @ 2015-01-28 17:21 UTC (permalink / raw)
  To: Elliott, Robert (Server Storage)
  Cc: Ross Zwisler, linux-kernel, H Peter Anvin, Ingo Molnar,
	Thomas Gleixner, Kani, Toshimitsu, Knippers, Linda

On Wed, Jan 28, 2015 at 05:10:46PM +0000, Elliott, Robert (Server Storage) wrote:
> Should this patch series also add defines for the virtual 
> machine control data structure changes?
> 
> 1. Add the new VM-Execution Controls bit 21 as
>     SECONDARY_EXEC_PCOMMIT_EXITING 0x00200000
> to arch/x86/include/asm/vmx.h.
> 
> 2. Add the new exit reason of 65 (0x41) as
> 	EXIT_REASON_PCOMMIT  65
> to arch/x86/include/uapi/asm/vmx.h and (with a
> VMX_EXIT_REASONS string) to usr/include/asm/vmx.h.
> 
> 3. Add a kvm_vmx_exit_handler to arch/x86/kvm/vmx.c.

These look like a separate patchset for kvm enablement to me.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--


* Re: [PATCH v3 1/2] x86: Add support for the pcommit instruction
  2015-01-28 17:21     ` Borislav Petkov
@ 2015-01-28 17:27       ` Ross Zwisler
  0 siblings, 0 replies; 20+ messages in thread
From: Ross Zwisler @ 2015-01-28 17:27 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Elliott, Robert (Server Storage),
	linux-kernel, H Peter Anvin, Ingo Molnar, Thomas Gleixner, Kani,
	Toshimitsu, Knippers, Linda

On Wed, 2015-01-28 at 18:21 +0100, Borislav Petkov wrote:
> On Wed, Jan 28, 2015 at 05:10:46PM +0000, Elliott, Robert (Server Storage) wrote:
> > Should this patch series also add defines for the virtual 
> > machine control data structure changes?
> > 
> > 1. Add the new VM-Execution Controls bit 21 as
> >     SECONDARY_EXEC_PCOMMIT_EXITING 0x00200000
> > to arch/x86/include/asm/vmx.h.
> > 
> > 2. Add the new exit reason of 65 (0x41) as
> > 	EXIT_REASON_PCOMMIT  65
> > to arch/x86/include/uapi/asm/vmx.h and (with a
> > VMX_EXIT_REASONS string) to usr/include/asm/vmx.h.
> > 
> > 3. Add a kvm_vmx_exit_handler to arch/x86/kvm/vmx.c.
> 
> These look like a separate patchset for kvm enablement to me.

Agreed, I think they are a separate patch set.



* Re: [PATCH v3 0/2] add support for new persistent memory instructions
  2015-01-27 16:53 [PATCH v3 0/2] add support for new persistent memory instructions Ross Zwisler
  2015-01-27 16:53 ` [PATCH v3 1/2] x86: Add support for the pcommit instruction Ross Zwisler
  2015-01-27 16:53 ` [PATCH v3 2/2] x86: Add support for the clwb instruction Ross Zwisler
@ 2015-02-05 16:24 ` Ross Zwisler
  2 siblings, 0 replies; 20+ messages in thread
From: Ross Zwisler @ 2015-02-05 16:24 UTC (permalink / raw)
  To: linux-kernel; +Cc: H Peter Anvin, Ingo Molnar, Thomas Gleixner, Borislav Petkov

On Tue, 2015-01-27 at 09:53 -0700, Ross Zwisler wrote:
> This patch set adds support for two new persistent memory instructions, pcommit
> and clwb.  These instructions were announced in the document "Intel
> Architecture Instruction Set Extensions Programming Reference" with reference
> number 319433-022.
> 
> https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
> 
> These patches apply cleanly to v3.19-rc6.
> 
> Changes from v2:
> 
>  - Added instruction descriptions and flows to the patch descriptions.
>  - Added needed sfence to pcommit alternatives assembly.  The inline function
>    is now called pcommit_sfence().  If pcommit is not supported on the platform
>    both the pcommit and the sfence will be nops.
> 
> Cc: H Peter Anvin <h.peter.anvin@intel.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Borislav Petkov <bp@alien8.de>
> 
> Ross Zwisler (2):
>   x86: Add support for the pcommit instruction
>   x86: Add support for the clwb instruction
> 
>  arch/x86/include/asm/cpufeature.h    |  2 ++
>  arch/x86/include/asm/special_insns.h | 22 ++++++++++++++++++++++
>  2 files changed, 24 insertions(+)

Ping?  :)



* Re: [PATCH v3 1/2] x86: Add support for the pcommit instruction
  2015-01-27 16:53 ` [PATCH v3 1/2] x86: Add support for the pcommit instruction Ross Zwisler
  2015-01-28 10:58   ` Borislav Petkov
  2015-01-28 17:10   ` Elliott, Robert (Server Storage)
@ 2015-02-11 22:24   ` H. Peter Anvin
  2015-02-19  0:29   ` [tip:x86/asm] " tip-bot for Ross Zwisler
  3 siblings, 0 replies; 20+ messages in thread
From: H. Peter Anvin @ 2015-02-11 22:24 UTC (permalink / raw)
  To: Ross Zwisler, linux-kernel; +Cc: Ingo Molnar, Thomas Gleixner, Borislav Petkov

On 01/27/2015 08:53 AM, Ross Zwisler wrote:
> Add support for the new pcommit (persistent commit) instruction.  This
> instruction was announced in the document "Intel Architecture
> Instruction Set Extensions Programming Reference" with reference number
> 319433-022.
> 
> https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
> 
> The pcommit instruction ensures that data that has been flushed from the
> processor's cache hierarchy with clwb, clflushopt or clflush is accepted to
> memory and is durable on the DIMM.  The primary use case for this is persistent
> memory.
> 
> This function shows how to properly use clwb/clflushopt/clflush and
> pcommit with appropriate fencing:
> 
> void flush_and_commit_buffer(void *vaddr, unsigned int size)
> {
> 	void *vend = vaddr + size - 1;
> 
> 	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
> 		clwb(vaddr);
> 
> 	/* Flush any possible final partial cacheline */
> 	clwb(vend);
> 
> 	/*
> 	 * sfence to order clwb/clflushopt/clflush cache flushes
> 	 * mfence via mb() also works
> 	 */
> 	wmb();
> 
> 	/* pcommit and the required sfence for ordering */
> 	pcommit_sfence();
> }
> 
> After this function completes, the data pointed to by vaddr has been
> accepted to memory and will be durable if vaddr points to persistent
> memory.
> 
> Pcommit must always be ordered by an mfence or sfence, so to help
> simplify things we include both the pcommit and the required sfence in
> the alternatives generated by pcommit_sfence().  The other option is to
> keep them separated, but on platforms that don't support pcommit this
> would then turn into:
> 
> void flush_and_commit_buffer(void *vaddr, unsigned int size)
> {
>         void *vend = vaddr + size - 1;
> 
>         for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
>                 clwb(vaddr);
> 
>         /* Flush any possible final partial cacheline */
>         clwb(vend);
> 
>         /*
>          * sfence to order clwb/clflushopt/clflush cache flushes
>          * mfence via mb() also works
>          */
>         wmb();
> 
>         nop(); /* from pcommit(), via alternatives */
> 
>         /*
>          * sfence to order pcommit
>          * mfence via mb() also works
>          */
>         wmb();
> }
> 
> This is still correct, but now you've got two fences separated by only a
> nop.  With the commit and the fence together in pcommit_sfence() you
> avoid the final unneeded fence.

Acked-by: H. Peter Anvin <hpa@linux.intel.com>




* Re: [PATCH v3 2/2] x86: Add support for the clwb instruction
  2015-01-27 16:53 ` [PATCH v3 2/2] x86: Add support for the clwb instruction Ross Zwisler
  2015-01-28 10:58   ` Borislav Petkov
@ 2015-02-11 22:25   ` H. Peter Anvin
  2015-02-19  0:29   ` [tip:x86/asm] " tip-bot for Ross Zwisler
  2015-04-03  5:10   ` [tip:x86/asm] x86/asm: Add support for the CLWB instruction tip-bot for Ross Zwisler
  3 siblings, 0 replies; 20+ messages in thread
From: H. Peter Anvin @ 2015-02-11 22:25 UTC (permalink / raw)
  To: Ross Zwisler, linux-kernel; +Cc: Ingo Molnar, Thomas Gleixner, Borislav Petkov

On 01/27/2015 08:53 AM, Ross Zwisler wrote:
> Add support for the new clwb (cache line write back) instruction.  This
> instruction was announced in the document "Intel Architecture
> Instruction Set Extensions Programming Reference" with reference number
> 319433-022.
> 
> https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
> 
> The clwb instruction is used to write back the contents of dirtied cache
> lines to memory without evicting the cache lines from the processor's
> cache hierarchy.  This should be used in favor of clflushopt or clflush
> in cases where you require the cache line to be written to memory but
> plan to access the data again in the near future.
> 
> One of the main use cases for this is with persistent memory where clwb
> can be used with pcommit to ensure that data has been accepted to memory
> and is durable on the DIMM.

Acked-by: H. Peter Anvin <hpa@linux.intel.com>




* [tip:x86/asm] x86: Add support for the pcommit instruction
  2015-01-27 16:53 ` [PATCH v3 1/2] x86: Add support for the pcommit instruction Ross Zwisler
                     ` (2 preceding siblings ...)
  2015-02-11 22:24   ` H. Peter Anvin
@ 2015-02-19  0:29   ` tip-bot for Ross Zwisler
  2015-02-19  1:15     ` Ingo Molnar
  3 siblings, 1 reply; 20+ messages in thread
From: tip-bot for Ross Zwisler @ 2015-02-19  0:29 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, linux-kernel, bp, tglx, torvalds, ross.zwisler, hpa, hpa

Commit-ID:  a71ef01336f2228dc9d47320492360d6848e591e
Gitweb:     http://git.kernel.org/tip/a71ef01336f2228dc9d47320492360d6848e591e
Author:     Ross Zwisler <ross.zwisler@linux.intel.com>
AuthorDate: Tue, 27 Jan 2015 09:53:50 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 19 Feb 2015 00:06:37 +0100

x86: Add support for the pcommit instruction

Add support for the new pcommit (persistent commit) instruction.
This instruction was announced in the document "Intel
Architecture Instruction Set Extensions Programming Reference"
with reference number 319433-022.

https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf

The pcommit instruction ensures that data that has been flushed
from the processor's cache hierarchy with clwb, clflushopt or
clflush is accepted to memory and is durable on the DIMM.  The
primary use case for this is persistent memory.

This function shows how to properly use clwb/clflushopt/clflush
and pcommit with appropriate fencing:

void flush_and_commit_buffer(void *vaddr, unsigned int size)
{
	void *vend = vaddr + size - 1;

	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
		clwb(vaddr);

	/* Flush any possible final partial cacheline */
	clwb(vend);

	/*
	 * sfence to order clwb/clflushopt/clflush cache flushes
	 * mfence via mb() also works
	 */
	wmb();

	/* pcommit and the required sfence for ordering */
	pcommit_sfence();
}

After this function completes, the data pointed to by vaddr has
been accepted to memory and will be durable if vaddr points to
persistent memory.

Pcommit must always be ordered by an mfence or sfence, so to
help simplify things we include both the pcommit and the
required sfence in the alternatives generated by
pcommit_sfence().  The other option is to keep them separated,
but on platforms that don't support pcommit this would then turn
into:

void flush_and_commit_buffer(void *vaddr, unsigned int size)
{
        void *vend = vaddr + size - 1;

        for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
                clwb(vaddr);

        /* Flush any possible final partial cacheline */
        clwb(vend);

        /*
         * sfence to order clwb/clflushopt/clflush cache flushes
         * mfence via mb() also works
         */
        wmb();

        nop(); /* from pcommit(), via alternatives */

        /*
         * sfence to order pcommit
         * mfence via mb() also works
         */
        wmb();
}

This is still correct, but now you've got two fences separated
by only a nop.  With the commit and the fence together in
pcommit_sfence() you avoid the final unneeded fence.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1422377631-8986-2-git-send-email-ross.zwisler@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/cpufeature.h    | 1 +
 arch/x86/include/asm/special_insns.h | 8 ++++++++
 2 files changed, 9 insertions(+)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 90a5485..d6428ea 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -231,6 +231,7 @@
 #define X86_FEATURE_RDSEED	( 9*32+18) /* The RDSEED instruction */
 #define X86_FEATURE_ADX		( 9*32+19) /* The ADCX and ADOX instructions */
 #define X86_FEATURE_SMAP	( 9*32+20) /* Supervisor Mode Access Prevention */
+#define X86_FEATURE_PCOMMIT	( 9*32+22) /* PCOMMIT instruction */
 #define X86_FEATURE_CLFLUSHOPT	( 9*32+23) /* CLFLUSHOPT instruction */
 #define X86_FEATURE_AVX512PF	( 9*32+26) /* AVX-512 Prefetch */
 #define X86_FEATURE_AVX512ER	( 9*32+27) /* AVX-512 Exponential and Reciprocal */
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index e820c08..d686f9b 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -199,6 +199,14 @@ static inline void clflushopt(volatile void *__p)
 		       "+m" (*(volatile char __force *)__p));
 }
 
+static inline void pcommit_sfence(void)
+{
+	alternative(ASM_NOP7,
+		    ".byte 0x66, 0x0f, 0xae, 0xf8\n\t" /* pcommit */
+		    "sfence",
+		    X86_FEATURE_PCOMMIT);
+}
+
 #define nop() asm volatile ("nop")
 
 


* [tip:x86/asm] x86: Add support for the clwb instruction
  2015-01-27 16:53 ` [PATCH v3 2/2] x86: Add support for the clwb instruction Ross Zwisler
  2015-01-28 10:58   ` Borislav Petkov
  2015-02-11 22:25   ` H. Peter Anvin
@ 2015-02-19  0:29   ` tip-bot for Ross Zwisler
  2015-04-02 20:31     ` Ross Zwisler
  2015-04-03  5:10   ` [tip:x86/asm] x86/asm: Add support for the CLWB instruction tip-bot for Ross Zwisler
  3 siblings, 1 reply; 20+ messages in thread
From: tip-bot for Ross Zwisler @ 2015-02-19  0:29 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, hpa, mingo, ross.zwisler, torvalds, linux-kernel, bp, tglx

Commit-ID:  3b68983dc66c61da3ab4191b891084a7ab09e3e1
Gitweb:     http://git.kernel.org/tip/3b68983dc66c61da3ab4191b891084a7ab09e3e1
Author:     Ross Zwisler <ross.zwisler@linux.intel.com>
AuthorDate: Tue, 27 Jan 2015 09:53:51 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 19 Feb 2015 00:06:38 +0100

x86: Add support for the clwb instruction

Add support for the new clwb (cache line write back)
instruction.  This instruction was announced in the document
"Intel Architecture Instruction Set Extensions Programming
Reference" with reference number 319433-022.

https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf

The clwb instruction is used to write back the contents of
dirtied cache lines to memory without evicting the cache lines
from the processor's cache hierarchy.  This should be used in
favor of clflushopt or clflush in cases where you require the
cache line to be written to memory but plan to access the data
again in the near future.

One of the main use cases for this is with persistent memory
where clwb can be used with pcommit to ensure that data has been
accepted to memory and is durable on the DIMM.

This function shows how to properly use clwb/clflushopt/clflush
and pcommit with appropriate fencing:

void flush_and_commit_buffer(void *vaddr, unsigned int size)
{
	void *vend = vaddr + size - 1;

	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
		clwb(vaddr);

	/* Flush any possible final partial cacheline */
	clwb(vend);

	/*
	 * sfence to order clwb/clflushopt/clflush cache flushes
	 * mfence via mb() also works
	 */
	wmb();

	/* pcommit and the required sfence for ordering */
	pcommit_sfence();
}

After this function completes, the data pointed to by vaddr
has been accepted to memory and will be durable if vaddr
points to persistent memory.
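
To see why the loop plus the trailing clwb(vend) reaches every cache line
of an unaligned buffer, here is a small userspace sketch that replays the
same address arithmetic and records which lines would be written back. It
assumes a 64-byte line (the kernel reads boot_cpu_data.x86_clflush_size
instead), omits the fences, and uses hypothetical names (flush_log,
fake_clwb, flush_range, covers) that are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64u	/* assumed line size for this sketch */
#define MAX_LINES  64u	/* capacity of the log below */

/* Records the index of every cache line a fake clwb touched. */
struct flush_log {
	uintptr_t lines[MAX_LINES];
	size_t count;
};

/* Stand-in for clwb(): log the line containing addr, flush nothing. */
static void fake_clwb(struct flush_log *log, uintptr_t addr)
{
	assert(log->count < MAX_LINES);
	log->lines[log->count++] = addr / CACHE_LINE;
}

/* Same loop shape as flush_and_commit_buffer(), without the fences. */
static void flush_range(struct flush_log *log, uintptr_t vaddr, size_t size)
{
	uintptr_t vend;

	assert(size > 0);
	vend = vaddr + size - 1;

	for (; vaddr < vend; vaddr += CACHE_LINE)
		fake_clwb(log, vaddr);

	/* Final, possibly partial, cache line */
	fake_clwb(log, vend);
}

/* Return 1 if every line overlapping [vaddr, vaddr+size) was logged. */
static int covers(const struct flush_log *log, uintptr_t vaddr, size_t size)
{
	uintptr_t first = vaddr / CACHE_LINE;
	uintptr_t last = (vaddr + size - 1) / CACHE_LINE;

	for (uintptr_t line = first; line <= last; line++) {
		int seen = 0;

		for (size_t i = 0; i < log->count; i++)
			if (log->lines[i] == line)
				seen = 1;
		if (!seen)
			return 0;
	}
	return 1;
}
```

With vaddr = 0x1030 and size = 100 the buffer spans three lines and the
sketch logs three write-backs. For an aligned 64-byte buffer the trailing
call repeats the line the loop already handled, which is redundant but
harmless.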

Regarding the details of how the alternatives assembly is set
up, we need one additional byte at the beginning of the clflush
so that we can flip it into a clflushopt by changing that byte
into a 0x66 prefix.  Two options are to either insert a 1 byte
ASM_NOP1, or to add a 1 byte NOP_DS_PREFIX.  Both have no
functional effect with the plain clflush, but I've been told
that executing a clflush + prefix should be faster than
executing a clflush + NOP.

We had to hard code the assembly for clwb because, lacking the
ability to assemble the clwb instruction itself, the next
closest thing is to have an xsaveopt instruction with a 0x66
prefix.  Unfortunately xsaveopt itself is also relatively new,
and isn't included by all the GCC versions that the kernel needs
to support.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1422377631-8986-3-git-send-email-ross.zwisler@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/cpufeature.h    |  1 +
 arch/x86/include/asm/special_insns.h | 14 ++++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index d6428ea..bc96e78 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -233,6 +233,7 @@
 #define X86_FEATURE_SMAP	( 9*32+20) /* Supervisor Mode Access Prevention */
 #define X86_FEATURE_PCOMMIT	( 9*32+22) /* PCOMMIT instruction */
 #define X86_FEATURE_CLFLUSHOPT	( 9*32+23) /* CLFLUSHOPT instruction */
+#define X86_FEATURE_CLWB	( 9*32+24) /* CLWB instruction */
 #define X86_FEATURE_AVX512PF	( 9*32+26) /* AVX-512 Prefetch */
 #define X86_FEATURE_AVX512ER	( 9*32+27) /* AVX-512 Exponential and Reciprocal */
 #define X86_FEATURE_AVX512CD	( 9*32+28) /* AVX-512 Conflict Detection */
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index d686f9b..0772365 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -199,6 +199,20 @@ static inline void clflushopt(volatile void *__p)
 		       "+m" (*(volatile char __force *)__p));
 }
 
+static inline void clwb(volatile void *__p)
+{
+	volatile struct { char x[64]; } *p = __p;
+
+	asm volatile(ALTERNATIVE_2(
+		".byte " __stringify(NOP_DS_PREFIX) "; clflush (%[pax])",
+		".byte 0x66; clflush (%[pax])", /* clflushopt (%%rax) */
+		X86_FEATURE_CLFLUSHOPT,
+		".byte 0x66, 0x0f, 0xae, 0x30",  /* clwb (%%rax) */
+		X86_FEATURE_CLWB)
+		: [p] "+m" (*p)
+		: [pax] "a" (p));
+}
+
 static inline void pcommit_sfence(void)
 {
 	alternative(ASM_NOP7,


* Re: [tip:x86/asm] x86: Add support for the pcommit instruction
  2015-02-19  0:29   ` [tip:x86/asm] " tip-bot for Ross Zwisler
@ 2015-02-19  1:15     ` Ingo Molnar
  2015-02-19 17:21       ` Ross Zwisler
  0 siblings, 1 reply; 20+ messages in thread
From: Ingo Molnar @ 2015-02-19  1:15 UTC (permalink / raw)
  To: hpa, ross.zwisler, torvalds, tglx, hpa, linux-kernel, bp
  Cc: linux-tip-commits


* tip-bot for Ross Zwisler <tipbot@zytor.com> wrote:

> Commit-ID:  a71ef01336f2228dc9d47320492360d6848e591e
> Gitweb:     http://git.kernel.org/tip/a71ef01336f2228dc9d47320492360d6848e591e
> Author:     Ross Zwisler <ross.zwisler@linux.intel.com>
> AuthorDate: Tue, 27 Jan 2015 09:53:50 -0700
> Committer:  Ingo Molnar <mingo@kernel.org>
> CommitDate: Thu, 19 Feb 2015 00:06:37 +0100
> 
> x86: Add support for the pcommit instruction

So this breaks the UML build:

/home/mingo/tip/arch/x86/include/asm/special_insns.h: In function ‘pcommit_sfence’:
/home/mingo/tip/arch/x86/include/asm/special_insns.h:218:14: error: expected ‘:’ or ‘)’ before ‘ASM_NOP7’

Thanks,

	Ingo


* Re: [tip:x86/asm] x86: Add support for the pcommit instruction
  2015-02-19  1:15     ` Ingo Molnar
@ 2015-02-19 17:21       ` Ross Zwisler
  2015-02-19 17:33         ` Borislav Petkov
  0 siblings, 1 reply; 20+ messages in thread
From: Ross Zwisler @ 2015-02-19 17:21 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: hpa, torvalds, tglx, hpa, linux-kernel, bp, linux-tip-commits

On Thu, 2015-02-19 at 02:15 +0100, Ingo Molnar wrote:
> * tip-bot for Ross Zwisler <tipbot@zytor.com> wrote:
> 
> > Commit-ID:  a71ef01336f2228dc9d47320492360d6848e591e
> > Gitweb:     http://git.kernel.org/tip/a71ef01336f2228dc9d47320492360d6848e591e
> > Author:     Ross Zwisler <ross.zwisler@linux.intel.com>
> > AuthorDate: Tue, 27 Jan 2015 09:53:50 -0700
> > Committer:  Ingo Molnar <mingo@kernel.org>
> > CommitDate: Thu, 19 Feb 2015 00:06:37 +0100
> > 
> > x86: Add support for the pcommit instruction
> 
> So this breaks the UML build:
> 
> /home/mingo/tip/arch/x86/include/asm/special_insns.h: In function ‘pcommit_sfence’:
> /home/mingo/tip/arch/x86/include/asm/special_insns.h:218:14: error: expected ‘:’ or ‘)’ before ‘ASM_NOP7’
> 
> Thanks,
> 
> 	Ingo

Interesting, it looks like I need to include <asm/nops.h> explicitly for
UML.  New patch on the way.

Thanks,
- Ross



* Re: [tip:x86/asm] x86: Add support for the pcommit instruction
  2015-02-19 17:21       ` Ross Zwisler
@ 2015-02-19 17:33         ` Borislav Petkov
  2015-02-19 17:41           ` Ross Zwisler
  0 siblings, 1 reply; 20+ messages in thread
From: Borislav Petkov @ 2015-02-19 17:33 UTC (permalink / raw)
  To: Ross Zwisler
  Cc: Ingo Molnar, hpa, torvalds, tglx, hpa, linux-kernel, bp,
	linux-tip-commits

On Thu, Feb 19, 2015 at 10:21:53AM -0700, Ross Zwisler wrote:
> Interesting, it looks like I need to include <asm/nops.h> explicitly for
> UML.  New patch on the way.

You'd need to do an incremental fix on top, though.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--


* Re: [tip:x86/asm] x86: Add support for the pcommit instruction
  2015-02-19 17:33         ` Borislav Petkov
@ 2015-02-19 17:41           ` Ross Zwisler
  0 siblings, 0 replies; 20+ messages in thread
From: Ross Zwisler @ 2015-02-19 17:41 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, hpa, torvalds, tglx, hpa, linux-kernel, bp,
	linux-tip-commits

On Thu, 2015-02-19 at 18:33 +0100, Borislav Petkov wrote:
> On Thu, Feb 19, 2015 at 10:21:53AM -0700, Ross Zwisler wrote:
> > Interesting, it looks like I need to include <asm/nops.h> explicitly for
> > UML.  New patch on the way.
> 
> You'd need to do an incremental fix on top, though.

Oh, instead of just sending out a new patch that does the include?

Sorry, didn't see this before I sent out v4 of the patch that added the
include - Ingo, if you'd rather have a separate patch that adds the
include to fix the compile error, please let me know & I can send one
out.



* Re: [tip:x86/asm] x86: Add support for the clwb instruction
  2015-02-19  0:29   ` [tip:x86/asm] " tip-bot for Ross Zwisler
@ 2015-04-02 20:31     ` Ross Zwisler
  2015-04-03  5:04       ` Ingo Molnar
  0 siblings, 1 reply; 20+ messages in thread
From: Ross Zwisler @ 2015-04-02 20:31 UTC (permalink / raw)
  To: tglx, torvalds, linux-kernel, bp, mingo, hpa, hpa; +Cc: linux-tip-commits

On Wed, 2015-02-18 at 16:29 -0800, tip-bot for Ross Zwisler wrote:
> Commit-ID:  3b68983dc66c61da3ab4191b891084a7ab09e3e1
> Gitweb:     http://git.kernel.org/tip/3b68983dc66c61da3ab4191b891084a7ab09e3e1
> Author:     Ross Zwisler <ross.zwisler@linux.intel.com>
> AuthorDate: Tue, 27 Jan 2015 09:53:51 -0700
> Committer:  Ingo Molnar <mingo@kernel.org>
> CommitDate: Thu, 19 Feb 2015 00:06:38 +0100
> 
> x86: Add support for the clwb instruction
> 
> Add support for the new clwb (cache line write back)
> instruction.  This instruction was announced in the document
> "Intel Architecture Instruction Set Extensions Programming
> Reference" with reference number 319433-022.
> 
> https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
> 
> The clwb instruction is used to write back the contents of
> dirtied cache lines to memory without evicting the cache lines
> from the processor's cache hierarchy.  This should be used in
> favor of clflushopt or clflush in cases where you require the
> cache line to be written to memory but plan to access the data
> again in the near future.
> 
> One of the main use cases for this is with persistent memory
> where clwb can be used with pcommit to ensure that data has been
> accepted to memory and is durable on the DIMM.
> 
> This function shows how to properly use clwb/clflushopt/clflush
> and pcommit with appropriate fencing:
> 
> void flush_and_commit_buffer(void *vaddr, unsigned int size)
> {
> 	void *vend = vaddr + size - 1;
> 
> 	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
> 		clwb(vaddr);
> 
> 	/* Flush any possible final partial cacheline */
> 	clwb(vend);
> 
> 	/*
> 	 * sfence to order clwb/clflushopt/clflush cache flushes
> 	 * mfence via mb() also works
> 	 */
> 	wmb();
> 
> 	/* pcommit and the required sfence for ordering */
> 	pcommit_sfence();
> }
> 
> After this function completes, the data pointed to by vaddr
> has been accepted to memory and will be durable if vaddr
> points to persistent memory.
> 
> Regarding the details of how the alternatives assembly is set
> up, we need one additional byte at the beginning of the clflush
> so that we can flip it into a clflushopt by changing that byte
> into a 0x66 prefix.  Two options are to either insert a 1 byte
> ASM_NOP1, or to add a 1 byte NOP_DS_PREFIX.  Both have no
> functional effect with the plain clflush, but I've been told
> that executing a clflush + prefix should be faster than
> executing a clflush + NOP.
> 
> We had to hard code the assembly for clwb because, lacking the
> ability to assemble the clwb instruction itself, the next
> closest thing is to have an xsaveopt instruction with a 0x66
> prefix.  Unfortunately xsaveopt itself is also relatively new,
> and isn't included by all the GCC versions that the kernel needs
> to support.
> 
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> Acked-by: Borislav Petkov <bp@suse.de>
> Acked-by: H. Peter Anvin <hpa@linux.intel.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Link: http://lkml.kernel.org/r/1422377631-8986-3-git-send-email-ross.zwisler@linux.intel.com
> Signed-off-by: Ingo Molnar <mingo@kernel.org>

Ping on this patch - it looks like the pcommit patch is in the tip tree,
but this one is missing?

I'm looking at the tree as of:
9a760fbbdc7 "Merge branch 'tools/kvm'"



* Re: [tip:x86/asm] x86: Add support for the clwb instruction
  2015-04-02 20:31     ` Ross Zwisler
@ 2015-04-03  5:04       ` Ingo Molnar
  0 siblings, 0 replies; 20+ messages in thread
From: Ingo Molnar @ 2015-04-03  5:04 UTC (permalink / raw)
  To: Ross Zwisler
  Cc: tglx, torvalds, linux-kernel, bp, hpa, hpa, linux-tip-commits


* Ross Zwisler <ross.zwisler@linux.intel.com> wrote:

> On Wed, 2015-02-18 at 16:29 -0800, tip-bot for Ross Zwisler wrote:
> > Commit-ID:  3b68983dc66c61da3ab4191b891084a7ab09e3e1
> > Gitweb:     http://git.kernel.org/tip/3b68983dc66c61da3ab4191b891084a7ab09e3e1
> > Author:     Ross Zwisler <ross.zwisler@linux.intel.com>
> > AuthorDate: Tue, 27 Jan 2015 09:53:51 -0700
> > Committer:  Ingo Molnar <mingo@kernel.org>
> > CommitDate: Thu, 19 Feb 2015 00:06:38 +0100
> > 
> > x86: Add support for the clwb instruction
> > 
> > Add support for the new clwb (cache line write back)
> > instruction.  This instruction was announced in the document
> > "Intel Architecture Instruction Set Extensions Programming
> > Reference" with reference number 319433-022.
> > 
> > https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
> > 
> > The clwb instruction is used to write back the contents of
> > dirtied cache lines to memory without evicting the cache lines
> > from the processor's cache hierarchy.  This should be used in
> > favor of clflushopt or clflush in cases where you require the
> > cache line to be written to memory but plan to access the data
> > again in the near future.
> > 
> > One of the main use cases for this is with persistent memory
> > where clwb can be used with pcommit to ensure that data has been
> > accepted to memory and is durable on the DIMM.
> > 
> > This function shows how to properly use clwb/clflushopt/clflush
> > and pcommit with appropriate fencing:
> > 
> > void flush_and_commit_buffer(void *vaddr, unsigned int size)
> > {
> > 	void *vend = vaddr + size - 1;
> > 
> > 	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
> > 		clwb(vaddr);
> > 
> > 	/* Flush any possible final partial cacheline */
> > 	clwb(vend);
> > 
> > 	/*
> > 	 * sfence to order clwb/clflushopt/clflush cache flushes
> > 	 * mfence via mb() also works
> > 	 */
> > 	wmb();
> > 
> > 	/* pcommit and the required sfence for ordering */
> > 	pcommit_sfence();
> > }
> > 
> > After this function completes, the data pointed to by vaddr
> > has been accepted to memory and will be durable if vaddr
> > points to persistent memory.
> > 
> > Regarding the details of how the alternatives assembly is set
> > up, we need one additional byte at the beginning of the clflush
> > so that we can flip it into a clflushopt by changing that byte
> > into a 0x66 prefix.  Two options are to either insert a 1 byte
> > ASM_NOP1, or to add a 1 byte NOP_DS_PREFIX.  Both have no
> > functional effect with the plain clflush, but I've been told
> > that executing a clflush + prefix should be faster than
> > executing a clflush + NOP.
> > 
> > We had to hard code the assembly for clwb because, lacking the
> > ability to assemble the clwb instruction itself, the next
> > closest thing is to have an xsaveopt instruction with a 0x66
> > prefix.  Unfortunately xsaveopt itself is also relatively new,
> > and isn't included by all the GCC versions that the kernel needs
> > to support.
> > 
> > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> > Acked-by: Borislav Petkov <bp@suse.de>
> > Acked-by: H. Peter Anvin <hpa@linux.intel.com>
> > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Link: http://lkml.kernel.org/r/1422377631-8986-3-git-send-email-ross.zwisler@linux.intel.com
> > Signed-off-by: Ingo Molnar <mingo@kernel.org>
> 
> Ping on this patch - it looks like the pcommit patch is in the tip tree,
> but this one is missing?

Yeah, I applied it initially, then had some reservations about it - 
but those are now resolved so I've applied it to tip:x86/asm again.

Thanks,

	Ingo


* [tip:x86/asm] x86/asm: Add support for the CLWB instruction
  2015-01-27 16:53 ` [PATCH v3 2/2] x86: Add support for the clwb instruction Ross Zwisler
                     ` (2 preceding siblings ...)
  2015-02-19  0:29   ` [tip:x86/asm] " tip-bot for Ross Zwisler
@ 2015-04-03  5:10   ` tip-bot for Ross Zwisler
  3 siblings, 0 replies; 20+ messages in thread
From: tip-bot for Ross Zwisler @ 2015-04-03  5:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: luto, ross.zwisler, linux-kernel, bp, brgerst, bp, mingo, hpa,
	hpa, torvalds, dvlasenk, tglx

Commit-ID:  d9dc64f30abe42f71bc7e9eb9d38c41006cf39f9
Gitweb:     http://git.kernel.org/tip/d9dc64f30abe42f71bc7e9eb9d38c41006cf39f9
Author:     Ross Zwisler <ross.zwisler@linux.intel.com>
AuthorDate: Tue, 27 Jan 2015 09:53:51 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 3 Apr 2015 06:56:38 +0200

x86/asm: Add support for the CLWB instruction

Add support for the new CLWB (cache line write back)
instruction.  This instruction was announced in the document
"Intel Architecture Instruction Set Extensions Programming
Reference" with reference number 319433-022.

  https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf

The CLWB instruction is used to write back the contents of
dirtied cache lines to memory without evicting the cache lines
from the processor's cache hierarchy.  This should be used in
favor of CLFLUSHOPT or CLFLUSH in cases where you require the
cache line to be written to memory but plan to access the data
again in the near future.

One of the main use cases for this is with persistent memory
where CLWB can be used with PCOMMIT to ensure that data has been
accepted to memory and is durable on the DIMM.

This function shows how to properly use CLWB/CLFLUSHOPT/CLFLUSH
and PCOMMIT with appropriate fencing:

void flush_and_commit_buffer(void *vaddr, unsigned int size)
{
	void *vend = vaddr + size - 1;

	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
		clwb(vaddr);

	/* Flush any possible final partial cacheline */
	clwb(vend);

	/*
	 * Use SFENCE to order CLWB/CLFLUSHOPT/CLFLUSH cache flushes.
	 * (MFENCE via mb() also works)
	 */
	wmb();

	/* PCOMMIT and the required SFENCE for ordering */
	pcommit_sfence();
}

After this function completes, the data pointed to by vaddr
has been accepted to memory and will be durable if vaddr
points to persistent memory.

Regarding the details of how the alternatives assembly is set
up, we need one additional byte at the beginning of the CLFLUSH
so that we can flip it into a CLFLUSHOPT by changing that byte
into a 0x66 prefix.  Two options are to either insert a 1 byte
ASM_NOP1, or to add a 1 byte NOP_DS_PREFIX.  Both have no
functional effect with the plain CLFLUSH, but I've been told
that executing a CLFLUSH + prefix should be faster than
executing a CLFLUSH + NOP.

We had to hard code the assembly for CLWB because, lacking the
ability to assemble the CLWB instruction itself, the next
closest thing is to have an XSAVEOPT instruction with a 0x66
prefix.  Unfortunately XSAVEOPT itself is also relatively new,
and isn't included by all the GCC versions that the kernel needs
to support.
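
The byte-level relationship described in the two paragraphs above can be
sketched in plain C. All three alternatives share the 0F AE opcode with a
(%rax) memory operand: CLFLUSH is 0F AE /7, CLFLUSHOPT is the same with a
0x66 prefix, and CLWB is 66 0F AE /6, which is XSAVEOPT (0F AE /6) with a
0x66 prefix. The helper names below (modrm, encode_0fae_rax, diff_bytes and
the encode_* wrappers) are hypothetical and only illustrate the encodings;
NOP_DS_PREFIX is taken to be 0x3e, the DS segment override.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NOP_DS_PREFIX 0x3e	/* DS segment override prefix */

/* Build a ModRM byte: mod (2 bits), reg/opcode extension (3), rm (3). */
static uint8_t modrm(unsigned mod, unsigned reg, unsigned rm)
{
	assert(mod < 4 && reg < 8 && rm < 8);
	return (uint8_t)((mod << 6) | (reg << 3) | rm);
}

/* Emit <prefix> 0F AE /ext with a (%rax) memory operand into out[4]. */
static void encode_0fae_rax(uint8_t out[4], uint8_t prefix, unsigned ext)
{
	out[0] = prefix;
	out[1] = 0x0f;
	out[2] = 0xae;
	out[3] = modrm(0, ext, 0);	/* mod=00, rm=000 selects (%rax) */
}

/* CLFLUSH padded with a DS prefix: the default alternative. */
static void encode_clflush_ds(uint8_t out[4])
{
	encode_0fae_rax(out, NOP_DS_PREFIX, 7);
}

/* CLFLUSHOPT: same bytes as above with the prefix flipped to 0x66. */
static void encode_clflushopt(uint8_t out[4])
{
	encode_0fae_rax(out, 0x66, 7);
}

/* CLWB: 0x66 prefix plus the /6 extension that XSAVEOPT also uses. */
static void encode_clwb(uint8_t out[4])
{
	encode_0fae_rax(out, 0x66, 6);
}

/* Count how many of the first n bytes differ between a and b. */
static size_t diff_bytes(const uint8_t *a, const uint8_t *b, size_t n)
{
	size_t differing = 0;

	for (size_t i = 0; i < n; i++)
		if (a[i] != b[i])
			differing++;
	return differing;
}
```

All three encodings are four bytes long, so each alternative fits the
same slot. CLFLUSH with the DS prefix and CLFLUSHOPT differ only in the
first byte, and encode_clwb() produces 0x66, 0x0f, 0xae, 0x30, matching
the .byte sequence in the patch.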

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1422377631-8986-3-git-send-email-ross.zwisler@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/cpufeature.h    |  1 +
 arch/x86/include/asm/special_insns.h | 14 ++++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 0f7a5a1..854c04b 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -233,6 +233,7 @@
 #define X86_FEATURE_SMAP	( 9*32+20) /* Supervisor Mode Access Prevention */
 #define X86_FEATURE_PCOMMIT	( 9*32+22) /* PCOMMIT instruction */
 #define X86_FEATURE_CLFLUSHOPT	( 9*32+23) /* CLFLUSHOPT instruction */
+#define X86_FEATURE_CLWB	( 9*32+24) /* CLWB instruction */
 #define X86_FEATURE_AVX512PF	( 9*32+26) /* AVX-512 Prefetch */
 #define X86_FEATURE_AVX512ER	( 9*32+27) /* AVX-512 Exponential and Reciprocal */
 #define X86_FEATURE_AVX512CD	( 9*32+28) /* AVX-512 Conflict Detection */
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 2ec1a53..aeb4666e 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -201,6 +201,20 @@ static inline void clflushopt(volatile void *__p)
 		       "+m" (*(volatile char __force *)__p));
 }
 
+static inline void clwb(volatile void *__p)
+{
+	volatile struct { char x[64]; } *p = __p;
+
+	asm volatile(ALTERNATIVE_2(
+		".byte " __stringify(NOP_DS_PREFIX) "; clflush (%[pax])",
+		".byte 0x66; clflush (%[pax])", /* clflushopt (%%rax) */
+		X86_FEATURE_CLFLUSHOPT,
+		".byte 0x66, 0x0f, 0xae, 0x30",  /* clwb (%%rax) */
+		X86_FEATURE_CLWB)
+		: [p] "+m" (*p)
+		: [pax] "a" (p));
+}
+
 static inline void pcommit_sfence(void)
 {
 	alternative(ASM_NOP7,


Thread overview: 20+ messages
2015-01-27 16:53 [PATCH v3 0/2] add support for new persistent memory instructions Ross Zwisler
2015-01-27 16:53 ` [PATCH v3 1/2] x86: Add support for the pcommit instruction Ross Zwisler
2015-01-28 10:58   ` Borislav Petkov
2015-01-28 17:10   ` Elliott, Robert (Server Storage)
2015-01-28 17:21     ` Borislav Petkov
2015-01-28 17:27       ` Ross Zwisler
2015-02-11 22:24   ` H. Peter Anvin
2015-02-19  0:29   ` [tip:x86/asm] " tip-bot for Ross Zwisler
2015-02-19  1:15     ` Ingo Molnar
2015-02-19 17:21       ` Ross Zwisler
2015-02-19 17:33         ` Borislav Petkov
2015-02-19 17:41           ` Ross Zwisler
2015-01-27 16:53 ` [PATCH v3 2/2] x86: Add support for the clwb instruction Ross Zwisler
2015-01-28 10:58   ` Borislav Petkov
2015-02-11 22:25   ` H. Peter Anvin
2015-02-19  0:29   ` [tip:x86/asm] " tip-bot for Ross Zwisler
2015-04-02 20:31     ` Ross Zwisler
2015-04-03  5:04       ` Ingo Molnar
2015-04-03  5:10   ` [tip:x86/asm] x86/asm: Add support for the CLWB instruction tip-bot for Ross Zwisler
2015-02-05 16:24 ` [PATCH v3 0/2] add support for new persistent memory instructions Ross Zwisler
