LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH] MIPS: c-r4k: fix data corruption related to cache coherence.
@ 2018-04-25  4:08 NeilBrown
  2018-04-25 21:46 ` James Hogan
  0 siblings, 1 reply; 9+ messages in thread
From: NeilBrown @ 2018-04-25  4:08 UTC (permalink / raw)
  To: Ralf Baechle, James Hogan, Paul Burton; +Cc: linux-mips, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2357 bytes --]


When DMA is to be performed to a MIPS32 1004K CPS, the
L1-cache for the range needs to be flushed and invalidated
first.
The code currently takes one of two approaches.
1/ If the range is less than the size of the dcache, then
   HIT type requests flush/invalidate cache lines for the
   particular addresses.  HIT-type requests a globalised
   by the CPS so this is safe on SMP.

2/ If the range is larger than the size of dcache, then
   INDEX type requests flush/invalidate the whole cache.
   INDEX type requests are NOT globalized by CPS so this
   is NOT safe when CPS is used.

Data corruption due to '2' can quite easily be demonstrated by
repeatedly "echo 3 > /proc/sys/vm/drop_caches" and then sha1sum
a file that is several times the size of available memory.
Dropping caches means that large contiguous extents (large than
dcache) are more likely.

This was not a problem before Linux-4.8 because option 2 was
never used if CONFIG_MIPS_CPS was defined.  The commit
which removed that apparently didn't appreciate the full
consequence of the change.

This patch avoids options 2 if mips_cm_present().

Fixes: c00ab4896ed5 ("MIPS: Remove cpu_has_safe_index_cacheops")
Cc: stable@vger.kernel.org (v4.8)
Signed-off-by: NeilBrown <neil@brown.name>
---
 arch/mips/mm/c-r4k.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/mips/mm/c-r4k.c b/arch/mips/mm/c-r4k.c
index 6f534b209971..f845ec96f31e 100644
--- a/arch/mips/mm/c-r4k.c
+++ b/arch/mips/mm/c-r4k.c
@@ -851,9 +851,11 @@ static void r4k_dma_cache_wback_inv(unsigned long addr, unsigned long size)
 	/*
 	 * Either no secondary cache or the available caches don't have the
 	 * subset property so we have to flush the primary caches
-	 * explicitly
+	 * explicitly.
+	 * As Index type operations are not globalized by CM, we must
+	 * use the HIT type when CM is present.
 	 */
-	if (size >= dcache_size) {
+	if (!mips_cm_present() && size >= dcache_size) {
 		r4k_blast_dcache();
 	} else {
 		R4600_HIT_CACHEOP_WAR_IMPL;
@@ -890,7 +892,7 @@ static void r4k_dma_cache_inv(unsigned long addr, unsigned long size)
 		return;
 	}
 
-	if (size >= dcache_size) {
+	if (!mips_cm_present() && size >= dcache_size) {
 		r4k_blast_dcache();
 	} else {
 		R4600_HIT_CACHEOP_WAR_IMPL;
-- 
2.14.0.rc0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] MIPS: c-r4k: fix data corruption related to cache coherence.
  2018-04-25  4:08 [PATCH] MIPS: c-r4k: fix data corruption related to cache coherence NeilBrown
@ 2018-04-25 21:46 ` James Hogan
  2018-04-25 22:00   ` NeilBrown
  0 siblings, 1 reply; 9+ messages in thread
From: James Hogan @ 2018-04-25 21:46 UTC (permalink / raw)
  To: NeilBrown; +Cc: Ralf Baechle, Paul Burton, linux-mips, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1871 bytes --]

Hi NeilBrown,

On Wed, Apr 25, 2018 at 02:08:15PM +1000, NeilBrown wrote:
> Cc: stable@vger.kernel.org (v4.8)

FYI my preferred form of this is:

Cc: <stable@vger.kernel.org> # 4.8+

>  	/*
>  	 * Either no secondary cache or the available caches don't have the
>  	 * subset property so we have to flush the primary caches
> -	 * explicitly
> +	 * explicitly.
> +	 * As Index type operations are not globalized by CM, we must
> +	 * use the HIT type when CM is present.

I don't think Index type operations *can* be sensibly globalised to
separate primary caches, since they directly index the lines in the
cache rather than any particular physical or virtual address, so this
fundamental issue doesn't sound specific to the CM.

I'm not entirely clear OTOH whether there are any non-CM r4k-like SMP
capable cores that don't have coherent IO or inclusive caches.
- Octeon doesn't use c-r4k and has coherent IO
- Netlogic appear to have coherent IO 
- Loongson64 have inclusive caches
- bmips, only bmips5000 appears to have multicore (not just multi-thread
  with I presume shared dcaches), and that has inclusive caches

So I suppose that does just leave non-multicore and CM systems as
potentially reaching these bits of code...

>  	 */
> -	if (size >= dcache_size) {
> +	if (!mips_cm_present() && size >= dcache_size) {

... but even for CM systems its only if there are foreign CPUs running
(i.e. not just threads on the same core sharing the same dcache) that it
actually matters.

So I'm thinking "!mips_cm_present()" should probably be replaced with
"!r4k_op_needs_ipi(R4K_INDEX)" (and the comment updated to mention that
IPI calls aren't implemented here).

That effectively means:
	!CONFIG_SMP || cpumask_empty(&cpu_foreign_map[0])

Otherwise the patch looks spot on. Thanks for spotting this!

Cheers
James

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] MIPS: c-r4k: fix data corruption related to cache coherence.
  2018-04-25 21:46 ` James Hogan
@ 2018-04-25 22:00   ` NeilBrown
  2018-04-25 22:08     ` James Hogan
  0 siblings, 1 reply; 9+ messages in thread
From: NeilBrown @ 2018-04-25 22:00 UTC (permalink / raw)
  To: James Hogan; +Cc: Ralf Baechle, Paul Burton, linux-mips, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2543 bytes --]

On Wed, Apr 25 2018, James Hogan wrote:

> Hi NeilBrown,
>
> On Wed, Apr 25, 2018 at 02:08:15PM +1000, NeilBrown wrote:
>> Cc: stable@vger.kernel.org (v4.8)
>
> FYI my preferred form of this is:
>
> Cc: <stable@vger.kernel.org> # 4.8+

Ugh.  Cc: looks like an RFC822 header, and # does not introduce
a comment in RFC822 headers. () do provide comments.
I would consider that a syntax error :-(
Apparently Documentation/process/stable-kernel-rules.rst
doesn't.  Sad.

>
>>  	/*
>>  	 * Either no secondary cache or the available caches don't have the
>>  	 * subset property so we have to flush the primary caches
>> -	 * explicitly
>> +	 * explicitly.
>> +	 * As Index type operations are not globalized by CM, we must
>> +	 * use the HIT type when CM is present.
>
> I don't think Index type operations *can* be sensibly globalised to
> separate primary caches, since they directly index the lines in the
> cache rather than any particular physical or virtual address, so this
> fundamental issue doesn't sound specific to the CM.
>
> I'm not entirely clear OTOH whether there are any non-CM r4k-like SMP
> capable cores that don't have coherent IO or inclusive caches.
> - Octeon doesn't use c-r4k and has coherent IO
> - Netlogic appear to have coherent IO 
> - Loongson64 have inclusive caches
> - bmips, only bmips5000 appears to have multicore (not just multi-thread
>   with I presume shared dcaches), and that has inclusive caches
>
> So I suppose that does just leave non-multicore and CM systems as
> potentially reaching these bits of code...
>
>>  	 */
>> -	if (size >= dcache_size) {
>> +	if (!mips_cm_present() && size >= dcache_size) {
>
> ... but even for CM systems its only if there are foreign CPUs running
> (i.e. not just threads on the same core sharing the same dcache) that it
> actually matters.
>
> So I'm thinking "!mips_cm_present()" should probably be replaced with
> "!r4k_op_needs_ipi(R4K_INDEX)" (and the comment updated to mention that
> IPI calls aren't implemented here).

That makes sense.  I tried ipi calls at one stage.  Synchronous ones
cannot be used because this code is called with interrupts disabled, and
async ones cannot provide the required guarantees.

I'll update the patch, test that it still works, and resend, in the next
day or so.

Thanks,
NeilBrown

>
> That effectively means:
> 	!CONFIG_SMP || cpumask_empty(&cpu_foreign_map[0])
>
> Otherwise the patch looks spot on. Thanks for spotting this!
>
> Cheers
> James

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] MIPS: c-r4k: fix data corruption related to cache coherence.
  2018-04-25 22:00   ` NeilBrown
@ 2018-04-25 22:08     ` James Hogan
  2018-04-26 23:28       ` [PATCH v2] " NeilBrown
  0 siblings, 1 reply; 9+ messages in thread
From: James Hogan @ 2018-04-25 22:08 UTC (permalink / raw)
  To: NeilBrown; +Cc: Ralf Baechle, Paul Burton, linux-mips, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 842 bytes --]

On Thu, Apr 26, 2018 at 08:00:18AM +1000, NeilBrown wrote:
> On Wed, Apr 25 2018, James Hogan wrote:
> > So I'm thinking "!mips_cm_present()" should probably be replaced with
> > "!r4k_op_needs_ipi(R4K_INDEX)" (and the comment updated to mention that
> > IPI calls aren't implemented here).
> 
> That makes sense.  I tried ipi calls at one stage.  Synchronous ones
> cannot be used because this code is called with interrupts disabled, and
> async ones cannot provide the required guarantees.

Okay. I thought something like that might be the case.

> 
> I'll update the patch, test that it still works, and resend, in the next
> day or so.

Thanks. It wouldn't hurt to also mention why IPIs can't be used in the
code comments, if its fresh in your mind, just for the benefit of next
person to attempt it.

Cheers
James

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2] MIPS: c-r4k: fix data corruption related to cache coherence.
  2018-04-25 22:08     ` James Hogan
@ 2018-04-26 23:28       ` NeilBrown
  2018-05-06 21:40         ` NeilBrown
  0 siblings, 1 reply; 9+ messages in thread
From: NeilBrown @ 2018-04-26 23:28 UTC (permalink / raw)
  To: James Hogan; +Cc: Ralf Baechle, Paul Burton, linux-mips, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3197 bytes --]


When DMA will be performed to a MIPS32 1004K CPS, the
L1-cache for the range needs to be flushed and invalidated
first.
The code currently takes one of two approaches.
1/ If the range is less than the size of the dcache, then
   HIT type requests flush/invalidate cache lines for the
   particular addresses.  HIT-type requests a globalised
   by the CPS so this is safe on SMP.

2/ If the range is larger than the size of dcache, then
   INDEX type requests flush/invalidate the whole cache.
   INDEX type requests affect the local cache only. CPS
   does not propagate them in any way.  So this invalidation
   is not safe on SMP CPS systems.

Data corruption due to '2' can quite easily be demonstrated by
repeatedly "echo 3 > /proc/sys/vm/drop_caches" and then sha1sum
a file that is several times the size of available memory.
Dropping caches means that large contiguous extents (large than
dcache) are more likely.

This was not a problem before Linux-4.8 because option 2 was
never used if CONFIG_MIPS_CPS was defined.  The commit
which removed that apparently didn't appreciate the full
consequence of the change.

We could, in theory, globalize the INDEX based flush by sending an IPI
to other cores.  These cache invalidation routines can be called with
interrupts disabled and synchronous IPI require interrupts to be
enabled.  Asynchronous IPI may not trigger writeback soon enough.
So we cannot use IPI in practice.

We can already test is IPI would be needed for an INDEX operation
with r4k_op_needs_ipi(R4K_INDEX).  If this is True then we mustn't try
the INDEX approach as we cannot use IPI.  If this is False (e.g. when
there is only one core and hence one L1 cache) then it is safe to
use the INDEX approach without IPI.

This patch avoids options 2 if r4k_op_needs_ipi(R4K_INDEX), and so
eliminates the corruption.

Fixes: c00ab4896ed5 ("MIPS: Remove cpu_has_safe_index_cacheops")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: NeilBrown <neil@brown.name>
---
 arch/mips/mm/c-r4k.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/mips/mm/c-r4k.c b/arch/mips/mm/c-r4k.c
index 6f534b209971..e12dfa48b478 100644
--- a/arch/mips/mm/c-r4k.c
+++ b/arch/mips/mm/c-r4k.c
@@ -851,9 +851,12 @@ static void r4k_dma_cache_wback_inv(unsigned long addr, unsigned long size)
 	/*
 	 * Either no secondary cache or the available caches don't have the
 	 * subset property so we have to flush the primary caches
-	 * explicitly
+	 * explicitly.
+	 * If we would need IPI to perform an INDEX-type operation, then
+	 * we have to use the HIT-type alternative as IPI cannot be used
+	 * here due to interrupts possibly being disabled.
 	 */
-	if (size >= dcache_size) {
+	if (!r4k_op_needs_ipi(R4K_INDEX) && size >= dcache_size) {
 		r4k_blast_dcache();
 	} else {
 		R4600_HIT_CACHEOP_WAR_IMPL;
@@ -890,7 +893,7 @@ static void r4k_dma_cache_inv(unsigned long addr, unsigned long size)
 		return;
 	}
 
-	if (size >= dcache_size) {
+	if (!r4k_op_needs_ipi(R4K_INDEX) && size >= dcache_size) {
 		r4k_blast_dcache();
 	} else {
 		R4600_HIT_CACHEOP_WAR_IMPL;
-- 
2.14.0.rc0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] MIPS: c-r4k: fix data corruption related to cache coherence.
  2018-04-26 23:28       ` [PATCH v2] " NeilBrown
@ 2018-05-06 21:40         ` NeilBrown
  2018-05-07 20:16           ` James Hogan
  0 siblings, 1 reply; 9+ messages in thread
From: NeilBrown @ 2018-05-06 21:40 UTC (permalink / raw)
  To: James Hogan; +Cc: Ralf Baechle, Paul Burton, linux-mips, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3574 bytes --]


Hi James,
 this hasn't appear in linux-next yet, or in any branch
 of
   git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips.git

 Should I expect it to?

Thanks,
NeilBrown

On Fri, Apr 27 2018, NeilBrown wrote:

> When DMA will be performed to a MIPS32 1004K CPS, the
> L1-cache for the range needs to be flushed and invalidated
> first.
> The code currently takes one of two approaches.
> 1/ If the range is less than the size of the dcache, then
>    HIT type requests flush/invalidate cache lines for the
>    particular addresses.  HIT-type requests a globalised
>    by the CPS so this is safe on SMP.
>
> 2/ If the range is larger than the size of dcache, then
>    INDEX type requests flush/invalidate the whole cache.
>    INDEX type requests affect the local cache only. CPS
>    does not propagate them in any way.  So this invalidation
>    is not safe on SMP CPS systems.
>
> Data corruption due to '2' can quite easily be demonstrated by
> repeatedly "echo 3 > /proc/sys/vm/drop_caches" and then sha1sum
> a file that is several times the size of available memory.
> Dropping caches means that large contiguous extents (large than
> dcache) are more likely.
>
> This was not a problem before Linux-4.8 because option 2 was
> never used if CONFIG_MIPS_CPS was defined.  The commit
> which removed that apparently didn't appreciate the full
> consequence of the change.
>
> We could, in theory, globalize the INDEX based flush by sending an IPI
> to other cores.  These cache invalidation routines can be called with
> interrupts disabled and synchronous IPI require interrupts to be
> enabled.  Asynchronous IPI may not trigger writeback soon enough.
> So we cannot use IPI in practice.
>
> We can already test is IPI would be needed for an INDEX operation
> with r4k_op_needs_ipi(R4K_INDEX).  If this is True then we mustn't try
> the INDEX approach as we cannot use IPI.  If this is False (e.g. when
> there is only one core and hence one L1 cache) then it is safe to
> use the INDEX approach without IPI.
>
> This patch avoids options 2 if r4k_op_needs_ipi(R4K_INDEX), and so
> eliminates the corruption.
>
> Fixes: c00ab4896ed5 ("MIPS: Remove cpu_has_safe_index_cacheops")
> Cc: stable@vger.kernel.org # v4.8+
> Signed-off-by: NeilBrown <neil@brown.name>
> ---
>  arch/mips/mm/c-r4k.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/arch/mips/mm/c-r4k.c b/arch/mips/mm/c-r4k.c
> index 6f534b209971..e12dfa48b478 100644
> --- a/arch/mips/mm/c-r4k.c
> +++ b/arch/mips/mm/c-r4k.c
> @@ -851,9 +851,12 @@ static void r4k_dma_cache_wback_inv(unsigned long addr, unsigned long size)
>  	/*
>  	 * Either no secondary cache or the available caches don't have the
>  	 * subset property so we have to flush the primary caches
> -	 * explicitly
> +	 * explicitly.
> +	 * If we would need IPI to perform an INDEX-type operation, then
> +	 * we have to use the HIT-type alternative as IPI cannot be used
> +	 * here due to interrupts possibly being disabled.
>  	 */
> -	if (size >= dcache_size) {
> +	if (!r4k_op_needs_ipi(R4K_INDEX) && size >= dcache_size) {
>  		r4k_blast_dcache();
>  	} else {
>  		R4600_HIT_CACHEOP_WAR_IMPL;
> @@ -890,7 +893,7 @@ static void r4k_dma_cache_inv(unsigned long addr, unsigned long size)
>  		return;
>  	}
>  
> -	if (size >= dcache_size) {
> +	if (!r4k_op_needs_ipi(R4K_INDEX) && size >= dcache_size) {
>  		r4k_blast_dcache();
>  	} else {
>  		R4600_HIT_CACHEOP_WAR_IMPL;
> -- 
> 2.14.0.rc0.dirty

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] MIPS: c-r4k: fix data corruption related to cache coherence.
  2018-05-06 21:40         ` NeilBrown
@ 2018-05-07 20:16           ` James Hogan
  2018-05-08  1:22             ` NeilBrown
  0 siblings, 1 reply; 9+ messages in thread
From: James Hogan @ 2018-05-07 20:16 UTC (permalink / raw)
  To: NeilBrown; +Cc: Ralf Baechle, Paul Burton, linux-mips, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 425 bytes --]

On Mon, May 07, 2018 at 07:40:49AM +1000, NeilBrown wrote:
> 
> Hi James,
>  this hasn't appear in linux-next yet, or in any branch
>  of
>    git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips.git
> 
>  Should I expect it to?

Sorry Neil, I haven't applied it yet. I'm planning to get a few fixes
sorted this week, at which point it would land in the mips-fixes branch
at the above repo.

Cheers
James

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] MIPS: c-r4k: fix data corruption related to cache coherence.
  2018-05-07 20:16           ` James Hogan
@ 2018-05-08  1:22             ` NeilBrown
  2018-05-11 21:56               ` James Hogan
  0 siblings, 1 reply; 9+ messages in thread
From: NeilBrown @ 2018-05-08  1:22 UTC (permalink / raw)
  To: James Hogan; +Cc: Ralf Baechle, Paul Burton, linux-mips, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 560 bytes --]

On Mon, May 07 2018, James Hogan wrote:

> On Mon, May 07, 2018 at 07:40:49AM +1000, NeilBrown wrote:
>> 
>> Hi James,
>>  this hasn't appear in linux-next yet, or in any branch
>>  of
>>    git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips.git
>> 
>>  Should I expect it to?
>
> Sorry Neil, I haven't applied it yet. I'm planning to get a few fixes
> sorted this week, at which point it would land in the mips-fixes branch
> at the above repo.

Cool, thanks.  I just wanted to be sure it hadn't got lost somehow.

Thanks,
NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] MIPS: c-r4k: fix data corruption related to cache coherence.
  2018-05-08  1:22             ` NeilBrown
@ 2018-05-11 21:56               ` James Hogan
  0 siblings, 0 replies; 9+ messages in thread
From: James Hogan @ 2018-05-11 21:56 UTC (permalink / raw)
  To: NeilBrown; +Cc: Ralf Baechle, Paul Burton, linux-mips, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 662 bytes --]

On Tue, May 08, 2018 at 11:22:36AM +1000, NeilBrown wrote:
> On Mon, May 07 2018, James Hogan wrote:
> 
> > On Mon, May 07, 2018 at 07:40:49AM +1000, NeilBrown wrote:
> >> 
> >> Hi James,
> >>  this hasn't appear in linux-next yet, or in any branch
> >>  of
> >>    git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips.git
> >> 
> >>  Should I expect it to?
> >
> > Sorry Neil, I haven't applied it yet. I'm planning to get a few fixes
> > sorted this week, at which point it would land in the mips-fixes branch
> > at the above repo.
> 
> Cool, thanks.  I just wanted to be sure it hadn't got lost somehow.

Now pushed.

Thanks
James

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2018-05-11 21:56 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-25  4:08 [PATCH] MIPS: c-r4k: fix data corruption related to cache coherence NeilBrown
2018-04-25 21:46 ` James Hogan
2018-04-25 22:00   ` NeilBrown
2018-04-25 22:08     ` James Hogan
2018-04-26 23:28       ` [PATCH v2] " NeilBrown
2018-05-06 21:40         ` NeilBrown
2018-05-07 20:16           ` James Hogan
2018-05-08  1:22             ` NeilBrown
2018-05-11 21:56               ` James Hogan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).