LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* Mapping PCI BAR through /sys/devices/pci* sets cache-disable and write-through
@ 2008-04-16 17:40 Keith Packard
  2008-04-16 22:23 ` Mapping PCI BAR through /sys/devices/pci* sets cache-disable andwrite-through Pallipadi, Venkatesh
  0 siblings, 1 reply; 5+ messages in thread
From: Keith Packard @ 2008-04-16 17:40 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 831 bytes --]

The X server recently (within the last year) switched from mapping the
frame buffer from /dev/mem to using /sys/devices/pci*. This caused a
significant memory bandwidth reduction for writes, similar to the effect
caused when the associated MTRR is set to UC instead of WC.

Looking at the code path, we find, in i386:pci_mmap_page_range:

        prot = pgprot_val(vma->vm_page_prot);
        if (boot_cpu_data.x86 > 3)
                prot |= _PAGE_PCD | _PAGE_PWT;
        vma->vm_page_prot = __pgprot(prot);

Which is to say, on any CPU which supports it, force the cache-disable
and write-through bits on.

Is there some reason this is there? Surely applications should be
expected to set these attributes correctly, currently using MTRR and in
the future, using PAT directly.

-- 
keith.packard@intel.com

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: Mapping PCI BAR through /sys/devices/pci* sets cache-disable andwrite-through
  2008-04-16 17:40 Mapping PCI BAR through /sys/devices/pci* sets cache-disable and write-through Keith Packard
@ 2008-04-16 22:23 ` Pallipadi, Venkatesh
  2008-04-17  5:20   ` Keith Packard
  0 siblings, 1 reply; 5+ messages in thread
From: Pallipadi, Venkatesh @ 2008-04-16 22:23 UTC (permalink / raw)
  To: Keith Packard, linux-kernel; +Cc: Siddha, Suresh B

 

>-----Original Message-----
>From: linux-kernel-owner@vger.kernel.org 
>[mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Keith Packard
>Sent: Wednesday, April 16, 2008 10:40 AM
>To: linux-kernel
>Subject: Mapping PCI BAR through /sys/devices/pci* sets 
>cache-disable andwrite-through
>
>The X server recently (within the last year) switched from mapping the
>frame buffer from /dev/mem to using /sys/devices/pci*. This caused a
>significant memory bandwidth reduction for writes, similar to 
>the effect
>caused when the associated MTRR is set to UC instead of WC.
>
>Looking at the code path, we find, in i386:pci_mmap_page_range:
>
>        prot = pgprot_val(vma->vm_page_prot);
>        if (boot_cpu_data.x86 > 3)
>                prot |= _PAGE_PCD | _PAGE_PWT;
>        vma->vm_page_prot = __pgprot(prot);
>
>Which is to say, on any CPU which supports it, force the cache-disable
>and write-through bits on.
>
>Is there some reason this is there? Surely applications should be
>expected to set these attributes correctly, currently using MTRR and in
>the future, using PAT directly.
>

Setting flags "_PAGE_PCD | _PAGE_PWT" maps to PAT_3 which is "UC". So,
the mapping X server will get is UC and if X sets MTRR to WC, it will
not have any effect on this mapping, which will be still UC. There may
be apps which depend on this /sys/ interface giving UC mapping always...

With PAT patches, we now have /sys/devices/pci*resource_wc (in addition
to resource) for any PREFETCHABLE region. So, X has to change to use
that new interface to get WC mapping.

Thanks,
Venki

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: Mapping PCI BAR through /sys/devices/pci* sets cache-disable andwrite-through
  2008-04-16 22:23 ` Mapping PCI BAR through /sys/devices/pci* sets cache-disable andwrite-through Pallipadi, Venkatesh
@ 2008-04-17  5:20   ` Keith Packard
  2008-04-17 13:14     ` Mapping PCI BAR through /sys/devices/pci* sets cache-disableandwrite-through Pallipadi, Venkatesh
  0 siblings, 1 reply; 5+ messages in thread
From: Keith Packard @ 2008-04-17  5:20 UTC (permalink / raw)
  To: Pallipadi, Venkatesh; +Cc: linux-kernel, Siddha, Suresh B

[-- Attachment #1: Type: text/plain, Size: 1470 bytes --]

On Wed, 2008-04-16 at 15:23 -0700, Pallipadi, Venkatesh wrote:

> Setting flags "_PAGE_PCD | _PAGE_PWT" maps to PAT_3 which is "UC". So,
> the mapping X server will get is UC and if X sets MTRR to WC, it will
> not have any effect on this mapping, which will be still UC. There may
> be apps which depend on this /sys/ interface giving UC mapping always...

Sure, but X wants WC in the biggest way (our measurements show WC being
more than 6 times faster for writes, the operation we do the most of).

Jesse Barnes suggested using the fact that the BAR was prefetchable to
guess that this region should not be mapped UC. I don't know of any
other information available at this API that would help make a better
choice; there are no ioctls on /sys files that we could use to
manipulate the mapping.

> With PAT patches, we now have /sys/devices/pci*resource_wc (in addition
> to resource) for any PREFETCHABLE region. So, X has to change to use
> that new interface to get WC mapping.

Yeah, we'll do that when it becomes available. Would it make sense in
the pre-PAT world to use Jesse's guess?

In any case, we'll continue to use the fact that mprotect is also broken
to get our WC mapping working (using mprotect PROT_NONE followed by
mprotect PROT_READ|PROT_WRITE causes the CD and WT bits to get cleared).
We're fortunate in this case that we've found a bug to exploit that
gives us the desired behaviour.

-- 
keith.packard@intel.com

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: Mapping PCI BAR through /sys/devices/pci* sets cache-disableandwrite-through
  2008-04-17  5:20   ` Keith Packard
@ 2008-04-17 13:14     ` Pallipadi, Venkatesh
  2008-04-17 17:07       ` Jesse Barnes
  0 siblings, 1 reply; 5+ messages in thread
From: Pallipadi, Venkatesh @ 2008-04-17 13:14 UTC (permalink / raw)
  To: Keith Packard; +Cc: linux-kernel, Siddha, Suresh B

 

>-----Original Message-----
>From: Keith Packard [mailto:keithp@keithp.com] 
>Sent: Wednesday, April 16, 2008 10:21 PM
>To: Pallipadi, Venkatesh
>Cc: linux-kernel; Siddha, Suresh B
>Subject: RE: Mapping PCI BAR through /sys/devices/pci* sets 
>cache-disableandwrite-through
>
>On Wed, 2008-04-16 at 15:23 -0700, Pallipadi, Venkatesh wrote:
>
>> Setting flags "_PAGE_PCD | _PAGE_PWT" maps to PAT_3 which is 
>"UC". So,
>> the mapping X server will get is UC and if X sets MTRR to WC, it will
>> not have any effect on this mapping, which will be still UC. 
>There may
>> be apps which depend on this /sys/ interface giving UC 
>mapping always...
>
>Sure, but X wants WC in the biggest way (our measurements show WC being
>more than 6 times faster for writes, the operation we do the most of).
>
>Jesse Barnes suggested using the fact that the BAR was prefetchable to
>guess that this region should not be mapped UC. I don't know of any
>other information available at this API that would help make a better
>choice; there are no ioctls on /sys files that we could use to
>manipulate the mapping.

Yes. But, the API being there for some time may mean that there is some
user who always wants UC behavior with or without PREFETCHABLE flag.
With PAT changes we thought of ioctl and then settled on new interface.

>> With PAT patches, we now have /sys/devices/pci*resource_wc 
>(in addition
>> to resource) for any PREFETCHABLE region. So, X has to change to use
>> that new interface to get WC mapping.
>
>Yeah, we'll do that when it becomes available. Would it make sense in
>the pre-PAT world to use Jesse's guess?
>

Yes. Easiest way to do that will be to use UC_MINUS (Just set PCD bit
and not PWT bit) until PAT changes.
Ioremap() uses UC_MINUS, it should not have any side effects to other
users who expect this mapping to be UC without any MTRR setting.

>In any case, we'll continue to use the fact that mprotect is 
>also broken
>to get our WC mapping working (using mprotect PROT_NONE followed by
>mprotect PROT_READ|PROT_WRITE causes the CD and WT bits to get 
>cleared).
>We're fortunate in this case that we've found a bug to exploit that
>gives us the desired behaviour.
>

Noo.. Now we have one more thing in PAT to do list :-(. To go and plug
those mprotect APIs to prevent users from doing things like this.

Thanks,
Venki


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Mapping PCI BAR through /sys/devices/pci* sets cache-disableandwrite-through
  2008-04-17 13:14     ` Mapping PCI BAR through /sys/devices/pci* sets cache-disableandwrite-through Pallipadi, Venkatesh
@ 2008-04-17 17:07       ` Jesse Barnes
  0 siblings, 0 replies; 5+ messages in thread
From: Jesse Barnes @ 2008-04-17 17:07 UTC (permalink / raw)
  To: Pallipadi, Venkatesh; +Cc: Keith Packard, linux-kernel, Siddha, Suresh B

On Thursday, April 17, 2008 6:14 am Pallipadi, Venkatesh wrote:
> >Jesse Barnes suggested using the fact that the BAR was prefetchable to
> >guess that this region should not be mapped UC. I don't know of any
> >other information available at this API that would help make a better
> >choice; there are no ioctls on /sys files that we could use to
> >manipulate the mapping.
>
> Yes. But, the API being there for some time may mean that there is some
> user who always wants UC behavior with or without PREFETCHABLE flag.
> With PAT changes we thought of ioctl and then settled on new interface.

Yeah, we can't really know.  There *probably* aren't any users here that would 
break, but you never know, and it's best not to change the behavior of an 
existing API...

> >Yeah, we'll do that when it becomes available. Would it make sense in
> >the pre-PAT world to use Jesse's guess?
>
> Yes. Easiest way to do that will be to use UC_MINUS (Just set PCD bit
> and not PWT bit) until PAT changes.
> Ioremap() uses UC_MINUS, it should not have any side effects to other
> users who expect this mapping to be UC without any MTRR setting.

We could probably do that unconditionally though, rather than dependent on the 
prefetchable bit like my patch does.

> >In any case, we'll continue to use the fact that mprotect is
> >also broken
> >to get our WC mapping working (using mprotect PROT_NONE followed by
> >mprotect PROT_READ|PROT_WRITE causes the CD and WT bits to get
> >cleared).
> >We're fortunate in this case that we've found a bug to exploit that
> >gives us the desired behaviour.
>
> Noo.. Now we have one more thing in PAT to do list :-(. To go and plug
> those mprotect APIs to prevent users from doing things like this.

I think it's just a long standing bug...  mprotect really shouldn't be 
clobbering PTE bits regardless of PAT (though with PAT the bug becomes more 
serious).

Jesse

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2008-04-17 17:07 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-04-16 17:40 Mapping PCI BAR through /sys/devices/pci* sets cache-disable and write-through Keith Packard
2008-04-16 22:23 ` Mapping PCI BAR through /sys/devices/pci* sets cache-disable andwrite-through Pallipadi, Venkatesh
2008-04-17  5:20   ` Keith Packard
2008-04-17 13:14     ` Mapping PCI BAR through /sys/devices/pci* sets cache-disableandwrite-through Pallipadi, Venkatesh
2008-04-17 17:07       ` Jesse Barnes

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).