LKML Archive on lore.kernel.org help / color / mirror / Atom feed
* [git pull] drm patches for 2.6.27-rc1 @ 2008-10-17 21:29 Dave Airlie 2008-10-17 22:43 ` Linus Torvalds 2008-10-18 1:37 ` Nick Piggin 0 siblings, 2 replies; 80+ messages in thread From: Dave Airlie @ 2008-10-17 21:29 UTC (permalink / raw) To: torvalds, Andrew Morton; +Cc: linux-kernel, dri-devel [-- Attachment #1: Type: TEXT/PLAIN, Size: 24394 bytes --] Hi Linus, This is a new tree, I had a conflict with your latest tree due to some trivial cleanups you merged. I've added the fix for CVE-2008-3831 which is unembargoed. Please pull the 'drm-next' branch from ssh://master.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6.git drm-next This contains two patches outside the DRM git tree to add exports for GEM functionality while we await Nick Piggins vmap and shmem changes. This update contains the following. 1. CVE-2008-3831 - fixes for memset on arbitrary memory via i915 on G33 hw and above. 2. Intel GEM memory manager . 3. Opregion support for Intel Integrated chipsets. 4. Updated radeon driver with support for new chipsets + bugfixes 5. vblank reworking to save power and interrupts on intel/radeon GPUs. Dave. arch/x86/mm/highmem_32.c | 1 + drivers/gpu/drm/Kconfig | 3 +- drivers/gpu/drm/Makefile | 5 +- drivers/gpu/drm/drm_agpsupport.c | 52 +- drivers/gpu/drm/drm_cache.c | 69 + drivers/gpu/drm/drm_drv.c | 6 + drivers/gpu/drm/drm_fops.c | 8 +- drivers/gpu/drm/drm_gem.c | 421 ++++++ drivers/gpu/drm/drm_irq.c | 464 +++++- drivers/gpu/drm/drm_memory.c | 2 + drivers/gpu/drm/drm_mm.c | 5 +- drivers/gpu/drm/drm_proc.c | 135 ++- drivers/gpu/drm/drm_stub.c | 11 +- drivers/gpu/drm/drm_sysfs.c | 2 +- drivers/gpu/drm/i915/Makefile | 7 +- drivers/gpu/drm/i915/i915_dma.c | 332 +++-- drivers/gpu/drm/i915/i915_drv.c | 476 +------ drivers/gpu/drm/i915/i915_drv.h | 1180 +++++----------- drivers/gpu/drm/i915/i915_gem.c | 2558 ++++++++++++++++++++++++++++++++ drivers/gpu/drm/i915/i915_gem_debug.c | 201 +++ drivers/gpu/drm/i915/i915_gem_proc.c | 292 ++++ drivers/gpu/drm/i915/i915_gem_tiling.c | 257 ++++ drivers/gpu/drm/i915/i915_irq.c | 514 +++++-- drivers/gpu/drm/i915/i915_opregion.c | 371 +++++ drivers/gpu/drm/i915/i915_reg.h | 1417 ++++++++++++++++++ drivers/gpu/drm/i915/i915_suspend.c | 509 +++++++ drivers/gpu/drm/mga/mga_drv.c | 29 +- drivers/gpu/drm/mga/mga_drv.h | 6 +- drivers/gpu/drm/mga/mga_irq.c | 74 +- drivers/gpu/drm/mga/mga_state.c | 2 +- drivers/gpu/drm/r128/r128_drv.c | 29 +- drivers/gpu/drm/r128/r128_drv.h | 11 +- drivers/gpu/drm/r128/r128_irq.c | 55 +- drivers/gpu/drm/r128/r128_state.c | 2 +- drivers/gpu/drm/radeon/radeon_cp.c | 53 +- drivers/gpu/drm/radeon/radeon_drv.c | 32 +- drivers/gpu/drm/radeon/radeon_drv.h | 57 +- drivers/gpu/drm/radeon/radeon_irq.c | 268 +++-- drivers/gpu/drm/radeon/radeon_state.c | 2 +- drivers/gpu/drm/sis/sis_mm.c | 10 +- drivers/gpu/drm/via/via_drv.c | 26 +- drivers/gpu/drm/via/via_drv.h | 16 +- drivers/gpu/drm/via/via_irq.c | 105 +- drivers/gpu/drm/via/via_mm.c | 3 +- include/drm/drm.h | 63 +- include/drm/drmP.h | 249 +++- include/drm/drm_pciids.h | 54 +- include/drm/i915_drm.h | 333 +++++ mm/shmem.c | 1 + 49 files changed, 8814 insertions(+), 1964 deletions(-) create mode 100644 drivers/gpu/drm/drm_cache.c create mode 100644 drivers/gpu/drm/drm_gem.c create mode 100644 drivers/gpu/drm/i915/i915_gem.c create mode 100644 drivers/gpu/drm/i915/i915_gem_debug.c create mode 100644 drivers/gpu/drm/i915/i915_gem_proc.c create mode 100644 drivers/gpu/drm/i915/i915_gem_tiling.c create mode 100644 drivers/gpu/drm/i915/i915_opregion.c create mode 100644 drivers/gpu/drm/i915/i915_reg.h create mode 100644 drivers/gpu/drm/i915/i915_suspend.c commit 4b40893918203ee1a1f6a114316c2a19c072e9bd Author: Matthias Hopf <mhopf@suse.de> Date: Sat Oct 18 07:18:05 2008 +1000 drm/i915: fix ioremap of a user address for non-root (CVE-2008-3831) Olaf Kirch noticed that the i915_set_status_page() function of the i915 kernel driver calls ioremap with an address offset that is supplied by userspace via ioctl. The function zeroes the mapped memory via memset and tells the hardware about the address. Turns out that access to that ioctl is not restricted to root so users could probably exploit that to do nasty things. We haven't tried to write actual exploit code though. It only affects the Intel G33 series and newer. Signed-off-by: Dave Airlie <airlied@redhat.com> commit 9e0b97e37fddaf5419d8af24362015ab684eff7e Author: Dave Airlie <airlied@redhat.com> Date: Fri Oct 17 09:29:14 2008 +1000 drm: make CONFIG_DRM depend on CONFIG_SHMEM. This can be removed later when DRM doesn't depend on shmem. Signed-off-by: Dave Airlie <airlied@redhat.com> commit edc6f389f6ae9cb7621270a8ddbb1892bd8df125 Author: Alex Deucher <alexdeucher@gmail.com> Date: Fri Oct 17 09:21:45 2008 +1000 radeon: fix PCI bus mastering support enables. Someone noticed these registers moved around for later chips, so we redo the codepaths per-chip. PCIE chips don't appear to require explicit enables. Signed-off-by: Dave Airlie <airlied@redhat.com> commit b2ceddfa52cbeb244b90096f1e8d3e9f7e0ce299 Author: Alex Deucher <alexdeucher@gmail.com> Date: Fri Oct 17 09:19:33 2008 +1000 radeon: add RS400 family support. This adds support for the RS400 family of IGPs for Intel CPUs. Signed-off-by: Dave Airlie <airlied@redhat.com> commit f0738e92403466d45cfb5008da668260c77fff4b Author: Alex Deucher <alexdeucher@gmail.com> Date: Thu Oct 16 17:12:02 2008 +1000 drm/radeon: add support for RS740 IGP chipsets. This adds support for the HS2100 IGP chipset. Signed-off-by: Dave Airlie <airlied@redhat.com> commit b612eda98e4b4bae4c98a863f039bc89425f9039 Author: Eric Anholt <eric@anholt.net> Date: Wed Oct 15 00:05:58 2008 -0700 i915: GM45 has GM965-style MCH setup. Fixes tiling swizzling mode failures that manifest in glReadPixels(). Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com> commit 6dbe2772d6af067845bab57be490c302f4490ac7 Author: Keith Packard <keithp@keithp.com> Date: Tue Oct 14 21:41:13 2008 -0700 i915: Don't run retire work handler while suspended At leavevt and lastclose time, cancel any pending retire work handler invocation, and keep the retire work handler from requeuing itself if it is currently running. This patch restructures i915_gem_idle to perform all of these tasks instead of having both leavevt and lastclose call a sequence of functions. Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com> commit ba1eb1d825fdef40f69871caf8e5842d00efbbc5 Author: Keith Packard <keithp@keithp.com> Date: Tue Oct 14 19:55:10 2008 -0700 i915: Map status page cached for chips with GTT-based HWS location. This should improve performance by avoiding uncached reads by the CPU (the point of having a status page), and may improve stability. This patch only affects G33, GM45 and G45 chips as those are the only ones using GTT-based HWS mappings. Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com> commit 50aa253d820ad4577e2231202f2c8fd89f9dc4e6 Author: Keith Packard <keithp@keithp.com> Date: Tue Oct 14 17:20:35 2008 -0700 i915: Fix up ring initialization to cover G45 oddities G45 appears quite sensitive to ring initialization register writes, sometimes leaving the HEAD register with the START register contents. Check to make sure HEAD is reset correctly when START is written, and fix it up, screaming loudly. Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com> commit 0cdad7e88a23910a911a3339ff2d70f8f952d7b8 Author: Keith Packard <keithp@keithp.com> Date: Tue Oct 14 17:19:38 2008 -0700 i915: Use non-reserved status page index for breadcrumb Dwords 0 through 0x1f are reserved for use by the hardware. Move the GEM breadcrumb from 0x10 to 0x20 to keep out of this area. Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com> commit 630681d9a5314e6cf53d144f7f58b7c19862a7d3 Author: Eric Anholt <eric@anholt.net> Date: Mon Oct 6 15:14:12 2008 -0700 drm: Increment dev_priv->irq_received so i915_gem_interrupts count works. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com> commit 9bfbd5cb72c9edb8504a4a7a0aa89cdb2fcb4845 Author: Jesse Barnes <jbarnes@virtuousgeek.org> Date: Mon Sep 15 15:00:33 2008 -0700 drm: kill drm_device->irq Like the last patch but adds a macro to get at the irq value instead of dereferencing pdev directly. Should make things easier for the BSD guys and if we ever support non-PCI devices. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Dave Airlie <airlied@redhat.com> commit e0f0754ff6128570dcf38032f5bfb1f195e5bbee Author: Dave Airlie <airlied@redhat.com> Date: Tue Oct 7 13:41:49 2008 +1000 drm: wbinvd is cache coherent. doing an ipi for the wbinvd case isn't necessary. Signed-off-by: Dave Airlie <airlied@redhat.com> commit e7d22bc3cb57126196c4f475d4e55aa44e151784 Author: Dave Airlie <airlied@redhat.com> Date: Tue Oct 7 13:40:36 2008 +1000 i915: add missing return in error path. Signed-off-by: Dave Airlie <airlied@redhat.com> commit 2bdf00b22154023ac312481583603f4724eb1401 Author: Dave Airlie <airlied@redhat.com> Date: Tue Oct 7 13:40:10 2008 +1000 i915: fixup permissions on gem ioctls. init/entervt/leavevt should be root-only master ioctls. Signed-off-by: Dave Airlie <airlied@redhat.com> commit 3043c60c485ad694392d3f71bd7ef9f5c5f7cfdd Author: Eric Anholt <eric@anholt.net> Date: Thu Oct 2 12:24:47 2008 -0700 drm: Clean up many sparse warnings in i915. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com> commit bd88ee4c1b1c8fc8b78a0ba7b6235d230cea0d05 Author: Eric Anholt <eric@anholt.net> Date: Tue Sep 23 14:50:57 2008 -0700 drm: Use ioremap_wc in i915_driver instead of ioremap, since we always want WC. Fixes failure to map the ringbuffer when PAT tells us we don't get to do uncached on something that's already mapped WC, or something along those lines. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com> commit 28af0a2767412937e8424364a8ece9b230bdbc83 Author: Eric Anholt <eric@anholt.net> Date: Mon Sep 15 13:13:34 2008 -0700 drm: G33-class hardware has a newer 965-style MCH (no DCC register). Fixes bad software fallback rendering in Mesa in dual-channel configurations. d9a2470012588dc5313a5ac8bb2f03575af00e99 Signed-off-by: Dave Airlie <airlied@redhat.com> commit 4f481ed22ec0d412336a13dc4477f6d0f3688882 Author: Eric Anholt <eric@anholt.net> Date: Wed Sep 10 14:22:49 2008 -0700 drm: Avoid oops in GEM execbuffers with bad arguments. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com> commit d4e7b898c12b2b458323243abdd4a215f8f8f090 Author: Eric Anholt <eric@anholt.net> Date: Tue Sep 9 11:40:34 2008 -0700 DRM: Return -EBADF on bad object in flink, and return curent name if it exists. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com> commit dbb19d302baa8ba518599701914f600935fc53fa Author: Kristian Høgsberg <krh@redhat.com> Date: Wed Aug 20 11:04:27 2008 -0400 i915 gem: install and uninstall irq handler in entervt and leavevt ioctls. Signed-off-by: Kristian Høgsberg <krh@redhat.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com> commit c99b058f132388a666544d293392d52d1def6b12 Author: Kristian Høgsberg <krh@redhat.com> Date: Wed Aug 20 11:20:13 2008 -0400 i915: Make use of sarea_priv conditional. We fail ioctls that depend on the sarea_priv with EINVAL. Signed-off-by: Kristian Høgsberg <krh@redhat.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com> commit 546b0974c39657017407c86fe79811100b60700d Author: Eric Anholt <eric@anholt.net> Date: Mon Sep 1 16:45:29 2008 -0700 i915: Use struct_mutex to protect ring in GEM mode. In the conversion for GEM, we had stopped using the hardware lock to protect ring usage, since it was all internal to the DRM now. However, some paths weren't converted to using struct_mutex to prevent multiple threads from concurrently working on the ring, in particular between the vblank swap handler and ioctls. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com> commit ed4c9c4acf948b42b138747fcb8843ecb1a24ce4 Author: Kristian Høgsberg <krh@redhat.com> Date: Wed Aug 20 11:08:52 2008 -0400 i915: Add chip set ID param. Signed-off-by: Kristian Høgsberg <krh@redhat.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com> commit 673a394b1e3b69be886ff24abfd6df97c52e8d08 Author: Eric Anholt <eric@anholt.net> Date: Wed Jul 30 12:06:12 2008 -0700 drm: Add GEM ("graphics execution manager") to i915 driver. GEM allows the creation of persistent buffer objects accessible by the graphics device through new ioctls for managing execution of commands on the device. The userland API is almost entirely driver-specific to ensure that any driver building on this model can easily map the interface to individual driver requirements. GEM is used by the 2d driver for managing its internal state allocations and will be used for pixmap storage to reduce memory consumption and enable zero-copy GLX_EXT_texture_from_pixmap, and in the 3d driver is used to enable GL_EXT_framebuffer_object and GL_ARB_pixel_buffer_object. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com> commit d1d8c925b71dd6753bf438f9e14a9e5c5183bcc6 Author: Eric Anholt <eric@anholt.net> Date: Thu Aug 21 12:53:33 2008 -0700 Export kmap_atomic_pfn for DRM-GEM. The driver would like to map IO space directly for copying data in when appropriate, to avoid CPU cache flushing for streaming writes. kmap_atomic_pfn lets us avoid IPIs associated with ioremap for this process. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com> commit 395e0ddc44005ced5e4fed9bfc2e4bdf63d37627 Author: Keith Packard <keithp@keithp.com> Date: Fri Jun 20 00:08:06 2008 -0700 Export shmem_file_setup for DRM-GEM GEM needs to create shmem files to back buffer objects. Though currently creation of files for objects could have been driven from userland, the modesetting work will require allocation of buffer objects before userland is running, for boot-time message display. Signed-off-by: Eric Anholt <eric@anholt.net> Cc: Nick Piggin <npiggin@suse.de> Signed-off-by: Dave Airlie <airlied@redhat.com> commit 0a3e67a4caac273a3bfc4ced3da364830b1ab241 Author: Jesse Barnes <jbarnes@virtuousgeek.org> Date: Tue Sep 30 12:14:26 2008 -0700 drm: Rework vblank-wait handling to allow interrupt reduction. Previously, drivers supporting vblank interrupt waits would run the interrupt all the time, or all the time that any 3d client was running, preventing the CPU from sleeping for long when the system was otherwise idle. Now, interrupts are disabled any time that no client is waiting on a vblank event. The new method uses vblank counters on the chipsets when the interrupts are turned off, rather than counting interrupts, so that we can continue to present accurate vblank numbers. Co-author: Michel Dänzer <michel@tungstengraphics.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com> commit 2df68b439fcb97a4c55f81516206ef4ee325e28d Author: David Howells <dhowells@redhat.com> Date: Tue Sep 2 11:03:14 2008 +1000 drm/cred: wrap task credential accesses in the drm driver. Wrap access to task credentials so that they can be separated more easily from the task_struct during the introduction of COW creds. Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id(). Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more sense to use RCU directly rather than a convenient wrapper; these will be addressed by later patches. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: David Airlie <airlied@redhat.com> commit b9bfdfe6703eb089839d48316a79c84924a3c335 Author: Jesse Barnes <jbarnes@virtuousgeek.org> Date: Mon Aug 25 15:16:19 2008 -0700 new chip name is GM45 Author: Zhenyu Wang <zhenyu.z.wang@intel.com> i915: official name for GM45 chipset Signed-off-by: Zhenyu Wang <zhenyu.z.wang@intel.com> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Dave Airlie <airlied@redhat.com> commit 317c35d1446f68b34d4de4e1100fc01680bd4877 Author: Jesse Barnes <jbarnes@virtuousgeek.org> Date: Mon Aug 25 15:11:06 2008 -0700 separate i915 suspend/resume functions into their own file [Patch against drm-next. Consider this a trial balloon for our new Linux development model.] This is a big chunk of code. Separating it out makes it easier to change without churn on the main i915_drv.c file (and there will be churn as we fix bugs and add things like kernel mode setting). Also makes it easier to share this file with BSD. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Dave Airlie <airlied@redhat.com> commit 6b79d521e07aae155303a992245abb539974dbaa Author: Dave Airlie <airlied@redhat.com> Date: Tue Sep 2 10:10:16 2008 +1000 radeon: fix writeback across suspend/resume. Make writeback not get disabled on resume. Signed-off-by: Dave Airlie <airlied@redhat.com> commit 38eda21189b414b1520ea7aa0e71220796f7008f Author: Dave Airlie <airlied@redhat.com> Date: Tue Sep 2 10:06:06 2008 +1000 drm: fix sysfs error path. Pointed out by Roel Kluin on dri-devel. Signed-off-by: Dave Airlie <airlied@redhat.com> commit dfcf96d09cff63c9aaa8e7c98bbc71e5073b1377 Author: Adrian Bunk <bunk@kernel.org> Date: Sun Aug 24 17:11:22 2008 +1000 FB_SIS=m, DRM_SIS=y is not a legal configuration. Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Dave Airlie <airlied@redhat.com> commit 8ee1c3db9075cb3211352e737e0feb98fd733b20 Author: Matthew Garrett <mjg59@srcf.ucam.org> Date: Tue Aug 5 19:37:25 2008 +0100 Add Intel ACPI IGD OpRegion support This adds the support necessary for allowing ACPI backlight control to work on some newer Intel-based graphics systems. Tested on Thinkpad T61 and HP 2510p hardware. Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Dave Airlie <airlied@linux.ie> commit 398c9cb20b5c6c5d1313912b937d653a46fec578 Author: Keith Packard <keithp@keithp.com> Date: Wed Jul 30 13:03:43 2008 -0700 i915: Initialize hardware status page at device load when possible. Some chips were unstable with repeated setup/teardown of the hardware status page. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com> commit d3a6d4467ca44833bcb4ba1893a7aeaae939e4d5 Author: Keith Packard <keithp@keithp.com> Date: Wed Jul 30 12:21:20 2008 -0700 i915: Track progress inside of batchbuffers for determining wedgedness. This avoids early termination for long-running commands. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com> commit ed4cb4142b242d8090d3811d5eb4abf6aa985bc8 Author: Eric Anholt <eric@anholt.net> Date: Tue Jul 29 12:10:39 2008 -0700 i915: Add support for MSI and interrupt mitigation. Previous attempts at interrupt mitigation had been foiled by i915_wait_irq's failure to update the sarea seqno value when the status page indicated that the seqno had already been passed. MSI support has been seen to cut CPU costs by up to 40% in some workloads by avoiding other expensive interrupt handlers for frequent graphics interrupts. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com> commit 585fb111348f7cdc30c6a1b903987612ddeafb23 Author: Jesse Barnes <jbarnes@virtuousgeek.org> Date: Tue Jul 29 11:54:06 2008 -0700 i915: Use more consistent names for regs, and store them in a separate file. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com> commit 962d4fd7273e144ae003ddb90138ae4b80567c70 Author: Keith Packard <keithp@keithp.com> Date: Wed Jul 30 12:36:08 2008 -0700 i915: Ignore X server provided mmio address It is already correctly detected by the kernel for use in suspend/resume. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com> commit 0790d5e148c0747499742a3c09ba5f1c07f9ed0d Author: Keith Packard <keithp@keithp.com> Date: Wed Jul 30 12:28:47 2008 -0700 i915: remove settable use_mi_batchbuffer_start The driver can know what hardware requires MI_BATCH_BUFFER vs MI_BATCH_BUFFER_START; there's no reason to let user mode configure this. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com> commit 48f185d0e0f3adde81117ead074e5e6ec5548449 Author: David Howells <dhowells@redhat.com> Date: Wed Jul 30 12:29:38 2008 -0700 SiS DRM: fix a pointer cast warning Fix a pointer cast warning in the SIS DRM code. This was introduced in patch ce65a44de07f73ceda1749812b75086b7add408d. Signed-off-by: David Howells <dhowells@redhat.com> Cc: Dave Airlie <airlied@linux.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Airlie <airlied@redhat.com> commit 6bb9e4bff5c6fd908d907222108ef5650d77972f Author: David Howells <dhowells@redhat.com> Date: Wed Jul 30 12:29:37 2008 -0700 SiS DRM: fix the memory allocator if the SIS FB is built as a module Fix the SIS DRM memory allocator if the SIS FB built as a module. The SIS DRM code initialises the mm allocation hooks, but _only_ if the SIS FB is not built as a module because it depends on CONFIG_FB_SIS, and that's unset if the SIS FB is not built in. It must check CONFIG_FB_SIS_MODULE as well. Signed-off-by: David Howells <dhowells@redhat.com> Cc: Dave Airlie <airlied@linux.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Airlie <airlied@redhat.com> commit ddb7f4cb819fb6b9df261ce4c80b3c6f4852620d Author: Carlos R. Mafra <crmafra2@gmail.com> Date: Wed Jul 30 12:29:37 2008 -0700 drm: remove #define's for non-linux systems There is no point in considering FreeBSD et al. in the linux kernel source code. Signed-off-by: Carlos R. Mafra <crmafra@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Airlie <airlied@redhat.com> ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-17 21:29 [git pull] drm patches for 2.6.27-rc1 Dave Airlie @ 2008-10-17 22:43 ` Linus Torvalds 2008-10-18 2:10 ` Eric Anholt 2008-10-18 9:11 ` Dave Airlie 2008-10-18 1:37 ` Nick Piggin 1 sibling, 2 replies; 80+ messages in thread From: Linus Torvalds @ 2008-10-17 22:43 UTC (permalink / raw) To: Dave Airlie; +Cc: Andrew Morton, linux-kernel, dri-devel On Fri, 17 Oct 2008, Dave Airlie wrote: > > Please pull the 'drm-next' branch from > ssh://master.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6.git drm-next Grr. This whole merge series has been full of people sending me UNTESTED CRAP. So what's the excuse _this_ time for adding all these stupid warnings to my build log? Did nobody test this? drivers/gpu/drm/drm_proc.c: In function ‘drm_gem_one_name_info’: drivers/gpu/drm/drm_proc.c:525: warning: format ‘%d’ expects type ‘int’, but argument 3 has type ‘size_t’ drivers/gpu/drm/drm_proc.c:533: warning: format ‘%9d’ expects type ‘int’, but argument 4 has type ‘size_t’ drivers/gpu/drm/i915/i915_gem.c: In function ‘i915_gem_gtt_pwrite’: drivers/gpu/drm/i915/i915_gem.c:184: warning: unused variable ‘vaddr_atomic’ and I wonder how many other warnings got added that I never even noticed because I don't build them? And yes, it's not just warnings. One of thse is horribly bad code: nid->len += sprintf(&nid->buf[nid->len], "%6d%9d%8d%9d\n", obj->name, obj->size, atomic_read(&obj->handlecount.refcount), atomic_read(&obj->refcount.refcount)); where it's just wrong to use the field width as a separator. Because if the counts are big enough, the separator suddenly goes away! So that format string should be "%6d %8zd %7d %8d\n" instead. Which gives the same output when you don't overflow, and doesn't have the overflow bug when you do. As to that "vaddr_atomic" thing, the warning would have been avoided if you had just cleanly split up the optimistic fast case. IOW, write cleaner code, and the warning just goes away on its own. Something like the appended. UNTESTED! Hmm? I really wish people were more careful, and took more pride in trying to write readable code, with small modular functions instead. And move those variables down to the block they are needed in. Anyway, I pulled the thing, but _please_ test this cleanup and send it back to me if it passes your testing. Ok? Linus --- drivers/gpu/drm/drm_proc.c | 4 +- drivers/gpu/drm/i915/i915_gem.c | 59 +++++++++++++++++++++++--------------- 2 files changed, 38 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/drm_proc.c b/drivers/gpu/drm/drm_proc.c index d490db4..ae73b7f 100644 --- a/drivers/gpu/drm/drm_proc.c +++ b/drivers/gpu/drm/drm_proc.c @@ -522,12 +522,12 @@ static int drm_gem_one_name_info(int id, void *ptr, void *data) struct drm_gem_object *obj = ptr; struct drm_gem_name_info_data *nid = data; - DRM_INFO("name %d size %d\n", obj->name, obj->size); + DRM_INFO("name %d size %zd\n", obj->name, obj->size); if (nid->eof) return 0; nid->len += sprintf(&nid->buf[nid->len], - "%6d%9d%8d%9d\n", + "%6d %8zd %7d %8d\n", obj->name, obj->size, atomic_read(&obj->handlecount.refcount), atomic_read(&obj->refcount.refcount)); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 9ac73dd..b8c8b2e 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -171,6 +171,36 @@ i915_gem_pread_ioctl(struct drm_device *dev, void *data, return 0; } +/* + * Try to write quickly with an atomic kmap. Return true on success. + * + * If this fails (which includes a partial write), we'll redo the whole + * thing with the slow version. + * + * This is a workaround for the low performance of iounmap (approximate + * 10% cpu cost on normal 3D workloads). kmap_atomic on HIGHMEM kernels + * happens to let us map card memory without taking IPIs. When the vmap + * rework lands we should be able to dump this hack. + */ +static inline int fast_user_write(unsigned long pfn, char __user *user_data, int l) +{ +#ifdef CONFIG_HIGHMEM + unsigned long unwritten; + char *vaddr_atomic; + + vaddr_atomic = kmap_atomic_pfn(pfn, KM_USER0); +#if WATCH_PWRITE + DRM_INFO("pwrite i %d o %d l %d pfn %ld vaddr %p\n", + i, o, l, pfn, vaddr_atomic); +#endif + unwritten = __copy_from_user_inatomic_nocache(vaddr_atomic + o, user_data, l); + kunmap_atomic(vaddr_atomic, KM_USER0); + return !unwritten; +#else + return 1; +#endif +} + static int i915_gem_gtt_pwrite(struct drm_device *dev, struct drm_gem_object *obj, struct drm_i915_gem_pwrite *args, @@ -180,12 +210,7 @@ i915_gem_gtt_pwrite(struct drm_device *dev, struct drm_gem_object *obj, ssize_t remain; loff_t offset; char __user *user_data; - char __iomem *vaddr; - char *vaddr_atomic; - int i, o, l; int ret = 0; - unsigned long pfn; - unsigned long unwritten; user_data = (char __user *) (uintptr_t) args->data_ptr; remain = args->size; @@ -209,6 +234,9 @@ i915_gem_gtt_pwrite(struct drm_device *dev, struct drm_gem_object *obj, obj_priv->dirty = 1; while (remain > 0) { + unsigned long pfn; + int i, o, l; + /* Operation in this page * * i = page number @@ -223,25 +251,10 @@ i915_gem_gtt_pwrite(struct drm_device *dev, struct drm_gem_object *obj, pfn = (dev->agp->base >> PAGE_SHIFT) + i; -#ifdef CONFIG_HIGHMEM - /* This is a workaround for the low performance of iounmap - * (approximate 10% cpu cost on normal 3D workloads). - * kmap_atomic on HIGHMEM kernels happens to let us map card - * memory without taking IPIs. When the vmap rework lands - * we should be able to dump this hack. - */ - vaddr_atomic = kmap_atomic_pfn(pfn, KM_USER0); -#if WATCH_PWRITE - DRM_INFO("pwrite i %d o %d l %d pfn %ld vaddr %p\n", - i, o, l, pfn, vaddr_atomic); -#endif - unwritten = __copy_from_user_inatomic_nocache(vaddr_atomic + o, - user_data, l); - kunmap_atomic(vaddr_atomic, KM_USER0); + if (!fast_user_write(pfn, user_data, l)) { + unsigned long unwritten; + char __iomem *vaddr; - if (unwritten) -#endif /* CONFIG_HIGHMEM */ - { vaddr = ioremap_wc(pfn << PAGE_SHIFT, PAGE_SIZE); #if WATCH_PWRITE DRM_INFO("pwrite slow i %d o %d l %d " ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-17 22:43 ` Linus Torvalds @ 2008-10-18 2:10 ` Eric Anholt 2008-10-18 2:47 ` Linus Torvalds 2008-10-20 20:04 ` Jesse Barnes 2008-10-18 9:11 ` Dave Airlie 1 sibling, 2 replies; 80+ messages in thread From: Eric Anholt @ 2008-10-18 2:10 UTC (permalink / raw) To: Linus Torvalds; +Cc: Dave Airlie, Andrew Morton, linux-kernel, dri-devel [-- Attachment #1: Type: text/plain, Size: 2460 bytes --] On Fri, 2008-10-17 at 15:43 -0700, Linus Torvalds wrote: > > On Fri, 17 Oct 2008, Dave Airlie wrote: > > > > Please pull the 'drm-next' branch from > > ssh://master.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6.git drm-next > > Grr. > > This whole merge series has been full of people sending me UNTESTED CRAP. > > So what's the excuse _this_ time for adding all these stupid warnings to > my build log? Did nobody test this? > > drivers/gpu/drm/drm_proc.c: In function ‘drm_gem_one_name_info’: > drivers/gpu/drm/drm_proc.c:525: warning: format ‘%d’ expects type ‘int’, but argument 3 has type ‘size_t’ > drivers/gpu/drm/drm_proc.c:533: warning: format ‘%9d’ expects type ‘int’, but argument 4 has type ‘size_t’ > drivers/gpu/drm/i915/i915_gem.c: In function ‘i915_gem_gtt_pwrite’: > drivers/gpu/drm/i915/i915_gem.c:184: warning: unused variable ‘vaddr_atomic’ > > and I wonder how many other warnings got added that I never even noticed > because I don't build them? > > And yes, it's not just warnings. One of thse is horribly bad code: > > nid->len += sprintf(&nid->buf[nid->len], > "%6d%9d%8d%9d\n", > obj->name, obj->size, > atomic_read(&obj->handlecount.refcount), > atomic_read(&obj->refcount.refcount)); > > where it's just wrong to use the field width as a separator. Because if > the counts are big enough, the separator suddenly goes away! > > So that format string should be > > "%6d %8zd %7d %8d\n" > > instead. Which gives the same output when you don't overflow, and doesn't > have the overflow bug when you do. > > As to that "vaddr_atomic" thing, the warning would have been avoided if > you had just cleanly split up the optimistic fast case. > > IOW, write cleaner code, and the warning just goes away on its own. > Something like the appended. UNTESTED! Yeah, none of us are on x86-64, so we missed those warnings in testing. The change looks pretty good except for s/return 1/return 0/. We wanted to pull the i_wish_it_was_ioremap_atomic() hack out so that we could use it for relocation updates as well (which aren't copy_from_user), and I've started writing that patch. Expect something pushed at you soon. -- Eric Anholt eric@anholt.net eric.anholt@intel.com [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 197 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-18 2:10 ` Eric Anholt @ 2008-10-18 2:47 ` Linus Torvalds 2008-10-18 3:49 ` Keith Packard 2008-10-18 7:49 ` Eric Anholt 2008-10-20 20:04 ` Jesse Barnes 1 sibling, 2 replies; 80+ messages in thread From: Linus Torvalds @ 2008-10-18 2:47 UTC (permalink / raw) To: Eric Anholt; +Cc: Dave Airlie, Andrew Morton, linux-kernel, dri-devel On Fri, 17 Oct 2008, Eric Anholt wrote: > > Yeah, none of us are on x86-64, so we missed those warnings in testing. Really? None of you use any modern CPU's, or you're _all_ running 32-bit distros even though your cpu's could support 64-bit ones? I would suggest at least _somebody_ in the intel graphics team try to get with the times.. I realize that Otellini was saying "Nobody needs 64-bit on the desktop" a few years ago, but he was full of sh*t then, and it's certainly not remotely true now. It's not being disloyal to your CEO, really. I'm pretty sure nobody will be fired just for ignoring that whole "640kB^H^H^H^H^H32-bits should be enough for everybody" idiocy. > The change looks pretty good except for s/return 1/return 0/. oops, yes. Thinko. I reversed the meaning of the return value: at first I just returned "unwritten", but decided that that was a really ugly interface, and then when I prettied that up, I didn't fix the !HIGHMEM case. And I obviously never _tested_ it. Linus ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-18 2:47 ` Linus Torvalds @ 2008-10-18 3:49 ` Keith Packard 2008-10-18 6:44 ` Corbin Simpson 2008-10-18 7:49 ` Eric Anholt 1 sibling, 1 reply; 80+ messages in thread From: Keith Packard @ 2008-10-18 3:49 UTC (permalink / raw) To: Linus Torvalds Cc: keithp, Eric Anholt, Dave Airlie, Andrew Morton, linux-kernel, dri-devel [-- Attachment #1: Type: text/plain, Size: 825 bytes --] On Fri, 2008-10-17 at 19:47 -0700, Linus Torvalds wrote: > Really? None of you use any modern CPU's, or you're _all_ running 32-bit > distros even though your cpu's could support 64-bit ones? We're lazy, perhaps even lazier than yourself. Given that the whole goal is to essentially ignore the CPU and get our code running on the GPU, it's hard to get excited about the kind of kernel we're running on the CPU. We've got a bunch of test boxes that run 64-bits, unfortunately, the people doing builds there appear not to care about warnings. That will get fixed. I've also got this new G45 box here; perhaps it's time to enter the 21st century and run it in native 64-bit mode. It's scary though; I've been writing window systems for 32-bit machines for thirty years now. -- keith.packard@intel.com [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-18 3:49 ` Keith Packard @ 2008-10-18 6:44 ` Corbin Simpson 0 siblings, 0 replies; 80+ messages in thread From: Corbin Simpson @ 2008-10-18 6:44 UTC (permalink / raw) To: Keith Packard Cc: Linus Torvalds, Dave Airlie, linux-kernel, dri-devel, Andrew Morton -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Keith Packard wrote: > On Fri, 2008-10-17 at 19:47 -0700, Linus Torvalds wrote: > >> Really? None of you use any modern CPU's, or you're _all_ running 32-bit >> distros even though your cpu's could support 64-bit ones? > > We're lazy, perhaps even lazier than yourself. Given that the whole goal > is to essentially ignore the CPU and get our code running on the GPU, > it's hard to get excited about the kind of kernel we're running on the > CPU. > > We've got a bunch of test boxes that run 64-bits, unfortunately, the > people doing builds there appear not to care about warnings. That will > get fixed. Indeed. I have been running 64 bit builds for quite a while now, and merely ignored the warnings as "not my problem." In the future, I shall make it my problem. ~ C. - -- ~ Corbin Simpson <MostAwesomeDude@gmail.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.9 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iEYEARECAAYFAkj5hecACgkQeCCY8PC5utAMAQCggtkBoolO58rW5PnPkYTosyZ5 DkgAnAqRSZzoFhgkdFcL92qV6Zyc7usJ =vpMP -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-18 2:47 ` Linus Torvalds 2008-10-18 3:49 ` Keith Packard @ 2008-10-18 7:49 ` Eric Anholt 2008-10-19 17:52 ` Peter Zijlstra 2008-10-20 16:31 ` Linus Torvalds 1 sibling, 2 replies; 80+ messages in thread From: Eric Anholt @ 2008-10-18 7:49 UTC (permalink / raw) To: Linus Torvalds; +Cc: Dave Airlie, Andrew Morton, linux-kernel, dri-devel [-- Attachment #1: Type: text/plain, Size: 1142 bytes --] On Fri, 2008-10-17 at 19:47 -0700, Linus Torvalds wrote: > > On Fri, 17 Oct 2008, Eric Anholt wrote: > > > > Yeah, none of us are on x86-64, so we missed those warnings in testing. > > Really? None of you use any modern CPU's, or you're _all_ running 32-bit > distros even though your cpu's could support 64-bit ones? > > I would suggest at least _somebody_ in the intel graphics team try to get > with the times.. I realize that Otellini was saying "Nobody needs 64-bit > on the desktop" a few years ago, but he was full of sh*t then, and it's > certainly not remotely true now. > > It's not being disloyal to your CEO, really. I'm pretty sure nobody will > be fired just for ignoring that whole "640kB^H^H^H^H^H32-bits should be > enough for everybody" idiocy. Writing 3D drivers means running 3D games. Running 3D games unfortunately means running a lot of 32-bit userland as the fun stuff is binary-only. So I stick to a 32-bit system, becuase past experience trying to run both on the same system has been misery. -- Eric Anholt eric@anholt.net eric.anholt@intel.com [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 197 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-18 7:49 ` Eric Anholt @ 2008-10-19 17:52 ` Peter Zijlstra 2008-10-20 4:17 ` Steven J Newbury 2008-10-20 16:31 ` Linus Torvalds 1 sibling, 1 reply; 80+ messages in thread From: Peter Zijlstra @ 2008-10-19 17:52 UTC (permalink / raw) To: Eric Anholt Cc: Linus Torvalds, Dave Airlie, Andrew Morton, linux-kernel, dri-devel On Sat, 2008-10-18 at 00:49 -0700, Eric Anholt wrote: > Writing 3D drivers means running 3D games. Running 3D games > unfortunately means running a lot of 32-bit userland as the fun stuff is > binary-only. So I stick to a 32-bit system, becuase past experience > trying to run both on the same system has been misery. You can run 32bit user-space on a 64bit kernel just fine. Any breakage, if anything, you find will need to get fixed anyway. Also, one would hope Intel is poking these game writers to migrate to 64bit Linux ;-) ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-19 17:52 ` Peter Zijlstra @ 2008-10-20 4:17 ` Steven J Newbury 0 siblings, 0 replies; 80+ messages in thread From: Steven J Newbury @ 2008-10-20 4:17 UTC (permalink / raw) To: Peter Zijlstra Cc: Eric Anholt, Dave Airlie, Andrew Morton, Linus Torvalds, linux-kernel, dri-devel On Sun, 2008-10-19 at 19:52 +0200, Peter Zijlstra wrote: > On Sat, 2008-10-18 at 00:49 -0700, Eric Anholt wrote: > > > Writing 3D drivers means running 3D games. Running 3D games > > unfortunately means running a lot of 32-bit userland as the fun stuff is > > binary-only. So I stick to a 32-bit system, becuase past experience > > trying to run both on the same system has been misery. > > You can run 32bit user-space on a 64bit kernel just fine. > Any breakage, if anything, you find will need to get fixed anyway. > I'd like to second this. 32bit on 64bit works fine. I've been maintaining separate multilib versions of the Gentoo libdrm/Mesa ebuilds for 32bit/64bit. It is important that the 32bit and 64bit graphics stacks are synchronised, rather than relying on external compatibility libraries as Gentoo does currently. Failing to do so does lead to instability and/or failure. > Also, one would hope Intel is poking these game writers to migrate to > 64bit Linux ;-) > Indeed! :) ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-18 7:49 ` Eric Anholt 2008-10-19 17:52 ` Peter Zijlstra @ 2008-10-20 16:31 ` Linus Torvalds 1 sibling, 0 replies; 80+ messages in thread From: Linus Torvalds @ 2008-10-20 16:31 UTC (permalink / raw) To: Eric Anholt; +Cc: Dave Airlie, Andrew Morton, linux-kernel, dri-devel On Sat, 18 Oct 2008, Eric Anholt wrote: > > > It's not being disloyal to your CEO, really. I'm pretty sure nobody will > > be fired just for ignoring that whole "640kB^H^H^H^H^H32-bits should be > > enough for everybody" idiocy. > > Writing 3D drivers means running 3D games. Running 3D games > unfortunately means running a lot of 32-bit userland as the fun stuff is > binary-only. That's not a very good reason, though. We're supposed to be perfectly binary compatible, so it would actually be *better* if you did the testing using a 64-bit kernel. Sure, don't do it on _all_ machines, but running 32-bit user land is definitely not a valid reason to avoid a 64-bit kernel. Quite the reverse. We'd like to see more coverage. Linus ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-18 2:10 ` Eric Anholt 2008-10-18 2:47 ` Linus Torvalds @ 2008-10-20 20:04 ` Jesse Barnes 1 sibling, 0 replies; 80+ messages in thread From: Jesse Barnes @ 2008-10-20 20:04 UTC (permalink / raw) To: Eric Anholt Cc: Linus Torvalds, Dave Airlie, Andrew Morton, linux-kernel, dri-devel On Friday, October 17, 2008 7:10 pm Eric Anholt wrote: > On Fri, 2008-10-17 at 15:43 -0700, Linus Torvalds wrote: > > On Fri, 17 Oct 2008, Dave Airlie wrote: > > So what's the excuse _this_ time for adding all these stupid warnings to > > my build log? Did nobody test this? > > > > drivers/gpu/drm/drm_proc.c: In function ‘drm_gem_one_name_info’: > > drivers/gpu/drm/drm_proc.c:525: warning: format ‘%d’ expects type > > ‘int’, but argument 3 has type ‘size_t’ drivers/gpu/drm/drm_proc.c:533: > > warning: format ‘%9d’ expects type ‘int’, but argument 4 has type > > ‘size_t’ drivers/gpu/drm/i915/i915_gem.c: In function > > ‘i915_gem_gtt_pwrite’: drivers/gpu/drm/i915/i915_gem.c:184: warning: > > unused variable ‘vaddr_atomic’ > > Yeah, none of us are on x86-64, so we missed those warnings in testing. Actually, I'm on x86_64 pretty much exclusively and saw these warnings last week. But I didn't send a fix (yet); sorry. That said, this code was far from untested, even though it did contain a few compile warnings, so I think Linus's complaint about UNTESTED CRAP was at least half wrong. -- Jesse Barnes, Intel Open Source Technology Center ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-17 22:43 ` Linus Torvalds 2008-10-18 2:10 ` Eric Anholt @ 2008-10-18 9:11 ` Dave Airlie 1 sibling, 0 replies; 80+ messages in thread From: Dave Airlie @ 2008-10-18 9:11 UTC (permalink / raw) To: Linus Torvalds; +Cc: Dave Airlie, Andrew Morton, linux-kernel, dri-devel 2008/10/18 Linus Torvalds <torvalds@linux-foundation.org>: > > > On Fri, 17 Oct 2008, Dave Airlie wrote: >> >> Please pull the 'drm-next' branch from >> ssh://master.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6.git drm-next > > Grr. > > This whole merge series has been full of people sending me UNTESTED CRAP. > > So what's the excuse _this_ time for adding all these stupid warnings to > my build log? Did nobody test this? The code has been tested on 32 and 64 bit in lots of places, however I don't generally trawl the Fedora kernel build logs, and the amount of warnings we have there, new ones just get lost in the mists.. So its not untested on 64-bit, its just nobody who cared enough saw the compiler warnings. If linux-next had been going for the past few weeks someone would have spotted them, I just got the report yesterday from linux-next after I fed the push request. Perhaps we all need access to the Ingo compile farm. Dave. ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-17 21:29 [git pull] drm patches for 2.6.27-rc1 Dave Airlie 2008-10-17 22:43 ` Linus Torvalds @ 2008-10-18 1:37 ` Nick Piggin 2008-10-18 19:11 ` Keith Packard 1 sibling, 1 reply; 80+ messages in thread From: Nick Piggin @ 2008-10-18 1:37 UTC (permalink / raw) To: Dave Airlie; +Cc: torvalds, Andrew Morton, linux-kernel, dri-devel On Saturday 18 October 2008 08:29, Dave Airlie wrote: > Hi Linus, > > This is a new tree, I had a conflict with your latest tree due to some > trivial cleanups you merged. I've added the fix for CVE-2008-3831 which is > unembargoed. > > Please pull the 'drm-next' branch from > ssh://master.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6.git > drm-next > > This contains two patches outside the DRM git tree to add exports for GEM > functionality while we await Nick Piggins vmap and shmem changes. OK. I was hoping that kmap_atomic_pfn thing *never* see the light of mainline ;) Because hopefully the vmap improvements should be OK for 2.6.28 as well... But it's already there. OK, if vmap patches go upstream, then it can be switched over and the export removed before the release... Thanks, Nick ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-18 1:37 ` Nick Piggin @ 2008-10-18 19:11 ` Keith Packard 2008-10-18 19:31 ` Linus Torvalds 0 siblings, 1 reply; 80+ messages in thread From: Keith Packard @ 2008-10-18 19:11 UTC (permalink / raw) To: Nick Piggin Cc: keithp, Dave Airlie, Andrew Morton, torvalds, linux-kernel, dri-devel [-- Attachment #1: Type: text/plain, Size: 1523 bytes --] On Sat, 2008-10-18 at 12:37 +1100, Nick Piggin wrote: > OK. I was hoping that kmap_atomic_pfn thing *never* see the light of > mainline ;) Because hopefully the vmap improvements should be OK for > 2.6.28 as well... Eric and I chatted with Venki Pallipadi this week and decided what we really need is an official API for getting at pfns in our device BAR, which is what we're (ab)using kmap_atomic_pfn for. This ties in with some PAT issues as well, where we want to assert that all mappings of our BAR are in WC mode. The basic plan is to have four new functions (yes, I'm making up names here): struct io_mapping *io_reserve_pci_resource(struct pci_dev *dev, int bar, int prot); void io_mapping_free(struct io_mapping *mapping); void *io_map_atomic(struct io_mapping *mapping, unsigned long pfn); void io_unmap_atomic(struct io_mapping *mapping, unsigned long pfn); This would have the additional effect of making all other mappings of this BAR be required to match the specific prot, both providing protection against broken drivers, and providing support for old X servers which would directly map this BAR from user space. > But it's already there. OK, if vmap patches go upstream, then it can > be switched over and the export removed before the release... I'd say we should leave things alone for .28 and work on making an official IO mapping API available. -- keith.packard@intel.com [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-18 19:11 ` Keith Packard @ 2008-10-18 19:31 ` Linus Torvalds 2008-10-18 20:07 ` Thomas Hellström ` (3 more replies) 0 siblings, 4 replies; 80+ messages in thread From: Linus Torvalds @ 2008-10-18 19:31 UTC (permalink / raw) To: Keith Packard Cc: Nick Piggin, Dave Airlie, Andrew Morton, Linux Kernel Mailing List, dri-devel On Sat, 18 Oct 2008, Keith Packard wrote: > > The basic plan is to have four new functions (yes, I'm making up names > here): > > struct io_mapping *io_reserve_pci_resource(struct pci_dev *dev, > int bar, > int prot); > void io_mapping_free(struct io_mapping *mapping); > > void *io_map_atomic(struct io_mapping *mapping, unsigned long pfn); > void io_unmap_atomic(struct io_mapping *mapping, unsigned long pfn); The important thing is that mappings need to be per-CPU, so the above may work, but only if it's designed so that "io_reserve_pci_resource()" will actually reserve space for 'nr_possible_cpu' page mappings, and then the "io_[un]map_atomic()" functions do per-CPU mappings. Anything else is a disaster, because anything else implies TLB shootdown. And quite frankly, even so, we'd possibly still be _better_ off with just exposing the "kmap_atomic_pfn()" functionality even so. Because quite frankly, your "io_reserve_pci_resource()" infrastructure is going to inevitably be more complex and slower than the rather efficient kmap_atomic_pfn() thing we have. [ The *non-atomic* kmap() functions are fairly high-overhead, in that they want to keep track of cached mappings and remember page addresses etc. So those are the ones we don't want to support for non-HIGHMEM setups. But the atomic kmaps are pretty simple, and really only need some trivial FIXMAP support. We could easily extend it for x86-64, methinks, and do it for x86-32 even when we don't do HIGHMEM. Ingo? ] One small detail: our we currently have "kmap_atomic_pfn()" and "kmap_atomic_prot()", and we really should maek the fundamental core operation be "kmap_atomic_pfn_prot()", and have everything be done in terms of that. Looking at it, it also looks like kmap_atomic_prot() is actually incorrect right now, and doesn't do a "prot" thing for non-highmem pages, but just returns "page_address(page);" Linus ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-18 19:31 ` Linus Torvalds @ 2008-10-18 20:07 ` Thomas Hellström 2008-10-18 20:20 ` Keith Packard ` (2 subsequent siblings) 3 siblings, 0 replies; 80+ messages in thread From: Thomas Hellström @ 2008-10-18 20:07 UTC (permalink / raw) To: Linus Torvalds Cc: Keith Packard, Dave Airlie, Nick Piggin, Andrew Morton, Linux Kernel Mailing List, dri-devel Linus Torvalds wrote: > On Sat, 18 Oct 2008, Keith Packard wrote: > >> The basic plan is to have four new functions (yes, I'm making up names >> here): >> >> struct io_mapping *io_reserve_pci_resource(struct pci_dev *dev, >> int bar, >> int prot); >> void io_mapping_free(struct io_mapping *mapping); >> >> void *io_map_atomic(struct io_mapping *mapping, unsigned long pfn); >> void io_unmap_atomic(struct io_mapping *mapping, unsigned long pfn); >> > > The important thing is that mappings need to be per-CPU, so the above may > work, but only if it's designed so that "io_reserve_pci_resource()" will > actually reserve space for 'nr_possible_cpu' page mappings, and then the > "io_[un]map_atomic()" functions do per-CPU mappings. > > Anything else is a disaster, because anything else implies TLB shootdown. > > And quite frankly, even so, we'd possibly still be _better_ off with just > exposing the "kmap_atomic_pfn()" functionality even so. Because quite > frankly, your "io_reserve_pci_resource()" infrastructure is going to > inevitably be more complex and slower than the rather efficient > kmap_atomic_pfn() thing we have. > > [ The *non-atomic* kmap() functions are fairly high-overhead, in that they > want to keep track of cached mappings and remember page addresses etc. > So those are the ones we don't want to support for non-HIGHMEM setups. > > But the atomic kmaps are pretty simple, and really only need some > trivial FIXMAP support. We could easily extend it for x86-64, methinks, > and do it for x86-32 even when we don't do HIGHMEM. > > Ingo? ] > > One small detail: our we currently have "kmap_atomic_pfn()" and > "kmap_atomic_prot()", and we really should maek the fundamental core > operation be "kmap_atomic_pfn_prot()", and have everything be done in > terms of that. Looking at it, it also looks like kmap_atomic_prot() is > actually incorrect right now, and doesn't do a "prot" thing for > non-highmem pages, but just returns "page_address(page);" > Actually, a "kmap_atomic_prot_pfn()" has been lurking in the drm repos for some time now, but hasn't been suggested for upstream. It was intended for drivers that require quick in-kernel patching of write-combined io and highmem pages. The latter is a common situation for PCIE graphics devices with their own MMU, so IMHO an exported kmap_atomic_pfn_prot() would be a big help in such cases. /Thomas > Linus > > ------------------------------------------------------------------------- > This SF.Net email is sponsored by the Moblin Your Move Developer's challenge > Build the coolest Linux based applications with Moblin SDK & win great prizes > Grand prize is a trip for two to an Open Source event anywhere in the world > http://moblin-contest.org/redirect.php?banner_id=100&url=/ > -- > _______________________________________________ > Dri-devel mailing list > Dri-devel@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/dri-devel > ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-18 19:31 ` Linus Torvalds 2008-10-18 20:07 ` Thomas Hellström @ 2008-10-18 20:20 ` Keith Packard 2008-10-18 20:37 ` Ingo Molnar 2008-10-19 3:14 ` Nick Piggin 3 siblings, 0 replies; 80+ messages in thread From: Keith Packard @ 2008-10-18 20:20 UTC (permalink / raw) To: Linus Torvalds Cc: keithp, Nick Piggin, Dave Airlie, Andrew Morton, Linux Kernel Mailing List, dri-devel [-- Attachment #1: Type: text/plain, Size: 2011 bytes --] On Sat, 2008-10-18 at 12:31 -0700, Linus Torvalds wrote: > The important thing is that mappings need to be per-CPU, so the above may > work, but only if it's designed so that "io_reserve_pci_resource()" will > actually reserve space for 'nr_possible_cpu' page mappings, and then the > "io_[un]map_atomic()" functions do per-CPU mappings. Yes, of course. The goal is to make it look exactly like kmap_atomic_pfn, but work on non-memory pages even on non-HIGHMEM configs for x86 (we have to use ioremap on those systems today, which is a fairly significant performance problem). > And quite frankly, even so, we'd possibly still be _better_ off with just > exposing the "kmap_atomic_pfn()" functionality even so. Because quite > frankly, your "io_reserve_pci_resource()" infrastructure is going to > inevitably be more complex and slower than the rather efficient > kmap_atomic_pfn() thing we have. I want it to work just like kmap_atomic_pfn; Venki wanted some way to "know" that no-one else was mapping the pages in non-WC mode. This seemed like a reasonable compromise; the 'mapping' object exists solely to hold the desired prot value to keep all PTEs consistent across the whole system. > One small detail: our we currently have "kmap_atomic_pfn()" and > "kmap_atomic_prot()", and we really should maek the fundamental core > operation be "kmap_atomic_pfn_prot()", and have everything be done in > terms of that. Looking at it, it also looks like kmap_atomic_prot() is > actually incorrect right now, and doesn't do a "prot" thing for > non-highmem pages, but just returns "page_address(page);" Fortunately, we use kmap_atomic_pfn only for our BAR, which is not a non-highmem page. Right now, we're counting on having our BAR covered by an MTRR register so that we get WC access here. I'd like to have the API expose this so that the driver will work on systems which don't have a spare MTRR for the graphics aperture. -- keith.packard@intel.com [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-18 19:31 ` Linus Torvalds 2008-10-18 20:07 ` Thomas Hellström 2008-10-18 20:20 ` Keith Packard @ 2008-10-18 20:37 ` Ingo Molnar 2008-10-18 21:51 ` Keith Packard 2008-10-19 3:14 ` Nick Piggin 3 siblings, 1 reply; 80+ messages in thread From: Ingo Molnar @ 2008-10-18 20:37 UTC (permalink / raw) To: Linus Torvalds Cc: Keith Packard, Nick Piggin, Dave Airlie, Andrew Morton, Linux Kernel Mailing List, dri-devel * Linus Torvalds <torvalds@linux-foundation.org> wrote: > [ The *non-atomic* kmap() functions are fairly high-overhead, in that they > want to keep track of cached mappings and remember page addresses etc. > So those are the ones we don't want to support for non-HIGHMEM setups. > > But the atomic kmaps are pretty simple, and really only need some > trivial FIXMAP support. We could easily extend it for x86-64, methinks, > and do it for x86-32 even when we don't do HIGHMEM. > > Ingo? ] agreed, and there's certainly no resistance from the x86 architecture side to extend our mapping APIs in such directions. But i think the direction of the new GEM code is subtly wrong here, because it tries to manage memory even on 64-bit systems. IMO it should just map the _whole_ graphics aperture (non-cached) and be done with it. There's no faster method at managing pages than the CPU doing a TLB fill from pagetables. The only real API need i see is on 32-bit: with a 1GB or 2GB graphics aperture we just cannot map that permanently, so kmap_atomic() is a necessity. We can certainly extend that to non-highmem as well. But this should be an ad-hoc transitionary thing for 32-bit, and on 64-bit we really should not be using any form of kmap. Especially with large vertex buffers or textures, mapping a lot of pages via kmap is not going to be trivial overhead - even if INVLPG is faster than a full TLB flush, it's still on the order of 100-200 cycles - and with a lot of pages that mounts up quickly. And if i understood your workload correctly you want to do tens of thousand of map/unmap/remap events per frame generated - depending on the type of the 3D app/engine. Or am i missing something subtle? Why do you want the overhead of kmap on 64-bit? Ingo ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-18 20:37 ` Ingo Molnar @ 2008-10-18 21:51 ` Keith Packard 2008-10-18 22:32 ` Ingo Molnar 0 siblings, 1 reply; 80+ messages in thread From: Keith Packard @ 2008-10-18 21:51 UTC (permalink / raw) To: Ingo Molnar Cc: keithp, Linus Torvalds, Nick Piggin, Dave Airlie, Linux Kernel Mailing List, dri-devel, Andrew Morton [-- Attachment #1: Type: text/plain, Size: 1662 bytes --] On Sat, 2008-10-18 at 22:37 +0200, Ingo Molnar wrote: > But i think the direction of the new GEM code is subtly wrong here, > because it tries to manage memory even on 64-bit systems. IMO it should > just map the _whole_ graphics aperture (non-cached) and be done with it. > There's no faster method at managing pages than the CPU doing a TLB fill > from pagetables. Yeah, we're stuck thinking that we "can't" map the aperture because it's too large, but with a 64-bit kernel, we should be able to keep it mapped permanently. Of course, the io_reserve_pci_resource and io_map_atomic functions could do precisely that, as kmap_atomic does on non-HIGHMEM systems today. > The only real API need i see is on 32-bit: with a 1GB or 2GB graphics > aperture we just cannot map that permanently, so kmap_atomic() is a > necessity. We can certainly extend that to non-highmem as well. Yes, this is where exposing an io-specific atomic mapping function will remain necessary for some time. > And if i understood your > workload correctly you want to do tens of thousand of map/unmap/remap > events per frame generated - depending on the type of the 3D app/engine. Yeah, data transfer from CPU to GPU is through a pwrite interface, and we perform the transfer within the kernel using map/unmap operations on the aperture as those are WC and hence do not require clflush. > Or am i missing something subtle? Why do you want the overhead of kmap > on 64-bit? We don't, but I think it would be nice to have a common API that works across all 32-bit configurations as well as 64-bit systems. -- keith.packard@intel.com [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-18 21:51 ` Keith Packard @ 2008-10-18 22:32 ` Ingo Molnar 2008-10-18 22:47 ` Jon Smirl ` (3 more replies) 0 siblings, 4 replies; 80+ messages in thread From: Ingo Molnar @ 2008-10-18 22:32 UTC (permalink / raw) To: Keith Packard Cc: Linus Torvalds, Nick Piggin, Dave Airlie, Linux Kernel Mailing List, dri-devel, Andrew Morton, Yinghai Lu * Keith Packard <keithp@keithp.com> wrote: > On Sat, 2008-10-18 at 22:37 +0200, Ingo Molnar wrote: > > > But i think the direction of the new GEM code is subtly wrong here, > > because it tries to manage memory even on 64-bit systems. IMO it > > should just map the _whole_ graphics aperture (non-cached) and be > > done with it. There's no faster method at managing pages than the > > CPU doing a TLB fill from pagetables. > > Yeah, we're stuck thinking that we "can't" map the aperture because > it's too large, but with a 64-bit kernel, we should be able to keep it > mapped permanently. > > Of course, the io_reserve_pci_resource and io_map_atomic functions > could do precisely that, as kmap_atomic does on non-HIGHMEM systems > today. okay, so basically what we need is a shared API that does per page kmap_atomic on 32-bit, and just an ioremap() on 64-bit. I had the impression that you were suggesting to extend kmap_atomic() to 64-bit - which would be wrong. So, in terms of the 4 APIs you suggest: struct io_mapping *io_reserve_pci_resource(struct pci_dev *dev, int bar, int prot); void io_mapping_free(struct io_mapping *mapping); void *io_map_atomic(struct io_mapping *mapping, unsigned long pfn); void io_unmap_atomic(struct io_mapping *mapping, unsigned long pfn); here is what we'd do on 64-bit: - io_reserve_pci_resource() would just do an ioremap(), and would save the ioremap-ed memory into struct io_mapping. - io_mapping_free() does the iounmap() - io_map_atomic(): just arithmetics, returns mapping->base + pfn - no TLB activities at all. - io_unmap_atomic(): NOP. it's as fast as it gets: zero overhead in essence. Note that it's also shared between all CPUs and there's no aliasing trouble. And we could make it even faster: if you think we could even use 2MB TLBs for the _linear_ ioremap()s here, hm? There's plenty of address space on 64-bit so we can align to 2MB just fine - and aperture sizes are 2MB sized anyway. Or we could go one step further and install these aperture mappings into the _kernel linear_ address space. That would be even faster, because we'd have a constant offset. We have the (2MB mappings aware) mechanism for that already. (Yinghai Cc:-ed - he did a lot of great work to generalize this area.) (In fact if we installed it into the linear kernel address space, and if the aperture is 1GB aligned, we will automatically use gbpages for it. Were Intel to support gbpages in the future ;-) the _real_ remapping in a graphics aperture happens on the GPU level anyway, you manage an in-RAM GPU pagetable that just works like an IOMMU, correct? on 32-bit we'd have what you use in the GEM code today: - io_reserve_pci_resource(): a NOP in essence - io_mapping_free(): a NOP - io_map_atomic(): does a kmap_atomic(pfn) - io_unmap_atomic(): does a kunmap_atomic(pfn) so on 32-bit we have the INVLPG TLB overhead and preemption restrictions - but we knew that. We'd have to allow atomic_kmap() on non-highmem as well but that's fair. Mind sending patches for this? :-) Ingo ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-18 22:32 ` Ingo Molnar @ 2008-10-18 22:47 ` Jon Smirl 2008-10-18 22:53 ` Linus Torvalds 2008-10-19 0:38 ` Keith Packard ` (2 subsequent siblings) 3 siblings, 1 reply; 80+ messages in thread From: Jon Smirl @ 2008-10-18 22:47 UTC (permalink / raw) To: Ingo Molnar Cc: Keith Packard, Linus Torvalds, Nick Piggin, Dave Airlie, Linux Kernel Mailing List, dri-devel, Andrew Morton, Yinghai Lu On Sat, Oct 18, 2008 at 6:32 PM, Ingo Molnar <mingo@elte.hu> wrote: > > * Keith Packard <keithp@keithp.com> wrote: > >> On Sat, 2008-10-18 at 22:37 +0200, Ingo Molnar wrote: >> >> > But i think the direction of the new GEM code is subtly wrong here, >> > because it tries to manage memory even on 64-bit systems. IMO it >> > should just map the _whole_ graphics aperture (non-cached) and be >> > done with it. There's no faster method at managing pages than the >> > CPU doing a TLB fill from pagetables. >> >> Yeah, we're stuck thinking that we "can't" map the aperture because >> it's too large, but with a 64-bit kernel, we should be able to keep it >> mapped permanently. >> >> Of course, the io_reserve_pci_resource and io_map_atomic functions >> could do precisely that, as kmap_atomic does on non-HIGHMEM systems >> today. > > okay, so basically what we need is a shared API that does per page > kmap_atomic on 32-bit, and just an ioremap() on 64-bit. I had the > impression that you were suggesting to extend kmap_atomic() to 64-bit - > which would be wrong. Is it possible to use a segment register to map the whole aperture on 32b? A segment register might allow common code on 64b/32b by eliminating the need to move the mapping window around. > > So, in terms of the 4 APIs you suggest: > > struct io_mapping *io_reserve_pci_resource(struct pci_dev *dev, > int bar, > int prot); > void io_mapping_free(struct io_mapping *mapping); > > void *io_map_atomic(struct io_mapping *mapping, unsigned long pfn); > void io_unmap_atomic(struct io_mapping *mapping, unsigned long pfn); > > here is what we'd do on 64-bit: > > - io_reserve_pci_resource() would just do an ioremap(), and would save > the ioremap-ed memory into struct io_mapping. > > - io_mapping_free() does the iounmap() > > - io_map_atomic(): just arithmetics, returns mapping->base + pfn - no > TLB activities at all. > > - io_unmap_atomic(): NOP. > > it's as fast as it gets: zero overhead in essence. Note that it's also > shared between all CPUs and there's no aliasing trouble. > > And we could make it even faster: if you think we could even use 2MB > TLBs for the _linear_ ioremap()s here, hm? There's plenty of address > space on 64-bit so we can align to 2MB just fine - and aperture sizes > are 2MB sized anyway. > > Or we could go one step further and install these aperture mappings into > the _kernel linear_ address space. That would be even faster, because > we'd have a constant offset. We have the (2MB mappings aware) mechanism > for that already. (Yinghai Cc:-ed - he did a lot of great work to > generalize this area.) > > (In fact if we installed it into the linear kernel address space, and if > the aperture is 1GB aligned, we will automatically use gbpages for it. > Were Intel to support gbpages in the future ;-) > > the _real_ remapping in a graphics aperture happens on the GPU level > anyway, you manage an in-RAM GPU pagetable that just works like an > IOMMU, correct? > > on 32-bit we'd have what you use in the GEM code today: > > - io_reserve_pci_resource(): a NOP in essence > > - io_mapping_free(): a NOP > > - io_map_atomic(): does a kmap_atomic(pfn) > > - io_unmap_atomic(): does a kunmap_atomic(pfn) > > so on 32-bit we have the INVLPG TLB overhead and preemption restrictions > - but we knew that. We'd have to allow atomic_kmap() on non-highmem as > well but that's fair. > > Mind sending patches for this? :-) > > Ingo > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-18 22:47 ` Jon Smirl @ 2008-10-18 22:53 ` Linus Torvalds 0 siblings, 0 replies; 80+ messages in thread From: Linus Torvalds @ 2008-10-18 22:53 UTC (permalink / raw) To: Jon Smirl Cc: Ingo Molnar, Keith Packard, Nick Piggin, Dave Airlie, Linux Kernel Mailing List, dri-devel, Andrew Morton, Yinghai Lu On Sat, 18 Oct 2008, Jon Smirl wrote: > > Is it possible to use a segment register to map the whole aperture on > 32b? No. Segment registers don't extend the virtual address space, they can only limit visibility into the one single 32-bit one. IOW, segment registers don't actually extend addressing in any way, they only limit it. There's a reason why people don't use them (except as a strange base register for things like per-cpu or per-thread variables, and that is not to extend the address space, but to avoid wasting precious _useful_ registers on that). Linus ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-18 22:32 ` Ingo Molnar 2008-10-18 22:47 ` Jon Smirl @ 2008-10-19 0:38 ` Keith Packard 2008-10-19 1:06 ` Arjan van de Ven 2008-10-20 10:01 ` Ingo Molnar 2008-10-19 4:14 ` Keith Packard 2008-10-19 4:28 ` [git pull] drm patches for 2.6.27-rc1 Yinghai Lu 3 siblings, 2 replies; 80+ messages in thread From: Keith Packard @ 2008-10-19 0:38 UTC (permalink / raw) To: Ingo Molnar Cc: keithp, Linus Torvalds, Nick Piggin, Dave Airlie, Linux Kernel Mailing List, dri-devel, Andrew Morton, Yinghai Lu [-- Attachment #1: Type: text/plain, Size: 1039 bytes --] On Sun, 2008-10-19 at 00:32 +0200, Ingo Molnar wrote: > the _real_ remapping in a graphics aperture happens on the GPU level > anyway, you manage an in-RAM GPU pagetable that just works like an > IOMMU, correct? Yes, a one-level linear MMU which uses BIOS-reserved memory. So, at least for a prototype, on 64-bit we can just use ioremap_wc and hold the mapping while the driver is open? Is there any huge benefit to using the kernel mapping? > so on 32-bit we have the INVLPG TLB overhead and preemption restrictions > - but we knew that. We'd have to allow atomic_kmap() on non-highmem as > well but that's fair. Yes, the non-highmem case is currently in fairly bad shape. > Mind sending patches for this? :-) I've got Venki lined up to do this work for me; once we're happy enough with the API. In particular, the non-highmem 32-bit case seems a bit tricky. Also, does anyone have a better set of names for this stuff? io_reserve_pci_mapping seems fairly ugly to me. -- keith.packard@intel.com [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-19 0:38 ` Keith Packard @ 2008-10-19 1:06 ` Arjan van de Ven 2008-10-19 1:15 ` Keith Packard 2008-10-20 10:01 ` Ingo Molnar 1 sibling, 1 reply; 80+ messages in thread From: Arjan van de Ven @ 2008-10-19 1:06 UTC (permalink / raw) To: Keith Packard Cc: Ingo Molnar, keithp, Linus Torvalds, Nick Piggin, Dave Airlie, Linux Kernel Mailing List, dri-devel, Andrew Morton, Yinghai Lu On Sat, 18 Oct 2008 17:38:11 -0700 Keith Packard <keithp@keithp.com> wrote: > I've got Venki lined up to do this work for me; once we're happy > enough with the API. In particular, the non-highmem 32-bit case seems > a bit tricky. > > Also, does anyone have a better set of names for this stuff? > io_reserve_pci_mapping seems fairly ugly to me. > how about pci_resource_force_caching(pci_dev, barnr, cachetype)? -- Arjan van de Ven Intel Open Source Technology Centre For development, discussion and tips for power savings, visit http://www.lesswatts.org ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-19 1:06 ` Arjan van de Ven @ 2008-10-19 1:15 ` Keith Packard 0 siblings, 0 replies; 80+ messages in thread From: Keith Packard @ 2008-10-19 1:15 UTC (permalink / raw) To: Arjan van de Ven Cc: keithp, Ingo Molnar, Linus Torvalds, Nick Piggin, Dave Airlie, Linux Kernel Mailing List, dri-devel, Andrew Morton, Yinghai Lu [-- Attachment #1: Type: text/plain, Size: 241 bytes --] On Sat, 2008-10-18 at 18:06 -0700, Arjan van de Ven wrote: > how about pci_resource_force_caching(pci_dev, barnr, cachetype)? and then 'pci_map_atomic'? Not sure this makes sense in the pci namespace. -- keith.packard@intel.com [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-19 0:38 ` Keith Packard 2008-10-19 1:06 ` Arjan van de Ven @ 2008-10-20 10:01 ` Ingo Molnar 1 sibling, 0 replies; 80+ messages in thread From: Ingo Molnar @ 2008-10-20 10:01 UTC (permalink / raw) To: Keith Packard Cc: Linus Torvalds, Nick Piggin, Dave Airlie, Linux Kernel Mailing List, dri-devel, Andrew Morton, Yinghai Lu * Keith Packard <keithp@keithp.com> wrote: > On Sun, 2008-10-19 at 00:32 +0200, Ingo Molnar wrote: > > > the _real_ remapping in a graphics aperture happens on the GPU level > > anyway, you manage an in-RAM GPU pagetable that just works like an > > IOMMU, correct? > > Yes, a one-level linear MMU which uses BIOS-reserved memory. > > So, at least for a prototype, on 64-bit we can just use ioremap_wc and > hold the mapping while the driver is open? Is there any huge benefit > to using the kernel mapping? correct. There's no benefit to using the kernel mapping - and we can add 2MB pages support to ioremap just fine and then you'll benefit from it automatically. Ingo ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-18 22:32 ` Ingo Molnar 2008-10-18 22:47 ` Jon Smirl 2008-10-19 0:38 ` Keith Packard @ 2008-10-19 4:14 ` Keith Packard 2008-10-19 6:41 ` Keith Packard 2008-10-19 4:28 ` [git pull] drm patches for 2.6.27-rc1 Yinghai Lu 3 siblings, 1 reply; 80+ messages in thread From: Keith Packard @ 2008-10-19 4:14 UTC (permalink / raw) To: Ingo Molnar Cc: keithp, Linus Torvalds, Nick Piggin, Dave Airlie, Linux Kernel Mailing List, dri-devel, Andrew Morton, Yinghai Lu [-- Attachment #1: Type: text/plain, Size: 1666 bytes --] On Sun, 2008-10-19 at 00:32 +0200, Ingo Molnar wrote: > Mind sending patches for this? :-) Here's what these functions would look like as wrappers on top of the existing APIs: /* x86_64 style */ static inline struct io_reserve * iomap_reserve(unsigned long base, unsigned long size) { return (struct io_reserve *) ioremap_wc(base, size); } static inline void * iomap_atomic(struct io_reserve *reserve, unsigned long offset) { return ((char *) reserve) + offset; } static inline void iounmap_atomic(void *vaddr) { } static inline void iomap_unreserve(struct io_reserve *reserve) { iounmap(reserve); } /* HIGHMEM style */ #if 0 static inline struct io_reserve * iomap_reserve(unsigned long base, unsigned long size) { return (struct io_reserve *) base; } static inline void * iomap_atomic(struct io_reserve *reserve, unsigned long offset) { offset += (unsigned long) reserve; return kmap_atomic(offset >> PAGE_SHIFT, KM_USER0); } static inline void iounmap_atomic(void *vaddr) { kunmap_atomic(vaddr, KM_USER0); } static inline void iomap_unreserve(struct io_reserve *reserve) { } #endif /* 32-bit non-HIGHMEM style */ #if 0 static inline struct io_reserve * iomap_reserve(unsigned long base, unsigned long size) { return NULL; } static inline void * iomap_atomic(struct io_reserve *reserve, unsigned long offset) { return NULL; } static inline void iounmap_atomic(void *vaddr) { } static inline void iomap_unreserve(struct io_reserve *reserve) { } #endif > > Ingo -- keith.packard@intel.com [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-19 4:14 ` Keith Packard @ 2008-10-19 6:41 ` Keith Packard 2008-10-19 17:53 ` io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) Ingo Molnar 0 siblings, 1 reply; 80+ messages in thread From: Keith Packard @ 2008-10-19 6:41 UTC (permalink / raw) To: Ingo Molnar Cc: keithp, Linus Torvalds, Nick Piggin, Dave Airlie, Linux Kernel Mailing List, dri-devel, Andrew Morton, Yinghai Lu [-- Attachment #1: Type: text/plain, Size: 11944 bytes --] On Sat, 2008-10-18 at 21:14 -0700, Keith Packard wrote: > On Sun, 2008-10-19 at 00:32 +0200, Ingo Molnar wrote: > > > Mind sending patches for this? :-) Here's a patch for the i915 driver that includes the new API. Tested on x86_32+HIGHMEM and x86_64. I stuck a new 'io_reserve.h' header in the i915 directory for this patch, but it should go elsewhere. The new APIs are: io_reserve_create_wc io_reserve_free io_reserve_map_atomic_wc io_reserve_unmap_atomic io_reserve_map_wc io_reserve_unmap I added the non-atomic variants at Eric's suggestion so that we can use the direct map on x86_64, avoiding any use of ioremap at run-time. I think the resulting code looks quite a bit cleaner now. Also, one benchmark I tried ran about 18 times faster in 64-bit mode than the original code. From 2f6b125cb586a0671a2b9c22aadb03fcafdf99ab Mon Sep 17 00:00:00 2001 From: Keith Packard <keithp@taka.keithp.com> Date: Sat, 18 Oct 2008 22:59:58 -0700 Subject: [PATCH] [drm/i915] Create new 'io_reserve' API for mapping GTT pages The io_reserve API abstracts away the operations necessary to construct mappings for our GTT aperture, providing atomic and non-atomic mappings for GTT pages that work efficiently on x86_64 and x86_32+HIGHMEM configurations. This eliminates the in-drive abuse of the kmap_atomic_pfn function as well as improving GEM performance on x86_64 architecture. Signed-off-by: Keith Packard <keithp@taka.keithp.com> --- drivers/gpu/drm/i915/i915_drv.h | 3 + drivers/gpu/drm/i915/i915_gem.c | 71 ++++++++++++---------- drivers/gpu/drm/i915/i915_irq.c | 4 +- drivers/gpu/drm/i915/io_reserve.h | 122 +++++++++++++++++++++++++++++++++++++ 4 files changed, 165 insertions(+), 35 deletions(-) create mode 100644 drivers/gpu/drm/i915/io_reserve.h diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index f20ffe1..c99b9ea 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -31,6 +31,7 @@ #define _I915_DRV_H_ #include "i915_reg.h" +#include "io_reserve.h" /* General customization: */ @@ -246,6 +247,8 @@ typedef struct drm_i915_private { struct { struct drm_mm gtt_space; + struct io_reserve *io_reserve; + /** * List of objects currently involved in rendering from the * ringbuffer. diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 9255088..cd9e489 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -177,14 +177,14 @@ i915_gem_gtt_pwrite(struct drm_device *dev, struct drm_gem_object *obj, struct drm_file *file_priv) { struct drm_i915_gem_object *obj_priv = obj->driver_private; + drm_i915_private_t *dev_priv = dev->dev_private; ssize_t remain; - loff_t offset; + loff_t offset, base; char __user *user_data; char __iomem *vaddr; char *vaddr_atomic; - int i, o, l; + int o, l; int ret = 0; - unsigned long pfn; unsigned long unwritten; user_data = (char __user *) (uintptr_t) args->data_ptr; @@ -211,42 +211,41 @@ i915_gem_gtt_pwrite(struct drm_device *dev, struct drm_gem_object *obj, while (remain > 0) { /* Operation in this page * - * i = page number + * base = page offset within aperture * o = offset within page * l = bytes to copy */ - i = offset >> PAGE_SHIFT; + base = (offset & ~(PAGE_SIZE-1)); o = offset & (PAGE_SIZE-1); l = remain; if ((o + l) > PAGE_SIZE) l = PAGE_SIZE - o; - pfn = (dev->agp->base >> PAGE_SHIFT) + i; - -#ifdef CONFIG_HIGHMEM /* This is a workaround for the low performance of iounmap * (approximate 10% cpu cost on normal 3D workloads). - * kmap_atomic on HIGHMEM kernels happens to let us map card - * memory without taking IPIs. When the vmap rework lands - * we should be able to dump this hack. + * io_reserve_map_atomic_wc maps card memory + * without taking IPIs. + */ + vaddr_atomic = io_reserve_map_atomic_wc(dev_priv->mm.io_reserve, + base); + if (vaddr_atomic) { + unwritten = __copy_from_user_inatomic_nocache(vaddr_atomic + o, + user_data, l); + io_reserve_unmap_atomic(vaddr_atomic); + } else + unwritten = l; + + /* If we get a fault while copying data, then (presumably) our + * source page isn't available. In this case, use the + * non-atomic __copy_from_user function */ - vaddr_atomic = kmap_atomic_pfn(pfn, KM_USER0); -#if WATCH_PWRITE - DRM_INFO("pwrite i %d o %d l %d pfn %ld vaddr %p\n", - i, o, l, pfn, vaddr_atomic); -#endif - unwritten = __copy_from_user_inatomic_nocache(vaddr_atomic + o, - user_data, l); - kunmap_atomic(vaddr_atomic, KM_USER0); - if (unwritten) -#endif /* CONFIG_HIGHMEM */ { - vaddr = ioremap_wc(pfn << PAGE_SHIFT, PAGE_SIZE); + vaddr = io_reserve_map_wc(dev_priv->mm.io_reserve, base); #if WATCH_PWRITE - DRM_INFO("pwrite slow i %d o %d l %d " - "pfn %ld vaddr %p\n", - i, o, l, pfn, vaddr); + DRM_INFO("pwrite slow base %ld o %d l %d " + "vaddr %p\n", + base, o, l, vaddr); #endif if (vaddr == NULL) { ret = -EFAULT; @@ -256,7 +255,7 @@ i915_gem_gtt_pwrite(struct drm_device *dev, struct drm_gem_object *obj, #if WATCH_PWRITE DRM_INFO("unwritten %ld\n", unwritten); #endif - iounmap(vaddr); + io_reserve_unmap(vaddr); if (unwritten) { ret = -EFAULT; goto fail; @@ -1489,6 +1488,7 @@ i915_gem_object_pin_and_relocate(struct drm_gem_object *obj, struct drm_i915_gem_exec_object *entry) { struct drm_device *dev = obj->dev; + drm_i915_private_t *dev_priv = dev->dev_private; struct drm_i915_gem_relocation_entry reloc; struct drm_i915_gem_relocation_entry __user *relocs; struct drm_i915_gem_object *obj_priv = obj->driver_private; @@ -1621,12 +1621,11 @@ i915_gem_object_pin_and_relocate(struct drm_gem_object *obj, (last_reloc_offset & ~(PAGE_SIZE - 1)) != (reloc_offset & ~(PAGE_SIZE - 1))) { if (reloc_page != NULL) - iounmap(reloc_page); + io_reserve_unmap(reloc_page); - reloc_page = ioremap_wc(dev->agp->base + - (reloc_offset & - ~(PAGE_SIZE - 1)), - PAGE_SIZE); + reloc_page = io_reserve_map_wc(dev_priv->mm.io_reserve, + (reloc_offset & + ~(PAGE_SIZE - 1))); last_reloc_offset = reloc_offset; if (reloc_page == NULL) { drm_gem_object_unreference(target_obj); @@ -1636,7 +1635,7 @@ i915_gem_object_pin_and_relocate(struct drm_gem_object *obj, } reloc_entry = (uint32_t __iomem *)(reloc_page + - (reloc_offset & (PAGE_SIZE - 1))); + (reloc_offset & (PAGE_SIZE - 1))); reloc_val = target_obj_priv->gtt_offset + reloc.delta; #if WATCH_BUF @@ -1661,7 +1660,7 @@ i915_gem_object_pin_and_relocate(struct drm_gem_object *obj, } if (reloc_page != NULL) - iounmap(reloc_page); + io_reserve_unmap(reloc_page); #if WATCH_BUF if (0) @@ -2504,6 +2503,10 @@ i915_gem_entervt_ioctl(struct drm_device *dev, void *data, if (ret != 0) return ret; + dev_priv->mm.io_reserve = io_reserve_create_wc(dev->agp->base, + dev->agp->agp_info.aper_size + * 1024 * 1024); + mutex_lock(&dev->struct_mutex); BUG_ON(!list_empty(&dev_priv->mm.active_list)); BUG_ON(!list_empty(&dev_priv->mm.flushing_list)); @@ -2521,11 +2524,13 @@ int i915_gem_leavevt_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) { + drm_i915_private_t *dev_priv = dev->dev_private; int ret; ret = i915_gem_idle(dev); drm_irq_uninstall(dev); + io_reserve_free(dev_priv->mm.io_reserve); return ret; } diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index ce866ac..de8e084 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -784,8 +784,8 @@ int i915_vblank_swap(struct drm_device *dev, void *data, if (dev_priv->swaps_pending >= 10) { DRM_DEBUG("Too many swaps queued\n"); DRM_DEBUG(" pipe 0: %d pipe 1: %d\n", - drm_vblank_count(dev, 0); - drm_vblank_count(dev, 1); + drm_vblank_count(dev, 0), + drm_vblank_count(dev, 1)); list_for_each(list, &dev_priv->vbl_swaps.head) { vbl_old = list_entry(list, drm_i915_vbl_swap_t, head); diff --git a/drivers/gpu/drm/i915/io_reserve.h b/drivers/gpu/drm/i915/io_reserve.h new file mode 100644 index 0000000..4e90a36 --- /dev/null +++ b/drivers/gpu/drm/i915/io_reserve.h @@ -0,0 +1,122 @@ +/* + * Copyright © 2008 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + * Authors: + * Keith Packard <keithp@keithp.com> + * + */ +/* x86_64 style */ + +#ifndef _IO_RESERVE_H_ +#define _IO_RESERVE_H_ + +/* this struct isn't actually defined anywhere */ +struct io_reserve; + +#ifdef CONFIG_X86_64 + +/* Create the io_reserve object*/ +static inline struct io_reserve * +io_reserve_create_wc(unsigned long base, unsigned long size) +{ + return (struct io_reserve *) ioremap_wc(base, size); +} + +static inline void +io_reserve_free(struct io_reserve *reserve) +{ + iounmap(reserve); +} + +/* Atomic map/unmap */ +static inline void * +io_reserve_map_atomic_wc(struct io_reserve *reserve, unsigned long offset) +{ + return ((char *) reserve) + offset; +} + +static inline void +io_reserve_unmap_atomic(void *vaddr) +{ +} + +/* Non-atomic map/unmap */ +static inline void * +io_reserve_map_wc(struct io_reserve *reserve, unsigned long offset) +{ + return ((char *) reserve) + offset; +} + +static inline void +io_reserve_unmap(void *vaddr) +{ +} + +#endif /* CONFIG_X86_64 */ + +#ifdef CONFIG_X86_32 +static inline struct io_reserve * +io_reserve_create_wc(unsigned long base, unsigned long size) +{ + return (struct io_reserve *) base; +} + +static inline void +io_reserve_free(struct io_reserve *reserve) +{ +} + +/* Atomic map/unmap */ +static inline void * +io_reserve_map_atomic_wc(struct io_reserve *reserve, unsigned long offset) +{ +#ifdef CONFIG_HIGHMEM + offset += (unsigned long) reserve; + return kmap_atomic_pfn(offset >> PAGE_SHIFT, KM_USER0); +#else + return NULL; +#endif +} + +static inline void +io_reserve_unmap_atomic(void *vaddr) +{ +#ifdef CONFIG_HIGHMEM + kunmap_atomic(vaddr, KM_USER0); +#endif +} + +static inline void * +io_reserve_map_wc(struct io_reserve *reserve, unsigned long offset) +{ + offset += (unsigned long) reserve; + return ioremap_wc(offset, PAGE_SIZE); +} + +static inline void +io_reserve_unmap(void *vaddr) +{ + iounmap(vaddr); +} +#endif /* CONFIG_X86_32 */ + +#endif /* _IO_RESERVE_H_ */ -- 1.5.6.5 -- keith.packard@intel.com [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-19 6:41 ` Keith Packard @ 2008-10-19 17:53 ` Ingo Molnar 2008-10-19 18:00 ` Arjan van de Ven ` (3 more replies) 0 siblings, 4 replies; 80+ messages in thread From: Ingo Molnar @ 2008-10-19 17:53 UTC (permalink / raw) To: Keith Packard, Jesse Barnes Cc: Linus Torvalds, Nick Piggin, Dave Airlie, Linux Kernel Mailing List, dri-devel, Andrew Morton, Yinghai Lu * Keith Packard <keithp@keithp.com> wrote: > On Sat, 2008-10-18 at 21:14 -0700, Keith Packard wrote: > > On Sun, 2008-10-19 at 00:32 +0200, Ingo Molnar wrote: > > > > > Mind sending patches for this? :-) > > Here's a patch for the i915 driver that includes the new API. Tested > on x86_32+HIGHMEM and x86_64. I stuck a new 'io_reserve.h' header in > the i915 directory for this patch, but it should go elsewhere. > > The new APIs are: > > io_reserve_create_wc > io_reserve_free > io_reserve_map_atomic_wc > io_reserve_unmap_atomic > io_reserve_map_wc > io_reserve_unmap very nice! I think we need a somewhat different abstraction though. Firstly, regarding drivers/gpu/drm/i915/io_reserve.h, that needs to move to generic code. Secondly, wouldnt the right abstraction be to attach this functionality to 'struct resource' ? [or at least create a second struct that embedds struct resource] this abstraction is definitely not a PCI thing and not a detached-from-everything thing, it's an IO resource thing. We could make it a property of struct resource: struct resource { resource_size_t start; resource_size_t end; const char *name; unsigned long flags; struct resource *parent, *sibling, *child; + void *mapping; }; The APIs would be: int io_resource_init_mapping(struct resource *res); void io_resource_free_mapping(struct resource *res); void * io_resource_map(struct resource *res, pfn_t pfn, unsigned long offset); void io_resource_unmap(struct resource *res, void *kaddr); Note how simple and consistent it all gets: IO resources already know their physical location and their size limits. Being able to cache an ioremap in a mapping [and being able to use atomic kmaps on 32-bit] is a relatively simple and natural extension to the concept. i think that would be quite acceptable - and the APIs could just transparently work on it. This would also allow the PCI code to automatically unmap any cached mappings from resources, when the driver deinitializes. Linus, Jesse, what do you think? i think we need to finalize the API names and their abstraction level, and then could even merge those APIs into v2.6.28 on a fast path, to enable you to use it. It does not interact with anything else so it should be safe to do. (i'd not suggest to merge the i915 bits just yet - but that's obviously not my call.) Ingo ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-19 17:53 ` io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) Ingo Molnar @ 2008-10-19 18:00 ` Arjan van de Ven 2008-10-19 19:07 ` Eric Anholt ` (2 subsequent siblings) 3 siblings, 0 replies; 80+ messages in thread From: Arjan van de Ven @ 2008-10-19 18:00 UTC (permalink / raw) To: Ingo Molnar Cc: Keith Packard, Jesse Barnes, Linus Torvalds, Nick Piggin, Dave Airlie, Linux Kernel Mailing List, dri-devel, Andrew Morton, Yinghai Lu On Sun, 19 Oct 2008 19:53:20 +0200 Ingo Molnar <mingo@elte.hu> wrote: > > struct resource { > resource_size_t start; > resource_size_t end; > const char *name; > unsigned long flags; > struct resource *parent, *sibling, *child; > + void *mapping; > }; > > The APIs would be: > > int io_resource_init_mapping(struct resource *res); > void io_resource_free_mapping(struct resource *res); > void * io_resource_map(struct resource *res, pfn_t pfn, unsigned > long offset); void io_resource_unmap(struct resource *res, void > *kaddr); > > Note how simple and consistent it all gets: IO resources already know > their physical location and their size limits. Being able to cache an > ioremap in a mapping [and being able to use atomic kmaps on 32-bit] > is a relatively simple and natural extension to the concept. and making a simple wrapper around this that turns "struct pci_dev, barnr" into a resource would make sense too, but yes. We need one more io_resource_force_cachability(struct resource, cachetype) or maybe only io_resource_force_writecombine(struct resource) -- Arjan van de Ven Intel Open Source Technology Centre For development, discussion and tips for power savings, visit http://www.lesswatts.org ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-19 17:53 ` io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) Ingo Molnar 2008-10-19 18:00 ` Arjan van de Ven @ 2008-10-19 19:07 ` Eric Anholt 2008-10-20 11:55 ` Ingo Molnar 2008-10-19 21:04 ` Keith Packard 2008-10-20 10:10 ` io resources and cached mappings " Ingo Molnar 3 siblings, 1 reply; 80+ messages in thread From: Eric Anholt @ 2008-10-19 19:07 UTC (permalink / raw) To: Ingo Molnar Cc: Keith Packard, Jesse Barnes, Nick Piggin, Dave Airlie, Yinghai Lu, Linux Kernel Mailing List, dri-devel, Andrew Morton, Linus Torvalds [-- Attachment #1: Type: text/plain, Size: 4231 bytes --] On Sun, 2008-10-19 at 19:53 +0200, Ingo Molnar wrote: > * Keith Packard <keithp@keithp.com> wrote: > > > On Sat, 2008-10-18 at 21:14 -0700, Keith Packard wrote: > > > On Sun, 2008-10-19 at 00:32 +0200, Ingo Molnar wrote: > > > > > > > Mind sending patches for this? :-) > > > > Here's a patch for the i915 driver that includes the new API. Tested > > on x86_32+HIGHMEM and x86_64. I stuck a new 'io_reserve.h' header in > > the i915 directory for this patch, but it should go elsewhere. > > > > The new APIs are: > > > > io_reserve_create_wc > > io_reserve_free > > io_reserve_map_atomic_wc > > io_reserve_unmap_atomic > > io_reserve_map_wc > > io_reserve_unmap > > very nice! > > I think we need a somewhat different abstraction though. > > Firstly, regarding drivers/gpu/drm/i915/io_reserve.h, that needs to move > to generic code. > > Secondly, wouldnt the right abstraction be to attach this functionality > to 'struct resource' ? [or at least create a second struct that embedds > struct resource] > > this abstraction is definitely not a PCI thing and not a > detached-from-everything thing, it's an IO resource thing. We could make > it a property of struct resource: > > struct resource { > resource_size_t start; > resource_size_t end; > const char *name; > unsigned long flags; > struct resource *parent, *sibling, *child; > + void *mapping; > }; > > The APIs would be: > > int io_resource_init_mapping(struct resource *res); > void io_resource_free_mapping(struct resource *res); > void * io_resource_map(struct resource *res, pfn_t pfn, unsigned long offset); > void io_resource_unmap(struct resource *res, void *kaddr); > > Note how simple and consistent it all gets: IO resources already know > their physical location and their size limits. Being able to cache an > ioremap in a mapping [and being able to use atomic kmaps on 32-bit] is a > relatively simple and natural extension to the concept. > > i think that would be quite acceptable - and the APIs could just > transparently work on it. This would also allow the PCI code to > automatically unmap any cached mappings from resources, when the driver > deinitializes. > > Linus, Jesse, what do you think? > > i think we need to finalize the API names and their abstraction level, > and then could even merge those APIs into v2.6.28 on a fast path, to > enable you to use it. It does not interact with anything else so it > should be safe to do. This API needs the cacheability control, which I don't see in it currently. In the past we've been relying on an MTRR over the GTT resulting in all of our UC- mappings getting us the correct WC behavior, but now there aren't enough MTRRs to go around and graphics loses out (at about a 20% CPU cost as a conservative estimate). The primary goal of the new API is to let us eat PAT costs up front so we don't have to at runtime. Second, we need to know when we're doing a mapping whether we're affected by atomic scheduling restrictions. Right now our plan has been to try doing page-by-page io_map_atomic_wc()/copy_from_user_inatomic()/io_unmap_atomic(), and if we fail at that at some point (map returns NULL or we get a partial completion from copy_from_user_inatomic) then fall back to io_map_wc() and copy_from_user() the whole thing at once. That gets us good performance on both x86 with highmem and x86-64, and not too shabby performance on x86 non-highmem. Also, while it's rare, there have been graphics cards (looking at you, S3) where BARs were expensive for some reason and they stuffed both the framebuffer and registers into one PCI BAR, where you want the FB to be WC and the registers to be UC. Not sure if they would be supportable with this API or not. And if it's not, I'm not sure how much we care to design for them, but it's something to potentially consider. Finally, I'm confused by the pfn and offset args to io_resource_map, when I expected something parallel to ioremap but with our resource arg added. -- Eric Anholt eric@anholt.net eric.anholt@intel.com [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 197 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-19 19:07 ` Eric Anholt @ 2008-10-20 11:55 ` Ingo Molnar 2008-10-20 12:20 ` Ingo Molnar 0 siblings, 1 reply; 80+ messages in thread From: Ingo Molnar @ 2008-10-20 11:55 UTC (permalink / raw) To: Eric Anholt Cc: Keith Packard, Jesse Barnes, Nick Piggin, Dave Airlie, Yinghai Lu, Linux Kernel Mailing List, dri-devel, Andrew Morton, Linus Torvalds * Eric Anholt <eric@anholt.net> wrote: > > The APIs would be: > > > > int io_resource_init_mapping(struct resource *res); > > void io_resource_free_mapping(struct resource *res); > > void * io_resource_map(struct resource *res, pfn_t pfn, unsigned long offset); > > void io_resource_unmap(struct resource *res, void *kaddr); > > > > Note how simple and consistent it all gets: IO resources already > > know their physical location and their size limits. Being able to > > cache an ioremap in a mapping [and being able to use atomic kmaps on > > 32-bit] is a relatively simple and natural extension to the concept. > > > > i think that would be quite acceptable - and the APIs could just > > transparently work on it. This would also allow the PCI code to > > automatically unmap any cached mappings from resources, when the > > driver deinitializes. > > > > Linus, Jesse, what do you think? > > > > i think we need to finalize the API names and their abstraction > > level, and then could even merge those APIs into v2.6.28 on a fast > > path, to enable you to use it. It does not interact with anything > > else so it should be safe to do. > > This API needs the cacheability control, which I don't see in it > currently. [...] yes, these two should do the trick: int io_resource_init_mapping_wc(struct resource *res); int io_resource_init_mapping_wb(struct resource *res); > Second, we need to know when we're doing a mapping whether we're > affected by atomic scheduling restrictions. Right now our plan has > been to try doing page-by-page > io_map_atomic_wc()/copy_from_user_inatomic()/io_unmap_atomic(), and if > we fail at that at some point (map returns NULL or we get a partial > completion from copy_from_user_inatomic) then fall back to io_map_wc() > and copy_from_user() the whole thing at once. That gets us good > performance on both x86 with highmem and x86-64, and not too shabby > performance on x86 non-highmem. that gets ugly very fast. I think we should not use atomic kmaps but NR_CPUS _fixmaps_ with a per CPU array of mutexes (this is basically atomic kmaps but without the preemption restrictions). We could take/drop the mutex and statistically you'll stay on the same CPU and wont ever contend on that lock in practice. > Also, while it's rare, there have been graphics cards (looking at you, > S3) where BARs were expensive for some reason and they stuffed both > the framebuffer and registers into one PCI BAR, where you want the FB > to be WC and the registers to be UC. Not sure if they would be > supportable with this API or not. And if it's not, I'm not sure how > much we care to design for them, but it's something to potentially > consider. yes, this is a weakness of this API - you cannot mix multiple cachability domains within the same BAR. and that can happen on non-graphics as well: some storage controller that has regular control registers in one portion of the BAR, which all need to be consistently accessed via UC and properly POST-ed - while it could also have some large mailbox structure at the end of the BAR, which could be mapped both cacheable or perhaps WC. So ... i guess we can go back to the io_mapping API proposed by Keith, but not make it atomic kmap based but fixmap + mutex based - for good 32-bit performance. (and the fixmap would not be used on 64-bit at all) > Finally, I'm confused by the pfn and offset args to io_resource_map, > when I expected something parallel to ioremap but with our resource > arg added. ok. Ingo ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-20 11:55 ` Ingo Molnar @ 2008-10-20 12:20 ` Ingo Molnar 0 siblings, 0 replies; 80+ messages in thread From: Ingo Molnar @ 2008-10-20 12:20 UTC (permalink / raw) To: Eric Anholt Cc: Keith Packard, Jesse Barnes, Nick Piggin, Dave Airlie, Yinghai Lu, Linux Kernel Mailing List, dri-devel, Andrew Morton, Linus Torvalds, Peter Zijlstra * Ingo Molnar <mingo@elte.hu> wrote: > that gets ugly very fast. I think we should not use atomic kmaps but > NR_CPUS _fixmaps_ with a per CPU array of mutexes (this is basically > atomic kmaps but without the preemption restrictions). We could > take/drop the mutex and statistically you'll stay on the same CPU and > wont ever contend on that lock in practice. peterz pointed it out that we need to be careful with the TLBs in that case. I think it's solvable: a small non-default special-case in switch_to() would invlpg any pending such page. (and no two such mappings could be held at the same time) So as long as we dont leave the CPU we wont ever do TLB flushes, and the flush will be local even if we do. Ingo ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-19 17:53 ` io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) Ingo Molnar 2008-10-19 18:00 ` Arjan van de Ven 2008-10-19 19:07 ` Eric Anholt @ 2008-10-19 21:04 ` Keith Packard 2008-10-20 11:58 ` Ingo Molnar 2008-10-20 10:10 ` io resources and cached mappings " Ingo Molnar 3 siblings, 1 reply; 80+ messages in thread From: Keith Packard @ 2008-10-19 21:04 UTC (permalink / raw) To: Ingo Molnar Cc: keithp, Jesse Barnes, Nick Piggin, Dave Airlie, Yinghai Lu, Linux Kernel Mailing List, dri-devel, Andrew Morton, Linus Torvalds [-- Attachment #1: Type: text/plain, Size: 1332 bytes --] On Sun, 2008-10-19 at 19:53 +0200, Ingo Molnar wrote: > Note how simple and consistent it all gets: IO resources already know > their physical location and their size limits. Being able to cache an > ioremap in a mapping [and being able to use atomic kmaps on 32-bit] is a > relatively simple and natural extension to the concept. I'm not sure I see any value in caching mappings here; we're mostly interested in copying lots of data onto the card and so we use a lot of different mappings; atomic mappings are easy to use, and efficient enough. Also, if we're using a resource, I'd like to just use byte offsets and not pfns; the resource presumably knows the base address. > i think we need to finalize the API names and their abstraction level, > and then could even merge those APIs into v2.6.28 on a fast path, to > enable you to use it. It does not interact with anything else so it > should be safe to do. Let's figure out the API we want, then figure out whether it makes sense to stick it into 2.6.28 or whether we just want to wait for .29. I don't mind rewriting the driver for the next release. I will probably use something like the io_reserve stuff for .28 though; it makes the code easier to read and gives us good performance on 64 bit kernels. -- keith.packard@intel.com [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-19 21:04 ` Keith Packard @ 2008-10-20 11:58 ` Ingo Molnar 2008-10-20 15:49 ` Keith Packard 0 siblings, 1 reply; 80+ messages in thread From: Ingo Molnar @ 2008-10-20 11:58 UTC (permalink / raw) To: Keith Packard Cc: Jesse Barnes, Nick Piggin, Dave Airlie, Yinghai Lu, Linux Kernel Mailing List, dri-devel, Andrew Morton, Linus Torvalds * Keith Packard <keithp@keithp.com> wrote: > On Sun, 2008-10-19 at 19:53 +0200, Ingo Molnar wrote: > > > Note how simple and consistent it all gets: IO resources already > > know their physical location and their size limits. Being able to > > cache an ioremap in a mapping [and being able to use atomic kmaps on > > 32-bit] is a relatively simple and natural extension to the concept. > > I'm not sure I see any value in caching mappings here; we're mostly > interested in copying lots of data onto the card and so we use a lot > of different mappings; atomic mappings are easy to use, and efficient > enough. yes but note that by caching the whole mapping on 64-bit we get everything we want: trivially lockless, works from any CPU, can be preempted at will, and there are no ugly INVLPG flushes anywhere. you'll even get 2MB mappings automatically, if the BAR is aligned and sized correctly. 32-bit we should handle as well but not design for it. Ingo ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-20 11:58 ` Ingo Molnar @ 2008-10-20 15:49 ` Keith Packard 2008-10-22 9:36 ` Ingo Molnar 0 siblings, 1 reply; 80+ messages in thread From: Keith Packard @ 2008-10-20 15:49 UTC (permalink / raw) To: Ingo Molnar Cc: keithp, Jesse Barnes, Nick Piggin, Dave Airlie, Yinghai Lu, Linux Kernel Mailing List, dri-devel, Andrew Morton, Linus Torvalds [-- Attachment #1: Type: text/plain, Size: 657 bytes --] On Mon, 2008-10-20 at 13:58 +0200, Ingo Molnar wrote: > yes but note that by caching the whole mapping on 64-bit we get > everything we want: trivially lockless, works from any CPU, can be > preempted at will, and there are no ugly INVLPG flushes anywhere. I was assuming that on 64-bit, the map would be created at driver init time and be left in place until the driver closed; if that's what you mean by 'caching', then yes, we should cache the map. > 32-bit we should handle as well but not design for it. As long as we get kmap_atomic-like performance, and we get to simplify our code, I'm up for it. -- keith.packard@intel.com [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-20 15:49 ` Keith Packard @ 2008-10-22 9:36 ` Ingo Molnar 2008-10-23 7:14 ` Keith Packard 2008-10-23 20:22 ` Adding kmap_atomic_prot_pfn (was: [git pull] drm patches for 2.6.27-rc1) Keith Packard 0 siblings, 2 replies; 80+ messages in thread From: Ingo Molnar @ 2008-10-22 9:36 UTC (permalink / raw) To: Keith Packard Cc: Jesse Barnes, Nick Piggin, Dave Airlie, Yinghai Lu, Linux Kernel Mailing List, dri-devel, Andrew Morton, Linus Torvalds * Keith Packard <keithp@keithp.com> wrote: > On Mon, 2008-10-20 at 13:58 +0200, Ingo Molnar wrote: > > > yes but note that by caching the whole mapping on 64-bit we get > > everything we want: trivially lockless, works from any CPU, can be > > preempted at will, and there are no ugly INVLPG flushes anywhere. > > I was assuming that on 64-bit, the map would be created at driver init > time and be left in place until the driver closed; if that's what you > mean by 'caching', then yes, we should cache the map. correct. > > 32-bit we should handle as well but not design for it. > > As long as we get kmap_atomic-like performance, and we get to simplify > our code, I'm up for it. okay. So ... mind sending your io_mapping patch as a generic facility? It looks all good to me in its present form, except that it should live in include/linux/io.h, not in the drivers/gpu/drm/i915/io_reserve.h file :-) also, please send at least two patches, so that we can look at (and possibly merge) the generic facility in isolation. Ingo ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-22 9:36 ` Ingo Molnar @ 2008-10-23 7:14 ` Keith Packard 2008-10-23 7:14 ` [PATCH] Add io-mapping functions to dynamically map large device apertures Keith Packard 2008-10-23 8:05 ` io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) Ingo Molnar 2008-10-23 20:22 ` Adding kmap_atomic_prot_pfn (was: [git pull] drm patches for 2.6.27-rc1) Keith Packard 1 sibling, 2 replies; 80+ messages in thread From: Keith Packard @ 2008-10-23 7:14 UTC (permalink / raw) To: Ingo Molnar Cc: Jesse Barnes, Nick Piggin, Dave Airlie, Yinghai Lu, Linux Kernel Mailing List, Keith Packard > okay. So ... mind sending your io_mapping patch as a generic facility? > It looks all good to me in its present form, except that it should live > in include/linux/io.h, not in the drivers/gpu/drm/i915/io_reserve.h file > :-) The first patch in this series (assuming I'm driving git-send-email correctly) adds the io_mapping API. I ended up creating a new linux/io_mapping.h file as the kernel init code uses io.h and got very angry when I tried to include linux/highmem.h from that. I'm afraid I gave up at that point and just moved the code to a new file. The second patch switches the drm/i915 driver to the new API. Performance improvements on 64-bit kernels are impressive as we were using the slow path before and now get to take advantage of 64-bit wonderfulness. ^ permalink raw reply [flat|nested] 80+ messages in thread
* [PATCH] Add io-mapping functions to dynamically map large device apertures 2008-10-23 7:14 ` Keith Packard @ 2008-10-23 7:14 ` Keith Packard 2008-10-23 7:14 ` [PATCH] [drm/i915] Use io-mapping interfaces instead of a variety of mapping kludges Keith Packard 2008-10-24 4:49 ` [PATCH] Add io-mapping functions to dynamically map large device apertures Randy Dunlap 2008-10-23 8:05 ` io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) Ingo Molnar 1 sibling, 2 replies; 80+ messages in thread From: Keith Packard @ 2008-10-23 7:14 UTC (permalink / raw) To: Ingo Molnar Cc: Jesse Barnes, Nick Piggin, Dave Airlie, Yinghai Lu, Linux Kernel Mailing List, Keith Packard [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain, Size: 7706 bytes --] Graphics devices have large PCI apertures which would consume a significant fraction of a 32-bit address space if mapped during driver initialization. Using ioremap at runtime is impractical as it is too slow. This new set of interfaces uses atomic mappings on 32-bit processors and a large static mapping on 64-bit processors to provide reasonable 32-bit performance and optimal 64-bit performance. The current implementation sits atop the existing CONFIG_HIGHMEM kmap_atomic mechanism for 32-bit processors when present. When absent, it just uses ioremap, which remains horribly inefficient. Fixing non-HIGHMEM 32-bit kernels to provide per-CPU mappings ala HIGHMEM would resolve that performance issue. Signed-off-by: Keith Packard <keithp@keithp.com> --- Documentation/io-mapping.txt | 84 ++++++++++++++++++++++++++++ include/linux/io-mapping.h | 125 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 209 insertions(+), 0 deletions(-) create mode 100644 Documentation/io-mapping.txt create mode 100644 include/linux/io-mapping.h diff --git a/Documentation/io-mapping.txt b/Documentation/io-mapping.txt new file mode 100644 index 0000000..ebf6dc5 --- /dev/null +++ b/Documentation/io-mapping.txt @@ -0,0 +1,84 @@ +The io_mapping functions in linux/io.h provide an abstraction for +efficiently mapping small regions of an io device to the CPU. The initial +usage is to support the large graphics aperture on 32-bit processors where +ioremap_wc cannot be used to statically map the entire aperture to the CPU +as it would consume too much of the kernel address space. + +A mapping object is created during driver initialization using + + struct io_mapping * + io_mapping_create_wc(unsigned long base, unsigned long size) + + 'base' is the bus address of the region to be made + mappable, while 'size' indicates how large a mapping region to + enable. Both are in bytes. + + This _wc variant provides a mapping which may only be used + with the io_mapping_map_atomic_wc or io_mapping_map_wc. + +With this mapping object, individual pages can be mapped either atomically +or not, depending on the necessary scheduling environment. Of course, atomic +maps are more efficient: + + void * + io_mapping_map_atomic_wc(struct io_mapping *mapping, unsigned long offset) + + 'offset' is the offset within the defined mapping region. + Accessing addresses beyond the region specified in the + creation function yields undefined results. Using an offset + which is not page aligned yields an undefined result. The + return value points to a single page in CPU address space. + + This _wc variant returns a write-combining map to the + page and may only be used with + + Note that the task may not sleep while holding this page + mapped. + + void + io_mapping_unmap_atomic(void *vaddr) + + 'vaddr' must be the the value returned by the last + io_mapping_map_atomic_wc call. This unmaps the specified + page, and allows the task to sleep once again. + +If you need to sleep while holding the lock, you can use the non-atomic +variant, although they may be significantly slower; + + void * + io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset) + + This works like io_mapping_map_atomic_wc except it allows + the task to sleep while holding the page mapped. + + void + io_mapping_unmap(void *vaddr) + + This works like io_mapping_unmap_atomic, except it is used + for pages mapped with io_mapping_map_wc. + +At driver close time, the io_mapping object must be freed: + + void + io_mapping_free(struct io_mapping *mapping) + +Current Implementation: + +The initial implementation of these functions use existing mapping +mechanisms and so provide only an abstraction layer and no new +functionality. + +On 64-bit processors, io_mapping_create_wc calls ioremap_wc for the whole +range, creating a permanent kernel-visible mapping to the resource. The +map_atomic and map functions add the requested offset to the base of the +virtual address returned by ioremap_wc. + +On 32-bit processors with HIGHMEM defined, io_mapping_map_atomic_wc uses +kmap_atomic_pfn to map the specified page in an atomic fashion; +kmap_atomic_pfn isn't really supposed to be used with device pages, but it +provides an efficient mapping for this usage. + +On 32-bit processors without HIGHMEM defined, io_mapping_map_atomic_wc and +io_mapping_map_wc both use ioremap_wc, a terribly inefficient function which +performs an IPI to inform all processors about the new mapping. This results +in a significant performance penalty. diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h new file mode 100644 index 0000000..dcc24d5 --- /dev/null +++ b/include/linux/io-mapping.h @@ -0,0 +1,125 @@ +/* + * Copyright © 2008 Keith Packard <keithp@keithp.com> + * + * This file is free software; you can redistribute it and/or modify + * it under the terms of version 2 of the GNU General Public License + * as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA. + */ + +#ifndef _LINUX_IO_MAPPING_H +#define _LINUX_IO_MAPPING_H + +#include <linux/types.h> +#include <asm/io.h> +#include <asm/page.h> +#include <linux/highmem.h> + +/* + * The io_mapping mechanism provides an abstraction for mapping + * individual pages from an io device to the CPU in an efficient fashion. + * + * See Documentation/io_mapping.txt + */ + +/* this struct isn't actually defined anywhere */ +struct io_mapping; + +#ifdef CONFIG_X86_64 + +/* Create the io_mapping object*/ +static inline struct io_mapping * +io_mapping_create_wc(unsigned long base, unsigned long size) +{ + return (struct io_mapping *) ioremap_wc(base, size); +} + +static inline void +io_mapping_free(struct io_mapping *mapping) +{ + iounmap(mapping); +} + +/* Atomic map/unmap */ +static inline void * +io_mapping_map_atomic_wc(struct io_mapping *mapping, unsigned long offset) +{ + return ((char *) mapping) + offset; +} + +static inline void +io_mapping_unmap_atomic(void *vaddr) +{ +} + +/* Non-atomic map/unmap */ +static inline void * +io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset) +{ + return ((char *) mapping) + offset; +} + +static inline void +io_mapping_unmap(void *vaddr) +{ +} + +#endif /* CONFIG_X86_64 */ + +#ifdef CONFIG_X86_32 +static inline struct io_mapping * +io_mapping_create_wc(unsigned long base, unsigned long size) +{ + return (struct io_mapping *) base; +} + +static inline void +io_mapping_free(struct io_mapping *mapping) +{ +} + +/* Atomic map/unmap */ +static inline void * +io_mapping_map_atomic_wc(struct io_mapping *mapping, unsigned long offset) +{ + offset += (unsigned long) mapping; +#ifdef CONFIG_HIGHMEM + return kmap_atomic_pfn(offset >> PAGE_SHIFT, KM_USER0); +#else + return ioremap_wc(offset, PAGE_SIZE); +#endif +} + +static inline void +io_mapping_unmap_atomic(void *vaddr) +{ +#ifdef CONFIG_HIGHMEM + kunmap_atomic(vaddr, KM_USER0); +#else + iounmap(vaddr); +#endif +} + +static inline void * +io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset) +{ + offset += (unsigned long) mapping; + return ioremap_wc(offset, PAGE_SIZE); +} + +static inline void +io_mapping_unmap(void *vaddr) +{ + iounmap(vaddr); +} +#endif /* CONFIG_X86_32 */ + +#endif /* _LINUX_IO_MAPPING_H */ -- 1.5.6.5 ^ permalink raw reply [flat|nested] 80+ messages in thread
* [PATCH] [drm/i915] Use io-mapping interfaces instead of a variety of mapping kludges 2008-10-23 7:14 ` [PATCH] Add io-mapping functions to dynamically map large device apertures Keith Packard @ 2008-10-23 7:14 ` Keith Packard 2008-10-24 4:49 ` [PATCH] Add io-mapping functions to dynamically map large device apertures Randy Dunlap 1 sibling, 0 replies; 80+ messages in thread From: Keith Packard @ 2008-10-23 7:14 UTC (permalink / raw) To: Ingo Molnar Cc: Jesse Barnes, Nick Piggin, Dave Airlie, Yinghai Lu, Linux Kernel Mailing List, Keith Packard Switch the i915 device aperture mapping to the io-mapping interface, taking advantage of the cleaner API to extend it across all of the mapping uses, including both pwrite and relocation updates. This dramatically improves performance on 64-bit kernels which were using the same slow path as 32-bit non-HIGHMEM kernels prior to this patch. Signed-off-by: Keith Packard <keithp@keithp.com> --- drivers/gpu/drm/i915/i915_drv.h | 3 + drivers/gpu/drm/i915/i915_gem.c | 81 ++++++++++++++++----------------------- 2 files changed, 36 insertions(+), 48 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index f20ffe1..8ca5fbc 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -31,6 +31,7 @@ #define _I915_DRV_H_ #include "i915_reg.h" +#include <linux/io-mapping.h> /* General customization: */ @@ -246,6 +247,8 @@ typedef struct drm_i915_private { struct { struct drm_mm gtt_space; + struct io_mapping *io_mapping; + /** * List of objects currently involved in rendering from the * ringbuffer. diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 9255088..d38b052 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -177,14 +177,14 @@ i915_gem_gtt_pwrite(struct drm_device *dev, struct drm_gem_object *obj, struct drm_file *file_priv) { struct drm_i915_gem_object *obj_priv = obj->driver_private; + drm_i915_private_t *dev_priv = dev->dev_private; ssize_t remain; - loff_t offset; + loff_t offset, base; char __user *user_data; char __iomem *vaddr; char *vaddr_atomic; - int i, o, l; + int o, l; int ret = 0; - unsigned long pfn; unsigned long unwritten; user_data = (char __user *) (uintptr_t) args->data_ptr; @@ -211,42 +211,38 @@ i915_gem_gtt_pwrite(struct drm_device *dev, struct drm_gem_object *obj, while (remain > 0) { /* Operation in this page * - * i = page number + * base = page offset within aperture * o = offset within page * l = bytes to copy */ - i = offset >> PAGE_SHIFT; + base = (offset & ~(PAGE_SIZE-1)); o = offset & (PAGE_SIZE-1); l = remain; if ((o + l) > PAGE_SIZE) l = PAGE_SIZE - o; - pfn = (dev->agp->base >> PAGE_SHIFT) + i; - -#ifdef CONFIG_HIGHMEM /* This is a workaround for the low performance of iounmap * (approximate 10% cpu cost on normal 3D workloads). - * kmap_atomic on HIGHMEM kernels happens to let us map card - * memory without taking IPIs. When the vmap rework lands - * we should be able to dump this hack. + * io_mapping_map_atomic_wc maps card memory + * without taking IPIs. */ - vaddr_atomic = kmap_atomic_pfn(pfn, KM_USER0); -#if WATCH_PWRITE - DRM_INFO("pwrite i %d o %d l %d pfn %ld vaddr %p\n", - i, o, l, pfn, vaddr_atomic); -#endif + vaddr_atomic = io_mapping_map_atomic_wc(dev_priv->mm.io_mapping, + base); unwritten = __copy_from_user_inatomic_nocache(vaddr_atomic + o, user_data, l); - kunmap_atomic(vaddr_atomic, KM_USER0); - + io_mapping_unmap_atomic(vaddr_atomic); + + /* If we get a fault while copying data, then (presumably) our + * source page isn't available. In this case, use the + * non-atomic __copy_from_user function + */ if (unwritten) -#endif /* CONFIG_HIGHMEM */ { - vaddr = ioremap_wc(pfn << PAGE_SHIFT, PAGE_SIZE); + vaddr = io_mapping_map_wc(dev_priv->mm.io_mapping, base); #if WATCH_PWRITE - DRM_INFO("pwrite slow i %d o %d l %d " - "pfn %ld vaddr %p\n", - i, o, l, pfn, vaddr); + DRM_INFO("pwrite slow base %ld o %d l %d " + "vaddr %p\n", + base, o, l, vaddr); #endif if (vaddr == NULL) { ret = -EFAULT; @@ -256,7 +252,7 @@ i915_gem_gtt_pwrite(struct drm_device *dev, struct drm_gem_object *obj, #if WATCH_PWRITE DRM_INFO("unwritten %ld\n", unwritten); #endif - iounmap(vaddr); + io_mapping_unmap(vaddr); if (unwritten) { ret = -EFAULT; goto fail; @@ -1489,12 +1485,12 @@ i915_gem_object_pin_and_relocate(struct drm_gem_object *obj, struct drm_i915_gem_exec_object *entry) { struct drm_device *dev = obj->dev; + drm_i915_private_t *dev_priv = dev->dev_private; struct drm_i915_gem_relocation_entry reloc; struct drm_i915_gem_relocation_entry __user *relocs; struct drm_i915_gem_object *obj_priv = obj->driver_private; int i, ret; - uint32_t last_reloc_offset = -1; - void __iomem *reloc_page = NULL; + void __iomem *reloc_page; /* Choose the GTT offset for our buffer and put it there. */ ret = i915_gem_object_pin(obj, (uint32_t) entry->alignment); @@ -1617,26 +1613,11 @@ i915_gem_object_pin_and_relocate(struct drm_gem_object *obj, * perform. */ reloc_offset = obj_priv->gtt_offset + reloc.offset; - if (reloc_page == NULL || - (last_reloc_offset & ~(PAGE_SIZE - 1)) != - (reloc_offset & ~(PAGE_SIZE - 1))) { - if (reloc_page != NULL) - iounmap(reloc_page); - - reloc_page = ioremap_wc(dev->agp->base + - (reloc_offset & - ~(PAGE_SIZE - 1)), - PAGE_SIZE); - last_reloc_offset = reloc_offset; - if (reloc_page == NULL) { - drm_gem_object_unreference(target_obj); - i915_gem_object_unpin(obj); - return -ENOMEM; - } - } - + reloc_page = io_mapping_map_atomic_wc(dev_priv->mm.io_mapping, + (reloc_offset & + ~(PAGE_SIZE - 1))); reloc_entry = (uint32_t __iomem *)(reloc_page + - (reloc_offset & (PAGE_SIZE - 1))); + (reloc_offset & (PAGE_SIZE - 1))); reloc_val = target_obj_priv->gtt_offset + reloc.delta; #if WATCH_BUF @@ -1645,6 +1626,7 @@ i915_gem_object_pin_and_relocate(struct drm_gem_object *obj, readl(reloc_entry), reloc_val); #endif writel(reloc_val, reloc_entry); + io_mapping_unmap_atomic(reloc_page); /* Write the updated presumed offset for this entry back out * to the user. @@ -1660,9 +1642,6 @@ i915_gem_object_pin_and_relocate(struct drm_gem_object *obj, drm_gem_object_unreference(target_obj); } - if (reloc_page != NULL) - iounmap(reloc_page); - #if WATCH_BUF if (0) i915_gem_dump_object(obj, 128, __func__, ~0); @@ -2504,6 +2483,10 @@ i915_gem_entervt_ioctl(struct drm_device *dev, void *data, if (ret != 0) return ret; + dev_priv->mm.io_mapping = io_mapping_create_wc(dev->agp->base, + dev->agp->agp_info.aper_size + * 1024 * 1024); + mutex_lock(&dev->struct_mutex); BUG_ON(!list_empty(&dev_priv->mm.active_list)); BUG_ON(!list_empty(&dev_priv->mm.flushing_list)); @@ -2521,11 +2504,13 @@ int i915_gem_leavevt_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) { + drm_i915_private_t *dev_priv = dev->dev_private; int ret; ret = i915_gem_idle(dev); drm_irq_uninstall(dev); + io_mapping_free(dev_priv->mm.io_mapping); return ret; } -- 1.5.6.5 ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH] Add io-mapping functions to dynamically map large device apertures 2008-10-23 7:14 ` [PATCH] Add io-mapping functions to dynamically map large device apertures Keith Packard 2008-10-23 7:14 ` [PATCH] [drm/i915] Use io-mapping interfaces instead of a variety of mapping kludges Keith Packard @ 2008-10-24 4:49 ` Randy Dunlap 2008-10-24 6:26 ` Keith Packard 1 sibling, 1 reply; 80+ messages in thread From: Randy Dunlap @ 2008-10-24 4:49 UTC (permalink / raw) To: Keith Packard Cc: Ingo Molnar, Jesse Barnes, Nick Piggin, Dave Airlie, Yinghai Lu, Linux Kernel Mailing List On Thu, 23 Oct 2008 00:14:46 -0700 Keith Packard wrote: > Documentation/io-mapping.txt | 84 ++++++++++++++++++++++++++++ > include/linux/io-mapping.h | 125 ++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 209 insertions(+), 0 deletions(-) > create mode 100644 Documentation/io-mapping.txt > create mode 100644 include/linux/io-mapping.h > > diff --git a/Documentation/io-mapping.txt b/Documentation/io-mapping.txt > new file mode 100644 > index 0000000..ebf6dc5 > --- /dev/null > +++ b/Documentation/io-mapping.txt > @@ -0,0 +1,84 @@ > +The io_mapping functions in linux/io.h provide an abstraction for io-mapping.h ? > +efficiently mapping small regions of an io device to the CPU. The initial IO or I/O, please > +usage is to support the large graphics aperture on 32-bit processors where > +ioremap_wc cannot be used to statically map the entire aperture to the CPU > +as it would consume too much of the kernel address space. > + > +A mapping object is created during driver initialization using > + > + struct io_mapping * > + io_mapping_create_wc(unsigned long base, unsigned long size) > + > + 'base' is the bus address of the region to be made > + mappable, while 'size' indicates how large a mapping region to > + enable. Both are in bytes. > + > + This _wc variant provides a mapping which may only be used > + with the io_mapping_map_atomic_wc or io_mapping_map_wc. > + > +With this mapping object, individual pages can be mapped either atomically > +or not, depending on the necessary scheduling environment. Of course, atomic > +maps are more efficient: > + > + void * > + io_mapping_map_atomic_wc(struct io_mapping *mapping, unsigned long offset) > + > + 'offset' is the offset within the defined mapping region. > + Accessing addresses beyond the region specified in the > + creation function yields undefined results. Using an offset > + which is not page aligned yields an undefined result. The > + return value points to a single page in CPU address space. > + > + This _wc variant returns a write-combining map to the > + page and may only be used with with <TBD>... > + > + Note that the task may not sleep while holding this page > + mapped. > + > + void > + io_mapping_unmap_atomic(void *vaddr) > + > + 'vaddr' must be the the value returned by the last > + io_mapping_map_atomic_wc call. This unmaps the specified > + page, and allows the task to sleep once again. s/,// > + > +If you need to sleep while holding the lock, you can use the non-atomic > +variant, although they may be significantly slower; s/;/./ > + > + void * > + io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset) > + > + This works like io_mapping_map_atomic_wc except it allows > + the task to sleep while holding the page mapped. > + > + void > + io_mapping_unmap(void *vaddr) > + > + This works like io_mapping_unmap_atomic, except it is used > + for pages mapped with io_mapping_map_wc. > + > +At driver close time, the io_mapping object must be freed: > + > + void > + io_mapping_free(struct io_mapping *mapping) > + > +Current Implementation: > + > +The initial implementation of these functions use existing mapping uses > +mechanisms and so provide only an abstraction layer and no new provides > +functionality. > + > +On 64-bit processors, io_mapping_create_wc calls ioremap_wc for the whole > +range, creating a permanent kernel-visible mapping to the resource. The > +map_atomic and map functions add the requested offset to the base of the > +virtual address returned by ioremap_wc. > + > +On 32-bit processors with HIGHMEM defined, io_mapping_map_atomic_wc uses > +kmap_atomic_pfn to map the specified page in an atomic fashion; > +kmap_atomic_pfn isn't really supposed to be used with device pages, but it > +provides an efficient mapping for this usage. > + > +On 32-bit processors without HIGHMEM defined, io_mapping_map_atomic_wc and > +io_mapping_map_wc both use ioremap_wc, a terribly inefficient function which > +performs an IPI to inform all processors about the new mapping. This results > +in a significant performance penalty. And I wish you could lose that horrible (non-Linux kernel) style of function return type on a separate line. --- ~Randy ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH] Add io-mapping functions to dynamically map large device apertures 2008-10-24 4:49 ` [PATCH] Add io-mapping functions to dynamically map large device apertures Randy Dunlap @ 2008-10-24 6:26 ` Keith Packard 0 siblings, 0 replies; 80+ messages in thread From: Keith Packard @ 2008-10-24 6:26 UTC (permalink / raw) To: Randy Dunlap Cc: keithp, Ingo Molnar, Jesse Barnes, Nick Piggin, Dave Airlie, Yinghai Lu, Linux Kernel Mailing List [-- Attachment #1: Type: text/plain, Size: 234 bytes --] On Thu, 2008-10-23 at 21:49 -0700, Randy Dunlap wrote: (A bunch of helpful edits for the io-mapping.txt file) Thanks! They're in my tree and will get included in the next version of this patch. -- keith.packard@intel.com [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-23 7:14 ` Keith Packard 2008-10-23 7:14 ` [PATCH] Add io-mapping functions to dynamically map large device apertures Keith Packard @ 2008-10-23 8:05 ` Ingo Molnar 2008-10-23 15:39 ` Keith Packard 1 sibling, 1 reply; 80+ messages in thread From: Ingo Molnar @ 2008-10-23 8:05 UTC (permalink / raw) To: Keith Packard Cc: Jesse Barnes, Nick Piggin, Dave Airlie, Yinghai Lu, Linux Kernel Mailing List * Keith Packard <keithp@keithp.com> wrote: > > okay. So ... mind sending your io_mapping patch as a generic > > facility? It looks all good to me in its present form, except that > > it should live in include/linux/io.h, not in the > > drivers/gpu/drm/i915/io_reserve.h file > > :-) > > The first patch in this series (assuming I'm driving git-send-email > correctly) adds the io_mapping API. I ended up creating a new > linux/io_mapping.h file as the kernel init code uses io.h and got very > angry when I tried to include linux/highmem.h from that. I'm afraid I > gave up at that point and just moved the code to a new file. ah ... good call, i missed that mess. linux/io.h is indeed dependency laden and it's best to keep new facilities separated anyway. > The second patch switches the drm/i915 driver to the new API. > Performance improvements on 64-bit kernels are impressive as we were > using the slow path before and now get to take advantage of 64-bit > wonderfulness. heh, cool - the wonders of 64-bit x86 :-) Any ballpark-figure numbers you can share with us? Ingo ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-23 8:05 ` io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) Ingo Molnar @ 2008-10-23 15:39 ` Keith Packard 2008-11-03 7:00 ` Dave Airlie 0 siblings, 1 reply; 80+ messages in thread From: Keith Packard @ 2008-10-23 15:39 UTC (permalink / raw) To: Ingo Molnar Cc: keithp, Jesse Barnes, Nick Piggin, Dave Airlie, Yinghai Lu, Linux Kernel Mailing List [-- Attachment #1: Type: text/plain, Size: 366 bytes --] On Thu, 2008-10-23 at 10:05 +0200, Ingo Molnar wrote: > Any ballpark-figure numbers you can share with us? For one quake-3 based game we use for performance regression checking, 64-bit kernels run about 18 times faster now. That's the difference between using a zero-cost dynamic mapping and using ioremap_wc for each page. -- keith.packard@intel.com [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-23 15:39 ` Keith Packard @ 2008-11-03 7:00 ` Dave Airlie 2008-11-03 10:48 ` Ingo Molnar 2008-11-03 16:36 ` Linus Torvalds 0 siblings, 2 replies; 80+ messages in thread From: Dave Airlie @ 2008-11-03 7:00 UTC (permalink / raw) To: Keith Packard Cc: Ingo Molnar, Jesse Barnes, Nick Piggin, Dave Airlie, Yinghai Lu, Linux Kernel Mailing List, Linus Torvalds On Fri, Oct 24, 2008 at 1:39 AM, Keith Packard <keithp@keithp.com> wrote: > On Thu, 2008-10-23 at 10:05 +0200, Ingo Molnar wrote: > >> Any ballpark-figure numbers you can share with us? > > For one quake-3 based game we use for performance regression checking, > 64-bit kernels run about 18 times faster now. That's the difference > between using a zero-cost dynamic mapping and using ioremap_wc for each > page. > > -- > keith.packard@intel.com > So I've put these patches into Fedora rawhide kernel to test, and glxgears on my 945G hw went from 85fps to 380fps, clearly we would want these patches upstream sooner rather than later. Or we at least need something in the i915 driver to get the speed under GEM back up. I'm happy to ship these patches in F10 GA if that makes any difference, so I'd really like them upstream asap. Dave. ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) 2008-11-03 7:00 ` Dave Airlie @ 2008-11-03 10:48 ` Ingo Molnar 2008-11-03 16:36 ` Linus Torvalds 1 sibling, 0 replies; 80+ messages in thread From: Ingo Molnar @ 2008-11-03 10:48 UTC (permalink / raw) To: Dave Airlie Cc: Keith Packard, Jesse Barnes, Nick Piggin, Dave Airlie, Yinghai Lu, Linux Kernel Mailing List, Linus Torvalds * Dave Airlie <airlied@gmail.com> wrote: > On Fri, Oct 24, 2008 at 1:39 AM, Keith Packard <keithp@keithp.com> wrote: > > On Thu, 2008-10-23 at 10:05 +0200, Ingo Molnar wrote: > > > >> Any ballpark-figure numbers you can share with us? > > > > For one quake-3 based game we use for performance regression checking, > > 64-bit kernels run about 18 times faster now. That's the difference > > between using a zero-cost dynamic mapping and using ioremap_wc for each > > page. > > > > -- > > keith.packard@intel.com > > > > So I've put these patches into Fedora rawhide kernel to test, and > glxgears on my 945G hw went from 85fps to 380fps, clearly we would > want these patches upstream sooner rather than later. yep, it's all lined up already in tip/core/resources, and got massively tested over the past few days. Will send a pull request to Linus tomorrow-ish - we need one final cleanup patch and then it's green to go. Ingo ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) 2008-11-03 7:00 ` Dave Airlie 2008-11-03 10:48 ` Ingo Molnar @ 2008-11-03 16:36 ` Linus Torvalds 2008-11-03 16:53 ` Ingo Molnar 1 sibling, 1 reply; 80+ messages in thread From: Linus Torvalds @ 2008-11-03 16:36 UTC (permalink / raw) To: Dave Airlie Cc: Keith Packard, Ingo Molnar, Jesse Barnes, Nick Piggin, Dave Airlie, Yinghai Lu, Linux Kernel Mailing List On Mon, 3 Nov 2008, Dave Airlie wrote: > > So I've put these patches into Fedora rawhide kernel to test, and > glxgears on my 945G hw went from 85fps to 380fps, clearly we would want > these patches upstream sooner rather than later. I'm inclined to agree. Not that I think 380fps sounds very impressive (I get 850+ fps with _software_ rendering, for chissake), but because 85 fps is a joke, and clearly without this setup there's not even any point to try to do any other optimizations. And if we're looking at the patches being in Fedora anyway, there really is no reason not to merge them. Even if it's out of the merge window. Ingo? Linus ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) 2008-11-03 16:36 ` Linus Torvalds @ 2008-11-03 16:53 ` Ingo Molnar 2008-11-03 17:29 ` [git pull] IO mappings, #2 Ingo Molnar 0 siblings, 1 reply; 80+ messages in thread From: Ingo Molnar @ 2008-11-03 16:53 UTC (permalink / raw) To: Linus Torvalds Cc: Dave Airlie, Keith Packard, Jesse Barnes, Nick Piggin, Dave Airlie, Yinghai Lu, Linux Kernel Mailing List * Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > On Mon, 3 Nov 2008, Dave Airlie wrote: > > > > So I've put these patches into Fedora rawhide kernel to test, and > > glxgears on my 945G hw went from 85fps to 380fps, clearly we would want > > these patches upstream sooner rather than later. > > I'm inclined to agree. Not that I think 380fps sounds very > impressive (I get 850+ fps with _software_ rendering, for chissake), > but because 85 fps is a joke, and clearly without this setup there's > not even any point to try to do any other optimizations. > > And if we're looking at the patches being in Fedora anyway, there > really is no reason not to merge them. Even if it's out of the merge > window. > > Ingo? there was one pending item: i was waiting for the x86 #ifdef to be cleaned up out of include/linux/io-mapping.h, but that is trivial with no impact to any object code and can thus perhaps wait - would be nice to get this in early in this -rc. Please pull the latest io-mappings-for-linus git tree from: git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git io-mappings-for-linus Thanks, Ingo ------------------> Keith Packard (3): x86: add iomap_atomic*()/iounmap_atomic() on 32-bit using fixmaps resources: add io-mapping functions to dynamically map large device apertures i915: use io-mapping interfaces instead of a variety of mapping kludges Documentation/io-mapping.txt | 76 +++++++++++++++++ arch/x86/include/asm/fixmap.h | 4 + arch/x86/include/asm/fixmap_32.h | 4 - arch/x86/include/asm/highmem.h | 5 +- arch/x86/mm/Makefile | 2 +- arch/x86/mm/init_32.c | 3 +- arch/x86/mm/iomap_32.c | 59 +++++++++++++ drivers/gpu/drm/i915/i915_drv.h | 3 + drivers/gpu/drm/i915/i915_gem.c | 174 +++++++++++++++++-------------------- include/asm-x86/iomap.h | 30 +++++++ include/linux/io-mapping.h | 118 ++++++++++++++++++++++++++ 11 files changed, 373 insertions(+), 105 deletions(-) create mode 100644 Documentation/io-mapping.txt create mode 100644 arch/x86/mm/iomap_32.c create mode 100644 include/asm-x86/iomap.h create mode 100644 include/linux/io-mapping.h diff --git a/Documentation/io-mapping.txt b/Documentation/io-mapping.txt new file mode 100644 index 0000000..cd2f726 --- /dev/null +++ b/Documentation/io-mapping.txt @@ -0,0 +1,76 @@ +The io_mapping functions in linux/io-mapping.h provide an abstraction for +efficiently mapping small regions of an I/O device to the CPU. The initial +usage is to support the large graphics aperture on 32-bit processors where +ioremap_wc cannot be used to statically map the entire aperture to the CPU +as it would consume too much of the kernel address space. + +A mapping object is created during driver initialization using + + struct io_mapping *io_mapping_create_wc(unsigned long base, + unsigned long size) + + 'base' is the bus address of the region to be made + mappable, while 'size' indicates how large a mapping region to + enable. Both are in bytes. + + This _wc variant provides a mapping which may only be used + with the io_mapping_map_atomic_wc or io_mapping_map_wc. + +With this mapping object, individual pages can be mapped either atomically +or not, depending on the necessary scheduling environment. Of course, atomic +maps are more efficient: + + void *io_mapping_map_atomic_wc(struct io_mapping *mapping, + unsigned long offset) + + 'offset' is the offset within the defined mapping region. + Accessing addresses beyond the region specified in the + creation function yields undefined results. Using an offset + which is not page aligned yields an undefined result. The + return value points to a single page in CPU address space. + + This _wc variant returns a write-combining map to the + page and may only be used with mappings created by + io_mapping_create_wc + + Note that the task may not sleep while holding this page + mapped. + + void io_mapping_unmap_atomic(void *vaddr) + + 'vaddr' must be the the value returned by the last + io_mapping_map_atomic_wc call. This unmaps the specified + page and allows the task to sleep once again. + +If you need to sleep while holding the lock, you can use the non-atomic +variant, although they may be significantly slower. + + void *io_mapping_map_wc(struct io_mapping *mapping, + unsigned long offset) + + This works like io_mapping_map_atomic_wc except it allows + the task to sleep while holding the page mapped. + + void io_mapping_unmap(void *vaddr) + + This works like io_mapping_unmap_atomic, except it is used + for pages mapped with io_mapping_map_wc. + +At driver close time, the io_mapping object must be freed: + + void io_mapping_free(struct io_mapping *mapping) + +Current Implementation: + +The initial implementation of these functions uses existing mapping +mechanisms and so provides only an abstraction layer and no new +functionality. + +On 64-bit processors, io_mapping_create_wc calls ioremap_wc for the whole +range, creating a permanent kernel-visible mapping to the resource. The +map_atomic and map functions add the requested offset to the base of the +virtual address returned by ioremap_wc. + +On 32-bit processors, io_mapping_map_atomic_wc uses io_map_atomic_prot_pfn, +which uses the fixmaps to get us a mapping to a page using an atomic fashion. +For io_mapping_map_wc, ioremap_wc() is used to get a mapping of the region. diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h index 8668a94..23696d4 100644 --- a/arch/x86/include/asm/fixmap.h +++ b/arch/x86/include/asm/fixmap.h @@ -9,6 +9,10 @@ extern int fixmaps_set; +extern pte_t *kmap_pte; +extern pgprot_t kmap_prot; +extern pte_t *pkmap_page_table; + void __native_set_fixmap(enum fixed_addresses idx, pte_t pte); void native_set_fixmap(enum fixed_addresses idx, unsigned long phys, pgprot_t flags); diff --git a/arch/x86/include/asm/fixmap_32.h b/arch/x86/include/asm/fixmap_32.h index 09f29ab..c7115c1 100644 --- a/arch/x86/include/asm/fixmap_32.h +++ b/arch/x86/include/asm/fixmap_32.h @@ -28,10 +28,8 @@ extern unsigned long __FIXADDR_TOP; #include <asm/acpi.h> #include <asm/apicdef.h> #include <asm/page.h> -#ifdef CONFIG_HIGHMEM #include <linux/threads.h> #include <asm/kmap_types.h> -#endif /* * Here we define all the compile-time 'special' virtual @@ -75,10 +73,8 @@ enum fixed_addresses { #ifdef CONFIG_X86_CYCLONE_TIMER FIX_CYCLONE_TIMER, /*cyclone timer register*/ #endif -#ifdef CONFIG_HIGHMEM FIX_KMAP_BEGIN, /* reserved pte's for temporary kernel mappings */ FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1, -#endif #ifdef CONFIG_PCI_MMCONFIG FIX_PCIE_MCFG, #endif diff --git a/arch/x86/include/asm/highmem.h b/arch/x86/include/asm/highmem.h index a3b3b7c..bf9276b 100644 --- a/arch/x86/include/asm/highmem.h +++ b/arch/x86/include/asm/highmem.h @@ -25,14 +25,11 @@ #include <asm/kmap_types.h> #include <asm/tlbflush.h> #include <asm/paravirt.h> +#include <asm/fixmap.h> /* declarations for highmem.c */ extern unsigned long highstart_pfn, highend_pfn; -extern pte_t *kmap_pte; -extern pgprot_t kmap_prot; -extern pte_t *pkmap_page_table; - /* * Right now we initialize only a single pte table. It can be extended * easily, subsequent pte tables have to be allocated in one physical diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index 59f89b4..fea4565 100644 --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -1,7 +1,7 @@ obj-y := init_$(BITS).o fault.o ioremap.o extable.o pageattr.o mmap.o \ pat.o pgtable.o gup.o -obj-$(CONFIG_X86_32) += pgtable_32.o +obj-$(CONFIG_X86_32) += pgtable_32.o iomap_32.o obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o obj-$(CONFIG_X86_PTDUMP) += dump_pagetables.o diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c index 8396868..c483f42 100644 --- a/arch/x86/mm/init_32.c +++ b/arch/x86/mm/init_32.c @@ -334,7 +334,6 @@ int devmem_is_allowed(unsigned long pagenr) return 0; } -#ifdef CONFIG_HIGHMEM pte_t *kmap_pte; pgprot_t kmap_prot; @@ -357,6 +356,7 @@ static void __init kmap_init(void) kmap_prot = PAGE_KERNEL; } +#ifdef CONFIG_HIGHMEM static void __init permanent_kmaps_init(pgd_t *pgd_base) { unsigned long vaddr; @@ -436,7 +436,6 @@ static void __init set_highmem_pages_init(void) #endif /* !CONFIG_NUMA */ #else -# define kmap_init() do { } while (0) # define permanent_kmaps_init(pgd_base) do { } while (0) # define set_highmem_pages_init() do { } while (0) #endif /* CONFIG_HIGHMEM */ diff --git a/arch/x86/mm/iomap_32.c b/arch/x86/mm/iomap_32.c new file mode 100644 index 0000000..d0151d8 --- /dev/null +++ b/arch/x86/mm/iomap_32.c @@ -0,0 +1,59 @@ +/* + * Copyright © 2008 Ingo Molnar + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, write to the Free Software Foundation, Inc., + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. + */ + +#include <asm/iomap.h> +#include <linux/module.h> + +/* Map 'pfn' using fixed map 'type' and protections 'prot' + */ +void * +iomap_atomic_prot_pfn(unsigned long pfn, enum km_type type, pgprot_t prot) +{ + enum fixed_addresses idx; + unsigned long vaddr; + + pagefault_disable(); + + idx = type + KM_TYPE_NR*smp_processor_id(); + vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx); + set_pte(kmap_pte-idx, pfn_pte(pfn, prot)); + arch_flush_lazy_mmu_mode(); + + return (void*) vaddr; +} +EXPORT_SYMBOL_GPL(iomap_atomic_prot_pfn); + +void +iounmap_atomic(void *kvaddr, enum km_type type) +{ + unsigned long vaddr = (unsigned long) kvaddr & PAGE_MASK; + enum fixed_addresses idx = type + KM_TYPE_NR*smp_processor_id(); + + /* + * Force other mappings to Oops if they'll try to access this pte + * without first remap it. Keeping stale mappings around is a bad idea + * also, in case the page changes cacheability attributes or becomes + * a protected page in a hypervisor. + */ + if (vaddr == __fix_to_virt(FIX_KMAP_BEGIN+idx)) + kpte_clear_flush(kmap_pte-idx, vaddr); + + arch_flush_lazy_mmu_mode(); + pagefault_enable(); +} +EXPORT_SYMBOL_GPL(iounmap_atomic); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index f20ffe1..126b2f1 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -31,6 +31,7 @@ #define _I915_DRV_H_ #include "i915_reg.h" +#include <linux/io-mapping.h> /* General customization: */ @@ -246,6 +247,8 @@ typedef struct drm_i915_private { struct { struct drm_mm gtt_space; + struct io_mapping *gtt_mapping; + /** * List of objects currently involved in rendering from the * ringbuffer. diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 17ae330..61183b9 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -171,35 +171,50 @@ i915_gem_pread_ioctl(struct drm_device *dev, void *data, return 0; } -/* - * Try to write quickly with an atomic kmap. Return true on success. - * - * If this fails (which includes a partial write), we'll redo the whole - * thing with the slow version. - * - * This is a workaround for the low performance of iounmap (approximate - * 10% cpu cost on normal 3D workloads). kmap_atomic on HIGHMEM kernels - * happens to let us map card memory without taking IPIs. When the vmap - * rework lands we should be able to dump this hack. +/* This is the fast write path which cannot handle + * page faults in the source data */ -static inline int fast_user_write(unsigned long pfn, char __user *user_data, - int l, int o) + +static inline int +fast_user_write(struct io_mapping *mapping, + loff_t page_base, int page_offset, + char __user *user_data, + int length) { -#ifdef CONFIG_HIGHMEM - unsigned long unwritten; char *vaddr_atomic; + unsigned long unwritten; - vaddr_atomic = kmap_atomic_pfn(pfn, KM_USER0); -#if WATCH_PWRITE - DRM_INFO("pwrite i %d o %d l %d pfn %ld vaddr %p\n", - i, o, l, pfn, vaddr_atomic); -#endif - unwritten = __copy_from_user_inatomic_nocache(vaddr_atomic + o, user_data, l); - kunmap_atomic(vaddr_atomic, KM_USER0); - return !unwritten; -#else + vaddr_atomic = io_mapping_map_atomic_wc(mapping, page_base); + unwritten = __copy_from_user_inatomic_nocache(vaddr_atomic + page_offset, + user_data, length); + io_mapping_unmap_atomic(vaddr_atomic); + if (unwritten) + return -EFAULT; + return 0; +} + +/* Here's the write path which can sleep for + * page faults + */ + +static inline int +slow_user_write(struct io_mapping *mapping, + loff_t page_base, int page_offset, + char __user *user_data, + int length) +{ + char __iomem *vaddr; + unsigned long unwritten; + + vaddr = io_mapping_map_wc(mapping, page_base); + if (vaddr == NULL) + return -EFAULT; + unwritten = __copy_from_user(vaddr + page_offset, + user_data, length); + io_mapping_unmap(vaddr); + if (unwritten) + return -EFAULT; return 0; -#endif } static int @@ -208,10 +223,12 @@ i915_gem_gtt_pwrite(struct drm_device *dev, struct drm_gem_object *obj, struct drm_file *file_priv) { struct drm_i915_gem_object *obj_priv = obj->driver_private; + drm_i915_private_t *dev_priv = dev->dev_private; ssize_t remain; - loff_t offset; + loff_t offset, page_base; char __user *user_data; - int ret = 0; + int page_offset, page_length; + int ret; user_data = (char __user *) (uintptr_t) args->data_ptr; remain = args->size; @@ -235,57 +252,37 @@ i915_gem_gtt_pwrite(struct drm_device *dev, struct drm_gem_object *obj, obj_priv->dirty = 1; while (remain > 0) { - unsigned long pfn; - int i, o, l; - /* Operation in this page * - * i = page number - * o = offset within page - * l = bytes to copy + * page_base = page offset within aperture + * page_offset = offset within page + * page_length = bytes to copy for this page */ - i = offset >> PAGE_SHIFT; - o = offset & (PAGE_SIZE-1); - l = remain; - if ((o + l) > PAGE_SIZE) - l = PAGE_SIZE - o; - - pfn = (dev->agp->base >> PAGE_SHIFT) + i; - - if (!fast_user_write(pfn, user_data, l, o)) { - unsigned long unwritten; - char __iomem *vaddr; - - vaddr = ioremap_wc(pfn << PAGE_SHIFT, PAGE_SIZE); -#if WATCH_PWRITE - DRM_INFO("pwrite slow i %d o %d l %d " - "pfn %ld vaddr %p\n", - i, o, l, pfn, vaddr); -#endif - if (vaddr == NULL) { - ret = -EFAULT; - goto fail; - } - unwritten = __copy_from_user(vaddr + o, user_data, l); -#if WATCH_PWRITE - DRM_INFO("unwritten %ld\n", unwritten); -#endif - iounmap(vaddr); - if (unwritten) { - ret = -EFAULT; + page_base = (offset & ~(PAGE_SIZE-1)); + page_offset = offset & (PAGE_SIZE-1); + page_length = remain; + if ((page_offset + remain) > PAGE_SIZE) + page_length = PAGE_SIZE - page_offset; + + ret = fast_user_write (dev_priv->mm.gtt_mapping, page_base, + page_offset, user_data, page_length); + + /* If we get a fault while copying data, then (presumably) our + * source page isn't available. In this case, use the + * non-atomic function + */ + if (ret) { + ret = slow_user_write (dev_priv->mm.gtt_mapping, + page_base, page_offset, + user_data, page_length); + if (ret) goto fail; - } } - remain -= l; - user_data += l; - offset += l; + remain -= page_length; + user_data += page_length; + offset += page_length; } -#if WATCH_PWRITE && 1 - i915_gem_clflush_object(obj); - i915_gem_dump_object(obj, args->offset + args->size, __func__, ~0); - i915_gem_clflush_object(obj); -#endif fail: i915_gem_object_unpin(obj); @@ -1503,12 +1500,12 @@ i915_gem_object_pin_and_relocate(struct drm_gem_object *obj, struct drm_i915_gem_exec_object *entry) { struct drm_device *dev = obj->dev; + drm_i915_private_t *dev_priv = dev->dev_private; struct drm_i915_gem_relocation_entry reloc; struct drm_i915_gem_relocation_entry __user *relocs; struct drm_i915_gem_object *obj_priv = obj->driver_private; int i, ret; - uint32_t last_reloc_offset = -1; - void __iomem *reloc_page = NULL; + void __iomem *reloc_page; /* Choose the GTT offset for our buffer and put it there. */ ret = i915_gem_object_pin(obj, (uint32_t) entry->alignment); @@ -1631,26 +1628,11 @@ i915_gem_object_pin_and_relocate(struct drm_gem_object *obj, * perform. */ reloc_offset = obj_priv->gtt_offset + reloc.offset; - if (reloc_page == NULL || - (last_reloc_offset & ~(PAGE_SIZE - 1)) != - (reloc_offset & ~(PAGE_SIZE - 1))) { - if (reloc_page != NULL) - iounmap(reloc_page); - - reloc_page = ioremap_wc(dev->agp->base + - (reloc_offset & - ~(PAGE_SIZE - 1)), - PAGE_SIZE); - last_reloc_offset = reloc_offset; - if (reloc_page == NULL) { - drm_gem_object_unreference(target_obj); - i915_gem_object_unpin(obj); - return -ENOMEM; - } - } - + reloc_page = io_mapping_map_atomic_wc(dev_priv->mm.gtt_mapping, + (reloc_offset & + ~(PAGE_SIZE - 1))); reloc_entry = (uint32_t __iomem *)(reloc_page + - (reloc_offset & (PAGE_SIZE - 1))); + (reloc_offset & (PAGE_SIZE - 1))); reloc_val = target_obj_priv->gtt_offset + reloc.delta; #if WATCH_BUF @@ -1659,6 +1641,7 @@ i915_gem_object_pin_and_relocate(struct drm_gem_object *obj, readl(reloc_entry), reloc_val); #endif writel(reloc_val, reloc_entry); + io_mapping_unmap_atomic(reloc_page); /* Write the updated presumed offset for this entry back out * to the user. @@ -1674,9 +1657,6 @@ i915_gem_object_pin_and_relocate(struct drm_gem_object *obj, drm_gem_object_unreference(target_obj); } - if (reloc_page != NULL) - iounmap(reloc_page); - #if WATCH_BUF if (0) i915_gem_dump_object(obj, 128, __func__, ~0); @@ -2518,6 +2498,10 @@ i915_gem_entervt_ioctl(struct drm_device *dev, void *data, if (ret != 0) return ret; + dev_priv->mm.gtt_mapping = io_mapping_create_wc(dev->agp->base, + dev->agp->agp_info.aper_size + * 1024 * 1024); + mutex_lock(&dev->struct_mutex); BUG_ON(!list_empty(&dev_priv->mm.active_list)); BUG_ON(!list_empty(&dev_priv->mm.flushing_list)); @@ -2535,11 +2519,13 @@ int i915_gem_leavevt_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) { + drm_i915_private_t *dev_priv = dev->dev_private; int ret; ret = i915_gem_idle(dev); drm_irq_uninstall(dev); + io_mapping_free(dev_priv->mm.gtt_mapping); return ret; } diff --git a/include/asm-x86/iomap.h b/include/asm-x86/iomap.h new file mode 100644 index 0000000..c1f0628 --- /dev/null +++ b/include/asm-x86/iomap.h @@ -0,0 +1,30 @@ +/* + * Copyright © 2008 Ingo Molnar + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, write to the Free Software Foundation, Inc., + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. + */ + +#include <linux/fs.h> +#include <linux/mm.h> +#include <linux/uaccess.h> +#include <asm/cacheflush.h> +#include <asm/pgtable.h> +#include <asm/tlbflush.h> + +void * +iomap_atomic_prot_pfn(unsigned long pfn, enum km_type type, pgprot_t prot); + +void +iounmap_atomic(void *kvaddr, enum km_type type); diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h new file mode 100644 index 0000000..1b56699 --- /dev/null +++ b/include/linux/io-mapping.h @@ -0,0 +1,118 @@ +/* + * Copyright © 2008 Keith Packard <keithp@keithp.com> + * + * This file is free software; you can redistribute it and/or modify + * it under the terms of version 2 of the GNU General Public License + * as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA. + */ + +#ifndef _LINUX_IO_MAPPING_H +#define _LINUX_IO_MAPPING_H + +#include <linux/types.h> +#include <asm/io.h> +#include <asm/page.h> +#include <asm/iomap.h> + +/* + * The io_mapping mechanism provides an abstraction for mapping + * individual pages from an io device to the CPU in an efficient fashion. + * + * See Documentation/io_mapping.txt + */ + +/* this struct isn't actually defined anywhere */ +struct io_mapping; + +#ifdef CONFIG_X86_64 + +/* Create the io_mapping object*/ +static inline struct io_mapping * +io_mapping_create_wc(unsigned long base, unsigned long size) +{ + return (struct io_mapping *) ioremap_wc(base, size); +} + +static inline void +io_mapping_free(struct io_mapping *mapping) +{ + iounmap(mapping); +} + +/* Atomic map/unmap */ +static inline void * +io_mapping_map_atomic_wc(struct io_mapping *mapping, unsigned long offset) +{ + return ((char *) mapping) + offset; +} + +static inline void +io_mapping_unmap_atomic(void *vaddr) +{ +} + +/* Non-atomic map/unmap */ +static inline void * +io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset) +{ + return ((char *) mapping) + offset; +} + +static inline void +io_mapping_unmap(void *vaddr) +{ +} + +#endif /* CONFIG_X86_64 */ + +#ifdef CONFIG_X86_32 +static inline struct io_mapping * +io_mapping_create_wc(unsigned long base, unsigned long size) +{ + return (struct io_mapping *) base; +} + +static inline void +io_mapping_free(struct io_mapping *mapping) +{ +} + +/* Atomic map/unmap */ +static inline void * +io_mapping_map_atomic_wc(struct io_mapping *mapping, unsigned long offset) +{ + offset += (unsigned long) mapping; + return iomap_atomic_prot_pfn(offset >> PAGE_SHIFT, KM_USER0, + __pgprot(__PAGE_KERNEL_WC)); +} + +static inline void +io_mapping_unmap_atomic(void *vaddr) +{ + iounmap_atomic(vaddr, KM_USER0); +} + +static inline void * +io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset) +{ + offset += (unsigned long) mapping; + return ioremap_wc(offset, PAGE_SIZE); +} + +static inline void +io_mapping_unmap(void *vaddr) +{ + iounmap(vaddr); +} +#endif /* CONFIG_X86_32 */ + +#endif /* _LINUX_IO_MAPPING_H */ ^ permalink raw reply [flat|nested] 80+ messages in thread
* [git pull] IO mappings, #2 2008-11-03 16:53 ` Ingo Molnar @ 2008-11-03 17:29 ` Ingo Molnar 2008-11-04 22:36 ` Jonathan Corbet 0 siblings, 1 reply; 80+ messages in thread From: Ingo Molnar @ 2008-11-03 17:29 UTC (permalink / raw) To: Linus Torvalds Cc: Dave Airlie, Keith Packard, Jesse Barnes, Nick Piggin, Dave Airlie, Yinghai Lu, Linux Kernel Mailing List * Ingo Molnar <mingo@elte.hu> wrote: > there was one pending item: i was waiting for the x86 #ifdef to be > cleaned up out of include/linux/io-mapping.h, but that is trivial > with no impact to any object code and can thus perhaps wait - would > be nice to get this in early in this -rc. These cleanups can now be found in the second tree below. They can be cleanly pulled over the other tree i just posted, but it has slightly less testing. (the cleanup commits are fresh - but there should be no difference in functionality) Please pull the latest io-mappings-for-linus-2 git tree from: git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git io-mappings-for-linus-2 Thanks, Ingo ------------------> Keith Packard (5): x86: add iomap_atomic*()/iounmap_atomic() on 32-bit using fixmaps resources: add io-mapping functions to dynamically map large device apertures i915: use io-mapping interfaces instead of a variety of mapping kludges io mapping: improve documentation io mapping: clean up #ifdefs Documentation/io-mapping.txt | 82 ++++++++++++++++++ arch/x86/Kconfig | 4 + arch/x86/include/asm/fixmap.h | 4 + arch/x86/include/asm/fixmap_32.h | 4 - arch/x86/include/asm/highmem.h | 5 +- arch/x86/mm/Makefile | 2 +- arch/x86/mm/init_32.c | 3 +- arch/x86/mm/iomap_32.c | 59 +++++++++++++ drivers/gpu/drm/i915/i915_drv.h | 3 + drivers/gpu/drm/i915/i915_gem.c | 174 +++++++++++++++++-------------------- include/asm-x86/iomap.h | 30 +++++++ include/linux/io-mapping.h | 125 +++++++++++++++++++++++++++ 12 files changed, 390 insertions(+), 105 deletions(-) create mode 100644 Documentation/io-mapping.txt create mode 100644 arch/x86/mm/iomap_32.c create mode 100644 include/asm-x86/iomap.h create mode 100644 include/linux/io-mapping.h diff --git a/Documentation/io-mapping.txt b/Documentation/io-mapping.txt new file mode 100644 index 0000000..473e43b --- /dev/null +++ b/Documentation/io-mapping.txt @@ -0,0 +1,82 @@ +The io_mapping functions in linux/io-mapping.h provide an abstraction for +efficiently mapping small regions of an I/O device to the CPU. The initial +usage is to support the large graphics aperture on 32-bit processors where +ioremap_wc cannot be used to statically map the entire aperture to the CPU +as it would consume too much of the kernel address space. + +A mapping object is created during driver initialization using + + struct io_mapping *io_mapping_create_wc(unsigned long base, + unsigned long size) + + 'base' is the bus address of the region to be made + mappable, while 'size' indicates how large a mapping region to + enable. Both are in bytes. + + This _wc variant provides a mapping which may only be used + with the io_mapping_map_atomic_wc or io_mapping_map_wc. + +With this mapping object, individual pages can be mapped either atomically +or not, depending on the necessary scheduling environment. Of course, atomic +maps are more efficient: + + void *io_mapping_map_atomic_wc(struct io_mapping *mapping, + unsigned long offset) + + 'offset' is the offset within the defined mapping region. + Accessing addresses beyond the region specified in the + creation function yields undefined results. Using an offset + which is not page aligned yields an undefined result. The + return value points to a single page in CPU address space. + + This _wc variant returns a write-combining map to the + page and may only be used with mappings created by + io_mapping_create_wc + + Note that the task may not sleep while holding this page + mapped. + + void io_mapping_unmap_atomic(void *vaddr) + + 'vaddr' must be the the value returned by the last + io_mapping_map_atomic_wc call. This unmaps the specified + page and allows the task to sleep once again. + +If you need to sleep while holding the lock, you can use the non-atomic +variant, although they may be significantly slower. + + void *io_mapping_map_wc(struct io_mapping *mapping, + unsigned long offset) + + This works like io_mapping_map_atomic_wc except it allows + the task to sleep while holding the page mapped. + + void io_mapping_unmap(void *vaddr) + + This works like io_mapping_unmap_atomic, except it is used + for pages mapped with io_mapping_map_wc. + +At driver close time, the io_mapping object must be freed: + + void io_mapping_free(struct io_mapping *mapping) + +Current Implementation: + +The initial implementation of these functions uses existing mapping +mechanisms and so provides only an abstraction layer and no new +functionality. + +On 64-bit processors, io_mapping_create_wc calls ioremap_wc for the whole +range, creating a permanent kernel-visible mapping to the resource. The +map_atomic and map functions add the requested offset to the base of the +virtual address returned by ioremap_wc. + +On 32-bit processors with HIGHMEM defined, io_mapping_map_atomic_wc uses +kmap_atomic_pfn to map the specified page in an atomic fashion; +kmap_atomic_pfn isn't really supposed to be used with device pages, but it +provides an efficient mapping for this usage. + +On 32-bit processors without HIGHMEM defined, io_mapping_map_atomic_wc and +io_mapping_map_wc both use ioremap_wc, a terribly inefficient function which +performs an IPI to inform all processors about the new mapping. This results +in a significant performance penalty. diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 6f20718..e60c59b 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1894,6 +1894,10 @@ config SYSVIPC_COMPAT endmenu +config HAVE_ATOMIC_IOMAP + def_bool y + depends on X86_32 + source "net/Kconfig" source "drivers/Kconfig" diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h index 8668a94..23696d4 100644 --- a/arch/x86/include/asm/fixmap.h +++ b/arch/x86/include/asm/fixmap.h @@ -9,6 +9,10 @@ extern int fixmaps_set; +extern pte_t *kmap_pte; +extern pgprot_t kmap_prot; +extern pte_t *pkmap_page_table; + void __native_set_fixmap(enum fixed_addresses idx, pte_t pte); void native_set_fixmap(enum fixed_addresses idx, unsigned long phys, pgprot_t flags); diff --git a/arch/x86/include/asm/fixmap_32.h b/arch/x86/include/asm/fixmap_32.h index 09f29ab..c7115c1 100644 --- a/arch/x86/include/asm/fixmap_32.h +++ b/arch/x86/include/asm/fixmap_32.h @@ -28,10 +28,8 @@ extern unsigned long __FIXADDR_TOP; #include <asm/acpi.h> #include <asm/apicdef.h> #include <asm/page.h> -#ifdef CONFIG_HIGHMEM #include <linux/threads.h> #include <asm/kmap_types.h> -#endif /* * Here we define all the compile-time 'special' virtual @@ -75,10 +73,8 @@ enum fixed_addresses { #ifdef CONFIG_X86_CYCLONE_TIMER FIX_CYCLONE_TIMER, /*cyclone timer register*/ #endif -#ifdef CONFIG_HIGHMEM FIX_KMAP_BEGIN, /* reserved pte's for temporary kernel mappings */ FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1, -#endif #ifdef CONFIG_PCI_MMCONFIG FIX_PCIE_MCFG, #endif diff --git a/arch/x86/include/asm/highmem.h b/arch/x86/include/asm/highmem.h index a3b3b7c..bf9276b 100644 --- a/arch/x86/include/asm/highmem.h +++ b/arch/x86/include/asm/highmem.h @@ -25,14 +25,11 @@ #include <asm/kmap_types.h> #include <asm/tlbflush.h> #include <asm/paravirt.h> +#include <asm/fixmap.h> /* declarations for highmem.c */ extern unsigned long highstart_pfn, highend_pfn; -extern pte_t *kmap_pte; -extern pgprot_t kmap_prot; -extern pte_t *pkmap_page_table; - /* * Right now we initialize only a single pte table. It can be extended * easily, subsequent pte tables have to be allocated in one physical diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index 59f89b4..fea4565 100644 --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -1,7 +1,7 @@ obj-y := init_$(BITS).o fault.o ioremap.o extable.o pageattr.o mmap.o \ pat.o pgtable.o gup.o -obj-$(CONFIG_X86_32) += pgtable_32.o +obj-$(CONFIG_X86_32) += pgtable_32.o iomap_32.o obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o obj-$(CONFIG_X86_PTDUMP) += dump_pagetables.o diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c index 8396868..c483f42 100644 --- a/arch/x86/mm/init_32.c +++ b/arch/x86/mm/init_32.c @@ -334,7 +334,6 @@ int devmem_is_allowed(unsigned long pagenr) return 0; } -#ifdef CONFIG_HIGHMEM pte_t *kmap_pte; pgprot_t kmap_prot; @@ -357,6 +356,7 @@ static void __init kmap_init(void) kmap_prot = PAGE_KERNEL; } +#ifdef CONFIG_HIGHMEM static void __init permanent_kmaps_init(pgd_t *pgd_base) { unsigned long vaddr; @@ -436,7 +436,6 @@ static void __init set_highmem_pages_init(void) #endif /* !CONFIG_NUMA */ #else -# define kmap_init() do { } while (0) # define permanent_kmaps_init(pgd_base) do { } while (0) # define set_highmem_pages_init() do { } while (0) #endif /* CONFIG_HIGHMEM */ diff --git a/arch/x86/mm/iomap_32.c b/arch/x86/mm/iomap_32.c new file mode 100644 index 0000000..d0151d8 --- /dev/null +++ b/arch/x86/mm/iomap_32.c @@ -0,0 +1,59 @@ +/* + * Copyright © 2008 Ingo Molnar + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, write to the Free Software Foundation, Inc., + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. + */ + +#include <asm/iomap.h> +#include <linux/module.h> + +/* Map 'pfn' using fixed map 'type' and protections 'prot' + */ +void * +iomap_atomic_prot_pfn(unsigned long pfn, enum km_type type, pgprot_t prot) +{ + enum fixed_addresses idx; + unsigned long vaddr; + + pagefault_disable(); + + idx = type + KM_TYPE_NR*smp_processor_id(); + vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx); + set_pte(kmap_pte-idx, pfn_pte(pfn, prot)); + arch_flush_lazy_mmu_mode(); + + return (void*) vaddr; +} +EXPORT_SYMBOL_GPL(iomap_atomic_prot_pfn); + +void +iounmap_atomic(void *kvaddr, enum km_type type) +{ + unsigned long vaddr = (unsigned long) kvaddr & PAGE_MASK; + enum fixed_addresses idx = type + KM_TYPE_NR*smp_processor_id(); + + /* + * Force other mappings to Oops if they'll try to access this pte + * without first remap it. Keeping stale mappings around is a bad idea + * also, in case the page changes cacheability attributes or becomes + * a protected page in a hypervisor. + */ + if (vaddr == __fix_to_virt(FIX_KMAP_BEGIN+idx)) + kpte_clear_flush(kmap_pte-idx, vaddr); + + arch_flush_lazy_mmu_mode(); + pagefault_enable(); +} +EXPORT_SYMBOL_GPL(iounmap_atomic); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index f20ffe1..126b2f1 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -31,6 +31,7 @@ #define _I915_DRV_H_ #include "i915_reg.h" +#include <linux/io-mapping.h> /* General customization: */ @@ -246,6 +247,8 @@ typedef struct drm_i915_private { struct { struct drm_mm gtt_space; + struct io_mapping *gtt_mapping; + /** * List of objects currently involved in rendering from the * ringbuffer. diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 17ae330..61183b9 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -171,35 +171,50 @@ i915_gem_pread_ioctl(struct drm_device *dev, void *data, return 0; } -/* - * Try to write quickly with an atomic kmap. Return true on success. - * - * If this fails (which includes a partial write), we'll redo the whole - * thing with the slow version. - * - * This is a workaround for the low performance of iounmap (approximate - * 10% cpu cost on normal 3D workloads). kmap_atomic on HIGHMEM kernels - * happens to let us map card memory without taking IPIs. When the vmap - * rework lands we should be able to dump this hack. +/* This is the fast write path which cannot handle + * page faults in the source data */ -static inline int fast_user_write(unsigned long pfn, char __user *user_data, - int l, int o) + +static inline int +fast_user_write(struct io_mapping *mapping, + loff_t page_base, int page_offset, + char __user *user_data, + int length) { -#ifdef CONFIG_HIGHMEM - unsigned long unwritten; char *vaddr_atomic; + unsigned long unwritten; - vaddr_atomic = kmap_atomic_pfn(pfn, KM_USER0); -#if WATCH_PWRITE - DRM_INFO("pwrite i %d o %d l %d pfn %ld vaddr %p\n", - i, o, l, pfn, vaddr_atomic); -#endif - unwritten = __copy_from_user_inatomic_nocache(vaddr_atomic + o, user_data, l); - kunmap_atomic(vaddr_atomic, KM_USER0); - return !unwritten; -#else + vaddr_atomic = io_mapping_map_atomic_wc(mapping, page_base); + unwritten = __copy_from_user_inatomic_nocache(vaddr_atomic + page_offset, + user_data, length); + io_mapping_unmap_atomic(vaddr_atomic); + if (unwritten) + return -EFAULT; + return 0; +} + +/* Here's the write path which can sleep for + * page faults + */ + +static inline int +slow_user_write(struct io_mapping *mapping, + loff_t page_base, int page_offset, + char __user *user_data, + int length) +{ + char __iomem *vaddr; + unsigned long unwritten; + + vaddr = io_mapping_map_wc(mapping, page_base); + if (vaddr == NULL) + return -EFAULT; + unwritten = __copy_from_user(vaddr + page_offset, + user_data, length); + io_mapping_unmap(vaddr); + if (unwritten) + return -EFAULT; return 0; -#endif } static int @@ -208,10 +223,12 @@ i915_gem_gtt_pwrite(struct drm_device *dev, struct drm_gem_object *obj, struct drm_file *file_priv) { struct drm_i915_gem_object *obj_priv = obj->driver_private; + drm_i915_private_t *dev_priv = dev->dev_private; ssize_t remain; - loff_t offset; + loff_t offset, page_base; char __user *user_data; - int ret = 0; + int page_offset, page_length; + int ret; user_data = (char __user *) (uintptr_t) args->data_ptr; remain = args->size; @@ -235,57 +252,37 @@ i915_gem_gtt_pwrite(struct drm_device *dev, struct drm_gem_object *obj, obj_priv->dirty = 1; while (remain > 0) { - unsigned long pfn; - int i, o, l; - /* Operation in this page * - * i = page number - * o = offset within page - * l = bytes to copy + * page_base = page offset within aperture + * page_offset = offset within page + * page_length = bytes to copy for this page */ - i = offset >> PAGE_SHIFT; - o = offset & (PAGE_SIZE-1); - l = remain; - if ((o + l) > PAGE_SIZE) - l = PAGE_SIZE - o; - - pfn = (dev->agp->base >> PAGE_SHIFT) + i; - - if (!fast_user_write(pfn, user_data, l, o)) { - unsigned long unwritten; - char __iomem *vaddr; - - vaddr = ioremap_wc(pfn << PAGE_SHIFT, PAGE_SIZE); -#if WATCH_PWRITE - DRM_INFO("pwrite slow i %d o %d l %d " - "pfn %ld vaddr %p\n", - i, o, l, pfn, vaddr); -#endif - if (vaddr == NULL) { - ret = -EFAULT; - goto fail; - } - unwritten = __copy_from_user(vaddr + o, user_data, l); -#if WATCH_PWRITE - DRM_INFO("unwritten %ld\n", unwritten); -#endif - iounmap(vaddr); - if (unwritten) { - ret = -EFAULT; + page_base = (offset & ~(PAGE_SIZE-1)); + page_offset = offset & (PAGE_SIZE-1); + page_length = remain; + if ((page_offset + remain) > PAGE_SIZE) + page_length = PAGE_SIZE - page_offset; + + ret = fast_user_write (dev_priv->mm.gtt_mapping, page_base, + page_offset, user_data, page_length); + + /* If we get a fault while copying data, then (presumably) our + * source page isn't available. In this case, use the + * non-atomic function + */ + if (ret) { + ret = slow_user_write (dev_priv->mm.gtt_mapping, + page_base, page_offset, + user_data, page_length); + if (ret) goto fail; - } } - remain -= l; - user_data += l; - offset += l; + remain -= page_length; + user_data += page_length; + offset += page_length; } -#if WATCH_PWRITE && 1 - i915_gem_clflush_object(obj); - i915_gem_dump_object(obj, args->offset + args->size, __func__, ~0); - i915_gem_clflush_object(obj); -#endif fail: i915_gem_object_unpin(obj); @@ -1503,12 +1500,12 @@ i915_gem_object_pin_and_relocate(struct drm_gem_object *obj, struct drm_i915_gem_exec_object *entry) { struct drm_device *dev = obj->dev; + drm_i915_private_t *dev_priv = dev->dev_private; struct drm_i915_gem_relocation_entry reloc; struct drm_i915_gem_relocation_entry __user *relocs; struct drm_i915_gem_object *obj_priv = obj->driver_private; int i, ret; - uint32_t last_reloc_offset = -1; - void __iomem *reloc_page = NULL; + void __iomem *reloc_page; /* Choose the GTT offset for our buffer and put it there. */ ret = i915_gem_object_pin(obj, (uint32_t) entry->alignment); @@ -1631,26 +1628,11 @@ i915_gem_object_pin_and_relocate(struct drm_gem_object *obj, * perform. */ reloc_offset = obj_priv->gtt_offset + reloc.offset; - if (reloc_page == NULL || - (last_reloc_offset & ~(PAGE_SIZE - 1)) != - (reloc_offset & ~(PAGE_SIZE - 1))) { - if (reloc_page != NULL) - iounmap(reloc_page); - - reloc_page = ioremap_wc(dev->agp->base + - (reloc_offset & - ~(PAGE_SIZE - 1)), - PAGE_SIZE); - last_reloc_offset = reloc_offset; - if (reloc_page == NULL) { - drm_gem_object_unreference(target_obj); - i915_gem_object_unpin(obj); - return -ENOMEM; - } - } - + reloc_page = io_mapping_map_atomic_wc(dev_priv->mm.gtt_mapping, + (reloc_offset & + ~(PAGE_SIZE - 1))); reloc_entry = (uint32_t __iomem *)(reloc_page + - (reloc_offset & (PAGE_SIZE - 1))); + (reloc_offset & (PAGE_SIZE - 1))); reloc_val = target_obj_priv->gtt_offset + reloc.delta; #if WATCH_BUF @@ -1659,6 +1641,7 @@ i915_gem_object_pin_and_relocate(struct drm_gem_object *obj, readl(reloc_entry), reloc_val); #endif writel(reloc_val, reloc_entry); + io_mapping_unmap_atomic(reloc_page); /* Write the updated presumed offset for this entry back out * to the user. @@ -1674,9 +1657,6 @@ i915_gem_object_pin_and_relocate(struct drm_gem_object *obj, drm_gem_object_unreference(target_obj); } - if (reloc_page != NULL) - iounmap(reloc_page); - #if WATCH_BUF if (0) i915_gem_dump_object(obj, 128, __func__, ~0); @@ -2518,6 +2498,10 @@ i915_gem_entervt_ioctl(struct drm_device *dev, void *data, if (ret != 0) return ret; + dev_priv->mm.gtt_mapping = io_mapping_create_wc(dev->agp->base, + dev->agp->agp_info.aper_size + * 1024 * 1024); + mutex_lock(&dev->struct_mutex); BUG_ON(!list_empty(&dev_priv->mm.active_list)); BUG_ON(!list_empty(&dev_priv->mm.flushing_list)); @@ -2535,11 +2519,13 @@ int i915_gem_leavevt_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) { + drm_i915_private_t *dev_priv = dev->dev_private; int ret; ret = i915_gem_idle(dev); drm_irq_uninstall(dev); + io_mapping_free(dev_priv->mm.gtt_mapping); return ret; } diff --git a/include/asm-x86/iomap.h b/include/asm-x86/iomap.h new file mode 100644 index 0000000..c1f0628 --- /dev/null +++ b/include/asm-x86/iomap.h @@ -0,0 +1,30 @@ +/* + * Copyright © 2008 Ingo Molnar + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, write to the Free Software Foundation, Inc., + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. + */ + +#include <linux/fs.h> +#include <linux/mm.h> +#include <linux/uaccess.h> +#include <asm/cacheflush.h> +#include <asm/pgtable.h> +#include <asm/tlbflush.h> + +void * +iomap_atomic_prot_pfn(unsigned long pfn, enum km_type type, pgprot_t prot); + +void +iounmap_atomic(void *kvaddr, enum km_type type); diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h new file mode 100644 index 0000000..82df317 --- /dev/null +++ b/include/linux/io-mapping.h @@ -0,0 +1,125 @@ +/* + * Copyright © 2008 Keith Packard <keithp@keithp.com> + * + * This file is free software; you can redistribute it and/or modify + * it under the terms of version 2 of the GNU General Public License + * as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA. + */ + +#ifndef _LINUX_IO_MAPPING_H +#define _LINUX_IO_MAPPING_H + +#include <linux/types.h> +#include <asm/io.h> +#include <asm/page.h> +#include <asm/iomap.h> + +/* + * The io_mapping mechanism provides an abstraction for mapping + * individual pages from an io device to the CPU in an efficient fashion. + * + * See Documentation/io_mapping.txt + */ + +/* this struct isn't actually defined anywhere */ +struct io_mapping; + +#ifdef CONFIG_HAVE_ATOMIC_IOMAP + +/* + * For small address space machines, mapping large objects + * into the kernel virtual space isn't practical. Where + * available, use fixmap support to dynamically map pages + * of the object at run time. + */ + +static inline struct io_mapping * +io_mapping_create_wc(unsigned long base, unsigned long size) +{ + return (struct io_mapping *) base; +} + +static inline void +io_mapping_free(struct io_mapping *mapping) +{ +} + +/* Atomic map/unmap */ +static inline void * +io_mapping_map_atomic_wc(struct io_mapping *mapping, unsigned long offset) +{ + offset += (unsigned long) mapping; + return iomap_atomic_prot_pfn(offset >> PAGE_SHIFT, KM_USER0, + __pgprot(__PAGE_KERNEL_WC)); +} + +static inline void +io_mapping_unmap_atomic(void *vaddr) +{ + iounmap_atomic(vaddr, KM_USER0); +} + +static inline void * +io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset) +{ + offset += (unsigned long) mapping; + return ioremap_wc(offset, PAGE_SIZE); +} + +static inline void +io_mapping_unmap(void *vaddr) +{ + iounmap(vaddr); +} + +#else + +/* Create the io_mapping object*/ +static inline struct io_mapping * +io_mapping_create_wc(unsigned long base, unsigned long size) +{ + return (struct io_mapping *) ioremap_wc(base, size); +} + +static inline void +io_mapping_free(struct io_mapping *mapping) +{ + iounmap(mapping); +} + +/* Atomic map/unmap */ +static inline void * +io_mapping_map_atomic_wc(struct io_mapping *mapping, unsigned long offset) +{ + return ((char *) mapping) + offset; +} + +static inline void +io_mapping_unmap_atomic(void *vaddr) +{ +} + +/* Non-atomic map/unmap */ +static inline void * +io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset) +{ + return ((char *) mapping) + offset; +} + +static inline void +io_mapping_unmap(void *vaddr) +{ +} + +#endif /* HAVE_ATOMIC_IOMAP */ + +#endif /* _LINUX_IO_MAPPING_H */ ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] IO mappings, #2 2008-11-03 17:29 ` [git pull] IO mappings, #2 Ingo Molnar @ 2008-11-04 22:36 ` Jonathan Corbet 2008-11-05 9:01 ` Ingo Molnar 0 siblings, 1 reply; 80+ messages in thread From: Jonathan Corbet @ 2008-11-04 22:36 UTC (permalink / raw) To: Ingo Molnar Cc: Linus Torvalds, Keith Packard, Jesse Barnes, Nick Piggin, Dave Airlie, Yinghai Lu, Linux Kernel Mailing List Having looked at this some, I have one, tiny little suggestion: > +With this mapping object, individual pages can be mapped either > atomically +or not, depending on the necessary scheduling > environment. Of course, atomic +maps are more efficient: > + > + void *io_mapping_map_atomic_wc(struct io_mapping *mapping, > + unsigned long offset) Should the documentation for this function (perhaps the certainly-forthcoming kerneldoc comments :) mention loudly that this function uses KM_USER0? This isn't kmap(), and doesn't look much like it; someday some developer might get an ugly surprise when they try to use that slot simultaneously for something else. jon ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] IO mappings, #2 2008-11-04 22:36 ` Jonathan Corbet @ 2008-11-05 9:01 ` Ingo Molnar 0 siblings, 0 replies; 80+ messages in thread From: Ingo Molnar @ 2008-11-05 9:01 UTC (permalink / raw) To: Jonathan Corbet Cc: Linus Torvalds, Keith Packard, Jesse Barnes, Nick Piggin, Dave Airlie, Yinghai Lu, Linux Kernel Mailing List * Jonathan Corbet <corbet@lwn.net> wrote: > Having looked at this some, I have one, tiny little suggestion: > > > +With this mapping object, individual pages can be mapped either > > atomically +or not, depending on the necessary scheduling > > environment. Of course, atomic +maps are more efficient: > > + > > + void *io_mapping_map_atomic_wc(struct io_mapping *mapping, > > + unsigned long offset) > > Should the documentation for this function (perhaps the > certainly-forthcoming kerneldoc comments :) mention loudly that this > function uses KM_USER0? This isn't kmap(), and doesn't look much > like it; someday some developer might get an ugly surprise when they > try to use that slot simultaneously for something else. definitely worth a comment. Ingo ^ permalink raw reply [flat|nested] 80+ messages in thread
* Adding kmap_atomic_prot_pfn (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-22 9:36 ` Ingo Molnar 2008-10-23 7:14 ` Keith Packard @ 2008-10-23 20:22 ` Keith Packard 2008-10-23 20:38 ` Andrew Morton 1 sibling, 1 reply; 80+ messages in thread From: Keith Packard @ 2008-10-23 20:22 UTC (permalink / raw) To: Ingo Molnar Cc: keithp, Nick Piggin, Dave Airlie, Linus Torvalds, Linux Kernel Mailing List, Jesse Barnes, dri-devel, Andrew Morton, Yinghai Lu [-- Attachment #1: Type: text/plain, Size: 3694 bytes --] We just ran some numbers on a box with PAT enabled and broken MTRRs. Finally we have a test platform for the difference between kmap_atomic and kmap_atomic_prot. Using regular kmap_atomic on this platform, we get UC access to the graphics device; sending data from the CPU is a bit slow. Adding kmap_atomic_prot_pfn and specifying WC access raises that by a fairly significant factor, taking our CPU utilization for copy_from_user from 40% to 2%. Here's a patch which adds kmap_atomic_prot_pfn to the kernel for this usage. When we add native io-mapping support instead of sitting on top of kmap, we can remove this function. I've reworked the io_mapping patches to use this function as well, along with cleaning up the i915 code along the lines of Linus' current version. I'll post those if this patch looks acceptable. From e3f633dcb36889fa85ea06cca339072df3c44ae0 Mon Sep 17 00:00:00 2001 From: Keith Packard <keithp@keithp.com> Date: Thu, 23 Oct 2008 11:53:45 -0700 Subject: [PATCH] Add kmap_atomic_prot_pfn kmap_atomic_prot_pfn is a mixture of kmap_atomic_prot and kmap_atomic_pfn, accepting both a pfn instead of a page and an explicit protection value. Signed-off-by: Keith Packard <keithp@keithp.com> --- arch/x86/mm/highmem_32.c | 19 +++++++++++++++++++ include/asm-x86/highmem.h | 1 + include/linux/highmem.h | 1 + 3 files changed, 21 insertions(+), 0 deletions(-) diff --git a/arch/x86/mm/highmem_32.c b/arch/x86/mm/highmem_32.c index bcc079c..91ae5e8 100644 --- a/arch/x86/mm/highmem_32.c +++ b/arch/x86/mm/highmem_32.c @@ -139,6 +139,25 @@ void *kmap_atomic_pfn(unsigned long pfn, enum km_type type) } EXPORT_SYMBOL_GPL(kmap_atomic_pfn); /* temporarily in use by i915 GEM until vmap */ +/* This is the same as kmap_atomic_prot() but can map memory that doesn't + * have a struct page associated with it. + */ +void *kmap_atomic_prot_pfn(unsigned long pfn, enum km_type type, pgprot_t prot) +{ + enum fixed_addresses idx; + unsigned long vaddr; + + pagefault_disable(); + + idx = type + KM_TYPE_NR*smp_processor_id(); + vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx); + set_pte(kmap_pte-idx, pfn_pte(pfn, prot)); + arch_flush_lazy_mmu_mode(); + + return (void*) vaddr; +} +EXPORT_SYMBOL_GPL(kmap_atomic_prot_pfn); /* temporarily in use by i915 GEM until vmap */ + struct page *kmap_atomic_to_page(void *ptr) { unsigned long idx, vaddr = (unsigned long)ptr; diff --git a/include/asm-x86/highmem.h b/include/asm-x86/highmem.h index bc3f6a2..a1f8f8c 100644 --- a/include/asm-x86/highmem.h +++ b/include/asm-x86/highmem.h @@ -66,6 +66,7 @@ void *kmap_atomic_prot(struct page *page, enum km_type type, pgprot_t prot); void *kmap_atomic(struct page *page, enum km_type type); void kunmap_atomic(void *kvaddr, enum km_type type); void *kmap_atomic_pfn(unsigned long pfn, enum km_type type); +void *kmap_atomic_prot_pfn(unsigned long pfn, enum km_type type, pgprot_t prot); struct page *kmap_atomic_to_page(void *ptr); #ifndef CONFIG_PARAVIRT diff --git a/include/linux/highmem.h b/include/linux/highmem.h index 7dcbc82..7fbee2c 100644 --- a/include/linux/highmem.h +++ b/include/linux/highmem.h @@ -55,6 +55,7 @@ static inline void *kmap_atomic(struct page *page, enum km_type idx) #define kunmap_atomic(addr, idx) do { pagefault_enable(); } while (0) #define kmap_atomic_pfn(pfn, idx) kmap_atomic(pfn_to_page(pfn), (idx)) +#define kmap_atomic_prot_pfn(pfn, idx, prot) kmap_atomic_prot(pfn_to_page(pfn), (idx), (prot)) #define kmap_atomic_to_page(ptr) virt_to_page(ptr) #define kmap_flush_unused() do {} while(0) -- 1.5.6.5 -- keith.packard@intel.com [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Adding kmap_atomic_prot_pfn (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-23 20:22 ` Adding kmap_atomic_prot_pfn (was: [git pull] drm patches for 2.6.27-rc1) Keith Packard @ 2008-10-23 20:38 ` Andrew Morton 2008-10-23 21:03 ` Keith Packard 2008-10-23 21:24 ` Linus Torvalds 0 siblings, 2 replies; 80+ messages in thread From: Andrew Morton @ 2008-10-23 20:38 UTC (permalink / raw) To: Keith Packard Cc: mingo, keithp, nickpiggin, airlied, torvalds, linux-kernel, jbarnes, dri-devel, yinghai On Thu, 23 Oct 2008 13:22:12 -0700 Keith Packard <keithp@keithp.com> wrote: > We just ran some numbers on a box with PAT enabled and broken MTRRs. > Finally we have a test platform for the difference between kmap_atomic > and kmap_atomic_prot. Using regular kmap_atomic on this platform, we get > UC access to the graphics device; sending data from the CPU is a bit > slow. Adding kmap_atomic_prot_pfn and specifying WC access raises that > by a fairly significant factor, taking our CPU utilization for > copy_from_user from 40% to 2%. > > Here's a patch which adds kmap_atomic_prot_pfn to the kernel for this > usage. When we add native io-mapping support instead of sitting on top > of kmap, we can remove this function. > > I've reworked the io_mapping patches to use this function as well, along > with cleaning up the i915 code along the lines of Linus' current > version. I'll post those if this patch looks acceptable. > > From e3f633dcb36889fa85ea06cca339072df3c44ae0 Mon Sep 17 00:00:00 2001 > From: Keith Packard <keithp@keithp.com> > Date: Thu, 23 Oct 2008 11:53:45 -0700 > Subject: [PATCH] Add kmap_atomic_prot_pfn > > kmap_atomic_prot_pfn is a mixture of kmap_atomic_prot and kmap_atomic_pfn, > accepting both a pfn instead of a page and an explicit protection value. > > Signed-off-by: Keith Packard <keithp@keithp.com> > --- > arch/x86/mm/highmem_32.c | 19 +++++++++++++++++++ > include/asm-x86/highmem.h | 1 + > include/linux/highmem.h | 1 + > 3 files changed, 21 insertions(+), 0 deletions(-) > > diff --git a/arch/x86/mm/highmem_32.c b/arch/x86/mm/highmem_32.c > index bcc079c..91ae5e8 100644 > --- a/arch/x86/mm/highmem_32.c > +++ b/arch/x86/mm/highmem_32.c > @@ -139,6 +139,25 @@ void *kmap_atomic_pfn(unsigned long pfn, enum km_type type) > } > EXPORT_SYMBOL_GPL(kmap_atomic_pfn); /* temporarily in use by i915 GEM until vmap */ > > +/* This is the same as kmap_atomic_prot() but can map memory that doesn't > + * have a struct page associated with it. > + */ > +void *kmap_atomic_prot_pfn(unsigned long pfn, enum km_type type, pgprot_t prot) > +{ > + enum fixed_addresses idx; > + unsigned long vaddr; > + > + pagefault_disable(); > + > + idx = type + KM_TYPE_NR*smp_processor_id(); > + vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx); > + set_pte(kmap_pte-idx, pfn_pte(pfn, prot)); > + arch_flush_lazy_mmu_mode(); > + > + return (void*) vaddr; > +} > +EXPORT_SYMBOL_GPL(kmap_atomic_prot_pfn); /* temporarily in use by i915 GEM until vmap */ I guess one could reimplemenet kmap_atomic_pfn() to call this. Sometime. > struct page *kmap_atomic_to_page(void *ptr) > { > unsigned long idx, vaddr = (unsigned long)ptr; > diff --git a/include/asm-x86/highmem.h b/include/asm-x86/highmem.h > index bc3f6a2..a1f8f8c 100644 > --- a/include/asm-x86/highmem.h > +++ b/include/asm-x86/highmem.h > @@ -66,6 +66,7 @@ void *kmap_atomic_prot(struct page *page, enum km_type type, pgprot_t prot); > void *kmap_atomic(struct page *page, enum km_type type); > void kunmap_atomic(void *kvaddr, enum km_type type); > void *kmap_atomic_pfn(unsigned long pfn, enum km_type type); > +void *kmap_atomic_prot_pfn(unsigned long pfn, enum km_type type, pgprot_t prot); Given that all highmem-implementing archtiectures must use the same declaration here, we might as well put it into include/linux/highmem.h. Although that goes against current mistakes^Wcode. Does powerpc32 still implement highmem? It seems that way. You broke it, no? ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Adding kmap_atomic_prot_pfn (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-23 20:38 ` Andrew Morton @ 2008-10-23 21:03 ` Keith Packard 2008-10-23 21:24 ` Linus Torvalds 1 sibling, 0 replies; 80+ messages in thread From: Keith Packard @ 2008-10-23 21:03 UTC (permalink / raw) To: Andrew Morton Cc: keithp, mingo, nickpiggin, airlied, torvalds, linux-kernel, jbarnes, dri-devel, yinghai [-- Attachment #1: Type: text/plain, Size: 765 bytes --] On Thu, 2008-10-23 at 13:38 -0700, Andrew Morton wrote: > I guess one could reimplemenet kmap_atomic_pfn() to call this. Sometime. The goal is to stop needing this function fairly soon and replace it with a 'real' io-mapping implementation for 32-bit processors. > Given that all highmem-implementing archtiectures must use the same > declaration here, we might as well put it into include/linux/highmem.h. > Although that goes against current mistakes^Wcode. I'd hate to break with a long tradition. > Does powerpc32 still implement highmem? It seems that way. You broke > it, no? Powerpc32 doesn't have kmap_atomic_pfn either. Seems like the set of HIGHMEM functions is not uniform across architectures. -- keith.packard@intel.com [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Adding kmap_atomic_prot_pfn (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-23 20:38 ` Andrew Morton 2008-10-23 21:03 ` Keith Packard @ 2008-10-23 21:24 ` Linus Torvalds 2008-10-24 1:50 ` Keith Packard 2008-10-24 3:21 ` Benjamin Herrenschmidt 1 sibling, 2 replies; 80+ messages in thread From: Linus Torvalds @ 2008-10-23 21:24 UTC (permalink / raw) To: Andrew Morton Cc: Keith Packard, mingo, nickpiggin, airlied, linux-kernel, jbarnes, dri-devel, yinghai On Thu, 23 Oct 2008, Andrew Morton wrote: > > Given that all highmem-implementing archtiectures must use the same > declaration here, we might as well put it into include/linux/highmem.h. > Although that goes against current mistakes^Wcode. > > Does powerpc32 still implement highmem? It seems that way. You broke > it, no? This really shouldn't be in highmem.h AT ALL. The whole point of that function has absolutely nothing to do with highmem, and it *must* be useful on non-highmem configurations to be appropriate. So I'd much rather create a new <linux/kmap.h> or something. Or just expose this from to <asm/fixmap.h> or something. Let's not confuse this with highmem, even if the implementation _historically_ was due to that. Linus ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Adding kmap_atomic_prot_pfn (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-23 21:24 ` Linus Torvalds @ 2008-10-24 1:50 ` Keith Packard 2008-10-24 2:48 ` Linus Torvalds 2008-10-24 3:21 ` Benjamin Herrenschmidt 1 sibling, 1 reply; 80+ messages in thread From: Keith Packard @ 2008-10-24 1:50 UTC (permalink / raw) To: Linus Torvalds Cc: keithp, Andrew Morton, mingo, nickpiggin, airlied, linux-kernel, jbarnes, dri-devel, yinghai [-- Attachment #1: Type: text/plain, Size: 1050 bytes --] On Thu, 2008-10-23 at 14:24 -0700, Linus Torvalds wrote: > The whole point of that function has absolutely nothing to do with > highmem, and it *must* be useful on non-highmem configurations to be > appropriate. Right, I'd just like my io_mapping_map_atomic_wc to be able to rapidly map random pages from my PCI BAR in WC mode. The code in your tree uses kmap_atomic_pfn, which does the mapping, but use the wrong prot bits. On a system with MTRR failure, this all ends badly, with factors of 20 performance drops for copying data from the CPU to the aperture. > So I'd much rather create a new <linux/kmap.h> or something. Or just > expose this from to <asm/fixmap.h> or something. Let's not confuse this > with highmem, even if the implementation _historically_ was due to that. Sure, we readily admit that we're abusing the highmem API. So we wrapped that abusive code in a pretty package that can be re-implemented however you desire. How (and when) would you like to see this fixed? -- keith.packard@intel.com [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Adding kmap_atomic_prot_pfn (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-24 1:50 ` Keith Packard @ 2008-10-24 2:48 ` Linus Torvalds 2008-10-24 3:24 ` Benjamin Herrenschmidt ` (3 more replies) 0 siblings, 4 replies; 80+ messages in thread From: Linus Torvalds @ 2008-10-24 2:48 UTC (permalink / raw) To: Keith Packard Cc: Andrew Morton, Ingo Molnar, nickpiggin, airlied, Linux Kernel Mailing List, jbarnes, dri-devel, yinghai, Peter Anvin On Thu, 23 Oct 2008, Keith Packard wrote: > > > So I'd much rather create a new <linux/kmap.h> or something. Or just > > expose this from to <asm/fixmap.h> or something. Let's not confuse this > > with highmem, even if the implementation _historically_ was due to that. > > Sure, we readily admit that we're abusing the highmem API. So we wrapped > that abusive code in a pretty package that can be re-implemented however > you desire. > > How (and when) would you like to see this fixed? I'm not entirely sure who wants to own up to owning that particular part of code, and is willing to make kmap_atomic_prot_pfn() also work in the absense of CONFIG_HIGHMEM. I suspect it's Ingo, but I also suspect that he'd like to see a tested patch by somebody who actually _uses_ this code. So I would suspect that if you guys actually write a patch, and make sure that it works on x86-32 even _without_ CONFIG_HIGHMEM, and send it to Ingo, good things will happen. As to the "without CONFIG_HIGHMEM" part: making all of this work even without HIGHMEM should literally be a matter of: - remove the '#ifdef CONFIG_HIGHMEM' in <asm/fixmap_32.h>, so that we have fixmap entries for the temporary kernel mappings regardless (ie FIX_KMAP_BEGIN and FIX_KMAP_END). - move the code for the atomic accesses that use those fixmap entries into a file that gets compiled regardless of CONFIG_HIGHMEM. Or at least just kmap_atomic_prot_pfn() and kunmap_atomic(). and it probably should all work automatically. The kmap_atomic() stuff really should be almost totally independent of CONFIG_HIGHMEM, since it's really much more closely related to fixmaps than HIGHMEM. I guess there may be some debugging code that depends on HIGHMEM (eg that whole testing for whether a page is a highmem page or not), so it might be a _bit_ more than just moving code around, but I really didn't look closer. Then, there's the issue of 64-bit, and just mapping everything there, and the interface to that. I liked the trivial extension to "struct resource" to have a "cached mapping" pointer. So if we can just make it pass resources around and get a page that way (and not even need kmap() on 64-bit architections), that would be good. It's too late for -rc1 (which I'm planning on cutting within the hour), and if it ends up being complex, I guess that it means this will eb a 2.6.29 issue. But if it really is just a matter of mostly just some trivial code movement and both the gfx and x86 people are all happy, I'll happily take it for -rc2, and we can just leave this all behind us. Linus ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Adding kmap_atomic_prot_pfn (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-24 2:48 ` Linus Torvalds @ 2008-10-24 3:24 ` Benjamin Herrenschmidt 2008-10-24 5:37 ` Keith Packard 2008-10-24 4:29 ` Keith Packard ` (2 subsequent siblings) 3 siblings, 1 reply; 80+ messages in thread From: Benjamin Herrenschmidt @ 2008-10-24 3:24 UTC (permalink / raw) To: Linus Torvalds Cc: Keith Packard, Andrew Morton, Ingo Molnar, nickpiggin, airlied, Linux Kernel Mailing List, jbarnes, dri-devel, yinghai, Peter Anvin On Thu, 2008-10-23 at 19:48 -0700, Linus Torvalds wrote: > Then, there's the issue of 64-bit, and just mapping everything there, and > the interface to that. I liked the trivial extension to "struct resource" > to have a "cached mapping" pointer. So if we can just make it pass > resources around and get a page that way (and not even need kmap() on > 64-bit architections), that would be good. I'm not that fan of carrying a mapping with a struct resource because if we do that we should probably also refcount the mapping, and then there is the whole question of mappings with different attributes, etc etc... Cheers, Ben. ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Adding kmap_atomic_prot_pfn (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-24 3:24 ` Benjamin Herrenschmidt @ 2008-10-24 5:37 ` Keith Packard 2008-10-24 14:53 ` Linus Torvalds 0 siblings, 1 reply; 80+ messages in thread From: Keith Packard @ 2008-10-24 5:37 UTC (permalink / raw) To: benh Cc: keithp, Linus Torvalds, Andrew Morton, Ingo Molnar, nickpiggin, airlied, Linux Kernel Mailing List, jbarnes, dri-devel, yinghai, Peter Anvin [-- Attachment #1: Type: text/plain, Size: 762 bytes --] On Fri, 2008-10-24 at 14:24 +1100, Benjamin Herrenschmidt wrote: > I'm not that fan of carrying a mapping with a struct resource because if > we do that we should probably also refcount the mapping, and then there > is the whole question of mappings with different attributes, etc etc... I'm fine with sticking the mapping in a separate structure; it's just the return from ioremap_wc on 64-bit systems, and nothing at all on 32-bit systems. I don't really see the advantage of moving it into the resource, especially as we may need different access modes (UC vs WC) for different portions of a single BAR. We may want to add some bounds and access mode information to this structure to check for broken drivers. -- keith.packard@intel.com [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Adding kmap_atomic_prot_pfn (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-24 5:37 ` Keith Packard @ 2008-10-24 14:53 ` Linus Torvalds 2008-10-24 15:45 ` Keith Packard 0 siblings, 1 reply; 80+ messages in thread From: Linus Torvalds @ 2008-10-24 14:53 UTC (permalink / raw) To: Keith Packard Cc: benh, Andrew Morton, Ingo Molnar, nickpiggin, airlied, Linux Kernel Mailing List, jbarnes, dri-devel, yinghai, Peter Anvin On Thu, 23 Oct 2008, Keith Packard wrote: > > I'm fine with sticking the mapping in a separate structure; it's just > the return from ioremap_wc on 64-bit systems, and nothing at all on > 32-bit systems. Actually, on 32-bit, the 'prot' should be there, as should the starting physical page. Otherwise the two interfaces would be very odd, and you'd have to repeat those arguments in all callers (ie both in "prepare" and in the actual "access"). Linus ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Adding kmap_atomic_prot_pfn (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-24 14:53 ` Linus Torvalds @ 2008-10-24 15:45 ` Keith Packard 0 siblings, 0 replies; 80+ messages in thread From: Keith Packard @ 2008-10-24 15:45 UTC (permalink / raw) To: Linus Torvalds Cc: keithp, nickpiggin, airlied, benh, Peter Anvin, Linux Kernel Mailing List, jbarnes, dri-devel, Andrew Morton, yinghai, Ingo Molnar [-- Attachment #1: Type: text/plain, Size: 585 bytes --] On Fri, 2008-10-24 at 07:53 -0700, Linus Torvalds wrote: > Actually, on 32-bit, the 'prot' should be there, as should the starting > physical page. Otherwise the two interfaces would be very odd, and you'd > have to repeat those arguments in all callers (ie both in "prepare" and > in the actual "access"). I suppose. What I did instead was create _wc versions of both the prepare and access functions to eliminate the need for additional data. Either way is fine with me; I took the route which didn't require an additional allocation. -- keith.packard@intel.com [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Adding kmap_atomic_prot_pfn (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-24 2:48 ` Linus Torvalds 2008-10-24 3:24 ` Benjamin Herrenschmidt @ 2008-10-24 4:29 ` Keith Packard 2008-10-24 6:22 ` Keith Packard 2008-10-24 9:14 ` Adding kmap_atomic_prot_pfn (was: [git pull] drm patches for 2.6.27-rc1) Ingo Molnar 3 siblings, 0 replies; 80+ messages in thread From: Keith Packard @ 2008-10-24 4:29 UTC (permalink / raw) To: Linus Torvalds Cc: keithp, Andrew Morton, Ingo Molnar, nickpiggin, airlied, Linux Kernel Mailing List, jbarnes, dri-devel, yinghai, Peter Anvin [-- Attachment #1: Type: text/plain, Size: 3748 bytes --] On Thu, 2008-10-23 at 19:48 -0700, Linus Torvalds wrote: > I'm not entirely sure who wants to own up to owning that particular part > of code, and is willing to make kmap_atomic_prot_pfn() also work in the > absense of CONFIG_HIGHMEM. All of the kmap_atomic functions *do* work without CONFIG_HIGHMEM, they just don't do what we want in this case. Without knowing the history, it seems fairly clear that the kmap functions are designed to map physical memory pages into the kernel virtual address space. On small 32-bit systems, that's trivial, you just use the direct map (as one does on 64-bit systems). The magic fixmap entries make this work with CONFIG_HIGHMEM as well. So, I fear touching the kmap API as it appears to have a specific and useful purpose, irrespective of the memory size the kernel is configured for. What I've proposed is that we create a new io-space specific set of fixmap APIs. On CONFIG_HIGHMEM, they'd just use the existing kmap_atomic mechanism, but on small 32-bit systems, we'd enable the fixmaps and have some for that environment as well. > So I would suspect that if you guys actually write a patch, and make sure > that it works on x86-32 even _without_ CONFIG_HIGHMEM, and send it to > Ingo, good things will happen. Ok, we can give this a try. > and it probably should all work automatically. The kmap_atomic() stuff > really should be almost totally independent of CONFIG_HIGHMEM, since it's > really much more closely related to fixmaps than HIGHMEM. As above, I think kmap_atomic should be left alone as a way of quickly mapping memory pages. There are a users of both kmap_atomic_prot (xen) and kmap_atomic_pfn (crash dumps). > I guess there may be some debugging code that depends on HIGHMEM (eg that > whole testing for whether a page is a highmem page or not), so it might be > a _bit_ more than just moving code around, but I really didn't look > closer. > > Then, there's the issue of 64-bit, and just mapping everything there, and > the interface to that. I liked the trivial extension to "struct resource" > to have a "cached mapping" pointer. So if we can just make it pass > resources around and get a page that way (and not even need kmap() on > 64-bit architections), that would be good. The io_mapping API I proposed does precisely this. On 32-bit systems, it uses kmap_atomic for each page access while on 64-bit systems it uses ioremap_wc at device init time and then just uses an offset for each page access. Hiding this detail behind an API leaves the driver code independent of this particular choice. > It's too late for -rc1 (which I'm planning on cutting within the hour), > and if it ends up being complex, I guess that it means this will eb a > 2.6.29 issue. If we do end up pushing this out to 2.6.29, I'd like to see kmap_atomic_prot_pfn in place as a stop-gap so that PAT performance on 32-bit systems is reasonable. I don't think too many people are running desktop systems without CONFIG_HIGHMEM at this point, and if so, we can just suggest that perhaps they change their configuration. > But if it really is just a matter of mostly just some trivial code > movement and both the gfx and x86 people are all happy, I'll happily take > it for -rc2, and we can just leave this all behind us. I'll try to get something working in the next day or so and see how it looks. My plan at this point is to create new API for 32-bit systems: void *io_map_atomic_wc(unsigned long pfn) void io_unmap_atomic(void *addr); With this, I can switch my existing io_mapping API over to an io-specific interface and implement those using the fixmap code. -- keith.packard@intel.com [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Adding kmap_atomic_prot_pfn (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-24 2:48 ` Linus Torvalds 2008-10-24 3:24 ` Benjamin Herrenschmidt 2008-10-24 4:29 ` Keith Packard @ 2008-10-24 6:22 ` Keith Packard 2008-10-24 7:33 ` Adding kmap_atomic_prot_pfn Thomas Hellström 2008-10-24 9:14 ` Adding kmap_atomic_prot_pfn (was: [git pull] drm patches for 2.6.27-rc1) Ingo Molnar 3 siblings, 1 reply; 80+ messages in thread From: Keith Packard @ 2008-10-24 6:22 UTC (permalink / raw) To: Linus Torvalds Cc: keithp, nickpiggin, airlied, Peter Anvin, Linux Kernel Mailing List, jbarnes, dri-devel, Andrew Morton, yinghai, Ingo Molnar [-- Attachment #1: Type: text/plain, Size: 9435 bytes --] On Thu, 2008-10-23 at 19:48 -0700, Linus Torvalds wrote: > So I would suspect that if you guys actually write a patch, and make sure > that it works on x86-32 even _without_ CONFIG_HIGHMEM, and send it to > Ingo, good things will happen. Something like the following (yes, I know, this is missing the include/asm-x86 rename)? From e7921809c72f940295311cfa6c300d5234ac96c1 Mon Sep 17 00:00:00 2001 From: Keith Packard <keithp@keithp.com> Date: Thu, 23 Oct 2008 23:17:40 -0700 Subject: [PATCH] [x86_32] Add io_map_atomic using fixmaps This steals the code used for CONFIG_HIGHMEM memory mappings except that it's designed for dynamic io resource mapping. These fixmaps are available even with CONFIG_HIGHMEM turned off. Signed-off-by: Keith Packard <keithp@keithp.com> --- arch/x86/mm/Makefile | 2 +- arch/x86/mm/init_32.c | 3 +- arch/x86/mm/iomap_32.c | 59 +++++++++++++++++++++++++++++++++++++++++++ include/asm-x86/fixmap.h | 4 +++ include/asm-x86/fixmap_32.h | 4 --- include/asm-x86/highmem.h | 8 +++--- include/asm-x86/iomap.h | 30 ++++++++++++++++++++++ include/linux/io-mapping.h | 15 +++------- 8 files changed, 104 insertions(+), 21 deletions(-) create mode 100644 arch/x86/mm/iomap_32.c create mode 100644 include/asm-x86/iomap.h diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index 59f89b4..fea4565 100644 --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -1,7 +1,7 @@ obj-y := init_$(BITS).o fault.o ioremap.o extable.o pageattr.o mmap.o \ pat.o pgtable.o gup.o -obj-$(CONFIG_X86_32) += pgtable_32.o +obj-$(CONFIG_X86_32) += pgtable_32.o iomap_32.o obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o obj-$(CONFIG_X86_PTDUMP) += dump_pagetables.o diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c index 8396868..c483f42 100644 --- a/arch/x86/mm/init_32.c +++ b/arch/x86/mm/init_32.c @@ -334,7 +334,6 @@ int devmem_is_allowed(unsigned long pagenr) return 0; } -#ifdef CONFIG_HIGHMEM pte_t *kmap_pte; pgprot_t kmap_prot; @@ -357,6 +356,7 @@ static void __init kmap_init(void) kmap_prot = PAGE_KERNEL; } +#ifdef CONFIG_HIGHMEM static void __init permanent_kmaps_init(pgd_t *pgd_base) { unsigned long vaddr; @@ -436,7 +436,6 @@ static void __init set_highmem_pages_init(void) #endif /* !CONFIG_NUMA */ #else -# define kmap_init() do { } while (0) # define permanent_kmaps_init(pgd_base) do { } while (0) # define set_highmem_pages_init() do { } while (0) #endif /* CONFIG_HIGHMEM */ diff --git a/arch/x86/mm/iomap_32.c b/arch/x86/mm/iomap_32.c new file mode 100644 index 0000000..c559599 --- /dev/null +++ b/arch/x86/mm/iomap_32.c @@ -0,0 +1,59 @@ +/* + * Copyright © 2008 Keith Packard <keithp@keithp.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, write to the Free Software Foundation, Inc., + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. + */ + +#include <asm/iomap.h> +#include <linux/module.h> + +/* Map 'pfn' using fixed map 'type' and protections 'prot' + */ +void * +iomap_atomic_prot_pfn(unsigned long pfn, enum km_type type, pgprot_t prot) +{ + enum fixed_addresses idx; + unsigned long vaddr; + + pagefault_disable(); + + idx = type + KM_TYPE_NR*smp_processor_id(); + vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx); + set_pte(kmap_pte-idx, pfn_pte(pfn, prot)); + arch_flush_lazy_mmu_mode(); + + return (void*) vaddr; +} +EXPORT_SYMBOL_GPL(iomap_atomic_prot_pfn); + +void +iounmap_atomic(void *kvaddr, enum km_type type) +{ + unsigned long vaddr = (unsigned long) kvaddr & PAGE_MASK; + enum fixed_addresses idx = type + KM_TYPE_NR*smp_processor_id(); + + /* + * Force other mappings to Oops if they'll try to access this pte + * without first remap it. Keeping stale mappings around is a bad idea + * also, in case the page changes cacheability attributes or becomes + * a protected page in a hypervisor. + */ + if (vaddr == __fix_to_virt(FIX_KMAP_BEGIN+idx)) + kpte_clear_flush(kmap_pte-idx, vaddr); + + arch_flush_lazy_mmu_mode(); + pagefault_enable(); +} +EXPORT_SYMBOL_GPL(iounmap_atomic); diff --git a/include/asm-x86/fixmap.h b/include/asm-x86/fixmap.h index 78e33a1..a8b4379 100644 --- a/include/asm-x86/fixmap.h +++ b/include/asm-x86/fixmap.h @@ -9,6 +9,10 @@ extern int fixmaps_set; +extern pte_t *kmap_pte; +extern pgprot_t kmap_prot; +extern pte_t *pkmap_page_table; + void __native_set_fixmap(enum fixed_addresses idx, pte_t pte); void native_set_fixmap(enum fixed_addresses idx, unsigned long phys, pgprot_t flags); diff --git a/include/asm-x86/fixmap_32.h b/include/asm-x86/fixmap_32.h index 8844002..97c2976 100644 --- a/include/asm-x86/fixmap_32.h +++ b/include/asm-x86/fixmap_32.h @@ -28,10 +28,8 @@ extern unsigned long __FIXADDR_TOP; #include <asm/acpi.h> #include <asm/apicdef.h> #include <asm/page.h> -#ifdef CONFIG_HIGHMEM #include <linux/threads.h> #include <asm/kmap_types.h> -#endif /* * Here we define all the compile-time 'special' virtual @@ -75,10 +73,8 @@ enum fixed_addresses { #ifdef CONFIG_X86_CYCLONE_TIMER FIX_CYCLONE_TIMER, /*cyclone timer register*/ #endif -#ifdef CONFIG_HIGHMEM FIX_KMAP_BEGIN, /* reserved pte's for temporary kernel mappings */ FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1, -#endif #ifdef CONFIG_PCI_MMCONFIG FIX_PCIE_MCFG, #endif diff --git a/include/asm-x86/highmem.h b/include/asm-x86/highmem.h index a1f8f8c..d728928 100644 --- a/include/asm-x86/highmem.h +++ b/include/asm-x86/highmem.h @@ -4,6 +4,9 @@ * Used in CONFIG_HIGHMEM systems for memory pages which * are not addressable by direct kernel virtual addresses. * + * Used in other 32-bit systems for io pages which are not + * in the direct kernel virtual map + * * Copyright (C) 1999 Gerhard Wichert, Siemens AG * Gerhard.Wichert@pdb.siemens.de * @@ -25,14 +28,11 @@ #include <asm/kmap_types.h> #include <asm/tlbflush.h> #include <asm/paravirt.h> +#include <asm/fixmap.h> /* declarations for highmem.c */ extern unsigned long highstart_pfn, highend_pfn; -extern pte_t *kmap_pte; -extern pgprot_t kmap_prot; -extern pte_t *pkmap_page_table; - /* * Right now we initialize only a single pte table. It can be extended * easily, subsequent pte tables have to be allocated in one physical diff --git a/include/asm-x86/iomap.h b/include/asm-x86/iomap.h new file mode 100644 index 0000000..33b8180 --- /dev/null +++ b/include/asm-x86/iomap.h @@ -0,0 +1,30 @@ +/* + * Copyright © 2008 Keith Packard <keithp@keithp.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, write to the Free Software Foundation, Inc., + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. + */ + +#include <linux/fs.h> +#include <linux/mm.h> +#include <linux/uaccess.h> +#include <asm/cacheflush.h> +#include <asm/pgtable.h> +#include <asm/tlbflush.h> + +void * +iomap_atomic_prot_pfn(unsigned long pfn, enum km_type type, pgprot_t prot); + +void +iounmap_atomic(void *kvaddr, enum km_type type); diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h index bd1dc4f..a7e0d98 100644 --- a/include/linux/io-mapping.h +++ b/include/linux/io-mapping.h @@ -75,6 +75,9 @@ io_mapping_unmap(void *vaddr) #endif /* CONFIG_X86_64 */ #ifdef CONFIG_X86_32 + +#include <asm/iomap.h> + static inline struct io_mapping * io_mapping_create_wc(unsigned long base, unsigned long size) { @@ -91,22 +94,14 @@ static inline void * io_mapping_map_atomic_wc(struct io_mapping *mapping, unsigned long offset) { offset += (unsigned long) mapping; -#ifdef CONFIG_HIGHMEM - return kmap_atomic_prot_pfn(offset >> PAGE_SHIFT, KM_USER0, + return iomap_atomic_prot_pfn(offset >> PAGE_SHIFT, KM_USER0, __pgprot(__PAGE_KERNEL_WC)); -#else - return ioremap_wc(offset, PAGE_SIZE); -#endif } static inline void io_mapping_unmap_atomic(void *vaddr) { -#ifdef CONFIG_HIGHMEM - kunmap_atomic(vaddr, KM_USER0); -#else - iounmap(vaddr); -#endif + iounmap_atomic(vaddr, KM_USER0); } static inline void * -- 1.5.6.5 -- keith.packard@intel.com [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Adding kmap_atomic_prot_pfn 2008-10-24 6:22 ` Keith Packard @ 2008-10-24 7:33 ` Thomas Hellström 2008-10-24 8:38 ` Ingo Molnar 2008-10-24 15:48 ` Keith Packard 0 siblings, 2 replies; 80+ messages in thread From: Thomas Hellström @ 2008-10-24 7:33 UTC (permalink / raw) To: Keith Packard Cc: Linus Torvalds, nickpiggin, airlied, dri-devel, Linux Kernel Mailing List, jbarnes, Peter Anvin, Andrew Morton, yinghai, Ingo Molnar Keith, What you actually are doing here is claiming copyright on code that other people have written, and tighten the export restrictions. kmap_atomic_prot_pfn() appeared long ago in drm git with identical code and purpose, but with different authors, and iounmap_atomic is identical to kunmap_atomic. Pls fix. /Thomas Keith Packard wrote: > > diff --git a/arch/x86/mm/iomap_32.c b/arch/x86/mm/iomap_32.c > new file mode 100644 > index 0000000..c559599 > --- /dev/null > +++ b/arch/x86/mm/iomap_32.c > @@ -0,0 +1,59 @@ > +/* > + * Copyright © 2008 Keith Packard <keithp@keithp.com> > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License as published by > + * the Free Software Foundation; either version 2 of the License, or > + * (at your option) any later version. > + * > + * This program is distributed in the hope that it will be useful, but > + * WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + * General Public License for more details. > + * > + * You should have received a copy of the GNU General Public License along > + * with this program; if not, write to the Free Software Foundation, Inc., > + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. > + */ > + > +#include <asm/iomap.h> > +#include <linux/module.h> > + > +/* Map 'pfn' using fixed map 'type' and protections 'prot' > + */ > +void * > +iomap_atomic_prot_pfn(unsigned long pfn, enum km_type type, pgprot_t prot) > +{ > + enum fixed_addresses idx; > + unsigned long vaddr; > + > + pagefault_disable(); > + > + idx = type + KM_TYPE_NR*smp_processor_id(); > + vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx); > + set_pte(kmap_pte-idx, pfn_pte(pfn, prot)); > + arch_flush_lazy_mmu_mode(); > + > + return (void*) vaddr; > +} > +EXPORT_SYMBOL_GPL(iomap_atomic_prot_pfn); > + > +void > +iounmap_atomic(void *kvaddr, enum km_type type) > +{ > + unsigned long vaddr = (unsigned long) kvaddr & PAGE_MASK; > + enum fixed_addresses idx = type + KM_TYPE_NR*smp_processor_id(); > + > + /* > + * Force other mappings to Oops if they'll try to access this pte > + * without first remap it. Keeping stale mappings around is a bad idea > + * also, in case the page changes cacheability attributes or becomes > + * a protected page in a hypervisor. > + */ > + if (vaddr == __fix_to_virt(FIX_KMAP_BEGIN+idx)) > + kpte_clear_flush(kmap_pte-idx, vaddr); > + > + arch_flush_lazy_mmu_mode(); > + pagefault_enable(); > +} > +EXPORT_SYMBOL_GPL(iounmap_atomic); > ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Adding kmap_atomic_prot_pfn 2008-10-24 7:33 ` Adding kmap_atomic_prot_pfn Thomas Hellström @ 2008-10-24 8:38 ` Ingo Molnar 2008-10-24 9:19 ` Thomas Hellström 2008-10-24 15:48 ` Keith Packard 1 sibling, 1 reply; 80+ messages in thread From: Ingo Molnar @ 2008-10-24 8:38 UTC (permalink / raw) To: Thomas Hellström Cc: Keith Packard, Linus Torvalds, nickpiggin, airlied, dri-devel, Linux Kernel Mailing List, jbarnes, Peter Anvin, Andrew Morton, yinghai * Thomas Hellström <thomas@tungstengraphics.com> wrote: > Keith, > > What you actually are doing here is claiming copyright on code that > other people have written, and tighten the export restrictions. > kmap_atomic_prot_pfn() appeared long ago in drm git with identical > code and purpose, but with different authors, and iounmap_atomic is > identical to kunmap_atomic. >> +EXPORT_SYMBOL_GPL(iomap_atomic_prot_pfn); you want to use this facility in a binary-only driver? Ingo ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Adding kmap_atomic_prot_pfn 2008-10-24 8:38 ` Ingo Molnar @ 2008-10-24 9:19 ` Thomas Hellström 2008-10-24 9:32 ` Ingo Molnar 0 siblings, 1 reply; 80+ messages in thread From: Thomas Hellström @ 2008-10-24 9:19 UTC (permalink / raw) To: Ingo Molnar Cc: Keith Packard, Linus Torvalds, nickpiggin, airlied, dri-devel, Linux Kernel Mailing List, jbarnes, Peter Anvin, Andrew Morton, yinghai Ingo Molnar wrote: > * Thomas Hellström <thomas@tungstengraphics.com> wrote: > > >> Keith, >> >> What you actually are doing here is claiming copyright on code that >> other people have written, and tighten the export restrictions. >> kmap_atomic_prot_pfn() appeared long ago in drm git with identical >> code and purpose, but with different authors, and iounmap_atomic is >> identical to kunmap_atomic. >> > > >>> +EXPORT_SYMBOL_GPL(iomap_atomic_prot_pfn); >>> > > you want to use this facility in a binary-only driver? > > Ingo > At this point I have no such use for it, no. The original user was a MIT style licenced driver. /Thomas ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Adding kmap_atomic_prot_pfn 2008-10-24 9:19 ` Thomas Hellström @ 2008-10-24 9:32 ` Ingo Molnar 2008-10-24 11:04 ` Thomas Hellström 0 siblings, 1 reply; 80+ messages in thread From: Ingo Molnar @ 2008-10-24 9:32 UTC (permalink / raw) To: Thomas Hellström Cc: Keith Packard, Linus Torvalds, nickpiggin, airlied, dri-devel, Linux Kernel Mailing List, jbarnes, Peter Anvin, Andrew Morton, yinghai * Thomas Hellström <thomas@tungstengraphics.com> wrote: > Ingo Molnar wrote: >> * Thomas Hellström <thomas@tungstengraphics.com> wrote: >> >> >>> Keith, >>> >>> What you actually are doing here is claiming copyright on code that >>> other people have written, and tighten the export restrictions. >>> kmap_atomic_prot_pfn() appeared long ago in drm git with identical >>> code and purpose, but with different authors, and iounmap_atomic is >>> identical to kunmap_atomic. >>> >> >> >>>> +EXPORT_SYMBOL_GPL(iomap_atomic_prot_pfn); >>>> >> >> you want to use this facility in a binary-only driver? >> >> Ingo >> > At this point I have no such use for it, no. > The original user was a MIT style licenced driver. okay, then just to put this question to rest: i wrote the original 32-bit highmem code ~10 years ago. I wrote the first version of fixmap support - in fact i coined the term. I wrote the first version of the atomic-kmap facility as well. All of that code is licensed under the GPLv2. So if anyone wants to make any copyright claims about highmem/kmap/fixmap derivative works, consider it in that light. Regarding this new API variant that Keith wrote: it would be silly and dangerous to export it anywhere but to in-kernel drivers. The API disables preemption on 32-bit and rummages deep in the guts of the kernel as well, uses up a precious resource (fixmap slots), etc. It's internal and we eventually might want to deprecate forms of it and concentrate on the good 64-bit performance side. Ingo ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Adding kmap_atomic_prot_pfn 2008-10-24 9:32 ` Ingo Molnar @ 2008-10-24 11:04 ` Thomas Hellström 0 siblings, 0 replies; 80+ messages in thread From: Thomas Hellström @ 2008-10-24 11:04 UTC (permalink / raw) To: Ingo Molnar Cc: Keith Packard, Linus Torvalds, nickpiggin, airlied, dri-devel, Linux Kernel Mailing List, jbarnes, Peter Anvin, Andrew Morton, yinghai Ingo Molnar wrote: > * Thomas Hellström <thomas@tungstengraphics.com> wrote: > > >> Ingo Molnar wrote: >> >>> * Thomas Hellström <thomas@tungstengraphics.com> wrote: >>> >>> >>> >>>> Keith, >>>> >>>> What you actually are doing here is claiming copyright on code that >>>> other people have written, and tighten the export restrictions. >>>> kmap_atomic_prot_pfn() appeared long ago in drm git with identical >>>> code and purpose, but with different authors, and iounmap_atomic is >>>> identical to kunmap_atomic. >>>> >>>> >>> >>> >>>>> +EXPORT_SYMBOL_GPL(iomap_atomic_prot_pfn); >>>>> >>>>> >>> you want to use this facility in a binary-only driver? >>> >>> Ingo >>> >>> >> At this point I have no such use for it, no. >> The original user was a MIT style licenced driver. >> > > okay, then just to put this question to rest: i wrote the original > 32-bit highmem code ~10 years ago. I wrote the first version of fixmap > support - in fact i coined the term. I wrote the first version of the > atomic-kmap facility as well. > > All of that code is licensed under the GPLv2. So if anyone wants to make > any copyright claims about highmem/kmap/fixmap derivative works, > consider it in that light. > I fully acknowledge that. The fact that others wrote almost all of the code is the reason why there is no additional copyright claims in the file of the original kmap_atomic_prot_pfn() implementation. What I'm considering very bad form is someone taking other people's contributions, be it code or ideas, no matter how small they are, claim copyright on them and restrict the original usage. That's really the point here. Frankly, from my own usage point of view I don't really care about the specific case (whether they are exported GPL or not), but with the current form of this patch, I'm basically no longer allowed to use the code currently in DRM git without Keith's permission. > Regarding this new API variant that Keith wrote: it would be silly and > dangerous to export it anywhere but to in-kernel drivers. The API > disables preemption on 32-bit and rummages deep in the guts of the > kernel as well, uses up a precious resource (fixmap slots), etc. It's > internal and we eventually might want to deprecate forms of it and > concentrate on the good 64-bit performance side. > > These are all good arguments to reserve this api for in-kernel drivers. There are other reasons why it should be available to out of tree drivers: * The api enables a fast facility that will be extremely important by graphics drivers in the future, probably not only for in-kernel drivers. Particularly as future graphics drivers will want to get fast write-combined kernel mappings also on highmem pages, not only io. This latter actually suggests keeping the form kmap_atomic_prot_pfn instead of iomap_atomic_prot_pfn, and make it permanent, unless the same goals can be achieved by the VMAP work Nick Piggin is suggesting. * The use of k[un]map_atomic() would, following your argumentation, be equally dangerous to export non-GPL? Now, considering these pros and cons one might still come to the conclusion that for stability reasons it is best to keep this API for in kernel drivers. I don't really know enough about the affected kernel internals to be able to judge. I just think it's important that all facts are considered. /Thomas ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Adding kmap_atomic_prot_pfn 2008-10-24 7:33 ` Adding kmap_atomic_prot_pfn Thomas Hellström 2008-10-24 8:38 ` Ingo Molnar @ 2008-10-24 15:48 ` Keith Packard 2008-10-24 10:18 ` Thomas Hellström 1 sibling, 1 reply; 80+ messages in thread From: Keith Packard @ 2008-10-24 15:48 UTC (permalink / raw) To: Thomas Hellström Cc: keithp, nickpiggin, airlied, yinghai, Peter Anvin, Linux Kernel Mailing List, jbarnes, dri-devel, Andrew Morton, Linus Torvalds, Ingo Molnar [-- Attachment #1: Type: text/plain, Size: 847 bytes --] On Fri, 2008-10-24 at 09:33 +0200, Thomas Hellström wrote: > Keith, > > What you actually are doing here is claiming copyright on code that > other people have written, and tighten the export restrictions. > kmap_atomic_prot_pfn() appeared long ago in drm git with identical code > and purpose, but with different authors, and iounmap_atomic is identical > to kunmap_atomic. Yeah, I just stuck my usual license header on it and didn't think about authorship. I'll fix that, once we figure out what the appropriate name is. But, as this code is clearly a trivial adaptation of the existing kernel code, it should carry a GPLv2 license. I'm also not particular as to the EXPORT restriction, I was just following the EXPORT advice given for the other newly exposed kernel symbols we're using. -- keith.packard@intel.com [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Adding kmap_atomic_prot_pfn 2008-10-24 15:48 ` Keith Packard @ 2008-10-24 10:18 ` Thomas Hellström 0 siblings, 0 replies; 80+ messages in thread From: Thomas Hellström @ 2008-10-24 10:18 UTC (permalink / raw) To: Keith Packard Cc: nickpiggin, airlied, yinghai, Peter Anvin, Linux Kernel Mailing List, jbarnes, dri-devel, Andrew Morton, Linus Torvalds, Ingo Molnar Keith Packard wrote: > On Fri, 2008-10-24 at 09:33 +0200, Thomas Hellström wrote: > >> Keith, >> >> What you actually are doing here is claiming copyright on code that >> other people have written, and tighten the export restrictions. >> kmap_atomic_prot_pfn() appeared long ago in drm git with identical code >> and purpose, but with different authors, and iounmap_atomic is identical >> to kunmap_atomic. >> > > Yeah, I just stuck my usual license header on it and didn't think about > authorship. I'll fix that, once we figure out what the appropriate name > is. > > But, as this code is clearly a trivial adaptation of the existing kernel > code, it should carry a GPLv2 license. I'm also not particular as to the > EXPORT restriction, I was just following the EXPORT advice given for the > other newly exposed kernel symbols we're using. > > Thanks, Keith. If there is a chance that people who do want the EXPORT_SYMBOL_GPL restriction are OK to go with just EXPORT_SYMBOL(), I guess that would fit best considering what's already exported and doable in drivers today. /Thomas ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Adding kmap_atomic_prot_pfn (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-24 2:48 ` Linus Torvalds ` (2 preceding siblings ...) 2008-10-24 6:22 ` Keith Packard @ 2008-10-24 9:14 ` Ingo Molnar 3 siblings, 0 replies; 80+ messages in thread From: Ingo Molnar @ 2008-10-24 9:14 UTC (permalink / raw) To: Linus Torvalds Cc: Keith Packard, Andrew Morton, nickpiggin, airlied, Linux Kernel Mailing List, jbarnes, dri-devel, yinghai, Peter Anvin * Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Thu, 23 Oct 2008, Keith Packard wrote: > > > > > So I'd much rather create a new <linux/kmap.h> or something. Or just > > > expose this from to <asm/fixmap.h> or something. Let's not confuse this > > > with highmem, even if the implementation _historically_ was due to that. > > > > Sure, we readily admit that we're abusing the highmem API. So we wrapped > > that abusive code in a pretty package that can be re-implemented however > > you desire. > > > > How (and when) would you like to see this fixed? > > I'm not entirely sure who wants to own up to owning that particular > part of code, and is willing to make kmap_atomic_prot_pfn() also work > in the absense of CONFIG_HIGHMEM. yeah, that would be the x86 maintainers. (whoever they are) > I suspect it's Ingo, but I also suspect that he'd like to see a tested > patch by somebody who actually _uses_ this code. yeah. I already asked Keith yesterday to send one coherent bundle of patches and i think we've got all the code sent already, you also pulled the latest DRM tree - so it all just needs sorting out and testing. [ I can possibly test the end result with a bleeding-edge Xorg which supports the new DRI APIs. (Whether X comes up fine is a regular, necessary and 'fun' component of the x86 regression testing we do anyway. We also tend to notice highmem breakages promptly.) ] > So I would suspect that if you guys actually write a patch, and make > sure that it works on x86-32 even _without_ CONFIG_HIGHMEM, and send > it to Ingo, good things will happen. > > As to the "without CONFIG_HIGHMEM" part: making all of this work even > without HIGHMEM should literally be a matter of: > > - remove the '#ifdef CONFIG_HIGHMEM' in <asm/fixmap_32.h>, so that we > have fixmap entries for the temporary kernel mappings regardless (ie > FIX_KMAP_BEGIN and FIX_KMAP_END). > > - move the code for the atomic accesses that use those fixmap entries > into a file that gets compiled regardless of CONFIG_HIGHMEM. Or at > least just kmap_atomic_prot_pfn() and kunmap_atomic(). > > and it probably should all work automatically. The kmap_atomic() stuff > really should be almost totally independent of CONFIG_HIGHMEM, since > it's really much more closely related to fixmaps than HIGHMEM. yeah. > I guess there may be some debugging code that depends on HIGHMEM (eg > that whole testing for whether a page is a highmem page or not), so it > might be a _bit_ more than just moving code around, but I really > didn't look closer. > > Then, there's the issue of 64-bit, and just mapping everything there, > and the interface to that. I liked the trivial extension to "struct > resource" to have a "cached mapping" pointer. So if we can just make > it pass resources around and get a page that way (and not even need > kmap() on 64-bit architections), that would be good. hm, the thing that i find the weakest aspect of that (despite having suggested this approach) is that the structure and the granularity of the resource tree is really controlled by the hardware environment. We try to map the hardware circumstances accurately at bootstrap / device discovery time, and keep it rather static from that point on. (modulo hotplug and dynamic BAR sizing) And this static property of the resource tree is _good_ IMO, because we can think about it as a hardware property - not something tweaked by the kernel. (the kernel does tweak it when need to be or when the hardware asks for it, but it's more of an exception) That means that if a driver wants to map just a portion of a BAR, (because the hardware itself compresses various different pieces of functionality into the same BAR), this abstraction becomes a limitation on usage. And it might even be _impossible_ to use the simplest form of resource->mem_cache facility with certain types of hardware: say there's a cacheable and an uncacheable window within the same BAR - and we'd map the whole thing as cacheable. The CPU is then entitled to (and will most likely) prefetch into the uncacheable region as well, causing hw lockups, etc. [In this thread it was claimed that S3 chips have such properties.] And tweaking this abstraction to cover those cases, for the ioremap to not be a full mapping of the resource looks a bit hacky/limiting as well: - we'd either have to add 'size', 'offset' properties to the window we cache in the struct resource. (and possibly an array. yuck.) - or we'd have to say "dont use this facility with such quirky hardware then". - or we'd have to say "split up the struct resource into two pieces artificially", when the driver starts using the resource - which violates the current rather static nature of the resource tree. Maybe i'm overcomplicating it and maybe this last option isnt all that bad after all: as the 'combined' resource in such cases _is_ artificial - and i915+ does not have such problematic BARs to begin with. (keeping separate BARs for the framebuffer is the sane thing to do anyway) OTOH, Keith's io-mapping API does look pretty natural too - it wraps facilities that are already available to drivers, into a coherent unit. A driver has to be aware of it anyway, and drivers have to store their ioremap results in the private device structure currently anyway, so it fits nicely into current ioremap practices and is a gradual extension of it. Discovering the resource at the driver level might be a bit complicated (right now there's no need for any 3D driver to even know about struct resource - they just use the PCI APIs) and then using it dynamically brings up various lifetime questions. Hm? Right now i'm leaning slightly towards Keith's code - but no strong feelings. (Assuming that the atomic-kmap facilities are separated out cleanly and are made independent of any highmem connections - but that is just a shuffle-code-around-and-test excercise) > It's too late for -rc1 (which I'm planning on cutting within the > hour), and if it ends up being complex, I guess that it means this > will eb a 2.6.29 issue. > > But if it really is just a matter of mostly just some trivial code > movement and both the gfx and x86 people are all happy, I'll happily > take it for -rc2, and we can just leave this all behind us. ok. I'm pretty optimistic about it because the risk factor seems manageable: only one driver is affected by the changes - and that driver's authors wrote this new core kernel facility: which is the best of all combinations. We can test the x86 side highmem impact just fine. Plus this portion of the current upstream code in the DRM driver is rather ugly to begin with, so in my book this is a fair cleanup/bugfix as well. Ingo ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Adding kmap_atomic_prot_pfn (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-23 21:24 ` Linus Torvalds 2008-10-24 1:50 ` Keith Packard @ 2008-10-24 3:21 ` Benjamin Herrenschmidt 1 sibling, 0 replies; 80+ messages in thread From: Benjamin Herrenschmidt @ 2008-10-24 3:21 UTC (permalink / raw) To: Linus Torvalds Cc: Andrew Morton, Keith Packard, mingo, nickpiggin, airlied, linux-kernel, jbarnes, dri-devel, yinghai On Thu, 2008-10-23 at 14:24 -0700, Linus Torvalds wrote: > The whole point of that function has absolutely nothing to do with > highmem, and it *must* be useful on non-highmem configurations to be > appropriate. > > So I'd much rather create a new <linux/kmap.h> or something. Or just > expose this from to <asm/fixmap.h> or something. Let's not confuse > this > with highmem, even if the implementation _historically_ was due to > that. Well, on powerpc, we just went (or rather, Kumar just went) through the oops of implementing fixmap and then kmap on top of it... just because we wanted kmap_atomic functionality on non-highmem platforms :-) (and the fixmap approach has some other interesting features). So yes, I agree. Typically very useful for any 32-bit processor with a memory mapped PCI Express config space since it's something like 256M of virtual space to map it all iirc. Cheers, Ben. ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) 2008-10-19 17:53 ` io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) Ingo Molnar ` (2 preceding siblings ...) 2008-10-19 21:04 ` Keith Packard @ 2008-10-20 10:10 ` Ingo Molnar 3 siblings, 0 replies; 80+ messages in thread From: Ingo Molnar @ 2008-10-20 10:10 UTC (permalink / raw) To: Keith Packard, Jesse Barnes Cc: Linus Torvalds, Nick Piggin, Dave Airlie, Linux Kernel Mailing List, dri-devel, Andrew Morton, Yinghai Lu * Ingo Molnar <mingo@elte.hu> wrote: > very nice! > > I think we need a somewhat different abstraction though. > > Firstly, regarding drivers/gpu/drm/i915/io_reserve.h, that needs to > move to generic code. > > Secondly, wouldnt the right abstraction be to attach this > functionality to 'struct resource' ? [or at least create a second > struct that embedds struct resource] > > this abstraction is definitely not a PCI thing and not a > detached-from-everything thing, it's an IO resource thing. We could > make it a property of struct resource: > > struct resource { > resource_size_t start; > resource_size_t end; > const char *name; > unsigned long flags; > struct resource *parent, *sibling, *child; > + void *mapping; > }; > > The APIs would be: > > int io_resource_init_mapping(struct resource *res); > void io_resource_free_mapping(struct resource *res); > void * io_resource_map(struct resource *res, pfn_t pfn, unsigned long offset); > void io_resource_unmap(struct resource *res, void *kaddr); > > Note how simple and consistent it all gets: IO resources already know > their physical location and their size limits. Being able to cache an > ioremap in a mapping [and being able to use atomic kmaps on 32-bit] is > a relatively simple and natural extension to the concept. > > i think that would be quite acceptable - and the APIs could just > transparently work on it. This would also allow the PCI code to > automatically unmap any cached mappings from resources, when the > driver deinitializes. > > Linus, Jesse, what do you think? the downsize would be that we'd attach a runtime property to the IORESOURCE_MEM resource tree - which is a fairly static thing right now, after the point where we finalize the resource tree. (modulo device/bridge hotplug variances) Another downside is that we might not want to map the whole thing. I.e. the structure of the IO memory space we want to map by drivers might be different from how it looks like in the resource tree. the concept of introducing resource->mapping does not feel _that_ wrong though and has a couple of upsides: it could act as a natural mapping type serializer for example and drivers wouldnt have to explicitly manage ioremap results - they could just use the resource descriptor directly and "read" and "write" to/from it. readl/writel could be extended to operate on the resource descriptor transparently, getting rid of a source of resource mismatches and overmapping, etc. etc. We could even safety check IO space accesses this way. and we'd get rid of the complication that your APIs introduced, the need to introduce a separate io_mapping type, etc. Dunno, i might be missing some obvious downside why this wasnt done like that until now. Ingo ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-18 22:32 ` Ingo Molnar ` (2 preceding siblings ...) 2008-10-19 4:14 ` Keith Packard @ 2008-10-19 4:28 ` Yinghai Lu 3 siblings, 0 replies; 80+ messages in thread From: Yinghai Lu @ 2008-10-19 4:28 UTC (permalink / raw) To: Ingo Molnar Cc: Keith Packard, Linus Torvalds, Nick Piggin, Dave Airlie, Linux Kernel Mailing List, dri-devel, Andrew Morton On Sat, Oct 18, 2008 at 3:32 PM, Ingo Molnar <mingo@elte.hu> wrote: > > (In fact if we installed it into the linear kernel address space, and if > the aperture is 1GB aligned, we will automatically use gbpages for it. > Were Intel to support gbpages in the future ;-) > we could expand init_memory_mapping to support direct map for them... with needed prot. or use set init_memory_mapping() + set_memory_wb() directly. but that is only for 64bit x86 also someone is talking about to have 6 pcie display adapters on 64bit system. and every card will have 4g ram. 32bit could use fix map for 1G or 2G mapping. YH ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [git pull] drm patches for 2.6.27-rc1 2008-10-18 19:31 ` Linus Torvalds ` (2 preceding siblings ...) 2008-10-18 20:37 ` Ingo Molnar @ 2008-10-19 3:14 ` Nick Piggin 3 siblings, 0 replies; 80+ messages in thread From: Nick Piggin @ 2008-10-19 3:14 UTC (permalink / raw) To: Linus Torvalds Cc: Keith Packard, Dave Airlie, Andrew Morton, Linux Kernel Mailing List, dri-devel On Sunday 19 October 2008 06:31, Linus Torvalds wrote: > On Sat, 18 Oct 2008, Keith Packard wrote: > > The basic plan is to have four new functions (yes, I'm making up names > > here): > > > > struct io_mapping *io_reserve_pci_resource(struct pci_dev *dev, > > int bar, > > int prot); > > void io_mapping_free(struct io_mapping *mapping); > > > > void *io_map_atomic(struct io_mapping *mapping, unsigned long pfn); > > void io_unmap_atomic(struct io_mapping *mapping, unsigned long pfn); > > The important thing is that mappings need to be per-CPU, so the above may > work, but only if it's designed so that "io_reserve_pci_resource()" will > actually reserve space for 'nr_possible_cpu' page mappings, and then the > "io_[un]map_atomic()" functions do per-CPU mappings. > > Anything else is a disaster, because anything else implies TLB shootdown. TLB shootdown need only be implied if the behaviour required is to unmap the virtual address *and* cause any other CPU that subsequently touches it to fault. For kva, that would be a bug anyway (use after free). The only thing it implies is that a TLB shootdown happens at some point before the address get reused. Still, it's always going to be faster than a global mapping, if done properly. I was thinking about doing a vmap_atomic thing generically in the vmap layer... why exactly do we need the FIXMAP stuff for it? ^ permalink raw reply [flat|nested] 80+ messages in thread
[parent not found: <1225392985-6832-1-git-send-email-eric@anholt.net>]
* [PATCH] Add io-mapping functions to dynamically map large device apertures [not found] <1225392985-6832-1-git-send-email-eric@anholt.net> @ 2008-10-31 2:38 ` Eric Anholt 2008-10-31 9:21 ` Ingo Molnar 0 siblings, 1 reply; 80+ messages in thread From: Eric Anholt @ 2008-10-31 2:38 UTC (permalink / raw) To: linux-kernel; +Cc: Ingo Molnar, Keith Packard, Eric Anholt [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain, Size: 7347 bytes --] From: Keith Packard <keithp@keithp.com> Graphics devices have large PCI apertures which would consume a significant fraction of a 32-bit address space if mapped during driver initialization. Using ioremap at runtime is impractical as it is too slow. This new set of interfaces uses atomic mappings on 32-bit processors and a large static mapping on 64-bit processors to provide reasonable 32-bit performance and optimal 64-bit performance. The current implementation sits atop the io_map_atomic fixmap-based mechanism for 32-bit processors. This includes some editorial suggestions from Randy Dunlap for Documentation/io-mapping.txt Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Eric Anholt <eric@anholt.net> --- Documentation/io-mapping.txt | 76 +++++++++++++++++++++++++++ include/linux/io-mapping.h | 118 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 194 insertions(+), 0 deletions(-) create mode 100644 Documentation/io-mapping.txt create mode 100644 include/linux/io-mapping.h diff --git a/Documentation/io-mapping.txt b/Documentation/io-mapping.txt new file mode 100644 index 0000000..cd2f726 --- /dev/null +++ b/Documentation/io-mapping.txt @@ -0,0 +1,76 @@ +The io_mapping functions in linux/io-mapping.h provide an abstraction for +efficiently mapping small regions of an I/O device to the CPU. The initial +usage is to support the large graphics aperture on 32-bit processors where +ioremap_wc cannot be used to statically map the entire aperture to the CPU +as it would consume too much of the kernel address space. + +A mapping object is created during driver initialization using + + struct io_mapping *io_mapping_create_wc(unsigned long base, + unsigned long size) + + 'base' is the bus address of the region to be made + mappable, while 'size' indicates how large a mapping region to + enable. Both are in bytes. + + This _wc variant provides a mapping which may only be used + with the io_mapping_map_atomic_wc or io_mapping_map_wc. + +With this mapping object, individual pages can be mapped either atomically +or not, depending on the necessary scheduling environment. Of course, atomic +maps are more efficient: + + void *io_mapping_map_atomic_wc(struct io_mapping *mapping, + unsigned long offset) + + 'offset' is the offset within the defined mapping region. + Accessing addresses beyond the region specified in the + creation function yields undefined results. Using an offset + which is not page aligned yields an undefined result. The + return value points to a single page in CPU address space. + + This _wc variant returns a write-combining map to the + page and may only be used with mappings created by + io_mapping_create_wc + + Note that the task may not sleep while holding this page + mapped. + + void io_mapping_unmap_atomic(void *vaddr) + + 'vaddr' must be the the value returned by the last + io_mapping_map_atomic_wc call. This unmaps the specified + page and allows the task to sleep once again. + +If you need to sleep while holding the lock, you can use the non-atomic +variant, although they may be significantly slower. + + void *io_mapping_map_wc(struct io_mapping *mapping, + unsigned long offset) + + This works like io_mapping_map_atomic_wc except it allows + the task to sleep while holding the page mapped. + + void io_mapping_unmap(void *vaddr) + + This works like io_mapping_unmap_atomic, except it is used + for pages mapped with io_mapping_map_wc. + +At driver close time, the io_mapping object must be freed: + + void io_mapping_free(struct io_mapping *mapping) + +Current Implementation: + +The initial implementation of these functions uses existing mapping +mechanisms and so provides only an abstraction layer and no new +functionality. + +On 64-bit processors, io_mapping_create_wc calls ioremap_wc for the whole +range, creating a permanent kernel-visible mapping to the resource. The +map_atomic and map functions add the requested offset to the base of the +virtual address returned by ioremap_wc. + +On 32-bit processors, io_mapping_map_atomic_wc uses io_map_atomic_prot_pfn, +which uses the fixmaps to get us a mapping to a page using an atomic fashion. +For io_mapping_map_wc, ioremap_wc() is used to get a mapping of the region. diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h new file mode 100644 index 0000000..1b56699 --- /dev/null +++ b/include/linux/io-mapping.h @@ -0,0 +1,118 @@ +/* + * Copyright © 2008 Keith Packard <keithp@keithp.com> + * + * This file is free software; you can redistribute it and/or modify + * it under the terms of version 2 of the GNU General Public License + * as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA. + */ + +#ifndef _LINUX_IO_MAPPING_H +#define _LINUX_IO_MAPPING_H + +#include <linux/types.h> +#include <asm/io.h> +#include <asm/page.h> +#include <asm/iomap.h> + +/* + * The io_mapping mechanism provides an abstraction for mapping + * individual pages from an io device to the CPU in an efficient fashion. + * + * See Documentation/io_mapping.txt + */ + +/* this struct isn't actually defined anywhere */ +struct io_mapping; + +#ifdef CONFIG_X86_64 + +/* Create the io_mapping object*/ +static inline struct io_mapping * +io_mapping_create_wc(unsigned long base, unsigned long size) +{ + return (struct io_mapping *) ioremap_wc(base, size); +} + +static inline void +io_mapping_free(struct io_mapping *mapping) +{ + iounmap(mapping); +} + +/* Atomic map/unmap */ +static inline void * +io_mapping_map_atomic_wc(struct io_mapping *mapping, unsigned long offset) +{ + return ((char *) mapping) + offset; +} + +static inline void +io_mapping_unmap_atomic(void *vaddr) +{ +} + +/* Non-atomic map/unmap */ +static inline void * +io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset) +{ + return ((char *) mapping) + offset; +} + +static inline void +io_mapping_unmap(void *vaddr) +{ +} + +#endif /* CONFIG_X86_64 */ + +#ifdef CONFIG_X86_32 +static inline struct io_mapping * +io_mapping_create_wc(unsigned long base, unsigned long size) +{ + return (struct io_mapping *) base; +} + +static inline void +io_mapping_free(struct io_mapping *mapping) +{ +} + +/* Atomic map/unmap */ +static inline void * +io_mapping_map_atomic_wc(struct io_mapping *mapping, unsigned long offset) +{ + offset += (unsigned long) mapping; + return iomap_atomic_prot_pfn(offset >> PAGE_SHIFT, KM_USER0, + __pgprot(__PAGE_KERNEL_WC)); +} + +static inline void +io_mapping_unmap_atomic(void *vaddr) +{ + iounmap_atomic(vaddr, KM_USER0); +} + +static inline void * +io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset) +{ + offset += (unsigned long) mapping; + return ioremap_wc(offset, PAGE_SIZE); +} + +static inline void +io_mapping_unmap(void *vaddr) +{ + iounmap(vaddr); +} +#endif /* CONFIG_X86_32 */ + +#endif /* _LINUX_IO_MAPPING_H */ -- 1.5.6.5 ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH] Add io-mapping functions to dynamically map large device apertures 2008-10-31 2:38 ` [PATCH] Add io-mapping functions to dynamically map large device apertures Eric Anholt @ 2008-10-31 9:21 ` Ingo Molnar 2008-10-31 16:59 ` Keith Packard 0 siblings, 1 reply; 80+ messages in thread From: Ingo Molnar @ 2008-10-31 9:21 UTC (permalink / raw) To: Eric Anholt; +Cc: linux-kernel, Keith Packard * Eric Anholt <eric@anholt.net> wrote: > From: Keith Packard <keithp@keithp.com> > > Graphics devices have large PCI apertures which would consume a significant > fraction of a 32-bit address space if mapped during driver initialization. > Using ioremap at runtime is impractical as it is too slow. This new set of > interfaces uses atomic mappings on 32-bit processors and a large static > mapping on 64-bit processors to provide reasonable 32-bit performance and > optimal 64-bit performance. > > The current implementation sits atop the io_map_atomic fixmap-based mechanism > for 32-bit processors. > > This includes some editorial suggestions from Randy Dunlap for > Documentation/io-mapping.txt > > Signed-off-by: Keith Packard <keithp@keithp.com> > Signed-off-by: Eric Anholt <eric@anholt.net> > --- > Documentation/io-mapping.txt | 76 +++++++++++++++++++++++++++ > include/linux/io-mapping.h | 118 ++++++++++++++++++++++++++++++++++++++++++ I've applied your three patches to tip/core/resources for testing, thanks! One small detail: > +++ b/include/linux/io-mapping.h > +#ifdef CONFIG_X86_64 it's ugly and inflexible to put x86 dependencies into generic headers. (even though with a high likelyhood 32-bit x86 will be the only arch to ever implement the iomap_atomic() APIs) Instead please add a HAVE_ATOMIC_IOMAP define to arch/x86/Kconfig: config HAVE_ATOMIC_IOMAP def_bool y depends on X86_32 ... and use #ifndef HAVE_ATOMIC_IOMAP in include/linux/io-mapping.h instead of #ifdef CONFIG_X86_64. ( Other 32-bit architectures which need an atomic iomap implementation for address space reasons can then implement the iomap_atomic*() APIs too and set this same flag, to gain the same generic io_mapping implementation. ) Please send this cleanup as a delta patch, ontop of your three patches. Ingo ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH] Add io-mapping functions to dynamically map large device apertures 2008-10-31 9:21 ` Ingo Molnar @ 2008-10-31 16:59 ` Keith Packard 2008-11-03 8:37 ` Ingo Molnar 0 siblings, 1 reply; 80+ messages in thread From: Keith Packard @ 2008-10-31 16:59 UTC (permalink / raw) To: Ingo Molnar; +Cc: keithp, Eric Anholt, linux-kernel [-- Attachment #1: Type: text/plain, Size: 1752 bytes --] On Fri, 2008-10-31 at 10:21 +0100, Ingo Molnar wrote: > it's ugly and inflexible to put x86 dependencies into generic headers. > (even though with a high likelyhood 32-bit x86 will be the only arch > to ever implement the iomap_atomic() APIs) > > Instead please add a HAVE_ATOMIC_IOMAP define to arch/x86/Kconfig: > > config HAVE_ATOMIC_IOMAP > def_bool y > depends on X86_32 > > ... and use #ifndef HAVE_ATOMIC_IOMAP in include/linux/io-mapping.h > instead of #ifdef CONFIG_X86_64. Just to clarify the issue here: there are two separate implementations of the io_mapping API -- one for 'large address space' machines where ioremap_wc can handle the typical graphics aperture within the kernel virtual map, and the other using iomap_atomic_prot_pfn for machines with puny address spaces. All large address space machines can provide the io_mapping API without any archtecture-specific support. For efficient 32-bit io_mapping support, we require the new iomap_atomic_prot_pfn function. So, it seems like what I want to do is use the large address space code on any machine which supports it, and then use the iomap_atomic_prot_pfn version for small address space machines which have the iomap_atomic_prot_pfn function. What I think you're suggesting is to just assume that machines without iomap_atomic_prot_pfn have address spaces large enough to support the ioremap_wc path. The alternative is to create a third (slow) path (which I did before the iomap_atomic_prot_pfn API was introduced) that uses ioremap_wc at run time for small address space machines without iomap_atomic_prot_pfn. Let me know which you'd prefer and I'll get a patch out ASAP. -- keith.packard@intel.com [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [PATCH] Add io-mapping functions to dynamically map large device apertures 2008-10-31 16:59 ` Keith Packard @ 2008-11-03 8:37 ` Ingo Molnar 0 siblings, 0 replies; 80+ messages in thread From: Ingo Molnar @ 2008-11-03 8:37 UTC (permalink / raw) To: Keith Packard; +Cc: Eric Anholt, linux-kernel * Keith Packard <keithp@keithp.com> wrote: > On Fri, 2008-10-31 at 10:21 +0100, Ingo Molnar wrote: > > > it's ugly and inflexible to put x86 dependencies into generic headers. > > (even though with a high likelyhood 32-bit x86 will be the only arch > > to ever implement the iomap_atomic() APIs) > > > > Instead please add a HAVE_ATOMIC_IOMAP define to arch/x86/Kconfig: > > > > config HAVE_ATOMIC_IOMAP > > def_bool y > > depends on X86_32 > > > > ... and use #ifndef HAVE_ATOMIC_IOMAP in include/linux/io-mapping.h > > instead of #ifdef CONFIG_X86_64. > > Just to clarify the issue here: there are two separate > implementations of the io_mapping API -- one for 'large address > space' machines where ioremap_wc can handle the typical graphics > aperture within the kernel virtual map, and the other using > iomap_atomic_prot_pfn for machines with puny address spaces. > > All large address space machines can provide the io_mapping API > without any archtecture-specific support. For efficient 32-bit > io_mapping support, we require the new iomap_atomic_prot_pfn > function. > > So, it seems like what I want to do is use the large address space > code on any machine which supports it, and then use the > iomap_atomic_prot_pfn version for small address space machines which > have the iomap_atomic_prot_pfn function. Correct. > What I think you're suggesting is to just assume that machines > without iomap_atomic_prot_pfn have address spaces large enough to > support the ioremap_wc path. The alternative is to create a third > (slow) path (which I did before the iomap_atomic_prot_pfn API was > introduced) that uses ioremap_wc at run time for small address space > machines without iomap_atomic_prot_pfn. > > Let me know which you'd prefer and I'll get a patch out ASAP. Please lets keep it simple: i.e. always use ioremap_wc() when there's no iomap_atomic_prot_pfn() 32-bit API provided. ( and by all means ioremap_wc() will just work fine on most 32-bit architectures out of box: they dont go about trying to map hundreds of megabytes of graphics aperture. If they nevertheless need it, they can implement iomap_atomic_prot_pfn() to add support. ) Ingo ^ permalink raw reply [flat|nested] 80+ messages in thread
* [PATCH] [x86_32] Add io_map_atomic using fixmaps @ 2008-11-03 17:09 Keith Packard 2008-11-03 17:09 ` [PATCH] Add io-mapping functions to dynamically map large device apertures Keith Packard 0 siblings, 1 reply; 80+ messages in thread From: Keith Packard @ 2008-11-03 17:09 UTC (permalink / raw) To: linux-kernel; +Cc: Ingo Molnar, Keith Packard [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain, Size: 7561 bytes --] This steals the code used for CONFIG_HIGHMEM memory mappings except that it's designed for dynamic io resource mapping. These fixmaps are available even with CONFIG_HIGHMEM turned off. Signed-off-by: Keith Packard <keithp@keithp.com> --- arch/x86/Kconfig | 4 ++ arch/x86/include/asm/fixmap.h | 4 ++ arch/x86/include/asm/fixmap_32.h | 4 -- arch/x86/include/asm/highmem.h | 5 +-- arch/x86/mm/Makefile | 2 +- arch/x86/mm/init_32.c | 3 +- arch/x86/mm/iomap_32.c | 59 ++++++++++++++++++++++++++++++++++++++ include/asm-x86/iomap.h | 30 +++++++++++++++++++ 8 files changed, 100 insertions(+), 11 deletions(-) create mode 100644 arch/x86/mm/iomap_32.c create mode 100644 include/asm-x86/iomap.h diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 350bee1..a879393 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1890,6 +1890,10 @@ config SYSVIPC_COMPAT endmenu +config HAVE_ATOMIC_IOMAP + def_bool y + depends on X86_32 + source "net/Kconfig" source "drivers/Kconfig" diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h index 8668a94..23696d4 100644 --- a/arch/x86/include/asm/fixmap.h +++ b/arch/x86/include/asm/fixmap.h @@ -9,6 +9,10 @@ extern int fixmaps_set; +extern pte_t *kmap_pte; +extern pgprot_t kmap_prot; +extern pte_t *pkmap_page_table; + void __native_set_fixmap(enum fixed_addresses idx, pte_t pte); void native_set_fixmap(enum fixed_addresses idx, unsigned long phys, pgprot_t flags); diff --git a/arch/x86/include/asm/fixmap_32.h b/arch/x86/include/asm/fixmap_32.h index 09f29ab..c7115c1 100644 --- a/arch/x86/include/asm/fixmap_32.h +++ b/arch/x86/include/asm/fixmap_32.h @@ -28,10 +28,8 @@ extern unsigned long __FIXADDR_TOP; #include <asm/acpi.h> #include <asm/apicdef.h> #include <asm/page.h> -#ifdef CONFIG_HIGHMEM #include <linux/threads.h> #include <asm/kmap_types.h> -#endif /* * Here we define all the compile-time 'special' virtual @@ -75,10 +73,8 @@ enum fixed_addresses { #ifdef CONFIG_X86_CYCLONE_TIMER FIX_CYCLONE_TIMER, /*cyclone timer register*/ #endif -#ifdef CONFIG_HIGHMEM FIX_KMAP_BEGIN, /* reserved pte's for temporary kernel mappings */ FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1, -#endif #ifdef CONFIG_PCI_MMCONFIG FIX_PCIE_MCFG, #endif diff --git a/arch/x86/include/asm/highmem.h b/arch/x86/include/asm/highmem.h index a3b3b7c..bf9276b 100644 --- a/arch/x86/include/asm/highmem.h +++ b/arch/x86/include/asm/highmem.h @@ -25,14 +25,11 @@ #include <asm/kmap_types.h> #include <asm/tlbflush.h> #include <asm/paravirt.h> +#include <asm/fixmap.h> /* declarations for highmem.c */ extern unsigned long highstart_pfn, highend_pfn; -extern pte_t *kmap_pte; -extern pgprot_t kmap_prot; -extern pte_t *pkmap_page_table; - /* * Right now we initialize only a single pte table. It can be extended * easily, subsequent pte tables have to be allocated in one physical diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index 59f89b4..fea4565 100644 --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -1,7 +1,7 @@ obj-y := init_$(BITS).o fault.o ioremap.o extable.o pageattr.o mmap.o \ pat.o pgtable.o gup.o -obj-$(CONFIG_X86_32) += pgtable_32.o +obj-$(CONFIG_X86_32) += pgtable_32.o iomap_32.o obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o obj-$(CONFIG_X86_PTDUMP) += dump_pagetables.o diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c index 8396868..c483f42 100644 --- a/arch/x86/mm/init_32.c +++ b/arch/x86/mm/init_32.c @@ -334,7 +334,6 @@ int devmem_is_allowed(unsigned long pagenr) return 0; } -#ifdef CONFIG_HIGHMEM pte_t *kmap_pte; pgprot_t kmap_prot; @@ -357,6 +356,7 @@ static void __init kmap_init(void) kmap_prot = PAGE_KERNEL; } +#ifdef CONFIG_HIGHMEM static void __init permanent_kmaps_init(pgd_t *pgd_base) { unsigned long vaddr; @@ -436,7 +436,6 @@ static void __init set_highmem_pages_init(void) #endif /* !CONFIG_NUMA */ #else -# define kmap_init() do { } while (0) # define permanent_kmaps_init(pgd_base) do { } while (0) # define set_highmem_pages_init() do { } while (0) #endif /* CONFIG_HIGHMEM */ diff --git a/arch/x86/mm/iomap_32.c b/arch/x86/mm/iomap_32.c new file mode 100644 index 0000000..d0151d8 --- /dev/null +++ b/arch/x86/mm/iomap_32.c @@ -0,0 +1,59 @@ +/* + * Copyright © 2008 Ingo Molnar + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, write to the Free Software Foundation, Inc., + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. + */ + +#include <asm/iomap.h> +#include <linux/module.h> + +/* Map 'pfn' using fixed map 'type' and protections 'prot' + */ +void * +iomap_atomic_prot_pfn(unsigned long pfn, enum km_type type, pgprot_t prot) +{ + enum fixed_addresses idx; + unsigned long vaddr; + + pagefault_disable(); + + idx = type + KM_TYPE_NR*smp_processor_id(); + vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx); + set_pte(kmap_pte-idx, pfn_pte(pfn, prot)); + arch_flush_lazy_mmu_mode(); + + return (void*) vaddr; +} +EXPORT_SYMBOL_GPL(iomap_atomic_prot_pfn); + +void +iounmap_atomic(void *kvaddr, enum km_type type) +{ + unsigned long vaddr = (unsigned long) kvaddr & PAGE_MASK; + enum fixed_addresses idx = type + KM_TYPE_NR*smp_processor_id(); + + /* + * Force other mappings to Oops if they'll try to access this pte + * without first remap it. Keeping stale mappings around is a bad idea + * also, in case the page changes cacheability attributes or becomes + * a protected page in a hypervisor. + */ + if (vaddr == __fix_to_virt(FIX_KMAP_BEGIN+idx)) + kpte_clear_flush(kmap_pte-idx, vaddr); + + arch_flush_lazy_mmu_mode(); + pagefault_enable(); +} +EXPORT_SYMBOL_GPL(iounmap_atomic); diff --git a/include/asm-x86/iomap.h b/include/asm-x86/iomap.h new file mode 100644 index 0000000..c1f0628 --- /dev/null +++ b/include/asm-x86/iomap.h @@ -0,0 +1,30 @@ +/* + * Copyright © 2008 Ingo Molnar + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, write to the Free Software Foundation, Inc., + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. + */ + +#include <linux/fs.h> +#include <linux/mm.h> +#include <linux/uaccess.h> +#include <asm/cacheflush.h> +#include <asm/pgtable.h> +#include <asm/tlbflush.h> + +void * +iomap_atomic_prot_pfn(unsigned long pfn, enum km_type type, pgprot_t prot); + +void +iounmap_atomic(void *kvaddr, enum km_type type); -- 1.5.6.5 ^ permalink raw reply [flat|nested] 80+ messages in thread
* [PATCH] Add io-mapping functions to dynamically map large device apertures 2008-11-03 17:09 [PATCH] [x86_32] Add io_map_atomic using fixmaps Keith Packard @ 2008-11-03 17:09 ` Keith Packard 0 siblings, 0 replies; 80+ messages in thread From: Keith Packard @ 2008-11-03 17:09 UTC (permalink / raw) To: linux-kernel; +Cc: Ingo Molnar, Keith Packard [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain, Size: 7959 bytes --] Graphics devices have large PCI apertures which would consume a significant fraction of a 32-bit address space if mapped during driver initialization. Using ioremap at runtime is impractical as it is too slow. This new set of interfaces uses atomic mappings on 32-bit processors and a large static mapping on 64-bit processors to provide reasonable 32-bit performance and optimal 64-bit performance. The current implementation sits atop the existing CONFIG_HIGHMEM kmap_atomic mechanism for 32-bit processors when present. When absent, it just uses ioremap, which remains horribly inefficient. Fixing non-HIGHMEM 32-bit kernels to provide per-CPU mappings ala HIGHMEM would resolve that performance issue. This includes some editorial suggestions from Randy Dunlap for Documentation/io-mapping.txt Signed-off-by: Keith Packard <keithp@keithp.com> --- Documentation/io-mapping.txt | 82 +++++++++++++++++++++++++++ include/linux/io-mapping.h | 125 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 207 insertions(+), 0 deletions(-) create mode 100644 Documentation/io-mapping.txt create mode 100644 include/linux/io-mapping.h diff --git a/Documentation/io-mapping.txt b/Documentation/io-mapping.txt new file mode 100644 index 0000000..473e43b --- /dev/null +++ b/Documentation/io-mapping.txt @@ -0,0 +1,82 @@ +The io_mapping functions in linux/io-mapping.h provide an abstraction for +efficiently mapping small regions of an I/O device to the CPU. The initial +usage is to support the large graphics aperture on 32-bit processors where +ioremap_wc cannot be used to statically map the entire aperture to the CPU +as it would consume too much of the kernel address space. + +A mapping object is created during driver initialization using + + struct io_mapping *io_mapping_create_wc(unsigned long base, + unsigned long size) + + 'base' is the bus address of the region to be made + mappable, while 'size' indicates how large a mapping region to + enable. Both are in bytes. + + This _wc variant provides a mapping which may only be used + with the io_mapping_map_atomic_wc or io_mapping_map_wc. + +With this mapping object, individual pages can be mapped either atomically +or not, depending on the necessary scheduling environment. Of course, atomic +maps are more efficient: + + void *io_mapping_map_atomic_wc(struct io_mapping *mapping, + unsigned long offset) + + 'offset' is the offset within the defined mapping region. + Accessing addresses beyond the region specified in the + creation function yields undefined results. Using an offset + which is not page aligned yields an undefined result. The + return value points to a single page in CPU address space. + + This _wc variant returns a write-combining map to the + page and may only be used with mappings created by + io_mapping_create_wc + + Note that the task may not sleep while holding this page + mapped. + + void io_mapping_unmap_atomic(void *vaddr) + + 'vaddr' must be the the value returned by the last + io_mapping_map_atomic_wc call. This unmaps the specified + page and allows the task to sleep once again. + +If you need to sleep while holding the lock, you can use the non-atomic +variant, although they may be significantly slower. + + void *io_mapping_map_wc(struct io_mapping *mapping, + unsigned long offset) + + This works like io_mapping_map_atomic_wc except it allows + the task to sleep while holding the page mapped. + + void io_mapping_unmap(void *vaddr) + + This works like io_mapping_unmap_atomic, except it is used + for pages mapped with io_mapping_map_wc. + +At driver close time, the io_mapping object must be freed: + + void io_mapping_free(struct io_mapping *mapping) + +Current Implementation: + +The initial implementation of these functions uses existing mapping +mechanisms and so provides only an abstraction layer and no new +functionality. + +On 64-bit processors, io_mapping_create_wc calls ioremap_wc for the whole +range, creating a permanent kernel-visible mapping to the resource. The +map_atomic and map functions add the requested offset to the base of the +virtual address returned by ioremap_wc. + +On 32-bit processors with HIGHMEM defined, io_mapping_map_atomic_wc uses +kmap_atomic_pfn to map the specified page in an atomic fashion; +kmap_atomic_pfn isn't really supposed to be used with device pages, but it +provides an efficient mapping for this usage. + +On 32-bit processors without HIGHMEM defined, io_mapping_map_atomic_wc and +io_mapping_map_wc both use ioremap_wc, a terribly inefficient function which +performs an IPI to inform all processors about the new mapping. This results +in a significant performance penalty. diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h new file mode 100644 index 0000000..82df317 --- /dev/null +++ b/include/linux/io-mapping.h @@ -0,0 +1,125 @@ +/* + * Copyright © 2008 Keith Packard <keithp@keithp.com> + * + * This file is free software; you can redistribute it and/or modify + * it under the terms of version 2 of the GNU General Public License + * as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA. + */ + +#ifndef _LINUX_IO_MAPPING_H +#define _LINUX_IO_MAPPING_H + +#include <linux/types.h> +#include <asm/io.h> +#include <asm/page.h> +#include <asm/iomap.h> + +/* + * The io_mapping mechanism provides an abstraction for mapping + * individual pages from an io device to the CPU in an efficient fashion. + * + * See Documentation/io_mapping.txt + */ + +/* this struct isn't actually defined anywhere */ +struct io_mapping; + +#ifdef CONFIG_HAVE_ATOMIC_IOMAP + +/* + * For small address space machines, mapping large objects + * into the kernel virtual space isn't practical. Where + * available, use fixmap support to dynamically map pages + * of the object at run time. + */ + +static inline struct io_mapping * +io_mapping_create_wc(unsigned long base, unsigned long size) +{ + return (struct io_mapping *) base; +} + +static inline void +io_mapping_free(struct io_mapping *mapping) +{ +} + +/* Atomic map/unmap */ +static inline void * +io_mapping_map_atomic_wc(struct io_mapping *mapping, unsigned long offset) +{ + offset += (unsigned long) mapping; + return iomap_atomic_prot_pfn(offset >> PAGE_SHIFT, KM_USER0, + __pgprot(__PAGE_KERNEL_WC)); +} + +static inline void +io_mapping_unmap_atomic(void *vaddr) +{ + iounmap_atomic(vaddr, KM_USER0); +} + +static inline void * +io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset) +{ + offset += (unsigned long) mapping; + return ioremap_wc(offset, PAGE_SIZE); +} + +static inline void +io_mapping_unmap(void *vaddr) +{ + iounmap(vaddr); +} + +#else + +/* Create the io_mapping object*/ +static inline struct io_mapping * +io_mapping_create_wc(unsigned long base, unsigned long size) +{ + return (struct io_mapping *) ioremap_wc(base, size); +} + +static inline void +io_mapping_free(struct io_mapping *mapping) +{ + iounmap(mapping); +} + +/* Atomic map/unmap */ +static inline void * +io_mapping_map_atomic_wc(struct io_mapping *mapping, unsigned long offset) +{ + return ((char *) mapping) + offset; +} + +static inline void +io_mapping_unmap_atomic(void *vaddr) +{ +} + +/* Non-atomic map/unmap */ +static inline void * +io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset) +{ + return ((char *) mapping) + offset; +} + +static inline void +io_mapping_unmap(void *vaddr) +{ +} + +#endif /* HAVE_ATOMIC_IOMAP */ + +#endif /* _LINUX_IO_MAPPING_H */ -- 1.5.6.5 ^ permalink raw reply [flat|nested] 80+ messages in thread
end of thread, other threads:[~2008-11-05 9:01 UTC | newest] Thread overview: 80+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2008-10-17 21:29 [git pull] drm patches for 2.6.27-rc1 Dave Airlie 2008-10-17 22:43 ` Linus Torvalds 2008-10-18 2:10 ` Eric Anholt 2008-10-18 2:47 ` Linus Torvalds 2008-10-18 3:49 ` Keith Packard 2008-10-18 6:44 ` Corbin Simpson 2008-10-18 7:49 ` Eric Anholt 2008-10-19 17:52 ` Peter Zijlstra 2008-10-20 4:17 ` Steven J Newbury 2008-10-20 16:31 ` Linus Torvalds 2008-10-20 20:04 ` Jesse Barnes 2008-10-18 9:11 ` Dave Airlie 2008-10-18 1:37 ` Nick Piggin 2008-10-18 19:11 ` Keith Packard 2008-10-18 19:31 ` Linus Torvalds 2008-10-18 20:07 ` Thomas Hellström 2008-10-18 20:20 ` Keith Packard 2008-10-18 20:37 ` Ingo Molnar 2008-10-18 21:51 ` Keith Packard 2008-10-18 22:32 ` Ingo Molnar 2008-10-18 22:47 ` Jon Smirl 2008-10-18 22:53 ` Linus Torvalds 2008-10-19 0:38 ` Keith Packard 2008-10-19 1:06 ` Arjan van de Ven 2008-10-19 1:15 ` Keith Packard 2008-10-20 10:01 ` Ingo Molnar 2008-10-19 4:14 ` Keith Packard 2008-10-19 6:41 ` Keith Packard 2008-10-19 17:53 ` io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) Ingo Molnar 2008-10-19 18:00 ` Arjan van de Ven 2008-10-19 19:07 ` Eric Anholt 2008-10-20 11:55 ` Ingo Molnar 2008-10-20 12:20 ` Ingo Molnar 2008-10-19 21:04 ` Keith Packard 2008-10-20 11:58 ` Ingo Molnar 2008-10-20 15:49 ` Keith Packard 2008-10-22 9:36 ` Ingo Molnar 2008-10-23 7:14 ` Keith Packard 2008-10-23 7:14 ` [PATCH] Add io-mapping functions to dynamically map large device apertures Keith Packard 2008-10-23 7:14 ` [PATCH] [drm/i915] Use io-mapping interfaces instead of a variety of mapping kludges Keith Packard 2008-10-24 4:49 ` [PATCH] Add io-mapping functions to dynamically map large device apertures Randy Dunlap 2008-10-24 6:26 ` Keith Packard 2008-10-23 8:05 ` io resources and cached mappings (was: [git pull] drm patches for 2.6.27-rc1) Ingo Molnar 2008-10-23 15:39 ` Keith Packard 2008-11-03 7:00 ` Dave Airlie 2008-11-03 10:48 ` Ingo Molnar 2008-11-03 16:36 ` Linus Torvalds 2008-11-03 16:53 ` Ingo Molnar 2008-11-03 17:29 ` [git pull] IO mappings, #2 Ingo Molnar 2008-11-04 22:36 ` Jonathan Corbet 2008-11-05 9:01 ` Ingo Molnar 2008-10-23 20:22 ` Adding kmap_atomic_prot_pfn (was: [git pull] drm patches for 2.6.27-rc1) Keith Packard 2008-10-23 20:38 ` Andrew Morton 2008-10-23 21:03 ` Keith Packard 2008-10-23 21:24 ` Linus Torvalds 2008-10-24 1:50 ` Keith Packard 2008-10-24 2:48 ` Linus Torvalds 2008-10-24 3:24 ` Benjamin Herrenschmidt 2008-10-24 5:37 ` Keith Packard 2008-10-24 14:53 ` Linus Torvalds 2008-10-24 15:45 ` Keith Packard 2008-10-24 4:29 ` Keith Packard 2008-10-24 6:22 ` Keith Packard 2008-10-24 7:33 ` Adding kmap_atomic_prot_pfn Thomas Hellström 2008-10-24 8:38 ` Ingo Molnar 2008-10-24 9:19 ` Thomas Hellström 2008-10-24 9:32 ` Ingo Molnar 2008-10-24 11:04 ` Thomas Hellström 2008-10-24 15:48 ` Keith Packard 2008-10-24 10:18 ` Thomas Hellström 2008-10-24 9:14 ` Adding kmap_atomic_prot_pfn (was: [git pull] drm patches for 2.6.27-rc1) Ingo Molnar 2008-10-24 3:21 ` Benjamin Herrenschmidt 2008-10-20 10:10 ` io resources and cached mappings " Ingo Molnar 2008-10-19 4:28 ` [git pull] drm patches for 2.6.27-rc1 Yinghai Lu 2008-10-19 3:14 ` Nick Piggin [not found] <1225392985-6832-1-git-send-email-eric@anholt.net> 2008-10-31 2:38 ` [PATCH] Add io-mapping functions to dynamically map large device apertures Eric Anholt 2008-10-31 9:21 ` Ingo Molnar 2008-10-31 16:59 ` Keith Packard 2008-11-03 8:37 ` Ingo Molnar 2008-11-03 17:09 [PATCH] [x86_32] Add io_map_atomic using fixmaps Keith Packard 2008-11-03 17:09 ` [PATCH] Add io-mapping functions to dynamically map large device apertures Keith Packard
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).