LKML Archive on lore.kernel.org
* [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
@ 2007-12-25 11:26 Arjan van de Ven
  2007-12-27 11:52 ` Jeff Garzik
                   ` (2 more replies)
  0 siblings, 3 replies; 125+ messages in thread
From: Arjan van de Ven @ 2007-12-25 11:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jeff Garzik, Linus Torvalds, gregkh, inux-pci,
	Benjamin Herrenschmidt, Martin Mares, Matthew Wilcox


From: Arjan van de Ven <arjan@linux.intel.com>
Subject: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in

On PCs, PCI extended configuration space (4KB) is riddled with problems
associated with the memory mapped access method (MMCONFIG). At the same
time, there are very few machines that actually need or use this extended
configuration space. 

At this point in time, the only sensible action is to make access to the
extended configuration space an opt-in operation for those device drivers
that need/want access to this space, as well as for those userland 
diagnostics utilities that (on admin request) want to access this space.

It's inevitable that this is done per device rather than per bus; we'll
be needing per device PCI quirks to turn this extended config space off 
over time no matter what; in addition, it gives the least amount of surprise:
loading a driver for a device only impacts that one device, not a whole bus
worth of devices (although it'll be common to have one physical device per
bus on PCI-E).

The (desirable) side-effect of this patch is that all enumeration is done
using normal configuration cycles.

The patch below splits the lower level PCI config space operation (which
operate on a bus) in two: one that normally only operates on traditional 
space, and one that gets used after the driver has opted in to using the
extended configuration space. This has led to a little code duplication,
but it's not all that bad (most of it is prototypes in headers and such).

Architectures that have a solid reliable way to get to extended configuration
space can just keep doing what they do now and allow extended space access
from the "traditional" bus ops, and just not fill in the new bus ops.
(This could include x86 for, say, BIOS year 2009 and later, but doesn't
right now)

This patch also adds a sysfs property for each device into which root can
write a '1' to enable extended configuration space. The kernel will print
a notice into dmesg when this happens (including the name of the app) so that
if the system crashes as a result of this action, the user can know what
action/tool caused it.


Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>

---
 arch/x86/pci/common.c      |   23 ++++++++++++++++++++++
 arch/x86/pci/init.c        |   10 +++++++++
 arch/x86/pci/mmconfig_32.c |    2 -
 arch/x86/pci/mmconfig_64.c |    2 -
 arch/x86/pci/pci.h         |    2 +
 drivers/pci/access.c       |   46 ++++++++++++++++++++++++++++++++++++++++++++
 drivers/pci/pci-sysfs.c    |   31 +++++++++++++++++++++++++++++
 drivers/pci/pci.c          |   28 ++++++++++++++++++++++++++
 include/linux/pci.h        |   47 +++++++++++++++++++++++++++++++++++++++------
 9 files changed, 183 insertions(+), 8 deletions(-)

Index: linux-2.6.24-rc5/arch/x86/pci/common.c
===================================================================
--- linux-2.6.24-rc5.orig/arch/x86/pci/common.c
+++ linux-2.6.24-rc5/arch/x86/pci/common.c
@@ -26,6 +26,7 @@ int pcibios_last_bus = -1;
 unsigned long pirq_table_addr;
 struct pci_bus *pci_root_bus;
 struct pci_raw_ops *raw_pci_ops;
+struct pci_raw_ops *raw_pci_ops_extcfg;
 
 static int pci_read(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
 {
@@ -39,9 +40,31 @@ static int pci_write(struct pci_bus *bus
 				  devfn, where, size, value);
 }
 
+static int pci_read_ext(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
+{
+	if (raw_pci_ops_extcfg)
+		return raw_pci_ops_extcfg->read(pci_domain_nr(bus), bus->number,
+				 devfn, where, size, value);
+	else
+		return raw_pci_ops->read(pci_domain_nr(bus), bus->number,
+				 devfn, where, size, value);
+}
+
+static int pci_write_ext(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 value)
+{
+	if (raw_pci_ops_extcfg)
+		return raw_pci_ops_extcfg->write(pci_domain_nr(bus), bus->number,
+				  devfn, where, size, value);
+	else
+		return raw_pci_ops->write(pci_domain_nr(bus), bus->number,
+				  devfn, where, size, value);
+}
+
 struct pci_ops pci_root_ops = {
 	.read = pci_read,
 	.write = pci_write,
+	.readext = pci_read_ext,
+	.writeext = pci_write_ext,
 };
 
 /*
Index: linux-2.6.24-rc5/drivers/pci/pci.c
===================================================================
--- linux-2.6.24-rc5.orig/drivers/pci/pci.c
+++ linux-2.6.24-rc5/drivers/pci/pci.c
@@ -752,6 +752,34 @@ int pci_enable_device(struct pci_dev *de
 	return pci_enable_device_bars(dev, (1 << PCI_NUM_RESOURCES) - 1);
 }
 
+/**
+ * pci_enable_ext_config - Enable extended (4K) config space accesses
+ * @dev: PCI device to be changed
+ *
+ *  Enable extended (4KB) configuration space accesses for a device.
+ *  Extended config space is available for PCI-E devices and can
+ *  be used for things like PCI AER and other features. However,
+ *  due to various stability issues, this can only be done on demand.
+ *
+ * Returns: -1 on failure, 0 on success
+ */
+
+int pci_enable_ext_config(struct pci_dev *dev)
+{
+	if (dev->ext_cfg_space < 0)
+		return -1;
+	if (dev->ext_cfg_space > 0)
+		return 0;
+	dev->ext_cfg_space = 1;
+	/*
+	 * now that we enabled large accesses, we
+	 * need to update the config space size variable
+	 */
+	dev->cfg_size = pci_cfg_space_size(dev);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(pci_enable_ext_config);
+
 /*
  * Managed PCI resources.  This manages device on/off, intx/msi/msix
  * on/off and BAR regions.  pci_dev itself records msi/msix status, so
Index: linux-2.6.24-rc5/include/linux/pci.h
===================================================================
--- linux-2.6.24-rc5.orig/include/linux/pci.h
+++ linux-2.6.24-rc5/include/linux/pci.h
@@ -174,6 +174,15 @@ struct pci_dev {
 	int		cfg_size;	/* Size of configuration space */
 
 	/*
+	 * ext_cfg_space gets set by drivers/quirks to device if
+	 * extended (4K) config space is desired.
+	 * negative values -- hard disabled (quirk etc)
+	 * zero            -- disabled
+	 * positive values -- enable
+	 */
+	int		ext_cfg_space;
+
+	/*
 	 * Instead of touching interrupt line and base address registers
 	 * directly, use the values stored here. They might be different!
 	 */
@@ -302,6 +311,8 @@ struct pci_bus {
 struct pci_ops {
 	int (*read)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *val);
 	int (*write)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 val);
+	int (*readext)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *val);
+	int (*writeext)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 val);
 };
 
 struct pci_raw_ops {
@@ -521,29 +532,48 @@ int pci_bus_write_config_byte (struct pc
 int pci_bus_write_config_word (struct pci_bus *bus, unsigned int devfn, int where, u16 val);
 int pci_bus_write_config_dword (struct pci_bus *bus, unsigned int devfn, int where, u32 val);
 
+int pci_bus_read_extconfig_byte(struct pci_bus *bus, unsigned int devfn, int where, u8 *val);
+int pci_bus_read_extconfig_word(struct pci_bus *bus, unsigned int devfn, int where, u16 *val);
+int pci_bus_read_extconfig_dword(struct pci_bus *bus, unsigned int devfn, int where, u32 *val);
+int pci_bus_write_extconfig_byte(struct pci_bus *bus, unsigned int devfn, int where, u8 val);
+int pci_bus_write_extconfig_word(struct pci_bus *bus, unsigned int devfn, int where, u16 val);
+int pci_bus_write_extconfig_dword(struct pci_bus *bus, unsigned int devfn, int where, u32 val);
+
 static inline int pci_read_config_byte(struct pci_dev *dev, int where, u8 *val)
 {
-	return pci_bus_read_config_byte (dev->bus, dev->devfn, where, val);
+	if (dev->ext_cfg_space > 0)
+		return pci_bus_read_extconfig_byte(dev->bus, dev->devfn, where, val);
+	return pci_bus_read_config_byte(dev->bus, dev->devfn, where, val);
 }
 static inline int pci_read_config_word(struct pci_dev *dev, int where, u16 *val)
 {
-	return pci_bus_read_config_word (dev->bus, dev->devfn, where, val);
+	if (dev->ext_cfg_space > 0)
+		return pci_bus_read_extconfig_word(dev->bus, dev->devfn, where, val);
+	return pci_bus_read_config_word(dev->bus, dev->devfn, where, val);
 }
 static inline int pci_read_config_dword(struct pci_dev *dev, int where, u32 *val)
 {
-	return pci_bus_read_config_dword (dev->bus, dev->devfn, where, val);
+	if (dev->ext_cfg_space > 0)
+		return pci_bus_read_extconfig_dword(dev->bus, dev->devfn, where, val);
+	return pci_bus_read_config_dword(dev->bus, dev->devfn, where, val);
 }
 static inline int pci_write_config_byte(struct pci_dev *dev, int where, u8 val)
 {
-	return pci_bus_write_config_byte (dev->bus, dev->devfn, where, val);
+	if (dev->ext_cfg_space > 0)
+		return pci_bus_write_extconfig_byte(dev->bus, dev->devfn, where, val);
+	return pci_bus_write_config_byte(dev->bus, dev->devfn, where, val);
 }
 static inline int pci_write_config_word(struct pci_dev *dev, int where, u16 val)
 {
-	return pci_bus_write_config_word (dev->bus, dev->devfn, where, val);
+	if (dev->ext_cfg_space > 0)
+		return pci_bus_write_extconfig_word(dev->bus, dev->devfn, where, val);
+	return pci_bus_write_config_word(dev->bus, dev->devfn, where, val);
 }
 static inline int pci_write_config_dword(struct pci_dev *dev, int where, u32 val)
 {
-	return pci_bus_write_config_dword (dev->bus, dev->devfn, where, val);
+	if (dev->ext_cfg_space > 0)
+		return pci_bus_write_extconfig_dword(dev->bus, dev->devfn, where, val);
+	return pci_bus_write_config_dword(dev->bus, dev->devfn, where, val);
 }
 
 int __must_check pci_enable_device(struct pci_dev *dev);
@@ -693,6 +723,9 @@ void ht_destroy_irq(unsigned int irq);
 extern void pci_block_user_cfg_access(struct pci_dev *dev);
 extern void pci_unblock_user_cfg_access(struct pci_dev *dev);
 
+extern int pci_enable_ext_config(struct pci_dev *dev);
+
+
 /*
  * PCI domain support.  Sometimes called PCI segment (eg by ACPI),
  * a PCI domain is defined to be a set of PCI busses which share
@@ -789,6 +822,8 @@ static inline struct pci_dev *pci_get_bu
 						unsigned int devfn)
 { return NULL; }
 
+static inline int pci_enable_ext_config(struct pci_dev *dev) { return -1; }
+
 #endif /* CONFIG_PCI */
 
 /* Include architecture-dependent settings and functions */
Index: linux-2.6.24-rc5/arch/x86/pci/mmconfig_32.c
===================================================================
--- linux-2.6.24-rc5.orig/arch/x86/pci/mmconfig_32.c
+++ linux-2.6.24-rc5/arch/x86/pci/mmconfig_32.c
@@ -143,6 +143,6 @@ int __init pci_mmcfg_arch_reachable(unsi
 int __init pci_mmcfg_arch_init(void)
 {
 	printk(KERN_INFO "PCI: Using MMCONFIG\n");
-	raw_pci_ops = &pci_mmcfg;
+	raw_pci_ops_extcfg = &pci_mmcfg;
 	return 1;
 }
Index: linux-2.6.24-rc5/arch/x86/pci/mmconfig_64.c
===================================================================
--- linux-2.6.24-rc5.orig/arch/x86/pci/mmconfig_64.c
+++ linux-2.6.24-rc5/arch/x86/pci/mmconfig_64.c
@@ -152,6 +152,6 @@ int __init pci_mmcfg_arch_init(void)
 			return 0;
 		}
 	}
-	raw_pci_ops = &pci_mmcfg;
+	raw_pci_ops_extcfg = &pci_mmcfg;
 	return 1;
 }
Index: linux-2.6.24-rc5/drivers/pci/access.c
===================================================================
--- linux-2.6.24-rc5.orig/drivers/pci/access.c
+++ linux-2.6.24-rc5/drivers/pci/access.c
@@ -51,6 +51,45 @@ int pci_bus_write_config_##size \
 	return res;							\
 }
 
+#define PCI_OP_READ_EXT(size, type, len) \
+int pci_bus_read_extconfig_##size \
+	(struct pci_bus *bus, unsigned int devfn, int pos, type *value)	\
+{									\
+	int res;							\
+	unsigned long flags;						\
+	u32 data = 0;							\
+	if (PCI_##size##_BAD)						\
+		return PCIBIOS_BAD_REGISTER_NUMBER;			\
+	spin_lock_irqsave(&pci_lock, flags);				\
+	if (bus->ops->readext)						\
+		res = bus->ops->readext(bus, devfn, pos, len, &data);	\
+	else								\
+		res = bus->ops->read(bus, devfn, pos, len, &data);	\
+	*value = (type)data;						\
+	spin_unlock_irqrestore(&pci_lock, flags);			\
+	return res;							\
+}									\
+EXPORT_SYMBOL(pci_bus_read_extconfig_##size);
+
+#define PCI_OP_WRITE_EXT(size, type, len) \
+int pci_bus_write_extconfig_##size \
+	(struct pci_bus *bus, unsigned int devfn, int pos, type value)	\
+{									\
+	int res;							\
+	unsigned long flags;						\
+	if (PCI_##size##_BAD)						\
+		return PCIBIOS_BAD_REGISTER_NUMBER;			\
+	spin_lock_irqsave(&pci_lock, flags);				\
+	if (bus->ops->writeext)						\
+		res = bus->ops->writeext(bus, devfn, pos, len, value);	\
+	else								\
+		res = bus->ops->write(bus, devfn, pos, len, value);	\
+	spin_unlock_irqrestore(&pci_lock, flags);			\
+	return res;							\
+}									\
+EXPORT_SYMBOL(pci_bus_write_extconfig_##size);
+
+
 PCI_OP_READ(byte, u8, 1)
 PCI_OP_READ(word, u16, 2)
 PCI_OP_READ(dword, u32, 4)
@@ -58,6 +97,13 @@ PCI_OP_WRITE(byte, u8, 1)
 PCI_OP_WRITE(word, u16, 2)
 PCI_OP_WRITE(dword, u32, 4)
 
+PCI_OP_READ_EXT(byte, u8, 1)
+PCI_OP_READ_EXT(word, u16, 2)
+PCI_OP_READ_EXT(dword, u32, 4)
+PCI_OP_WRITE_EXT(byte, u8, 1)
+PCI_OP_WRITE_EXT(word, u16, 2)
+PCI_OP_WRITE_EXT(dword, u32, 4)
+
 EXPORT_SYMBOL(pci_bus_read_config_byte);
 EXPORT_SYMBOL(pci_bus_read_config_word);
 EXPORT_SYMBOL(pci_bus_read_config_dword);
Index: linux-2.6.24-rc5/arch/x86/pci/pci.h
===================================================================
--- linux-2.6.24-rc5.orig/arch/x86/pci/pci.h
+++ linux-2.6.24-rc5/arch/x86/pci/pci.h
@@ -32,6 +32,8 @@
 extern unsigned int pci_probe;
 extern unsigned long pirq_table_addr;
 
+extern struct pci_raw_ops *raw_pci_ops_extcfg;
+
 enum pci_bf_sort_state {
 	pci_bf_sort_default,
 	pci_force_nobf,
Index: linux-2.6.24-rc5/arch/x86/pci/init.c
===================================================================
--- linux-2.6.24-rc5.orig/arch/x86/pci/init.c
+++ linux-2.6.24-rc5/arch/x86/pci/init.c
@@ -14,6 +14,16 @@ static __init int pci_access_init(void)
 #ifdef CONFIG_PCI_MMCONFIG
 	pci_mmcfg_init(type);
 #endif
+	/* if we ONLY have MMCONFIG, we need to use it always */
+	if (!raw_pci_ops && raw_pci_ops_extcfg) {
+		printk(KERN_INFO "No direct PCI access, using MMCONFIG always\n");
+		raw_pci_ops = raw_pci_ops_extcfg;
+	}
+
+	/*
+	 * we've found a usable method; this means we can skip
+	 * the potentially dangerous BIOS based methods
+	 */
 	if (raw_pci_ops)
 		return 0;
 #ifdef CONFIG_PCI_BIOS
Index: linux-2.6.24-rc5/drivers/pci/pci-sysfs.c
===================================================================
--- linux-2.6.24-rc5.orig/drivers/pci/pci-sysfs.c
+++ linux-2.6.24-rc5/drivers/pci/pci-sysfs.c
@@ -143,6 +143,35 @@ static ssize_t is_enabled_show(struct de
 	return sprintf (buf, "%u\n", atomic_read(&pdev->enable_cnt));
 }
 
+static ssize_t extended_config_space_store(struct device *dev,
+				struct device_attribute *attr, const char *buf,
+				size_t count)
+{
+	ssize_t result = -EINVAL;
+	struct pci_dev *pdev = to_pci_dev(dev);
+
+	/* this can crash the machine when done on the "wrong" device */
+	if (!capable(CAP_SYS_ADMIN))
+		return count;
+
+	if (*buf == '1') {
+		printk(KERN_WARNING "Application %s enabled extended config space for device %s\n",
+			current->comm,  pci_name(pdev));
+		result = pci_enable_ext_config(pdev);
+	}
+
+	return result < 0 ? result : count;
+}
+
+static ssize_t extended_config_space_show(struct device *dev,
+			       struct device_attribute *attr, char *buf)
+{
+	struct pci_dev *pdev;
+
+	pdev = to_pci_dev(dev);
+	return sprintf(buf, "%d\n", pdev->ext_cfg_space);
+}
+
 #ifdef CONFIG_NUMA
 static ssize_t
 numa_node_show(struct device *dev, struct device_attribute *attr, char *buf)
@@ -206,6 +235,8 @@ struct device_attribute pci_dev_attrs[] 
 	__ATTR_RO(numa_node),
 #endif
 	__ATTR(enable, 0600, is_enabled_show, is_enabled_store),
+	__ATTR(extended_config_space, 0600, extended_config_space_show,
+		extended_config_space_store),
 	__ATTR(broken_parity_status,(S_IRUGO|S_IWUSR),
 		broken_parity_status_show,broken_parity_status_store),
 	__ATTR(msi_bus, 0644, msi_bus_show, msi_bus_store),
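
The opt-in state that the patch adds to struct pci_dev can be modelled
in a few lines of userland C. This is only a sketch of the tri-state
logic and of the read dispatch; struct model_dev, model_enable_ext_config()
and model_read_path() are hypothetical names used for illustration and
are not part of the patch or of the kernel.

```c
/* Userland model only, not kernel code. */
struct model_dev {
	int ext_cfg_space;	/* <0 hard disabled, 0 disabled, >0 enabled */
};

enum model_path { MODEL_READ, MODEL_READEXT };

/* Same return convention as pci_enable_ext_config() in the patch. */
static int model_enable_ext_config(struct model_dev *dev)
{
	if (dev->ext_cfg_space < 0)
		return -1;	/* a quirk vetoed extended access */
	if (dev->ext_cfg_space > 0)
		return 0;	/* already opted in */
	dev->ext_cfg_space = 1;
	return 0;
}

/* Which accessor family a config read would be routed to. */
static enum model_path model_read_path(const struct model_dev *dev)
{
	return dev->ext_cfg_space > 0 ? MODEL_READEXT : MODEL_READ;
}
```

A device that was never opted in, or that a quirk hard-disabled, keeps
going through the traditional accessors; only a successful opt-in moves
it to the extended ones.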

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2007-12-25 11:26 [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in Arjan van de Ven
@ 2007-12-27 11:52 ` Jeff Garzik
  2007-12-27 14:09   ` Arjan van de Ven
  2007-12-27 17:52   ` Linus Torvalds
  2008-01-11 19:02 ` Greg KH
  2008-01-15 12:58 ` Øyvind Vågen Jægtnes
  2 siblings, 2 replies; 125+ messages in thread
From: Jeff Garzik @ 2007-12-27 11:52 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: linux-kernel, Linus Torvalds, gregkh, inux-pci,
	Benjamin Herrenschmidt, Martin Mares, Matthew Wilcox

Arjan van de Ven wrote:
> This patch also adds a sysfs property for each device into which root can
> write a '1' to enable extended configuration space. The kernel will print
> a notice into dmesg when this happens (including the name of the app) so that
> if the system crashes as a result of this action, the user can know what
> action/tool caused it.


Comments:

1) [minor] With a bit in struct pci_dev, there is no need for separate 
raw_pci_ops.  That will simplify your patch, with no functionality change.

"golden" arches (no pun intended) may implement raw_pci_ops that 
_always_ work with extended config space, and simply ignore that bit, if 
that is how their underlying non-mmconfig-nor-type1 hardware is implemented.


2) [non-minor] hmmmm.

	[jgarzik@core ~]$ lspci -n | wc -l
	23

So I would have to perform 23 sysfs twiddles, before I could obtain a 
full and unabridged 'lspci -vvvxxx'?

For the userspace interface, the most-often-used knob for diagnostic 
purposes will be the easiest one.  And that's

	echo 1 > enable-ext-cfg-space-for-all-buses-ACPI-says-to
	lspci -vvvxxx


3) [minor] architectures must be able to override 
pci_enable_ext_config().  see "golden arches".





^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2007-12-27 11:52 ` Jeff Garzik
@ 2007-12-27 14:09   ` Arjan van de Ven
  2007-12-27 17:52   ` Linus Torvalds
  1 sibling, 0 replies; 125+ messages in thread
From: Arjan van de Ven @ 2007-12-27 14:09 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: linux-kernel, Linus Torvalds, gregkh, inux-pci,
	Benjamin Herrenschmidt, Martin Mares, Matthew Wilcox

On Thu, 27 Dec 2007 06:52:35 -0500
Jeff Garzik <jeff@garzik.org> wrote:

> Arjan van de Ven wrote:
> > This patch also adds a sysfs property for each device into which
> > root can write a '1' to enable extended configuration space. The
> > kernel will print a notice into dmesg when this happens (including
> > the name of the app) so that if the system crashes as a result of
> > this action, the user can know what action/tool caused it.
> 
> 
> Comments:
> 
> 1) [minor] With a bit in struct pci_dev, 

I have this

> there is no need for
> separate raw_pci_ops.  That will simplify your patch, with no
> functionality change.

but sadly your second statement is not correct. Part of the complication is that all PCI config ops
operate on busses, not devices; at first I thought "just add a bit and be done with it", but sadly it's
not quite that simple. Due to the per-bus nature of the ops, you end up having 2 types of bus operations,
and that's just boilerplate (prototypes, exports and stuff), but it makes up most of the lines of the patch.

In addition, a separate raw_pci_ops (for x86 only!) is needed anyway since it's quite likely that
we'll have various options for each case (extended or not) and we want to pick the best one for each case,
at which point you really do need the 2 variables.

> 
> "golden" arches (no pun intended) may implement raw_pci_ops that 
> _always_ work with extended config space, and simply ignore that bit,
> if that is how their underlying non-mmconfig-nor-type1 hardware is
> implemented.

that is what I implemented already in the patch that you commented on ;-)

> 
> 
> 2) [non-minor] hmmmm.
> 
> 	[jgarzik@core ~]$ lspci -n | wc -l
> 	23
> 
> So I would have to perform 23 sysfs twiddles, before I could obtain a 
> full and unabridged 'lspci -vvvxxx'?

not you as a human, but "lspci" ought to, yes.

> 
> For the userspace interface, the most-often-used knob for diagnostic 
> purposes will be the easiest one.  And that's

the easiest one is an option to lspci. Nothing more, nothing less.

Making a global knob in kernel space is a lot more tricky, and in addition
there are really enough cases where userspace wants just the one device anyway.
Doing the "for each device I'm about to dump" in lspci is pretty much as hard as doing
the global one (if not simpler).

> 
> 3) [minor] architectures must be able to override 
> pci_enable_ext_config().  see "golden arches".

see the patch. All pci_enable_ext_config() does is set a flag.
The architecture decides what to do with that flag. Golden
architectures can just totally ignore the flag and always expose
the full space. 
(In fact, the patch assumes all-but-x86 to be golden here; which is fair)



-- 
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2007-12-27 11:52 ` Jeff Garzik
  2007-12-27 14:09   ` Arjan van de Ven
@ 2007-12-27 17:52   ` Linus Torvalds
  1 sibling, 0 replies; 125+ messages in thread
From: Linus Torvalds @ 2007-12-27 17:52 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Arjan van de Ven, linux-kernel, gregkh, inux-pci,
	Benjamin Herrenschmidt, Martin Mares, Matthew Wilcox



On Thu, 27 Dec 2007, Jeff Garzik wrote:
> 
> 2) [non-minor] hmmmm.
> 
> 	[jgarzik@core ~]$ lspci -n | wc -l
> 	23
> 
> So I would have to perform 23 sysfs twiddles, before I could obtain a full and
> unabridged 'lspci -vvvxxx'?

Or you force it on with "pci=mmconfig" or something at boot-time.

But yes. The *fact* is that MMCONFIG has not just been globally broken, 
but broken on a per-device basis. I don't know why (and quite frankly, I 
doubt anybody does), but the PCI device ID corruption happened only for a 
specific set of devices.

Whether it was a timing issue with particular devices or whether it was a 
timing issue with some particular bridge (and could affect any devices 
behind that bridge), who knows... It almost certainly was brought on by a 
borderline (or broken) northbridge, but it apparently only affected 
specific devices - which makes me suspect that it wasn't *entirely* due to 
just the northbridge, and it was a combination of things.

I don't understand why you cannot seem to accept that per-device thing, in 
the face of clear data that yes, it really *is* per-device. Not to mention 
the fact that the way MMIO config setups work, you may well have entire 
buses that simply aren't accessible with MMIO config at all (because the 
MMIO config window is not large enough).
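
To make the window-size point concrete, here is a small sketch of the
standard PCI Express memory-mapped config layout, in which each bus
takes 1 MiB (32 devices x 8 functions x 4 KiB). It assumes a window
that starts at bus 0; ecam_offset() and ecam_bus_reachable() are
illustrative names, not kernel functions.

```c
#include <stdint.h>

/*
 * Byte offset of a config register from the start of a memory-mapped
 * config window that is assumed to begin at bus 0.
 */
static uint64_t ecam_offset(unsigned int bus, unsigned int devfn,
			    unsigned int reg)
{
	return ((uint64_t)(bus & 0xff) << 20) |
	       ((uint64_t)(devfn & 0xff) << 12) |
	       (reg & 0xfff);
}

/* A bus is reachable only if its whole 1 MiB slice fits in the window. */
static int ecam_bus_reachable(uint64_t window_size, unsigned int bus)
{
	return (((uint64_t)bus + 1) << 20) <= window_size;
}
```

With a 64 MiB window, for example, buses 0 through 63 fit and bus 64
does not, so devices on bus 64 and above cannot be reached this way.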

Furthermore, please accept the fact that of those 23 devices, exactly 
*none* will actually care. So yes, you'd have to enable it manually for 
those individual devices, but that's only if you want to do something 
totally pointless in the first place.

So stop this totally inane "it has to be global" crap. It doesn't have to 
be global at all, and we have hard data showing that it really SHOULD NOT 
be a global flag.

		Linus

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2007-12-25 11:26 [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in Arjan van de Ven
  2007-12-27 11:52 ` Jeff Garzik
@ 2008-01-11 19:02 ` Greg KH
  2008-01-11 19:09   ` Arjan van de Ven
                     ` (2 more replies)
  2008-01-15 12:58 ` Øyvind Vågen Jægtnes
  2 siblings, 3 replies; 125+ messages in thread
From: Greg KH @ 2008-01-11 19:02 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: linux-kernel, Jeff Garzik, Linus Torvalds, gregkh, inux-pci,
	Benjamin Herrenschmidt, Martin Mares, Matthew Wilcox

On Tue, Dec 25, 2007 at 03:26:05AM -0800, Arjan van de Ven wrote:
> 
> This patch also adds a sysfs property for each device into which root can
> write a '1' to enable extended configuration space. The kernel will print
> a notice into dmesg when this happens (including the name of the app) so that
> if the system crashes as a result of this action, the user can know what
> action/tool caused it.

Can you send me a follow-on patch that documents this in
Documentation/ABI please.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-11 19:02 ` Greg KH
@ 2008-01-11 19:09   ` Arjan van de Ven
  2008-01-11 19:14     ` Greg KH
  2008-01-11 19:28   ` Matthew Wilcox
  2008-01-11 19:54   ` Arjan van de Ven
  2 siblings, 1 reply; 125+ messages in thread
From: Arjan van de Ven @ 2008-01-11 19:09 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-kernel, Jeff Garzik, Linus Torvalds, gregkh, inux-pci,
	Benjamin Herrenschmidt, Martin Mares, Matthew Wilcox

On Fri, 11 Jan 2008 11:02:29 -0800
Greg KH <greg@kroah.com> wrote:

> On Tue, Dec 25, 2007 at 03:26:05AM -0800, Arjan van de Ven wrote:
> > 
> > This patch also adds a sysfs property for each device into which
> > root can write a '1' to enable extended configuration space. The
> > kernel will print a notice into dmesg when this happens (including
> > the name of the app) so that if the system crashes as a result of
> > this action, the user can know what action/tool caused it.
> 
> Can you send me a follow-on patch that documents this in
> Documentation/ABI please.
> 

once it's stable enough, say after 1 kernel release, sure


^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-11 19:09   ` Arjan van de Ven
@ 2008-01-11 19:14     ` Greg KH
  0 siblings, 0 replies; 125+ messages in thread
From: Greg KH @ 2008-01-11 19:14 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Greg KH, linux-kernel, Jeff Garzik, Linus Torvalds, inux-pci,
	Benjamin Herrenschmidt, Martin Mares, Matthew Wilcox

On Fri, Jan 11, 2008 at 11:09:31AM -0800, Arjan van de Ven wrote:
> On Fri, 11 Jan 2008 11:02:29 -0800
> Greg KH <greg@kroah.com> wrote:
> 
> > On Tue, Dec 25, 2007 at 03:26:05AM -0800, Arjan van de Ven wrote:
> > > 
> > > This patch also adds a sysfs property for each device into which
> > > root can write a '1' to enable extended configuration space. The
> > > kernel will print a notice into dmesg when this happens (including
> > > the name of the app) so that if the system crashes as a result of
> > > this action, the user can know what action/tool caused it.
> > 
> > Can you send me a follow-on patch that documents this in
> > Documentation/ABI please.
> > 
> 
> once it's stable enough, say after 1 kernel release, sure

That's what the Documentation/ABI/testing section is for.  If you add
something new, it needs to be documented now, otherwise it will be
forgotten.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-11 19:02 ` Greg KH
  2008-01-11 19:09   ` Arjan van de Ven
@ 2008-01-11 19:28   ` Matthew Wilcox
  2008-01-11 19:40     ` Arjan van de Ven
  2008-01-11 19:54   ` Arjan van de Ven
  2 siblings, 1 reply; 125+ messages in thread
From: Matthew Wilcox @ 2008-01-11 19:28 UTC (permalink / raw)
  To: Greg KH
  Cc: Arjan van de Ven, linux-kernel, Jeff Garzik, Linus Torvalds,
	gregkh, inux-pci, Benjamin Herrenschmidt, Martin Mares

On Fri, Jan 11, 2008 at 11:02:29AM -0800, Greg KH wrote:
> Can you send me a follow-on patch that documents this in
> Documentation/ABI please.

Greg, if you integrate Ivan's patch, you don't need Arjan's patch.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-11 19:28   ` Matthew Wilcox
@ 2008-01-11 19:40     ` Arjan van de Ven
  2008-01-11 19:45       ` Greg KH
  0 siblings, 1 reply; 125+ messages in thread
From: Arjan van de Ven @ 2008-01-11 19:40 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Greg KH, linux-kernel, Jeff Garzik, Linus Torvalds, gregkh,
	inux-pci, Benjamin Herrenschmidt, Martin Mares

On Fri, 11 Jan 2008 12:28:20 -0700
Matthew Wilcox <matthew@wil.cx> wrote:

> On Fri, Jan 11, 2008 at 11:02:29AM -0800, Greg KH wrote:
> > Can you send me a follow-on patch that documents this in
> > Documentation/ABI please.
> 
> Greg, if you integrate Ivan's patch, you don't need Arjan's patch.
> 

Personally I absolutely don't agree with that.
Ivan's patch is another attempt to make MMCONFIG work somewhat better,
but does not provide the explicit opt-in that I think is required at
this point; people have tried to get MMCONFIG stable for a really long time,
and have still failed up to today. At least my patience is up and this needs
to be opt-in.




^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-11 19:40     ` Arjan van de Ven
@ 2008-01-11 19:45       ` Greg KH
  2008-01-11 19:49         ` Matthew Wilcox
  0 siblings, 1 reply; 125+ messages in thread
From: Greg KH @ 2008-01-11 19:45 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Matthew Wilcox, Greg KH, linux-kernel, Jeff Garzik,
	Linus Torvalds, linux-pci, Benjamin Herrenschmidt, Martin Mares

On Fri, Jan 11, 2008 at 11:40:02AM -0800, Arjan van de Ven wrote:
> On Fri, 11 Jan 2008 12:28:20 -0700
> Matthew Wilcox <matthew@wil.cx> wrote:
> 
> > On Fri, Jan 11, 2008 at 11:02:29AM -0800, Greg KH wrote:
> > > Can you send me a follow-on patch that documents this in
> > > Documentation/ABI please.
> > 
> > Greg, if you integrate Ivan's patch, you don't need Arjan's patch.
> > 
> 
> Personally I absolutely don't agree with that.
> Ivan's patch is another attempt to make MMCONFIG work somewhat better,
> but does not provide the explicit opt-in that I think is required at
> this point; people have tried to get MMCONFIG stable for a really long time,
> and have still failed up to today. At least my patience is up and this needs
> to be opt-in.

I think I agree with Arjan here, Ivan's patch should also work on top of
this one, and will help out some machines.

But as he hasn't asked for it to be included in the kernel tree, that's
a moot point right now :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-11 19:45       ` Greg KH
@ 2008-01-11 19:49         ` Matthew Wilcox
  2008-01-11 19:58           ` Linus Torvalds
  0 siblings, 1 reply; 125+ messages in thread
From: Matthew Wilcox @ 2008-01-11 19:49 UTC (permalink / raw)
  To: Greg KH
  Cc: Arjan van de Ven, Greg KH, linux-kernel, Jeff Garzik,
	Linus Torvalds, linux-pci, Benjamin Herrenschmidt, Martin Mares

On Fri, Jan 11, 2008 at 11:45:24AM -0800, Greg KH wrote:
> On Fri, Jan 11, 2008 at 11:40:02AM -0800, Arjan van de Ven wrote:
> > Personally I absolutely don't agree with that.
> > Ivan's patch is another attempt to make MMCONFIG work somewhat better,
> > but does not provide the explicit opt-in that I think is required at
> > this point; people have tried to get MMCONFIG stable for a really long time,
> > and failed still up to today. At least my patience is up and this needs
> > to be opt-in.

So your argument is that MMCONFIG sucks, therefore Linux has to have a
horrible interface to extended PCI config space?

> I think I agree with Arjan here, Ivan's patch should also work on top of
> this one, and will help out some machines.
> 
> But as he hasn't asked for it to be included in the kernel tree, that's
> a moot point right now :)

He didn't?  I certainly ask for it to be included.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-11 19:02 ` Greg KH
  2008-01-11 19:09   ` Arjan van de Ven
  2008-01-11 19:28   ` Matthew Wilcox
@ 2008-01-11 19:54   ` Arjan van de Ven
  2008-01-11 20:55     ` Greg KH
  2 siblings, 1 reply; 125+ messages in thread
From: Arjan van de Ven @ 2008-01-11 19:54 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-kernel, Jeff Garzik, Linus Torvalds, gregkh, inux-pci,
	Benjamin Herrenschmidt, Martin Mares, Matthew Wilcox

On Fri, 11 Jan 2008 11:02:29 -0800
Greg KH <greg@kroah.com> wrote:

> On Tue, Dec 25, 2007 at 03:26:05AM -0800, Arjan van de Ven wrote:
> > 
> > This patch also adds a sysfs property for each device into which
> > root can write a '1' to enable extended configuration space. The
> > kernel will print a notice into dmesg when this happens (including
> > the name of the app) so that if the system crashes as a result of
> > this action, the user can know what action/tool caused it.
> 
> Can you send me a follow-on patch that documents this in
> Documentation/ABI please.
> 

---
 Documentation/ABI/testing/sysfs-pci-extended-config |   39 ++++++++++++++++++++
 1 file changed, 39 insertions(+)

Index: linux-2.6.24-rc7/Documentation/ABI/testing/sysfs-pci-extended-config
===================================================================
--- /dev/null
+++ linux-2.6.24-rc7/Documentation/ABI/testing/sysfs-pci-extended-config
@@ -0,0 +1,39 @@
+What:		/sys/devices/pci<bus>/<device>/extended_config_space
+Date:		January 11, 2008
+Contact:	Arjan van de Ven <arjan@linux.intel.com>
+Description:
+		This attribute is for use by system-diagnostic software
+		only.
+
+		The kernel may decide to restrict PCI configuration space
+		access for userspace to the first 64 or 256 bytes by
+		default, for stability reasons. This attribute, when
+		present, can be used to request access to the full
+		4Kb from the kernel.
+
+		Request to get access to the full 4Kb can be done by
+		writing a '1' into this attribute file. All other values
+		are reserved for future use and should not be used by
+		software at this point.
+
+		The kernel may log the request to the various kernel
+		logging services. The kernel may decide to ignore the
+		request if the kernel deems extended configuration space
+		access not reliable enough for the system or the device.
+		The kernel may decide to not present this attribute
+		if the kernel decides extended config space is reliable
+		and made available by default, or if the kernel decides
+		that extended configuration space will never be
+		accessible.
+
+		Software needs to gracefully deal with getting the
+		access not granted. Software also needs to gracefully deal
+		with this attribute not being present.
+
+		Due to the fragility of extended configuration space,
+		system diagnostic software should only set this attribute
+		on explicit user request, or in the case of GUI-like tools,
+		at least with explicit user permission.
+
+
+


-- 
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings, visit http://www.lesswatts.org

^ permalink raw reply	[flat|nested] 125+ messages in thread
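
A minimal userspace sketch of the opt-in described in the ABI text above,
assuming the caller already knows the full path of the device's
extended_config_space attribute. The function and enum names are
hypothetical. The ABI text does not say how a refused request is reported,
so this sketch treats any open or write error as a refusal and a missing
file as "attribute not present"; a successful write only means the request
was delivered, not that access was granted.

```c
#include <errno.h>
#include <stdio.h>

/* Outcome of asking the kernel for extended config space access. */
enum ext_cfg_result {
	EXT_CFG_REQUESTED,	/* the '1' was written without error */
	EXT_CFG_ABSENT,		/* attribute file does not exist */
	EXT_CFG_REFUSED		/* open or write failed for another reason */
};

/*
 * attr_path is the full path of a device's extended_config_space
 * attribute; the caller builds it from the sysfs device directory.
 */
static enum ext_cfg_result request_ext_config(const char *attr_path)
{
	FILE *f = fopen(attr_path, "w");

	if (!f)
		return errno == ENOENT ? EXT_CFG_ABSENT : EXT_CFG_REFUSED;

	/* '1' is the only defined value; everything else is reserved. */
	if (fputc('1', f) == EOF) {
		fclose(f);
		return EXT_CFG_REFUSED;
	}

	/* Buffered writes surface their errors at flush/close time. */
	if (fclose(f) != 0)
		return EXT_CFG_REFUSED;

	return EXT_CFG_REQUESTED;
}
```

A diagnostic tool would call this only after the user explicitly asked for
extended access, and would carry on with the 256-byte view on either
EXT_CFG_ABSENT or EXT_CFG_REFUSED.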

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-11 19:49         ` Matthew Wilcox
@ 2008-01-11 19:58           ` Linus Torvalds
  2008-01-11 20:17             ` Matthew Wilcox
  0 siblings, 1 reply; 125+ messages in thread
From: Linus Torvalds @ 2008-01-11 19:58 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Greg KH, Arjan van de Ven, Greg KH, linux-kernel, Jeff Garzik,
	linux-pci, Benjamin Herrenschmidt, Martin Mares



On Fri, 11 Jan 2008, Matthew Wilcox wrote:
> 
> So your argument is that MMCONFIG sucks, therefore Linux has to have a
> horrible interface to extended PCI config space?

What's *your* point?

MMCONFIG is known broken. If we ever start enabling it more (ie start 
using it even if it's not reserved in the e820 tables), all that known 
breakage will come and bite us in the *ss.

We need to have some armor-plated underwear to protect against that 
ass-biting, and that's what Arjan's patch is. 

Tell me what *other* armor plating you could have that actually works?

		Linus

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-11 19:58           ` Linus Torvalds
@ 2008-01-11 20:17             ` Matthew Wilcox
  2008-01-11 20:27               ` Linus Torvalds
  0 siblings, 1 reply; 125+ messages in thread
From: Matthew Wilcox @ 2008-01-11 20:17 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Greg KH, Arjan van de Ven, Greg KH, linux-kernel, Jeff Garzik,
	linux-pci, Benjamin Herrenschmidt, Martin Mares

On Fri, Jan 11, 2008 at 11:58:23AM -0800, Linus Torvalds wrote:
> On Fri, 11 Jan 2008, Matthew Wilcox wrote:
> > 
> > So your argument is that MMCONFIG sucks, therefore Linux has to have a
> > horrible interface to extended PCI config space?
> 
> What's *your* point?
> 
> MMCONFIG is known broken. If we ever start enabling it more (ie start 
> using it even if it's not reserved in the e820 tables), all that known 
> breakage will come and bite us in the *ss.

Ivan's patch doesn't start enabling MMCONFIG in more places than we
currently do.  It makes us use conf1 accesses for all accesses below
256 bytes.  That fixes all known problems to date.

> We need to have some armor-plated underwear to protect against that 
> ass-biting, and that's what Arjan's patch is. 
> 
> Tell me what *other* armor plating you could have that actually works?

The armour plating that already exists -- pci=nommconf.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 125+ messages in thread
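
For readers unfamiliar with the two access methods being compared, here is
a sketch of how each one addresses a register. It shows the standard
layouts as commonly documented for PCI and PCI Express, and is not code
from either patch; the function names are made up for illustration. conf1
can only name 256 bytes per function, while the memory-mapped window
gives each function 4 KB.

```c
#include <stdint.h>

/* I/O ports used by configuration mechanism #1. */
#define CONF1_ADDR_PORT 0xCF8
#define CONF1_DATA_PORT 0xCFC

/*
 * Value written to the address port to select a dword-aligned register
 * in the 256-byte legacy space. Callers must ensure reg < 256.
 */
static uint32_t conf1_address(unsigned bus, unsigned devfn, unsigned reg)
{
	return 0x80000000u | (bus << 16) | (devfn << 8) | (reg & 0xFCu);
}

/*
 * Byte offset of a register inside the memory-mapped window: each
 * function gets 4 KB, so reg may be anything from 0 to 4095.
 */
static uint32_t mmconfig_offset(unsigned bus, unsigned devfn, unsigned reg)
{
	return (bus << 20) | (devfn << 12) | reg;
}
```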

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-11 20:17             ` Matthew Wilcox
@ 2008-01-11 20:27               ` Linus Torvalds
  2008-01-11 20:42                 ` Matthew Wilcox
  0 siblings, 1 reply; 125+ messages in thread
From: Linus Torvalds @ 2008-01-11 20:27 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Greg KH, Arjan van de Ven, Greg KH, linux-kernel, Jeff Garzik,
	linux-pci, Benjamin Herrenschmidt, Martin Mares



On Fri, 11 Jan 2008, Matthew Wilcox wrote:
> 
> Ivan's patch doesn't start enabling MMCONFIG in more places than we
> currently do.  It makes us use conf1 accesses for all accesses below
> 256 bytes.  That fixes all known problems to date.

.. and I agree with that patch. But there will be people who try to access 
extended space by mistake, and they'll have a hard-locked machine or 
something.

> > Tell me what *other* armor plating you could have that actually works?
> 
> The armour plating that already exists -- pci=nommconf.

No. It needs to be automatic, OR THE OTHER WAY AROUND.

Ie we disable the unsafe feature on purpose, and then force people who 
access it to do so *consciously*.

Extended config space is different, for chissake! It's not even like it's 
just a bigger normal config space where normal config accesses just 
overflow into it. It really does have different rules etc.

			Linus

^ permalink raw reply	[flat|nested] 125+ messages in thread
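
The "disabled on purpose, enabled consciously" policy argued for above can
be sketched as a per-device gate. Everything here is hypothetical: the
struct, the function names and the flag are illustrations of the idea, not
the interface from Arjan's patch.

```c
#include <stdbool.h>

#define LEGACY_CFG_SIZE   256	/* reachable with conf1 */
#define EXTENDED_CFG_SIZE 4096	/* needs MMCONFIG on PCs */

/* Hypothetical per-device state; not the structure used by the patch. */
struct cfg_dev {
	bool ext_cfg_enabled;	/* set only by an explicit opt-in */
};

/* Called by a driver (or on admin request) that really wants 4 KB. */
static void cfg_dev_opt_in(struct cfg_dev *dev)
{
	dev->ext_cfg_enabled = true;
}

/* Returns true if an access of len bytes at reg should be let through. */
static bool cfg_access_allowed(const struct cfg_dev *dev,
			       unsigned reg, unsigned len)
{
	unsigned limit = dev->ext_cfg_enabled ? EXTENDED_CFG_SIZE
					      : LEGACY_CFG_SIZE;

	return len != 0 && reg < limit && len <= limit - reg;
}
```

With a gate like this, an accidental access at offset 256 or above is
rejected before any memory-mapped cycle is issued, unless someone opted in
for that one device.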

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-11 20:27               ` Linus Torvalds
@ 2008-01-11 20:42                 ` Matthew Wilcox
  2008-01-11 21:12                   ` Linus Torvalds
  0 siblings, 1 reply; 125+ messages in thread
From: Matthew Wilcox @ 2008-01-11 20:42 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Greg KH, Arjan van de Ven, Greg KH, linux-kernel, Jeff Garzik,
	linux-pci, Benjamin Herrenschmidt, Martin Mares

On Fri, Jan 11, 2008 at 12:27:06PM -0800, Linus Torvalds wrote:
> On Fri, 11 Jan 2008, Matthew Wilcox wrote:
> > 
> > Ivan's patch doesn't start enabling MMCONFIG in more places than we
> > currently do.  It makes us use conf1 accesses for all accesses below
> > 256 bytes.  That fixes all known problems to date.
> 
> .. and I agree with that patch. But there will be people who try to access 
> extended space by mistake, and they'll have a hard-locked machine or 
> something.

But they can't.  We limit the size they can access to 256 bytes, unless
the kernel probed address 256 and it worked.

> > The armour plating that already exists -- pci=nommconf.
> 
> No. It needs to be automatic, OR THE OTHER WAY AROUND.
> 
> Ie we disable the unsafe feature on purpose, and then force people who 
> access it to do so *consciously*.

I'd be fine with making mmconfig off by default.  Make people pass
pci=mmconf to activate it.

> Extended config space is different, for chissake! It's not even like it's 
> just a bigger normal config space where normal config accesses just 
> overflow into it. It really does have different rules etc.

Yes, but it's also important to enable some of the PCIe features.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 125+ messages in thread
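
The size limit mentioned above ("256 bytes, unless the kernel probed
address 256 and it worked") amounts to a decision like the following. This
is a sketch under assumptions: the function name is made up, and the
all-ones check reflects the usual convention that a read from a
non-responding register returns 0xffffffff.

```c
#include <stdint.h>

#define CFG_SPACE_SIZE     256
#define CFG_SPACE_EXP_SIZE 4096

/*
 * Decide how much config space to expose for one device, given the
 * result of a single dword read at offset 256 (the first extended
 * register). read_err is nonzero if the read itself failed.
 */
static unsigned cfg_size_from_probe(int read_err, uint32_t val)
{
	if (read_err)
		return CFG_SPACE_SIZE;
	if (val == 0xffffffffu)
		return CFG_SPACE_SIZE;	/* nothing answered at offset 256 */
	return CFG_SPACE_EXP_SIZE;
}
```

Note that this only classifies the outcome of the probe read. It does
nothing about the concern, raised later in the thread, that the probe read
itself could hang a machine whose MMCONFIG is broken.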

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-11 19:54   ` Arjan van de Ven
@ 2008-01-11 20:55     ` Greg KH
  0 siblings, 0 replies; 125+ messages in thread
From: Greg KH @ 2008-01-11 20:55 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: linux-kernel, Jeff Garzik, Linus Torvalds, gregkh, inux-pci,
	Benjamin Herrenschmidt, Martin Mares, Matthew Wilcox

On Fri, Jan 11, 2008 at 11:54:56AM -0800, Arjan van de Ven wrote:
> On Fri, 11 Jan 2008 11:02:29 -0800
> Greg KH <greg@kroah.com> wrote:
> 
> > On Tue, Dec 25, 2007 at 03:26:05AM -0800, Arjan van de Ven wrote:
> > > 
> > > This patch also adds a sysfs property for each device into which
> > > root can write a '1' to enable extended configuration space. The
> > > kernel will print a notice into dmesg when this happens (including
> > > the name of the app) so that if the system crashes as a result of
> > > this action, the user can know what action/tool caused it.
> > 
> > Can you send me a follow-on patch that documents this in
> > Documentation/ABI please.
> > 
> 
> ---
>  Documentation/ABI/testing/sysfs-pci-extended-config |   39 ++++++++++++++++++++
>  1 file changed, 39 insertions(+)

Thanks, I've merged this with the original one.

greg k-h

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-11 20:42                 ` Matthew Wilcox
@ 2008-01-11 21:12                   ` Linus Torvalds
  2008-01-11 21:17                     ` Matthew Wilcox
  0 siblings, 1 reply; 125+ messages in thread
From: Linus Torvalds @ 2008-01-11 21:12 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Greg KH, Arjan van de Ven, Greg KH, linux-kernel, Jeff Garzik,
	linux-pci, Benjamin Herrenschmidt, Martin Mares



On Fri, 11 Jan 2008, Matthew Wilcox wrote:
> 
> But they can't.  We limit the size they can access to 256 bytes, unless
> the kernel probed address 256 and it worked.

Umm. Probing address 256 (or *any* address) using MMCONFIG will simply 
lock up the machine. HARD.

What's so hard to understand about MMCONFIG being broken on certain 
hardware?

		Linus

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-11 21:12                   ` Linus Torvalds
@ 2008-01-11 21:17                     ` Matthew Wilcox
  2008-01-11 21:28                       ` Linus Torvalds
  0 siblings, 1 reply; 125+ messages in thread
From: Matthew Wilcox @ 2008-01-11 21:17 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Greg KH, Arjan van de Ven, Greg KH, linux-kernel, Jeff Garzik,
	linux-pci, Benjamin Herrenschmidt, Martin Mares

On Fri, Jan 11, 2008 at 01:12:12PM -0800, Linus Torvalds wrote:
> 
> 
> On Fri, 11 Jan 2008, Matthew Wilcox wrote:
> > 
> > But they can't.  We limit the size they can access to 256 bytes, unless
> > the kernel probed address 256 and it worked.
> 
> Umm. Probing address 256 (or *any* address) using MMCONFIG will simply 
> lock up the machine. HARD.

Did I miss a bug report?  The only problems I'm currently aware of are
the ones where using MMCONFIG during BAR probing causes a hard lockup on
some Intel machines, and the ones where we get bad config data on some
AMD machines due to the configuration retry status being mishandled.

All the other lockups I'm aware of are already handled by the existing
checks.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-11 21:17                     ` Matthew Wilcox
@ 2008-01-11 21:28                       ` Linus Torvalds
  2008-01-11 21:38                         ` Matthew Wilcox
  0 siblings, 1 reply; 125+ messages in thread
From: Linus Torvalds @ 2008-01-11 21:28 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Greg KH, Arjan van de Ven, Greg KH, linux-kernel, Jeff Garzik,
	linux-pci, Benjamin Herrenschmidt, Martin Mares



On Fri, 11 Jan 2008, Matthew Wilcox wrote:
> 
> Did I miss a bug report?  The only problems I'm currently aware of are
> the ones where using MMCONFIG during BAR probing causes a hard lockup on
> some Intel machines, and the ones where we get bad config data on some
> AMD machines due to the configuration retry status being mishandled.

Hmm. Were all those reports root-caused to just that BAR probing? If so, 
we may be in better shape than I worried.

		Linus

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-11 21:28                       ` Linus Torvalds
@ 2008-01-11 21:38                         ` Matthew Wilcox
  2008-01-11 23:58                           ` Ivan Kokshaysky
  0 siblings, 1 reply; 125+ messages in thread
From: Matthew Wilcox @ 2008-01-11 21:38 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Greg KH, Arjan van de Ven, Greg KH, linux-kernel, Jeff Garzik,
	linux-pci, Benjamin Herrenschmidt, Martin Mares

On Fri, Jan 11, 2008 at 01:28:30PM -0800, Linus Torvalds wrote:
> 
> 
> On Fri, 11 Jan 2008, Matthew Wilcox wrote:
> > 
> > Did I miss a bug report?  The only problems I'm currently aware of are
> > the ones where using MMCONFIG during BAR probing causes a hard lockup on
> > some Intel machines, and the ones where we get bad config data on some
> > AMD machines due to the configuration retry status being mishandled.
> 
> Hmm. Were all those reports root-caused to just that BAR probing? If so, 
> we may be in better shape than I worried.

I believe so.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-11 21:38                         ` Matthew Wilcox
@ 2008-01-11 23:58                           ` Ivan Kokshaysky
  2008-01-12  0:17                             ` Jesse Barnes
  2008-01-12  0:26                             ` Greg KH
  0 siblings, 2 replies; 125+ messages in thread
From: Ivan Kokshaysky @ 2008-01-11 23:58 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Linus Torvalds, Greg KH, Arjan van de Ven, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Benjamin Herrenschmidt, Martin Mares

On Fri, Jan 11, 2008 at 02:38:03PM -0700, Matthew Wilcox wrote:
> On Fri, Jan 11, 2008 at 01:28:30PM -0800, Linus Torvalds wrote:
> > Hmm. Were all those reports root-caused to just that BAR probing? If so, 
> > we may be in better shape than I worried.
> 
> I believe so.

Ditto.

One typical problem is that on "Intel(r) 3 Series Express Chipset Family"
MMCONFIG probing of the BAR #2 (frame buffer address) of integrated graphics
device locks up the machine (depending on BIOS settings, of course).
This happens because the frame buffer of IGD has higher decode priority
than MMCONFIG range, as stated in Intel docs...

Ivan.

^ permalink raw reply	[flat|nested] 125+ messages in thread
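
One reading of the lockup described above, stated here as an assumption
rather than something the thread spells out: BAR sizing writes all ones to
the BAR, so for a moment the device decodes a window at the top of the
32-bit address space, and if that window covers the MMCONFIG range the
next memory-mapped config access lands in the frame buffer instead. The
sketch below shows the standard size computation and an overlap test; the
function names and the example addresses in the usage note are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Size of a 32-bit memory BAR, given the value read back after writing
 * all ones to it. The low four bits are flag bits, not address bits.
 */
static uint32_t mem_bar_size(uint32_t readback)
{
	uint32_t mask = readback & ~0xFu;

	return mask ? ~mask + 1 : 0;	/* 0 means the BAR is unimplemented */
}

/* True if [a, a + a_len) and [b, b + b_len) share at least one byte. */
static bool ranges_overlap(uint64_t a, uint64_t a_len,
			   uint64_t b, uint64_t b_len)
{
	return a_len && b_len && a < b + b_len && b < a + a_len;
}
```

For example, a 256 MB BAR reads back 0xF0000008 during sizing, so it
temporarily decodes 0xF0000000 to 0xFFFFFFFF; with a (hypothetical)
MMCONFIG window also placed at 0xF0000000, ranges_overlap() reports a
conflict. Doing the sizing through conf1 avoids depending on MMCONFIG
while the BAR holds that temporary value.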

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-11 23:58                           ` Ivan Kokshaysky
@ 2008-01-12  0:17                             ` Jesse Barnes
  2008-01-12  0:26                             ` Greg KH
  1 sibling, 0 replies; 125+ messages in thread
From: Jesse Barnes @ 2008-01-12  0:17 UTC (permalink / raw)
  To: linux-pci
  Cc: Ivan Kokshaysky, Matthew Wilcox, Linus Torvalds, Greg KH,
	Arjan van de Ven, Greg KH, linux-kernel, Jeff Garzik,
	Benjamin Herrenschmidt, Martin Mares

On Friday, January 11, 2008 3:58 Ivan Kokshaysky wrote:
> On Fri, Jan 11, 2008 at 02:38:03PM -0700, Matthew Wilcox wrote:
> > On Fri, Jan 11, 2008 at 01:28:30PM -0800, Linus Torvalds wrote:
> > > Hmm. Were all those reports root-caused to just that BAR probing?
> > > If so, we may be in better shape than I worried.
> >
> > I believe so.
>
> Ditto.
>
> One typical problem is that on "Intel(r) 3 Series Express Chipset
> Family" MMCONFIG probing of the BAR #2 (frame buffer address) of
> integrated graphics device locks up the machine (depending on BIOS
> settings, of course). This happens because the frame buffer of IGD
> has higher decode priority than MMCONFIG range, as stated in Intel
> docs...

Yeah, I'm only aware of 3:
  - the BAR overlapping w/MMCONFIG problem described above
  - ATI chipset config space retry bug
  - VIA (?) chipset host bridges don't respond well to having decode
    disabled (they stop decoding RAM addresses as well)

That's it afaik, so I've never really known where Linus' paranoia comes 
from.  OTOH I haven't been too keen to challenge it either; MMCONFIG 
space is only just beginning to be tested widely with the deployment of 
Vista, so we'll doubtless see more problems on older chipsets if we 
enable it by default.

Jesse

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-11 23:58                           ` Ivan Kokshaysky
  2008-01-12  0:17                             ` Jesse Barnes
@ 2008-01-12  0:26                             ` Greg KH
  2008-01-12 14:40                               ` Ivan Kokshaysky
  1 sibling, 1 reply; 125+ messages in thread
From: Greg KH @ 2008-01-12  0:26 UTC (permalink / raw)
  To: Ivan Kokshaysky
  Cc: Matthew Wilcox, Linus Torvalds, Greg KH, Arjan van de Ven,
	linux-kernel, Jeff Garzik, linux-pci, Benjamin Herrenschmidt,
	Martin Mares, Tony Camuso

On Sat, Jan 12, 2008 at 02:58:56AM +0300, Ivan Kokshaysky wrote:
> On Fri, Jan 11, 2008 at 02:38:03PM -0700, Matthew Wilcox wrote:
> > On Fri, Jan 11, 2008 at 01:28:30PM -0800, Linus Torvalds wrote:
> > > Hmm. Were all those reports root-caused to just that BAR probing? If so, 
> > > we may be in better shape than I worried.
> > 
> > I believe so.
> 
> Ditto.
> 
> One typical problem is that on "Intel(r) 3 Series Express Chipset Family"
> MMCONFIG probing of the BAR #2 (frame buffer address) of integrated graphics
> device locks up the machine (depending on BIOS settings, of course).
> This happens because the frame buffer of IGD has higher decode priority
> than MMCONFIG range, as stated in Intel docs...

Ok, so what would the proposed patch look like to help resolve this?

Ivan, you posted one a while ago, but never seemed to get any
confirmation if it helped or not.  Should I use that and drop Arjan's?
Or use both?  Or something else like the patches proposed by Tony
Camuso?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-12  0:26                             ` Greg KH
@ 2008-01-12 14:40                               ` Ivan Kokshaysky
  2008-01-12 15:46                                 ` Arjan van de Ven
                                                   ` (2 more replies)
  0 siblings, 3 replies; 125+ messages in thread
From: Ivan Kokshaysky @ 2008-01-12 14:40 UTC (permalink / raw)
  To: Greg KH
  Cc: Matthew Wilcox, Linus Torvalds, Greg KH, Arjan van de Ven,
	linux-kernel, Jeff Garzik, linux-pci, Benjamin Herrenschmidt,
	Martin Mares, Tony Camuso, Loic Prylli

On Fri, Jan 11, 2008 at 04:26:38PM -0800, Greg KH wrote:
> > One typical problem is that on "Intel(r) 3 Series Express Chipset Family"
> > MMCONFIG probing of the BAR #2 (frame buffer address) of integrated graphics
> > device locks up the machine (depending on BIOS settings, of course).
> > This happens because the frame buffer of IGD has higher decode priority
> > than MMCONFIG range, as stated in Intel docs...
> 
> Ok, so what would the proposed patch look like to help resolve this?

Yeah, for sure.

> Ivan, you posted one a while ago, but never seemed to get any
> confirmation if it helped or not.  Should I use that and drop Arjan's?

Actually I'm strongly against Arjan's patch. First, it's based on the
assumption that the MMCONFIG thing is sort of fundamentally broken
on some systems, but none of the facts we have so far confirms that.
And second, I really don't like the implementation as it breaks all
non-x86 arches (or forces them to add a set of totally meaningless
PCI functions).

> Or use both?  Or something else like the patches proposed by Tony
> Camuso?

Tony's patch is a variation of the same idea, so this patch
supersedes it. The only argument for using conf1 to access only the
first 64 bytes of the config space was a concern about performance.
But the only driver that uses config space extensively at runtime
is tg3, and only as a workaround for some broken revisions of the chip.
And even in that case I seriously doubt that mmconf vs. conf1 would
make any measurable difference.
On the other hand, always using conf1 for the whole 256-byte legacy
config space allows us to drop all sorts of black lists, which is
a *huge* advantage.

Here is the same patch, but with an updated commit message -
proper attribution to Loic Prylli, which I somehow missed
the first time, sorry.

Ivan.

---
PCI x86: always use conf1 to access config space below 256 bytes

Thanks to Loic Prylli <loic@myri.com>, who originally proposed
this idea.

Always using the legacy configuration mechanism for the legacy config
space and the extended mechanism (mmconf) for the extended config space
is a simple and very logical approach. It's supposed to resolve all
known mmconf problems. It still allows per-device quirks (tweaking
dev->cfg_size). It also allows us to get rid of the mmconf fallback code.

Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
---
 arch/x86/pci/mmconfig-shared.c |   35 -----------------------------------
 arch/x86/pci/mmconfig_32.c     |   22 +++++++++-------------
 arch/x86/pci/mmconfig_64.c     |   22 ++++++++++------------
 arch/x86/pci/pci.h             |    7 -------
 4 files changed, 19 insertions(+), 67 deletions(-)

diff --git a/arch/x86/pci/mmconfig-shared.c b/arch/x86/pci/mmconfig-shared.c
index 4df637e..6b521d3 100644
--- a/arch/x86/pci/mmconfig-shared.c
+++ b/arch/x86/pci/mmconfig-shared.c
@@ -22,42 +22,9 @@
 #define MMCONFIG_APER_MIN	(2 * 1024*1024)
 #define MMCONFIG_APER_MAX	(256 * 1024*1024)
 
-DECLARE_BITMAP(pci_mmcfg_fallback_slots, 32*PCI_MMCFG_MAX_CHECK_BUS);
-
 /* Indicate if the mmcfg resources have been placed into the resource table. */
 static int __initdata pci_mmcfg_resources_inserted;
 
-/* K8 systems have some devices (typically in the builtin northbridge)
-   that are only accessible using type1
-   Normally this can be expressed in the MCFG by not listing them
-   and assigning suitable _SEGs, but this isn't implemented in some BIOS.
-   Instead try to discover all devices on bus 0 that are unreachable using MM
-   and fallback for them. */
-static void __init unreachable_devices(void)
-{
-	int i, bus;
-	/* Use the max bus number from ACPI here? */
-	for (bus = 0; bus < PCI_MMCFG_MAX_CHECK_BUS; bus++) {
-		for (i = 0; i < 32; i++) {
-			unsigned int devfn = PCI_DEVFN(i, 0);
-			u32 val1, val2;
-
-			pci_conf1_read(0, bus, devfn, 0, 4, &val1);
-			if (val1 == 0xffffffff)
-				continue;
-
-			if (pci_mmcfg_arch_reachable(0, bus, devfn)) {
-				raw_pci_ops->read(0, bus, devfn, 0, 4, &val2);
-				if (val1 == val2)
-					continue;
-			}
-			set_bit(i + 32 * bus, pci_mmcfg_fallback_slots);
-			printk(KERN_NOTICE "PCI: No mmconfig possible on device"
-			       " %02x:%02x\n", bus, i);
-		}
-	}
-}
-
 static const char __init *pci_mmcfg_e7520(void)
 {
 	u32 win;
@@ -270,8 +237,6 @@ void __init pci_mmcfg_init(int type)
 		return;
 
 	if (pci_mmcfg_arch_init()) {
-		if (type == 1)
-			unreachable_devices();
 		if (known_bridge)
 			pci_mmcfg_insert_resources(IORESOURCE_BUSY);
 		pci_probe = (pci_probe & ~PCI_PROBE_MASK) | PCI_PROBE_MMCONF;
diff --git a/arch/x86/pci/mmconfig_32.c b/arch/x86/pci/mmconfig_32.c
index 1bf5816..7b75e65 100644
--- a/arch/x86/pci/mmconfig_32.c
+++ b/arch/x86/pci/mmconfig_32.c
@@ -30,10 +30,6 @@ static u32 get_base_addr(unsigned int seg, int bus, unsigned devfn)
 	struct acpi_mcfg_allocation *cfg;
 	int cfg_num;
 
-	if (seg == 0 && bus < PCI_MMCFG_MAX_CHECK_BUS &&
-	    test_bit(PCI_SLOT(devfn) + 32*bus, pci_mmcfg_fallback_slots))
-		return 0;
-
 	for (cfg_num = 0; cfg_num < pci_mmcfg_config_num; cfg_num++) {
 		cfg = &pci_mmcfg_config[cfg_num];
 		if (cfg->pci_segment == seg &&
@@ -68,13 +64,16 @@ static int pci_mmcfg_read(unsigned int seg, unsigned int bus,
 	u32 base;
 
 	if ((bus > 255) || (devfn > 255) || (reg > 4095)) {
-		*value = -1;
+err:		*value = -1;
 		return -EINVAL;
 	}
 
+	if (reg < 256)
+		return pci_conf1_read(seg,bus,devfn,reg,len,value);
+
 	base = get_base_addr(seg, bus, devfn);
 	if (!base)
-		return pci_conf1_read(seg,bus,devfn,reg,len,value);
+		goto err;
 
 	spin_lock_irqsave(&pci_config_lock, flags);
 
@@ -105,9 +104,12 @@ static int pci_mmcfg_write(unsigned int seg, unsigned int bus,
 	if ((bus > 255) || (devfn > 255) || (reg > 4095))
 		return -EINVAL;
 
+	if (reg < 256)
+		return pci_conf1_write(seg,bus,devfn,reg,len,value);
+
 	base = get_base_addr(seg, bus, devfn);
 	if (!base)
-		return pci_conf1_write(seg,bus,devfn,reg,len,value);
+		return -EINVAL;
 
 	spin_lock_irqsave(&pci_config_lock, flags);
 
@@ -134,12 +136,6 @@ static struct pci_raw_ops pci_mmcfg = {
 	.write =	pci_mmcfg_write,
 };
 
-int __init pci_mmcfg_arch_reachable(unsigned int seg, unsigned int bus,
-				    unsigned int devfn)
-{
-	return get_base_addr(seg, bus, devfn) != 0;
-}
-
 int __init pci_mmcfg_arch_init(void)
 {
 	printk(KERN_INFO "PCI: Using MMCONFIG\n");
diff --git a/arch/x86/pci/mmconfig_64.c b/arch/x86/pci/mmconfig_64.c
index 4095e4d..c4cf318 100644
--- a/arch/x86/pci/mmconfig_64.c
+++ b/arch/x86/pci/mmconfig_64.c
@@ -40,9 +40,7 @@ static char __iomem *get_virt(unsigned int seg, unsigned bus)
 static char __iomem *pci_dev_base(unsigned int seg, unsigned int bus, unsigned int devfn)
 {
 	char __iomem *addr;
-	if (seg == 0 && bus < PCI_MMCFG_MAX_CHECK_BUS &&
-		test_bit(32*bus + PCI_SLOT(devfn), pci_mmcfg_fallback_slots))
-		return NULL;
+
 	addr = get_virt(seg, bus);
 	if (!addr)
 		return NULL;
@@ -56,13 +54,16 @@ static int pci_mmcfg_read(unsigned int seg, unsigned int bus,
 
 	/* Why do we have this when nobody checks it. How about a BUG()!? -AK */
 	if (unlikely((bus > 255) || (devfn > 255) || (reg > 4095))) {
-		*value = -1;
+err:		*value = -1;
 		return -EINVAL;
 	}
 
+	if (reg < 256)
+		return pci_conf1_read(seg,bus,devfn,reg,len,value);
+
 	addr = pci_dev_base(seg, bus, devfn);
 	if (!addr)
-		return pci_conf1_read(seg,bus,devfn,reg,len,value);
+		goto err;
 
 	switch (len) {
 	case 1:
@@ -88,9 +89,12 @@ static int pci_mmcfg_write(unsigned int seg, unsigned int bus,
 	if (unlikely((bus > 255) || (devfn > 255) || (reg > 4095)))
 		return -EINVAL;
 
+	if (reg < 256)
+		return pci_conf1_write(seg,bus,devfn,reg,len,value);
+
 	addr = pci_dev_base(seg, bus, devfn);
 	if (!addr)
-		return pci_conf1_write(seg,bus,devfn,reg,len,value);
+		return -EINVAL;
 
 	switch (len) {
 	case 1:
@@ -126,12 +130,6 @@ static void __iomem * __init mcfg_ioremap(struct acpi_mcfg_allocation *cfg)
 	return addr;
 }
 
-int __init pci_mmcfg_arch_reachable(unsigned int seg, unsigned int bus,
-				    unsigned int devfn)
-{
-	return pci_dev_base(seg, bus, devfn) != NULL;
-}
-
 int __init pci_mmcfg_arch_init(void)
 {
 	int i;
diff --git a/arch/x86/pci/pci.h b/arch/x86/pci/pci.h
index ac56d39..36cb44c 100644
--- a/arch/x86/pci/pci.h
+++ b/arch/x86/pci/pci.h
@@ -98,13 +98,6 @@ extern void pcibios_sort(void);
 
 /* pci-mmconfig.c */
 
-/* Verify the first 16 busses. We assume that systems with more busses
-   get MCFG right. */
-#define PCI_MMCFG_MAX_CHECK_BUS 16
-extern DECLARE_BITMAP(pci_mmcfg_fallback_slots, 32*PCI_MMCFG_MAX_CHECK_BUS);
-
-extern int __init pci_mmcfg_arch_reachable(unsigned int seg, unsigned int bus,
-					   unsigned int devfn);
 extern int __init pci_mmcfg_arch_init(void);
 
 /*

^ permalink raw reply	[flat|nested] 125+ messages in thread
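
The access rule that the patch above puts into pci_mmcfg_read() and
pci_mmcfg_write() can be condensed into one decision function. This is a
standalone sketch, not kernel code: the enum and function name are
invented, and mmconf_mapped stands in for the result of the
get_base_addr()/pci_dev_base() lookup.

```c
#include <stdbool.h>

enum cfg_method {
	CFG_INVALID,	/* out-of-range arguments, or no usable mechanism */
	CFG_CONF1,	/* legacy port-based access */
	CFG_MMCONF	/* memory-mapped access */
};

/*
 * mmconf_mapped says whether an MMCONFIG mapping exists for the device
 * (the role played by get_base_addr()/pci_dev_base() in the patch).
 */
static enum cfg_method pick_cfg_method(unsigned bus, unsigned devfn,
				       unsigned reg, bool mmconf_mapped)
{
	if (bus > 255 || devfn > 255 || reg > 4095)
		return CFG_INVALID;
	if (reg < 256)
		return CFG_CONF1;	/* never touches MMCONFIG */
	return mmconf_mapped ? CFG_MMCONF : CFG_INVALID;
}
```

The point of the design is visible in the second test: registers below 256
go through conf1 whether or not an MMCONFIG mapping exists, so the
per-slot fallback bitmap is no longer needed. The flip side is that an
extended register with no mapping now fails instead of falling back.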

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-12 14:40                               ` Ivan Kokshaysky
@ 2008-01-12 15:46                                 ` Arjan van de Ven
  2008-01-12 16:23                                   ` Ivan Kokshaysky
  2008-01-12 17:45                                 ` Arjan van de Ven
  2008-01-13  7:08                                 ` Benjamin Herrenschmidt
  2 siblings, 1 reply; 125+ messages in thread
From: Arjan van de Ven @ 2008-01-12 15:46 UTC (permalink / raw)
  To: Ivan Kokshaysky
  Cc: Greg KH, Matthew Wilcox, Linus Torvalds, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Benjamin Herrenschmidt, Martin Mares,
	Tony Camuso, Loic Prylli

On Sat, 12 Jan 2008 17:40:30 +0300
Ivan Kokshaysky <ink@jurassic.park.msu.ru> wrote:
> 
> > Ivan, you posted one a while ago, but never seemed to get any
> > confirmation if it helped or not.  Should I use that and drop
> > Arjan's?
> 
> Actually I'm strongly against Arjan's patch. First, it's based on the
> assumption that the MMCONFIG thing is sort of fundamentally broken
> on some systems, but none of the facts we have so far confirms
> that. And second, I really don't like the implementation as it breaks
> all non-x86 arches (or forces them to add a set of totally meaningless
> PCI functions).

no it doesn't!
Other arches need no changes.

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-12 15:46                                 ` Arjan van de Ven
@ 2008-01-12 16:23                                   ` Ivan Kokshaysky
  0 siblings, 0 replies; 125+ messages in thread
From: Ivan Kokshaysky @ 2008-01-12 16:23 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Greg KH, Matthew Wilcox, Linus Torvalds, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Benjamin Herrenschmidt, Martin Mares,
	Tony Camuso, Loic Prylli

On Sat, Jan 12, 2008 at 07:46:32AM -0800, Arjan van de Ven wrote:
> Ivan Kokshaysky <ink@jurassic.park.msu.ru> wrote:
> > Actually I'm strongly against Arjan's patch. First, it's based on
> > assumption that the MMCONFIG thing is sort of fundamentally broken
> > on some systems, but none of the facts we have so far does confirm
> > that. And second, I really don't like the implementation as it breaks
> > all non-x86 arches (or forces them to add a set of totally meaningless
> > PCI functions).
> 
> no it doesn't!
> Other arches need no changes.

Umm, true. I misread your patch.
But it doesn't change anything - that wasn't my main objection
anyway.

Ivan.

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-12 14:40                               ` Ivan Kokshaysky
  2008-01-12 15:46                                 ` Arjan van de Ven
@ 2008-01-12 17:45                                 ` Arjan van de Ven
  2008-01-12 18:17                                   ` Matthew Wilcox
                                                     ` (2 more replies)
  2008-01-13  7:08                                 ` Benjamin Herrenschmidt
  2 siblings, 3 replies; 125+ messages in thread
From: Arjan van de Ven @ 2008-01-12 17:45 UTC (permalink / raw)
  To: Ivan Kokshaysky
  Cc: Greg KH, Matthew Wilcox, Linus Torvalds, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Benjamin Herrenschmidt, Martin Mares,
	Tony Camuso, Loic Prylli

On Sat, 12 Jan 2008 17:40:30 +0300
Ivan Kokshaysky <ink@jurassic.park.msu.ru> wrote:
> --- a/arch/x86/pci/mmconfig_32.c
> +++ b/arch/x86/pci/mmconfig_32.c
> @@ -30,10 +30,6 @@ static u32 get_base_addr(unsigned int seg, int bus, unsigned devfn)
>  	struct acpi_mcfg_allocation *cfg;
>  	int cfg_num;
>  
> -	if (seg == 0 && bus < PCI_MMCFG_MAX_CHECK_BUS &&
> -	    test_bit(PCI_SLOT(devfn) + 32*bus, pci_mmcfg_fallback_slots))
> -		return 0;
> -
>  	for (cfg_num = 0; cfg_num < pci_mmcfg_config_num; cfg_num++) {
>  		cfg = &pci_mmcfg_config[cfg_num];
>  		if (cfg->pci_segment == seg &&
> @@ -68,13 +64,16 @@ static int pci_mmcfg_read(unsigned int seg, unsigned int bus,
>  	u32 base;
>  
>  	if ((bus > 255) || (devfn > 255) || (reg > 4095)) {
> -		*value = -1;
> +err:		*value = -1;
>  		return -EINVAL;
>  	}
>  
> +	if (reg < 256)
> +		return pci_conf1_read(seg,bus,devfn,reg,len,value);
> +


btw this is my main objection to your patch; it intertwines the conf1 and mmconfig code even more.
When (and I'm saying "when" not "if") systems arrive that only have MMCONFIG for some of the devices,
we'll have to detangle this again, and I'm really not looking forward to that.

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-12 17:45                                 ` Arjan van de Ven
@ 2008-01-12 18:17                                   ` Matthew Wilcox
  2008-01-12 21:49                                   ` Ivan Kokshaysky
  2008-01-13 18:23                                   ` Loic Prylli
  2 siblings, 0 replies; 125+ messages in thread
From: Matthew Wilcox @ 2008-01-12 18:17 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Ivan Kokshaysky, Greg KH, Linus Torvalds, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Benjamin Herrenschmidt, Martin Mares,
	Tony Camuso, Loic Prylli

On Sat, Jan 12, 2008 at 09:45:57AM -0800, Arjan van de Ven wrote:
> btw this is my main objection to your patch; it intertwines the conf1 and mmconfig code even more.
> When (and I'm saying "when" not "if") systems arrive that only have MMCONFIG for some of the devices,
> we'll have to detangle this again, and I'm really not looking forward to that.

I think this will be OK.  We'll end up with three pci_ops, one for
mmconfig-only, one for mixed mmconfig-conf1 and one for conf1.  We could
do with that now actually -- the machines which will definitely go berserk
if you try to use mmconfig could have the conf1 ops on those busses.

Let's take Ivan's patch for now, and do that patch for 2.6.26.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
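
A minimal sketch of the three-ops idea above, written as a userspace model
rather than kernel code. The accessors are stand-ins that only tag the
returned value with the mechanism that served it; `enum bus_kind`,
`pick_read_op()` and the tag values are assumptions made for illustration
and do not appear in either patch.

```c
#include <stdint.h>

typedef int (*cfg_read_fn)(unsigned bus, unsigned devfn, unsigned reg,
			   uint32_t *val);

/* Stand-in for port I/O (conf1) access: reaches only the first 256 bytes. */
int conf1_read(unsigned bus, unsigned devfn, unsigned reg, uint32_t *val)
{
	(void)bus;
	(void)devfn;
	if (reg > 255) {
		*val = 0xffffffffu;
		return -1;
	}
	*val = 0xC1000000u | reg;	/* hypothetical tag: served by conf1 */
	return 0;
}

/* Stand-in for MMCONFIG access: reaches the full 4096 bytes. */
int mmcfg_read(unsigned bus, unsigned devfn, unsigned reg, uint32_t *val)
{
	(void)bus;
	(void)devfn;
	if (reg > 4095) {
		*val = 0xffffffffu;
		return -1;
	}
	*val = 0x4D000000u | reg;	/* hypothetical tag: served by MMCONFIG */
	return 0;
}

/* Mixed ops: conf1 below 256, MMCONFIG only for the extended range. */
int mixed_read(unsigned bus, unsigned devfn, unsigned reg, uint32_t *val)
{
	if (reg < 256)
		return conf1_read(bus, devfn, reg, val);
	return mmcfg_read(bus, devfn, reg, val);
}

/* Hypothetical per-bus classification; real code would fill this from quirks. */
enum bus_kind { BUS_CONF1_ONLY, BUS_MIXED, BUS_MMCFG_ONLY };

cfg_read_fn pick_read_op(enum bus_kind kind)
{
	switch (kind) {
	case BUS_MMCFG_ONLY:
		return mmcfg_read;
	case BUS_MIXED:
		return mixed_read;
	case BUS_CONF1_ONLY:
	default:
		return conf1_read;
	}
}
```

In this model a bus known to misbehave under MMCONFIG is classified as
BUS_CONF1_ONLY and never touches the memory-mapped path, at the cost of
losing registers above 255 on that bus.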

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-12 17:45                                 ` Arjan van de Ven
  2008-01-12 18:17                                   ` Matthew Wilcox
@ 2008-01-12 21:49                                   ` Ivan Kokshaysky
  2008-01-12 23:01                                     ` Arjan van de Ven
  2008-01-13 18:23                                   ` Loic Prylli
  2 siblings, 1 reply; 125+ messages in thread
From: Ivan Kokshaysky @ 2008-01-12 21:49 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Greg KH, Matthew Wilcox, Linus Torvalds, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Benjamin Herrenschmidt, Martin Mares,
	Tony Camuso, Loic Prylli

On Sat, Jan 12, 2008 at 09:45:57AM -0800, Arjan van de Ven wrote:
> btw this is my main objection to your patch; it intertwines the conf1
> and mmconfig code even more.

There is nothing wrong with it; please realize that mmconf and conf1 are
just different cpu-side interfaces. Both produce precisely the *same* bus
cycles as far as the lower 256-byte space is concerned.

> When (and I'm saying "when" not "if") systems arrive that only have
> MMCONFIG for some of the devices, we'll have to detangle this again,
> and I'm really not looking forward to that.

MMCONFIG for *some* of the devices? This doesn't sound realistic
from a technical point of view.
MMCONFIG-only systems? Sure. I really hope to see these. But it won't
be PC-AT architecture anymore. It has to be something like alpha,
for instance, fully utilizing the 64-bit address space, and we'll have
to have the whole low-level PCI infrastructure completely different
for these future platforms anyway.
Right now, each and every x86 chipset *does* require working
conf1 just in order to set up the mmconf aperture. It's the very
fundamental thing, sort of design philosophy.

Ivan.
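
To make the "different cpu-side interfaces" point concrete, here is a small
sketch of how the same bus/devfn/register triple is encoded for each
mechanism. The function names are made up for illustration, and the MMCONFIG
helper assumes an aperture whose first bus is 0; the bit layouts follow the
usual conf1 (port 0xCF8) and MMCONFIG conventions, and the range limits match
the checks in the quoted pci_mmcfg_read().

```c
#include <assert.h>
#include <stdint.h>

/*
 * conf1: value written to I/O port 0xCF8 before the data access at 0xCFC.
 * Only 8 bits of register number fit, so only the first 256 bytes of
 * config space are reachable. The low two bits select a byte within the
 * dword at the data port and are masked out of the address.
 */
uint32_t conf1_address(unsigned bus, unsigned devfn, unsigned reg)
{
	assert(bus < 256 && devfn < 256 && reg < 256);
	return 0x80000000u | ((uint32_t)bus << 16) |
	       ((uint32_t)devfn << 8) | (reg & 0xFCu);
}

/*
 * MMCONFIG: byte offset from the base of the memory-mapped aperture.
 * 12 bits of register number give each function a 4096-byte window.
 */
uint32_t mmcfg_offset(unsigned bus, unsigned devfn, unsigned reg)
{
	assert(bus < 256 && devfn < 256 && reg < 4096);
	return ((uint32_t)bus << 20) | ((uint32_t)devfn << 12) | reg;
}
```

For registers below 256 both encodings name the same location on the bus;
only registers 256 and up have no conf1 encoding in this basic layout.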

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-12 21:49                                   ` Ivan Kokshaysky
@ 2008-01-12 23:01                                     ` Arjan van de Ven
  2008-01-13  0:12                                       ` Tony Camuso
  0 siblings, 1 reply; 125+ messages in thread
From: Arjan van de Ven @ 2008-01-12 23:01 UTC (permalink / raw)
  To: Ivan Kokshaysky
  Cc: Greg KH, Matthew Wilcox, Linus Torvalds, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Benjamin Herrenschmidt, Martin Mares,
	Tony Camuso, Loic Prylli

On Sun, 13 Jan 2008 00:49:11 +0300
Ivan Kokshaysky <ink@jurassic.park.msu.ru> wrote:

> On Sat, Jan 12, 2008 at 09:45:57AM -0800, Arjan van de Ven wrote:
> > btw this is my main objection to your patch; it intertwines the
> > conf1 and mmconfig code even more.
> 
> There is nothing wrong with it; please realize that mmconf and conf1
> are just different cpu-side interfaces. Both produce precisely the
> *same* bus cycles as far as the lower 256-byte space is concerned.
> 
> > When (and I'm saying "when" not "if") systems arrive that only have
> > MMCONFIG for some of the devices, we'll have to detangle this again,
> > and I'm really not looking forward to that.
> 
> MMCONFIG for *some* of the devices? This doesn't sound realistic
> from a technical point of view.

you're wrong. 

> MMCONFIG-only systems? Sure. I really hope to see these. But it won't
> be PC-AT architecture anymore. It has to be something like alpha,
> for instance, fully utilizing the 64-bit address space, and we'll have
> to have the whole low-level PCI infrastructure completely different
> for these future platforms anyway.
> Right now, each and every x86 chipset *does* require working
> conf1 just in order to set up the mmconf aperture. It's the very
> fundamental thing, sort of design philosophy.

s/x86/pc/

and not even that.

Really this is a huge design mistake in your patch, the hard coding of conf1,
and for that reason I really don't think it should go in.

We have 4 or so methods on PC today to access config space, probably going to 6 in the next year
or two. One of those methods *HARD PICKING* another one as "second best" for the cases it
doesn't want to deal with is WRONG. It really needs to be up to the architecture/platform
to decide which ops vector is the fallback. And yes, on your current PC that might well be conf1.
But hardcoding that is not the right thing. We have the vectors, we have the ranking code,
just make a "second rank" thing. 
Oh wait, my patch did that ;)
Then let either the mmconfig code or the wrapper above it pick the fallback (it doesn't matter which; in fact, I can see
the value of making this decision in the wrapper and keeping the mmconfig code simple and clean,
because maybe mmconfig IS the thing that the architecture says needs to deal with the lower 256 bytes).

Oh wait my patch also did that pretty much ;)

The rest of my patch was defaulting to off. Is it that bit that you really hate?
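
A rough userspace model of the "wrapper decides the fallback" arrangement
argued for above. It loosely mirrors pci_read_ext() and the
pci_access_init() fixup from the patch, but `struct platform`,
`platform_init_fixup()`, `config_read()` and the stand-in accessors are
hypothetical names invented for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

struct raw_ops {
	int (*read)(unsigned bus, unsigned devfn, unsigned reg, uint32_t *val);
};

/* Stand-in accessors: the value only records which vector answered. */
static int legacy_read(unsigned bus, unsigned devfn, unsigned reg,
		       uint32_t *val)
{
	(void)bus;
	(void)devfn;
	if (reg > 255)
		return -1;
	*val = 1;	/* answered by the traditional vector */
	return 0;
}

static int extcfg_read(unsigned bus, unsigned devfn, unsigned reg,
		       uint32_t *val)
{
	(void)bus;
	(void)devfn;
	if (reg > 4095)
		return -1;
	*val = 2;	/* answered by the extended-config vector */
	return 0;
}

const struct raw_ops legacy_ops = { legacy_read };
const struct raw_ops extcfg_ops = { extcfg_read };

/* The platform, not the access method, owns both slots. */
struct platform {
	const struct raw_ops *ops;		/* default vector */
	const struct raw_ops *ops_extcfg;	/* used only after opt-in */
};

/* If the extended method is the only one found, use it for everything. */
void platform_init_fixup(struct platform *p)
{
	if (!p->ops && p->ops_extcfg)
		p->ops = p->ops_extcfg;
}

/* Wrapper: opted-in devices get the extended vector when one exists. */
int config_read(const struct platform *p, int opted_in, unsigned bus,
		unsigned devfn, unsigned reg, uint32_t *val)
{
	const struct raw_ops *ops = p->ops;

	if (opted_in && p->ops_extcfg)
		ops = p->ops_extcfg;
	if (!ops)
		return -1;
	return ops->read(bus, devfn, reg, val);
}
```

Neither accessor names the other here; which vector backs a given access is
decided entirely by the two pointers the platform filled in.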



^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-12 23:01                                     ` Arjan van de Ven
@ 2008-01-13  0:12                                       ` Tony Camuso
  2008-01-13  0:40                                         ` Arjan van de Ven
  0 siblings, 1 reply; 125+ messages in thread
From: Tony Camuso @ 2008-01-13  0:12 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Ivan Kokshaysky, Greg KH, Matthew Wilcox, Linus Torvalds,
	Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Benjamin Herrenschmidt, Martin Mares, Loic Prylli

Arjan,

I have not seen your MMCONFIG patch.

Would you mind sending me a copy?

Thanks.

Tony


^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-13  0:12                                       ` Tony Camuso
@ 2008-01-13  0:40                                         ` Arjan van de Ven
  2008-01-13  1:36                                           ` Tony Camuso
  0 siblings, 1 reply; 125+ messages in thread
From: Arjan van de Ven @ 2008-01-13  0:40 UTC (permalink / raw)
  To: tcamuso
  Cc: Ivan Kokshaysky, Greg KH, Matthew Wilcox, Linus Torvalds,
	Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Benjamin Herrenschmidt, Martin Mares, Loic Prylli

On Sat, 12 Jan 2008 19:12:23 -0500
Tony Camuso <tcamuso@redhat.com> wrote:

> Arjan,
> 
> I have not seen your MMCONFIG patch.
> 
> Would you mind sending me a copy?
> 

sure


----


On PCs, PCI extended configuration space (4Kb) is riddled with problems
associated with the memory mapped access method (MMCONFIG). At the same
time, there are very few machines that actually need or use this
extended configuration space.

At this point in time, the only sensible action is to make access to the
extended configuration space an opt-in operation for those device
drivers that need/want access to this space, as well as for those
userland diagnostics utilities that (on admin request) want to access
this space.

It's inevitable that this is done per device rather than per bus; we'll
be needing per device PCI quirks to turn this extended config space off
over time no matter what; in addition, it gives the least amount of
surprise: loading a driver for a device only impacts that one device,
not a whole bus worth of devices (although it'll be common to have one
physical device per bus on PCI-E).

The (desirable) side-effect of this patch is that all enumeration is
done using normal configuration cycles.

The patch below splits the lower level PCI config space operations (which
operate on a bus) in two: one that normally only operates on traditional
space, and one that gets used after the driver has opted in to using the
extended configuration space. This has led to a little code
duplication, but it's not all that bad (most of it is prototypes in
headers and such).

Architectures that have a solid reliable way to get to extended
configuration space can just keep doing what they do now and allow
extended space access from the "traditional" bus ops, and just not fill
in the new bus ops.  (This could include x86 for, say, BIOS year 2009
and later, but doesn't right now)

This patch also adds a sysfs property for each device into which root
can write a '1' to enable extended configuration space. The kernel will
print a notice into dmesg when this happens (including the name of the
app) so that if the system crashes as a result of this action, the user
can know what action/tool caused it.


Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 Documentation/ABI/testing/sysfs-pci-extended-config |   39 ++++++++++++++++
 arch/x86/pci/common.c                               |   23 +++++++++
 arch/x86/pci/init.c                                 |   10 ++++
 arch/x86/pci/mmconfig_32.c                          |    2 
 arch/x86/pci/mmconfig_64.c                          |    2 
 arch/x86/pci/pci.h                                  |    2 
 drivers/pci/access.c                                |   46 +++++++++++++++++++
 drivers/pci/pci-sysfs.c                             |   31 +++++++++++++
 drivers/pci/pci.c                                   |   28 +++++++++++
 include/linux/pci.h                                 |   47 +++++++++++++++++---
 10 files changed, 222 insertions(+), 8 deletions(-)

--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-pci-extended-config
@@ -0,0 +1,39 @@
+What:		/sys/devices/pci<bus>/<device>/extended_config_space
+Date:		January 11, 2008
+Contact:	Arjan van de Ven <arjan@linux.intel.com>
+Description:
+		This attribute is for use for system-diagnostic software
+		only.
+
+		The kernel may decide to restrict PCI configuration space
+		access for userspace to the first 64 or 256 bytes by
+		default, for stability reasons. This attribute, when
+		present, can be used to request access to the full
+		4Kb from the kernel.
+
+		Request to get access to the full 4Kb can be done by
+		writing a '1' into this attribute file. All other values
+		are reserved for future use and should not be used by
+		software at this point.
+
+		The kernel may log the request to the various kernel
+		logging services. The kernel may decide to ignore the
+		request if the kernel deems extended configuration space
+		access not reliable enough for the system or the device.
+		The kernel may decide to not present this attribute
+		if the kernel decides extended config space is reliable
+		and made available by default, or if the kernel decides
+		that extended configuration space will never be
+		accessible.
+
+		Software needs to gracefully deal with getting the
+		access not granted. Software also needs to gracefully deal
+		with this attribute not being present.
+
+		Due to the fragility of extended configuration space,
+		system diagnostic software should only set this attribute
+		on explicit user request, or in the case of GUI like tools,
+		at least with explicit user permission.
+
+
+
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -26,6 +26,7 @@ int pcibios_last_bus = -1;
 unsigned long pirq_table_addr;
 struct pci_bus *pci_root_bus;
 struct pci_raw_ops *raw_pci_ops;
+struct pci_raw_ops *raw_pci_ops_extcfg;
 
 static int pci_read(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
 {
@@ -39,9 +40,31 @@ static int pci_write(struct pci_bus *bus
 				  devfn, where, size, value);
 }
 
+static int pci_read_ext(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
+{
+	if (raw_pci_ops_extcfg)
+		return raw_pci_ops_extcfg->read(pci_domain_nr(bus), bus->number,
+				 devfn, where, size, value);
+	else
+		return raw_pci_ops->read(pci_domain_nr(bus), bus->number,
+				 devfn, where, size, value);
+}
+
+static int pci_write_ext(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 value)
+{
+	if (raw_pci_ops_extcfg)
+		return raw_pci_ops_extcfg->write(pci_domain_nr(bus), bus->number,
+				  devfn, where, size, value);
+	else
+		return raw_pci_ops->write(pci_domain_nr(bus), bus->number,
+				  devfn, where, size, value);
+}
+
 struct pci_ops pci_root_ops = {
 	.read = pci_read,
 	.write = pci_write,
+	.readext = pci_read_ext,
+	.writeext = pci_write_ext,
 };
 
 /*
--- a/arch/x86/pci/init.c
+++ b/arch/x86/pci/init.c
@@ -14,6 +14,16 @@ static __init int pci_access_init(void)
 #ifdef CONFIG_PCI_MMCONFIG
 	pci_mmcfg_init(type);
 #endif
+	/* if we ONLY have MMCONFIG, we need to use it always */
+	if (!raw_pci_ops && raw_pci_ops_extcfg) {
+		printk(KERN_INFO "No direct PCI access, using MMCONFIG always\n");
+		raw_pci_ops = raw_pci_ops_extcfg;
+	}
+
+	/*
+	 * we've found a usable method; this means we can skip
+	 * the potentially dangerous BIOS based methods
+	 */
 	if (raw_pci_ops)
 		return 0;
 #ifdef CONFIG_PCI_BIOS
--- a/arch/x86/pci/mmconfig_32.c
+++ b/arch/x86/pci/mmconfig_32.c
@@ -143,6 +143,6 @@ int __init pci_mmcfg_arch_reachable(unsi
 int __init pci_mmcfg_arch_init(void)
 {
 	printk(KERN_INFO "PCI: Using MMCONFIG\n");
-	raw_pci_ops = &pci_mmcfg;
+	raw_pci_ops_extcfg = &pci_mmcfg;
 	return 1;
 }
--- a/arch/x86/pci/mmconfig_64.c
+++ b/arch/x86/pci/mmconfig_64.c
@@ -152,6 +152,6 @@ int __init pci_mmcfg_arch_init(void)
 			return 0;
 		}
 	}
-	raw_pci_ops = &pci_mmcfg;
+	raw_pci_ops_extcfg = &pci_mmcfg;
 	return 1;
 }
--- a/arch/x86/pci/pci.h
+++ b/arch/x86/pci/pci.h
@@ -32,6 +32,8 @@
 extern unsigned int pci_probe;
 extern unsigned long pirq_table_addr;
 
+extern struct pci_raw_ops *raw_pci_ops_extcfg;
+
 enum pci_bf_sort_state {
 	pci_bf_sort_default,
 	pci_force_nobf,
--- a/drivers/pci/access.c
+++ b/drivers/pci/access.c
@@ -51,6 +51,45 @@ int pci_bus_write_config_##size \
 	return res;							\
 }
 
+#define PCI_OP_READ_EXT(size, type, len) \
+int pci_bus_read_extconfig_##size \
+	(struct pci_bus *bus, unsigned int devfn, int pos, type *value)	\
+{									\
+	int res;							\
+	unsigned long flags;						\
+	u32 data = 0;							\
+	if (PCI_##size##_BAD)						\
+		return PCIBIOS_BAD_REGISTER_NUMBER;			\
+	spin_lock_irqsave(&pci_lock, flags);				\
+	if (bus->ops->readext)						\
+		res = bus->ops->readext(bus, devfn, pos, len, &data);	\
+	else								\
+		res = bus->ops->read(bus, devfn, pos, len, &data);	\
+	*value = (type)data;						\
+	spin_unlock_irqrestore(&pci_lock, flags);			\
+	return res;							\
+}									\
+EXPORT_SYMBOL(pci_bus_read_extconfig_##size);
+
+#define PCI_OP_WRITE_EXT(size, type, len) \
+int pci_bus_write_extconfig_##size \
+	(struct pci_bus *bus, unsigned int devfn, int pos, type value)	\
+{									\
+	int res;							\
+	unsigned long flags;						\
+	if (PCI_##size##_BAD)						\
+		return PCIBIOS_BAD_REGISTER_NUMBER;			\
+	spin_lock_irqsave(&pci_lock, flags);				\
+	if (bus->ops->writeext)						\
+		res = bus->ops->writeext(bus, devfn, pos, len, value);	\
+	else								\
+		res = bus->ops->write(bus, devfn, pos, len, value);	\
+	spin_unlock_irqrestore(&pci_lock, flags);			\
+	return res;							\
+}									\
+EXPORT_SYMBOL(pci_bus_write_extconfig_##size);
+
+
 PCI_OP_READ(byte, u8, 1)
 PCI_OP_READ(word, u16, 2)
 PCI_OP_READ(dword, u32, 4)
@@ -58,6 +97,13 @@ PCI_OP_WRITE(byte, u8, 1)
 PCI_OP_WRITE(word, u16, 2)
 PCI_OP_WRITE(dword, u32, 4)
 
+PCI_OP_READ_EXT(byte, u8, 1)
+PCI_OP_READ_EXT(word, u16, 2)
+PCI_OP_READ_EXT(dword, u32, 4)
+PCI_OP_WRITE_EXT(byte, u8, 1)
+PCI_OP_WRITE_EXT(word, u16, 2)
+PCI_OP_WRITE_EXT(dword, u32, 4)
+
 EXPORT_SYMBOL(pci_bus_read_config_byte);
 EXPORT_SYMBOL(pci_bus_read_config_word);
 EXPORT_SYMBOL(pci_bus_read_config_dword);
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -143,6 +143,35 @@ static ssize_t is_enabled_show(struct de
 	return sprintf (buf, "%u\n", atomic_read(&pdev->enable_cnt));
 }
 
+static ssize_t extended_config_space_store(struct device *dev,
+				struct device_attribute *attr, const char *buf,
+				size_t count)
+{
+	ssize_t result = -EINVAL;
+	struct pci_dev *pdev = to_pci_dev(dev);
+
+	/* this can crash the machine when done on the "wrong" device */
+	if (!capable(CAP_SYS_ADMIN))
+		return count;
+
+	if (*buf == '1') {
+		printk(KERN_WARNING "Application %s enabled extended config space for device %s\n",
+			current->comm,  pci_name(pdev));
+		result = pci_enable_ext_config(pdev);
+	}
+
+	return result < 0 ? result : count;
+}
+
+static ssize_t extended_config_space_show(struct device *dev,
+			       struct device_attribute *attr, char *buf)
+{
+	struct pci_dev *pdev;
+
+	pdev = to_pci_dev(dev);
+	return sprintf(buf, "%u\n", pdev->ext_cfg_space);
+}
+
 #ifdef CONFIG_NUMA
 static ssize_t
 numa_node_show(struct device *dev, struct device_attribute *attr, char *buf)
@@ -206,6 +235,8 @@ struct device_attribute pci_dev_attrs[] 
 	__ATTR_RO(numa_node),
 #endif
 	__ATTR(enable, 0600, is_enabled_show, is_enabled_store),
+	__ATTR(extended_config_space, 0600, extended_config_space_show,
+		extended_config_space_store),
 	__ATTR(broken_parity_status,(S_IRUGO|S_IWUSR),
 		broken_parity_status_show,broken_parity_status_store),
 	__ATTR(msi_bus, 0644, msi_bus_show, msi_bus_store),
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -802,6 +802,34 @@ int pci_enable_device(struct pci_dev *de
 	return __pci_enable_device_flags(dev, IORESOURCE_MEM | IORESOURCE_IO);
 }
 
+/**
+ * pci_enable_ext_config - Enable extended (4K) config space accesses
+ * @dev: PCI device to be changed
+ *
+ *  Enable extended (4Kb) configuration space accesses for a device.
+ *  Extended config space is available for PCI-E devices and can
+ *  be used for things like PCI AER and other features. However,
+ *  due to various stability issues, this can only be done on demand.
+ *
+ * Returns: -1 on failure, 0 on success
+ */
+
+int pci_enable_ext_config(struct pci_dev *dev)
+{
+	if (dev->ext_cfg_space < 0)
+		return -1;
+	if (dev->ext_cfg_space > 0)
+		return 0;
+	dev->ext_cfg_space = 1;
+	/*
+	 * now that we enabled large accesses, we
+	 * need to update the config space size variable
+	 */
+	dev->cfg_size = pci_cfg_space_size(dev);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(pci_enable_ext_config);
+
 /*
  * Managed PCI resources.  This manages device on/off, intx/msi/msix
  * on/off and BAR regions.  pci_dev itself records msi/msix status, so
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -169,6 +169,15 @@ struct pci_dev {
 	int		cfg_size;	/* Size of configuration space */
 
 	/*
+	 * ext_cfg_space gets set by drivers/quirks to device if
+	 * extended (4K) config space is desired.
+	 * negative values -- hard disabled (quirk etc)
+	 * zero            -- disabled
+	 * positive values -- enable
+	 */
+	int		ext_cfg_space;
+
+	/*
 	 * Instead of touching interrupt line and base address registers
 	 * directly, use the values stored here. They might be different!
 	 */
@@ -297,6 +306,8 @@ struct pci_bus {
 struct pci_ops {
 	int (*read)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *val);
 	int (*write)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 val);
+	int (*readext)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *val);
+	int (*writeext)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 val);
 };
 
 struct pci_raw_ops {
@@ -517,29 +528,48 @@ int pci_bus_write_config_byte (struct pc
 int pci_bus_write_config_word (struct pci_bus *bus, unsigned int devfn, int where, u16 val);
 int pci_bus_write_config_dword (struct pci_bus *bus, unsigned int devfn, int where, u32 val);
 
+int pci_bus_read_extconfig_byte(struct pci_bus *bus, unsigned int devfn, int where, u8 *val);
+int pci_bus_read_extconfig_word(struct pci_bus *bus, unsigned int devfn, int where, u16 *val);
+int pci_bus_read_extconfig_dword(struct pci_bus *bus, unsigned int devfn, int where, u32 *val);
+int pci_bus_write_extconfig_byte(struct pci_bus *bus, unsigned int devfn, int where, u8 val);
+int pci_bus_write_extconfig_word(struct pci_bus *bus, unsigned int devfn, int where, u16 val);
+int pci_bus_write_extconfig_dword(struct pci_bus *bus, unsigned int devfn, int where, u32 val);
+
 static inline int pci_read_config_byte(struct pci_dev *dev, int where, u8 *val)
 {
-	return pci_bus_read_config_byte (dev->bus, dev->devfn, where, val);
+	if (dev->ext_cfg_space > 0)
+		return pci_bus_read_extconfig_byte(dev->bus, dev->devfn, where, val);
+	return pci_bus_read_config_byte(dev->bus, dev->devfn, where, val);
 }
 static inline int pci_read_config_word(struct pci_dev *dev, int where, u16 *val)
 {
-	return pci_bus_read_config_word (dev->bus, dev->devfn, where, val);
+	if (dev->ext_cfg_space > 0)
+		return pci_bus_read_extconfig_word(dev->bus, dev->devfn, where, val);
+	return pci_bus_read_config_word(dev->bus, dev->devfn, where, val);
 }
 static inline int pci_read_config_dword(struct pci_dev *dev, int where, u32 *val)
 {
-	return pci_bus_read_config_dword (dev->bus, dev->devfn, where, val);
+	if (dev->ext_cfg_space > 0)
+		return pci_bus_read_extconfig_dword(dev->bus, dev->devfn, where, val);
+	return pci_bus_read_config_dword(dev->bus, dev->devfn, where, val);
 }
 static inline int pci_write_config_byte(struct pci_dev *dev, int where, u8 val)
 {
-	return pci_bus_write_config_byte (dev->bus, dev->devfn, where, val);
+	if (dev->ext_cfg_space > 0)
+		return pci_bus_write_extconfig_byte(dev->bus, dev->devfn, where, val);
+	return pci_bus_write_config_byte(dev->bus, dev->devfn, where, val);
 }
 static inline int pci_write_config_word(struct pci_dev *dev, int where, u16 val)
 {
-	return pci_bus_write_config_word (dev->bus, dev->devfn, where, val);
+	if (dev->ext_cfg_space > 0)
+		return pci_bus_write_extconfig_word(dev->bus, dev->devfn, where, val);
+	return pci_bus_write_config_word(dev->bus, dev->devfn, where, val);
 }
 static inline int pci_write_config_dword(struct pci_dev *dev, int where, u32 val)
 {
-	return pci_bus_write_config_dword (dev->bus, dev->devfn, where, val);
+	if (dev->ext_cfg_space > 0)
+		return pci_bus_write_extconfig_dword(dev->bus, dev->devfn, where, val);
+	return pci_bus_write_config_dword(dev->bus, dev->devfn, where, val);
 }
 
 int __must_check pci_enable_device(struct pci_dev *dev);
@@ -689,6 +719,9 @@ void ht_destroy_irq(unsigned int irq);
 extern void pci_block_user_cfg_access(struct pci_dev *dev);
 extern void pci_unblock_user_cfg_access(struct pci_dev *dev);
 
+extern int pci_enable_ext_config(struct pci_dev *dev);
+
+
 /*
  * PCI domain support.  Sometimes called PCI segment (eg by ACPI),
  * a PCI domain is defined to be a set of PCI busses which share
@@ -786,6 +819,8 @@ static inline struct pci_dev *pci_get_bu
 						unsigned int devfn)
 { return NULL; }
 
+static inline int pci_enable_ext_config(struct pci_dev *dev) { return -1; }
+
 #endif /* CONFIG_PCI */
 
 /* Include architecture-dependent settings and functions */


-- 
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
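
For diagnostic tools, the opt-in described in the ABI text above could be
requested roughly as follows. This is a sketch under assumptions: the
extended_config_space attribute exists only on a kernel carrying this patch,
the caller supplies the full sysfs path of the device, and
`request_ext_config()` is a made-up helper name.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Ask the kernel for extended config space access on one device by
 * writing '1' to its extended_config_space attribute.
 * Returns 0 if the write went through, or a negative errno value.
 */
int request_ext_config(const char *attr_path)
{
	FILE *f = fopen(attr_path, "w");

	if (!f)
		return -errno;	/* attribute absent or not writable */

	if (fputs("1", f) == EOF) {
		int err = errno ? errno : EIO;

		fclose(f);
		return -err;
	}

	/* stdio buffers the write, so a store error shows up at close. */
	if (fclose(f) != 0)
		return -errno;
	return 0;
}
```

A successful write does not prove access was granted: the ABI text allows
the kernel to ignore the request, and the store handler in this patch
returns success without acting when the caller lacks CAP_SYS_ADMIN. A
careful tool would read the attribute back afterwards and cope with the
file being absent altogether.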

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-13  0:40                                         ` Arjan van de Ven
@ 2008-01-13  1:36                                           ` Tony Camuso
  2008-01-13  4:42                                             ` Arjan van de Ven
  0 siblings, 1 reply; 125+ messages in thread
From: Tony Camuso @ 2008-01-13  1:36 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Ivan Kokshaysky, Greg KH, Matthew Wilcox, Linus Torvalds,
	Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Benjamin Herrenschmidt, Martin Mares, Loic Prylli,
	Prarit Bhargava, Chumbalkar, Nagananda, Schoeller,
	Patrick (Linux - Houston, TX),
	Bhavana Nagendra

Thanks, Arjan.

The problem we have been experiencing has to do with Northbridges,
not with devices.

As far as the device is concerned, after the Northbridge translates
the config access into PCI bus cycles, the device has no idea what
mechanism drove the Northbridge to the translation.

That is to say, the device does not know whether the config cycle
on the bus was caused by an MMCONFIG cycle or a legacy Port IO
cycle delivered to the Northbridge.

In systems that had Northbridges that did not respond correctly to
MMCONFIG cycles, like the AMD 8132, we (HP & RH) were blacklisting
whole platforms to limit them to Port IO PCI config.

However, when platforms emerged using both legacy PCI and PCI express,
the platforms that were limited to Port IO config cycles were not
express compliant, since the express spec requires the platform to
be able to address the full 4096 byte region of config space to
be considered express-compliant.

The patch I devised concerned itself with Northbridges and separated
MMCONFIG-compliant buses from those that could not handle MMCONFIG.

Therefore, the express bus in the platform could happily employ
MMCONFIG to access the entire 4K region, while the legacy bus
with the non-compliant Northbridge could be restricted to Port IO
config.

However, even with my patch, the problem remained where devices
requiring large displacements could overlap the BIOS-mapped
MMCONFIG region. In such a situation, where the bus has passed
the MMCONFIG test, the MMCONFIG region can get doubly mapped by
bus-sizing code, causing the system to hang.

The remedy proposed by Loic and implemented by Ivan is actually
quite elegant, in that it addresses all these problems quite
effectively while eliminating a ration of specialized and somewhat
obscure code.

In my humble opinion, Port IO config access is here to stay, having
been defined as an architected mechanism in the PCI 2.1 spec.

This is most especially true for x86.

In other words, for x86, I don't think we need to worry about Port
IO config access ever going away at all.

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-13  1:36                                           ` Tony Camuso
@ 2008-01-13  4:42                                             ` Arjan van de Ven
  2008-01-13  4:47                                               ` Matthew Wilcox
  2008-01-13 12:43                                               ` Tony Camuso
  0 siblings, 2 replies; 125+ messages in thread
From: Arjan van de Ven @ 2008-01-13  4:42 UTC (permalink / raw)
  To: tcamuso
  Cc: Ivan Kokshaysky, Greg KH, Matthew Wilcox, Linus Torvalds,
	Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Benjamin Herrenschmidt, Martin Mares, Loic Prylli,
	Prarit Bhargava, Chumbalkar, Nagananda, Schoeller,
	Patrick (Linux - Houston, TX),
	Bhavana Nagendra

On Sat, 12 Jan 2008 20:36:59 -0500
Tony Camuso <tcamuso@redhat.com> wrote:

> Thanks, Arjan.
> 
> The problem we have been experiencing has to do with Northbridges,
> not with devices.

Correct, for now.
HOWEVER, and this is the point Linus has made several times:
Just about NOBODY has devices that need the extended config space. At all.
So making this opt-in for devices allows our users to boot and use
their systems if they are in the majority that has no need to get even
close to this mess.

> 
> As far as the device is concerned, after the Northbridge translates
> the config access into PCI bus cycles, the device has no idea what
> mechanism drove the Northbridge to the translation.

Wanna bet there'll be devices that screw this up? There are devices that even
screwed up the 64-256 region, after all.

> The patch I devised concerned itself with Northbridges and separated
> MMCONFIG-compliant buses from those that could not handle MMCONFIG.

This kind of patch-up has been going on for the better part of a year (well, 2 years)
by now and it's STILL NOT ENOUGH, as you can see from the further patch-ups that have
been proposed as "alternatives" to my approach.

> 
> In my humble opinion, Port IO config access is here to stay, having
> been defined as an architected mechanism in the PCI 2.1 spec.
> 
> This is most especially true for x86.
> 
> In other words, for x86, I don't think we need to worry about Port
> IO config access ever going away at all.

You're wrong there. Sad to say, but you're wrong there.

-- 
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-13  4:42                                             ` Arjan van de Ven
@ 2008-01-13  4:47                                               ` Matthew Wilcox
  2008-01-13  6:43                                                 ` Jeff Garzik
  2008-01-13 12:43                                               ` Tony Camuso
  1 sibling, 1 reply; 125+ messages in thread
From: Matthew Wilcox @ 2008-01-13  4:47 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: tcamuso, Ivan Kokshaysky, Greg KH, Linus Torvalds, Greg KH,
	linux-kernel, Jeff Garzik, linux-pci, Benjamin Herrenschmidt,
	Martin Mares, Loic Prylli, Prarit Bhargava, Chumbalkar,
	Nagananda, Schoeller, Patrick (Linux - Houston, TX),
	Bhavana Nagendra

On Sat, Jan 12, 2008 at 08:42:48PM -0800, Arjan van de Ven wrote:
> Wanna bet there'll be devices that screw this up? There are devices that even
> screwed up the 64-256 region, after all.

I don't know if they 'screwed it up'.  There are devices that misbehave
when registers are read from pci config space.  But this was never
guaranteed to be a safe thing to do; it gradually became clear that
people expected to be able to read random registers and manufacturers
responded accordingly, but I don't think you were ever guaranteed to be
able to peek at bits of config space arbitrarily.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-13  4:47                                               ` Matthew Wilcox
@ 2008-01-13  6:43                                                 ` Jeff Garzik
  0 siblings, 0 replies; 125+ messages in thread
From: Jeff Garzik @ 2008-01-13  6:43 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Arjan van de Ven, tcamuso, Ivan Kokshaysky, Greg KH,
	Linus Torvalds, Greg KH, linux-kernel, linux-pci,
	Benjamin Herrenschmidt, Martin Mares, Loic Prylli,
	Prarit Bhargava, Chumbalkar, Nagananda, Schoeller,
	Patrick (Linux - Houston, TX),
	Bhavana Nagendra

Matthew Wilcox wrote:
> On Sat, Jan 12, 2008 at 08:42:48PM -0800, Arjan van de Ven wrote:
>> Wanna bet there'll be devices that screw this up? There are devices that even
>> screwed up the 64-256 region, after all.
> 
> I don't know if they 'screwed it up'.  There are devices that misbehave
> when registers are read from pci config space.  But this was never
> guaranteed to be a safe thing to do; it gradually became clear that
> people expected to be able to read random registers and manufacturers
> responded accordingly, but I don't think you were ever guaranteed to be
> able to peek at bits of config space arbitrarily.

Quite correct...  Reading registers can have all sorts of side effects, 
for example clearing chip conditions.
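
As a rough illustration of that point, here is a user-space C model of a
read-to-clear status register. The device, its register layout and all of
the names below are hypothetical; the sketch only shows why a blind read
can destroy state that a driver was relying on.

```c
#include <stdint.h>

/* Hypothetical device model: a single status register with
 * read-to-clear semantics for its latched condition bits. */
struct fake_dev {
	uint32_t status;	/* latched error/interrupt conditions */
};

/* Hardware side: latch one or more condition bits. */
void fake_dev_raise(struct fake_dev *dev, uint32_t bits)
{
	dev->status |= bits;
}

/* Bus side: a read returns the latched bits and, as a side effect,
 * clears them. */
uint32_t fake_dev_read_status(struct fake_dev *dev)
{
	uint32_t val = dev->status;

	dev->status = 0;
	return val;
}
```

If a diagnostic tool dumps such a register first, the driver's later read
returns 0 and the condition is lost.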

	Jeff




^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-12 14:40                               ` Ivan Kokshaysky
  2008-01-12 15:46                                 ` Arjan van de Ven
  2008-01-12 17:45                                 ` Arjan van de Ven
@ 2008-01-13  7:08                                 ` Benjamin Herrenschmidt
  2008-01-13  7:24                                   ` Matthew Wilcox
  2 siblings, 1 reply; 125+ messages in thread
From: Benjamin Herrenschmidt @ 2008-01-13  7:08 UTC (permalink / raw)
  To: Ivan Kokshaysky
  Cc: Greg KH, Matthew Wilcox, Linus Torvalds, Greg KH,
	Arjan van de Ven, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares, Tony Camuso, Loic Prylli


On Sat, 2008-01-12 at 17:40 +0300, Ivan Kokshaysky wrote:
> 
> Actually I'm strongly against Arjan's patch. First, it's based on
> assumption that the MMCONFIG thing is sort of fundamentally broken
> on some systems, but none of the facts we have so far does confirm
> that.
> And second, I really don't like the implementation as it breaks all
> non-x86 arches (or forces them to add a set of totally meaningless
> PCI functions).

I agree, I quite dislike it too. Even if the breakage on x86 makes us
want to totally disable it there, it can be done within the existing PCI
ops I believe.

I think Arjan's problem is to try to do it per-device since the
"standard" PCI ops don't get a pci_dev structure (for obvious reasons).

But from what I read in this thread, this per-device enabling/disabling
doesn't seem very useful at all.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-13  7:08                                 ` Benjamin Herrenschmidt
@ 2008-01-13  7:24                                   ` Matthew Wilcox
  2008-01-13  7:58                                     ` Matthew Wilcox
  2008-01-13 17:01                                     ` Arjan van de Ven
  0 siblings, 2 replies; 125+ messages in thread
From: Matthew Wilcox @ 2008-01-13  7:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Ivan Kokshaysky, Greg KH, Linus Torvalds, Greg KH,
	Arjan van de Ven, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares, Tony Camuso, Loic Prylli

On Sun, Jan 13, 2008 at 06:08:05PM +1100, Benjamin Herrenschmidt wrote:
> On Sat, 2008-01-12 at 17:40 +0300, Ivan Kokshaysky wrote:
> > Actually I'm strongly against Arjan's patch. First, it's based on
> > assumption that the MMCONFIG thing is sort of fundamentally broken
> > on some systems, but none of the facts we have so far does confirm
> > that.
> > And second, I really don't like the implementation as it breaks all
> > non-x86 arches (or forces them to add a set of totally meaningless
> > PCI functions).
> 
> I agree, I quite dislike it too. Even if the breakage on x86 makes us
> want to totally disable it there, it can be done within the existing PCI
> ops I believe.
> 
> I think Arjan's problem is to try to do it per-device since the
> "standard" PCI ops don't get a pci_dev structure (for obvious reasons).

Here's a patch (on top of Ivan's) to improve things further.

One of Arjan's big problems with Ivan's patch is the hardcoding of conf1
as the fallback.  So I took an idea from Arjan's patch, crossed it
with an idea of my own and came up with this.  It gets rid of the
raw_pci_ops as a generic idea, and makes it private to the x86 arch.
It also makes the whole select-which-ops private to the x86 arch without
touching the pci layer at all.

Only compile-tested on x86-64.

diff --git a/arch/ia64/pci/pci.c b/arch/ia64/pci/pci.c
index 488e48a..ffaf02b 100644
--- a/arch/ia64/pci/pci.c
+++ b/arch/ia64/pci/pci.c
@@ -43,8 +43,7 @@
 #define PCI_SAL_EXT_ADDRESS(seg, bus, devfn, reg)	\
 	(((u64) seg << 28) | (bus << 20) | (devfn << 12) | (reg))
 
-static int
-pci_sal_read (unsigned int seg, unsigned int bus, unsigned int devfn,
+int raw_pci_read(unsigned int seg, unsigned int bus, unsigned int devfn,
 	      int reg, int len, u32 *value)
 {
 	u64 addr, data = 0;
@@ -68,8 +67,7 @@ pci_sal_read (unsigned int seg, unsigned int bus, unsigned int devfn,
 	return 0;
 }
 
-static int
-pci_sal_write (unsigned int seg, unsigned int bus, unsigned int devfn,
+int raw_pci_write(unsigned int seg, unsigned int bus, unsigned int devfn,
 	       int reg, int len, u32 value)
 {
 	u64 addr;
@@ -91,24 +89,17 @@ pci_sal_write (unsigned int seg, unsigned int bus, unsigned int devfn,
 	return 0;
 }
 
-static struct pci_raw_ops pci_sal_ops = {
-	.read =		pci_sal_read,
-	.write =	pci_sal_write
-};
-
-struct pci_raw_ops *raw_pci_ops = &pci_sal_ops;
-
-static int
-pci_read (struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
+static int pci_read(struct pci_bus *bus, unsigned int devfn, int where,
+							int size, u32 *value)
 {
-	return raw_pci_ops->read(pci_domain_nr(bus), bus->number,
+	return raw_pci_read(pci_domain_nr(bus), bus->number,
 				 devfn, where, size, value);
 }
 
-static int
-pci_write (struct pci_bus *bus, unsigned int devfn, int where, int size, u32 value)
+static int pci_write(struct pci_bus *bus, unsigned int devfn, int where,
+							int size, u32 value)
 {
-	return raw_pci_ops->write(pci_domain_nr(bus), bus->number,
+	return raw_pci_write(pci_domain_nr(bus), bus->number,
 				  devfn, where, size, value);
 }
 
diff --git a/arch/ia64/sn/pci/tioce_provider.c b/arch/ia64/sn/pci/tioce_provider.c
index e1a3e19..f6df212 100644
--- a/arch/ia64/sn/pci/tioce_provider.c
+++ b/arch/ia64/sn/pci/tioce_provider.c
@@ -752,13 +752,13 @@ tioce_kern_init(struct tioce_common *tioce_common)
 	 * Determine the secondary bus number of the port2 logical PPB.
 	 * This is used to decide whether a given pci device resides on
 	 * port1 or port2.  Note:  We don't have enough plumbing set up
-	 * here to use pci_read_config_xxx() so use the raw_pci_ops vector.
+	 * here to use pci_read_config_xxx() so use raw_pci_read().
 	 */
 
 	seg = tioce_common->ce_pcibus.bs_persist_segment;
 	bus = tioce_common->ce_pcibus.bs_persist_busnum;
 
-	raw_pci_ops->read(seg, bus, PCI_DEVFN(2, 0), PCI_SECONDARY_BUS, 1,&tmp);
+	raw_pci_read(seg, bus, PCI_DEVFN(2, 0), PCI_SECONDARY_BUS, 1,&tmp);
 	tioce_kern->ce_port1_secondary = (u8) tmp;
 
 	/*
@@ -799,11 +799,11 @@ tioce_kern_init(struct tioce_common *tioce_common)
 
 		/* mem base/limit */
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_MEMORY_BASE, 2, &tmp);
 		base = (u64)tmp << 16;
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_MEMORY_LIMIT, 2, &tmp);
 		limit = (u64)tmp << 16;
 		limit |= 0xfffffUL;
@@ -817,21 +817,21 @@ tioce_kern_init(struct tioce_common *tioce_common)
 		 * attributes.
 		 */
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_PREF_MEMORY_BASE, 2, &tmp);
 		base = ((u64)tmp & PCI_PREF_RANGE_MASK) << 16;
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_PREF_BASE_UPPER32, 4, &tmp);
 		base |= (u64)tmp << 32;
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_PREF_MEMORY_LIMIT, 2, &tmp);
 
 		limit = ((u64)tmp & PCI_PREF_RANGE_MASK) << 16;
 		limit |= 0xfffffUL;
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_PREF_LIMIT_UPPER32, 4, &tmp);
 		limit |= (u64)tmp << 32;
 
diff --git a/arch/x86/kernel/quirks.c b/arch/x86/kernel/quirks.c
index fab30e1..b92d2e6 100644
--- a/arch/x86/kernel/quirks.c
+++ b/arch/x86/kernel/quirks.c
@@ -27,7 +27,7 @@ static void __devinit quirk_intel_irqbalance(struct pci_dev *dev)
 	pci_write_config_byte(dev, 0xf4, config|0x2);
 
 	/* read xTPR register */
-	raw_pci_ops->read(0, 0, 0x40, 0x4c, 2, &word);
+	raw_pci_read(0, 0, 0x40, 0x4c, 2, &word);
 
 	if (!(word & (1 << 13))) {
 		printk(KERN_INFO "Intel E7520/7320/7525 detected. "
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 8627463..65a6c55 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -26,16 +26,37 @@ int pcibios_last_bus = -1;
 unsigned long pirq_table_addr;
 struct pci_bus *pci_root_bus;
 struct pci_raw_ops *raw_pci_ops;
+struct pci_raw_ops *raw_pci_ext_ops;
+
+int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
+						int reg, int len, u32 *val)
+{
+	if (reg < 256)
+		return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
+	if (raw_pci_ext_ops)
+		return raw_pci_ext_ops->read(domain, bus, devfn, reg, len, val);
+	return -EINVAL;
+}
+
+int raw_pci_write(unsigned int domain, unsigned int bus, unsigned int devfn,
+						int reg, int len, u32 val)
+{
+	if (reg < 256)
+		return raw_pci_ops->write(domain, bus, devfn, reg, len, val);
+	if (raw_pci_ext_ops)
+		return raw_pci_ext_ops->write(domain, bus, devfn, reg, len, val);
+	return -EINVAL;
+}
 
 static int pci_read(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
 {
-	return raw_pci_ops->read(pci_domain_nr(bus), bus->number,
+	return raw_pci_read(pci_domain_nr(bus), bus->number,
 				 devfn, where, size, value);
 }
 
 static int pci_write(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 value)
 {
-	return raw_pci_ops->write(pci_domain_nr(bus), bus->number,
+	return raw_pci_write(pci_domain_nr(bus), bus->number,
 				  devfn, where, size, value);
 }
 
diff --git a/arch/x86/pci/direct.c b/arch/x86/pci/direct.c
index 431c9a5..42f3e4c 100644
--- a/arch/x86/pci/direct.c
+++ b/arch/x86/pci/direct.c
@@ -14,7 +14,7 @@
 #define PCI_CONF1_ADDRESS(bus, devfn, reg) \
 	(0x80000000 | (bus << 16) | (devfn << 8) | (reg & ~3))
 
-int pci_conf1_read(unsigned int seg, unsigned int bus,
+static int pci_conf1_read(unsigned int seg, unsigned int bus,
 			  unsigned int devfn, int reg, int len, u32 *value)
 {
 	unsigned long flags;
@@ -45,7 +45,7 @@ int pci_conf1_read(unsigned int seg, unsigned int bus,
 	return 0;
 }
 
-int pci_conf1_write(unsigned int seg, unsigned int bus,
+static int pci_conf1_write(unsigned int seg, unsigned int bus,
 			   unsigned int devfn, int reg, int len, u32 value)
 {
 	unsigned long flags;
diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index 6cff66d..c222a1f 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -215,7 +215,8 @@ static int quirk_aspm_offset[MAX_PCIEROOT << 3];
 
 static int quirk_pcie_aspm_read(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
 {
-	return raw_pci_ops->read(0, bus->number, devfn, where, size, value);
+	return raw_pci_read(pci_domain_nr(bus), bus->number,
+						devfn, where, size, value);
 }
 
 /*
@@ -231,7 +232,8 @@ static int quirk_pcie_aspm_write(struct pci_bus *bus, unsigned int devfn, int wh
 	if ((offset) && (where == offset))
 		value = value & 0xfffffffc;
 	
-	return raw_pci_ops->write(0, bus->number, devfn, where, size, value);
+	return raw_pci_write(pci_domain_nr(bus), bus->number,
+						devfn, where, size, value);
 }
 
 static struct pci_ops quirk_pcie_aspm_ops = {
diff --git a/arch/x86/pci/legacy.c b/arch/x86/pci/legacy.c
index 5565d70..b008765 100644
--- a/arch/x86/pci/legacy.c
+++ b/arch/x86/pci/legacy.c
@@ -22,7 +22,7 @@ static void __devinit pcibios_fixup_peer_bridges(void)
 		if (pci_find_bus(0, n))
 			continue;
 		for (devfn = 0; devfn < 256; devfn += 8) {
-			if (!raw_pci_ops->read(0, n, devfn, PCI_VENDOR_ID, 2, &l) &&
+			if (!raw_pci_read(0, n, devfn, PCI_VENDOR_ID, 2, &l) &&
 			    l != 0x0000 && l != 0xffff) {
 				DBG("Found device at %02x:%02x [%04x]\n", n, devfn, l);
 				printk(KERN_INFO "PCI: Discovered peer bus %02x\n", n);
diff --git a/arch/x86/pci/mmconfig-shared.c b/arch/x86/pci/mmconfig-shared.c
index 6b521d3..8d54df4 100644
--- a/arch/x86/pci/mmconfig-shared.c
+++ b/arch/x86/pci/mmconfig-shared.c
@@ -28,7 +28,7 @@ static int __initdata pci_mmcfg_resources_inserted;
 static const char __init *pci_mmcfg_e7520(void)
 {
 	u32 win;
-	pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);
+	pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);
 
 	win = win & 0xf000;
 	if(win == 0x0000 || win == 0xf000)
@@ -53,7 +53,7 @@ static const char __init *pci_mmcfg_intel_945(void)
 
 	pci_mmcfg_config_num = 1;
 
-	pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0x48, 4, &pciexbar);
+	pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0x48, 4, &pciexbar);
 
 	/* Enable bit */
 	if (!(pciexbar & 1))
@@ -118,7 +118,7 @@ static int __init pci_mmcfg_check_hostbridge(void)
 	int i;
 	const char *name;
 
-	pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0, 4, &l);
+	pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0, 4, &l);
 	vendor = l & 0xffff;
 	device = (l >> 16) & 0xffff;
 
diff --git a/arch/x86/pci/mmconfig_32.c b/arch/x86/pci/mmconfig_32.c
index 7b75e65..37a00fb 100644
--- a/arch/x86/pci/mmconfig_32.c
+++ b/arch/x86/pci/mmconfig_32.c
@@ -68,9 +68,6 @@ err:		*value = -1;
 		return -EINVAL;
 	}
 
-	if (reg < 256)
-		return pci_conf1_read(seg,bus,devfn,reg,len,value);
-
 	base = get_base_addr(seg, bus, devfn);
 	if (!base)
 		goto err;
@@ -104,9 +101,6 @@ static int pci_mmcfg_write(unsigned int seg, unsigned int bus,
 	if ((bus > 255) || (devfn > 255) || (reg > 4095))
 		return -EINVAL;
 
-	if (reg < 256)
-		return pci_conf1_write(seg,bus,devfn,reg,len,value);
-
 	base = get_base_addr(seg, bus, devfn);
 	if (!base)
 		return -EINVAL;
@@ -140,5 +134,6 @@ int __init pci_mmcfg_arch_init(void)
 {
 	printk(KERN_INFO "PCI: Using MMCONFIG\n");
 	raw_pci_ops = &pci_mmcfg;
+	raw_pci_ext_ops = &pci_mmcfg;
 	return 1;
 }
diff --git a/arch/x86/pci/mmconfig_64.c b/arch/x86/pci/mmconfig_64.c
index c4cf318..6bf47e9 100644
--- a/arch/x86/pci/mmconfig_64.c
+++ b/arch/x86/pci/mmconfig_64.c
@@ -58,9 +58,6 @@ err:		*value = -1;
 		return -EINVAL;
 	}
 
-	if (reg < 256)
-		return pci_conf1_read(seg,bus,devfn,reg,len,value);
-
 	addr = pci_dev_base(seg, bus, devfn);
 	if (!addr)
 		goto err;
@@ -89,9 +86,6 @@ static int pci_mmcfg_write(unsigned int seg, unsigned int bus,
 	if (unlikely((bus > 255) || (devfn > 255) || (reg > 4095)))
 		return -EINVAL;
 
-	if (reg < 256)
-		return pci_conf1_write(seg,bus,devfn,reg,len,value);
-
 	addr = pci_dev_base(seg, bus, devfn);
 	if (!addr)
 		return -EINVAL;
@@ -151,5 +145,6 @@ int __init pci_mmcfg_arch_init(void)
 		}
 	}
 	raw_pci_ops = &pci_mmcfg;
+	raw_pci_ext_ops = &pci_mmcfg;
 	return 1;
 }
diff --git a/arch/x86/pci/pci.h b/arch/x86/pci/pci.h
index 36cb44c..3431518 100644
--- a/arch/x86/pci/pci.h
+++ b/arch/x86/pci/pci.h
@@ -85,10 +85,17 @@ extern spinlock_t pci_config_lock;
 extern int (*pcibios_enable_irq)(struct pci_dev *dev);
 extern void (*pcibios_disable_irq)(struct pci_dev *dev);
 
-extern int pci_conf1_write(unsigned int seg, unsigned int bus,
-			   unsigned int devfn, int reg, int len, u32 value);
-extern int pci_conf1_read(unsigned int seg, unsigned int bus,
-			  unsigned int devfn, int reg, int len, u32 *value);
+struct pci_raw_ops {
+	int (*read)(unsigned int domain, unsigned int bus, unsigned int devfn,
+						int reg, int len, u32 *val);
+	int (*write)(unsigned int domain, unsigned int bus, unsigned int devfn,
+						int reg, int len, u32 val);
+};
+
+extern struct pci_raw_ops *raw_pci_ops;
+extern struct pci_raw_ops *raw_pci_ext_ops;
+
+extern struct pci_raw_ops pci_direct_conf1;
 
 extern int pci_direct_probe(void);
 extern void pci_direct_init(int type);
diff --git a/arch/x86/pci/visws.c b/arch/x86/pci/visws.c
index 8ecb1c7..c2df4e9 100644
--- a/arch/x86/pci/visws.c
+++ b/arch/x86/pci/visws.c
@@ -13,9 +13,6 @@
 
 #include "pci.h"
 
-
-extern struct pci_raw_ops pci_direct_conf1;
-
 static int pci_visws_enable_irq(struct pci_dev *dev) { return 0; }
 static void pci_visws_disable_irq(struct pci_dev *dev) { }
 
diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index e3a673a..ea68ef1 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -139,15 +139,6 @@ acpi_status __init acpi_os_initialize(void)
 
 acpi_status acpi_os_initialize1(void)
 {
-	/*
-	 * Initialize PCI configuration space access, as we'll need to access
-	 * it while walking the namespace (bus 0 and root bridges w/ _BBNs).
-	 */
-	if (!raw_pci_ops) {
-		printk(KERN_ERR PREFIX
-		       "Access to PCI configuration space unavailable\n");
-		return AE_NULL_ENTRY;
-	}
 	kacpid_wq = create_singlethread_workqueue("kacpid");
 	kacpi_notify_wq = create_singlethread_workqueue("kacpi_notify");
 	BUG_ON(!kacpid_wq);
@@ -498,11 +489,9 @@ acpi_os_read_pci_configuration(struct acpi_pci_id * pci_id, u32 reg,
 		return AE_ERROR;
 	}
 
-	BUG_ON(!raw_pci_ops);
-
-	result = raw_pci_ops->read(pci_id->segment, pci_id->bus,
-				   PCI_DEVFN(pci_id->device, pci_id->function),
-				   reg, size, value);
+	result = raw_pci_read(pci_id->segment, pci_id->bus,
+				PCI_DEVFN(pci_id->device, pci_id->function),
+				reg, size, value);
 
 	return (result ? AE_ERROR : AE_OK);
 }
@@ -529,11 +518,9 @@ acpi_os_write_pci_configuration(struct acpi_pci_id * pci_id, u32 reg,
 		return AE_ERROR;
 	}
 
-	BUG_ON(!raw_pci_ops);
-
-	result = raw_pci_ops->write(pci_id->segment, pci_id->bus,
-				    PCI_DEVFN(pci_id->device, pci_id->function),
-				    reg, size, value);
+	result = raw_pci_write(pci_id->segment, pci_id->bus,
+				PCI_DEVFN(pci_id->device, pci_id->function),
+				reg, size, value);
 
 	return (result ? AE_ERROR : AE_OK);
 }
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 0dd93bb..75029d3 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -304,14 +304,14 @@ struct pci_ops {
 	int (*write)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 val);
 };
 
-struct pci_raw_ops {
-	int (*read)(unsigned int domain, unsigned int bus, unsigned int devfn,
-		    int reg, int len, u32 *val);
-	int (*write)(unsigned int domain, unsigned int bus, unsigned int devfn,
-		     int reg, int len, u32 val);
-};
-
-extern struct pci_raw_ops *raw_pci_ops;
+/*
+ * ACPI needs to be able to access PCI config space before we've done a
+ * PCI bus scan and created pci_bus structures.
+ */
+extern int raw_pci_read(unsigned int domain, unsigned int bus,
+			unsigned int devfn, int reg, int len, u32 *val);
+extern int raw_pci_write(unsigned int domain, unsigned int bus,
+			unsigned int devfn, int reg, int len, u32 val);
 
 struct pci_bus_region {
 	unsigned long start;

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-13  7:24                                   ` Matthew Wilcox
@ 2008-01-13  7:58                                     ` Matthew Wilcox
  2008-01-13 17:01                                     ` Arjan van de Ven
  1 sibling, 0 replies; 125+ messages in thread
From: Matthew Wilcox @ 2008-01-13  7:58 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Ivan Kokshaysky, Greg KH, Linus Torvalds, Greg KH,
	Arjan van de Ven, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares, Tony Camuso, Loic Prylli

On Sun, Jan 13, 2008 at 12:24:15AM -0700, Matthew Wilcox wrote:
> Here's a patch (on top of Ivan's) to improve things further.

Oops.  I forgot to check the ordering of mmconfig vs direct probing, so
that patch would end up just using mmconfig for everything.  Not what we
want.  Also, there are three places in mmconfig-shared that probe using
conf1, even though it might have failed.  And if we're going to use
raw_pci_read() when conf1 might have failed and mmconf isn't set up yet,
we need to check raw_pci_ops in raw_pci_read().  Add the check in
raw_pci_write too, just for symmetry.
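
The intended ordering can be modelled in user space roughly as follows.
This is only a sketch: the model_* names, the stub backends and their tag
values are made up for illustration, and only the dispatch and init logic
follow the interdiff below.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as the read half of struct pci_raw_ops, with user-space types. */
struct model_ops {
	int (*read)(unsigned int domain, unsigned int bus, unsigned int devfn,
		    int reg, int len, uint32_t *val);
};

/* Hypothetical stand-ins for conf1 and MMCONFIG: each returns a tag
 * value so the caller can tell which backend handled the access. */
static int model_conf1_read(unsigned int domain, unsigned int bus,
			    unsigned int devfn, int reg, int len, uint32_t *val)
{
	(void)domain; (void)bus; (void)devfn; (void)reg; (void)len;
	*val = 0xc1;
	return 0;
}

static int model_mmcfg_read(unsigned int domain, unsigned int bus,
			    unsigned int devfn, int reg, int len, uint32_t *val)
{
	(void)domain; (void)bus; (void)devfn; (void)reg; (void)len;
	*val = 0xec;
	return 0;
}

static struct model_ops model_conf1 = { .read = model_conf1_read };
static struct model_ops model_mmcfg = { .read = model_mmcfg_read };

static struct model_ops *model_raw_ops;		/* first 256 bytes */
static struct model_ops *model_raw_ext_ops;	/* extended space */

/* Dispatch with the NULL check added: legacy space goes to the base
 * vector when one is set, everything else to the extended vector. */
int model_raw_read(unsigned int domain, unsigned int bus, unsigned int devfn,
		   int reg, int len, uint32_t *val)
{
	if (reg < 256 && model_raw_ops)
		return model_raw_ops->read(domain, bus, devfn, reg, len, val);
	if (model_raw_ext_ops)
		return model_raw_ext_ops->read(domain, bus, devfn, reg, len, val);
	return -EINVAL;
}

/* Direct probing succeeded: conf1 owns the legacy space. */
void model_direct_init(void)
{
	model_raw_ops = &model_conf1;
}

/* MMCONFIG init: only take over the legacy space if nothing else has. */
void model_mmcfg_init(void)
{
	if (!model_raw_ops)
		model_raw_ops = &model_mmcfg;
	model_raw_ext_ops = &model_mmcfg;
}

void model_reset(void)
{
	model_raw_ops = NULL;
	model_raw_ext_ops = NULL;
}
```

With direct probing done first, accesses below 256 stay on conf1 and only
the extended space goes through MMCONFIG; if direct probing failed,
MMCONFIG serves everything.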

I don't like it that mmconfig_32 prints a message and mmconfig_64
doesn't, but fixing that is not part of this patch.

Interdiff:

diff -u b/arch/x86/pci/common.c b/arch/x86/pci/common.c
--- b/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -31,7 +31,7 @@
 int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
 						int reg, int len, u32 *val)
 {
-	if (reg < 256)
+	if (reg < 256 && raw_pci_ops)
 		return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
 	if (raw_pci_ext_ops)
 		return raw_pci_ext_ops->read(domain, bus, devfn, reg, len, val);
@@ -41,7 +41,7 @@
 int raw_pci_write(unsigned int domain, unsigned int bus, unsigned int devfn,
 						int reg, int len, u32 val)
 {
-	if (reg < 256)
+	if (reg < 256 && raw_pci_ops)
 		return raw_pci_ops->write(domain, bus, devfn, reg, len, val);
 	if (raw_pci_ext_ops)
 		return raw_pci_ext_ops->write(domain, bus, devfn, reg, len, val);
diff -u b/arch/x86/pci/mmconfig-shared.c b/arch/x86/pci/mmconfig-shared.c
--- b/arch/x86/pci/mmconfig-shared.c
+++ b/arch/x86/pci/mmconfig-shared.c
@@ -28,7 +28,7 @@
 static const char __init *pci_mmcfg_e7520(void)
 {
 	u32 win;
-	pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);
+	raw_pci_read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);
 
 	win = win & 0xf000;
 	if(win == 0x0000 || win == 0xf000)
@@ -53,7 +53,7 @@
 
 	pci_mmcfg_config_num = 1;
 
-	pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0x48, 4, &pciexbar);
+	raw_pci_read(0, 0, PCI_DEVFN(0,0), 0x48, 4, &pciexbar);
 
 	/* Enable bit */
 	if (!(pciexbar & 1))
@@ -118,7 +118,7 @@
 	int i;
 	const char *name;
 
-	pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0, 4, &l);
+	raw_pci_read(0, 0, PCI_DEVFN(0,0), 0, 4, &l);
 	vendor = l & 0xffff;
 	device = (l >> 16) & 0xffff;
 
diff -u b/arch/x86/pci/mmconfig_32.c b/arch/x86/pci/mmconfig_32.c
--- b/arch/x86/pci/mmconfig_32.c
+++ b/arch/x86/pci/mmconfig_32.c
@@ -132,8 +132,10 @@
 
 int __init pci_mmcfg_arch_init(void)
 {
-	printk(KERN_INFO "PCI: Using MMCONFIG\n");
-	raw_pci_ops = &pci_mmcfg;
+	printk(KERN_INFO "PCI: Using MMCONFIG for %s config space\n",
+		raw_pci_ops ? "extended" : "all");
+	if (!raw_pci_ops)
+		raw_pci_ops = &pci_mmcfg;
 	raw_pci_ext_ops = &pci_mmcfg;
 	return 1;
 }
diff -u b/arch/x86/pci/mmconfig_64.c b/arch/x86/pci/mmconfig_64.c
--- b/arch/x86/pci/mmconfig_64.c
+++ b/arch/x86/pci/mmconfig_64.c
@@ -144,7 +144,8 @@
 			return 0;
 		}
 	}
-	raw_pci_ops = &pci_mmcfg;
+	if (!raw_pci_ops)
+		raw_pci_ops = &pci_mmcfg;
 	raw_pci_ext_ops = &pci_mmcfg;
 	return 1;
 }

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-13  4:42                                             ` Arjan van de Ven
  2008-01-13  4:47                                               ` Matthew Wilcox
@ 2008-01-13 12:43                                               ` Tony Camuso
  2008-01-13 17:03                                                 ` Arjan van de Ven
  1 sibling, 1 reply; 125+ messages in thread
From: Tony Camuso @ 2008-01-13 12:43 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Ivan Kokshaysky, Greg KH, Matthew Wilcox, Linus Torvalds,
	Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Benjamin Herrenschmidt, Martin Mares, Loic Prylli,
	Prarit Bhargava, Chumbalkar, Nagananda, Schoeller,
	Patrick (Linux - Houston, TX),
	Bhavana Nagendra

Arjan van de Ven wrote:
> On Sat, 12 Jan 2008 20:36:59 -0500
> Tony Camuso <tcamuso@redhat.com> wrote:
> 
>
> Just about NOBODY has devices that need the extended config space. At all.

The PCI express spec requires the platform to provide access to this space
for express-compliance. More devices will be using this space as express
becomes the dominant IO bus technology.

>> As far as the device is concerned, after the Northbridge translates
>> the config access into PCI bus cycles, the device has no idea what
>> mechanism drove the Northbridge to the translation.
> 
> Wanna bet there'll be devices that screw this up? There are devices that even
> screwed up the 64-256 region, after all.
> 

There may have been devices that incorrectly applied the PCI spec to
various fields in the header, I'll grant you that.

However, there is no way a device can determine electrically whether the
Northbridge received Port IO or MMCONFIG cycles. This is between the CPU
and the Northbridge and is utterly opaque to the devices on the bus.

>> The patch I devised concerned itself with Northbridges and separated
>> MMCONFIG-compliant buses from those that could not handle MMCONFIG.
> 
> This kind of patch-up has been going on for the better part of a year (well, 2 years)
> by now and it's STILL NOT ENOUGH, as you can see from the further patch-ups that have
> been proposed as "alternatives" to my approach.
> 

Which is why Loic's proposal and Ivan's implementation of it is so elegant.
It solves all these problems in one sweep, and eliminates the code rendered
cruft by Ivan's patch. A two-fer, by my reckoning.

>> In other words, for x86, I don't think we need to worry about Port
>> IO config access ever going away at all.
> 
> You're wrong there. Sad to say, but you're wrong there.
> 

The PCI spec provides for conf1 as an architected solution. It's not
going away, and especially not in x86 land where Port IO is built-in
to the CPU.




^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-13  7:24                                   ` Matthew Wilcox
  2008-01-13  7:58                                     ` Matthew Wilcox
@ 2008-01-13 17:01                                     ` Arjan van de Ven
  2008-01-14 22:52                                       ` Matthew Wilcox
  1 sibling, 1 reply; 125+ messages in thread
From: Arjan van de Ven @ 2008-01-13 17:01 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Benjamin Herrenschmidt, Ivan Kokshaysky, Greg KH, Linus Torvalds,
	Greg KH, linux-kernel, Jeff Garzik, linux-pci, Martin Mares,
	Tony Camuso, Loic Prylli

As a general thing, I like where this patch is going.

On Sun, 13 Jan 2008 00:24:15 -0700
Matthew Wilcox <matthew@wil.cx> wrote:
> +
> +int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
> +						int reg, int len, u32 *val)
> +{
> +	if (reg < 256)
> +		return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
> +	if (raw_pci_ext_ops)
> +		return raw_pci_ext_ops->read(domain, bus, devfn, reg, len, val);
> +	return -EINVAL;

It would be nice if the "reg >= 256 && raw_pci_ext_ops == NULL" case just
called the raw_pci_ops pointer, to give that a chance of refusal
(but I guess that shouldn't really happen).
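
A minimal user-space sketch of that variant, for illustration only: the
sketch_* names and the stub backend are hypothetical, and the stub is
assumed to reject registers above 255 on its own, the way a conf1-style
accessor would.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_ops {
	int (*read)(unsigned int bus, unsigned int devfn, int reg,
		    uint32_t *val);
};

/* Hypothetical conf1-like backend: it can only address 256 bytes and
 * refuses anything beyond that itself. */
static int sketch_legacy_read(unsigned int bus, unsigned int devfn, int reg,
			      uint32_t *val)
{
	(void)bus; (void)devfn;
	if (reg < 0 || reg > 255) {
		*val = 0xffffffffu;
		return -EINVAL;
	}
	*val = 0x1234;
	return 0;
}

static struct sketch_ops sketch_legacy = { .read = sketch_legacy_read };

struct sketch_ops *sketch_base_ops = &sketch_legacy;
struct sketch_ops *sketch_ext_ops;	/* NULL: no MMCONFIG available */

/* Variant dispatch: extended accesses use the ext vector when there is
 * one; everything else, including extended accesses without an ext
 * vector, is handed to the base vector, which may refuse. */
int sketch_raw_read(unsigned int bus, unsigned int devfn, int reg,
		    uint32_t *val)
{
	if (reg >= 256 && sketch_ext_ops)
		return sketch_ext_ops->read(bus, devfn, reg, val);
	if (sketch_base_ops)
		return sketch_base_ops->read(bus, devfn, reg, val);
	return -EINVAL;
}
```

The result for the caller is the same -EINVAL, but the decision is made
by the backend rather than by the dispatcher.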

> --- a/arch/x86/pci/mmconfig-shared.c
> +++ b/arch/x86/pci/mmconfig-shared.c
> @@ -28,7 +28,7 @@ static int __initdata pci_mmcfg_resources_inserted;
>  static const char __init *pci_mmcfg_e7520(void)
>  {
>  	u32 win;
> -	pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);
> +	pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);

Couldn't this (at least in some later patch) use the vector if it exists?


> @@ -140,5 +134,6 @@ int __init pci_mmcfg_arch_init(void)
>  {
>  	printk(KERN_INFO "PCI: Using MMCONFIG\n");
>  	raw_pci_ops = &pci_mmcfg;
> +	raw_pci_ext_ops = &pci_mmcfg;

Why set BOTH vectors? You probably want to set ONLY the ext one, so
that calls to the lower 256 bytes go to the original.


^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-13 12:43                                               ` Tony Camuso
@ 2008-01-13 17:03                                                 ` Arjan van de Ven
  2008-01-13 21:28                                                   ` Tony Camuso
  0 siblings, 1 reply; 125+ messages in thread
From: Arjan van de Ven @ 2008-01-13 17:03 UTC (permalink / raw)
  To: tcamuso
  Cc: Ivan Kokshaysky, Greg KH, Matthew Wilcox, Linus Torvalds,
	Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Benjamin Herrenschmidt, Martin Mares, Loic Prylli,
	Prarit Bhargava, Chumbalkar, Nagananda, Schoeller,
	Patrick (Linux - Houston, TX),
	Bhavana Nagendra

On Sun, 13 Jan 2008 07:43:11 -0500
Tony Camuso <tcamuso@redhat.com> wrote:

> Arjan van de Ven wrote:
> > On Sat, 12 Jan 2008 20:36:59 -0500
> > Tony Camuso <tcamuso@redhat.com> wrote:
> > 
> >
> > Just about NOBODY has devices that need the extended config space.
> > At all.
> 
> The PCI express spec requires the platform to provide access to this
> space for express-compliance.

PLATFORM not OS :)
Windows isn't using it in the server space, and in the client space it has only
recently started considering it.

> More devices will be using this space
> as express becomes the dominant IO bus technology.

sure in like 2009 maybe.


> Which is why Loic's proposal and Ivan's implementation of it is so
> elegant. It solves all these problems in one sweep, and eliminates
> the code rendered cruft by Ivan's patch. A two-fer, by my reckoning.
> 
> >> In other words, for x86, I don't think we need to worry about Port
> >> IO config access ever going away at all.
> > 
> > You're wrong there. Sad to say, but you're wrong there.
> > 
> 
> The PCI spec provides for conf1 as an architected solution. It's not
> going away, and especially not in x86 land where Port IO is built-in
> to the CPU.

again sadly you're wrong. 

-- 
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-12 17:45                                 ` Arjan van de Ven
  2008-01-12 18:17                                   ` Matthew Wilcox
  2008-01-12 21:49                                   ` Ivan Kokshaysky
@ 2008-01-13 18:23                                   ` Loic Prylli
  2008-01-13 18:41                                     ` Arjan van de Ven
  2 siblings, 1 reply; 125+ messages in thread
From: Loic Prylli @ 2008-01-13 18:23 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Ivan Kokshaysky, Greg KH, Matthew Wilcox, Linus Torvalds,
	Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Benjamin Herrenschmidt, Martin Mares, Tony Camuso



On 1/12/2008 12:45 PM, Arjan van de Ven wrote:
> On Sat, 12 Jan 2008 17:40:30 +0300
> Ivan Kokshaysky <ink@jurassic.park.msu.ru> wrote:
>   
>>  
>> +	if (reg < 256)
>> +		return pci_conf1_read(seg,bus,devfn,reg,len,value);
>> +
>>     
>
>
> btw this is my main objection to your patch; it intertwines the conf1 and mmconfig code even more.
> When (and I'm saying "when" not "if") systems arrive that only have MMCONFIG for some of the devices,
> we'll have to detangle this again, and I'm really not looking forward to that.
>   


conf1 has been a hardcoded dependency of mmconfig for years. Ivan's 
patch does not make it worse (in fact it considerably simplifies that 
code, making it easier to untangle later).


IMHO, either your patch or Ivan's can be a good base, but:

1) For your remark above to be given any consideration, your patch 
should be modified to remove the hardcoded conf1 from the *current* 
mmconfig code, otherwise we end up with 3 sets of ops (mmconfig + conf1 +
a possible third set of operations) intertwined in a confusing manner. 
And removing that dependency is not a straightforward operation unless 
you also do 2):

2) The pci_enable_ext_config() function, the dev->ext_cfg_space field
and the sysfs interface should be removed from the patch.  There has never
been a problem report of crashes or any undefined behaviour while trying to
access ext-conf-space; all the problems were *using mmconfig to access
legacy-conf-space*. The "if (dev->cfg_space_ext > 0)" checks can instead
be replaced by "if (reg >= 256)".
Otherwise when using per-device explicit enabling, just *checking*
whether ext-conf-space is available by calling pci_enable_ext_config()
will make some of the old problems of *losing legacy conf-space* come
back: you would have introduced a new user-space and kernel API while
only solving half the problems, not a good deal.

if you do 1) and 2), then you really support the good properties you 
claimed:
- You can use mmconfig for ext-space and something else for legacy-space.
- You can use mmconfig for everything (for instance if conf1 is not 
implemented).

Of course it is as straightforward to modify Ivan's patch to also have 
the same properties.
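
To make the difference in point 2) concrete, here is a small sketch, assuming
the behaviour as I describe it above rather than quoting either patch. The
enum and both function names are hypothetical. The first selector keys on a
per-device opt-in flag, the second keys only on the register offset.

```c
#include <stdbool.h>

enum cfg_method { CFG_CONF1, CFG_MMCONFIG, CFG_REFUSED };

/*
 * Per-device selection: once the device has opted in, every access to it,
 * legacy offsets included, goes through MMCONFIG.
 */
enum cfg_method select_by_device_flag(bool ext_enabled, int reg)
{
	if (ext_enabled)
		return CFG_MMCONFIG;
	return reg < 256 ? CFG_CONF1 : CFG_REFUSED;
}

/*
 * Per-offset selection: legacy offsets always use conf1, and only offsets
 * at or above 256 ever touch MMCONFIG.
 */
enum cfg_method select_by_offset(bool mmconfig_available, int reg)
{
	if (reg < 256)
		return CFG_CONF1;
	return mmconfig_available ? CFG_MMCONFIG : CFG_REFUSED;
}
```

For a BAR read at offset 0x10 on an opted-in device, the first selector
picks MMCONFIG and the second still picks conf1; that is the case where the
old legacy-conf-space problems could come back.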



Loic


^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-13 18:23                                   ` Loic Prylli
@ 2008-01-13 18:41                                     ` Arjan van de Ven
  2008-01-13 20:43                                       ` Matthew Wilcox
  2008-01-13 20:51                                       ` Loic Prylli
  0 siblings, 2 replies; 125+ messages in thread
From: Arjan van de Ven @ 2008-01-13 18:41 UTC (permalink / raw)
  To: Loic Prylli
  Cc: Ivan Kokshaysky, Greg KH, Matthew Wilcox, Linus Torvalds,
	Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Benjamin Herrenschmidt, Martin Mares, Tony Camuso

On Sun, 13 Jan 2008 13:23:35 -0500
Loic Prylli <loic@myri.com> wrote:

Matthew pointed to a patch that basically does what you suggested; only one comment on your mail left after that:

> 
> 2) The pci_enable_ext_config() function, the dev->ext_cfg_space field
> and the sysfs interface should be removed from the patch.  There has
> never been a problem report of crashes or any undefined behaviour while
> trying to access ext-conf-space; all the problems were *using
> mmconfig to access legacy-conf-space*.


This entirely misses the point of why I made the patch. The point is NOT
that devices are buggy. The point is that right now, 99.99% of the machines
out there do NOT need extended config space (no matter how it gets accessed),
yet at the same time they suffered from its issues for... what 2 years now?
The point of my patch was to make people who don't need extended config space,
not have to deal with it anymore.

Note: There is not a 100% overlap between "need" and "will not be used in 
the patches that use legacy for < 256". In the other patches posted, 
extended config space will be used in cases where it won't be with my 
patch. (Most obvious one is an "lspci -vx" from automated scripts). 
Is that a problem? We've had 2 years of mess, with one not-enough patch after another.
There still are problems TODAY (e.g. in 2.6.24-rc7). The patch that falls back
to an alternative method for below 256 is no doubt a step in the right direction. 
(although I'm not all that happy about mixing access types, it's not provably incorrect)
Is it enough? I'm not sure. Only time can tell I suppose, but the risk side is that
if it is not enough, users who don't need the extended config space for functionality
will suffer the bugs AGAIN.

So in short, my approach was NOT about "fix PCI", it is about "fix the user experience".
It's a stopgap for sure, until the underlying mechanism gets reliable. It's been 2 years.....
maybe this next step is "it", maybe it isn't.


^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-13 18:41                                     ` Arjan van de Ven
@ 2008-01-13 20:43                                       ` Matthew Wilcox
  2008-01-13 21:18                                         ` Loic Prylli
  2008-01-13 20:51                                       ` Loic Prylli
  1 sibling, 1 reply; 125+ messages in thread
From: Matthew Wilcox @ 2008-01-13 20:43 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Loic Prylli, Ivan Kokshaysky, Greg KH, Linus Torvalds, Greg KH,
	linux-kernel, Jeff Garzik, linux-pci, Benjamin Herrenschmidt,
	Martin Mares, Tony Camuso

On Sun, Jan 13, 2008 at 10:41:24AM -0800, Arjan van de Ven wrote:
> Note: There is not a 100% overlap between "need" and "will not be used in 
> the patches that use legacy for < 256". In the other patches posted, 
> extended config space will be used in cases where it won't be with my 
> patch. (Most obvious one is an "lspci -vx" from automated scripts). 

I believe you to be mistaken in this belief.  If you take Ivan's patch,
conf1 is used for all accesses below 256 bytes.  lspci -x only dumps
config space up to 64 bytes; lspci -xxxx is needed to show extended pci
config space.
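
The offset ranges involved can be sketched as follows; the enum and function
names are hypothetical, and the conf1 rule is only my reading of Ivan's patch
as described above (everything below 256 goes to conf1).

```c
enum cfg_region {
	CFG_HEADER,		/* 0..63: standard header */
	CFG_DEVICE_SPECIFIC,	/* 64..255: rest of legacy config space */
	CFG_EXTENDED,		/* 256..4095: extended config space */
	CFG_INVALID
};

/* Classify a config-space byte offset into one of the three ranges. */
enum cfg_region cfg_region_of(int reg)
{
	if (reg < 0 || reg >= 4096)
		return CFG_INVALID;
	if (reg < 64)
		return CFG_HEADER;
	if (reg < 256)
		return CFG_DEVICE_SPECIFIC;
	return CFG_EXTENDED;
}

/* Nonzero when an offset-split scheme would route the access to conf1. */
int routed_to_conf1(int reg)
{
	return reg >= 0 && reg < 256;
}
```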

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-13 18:41                                     ` Arjan van de Ven
  2008-01-13 20:43                                       ` Matthew Wilcox
@ 2008-01-13 20:51                                       ` Loic Prylli
  1 sibling, 0 replies; 125+ messages in thread
From: Loic Prylli @ 2008-01-13 20:51 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Ivan Kokshaysky, Greg KH, Matthew Wilcox, Linus Torvalds,
	Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Benjamin Herrenschmidt, Martin Mares, Tony Camuso



On 1/13/2008 1:41 PM, Arjan van de Ven wrote:
> On Sun, 13 Jan 2008 13:23:35 -0500
> Loic Prylli <loic@myri.com> wrote:
>
> Matthew pointed to a patch that basically does what you suggested; only one comment on your mail left after that:
>
>   
>> 2) The pci_enable_ext_config() function, the dev->ext_cfg_space field
>> and the sysfs interface should be removed from the patch.  There has
>> never been a problem report of crashes or any undefined behaviour while
>> trying to access ext-conf-space; all the problems were *using
>> mmconfig to access legacy-conf-space*.
>>     
>
>
> This entirely misses the point of why I made the patch. The point is NOT
> that devices are buggy. The point is that right now, 99.99% of the machines
> out there do NOT need extended config space (no matter how it gets accessed),
>   

> The point of my patch was to make people who don't need extended config space,
> not have to deal with it anymore.
>   




I think I got your point the first time, and I agree it is sound. But in 
my subjective and biased opinion,  I just think ext-conf-space is 
already useful and widespread enough (being used is not the same as 
being strictly required for basic operation) for your proposed tradeoff 
to not be optimal (protecting against "future/non-proven" hardware bugs, 
i.e. bringing non-proven benefits, at the expense of making life harder 
for ext-conf-space users while bringing additional extra API/code).


To take an example from the Linux tree: the drivers/pci/pcie/aer code 
uses ext-conf-space for every pcie-root (currently several distributions 
enable it by default), does it mean opt-in would be automatically 
activated for most pcie hierarchies (defeating most of the benefits of 
being opt-in), or we just disable that code by default?


Will lspci -v automatically opt in all pcie devices (right now by default
it tries to list the extended-capabilities for pcie and pcix), or do we
now require manual explicit sysfs operations to get the whole thing? Is
it an additional flag to lspci (if so, will that flag also apply to pcix,
possibly causing a crash for lspci -v
-<opt-in-all-potential-ext-devices> on some machines)?



> Note: There is not a 100% overlap between "need" and "will not be used in 
> the patches that use legacy for < 256". In the other patches posted, 
> extended config space will be used in cases where it won't be with my 
> patch. (Most obvious one is an "lspci -vx" from automated scripts).



To go one step in your direction, I have already argued in a couple of
emails that I would prefer not to implement ext-conf-space access for
any PCI-X devices (removing PCI-X2 from pci_ext_cfg_size), because there
we are trying to support devices that we don't really know exist or
will ever exist. And protecting against "unproven bugs" makes more
sense when it only removes "unproven benefits".



>  
> Is that a problem? We've had 2 years of mess, with one not-enough patch after another.
>   
> There still are problems TODAY (e.g. in 2.6.24-rc7). The patch that falls back
> to an alternative method for below 256 is no doubt a step in the right direction. 
> (although I'm not all that happy about mixing access types, it's not provably incorrect)
> Is it enough? I'm not sure. 


FWIW, I have in my tree a patch almost identical to Ivan's dated 
"December 2005". Because of the constant activity on the mmconfig front 
(that I thought would make it obsolete), I never took the effort of 
suggesting it before one month ago (I am not a regular user of 
linux-kernel). I admit nobody else should view it that way, but for me  
rather than the last attempt at fixing mmconfig, it's a patch first used 
two years ago that would have arguably prevented all problems that have 
been reported since then.

Besides, recent mails show that hypothetically, we could even leave the
existing conf-space code unchanged, since the only known bug
remaining is the one associated with bar probing and could be addressed by:

http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc6/2.6.24-rc6-mm1/broken-out/pci-disable-decoding-during-sizing-of-bars.patch


[ Thanks to Robert Hancock and Grant Grundler for explaining to me the 
history of bar-probing last month  ]


Even if that bar-probing patch were applied (maybe it needs to be more
combat-proven), by default it still seems better not to use mmconfig
for legacy-conf-space access, but going two extra precaution steps
beyond what seems necessary might be excessive.




> Only time can tell I suppose, but the risk side is that
> if it is not enough, users who don't need the extended config space for functionality
> will suffer the bugs AGAIN.
>   



You can indeed never exclude 100% that possibility, but if they see a 
problem again, it is likely to be a new category of hardware/BIOS bugs 
never seen before.



Loic


^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-13 20:43                                       ` Matthew Wilcox
@ 2008-01-13 21:18                                         ` Loic Prylli
  0 siblings, 0 replies; 125+ messages in thread
From: Loic Prylli @ 2008-01-13 21:18 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Arjan van de Ven, Ivan Kokshaysky, Greg KH, Linus Torvalds,
	Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Benjamin Herrenschmidt, Martin Mares, Tony Camuso



On 1/13/2008 3:43 PM, Matthew Wilcox wrote:
> On Sun, Jan 13, 2008 at 10:41:24AM -0800, Arjan van de Ven wrote:
>   
>> Note: There is not a 100% overlap between "need" and "will not be used in 
>> the patches that use legacy for < 256". In the other patches posted, 
>> extended config space will be used in cases where it won't be with my 
>> patch. (Most obvious one is an "lspci -vx" from automated scripts). 
>>     
>
> I believe you to be mistaken in this belief.  If you take Ivan's patch,
> conf1 is used for all accesses below 256 bytes.  lspci -x only dumps
> config space up to 64 bytes; lspci -xxxx is needed to show extended pci
> config space.
>   


I agree with Arjan about that "not a 100% overlap". It is about the 
extra ext-conf-space access done while probing in drivers/pci/probe.c:
    dev->cfg_size = pci_cfg_space_size(dev);

(and lspci -v will also query/show the list of extended-caps for 
pci-x/pcie-x devices that have some, provided the kernel can access 
ext-conf-space).

With Ivan's patch, that line would still cause one extended-conf-space
access at offset 256 for pcie/pci-x2 devices (to check the ability to
query ext-space). Arjan's "opt-in" patch would prevent that extra access.
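
A simplified sketch of that probe, assuming a hypothetical read callback in
place of the real config accessors; the names below are illustrative and this
is not the actual pci_cfg_space_size() source. It shows the single dword read
at offset 256 and the all-ones check that the discussion is about.

```c
#include <stdbool.h>
#include <stdint.h>

#define CFG_SPACE_SIZE		256
#define CFG_SPACE_EXP_SIZE	4096

/* Hypothetical accessor: returns 0 on success and fills *val. */
typedef int (*cfg_read_fn)(void *ctx, int reg, uint32_t *val);

int probe_cfg_space_size(bool is_pcie_or_pcix2, cfg_read_fn read_dword,
			 void *ctx)
{
	uint32_t v;

	/* Conventional PCI devices are never probed above offset 255. */
	if (!is_pcie_or_pcix2)
		return CFG_SPACE_SIZE;
	/* The one extended access: a dword read at offset 256. */
	if (read_dword(ctx, 256, &v) != 0)
		return CFG_SPACE_SIZE;
	/* All ones means nothing answered (master abort). */
	if (v == 0xffffffffu)
		return CFG_SPACE_SIZE;
	return CFG_SPACE_EXP_SIZE;
}

/* Demo accessor: nothing responds; ctx counts the accesses made. */
int read_master_abort(void *ctx, int reg, uint32_t *val)
{
	(void)reg;
	++*(int *)ctx;
	*val = 0xffffffffu;
	return 0;
}

/* Demo accessor: an extended capability header (ID 0x0001, version 1). */
int read_with_ext_cap(void *ctx, int reg, uint32_t *val)
{
	(void)reg;
	++*(int *)ctx;
	*val = 0x00010001u;
	return 0;
}
```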

IMHO that access is OK and harmless in all cases, we are already
protected by MCFG/e820 checks, but I agree one can express a different
opinion based on trying to prevent "never-seen/potential" hardware/BIOS
bugs. FWIW it is also there that I was suggesting to exclude PCI-X2
devices (when restricted to pcie, that access while probing cannot even
cause the harmless master-abort/0xffffffff), but there is a small trade-off.


Loic


^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-13 17:03                                                 ` Arjan van de Ven
@ 2008-01-13 21:28                                                   ` Tony Camuso
  2008-01-14  0:54                                                     ` Alan Cox
  0 siblings, 1 reply; 125+ messages in thread
From: Tony Camuso @ 2008-01-13 21:28 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Ivan Kokshaysky, Greg KH, Matthew Wilcox, Linus Torvalds,
	Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Benjamin Herrenschmidt, Martin Mares, Loic Prylli,
	Prarit Bhargava, Chumbalkar, Nagananda, Schoeller,
	Patrick (Linux - Houston, TX),
	Bhavana Nagendra

Arjan van de Ven wrote:

>> The PCI spec provides for conf1 as an architected solution. It's not
>> going away, and especially not in x86 land where Port IO is built-in
>> to the CPU.
> 
> again sadly you're wrong. 
> 

As someone gently pointed out to me, you are in a position to know this,
so I probably am wrong.


^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-13 21:28                                                   ` Tony Camuso
@ 2008-01-14  0:54                                                     ` Alan Cox
  2008-01-14  1:33                                                       ` Arjan van de Ven
  2008-01-14  5:20                                                       ` Linus Torvalds
  0 siblings, 2 replies; 125+ messages in thread
From: Alan Cox @ 2008-01-14  0:54 UTC (permalink / raw)
  To: tcamuso
  Cc: Arjan van de Ven, Ivan Kokshaysky, Greg KH, Matthew Wilcox,
	Linus Torvalds, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Benjamin Herrenschmidt, Martin Mares, Loic Prylli,
	Prarit Bhargava, Chumbalkar, Nagananda, Schoeller,
	Patrick (Linux - Houston, TX),
	Bhavana Nagendra

On Sun, 13 Jan 2008 16:28:08 -0500
Tony Camuso <tcamuso@redhat.com> wrote:

> Arjan van de Ven wrote:
> 
> >> The PCI spec provides for conf1 as an architected solution. It's not
> >> going away, and especially not in x86 land where Port IO is built-in
> >> to the CPU.
> > 
> > again sadly you're wrong. 
> > 
> 
> As someone gently pointed out to me, you are in a position to know this,
> so I probably am wrong.

I suspect Arjan is wrong. It might be some Intel agenda but I still see
fairly new driver reference code that is hardcoding port accesses even
when designed for Redmond products.

Alan

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-14  0:54                                                     ` Alan Cox
@ 2008-01-14  1:33                                                       ` Arjan van de Ven
  2008-01-14  3:29                                                         ` Tony Camuso
  2008-01-14  9:11                                                         ` Alan Cox
  2008-01-14  5:20                                                       ` Linus Torvalds
  1 sibling, 2 replies; 125+ messages in thread
From: Arjan van de Ven @ 2008-01-14  1:33 UTC (permalink / raw)
  To: Alan Cox
  Cc: tcamuso, Ivan Kokshaysky, Greg KH, Matthew Wilcox,
	Linus Torvalds, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Benjamin Herrenschmidt, Martin Mares, Loic Prylli,
	Prarit Bhargava, Chumbalkar, Nagananda, Schoeller,
	Patrick (Linux - Houston, TX),
	Bhavana Nagendra

On Mon, 14 Jan 2008 00:54:34 +0000
Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:

> On Sun, 13 Jan 2008 16:28:08 -0500
> Tony Camuso <tcamuso@redhat.com> wrote:
> 
> > Arjan van de Ven wrote:
> > 
> > >> The PCI spec provides for conf1 as an architected solution. It's
> > >> not going away, and especially not in x86 land where Port IO is
> > >> built-in to the CPU.
>

> 
> I suspect Arjan is wrong. It might be some Intel agenda but I still
> see fairly new driver reference code that is hardcoding port accesses
> even when designed for Redmond products.


I find it hard to believe that even they have their drivers do PCI config access via ports directly from the drivers,
and especially in driver reference code...


-- 
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-14  1:33                                                       ` Arjan van de Ven
@ 2008-01-14  3:29                                                         ` Tony Camuso
  2008-01-14  5:05                                                           ` Arjan van de Ven
  2008-01-14  9:11                                                         ` Alan Cox
  1 sibling, 1 reply; 125+ messages in thread
From: Tony Camuso @ 2008-01-14  3:29 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Alan Cox, Ivan Kokshaysky, Greg KH, Matthew Wilcox,
	Linus Torvalds, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Benjamin Herrenschmidt, Martin Mares, Loic Prylli,
	Prarit Bhargava, Chumbalkar, Nagananda, Schoeller,
	Patrick (Linux - Houston, TX),
	Bhavana Nagendra

To all ...

Well, here is what I perceive we've got so far.

. Some PCI Northbridges do not work with MMCONFIG.

. Some PCI BARs can overlap the MMCONFIG area during bus sizing.
   It is hoped that new BIOSes will locate MMCONFIG in an area
   safely out of the way of bus sizing code, but there can be
   no guarantees.

. conf1 is going away in newer x86 implementations in the not
   too distant future.

. The PCI express spec requires platforms to provide access to
   the extended config area, and there are express devices today
   using that area for AER.

. There is no need to provide different PCI config access
   mechanisms at device granularity, since the PCI config access
   mechanism between the CPU and the Northbridge is opaque to
   the devices. PCI config mechanisms only need to differ at
   the Northbridge level.

. We have a flurry of patches all claiming to solve all or some
   of these problems.


Arjan,

I realize it may not be possible for you to answer this question,
but I feel compelled to ask it anyway. Is it possible that future
x86 architectures will be implementing a SAL-like interface to
abstract PCI config access altogether?

Or can we condense these patches down to a set that does the
following?

. If the system is capable of conf1, then PCI config access
   at offsets < 256 should be confined to conf1. This solution
   is most effective for existing and legacy systems.

. If the system does not support MMCONFIG, or if MMCONFIG is
   not working, then accesses to offsets >= 256 return -1 and an
   error status.

. For systems, where the conf1 mechanism is NOT available,
   then MMCONFIG should be the PCI access mechanism for all
   offsets. For such systems, we must assume that the BIOS has
   become smart enough to locate MMCONFIG in a region safe from
   encroachment by bus sizing code.
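
The three rules above can be sketched as one selection function. This is
only an illustration of the proposed policy under the stated assumptions;
the enum and the function name are hypothetical, and CFG_NONE stands for
"return -1 and an error status".

```c
#include <stdbool.h>

enum cfg_method { CFG_CONF1, CFG_MMCONFIG, CFG_NONE };

enum cfg_method choose_method(bool has_conf1, bool mmconfig_works, int reg)
{
	if (has_conf1) {
		/* Rule 1: legacy offsets stay on conf1 when it exists. */
		if (reg < 256)
			return CFG_CONF1;
		/* Rule 2: extended offsets need a working MMCONFIG. */
		return mmconfig_works ? CFG_MMCONFIG : CFG_NONE;
	}
	/* Rule 3: without conf1, MMCONFIG serves every offset. */
	return mmconfig_works ? CFG_MMCONFIG : CFG_NONE;
}
```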




^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-14  3:29                                                         ` Tony Camuso
@ 2008-01-14  5:05                                                           ` Arjan van de Ven
  2008-01-14 13:01                                                             ` Tony Camuso
  0 siblings, 1 reply; 125+ messages in thread
From: Arjan van de Ven @ 2008-01-14  5:05 UTC (permalink / raw)
  To: tcamuso
  Cc: Alan Cox, Ivan Kokshaysky, Greg KH, Matthew Wilcox,
	Linus Torvalds, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Benjamin Herrenschmidt, Martin Mares, Loic Prylli,
	Prarit Bhargava, Chumbalkar, Nagananda, Schoeller,
	Patrick (Linux - Houston, TX),
	Bhavana Nagendra

On Sun, 13 Jan 2008 22:29:23 -0500
Tony Camuso <tcamuso@redhat.com> wrote:

> . There is no need to provide different PCI config access
>    mechanisms at device granularity, since the PCI config access
>    mechanism between the CPU and the Northbridge is opaque to
>    the devices. PCI config mechanisms only need to differ at
>    the Northbridge level.

This ignores the "let's make it not matter for the 99% of the users" case.
> 
> . If the system is capable of conf1, then PCI config access
>    at offsets < 256 should be confined to conf1. This solution
>    is most effective for existing and legacy systems.

not "conf1" but "what the platform thinks is the best method for < 256".

We have this nice abstraction for the platform to select the best method... we should use it.

And still, it's another attempt to get this fixed (well.. it's been 2 years in the coming so far, maybe this will
be the last one, maybe it will not be... we'll see I suppose, but it sucks to be a user who doesn't 
need any of the functionality that the extended config space provides in theory but gets to suffer more of the issues)

I'm all in favor of making this more reliable, but really..
we've thought it was fixed time and time again over the last two years. Please consider
limiting the scope of the damage as well.




-- 
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-14  0:54                                                     ` Alan Cox
  2008-01-14  1:33                                                       ` Arjan van de Ven
@ 2008-01-14  5:20                                                       ` Linus Torvalds
  1 sibling, 0 replies; 125+ messages in thread
From: Linus Torvalds @ 2008-01-14  5:20 UTC (permalink / raw)
  To: Alan Cox
  Cc: tcamuso, Arjan van de Ven, Ivan Kokshaysky, Greg KH,
	Matthew Wilcox, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Benjamin Herrenschmidt, Martin Mares, Loic Prylli,
	Prarit Bhargava, Chumbalkar, Nagananda, Schoeller,
	Patrick (Linux - Houston, TX),
	Bhavana Nagendra



On Mon, 14 Jan 2008, Alan Cox wrote:
> > 
> > As someone gently pointed out to me, you are in a position to know this,
> > so I probably am wrong.
> 
> I suspect Arjan is wrong. It might be some Intel agenda but I still see
> fairly new driver reference code that is hardcoding port accesses even
> when designed for Redmond products.

Agreed. I suspect that the likelihood of conf1 accesses going away in the 
next five years is slim to none.

			Linus

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-14  1:33                                                       ` Arjan van de Ven
  2008-01-14  3:29                                                         ` Tony Camuso
@ 2008-01-14  9:11                                                         ` Alan Cox
  1 sibling, 0 replies; 125+ messages in thread
From: Alan Cox @ 2008-01-14  9:11 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: tcamuso, Ivan Kokshaysky, Greg KH, Matthew Wilcox,
	Linus Torvalds, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Benjamin Herrenschmidt, Martin Mares, Loic Prylli,
	Prarit Bhargava, Chumbalkar, Nagananda, Schoeller,
	Patrick (Linux - Houston, TX),
	Bhavana Nagendra

> > even when designed for Redmond products.
> 
> I find it hard to believe that even they have their drivers do PCI config access via ports directly from the drivers,
> and especially in driver reference code...

Microsoft may not but the standard of Taiwanese driver code (and by
reference I mean vendor reference not OS supplier reference) is not
always great. When you have weeks to write a driver for a product with a
6 month sales lifetime I guess there are other pressures on driver
authors.

Easy enough for Intel to analyse though.

Alan

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-14  5:05                                                           ` Arjan van de Ven
@ 2008-01-14 13:01                                                             ` Tony Camuso
  2008-01-14 14:46                                                               ` Arjan van de Ven
  0 siblings, 1 reply; 125+ messages in thread
From: Tony Camuso @ 2008-01-14 13:01 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Alan Cox, Ivan Kokshaysky, Greg KH, Matthew Wilcox,
	Linus Torvalds, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Benjamin Herrenschmidt, Martin Mares, Loic Prylli,
	Prarit Bhargava, Chumbalkar, Nagananda, Schoeller,
	Patrick (Linux - Houston, TX),
	Bhavana Nagendra

Arjan van de Ven wrote:
> On Sun, 13 Jan 2008 22:29:23 -0500
> Tony Camuso <tcamuso@redhat.com> wrote:
> 
>> . There is no need to provide different PCI config access
>>    mechanisms at device granularity, since the PCI config access
>>    mechanism between the CPU and the Northbridge is opaque to
>>    the devices. PCI config mechanisms only need to differ at
>>    the Northbridge level.
> 
> This ignores the "let's make it not matter for the 99% of the users" case.

I don't understand. If we're going to differentiate MMCONFIG from some other
access mechanism, it only needs to be done at the Northbridge level. Devices
are electrically ignorant of the protocol used between CPU and Northbridge
to get the Northbridge to assert config cycles on the bus.
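
As an illustration of why the choice is invisible to the device, here is a
sketch of the two CPU-side encodings for the same (bus, devfn, reg) target,
assuming the standard conf1 layout (address written to port 0xCF8) and the
standard MMCONFIG layout (byte offset from the window base). The function
names are hypothetical.

```c
#include <stdint.h>

/* conf1: dword address written to port 0xCF8; only 8 bits of register. */
uint32_t conf1_address(unsigned int bus, unsigned int devfn, unsigned int reg)
{
	return 0x80000000u | ((bus & 0xffu) << 16) |
	       ((devfn & 0xffu) << 8) | (reg & 0xfcu);
}

/* MMCONFIG: byte offset into the mapped window; 12 bits of register. */
uint32_t mmcfg_offset(unsigned int bus, unsigned int devfn, unsigned int reg)
{
	return ((bus & 0xffu) << 20) | ((devfn & 0xffu) << 12) |
	       (reg & 0xfffu);
}
```

Either value ends up as the same config cycle for offsets below 256; the
conf1 form simply has no bits left to express offset 256 and above, so in
this encoding offset 0x100 aliases to offset 0.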

>> . If the system is capable of conf1, then PCI config access
>>    at offsets < 256 should be confined to conf1. This solution
>>    is most effective for existing and legacy systems.
> 
> not "conf1" but "what the platform thinks is the best method for < 256".
> 
> We have this nice abstraction for the platform to select the best method... we should use it.
> 
Agreed.

So we have Loic and Ivan's patch limiting MMCONFIG accesses to
offsets >= 256.

And we have Matthew's patch that abstracts the method for config
accesses to offsets < 256.

I believe Matthew has already tested these patches for functionality
on x86. All that's needed is to test for regressions on other arches.

Is there any interest in providing the following?

1. The ability to use MMCONFIG for all accesses on systems that have
    no problems with MMCONFIG.

2. For systems using both PCI and PCI express, testing each bus
    for MMCONFIG compliance, to determine whether MMCONFIG can be
    used for all config accesses or whether all accesses on that bus
    must be limited to the method abstracted for offsets < 256.

Or does that introduce unnecessary complications?



^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-14 13:01                                                             ` Tony Camuso
@ 2008-01-14 14:46                                                               ` Arjan van de Ven
  2008-01-14 15:23                                                                 ` Tony Camuso
  0 siblings, 1 reply; 125+ messages in thread
From: Arjan van de Ven @ 2008-01-14 14:46 UTC (permalink / raw)
  To: tcamuso
  Cc: Alan Cox, Ivan Kokshaysky, Greg KH, Matthew Wilcox,
	Linus Torvalds, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Benjamin Herrenschmidt, Martin Mares, Loic Prylli,
	Prarit Bhargava, Chumbalkar, Nagananda, Schoeller,
	Patrick (Linux - Houston, TX),
	Bhavana Nagendra

On Mon, 14 Jan 2008 08:01:01 -0500
Tony Camuso <tcamuso@redhat.com> wrote:

> Arjan van de Ven wrote:
> > On Sun, 13 Jan 2008 22:29:23 -0500
> > Tony Camuso <tcamuso@redhat.com> wrote:
> > 
> >> . There is no need to provide different PCI config access
> >>    mechanisms at device granularity, since the PCI config access
> >>    mechanism between the CPU and the Northbridge is opaque to
> >>    the devices. PCI config mechanisms only need to differ at
> >>    the Northbridge level.
> > 
> > This ignores the "lets make it not matter for the 99% of the users"
> > case.
> 
> I don't understand. 

That's clear :)

> If we're going to differentiate MMCONFIG from
> some other access mechanism, it only needs to be done at the
> Northbridge level. Devices are electrically ignorant of the protocol
> used between CPU and Northbridge to get the Northbridge to assert
> config cycles on the bus.

Again this is about having systems that don't need extended config space not use it. At all.
The only way to do that is have the drivers say they need it, and not use it otherwise.
It has NOTHING to do with how things are wired up. It's purely a kernel-level policy decision
about whether to use extended config space AT ALL.
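
As a user-space illustration of that policy, assuming a per-device flag
that a driver sets when it opts in (model_dev, ext_cfg_enabled and both
functions below are hypothetical stand-ins, not code from the posted
patch):

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a PCI device; only the opt-in flag matters. */
struct model_dev {
	bool ext_cfg_enabled;	/* set once the driver has opted in */
	uint8_t cfg[4096];	/* fake config space backing store */
};

/* Models the driver opt-in; the body is an assumption for illustration. */
static void model_enable_extended_config_space(struct model_dev *dev)
{
	dev->ext_cfg_enabled = true;
}

/* Read one byte; offsets >= 256 are refused until the driver opts in. */
static int model_read_config_byte(const struct model_dev *dev, int where,
				  uint8_t *val)
{
	if (where < 0 || where >= 4096)
		return -EINVAL;
	if (where >= 256 && !dev->ext_cfg_enabled)
		return -EINVAL;
	*val = dev->cfg[where];
	return 0;
}
```

Nothing in the model says which mechanism serves the access. The flag
only decides whether offsets at or above 256 are reachable at all.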



-- 
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-14 14:46                                                               ` Arjan van de Ven
@ 2008-01-14 15:23                                                                 ` Tony Camuso
  2008-01-14 16:01                                                                   ` Arjan van de Ven
  0 siblings, 1 reply; 125+ messages in thread
From: Tony Camuso @ 2008-01-14 15:23 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Alan Cox, Ivan Kokshaysky, Greg KH, Matthew Wilcox,
	Linus Torvalds, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Benjamin Herrenschmidt, Martin Mares, Loic Prylli,
	Prarit Bhargava, Chumbalkar, Nagananda, Schoeller,
	Patrick (Linux - Houston, TX),
	Bhavana Nagendra

Arjan van de Ven wrote:
> On Mon, 14 Jan 2008 08:01:01 -0500
> Tony Camuso <tcamuso@redhat.com> wrote:
 >>
>> If we're going to differentiate MMCONFIG from
>> some other access mechanism, it only needs to be done at the
>> Northbridge level. Devices are electrically ignorant of the protocol
>> used between CPU and Northbridge to get the Northbridge to assert
>> config cycles on the bus.
> 
> Again this is about having systems that don't need extended config space not use it. At all.
> The only way to do that is have the drivers say they need it, and not use it otherwise.
> It has NOTHING to do with how things are wired up. It's purely a kernel-level policy decision
> about whether to use extended config space AT ALL.
> 

The problem with compelling device drivers to determine the PCI
config mechanism is that it must be forced upon arches that
have no PCI configuration quirks or don't even use the same
PCI config mechanisms as x86.

I don't think that's a good policy.

Better to confine arch-specific quirks to the arch-specific code
whenever possible.


^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-14 15:23                                                                 ` Tony Camuso
@ 2008-01-14 16:01                                                                   ` Arjan van de Ven
  2008-01-14 16:08                                                                     ` Tony Camuso
  0 siblings, 1 reply; 125+ messages in thread
From: Arjan van de Ven @ 2008-01-14 16:01 UTC (permalink / raw)
  To: tcamuso
  Cc: Alan Cox, Ivan Kokshaysky, Greg KH, Matthew Wilcox,
	Linus Torvalds, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Benjamin Herrenschmidt, Martin Mares, Loic Prylli,
	Prarit Bhargava, Chumbalkar, Nagananda, Schoeller,
	Patrick (Linux - Houston, TX),
	Bhavana Nagendra

On Mon, 14 Jan 2008 10:23:14 -0500
Tony Camuso <tcamuso@redhat.com> wrote:

> Arjan van de Ven wrote:
> > On Mon, 14 Jan 2008 08:01:01 -0500
> > Tony Camuso <tcamuso@redhat.com> wrote:
>  >>
> >> If we're going to differentiate MMCONFIG from
> >> some other access mechanism, it only needs to be done at the
> >> Northbridge level. Devices are electrically ignorant of the
> >> protocol used between CPU and Northbridge to get the Northbridge
> >> to assert config cycles on the bus.
> > 
> > Again this is about having systems that don't need extended config
> > space not use it. At all. The only way to do that is have the
> > drivers say they need it, and not use it otherwise. It has NOTHING
> > to do with how things are wired up. It's purely a kernel-level policy
> > decision about whether to use extended config space AT ALL.
> > 
> 
> The problem with compelling device drivers to determine the PCI
> config mechanism is that it must be forced upon arches that
> have no PCI configuration quirks or don't even use the same
> PCI config mechanisms as x86.

it's not pci_enable_mmconf(), it's pci_enable_extended_config_space... it's independent of the mechanism!

> 
> I don't think that's a good policy.
> 
> Better to confine arch-specific quirks to the arch-specific code
> whenever possible.
> 


-- 
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-14 16:01                                                                   ` Arjan van de Ven
@ 2008-01-14 16:08                                                                     ` Tony Camuso
  0 siblings, 0 replies; 125+ messages in thread
From: Tony Camuso @ 2008-01-14 16:08 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Alan Cox, Ivan Kokshaysky, Greg KH, Matthew Wilcox,
	Linus Torvalds, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Benjamin Herrenschmidt, Martin Mares, Loic Prylli,
	Prarit Bhargava, Chumbalkar, Nagananda, Schoeller,
	Patrick (Linux - Houston, TX),
	Bhavana Nagendra

Arjan van de Ven wrote:

> it's not pci_enable_mmconf(), it's pci_enable_extended_config_space... it's independent of the mechanism!
> 
Arjan, you would be foisting this call on device drivers running on
arches that don't need any such distinction between extended config
space and < 256 bytes.

I still think it's a bad policy.

Let's endeavor to confine arch-specific quirks to the arch-specific
code.


^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-13 17:01                                     ` Arjan van de Ven
@ 2008-01-14 22:52                                       ` Matthew Wilcox
  2008-01-14 23:04                                         ` Adrian Bunk
  0 siblings, 1 reply; 125+ messages in thread
From: Matthew Wilcox @ 2008-01-14 22:52 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Benjamin Herrenschmidt, Ivan Kokshaysky, Greg KH, Linus Torvalds,
	Greg KH, linux-kernel, Jeff Garzik, linux-pci, Martin Mares,
	Tony Camuso, Loic Prylli

On Sun, Jan 13, 2008 at 09:01:08AM -0800, Arjan van de Ven wrote:
> would be nice the "reg > 256 && raw_pci_Ext_ops==NULL" case would just
> call the raw_pci_ops-> pointer, to give that a chance of refusal
> (but I guess that shouldn't really happen)

We don't have a situation where that can happen -- all the other current
config methods on x86 are limited to <256 bytes.  If we get another
method, we can revisit this.
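
For reference, the order of checks in the raw_pci_read() from the patch
below can be modelled in user space like this. The model_* names are
illustrative only, and the two back-ends just tag which path served the
read instead of touching hardware:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Same idea as pci_raw_ops in the patch, reduced to a read hook. */
struct model_raw_ops {
	int (*read)(int reg, uint32_t *val);
};

static int model_conf1_read(int reg, uint32_t *val)
{
	if (reg < 0 || reg > 255)	/* this method stops at 256 bytes */
		return -EINVAL;
	*val = 1;			/* tag: served by conf1 */
	return 0;
}

static int model_mmcfg_read(int reg, uint32_t *val)
{
	if (reg < 0 || reg > 4095)
		return -EINVAL;
	*val = 2;			/* tag: served by MMCONFIG */
	return 0;
}

static struct model_raw_ops model_conf1 = { .read = model_conf1_read };
static struct model_raw_ops model_mmcfg = { .read = model_mmcfg_read };

/* Mirrors the order of checks in the patch's raw_pci_read(). */
static int model_raw_pci_read(const struct model_raw_ops *ops,
			      const struct model_raw_ops *ext_ops,
			      int reg, uint32_t *val)
{
	if (reg < 256 && ops)
		return ops->read(reg, val);
	if (ext_ops)
		return ext_ops->read(reg, val);
	return -EINVAL;
}
```

With both vectors set, low offsets go to conf1 and high offsets to
MMCONFIG. With only the ext vector set, low offsets fall through to
MMCONFIG. With only the ordinary vector set, an extended offset gets
-EINVAL without that vector ever being called.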

> > -	pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);
> > +	pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);
> 
> 	couldn't this (at least in some next patch) use the vector if it exists?

I thought so, but due to the way that things are initialised, mmconfig
happens before conf1.  conf1 is known to be usable, but hasn't set
raw_pci_ops at this point.  Confusing, and not ideal, but fixing this
isn't in scope for 2.6.24.

> >  	printk(KERN_INFO "PCI: Using MMCONFIG\n");
> >  	raw_pci_ops = &pci_mmcfg;
> > +	raw_pci_ext_ops = &pci_mmcfg;
> 
> why set BOTH vectors? you probably ONLY want to set the ext one, so 
> that calls to the lower 256 go to the original

I had misunderstood how the x86 pci init happened -- I thought conf1
would override this.  It doesn't.

The following patch has been tested on ia64, x86 and x86_64.
It successfully avoids the hang on my G33 machine (ie BAR probing
problem), when applied *after* Ivan's patch.

Greg, please apply Ivan's patch and then this one.

---

PCI: Rationalise raw_pci_ops

Replace raw_pci_ops with raw_pci_read() and raw_pci_write().  This is
a better interface for ACPI, ia64 and now x86.

Make pci_raw_ops private to the x86 arch, and use it to implement
raw_pci_read/write.  Add a raw_pci_ext_ops for extended config space.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>

diff --git a/arch/ia64/pci/pci.c b/arch/ia64/pci/pci.c
index 488e48a..8fd7e82 100644
--- a/arch/ia64/pci/pci.c
+++ b/arch/ia64/pci/pci.c
@@ -43,8 +43,7 @@
 #define PCI_SAL_EXT_ADDRESS(seg, bus, devfn, reg)	\
 	(((u64) seg << 28) | (bus << 20) | (devfn << 12) | (reg))
 
-static int
-pci_sal_read (unsigned int seg, unsigned int bus, unsigned int devfn,
+int raw_pci_read(unsigned int seg, unsigned int bus, unsigned int devfn,
 	      int reg, int len, u32 *value)
 {
 	u64 addr, data = 0;
@@ -68,8 +67,7 @@ pci_sal_read (unsigned int seg, unsigned int bus, unsigned int devfn,
 	return 0;
 }
 
-static int
-pci_sal_write (unsigned int seg, unsigned int bus, unsigned int devfn,
+int raw_pci_write(unsigned int seg, unsigned int bus, unsigned int devfn,
 	       int reg, int len, u32 value)
 {
 	u64 addr;
@@ -91,24 +89,17 @@ pci_sal_write (unsigned int seg, unsigned int bus, unsigned int devfn,
 	return 0;
 }
 
-static struct pci_raw_ops pci_sal_ops = {
-	.read =		pci_sal_read,
-	.write =	pci_sal_write
-};
-
-struct pci_raw_ops *raw_pci_ops = &pci_sal_ops;
-
-static int
-pci_read (struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
+static int pci_read(struct pci_bus *bus, unsigned int devfn, int where,
+							int size, u32 *value)
 {
-	return raw_pci_ops->read(pci_domain_nr(bus), bus->number,
+	return raw_pci_read(pci_domain_nr(bus), bus->number,
 				 devfn, where, size, value);
 }
 
-static int
-pci_write (struct pci_bus *bus, unsigned int devfn, int where, int size, u32 value)
+static int pci_write(struct pci_bus *bus, unsigned int devfn, int where,
+							int size, u32 value)
 {
-	return raw_pci_ops->write(pci_domain_nr(bus), bus->number,
+	return raw_pci_write(pci_domain_nr(bus), bus->number,
 				  devfn, where, size, value);
 }
 
diff --git a/arch/ia64/sn/pci/tioce_provider.c b/arch/ia64/sn/pci/tioce_provider.c
index e1a3e19..999f14f 100644
--- a/arch/ia64/sn/pci/tioce_provider.c
+++ b/arch/ia64/sn/pci/tioce_provider.c
@@ -752,13 +752,13 @@ tioce_kern_init(struct tioce_common *tioce_common)
 	 * Determine the secondary bus number of the port2 logical PPB.
 	 * This is used to decide whether a given pci device resides on
 	 * port1 or port2.  Note:  We don't have enough plumbing set up
-	 * here to use pci_read_config_xxx() so use the raw_pci_ops vector.
+	 * here to use pci_read_config_xxx() so use raw_pci_read().
 	 */
 
 	seg = tioce_common->ce_pcibus.bs_persist_segment;
 	bus = tioce_common->ce_pcibus.bs_persist_busnum;
 
-	raw_pci_ops->read(seg, bus, PCI_DEVFN(2, 0), PCI_SECONDARY_BUS, 1,&tmp);
+	raw_pci_read(seg, bus, PCI_DEVFN(2, 0), PCI_SECONDARY_BUS, 1,&tmp);
 	tioce_kern->ce_port1_secondary = (u8) tmp;
 
 	/*
@@ -799,11 +799,11 @@ tioce_kern_init(struct tioce_common *tioce_common)
 
 		/* mem base/limit */
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_MEMORY_BASE, 2, &tmp);
 		base = (u64)tmp << 16;
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_MEMORY_LIMIT, 2, &tmp);
 		limit = (u64)tmp << 16;
 		limit |= 0xfffffUL;
@@ -817,21 +817,21 @@ tioce_kern_init(struct tioce_common *tioce_common)
 		 * attributes.
 		 */
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_PREF_MEMORY_BASE, 2, &tmp);
 		base = ((u64)tmp & PCI_PREF_RANGE_MASK) << 16;
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_PREF_BASE_UPPER32, 4, &tmp);
 		base |= (u64)tmp << 32;
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_PREF_MEMORY_LIMIT, 2, &tmp);
 
 		limit = ((u64)tmp & PCI_PREF_RANGE_MASK) << 16;
 		limit |= 0xfffffUL;
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_PREF_LIMIT_UPPER32, 4, &tmp);
 		limit |= (u64)tmp << 32;
 
diff --git a/arch/x86/kernel/quirks.c b/arch/x86/kernel/quirks.c
index fab30e1..7f73f7c 100644
--- a/arch/x86/kernel/quirks.c
+++ b/arch/x86/kernel/quirks.c
@@ -27,7 +27,7 @@ static void __devinit quirk_intel_irqbalance(struct pci_dev *dev)
 	pci_write_config_byte(dev, 0xf4, config|0x2);
 
 	/* read xTPR register */
-	raw_pci_ops->read(0, 0, 0x40, 0x4c, 2, &word);
+	raw_pci_read(0, 0, 0x40, 0x4c, 2, &word);
 
 	if (!(word & (1 << 13))) {
 		printk(KERN_INFO "Intel E7520/7320/7525 detected. "
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 8627463..f2bd9f3 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -26,16 +26,37 @@ int pcibios_last_bus = -1;
 unsigned long pirq_table_addr;
 struct pci_bus *pci_root_bus;
 struct pci_raw_ops *raw_pci_ops;
+struct pci_raw_ops *raw_pci_ext_ops;
+
+int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
+						int reg, int len, u32 *val)
+{
+	if (reg < 256 && raw_pci_ops)
+		return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
+	if (raw_pci_ext_ops)
+		return raw_pci_ext_ops->read(domain, bus, devfn, reg, len, val);
+	return -EINVAL;
+}
+
+int raw_pci_write(unsigned int domain, unsigned int bus, unsigned int devfn,
+						int reg, int len, u32 val)
+{
+	if (reg < 256 && raw_pci_ops)
+		return raw_pci_ops->write(domain, bus, devfn, reg, len, val);
+	if (raw_pci_ext_ops)
+		return raw_pci_ext_ops->write(domain, bus, devfn, reg, len, val);
+	return -EINVAL;
+}
 
 static int pci_read(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
 {
-	return raw_pci_ops->read(pci_domain_nr(bus), bus->number,
+	return raw_pci_read(pci_domain_nr(bus), bus->number,
 				 devfn, where, size, value);
 }
 
 static int pci_write(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 value)
 {
-	return raw_pci_ops->write(pci_domain_nr(bus), bus->number,
+	return raw_pci_write(pci_domain_nr(bus), bus->number,
 				  devfn, where, size, value);
 }
 
diff --git a/arch/x86/pci/direct.c b/arch/x86/pci/direct.c
index 431c9a5..42f3e4c 100644
--- a/arch/x86/pci/direct.c
+++ b/arch/x86/pci/direct.c
@@ -14,7 +14,7 @@
 #define PCI_CONF1_ADDRESS(bus, devfn, reg) \
 	(0x80000000 | (bus << 16) | (devfn << 8) | (reg & ~3))
 
-int pci_conf1_read(unsigned int seg, unsigned int bus,
+static int pci_conf1_read(unsigned int seg, unsigned int bus,
 			  unsigned int devfn, int reg, int len, u32 *value)
 {
 	unsigned long flags;
@@ -45,7 +45,7 @@ int pci_conf1_read(unsigned int seg, unsigned int bus,
 	return 0;
 }
 
-int pci_conf1_write(unsigned int seg, unsigned int bus,
+static int pci_conf1_write(unsigned int seg, unsigned int bus,
 			   unsigned int devfn, int reg, int len, u32 value)
 {
 	unsigned long flags;
diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index 6cff66d..b31cd6a 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -215,7 +215,8 @@ static int quirk_aspm_offset[MAX_PCIEROOT << 3];
 
 static int quirk_pcie_aspm_read(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
 {
-	return raw_pci_ops->read(0, bus->number, devfn, where, size, value);
+	return raw_pci_read(pci_domain_nr(bus), bus->number,
+						devfn, where, size, value);
 }
 
 /*
@@ -231,7 +232,8 @@ static int quirk_pcie_aspm_write(struct pci_bus *bus, unsigned int devfn, int wh
 	if ((offset) && (where == offset))
 		value = value & 0xfffffffc;
 	
-	return raw_pci_ops->write(0, bus->number, devfn, where, size, value);
+	return raw_pci_write(pci_domain_nr(bus), bus->number,
+						devfn, where, size, value);
 }
 
 static struct pci_ops quirk_pcie_aspm_ops = {
diff --git a/arch/x86/pci/legacy.c b/arch/x86/pci/legacy.c
index 5565d70..e041ced 100644
--- a/arch/x86/pci/legacy.c
+++ b/arch/x86/pci/legacy.c
@@ -22,7 +22,7 @@ static void __devinit pcibios_fixup_peer_bridges(void)
 		if (pci_find_bus(0, n))
 			continue;
 		for (devfn = 0; devfn < 256; devfn += 8) {
-			if (!raw_pci_ops->read(0, n, devfn, PCI_VENDOR_ID, 2, &l) &&
+			if (!raw_pci_read(0, n, devfn, PCI_VENDOR_ID, 2, &l) &&
 			    l != 0x0000 && l != 0xffff) {
 				DBG("Found device at %02x:%02x [%04x]\n", n, devfn, l);
 				printk(KERN_INFO "PCI: Discovered peer bus %02x\n", n);
diff --git a/arch/x86/pci/mmconfig-shared.c b/arch/x86/pci/mmconfig-shared.c
index 6b521d3..8d54df4 100644
--- a/arch/x86/pci/mmconfig-shared.c
+++ b/arch/x86/pci/mmconfig-shared.c
@@ -28,7 +28,7 @@ static int __initdata pci_mmcfg_resources_inserted;
 static const char __init *pci_mmcfg_e7520(void)
 {
 	u32 win;
-	pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);
+	pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);
 
 	win = win & 0xf000;
 	if(win == 0x0000 || win == 0xf000)
@@ -53,7 +53,7 @@ static const char __init *pci_mmcfg_intel_945(void)
 
 	pci_mmcfg_config_num = 1;
 
-	pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0x48, 4, &pciexbar);
+	pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0x48, 4, &pciexbar);
 
 	/* Enable bit */
 	if (!(pciexbar & 1))
@@ -118,7 +118,7 @@ static int __init pci_mmcfg_check_hostbridge(void)
 	int i;
 	const char *name;
 
-	pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0, 4, &l);
+	pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0, 4, &l);
 	vendor = l & 0xffff;
 	device = (l >> 16) & 0xffff;
 
diff --git a/arch/x86/pci/mmconfig_32.c b/arch/x86/pci/mmconfig_32.c
index 7b75e65..081816a 100644
--- a/arch/x86/pci/mmconfig_32.c
+++ b/arch/x86/pci/mmconfig_32.c
@@ -68,9 +68,6 @@ err:		*value = -1;
 		return -EINVAL;
 	}
 
-	if (reg < 256)
-		return pci_conf1_read(seg,bus,devfn,reg,len,value);
-
 	base = get_base_addr(seg, bus, devfn);
 	if (!base)
 		goto err;
@@ -104,9 +101,6 @@ static int pci_mmcfg_write(unsigned int seg, unsigned int bus,
 	if ((bus > 255) || (devfn > 255) || (reg > 4095))
 		return -EINVAL;
 
-	if (reg < 256)
-		return pci_conf1_write(seg,bus,devfn,reg,len,value);
-
 	base = get_base_addr(seg, bus, devfn);
 	if (!base)
 		return -EINVAL;
@@ -138,7 +132,7 @@ static struct pci_raw_ops pci_mmcfg = {
 
 int __init pci_mmcfg_arch_init(void)
 {
-	printk(KERN_INFO "PCI: Using MMCONFIG\n");
-	raw_pci_ops = &pci_mmcfg;
+	printk(KERN_INFO "PCI: Using MMCONFIG for extended config space\n");
+	raw_pci_ext_ops = &pci_mmcfg;
 	return 1;
 }
diff --git a/arch/x86/pci/mmconfig_64.c b/arch/x86/pci/mmconfig_64.c
index c4cf318..9207fd4 100644
--- a/arch/x86/pci/mmconfig_64.c
+++ b/arch/x86/pci/mmconfig_64.c
@@ -58,9 +58,6 @@ err:		*value = -1;
 		return -EINVAL;
 	}
 
-	if (reg < 256)
-		return pci_conf1_read(seg,bus,devfn,reg,len,value);
-
 	addr = pci_dev_base(seg, bus, devfn);
 	if (!addr)
 		goto err;
@@ -89,9 +86,6 @@ static int pci_mmcfg_write(unsigned int seg, unsigned int bus,
 	if (unlikely((bus > 255) || (devfn > 255) || (reg > 4095)))
 		return -EINVAL;
 
-	if (reg < 256)
-		return pci_conf1_write(seg,bus,devfn,reg,len,value);
-
 	addr = pci_dev_base(seg, bus, devfn);
 	if (!addr)
 		return -EINVAL;
@@ -150,6 +144,6 @@ int __init pci_mmcfg_arch_init(void)
 			return 0;
 		}
 	}
-	raw_pci_ops = &pci_mmcfg;
+	raw_pci_ext_ops = &pci_mmcfg;
 	return 1;
 }
diff --git a/arch/x86/pci/pci.h b/arch/x86/pci/pci.h
index 36cb44c..3431518 100644
--- a/arch/x86/pci/pci.h
+++ b/arch/x86/pci/pci.h
@@ -85,10 +85,17 @@ extern spinlock_t pci_config_lock;
 extern int (*pcibios_enable_irq)(struct pci_dev *dev);
 extern void (*pcibios_disable_irq)(struct pci_dev *dev);
 
-extern int pci_conf1_write(unsigned int seg, unsigned int bus,
-			   unsigned int devfn, int reg, int len, u32 value);
-extern int pci_conf1_read(unsigned int seg, unsigned int bus,
-			  unsigned int devfn, int reg, int len, u32 *value);
+struct pci_raw_ops {
+	int (*read)(unsigned int domain, unsigned int bus, unsigned int devfn,
+						int reg, int len, u32 *val);
+	int (*write)(unsigned int domain, unsigned int bus, unsigned int devfn,
+						int reg, int len, u32 val);
+};
+
+extern struct pci_raw_ops *raw_pci_ops;
+extern struct pci_raw_ops *raw_pci_ext_ops;
+
+extern struct pci_raw_ops pci_direct_conf1;
 
 extern int pci_direct_probe(void);
 extern void pci_direct_init(int type);
diff --git a/arch/x86/pci/visws.c b/arch/x86/pci/visws.c
index 8ecb1c7..c2df4e9 100644
--- a/arch/x86/pci/visws.c
+++ b/arch/x86/pci/visws.c
@@ -13,9 +13,6 @@
 
 #include "pci.h"
 
-
-extern struct pci_raw_ops pci_direct_conf1;
-
 static int pci_visws_enable_irq(struct pci_dev *dev) { return 0; }
 static void pci_visws_disable_irq(struct pci_dev *dev) { }
 
diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index e3a673a..f190db9 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -139,15 +139,6 @@ acpi_status __init acpi_os_initialize(void)
 
 acpi_status acpi_os_initialize1(void)
 {
-	/*
-	 * Initialize PCI configuration space access, as we'll need to access
-	 * it while walking the namespace (bus 0 and root bridges w/ _BBNs).
-	 */
-	if (!raw_pci_ops) {
-		printk(KERN_ERR PREFIX
-		       "Access to PCI configuration space unavailable\n");
-		return AE_NULL_ENTRY;
-	}
 	kacpid_wq = create_singlethread_workqueue("kacpid");
 	kacpi_notify_wq = create_singlethread_workqueue("kacpi_notify");
 	BUG_ON(!kacpid_wq);
@@ -498,11 +489,9 @@ acpi_os_read_pci_configuration(struct acpi_pci_id * pci_id, u32 reg,
 		return AE_ERROR;
 	}
 
-	BUG_ON(!raw_pci_ops);
-
-	result = raw_pci_ops->read(pci_id->segment, pci_id->bus,
-				   PCI_DEVFN(pci_id->device, pci_id->function),
-				   reg, size, value);
+	result = raw_pci_read(pci_id->segment, pci_id->bus,
+				PCI_DEVFN(pci_id->device, pci_id->function),
+				reg, size, value);
 
 	return (result ? AE_ERROR : AE_OK);
 }
@@ -529,11 +518,9 @@ acpi_os_write_pci_configuration(struct acpi_pci_id * pci_id, u32 reg,
 		return AE_ERROR;
 	}
 
-	BUG_ON(!raw_pci_ops);
-
-	result = raw_pci_ops->write(pci_id->segment, pci_id->bus,
-				    PCI_DEVFN(pci_id->device, pci_id->function),
-				    reg, size, value);
+	result = raw_pci_write(pci_id->segment, pci_id->bus,
+				PCI_DEVFN(pci_id->device, pci_id->function),
+				reg, size, value);
 
 	return (result ? AE_ERROR : AE_OK);
 }
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 0dd93bb..f4f1edd 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -304,14 +304,14 @@ struct pci_ops {
 	int (*write)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 val);
 };
 
-struct pci_raw_ops {
-	int (*read)(unsigned int domain, unsigned int bus, unsigned int devfn,
-		    int reg, int len, u32 *val);
-	int (*write)(unsigned int domain, unsigned int bus, unsigned int devfn,
-		     int reg, int len, u32 val);
-};
-
-extern struct pci_raw_ops *raw_pci_ops;
+/*
+ * ACPI needs to be able to access PCI config space before we've done a
+ * PCI bus scan and created pci_bus structures.
+ */
+extern int raw_pci_read(unsigned int domain, unsigned int bus,
+			unsigned int devfn, int reg, int len, u32 *val);
+extern int raw_pci_write(unsigned int domain, unsigned int bus,
+			unsigned int devfn, int reg, int len, u32 val);
 
 struct pci_bus_region {
 	unsigned long start;

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-14 22:52                                       ` Matthew Wilcox
@ 2008-01-14 23:04                                         ` Adrian Bunk
  2008-01-15 16:00                                           ` Loic Prylli
  0 siblings, 1 reply; 125+ messages in thread
From: Adrian Bunk @ 2008-01-14 23:04 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Arjan van de Ven, Benjamin Herrenschmidt, Ivan Kokshaysky,
	Greg KH, Linus Torvalds, Greg KH, linux-kernel, Jeff Garzik,
	linux-pci, Martin Mares, Tony Camuso, Loic Prylli

On Mon, Jan 14, 2008 at 03:52:26PM -0700, Matthew Wilcox wrote:
> On Sun, Jan 13, 2008 at 09:01:08AM -0800, Arjan van de Ven wrote:
>...
> > > -	pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);
> > > +	pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);
> > 
> > 	couldn't this (at least in some next patch) use the vector if it exists?
> 
> I thought so, but due to the way that things are initialised, mmconfig
> happens before conf1.  conf1 is known to be usable, but hasn't set
> raw_pci_ops at this point.  Confusing, and not ideal, but fixing this
> isn't in scope for 2.6.24.
>...

*ahem*

I don't think anything of what was discussed in this thread would be in 
scope for 2.6.24 (unless Linus wants to let the bunny that brings eggs 
release 2.6.24).

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2007-12-25 11:26 [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in Arjan van de Ven
  2007-12-27 11:52 ` Jeff Garzik
  2008-01-11 19:02 ` Greg KH
@ 2008-01-15 12:58 ` Øyvind Vågen Jægtnes
  2 siblings, 0 replies; 125+ messages in thread
From: Øyvind Vågen Jægtnes @ 2008-01-15 12:58 UTC (permalink / raw)
  To: linux-kernel

I just thought this might be interesting to the discussion.

I recently bought another 2 GB of memory for my computer.
My hardware is as follows:

Asus Commando (Intel P965 chipset)
Intel Core2 Q6600
4x1 GB Geil PC6400 memory
nVidia 8800 gts (old g80 core, 640 mb mem)

Without booting with pci=nommconf I have severe stability issues, and
often when it's not crashing I get slowdowns with the error:

kern.log:Jan 15 13:19:40 bilbo kernel: [  132.046715] NVRM: Xid
(0001:00): 6, PE0001
... repeated x times.

In addition, the nVidia framebuffer seems to "leak" or not update, since
I get loads of graphics artifacts.

The system works perfectly fine with 2 GB memory and without
pci=nommconf.
It works like a charm when using pci=nommconf and 4 GB memory.

In addition I have to enable the Northbridge->PCI Memory remap feature
in the BIOS to avoid the kernel panicking when trying to access > 3 GB,
but that is understandable :)

My software is Kubuntu 7.10 stock x86_64 kernel, but i do use the
binary driver by nVidia.

It works like a charm when using pci=nommconf

If you guys need any more info about hardware/software from me, please
let me know.

-- 
Øyvind Vågen Jægtnes
+47 96 22 03 08

(i reject your diurnal rhythm and substitute my own)

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-14 23:04                                         ` Adrian Bunk
@ 2008-01-15 16:00                                           ` Loic Prylli
  2008-01-15 17:46                                             ` Greg KH
  0 siblings, 1 reply; 125+ messages in thread
From: Loic Prylli @ 2008-01-15 16:00 UTC (permalink / raw)
  To: Adrian Bunk, Linus Torvalds
  Cc: Matthew Wilcox, Arjan van de Ven, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, Greg KH, linux-kernel, Jeff Garzik,
	linux-pci, Martin Mares, Tony Camuso



On 1/14/2008 6:04 PM, Adrian Bunk wrote:
>> I thought so, but due to the way that things are initialised, mmconfig
>> happens before conf1.  conf1 is known to be usable, but hasn't set
>> raw_pci_ops at this point.  Confusing, and not ideal, but fixing this
>> isn't in scope for 2.6.24.
>> ...
>>     
>
> *ahem*
>
> I don't think anything of what was discussed in this thread would be in 
> scope for 2.6.24 (unless Linus wants to let the bunny that brings eggs 
> release 2.6.24).
>
> cu
> Adrian
>   


Why not put in 2.6.24 a simple fix for the last known remaining mmconfig 
problem in 2.6.24?  There have mostly been three bugs related to mmconfig:
- BIOS/hardware: exaggerated MCFG claims: solved long ago
- hardware: buggy CRS+mmconfig chipset: fix included last month
- Linux code: mmconfig incompatible with live BAR-probing: *not fixed*

It would be ironic to not fix the only one that is really confined to 
the Linux code.

Everybody more or less agrees that *any* of the patches submitted so far 
solves the known problems, and will not cause regressions. The only long 
discussion is about how to best prevent the effect of an "imaginary" 
fourth bug, and by nature that's a controversial topic.

For 2.6.24, if nothing more than a few lines can be done, either make 
pci=nommconf the default and add a pci=mmconf option, and/or apply one 
of the easiest patches to review, i.e. Tony's, so small I copy it again 
below (using 0x40 or 0x100 for the comparison does not really matter, 
personally I would change it to 0x100 to be like Ivan's patch, but 
either is much better than nothing). Replacing some mmconfig access by 
conf1 cannot cause any regression.
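
To illustrate why the choice of threshold does not matter for BAR
probing: the BARs of a standard header sit at offsets 0x10 to 0x24, below
both 0x40 and 0x100. A minimal sketch, with made-up helper names, that
models only the offset comparison and ignores the missing-base fallback:

```c
#include <stdbool.h>

/* True when an access at 'reg' is diverted from MMCONFIG to conf1. */
static bool model_use_conf1(int reg, int threshold)
{
	return reg < threshold;
}

/* True when every BAR offset (0x10..0x24) is diverted to conf1. */
static bool model_bars_use_conf1(int threshold)
{
	int reg;

	for (reg = 0x10; reg <= 0x24; reg += 4)
		if (!model_use_conf1(reg, threshold))
			return false;
	return true;
}
```

The two thresholds only differ for offsets 0x40 to 0xff, i.e. the
device-specific part of the first 256 bytes.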


Loic


P.S.: with that patch, conf1-less x86 systems requiring mmconfig would 
not be supported. But they are like UFOs. There are plenty of them in the 
galaxy, but earth sightings are not convincing enough for 2.6.24 
support; they can wait for 2.6.25.


diff --git a/arch/x86/pci/mmconfig_32.c b/arch/x86/pci/mmconfig_32.c
index 1bf5816..4474979 100644
--- a/arch/x86/pci/mmconfig_32.c
+++ b/arch/x86/pci/mmconfig_32.c
@@ -73,7 +73,7 @@ static int pci_mmcfg_read(unsigned int seg, unsigned int bus,
     }
 
     base = get_base_addr(seg, bus, devfn);
-    if (!base)
+    if ((!base) || (reg < 0x40))
         return pci_conf1_read(seg,bus,devfn,reg,len,value);
 
     spin_lock_irqsave(&pci_config_lock, flags);
@@ -106,7 +106,7 @@ static int pci_mmcfg_write(unsigned int seg, unsigned int bus,
         return -EINVAL;
 
     base = get_base_addr(seg, bus, devfn);
-    if (!base)
+    if ((!base) || (reg < 0x40))
         return pci_conf1_write(seg,bus,devfn,reg,len,value);
 
     spin_lock_irqsave(&pci_config_lock, flags);
diff --git a/arch/x86/pci/mmconfig_64.c b/arch/x86/pci/mmconfig_64.c
index 4095e4d..4ad1fcb 100644
--- a/arch/x86/pci/mmconfig_64.c
+++ b/arch/x86/pci/mmconfig_64.c
@@ -61,7 +61,7 @@ static int pci_mmcfg_read(unsigned int seg, unsigned int bus,
     }
 
     addr = pci_dev_base(seg, bus, devfn);
-    if (!addr)
+    if ((!addr) || (reg < 0x40))
         return pci_conf1_read(seg,bus,devfn,reg,len,value);
 
     switch (len) {
@@ -89,7 +89,7 @@ static int pci_mmcfg_write(unsigned int seg, unsigned int bus,
         return -EINVAL;
 
     addr = pci_dev_base(seg, bus, devfn);
-    if (!addr)
+    if ((!addr) || (reg < 0x40))
         return pci_conf1_write(seg,bus,devfn,reg,len,value);
 
     switch (len) {






^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-15 16:00                                           ` Loic Prylli
@ 2008-01-15 17:46                                             ` Greg KH
  2008-01-15 17:56                                               ` Matthew Wilcox
  0 siblings, 1 reply; 125+ messages in thread
From: Greg KH @ 2008-01-15 17:46 UTC (permalink / raw)
  To: Loic Prylli
  Cc: Adrian Bunk, Linus Torvalds, Matthew Wilcox, Arjan van de Ven,
	Benjamin Herrenschmidt, Ivan Kokshaysky, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Martin Mares, Tony Camuso

On Tue, Jan 15, 2008 at 11:00:37AM -0500, Loic Prylli wrote:
>
>
> On 1/14/2008 6:04 PM, Adrian Bunk wrote:
>>> I thought so, but due to the way that things are initialised, mmconfig
>>> happens before conf1.  conf1 is known to be usable, but hasn't set
>>> raw_pci_ops at this point.  Confusing, and not ideal, but fixing this
>>> isn't in scope for 2.6.24.
>>> ...
>>>     
>>
>> *ahem*
>>
>> I don't think anything of what was discussed in this thread would be in 
>> scope for 2.6.24 (unless Linus wants to let the bunny that brings eggs 
>> release 2.6.24).
>>
>> cu
>> Adrian
>>   
>
>
> Why not put in 2.6.24 a simple fix for the last known remaining mmconfig 
> problems in 2.6.24?

Heh, no, because it is _way_ too late for such a patch that hasn't been
tested in any trees, sorry.

2.6.25 is the earliest I'll take such a fix, and if it's really as
simple as you say, I'll consider it for the -stable releases for .24 if
needed.

But so far, we have a zillion patches floating around, claiming
different things, some with signed-off-bys and others without, so for
now, I'll just stick with Arjan's patch in -mm and see if anyone
complains about those releases...

thanks,

greg k-h


* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-15 17:46                                             ` Greg KH
@ 2008-01-15 17:56                                               ` Matthew Wilcox
  2008-01-15 19:27                                                 ` Tony Camuso
  2008-01-19 16:58                                                 ` Grant Grundler
  0 siblings, 2 replies; 125+ messages in thread
From: Matthew Wilcox @ 2008-01-15 17:56 UTC (permalink / raw)
  To: Greg KH
  Cc: Loic Prylli, Adrian Bunk, Linus Torvalds, Arjan van de Ven,
	Benjamin Herrenschmidt, Ivan Kokshaysky, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Martin Mares, Tony Camuso

On Tue, Jan 15, 2008 at 09:46:43AM -0800, Greg KH wrote:
> But so far, we have a zillion patches floating around, claiming
> different things, some with signed-off-bys and others without, so for
> now, I'll just stick with Arjan's patch in -mm and see if anyone
> complains about those releases...

I complain about Arjan's patch.  For reasons which have been adequately
gone into already in this thread.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."


* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-15 17:56                                               ` Matthew Wilcox
@ 2008-01-15 19:27                                                 ` Tony Camuso
  2008-01-15 19:38                                                   ` Linus Torvalds
  2008-01-19 16:58                                                 ` Grant Grundler
  1 sibling, 1 reply; 125+ messages in thread
From: Tony Camuso @ 2008-01-15 19:27 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Greg KH, Loic Prylli, Adrian Bunk, Linus Torvalds,
	Arjan van de Ven, Benjamin Herrenschmidt, Ivan Kokshaysky,
	Greg KH, linux-kernel, Jeff Garzik, linux-pci, Martin Mares

I agree with Matthew.

My preference is Ivan's patch using Loic's proposal.

My patch would have tested MMCONFIG before using it, but it didn't
fix the problem where the decode of large displacement devices can
overlap the MMCONFIG region.

Ivan's patch fixes that, and the problem of Northbridges that don't
respond to MMCONFIG and as a bonus cleans out some code rendered
unnecessary by his patch.
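
For readers who want the overlap problem spelled out, here is a small 
sketch, assuming the usual sizing procedure for a 32-bit memory BAR 
(write 0xffffffff, read back, restore) and assuming memory decoding 
stays enabled while that happens. The function names are made up for 
this example and are not kernel interfaces.

```c
#include <stdint.h>

/*
 * Size of a 32-bit memory BAR, computed from the value read back after
 * writing 0xffffffff to it. The low four bits are flags, not address.
 */
uint32_t mem_bar_size(uint32_t readback)
{
	uint32_t mask = readback & ~0xfu;

	return mask ? ~mask + 1 : 0;   /* 0: BAR not implemented */
}

/*
 * While being sized, the BAR temporarily claims the range starting at
 * (readback & ~0xf) and running up to 4 GB. Report whether that range
 * intersects an MMCONFIG aperture [mmcfg_base, mmcfg_base + mmcfg_len).
 */
int sizing_overlaps_mmcfg(uint32_t readback, uint64_t mmcfg_base,
                          uint64_t mmcfg_len)
{
	uint64_t start = readback & ~0xfu;
	uint64_t end = start + mem_bar_size(readback);   /* exclusive */

	if (start == 0)
		return 0;
	return start < mmcfg_base + mmcfg_len && mmcfg_base < end;
}
```

With an aperture at 0xe0000000 of 256 MB, a 512 MB BAR (readback 
0xe0000000) overlaps during sizing, while a 4 KB BAR does not. If the 
write that restores the BAR goes through MMCONFIG at that moment, the 
device may claim the cycle instead, which is why routing accesses below 
256 bytes through conf1 sidesteps the issue.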

Linus is confident that conf1 is not going away for at least the
next five years.


Matthew Wilcox wrote:
> On Tue, Jan 15, 2008 at 09:46:43AM -0800, Greg KH wrote:
>> But so far, we have a zillion patches floating around, claiming
>> different things, some with signed-off-bys and others without, so for
>> now, I'll just stick with Arjan's patch in -mm and see if anyone
>> complains about those releases...
> 
> I complain about Arjan's patch.  For reasons which have been adequately
> gone into already in this thread.
> 



* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-15 19:27                                                 ` Tony Camuso
@ 2008-01-15 19:38                                                   ` Linus Torvalds
  2008-01-15 19:40                                                     ` Matthew Wilcox
  2008-01-15 22:12                                                     ` Loic Prylli
  0 siblings, 2 replies; 125+ messages in thread
From: Linus Torvalds @ 2008-01-15 19:38 UTC (permalink / raw)
  To: Tony Camuso
  Cc: Matthew Wilcox, Greg KH, Loic Prylli, Adrian Bunk,
	Arjan van de Ven, Benjamin Herrenschmidt, Ivan Kokshaysky,
	Greg KH, linux-kernel, Jeff Garzik, linux-pci, Martin Mares



On Tue, 15 Jan 2008, Tony Camuso wrote:
> 
> Linus is confident that conf1 is not going away for at least the
> next five years.

Not on PC's. Small birds tell me that there can be all these non-PC x86 
subarchitectures that may or may not have conf1.

		Linus


* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-15 19:38                                                   ` Linus Torvalds
@ 2008-01-15 19:40                                                     ` Matthew Wilcox
  2008-01-15 22:12                                                     ` Loic Prylli
  1 sibling, 0 replies; 125+ messages in thread
From: Matthew Wilcox @ 2008-01-15 19:40 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Tony Camuso, Greg KH, Loic Prylli, Adrian Bunk, Arjan van de Ven,
	Benjamin Herrenschmidt, Ivan Kokshaysky, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Martin Mares

On Tue, Jan 15, 2008 at 11:38:42AM -0800, Linus Torvalds wrote:
> On Tue, 15 Jan 2008, Tony Camuso wrote:
> > Linus is confident that conf1 is not going away for at least the
> > next five years.
> 
> Not on PC's. Small birds tell me that there can be all these non-PC x86 
> subarchitectures that may or may not have conf1.

Right -- hence my patch on top of Ivan's which removes all the assumptions
about conf1 from mmconfig (there are still *references* to conf1 in the
mmconfig code, but they'll only be used if conf1 is functional).

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."


* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-15 19:38                                                   ` Linus Torvalds
  2008-01-15 19:40                                                     ` Matthew Wilcox
@ 2008-01-15 22:12                                                     ` Loic Prylli
  1 sibling, 0 replies; 125+ messages in thread
From: Loic Prylli @ 2008-01-15 22:12 UTC (permalink / raw)
  To: Linus Torvalds, Arjan van de Ven
  Cc: Tony Camuso, Matthew Wilcox, Greg KH, Adrian Bunk,
	Benjamin Herrenschmidt, Ivan Kokshaysky, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Martin Mares



On 1/15/2008 2:38 PM, Linus Torvalds wrote:
> On Tue, 15 Jan 2008, Tony Camuso wrote:
>   
>> Linus is confident that conf1 is not going away for at least the
>> next five years.
>>     
>
> Not on PC's. Small birds tell me that there can be all these non-PC x86 
> subarchitectures that may or may not have conf1.
>
> 		Linus
>
>   




But is there an ACPI-compliant architecture that only offers mmconfig for 
configuration-space access and no other fallback method (i.e. no conf1, 
no BIOS, ...)?

2.6.24 supports mmconfig for:
 - ACPI systems with MCFG
 - a couple of chipsets discovered via conf1


If a system has no conf1, but does not have e820+ACPI+MCFG, or does have 
some method other than mmconfig, it was already irrelevant in the 
discussion of Ivan's initial patch in December (because that system was 
either never supported or not impacted, and we were trying to fix bugs, 
not introduce support for a new class of systems).


Maybe Arjan could share his knowledge, and tell us what system he was 
thinking about (and whether it needed to be supported by 2.6.24) when 
saying:
  "When (and I'm saying "when" not "if") systems arrive that only have 
MMCONFIG for some of the devices."


Anyway, Ivan's patch + Matthew's extensions handle that non-PC 
arch. That combination is advocated by at least:
Ivan Kokshaysky
Matthew Wilcox
Tony Camuso
Loic Prylli
Even Arjan said that while he prefers his patch (saying it's more 
conservative), he does not see an existing problem with the Ivan/Matthew 
combination.

[ simpler, less ambitious fixes can be forgotten if nothing can be done 
for 2.6.24, I can understand that choice ]


The problems I see with Arjan's patch are:
- no word on whether the existing Linux drivers/pci/pcie/aer code should 
be converted to opt-in.
- mmconfig still needs to be revisited to sort out the mix of 
mmconfig+conf1+third-method access.
- you cannot test whether ext-conf-space is available without taking 
risks: when pci_enable_ext_config() is called, even legacy-conf-space is 
switched to the new method.  So some administrator action (lspci -v 
+maybe-other-flag) or some driver action (that can optionally use 
ext-conf-space but does not *rely* on it) could cause some devices to 
totally disappear (if some pci hierarchy is handled by mmconfig as a 
0xffffffff section, as seen on many AMD machines). Matthew/Ivan will, 
in the worst case, simply detect that ext-conf-space is not available (in 
pci_cfg_space_size()); legacy-conf-space will still work (and that 
0xffffffff section is perfectly *safe* to query, tell me if you need 
more details of why).
- it introduces a new user API and a new kernel API, while in practice 
there is no evidence that this brings any benefit compared to Ivan/Matthew.
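
To make the "detect in the worst case" point concrete, here is a 
minimal sketch of the kind of probe meant above: read the first dword 
of extended space and treat 0xffffffff (or a failed read) as "not 
available". It is loosely modelled on the check referred to in 
pci_cfg_space_size() and leaves out any other checks that function 
performs; the callback type and the stub reader are assumptions made up 
for this example.

```c
#include <stdint.h>

#define CFG_SPACE_SIZE      256    /* legacy config space */
#define CFG_SPACE_EXP_SIZE  4096   /* extended config space */

/* Hypothetical reader callback: returns 0 on success, nonzero on error. */
typedef int (*cfg_read32_fn)(unsigned int reg, uint32_t *val, void *ctx);

/* Decide how much config space is usable from one read at offset 0x100. */
int cfg_space_size(cfg_read32_fn read32, void *ctx)
{
	uint32_t val;

	if (read32(0x100, &val, ctx) != 0)
		return CFG_SPACE_SIZE;       /* access failed: legacy only */
	if (val == 0xffffffffu)
		return CFG_SPACE_SIZE;       /* nothing answered: legacy only */
	return CFG_SPACE_EXP_SIZE;
}

/*
 * Stand-in reader for the sketch: ctx points at the dword to return,
 * and a null ctx simulates a failed access.
 */
int fake_read32(unsigned int reg, uint32_t *val, void *ctx)
{
	(void)reg;
	if (!ctx)
		return -1;
	*val = *(uint32_t *)ctx;
	return 0;
}
```

In this model the legacy 256 bytes are never touched by the probe, so a 
negative result costs nothing beyond losing access to extended space.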


IMHO, making "pci=nommconf" the default behaviour is better than 
Arjan's patch: for the exaggerated 99.99% of users he claims don't need 
ext-conf-space, that's obviously as good. And many of the others would 
benefit from the ability to test whether ext-conf-space is available, 
and optionally use it, without taking the risk of crashing something, so 
something else is better for them.


With Arjan's patch, in 10 years, we might still have to use an extra 
option (or some other action) when using lspci to display extended caps, 
and we would still run the risk of crashing some old machine when doing 
so (unless maybe a blacklist of some sort is added, making the 
newly introduced API completely useless soon, or unless we keep the 
painful bitmaps in mmconfig, potentially ending up with 3 sets of pci-ops).


Loic



* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-15 17:56                                               ` Matthew Wilcox
  2008-01-15 19:27                                                 ` Tony Camuso
@ 2008-01-19 16:58                                                 ` Grant Grundler
  2008-01-28 18:32                                                   ` Tony Camuso
  1 sibling, 1 reply; 125+ messages in thread
From: Grant Grundler @ 2008-01-19 16:58 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Greg KH, Loic Prylli, Adrian Bunk, Linus Torvalds,
	Arjan van de Ven, Benjamin Herrenschmidt, Ivan Kokshaysky,
	Greg KH, linux-kernel, Jeff Garzik, linux-pci, Martin Mares,
	Tony Camuso

On Tue, Jan 15, 2008 at 10:56:41AM -0700, Matthew Wilcox wrote:
> On Tue, Jan 15, 2008 at 09:46:43AM -0800, Greg KH wrote:
> > But so far, we have a zillion patches floating around, claiming
> > different things, some with signed-off-bys and others without, so for
> > now, I'll just stick with Arjan's patch in -mm and see if anyone
> > complains about those releases...
> 
> I complain about Arjan's patch.  For reasons which have been adequately
> gone into already in this thread.

Agreed.
Greg, I think at least two better alternatives were proposed already.
Please review the thread again.

grant


* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-19 16:58                                                 ` Grant Grundler
@ 2008-01-28 18:32                                                   ` Tony Camuso
  2008-01-28 20:44                                                     ` Greg KH
  0 siblings, 1 reply; 125+ messages in thread
From: Tony Camuso @ 2008-01-28 18:32 UTC (permalink / raw)
  To: Grant Grundler
  Cc: Matthew Wilcox, Greg KH, Loic Prylli, Adrian Bunk,
	Linus Torvalds, Arjan van de Ven, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

Greg,

Have you given Grant's suggestion any further consideration?

I'd like to know how the MMCONFIG issues discussed in this thread are going
to be handled upstream. I have a patch implemented in RHEL 5.2, but I would
rather have the upstream patch implemented, whatever it is.


Grant Grundler wrote:
> On Tue, Jan 15, 2008 at 10:56:41AM -0700, Matthew Wilcox wrote:
>> On Tue, Jan 15, 2008 at 09:46:43AM -0800, Greg KH wrote:
>>> But so far, we have a zillion patches floating around, claiming
>>> different things, some with signed-off-bys and others without, so for
>>> now, I'll just stick with Arjan's patch in -mm and see if anyone
>>> complains about those releases...
>> I complain about Arjan's patch.  For reasons which have been adequately
>> gone into already in this thread.
> 
> Agreed.
> Greg, I think at least two better alternatives were proposed already.
> Please review the thread again.
> 
> grant



* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-28 18:32                                                   ` Tony Camuso
@ 2008-01-28 20:44                                                     ` Greg KH
  2008-01-28 22:31                                                       ` Matthew Wilcox
  2008-01-29  3:05                                                       ` [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in Arjan van de Ven
  0 siblings, 2 replies; 125+ messages in thread
From: Greg KH @ 2008-01-28 20:44 UTC (permalink / raw)
  To: Tony Camuso
  Cc: Grant Grundler, Matthew Wilcox, Loic Prylli, Adrian Bunk,
	Linus Torvalds, Arjan van de Ven, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

On Mon, Jan 28, 2008 at 01:32:06PM -0500, Tony Camuso wrote:
> Greg,
>
> Have you given Grant's suggestion any further consideration?
>
> I'd like to know how the MMCONFIG issues discussed in this thread are going
> to be handled upstream. I have a patch implemented in RHEL 5.2, but I would
> rather have the upstream patch implemented, whatever it is.

Well, everyone still doesn't seem to agree on the proper way forward
here, so for me to just "pick one" isn't very appropriate.

So, can we try again?

Can people submit, what they think the change should be?  Right now I
have Arjan's patch in my kernel tree, but will not send it to Linus for
.25 for now, unless everyone thinks that is the best solution at the
moment (which, for me, I'm leaning toward right now...)

thanks,

greg "can't we all just get along?" k-h


* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-28 20:44                                                     ` Greg KH
@ 2008-01-28 22:31                                                       ` Matthew Wilcox
  2008-01-28 22:53                                                         ` Greg KH
  2008-01-29  3:05                                                       ` [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in Arjan van de Ven
  1 sibling, 1 reply; 125+ messages in thread
From: Matthew Wilcox @ 2008-01-28 22:31 UTC (permalink / raw)
  To: Greg KH
  Cc: Tony Camuso, Grant Grundler, Loic Prylli, Adrian Bunk,
	Linus Torvalds, Arjan van de Ven, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

On Mon, Jan 28, 2008 at 12:44:31PM -0800, Greg KH wrote:
> On Mon, Jan 28, 2008 at 01:32:06PM -0500, Tony Camuso wrote:
> > Greg,
> >
> > Have you given Grant's suggestion any further consideration?
> >
> > I'd like to know how the MMCONFIG issues discussed in this thread are going
> > to be handled upstream. I have a patch implemented in RHEL 5.2, but I would
> > rather have the upstream patch implemented, whatever it is.
> 
> Well, everyone still doesn't seem to agree on the proper way forward
> here, so for me to just "pick one" isn't very appropriate.
> 
> So, can we try again?
> 
> Can people submit, what they think the change should be?  Right now I
> have Arjan's patch in my kernel tree, but will not send it to Linus for
> .25 for now, unless everyone thinks that is the best solution at the
> moment (which, for me, I'm leaning toward right now...)

My opinion is that Ivan's patch followed by my patch is the best way
forward.  I see Arjan's patch as a good prototype, but it introduces a lot
of unnecessary infrastructure (and a userspace interface that I dislike).

I would like to see Ivan's patch merged ASAP as it does fix one of
my machines.  akpm has the patch from me to disable io decoding, and
intends to send it to Linus during this merge window ... that patch
becomes unnecessary if we merge Ivan's patch.

My patch is an incremental improvement that adds some of the features
of Arjan's patch without the extra infrastructure.  I don't think it's
urgent, but it does make some of our internal interfaces cleaner.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."


* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-28 22:31                                                       ` Matthew Wilcox
@ 2008-01-28 22:53                                                         ` Greg KH
  2008-01-29  2:56                                                           ` Matthew Wilcox
  0 siblings, 1 reply; 125+ messages in thread
From: Greg KH @ 2008-01-28 22:53 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Tony Camuso, Grant Grundler, Loic Prylli, Adrian Bunk,
	Linus Torvalds, Arjan van de Ven, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

On Mon, Jan 28, 2008 at 03:31:42PM -0700, Matthew Wilcox wrote:
> On Mon, Jan 28, 2008 at 12:44:31PM -0800, Greg KH wrote:
> > On Mon, Jan 28, 2008 at 01:32:06PM -0500, Tony Camuso wrote:
> > > Greg,
> > >
> > > Have you given Grant's suggestion any further consideration?
> > >
> > > I'd like to know how the MMCONFIG issues discussed in this thread are going
> > > to be handled upstream. I have a patch implemented in RHEL 5.2, but I would
> > > rather have the upstream patch implemented, whatever it is.
> > 
> > Well, everyone still doesn't seem to agree on the proper way forward
> > here, so for me to just "pick one" isn't very appropriate.
> > 
> > So, can we try again?
> > 
> > Can people submit, what they think the change should be?  Right now I
> > have Arjan's patch in my kernel tree, but will not send it to Linus for
> > .25 for now, unless everyone thinks that is the best solution at the
> > moment (which, for me, I'm leaning toward right now...)
> 
> My opinion is that Ivan's patch followed by my patch is the best way
> forward.  I see Arjan's patch as a good prototype, but it introduces a lot
> of unnecessary infrastructure (and a userspace interface that I dislike).
> 
> I would like to see Ivan's patch merged ASAP as it does fix one of
> my machines.  akpm has the patch from me to disable io decoding, and
> intends to send it to Linus during this merge window ... that patch
> becomes unnecessary if we merge Ivan's patch.
> 
> My patch is an incremental improvement that adds some of the features
> of Arjan's patch without the extra infrastructure.  I don't think it's
> urgent, but it does make some of our internal interfaces cleaner.

Please send me patches, in a form that can be merged, along with a
proper changelog entry, in the order in which you wish them to be
applied, so I know exactly what changes you are referring to.

thanks,

greg k-h


* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-28 22:53                                                         ` Greg KH
@ 2008-01-29  2:56                                                           ` Matthew Wilcox
  2008-01-29  2:57                                                             ` PCI x86: always use conf1 to access config space below 256 bytes Matthew Wilcox
  2008-01-29  3:03                                                             ` [PATCH] Change pci_raw_ops to pci_raw_read/write Matthew Wilcox
  0 siblings, 2 replies; 125+ messages in thread
From: Matthew Wilcox @ 2008-01-29  2:56 UTC (permalink / raw)
  To: Greg KH
  Cc: Tony Camuso, Grant Grundler, Loic Prylli, Adrian Bunk,
	Linus Torvalds, Arjan van de Ven, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

On Mon, Jan 28, 2008 at 02:53:34PM -0800, Greg KH wrote:
> Please send me patches, in a form that can be merged, along with a
> proper changelog entry, in the order in which you wish them to be
> applied, so I know exactly what changes you are referring to.

I'll send each patch as a reply to this email.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."


* PCI x86: always use conf1 to access config space below 256 bytes
  2008-01-29  2:56                                                           ` Matthew Wilcox
@ 2008-01-29  2:57                                                             ` Matthew Wilcox
  2008-01-29 13:21                                                               ` Greg KH
  2008-01-29  3:03                                                             ` [PATCH] Change pci_raw_ops to pci_raw_read/write Matthew Wilcox
  1 sibling, 1 reply; 125+ messages in thread
From: Matthew Wilcox @ 2008-01-29  2:57 UTC (permalink / raw)
  To: Greg KH
  Cc: Tony Camuso, Grant Grundler, Loic Prylli, Adrian Bunk,
	Linus Torvalds, Arjan van de Ven, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

PCI x86: always use conf1 to access config space below 256 bytes

Thanks to Loic Prylli <loic@myri.com>, who originally proposed
this idea.

Always using the legacy configuration mechanism for the legacy config space
and the extended mechanism (mmconf) for the extended config space is
a simple and very logical approach. It's supposed to resolve all
known mmconf problems. It still allows per-device quirks (tweaking
dev->cfg_size). It also allows us to get rid of the mmconf fallback code.
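
As background for the 256-byte split, a short sketch of how the two 
mechanisms address a register, assuming the standard conf1 (type 1) 
encoding written to I/O port 0xCF8 and the standard MMCONFIG layout of 
4 KB per function. The function names are illustrative only and are not 
taken from the patch.

```c
#include <stdint.h>

/*
 * Value written to port 0xCF8 to select a register via conf1.
 * The register field is 8 bits wide (dword aligned), so only offsets
 * 0x00-0xff are reachable this way.
 */
uint32_t conf1_address(unsigned int bus, unsigned int devfn, unsigned int reg)
{
	return 0x80000000u | (bus << 16) | (devfn << 8) | (reg & 0xfcu);
}

/*
 * Byte offset of a register inside the memory-mapped aperture.
 * Each function gets 4 KB, so offsets 0x000-0xfff are reachable.
 */
uint32_t mmcfg_offset(unsigned int bus, unsigned int devfn, unsigned int reg)
{
	return (bus << 20) | (devfn << 12) | (reg & 0xfffu);
}
```

In this encoding a conf1 access to offset 0x100 would alias offset 0x00, 
which is one way to see why extended space needs the memory-mapped 
mechanism while everything below 256 bytes can stay on conf1.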

Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
---
 arch/x86/pci/mmconfig-shared.c |   35 -----------------------------------
 arch/x86/pci/mmconfig_32.c     |   22 +++++++++-------------
 arch/x86/pci/mmconfig_64.c     |   22 ++++++++++------------
 arch/x86/pci/pci.h             |    7 -------
 4 files changed, 19 insertions(+), 67 deletions(-)

diff --git a/arch/x86/pci/mmconfig-shared.c b/arch/x86/pci/mmconfig-shared.c
index 4df637e..6b521d3 100644
--- a/arch/x86/pci/mmconfig-shared.c
+++ b/arch/x86/pci/mmconfig-shared.c
@@ -22,42 +22,9 @@
 #define MMCONFIG_APER_MIN	(2 * 1024*1024)
 #define MMCONFIG_APER_MAX	(256 * 1024*1024)
 
-DECLARE_BITMAP(pci_mmcfg_fallback_slots, 32*PCI_MMCFG_MAX_CHECK_BUS);
-
 /* Indicate if the mmcfg resources have been placed into the resource table. */
 static int __initdata pci_mmcfg_resources_inserted;
 
-/* K8 systems have some devices (typically in the builtin northbridge)
-   that are only accessible using type1
-   Normally this can be expressed in the MCFG by not listing them
-   and assigning suitable _SEGs, but this isn't implemented in some BIOS.
-   Instead try to discover all devices on bus 0 that are unreachable using MM
-   and fallback for them. */
-static void __init unreachable_devices(void)
-{
-	int i, bus;
-	/* Use the max bus number from ACPI here? */
-	for (bus = 0; bus < PCI_MMCFG_MAX_CHECK_BUS; bus++) {
-		for (i = 0; i < 32; i++) {
-			unsigned int devfn = PCI_DEVFN(i, 0);
-			u32 val1, val2;
-
-			pci_conf1_read(0, bus, devfn, 0, 4, &val1);
-			if (val1 == 0xffffffff)
-				continue;
-
-			if (pci_mmcfg_arch_reachable(0, bus, devfn)) {
-				raw_pci_ops->read(0, bus, devfn, 0, 4, &val2);
-				if (val1 == val2)
-					continue;
-			}
-			set_bit(i + 32 * bus, pci_mmcfg_fallback_slots);
-			printk(KERN_NOTICE "PCI: No mmconfig possible on device"
-			       " %02x:%02x\n", bus, i);
-		}
-	}
-}
-
 static const char __init *pci_mmcfg_e7520(void)
 {
 	u32 win;
@@ -270,8 +237,6 @@ void __init pci_mmcfg_init(int type)
 		return;
 
 	if (pci_mmcfg_arch_init()) {
-		if (type == 1)
-			unreachable_devices();
 		if (known_bridge)
 			pci_mmcfg_insert_resources(IORESOURCE_BUSY);
 		pci_probe = (pci_probe & ~PCI_PROBE_MASK) | PCI_PROBE_MMCONF;
diff --git a/arch/x86/pci/mmconfig_32.c b/arch/x86/pci/mmconfig_32.c
index 1bf5816..7b75e65 100644
--- a/arch/x86/pci/mmconfig_32.c
+++ b/arch/x86/pci/mmconfig_32.c
@@ -30,10 +30,6 @@ static u32 get_base_addr(unsigned int seg, int bus, unsigned devfn)
 	struct acpi_mcfg_allocation *cfg;
 	int cfg_num;
 
-	if (seg == 0 && bus < PCI_MMCFG_MAX_CHECK_BUS &&
-	    test_bit(PCI_SLOT(devfn) + 32*bus, pci_mmcfg_fallback_slots))
-		return 0;
-
 	for (cfg_num = 0; cfg_num < pci_mmcfg_config_num; cfg_num++) {
 		cfg = &pci_mmcfg_config[cfg_num];
 		if (cfg->pci_segment == seg &&
@@ -68,13 +64,16 @@ static int pci_mmcfg_read(unsigned int seg, unsigned int bus,
 	u32 base;
 
 	if ((bus > 255) || (devfn > 255) || (reg > 4095)) {
-		*value = -1;
+err:		*value = -1;
 		return -EINVAL;
 	}
 
+	if (reg < 256)
+		return pci_conf1_read(seg,bus,devfn,reg,len,value);
+
 	base = get_base_addr(seg, bus, devfn);
 	if (!base)
-		return pci_conf1_read(seg,bus,devfn,reg,len,value);
+		goto err;
 
 	spin_lock_irqsave(&pci_config_lock, flags);
 
@@ -105,9 +104,12 @@ static int pci_mmcfg_write(unsigned int seg, unsigned int bus,
 	if ((bus > 255) || (devfn > 255) || (reg > 4095))
 		return -EINVAL;
 
+	if (reg < 256)
+		return pci_conf1_write(seg,bus,devfn,reg,len,value);
+
 	base = get_base_addr(seg, bus, devfn);
 	if (!base)
-		return pci_conf1_write(seg,bus,devfn,reg,len,value);
+		return -EINVAL;
 
 	spin_lock_irqsave(&pci_config_lock, flags);
 
@@ -134,12 +136,6 @@ static struct pci_raw_ops pci_mmcfg = {
 	.write =	pci_mmcfg_write,
 };
 
-int __init pci_mmcfg_arch_reachable(unsigned int seg, unsigned int bus,
-				    unsigned int devfn)
-{
-	return get_base_addr(seg, bus, devfn) != 0;
-}
-
 int __init pci_mmcfg_arch_init(void)
 {
 	printk(KERN_INFO "PCI: Using MMCONFIG\n");
diff --git a/arch/x86/pci/mmconfig_64.c b/arch/x86/pci/mmconfig_64.c
index 4095e4d..c4cf318 100644
--- a/arch/x86/pci/mmconfig_64.c
+++ b/arch/x86/pci/mmconfig_64.c
@@ -40,9 +40,7 @@ static char __iomem *get_virt(unsigned int seg, unsigned bus)
 static char __iomem *pci_dev_base(unsigned int seg, unsigned int bus, unsigned int devfn)
 {
 	char __iomem *addr;
-	if (seg == 0 && bus < PCI_MMCFG_MAX_CHECK_BUS &&
-		test_bit(32*bus + PCI_SLOT(devfn), pci_mmcfg_fallback_slots))
-		return NULL;
+
 	addr = get_virt(seg, bus);
 	if (!addr)
 		return NULL;
@@ -56,13 +54,16 @@ static int pci_mmcfg_read(unsigned int seg, unsigned int bus,
 
 	/* Why do we have this when nobody checks it. How about a BUG()!? -AK */
 	if (unlikely((bus > 255) || (devfn > 255) || (reg > 4095))) {
-		*value = -1;
+err:		*value = -1;
 		return -EINVAL;
 	}
 
+	if (reg < 256)
+		return pci_conf1_read(seg,bus,devfn,reg,len,value);
+
 	addr = pci_dev_base(seg, bus, devfn);
 	if (!addr)
-		return pci_conf1_read(seg,bus,devfn,reg,len,value);
+		goto err;
 
 	switch (len) {
 	case 1:
@@ -88,9 +89,12 @@ static int pci_mmcfg_write(unsigned int seg, unsigned int bus,
 	if (unlikely((bus > 255) || (devfn > 255) || (reg > 4095)))
 		return -EINVAL;
 
+	if (reg < 256)
+		return pci_conf1_write(seg,bus,devfn,reg,len,value);
+
 	addr = pci_dev_base(seg, bus, devfn);
 	if (!addr)
-		return pci_conf1_write(seg,bus,devfn,reg,len,value);
+		return -EINVAL;
 
 	switch (len) {
 	case 1:
@@ -126,12 +130,6 @@ static void __iomem * __init mcfg_ioremap(struct acpi_mcfg_allocation *cfg)
 	return addr;
 }
 
-int __init pci_mmcfg_arch_reachable(unsigned int seg, unsigned int bus,
-				    unsigned int devfn)
-{
-	return pci_dev_base(seg, bus, devfn) != NULL;
-}
-
 int __init pci_mmcfg_arch_init(void)
 {
 	int i;
diff --git a/arch/x86/pci/pci.h b/arch/x86/pci/pci.h
index ac56d39..36cb44c 100644
--- a/arch/x86/pci/pci.h
+++ b/arch/x86/pci/pci.h
@@ -98,13 +98,6 @@ extern void pcibios_sort(void);
 
 /* pci-mmconfig.c */
 
-/* Verify the first 16 busses. We assume that systems with more busses
-   get MCFG right. */
-#define PCI_MMCFG_MAX_CHECK_BUS 16
-extern DECLARE_BITMAP(pci_mmcfg_fallback_slots, 32*PCI_MMCFG_MAX_CHECK_BUS);
-
-extern int __init pci_mmcfg_arch_reachable(unsigned int seg, unsigned int bus,
-					   unsigned int devfn);
 extern int __init pci_mmcfg_arch_init(void);
 
 /*

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."


* [PATCH] Change pci_raw_ops to pci_raw_read/write
  2008-01-29  2:56                                                           ` Matthew Wilcox
  2008-01-29  2:57                                                             ` PCI x86: always use conf1 to access config space below 256 bytes Matthew Wilcox
@ 2008-01-29  3:03                                                             ` Matthew Wilcox
  2008-02-03  7:30                                                               ` Yinghai Lu
  1 sibling, 1 reply; 125+ messages in thread
From: Matthew Wilcox @ 2008-01-29  3:03 UTC (permalink / raw)
  To: Greg KH
  Cc: Tony Camuso, Grant Grundler, Loic Prylli, Adrian Bunk,
	Linus Torvalds, Arjan van de Ven, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares


We want to allow different implementations of pci_raw_ops for standard
and extended config space on x86.  Rather than clutter generic code with
knowledge of this, we make pci_raw_ops private to x86 and use it to
implement the new raw interface -- raw_pci_read() and raw_pci_write().

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
---
 arch/ia64/pci/pci.c               |   25 ++++++++-----------------
 arch/ia64/sn/pci/tioce_provider.c |   16 ++++++++--------
 arch/x86/kernel/quirks.c          |    2 +-
 arch/x86/pci/common.c             |   25 +++++++++++++++++++++++--
 arch/x86/pci/direct.c             |    4 ++--
 arch/x86/pci/fixup.c              |    6 ++++--
 arch/x86/pci/legacy.c             |    2 +-
 arch/x86/pci/mmconfig-shared.c    |    6 +++---
 arch/x86/pci/mmconfig_32.c        |   10 ++--------
 arch/x86/pci/mmconfig_64.c        |    8 +-------
 arch/x86/pci/pci.h                |   15 +++++++++++----
 arch/x86/pci/visws.c              |    3 ---
 drivers/acpi/osl.c                |   25 ++++++-------------------
 drivers/ata/Kconfig               |    3 +++
 drivers/ata/Makefile              |    3 +++
 include/linux/pci.h               |   16 ++++++++--------
 16 files changed, 84 insertions(+), 85 deletions(-)

diff --git a/arch/ia64/pci/pci.c b/arch/ia64/pci/pci.c
index 488e48a..8fd7e82 100644
--- a/arch/ia64/pci/pci.c
+++ b/arch/ia64/pci/pci.c
@@ -43,8 +43,7 @@
 #define PCI_SAL_EXT_ADDRESS(seg, bus, devfn, reg)	\
 	(((u64) seg << 28) | (bus << 20) | (devfn << 12) | (reg))
 
-static int
-pci_sal_read (unsigned int seg, unsigned int bus, unsigned int devfn,
+int raw_pci_read(unsigned int seg, unsigned int bus, unsigned int devfn,
 	      int reg, int len, u32 *value)
 {
 	u64 addr, data = 0;
@@ -68,8 +67,7 @@ pci_sal_read (unsigned int seg, unsigned int bus, unsigned int devfn,
 	return 0;
 }
 
-static int
-pci_sal_write (unsigned int seg, unsigned int bus, unsigned int devfn,
+int raw_pci_write(unsigned int seg, unsigned int bus, unsigned int devfn,
 	       int reg, int len, u32 value)
 {
 	u64 addr;
@@ -91,24 +89,17 @@ pci_sal_write (unsigned int seg, unsigned int bus, unsigned int devfn,
 	return 0;
 }
 
-static struct pci_raw_ops pci_sal_ops = {
-	.read =		pci_sal_read,
-	.write =	pci_sal_write
-};
-
-struct pci_raw_ops *raw_pci_ops = &pci_sal_ops;
-
-static int
-pci_read (struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
+static int pci_read(struct pci_bus *bus, unsigned int devfn, int where,
+							int size, u32 *value)
 {
-	return raw_pci_ops->read(pci_domain_nr(bus), bus->number,
+	return raw_pci_read(pci_domain_nr(bus), bus->number,
 				 devfn, where, size, value);
 }
 
-static int
-pci_write (struct pci_bus *bus, unsigned int devfn, int where, int size, u32 value)
+static int pci_write(struct pci_bus *bus, unsigned int devfn, int where,
+							int size, u32 value)
 {
-	return raw_pci_ops->write(pci_domain_nr(bus), bus->number,
+	return raw_pci_write(pci_domain_nr(bus), bus->number,
 				  devfn, where, size, value);
 }
 
diff --git a/arch/ia64/sn/pci/tioce_provider.c b/arch/ia64/sn/pci/tioce_provider.c
index e1a3e19..999f14f 100644
--- a/arch/ia64/sn/pci/tioce_provider.c
+++ b/arch/ia64/sn/pci/tioce_provider.c
@@ -752,13 +752,13 @@ tioce_kern_init(struct tioce_common *tioce_common)
 	 * Determine the secondary bus number of the port2 logical PPB.
 	 * This is used to decide whether a given pci device resides on
 	 * port1 or port2.  Note:  We don't have enough plumbing set up
-	 * here to use pci_read_config_xxx() so use the raw_pci_ops vector.
+	 * here to use pci_read_config_xxx() so use raw_pci_read().
 	 */
 
 	seg = tioce_common->ce_pcibus.bs_persist_segment;
 	bus = tioce_common->ce_pcibus.bs_persist_busnum;
 
-	raw_pci_ops->read(seg, bus, PCI_DEVFN(2, 0), PCI_SECONDARY_BUS, 1,&tmp);
+	raw_pci_read(seg, bus, PCI_DEVFN(2, 0), PCI_SECONDARY_BUS, 1,&tmp);
 	tioce_kern->ce_port1_secondary = (u8) tmp;
 
 	/*
@@ -799,11 +799,11 @@ tioce_kern_init(struct tioce_common *tioce_common)
 
 		/* mem base/limit */
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_MEMORY_BASE, 2, &tmp);
 		base = (u64)tmp << 16;
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_MEMORY_LIMIT, 2, &tmp);
 		limit = (u64)tmp << 16;
 		limit |= 0xfffffUL;
@@ -817,21 +817,21 @@ tioce_kern_init(struct tioce_common *tioce_common)
 		 * attributes.
 		 */
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_PREF_MEMORY_BASE, 2, &tmp);
 		base = ((u64)tmp & PCI_PREF_RANGE_MASK) << 16;
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_PREF_BASE_UPPER32, 4, &tmp);
 		base |= (u64)tmp << 32;
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_PREF_MEMORY_LIMIT, 2, &tmp);
 
 		limit = ((u64)tmp & PCI_PREF_RANGE_MASK) << 16;
 		limit |= 0xfffffUL;
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_PREF_LIMIT_UPPER32, 4, &tmp);
 		limit |= (u64)tmp << 32;
 
diff --git a/arch/x86/kernel/quirks.c b/arch/x86/kernel/quirks.c
index fab30e1..7f73f7c 100644
--- a/arch/x86/kernel/quirks.c
+++ b/arch/x86/kernel/quirks.c
@@ -27,7 +27,7 @@ static void __devinit quirk_intel_irqbalance(struct pci_dev *dev)
 	pci_write_config_byte(dev, 0xf4, config|0x2);
 
 	/* read xTPR register */
-	raw_pci_ops->read(0, 0, 0x40, 0x4c, 2, &word);
+	raw_pci_read(0, 0, 0x40, 0x4c, 2, &word);
 
 	if (!(word & (1 << 13))) {
 		printk(KERN_INFO "Intel E7520/7320/7525 detected. "
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 8627463..f2bd9f3 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -26,16 +26,37 @@ int pcibios_last_bus = -1;
 unsigned long pirq_table_addr;
 struct pci_bus *pci_root_bus;
 struct pci_raw_ops *raw_pci_ops;
+struct pci_raw_ops *raw_pci_ext_ops;
+
+int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
+						int reg, int len, u32 *val)
+{
+	if (reg < 256 && raw_pci_ops)
+		return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
+	if (raw_pci_ext_ops)
+		return raw_pci_ext_ops->read(domain, bus, devfn, reg, len, val);
+	return -EINVAL;
+}
+
+int raw_pci_write(unsigned int domain, unsigned int bus, unsigned int devfn,
+						int reg, int len, u32 val)
+{
+	if (reg < 256 && raw_pci_ops)
+		return raw_pci_ops->write(domain, bus, devfn, reg, len, val);
+	if (raw_pci_ext_ops)
+		return raw_pci_ext_ops->write(domain, bus, devfn, reg, len, val);
+	return -EINVAL;
+}
 
 static int pci_read(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
 {
-	return raw_pci_ops->read(pci_domain_nr(bus), bus->number,
+	return raw_pci_read(pci_domain_nr(bus), bus->number,
 				 devfn, where, size, value);
 }
 
 static int pci_write(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 value)
 {
-	return raw_pci_ops->write(pci_domain_nr(bus), bus->number,
+	return raw_pci_write(pci_domain_nr(bus), bus->number,
 				  devfn, where, size, value);
 }
 
diff --git a/arch/x86/pci/direct.c b/arch/x86/pci/direct.c
index 431c9a5..42f3e4c 100644
--- a/arch/x86/pci/direct.c
+++ b/arch/x86/pci/direct.c
@@ -14,7 +14,7 @@
 #define PCI_CONF1_ADDRESS(bus, devfn, reg) \
 	(0x80000000 | (bus << 16) | (devfn << 8) | (reg & ~3))
 
-int pci_conf1_read(unsigned int seg, unsigned int bus,
+static int pci_conf1_read(unsigned int seg, unsigned int bus,
 			  unsigned int devfn, int reg, int len, u32 *value)
 {
 	unsigned long flags;
@@ -45,7 +45,7 @@ int pci_conf1_read(unsigned int seg, unsigned int bus,
 	return 0;
 }
 
-int pci_conf1_write(unsigned int seg, unsigned int bus,
+static int pci_conf1_write(unsigned int seg, unsigned int bus,
 			   unsigned int devfn, int reg, int len, u32 value)
 {
 	unsigned long flags;
diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index 6cff66d..b31cd6a 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -215,7 +215,8 @@ static int quirk_aspm_offset[MAX_PCIEROOT << 3];
 
 static int quirk_pcie_aspm_read(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
 {
-	return raw_pci_ops->read(0, bus->number, devfn, where, size, value);
+	return raw_pci_read(pci_domain_nr(bus), bus->number,
+						devfn, where, size, value);
 }
 
 /*
@@ -231,7 +232,8 @@ static int quirk_pcie_aspm_write(struct pci_bus *bus, unsigned int devfn, int wh
 	if ((offset) && (where == offset))
 		value = value & 0xfffffffc;
 	
-	return raw_pci_ops->write(0, bus->number, devfn, where, size, value);
+	return raw_pci_write(pci_domain_nr(bus), bus->number,
+						devfn, where, size, value);
 }
 
 static struct pci_ops quirk_pcie_aspm_ops = {
diff --git a/arch/x86/pci/legacy.c b/arch/x86/pci/legacy.c
index 5565d70..e041ced 100644
--- a/arch/x86/pci/legacy.c
+++ b/arch/x86/pci/legacy.c
@@ -22,7 +22,7 @@ static void __devinit pcibios_fixup_peer_bridges(void)
 		if (pci_find_bus(0, n))
 			continue;
 		for (devfn = 0; devfn < 256; devfn += 8) {
-			if (!raw_pci_ops->read(0, n, devfn, PCI_VENDOR_ID, 2, &l) &&
+			if (!raw_pci_read(0, n, devfn, PCI_VENDOR_ID, 2, &l) &&
 			    l != 0x0000 && l != 0xffff) {
 				DBG("Found device at %02x:%02x [%04x]\n", n, devfn, l);
 				printk(KERN_INFO "PCI: Discovered peer bus %02x\n", n);
diff --git a/arch/x86/pci/mmconfig-shared.c b/arch/x86/pci/mmconfig-shared.c
index 6b521d3..8d54df4 100644
--- a/arch/x86/pci/mmconfig-shared.c
+++ b/arch/x86/pci/mmconfig-shared.c
@@ -28,7 +28,7 @@ static int __initdata pci_mmcfg_resources_inserted;
 static const char __init *pci_mmcfg_e7520(void)
 {
 	u32 win;
-	pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);
+	pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);
 
 	win = win & 0xf000;
 	if(win == 0x0000 || win == 0xf000)
@@ -53,7 +53,7 @@ static const char __init *pci_mmcfg_intel_945(void)
 
 	pci_mmcfg_config_num = 1;
 
-	pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0x48, 4, &pciexbar);
+	pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0x48, 4, &pciexbar);
 
 	/* Enable bit */
 	if (!(pciexbar & 1))
@@ -118,7 +118,7 @@ static int __init pci_mmcfg_check_hostbridge(void)
 	int i;
 	const char *name;
 
-	pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0, 4, &l);
+	pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0, 4, &l);
 	vendor = l & 0xffff;
 	device = (l >> 16) & 0xffff;
 
diff --git a/arch/x86/pci/mmconfig_32.c b/arch/x86/pci/mmconfig_32.c
index 7b75e65..081816a 100644
--- a/arch/x86/pci/mmconfig_32.c
+++ b/arch/x86/pci/mmconfig_32.c
@@ -68,9 +68,6 @@ err:		*value = -1;
 		return -EINVAL;
 	}
 
-	if (reg < 256)
-		return pci_conf1_read(seg,bus,devfn,reg,len,value);
-
 	base = get_base_addr(seg, bus, devfn);
 	if (!base)
 		goto err;
@@ -104,9 +101,6 @@ static int pci_mmcfg_write(unsigned int seg, unsigned int bus,
 	if ((bus > 255) || (devfn > 255) || (reg > 4095))
 		return -EINVAL;
 
-	if (reg < 256)
-		return pci_conf1_write(seg,bus,devfn,reg,len,value);
-
 	base = get_base_addr(seg, bus, devfn);
 	if (!base)
 		return -EINVAL;
@@ -138,7 +132,7 @@ static struct pci_raw_ops pci_mmcfg = {
 
 int __init pci_mmcfg_arch_init(void)
 {
-	printk(KERN_INFO "PCI: Using MMCONFIG\n");
-	raw_pci_ops = &pci_mmcfg;
+	printk(KERN_INFO "PCI: Using MMCONFIG for extended config space\n");
+	raw_pci_ext_ops = &pci_mmcfg;
 	return 1;
 }
diff --git a/arch/x86/pci/mmconfig_64.c b/arch/x86/pci/mmconfig_64.c
index c4cf318..9207fd4 100644
--- a/arch/x86/pci/mmconfig_64.c
+++ b/arch/x86/pci/mmconfig_64.c
@@ -58,9 +58,6 @@ err:		*value = -1;
 		return -EINVAL;
 	}
 
-	if (reg < 256)
-		return pci_conf1_read(seg,bus,devfn,reg,len,value);
-
 	addr = pci_dev_base(seg, bus, devfn);
 	if (!addr)
 		goto err;
@@ -89,9 +86,6 @@ static int pci_mmcfg_write(unsigned int seg, unsigned int bus,
 	if (unlikely((bus > 255) || (devfn > 255) || (reg > 4095)))
 		return -EINVAL;
 
-	if (reg < 256)
-		return pci_conf1_write(seg,bus,devfn,reg,len,value);
-
 	addr = pci_dev_base(seg, bus, devfn);
 	if (!addr)
 		return -EINVAL;
@@ -150,6 +144,6 @@ int __init pci_mmcfg_arch_init(void)
 			return 0;
 		}
 	}
-	raw_pci_ops = &pci_mmcfg;
+	raw_pci_ext_ops = &pci_mmcfg;
 	return 1;
 }
diff --git a/arch/x86/pci/pci.h b/arch/x86/pci/pci.h
index 36cb44c..3431518 100644
--- a/arch/x86/pci/pci.h
+++ b/arch/x86/pci/pci.h
@@ -85,10 +85,17 @@ extern spinlock_t pci_config_lock;
 extern int (*pcibios_enable_irq)(struct pci_dev *dev);
 extern void (*pcibios_disable_irq)(struct pci_dev *dev);
 
-extern int pci_conf1_write(unsigned int seg, unsigned int bus,
-			   unsigned int devfn, int reg, int len, u32 value);
-extern int pci_conf1_read(unsigned int seg, unsigned int bus,
-			  unsigned int devfn, int reg, int len, u32 *value);
+struct pci_raw_ops {
+	int (*read)(unsigned int domain, unsigned int bus, unsigned int devfn,
+						int reg, int len, u32 *val);
+	int (*write)(unsigned int domain, unsigned int bus, unsigned int devfn,
+						int reg, int len, u32 val);
+};
+
+extern struct pci_raw_ops *raw_pci_ops;
+extern struct pci_raw_ops *raw_pci_ext_ops;
+
+extern struct pci_raw_ops pci_direct_conf1;
 
 extern int pci_direct_probe(void);
 extern void pci_direct_init(int type);
diff --git a/arch/x86/pci/visws.c b/arch/x86/pci/visws.c
index 8ecb1c7..c2df4e9 100644
--- a/arch/x86/pci/visws.c
+++ b/arch/x86/pci/visws.c
@@ -13,9 +13,6 @@
 
 #include "pci.h"
 
-
-extern struct pci_raw_ops pci_direct_conf1;
-
 static int pci_visws_enable_irq(struct pci_dev *dev) { return 0; }
 static void pci_visws_disable_irq(struct pci_dev *dev) { }
 
diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index e3a673a..f190db9 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -139,15 +139,6 @@ acpi_status __init acpi_os_initialize(void)
 
 acpi_status acpi_os_initialize1(void)
 {
-	/*
-	 * Initialize PCI configuration space access, as we'll need to access
-	 * it while walking the namespace (bus 0 and root bridges w/ _BBNs).
-	 */
-	if (!raw_pci_ops) {
-		printk(KERN_ERR PREFIX
-		       "Access to PCI configuration space unavailable\n");
-		return AE_NULL_ENTRY;
-	}
 	kacpid_wq = create_singlethread_workqueue("kacpid");
 	kacpi_notify_wq = create_singlethread_workqueue("kacpi_notify");
 	BUG_ON(!kacpid_wq);
@@ -498,11 +489,9 @@ acpi_os_read_pci_configuration(struct acpi_pci_id * pci_id, u32 reg,
 		return AE_ERROR;
 	}
 
-	BUG_ON(!raw_pci_ops);
-
-	result = raw_pci_ops->read(pci_id->segment, pci_id->bus,
-				   PCI_DEVFN(pci_id->device, pci_id->function),
-				   reg, size, value);
+	result = raw_pci_read(pci_id->segment, pci_id->bus,
+				PCI_DEVFN(pci_id->device, pci_id->function),
+				reg, size, value);
 
 	return (result ? AE_ERROR : AE_OK);
 }
@@ -529,11 +518,9 @@ acpi_os_write_pci_configuration(struct acpi_pci_id * pci_id, u32 reg,
 		return AE_ERROR;
 	}
 
-	BUG_ON(!raw_pci_ops);
-
-	result = raw_pci_ops->write(pci_id->segment, pci_id->bus,
-				    PCI_DEVFN(pci_id->device, pci_id->function),
-				    reg, size, value);
+	result = raw_pci_write(pci_id->segment, pci_id->bus,
+				PCI_DEVFN(pci_id->device, pci_id->function),
+				reg, size, value);
 
 	return (result ? AE_ERROR : AE_OK);
 }
diff --git a/drivers/ata/Kconfig b/drivers/ata/Kconfig
index ba63619..1e71dc0 100644
--- a/drivers/ata/Kconfig
+++ b/drivers/ata/Kconfig
@@ -40,6 +40,9 @@ config ATA_ACPI
 	  You can disable this at kernel boot time by using the
 	  option libata.noacpi=1
 
+config ATA_RAM
+	tristate "ATA RAM driver"
+
 config SATA_AHCI
 	tristate "AHCI SATA support"
 	depends on PCI
diff --git a/drivers/ata/Makefile b/drivers/ata/Makefile
index b13feb2..bc2eef0 100644
--- a/drivers/ata/Makefile
+++ b/drivers/ata/Makefile
@@ -75,6 +75,9 @@ obj-$(CONFIG_ATA_GENERIC)	+= ata_generic.o
 # Should be last libata driver
 obj-$(CONFIG_PATA_LEGACY)	+= pata_legacy.o
 
+# A fake ata driver.  Can it be postultimate?
+obj-$(CONFIG_ATA_RAM)          += ata_ram.o
+
 libata-objs	:= libata-core.o libata-scsi.o libata-sff.o libata-eh.o \
 		   libata-pmp.o
 libata-$(CONFIG_ATA_ACPI)	+= libata-acpi.o
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 0dd93bb..f4f1edd 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -304,14 +304,14 @@ struct pci_ops {
 	int (*write)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 val);
 };
 
-struct pci_raw_ops {
-	int (*read)(unsigned int domain, unsigned int bus, unsigned int devfn,
-		    int reg, int len, u32 *val);
-	int (*write)(unsigned int domain, unsigned int bus, unsigned int devfn,
-		     int reg, int len, u32 val);
-};
-
-extern struct pci_raw_ops *raw_pci_ops;
+/*
+ * ACPI needs to be able to access PCI config space before we've done a
+ * PCI bus scan and created pci_bus structures.
+ */
+extern int raw_pci_read(unsigned int domain, unsigned int bus,
+			unsigned int devfn, int reg, int len, u32 *val);
+extern int raw_pci_write(unsigned int domain, unsigned int bus,
+			unsigned int devfn, int reg, int len, u32 val);
 
 struct pci_bus_region {
 	unsigned long start;
-- 
1.5.2.5

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 125+ messages in thread
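
The dispatch that the patch above adds to arch/x86/pci/common.c can be modelled in userspace roughly as follows. This is a sketch, not the kernel code: the `demo_` names and the stub backends are assumptions made for illustration, and only the read path is shown.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct demo_raw_ops {
	int (*read)(unsigned int seg, unsigned int bus, unsigned int devfn,
		    int reg, int len, uint32_t *val);
};

/* Stand-ins for raw_pci_ops and raw_pci_ext_ops. */
static struct demo_raw_ops *demo_ops;
static struct demo_raw_ops *demo_ext_ops;

/* Stub backends: each returns a marker value so the route is visible. */
static int demo_conf1_read(unsigned int seg, unsigned int bus,
			   unsigned int devfn, int reg, int len, uint32_t *val)
{
	(void)seg; (void)bus; (void)devfn; (void)reg; (void)len;
	assert(val != NULL);
	*val = 1;	/* marker: served by the conf1 backend */
	return 0;
}

static int demo_mmcfg_read(unsigned int seg, unsigned int bus,
			   unsigned int devfn, int reg, int len, uint32_t *val)
{
	(void)seg; (void)bus; (void)devfn; (void)reg; (void)len;
	assert(val != NULL);
	*val = 2;	/* marker: served by the MMCONFIG backend */
	return 0;
}

static struct demo_raw_ops demo_conf1 = { .read = demo_conf1_read };
static struct demo_raw_ops demo_mmcfg = { .read = demo_mmcfg_read };

/* Same decision order as raw_pci_read() in the patch. */
static int demo_raw_pci_read(unsigned int seg, unsigned int bus,
			     unsigned int devfn, int reg, int len,
			     uint32_t *val)
{
	if (reg < 256 && demo_ops)
		return demo_ops->read(seg, bus, devfn, reg, len, val);
	if (demo_ext_ops)
		return demo_ext_ops->read(seg, bus, devfn, reg, len, val);
	return -EINVAL;
}
```

One consequence of this ordering: if only the extended ops are registered, accesses below 256 bytes still go through them rather than failing.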

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-28 20:44                                                     ` Greg KH
  2008-01-28 22:31                                                       ` Matthew Wilcox
@ 2008-01-29  3:05                                                       ` Arjan van de Ven
  2008-01-29  3:18                                                         ` Matthew Wilcox
  1 sibling, 1 reply; 125+ messages in thread
From: Arjan van de Ven @ 2008-01-29  3:05 UTC (permalink / raw)
  To: Greg KH
  Cc: Tony Camuso, Grant Grundler, Matthew Wilcox, Loic Prylli,
	Adrian Bunk, Linus Torvalds, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

On Mon, 28 Jan 2008 12:44:31 -0800
Greg KH <greg@kroah.com> wrote:

> On Mon, Jan 28, 2008 at 01:32:06PM -0500, Tony Camuso wrote:
> > Greg,
> >
> > Have you given Grant's suggestion any further consideration?
> >
> > I'd like to know how the MMCONFIG issues discussed in this thread
> > are going to be handled upstream. I have a patch implemented in
> > RHEL 5.2, but I would rather have the upstream patch implemented,
> > whatever it is.
> 
> Well, everyone still doesn't seem to agree on the proper way forward
> here, so for me to just "pick one" isn't very appropriate.
> 
> So, can we try again?

I think there's only one fundamental disagreement; and that is:
do we think that things are now totally fixed and no new major issues
will arrive after the "fix yet another mmconfig thing" patches are merged.

If the answer is no, then imho my patch is the right approach; it will limit the damage and doesn't make
the people suffer who don't need extended config space.
If the answer is yes, then my patch is not needed.

This is a judgment call; I'm skeptical, others are more optimistic that after 2 years of messing around
they have finally found the last golden fix.

-- 
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-29  3:05                                                       ` [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in Arjan van de Ven
@ 2008-01-29  3:18                                                         ` Matthew Wilcox
  2008-01-29 13:19                                                           ` Greg KH
  0 siblings, 1 reply; 125+ messages in thread
From: Matthew Wilcox @ 2008-01-29  3:18 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Greg KH, Tony Camuso, Grant Grundler, Loic Prylli, Adrian Bunk,
	Linus Torvalds, Benjamin Herrenschmidt, Ivan Kokshaysky, Greg KH,
	linux-kernel, Jeff Garzik, linux-pci, Martin Mares

On Mon, Jan 28, 2008 at 07:05:05PM -0800, Arjan van de Ven wrote:
> I think there's only one fundamental disagreement; and that is:
> do we think that things are now totally fixed and no new major issues
> will arrive after the "fix yet another mmconfig thing" patches are merged.
> 
> If the answer is no, then imho my patch is the right approach; it will limit the damage and doesn't make
> the people suffer who don't need extended config space.
> If the answer is yes, then my patch is not needed.
> 
> This is a judgment call; I'm skeptical, others are more optimistic that after 2 years of messing around
> they have finally found the last golden fix.

I'm more optimistic because we've so severely restricted the use of
mmconf after these patches that it's unlikely to cause problems.  I also
hear Vista is now using mmconf, so fewer implementations are going to
be buggy at this point.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-29  3:18                                                         ` Matthew Wilcox
@ 2008-01-29 13:19                                                           ` Greg KH
  2008-01-29 14:15                                                             ` Tony Camuso
                                                                               ` (2 more replies)
  0 siblings, 3 replies; 125+ messages in thread
From: Greg KH @ 2008-01-29 13:19 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Arjan van de Ven, Tony Camuso, Grant Grundler, Loic Prylli,
	Adrian Bunk, Linus Torvalds, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

On Mon, Jan 28, 2008 at 08:18:04PM -0700, Matthew Wilcox wrote:
> On Mon, Jan 28, 2008 at 07:05:05PM -0800, Arjan van de Ven wrote:
> > I think there's only one fundamental disagreement; and that is:
> > do we think that things are now totally fixed and no new major issues
> > will arrive after the "fix yet another mmconfig thing" patches are merged.
> > 
> > If the answer is no, then imho my patch is the right approach; it will limit the damage and doesn't make
> > the people suffer who don't need extended config space.
> > If the answer is yes, then my patch is not needed.
> > 
> > This is a judgment call; I'm skeptical, others are more optimistic that after 2 years of messing around
> > they have finally found the last golden fix.
> 
> I'm more optimistic because we've so severely restricted the use of
> mmconf after these patches that it's unlikely to cause problems.  I also
> hear Vista is now using mmconf, so fewer implementations are going to
> be buggy at this point.

Hahahaha, oh, that's a good one...

But what about the thousands of implementations out there that are
buggy?

I'm with Arjan here, I'm very skeptical.

Matthew, with Arjan's patch, is anything that currently works now
broken?  Why do you feel it is somehow "wrong"?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: PCI x86: always use conf1 to access config space below 256 bytes
  2008-01-29  2:57                                                             ` PCI x86: always use conf1 to access config space below 256 bytes Matthew Wilcox
@ 2008-01-29 13:21                                                               ` Greg KH
  2008-01-29 23:43                                                                 ` Matthew Wilcox
  0 siblings, 1 reply; 125+ messages in thread
From: Greg KH @ 2008-01-29 13:21 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Tony Camuso, Grant Grundler, Loic Prylli, Adrian Bunk,
	Linus Torvalds, Arjan van de Ven, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

On Mon, Jan 28, 2008 at 07:57:44PM -0700, Matthew Wilcox wrote:
> PCI x86: always use conf1 to access config space below 256 bytes
> 
> Thanks to Loic Prylli <loic@myri.com>, who originally proposed
> this idea.
> 
> Always using legacy configuration mechanism for the legacy config space
> and extended mechanism (mmconf) for the extended config space is
> a simple and very logical approach. It's supposed to resolve all
> known mmconf problems. It still allows per-device quirks (tweaking
> dev->cfg_size). It also allows us to get rid of mmconf fallback code.
> 
> Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
> Signed-off-by: Matthew Wilcox <willy@linux.intel.com>

Hm, who wrote this, Ivan?

If so, Matthew, please do not strip off authorship of patches, and place
a "From:" line on the first line above the description, so it is not
lost.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 125+ messages in thread
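
The per-device quirk mentioned in the quoted commit message (tweaking `dev->cfg_size`) could look roughly like this. A minimal sketch, assuming the device's config size is the upper bound that accesses are checked against; `struct demo_dev`, `demo_quirk_no_ext_config()` and `demo_config_access_ok()` are hypothetical names, not kernel API:

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CFG_SPACE_SIZE	256	/* conventional config space */
#define DEMO_CFG_SPACE_EXP_SIZE	4096	/* extended config space */

/* Hypothetical stand-in for the one struct pci_dev field that matters here. */
struct demo_dev {
	int cfg_size;
};

/* Quirk for a device whose extended space is known to misbehave. */
static void demo_quirk_no_ext_config(struct demo_dev *dev)
{
	assert(dev != NULL);
	dev->cfg_size = DEMO_CFG_SPACE_SIZE;
}

/* An access is allowed only if it fits entirely below cfg_size. */
static int demo_config_access_ok(const struct demo_dev *dev, int where,
				 int size)
{
	assert(dev != NULL);
	if (where < 0 || size <= 0)
		return 0;
	return where + size <= dev->cfg_size;
}
```

Under this model the quirk never touches the access method; it only shrinks the window, so offsets at or above 256 are no longer reachable for that one device.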

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-29 13:19                                                           ` Greg KH
@ 2008-01-29 14:15                                                             ` Tony Camuso
  2008-01-29 14:47                                                               ` Arjan van de Ven
  2008-01-30  3:45                                                             ` Matthew Wilcox
  2008-01-31  5:51                                                             ` Jesse Barnes
  2 siblings, 1 reply; 125+ messages in thread
From: Tony Camuso @ 2008-01-29 14:15 UTC (permalink / raw)
  To: Greg KH
  Cc: Matthew Wilcox, Arjan van de Ven, Grant Grundler, Loic Prylli,
	Adrian Bunk, Linus Torvalds, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

Greg KH wrote:
> On Mon, Jan 28, 2008 at 08:18:04PM -0700, Matthew Wilcox wrote:
>> I'm more optimistic because we've so severely restricted the use of
>> mmconf after these patches that it's unlikely to cause problems.  I also
>> hear Vista is now using mmconf, so fewer implementations are going to
>> be buggy at this point.
> 
> Hahahaha, oh, that's a good one...
> 
> But what about the thousands of implementations out there that are
> buggy?
> 
> I'm with Arjan here, I'm very skeptical.
> 
> Matthew, with Arjan's patch, is anything that currently works now
> broken?  Why do you feel it is somehow "wrong"?
> 
> thanks,
> 
> greg k-h

Greg,

The problem with Arjan's patch, if I understand it correctly, is that it
requires drivers to make a call to access extended PCI config space.

And, IIRC, Arjan's patch encumbers drivers for all arch's, even those
that have no MMCONFIG problems.

The patches proposed by Loic, Ivan, Matthew, and myself, all address the
problem in an x86-specific manner that is transparent to the drivers.


^ permalink raw reply	[flat|nested] 125+ messages in thread
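
The difference Tony describes between the two approaches can be sketched as two access policies. The names below (`demo_enable_ext_config()` and friends) are hypothetical and are not taken from either patch; the sketch only models who has to act before offsets of 256 and above become reachable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-device state for the opt-in model. */
struct demo_dev {
	int ext_cfg_enabled;
};

/* Opt-in model: a driver must call this before touching extended space. */
static int demo_enable_ext_config(struct demo_dev *dev)
{
	assert(dev != NULL);
	dev->ext_cfg_enabled = 1;
	return 0;
}

/* Opt-in model: extended offsets are refused until the driver opts in. */
static int demo_optin_access_allowed(const struct demo_dev *dev, int reg)
{
	assert(dev != NULL);
	if (reg < 0 || reg > 4095)
		return 0;
	if (reg < 256)
		return 1;
	return dev->ext_cfg_enabled;
}

/*
 * Transparent model: every valid offset is allowed; the arch code picks
 * the access method from the offset and the driver makes no extra call.
 */
static int demo_transparent_access_allowed(int reg)
{
	return reg >= 0 && reg <= 4095;
}
```

In the opt-in model the extra call appears in every driver that needs extended space, on every architecture; in the transparent model the decision stays inside the x86 access code.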

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-29 14:15                                                             ` Tony Camuso
@ 2008-01-29 14:47                                                               ` Arjan van de Ven
  2008-01-29 15:15                                                                 ` Tony Camuso
  0 siblings, 1 reply; 125+ messages in thread
From: Arjan van de Ven @ 2008-01-29 14:47 UTC (permalink / raw)
  To: tcamuso
  Cc: Greg KH, Matthew Wilcox, Grant Grundler, Loic Prylli,
	Adrian Bunk, Linus Torvalds, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

On Tue, 29 Jan 2008 09:15:02 -0500
Tony Camuso <tcamuso@redhat.com> wrote:

> Greg KH wrote:
> > On Mon, Jan 28, 2008 at 08:18:04PM -0700, Matthew Wilcox wrote:
> >> I'm more optimistic because we've so severely restricted the use of
> >> mmconf after these patches that it's unlikely to cause problems.
> >> I also hear Vista is now using mmconf, so fewer implementations
> >> are going to be buggy at this point.
> > 
> > Hahahaha, oh, that's a good one...
> > 
> > But what about the thousands of implementations out there that are
> > buggy?
> > 
> > I'm with Arjan here, I'm very skeptical.
> > 
> > Matthew, with Arjan's patch, is anything that currently works now
> > broken?  Why do you feel it is somehow "wrong"?
> > 
> > thanks,
> > 
> > greg k-h
> 
> Greg,
> 
> The problem with Arjan's patch, if I understand it correctly, is that
> it requires drivers to make a call to access extended PCI config
> space.
> 
> And, IIRC, Arjan's patch encumbers drivers for all arch's, even those
> that have no MMCONFIG problems.
> 
> The patches proposed by Loic, Ivan, Matthew, and myself, all address
> the problem in an x86-specific manner that is transparent to the
> drivers.

This is not quite correct; the patches from Loic, Ivan, Matthew and you are for a different
problem statement.

Your patch problem statement is "need to fix mmconfig", my patch problem statement is "need
to not make users who don't need it suffer". These are orthogonal problems.


-- 
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-29 14:47                                                               ` Arjan van de Ven
@ 2008-01-29 15:15                                                                 ` Tony Camuso
  2008-01-29 15:29                                                                   ` Arjan van de Ven
  0 siblings, 1 reply; 125+ messages in thread
From: Tony Camuso @ 2008-01-29 15:15 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Greg KH, Matthew Wilcox, Grant Grundler, Loic Prylli,
	Adrian Bunk, Linus Torvalds, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

Arjan van de Ven wrote:
> On Tue, 29 Jan 2008 09:15:02 -0500
> Tony Camuso <tcamuso@redhat.com> wrote:
> 
>> Greg,
>>
>> The problem with Arjan's patch, if I understand it correctly, is that
>> it requires drivers to make a call to access extended PCI config
>> space.
>>
>> And, IIRC, Arjan's patch encumbers drivers for all arch's, even those
>> that have no MMCONFIG problems.
>>
>> The patches proposed by Loic, Ivan, Matthew, and myself, all address
>> the problem in an x86-specific manner that is transparent to the
>> drivers.
> 
> this is not quite correct; the patches from Loic, Ivan, Matthew and you are for a different
> problem statement.
> 
> Your patch problem statement is "need to fix mmconfig", my patch problem statement is "need
> to not make users who don't need it suffer". These are orthogonal problems.
> 
> 

Yes, but your patch also makes users who need extended PCI config space suffer.

Right now, that isn't a lot of people in x86 land, but your patch encumbers drivers
for non-x86 archs with an additional call to access space that they've never had
a problem with.

As more PCI express drivers start to take advantage of AER and other advanced
express capabilities, the extra call to address a condition specific to legacy
x86 hardware is, IMNSHO, a kludge.

The patches submitted by the others fix the problems with MMCONFIG without
encumbering the drivers to be aware of any difference between legacy config
space and extended config space.

I have tested these patches on a number of systems exhibiting various MMCONFIG-
related pathologies, and they work.






* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-29 15:15                                                                 ` Tony Camuso
@ 2008-01-29 15:29                                                                   ` Arjan van de Ven
  2008-01-29 16:26                                                                     ` Tony Camuso
  2008-01-29 23:57                                                                     ` Matthew Wilcox
  0 siblings, 2 replies; 125+ messages in thread
From: Arjan van de Ven @ 2008-01-29 15:29 UTC (permalink / raw)
  To: tcamuso
  Cc: Greg KH, Matthew Wilcox, Grant Grundler, Loic Prylli,
	Adrian Bunk, Linus Torvalds, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

On Tue, 29 Jan 2008 10:15:45 -0500
Tony Camuso <tcamuso@redhat.com> wrote:

> Arjan van de Ven wrote:
> > On Tue, 29 Jan 2008 09:15:02 -0500
> > Tony Camuso <tcamuso@redhat.com> wrote:
> > 
> >> Greg,
> >>
> >> The problem with Arjan's patch, if I understand it correctly, is
> >> that it requires drivers to make a call to access extended PCI
> >> config space.
> >>
> >> And, IIRC, Arjan's patch encumbers drivers for all arch's, even
> >> those that have no MMCONFIG problems.
> >>
> >> The patches proposed by Loic, Ivan, Matthew, and myself, all
> >> address the problem in an x86-specific manner that is transparent
> >> to the drivers.
> > 
> > this is not quite correct; the patches from Loic, Ivan, Matthew and
> > you are for a different problem statement.
> > 
> > Your patch problem statement is "need to fix mmconfig", my patch
> > problem statement is "need to not make users who don't need it
> > suffer". These are orthogonal problems.
> > 
> > 
> 
> Yes, but your patch also makes users who need extended PCI config
> space suffer.
> 
> Right now, that isn't a lot of people in x86 land, but your patch
> encumbers drivers for non-x86 archs with an additional call to access
> space that they've never had a problem with.

let's say s/x86/x86, IA64 and architectures that use intel, amd or via chipsets/



> As more PCI express drivers start to take advantage of AER and other
> advanced express capabilities, the extra call to address a condition
> specific to legacy x86 hardware is, IMNSHO, a kludge.

in addition to the pci_enable_device(), pci_enable_msi() and pci_set_master() calls they
already need to make to enable various features?


-- 
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org


* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-29 15:29                                                                   ` Arjan van de Ven
@ 2008-01-29 16:26                                                                     ` Tony Camuso
  2008-01-29 23:57                                                                     ` Matthew Wilcox
  1 sibling, 0 replies; 125+ messages in thread
From: Tony Camuso @ 2008-01-29 16:26 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Greg KH, Matthew Wilcox, Grant Grundler, Loic Prylli,
	Adrian Bunk, Linus Torvalds, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

Arjan van de Ven wrote:
> On Tue, 29 Jan 2008 10:15:45 -0500
> Tony Camuso <tcamuso@redhat.com> wrote:
> 

>> specific to legacy x86 hardware is, IMNSHO, a kludge.
> 
> in addition to the pci_enable_device(), pci_enable_msi() and pci_set_master() calls they
> already need to make to enable various features?
> 

These calls are related to generic aspects of the PCI* landscape itself and are
not related to any arch-specific hardware, nor were they devised to address
chipset-specific or BIOS-specific problems.

For the good of all, we should endeavor to avoid putting arch-specific fixes into
the generic code whenever possible.

And in this case, not only is it possible, it's been done and tested.



* Re: PCI x86: always use conf1 to access config space below 256 bytes
  2008-01-29 13:21                                                               ` Greg KH
@ 2008-01-29 23:43                                                                 ` Matthew Wilcox
  2008-01-30  0:04                                                                   ` Linus Torvalds
  0 siblings, 1 reply; 125+ messages in thread
From: Matthew Wilcox @ 2008-01-29 23:43 UTC (permalink / raw)
  To: Greg KH
  Cc: Tony Camuso, Grant Grundler, Loic Prylli, Adrian Bunk,
	Linus Torvalds, Arjan van de Ven, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

On Tue, Jan 29, 2008 at 05:21:08AM -0800, Greg KH wrote:
> Hm, who wrote this, Ivan?
> 
> If so, Matthew, please do not strip off authorship of patches, and place
> a "From:" line on the first line above the description, so it is not
> lost.

Sorry, I didn't know that was the convention.  I thought the first
Signed-off-by: was assumed to be the author.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."


* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-29 15:29                                                                   ` Arjan van de Ven
  2008-01-29 16:26                                                                     ` Tony Camuso
@ 2008-01-29 23:57                                                                     ` Matthew Wilcox
  2008-01-30  2:30                                                                       ` Tony Camuso
  1 sibling, 1 reply; 125+ messages in thread
From: Matthew Wilcox @ 2008-01-29 23:57 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: tcamuso, Greg KH, Grant Grundler, Loic Prylli, Adrian Bunk,
	Linus Torvalds, Benjamin Herrenschmidt, Ivan Kokshaysky, Greg KH,
	linux-kernel, Jeff Garzik, linux-pci, Martin Mares

On Tue, Jan 29, 2008 at 07:29:51AM -0800, Arjan van de Ven wrote:
> > Right now, that isn't a lot of people in x86 land, but your patch
> > encumbers drivers for non-x86 archs with an additional call to access
> > space that they've never had a problem with.
> 
> lets say s/x86/x86, IA64 and architectures that use intel, amd or via chipsets/

Umm .. ia64 already does exactly what I'm proposing for x86.  It uses
one SAL interface for bytes below 256 and a different SAL interface for
bytes 256-4095.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."


* Re: PCI x86: always use conf1 to access config space below 256 bytes
  2008-01-29 23:43                                                                 ` Matthew Wilcox
@ 2008-01-30  0:04                                                                   ` Linus Torvalds
  0 siblings, 0 replies; 125+ messages in thread
From: Linus Torvalds @ 2008-01-30  0:04 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Greg KH, Tony Camuso, Grant Grundler, Loic Prylli, Adrian Bunk,
	Arjan van de Ven, Benjamin Herrenschmidt, Ivan Kokshaysky,
	Greg KH, linux-kernel, Jeff Garzik, linux-pci, Martin Mares



On Tue, 29 Jan 2008, Matthew Wilcox wrote:
> 
> Sorry, I didn't know that was the convention.  I thought the first
> Signed-off-by: was assumed to be the author.

There's certainly a strong correlation between "first sign-off" and 
authorship, but signing off doesn't guarantee it, and while it's not the 
bulk of patches, it certainly happens that people sign off on patches made 
by others (either because the company has specific people who have the 
right to sign off on things, or simply because the code comes from some 
source that did GPL it, but perhaps didn't sign off on it - hopefully 
rare, but certainly not impossible or unheard of especially for 
one-liners that got picked up from mailing lists etc)

		Linus


* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-29 23:57                                                                     ` Matthew Wilcox
@ 2008-01-30  2:30                                                                       ` Tony Camuso
  0 siblings, 0 replies; 125+ messages in thread
From: Tony Camuso @ 2008-01-30  2:30 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Arjan van de Ven, Greg KH, Grant Grundler, Loic Prylli,
	Adrian Bunk, Linus Torvalds, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

Matthew Wilcox wrote:
> On Tue, Jan 29, 2008 at 07:29:51AM -0800, Arjan van de Ven wrote:
>>> Right now, that isn't a lot of people in x86 land, but your patch
>>> encumbers drivers for non-x86 archs with an additional call to access
>>> space that they've never had a problem with.
>> lets say s/x86/x86, IA64 and architectures that use intel, amd or via chipsets/
> 
> Umm .. ia64 already does exactly what I'm proposing for x86.  It uses
> one SAL interface for bytes below 256 and a different SAL interface for
> bytes 256-4095.
> 

Not exactly.
:)

The interface is the same, ia64_sal_pci_config_write() and ia64_sal_pci_config_read(),
but a flag bit in the mode argument is used to tell the SAL interface whether to
translate the offset component of the config address as having 8 or 12 bits
of displacement.
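
For readers who do not have the ia64 code to hand, the flag described above
can be sketched roughly as follows. This is a user-space model, not the kernel
code: the struct and function names are hypothetical, and the bit layouts
(8-bit register field for mode 0, 12-bit register field for mode 1) are my
reading of arch/ia64/pci/pci.c and should be checked against the tree.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the two SAL config-address layouts. */
struct sal_cfg_addr {
	uint64_t addr;	/* encoded segment/bus/devfn/register */
	int mode;	/* 0: 8-bit register field, 1: 12-bit register field */
};

/*
 * Pick the compact layout when both segment and register fit in 8 bits,
 * otherwise the extended layout with 12 bits of register displacement.
 */
struct sal_cfg_addr sal_encode(unsigned int seg, unsigned int bus,
			       unsigned int devfn, unsigned int reg)
{
	struct sal_cfg_addr a;

	assert(bus <= 0xff && devfn <= 0xff && reg <= 0xfff);

	if ((seg | reg) <= 255) {
		a.addr = ((uint64_t)seg << 24) | (bus << 16) | (devfn << 8) | reg;
		a.mode = 0;
	} else {
		a.addr = ((uint64_t)seg << 28) | (bus << 20) | (devfn << 12) | reg;
		a.mode = 1;
	}
	return a;
}
```

So the caller sees a single read/write entry point, and the choice between the
two layouts travels in the mode value alongside the address.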

In my estimation, Ivan's patch, in his implementation of Loic's suggestion, is even
more elegant, since there is no need to flag whether the access is for offsets below
256. Ivan's code automatically uses Port IO (or equivalent with Matthew's patch) for
offsets below 256 and MMCONFIG for offsets from 256 to 4095.

And even better, it removes the bitmap that tracks MMCONFIG-unfriendly devices for
the first 16 buses, a solution that assumes systems with bus numbers higher than 16
will get MMCONFIG right, which turned out to be a very wrong assumption. Furthermore,
the config address is translated by the Northbridge. The delivery mechanism to
the Northbridge, whether Port IO or MMCONFIG, is utterly opaque to the devices on the
bus, since all they see is PCI config cycles, not Port IO or MMCONFIG cycles. The test
only needed to be made at the Northbridge level, not at the device level. Ivan's patch
removes all this cruft.
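
The dispatch-by-offset idea can be modelled in user space as below. This is
only a sketch under stated assumptions: the ops struct, the two stub backends
and the variable names are hypothetical stand-ins, and the explicit range check
on reg is an addition of the sketch, not something taken from the patches.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for a table of raw config accessors. */
struct cfg_ops {
	int (*read)(unsigned int bus, unsigned int devfn, int reg, uint32_t *val);
};

/* Stub backends: each only records which path served the access. */
int stub_conf1_read(unsigned int bus, unsigned int devfn, int reg, uint32_t *val)
{
	(void)bus; (void)devfn; (void)reg;
	*val = 1;	/* 1 = served by the port I/O (conf1) path */
	return 0;
}

int stub_mmcfg_read(unsigned int bus, unsigned int devfn, int reg, uint32_t *val)
{
	(void)bus; (void)devfn; (void)reg;
	*val = 2;	/* 2 = served by the MMCONFIG path */
	return 0;
}

const struct cfg_ops stub_conf1 = { stub_conf1_read };
const struct cfg_ops stub_mmcfg = { stub_mmcfg_read };

const struct cfg_ops *legacy_ops;	/* offsets 0..255, may be NULL */
const struct cfg_ops *ext_ops;		/* offsets 256..4095, may be NULL */

/*
 * Offsets below 256 go to the legacy accessor when one is registered;
 * everything else (or everything, if no legacy accessor exists) goes to
 * the extended accessor. The caller never says which mechanism it wants.
 */
int cfg_read(unsigned int bus, unsigned int devfn, int reg, uint32_t *val)
{
	assert(val != NULL);

	if (reg < 0 || reg > 4095)
		return -EINVAL;
	if (reg < 256 && legacy_ops)
		return legacy_ops->read(bus, devfn, reg, val);
	if (ext_ops)
		return ext_ops->read(bus, devfn, reg, val);
	return -EINVAL;
}
```

With both accessors registered, a read at 0x40 takes the conf1 stub and a read
at 0x100 takes the MMCONFIG stub; with only the extended accessor registered,
both reads fall through to it.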


* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-29 13:19                                                           ` Greg KH
  2008-01-29 14:15                                                             ` Tony Camuso
@ 2008-01-30  3:45                                                             ` Matthew Wilcox
  2008-01-30 15:15                                                               ` Ivan Kokshaysky
  2008-01-31  5:51                                                             ` Jesse Barnes
  2 siblings, 1 reply; 125+ messages in thread
From: Matthew Wilcox @ 2008-01-30  3:45 UTC (permalink / raw)
  To: Greg KH
  Cc: Arjan van de Ven, Tony Camuso, Grant Grundler, Loic Prylli,
	Adrian Bunk, Linus Torvalds, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

On Tue, Jan 29, 2008 at 05:19:55AM -0800, Greg KH wrote:
> On Mon, Jan 28, 2008 at 08:18:04PM -0700, Matthew Wilcox wrote:
> > I'm more optimistic because we've so severely restricted the use of
> > mmconf after these patches that it's unlikely to cause problems.  I also
> > hear Vista is now using mmconf, so fewer implementations are going to
> > be buggy at this point.
> 
> Hahahaha, oh, that's a good one...

Thanks Greg.  What happened to "Can't we all try to get along"?

> But what about the thousands of implementations out there that are
> buggy?
> 
> I'm with Arjan here, I'm very skeptical.

Maybe I'm insufficiently imaginative.  Can you come up with a plausible
way in which the two patches I posted will succumb to bugs?  After those
patches we only use mmconf if:

 1. conf1 has failed to work
OR
 2. user has compiled their own kernel without support for conf1
OR
 3. kernel probes config space 0x100 to see if it can access extended
    config space (requires the device to be PCIe or PCI-X2)
OR
 4. root attempts to lspci -xxxx or lspci -v
OR
 5. device driver tries to access extended config space

With Arjan's patch, I believe only case 3 changes.  In cases 4 and 5,
either lspci or the device driver will jump through the hoop to enable
access to extended config space.

> Matthew, with Arjan's patch, is anything that currently works now
> broken?  Why do you feel it is somehow "wrong"?

lspci is broken.  It used to be able to access extended config space, and
now can't unless it is patched to know about the sysfs flag to enable it.

If you're determined to implement something to disable extended config
space by default, it can be done in a much better way than Arjan's patch
-- less code (both source and object).

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."


* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-30  3:45                                                             ` Matthew Wilcox
@ 2008-01-30 15:15                                                               ` Ivan Kokshaysky
  2008-01-30 15:42                                                                 ` Arjan van de Ven
  0 siblings, 1 reply; 125+ messages in thread
From: Ivan Kokshaysky @ 2008-01-30 15:15 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Greg KH, Arjan van de Ven, Tony Camuso, Grant Grundler,
	Loic Prylli, Adrian Bunk, Linus Torvalds, Benjamin Herrenschmidt,
	Greg KH, linux-kernel, Jeff Garzik, linux-pci, Martin Mares

On Tue, Jan 29, 2008 at 08:45:55PM -0700, Matthew Wilcox wrote:
> On Tue, Jan 29, 2008 at 05:19:55AM -0800, Greg KH wrote:
> > Matthew, with Arjan's patch, is anything that currently works now
> > broken?  Why do you feel it is somehow "wrong"?
> 
> lspci is broken.  It used to be able to access extended config space, and
> now can't unless it is patched to know about the sysfs flag to enable it.

There is also likely damage to Xorg for the very same reason.

Ivan.


* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-30 15:15                                                               ` Ivan Kokshaysky
@ 2008-01-30 15:42                                                                 ` Arjan van de Ven
  2008-01-30 20:14                                                                   ` Ivan Kokshaysky
  0 siblings, 1 reply; 125+ messages in thread
From: Arjan van de Ven @ 2008-01-30 15:42 UTC (permalink / raw)
  To: Ivan Kokshaysky
  Cc: Matthew Wilcox, Greg KH, Tony Camuso, Grant Grundler,
	Loic Prylli, Adrian Bunk, Linus Torvalds, Benjamin Herrenschmidt,
	Greg KH, linux-kernel, Jeff Garzik, linux-pci, Martin Mares

On Wed, 30 Jan 2008 18:15:39 +0300
Ivan Kokshaysky <ink@jurassic.park.msu.ru> wrote:

> On Tue, Jan 29, 2008 at 08:45:55PM -0700, Matthew Wilcox wrote:
> > On Tue, Jan 29, 2008 at 05:19:55AM -0800, Greg KH wrote:
> > > Matthew, with Arjan's patch, is anything that currently works now
> > > broken?  Why do you feel it is somehow "wrong"?
> > 
> > lspci is broken.  It used to be able to access extended config
> > space, and now can't unless it is patched to know about the sysfs
> > flag to enable it.
> 
> There is also likely damage to Xorg for the very same reason.
>

Xorg doesn't do pci express ..
(newer ones actually have gotten out of the "do the PCI layer ourselves" business entirely)

> Ivan.


-- 
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org


* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-30 15:42                                                                 ` Arjan van de Ven
@ 2008-01-30 20:14                                                                   ` Ivan Kokshaysky
  0 siblings, 0 replies; 125+ messages in thread
From: Ivan Kokshaysky @ 2008-01-30 20:14 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Matthew Wilcox, Greg KH, Tony Camuso, Grant Grundler,
	Loic Prylli, Adrian Bunk, Linus Torvalds, Benjamin Herrenschmidt,
	Greg KH, linux-kernel, Jeff Garzik, linux-pci, Martin Mares

On Wed, Jan 30, 2008 at 07:42:49AM -0800, Arjan van de Ven wrote:
> Xorg doesn't do pci express ..

Xorg core provides a set of PCI config access functions (via sysfs) for
the graphics drivers. These functions do work correctly with offsets >= 256
bytes. Can you guarantee that none of PCI-E video drivers use that,
including proprietary nvidia and ati ones?
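
To make the userland side concrete, here is a minimal sketch of the kind of
access in question: a positioned read from a device's sysfs "config" file. The
helper name is hypothetical, and this is not Xorg's actual code. The path
would typically look like /sys/bus/pci/devices/<domain:bus:dev.fn>/config; a
read at 0x100 or beyond only succeeds if the kernel exposes extended config
space for that device.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Read one little-endian dword at `offset` from a sysfs-style config file.
 * Returns 0 on success, -1 if the file cannot be opened or the read comes
 * back short (for example because the offset is past the exposed size).
 */
int cfg_file_read32(const char *path, off_t offset, uint32_t *val)
{
	unsigned char b[4];
	ssize_t n;
	int fd;

	assert(path != NULL && val != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	n = pread(fd, b, sizeof(b), offset);
	close(fd);
	if (n != (ssize_t)sizeof(b))
		return -1;

	*val = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
	return 0;
}
```

A caller would pass 0x100 as the offset to fetch the first extended capability
header; any program written this way stops working if that region is hidden
until someone opts in.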

> (newer ones actually have gotten out of the "do the PCI layer ourselves" business entirely)

Unfortunately, not completely true. Though it has nothing to do with
extended config space.

Ivan.


* Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
  2008-01-29 13:19                                                           ` Greg KH
  2008-01-29 14:15                                                             ` Tony Camuso
  2008-01-30  3:45                                                             ` Matthew Wilcox
@ 2008-01-31  5:51                                                             ` Jesse Barnes
  2 siblings, 0 replies; 125+ messages in thread
From: Jesse Barnes @ 2008-01-31  5:51 UTC (permalink / raw)
  To: Greg KH
  Cc: Matthew Wilcox, Arjan van de Ven, Tony Camuso, Grant Grundler,
	Loic Prylli, Adrian Bunk, Linus Torvalds, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

On Tuesday 29 January 2008 05:19:55 am Greg KH wrote:
> Hahahaha, oh, that's a good one...
>
> But what about the thousands of implementations out there that are
> buggy?
>
> I'm with Arjan here, I'm very skeptical.

Ugg, let's look at the actual data (again); I'm really not sure why people 
are jumping to such dire conclusions about the current state of things.

AIUI we only have 3 issues so far (remember mmconfig has been enabled in -mm 
for a long time):
  1) host bridge decode problems (disabling decode to avoid overlaps can 
cause some bridges to stop decoding RAM addrs, but we have a fix for that)
  2) config space retry on ATI (I think willy already debunked this one?)
  3) some FUD about SMM or other firmware interrupts coming in during BAR 
sizing while decode is disabled (this one is just pure FUD; if we want to 
solve it properly we need a new platform hook to disable SMM/NMI/etc. 
around PCI probing)

What else was there?  What reason do we have to think that things are so 
disastrous?

So I really prefer willy's approach to Arjan's alternative...

Jesse


* Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
  2008-01-29  3:03                                                             ` [PATCH] Change pci_raw_ops to pci_raw_read/write Matthew Wilcox
@ 2008-02-03  7:30                                                               ` Yinghai Lu
  2008-02-07 15:54                                                                 ` Tony Camuso
  0 siblings, 1 reply; 125+ messages in thread
From: Yinghai Lu @ 2008-02-03  7:30 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Greg KH, Tony Camuso, Grant Grundler, Loic Prylli, Adrian Bunk,
	Linus Torvalds, Arjan van de Ven, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

On Jan 28, 2008 7:03 PM, Matthew Wilcox <matthew@wil.cx> wrote:
>
> We want to allow different implementations of pci_raw_ops for standard
> and extended config space on x86.  Rather than clutter generic code with
> knowledge of this, we make pci_raw_ops private to x86 and use it to
> implement the new raw interface -- raw_pci_read() and raw_pci_write().
>
> Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
> ---
>  arch/ia64/pci/pci.c               |   25 ++++++++-----------------
>  arch/ia64/sn/pci/tioce_provider.c |   16 ++++++++--------
>  arch/x86/kernel/quirks.c          |    2 +-
>  arch/x86/pci/common.c             |   25 +++++++++++++++++++++++--
>  arch/x86/pci/direct.c             |    4 ++--
>  arch/x86/pci/fixup.c              |    6 ++++--
>  arch/x86/pci/legacy.c             |    2 +-
>  arch/x86/pci/mmconfig-shared.c    |    6 +++---
>  arch/x86/pci/mmconfig_32.c        |   10 ++--------
>  arch/x86/pci/mmconfig_64.c        |    8 +-------
>  arch/x86/pci/pci.h                |   15 +++++++++++----
>  arch/x86/pci/visws.c              |    3 ---
>  drivers/acpi/osl.c                |   25 ++++++-------------------
>  drivers/ata/Kconfig               |    3 +++
>  drivers/ata/Makefile              |    3 +++
>  include/linux/pci.h               |   16 ++++++++--------
>  16 files changed, 84 insertions(+), 85 deletions(-)
>
...
>
> diff --git a/arch/x86/kernel/quirks.c b/arch/x86/kernel/quirks.c
> index fab30e1..7f73f7c 100644
> --- a/arch/x86/kernel/quirks.c
> +++ b/arch/x86/kernel/quirks.c
> @@ -27,7 +27,7 @@ static void __devinit quirk_intel_irqbalance(struct pci_dev *dev)
>         pci_write_config_byte(dev, 0xf4, config|0x2);
>
>         /* read xTPR register */
> -       raw_pci_ops->read(0, 0, 0x40, 0x4c, 2, &word);
> +       raw_pci_read(0, 0, 0x40, 0x4c, 2, &word);
>
>         if (!(word & (1 << 13))) {
>                 printk(KERN_INFO "Intel E7520/7320/7525 detected. "
> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
> index 8627463..f2bd9f3 100644
> --- a/arch/x86/pci/common.c
> +++ b/arch/x86/pci/common.c
> @@ -26,16 +26,37 @@ int pcibios_last_bus = -1;
>  unsigned long pirq_table_addr;
>  struct pci_bus *pci_root_bus;
>  struct pci_raw_ops *raw_pci_ops;
> +struct pci_raw_ops *raw_pci_ext_ops;
> +
> +int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
> +                                               int reg, int len, u32 *val)
> +{
> +       if (reg < 256 && raw_pci_ops)
> +               return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
> +       if (raw_pci_ext_ops)
> +               return raw_pci_ext_ops->read(domain, bus, devfn, reg, len, val);
> +       return -EINVAL;
> +}
> +
> +int raw_pci_write(unsigned int domain, unsigned int bus, unsigned int devfn,
> +                                               int reg, int len, u32 val)
> +{
> +       if (reg < 256 && raw_pci_ops)
> +               return raw_pci_ops->write(domain, bus, devfn, reg, len, val);
> +       if (raw_pci_ext_ops)
> +               return raw_pci_ext_ops->write(domain, bus, devfn, reg, len, val);
> +       return -EINVAL;
> +}
>
>  static int pci_read(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
>  {
> -       return raw_pci_ops->read(pci_domain_nr(bus), bus->number,
> +       return raw_pci_read(pci_domain_nr(bus), bus->number,
>                                  devfn, where, size, value);
>  }
>
>  static int pci_write(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 value)
>  {
> -       return raw_pci_ops->write(pci_domain_nr(bus), bus->number,
> +       return raw_pci_write(pci_domain_nr(bus), bus->number,
>                                   devfn, where, size, value);
>  }
>
> diff --git a/arch/x86/pci/direct.c b/arch/x86/pci/direct.c
> index 431c9a5..42f3e4c 100644
> --- a/arch/x86/pci/direct.c
> +++ b/arch/x86/pci/direct.c
> @@ -14,7 +14,7 @@
>  #define PCI_CONF1_ADDRESS(bus, devfn, reg) \
>         (0x80000000 | (bus << 16) | (devfn << 8) | (reg & ~3))
>
> -int pci_conf1_read(unsigned int seg, unsigned int bus,
> +static int pci_conf1_read(unsigned int seg, unsigned int bus,
>                           unsigned int devfn, int reg, int len, u32 *value)
>  {
>         unsigned long flags;
> @@ -45,7 +45,7 @@ int pci_conf1_read(unsigned int seg, unsigned int bus,
>         return 0;
>  }
>
> -int pci_conf1_write(unsigned int seg, unsigned int bus,
> +static int pci_conf1_write(unsigned int seg, unsigned int bus,
>                            unsigned int devfn, int reg, int len, u32 value)
>  {
>         unsigned long flags;

any reason to change pci_conf1_read/write to static?
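
As background for the hunk above, the PCI_CONF1_ADDRESS() encoding can be
tried out in user space with the sketch below. The function name is
hypothetical; the bit layout mirrors the macro quoted in the patch. The
register field is only 8 bits wide, which is why plain conf1 cannot reach
offsets of 256 and above.

```c
#include <assert.h>
#include <stdint.h>

#define CONF1_ENABLE 0x80000000u	/* bit 31: enable config cycle */

/*
 * Build the dword that conf1 writes to the address port before touching
 * the data port. The low two bits of the register are masked off because
 * the address always names an aligned dword.
 */
uint32_t conf1_address(unsigned int bus, unsigned int devfn, unsigned int reg)
{
	assert(bus <= 0xff && devfn <= 0xff && reg <= 0xff);

	return CONF1_ENABLE | (bus << 16) | (devfn << 8) | (reg & ~3u);
}
```

For example, bus 0, devfn 0, register 0x4c encodes as 0x8000004c.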

> diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
> index 6cff66d..b31cd6a 100644
> --- a/arch/x86/pci/fixup.c
> +++ b/arch/x86/pci/fixup.c
> @@ -215,7 +215,8 @@ static int quirk_aspm_offset[MAX_PCIEROOT << 3];
>
>  static int quirk_pcie_aspm_read(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
>  {
> -       return raw_pci_ops->read(0, bus->number, devfn, where, size, value);
> +       return raw_pci_read(pci_domain_nr(bus), bus->number,
> +                                               devfn, where, size, value);
>  }
>
>  /*
> @@ -231,7 +232,8 @@ static int quirk_pcie_aspm_write(struct pci_bus *bus, unsigned int devfn, int wh
>         if ((offset) && (where == offset))
>                 value = value & 0xfffffffc;
>
> -       return raw_pci_ops->write(0, bus->number, devfn, where, size, value);
> +       return raw_pci_write(pci_domain_nr(bus), bus->number,
> +                                               devfn, where, size, value);
>  }
>
>  static struct pci_ops quirk_pcie_aspm_ops = {
> diff --git a/arch/x86/pci/legacy.c b/arch/x86/pci/legacy.c
> index 5565d70..e041ced 100644
> --- a/arch/x86/pci/legacy.c
> +++ b/arch/x86/pci/legacy.c
> @@ -22,7 +22,7 @@ static void __devinit pcibios_fixup_peer_bridges(void)
>                 if (pci_find_bus(0, n))
>                         continue;
>                 for (devfn = 0; devfn < 256; devfn += 8) {
> -                       if (!raw_pci_ops->read(0, n, devfn, PCI_VENDOR_ID, 2, &l) &&
> +                       if (!raw_pci_read(0, n, devfn, PCI_VENDOR_ID, 2, &l) &&
>                             l != 0x0000 && l != 0xffff) {
>                                 DBG("Found device at %02x:%02x [%04x]\n", n, devfn, l);
>                                 printk(KERN_INFO "PCI: Discovered peer bus %02x\n", n);
> diff --git a/arch/x86/pci/mmconfig-shared.c b/arch/x86/pci/mmconfig-shared.c
> index 6b521d3..8d54df4 100644
> --- a/arch/x86/pci/mmconfig-shared.c
> +++ b/arch/x86/pci/mmconfig-shared.c
> @@ -28,7 +28,7 @@ static int __initdata pci_mmcfg_resources_inserted;
>  static const char __init *pci_mmcfg_e7520(void)
>  {
>         u32 win;
> -       pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);
> +       pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);
>
>         win = win & 0xf000;
>         if(win == 0x0000 || win == 0xf000)
> @@ -53,7 +53,7 @@ static const char __init *pci_mmcfg_intel_945(void)
>
>         pci_mmcfg_config_num = 1;
>
> -       pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0x48, 4, &pciexbar);
> +       pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0x48, 4, &pciexbar);
>
>         /* Enable bit */
>         if (!(pciexbar & 1))
> @@ -118,7 +118,7 @@ static int __init pci_mmcfg_check_hostbridge(void)
>         int i;
>         const char *name;
>
> -       pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0, 4, &l);
> +       pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0, 4, &l);
>         vendor = l & 0xffff;
>         device = (l >> 16) & 0xffff;
>
> diff --git a/arch/x86/pci/mmconfig_32.c b/arch/x86/pci/mmconfig_32.c
> index 7b75e65..081816a 100644
> --- a/arch/x86/pci/mmconfig_32.c
> +++ b/arch/x86/pci/mmconfig_32.c
> @@ -68,9 +68,6 @@ err:          *value = -1;
>                 return -EINVAL;
>         }
>
> -       if (reg < 256)
> -               return pci_conf1_read(seg,bus,devfn,reg,len,value);
> -
>         base = get_base_addr(seg, bus, devfn);
>         if (!base)
>                 goto err;
> @@ -104,9 +101,6 @@ static int pci_mmcfg_write(unsigned int seg, unsigned int bus,
>         if ((bus > 255) || (devfn > 255) || (reg > 4095))
>                 return -EINVAL;
>
> -       if (reg < 256)
> -               return pci_conf1_write(seg,bus,devfn,reg,len,value);
> -
>         base = get_base_addr(seg, bus, devfn);
>         if (!base)
>                 return -EINVAL;
> @@ -138,7 +132,7 @@ static struct pci_raw_ops pci_mmcfg = {
>
>  int __init pci_mmcfg_arch_init(void)
>  {
> -       printk(KERN_INFO "PCI: Using MMCONFIG\n");
> -       raw_pci_ops = &pci_mmcfg;
> +       printk(KERN_INFO "PCI: Using MMCONFIG for extended config space\n");
> +       raw_pci_ext_ops = &pci_mmcfg;
>         return 1;
>  }
> diff --git a/arch/x86/pci/mmconfig_64.c b/arch/x86/pci/mmconfig_64.c
> index c4cf318..9207fd4 100644
> --- a/arch/x86/pci/mmconfig_64.c
> +++ b/arch/x86/pci/mmconfig_64.c
> @@ -58,9 +58,6 @@ err:          *value = -1;
>                 return -EINVAL;
>         }
>
> -       if (reg < 256)
> -               return pci_conf1_read(seg,bus,devfn,reg,len,value);
> -
>         addr = pci_dev_base(seg, bus, devfn);
>         if (!addr)
>                 goto err;
> @@ -89,9 +86,6 @@ static int pci_mmcfg_write(unsigned int seg, unsigned int bus,
>         if (unlikely((bus > 255) || (devfn > 255) || (reg > 4095)))
>                 return -EINVAL;
>
> -       if (reg < 256)
> -               return pci_conf1_write(seg,bus,devfn,reg,len,value);
> -
>         addr = pci_dev_base(seg, bus, devfn);
>         if (!addr)
>                 return -EINVAL;
> @@ -150,6 +144,6 @@ int __init pci_mmcfg_arch_init(void)
>                         return 0;
>                 }
>         }
> -       raw_pci_ops = &pci_mmcfg;
> +       raw_pci_ext_ops = &pci_mmcfg;
>         return 1;
>  }
> diff --git a/arch/x86/pci/pci.h b/arch/x86/pci/pci.h
> index 36cb44c..3431518 100644
> --- a/arch/x86/pci/pci.h
> +++ b/arch/x86/pci/pci.h
> @@ -85,10 +85,17 @@ extern spinlock_t pci_config_lock;
>  extern int (*pcibios_enable_irq)(struct pci_dev *dev);
>  extern void (*pcibios_disable_irq)(struct pci_dev *dev);
>
> -extern int pci_conf1_write(unsigned int seg, unsigned int bus,
> -                          unsigned int devfn, int reg, int len, u32 value);
> -extern int pci_conf1_read(unsigned int seg, unsigned int bus,
> -                         unsigned int devfn, int reg, int len, u32 *value);
> +struct pci_raw_ops {
> +       int (*read)(unsigned int domain, unsigned int bus, unsigned int devfn,
> +                                               int reg, int len, u32 *val);
> +       int (*write)(unsigned int domain, unsigned int bus, unsigned int devfn,
> +                                               int reg, int len, u32 val);
> +};
> +
> +extern struct pci_raw_ops *raw_pci_ops;
> +extern struct pci_raw_ops *raw_pci_ext_ops;
> +
> +extern struct pci_raw_ops pci_direct_conf1;
>
>  extern int pci_direct_probe(void);
>  extern void pci_direct_init(int type);
> diff --git a/arch/x86/pci/visws.c b/arch/x86/pci/visws.c
> index 8ecb1c7..c2df4e9 100644
> --- a/arch/x86/pci/visws.c
> +++ b/arch/x86/pci/visws.c
> @@ -13,9 +13,6 @@
>
>  #include "pci.h"
>
> -
> -extern struct pci_raw_ops pci_direct_conf1;
> -
>  static int pci_visws_enable_irq(struct pci_dev *dev) { return 0; }
>  static void pci_visws_disable_irq(struct pci_dev *dev) { }
>
> diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
> index e3a673a..f190db9 100644
> --- a/drivers/acpi/osl.c
> +++ b/drivers/acpi/osl.c
> @@ -139,15 +139,6 @@ acpi_status __init acpi_os_initialize(void)
>
>  acpi_status acpi_os_initialize1(void)
>  {
> -       /*
> -        * Initialize PCI configuration space access, as we'll need to access
> -        * it while walking the namespace (bus 0 and root bridges w/ _BBNs).
> -        */
> -       if (!raw_pci_ops) {
> -               printk(KERN_ERR PREFIX
> -                      "Access to PCI configuration space unavailable\n");
> -               return AE_NULL_ENTRY;
> -       }
>         kacpid_wq = create_singlethread_workqueue("kacpid");
>         kacpi_notify_wq = create_singlethread_workqueue("kacpi_notify");
>         BUG_ON(!kacpid_wq);
> @@ -498,11 +489,9 @@ acpi_os_read_pci_configuration(struct acpi_pci_id * pci_id, u32 reg,
>                 return AE_ERROR;
>         }
>
> -       BUG_ON(!raw_pci_ops);
> -
> -       result = raw_pci_ops->read(pci_id->segment, pci_id->bus,
> -                                  PCI_DEVFN(pci_id->device, pci_id->function),
> -                                  reg, size, value);
> +       result = raw_pci_read(pci_id->segment, pci_id->bus,
> +                               PCI_DEVFN(pci_id->device, pci_id->function),
> +                               reg, size, value);
>
>         return (result ? AE_ERROR : AE_OK);
>  }
> @@ -529,11 +518,9 @@ acpi_os_write_pci_configuration(struct acpi_pci_id * pci_id, u32 reg,
>                 return AE_ERROR;
>         }
>
> -       BUG_ON(!raw_pci_ops);
> -
> -       result = raw_pci_ops->write(pci_id->segment, pci_id->bus,
> -                                   PCI_DEVFN(pci_id->device, pci_id->function),
> -                                   reg, size, value);
> +       result = raw_pci_write(pci_id->segment, pci_id->bus,
> +                               PCI_DEVFN(pci_id->device, pci_id->function),
> +                               reg, size, value);
>
>         return (result ? AE_ERROR : AE_OK);
>  }
> diff --git a/drivers/ata/Kconfig b/drivers/ata/Kconfig
> index ba63619..1e71dc0 100644
> --- a/drivers/ata/Kconfig
> +++ b/drivers/ata/Kconfig
> @@ -40,6 +40,9 @@ config ATA_ACPI
>           You can disable this at kernel boot time by using the
>           option libata.noacpi=1
>
> +config ATA_RAM
> +       tristate "ATA RAM driver"
> +

related?

YH


* Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
  2008-02-03  7:30                                                               ` Yinghai Lu
@ 2008-02-07 15:54                                                                 ` Tony Camuso
  2008-02-07 16:28                                                                   ` Arjan van de Ven
  2008-02-09 12:41                                                                   ` Matthew Wilcox
  0 siblings, 2 replies; 125+ messages in thread
From: Tony Camuso @ 2008-02-07 15:54 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Matthew Wilcox, Greg KH, Grant Grundler, Loic Prylli,
	Adrian Bunk, Linus Torvalds, Arjan van de Ven,
	Benjamin Herrenschmidt, Ivan Kokshaysky, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Martin Mares

Matthew,

Perhaps I missed it, but did you address Yinghai's concerns?

Yinghai Lu wrote:
> On Jan 28, 2008 7:03 PM, Matthew Wilcox <matthew@wil.cx> wrote:
>>
>> -int pci_conf1_write(unsigned int seg, unsigned int bus,
>> +static int pci_conf1_write(unsigned int seg, unsigned int bus,
>>                            unsigned int devfn, int reg, int len, u32 value)
> 
> any reason to change pci_conf1_read/write to static?
> 

>>
>> +config ATA_RAM
>> +       tristate "ATA RAM driver"
>> +
> 
> related?
> 
> YH



* Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
  2008-02-07 15:54                                                                 ` Tony Camuso
@ 2008-02-07 16:28                                                                   ` Arjan van de Ven
  2008-02-07 16:36                                                                     ` Tony Camuso
  2008-02-09 12:41                                                                   ` Matthew Wilcox
  1 sibling, 1 reply; 125+ messages in thread
From: Arjan van de Ven @ 2008-02-07 16:28 UTC (permalink / raw)
  To: tcamuso
  Cc: Yinghai Lu, Matthew Wilcox, Greg KH, Grant Grundler, Loic Prylli,
	Adrian Bunk, Linus Torvalds, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

On Thu, 07 Feb 2008 10:54:05 -0500
Tony Camuso <tcamuso@redhat.com> wrote:

> Matthew,
> 
> Perhaps I missed it, but did you address Yinghai's concerns?
> 
> Yinghai Lu wrote:
> > On Jan 28, 2008 7:03 PM, Matthew Wilcox <matthew@wil.cx> wrote:
> >>
> >> -int pci_conf1_write(unsigned int seg, unsigned int bus,
> >> +static int pci_conf1_write(unsigned int seg, unsigned int bus,
> >>                            unsigned int devfn, int reg, int len,
> >> u32 value)
> > 
> > any reason to change pci_conf1_read/write to static?
> > 
> 

nothing should use these directly. So static is the right answer ;)
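
A minimal userspace sketch of that point, not kernel code: the accessors
get internal linkage and other files can only reach them through the one
exported ops structure. All demo_* names and the fake 256-byte config
space are assumptions made for illustration; they stand in for
pci_conf1_read(), pci_conf1_write() and pci_direct_conf1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as struct pci_raw_ops in the patch. */
struct demo_raw_ops {
	int (*read)(unsigned int seg, unsigned int bus, unsigned int devfn,
		    int reg, int len, uint32_t *val);
	int (*write)(unsigned int seg, unsigned int bus, unsigned int devfn,
		     int reg, int len, uint32_t val);
};

/* Fake 256-byte config space of a single device (demo assumption). */
static uint8_t demo_cfg[256];

/* Reject accesses that are not 1, 2 or 4 bytes inside the 256-byte space. */
static int demo_bad_access(int reg, int len)
{
	return reg < 0 || (len != 1 && len != 2 && len != 4) || reg + len > 256;
}

/* static: invisible to other files, like pci_conf1_read() after the patch. */
static int demo_conf1_read(unsigned int seg, unsigned int bus,
			   unsigned int devfn, int reg, int len, uint32_t *val)
{
	int i;

	(void)seg; (void)bus; (void)devfn;	/* one fake device only */
	assert(val != NULL);
	if (demo_bad_access(reg, len)) {
		*val = 0xffffffff;
		return -1;
	}
	*val = 0;
	for (i = 0; i < len; i++)	/* assemble the value little-endian */
		*val |= (uint32_t)demo_cfg[reg + i] << (8 * i);
	return 0;
}

static int demo_conf1_write(unsigned int seg, unsigned int bus,
			    unsigned int devfn, int reg, int len, uint32_t val)
{
	int i;

	(void)seg; (void)bus; (void)devfn;
	if (demo_bad_access(reg, len))
		return -1;
	for (i = 0; i < len; i++)
		demo_cfg[reg + i] = (uint8_t)(val >> (8 * i));
	return 0;
}

/* The only symbol with external linkage, like pci_direct_conf1. */
struct demo_raw_ops demo_direct_conf1 = {
	.read	= demo_conf1_read,
	.write	= demo_conf1_write,
};
```

A caller in another file would then write something like
demo_direct_conf1.read(0, 0, 0, 0x48, 4, &v), which is the same switch
the patch makes in mmconfig-shared.c from pci_conf1_read(...) to
pci_direct_conf1.read(...).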


* Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
  2008-02-07 16:28                                                                   ` Arjan van de Ven
@ 2008-02-07 16:36                                                                     ` Tony Camuso
  2008-02-08  2:28                                                                       ` Grant Grundler
  0 siblings, 1 reply; 125+ messages in thread
From: Tony Camuso @ 2008-02-07 16:36 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Yinghai Lu, Matthew Wilcox, Greg KH, Grant Grundler, Loic Prylli,
	Adrian Bunk, Linus Torvalds, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

Arjan van de Ven wrote:
> On Thu, 07 Feb 2008 10:54:05 -0500
> Tony Camuso <tcamuso@redhat.com> wrote:
> 
>> Matthew,
>>
>> Perhaps I missed it, but did you address Yinghai's concerns?
>>
>> Yinghai Lu wrote:
>>> On Jan 28, 2008 7:03 PM, Matthew Wilcox <matthew@wil.cx> wrote:
>>>> -int pci_conf1_write(unsigned int seg, unsigned int bus,
>>>> +static int pci_conf1_write(unsigned int seg, unsigned int bus,
>>>>                            unsigned int devfn, int reg, int len,
>>>> u32 value)
>>> any reason to change pci_conf1_read/write to static?
>>>
> 
> nothing should use these directly. So static is the right answer ;)

Agreed. Thanks, Arjan.

Matthew,
What about the ATA_RAM addition to Kconfig? Was it accidental,
or intended? If intended, how is it related?


* Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
  2008-02-07 16:36                                                                     ` Tony Camuso
@ 2008-02-08  2:28                                                                       ` Grant Grundler
  0 siblings, 0 replies; 125+ messages in thread
From: Grant Grundler @ 2008-02-08  2:28 UTC (permalink / raw)
  To: Tony Camuso
  Cc: Arjan van de Ven, Yinghai Lu, Matthew Wilcox, Greg KH,
	Grant Grundler, Loic Prylli, Adrian Bunk, Linus Torvalds,
	Benjamin Herrenschmidt, Ivan Kokshaysky, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Martin Mares

On Thu, Feb 07, 2008 at 11:36:18AM -0500, Tony Camuso wrote:
> Arjan van de Ven wrote:
>> On Thu, 07 Feb 2008 10:54:05 -0500
>> Tony Camuso <tcamuso@redhat.com> wrote:
>>> Matthew,
>>>
>>> Perhaps I missed it, but did you address Yinghai's concerns?
>>>
>>> Yinghai Lu wrote:
>>>> On Jan 28, 2008 7:03 PM, Matthew Wilcox <matthew@wil.cx> wrote:
>>>>> -int pci_conf1_write(unsigned int seg, unsigned int bus,
>>>>> +static int pci_conf1_write(unsigned int seg, unsigned int bus,
>>>>>                            unsigned int devfn, int reg, int len,
>>>>> u32 value)
>>>> any reason to change pci_conf1_read/write to static?
>>>>
>> nothing should use these directly. So static is the right answer ;)
>
> Agreed. Thanks, Arjan.
>
> Matthew,
> What about the ATA_RAM addition to Kconfig? Was it accidental,
> or intended? If intended, how is it related?

AFAICT, it looks accidental. I can't see how it's related.
He should be back online next week and can answer for himself.

hth,
grant


* Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
  2008-02-07 15:54                                                                 ` Tony Camuso
  2008-02-07 16:28                                                                   ` Arjan van de Ven
@ 2008-02-09 12:41                                                                   ` Matthew Wilcox
  2008-02-10  6:25                                                                     ` Yinghai Lu
  1 sibling, 1 reply; 125+ messages in thread
From: Matthew Wilcox @ 2008-02-09 12:41 UTC (permalink / raw)
  To: Tony Camuso
  Cc: Yinghai Lu, Greg KH, Grant Grundler, Loic Prylli, Adrian Bunk,
	Linus Torvalds, Arjan van de Ven, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

On Thu, Feb 07, 2008 at 10:54:05AM -0500, Tony Camuso wrote:
> Matthew,
> 
> Perhaps I missed it, but did you address Yinghai's concerns?

No, I was on holiday.

> Yinghai Lu wrote:
> >On Jan 28, 2008 7:03 PM, Matthew Wilcox <matthew@wil.cx> wrote:
> >>
> >>-int pci_conf1_write(unsigned int seg, unsigned int bus,
> >>+static int pci_conf1_write(unsigned int seg, unsigned int bus,
> >>                           unsigned int devfn, int reg, int len, u32 
> >>                           value)
> >
> >any reason to change pci_conf1_read/write to static?

Yes -- it no longer needs to be called from outside this file.

> >>+config ATA_RAM
> >>+       tristate "ATA RAM driver"
> >>+
> >
> >related?

No.  An unrelated patch that I didn't trim out.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."


* Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
  2008-02-09 12:41                                                                   ` Matthew Wilcox
@ 2008-02-10  6:25                                                                     ` Yinghai Lu
  2008-02-10  7:21                                                                       ` Greg KH
  0 siblings, 1 reply; 125+ messages in thread
From: Yinghai Lu @ 2008-02-10  6:25 UTC (permalink / raw)
  To: Matthew Wilcox, Andrew Morton, Ingo Molnar
  Cc: Tony Camuso, Greg KH, Grant Grundler, Loic Prylli, Adrian Bunk,
	Linus Torvalds, Arjan van de Ven, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

On Feb 9, 2008 4:41 AM, Matthew Wilcox <matthew@wil.cx> wrote:
> On Thu, Feb 07, 2008 at 10:54:05AM -0500, Tony Camuso wrote:
> > Matthew,
> >
> > Perhaps I missed it, but did you address Yinghai's concerns?
>
> No, I was on holiday.
>
> > Yinghai Lu wrote:
> > >On Jan 28, 2008 7:03 PM, Matthew Wilcox <matthew@wil.cx> wrote:
> > >>
> > >>-int pci_conf1_write(unsigned int seg, unsigned int bus,
> > >>+static int pci_conf1_write(unsigned int seg, unsigned int bus,
> > >>                           unsigned int devfn, int reg, int len, u32
> > >>                           value)
> > >
> > >any reason to change pci_conf1_read/write to static?
>
> Yes -- it no longer needs to be called from outside this file.
>
> > >>+config ATA_RAM
> > >>+       tristate "ATA RAM driver"
> > >>+
> > >
> > >related?
>

looks good. it should get into -mm or x86/mm for some testing

YH


* Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
  2008-02-10  6:25                                                                     ` Yinghai Lu
@ 2008-02-10  7:21                                                                       ` Greg KH
  2008-02-10 14:51                                                                         ` Matthew Wilcox
  0 siblings, 1 reply; 125+ messages in thread
From: Greg KH @ 2008-02-10  7:21 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Matthew Wilcox, Andrew Morton, Ingo Molnar, Tony Camuso,
	Grant Grundler, Loic Prylli, Adrian Bunk, Linus Torvalds,
	Arjan van de Ven, Benjamin Herrenschmidt, Ivan Kokshaysky,
	Greg KH, linux-kernel, Jeff Garzik, linux-pci, Martin Mares

On Sat, Feb 09, 2008 at 10:25:23PM -0800, Yinghai Lu wrote:
> On Feb 9, 2008 4:41 AM, Matthew Wilcox <matthew@wil.cx> wrote:
> > On Thu, Feb 07, 2008 at 10:54:05AM -0500, Tony Camuso wrote:
> > > Matthew,
> > >
> > > Perhaps I missed it, but did you address Yinghai's concerns?
> >
> > No, I was on holiday.
> >
> > > Yinghai Lu wrote:
> > > >On Jan 28, 2008 7:03 PM, Matthew Wilcox <matthew@wil.cx> wrote:
> > > >>
> > > >>-int pci_conf1_write(unsigned int seg, unsigned int bus,
> > > >>+static int pci_conf1_write(unsigned int seg, unsigned int bus,
> > > >>                           unsigned int devfn, int reg, int len, u32
> > > >>                           value)
> > > >
> > > >any reason to change pci_conf1_read/write to static?
> >
> > Yes -- it no longer needs to be called from outside this file.
> >
> > > >>+config ATA_RAM
> > > >>+       tristate "ATA RAM driver"
> > > >>+
> > > >
> > > >related?
> >
> 
> looks good. it should get into -mm or x86/mm for some testing

Can I get a revised version of this, without the incorrect hunk?

thanks,

greg k-h


* Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
  2008-02-10  7:21                                                                       ` Greg KH
@ 2008-02-10 14:51                                                                         ` Matthew Wilcox
  2008-02-10 19:13                                                                           ` Grant Grundler
  2008-02-10 20:16                                                                           ` Yinghai Lu
  0 siblings, 2 replies; 125+ messages in thread
From: Matthew Wilcox @ 2008-02-10 14:51 UTC (permalink / raw)
  To: Greg KH
  Cc: Yinghai Lu, Andrew Morton, Ingo Molnar, Tony Camuso,
	Grant Grundler, Loic Prylli, Adrian Bunk, Linus Torvalds,
	Arjan van de Ven, Benjamin Herrenschmidt, Ivan Kokshaysky,
	Greg KH, linux-kernel, Jeff Garzik, linux-pci, Martin Mares

On Sat, Feb 09, 2008 at 11:21:16PM -0800, Greg KH wrote:
> Can I get a revised version of this, without the incorrect hunk?

Sure.  I've even rebased it against current HEAD.  Damn whitespace
cleanup introducing unnecessary conflicts ....

I suggest Ivan's patch be merged ASAP as it actually fixes bugs.
This patch is just cleanup (and takes care of some future concerns).

From ad4c3f135cda6f5210735231d30ef8e9dbd58c7c Mon Sep 17 00:00:00 2001
From: Matthew Wilcox <matthew@wil.cx>
Date: Sun, 10 Feb 2008 09:45:28 -0500
Subject: [PATCH] Change pci_raw_ops to pci_raw_read/write

We want to allow different implementations of pci_raw_ops for standard
and extended config space on x86.  Rather than clutter generic code with
knowledge of this, we make pci_raw_ops private to x86 and use it to
implement the new raw interface -- raw_pci_read() and raw_pci_write().

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
---
 arch/ia64/pci/pci.c               |   25 ++++++++-----------------
 arch/ia64/sn/pci/tioce_provider.c |   16 ++++++++--------
 arch/x86/kernel/quirks.c          |    2 +-
 arch/x86/pci/common.c             |   25 +++++++++++++++++++++++--
 arch/x86/pci/direct.c             |    4 ++--
 arch/x86/pci/fixup.c              |    6 ++++--
 arch/x86/pci/legacy.c             |    2 +-
 arch/x86/pci/mmconfig-shared.c    |    6 +++---
 arch/x86/pci/mmconfig_32.c        |   10 ++--------
 arch/x86/pci/mmconfig_64.c        |    8 +-------
 arch/x86/pci/pci.h                |   15 +++++++++++----
 arch/x86/pci/visws.c              |    3 ---
 drivers/acpi/osl.c                |   25 ++++++-------------------
 include/linux/pci.h               |   16 ++++++++--------
 14 files changed, 78 insertions(+), 85 deletions(-)

diff --git a/arch/ia64/pci/pci.c b/arch/ia64/pci/pci.c
index 488e48a..8fd7e82 100644
--- a/arch/ia64/pci/pci.c
+++ b/arch/ia64/pci/pci.c
@@ -43,8 +43,7 @@
 #define PCI_SAL_EXT_ADDRESS(seg, bus, devfn, reg)	\
 	(((u64) seg << 28) | (bus << 20) | (devfn << 12) | (reg))
 
-static int
-pci_sal_read (unsigned int seg, unsigned int bus, unsigned int devfn,
+int raw_pci_read(unsigned int seg, unsigned int bus, unsigned int devfn,
 	      int reg, int len, u32 *value)
 {
 	u64 addr, data = 0;
@@ -68,8 +67,7 @@ pci_sal_read (unsigned int seg, unsigned int bus, unsigned int devfn,
 	return 0;
 }
 
-static int
-pci_sal_write (unsigned int seg, unsigned int bus, unsigned int devfn,
+int raw_pci_write(unsigned int seg, unsigned int bus, unsigned int devfn,
 	       int reg, int len, u32 value)
 {
 	u64 addr;
@@ -91,24 +89,17 @@ pci_sal_write (unsigned int seg, unsigned int bus, unsigned int devfn,
 	return 0;
 }
 
-static struct pci_raw_ops pci_sal_ops = {
-	.read =		pci_sal_read,
-	.write =	pci_sal_write
-};
-
-struct pci_raw_ops *raw_pci_ops = &pci_sal_ops;
-
-static int
-pci_read (struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
+static int pci_read(struct pci_bus *bus, unsigned int devfn, int where,
+							int size, u32 *value)
 {
-	return raw_pci_ops->read(pci_domain_nr(bus), bus->number,
+	return raw_pci_read(pci_domain_nr(bus), bus->number,
 				 devfn, where, size, value);
 }
 
-static int
-pci_write (struct pci_bus *bus, unsigned int devfn, int where, int size, u32 value)
+static int pci_write(struct pci_bus *bus, unsigned int devfn, int where,
+							int size, u32 value)
 {
-	return raw_pci_ops->write(pci_domain_nr(bus), bus->number,
+	return raw_pci_write(pci_domain_nr(bus), bus->number,
 				  devfn, where, size, value);
 }
 
diff --git a/arch/ia64/sn/pci/tioce_provider.c b/arch/ia64/sn/pci/tioce_provider.c
index e1a3e19..999f14f 100644
--- a/arch/ia64/sn/pci/tioce_provider.c
+++ b/arch/ia64/sn/pci/tioce_provider.c
@@ -752,13 +752,13 @@ tioce_kern_init(struct tioce_common *tioce_common)
 	 * Determine the secondary bus number of the port2 logical PPB.
 	 * This is used to decide whether a given pci device resides on
 	 * port1 or port2.  Note:  We don't have enough plumbing set up
-	 * here to use pci_read_config_xxx() so use the raw_pci_ops vector.
+	 * here to use pci_read_config_xxx() so use raw_pci_read().
 	 */
 
 	seg = tioce_common->ce_pcibus.bs_persist_segment;
 	bus = tioce_common->ce_pcibus.bs_persist_busnum;
 
-	raw_pci_ops->read(seg, bus, PCI_DEVFN(2, 0), PCI_SECONDARY_BUS, 1,&tmp);
+	raw_pci_read(seg, bus, PCI_DEVFN(2, 0), PCI_SECONDARY_BUS, 1,&tmp);
 	tioce_kern->ce_port1_secondary = (u8) tmp;
 
 	/*
@@ -799,11 +799,11 @@ tioce_kern_init(struct tioce_common *tioce_common)
 
 		/* mem base/limit */
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_MEMORY_BASE, 2, &tmp);
 		base = (u64)tmp << 16;
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_MEMORY_LIMIT, 2, &tmp);
 		limit = (u64)tmp << 16;
 		limit |= 0xfffffUL;
@@ -817,21 +817,21 @@ tioce_kern_init(struct tioce_common *tioce_common)
 		 * attributes.
 		 */
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_PREF_MEMORY_BASE, 2, &tmp);
 		base = ((u64)tmp & PCI_PREF_RANGE_MASK) << 16;
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_PREF_BASE_UPPER32, 4, &tmp);
 		base |= (u64)tmp << 32;
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_PREF_MEMORY_LIMIT, 2, &tmp);
 
 		limit = ((u64)tmp & PCI_PREF_RANGE_MASK) << 16;
 		limit |= 0xfffffUL;
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_PREF_LIMIT_UPPER32, 4, &tmp);
 		limit |= (u64)tmp << 32;
 
diff --git a/arch/x86/kernel/quirks.c b/arch/x86/kernel/quirks.c
index 6ba33ca..1941482 100644
--- a/arch/x86/kernel/quirks.c
+++ b/arch/x86/kernel/quirks.c
@@ -27,7 +27,7 @@ static void __devinit quirk_intel_irqbalance(struct pci_dev *dev)
 	pci_write_config_byte(dev, 0xf4, config|0x2);
 
 	/* read xTPR register */
-	raw_pci_ops->read(0, 0, 0x40, 0x4c, 2, &word);
+	raw_pci_read(0, 0, 0x40, 0x4c, 2, &word);
 
 	if (!(word & (1 << 13))) {
 		dev_info(&dev->dev, "Intel E7520/7320/7525 detected; "
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 52deabc..b7c67a1 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -26,16 +26,37 @@ int pcibios_last_bus = -1;
 unsigned long pirq_table_addr;
 struct pci_bus *pci_root_bus;
 struct pci_raw_ops *raw_pci_ops;
+struct pci_raw_ops *raw_pci_ext_ops;
+
+int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
+						int reg, int len, u32 *val)
+{
+	if (reg < 256 && raw_pci_ops)
+		return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
+	if (raw_pci_ext_ops)
+		return raw_pci_ext_ops->read(domain, bus, devfn, reg, len, val);
+	return -EINVAL;
+}
+
+int raw_pci_write(unsigned int domain, unsigned int bus, unsigned int devfn,
+						int reg, int len, u32 val)
+{
+	if (reg < 256 && raw_pci_ops)
+		return raw_pci_ops->write(domain, bus, devfn, reg, len, val);
+	if (raw_pci_ext_ops)
+		return raw_pci_ext_ops->write(domain, bus, devfn, reg, len, val);
+	return -EINVAL;
+}
 
 static int pci_read(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
 {
-	return raw_pci_ops->read(pci_domain_nr(bus), bus->number,
+	return raw_pci_read(pci_domain_nr(bus), bus->number,
 				 devfn, where, size, value);
 }
 
 static int pci_write(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 value)
 {
-	return raw_pci_ops->write(pci_domain_nr(bus), bus->number,
+	return raw_pci_write(pci_domain_nr(bus), bus->number,
 				  devfn, where, size, value);
 }
 
diff --git a/arch/x86/pci/direct.c b/arch/x86/pci/direct.c
index 431c9a5..42f3e4c 100644
--- a/arch/x86/pci/direct.c
+++ b/arch/x86/pci/direct.c
@@ -14,7 +14,7 @@
 #define PCI_CONF1_ADDRESS(bus, devfn, reg) \
 	(0x80000000 | (bus << 16) | (devfn << 8) | (reg & ~3))
 
-int pci_conf1_read(unsigned int seg, unsigned int bus,
+static int pci_conf1_read(unsigned int seg, unsigned int bus,
 			  unsigned int devfn, int reg, int len, u32 *value)
 {
 	unsigned long flags;
@@ -45,7 +45,7 @@ int pci_conf1_read(unsigned int seg, unsigned int bus,
 	return 0;
 }
 
-int pci_conf1_write(unsigned int seg, unsigned int bus,
+static int pci_conf1_write(unsigned int seg, unsigned int bus,
 			   unsigned int devfn, int reg, int len, u32 value)
 {
 	unsigned long flags;
diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index 74d30ff..a5ef5f5 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -215,7 +215,8 @@ static int quirk_aspm_offset[MAX_PCIEROOT << 3];
 
 static int quirk_pcie_aspm_read(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
 {
-	return raw_pci_ops->read(0, bus->number, devfn, where, size, value);
+	return raw_pci_read(pci_domain_nr(bus), bus->number,
+						devfn, where, size, value);
 }
 
 /*
@@ -231,7 +232,8 @@ static int quirk_pcie_aspm_write(struct pci_bus *bus, unsigned int devfn, int wh
 	if ((offset) && (where == offset))
 		value = value & 0xfffffffc;
 
-	return raw_pci_ops->write(0, bus->number, devfn, where, size, value);
+	return raw_pci_write(pci_domain_nr(bus), bus->number,
+						devfn, where, size, value);
 }
 
 static struct pci_ops quirk_pcie_aspm_ops = {
diff --git a/arch/x86/pci/legacy.c b/arch/x86/pci/legacy.c
index 5565d70..e041ced 100644
--- a/arch/x86/pci/legacy.c
+++ b/arch/x86/pci/legacy.c
@@ -22,7 +22,7 @@ static void __devinit pcibios_fixup_peer_bridges(void)
 		if (pci_find_bus(0, n))
 			continue;
 		for (devfn = 0; devfn < 256; devfn += 8) {
-			if (!raw_pci_ops->read(0, n, devfn, PCI_VENDOR_ID, 2, &l) &&
+			if (!raw_pci_read(0, n, devfn, PCI_VENDOR_ID, 2, &l) &&
 			    l != 0x0000 && l != 0xffff) {
 				DBG("Found device at %02x:%02x [%04x]\n", n, devfn, l);
 				printk(KERN_INFO "PCI: Discovered peer bus %02x\n", n);
diff --git a/arch/x86/pci/mmconfig-shared.c b/arch/x86/pci/mmconfig-shared.c
index 6b521d3..8d54df4 100644
--- a/arch/x86/pci/mmconfig-shared.c
+++ b/arch/x86/pci/mmconfig-shared.c
@@ -28,7 +28,7 @@ static int __initdata pci_mmcfg_resources_inserted;
 static const char __init *pci_mmcfg_e7520(void)
 {
 	u32 win;
-	pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);
+	pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);
 
 	win = win & 0xf000;
 	if(win == 0x0000 || win == 0xf000)
@@ -53,7 +53,7 @@ static const char __init *pci_mmcfg_intel_945(void)
 
 	pci_mmcfg_config_num = 1;
 
-	pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0x48, 4, &pciexbar);
+	pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0x48, 4, &pciexbar);
 
 	/* Enable bit */
 	if (!(pciexbar & 1))
@@ -118,7 +118,7 @@ static int __init pci_mmcfg_check_hostbridge(void)
 	int i;
 	const char *name;
 
-	pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0, 4, &l);
+	pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0, 4, &l);
 	vendor = l & 0xffff;
 	device = (l >> 16) & 0xffff;
 
diff --git a/arch/x86/pci/mmconfig_32.c b/arch/x86/pci/mmconfig_32.c
index 7b75e65..081816a 100644
--- a/arch/x86/pci/mmconfig_32.c
+++ b/arch/x86/pci/mmconfig_32.c
@@ -68,9 +68,6 @@ err:		*value = -1;
 		return -EINVAL;
 	}
 
-	if (reg < 256)
-		return pci_conf1_read(seg,bus,devfn,reg,len,value);
-
 	base = get_base_addr(seg, bus, devfn);
 	if (!base)
 		goto err;
@@ -104,9 +101,6 @@ static int pci_mmcfg_write(unsigned int seg, unsigned int bus,
 	if ((bus > 255) || (devfn > 255) || (reg > 4095))
 		return -EINVAL;
 
-	if (reg < 256)
-		return pci_conf1_write(seg,bus,devfn,reg,len,value);
-
 	base = get_base_addr(seg, bus, devfn);
 	if (!base)
 		return -EINVAL;
@@ -138,7 +132,7 @@ static struct pci_raw_ops pci_mmcfg = {
 
 int __init pci_mmcfg_arch_init(void)
 {
-	printk(KERN_INFO "PCI: Using MMCONFIG\n");
-	raw_pci_ops = &pci_mmcfg;
+	printk(KERN_INFO "PCI: Using MMCONFIG for extended config space\n");
+	raw_pci_ext_ops = &pci_mmcfg;
 	return 1;
 }
diff --git a/arch/x86/pci/mmconfig_64.c b/arch/x86/pci/mmconfig_64.c
index c4cf318..9207fd4 100644
--- a/arch/x86/pci/mmconfig_64.c
+++ b/arch/x86/pci/mmconfig_64.c
@@ -58,9 +58,6 @@ err:		*value = -1;
 		return -EINVAL;
 	}
 
-	if (reg < 256)
-		return pci_conf1_read(seg,bus,devfn,reg,len,value);
-
 	addr = pci_dev_base(seg, bus, devfn);
 	if (!addr)
 		goto err;
@@ -89,9 +86,6 @@ static int pci_mmcfg_write(unsigned int seg, unsigned int bus,
 	if (unlikely((bus > 255) || (devfn > 255) || (reg > 4095)))
 		return -EINVAL;
 
-	if (reg < 256)
-		return pci_conf1_write(seg,bus,devfn,reg,len,value);
-
 	addr = pci_dev_base(seg, bus, devfn);
 	if (!addr)
 		return -EINVAL;
@@ -150,6 +144,6 @@ int __init pci_mmcfg_arch_init(void)
 			return 0;
 		}
 	}
-	raw_pci_ops = &pci_mmcfg;
+	raw_pci_ext_ops = &pci_mmcfg;
 	return 1;
 }
diff --git a/arch/x86/pci/pci.h b/arch/x86/pci/pci.h
index 36cb44c..3431518 100644
--- a/arch/x86/pci/pci.h
+++ b/arch/x86/pci/pci.h
@@ -85,10 +85,17 @@ extern spinlock_t pci_config_lock;
 extern int (*pcibios_enable_irq)(struct pci_dev *dev);
 extern void (*pcibios_disable_irq)(struct pci_dev *dev);
 
-extern int pci_conf1_write(unsigned int seg, unsigned int bus,
-			   unsigned int devfn, int reg, int len, u32 value);
-extern int pci_conf1_read(unsigned int seg, unsigned int bus,
-			  unsigned int devfn, int reg, int len, u32 *value);
+struct pci_raw_ops {
+	int (*read)(unsigned int domain, unsigned int bus, unsigned int devfn,
+						int reg, int len, u32 *val);
+	int (*write)(unsigned int domain, unsigned int bus, unsigned int devfn,
+						int reg, int len, u32 val);
+};
+
+extern struct pci_raw_ops *raw_pci_ops;
+extern struct pci_raw_ops *raw_pci_ext_ops;
+
+extern struct pci_raw_ops pci_direct_conf1;
 
 extern int pci_direct_probe(void);
 extern void pci_direct_init(int type);
diff --git a/arch/x86/pci/visws.c b/arch/x86/pci/visws.c
index 8ecb1c7..c2df4e9 100644
--- a/arch/x86/pci/visws.c
+++ b/arch/x86/pci/visws.c
@@ -13,9 +13,6 @@
 
 #include "pci.h"
 
-
-extern struct pci_raw_ops pci_direct_conf1;
-
 static int pci_visws_enable_irq(struct pci_dev *dev) { return 0; }
 static void pci_visws_disable_irq(struct pci_dev *dev) { }
 
diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index a14501c..34b3386 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -200,15 +200,6 @@ acpi_status __init acpi_os_initialize(void)
 
 acpi_status acpi_os_initialize1(void)
 {
-	/*
-	 * Initialize PCI configuration space access, as we'll need to access
-	 * it while walking the namespace (bus 0 and root bridges w/ _BBNs).
-	 */
-	if (!raw_pci_ops) {
-		printk(KERN_ERR PREFIX
-		       "Access to PCI configuration space unavailable\n");
-		return AE_NULL_ENTRY;
-	}
 	kacpid_wq = create_singlethread_workqueue("kacpid");
 	kacpi_notify_wq = create_singlethread_workqueue("kacpi_notify");
 	BUG_ON(!kacpid_wq);
@@ -653,11 +644,9 @@ acpi_os_read_pci_configuration(struct acpi_pci_id * pci_id, u32 reg,
 		return AE_ERROR;
 	}
 
-	BUG_ON(!raw_pci_ops);
-
-	result = raw_pci_ops->read(pci_id->segment, pci_id->bus,
-				   PCI_DEVFN(pci_id->device, pci_id->function),
-				   reg, size, value);
+	result = raw_pci_read(pci_id->segment, pci_id->bus,
+				PCI_DEVFN(pci_id->device, pci_id->function),
+				reg, size, value);
 
 	return (result ? AE_ERROR : AE_OK);
 }
@@ -682,11 +671,9 @@ acpi_os_write_pci_configuration(struct acpi_pci_id * pci_id, u32 reg,
 		return AE_ERROR;
 	}
 
-	BUG_ON(!raw_pci_ops);
-
-	result = raw_pci_ops->write(pci_id->segment, pci_id->bus,
-				    PCI_DEVFN(pci_id->device, pci_id->function),
-				    reg, size, value);
+	result = raw_pci_write(pci_id->segment, pci_id->bus,
+				PCI_DEVFN(pci_id->device, pci_id->function),
+				reg, size, value);
 
 	return (result ? AE_ERROR : AE_OK);
 }
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 7215d3b..87195b6 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -301,14 +301,14 @@ struct pci_ops {
 	int (*write)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 val);
 };
 
-struct pci_raw_ops {
-	int (*read)(unsigned int domain, unsigned int bus, unsigned int devfn,
-		    int reg, int len, u32 *val);
-	int (*write)(unsigned int domain, unsigned int bus, unsigned int devfn,
-		     int reg, int len, u32 val);
-};
-
-extern struct pci_raw_ops *raw_pci_ops;
+/*
+ * ACPI needs to be able to access PCI config space before we've done a
+ * PCI bus scan and created pci_bus structures.
+ */
+extern int raw_pci_read(unsigned int domain, unsigned int bus,
+			unsigned int devfn, int reg, int len, u32 *val);
+extern int raw_pci_write(unsigned int domain, unsigned int bus,
+			unsigned int devfn, int reg, int len, u32 val);
 
 struct pci_bus_region {
 	resource_size_t start;
-- 
1.5.2.5
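
For readers who want to poke at the dispatch outside the kernel, here is
a rough userspace sketch of the branch structure that raw_pci_read() has
in the arch/x86/pci/common.c hunk above. It is an illustration under
assumptions, not the kernel code: every demo_* name is hypothetical, the
ops structure is cut down to read only (write is symmetric), and the two
fake backends just tag the result with 1 or 2 so a caller can see which
one ran.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Read-only cut-down of struct pci_raw_ops (write is symmetric). */
struct demo_raw_ops {
	int (*read)(unsigned int domain, unsigned int bus, unsigned int devfn,
		    int reg, int len, uint32_t *val);
};

/* Stand-ins for raw_pci_ops and raw_pci_ext_ops. */
static struct demo_raw_ops *demo_std_ops;
static struct demo_raw_ops *demo_ext_ops;

/* Fake conf1 backend: only the first 256 bytes, result tagged with 1. */
static int demo_conf1_read(unsigned int domain, unsigned int bus,
			   unsigned int devfn, int reg, int len, uint32_t *val)
{
	(void)domain; (void)bus; (void)devfn; (void)len;
	if (reg < 0 || reg > 255)
		return -EINVAL;
	*val = 1;
	return 0;
}

/* Fake MMCONFIG backend: up to 4095 as in the patch, result tagged with 2. */
static int demo_mmcfg_read(unsigned int domain, unsigned int bus,
			   unsigned int devfn, int reg, int len, uint32_t *val)
{
	(void)domain; (void)bus; (void)devfn; (void)len;
	if (reg < 0 || reg > 4095)
		return -EINVAL;
	*val = 2;
	return 0;
}

struct demo_raw_ops demo_conf1 = { .read = demo_conf1_read };
struct demo_raw_ops demo_mmcfg = { .read = demo_mmcfg_read };

/* Hypothetical helper: choose which backends are installed. */
void demo_set_ops(struct demo_raw_ops *std, struct demo_raw_ops *ext)
{
	demo_std_ops = std;
	demo_ext_ops = ext;
}

/* Same two-step fallback as raw_pci_read() in the patch. */
int demo_raw_read(unsigned int domain, unsigned int bus, unsigned int devfn,
		  int reg, int len, uint32_t *val)
{
	if (reg < 256 && demo_std_ops)
		return demo_std_ops->read(domain, bus, devfn, reg, len, val);
	if (demo_ext_ops)
		return demo_ext_ops->read(domain, bus, devfn, reg, len, val);
	return -EINVAL;
}
```

With both backends installed, register 0x10 goes to the conf1 stand-in
and register 0x100 to the MMCONFIG stand-in. With only the extended ops
installed, register 0x10 also falls through to MMCONFIG, and with neither
installed every access returns -EINVAL, which is what lets the patch drop
the BUG_ON(!raw_pci_ops) checks from drivers/acpi/osl.c.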


-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

* Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
  2008-02-10 14:51                                                                         ` Matthew Wilcox
@ 2008-02-10 19:13                                                                           ` Grant Grundler
  2008-02-10 19:37                                                                             ` Matthew Wilcox
  2008-02-10 20:16                                                                           ` Yinghai Lu
  1 sibling, 1 reply; 125+ messages in thread
From: Grant Grundler @ 2008-02-10 19:13 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Greg KH, Yinghai Lu, Andrew Morton, Ingo Molnar, Tony Camuso,
	Grant Grundler, Loic Prylli, Adrian Bunk, Linus Torvalds,
	Arjan van de Ven, Benjamin Herrenschmidt, Ivan Kokshaysky,
	Greg KH, linux-kernel, Jeff Garzik, linux-pci, Martin Mares

On Sun, Feb 10, 2008 at 07:51:22AM -0700, Matthew Wilcox wrote:
> From: Matthew Wilcox <matthew@wil.cx>
> Date: Sun, 10 Feb 2008 09:45:28 -0500
> Subject: [PATCH] Change pci_raw_ops to pci_raw_read/write
...
> -static int
> -pci_read (struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
> +static int pci_read(struct pci_bus *bus, unsigned int devfn, int where,
> +							int size, u32 *value)
>  {
> -	return raw_pci_ops->read(pci_domain_nr(bus), bus->number,
> +	return raw_pci_read(pci_domain_nr(bus), bus->number,
>  				 devfn, where, size, value);

Willy,
Just wondering...why don't we just pass "struct bus*" through to the
raw_pci* ops?
My thinking is if a PCI bus controller or bridge is discovered, then we should
always create a matching "struct bus *".

Your patch looks fine to me but if you (and others) agree with the above,
I can make a patch to change the internal interface. The pci_*_config API
needs to remain the same.

...
> --- a/arch/x86/kernel/quirks.c
> +++ b/arch/x86/kernel/quirks.c
> @@ -27,7 +27,7 @@ static void __devinit quirk_intel_irqbalance(struct pci_dev *dev)
>  	pci_write_config_byte(dev, 0xf4, config|0x2);
>  
>  	/* read xTPR register */
> -	raw_pci_ops->read(0, 0, 0x40, 0x4c, 2, &word);
> +	raw_pci_read(0, 0, 0x40, 0x4c, 2, &word);

Why are we using raw_pci_read here instead of pci_read_config_dword()?
If the pci_write_config_byte() above works, then I expect the read
to work too.
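
For reference, the raw call in that hunk names its target purely by number: domain 0, bus 0, devfn 0x40, register 0x4c, length 2 bytes. A small sketch of how a devfn value splits into device and function follows. The SKETCH_* macros and sketch_decode_devfn() are illustrative names, written to match the kernel's PCI_DEVFN/PCI_SLOT/PCI_FUNC definitions as I understand them; this is not code from the patch.

```c
/* Illustrative copies of the kernel's devfn helpers (names are hypothetical). */
#define SKETCH_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define SKETCH_SLOT(devfn)		(((devfn) >> 3) & 0x1f)
#define SKETCH_FUNC(devfn)		((devfn) & 0x07)

/* Split a devfn into its 5-bit device number and 3-bit function number. */
void sketch_decode_devfn(unsigned int devfn, unsigned int *slot,
			 unsigned int *func)
{
	*slot = SKETCH_SLOT(devfn);
	*func = SKETCH_FUNC(devfn);
}
```

Under these definitions devfn 0x40 decodes to device 8, function 0. A 2-byte read corresponds to a word-sized access, not a dword.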

To be clear, this is not a problem with this patch...rather a separate
problem with the original code.

hth,
grant

* Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
  2008-02-10 19:13                                                                           ` Grant Grundler
@ 2008-02-10 19:37                                                                             ` Matthew Wilcox
  0 siblings, 0 replies; 125+ messages in thread
From: Matthew Wilcox @ 2008-02-10 19:37 UTC (permalink / raw)
  To: Grant Grundler
  Cc: Greg KH, Yinghai Lu, Andrew Morton, Ingo Molnar, Tony Camuso,
	Loic Prylli, Adrian Bunk, Linus Torvalds, Arjan van de Ven,
	Benjamin Herrenschmidt, Ivan Kokshaysky, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Martin Mares

On Sun, Feb 10, 2008 at 12:13:13PM -0700, Grant Grundler wrote:
> Just wondering...why don't we just pass "struct bus*" through to the
> raw_pci* ops?
> My thinking is if a PCI bus controller or bridge is discovered, then we should
> always create a matching "struct bus *".

ACPI may need to access PCI config space before we've done a PCI bus
walk.  There's an opregion that AML may access that is for PCI config
space, and an apparently unrelated method might happen to contain such a
piece of AML.
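
A minimal userspace sketch of that layering, not kernel code: the raw accessor is keyed only by numbers (domain, bus, devfn, reg), so an early caller can use it before any bus structure exists, and the bus-level wrapper just forwards to it. fake_cfg, struct fake_bus and the sketch_* functions are hypothetical stand-ins; only the argument order follows raw_pci_read() in the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for one device's 256-byte config space. */
uint8_t fake_cfg[256];

/* Hypothetical stand-in for struct pci_bus; only exists after a bus scan. */
struct fake_bus {
	unsigned int domain;
	unsigned int number;
};

/* Raw accessor: needs numbers only, same argument order as raw_pci_read(). */
int sketch_raw_read(unsigned int domain, unsigned int bus,
		    unsigned int devfn, int reg, int len, uint32_t *val)
{
	int i;

	(void)domain; (void)bus; (void)devfn;	/* one fake device only */
	if (reg < 0 || len < 1 || len > 4 || reg + len > 256)
		return -1;

	/* Assemble the value byte by byte, least significant byte first. */
	*val = 0;
	for (i = 0; i < len; i++)
		*val |= (uint32_t)fake_cfg[reg + i] << (8 * i);
	return 0;
}

/* Bus-level wrapper: requires a bus structure, then forwards to the raw call. */
int sketch_bus_read(struct fake_bus *bus, unsigned int devfn,
		    int where, int size, uint32_t *val)
{
	return sketch_raw_read(bus->domain, bus->number, devfn,
			       where, size, val);
}
```

An early caller would use sketch_raw_read(0, 0, 0, 0, 2, &v) directly. Once a struct fake_bus exists, sketch_bus_read(&bus0, 0, 0, 2, &v) returns the same value through the wrapper.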

> > --- a/arch/x86/kernel/quirks.c
> > +++ b/arch/x86/kernel/quirks.c
> > @@ -27,7 +27,7 @@ static void __devinit quirk_intel_irqbalance(struct pci_dev *dev)
> >  	pci_write_config_byte(dev, 0xf4, config|0x2);
> >  
> >  	/* read xTPR register */
> > -	raw_pci_ops->read(0, 0, 0x40, 0x4c, 2, &word);
> > +	raw_pci_read(0, 0, 0x40, 0x4c, 2, &word);
> 
> Why are we using raw_pci_read here instead of pci_read_config_dword()?
> If the pci_write_config_byte() above works, then I expect the read
> to work too.

I have no idea.  I didn't want to change the semantics in this patch.
Presumably the original author would have an idea why they needed to do
this.

* Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
  2008-02-10 14:51                                                                         ` Matthew Wilcox
  2008-02-10 19:13                                                                           ` Grant Grundler
@ 2008-02-10 20:16                                                                           ` Yinghai Lu
  2008-02-10 20:19                                                                             ` Matthew Wilcox
  2008-02-10 20:24                                                                             ` Linus Torvalds
  1 sibling, 2 replies; 125+ messages in thread
From: Yinghai Lu @ 2008-02-10 20:16 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Greg KH, Andrew Morton, Ingo Molnar, Tony Camuso, Grant Grundler,
	Loic Prylli, Adrian Bunk, Linus Torvalds, Arjan van de Ven,
	Benjamin Herrenschmidt, Ivan Kokshaysky, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Martin Mares

On Feb 10, 2008 6:51 AM, Matthew Wilcox <matthew@wil.cx> wrote:
> On Sat, Feb 09, 2008 at 11:21:16PM -0800, Greg KH wrote:
> > Can I get a revised version of this, without the incorrect hunk?
>
> Sure.  I've even rebased it against current HEAD.  Damn whitespace
> cleanup introducing unnecessary conflicts ....
>
> I suggest Ivan's patch be merged ASAP as it actually fixes bugs.
> This patch is just cleanup (and takes care of some future concerns).

your patch and Ivan's patch should be merged in one...

YH

* Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
  2008-02-10 20:16                                                                           ` Yinghai Lu
@ 2008-02-10 20:19                                                                             ` Matthew Wilcox
  2008-02-10 20:25                                                                               ` Yinghai Lu
  2008-02-10 20:24                                                                             ` Linus Torvalds
  1 sibling, 1 reply; 125+ messages in thread
From: Matthew Wilcox @ 2008-02-10 20:19 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Greg KH, Andrew Morton, Ingo Molnar, Tony Camuso, Grant Grundler,
	Loic Prylli, Adrian Bunk, Linus Torvalds, Arjan van de Ven,
	Benjamin Herrenschmidt, Ivan Kokshaysky, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Martin Mares

On Sun, Feb 10, 2008 at 12:16:43PM -0800, Yinghai Lu wrote:
> On Feb 10, 2008 6:51 AM, Matthew Wilcox <matthew@wil.cx> wrote:
> > On Sat, Feb 09, 2008 at 11:21:16PM -0800, Greg KH wrote:
> > > Can I get a revised version of this, without the incorrect hunk?
> >
> > Sure.  I've even rebased it against current HEAD.  Damn whitespace
> > cleanup introducing unnecessary conflicts ....
> >
> > I suggest Ivan's patch be merged ASAP as it actually fixes bugs.
> > This patch is just cleanup (and takes care of some future concerns).
> 
> your patch and Ivan's patch should be merged in one...

Why?

* Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
  2008-02-10 20:16                                                                           ` Yinghai Lu
  2008-02-10 20:19                                                                             ` Matthew Wilcox
@ 2008-02-10 20:24                                                                             ` Linus Torvalds
  2008-02-10 20:45                                                                               ` Matthew Wilcox
  1 sibling, 1 reply; 125+ messages in thread
From: Linus Torvalds @ 2008-02-10 20:24 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Matthew Wilcox, Greg KH, Andrew Morton, Ingo Molnar, Tony Camuso,
	Grant Grundler, Loic Prylli, Adrian Bunk, Arjan van de Ven,
	Benjamin Herrenschmidt, Ivan Kokshaysky, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Martin Mares



On Sun, 10 Feb 2008, Yinghai Lu wrote:
> >
> > I suggest Ivan's patch be merged ASAP as it actually fixes bugs.
> > This patch is just cleanup (and takes care of some future concerns).
> 
> your patch and Ivan's patch should be merged in one...

I really don't care whether they get merged as one or separately, but I 
think it should be merged _now_ (-rc1 is already delayed), and I'd like to 
see the final versions of both. Does anybody have them in a final 
agreed-upon format (preferably with that oddness in quirk_intel_irqbalance 
also fixed?)

		Linus

* Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
  2008-02-10 20:19                                                                             ` Matthew Wilcox
@ 2008-02-10 20:25                                                                               ` Yinghai Lu
  2008-02-10 20:32                                                                                 ` Matthew Wilcox
  0 siblings, 1 reply; 125+ messages in thread
From: Yinghai Lu @ 2008-02-10 20:25 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Greg KH, Andrew Morton, Ingo Molnar, Tony Camuso, Grant Grundler,
	Loic Prylli, Adrian Bunk, Linus Torvalds, Arjan van de Ven,
	Benjamin Herrenschmidt, Ivan Kokshaysky, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Martin Mares

On Feb 10, 2008 12:19 PM, Matthew Wilcox <matthew@wil.cx> wrote:
>
> On Sun, Feb 10, 2008 at 12:16:43PM -0800, Yinghai Lu wrote:
> > On Feb 10, 2008 6:51 AM, Matthew Wilcox <matthew@wil.cx> wrote:
> > > On Sat, Feb 09, 2008 at 11:21:16PM -0800, Greg KH wrote:
> > > > Can I get a revised version of this, without the incorrect hunk?
> > >
> > > Sure.  I've even rebased it against current HEAD.  Damn whitespace
> > > cleanup introducing unnecessary conflicts ....
> > >
> > > I suggest Ivan's patch be merged ASAP as it actually fixes bugs.
> > > This patch is just cleanup (and takes care of some future concerns).
> >
> > your patch and Ivan's patch should be merged in one...
>
> Why?

Even Greg didn't know yesterday that another patch needed to be applied
before this one.

He said there were some hunks..

YH

* Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
  2008-02-10 20:25                                                                               ` Yinghai Lu
@ 2008-02-10 20:32                                                                                 ` Matthew Wilcox
  2008-02-10 20:47                                                                                   ` Yinghai Lu
  0 siblings, 1 reply; 125+ messages in thread
From: Matthew Wilcox @ 2008-02-10 20:32 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Greg KH, Andrew Morton, Ingo Molnar, Tony Camuso, Grant Grundler,
	Loic Prylli, Adrian Bunk, Linus Torvalds, Arjan van de Ven,
	Benjamin Herrenschmidt, Ivan Kokshaysky, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Martin Mares

On Sun, Feb 10, 2008 at 12:25:02PM -0800, Yinghai Lu wrote:
> Even Greg didn't know yesterday that another patch needed to be applied
> before this one.

I don't believe you.  For example:

On Mon, Jan 28, 2008 at 02:53:34PM -0800, Greg KH wrote:
> Please send me patches, in a form that can be merged, along with a
> proper changelog entry, in the order in which you wish them to be
> applied, so I know exactly what changes you are referring to.

Which I then did.

* Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
  2008-02-10 20:24                                                                             ` Linus Torvalds
@ 2008-02-10 20:45                                                                               ` Matthew Wilcox
  2008-02-10 23:02                                                                                 ` raw_pci_read in quirk_intel_irqbalance Matthew Wilcox
  2008-02-11  1:49                                                                                 ` [PATCH] Change pci_raw_ops to pci_raw_read/write Yinghai Lu
  0 siblings, 2 replies; 125+ messages in thread
From: Matthew Wilcox @ 2008-02-10 20:45 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Yinghai Lu, Greg KH, Andrew Morton, Ingo Molnar, Tony Camuso,
	Grant Grundler, Loic Prylli, Adrian Bunk, Arjan van de Ven,
	Benjamin Herrenschmidt, Ivan Kokshaysky, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Martin Mares

[-- Attachment #1: Type: text/plain, Size: 1225 bytes --]

On Sun, Feb 10, 2008 at 12:24:18PM -0800, Linus Torvalds wrote:
> On Sun, 10 Feb 2008, Yinghai Lu wrote:
> > >
> > > I suggest Ivan's patch be merged ASAP as it actually fixes bugs.
> > > This patch is just cleanup (and takes care of some future concerns).
> > 
> > your patch and Ivan's patch should be merged in one...
> 
> I really don't care whether they get merged as one or separately, but I 
> think it should be merged _now_ (-rc1 is already delayed), and I'd like to 
> see the final versions of both. Does anybody have them in a final 
> agreed-upon format (preferably with that oddness in quirk_intel_irqbalance 
> also fixed?)

I just looked at fixing that -- the reason seems to be that we don't
actually have the struct pci_dev at that point.  I can fix it, but I
think it's actually buggy.  I want to look at some chipset docs to
confirm that though.

I've attached the two patches that I believe are the ones we want.  We
can (and should) fix quirk_intel_irqbalance separately.

[-- Attachment #2: 0001-Ivan-s-PCI-patch.patch --]
[-- Type: text/plain, Size: 7101 bytes --]

From fc1e24786c764bf7e7a73128f839b58a0559739c Mon Sep 17 00:00:00 2001
From: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Date: Mon, 14 Jan 2008 17:31:09 -0500
Subject: [PATCH] PCI x86: always use conf1 to access config space below 256 bytes

Thanks to Loic Prylli <loic@myri.com>, who originally proposed
this idea.

Always using the legacy configuration mechanism for the legacy config space
and the extended mechanism (mmconf) for the extended config space is
a simple and very logical approach. It's supposed to resolve all
known mmconf problems. It still allows per-device quirks (tweaking
dev->cfg_size). It also allows us to get rid of the mmconf fallback code.

Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>

---
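
A rough sketch of why the 256-byte boundary is the natural split, assuming the standard conf1 address layout (the same expression as PCI_CONF1_ADDRESS() in arch/x86/pci/direct.c) and the usual 4 KB-per-function MMCONFIG window layout. The sketch_* names are hypothetical and this is an illustration, not part of the patch.

```c
#include <stdint.h>

/*
 * Conf1 address as written to I/O port 0xCF8. The register field is only
 * 8 bits wide, so offsets of 256 and above cannot be expressed.
 */
uint32_t sketch_conf1_address(unsigned int bus, unsigned int devfn,
			      unsigned int reg)
{
	return 0x80000000u | (bus << 16) | (devfn << 8) | (reg & ~3u);
}

/*
 * Byte offset into an MMCONFIG window. Each function gets 12 bits of
 * register space, i.e. the full 4096-byte extended config space.
 */
uint32_t sketch_mmconf_offset(unsigned int bus, unsigned int devfn,
			      unsigned int reg)
{
	return (bus << 20) | (devfn << 12) | reg;
}

/* Hypothetical helper: is this register reachable through conf1 at all? */
int sketch_conf1_reachable(unsigned int reg)
{
	return reg < 256;
}
```

With this layout, passing reg 0x100 to sketch_conf1_address() spills into the function field and yields the same address as function 1, register 0. That is why offsets of 256 and above have to go through mmconf.
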
 arch/x86/pci/mmconfig-shared.c |   35 -----------------------------------
 arch/x86/pci/mmconfig_32.c     |   22 +++++++++-------------
 arch/x86/pci/mmconfig_64.c     |   22 ++++++++++------------
 arch/x86/pci/pci.h             |    7 -------
 4 files changed, 19 insertions(+), 67 deletions(-)

diff --git a/arch/x86/pci/mmconfig-shared.c b/arch/x86/pci/mmconfig-shared.c
index 4df637e..6b521d3 100644
--- a/arch/x86/pci/mmconfig-shared.c
+++ b/arch/x86/pci/mmconfig-shared.c
@@ -22,42 +22,9 @@
 #define MMCONFIG_APER_MIN	(2 * 1024*1024)
 #define MMCONFIG_APER_MAX	(256 * 1024*1024)
 
-DECLARE_BITMAP(pci_mmcfg_fallback_slots, 32*PCI_MMCFG_MAX_CHECK_BUS);
-
 /* Indicate if the mmcfg resources have been placed into the resource table. */
 static int __initdata pci_mmcfg_resources_inserted;
 
-/* K8 systems have some devices (typically in the builtin northbridge)
-   that are only accessible using type1
-   Normally this can be expressed in the MCFG by not listing them
-   and assigning suitable _SEGs, but this isn't implemented in some BIOS.
-   Instead try to discover all devices on bus 0 that are unreachable using MM
-   and fallback for them. */
-static void __init unreachable_devices(void)
-{
-	int i, bus;
-	/* Use the max bus number from ACPI here? */
-	for (bus = 0; bus < PCI_MMCFG_MAX_CHECK_BUS; bus++) {
-		for (i = 0; i < 32; i++) {
-			unsigned int devfn = PCI_DEVFN(i, 0);
-			u32 val1, val2;
-
-			pci_conf1_read(0, bus, devfn, 0, 4, &val1);
-			if (val1 == 0xffffffff)
-				continue;
-
-			if (pci_mmcfg_arch_reachable(0, bus, devfn)) {
-				raw_pci_ops->read(0, bus, devfn, 0, 4, &val2);
-				if (val1 == val2)
-					continue;
-			}
-			set_bit(i + 32 * bus, pci_mmcfg_fallback_slots);
-			printk(KERN_NOTICE "PCI: No mmconfig possible on device"
-			       " %02x:%02x\n", bus, i);
-		}
-	}
-}
-
 static const char __init *pci_mmcfg_e7520(void)
 {
 	u32 win;
@@ -270,8 +237,6 @@ void __init pci_mmcfg_init(int type)
 		return;
 
 	if (pci_mmcfg_arch_init()) {
-		if (type == 1)
-			unreachable_devices();
 		if (known_bridge)
 			pci_mmcfg_insert_resources(IORESOURCE_BUSY);
 		pci_probe = (pci_probe & ~PCI_PROBE_MASK) | PCI_PROBE_MMCONF;
diff --git a/arch/x86/pci/mmconfig_32.c b/arch/x86/pci/mmconfig_32.c
index 1bf5816..7b75e65 100644
--- a/arch/x86/pci/mmconfig_32.c
+++ b/arch/x86/pci/mmconfig_32.c
@@ -30,10 +30,6 @@ static u32 get_base_addr(unsigned int seg, int bus, unsigned devfn)
 	struct acpi_mcfg_allocation *cfg;
 	int cfg_num;
 
-	if (seg == 0 && bus < PCI_MMCFG_MAX_CHECK_BUS &&
-	    test_bit(PCI_SLOT(devfn) + 32*bus, pci_mmcfg_fallback_slots))
-		return 0;
-
 	for (cfg_num = 0; cfg_num < pci_mmcfg_config_num; cfg_num++) {
 		cfg = &pci_mmcfg_config[cfg_num];
 		if (cfg->pci_segment == seg &&
@@ -68,13 +64,16 @@ static int pci_mmcfg_read(unsigned int seg, unsigned int bus,
 	u32 base;
 
 	if ((bus > 255) || (devfn > 255) || (reg > 4095)) {
-		*value = -1;
+err:		*value = -1;
 		return -EINVAL;
 	}
 
+	if (reg < 256)
+		return pci_conf1_read(seg,bus,devfn,reg,len,value);
+
 	base = get_base_addr(seg, bus, devfn);
 	if (!base)
-		return pci_conf1_read(seg,bus,devfn,reg,len,value);
+		goto err;
 
 	spin_lock_irqsave(&pci_config_lock, flags);
 
@@ -105,9 +104,12 @@ static int pci_mmcfg_write(unsigned int seg, unsigned int bus,
 	if ((bus > 255) || (devfn > 255) || (reg > 4095))
 		return -EINVAL;
 
+	if (reg < 256)
+		return pci_conf1_write(seg,bus,devfn,reg,len,value);
+
 	base = get_base_addr(seg, bus, devfn);
 	if (!base)
-		return pci_conf1_write(seg,bus,devfn,reg,len,value);
+		return -EINVAL;
 
 	spin_lock_irqsave(&pci_config_lock, flags);
 
@@ -134,12 +136,6 @@ static struct pci_raw_ops pci_mmcfg = {
 	.write =	pci_mmcfg_write,
 };
 
-int __init pci_mmcfg_arch_reachable(unsigned int seg, unsigned int bus,
-				    unsigned int devfn)
-{
-	return get_base_addr(seg, bus, devfn) != 0;
-}
-
 int __init pci_mmcfg_arch_init(void)
 {
 	printk(KERN_INFO "PCI: Using MMCONFIG\n");
diff --git a/arch/x86/pci/mmconfig_64.c b/arch/x86/pci/mmconfig_64.c
index 4095e4d..c4cf318 100644
--- a/arch/x86/pci/mmconfig_64.c
+++ b/arch/x86/pci/mmconfig_64.c
@@ -40,9 +40,7 @@ static char __iomem *get_virt(unsigned int seg, unsigned bus)
 static char __iomem *pci_dev_base(unsigned int seg, unsigned int bus, unsigned int devfn)
 {
 	char __iomem *addr;
-	if (seg == 0 && bus < PCI_MMCFG_MAX_CHECK_BUS &&
-		test_bit(32*bus + PCI_SLOT(devfn), pci_mmcfg_fallback_slots))
-		return NULL;
+
 	addr = get_virt(seg, bus);
 	if (!addr)
 		return NULL;
@@ -56,13 +54,16 @@ static int pci_mmcfg_read(unsigned int seg, unsigned int bus,
 
 	/* Why do we have this when nobody checks it. How about a BUG()!? -AK */
 	if (unlikely((bus > 255) || (devfn > 255) || (reg > 4095))) {
-		*value = -1;
+err:		*value = -1;
 		return -EINVAL;
 	}
 
+	if (reg < 256)
+		return pci_conf1_read(seg,bus,devfn,reg,len,value);
+
 	addr = pci_dev_base(seg, bus, devfn);
 	if (!addr)
-		return pci_conf1_read(seg,bus,devfn,reg,len,value);
+		goto err;
 
 	switch (len) {
 	case 1:
@@ -88,9 +89,12 @@ static int pci_mmcfg_write(unsigned int seg, unsigned int bus,
 	if (unlikely((bus > 255) || (devfn > 255) || (reg > 4095)))
 		return -EINVAL;
 
+	if (reg < 256)
+		return pci_conf1_write(seg,bus,devfn,reg,len,value);
+
 	addr = pci_dev_base(seg, bus, devfn);
 	if (!addr)
-		return pci_conf1_write(seg,bus,devfn,reg,len,value);
+		return -EINVAL;
 
 	switch (len) {
 	case 1:
@@ -126,12 +130,6 @@ static void __iomem * __init mcfg_ioremap(struct acpi_mcfg_allocation *cfg)
 	return addr;
 }
 
-int __init pci_mmcfg_arch_reachable(unsigned int seg, unsigned int bus,
-				    unsigned int devfn)
-{
-	return pci_dev_base(seg, bus, devfn) != NULL;
-}
-
 int __init pci_mmcfg_arch_init(void)
 {
 	int i;
diff --git a/arch/x86/pci/pci.h b/arch/x86/pci/pci.h
index ac56d39..36cb44c 100644
--- a/arch/x86/pci/pci.h
+++ b/arch/x86/pci/pci.h
@@ -98,13 +98,6 @@ extern void pcibios_sort(void);
 
 /* pci-mmconfig.c */
 
-/* Verify the first 16 busses. We assume that systems with more busses
-   get MCFG right. */
-#define PCI_MMCFG_MAX_CHECK_BUS 16
-extern DECLARE_BITMAP(pci_mmcfg_fallback_slots, 32*PCI_MMCFG_MAX_CHECK_BUS);
-
-extern int __init pci_mmcfg_arch_reachable(unsigned int seg, unsigned int bus,
-					   unsigned int devfn);
 extern int __init pci_mmcfg_arch_init(void);
 
 /*
-- 
1.5.2.5


[-- Attachment #3: 0002-Change-pci_raw_ops-to-pci_raw_read-write.patch --]
[-- Type: text/plain, Size: 16307 bytes --]

From ad4c3f135cda6f5210735231d30ef8e9dbd58c7c Mon Sep 17 00:00:00 2001
From: Matthew Wilcox <matthew@wil.cx>
Date: Sun, 10 Feb 2008 09:45:28 -0500
Subject: [PATCH] Change pci_raw_ops to pci_raw_read/write

We want to allow different implementations of pci_raw_ops for standard
and extended config space on x86.  Rather than clutter generic code with
knowledge of this, we make pci_raw_ops private to x86 and use it to
implement the new raw interface -- raw_pci_read() and raw_pci_write().

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
---
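
A standalone sketch of the dispatch rule this patch adds in arch/x86/pci/common.c, read side only. The two back ends are hypothetical stubs that just tag the value with which one served the request. sketch_ops and sketch_ext_ops stand in for raw_pci_ops and raw_pci_ext_ops; everything else is an assumption made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as struct pci_raw_ops, reduced to the read hook. */
struct sketch_raw_ops {
	int (*read)(unsigned int domain, unsigned int bus, unsigned int devfn,
		    int reg, int len, uint32_t *val);
};

/* Hypothetical conf1 back end: tags the result with 0xC1. */
int sketch_conf1_read(unsigned int domain, unsigned int bus,
		      unsigned int devfn, int reg, int len, uint32_t *val)
{
	(void)domain; (void)bus; (void)devfn; (void)reg; (void)len;
	*val = 0xC1;
	return 0;
}

/* Hypothetical mmconfig back end: tags the result with 0xEC. */
int sketch_mmcfg_read(unsigned int domain, unsigned int bus,
		      unsigned int devfn, int reg, int len, uint32_t *val)
{
	(void)domain; (void)bus; (void)devfn; (void)reg; (void)len;
	*val = 0xEC;
	return 0;
}

struct sketch_raw_ops sketch_conf1 = { .read = sketch_conf1_read };
struct sketch_raw_ops sketch_mmcfg = { .read = sketch_mmcfg_read };

struct sketch_raw_ops *sketch_ops;	/* stands in for raw_pci_ops */
struct sketch_raw_ops *sketch_ext_ops;	/* stands in for raw_pci_ext_ops */

/* Same decision order as raw_pci_read() in the patch. */
int sketch_raw_pci_read(unsigned int domain, unsigned int bus,
			unsigned int devfn, int reg, int len, uint32_t *val)
{
	if (reg < 256 && sketch_ops)
		return sketch_ops->read(domain, bus, devfn, reg, len, val);
	if (sketch_ext_ops)
		return sketch_ext_ops->read(domain, bus, devfn, reg, len, val);
	return -EINVAL;
}
```

With only sketch_ops set, register 0x100 fails with -EINVAL. With only sketch_ext_ops set, registers below 256 fall through to the extended back end. With both set, the 256-byte boundary decides which back end is used.
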
 arch/ia64/pci/pci.c               |   25 ++++++++-----------------
 arch/ia64/sn/pci/tioce_provider.c |   16 ++++++++--------
 arch/x86/kernel/quirks.c          |    2 +-
 arch/x86/pci/common.c             |   25 +++++++++++++++++++++++--
 arch/x86/pci/direct.c             |    4 ++--
 arch/x86/pci/fixup.c              |    6 ++++--
 arch/x86/pci/legacy.c             |    2 +-
 arch/x86/pci/mmconfig-shared.c    |    6 +++---
 arch/x86/pci/mmconfig_32.c        |   10 ++--------
 arch/x86/pci/mmconfig_64.c        |    8 +-------
 arch/x86/pci/pci.h                |   15 +++++++++++----
 arch/x86/pci/visws.c              |    3 ---
 drivers/acpi/osl.c                |   25 ++++++-------------------
 include/linux/pci.h               |   16 ++++++++--------
 14 files changed, 78 insertions(+), 85 deletions(-)

diff --git a/arch/ia64/pci/pci.c b/arch/ia64/pci/pci.c
index 488e48a..8fd7e82 100644
--- a/arch/ia64/pci/pci.c
+++ b/arch/ia64/pci/pci.c
@@ -43,8 +43,7 @@
 #define PCI_SAL_EXT_ADDRESS(seg, bus, devfn, reg)	\
 	(((u64) seg << 28) | (bus << 20) | (devfn << 12) | (reg))
 
-static int
-pci_sal_read (unsigned int seg, unsigned int bus, unsigned int devfn,
+int raw_pci_read(unsigned int seg, unsigned int bus, unsigned int devfn,
 	      int reg, int len, u32 *value)
 {
 	u64 addr, data = 0;
@@ -68,8 +67,7 @@ pci_sal_read (unsigned int seg, unsigned int bus, unsigned int devfn,
 	return 0;
 }
 
-static int
-pci_sal_write (unsigned int seg, unsigned int bus, unsigned int devfn,
+int raw_pci_write(unsigned int seg, unsigned int bus, unsigned int devfn,
 	       int reg, int len, u32 value)
 {
 	u64 addr;
@@ -91,24 +89,17 @@ pci_sal_write (unsigned int seg, unsigned int bus, unsigned int devfn,
 	return 0;
 }
 
-static struct pci_raw_ops pci_sal_ops = {
-	.read =		pci_sal_read,
-	.write =	pci_sal_write
-};
-
-struct pci_raw_ops *raw_pci_ops = &pci_sal_ops;
-
-static int
-pci_read (struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
+static int pci_read(struct pci_bus *bus, unsigned int devfn, int where,
+							int size, u32 *value)
 {
-	return raw_pci_ops->read(pci_domain_nr(bus), bus->number,
+	return raw_pci_read(pci_domain_nr(bus), bus->number,
 				 devfn, where, size, value);
 }
 
-static int
-pci_write (struct pci_bus *bus, unsigned int devfn, int where, int size, u32 value)
+static int pci_write(struct pci_bus *bus, unsigned int devfn, int where,
+							int size, u32 value)
 {
-	return raw_pci_ops->write(pci_domain_nr(bus), bus->number,
+	return raw_pci_write(pci_domain_nr(bus), bus->number,
 				  devfn, where, size, value);
 }
 
diff --git a/arch/ia64/sn/pci/tioce_provider.c b/arch/ia64/sn/pci/tioce_provider.c
index e1a3e19..999f14f 100644
--- a/arch/ia64/sn/pci/tioce_provider.c
+++ b/arch/ia64/sn/pci/tioce_provider.c
@@ -752,13 +752,13 @@ tioce_kern_init(struct tioce_common *tioce_common)
 	 * Determine the secondary bus number of the port2 logical PPB.
 	 * This is used to decide whether a given pci device resides on
 	 * port1 or port2.  Note:  We don't have enough plumbing set up
-	 * here to use pci_read_config_xxx() so use the raw_pci_ops vector.
+	 * here to use pci_read_config_xxx() so use raw_pci_read().
 	 */
 
 	seg = tioce_common->ce_pcibus.bs_persist_segment;
 	bus = tioce_common->ce_pcibus.bs_persist_busnum;
 
-	raw_pci_ops->read(seg, bus, PCI_DEVFN(2, 0), PCI_SECONDARY_BUS, 1,&tmp);
+	raw_pci_read(seg, bus, PCI_DEVFN(2, 0), PCI_SECONDARY_BUS, 1,&tmp);
 	tioce_kern->ce_port1_secondary = (u8) tmp;
 
 	/*
@@ -799,11 +799,11 @@ tioce_kern_init(struct tioce_common *tioce_common)
 
 		/* mem base/limit */
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_MEMORY_BASE, 2, &tmp);
 		base = (u64)tmp << 16;
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_MEMORY_LIMIT, 2, &tmp);
 		limit = (u64)tmp << 16;
 		limit |= 0xfffffUL;
@@ -817,21 +817,21 @@ tioce_kern_init(struct tioce_common *tioce_common)
 		 * attributes.
 		 */
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_PREF_MEMORY_BASE, 2, &tmp);
 		base = ((u64)tmp & PCI_PREF_RANGE_MASK) << 16;
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_PREF_BASE_UPPER32, 4, &tmp);
 		base |= (u64)tmp << 32;
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_PREF_MEMORY_LIMIT, 2, &tmp);
 
 		limit = ((u64)tmp & PCI_PREF_RANGE_MASK) << 16;
 		limit |= 0xfffffUL;
 
-		raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+		raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
 				  PCI_PREF_LIMIT_UPPER32, 4, &tmp);
 		limit |= (u64)tmp << 32;
 
diff --git a/arch/x86/kernel/quirks.c b/arch/x86/kernel/quirks.c
index 6ba33ca..1941482 100644
--- a/arch/x86/kernel/quirks.c
+++ b/arch/x86/kernel/quirks.c
@@ -27,7 +27,7 @@ static void __devinit quirk_intel_irqbalance(struct pci_dev *dev)
 	pci_write_config_byte(dev, 0xf4, config|0x2);
 
 	/* read xTPR register */
-	raw_pci_ops->read(0, 0, 0x40, 0x4c, 2, &word);
+	raw_pci_read(0, 0, 0x40, 0x4c, 2, &word);
 
 	if (!(word & (1 << 13))) {
 		dev_info(&dev->dev, "Intel E7520/7320/7525 detected; "
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 52deabc..b7c67a1 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -26,16 +26,37 @@ int pcibios_last_bus = -1;
 unsigned long pirq_table_addr;
 struct pci_bus *pci_root_bus;
 struct pci_raw_ops *raw_pci_ops;
+struct pci_raw_ops *raw_pci_ext_ops;
+
+int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
+						int reg, int len, u32 *val)
+{
+	if (reg < 256 && raw_pci_ops)
+		return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
+	if (raw_pci_ext_ops)
+		return raw_pci_ext_ops->read(domain, bus, devfn, reg, len, val);
+	return -EINVAL;
+}
+
+int raw_pci_write(unsigned int domain, unsigned int bus, unsigned int devfn,
+						int reg, int len, u32 val)
+{
+	if (reg < 256 && raw_pci_ops)
+		return raw_pci_ops->write(domain, bus, devfn, reg, len, val);
+	if (raw_pci_ext_ops)
+		return raw_pci_ext_ops->write(domain, bus, devfn, reg, len, val);
+	return -EINVAL;
+}
 
 static int pci_read(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
 {
-	return raw_pci_ops->read(pci_domain_nr(bus), bus->number,
+	return raw_pci_read(pci_domain_nr(bus), bus->number,
 				 devfn, where, size, value);
 }
 
 static int pci_write(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 value)
 {
-	return raw_pci_ops->write(pci_domain_nr(bus), bus->number,
+	return raw_pci_write(pci_domain_nr(bus), bus->number,
 				  devfn, where, size, value);
 }
 
diff --git a/arch/x86/pci/direct.c b/arch/x86/pci/direct.c
index 431c9a5..42f3e4c 100644
--- a/arch/x86/pci/direct.c
+++ b/arch/x86/pci/direct.c
@@ -14,7 +14,7 @@
 #define PCI_CONF1_ADDRESS(bus, devfn, reg) \
 	(0x80000000 | (bus << 16) | (devfn << 8) | (reg & ~3))
 
-int pci_conf1_read(unsigned int seg, unsigned int bus,
+static int pci_conf1_read(unsigned int seg, unsigned int bus,
 			  unsigned int devfn, int reg, int len, u32 *value)
 {
 	unsigned long flags;
@@ -45,7 +45,7 @@ int pci_conf1_read(unsigned int seg, unsigned int bus,
 	return 0;
 }
 
-int pci_conf1_write(unsigned int seg, unsigned int bus,
+static int pci_conf1_write(unsigned int seg, unsigned int bus,
 			   unsigned int devfn, int reg, int len, u32 value)
 {
 	unsigned long flags;
diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index 74d30ff..a5ef5f5 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -215,7 +215,8 @@ static int quirk_aspm_offset[MAX_PCIEROOT << 3];
 
 static int quirk_pcie_aspm_read(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
 {
-	return raw_pci_ops->read(0, bus->number, devfn, where, size, value);
+	return raw_pci_read(pci_domain_nr(bus), bus->number,
+						devfn, where, size, value);
 }
 
 /*
@@ -231,7 +232,8 @@ static int quirk_pcie_aspm_write(struct pci_bus *bus, unsigned int devfn, int wh
 	if ((offset) && (where == offset))
 		value = value & 0xfffffffc;
 
-	return raw_pci_ops->write(0, bus->number, devfn, where, size, value);
+	return raw_pci_write(pci_domain_nr(bus), bus->number,
+						devfn, where, size, value);
 }
 
 static struct pci_ops quirk_pcie_aspm_ops = {
diff --git a/arch/x86/pci/legacy.c b/arch/x86/pci/legacy.c
index 5565d70..e041ced 100644
--- a/arch/x86/pci/legacy.c
+++ b/arch/x86/pci/legacy.c
@@ -22,7 +22,7 @@ static void __devinit pcibios_fixup_peer_bridges(void)
 		if (pci_find_bus(0, n))
 			continue;
 		for (devfn = 0; devfn < 256; devfn += 8) {
-			if (!raw_pci_ops->read(0, n, devfn, PCI_VENDOR_ID, 2, &l) &&
+			if (!raw_pci_read(0, n, devfn, PCI_VENDOR_ID, 2, &l) &&
 			    l != 0x0000 && l != 0xffff) {
 				DBG("Found device at %02x:%02x [%04x]\n", n, devfn, l);
 				printk(KERN_INFO "PCI: Discovered peer bus %02x\n", n);
diff --git a/arch/x86/pci/mmconfig-shared.c b/arch/x86/pci/mmconfig-shared.c
index 6b521d3..8d54df4 100644
--- a/arch/x86/pci/mmconfig-shared.c
+++ b/arch/x86/pci/mmconfig-shared.c
@@ -28,7 +28,7 @@ static int __initdata pci_mmcfg_resources_inserted;
 static const char __init *pci_mmcfg_e7520(void)
 {
 	u32 win;
-	pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);
+	pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);
 
 	win = win & 0xf000;
 	if(win == 0x0000 || win == 0xf000)
@@ -53,7 +53,7 @@ static const char __init *pci_mmcfg_intel_945(void)
 
 	pci_mmcfg_config_num = 1;
 
-	pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0x48, 4, &pciexbar);
+	pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0x48, 4, &pciexbar);
 
 	/* Enable bit */
 	if (!(pciexbar & 1))
@@ -118,7 +118,7 @@ static int __init pci_mmcfg_check_hostbridge(void)
 	int i;
 	const char *name;
 
-	pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0, 4, &l);
+	pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0, 4, &l);
 	vendor = l & 0xffff;
 	device = (l >> 16) & 0xffff;
 
diff --git a/arch/x86/pci/mmconfig_32.c b/arch/x86/pci/mmconfig_32.c
index 7b75e65..081816a 100644
--- a/arch/x86/pci/mmconfig_32.c
+++ b/arch/x86/pci/mmconfig_32.c
@@ -68,9 +68,6 @@ err:		*value = -1;
 		return -EINVAL;
 	}
 
-	if (reg < 256)
-		return pci_conf1_read(seg,bus,devfn,reg,len,value);
-
 	base = get_base_addr(seg, bus, devfn);
 	if (!base)
 		goto err;
@@ -104,9 +101,6 @@ static int pci_mmcfg_write(unsigned int seg, unsigned int bus,
 	if ((bus > 255) || (devfn > 255) || (reg > 4095))
 		return -EINVAL;
 
-	if (reg < 256)
-		return pci_conf1_write(seg,bus,devfn,reg,len,value);
-
 	base = get_base_addr(seg, bus, devfn);
 	if (!base)
 		return -EINVAL;
@@ -138,7 +132,7 @@ static struct pci_raw_ops pci_mmcfg = {
 
 int __init pci_mmcfg_arch_init(void)
 {
-	printk(KERN_INFO "PCI: Using MMCONFIG\n");
-	raw_pci_ops = &pci_mmcfg;
+	printk(KERN_INFO "PCI: Using MMCONFIG for extended config space\n");
+	raw_pci_ext_ops = &pci_mmcfg;
 	return 1;
 }
diff --git a/arch/x86/pci/mmconfig_64.c b/arch/x86/pci/mmconfig_64.c
index c4cf318..9207fd4 100644
--- a/arch/x86/pci/mmconfig_64.c
+++ b/arch/x86/pci/mmconfig_64.c
@@ -58,9 +58,6 @@ err:		*value = -1;
 		return -EINVAL;
 	}
 
-	if (reg < 256)
-		return pci_conf1_read(seg,bus,devfn,reg,len,value);
-
 	addr = pci_dev_base(seg, bus, devfn);
 	if (!addr)
 		goto err;
@@ -89,9 +86,6 @@ static int pci_mmcfg_write(unsigned int seg, unsigned int bus,
 	if (unlikely((bus > 255) || (devfn > 255) || (reg > 4095)))
 		return -EINVAL;
 
-	if (reg < 256)
-		return pci_conf1_write(seg,bus,devfn,reg,len,value);
-
 	addr = pci_dev_base(seg, bus, devfn);
 	if (!addr)
 		return -EINVAL;
@@ -150,6 +144,6 @@ int __init pci_mmcfg_arch_init(void)
 			return 0;
 		}
 	}
-	raw_pci_ops = &pci_mmcfg;
+	raw_pci_ext_ops = &pci_mmcfg;
 	return 1;
 }
diff --git a/arch/x86/pci/pci.h b/arch/x86/pci/pci.h
index 36cb44c..3431518 100644
--- a/arch/x86/pci/pci.h
+++ b/arch/x86/pci/pci.h
@@ -85,10 +85,17 @@ extern spinlock_t pci_config_lock;
 extern int (*pcibios_enable_irq)(struct pci_dev *dev);
 extern void (*pcibios_disable_irq)(struct pci_dev *dev);
 
-extern int pci_conf1_write(unsigned int seg, unsigned int bus,
-			   unsigned int devfn, int reg, int len, u32 value);
-extern int pci_conf1_read(unsigned int seg, unsigned int bus,
-			  unsigned int devfn, int reg, int len, u32 *value);
+struct pci_raw_ops {
+	int (*read)(unsigned int domain, unsigned int bus, unsigned int devfn,
+						int reg, int len, u32 *val);
+	int (*write)(unsigned int domain, unsigned int bus, unsigned int devfn,
+						int reg, int len, u32 val);
+};
+
+extern struct pci_raw_ops *raw_pci_ops;
+extern struct pci_raw_ops *raw_pci_ext_ops;
+
+extern struct pci_raw_ops pci_direct_conf1;
 
 extern int pci_direct_probe(void);
 extern void pci_direct_init(int type);
diff --git a/arch/x86/pci/visws.c b/arch/x86/pci/visws.c
index 8ecb1c7..c2df4e9 100644
--- a/arch/x86/pci/visws.c
+++ b/arch/x86/pci/visws.c
@@ -13,9 +13,6 @@
 
 #include "pci.h"
 
-
-extern struct pci_raw_ops pci_direct_conf1;
-
 static int pci_visws_enable_irq(struct pci_dev *dev) { return 0; }
 static void pci_visws_disable_irq(struct pci_dev *dev) { }
 
diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index a14501c..34b3386 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -200,15 +200,6 @@ acpi_status __init acpi_os_initialize(void)
 
 acpi_status acpi_os_initialize1(void)
 {
-	/*
-	 * Initialize PCI configuration space access, as we'll need to access
-	 * it while walking the namespace (bus 0 and root bridges w/ _BBNs).
-	 */
-	if (!raw_pci_ops) {
-		printk(KERN_ERR PREFIX
-		       "Access to PCI configuration space unavailable\n");
-		return AE_NULL_ENTRY;
-	}
 	kacpid_wq = create_singlethread_workqueue("kacpid");
 	kacpi_notify_wq = create_singlethread_workqueue("kacpi_notify");
 	BUG_ON(!kacpid_wq);
@@ -653,11 +644,9 @@ acpi_os_read_pci_configuration(struct acpi_pci_id * pci_id, u32 reg,
 		return AE_ERROR;
 	}
 
-	BUG_ON(!raw_pci_ops);
-
-	result = raw_pci_ops->read(pci_id->segment, pci_id->bus,
-				   PCI_DEVFN(pci_id->device, pci_id->function),
-				   reg, size, value);
+	result = raw_pci_read(pci_id->segment, pci_id->bus,
+				PCI_DEVFN(pci_id->device, pci_id->function),
+				reg, size, value);
 
 	return (result ? AE_ERROR : AE_OK);
 }
@@ -682,11 +671,9 @@ acpi_os_write_pci_configuration(struct acpi_pci_id * pci_id, u32 reg,
 		return AE_ERROR;
 	}
 
-	BUG_ON(!raw_pci_ops);
-
-	result = raw_pci_ops->write(pci_id->segment, pci_id->bus,
-				    PCI_DEVFN(pci_id->device, pci_id->function),
-				    reg, size, value);
+	result = raw_pci_write(pci_id->segment, pci_id->bus,
+				PCI_DEVFN(pci_id->device, pci_id->function),
+				reg, size, value);
 
 	return (result ? AE_ERROR : AE_OK);
 }
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 7215d3b..87195b6 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -301,14 +301,14 @@ struct pci_ops {
 	int (*write)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 val);
 };
 
-struct pci_raw_ops {
-	int (*read)(unsigned int domain, unsigned int bus, unsigned int devfn,
-		    int reg, int len, u32 *val);
-	int (*write)(unsigned int domain, unsigned int bus, unsigned int devfn,
-		     int reg, int len, u32 val);
-};
-
-extern struct pci_raw_ops *raw_pci_ops;
+/*
+ * ACPI needs to be able to access PCI config space before we've done a
+ * PCI bus scan and created pci_bus structures.
+ */
+extern int raw_pci_read(unsigned int domain, unsigned int bus,
+			unsigned int devfn, int reg, int len, u32 *val);
+extern int raw_pci_write(unsigned int domain, unsigned int bus,
+			unsigned int devfn, int reg, int len, u32 val);
 
 struct pci_bus_region {
 	resource_size_t start;
-- 
1.5.2.5
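
A note for readers following the hunks above: the `if (reg < 256)` fallbacks removed from mmconfig_32.c and mmconfig_64.c suggest that the choice between conf1 and MMCONFIG now happens one level up, in raw_pci_read() and raw_pci_write(). The bodies of those functions are not part of this excerpt, so the following is only a user-space sketch of how such a dispatch could look, assuming registers below 256 go to raw_pci_ops and everything else goes to raw_pci_ext_ops. The mock_conf1_read and mock_mmcfg_read backends and the tag values they return are invented for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;

/* Same shape as the struct moved into arch/x86/pci/pci.h above. */
struct pci_raw_ops {
	int (*read)(unsigned int domain, unsigned int bus, unsigned int devfn,
		    int reg, int len, u32 *val);
	int (*write)(unsigned int domain, unsigned int bus, unsigned int devfn,
		     int reg, int len, u32 val);
};

/* Hypothetical backend standing in for conf1: tags the value with 0xc1. */
static int mock_conf1_read(unsigned int domain, unsigned int bus,
			   unsigned int devfn, int reg, int len, u32 *val)
{
	(void)domain; (void)bus; (void)devfn; (void)len;
	*val = 0xc1000000u | (u32)reg;
	return 0;
}

/* Hypothetical backend standing in for MMCONFIG: tags the value with 0xec. */
static int mock_mmcfg_read(unsigned int domain, unsigned int bus,
			   unsigned int devfn, int reg, int len, u32 *val)
{
	(void)domain; (void)bus; (void)devfn; (void)len;
	*val = 0xec000000u | (u32)reg;
	return 0;
}

static struct pci_raw_ops mock_conf1 = { .read = mock_conf1_read };
static struct pci_raw_ops mock_mmcfg = { .read = mock_mmcfg_read };

static struct pci_raw_ops *raw_pci_ops = &mock_conf1;
static struct pci_raw_ops *raw_pci_ext_ops = &mock_mmcfg;

/*
 * Assumed dispatch: traditional config space (reg < 256) uses the
 * normal ops, anything else needs the extended ops.
 */
int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
		 int reg, int len, u32 *val)
{
	if (reg < 256 && raw_pci_ops)
		return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
	if (raw_pci_ext_ops)
		return raw_pci_ext_ops->read(domain, bus, devfn, reg, len, val);
	return -EINVAL;
}
```

With a dispatch like this, callers such as the ACPI code no longer need their own `BUG_ON(!raw_pci_ops)` check, since a missing backend turns into an error return.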



* Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
  2008-02-10 20:32                                                                                 ` Matthew Wilcox
@ 2008-02-10 20:47                                                                                   ` Yinghai Lu
  0 siblings, 0 replies; 125+ messages in thread
From: Yinghai Lu @ 2008-02-10 20:47 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Greg KH, Andrew Morton, Ingo Molnar, Tony Camuso, Grant Grundler,
	Loic Prylli, Adrian Bunk, Linus Torvalds, Arjan van de Ven,
	Benjamin Herrenschmidt, Ivan Kokshaysky, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Martin Mares

On Feb 10, 2008 12:32 PM, Matthew Wilcox <matthew@wil.cx> wrote:
> On Sun, Feb 10, 2008 at 12:25:02PM -0800, Yinghai Lu wrote:
> > Even Greg didn't know that there was another patch that needed to be
> > applied before this one yesterday.
>
> I don't believe you.  For example:
>
> On Mon, Jan 28, 2008 at 02:53:34PM -0800, Greg KH wrote:
> > Please send me patches, in a form that can be merged, along with a
> > proper changelog entry, in the order in which you wish them to be
> > applied, so I know exactly what changes you are referring to.
>
> Which I then did.

Then you may need to send the patches to Greg, so that Greg or others don't
need to dig up Ivan's patch:

[PATCH 0/2]...
[PATCH 1/2]... Ivan's patch with a From: line
[PATCH 2/2] ... your patch

YH


* raw_pci_read in quirk_intel_irqbalance
  2008-02-10 20:45                                                                               ` Matthew Wilcox
@ 2008-02-10 23:02                                                                                 ` Matthew Wilcox
  2008-02-11  5:04                                                                                   ` Matthew Wilcox
  2008-02-11  1:49                                                                                 ` [PATCH] Change pci_raw_ops to pci_raw_read/write Yinghai Lu
  1 sibling, 1 reply; 125+ messages in thread
From: Matthew Wilcox @ 2008-02-10 23:02 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Yinghai Lu, Greg KH, Andrew Morton, Ingo Molnar, Tony Camuso,
	Grant Grundler, Loic Prylli, Adrian Bunk, Arjan van de Ven,
	Benjamin Herrenschmidt, Ivan Kokshaysky, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Martin Mares

On Sun, Feb 10, 2008 at 01:45:57PM -0700, Matthew Wilcox wrote:
> I just looked at fixing that -- the reason seems to be that we don't
> actually have the struct pci_dev at that point.  I can fix it, but I
> think it's actually buggy.  I want to look at some chipset docs to
> confirm that though.

I don't think I fully understand what's going on here.  So here's what
I've been able to glean; hopefully someone who understands this better
can help out.

I happen to have an E7525-based machine, so here's an lspci of bus 0:

00:00.0 Host bridge: Intel Corporation E7525 Memory Controller Hub (rev 0a)
00:02.0 PCI bridge: Intel Corporation E7525/E7520/E7320 PCI Express Port A (rev 0a)
00:03.0 PCI bridge: Intel Corporation E7525/E7520/E7320 PCI Express Port A1 (rev 0a)
00:04.0 PCI bridge: Intel Corporation E7525/E7520 PCI Express Port B (rev 0a)
00:1d.0 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #3 (rev 02)
00:1d.3 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #4 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev c2)
00:1f.0 ISA bridge: Intel Corporation 82801EB/ER (ICH5/ICH5R) LPC Interface Bridge (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801EB/ER (ICH5/ICH5R) IDE Controller (rev 02)
00:1f.2 IDE interface: Intel Corporation 82801EB (ICH5) SATA Controller (rev 02)
00:1f.5 Multimedia audio controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) AC'97 Audio Controller (rev 02)

The line in question reads:

        /* read xTPR register */
        raw_pci_read(0, 0, 0x40, 0x4c, 2, &word);

That's domain 0, bus 0, device 8, function 0, address 0x4c, length 2.

I've checked the public E7525 and E7520 MCH datasheets, and they don't
document the xTPR registers; nor do any of the devices in the datasheet
have registers documented at 0x4c.

You can see from my lspci above that I don't _have_ a device 8 on bus 0.
The aforementioned documentation says:

"A disabled or non-existent device's configuration register space is
hidden. A disabled or non-existent device will return all ones for reads
and will drop writes just as if the cycle terminated with a Master Abort
on PCI."

Now, my E7525 isn't affected by this quirk as it has a revision greater
than 0x9.  So maybe it's expected that device 8 is hidden on my machine;
that it's only present on revisions up to 0x9.  But maybe device 8 is
always hidden, and that's why the author used raw_pci_ops?

We can still do better than this, though.  We can do:

-	raw_pci_read(0, 0, 0x40, 0x4c, 2, &word);
+	pci_bus_read_config_word(dev->bus, PCI_DEVFN(8, 0), 0x4c, &word);

Using PCI_DEVFN tells people you really did mean device 8, and it's not
a braino for device 4 or 2 (how many bits for slot and function again?)
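
For the record, on the parenthetical: devfn packs a 5-bit slot (device) number above a 3-bit function number, which is why 0x40 decodes to device 8, function 0. Below is a small user-space sketch that can be used to check this. The macros are copied locally so it builds outside the kernel; they are meant to mirror the kernel's PCI_DEVFN, PCI_SLOT and PCI_FUNC, and describe_devfn is a hypothetical helper added only for this example.

```c
#include <stdio.h>

/* Local copies, intended to match the kernel's definitions. */
#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)		(((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)		((devfn) & 0x07)

/* Format a devfn as "device.function", the way lspci shows it. */
void describe_devfn(unsigned int devfn, char *buf, size_t len)
{
	snprintf(buf, len, "%02x.%x", PCI_SLOT(devfn), PCI_FUNC(devfn));
}
```

Shifting 0x40 by 4 or 5 bits instead of 3 is what would give the "device 4 or 2" misreading, so spelling the access as PCI_DEVFN(8, 0) removes that ambiguity.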

I'll see if I can dig up the internal documentation for the xTPR register
when I'm at work on Monday.  But I've never gone looking for internal
documentation before, so I have no idea how easy it will be to find ;-)

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."


* Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
  2008-02-10 20:45                                                                               ` Matthew Wilcox
  2008-02-10 23:02                                                                                 ` raw_pci_read in quirk_intel_irqbalance Matthew Wilcox
@ 2008-02-11  1:49                                                                                 ` Yinghai Lu
  2008-02-11  2:53                                                                                   ` Robert Hancock
  2008-02-11 22:10                                                                                   ` Andrew Morton
  1 sibling, 2 replies; 125+ messages in thread
From: Yinghai Lu @ 2008-02-11  1:49 UTC (permalink / raw)
  To: Andrew Morton, Matthew Wilcox, Arjan van de Ven, Robert Hancock
  Cc: Linus Torvalds, Greg KH, Ingo Molnar, Tony Camuso,
	Grant Grundler, Loic Prylli, Adrian Bunk, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

[-- Attachment #1: Type: text/plain, Size: 1575 bytes --]

On Feb 10, 2008 12:45 PM, Matthew Wilcox <matthew@wil.cx> wrote:
>
> On Sun, Feb 10, 2008 at 12:24:18PM -0800, Linus Torvalds wrote:
> > On Sun, 10 Feb 2008, Yinghai Lu wrote:
> > > >
> > > > I suggest Ivan's patch be merged ASAP as it actually fixes bugs.
> > > > This patch is just cleanup (and takes care of some future concerns).
> > >
> > > your patch and Ivan's patch should be merged in one...
> >
> > I really don't care whether they get merged as one or separately, but I
> > think it should be merged _now_ (-rc1 is already delayed), and I'd like to
> > see the final versions of both. Does anybody have them in a final
> > agreed-upon format (preferably with that oddness in quirk_intel_irqbalance
> > also fixed?)
>
> I just looked at fixing that -- the reason seems to be that we don't
> actually have the struct pci_dev at that point.  I can fix it, but I
> think it's actually buggy.  I want to look at some chipset docs to
> confirm that though.
>
> I've attached the two patches that I believe are the ones we want.  We
> can (and should) fix quirk_intel_irqbalance separately.

Andrew,

Those two patches just got into Linus's 2.6.25-rc1.

I assume that you will drop
gregkh-pci-pci-make-pci-extended-config-space-a-driver-opt-in.patch in
-mm.

please check some updated patches in -mm that could be affected. hope
it could save you some time

x86-validate-against-acpi-motherboard-resources.patch
x86-clear-pci_mmcfg_virt-when-mmcfg-get-rejected.patch
x86-mmconf-enable-mcfg-early.patch
x86_64-check-msr-to-get-mmconfig-for-amd-family-10h-opteron-v3.patch

YH

[-- Attachment #2: x.tar.bz2 --]
[-- Type: application/x-bzip2, Size: 5882 bytes --]


* Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
  2008-02-11  1:49                                                                                 ` [PATCH] Change pci_raw_ops to pci_raw_read/write Yinghai Lu
@ 2008-02-11  2:53                                                                                   ` Robert Hancock
  2008-02-11  5:59                                                                                     ` Yinghai Lu
  2008-02-11 22:10                                                                                   ` Andrew Morton
  1 sibling, 1 reply; 125+ messages in thread
From: Robert Hancock @ 2008-02-11  2:53 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Andrew Morton, Matthew Wilcox, Arjan van de Ven, Linus Torvalds,
	Greg KH, Ingo Molnar, Tony Camuso, Grant Grundler, Loic Prylli,
	Adrian Bunk, Benjamin Herrenschmidt, Ivan Kokshaysky, Greg KH,
	linux-kernel, Jeff Garzik, linux-pci, Martin Mares

Yinghai Lu wrote:
> On Feb 10, 2008 12:45 PM, Matthew Wilcox <matthew@wil.cx> wrote:
>> On Sun, Feb 10, 2008 at 12:24:18PM -0800, Linus Torvalds wrote:
>>> On Sun, 10 Feb 2008, Yinghai Lu wrote:
>>>>> I suggest Ivan's patch be merged ASAP as it actually fixes bugs.
>>>>> This patch is just cleanup (and takes care of some future concerns).
>>>> your patch and Ivan's patch should be merged in one...
>>> I really don't care whether they get merged as one or separately, but I
>>> think it should be merged _now_ (-rc1 is already delayed), and I'd like to
>>> see the final versions of both. Does anybody have them in a final
>>> agreed-upon format (preferably with that oddness in quirk_intel_irqbalance
>>> also fixed?)
>> I just looked at fixing that -- the reason seems to be that we don't
>> actually have the struct pci_dev at that point.  I can fix it, but I
>> think it's actually buggy.  I want to look at some chipset docs to
>> confirm that though.
>>
>> I've attached the two patches that I believe are the ones we want.  We
>> can (and should) fix quirk_intel_irqbalance separately.
> 
> Andrew,
> 
> Those two patches just got into Linus's 2.6.25-rc1.
> 
> I assume that you will drop
> gregkh-pci-pci-make-pci-extended-config-space-a-driver-opt-in.patch in
> -mm.
> 
> please check some updated patches in -mm that could be affected. hope
> it could save you some time
> 
> x86-validate-against-acpi-motherboard-resources.patch
> x86-clear-pci_mmcfg_virt-when-mmcfg-get-rejected.patch
> x86-mmconf-enable-mcfg-early.patch
> x86_64-check-msr-to-get-mmconfig-for-amd-family-10h-opteron-v3.patch

I don't think any of these patches are affected. They all affect whether
to use MMCONFIG globally or not, regardless of whether or not particular
accesses will use it.


* Re: raw_pci_read in quirk_intel_irqbalance
  2008-02-10 23:02                                                                                 ` raw_pci_read in quirk_intel_irqbalance Matthew Wilcox
@ 2008-02-11  5:04                                                                                   ` Matthew Wilcox
  2008-02-11  7:49                                                                                     ` Grant Grundler
  0 siblings, 1 reply; 125+ messages in thread
From: Matthew Wilcox @ 2008-02-11  5:04 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Yinghai Lu, Greg KH, Andrew Morton, Ingo Molnar, Tony Camuso,
	Grant Grundler, Loic Prylli, Adrian Bunk, Arjan van de Ven,
	Benjamin Herrenschmidt, Ivan Kokshaysky, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Martin Mares

On Sun, Feb 10, 2008 at 04:02:04PM -0700, Matthew Wilcox wrote:
> The line in question reads:
> 
>         /* read xTPR register */
>         raw_pci_read(0, 0, 0x40, 0x4c, 2, &word);
> 
> That's domain 0, bus 0, device 8, function 0, address 0x4c, length 2.
> 
> I've checked the public E7525 and E7520 MCH datasheets, and they don't
> document the xTPR registers; nor do any of the devices in the datasheet
> have registers documented at 0x4c.
> 
> You can see from my lspci above that I don't _have_ a device 8 on bus 0.
> The aforementioned documentation says:
> 
> "A disabled or non-existent device's configuration register space is
> hidden. A disabled or non-existent device will return all ones for reads
> and will drop writes just as if the cycle terminated with a Master Abort
> on PCI."

I'd like to thank Grant for pointing out to me that this is exactly what
the write immediately above this is doing -- enabling device 8 to
respond to config space cycles.
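
Put together, the sequence in the quirk is: set a bit in register 0xf4 so that device 8 answers config cycles, read 0x4c from device 8, then put 0xf4 back. The toy model below only illustrates that ordering in user space. The mock_cfg_* helpers, the choice of bit 1 as the "unhide" bit (taken from the config|0x2 write in the quirk), and the register contents are all assumptions made for the example, not chipset documentation.

```c
#include <stdint.h>

#define REG_XTPR	0x4c	/* xTPR register on device 8 */
#define UNHIDE_BIT	0x02	/* assumed: the bit set by config|0x2 */

/* Toy state: the 0xf4 byte on the host bridge, and one hidden register. */
static uint8_t  mock_f4 = 0x00;
static uint16_t mock_xtpr = 0x2000;	/* arbitrary example value, bit 13 set */

uint8_t mock_cfg_read_f4(void)
{
	return mock_f4;
}

static void mock_cfg_write_f4(uint8_t v)
{
	mock_f4 = v;
}

/*
 * While the device is hidden, reads return all ones, as for a master
 * abort. The model only implements the one register it needs.
 */
uint16_t mock_cfg_read_dev8_word(int reg)
{
	if (!(mock_f4 & UNHIDE_BIT) || reg != REG_XTPR)
		return 0xffff;
	return mock_xtpr;
}

/* Unhide, read, restore: the same ordering as the quirk. */
uint16_t read_xtpr_with_unhide(void)
{
	uint8_t saved = mock_cfg_read_f4();
	uint16_t word;

	mock_cfg_write_f4(saved | UNHIDE_BIT);
	word = mock_cfg_read_dev8_word(REG_XTPR);
	mock_cfg_write_f4(saved);
	return word;
}
```

Reading device 8 without the preceding write would return 0xffff in this model, which is the "hidden device" behaviour the datasheet quote describes.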

> Now, my E7525 isn't affected by this quirk as it has a revision greater
> than 0x9.  So maybe it's expected that device 8 is hidden on my machine;
> that it's only present on revisions up to 0x9.  But maybe device 8 is
> always hidden, and that's why the author used raw_pci_ops?
> 
> We can still do better than this, though.  We can do:
> 
> -	raw_pci_read(0, 0, 0x40, 0x4c, 2, &word);
> +	pci_bus_read_config_word(dev->bus, PCI_DEVFN(8, 0), 0x4c, &word);
> 
> Using PCI_DEVFN tells people you really did mean device 8, and it's not
> a braino for device 4 or 2 (how many bits for slot and function again?)

Here's the patch to implement the above two suggestions:

----

>From f565b65591a3f90a272b1d511e4ab1728861fe77 Mon Sep 17 00:00:00 2001
From: Matthew Wilcox <matthew@wil.cx>
Date: Sun, 10 Feb 2008 23:18:15 -0500
Subject: [PATCH] Use proper abstractions in quirk_intel_irqbalance

Since we may not have a pci_dev for the device we need to access, we can't
use pci_read_config_word.  But raw_pci_read is an internal implementation
detail; it's better to use the architected pci_bus_read_config_word
interface.  Using PCI_DEVFN instead of a mysterious constant helps
reassure everyone that we really do intend to access device 8.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
---
 arch/x86/kernel/quirks.c |    9 ++++++---
 1 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/quirks.c b/arch/x86/kernel/quirks.c
index 1941482..c47208f 100644
--- a/arch/x86/kernel/quirks.c
+++ b/arch/x86/kernel/quirks.c
@@ -11,7 +11,7 @@
 static void __devinit quirk_intel_irqbalance(struct pci_dev *dev)
 {
 	u8 config, rev;
-	u32 word;
+	u16 word;
 
 	/* BIOS may enable hardware IRQ balancing for
 	 * E7520/E7320/E7525(revision ID 0x9 and below)
@@ -26,8 +26,11 @@ static void __devinit quirk_intel_irqbalance(struct pci_dev *dev)
 	pci_read_config_byte(dev, 0xf4, &config);
 	pci_write_config_byte(dev, 0xf4, config|0x2);
 
-	/* read xTPR register */
-	raw_pci_read(0, 0, 0x40, 0x4c, 2, &word);
+	/*
+	 * read xTPR register.  We may not have a pci_dev for device 8
+	 * because it might be hidden until the above write.
+	 */
+	pci_bus_read_config_word(dev->bus, PCI_DEVFN(8, 0), 0x4c, &word);
 
 	if (!(word & (1 << 13))) {
 		dev_info(&dev->dev, "Intel E7520/7320/7525 detected; "
-- 
1.5.2.5



* Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
  2008-02-11  2:53                                                                                   ` Robert Hancock
@ 2008-02-11  5:59                                                                                     ` Yinghai Lu
  0 siblings, 0 replies; 125+ messages in thread
From: Yinghai Lu @ 2008-02-11  5:59 UTC (permalink / raw)
  To: Robert Hancock
  Cc: Andrew Morton, Matthew Wilcox, Arjan van de Ven, Linus Torvalds,
	Greg KH, Ingo Molnar, Tony Camuso, Grant Grundler, Loic Prylli,
	Adrian Bunk, Benjamin Herrenschmidt, Ivan Kokshaysky, Greg KH,
	linux-kernel, Jeff Garzik, linux-pci, Martin Mares

On Feb 10, 2008 6:53 PM, Robert Hancock <hancockr@shaw.ca> wrote:
>
> Yinghai Lu wrote:
> > On Feb 10, 2008 12:45 PM, Matthew Wilcox <matthew@wil.cx> wrote:
..
> >> I've attached the two patches that I believe are the ones we want.  We
> >> can (and should) fix quirk_intel_irqbalance separately.
> >
> > Andrew,
> >
> > Those two patches just got into Linus's 2.6.25-rc1.
> >
> > I assume that you will drop
> > gregkh-pci-pci-make-pci-extended-config-space-a-driver-opt-in.patch in
> > -mm.
> >
> > please check some updated patches in -mm that could be affected. hope
> > it could save you some time
> >
> > x86-validate-against-acpi-motherboard-resources.patch
> > x86-clear-pci_mmcfg_virt-when-mmcfg-get-rejected.patch
> > x86-mmconf-enable-mcfg-early.patch
> > x86_64-check-msr-to-get-mmconfig-for-amd-family-10h-opteron-v3.patch
>
> I don't think any of these patches are affected. They all affect whether
> to use MMCONFIG globally or not, regardless of whether or not particular
> accesses will use it.

What I mean:

gregkh-pci-pci-make-pci-extended-config-space-a-driver-opt-in.patch is
not needed.

And

> > x86-validate-against-acpi-motherboard-resources.patch
> > x86-clear-pci_mmcfg_virt-when-mmcfg-get-rejected.patch
> > x86-mmconf-enable-mcfg-early.patch
> > x86_64-check-msr-to-get-mmconfig-for-amd-family-10h-opteron-v3.patch
need some updates because of the changes made by the "Change pci_raw_ops to
pci_raw_read/write" patch, such as pci_conf1_read becoming static and
unreachable_devices() being removed.

YH


* Re: raw_pci_read in quirk_intel_irqbalance
  2008-02-11  5:04                                                                                   ` Matthew Wilcox
@ 2008-02-11  7:49                                                                                     ` Grant Grundler
  2008-02-11 16:15                                                                                       ` Matthew Wilcox
  0 siblings, 1 reply; 125+ messages in thread
From: Grant Grundler @ 2008-02-11  7:49 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Linus Torvalds, Yinghai Lu, Greg KH, Andrew Morton, Ingo Molnar,
	Tony Camuso, Grant Grundler, Loic Prylli, Adrian Bunk,
	Arjan van de Ven, Benjamin Herrenschmidt, Ivan Kokshaysky,
	Greg KH, linux-kernel, Jeff Garzik, linux-pci, Martin Mares

On Sun, Feb 10, 2008 at 10:04:16PM -0700, Matthew Wilcox wrote:
> > "A disabled or non-existent device's configuration register space is
> > hidden. A disabled or non-existent device will return all ones for reads
> > and will drop writes just as if the cycle terminated with a Master Abort
> > on PCI."
> 
> I'd like to thank Grant for pointing out to me that this is exactly what
> the write immediately above this is doing -- enabling device 8 to
> respond to config space cycles.

welcome.

...
> >From f565b65591a3f90a272b1d511e4ab1728861fe77 Mon Sep 17 00:00:00 2001
> From: Matthew Wilcox <matthew@wil.cx>
> Date: Sun, 10 Feb 2008 23:18:15 -0500
> Subject: [PATCH] Use proper abstractions in quirk_intel_irqbalance
> 
> Since we may not have a pci_dev for the device we need to access, we can't
> use pci_read_config_word.  But raw_pci_read is an internal implementation
> detail; it's better to use the architected pci_bus_read_config_word
> interface.  Using PCI_DEVFN instead of a mysterious constant helps
> reassure everyone that we really do intend to access device 8.
> 
> Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
> ---
>  arch/x86/kernel/quirks.c |    9 ++++++---
>  1 files changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kernel/quirks.c b/arch/x86/kernel/quirks.c
> index 1941482..c47208f 100644
> --- a/arch/x86/kernel/quirks.c
> +++ b/arch/x86/kernel/quirks.c
> @@ -11,7 +11,7 @@
>  static void __devinit quirk_intel_irqbalance(struct pci_dev *dev)
>  {
>  	u8 config, rev;
> -	u32 word;
> +	u16 word;
>  
>  	/* BIOS may enable hardware IRQ balancing for
>  	 * E7520/E7320/E7525(revision ID 0x9 and below)
> @@ -26,8 +26,11 @@ static void __devinit quirk_intel_irqbalance(struct pci_dev *dev)
>  	pci_read_config_byte(dev, 0xf4, &config);
>  	pci_write_config_byte(dev, 0xf4, config|0x2);

Can you also add a comment which points at the Intel documentation?

http://download.intel.com/design/chipsets/datashts/30300702.pdf
Page 34 documents 0xf4 register.

And I just double-checked that the 0xf4 register value is restored later
in the quirk (obvious when you look at the code but not from the patch).

> -	/* read xTPR register */
> -	raw_pci_read(0, 0, 0x40, 0x4c, 2, &word);
> +	/*
> +	 * read xTPR register.  We may not have a pci_dev for device 8
> +	 * because it might be hidden until the above write.
> +	 */
> +	pci_bus_read_config_word(dev->bus, PCI_DEVFN(8, 0), 0x4c, &word);

Yeah, this should work even though we don't have a dev for it.

Acked-by: Grant Grundler <grundler@parisc-linux.org>

thanks,
grant


* Re: raw_pci_read in quirk_intel_irqbalance
  2008-02-11  7:49                                                                                     ` Grant Grundler
@ 2008-02-11 16:15                                                                                       ` Matthew Wilcox
  2008-02-11 17:18                                                                                         ` Linus Torvalds
  0 siblings, 1 reply; 125+ messages in thread
From: Matthew Wilcox @ 2008-02-11 16:15 UTC (permalink / raw)
  To: Grant Grundler
  Cc: Linus Torvalds, Yinghai Lu, Greg KH, Andrew Morton, Ingo Molnar,
	Tony Camuso, Loic Prylli, Adrian Bunk, Arjan van de Ven,
	Benjamin Herrenschmidt, Ivan Kokshaysky, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Martin Mares

On Mon, Feb 11, 2008 at 12:49:54AM -0700, Grant Grundler wrote:
> Can you also add a comment which points at the Intel documentation?
> 
> http://download.intel.com/design/chipsets/datashts/30300702.pdf
> Page 34 documents 0xf4 register.

I'm told that these URLs are not guaranteed to be stable.  And
remembering the pain we had when HP decided to relocate all of their
documents, I'm really not inclined to embed a link to a URL in the
source code.

> And I just doubled checked that the 0xf4 register value is restored later
> in the quirk (obvious when you look at the code but not from the patch).

Yep, I checked that too ;-)

> Acked-by: Grant Grundler <grundler@parisc-linux.org>

Thanks.



* Re: raw_pci_read in quirk_intel_irqbalance
  2008-02-11 16:15                                                                                       ` Matthew Wilcox
@ 2008-02-11 17:18                                                                                         ` Linus Torvalds
  2008-02-11 19:38                                                                                           ` Grant Grundler
  0 siblings, 1 reply; 125+ messages in thread
From: Linus Torvalds @ 2008-02-11 17:18 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Grant Grundler, Yinghai Lu, Greg KH, Andrew Morton, Ingo Molnar,
	Tony Camuso, Loic Prylli, Adrian Bunk, Arjan van de Ven,
	Benjamin Herrenschmidt, Ivan Kokshaysky, Greg KH, linux-kernel,
	Jeff Garzik, linux-pci, Martin Mares



On Mon, 11 Feb 2008, Matthew Wilcox wrote:
> 
> I'm told that these URLs are not guaranteed to be stable.  And
> remembering the pain we had when HP decided to relocate all of their
> documents, I'm really not inclined to embed a link to a URL in the
> source code.

I put it in the commit message, but it wasn't on page 34 when I checked (I
changed it to 69), and I added the name for the datasheet so that if/when
it moves, maybe google can help.

		Linus


* Re: raw_pci_read in quirk_intel_irqbalance
  2008-02-11 17:18                                                                                         ` Linus Torvalds
@ 2008-02-11 19:38                                                                                           ` Grant Grundler
  0 siblings, 0 replies; 125+ messages in thread
From: Grant Grundler @ 2008-02-11 19:38 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Matthew Wilcox, Grant Grundler, Yinghai Lu, Greg KH,
	Andrew Morton, Ingo Molnar, Tony Camuso, Loic Prylli,
	Adrian Bunk, Arjan van de Ven, Benjamin Herrenschmidt,
	Ivan Kokshaysky, Greg KH, linux-kernel, Jeff Garzik, linux-pci,
	Martin Mares

On Mon, Feb 11, 2008 at 09:18:49AM -0800, Linus Torvalds wrote:
> I put it in the commit message, but it wasn't on page 34 when I checked (I 
> changed it to 69),

Sorry - page 34 was just the first reference to "Extended Configuration
Registers" when I originally scrounged up the info for willy.
Page 69 is in fact what I wanted to point at ("DEVPRES1" reg).

> and I added the name for the datasheet so that if/when 
> it moves, maybe google can help.

It should. But doing a quick check now only shows one other copy
(in .es domain :) when searching for "30300702.pdf". 

Searching for the full document title results in several intel.com
locations and lots of other misc references that don't look quite right.
Many of those just reference the "product brief" and not the data sheet.

yahoo.com gives similar results.

thanks,
grant



> 
> 		Linus

* Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
  2008-02-11  1:49                                                                                 ` [PATCH] Change pci_raw_ops to pci_raw_read/write Yinghai Lu
  2008-02-11  2:53                                                                                   ` Robert Hancock
@ 2008-02-11 22:10                                                                                   ` Andrew Morton
  2008-02-11 22:38                                                                                     ` Ingo Molnar
  1 sibling, 1 reply; 125+ messages in thread
From: Andrew Morton @ 2008-02-11 22:10 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: matthew, arjan, hancockr, torvalds, greg, mingo, tcamuso,
	grundler, loic, bunk, benh, ink, gregkh, linux-kernel, jeff,
	linux-pci, mj

On Sun, 10 Feb 2008 17:49:34 -0800
"Yinghai Lu" <yhlu.kernel@gmail.com> wrote:

> On Feb 10, 2008 12:45 PM, Matthew Wilcox <matthew@wil.cx> wrote:
> >
> > On Sun, Feb 10, 2008 at 12:24:18PM -0800, Linus Torvalds wrote:
> > > On Sun, 10 Feb 2008, Yinghai Lu wrote:
> > > > >
> > > > > I suggest Ivan's patch be merged ASAP as it actually fixes bugs.
> > > > > This patch is just cleanup (and takes care of some future concerns).
> > > >
> > > > your patch and Ivan's patch should be merged in one...
> > >
> > > I really don't care whether they get merged as one or separately, but I
> > > think it should be merged _now_ (-rc1 is already delayed), and I'd like to
> > > see the final versions of both. Does anybody have them in a final
> > > agreed-upon format (preferably with that oddness in quirk_intel_irqbalance
> > > also fixed?)
> >
> > I just looked at fixing that -- the reason seems to be that we don't
> > actually have the struct pci_dev at that point.  I can fix it, but I
> > think it's actually buggy.  I want to look at some chipset docs to
> > confirm that though.
> >
> > I've attached the two patches that I believe are the ones we want.  We
> > can (and should) fix quirk_intel_irqbalance separately.
> 
> Andrew,
> 
> those two patches just got into Linus's 2.6.25-rc1.
> 
> I assume that you will drop
> gregkh-pci-pci-make-pci-extended-config-space-a-driver-opt-in.patch in
> -mm.

That's no longer in Greg's tree.

> please check some updated patches in -mm that could be affected. hope
> it could save you some time
> 
> x86-validate-against-acpi-motherboard-resources.patch
> x86-clear-pci_mmcfg_virt-when-mmcfg-get-rejected.patch
> x86-mmconf-enable-mcfg-early.patch
> x86_64-check-msr-to-get-mmconfig-for-amd-family-10h-opteron-v3.patch

I have unhappy feelings here - the patches seem to be churning a bit
and when I last sent them to Greg and Ingo they received no apparent
response.

So I think I'll just drop all four.  Please redo, retest and fully
resubmit, thanks.

And we need to work out who owns these patches.  Are they rightly part of
the PCI tree, or of the x86 tree?


* Re: [PATCH] Change pci_raw_ops to pci_raw_read/write
  2008-02-11 22:10                                                                                   ` Andrew Morton
@ 2008-02-11 22:38                                                                                     ` Ingo Molnar
  0 siblings, 0 replies; 125+ messages in thread
From: Ingo Molnar @ 2008-02-11 22:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Yinghai Lu, matthew, arjan, hancockr, torvalds, greg, tcamuso,
	grundler, loic, bunk, benh, ink, gregkh, linux-kernel, jeff,
	linux-pci, mj


* Andrew Morton <akpm@linux-foundation.org> wrote:

> > please check some updated patches in -mm that could be affected. 
> > hope it could save you some time
> > 
> > x86-validate-against-acpi-motherboard-resources.patch
> > x86-clear-pci_mmcfg_virt-when-mmcfg-get-rejected.patch
> > x86-mmconf-enable-mcfg-early.patch
> > x86_64-check-msr-to-get-mmconfig-for-amd-family-10h-opteron-v3.patch
> 
> I have unhappy feelings here - the patches seem to be churning a bit 
> and when I last sent them to Greg and Ingo they received no apparent 
> response.

i actually carried them for a while and 
validate-against-acpi-motherboard-resources.patch got a fair bit of test 
time with positive results. So it has a clear ACK from me.

It's something that looks appealing:

| This patch adds validation of the MMCONFIG table against the ACPI 
| reserved motherboard resources. If the MMCONFIG table is found to be 
| reserved in ACPI, we don't bother checking the E820 table.  The PCI 
| Express firmware spec apparently tells BIOS developers that 
| reservation in ACPI is required and E820 reservation is optional, so 
| checking against ACPI first makes sense.  Many BIOSes don't reserve 
| the MMCONFIG region in E820 even though it is perfectly functional, 
| so the existing check needlessly disables MMCONFIG in these cases.
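
The ordering described in that changelog (trust an ACPI reservation first,
consult E820 only when ACPI does not cover the region) can be sketched
roughly as below. This is a minimal illustration, not the code from the
patch: struct region, covered_by() and mmconfig_window_usable() are
hypothetical names, and the reserved ranges are assumed to have been
collected into plain arrays already.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical half-open physical address range [start, end). */
struct region {
	uint64_t start;
	uint64_t end;
};

/* True if `inner` lies entirely within one of the `n` ranges in `set`. */
static bool covered_by(const struct region *set, size_t n, struct region inner)
{
	for (size_t i = 0; i < n; i++) {
		if (inner.start >= set[i].start && inner.end <= set[i].end)
			return true;
	}
	return false;
}

/*
 * Decide whether an MMCONFIG window should be trusted.
 * ACPI motherboard resources are checked first; the E820 reserved
 * ranges are only consulted when ACPI does not cover the window.
 */
static bool mmconfig_window_usable(struct region mmcfg,
				   const struct region *acpi, size_t n_acpi,
				   const struct region *e820, size_t n_e820)
{
	if (mmcfg.start >= mmcfg.end)
		return false;		/* empty or inverted window */
	if (covered_by(acpi, n_acpi, mmcfg))
		return true;		/* reserved in ACPI: skip E820 */
	return covered_by(e820, n_e820, mmcfg);
}
```

With this ordering, a window that ACPI reserves is accepted even when E820
says nothing about it, which is the BIOS behaviour the changelog calls out.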

anything that isolates Linux from BIOS messups should be music to our 
ears.

i also think the mmconf-enable stuff for Barcelona from Yinghai, 
albeit not particularly pretty, is probably good too for similar 
reasons. It makes the kernel boot with noacpi, which is a good sign IMO. 

I have testsystems that simply do not boot with ACPI turned off - and i 
have a testsystem that locks up hard if it takes an NMI in certain ACPI 
AML sequences ... Just Because.

So i'd ACK them just on general principle - earlier versions of the 
patches were carried in x86.git and caused no particular problems.

but ... then we got complaints from you that stuff collides and that 
such patches should be carried in your or Greg's tree, so we dropped 
them. And there was another 100 KLOC of x86 code to worry about ;-)

So i'd suggest to send those patches upstream, they are system enablers 
and they are at fundamental enough places to be apparent if they cause 
any breakage i think.

	Ingo

end of thread

Thread overview: 125+ messages
2007-12-25 11:26 [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in Arjan van de Ven
2007-12-27 11:52 ` Jeff Garzik
2007-12-27 14:09   ` Arjan van de Ven
2007-12-27 17:52   ` Linus Torvalds
2008-01-11 19:02 ` Greg KH
2008-01-11 19:09   ` Arjan van de Ven
2008-01-11 19:14     ` Greg KH
2008-01-11 19:28   ` Matthew Wilcox
2008-01-11 19:40     ` Arjan van de Ven
2008-01-11 19:45       ` Greg KH
2008-01-11 19:49         ` Matthew Wilcox
2008-01-11 19:58           ` Linus Torvalds
2008-01-11 20:17             ` Matthew Wilcox
2008-01-11 20:27               ` Linus Torvalds
2008-01-11 20:42                 ` Matthew Wilcox
2008-01-11 21:12                   ` Linus Torvalds
2008-01-11 21:17                     ` Matthew Wilcox
2008-01-11 21:28                       ` Linus Torvalds
2008-01-11 21:38                         ` Matthew Wilcox
2008-01-11 23:58                           ` Ivan Kokshaysky
2008-01-12  0:17                             ` Jesse Barnes
2008-01-12  0:26                             ` Greg KH
2008-01-12 14:40                               ` Ivan Kokshaysky
2008-01-12 15:46                                 ` Arjan van de Ven
2008-01-12 16:23                                   ` Ivan Kokshaysky
2008-01-12 17:45                                 ` Arjan van de Ven
2008-01-12 18:17                                   ` Matthew Wilcox
2008-01-12 21:49                                   ` Ivan Kokshaysky
2008-01-12 23:01                                     ` Arjan van de Ven
2008-01-13  0:12                                       ` Tony Camuso
2008-01-13  0:40                                         ` Arjan van de Ven
2008-01-13  1:36                                           ` Tony Camuso
2008-01-13  4:42                                             ` Arjan van de Ven
2008-01-13  4:47                                               ` Matthew Wilcox
2008-01-13  6:43                                                 ` Jeff Garzik
2008-01-13 12:43                                               ` Tony Camuso
2008-01-13 17:03                                                 ` Arjan van de Ven
2008-01-13 21:28                                                   ` Tony Camuso
2008-01-14  0:54                                                     ` Alan Cox
2008-01-14  1:33                                                       ` Arjan van de Ven
2008-01-14  3:29                                                         ` Tony Camuso
2008-01-14  5:05                                                           ` Arjan van de Ven
2008-01-14 13:01                                                             ` Tony Camuso
2008-01-14 14:46                                                               ` Arjan van de Ven
2008-01-14 15:23                                                                 ` Tony Camuso
2008-01-14 16:01                                                                   ` Arjan van de Ven
2008-01-14 16:08                                                                     ` Tony Camuso
2008-01-14  9:11                                                         ` Alan Cox
2008-01-14  5:20                                                       ` Linus Torvalds
2008-01-13 18:23                                   ` Loic Prylli
2008-01-13 18:41                                     ` Arjan van de Ven
2008-01-13 20:43                                       ` Matthew Wilcox
2008-01-13 21:18                                         ` Loic Prylli
2008-01-13 20:51                                       ` Loic Prylli
2008-01-13  7:08                                 ` Benjamin Herrenschmidt
2008-01-13  7:24                                   ` Matthew Wilcox
2008-01-13  7:58                                     ` Matthew Wilcox
2008-01-13 17:01                                     ` Arjan van de Ven
2008-01-14 22:52                                       ` Matthew Wilcox
2008-01-14 23:04                                         ` Adrian Bunk
2008-01-15 16:00                                           ` Loic Prylli
2008-01-15 17:46                                             ` Greg KH
2008-01-15 17:56                                               ` Matthew Wilcox
2008-01-15 19:27                                                 ` Tony Camuso
2008-01-15 19:38                                                   ` Linus Torvalds
2008-01-15 19:40                                                     ` Matthew Wilcox
2008-01-15 22:12                                                     ` Loic Prylli
2008-01-19 16:58                                                 ` Grant Grundler
2008-01-28 18:32                                                   ` Tony Camuso
2008-01-28 20:44                                                     ` Greg KH
2008-01-28 22:31                                                       ` Matthew Wilcox
2008-01-28 22:53                                                         ` Greg KH
2008-01-29  2:56                                                           ` Matthew Wilcox
2008-01-29  2:57                                                             ` PCI x86: always use conf1 to access config space below 256 bytes Matthew Wilcox
2008-01-29 13:21                                                               ` Greg KH
2008-01-29 23:43                                                                 ` Matthew Wilcox
2008-01-30  0:04                                                                   ` Linus Torvalds
2008-01-29  3:03                                                             ` [PATCH] Change pci_raw_ops to pci_raw_read/write Matthew Wilcox
2008-02-03  7:30                                                               ` Yinghai Lu
2008-02-07 15:54                                                                 ` Tony Camuso
2008-02-07 16:28                                                                   ` Arjan van de Ven
2008-02-07 16:36                                                                     ` Tony Camuso
2008-02-08  2:28                                                                       ` Grant Grundler
2008-02-09 12:41                                                                   ` Matthew Wilcox
2008-02-10  6:25                                                                     ` Yinghai Lu
2008-02-10  7:21                                                                       ` Greg KH
2008-02-10 14:51                                                                         ` Matthew Wilcox
2008-02-10 19:13                                                                           ` Grant Grundler
2008-02-10 19:37                                                                             ` Matthew Wilcox
2008-02-10 20:16                                                                           ` Yinghai Lu
2008-02-10 20:19                                                                             ` Matthew Wilcox
2008-02-10 20:25                                                                               ` Yinghai Lu
2008-02-10 20:32                                                                                 ` Matthew Wilcox
2008-02-10 20:47                                                                                   ` Yinghai Lu
2008-02-10 20:24                                                                             ` Linus Torvalds
2008-02-10 20:45                                                                               ` Matthew Wilcox
2008-02-10 23:02                                                                                 ` raw_pci_read in quirk_intel_irqbalance Matthew Wilcox
2008-02-11  5:04                                                                                   ` Matthew Wilcox
2008-02-11  7:49                                                                                     ` Grant Grundler
2008-02-11 16:15                                                                                       ` Matthew Wilcox
2008-02-11 17:18                                                                                         ` Linus Torvalds
2008-02-11 19:38                                                                                           ` Grant Grundler
2008-02-11  1:49                                                                                 ` [PATCH] Change pci_raw_ops to pci_raw_read/write Yinghai Lu
2008-02-11  2:53                                                                                   ` Robert Hancock
2008-02-11  5:59                                                                                     ` Yinghai Lu
2008-02-11 22:10                                                                                   ` Andrew Morton
2008-02-11 22:38                                                                                     ` Ingo Molnar
2008-01-29  3:05                                                       ` [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in Arjan van de Ven
2008-01-29  3:18                                                         ` Matthew Wilcox
2008-01-29 13:19                                                           ` Greg KH
2008-01-29 14:15                                                             ` Tony Camuso
2008-01-29 14:47                                                               ` Arjan van de Ven
2008-01-29 15:15                                                                 ` Tony Camuso
2008-01-29 15:29                                                                   ` Arjan van de Ven
2008-01-29 16:26                                                                     ` Tony Camuso
2008-01-29 23:57                                                                     ` Matthew Wilcox
2008-01-30  2:30                                                                       ` Tony Camuso
2008-01-30  3:45                                                             ` Matthew Wilcox
2008-01-30 15:15                                                               ` Ivan Kokshaysky
2008-01-30 15:42                                                                 ` Arjan van de Ven
2008-01-30 20:14                                                                   ` Ivan Kokshaysky
2008-01-31  5:51                                                             ` Jesse Barnes
2008-01-11 19:54   ` Arjan van de Ven
2008-01-11 20:55     ` Greg KH
2008-01-15 12:58 ` Øyvind Vågen Jægtnes
