LKML Archive on lore.kernel.org
* [PATCH v2 0/8] PCI support for UML
@ 2021-03-01 15:07 Johannes Berg
  2021-03-01 15:07 ` [PATCH v2 1/8] um: allow disabling NO_IOMEM Johannes Berg
                   ` (7 more replies)
  0 siblings, 8 replies; 11+ messages in thread
From: Johannes Berg @ 2021-03-01 15:07 UTC (permalink / raw)
  To: linux-um; +Cc: Arnd Bergmann, linux-kernel

Hi,

Changes since v1:
 - fix a memory leak in the PCI code
 - fix race in interrupt handling
 - fix checks in interrupt handling
 - use asm-generic for fb.h and vga.h
 - rediff against v5.12-rc1
 - export signals_enabled directly


Original description:

In order to simulate some devices and write tests completely
independent of real PCI devices, we continued the development
of time-travel and related bits, and are adding PCI support
here now.

The way it works is that it communicates with the outside (of
UML) with virtio, which we previously added using vhost-user,
and then offers a PCI bus to the inside system, where normal
PCI probing etc. happens, but all config space & IO accesses
are forwarded over virtio.

To enable that, add lib/logic_iomem, similar to logic_pio but
for iomem regions. This way, ioread/iowrite can be redirected
over the virtio device.

Since no official virtio device ID has been assigned yet, a
Kconfig option must be set to the value you want to use
locally for experimentation. Once we have an official value
we can change the default (currently -1, which makes it
non-functional) or remove the option entirely.

johannes






* [PATCH v2 1/8] um: allow disabling NO_IOMEM
  2021-03-01 15:07 [PATCH v2 0/8] PCI support for UML Johannes Berg
@ 2021-03-01 15:07 ` Johannes Berg
  2021-03-01 15:07 ` [PATCH v2 2/8] lib: add iomem emulation (logic_iomem) Johannes Berg
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Johannes Berg @ 2021-03-01 15:07 UTC (permalink / raw)
  To: linux-um; +Cc: Arnd Bergmann, linux-kernel, Johannes Berg

From: Johannes Berg <johannes.berg@intel.com>

Adjust the Kconfig a little to allow disabling NO_IOMEM in UML. To
make an "allyesconfig" with CONFIG_NO_IOMEM=n build, adjust a few
Kconfig things elsewhere and use the asm-generic versions of
asm/fb.h and asm/vga.h.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
---
v2: use asm-generic for fb.h and vga.h
---
 arch/um/Kconfig                | 4 ++++
 arch/um/include/asm/Kbuild     | 2 ++
 drivers/input/Kconfig          | 1 -
 drivers/input/gameport/Kconfig | 1 +
 drivers/input/joystick/Kconfig | 1 +
 drivers/tty/Kconfig            | 5 ++---
 drivers/video/console/Kconfig  | 2 +-
 7 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/arch/um/Kconfig b/arch/um/Kconfig
index c3030db3325f..20b0640e01b8 100644
--- a/arch/um/Kconfig
+++ b/arch/um/Kconfig
@@ -26,6 +26,10 @@ config MMU
 	default y
 
 config NO_IOMEM
+	bool "disable IOMEM" if EXPERT
+	default y
+
+config NO_IOPORT_MAP
 	def_bool y
 
 config ISA
diff --git a/arch/um/include/asm/Kbuild b/arch/um/include/asm/Kbuild
index d7492e5a1bbb..10b7228b3aee 100644
--- a/arch/um/include/asm/Kbuild
+++ b/arch/um/include/asm/Kbuild
@@ -7,6 +7,7 @@ generic-y += device.h
 generic-y += emergency-restart.h
 generic-y += exec.h
 generic-y += extable.h
+generic-y += fb.h
 generic-y += ftrace.h
 generic-y += futex.h
 generic-y += hw_irq.h
@@ -27,3 +28,4 @@ generic-y += trace_clock.h
 generic-y += word-at-a-time.h
 generic-y += kprobes.h
 generic-y += mm_hooks.h
+generic-y += vga.h
diff --git a/drivers/input/Kconfig b/drivers/input/Kconfig
index ec0e861f185f..5baebf62df33 100644
--- a/drivers/input/Kconfig
+++ b/drivers/input/Kconfig
@@ -4,7 +4,6 @@
 #
 
 menu "Input device support"
-	depends on !UML
 
 config INPUT
 	tristate "Generic input layer (needed for keyboard, mouse, ...)" if EXPERT
diff --git a/drivers/input/gameport/Kconfig b/drivers/input/gameport/Kconfig
index 4761795cb49f..5a2c2fb3217d 100644
--- a/drivers/input/gameport/Kconfig
+++ b/drivers/input/gameport/Kconfig
@@ -4,6 +4,7 @@
 #
 config GAMEPORT
 	tristate "Gameport support"
+	depends on !UML
 	help
 	  Gameport support is for the standard 15-pin PC gameport. If you
 	  have a joystick, gamepad, gameport card, a soundcard with a gameport
diff --git a/drivers/input/joystick/Kconfig b/drivers/input/joystick/Kconfig
index 5e38899058c1..b10948bf9551 100644
--- a/drivers/input/joystick/Kconfig
+++ b/drivers/input/joystick/Kconfig
@@ -4,6 +4,7 @@
 #
 menuconfig INPUT_JOYSTICK
 	bool "Joysticks/Gamepads"
+	depends on !UML
 	help
 	  If you have a joystick, 6dof controller, gamepad, steering wheel,
 	  weapon control system or something like that you can say Y here
diff --git a/drivers/tty/Kconfig b/drivers/tty/Kconfig
index e15cd6b5bb99..26648daaaee2 100644
--- a/drivers/tty/Kconfig
+++ b/drivers/tty/Kconfig
@@ -12,9 +12,8 @@ if TTY
 
 config VT
 	bool "Virtual terminal" if EXPERT
-	depends on !UML
 	select INPUT
-	default y
+	default y if !UML
 	help
 	  If you say Y here, you will get support for terminal devices with
 	  display and keyboard devices. These are called "virtual" because you
@@ -78,7 +77,7 @@ config VT_CONSOLE_SLEEP
 
 config HW_CONSOLE
 	bool
-	depends on VT && !UML
+	depends on VT
 	default y
 
 config VT_HW_CONSOLE_BINDING
diff --git a/drivers/video/console/Kconfig b/drivers/video/console/Kconfig
index ee33b8ec62bb..840d9813b0bc 100644
--- a/drivers/video/console/Kconfig
+++ b/drivers/video/console/Kconfig
@@ -9,7 +9,7 @@ config VGA_CONSOLE
 	bool "VGA text console" if EXPERT || !X86
 	depends on !4xx && !PPC_8xx && !SPARC && !M68K && !PARISC &&  !SUPERH && \
 		(!ARM || ARCH_FOOTBRIDGE || ARCH_INTEGRATOR || ARCH_NETWINDER) && \
-		!ARM64 && !ARC && !MICROBLAZE && !OPENRISC && !NDS32 && !S390
+		!ARM64 && !ARC && !MICROBLAZE && !OPENRISC && !NDS32 && !S390 && !UML
 	default y
 	help
 	  Saying Y here will allow you to use Linux in text mode through a
-- 
2.26.2



* [PATCH v2 2/8] lib: add iomem emulation (logic_iomem)
  2021-03-01 15:07 [PATCH v2 0/8] PCI support for UML Johannes Berg
  2021-03-01 15:07 ` [PATCH v2 1/8] um: allow disabling NO_IOMEM Johannes Berg
@ 2021-03-01 15:07 ` Johannes Berg
  2021-03-01 15:07 ` [PATCH v2 3/8] um: remove unused smp_sigio_handler() declaration Johannes Berg
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Johannes Berg @ 2021-03-01 15:07 UTC (permalink / raw)
  To: linux-um; +Cc: Arnd Bergmann, linux-kernel, Johannes Berg

From: Johannes Berg <johannes.berg@intel.com>

Add IO memory emulation that uses callbacks for read/write to
the allocated regions. Users register the callbacks with
logic_iomem_add_region().

To use it, an architecture must 'select INDIRECT_IOMEM' in Kconfig
and then include <asm-generic/logic_io.h> into asm/io.h to get
the __raw_read*/__raw_write* functions.

Optionally, an architecture may 'select INDIRECT_IOMEM_FALLBACK',
in which case non-emulated regions will 'fall back' to the
various real_* functions that must then be provided.

Cc: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
---
 include/asm-generic/logic_io.h |  78 ++++++++
 include/linux/logic_iomem.h    |  62 +++++++
 lib/Kconfig                    |  14 ++
 lib/Makefile                   |   2 +
 lib/logic_iomem.c              | 318 +++++++++++++++++++++++++++++++++
 5 files changed, 474 insertions(+)
 create mode 100644 include/asm-generic/logic_io.h
 create mode 100644 include/linux/logic_iomem.h
 create mode 100644 lib/logic_iomem.c

diff --git a/include/asm-generic/logic_io.h b/include/asm-generic/logic_io.h
new file mode 100644
index 000000000000..a53116b8c57e
--- /dev/null
+++ b/include/asm-generic/logic_io.h
@@ -0,0 +1,78 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2021 Intel Corporation
+ * Author: johannes@sipsolutions.net
+ */
+#ifndef _LOGIC_IO_H
+#define _LOGIC_IO_H
+#include <linux/types.h>
+
+/* include this file into asm/io.h */
+
+#ifdef CONFIG_INDIRECT_IOMEM
+
+#ifdef CONFIG_INDIRECT_IOMEM_FALLBACK
+/*
+ * If you want emulated IO memory to fall back to 'normal' IO memory
+ * if a region wasn't registered as emulated, then you need to have
+ * all of the real_* functions implemented.
+ */
+#if !defined(real_ioremap) || !defined(real_iounmap) || \
+    !defined(real_raw_readb) || !defined(real_raw_writeb) || \
+    !defined(real_raw_readw) || !defined(real_raw_writew) || \
+    !defined(real_raw_readl) || !defined(real_raw_writel) || \
+    (defined(CONFIG_64BIT) && \
+     (!defined(real_raw_readq) || !defined(real_raw_writeq))) || \
+    !defined(real_memset_io) || \
+    !defined(real_memcpy_fromio) || \
+    !defined(real_memcpy_toio)
+#error "Must provide fallbacks for real IO memory access"
+#endif /* defined ... */
+#endif /* CONFIG_INDIRECT_IOMEM_FALLBACK */
+
+#define ioremap ioremap
+void __iomem *ioremap(phys_addr_t offset, size_t size);
+
+#define iounmap iounmap
+void iounmap(void __iomem *addr);
+
+#define __raw_readb __raw_readb
+u8 __raw_readb(const volatile void __iomem *addr);
+
+#define __raw_readw __raw_readw
+u16 __raw_readw(const volatile void __iomem *addr);
+
+#define __raw_readl __raw_readl
+u32 __raw_readl(const volatile void __iomem *addr);
+
+#ifdef CONFIG_64BIT
+#define __raw_readq __raw_readq
+u64 __raw_readq(const volatile void __iomem *addr);
+#endif /* CONFIG_64BIT */
+
+#define __raw_writeb __raw_writeb
+void __raw_writeb(u8 value, volatile void __iomem *addr);
+
+#define __raw_writew __raw_writew
+void __raw_writew(u16 value, volatile void __iomem *addr);
+
+#define __raw_writel __raw_writel
+void __raw_writel(u32 value, volatile void __iomem *addr);
+
+#ifdef CONFIG_64BIT
+#define __raw_writeq __raw_writeq
+void __raw_writeq(u64 value, volatile void __iomem *addr);
+#endif /* CONFIG_64BIT */
+
+#define memset_io memset_io
+void memset_io(volatile void __iomem *addr, int value, size_t size);
+
+#define memcpy_fromio memcpy_fromio
+void memcpy_fromio(void *buffer, const volatile void __iomem *addr,
+		   size_t size);
+
+#define memcpy_toio memcpy_toio
+void memcpy_toio(volatile void __iomem *addr, const void *buffer, size_t size);
+
+#endif /* CONFIG_INDIRECT_IOMEM */
+#endif /* _LOGIC_IO_H */
diff --git a/include/linux/logic_iomem.h b/include/linux/logic_iomem.h
new file mode 100644
index 000000000000..3fa65c964379
--- /dev/null
+++ b/include/linux/logic_iomem.h
@@ -0,0 +1,62 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2021 Intel Corporation
+ * Author: johannes@sipsolutions.net
+ */
+#ifndef __LOGIC_IOMEM_H
+#define __LOGIC_IOMEM_H
+#include <linux/types.h>
+#include <linux/ioport.h>
+
+/**
+ * struct logic_iomem_ops - emulated IO memory ops
+ * @read: read an 8, 16, 32 or 64 bit quantity from the given offset,
+ *	size is given in bytes (1, 2, 4 or 8)
+ *	(64-bit only necessary if CONFIG_64BIT is set)
+ * @write: write an 8, 16, 32 or 64 bit quantity to the given offset,
+ *	size is given in bytes (1, 2, 4 or 8)
+ *	(64-bit only necessary if CONFIG_64BIT is set)
+ * @set: optional, for memset_io()
+ * @copy_from: optional, for memcpy_fromio()
+ * @copy_to: optional, for memcpy_toio()
+ * @unmap: optional, this region is getting unmapped
+ */
+struct logic_iomem_ops {
+	unsigned long (*read)(void *priv, unsigned int offset, int size);
+	void (*write)(void *priv, unsigned int offset, int size,
+		      unsigned long val);
+
+	void (*set)(void *priv, unsigned int offset, u8 value, int size);
+	void (*copy_from)(void *priv, void *buffer, unsigned int offset,
+			  int size);
+	void (*copy_to)(void *priv, unsigned int offset, const void *buffer,
+			int size);
+
+	void (*unmap)(void *priv);
+};
+
+/**
+ * struct logic_iomem_region_ops - ops for an IO memory handler
+ * @map: map a range in the registered IO memory region, must
+ *	fill *ops with the ops and may fill *priv to be passed
+ *	to the ops. The offset is given as the offset into the
+ *	registered resource region.
+ *	The return value is negative for errors, or >= 0 for
+ *	success. On success, the return value is added to the
+ *	offset for later ops, to allow for partial mappings.
+ */
+struct logic_iomem_region_ops {
+	long (*map)(unsigned long offset, size_t size,
+		    const struct logic_iomem_ops **ops,
+		    void **priv);
+};
+
+/**
+ * logic_iomem_add_region - register an IO memory region
+ * @resource: the resource description for this region
+ * @ops: the IO memory mapping ops for this resource
+ */
+int logic_iomem_add_region(struct resource *resource,
+			   const struct logic_iomem_region_ops *ops);
+
+#endif /* __LOGIC_IOMEM_H */
diff --git a/lib/Kconfig b/lib/Kconfig
index a38cc61256f1..04684ae9593d 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -102,6 +102,20 @@ config INDIRECT_PIO
 
 	  When in doubt, say N.
 
+config INDIRECT_IOMEM
+	bool
+	help
+	  This is selected by other options/architectures to provide the
+	  emulated iomem accessors.
+
+config INDIRECT_IOMEM_FALLBACK
+	bool
+	depends on INDIRECT_IOMEM
+	help
+	  If INDIRECT_IOMEM is selected, this enables falling back to plain
+	  mmio accesses when the IO memory address is not a registered
+	  emulated region.
+
 config CRC_CCITT
 	tristate "CRC-CCITT functions"
 	help
diff --git a/lib/Makefile b/lib/Makefile
index b5307d3eec1a..f50b1b5d2391 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -147,6 +147,8 @@ obj-$(CONFIG_DEBUG_LOCKING_API_SELFTESTS) += locking-selftest.o
 
 lib-y += logic_pio.o
 
+lib-$(CONFIG_INDIRECT_IOMEM) += logic_iomem.o
+
 obj-$(CONFIG_GENERIC_HWEIGHT) += hweight.o
 
 obj-$(CONFIG_BTREE) += btree.o
diff --git a/lib/logic_iomem.c b/lib/logic_iomem.c
new file mode 100644
index 000000000000..b76b92dd0f1f
--- /dev/null
+++ b/lib/logic_iomem.c
@@ -0,0 +1,318 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2021 Intel Corporation
+ * Author: Johannes Berg <johannes@sipsolutions.net>
+ */
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/logic_iomem.h>
+
+struct logic_iomem_region {
+	const struct resource *res;
+	const struct logic_iomem_region_ops *ops;
+	struct list_head list;
+};
+
+struct logic_iomem_area {
+	const struct logic_iomem_ops *ops;
+	void *priv;
+};
+
+#define AREA_SHIFT	24
+#define MAX_AREA_SIZE	(1 << AREA_SHIFT)
+#define MAX_AREAS	((1ULL<<32) / MAX_AREA_SIZE)
+#define AREA_BITS	((MAX_AREAS - 1) << AREA_SHIFT)
+#define AREA_MASK	(MAX_AREA_SIZE - 1)
+#ifdef CONFIG_64BIT
+#define IOREMAP_BIAS	0xDEAD000000000000UL
+#define IOREMAP_MASK	0xFFFFFFFF00000000UL
+#else
+#define IOREMAP_BIAS	0
+#define IOREMAP_MASK	0
+#endif
+
+static DEFINE_MUTEX(regions_mtx);
+static LIST_HEAD(regions_list);
+static struct logic_iomem_area mapped_areas[MAX_AREAS];
+
+int logic_iomem_add_region(struct resource *resource,
+			   const struct logic_iomem_region_ops *ops)
+{
+	struct logic_iomem_region *rreg;
+	int err;
+
+	if (WARN_ON(!resource || !ops))
+		return -EINVAL;
+
+	if (WARN_ON((resource->flags & IORESOURCE_TYPE_BITS) != IORESOURCE_MEM))
+		return -EINVAL;
+
+	rreg = kzalloc(sizeof(*rreg), GFP_KERNEL);
+	if (!rreg)
+		return -ENOMEM;
+
+	err = request_resource(&iomem_resource, resource);
+	if (err) {
+		kfree(rreg);
+		return -ENOMEM;
+	}
+
+	mutex_lock(&regions_mtx);
+	rreg->res = resource;
+	rreg->ops = ops;
+	list_add_tail(&rreg->list, &regions_list);
+	mutex_unlock(&regions_mtx);
+
+	return 0;
+}
+EXPORT_SYMBOL(logic_iomem_add_region);
+
+#ifndef CONFIG_INDIRECT_IOMEM_FALLBACK
+static void __iomem *real_ioremap(phys_addr_t offset, size_t size)
+{
+	WARN(1, "invalid ioremap(0x%llx, 0x%zx)\n",
+	     (unsigned long long)offset, size);
+	return NULL;
+}
+
+static void real_iounmap(void __iomem *addr)
+{
+	WARN(1, "invalid iounmap for addr 0x%llx\n",
+	     (unsigned long long)addr);
+}
+#endif /* CONFIG_INDIRECT_IOMEM_FALLBACK */
+
+void __iomem *ioremap(phys_addr_t offset, size_t size)
+{
+	void __iomem *ret = NULL;
+	struct logic_iomem_region *rreg, *found = NULL;
+	int i;
+
+	mutex_lock(&regions_mtx);
+	list_for_each_entry(rreg, &regions_list, list) {
+		if (rreg->res->start > offset)
+			continue;
+		if (rreg->res->end < offset + size - 1)
+			continue;
+		found = rreg;
+		break;
+	}
+
+	if (!found)
+		goto out;
+
+	for (i = 0; i < MAX_AREAS; i++) {
+		long offs;
+
+		if (mapped_areas[i].ops)
+			continue;
+
+		offs = found->ops->map(offset - found->res->start,
+				      size, &mapped_areas[i].ops,
+				      &mapped_areas[i].priv);
+		if (offs < 0) {
+			mapped_areas[i].ops = NULL;
+			break;
+		}
+
+		if (WARN_ON(!mapped_areas[i].ops)) {
+			mapped_areas[i].ops = NULL;
+			break;
+		}
+
+		ret = (void __iomem *)(IOREMAP_BIAS + (i << AREA_SHIFT) + offs);
+		break;
+	}
+out:
+	mutex_unlock(&regions_mtx);
+	if (ret)
+		return ret;
+	return real_ioremap(offset, size);
+}
+EXPORT_SYMBOL(ioremap);
+
+static inline struct logic_iomem_area *
+get_area(const volatile void __iomem *addr)
+{
+	unsigned long a = (unsigned long)addr;
+	unsigned int idx;
+
+	if (WARN_ON((a & IOREMAP_MASK) != IOREMAP_BIAS))
+		return NULL;
+
+	idx = (a & AREA_BITS) >> AREA_SHIFT;
+
+	if (mapped_areas[idx].ops)
+		return &mapped_areas[idx];
+
+	return NULL;
+}
+
+void iounmap(void __iomem *addr)
+{
+	struct logic_iomem_area *area = get_area(addr);
+
+	if (!area) {
+		real_iounmap(addr);
+		return;
+	}
+
+	if (area->ops->unmap)
+		area->ops->unmap(area->priv);
+
+	mutex_lock(&regions_mtx);
+	area->ops = NULL;
+	area->priv = NULL;
+	mutex_unlock(&regions_mtx);
+}
+EXPORT_SYMBOL(iounmap);
+
+#ifndef CONFIG_INDIRECT_IOMEM_FALLBACK
+#define MAKE_FALLBACK(op, sz) 						\
+static u##sz real_raw_read ## op(const volatile void __iomem *addr)	\
+{									\
+	WARN(1, "Invalid read" #op " at address %llx\n",		\
+	     (unsigned long long)addr);					\
+	return (u ## sz)~0ULL;						\
+}									\
+									\
+void real_raw_write ## op(u ## sz val, volatile void __iomem *addr)	\
+{									\
+	WARN(1, "Invalid write" #op " of 0x%llx at address %llx\n",	\
+	     (unsigned long long)val, (unsigned long long)addr);	\
+}									\
+
+MAKE_FALLBACK(b, 8);
+MAKE_FALLBACK(w, 16);
+MAKE_FALLBACK(l, 32);
+#ifdef CONFIG_64BIT
+MAKE_FALLBACK(q, 64);
+#endif
+
+static void real_memset_io(volatile void __iomem *addr, int value, size_t size)
+{
+	WARN(1, "Invalid memset_io at address 0x%llx\n",
+	     (unsigned long long)addr);
+}
+
+static void real_memcpy_fromio(void *buffer, const volatile void __iomem *addr,
+			       size_t size)
+{
+	WARN(1, "Invalid memcpy_fromio at address 0x%llx\n",
+	     (unsigned long long)addr);
+
+	memset(buffer, 0xff, size);
+}
+
+static void real_memcpy_toio(volatile void __iomem *addr, const void *buffer,
+			     size_t size)
+{
+	WARN(1, "Invalid memcpy_toio at address 0x%llx\n",
+	     (unsigned long long)addr);
+}
+#endif /* CONFIG_INDIRECT_IOMEM_FALLBACK */
+
+#define MAKE_OP(op, sz) 						\
+u##sz __raw_read ## op(const volatile void __iomem *addr)		\
+{									\
+	struct logic_iomem_area *area = get_area(addr);			\
+									\
+	if (!area)							\
+		return real_raw_read ## op(addr);			\
+									\
+	return (u ## sz) area->ops->read(area->priv,			\
+					 (unsigned long)addr & AREA_MASK,\
+					 sz / 8);			\
+}									\
+EXPORT_SYMBOL(__raw_read ## op);					\
+									\
+void __raw_write ## op(u ## sz val, volatile void __iomem *addr)	\
+{									\
+	struct logic_iomem_area *area = get_area(addr);			\
+									\
+	if (!area) {							\
+		real_raw_write ## op(val, addr);			\
+		return;							\
+	}								\
+									\
+	area->ops->write(area->priv,					\
+			 (unsigned long)addr & AREA_MASK,		\
+			 sz / 8, val);					\
+}									\
+EXPORT_SYMBOL(__raw_write ## op)
+
+MAKE_OP(b, 8);
+MAKE_OP(w, 16);
+MAKE_OP(l, 32);
+#ifdef CONFIG_64BIT
+MAKE_OP(q, 64);
+#endif
+
+void memset_io(volatile void __iomem *addr, int value, size_t size)
+{
+	struct logic_iomem_area *area = get_area(addr);
+	unsigned long offs, start;
+
+	if (!area) {
+		real_memset_io(addr, value, size);
+		return;
+	}
+
+	start = (unsigned long)addr & AREA_MASK;
+
+	if (area->ops->set) {
+		area->ops->set(area->priv, start, value, size);
+		return;
+	}
+
+	for (offs = 0; offs < size; offs++)
+		area->ops->write(area->priv, start + offs, 1, value);
+}
+EXPORT_SYMBOL(memset_io);
+
+void memcpy_fromio(void *buffer, const volatile void __iomem *addr,
+                   size_t size)
+{
+	struct logic_iomem_area *area = get_area(addr);
+	u8 *buf = buffer;
+	unsigned long offs, start;
+
+	if (!area) {
+		real_memcpy_fromio(buffer, addr, size);
+		return;
+	}
+
+	start = (unsigned long)addr & AREA_MASK;
+
+	if (area->ops->copy_from) {
+		area->ops->copy_from(area->priv, buffer, start, size);
+		return;
+	}
+
+	for (offs = 0; offs < size; offs++)
+		buf[offs] = area->ops->read(area->priv, start + offs, 1);
+}
+EXPORT_SYMBOL(memcpy_fromio);
+
+void memcpy_toio(volatile void __iomem *addr, const void *buffer, size_t size)
+{
+	struct logic_iomem_area *area = get_area(addr);
+	const u8 *buf = buffer;
+	unsigned long offs, start;
+
+	if (!area) {
+		real_memcpy_toio(addr, buffer, size);
+		return;
+	}
+
+	start = (unsigned long)addr & AREA_MASK;
+
+	if (area->ops->copy_to) {
+		area->ops->copy_to(area->priv, start, buffer, size);
+		return;
+	}
+
+	for (offs = 0; offs < size; offs++)
+		area->ops->write(area->priv, start + offs, 1, buf[offs]);
+}
+EXPORT_SYMBOL(memcpy_toio);
-- 
2.26.2
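
A note on the address scheme in lib/logic_iomem.c above: for emulated
regions, ioremap() does not return a real pointer but a cookie built
from IOREMAP_BIAS, the index into mapped_areas[] and the offset
returned by ->map(); get_area() and the accessors later split it apart
again. The following user-space sketch, assuming the 64-bit constants
from the patch, illustrates that arithmetic. The cookie_* helpers are
hypothetical names used only for illustration and are not part of the
patch.

```c
#include <assert.h>
#include <stdint.h>

/* Constants as in lib/logic_iomem.c from this patch (CONFIG_64BIT case). */
#define AREA_SHIFT	24
#define MAX_AREA_SIZE	(1ULL << AREA_SHIFT)
#define MAX_AREAS	((1ULL << 32) / MAX_AREA_SIZE)
#define AREA_BITS	((MAX_AREAS - 1) << AREA_SHIFT)
#define AREA_MASK	(MAX_AREA_SIZE - 1)
#define IOREMAP_BIAS	0xDEAD000000000000ULL
#define IOREMAP_MASK	0xFFFFFFFF00000000ULL

/* Hypothetical helper: build the cookie that ioremap() would hand out. */
static uint64_t cookie_encode(unsigned int idx, uint64_t offs)
{
	assert(idx < MAX_AREAS);
	assert(offs < MAX_AREA_SIZE);
	/* The cast keeps the shift in 64-bit arithmetic in this sketch. */
	return IOREMAP_BIAS + ((uint64_t)idx << AREA_SHIFT) + offs;
}

/* Hypothetical helper: the same bias check that get_area() performs. */
static int cookie_is_emulated(uint64_t cookie)
{
	return (cookie & IOREMAP_MASK) == IOREMAP_BIAS;
}

/* Hypothetical helper: index into mapped_areas[], as in get_area(). */
static unsigned int cookie_area(uint64_t cookie)
{
	return (unsigned int)((cookie & AREA_BITS) >> AREA_SHIFT);
}

/* Hypothetical helper: offset passed to the read/write callbacks. */
static uint64_t cookie_offset(uint64_t cookie)
{
	return cookie & AREA_MASK;
}
```

With these constants there are 256 areas of 16 MiB each, so a cookie
for area 3 at offset 0x10 decodes back to exactly that pair, while an
ordinary small address fails the bias check.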



* [PATCH v2 3/8] um: remove unused smp_sigio_handler() declaration
  2021-03-01 15:07 [PATCH v2 0/8] PCI support for UML Johannes Berg
  2021-03-01 15:07 ` [PATCH v2 1/8] um: allow disabling NO_IOMEM Johannes Berg
  2021-03-01 15:07 ` [PATCH v2 2/8] lib: add iomem emulation (logic_iomem) Johannes Berg
@ 2021-03-01 15:07 ` Johannes Berg
  2021-03-01 15:07 ` [PATCH v2 4/8] um: export signals_enabled directly Johannes Berg
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Johannes Berg @ 2021-03-01 15:07 UTC (permalink / raw)
  To: linux-um; +Cc: Arnd Bergmann, linux-kernel, Johannes Berg

From: Johannes Berg <johannes.berg@intel.com>

This function doesn't exist; remove its declaration.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
---
 arch/um/include/shared/kern_util.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/um/include/shared/kern_util.h b/arch/um/include/shared/kern_util.h
index 2888ec812f6e..a2cfd42608a0 100644
--- a/arch/um/include/shared/kern_util.h
+++ b/arch/um/include/shared/kern_util.h
@@ -33,7 +33,6 @@ extern int handle_page_fault(unsigned long address, unsigned long ip,
 			     int is_write, int is_user, int *code_out);
 
 extern unsigned int do_IRQ(int irq, struct uml_pt_regs *regs);
-extern int smp_sigio_handler(void);
 extern void initial_thread_cb(void (*proc)(void *), void *arg);
 extern int is_syscall(unsigned long addr);
 
-- 
2.26.2



* [PATCH v2 4/8] um: export signals_enabled directly
  2021-03-01 15:07 [PATCH v2 0/8] PCI support for UML Johannes Berg
                   ` (2 preceding siblings ...)
  2021-03-01 15:07 ` [PATCH v2 3/8] um: remove unused smp_sigio_handler() declaration Johannes Berg
@ 2021-03-01 15:07 ` Johannes Berg
  2021-03-01 15:07 ` [PATCH v2 5/8] um: time-travel/signals: fix ndelay() in interrupt Johannes Berg
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Johannes Berg @ 2021-03-01 15:07 UTC (permalink / raw)
  To: linux-um; +Cc: Arnd Bergmann, linux-kernel, Johannes Berg

From: Johannes Berg <johannes.berg@intel.com>

Use signals_enabled directly instead of always jumping through
a function call to read it; there's not much point in that.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
---
 arch/um/include/asm/irqflags.h   | 10 +++++-----
 arch/um/include/shared/longjmp.h | 14 +++++++-------
 arch/um/include/shared/os.h      |  1 -
 arch/um/kernel/ksyms.c           |  2 +-
 arch/um/os-Linux/signal.c        |  7 +------
 5 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/arch/um/include/asm/irqflags.h b/arch/um/include/asm/irqflags.h
index 0642ad9035d1..dab5744e9253 100644
--- a/arch/um/include/asm/irqflags.h
+++ b/arch/um/include/asm/irqflags.h
@@ -2,15 +2,15 @@
 #ifndef __UM_IRQFLAGS_H
 #define __UM_IRQFLAGS_H
 
-extern int get_signals(void);
-extern int set_signals(int enable);
-extern void block_signals(void);
-extern void unblock_signals(void);
+extern int signals_enabled;
+int set_signals(int enable);
+void block_signals(void);
+void unblock_signals(void);
 
 #define arch_local_save_flags arch_local_save_flags
 static inline unsigned long arch_local_save_flags(void)
 {
-	return get_signals();
+	return signals_enabled;
 }
 
 #define arch_local_irq_restore arch_local_irq_restore
diff --git a/arch/um/include/shared/longjmp.h b/arch/um/include/shared/longjmp.h
index 85a1cc290ecb..bdb2869b72b3 100644
--- a/arch/um/include/shared/longjmp.h
+++ b/arch/um/include/shared/longjmp.h
@@ -5,6 +5,7 @@
 #include <sysdep/archsetjmp.h>
 #include <os.h>
 
+extern int signals_enabled;
 extern int setjmp(jmp_buf);
 extern void longjmp(jmp_buf, int);
 
@@ -12,13 +13,12 @@ extern void longjmp(jmp_buf, int);
 	longjmp(*buf, val);	\
 } while(0)
 
-#define UML_SETJMP(buf) ({ \
-	int n;	   \
-	volatile int enable;	\
-	enable = get_signals(); \
-	n = setjmp(*buf); \
-	if(n != 0) \
-		set_signals_trace(enable); \
+#define UML_SETJMP(buf) ({				\
+	int n, enable;					\
+	enable = *(volatile int *)&signals_enabled;	\
+	n = setjmp(*buf);				\
+	if(n != 0)					\
+		set_signals_trace(enable);		\
 	n; })
 
 #endif
diff --git a/arch/um/include/shared/os.h b/arch/um/include/shared/os.h
index 13d86f94cf0f..f9fbbddc38bb 100644
--- a/arch/um/include/shared/os.h
+++ b/arch/um/include/shared/os.h
@@ -237,7 +237,6 @@ extern void send_sigio_to_self(void);
 extern int change_sig(int signal, int on);
 extern void block_signals(void);
 extern void unblock_signals(void);
-extern int get_signals(void);
 extern int set_signals(int enable);
 extern int set_signals_trace(int enable);
 extern int os_is_signal_stack(void);
diff --git a/arch/um/kernel/ksyms.c b/arch/um/kernel/ksyms.c
index 8ade54a86a7e..b1e5634398d0 100644
--- a/arch/um/kernel/ksyms.c
+++ b/arch/um/kernel/ksyms.c
@@ -7,7 +7,7 @@
 #include <os.h>
 
 EXPORT_SYMBOL(set_signals);
-EXPORT_SYMBOL(get_signals);
+EXPORT_SYMBOL(signals_enabled);
 
 EXPORT_SYMBOL(os_stat_fd);
 EXPORT_SYMBOL(os_stat_file);
diff --git a/arch/um/os-Linux/signal.c b/arch/um/os-Linux/signal.c
index 96f511d1aabe..8c9d162e6c51 100644
--- a/arch/um/os-Linux/signal.c
+++ b/arch/um/os-Linux/signal.c
@@ -62,7 +62,7 @@ static void sig_handler_common(int sig, struct siginfo *si, mcontext_t *mc)
 #define SIGALRM_BIT 1
 #define SIGALRM_MASK (1 << SIGALRM_BIT)
 
-static int signals_enabled;
+int signals_enabled;
 static unsigned int signals_pending;
 static unsigned int signals_active = 0;
 
@@ -334,11 +334,6 @@ void unblock_signals(void)
 	}
 }
 
-int get_signals(void)
-{
-	return signals_enabled;
-}
-
 int set_signals(int enable)
 {
 	int ret;
-- 
2.26.2
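
The UML_SETJMP() change above reads signals_enabled once, through a
volatile access, before setjmp() and restores that value when setjmp()
returns a second time. A minimal user-space sketch of that pattern
follows. The flag here is a plain stand-in variable that is assigned
directly rather than through set_signals_trace(), and
fail_with_signals_off() and run_and_recover() are hypothetical names,
not UML functions.

```c
#include <assert.h>
#include <setjmp.h>

/* Stand-in for UML's signals_enabled flag (illustrative only). */
static int signals_enabled = 1;
static jmp_buf env;

/* Hypothetical code path that turns the flag off and then bails out. */
static void fail_with_signals_off(void)
{
	signals_enabled = 0;
	longjmp(env, 1);
}

/* Returns the flag value observed after the longjmp-based recovery. */
static int run_and_recover(void)
{
	/*
	 * Read the flag exactly once before setjmp(). The local is never
	 * modified between setjmp() and longjmp(), so its value is still
	 * well defined on the second return.
	 */
	int enable = *(volatile int *)&signals_enabled;

	assert(enable == 0 || enable == 1);

	if (setjmp(env) != 0) {
		/* Second return: restore what was seen before jumping. */
		signals_enabled = enable;
		return signals_enabled;
	}

	fail_with_signals_off();
	return -1; /* not reached */
}
```

Calling run_and_recover() with the flag set leaves it set afterwards,
even though the failing path cleared it before jumping back.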



* [PATCH v2 5/8] um: time-travel/signals: fix ndelay() in interrupt
  2021-03-01 15:07 [PATCH v2 0/8] PCI support for UML Johannes Berg
                   ` (3 preceding siblings ...)
  2021-03-01 15:07 ` [PATCH v2 4/8] um: export signals_enabled directly Johannes Berg
@ 2021-03-01 15:07 ` Johannes Berg
  2021-03-01 15:07 ` [PATCH v2 6/8] um: irqs: allow invoking time-travel handler multiple times Johannes Berg
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Johannes Berg @ 2021-03-01 15:07 UTC (permalink / raw)
  To: linux-um; +Cc: Arnd Bergmann, linux-kernel, Johannes Berg

From: Johannes Berg <johannes.berg@intel.com>

We should be able to ndelay() from any context, even from an
interrupt context! However, this is broken (not functionally,
but locking-wise) in time-travel because we'll get into the
time-travel code and enable interrupts to handle messages on
other time-travel aware subsystems (only virtio for now).

Luckily, I've already reworked the time-travel aware signal
(interrupt) delivery for suspend/resume to have a time travel
handler, which runs directly in the context of the signal and
not from the Linux interrupt.

In order to fix this time-travel issue then, we need to do a
few things:

 1) rework the signal handling code to not block SIGIO (if
    time-travel is enabled, just to simplify the other case)
    but rather let it bubble through the system, all the way
    *past* the IRQ's timetravel_handler, stopping it after
    that and before real interrupt delivery if IRQs are not
    actually enabled;
 2) rework time-travel to not enable interrupts while it's
    waiting for a message;
 3) rework time-travel to not (just) disable interrupts but
    rather block signals at a lower level while it needs them
    disabled for communicating with the controller.

Finally, since we can now actually spend even virtual time
in interrupts-disabled sections, the delay warning when we
deliver a time-travel delayed interrupt is no longer valid;
things can (and should) now get delayed.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
---
v2:
 - check for signals_enabled as well as irqs_suspended
   to see if only time-travel handlers should be called
 - fix race in unblock
---
 arch/um/include/shared/irq_user.h |  1 +
 arch/um/include/shared/os.h       |  3 +++
 arch/um/kernel/irq.c              | 42 ++++++++++++++++++++++++-----
 arch/um/kernel/time.c             | 35 +++++++++---------------
 arch/um/os-Linux/signal.c         | 45 ++++++++++++++++++++++++++++---
 5 files changed, 93 insertions(+), 33 deletions(-)

diff --git a/arch/um/include/shared/irq_user.h b/arch/um/include/shared/irq_user.h
index 07239e801a5b..065829f443ae 100644
--- a/arch/um/include/shared/irq_user.h
+++ b/arch/um/include/shared/irq_user.h
@@ -17,6 +17,7 @@ enum um_irq_type {
 
 struct siginfo;
 extern void sigio_handler(int sig, struct siginfo *unused_si, struct uml_pt_regs *regs);
+void sigio_run_timetravel_handlers(void);
 extern void free_irq_by_fd(int fd);
 extern void deactivate_fd(int fd, int irqnum);
 extern int deactivate_all_fds(void);
diff --git a/arch/um/include/shared/os.h b/arch/um/include/shared/os.h
index f9fbbddc38bb..453b13369533 100644
--- a/arch/um/include/shared/os.h
+++ b/arch/um/include/shared/os.h
@@ -242,6 +242,9 @@ extern int set_signals_trace(int enable);
 extern int os_is_signal_stack(void);
 extern void deliver_alarm(void);
 extern void register_pm_wake_signal(void);
+extern void block_signals_hard(void);
+extern void unblock_signals_hard(void);
+extern void mark_sigio_pending(void);
 
 /* util.c */
 extern void stack_protections(unsigned long address);
diff --git a/arch/um/kernel/irq.c b/arch/um/kernel/irq.c
index 82af5191e73d..ccf5e4d27202 100644
--- a/arch/um/kernel/irq.c
+++ b/arch/um/kernel/irq.c
@@ -123,7 +123,8 @@ static bool irq_do_timetravel_handler(struct irq_entry *entry,
 #endif
 
 static void sigio_reg_handler(int idx, struct irq_entry *entry, enum um_irq_type t,
-			      struct uml_pt_regs *regs)
+			      struct uml_pt_regs *regs,
+			      bool timetravel_handlers_only)
 {
 	struct irq_reg *reg = &entry->reg[t];
 
@@ -136,18 +137,32 @@ static void sigio_reg_handler(int idx, struct irq_entry *entry, enum um_irq_type
 	if (irq_do_timetravel_handler(entry, t))
 		return;
 
-	if (irqs_suspended)
+#ifdef CONFIG_UML_TIME_TRAVEL_SUPPORT
+	/*
+	 * We can only get here with signals disabled if we have time-travel
+	 * support, otherwise we cannot have the hard handlers that may need
+	 * to run even in interrupts-disabled sections and therefore sigio is
+	 * blocked as well when interrupts are disabled.
+	 */
+	if (!signals_enabled) {
+		mark_sigio_pending();
+		return;
+	}
+#endif
+
+	if (timetravel_handlers_only)
 		return;
 
 	irq_io_loop(reg, regs);
 }
 
-void sigio_handler(int sig, struct siginfo *unused_si, struct uml_pt_regs *regs)
+static void _sigio_handler(struct uml_pt_regs *regs,
+			   bool timetravel_handlers_only)
 {
 	struct irq_entry *irq_entry;
 	int n, i;
 
-	if (irqs_suspended && !um_irq_timetravel_handler_used())
+	if (timetravel_handlers_only && !um_irq_timetravel_handler_used())
 		return;
 
 	while (1) {
@@ -172,14 +187,24 @@ void sigio_handler(int sig, struct siginfo *unused_si, struct uml_pt_regs *regs)
 			irq_entry = os_epoll_get_data_pointer(i);
 
 			for (t = 0; t < NUM_IRQ_TYPES; t++)
-				sigio_reg_handler(i, irq_entry, t, regs);
+				sigio_reg_handler(i, irq_entry, t, regs,
+						  timetravel_handlers_only);
 		}
 	}
 
-	if (!irqs_suspended)
+	if (!timetravel_handlers_only)
 		free_irqs();
 }
 
+void sigio_handler(int sig, struct siginfo *unused_si, struct uml_pt_regs *regs)
+{
+	_sigio_handler(regs, irqs_suspended
+#ifdef CONFIG_UML_TIME_TRAVEL_SUPPORT
+			     || !signals_enabled
+#endif
+		       );
+}
+
 static struct irq_entry *get_irq_entry_by_fd(int fd)
 {
 	struct irq_entry *walk;
@@ -467,6 +492,11 @@ int um_request_irq_tt(int irq, int fd, enum um_irq_type type,
 			       devname, dev_id, timetravel_handler);
 }
 EXPORT_SYMBOL(um_request_irq_tt);
+
+void sigio_run_timetravel_handlers(void)
+{
+	_sigio_handler(NULL, true);
+}
 #endif
 
 #ifdef CONFIG_PM_SLEEP
diff --git a/arch/um/kernel/time.c b/arch/um/kernel/time.c
index e0cdb9694fb8..fddd1dec27e6 100644
--- a/arch/um/kernel/time.c
+++ b/arch/um/kernel/time.c
@@ -68,23 +68,15 @@ static void time_travel_handle_message(struct um_timetravel_msg *msg,
 	int ret;
 
 	/*
-	 * Poll outside the locked section (if we're not called to only read
-	 * the response) so we can get interrupts for e.g. virtio while we're
-	 * here, but then we need to lock to not get interrupted between the
-	 * read of the message and write of the ACK.
+	 * We can't unlock here, but interrupt signals with a timetravel_handler
+	 * (see um_request_irq_tt) get to the timetravel_handler anyway.
 	 */
 	if (mode != TTMH_READ) {
-		bool disabled = irqs_disabled();
+		BUG_ON(mode == TTMH_IDLE && !irqs_disabled());
 
-		BUG_ON(mode == TTMH_IDLE && !disabled);
-
-		if (disabled)
-			local_irq_enable();
 		while (os_poll(1, &time_travel_ext_fd) != 0) {
 			/* nothing */
 		}
-		if (disabled)
-			local_irq_disable();
 	}
 
 	ret = os_read_file(time_travel_ext_fd, msg, sizeof(*msg));
@@ -123,15 +115,15 @@ static u64 time_travel_ext_req(u32 op, u64 time)
 		.time = time,
 		.seq = mseq,
 	};
-	unsigned long flags;
 
 	/*
-	 * We need to save interrupts here and only restore when we
-	 * got the ACK - otherwise we can get interrupted and send
-	 * another request while we're still waiting for an ACK, but
-	 * the peer doesn't know we got interrupted and will send
-	 * the ACKs in the same order as the message, but we'd need
-	 * to see them in the opposite order ...
+	 * We need to block even the timetravel handlers of SIGIO here and
+	 * only restore their use when we got the ACK - otherwise we may
+	 * (will) get interrupted by that, try to queue the IRQ for future
+	 * processing and thus send another request while we're still waiting
+	 * for an ACK, but the peer doesn't know we got interrupted and will
+	 * send the ACKs in the same order as the message, but we'd need to
+	 * see them in the opposite order ...
 	 *
 	 * This wouldn't matter *too* much, but some ACKs carry the
 	 * current time (for UM_TIMETRAVEL_GET) and getting another
@@ -140,7 +132,7 @@ static u64 time_travel_ext_req(u32 op, u64 time)
 	 * The sequence number assignment that happens here lets us
 	 * debug such message handling issues more easily.
 	 */
-	local_irq_save(flags);
+	block_signals_hard();
 	os_write_file(time_travel_ext_fd, &msg, sizeof(msg));
 
 	while (msg.op != UM_TIMETRAVEL_ACK)
@@ -152,7 +144,7 @@ static u64 time_travel_ext_req(u32 op, u64 time)
 
 	if (op == UM_TIMETRAVEL_GET)
 		time_travel_set_time(msg.time);
-	local_irq_restore(flags);
+	unblock_signals_hard();
 
 	return msg.time;
 }
@@ -352,9 +344,6 @@ void deliver_time_travel_irqs(void)
 	while ((e = list_first_entry_or_null(&time_travel_irqs,
 					     struct time_travel_event,
 					     list))) {
-		WARN(e->time != time_travel_time,
-		     "time moved from %lld to %lld before IRQ delivery\n",
-		     time_travel_time, e->time);
 		list_del(&e->list);
 		e->pending = false;
 		e->fn(e);
diff --git a/arch/um/os-Linux/signal.c b/arch/um/os-Linux/signal.c
index 8c9d162e6c51..8bce743fbe64 100644
--- a/arch/um/os-Linux/signal.c
+++ b/arch/um/os-Linux/signal.c
@@ -63,15 +63,19 @@ static void sig_handler_common(int sig, struct siginfo *si, mcontext_t *mc)
 #define SIGALRM_MASK (1 << SIGALRM_BIT)
 
 int signals_enabled;
+#ifdef UML_CONFIG_UML_TIME_TRAVEL_SUPPORT
+static int signals_blocked;
+#else
+#define signals_blocked (!signals_enabled)
+#endif
 static unsigned int signals_pending;
 static unsigned int signals_active = 0;
 
 void sig_handler(int sig, struct siginfo *si, mcontext_t *mc)
 {
-	int enabled;
+	int enabled = signals_enabled;
 
-	enabled = signals_enabled;
-	if (!enabled && (sig == SIGIO)) {
+	if (signals_blocked && (sig == SIGIO)) {
 		signals_pending |= SIGIO_MASK;
 		return;
 	}
@@ -99,7 +103,7 @@ void timer_alarm_handler(int sig, struct siginfo *unused_si, mcontext_t *mc)
 	int enabled;
 
 	enabled = signals_enabled;
-	if (!signals_enabled) {
+	if (!signals_enabled || signals_blocked) {
 		signals_pending |= SIGALRM_MASK;
 		return;
 	}
@@ -363,6 +367,39 @@ int set_signals_trace(int enable)
 	return ret;
 }
 
+#ifdef UML_CONFIG_UML_TIME_TRAVEL_SUPPORT
+void mark_sigio_pending(void)
+{
+	signals_pending |= SIGIO_MASK;
+}
+
+void block_signals_hard(void)
+{
+	if (signals_blocked)
+		return;
+	signals_blocked = 1;
+	barrier();
+}
+
+void unblock_signals_hard(void)
+{
+	if (!signals_blocked)
+		return;
+	/* Must be set to 0 before we check the pending bits etc. */
+	signals_blocked = 0;
+	barrier();
+
+	if (signals_pending && signals_enabled) {
+		/* this is a bit inefficient, but that's not really important */
+		block_signals();
+		unblock_signals();
+	} else if (signals_pending & SIGIO_MASK) {
+		/* we need to run time-travel handlers even if not enabled */
+		sigio_run_timetravel_handlers();
+	}
+}
+#endif
+
 int os_is_signal_stack(void)
 {
 	stack_t ss;
-- 
2.26.2



* [PATCH v2 6/8] um: irqs: allow invoking time-travel handler multiple times
  2021-03-01 15:07 [PATCH v2 0/8] PCI support for UML Johannes Berg
                   ` (4 preceding siblings ...)
  2021-03-01 15:07 ` [PATCH v2 5/8] um: time-travel/signals: fix ndelay() in interrupt Johannes Berg
@ 2021-03-01 15:07 ` Johannes Berg
  2021-03-01 15:07 ` [PATCH v2 7/8] um: add PCI over virtio emulation driver Johannes Berg
  2021-03-01 15:07 ` [PATCH v2 8/8] um: virtio/pci: enable suspend/resume Johannes Berg
  7 siblings, 0 replies; 11+ messages in thread
From: Johannes Berg @ 2021-03-01 15:07 UTC (permalink / raw)
  To: linux-um; +Cc: Arnd Bergmann, linux-kernel, Johannes Berg

From: Johannes Berg <johannes.berg@intel.com>

If we happen to get multiple messages while IRQs are already
suspended, we still need to handle them, since otherwise the
simulation blocks.

Remove the "prevent nesting" part; time_travel_add_irq_event()
will deal with being called multiple times just fine.
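
As an illustration only, the "first wins" behaviour relied on above can
be sketched in standalone C. All names below (sketch_event, sketch_queue,
sketch_add_event) are hypothetical and are not the kernel's
implementation of time_travel_add_irq_event(); the sketch just shows why
calling an add function repeatedly for an already-pending event is
harmless.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a time-travel event; not the kernel's struct. */
struct sketch_event {
	unsigned long long time;
	bool pending;
	struct sketch_event *next;
};

/* Hypothetical pending list, newest entry first. */
struct sketch_queue {
	struct sketch_event *head;
	size_t len;
};

/*
 * Queue an event unless it is already pending. A repeated call leaves
 * the original timestamp untouched ("first wins") and returns false.
 */
static bool sketch_add_event(struct sketch_queue *q, struct sketch_event *e,
			     unsigned long long time)
{
	if (e->pending)
		return false;

	e->time = time;
	e->pending = true;
	e->next = q->head;
	q->head = e;
	q->len++;
	return true;
}
```

With this shape, a handler that runs several times before the event is
delivered only queues it once, so there is no need to guard against
nesting in the caller.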

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
---
 arch/um/kernel/irq.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/um/kernel/irq.c b/arch/um/kernel/irq.c
index ccf5e4d27202..2ee0a368aa59 100644
--- a/arch/um/kernel/irq.c
+++ b/arch/um/kernel/irq.c
@@ -101,10 +101,12 @@ static bool irq_do_timetravel_handler(struct irq_entry *entry,
 	if (!reg->timetravel_handler)
 		return false;
 
-	/* prevent nesting - we'll get it again later when we SIGIO ourselves */
-	if (reg->pending_on_resume)
-		return true;
-
+	/*
+	 * Handle all messages - we might get multiple even while
+	 * interrupts are already suspended, due to suspend order
+	 * etc. Note that time_travel_add_irq_event() will not add
+	 * an event twice; if it's already pending, "first wins".
+	 */
 	reg->timetravel_handler(reg->irq, entry->fd, reg->id, &reg->event);
 
 	if (!reg->event.pending)
-- 
2.26.2



* [PATCH v2 7/8] um: add PCI over virtio emulation driver
  2021-03-01 15:07 [PATCH v2 0/8] PCI support for UML Johannes Berg
                   ` (5 preceding siblings ...)
  2021-03-01 15:07 ` [PATCH v2 6/8] um: irqs: allow invoking time-travel handler multiple times Johannes Berg
@ 2021-03-01 15:07 ` Johannes Berg
  2021-08-17 15:50   ` Bjorn Helgaas
  2021-03-01 15:07 ` [PATCH v2 8/8] um: virtio/pci: enable suspend/resume Johannes Berg
  7 siblings, 1 reply; 11+ messages in thread
From: Johannes Berg @ 2021-03-01 15:07 UTC (permalink / raw)
  To: linux-um; +Cc: Arnd Bergmann, linux-kernel, Johannes Berg

From: Johannes Berg <johannes.berg@intel.com>

To support testing of PCI/PCIe drivers in UML, add a PCI bus
support driver. This driver uses virtio, which in UML is really
just vhost-user, to talk to devices, and adds the devices to
the virtual PCI bus in the system.

Since virtio already allows DMA/bus mastering, this really isn't
all that hard. It does, however, need the logic_iomem infrastructure
that was added by a previous patch.

The protocol to talk to the device has a few fairly simple
messages for reading from/writing to config and IO spaces, and
messages for the device to send the various interrupts (INT#,
MSI/MSI-X and, while suspended, PME#).
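
To give a feel for what such a message might look like on the wire, here
is a rough standalone sketch. The header layout, the opcode value and all
sketch_* names are assumptions made for illustration; the authoritative
definition is the virtio_pcidev.h header added by this patch. Only the
general shape (a small header carrying op, bar, size and addr, followed
by a little-endian payload) is taken from how the driver uses the
message. The sketch also leaves the header fields in host byte order for
brevity.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header layout, for illustration only. */
struct sketch_msg_hdr {
	uint8_t op;
	uint8_t bar;
	uint16_t reserved;
	uint32_t size;
	uint64_t addr;
};

enum { SKETCH_OP_CFG_WRITE = 1 }; /* hypothetical opcode value */

/* Store val as `size` little-endian bytes, whatever the host endianness. */
static void sketch_put_le(uint8_t *dst, uint64_t val, unsigned int size)
{
	for (unsigned int i = 0; i < size; i++)
		dst[i] = (uint8_t)(val >> (8 * i));
}

/*
 * Build a write message: header followed by the payload. Returns the
 * total message length, or 0 for an unsupported access size or a buffer
 * that is too small.
 */
static size_t sketch_build_write(uint8_t *buf, size_t buflen, uint8_t op,
				 uint64_t addr, uint64_t val,
				 unsigned int size)
{
	struct sketch_msg_hdr hdr = {
		.op = op,
		.size = size,
		.addr = addr,
	};

	if (size != 1 && size != 2 && size != 4 && size != 8)
		return 0;
	if (buflen < sizeof(hdr) + size)
		return 0;

	memcpy(buf, &hdr, sizeof(hdr));
	sketch_put_le(buf + sizeof(hdr), val, size);
	return sizeof(hdr) + size;
}
```

A read would carry only the header, with the device filling in a
response buffer of the requested size.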

Note that currently no official virtio device ID is assigned for
this protocol. As a consequence, this patch requires defining it
in the Kconfig, with a default that makes the driver refuse to
work at all.

Finally, in order to add support for MSI/MSI-X interrupts, some
small changes are needed in the UML IRQ code: NR_IRQS is raised
from 64 to 128 if this driver is enabled, but UML itself does not
use the additional interrupts for anything, so that the generic
IRQ domain/MSI infrastructure can allocate IRQ numbers from them.
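
The driver in this patch tracks which MSI vectors are in use with a
bitmap of MAX_MSI_VECTORS entries (um_pci_msi_used). The following
standalone sketch shows the same "first free bit" idea outside the
kernel. The sketch_* names are hypothetical, and it uses a plain 32-bit
word instead of the kernel bitmap helpers and locking.

```c
#include <stdint.h>

#define SKETCH_MAX_VECTORS 32

/* One bit per vector; a set bit means the vector is in use. */
static uint32_t sketch_used;

/* Return the lowest free vector and mark it used, or -1 if none is left. */
static int sketch_vector_alloc(void)
{
	for (int bit = 0; bit < SKETCH_MAX_VECTORS; bit++) {
		if (!(sketch_used & (UINT32_C(1) << bit))) {
			sketch_used |= UINT32_C(1) << bit;
			return bit;
		}
	}
	return -1;
}

/* Release a vector; returns 0 on success, -1 if it was not allocated. */
static int sketch_vector_free(int bit)
{
	if (bit < 0 || bit >= SKETCH_MAX_VECTORS ||
	    !(sketch_used & (UINT32_C(1) << bit)))
		return -1;

	sketch_used &= ~(UINT32_C(1) << bit);
	return 0;
}
```

Freeing a vector that was never allocated is reported rather than
silently ignored, which mirrors the error message the driver prints in
that case.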

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
---
v2:
 - fix memory leak
---
 arch/um/Kconfig                    |  13 +-
 arch/um/drivers/Kconfig            |  20 +
 arch/um/drivers/Makefile           |   1 +
 arch/um/drivers/virt-pci.c         | 885 +++++++++++++++++++++++++++++
 arch/um/include/asm/Kbuild         |   1 -
 arch/um/include/asm/io.h           |   7 +
 arch/um/include/asm/irq.h          |   8 +-
 arch/um/include/asm/msi.h          |   1 +
 arch/um/include/asm/pci.h          |  39 ++
 arch/um/kernel/Makefile            |   1 +
 arch/um/kernel/ioport.c            |  13 +
 arch/um/kernel/irq.c               |   7 +-
 include/uapi/linux/virtio_pcidev.h |  64 +++
 13 files changed, 1054 insertions(+), 6 deletions(-)
 create mode 100644 arch/um/drivers/virt-pci.c
 create mode 100644 arch/um/include/asm/msi.h
 create mode 100644 arch/um/include/asm/pci.h
 create mode 100644 arch/um/kernel/ioport.c
 create mode 100644 include/uapi/linux/virtio_pcidev.h

diff --git a/arch/um/Kconfig b/arch/um/Kconfig
index 20b0640e01b8..f64d774706e5 100644
--- a/arch/um/Kconfig
+++ b/arch/um/Kconfig
@@ -14,7 +14,7 @@ config UML
 	select HAVE_FUTEX_CMPXCHG if FUTEX
 	select HAVE_DEBUG_KMEMLEAK
 	select HAVE_DEBUG_BUGVERBOSE
-	select NO_DMA
+	select NO_DMA if !UML_DMA_EMULATION
 	select GENERIC_IRQ_SHOW
 	select GENERIC_CPU_DEVICES
 	select HAVE_GCC_PLUGINS
@@ -25,10 +25,21 @@ config MMU
 	bool
 	default y
 
+config UML_DMA_EMULATION
+	bool
+
 config NO_IOMEM
 	bool "disable IOMEM" if EXPERT
+	depends on !INDIRECT_IOMEM
 	default y
 
+config UML_IOMEM_EMULATION
+	bool
+	select INDIRECT_IOMEM
+	select GENERIC_PCI_IOMAP
+	select GENERIC_IOMAP
+	select NO_GENERIC_PCI_IOPORT_MAP
+
 config NO_IOPORT_MAP
 	def_bool y
 
diff --git a/arch/um/drivers/Kconfig b/arch/um/drivers/Kconfig
index 03ba34b61115..f145842c40b9 100644
--- a/arch/um/drivers/Kconfig
+++ b/arch/um/drivers/Kconfig
@@ -357,3 +357,23 @@ config UML_RTC
 	  rtcwake, especially in time-travel mode. This driver enables that
 	  by providing a fake RTC clock that causes a wakeup at the right
 	  time.
+
+config UML_PCI_OVER_VIRTIO
+	bool "Enable PCI over VIRTIO device simulation"
+	# in theory, just VIRTIO is enough, but that causes recursion
+	depends on VIRTIO_UML
+	select FORCE_PCI
+	select UML_IOMEM_EMULATION
+	select UML_DMA_EMULATION
+	select PCI_MSI
+	select PCI_MSI_IRQ_DOMAIN
+	select PCI_LOCKLESS_CONFIG
+
+config UML_PCI_OVER_VIRTIO_DEVICE_ID
+	int "set the virtio device ID for PCI emulation"
+	default -1
+	depends on UML_PCI_OVER_VIRTIO
+	help
+	  There's no official device ID assigned (yet), set the one you
+	  wish to use for experimentation here. The default of -1 is
+	  not valid and will cause the driver to fail at probe.
diff --git a/arch/um/drivers/Makefile b/arch/um/drivers/Makefile
index dcc64a02f81f..803666e85414 100644
--- a/arch/um/drivers/Makefile
+++ b/arch/um/drivers/Makefile
@@ -64,6 +64,7 @@ obj-$(CONFIG_BLK_DEV_COW_COMMON) += cow_user.o
 obj-$(CONFIG_UML_RANDOM) += random.o
 obj-$(CONFIG_VIRTIO_UML) += virtio_uml.o
 obj-$(CONFIG_UML_RTC) += rtc.o
+obj-$(CONFIG_UML_PCI_OVER_VIRTIO) += virt-pci.o
 
 # pcap_user.o must be added explicitly.
 USER_OBJS := fd.o null.o pty.o tty.o xterm.o slip_common.o pcap_user.o vde_user.o vector_user.o
diff --git a/arch/um/drivers/virt-pci.c b/arch/um/drivers/virt-pci.c
new file mode 100644
index 000000000000..dd85f36197aa
--- /dev/null
+++ b/arch/um/drivers/virt-pci.c
@@ -0,0 +1,885 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020 Intel Corporation
+ * Author: Johannes Berg <johannes@sipsolutions.net>
+ */
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/virtio.h>
+#include <linux/virtio_config.h>
+#include <linux/logic_iomem.h>
+#include <linux/irqdomain.h>
+#include <linux/virtio_pcidev.h>
+#include <linux/delay.h>
+#include <linux/msi.h>
+#include <asm/unaligned.h>
+#include <irq_kern.h>
+
+#define MAX_DEVICES 8
+#define MAX_MSI_VECTORS 32
+#define CFG_SPACE_SIZE 4096
+
+/* for MSI-X we have a 32-bit payload */
+#define MAX_IRQ_MSG_SIZE (sizeof(struct virtio_pcidev_msg) + sizeof(u32))
+#define NUM_IRQ_MSGS	10
+
+#define HANDLE_NO_FREE(ptr) ((void *)((unsigned long)(ptr) | 1))
+#define HANDLE_IS_NO_FREE(ptr) ((unsigned long)(ptr) & 1)
+
+struct um_pci_device {
+	struct virtio_device *vdev;
+
+	/* for now just standard BARs */
+	u8 resptr[PCI_STD_NUM_BARS];
+
+	struct virtqueue *cmd_vq, *irq_vq;
+
+#define UM_PCI_STAT_WAITING	0
+	unsigned long status;
+
+	int irq;
+};
+
+struct um_pci_device_reg {
+	struct um_pci_device *dev;
+	void __iomem *iomem;
+};
+
+static struct pci_host_bridge *bridge;
+static DEFINE_MUTEX(um_pci_mtx);
+static struct um_pci_device_reg um_pci_devices[MAX_DEVICES];
+static struct fwnode_handle *um_pci_fwnode;
+static struct irq_domain *um_pci_inner_domain;
+static struct irq_domain *um_pci_msi_domain;
+static unsigned long um_pci_msi_used[BITS_TO_LONGS(MAX_MSI_VECTORS)];
+
+#define UM_VIRT_PCI_MAXDELAY 40000
+
+static int um_pci_send_cmd(struct um_pci_device *dev,
+			   struct virtio_pcidev_msg *cmd,
+			   unsigned int cmd_size,
+			   const void *extra, unsigned int extra_size,
+			   void *out, unsigned int out_size)
+{
+	struct scatterlist out_sg, extra_sg, in_sg;
+	struct scatterlist *sgs_list[] = {
+		[0] = &out_sg,
+		[1] = extra ? &extra_sg : &in_sg,
+		[2] = extra ? &in_sg : NULL,
+	};
+	int delay_count = 0;
+	int ret, len;
+	bool posted;
+
+	if (WARN_ON(cmd_size < sizeof(*cmd)))
+		return -EINVAL;
+
+	switch (cmd->op) {
+	case VIRTIO_PCIDEV_OP_CFG_WRITE:
+	case VIRTIO_PCIDEV_OP_MMIO_WRITE:
+	case VIRTIO_PCIDEV_OP_MMIO_MEMSET:
+		/* in PCI, writes are posted, so don't wait */
+		posted = !out;
+		WARN_ON(!posted);
+		break;
+	default:
+		posted = false;
+		break;
+	}
+
+	if (posted) {
+		u8 *ncmd = kmalloc(cmd_size + extra_size, GFP_ATOMIC);
+
+		if (ncmd) {
+			memcpy(ncmd, cmd, cmd_size);
+			if (extra)
+				memcpy(ncmd + cmd_size, extra, extra_size);
+			cmd = (void *)ncmd;
+			cmd_size += extra_size;
+			extra = NULL;
+			extra_size = 0;
+		} else {
+			/* try without allocating memory */
+			posted = false;
+		}
+	}
+
+	sg_init_one(&out_sg, cmd, cmd_size);
+	if (extra)
+		sg_init_one(&extra_sg, extra, extra_size);
+	if (out)
+		sg_init_one(&in_sg, out, out_size);
+
+	/* add to internal virtio queue */
+	ret = virtqueue_add_sgs(dev->cmd_vq, sgs_list,
+				extra ? 2 : 1,
+				out ? 1 : 0,
+				posted ? cmd : HANDLE_NO_FREE(cmd),
+				GFP_ATOMIC);
+	if (ret)
+		return ret;
+
+	if (posted) {
+		virtqueue_kick(dev->cmd_vq);
+		return 0;
+	}
+
+	/* kick and poll for getting a response on the queue */
+	set_bit(UM_PCI_STAT_WAITING, &dev->status);
+	virtqueue_kick(dev->cmd_vq);
+
+	while (1) {
+		void *completed = virtqueue_get_buf(dev->cmd_vq, &len);
+
+		if (completed == HANDLE_NO_FREE(cmd))
+			break;
+
+		if (WARN_ONCE(virtqueue_is_broken(dev->cmd_vq) ||
+			      ++delay_count > UM_VIRT_PCI_MAXDELAY,
+			      "um virt-pci delay: %d", delay_count)) {
+			ret = -EIO;
+			break;
+		}
+		udelay(1);
+	}
+	clear_bit(UM_PCI_STAT_WAITING, &dev->status);
+
+	return ret;
+}
+
+static unsigned long um_pci_cfgspace_read(void *priv, unsigned int offset,
+					  int size)
+{
+	struct um_pci_device_reg *reg = priv;
+	struct um_pci_device *dev = reg->dev;
+	struct virtio_pcidev_msg hdr = {
+		.op = VIRTIO_PCIDEV_OP_CFG_READ,
+		.size = size,
+		.addr = offset,
+	};
+	/* maximum size - we may only use parts of it */
+	u8 data[8];
+
+	if (!dev)
+		return ~0ULL;
+
+	memset(data, 0xff, sizeof(data));
+
+	switch (size) {
+	case 1:
+	case 2:
+	case 4:
+#ifdef CONFIG_64BIT
+	case 8:
+#endif
+		break;
+	default:
+		WARN(1, "invalid config space read size %d\n", size);
+		return ~0ULL;
+	}
+
+	if (um_pci_send_cmd(dev, &hdr, sizeof(hdr), NULL, 0,
+			    data, sizeof(data)))
+		return ~0ULL;
+
+	switch (size) {
+	case 1:
+		return data[0];
+	case 2:
+		return le16_to_cpup((void *)data);
+	case 4:
+		return le32_to_cpup((void *)data);
+#ifdef CONFIG_64BIT
+	case 8:
+		return le64_to_cpup((void *)data);
+#endif
+	default:
+		return ~0ULL;
+	}
+}
+
+static void um_pci_cfgspace_write(void *priv, unsigned int offset, int size,
+				  unsigned long val)
+{
+	struct um_pci_device_reg *reg = priv;
+	struct um_pci_device *dev = reg->dev;
+	struct {
+		struct virtio_pcidev_msg hdr;
+		/* maximum size - we may only use parts of it */
+		u8 data[8];
+	} msg = {
+		.hdr = {
+			.op = VIRTIO_PCIDEV_OP_CFG_WRITE,
+			.size = size,
+			.addr = offset,
+		},
+	};
+
+	if (!dev)
+		return;
+
+	switch (size) {
+	case 1:
+		msg.data[0] = (u8)val;
+		break;
+	case 2:
+		put_unaligned_le16(val, (void *)msg.data);
+		break;
+	case 4:
+		put_unaligned_le32(val, (void *)msg.data);
+		break;
+#ifdef CONFIG_64BIT
+	case 8:
+		put_unaligned_le64(val, (void *)msg.data);
+		break;
+#endif
+	default:
+		WARN(1, "invalid config space write size %d\n", size);
+		return;
+	}
+
+	WARN_ON(um_pci_send_cmd(dev, &msg.hdr, sizeof(msg), NULL, 0, NULL, 0));
+}
+
+static const struct logic_iomem_ops um_pci_device_cfgspace_ops = {
+	.read = um_pci_cfgspace_read,
+	.write = um_pci_cfgspace_write,
+};
+
+static void um_pci_bar_copy_from(void *priv, void *buffer,
+				 unsigned int offset, int size)
+{
+	u8 *resptr = priv;
+	struct um_pci_device *dev = container_of(resptr - *resptr,
+						 struct um_pci_device,
+						 resptr[0]);
+	struct virtio_pcidev_msg hdr = {
+		.op = VIRTIO_PCIDEV_OP_MMIO_READ,
+		.bar = *resptr,
+		.size = size,
+		.addr = offset,
+	};
+
+	memset(buffer, 0xff, size);
+
+	um_pci_send_cmd(dev, &hdr, sizeof(hdr), NULL, 0, buffer, size);
+}
+
+static unsigned long um_pci_bar_read(void *priv, unsigned int offset,
+				     int size)
+{
+	/* maximum size - we may only use parts of it */
+	u8 data[8];
+
+	switch (size) {
+	case 1:
+	case 2:
+	case 4:
+#ifdef CONFIG_64BIT
+	case 8:
+#endif
+		break;
+	default:
+		WARN(1, "invalid BAR read size %d\n", size);
+		return ~0ULL;
+	}
+
+	um_pci_bar_copy_from(priv, data, offset, size);
+
+	switch (size) {
+	case 1:
+		return data[0];
+	case 2:
+		return le16_to_cpup((void *)data);
+	case 4:
+		return le32_to_cpup((void *)data);
+#ifdef CONFIG_64BIT
+	case 8:
+		return le64_to_cpup((void *)data);
+#endif
+	default:
+		return ~0ULL;
+	}
+}
+
+static void um_pci_bar_copy_to(void *priv, unsigned int offset,
+			       const void *buffer, int size)
+{
+	u8 *resptr = priv;
+	struct um_pci_device *dev = container_of(resptr - *resptr,
+						 struct um_pci_device,
+						 resptr[0]);
+	struct virtio_pcidev_msg hdr = {
+		.op = VIRTIO_PCIDEV_OP_MMIO_WRITE,
+		.bar = *resptr,
+		.size = size,
+		.addr = offset,
+	};
+
+	um_pci_send_cmd(dev, &hdr, sizeof(hdr), buffer, size, NULL, 0);
+}
+
+static void um_pci_bar_write(void *priv, unsigned int offset, int size,
+			     unsigned long val)
+{
+	/* maximum size - we may only use parts of it */
+	u8 data[8];
+
+	switch (size) {
+	case 1:
+		data[0] = (u8)val;
+		break;
+	case 2:
+		put_unaligned_le16(val, (void *)data);
+		break;
+	case 4:
+		put_unaligned_le32(val, (void *)data);
+		break;
+#ifdef CONFIG_64BIT
+	case 8:
+		put_unaligned_le64(val, (void *)data);
+		break;
+#endif
+	default:
+		WARN(1, "invalid BAR write size %d\n", size);
+		return;
+	}
+
+	um_pci_bar_copy_to(priv, offset, data, size);
+}
+
+static void um_pci_bar_set(void *priv, unsigned int offset, u8 value, int size)
+{
+	u8 *resptr = priv;
+	struct um_pci_device *dev = container_of(resptr - *resptr,
+						 struct um_pci_device,
+						 resptr[0]);
+	struct {
+		struct virtio_pcidev_msg hdr;
+		u8 data;
+	} msg = {
+		.hdr = {
+			.op = VIRTIO_PCIDEV_OP_MMIO_MEMSET,
+			.bar = *resptr,
+			.size = size,
+			.addr = offset,
+		},
+		.data = value,
+	};
+
+	um_pci_send_cmd(dev, &msg.hdr, sizeof(msg), NULL, 0, NULL, 0);
+}
+
+static const struct logic_iomem_ops um_pci_device_bar_ops = {
+	.read = um_pci_bar_read,
+	.write = um_pci_bar_write,
+	.set = um_pci_bar_set,
+	.copy_from = um_pci_bar_copy_from,
+	.copy_to = um_pci_bar_copy_to,
+};
+
+static void __iomem *um_pci_map_bus(struct pci_bus *bus, unsigned int devfn,
+				    int where)
+{
+	struct um_pci_device_reg *dev;
+	unsigned int busn = bus->number;
+
+	if (busn > 0)
+		return NULL;
+
+	/* not allowing functions for now ... */
+	if (devfn % 8)
+		return NULL;
+
+	if (devfn / 8 >= ARRAY_SIZE(um_pci_devices))
+		return NULL;
+
+	dev = &um_pci_devices[devfn / 8];
+	if (!dev)
+		return NULL;
+
+	return (void __iomem *)((unsigned long)dev->iomem + where);
+}
+
+static struct pci_ops um_pci_ops = {
+	.map_bus = um_pci_map_bus,
+	.read = pci_generic_config_read,
+	.write = pci_generic_config_write,
+};
+
+static void um_pci_rescan(void)
+{
+	pci_lock_rescan_remove();
+	pci_rescan_bus(bridge->bus);
+	pci_unlock_rescan_remove();
+}
+
+static void um_pci_irq_vq_addbuf(struct virtqueue *vq, void *buf, bool kick)
+{
+	struct scatterlist sg[1];
+
+	sg_init_one(sg, buf, MAX_IRQ_MSG_SIZE);
+	if (virtqueue_add_inbuf(vq, sg, 1, buf, GFP_ATOMIC))
+		kfree(buf);
+	else if (kick)
+		virtqueue_kick(vq);
+}
+
+static void um_pci_handle_irq_message(struct virtqueue *vq,
+				      struct virtio_pcidev_msg *msg)
+{
+	struct virtio_device *vdev = vq->vdev;
+	struct um_pci_device *dev = vdev->priv;
+
+	/* we should properly chain interrupts, but on ARCH=um we don't care */
+
+	switch (msg->op) {
+	case VIRTIO_PCIDEV_OP_INT:
+		generic_handle_irq(dev->irq);
+		break;
+	case VIRTIO_PCIDEV_OP_MSI:
+		/* our MSI message is just the interrupt number */
+		if (msg->size == sizeof(u32))
+			generic_handle_irq(le32_to_cpup((void *)msg->data));
+		else
+			generic_handle_irq(le16_to_cpup((void *)msg->data));
+		break;
+	case VIRTIO_PCIDEV_OP_PME:
+		/* nothing to do - we already woke up due to the message */
+		break;
+	default:
+		dev_err(&vdev->dev, "unexpected virt-pci message %d\n", msg->op);
+		break;
+	}
+}
+
+static void um_pci_cmd_vq_cb(struct virtqueue *vq)
+{
+	struct virtio_device *vdev = vq->vdev;
+	struct um_pci_device *dev = vdev->priv;
+	void *cmd;
+	int len;
+
+	if (test_bit(UM_PCI_STAT_WAITING, &dev->status))
+		return;
+
+	while ((cmd = virtqueue_get_buf(vq, &len))) {
+		if (WARN_ON(HANDLE_IS_NO_FREE(cmd)))
+			continue;
+		kfree(cmd);
+	}
+}
+
+static void um_pci_irq_vq_cb(struct virtqueue *vq)
+{
+	struct virtio_pcidev_msg *msg;
+	int len;
+
+	while ((msg = virtqueue_get_buf(vq, &len))) {
+		if (len >= sizeof(*msg))
+			um_pci_handle_irq_message(vq, msg);
+
+		/* recycle the message buffer */
+		um_pci_irq_vq_addbuf(vq, msg, true);
+	}
+}
+
+static int um_pci_init_vqs(struct um_pci_device *dev)
+{
+	struct virtqueue *vqs[2];
+	static const char *const names[2] = { "cmd", "irq" };
+	vq_callback_t *cbs[2] = { um_pci_cmd_vq_cb, um_pci_irq_vq_cb };
+	int err, i;
+
+	err = virtio_find_vqs(dev->vdev, 2, vqs, cbs, names, NULL);
+	if (err)
+		return err;
+
+	dev->cmd_vq = vqs[0];
+	dev->irq_vq = vqs[1];
+
+	for (i = 0; i < NUM_IRQ_MSGS; i++) {
+		void *msg = kzalloc(MAX_IRQ_MSG_SIZE, GFP_KERNEL);
+
+		if (msg)
+			um_pci_irq_vq_addbuf(dev->irq_vq, msg, false);
+	}
+
+	virtqueue_kick(dev->irq_vq);
+
+	return 0;
+}
+
+static int um_pci_virtio_probe(struct virtio_device *vdev)
+{
+	struct um_pci_device *dev;
+	int i, free = -1;
+	int err = -ENOSPC;
+
+	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+	if (!dev)
+		return -ENOMEM;
+
+	dev->vdev = vdev;
+	vdev->priv = dev;
+
+	mutex_lock(&um_pci_mtx);
+	for (i = 0; i < MAX_DEVICES; i++) {
+		if (um_pci_devices[i].dev)
+			continue;
+		free = i;
+		break;
+	}
+
+	if (free < 0)
+		goto error;
+
+	err = um_pci_init_vqs(dev);
+	if (err)
+		goto error;
+
+	dev->irq = irq_alloc_desc(numa_node_id());
+	if (dev->irq < 0) {
+		err = dev->irq;
+		goto error;
+	}
+	um_pci_devices[free].dev = dev;
+	vdev->priv = dev;
+
+	mutex_unlock(&um_pci_mtx);
+
+	device_set_wakeup_enable(&vdev->dev, true);
+
+	um_pci_rescan();
+	return 0;
+error:
+	mutex_unlock(&um_pci_mtx);
+	kfree(dev);
+	return err;
+}
+
+static void um_pci_virtio_remove(struct virtio_device *vdev)
+{
+	struct um_pci_device *dev = vdev->priv;
+	int i;
+
+	/* Stop all virtqueues */
+	vdev->config->reset(vdev);
+	vdev->config->del_vqs(vdev);
+
+	device_set_wakeup_enable(&vdev->dev, false);
+
+	mutex_lock(&um_pci_mtx);
+	for (i = 0; i < MAX_DEVICES; i++) {
+		if (um_pci_devices[i].dev != dev)
+			continue;
+		um_pci_devices[i].dev = NULL;
+		irq_free_desc(dev->irq);
+	}
+	mutex_unlock(&um_pci_mtx);
+
+	um_pci_rescan();
+
+	kfree(dev);
+}
+
+static struct virtio_device_id id_table[] = {
+	{ CONFIG_UML_PCI_OVER_VIRTIO_DEVICE_ID, VIRTIO_DEV_ANY_ID },
+	{ 0 },
+};
+MODULE_DEVICE_TABLE(virtio, id_table);
+
+static struct virtio_driver um_pci_virtio_driver = {
+	.driver.name = "virtio-pci",
+	.driver.owner = THIS_MODULE,
+	.id_table = id_table,
+	.probe = um_pci_virtio_probe,
+	.remove = um_pci_virtio_remove,
+};
+
+static struct resource virt_cfgspace_resource = {
+	.name = "PCI config space",
+	.start = 0xf0000000 - MAX_DEVICES * CFG_SPACE_SIZE,
+	.end = 0xf0000000 - 1,
+	.flags = IORESOURCE_MEM,
+};
+
+static long um_pci_map_cfgspace(unsigned long offset, size_t size,
+				const struct logic_iomem_ops **ops,
+				void **priv)
+{
+	if (WARN_ON(size > CFG_SPACE_SIZE || offset % CFG_SPACE_SIZE))
+		return -EINVAL;
+
+	if (offset / CFG_SPACE_SIZE < MAX_DEVICES) {
+		*ops = &um_pci_device_cfgspace_ops;
+		*priv = &um_pci_devices[offset / CFG_SPACE_SIZE];
+		return 0;
+	}
+
+	WARN(1, "cannot map offset 0x%lx/0x%zx\n", offset, size);
+	return -ENOENT;
+}
+
+static const struct logic_iomem_region_ops um_pci_cfgspace_ops = {
+	.map = um_pci_map_cfgspace,
+};
+
+static struct resource virt_iomem_resource = {
+	.name = "PCI iomem",
+	.start = 0xf0000000,
+	.end = 0xffffffff,
+	.flags = IORESOURCE_MEM,
+};
+
+struct um_pci_map_iomem_data {
+	unsigned long offset;
+	size_t size;
+	const struct logic_iomem_ops **ops;
+	void **priv;
+	long ret;
+};
+
+static int um_pci_map_iomem_walk(struct pci_dev *pdev, void *_data)
+{
+	struct um_pci_map_iomem_data *data = _data;
+	struct um_pci_device_reg *reg = &um_pci_devices[pdev->devfn / 8];
+	struct um_pci_device *dev;
+	int i;
+
+	if (!reg->dev)
+		return 0;
+
+	for (i = 0; i < ARRAY_SIZE(dev->resptr); i++) {
+		struct resource *r = &pdev->resource[i];
+
+		if ((r->flags & IORESOURCE_TYPE_BITS) != IORESOURCE_MEM)
+			continue;
+
+		/*
+		 * must be the whole or part of the resource,
+		 * not allowed to only overlap
+		 */
+		if (data->offset < r->start || data->offset > r->end)
+			continue;
+		if (data->offset + data->size - 1 > r->end)
+			continue;
+
+		dev = reg->dev;
+		*data->ops = &um_pci_device_bar_ops;
+		dev->resptr[i] = i;
+		*data->priv = &dev->resptr[i];
+		data->ret = data->offset - r->start;
+
+		/* no need to continue */
+		return 1;
+	}
+
+	return 0;
+}
+
+static long um_pci_map_iomem(unsigned long offset, size_t size,
+			     const struct logic_iomem_ops **ops,
+			     void **priv)
+{
+	struct um_pci_map_iomem_data data = {
+		/* we want the full address here */
+		.offset = offset + virt_iomem_resource.start,
+		.size = size,
+		.ops = ops,
+		.priv = priv,
+		.ret = -ENOENT,
+	};
+
+	pci_walk_bus(bridge->bus, um_pci_map_iomem_walk, &data);
+	return data.ret;
+}
+
+static const struct logic_iomem_region_ops um_pci_iomem_ops = {
+	.map = um_pci_map_iomem,
+};
+
+static void um_pci_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
+{
+	/*
+	 * This is a very low address and not actually valid 'physical' memory
+	 * in UML, so we can simply map MSI(-X) vectors to there, it cannot be
+	 * legitimately written to by the device in any other way.
+	 * We use the (virtual) IRQ number here as the message to simplify the
+	 * code that receives the message, where for now we simply trust the
+	 * device to send the correct message.
+	 */
+	msg->address_hi = 0;
+	msg->address_lo = 0xa0000;
+	msg->data = data->irq;
+}
+
+static struct irq_chip um_pci_msi_bottom_irq_chip = {
+	.name = "UM virtio MSI",
+	.irq_compose_msi_msg = um_pci_compose_msi_msg,
+};
+
+static int um_pci_inner_domain_alloc(struct irq_domain *domain,
+				     unsigned int virq, unsigned int nr_irqs,
+				     void *args)
+{
+	unsigned long bit;
+
+	WARN_ON(nr_irqs != 1);
+
+	mutex_lock(&um_pci_mtx);
+	bit = find_first_zero_bit(um_pci_msi_used, MAX_MSI_VECTORS);
+	if (bit >= MAX_MSI_VECTORS) {
+		mutex_unlock(&um_pci_mtx);
+		return -ENOSPC;
+	}
+
+	set_bit(bit, um_pci_msi_used);
+	mutex_unlock(&um_pci_mtx);
+
+	irq_domain_set_info(domain, virq, bit, &um_pci_msi_bottom_irq_chip,
+			    domain->host_data, handle_simple_irq,
+			    NULL, NULL);
+
+	return 0;
+}
+
+static void um_pci_inner_domain_free(struct irq_domain *domain,
+				     unsigned int virq, unsigned int nr_irqs)
+{
+	struct irq_data *d = irq_domain_get_irq_data(domain, virq);
+
+	mutex_lock(&um_pci_mtx);
+
+	if (!test_bit(d->hwirq, um_pci_msi_used))
+		pr_err("trying to free unused MSI#%lu\n", d->hwirq);
+	else
+		__clear_bit(d->hwirq, um_pci_msi_used);
+
+	mutex_unlock(&um_pci_mtx);
+}
+
+static const struct irq_domain_ops um_pci_inner_domain_ops = {
+	.alloc = um_pci_inner_domain_alloc,
+	.free = um_pci_inner_domain_free,
+};
+
+static struct irq_chip um_pci_msi_irq_chip = {
+	.name = "UM virtio PCIe MSI",
+	.irq_mask = pci_msi_mask_irq,
+	.irq_unmask = pci_msi_unmask_irq,
+};
+
+static struct msi_domain_info um_pci_msi_domain_info = {
+	.flags	= MSI_FLAG_USE_DEF_DOM_OPS |
+		  MSI_FLAG_USE_DEF_CHIP_OPS |
+		  MSI_FLAG_PCI_MSIX,
+	.chip	= &um_pci_msi_irq_chip,
+};
+
+static struct resource busn_resource = {
+	.name	= "PCI busn",
+	.start	= 0,
+	.end	= 0,
+	.flags	= IORESOURCE_BUS,
+};
+
+static int um_pci_map_irq(const struct pci_dev *pdev, u8 slot, u8 pin)
+{
+	struct um_pci_device_reg *reg = &um_pci_devices[pdev->devfn / 8];
+
+	if (WARN_ON(!reg->dev))
+		return -EINVAL;
+
+	/* Yes, we map all pins to the same IRQ ... doesn't matter for now. */
+	return reg->dev->irq;
+}
+
+void *pci_root_bus_fwnode(struct pci_bus *bus)
+{
+	return um_pci_fwnode;
+}
+
+int um_pci_init(void)
+{
+	int err, i;
+
+	WARN_ON(logic_iomem_add_region(&virt_cfgspace_resource,
+				       &um_pci_cfgspace_ops));
+	WARN_ON(logic_iomem_add_region(&virt_iomem_resource,
+				       &um_pci_iomem_ops));
+
+	if (WARN(CONFIG_UML_PCI_OVER_VIRTIO_DEVICE_ID < 0,
+		 "No virtio device ID configured for PCI - no PCI support\n"))
+		return 0;
+
+	bridge = pci_alloc_host_bridge(0);
+	if (!bridge)
+		return -ENOMEM;
+
+	um_pci_fwnode = irq_domain_alloc_named_fwnode("um-pci");
+	if (!um_pci_fwnode) {
+		err = -ENOMEM;
+		goto free;
+	}
+
+	um_pci_inner_domain = __irq_domain_add(um_pci_fwnode, MAX_MSI_VECTORS,
+					       MAX_MSI_VECTORS, 0,
+					       &um_pci_inner_domain_ops, NULL);
+	if (!um_pci_inner_domain) {
+		err = -ENOMEM;
+		goto free;
+	}
+
+	um_pci_msi_domain = pci_msi_create_irq_domain(um_pci_fwnode,
+						      &um_pci_msi_domain_info,
+						      um_pci_inner_domain);
+	if (!um_pci_msi_domain) {
+		err = -ENOMEM;
+		goto free;
+	}
+
+	pci_add_resource(&bridge->windows, &virt_iomem_resource);
+	pci_add_resource(&bridge->windows, &busn_resource);
+	bridge->ops = &um_pci_ops;
+	bridge->map_irq = um_pci_map_irq;
+
+	for (i = 0; i < MAX_DEVICES; i++) {
+		resource_size_t start;
+
+		start = virt_cfgspace_resource.start + i * CFG_SPACE_SIZE;
+		um_pci_devices[i].iomem = ioremap(start, CFG_SPACE_SIZE);
+		if (WARN(!um_pci_devices[i].iomem, "failed to map %d\n", i)) {
+			err = -ENOMEM;
+			goto free;
+		}
+	}
+
+	err = pci_host_probe(bridge);
+	if (err)
+		goto free;
+
+	err = register_virtio_driver(&um_pci_virtio_driver);
+	if (err)
+		goto free;
+	return 0;
+free:
+	if (um_pci_inner_domain)
+		irq_domain_remove(um_pci_inner_domain);
+	if (um_pci_fwnode)
+		irq_domain_free_fwnode(um_pci_fwnode);
+	pci_free_resource_list(&bridge->windows);
+	pci_free_host_bridge(bridge);
+	return err;
+}
+module_init(um_pci_init);
+
+void um_pci_exit(void)
+{
+	unregister_virtio_driver(&um_pci_virtio_driver);
+	irq_domain_remove(um_pci_msi_domain);
+	irq_domain_remove(um_pci_inner_domain);
+	pci_free_resource_list(&bridge->windows);
+	pci_free_host_bridge(bridge);
+}
+module_exit(um_pci_exit);
diff --git a/arch/um/include/asm/Kbuild b/arch/um/include/asm/Kbuild
index 10b7228b3aee..0c31d19a7a9c 100644
--- a/arch/um/include/asm/Kbuild
+++ b/arch/um/include/asm/Kbuild
@@ -18,7 +18,6 @@ generic-y += mcs_spinlock.h
 generic-y += mmiowb.h
 generic-y += module.lds.h
 generic-y += param.h
-generic-y += pci.h
 generic-y += percpu.h
 generic-y += preempt.h
 generic-y += softirq_stack.h
diff --git a/arch/um/include/asm/io.h b/arch/um/include/asm/io.h
index 6ce18d343997..9ea42cc746d9 100644
--- a/arch/um/include/asm/io.h
+++ b/arch/um/include/asm/io.h
@@ -3,16 +3,23 @@
 #define _ASM_UM_IO_H
 #include <linux/types.h>
 
+/* get emulated iomem (if desired) */
+#include <asm-generic/logic_io.h>
+
+#ifndef ioremap
 #define ioremap ioremap
 static inline void __iomem *ioremap(phys_addr_t offset, size_t size)
 {
 	return NULL;
 }
+#endif /* ioremap */
 
+#ifndef iounmap
 #define iounmap iounmap
 static inline void iounmap(void __iomem *addr)
 {
 }
+#endif /* iounmap */
 
 #include <asm-generic/io.h>
 
diff --git a/arch/um/include/asm/irq.h b/arch/um/include/asm/irq.h
index 3f5d3e8228fc..e187c789369d 100644
--- a/arch/um/include/asm/irq.h
+++ b/arch/um/include/asm/irq.h
@@ -31,7 +31,13 @@
 
 #endif
 
-#define NR_IRQS			64
+#define UM_LAST_SIGNAL_IRQ	64
+/* If we have (simulated) PCI MSI, allow 64 more interrupt numbers for it */
+#ifdef CONFIG_PCI_MSI
+#define NR_IRQS			(UM_LAST_SIGNAL_IRQ + 64)
+#else
+#define NR_IRQS			UM_LAST_SIGNAL_IRQ
+#endif /* CONFIG_PCI_MSI */
 
 #include <asm-generic/irq.h>
 #endif
diff --git a/arch/um/include/asm/msi.h b/arch/um/include/asm/msi.h
new file mode 100644
index 000000000000..c8c6c381f394
--- /dev/null
+++ b/arch/um/include/asm/msi.h
@@ -0,0 +1 @@
+#include <asm-generic/msi.h>
diff --git a/arch/um/include/asm/pci.h b/arch/um/include/asm/pci.h
new file mode 100644
index 000000000000..da13fd5519ef
--- /dev/null
+++ b/arch/um/include/asm/pci.h
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __ASM_UM_PCI_H
+#define __ASM_UM_PCI_H
+#include <linux/types.h>
+#include <asm/io.h>
+
+#define PCIBIOS_MIN_IO		0
+#define PCIBIOS_MIN_MEM		0
+
+#define pcibios_assign_all_busses() 1
+
+extern int isa_dma_bridge_buggy;
+
+#ifdef CONFIG_PCI
+static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
+{
+	/* no legacy IRQs */
+	return -ENODEV;
+}
+#endif
+
+#ifdef CONFIG_PCI_DOMAINS
+static inline int pci_proc_domain(struct pci_bus *bus)
+{
+	/* always show the domain in /proc */
+	return 1;
+}
+#endif  /* CONFIG_PCI_DOMAINS */
+
+#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
+/*
+ * This is a bit of an annoying hack, and it assumes we only have
+ * the virt-pci (if anything). Which is true, but still.
+ */
+void *pci_root_bus_fwnode(struct pci_bus *bus);
+#define pci_root_bus_fwnode	pci_root_bus_fwnode
+#endif
+
+#endif  /* __ASM_UM_PCI_H */
diff --git a/arch/um/kernel/Makefile b/arch/um/kernel/Makefile
index 5aa882011e04..c1205f9ec17e 100644
--- a/arch/um/kernel/Makefile
+++ b/arch/um/kernel/Makefile
@@ -24,6 +24,7 @@ obj-$(CONFIG_GPROF)	+= gprof_syms.o
 obj-$(CONFIG_GCOV)	+= gmon_syms.o
 obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
 obj-$(CONFIG_STACKTRACE) += stacktrace.o
+obj-$(CONFIG_GENERIC_PCI_IOMAP) += ioport.o
 
 USER_OBJS := config.o
 
diff --git a/arch/um/kernel/ioport.c b/arch/um/kernel/ioport.c
new file mode 100644
index 000000000000..7220615b3beb
--- /dev/null
+++ b/arch/um/kernel/ioport.c
@@ -0,0 +1,13 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2021 Intel Corporation
+ * Author: Johannes Berg <johannes@sipsolutions.net>
+ */
+#include <asm/iomap.h>
+#include <asm-generic/pci_iomap.h>
+
+void __iomem *__pci_ioport_map(struct pci_dev *dev, unsigned long port,
+			       unsigned int nr)
+{
+	return NULL;
+}
diff --git a/arch/um/kernel/irq.c b/arch/um/kernel/irq.c
index 2ee0a368aa59..cb7c2ebf260c 100644
--- a/arch/um/kernel/irq.c
+++ b/arch/um/kernel/irq.c
@@ -56,7 +56,7 @@ struct irq_entry {
 
 static DEFINE_SPINLOCK(irq_lock);
 static LIST_HEAD(active_fds);
-static DECLARE_BITMAP(irqs_allocated, NR_IRQS);
+static DECLARE_BITMAP(irqs_allocated, UM_LAST_SIGNAL_IRQ);
 static bool irqs_suspended;
 
 static void irq_io_loop(struct irq_reg *irq, struct uml_pt_regs *regs)
@@ -426,7 +426,8 @@ unsigned int do_IRQ(int irq, struct uml_pt_regs *regs)
 
 void um_free_irq(int irq, void *dev)
 {
-	if (WARN(irq < 0 || irq > NR_IRQS, "freeing invalid irq %d", irq))
+	if (WARN(irq < 0 || irq > UM_LAST_SIGNAL_IRQ,
+		 "freeing invalid irq %d", irq))
 		return;
 
 	free_irq_by_irq_and_dev(irq, dev);
@@ -655,7 +656,7 @@ void __init init_IRQ(void)
 
 	irq_set_chip_and_handler(TIMER_IRQ, &alarm_irq_type, handle_edge_irq);
 
-	for (i = 1; i < NR_IRQS; i++)
+	for (i = 1; i < UM_LAST_SIGNAL_IRQ; i++)
 		irq_set_chip_and_handler(i, &normal_irq_type, handle_edge_irq);
 	/* Initialize EPOLL Loop */
 	os_setup_epoll();
diff --git a/include/uapi/linux/virtio_pcidev.h b/include/uapi/linux/virtio_pcidev.h
new file mode 100644
index 000000000000..89daa88bcfef
--- /dev/null
+++ b/include/uapi/linux/virtio_pcidev.h
@@ -0,0 +1,64 @@
+/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */
+/*
+ * Copyright (C) 2021 Intel Corporation
+ * Author: Johannes Berg <johannes@sipsolutions.net>
+ */
+#ifndef _UAPI_LINUX_VIRTIO_PCIDEV_H
+#define _UAPI_LINUX_VIRTIO_PCIDEV_H
+#include <linux/types.h>
+
+/**
+ * enum virtio_pcidev_ops - virtual PCI device operations
+ * @VIRTIO_PCIDEV_OP_CFG_READ: read config space, size is 1, 2, 4 or 8;
+ *	the @data field should be filled in by the device (in little endian).
+ * @VIRTIO_PCIDEV_OP_CFG_WRITE: write config space, size is 1, 2, 4 or 8;
+ *	the @data field contains the data to write (in little endian).
+ * @VIRTIO_PCIDEV_OP_BAR_READ: read BAR mem/pio, size can be variable;
+ *	the @data field should be filled in by the device (in little endian).
+ * @VIRTIO_PCIDEV_OP_BAR_WRITE: write BAR mem/pio, size can be variable;
+ *	the @data field contains the data to write (in little endian).
+ * @VIRTIO_PCIDEV_OP_MMIO_MEMSET: memset MMIO, size is variable but
+ *	the @data field only has one byte (unlike @VIRTIO_PCIDEV_OP_MMIO_WRITE)
+ * @VIRTIO_PCIDEV_OP_INT: legacy INTx# pin interrupt, the addr field is 1-4 for
+ *	the number
+ * @VIRTIO_PCIDEV_OP_MSI: MSI(-X) interrupt, this message basically transports
+ *	the 16- or 32-bit write that would otherwise be done into memory,
+ *	analogous to the write messages (@VIRTIO_PCIDEV_OP_MMIO_WRITE) above
+ * @VIRTIO_PCIDEV_OP_PME: Dummy message whose content is ignored (and should be
+ *	all zeroes) to signal the PME# pin.
+ */
+enum virtio_pcidev_ops {
+	VIRTIO_PCIDEV_OP_RESERVED = 0,
+	VIRTIO_PCIDEV_OP_CFG_READ,
+	VIRTIO_PCIDEV_OP_CFG_WRITE,
+	VIRTIO_PCIDEV_OP_MMIO_READ,
+	VIRTIO_PCIDEV_OP_MMIO_WRITE,
+	VIRTIO_PCIDEV_OP_MMIO_MEMSET,
+	VIRTIO_PCIDEV_OP_INT,
+	VIRTIO_PCIDEV_OP_MSI,
+	VIRTIO_PCIDEV_OP_PME,
+};
+
+/**
+ * struct virtio_pcidev_msg - virtio PCI device operation
+ * @op: the operation to do
+ * @bar: the bar (only with BAR read/write messages)
+ * @reserved: reserved
+ * @size: the size of the read/write (in bytes)
+ * @addr: the address to read/write
+ * @data: the data, normally @size long, but just one byte for
+ *	%VIRTIO_PCIDEV_OP_MMIO_MEMSET
+ *
+ * Note: the fields are all in native (CPU) endian, however, the
+ * @data values will often be in little endian (see the ops above.)
+ */
+struct virtio_pcidev_msg {
+	__u8 op;
+	__u8 bar;
+	__u16 reserved;
+	__u32 size;
+	__u64 addr;
+	__u8 data[];
+};
+
+#endif /* _UAPI_LINUX_VIRTIO_PCIDEV_H */
-- 
2.26.2
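
For readers who want to poke at the message layout from virtio_pcidev.h
outside the kernel, here is a minimal userspace sketch. It mirrors the
enum and struct from the patch using <stdint.h> types in place of
__u8/__u16/__u32/__u64 (an assumption made for portability), and the two
helpers pcidev_msg_alloc() and pcidev_msg_memset() are hypothetical names
invented for illustration; they are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace mirror of enum virtio_pcidev_ops from the patch. */
enum virtio_pcidev_ops {
	VIRTIO_PCIDEV_OP_RESERVED = 0,
	VIRTIO_PCIDEV_OP_CFG_READ,
	VIRTIO_PCIDEV_OP_CFG_WRITE,
	VIRTIO_PCIDEV_OP_MMIO_READ,
	VIRTIO_PCIDEV_OP_MMIO_WRITE,
	VIRTIO_PCIDEV_OP_MMIO_MEMSET,
	VIRTIO_PCIDEV_OP_INT,
	VIRTIO_PCIDEV_OP_MSI,
	VIRTIO_PCIDEV_OP_PME,
};

/* Userspace mirror of struct virtio_pcidev_msg (stdint types assumed). */
struct virtio_pcidev_msg {
	uint8_t op;
	uint8_t bar;
	uint16_t reserved;
	uint32_t size;
	uint64_t addr;
	uint8_t data[];
};

/* 1 + 1 + 2 + 4 + 8 bytes, no padding: data[] starts at offset 16. */
static_assert(sizeof(struct virtio_pcidev_msg) == 16,
	      "header should be 16 bytes");

/* Hypothetical helper: zeroed message with room for payload_len data bytes. */
static struct virtio_pcidev_msg *pcidev_msg_alloc(uint8_t op, uint8_t bar,
						  uint64_t addr, uint32_t size,
						  size_t payload_len)
{
	struct virtio_pcidev_msg *msg = calloc(1, sizeof(*msg) + payload_len);

	if (!msg)
		return NULL;
	msg->op = op;
	msg->bar = bar;
	msg->size = size;
	msg->addr = addr;
	return msg;
}

/*
 * Hypothetical helper: per the kernel-doc, a MEMSET message describes
 * 'size' bytes of MMIO but carries just one data byte.
 */
static struct virtio_pcidev_msg *pcidev_msg_memset(uint8_t bar, uint64_t addr,
						   uint32_t size, uint8_t value)
{
	struct virtio_pcidev_msg *msg =
		pcidev_msg_alloc(VIRTIO_PCIDEV_OP_MMIO_MEMSET, bar, addr,
				 size, 1);

	if (msg)
		msg->data[0] = value;
	return msg;
}
```

The caller owns the returned buffer and releases it with free(). How the
message is queued on the virtqueue is deliberately left out, since that
part lives in virt-pci.c and is not reproduced here.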



* [PATCH v2 8/8] um: virtio/pci: enable suspend/resume
  2021-03-01 15:07 [PATCH v2 0/8] PCI support for UML Johannes Berg
                   ` (6 preceding siblings ...)
  2021-03-01 15:07 ` [PATCH v2 7/8] um: add PCI over virtio emulation driver Johannes Berg
@ 2021-03-01 15:07 ` Johannes Berg
  7 siblings, 0 replies; 11+ messages in thread
From: Johannes Berg @ 2021-03-01 15:07 UTC (permalink / raw)
  To: linux-um; +Cc: Arnd Bergmann, linux-kernel, Johannes Berg

From: Johannes Berg <johannes.berg@intel.com>

The UM virtual PCI devices currently cannot be suspended properly
since the virtio driver already disables VQs well before the PCI
bus's suspend_noirq wants to complete the transition by writing to
PCI config space.

After trying for a long time to move the devices on the DPM list,
create dependencies between them, etc., I gave up and instead added
a UML-specific cross-driver API that lets the virt-pci code opt its
devices out of VQ suspend/resume.

This then allows the PCI bus suspend_noirq to still talk to the
device, and suspend/resume works properly.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
---
 arch/um/drivers/virt-pci.c         | 10 ++++++++
 arch/um/drivers/virtio_uml.c       | 40 ++++++++++++++++++++++--------
 arch/um/include/linux/virtio-uml.h | 13 ++++++++++
 3 files changed, 53 insertions(+), 10 deletions(-)
 create mode 100644 arch/um/include/linux/virtio-uml.h

diff --git a/arch/um/drivers/virt-pci.c b/arch/um/drivers/virt-pci.c
index dd85f36197aa..0b802834f40a 100644
--- a/arch/um/drivers/virt-pci.c
+++ b/arch/um/drivers/virt-pci.c
@@ -10,6 +10,7 @@
 #include <linux/logic_iomem.h>
 #include <linux/irqdomain.h>
 #include <linux/virtio_pcidev.h>
+#include <linux/virtio-uml.h>
 #include <linux/delay.h>
 #include <linux/msi.h>
 #include <asm/unaligned.h>
@@ -134,6 +135,9 @@ static int um_pci_send_cmd(struct um_pci_device *dev,
 		if (completed == HANDLE_NO_FREE(cmd))
 			break;
 
+		if (completed && !HANDLE_IS_NO_FREE(completed))
+			kfree(completed);
+
 		if (WARN_ONCE(virtqueue_is_broken(dev->cmd_vq) ||
 			      ++delay_count > UM_VIRT_PCI_MAXDELAY,
 			      "um virt-pci delay: %d", delay_count)) {
@@ -550,6 +554,12 @@ static int um_pci_virtio_probe(struct virtio_device *vdev)
 
 	device_set_wakeup_enable(&vdev->dev, true);
 
+	/*
+	 * In order to do suspend-resume properly, don't allow VQs
+	 * to be suspended.
+	 */
+	virtio_uml_set_no_vq_suspend(vdev, true);
+
 	um_pci_rescan();
 	return 0;
 error:
diff --git a/arch/um/drivers/virtio_uml.c b/arch/um/drivers/virtio_uml.c
index 91ddf74ca888..4412d6febade 100644
--- a/arch/um/drivers/virtio_uml.c
+++ b/arch/um/drivers/virtio_uml.c
@@ -56,6 +56,7 @@ struct virtio_uml_device {
 	u8 status;
 	u8 registered:1;
 	u8 suspended:1;
+	u8 no_vq_suspend:1;
 
 	u8 config_changed_irq:1;
 	uint64_t vq_irq_vq_map;
@@ -1098,6 +1099,19 @@ static void virtio_uml_release_dev(struct device *d)
 	kfree(vu_dev);
 }
 
+void virtio_uml_set_no_vq_suspend(struct virtio_device *vdev,
+				  bool no_vq_suspend)
+{
+	struct virtio_uml_device *vu_dev = to_virtio_uml_device(vdev);
+
+	if (WARN_ON(vdev->config != &virtio_uml_config_ops))
+		return;
+
+	vu_dev->no_vq_suspend = no_vq_suspend;
+	dev_info(&vdev->dev, "%sabled VQ suspend\n",
+		 no_vq_suspend ? "dis" : "en");
+}
+
 /* Platform device */
 
 static int virtio_uml_probe(struct platform_device *pdev)
@@ -1302,13 +1316,16 @@ MODULE_DEVICE_TABLE(of, virtio_uml_match);
 static int virtio_uml_suspend(struct platform_device *pdev, pm_message_t state)
 {
 	struct virtio_uml_device *vu_dev = platform_get_drvdata(pdev);
-	struct virtqueue *vq;
 
-	virtio_device_for_each_vq((&vu_dev->vdev), vq) {
-		struct virtio_uml_vq_info *info = vq->priv;
+	if (!vu_dev->no_vq_suspend) {
+		struct virtqueue *vq;
 
-		info->suspended = true;
-		vhost_user_set_vring_enable(vu_dev, vq->index, false);
+		virtio_device_for_each_vq((&vu_dev->vdev), vq) {
+			struct virtio_uml_vq_info *info = vq->priv;
+
+			info->suspended = true;
+			vhost_user_set_vring_enable(vu_dev, vq->index, false);
+		}
 	}
 
 	if (!device_may_wakeup(&vu_dev->vdev.dev)) {
@@ -1322,13 +1339,16 @@ static int virtio_uml_suspend(struct platform_device *pdev, pm_message_t state)
 static int virtio_uml_resume(struct platform_device *pdev)
 {
 	struct virtio_uml_device *vu_dev = platform_get_drvdata(pdev);
-	struct virtqueue *vq;
 
-	virtio_device_for_each_vq((&vu_dev->vdev), vq) {
-		struct virtio_uml_vq_info *info = vq->priv;
+	if (!vu_dev->no_vq_suspend) {
+		struct virtqueue *vq;
+
+		virtio_device_for_each_vq((&vu_dev->vdev), vq) {
+			struct virtio_uml_vq_info *info = vq->priv;
 
-		info->suspended = false;
-		vhost_user_set_vring_enable(vu_dev, vq->index, true);
+			info->suspended = false;
+			vhost_user_set_vring_enable(vu_dev, vq->index, true);
+		}
 	}
 
 	vu_dev->suspended = false;
diff --git a/arch/um/include/linux/virtio-uml.h b/arch/um/include/linux/virtio-uml.h
new file mode 100644
index 000000000000..2f652fa90f04
--- /dev/null
+++ b/arch/um/include/linux/virtio-uml.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2021 Intel Corporation
+ * Author: Johannes Berg <johannes@sipsolutions.net>
+ */
+
+#ifndef __VIRTIO_UML_H__
+#define __VIRTIO_UML_H__
+
+void virtio_uml_set_no_vq_suspend(struct virtio_device *vdev,
+				  bool no_vq_suspend);
+
+#endif /* __VIRTIO_UML_H__ */
-- 
2.26.2
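
The effect of the no_vq_suspend flag in the patch above can be modelled
in plain userspace C. This is only a sketch of the control flow in
virtio_uml_suspend()/virtio_uml_resume(): the model_* names, the fixed
VQ count, and the ring_enabled field (standing in for what
vhost_user_set_vring_enable() would toggle) are all assumptions made for
illustration, not kernel API.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NUM_VQS 2	/* hypothetical VQ count for the model */

/* Stand-in for the per-VQ state the driver tracks. */
struct model_vq {
	bool suspended;
	bool ring_enabled;
};

/* Stand-in for struct virtio_uml_device, reduced to the relevant bits. */
struct model_dev {
	bool no_vq_suspend;
	struct model_vq vqs[MODEL_NUM_VQS];
};

/* Start with every ring enabled and nothing suspended. */
static void model_init(struct model_dev *dev, bool no_vq_suspend)
{
	size_t i;

	dev->no_vq_suspend = no_vq_suspend;
	for (i = 0; i < MODEL_NUM_VQS; i++) {
		dev->vqs[i].suspended = false;
		dev->vqs[i].ring_enabled = true;
	}
}

/* Mirrors the suspend path: skip the VQ loop when the flag is set. */
static void model_suspend(struct model_dev *dev)
{
	size_t i;

	if (dev->no_vq_suspend)
		return;

	for (i = 0; i < MODEL_NUM_VQS; i++) {
		dev->vqs[i].suspended = true;
		dev->vqs[i].ring_enabled = false;
	}
}

/* Mirrors the resume path: only touch VQs that suspend would have touched. */
static void model_resume(struct model_dev *dev)
{
	size_t i;

	if (dev->no_vq_suspend)
		return;

	for (i = 0; i < MODEL_NUM_VQS; i++) {
		dev->vqs[i].suspended = false;
		dev->vqs[i].ring_enabled = true;
	}
}
```

With the flag set, the rings stay enabled across model_suspend(), which
is the property the PCI bus suspend_noirq step relies on to keep writing
config space.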



* Re: [PATCH v2 7/8] um: add PCI over virtio emulation driver
  2021-03-01 15:07 ` [PATCH v2 7/8] um: add PCI over virtio emulation driver Johannes Berg
@ 2021-08-17 15:50   ` Bjorn Helgaas
  2021-08-17 15:51     ` Johannes Berg
  0 siblings, 1 reply; 11+ messages in thread
From: Bjorn Helgaas @ 2021-08-17 15:50 UTC (permalink / raw)
  To: Johannes Berg; +Cc: linux-um, Arnd Bergmann, linux-kernel, Johannes Berg

On Mon, Mar 01, 2021 at 04:07:07PM +0100, Johannes Berg wrote:
> From: Johannes Berg <johannes.berg@intel.com>
> 
> To support testing of PCI/PCIe drivers in UML, add a PCI bus
> support driver. This driver uses virtio, which in UML is really
> just vhost-user, to talk to devices, and adds the devices to
> the virtual PCI bus in the system.

Hi Johannes,

The virtio_pcidev_ops kernel-doc below doesn't match the actual enum,
so it generates several warnings:

  include/uapi/linux/virtio_pcidev.h:41: warning: Enum value 'VIRTIO_PCIDEV_OP_RESERVED' not described in enum 'virtio_pcidev_ops'
  include/uapi/linux/virtio_pcidev.h:41: warning: Enum value 'VIRTIO_PCIDEV_OP_MMIO_READ' not described in enum 'virtio_pcidev_ops'
  include/uapi/linux/virtio_pcidev.h:41: warning: Enum value 'VIRTIO_PCIDEV_OP_MMIO_WRITE' not described in enum 'virtio_pcidev_ops'
  include/uapi/linux/virtio_pcidev.h:41: warning: Excess enum value 'VIRTIO_PCIDEV_OP_BAR_READ' description in 'virtio_pcidev_ops'
  include/uapi/linux/virtio_pcidev.h:41: warning: Excess enum value 'VIRTIO_PCIDEV_OP_BAR_WRITE' description in 'virtio_pcidev_ops'

FWIW, here's the command I used to find these:

  $ find include drivers/pci -type f -path "*pci*.[ch]" | xargs scripts/kernel-doc -none

> diff --git a/include/uapi/linux/virtio_pcidev.h b/include/uapi/linux/virtio_pcidev.h
> new file mode 100644
> index 000000000000..89daa88bcfef
> --- /dev/null
> +++ b/include/uapi/linux/virtio_pcidev.h
> @@ -0,0 +1,64 @@
> +/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */
> +/*
> + * Copyright (C) 2021 Intel Corporation
> + * Author: Johannes Berg <johannes@sipsolutions.net>
> + */
> +#ifndef _UAPI_LINUX_VIRTIO_PCIDEV_H
> +#define _UAPI_LINUX_VIRTIO_PCIDEV_H
> +#include <linux/types.h>
> +
> +/**
> + * enum virtio_pcidev_ops - virtual PCI device operations
> + * @VIRTIO_PCIDEV_OP_CFG_READ: read config space, size is 1, 2, 4 or 8;
> + *	the @data field should be filled in by the device (in little endian).
> + * @VIRTIO_PCIDEV_OP_CFG_WRITE: write config space, size is 1, 2, 4 or 8;
> + *	the @data field contains the data to write (in little endian).
> + * @VIRTIO_PCIDEV_OP_BAR_READ: read BAR mem/pio, size can be variable;
> + *	the @data field should be filled in by the device (in little endian).
> + * @VIRTIO_PCIDEV_OP_BAR_WRITE: write BAR mem/pio, size can be variable;
> + *	the @data field contains the data to write (in little endian).
> + * @VIRTIO_PCIDEV_OP_MMIO_MEMSET: memset MMIO, size is variable but
> + *	the @data field only has one byte (unlike @VIRTIO_PCIDEV_OP_MMIO_WRITE)
> + * @VIRTIO_PCIDEV_OP_INT: legacy INTx# pin interrupt, the addr field is 1-4 for
> + *	the number
> + * @VIRTIO_PCIDEV_OP_MSI: MSI(-X) interrupt, this message basically transports
> + *	the 16- or 32-bit write that would otherwise be done into memory,
> + *	analogous to the write messages (@VIRTIO_PCIDEV_OP_MMIO_WRITE) above
> + * @VIRTIO_PCIDEV_OP_PME: Dummy message whose content is ignored (and should be
> + *	all zeroes) to signal the PME# pin.
> + */
> +enum virtio_pcidev_ops {
> +	VIRTIO_PCIDEV_OP_RESERVED = 0,
> +	VIRTIO_PCIDEV_OP_CFG_READ,
> +	VIRTIO_PCIDEV_OP_CFG_WRITE,
> +	VIRTIO_PCIDEV_OP_MMIO_READ,
> +	VIRTIO_PCIDEV_OP_MMIO_WRITE,
> +	VIRTIO_PCIDEV_OP_MMIO_MEMSET,
> +	VIRTIO_PCIDEV_OP_INT,
> +	VIRTIO_PCIDEV_OP_MSI,
> +	VIRTIO_PCIDEV_OP_PME,
> +};


* Re: [PATCH v2 7/8] um: add PCI over virtio emulation driver
  2021-08-17 15:50   ` Bjorn Helgaas
@ 2021-08-17 15:51     ` Johannes Berg
  0 siblings, 0 replies; 11+ messages in thread
From: Johannes Berg @ 2021-08-17 15:51 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-um, Arnd Bergmann, linux-kernel

Hi,

> The virtio_pcidev_ops kernel-doc below doesn't match the actual enum,
> so it generates several warnings:
> 
>   include/uapi/linux/virtio_pcidev.h:41: warning: Enum value 'VIRTIO_PCIDEV_OP_RESERVED' not described in enum 'virtio_pcidev_ops'
>   include/uapi/linux/virtio_pcidev.h:41: warning: Enum value 'VIRTIO_PCIDEV_OP_MMIO_READ' not described in enum 'virtio_pcidev_ops'
>   include/uapi/linux/virtio_pcidev.h:41: warning: Enum value 'VIRTIO_PCIDEV_OP_MMIO_WRITE' not described in enum 'virtio_pcidev_ops'
>   include/uapi/linux/virtio_pcidev.h:41: warning: Excess enum value 'VIRTIO_PCIDEV_OP_BAR_READ' description in 'virtio_pcidev_ops'
>   include/uapi/linux/virtio_pcidev.h:41: warning: Excess enum value 'VIRTIO_PCIDEV_OP_BAR_WRITE' description in 'virtio_pcidev_ops'
> 
> FWIW, here's the command I used to find these:
> 
>   $ find include drivers/pci -type f -path "*pci*.[ch]" | xargs scripts/kernel-doc -none

Oops. It didn't get tied to any real doc creation so I guess the bots
didn't find it, but I'll send a patch to fix it, thanks!

johannes
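
For reference, one way the kernel-doc could be brought in line with the
enum is sketched below. It assumes the enum names are the ones to keep
(so the two BAR_READ/BAR_WRITE entries are renamed to MMIO_READ and
MMIO_WRITE) and adds an entry for the reserved value. The wording of the
@VIRTIO_PCIDEV_OP_RESERVED line is an assumption; the fix that actually
gets sent may differ.

```c
/**
 * enum virtio_pcidev_ops - virtual PCI device operations
 * @VIRTIO_PCIDEV_OP_RESERVED: reserved to catch errors
 *	(assumed wording, not from the original patch)
 * @VIRTIO_PCIDEV_OP_CFG_READ: read config space, size is 1, 2, 4 or 8;
 *	the @data field should be filled in by the device (in little endian).
 * @VIRTIO_PCIDEV_OP_CFG_WRITE: write config space, size is 1, 2, 4 or 8;
 *	the @data field contains the data to write (in little endian).
 * @VIRTIO_PCIDEV_OP_MMIO_READ: read BAR mem/pio, size can be variable;
 *	the @data field should be filled in by the device (in little endian).
 * @VIRTIO_PCIDEV_OP_MMIO_WRITE: write BAR mem/pio, size can be variable;
 *	the @data field contains the data to write (in little endian).
 * @VIRTIO_PCIDEV_OP_MMIO_MEMSET: memset MMIO, size is variable but
 *	the @data field only has one byte (unlike @VIRTIO_PCIDEV_OP_MMIO_WRITE)
 * @VIRTIO_PCIDEV_OP_INT: legacy INTx# pin interrupt, the addr field is 1-4 for
 *	the number
 * @VIRTIO_PCIDEV_OP_MSI: MSI(-X) interrupt, this message basically transports
 *	the 16- or 32-bit write that would otherwise be done into memory,
 *	analogous to the write messages (@VIRTIO_PCIDEV_OP_MMIO_WRITE) above
 * @VIRTIO_PCIDEV_OP_PME: Dummy message whose content is ignored (and should be
 *	all zeroes) to signal the PME# pin.
 */
enum virtio_pcidev_ops {
	VIRTIO_PCIDEV_OP_RESERVED = 0,
	VIRTIO_PCIDEV_OP_CFG_READ,
	VIRTIO_PCIDEV_OP_CFG_WRITE,
	VIRTIO_PCIDEV_OP_MMIO_READ,
	VIRTIO_PCIDEV_OP_MMIO_WRITE,
	VIRTIO_PCIDEV_OP_MMIO_MEMSET,
	VIRTIO_PCIDEV_OP_INT,
	VIRTIO_PCIDEV_OP_MSI,
	VIRTIO_PCIDEV_OP_PME,
};
```

Only the comment changes; the enum values themselves, and therefore the
wire format, stay exactly as in the patch.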



Thread overview: 11+ messages
2021-03-01 15:07 [PATCH v2 0/8] PCI support for UML Johannes Berg
2021-03-01 15:07 ` [PATCH v2 1/8] um: allow disabling NO_IOMEM Johannes Berg
2021-03-01 15:07 ` [PATCH v2 2/8] lib: add iomem emulation (logic_iomem) Johannes Berg
2021-03-01 15:07 ` [PATCH v2 3/8] um: remove unused smp_sigio_handler() declaration Johannes Berg
2021-03-01 15:07 ` [PATCH v2 4/8] um: export signals_enabled directly Johannes Berg
2021-03-01 15:07 ` [PATCH v2 5/8] um: time-travel/signals: fix ndelay() in interrupt Johannes Berg
2021-03-01 15:07 ` [PATCH v2 6/8] um: irqs: allow invoking time-travel handler multiple times Johannes Berg
2021-03-01 15:07 ` [PATCH v2 7/8] um: add PCI over virtio emulation driver Johannes Berg
2021-08-17 15:50   ` Bjorn Helgaas
2021-08-17 15:51     ` Johannes Berg
2021-03-01 15:07 ` [PATCH v2 8/8] um: virtio/pci: enable suspend/resume Johannes Berg
