LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH 0/8] Introduce a device-dax bus-based device-model
@ 2018-10-31  3:12 Dan Williams
  2018-10-31  3:12 ` [PATCH 1/8] device-dax: Kill dax_region ida Dan Williams
                   ` (7 more replies)
  0 siblings, 8 replies; 10+ messages in thread
From: Dan Williams @ 2018-10-31  3:12 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: linux-mm, linux-kernel, dave.hansen

Prompted by the review of "[PATCH 0/9] Allow persistent memory to be
used like normal RAM" [1] introduce a new bus / device-driver-model
for device-dax.

Currently device-dax instances result from attaching an nvdimm namespace
device to the dax_pmem driver. These instances are registered with the
/sys/class/dax sub-system. With the expectation that platforms will
describe performance differentiated memory [2] for ranges other than
persistent memory (pmem) a new device-model is needed.

Arrange for dax_pmem to be one of potentially several drivers that know
how to discover differentiated memory and register a device instance on
the dax bus. The expectation is that, by default, this device is
consumed by the typical device-dax driver that will expose the range
through a /dev/daxX.Y character device. Optionally other drivers can
consume the dax device instance. For example, the kmem driver [1] can
attach to device-dax device instance to hot-add the related memory range
to the core page-allocator.

Going forward, provider drivers outside of dax_pmem can be created to
register other memories with unique performance properties.

Since /sys/class/dax is a released ABI, a compat driver is provided so
that distros can opt-in to the new bus based ABI. The /sys/class/dax
interface is then deprecated and scheduled to be removed.

[1]: https://lkml.org/lkml/2018/10/23/9
[2]: Section 5.2.27 Heterogeneous Memory Attribute Table (HMAT)
     http://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf

---

Dan Williams (8):
      device-dax: Kill dax_region ida
      device-dax: Kill dax_region base
      device-dax: Remove multi-resource infrastructure
      device-dax: Start defining a dax bus model
      device-dax: Introduce bus + driver model
      device-dax: Move resource pinning+mapping into the common driver
      device-dax: Add support for a dax override driver
      device-dax: Add /sys/class/dax backwards compatibility


 Documentation/ABI/obsolete/sysfs-class-dax |   22 +
 drivers/dax/Kconfig                        |   12 +
 drivers/dax/Makefile                       |    5 
 drivers/dax/bus.c                          |  449 ++++++++++++++++++++++++++++
 drivers/dax/bus.h                          |   60 ++++
 drivers/dax/dax-private.h                  |   30 +-
 drivers/dax/dax.h                          |   18 -
 drivers/dax/device-dax.h                   |   25 --
 drivers/dax/device.c                       |  365 +++++------------------
 drivers/dax/pmem.c                         |  161 ----------
 drivers/dax/pmem/Makefile                  |    7 
 drivers/dax/pmem/compat.c                  |   73 +++++
 drivers/dax/pmem/core.c                    |   69 ++++
 drivers/dax/pmem/pmem.c                    |   40 ++
 drivers/dax/super.c                        |   41 ++-
 tools/testing/nvdimm/Kbuild                |    7 
 tools/testing/nvdimm/dax-dev.c             |   16 -
 17 files changed, 880 insertions(+), 520 deletions(-)
 create mode 100644 Documentation/ABI/obsolete/sysfs-class-dax
 create mode 100644 drivers/dax/bus.c
 create mode 100644 drivers/dax/bus.h
 delete mode 100644 drivers/dax/dax.h
 delete mode 100644 drivers/dax/device-dax.h
 delete mode 100644 drivers/dax/pmem.c
 create mode 100644 drivers/dax/pmem/Makefile
 create mode 100644 drivers/dax/pmem/compat.c
 create mode 100644 drivers/dax/pmem/core.c
 create mode 100644 drivers/dax/pmem/pmem.c

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/8] device-dax: Kill dax_region ida
  2018-10-31  3:12 [PATCH 0/8] Introduce a device-dax bus-based device-model Dan Williams
@ 2018-10-31  3:12 ` Dan Williams
  2018-10-31  3:13 ` [PATCH 2/8] device-dax: Kill dax_region base Dan Williams
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Dan Williams @ 2018-10-31  3:12 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: linux-mm, linux-kernel, dave.hansen

Commit bbb3be170ac2 "device-dax: fix sysfs duplicate warnings" arranged
for passing a dax instance-id to devm_create_dax_dev(), rather than
generating one internally. Remove the dax_region ida and related code.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/dax/dax-private.h |    3 ---
 drivers/dax/device.c      |   24 +++---------------------
 2 files changed, 3 insertions(+), 24 deletions(-)

diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h
index b6fc4f04636d..d1b36a42132f 100644
--- a/drivers/dax/dax-private.h
+++ b/drivers/dax/dax-private.h
@@ -28,7 +28,6 @@
  */
 struct dax_region {
 	int id;
-	struct ida ida;
 	void *base;
 	struct kref kref;
 	struct device *dev;
@@ -42,7 +41,6 @@ struct dax_region {
  * @region - parent region
  * @dax_dev - core dax functionality
  * @dev - device core
- * @id - child id in the region
  * @num_resources - number of physical address extents in this device
  * @res - array of physical address ranges
  */
@@ -50,7 +48,6 @@ struct dev_dax {
 	struct dax_region *region;
 	struct dax_device *dax_dev;
 	struct device dev;
-	int id;
 	int num_resources;
 	struct resource res[0];
 };
diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index 948806e57cee..a5a670c1cd58 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -128,7 +128,6 @@ struct dax_region *alloc_dax_region(struct device *parent, int region_id,
 	dax_region->pfn_flags = pfn_flags;
 	kref_init(&dax_region->kref);
 	dax_region->id = region_id;
-	ida_init(&dax_region->ida);
 	dax_region->align = align;
 	dax_region->dev = parent;
 	dax_region->base = addr;
@@ -582,8 +581,6 @@ static void dev_dax_release(struct device *dev)
 	struct dax_region *dax_region = dev_dax->region;
 	struct dax_device *dax_dev = dev_dax->dax_dev;
 
-	if (dev_dax->id >= 0)
-		ida_simple_remove(&dax_region->ida, dev_dax->id);
 	dax_region_put(dax_region);
 	put_dax(dax_dev);
 	kfree(dev_dax);
@@ -642,19 +639,7 @@ struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region,
 	}
 
 	if (i < count)
-		goto err_id;
-
-	if (id < 0) {
-		id = ida_simple_get(&dax_region->ida, 0, 0, GFP_KERNEL);
-		dev_dax->id = id;
-		if (id < 0) {
-			rc = id;
-			goto err_id;
-		}
-	} else {
-		/* region provider owns @id lifetime */
-		dev_dax->id = -1;
-	}
+		goto err;
 
 	/*
 	 * No 'host' or dax_operations since there is no access to this
@@ -663,7 +648,7 @@ struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region,
 	dax_dev = alloc_dax(dev_dax, NULL, NULL);
 	if (!dax_dev) {
 		rc = -ENOMEM;
-		goto err_dax;
+		goto err;
 	}
 
 	/* from here on we're committed to teardown via dax_dev_release() */
@@ -700,10 +685,7 @@ struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region,
 
 	return dev_dax;
 
- err_dax:
-	if (dev_dax->id >= 0)
-		ida_simple_remove(&dax_region->ida, dev_dax->id);
- err_id:
+ err:
 	kfree(dev_dax);
 
 	return ERR_PTR(rc);


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 2/8] device-dax: Kill dax_region base
  2018-10-31  3:12 [PATCH 0/8] Introduce a device-dax bus-based device-model Dan Williams
  2018-10-31  3:12 ` [PATCH 1/8] device-dax: Kill dax_region ida Dan Williams
@ 2018-10-31  3:13 ` Dan Williams
  2018-10-31  3:13 ` [PATCH 3/8] device-dax: Remove multi-resource infrastructure Dan Williams
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Dan Williams @ 2018-10-31  3:13 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: linux-mm, linux-kernel, dave.hansen

Nothing consumes this attribute of a region and devres otherwise
remembers the value for de-allocation purposes.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/dax/dax-private.h |    2 --
 drivers/dax/device-dax.h  |    5 ++---
 drivers/dax/device.c      |    3 +--
 drivers/dax/pmem.c        |    2 +-
 4 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h
index d1b36a42132f..9b393c218fe4 100644
--- a/drivers/dax/dax-private.h
+++ b/drivers/dax/dax-private.h
@@ -19,7 +19,6 @@
 /**
  * struct dax_region - mapping infrastructure for dax devices
  * @id: kernel-wide unique region for a memory range
- * @base: linear address corresponding to @res
  * @kref: to pin while other agents have a need to do lookups
  * @dev: parent device backing this region
  * @align: allocation and mapping alignment for child dax devices
@@ -28,7 +27,6 @@
  */
 struct dax_region {
 	int id;
-	void *base;
 	struct kref kref;
 	struct device *dev;
 	unsigned int align;
diff --git a/drivers/dax/device-dax.h b/drivers/dax/device-dax.h
index 688b051750bd..4f1c69e1b3a2 100644
--- a/drivers/dax/device-dax.h
+++ b/drivers/dax/device-dax.h
@@ -17,9 +17,8 @@ struct dev_dax;
 struct resource;
 struct dax_region;
 void dax_region_put(struct dax_region *dax_region);
-struct dax_region *alloc_dax_region(struct device *parent,
-		int region_id, struct resource *res, unsigned int align,
-		void *addr, unsigned long flags);
+struct dax_region *alloc_dax_region(struct device *parent, int region_id,
+		struct resource *res, unsigned int align, unsigned long flags);
 struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region,
 		int id, struct resource *res, int count);
 #endif /* __DEVICE_DAX_H__ */
diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index a5a670c1cd58..811c1015194c 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -100,7 +100,7 @@ static void dax_region_unregister(void *region)
 }
 
 struct dax_region *alloc_dax_region(struct device *parent, int region_id,
-		struct resource *res, unsigned int align, void *addr,
+		struct resource *res, unsigned int align,
 		unsigned long pfn_flags)
 {
 	struct dax_region *dax_region;
@@ -130,7 +130,6 @@ struct dax_region *alloc_dax_region(struct device *parent, int region_id,
 	dax_region->id = region_id;
 	dax_region->align = align;
 	dax_region->dev = parent;
-	dax_region->base = addr;
 	if (sysfs_create_groups(&parent->kobj, dax_region_attribute_groups)) {
 		kfree(dax_region);
 		return NULL;
diff --git a/drivers/dax/pmem.c b/drivers/dax/pmem.c
index 99e2aace8078..3afae503fc42 100644
--- a/drivers/dax/pmem.c
+++ b/drivers/dax/pmem.c
@@ -133,7 +133,7 @@ static int dax_pmem_probe(struct device *dev)
 		return -EINVAL;
 
 	dax_region = alloc_dax_region(dev, region_id, &res,
-			le32_to_cpu(pfn_sb->align), addr, PFN_DEV|PFN_MAP);
+			le32_to_cpu(pfn_sb->align), PFN_DEV|PFN_MAP);
 	if (!dax_region)
 		return -ENOMEM;
 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 3/8] device-dax: Remove multi-resource infrastructure
  2018-10-31  3:12 [PATCH 0/8] Introduce a device-dax bus-based device-model Dan Williams
  2018-10-31  3:12 ` [PATCH 1/8] device-dax: Kill dax_region ida Dan Williams
  2018-10-31  3:13 ` [PATCH 2/8] device-dax: Kill dax_region base Dan Williams
@ 2018-10-31  3:13 ` Dan Williams
  2018-10-31  3:13 ` [PATCH 4/8] device-dax: Start defining a dax bus model Dan Williams
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Dan Williams @ 2018-10-31  3:13 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: linux-mm, linux-kernel, dave.hansen

The multi-resource implementation anticipated discontiguous sub-division
support. That has not yet materialized, delete the infrastructure and
related code.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/dax/dax-private.h      |    4 ---
 drivers/dax/device-dax.h       |    3 +-
 drivers/dax/device.c           |   49 +++++++---------------------------------
 drivers/dax/pmem.c             |    3 +-
 tools/testing/nvdimm/dax-dev.c |   16 ++-----------
 5 files changed, 13 insertions(+), 62 deletions(-)

diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h
index 9b393c218fe4..dbd077653b5c 100644
--- a/drivers/dax/dax-private.h
+++ b/drivers/dax/dax-private.h
@@ -39,14 +39,10 @@ struct dax_region {
  * @region - parent region
  * @dax_dev - core dax functionality
  * @dev - device core
- * @num_resources - number of physical address extents in this device
- * @res - array of physical address ranges
  */
 struct dev_dax {
 	struct dax_region *region;
 	struct dax_device *dax_dev;
 	struct device dev;
-	int num_resources;
-	struct resource res[0];
 };
 #endif
diff --git a/drivers/dax/device-dax.h b/drivers/dax/device-dax.h
index 4f1c69e1b3a2..e9be99584b92 100644
--- a/drivers/dax/device-dax.h
+++ b/drivers/dax/device-dax.h
@@ -19,6 +19,5 @@ struct dax_region;
 void dax_region_put(struct dax_region *dax_region);
 struct dax_region *alloc_dax_region(struct device *parent, int region_id,
 		struct resource *res, unsigned int align, unsigned long flags);
-struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region,
-		int id, struct resource *res, int count);
+struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region, int id);
 #endif /* __DEVICE_DAX_H__ */
diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index 811c1015194c..db12e24b8005 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -151,11 +151,7 @@ static ssize_t size_show(struct device *dev,
 		struct device_attribute *attr, char *buf)
 {
 	struct dev_dax *dev_dax = to_dev_dax(dev);
-	unsigned long long size = 0;
-	int i;
-
-	for (i = 0; i < dev_dax->num_resources; i++)
-		size += resource_size(&dev_dax->res[i]);
+	unsigned long long size = resource_size(&dev_dax->region->res);
 
 	return sprintf(buf, "%llu\n", size);
 }
@@ -224,21 +220,11 @@ static int check_vma(struct dev_dax *dev_dax, struct vm_area_struct *vma,
 __weak phys_addr_t dax_pgoff_to_phys(struct dev_dax *dev_dax, pgoff_t pgoff,
 		unsigned long size)
 {
-	struct resource *res;
-	/* gcc-4.6.3-nolibc for i386 complains that this is uninitialized */
-	phys_addr_t uninitialized_var(phys);
-	int i;
-
-	for (i = 0; i < dev_dax->num_resources; i++) {
-		res = &dev_dax->res[i];
-		phys = pgoff * PAGE_SIZE + res->start;
-		if (phys >= res->start && phys <= res->end)
-			break;
-		pgoff -= PHYS_PFN(resource_size(res));
-	}
+	struct resource *res = &dev_dax->region->res;
+	phys_addr_t phys;
 
-	if (i < dev_dax->num_resources) {
-		res = &dev_dax->res[i];
+	phys = pgoff * PAGE_SIZE + res->start;
+	if (phys >= res->start && phys <= res->end) {
 		if (phys + size - 1 <= res->end)
 			return phys;
 	}
@@ -608,8 +594,7 @@ static void unregister_dev_dax(void *dev)
 	put_device(dev);
 }
 
-struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region,
-		int id, struct resource *res, int count)
+struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region, int id)
 {
 	struct device *parent = dax_region->dev;
 	struct dax_device *dax_dev;
@@ -617,29 +602,12 @@ struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region,
 	struct inode *inode;
 	struct device *dev;
 	struct cdev *cdev;
-	int rc, i;
+	int rc;
 
-	if (!count)
-		return ERR_PTR(-EINVAL);
-
-	dev_dax = kzalloc(struct_size(dev_dax, res, count), GFP_KERNEL);
+	dev_dax = kzalloc(sizeof(*dev_dax), GFP_KERNEL);
 	if (!dev_dax)
 		return ERR_PTR(-ENOMEM);
 
-	for (i = 0; i < count; i++) {
-		if (!IS_ALIGNED(res[i].start, dax_region->align)
-				|| !IS_ALIGNED(resource_size(&res[i]),
-					dax_region->align)) {
-			rc = -EINVAL;
-			break;
-		}
-		dev_dax->res[i].start = res[i].start;
-		dev_dax->res[i].end = res[i].end;
-	}
-
-	if (i < count)
-		goto err;
-
 	/*
 	 * No 'host' or dax_operations since there is no access to this
 	 * device outside of mmap of the resulting character device.
@@ -659,7 +627,6 @@ struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region,
 	cdev_init(cdev, &dax_fops);
 	cdev->owner = parent->driver->owner;
 
-	dev_dax->num_resources = count;
 	dev_dax->dax_dev = dax_dev;
 	dev_dax->region = dax_region;
 	kref_get(&dax_region->kref);
diff --git a/drivers/dax/pmem.c b/drivers/dax/pmem.c
index 3afae503fc42..23bc8ed40e84 100644
--- a/drivers/dax/pmem.c
+++ b/drivers/dax/pmem.c
@@ -137,8 +137,7 @@ static int dax_pmem_probe(struct device *dev)
 	if (!dax_region)
 		return -ENOMEM;
 
-	/* TODO: support for subdividing a dax region... */
-	dev_dax = devm_create_dev_dax(dax_region, id, &res, 1);
+	dev_dax = devm_create_dev_dax(dax_region, id);
 
 	/* child dev_dax instances now own the lifetime of the dax_region */
 	dax_region_put(dax_region);
diff --git a/tools/testing/nvdimm/dax-dev.c b/tools/testing/nvdimm/dax-dev.c
index 36ee3d8797c3..f36e708265b8 100644
--- a/tools/testing/nvdimm/dax-dev.c
+++ b/tools/testing/nvdimm/dax-dev.c
@@ -17,20 +17,11 @@
 phys_addr_t dax_pgoff_to_phys(struct dev_dax *dev_dax, pgoff_t pgoff,
 		unsigned long size)
 {
-	struct resource *res;
+	struct resource *res = &dev_dax->region->res;
 	phys_addr_t addr;
-	int i;
 
-	for (i = 0; i < dev_dax->num_resources; i++) {
-		res = &dev_dax->res[i];
-		addr = pgoff * PAGE_SIZE + res->start;
-		if (addr >= res->start && addr <= res->end)
-			break;
-		pgoff -= PHYS_PFN(resource_size(res));
-	}
-
-	if (i < dev_dax->num_resources) {
-		res = &dev_dax->res[i];
+	addr = pgoff * PAGE_SIZE + res->start;
+	if (addr >= res->start && addr <= res->end) {
 		if (addr + size - 1 <= res->end) {
 			if (get_nfit_res(addr)) {
 				struct page *page;
@@ -44,6 +35,5 @@ phys_addr_t dax_pgoff_to_phys(struct dev_dax *dev_dax, pgoff_t pgoff,
 				return addr;
 		}
 	}
-
 	return -1;
 }


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 4/8] device-dax: Start defining a dax bus model
  2018-10-31  3:12 [PATCH 0/8] Introduce a device-dax bus-based device-model Dan Williams
                   ` (2 preceding siblings ...)
  2018-10-31  3:13 ` [PATCH 3/8] device-dax: Remove multi-resource infrastructure Dan Williams
@ 2018-10-31  3:13 ` Dan Williams
  2018-10-31  3:13 ` [PATCH 5/8] device-dax: Introduce bus + driver model Dan Williams
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Dan Williams @ 2018-10-31  3:13 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: linux-mm, linux-kernel, dave.hansen

Towards eliminating the dax_class, move the dax-device-attribute
enabling to a new bus.c file in the core. The amount of code
thrash of sub-sequent patches is reduced as no logic changes are made,
just pure code movement.

A temporary export of unregister_dex_dax() and dax_attribute_groups is
needed to preserve compilation, but those symbols become static again in
a follow-on patch.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/dax/Makefile        |    1 
 drivers/dax/bus.c           |  174 ++++++++++++++++++++++++++++++++++++++++
 drivers/dax/bus.h           |   15 +++
 drivers/dax/dax-private.h   |   14 +++
 drivers/dax/dax.h           |   18 ----
 drivers/dax/device-dax.h    |   23 -----
 drivers/dax/device.c        |  185 +------------------------------------------
 drivers/dax/pmem.c          |    2 
 drivers/dax/super.c         |    1 
 tools/testing/nvdimm/Kbuild |    1 
 10 files changed, 210 insertions(+), 224 deletions(-)
 create mode 100644 drivers/dax/bus.c
 create mode 100644 drivers/dax/bus.h
 delete mode 100644 drivers/dax/dax.h
 delete mode 100644 drivers/dax/device-dax.h

diff --git a/drivers/dax/Makefile b/drivers/dax/Makefile
index 574286fac87c..658e6b9b1d74 100644
--- a/drivers/dax/Makefile
+++ b/drivers/dax/Makefile
@@ -4,5 +4,6 @@ obj-$(CONFIG_DEV_DAX) += device_dax.o
 obj-$(CONFIG_DEV_DAX_PMEM) += dax_pmem.o
 
 dax-y := super.o
+dax-y += bus.o
 dax_pmem-y := pmem.o
 device_dax-y := device.o
diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
new file mode 100644
index 000000000000..8a398e8e1956
--- /dev/null
+++ b/drivers/dax/bus.c
@@ -0,0 +1,174 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2017-2018 Intel Corporation. All rights reserved. */
+#include <linux/device.h>
+#include <linux/slab.h>
+#include <linux/dax.h>
+#include "dax-private.h"
+#include "bus.h"
+
+/*
+ * Rely on the fact that drvdata is set before the attributes are
+ * registered, and that the attributes are unregistered before drvdata
+ * is cleared to assume that drvdata is always valid.
+ */
+static ssize_t id_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct dax_region *dax_region = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%d\n", dax_region->id);
+}
+static DEVICE_ATTR_RO(id);
+
+static ssize_t region_size_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct dax_region *dax_region = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%llu\n", (unsigned long long)
+			resource_size(&dax_region->res));
+}
+static struct device_attribute dev_attr_region_size = __ATTR(size, 0444,
+		region_size_show, NULL);
+
+static ssize_t align_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct dax_region *dax_region = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%u\n", dax_region->align);
+}
+static DEVICE_ATTR_RO(align);
+
+static struct attribute *dax_region_attributes[] = {
+	&dev_attr_region_size.attr,
+	&dev_attr_align.attr,
+	&dev_attr_id.attr,
+	NULL,
+};
+
+static const struct attribute_group dax_region_attribute_group = {
+	.name = "dax_region",
+	.attrs = dax_region_attributes,
+};
+
+static const struct attribute_group *dax_region_attribute_groups[] = {
+	&dax_region_attribute_group,
+	NULL,
+};
+
+static void dax_region_free(struct kref *kref)
+{
+	struct dax_region *dax_region;
+
+	dax_region = container_of(kref, struct dax_region, kref);
+	kfree(dax_region);
+}
+
+void dax_region_put(struct dax_region *dax_region)
+{
+	kref_put(&dax_region->kref, dax_region_free);
+}
+EXPORT_SYMBOL_GPL(dax_region_put);
+
+static void dax_region_unregister(void *region)
+{
+	struct dax_region *dax_region = region;
+
+	sysfs_remove_groups(&dax_region->dev->kobj,
+			dax_region_attribute_groups);
+	dax_region_put(dax_region);
+}
+
+struct dax_region *alloc_dax_region(struct device *parent, int region_id,
+		struct resource *res, unsigned int align,
+		unsigned long pfn_flags)
+{
+	struct dax_region *dax_region;
+
+	/*
+	 * The DAX core assumes that it can store its private data in
+	 * parent->driver_data. This WARN is a reminder / safeguard for
+	 * developers of device-dax drivers.
+	 */
+	if (dev_get_drvdata(parent)) {
+		dev_WARN(parent, "dax core failed to setup private data\n");
+		return NULL;
+	}
+
+	if (!IS_ALIGNED(res->start, align)
+			|| !IS_ALIGNED(resource_size(res), align))
+		return NULL;
+
+	dax_region = kzalloc(sizeof(*dax_region), GFP_KERNEL);
+	if (!dax_region)
+		return NULL;
+
+	dev_set_drvdata(parent, dax_region);
+	memcpy(&dax_region->res, res, sizeof(*res));
+	dax_region->pfn_flags = pfn_flags;
+	kref_init(&dax_region->kref);
+	dax_region->id = region_id;
+	dax_region->align = align;
+	dax_region->dev = parent;
+	if (sysfs_create_groups(&parent->kobj, dax_region_attribute_groups)) {
+		kfree(dax_region);
+		return NULL;
+	}
+
+	kref_get(&dax_region->kref);
+	if (devm_add_action_or_reset(parent, dax_region_unregister, dax_region))
+		return NULL;
+	return dax_region;
+}
+EXPORT_SYMBOL_GPL(alloc_dax_region);
+
+static ssize_t size_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct dev_dax *dev_dax = to_dev_dax(dev);
+	unsigned long long size = resource_size(&dev_dax->region->res);
+
+	return sprintf(buf, "%llu\n", size);
+}
+static DEVICE_ATTR_RO(size);
+
+static struct attribute *dev_dax_attributes[] = {
+	&dev_attr_size.attr,
+	NULL,
+};
+
+static const struct attribute_group dev_dax_attribute_group = {
+	.attrs = dev_dax_attributes,
+};
+
+const struct attribute_group *dax_attribute_groups[] = {
+	&dev_dax_attribute_group,
+	NULL,
+};
+EXPORT_SYMBOL_GPL(dax_attribute_groups);
+
+void kill_dev_dax(struct dev_dax *dev_dax)
+{
+	struct dax_device *dax_dev = dev_dax->dax_dev;
+	struct inode *inode = dax_inode(dax_dev);
+
+	kill_dax(dax_dev);
+	unmap_mapping_range(inode->i_mapping, 0, 0, 1);
+}
+EXPORT_SYMBOL_GPL(kill_dev_dax);
+
+void unregister_dev_dax(void *dev)
+{
+	struct dev_dax *dev_dax = to_dev_dax(dev);
+	struct dax_device *dax_dev = dev_dax->dax_dev;
+	struct inode *inode = dax_inode(dax_dev);
+	struct cdev *cdev = inode->i_cdev;
+
+	dev_dbg(dev, "trace\n");
+
+	kill_dev_dax(dev_dax);
+	cdev_device_del(cdev, dev);
+	put_device(dev);
+}
+EXPORT_SYMBOL_GPL(unregister_dev_dax);
diff --git a/drivers/dax/bus.h b/drivers/dax/bus.h
new file mode 100644
index 000000000000..840865aa69e8
--- /dev/null
+++ b/drivers/dax/bus.h
@@ -0,0 +1,15 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2016 - 2018 Intel Corporation. All rights reserved. */
+#ifndef __DAX_BUS_H__
+#define __DAX_BUS_H__
+struct device;
+struct dev_dax;
+struct resource;
+struct dax_device;
+struct dax_region;
+void dax_region_put(struct dax_region *dax_region);
+struct dax_region *alloc_dax_region(struct device *parent, int region_id,
+		struct resource *res, unsigned int align, unsigned long flags);
+struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region, int id);
+void kill_dev_dax(struct dev_dax *dev_dax);
+#endif /* __DAX_BUS_H__ */
diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h
index dbd077653b5c..620c3f4eefe7 100644
--- a/drivers/dax/dax-private.h
+++ b/drivers/dax/dax-private.h
@@ -16,6 +16,15 @@
 #include <linux/device.h>
 #include <linux/cdev.h>
 
+/* private routines between core files */
+struct dax_device;
+struct dax_device *inode_dax(struct inode *inode);
+struct inode *dax_inode(struct dax_device *dax_dev);
+
+/* temporary until devm_create_dax_dev moves to bus.c */
+extern const struct attribute_group *dax_attribute_groups[];
+void unregister_dev_dax(void *dev);
+
 /**
  * struct dax_region - mapping infrastructure for dax devices
  * @id: kernel-wide unique region for a memory range
@@ -45,4 +54,9 @@ struct dev_dax {
 	struct dax_device *dax_dev;
 	struct device dev;
 };
+
+static inline struct dev_dax *to_dev_dax(struct device *dev)
+{
+	return container_of(dev, struct dev_dax, dev);
+}
 #endif
diff --git a/drivers/dax/dax.h b/drivers/dax/dax.h
deleted file mode 100644
index f9e5feea742c..000000000000
--- a/drivers/dax/dax.h
+++ /dev/null
@@ -1,18 +0,0 @@
-/*
- * Copyright(c) 2016 - 2017 Intel Corporation. All rights reserved.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of version 2 of the GNU General Public License as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License for more details.
- */
-#ifndef __DAX_H__
-#define __DAX_H__
-struct dax_device;
-struct dax_device *inode_dax(struct inode *inode);
-struct inode *dax_inode(struct dax_device *dax_dev);
-#endif /* __DAX_H__ */
diff --git a/drivers/dax/device-dax.h b/drivers/dax/device-dax.h
deleted file mode 100644
index e9be99584b92..000000000000
--- a/drivers/dax/device-dax.h
+++ /dev/null
@@ -1,23 +0,0 @@
-/*
- * Copyright(c) 2016 Intel Corporation. All rights reserved.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of version 2 of the GNU General Public License as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License for more details.
- */
-#ifndef __DEVICE_DAX_H__
-#define __DEVICE_DAX_H__
-struct device;
-struct dev_dax;
-struct resource;
-struct dax_region;
-void dax_region_put(struct dax_region *dax_region);
-struct dax_region *alloc_dax_region(struct device *parent, int region_id,
-		struct resource *res, unsigned int align, unsigned long flags);
-struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region, int id);
-#endif /* __DEVICE_DAX_H__ */
diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index db12e24b8005..1fc375783e0b 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -1,15 +1,5 @@
-/*
- * Copyright(c) 2016 - 2017 Intel Corporation. All rights reserved.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of version 2 of the GNU General Public License as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License for more details.
- */
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2016-2018 Intel Corporation. All rights reserved. */
 #include <linux/pagemap.h>
 #include <linux/module.h>
 #include <linux/device.h>
@@ -21,156 +11,10 @@
 #include <linux/mm.h>
 #include <linux/mman.h>
 #include "dax-private.h"
-#include "dax.h"
+#include "bus.h"
 
 static struct class *dax_class;
 
-/*
- * Rely on the fact that drvdata is set before the attributes are
- * registered, and that the attributes are unregistered before drvdata
- * is cleared to assume that drvdata is always valid.
- */
-static ssize_t id_show(struct device *dev,
-		struct device_attribute *attr, char *buf)
-{
-	struct dax_region *dax_region = dev_get_drvdata(dev);
-
-	return sprintf(buf, "%d\n", dax_region->id);
-}
-static DEVICE_ATTR_RO(id);
-
-static ssize_t region_size_show(struct device *dev,
-		struct device_attribute *attr, char *buf)
-{
-	struct dax_region *dax_region = dev_get_drvdata(dev);
-
-	return sprintf(buf, "%llu\n", (unsigned long long)
-			resource_size(&dax_region->res));
-}
-static struct device_attribute dev_attr_region_size = __ATTR(size, 0444,
-		region_size_show, NULL);
-
-static ssize_t align_show(struct device *dev,
-		struct device_attribute *attr, char *buf)
-{
-	struct dax_region *dax_region = dev_get_drvdata(dev);
-
-	return sprintf(buf, "%u\n", dax_region->align);
-}
-static DEVICE_ATTR_RO(align);
-
-static struct attribute *dax_region_attributes[] = {
-	&dev_attr_region_size.attr,
-	&dev_attr_align.attr,
-	&dev_attr_id.attr,
-	NULL,
-};
-
-static const struct attribute_group dax_region_attribute_group = {
-	.name = "dax_region",
-	.attrs = dax_region_attributes,
-};
-
-static const struct attribute_group *dax_region_attribute_groups[] = {
-	&dax_region_attribute_group,
-	NULL,
-};
-
-static void dax_region_free(struct kref *kref)
-{
-	struct dax_region *dax_region;
-
-	dax_region = container_of(kref, struct dax_region, kref);
-	kfree(dax_region);
-}
-
-void dax_region_put(struct dax_region *dax_region)
-{
-	kref_put(&dax_region->kref, dax_region_free);
-}
-EXPORT_SYMBOL_GPL(dax_region_put);
-
-static void dax_region_unregister(void *region)
-{
-	struct dax_region *dax_region = region;
-
-	sysfs_remove_groups(&dax_region->dev->kobj,
-			dax_region_attribute_groups);
-	dax_region_put(dax_region);
-}
-
-struct dax_region *alloc_dax_region(struct device *parent, int region_id,
-		struct resource *res, unsigned int align,
-		unsigned long pfn_flags)
-{
-	struct dax_region *dax_region;
-
-	/*
-	 * The DAX core assumes that it can store its private data in
-	 * parent->driver_data. This WARN is a reminder / safeguard for
-	 * developers of device-dax drivers.
-	 */
-	if (dev_get_drvdata(parent)) {
-		dev_WARN(parent, "dax core failed to setup private data\n");
-		return NULL;
-	}
-
-	if (!IS_ALIGNED(res->start, align)
-			|| !IS_ALIGNED(resource_size(res), align))
-		return NULL;
-
-	dax_region = kzalloc(sizeof(*dax_region), GFP_KERNEL);
-	if (!dax_region)
-		return NULL;
-
-	dev_set_drvdata(parent, dax_region);
-	memcpy(&dax_region->res, res, sizeof(*res));
-	dax_region->pfn_flags = pfn_flags;
-	kref_init(&dax_region->kref);
-	dax_region->id = region_id;
-	dax_region->align = align;
-	dax_region->dev = parent;
-	if (sysfs_create_groups(&parent->kobj, dax_region_attribute_groups)) {
-		kfree(dax_region);
-		return NULL;
-	}
-
-	kref_get(&dax_region->kref);
-	if (devm_add_action_or_reset(parent, dax_region_unregister, dax_region))
-		return NULL;
-	return dax_region;
-}
-EXPORT_SYMBOL_GPL(alloc_dax_region);
-
-static struct dev_dax *to_dev_dax(struct device *dev)
-{
-	return container_of(dev, struct dev_dax, dev);
-}
-
-static ssize_t size_show(struct device *dev,
-		struct device_attribute *attr, char *buf)
-{
-	struct dev_dax *dev_dax = to_dev_dax(dev);
-	unsigned long long size = resource_size(&dev_dax->region->res);
-
-	return sprintf(buf, "%llu\n", size);
-}
-static DEVICE_ATTR_RO(size);
-
-static struct attribute *dev_dax_attributes[] = {
-	&dev_attr_size.attr,
-	NULL,
-};
-
-static const struct attribute_group dev_dax_attribute_group = {
-	.attrs = dev_dax_attributes,
-};
-
-static const struct attribute_group *dax_attribute_groups[] = {
-	&dev_dax_attribute_group,
-	NULL,
-};
-
 static int check_vma(struct dev_dax *dev_dax, struct vm_area_struct *vma,
 		const char *func)
 {
@@ -571,29 +415,6 @@ static void dev_dax_release(struct device *dev)
 	kfree(dev_dax);
 }
 
-static void kill_dev_dax(struct dev_dax *dev_dax)
-{
-	struct dax_device *dax_dev = dev_dax->dax_dev;
-	struct inode *inode = dax_inode(dax_dev);
-
-	kill_dax(dax_dev);
-	unmap_mapping_range(inode->i_mapping, 0, 0, 1);
-}
-
-static void unregister_dev_dax(void *dev)
-{
-	struct dev_dax *dev_dax = to_dev_dax(dev);
-	struct dax_device *dax_dev = dev_dax->dax_dev;
-	struct inode *inode = dax_inode(dax_dev);
-	struct cdev *cdev = inode->i_cdev;
-
-	dev_dbg(dev, "trace\n");
-
-	kill_dev_dax(dev_dax);
-	cdev_device_del(cdev, dev);
-	put_device(dev);
-}
-
 struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region, int id)
 {
 	struct device *parent = dax_region->dev;
diff --git a/drivers/dax/pmem.c b/drivers/dax/pmem.c
index 23bc8ed40e84..c94f17e662bd 100644
--- a/drivers/dax/pmem.c
+++ b/drivers/dax/pmem.c
@@ -16,7 +16,7 @@
 #include <linux/pfn_t.h>
 #include "../nvdimm/pfn.h"
 #include "../nvdimm/nd.h"
-#include "device-dax.h"
+#include "bus.h"
 
 struct dax_pmem {
 	struct device *dev;
diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index 6e928f37d084..0ecc1a2cf1cc 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -22,6 +22,7 @@
 #include <linux/uio.h>
 #include <linux/dax.h>
 #include <linux/fs.h>
+#include "dax-private.h"
 
 static dev_t dax_devt;
 DEFINE_STATIC_SRCU(dax_srcu);
diff --git a/tools/testing/nvdimm/Kbuild b/tools/testing/nvdimm/Kbuild
index 0392153a0009..bfc4d8e98452 100644
--- a/tools/testing/nvdimm/Kbuild
+++ b/tools/testing/nvdimm/Kbuild
@@ -55,6 +55,7 @@ nd_e820-y := $(NVDIMM_SRC)/e820.o
 nd_e820-y += config_check.o
 
 dax-y := $(DAX_SRC)/super.o
+dax-y += $(DAX_SRC)/bus.o
 dax-y += config_check.o
 
 device_dax-y := $(DAX_SRC)/device.o


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 5/8] device-dax: Introduce bus + driver model
  2018-10-31  3:12 [PATCH 0/8] Introduce a device-dax bus-based device-model Dan Williams
                   ` (3 preceding siblings ...)
  2018-10-31  3:13 ` [PATCH 4/8] device-dax: Start defining a dax bus model Dan Williams
@ 2018-10-31  3:13 ` Dan Williams
  2018-10-31  3:13 ` [PATCH 6/8] device-dax: Move resource pinning+mapping into the common driver Dan Williams
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Dan Williams @ 2018-10-31  3:13 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: linux-mm, linux-kernel, dave.hansen

In support of multiple device-dax instances per device-dax-region and
allowing the 'kmem' driver to attach to dax-instances instead of the
current device-node access, convert the dax sub-system from a class to a
bus. Recall that the kmem driver takes reserved / special purpose
memories and assigns them to be managed by the core-mm.

Aside from the fact the device-dax instances are registered and probed
on a bus, two other lifetime-management changes are made:

1/ Delay attaching a cdev until driver probe time

2/ A new run_dax() helper is introduced to allow restoring dax-operation
   after a kill_dax() event. So, at driver ->probe() time we run_dax()
   and at ->remove() time we kill_dax() and invalidate all mappings.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/dax/bus.c         |  133 ++++++++++++++++++++++++++++++++++++++++++---
 drivers/dax/bus.h         |   16 +++++
 drivers/dax/dax-private.h |    6 +-
 drivers/dax/device.c      |   95 +++++++++++---------------------
 drivers/dax/super.c       |   40 +++++++++-----
 5 files changed, 203 insertions(+), 87 deletions(-)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index 8a398e8e1956..0cff32102c4c 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -6,6 +6,33 @@
 #include "dax-private.h"
 #include "bus.h"
 
+static int dax_bus_uevent(struct device *dev, struct kobj_uevent_env *env)
+{
+	/*
+	 * We only ever expect to handle device-dax instances, i.e. the
+	 * @type argument to MODULE_ALIAS_DAX_DEVICE() is always zero
+	 */
+	return add_uevent_var(env, "MODALIAS=" DAX_DEVICE_MODALIAS_FMT, 0);
+}
+
+static int dax_bus_match(struct device *dev, struct device_driver *drv);
+
+static struct bus_type dax_bus_type = {
+	.name = "dax",
+	.uevent = dax_bus_uevent,
+	.match = dax_bus_match,
+};
+
+static int dax_bus_match(struct device *dev, struct device_driver *drv)
+{
+	/*
+	 * The drivers that can register on the 'dax' bus are private to
+	 * drivers/dax/ so any device and driver on the bus always
+	 * match.
+	 */
+	return 1;
+}
+
 /*
  * Rely on the fact that drvdata is set before the attributes are
  * registered, and that the attributes are unregistered before drvdata
@@ -142,11 +169,10 @@ static const struct attribute_group dev_dax_attribute_group = {
 	.attrs = dev_dax_attributes,
 };
 
-const struct attribute_group *dax_attribute_groups[] = {
+static const struct attribute_group *dax_attribute_groups[] = {
 	&dev_dax_attribute_group,
 	NULL,
 };
-EXPORT_SYMBOL_GPL(dax_attribute_groups);
 
 void kill_dev_dax(struct dev_dax *dev_dax)
 {
@@ -158,17 +184,108 @@ void kill_dev_dax(struct dev_dax *dev_dax)
 }
 EXPORT_SYMBOL_GPL(kill_dev_dax);
 
-void unregister_dev_dax(void *dev)
+static void dev_dax_release(struct device *dev)
 {
 	struct dev_dax *dev_dax = to_dev_dax(dev);
+	struct dax_region *dax_region = dev_dax->region;
 	struct dax_device *dax_dev = dev_dax->dax_dev;
-	struct inode *inode = dax_inode(dax_dev);
-	struct cdev *cdev = inode->i_cdev;
 
-	dev_dbg(dev, "trace\n");
+	dax_region_put(dax_region);
+	put_dax(dax_dev);
+	kfree(dev_dax);
+}
+
+static void unregister_dev_dax(void *dev)
+{
+	struct dev_dax *dev_dax = to_dev_dax(dev);
+
+	dev_dbg(dev, "%s\n", __func__);
 
 	kill_dev_dax(dev_dax);
-	cdev_device_del(cdev, dev);
+	device_del(dev);
 	put_device(dev);
 }
-EXPORT_SYMBOL_GPL(unregister_dev_dax);
+
+struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region, int id)
+{
+	struct device *parent = dax_region->dev;
+	struct dax_device *dax_dev;
+	struct dev_dax *dev_dax;
+	struct inode *inode;
+	struct device *dev;
+	int rc = -ENOMEM;
+
+	if (id < 0)
+		return ERR_PTR(-EINVAL);
+
+	dev_dax = kzalloc(sizeof(*dev_dax), GFP_KERNEL);
+	if (!dev_dax)
+		return ERR_PTR(-ENOMEM);
+
+	/*
+	 * No 'host' or dax_operations since there is no access to this
+	 * device outside of mmap of the resulting character device.
+	 */
+	dax_dev = alloc_dax(dev_dax, NULL, NULL);
+	if (!dax_dev)
+		goto err;
+
+	/* a device_dax instance is dead while the driver is not attached */
+	kill_dax(dax_dev);
+
+	/* from here on we're committed to teardown via dax_dev_release() */
+	dev = &dev_dax->dev;
+	device_initialize(dev);
+
+	dev_dax->dax_dev = dax_dev;
+	dev_dax->region = dax_region;
+	kref_get(&dax_region->kref);
+
+	inode = dax_inode(dax_dev);
+	dev->devt = inode->i_rdev;
+	dev->bus = &dax_bus_type;
+	dev->parent = parent;
+	dev->groups = dax_attribute_groups;
+	dev->release = dev_dax_release;
+	dev_set_name(dev, "dax%d.%d", dax_region->id, id);
+
+	rc = device_add(dev);
+	if (rc) {
+		kill_dev_dax(dev_dax);
+		put_device(dev);
+		return ERR_PTR(rc);
+	}
+
+	rc = devm_add_action_or_reset(dax_region->dev, unregister_dev_dax, dev);
+	if (rc)
+		return ERR_PTR(rc);
+
+	return dev_dax;
+
+ err:
+	kfree(dev_dax);
+
+	return ERR_PTR(rc);
+}
+EXPORT_SYMBOL_GPL(devm_create_dev_dax);
+
+int __dax_driver_register(struct device_driver *drv,
+		struct module *module, const char *mod_name)
+{
+	drv->owner = module;
+	drv->name = mod_name;
+	drv->mod_name = mod_name;
+	drv->bus = &dax_bus_type;
+	return driver_register(drv);
+}
+EXPORT_SYMBOL_GPL(__dax_driver_register);
+
+int __init dax_bus_init(void)
+{
+	return bus_register(&dax_bus_type);
+}
+
+void __exit dax_bus_exit(void)
+{
+	bus_unregister(&dax_bus_type);
+}
diff --git a/drivers/dax/bus.h b/drivers/dax/bus.h
index 840865aa69e8..ea509504df3a 100644
--- a/drivers/dax/bus.h
+++ b/drivers/dax/bus.h
@@ -11,5 +11,21 @@ void dax_region_put(struct dax_region *dax_region);
 struct dax_region *alloc_dax_region(struct device *parent, int region_id,
 		struct resource *res, unsigned int align, unsigned long flags);
 struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region, int id);
+int __dax_driver_register(struct device_driver *drv,
+		struct module *module, const char *mod_name);
+#define dax_driver_register(driver) \
+	__dax_driver_register(driver, THIS_MODULE, KBUILD_MODNAME)
 void kill_dev_dax(struct dev_dax *dev_dax);
+
+/*
+ * While run_dax() is potentially a generic operation that could be
+ * defined in include/linux/dax.h we don't want to grow any users
+ * outside of drivers/dax/
+ */
+void run_dax(struct dax_device *dax_dev);
+
+#define MODULE_ALIAS_DAX_DEVICE(type) \
+	MODULE_ALIAS("dax:t" __stringify(type) "*")
+#define DAX_DEVICE_MODALIAS_FMT "dax:t%d"
+
 #endif /* __DAX_BUS_H__ */
diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h
index 620c3f4eefe7..c3a121700837 100644
--- a/drivers/dax/dax-private.h
+++ b/drivers/dax/dax-private.h
@@ -20,10 +20,8 @@
 struct dax_device;
 struct dax_device *inode_dax(struct inode *inode);
 struct inode *dax_inode(struct dax_device *dax_dev);
-
-/* temporary until devm_create_dax_dev moves to bus.c */
-extern const struct attribute_group *dax_attribute_groups[];
-void unregister_dev_dax(void *dev);
+int dax_bus_init(void);
+void dax_bus_exit(void);
 
 /**
  * struct dax_region - mapping infrastructure for dax devices
diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index 1fc375783e0b..f55829404a24 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -13,8 +13,6 @@
 #include "dax-private.h"
 #include "bus.h"
 
-static struct class *dax_class;
-
 static int check_vma(struct dev_dax *dev_dax, struct vm_area_struct *vma,
 		const char *func)
 {
@@ -404,93 +402,64 @@ static const struct file_operations dax_fops = {
 	.mmap_supported_flags = MAP_SYNC,
 };
 
-static void dev_dax_release(struct device *dev)
+static void dev_dax_cdev_del(void *cdev)
 {
-	struct dev_dax *dev_dax = to_dev_dax(dev);
-	struct dax_region *dax_region = dev_dax->region;
-	struct dax_device *dax_dev = dev_dax->dax_dev;
+	cdev_del(cdev);
+}
 
-	dax_region_put(dax_region);
-	put_dax(dax_dev);
-	kfree(dev_dax);
+static void dev_dax_kill(void *dev_dax)
+{
+	kill_dev_dax(dev_dax);
 }
 
-struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region, int id)
+static int dev_dax_probe(struct device *dev)
 {
-	struct device *parent = dax_region->dev;
-	struct dax_device *dax_dev;
-	struct dev_dax *dev_dax;
+	struct dev_dax *dev_dax = to_dev_dax(dev);
+	struct dax_device *dax_dev = dev_dax->dax_dev;
 	struct inode *inode;
-	struct device *dev;
 	struct cdev *cdev;
 	int rc;
 
-	dev_dax = kzalloc(sizeof(*dev_dax), GFP_KERNEL);
-	if (!dev_dax)
-		return ERR_PTR(-ENOMEM);
-
-	/*
-	 * No 'host' or dax_operations since there is no access to this
-	 * device outside of mmap of the resulting character device.
-	 */
-	dax_dev = alloc_dax(dev_dax, NULL, NULL);
-	if (!dax_dev) {
-		rc = -ENOMEM;
-		goto err;
-	}
-
-	/* from here on we're committed to teardown via dax_dev_release() */
-	dev = &dev_dax->dev;
-	device_initialize(dev);
-
 	inode = dax_inode(dax_dev);
 	cdev = inode->i_cdev;
 	cdev_init(cdev, &dax_fops);
-	cdev->owner = parent->driver->owner;
-
-	dev_dax->dax_dev = dax_dev;
-	dev_dax->region = dax_region;
-	kref_get(&dax_region->kref);
-
-	dev->devt = inode->i_rdev;
-	dev->class = dax_class;
-	dev->parent = parent;
-	dev->groups = dax_attribute_groups;
-	dev->release = dev_dax_release;
-	dev_set_name(dev, "dax%d.%d", dax_region->id, id);
-
-	rc = cdev_device_add(cdev, dev);
-	if (rc) {
-		kill_dev_dax(dev_dax);
-		put_device(dev);
-		return ERR_PTR(rc);
-	}
-
-	rc = devm_add_action_or_reset(dax_region->dev, unregister_dev_dax, dev);
+	cdev->owner = dev->driver->owner;
+	cdev_set_parent(cdev, &dev->kobj);
+	rc = cdev_add(cdev, dev->devt, 1);
 	if (rc)
-		return ERR_PTR(rc);
+		return rc;
 
-	return dev_dax;
+	rc = devm_add_action_or_reset(dev, dev_dax_cdev_del, cdev);
+	if (rc)
+		return rc;
 
- err:
-	kfree(dev_dax);
+	run_dax(dax_dev);
+	return devm_add_action_or_reset(dev, dev_dax_kill, dev_dax);
+}
 
-	return ERR_PTR(rc);
+static int dev_dax_remove(struct device *dev)
+{
+	/* all probe actions are unwound by devm */
+	return 0;
 }
-EXPORT_SYMBOL_GPL(devm_create_dev_dax);
+
+static struct device_driver device_dax_driver = {
+	.probe = dev_dax_probe,
+	.remove = dev_dax_remove,
+};
 
 static int __init dax_init(void)
 {
-	dax_class = class_create(THIS_MODULE, "dax");
-	return PTR_ERR_OR_ZERO(dax_class);
+	return dax_driver_register(&device_dax_driver);
 }
 
 static void __exit dax_exit(void)
 {
-	class_destroy(dax_class);
+	driver_unregister(&device_dax_driver);
 }
 
 MODULE_AUTHOR("Intel Corporation");
 MODULE_LICENSE("GPL v2");
-subsys_initcall(dax_init);
+module_init(dax_init);
 module_exit(dax_exit);
+MODULE_ALIAS_DAX_DEVICE(0);
diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index 0ecc1a2cf1cc..ccb22d8db3a2 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -366,11 +366,15 @@ void kill_dax(struct dax_device *dax_dev)
 	spin_lock(&dax_host_lock);
 	hlist_del_init(&dax_dev->list);
 	spin_unlock(&dax_host_lock);
-
-	dax_dev->private = NULL;
 }
 EXPORT_SYMBOL_GPL(kill_dax);
 
+void run_dax(struct dax_device *dax_dev)
+{
+	set_bit(DAXDEV_ALIVE, &dax_dev->flags);
+}
+EXPORT_SYMBOL_GPL(run_dax);
+
 static struct inode *dax_alloc_inode(struct super_block *sb)
 {
 	struct dax_device *dax_dev;
@@ -585,6 +589,8 @@ EXPORT_SYMBOL_GPL(dax_inode);
 
 void *dax_get_private(struct dax_device *dax_dev)
 {
+	if (!test_bit(DAXDEV_ALIVE, &dax_dev->flags))
+		return NULL;
 	return dax_dev->private;
 }
 EXPORT_SYMBOL_GPL(dax_get_private);
@@ -598,7 +604,7 @@ static void init_once(void *_dax_dev)
 	inode_init_once(inode);
 }
 
-static int __dax_fs_init(void)
+static int dax_fs_init(void)
 {
 	int rc;
 
@@ -630,35 +636,45 @@ static int __dax_fs_init(void)
 	return rc;
 }
 
-static void __dax_fs_exit(void)
+static void dax_fs_exit(void)
 {
 	kern_unmount(dax_mnt);
 	unregister_filesystem(&dax_fs_type);
 	kmem_cache_destroy(dax_cache);
 }
 
-static int __init dax_fs_init(void)
+static int __init dax_core_init(void)
 {
 	int rc;
 
-	rc = __dax_fs_init();
+	rc = dax_fs_init();
 	if (rc)
 		return rc;
 
 	rc = alloc_chrdev_region(&dax_devt, 0, MINORMASK+1, "dax");
 	if (rc)
-		__dax_fs_exit();
-	return rc;
+		goto err_chrdev;
+
+	rc = dax_bus_init();
+	if (rc)
+		goto err_bus;
+	return 0;
+
+err_bus:
+	unregister_chrdev_region(dax_devt, MINORMASK+1);
+err_chrdev:
+	dax_fs_exit();
+	return 0;
 }
 
-static void __exit dax_fs_exit(void)
+static void __exit dax_core_exit(void)
 {
 	unregister_chrdev_region(dax_devt, MINORMASK+1);
 	ida_destroy(&dax_minor_ida);
-	__dax_fs_exit();
+	dax_fs_exit();
 }
 
 MODULE_AUTHOR("Intel Corporation");
 MODULE_LICENSE("GPL v2");
-subsys_initcall(dax_fs_init);
-module_exit(dax_fs_exit);
+subsys_initcall(dax_core_init);
+module_exit(dax_core_exit);


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 6/8] device-dax: Move resource pinning+mapping into the common driver
  2018-10-31  3:12 [PATCH 0/8] Introduce a device-dax bus-based device-model Dan Williams
                   ` (4 preceding siblings ...)
  2018-10-31  3:13 ` [PATCH 5/8] device-dax: Introduce bus + driver model Dan Williams
@ 2018-10-31  3:13 ` Dan Williams
  2018-10-31  3:13 ` [PATCH 7/8] device-dax: Add support for a dax override driver Dan Williams
  2018-10-31  3:13 ` [PATCH 8/8] device-dax: Add /sys/class/dax backwards compatibility Dan Williams
  7 siblings, 0 replies; 10+ messages in thread
From: Dan Williams @ 2018-10-31  3:13 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: linux-mm, linux-kernel, dave.hansen

Move the responsibility of calling devm_request_resource() and
devm_memremap_pages() into the common device-dax driver. This is another
preparatory step to allowing an alternate personality driver for a
device-dax range.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/dax/bus.c         |    6 ++-
 drivers/dax/bus.h         |    3 +
 drivers/dax/dax-private.h |    9 ++++
 drivers/dax/device.c      |   65 ++++++++++++++++++++++++++++++
 drivers/dax/pmem.c        |   98 ++++++---------------------------------------
 5 files changed, 94 insertions(+), 87 deletions(-)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index 0cff32102c4c..69aae2cbd45f 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -1,5 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright(c) 2017-2018 Intel Corporation. All rights reserved. */
+#include <linux/memremap.h>
 #include <linux/device.h>
 #include <linux/slab.h>
 #include <linux/dax.h>
@@ -206,7 +207,8 @@ static void unregister_dev_dax(void *dev)
 	put_device(dev);
 }
 
-struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region, int id)
+struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region, int id,
+		struct dev_pagemap *pgmap)
 {
 	struct device *parent = dax_region->dev;
 	struct dax_device *dax_dev;
@@ -222,6 +224,8 @@ struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region, int id)
 	if (!dev_dax)
 		return ERR_PTR(-ENOMEM);
 
+	memcpy(&dev_dax->pgmap, pgmap, sizeof(*pgmap));
+
 	/*
 	 * No 'host' or dax_operations since there is no access to this
 	 * device outside of mmap of the resulting character device.
diff --git a/drivers/dax/bus.h b/drivers/dax/bus.h
index ea509504df3a..e08e0c394983 100644
--- a/drivers/dax/bus.h
+++ b/drivers/dax/bus.h
@@ -10,7 +10,8 @@ struct dax_region;
 void dax_region_put(struct dax_region *dax_region);
 struct dax_region *alloc_dax_region(struct device *parent, int region_id,
 		struct resource *res, unsigned int align, unsigned long flags);
-struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region, int id);
+struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region, int id,
+		struct dev_pagemap *pgmap);
 int __dax_driver_register(struct device_driver *drv,
 		struct module *module, const char *mod_name);
 #define dax_driver_register(driver) \
diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h
index c3a121700837..a82ce48f5884 100644
--- a/drivers/dax/dax-private.h
+++ b/drivers/dax/dax-private.h
@@ -42,15 +42,22 @@ struct dax_region {
 };
 
 /**
- * struct dev_dax - instance data for a subdivision of a dax region
+ * struct dev_dax - instance data for a subdivision of a dax region, and
+ * data while the device is activated in the driver.
  * @region - parent region
  * @dax_dev - core dax functionality
  * @dev - device core
+ * @pgmap - pgmap for memmap setup / lifetime (driver owned)
+ * @ref: pgmap reference count (driver owned)
+ * @cmp: @ref final put completion (driver owned)
  */
 struct dev_dax {
 	struct dax_region *region;
 	struct dax_device *dax_dev;
 	struct device dev;
+	struct dev_pagemap pgmap;
+	struct percpu_ref ref;
+	struct completion cmp;
 };
 
 static inline struct dev_dax *to_dev_dax(struct device *dev)
diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index f55829404a24..967bab097013 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -1,5 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright(c) 2016-2018 Intel Corporation. All rights reserved. */
+#include <linux/memremap.h>
 #include <linux/pagemap.h>
 #include <linux/module.h>
 #include <linux/device.h>
@@ -13,6 +14,38 @@
 #include "dax-private.h"
 #include "bus.h"
 
+static struct dev_dax *ref_to_dev_dax(struct percpu_ref *ref)
+{
+	return container_of(ref, struct dev_dax, ref);
+}
+
+static void dev_dax_percpu_release(struct percpu_ref *ref)
+{
+	struct dev_dax *dev_dax = ref_to_dev_dax(ref);
+
+	dev_dbg(&dev_dax->dev, "%s\n", __func__);
+	complete(&dev_dax->cmp);
+}
+
+static void dev_dax_percpu_exit(void *data)
+{
+	struct percpu_ref *ref = data;
+	struct dev_dax *dev_dax = ref_to_dev_dax(ref);
+
+	dev_dbg(&dev_dax->dev, "%s\n", __func__);
+	wait_for_completion(&dev_dax->cmp);
+	percpu_ref_exit(ref);
+}
+
+static void dev_dax_percpu_kill(void *data)
+{
+	struct percpu_ref *ref = data;
+	struct dev_dax *dev_dax = ref_to_dev_dax(ref);
+
+	dev_dbg(&dev_dax->dev, "%s\n", __func__);
+	percpu_ref_kill(ref);
+}
+
 static int check_vma(struct dev_dax *dev_dax, struct vm_area_struct *vma,
 		const char *func)
 {
@@ -416,10 +449,42 @@ static int dev_dax_probe(struct device *dev)
 {
 	struct dev_dax *dev_dax = to_dev_dax(dev);
 	struct dax_device *dax_dev = dev_dax->dax_dev;
+	struct resource *res = &dev_dax->region->res;
 	struct inode *inode;
 	struct cdev *cdev;
+	void *addr;
 	int rc;
 
+	/* 1:1 map region resource range to device-dax instance range */
+	if (!devm_request_mem_region(dev, res->start, resource_size(res),
+				dev_name(dev))) {
+		dev_warn(dev, "could not reserve region %pR\n", res);
+		return -EBUSY;
+	}
+
+	init_completion(&dev_dax->cmp);
+	rc = percpu_ref_init(&dev_dax->ref, dev_dax_percpu_release, 0,
+			GFP_KERNEL);
+	if (rc)
+		return rc;
+
+	rc = devm_add_action_or_reset(dev, dev_dax_percpu_exit, &dev_dax->ref);
+	if (rc)
+		return rc;
+
+	dev_dax->pgmap.ref = &dev_dax->ref;
+	addr = devm_memremap_pages(dev, &dev_dax->pgmap);
+	if (IS_ERR(addr)) {
+		devm_remove_action(dev, dev_dax_percpu_exit, &dev_dax->ref);
+		percpu_ref_exit(&dev_dax->ref);
+		return PTR_ERR(addr);
+	}
+
+	rc = devm_add_action_or_reset(dev, dev_dax_percpu_kill,
+			&dev_dax->ref);
+	if (rc)
+		return rc;
+
 	inode = dax_inode(dax_dev);
 	cdev = inode->i_cdev;
 	cdev_init(cdev, &dax_fops);
diff --git a/drivers/dax/pmem.c b/drivers/dax/pmem.c
index c94f17e662bd..d3cefa7868ac 100644
--- a/drivers/dax/pmem.c
+++ b/drivers/dax/pmem.c
@@ -18,55 +18,16 @@
 #include "../nvdimm/nd.h"
 #include "bus.h"
 
-struct dax_pmem {
-	struct device *dev;
-	struct percpu_ref ref;
-	struct dev_pagemap pgmap;
-	struct completion cmp;
-};
-
-static struct dax_pmem *to_dax_pmem(struct percpu_ref *ref)
-{
-	return container_of(ref, struct dax_pmem, ref);
-}
-
-static void dax_pmem_percpu_release(struct percpu_ref *ref)
-{
-	struct dax_pmem *dax_pmem = to_dax_pmem(ref);
-
-	dev_dbg(dax_pmem->dev, "trace\n");
-	complete(&dax_pmem->cmp);
-}
-
-static void dax_pmem_percpu_exit(void *data)
-{
-	struct percpu_ref *ref = data;
-	struct dax_pmem *dax_pmem = to_dax_pmem(ref);
-
-	dev_dbg(dax_pmem->dev, "trace\n");
-	wait_for_completion(&dax_pmem->cmp);
-	percpu_ref_exit(ref);
-}
-
-static void dax_pmem_percpu_kill(void *data)
-{
-	struct percpu_ref *ref = data;
-	struct dax_pmem *dax_pmem = to_dax_pmem(ref);
-
-	dev_dbg(dax_pmem->dev, "trace\n");
-	percpu_ref_kill(ref);
-}
-
 static int dax_pmem_probe(struct device *dev)
 {
-	void *addr;
 	struct resource res;
 	int rc, id, region_id;
+	resource_size_t offset;
 	struct nd_pfn_sb *pfn_sb;
 	struct dev_dax *dev_dax;
-	struct dax_pmem *dax_pmem;
 	struct nd_namespace_io *nsio;
 	struct dax_region *dax_region;
+	struct dev_pagemap pgmap = { 0 };
 	struct nd_namespace_common *ndns;
 	struct nd_dax *nd_dax = to_nd_dax(dev);
 	struct nd_pfn *nd_pfn = &nd_dax->nd_pfn;
@@ -76,68 +37,37 @@ static int dax_pmem_probe(struct device *dev)
 		return PTR_ERR(ndns);
 	nsio = to_nd_namespace_io(&ndns->dev);
 
-	dax_pmem = devm_kzalloc(dev, sizeof(*dax_pmem), GFP_KERNEL);
-	if (!dax_pmem)
-		return -ENOMEM;
-
 	/* parse the 'pfn' info block via ->rw_bytes */
 	rc = devm_nsio_enable(dev, nsio);
 	if (rc)
 		return rc;
-	rc = nvdimm_setup_pfn(nd_pfn, &dax_pmem->pgmap);
+	rc = nvdimm_setup_pfn(nd_pfn, &pgmap);
 	if (rc)
 		return rc;
 	devm_nsio_disable(dev, nsio);
 
-	pfn_sb = nd_pfn->pfn_sb;
-
-	if (!devm_request_mem_region(dev, nsio->res.start,
-				resource_size(&nsio->res),
+	/* reserve the metadata area, device-dax will reserve the data */
+        pfn_sb = nd_pfn->pfn_sb;
+	offset = le64_to_cpu(pfn_sb->dataoff);
+	if (!devm_request_mem_region(dev, nsio->res.start, offset,
 				dev_name(&ndns->dev))) {
-		dev_warn(dev, "could not reserve region %pR\n", &nsio->res);
-		return -EBUSY;
-	}
-
-	dax_pmem->dev = dev;
-	init_completion(&dax_pmem->cmp);
-	rc = percpu_ref_init(&dax_pmem->ref, dax_pmem_percpu_release, 0,
-			GFP_KERNEL);
-	if (rc)
-		return rc;
-
-	rc = devm_add_action(dev, dax_pmem_percpu_exit, &dax_pmem->ref);
-	if (rc) {
-		percpu_ref_exit(&dax_pmem->ref);
-		return rc;
-	}
-
-	dax_pmem->pgmap.ref = &dax_pmem->ref;
-	addr = devm_memremap_pages(dev, &dax_pmem->pgmap);
-	if (IS_ERR(addr)) {
-		devm_remove_action(dev, dax_pmem_percpu_exit, &dax_pmem->ref);
-		percpu_ref_exit(&dax_pmem->ref);
-		return PTR_ERR(addr);
-	}
-
-	rc = devm_add_action_or_reset(dev, dax_pmem_percpu_kill,
-							&dax_pmem->ref);
-	if (rc)
-		return rc;
-
-	/* adjust the dax_region resource to the start of data */
-	memcpy(&res, &dax_pmem->pgmap.res, sizeof(res));
-	res.start += le64_to_cpu(pfn_sb->dataoff);
+                dev_warn(dev, "could not reserve metadata\n");
+                return -EBUSY;
+        }
 
 	rc = sscanf(dev_name(&ndns->dev), "namespace%d.%d", &region_id, &id);
 	if (rc != 2)
 		return -EINVAL;
 
+	/* adjust the dax_region resource to the start of data */
+	memcpy(&res, &pgmap.res, sizeof(res));
+	res.start += offset;
 	dax_region = alloc_dax_region(dev, region_id, &res,
 			le32_to_cpu(pfn_sb->align), PFN_DEV|PFN_MAP);
 	if (!dax_region)
 		return -ENOMEM;
 
-	dev_dax = devm_create_dev_dax(dax_region, id);
+	dev_dax = devm_create_dev_dax(dax_region, id, &pgmap);
 
 	/* child dev_dax instances now own the lifetime of the dax_region */
 	dax_region_put(dax_region);


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 7/8] device-dax: Add support for a dax override driver
  2018-10-31  3:12 [PATCH 0/8] Introduce a device-dax bus-based device-model Dan Williams
                   ` (5 preceding siblings ...)
  2018-10-31  3:13 ` [PATCH 6/8] device-dax: Move resource pinning+mapping into the common driver Dan Williams
@ 2018-10-31  3:13 ` Dan Williams
  2018-11-07 23:42   ` Williams, Dan J
  2018-10-31  3:13 ` [PATCH 8/8] device-dax: Add /sys/class/dax backwards compatibility Dan Williams
  7 siblings, 1 reply; 10+ messages in thread
From: Dan Williams @ 2018-10-31  3:13 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: linux-mm, linux-kernel, dave.hansen

Introduce the 'new_id' concept for enabling a custom device-driver attach
policy for dax-bus drivers. The intended use is to have a mechanism for
hot-plugging device-dax ranges into the page allocator on-demand. With
this in place the default policy of using device-dax for performance
differentiated memory can be overridden by user-space policy that can
arrange for the memory range to be managed as 'System RAM' with
user-defined NUMA and other performance attributes.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/dax/bus.c    |  145 ++++++++++++++++++++++++++++++++++++++++++++++++--
 drivers/dax/bus.h    |   10 +++
 drivers/dax/device.c |   11 ++--
 3 files changed, 156 insertions(+), 10 deletions(-)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index 69aae2cbd45f..178d76504f79 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -2,11 +2,21 @@
 /* Copyright(c) 2017-2018 Intel Corporation. All rights reserved. */
 #include <linux/memremap.h>
 #include <linux/device.h>
+#include <linux/mutex.h>
+#include <linux/list.h>
 #include <linux/slab.h>
 #include <linux/dax.h>
 #include "dax-private.h"
 #include "bus.h"
 
+static DEFINE_MUTEX(dax_bus_lock);
+
+#define DAX_NAME_LEN 30
+struct dax_id {
+	struct list_head list;
+	char dev_name[DAX_NAME_LEN];
+};
+
 static int dax_bus_uevent(struct device *dev, struct kobj_uevent_env *env)
 {
 	/*
@@ -16,22 +26,115 @@ static int dax_bus_uevent(struct device *dev, struct kobj_uevent_env *env)
 	return add_uevent_var(env, "MODALIAS=" DAX_DEVICE_MODALIAS_FMT, 0);
 }
 
+static struct dax_device_driver *to_dax_drv(struct device_driver *drv)
+{
+	return container_of(drv, struct dax_device_driver, drv);
+}
+
+static struct dax_id *__dax_match_id(struct dax_device_driver *dax_drv,
+		const char *dev_name)
+{
+	struct dax_id *dax_id;
+
+	lockdep_assert_held(&dax_bus_lock);
+
+	list_for_each_entry(dax_id, &dax_drv->ids, list)
+		if (strcmp(dax_id->dev_name, dev_name) == 0)
+			return dax_id;
+	return NULL;
+}
+
+static int dax_match_id(struct dax_device_driver *dax_drv, struct device *dev)
+{
+	int match;
+
+	mutex_lock(&dax_bus_lock);
+	match = !!__dax_match_id(dax_drv, dev_name(dev));
+	mutex_unlock(&dax_bus_lock);
+
+	return match;
+}
+
+static ssize_t do_id_store(struct device_driver *drv, const char *buf,
+		size_t count, bool add)
+{
+	struct dax_device_driver *dax_drv = to_dax_drv(drv);
+	unsigned int region_id, id;
+	struct dax_id *dax_id;
+	ssize_t rc = count;
+	int fields;
+
+	fields = sscanf(buf, "dax%d.%d", &region_id, &id);
+	if (fields != 2)
+		return -EINVAL;
+
+	if (strlen(buf) + 1 > DAX_NAME_LEN)
+		return -EINVAL;
+
+	mutex_lock(&dax_bus_lock);
+	dax_id = __dax_match_id(dax_drv, buf);
+	if (!dax_id) {
+		if (add) {
+			dax_id = kzalloc(sizeof(*dax_id), GFP_KERNEL);
+			if (dax_id) {
+				strncpy(dax_id->dev_name, buf, DAX_NAME_LEN);
+				list_add(&dax_id->list, &dax_drv->ids);
+			} else
+				rc = -ENOMEM;
+		} else
+			/* nothing to remove */;
+	} else if (!add) {
+		list_del(&dax_id->list);
+		kfree(dax_id);
+	} else
+		/* dax_id already added */;
+	mutex_unlock(&dax_bus_lock);
+	return rc;
+}
+
+static ssize_t new_id_store(struct device_driver *drv, const char *buf,
+		size_t count)
+{
+	return do_id_store(drv, buf, count, true);
+}
+static DRIVER_ATTR_WO(new_id);
+
+
+static ssize_t remove_id_store(struct device_driver *drv, const char *buf,
+		size_t count)
+{
+	return do_id_store(drv, buf, count, false);
+}
+static DRIVER_ATTR_WO(remove_id);
+
+static struct attribute *dax_drv_attrs[] = {
+	&driver_attr_new_id.attr,
+	&driver_attr_remove_id.attr,
+	NULL,
+};
+ATTRIBUTE_GROUPS(dax_drv);
+
 static int dax_bus_match(struct device *dev, struct device_driver *drv);
 
 static struct bus_type dax_bus_type = {
 	.name = "dax",
 	.uevent = dax_bus_uevent,
 	.match = dax_bus_match,
+	.drv_groups = dax_drv_groups,
 };
 
 static int dax_bus_match(struct device *dev, struct device_driver *drv)
 {
+	struct dax_device_driver *dax_drv = to_dax_drv(drv);
+
 	/*
-	 * The drivers that can register on the 'dax' bus are private to
-	 * drivers/dax/ so any device and driver on the bus always
-	 * match.
+	 * All but the 'device-dax' driver, which has 'match_always'
+	 * set, requires an exact id match.
 	 */
-	return 1;
+	if (dax_drv->match_always)
+		return 1;
+
+	return dax_match_id(dax_drv, dev);
 }
 
 /*
@@ -273,17 +376,49 @@ struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region, int id,
 }
 EXPORT_SYMBOL_GPL(devm_create_dev_dax);
 
-int __dax_driver_register(struct device_driver *drv,
+static int match_always_count;
+
+int __dax_driver_register(struct dax_device_driver *dax_drv,
 		struct module *module, const char *mod_name)
 {
+	struct device_driver *drv = &dax_drv->drv;
+	int rc = 0;
+
+	INIT_LIST_HEAD(&dax_drv->ids);
 	drv->owner = module;
 	drv->name = mod_name;
 	drv->mod_name = mod_name;
 	drv->bus = &dax_bus_type;
+
+	/* there can only be one default driver */
+	mutex_lock(&dax_bus_lock);
+	match_always_count += dax_drv->match_always;
+	if (match_always_count > 1) {
+		match_always_count--;
+		WARN_ON(1);
+		rc = -EINVAL;
+	}
+	mutex_unlock(&dax_bus_lock);
+	if (rc)
+		return rc;
 	return driver_register(drv);
 }
 EXPORT_SYMBOL_GPL(__dax_driver_register);
 
+void dax_driver_unregister(struct dax_device_driver *dax_drv)
+{
+	struct dax_id *dax_id, *_id;
+
+	mutex_lock(&dax_bus_lock);
+	match_always_count -= dax_drv->match_always;
+	list_for_each_entry_safe(dax_id, _id, &dax_drv->ids, list) {
+		list_del(&dax_id->list);
+		kfree(dax_id);
+	}
+	mutex_unlock(&dax_bus_lock);
+}
+EXPORT_SYMBOL_GPL(dax_driver_unregister);
+
 int __init dax_bus_init(void)
 {
 	return bus_register(&dax_bus_type);
diff --git a/drivers/dax/bus.h b/drivers/dax/bus.h
index e08e0c394983..395ab812367c 100644
--- a/drivers/dax/bus.h
+++ b/drivers/dax/bus.h
@@ -12,10 +12,18 @@ struct dax_region *alloc_dax_region(struct device *parent, int region_id,
 		struct resource *res, unsigned int align, unsigned long flags);
 struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region, int id,
 		struct dev_pagemap *pgmap);
-int __dax_driver_register(struct device_driver *drv,
+
+struct dax_device_driver {
+	struct device_driver drv;
+	struct list_head ids;
+	int match_always;
+};
+
+int __dax_driver_register(struct dax_device_driver *dax_drv,
 		struct module *module, const char *mod_name);
 #define dax_driver_register(driver) \
 	__dax_driver_register(driver, THIS_MODULE, KBUILD_MODNAME)
+void dax_driver_unregister(struct dax_device_driver *dax_drv);
 void kill_dev_dax(struct dev_dax *dev_dax);
 
 /*
diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index 967bab097013..052aed3ab600 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -508,9 +508,12 @@ static int dev_dax_remove(struct device *dev)
 	return 0;
 }
 
-static struct device_driver device_dax_driver = {
-	.probe = dev_dax_probe,
-	.remove = dev_dax_remove,
+static struct dax_device_driver device_dax_driver = {
+	.drv = {
+		.probe = dev_dax_probe,
+		.remove = dev_dax_remove,
+	},
+	.match_always = 1,
 };
 
 static int __init dax_init(void)
@@ -520,7 +523,7 @@ static int __init dax_init(void)
 
 static void __exit dax_exit(void)
 {
-	driver_unregister(&device_dax_driver);
+	dax_driver_unregister(&device_dax_driver);
 }
 
 MODULE_AUTHOR("Intel Corporation");


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 8/8] device-dax: Add /sys/class/dax backwards compatibility
  2018-10-31  3:12 [PATCH 0/8] Introduce a device-dax bus-based device-model Dan Williams
                   ` (6 preceding siblings ...)
  2018-10-31  3:13 ` [PATCH 7/8] device-dax: Add support for a dax override driver Dan Williams
@ 2018-10-31  3:13 ` Dan Williams
  7 siblings, 0 replies; 10+ messages in thread
From: Dan Williams @ 2018-10-31  3:13 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: linux-mm, linux-kernel, dave.hansen

On the expectation that some environments may not upgrade libdaxctl
(userspace component that depends on the /sys/class/dax hierarchy),
provide a default / legacy dax_pmem_compat driver. The dax_pmem_compat
driver implements the original /sys/class/dax sysfs layout rather than
/sys/bus/dax. When userspace is upgraded it can blacklist this module
and switch to the dax_pmem driver going forward.

CONFIG_DEV_DAX_PMEM_COMPAT and supporting code will be deleted according
to the dax_pmem entry in Documentation/ABI/obsolete/.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 Documentation/ABI/obsolete/sysfs-class-dax |   22 +++++++
 drivers/dax/Kconfig                        |   12 +++-
 drivers/dax/Makefile                       |    4 +
 drivers/dax/bus.c                          |   29 +++++++--
 drivers/dax/bus.h                          |   26 +++++++-
 drivers/dax/device.c                       |    9 ++-
 drivers/dax/pmem.c                         |   90 ----------------------------
 drivers/dax/pmem/Makefile                  |    7 ++
 drivers/dax/pmem/compat.c                  |   73 +++++++++++++++++++++++
 drivers/dax/pmem/core.c                    |   69 +++++++++++++++++++++
 drivers/dax/pmem/pmem.c                    |   40 ++++++++++++
 tools/testing/nvdimm/Kbuild                |    6 ++
 12 files changed, 283 insertions(+), 104 deletions(-)
 create mode 100644 Documentation/ABI/obsolete/sysfs-class-dax
 delete mode 100644 drivers/dax/pmem.c
 create mode 100644 drivers/dax/pmem/Makefile
 create mode 100644 drivers/dax/pmem/compat.c
 create mode 100644 drivers/dax/pmem/core.c
 create mode 100644 drivers/dax/pmem/pmem.c

diff --git a/Documentation/ABI/obsolete/sysfs-class-dax b/Documentation/ABI/obsolete/sysfs-class-dax
new file mode 100644
index 000000000000..2cb9fc5e8bd1
--- /dev/null
+++ b/Documentation/ABI/obsolete/sysfs-class-dax
@@ -0,0 +1,22 @@
+What:           /sys/class/dax/
+Date:           May, 2016
+KernelVersion:  v4.7
+Contact:        linux-nvdimm@lists.01.org
+Description:	Device DAX is the device-centric analogue of Filesystem
+		DAX (CONFIG_FS_DAX).  It allows memory ranges to be
+		allocated and mapped without need of an intervening file
+		system.  Device DAX is strict, precise and predictable.
+		Specifically this interface:
+
+		1/ Guarantees fault granularity with respect to a given
+		page size (pte, pmd, or pud) set at configuration time.
+
+		2/ Enforces deterministic behavior by being strict about
+		what fault scenarios are supported.
+
+		The /sys/class/dax/ interface enumerates all the
+		device-dax instances in the system. The ABI is
+		deprecated and will be removed after 2020. It is
+		replaced with the DAX bus interface /sys/bus/dax/ where
+		device-dax instances can be found under
+		/sys/bus/dax/devices/
diff --git a/drivers/dax/Kconfig b/drivers/dax/Kconfig
index e0700bf4893a..6fc96f03920e 100644
--- a/drivers/dax/Kconfig
+++ b/drivers/dax/Kconfig
@@ -23,12 +23,22 @@ config DEV_DAX
 config DEV_DAX_PMEM
 	tristate "PMEM DAX: direct access to persistent memory"
 	depends on LIBNVDIMM && NVDIMM_DAX && DEV_DAX
+	depends on m # until we can kill DEV_DAX_PMEM_COMPAT
 	default DEV_DAX
 	help
 	  Support raw access to persistent memory.  Note that this
 	  driver consumes memory ranges allocated and exported by the
 	  libnvdimm sub-system.
 
-	  Say Y if unsure
+	  Say M if unsure
+
+config DEV_DAX_PMEM_COMPAT
+	tristate "PMEM DAX: support the deprecated /sys/class/dax interface"
+	depends on DEV_DAX_PMEM
+	default DEV_DAX_PMEM
+	help
+	  Older versions of the libdaxctl library expect to find all
+	  device-dax instances under /sys/class/dax. If libdaxctl in
+	  your distribution is older than v58 say M, otherwise say N.
 
 endif
diff --git a/drivers/dax/Makefile b/drivers/dax/Makefile
index 658e6b9b1d74..233bbffccbe6 100644
--- a/drivers/dax/Makefile
+++ b/drivers/dax/Makefile
@@ -1,9 +1,9 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_DAX) += dax.o
 obj-$(CONFIG_DEV_DAX) += device_dax.o
-obj-$(CONFIG_DEV_DAX_PMEM) += dax_pmem.o
 
 dax-y := super.o
 dax-y += bus.o
-dax_pmem-y := pmem.o
 device_dax-y := device.o
+
+obj-y += pmem/
diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index 178d76504f79..ebf9a4726f6c 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -9,6 +9,8 @@
 #include "dax-private.h"
 #include "bus.h"
 
+static struct class *dax_class;
+
 static DEFINE_MUTEX(dax_bus_lock);
 
 #define DAX_NAME_LEN 30
@@ -310,8 +312,8 @@ static void unregister_dev_dax(void *dev)
 	put_device(dev);
 }
 
-struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region, int id,
-		struct dev_pagemap *pgmap)
+struct dev_dax *__devm_create_dev_dax(struct dax_region *dax_region, int id,
+		struct dev_pagemap *pgmap, enum dev_dax_subsys subsys)
 {
 	struct device *parent = dax_region->dev;
 	struct dax_device *dax_dev;
@@ -350,7 +352,10 @@ struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region, int id,
 
 	inode = dax_inode(dax_dev);
 	dev->devt = inode->i_rdev;
-	dev->bus = &dax_bus_type;
+	if (subsys == DEV_DAX_BUS)
+		dev->bus = &dax_bus_type;
+	else
+		dev->class = dax_class;
 	dev->parent = parent;
 	dev->groups = dax_attribute_groups;
 	dev->release = dev_dax_release;
@@ -374,7 +379,7 @@ struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region, int id,
 
 	return ERR_PTR(rc);
 }
-EXPORT_SYMBOL_GPL(devm_create_dev_dax);
+EXPORT_SYMBOL_GPL(__devm_create_dev_dax);
 
 static int match_always_count;
 
@@ -407,6 +412,7 @@ EXPORT_SYMBOL_GPL(__dax_driver_register);
 
 void dax_driver_unregister(struct dax_device_driver *dax_drv)
 {
+	struct device_driver *drv = &dax_drv->drv;
 	struct dax_id *dax_id, *_id;
 
 	mutex_lock(&dax_bus_lock);
@@ -416,15 +422,28 @@ void dax_driver_unregister(struct dax_device_driver *dax_drv)
 		kfree(dax_id);
 	}
 	mutex_unlock(&dax_bus_lock);
+	driver_unregister(drv);
 }
 EXPORT_SYMBOL_GPL(dax_driver_unregister);
 
 int __init dax_bus_init(void)
 {
-	return bus_register(&dax_bus_type);
+	int rc;
+
+	if (IS_ENABLED(CONFIG_DEV_DAX_PMEM_COMPAT)) {
+		dax_class = class_create(THIS_MODULE, "dax");
+		if (IS_ERR(dax_class))
+			return PTR_ERR(dax_class);
+	}
+
+	rc = bus_register(&dax_bus_type);
+	if (rc)
+		class_destroy(dax_class);
+	return rc;
 }
 
 void __exit dax_bus_exit(void)
 {
 	bus_unregister(&dax_bus_type);
+	class_destroy(dax_class);
 }
diff --git a/drivers/dax/bus.h b/drivers/dax/bus.h
index 395ab812367c..ce977552ffb5 100644
--- a/drivers/dax/bus.h
+++ b/drivers/dax/bus.h
@@ -2,7 +2,8 @@
 /* Copyright(c) 2016 - 2018 Intel Corporation. All rights reserved. */
 #ifndef __DAX_BUS_H__
 #define __DAX_BUS_H__
-struct device;
+#include <linux/device.h>
+
 struct dev_dax;
 struct resource;
 struct dax_device;
@@ -10,8 +11,23 @@ struct dax_region;
 void dax_region_put(struct dax_region *dax_region);
 struct dax_region *alloc_dax_region(struct device *parent, int region_id,
 		struct resource *res, unsigned int align, unsigned long flags);
-struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region, int id,
-		struct dev_pagemap *pgmap);
+
+enum dev_dax_subsys {
+	DEV_DAX_BUS,
+	DEV_DAX_CLASS,
+};
+
+struct dev_dax *__devm_create_dev_dax(struct dax_region *dax_region, int id,
+		struct dev_pagemap *pgmap, enum dev_dax_subsys subsys);
+
+static inline struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region,
+		int id, struct dev_pagemap *pgmap)
+{
+	return __devm_create_dev_dax(dax_region, id, pgmap, DEV_DAX_BUS);
+}
+
+/* to be deleted when DEV_DAX_CLASS is removed */
+struct dev_dax *__dax_pmem_probe(struct device *dev, enum dev_dax_subsys subsys);
 
 struct dax_device_driver {
 	struct device_driver drv;
@@ -26,6 +42,10 @@ int __dax_driver_register(struct dax_device_driver *dax_drv,
 void dax_driver_unregister(struct dax_device_driver *dax_drv);
 void kill_dev_dax(struct dev_dax *dev_dax);
 
+#if IS_ENABLED(CONFIG_DEV_DAX_PMEM_COMPAT)
+int dev_dax_probe(struct device *dev);
+#endif
+
 /*
  * While run_dax() is potentially a generic operation that could be
  * defined in include/linux/dax.h we don't want to grow any users
diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index 052aed3ab600..71694cd984a3 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -445,7 +445,7 @@ static void dev_dax_kill(void *dev_dax)
 	kill_dev_dax(dev_dax);
 }
 
-static int dev_dax_probe(struct device *dev)
+int dev_dax_probe(struct device *dev)
 {
 	struct dev_dax *dev_dax = to_dev_dax(dev);
 	struct dax_device *dax_dev = dev_dax->dax_dev;
@@ -488,7 +488,11 @@ static int dev_dax_probe(struct device *dev)
 	inode = dax_inode(dax_dev);
 	cdev = inode->i_cdev;
 	cdev_init(cdev, &dax_fops);
-	cdev->owner = dev->driver->owner;
+	if (dev->class) {
+		/* for the CONFIG_DEV_DAX_PMEM_COMPAT case */
+		cdev->owner = dev->parent->driver->owner;
+	} else
+		cdev->owner = dev->driver->owner;
 	cdev_set_parent(cdev, &dev->kobj);
 	rc = cdev_add(cdev, dev->devt, 1);
 	if (rc)
@@ -501,6 +505,7 @@ static int dev_dax_probe(struct device *dev)
 	run_dax(dax_dev);
 	return devm_add_action_or_reset(dev, dev_dax_kill, dev_dax);
 }
+EXPORT_SYMBOL_GPL(dev_dax_probe);
 
 static int dev_dax_remove(struct device *dev)
 {
diff --git a/drivers/dax/pmem.c b/drivers/dax/pmem.c
deleted file mode 100644
index d3cefa7868ac..000000000000
--- a/drivers/dax/pmem.c
+++ /dev/null
@@ -1,90 +0,0 @@
-/*
- * Copyright(c) 2016 Intel Corporation. All rights reserved.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of version 2 of the GNU General Public License as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License for more details.
- */
-#include <linux/percpu-refcount.h>
-#include <linux/memremap.h>
-#include <linux/module.h>
-#include <linux/pfn_t.h>
-#include "../nvdimm/pfn.h"
-#include "../nvdimm/nd.h"
-#include "bus.h"
-
-static int dax_pmem_probe(struct device *dev)
-{
-	struct resource res;
-	int rc, id, region_id;
-	resource_size_t offset;
-	struct nd_pfn_sb *pfn_sb;
-	struct dev_dax *dev_dax;
-	struct nd_namespace_io *nsio;
-	struct dax_region *dax_region;
-	struct dev_pagemap pgmap = { 0 };
-	struct nd_namespace_common *ndns;
-	struct nd_dax *nd_dax = to_nd_dax(dev);
-	struct nd_pfn *nd_pfn = &nd_dax->nd_pfn;
-
-	ndns = nvdimm_namespace_common_probe(dev);
-	if (IS_ERR(ndns))
-		return PTR_ERR(ndns);
-	nsio = to_nd_namespace_io(&ndns->dev);
-
-	/* parse the 'pfn' info block via ->rw_bytes */
-	rc = devm_nsio_enable(dev, nsio);
-	if (rc)
-		return rc;
-	rc = nvdimm_setup_pfn(nd_pfn, &pgmap);
-	if (rc)
-		return rc;
-	devm_nsio_disable(dev, nsio);
-
-	/* reserve the metadata area, device-dax will reserve the data */
-        pfn_sb = nd_pfn->pfn_sb;
-	offset = le64_to_cpu(pfn_sb->dataoff);
-	if (!devm_request_mem_region(dev, nsio->res.start, offset,
-				dev_name(&ndns->dev))) {
-                dev_warn(dev, "could not reserve metadata\n");
-                return -EBUSY;
-        }
-
-	rc = sscanf(dev_name(&ndns->dev), "namespace%d.%d", &region_id, &id);
-	if (rc != 2)
-		return -EINVAL;
-
-	/* adjust the dax_region resource to the start of data */
-	memcpy(&res, &pgmap.res, sizeof(res));
-	res.start += offset;
-	dax_region = alloc_dax_region(dev, region_id, &res,
-			le32_to_cpu(pfn_sb->align), PFN_DEV|PFN_MAP);
-	if (!dax_region)
-		return -ENOMEM;
-
-	dev_dax = devm_create_dev_dax(dax_region, id, &pgmap);
-
-	/* child dev_dax instances now own the lifetime of the dax_region */
-	dax_region_put(dax_region);
-
-	return PTR_ERR_OR_ZERO(dev_dax);
-}
-
-static struct nd_device_driver dax_pmem_driver = {
-	.probe = dax_pmem_probe,
-	.drv = {
-		.name = "dax_pmem",
-	},
-	.type = ND_DRIVER_DAX_PMEM,
-};
-
-module_nd_driver(dax_pmem_driver);
-
-MODULE_LICENSE("GPL v2");
-MODULE_AUTHOR("Intel Corporation");
-MODULE_ALIAS_ND_DEVICE(ND_DEVICE_DAX_PMEM);
diff --git a/drivers/dax/pmem/Makefile b/drivers/dax/pmem/Makefile
new file mode 100644
index 000000000000..e2e79bd3fdcf
--- /dev/null
+++ b/drivers/dax/pmem/Makefile
@@ -0,0 +1,7 @@
+obj-$(CONFIG_DEV_DAX_PMEM) += dax_pmem.o
+obj-$(CONFIG_DEV_DAX_PMEM) += dax_pmem_core.o
+obj-$(CONFIG_DEV_DAX_PMEM_COMPAT) += dax_pmem_compat.o
+
+dax_pmem-y := pmem.o
+dax_pmem_core-y := core.o
+dax_pmem_compat-y := compat.o
diff --git a/drivers/dax/pmem/compat.c b/drivers/dax/pmem/compat.c
new file mode 100644
index 000000000000..d7b15e6f30c5
--- /dev/null
+++ b/drivers/dax/pmem/compat.c
@@ -0,0 +1,73 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2016 - 2018 Intel Corporation. All rights reserved. */
+#include <linux/percpu-refcount.h>
+#include <linux/memremap.h>
+#include <linux/module.h>
+#include <linux/pfn_t.h>
+#include <linux/nd.h>
+#include "../bus.h"
+
+/* we need the private definitions to implement compat suport */
+#include "../dax-private.h"
+
+static int dax_pmem_compat_probe(struct device *dev)
+{
+	struct dev_dax *dev_dax = __dax_pmem_probe(dev, DEV_DAX_CLASS);
+	int rc;
+
+	if (IS_ERR(dev_dax))
+		return PTR_ERR(dev_dax);
+
+        if (!devres_open_group(&dev_dax->dev, dev_dax, GFP_KERNEL))
+		return -ENOMEM;
+
+	device_lock(&dev_dax->dev);
+	rc = dev_dax_probe(&dev_dax->dev);
+	device_unlock(&dev_dax->dev);
+
+	devres_close_group(&dev_dax->dev, dev_dax);
+	if (rc)
+		devres_release_group(&dev_dax->dev, dev_dax);
+
+	return rc;
+}
+
+static int dax_pmem_compat_release(struct device *dev, void *data)
+{
+	device_lock(dev);
+	devres_release_group(dev, to_dev_dax(dev));
+	device_unlock(dev);
+
+	return 0;
+}
+
+static int dax_pmem_compat_remove(struct device *dev)
+{
+	device_for_each_child(dev, NULL, dax_pmem_compat_release);
+	return 0;
+}
+
+static struct nd_device_driver dax_pmem_compat_driver = {
+	.probe = dax_pmem_compat_probe,
+	.remove = dax_pmem_compat_remove,
+	.drv = {
+		.name = "dax_pmem_compat",
+	},
+	.type = ND_DRIVER_DAX_PMEM,
+};
+
+static int __init dax_pmem_compat_init(void)
+{
+	return nd_driver_register(&dax_pmem_compat_driver);
+}
+module_init(dax_pmem_compat_init);
+
+static void __exit dax_pmem_compat_exit(void)
+{
+	driver_unregister(&dax_pmem_compat_driver.drv);
+}
+module_exit(dax_pmem_compat_exit);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Intel Corporation");
+MODULE_ALIAS_ND_DEVICE(ND_DEVICE_DAX_PMEM);
diff --git a/drivers/dax/pmem/core.c b/drivers/dax/pmem/core.c
new file mode 100644
index 000000000000..bdcff1b14e95
--- /dev/null
+++ b/drivers/dax/pmem/core.c
@@ -0,0 +1,69 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2016 - 2018 Intel Corporation. All rights reserved. */
+#include <linux/memremap.h>
+#include <linux/module.h>
+#include <linux/pfn_t.h>
+#include "../../nvdimm/pfn.h"
+#include "../../nvdimm/nd.h"
+#include "../bus.h"
+
+struct dev_dax *__dax_pmem_probe(struct device *dev, enum dev_dax_subsys subsys)
+{
+	struct resource res;
+	int rc, id, region_id;
+	resource_size_t offset;
+	struct nd_pfn_sb *pfn_sb;
+	struct dev_dax *dev_dax;
+	struct nd_namespace_io *nsio;
+	struct dax_region *dax_region;
+	struct dev_pagemap pgmap = { 0 };
+	struct nd_namespace_common *ndns;
+	struct nd_dax *nd_dax = to_nd_dax(dev);
+	struct nd_pfn *nd_pfn = &nd_dax->nd_pfn;
+
+	ndns = nvdimm_namespace_common_probe(dev);
+	if (IS_ERR(ndns))
+		return ERR_CAST(ndns);
+	nsio = to_nd_namespace_io(&ndns->dev);
+
+	/* parse the 'pfn' info block via ->rw_bytes */
+	rc = devm_nsio_enable(dev, nsio);
+	if (rc)
+		return ERR_PTR(rc);
+	rc = nvdimm_setup_pfn(nd_pfn, &pgmap);
+	if (rc)
+		return ERR_PTR(rc);
+	devm_nsio_disable(dev, nsio);
+
+	/* reserve the metadata area, device-dax will reserve the data */
+        pfn_sb = nd_pfn->pfn_sb;
+	offset = le64_to_cpu(pfn_sb->dataoff);
+	if (!devm_request_mem_region(dev, nsio->res.start, offset,
+				dev_name(&ndns->dev))) {
+                dev_warn(dev, "could not reserve metadata\n");
+		return ERR_PTR(-EBUSY);
+        }
+
+	rc = sscanf(dev_name(&ndns->dev), "namespace%d.%d", &region_id, &id);
+	if (rc != 2)
+		return ERR_PTR(-EINVAL);
+
+	/* adjust the dax_region resource to the start of data */
+	memcpy(&res, &pgmap.res, sizeof(res));
+	res.start += offset;
+	dax_region = alloc_dax_region(dev, region_id, &res,
+			le32_to_cpu(pfn_sb->align), PFN_DEV|PFN_MAP);
+	if (!dax_region)
+		return ERR_PTR(-ENOMEM);
+
+	dev_dax = __devm_create_dev_dax(dax_region, id, &pgmap, subsys);
+
+	/* child dev_dax instances now own the lifetime of the dax_region */
+	dax_region_put(dax_region);
+
+	return dev_dax;
+}
+EXPORT_SYMBOL_GPL(__dax_pmem_probe);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Intel Corporation");
diff --git a/drivers/dax/pmem/pmem.c b/drivers/dax/pmem/pmem.c
new file mode 100644
index 000000000000..0ae4238a0ef8
--- /dev/null
+++ b/drivers/dax/pmem/pmem.c
@@ -0,0 +1,40 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2016 - 2018 Intel Corporation. All rights reserved. */
+#include <linux/percpu-refcount.h>
+#include <linux/memremap.h>
+#include <linux/module.h>
+#include <linux/pfn_t.h>
+#include <linux/nd.h>
+#include "../bus.h"
+
+static int dax_pmem_probe(struct device *dev)
+{
+	return PTR_ERR_OR_ZERO(__dax_pmem_probe(dev, DEV_DAX_BUS));
+}
+
+static struct nd_device_driver dax_pmem_driver = {
+	.probe = dax_pmem_probe,
+	.drv = {
+		.name = "dax_pmem",
+	},
+	.type = ND_DRIVER_DAX_PMEM,
+};
+
+static int __init dax_pmem_init(void)
+{
+	return nd_driver_register(&dax_pmem_driver);
+}
+module_init(dax_pmem_init);
+
+static void __exit dax_pmem_exit(void)
+{
+	driver_unregister(&dax_pmem_driver.drv);
+}
+module_exit(dax_pmem_exit);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Intel Corporation");
+#if !IS_ENABLED(CONFIG_DEV_DAX_PMEM_COMPAT)
+/* For compat builds, don't load this module by default */
+MODULE_ALIAS_ND_DEVICE(ND_DEVICE_DAX_PMEM);
+#endif
diff --git a/tools/testing/nvdimm/Kbuild b/tools/testing/nvdimm/Kbuild
index bfc4d8e98452..1e37719bc37e 100644
--- a/tools/testing/nvdimm/Kbuild
+++ b/tools/testing/nvdimm/Kbuild
@@ -34,6 +34,8 @@ obj-$(CONFIG_DAX) += dax.o
 endif
 obj-$(CONFIG_DEV_DAX) += device_dax.o
 obj-$(CONFIG_DEV_DAX_PMEM) += dax_pmem.o
+obj-$(CONFIG_DEV_DAX_PMEM) += dax_pmem_core.o
+obj-$(CONFIG_DEV_DAX_PMEM_COMPAT) += dax_pmem_compat.o
 
 nfit-y := $(ACPI_SRC)/core.o
 nfit-$(CONFIG_X86_MCE) += $(ACPI_SRC)/mce.o
@@ -63,7 +65,9 @@ device_dax-y += dax-dev.o
 device_dax-y += device_dax_test.o
 device_dax-y += config_check.o
 
-dax_pmem-y := $(DAX_SRC)/pmem.o
+dax_pmem-y := $(DAX_SRC)/pmem/pmem.o
+dax_pmem_core-y := $(DAX_SRC)/pmem/core.o
+dax_pmem_compat-y := $(DAX_SRC)/pmem/compat.o
 dax_pmem-y += config_check.o
 
 libnvdimm-y := $(NVDIMM_SRC)/core.o


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 7/8] device-dax: Add support for a dax override driver
  2018-10-31  3:13 ` [PATCH 7/8] device-dax: Add support for a dax override driver Dan Williams
@ 2018-11-07 23:42   ` Williams, Dan J
  0 siblings, 0 replies; 10+ messages in thread
From: Williams, Dan J @ 2018-11-07 23:42 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: linux-kernel, linux-mm, dave.hansen

On Tue, 2018-10-30 at 20:13 -0700, Dan Williams wrote:
> Introduce the 'new_id' concept for enabling a custom device-driver attach
> policy for dax-bus drivers. The intended use is to have a mechanism for
> hot-plugging device-dax ranges into the page allocator on-demand. With
> this in place the default policy of using device-dax for performance
> differentiated memory can be overridden by user-space policy that can
> arrange for the memory range to be managed as 'System RAM' with
> user-defined NUMA and other performance attributes.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/dax/bus.c    |  145 ++++++++++++++++++++++++++++++++++++++++++++++++--
>  drivers/dax/bus.h    |   10 +++
>  drivers/dax/device.c |   11 ++--
>  3 files changed, 156 insertions(+), 10 deletions(-)
> 
> 

Here's an incremental fixup for the string matching in this patch, I'll
send a v2 if other review comments come in:

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index 178d76504f79..17af6fbc3be5 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -39,7 +39,7 @@ static struct dax_id *__dax_match_id(struct dax_device_driver *dax_drv,
 	lockdep_assert_held(&dax_bus_lock);
 
 	list_for_each_entry(dax_id, &dax_drv->ids, list)
-		if (strcmp(dax_id->dev_name, dev_name) == 0)
+		if (sysfs_streq(dax_id->dev_name, dev_name))
 			return dax_id;
 	return NULL;
 }
@@ -60,6 +60,7 @@ static ssize_t do_id_store(struct device_driver *drv, const char *buf,
 {
 	struct dax_device_driver *dax_drv = to_dax_drv(drv);
 	unsigned int region_id, id;
+	char devname[DAX_NAME_LEN];
 	struct dax_id *dax_id;
 	ssize_t rc = count;
 	int fields;
@@ -67,8 +68,8 @@ static ssize_t do_id_store(struct device_driver *drv, const char *buf,
 	fields = sscanf(buf, "dax%d.%d", &region_id, &id);
 	if (fields != 2)
 		return -EINVAL;
-
-	if (strlen(buf) + 1 > DAX_NAME_LEN)
+	sprintf(devname, "dax%d.%d", region_id, id);
+	if (!sysfs_streq(buf, devname))
 		return -EINVAL;
 
 	mutex_lock(&dax_bus_lock);
@@ -99,7 +100,6 @@ static ssize_t new_id_store(struct device_driver *drv, const char *buf,
 }
 static DRIVER_ATTR_WO(new_id);
 
-
 static ssize_t remove_id_store(struct device_driver *drv, const char *buf,
 		size_t count)
 {



^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2018-11-07 23:42 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-31  3:12 [PATCH 0/8] Introduce a device-dax bus-based device-model Dan Williams
2018-10-31  3:12 ` [PATCH 1/8] device-dax: Kill dax_region ida Dan Williams
2018-10-31  3:13 ` [PATCH 2/8] device-dax: Kill dax_region base Dan Williams
2018-10-31  3:13 ` [PATCH 3/8] device-dax: Remove multi-resource infrastructure Dan Williams
2018-10-31  3:13 ` [PATCH 4/8] device-dax: Start defining a dax bus model Dan Williams
2018-10-31  3:13 ` [PATCH 5/8] device-dax: Introduce bus + driver model Dan Williams
2018-10-31  3:13 ` [PATCH 6/8] device-dax: Move resource pinning+mapping into the common driver Dan Williams
2018-10-31  3:13 ` [PATCH 7/8] device-dax: Add support for a dax override driver Dan Williams
2018-11-07 23:42   ` Williams, Dan J
2018-10-31  3:13 ` [PATCH 8/8] device-dax: Add /sys/class/dax backwards compatibility Dan Williams

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).