LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH 0/8] Drivers: hv: vmbus: Enable unloading of vmbus driver
@ 2015-01-27 23:46 K. Y. Srinivasan
  2015-01-27 23:46 ` [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors K. Y. Srinivasan
  0 siblings, 1 reply; 10+ messages in thread
From: K. Y. Srinivasan @ 2015-01-27 23:46 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, tglx; +Cc: K. Y. Srinivasan

Windows hosts starting with Ws2012 R2 permit re-establishing the vmbus
connection from the guest. This patch-set includes patches from Vitaly
to cleanup the VMBUS unload path so we can potentially reload the driver.

This set also includes a patch from Jake to correctly extract MMIO
information on both Gen1 and Gen 2 firmware.


Jake Oshins (1):
  drivers:hv:vmbus drivers:hv:vmbus Allow for more than one MMIO range
    for children

Vitaly Kuznetsov (7):
  Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors
  Drivers: hv: vmbus: rename channel work queues
  Drivers: hv: vmbus: avoid double kfree for device_obj
  Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and
    vmbus_connection pages on shutdown
  drivers: hv: vmbus: Teardown synthetic interrupt controllers on
    module unload
  clockevents: export clockevents_unbind_device instead of
    clockevents_unbind
  Drivers: hv: vmbus: Teardown clockevent devices on module unload

 drivers/hv/channel_mgmt.c       |    6 +-
 drivers/hv/connection.c         |   17 +++--
 drivers/hv/hv.c                 |   34 ++++++++-
 drivers/hv/hyperv_vmbus.h       |    3 +
 drivers/hv/vmbus_drv.c          |  150 ++++++++++++++++++++++++++++++++++-----
 drivers/video/fbdev/hyperv_fb.c |    2 +-
 include/linux/hyperv.h          |    5 +-
 kernel/time/clockevents.c       |    2 +-
 8 files changed, 188 insertions(+), 31 deletions(-)

-- 
1.7.4.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors
  2015-01-27 23:46 [PATCH 0/8] Drivers: hv: vmbus: Enable unloading of vmbus driver K. Y. Srinivasan
@ 2015-01-27 23:46 ` K. Y. Srinivasan
  2015-01-27 23:46   ` [PATCH RESEND 2/8] Drivers: hv: vmbus: rename channel work queues K. Y. Srinivasan
                     ` (7 more replies)
  0 siblings, 8 replies; 10+ messages in thread
From: K. Y. Srinivasan @ 2015-01-27 23:46 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, tglx; +Cc: K. Y. Srinivasan

From: Vitaly Kuznetsov <vkuznets@redhat.com>

When an SMP Hyper-V guest is running on top of 2012R2 Server and secondary
cpus are sent offline (with echo 0 > /sys/devices/system/cpu/cpu$cpu/online)
the system freeze is observed. This happens due to the fact that on newer
hypervisors (Win8, WS2012R2, ...) vmbus channel handlers are distributed
across all cpus (see init_vp_index() function in drivers/hv/channel_mgmt.c)
and on cpu offlining nobody reassigns them to CPU0. Prevent cpu offlining
when vmbus is loaded until the issue is fixed host-side.

This patch also disables hibernation but it is OK as it is also broken (MCE
error is hit on resume). Suspend still works.

Tested with WS2008R2 and WS2012R2.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
 drivers/hv/vmbus_drv.c |   36 ++++++++++++++++++++++++++++++++++++
 1 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index f518b8d7..3b18a66 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -33,6 +33,7 @@
 #include <linux/hyperv.h>
 #include <linux/kernel_stat.h>
 #include <linux/clockchips.h>
+#include <linux/cpu.h>
 #include <asm/hyperv.h>
 #include <asm/hypervisor.h>
 #include <asm/mshyperv.h>
@@ -704,6 +705,39 @@ static void vmbus_isr(void)
 	}
 }
 
+#ifdef CONFIG_HOTPLUG_CPU
+static int hyperv_cpu_disable(void)
+{
+	return -ENOSYS;
+}
+
+static void hv_cpu_hotplug_quirk(bool vmbus_loaded)
+{
+	static void *previous_cpu_disable;
+
+	/*
+	 * Offlining a CPU when running on newer hypervisors (WS2012R2, Win8,
+	 * ...) is not supported at this moment as channel interrupts are
+	 * distributed across all of them.
+	 */
+
+	if ((vmbus_proto_version == VERSION_WS2008) ||
+	    (vmbus_proto_version == VERSION_WIN7))
+		return;
+
+	if (vmbus_loaded) {
+		previous_cpu_disable = smp_ops.cpu_disable;
+		smp_ops.cpu_disable = hyperv_cpu_disable;
+		pr_notice("CPU offlining is not supported by hypervisor\n");
+	} else if (previous_cpu_disable)
+		smp_ops.cpu_disable = previous_cpu_disable;
+}
+#else
+static void hv_cpu_hotplug_quirk(bool vmbus_loaded)
+{
+}
+#endif
+
 /*
  * vmbus_bus_init -Main vmbus driver initialization routine.
  *
@@ -744,6 +778,7 @@ static int vmbus_bus_init(int irq)
 	if (ret)
 		goto err_alloc;
 
+	hv_cpu_hotplug_quirk(true);
 	vmbus_request_offers();
 
 	return 0;
@@ -997,6 +1032,7 @@ static void __exit vmbus_exit(void)
 	bus_unregister(&hv_bus);
 	hv_cleanup();
 	acpi_bus_unregister_driver(&vmbus_acpi_driver);
+	hv_cpu_hotplug_quirk(false);
 }
 
 
-- 
1.7.4.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH RESEND 2/8] Drivers: hv: vmbus: rename channel work queues
  2015-01-27 23:46 ` [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors K. Y. Srinivasan
@ 2015-01-27 23:46   ` K. Y. Srinivasan
  2015-01-27 23:46   ` [PATCH RESEND 3/8] drivers:hv:vmbus drivers:hv:vmbus Allow for more than one MMIO range for children K. Y. Srinivasan
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: K. Y. Srinivasan @ 2015-01-27 23:46 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, tglx
  Cc: Vitaly Kuznetsov, K. Y. Srinivasan

From: Vitaly Kuznetsov <[mailto:vkuznets@redhat.com]>

All channel work queues are named 'hv_vmbus_ctl', this makes them
indistinguishable in ps output and makes it hard to link to the corresponding
vmbus device. Rename them to hv_vmbus_ctl/N and make vmbus device names match,
e.g. now vmbus_1 device is served by hv_vmbus_ctl/1 work queue.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
 drivers/hv/channel_mgmt.c |    5 ++++-
 drivers/hv/vmbus_drv.c    |    6 ++----
 include/linux/hyperv.h    |    3 +++
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 3736f71..ba4b25f 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -139,19 +139,22 @@ EXPORT_SYMBOL_GPL(vmbus_prep_negotiate_resp);
  */
 static struct vmbus_channel *alloc_channel(void)
 {
+	static atomic_t chan_num = ATOMIC_INIT(0);
 	struct vmbus_channel *channel;
 
 	channel = kzalloc(sizeof(*channel), GFP_ATOMIC);
 	if (!channel)
 		return NULL;
 
+	channel->id = atomic_inc_return(&chan_num);
 	spin_lock_init(&channel->inbound_lock);
 	spin_lock_init(&channel->lock);
 
 	INIT_LIST_HEAD(&channel->sc_list);
 	INIT_LIST_HEAD(&channel->percpu_list);
 
-	channel->controlwq = create_workqueue("hv_vmbus_ctl");
+	channel->controlwq = alloc_workqueue("hv_vmbus_ctl/%d", WQ_MEM_RECLAIM,
+					     1, channel->id);
 	if (!channel->controlwq) {
 		kfree(channel);
 		return NULL;
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 3b18a66..e334ccc 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -875,10 +875,8 @@ int vmbus_device_register(struct hv_device *child_device_obj)
 {
 	int ret = 0;
 
-	static atomic_t device_num = ATOMIC_INIT(0);
-
-	dev_set_name(&child_device_obj->device, "vmbus_0_%d",
-		     atomic_inc_return(&device_num));
+	dev_set_name(&child_device_obj->device, "vmbus_%d",
+		     child_device_obj->channel->id);
 
 	child_device_obj->device.bus = &hv_bus;
 	child_device_obj->device.parent = &hv_acpi_dev->dev;
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 5a2ba67..26a32b7 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -646,6 +646,9 @@ struct hv_input_signal_event_buffer {
 };
 
 struct vmbus_channel {
+	/* Unique channel id */
+	int id;
+
 	struct list_head listentry;
 
 	struct hv_device *device_obj;
-- 
1.7.4.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH RESEND 3/8] drivers:hv:vmbus drivers:hv:vmbus Allow for more than one MMIO range for children
  2015-01-27 23:46 ` [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors K. Y. Srinivasan
  2015-01-27 23:46   ` [PATCH RESEND 2/8] Drivers: hv: vmbus: rename channel work queues K. Y. Srinivasan
@ 2015-01-27 23:46   ` K. Y. Srinivasan
  2015-01-27 23:46   ` [PATCH RESEND 4/8] Drivers: hv: vmbus: avoid double kfree for device_obj K. Y. Srinivasan
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: K. Y. Srinivasan @ 2015-01-27 23:46 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, tglx
  Cc: Jake Oshins, Jake Oshins, K. Y. Srinivasan

From: Jake Oshins <[mailto:jakeo@microsoft.com]>

This set of changes finds the _CRS object in the ACPI namespace
that contains memory address space descriptors, intended to convey
to VMBus which ranges of memory-mapped I/O space are available for
child devices, and then builds a resource list that contains all
those ranges.  Without this change, only some of the memory-mapped
I/O space will be available for child devices, and only in some
virtual BIOS configurations (Generation 2 VMs).

Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
 drivers/hv/vmbus_drv.c          |   97 +++++++++++++++++++++++++++++++++------
 drivers/video/fbdev/hyperv_fb.c |    2 +-
 include/linux/hyperv.h          |    2 +-
 3 files changed, 85 insertions(+), 16 deletions(-)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index e334ccc..aebc8fe 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -45,10 +45,7 @@ static struct tasklet_struct msg_dpc;
 static struct completion probe_event;
 static int irq;
 
-struct resource hyperv_mmio = {
-	.name  = "hyperv mmio",
-	.flags = IORESOURCE_MEM,
-};
+struct resource *hyperv_mmio;
 EXPORT_SYMBOL_GPL(hyperv_mmio);
 
 static int vmbus_exists(void)
@@ -915,30 +912,98 @@ void vmbus_device_unregister(struct hv_device *device_obj)
 
 
 /*
- * VMBUS is an acpi enumerated device. Get the the information we
+ * VMBUS is an acpi enumerated device. Get the information we
  * need from DSDT.
  */
 
 static acpi_status vmbus_walk_resources(struct acpi_resource *res, void *ctx)
 {
+	resource_size_t start = 0;
+	resource_size_t end = 0;
+	struct resource *new_res;
+	struct resource **old_res = &hyperv_mmio;
+
 	switch (res->type) {
 	case ACPI_RESOURCE_TYPE_IRQ:
 		irq = res->data.irq.interrupts[0];
+		return AE_OK;
+
+	/*
+	 * "Address" descriptors are for bus windows. Ignore
+	 * "memory" descriptors, which are for registers on
+	 * devices.
+	 */
+	case ACPI_RESOURCE_TYPE_ADDRESS32:
+		start = res->data.address32.address.minimum;
+		end = res->data.address32.address.maximum;
 		break;
 
 	case ACPI_RESOURCE_TYPE_ADDRESS64:
-		hyperv_mmio.start = res->data.address64.address.minimum;
-		hyperv_mmio.end = res->data.address64.address.maximum;
+		start = res->data.address64.address.minimum;
+		end = res->data.address64.address.maximum;
 		break;
+
+	default:
+		/* Unused resource type */
+		return AE_OK;
+
 	}
+	/*
+	 * Ignore ranges that are below 1MB, as they're not
+	 * necessary or useful here.
+	*/
+	if (end < 0x100000)
+		return AE_OK;
+
+	new_res = kzalloc(sizeof(*new_res), GFP_ATOMIC);
+	if (!new_res)
+		return AE_NO_MEMORY;
+
+	new_res->name = "hyperv mmio";
+	new_res->flags = IORESOURCE_MEM;
+	new_res->start = start;
+	new_res->end = end;
+
+	do {
+		if (!*old_res) {
+			*old_res = new_res;
+			break;
+		}
+
+		if ((*old_res)->start > new_res->end) {
+			new_res->sibling = *old_res;
+			*old_res = new_res;
+			break;
+		}
+
+		old_res = &(*old_res)->sibling;
+
+	} while (1);
 
 	return AE_OK;
 }
 
+static int vmbus_acpi_remove(struct acpi_device *device)
+{
+	struct resource *cur_res;
+	struct resource *next_res;
+
+	if (hyperv_mmio) {
+		release_resource(hyperv_mmio);
+		for (cur_res = hyperv_mmio; cur_res; cur_res = next_res) {
+			next_res = cur_res->sibling;
+			kfree(cur_res);
+		}
+	}
+
+	return 0;
+}
+
 static int vmbus_acpi_add(struct acpi_device *device)
 {
 	acpi_status result;
 	int ret_val = -ENODEV;
+	struct acpi_device *ancestor;
 
 	hv_acpi_dev = device;
 
@@ -948,23 +1013,26 @@ static int vmbus_acpi_add(struct acpi_device *device)
 	if (ACPI_FAILURE(result))
 		goto acpi_walk_err;
 	/*
-	 * The parent of the vmbus acpi device (Gen2 firmware) is the VMOD that
-	 * has the mmio ranges. Get that.
+	 * Some ancestor of the vmbus acpi device (Gen1 or Gen2
+	 * firmware) is the VMOD that has the mmio ranges. Get that.
 	 */
-	if (device->parent) {
-		result = acpi_walk_resources(device->parent->handle,
+	for (ancestor = device->parent; ancestor; ancestor = ancestor->parent) {
+		result = acpi_walk_resources(ancestor->handle,
 					METHOD_NAME__CRS,
 					vmbus_walk_resources, NULL);
 
 		if (ACPI_FAILURE(result))
-			goto acpi_walk_err;
-		if (hyperv_mmio.start && hyperv_mmio.end)
-			request_resource(&iomem_resource, &hyperv_mmio);
+			continue;
+		if (hyperv_mmio) {
+			request_resource(&iomem_resource, hyperv_mmio);
+			break;
+		}
 	}
 	ret_val = 0;
 
 acpi_walk_err:
 	complete(&probe_event);
+	vmbus_acpi_remove(device);
 	return ret_val;
 }
 
@@ -980,6 +1048,7 @@ static struct acpi_driver vmbus_acpi_driver = {
 	.ids = vmbus_acpi_device_ids,
 	.ops = {
 		.add = vmbus_acpi_add,
+		.remove = vmbus_acpi_remove,
 	},
 };
 
diff --git a/drivers/video/fbdev/hyperv_fb.c b/drivers/video/fbdev/hyperv_fb.c
index 4254336..003c8f0 100644
--- a/drivers/video/fbdev/hyperv_fb.c
+++ b/drivers/video/fbdev/hyperv_fb.c
@@ -686,7 +686,7 @@ static int hvfb_getmem(struct fb_info *info)
 	par->mem.name = KBUILD_MODNAME;
 	par->mem.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
 	if (gen2vm) {
-		ret = allocate_resource(&hyperv_mmio, &par->mem,
+		ret = allocate_resource(hyperv_mmio, &par->mem,
 					screen_fb_size,
 					0, -1,
 					screen_fb_size,
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 26a32b7..e73cfeb 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1217,7 +1217,7 @@ int hv_vss_init(struct hv_util_service *);
 void hv_vss_deinit(void);
 void hv_vss_onchannelcallback(void *);
 
-extern struct resource hyperv_mmio;
+extern struct resource *hyperv_mmio;
 
 /*
  * Negotiated version with the Host.
-- 
1.7.4.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH RESEND 4/8] Drivers: hv: vmbus: avoid double kfree for device_obj
  2015-01-27 23:46 ` [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors K. Y. Srinivasan
  2015-01-27 23:46   ` [PATCH RESEND 2/8] Drivers: hv: vmbus: rename channel work queues K. Y. Srinivasan
  2015-01-27 23:46   ` [PATCH RESEND 3/8] drivers:hv:vmbus drivers:hv:vmbus Allow for more than one MMIO range for children K. Y. Srinivasan
@ 2015-01-27 23:46   ` K. Y. Srinivasan
  2015-01-27 23:46   ` [PATCH RESEND 5/8] Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and vmbus_connection pages on shutdown K. Y. Srinivasan
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: K. Y. Srinivasan @ 2015-01-27 23:46 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, tglx
  Cc: Vitaly Kuznetsov, K. Y. Srinivasan

From: Vitaly Kuznetsov <[mailto:vkuznets@redhat.com]>

On driver shutdown device_obj is being freed twice:
1) In vmbus_free_channels()
2) vmbus_device_release() (which is being triggered by device_unregister() in
   vmbus_device_unregister().
This double kfree leads to the following sporadic crash on driver unload:

[   23.469876] general protection fault: 0000 [#1] SMP
[   23.470036] Modules linked in: hv_vmbus(-)
[   23.470036] CPU: 2 PID: 213 Comm: rmmod Not tainted 3.19.0-rc5_bug923184+ #488
[   23.470036] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  05/23/2012
[   23.470036] task: ffff880036ef1cb0 ti: ffff880036ce8000 task.ti: ffff880036ce8000
[   23.470036] RIP: 0010:[<ffffffff811d2e1b>]  [<ffffffff811d2e1b>] __kmalloc_node_track_caller+0xdb/0x1e0
[   23.470036] RSP: 0018:ffff880036cebcc8  EFLAGS: 00010246
...

When this crash does not happen on driver unload the similar one is expected if
we try to load hv_vmbus again.

Remove kfree from vmbus_free_channels() as freeing it from
vmbus_device_release() seems right.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
 drivers/hv/channel_mgmt.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index ba4b25f..36bacc7 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -262,7 +262,6 @@ void vmbus_free_channels(void)
 
 	list_for_each_entry(channel, &vmbus_connection.chn_list, listentry) {
 		vmbus_device_unregister(channel->device_obj);
-		kfree(channel->device_obj);
 		free_channel(channel);
 	}
 }
-- 
1.7.4.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH RESEND 5/8] Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and vmbus_connection pages on shutdown
  2015-01-27 23:46 ` [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors K. Y. Srinivasan
                     ` (2 preceding siblings ...)
  2015-01-27 23:46   ` [PATCH RESEND 4/8] Drivers: hv: vmbus: avoid double kfree for device_obj K. Y. Srinivasan
@ 2015-01-27 23:46   ` K. Y. Srinivasan
  2015-01-27 23:46   ` [PATCH RESEND 6/8] drivers: hv: vmbus: Teardown synthetic interrupt controllers on module unload K. Y. Srinivasan
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: K. Y. Srinivasan @ 2015-01-27 23:46 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, tglx
  Cc: Vitaly Kuznetsov, K. Y. Srinivasan

From: Vitaly Kuznetsov <[mailto:vkuznets@redhat.com]>

We need to destroy hv_vmbus_con on module shutdown, otherwise the following
crash is sometimes observed:

[   76.569845] hv_vmbus: Hyper-V Host Build:9600-6.3-17-0.17039; Vmbus version:3.0
[   82.598859] BUG: unable to handle kernel paging request at ffffffffa0003480
[   82.599287] IP: [<ffffffffa0003480>] 0xffffffffa0003480
[   82.599287] PGD 1f34067 PUD 1f35063 PMD 3f72d067 PTE 0
[   82.599287] Oops: 0010 [#1] SMP
[   82.599287] Modules linked in: [last unloaded: hv_vmbus]
[   82.599287] CPU: 0 PID: 26 Comm: kworker/0:1 Not tainted 3.19.0-rc5_bug923184+ #488
[   82.599287] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
[   82.599287] Workqueue: hv_vmbus_con 0xffffffffa0003480
[   82.599287] task: ffff88007b6ddfa0 ti: ffff88007f8f8000 task.ti: ffff88007f8f8000
[   82.599287] RIP: 0010:[<ffffffffa0003480>]  [<ffffffffa0003480>] 0xffffffffa0003480
[   82.599287] RSP: 0018:ffff88007f8fbe00  EFLAGS: 00010202
...

To avoid memory leaks we need to free monitor_pages and int_page for
vmbus_connection. Implement vmbus_disconnect() function by separating cleanup
path from vmbus_connect().

As we use hv_vmbus_con to release channels (see free_channel() in channel_mgmt.c)
we need to make sure the work was done before we remove the queue, do that with
drain_workqueue(). We also need to avoid handling messages  which can (potentially)
create new channels, so set vmbus_connection.conn_state = DISCONNECTED at the very
beginning of vmbus_exit() and check for that in vmbus_onmessage_work().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
 drivers/hv/connection.c   |   17 ++++++++++++-----
 drivers/hv/hyperv_vmbus.h |    1 +
 drivers/hv/vmbus_drv.c    |    6 ++++++
 3 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index a63a795..c4acd1c 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -216,10 +216,21 @@ int vmbus_connect(void)
 
 cleanup:
 	pr_err("Unable to connect to host\n");
+
 	vmbus_connection.conn_state = DISCONNECTED;
+	vmbus_disconnect();
+
+	kfree(msginfo);
+
+	return ret;
+}
 
-	if (vmbus_connection.work_queue)
+void vmbus_disconnect(void)
+{
+	if (vmbus_connection.work_queue) {
+		drain_workqueue(vmbus_connection.work_queue);
 		destroy_workqueue(vmbus_connection.work_queue);
+	}
 
 	if (vmbus_connection.int_page) {
 		free_pages((unsigned long)vmbus_connection.int_page, 0);
@@ -230,10 +241,6 @@ cleanup:
 	free_pages((unsigned long)vmbus_connection.monitor_pages[1], 0);
 	vmbus_connection.monitor_pages[0] = NULL;
 	vmbus_connection.monitor_pages[1] = NULL;
-
-	kfree(msginfo);
-
-	return ret;
 }
 
 /*
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 44b1c94..6cf2de9 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -692,6 +692,7 @@ void vmbus_free_channels(void);
 /* Connection interface */
 
 int vmbus_connect(void);
+void vmbus_disconnect(void);
 
 int vmbus_post_msg(void *buffer, size_t buflen);
 
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index aebc8fe..3b304b9 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -571,6 +571,10 @@ static void vmbus_onmessage_work(struct work_struct *work)
 {
 	struct onmessage_work_context *ctx;
 
+	/* Do not process messages if we're in DISCONNECTED state */
+	if (vmbus_connection.conn_state == DISCONNECTED)
+		return;
+
 	ctx = container_of(work, struct onmessage_work_context,
 			   work);
 	vmbus_onmessage(&ctx->msg);
@@ -1094,12 +1098,14 @@ cleanup:
 
 static void __exit vmbus_exit(void)
 {
+	vmbus_connection.conn_state = DISCONNECTED;
 	hv_remove_vmbus_irq();
 	vmbus_free_channels();
 	bus_unregister(&hv_bus);
 	hv_cleanup();
 	acpi_bus_unregister_driver(&vmbus_acpi_driver);
 	hv_cpu_hotplug_quirk(false);
+	vmbus_disconnect();
 }
 
 
-- 
1.7.4.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH RESEND 6/8] drivers: hv: vmbus: Teardown synthetic interrupt controllers on module unload
  2015-01-27 23:46 ` [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors K. Y. Srinivasan
                     ` (3 preceding siblings ...)
  2015-01-27 23:46   ` [PATCH RESEND 5/8] Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and vmbus_connection pages on shutdown K. Y. Srinivasan
@ 2015-01-27 23:46   ` K. Y. Srinivasan
  2015-01-27 23:46   ` [PATCH RESEND 7/8] clockevents: export clockevents_unbind_device instead of clockevents_unbind K. Y. Srinivasan
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: K. Y. Srinivasan @ 2015-01-27 23:46 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, tglx
  Cc: Vitaly Kuznetsov, K. Y. Srinivasan

From: Vitaly Kuznetsov <[mailto:vkuznets@redhat.com]>

SynIC has to be switched off when we unload the module, otherwise registered
memory pages can get corrupted after (as Hyper-V host still writes there) and
we see the following crashes for random processes:

[   89.116774] BUG: Bad page map in process sh  pte:4989c716 pmd:36f81067
[   89.159454] addr:0000000000437000 vm_flags:00000875 anon_vma:          (null) mapping:ffff88007bba55a0 index:37
[   89.226146] vma->vm_ops->fault: filemap_fault+0x0/0x410
[   89.257776] vma->vm_file->f_op->mmap: generic_file_mmap+0x0/0x60
[   89.297570] CPU: 0 PID: 215 Comm: sh Tainted: G    B          3.19.0-rc5_bug923184+ #488
[   89.353738] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  05/23/2012
[   89.409138]  0000000000000000 000000004e083d7b ffff880036e9fa18 ffffffff81a68d31
[   89.468724]  0000000000000000 0000000000437000 ffff880036e9fa68 ffffffff811a1e3a
[   89.519233]  000000004989c716 0000000000000037 ffffea0001edc340 0000000000437000
[   89.575751] Call Trace:
[   89.591060]  [<ffffffff81a68d31>] dump_stack+0x45/0x57
[   89.625164]  [<ffffffff811a1e3a>] print_bad_pte+0x1aa/0x250
[   89.667234]  [<ffffffff811a2c95>] vm_normal_page+0x55/0xa0
[   89.703818]  [<ffffffff811a3105>] unmap_page_range+0x425/0x8a0
[   89.737982]  [<ffffffff811a3601>] unmap_single_vma+0x81/0xf0
[   89.780385]  [<ffffffff81184320>] ? lru_deactivate_fn+0x190/0x190
[   89.820130]  [<ffffffff811a4131>] unmap_vmas+0x51/0xa0
[   89.860168]  [<ffffffff811ad12c>] exit_mmap+0xac/0x1a0
[   89.890588]  [<ffffffff810763c3>] mmput+0x63/0x100
[   89.919205]  [<ffffffff811eba48>] flush_old_exec+0x3f8/0x8b0
[   89.962135]  [<ffffffff8123b5bb>] load_elf_binary+0x32b/0x1260
[   89.998581]  [<ffffffff811a14f2>] ? get_user_pages+0x52/0x60

hv_synic_cleanup() function exists but noone calls it now. Do the following:
- call hv_synic_cleanup() on each cpu from vmbus_exit();
- write global disable bit through MSR;
- use hv_synic_free_cpu() to avoid memory leask and code duplication.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
 drivers/hv/hv.c        |    9 +++++++--
 drivers/hv/vmbus_drv.c |    4 ++++
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index 50e51a5..39531dc 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -477,6 +477,7 @@ void hv_synic_cleanup(void *arg)
 	union hv_synic_sint shared_sint;
 	union hv_synic_simp simp;
 	union hv_synic_siefp siefp;
+	union hv_synic_scontrol sctrl;
 	int cpu = smp_processor_id();
 
 	if (!hv_context.synic_initialized)
@@ -502,6 +503,10 @@ void hv_synic_cleanup(void *arg)
 
 	wrmsrl(HV_X64_MSR_SIEFP, siefp.as_uint64);
 
-	free_page((unsigned long)hv_context.synic_message_page[cpu]);
-	free_page((unsigned long)hv_context.synic_event_page[cpu]);
+	/* Disable the global synic bit */
+	rdmsrl(HV_X64_MSR_SCONTROL, sctrl.as_uint64);
+	sctrl.enable = 0;
+	wrmsrl(HV_X64_MSR_SCONTROL, sctrl.as_uint64);
+
+	hv_synic_free_cpu(cpu);
 }
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 3b304b9..8ff6f69 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -1098,11 +1098,15 @@ cleanup:
 
 static void __exit vmbus_exit(void)
 {
+	int cpu;
+
 	vmbus_connection.conn_state = DISCONNECTED;
 	hv_remove_vmbus_irq();
 	vmbus_free_channels();
 	bus_unregister(&hv_bus);
 	hv_cleanup();
+	for_each_online_cpu(cpu)
+		smp_call_function_single(cpu, hv_synic_cleanup, NULL, 1);
 	acpi_bus_unregister_driver(&vmbus_acpi_driver);
 	hv_cpu_hotplug_quirk(false);
 	vmbus_disconnect();
-- 
1.7.4.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH RESEND 7/8] clockevents: export clockevents_unbind_device instead of clockevents_unbind
  2015-01-27 23:46 ` [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors K. Y. Srinivasan
                     ` (4 preceding siblings ...)
  2015-01-27 23:46   ` [PATCH RESEND 6/8] drivers: hv: vmbus: Teardown synthetic interrupt controllers on module unload K. Y. Srinivasan
@ 2015-01-27 23:46   ` K. Y. Srinivasan
  2015-01-27 23:46   ` [PATCH RESEND 8/8] Drivers: hv: vmbus: Teardown clockevent devices on module unload K. Y. Srinivasan
  2015-01-28 22:16   ` [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors KY Srinivasan
  7 siblings, 0 replies; 10+ messages in thread
From: K. Y. Srinivasan @ 2015-01-27 23:46 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, tglx
  Cc: Vitaly Kuznetsov, K. Y. Srinivasan

From: Vitaly Kuznetsov <[mailto:vkuznets@redhat.com]>

It looks like clockevents_unbind is being exported by mistake as:
- it is static;
- it is not listed in include/linux/clockchips.h;
- EXPORT_SYMBOL_GPL(clockevents_unbind) follows clockevents_unbind_device()
  implementation.

I think clockevents_unbind_device should be exported instead. This is going to
be used to teardown Hyper-V clockevent devices on module unload.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
 kernel/time/clockevents.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index 5544990..888ecc1 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -371,7 +371,7 @@ int clockevents_unbind_device(struct clock_event_device *ced, int cpu)
 	mutex_unlock(&clockevents_mutex);
 	return ret;
 }
-EXPORT_SYMBOL_GPL(clockevents_unbind);
+EXPORT_SYMBOL_GPL(clockevents_unbind_device);
 
 /**
  * clockevents_register_device - register a clock event device
-- 
1.7.4.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH RESEND 8/8] Drivers: hv: vmbus: Teardown clockevent devices on module unload
  2015-01-27 23:46 ` [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors K. Y. Srinivasan
                     ` (5 preceding siblings ...)
  2015-01-27 23:46   ` [PATCH RESEND 7/8] clockevents: export clockevents_unbind_device instead of clockevents_unbind K. Y. Srinivasan
@ 2015-01-27 23:46   ` K. Y. Srinivasan
  2015-01-28 22:16   ` [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors KY Srinivasan
  7 siblings, 0 replies; 10+ messages in thread
From: K. Y. Srinivasan @ 2015-01-27 23:46 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, tglx
  Cc: Vitaly Kuznetsov, K. Y. Srinivasan

From: Vitaly Kuznetsov <[mailto:vkuznets@redhat.com]>

Newly introduced clockevent devices made it impossible to unload hv_vmbus
module as clockevents_config_and_register() takes additional reverence to
the module. To make it possible again we do the following:
- avoid setting dev->owner for clockevent devices;
- implement hv_synic_clockevents_cleanup() doing clockevents_unbind_device();
- call it from vmbus_exit().

In theory hv_synic_clockevents_cleanup() can be merged with hv_synic_cleanup(),
however, we call hv_synic_cleanup() from smp_call_function_single() and this
doesn't work for clockevents_unbind_device() as it does such call on its own. I
opted for a separate function.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
 drivers/hv/hv.c           |   25 ++++++++++++++++++++++++-
 drivers/hv/hyperv_vmbus.h |    2 ++
 drivers/hv/vmbus_drv.c    |    1 +
 3 files changed, 27 insertions(+), 1 deletions(-)

diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index 39531dc..d3943bc 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -312,7 +312,11 @@ static void hv_init_clockevent_device(struct clock_event_device *dev, int cpu)
 	dev->features = CLOCK_EVT_FEAT_ONESHOT;
 	dev->cpumask = cpumask_of(cpu);
 	dev->rating = 1000;
-	dev->owner = THIS_MODULE;
+	/*
+	 * Avoid settint dev->owner = THIS_MODULE deliberately as doing so will
+	 * result in clockevents_config_and_register() taking additional
+	 * references to the hv_vmbus module making it impossible to unload.
+	 */
 
 	dev->set_mode = hv_ce_setmode;
 	dev->set_next_event = hv_ce_set_next_event;
@@ -470,6 +474,20 @@ void hv_synic_init(void *arg)
 }
 
 /*
+ * hv_synic_clockevents_cleanup - Cleanup clockevent devices
+ */
+void hv_synic_clockevents_cleanup(void)
+{
+	int cpu;
+
+	if (!(ms_hyperv.features & HV_X64_MSR_SYNTIMER_AVAILABLE))
+		return;
+
+	for_each_online_cpu(cpu)
+		clockevents_unbind_device(hv_context.clk_evt[cpu], cpu);
+}
+
+/*
  * hv_synic_cleanup - Cleanup routine for hv_synic_init().
  */
 void hv_synic_cleanup(void *arg)
@@ -483,6 +501,11 @@ void hv_synic_cleanup(void *arg)
 	if (!hv_context.synic_initialized)
 		return;
 
+	/* Turn off clockevent device */
+	if (ms_hyperv.features & HV_X64_MSR_SYNTIMER_AVAILABLE)
+		hv_ce_setmode(CLOCK_EVT_MODE_SHUTDOWN,
+			      hv_context.clk_evt[cpu]);
+
 	rdmsrl(HV_X64_MSR_SINT0 + VMBUS_MESSAGE_SINT, shared_sint.as_uint64);
 
 	shared_sint.masked = 1;
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 6cf2de9..b055e53 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -572,6 +572,8 @@ extern void hv_synic_init(void *irqarg);
 
 extern void hv_synic_cleanup(void *arg);
 
+extern void hv_synic_clockevents_cleanup(void);
+
 /*
  * Host version information.
  */
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 8ff6f69..2de170a 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -1101,6 +1101,7 @@ static void __exit vmbus_exit(void)
 	int cpu;
 
 	vmbus_connection.conn_state = DISCONNECTED;
+	hv_synic_clockevents_cleanup();
 	hv_remove_vmbus_irq();
 	vmbus_free_channels();
 	bus_unregister(&hv_bus);
-- 
1.7.4.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors
  2015-01-27 23:46 ` [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors K. Y. Srinivasan
                     ` (6 preceding siblings ...)
  2015-01-27 23:46   ` [PATCH RESEND 8/8] Drivers: hv: vmbus: Teardown clockevent devices on module unload K. Y. Srinivasan
@ 2015-01-28 22:16   ` KY Srinivasan
  7 siblings, 0 replies; 10+ messages in thread
From: KY Srinivasan @ 2015-01-28 22:16 UTC (permalink / raw)
  To: KY Srinivasan, gregkh, linux-kernel, devel, olaf, apw, vkuznets, tglx



> -----Original Message-----
> From: K. Y. Srinivasan [mailto:kys@microsoft.com]
> Sent: Tuesday, January 27, 2015 3:47 PM
> To: gregkh@linuxfoundation.org; linux-kernel@vger.kernel.org;
> devel@linuxdriverproject.org; olaf@aepfle.de; apw@canonical.com;
> vkuznets@redhat.com; tglx@linutronix.de
> Cc: KY Srinivasan
> Subject: [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on
> newer hypervisors
> 
> From: Vitaly Kuznetsov <vkuznets@redhat.com>
> 
> When an SMP Hyper-V guest is running on top of 2012R2 Server and
> secondary cpus are sent offline (with echo 0 >
> /sys/devices/system/cpu/cpu$cpu/online)
> the system freeze is observed. This happens due to the fact that on newer
> hypervisors (Win8, WS2012R2, ...) vmbus channel handlers are distributed
> across all cpus (see init_vp_index() function in drivers/hv/channel_mgmt.c)
> and on cpu offlining nobody reassigns them to CPU0. Prevent cpu offlining
> when vmbus is loaded until the issue is fixed host-side.
> 
> This patch also disables hibernation but it is OK as it is also broken (MCE error
> is hit on resume). Suspend still works.
> 
> Tested with WS2008R2 and WS2012R2.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>

Greg,

Please drop this entire series; the patch-set was based on the incorrect tree. I will resend the set shortly
after rebasing them on your  char-misc.git.

Thank you,

K. Y

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2015-01-29  1:50 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-27 23:46 [PATCH 0/8] Drivers: hv: vmbus: Enable unloading of vmbus driver K. Y. Srinivasan
2015-01-27 23:46 ` [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors K. Y. Srinivasan
2015-01-27 23:46   ` [PATCH RESEND 2/8] Drivers: hv: vmbus: rename channel work queues K. Y. Srinivasan
2015-01-27 23:46   ` [PATCH RESEND 3/8] drivers:hv:vmbus drivers:hv:vmbus Allow for more than one MMIO range for children K. Y. Srinivasan
2015-01-27 23:46   ` [PATCH RESEND 4/8] Drivers: hv: vmbus: avoid double kfree for device_obj K. Y. Srinivasan
2015-01-27 23:46   ` [PATCH RESEND 5/8] Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and vmbus_connection pages on shutdown K. Y. Srinivasan
2015-01-27 23:46   ` [PATCH RESEND 6/8] drivers: hv: vmbus: Teardown synthetic interrupt controllers on module unload K. Y. Srinivasan
2015-01-27 23:46   ` [PATCH RESEND 7/8] clockevents: export clockevents_unbind_device instead of clockevents_unbind K. Y. Srinivasan
2015-01-27 23:46   ` [PATCH RESEND 8/8] Drivers: hv: vmbus: Teardown clockevent devices on module unload K. Y. Srinivasan
2015-01-28 22:16   ` [PATCH RESEND 1/8] Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors KY Srinivasan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).