LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH 0/3] Drivers: hv: vmbus: fix crashes on hv_vmbus load/unload path
@ 2015-01-21 19:02 Vitaly Kuznetsov
  2015-01-21 19:02 ` [PATCH 1/3] Drivers: hv: vmbus: avoid double kfree for device_obj Vitaly Kuznetsov
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Vitaly Kuznetsov @ 2015-01-21 19:02 UTC (permalink / raw)
  To: K. Y. Srinivasan, devel; +Cc: Haiyang Zhang, linux-kernel, Dexuan Cui

It is possible (since 93e5bd06a953: "Drivers: hv: Make the vmbus driver
unloadable") to unload hv_vmbus driver if no other devices are connected.
1aec169673d7: "x86: Hyperv: Cleanup the irq mess" fixed doulble interrupt
gate setup. However, if we try to unload hv_vmbus and then load it back
crashes in different places of vmbus driver occur on both unload and second
load paths. Address those I saw in my testing.

Not everything is fixed though. MCE was hit once on Generation2 instance and
I neither understand what caused it nor do I know the way to reproduce it.
Anyway, here is the log:

[  204.846255] mce: [Hardware Error]: CPU 0: Machine Check Exception: 4 Bank 0: b2000000c0020001
[  204.846675] mce: [Hardware Error]: TSC 6b5cd64bc8 
[  204.846675] mce: [Hardware Error]: PROCESSOR 0:306e4 TIME 1421944123 SOCKET 0 APIC 0 microcode ffffffff
[  204.846675] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
[  204.846675] mce: [Hardware Error]: Machine check: Processor context corrupt
[  204.846675] Kernel panic - not syncing: Fatal Machine check
[  204.846675] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
[  204.846675] Rebooting in 30 seconds..
[  204.846675] ACPI MEMORY or I/O RESET_REG.

Vitaly Kuznetsov (3):
  Drivers: hv: vmbus: avoid double kfree for device_obj
  Drivers: hv: vmbus: introduce vmbus_acpi_remove
  Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and
    vmbus_connection pages on shutdown

 drivers/hv/channel_mgmt.c |  1 -
 drivers/hv/connection.c   | 17 ++++++++++++-----
 drivers/hv/hyperv_vmbus.h |  1 +
 drivers/hv/vmbus_drv.c    | 16 ++++++++++++++++
 4 files changed, 29 insertions(+), 6 deletions(-)

-- 
1.9.3


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/3] Drivers: hv: vmbus: avoid double kfree for device_obj
  2015-01-21 19:02 [PATCH 0/3] Drivers: hv: vmbus: fix crashes on hv_vmbus load/unload path Vitaly Kuznetsov
@ 2015-01-21 19:02 ` Vitaly Kuznetsov
  2015-01-21 19:02 ` [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove Vitaly Kuznetsov
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Vitaly Kuznetsov @ 2015-01-21 19:02 UTC (permalink / raw)
  To: K. Y. Srinivasan, devel; +Cc: Haiyang Zhang, linux-kernel, Dexuan Cui

On driver shutdown device_obj is being freed twice:
1) In vmbus_free_channels()
2) vmbus_device_release() (which is being triggered by device_unregister() in
   vmbus_device_unregister().
This double kfree leads to the following sporadic crash on driver unload:

[   23.469876] general protection fault: 0000 [#1] SMP
[   23.470036] Modules linked in: hv_vmbus(-)
[   23.470036] CPU: 2 PID: 213 Comm: rmmod Not tainted 3.19.0-rc5_bug923184+ #488
[   23.470036] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  05/23/2012
[   23.470036] task: ffff880036ef1cb0 ti: ffff880036ce8000 task.ti: ffff880036ce8000
[   23.470036] RIP: 0010:[<ffffffff811d2e1b>]  [<ffffffff811d2e1b>] __kmalloc_node_track_caller+0xdb/0x1e0
[   23.470036] RSP: 0018:ffff880036cebcc8  EFLAGS: 00010246
...

When this crash does not happen on driver unload the similar one is expected if
we try to load hv_vmbus again.

Remove kfree from vmbus_free_channels() as freeing it from
vmbus_device_release() seems right.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 drivers/hv/channel_mgmt.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 877a944..0141a3d 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -259,7 +259,6 @@ void vmbus_free_channels(void)
 
 	list_for_each_entry(channel, &vmbus_connection.chn_list, listentry) {
 		vmbus_device_unregister(channel->device_obj);
-		kfree(channel->device_obj);
 		free_channel(channel);
 	}
 }
-- 
1.9.3


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove
  2015-01-21 19:02 [PATCH 0/3] Drivers: hv: vmbus: fix crashes on hv_vmbus load/unload path Vitaly Kuznetsov
  2015-01-21 19:02 ` [PATCH 1/3] Drivers: hv: vmbus: avoid double kfree for device_obj Vitaly Kuznetsov
@ 2015-01-21 19:02 ` Vitaly Kuznetsov
  2015-01-22 19:08   ` KY Srinivasan
  2015-01-21 19:02 ` [PATCH 3/3] Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and vmbus_connection pages on shutdown Vitaly Kuznetsov
  2015-01-26 13:41 ` [PATCH 0/3] Drivers: hv: vmbus: fix crashes on hv_vmbus load/unload path Vitaly Kuznetsov
  3 siblings, 1 reply; 9+ messages in thread
From: Vitaly Kuznetsov @ 2015-01-21 19:02 UTC (permalink / raw)
  To: K. Y. Srinivasan, devel; +Cc: Haiyang Zhang, linux-kernel, Dexuan Cui

In case we do request_resource() in vmbus_acpi_add() we need to tear it down
to be able to load the driver again. Otherwise the following crash in oberved
when hv_vmbus unload/load sequence is performed on Generation2 instance:

[   38.165701] BUG: unable to handle kernel paging request at ffffffffa00075a0
[   38.166315] IP: [<ffffffff8107dc5f>] __request_resource+0x2f/0x50
[   38.166315] PGD 1f34067 PUD 1f35063 PMD 3f723067 PTE 0
[   38.166315] Oops: 0000 [#1] SMP
[   38.166315] Modules linked in: hv_vmbus(+) [last unloaded: hv_vmbus]
[   38.166315] CPU: 0 PID: 267 Comm: modprobe Not tainted 3.19.0-rc5_bug923184+ #486
[   38.166315] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
[   38.166315] task: ffff88003f401cb0 ti: ffff88003f60c000 task.ti: ffff88003f60c000
[   38.166315] RIP: 0010:[<ffffffff8107dc5f>]  [<ffffffff8107dc5f>] __request_resource+0x2f/0x50
[   38.166315] RSP: 0018:ffff88003f60fb58  EFLAGS: 00010286
...

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 drivers/hv/vmbus_drv.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 4d6b269..b06cb87 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -902,6 +902,15 @@ acpi_walk_err:
 	return ret_val;
 }
 
+static int vmbus_acpi_remove(struct acpi_device *device)
+{
+	int ret = 0;
+
+	if (hyperv_mmio.start && hyperv_mmio.end)
+		ret = release_resource(&hyperv_mmio);
+	return ret;
+}
+
 static const struct acpi_device_id vmbus_acpi_device_ids[] = {
 	{"VMBUS", 0},
 	{"VMBus", 0},
@@ -914,6 +923,7 @@ static struct acpi_driver vmbus_acpi_driver = {
 	.ids = vmbus_acpi_device_ids,
 	.ops = {
 		.add = vmbus_acpi_add,
+		.remove = vmbus_acpi_remove,
 	},
 };
 
-- 
1.9.3


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 3/3] Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and vmbus_connection pages on shutdown
  2015-01-21 19:02 [PATCH 0/3] Drivers: hv: vmbus: fix crashes on hv_vmbus load/unload path Vitaly Kuznetsov
  2015-01-21 19:02 ` [PATCH 1/3] Drivers: hv: vmbus: avoid double kfree for device_obj Vitaly Kuznetsov
  2015-01-21 19:02 ` [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove Vitaly Kuznetsov
@ 2015-01-21 19:02 ` Vitaly Kuznetsov
  2015-01-26 13:41 ` [PATCH 0/3] Drivers: hv: vmbus: fix crashes on hv_vmbus load/unload path Vitaly Kuznetsov
  3 siblings, 0 replies; 9+ messages in thread
From: Vitaly Kuznetsov @ 2015-01-21 19:02 UTC (permalink / raw)
  To: K. Y. Srinivasan, devel; +Cc: Haiyang Zhang, linux-kernel, Dexuan Cui

We need to destroy hv_vmbus_con on module shutdown, otherwise the following
crash is sometimes observed:

[   76.569845] hv_vmbus: Hyper-V Host Build:9600-6.3-17-0.17039; Vmbus version:3.0
[   82.598859] BUG: unable to handle kernel paging request at ffffffffa0003480
[   82.599287] IP: [<ffffffffa0003480>] 0xffffffffa0003480
[   82.599287] PGD 1f34067 PUD 1f35063 PMD 3f72d067 PTE 0
[   82.599287] Oops: 0010 [#1] SMP
[   82.599287] Modules linked in: [last unloaded: hv_vmbus]
[   82.599287] CPU: 0 PID: 26 Comm: kworker/0:1 Not tainted 3.19.0-rc5_bug923184+ #488
[   82.599287] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
[   82.599287] Workqueue: hv_vmbus_con 0xffffffffa0003480
[   82.599287] task: ffff88007b6ddfa0 ti: ffff88007f8f8000 task.ti: ffff88007f8f8000
[   82.599287] RIP: 0010:[<ffffffffa0003480>]  [<ffffffffa0003480>] 0xffffffffa0003480
[   82.599287] RSP: 0018:ffff88007f8fbe00  EFLAGS: 00010202
...

To avoid memory leaks we need to free monitor_pages and int_page for
vmbus_connection. Implement vmbus_disconnect() function by separating cleanup
path from vmbus_connect().

As we use hv_vmbus_con to release channels (see free_channel() in channel_mgmt.c)
we need to make sure the work was done before we remove the queue, do that with
drain_workqueue(). We also need to avoid handling messages  which can (potentially)
create new channels, so set vmbus_connection.conn_state = DISCONNECTED at the very
beginning of vmbus_exit() and check for that in vmbus_onmessage_work().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 drivers/hv/connection.c   | 17 ++++++++++++-----
 drivers/hv/hyperv_vmbus.h |  1 +
 drivers/hv/vmbus_drv.c    |  6 ++++++
 3 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index e206619..2fce0c7 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -214,10 +214,21 @@ int vmbus_connect(void)
 
 cleanup:
 	pr_err("Unable to connect to host\n");
+
 	vmbus_connection.conn_state = DISCONNECTED;
+	vmbus_disconnect();
+
+	kfree(msginfo);
+
+	return ret;
+}
 
-	if (vmbus_connection.work_queue)
+void vmbus_disconnect(void)
+{
+	if (vmbus_connection.work_queue) {
+		drain_workqueue(vmbus_connection.work_queue);
 		destroy_workqueue(vmbus_connection.work_queue);
+	}
 
 	if (vmbus_connection.int_page) {
 		free_pages((unsigned long)vmbus_connection.int_page, 0);
@@ -228,10 +239,6 @@ cleanup:
 	free_pages((unsigned long)vmbus_connection.monitor_pages[1], 0);
 	vmbus_connection.monitor_pages[0] = NULL;
 	vmbus_connection.monitor_pages[1] = NULL;
-
-	kfree(msginfo);
-
-	return ret;
 }
 
 /*
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index c386d8d..e01c13e 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -671,6 +671,7 @@ void vmbus_free_channels(void);
 /* Connection interface */
 
 int vmbus_connect(void);
+void vmbus_disconnect(void);
 
 int vmbus_post_msg(void *buffer, size_t buflen);
 
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index b06cb87..e647505 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -572,6 +572,10 @@ static void vmbus_onmessage_work(struct work_struct *work)
 {
 	struct onmessage_work_context *ctx;
 
+	/* Do not process messages if we're in DISCONNECTED state */
+	if (vmbus_connection.conn_state == DISCONNECTED)
+		return;
+
 	ctx = container_of(work, struct onmessage_work_context,
 			   work);
 	vmbus_onmessage(&ctx->msg);
@@ -969,11 +973,13 @@ cleanup:
 
 static void __exit vmbus_exit(void)
 {
+	vmbus_connection.conn_state = DISCONNECTED;
 	hv_remove_vmbus_irq();
 	vmbus_free_channels();
 	bus_unregister(&hv_bus);
 	hv_cleanup();
 	acpi_bus_unregister_driver(&vmbus_acpi_driver);
+	vmbus_disconnect();
 }
 
 
-- 
1.9.3


^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove
  2015-01-21 19:02 ` [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove Vitaly Kuznetsov
@ 2015-01-22 19:08   ` KY Srinivasan
  2015-01-23 10:01     ` Vitaly Kuznetsov
  2015-04-10 12:47     ` Vitaly Kuznetsov
  0 siblings, 2 replies; 9+ messages in thread
From: KY Srinivasan @ 2015-01-22 19:08 UTC (permalink / raw)
  To: Vitaly Kuznetsov, devel
  Cc: Haiyang Zhang, linux-kernel, Dexuan Cui, Jake Oshins



> -----Original Message-----
> From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com]
> Sent: Wednesday, January 21, 2015 11:02 AM
> To: KY Srinivasan; devel@linuxdriverproject.org
> Cc: Haiyang Zhang; linux-kernel@vger.kernel.org; Dexuan Cui
> Subject: [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove
> 
> In case we do request_resource() in vmbus_acpi_add() we need to tear it
> down to be able to load the driver again. Otherwise the following crash in
> oberved when hv_vmbus unload/load sequence is performed on
> Generation2 instance:
> 
> [   38.165701] BUG: unable to handle kernel paging request at ffffffffa00075a0
> [   38.166315] IP: [<ffffffff8107dc5f>] __request_resource+0x2f/0x50
> [   38.166315] PGD 1f34067 PUD 1f35063 PMD 3f723067 PTE 0
> [   38.166315] Oops: 0000 [#1] SMP
> [   38.166315] Modules linked in: hv_vmbus(+) [last unloaded: hv_vmbus]
> [   38.166315] CPU: 0 PID: 267 Comm: modprobe Not tainted 3.19.0-
> rc5_bug923184+ #486
> [   38.166315] Hardware name: Microsoft Corporation Virtual Machine/Virtual
> Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
> [   38.166315] task: ffff88003f401cb0 ti: ffff88003f60c000 task.ti:
> ffff88003f60c000
> [   38.166315] RIP: 0010:[<ffffffff8107dc5f>]  [<ffffffff8107dc5f>]
> __request_resource+0x2f/0x50
> [   38.166315] RSP: 0018:ffff88003f60fb58  EFLAGS: 00010286
> ...
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  drivers/hv/vmbus_drv.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index
> 4d6b269..b06cb87 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -902,6 +902,15 @@ acpi_walk_err:
>  	return ret_val;
>  }
> 
> +static int vmbus_acpi_remove(struct acpi_device *device) {
> +	int ret = 0;
> +
> +	if (hyperv_mmio.start && hyperv_mmio.end)
> +		ret = release_resource(&hyperv_mmio);
> +	return ret;
> +}
> +
>  static const struct acpi_device_id vmbus_acpi_device_ids[] = {
>  	{"VMBUS", 0},
>  	{"VMBus", 0},
> @@ -914,6 +923,7 @@ static struct acpi_driver vmbus_acpi_driver = {
>  	.ids = vmbus_acpi_device_ids,
>  	.ops = {
>  		.add = vmbus_acpi_add,
> +		.remove = vmbus_acpi_remove,
>  	},
>  };

Vitaly,

Jake has sent the following patch that has fixed retrieving of the mmio resources:
https://lkml.org/lkml/2015/1/20/876

This patch also deals with the resource cleanup that you have in this patch.

Regards,

K. Y
> 
> --
> 1.9.3


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove
  2015-01-22 19:08   ` KY Srinivasan
@ 2015-01-23 10:01     ` Vitaly Kuznetsov
  2015-04-10 12:47     ` Vitaly Kuznetsov
  1 sibling, 0 replies; 9+ messages in thread
From: Vitaly Kuznetsov @ 2015-01-23 10:01 UTC (permalink / raw)
  To: KY Srinivasan; +Cc: devel, Haiyang Zhang, linux-kernel, Dexuan Cui, Jake Oshins

KY Srinivasan <kys@microsoft.com> writes:

>> -----Original Message-----
>> From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com]
>> Sent: Wednesday, January 21, 2015 11:02 AM
>> To: KY Srinivasan; devel@linuxdriverproject.org
>> Cc: Haiyang Zhang; linux-kernel@vger.kernel.org; Dexuan Cui
>> Subject: [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove
>> 
>> In case we do request_resource() in vmbus_acpi_add() we need to tear it
>> down to be able to load the driver again. Otherwise the following crash in
>> oberved when hv_vmbus unload/load sequence is performed on
>> Generation2 instance:
>> 
>> [   38.165701] BUG: unable to handle kernel paging request at ffffffffa00075a0
>> [   38.166315] IP: [<ffffffff8107dc5f>] __request_resource+0x2f/0x50
>> [   38.166315] PGD 1f34067 PUD 1f35063 PMD 3f723067 PTE 0
>> [   38.166315] Oops: 0000 [#1] SMP
>> [   38.166315] Modules linked in: hv_vmbus(+) [last unloaded: hv_vmbus]
>> [   38.166315] CPU: 0 PID: 267 Comm: modprobe Not tainted 3.19.0-
>> rc5_bug923184+ #486
>> [   38.166315] Hardware name: Microsoft Corporation Virtual Machine/Virtual
>> Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
>> [   38.166315] task: ffff88003f401cb0 ti: ffff88003f60c000 task.ti:
>> ffff88003f60c000
>> [   38.166315] RIP: 0010:[<ffffffff8107dc5f>]  [<ffffffff8107dc5f>]
>> __request_resource+0x2f/0x50
>> [   38.166315] RSP: 0018:ffff88003f60fb58  EFLAGS: 00010286
>> ...
>> 
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  drivers/hv/vmbus_drv.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>> 
>> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index
>> 4d6b269..b06cb87 100644
>> --- a/drivers/hv/vmbus_drv.c
>> +++ b/drivers/hv/vmbus_drv.c
>> @@ -902,6 +902,15 @@ acpi_walk_err:
>>  	return ret_val;
>>  }
>> 
>> +static int vmbus_acpi_remove(struct acpi_device *device) {
>> +	int ret = 0;
>> +
>> +	if (hyperv_mmio.start && hyperv_mmio.end)
>> +		ret = release_resource(&hyperv_mmio);
>> +	return ret;
>> +}
>> +
>>  static const struct acpi_device_id vmbus_acpi_device_ids[] = {
>>  	{"VMBUS", 0},
>>  	{"VMBus", 0},
>> @@ -914,6 +923,7 @@ static struct acpi_driver vmbus_acpi_driver = {
>>  	.ids = vmbus_acpi_device_ids,
>>  	.ops = {
>>  		.add = vmbus_acpi_add,
>> +		.remove = vmbus_acpi_remove,
>>  	},
>>  };
>
> Vitaly,
>
> Jake has sent the following patch that has fixed retrieving of the mmio resources:
> https://lkml.org/lkml/2015/1/20/876
>
> This patch also deals with the resource cleanup that you have in this
> patch.

Ah, I see, I'm stepping on his toes here :) We can just drop this patch
from this series if Jake's get accepted (and I think it should).

>
> Regards,
>
> K. Y
>> 
>> --
>> 1.9.3

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/3] Drivers: hv: vmbus: fix crashes on hv_vmbus load/unload path
  2015-01-21 19:02 [PATCH 0/3] Drivers: hv: vmbus: fix crashes on hv_vmbus load/unload path Vitaly Kuznetsov
                   ` (2 preceding siblings ...)
  2015-01-21 19:02 ` [PATCH 3/3] Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and vmbus_connection pages on shutdown Vitaly Kuznetsov
@ 2015-01-26 13:41 ` Vitaly Kuznetsov
  3 siblings, 0 replies; 9+ messages in thread
From: Vitaly Kuznetsov @ 2015-01-26 13:41 UTC (permalink / raw)
  To: K. Y. Srinivasan; +Cc: devel, Haiyang Zhang, linux-kernel

Vitaly Kuznetsov <vkuznets@redhat.com> writes:

> It is possible (since 93e5bd06a953: "Drivers: hv: Make the vmbus driver
> unloadable") to unload hv_vmbus driver if no other devices are connected.
> 1aec169673d7: "x86: Hyperv: Cleanup the irq mess" fixed doulble interrupt
> gate setup. However, if we try to unload hv_vmbus and then load it back
> crashes in different places of vmbus driver occur on both unload and second
> load paths. Address those I saw in my testing.

It seems that newly introduced clockevent device (Drivers: hv: vmbus:
Implement a clockevent device) makes it impossible to unload hv_vmbus
module:

# rmmod hv_vmbus
rmmod hv_vmbus
rmmod: ERROR: Module hv_vmbus is in use

I'll try investigating before sending v2 without PATCH 2/3.

>
> Not everything is fixed though. MCE was hit once on Generation2 instance and
> I neither understand what caused it nor do I know the way to reproduce it.
> Anyway, here is the log:
>
> [  204.846255] mce: [Hardware Error]: CPU 0: Machine Check Exception: 4 Bank 0: b2000000c0020001
> [  204.846675] mce: [Hardware Error]: TSC 6b5cd64bc8 
> [  204.846675] mce: [Hardware Error]: PROCESSOR 0:306e4 TIME 1421944123 SOCKET 0 APIC 0 microcode ffffffff
> [  204.846675] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
> [  204.846675] mce: [Hardware Error]: Machine check: Processor context corrupt
> [  204.846675] Kernel panic - not syncing: Fatal Machine check
> [  204.846675] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
> [  204.846675] Rebooting in 30 seconds..
> [  204.846675] ACPI MEMORY or I/O RESET_REG.
>
> Vitaly Kuznetsov (3):
>   Drivers: hv: vmbus: avoid double kfree for device_obj
>   Drivers: hv: vmbus: introduce vmbus_acpi_remove
>   Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and
>     vmbus_connection pages on shutdown
>
>  drivers/hv/channel_mgmt.c |  1 -
>  drivers/hv/connection.c   | 17 ++++++++++++-----
>  drivers/hv/hyperv_vmbus.h |  1 +
>  drivers/hv/vmbus_drv.c    | 16 ++++++++++++++++
>  4 files changed, 29 insertions(+), 6 deletions(-)

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove
  2015-01-22 19:08   ` KY Srinivasan
  2015-01-23 10:01     ` Vitaly Kuznetsov
@ 2015-04-10 12:47     ` Vitaly Kuznetsov
  2015-04-10 14:50       ` KY Srinivasan
  1 sibling, 1 reply; 9+ messages in thread
From: Vitaly Kuznetsov @ 2015-04-10 12:47 UTC (permalink / raw)
  To: KY Srinivasan; +Cc: devel, Haiyang Zhang, linux-kernel, Dexuan Cui, Jake Oshins

KY Srinivasan <kys@microsoft.com> writes:

>> -----Original Message-----
>> From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com]
>> Sent: Wednesday, January 21, 2015 11:02 AM
>> To: KY Srinivasan; devel@linuxdriverproject.org
>> Cc: Haiyang Zhang; linux-kernel@vger.kernel.org; Dexuan Cui
>> Subject: [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove
>> 
>> In case we do request_resource() in vmbus_acpi_add() we need to tear it
>> down to be able to load the driver again. Otherwise the following crash in
>> oberved when hv_vmbus unload/load sequence is performed on
>> Generation2 instance:
>> 
>> [   38.165701] BUG: unable to handle kernel paging request at ffffffffa00075a0
>> [   38.166315] IP: [<ffffffff8107dc5f>] __request_resource+0x2f/0x50
>> [   38.166315] PGD 1f34067 PUD 1f35063 PMD 3f723067 PTE 0
>> [   38.166315] Oops: 0000 [#1] SMP
>> [   38.166315] Modules linked in: hv_vmbus(+) [last unloaded: hv_vmbus]
>> [   38.166315] CPU: 0 PID: 267 Comm: modprobe Not tainted 3.19.0-
>> rc5_bug923184+ #486
>> [   38.166315] Hardware name: Microsoft Corporation Virtual Machine/Virtual
>> Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
>> [   38.166315] task: ffff88003f401cb0 ti: ffff88003f60c000 task.ti:
>> ffff88003f60c000
>> [   38.166315] RIP: 0010:[<ffffffff8107dc5f>]  [<ffffffff8107dc5f>]
>> __request_resource+0x2f/0x50
>> [   38.166315] RSP: 0018:ffff88003f60fb58  EFLAGS: 00010286
>> ...
>> 
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  drivers/hv/vmbus_drv.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>> 
>> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index
>> 4d6b269..b06cb87 100644
>> --- a/drivers/hv/vmbus_drv.c
>> +++ b/drivers/hv/vmbus_drv.c
>> @@ -902,6 +902,15 @@ acpi_walk_err:
>>  	return ret_val;
>>  }
>> 
>> +static int vmbus_acpi_remove(struct acpi_device *device) {
>> +	int ret = 0;
>> +
>> +	if (hyperv_mmio.start && hyperv_mmio.end)
>> +		ret = release_resource(&hyperv_mmio);
>> +	return ret;
>> +}
>> +
>>  static const struct acpi_device_id vmbus_acpi_device_ids[] = {
>>  	{"VMBUS", 0},
>>  	{"VMBus", 0},
>> @@ -914,6 +923,7 @@ static struct acpi_driver vmbus_acpi_driver = {
>>  	.ids = vmbus_acpi_device_ids,
>>  	.ops = {
>>  		.add = vmbus_acpi_add,
>> +		.remove = vmbus_acpi_remove,
>>  	},
>>  };
>
> Vitaly,
>
> Jake has sent the following patch that has fixed retrieving of the mmio resources:
> https://lkml.org/lkml/2015/1/20/876
>
> This patch also deals with the resource cleanup that you have in this patch.
>

Hi K.Y.,

it looks like Jake was forced to redo his work, do you think it would
make sense to have this simple fix for mainline in the meantime? I can
rebase/resend in case you do.

Thanks,

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove
  2015-04-10 12:47     ` Vitaly Kuznetsov
@ 2015-04-10 14:50       ` KY Srinivasan
  0 siblings, 0 replies; 9+ messages in thread
From: KY Srinivasan @ 2015-04-10 14:50 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: devel, Haiyang Zhang, linux-kernel, Dexuan Cui, Jake Oshins



> -----Original Message-----
> From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com]
> Sent: Friday, April 10, 2015 5:48 AM
> To: KY Srinivasan
> Cc: devel@linuxdriverproject.org; Haiyang Zhang; linux-
> kernel@vger.kernel.org; Dexuan Cui; Jake Oshins
> Subject: Re: [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove
> 
> KY Srinivasan <kys@microsoft.com> writes:
> 
> >> -----Original Message-----
> >> From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com]
> >> Sent: Wednesday, January 21, 2015 11:02 AM
> >> To: KY Srinivasan; devel@linuxdriverproject.org
> >> Cc: Haiyang Zhang; linux-kernel@vger.kernel.org; Dexuan Cui
> >> Subject: [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove
> >>
> >> In case we do request_resource() in vmbus_acpi_add() we need to tear it
> >> down to be able to load the driver again. Otherwise the following crash in
> >> oberved when hv_vmbus unload/load sequence is performed on
> >> Generation2 instance:
> >>
> >> [   38.165701] BUG: unable to handle kernel paging request at
> ffffffffa00075a0
> >> [   38.166315] IP: [<ffffffff8107dc5f>] __request_resource+0x2f/0x50
> >> [   38.166315] PGD 1f34067 PUD 1f35063 PMD 3f723067 PTE 0
> >> [   38.166315] Oops: 0000 [#1] SMP
> >> [   38.166315] Modules linked in: hv_vmbus(+) [last unloaded: hv_vmbus]
> >> [   38.166315] CPU: 0 PID: 267 Comm: modprobe Not tainted 3.19.0-
> >> rc5_bug923184+ #486
> >> [   38.166315] Hardware name: Microsoft Corporation Virtual
> Machine/Virtual
> >> Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
> >> [   38.166315] task: ffff88003f401cb0 ti: ffff88003f60c000 task.ti:
> >> ffff88003f60c000
> >> [   38.166315] RIP: 0010:[<ffffffff8107dc5f>]  [<ffffffff8107dc5f>]
> >> __request_resource+0x2f/0x50
> >> [   38.166315] RSP: 0018:ffff88003f60fb58  EFLAGS: 00010286
> >> ...
> >>
> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> >> ---
> >>  drivers/hv/vmbus_drv.c | 10 ++++++++++
> >>  1 file changed, 10 insertions(+)
> >>
> >> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index
> >> 4d6b269..b06cb87 100644
> >> --- a/drivers/hv/vmbus_drv.c
> >> +++ b/drivers/hv/vmbus_drv.c
> >> @@ -902,6 +902,15 @@ acpi_walk_err:
> >>  	return ret_val;
> >>  }
> >>
> >> +static int vmbus_acpi_remove(struct acpi_device *device) {
> >> +	int ret = 0;
> >> +
> >> +	if (hyperv_mmio.start && hyperv_mmio.end)
> >> +		ret = release_resource(&hyperv_mmio);
> >> +	return ret;
> >> +}
> >> +
> >>  static const struct acpi_device_id vmbus_acpi_device_ids[] = {
> >>  	{"VMBUS", 0},
> >>  	{"VMBus", 0},
> >> @@ -914,6 +923,7 @@ static struct acpi_driver vmbus_acpi_driver = {
> >>  	.ids = vmbus_acpi_device_ids,
> >>  	.ops = {
> >>  		.add = vmbus_acpi_add,
> >> +		.remove = vmbus_acpi_remove,
> >>  	},
> >>  };
> >
> > Vitaly,
> >
> > Jake has sent the following patch that has fixed retrieving of the mmio
> resources:
> > https://lkml.org/lkml/2015/1/20/876
> >
> > This patch also deals with the resource cleanup that you have in this patch.
> >
> 
> Hi K.Y.,
> 
> it looks like Jake was forced to redo his work, do you think it would
> make sense to have this simple fix for mainline in the meantime? I can
> rebase/resend in case you do.

Please do. Thanks.

K. Y
> 
> Thanks,
> 
> --
>   Vitaly

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2015-04-10 14:50 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-21 19:02 [PATCH 0/3] Drivers: hv: vmbus: fix crashes on hv_vmbus load/unload path Vitaly Kuznetsov
2015-01-21 19:02 ` [PATCH 1/3] Drivers: hv: vmbus: avoid double kfree for device_obj Vitaly Kuznetsov
2015-01-21 19:02 ` [PATCH 2/3] Drivers: hv: vmbus: introduce vmbus_acpi_remove Vitaly Kuznetsov
2015-01-22 19:08   ` KY Srinivasan
2015-01-23 10:01     ` Vitaly Kuznetsov
2015-04-10 12:47     ` Vitaly Kuznetsov
2015-04-10 14:50       ` KY Srinivasan
2015-01-21 19:02 ` [PATCH 3/3] Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and vmbus_connection pages on shutdown Vitaly Kuznetsov
2015-01-26 13:41 ` [PATCH 0/3] Drivers: hv: vmbus: fix crashes on hv_vmbus load/unload path Vitaly Kuznetsov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).