* Some concurrent actions cause __device_links_no_driver to report the warning calltrace.
@ 2021-09-01  8:01 luojiaxing
From: luojiaxing @ 2021-09-01  8:01 UTC (permalink / raw)
  To: rafael.j.wysocki, linux-pm; +Cc: linux-kernel

Hi Rafael,

I found an issue with device links and would like to ask for your help.

During recent kernel testing, we found that some concurrent actions
trigger a warning call trace, as shown below:

<4>[  606.102307] WARNING: CPU: 0 PID: 7 at drivers/base/core.c:1339 
<4>[  606.284685] Call trace:
<4>[  606.287122]  __device_links_no_driver+0x138/0x170
<4>[  606.291804]  device_links_driver_cleanup+0xb0/0xfc
<4>[  606.296575]  __device_release_driver+0x148/0x1d8
<4>[  606.301173]  device_release_driver+0x38/0x50
<4>[  606.305423]  bus_remove_device+0x130/0x140
<4>[  606.309502]  device_del+0x174/0x430
<4>[  606.312975]  __scsi_remove_device+0x114/0x14c
<4>[  606.317313]  scsi_remove_target+0x1bc/0x240
<4>[  606.321469]  sas_rphy_remove+0x90/0x94
<4>[  606.325202]  sas_rphy_delete+0x44/0x5c
<4>[  606.328935]  sas_destruct_devices+0x64/0xa0 [libsas]
<4>[  606.333883]  sas_revalidate_domain+0xf8/0x1d0 [libsas]
<4>[  606.339002]  process_one_work+0x1dc/0x48c
<4>[  606.342994]  worker_thread+0x15c/0x464
<4>[  606.346726]  kthread+0x168/0x16c
<4>[  606.349940]  ret_from_fork+0x10/0x18
<4>[  606.353502] ---[ end trace cceb4f5db8bdcd25 ]---

The test is to rmmod the device driver while performing a hard reset on
a disk at the same time.

We know that device_links_unbind_consumers() is called when the device
driver is removed with rmmod, and that it releases all consumers of the
device one after another.

Since ours is a storage controller driver, the topology looks as follows:

supplier: storage controller

consumer: sda->sdb->sdc...

device_links_unbind_consumers() releases the consumer devices serially.
If a hard reset of a disk is performed concurrently, the hard-reset
path also calls __device_links_no_driver() (via scsi_remove_target(),
as in the call trace above).

Assume that device_links_unbind_consumers() is releasing sda while sdb
is still queued, and scsi_remove_target() calls
__device_links_no_driver() to release sdb ahead of its turn. A warning
call trace is then generated.

Further analysis shows that sdb's link->status is still
DL_STATE_ACTIVE at this point (sda's link->status has already been
changed to DL_STATE_SUPPLIER_UNBIND by device_links_unbind_consumers()).

The condition of the following if() is therefore false, and execution
falls through:

if (link->status != DL_STATE_CONSUMER_PROBE &&
     link->status != DL_STATE_ACTIVE)

Since link->supplier->links.status has already been set to
DL_DEV_UNBINDING, the next piece of code takes the else branch:

if (link->supplier->links.status == DL_DEV_DRIVER_BOUND) {
         WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
} else {
         WARN_ON(!(link->flags & DL_FLAG_SYNC_STATE_ONLY));
         WRITE_ONCE(link->status, DL_STATE_DORMANT);
}

Because link->flags is set to DL_FLAG_MANAGED (DL_FLAG_SYNC_STATE_ONLY
is not set), the WARN_ON() fires and the call trace is generated.
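
To make the two checks easier to follow, here is a minimal user-space
model of the decision described above. It is only a sketch, not the
kernel code: the enum and flag names mirror the ones quoted in this
mail, but their numeric values, the struct model_link type and the
function model_no_driver() are made up for illustration, and the
"skip the link" behaviour of the first check is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; they are not the kernel's definitions. */
enum dl_dev_state { DL_DEV_NO_DRIVER, DL_DEV_DRIVER_BOUND, DL_DEV_UNBINDING };
enum dl_link_state {
	DL_STATE_DORMANT, DL_STATE_AVAILABLE, DL_STATE_CONSUMER_PROBE,
	DL_STATE_ACTIVE, DL_STATE_SUPPLIER_UNBIND
};

#define DL_FLAG_MANAGED         (1u << 0)	/* hypothetical bit positions */
#define DL_FLAG_SYNC_STATE_ONLY (1u << 1)

/* Hypothetical flattened view of one consumer link and its supplier. */
struct model_link {
	enum dl_link_state status;		/* models link->status */
	enum dl_dev_state supplier_status;	/* models link->supplier->links.status */
	unsigned int flags;			/* models link->flags */
};

/* Returns true when the modelled WARN_ON() condition is hit. */
bool model_no_driver(struct model_link *link)
{
	bool warned;

	assert(link);

	/* First check: only CONSUMER_PROBE and ACTIVE links are handled
	 * (assumption: other links are simply skipped). */
	if (link->status != DL_STATE_CONSUMER_PROBE &&
	    link->status != DL_STATE_ACTIVE)
		return false;

	/* Second check: supplier still bound, so the link stays usable. */
	if (link->supplier_status == DL_DEV_DRIVER_BOUND) {
		link->status = DL_STATE_AVAILABLE;
		return false;
	}

	/* Else branch: warns unless the link is SYNC_STATE_ONLY. */
	warned = !(link->flags & DL_FLAG_SYNC_STATE_ONLY);
	link->status = DL_STATE_DORMANT;
	return warned;
}
```

In this model, sdb's situation corresponds to status = DL_STATE_ACTIVE,
supplier_status = DL_DEV_UNBINDING and flags = DL_FLAG_MANAGED, for
which model_no_driver() reports the warning and leaves the link
DL_STATE_DORMANT.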

In conclusion, the call trace is generated because
link->supplier->links.status and link->status are not updated at the
same time.

link->supplier->links.status is changed to DL_DEV_UNBINDING first, and
only afterwards is each link->status changed to
DL_STATE_SUPPLIER_UNBIND, one consumer at a time.

If a concurrent kernel thread invokes __device_links_no_driver() inside
this window, the warning call trace occurs.
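
The window can be illustrated with a small user-space sketch. It is a
simplified model under stated assumptions, not kernel code: NR_DISKS,
the reduced enum and concurrent_release_warns() are hypothetical names,
the supplier is assumed to be in DL_DEV_UNBINDING for the whole run,
and every link is assumed to have only DL_FLAG_MANAGED set.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_DISKS 3	/* sda, sdb, sdc; hypothetical count */

/* Reduced model: only the two link states relevant to the window. */
enum model_state { MODEL_ACTIVE, MODEL_SUPPLIER_UNBIND };

/*
 * processed: how many consumer links the serial unbind loop has already
 *            moved to SUPPLIER_UNBIND (0 = none, 1 = sda only, ...).
 * victim:    index of the disk released by the concurrent hard reset.
 *
 * Returns true when the concurrent release would hit the warning, i.e.
 * the victim's link is still ACTIVE while the supplier is UNBINDING.
 */
bool concurrent_release_warns(int processed, int victim)
{
	enum model_state link[NR_DISKS];
	int i;

	assert(processed >= 0 && processed <= NR_DISKS);
	assert(victim >= 0 && victim < NR_DISKS);

	/* The unbind loop walks the consumers in order. */
	for (i = 0; i < NR_DISKS; i++)
		link[i] = (i < processed) ? MODEL_SUPPLIER_UNBIND
					  : MODEL_ACTIVE;

	/* An ACTIVE link passes the first check and, with the supplier
	 * no longer DRIVER_BOUND, lands in the warning branch. */
	return link[victim] == MODEL_ACTIVE;
}
```

With processed = 1 (sda being released) and victim = 1 (sdb), the
function returns true, which matches the scenario described above. Once
a disk's own link has been processed, the same call returns false for
that disk.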

Is there any way to fix this warning call trace?


