LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH] input/psmouse: Don't hold the mutex while calling ->disconnect
@ 2018-04-30 19:56 Daniel Vetter
  2018-04-30 21:17 ` Dmitry Torokhov
  0 siblings, 1 reply; 6+ messages in thread
From: Daniel Vetter @ 2018-04-30 19:56 UTC (permalink / raw)
  To: Intel Graphics Development
  Cc: LKML, Daniel Vetter, Daniel Vetter, Dmitry Torokhov,
	Benjamin Tissoires, Arvind Yadav, Stephen Lyons, linux-input

At least trackpoint_disconnect wants to remove some sysfs files, and
we can't remove sysfs files while holding psmouse_mutex:

======================================================
WARNING: possible circular locking dependency detected
4.16.0-rc5-g613eb885b69e-drmtip_1+ #1 Tainted: G     U
------------------------------------------------------
kworker/0:3/102 is trying to acquire lock:
 (kn->count#130){++++}, at: [<000000009679748b>] kernfs_remove_by_name_ns+0x3b/0x80

but task is already holding lock:
 (psmouse_mutex){+.+.}, at: [<0000000014f44bcc>] psmouse_disconnect+0x62/0x160

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (psmouse_mutex){+.+.}:
       psmouse_attr_set_helper+0x28/0x140
       kernfs_fop_write+0xfe/0x180
       __vfs_write+0x1e/0x130
       vfs_write+0xbd/0x1b0
       SyS_write+0x40/0xa0
       do_syscall_64+0x65/0x1a0
       entry_SYSCALL_64_after_hwframe+0x42/0xb7

-> #0 (kn->count#130){++++}:
       __kernfs_remove+0x243/0x2b0
       kernfs_remove_by_name_ns+0x3b/0x80
       remove_files.isra.0+0x2b/0x60
       sysfs_remove_group+0x38/0x80
       sysfs_remove_groups+0x24/0x40
       trackpoint_disconnect+0x2c/0x50
       psmouse_disconnect+0x8f/0x160
       serio_disconnect_driver+0x28/0x40
       serio_driver_remove+0xc/0x10
       device_release_driver_internal+0x15b/0x230
       serio_handle_event+0x1c8/0x260
       process_one_work+0x215/0x620
       worker_thread+0x48/0x3a0
       kthread+0xfb/0x130
       ret_from_fork+0x3a/0x50

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(psmouse_mutex);
                               lock(kn->count#130);
                               lock(psmouse_mutex);
  lock(kn->count#130);

 *** DEADLOCK ***

6 locks held by kworker/0:3/102:
 #0:  ((wq_completion)"events_long"){+.+.}, at: [<000000002e408bfa>] process_one_work+0x191/0x620
 #1:  (serio_event_work){+.+.}, at: [<000000002e408bfa>] process_one_work+0x191/0x620
 #2:  (serio_mutex){+.+.}, at: [<00000000c8a49847>] serio_handle_event+0x23/0x260
 #3:  (&dev->mutex){....}, at: [<00000000b55eee75>] device_release_driver_internal+0x2f/0x230
 #4:  (&serio->drv_mutex){+.+.}, at: [<000000009719f997>] serio_disconnect_driver+0x16/0x40
 #5:  (psmouse_mutex){+.+.}, at: [<0000000014f44bcc>] psmouse_disconnect+0x62/0x160

stack backtrace:
CPU: 0 PID: 102 Comm: kworker/0:3 Tainted: G     U           4.16.0-rc5-g613eb885b69e-drmtip_1+ #1
Hardware name: LENOVO 74591P0/74591P0, BIOS 6DET28WW (1.05 ) 07/30/2008
Workqueue: events_long serio_handle_event
Call Trace:
 dump_stack+0x5f/0x86
 print_circular_bug.isra.18+0x1d0/0x2c0
 __lock_acquire+0x14ae/0x1b60
 ? kernfs_remove_by_name_ns+0x20/0x80
 ? lock_acquire+0xaf/0x200
 lock_acquire+0xaf/0x200
 ? kernfs_remove_by_name_ns+0x3b/0x80
 __kernfs_remove+0x243/0x2b0
 ? kernfs_remove_by_name_ns+0x3b/0x80
 ? kernfs_name_hash+0xd/0x70
 ? kernfs_find_ns+0x7e/0x100
 kernfs_remove_by_name_ns+0x3b/0x80
 remove_files.isra.0+0x2b/0x60
 sysfs_remove_group+0x38/0x80
 sysfs_remove_groups+0x24/0x40
 trackpoint_disconnect+0x2c/0x50
 psmouse_disconnect+0x8f/0x160
 serio_disconnect_driver+0x28/0x40
 serio_driver_remove+0xc/0x10
 device_release_driver_internal+0x15b/0x230
 serio_handle_event+0x1c8/0x260
 process_one_work+0x215/0x620
 worker_thread+0x48/0x3a0
 ? _raw_spin_unlock_irqrestore+0x4c/0x60
 kthread+0xfb/0x130
 ? process_one_work+0x620/0x620
 ? _kthread_create_on_node+0x30/0x30
 ret_from_fork+0x3a/0x50

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
Cc: Stephen Lyons <slysven@virginmedia.com>
Cc: linux-input@vger.kernel.org
---
 drivers/input/mouse/psmouse-base.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/input/mouse/psmouse-base.c b/drivers/input/mouse/psmouse-base.c
index 8900c3166ebf..06ccd8e22f3c 100644
--- a/drivers/input/mouse/psmouse-base.c
+++ b/drivers/input/mouse/psmouse-base.c
@@ -1484,8 +1484,10 @@ static void psmouse_disconnect(struct serio *serio)
 		psmouse_deactivate(parent);
 	}
 
+	mutex_unlock(&psmouse_mutex);
 	if (psmouse->disconnect)
 		psmouse->disconnect(psmouse);
+	mutex_lock(&psmouse_mutex);
 
 	if (parent && parent->pt_deactivate)
 		parent->pt_deactivate(parent);
-- 
2.17.0

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] input/psmouse: Don't hold the mutex while calling ->disconnect
  2018-04-30 19:56 [PATCH] input/psmouse: Don't hold the mutex while calling ->disconnect Daniel Vetter
@ 2018-04-30 21:17 ` Dmitry Torokhov
  2018-05-02  9:06   ` Daniel Vetter
  0 siblings, 1 reply; 6+ messages in thread
From: Dmitry Torokhov @ 2018-04-30 21:17 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Intel Graphics Development, LKML, Daniel Vetter,
	Benjamin Tissoires, Arvind Yadav, Stephen Lyons, linux-input

Hi Daniel,

On Mon, Apr 30, 2018 at 09:56:49PM +0200, Daniel Vetter wrote:
> At least trackpoint_disconnect wants to remove some sysfs files, and
> we can't remove sysfs files while holding psmouse_mutex:
> 
> ======================================================
> WARNING: possible circular locking dependency detected
> 4.16.0-rc5-g613eb885b69e-drmtip_1+ #1 Tainted: G     U
> ------------------------------------------------------
> kworker/0:3/102 is trying to acquire lock:
>  (kn->count#130){++++}, at: [<000000009679748b>] kernfs_remove_by_name_ns+0x3b/0x80
> 
> but task is already holding lock:
>  (psmouse_mutex){+.+.}, at: [<0000000014f44bcc>] psmouse_disconnect+0x62/0x160
> 
> which lock already depends on the new lock.
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #1 (psmouse_mutex){+.+.}:
>        psmouse_attr_set_helper+0x28/0x140
>        kernfs_fop_write+0xfe/0x180
>        __vfs_write+0x1e/0x130
>        vfs_write+0xbd/0x1b0
>        SyS_write+0x40/0xa0
>        do_syscall_64+0x65/0x1a0
>        entry_SYSCALL_64_after_hwframe+0x42/0xb7
> 
> -> #0 (kn->count#130){++++}:
>        __kernfs_remove+0x243/0x2b0
>        kernfs_remove_by_name_ns+0x3b/0x80
>        remove_files.isra.0+0x2b/0x60
>        sysfs_remove_group+0x38/0x80
>        sysfs_remove_groups+0x24/0x40
>        trackpoint_disconnect+0x2c/0x50
>        psmouse_disconnect+0x8f/0x160
>        serio_disconnect_driver+0x28/0x40
>        serio_driver_remove+0xc/0x10
>        device_release_driver_internal+0x15b/0x230
>        serio_handle_event+0x1c8/0x260
>        process_one_work+0x215/0x620
>        worker_thread+0x48/0x3a0
>        kthread+0xfb/0x130
>        ret_from_fork+0x3a/0x50
> 
> other info that might help us debug this:
> 
>  Possible unsafe locking scenario:
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(psmouse_mutex);
>                                lock(kn->count#130);
>                                lock(psmouse_mutex);
>   lock(kn->count#130);
> 
>  *** DEADLOCK ***
> 
> 6 locks held by kworker/0:3/102:
>  #0:  ((wq_completion)"events_long"){+.+.}, at: [<000000002e408bfa>] process_one_work+0x191/0x620
>  #1:  (serio_event_work){+.+.}, at: [<000000002e408bfa>] process_one_work+0x191/0x620
>  #2:  (serio_mutex){+.+.}, at: [<00000000c8a49847>] serio_handle_event+0x23/0x260
>  #3:  (&dev->mutex){....}, at: [<00000000b55eee75>] device_release_driver_internal+0x2f/0x230
>  #4:  (&serio->drv_mutex){+.+.}, at: [<000000009719f997>] serio_disconnect_driver+0x16/0x40
>  #5:  (psmouse_mutex){+.+.}, at: [<0000000014f44bcc>] psmouse_disconnect+0x62/0x160
> 
> stack backtrace:
> CPU: 0 PID: 102 Comm: kworker/0:3 Tainted: G     U           4.16.0-rc5-g613eb885b69e-drmtip_1+ #1
> Hardware name: LENOVO 74591P0/74591P0, BIOS 6DET28WW (1.05 ) 07/30/2008
> Workqueue: events_long serio_handle_event
> Call Trace:
>  dump_stack+0x5f/0x86
>  print_circular_bug.isra.18+0x1d0/0x2c0
>  __lock_acquire+0x14ae/0x1b60
>  ? kernfs_remove_by_name_ns+0x20/0x80
>  ? lock_acquire+0xaf/0x200
>  lock_acquire+0xaf/0x200
>  ? kernfs_remove_by_name_ns+0x3b/0x80
>  __kernfs_remove+0x243/0x2b0
>  ? kernfs_remove_by_name_ns+0x3b/0x80
>  ? kernfs_name_hash+0xd/0x70
>  ? kernfs_find_ns+0x7e/0x100
>  kernfs_remove_by_name_ns+0x3b/0x80
>  remove_files.isra.0+0x2b/0x60
>  sysfs_remove_group+0x38/0x80
>  sysfs_remove_groups+0x24/0x40
>  trackpoint_disconnect+0x2c/0x50
>  psmouse_disconnect+0x8f/0x160
>  serio_disconnect_driver+0x28/0x40
>  serio_driver_remove+0xc/0x10
>  device_release_driver_internal+0x15b/0x230
>  serio_handle_event+0x1c8/0x260
>  process_one_work+0x215/0x620
>  worker_thread+0x48/0x3a0
>  ? _raw_spin_unlock_irqrestore+0x4c/0x60
>  kthread+0xfb/0x130
>  ? process_one_work+0x620/0x620
>  ? _kthread_create_on_node+0x30/0x30
>  ret_from_fork+0x3a/0x50
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
> Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
> Cc: Stephen Lyons <slysven@virginmedia.com>
> Cc: linux-input@vger.kernel.org
> ---
>  drivers/input/mouse/psmouse-base.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/input/mouse/psmouse-base.c b/drivers/input/mouse/psmouse-base.c
> index 8900c3166ebf..06ccd8e22f3c 100644
> --- a/drivers/input/mouse/psmouse-base.c
> +++ b/drivers/input/mouse/psmouse-base.c
> @@ -1484,8 +1484,10 @@ static void psmouse_disconnect(struct serio *serio)
>  		psmouse_deactivate(parent);
>  	}
>  
> +	mutex_unlock(&psmouse_mutex);
>  	if (psmouse->disconnect)
>  		psmouse->disconnect(psmouse);
> +	mutex_lock(&psmouse_mutex);

Why do you think it is proper to drop this mutex? It is introduced for a
reason.

I think the trace you are seeing is due to:

commit 988cd7afb3f37598891ca70b4c6eb914c338c46a
Author: Tejun Heo <tj@kernel.org>
Date:   Mon Feb 3 14:02:58 2014 -0500

    kernfs: remove kernfs_addrm_cxt

where we started taking kernfs_mutex when adding/removing sysfs files.
Ultimately I think we may have to change switching protocol to a work
item to be done asynchronously from writing to sysfs attribute.

Thanks.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] input/psmouse: Don't hold the mutex while calling ->disconnect
  2018-04-30 21:17 ` Dmitry Torokhov
@ 2018-05-02  9:06   ` Daniel Vetter
  2018-10-01  9:23     ` Daniel Vetter
  0 siblings, 1 reply; 6+ messages in thread
From: Daniel Vetter @ 2018-05-02  9:06 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Intel Graphics Development, LKML, Daniel Vetter,
	Benjamin Tissoires, Arvind Yadav, Stephen Lyons, linux-input

On Mon, Apr 30, 2018 at 11:17 PM, Dmitry Torokhov
<dmitry.torokhov@gmail.com> wrote:
> Hi Daniel,
>
> On Mon, Apr 30, 2018 at 09:56:49PM +0200, Daniel Vetter wrote:
>> At least trackpoint_disconnect wants to remove some sysfs files, and
>> we can't remove sysfs files while holding psmouse_mutex:
>>
>> ======================================================
>> WARNING: possible circular locking dependency detected
>> 4.16.0-rc5-g613eb885b69e-drmtip_1+ #1 Tainted: G     U
>> ------------------------------------------------------
>> kworker/0:3/102 is trying to acquire lock:
>>  (kn->count#130){++++}, at: [<000000009679748b>] kernfs_remove_by_name_ns+0x3b/0x80
>>
>> but task is already holding lock:
>>  (psmouse_mutex){+.+.}, at: [<0000000014f44bcc>] psmouse_disconnect+0x62/0x160
>>
>> which lock already depends on the new lock.
>>
>> the existing dependency chain (in reverse order) is:
>>
>> -> #1 (psmouse_mutex){+.+.}:
>>        psmouse_attr_set_helper+0x28/0x140
>>        kernfs_fop_write+0xfe/0x180
>>        __vfs_write+0x1e/0x130
>>        vfs_write+0xbd/0x1b0
>>        SyS_write+0x40/0xa0
>>        do_syscall_64+0x65/0x1a0
>>        entry_SYSCALL_64_after_hwframe+0x42/0xb7
>>
>> -> #0 (kn->count#130){++++}:
>>        __kernfs_remove+0x243/0x2b0
>>        kernfs_remove_by_name_ns+0x3b/0x80
>>        remove_files.isra.0+0x2b/0x60
>>        sysfs_remove_group+0x38/0x80
>>        sysfs_remove_groups+0x24/0x40
>>        trackpoint_disconnect+0x2c/0x50
>>        psmouse_disconnect+0x8f/0x160
>>        serio_disconnect_driver+0x28/0x40
>>        serio_driver_remove+0xc/0x10
>>        device_release_driver_internal+0x15b/0x230
>>        serio_handle_event+0x1c8/0x260
>>        process_one_work+0x215/0x620
>>        worker_thread+0x48/0x3a0
>>        kthread+0xfb/0x130
>>        ret_from_fork+0x3a/0x50
>>
>> other info that might help us debug this:
>>
>>  Possible unsafe locking scenario:
>>
>>        CPU0                    CPU1
>>        ----                    ----
>>   lock(psmouse_mutex);
>>                                lock(kn->count#130);
>>                                lock(psmouse_mutex);
>>   lock(kn->count#130);
>>
>>  *** DEADLOCK ***
>>
>> 6 locks held by kworker/0:3/102:
>>  #0:  ((wq_completion)"events_long"){+.+.}, at: [<000000002e408bfa>] process_one_work+0x191/0x620
>>  #1:  (serio_event_work){+.+.}, at: [<000000002e408bfa>] process_one_work+0x191/0x620
>>  #2:  (serio_mutex){+.+.}, at: [<00000000c8a49847>] serio_handle_event+0x23/0x260
>>  #3:  (&dev->mutex){....}, at: [<00000000b55eee75>] device_release_driver_internal+0x2f/0x230
>>  #4:  (&serio->drv_mutex){+.+.}, at: [<000000009719f997>] serio_disconnect_driver+0x16/0x40
>>  #5:  (psmouse_mutex){+.+.}, at: [<0000000014f44bcc>] psmouse_disconnect+0x62/0x160
>>
>> stack backtrace:
>> CPU: 0 PID: 102 Comm: kworker/0:3 Tainted: G     U           4.16.0-rc5-g613eb885b69e-drmtip_1+ #1
>> Hardware name: LENOVO 74591P0/74591P0, BIOS 6DET28WW (1.05 ) 07/30/2008
>> Workqueue: events_long serio_handle_event
>> Call Trace:
>>  dump_stack+0x5f/0x86
>>  print_circular_bug.isra.18+0x1d0/0x2c0
>>  __lock_acquire+0x14ae/0x1b60
>>  ? kernfs_remove_by_name_ns+0x20/0x80
>>  ? lock_acquire+0xaf/0x200
>>  lock_acquire+0xaf/0x200
>>  ? kernfs_remove_by_name_ns+0x3b/0x80
>>  __kernfs_remove+0x243/0x2b0
>>  ? kernfs_remove_by_name_ns+0x3b/0x80
>>  ? kernfs_name_hash+0xd/0x70
>>  ? kernfs_find_ns+0x7e/0x100
>>  kernfs_remove_by_name_ns+0x3b/0x80
>>  remove_files.isra.0+0x2b/0x60
>>  sysfs_remove_group+0x38/0x80
>>  sysfs_remove_groups+0x24/0x40
>>  trackpoint_disconnect+0x2c/0x50
>>  psmouse_disconnect+0x8f/0x160
>>  serio_disconnect_driver+0x28/0x40
>>  serio_driver_remove+0xc/0x10
>>  device_release_driver_internal+0x15b/0x230
>>  serio_handle_event+0x1c8/0x260
>>  process_one_work+0x215/0x620
>>  worker_thread+0x48/0x3a0
>>  ? _raw_spin_unlock_irqrestore+0x4c/0x60
>>  kthread+0xfb/0x130
>>  ? process_one_work+0x620/0x620
>>  ? _kthread_create_on_node+0x30/0x30
>>  ret_from_fork+0x3a/0x50
>>
>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
>> Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>> Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
>> Cc: Stephen Lyons <slysven@virginmedia.com>
>> Cc: linux-input@vger.kernel.org
>> ---
>>  drivers/input/mouse/psmouse-base.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/input/mouse/psmouse-base.c b/drivers/input/mouse/psmouse-base.c
>> index 8900c3166ebf..06ccd8e22f3c 100644
>> --- a/drivers/input/mouse/psmouse-base.c
>> +++ b/drivers/input/mouse/psmouse-base.c
>> @@ -1484,8 +1484,10 @@ static void psmouse_disconnect(struct serio *serio)
>>               psmouse_deactivate(parent);
>>       }
>>
>> +     mutex_unlock(&psmouse_mutex);
>>       if (psmouse->disconnect)
>>               psmouse->disconnect(psmouse);
>> +     mutex_lock(&psmouse_mutex);
>
> Why do you think it is proper to drop this mutex? It is introduced for a
> reason.
>
> I think the trace you are seeing is due to:
>
> commit 988cd7afb3f37598891ca70b4c6eb914c338c46a
> Author: Tejun Heo <tj@kernel.org>
> Date:   Mon Feb 3 14:02:58 2014 -0500
>
>     kernfs: remove kernfs_addrm_cxt
>
> where we started taking kernfs_mutex when adding/removing sysfs files.
> Ultimately I think we may have to change switching protocol to a work
> item to be done asynchronously from writing to sysfs attribute.


Well "holding a lock while adding/removing sysfs files and holding the
same lock from sysfs read/write callbacks" is a classic deadlock. I've
made lockdep shut up about it, I have no idea how to fix it properly.
kernfs switching it's locking doesn't change anything here.

Generally the fix is to untangle the locking: You need 1 set of locks
to protect the datastructures exposed through sysfs, and another thing
(maybe that's provided by serio already, I have no idea) to make sure
you're ordering connect/disconnect handling correct and there's not
concurrent calls to this hooks. Assuming serio does give that
guarantee then there's no need to hold the lock while unregistering
the sysfs files (we could perhaps push the lock dropping down into
trackpoint_disconnect, but that's less maintainable imo since it's not
obvious in psmouse_disconnect what's going on), and my patch is
correct.

But I didn't do a full locking audit of what exactly serio guarantees,
and what exactly psmouse_mutex needs to protect (and where
psmouse_mutex is only held because it's convenient).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] input/psmouse: Don't hold the mutex while calling ->disconnect
  2018-05-02  9:06   ` Daniel Vetter
@ 2018-10-01  9:23     ` Daniel Vetter
  0 siblings, 0 replies; 6+ messages in thread
From: Daniel Vetter @ 2018-10-01  9:23 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: intel-gfx, Linux Kernel Mailing List, Daniel Vetter,
	Benjamin Tissoires, Arvind Yadav, Stephen Lyons, linux-input

On Wed, May 2, 2018 at 11:06 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> On Mon, Apr 30, 2018 at 11:17 PM, Dmitry Torokhov
> <dmitry.torokhov@gmail.com> wrote:
> > Hi Daniel,
> >
> > On Mon, Apr 30, 2018 at 09:56:49PM +0200, Daniel Vetter wrote:
> >> At least trackpoint_disconnect wants to remove some sysfs files, and
> >> we can't remove sysfs files while holding psmouse_mutex:
> >>
> >> ======================================================
> >> WARNING: possible circular locking dependency detected
> >> 4.16.0-rc5-g613eb885b69e-drmtip_1+ #1 Tainted: G     U
> >> ------------------------------------------------------
> >> kworker/0:3/102 is trying to acquire lock:
> >>  (kn->count#130){++++}, at: [<000000009679748b>] kernfs_remove_by_name_ns+0x3b/0x80
> >>
> >> but task is already holding lock:
> >>  (psmouse_mutex){+.+.}, at: [<0000000014f44bcc>] psmouse_disconnect+0x62/0x160
> >>
> >> which lock already depends on the new lock.
> >>
> >> the existing dependency chain (in reverse order) is:
> >>
> >> -> #1 (psmouse_mutex){+.+.}:
> >>        psmouse_attr_set_helper+0x28/0x140
> >>        kernfs_fop_write+0xfe/0x180
> >>        __vfs_write+0x1e/0x130
> >>        vfs_write+0xbd/0x1b0
> >>        SyS_write+0x40/0xa0
> >>        do_syscall_64+0x65/0x1a0
> >>        entry_SYSCALL_64_after_hwframe+0x42/0xb7
> >>
> >> -> #0 (kn->count#130){++++}:
> >>        __kernfs_remove+0x243/0x2b0
> >>        kernfs_remove_by_name_ns+0x3b/0x80
> >>        remove_files.isra.0+0x2b/0x60
> >>        sysfs_remove_group+0x38/0x80
> >>        sysfs_remove_groups+0x24/0x40
> >>        trackpoint_disconnect+0x2c/0x50
> >>        psmouse_disconnect+0x8f/0x160
> >>        serio_disconnect_driver+0x28/0x40
> >>        serio_driver_remove+0xc/0x10
> >>        device_release_driver_internal+0x15b/0x230
> >>        serio_handle_event+0x1c8/0x260
> >>        process_one_work+0x215/0x620
> >>        worker_thread+0x48/0x3a0
> >>        kthread+0xfb/0x130
> >>        ret_from_fork+0x3a/0x50
> >>
> >> other info that might help us debug this:
> >>
> >>  Possible unsafe locking scenario:
> >>
> >>        CPU0                    CPU1
> >>        ----                    ----
> >>   lock(psmouse_mutex);
> >>                                lock(kn->count#130);
> >>                                lock(psmouse_mutex);
> >>   lock(kn->count#130);
> >>
> >>  *** DEADLOCK ***
> >>
> >> 6 locks held by kworker/0:3/102:
> >>  #0:  ((wq_completion)"events_long"){+.+.}, at: [<000000002e408bfa>] process_one_work+0x191/0x620
> >>  #1:  (serio_event_work){+.+.}, at: [<000000002e408bfa>] process_one_work+0x191/0x620
> >>  #2:  (serio_mutex){+.+.}, at: [<00000000c8a49847>] serio_handle_event+0x23/0x260
> >>  #3:  (&dev->mutex){....}, at: [<00000000b55eee75>] device_release_driver_internal+0x2f/0x230
> >>  #4:  (&serio->drv_mutex){+.+.}, at: [<000000009719f997>] serio_disconnect_driver+0x16/0x40
> >>  #5:  (psmouse_mutex){+.+.}, at: [<0000000014f44bcc>] psmouse_disconnect+0x62/0x160
> >>
> >> stack backtrace:
> >> CPU: 0 PID: 102 Comm: kworker/0:3 Tainted: G     U           4.16.0-rc5-g613eb885b69e-drmtip_1+ #1
> >> Hardware name: LENOVO 74591P0/74591P0, BIOS 6DET28WW (1.05 ) 07/30/2008
> >> Workqueue: events_long serio_handle_event
> >> Call Trace:
> >>  dump_stack+0x5f/0x86
> >>  print_circular_bug.isra.18+0x1d0/0x2c0
> >>  __lock_acquire+0x14ae/0x1b60
> >>  ? kernfs_remove_by_name_ns+0x20/0x80
> >>  ? lock_acquire+0xaf/0x200
> >>  lock_acquire+0xaf/0x200
> >>  ? kernfs_remove_by_name_ns+0x3b/0x80
> >>  __kernfs_remove+0x243/0x2b0
> >>  ? kernfs_remove_by_name_ns+0x3b/0x80
> >>  ? kernfs_name_hash+0xd/0x70
> >>  ? kernfs_find_ns+0x7e/0x100
> >>  kernfs_remove_by_name_ns+0x3b/0x80
> >>  remove_files.isra.0+0x2b/0x60
> >>  sysfs_remove_group+0x38/0x80
> >>  sysfs_remove_groups+0x24/0x40
> >>  trackpoint_disconnect+0x2c/0x50
> >>  psmouse_disconnect+0x8f/0x160
> >>  serio_disconnect_driver+0x28/0x40
> >>  serio_driver_remove+0xc/0x10
> >>  device_release_driver_internal+0x15b/0x230
> >>  serio_handle_event+0x1c8/0x260
> >>  process_one_work+0x215/0x620
> >>  worker_thread+0x48/0x3a0
> >>  ? _raw_spin_unlock_irqrestore+0x4c/0x60
> >>  kthread+0xfb/0x130
> >>  ? process_one_work+0x620/0x620
> >>  ? _kthread_create_on_node+0x30/0x30
> >>  ret_from_fork+0x3a/0x50
> >>
> >> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> >> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
> >> Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
> >> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> >> Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
> >> Cc: Stephen Lyons <slysven@virginmedia.com>
> >> Cc: linux-input@vger.kernel.org
> >> ---
> >>  drivers/input/mouse/psmouse-base.c | 2 ++
> >>  1 file changed, 2 insertions(+)
> >>
> >> diff --git a/drivers/input/mouse/psmouse-base.c b/drivers/input/mouse/psmouse-base.c
> >> index 8900c3166ebf..06ccd8e22f3c 100644
> >> --- a/drivers/input/mouse/psmouse-base.c
> >> +++ b/drivers/input/mouse/psmouse-base.c
> >> @@ -1484,8 +1484,10 @@ static void psmouse_disconnect(struct serio *serio)
> >>               psmouse_deactivate(parent);
> >>       }
> >>
> >> +     mutex_unlock(&psmouse_mutex);
> >>       if (psmouse->disconnect)
> >>               psmouse->disconnect(psmouse);
> >> +     mutex_lock(&psmouse_mutex);
> >
> > Why do you think it is proper to drop this mutex? It is introduced for a
> > reason.
> >
> > I think the trace you are seeing is due to:
> >
> > commit 988cd7afb3f37598891ca70b4c6eb914c338c46a
> > Author: Tejun Heo <tj@kernel.org>
> > Date:   Mon Feb 3 14:02:58 2014 -0500
> >
> >     kernfs: remove kernfs_addrm_cxt
> >
> > where we started taking kernfs_mutex when adding/removing sysfs files.
> > Ultimately I think we may have to change switching protocol to a work
> > item to be done asynchronously from writing to sysfs attribute.
>
> Well "holding a lock while adding/removing sysfs files and holding the
> same lock from sysfs read/write callbacks" is a classic deadlock. I've
> made lockdep shut up about it, I have no idea how to fix it properly.
> kernfs switching it's locking doesn't change anything here.
>
> Generally the fix is to untangle the locking: You need 1 set of locks
> to protect the datastructures exposed through sysfs, and another thing
> (maybe that's provided by serio already, I have no idea) to make sure
> you're ordering connect/disconnect handling correct and there's not
> concurrent calls to this hooks. Assuming serio does give that
> guarantee then there's no need to hold the lock while unregistering
> the sysfs files (we could perhaps push the lock dropping down into
> trackpoint_disconnect, but that's less maintainable imo since it's not
> obvious in psmouse_disconnect what's going on), and my patch is
> correct.
>
> But I didn't do a full locking audit of what exactly serio guarantees,
> and what exactly psmouse_mutex needs to protect (and where
> psmouse_mutex is only held because it's convenient).

Ping? Just noticed that I still have this bugfix hanging around.
bugfix as in "make lockdep more useful", not necessarily fixing the
locking here properly. I guess I could respin to add a FIXME comment,
but not sure how useful that would be.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] input/psmouse: Don't hold the mutex while calling ->disconnect
  2018-03-20  8:51 Daniel Vetter
@ 2018-04-25  9:16 ` Daniel Vetter
  0 siblings, 0 replies; 6+ messages in thread
From: Daniel Vetter @ 2018-04-25  9:16 UTC (permalink / raw)
  To: Intel Graphics Development
  Cc: LKML, Daniel Vetter, Daniel Vetter, Dmitry Torokhov,
	Benjamin Tissoires, Arvind Yadav

Ping.
-Daniel

On Tue, Mar 20, 2018 at 9:51 AM, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> At least trackpoint_disconnect wants to remove some sysfs files, and
> we can't remove sysfs files while holding psmouse_mutex:
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 4.16.0-rc5-g613eb885b69e-drmtip_1+ #1 Tainted: G     U
> ------------------------------------------------------
> kworker/0:3/102 is trying to acquire lock:
>  (kn->count#130){++++}, at: [<000000009679748b>] kernfs_remove_by_name_ns+0x3b/0x80
>
> but task is already holding lock:
>  (psmouse_mutex){+.+.}, at: [<0000000014f44bcc>] psmouse_disconnect+0x62/0x160
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 (psmouse_mutex){+.+.}:
>        psmouse_attr_set_helper+0x28/0x140
>        kernfs_fop_write+0xfe/0x180
>        __vfs_write+0x1e/0x130
>        vfs_write+0xbd/0x1b0
>        SyS_write+0x40/0xa0
>        do_syscall_64+0x65/0x1a0
>        entry_SYSCALL_64_after_hwframe+0x42/0xb7
>
> -> #0 (kn->count#130){++++}:
>        __kernfs_remove+0x243/0x2b0
>        kernfs_remove_by_name_ns+0x3b/0x80
>        remove_files.isra.0+0x2b/0x60
>        sysfs_remove_group+0x38/0x80
>        sysfs_remove_groups+0x24/0x40
>        trackpoint_disconnect+0x2c/0x50
>        psmouse_disconnect+0x8f/0x160
>        serio_disconnect_driver+0x28/0x40
>        serio_driver_remove+0xc/0x10
>        device_release_driver_internal+0x15b/0x230
>        serio_handle_event+0x1c8/0x260
>        process_one_work+0x215/0x620
>        worker_thread+0x48/0x3a0
>        kthread+0xfb/0x130
>        ret_from_fork+0x3a/0x50
>
> other info that might help us debug this:
>
>  Possible unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(psmouse_mutex);
>                                lock(kn->count#130);
>                                lock(psmouse_mutex);
>   lock(kn->count#130);
>
>  *** DEADLOCK ***
>
> 6 locks held by kworker/0:3/102:
>  #0:  ((wq_completion)"events_long"){+.+.}, at: [<000000002e408bfa>] process_one_work+0x191/0x620
>  #1:  (serio_event_work){+.+.}, at: [<000000002e408bfa>] process_one_work+0x191/0x620
>  #2:  (serio_mutex){+.+.}, at: [<00000000c8a49847>] serio_handle_event+0x23/0x260
>  #3:  (&dev->mutex){....}, at: [<00000000b55eee75>] device_release_driver_internal+0x2f/0x230
>  #4:  (&serio->drv_mutex){+.+.}, at: [<000000009719f997>] serio_disconnect_driver+0x16/0x40
>  #5:  (psmouse_mutex){+.+.}, at: [<0000000014f44bcc>] psmouse_disconnect+0x62/0x160
>
> stack backtrace:
> CPU: 0 PID: 102 Comm: kworker/0:3 Tainted: G     U           4.16.0-rc5-g613eb885b69e-drmtip_1+ #1
> Hardware name: LENOVO 74591P0/74591P0, BIOS 6DET28WW (1.05 ) 07/30/2008
> Workqueue: events_long serio_handle_event
> Call Trace:
>  dump_stack+0x5f/0x86
>  print_circular_bug.isra.18+0x1d0/0x2c0
>  __lock_acquire+0x14ae/0x1b60
>  ? kernfs_remove_by_name_ns+0x20/0x80
>  ? lock_acquire+0xaf/0x200
>  lock_acquire+0xaf/0x200
>  ? kernfs_remove_by_name_ns+0x3b/0x80
>  __kernfs_remove+0x243/0x2b0
>  ? kernfs_remove_by_name_ns+0x3b/0x80
>  ? kernfs_name_hash+0xd/0x70
>  ? kernfs_find_ns+0x7e/0x100
>  kernfs_remove_by_name_ns+0x3b/0x80
>  remove_files.isra.0+0x2b/0x60
>  sysfs_remove_group+0x38/0x80
>  sysfs_remove_groups+0x24/0x40
>  trackpoint_disconnect+0x2c/0x50
>  psmouse_disconnect+0x8f/0x160
>  serio_disconnect_driver+0x28/0x40
>  serio_driver_remove+0xc/0x10
>  device_release_driver_internal+0x15b/0x230
>  serio_handle_event+0x1c8/0x260
>  process_one_work+0x215/0x620
>  worker_thread+0x48/0x3a0
>  ? _raw_spin_unlock_irqrestore+0x4c/0x60
>  kthread+0xfb/0x130
>  ? process_one_work+0x620/0x620
>  ? _kthread_create_on_node+0x30/0x30
>  ret_from_fork+0x3a/0x50
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
> Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
> ---
>  drivers/input/mouse/psmouse-base.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/input/mouse/psmouse-base.c b/drivers/input/mouse/psmouse-base.c
> index 8ac9e03c05b4..6e09a5d6e58a 100644
> --- a/drivers/input/mouse/psmouse-base.c
> +++ b/drivers/input/mouse/psmouse-base.c
> @@ -1467,8 +1467,10 @@ static void psmouse_disconnect(struct serio *serio)
>                 psmouse_deactivate(parent);
>         }
>
> +       mutex_unlock(&psmouse_mutex);
>         if (psmouse->disconnect)
>                 psmouse->disconnect(psmouse);
> +       mutex_lock(&psmouse_mutex);
>
>         if (parent && parent->pt_deactivate)
>                 parent->pt_deactivate(parent);
> --
> 2.16.2
>



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] input/psmouse: Don't hold the mutex while calling ->disconnect
@ 2018-03-20  8:51 Daniel Vetter
  2018-04-25  9:16 ` Daniel Vetter
  0 siblings, 1 reply; 6+ messages in thread
From: Daniel Vetter @ 2018-03-20  8:51 UTC (permalink / raw)
  To: Intel Graphics Development
  Cc: LKML, Daniel Vetter, Daniel Vetter, Dmitry Torokhov,
	Benjamin Tissoires, Arvind Yadav

At least trackpoint_disconnect wants to remove some sysfs files, and
we can't remove sysfs files while holding psmouse_mutex:

======================================================
WARNING: possible circular locking dependency detected
4.16.0-rc5-g613eb885b69e-drmtip_1+ #1 Tainted: G     U
------------------------------------------------------
kworker/0:3/102 is trying to acquire lock:
 (kn->count#130){++++}, at: [<000000009679748b>] kernfs_remove_by_name_ns+0x3b/0x80

but task is already holding lock:
 (psmouse_mutex){+.+.}, at: [<0000000014f44bcc>] psmouse_disconnect+0x62/0x160

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (psmouse_mutex){+.+.}:
       psmouse_attr_set_helper+0x28/0x140
       kernfs_fop_write+0xfe/0x180
       __vfs_write+0x1e/0x130
       vfs_write+0xbd/0x1b0
       SyS_write+0x40/0xa0
       do_syscall_64+0x65/0x1a0
       entry_SYSCALL_64_after_hwframe+0x42/0xb7

-> #0 (kn->count#130){++++}:
       __kernfs_remove+0x243/0x2b0
       kernfs_remove_by_name_ns+0x3b/0x80
       remove_files.isra.0+0x2b/0x60
       sysfs_remove_group+0x38/0x80
       sysfs_remove_groups+0x24/0x40
       trackpoint_disconnect+0x2c/0x50
       psmouse_disconnect+0x8f/0x160
       serio_disconnect_driver+0x28/0x40
       serio_driver_remove+0xc/0x10
       device_release_driver_internal+0x15b/0x230
       serio_handle_event+0x1c8/0x260
       process_one_work+0x215/0x620
       worker_thread+0x48/0x3a0
       kthread+0xfb/0x130
       ret_from_fork+0x3a/0x50

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(psmouse_mutex);
                               lock(kn->count#130);
                               lock(psmouse_mutex);
  lock(kn->count#130);

 *** DEADLOCK ***

6 locks held by kworker/0:3/102:
 #0:  ((wq_completion)"events_long"){+.+.}, at: [<000000002e408bfa>] process_one_work+0x191/0x620
 #1:  (serio_event_work){+.+.}, at: [<000000002e408bfa>] process_one_work+0x191/0x620
 #2:  (serio_mutex){+.+.}, at: [<00000000c8a49847>] serio_handle_event+0x23/0x260
 #3:  (&dev->mutex){....}, at: [<00000000b55eee75>] device_release_driver_internal+0x2f/0x230
 #4:  (&serio->drv_mutex){+.+.}, at: [<000000009719f997>] serio_disconnect_driver+0x16/0x40
 #5:  (psmouse_mutex){+.+.}, at: [<0000000014f44bcc>] psmouse_disconnect+0x62/0x160

stack backtrace:
CPU: 0 PID: 102 Comm: kworker/0:3 Tainted: G     U           4.16.0-rc5-g613eb885b69e-drmtip_1+ #1
Hardware name: LENOVO 74591P0/74591P0, BIOS 6DET28WW (1.05 ) 07/30/2008
Workqueue: events_long serio_handle_event
Call Trace:
 dump_stack+0x5f/0x86
 print_circular_bug.isra.18+0x1d0/0x2c0
 __lock_acquire+0x14ae/0x1b60
 ? kernfs_remove_by_name_ns+0x20/0x80
 ? lock_acquire+0xaf/0x200
 lock_acquire+0xaf/0x200
 ? kernfs_remove_by_name_ns+0x3b/0x80
 __kernfs_remove+0x243/0x2b0
 ? kernfs_remove_by_name_ns+0x3b/0x80
 ? kernfs_name_hash+0xd/0x70
 ? kernfs_find_ns+0x7e/0x100
 kernfs_remove_by_name_ns+0x3b/0x80
 remove_files.isra.0+0x2b/0x60
 sysfs_remove_group+0x38/0x80
 sysfs_remove_groups+0x24/0x40
 trackpoint_disconnect+0x2c/0x50
 psmouse_disconnect+0x8f/0x160
 serio_disconnect_driver+0x28/0x40
 serio_driver_remove+0xc/0x10
 device_release_driver_internal+0x15b/0x230
 serio_handle_event+0x1c8/0x260
 process_one_work+0x215/0x620
 worker_thread+0x48/0x3a0
 ? _raw_spin_unlock_irqrestore+0x4c/0x60
 kthread+0xfb/0x130
 ? process_one_work+0x620/0x620
 ? _kthread_create_on_node+0x30/0x30
 ret_from_fork+0x3a/0x50

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
---
 drivers/input/mouse/psmouse-base.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/input/mouse/psmouse-base.c b/drivers/input/mouse/psmouse-base.c
index 8ac9e03c05b4..6e09a5d6e58a 100644
--- a/drivers/input/mouse/psmouse-base.c
+++ b/drivers/input/mouse/psmouse-base.c
@@ -1467,8 +1467,10 @@ static void psmouse_disconnect(struct serio *serio)
 		psmouse_deactivate(parent);
 	}
 
+	mutex_unlock(&psmouse_mutex);
 	if (psmouse->disconnect)
 		psmouse->disconnect(psmouse);
+	mutex_lock(&psmouse_mutex);
 
 	if (parent && parent->pt_deactivate)
 		parent->pt_deactivate(parent);
-- 
2.16.2

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-10-01  9:24 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-30 19:56 [PATCH] input/psmouse: Don't hold the mutex while calling ->disconnect Daniel Vetter
2018-04-30 21:17 ` Dmitry Torokhov
2018-05-02  9:06   ` Daniel Vetter
2018-10-01  9:23     ` Daniel Vetter
  -- strict thread matches above, loose matches on Subject: below --
2018-03-20  8:51 Daniel Vetter
2018-04-25  9:16 ` Daniel Vetter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).