LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Frank Rowand <frowand.list@gmail.com>
To: Vladimir Oltean <olteanv@gmail.com>
Cc: Rob Herring <robh+dt@kernel.org>,
	Sascha Hauer <s.hauer@pengutronix.de>,
	devicetree@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org
Subject: Re: of_node_put() usage is buggy all over drivers/of/base.c?!
Date: Mon, 16 Aug 2021 10:14:22 -0500	[thread overview]
Message-ID: <4cad28e0-d6b4-800d-787b-936ffaca7be3@gmail.com> (raw)
In-Reply-To: <20210816144622.tgslast6sbblclda@skbuf>

On 8/16/21 9:46 AM, Vladimir Oltean wrote:
> Hi Frank,
> 
> On Mon, Aug 16, 2021 at 09:33:03AM -0500, Frank Rowand wrote:
>> Hi Vladimir,
>>
>> On 8/13/21 8:01 PM, Vladimir Oltean wrote:
>>> Hi,
>>>
>>> I was debugging an RCU stall which happened during the probing of a
>>> driver. Activating lock debugging, I see:
>>
>> I took a quick look at sja1105_mdiobus_register() in v5.14-rc1 and v5.14-rc6.
>>
>> Looking at the following stack trace, I did not see any calls to
>> of_find_compatible_node() in sja1105_mdiobus_register().  I am
>> guessing that maybe there is an inlined function that calls
>> of_find_compatible_node().  This would likely be either
>> sja1105_mdiobus_base_tx_register() or sja1105_mdioux_base_t1_register().
> 
> Yes, it is sja1105_mdiobus_base_t1_register which is inlined.
> 
>>>
>>> [  101.710694] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:938
>>> [  101.719119] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1534, name: sh
>>> [  101.726763] INFO: lockdep is turned off.
>>> [  101.730674] irq event stamp: 0
>>> [  101.733716] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
>>> [  101.739973] hardirqs last disabled at (0): [<ffffd3ebecb10120>] copy_process+0xa78/0x1a98
>>> [  101.748146] softirqs last  enabled at (0): [<ffffd3ebecb10120>] copy_process+0xa78/0x1a98
>>> [  101.756313] softirqs last disabled at (0): [<0000000000000000>] 0x0
>>> [  101.762569] CPU: 4 PID: 1534 Comm: sh Not tainted 5.14.0-rc5+ #272
>>> [  101.774558] Call trace:
>>> [  101.794734]  __might_sleep+0x50/0x88
>>> [  101.798297]  __mutex_lock+0x60/0x938
>>> [  101.801863]  mutex_lock_nested+0x38/0x50
>>> [  101.805775]  kernfs_remove+0x2c/0x50             <---- this takes mutex_lock(&kernfs_mutex);
>>> [  101.809341]  sysfs_remove_dir+0x54/0x70
>>
>> The __kobject_del() occurs only if the refcount on the node
>> becomes zero.  This should never be true when of_find_compatible_node()
>> calls of_node_put() unless a "from" node is passed to of_find_compatible_node().
> 
> I figured that was the assumption, that the of_node_put would never
> trigger a sysfs file / kobject deletion from there.
> 
>> In both sja1105_mdiobus_base_tx_register() and sja1105_mdioux_base_t1_register()
>> a from node ("mdio") is passed to of_find_compatible_node() without first doing an
>> of_node_get(mdio).  If you add the of_node_get() calls the problem should be fixed.
> 
> The answer seems simple enough, but stupid question, but why does
> of_find_compatible_node call of_node_put on "from" in the first place?

Actually a good question.

I do not know why of_find_compatible_node() calls of_node_put() instead of making
the caller of of_find_compatible_node() responsible.  That pattern was created
long before I was involved in devicetree and I have not gone back to read the
review comments of when that code was created.

-Frank

  reply	other threads:[~2021-08-16 15:14 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-14  1:01 Vladimir Oltean
2021-08-16 14:33 ` Frank Rowand
2021-08-16 14:46   ` Vladimir Oltean
2021-08-16 15:14     ` Frank Rowand [this message]
2021-08-16 19:20       ` Rob Herring
2021-08-16 19:56         ` Frank Rowand
2021-08-16 20:00           ` Frank Rowand
2021-08-16 20:25           ` Rob Herring

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4cad28e0-d6b4-800d-787b-936ffaca7be3@gmail.com \
    --to=frowand.list@gmail.com \
    --cc=devicetree@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=olteanv@gmail.com \
    --cc=robh+dt@kernel.org \
    --cc=s.hauer@pengutronix.de \
    --subject='Re: of_node_put() usage is buggy all over drivers/of/base.c?'\!'' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).