LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Saravana Kannan <email@example.com>
To: Kefeng Wang <firstname.lastname@example.org>
Cc: Rob Herring <email@example.com>,
Frank Rowand <firstname.lastname@example.org>,
email@example.com, Russell King <firstname.lastname@example.org>,
Linus Walleij <email@example.com>,
Dmitry Torokhov <firstname.lastname@example.org>,
Subject: Re: [BUG] amba: Remove deferred device addition
Date: Wed, 8 Sep 2021 20:30:24 -0700 [thread overview]
Message-ID: <CAGETcx9m4=7V25nvYa0030ChKeJw5bu3ogs6gjFpjNKdq+_B_Q@mail.gmail.com> (raw)
On Fri, Aug 27, 2021 at 6:09 PM Kefeng Wang <email@example.com> wrote:
> On 2021/8/28 3:09, Saravana Kannan wrote:
> > On Fri, Aug 27, 2021 at 7:38 AM Kefeng Wang <firstname.lastname@example.org> wrote:
> >> On 2021/8/27 8:04, Saravana Kannan wrote:
> >>> On Thu, Aug 26, 2021 at 1:22 AM Kefeng Wang <email@example.com> wrote:
> >>>>>>> Btw, I've been working on  cleaning up the one-off deferred probe
> >>>>>>> solution that we have for amba devices. That causes a bunch of other
> >>>>>>> headaches. Your patch 3/3 takes us further in the wrong direction by
> >>>>>>> adding more reasons for delaying the addition of the device.
> >>>> Hi Saravana, I try the link, but with it, there is a crash when boot
> >>>> (qemu-system-arm -M vexpress-a15),
> > I'm assuming it's this one?
> > arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
> I use arch/arm/boot/dts/vexpress-v2p-ca15-tc1.dts.
> qemu-system-arm -M vexpress-a15 -dtb vexpress-v2p-ca15-tc1.dtb -cpu
> cortex-a15 -smp 2 -m size=3G -kernel zImage -rtc base=localtime -initrd
> initrd-arm32 -append 'console=ttyAMA0 cma=0 kfence.sample_interval=0
> earlyprintk debug ' -device virtio-net-device,netdev=net8 -netdev
> >>> Hi,
> >>> It's hard to make sense of the logs. Looks like two different threads
> >>> might be printing to the log at the same time? Can you please enable
> >>> the config that prints the thread ID (forgot what it's called) and
> >>> collect this again? With what I could tell the crash seems to be
> >>> happening somewhere in platform_match(), but that's not related to
> >>> this patch at all?
> >> Can you reproduce it? it is very likely related(without your patch, the
> >> boot is fine),
> > Sorry, I haven't ever setup qemu and booted vexpress. Thanks for your help.
> >> the NULL ptr is about serio, it is registed from amba driver.
> >> ambakmi_driver_init
> >> -- amba_kmi_probe
> >> -- __serio_register_port
> > Thanks for the pointer. I took a look at the logs and the code. It's
> > very strange. As you can see from the backtrace, platform_match() is
> > being called for the device_add() from serio_handle_event(). But the
> > device that gets added there is on the serio_bus which obviously
> > should be using the serio_bus_match.
> Yes, I am confused too.
> >> +Dmitry and input maillist, is there some known issue about serio ?
> >> I add some debug, the full log is attached.
> >> [ 2.958355][ T41] input: AT Raw Set 2 keyboard as
> >> /devices/platform/bus@8000000/bus@8000000:motherboard-bus/bus@8000000:motherboard-bus:iofpga-bus@300000000/1c060000.kmi/serio0/input/input0
> >> [ 2.977441][ T41] serio serio1: pdev c1e05508, pdev->name (null),
> >> drv c1090fc0, drv->name vexpress-reset
> > Based on the logs you added, it's pretty clear we are getting to
> > platform_match(). It's also strange that the drv->name is
> > vexpress-reset
> >> [ 3.003113][ T41] Backtrace:
> >> [ 3.003451][ T41] [<c0560bb4>] (strcmp) from [<c0646358>] (platform_match+0xdc/0xf0)
> >> [ 3.003963][ T41] [<c064627c>] (platform_match) from [<c06437d4>] (__device_attach_driver+0x3c/0xf4)
> >> [ 3.004769][ T41] [<c0643798>] (__device_attach_driver) from [<c0641180>] (bus_for_each_drv+0x68/0xc8)
> >> [ 3.005481][ T41] [<c0641118>] (bus_for_each_drv) from [<c0642f40>] (__device_attach+0xf0/0x16c)
> >> [ 3.006152][ T41] [<c0642e50>] (__device_attach) from [<c06439d4>] (device_initial_probe+0x1c/0x20)
> >> [ 3.006853][ T41] [<c06439b8>] (device_initial_probe) from [<c0642030>] (bus_probe_device+0x94/0x9c)
> >> [ 3.007259][ T41] [<c0641f9c>] (bus_probe_device) from [<c063f9cc>] (device_add+0x408/0x8b8)
> >> [ 3.007900][ T41] [<c063f5c4>] (device_add) from [<c071c1cc>] (serio_handle_event+0x1b8/0x234)
> >> [ 3.008824][ T41] [<c071c014>] (serio_handle_event) from [<c01475a4>] (process_one_work+0x238/0x594)
> >> [ 3.009737][ T41] [<c014736c>] (process_one_work) from [<c014795c>] (worker_thread+0x5c/0x5f4)
> >> [ 3.010638][ T41] [<c0147900>] (worker_thread) from [<c014feb4>] (kthread+0x178/0x194)
> >> [ 3.011496][ T41] [<c014fd3c>] (kthread) from [<c0100150>] (ret_from_fork+0x14/0x24)
> >> [ 3.011860][ T41] Exception stack(0xc1675fb0 to 0xc1675ff8)
> > But the platform_match() is happening for the device_add() from
> > serio_event_handle() that's adding a device to the serio_bus and it
> > should be using serio_bus_match().
> > I haven't reached any conclusion yet, but my current thought process
> > is that it's either:
> > 1. My patch is somehow causing list corruption. But I don't directly
> > touch any list in my change (other than deleting a list entirely), so
> > it's not clear how that would be happening.
> Maybe some concurrent driver load？
> > 2. Without my patch, these AMBA device's probe would be delayed at
> > least until 5 seconds or possibly later. I'm wondering if my patch is
> > catching some bad timing assumptions in other code.
> After Rob's patch, It will retry soon.
> commit 039599c92d3b2e73689e8b6e519d653fd4770abb
> Author: Rob Herring <firstname.lastname@example.org>
> Date: Wed Apr 29 15:58:12 2020 -0500
> amba: Retry adding deferred devices at late_initcall
> If amba bus devices defer when adding, the amba bus code simply retries
> adding the devices every 5 seconds. This doesn't work well as it
> completely unsynchronized with starting the init process which can
> happen in less than 5 secs. Add a retry during late_initcall. If the
> amba devices are added, then deferred probe takes over. If the
> dependencies have not probed at this point, then there's no improvement
> over previous behavior. To completely solve this, we'd need to retry
> after every successful probe as deferred probe does.
> The list_empty() check now happens outside the mutex, but the mutex
> wasn't necessary in the first place.
> This needed to use deferred probe instead of fragile initcall ordering
> on 32-bit VExpress systems where the apb_pclk has a number of probe
> dependencies (vexpress-sysregs, vexpress-config).
> > You might be able to test out theory (2) by DEFERRED_DEVICE_TIMEOUT to
> > a much smaller number. Say 500ms or 100ms. If it doesn't crash, it
> > doesn't mean it's not (2), but if it does, then we know for sure it's
> > (2).
> ok, I will try this one, but due to above patch, it may not work.
Were you able to find anything more?
next prev parent reply other threads:[~2021-09-09 3:31 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-08-16 7:46 [PATCH 0/3] amba: Properly handle device probe without IRQ domain Kefeng Wang
2021-08-16 7:46 ` [PATCH 1/3] amba: Drop unused functions about APB/AHB devices add Kefeng Wang
2021-08-16 7:46 ` [PATCH 2/3] Revert "ARM: amba: make use of -1 IRQs warn" Kefeng Wang
2021-08-16 7:46 ` [PATCH 3/3] amba: Properly handle device probe without IRQ domain Kefeng Wang
2021-08-24 20:05 ` Rob Herring
2021-08-24 20:08 ` Saravana Kannan
2021-08-25 4:05 ` Kefeng Wang
2021-08-25 8:04 ` Saravana Kannan
2021-08-25 8:39 ` Kefeng Wang
2021-08-26 2:45 ` Kefeng Wang
2021-08-26 4:45 ` Saravana Kannan
2021-08-26 6:22 ` Kefeng Wang
2021-08-26 8:22 ` [BUG] amba: Remove deferred device addition Kefeng Wang
2021-08-27 0:04 ` Saravana Kannan
2021-08-27 14:38 ` Kefeng Wang
2021-08-27 19:09 ` Saravana Kannan
2021-08-28 1:09 ` Kefeng Wang
2021-09-09 3:30 ` Saravana Kannan [this message]
2021-09-10 7:59 ` Kefeng Wang
2022-07-05 19:25 ` Saravana Kannan
2022-08-27 10:26 ` Leizhen (ThunderTown)
2021-08-25 12:33 ` [PATCH 3/3] amba: Properly handle device probe without IRQ domain Rob Herring
2021-08-25 14:41 ` Kefeng Wang
2021-08-17 22:27 ` [PATCH 0/3] " Rob Herring
2021-08-23 2:19 ` Kefeng Wang
2021-08-23 9:05 ` Russell King (Oracle)
2021-08-23 10:57 ` Kefeng Wang
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--subject='Re: [BUG] amba: Remove deferred device addition' \
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).