Linux-Fsdevel Archive on lore.kernel.org
help / color / mirror / Atom feed
* PROBLEM: Long Workqueue delays V2
@ 2020-08-18 11:58 Jim Baxter
  2020-08-18 16:45 ` Jim Baxter
  2020-08-19 13:12 ` Jim Baxter
  0 siblings, 2 replies; 5+ messages in thread
From: Jim Baxter @ 2020-08-18 11:58 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel, linux-mm
  Cc: Resch Carsten (CM/ESO6), Rosca, Eugeniu (ADITG/ESB)

I am asking this question again to include the fs-devel list.


We have issues with the workqueue of the kernel overloading the CPU 0 
when we we disconnect a USB stick.

This results in other items on the shared workqueue being delayed by
around 6.5 seconds with a default kernel configuration and 2.3 seconds
on a config tailored for our RCar embedded platform.



We first noticed this issue on custom hardware and we have recreated it
on an RCar Starter Kit using a test module [1] to replicate the
behaviour, the test module outputs any delays of greater then 9ms.

To run the test we have a 4GB random file on a USB stick and perform
the following test.
The stick is mounted as R/O and we are copying data from the stick:

- Mount the stick.
mount -o ro,remount /dev/sda1

- Load the Module:
# taskset -c 0 modprobe latency-mon

- Copy large amount of data from the stick:
# dd if=/run/media/sda1/sample.txt of=/dev/zero
[ 1437.517603] DELAY: 10
8388607+1 records in
8388607+1 records out


- Disconnect the USB stick:
[ 1551.796792] usb 2-1: USB disconnect, device number 2
[ 1558.625517] DELAY: 6782


The Delay output 6782 is in milliseconds.



Using umount stops the issue occurring but is unfortunately not guaranteed
in our particular system.


From my analysis the hub_event workqueue kworker/0:1+usb thread uses around
98% of the CPU.

I have traced the workqueue:workqueue_queue_work function while unplugging the USB
and there is no particular workqueue function being executed a lot more then the 
others for the kworker/0:1+usb thread.


Using perf I identified the hub_events workqueue was spending a lot of time in
invalidate_partition(), I have included a cut down the captured data from perf in
[2] which shows the additional functions where the kworker spends most of its time.


I am aware there will be delays on the shared workqueue, are the delays
we are seeing considered normal?


Is there any way to mitigate or identify where the delay is?
I am unsure if this is a memory or filesystem subsystem issue.


Thank you for you help.

Thanks,
Jim Baxter

[1] Test Module:
// SPDX-License-Identifier: GPL-2.0
/*
 * Simple WQ latency monitoring
 *
 * Copyright (C) 2020 Advanced Driver Information Technology.
 */

#include <linux/init.h>
#include <linux/ktime.h>
#include <linux/module.h>

#define PERIOD_MS 100

static struct delayed_work wq;
static u64 us_save;

static void wq_cb(struct work_struct *work)
{
	u64 us = ktime_to_us(ktime_get());
	u64 us_diff = us - us_save;
	u64 us_print = 0;

	if (!us_save)
		goto skip_print;


	us_print = us_diff / 1000 - PERIOD_MS;
	if (us_print > 9)
		pr_crit("DELAY: %lld\n", us_print);

skip_print:
	us_save = us;
	schedule_delayed_work(&wq, msecs_to_jiffies(PERIOD_MS));
}

static int latency_mon_init(void)
{
	us_save = 0;
	INIT_DELAYED_WORK(&wq, wq_cb);
	schedule_delayed_work(&wq, msecs_to_jiffies(PERIOD_MS));

	return 0;
}

static void latency_mon_exit(void)
{
	cancel_delayed_work_sync(&wq);
	pr_info("%s\n", __func__);
}

module_init(latency_mon_init);
module_exit(latency_mon_exit);
MODULE_AUTHOR("Eugeniu Rosca <erosca@de.adit-jv.com>");
MODULE_LICENSE("GPL");


[2] perf trace:
    95.22%     0.00%  kworker/0:2-eve  [kernel.kallsyms]
    |
    ---ret_from_fork
       kthread
       worker_thread
       |          
        --95.15%--process_one_work
		  |          
		   --94.99%--hub_event
			 |          
			  --94.99%--usb_disconnect
			  <snip>
				|  
				--94.90%--invalidate_partition
				   __invalidate_device
				   |          
				   |--64.55%--invalidate_bdev
				   |  |          
				   |   --64.13%--invalidate_mapping_pages
				   |     |          
				   |     |--24.09%--invalidate_inode_page
				   |     |   |          
				   |     |   --23.44%--remove_mapping
				   |     |     |          
				   |     |      --23.20%--__remove_mapping
				   |     |        |          
				   |     |         --21.90%--arch_local_irq_restore
				   |     |          
				   |     |--22.44%--arch_local_irq_enable
				   |          
					--30.35%--shrink_dcache_sb 
					<snip>
					  |      
					  --30.17%--truncate_inode_pages_range


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-09-08 19:37 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-18 11:58 PROBLEM: Long Workqueue delays V2 Jim Baxter
2020-08-18 16:45 ` Jim Baxter
2020-08-19 13:12 ` Jim Baxter
2020-08-27 12:06   ` Jim Baxter
2020-09-08 15:54     ` Vlastimil Babka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).