LKML Archive on lore.kernel.org
From: Chao Gao <email@example.com>
To: "Paul E. McKenney" <firstname.lastname@example.org>
Cc: Feng Tang <email@example.com>,
kernel test robot <firstname.lastname@example.org>,
John Stultz <email@example.com>,
Thomas Gleixner <firstname.lastname@example.org>,
Stephen Boyd <email@example.com>, Jonathan Corbet <firstname.lastname@example.org>,
Mark Rutland <Mark.Rutland@arm.com>,
Marc Zyngier <email@example.com>, Andi Kleen <firstname.lastname@example.org>,
Xing Zhengjun <email@example.com>,
Chris Mason <firstname.lastname@example.org>, LKML <email@example.com>,
Linux Memory Management List <firstname.lastname@example.org>,
email@example.com, firstname.lastname@example.org, email@example.com,
Subject: Re: [clocksource] 8901ecc231: stress-ng.lockbus.ops_per_sec -9.5% regression
Date: Thu, 5 Aug 2021 10:16:48 +0800
Message-ID: <20210805021646.GA11629@gao-cwp>
On Tue, Aug 03, 2021 at 06:48:16AM -0700, Paul E. McKenney wrote:
>On Tue, Aug 03, 2021 at 04:58:00PM +0800, Chao Gao wrote:
>> On Mon, Aug 02, 2021 at 10:02:57AM -0700, Paul E. McKenney wrote:
>> >On Mon, Aug 02, 2021 at 02:20:09PM +0800, Chao Gao wrote:
>> >> [snip]
>> >> >commit 48ebcfbfd877f5d9cddcc03c91352a8ca7b190af
>> >> >Author: Paul E. McKenney <firstname.lastname@example.org>
>> >> >Date: Thu May 27 11:03:28 2021 -0700
>> >> >
>> >> > clocksource: Forgive repeated long-latency watchdog clocksource reads
>> >> >
>> >> > Currently, the clocksource watchdog reacts to repeated long-latency
>> >> > clocksource reads by marking that clocksource unstable on the theory that
>> >> > these long-latency reads are a sign of a serious problem. And this theory
>> >> > does in fact have real-world support in the form of firmware issues.
>> >> >
>> >> > However, it is also possible to trigger this using stress-ng on what
>> >> > the stress-ng man page terms "poorly designed hardware". And it
>> >> > is not necessarily a bad thing for the kernel to diagnose cases where
>> >> > high-stress workloads are being run on hardware that is not designed
>> >> > for this sort of use.
>> >> >
>> >> > Nevertheless, it is quite possible that real-world use will result in
>> >> > some situation requiring that high-stress workloads run on hardware
>> >> > not designed to accommodate them, and also requiring that the kernel
>> >> > refrain from marking clocksources unstable.
>> >> >
>> >> > Therefore, provide an out-of-tree patch that reacts to this situation
>> >> > by leaving the clocksource alone, but using the old 62.5-millisecond
>> >> > skew-detection threshold in response to persistent long-latency reads.
>> >> > In addition, the offending clocksource is marked for re-initialization
>> >> > in this case, which both restarts that clocksource with a clean bill of
>> >> > health and avoids false-positive skew reports on later watchdog checks.
>> >> Hi Paul,
>> >> Sorry to dig out this old thread.
>> >Not a problem, especially given that this is still an experimental patch
>> >(marked with "EXP" in -rcu). So one remaining question is "what is this
>> >patch really supposed to do, if anything?".
>> We are testing with TDX [1] and analyzing why the kernel in a TD, or Trust Domain,
>> sometimes spots a large TSC skew. We have inspected tsc hardware/ucode/tdx
>> module to ensure no hardware issue, and also ported tsc_sync.c to a userspace
>> tool such that this tool can help to constantly check if tsc is synchronized
>> when some workload is running. Finally, we believe that the large TSC skew
>> spotted by TD kernel is a false positive.
>> Your patches (those are merged) have improved clocksource watchdog a lot to
>> reduce false-positives. But due to the nature of TDX, switching between TD
>> and host takes more time. Then, the time window between two reads from
>> watchdog clocksource in cs_watchdog_read() increases, so does the
>> probability of the two reads being interrupted by whatever on host. Then,
>> sometimes, especially when there are heavy workloads in both host and TD,
>> the maximum number of retries in cs_watchdog_read() is exceeded and tsc is
>> marked unstable.
>> Then we applied this out-of-tree patch, and it helps to further reduce
>> false-positives. But TD kernel still observes TSC skew in some cases. After
>> a close look into kernel logs, we find patterns in those cases: an expected
>> re-initialization somehow doesn't happen. That's why we raise this issue
>> and ask for your advice.
>I am glad that the patch at least helps. ;-)
>> [1]: https://software.intel.com/content/www/us/en/develop/articles/intel-trust-domain-extensions.html
>> >And here the clocksource failed the coarse-grained check and marked
>> >the clocksource as unstable. Perhaps because the previous read
>> >forced a coarse-grained check. Except that this should have forced
>> >a reinitialization. Ah, it looks like I need to suppress setting
>> >CLOCK_SOURCE_WATCHDOG if coarse-grained checks have been enabled.
>> >That could cause false-positive failure for the next check, after all.
>> >And perhaps make cs_watchdog_read() modify its print if there is
>> >a watchdog reset pending or if the current clocksource has the
>> >CLOCK_SOURCE_WATCHDOG flag cleared.
>> >Perhaps as shown in the additional patch below, to be folded into the
>> Thanks. Will test with below patch applied.
>If this patch helps, but problems remain, another thing to try is to
>increase the clocksource.max_cswd_read_retries kernel boot parameter
>above its default value of 3. Maybe to 5 or 10?
>If this patch does not help, please let me know. In that case, there
>are probably more fixes required.
This patch works well; no false-positive (marking TSC unstable) in a
10hr stress test.
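
For readers following the retry discussion above, here is a minimal
userspace sketch of the idea, not the kernel's cs_watchdog_read() itself.
It reads the watchdog, then the clocksource under test, then the watchdog
again, and accepts the sample only if the two watchdog reads are close
together; otherwise it retries up to a limit. Everything named here
(sketch_watchdog_read, struct fake_clock, fake_read, and the 100-microsecond
SKETCH_MAX_READ_DELAY_NS window) is a hypothetical stand-in chosen for
illustration, and the fake clock simply models reads that stall, as might
happen when a TD-to-host transition is delayed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed acceptance window between the two watchdog reads (illustrative). */
#define SKETCH_MAX_READ_DELAY_NS 100000ULL

typedef uint64_t (*read_ns_fn)(void *ctx);

struct sketch_result {
	uint64_t wd_ns;        /* first watchdog reading of the accepted attempt */
	uint64_t cs_ns;        /* clocksource reading of the accepted attempt */
	unsigned int attempts; /* attempts made, including the final one */
};

/*
 * Try up to (max_retries + 1) times to bracket one clocksource read
 * between two watchdog reads that are close enough together.
 * Returns false if every attempt was disturbed for too long.
 */
static bool sketch_watchdog_read(read_ns_fn read_wd, read_ns_fn read_cs,
				 void *ctx, unsigned int max_retries,
				 struct sketch_result *out)
{
	unsigned int attempt;

	assert(out != NULL);
	for (attempt = 0; attempt <= max_retries; attempt++) {
		uint64_t wd_start = read_wd(ctx);
		uint64_t cs_now = read_cs(ctx);
		uint64_t wd_end = read_wd(ctx);

		out->attempts = attempt + 1;
		if (wd_end - wd_start <= SKETCH_MAX_READ_DELAY_NS) {
			out->wd_ns = wd_start;
			out->cs_ns = cs_now;
			return true;
		}
	}
	return false; /* the caller would treat this as "reads too noisy" */
}

/* Hypothetical fake clock: the next slow_reads_left reads each stall. */
struct fake_clock {
	uint64_t now_ns;
	uint64_t slow_reads_left;
	uint64_t fast_step_ns;
	uint64_t slow_step_ns;
};

static uint64_t fake_read(void *ctx)
{
	struct fake_clock *fc = ctx;

	if (fc->slow_reads_left > 0) {
		fc->slow_reads_left--;
		fc->now_ns += fc->slow_step_ns;
	} else {
		fc->now_ns += fc->fast_step_ns;
	}
	return fc->now_ns;
}
```

With a fake clock whose first 12 reads each stall for 200 microseconds,
a retry limit of 3 is exhausted (four attempts of three reads each), while
a limit of 10 succeeds on the fifth attempt. That is the effect a larger
clocksource.max_cswd_read_retries value is meant to have in the sketch's
terms; the real kernel code also disables interrupts around the reads and
converts cycle deltas to nanoseconds, which this sketch leaves out.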
Thread overview: 24+ messages
2021-05-21 8:33 kernel test robot
2021-05-21 13:56 ` Paul E. McKenney
2021-05-22 16:08 ` Paul E. McKenney
2021-05-26 6:49 ` Feng Tang
2021-05-26 13:49 ` Paul E. McKenney
2021-05-27 18:29 ` Paul E. McKenney
2021-05-27 19:01 ` Andi Kleen
2021-05-27 19:19 ` Paul E. McKenney
2021-05-27 19:29 ` Matthew Wilcox
2021-05-27 21:05 ` Paul E. McKenney
2021-05-28 0:58 ` Andi Kleen
2021-06-01 17:10 ` Paul E. McKenney
2021-08-02 6:20 ` Chao Gao
2021-08-02 17:02 ` Paul E. McKenney
2021-08-03 8:58 ` Chao Gao
2021-08-03 13:48 ` Paul E. McKenney
2021-08-05 2:16 ` Chao Gao [this message]
2021-08-05 4:03 ` Paul E. McKenney
2021-08-05 4:34 ` Andi Kleen
2021-08-05 15:33 ` Paul E. McKenney
2021-08-05 5:39 ` Chao Gao
2021-08-05 15:37 ` Paul E. McKenney
2021-08-06 2:10 ` Chao Gao
2021-08-06 4:15 ` Paul E. McKenney