From: brookxu <>
To: Thomas Gleixner <>,,
Subject: Re: [RFC PATCH] clocksource: skip check while watchdog hung up or unstable
Date: Wed, 11 Aug 2021 21:18:34 +0800
Message-ID: <>
In-Reply-To: <877dgsp2vp.ffs@tglx>

Thanks for your time.

Thomas Gleixner wrote on 2021/8/11 8:44 PM:
> On Wed, Aug 11 2021 at 17:55, brookxu wrote:
>> From: Chunguang Xu <>
>> After patch 1f45f1f3 (clocksource: Make clocksource validation work
>> for all clocksources), wd_nsec may be 0 in some scenarios, such as
>> the watchdog is delayed for a long time or the watchdog has a
>> time-warp.
> Maybe 0? There is exactly one single possibility for it to be zero:
>   cs->wd_last == wdnow, i.e. delta = 0 -> wd_nsec = 0
> So how does that condition solve any long delay or wrap around of the
> watchdog? It's more than unlikely to hit exactly this case where the
> readout is identical to the previous readout unless the watchdog stopped
> counting.

Maybe I missed something. Consider this example, where the hpet wrapped
around between two watchdog runs:

'hpet' wd_now: d76e5a69 wd_last: f929eb3c mask: ffffffff

We can calculate the number of elapsed cycles:
cycles = wd_now - wd_last = 0xde446f2d

clocksource_delta() uses the MSB to detect an invalid interval and returns
0, but for 0xde446f2d this judgment seems wrong.
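
To make the arithmetic concrete, here is a small userspace sketch of the MSB
check as I understand it. It is a model for illustration only, not the kernel
source, and the helper names raw_delta() and checked_delta() are made up for
this example:

```c
#include <stdint.h>

/* Masked difference between two counter readouts (handles wrap-around). */
static uint64_t raw_delta(uint64_t now, uint64_t last, uint64_t mask)
{
	return (now - last) & mask;
}

/*
 * Model of the MSB check (hypothetical helper, not the kernel function):
 * if the top bit of the masked range is set, the interval is treated as
 * "time went backwards" and 0 is returned instead of the real delta.
 */
static uint64_t checked_delta(uint64_t now, uint64_t last, uint64_t mask)
{
	uint64_t ret = raw_delta(now, last, mask);

	return (ret & ~(mask >> 1)) ? 0 : ret;
}
```

With wd_now = 0xd76e5a69, wd_last = 0xf929eb3c and mask = 0xffffffff from the
line above, raw_delta() gives 0xde446f2d and checked_delta() gives 0, so any
elapsed interval longer than half the counter range is reported as zero.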

>> We found a problem when testing nvme disks with fio, when multiple
>> queue interrupts of a disk were mapped to a single CPU. IO interrupt
>> processing will cause the watchdog to be delayed for a long time
>> (155 seconds), the system reports TSC unstable and switches the clock
> If you hold off the softirq from running for 155 seconds then the TSC
> watchdog is the least of your problems.

To be precise, we spend a long time processing interrupts in
handle_edge_irq(). Because the interrupts of multiple hardware queues are
mapped to a single CPU, multiple cores keep issuing IO while a single core
handles all the completions. Perhaps the test case can be optimized, but in
principle this should not cause the clocksource to be switched, should it?

> Thanks,
>         tglx
