LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Mike Travis <travis@sgi.com>
To: Pavel Machek <pavel@suse.cz>
Cc: Ingo Molnar <mingo@elte.hu>,
	Andrew Morton <akpm@linux-foundation.org>,
	rpurdie@rpsys.net, Jack Steiner <steiner@sgi.com>,
	linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH 1/1] SGI X86 UV: Provide a System Activity Indicator	driver
Date: Wed, 01 Oct 2008 11:15:09 -0700	[thread overview]
Message-ID: <48E3BE2D.4020606@sgi.com> (raw)
In-Reply-To: <20080928194201.GA1310@ucw.cz>

Pavel Machek wrote:
>> Another relevant point is that I will be adding a bit more functionality
>> to disable the timer interrupt on truly "idle" cpus (like have been idle
>> for some amount of seconds).  We would then use the "exit from idle"
>> callback to reestablish the timer interrupt.  [This would allow them to
>> enter power down states if appropriate.]
> 
> Should you look at nohz instead of reinventing it? 

Thanks, I did look at it.  Quite complex.  Maybe I'm missing something
but I don't see how it fits in?  Are you saying I should be using data
in the percpu tick_sched to gather the idle information for the once
per second per cpu status update interrupt?  I see the @idle_active
entry but wouldn't this always be false during the timer interrupt?
Using any other entries would appear to be more complex than a simple
store byte and subtracting two longs.

Or perhaps I should somehow hook into the sched_timer interrupt instead
of having a separate once per second per cpu interrupt?  (Does this
sched_timer interrupt each cpu once per second?)

> 
>>> As i suggested in my previous mail about this topic, a low-frequency 
>>> sampling method should be used instead, to indicate system status. I 
>>> thought the leds drivers have all that in place already.
>> It is low frequency (once per second), this is just setting what's to
>> be sampled.
>>
>> As I mentioned, this is not for LED displays (human readable), it's for the
>> system controller to monitor how all parts of the system are running, and
>> this one is just the cpu parts.  The LED driver approach would have me
>> registering 4096 led devices, with all their callbacks, 4096 strings saying
>> "LED0001", etc., and I still cannot associate a specific register bit
>> (AKA LED if that's what it was), with a specific cpu using the LED driver.
>>
>> The LED driver is fine for a couple of blinking lights indicating overall
>> system activity, disk activity, etc.  (Btw, I did not see a network trigger,
>> or a paging trigger, or an out of memory trigger, or some other things that
>> might be useful for real time monitoring of the system.)
> 
> ...so add them...
> 
>> But the LED driver has way more overhead than needed for this simple application.
>>
> 
> So overhead from led driver is not okay, while overhead from messing
> with idle loop is okay? Interesting...
> 								Pavel

The overhead is mainly the registration of descriptor blocks for the
4096 registers representing the 4096 cpus all at separate addresses.
The overhead in this patch for maintaining the "idle" state (prior to the
timer interrupt causing "exit_idle") is storing a byte and subtracting the
current jiffies from the jiffies at the last one second timer interrupt.
(Even this subtraction can be removed, the only *important* item is
whether the cpu is currently idle or not.)

This data is written to node local memory that's highly likely to be in
the cache, as the same memory block is used for all UV hub operations.

Unfortunately, I am experiencing a simulator problem at the moment or
I'd be able to quantify the exact amount of time added to the exit_idle()
function, but it's basically noise in the overall resumption of a thread.

One other factor, this overhead is *only* for UV systems, no other x86_64
systems or architectures are affected, so again I'm not understanding the
objection.  This request was made from our hardware and RAS engineers,
and is identical to what's been in the ia64 kernel for a few years now.

Perhaps the confusion is it's near relationship to real "LED" lights?
The original name "LED" is historical.  The bits are read by a system
controller that has the job of monitoring the entire system, including
both soft and hard errors and determining faulty [or near faulty]
system components.  For example, if a node suddenly hangs, this is a
one of the diagnostic aids used in determining the state of that node.
(Btw, the SCIR register that is written to once per second is a FIFO so
it contains the last 64 updates of this register giving a temporal view
of each cpu as well.)

Thanks,
Mike

  reply	other threads:[~2008-10-01 18:15 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-09-19 14:37 [PATCH 0/1] " Mike Travis
2008-09-19 14:37 ` [PATCH 1/1] " Mike Travis
2008-09-22 11:48   ` Ingo Molnar
2008-09-22 14:42     ` Mike Travis
2008-09-28 19:42       ` Pavel Machek
2008-10-01 18:15         ` Mike Travis [this message]
2008-10-01 19:41           ` Mike Travis
2008-10-02  8:37           ` Pavel Machek
2008-10-02 14:33             ` Mike Travis
2008-09-22 14:47     ` Mike Travis
2008-10-24 11:19 Mike Travis
2008-10-24 12:01 ` Pavel Machek
2008-10-24 12:05   ` Ingo Molnar
2008-10-24 12:27     ` Mike Travis
2008-10-24 18:12       ` Andi Kleen
2008-10-24 22:18         ` Mike Travis
2008-10-24 22:24           ` Mike Travis
2008-10-27 11:42             ` Ingo Molnar
2008-10-27 14:38               ` Mike Travis
2008-10-25  6:56           ` Andi Kleen
2008-10-27 15:12             ` Mike Travis
2008-10-27 11:36           ` Ingo Molnar
2008-10-27 11:43       ` Ingo Molnar
2008-10-24 12:14   ` Mike Travis

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=48E3BE2D.4020606@sgi.com \
    --to=travis@sgi.com \
    --cc=akpm@linux-foundation.org \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=pavel@suse.cz \
    --cc=rpurdie@rpsys.net \
    --cc=steiner@sgi.com \
    --cc=tglx@linutronix.de \
    --subject='Re: [PATCH 1/1] SGI X86 UV: Provide a System Activity Indicator	driver' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).