LKML Archive on lore.kernel.org
* PME_Turn_Off in Linux
From: Miller, Mike (OS Dev) @ 2007-01-17 16:43 UTC
  To: LKML, linux-pci
  Cc: Nguyen, Tom L, Brainard, Jim, Patterson, Andrew D (Linux R&D)

Hello,
We've been seeing some nasty data corruption issues on some platforms.
We've been capturing PCI-E traces looking for something nasty but we
haven't found anything yet. One of the hardware guys is asking if there
is a call in Linux to issue a PME_Turn_Off broadcast message.
 
PME_Turn_Off Broadcast Message
Before main component power and reference clocks are turned off, the
Root Complex or Switch Downstream Port must issue a broadcast Message
that instructs all agents downstream of that point within the hierarchy
to cease initiation of any subsequent PM_PME Messages, effective
immediately upon receipt of the PME_Turn_Off Message.
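
For reference, the power management Messages named in the excerpt above can be written down as constants. This is only an illustrative sketch: the Message Code and routing values are the ones listed for power management Messages in the PCI Express Base Specification, while the enum and function names are hypothetical and not part of any kernel API. It shows what sets PME_Turn_Off apart: of the four, it is the only one routed as a broadcast from the Root Complex.

```c
#include <stdint.h>

/* Message Codes of the PCIe power management Messages. */
enum pm_msg_code {
    PM_ACTIVE_STATE_NAK = 0x14,
    PM_PME              = 0x18,
    PME_TURN_OFF        = 0x19,
    PME_TO_ACK          = 0x1B,
};

/* Message routing field r[2:0] (low three bits of the TLP Type field). */
enum msg_routing {
    ROUTE_TO_ROOT_COMPLEX   = 0x0, /* routed upstream to the Root Complex */
    ROUTE_BROADCAST_FROM_RC = 0x3, /* broadcast downstream                */
    ROUTE_LOCAL             = 0x4, /* terminates at the receiver          */
    ROUTE_GATHER_TO_RC      = 0x5, /* gathered and routed to the RC       */
};

/* Hypothetical helper: routing the spec assigns to each PM Message,
 * or -1 for a code that is not a power management Message. */
static int pm_msg_routing(enum pm_msg_code code)
{
    switch (code) {
    case PM_ACTIVE_STATE_NAK: return ROUTE_LOCAL;
    case PM_PME:              return ROUTE_TO_ROOT_COMPLEX;
    case PME_TURN_OFF:        return ROUTE_BROADCAST_FROM_RC;
    case PME_TO_ACK:          return ROUTE_GATHER_TO_RC;
    }
    return -1;
}

/* Hypothetical helper: first header byte of a Msg TLP without data.
 * Fmt = 001b (4DW header, no data), Type = 10rrrb. */
static uint8_t msg_tlp_byte0(int routing)
{
    return (uint8_t)(0x30u | ((unsigned)routing & 0x7u));
}
```

Building the header bytes is the easy part. Actually transmitting such a Message is done by the Root Complex hardware, which is the point the question below turns on.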

This must be initiated from the root complex. Is there such a call in
Linux?

Thanks,
mikem



* Re: PME_Turn_Off in Linux
From: Greg KH @ 2007-01-17 21:33 UTC
  To: Miller, Mike (OS Dev)
  Cc: LKML, linux-pci, Nguyen, Tom L, Brainard, Jim, Patterson,
	Andrew D (Linux R&D)

On Wed, Jan 17, 2007 at 10:43:14AM -0600, Miller, Mike (OS Dev) wrote:
> Hello,
> We've been seeing some nasty data corruption issues on some platforms.
> We've been capturing PCI-E traces looking for something nasty but we
> haven't found anything yet. One of the hardware guys is asking if there
> is a call in Linux to issue a PME_Turn_Off broadcast message.
>  
> PME_Turn_Off Broadcast Message
> Before main component power and reference clocks are turned off, the
> Root Complex or Switch Downstream Port must issue a broadcast Message
> that instructs all agents downstream of that point within the hierarchy
> to cease initiation of any subsequent PM_PME Messages, effective
> immediately upon receipt of the PME_Turn_Off Message.
> 
> This must be initiated from the root complex. Is there such a call in
> Linux?

The firmware that implements the PCI-E connection should do this. I
don't think there is anything that the operating system can do to
control this, as PCI-E should be transparent to the OS.

Unless this is on a PCI-E Hotplug system?  What is the sequence of
events that cause the data corruption?

thanks,

greg k-h


* RE: PME_Turn_Off in Linux
From: Miller, Mike (OS Dev) @ 2007-01-17 22:35 UTC
  To: Greg KH
  Cc: LKML, Nguyen, Tom L, Brainard, Jim, Patterson,
	Andrew D (Linux R&D),
	linux-pci

greg k-h wrote: 

> On Wed, Jan 17, 2007 at 10:43:14AM -0600, Miller, Mike (OS Dev) wrote:
> > Hello,
> > We've been seeing some nasty data corruption issues on some platforms.
> > We've been capturing PCI-E traces looking for something nasty but we
> > haven't found anything yet. One of the hardware guys is asking if there
> > is a call in Linux to issue a PME_Turn_Off broadcast message.
> >
> > PME_Turn_Off Broadcast Message
> > Before main component power and reference clocks are turned off, the
> > Root Complex or Switch Downstream Port must issue a broadcast Message
> > that instructs all agents downstream of that point within the hierarchy
> > to cease initiation of any subsequent PM_PME Messages, effective
> > immediately upon receipt of the PME_Turn_Off Message.
> >
> > This must be initiated from the root complex. Is there such a call in
> > Linux?
>
> The firmware that implements the PCI-E connection should do this. I
> don't think there is anything that the operating system can do to
> control this, as PCI-E should be transparent to the OS.

Hmmm, the hw folks tell me that "other" OSes implement that. But I
would tend to agree that system firmware should probably be doing this.

> Unless this is on a PCI-E Hotplug system?

No hotplug.

> What is the sequence of events that cause the data corruption?

Install RHEL4 U4 on ia64, and at the reboot prompt let the system sit idle
for several hours or overnight. Then after rebooting the filesystems are
totally trashed. I usually get a message that the kernel is not a valid
compressed file format. If I try to rescue the system I cannot mount any
filesystems. I don't have the message handy but it complains about an
invalid Verneed record, whatever that is.

I've also tried the same procedure using a dumb SAS HBA. It complained
that it couldn't read the initrd image, but on a second attempt it acted
like it read the initrd and then the system went out into the weeds while
booting. Not the same symptoms, but I suspect there's some relationship.

I have not tried any other distros yet.

Thanks,
mikem


* Re: PME_Turn_Off in Linux
From: Greg KH @ 2007-01-17 22:55 UTC
  To: Miller, Mike (OS Dev)
  Cc: LKML, Nguyen, Tom L, Brainard, Jim, Patterson,
	Andrew D (Linux R&D),
	linux-pci

On Wed, Jan 17, 2007 at 04:35:02PM -0600, Miller, Mike (OS Dev) wrote:
> > On Wed, Jan 17, 2007 at 10:43:14AM -0600, Miller, Mike (OS Dev) wrote:
> > > Hello,
> > > We've been seeing some nasty data corruption issues on some platforms.
> > > We've been capturing PCI-E traces looking for something nasty but we
> > > haven't found anything yet. One of the hardware guys is asking if there
> > > is a call in Linux to issue a PME_Turn_Off broadcast message.
> > >
> > > PME_Turn_Off Broadcast Message
> > > Before main component power and reference clocks are turned off, the
> > > Root Complex or Switch Downstream Port must issue a broadcast Message
> > > that instructs all agents downstream of that point within the hierarchy
> > > to cease initiation of any subsequent PM_PME Messages, effective
> > > immediately upon receipt of the PME_Turn_Off Message.
> > >
> > > This must be initiated from the root complex. Is there such a call in
> > > Linux?
> >
> > The firmware that implements the PCI-E connection should do this. I
> > don't think there is anything that the operating system can do to
> > control this, as PCI-E should be transparent to the OS.
> 
> Hmmm, the hw folks tell me that "other" OSes implement that. But I
> would tend to agree that system firmware should probably be doing this.

Where would the "other" OSes implement this, as they don't even know the
PCI device is a PCI-E port?  How can the OS send a PCI-E message without
talking directly to the chipset-specific controller chip?
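
As background to the question above, a minimal sketch of how software could at least recognise a PCI-E port from standard configuration space, by walking the capability list and looking for the PCI Express capability. The register offsets and the macro names mirror those in the Linux pci_regs.h header; the helper functions are hypothetical, operate on a plain byte buffer (for example a dump of a device's config space), and assume a type 0 or type 1 header. Nothing here sends a PME_Turn_Off Message, which remains a matter for the chipset or firmware.

```c
#include <stddef.h>
#include <stdint.h>

#define PCI_STATUS             0x06   /* 16-bit Status register           */
#define PCI_STATUS_CAP_LIST    0x10   /* Status bit 4: capability list    */
#define PCI_CAPABILITY_LIST    0x34   /* pointer to first capability      */
#define PCI_CAP_ID_EXP         0x10   /* PCI Express capability ID        */
#define PCI_EXP_FLAGS          2      /* PCIe Capabilities register       */
#define PCI_EXP_FLAGS_TYPE     0x00f0 /* Device/Port type field           */
#define PCI_EXP_TYPE_ROOT_PORT 0x4    /* Root Port of a Root Complex      */

/* Return the config-space offset of capability cap_id, or 0 if absent. */
static uint8_t find_cap(const uint8_t *cfg, size_t len, uint8_t cap_id)
{
    uint8_t pos;
    int ttl;

    if (len < 64 || !(cfg[PCI_STATUS] & PCI_STATUS_CAP_LIST))
        return 0;

    pos = cfg[PCI_CAPABILITY_LIST] & 0xFC;
    /* ttl bounds the walk so a malformed, looping list terminates. */
    for (ttl = 48; ttl > 0 && pos >= 0x40 && (size_t)pos + 3 < len; ttl--) {
        if (cfg[pos] == cap_id)
            return pos;
        pos = cfg[pos + 1] & 0xFC;   /* next-capability pointer */
    }
    return 0;
}

/* Return the PCIe Device/Port type (0x4 for a Root Port),
 * or -1 if the function has no PCI Express capability. */
static int pcie_port_type(const uint8_t *cfg, size_t len)
{
    uint8_t pos = find_cap(cfg, len, PCI_CAP_ID_EXP);
    uint16_t flags;

    if (!pos)
        return -1;
    flags = (uint16_t)(cfg[pos + PCI_EXP_FLAGS] |
                       (cfg[pos + PCI_EXP_FLAGS + 1] << 8));
    return (flags & PCI_EXP_FLAGS_TYPE) >> 4;
}
```

Knowing that a device is a Root Port only identifies it. It does not by itself give the OS a generic way to make that port emit a Message.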

> > Unless this is on a PCI-E Hotplug system?
>
> No hotplug.

That's good :)

> > What is the sequence of events that cause the data corruption?
> 
> Install RHEL4 U4 on ia64, and at the reboot prompt let the system sit idle
> for several hours or overnight. Then after rebooting the filesystems are
> totally trashed. I usually get a message that the kernel is not a valid
> compressed file format. If I try to rescue the system I cannot mount any
> filesystems. I don't have the message handy but it complains about an
> invalid Verneed record, whatever that is.

The RHEL4 kernel is pretty old as far as PCI-E goes.  Can you try this
on a kernel.org release?  2.6.19.2 would be great at the least.  If not,
you're going to have to get your support from Red Hat on this issue :(

Any kernel log messages while the machine is idle before rebooting?

What tasks are running overnight that would cause writes to the disk?

thanks,

greg k-h
