LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Abhishek Sahu <abhsahu@nvidia.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: <linux-kernel@vger.kernel.org>, Lukas Wunner <lukas@wunner.de>,
	<linux-pci@vger.kernel.org>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>
Subject: Re: [PATCH 2/2] PCI: Create device link for NVIDIA GPU
Date: Tue, 4 Jun 2019 17:13:03 +0530	[thread overview]
Message-ID: <4b4876eb-b3a0-6796-9d7a-af518a396689@nvidia.com> (raw)
In-Reply-To: <20190603172246.GC189360@google.com>



On 6/3/2019 10:52 PM, Bjorn Helgaas wrote:
> [+cc Rafael, just FYI]
> 
> On Mon, Jun 03, 2019 at 01:30:51PM +0530, Abhishek Sahu wrote:
>> On 6/1/2019 2:09 AM, Bjorn Helgaas wrote:
>>> On Fri, May 31, 2019 at 10:31:09AM +0530, Abhishek Sahu wrote:
>>>> NVIDIA Turing GPUs include hardware support for USB Type-C and
>>>> VirtualLink. It helps in delivering the power, display, and data
>>>> required to power VR headsets through a single USB Type-C connector.
>>>> The Turing GPU is a multi-function PCI device has the following
>>>> four functions:
>>>>
>>>> 	- VGA display controller (Function 0)
>>>> 	- Audio controller (Function 1)
>>>> 	- USB xHCI Host controller (Function 2)
>>>> 	- USB Type-C USCI controller (Function 3)
>>>>
>>>> The function 0 is tightly coupled with other functions in the
>>>> hardware. When function 0 goes in runtime suspended state,
>>>> then it will do power gating for most of the hardware blocks.
>>>> Some of these hardware blocks are used by other functions which
>>>> leads to functional failure. So if any of these functions (1/2/3)
>>>> are active, then function 0 should also be in active state.
>>>
>>>> 'commit 07f4f97d7b4b ("vga_switcheroo: Use device link for
>>>> HDA controller")' creates the device link from function 1 to
>>>> function 0. A similar kind of device link needs to be created
>>>> between function 0 and functions 2 and 3 for NVIDIA Turing GPU.
>>>
>>> I can't point to language that addresses this, but this sounds like a
>>> case of the GPU not conforming to the PCI spec.  The general
>>> assumption is that the OS should be able to discover everything it
>>> needs to do power management directly from the architected PCI config
>>> space.
>>
>>  The GPU is following PCIe spec but following is the implementation
>>  from HW side
> 
> Unless you can find spec language that talks about D-state
> dependencies between functions, I claim this is not following the
> PCIe spec.  For example, PCIe r5.0, sec 1.4, says "the PCI/PCIe
> hardware/software model includes architectural constructs necessary to
> discover, configure, and use a Function, without needing Function-
> specific knowledge." Sec 5.1 says "D states are associated with a
> particular Function" and "PM provides ... a mechanism to identify
> power management capabilities of a given Function [and] the ability to
> transition a Function into a certain power management state."
> 

 Thanks Bjorn. Here in case of GPU's these functions are not
 completely independent so it is not following PCIe spec in
 that aspect.

> If there *is* something about dependencies between functions in the
> spec, we should improve the generic PCI core to pay attention to that,
> and then we wouldn't need this quirk.
> 
> If the spec doesn't provide a way to discover them, these dependencies
> are exceptions from the spec, and we have to handle them as hardware
> defects, using quirks like this.  That's fine, but let's not pretend
> that this is a conforming device and that adding quirks is the
> expected process.  Just call a spade a spade and say we're working
> around a defect in this particular device.
> 

 Yes. I am agree with that we need to be very careful in
 adding quirks like this. I will communicate the same to
 HW team so they can explore other options to handle this
 in HW design side for future chips.

> I think the best path forward would be to add this quirk for the
> existing device, and then pursue a spec change to add something like
> a new PCIe capability to describe the dependencies.  Then we could
> enhance the PCI core once and power management for future devices
> would "Just Work" without having to add quirks.
> 

 Yes. It will be long term process. If other HW has similar
 requirement then it would be good to have this.

 Regards,
 Abhishek


      reply	other threads:[~2019-06-04 11:43 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-31  5:01 [PATCH 0/2] PCI: device link quirk " Abhishek Sahu
2019-05-31  5:01 ` [PATCH 1/2] PCI: Code reorganization for VGA device link Abhishek Sahu
2019-06-03 17:15   ` Bjorn Helgaas
2019-06-04 11:38     ` Abhishek Sahu
2019-05-31  5:01 ` [PATCH 2/2] PCI: Create device link for NVIDIA GPU Abhishek Sahu
2019-05-31 20:39   ` Bjorn Helgaas
2019-06-03  8:00     ` Abhishek Sahu
2019-06-03 10:07       ` Lukas Wunner
2019-06-03 17:22       ` Bjorn Helgaas
2019-06-04 11:43         ` Abhishek Sahu [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4b4876eb-b3a0-6796-9d7a-af518a396689@nvidia.com \
    --to=abhsahu@nvidia.com \
    --cc=helgaas@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lukas@wunner.de \
    --cc=rjw@rjwysocki.net \
    --subject='Re: [PATCH 2/2] PCI: Create device link for NVIDIA GPU' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).