LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Vinod Koul <vkoul@kernel.org>
To: Arnaud Pouliquen <arnaud.pouliquen@st.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
	Pierre-Yves MORDRET <pierre-yves.mordret@st.com>,
	linux-stm32@st-md-mailman.stormreply.com,
	linux-kernel@vger.kernel.org, dmaengine@vger.kernel.org
Subject: Re: [PATCH] dmaengine: stm32-dma: fix residue calculation in stm32-dma
Date: Tue, 30 Apr 2019 13:52:55 +0530	[thread overview]
Message-ID: <20190430082255.GP3845@vkoul-mobl.Dlink> (raw)
In-Reply-To: <26fa7710-76cb-e202-a367-c2e2408b6808@st.com>

On 29-04-19, 16:52, Arnaud Pouliquen wrote:
> 
> 
> On 4/29/19 7:13 AM, Vinod Koul wrote:
> > On 26-04-19, 15:41, Arnaud Pouliquen wrote:
> >>>> During residue calculation. the DMA can switch to the next sg. When
> >>>> this race condition occurs, the residue returned value is not valid.
> >>>> Indeed the position in the sg returned by the hardware is the position
> >>>> of the next sg, not the current sg.
> >>>> Solution is to check the sg after the calculation to verify it.
> >>>> If a transition is detected we consider that the DMA has switched to
> >>>> the beginning of next sg.
> >>>
> >>> Now, that sounds like duct tape. Why should we bother doing that.
> >>>
> >>> Also looking back at the stm32_dma_desc_residue() and calls to it from
> >>> stm32_dma_tx_status() am not sure we are doing the right thing
> >> Please, could you explain what you have in mind here?
> > 
> > So when we call vchan_find_desc() that tells us if the descriptor is in
> > the issued queue or not..  Ideally it should not matter if we have one
> > or N descriptors issued to hardware.
> > 
> > So why should you bother checking for next_sg.
> > 
> >>> why are we looking at next_sg here, can you explain me that please
> >>
> >> This solution is similar to one implemented in the at_hdmac.c driver
> >> (atc_get_bytes_left function).
> >>
> >> Yes could be consider as a workaround for a hardware issue...
> >>
> >> In stm32 DMA Peripheral, we can register up to 2 sg descriptors (sg1 &
> >> sg2)in DMA registers, and use it in a cyclic mode (auto reload). This
> >> mode is mainly use for audio transfer initiated by an ALSA driver.
> >>
> >> >From hardware point of view the DMA transfers first block based on sg1,
> >> then it updates registers to prepare sg2 transfer, and then generates an
> >> IRQ to inform that it issues the next transfer (sg2).
> >>
> >> Then driver can update sg1 to prepare the third transfer...
> >>
> >> In parallel the client driver can requests status to get the residue to
> >> update internal pointer.
> >> The issue is in the race condition between the call of the
> >> device_tx_status ops and the update of the DMA register on sg switch.
> > 
> > Sorry I do not agree! You are in stm32_dma_tx_status() hold the lock and
> > IRQs are disabled, so even if sg2 was loaded, you will not get an
> > interrupt and wont know. By looking at sg1 register you will see that
> > sg1 is telling you that it has finished and residue can be zero. That is
> > fine and correct to report.
> > 
> > Most important thing here is that reside is for _requested_ descriptor
> > and not _current_ descriptor, so looking into sg2 doesnt not fit.
> > 
> >> During a short time the hardware updated the registers containing the
> >> sg ID but not the transfer counter(SxNDTR). In this case there is a
> >> mismatch between the Sg ID and the associated transfer counter.
> >> So residue calculation is wrong.
> >> Idea of this patch is to perform the calculation and then to crosscheck
> >> that the hardware has not switched to the next sg during the
> >> calculation. The way to crosscheck is to compare the the sg ID before
> >> and after the calculation.
> >>
> >> I tested the solution to force a new recalculation but no real solution
> >> to trust the registers during this phase. In this case an approximation
> >> is to consider that the DMA is transferring the first bytes of the next sg.
> >> So we return the residue corresponding to the beginning of the next buffer.
> > 
> > And that is wrong!. The argument is 'cookie' and you return residue for
> > that cookie.
> > 
> > For example, if you have dma txn with cookie 1, 2, 3, 4 submitted, then currently HW
> > is processing cookie 2, then for tx_status on:
> > cookie 1: return DMA_COMPLETE, residue 0
> > cookie 2: return DMA_IN_PROGRESS, residue (read from HW)
> > cookie 3: return DMA_IN_PROGRESS, residue txn length
> > cookie 4: return DMA_IN_PROGRESS, residue txn length
> > 
> > Thanks
> > 
> I think i miss something in my explanation, as from my humble POV (not
> enough expert in DMA framework...) we only one cookie here as only one
> cyclic transfer...

> Regarding your answers it looks like my sg explanation are not clear and
> introduce confusions... Sorry for this, i was used sg for internal STM32
> DMA driver, not for the framework API itself.
> 
> Let try retry to re-explain you the stm32 DMA cyclic mode management.
> 
> STM32 STM32 hardware:
> -------------------
> (ref manual:
> https://www.st.com/content/ccc/resource/technical/document/reference_manual/group0/51/ba/9e/5e/78/5b/4b/dd/DM00327659/files/DM00327659.pdf/jcr:content/translations/en.DM00327659.pdf)
> 
> The stm32 DMA supports cyclic mode using a hardware double
> buffer mode.
> In this double buffer, we can program up to 2 transfers. When one is
> completed, the DMA automatically switch on the other. This could be see
> as a hardware LLI with only 2 transfer descriptors.
> A hardware bit CT (current target) is used to determine the
> current transfer (CT = 0 or 1).
> A hardware NDT (num of data to transfer) counter can be read to
> determine DMA position in current transfer.
> An IRQ is generated when this CT bit is updated to allows driver to
> update the double buffer for the next transfer.
> 
> On client side (a.e audio):
> -------------------------
> The client requests a cyclic transfer by calling
> stm32_dma_prep_dma_cyclic. For instance it can request the transfer of a
> buffer divided in 10 periods. In this case only one cookie submitted
> (right?).
> 
> At stm32dma driver level these 10 periods are registered in an internal
> software table (desc->sg_req[]).As cyclic, the last sg_req point to the
> first one.
> 
> So to be able to transfer the whole software table, we have to update
> the STM32 DMA double buffer at each end of transfer period.
> The filed chan->next_sg points to the next sg_req in the software table.
> that should be write in the STM32 DMA double buffer.
> 
> Residue calculation:
> -------------------
> During a transfer we can get the position in a period thanks to the
> NDT(num of data to transfer) bit-field.
> 
> So the calculation is :
> 1) Get the NDT field value
> 3) add the periods remaining in the desc->sg_req[] table.
> 
> In parallel the STM32 DMA hardware updates the transfer buffer in 3 steps:
> 1) update CT register field.
> 2) Update NDT register field.
> 3) generate the IRQ (As you mention the IRQ is not treated during the
> device_tx_status as protected from interrupts).
> 
> We are facing issue when computing the residue during the update of the
> CT and the NDT. The CT and NDT can as been updated ( both or only CT...)
> without driver context update (IRQ disabled).
> In this case we can point to the beginning of the current transfer(
> completed) instead of the next_transfer. This generates a residue error
> and for audio a time-stamp regression (so video freeze or audio plop).
> 
> So the patch proposed consists in:
> 1) getting the current NDT value
> 2) reading CT and check that the hardware does not point to the next_sg.
> 	if yes:
> 	- CT has been updated by hardware but IRQ still not treated.
> 	- By default we consider the current_sg as completed, so we
> 	  point to the beginning of the next_sg buffer.
> 
> Hope that will help to clarify.

Yes that helps, maybe we should add these bits in code and changelog..
:)

And how does this impact non cyclic case where N descriptors maybe
issued. The driver seems to support non cyclic too...

Thanks
-- 
~Vinod

  reply	other threads:[~2019-04-30  8:23 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-27 12:21 Arnaud Pouliquen
2019-04-23 15:18 ` Arnaud Pouliquen
2019-04-24  9:26   ` Pierre Yves MORDRET
2019-04-26 12:17 ` Vinod Koul
2019-04-26 13:41   ` Arnaud Pouliquen
2019-04-29  5:13     ` Vinod Koul
2019-04-29 14:52       ` Arnaud Pouliquen
2019-04-30  8:22         ` Vinod Koul [this message]
2019-04-30 14:58           ` Arnaud Pouliquen
2019-04-30 17:18             ` Vinod Koul

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190430082255.GP3845@vkoul-mobl.Dlink \
    --to=vkoul@kernel.org \
    --cc=arnaud.pouliquen@st.com \
    --cc=dan.j.williams@intel.com \
    --cc=dmaengine@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-stm32@st-md-mailman.stormreply.com \
    --cc=pierre-yves.mordret@st.com \
    --subject='Re: [PATCH] dmaengine: stm32-dma: fix residue calculation in stm32-dma' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).