LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH 0/2] drm/msm: Smooth out ringbuffer-full handling
@ 2021-04-28 19:36 Rob Clark
  2021-04-28 19:36 ` [PATCH 1/2] drm/msm: Handle ringbuffer overflow Rob Clark
  2021-04-28 19:36 ` [PATCH 2/2] drm/msm: Periodically update RPTR shadow Rob Clark
  0 siblings, 2 replies; 6+ messages in thread
From: Rob Clark @ 2021-04-28 19:36 UTC (permalink / raw)
  To: dri-devel
  Cc: Jordan Crouse, Rob Clark, Akhil P Oommen,
	AngeloGioacchino Del Regno, Dave Airlie, Douglas Anderson,
	Eric Anholt, open list:DRM DRIVER FOR MSM ADRENO GPU,
	Jonathan Marek, Konrad Dybcio, Kristian H. Kristensen,
	open list:DRM DRIVER FOR MSM ADRENO GPU, open list,
	Marijn Suijten, Sai Prakash Ranjan, Sharat Masetty

From: Rob Clark <robdclark@chromium.org>

With some recent userspace work to allow more rendering to be merged
into a single SUBMIT ioctl, I realized we have some sharp edges around
running out of free ringbuffer space.

1) Currently we only flush once all the cmds (or rather IBs to the cmd
   buffer) are written into the ringbuffer.  Which places a restriction
   that the submit must fit in the rb.  Which means slightly less than
   2k cmds per submit, after accounting for some of the other packets
   needed.
2) Currently, for devices that use RPTR shadow, we only write the
   CP_WHERE_AM_I packet at the end of the submit, so we aren't seeing
   partial progress that the GPU is making chewing through previous
   large submits
3) We spin for up to 1sec waiting for rb space, and then give up and
   proceed to overwrite the packets that that CP is currently chewing
   on.. which goes badly.  If userspace is submitting >1sec of work
   per submit ioctl, this means we spin for a long time, and then
   corrupt the rb anyways.

This patchset doesn't completely address #1.  And in general we don't
want to be uninteruptably blocking for so much time.. but this will
require some more extensive changes.

What it does do is address #2 by periodically emitting a CP_WHERE_AM_I,
and #3 by adding detection and error handling for rb overflow, returning
-ENOSPC when that happens.

Rob Clark (2):
  drm/msm: Handle ringbuffer overflow
  drm/msm: Periodically update RPTR shadow

 drivers/gpu/drm/msm/adreno/a5xx_gpu.c   | 32 ++++++++++++++++++++----
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 30 +++++++++++++++++-----
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 24 +++++++++++++++++-
 drivers/gpu/drm/msm/msm_gem_submit.c    |  7 +++++-
 drivers/gpu/drm/msm/msm_gpu.c           | 33 +++++++++++++++++++++++--
 drivers/gpu/drm/msm/msm_gpu.h           |  2 +-
 drivers/gpu/drm/msm/msm_ringbuffer.h    |  5 ++++
 7 files changed, 117 insertions(+), 16 deletions(-)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/2] drm/msm: Handle ringbuffer overflow
  2021-04-28 19:36 [PATCH 0/2] drm/msm: Smooth out ringbuffer-full handling Rob Clark
@ 2021-04-28 19:36 ` Rob Clark
  2021-11-25  7:36   ` Dmitry Baryshkov
  2021-04-28 19:36 ` [PATCH 2/2] drm/msm: Periodically update RPTR shadow Rob Clark
  1 sibling, 1 reply; 6+ messages in thread
From: Rob Clark @ 2021-04-28 19:36 UTC (permalink / raw)
  To: dri-devel
  Cc: Jordan Crouse, Rob Clark, Rob Clark, Sean Paul, David Airlie,
	Daniel Vetter, AngeloGioacchino Del Regno, Konrad Dybcio,
	Kristian H. Kristensen, Marijn Suijten, Jonathan Marek,
	Akhil P Oommen, Eric Anholt, Sai Prakash Ranjan, Sharat Masetty,
	Douglas Anderson, Dave Airlie,
	open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list:DRM DRIVER FOR MSM ADRENO GPU, open list

From: Rob Clark <robdclark@chromium.org>

Currently if userspace manages to fill up the ring faster than the GPU
can consume we (a) spin for up to 1sec, and then (b) overwrite the
ringbuffer contents from previous submits that the GPU is still busy
executing.  Which predictably goes rather badly.

Instead, just skip flushing (updating WPTR) and reset ring->next back to
where it was before we tried writing the submit into the ringbuffer, and
return an error to userspace (which can then try again).

Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c   |  3 +++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   |  3 +++
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 24 +++++++++++++++++-
 drivers/gpu/drm/msm/msm_gem_submit.c    |  7 +++++-
 drivers/gpu/drm/msm/msm_gpu.c           | 33 +++++++++++++++++++++++--
 drivers/gpu/drm/msm/msm_gpu.h           |  2 +-
 drivers/gpu/drm/msm/msm_ringbuffer.h    |  5 ++++
 7 files changed, 72 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index ce13d49e615b..0c8faad3b328 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -36,6 +36,9 @@ void a5xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
 		OUT_RING(ring, upper_32_bits(shadowptr(a5xx_gpu, ring)));
 	}
 
+	if (unlikely(ring->overflow))
+		return;
+
 	spin_lock_irqsave(&ring->preempt_lock, flags);
 
 	/* Copy the shadow to the actual register */
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index d553f62f4eeb..4a4728a774c0 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -68,6 +68,9 @@ static void a6xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
 		OUT_RING(ring, upper_32_bits(shadowptr(a6xx_gpu, ring)));
 	}
 
+	if (unlikely(ring->overflow))
+		return;
+
 	spin_lock_irqsave(&ring->preempt_lock, flags);
 
 	/* Copy the shadow to the actual register */
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 0f184c3dd9d9..a658777e07b1 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -467,6 +467,9 @@ void adreno_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring, u32 reg)
 {
 	uint32_t wptr;
 
+	if (unlikely(ring->overflow))
+		return;
+
 	/* Copy the shadow to the actual register */
 	ring->cur = ring->next;
 
@@ -788,12 +791,31 @@ static uint32_t ring_freewords(struct msm_ringbuffer *ring)
 	return (rptr + (size - 1) - wptr) % size;
 }
 
+static bool space_avail(struct msm_ringbuffer *ring, uint32_t ndwords)
+{
+	if (ring_freewords(ring) >= ndwords)
+		return true;
+
+	/* We don't have a good way to know in general when the RPTR has
+	 * advanced.. newer things that use CP_WHERE_AM_I to update the
+	 * shadow rptr could possibly insert a packet to generate an irq.
+	 * But that doesn't cover older GPUs.  But if the ringbuffer is
+	 * full, it could take a while before it is empty again, so just
+	 * insert a blind sleep to avoid a busy loop.
+	 */
+	msleep(1);
+
+	return false;
+}
+
 void adreno_wait_ring(struct msm_ringbuffer *ring, uint32_t ndwords)
 {
-	if (spin_until(ring_freewords(ring) >= ndwords))
+	if (spin_until(space_avail(ring, ndwords))) {
 		DRM_DEV_ERROR(ring->gpu->dev->dev,
 			"timeout waiting for space in ringbuffer %d\n",
 			ring->id);
+		ring->overflow = true;
+	}
 }
 
 /* Get legacy powerlevels from qcom,gpu-pwrlevels and populate the opp table */
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
index 5480852bdeda..4bc669460fda 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -683,6 +683,9 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
 	submitid = atomic_inc_return(&ident) - 1;
 
 	ring = gpu->rb[queue->prio];
+
+	GEM_WARN_ON(ring->overflow);
+
 	trace_msm_gpu_submit(pid_nr(pid), ring->id, submitid,
 		args->nr_bos, args->nr_cmds);
 
@@ -829,7 +832,9 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
 		}
 	}
 
-	msm_gpu_submit(gpu, submit);
+	ret = msm_gpu_submit(gpu, submit);
+	if (ret)
+		goto out;
 
 	args->fence = submit->fence->seqno;
 
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index ab7c167b0623..7655ad9108c8 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -787,7 +787,7 @@ void msm_gpu_retire(struct msm_gpu *gpu)
 }
 
 /* add bo's to gpu's ring, and kick gpu: */
-void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
+int msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 {
 	struct drm_device *dev = gpu->dev;
 	struct msm_drm_private *priv = dev->dev_private;
@@ -834,9 +834,38 @@ void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 	spin_unlock(&ring->submit_lock);
 
 	gpu->funcs->submit(gpu, submit);
-	priv->lastctx = submit->queue->ctx;
 
 	hangcheck_timer_reset(gpu);
+
+	if (unlikely(ring->overflow)) {
+		/*
+		 * Reset the ptr back to before the submit, so the GPU
+		 * doesn't see a partial submit:
+		 */
+		ring->next = ring->cur;
+
+		/*
+		 * Clear the overflow flag, hopefully the next submit on
+		 * the ring actually fits
+		 */
+		ring->overflow = false;
+
+		/*
+		 * One might be tempted to remove the submit from the
+		 * submits list, and drop it's reference (and drop the
+		 * active reference for all the bos).  But we can't
+		 * really signal the fence attached to obj->resv without
+		 * disturbing other fences on the timeline.  So instead
+		 * just leave it and let it retire normally when a
+		 * later submit completes.
+		 */
+
+		return -ENOSPC;
+	}
+
+	priv->lastctx = submit->queue->ctx;
+
+	return 0;
 }
 
 /*
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index d7cd02cd2109..2dd2ef1f8328 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -302,7 +302,7 @@ int msm_gpu_perfcntr_sample(struct msm_gpu *gpu, uint32_t *activetime,
 		uint32_t *totaltime, uint32_t ncntrs, uint32_t *cntrs);
 
 void msm_gpu_retire(struct msm_gpu *gpu);
-void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit);
+int msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit);
 
 int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
 		struct msm_gpu *gpu, const struct msm_gpu_funcs *funcs,
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.h b/drivers/gpu/drm/msm/msm_ringbuffer.h
index fe55d4a1aa16..d8ad9818c389 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.h
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.h
@@ -40,6 +40,8 @@ struct msm_ringbuffer {
 	struct drm_gem_object *bo;
 	uint32_t *start, *end, *cur, *next;
 
+	bool overflow;
+
 	/*
 	 * List of in-flight submits on this ring.  Protected by submit_lock.
 	 */
@@ -69,6 +71,9 @@ void msm_ringbuffer_destroy(struct msm_ringbuffer *ring);
 static inline void
 OUT_RING(struct msm_ringbuffer *ring, uint32_t data)
 {
+	if (ring->overflow)
+		return;
+
 	/*
 	 * ring->next points to the current command being written - it won't be
 	 * committed as ring->cur until the flush
-- 
2.30.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 2/2] drm/msm: Periodically update RPTR shadow
  2021-04-28 19:36 [PATCH 0/2] drm/msm: Smooth out ringbuffer-full handling Rob Clark
  2021-04-28 19:36 ` [PATCH 1/2] drm/msm: Handle ringbuffer overflow Rob Clark
@ 2021-04-28 19:36 ` Rob Clark
  2021-05-02 19:47   ` Jordan Crouse
  1 sibling, 1 reply; 6+ messages in thread
From: Rob Clark @ 2021-04-28 19:36 UTC (permalink / raw)
  To: dri-devel
  Cc: Jordan Crouse, Rob Clark, Rob Clark, Sean Paul, David Airlie,
	Daniel Vetter, AngeloGioacchino Del Regno, Konrad Dybcio,
	Kristian H. Kristensen, Marijn Suijten, Jonathan Marek,
	Sai Prakash Ranjan, Akhil P Oommen, Eric Anholt, Sharat Masetty,
	Douglas Anderson, open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list:DRM DRIVER FOR MSM ADRENO GPU, open list

From: Rob Clark <robdclark@chromium.org>

On a5xx and a6xx devices that are using CP_WHERE_AM_I to update a
ringbuffer read-ptr shadow value, periodically emit a CP_WHERE_AM_I
every 32 commands, so that a later submit waiting for ringbuffer
space to become available sees partial progress, rather than not
seeing rptr advance at all until the GPU gets to the end of the
submit that it is currently chewing on.

Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 29 ++++++++++++++++++++++-----
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 27 +++++++++++++++++++------
 2 files changed, 45 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index 0c8faad3b328..5202f1498a48 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -18,6 +18,18 @@ static void a5xx_dump(struct msm_gpu *gpu);
 
 #define GPU_PAS_ID 13
 
+static void update_shadow_rptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
+
+	if (a5xx_gpu->has_whereami) {
+		OUT_PKT7(ring, CP_WHERE_AM_I, 2);
+		OUT_RING(ring, lower_32_bits(shadowptr(a5xx_gpu, ring)));
+		OUT_RING(ring, upper_32_bits(shadowptr(a5xx_gpu, ring)));
+	}
+}
+
 void a5xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
 		bool sync)
 {
@@ -30,11 +42,8 @@ void a5xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
 	 * Most flush operations need to issue a WHERE_AM_I opcode to sync up
 	 * the rptr shadow
 	 */
-	if (a5xx_gpu->has_whereami && sync) {
-		OUT_PKT7(ring, CP_WHERE_AM_I, 2);
-		OUT_RING(ring, lower_32_bits(shadowptr(a5xx_gpu, ring)));
-		OUT_RING(ring, upper_32_bits(shadowptr(a5xx_gpu, ring)));
-	}
+	if (sync)
+		update_shadow_rptr(gpu, ring);
 
 	if (unlikely(ring->overflow))
 		return;
@@ -171,6 +180,16 @@ static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 			ibs++;
 			break;
 		}
+
+		/*
+		 * Periodically update shadow-wptr if needed, so that we
+		 * can see partial progress of submits with large # of
+		 * cmds.. otherwise we could needlessly stall waiting for
+		 * ringbuffer state, simply due to looking at a shadow
+		 * rptr value that has not been updated
+		 */
+		if ((ibs % 32) == 0)
+			update_shadow_rptr(gpu, ring);
 	}
 
 	/*
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 4a4728a774c0..2986e36ffd8d 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -52,21 +52,25 @@ static bool a6xx_idle(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
 	return true;
 }
 
-static void a6xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
+static void update_shadow_rptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
 {
 	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
 	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
-	uint32_t wptr;
-	unsigned long flags;
 
 	/* Expanded APRIV doesn't need to issue the WHERE_AM_I opcode */
 	if (a6xx_gpu->has_whereami && !adreno_gpu->base.hw_apriv) {
-		struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
-
 		OUT_PKT7(ring, CP_WHERE_AM_I, 2);
 		OUT_RING(ring, lower_32_bits(shadowptr(a6xx_gpu, ring)));
 		OUT_RING(ring, upper_32_bits(shadowptr(a6xx_gpu, ring)));
 	}
+}
+
+static void a6xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
+{
+	uint32_t wptr;
+	unsigned long flags;
+
+	update_shadow_rptr(gpu, ring);
 
 	if (unlikely(ring->overflow))
 		return;
@@ -148,7 +152,7 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
 	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
 	struct msm_ringbuffer *ring = submit->ring;
-	unsigned int i;
+	unsigned int i, ibs = 0;
 
 	a6xx_set_pagetable(a6xx_gpu, ring, submit->queue->ctx);
 
@@ -184,8 +188,19 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 			OUT_RING(ring, lower_32_bits(submit->cmd[i].iova));
 			OUT_RING(ring, upper_32_bits(submit->cmd[i].iova));
 			OUT_RING(ring, submit->cmd[i].size);
+			ibs++;
 			break;
 		}
+
+		/*
+		 * Periodically update shadow-wptr if needed, so that we
+		 * can see partial progress of submits with large # of
+		 * cmds.. otherwise we could needlessly stall waiting for
+		 * ringbuffer state, simply due to looking at a shadow
+		 * rptr value that has not been updated
+		 */
+		if ((ibs % 32) == 0)
+			update_shadow_rptr(gpu, ring);
 	}
 
 	get_stats_counter(ring, REG_A6XX_RBBM_PERFCTR_CP_0_LO,
-- 
2.30.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] drm/msm: Periodically update RPTR shadow
  2021-04-28 19:36 ` [PATCH 2/2] drm/msm: Periodically update RPTR shadow Rob Clark
@ 2021-05-02 19:47   ` Jordan Crouse
  0 siblings, 0 replies; 6+ messages in thread
From: Jordan Crouse @ 2021-05-02 19:47 UTC (permalink / raw)
  To: Rob Clark
  Cc: dri-devel, Rob Clark, Sean Paul, David Airlie, Daniel Vetter,
	AngeloGioacchino Del Regno, Konrad Dybcio,
	Kristian H. Kristensen, Marijn Suijten, Jonathan Marek,
	Sai Prakash Ranjan, Akhil P Oommen, Eric Anholt, Sharat Masetty,
	Douglas Anderson, open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list:DRM DRIVER FOR MSM ADRENO GPU, open list

On Wed, Apr 28, 2021 at 12:36:49PM -0700, Rob Clark wrote:
> From: Rob Clark <robdclark@chromium.org>
> 
> On a5xx and a6xx devices that are using CP_WHERE_AM_I to update a
> ringbuffer read-ptr shadow value, periodically emit a CP_WHERE_AM_I
> every 32 commands, so that a later submit waiting for ringbuffer
> space to become available sees partial progress, rather than not
> seeing rptr advance at all until the GPU gets to the end of the
> submit that it is currently chewing on.

Acked-by: Jordan Crouse <jordan@cosmicpenguin.net>

> Signed-off-by: Rob Clark <robdclark@chromium.org>
> ---
>  drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 29 ++++++++++++++++++++++-----
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 27 +++++++++++++++++++------
>  2 files changed, 45 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> index 0c8faad3b328..5202f1498a48 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> @@ -18,6 +18,18 @@ static void a5xx_dump(struct msm_gpu *gpu);
>  
>  #define GPU_PAS_ID 13
>  
> +static void update_shadow_rptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
> +{
> +	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> +	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
> +
> +	if (a5xx_gpu->has_whereami) {
> +		OUT_PKT7(ring, CP_WHERE_AM_I, 2);
> +		OUT_RING(ring, lower_32_bits(shadowptr(a5xx_gpu, ring)));
> +		OUT_RING(ring, upper_32_bits(shadowptr(a5xx_gpu, ring)));
> +	}
> +}
> +
>  void a5xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
>  		bool sync)
>  {
> @@ -30,11 +42,8 @@ void a5xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
>  	 * Most flush operations need to issue a WHERE_AM_I opcode to sync up
>  	 * the rptr shadow
>  	 */
> -	if (a5xx_gpu->has_whereami && sync) {
> -		OUT_PKT7(ring, CP_WHERE_AM_I, 2);
> -		OUT_RING(ring, lower_32_bits(shadowptr(a5xx_gpu, ring)));
> -		OUT_RING(ring, upper_32_bits(shadowptr(a5xx_gpu, ring)));
> -	}
> +	if (sync)
> +		update_shadow_rptr(gpu, ring);
>  
>  	if (unlikely(ring->overflow))
>  		return;
> @@ -171,6 +180,16 @@ static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
>  			ibs++;
>  			break;
>  		}
> +
> +		/*
> +		 * Periodically update shadow-wptr if needed, so that we
> +		 * can see partial progress of submits with large # of
> +		 * cmds.. otherwise we could needlessly stall waiting for
> +		 * ringbuffer state, simply due to looking at a shadow
> +		 * rptr value that has not been updated
> +		 */
> +		if ((ibs % 32) == 0)
> +			update_shadow_rptr(gpu, ring);
>  	}
>  
>  	/*
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 4a4728a774c0..2986e36ffd8d 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -52,21 +52,25 @@ static bool a6xx_idle(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
>  	return true;
>  }
>  
> -static void a6xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
> +static void update_shadow_rptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
>  {
>  	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
>  	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
> -	uint32_t wptr;
> -	unsigned long flags;
>  
>  	/* Expanded APRIV doesn't need to issue the WHERE_AM_I opcode */
>  	if (a6xx_gpu->has_whereami && !adreno_gpu->base.hw_apriv) {
> -		struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
> -
>  		OUT_PKT7(ring, CP_WHERE_AM_I, 2);
>  		OUT_RING(ring, lower_32_bits(shadowptr(a6xx_gpu, ring)));
>  		OUT_RING(ring, upper_32_bits(shadowptr(a6xx_gpu, ring)));
>  	}
> +}
> +
> +static void a6xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
> +{
> +	uint32_t wptr;
> +	unsigned long flags;
> +
> +	update_shadow_rptr(gpu, ring);
>  
>  	if (unlikely(ring->overflow))
>  		return;
> @@ -148,7 +152,7 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
>  	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
>  	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
>  	struct msm_ringbuffer *ring = submit->ring;
> -	unsigned int i;
> +	unsigned int i, ibs = 0;
>  
>  	a6xx_set_pagetable(a6xx_gpu, ring, submit->queue->ctx);
>  
> @@ -184,8 +188,19 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
>  			OUT_RING(ring, lower_32_bits(submit->cmd[i].iova));
>  			OUT_RING(ring, upper_32_bits(submit->cmd[i].iova));
>  			OUT_RING(ring, submit->cmd[i].size);
> +			ibs++;
>  			break;
>  		}
> +
> +		/*
> +		 * Periodically update shadow-wptr if needed, so that we
> +		 * can see partial progress of submits with large # of
> +		 * cmds.. otherwise we could needlessly stall waiting for
> +		 * ringbuffer state, simply due to looking at a shadow
> +		 * rptr value that has not been updated
> +		 */
> +		if ((ibs % 32) == 0)
> +			update_shadow_rptr(gpu, ring);
>  	}
>  
>  	get_stats_counter(ring, REG_A6XX_RBBM_PERFCTR_CP_0_LO,
> -- 
> 2.30.2
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] drm/msm: Handle ringbuffer overflow
  2021-04-28 19:36 ` [PATCH 1/2] drm/msm: Handle ringbuffer overflow Rob Clark
@ 2021-11-25  7:36   ` Dmitry Baryshkov
  2021-11-25 17:20     ` Rob Clark
  0 siblings, 1 reply; 6+ messages in thread
From: Dmitry Baryshkov @ 2021-11-25  7:36 UTC (permalink / raw)
  To: Rob Clark, dri-devel
  Cc: Jordan Crouse, Rob Clark, Sean Paul, David Airlie, Daniel Vetter,
	AngeloGioacchino Del Regno, Konrad Dybcio,
	Kristian H. Kristensen, Marijn Suijten, Jonathan Marek,
	Akhil P Oommen, Eric Anholt, Sai Prakash Ranjan, Sharat Masetty,
	Douglas Anderson, Dave Airlie,
	open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list:DRM DRIVER FOR MSM ADRENO GPU, open list

On 28/04/2021 22:36, Rob Clark wrote:
> From: Rob Clark <robdclark@chromium.org>
> 
> Currently if userspace manages to fill up the ring faster than the GPU
> can consume we (a) spin for up to 1sec, and then (b) overwrite the
> ringbuffer contents from previous submits that the GPU is still busy
> executing.  Which predictably goes rather badly.
> 
> Instead, just skip flushing (updating WPTR) and reset ring->next back to
> where it was before we tried writing the submit into the ringbuffer, and
> return an error to userspace (which can then try again).
> 
> Signed-off-by: Rob Clark <robdclark@chromium.org>

Rob, you've posted this patch, but never merged it. Should it be merged 
at some point?

> ---
>   drivers/gpu/drm/msm/adreno/a5xx_gpu.c   |  3 +++
>   drivers/gpu/drm/msm/adreno/a6xx_gpu.c   |  3 +++
>   drivers/gpu/drm/msm/adreno/adreno_gpu.c | 24 +++++++++++++++++-
>   drivers/gpu/drm/msm/msm_gem_submit.c    |  7 +++++-
>   drivers/gpu/drm/msm/msm_gpu.c           | 33 +++++++++++++++++++++++--
>   drivers/gpu/drm/msm/msm_gpu.h           |  2 +-
>   drivers/gpu/drm/msm/msm_ringbuffer.h    |  5 ++++
>   7 files changed, 72 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> index ce13d49e615b..0c8faad3b328 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> @@ -36,6 +36,9 @@ void a5xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
>   		OUT_RING(ring, upper_32_bits(shadowptr(a5xx_gpu, ring)));
>   	}
>   
> +	if (unlikely(ring->overflow))
> +		return;
> +
>   	spin_lock_irqsave(&ring->preempt_lock, flags);
>   
>   	/* Copy the shadow to the actual register */
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index d553f62f4eeb..4a4728a774c0 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -68,6 +68,9 @@ static void a6xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
>   		OUT_RING(ring, upper_32_bits(shadowptr(a6xx_gpu, ring)));
>   	}
>   
> +	if (unlikely(ring->overflow))
> +		return;
> +
>   	spin_lock_irqsave(&ring->preempt_lock, flags);
>   
>   	/* Copy the shadow to the actual register */
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> index 0f184c3dd9d9..a658777e07b1 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> @@ -467,6 +467,9 @@ void adreno_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring, u32 reg)
>   {
>   	uint32_t wptr;
>   
> +	if (unlikely(ring->overflow))
> +		return;
> +
>   	/* Copy the shadow to the actual register */
>   	ring->cur = ring->next;
>   
> @@ -788,12 +791,31 @@ static uint32_t ring_freewords(struct msm_ringbuffer *ring)
>   	return (rptr + (size - 1) - wptr) % size;
>   }
>   
> +static bool space_avail(struct msm_ringbuffer *ring, uint32_t ndwords)
> +{
> +	if (ring_freewords(ring) >= ndwords)
> +		return true;
> +
> +	/* We don't have a good way to know in general when the RPTR has
> +	 * advanced.. newer things that use CP_WHERE_AM_I to update the
> +	 * shadow rptr could possibly insert a packet to generate an irq.
> +	 * But that doesn't cover older GPUs.  But if the ringbuffer is
> +	 * full, it could take a while before it is empty again, so just
> +	 * insert a blind sleep to avoid a busy loop.
> +	 */
> +	msleep(1);
> +
> +	return false;
> +}
> +
>   void adreno_wait_ring(struct msm_ringbuffer *ring, uint32_t ndwords)
>   {
> -	if (spin_until(ring_freewords(ring) >= ndwords))
> +	if (spin_until(space_avail(ring, ndwords))) {
>   		DRM_DEV_ERROR(ring->gpu->dev->dev,
>   			"timeout waiting for space in ringbuffer %d\n",
>   			ring->id);
> +		ring->overflow = true;
> +	}
>   }
>   
>   /* Get legacy powerlevels from qcom,gpu-pwrlevels and populate the opp table */
> diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
> index 5480852bdeda..4bc669460fda 100644
> --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> @@ -683,6 +683,9 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
>   	submitid = atomic_inc_return(&ident) - 1;
>   
>   	ring = gpu->rb[queue->prio];
> +
> +	GEM_WARN_ON(ring->overflow);
> +
>   	trace_msm_gpu_submit(pid_nr(pid), ring->id, submitid,
>   		args->nr_bos, args->nr_cmds);
>   
> @@ -829,7 +832,9 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
>   		}
>   	}
>   
> -	msm_gpu_submit(gpu, submit);
> +	ret = msm_gpu_submit(gpu, submit);
> +	if (ret)
> +		goto out;
>   
>   	args->fence = submit->fence->seqno;
>   
> diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
> index ab7c167b0623..7655ad9108c8 100644
> --- a/drivers/gpu/drm/msm/msm_gpu.c
> +++ b/drivers/gpu/drm/msm/msm_gpu.c
> @@ -787,7 +787,7 @@ void msm_gpu_retire(struct msm_gpu *gpu)
>   }
>   
>   /* add bo's to gpu's ring, and kick gpu: */
> -void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
> +int msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
>   {
>   	struct drm_device *dev = gpu->dev;
>   	struct msm_drm_private *priv = dev->dev_private;
> @@ -834,9 +834,38 @@ void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
>   	spin_unlock(&ring->submit_lock);
>   
>   	gpu->funcs->submit(gpu, submit);
> -	priv->lastctx = submit->queue->ctx;
>   
>   	hangcheck_timer_reset(gpu);
> +
> +	if (unlikely(ring->overflow)) {
> +		/*
> +		 * Reset the ptr back to before the submit, so the GPU
> +		 * doesn't see a partial submit:
> +		 */
> +		ring->next = ring->cur;
> +
> +		/*
> +		 * Clear the overflow flag, hopefully the next submit on
> +		 * the ring actually fits
> +		 */
> +		ring->overflow = false;
> +
> +		/*
> +		 * One might be tempted to remove the submit from the
> +		 * submits list, and drop it's reference (and drop the
> +		 * active reference for all the bos).  But we can't
> +		 * really signal the fence attached to obj->resv without
> +		 * disturbing other fences on the timeline.  So instead
> +		 * just leave it and let it retire normally when a
> +		 * later submit completes.
> +		 */
> +
> +		return -ENOSPC;
> +	}
> +
> +	priv->lastctx = submit->queue->ctx;
> +
> +	return 0;
>   }
>   
>   /*
> diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
> index d7cd02cd2109..2dd2ef1f8328 100644
> --- a/drivers/gpu/drm/msm/msm_gpu.h
> +++ b/drivers/gpu/drm/msm/msm_gpu.h
> @@ -302,7 +302,7 @@ int msm_gpu_perfcntr_sample(struct msm_gpu *gpu, uint32_t *activetime,
>   		uint32_t *totaltime, uint32_t ncntrs, uint32_t *cntrs);
>   
>   void msm_gpu_retire(struct msm_gpu *gpu);
> -void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit);
> +int msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit);
>   
>   int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
>   		struct msm_gpu *gpu, const struct msm_gpu_funcs *funcs,
> diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.h b/drivers/gpu/drm/msm/msm_ringbuffer.h
> index fe55d4a1aa16..d8ad9818c389 100644
> --- a/drivers/gpu/drm/msm/msm_ringbuffer.h
> +++ b/drivers/gpu/drm/msm/msm_ringbuffer.h
> @@ -40,6 +40,8 @@ struct msm_ringbuffer {
>   	struct drm_gem_object *bo;
>   	uint32_t *start, *end, *cur, *next;
>   
> +	bool overflow;
> +
>   	/*
>   	 * List of in-flight submits on this ring.  Protected by submit_lock.
>   	 */
> @@ -69,6 +71,9 @@ void msm_ringbuffer_destroy(struct msm_ringbuffer *ring);
>   static inline void
>   OUT_RING(struct msm_ringbuffer *ring, uint32_t data)
>   {
> +	if (ring->overflow)
> +		return;
> +
>   	/*
>   	 * ring->next points to the current command being written - it won't be
>   	 * committed as ring->cur until the flush
> 


-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] drm/msm: Handle ringbuffer overflow
  2021-11-25  7:36   ` Dmitry Baryshkov
@ 2021-11-25 17:20     ` Rob Clark
  0 siblings, 0 replies; 6+ messages in thread
From: Rob Clark @ 2021-11-25 17:20 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Rob Clark, dri-devel, Jordan Crouse, Sean Paul, David Airlie,
	Daniel Vetter, AngeloGioacchino Del Regno, Konrad Dybcio,
	Kristian H. Kristensen, Marijn Suijten, Jonathan Marek,
	Akhil P Oommen, Eric Anholt, Sai Prakash Ranjan, Sharat Masetty,
	Douglas Anderson, Dave Airlie,
	open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list:DRM DRIVER FOR MSM ADRENO GPU, open list

On Wed, Nov 24, 2021 at 11:36 PM Dmitry Baryshkov
<dmitry.baryshkov@linaro.org> wrote:
>
> On 28/04/2021 22:36, Rob Clark wrote:
> > From: Rob Clark <robdclark@chromium.org>
> >
> > Currently if userspace manages to fill up the ring faster than the GPU
> > can consume we (a) spin for up to 1sec, and then (b) overwrite the
> > ringbuffer contents from previous submits that the GPU is still busy
> > executing.  Which predictably goes rather badly.
> >
> > Instead, just skip flushing (updating WPTR) and reset ring->next back to
> > where it was before we tried writing the submit into the ringbuffer, and
> > return an error to userspace (which can then try again).
> >
> > Signed-off-by: Rob Clark <robdclark@chromium.org>
>
> Rob, you've posted this patch, but never merged it. Should it be merged
> at some point?

I think it is a bit less needed now, since drm/sched will limit the #
of in-flight submits (when I sent that patch, it was before conversion
to use drm/sched)

With a bit more locking re-work we could do something more clever like
just blocking until there is space in the ringbuffer.. but aren't
there quite yet.

BR,
-R

>
> > ---
> >   drivers/gpu/drm/msm/adreno/a5xx_gpu.c   |  3 +++
> >   drivers/gpu/drm/msm/adreno/a6xx_gpu.c   |  3 +++
> >   drivers/gpu/drm/msm/adreno/adreno_gpu.c | 24 +++++++++++++++++-
> >   drivers/gpu/drm/msm/msm_gem_submit.c    |  7 +++++-
> >   drivers/gpu/drm/msm/msm_gpu.c           | 33 +++++++++++++++++++++++--
> >   drivers/gpu/drm/msm/msm_gpu.h           |  2 +-
> >   drivers/gpu/drm/msm/msm_ringbuffer.h    |  5 ++++
> >   7 files changed, 72 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > index ce13d49e615b..0c8faad3b328 100644
> > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > @@ -36,6 +36,9 @@ void a5xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
> >               OUT_RING(ring, upper_32_bits(shadowptr(a5xx_gpu, ring)));
> >       }
> >
> > +     if (unlikely(ring->overflow))
> > +             return;
> > +
> >       spin_lock_irqsave(&ring->preempt_lock, flags);
> >
> >       /* Copy the shadow to the actual register */
> > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > index d553f62f4eeb..4a4728a774c0 100644
> > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > @@ -68,6 +68,9 @@ static void a6xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
> >               OUT_RING(ring, upper_32_bits(shadowptr(a6xx_gpu, ring)));
> >       }
> >
> > +     if (unlikely(ring->overflow))
> > +             return;
> > +
> >       spin_lock_irqsave(&ring->preempt_lock, flags);
> >
> >       /* Copy the shadow to the actual register */
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > index 0f184c3dd9d9..a658777e07b1 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > @@ -467,6 +467,9 @@ void adreno_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring, u32 reg)
> >   {
> >       uint32_t wptr;
> >
> > +     if (unlikely(ring->overflow))
> > +             return;
> > +
> >       /* Copy the shadow to the actual register */
> >       ring->cur = ring->next;
> >
> > @@ -788,12 +791,31 @@ static uint32_t ring_freewords(struct msm_ringbuffer *ring)
> >       return (rptr + (size - 1) - wptr) % size;
> >   }
> >
> > +static bool space_avail(struct msm_ringbuffer *ring, uint32_t ndwords)
> > +{
> > +     if (ring_freewords(ring) >= ndwords)
> > +             return true;
> > +
> > +     /* We don't have a good way to know in general when the RPTR has
> > +      * advanced.. newer things that use CP_WHERE_AM_I to update the
> > +      * shadow rptr could possibly insert a packet to generate an irq.
> > +      * But that doesn't cover older GPUs.  But if the ringbuffer is
> > +      * full, it could take a while before it is empty again, so just
> > +      * insert a blind sleep to avoid a busy loop.
> > +      */
> > +     msleep(1);
> > +
> > +     return false;
> > +}
> > +
> >   void adreno_wait_ring(struct msm_ringbuffer *ring, uint32_t ndwords)
> >   {
> > -     if (spin_until(ring_freewords(ring) >= ndwords))
> > +     if (spin_until(space_avail(ring, ndwords))) {
> >               DRM_DEV_ERROR(ring->gpu->dev->dev,
> >                       "timeout waiting for space in ringbuffer %d\n",
> >                       ring->id);
> > +             ring->overflow = true;
> > +     }
> >   }
> >
> >   /* Get legacy powerlevels from qcom,gpu-pwrlevels and populate the opp table */
> > diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
> > index 5480852bdeda..4bc669460fda 100644
> > --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> > +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> > @@ -683,6 +683,9 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
> >       submitid = atomic_inc_return(&ident) - 1;
> >
> >       ring = gpu->rb[queue->prio];
> > +
> > +     GEM_WARN_ON(ring->overflow);
> > +
> >       trace_msm_gpu_submit(pid_nr(pid), ring->id, submitid,
> >               args->nr_bos, args->nr_cmds);
> >
> > @@ -829,7 +832,9 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
> >               }
> >       }
> >
> > -     msm_gpu_submit(gpu, submit);
> > +     ret = msm_gpu_submit(gpu, submit);
> > +     if (ret)
> > +             goto out;
> >
> >       args->fence = submit->fence->seqno;
> >
> > diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
> > index ab7c167b0623..7655ad9108c8 100644
> > --- a/drivers/gpu/drm/msm/msm_gpu.c
> > +++ b/drivers/gpu/drm/msm/msm_gpu.c
> > @@ -787,7 +787,7 @@ void msm_gpu_retire(struct msm_gpu *gpu)
> >   }
> >
> >   /* add bo's to gpu's ring, and kick gpu: */
> > -void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
> > +int msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
> >   {
> >       struct drm_device *dev = gpu->dev;
> >       struct msm_drm_private *priv = dev->dev_private;
> > @@ -834,9 +834,38 @@ void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
> >       spin_unlock(&ring->submit_lock);
> >
> >       gpu->funcs->submit(gpu, submit);
> > -     priv->lastctx = submit->queue->ctx;
> >
> >       hangcheck_timer_reset(gpu);
> > +
> > +     if (unlikely(ring->overflow)) {
> > +             /*
> > +              * Reset the ptr back to before the submit, so the GPU
> > +              * doesn't see a partial submit:
> > +              */
> > +             ring->next = ring->cur;
> > +
> > +             /*
> > +              * Clear the overflow flag, hopefully the next submit on
> > +              * the ring actually fits
> > +              */
> > +             ring->overflow = false;
> > +
> > +             /*
> > +              * One might be tempted to remove the submit from the
> > +              * submits list, and drop it's reference (and drop the
> > +              * active reference for all the bos).  But we can't
> > +              * really signal the fence attached to obj->resv without
> > +              * disturbing other fences on the timeline.  So instead
> > +              * just leave it and let it retire normally when a
> > +              * later submit completes.
> > +              */
> > +
> > +             return -ENOSPC;
> > +     }
> > +
> > +     priv->lastctx = submit->queue->ctx;
> > +
> > +     return 0;
> >   }
> >
> >   /*
> > diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
> > index d7cd02cd2109..2dd2ef1f8328 100644
> > --- a/drivers/gpu/drm/msm/msm_gpu.h
> > +++ b/drivers/gpu/drm/msm/msm_gpu.h
> > @@ -302,7 +302,7 @@ int msm_gpu_perfcntr_sample(struct msm_gpu *gpu, uint32_t *activetime,
> >               uint32_t *totaltime, uint32_t ncntrs, uint32_t *cntrs);
> >
> >   void msm_gpu_retire(struct msm_gpu *gpu);
> > -void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit);
> > +int msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit);
> >
> >   int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
> >               struct msm_gpu *gpu, const struct msm_gpu_funcs *funcs,
> > diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.h b/drivers/gpu/drm/msm/msm_ringbuffer.h
> > index fe55d4a1aa16..d8ad9818c389 100644
> > --- a/drivers/gpu/drm/msm/msm_ringbuffer.h
> > +++ b/drivers/gpu/drm/msm/msm_ringbuffer.h
> > @@ -40,6 +40,8 @@ struct msm_ringbuffer {
> >       struct drm_gem_object *bo;
> >       uint32_t *start, *end, *cur, *next;
> >
> > +     bool overflow;
> > +
> >       /*
> >        * List of in-flight submits on this ring.  Protected by submit_lock.
> >        */
> > @@ -69,6 +71,9 @@ void msm_ringbuffer_destroy(struct msm_ringbuffer *ring);
> >   static inline void
> >   OUT_RING(struct msm_ringbuffer *ring, uint32_t data)
> >   {
> > +     if (ring->overflow)
> > +             return;
> > +
> >       /*
> >        * ring->next points to the current command being written - it won't be
> >        * committed as ring->cur until the flush
> >
>
>
> --
> With best wishes
> Dmitry

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-11-25 17:25 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-28 19:36 [PATCH 0/2] drm/msm: Smooth out ringbuffer-full handling Rob Clark
2021-04-28 19:36 ` [PATCH 1/2] drm/msm: Handle ringbuffer overflow Rob Clark
2021-11-25  7:36   ` Dmitry Baryshkov
2021-11-25 17:20     ` Rob Clark
2021-04-28 19:36 ` [PATCH 2/2] drm/msm: Periodically update RPTR shadow Rob Clark
2021-05-02 19:47   ` Jordan Crouse

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).