LKML Archive on lore.kernel.org
From: Dan Williams <dan.j.williams@intel.com>
To: neilb@suse.de, christopher.leech@intel.com,
	linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: akpm@linux-foundation.org, torvalds@linux-foundation.org,
	yur@emcraft.com, wd@denx.de, arjan@linux.intel.com,
	rmk+kernel@arm.linux.org.uk
Subject: [PATCH 2.6.21-rc4 08/15] md: move raid5 parity checks to raid5_run_ops
Date: Thu, 22 Mar 2007 23:52:18 -0700
Message-ID: <20070323065218.15570.19058.stgit@dwillia2-linux.ch.intel.com>
In-Reply-To: <20070323064856.15570.45052.stgit@dwillia2-linux.ch.intel.com>

handle_stripe sets STRIPE_OP_CHECK to request a check operation in
raid5_run_ops.  If raid5_run_ops is able to perform the check with a
dma engine, the parity will be preserved in memory, removing the need to
re-read it from disk as the synchronous path must.
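
As a purely illustrative aside (not part of the patch), the handshake can
be modelled in a few lines of user-space C.  The flag and field names
(ops pending/ack/complete, zero_sum_result) mirror the patch; the struct
layout, the stripe geometry, and main() are invented scaffolding for the
example:

/*
 * Minimal user-space sketch of the check handshake described above.
 * Not kernel code: only the flag and field names follow the patch.
 */
#include <stdio.h>
#include <stdint.h>

#define NDISKS		4	/* 3 data disks + 1 parity, for the model */
#define STRIPE_SIZE	16	/* bytes per "page" in the model */

enum { STRIPE_OP_CHECK, STRIPE_OP_COMPUTE_BLK, STRIPE_OP_MOD_REPAIR_PD };

struct stripe_model {
	uint8_t page[NDISKS][STRIPE_SIZE];	/* data + parity, in memory */
	unsigned long pending, ack, complete;	/* models sh->ops.* */
	int pd_idx;
	int zero_sum_result;			/* 0 => parity matches data */
};

/* handle_stripe side: request a check operation */
static void request_check(struct stripe_model *sh)
{
	sh->pending |= 1UL << STRIPE_OP_CHECK;
}

/* raid5_run_ops side: ack the request, xor data and parity together and
 * record whether the result is zero.  The parity page itself stays valid
 * in memory, so nothing has to be re-read from disk afterwards.
 */
static void run_check(struct stripe_model *sh)
{
	uint8_t sum[STRIPE_SIZE] = { 0 };
	int i, j;

	if (!(sh->pending & (1UL << STRIPE_OP_CHECK)))
		return;
	sh->ack |= 1UL << STRIPE_OP_CHECK;

	for (i = 0; i < NDISKS; i++)
		for (j = 0; j < STRIPE_SIZE; j++)
			sum[j] ^= sh->page[i][j];

	sh->zero_sum_result = 0;
	for (j = 0; j < STRIPE_SIZE; j++)
		if (sum[j])
			sh->zero_sum_result = 1;

	sh->complete |= 1UL << STRIPE_OP_CHECK;
}

int main(void)
{
	struct stripe_model sh = { .pd_idx = NDISKS - 1 };
	int i, j;

	/* fill the data disks and derive a correct parity page */
	for (i = 0; i < sh.pd_idx; i++)
		for (j = 0; j < STRIPE_SIZE; j++) {
			sh.page[i][j] = (uint8_t)(i * 37 + j);
			sh.page[sh.pd_idx][j] ^= sh.page[i][j];
		}

	request_check(&sh);
	run_check(&sh);
	if (sh.complete & (1UL << STRIPE_OP_CHECK))
		printf("zero_sum_result = %d (0: parity correct, still in memory)\n",
		       sh.zero_sum_result);
	return 0;
}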

'Repair' operations reuse the compute-block logic, with the caveat that
the result of the compute operation is immediately written back to the
parity disk.  To differentiate these operations, the STRIPE_OP_MOD_REPAIR_PD
flag is added.
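
Again purely for illustration, the repair decision reduces to roughly the
following user-space sketch; the flag names mirror the patch, everything
else is invented scaffolding:

/*
 * Sketch of what handle_stripe does with a failed check when repair is
 * allowed.  Not kernel code: only the flag names follow the patch.
 */
#include <stdio.h>

enum { STRIPE_OP_CHECK, STRIPE_OP_COMPUTE_BLK, STRIPE_OP_MOD_REPAIR_PD };

struct stripe_ops_model {
	unsigned long pending;
	int target;		/* disk index handed to the compute operation */
	int pd_idx;		/* parity disk index */
};

/* A non-zero zero_sum_result becomes a compute-block request aimed at the
 * parity disk, tagged with STRIPE_OP_MOD_REPAIR_PD so the computed block
 * is written back instead of being treated as an ordinary reconstruct.
 */
static void act_on_check_result(struct stripe_ops_model *ops,
				int zero_sum_result, int check_only)
{
	if (zero_sum_result == 0)
		return;		/* parity correct: stripe is in sync */
	if (check_only)
		return;		/* 'check' pass: count the mismatch, don't repair */

	ops->pending |= 1UL << STRIPE_OP_COMPUTE_BLK;
	ops->pending |= 1UL << STRIPE_OP_MOD_REPAIR_PD;
	ops->target = ops->pd_idx;
}

int main(void)
{
	struct stripe_ops_model ops = { .pd_idx = 3 };

	act_on_check_result(&ops, 1 /* mismatch */, 0 /* repair allowed */);
	printf("pending=0x%lx target=%d\n", ops.pending, ops.target);
	return 0;
}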

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---

 drivers/md/raid5.c |   81 ++++++++++++++++++++++++++++++++++++++++------------
 1 files changed, 62 insertions(+), 19 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 9856742..17a114c 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -2411,32 +2411,75 @@ static void handle_stripe5(struct stripe_head *sh)
 			locked += handle_write_operations5(sh, rcw, 0);
 	}
 
-	/* maybe we need to check and possibly fix the parity for this stripe
-	 * Any reads will already have been scheduled, so we just see if enough data
-	 * is available
+	/* 1/ Maybe we need to check and possibly fix the parity for this stripe.
+	 *    Any reads will already have been scheduled, so we just see if enough data
+	 *    is available.
+	 * 2/ Hold off parity checks while parity dependent operations are in flight
+	 *    (conflicting writes are protected by the 'locked' variable)
 	 */
-	if (syncing && locked == 0 &&
-	    !test_bit(STRIPE_INSYNC, &sh->state)) {
+	if ((syncing && locked == 0 && !test_bit(STRIPE_OP_COMPUTE_BLK, &sh->ops.pending) &&
+		!test_bit(STRIPE_INSYNC, &sh->state)) ||
+	    	test_bit(STRIPE_OP_CHECK, &sh->ops.pending) ||
+	    	test_bit(STRIPE_OP_MOD_REPAIR_PD, &sh->ops.pending)) {
+
 		set_bit(STRIPE_HANDLE, &sh->state);
-		if (failed == 0) {
-			BUG_ON(uptodate != disks);
-			compute_parity5(sh, CHECK_PARITY);
-			uptodate--;
-			if (page_is_zero(sh->dev[sh->pd_idx].page)) {
-				/* parity is correct (on disc, not in buffer any more) */
-				set_bit(STRIPE_INSYNC, &sh->state);
-			} else {
-				conf->mddev->resync_mismatches += STRIPE_SECTORS;
-				if (test_bit(MD_RECOVERY_CHECK, &conf->mddev->recovery))
-					/* don't try to repair!! */
+		/* Take one of the following actions:
+		 * 1/ start a check parity operation if (uptodate == disks)
+		 * 2/ finish a check parity operation and act on the result
+		 * 3/ skip to the writeback section if we previously
+		 *    initiated a recovery operation
+		 */
+		if (failed == 0 && !test_bit(STRIPE_OP_MOD_REPAIR_PD, &sh->ops.pending)) {
+			if (!test_and_set_bit(STRIPE_OP_CHECK, &sh->ops.pending)) {
+				BUG_ON(uptodate != disks);
+				clear_bit(R5_UPTODATE, &sh->dev[sh->pd_idx].flags);
+				sh->ops.count++;
+				uptodate--;
+			} else if (test_and_clear_bit(STRIPE_OP_CHECK, &sh->ops.complete)) {
+				clear_bit(STRIPE_OP_CHECK, &sh->ops.ack);
+				clear_bit(STRIPE_OP_CHECK, &sh->ops.pending);
+
+				if (sh->ops.zero_sum_result == 0)
+					/* parity is correct (on disc, not in buffer any more) */
 					set_bit(STRIPE_INSYNC, &sh->state);
 				else {
-					compute_block(sh, sh->pd_idx);
-					uptodate++;
+					conf->mddev->resync_mismatches += STRIPE_SECTORS;
+					if (test_bit(MD_RECOVERY_CHECK, &conf->mddev->recovery))
+						/* don't try to repair!! */
+						set_bit(STRIPE_INSYNC, &sh->state);
+					else {
+						BUG_ON(test_and_set_bit(
+							STRIPE_OP_COMPUTE_BLK,
+							&sh->ops.pending));
+						set_bit(STRIPE_OP_MOD_REPAIR_PD,
+							&sh->ops.pending);
+						BUG_ON(test_and_set_bit(R5_Wantcompute,
+							&sh->dev[sh->pd_idx].flags));
+						sh->ops.target = sh->pd_idx;
+						sh->ops.count++;
+						uptodate++;
+					}
 				}
 			}
 		}
-		if (!test_bit(STRIPE_INSYNC, &sh->state)) {
+
+		/* check if we can clear a parity disk reconstruct */
+		if (test_bit(STRIPE_OP_COMPUTE_BLK, &sh->ops.complete) &&
+			test_bit(STRIPE_OP_MOD_REPAIR_PD, &sh->ops.pending)) {
+
+			clear_bit(STRIPE_OP_MOD_REPAIR_PD, &sh->ops.pending);
+			clear_bit(STRIPE_OP_COMPUTE_BLK, &sh->ops.complete);
+			clear_bit(STRIPE_OP_COMPUTE_BLK, &sh->ops.ack);
+			clear_bit(STRIPE_OP_COMPUTE_BLK, &sh->ops.pending);
+		}
+
+		/* Wait for check parity and compute block operations to complete
+		 * before write-back
+		 */
+		if (!test_bit(STRIPE_INSYNC, &sh->state) &&
+			!test_bit(STRIPE_OP_CHECK, &sh->ops.pending) &&
+			!test_bit(STRIPE_OP_COMPUTE_BLK, &sh->ops.pending)) {
+
 			/* either failed parity check, or recovery is happening */
 			if (failed==0)
 				failed_num = sh->pd_idx;

Thread overview: 16+ messages
2007-03-23  6:51 [PATCH 2.6.21-rc4 00/15] md raid5 acceleration and async_tx Dan Williams
2007-03-23  6:51 ` [PATCH 2.6.21-rc4 01/15] dmaengine: add base support for the async_tx api Dan Williams
2007-03-23  6:51 ` [PATCH 2.6.21-rc4 02/15] ARM: Add drivers/dma to arch/arm/Kconfig Dan Williams
2007-03-23  6:51 ` [PATCH 2.6.21-rc4 03/15] dmaengine: add the async_tx api Dan Williams
2007-03-23  6:51 ` [PATCH 2.6.21-rc4 04/15] md: add raid5_run_ops and support routines Dan Williams
2007-03-23  6:52 ` [PATCH 2.6.21-rc4 05/15] md: use raid5_run_ops for stripe cache operations Dan Williams
2007-03-23  6:52 ` [PATCH 2.6.21-rc4 06/15] md: move write operations to raid5_run_ops Dan Williams
2007-03-23  6:52 ` [PATCH 2.6.21-rc4 07/15] md: move raid5 compute block " Dan Williams
2007-03-23  6:52 ` Dan Williams [this message]
2007-03-23  6:52 ` [PATCH 2.6.21-rc4 09/15] md: satisfy raid5 read requests via raid5_run_ops Dan Williams
2007-03-23  6:52 ` [PATCH 2.6.21-rc4 10/15] md: use async_tx and raid5_run_ops for raid5 expansion operations Dan Williams
2007-03-23  6:52 ` [PATCH 2.6.21-rc4 11/15] md: move raid5 io requests to raid5_run_ops Dan Williams
2007-03-23  6:52 ` [PATCH 2.6.21-rc4 12/15] md: remove raid5 compute_block and compute_parity5 Dan Williams
2007-03-23  6:52 ` [PATCH 2.6.21-rc4 13/15] dmaengine: driver for the iop32x, iop33x, and iop13xx raid engines Dan Williams
2007-03-23  6:52 ` [PATCH 2.6.21-rc4 14/15] iop13xx: Surface the iop13xx adma units to the iop-adma driver Dan Williams
2007-03-23  6:52 ` [PATCH 2.6.21-rc4 15/15] iop3xx: Surface the iop3xx DMA and AAU " Dan Williams
