LKML Archive on lore.kernel.org
* What's in infiniband.git for 2.6.22
@ 2007-04-26 18:20 Roland Dreier
2007-04-26 22:43 ` [PATCH][RFC] IB: Return "maybe missed event" hint from ib_req_notify_cq() Roland Dreier
2007-04-27 15:30 ` What's in infiniband.git for 2.6.22 Michael S. Tsirkin
0 siblings, 2 replies; 9+ messages in thread
From: Roland Dreier @ 2007-04-26 18:20 UTC
To: linux-kernel, general
Here's a short summary of what my plans for 2.6.22 are. For
reference, everything is in my git tree:
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git
Please let me know if you have any thoughts on these plans, or if
there is something that you feel is missing from this list.
* mlx4 driver for new Mellanox ConnectX HCAs. This is the connectx
branch in git. I will merge this soon, after a few more cleanups
and one final posting for review. There are actually two parts here:
- "IB/uverbs: Export ib_umem_get()/ib_umem_release() to modules"
This touches the core and all drivers, but I think it is a better
design and actually helps other drivers too in addition to being a
prerequisite for the mlx4 driver. I haven't heard anyone speak
out against it so I plan to go ahead and merge it.
- "IB/mlx4: Add driver for Mellanox ConnectX HCAs"
I'll fold all the mlx4_core and mlx4_ib code into this patch and
merge it.
- "mlx4_eth: Add 10 gigabit ethernet driver for Mellanox ConnectX"
This will NOT be merged for 2.6.22 at least. For one thing it is
pretty much just a stub that doesn't do anything useful. When
there is working 10 gig support, I'll post this to lkml and netdev
for review, but this is 2.6.23 stuff at the soonest.
* IPoIB NAPI work. This is the ipoib branch in git. Again, there are
really two parts here:
- "IB: Return "maybe missed event" hint from ib_req_notify_cq()"
This extends the API in a way that lets us implement NAPI, but may
be useful for other things too. It touches all the drivers, and I
still need to finish updating cxgb3 to work correctly. I haven't
heard anything negative about this, so I'll fix it up, post it one
more time for review, and plan on merging it.
- "IPoIB: Convert to NAPI"
This is the actual conversion of IPoIB to use NAPI, based on the
previous extension to ib_req_notify_cq(). There seems to be a
need to merge this, based on people's experiences with congestion
collapse under high load. So I'm planning on merging this too.
* I also have the following bunch of more minor patches queued, and I
will ask Linus to pull them soon. The majority of them are ipath
fixes (and I hope Qlogic will send fixes for the two other bugs that
I know of, namely corrupting the list of pending mmaps if an object
is destroyed before userspace mmaps it, and doing spin_lock_irq()
from interrupt context). There are a few other cleanups and minor
fixes scattered around. Here's the shortlog of the for-2.6.22 branch:
Arthur Jones (2):
      IB/ipath: Call free_irq() on chip specific initialization failure
      IB/ipath: Force PIOAvail update entry point

Bryan O'Sullivan (17):
      IB/ipath: Add ability to set and clear IB local loopback
      IB/ipath: Fix user memory region creation when IOMMU present
      IB/ipath: Definitions of two RXE parity err bits were reversed
      IB/ipath: Fix up some debug messages
      IB/ipath: Change packet problems vs chip errors handling and reporting
      IB/ipath: Fix bad argument to clear_bit()
      IB/ipath: Fix CQ flushing when QP is modified to error state
      IB/ipath: Remove unused ipath_read_kreg64_port()
      IB/ipath: Fix calculation for number of kernel PIO buffers
      IB/ipath: Discard multicast packets without a GRH
      IB/ipath: Print better error messages if kernel is misconfigured
      IB/ipath: Improve handling and reporting of parity errors
      IB/ipath: On unrecoverable errors, force link down, LEDs off
      IB/ipath: Prevent random program use of diags interface
      IB/ipath: Disable IB link earlier in shutdown sequence
      IB/ipath: Don't allow QPs 0 and 1 to be opened multiple times
      IB/ipath: Fix unit selection when all CPU affinity bits set

Hal Rosenstock (3):
      IB/umad: Fix declaration of dev_map[]
      IB/mad: Change SMI to use enums rather than magic return codes
      IB/umad: Clarify documentation of transaction ID

Joachim Fenkes (2):
      IB/ehca: Implement modify_port
      IB: Set class_dev->dev in core for nice device symlink

Mark Debbage (1):
      IB/ipath: Allow receive ports mapped into userspace to be shared

Michael Albaugh (1):
      IB/ipath: Fix driver crash (in interrupt or during unload) after chip reset

Ralph Campbell (8):
      IB/ipath: Don't initialize port memory for subports
      IB/ipath: Fix SRQ limit event causing dropped CQ entry
      IB/ipath: NMI cpu lockup if local loopback used
      IB/ipath: Support larger IB_QP_MAX_DEST_RD_ATOMIC and IB_QP_MAX_QP_RD_ATOMIC
      IB/ipath: Fix QP error completion queue entries
      IB/ipath: Fix PSN update for RC retries
      IB/ipath: Fix port sharing on powerpc
      IB/ipath: Fix RDMA reads of length zero and error handling

Robert Walsh (4):
      IB/ipath: Check reserved memory keys
      IB/ipath: Remove duplicate stuff from ipath_verbs.h
      IB/ipath: Check that a UD work request's address handle is valid
      IB/ipath: Fix WC format drift between user and kernel space

Roland Dreier (6):
      IB: Remove reference to obsolete CONFIG_IPATH_CORE
      IPoIB: Remove pointless opcode field from debugging output
      IB/mthca: Update HCA firmware revisions
      IB/mthca: Fix mthca_write_mtt() on HCAs with hidden memory
      IB/mthca: Simplify CQ cleaning in mthca_free_qp()
      IPoIB/cm: spin_lock_irqsave() -> spin_lock_irq() replacements

Sean Hefty (5):
      RDMA/ucma: Simplify ucma_get_event()
      IB/ucm: Simplify ib_ucm_event()
      IB/sa: Set src_path_bits correctly in ib_init_ah_from_path()
      IB/ipoib: Use ib_init_ah_from_path to initialize ah_attr
      IB/umad: Implement GRH handling for sent/received MADs
* [PATCH][RFC] IB: Return "maybe missed event" hint from ib_req_notify_cq()
2007-04-26 18:20 What's in infiniband.git for 2.6.22 Roland Dreier
@ 2007-04-26 22:43 ` Roland Dreier
2007-04-26 22:45 ` [PATCH][RFC] IPoIB: Convert to NAPI Roland Dreier
2007-04-30 20:11 ` [ofa-general] [PATCH][RFC] IB: Return "maybe missed event" hint from ib_req_notify_cq() Hoang-Nam Nguyen
2007-04-27 15:30 ` What's in infiniband.git for 2.6.22 Michael S. Tsirkin
1 sibling, 2 replies; 9+ messages in thread
From: Roland Dreier @ 2007-04-26 22:43 UTC
To: linux-kernel; +Cc: general
> - "IB: Return "maybe missed event" hint from ib_req_notify_cq()"
> This extends the API in a way that lets us implement NAPI, but may
> be useful for other things too. It touches all the drivers, and I
> still need to finish updating cxgb3 to work correctly. I haven't
> heard anything negative about this, so I'll fix it up, post it one
> more time for review, and plan on merging it.
As promised, here is that patch for review, with a cxgb3
implementation included.
---
The semantics defined by the InfiniBand specification say that
completion events are only generated when a completion is added to a
completion queue (CQ) after completion notification is requested. In
other words, this means that the following race is possible:
    while (CQ is not empty)
        ib_poll_cq(CQ);
    // new completion is added after while loop is exited
    ib_req_notify_cq(CQ);
    // no event is generated for the existing completion
To close this race, the IB spec recommends doing another poll of the
CQ after requesting notification.
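
The race and the recommended re-poll can be reproduced in a small user-space model. This is only a sketch: struct mock_cq and the mock_* helpers are hypothetical stand-ins, not kernel verbs, and the model assumes hardware that raises an event only for completions added while the CQ is armed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of a CQ; not the kernel's struct ib_cq. */
struct mock_cq {
    int pending;    /* completions waiting to be polled */
    bool armed;     /* notification requested, event not yet fired */
    int events;     /* completion events delivered so far */
};

/* Model strict spec behaviour: an event fires only if the CQ is armed. */
static void mock_add_completion(struct mock_cq *cq)
{
    assert(cq != NULL);
    cq->pending++;
    if (cq->armed) {
        cq->armed = false;
        cq->events++;
    }
}

/* Consume one completion; returns 1 if one was polled, 0 if empty. */
static int mock_poll(struct mock_cq *cq)
{
    if (cq->pending == 0)
        return 0;
    cq->pending--;
    return 1;
}

/* Request notification for the next completion. */
static void mock_req_notify(struct mock_cq *cq)
{
    cq->armed = true;
}

/*
 * Drain the CQ, let one completion slip in (the race window), then arm.
 * With repoll == false the late completion stays queued with no event;
 * with repoll == true the extra poll after arming picks it up.
 * Returns the number of completions handled.
 */
static int drain_then_arm(struct mock_cq *cq, bool repoll)
{
    int handled = 0;

    while (mock_poll(cq))
        handled++;
    mock_add_completion(cq);    /* arrives before the CQ is armed */
    mock_req_notify(cq);
    if (repoll)
        while (mock_poll(cq))
            handled++;
    return handled;
}
```

Calling drain_then_arm() with repoll set to false leaves one completion queued and no event pending for it, which is the stall described above; with repoll set to true the extra poll consumes it.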
However, it is not always possible to arrange code this way (for
example, we have found that NAPI for IPoIB cannot poll after
requesting notification). Also, some hardware (eg Mellanox HCAs)
actually will generate an event for completions added before the call
to ib_req_notify_cq() -- which is allowed by the spec, since there's
no way for any upper-layer consumer to know exactly when a completion
was really added -- so the extra poll of the CQ is just a waste.
Motivated by this, we add a new flag "IB_CQ_REPORT_MISSED_EVENTS" for
ib_req_notify_cq() so that it can return a hint about whether a
completion may have been added before the request for notification.
The return value of ib_req_notify_cq() is extended as follows:
    < 0   means an error occurred while requesting notification
    == 0  means notification was requested successfully, and if
          IB_CQ_REPORT_MISSED_EVENTS was passed in, then no
          events were missed and it is safe to wait for another
          event.
    > 0   is only returned if IB_CQ_REPORT_MISSED_EVENTS was
          passed in. It means that the consumer must poll the
          CQ again to make sure it is empty to avoid the race
          described above.
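
A consumer following this return-value contract might look roughly like the sketch below. It is a user-space model, not driver code: struct mock_cq, mock_poll(), mock_req_notify() and drain_until_safe() are hypothetical, the flag values are copied from the ib_verbs.h hunk of this patch, and the "late" counter is an assumption used to simulate completions that land just before the CQ is armed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values mirror the patch; the MOCK_ prefix marks them as model-only. */
enum mock_notify_flags {
    MOCK_CQ_SOLICITED            = 1 << 0,
    MOCK_CQ_NEXT_COMP            = 1 << 1,
    MOCK_CQ_REPORT_MISSED_EVENTS = 1 << 2,
};

/* Hypothetical CQ model; not the kernel's struct ib_cq. */
struct mock_cq {
    int pending;    /* completions waiting to be polled */
    int late;       /* completions that will land just before arming */
    bool armed;     /* notification has been requested */
};

/* Consume one completion; returns 1 if one was polled, 0 if empty. */
static int mock_poll(struct mock_cq *cq)
{
    if (cq->pending == 0)
        return 0;
    cq->pending--;
    return 1;
}

/*
 * Arm the CQ. One "late" completion, if any, lands just before arming,
 * so no event is generated for it. Returns > 0 only when the hint flag
 * is set and the CQ is not empty, 0 otherwise.
 */
static int mock_req_notify(struct mock_cq *cq, int flags)
{
    assert(cq != NULL);
    if (cq->late > 0) {
        cq->late--;
        cq->pending++;
    }
    cq->armed = true;
    if ((flags & MOCK_CQ_REPORT_MISSED_EVENTS) && cq->pending > 0)
        return 1;
    return 0;
}

/*
 * Poll until the CQ is empty and notification was requested with no
 * missed completion. Returns the number of completions handled, or a
 * negative value if requesting notification failed.
 */
static int drain_until_safe(struct mock_cq *cq)
{
    int handled = 0;
    int ret;

    for (;;) {
        while (mock_poll(cq))
            handled++;
        ret = mock_req_notify(cq, MOCK_CQ_NEXT_COMP |
                                  MOCK_CQ_REPORT_MISSED_EVENTS);
        if (ret < 0)
            return ret;     /* error while requesting notification */
        if (ret == 0)
            return handled; /* safe to wait for the next event */
        /* ret > 0: a completion may have been missed, poll again */
    }
}
```

Without MOCK_CQ_REPORT_MISSED_EVENTS the model's mock_req_notify() always returns 0, so a caller that does not ask for the hint sees the old behaviour.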
We add a flag to enable this behavior rather than turning it on
unconditionally, because checking for missed events may incur
significant overhead for some low-level drivers, and consumers that
don't care about the results of this test shouldn't be forced to pay
for the test.
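
Because the hint is opt-in, each low-level driver has to separate the notification type from the hint bit. A minimal sketch of that decoding, assuming the flag values from the ib_verbs.h hunk of this patch; mock_decode_flags() is a hypothetical helper and not part of the patch, and rejecting anything other than exactly one type bit follows what the amso1100 hunk does:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Values mirror the patch; the MOCK_ prefix marks them as model-only. */
enum mock_notify_flags {
    MOCK_CQ_SOLICITED            = 1 << 0,
    MOCK_CQ_NEXT_COMP            = 1 << 1,
    MOCK_CQ_SOLICITED_MASK       = MOCK_CQ_SOLICITED | MOCK_CQ_NEXT_COMP,
    MOCK_CQ_REPORT_MISSED_EVENTS = 1 << 2,
};

/*
 * Split flags into the notification type and the "report missed
 * events" hint. Returns the type bit, or -EINVAL unless exactly one
 * of the two type bits is set. *want_hint is written only on success.
 */
static int mock_decode_flags(int flags, bool *want_hint)
{
    int type = flags & MOCK_CQ_SOLICITED_MASK;

    assert(want_hint != NULL);
    if (type != MOCK_CQ_SOLICITED && type != MOCK_CQ_NEXT_COMP)
        return -EINVAL;
    *want_hint = (flags & MOCK_CQ_REPORT_MISSED_EVENTS) != 0;
    return type;
}
```

A driver for which the emptiness test is expensive would perform it only when want_hint comes back true.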
Signed-off-by: Roland Dreier <rolandd@cisco.com>
---
drivers/infiniband/hw/amso1100/c2.h | 2 +-
drivers/infiniband/hw/amso1100/c2_cq.c | 16 ++++++++---
drivers/infiniband/hw/cxgb3/cxio_hal.c | 3 ++
drivers/infiniband/hw/cxgb3/iwch_provider.c | 8 +++--
drivers/infiniband/hw/ehca/ehca_iverbs.h | 2 +-
drivers/infiniband/hw/ehca/ehca_reqs.c | 14 +++++++--
drivers/infiniband/hw/ehca/ipz_pt_fn.h | 8 +++++
drivers/infiniband/hw/ipath/ipath_cq.c | 15 +++++++---
drivers/infiniband/hw/ipath/ipath_verbs.h | 2 +-
drivers/infiniband/hw/mthca/mthca_cq.c | 12 +++++---
drivers/infiniband/hw/mthca/mthca_dev.h | 4 +-
include/rdma/ib_verbs.h | 40 +++++++++++++++++++++------
12 files changed, 93 insertions(+), 33 deletions(-)
diff --git a/drivers/infiniband/hw/amso1100/c2.h b/drivers/infiniband/hw/amso1100/c2.h
index 04a9db5..fa58200 100644
--- a/drivers/infiniband/hw/amso1100/c2.h
+++ b/drivers/infiniband/hw/amso1100/c2.h
@@ -519,7 +519,7 @@ extern void c2_free_cq(struct c2_dev *c2dev, struct c2_cq *cq);
extern void c2_cq_event(struct c2_dev *c2dev, u32 mq_index);
extern void c2_cq_clean(struct c2_dev *c2dev, struct c2_qp *qp, u32 mq_index);
extern int c2_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *entry);
-extern int c2_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify notify);
+extern int c2_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags);
/* CM */
extern int c2_llp_connect(struct iw_cm_id *cm_id,
diff --git a/drivers/infiniband/hw/amso1100/c2_cq.c b/drivers/infiniband/hw/amso1100/c2_cq.c
index 5175c99..d2b3366 100644
--- a/drivers/infiniband/hw/amso1100/c2_cq.c
+++ b/drivers/infiniband/hw/amso1100/c2_cq.c
@@ -217,17 +217,19 @@ int c2_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *entry)
return npolled;
}
-int c2_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify notify)
+int c2_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags notify_flags)
{
struct c2_mq_shared __iomem *shared;
struct c2_cq *cq;
+ unsigned long flags;
+ int ret = 0;
cq = to_c2cq(ibcq);
shared = cq->mq.peer;
- if (notify == IB_CQ_NEXT_COMP)
+ if ((notify_flags & IB_CQ_SOLICITED_MASK) == IB_CQ_NEXT_COMP)
writeb(C2_CQ_NOTIFICATION_TYPE_NEXT, &shared->notification_type);
- else if (notify == IB_CQ_SOLICITED)
+ else if ((notify_flags & IB_CQ_SOLICITED_MASK) == IB_CQ_SOLICITED)
writeb(C2_CQ_NOTIFICATION_TYPE_NEXT_SE, &shared->notification_type);
else
return -EINVAL;
@@ -241,7 +243,13 @@ int c2_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify notify)
*/
readb(&shared->armed);
- return 0;
+ if (notify_flags & IB_CQ_REPORT_MISSED_EVENTS) {
+ spin_lock_irqsave(&cq->lock, flags);
+ ret = !c2_mq_empty(&cq->mq);
+ spin_unlock_irqrestore(&cq->lock, flags);
+ }
+
+ return ret;
}
static void c2_free_cq_buf(struct c2_dev *c2dev, struct c2_mq *mq)
diff --git a/drivers/infiniband/hw/cxgb3/cxio_hal.c b/drivers/infiniband/hw/cxgb3/cxio_hal.c
index f5e9aee..76049af 100644
--- a/drivers/infiniband/hw/cxgb3/cxio_hal.c
+++ b/drivers/infiniband/hw/cxgb3/cxio_hal.c
@@ -114,7 +114,10 @@ int cxio_hal_cq_op(struct cxio_rdev *rdev_p, struct t3_cq *cq,
return -EIO;
}
}
+
+ return 1;
}
+
return 0;
}
diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c b/drivers/infiniband/hw/cxgb3/iwch_provider.c
index 24e0df0..e89957f 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
@@ -292,7 +292,7 @@ static int iwch_resize_cq(struct ib_cq *cq, int cqe, struct ib_udata *udata)
#endif
}
-static int iwch_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify notify)
+static int iwch_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags)
{
struct iwch_dev *rhp;
struct iwch_cq *chp;
@@ -303,7 +303,7 @@ static int iwch_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify notify)
chp = to_iwch_cq(ibcq);
rhp = chp->rhp;
- if (notify == IB_CQ_SOLICITED)
+ if ((flags & IB_CQ_SOLICITED_MASK) == IB_CQ_SOLICITED)
cq_op = CQ_ARM_SE;
else
cq_op = CQ_ARM_AN;
@@ -317,9 +317,11 @@ static int iwch_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify notify)
PDBG("%s rptr 0x%x\n", __FUNCTION__, chp->cq.rptr);
err = cxio_hal_cq_op(&rhp->rdev, &chp->cq, cq_op, 0);
spin_unlock_irqrestore(&chp->lock, flag);
- if (err)
+ if (err < 0)
printk(KERN_ERR MOD "Error %d rearming CQID 0x%x\n", err,
chp->cq.cqid);
+ if (err > 0 && !(flags & IB_CQ_REPORT_MISSED_EVENTS))
+ err = 0;
return err;
}
diff --git a/drivers/infiniband/hw/ehca/ehca_iverbs.h b/drivers/infiniband/hw/ehca/ehca_iverbs.h
index 95fd59f..9e5460d 100644
--- a/drivers/infiniband/hw/ehca/ehca_iverbs.h
+++ b/drivers/infiniband/hw/ehca/ehca_iverbs.h
@@ -135,7 +135,7 @@ int ehca_poll_cq(struct ib_cq *cq, int num_entries, struct ib_wc *wc);
int ehca_peek_cq(struct ib_cq *cq, int wc_cnt);
-int ehca_req_notify_cq(struct ib_cq *cq, enum ib_cq_notify cq_notify);
+int ehca_req_notify_cq(struct ib_cq *cq, enum ib_cq_notify_flags notify_flags);
struct ib_qp *ehca_create_qp(struct ib_pd *pd,
struct ib_qp_init_attr *init_attr,
diff --git a/drivers/infiniband/hw/ehca/ehca_reqs.c b/drivers/infiniband/hw/ehca/ehca_reqs.c
index 08d3f89..caec9de 100644
--- a/drivers/infiniband/hw/ehca/ehca_reqs.c
+++ b/drivers/infiniband/hw/ehca/ehca_reqs.c
@@ -634,11 +634,13 @@ poll_cq_exit0:
return ret;
}
-int ehca_req_notify_cq(struct ib_cq *cq, enum ib_cq_notify cq_notify)
+int ehca_req_notify_cq(struct ib_cq *cq, enum ib_cq_notify_flags notify_flags)
{
struct ehca_cq *my_cq = container_of(cq, struct ehca_cq, ib_cq);
+ unsigned long spl_flags;
+ int ret = 0;
- switch (cq_notify) {
+ switch (notify_flags & IB_CQ_SOLICITED_MASK) {
case IB_CQ_SOLICITED:
hipz_set_cqx_n0(my_cq, 1);
break;
@@ -649,5 +651,11 @@ int ehca_req_notify_cq(struct ib_cq *cq, enum ib_cq_notify cq_notify)
return -EINVAL;
}
- return 0;
+ if (notify_flags & IB_CQ_REPORT_MISSED_EVENTS) {
+ spin_lock_irqsave(&my_cq->spinlock, spl_flags);
+ ret = ipz_qeit_is_valid(&my_cq->ipz_queue);
+ spin_unlock_irqrestore(&my_cq->spinlock, spl_flags);
+ }
+
+ return ret;
}
diff --git a/drivers/infiniband/hw/ehca/ipz_pt_fn.h b/drivers/infiniband/hw/ehca/ipz_pt_fn.h
index 8199c45..57f141a 100644
--- a/drivers/infiniband/hw/ehca/ipz_pt_fn.h
+++ b/drivers/infiniband/hw/ehca/ipz_pt_fn.h
@@ -140,6 +140,14 @@ static inline void *ipz_qeit_get_inc_valid(struct ipz_queue *queue)
return cqe;
}
+static inline int ipz_qeit_is_valid(struct ipz_queue *queue)
+{
+ struct ehca_cqe *cqe = ipz_qeit_get(queue);
+ u32 cqe_flags = cqe->cqe_flags;
+
+ return cqe_flags >> 7 == (queue->toggle_state & 1);
+}
+
/*
* returns and resets Queue Entry iterator
* returns address (kv) of first Queue Entry
diff --git a/drivers/infiniband/hw/ipath/ipath_cq.c b/drivers/infiniband/hw/ipath/ipath_cq.c
index 87462e0..9582145 100644
--- a/drivers/infiniband/hw/ipath/ipath_cq.c
+++ b/drivers/infiniband/hw/ipath/ipath_cq.c
@@ -306,17 +306,18 @@ int ipath_destroy_cq(struct ib_cq *ibcq)
/**
* ipath_req_notify_cq - change the notification type for a completion queue
* @ibcq: the completion queue
- * @notify: the type of notification to request
+ * @notify_flags: the type of notification to request
*
* Returns 0 for success.
*
* This may be called from interrupt context. Also called by
* ib_req_notify_cq() in the generic verbs code.
*/
-int ipath_req_notify_cq(struct ib_cq *ibcq, enum ib_cq_notify notify)
+int ipath_req_notify_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags notify_flags)
{
struct ipath_cq *cq = to_icq(ibcq);
unsigned long flags;
+ int ret = 0;
spin_lock_irqsave(&cq->lock, flags);
/*
@@ -324,9 +325,15 @@ int ipath_req_notify_cq(struct ib_cq *ibcq, enum ib_cq_notify notify)
* any other transitions (see C11-31 and C11-32 in ch. 11.4.2.2).
*/
if (cq->notify != IB_CQ_NEXT_COMP)
- cq->notify = notify;
+ cq->notify = notify_flags & IB_CQ_SOLICITED_MASK;
+
+ if ((notify_flags & IB_CQ_REPORT_MISSED_EVENTS) &&
+ cq->queue->head != cq->queue->tail)
+ ret = 1;
+
spin_unlock_irqrestore(&cq->lock, flags);
- return 0;
+
+ return ret;
}
/**
diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.h b/drivers/infiniband/hw/ipath/ipath_verbs.h
index c0c8d5b..6b3b770 100644
--- a/drivers/infiniband/hw/ipath/ipath_verbs.h
+++ b/drivers/infiniband/hw/ipath/ipath_verbs.h
@@ -716,7 +716,7 @@ struct ib_cq *ipath_create_cq(struct ib_device *ibdev, int entries,
int ipath_destroy_cq(struct ib_cq *ibcq);
-int ipath_req_notify_cq(struct ib_cq *ibcq, enum ib_cq_notify notify);
+int ipath_req_notify_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags notify_flags);
int ipath_resize_cq(struct ib_cq *ibcq, int cqe, struct ib_udata *udata);
diff --git a/drivers/infiniband/hw/mthca/mthca_cq.c b/drivers/infiniband/hw/mthca/mthca_cq.c
index efd79ef..cf0868f 100644
--- a/drivers/infiniband/hw/mthca/mthca_cq.c
+++ b/drivers/infiniband/hw/mthca/mthca_cq.c
@@ -726,11 +726,12 @@ repoll:
return err == 0 || err == -EAGAIN ? npolled : err;
}
-int mthca_tavor_arm_cq(struct ib_cq *cq, enum ib_cq_notify notify)
+int mthca_tavor_arm_cq(struct ib_cq *cq, enum ib_cq_notify_flags flags)
{
__be32 doorbell[2];
- doorbell[0] = cpu_to_be32((notify == IB_CQ_SOLICITED ?
+ doorbell[0] = cpu_to_be32(((flags & IB_CQ_SOLICITED_MASK) ==
+ IB_CQ_SOLICITED ?
MTHCA_TAVOR_CQ_DB_REQ_NOT_SOL :
MTHCA_TAVOR_CQ_DB_REQ_NOT) |
to_mcq(cq)->cqn);
@@ -743,7 +744,7 @@ int mthca_tavor_arm_cq(struct ib_cq *cq, enum ib_cq_notify notify)
return 0;
}
-int mthca_arbel_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify notify)
+int mthca_arbel_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags)
{
struct mthca_cq *cq = to_mcq(ibcq);
__be32 doorbell[2];
@@ -755,7 +756,8 @@ int mthca_arbel_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify notify)
doorbell[0] = ci;
doorbell[1] = cpu_to_be32((cq->cqn << 8) | (2 << 5) | (sn << 3) |
- (notify == IB_CQ_SOLICITED ? 1 : 2));
+ ((flags & IB_CQ_SOLICITED_MASK) ==
+ IB_CQ_SOLICITED ? 1 : 2));
mthca_write_db_rec(doorbell, cq->arm_db);
@@ -766,7 +768,7 @@ int mthca_arbel_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify notify)
wmb();
doorbell[0] = cpu_to_be32((sn << 28) |
- (notify == IB_CQ_SOLICITED ?
+ ((flags & IB_CQ_SOLICITED_MASK) == IB_CQ_SOLICITED ?
MTHCA_ARBEL_CQ_DB_REQ_NOT_SOL :
MTHCA_ARBEL_CQ_DB_REQ_NOT) |
cq->cqn);
diff --git a/drivers/infiniband/hw/mthca/mthca_dev.h b/drivers/infiniband/hw/mthca/mthca_dev.h
index b7e42ef..9bae3cc 100644
--- a/drivers/infiniband/hw/mthca/mthca_dev.h
+++ b/drivers/infiniband/hw/mthca/mthca_dev.h
@@ -495,8 +495,8 @@ void mthca_unmap_eq_icm(struct mthca_dev *dev);
int mthca_poll_cq(struct ib_cq *ibcq, int num_entries,
struct ib_wc *entry);
-int mthca_tavor_arm_cq(struct ib_cq *cq, enum ib_cq_notify notify);
-int mthca_arbel_arm_cq(struct ib_cq *cq, enum ib_cq_notify notify);
+int mthca_tavor_arm_cq(struct ib_cq *cq, enum ib_cq_notify_flags flags);
+int mthca_arbel_arm_cq(struct ib_cq *cq, enum ib_cq_notify_flags flags);
int mthca_init_cq(struct mthca_dev *dev, int nent,
struct mthca_ucontext *ctx, u32 pdn,
struct mthca_cq *cq);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 765589f..529a69d 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -431,9 +431,11 @@ struct ib_wc {
u8 port_num; /* valid only for DR SMPs on switches */
};
-enum ib_cq_notify {
- IB_CQ_SOLICITED,
- IB_CQ_NEXT_COMP
+enum ib_cq_notify_flags {
+ IB_CQ_SOLICITED = 1 << 0,
+ IB_CQ_NEXT_COMP = 1 << 1,
+ IB_CQ_SOLICITED_MASK = IB_CQ_SOLICITED | IB_CQ_NEXT_COMP,
+ IB_CQ_REPORT_MISSED_EVENTS = 1 << 2,
};
enum ib_srq_attr_mask {
@@ -987,7 +989,7 @@ struct ib_device {
struct ib_wc *wc);
int (*peek_cq)(struct ib_cq *cq, int wc_cnt);
int (*req_notify_cq)(struct ib_cq *cq,
- enum ib_cq_notify cq_notify);
+ enum ib_cq_notify_flags flags);
int (*req_ncomp_notif)(struct ib_cq *cq,
int wc_cnt);
struct ib_mr * (*get_dma_mr)(struct ib_pd *pd,
@@ -1414,14 +1416,34 @@ int ib_peek_cq(struct ib_cq *cq, int wc_cnt);
/**
* ib_req_notify_cq - Request completion notification on a CQ.
* @cq: The CQ to generate an event for.
- * @cq_notify: If set to %IB_CQ_SOLICITED, completion notification will
- * occur on the next solicited event. If set to %IB_CQ_NEXT_COMP,
- * notification will occur on the next completion.
+ * @flags:
+ * Must contain exactly one of %IB_CQ_SOLICITED or %IB_CQ_NEXT_COMP
+ * to request an event on the next solicited event or next work
+ * completion of any type, respectively. %IB_CQ_REPORT_MISSED_EVENTS
+ * may also be |ed in to request a hint about missed events, as
+ * described below.
+ *
+ * Return Value:
+ * < 0 means an error occurred while requesting notification
+ * == 0 means notification was requested successfully, and if
+ * IB_CQ_REPORT_MISSED_EVENTS was passed in, then no events
+ * were missed and it is safe to wait for another event. In
+ * this case it is guaranteed that any work completions added
+ * to the CQ since the last CQ poll will trigger a completion
+ * notification event.
+ * > 0 is only returned if IB_CQ_REPORT_MISSED_EVENTS was passed
+ * in. It means that the consumer must poll the CQ again to
+ * make sure it is empty to avoid missing an event because of a
+ * race between requesting notification and an entry being
+ * added to the CQ. This return value means it is possible
+ * (but not guaranteed) that a work completion has been added
+ * to the CQ since the last poll without triggering a
+ * completion notification event.
*/
static inline int ib_req_notify_cq(struct ib_cq *cq,
- enum ib_cq_notify cq_notify)
+ enum ib_cq_notify_flags flags)
{
- return cq->device->req_notify_cq(cq, cq_notify);
+ return cq->device->req_notify_cq(cq, flags);
}
/**
--
1.5.1.2
* [PATCH][RFC] IPoIB: Convert to NAPI
2007-04-26 22:43 ` [PATCH][RFC] IB: Return "maybe missed event" hint from ib_req_notify_cq() Roland Dreier
@ 2007-04-26 22:45 ` Roland Dreier
2007-04-30 20:11 ` [ofa-general] [PATCH][RFC] IB: Return "maybe missed event" hint from ib_req_notify_cq() Hoang-Nam Nguyen
1 sibling, 0 replies; 9+ messages in thread
From: Roland Dreier @ 2007-04-26 22:45 UTC
To: linux-kernel; +Cc: general, netdev
And here's the patch to convert IPoIB over to using NAPI...
---
Convert the IP-over-InfiniBand network device driver over to using
NAPI to handle all completions (both receive and send).
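
The budget accounting used by the poll routine in this patch can be modelled in user space as follows. This is a sketch under stated assumptions, not the driver code: struct mock_cq, mock_poll_cq() and mock_poll_budget() are hypothetical, MOCK_NUM_WC stands in for IPOIB_NUM_WC, and completion handling is reduced to counting. As in the patch, only receive completions are charged against the budget, and a short batch is taken to mean the CQ is empty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_NUM_WC 4   /* batch size; stands in for IPOIB_NUM_WC */

enum mock_wc_kind { MOCK_WC_RECV, MOCK_WC_SEND };

/* Hypothetical CQ model: a fixed array of completions read from head. */
struct mock_cq {
    const enum mock_wc_kind *wc;
    int count;  /* total completions in wc[] */
    int head;   /* index of the next completion to poll */
};

/* Copy up to max completions into out[]; returns how many were copied. */
static int mock_poll_cq(struct mock_cq *cq, int max, enum mock_wc_kind *out)
{
    int n = 0;

    while (n < max && cq->head < cq->count)
        out[n++] = cq->wc[cq->head++];
    return n;
}

/*
 * Handle completions until the budget of receives is used up or the CQ
 * runs dry. Stores the number of receives handled in *rx_done. Returns
 * true if more work may remain (keep polling), false if the CQ was
 * found empty (the caller would complete NAPI and re-arm the CQ).
 */
static bool mock_poll_budget(struct mock_cq *cq, int budget, int *rx_done)
{
    enum mock_wc_kind batch[MOCK_NUM_WC];
    int max = budget;
    int done = 0;
    bool empty = false;

    assert(rx_done != NULL);
    while (max > 0) {
        int t = max < MOCK_NUM_WC ? max : MOCK_NUM_WC;
        int n = mock_poll_cq(cq, t, batch);

        for (int i = 0; i < n; i++) {
            if (batch[i] == MOCK_WC_RECV) {
                done++;     /* only receives count against the budget */
                max--;
            }
        }
        if (n != t) {       /* short batch: the CQ is drained */
            empty = true;
            break;
        }
    }
    *rx_done = done;
    return !empty;
}
```

In the real driver the "empty" case is where netif_rx_complete() is called and the CQ is re-armed with IB_CQ_REPORT_MISSED_EVENTS, so that a completion landing in the race window causes a reschedule rather than a stall.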
Signed-off-by: Roland Dreier <rolandd@cisco.com>
---
drivers/infiniband/ulp/ipoib/ipoib.h | 1 +
drivers/infiniband/ulp/ipoib/ipoib_cm.c | 2 +-
drivers/infiniband/ulp/ipoib/ipoib_ib.c | 89 ++++++++++++++++++++++------
drivers/infiniband/ulp/ipoib/ipoib_main.c | 2 +
4 files changed, 74 insertions(+), 20 deletions(-)
diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index fd55826..15867af 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -311,6 +311,7 @@ extern struct workqueue_struct *ipoib_workqueue;
/* functions */
+int ipoib_poll(struct net_device *dev, int *budget);
void ipoib_ib_completion(struct ib_cq *cq, void *dev_ptr);
struct ipoib_ah *ipoib_create_ah(struct net_device *dev,
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 2b242a4..e1fdae1 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -418,7 +418,7 @@ void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
skb->dev = dev;
/* XXX get correct PACKET_ type here */
skb->pkt_type = PACKET_HOST;
- netif_rx_ni(skb);
+ netif_receive_skb(skb);
repost:
if (unlikely(ipoib_cm_post_receive(dev, wr_id)))
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index ba0ee5c..e3cc241 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -226,7 +226,7 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
skb->dev = dev;
/* XXX get correct PACKET_ type here */
skb->pkt_type = PACKET_HOST;
- netif_rx_ni(skb);
+ netif_receive_skb(skb);
} else {
ipoib_dbg_data(priv, "dropping loopback packet\n");
dev_kfree_skb_any(skb);
@@ -280,28 +280,65 @@ static void ipoib_ib_handle_tx_wc(struct net_device *dev, struct ib_wc *wc)
wc->status, wr_id, wc->vendor_err);
}
-static void ipoib_ib_handle_wc(struct net_device *dev, struct ib_wc *wc)
+int ipoib_poll(struct net_device *dev, int *budget)
{
- if (wc->wr_id & IPOIB_CM_OP_SRQ)
- ipoib_cm_handle_rx_wc(dev, wc);
- else if (wc->wr_id & IPOIB_OP_RECV)
- ipoib_ib_handle_rx_wc(dev, wc);
- else
- ipoib_ib_handle_tx_wc(dev, wc);
+ struct ipoib_dev_priv *priv = netdev_priv(dev);
+ int max = min(*budget, dev->quota);
+ int done;
+ int t;
+ int empty;
+ int n, i;
+
+repoll:
+ done = 0;
+ empty = 0;
+
+ while (max) {
+ t = min(IPOIB_NUM_WC, max);
+ n = ib_poll_cq(priv->cq, t, priv->ibwc);
+
+ for (i = 0; i < n; ++i) {
+ struct ib_wc *wc = priv->ibwc + i;
+
+ if (wc->wr_id & IPOIB_CM_OP_SRQ) {
+ ++done;
+ --max;
+ ipoib_cm_handle_rx_wc(dev, wc);
+ } else if (wc->wr_id & IPOIB_OP_RECV) {
+ ++done;
+ --max;
+ ipoib_ib_handle_rx_wc(dev, wc);
+ } else
+ ipoib_ib_handle_tx_wc(dev, wc);
+ }
+
+ if (n != t) {
+ empty = 1;
+ break;
+ }
+ }
+
+ dev->quota -= done;
+ *budget -= done;
+
+ if (empty) {
+ netif_rx_complete(dev);
+ if (unlikely(ib_req_notify_cq(priv->cq,
+ IB_CQ_NEXT_COMP |
+ IB_CQ_REPORT_MISSED_EVENTS))) {
+ netif_rx_reschedule(dev, 0);
+ return 1;
+ }
+
+ return 0;
+ }
+
+ return 1;
}
void ipoib_ib_completion(struct ib_cq *cq, void *dev_ptr)
{
- struct net_device *dev = (struct net_device *) dev_ptr;
- struct ipoib_dev_priv *priv = netdev_priv(dev);
- int n, i;
-
- ib_req_notify_cq(cq, IB_CQ_NEXT_COMP);
- do {
- n = ib_poll_cq(cq, IPOIB_NUM_WC, priv->ibwc);
- for (i = 0; i < n; ++i)
- ipoib_ib_handle_wc(dev, priv->ibwc + i);
- } while (n == IPOIB_NUM_WC);
+ netif_rx_schedule(dev_ptr);
}
static inline int post_send(struct ipoib_dev_priv *priv,
@@ -514,9 +551,10 @@ int ipoib_ib_dev_stop(struct net_device *dev)
struct ib_qp_attr qp_attr;
unsigned long begin;
struct ipoib_tx_buf *tx_req;
- int i;
+ int i, n;
clear_bit(IPOIB_FLAG_INITIALIZED, &priv->flags);
+ netif_poll_disable(dev);
ipoib_cm_dev_stop(dev);
@@ -568,6 +606,16 @@ int ipoib_ib_dev_stop(struct net_device *dev)
goto timeout;
}
+ do {
+ n = ib_poll_cq(priv->cq, IPOIB_NUM_WC, priv->ibwc);
+ for (i = 0; i < n; ++i) {
+ if (priv->ibwc[i].wr_id & IPOIB_OP_RECV)
+ ipoib_ib_handle_rx_wc(dev, priv->ibwc + i);
+ else
+ ipoib_ib_handle_tx_wc(dev, priv->ibwc + i);
+ }
+ } while (n == IPOIB_NUM_WC);
+
msleep(1);
}
@@ -596,6 +644,9 @@ timeout:
msleep(1);
}
+ netif_poll_enable(dev);
+ ib_req_notify_cq(priv->cq, IB_CQ_NEXT_COMP);
+
return 0;
}
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index f2a40ae..a69c472 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -952,6 +952,8 @@ static void ipoib_setup(struct net_device *dev)
dev->hard_header = ipoib_hard_header;
dev->set_multicast_list = ipoib_set_mcast_list;
dev->neigh_setup = ipoib_neigh_setup_dev;
+ dev->poll = ipoib_poll;
+ dev->weight = 100;
dev->watchdog_timeo = HZ;
--
1.5.1.2
* Re: What's in infiniband.git for 2.6.22
2007-04-26 18:20 What's in infiniband.git for 2.6.22 Roland Dreier
2007-04-26 22:43 ` [PATCH][RFC] IB: Return "maybe missed event" hint from ib_req_notify_cq() Roland Dreier
@ 2007-04-27 15:30 ` Michael S. Tsirkin
2007-04-28 3:56 ` Roland Dreier
1 sibling, 1 reply; 9+ messages in thread
From: Michael S. Tsirkin @ 2007-04-27 15:30 UTC
To: Roland Dreier; +Cc: linux-kernel, general
> Quoting Roland Dreier <rdreier@cisco.com>:
> Subject: What's in infiniband.git for 2.6.22
>
> Here's a short summary of what my plans for 2.6.22 are. For
> reference, everything is in my git tree:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git
What about the mthca patch to use separate HW queues for kernel RC/UD/userspace RC?
--
MST
* Re: What's in infiniband.git for 2.6.22
2007-04-27 15:30 ` What's in infiniband.git for 2.6.22 Michael S. Tsirkin
@ 2007-04-28 3:56 ` Roland Dreier
2007-04-28 17:55 ` Michael S. Tsirkin
2007-04-29 7:51 ` Michael S. Tsirkin
0 siblings, 2 replies; 9+ messages in thread
From: Roland Dreier @ 2007-04-28 3:56 UTC
To: Michael S. Tsirkin; +Cc: linux-kernel, general
> What about the mthca patch to use separate HW queues for kernel RC/UD/userspace RC?
right, I'll queue that up too.
BTW is there something analogous we could do for mlx4, or is FW not
quite ready?
- R.
* Re: What's in infiniband.git for 2.6.22
2007-04-28 3:56 ` Roland Dreier
@ 2007-04-28 17:55 ` Michael S. Tsirkin
2007-04-29 7:51 ` Michael S. Tsirkin
1 sibling, 0 replies; 9+ messages in thread
From: Michael S. Tsirkin @ 2007-04-28 17:55 UTC
To: Roland Dreier; +Cc: Michael S. Tsirkin, linux-kernel, general
> Quoting Roland Dreier <rdreier@cisco.com>:
> Subject: Re: What's in infiniband.git for 2.6.22
>
> > What about the mthca patch to use separate HW queues for kernel RC/UD/userspace RC?
>
> right, I'll queue that up too.
> BTW is there something analogous we could do for mlx4, or is FW not
> quite ready?
I am assured the issue this is solving is not present in mlx4.
--
MST
* Re: What's in infiniband.git for 2.6.22
2007-04-28 3:56 ` Roland Dreier
2007-04-28 17:55 ` Michael S. Tsirkin
@ 2007-04-29 7:51 ` Michael S. Tsirkin
2007-04-30 16:30 ` Roland Dreier
1 sibling, 1 reply; 9+ messages in thread
From: Michael S. Tsirkin @ 2007-04-29 7:51 UTC
To: Roland Dreier; +Cc: Michael S. Tsirkin, linux-kernel, general
> Quoting Roland Dreier <rdreier@cisco.com>:
> Subject: Re: What's in infiniband.git for 2.6.22
>
> > What about the mthca patch to use separate HW queues for kernel RC/UD/userspace RC?
>
> right, I'll queue that up too.
I think you want to queue the following obvious bugfix up as well:
http://www.openfabrics.org/git/?p=~vlad/ofed_1_2/.git;a=blob;f=kernel_patches/fixes/ipoib_crash_on_error.patch;hb=HEAD
--
MST
* Re: What's in infiniband.git for 2.6.22
2007-04-29 7:51 ` Michael S. Tsirkin
@ 2007-04-30 16:30 ` Roland Dreier
0 siblings, 0 replies; 9+ messages in thread
From: Roland Dreier @ 2007-04-30 16:30 UTC
To: Michael S. Tsirkin; +Cc: linux-kernel, general
> I think you want to queue the following obvious bugfix up as well:
> http://www.openfabrics.org/git/?p=~vlad/ofed_1_2/.git;a=blob;f=kernel_patches/fixes/ipoib_crash_on_error.patch;hb=HEAD
Yes, I'll get that one too.
* Re: [ofa-general] [PATCH][RFC] IB: Return "maybe missed event" hint from ib_req_notify_cq()
2007-04-26 22:43 ` [PATCH][RFC] IB: Return "maybe missed event" hint from ib_req_notify_cq() Roland Dreier
2007-04-26 22:45 ` [PATCH][RFC] IPoIB: Convert to NAPI Roland Dreier
@ 2007-04-30 20:11 ` Hoang-Nam Nguyen
1 sibling, 0 replies; 9+ messages in thread
From: Hoang-Nam Nguyen @ 2007-04-30 20:11 UTC
To: Roland Dreier; +Cc: general, general-bounces, linux-kernel
Hi Roland!
As far as this concerns ehca this looks great.
Thanks
Nam
general-bounces@lists.openfabrics.org wrote on 27.04.2007 00:43:19:
> > - "IB: Return "maybe missed event" hint from ib_req_notify_cq()"
> > This extends the API in a way that lets us implement NAPI, but may
> > be useful for other things too. It touches all the drivers, and I
> > still need to finish updating cxgb3 to work correctly. I haven't
> > heard anything negative about this, so I'll fix it up, post it one
> > more time for review, and plan on merging it.
>
> As promised, here is that patch for review, with a cxgb3
> implementation included.
>
> ---
>
> The semantics defined by the InfiniBand specification say that
> completion events are only generated when a completion is added to a
> completion queue (CQ) after completion notification is requested. In
> other words, this means that the following race is possible:
>
> while (CQ is not empty)
> ib_poll_cq(CQ);
> // new completion is added after while loop is exited
> ib_req_notify_cq(CQ);
> // no event is generated for the existing completion
>
> To close this race, the IB spec recommends doing another poll of the
> CQ after requesting notification.
>
> However, it is not always possible to arrange code this way (for
> example, we have found that NAPI for IPoIB cannot poll after
> requesting notification). Also, some hardware (eg Mellanox HCAs)
> actually will generate an event for completions added before the call
> to ib_req_notify_cq() -- which is allowed by the spec, since there's
> no way for any upper-layer consumer to know exactly when a completion
> was really added -- so the extra poll of the CQ is just a waste.
>
> Motivated by this, we add a new flag "IB_CQ_REPORT_MISSED_EVENTS" for
> ib_req_notify_cq() so that it can return a hint about whether a
> completion may have been added before the request for notification.
> The return value of ib_req_notify_cq() is extended so:
>
> < 0 means an error occurred while requesting notification
> == 0 means notification was requested successfully, and if
> IB_CQ_REPORT_MISSED_EVENTS was passed in, then no
> events were missed and it is safe to wait for another
> event.
> > 0 is only returned if IB_CQ_REPORT_MISSED_EVENTS was
> passed in. It means that the consumer must poll the
> CQ again to make sure it is empty to avoid the race
> described above.
>
> We add a flag to enable this behavior rather than turning it on
> unconditionally, because checking for missed events may incur
> significant overhead for some low-level drivers, and consumers that
> don't care about the results of this test shouldn't be forced to pay
> for the test.
>
> Signed-off-by: Roland Dreier <rolandd@cisco.com>
> ---
> drivers/infiniband/hw/amso1100/c2.h | 2 +-
> drivers/infiniband/hw/amso1100/c2_cq.c | 16 ++++++++---
> drivers/infiniband/hw/cxgb3/cxio_hal.c | 3 ++
> drivers/infiniband/hw/cxgb3/iwch_provider.c | 8 +++--
> drivers/infiniband/hw/ehca/ehca_iverbs.h | 2 +-
> drivers/infiniband/hw/ehca/ehca_reqs.c | 14 +++++++--
> drivers/infiniband/hw/ehca/ipz_pt_fn.h | 8 +++++
> drivers/infiniband/hw/ipath/ipath_cq.c | 15 +++++++---
> drivers/infiniband/hw/ipath/ipath_verbs.h | 2 +-
> drivers/infiniband/hw/mthca/mthca_cq.c | 12 +++++---
> drivers/infiniband/hw/mthca/mthca_dev.h | 4 +-
> include/rdma/ib_verbs.h | 40 +++++++++++++++++++++------
> 12 files changed, 93 insertions(+), 33 deletions(-)
>
> diff --git a/drivers/infiniband/hw/amso1100/c2.h
> b/drivers/infiniband/hw/amso1100/c2.h
> index 04a9db5..fa58200 100644
> --- a/drivers/infiniband/hw/amso1100/c2.h
> +++ b/drivers/infiniband/hw/amso1100/c2.h
> @@ -519,7 +519,7 @@ extern void c2_free_cq(struct c2_dev *c2dev,
> struct c2_cq *cq);
> extern void c2_cq_event(struct c2_dev *c2dev, u32 mq_index);
> extern void c2_cq_clean(struct c2_dev *c2dev, struct c2_qp *qp, u32 mq_index);
> extern int c2_poll_cq(struct ib_cq *ibcq, int num_entries, struct
> ib_wc *entry);
> -extern int c2_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify notify);
> +extern int c2_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags);
>
> /* CM */
> extern int c2_llp_connect(struct iw_cm_id *cm_id,
> diff --git a/drivers/infiniband/hw/amso1100/c2_cq.c
> b/drivers/infiniband/hw/amso1100/c2_cq.c
> index 5175c99..d2b3366 100644
> --- a/drivers/infiniband/hw/amso1100/c2_cq.c
> +++ b/drivers/infiniband/hw/amso1100/c2_cq.c
> @@ -217,17 +217,19 @@ int c2_poll_cq(struct ib_cq *ibcq, int
> num_entries, struct ib_wc *entry)
> return npolled;
> }
>
> -int c2_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify notify)
> +int c2_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags notify_flags)
> {
> struct c2_mq_shared __iomem *shared;
> struct c2_cq *cq;
> + unsigned long flags;
> + int ret = 0;
>
> cq = to_c2cq(ibcq);
> shared = cq->mq.peer;
>
> - if (notify == IB_CQ_NEXT_COMP)
> + if ((notify_flags & IB_CQ_SOLICITED_MASK) == IB_CQ_NEXT_COMP)
> writeb(C2_CQ_NOTIFICATION_TYPE_NEXT, &shared->notification_type);
> - else if (notify == IB_CQ_SOLICITED)
> + else if ((notify_flags & IB_CQ_SOLICITED_MASK) == IB_CQ_SOLICITED)
> writeb(C2_CQ_NOTIFICATION_TYPE_NEXT_SE, &shared->notification_type);
> else
> return -EINVAL;
> @@ -241,7 +243,13 @@ int c2_arm_cq(struct ib_cq *ibcq, enum
> ib_cq_notify notify)
> */
> readb(&shared->armed);
>
> - return 0;
> + if (notify_flags & IB_CQ_REPORT_MISSED_EVENTS) {
> + spin_lock_irqsave(&cq->lock, flags);
> + ret = !c2_mq_empty(&cq->mq);
> + spin_unlock_irqrestore(&cq->lock, flags);
> + }
> +
> + return ret;
> }
>
> static void c2_free_cq_buf(struct c2_dev *c2dev, struct c2_mq *mq)
> diff --git a/drivers/infiniband/hw/cxgb3/cxio_hal.c
> b/drivers/infiniband/hw/cxgb3/cxio_hal.c
> index f5e9aee..76049af 100644
> --- a/drivers/infiniband/hw/cxgb3/cxio_hal.c
> +++ b/drivers/infiniband/hw/cxgb3/cxio_hal.c
> @@ -114,7 +114,10 @@ int cxio_hal_cq_op(struct cxio_rdev *rdev_p,
> struct t3_cq *cq,
> return -EIO;
> }
> }
> +
> + return 1;
> }
> +
> return 0;
> }
>
> diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c
> b/drivers/infiniband/hw/cxgb3/iwch_provider.c
> index 24e0df0..e89957f 100644
> --- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
> +++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
> @@ -292,7 +292,7 @@ static int iwch_resize_cq(struct ib_cq *cq, int
> cqe, struct ib_udata *udata)
> #endif
> }
>
> -static int iwch_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify notify)
> +static int iwch_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags)
> {
> struct iwch_dev *rhp;
> struct iwch_cq *chp;
> @@ -303,7 +303,7 @@ static int iwch_arm_cq(struct ib_cq *ibcq, enum
> ib_cq_notify notify)
>
> chp = to_iwch_cq(ibcq);
> rhp = chp->rhp;
> - if (notify == IB_CQ_SOLICITED)
> + if ((flags & IB_CQ_SOLICITED_MASK) == IB_CQ_SOLICITED)
> cq_op = CQ_ARM_SE;
> else
> cq_op = CQ_ARM_AN;
> @@ -317,9 +317,11 @@ static int iwch_arm_cq(struct ib_cq *ibcq, enum
> ib_cq_notify notify)
> PDBG("%s rptr 0x%x\n", __FUNCTION__, chp->cq.rptr);
> err = cxio_hal_cq_op(&rhp->rdev, &chp->cq, cq_op, 0);
> spin_unlock_irqrestore(&chp->lock, flag);
> - if (err)
> + if (err < 0)
> printk(KERN_ERR MOD "Error %d rearming CQID 0x%x\n", err,
> chp->cq.cqid);
> + if (err > 0 && !(flags & IB_CQ_REPORT_MISSED_EVENTS))
> + err = 0;
> return err;
> }
>
> diff --git a/drivers/infiniband/hw/ehca/ehca_iverbs.h
> b/drivers/infiniband/hw/ehca/ehca_iverbs.h
> index 95fd59f..9e5460d 100644
> --- a/drivers/infiniband/hw/ehca/ehca_iverbs.h
> +++ b/drivers/infiniband/hw/ehca/ehca_iverbs.h
> @@ -135,7 +135,7 @@ int ehca_poll_cq(struct ib_cq *cq, int
> num_entries, struct ib_wc *wc);
>
> int ehca_peek_cq(struct ib_cq *cq, int wc_cnt);
>
> -int ehca_req_notify_cq(struct ib_cq *cq, enum ib_cq_notify cq_notify);
> +int ehca_req_notify_cq(struct ib_cq *cq, enum ib_cq_notify_flags
> notify_flags);
>
> struct ib_qp *ehca_create_qp(struct ib_pd *pd,
> struct ib_qp_init_attr *init_attr,
> diff --git a/drivers/infiniband/hw/ehca/ehca_reqs.c
> b/drivers/infiniband/hw/ehca/ehca_reqs.c
> index 08d3f89..caec9de 100644
> --- a/drivers/infiniband/hw/ehca/ehca_reqs.c
> +++ b/drivers/infiniband/hw/ehca/ehca_reqs.c
> @@ -634,11 +634,13 @@ poll_cq_exit0:
> return ret;
> }
>
> -int ehca_req_notify_cq(struct ib_cq *cq, enum ib_cq_notify cq_notify)
> +int ehca_req_notify_cq(struct ib_cq *cq, enum ib_cq_notify_flags
> notify_flags)
> {
> struct ehca_cq *my_cq = container_of(cq, struct ehca_cq, ib_cq);
> + unsigned long spl_flags;
> + int ret = 0;
>
> - switch (cq_notify) {
> + switch (notify_flags & IB_CQ_SOLICITED_MASK) {
> case IB_CQ_SOLICITED:
> hipz_set_cqx_n0(my_cq, 1);
> break;
> @@ -649,5 +651,11 @@ int ehca_req_notify_cq(struct ib_cq *cq, enum
> ib_cq_notify cq_notify)
> return -EINVAL;
> }
>
> - return 0;
> + if (notify_flags & IB_CQ_REPORT_MISSED_EVENTS) {
> + spin_lock_irqsave(&my_cq->spinlock, spl_flags);
> + ret = ipz_qeit_is_valid(&my_cq->ipz_queue);
> + spin_unlock_irqrestore(&my_cq->spinlock, spl_flags);
> + }
> +
> + return ret;
> }
> diff --git a/drivers/infiniband/hw/ehca/ipz_pt_fn.h
> b/drivers/infiniband/hw/ehca/ipz_pt_fn.h
> index 8199c45..57f141a 100644
> --- a/drivers/infiniband/hw/ehca/ipz_pt_fn.h
> +++ b/drivers/infiniband/hw/ehca/ipz_pt_fn.h
> @@ -140,6 +140,14 @@ static inline void
> *ipz_qeit_get_inc_valid(struct ipz_queue *queue)
> return cqe;
> }
>
> +static inline int ipz_qeit_is_valid(struct ipz_queue *queue)
> +{
> + struct ehca_cqe *cqe = ipz_qeit_get(queue);
> + u32 cqe_flags = cqe->cqe_flags;
> +
> + return cqe_flags >> 7 == (queue->toggle_state & 1);
> +}
> +
> /*
> * returns and resets Queue Entry iterator
> * returns address (kv) of first Queue Entry
> diff --git a/drivers/infiniband/hw/ipath/ipath_cq.c
> b/drivers/infiniband/hw/ipath/ipath_cq.c
> index 87462e0..9582145 100644
> --- a/drivers/infiniband/hw/ipath/ipath_cq.c
> +++ b/drivers/infiniband/hw/ipath/ipath_cq.c
> @@ -306,17 +306,18 @@ int ipath_destroy_cq(struct ib_cq *ibcq)
> /**
> * ipath_req_notify_cq - change the notification type for a completion queue
> * @ibcq: the completion queue
> - * @notify: the type of notification to request
> + * @notify_flags: the type of notification to request
> *
> * Returns 0 for success.
> *
> * This may be called from interrupt context. Also called by
> * ib_req_notify_cq() in the generic verbs code.
> */
> -int ipath_req_notify_cq(struct ib_cq *ibcq, enum ib_cq_notify notify)
> +int ipath_req_notify_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags
> notify_flags)
> {
> struct ipath_cq *cq = to_icq(ibcq);
> unsigned long flags;
> + int ret = 0;
>
> spin_lock_irqsave(&cq->lock, flags);
> /*
> @@ -324,9 +325,15 @@ int ipath_req_notify_cq(struct ib_cq *ibcq,
> enum ib_cq_notify notify)
> * any other transitions (see C11-31 and C11-32 in ch. 11.4.2.2).
> */
> if (cq->notify != IB_CQ_NEXT_COMP)
> - cq->notify = notify;
> + cq->notify = notify_flags & IB_CQ_SOLICITED_MASK;
> +
> + if ((notify_flags & IB_CQ_REPORT_MISSED_EVENTS) &&
> + cq->queue->head != cq->queue->tail)
> + ret = 1;
> +
> spin_unlock_irqrestore(&cq->lock, flags);
> - return 0;
> +
> + return ret;
> }
>
> /**
> diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.h
> b/drivers/infiniband/hw/ipath/ipath_verbs.h
> index c0c8d5b..6b3b770 100644
> --- a/drivers/infiniband/hw/ipath/ipath_verbs.h
> +++ b/drivers/infiniband/hw/ipath/ipath_verbs.h
> @@ -716,7 +716,7 @@ struct ib_cq *ipath_create_cq(struct ib_device
> *ibdev, int entries,
>
> int ipath_destroy_cq(struct ib_cq *ibcq);
>
> -int ipath_req_notify_cq(struct ib_cq *ibcq, enum ib_cq_notify notify);
> +int ipath_req_notify_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags
> notify_flags);
>
> int ipath_resize_cq(struct ib_cq *ibcq, int cqe, struct ib_udata *udata);
>
> diff --git a/drivers/infiniband/hw/mthca/mthca_cq.c
> b/drivers/infiniband/hw/mthca/mthca_cq.c
> index efd79ef..cf0868f 100644
> --- a/drivers/infiniband/hw/mthca/mthca_cq.c
> +++ b/drivers/infiniband/hw/mthca/mthca_cq.c
> @@ -726,11 +726,12 @@ repoll:
> return err == 0 || err == -EAGAIN ? npolled : err;
> }
>
> -int mthca_tavor_arm_cq(struct ib_cq *cq, enum ib_cq_notify notify)
> +int mthca_tavor_arm_cq(struct ib_cq *cq, enum ib_cq_notify_flags flags)
> {
> __be32 doorbell[2];
>
> - doorbell[0] = cpu_to_be32((notify == IB_CQ_SOLICITED ?
> + doorbell[0] = cpu_to_be32(((flags & IB_CQ_SOLICITED_MASK) ==
> + IB_CQ_SOLICITED ?
> MTHCA_TAVOR_CQ_DB_REQ_NOT_SOL :
> MTHCA_TAVOR_CQ_DB_REQ_NOT) |
> to_mcq(cq)->cqn);
> @@ -743,7 +744,7 @@ int mthca_tavor_arm_cq(struct ib_cq *cq, enum
> ib_cq_notify notify)
> return 0;
> }
>
> -int mthca_arbel_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify notify)
> +int mthca_arbel_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags)
> {
> struct mthca_cq *cq = to_mcq(ibcq);
> __be32 doorbell[2];
> @@ -755,7 +756,8 @@ int mthca_arbel_arm_cq(struct ib_cq *ibcq, enum
> ib_cq_notify notify)
>
> doorbell[0] = ci;
> doorbell[1] = cpu_to_be32((cq->cqn << 8) | (2 << 5) | (sn << 3) |
> - (notify == IB_CQ_SOLICITED ? 1 : 2));
> + ((flags & IB_CQ_SOLICITED_MASK) ==
> + IB_CQ_SOLICITED ? 1 : 2));
>
> mthca_write_db_rec(doorbell, cq->arm_db);
>
> @@ -766,7 +768,7 @@ int mthca_arbel_arm_cq(struct ib_cq *ibcq, enum
> ib_cq_notify notify)
> wmb();
>
> doorbell[0] = cpu_to_be32((sn << 28) |
> - (notify == IB_CQ_SOLICITED ?
> + ((flags & IB_CQ_SOLICITED_MASK) == IB_CQ_SOLICITED ?
> MTHCA_ARBEL_CQ_DB_REQ_NOT_SOL :
> MTHCA_ARBEL_CQ_DB_REQ_NOT) |
> cq->cqn);
> diff --git a/drivers/infiniband/hw/mthca/mthca_dev.h
> b/drivers/infiniband/hw/mthca/mthca_dev.h
> index b7e42ef..9bae3cc 100644
> --- a/drivers/infiniband/hw/mthca/mthca_dev.h
> +++ b/drivers/infiniband/hw/mthca/mthca_dev.h
> @@ -495,8 +495,8 @@ void mthca_unmap_eq_icm(struct mthca_dev *dev);
>
> int mthca_poll_cq(struct ib_cq *ibcq, int num_entries,
> struct ib_wc *entry);
> -int mthca_tavor_arm_cq(struct ib_cq *cq, enum ib_cq_notify notify);
> -int mthca_arbel_arm_cq(struct ib_cq *cq, enum ib_cq_notify notify);
> +int mthca_tavor_arm_cq(struct ib_cq *cq, enum ib_cq_notify_flags flags);
> +int mthca_arbel_arm_cq(struct ib_cq *cq, enum ib_cq_notify_flags flags);
> int mthca_init_cq(struct mthca_dev *dev, int nent,
> struct mthca_ucontext *ctx, u32 pdn,
> struct mthca_cq *cq);
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 765589f..529a69d 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -431,9 +431,11 @@ struct ib_wc {
> u8 port_num; /* valid only for DR SMPs on switches */
> };
>
> -enum ib_cq_notify {
> - IB_CQ_SOLICITED,
> - IB_CQ_NEXT_COMP
> +enum ib_cq_notify_flags {
> + IB_CQ_SOLICITED = 1 << 0,
> + IB_CQ_NEXT_COMP = 1 << 1,
> + IB_CQ_SOLICITED_MASK = IB_CQ_SOLICITED | IB_CQ_NEXT_COMP,
> + IB_CQ_REPORT_MISSED_EVENTS = 1 << 2,
> };
>
> enum ib_srq_attr_mask {
> @@ -987,7 +989,7 @@ struct ib_device {
> struct ib_wc *wc);
> int (*peek_cq)(struct ib_cq *cq, int wc_cnt);
> int (*req_notify_cq)(struct ib_cq *cq,
> - enum ib_cq_notify cq_notify);
> + enum ib_cq_notify_flags flags);
> int (*req_ncomp_notif)(struct ib_cq *cq,
> int wc_cnt);
> struct ib_mr * (*get_dma_mr)(struct ib_pd *pd,
> @@ -1414,14 +1416,34 @@ int ib_peek_cq(struct ib_cq *cq, int wc_cnt);
> /**
> * ib_req_notify_cq - Request completion notification on a CQ.
> * @cq: The CQ to generate an event for.
> - * @cq_notify: If set to %IB_CQ_SOLICITED, completion notification will
> - * occur on the next solicited event. If set to %IB_CQ_NEXT_COMP,
> - * notification will occur on the next completion.
> + * @flags:
> + * Must contain exactly one of %IB_CQ_SOLICITED or %IB_CQ_NEXT_COMP
> + * to request an event on the next solicited event or next work
> + * completion at any type, respectively. %IB_CQ_REPORT_MISSED_EVENTS
> + * may also be |ed in to request a hint about missed events, as
> + * described below.
> + *
> + * Return Value:
> + * < 0 means an error occurred while requesting notification
> + * == 0 means notification was requested successfully, and if
> + * IB_CQ_REPORT_MISSED_EVENTS was passed in, then no events
> + * were missed and it is safe to wait for another event. In
> + * this case is it guaranteed that any work completions added
> + * to the CQ since the last CQ poll will trigger a completion
> + * notification event.
> + * > 0 is only returned if IB_CQ_REPORT_MISSED_EVENTS was passed
> + * in. It means that the consumer must poll the CQ again to
> + * make sure it is empty to avoid missing an event because of a
> + * race between requesting notification and an entry being
> + * added to the CQ. This return value means it is possible
> + * (but not guaranteed) that a work completion has been added
> + * to the CQ since the last poll without triggering a
> + * completion notification event.
> */
> static inline int ib_req_notify_cq(struct ib_cq *cq,
> - enum ib_cq_notify cq_notify)
> + enum ib_cq_notify_flags flags)
> {
> - return cq->device->req_notify_cq(cq, cq_notify);
> + return cq->device->req_notify_cq(cq, flags);
> }
>
> /**
> --
> 1.5.1.2
Thread overview: 9+ messages
2007-04-26 18:20 What's in infiniband.git for 2.6.22 Roland Dreier
2007-04-26 22:43 ` [PATCH][RFC] IB: Return "maybe missed event" hint from ib_req_notify_cq() Roland Dreier
2007-04-26 22:45 ` [PATCH][RFC] IPoIB: Convert to NAPI Roland Dreier
2007-04-30 20:11 ` [ofa-general] [PATCH][RFC] IB: Return "maybe missed event" hint from ib_req_notify_cq() Hoang-Nam Nguyen
2007-04-27 15:30 ` What's in infiniband.git for 2.6.22 Michael S. Tsirkin
2007-04-28 3:56 ` Roland Dreier
2007-04-28 17:55 ` Michael S. Tsirkin
2007-04-29 7:51 ` Michael S. Tsirkin
2007-04-30 16:30 ` Roland Dreier