Hi
1. o2net_fill_node_map() calls o2net_tx_can_proceed().
2. If o2net_tx_can_proceed() returns false, "ret" and "sc" are not set for
that node, so the values from the previous iteration are reused. I do not
think this is reasonable.
I do not know whether this hides a bug.
Checking the return value is harmless and makes the code more robust.
Finally, any feedback about this process (positive or negative) would be
greatly appreciated.
/* Get a map of all nodes to which this node is currently connected to */
void o2net_fill_node_map(unsigned long *map, unsigned bytes)
{
	struct o2net_sock_container *sc = NULL;
	int node, ret = 0;

	BUG_ON(bytes < (BITS_TO_LONGS(O2NM_MAX_NODES) * sizeof(unsigned long)));
	memset(map, 0, bytes);
	for (node = 0; node < O2NM_MAX_NODES; ++node) {
		if (!o2net_tx_can_proceed(o2net_nn_from_num(node), &sc, &ret))
			continue;
		if (!ret) {
			set_bit(node, map);
			sc_put(sc);
		}
+		sc = NULL;
+		ret = 0;
	}
}
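To illustrate the concern, here is a minimal userspace sketch. It is a toy model and not the kernel code: MAX_NODES, struct conn, tx_can_proceed() and the check_return flag are hypothetical stand-ins, and the assumption that the callee leaves its output parameters untouched when it returns false is made only for the sake of the example.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

#define MAX_NODES 8 /* hypothetical stand-in for O2NM_MAX_NODES */

/* Hypothetical stand-in for struct o2net_sock_container. */
struct conn {
	int refs;
};

static struct conn conns[MAX_NODES];

/*
 * Toy model, not the kernel function. Odd nodes return false and leave
 * *sc and *err untouched. Node 4 returns true with an error. Every
 * other even node returns true with a referenced connection.
 */
static bool tx_can_proceed(int node, struct conn **sc, int *err)
{
	if (node % 2)
		return false;	/* outputs deliberately left untouched */
	if (node == 4) {
		*sc = NULL;
		*err = -1;
		return true;
	}
	conns[node].refs++;	/* the caller must drop this reference */
	*sc = &conns[node];
	*err = 0;
	return true;
}

/* Build the node bitmap, optionally ignoring the callee's return value. */
static unsigned long fill_node_map(bool check_return)
{
	struct conn *sc = NULL;
	unsigned long map = 0;
	int node, err = 0;

	assert(MAX_NODES <= sizeof(map) * CHAR_BIT);
	memset(conns, 0, sizeof(conns));
	for (node = 0; node < MAX_NODES; ++node) {
		bool ok = tx_can_proceed(node, &sc, &err);

		if (check_return && !ok)
			continue;
		if (!err) {
			/* Without the check, err and sc may be stale here. */
			map |= 1UL << node;
			sc->refs--;
		}
	}
	return map;
}
```

In this model, fill_node_map(true) marks only nodes 0, 2 and 6. fill_node_map(false) also marks nodes 1, 3 and 7 from stale values and drops a reference it never took, which is the kind of reuse the return-value check guards against.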
________________________________
zhangguanghui 10102
-------------------------------------------------------------------------------------------------------------------------------------
This e-mail and its attachments contain confidential information from H3C, which is
intended only for the person or entity whose address is listed above. Any use of the
information contained herein in any way (including, but not limited to, total or partial
disclosure, reproduction, or dissemination) by persons other than the intended
recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender
by phone or email immediately and delete it!
Hi,
Please read the rules in Documentation/SubmittingPatches before submitting a
patch. This will help the maintainers review your patch.

On 2015/6/6 21:06, Zhangguanghui wrote:
> Hi
>
> 1. in the callback o2net_fill_node_map -> o2net_tx_can_proceed()
> [...]

Hi
1. o2net_sendpage() calls sendpage().
2. If sendpage (TCP socket send) keeps returning -EAGAIN, o2net_sendpage()
gets stuck in an endless loop, even though cond_resched() is called on every
iteration.
3. I do not think this is reasonable. After sendpage has returned -EAGAIN 20
times in a row, shut down the socket to avoid affecting the entire cluster.
Finally, any feedback about this process (positive or negative) would be
greatly appreciated.
--- tcp.c 2015-06-30 11:46:54.727447919 +0800
+++ tcp.c.diff 2015-06-30 11:52:12.823447881 +0800
@@ -949,6 +949,7 @@
 {
 	struct o2net_node *nn = o2net_nn_from_num(sc->sc_node->nd_num);
 	ssize_t ret;
+	int send_fails = 20;
 	while (1) {
 		mutex_lock(&sc->sc_send_lock);
@@ -959,10 +960,11 @@
 		mutex_unlock(&sc->sc_send_lock);
 		if (ret == size)
 			break;
-		if (ret == (ssize_t)-EAGAIN) {
+		if (ret == (ssize_t)-EAGAIN && send_fails > 0) {
 			mlog(0, "sendpage of size %zu to " SC_NODEF_FMT
 			     " returned EAGAIN\n", size, SC_NODEF_ARGS(sc));
 			cond_resched();
+			--send_fails;
 			continue;
 		}
 		mlog(ML_ERROR, "sendpage of size %zu to " SC_NODEF_FMT
syslog:
/var/log/syslog:Jun 29 09:32:58 cvk47 kernel: [156022.769539] (kworker/u130:1,12041,9):o2net_sendpage:1026 sendpage of size 24 to node cvk61 (num 5) at 172.16.202.61:7100 returned EAGAIN
/var/log/syslog:Jun 29 09:32:58 cvk47 kernel: [156022.769542] (kworker/u130:1,12041,9):o2net_sendpage:1026 sendpage of size 24 to node cvk61 (num 5) at 172.16.202.61:7100 returned EAGAIN
/var/log/syslog:Jun 29 09:32:58 cvk47 kernel: [156022.769544] (kworker/u130:1,12041,9):o2net_sendpage:1026 sendpage of size 24 to node cvk61 (num 5) at 172.16.202.61:7100 returned EAGAIN
/var/log/syslog:Jun 29 09:32:58 cvk47 kernel: [156022.769546] (kworker/u130:1,12041,9):o2net_sendpage:1026 sendpage of
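
The bounded retry proposed in the patch can be sketched in userspace as follows. This is only an illustration under stated assumptions: struct fake_sock, fake_sendpage(), send_bounded() and MAX_EAGAIN_RETRIES are hypothetical names, and the fake socket simply returns -EAGAIN a configurable number of times before it accepts the data.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

#define MAX_EAGAIN_RETRIES 20 /* same limit as send_fails in the patch */

/* Hypothetical fake socket used only for this sketch. */
struct fake_sock {
	int eagain_left;	/* -EAGAIN results before a send succeeds */
	int calls;		/* number of send attempts made so far */
	bool shut_down;		/* set when the sender gives up */
};

/* Pretend to send: fail with -EAGAIN until eagain_left reaches zero. */
static ssize_t fake_sendpage(struct fake_sock *s, size_t size)
{
	s->calls++;
	if (s->eagain_left > 0) {
		s->eagain_left--;
		return -EAGAIN;
	}
	return (ssize_t)size;
}

/*
 * Retry on -EAGAIN at most MAX_EAGAIN_RETRIES times. Returns true if
 * the data was sent, false if the socket was shut down instead.
 */
static bool send_bounded(struct fake_sock *s, size_t size)
{
	int retries_left = MAX_EAGAIN_RETRIES;
	ssize_t ret;

	assert(s != NULL);
	while (1) {
		ret = fake_sendpage(s, size);
		if (ret == (ssize_t)size)
			return true;
		if (ret == -EAGAIN && retries_left > 0) {
			--retries_left;
			continue;
		}
		/* Any other error, or too many -EAGAIN results: give up. */
		s->shut_down = true;
		return false;
	}
}
```

With this limit the sender makes at most 21 attempts (one initial send plus 20 retries) before it shuts the socket down, instead of spinning for as long as the peer keeps reporting -EAGAIN.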
________________________________
zhangguanghui 10102