Alexander Kolbasov
2008-Oct-02 21:41 UTC
[crossbow-discuss] Comments for squeue.c Part 3 (final)
These are my remaining comments for squeue.c. ---------------------------------------------------------------------- 1. 464 squeue_worker_wakeup(squeue_t *sqp) Please add some explanation for the SQS_TMO_PROG flag and the use of sq_tid. . It seems that both do the same thing, so it is not clear why do you need both. Also it is not quite clear what you are trying to achieve with this complicated dance with sq_tid and SQS_TMO_PROG. Also, why does it drop the sq_lock before returning rather than leaving that to the caller? This becomes rather confusing down the line (and is not even documented!). More over, in some cases you re-acquire the lock right after calling this function. 2. 506 } else if (sqp->sq_wait != 0) { The code at lines 470-476 guarantees that this condition is never true. 3. 809 squeue_poll_fire(void *arg) No one ever calls this function 4. 827 boolean_t interrupt = (proc_type == SQS_ENTER); Can you just pass the flag from the caller? 5. Style 832 again: This is, essentially a do { } while loop. Can you rewrite it as such? This will make loop termination conditions more explicit (gotos are bad for a reason). 6. 848 * We have backlog built up. Switch to polling mode if the 849 * device underneath allows it. Need to do it only for 850 * drain by non-interrupt thread so interrupts don''t 851 * come and disrupt us in between. If its a interrupt thread, 852 * no need because most devices will not issue another 853 * interrupt till this one returns. The comment would benefit from some re-wording. At the beginning of the file we see: 223 /* 224 * The minimum packet queued when worker thread doing the drain triggers 225 * polling (if squeue allows it). The choice of 3 is arbitrary. You 226 * definitely don''t want it to be 1 since that will trigger polling 227 * on very low loads as well (ssh seems to do be one such example 228 * where packet flow was very low yet somehow 1 packet ended up getting 229 * queued and worker thread fires every 10ms and blanking also gets 230 * triggered. 231 */ 232 int squeue_worker_poll_min = 3; Now it seems that this value is completely ignored. This seems to contradict one another. 7. 835 head = sqp->sq_first; 836 sqp->sq_first = NULL; 837 sqp->sq_last = NULL; 838 sqp->sq_count = 0; 839 sqp->sq_size = 0; What is the point of this? Why clean up all the information from squeue? Now you lost the ability to use MDB to look at messages on squeue and now have to hunt for them in the registers. What is the benefit? If you want to do the processng without holding to the sq_lock, can you stash the original list somewhere in squeue? 8. 866 while ((mp = head) != NULL) { Why there is no check for the time limit expiration? Why are packets that were on squeue originally are treated differently from the new ones in terms of time slot expiration? 920 now = gethrtime(); 921 if ((now < expire) || (proc_type == SQS_WORKER)) { There is no need to check the time expiration if this is a worker thread, so you can drop gethrtime() call in this case. 9. 923 * If time not expired or we are worker thread and 924 * this squeue is polling capable, continue to do 925 * the drain. The comment disagrees with the code - there is no check for being polling capable. The code continues as long as there are messages always n case of worker thread. 10. 941 done: The label is never used (thanks!). 11. 944 * If we are the worker thread and no work is left, send the poll 945 * thread down once more to see if something arrived. Otherwise, 946 * turn the interrupts back on and we are done. 947 */ 948 if (sqp->sq_state & SQS_GET_PKTS) { 949 sqp->sq_state &= ~proc_type; 950 return; 951 } Comment doesn''t match the code. Seems that you assume that polling thread will run or is running now, so you return. You leave the SQS_PROC flag set - I assume that the polling thread requires that. It would be good to explain the use of SQS_GET_PKTS flag as well as the SQS_PROC flag by the polling thread. 12. 956 * Do one last check to see if anything arrived 957 * in the NIC. We leave the SQS_PROC set to ensure 958 * that poll thread keeps the PROC and can decide 959 * if it needs to turn polling off etc. If we 960 * drop the SQS_PROC here and poll thread comes up 961 * empty handed, it can not safely turn polling off 962 * since someone else could have acquired the PROC 963 * in between and enabled polling. 964 */ 965 SQS_POLL_CHECK_NOW(sqp); The comment confuses more than explains anything. Despite the name, SQS_POLL_CHECK_NOW doesn''t check for anything - it just wakes up the polling thread (which may be running anyway). Also, what does etc in the comment above mean? Is polling state controlled by the SQS_PROC flag? 13. 967 } else { 968 SQS_POLLING_OFF(sqp, sq_rx_ring); Why do you turn polling off here? 14. 978 * Quiesce, Restart, or Cleanup of the squeue poll thread. The quiesce and 979 * restart operations are driven from the mac layer below. The cleanup is 980 * typically driven from interface plumb/unplumb. Please explain the semantics of these three operations. Also, as I mentioned, the interface to trigger these operations is horrible. 15. 995 if (sqp->sq_state & SQS_POLL_THR_QUIESCED) 996 return; This check is only needed if sq_state & SQS_POLL_THR_QUIESCE, right? So it the following would be clearer: if (sqp->sq_state & SQS_POLL_THR_QUIESCE && ! (sqp->sq_state & SQS_POLL_THR_QUIESCED) { ... } 16. 1006 #define MAX_BYTES_TO_PICKUP 150000 1007 boolean_t sq_poll_process = B_TRUE; Can you move these (along with some explanation to the top where other similar tuneables are declared)? As well as the polling notes below. Also, should MAX_BYTES_TO_PICKUP be a constant or a variable? 17. 1017 * Sending the poll thread down to NIC is dependant on 3 criterions 1018 * 1019 * 1) Its always driven from squeue_drain 1020 * 2) The sq_count dips below lowater mark or as soon as we get out of 1021 * the loop before going back in (more work arrived in between) 1022 * 3) Before exiting the squeue_drain, if the poll thread is not already 1023 * working and we are the worker thread, try to poll one more time. I don''t see the code that implements low watermark. Also it is not clear what loops are you talking about here. In 3) the comment is misleading. Part of the confusion stems from the term "send down poll thread to NIC". All the code does is wakes up the polling thread in several places. The polling thread runs asynchronously, so the detailed description of places where it can be waken up is not very relevant - it is an implementation detail, unless it is really important to wake up in certain places in the code (if this is so, please specify such cases). 18. 1025 * For latency sake, we do allow any thread calling squeue_enter 1026 * to process its packet provided: 1027 * 1028 * 1) Nothing is queued 1029 * 2) If more packets arrived in between, the non worker thread are allowed to 1030 * do the drain till their time quata expired provided SQS_GET_PKTS wasn''t 1031 * set in between. I hope you do not do this for latency sake, but to reduce latency :-) 1) Nothing is queued where? 2) If more packets arrived in between what? Isn''t the thread calling squeue_enter always a non worker thread? quata -> quanta? Why is it important that SQS_GET_PKT wasn''t set in between? If you do not want to allow squeue_enter work in parallel with the polling thread, the important fact is that the polling thread is actually running. But this would be indicated by SQS_PROC, right? 19. 1033 * Avoiding deadlocks with interrupts 1034 * =================================1035 * 1036 * One of the big problem is that we can''t send poll_thr down while holding the 1037 * sq_lock. Since poll_thr is always signalled from squeue_drain, we need to 1038 * ensure that only worker thread doing drain or poll_thr are inside (or can 1039 * get inside) squeue perimeter once the poll_thr is signalled. I don''t understand this comment. Again, the use of "send poll_thr down" is incorrect and misleading. Please explain what part of the polling thread should not held sq_lock. Isn''t what you are trying to say boils down to the fact that the worker and poller can work in parallel, while no one can process messages while any of these two are running? Also, the comment above doesn''t explain how this coordination is achieved (with SQS_PROC flag, I guess). 20. 1051 squeue_get_pkts(squeue_t *sqp) Why don''t you call this ''squeue_polling_thread'' since this is what it is. It is not just some function to get packets. This thread never exits even when the interface is unplumbed. Is it hard to retire the thread on unplumb? 21. Can you document (and ASSERT) which parts of the worker thread require SQS_PROC being set? 22. 1070 if (!(sqp->sq_state & SQS_POLL_THR_CONTROL)) { 1071 CALLB_CPR_SAFE_BEGIN(&cprinfo); 1072 cv_wait(async, lock); 1073 CALLB_CPR_SAFE_END(&cprinfo, lock); 1074 } I don''t understand this. What is going on here? If this is not control you wait fro new signals but wait only once. What if you wake up and it is still not a control? If this isn''t important, why is it important to check the first time around? If you are trying to sleep when polling is disabled can you just sleep in a while() loop until polling is enabled? I can guess that your real intention is while (sq_state & POLL_CAPAB && !POLL_THR_CONTROL) or is it something else? 23. 1076 if (sqp->sq_state & 1077 (SQS_POLL_THR_CONTROL | SQS_POLL_THR_QUIESCED)) { Why is the check for QUIESCED included? 24. 1089 if (!(sqp->sq_state & SQS_POLLING) || 1090 !(sqp->sq_state & SQS_GET_PKTS)) 1091 cmn_err(CE_CONT, "sq_get_pkts: BAD state 1 = %x\n", 1092 sqp->sq_state); Is this an ASSERT?? Can you replace this strange SQS_GET_PKTS flag with another one telling that the polling thread is actually doing something? I am not sure that even that is needed since SQS_PROC should be set. 25. 1094 * XXX: In sq_clean_ring, till this flags is set, we 1095 * can''t let the clean succeed. I can''t find sq_clean_ring anywhere. 1097 /* XXX Should we check polling capability again */ You have the sq_lock, right? How can it change? 26. 1099 sq_rx_ring = sqp->sq_rx_ring; 1100 sq_get_pkts = sq_rx_ring->rr_rx; 1101 sq_mac_handle = sq_rx_ring->rr_rx_handle; 1102 ip_accept = sq_rx_ring->rr_ip_accept; 1103 sq_ill = sq_rx_ring->rr_ill; 1104 bytes_to_pickup = MAX_BYTES_TO_PICKUP; 1105 mutex_exit(lock); 1106 head = sq_get_pkts(sq_mac_handle, bytes_to_pickup); Can any of these change on you while sq_lock is not held? Is it possible to change thus to something like mac_get_packets (sqp->sq_rx_ring, bytes_to_pickup) ? What protects fields of the ring - is it sq_lock or something else? Do they ever change? And needless to say you have squeue_get_pkts and sq_get_pkts. Very nice. 27. 1109 mp = ip_accept(sq_ill, sq_rx_ring, sqp, head, What is this? A line or two of explanation would be nice 28 1116 if ((sqp->sq_state & (SQS_PROC|SQS_POLLING|SQS_GET_PKTS)) !1117 (SQS_PROC|SQS_POLLING|SQS_GET_PKTS)) 1118 cmn_err(CE_PANIC, "squeue_get_pkts: sqp = %p(%x) " 1119 "bad state\n", (void *)sqp, sqp->sq_state); Should it be an ASSERT or really a panic? 29. 1123 * We have packets to process and worker thread 1124 * is not running. Check to see if poll thread is 1125 * allowed to process. Let it do processing only if it 1126 * picked up some packets from the NIC otherwise 1127 * wakeup the worker thread. This is the first mentioning of the fact that the poll thread is allowed to process messages, effectively becoming a worker thread. Why it can only process packets if it picked up something from NIC? What is the difference? 30. 1133 sqp->sq_run = curthread; Shouldn''t it be set by squeue_drain instead? 31. 1135 squeue_intrdrain_ns); Should it us the same value as interrupts or something different? 32. 1138 if (sqp->sq_first == NULL) 1139 goto poll_again; Style - please use proper loops instead of goto. 33. 1142 * Couldn''t do the entire drain, let the 1143 * worker thread take over. I don''t understand this part. It does happily continue to drain whatever packets it got from NIC but not allowed to drain anything else? Why does the time limit work in such a strange manner? What is the difference? 34. 1149 * Put the SQS_PROC_HELD on so the worker 1150 * thread can distinguish where its called from. We Why does worker thread need to distinguish who woke it up? What difference does it make? 35. 1193 squeue_worker_thr_control(squeue_t *sqp) All this control part is overly complicated and poorly documented. It would benefit from some extra effort. -- - akolb