Hello everyone.

A webserver that I've been working on seems to be having some issues of some sort with the nfe-20071124 driver. We have it running on two gigabit ports, one facing inwards to the database server, and the other facing outwards. The outward facing port is the only one which I currently have bandwidth information on, and it seems to be pushing around 35-40 Mb/s.

Unfortunately, there are literally billions of interrupts generated in only a short period of time. In the last 40 days, there are 4,502,689,887 interrupts on the outward facing port, and another 653,327,087 on the inward facing port. It is literally slowing down the machine to a crawl - on a comparatively powerful server that previously performed the same task, load hovered around 0.2 -> 0.4. On the current machine, the load average hovers between 14 and 16.

/var/log/messages is flooded with messages such as:
kernel: nfe0: watchdog timeout (missed Tx interrupts) -- recovering.

Attached I am including portions of the output of lspci, in case there is relevant data there.

If there is anything else I can provide you with, please don't hesitate to ask.

Sincerely,
tsawolf
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lsp
Type: application/octet-stream
Size: 2194 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20080121/d626cea2/lsp.obj
On Mon, Jan 21, 2008 at 02:37:55PM -0500, The Wolf wrote:
> Hello everyone.
>
> A webserver that I've been working on seems to be having some issues of some
> sort with the nfe-20071124 driver. We have it running on two gigabit ports,
> one facing inwards to the database server, and the other facing outwards.
> The outward facing port is the only one which I currently have bandwidth
> information on, and it seems to be pushing around 35-40 Mb/s.
>
> Unfortunately, there are literally billions of interrupts generated in only
> a short period of time. In the last 40 days, there are 4,502,689,887
> interrupts on the outward facing port, and another 653,327,087 on the inward
> facing port. It is literally slowing down the machine to a crawl - on a
> comparatively powerful server that previously performed the same task, load
> hovered around 0.2 -> 0.4. On the current machine, the load average hovers
> between 14 and 16.
>
> /var/log/messages is flooded with messages such as: kernel: nfe0: watchdog
> timeout (missed Tx interrupts) -- recovering.
>

I think this was fixed by Scott's commit. If you still see this issue, it would indicate another bug in nfe(4). Which FreeBSD version are you using?

> Attached I am including portions of the output of lspci, in case there is
> relevant data there.
>
> If there is anything else I can provide you with, please don't hesitate to
> ask.
>
> Sincerely,
> tsawolf

According to the output of lspci, I think nfe(4) should use MSI/MSI-X on your system. Could you show me the nfe-related verbose boot messages and the output of vmstat -i?

Btw, nfe(4)'s interrupt moderation does not seem to work at all. I failed to find a way to make interrupt moderation work on nfe(4). I guess Linux has the same issue. However, I don't think NVIDIA ethernet controllers lack the feature; we just still do not know which register is related to interrupt moderation control and what magic value should be used to activate it. Maybe I should look at the disassembled NDIS driver as a reference, but the code is too big to analyze. :-(

--
Regards,
Pyun YongHyeon
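
For scale, the interrupt totals quoted in the original post can be turned into average per-second rates. The snippet below is only a rough sketch: it assumes the counters cover exactly the 40 days mentioned, and the function name average_rate is made up for illustration. An average also hides bursts, so the rate column that vmstat -i prints is the more direct thing to look at.

```python
# Rough sketch: average interrupt rate from the totals quoted in the thread.
# Assumption: the counters span exactly 40 days (as stated in the post).
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(total_interrupts: int, days: float) -> float:
    """Return the mean number of interrupts per second over the period."""
    return total_interrupts / (days * SECONDS_PER_DAY)


outward = average_rate(4_502_689_887, 40)  # outward facing port
inward = average_rate(653_327_087, 40)     # inward facing port

print(f"outward: {outward:.0f} interrupts/s")
print(f"inward:  {inward:.0f} interrupts/s")
```

With the figures from the post this works out to roughly 1,300 interrupts per second on the outward facing port and about 190 per second on the inward facing port, averaged over the whole period.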