I use softflowd and flowd together on a Solaris 9 host talking NetFlow v5. I seem to be seeing an inconsistency between the output of `softflowctl statistics` and the results of issuing a `softflowctl expire-all`. I expect that issuing an `expire-all` would force softflowd to export all of its current flow data to flowd and restart monitoring. However, running a `flowd-reader -v flows.db | wc -l` before and after indicates that this is not the case. Example output: ##### BEGIN # softflowctl statistics; \> echo "%%%flows: `flowd-reader -v flows.raw | wc -l`"; \ > softflowctl expire-all; \ > echo "%%%flows: `flowd-reader -v flows.raw | wc -l`"; \ > softflowctl statisticsstatistics softflowd[11574]: Accumulated statistics: Number of active flows: 4176 Packets processed: 9372374 Fragments: 2 Ignored packets: 1405 (1405 non-IP, 0 too short) Flows expired: 119941 (0 forced) Flows exported: 239882 in 7569 packets (0 failures) Expired flow statistics: minimum average maximum Flow bytes: 46 51562 79795808 Flow packets: 1 76 109762 Duration: 0.00s 18.83s 299.70s Expired flow reasons: tcp = 0 tcp.rst = 5895 tcp.fin = 0 udp = 0 icmp = 0 general = 0 maxlife = 0 over 2Gb = 0 maxflows = 0 flushed = 114046 Per-protocol statistics: Octets Packets Avg Life Max Life Unknown (1): 162797 2397 19.20s 298.84s Unknown (6): 5939745100 8157978 18.67s 299.70s Unknown (17): 244516151 895628 19.53s 299.69s Unknown (41): 1088 16 2.93s 6.06s %%%flows 354 expire-all softflowd[11574]: Expired 4181 flows. %%%flows 531 statistics softflowd[11574]: Accumulated statistics: Number of active flows: 0 Packets processed: 9372970 Fragments: 2 Ignored packets: 1405 (1405 non-IP, 0 too short) Flows expired: 124122 (0 forced) Flows exported: 248244 in 7833 packets (0 failures) Expired flow statistics: minimum average maximum Flow bytes: 46 51506 79795808 Flow packets: 1 76 109762 Duration: 0.00s 18.54s 299.70s Expired flow reasons: tcp = 0 tcp.rst = 5895 tcp.fin = 0 udp = 0 icmp = 0 general = 0 maxlife = 0 over 2Gb = 0 maxflows = 0 flushed = 118227 Per-protocol statistics: Octets Packets Avg Life Max Life Unknown (1): 167651 2469 19.32s 298.84s Unknown (6): 6134757449 8419521 18.33s 299.70s Unknown (17): 258098658 950964 19.46s 299.69s Unknown (41): 1088 16 2.93s 6.06s ##### END It seems as though "flows expired" is increasing by about the right amount, "flows exported" is going up by a factor of two over the number of active flows, and the flowd datafile is barely going up at all! # cat flowd.conf listen on 127.0.0.1:4432 flow source 127.0.0.1 store ALL logfile "/data/netflow/flows.raw" Is my understanding of the way netflow, and in particular netflow-tools, works flawed? Any ideas on how to proceed in working out what''s going wrong? Regards, Robin -- Robin Breathe, Computer Services, Oxford Brookes University, Oxford, UK rbreathe at brookes.ac.uk Tel: +44 1865 483685 Fax: +44 1865 483073 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 187 bytes Desc: OpenPGP digital signature Url : http://lists.mindrot.org/pipermail/netflow-tools/attachments/20050926/10225e45/attachment.bin
Robin Breathe wrote:> I use softflowd and flowd together on a Solaris 9 host talking NetFlow v5. > > I seem to be seeing an inconsistency between the output of `softflowctl > statistics` and the results of issuing a `softflowctl expire-all`.hm, this is weird...> Is my understanding of the way netflow, and in particular netflow-tools, > works flawed? Any ideas on how to proceed in working out what''s going wrong?Could you try running tcpdump to capture the export packets to see which number is correct? -d
Damien Miller wrote:> Robin Breathe wrote: >> I use softflowd and flowd together on a Solaris 9 host talking NetFlow >> v5. >> >> I seem to be seeing an inconsistency between the output of `softflowctl >> statistics` and the results of issuing a `softflowctl expire-all`. > > hm, this is weird... > >> Is my understanding of the way netflow, and in particular netflow-tools, >> works flawed? Any ideas on how to proceed in working out what''s going >> wrong? > > Could you try running tcpdump to capture the export packets to see which > number is correct?Damien, I actually did some fairly extensive testing earlier, I simply couldn''t decide how best to report the results. Running tcpdump shows that there really are far more active flows than end up in my flowd logs. Running both softflowd and flowd in debug mode, it becomes apparent that softflowd *thinks* that it is expiring all the flows (it prints the "EXPIRED:" line in check_expired()), but some never arrive with flowd. If I initiate a test flow (e.g. complete ssh session), then if I do an `expire-all` after a few seconds, it won''t get exported; if I wait for it to naturally expire, it is exported. At least this is easily reproduceable :) The end result of this is that taking "snapshots" with `expire-all` every 5 minutes *usually* gives me only the first 3 minutes worth of flows (assuming the real flow-start-time algorithm we discussed earlier is sane). The exact period which gets exported seems to vary a little (+/-90s), though this is likely due to varying traffic patterns (running on a network with >1500 active hosts). Your help is really appreciated, this was an unexpected surprise drawing close to the end of an otherwise successful project :) I was going to trawl through the code myself tomorrow, but you already know it; I''d be even more delighted if we could nail this bug down together. Kind regards, Robin -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 195 bytes Desc: not available Url : http://lists.mindrot.org/pipermail/netflow-tools/attachments/20050927/baab7979/attachment.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 187 bytes Desc: OpenPGP digital signature Url : http://lists.mindrot.org/pipermail/netflow-tools/attachments/20050927/baab7979/attachment-0001.bin
On Tue, 27 Sep 2005, Robin Breathe wrote:> Damien, > > I actually did some fairly extensive testing earlier, I simply couldn''t > decide how best to report the results. > > Running tcpdump shows that there really are far more active flows than > end up in my flowd logs.Are you tcpdumping the traffic flows themselves or the flow export packets? To ask an obvious question: are you using any flowd.conf filters that drop flows?> Your help is really appreciated, this was an unexpected surprise drawing > close to the end of an otherwise successful project :) I was going to > trawl through the code myself tomorrow, but you already know it; I''d be > even more delighted if we could nail this bug down together.I''m setting up a test environment right now. Hopefully this will be fairly easy to track down. (famous last words...) -d
Damien Miller wrote:> On Tue, 27 Sep 2005, Robin Breathe wrote: >> I actually did some fairly extensive testing earlier, I simply couldn''t >> decide how best to report the results. >> >> Running tcpdump shows that there really are far more active flows than >> end up in my flowd logs. > > Are you tcpdumping the traffic flows themselves or the flow export packets?On the traffic flows themselves; I''m running softflowd and flowd on the same Solaris 9 box, and S9 doesn''t support tcpdump on the loopback interface. If necessary I can set the two components up on different machines allowing me to tcpdump the flow export packets...> To ask an obvious question: are you using any flowd.conf filters that > drop flows?I can replicate the issue on a vanilla configuration: listen on 127.0.0.1:4432 flow source 127.0.0.1 accept all store ALL logfile "/data/netflow/flows.raw" I have tried with and without bpf filters on softflowd: they make no difference (but did highlight another possible far-less-important bug to be looked at another time - not all valid recipes are accepted).>> Your help is really appreciated, this was an unexpected surprise drawing >> close to the end of an otherwise successful project :) I was going to >> trawl through the code myself tomorrow, but you already know it; I''d be >> even more delighted if we could nail this bug down together. > > I''m setting up a test environment right now. Hopefully this will be > fairly easy to track down. (famous last words...)Probably wrong, I''ve got it into my head that this relates to iterating over the expiries list rather than the flows list in the `expire-all` case. I''ll look at the code once I get into work. Robin -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 187 bytes Desc: OpenPGP digital signature Url : http://lists.mindrot.org/pipermail/netflow-tools/attachments/20050927/d091a8da/attachment.bin
Robin Breathe wrote:> Probably wrong, I''ve got it into my head that this relates to iterating > over the expiries list rather than the flows list in the `expire-all` > case. I''ll look at the code once I get into work.Having looked at the code, my head is cleansed of this idea :) Robin -- Robin Breathe, Computer Services, Oxford Brookes University, Oxford, UK rbreathe at brookes.ac.uk Tel: +44 1865 483685 Fax: +44 1865 483073 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 187 bytes Desc: OpenPGP digital signature Url : http://lists.mindrot.org/pipermail/netflow-tools/attachments/20050927/3ca532e7/attachment.bin
Robin Breathe wrote:> Damien, > > I''ve managed to replicate the problem running with a stock installation > of softflowd/flowd from ports under FreeBSD 6.0-BETA5.Yes, I have replicated it too. It looks like flowd can lose packets when they come in large sudden bursts. I''m adding an input buffer and changing the processing order to better cope with these situations. -d
Damien Miller wrote:>> I''ve managed to replicate the problem running with a stock installation >> of softflowd/flowd from ports under FreeBSD 6.0-BETA5. > > Yes, I have replicated it too. It looks like flowd can lose packets when > they come in large sudden bursts. I''m adding an input buffer and > changing the processing order to better cope with these situations.Excellent! Let me know when it''s ready to test :) Robin -- Robin Breathe, Computer Services, Oxford Brookes University, Oxford, UK rbreathe at brookes.ac.uk Tel: +44 1865 483685 Fax: +44 1865 483073 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 187 bytes Desc: OpenPGP digital signature Url : http://lists.mindrot.org/pipermail/netflow-tools/attachments/20050927/655936ef/attachment.bin
Robin Breathe wrote:> Damien Miller wrote: > >>>I''ve managed to replicate the problem running with a stock installation >>>of softflowd/flowd from ports under FreeBSD 6.0-BETA5. >> >>Yes, I have replicated it too. It looks like flowd can lose packets when >>they come in large sudden bursts. I''m adding an input buffer and >>changing the processing order to better cope with these situations. > > > Excellent! Let me know when it''s ready to test :)Here it is. The attached patch adds an input queue to flowd. When a listening fd becomes ready (i.e. flow data is received), flowd will read and enqueue up to 1024 packets before processing them. It won''t stop situations where flowd simply can''t keep up with the data coming at it (my laptop for example drops after ~1k flows per second), but it should help it deal better with sudden bursts of flows. I''ll tweak this a bit further, I think it could be made a little faster by replacing the allocations of queued packets with a freelist of preallocated ones. It would probably be better to maintain an output queue as well, instead of doing one write per flow. This is against flowd -current, so if you aren''t already using a snapshot, then you will need to get one from: http://www2.mindrot.org/flowd_snap/ -d -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: flowd-input-buffer.diff Url: http://lists.mindrot.org/pipermail/netflow-tools/attachments/20050927/0f50cf41/attachment.ksh
Damien Miller wrote:> Here it is. > > The attached patch adds an input queue to flowd. When a listening fd > becomes ready (i.e. flow data is received), flowd will read and enqueue > up to 1024 packets before processing them. > > It won''t stop situations where flowd simply can''t keep up with the data > coming at it (my laptop for example drops after ~1k flows per second), > but it should help it deal better with sudden bursts of flows. > > I''ll tweak this a bit further, I think it could be made a little faster > by replacing the allocations of queued packets with a freelist of > preallocated ones. It would probably be better to maintain an output > queue as well, instead of doing one write per flow. > > This is against flowd -current, so if you aren''t already using a > snapshot, then you will need to get one from: > > http://www2.mindrot.org/flowd_snap/I''m beginning to test the 20050927 snapshot (which looks to have the queue patch), but needed to apply the attached patch in order for it to compile in Solaris 9. It also fails to configure if bison is present, is this intentional? Robin -- Robin Breathe, Computer Services, Oxford Brookes University, Oxford, UK rbreathe at brookes.ac.uk Tel: +44 1865 483685 Fax: +44 1865 483073 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: patch.diff Url: http://lists.mindrot.org/pipermail/netflow-tools/attachments/20050927/3a9585bd/attachment.ksh -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 187 bytes Desc: OpenPGP digital signature Url : http://lists.mindrot.org/pipermail/netflow-tools/attachments/20050927/3a9585bd/attachment.bin
Robin Breathe wrote:> I''m beginning to test the 20050927 snapshot (which looks to have the > queue patch), but needed to apply the attached patch in order for it to > compile in Solaris 9.No, the queue patch has not been committed.> It also fails to configure if bison is present, is this intentional?configure should use Berkeley yacc in favour of bison. Could you send me a compilation log of these two errors in action? I have compiled on Solaris 9 without problems recently. -d
Sorry for this terse response, I''ll expand a little. On Wed, 28 Sep 2005, Damien Miller wrote:> Robin Breathe wrote: >> I''m beginning to test the 20050927 snapshot (which looks to have the >> queue patch), but needed to apply the attached patch in order for it to >> compile in Solaris 9. > > No, the queue patch has not been committed.You will have to apply it yourself. Please let me know how it goes and I''ll commit it if it is stable and improves your situation. Note that the snapshot releases change the log format a bit from the last stable release. You can convert your logs using flowd-reader''s -L option.>> It also fails to configure if bison is present, is this intentional? > > configure should use Berkeley yacc in favour of bison.GNU Bison is known to miscompile parse.y and I haven''t bothered to spend the time to figure out why. On Solaris, /usr/ccs/bin/yacc is known to do the right thing (on Linux, use byacc). -d
Damien Miller wrote:> On Wed, 28 Sep 2005, Damien Miller wrote: >> Robin Breathe wrote: >>> I''m beginning to test the 20050927 snapshot (which looks to have the >>> queue patch), but needed to apply the attached patch in order for it to >>> compile in Solaris 9. >> >> No, the queue patch has not been committed. > > You will have to apply it yourself. Please let me know how it goes and > I''ll commit it if it is stable and improves your situation.The patch didn''t apply cleanly as given (it didn''t like the process_netflow_v7 function), but I munged it in. However, I still seem to be getting forgotten flows: ## logged 1 lines # sleep 30 # softflowctl stop-gather stop-gather softflowd[3271]: Data collection stopped. ## logged 1 lines # softflowctl statistics statistics softflowd[3271]: Accumulated statistics: Number of active flows: 1570 Packets processed: 101136 Fragments: 0 Ignored packets: 21 (21 non-IP, 0 too short) Flows expired: 0 (0 forced) Flows exported: 0 in 0 packets (0 failures) # softflowctl expire-all expire-all softflowd[3271]: Expired 1570 flows. # softflowctl statistics statistics softflowd[3271]: Accumulated statistics: Number of active flows: 0 Packets processed: 101136 Fragments: 0 Ignored packets: 21 (21 non-IP, 0 too short) Flows expired: 1570 (0 forced) Flows exported: 3140 in 98 packets (0 failures) Expired flow statistics: minimum average maximum Flow bytes: 46 48461 34269298 Flow packets: 1 64 34677 Duration: 0.00s 4.88s 31.27s Expired flow reasons: tcp = 0 tcp.rst = 0 tcp.fin = 0 udp = 0 icmp = 0 general = 0 maxlife = 0 over 2Gb = 0 maxflows = 0 flushed = 1570 Per-protocol statistics: Octets Packets Avg Life Max Life Unknown (1): 545 5 0.51s 2.05s Unknown (6): 73832643 92504 4.89s 31.26s Unknown (17): 2250079 8617 4.92s 31.27s Unknown (41): 680 10 4.40s 6.00s ## logged 176 lines In fact, if anything it seems to be worse? Should I try increasing INPUT_MAX_PACKET_PER_FD?> Note that the snapshot releases change the log format a bit from the > last stable release. You can convert your logs using flowd-reader''s -L > option.Yup, this will force me to update my flowdb->sqlite conversion program, which is probably not a bad thing (I''ll dis-entangle it from flowd-reader).>>> It also fails to configure if bison is present, is this intentional? >> >> configure should use Berkeley yacc in favour of bison. > > GNU Bison is known to miscompile parse.y and I haven''t bothered to spend > the time to figure out why. On Solaris, /usr/ccs/bin/yacc is known to do > the right thing (on Linux, use byacc).Fair enough, just unexpected :) Robin -- Robin Breathe, Computer Services, Oxford Brookes University, Oxford, UK rbreathe at brookes.ac.uk Tel: +44 1865 483685 Fax: +44 1865 483073 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 187 bytes Desc: OpenPGP digital signature Url : http://lists.mindrot.org/pipermail/netflow-tools/attachments/20050928/4aabdd73/attachment.bin
Damien, Below are the results of using flowd-SNAP-20050927 w/flowd-input-buffer.diff along with softflowd-SNAP-20050928 (for enhanced statistical accuracy): ## logged 1 lines # sleep 30 # softflowctl stop-gather stop-gather softflowd[16769]: Data collection stopped. ## logged 1 lines # softflowctl statistics statistics softflowd[16769]: Accumulated statistics: Number of active flows: 2770 Packets processed: 78026 Fragments: 0 Ignored packets: 23 (23 non-IP, 0 too short) Flows expired: 0 (0 forced) Flows exported: 0 in 0 packets (0 failures) Packets received by libpcap: 78049 Packets dropped by libpacp: 0 Packets dropped by interface: 0 # softflowctl expire-all expire-all softflowd[16769]: Expired 2770 flows. # softflowctl statistics statistics softflowd[16769]: Accumulated statistics: Number of active flows: 0 Packets processed: 78026 Fragments: 0 Ignored packets: 23 (23 non-IP, 0 too short) Flows expired: 2770 (0 forced) Flows exported: 2770 in 371 packets (0 failures) Packets received by libpcap: 78049 Packets dropped by libpacp: 0 Packets dropped by interface: 0 Expired flow statistics: minimum average maximum Flow bytes: 46 15935 2096331 Flow packets: 1 28 2802 Duration: 0.00s 4.01s 32.79s Expired flow reasons: tcp = 0 tcp.rst = 0 tcp.fin = 0 udp = 0 icmp = 0 general = 0 maxlife = 0 over 2Gb = 0 maxflows = 0 flushed = 2770 Per-protocol statistics: Octets Packets Avg Life Max Life Unknown (1): 15674 79 8.36s 31.82s Unknown (6): 40366416 67964 3.88s 32.76s Unknown (17): 3756178 9970 4.65s 32.79s Unknown (41): 884 13 4.72s 6.00s ## logged 852 lines There are still an aweful lot of flows getting lost :\ Regards, Robin -- Robin Breathe, Computer Services, Oxford Brookes University, Oxford, UK rbreathe at brookes.ac.uk Tel: +44 1865 483685 Fax: +44 1865 483073 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 187 bytes Desc: OpenPGP digital signature Url : http://lists.mindrot.org/pipermail/netflow-tools/attachments/20050928/9950aeed/attachment.bin