Hi,
We have several systems doing data acquisition, and I had originally
thought the interrupt handler for our PCI card was not being
called quickly enough. However, I misread the diagnostics :)
The digitised data is fed into a FIFO, and when it is part full
(32kbytes) an interrupt is generated. The IRQ routine reads 32kbyte
chunks into a kernel buffer (4Mbyte) until the part-full flag clears. If the
FIFO full flag is seen (it is latched by the hardware) then acquisition
is halted.
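
To make the two failure modes easier to tell apart, here is a minimal
userland model of the kernel buffer side. It is a sketch only, not the
actual driver: the names kbuf, kbuf_put_chunk and kbuf_get are
hypothetical, and only the 32kbyte chunk size and the 4Mbyte buffer size
come from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_BYTES (32 * 1024)         /* FIFO part-full threshold (from the text) */
#define KBUF_BYTES  (4 * 1024 * 1024)   /* kernel buffer size (from the text) */

/* Hypothetical model of the kernel buffer as a byte ring. */
struct kbuf {
    unsigned char data[KBUF_BYTES];
    size_t head;    /* next write offset (interrupt side) */
    size_t tail;    /* next read offset (userland side) */
    size_t used;    /* bytes currently queued */
};

/*
 * Interrupt side: queue one 32kbyte chunk.
 * Returns 0 on success, -1 if the buffer has no room for a whole chunk,
 * which is the "kernel buffer filling up" case, distinct from FIFO full.
 */
int kbuf_put_chunk(struct kbuf *kb, const unsigned char *chunk)
{
    assert(kb != NULL && chunk != NULL);
    if (KBUF_BYTES - kb->used < CHUNK_BYTES)
        return -1;
    /* head only ever advances by whole chunks and KBUF_BYTES is a
     * multiple of CHUNK_BYTES, so a chunk never straddles the wrap. */
    memcpy(kb->data + kb->head, chunk, CHUNK_BYTES);
    kb->head = (kb->head + CHUNK_BYTES) % KBUF_BYTES;
    kb->used += CHUNK_BYTES;
    return 0;
}

/* Reader side: dequeue up to len bytes into out; returns bytes copied. */
size_t kbuf_get(struct kbuf *kb, unsigned char *out, size_t len)
{
    assert(kb != NULL && out != NULL);
    if (len > kb->used)
        len = kb->used;
    size_t first = KBUF_BYTES - kb->tail;   /* bytes before the wrap point */
    if (first > len)
        first = len;
    memcpy(out, kb->data + kb->tail, first);
    memcpy(out + first, kb->data, len - first);
    kb->tail = (kb->tail + len) % KBUF_BYTES;
    kb->used -= len;
    return len;
}
```

In this model, if kbuf_get is not called for long enough, kbuf_put_chunk
starts returning -1 after 128 chunks even though every interrupt was
serviced on time.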
The problem now appears to be that the userland process that reads data
out of the kernel is being stalled for over 4 seconds. This process
reads from the kernel, does some minor processing, and then writes the
data out to a child process that does some more work on it.
I ran 'ps -xaulwww' in a loop every second to see what ELSE was using
the CPU when it was stalled and found that my script stalled for 7
seconds.
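
A lighter-weight way to catch such stalls than looping ps might be a
small program that sleeps for a fixed interval and records how late it
wakes up. The following is a rough sketch, assuming POSIX clock_gettime()
and nanosleep() are available; the function names elapsed_sec and
measure_overshoot are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Seconds elapsed between two CLOCK_MONOTONIC timestamps. */
double elapsed_sec(const struct timespec *start, const struct timespec *end)
{
    return (double)(end->tv_sec - start->tv_sec)
         + (double)(end->tv_nsec - start->tv_nsec) / 1e9;
}

/*
 * Sleep for interval_sec and return how many seconds longer than
 * requested the sleep actually took. A large value means this process
 * was not run when it should have been. Note that nanosleep() can
 * return early if a signal arrives, which would show up as a negative
 * result; that case is not handled here.
 */
double measure_overshoot(double interval_sec)
{
    struct timespec req, t0, t1;

    req.tv_sec = (time_t)interval_sec;
    req.tv_nsec = (long)((interval_sec - (double)req.tv_sec) * 1e9);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    nanosleep(&req, NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return elapsed_sec(&t0, &t1) - interval_sec;
}
```

Calling measure_overshoot(1.0) in a loop and logging any result above,
say, half a second (with a wall-clock timestamp) would show whether a
process that does nothing but sleep is also delayed at the same moments
as the readout process.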
I tried increasing the buffer inside the kernel (to 8Mbyte), which seemed to
have no effect. However, renice'ing the process from -5 to -20 has
greatly reduced the frequency of occurrence. WRT the buffer size - I
would expect that increasing it further would reduce the problem,
but since I have only increased it to ~4 seconds' worth and the stall is
longer than that, I see no effect.
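
The buffer arithmetic can be written out as a small sketch. It assumes
the "~4 seconds worth" figure refers to the 8Mbyte buffer, which implies
a sustained rate of roughly 2Mbyte/s; that rate is inferred here, not
measured, and the function names are hypothetical.

```c
/* Sustained data rate (bytes/s) implied by "buf_bytes holds about secs seconds". */
double implied_rate(double buf_bytes, double secs)
{
    return buf_bytes / secs;
}

/* Smallest buffer (bytes) that absorbs a reader stall of stall_secs at rate bytes/s. */
double buffer_for_stall(double rate, double stall_secs)
{
    return rate * stall_secs;
}
```

With those assumptions, implied_rate(8 * 1024 * 1024, 4) is 2Mbyte/s, and
buffer_for_stall() for the 7 second stall seen with ps comes to 14Mbyte,
so an 8Mbyte buffer would still overflow.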
Given that renice'ing has an effect, it seems to be a scheduler problem.
I don't see how it can be the motherboard stalling the whole system,
because then the FIFO full error would occur; instead I only see the
4Mbyte kernel buffer filling up.
One other possibility would be something holding a lock for too long
that blocks both the DAQ readout process and ps, however I am not sure
how I would find out what is holding it.
Unfortunately the system is in Finland and I'm in Australia so I can't
sit at the console :(
I am hoping to be able to replicate the HW & SW locally at some stage
but haven't been able to yet.
Any help appreciated, thanks!
--
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there
are so many of them to choose from."
-- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C