Dear All,
Apr 17 16:59:10 node1 kernel: Lustre:
7833:0:(router.c:167:lnet_notify()) Ignoring prediction from
10.1.1.11@tcp of 192.168.0.71@tcp down 542730734650 seconds in the future
What could cause this error message?
I couldn't find anything really useful by searching the web.
This message is not exactly my main problem, though; I'm investigating
some strange behaviour.
We have a small cluster with 8 nodes, and a Samba gateway for Windows
clients. Linux clients can use the cluster without any problems, and the
Samba machine can see the mount without any problem.
But the Samba share freezes after some hours, although it can take up to
2-3 days. As far as I can see, with CentOS 5.1 it took only 4-6 hours, but
the last time, with Debian 4.0 (stock 2.6.18 kernel), it took 2-3 days.
As far as I can see, after a while some smbd processes appear that keep
switching between states 'D' and 'S'. Once the share becomes
unreachable, the smbd processes cannot be killed and the Lustre mount
cannot be unmounted, but it is still fully usable without any problem,
though of course only from Linux.
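A minimal sketch of how such stuck smbd processes could be spotted, assuming a Linux /proc filesystem. The function names here are illustrative only and are not part of Samba or Lustre:

```python
import os

def parse_stat_state(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' rather than on whitespace.
    head, _, tail = stat_line.rpartition(")")
    pid_str, _, comm = head.partition("(")
    state = tail.split()[0]
    return int(pid_str), comm, state

def find_processes_in_state(name="smbd", state="D", proc_root="/proc"):
    """List PIDs of processes called `name` that are currently in `state`."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, st = parse_stat_state(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or entry is unreadable
        if comm == name and st == state:
            pids.append(pid)
    return sorted(pids)

if __name__ == "__main__":
    # 'D' is uninterruptible sleep, usually a process blocked on I/O.
    print(find_processes_in_state())
```

Running this periodically on the Samba machine would show whether the number of smbd processes in state 'D' grows before the share freezes.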
There is nothing special in the Samba logs, and nothing special in the
kernel logs related to Lustre or anything else, except the above message
on some of the nodes.
Lustre: 1.6.4.3
Samba: 3.0.28a-1 right now, but earlier it was CentOS 4.4, I guess with
3.0.10, and it had the same problem.
Any ideas?
Thank you.
tamas