Sorry, the message went privately to Daisuke, which was not my intention.
---------- Forwarded message ----------
From: Mikolaj Golub <to.my.trociny at gmail.com>
Date: Mon, Nov 26, 2012 at 9:38 AM
Subject: Re: hastctl hang
To: Daisuke Aoyama <aoyama at peach.ne.jp>
On Mon, Nov 26, 2012 at 01:17:46AM +0900, Daisuke Aoyama wrote:
> Hello,
>
> I'm trying to integrate HAST into NAS4Free (FreeBSD 9.1-RC3).
> Now I have created version 9.1.0.1.531.
>
> http://sourceforge.net/projects/nas4free/files/NAS4Free-9.1.0.1/9.1.0.1.531/
>
> Basic CARP + HAST + iSCSI target setup can be done, but very frequently
> hastctl hangs when called as:
>
> /sbin/hastctl status
> /sbin/hastctl dump
>
> Is it better not to call these commands from a script?
> Or is something wrong with the way I use them?
Normally it is OK to use hastctl for scripting.
Does it hang forever or just for a few seconds?
Usually a hung hastctl means that the hastd master process is waiting for
its worker (either for its response or for its exit).
Could you provide logs from both the master and the secondary? Also, you might
want to run hastd with -d to make it more verbose.
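If the script must not block while this is being investigated, one option is to run hastctl under a timeout. A minimal sketch in Python, not something hastctl provides itself: the helper name run_with_timeout and the 10-second default are assumptions made for illustration.

```python
import subprocess


def run_with_timeout(argv, timeout_s=10.0):
    """Run a command and return (exit_code, stdout).

    Returns (None, "") if the command does not finish within timeout_s
    seconds. subprocess.run() kills the child before raising
    TimeoutExpired, so no stuck process is left behind by this helper.
    """
    try:
        done = subprocess.run(
            argv,
            capture_output=True,  # collect stdout/stderr instead of inheriting
            text=True,            # decode output to str
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return None, ""
    return done.returncode, done.stdout
```

Calling run_with_timeout(["/sbin/hastctl", "status"]) would then return (None, "") instead of blocking when hastctl does not finish in time. This only works around the symptom; the logs are still needed to find the cause.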
> Also, I don't know how to detect an error of writing to local device from
> hastd.
> Does anyone know about it?
Currently only by monitoring logs. It looks like a good idea to add
error counters to hastctl statistics output...
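Until such counters exist, the log monitoring can be scripted. A rough sketch in Python: the helper name count_matching, the "hastd" tag and the "error|fail" pattern are all assumptions, not hastd's documented message format, so adjust them to what the logs actually contain.

```python
import re


def count_matching(lines, tag="hastd", pattern=r"error|fail"):
    """Count lines that mention `tag` and match `pattern` (case-insensitive).

    Both defaults are guesses for illustration; they are not taken from
    the hastd sources or manual page.
    """
    rx = re.compile(pattern, re.IGNORECASE)
    return sum(1 for line in lines if tag in line and rx.search(line))
```

Feed it the lines of whatever file the syslog configuration sends hastd messages to, and raise an alert when the count grows between runs.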
> Thanks,
> Daisuke Aoyama
>
> -- the procstat shows like this:
> [root@nas4free-nodeb /tmp]# procstat -ka|grep hast
> 11668 100069 hastd - mi_switch
> sleepq_catch_signals sleepq_wait_sig _sleep kern_wait sys_wait4
> amd64_syscall Xfast_syscall
> 17981 100406 hastd - mi_switch
> sleepq_catch_signals sleepq_wait_sig _sleep do_wait
> __umtx_op_wait_uint_private amd64_syscall Xfast_syscall
> 17981 100559 hastd - mi_switch
> sleepq_catch_signals sleepq_wait_sig _sleep soreceive_generic kern_recvit
> recvit sys_recvfrom amd64_syscall Xfast_syscall
> 17981 100560 hastd - mi_switch
> sleepq_catch_signals sleepq_wait_sig _sleep soreceive_generic kern_recvit
> recvit sys_recvfrom amd64_syscall Xfast_syscall
> 17981 100561 hastd - mi_switch
> sleepq_catch_signals sleepq_wait_sig _sleep do_wait
> __umtx_op_wait_uint_private amd64_syscall Xfast_syscall
> 17984 100078 hastd - mi_switch
> sleepq_catch_signals sleepq_wait_sig _sleep do_wait
> __umtx_op_wait_uint_private amd64_syscall Xfast_syscall
> 17984 100562 hastd - mi_switch
> sleepq_catch_signals sleepq_wait_sig _sleep soreceive_generic kern_recvit
> recvit sys_recvfrom amd64_syscall Xfast_syscall
> 17984 100563 hastd - mi_switch
> sleepq_catch_signals sleepq_wait_sig _sleep soreceive_generic kern_recvit
> recvit sys_recvfrom amd64_syscall Xfast_syscall
> 17984 100564 hastd - mi_switch
> sleepq_catch_signals sleepq_wait_sig _sleep do_wait
> __umtx_op_wait_uint_private amd64_syscall Xfast_syscall
> 18218 100145 hastctl - mi_switch
> sleepq_catch_signals sleepq_wait_sig _sleep soreceive_generic kern_recvit
> recvit sys_recvfrom amd64_syscall Xfast_syscall
>
> [root@nas4free-nodeb /tmp]# procstat -ta|grep hast
> 11668 100069 hastd - 0 120 sleep wait
> 17979 100557 hastd - 2 120 sleep g_waitid
Strange, I don't see process 17979 in the procstat -k output. Again, the logs
might be helpful here.
> 17981 100406 hastd - 2 120 sleep uwait
> 17981 100559 hastd - 0 120 sleep sbwait
> 17981 100560 hastd - 0 120 sleep sbwait
> 17981 100561 hastd - 1 120 sleep uwait
> 17984 100078 hastd - 2 121 sleep uwait
> 17984 100562 hastd - 3 120 sleep sbwait
> 17984 100563 hastd - 2 120 sleep sbwait
> 17984 100564 hastd - 1 121 sleep uwait
> 18218 100145 hastctl - 2 152 sleep sbwait
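The check made by eye on these two listings (a PID present in the procstat -t output but missing from the procstat -k output) can be sketched in Python. The helper names pids_for and missing_from_stacks are hypothetical; the code only assumes that each thread line starts with PID, TID and command name, as in the output quoted in this mail, and it skips the wrapped continuation lines.

```python
def pids_for(procstat_text, comm):
    """Return the set of PIDs whose command name equals `comm`.

    A thread line is expected to look like "PID TID COMM ...". Leading
    "> " quote markers are stripped; wrapped continuation lines do not
    start with two numeric fields and are ignored.
    """
    pids = set()
    for line in procstat_text.splitlines():
        fields = line.lstrip("> ").split()
        if (len(fields) >= 3 and fields[0].isdigit()
                and fields[1].isdigit() and fields[2] == comm):
            pids.add(int(fields[0]))
    return pids


def missing_from_stacks(threads_text, stacks_text, comm="hastd"):
    """PIDs listed in the -t output but absent from the -k output."""
    return sorted(pids_for(threads_text, comm) - pids_for(stacks_text, comm))
```

Given the two listings quoted in this mail, missing_from_stacks would report 17979 for hastd.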
>
>
> _______________________________________________
> freebsd-stable at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe at freebsd.org"
--
Mikolaj Golub