Edward Tomasz Napierała
2017-Sep-22 07:15 UTC
ctld: only 579 iSCSI targets can be created
On 0922T1036, Eugene M. Zheganin wrote:
> Hi,
>
> I have an old 11-STABLE as an iSCSI server, but out of the blue I
> encountered a weird problem: only 579 targets can be created. I mean, I am
> fully aware that the out-of-the-box limit is 128 targets, which is
> enforced by the CTL_MAX_PORTS define, and I've set it to 1024 (and of
> course rebuilt and installed a new kernel), but when I add more than 579
> targets I start to get the protocol errors:
>
> Sep 22 10:16:48 san1 ctld[8657]: 10.0.3.127 (iqn.1991-05.com.microsoft:worker296): protocol error: received invalid opcode 0x4
> Sep 22 10:16:48 san1 ctld[8658]: 10.0.3.127 (iqn.1991-05.com.microsoft:worker296): protocol error: received invalid opcode 0x46
> Sep 22 10:17:31 san1 ctld[8746]: 10.0.3.127 (iqn.1991-05.com.microsoft:worker296): protocol error: received invalid opcode 0x4
> Sep 22 10:17:31 san1 ctld[8747]: 10.0.3.127 (iqn.1991-05.com.microsoft:worker296): protocol error: received invalid opcode 0x46
> Sep 22 10:19:58 san1 ctld[9190]: 10.0.3.127 (iqn.1991-05.com.microsoft:worker296): protocol error: received invalid opcode 0x4
> Sep 22 10:19:58 san1 ctld[9191]: 10.0.3.127 (iqn.1991-05.com.microsoft:worker296): protocol error: received invalid opcode 0x46
> Sep 22 10:21:33 san1 ctld[9518]: 10.0.3.127 (iqn.1991-05.com.microsoft:worker296): protocol error: received invalid opcode 0x4
> Sep 22 10:21:33 san1 ctld[9519]: 10.0.3.127 (iqn.1991-05.com.microsoft:worker296): protocol error: received invalid opcode 0x46

There are two weird things here. First is that the error is coming from ctld(8) - the userspace daemon, not the kernel. The second is that those invalid opcodes are actually both valid - they are the Text Request, and the Logout Request with Immediate flag set, exactly what you'd expect for a discovery session.

Do you have a way to do a packet dump?
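For readers following along, here is a minimal sketch of how the two logged bytes decode. It assumes the byte-0 layout of the iSCSI Basic Header Segment from RFC 7143 (bit 0x40 is the Immediate flag, the low six bits are the opcode). The macro and function names are made up for illustration and are not taken from the ctld(8) source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Byte 0 of an iSCSI Basic Header Segment (layout per RFC 7143). */
#define BHS_IMMEDIATE_BIT 0x40u /* "I" bit: immediate delivery */
#define BHS_OPCODE_MASK   0x3fu /* low six bits carry the opcode */

/* Initiator opcodes that matter for a discovery session. */
#define OP_LOGIN_REQUEST  0x03u
#define OP_TEXT_REQUEST   0x04u
#define OP_LOGOUT_REQUEST 0x06u

/* Illustrative helper: true if the Immediate flag is set. */
static bool
bhs_is_immediate(uint8_t byte0)
{
    return (byte0 & BHS_IMMEDIATE_BIT) != 0;
}

/* Illustrative helper: opcode with the flag bits stripped. */
static uint8_t
bhs_opcode(uint8_t byte0)
{
    return (uint8_t)(byte0 & BHS_OPCODE_MASK);
}

/* Illustrative helper: human-readable name for the opcodes above. */
static const char *
bhs_opcode_name(uint8_t byte0)
{
    switch (bhs_opcode(byte0)) {
    case OP_LOGIN_REQUEST:
        return "Login Request";
    case OP_TEXT_REQUEST:
        return "Text Request";
    case OP_LOGOUT_REQUEST:
        return "Logout Request";
    default:
        return "unknown";
    }
}
```

With these helpers, 0x4 decodes as a Text Request without the Immediate flag, and 0x46 as a Logout Request (opcode 0x06) with the Immediate flag set.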
Hi,

Edward Tomasz Napierała wrote 2017-09-22 12:15:
> There are two weird things here. First is that the error is coming from
> ctld(8) - the userspace daemon, not the kernel. The second is that those
> invalid opcodes are actually both valid - they are the Text Request,
> and the Logout Request with Immediate flag set, exactly what you'd expect
> for a discovery session.
>
> Do you have a way to do a packet dump?

Sure. Here it is:

http://enaza.ru/stub-data/iscsi-protocol-error.pcap

Target IP is 10.0.2.4, initiator IP is 10.0.3.127. During the session captured in this file I got in messages:

Sep 22 15:38:11 san1 ctld[61373]: 10.0.3.127 (iqn.1991-05.com.microsoft:worker296): protocol error: received invalid opcode 0x4
Sep 22 15:38:11 san1 ctld[61374]: 10.0.3.127 (iqn.1991-05.com.microsoft:worker296): protocol error: received invalid opcode 0x46

This error happens when the initiator is trying to connect the disk from a target discovered.

Target is running FreeBSD 11.0-STABLE #1 r310734M where M is for CTL_MAX_PORTS 1024 (old version, yup, but I have a suspicion, which I have so far failed to prove, that more recent versions have some iSCSI vs ZFS conflict, but that's another story). Initiator is running Windows 7 Professional x64, inside an ESX virtual machine. This happens only when some unclear threshold is crossed; the previous ~200 initiators run Windows 7 Professional too.

If you need any additional data/diagnostics please let me know.

Eugene.
Thanks for the packet trace. What happens there is that the Windows initiator logs in, requests Discovery ("SendTargets=All"), receives the list of targets, as expected, and then... sends "SendTargets=All" again, instead of logging off. This results in ctld(8) dropping the session. The initiator then starts the Discovery session again, but this time it only logs in and then out, without actually requesting the target list.

Perhaps you could work around this by using "discovery-filter", as documented in ctl.conf(5)?
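One possible reading of the trace analysis above, sketched as a toy model: after login, a discovery session is expected to send one Text Request and then one Logout Request, and anything else is reported as an unexpected opcode. This is an assumption made for illustration, not the actual ctld(8) logic. The function name and the byte arrays in the usage note are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define OPCODE_MASK       0x3fu /* strips the Immediate flag bit */
#define OP_TEXT_REQUEST   0x04u
#define OP_LOGOUT_REQUEST 0x06u

/*
 * Hypothetical model of a post-login discovery session: exactly one
 * Text Request followed by one Logout Request.  byte0[] holds the first
 * header byte of each PDU the initiator sent after login.
 * Returns the index of the first PDU that breaks the expected order,
 * or -1 if the sequence matches.
 */
static int
first_unexpected_pdu(const uint8_t *byte0, size_t n)
{
    static const uint8_t expected[] = { OP_TEXT_REQUEST, OP_LOGOUT_REQUEST };
    const size_t nexpected = sizeof(expected) / sizeof(expected[0]);

    for (size_t i = 0; i < n; i++) {
        if (i >= nexpected || (byte0[i] & OPCODE_MASK) != expected[i])
            return (int)i;
    }
    return -1;
}
```

Under this model, a session that sends {0x04, 0x04} (SendTargets twice) is rejected at index 1 with byte 0x4, and a session that sends only {0x46} (logout without a Text Request) is rejected at index 0 with byte 0x46, which would line up with the pair of log messages, each from a different ctld process.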