John Tracy
2007-Nov-21 18:34 UTC
[zfs-discuss] iSCSI target using ZFS filesystem as backing
Hello All-

I'm working on a Sun Ultra 80 M2 workstation. It has eight 750 GB SATA disks installed. I've tried the following on ON build 72, Solaris 10 update 4, and Indiana, with the same results.

If I create a ZFS pool using 1-7 hard drives (I've tried 1 and 7) and then make an iSCSI target on a volume in that pool, then when a client machine tries to access the iSCSI volume, the memory usage on the Ultra 80 grows to the same size as the ZFS volume. For example:

I'm creating a RAID-Z ZFS pool:
zpool create -f telephone raidz c9d0 c10d0 c11d0 c12d0 c13d0 c14d0 c15d0

I then create a two-terabyte volume (zvol) in that pool:
zfs create -V 2000g telephone/jelley

And make it into an iSCSI target:
iscsitadm create target -b /dev/zvol/dsk/telephone/jelley jelley

Now if I perform an 'iscsitadm list target', the iSCSI target appears like it should:
Target: jelley
    iSCSI Name: iqn.1986-03.com.sun:02:fcaa1650-f202-4fef-b44b-b9452a237511.jelley
    Connections: 0

Now when I try to connect to it with my Windows 2003 server running the MS iSCSI initiator, I see the memory usage climb to the point that it totally exhausts all available physical memory (prstat):

   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
   511 root     2000G  106M sleep   59    0   0:02:58 1.1% iscsitgtd/15
  2139 root     8140K 4204K sleep   59    0   0:00:00 0.0% sshd/1
  2164 root     3276K 2740K cpu1    49    0   0:00:00 0.0% prstat/1
  2144 root     2672K 1752K sleep   49    0   0:00:00 0.0% bash/1
   574 noaccess  173M   92M sleep   59    0   0:03:18 0.0% java/25

Do you see the iscsitgtd process trying to use 2000 gigabytes of RAM? I can sit there and hold down the spacebar while the Windows workstation is trying to access it, and the memory usage climbs at an astronomical rate (several hundred megabytes per minute) until it exhausts all the available memory on the box. The total RAM it tries to allocate depends entirely on the size of the iSCSI volume. If it's a 1000-megabyte volume, it only allocates a gig; if it's 600 gigs, it tries to allocate 600 gigs.

Now here is the real kicker. I took this down to as simple a configuration as possible--one single drive with a ZFS pool on it. The memory utilization was the same. I then tried creating the iSCSI target on a UFS filesystem. Everything worked beautifully, and memory utilization was no longer directly proportional to the size of the iSCSI volume.

If I create something small, like a 100-gig iSCSI target, the system does eventually get around to finishing and releases the RAM. What's really strange is that when I try to access the iSCSI volume, the memory usage climbs megabyte by megabyte until it is exhausted, and then access to the iSCSI volume is terribly slow.

I can copy a 300-meg file in just six seconds when the memory utilization of the iscsitgtd process is low. But if I try a 2.5-gig file, once it gets about 1500 megs into it, performance drops about 99.9% and it's incredibly slow... again, until it's done and iscsitgtd releases the RAM; then it's plenty zippy for small I/O operations.

Has anybody else been making iSCSI targets on ZFS pools?

I've had a case open with Sun since Oct 3, if any Sun folks want to look at the details (case #65684887).

I'm getting very desperate to get this fixed, as this massive amount of storage was the only reason I got this Ultra 80...

Any pointers would be greatly appreciated.

Thanks-
John Tracy
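For anyone reproducing this, a quick way to watch just the target daemon's footprint rather than the whole process table is to point prstat at the iscsitgtd PID (a sketch assuming the stock Solaris prstat and pgrep; the trailing 5 is simply a refresh interval in seconds):

prstat -p `pgrep iscsitgtd` 5

The SIZE column is the virtual size being reported above; comparing it against RSS shows how much of that 2000G is actually resident at any moment.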
Jim Dunham
2007-Nov-21 22:39 UTC
[zfs-discuss] iSCSI target using ZFS filesystem as backing
John,> I''m working on a Sun Ultra 80 M2 workstation. It has eight 750 GB > SATA disks installed. I''ve tried the following on both ON build 72, > Solaris 10 update 4, and Indiana with the same results. > > If I create a ZFS filesystem using 1-7 hard drives (I''ve tried 1 > and 7), and then try to make an iSCSI target on that pool, when a > client machine tries to access the iSCSI volume, the memory usage > on the Ultra 80 goes to the same size as the ZFS filesystem. For > example: > > I''m creating a RaidZ ZFS pool: > zpool create -f telephone raidz c9d0 c10d0 c11d0 c12d0 c13d0 c14d0 > c15d0 > > I then create a two terabyte filesystem on that zvol: > zfs create -V 2000g telephone/jelley > > And make it into an iSCSI target: > iscsitadm create target -b /dev/zvol/dsk/telephone/jelley jelleyTry changing from a cached ZVOL to a raw ZVOL iscsitadm create target -b /dev/zvol/Rdsk/telephone/jelley jelley You can also try: zpool set shareiscsi=on telephone/jelley - Jim> > Now if I perform a ''iscsitadm list target'', the iSCSI target > appears like it should: > Target: jelley > iSCSI Name: iqn.1986-03.com.sun:02:fcaa1650-f202-4fef-b44b- > b9452a237511.jelley > Connections: 0 > > Now when I try to connect to it with my Windows 2003 server running > the MS iSCSI initiator, I see the memory usage climb to the point > that the totally exhausts all available physical memory (prstat): > > PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/ > NLWP > 511 root 2000G 106M sleep 59 0 0:02:58 1.1% > iscsitgtd/15 > 2139 root 8140K 4204K sleep 59 0 0:00:00 0.0% sshd/1 > 2164 root 3276K 2740K cpu1 49 0 0:00:00 0.0% prstat/1 > 2144 root 2672K 1752K sleep 49 0 0:00:00 0.0% bash/1 > 574 noaccess 173M 92M sleep 59 0 0:03:18 0.0% java/25 > > Do you see the iscsitgtd process trying to use 2000 gigabytes of > RAM? I can sit there and hold down spacebar while the Windows > workstation is trying to access it, and the memory usage climbs at > an astronomical rate, until it exhausts all the available memory on > the box (several hundred megabytes per minute). The total ram it > tries to allocate depends totally on the size of the iSCSI volume. > If it''s a 1000 megabyte volume, then it only allocates a gig... if > it''s 600 gigs, it tries to allocate 600 gigs. > > Now here is the real kicker. I took this down to as simple of a > configuration as possible--one single drive with a ZFS filesystem > on it. The memory utilization was the same. I then tried creating > the iSCSI target on a UFS filesystem. Everything work beautifully, > and memory utilization was no longer directly proportional to the > size of the iSCSI volume. > > If I create something small, like a 100 gig iSCSI target, the > system does eventually get around to finishing and releases the > ram. When what''s really strange is when I try to access the iSCSI > volume, the memory usage then climbs megabyte per megabyte until it > is exhausted, and then access to the iSCSI volume is terribly slow. > > I can copy a 300 meg file in just six seconds when the memory > utilization on the iscsitgtd process is low. But if I try a 2.5 gig > file, once it get''s about 1500 megs into it, performance drops > about 99.9% and it''s incredibly slow... again, until it''s done and > the iscsitgtd releases the ram, then it''s plenty zippy for small IO > operations. > > Has anybody else been making iSCSI targets on ZFS pools? > > I''ve had a case open with Sun since Oct 3, if any Sun folks want to > look at the details (case #65684887). 
> > I''m getting very desperate to get this fixed, as this massive > amount of storage was the only reason I got this M80... > > Any pointers would be greatly appreciated. > > Thanks- > John Tracy > > > This message posted from opensolaris.org > _______________________________________________ > zfs-discuss mailing list > zfs-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discussJim Dunham Storage Platform Software Group Sun Microsystems, Inc. 1617 Southwood Drive Nashua, NH 03063
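The distinction Jim is pointing at is the block (cached) device node versus the character (raw) device node that every zvol gets under /dev/zvol. A sketch of checking both nodes and recreating the target against the raw one, reusing the telephone/jelley names from the original post (you would remove the dsk-backed target first):

ls -lL /dev/zvol/dsk/telephone/jelley /dev/zvol/rdsk/telephone/jelley
iscsitadm create target -b /dev/zvol/rdsk/telephone/jelley jelley
iscsitadm list target -v

The first ls entry should dereference to a block device and the second to a character device; pointing the backing store at /dev/zvol/dsk routes I/O through the cached block device, while /dev/zvol/rdsk hands iscsitgtd the raw device. Setting shareiscsi=on on the volume does the equivalent plumbing for you.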
John Tracy
2007-Nov-27 04:17 UTC
[zfs-discuss] iSCSI target using ZFS filesystem as backing
Thanks Jim- That was exactly the problem. Have a good Monday.

-John
Ross
[zfs-discuss] iSCSI target using ZFS filesystem as backing
Hey guys, I just hit exactly the same problem, but for some reason I can't see one of my zpools in /dev/zvol/rdsk.

I'm testing out ZFS over iSCSI. To begin with I created a 1 GB file on an 8 GB pool, and it seemed OK (but I'd missed the fact it was using 1 GB of RAM). Next I wanted a bigger iSCSI drive to play with, so I created a 40 GB pool, created a 35 GB file with iscsitadm, and watched the machine grind to a halt... :)

I remembered this post and I'm trying to use the workaround now, but I can't find the new pool in /dev/zvol/rdsk:

# zpool list
NAME        SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
largepool  39.8G   728K  39.7G   0%  ONLINE  -
zfspool    7.94G  1.00G  6.94G  12%  ONLINE  -

# zfs list
NAME                USED  AVAIL  REFER  MOUNTPOINT
largepool           702K  39.1G    20K  /largepool
largepool/zfstest    18K  39.1G    18K  /largepool/zfstest
zfspool            5.00G  2.81G  1.00G  /zfspool
zfspool/iscsitest  2.06M  6.81G  2.06M  -

# cd /dev/zvol/rdsk
# ls
zfspool

I've tried setting shareiscsi=on, but that doesn't seem to be working either; running "iscsiadm list target" only shows the old target.
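Only zvols get device nodes under /dev/zvol, so a quick sanity check here (a sketch using the dataset names from the listing above) is to ask ZFS what type each dataset is:

zfs get type largepool/zfstest zfspool/iscsitest
zfs list -t volume

A dataset made with a plain 'zfs create' reports type 'filesystem' and never appears under /dev/zvol/rdsk; only volumes created with 'zfs create -V <size>' do, which is also what the follow-up later in this thread confirms.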
John Tracy
2008-Feb-08 13:35 UTC
[zfs-discuss] iSCSI target using ZFS filesystem as backing
This thread is actually a bit different from what you're experiencing. I never see any huge memory usage from iscsitgtd in the process table, but I'm definitely encountering memory leaks. The box itself stays stable and at low CPU utilization, and I can restart iscsitgtd and the problems disappear.

However, I think I know the problem you are encountering well, and have managed to overcome it. It looks like you found my post from last November with the answer. I'll outline here exactly how I'm creating my iSCSI targets, and I'm hoping you might see where your commands are different.

As Jim pointed out, my problem was that I was creating the target on a cached ZFS volume; I should have been using the ***r***dsk path when creating it. Here's how I configured the iSCSI targets successfully, without using a ton of RAM and using the shareiscsi=on functionality:

[u]Here I've got the pool created with seven disks:[/u]

bash-3.00# zpool status
  pool: backups
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        backups     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c2d0    ONLINE       0     0     0
            c3d0    ONLINE       0     0     0
            c4d0    ONLINE       0     0     0
            c5d0    ONLINE       0     0     0
            c6d0    ONLINE       0     0     0
            c7d0    ONLINE       0     0     0
            c8d0    ONLINE       0     0     0

errors: No known data errors

bash-3.00# zpool list
NAME     SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
backups  4.75T  232G  4.52T   4%  ONLINE  -

[u]Here are my existing datasets:[/u]

bash-3.00# zfs list
NAME              USED  AVAIL  REFER  MOUNTPOINT
backups          1.98T  2.00T  26.7G  /backups
backups/server1  2.78G  2.98T  2.78G  -
backups/server2  85.3G  2.02T  85.3G  -
backups/server3  71.5G  2.03T  71.5G  -
backups/server4  11.7G  2.28T  11.7G  -

[u]Now I'm creating a new ZFS volume of 500 GB:[/u]

bash-3.00# zfs create -V 500G backups/sample

[u]Here are my existing targets (before turning on shareiscsi on this new volume):[/u]

bash-3.00# iscsitadm list target
Target: backups/server1
    iSCSI Name: iqn.1986-03.com.sun:02:2cd67427-eeeb-e32f-e9b2-a1db82f81b9d
    Connections: 0
Target: backups/server2
    iSCSI Name: iqn.1986-03.com.sun:02:82994ecc-94cd-6616-85c8-f77b6f415724
    Connections: 0
Target: backups/server3
    iSCSI Name: iqn.1986-03.com.sun:02:485b5231-9bca-437c-9888-fb57f0fd099d
    Connections: 0
Target: backups/server4
    iSCSI Name: iqn.1986-03.com.sun:02:1efe77e7-1fd3-60f6-e1c3-8fc2f4f74fd9
    Connections: 1

[u]Now I'm going to set shareiscsi=on on my new volume:[/u]

bash-3.00# zfs set shareiscsi=on backups/sample

[u]And behold, the new target exists (and the machine doesn't grind to a halt):[/u]

bash-3.00# iscsitadm list target
Target: backups/server1
    iSCSI Name: iqn.1986-03.com.sun:02:2cd67427-eeeb-e32f-e9b2-a1db82f81b9d
    Connections: 0
Target: backups/server2
    iSCSI Name: iqn.1986-03.com.sun:02:82994ecc-94cd-6616-85c8-f77b6f415724
    Connections: 0
Target: backups/server3
    iSCSI Name: iqn.1986-03.com.sun:02:485b5231-9bca-437c-9888-fb57f0fd099d
    Connections: 0
Target: backups/server4
    iSCSI Name: iqn.1986-03.com.sun:02:1efe77e7-1fd3-60f6-e1c3-8fc2f4f74fd9
    Connections: 1
[b]Target: backups/sample
    iSCSI Name: iqn.1986-03.com.sun:02:130d1ac4-f846-e29e-cc45-819663bda4e2
    Connections: 0[/b]

When I first started working with iSCSI and ZFS, I couldn't get the shareiscsi=on flag to create the target. I was trying to do things manually because of that problem, and that is when I discovered the huge memory usage (I could watch iscsitgtd try to allocate four terabytes of RAM when I tried to share the whole pool... yes, it really did grind to a halt, but it tried for a good eight hours before dying!!!).

Check out the thread I made back in November, and you'll see what I was experiencing:
http://www.opensolaris.org/jive/thread.jspa?messageID=178912

Hope it helps-
John
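If you do need to create the target by hand instead of through shareiscsi (for example to pick your own target name), the equivalent manual step for the volume above, sketched with the backups/sample volume from this walkthrough and a hypothetical target name "sample", would point iscsitadm at the raw zvol node:

bash-3.00# iscsitadm create target -b /dev/zvol/rdsk/backups/sample sample
bash-3.00# iscsitadm list target -v

The key detail is rdsk rather than dsk in the backing-store path; with the block (dsk) node the data goes through the cached device, which is what produced the runaway memory usage earlier in this thread.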
Ross
[zfs-discuss] iSCSI target using ZFS filesystem as backing
Bleh, found out why they weren't appearing. I was just creating a regular ZFS filesystem and setting shareiscsi=on. If you create a volume it works fine...

I wonder if that's something that could do with being added to the documentation for shareiscsi? I can see now that all the examples of how to use it are using the "zfs create -V" command, but I can't find anything that explicitly states that shareiscsi needs a fixed-size volume.

Should ZFS generate an error if somebody tries to set shareiscsi=on for a filesystem that doesn't support that property?
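For anyone landing here with the same symptom, the working sequence is simply to make the dataset a volume before setting the property; a sketch with a hypothetical tank/mytarget name and size:

zfs create -V 10g tank/mytarget
zfs set shareiscsi=on tank/mytarget
zfs get shareiscsi,type tank/mytarget
iscsitadm list target

The zfs get line should report type 'volume' with shareiscsi 'on', and the new target then shows up in iscsitadm list target. Running the same zfs set on a plain filesystem succeeds silently but never produces a target, which is the behaviour being questioned above.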
Darren J Moffat
2008-Feb-11 11:02 UTC
[zfs-discuss] iSCSI target using ZFS filesystem as backing
Ross wrote:> Bleh, found out why they weren''t appearing. I was just creating a regular ZFS filesystem and setting shareiscsi=on. If you create a volume it works fine... > > I wonder if that''s something that could do with being added to the documentation for shareiscsi? I can see now that all the examples of how to use it are using the "zfs create -V" command, but can''t find anything that explicitly states that shareiscsi needs a fixed size volume. > > Should ZFS generate an error if somebody tries to set shareiscsi=on for a filesystem that doesn''t support that property?My initial reaction was yes, however there is a case where you want to set shareisci=on for a filesystem. Setting it on a filesystem allows for it to be inherited by any volumes created below that point in the hierarchy. Lets take this fictional, but reasonable, dataset hierarchy. tank/volumes/template/solaris tank/volumes/template/linux tank/volumes/template/windows tank/volumes/archive/ tank/volumes/active/host-abc tank/volumes/active/host-xyz tank is the pool name. volumes is a dataset (with canmount=false if you like) template, archive, active are allso datasets (again canmount=false) The actual volumes are: solaris, linux, windows, host-abc, host-xyz So where do we a turn on iscsi sharing ? It could be done at the individual volume layer, or it could be done up at the "volumes" dataset layer eg: zfs set shareiscsi=on tank/volumes/template/solaris zfs set shareiscsi=on tank/volumes/template/linux zfs set shareiscsi=on tank/volumes/template/windows ... or just do: zfs set shareiscsi=on tank/volumes/ Aside: having canmount=false on tank/volumes may or may not be a good idea but it depends on the local deployment. -- Darren J Moffat