*I apologise about the length of this e-mail, I tried to cover all details*
I am following up on a previous post which is here
http://unix.derkeiler.com/Mailing-Lists/FreeBSD/stable/2010-03/msg00392.html
To sum up, my end goal is this:
"To have a SAN type system where I have multiple servers that contain
multiple disks. I can lose a server and due to RAID1 across the servers,
the data will still be on the network, IO will increase due to there
being multiple servers to read from simultaneously as opposed to one NAS
box and lastly to be able to add new servers to the system to increase
the storage (to the end users the amount of available space increases).
Stage one is to create a two server system that can take a failure of a
server. Second stage is to get better IO from two servers than from one
NAS box. Last stage is to have all that and the ability to easily add
more storage"
I have created 3 servers (not great spec, just some spares), two of
which have 2 HDDs each; I will call them storedevice1 and storedevice2.
These are the devices that will hold the data. My 3rd server is my
controller, which controls the devices. Each storage server has two hard
drives which I export with iscsi-target as data0 and data1, and on the
controller I use the iscsi-initiator to connect to these 4 HDDs. Here
are the config files:
storedevice1:
#cat /usr/local/etc/iscsi/targets
# extents       file                    start   length
#extent0        /tmp/iscsi-target0      0       100MB
extent0         /data0/data             0       28GB
extent1         /data1/data             0       28GB
# target        flags   storage         netmask
target0         rw      extent0         192.168.2.0/24
target1         rw      extent1         192.168.2.0/24
# ls -lh /data0/
total 2195442
drwxrwxr-x  2 root  operator   512B Apr 20 17:49 .snap
-rw-r--r--  1 root  wheel       28G May  5 21:37 data
# ls -lh /data1/
total 2195442
drwxrwxr-x  2 root  operator   512B Apr 22 13:27 .snap
-rw-r--r--  1 root  wheel       28G May  5 21:37 data
storedevice2:
#cat /usr/local/etc/iscsi/targets
# extents       file                    start   length
extent2         /data0/data             0       28GB
extent3         /data1/data             0       28GB
# target        flags   storage         netmask
target2         rw      extent2         192.168.2.0/24
target3         rw      extent3         192.168.2.0/24
# ls -lh /data0/
total 2191250
drwxrwxr-x  2 root  operator   512B Apr 22 15:09 .snap
-rw-r--r--  1 root  wheel       28G May  5 21:37 data
# ls -lh /data1/
total 2191250
drwxrwxr-x  2 root  operator   512B Apr 22 17:40 .snap
-rw-r--r--  1 root  wheel       28G May  5 21:37 data
which gives me 4 extents and 4 targets across both. /dataX/data is a
file which I think it needs to be (???)
On my controller I have;
OffSanCtrl1# cat /etc/iscsi.conf
offsan0 {
        TargetName      = iqn.1994-04.org.netbsd.iscsi-target:target0
        TargetAddress   = 192.168.2.160:3260,1
}
offsan1 {
        TargetName      = iqn.1994-04.org.netbsd.iscsi-target:target1
        TargetAddress   = 192.168.2.160:3260,1
}
offsan2 {
        TargetName      = iqn.1994-04.org.netbsd.iscsi-target:target2
        TargetAddress   = 192.168.2.161:3260,1
}
offsan3 {
        TargetName      = iqn.1994-04.org.netbsd.iscsi-target:target3
        TargetAddress   = 192.168.2.161:3260,1
}
which is my initiator and connects to my 4 targets
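
With a config like this, each nickname would normally be attached with
iscontrol(8), using its -c (config file) and -n (nickname) options. A
minimal sketch, assuming the four nicknames above; the DRY_RUN switch
and the run/attach_all helpers are illustrative names, and by default
the script only prints the commands instead of running them:

```shell
#!/bin/sh
# Sketch: start one iscontrol(8) session per nickname in /etc/iscsi.conf.
# DRY_RUN=1 (the default here) only prints the commands; DRY_RUN=0 runs them.
DRY_RUN=${DRY_RUN:-1}
CONF=${CONF:-/etc/iscsi.conf}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Attach every target listed in the config by its nickname.
attach_all() {
    for nick in offsan0 offsan1 offsan2 offsan3; do
        run iscontrol -c "$CONF" -n "$nick"
    done
}

attach_all
```

Once the sessions are up, the four remote disks should appear on the
controller as da devices, as in the zpool output further down.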
Up to this point I think I am doing everything correctly. I then set up a
zpool on the controller
OffSanCtrl1# zpool status
  pool: store0
 state: ONLINE
 scrub: none requested
config:
        NAME        STATE     READ WRITE CKSUM
        store0      ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            da1s1d  ONLINE       0     0     0
            da3s1d  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            da2s1d  ONLINE       0     0     0
            da4s1d  ONLINE       0     0     0
errors: No known data errors
with da1s1d and da2s1d being from storedevice1 and da3s1d and da4s1d
from storedevice2 so if I am correct this should be a sort of RAID10
(anything that could be done better please tell me).
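
For reference, a pool with the layout shown in the zpool status output
above could be created roughly as follows. This is only a sketch: the
device names are taken from that output, the DRY_RUN switch and the
run/create_pool helpers are illustrative, and by default the script
prints the command rather than running it.

```shell
#!/bin/sh
# Sketch: build the pool so each mirror pairs one disk from each server.
# DRY_RUN=1 (the default here) only prints the command; DRY_RUN=0 runs it.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

create_pool() {
    # da1s1d and da2s1d come from storedevice1, da3s1d and da4s1d from
    # storedevice2, so losing one server leaves one half of every mirror.
    run zpool create store0 \
        mirror da1s1d da3s1d \
        mirror da2s1d da4s1d
}

create_pool
```

The point of the pairing is that neither mirror has both of its disks
on the same storage server.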
I now set-up zfs on this zpool (again, I think I'm doing this the right way)
OffSanCtrl1# zfs list
NAME           USED  AVAIL  REFER  MOUNTPOINT
store0        4.12G  50.5G    19K  /store0
store0/users  4.12G  50.5G  4.12G  /store0/users
Lastly, I need to be able to allow network users/servers to connect to
this. My choices I think are iscsi and samba as I have *nix and windows
machines, so I'll try iscsi. From the controller, create a target from
the zfs mount point I have created. If I am correct, a user should be
able to connect to the target from the controller, write data which will
actually be writing data across both storedevices
OffSanCtrl1# cat /usr/local/etc/iscsi/targets
# extents       file                    start   length
extent0         /store0/users/data      0       55G
# target        flags   storage         netmask
target9         rw      extent0         192.168.2.0/24
It didn't work until I created a file called data under /store0/users/
so I guess this relies on a file of some sort..maybe?? I think this bit
I've done wrong.
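
The behaviour described above suggests the target expects the extent's
backing file to exist before it starts. One way to create such a file
is as a sparse file with truncate(1). A minimal sketch; the
make_backing_file helper name and the 50G size in the usage comment are
assumptions (zfs list above reports 4.12G used and 50.5G available, so
a 55G extent looks slightly larger than the pool can actually hold):

```shell
#!/bin/sh
# Sketch: create a sparse backing file for an iSCSI extent.
# Usage: make_backing_file PATH SIZE   (SIZE as accepted by truncate -s)
make_backing_file() {
    path=$1
    size=$2
    dir=$(dirname "$path")
    # Refuse to guess: the parent directory (the ZFS mount point) must exist.
    [ -d "$dir" ] || { echo "missing directory: $dir" >&2; return 1; }
    # truncate -s sets the file length without writing any data blocks.
    truncate -s "$size" "$path"
}

# Example (assumed path and size, not run by default):
# make_backing_file /store0/users/data 50G
```

The length column in the targets file would then need to match the size
chosen here.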
I test this from a Windows 7 machine: I open up the iscsi-initiator,
connect to the controller, it connects and I can see the drive so I
format it and mount it. I can then write data to the drive and read data
from it. So...it works...to a degree...though I think what is actually
happening is it's writing data to the controller somewhere and NOT to
the storedevices...however I'm getting a bit lost.
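
One way to check where the writes actually land would be to watch
per-device activity on the pool while copying data from the Windows
machine. If the daXs1d devices show write traffic, the data is going
out over iSCSI to the storedevices. A minimal sketch; the watch_pool
helper name and the 5 second default interval are illustrative choices:

```shell
#!/bin/sh
# Sketch: show per-vdev read/write activity for a pool at a fixed interval.
# Usage: watch_pool POOL [INTERVAL_SECONDS]
watch_pool() {
    pool=$1
    interval=${2:-5}
    # Bail out early on hosts without the ZFS tools installed.
    command -v zpool >/dev/null 2>&1 || {
        echo "zpool not found" >&2
        return 1
    }
    # -v breaks the statistics down per mirror and per disk.
    zpool iostat -v "$pool" "$interval"
}

# Example, run on the controller while a copy is in progress:
# watch_pool store0 5
```

Running netstat -w 1 on storedevice1 and storedevice2 at the same time
would show the matching network traffic from the other side.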
I get quite good write speeds at first, and then it slowly crawls down
to about 5MB/s. With small files, and to begin with on big files, it is
very fast and impressive. What is causing the slowdown, I am not
sure...possibly RAM...cache...I'm not using amazing servers as
storedevice1/2 so it could possibly be that. But I know I have lots of
things to do...so I'm asking for any advice and assistance from the
community.
P.S Please note at this point I am not looking at ZIL or things like
that, I simply want a working system to test the theory and build a
proposal with. I know better servers will give me better IO, but should
not affect testing redundancy, failovers etc etc which is what I need to
work out.
Many thanks...even if you just read this beast of an e-mail

Hi!

On Thu, 06 May 2010 10:09:33 +0100 Michal wrote:

> Many thanks...even if you just read this beast of an e-mail

This is not an answer to your questions but my tiny experience in this
area.

I created two USB 2G storages with FreeBSD-CURRENT i386: node001 and
node002. Each system has one 250G hard disk. Both nodes export their
disks via iscsi:
-----
node001% sudo istgtcontrol list
lun0 storage "/dev/ad0" 250059350016
DONE LIST command
-----
node002% sudo istgtcontrol list
lun0 storage "/dev/ad4" 250059350016
DONE LIST command
-----

I used my workstation as a controller. And since it's an i386
(FreeBSD-CURRENT) I didn't try zfs but created a gmirror:
-----
da0 at iscsi0 bus 0 scbus7 target 0 lun 0
da0: <FreeBSD iSCSI DISK 0001> Fixed Direct Access SCSI-5 device
da0: 238475MB (488397168 512 byte sectors: 255H 63S/T 30401C)
da1 at iscsi0 bus 0 scbus7 target 1 lun 0
da1: <FreeBSD iSCSI DISK 0001> Fixed Direct Access SCSI-5 device
da1: 238475MB (488397168 512 byte sectors: 255H 63S/T 30401C)
GEOM_MIRROR: Device mirror/storage0 launched (2/2).
-----
bsam% gmirror status
           Name    Status  Components
mirror/storage0  COMPLETE  da0
                           da1
bsam% mount -p | grep storage
/dev/mirror/storage0    /storage    ufs    rw    2 2
-----

While using a 100Mb/s network connection I got 7-8 MB/s writes (after
some sysctl tuning, 3-4 MB/s without). Switching to a 1Gb/s network
connection (Intel Gigabit Ethernet Controllers) gave me something like
20 MB/s writes (no tuning):
-----
bsam% dd if=/dev/zero of=/storage/file.2G bs=1M count=2000
2000+0 records in
2000+0 records out
2097152000 bytes transferred in 108.479201 secs (19332296 bytes/sec)

bsam% netstat -w10
            input        (Total)           output
 packets errs idrops     bytes packets errs     bytes colls
      21    0      0      1658       9    0       582     0
      44    0      0      5425      35    0      3014     0
   70470    0      0   5640905  114909    0 184072616     0
  157670    0      0  11966753  257062    0 390774311     0
  175056    0      0  11951631  285343    0 390099162     0
  156502    0      0  11867216  255072    0 387223400     0
  174907    0      0  11947580  285064    0 390607312     0
  155607    0      0  11800874  253530    0 385202710     0
  173359    0      0  11835884  282798    0 386524530     0
  157604    0      0  11922283  256966    0 389324902     0
  174224    0      0  11919850  284396    0 390212650     0
  173677    0      0  11853356  283110    0 386944770     0
  174934    0      0  11932229  285361    0 389785504     0
   69574    0      0   4584810  113056    0 148735429     0
     282    0      0     22196     374    0    452062     0
      20    0      0      2054      14    0      2473     0
-----
node001% netstat -w10
            input        (Total)           output
 packets errs idrops     bytes packets errs     bytes colls
      12    0      0      1008       1    0       178     0
      17    0      0      1626       2    0       292     0
   60498    0      0  88413959   37258    0   2616474     0
  139627    0      0 204095754   86076    0   6064930     0
  139188    0      0 203450154   85816    0   6047488     0
  138486    0      0 202411672   85363    0   6000534     0
  139258    0      0 203561033   85848    0   6049360     0
  137636    0      0 201181818   84866    0   5981524     0
  138262    0      0 202064896   85169    0   6004090     0
  139297    0      0 203633800   85829    0   6031458     0
  138884    0      0 203011098   85495    0   6025390     0
  138280    0      0 202089488   85166    0   6004198     0
  139272    0      0 203583962   85828    0   6047758     0
   58465    0      0  85346106   36192    0   2545632     0
     193    0      0    236228     132    0     10360     0
-----
node002% netstat -w10
            input        (Total)           output
 packets errs idrops     bytes packets errs     bytes colls
      12    0      0      1008       1    0       178     0
      16    0      0      1566       2    0       292     0
   73703    0      0 107730061   44956    0   3134984     0
  139420    0      0 203862166   85058    0   5899360     0
  139295    0      0 203674472   85024    0   5896468     0
  138129    0      0 201945040   84345    0   5850658     0
  139068    0      0 203351261   84873    0   5886130     0
  138214    0      0 202097218   84293    0   5846170     0
  137860    0      0 201540738   84120    0   5835754     0
  138751    0      0 202889156   84668    0   5871694     0
  139260    0      0 203626362   84907    0   5888854     0
  138468    0      0 202435390   84534    0   5863900     0
  139210    0      0 203557426   84865    0   5885938     0
   45288    0      0  66099374   27684    0   1924360     0
     193    0      0    236228     136    0     10624     0
      11    0      0      1574       3    0       518     0
-----

Well, there was no tuning at all since I didn't have much spare time to
experiment. Here is some more info about the hardware and software.

Controller:
-----
bsam% uname -a
FreeBSD bsam.sem.ipt.ru 9.0-CURRENT FreeBSD 9.0-CURRENT #2 r207498: Sun May 2 16:19:56 MSD 2010 bsam@bsam.sem.ipt.ru:/home/bsam/FreeBSD/base/head/obj/storage/src/sys/BB i386

age0@pci0:2:0:0: class=0x020000 card=0x82261043 chip=0x10481969 rev=0xb0 hdr=0x00
    vendor     = 'Attansic (Now owned by Atheros)'
    device     = 'Gigabit Ethernet 10/100/1000 Base-T Controller (Atheros L1)'
    class      = network
    subclass   = ethernet
    cap 01[40] = powerspec 2 supports D0 D3 current D0
    cap 05[48] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 10[58] = PCI-Express 1 endpoint max data 128(128) link x1(x1)
-----

Node001:
-----
node001% uname -a
FreeBSD node001.sem.ipt.ru 9.0-CURRENT FreeBSD 9.0-CURRENT #13 r206496: Mon Apr 12 18:28:45 MSD 2010 bsam@bsam.sem.ipt.ru:/m/home/bsam/FreeBSD/base/head/obj/m/home/bsam/FreeBSD/base/head/src/sys/BB i386

em0@pci0:2:1:0: class=0x020000 card=0x002e8086 chip=0x100e8086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Gigabit Ethernet Controller (82540EM)'
    class      = network
    subclass   = ethernet
    cap 01[dc] = powerspec 2 supports D0 D3 current D0
    cap 07[e4] = PCI-X supports 2048 burst read, 1 split transaction
    cap 05[f0] = MSI supports 1 message, 64 bit
-----

Node002:
-----
node002% uname -a
FreeBSD node002.sem.ipt.ru 9.0-CURRENT FreeBSD 9.0-CURRENT #13 r206496: Mon Apr 12 18:28:45 MSD 2010 bsam@bsam.sem.ipt.ru:/m/home/bsam/FreeBSD/base/head/obj/m/home/bsam/FreeBSD/base/head/src/sys/BB i386

em0@pci0:0:10:0: class=0x020000 card=0x13768086 chip=0x107c8086 rev=0x05 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Gigabit Ethernet Controller (Copper) rev 5 (82541PI)'
    class      = network
    subclass   = ethernet
    cap 01[dc] = powerspec 2 supports D0 D3 current D0
    cap 07[e4] = PCI-X supports 2048 burst read, 1 split transaction
-----

--
WBR, Boris Samorodov (bsam)
Research Engineer, http://www.ipt.ru Telephone & Internet SP
FreeBSD Committer, http://www.FreeBSD.org The Power To Serve
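
As a rough check on the figures in the reply above, the byte counts can
be converted to MB/s. netstat -w10 prints one sample per 10 seconds, so
the controller's roughly 390,000,000 output bytes per sample and each
node's roughly 203,000,000 input bytes per sample are per-10-second
totals. A minimal sketch, where the rate helper is an illustrative name
and the two rounded netstat inputs are approximations read off the
steady-state rows:

```shell
#!/bin/sh
# Sketch: convert a byte count and a duration into MB/s (1 MB = 10^6 bytes).
# Usage: rate BYTES SECONDS
rate() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

rate 2097152000 108.479201   # dd on the controller        -> 19.3
rate 390000000 10            # controller output, approx.  -> 39.0
rate 203000000 10            # input on each node, approx. -> 20.3
```

The controller sends about twice what dd reports writing, which is
consistent with gmirror sending every write to both nodes, and each
node receives about 20 MB/s, well below what a 1Gb/s link can carry.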