Xen Dar
2009-Jul-03 12:16 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
I currently have 10TB of data on an NTFS Windows system and I would like to move it to ZFS on OpenSolaris, without having to buy an extra 10TB to do the transfer. If anyone has a method for doing this I would really appreciate any help. -- This message posted from opensolaris.org
Darren J Moffat
2009-Jul-03 12:21 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
Xen Dar wrote:
> I currently have 10TB of data on an NTFS Windows system and I would like to move it to ZFS on OpenSolaris, without having to buy an extra 10TB to do the transfer. If anyone has a method for doing this I would really appreciate any help.

You will need to describe the physical hardware and current RAID system in much more detail. For example, are you using mirroring, raid5, or something else? Are the mirrors always on separate full disks? Are the mirror disks always in a separate array? Are you using hardware RAID? What is your desired ZFS pool layout? Mirroring, RAIDZ (RAID5), RAIDZ2 (RAID6)?

It is not possible to convert NTFS to ZFS "in place", which is why it is necessary to know where you are starting from and where you want to get to.

-- Darren J Moffat
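For reference, the pool layouts Darren lists map onto different "zpool create" invocations. A minimal sketch, with a made-up pool name and device names (tank, c1t0d0 etc.):

  # two-way mirror
  zpool create tank mirror c1t0d0 c1t1d0

  # single-parity RAIDZ (RAID5-like): survives the loss of one drive
  zpool create tank raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0

  # double-parity RAIDZ2 (RAID6-like): survives the loss of two drives
  zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0

Which of these makes sense depends on the answers to the questions above.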
Xen Dar
2009-Jul-03 13:01 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
1st, thx for the quick response.

Current config is Windows XP with 8 separate non-raided 1.5TB SATA NTFS drives. My backup data is stored on another server with hardware RAID 5 and FreeNAS; this won't come into the equation. I currently don't have an OpenSolaris install. The current dos and don'ts of the storage pool system are still unclear to me as I am RTFMing.

I do have a SATA Intel hardware RAID controller that I have not installed on any of my systems yet. So what I would like to do is create a new server, potentially booting it from a CF drive, and have the SATA drives running as redundant storage space. The crux being that I don't want to lose all the data that is currently on the drives during the transformation.

I hope this is clearer, and thx.

-- This message posted from opensolaris.org
Darren J Moffat
2009-Jul-03 13:48 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
Xen Dar wrote:
> 1st, thx for the quick response.
>
> Current config is Windows XP with 8 separate non-raided 1.5TB SATA NTFS drives.

That doesn't look hopeful. What's more, I would highly recommend *against* running ZFS without either mirroring or raidz.

Are these drives all actually full, or can you use some Windows software to shrink the NTFS filesystems? If so, how small could you get them?

So is it 8 separate NTFS filesystems? If not, there is some kind of RAID; it might be a concat or a stripe. From what you have said it sounds like it isn't a mirror or RAID 5. A mirror would have been ideal, because you could break the mirror, create a new ZFS pool out of the freed halves, copy the data over to that, then destroy the NTFS filesystem and add its drives to the ZFS pool as mirrors. However, it doesn't sound like you have a config able to do that.

> My backup data is stored on another server with hardware RAID 5 and FreeNAS; this won't come into the equation.
> I currently don't have an OpenSolaris install. The current dos and don'ts of the storage pool system are still unclear to me as I am RTFMing.
>
> I do have a SATA Intel hardware RAID controller that I have not installed on any of my systems yet. So what I would like to do is create a new server, potentially booting it from a CF drive, and have the SATA drives running as redundant storage space.

Use of hardware RAID to provide mirroring, striping or RAID5 is generally not recommended with ZFS.

> The crux being that I don't want to lose all the data that is currently on the drives during the transformation.
>
> I hope this is clearer, and thx.

It is, but I don't think you can easily do it given the hardware and RAID configuration you have.

Note also that by default OpenSolaris can't actually read NTFS, but with FUSE it can: http://opensolaris.org/os/project/fuse/

The FUSE project has not yet been integrated into OpenSolaris, so it is not in the repository. See also http://forums.opensolaris.com/message.jspa?messageID=1287

-- Darren J Moffat
Ross
2009-Jul-03 13:55 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
Absolutely no way to do that without wiping the data and restoring it from your backup server. There isn't any in-place conversion from NTFS to ZFS, and in any case, with those drives you would be highly advised to go for raid-z or raid-z2. You'll end up with less capacity, but far less risk of losing everything if one drive fails. I don't know what setup you have with those 8 drives, but if you just have them as one big pool it sounds very risky.

Since you don't have any redundancy on your current server, there's not even any way to borrow a drive and migrate things that way. My advice would be to take a current backup of everything. Verify the backup to ensure that you absolutely, positively can read every single byte of it, and then install ZFS and bring the data over (robocopy should work fine for that).

-- This message posted from opensolaris.org
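For the final copy step Ross mentions, something along these lines should work from the Windows box once the ZFS server exports a CIFS/SMB share (the server name, share name and drive letters here are hypothetical):

  rem map the ZFS server's share, then copy everything with retries and a log
  net use Z: \\zfsserver\data
  robocopy D:\ Z:\ /E /COPY:DAT /R:2 /W:5 /LOG:C:\robocopy-d.log

robocopy's per-file output and summary log make it easy to confirm that nothing was skipped or failed.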
Xen Dar
2009-Jul-03 14:12 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
With 2TB of empty drives to use as a virgin raid-z, and 8 separate full 1.5TB NTFS drives that are in no way linked, mirrored or raided, what would be my best data migration strategy? I also apologise if this comes across noobish, but the fact is that's what I am when it comes to this. And once again, thx for the very useful insights. -- This message posted from opensolaris.org
Ross
2009-Jul-03 14:49 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
o_0 So you've got 8 drives that are all completely separate? Are the drives completely full? Do you have any space at all? When you say 2TB of empty drives, how many drives, and what capacities?

It may be possible to come up with something, but you have to bear in mind that you'll lose some space going to raid-z, and you'll also lose some space due to the overhead of ZFS. Quick question to the more experienced guys here - how much space would you end up with from 8 1.5TB drives in a raid-z array? Around 8-9TB?

How much free space do you have on the backup server, could you move some data over there? I'm still not entirely sure this is possible, but it may be worth a try. Do you have any budget to buy one or two spare 1.5TB drives?

-- This message posted from opensolaris.org
Ian Collins
2009-Jul-03 21:11 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
Ross wrote:

[please keep some context for the email list]

> Quick question to the more experienced guys here - how much space would you end up with from 8 1.5TB drives in a raid-z array? Around 8-9TB?

Bearing in mind manufacturer TB != real TB, each drive will give about 1.35TB of formatted space, so you would get about 10.5TB.

-- Ian.
Erik Trimble
2009-Jul-03 23:34 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
Ian Collins wrote:
> Ross wrote:
>
> [please keep some context for the email list]
>> Quick question to the more experienced guys here - how much space
>> would you end up with from 8 1.5TB drives in a raid-z array? Around
>> 8-9TB?
>>
> Bearing in mind manufacturer TB != real TB, each drive will give about
> 1.35TB of formatted space, so you would get about 10.5TB.
>

As Ian pointed out, virtually all 1.5TB drives are actually 1,500,000,000,000 bytes in size, which equals about 1.364 TiB (to use the stupid IEC binary name). Using RAID-Z on 8 drives will give you roughly 1.364TiB x 7 ~= 9.5TiB of usable space, give or take a hundred or two GB.

-- Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Ross
2009-Jul-04 04:29 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
Is that accounting for ZFS overhead? I thought it was more than that (but of course, it's great news if not) :-) -- This message posted from opensolaris.org
Ian Collins
2009-Jul-04 05:02 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
Ross wrote:
> Is that accounting for ZFS overhead? I thought it was more than that (but of course, it's great news if not) :-)
>

A raidz2 pool with 8 500G drives showed 2.67TB free.

-- Ian.
Eric D. Mudama
2009-Jul-04 06:39 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
On Fri, Jul 3 at 16:34, Erik Trimble wrote:
> Ian Collins wrote:
>> Ross wrote:
>>
>> [please keep some context for the email list]
>>> Quick question to the more experienced guys here - how much space
>>> would you end up with from 8 1.5TB drives in a raid-z array? Around
>>> 8-9TB?
>>>
>> Bearing in mind manufacturer TB != real TB, each drive will give about
>> 1.35TB of formatted space, so you would get about 10.5TB.
>>
> As Ian pointed out, virtually all 1.5TB drives are actually
> 1,500,000,000,000 bytes in size, which equals about 1.364 TiB (to use
> the stupid IEC binary name).

Actually, pretty sure most manufacturers are settling on IDEMA LBA1-02 for sizing formulas:

  LBA count = 97696368 + (1953504 * (Desired Capacity in GB - 50.0))

So a "1.5TB" drive from that formula would be:

  = 97696368 + (1953504 * (1500 - 50))
  = 2930277168 LBAs
  * 512 bytes per sector
  = 1,500,301,910,016 bytes

If, for example, you look at the spec sheet on wdc.com for their 1.5TB drive, they explicitly advertise the exact calculated sector count above: 2,930,277,168

http://www.wdc.com/en/products/products.asp?driveid=575

There was tons of RAID sizing weirdness in the past, but the above is easy to check and is becoming very common. Every drive vendor wants to be able to sell their drives into a RAID array that was originally populated with a competitor's devices.

--eric

--
Eric D. Mudama
edmudama at mail.bounceswoosh.org
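The formula is easy to sanity-check from any shell with 64-bit arithmetic (bash or ksh93, for example); for a nominal 1500 GB drive:

  $ echo $(( 97696368 + 1953504 * (1500 - 50) ))
  2930277168
  $ echo $(( (97696368 + 1953504 * (1500 - 50)) * 512 ))
  1500301910016

which matches the 2,930,277,168-sector figure on the WD spec sheet above.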
Erik Trimble
2009-Jul-04 17:20 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
Ian Collins wrote:
> Ross wrote:
>> Is that accounting for ZFS overhead? I thought it was more than that
>> (but of course, it's great news if not) :-)
>>
> A raidz2 pool with 8 500G drives showed 2.67TB free.
>

Same here. The ZFS overhead appears to be much smaller than on similar UFS filesystems. E.g. on 500GB Hitachi drives:

Total disk sectors available: 976743646 + 16384 (reserved sectors)

Part      Tag    Flag     First Sector        Size       Last Sector
  0        usr    wm               256    465.75GB         976743646
  1 unassigned    wm                 0           0                 0
  2 unassigned    wm                 0           0                 0
  3 unassigned    wm                 0           0                 0
  4 unassigned    wm                 0           0                 0
  5 unassigned    wm                 0           0                 0
  6 unassigned    wm                 0           0                 0
  8   reserved    wm         976743647      8.00MB         976760030

This is with an EFI label, which reports almost EXACTLY the amount expected (500GB = 465GiB).

I'm using them in a 4-disk RAID-Z, so I lose 1 disk to parity. The info is:

# zpool list
NAME   SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
data  1.81T  1.75T  64.0G   96%  ONLINE  -

# zfs list data
NAME   USED  AVAIL  REFER  MOUNTPOINT
data  1.31T  26.2G  41.2G  /data

# df -k /data
Filesystem     kbytes       used      avail  capacity  Mounted on
data       1433069568   43178074   27479559       62%  /data

Given the numbers, I would expect 3 x 465.75GB = 3 x 488374272kB = 1465122816 kB. So 'df' reports my RAID-Z as being 2.18% smaller than the aggregate raw disk partition size.

If the same numbers hold up for you, with 8 x 1.5TB in a RAID-Z:

  1.5TB ~ 1.364TiB
  7 x 1.364TiB ~ 9.546TiB
  Lose 2.2% for ZFS overhead: 9.546TiB x 0.978 ~ 9.34 TiB

That's today's math lesson! :-)

-- Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Jim Klimov
2009-Jul-08 14:16 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
First of all, as other posters have stressed, your data is not safe stored in a single copy in the first place. Before doing anything to it, make a backup and test the backup if at all possible. At least do it for the data that is worth more than the rest ;)

As was asked in other posts and not answered - how much disk space do you actually have available on your 8 disks? I.e. can you copy some files around in WinXP in order to free up at least one drive? Half of the drives (ideal)? How compressible is your data (i.e. videos vs text files)? Is it compressed on the NTFS filesystem already (a pointer to freeing up space, if not)?

Depending on how much free space can actually be gained by moving and compressing your data, there are a number of possible scenarios, detailed below.

The point I'm trying to get to is: as soon as you can free up a single drive and move it into the Solaris machine, you can set it up as your ZFS pool. Initially this pool would only contain a single vdev (a single drive, a mirror or a raidz group of drives; vdevs are concatenated to make up the larger pool if there's more than one, as detailed below). You create a filesystem dataset on the pool and enable compression (if your data can be compressed). In recent Solaris and OpenSolaris you can use gzip-9 to fit the data tighter on the drive. Also keep in mind that this setting applies only to data written *after* the value is set, so a dataset can store data objects written with mixed compression levels if the value is changed on the fly. Alternatively, and simpler to support, you can make several datasets with pre-defined compression levels (i.e. so as not to waste CPU cycles compressing JPEGs and MP3s).

Now, as you copy the data from NTFS to Solaris, you (are expected to) free up at least one more drive, which can be added to the ZFS pool; its capacity is then concatenated onto the same pool. If you free up several drives at once, you can go for a raidz vdev.

The best-case scenario is that you free up enough disks to build a redundant ZFS vdev right away (raidz1, raidz2 or mirror - in that order the usable capacity decreases and the data protection grows). Apparently you don't expect to have enough drives to mirror all the data, so let's skip that idea for now. The raidz levels require that you free up at least two drives initially. AFAIK raidz vdevs can not be expanded at the moment, so the more drives you use initially, the less capacity you'll lose to overhead. As you progress with the data copying, you can free up some more drives and make another raidz vdev attached to the same pool.

You can use a trick to make a raidz vdev with missing redundancy disks (which you'd attach and resilver later on). This is possible, but not "production" ready in any manner, and prone to losing the whole set of several drives' worth of data whenever anything goes wrong. At my sole risk, I used it to make and populate a raidz2 pool of 4 devices while I only had 2 drives available at that moment (the other 2 were the old raid10 mirror's components with the original data). The fake raidz redundancy device trick is discussed in this thread: http://opensolaris.org/jive/thread.jspa?messageID=328360&tstart=0

In a worst-case scenario you'll have either a single pool of concatenated disks, or a number of separate pools - like your separate NTFS filesystems are now; in my opinion, this is the lesser of the two evils.
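Roughly, the incremental build-out described above looks like this; the pool, dataset and device names are made up for illustration:

  # start the data pool on the first freed-up drive
  zpool create tank c2t0d0

  # per-dataset compression: gzip-9 where it pays off, off for already-compressed media
  zfs create -o compression=gzip-9 tank/docs
  zfs create -o compression=off tank/media

  # each time enough NTFS drives are emptied, grow the pool with another vdev,
  # e.g. a 3+1 raidz1 group (zpool warns about mixing redundancy levels; -f overrides)
  zpool add tank raidz c2t1d0 c2t2d0 c2t3d0 c2t4d0
  zpool list tank

As noted above, the compression property only affects data written after it is set.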
In the case of separate ZFS pools, you can move them around and you only lose one disk's worth of data if anything (drive, cabling, software, power) goes wrong. With a concatenated pool, however, you have all of the drives' free space also concatenated into one big available bubble. That's your choice to make. Later on you can expand the single-drive vdevs into mirrors, as you buy or free up drives.

If you find that your data compresses well, so that you start with a single-drive concatenation pool and then find that you can free up several drives at once and use raidz sets, see if you can squeeze out at least 3-4 drives (including a fake device for raidz redundancy, if you choose to try the trick). If you can, start a new pool made with raidz vdevs and migrate the data from the single drives to it, then scrap their pool and reuse them. Remember that you can't currently remove a vdev from a pool. For such "temporary" pools (preferably redundant, or not) you can also use any number of older, smaller drives if you can get your hands on them ;)

On a side note, copying this much data over the LAN would take ages. If your disks are not too fragmented, you can typically expect 20-40MB/s for large files. Zillions of small files (or heavily fragmented disks) cause so many mechanical seeks that speeds can fall to well under 1MB/s. It's easy to see that copying a single 1.5TB drive can take anywhere from half a day on a gigabit LAN to about 2-3 days on a 100Mbit LAN (7-10 MB/s typical).

In your position, I would explore (first on a testbed!) whether you can use the same machine to read NTFS and write ZFS. My ideas so far include either dual-booting to Solaris with some kind of NTFS driver (for example, see these posts: http://blogs.sun.com/pradhap/entry/mount_ntfs_ext2_ext3_in or http://blogs.sun.com/pradhap/entry/mounting_ntfs_partition_in_solaris), or virtualizing either Solaris or WinXP (see VirtualBox) and finding a way to assign whole physical drives to a virtual machine. If you virtualize OpenSolaris this way, make sure that your test pool (a USB flash drive, perhaps) can indeed be imported on a physical machine. It may be possible that your data can be copied over faster this way.

On another side note, expect your filesystem security settings to be lost (unless you integrate with a Windows domain), and remember that ZFS filenames are currently limited to 255 bytes. That bug bit me once last year - NTFS files had to be renamed.

Hope this helps, let us know if it does ;)

//Jim Klimov

-- This message posted from opensolaris.org
Jim Klimov
2009-Jul-08 15:55 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
I meant to add that, due to the sheer amount of data (and time needed) to copy, you really don't want to use copying tools which abort on error, such as MS Explorer.

Normally I'd suggest something like FAR in Windows or Midnight Commander in Unix for copying over networked connections (CIFS shares), or further on - tar/cpio/whatever. These would let you know of errors and/or suggest that you retry copying (if errors were due to the environment, like a LAN switch reset). However, interactive tools would stall until you answer, and non-interactive tools would not continue copying what they lost on the first pass.

Overall, from my experience, I'd suggest rsync running in a loop with partial transfers enabled, for either local copying or over-the-net copying. This way rsync takes care of copying only the changed files (or continuing files which failed, from the point where they failed), and it does so without requiring supervision. For the Windows side you can look for a project called cwRsync, which includes parts of Cygwin to make up the environment for rsync (ssh, openssl, etc).

My typical runs between Unix hosts look like:

solaris# cd /pool/dumpstore/databases
solaris# while ! rsync -vaP --stats --exclude='*.bak' --exclude='temp' --partial --append source:/DUMP/snapshots/mysql . ; do sleep 5; echo "===== `date`: RETRY"; done; date

(Slashes at the end of pathnames matter a lot - directory or its contents.)

For Windows the basic syntax remains nearly the same; I don't want to add confusion by crafting it out of my head now with nowhere to test it.

If your setup is in a LAN and security overhead can be disregarded, use 'rsync -e rsh' (or use ssh with lightweight algorithms) so as not to waste CPU on encryption. Alternatively, you can configure the Solaris host to act as an rsync server and use the rsync protocol (with desired settings) directly.

Also, if your files are not ASCII-named, you might want to look at the rsync --iconv parameter to recode pathnames. And remember the ZFS 255-byte(!) limit on names. For Unicode names the string character length is roughly half that.

//HTH, Jim

-- This message posted from opensolaris.org
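If you go the rsync-daemon route, a minimal (and deliberately trusting - LAN only) configuration might look like this; the module name and paths are made up:

  solaris# cat /etc/rsyncd.conf
  [media]
      path = /tank/media
      read only = no
      uid = root
      gid = root
      hosts allow = 192.168.0.0/24

  solaris# rsync --daemon --config=/etc/rsyncd.conf

The client then addresses the module as solarisbox::media (or rsync://solarisbox/media), with no ssh/rsh in the path at all.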
David Magda
2009-Jul-08 16:29 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
On Wed, July 8, 2009 11:55, Jim Klimov wrote:
> My typical runs between Unix hosts look like:
>
> solaris# cd /pool/dumpstore/databases
> solaris# while ! rsync -vaP --stats --exclude='*.bak' --exclude='temp'
> --partial --append source:/DUMP/snapshots/mysql . ; do sleep 5; echo
> "===== `date`: RETRY"; done; date

If possible, also try to use rsync 3.x if you're going to go down that route. In previous versions it was necessary to traverse the entire file system to get a file list before starting a transfer. Starting with 3.0.0 (and when talking to another 3.x), it sends incremental updates so bits start moving quicker:

> ENHANCEMENTS:
>
> - A new incremental-recursion algorithm is now used when rsync is talking
> to another 3.x version. This starts the transfer going more quickly
> (before all the files have been found), and requires much less memory.
> See the --recursive option in the manpage for some restrictions.

http://www.samba.org/ftp/rsync/src/rsync-3.0.0-NEWS
Jim Klimov
2009-Jul-09 06:04 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
True, correction accepted, covering my head with ashes in shame ;)

We do use a custom-built package of rsync-3.0.5 with a number of their standard contributed patches applied. To be specific, these:

checksum-reading.diff
checksum-updating.diff
detect-renamed.diff
downdate.diff
fileflags.diff
fsync.diff
netgroup-auth.diff

Speaking of which, I should also suggest that after your cycles of "incremental" or "partial" rsync'ing are complete (which protects long-running outstanding IO against intermediate problems), you rerun the same rsync command adding a "-cn" flag. This way both sides will calculate and compare checksums on their copies of the files, and any files broken during transfer will be reported (and updated if you remove the "-n" flag).

If by using "-cn" you find any files broken during transfer, check both versions. It may well be that during these intensive operations the original disk's bits flipped, so in fact your new copy may be (or not be) the better half. If this is a distro archive with pre-existing checksums (md5sums) available, they would certainly help you decide which copy is the good one (or that neither is, as happens sometimes). If this is a consumer multimedia archive with files in formats resistant to damage (MP3, MP4, JPEG to a lesser degree) you probably don't care much about an occasional noise glitch or a "green frames" artefact. Or you do :)

Finally, the rsync flag "--sumfiles" (I'm not certain whether it's stock or from one of the patches) specifically allows you to create .rsyncsums files in each directory to speed up future synchronizations. Since on non-ZFS either the original file or its checksum file can degrade undetected, recalculating and checking the validity of these checksums once in a while is a good idea to detect errors crawling in.

// HTH, Jim

-- This message posted from opensolaris.org
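In other words, the verification pass is just the earlier command with checksumming and dry-run added; a minimal sketch against the example above:

  solaris# rsync -vaPcn --stats source:/DUMP/snapshots/mysql .

Anything it lists has differing checksums on the two sides and would be re-copied on a real (non "-n") run.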
Xen Dar
2009-Jul-09 06:54 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
Ok, so this is my solution; pls be advised I am a total Linux newbie so I am learning as I go along. I installed OpenSolaris and set up rpool as my base install on a single 1TB drive. I attached one of my NTFS drives to the system, then used a utility called prtparts to get the name of the attached NTFS drive, and then mounted it successfully. I then started transferring data across till the drive was empty (this is currently in progress). Once that's done I will add the empty NTFS drive to my ZFS pool and repeat the operation with my other drives.

This leaves me with the issue of redundancy, which is sorely lacking. Ideally I would like to do the same thing directly into a raidz pool, but I understand from what I have read that you can't add single drives to a raidz, and I want all my drives in a single pool so as to lose the space for the pool redundancy only once.

I haven't worked out if I can transform my zpool into a raidz after I have copied all my data.

Once again, thx for the great support. And maybe someone can direct me to an area in a forum that explains why I can't use sudo...

-- This message posted from opensolaris.org
Xen Dar
2009-Jul-09 06:56 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
Just for clarity, 90% of my stuff is media, mostly AVIs or MKVs, so I don't think that compresses very well. -- This message posted from opensolaris.org
Lejun Zhu
2009-Jul-09 07:03 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
> Ok, so this is my solution; pls be advised I am a
> total Linux newbie so I am learning as I go along. I
> installed OpenSolaris and set up rpool as my base
> install on a single 1TB drive. I attached one of my
> NTFS drives to the system, then used a utility called
> prtparts to get the name of the attached NTFS drive,
> and then mounted it successfully.
> I then started transferring data across till the
> drive was empty (this is currently in progress). Once
> that's done I will add the empty NTFS drive to my ZFS
> pool and repeat the operation with my other drives.
>
> This leaves me with the issue of redundancy, which is
> sorely lacking. Ideally I would like to do the same
> thing directly into a raidz pool, but I understand
> from what I have read that you can't add single drives
> to a raidz, and I want all my drives in a single pool
> so as to lose the space for the pool redundancy
> only once.
>
> I haven't worked out if I can transform my zpool into
> a raidz after I have copied all my data.
>
> Once again, thx for the great support. And maybe
> someone can direct me to an area in a forum that
> explains why I can't use sudo...

Hope this helps: http://forums.opensolaris.com/thread.jspa?threadID=583&tstart=-1

-- This message posted from opensolaris.org
Jim Klimov
2009-Jul-09 10:49 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
> I installed OpenSolaris and set up rpool as my base install on a single 1TB drive

If I understand correctly, you have rpool and the data pool configured all as one pool? That's probably not what you really want. For one thing, the bootable root pool must all be available to GRUB from a single hardware device, and this precludes any striping or raidz configurations for the root pool (only single drives and mirrors are supported).

You should rather make a separate root pool (how big depends on your installation size, RAM -> swap, number of OS versions to roll back); I'd make do with anything from 8 to 20GB. The rest of the disk (as another slice) becomes the data pool, which can later be expanded by adding stripes. Obviously, data already on the disk won't magically become striped across all drives unless you rewrite it.

> a single 1TB drive

Minor detail: I thought you were moving 1.5TB disks? Or did you find a drive with adequately little data (1TB used)?

> transferring data across till the drive was empty

I thought the NTFS driver for Solaris is read-only? Not a good transactional approach, in any case. Delete the original data only after all copying has completed (and perhaps been cross-checked) and the disk can actually be reused in the ZFS pool. For example, if you were to remake the pool (as suggested above for rpool and below for the raidz data pool) - where would you re-get the original data to copy over again?

> I haven't worked out if I can transform my zpool into a raidz after I have
> copied all my data.

My guess would be - no, you can't (not directly, at least). I think you can mirror the striped pool's component drives on the fly by buying new drives one at a time - which requires buying those drives. Or, if you buy and attach all 8-9 drives at once, you can build another pool with a raidz layout and migrate all the data to it. Your old drives can then be attached to this pool as another raidz vdev stripe (or even a mirror, but that's probably not needed for your use case). These scenarios are not unlike raid50 or raid51, respectively.

In the case of striping, you can build and expand your pool with vdevs of different layout and size. As said before, there is currently the problem that you can't shrink the pool to remove devices (other than breaking mirrors into single drives).

Perhaps you can get away with buying now only the "parity" drives for your future pool layout (which depends on the number of motherboard/controller connectors, power supply capacity, your computer case size, etc.) and following the ideas for the "best-case" scenario from my earlier post. Then you'd start the pool by making a raidz1 vdev of 3-5 drives total (new empty ones, possibly including the "missing" fake parity device), and then make and attach more similar raidz vdevs to the pool as you free up NTFS disks.

I did some calculations on this last evening. For example, if your data fits on 8 "data" drives, you can make 1 * 8-Ddrive raidz1 set with 9 drives (8+1), 2 * 4-Ddrive sets with 10 drives (8+2), or 3 * 3-Ddrive sets with 12 drives (9+3).

I'd buy 4 new drives and stick with the latter 12-drive pool scenario (a rough sketch of the fake-parity-device step follows below):
1) build a complete 4-drive raidz1 set (3 Ddrives + 1 Pdrive),
2) move over 3 drives' worth of data,
3) build and attach a fake 4-drive raidz1 set (3 Ddrives + 1 missing Pdrive),
4) move over 3 drives' worth of data,
5) build and attach a fake 4-drive raidz1 set (3 Ddrives + 1 missing Pdrive),
6) move over 2 drives' worth of data,
7) complete the parities for the missing Pdrives of the two faked sets.
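For reference, the fake-parity trick from the thread linked earlier boils down to standing in a sparse file for the missing drive and replacing it with a real disk later; a rough sketch with made-up names and sizes:

  # sparse file with the same nominal size as the real disks
  mkfile -n 1500g /var/tmp/fakeparity

  # create the raidz1 vdev with the file as the 4th member (zpool may need -f to
  # mix a file with whole disks), then take the file offline so nothing is ever
  # written to it - the vdev runs degraded, i.e. with NO parity protection
  zpool create tank raidz c3t0d0 c3t1d0 c3t2d0 /var/tmp/fakeparity
  zpool offline tank /var/tmp/fakeparity
  rm /var/tmp/fakeparity

  # ... copy data over ...

  # once a real drive is freed up, swap it in and let it resilver
  zpool replace tank /var/tmp/fakeparity c3t3d0
  zpool status tank

Until that replace/resilver completes, the vdev has no redundancy at all, which is exactly the risk warned about above.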
This does not in any way involve the capacity of your boot/root drives (which I think were expected to be a CF card, no?). So you already have at least one such drive ;) Even if your current drive is partially consumed by the root pool, I think you can sacrifice some 20GB on each drive in one 4-disk raidz1 vdev. You can mirror the root pool with one of these drives (a minimal sketch of that follows below), and make a mirrored swap pool on the other couple.

//Jim

-- This message posted from opensolaris.org
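Mirroring the root pool later, as Jim suggests, is essentially an attach plus installing GRUB on the second disk; a minimal sketch with hypothetical slice names:

  # zpool attach rpool c1t0d0s0 c2t0d0s0
  # installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c2t0d0s0

Wait for the resilver reported by "zpool status rpool" to finish before relying on the second half.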
Jim Klimov
2009-Jul-09 10:57 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
One more note,

> For example, if you were to remake the pool (as suggested above for rpool and
> below for the raidz data pool) - where would you re-get the original data to
> copy over again?

Of course, if you take up the idea of buying 4 drives and building a raidz1 vdev right away, and if you actually moved (deleted) the data from the NTFS disk, you should start by creating this new pool with a complete raidz1 vdev. Then you transfer (copy, then delete) the data to it from your current ZFS pool, and only then remake/migrate the root pool if needed.

Perhaps it would make sense to start with a faked raidz1 array (along with a new, smaller root pool on its drives) made of just 3 more 1TB disks, so you would just recycle your current ZFS drive and add it as the parity disk to this pool after all is complete.

As you see, there are lots of options depending on budget, creativity and other factors. It is possible that in the course of your quest you'll try several of them. Starting out with a transactional approach (i.e. not deleting the originals until necessary) pays off in such cases.

//Jim

-- This message posted from opensolaris.org
Xen Dar
2009-Jul-09 14:13 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
> > I installed OpenSolaris and set up rpool as my base
> > install on a single 1TB drive
>
> If I understand correctly, you have rpool and the
> data pool configured all as one pool?

Correct

> That's probably not what you really want. For one
> thing, the bootable root pool must all be available
> to GRUB from a single hardware device, and this
> precludes any striping or raidz configurations for
> the root pool (only single drives and mirrors are
> supported).

Makes sense

> You should rather make a separate root pool (how big
> depends on your installation size, RAM -> swap, number
> of OS versions to roll back); I'd make do with anything
> from 8 to 20GB. The rest of the disk (as another slice)
> becomes the data pool, which

I would like to use a 16GB SD card for this - if there is a post or a resource on "how to" you know of, pls point me to it.

> can later be expanded by adding stripes. Obviously,
> data already on the disk won't magically become striped
> across all drives unless you rewrite it.
>
> > a single 1TB drive
>
> Minor detail: I thought you were moving 1.5TB disks?
> Or did you find a drive with adequately little data
> (1TB used)?

I have 2 x 1TB drives that are clean and 8 x 1.5TB drives with all my data on.

> > transferring data across till the drive was empty
>
> I thought the NTFS driver for Solaris is read-only?

Nope, I copied (not moved) all the data - 800GB so far in 3 and a half hours - successfully to my rpool.

> Not a good transactional approach, in any case. Delete
> the original data only after all copying has completed
> (and perhaps been cross-checked) and the disk can
> actually be reused in the ZFS pool.
>
> For example, if you were to remake the pool (as
> suggested above for rpool and below for the raidz data
> pool) - where would you re-get the original data to
> copy over again?
>
> > I haven't worked out if I can transform my zpool into
> > a raidz after I have copied all my data.
>
> My guess would be - no, you can't (not directly, at
> least). I think you can mirror the striped pool's
> component drives on the fly by buying new drives one at
> a time - which requires buying those drives. Or, if you
> buy and attach all 8-9 drives at once,

Trying to spare myself the expense as this is my home system, so budget is a constraint.

> you can build another pool with a raidz layout and
> migrate all the data to it. Your old drives can then be
> attached to this pool as another raidz vdev stripe (or
> even a mirror, but that's probably not needed for your
> use case). These scenarios are not unlike raid50 or
> raid51, respectively.
>
> In the case of striping, you can build and expand your
> pool with vdevs of different layout and size. As said
> before, there is currently the problem that you can't
> shrink the pool to remove devices (other than breaking
> mirrors into single drives).
>
> Perhaps you can get away with buying now only the
> "parity" drives for your future pool layout (which
> depends on the number of motherboard/controller
> connectors, power supply capacity, your computer case
> size, etc.) and following the ideas for the "best-case"
> scenario from my earlier post.

Motherboard has 7 SATA connectors; in addition I have an Intel SATA RAID controller with 6 connectors which I haven't put in yet, and I am using a dual-PSU Coolermaster case which supports 16 drives.

> Then you'd start the pool by making a raidz1 vdev of
> 3-5 drives total (new empty ones, possibly including
> the "missing" fake parity device), and then make and
> attach more similar raidz vdevs to the pool as you free
> up NTFS disks.
>
> I did some calculations on this last evening.
>
> For example, if your data fits on 8 "data" drives, you
> can make 1 * 8-Ddrive raidz1 set with 9 drives (8+1),
> 2 * 4-Ddrive sets with 10 drives (8+2), or 3 * 3-Ddrive
> sets with 12 drives (9+3).
>
> I'd buy 4 new drives and stick with the latter 12-drive
> pool scenario -
> 1) build a complete 4-drive raidz1 set (3 Ddrives + 1 Pdrive),
> 2) move over 3 drives' worth of data,
> 3) build and attach a fake 4-drive raidz1 set (3 Ddrives + 1 missing Pdrive),
> 4) move over 3 drives' worth of data,
> 5) build and attach a fake 4-drive raidz1 set (3 Ddrives + 1 missing Pdrive),
> 6) move over 2 drives' worth of data,
> 7) complete the parities for the missing Pdrives of the two faked sets.
>
> This does not in any way involve the capacity of your
> boot/root drives (which I think were expected to be a
> CF card, no?). So you already have at least one such
> drive ;) Even if your current drive is partially
> consumed by the root pool, I think you can sacrifice
> some 20GB on each drive in one 4-disk raidz1 vdev. You
> can mirror the root pool with one of these drives, and
> make a mirrored swap pool on the other couple.

Ok, I am going to have to read through this slowly and fully understand the fake raid scenario. What I am trying to avoid is having multiple raidz's, because every time I have another one I lose a lot of extra space to parity. Much like in raid 5.

> //Jim

And last, thx so very much for spending so much time and effort in transferring knowledge, I really do appreciate it.

-- This message posted from opensolaris.org
Jim Klimov
2009-Jul-09 23:59 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
> Trying to spare myself the expense as this is my home system, so budget is
> a constraint.

> What I am trying to avoid is having multiple raidz's, because every time I
> have another one I lose a lot of extra space to parity. Much like in raid 5.

There's a common perception, which I now tend to share, that "consumer" drives have a somewhat higher rate of unreliability and failure. Some aspects relate to design priorities (i.e. balancing price vs size vs duty cycle), some to conspiracy-theory stuff (force consumers into buying more drives more often). Hand-made computers tend to increase that rate further for any number of reasons (components, connections, thermal issues, power supply issues). I've learned that the hard way while building many home computers and cheap campus servers at my Uni, including 24-drive Linux filers with mdadm and hardware RAID cards :)

Another problem is that larger drives take a lot longer to rebuild (about 4 hours just to write a single drive in your case, with an otherwise idle system) or to resilver with a filled-up array like yours. This is especially a problem in classic RAID setups, where the whole drive is considered failed if anything goes wrong. Quite often some hidden problem turns up on another drive of the array during the rebuild, so the whole array is considered dead, and that chance grows with disk size. That's one of many valid reasons why "enterprise" drives are smaller. Hopefully ZFS contains such failures down to the few blocks that have a checksum mismatch.

Anyway, I'd not be comfortable with large sets of unreliable big drives, even with some redundancy; hence my somewhat arbitrary recommendation of 4-drive raidz1 sets. The industry seems to agree that at most 7-9 drives are reasonable for a single RAID5/6 volume (vdev in the case of ZFS), though.

Since you already have 2 clean 1TB disks, you can buy just 2 more. In the end you'd have one 4 x 1TB raidz1 and two 4 x 1.5TB raidz1 vdevs in a pool, summing up to 3 + (4.5 * 2) = 12TB of usable space in a redundant set. For me personally, that would be worth it. There may however be some discrepancy around the space on the first set (3TB), which amounts to just 2 x 1TB drives freeing up. That can introduce more costly corrections into my calculations (i.e. a 5 x 1TB disk set)...

Concerning the SD/CF card for booting, I have no experience. From what I've seen, you can google for notes on card booting in Eeepc and similar netbooks, and for comments on the making of livecd/liveusb-capable Solaris distros (see some at http://www.opensolaris.org/os/downloads/). You'd probably need to make sure that the BIOS emulates the card as an IDE/SATA hard disk device, and/or bundle the needed drivers into the Solaris miniroot.

> And last, thx so very much for spending so much time and effort in transferring
> knowledge, I really do appreciate it.

You're very welcome. I do hope this helps and you don't lose data in the process, due to my possible mistakes or misconceptions, or otherwise ;)

//Jim

-- This message posted from opensolaris.org
Jim Klimov
2009-Jul-10 00:19 UTC
[zfs-discuss] Migrating 10TB of data from NTFS is there a simple way?
You might also search for OpenSolaris NAS projects. Some that I've seen previously involve nearly the same config you're building - a CF card or USB stick with the OS, and a number of HDDs in a ZFS pool for the data only. I am not certain which ones I've seen, but you can look for EON and PulsarOS...

http://eonstorage.blogspot.com/2008_11_01_archive.html (features page)
http://eonstorage.blogspot.com/2009/05/eon-zfs-nas-0591-based-on-snv114.html
http://code.google.com/p/pulsaros/
http://pulsaros.digitalplayground.at/

Haven't yet tried them, though.

//Jim

-- This message posted from opensolaris.org