On Jan 2, 2007, at 11:14, Richard Elling wrote:

> Don't dispense with proper backups or you will be unhappy. One
> of my New Year's resolutions is to campaign against unhappiness.
> So I would encourage you to explore ways to back up such large
> data stores in a timely and economical way.

The Sun StorageTech Availability Suite is supposedly being released as open source (into OpenSolaris) this month, so keeping a duplicate copy of the data on another machine may become easier. From the bottom message in the thread linked below:

> As the Availability Suite Project & Technical Lead, I will take this
> opportunity to say that in January '07, all of the Sun StorageTech
> Availability Suite (AVS) software is going into OpenSolaris!
>
> This will include both the Remote Mirror (SNDR) and Point-in-Time Copy
> (II) software, which runs on OpenSolaris supported hardware platforms of
> SPARC, x86 and x64.

http://www.opensolaris.org/jive/thread.jspa?messageID=78537
I've ploughed through the documentation, but it's kind of vague on some points, and I need to buy some hardware if I'm to test it, so I thought I'd ask first. I'll begin by describing what I want to achieve, and would appreciate it if someone could tell me whether this is possible, or how close I can come.

My current situation: I have a lot of data, currently some 10-12 TB spread over about 50 disks in five file servers, adding on average a new disk every 1-2 months and a new server every year or so. The old paradigm with volumes makes this a major pain in the rear end, as it becomes very difficult to organize data. The large volume also makes backups impractical; as I'm just an ordinary home user, fancy tape robots and such are out of my price range. Most of my data is not compressible, and the size varies greatly, from large files (GB-sized) to huge numbers (millions) of tiny files (just a few kB).

At the moment, all of these are Windows servers, but I plan to switch to some Unix/Linux variant as soon as there are enough benefits (and ZFS sure looks like it could be the juicy bait in that trap). The clients (15 or so) are a mix of Linux, Windows and a bunch of Xboxes (as media players), with the Windows machines gradually being phased out and changed to Linux as fast as I can rewrite my own software (which I can't do without) for Linux.

What I want:

* The entire storage should be visible as one file system, with one logical file structure where volumes and servers are not even visible, as if it were one huge disk. No paths like /root/server1fs/volume1/dir... in other words.

* Software RAID support, even across the network, so I can just add a bunch of parity disks and survive if a few disks crash. To me, it's well worth it to pony up the money for 5-10 extra disks if I know that that many disks can fail before I start to lose data. That would be good enough to dispense with the need for proper backups.

* A RAID that allows me to use differently sized disks without losing lots of disk space. I'm OK if some disk space is lost (i.e. a file is not striped over all disks, somewhat increasing the stripe size and thereby the size of the parity data), but I don't want my 400 GB disks to use only the first 160 GB just because I have a shitload of 160 GB disks.

* Performance does not need to be stellar, but should not be snail-like either. If it's enough to fill a 100 Mbit network cable, I'm perfectly happy; if it can't fill a 10 Mbit one, I'm starting to get worried.

* A file system that handles huge numbers of tiny files somewhat efficiently. Many file systems use a full block even for a tiny file, which causes huge overhead when there are many files.

* Good interoperability with Linux, Windows and Xbox (actually, this is just a question of Samba compliance and as such out of scope for this discussion).

Is this doable? If not, how close can I get, and what is it that I can't get?
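[For concreteness, here is roughly the shape of the single-pool setup being asked about, expressed as ZFS commands. This is only a sketch with made-up device names (c1t0d0 and so on), and whether the disks can sit in different servers is exactly the open question:]

  # one pool, grown a group at a time; each raidz2 group uses same-sized disks
  zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0    # 160 GB disks
  zpool add    tank raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0    # 400 GB disks

  # datasets all share the pool's free space and mount under one hierarchy
  zfs create tank/movies
  zfs create tank/music
  zfs create tank/projects

[All the datasets draw from the same pool of free space, so there is no per-volume juggling; the catch, as the replies below note, is that a pool normally wants all of its disks attached to one host.]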
Anders Troberg wrote:

> What I want:
>
> * Software RAID support, even across the network, so I can just
>   add a bunch of parity disks and survive if a few disks crash.
>   To me, it's well worth it to pony up the money for 5-10 extra
>   disks if I know that that many disks can fail before I start
>   to lose data. That would be good enough to dispense with the
>   need for proper backups.

Don't dispense with proper backups or you will be unhappy. One of my New Year's resolutions is to campaign against unhappiness. So I would encourage you to explore ways to back up such large data stores in a timely and economical way.

Note: if you were using a plain file system like UFS, you would see recommendations to perform backups only while the file system is quiescent. With ZFS this isn't really a problem, and the use of ZFS snapshots makes clean backups of a busy file system easier.
 -- richard
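[A minimal sketch of the snapshot-based backup Richard describes; the pool, dataset and host names are made up, and the target could just as well be a file or a tape:]

  # take an atomic, read-only snapshot of a busy dataset
  zfs snapshot tank/projects@2007-01-02

  # stream it elsewhere; an incremental stream carries only the changes
  zfs send tank/projects@2007-01-02 | ssh backuphost zfs receive backup/projects
  zfs send -i tank/projects@2007-01-01 tank/projects@2007-01-02 | \
      ssh backuphost zfs receive backup/projects

[Because the snapshot is atomic, the stream is consistent even while the file system stays busy.]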
Anders, Have you considered something like the following: http://www.newegg.com/Product/Product.asp?Item=N82E16816133001 I realize you''re having issues sticking more HDD''s internally, this should solve that issue. Running iSCSI volumes is going to get real ugly in a big hurry and I strongly suggest you do NOT go that route. Your best bet (to do things on the cheap) would be to have two servers, one directly connected to the storage, the other with the esata cards installed and waiting. Assuming you can deal with *some* downtime, you simply move the cables from the one head to the other, import your pool, and continue along. This should provide more than enough storage for a while. It''s 5.5 TB per array with 500GB disks, and 6 arrays per server. Technically you could squeeze more arrays per server as well, as I believe you can find Mobo''s with more than 6 pci slots, and I''m pretty sure they also make 8-port esata/sas cards. Finally if you need *real time* you could split the arrays, take two ports to one server, two to the other, and run sun cluster. When one server goes down the other should take over instantly. This is obviously going to cut your storage in half, but if you need real-time you''re going to have to take a hit somewhere. This is actually the route I plan on taking eventually. Anyone else want to comment on the feasibility of it? As for cost, I would think if you ebay all of your old hardware, and wait for some sales on 500GB HDD''s, it should more than get you started on this. --Tim _______________________________________________ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
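[The "move the cables to the standby head" step boils down to an export/import of the pool. A rough sketch, assuming a pool named tank; the -f flag is only needed if the old head died without exporting cleanly:]

  # on the old head, if it is still alive
  zpool export tank

  # move the eSATA cables, then on the standby head
  zpool import tank       # clean hand-over
  zpool import -f tank    # forced, if the old head crashed before exporting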
> Ideally you should add 3-5 disks at a time so you can add raidz
> (RAID-5-like) groups, so the failure of a disk won't cause loss of data.

Actually, I usually add them 8 at a time; it just averages out to one every 1-2 months.

> With ZFS it's easier if you keep all disks on one server; just buy the
> biggest disk box you can and fill it with 8x SATA controllers. Spreading
> the disks over multiple servers means you have to use iSCSI to share the
> disks, which is just an added headache. Plus you will have to pay for
> more network infrastructure, and power extra CPUs and resources that
> aren't needed if all disks are attached to one box.

There are several reasons I keep them in several servers. One is to spread the investment, another is so that I don't have to rely completely on a single machine. I've also found cooling challenging: my two biggest servers currently have 15 disks each and I need 8 fans to keep them cool. I also tend to run out of free PCI slots for the controllers... Anyway, now I have them in several machines and it would be too expensive to rethink that. Power, network infrastructure, physical space and so on are not important issues for me. Power is cheap, I already have ten times the network infrastructure I need, and I have a large room in the basement as a server room.

> Yes, this is possible, but not advisable. ZFS allows you to mount your
> file systems wherever you like, so you won't have to deal with
> /root/server1fs/dir; you can use paths like mydata/january or
> mydata/movies or whatever you need.

My problem is that some of my categories are larger than the file system of an individual server. If I understand it correctly, each server has its own file system; it does not add its storage space to a common pool? At the moment I solve this by moving stuff around to accommodate disk sizes, and by having my own file system which lives on top of the standard file system and gives a virtual view of things organized to my liking, but it's a very awkward way of doing it.

> Yes, it's possible, but it's not a feature of ZFS. You will need to
> share the disks using iSCSI and then put the shared disks in a ZFS pool.

It looks like that might be the path I have to take. I'll have to read up on iSCSI.

> It's best to add similar disks in a raidz group, so if you added 5x
> 160 GB disks in a raidz group you would get 4x 160 GB of data, with one
> drive being used for parity data, protecting you from any drive in that
> group dying.

While I could do that, it would significantly lower the safety margins, as it's enough for two drives in a group to fail to lose it. It would be much nicer to have a huge group with several parity drives. It's also nice if I don't have to find a similar replacement drive when one fails, as they often can't be found at that point.

> Not a problem. ZFS uses variable-sized blocks, anything from 512 bytes
> to 128 kB per block. It is even flexible in raidz configurations, where
> a 512-byte file uses just 1 kB of space: 512 bytes for data and 512
> bytes for parity.

Nice!

> Don't dispense with proper backups or you will be unhappy. One of my
> New Year's resolutions is to campaign against unhappiness. So I would
> encourage you to explore ways to back up such large data stores in a
> timely and economical way.

I know, but there really is no viable way of backing up that amount of data for a home user.
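[To make the quoted advice concrete, a rough sketch of both growing a pool one raidz group at a time and exporting storage from another box over iSCSI. Device and pool names are made up, and the shareiscsi property is an assumption about what the OpenSolaris builds of this era provide:]

  # growing an existing pool one raidz group at a time
  zpool add tank raidz c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0

  # one way to export storage from another box: carve out a zvol there,
  # share it as an iSCSI target, then use it as a vdev on the main host
  zfs create -V 400g otherpool/export1
  zfs set shareiscsi=on otherpool/export1

[Whether building a pool on top of iSCSI-exported volumes is wise is another matter; as Tim notes above, it tends to get ugly.]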
A RAID array that can survive 5-10 disks failing would go a long way, as it would take a lightning strike (which, due to the location of my house, is very unlikely) or a fire to do that much damage, and it would still be within my economical limits. My data also moves around a lot as it gets sorted and renamed, so even incremental backups may grow big.

----

I'll try to sum up the advice and the results as I understand them:

It would be best to put all disks in one machine, but for practical reasons this will probably not happen. The next best bet is to mount the disks remotely using iSCSI.

Regardless of where the disks are, it's best to group them in RAID groups according to size in order not to lose space. This will make the data more vulnerable, as there will be fewer parity blocks for each piece of data.

No need to worry about performance or interoperability.

Correct?

----

As for unequally sized RAID disks, I had an idea I started on for the file system mentioned earlier, but never got around to finishing it. I'll just mention it here; perhaps someone will find it useful.

What I did was to not use all the available disks for each file. Say that I, as an example, had 20 disks and wanted a failsafe that could take four disks failing. I cut the file into fewer stripes than available disks, say 14, then generated the 4 parity stripes. This meant that there were now 18 stripes, which were placed on the 18 disks with the most remaining free space. Of course, each stripe gets a bit bigger, which means that the parity stripes increase a little compared to using all disks for each file, but not much. As the larger disks fill up, some stripes will get placed on the smaller disks, and all disks will fill up.

This flexible striping scheme also allowed me to reduce the number of stripes used for small files. It just doesn't make any sense to stripe a tiny file across many disks, as the seek overhead and block size become the dominant factors. I just striped the file into as many stripes as the block size warranted, then generated my parity stripes from that.

Remember, I built this on top of an ordinary file system, using files for stripes and parity stripes, so it was not a true file system and performance was crap, but the basic principle is sound and should be applicable to a real file system.
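[For what it's worth, the placement rule described above (split a file into d data + p parity stripes, place them on the d+p disks with the most free space, and use fewer data stripes for small files) is easy to sketch. This is a toy illustration in Python with made-up names, not related to how ZFS actually lays out raidz stripes:]

  import heapq

  def plan_stripes(file_size, disk_free, parity=4, max_data=14, block=128 * 1024):
      """Return (data_stripes, parity_stripes, chosen_disks) for one file.

      disk_free: dict mapping disk name -> free bytes, updated in place.
      Small files get fewer data stripes so no stripe is smaller than one block.
      """
      # Use fewer stripes for small files: at least 1, at most max_data.
      data = max(1, min(max_data, file_size // block or 1))
      stripe_size = -(-file_size // data)          # ceiling division

      # Pick the data+parity disks with the most remaining free space.
      chosen = heapq.nlargest(data + parity, disk_free, key=disk_free.get)
      if len(chosen) < data + parity:
          raise ValueError("not enough disks for the requested parity level")

      for d in chosen:                             # account for the space used
          disk_free[d] -= stripe_size
      return data, parity, chosen

  # Example: 20 disks of mixed sizes, one 1 GB file and one 4 kB file.
  disks = {f"disk{i}": (400 if i < 8 else 160) * 10**9 for i in range(20)}
  print(plan_stripes(1 * 10**9, disks))   # 14 data + 4 parity stripes
  print(plan_stripes(4 * 1024, disks))    # 1 data + 4 parity stripes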