Dear list,

Solaris 10 U3 on SPARC. I had a 197GB raidz storage pool. Within that
pool, I had allocated a 191GB zvol (filesystem A) and a 6.75GB zvol
(filesystem B). These used all but a couple hundred K of the zpool. Both
zvols contained UFS filesystems with logging enabled. The (A) filesystem
was about 79% full. (B) was also nearly full, but unmounted and not
being used.

This configuration worked happily for a bit over two months. Then the
other day, a user decided to copy (cp) about 11GB worth of video files
within (A). This caused UFS to choke as such:

Mar 9 17:34:43 maxwell ufs: [ID 702911 kern.warning] WARNING: Error writing master during ufs log roll
Mar 9 17:34:43 maxwell ufs: [ID 127457 kern.warning] WARNING: ufs log for /export/home/engr changed state to Error
Mar 9 17:34:43 maxwell ufs: [ID 616219 kern.warning] WARNING: Please umount(1M) /export/home/engr and run fsck(1M)

I did as the message says: unmounted and attempted to fsck. I was then
bombarded with thousands of errors, BUT fsck could not fix them due to
'no space left on device'. That's right: the filesystem with about 30GB
free didn't have enough free space to fsck. Strange.

After messing with the machine all weekend (rebooting, calling coworkers
and other sysadmins, calling Sun, scratching my head, etc.), the
solution ended up being to _delete the (B) zvol_ (which contained only
junk data). Once that was done, fsck ran all the way through without
problems (besides wiping all my ACLs) and things were happy again.

So I surmised that ZFS ran out of space to do its thing, and for
whatever reason that 'out of space' condition got pushed down into the
zvol as well, causing fsck to choke.

I _have_ been able to reproduce the situation on a test machine, but not
reliably. It basically consists of setting up two zvols that take up
almost all of the pool space, newfs-ing them, filling one up to about
90% full, then looping through copies of half of the remaining space
until it dies. (So for a 36GB pool: create a 34GB zvol and a 2.xxGB
zvol. newfs them. Mount the larger one. Create a 30GB junk file. Create
a directory of, say, 5 files worth about 2GB total. Then do 'while true;
do cp -r dira dirb; done' until it fails. Sometimes it does, sometimes
not.)

Why does this happen? Is it a bug? I know there is a recommendation of
20% free space for good performance, but that thought never occurred to
me when this machine was set up (zvols only, no zfs proper). I think it
is a bug simply because it _allowed_ me to create a configuration that
didn't leave enough room for overhead.

There isn't a whole lot of info surrounding zvols. Does the 80% free
rule still apply to the underlying ZFS pool if only zvols are used? That
would be really unfortunate. I think most people wanting to use a zvol
would want to use 100% of a pool toward the zvol.

-Brian

--
---------------------------------------------------
Brian H. Nelson              Youngstown State University
System Administrator        Media and Academic Computing
             bnelson[at]cis.ysu.edu
---------------------------------------------------
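The reproduction described above boils down to roughly the following
sketch. The pool name, disk devices, and mount point here are
hypothetical, and the sizes follow the 36GB example:

  # Build a raidz pool and carve nearly all of it into two zvols
  zpool create testpool raidz c1t1d0 c1t2d0 c1t3d0
  zfs create -V 34g testpool/vola
  zfs create -V 2g testpool/volb

  # Put logging UFS on both zvols, mount only the larger one
  newfs /dev/zvol/rdsk/testpool/vola
  newfs /dev/zvol/rdsk/testpool/volb
  mount -o logging /dev/zvol/dsk/testpool/vola /mnt

  # Fill to roughly 90%, then churn copies until UFS chokes (or doesn't)
  mkfile 30g /mnt/junk
  # ...seed /mnt/dira with ~2GB of files, then:
  while true; do cp -r /mnt/dira /mnt/dirb; done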
On Tue, Mar 20, 2007 at 06:01:28PM -0400, Brian H. Nelson wrote:
> Why does this happen? Is it a bug? I know there is a recommendation of
> 20% free space for good performance, but that thought never occurred to
> me when this machine was set up (zvols only, no zfs proper).

It sounds like this bug:

6430003 record size needs to affect zvol reservation size on RAID-Z

Adam

--
Adam Leventhal, Solaris Kernel Development       http://blogs.sun.com/ahl
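Checking whether a given zvol is exposed to this bug presumably comes
down to comparing the volume's block size and reservation against what
the pool actually reports free; something like the following, with
hypothetical pool and volume names:

  # A small volblocksize on RAID-Z carries the most parity/padding
  # overhead, which is what the bug is about
  zfs get volblocksize,reservation testpool/vola

  # Compare the reservation against pool-level used/available space
  zfs list -o name,used,avail,reservation testpool testpool/vola
  zpool list testpool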
Hello Adam,

Wednesday, March 21, 2007, 12:42:49 AM, you wrote:

AL> On Tue, Mar 20, 2007 at 06:01:28PM -0400, Brian H. Nelson wrote:
>> Why does this happen? Is it a bug? I know there is a recommendation of
>> 20% free space for good performance, but that thought never occurred to
>> me when this machine was set up (zvols only, no zfs proper).

AL> It sounds like this bug:

AL> 6430003 record size needs to affect zvol reservation size on RAID-Z

AL> Adam

Adam, while you are here, what about gzip compression in ZFS? I mean,
are you going to integrate the changes soon?

--
Best regards,
Robert                       mailto:rmilkowski at task.gda.pl
                             http://milek.blogspot.com
On Wed, Mar 21, 2007 at 01:23:06AM +0100, Robert Milkowski wrote:
> Adam, while you are here, what about gzip compression in ZFS?
> I mean, are you going to integrate the changes soon?

I submitted the RTI today.

Adam

--
Adam Leventhal, Solaris Kernel Development       http://blogs.sun.com/ahl
Hello Adam,

Wednesday, March 21, 2007, 1:24:35 AM, you wrote:

AL> On Wed, Mar 21, 2007 at 01:23:06AM +0100, Robert Milkowski wrote:
>> Adam, while you are here, what about gzip compression in ZFS?
>> I mean, are you going to integrate the changes soon?

AL> I submitted the RTI today.

Great!

btw: I assume that the compression level will be hard coded after all,
right?

--
Best regards,
Robert                       mailto:rmilkowski at task.gda.pl
                             http://milek.blogspot.com
On Wed, Mar 21, 2007 at 01:36:10AM +0100, Robert Milkowski wrote:
> btw: I assume that the compression level will be hard coded after all,
> right?

Nope. You'll be able to choose from gzip-N with N ranging from 1 to 9,
just like gzip(1).

Adam

--
Adam Leventhal, Solaris Kernel Development       http://blogs.sun.com/ahl
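Assuming the integrated change follows the existing compression property
syntax, picking a level would presumably look like this (dataset name
hypothetical):

  zfs set compression=gzip testpool/data      # plain 'gzip' = default level
  zfs set compression=gzip-9 testpool/data    # max compression, most CPU
  zfs get compression,compressratio testpool/data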
Adam Leventhal wrote:
> On Tue, Mar 20, 2007 at 06:01:28PM -0400, Brian H. Nelson wrote:
>
>> Why does this happen? Is it a bug? I know there is a recommendation of
>> 20% free space for good performance, but that thought never occurred to
>> me when this machine was set up (zvols only, no zfs proper).
>
> It sounds like this bug:
>
> 6430003 record size needs to affect zvol reservation size on RAID-Z
>
> Adam

Could be, but 6429996 sounds like a more likely candidate: zvols don't
reserve enough space for requisite meta data.

I can create some large files (2GB) and the 'available' space only
decreases by .01-.04GB for each file. The raidz pool is 7x36GB disks,
with the default 8k volblocksize. Would/should 6430003 affect me? I
don't understand what determines "minimum allocatable size" and the
number of 'skipped' sectors for a given situation.

Either way, my main concern is that I can address the problem so that
the same situation does not reoccur. Are there workarounds for these
bugs? How can I determine how much space needs to be reserved? How much
(if any) of the remaining free space could be used for an additional
zvol (with its own allocation of reserved space)?

Thanks,
Brian

--
---------------------------------------------------
Brian H. Nelson              Youngstown State University
System Administrator        Media and Academic Computing
             bnelson[at]cis.ysu.edu
---------------------------------------------------
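One rough way to watch for the under-reservation suspected above (paths
hypothetical): write large files into the UFS living on the zvol and see
whether the pool-level 'available' number moves. If metadata overhead
were fully covered by the zvol's reservation, it presumably should not:

  mkfile 2g /export/home/engr/big1    # consume 2GB inside the zvol's UFS
  df -k /export/home/engr             # UFS view: ~2GB less free
  zfs list                            # pool view: AVAIL drifting down by
  zpool list                          # 10-40MB per file, per the above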
Can anyone comment?

-Brian

Brian H. Nelson wrote:
> Adam Leventhal wrote:
>> On Tue, Mar 20, 2007 at 06:01:28PM -0400, Brian H. Nelson wrote:
>>
>>> Why does this happen? Is it a bug? I know there is a recommendation of
>>> 20% free space for good performance, but that thought never occurred to
>>> me when this machine was set up (zvols only, no zfs proper).
>>
>> It sounds like this bug:
>>
>> 6430003 record size needs to affect zvol reservation size on RAID-Z
>>
>> Adam
>
> Could be, but 6429996 sounds like a more likely candidate: zvols don't
> reserve enough space for requisite meta data.
>
> I can create some large files (2GB) and the 'available' space only
> decreases by .01-.04GB for each file. The raidz pool is 7x36GB disks,
> with the default 8k volblocksize. Would/should 6430003 affect me? I
> don't understand what determines "minimum allocatable size" and the
> number of 'skipped' sectors for a given situation.
>
> Either way, my main concern is that I can address the problem so that
> the same situation does not reoccur. Are there workarounds for these
> bugs? How can I determine how much space needs to be reserved? How
> much (if any) of the remaining free space could be used for an
> additional zvol (with its own allocation of reserved space)?
>
> Thanks,
> Brian

--
---------------------------------------------------
Brian H. Nelson              Youngstown State University
System Administrator        Media and Academic Computing
             bnelson[at]cis.ysu.edu
---------------------------------------------------