OK, I know that there's been some discussion on this before, but I'm not sure that any specific advice came out of it. What would the advice be for supporting a largish number of users (10,000 say) on a system that supports ZFS? We currently use vxfs and assign a user quota, and backups are done via Legato Networker.

From what little I currently understand, the general advice would seem to be to assign a filesystem to each user, and to set a quota on that. I can see this being OK for small numbers of users (up to 1000 maybe), but I can also see it being a bit tedious for larger numbers than that.

I just tried a quick test on Sol10u2:

    for x in 0 1 2 3 4 5 6 7 8 9; do for y in 0 1 2 3 4 5 6 7 8 9; do
        zfs create testpool/$x$y; zfs set quota=1024k testpool/$x$y
    done; done

[apologies for the formatting - is there any way to preformat text on this forum?]

It ran OK for a minute or so, but then I got a slew of errors:

    cannot mount '/testpool/38': unable to create mountpoint
    filesystem successfully created, but not mounted

So, OOTB there's a limit that I need to raise to support more than approx 40 filesystems (I know that this limit can be raised, I've not checked to see exactly what I need to fix). It does beg the question of why there's a limit like this when ZFS is encouraging use of large numbers of filesystems.

If I have 10,000 filesystems, is the mount time going to be a problem? I tried:

    for x in 0 1 2 3 4 5 6 7 8 9; do for y in 0 1 2 3 4 5 6 7 8 9; do
        zfs umount testpool/001; zfs mount testpool/001
    done; done

This took 12 seconds, which is OK until you scale it up - even if we assume that mount and unmount take the same amount of time, so that 100 mounts take 6 seconds, this means that 10,000 mounts will take 5 minutes. Admittedly, this is on a test system without fantastic performance, but there *will* be a much larger delay on mounting a ZFS pool like this than on a comparable UFS filesystem.

I currently use Legato Networker, which (not unreasonably) backs up each filesystem as a separate session - if I continue to use this I'm going to have 10,000 backup sessions on each tape backup. I'm not sure what kind of challenges restoring this kind of beast will present.

Others have already been through the problems with standard tools such as 'df' becoming less useful.

One alternative is to ditch quotas altogether - but even though "disk is cheap", it's not free, and regular backups take time (and tapes are not free either!). In any case, 10,000 undergraduates really will be able to fill more disks than we can afford to provision. We tried running a Windows fileserver back in the days when it had no support for per-user quotas; we did some ad-hockery that helped to keep track of the worst offenders (albeit after the event), but what really killed us was the uncertainty over whether some idiot would decide to fill all available space with "vital research data" (or junk, depending on your point of view).

I can see the huge benefits that ZFS quotas and reservations can bring, but I can also see that there are situations where ZFS could be useful, but where the lack of 'legacy' user-based quotas makes it impractical. If the ZFS developers really are not going to implement user quotas, is there any advice on what someone like me could do - at the moment I'm presuming that I'll just have to leave ZFS alone.

Thanks in advance

Steve Bennett, Lancaster University
Steve Bennett wrote:

> OK, I know that there's been some discussion on this before, but I'm not sure that any specific advice came out of it. What would the advice be for supporting a largish number of users (10,000 say) on a system that supports ZFS? We currently use vxfs and assign a user quota, and backups are done via Legato Networker.

Using lots of filesystems is definitely encouraged - as long as doing so makes sense in your environment.

> From what little I currently understand, the general advice would seem to be to assign a filesystem to each user, and to set a quota on that. I can see this being OK for small numbers of users (up to 1000 maybe), but I can also see it being a bit tedious for larger numbers than that.
>
> I just tried a quick test on Sol10u2:
>     for x in 0 1 2 3 4 5 6 7 8 9; do for y in 0 1 2 3 4 5 6 7 8 9; do
>         zfs create testpool/$x$y; zfs set quota=1024k testpool/$x$y
>     done; done
> [apologies for the formatting - is there any way to preformat text on this forum?]
> It ran OK for a minute or so, but then I got a slew of errors:
>     cannot mount '/testpool/38': unable to create mountpoint
>     filesystem successfully created, but not mounted
>
> So, OOTB there's a limit that I need to raise to support more than approx 40 filesystems (I know that this limit can be raised, I've not checked to see exactly what I need to fix). It does beg the question of why there's a limit like this when ZFS is encouraging use of large numbers of filesystems.

There is no 40 filesystem limit. You most likely had a pre-existing file/directory in testpool with the same name as the filesystem you tried to create.

    fsh-hake# zfs list
    NAME       USED  AVAIL  REFER  MOUNTPOINT
    testpool    77K  7.81G  24.5K  /testpool
    fsh-hake# echo "hmm" > /testpool/01
    fsh-hake# zfs create testpool/01
    cannot mount 'testpool/01': Not a directory
    filesystem successfully created, but not mounted
    fsh-hake#

> If I have 10,000 filesystems, is the mount time going to be a problem? I tried:
>     for x in 0 1 2 3 4 5 6 7 8 9; do for y in 0 1 2 3 4 5 6 7 8 9; do
>         zfs umount testpool/001; zfs mount testpool/001
>     done; done
> This took 12 seconds, which is OK until you scale it up - even if we assume that mount and unmount take the same amount of time, so that 100 mounts take 6 seconds, this means that 10,000 mounts will take 5 minutes. Admittedly, this is on a test system without fantastic performance, but there *will* be a much larger delay on mounting a ZFS pool like this than on a comparable UFS filesystem.

So this really depends on why and when you're unmounting filesystems. I suspect it won't matter much since you won't be unmounting/remounting your filesystems.

> I currently use Legato Networker, which (not unreasonably) backs up each filesystem as a separate session - if I continue to use this I'm going to have 10,000 backup sessions on each tape backup. I'm not sure what kind of challenges restoring this kind of beast will present.
>
> Others have already been through the problems with standard tools such as 'df' becoming less useful.

Is there a specific problem you had in mind regarding 'df'?

> One alternative is to ditch quotas altogether - but even though "disk is cheap", it's not free, and regular backups take time (and tapes are not free either!). In any case, 10,000 undergraduates really will be able to fill more disks than we can afford to provision. We tried running a Windows fileserver back in the days when it had no support for per-user quotas; we did some ad-hockery that helped to keep track of the worst offenders (albeit after the event), but what really killed us was the uncertainty over whether some idiot would decide to fill all available space with "vital research data" (or junk, depending on your point of view).
>
> I can see the huge benefits that ZFS quotas and reservations can bring, but I can also see that there are situations where ZFS could be useful, but where the lack of 'legacy' user-based quotas makes it impractical. If the ZFS developers really are not going to implement user quotas, is there any advice on what someone like me could do - at the moment I'm presuming that I'll just have to leave ZFS alone.

I wouldn't give up that easily... it looks like 1 filesystem per user, and 1 quota per filesystem, does exactly what you want:

    fsh-hake# zfs get -r -o name,value quota testpool
    NAME           VALUE
    testpool       none
    testpool/ann   10M
    testpool/bob   10M
    testpool/john  10M
    ....
    fsh-hake#

I'm assuming that you decided against 1 filesystem per user due to the supposed 40 filesystem limit, which isn't true.

eric
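For what it's worth, bulk-creating those per-user filesystems is only a few lines of shell. A rough sketch only - it assumes local accounts in /etc/passwd (UIDs 100 and up), the testpool layout from the example above, and a flat 10M quota:

    getent passwd | awk -F: '$3 >= 100 {print $1}' | while read u; do
        zfs create "testpool/$u"            # one filesystem per user
        zfs set quota=10M "testpool/$u"     # same per-user quota as in the listing above
        chown "$u" "/testpool/$u"           # default mountpoint is /testpool/<user>
    done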
On Tue, 2006-06-27 at 23:07, Steve Bennett wrote:
> From what little I currently understand, the general advice would seem to be to assign a filesystem to each user, and to set a quota on that. I can see this being OK for small numbers of users (up to 1000 maybe), but I can also see it being a bit tedious for larger numbers than that.

I've seen this discussed; even recommended. I don't think, though - given that zfs has been available in a supported version of Solaris for about 24 hours or so - that we've yet got to the point of best practice or recommendation.

That said, the idea of one filesystem per user does have its attractions. With zfs - unlike other filesystems - it's feasible. Whether it's sensible is another matter. Still, you could give them a zone each as well...

(One snag is that for undergraduates, there isn't really an intermediate level - department or research grant, for example - that can be used as the allocation unit.)

> I just tried a quick test on Sol10u2:
>     for x in 0 1 2 3 4 5 6 7 8 9; do for y in 0 1 2 3 4 5 6 7 8 9; do
>         zfs create testpool/$x$y; zfs set quota=1024k testpool/$x$y
>     done; done
> [apologies for the formatting - is there any way to preformat text on this forum?]
> It ran OK for a minute or so, but then I got a slew of errors:
>     cannot mount '/testpool/38': unable to create mountpoint
>     filesystem successfully created, but not mounted
>
> So, OOTB there's a limit that I need to raise to support more than approx 40 filesystems (I know that this limit can be raised, I've not checked to see exactly what I need to fix). It does beg the question of why there's a limit like this when ZFS is encouraging use of large numbers of filesystems.

Works fine for me. I've done this up to 16000 or so (not with current bits, that was last year).

> If I have 10,000 filesystems, is the mount time going to be a problem? I tried:
>     for x in 0 1 2 3 4 5 6 7 8 9; do for y in 0 1 2 3 4 5 6 7 8 9; do
>         zfs umount testpool/001; zfs mount testpool/001
>     done; done
> This took 12 seconds, which is OK until you scale it up - even if we assume that mount and unmount take the same amount of time,

It's not quite symmetric; I think umount is a fraction slower (it has to check whether the filesystem is in use, amongst other things), but the estimate is probably accurate enough.

> so that 100 mounts take 6 seconds, this means that 10,000 mounts will take 5 minutes. Admittedly, this is on a test system without fantastic performance, but there *will* be a much larger delay on mounting a ZFS pool like this than on a comparable UFS filesystem.

My test last year got to 16000 filesystems on a 1G server before it went ballistic and all operations took infinitely long. I had clearly run out of physical memory.

5 minutes doesn't sound too bad to me. It's an order of magnitude quicker than it took to initialize ufs quotas before ufs logging was introduced.

> One alternative is to ditch quotas altogether - but even though "disk is cheap", it's not free, and regular backups take time (and tapes are not free either!). In any case, 10,000 undergraduates really will be able to fill more disks than we can afford to provision.

Last year, before my previous employer closed down, we switched off user disk quotas for 20,000 researchers. The world didn't end. The disks didn't fill up. All the work we had to do managing user quotas vanished. The number of calls to the helpdesk to sort out stupid problems due to applications running out of disk space plummeted to zero.

-- 
-Peter Tribble
L.I.S., University of Hertfordshire - http://www.herts.ac.uk/
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
We have over 10000 filesystems under /home in strongspace.com and it works fine. I forget whether it was a bug or an improvement made around nevada build 32 (we're currently at 41), but it made the initial mount on reboot significantly faster. Before that it was around 10-15 minutes. I wonder if that improvement didn't make it into sol10U2?

-Jason

Sent via BlackBerry from Cingular Wireless

-----Original Message-----
From: eric kustarz <eric.kustarz at sun.com>
Date: Tue, 27 Jun 2006 15:55:45
To: Steve Bennett <S.Bennett at lancaster.ac.uk>
Cc: zfs-discuss at opensolaris.org
Subject: Re: [zfs-discuss] Supporting ~10K users on ZFS
jason at joyent.com wrote on 06/27/06 17:17:

> We have over 10000 filesystems under /home in strongspace.com and it works fine.
> I forget whether it was a bug or an improvement made around nevada build 32
> (we're currently at 41), but it made the initial mount on reboot significantly
> faster. Before that it was around 10-15 minutes. I wonder if that improvement
> didn't make it into sol10U2?

That fix (bug 6377670) made it into build 34 and S10_U2.

--
Neil
> There is no 40 filesystem limit. You most likely had a pre-existing
> file/directory in testpool with the same name as the filesystem
> you tried to create.

I'm absolutely sure that I didn't. This was a freshly created pool. Having said that, I recreated the pool just now and tried again and it worked fine. I'll let you know if I manage to repeat the previous problem.

> So this really depends on why and when you're unmounting
> filesystems. I suspect it won't matter much since you
> won't be unmounting/remounting your filesystems.

I was thinking of reboot times, but I've just tried with 1000 filesystems and it seemed to be much quicker than when I mounted them one-by-one. Presumably there's a lot of optimisation that can be done when all filesystems in a pool are mounted simultaneously.

I've noticed another possible issue - each mount consumes about 45KB of memory - not an issue with tens or hundreds of filesystems, but going back to the 10,000 user scenario this would be 450MB of memory. I know that memory is cheap, but it's still a pretty noticeable amount.

> > Others have already been through the problems with standard
> > tools such as 'df' becoming less useful.
>
> Is there a specific problem you had in mind regarding 'df'?

The fact that you get 10,000 lines of output from df certainly makes it less useful. Some awkward users, and we have plenty of them, might complain (possibly with some justification) that they would prefer that other users not be able to see their quota and disk usage.

And I've found another problem. We use NFS, and currently it's pretty straightforward to mount thing:/export/home on another box. With 10,000 filesystems it's not so straightforward - especially since the current structure (which it would be annoying to change) is /export/home/XX/username (where XX is a 2 digit number). The ability to mount a tree of ZFS filesystems in one go would be useful. I know the reasons for not doing this on traditional filesystems - do they apply to ZFS too?

> I wouldn't give up that easily... it looks like 1 filesystem per
> user, and 1 quota per filesystem, does exactly what you want

I'm not giving up! My thought is that ZFS presents a *huge* change, and retaining 'legacy' quotas as an optional mechanism would help to ease people into it by allowing them to change a bit more gradually.

In our case - we have an upgrade of a 10,000 user system scheduled for later this summer - I think the differences are too great. If we were able to start with one filesystem and then slice pieces off it as we gain more confidence we'd probably use zfs. As it is I think we'll try zfs on smaller systems first and maybe think again next summer.

Thanks for your help.

Steve.
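On the 'df' point, the per-filesystem properties at least make it easy to show a user just their own numbers rather than 10,000 lines of output; a small sketch, assuming one filesystem per user named after the login (the dataset name below is illustrative only):

    # what a login script or a "myquota" wrapper might run for the current user
    zfs get -o name,property,value used,quota testpool/$USER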
Hello Steve,

Thursday, June 29, 2006, 5:54:50 PM, you wrote:

SB> I've noticed another possible issue - each mount consumes about 45KB of
SB> memory - not an issue with tens or hundreds of filesystems, but going
SB> back to the 10,000 user scenario this would be 450MB of memory. I know
SB> that memory is cheap, but it's still a pretty noticeable amount.

How did you measure it? (I'm not saying it doesn't take those 45kB - just I haven't checked it myself and I wonder how you checked it.)

SB> The ability to mount a tree of ZFS filesystems in one go would be useful.
SB> I know the reasons for not doing this on traditional filesystems - do they
SB> apply to ZFS too?

I'm not sure, but IIRC there were changes to NFS v4 to allow it - but you should check (search the opensolaris newsgroups).

SB> In our case - we have an upgrade of a 10,000 user system scheduled for
SB> later this summer - I think the differences are too great. If we were
SB> able to start with one filesystem and then slice pieces off it as we
SB> gain more confidence we'd probably use zfs. As it is I think we'll try
SB> zfs on smaller systems first and maybe think again next summer.

You can start with one filesystem and migrate account by account later. Just create a pool named home and put all users in their dirs inside that pool (/home/joe, /home/tom, ...). Now if you want to migrate /home/joe to its own filesystem, all you have to do is (while the user is not logged in): mv /home/joe /home/joe_old; zfs create home/joe; tar ...... you get the idea.

btw: I believe it was discussed here before - it would be great if one could automatically convert a given directory on a zfs filesystem into a zfs filesystem (without actually copying all the data), and vice versa (making a given zfs filesystem a directory).

-- 
Best regards,
Robert                            mailto:rmilkowski at task.gda.pl
                                  http://milek.blogspot.com
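Spelled out, the migration Robert sketches with "tar ......" might look something like this - an illustration only, assuming a pool named home, a user joe who is logged out, and that it is run as root so ownership and permissions are preserved:

    mv /home/joe /home/joe_old                                      # park the old directory
    zfs create home/joe                                             # new filesystem, mounted at /home/joe
    zfs set quota=500M home/joe                                     # example quota
    (cd /home/joe_old && tar cf - .) | (cd /home/joe && tar xpf -)  # copy contents, preserving permissions
    rm -rf /home/joe_old                                            # only after checking the copy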
> I just tried a quick test on Sol10u2:
>     for x in 0 1 2 3 4 5 6 7 8 9; do for y in 0 1 2 3 4 5 6 7 8 9; do
>         zfs create testpool/$x$y; zfs set quota=1024k testpool/$x$y
>     done; done
> [apologies for the formatting - is there any way to preformat text on this forum?]

Remove the quota from the loop, and before the loop do a

    zfs set quota=1024k testpool

This should be more efficient....

Doug
Robert Milkowski wrote:

> Hello Steve,
>
> Thursday, June 29, 2006, 5:54:50 PM, you wrote:
>
> SB> I've noticed another possible issue - each mount consumes about 45KB of
> SB> memory - not an issue with tens or hundreds of filesystems, but going
> SB> back to the 10,000 user scenario this would be 450MB of memory. I know
> SB> that memory is cheap, but it's still a pretty noticeable amount.
>
> How did you measure it? (I'm not saying it doesn't take those 45kB -
> just I haven't checked it myself and I wonder how you checked it.)

Each filesystem holding onto memory (unnecessarily, if no one is using that filesystem) is something we're thinking about changing.

> SB> The ability to mount a tree of ZFS filesystems in one go would be useful.
> SB> I know the reasons for not doing this on traditional filesystems - do they
> SB> apply to ZFS too?
>
> I'm not sure, but IIRC there were changes to NFS v4 to allow it - but
> you should check (search the opensolaris newsgroups).

Right - NFSv4 allows clients to cross filesystem boundaries. Trond just recently added this support to the Linux client (see http://blogs.sun.com/roller/page/erickustarz/20060417). We're getting closer to adding this to the Solaris client (within Sun, we call it mirror mounts).

What about using the automounter?

eric
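For reference, the automounter side can be as simple as the stock wildcard map when the home directories live directly under one exported path; a sketch of /etc/auto_home on the clients, with "homeserver" standing in for the real file server (the two-level /export/home/XX/username layout mentioned earlier is exactly what this does not handle):

    # /etc/auto_home - one mount per user, created on demand
    *    -rw,hard,intr    homeserver:/export/home/&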
> How did you measure it? (I'm not saying it doesn't
> take those 45kB - just I haven't checked it myself
> and I wonder how you checked it).

    ran 'top', looked at 'mem free'
    created 1000 filesystems
    ran 'top' again
    rebooted to be sure
    ran 'top' again

I'm sure I should use something better than top, but it does the job.

I just repeated this and found that I was wrong on usage. 1000 filesystems brought my free memory on a freshly booted system down from 856MB to 620MB. I make that 236KB per filesystem. If that's right, 10,000 mounts would eat 2.4GB of memory.

> > The ability to mount a tree of ZFS filesystems in
> > one go would be useful.
>
> I'm not sure, but IIRC there were changes to NFS v4 to
> allow it - but you should check (search the opensolaris newsgroups).

It looks like it's a proposed feature in NFSv4, but it only seems to run on the 'powerpoint' platform so far...

> You can start with one filesystem and migrate account
> by account later. Just create a pool named home and put
> all users in their dirs inside that pool (/home/joe, /home/tom, ...).

Not if I want to keep usage under quota control I can't...!

> Now if you want to migrate /home/joe to its own filesystem, all you have
> to do is (while the user is not logged in): mv /home/joe /home/joe_old;
> zfs create home/joe; tar ...... you get the idea.

I do, and it's what I'd love to be able to do.

> btw: I believe it was discussed here before - it would
> be great if one could automatically convert a given
> directory on a zfs filesystem into a zfs filesystem

But what would you do if there were hardlinks to that dir from elsewhere? What happens if the contents of the dir before conversion will not fit into the quota that you set on the directory? I'm sure there are other problems too. It's easier to leave things like filesystem conversion to standard utils than to have tools like zfs taking magical actions in the background.

Steve.
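If anyone wants a less indirect measurement than watching 'top', the kernel will break the numbers down itself; run as root before and after creating the filesystems (stock Solaris 10, no other assumptions):

    echo ::memstat | mdb -k    # splits physical memory into kernel, anon, exec/libs, page cache and free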
Eric said:
> Each filesystem holding onto memory (unnecessarily if
> no one is using that filesystem) is something we're thinking
> about changing.

OK - glad to hear that it's already been acknowledged as an issue!

> Right - NFSv4 allows clients to cross filesystem boundaries.
> Trond just recently added this support to the Linux client (see
> http://blogs.sun.com/roller/page/erickustarz/20060417).
> We're getting closer to adding this to the Solaris client (within
> Sun, we call it mirror mounts).

Once it's in Solaris it might be possible to hide this stuff from users by having a container with the storage in it, then mounting that in different containers for users (and maybe for backup too).

> What about using the automounter?

yeah, thought of that, but we put some structure in ages ago to get around the possible problems with thousands of entries in one directory - so we have /export/home/NN/username where NN is a 2 digit number. I don't think there's any way to specify an automount map with multiple levels in it. We could do it by having multiple automount maps, but then it all starts getting messy.

Steve.
Casper.Dik at Sun.COM
2006-Jun-30 09:34 UTC
[zfs-discuss] Re: Re: Supporting ~10K users on ZFS
> yeah, thought of that, but we put some structure in ages ago
> to get around the possible problems with thousands of entries
> in one directory - so we have /export/home/NN/username
> where NN is a 2 digit number.
>
> I don't think there's any way to specify an automount map
> with multiple levels in it.

You can have composite mounts (multiple nested mounts), but that is essentially a single automount entry so it can't be overly long, I believe.

I don't think that having a flat /home space is really an issue, though; it's all memory based so searches will be fast.

If making the map is complicated, you could think about using executable automount maps. They allow you a lot of flexibility if the ordinary map structure fails you. An executable automount map is triggered when a lookup is done for an entry in a directory, and is created by making the auto_xxx file executable.

Casper
Casper said:
> You can have composite mounts (multiple nested mounts)
> but that is essentially a single automount entry so it
> can't be overly long, I believe.

I've seen that in the man page, but I've never managed to find a use for it!

What I'd *like* to be able to do is have a map that amounts to:

    00    -ro \
          /     keck:/export/home/00
          /*    -rw   /export/home/00/&
    01    -ro \
          /     keck:/export/home/01
          /*    -rw   /export/home/01/&
    ...

This doesn't work - I think it's beyond the capabilities of automountd. I don't even think an executable map would help.

I can see that I could do an executable map to preserve the /export/home/NN/username layout on the server, but have /home/username on the client - we were considering this on a different system here (where we're encountering similar problems with a panasas fileserver).

Thanks

Steve.
Casper.Dik at Sun.COM
2006-Jun-30 12:33 UTC
[zfs-discuss] Re: Re: Supporting ~10K users on ZFS
> What I'd *like* to be able to do is have a map that amounts to:
>
>     00    -ro \
>           /     keck:/export/home/00
>           /*    -rw   /export/home/00/&

What is your interest in mounting the 00 and 01 directories? Is there any data there not in the subdirectories?

Currently, I'm using executable maps to create zfs home directories.

Casper
Michael J. Ellis
2006-Jul-03 18:45 UTC
[zfs-discuss] Re: Re: Re: Supporting ~10K users on ZFS
> Currently, I'm using executable maps to create zfs
> home directories.
>
> Casper

Casper, anything you can share with us on that? Sounds interesting.

thanks,

-- MikeE
Casper.Dik at Sun.COM
2006-Jul-03 19:27 UTC
[zfs-discuss] Re: Re: Re: Supporting ~10K users on ZFS
>> Currently, I'm using executable maps to create zfs
>> home directories.
>>
>> Casper
>
> Casper, anything you can share with us on that? Sounds interesting.

It's really very lame:

Add to /etc/auto_home as the last entry:

    +/etc/auto_home_import

And install /etc/auto_home_import as an executable script:

    #!/bin/ksh -p
    #
    # Find home directory; create directories under /export/home
    # with zfs if they do not exist.
    #

    hdir=$(echo ~$1)

    if [[ "$hdir" != /home/* ]]
    then
            # Not a user with a valid home directory.
            exit
    fi

    #
    # At this point we have verified that "$1" is a valid
    # user with a home of the form /home/username.
    #
    h=/export/home/"$1"
    if [ -d "$h" ]
    then
            echo "localhost:$h"
            exit 0
    fi

    /usr/sbin/zfs create "export/home/$1" || exit 1

    cd /etc/skel
    umask 022
    /bin/find . -type f | while read f; do
            f=$(basename "$f")
            # Copy the /etc/skel files, removing the optional "local" prefix.
            t="$h/${f##local}"
            cp "$f" "$t"
            chown "$1" "$t"
    done

    chown "$1" "$h"

    echo "localhost:$h"
    exit 0
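To actually wire that in, remember that the map only works if it is executable; roughly (the service name is the standard Solaris 10 autofs FMRI):

    chmod +x /etc/auto_home_import      # automountd runs executable maps, passing the key as $1
    automount -v                        # refresh autofs from the current maps
    # or: svcadm restart svc:/system/filesystem/autofs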
James Dickens
2006-Jul-03 20:13 UTC
[zfs-discuss] Re: Re: Re: Supporting ~10K users on ZFS
On 7/3/06, Casper.Dik at sun.com <Casper.Dik at sun.com> wrote:
> [...]
>
> /usr/sbin/zfs create "export/home/$1" || exit 1

another way to do this that is quicker, if you are executing this often, is to create a user directory with all the skel files in place, snapshot it, then clone that directory and chown the files.

    zfs snapshot /export/home/skel@skel export/home/$1 ; chown -R /export/home/$1

James Dickens
uadmin.blogspot.com
James Dickens
2006-Jul-03 20:14 UTC
[zfs-discuss] Re: Re: Re: Supporting ~10K users on ZFS
On 7/3/06, James Dickens <jamesd.wi at gmail.com> wrote:
> another way to do this that is quicker, if you are executing this often,
> is to create a user directory with all the skel files in place, snapshot
> it, then clone that directory and chown the files.
>
>     zfs snapshot /export/home/skel@skel export/home/$1 ; chown -R /export/home/$1

oops i guess i need more coffee

    zfs clone /export/home/skel@skel export/home/$1 ; chown -R /export/home/$1

James Dickens
uadmin.blogspot.com
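For completeness, a tidied-up sketch of the clone approach (assuming a skeleton filesystem export/home/skel already populated from /etc/skel; zfs takes the dataset name without a leading slash, and chown needs the new owner as its first argument):

    zfs snapshot export/home/skel@skel                  # one-off: freeze the populated skeleton
    zfs clone export/home/skel@skel export/home/$1      # near-instant per-user filesystem
    chown -R "$1" /export/home/"$1"                     # hand the cloned files to the new user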
Nicholas Senedzuk
2006-Jul-03 22:12 UTC
[zfs-discuss] Re: Re: Re: Supporting ~10K users on ZFS
I am new to zfs and do not understand the reason that you would want to create a separate file system for each home directory. Can someone explain to me why you would want to do this?
James Dickens
2006-Jul-03 22:18 UTC
[zfs-discuss] Re: Re: Re: Supporting ~10K users on ZFS
On 7/3/06, Nicholas Senedzuk <nicholas.senedzuk at gmail.com> wrote:
> I am new to zfs and do not understand the reason that you would want to
> create a separate file system for each home directory. Can someone explain
> to me why you would want to do this?

because in ZFS filesystems are cheap, you can assign a quota or reservation for each user. you can see how much space they are using with df /export/home/username or zfs list export/home/username - no more waiting for du -s to complete. you can make a snapshot of each user's data/filesystem. I'm sure there are more, but another time.

James Dickens
uadmin.blogspot.com
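To make the last two points concrete, with the one-filesystem-per-user layout the thread has been assuming (the dataset names are illustrative):

    zfs get used,quota export/home/joe       # this user's usage and quota, without running du -s
    zfs snapshot export/home/joe@tuesday     # cheap per-user snapshot before a risky change
    zfs rollback export/home/joe@tuesday     # ...and a per-user undo button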
Hi,

did anybody successfully try the option sharenfs=on for a zfs filesystem with 10000 users? On my system (sol10u2), that is not only awfully slow but also does not work smoothly. I ran the following commands:

    zpool create -R /test test c2t600C0FF0000000000988193CD00CE701d0s0
    zfs create test/home
    zfs set sharenfs=on test/home
    for u in `range 0000 9999`; do zfs create test/home/$u; done
    zpool export test
    zpool import -R /test test

The zpool export command required about 30 minutes to finish. And the import command, after it did some silent work for 45 minutes, just reported a lot of error messages:

    ...
    cannot share 'test/home/4643': error reading /etc/dfs/sharetab
    cannot share 'test/home/8181': error reading /etc/dfs/sharetab
    cannot share 'test/home/1219': error reading /etc/dfs/sharetab
    cannot share 'test/home/3900': error reading /etc/dfs/sharetab
    cannot share 'test/home/7768': error reading /etc/dfs/sharetab
    cannot share 'test/home/1314': error reading /etc/dfs/sharetab
    cannot share 'test/home/3420': error reading /etc/dfs/sharetab
    cannot share 'test/home/7786': error reading /etc/dfs/sharetab
    cannot share 'test/home/9707': error reading /etc/dfs/sharetab
    ...

Regards,
Hans
On 7/6/06, H.-J. Schnitzer <schnitzer at rz.rwth-aachen.de> wrote:
> The zpool export command required about 30 minutes to finish.
> And the import command, after it did some silent work for 45 minutes,
> just reported a lot of error messages:
>
> ...
> cannot share 'test/home/4643': error reading /etc/dfs/sharetab

It seems as though these would come about if a memory allocation fails or there is a corrupt line in /etc/dfs/sharetab. Does /var/adm/messages have any messages indicating you were "out of space" (memory) or that / was full?

Mike

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
We have done some work to make this bearable on boot by introduction of the undocumented SHARE_NOINUSE_CHECK environment variable. This disables an expensive check which verifies that the filesystem is not already shared. Since we're doing the initial shares on the system, we can safely disable this check. You may want to try your experiment with this environment variable set, but keep in mind that manual experimentation with this flag set could result in, for example, a subdirectory of a filesystem being shared at the same time as its parent.

To do much more we need to fundamentally rearchitect the way /etc/dfs/dfstab and /etc/dfs/sharetab work. Thankfully, there is already a project to rewrite all of this under the guise of a new 'share manager' command. The assumption is that, in addition to simplifying the administration model, it will also provide much greater scalability, as well as a programmatic method for sharing filesystems from within zfs(1M). You may want to ping nfs-discuss for any current status.

- Eric

-- 
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
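If you want to retry the import with that variable set, something along these lines should do. This is a sketch only - it assumes the variable is honoured by the process doing the sharing, as described above, and it is of course undocumented and unsupported:

    SHARE_NOINUSE_CHECK=1 zpool import -R /test test    # skip the already-shared check during import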
Thank you, setting SHARE_NOINUSE_CHECK indeed speeds up things substantially. However, there seems to be a bug in the NFS part of Solaris 10u2 when so many filesystems are shared. When I run "showmount -e" after the pool has been (successfully) imported, I get an error:

    $ showmount -e
    showmount: RPC: Unable to receive

In /var/svc/log/network-nfs-server:default.log one can see the following messages:

    [ Jul  4 12:10:34 Stopping because process dumped core. ]
    [ Jul  4 12:10:34 Executing stop method ("/lib/svc/method/nfs-server stop 52") ]
    [ Jul  4 12:17:24 Method "stop" exited with status 0 ]
    [ Jul  4 12:17:24 Executing start method ("/lib/svc/method/nfs-server start") ]
    [ Jul  4 12:20:17 Method "start" exited with status 0 ]
    [ Jul  4 12:47:52 Stopping because process dumped core. ]
    [ Jul  4 12:47:52 Executing stop method ("/lib/svc/method/nfs-server stop 108") ]
    [ Jul  4 12:54:43 Method "stop" exited with status 0 ]
    [ Jul  4 12:54:43 Executing start method ("/lib/svc/method/nfs-server start") ]
    [ Jul  4 12:57:33 Method "start" exited with status 0 ]
    [ Jul  4 12:57:43 Stopping because process dumped core. ]
    [ Jul  4 12:57:43 Executing stop method ("/lib/svc/method/nfs-server stop 148") ]
    [ Jul  4 13:04:37 Method "stop" exited with status 0 ]
    [ Jul  4 13:04:37 Executing start method ("/lib/svc/method/nfs-server start") ]
    [ Jul  4 13:07:28 Method "start" exited with status 0 ]
    [ Jul  4 13:08:18 Stopping because process dumped core. ]
    [ Jul  4 13:08:18 Executing stop method ("/lib/svc/method/nfs-server stop 160") ]
    [ Jul  4 13:15:24 Method "stop" exited with status 0 ]
    [ Jul  4 13:15:24 Executing start method ("/lib/svc/method/nfs-server start") ]
    [ Jul  4 13:18:17 Method "start" exited with status 0 ]

As you can see, the system stops and starts the nfs server over and over.

Hans
Michael Schuster - Sun Microsystems
2006-Jul-10 15:27 UTC
[zfs-discuss] Re: Re: Supporting ~10K users on ZFS
You'll also note that there's a line saying "Stopping because process dumped core", which we shouldn't ignore, IMO.

In case this is a Sun-supported config (s10u2 indicates as much), please file a case :-)

regards
Michael Schuster

H.-J. Schnitzer wrote:
> Thank you, setting SHARE_NOINUSE_CHECK indeed speeds up things substantially.
> However, there seems to be a bug in the NFS part of Solaris 10u2 when so many
> filesystems are shared. [...]

-- 
Michael Schuster                              (+49 89) 46008-2974 / x62974
visit the online support center:  http://www.sun.com/osc/
Recursion, n.: see 'Recursion'
My guess is that mountd is blowing its head off and then SMF restarts it and the other NFS services, which include nfsd. So it is time to fix up mountd. A traceback of the mountd core would be helpful in confirming this.

And as was mentioned a few days ago, there is an existing bug that should cover this (and should be fixed in some form for mountd).

Spencer

On Mon, H.-J. Schnitzer wrote:
> Thank you, setting SHARE_NOINUSE_CHECK indeed speeds up things substantially.
> However, there seems to be a bug in the NFS part of Solaris 10u2 when so many
> filesystems are shared. [...]
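Getting that traceback is straightforward once you locate the core file; a sketch (the core path below is a placeholder - coreadm shows where cores actually land on your system):

    coreadm                               # show the core file patterns currently in effect
    pstack /var/core/core.mountd.1234     # user-level stack of the dumped mountd
    # or: mdb /var/core/core.mountd.1234  and then ::status / $C at the prompt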
Casper.Dik at Sun.COM
2006-Jul-10 15:51 UTC
[zfs-discuss] Re: Re: Supporting ~10K users on ZFS
> You'll also note that there's a line saying "Stopping because process dumped
> core", which we shouldn't ignore, IMO.
>
> In case this is a Sun-supported config (s10u2 indicates as much), please file a
> case :-)

This looks like the rpcgen issue where the list is encoded using a recursive rather than iterative scheme. Fixed in Solaris Express but not in Solaris 10.

Guess we need that fix in S10.

Casper
On Thu, Jun 29, 2006 at 08:20:56PM +0200, Robert Milkowski wrote:
> btw: I believe it was discussed here before - it would be great if one
> could automatically convert a given directory on a zfs filesystem into a
> zfs filesystem (without actually copying all the data)

Yep, and an RFE filed:

    6400399 want "zfs split"

> and vice versa (making a given zfs filesystem a directory)

But more filesystems is better! :-)

(and, this would be pretty nontrivial, we'd have to resolve conflicting inode (object) numbers, thus rewriting all metadata).

Back to slogging through old mail archives,
--matt
On Fri, Jun 30, 2006 at 02:12:09AM -0700, Steve Bennett wrote:
> > How did you measure it? (I'm not saying it doesn't
> > take those 45kB - just I haven't checked it myself
> > and I wonder how you checked it).
>
> ran 'top', looked at 'mem free'
> created 1000 filesystems
> ran 'top' again
> rebooted to be sure
> ran 'top' again
>
> I'm sure I should use something better than top, but it does the job.
>
> I just repeated this and found that I was wrong on usage. 1000 filesystems
> brought my free memory on a freshly booted system down from 856MB to 620MB.
> I make that 236KB per filesystem. If that's right, 10,000 mounts would eat
> 2.4GB of memory.

It may be correct that having 1,000 filesystems mounted used up an average of 236k per filesystem on your machine. However, you cannot necessarily extrapolate to more filesystems.

Most of that memory is ZFS cached data, which is evictable. So under memory pressure, we will throw it out and make room for more filesystems to be mounted (or files accessed, apps run, etc).

That said, there is still some minimum amount of memory used, and we're working on reducing it. See bug 6425094, "each mounted filesystem requires too much memory".

--matt
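For anyone who wants to watch that cache directly rather than infer it from 'top', the relevant numbers are exposed as kernel statistics; the first command assumes your build exports the ZFS 'arcstats' kstat:

    kstat -n arcstats           # current and target size of the ZFS cache (ARC)
    echo ::memstat | mdb -k     # overall breakdown of where physical memory has gone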