Can anyone from Sun comment on the status/priority of bug ID 6761786? Seems like this would be a very high priority bug, but it hasn't been updated since Oct 2008.
Has anyone else with thousands of volume snapshots experienced the hours-long import process?
-- Dave
Dave,
It's logged as an RFE (Request for Enhancement), not as a CR (bug).
The status is 3-Accepted / P1 RFE.
RFEs are generally looked at in a much different way than a CR.
..Remco
_______________________________________________
zfs-discuss mailing list
zfs-discuss at opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
On Thu, Aug 27, 2009 at 3:24 PM, Remco Lengers <remco at lengers.com> wrote:
> It's logged as an RFE (Request for Enhancement), not as a CR (bug).
> The status is 3-Accepted / P1 RFE.
> RFEs are generally looked at in a much different way than a CR.
Seriously? It's considered "works as designed" for a system to take 5+ hours to boot? Wow.
--Tim
I think the value of auto-snapshotting zvols is debatable. At least, there are not many folks who need to do this.
What I'd rather see is a default property of 'auto-snapshot=off' for zvols.
Blake
On Thu, Aug 27, 2009 at 4:29 PM, Tim Cook <tim at cook.ms> wrote:
> Seriously? It's considered "works as designed" for a system to take 5+ hours to boot? Wow.
Tim,
> Seriously? It's considered "works as designed" for a system to take 5+ hours to boot? Wow.
That's not what I am saying... I am merely stating the administrative facts, as they may explain the inactivity on this matter. I am unsure whether it is supposed to be an RFE or became one by mistake.
Regards,
..Remco
Just to make sure we're looking at the same thing:
http://bugs.opensolaris.org/view_bug.do?bug_id=6761786
This is not an issue of auto snapshots. If I have a ZFS server that exports 300 zvols via iSCSI and I have daily snapshots retained for 14 days, that is a total of 4200 snapshots. According to the link/bug report above it will take roughly 5.5 hours to import my pool (even when the pool is operating perfectly fine and is not degraded or faulted).
This is obviously unacceptable to anyone in an HA environment. Hopefully someone close to the issue can clarify.
-- Dave
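Dave's arithmetic above can be checked with a short sketch. The function names are illustrative only, and the per-snapshot figure assumes that import time scales linearly with the number of snapshots, which nothing in this thread establishes; it is just a way to express the 5.5-hour estimate per snapshot.

```python
def total_snapshots(volumes: int, retained_per_volume: int) -> int:
    """Snapshots held at steady state: one per volume per retained day."""
    return volumes * retained_per_volume


def seconds_per_snapshot(import_hours: float, snapshots: int) -> float:
    """Average import cost per snapshot, ASSUMING linear scaling."""
    return import_hours * 3600 / snapshots


if __name__ == "__main__":
    # Figures taken from the message above: 300 zvols, 14 daily snapshots,
    # and the roughly 5.5 hour import time quoted from the bug report.
    count = total_snapshots(volumes=300, retained_per_volume=14)
    cost = seconds_per_snapshot(import_hours=5.5, snapshots=count)
    print(f"{count} snapshots, about {cost:.1f} s each")
```

Under that linearity assumption, the quoted figures work out to several seconds of import time per snapshot.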
Dave,
Yep, that's an RFE (Request For Enhancement); that's how things are reported to engineers to fix inside Sun. If it's an honest-to-goodness CR = bug (though it normally needs a real, support-paying customer to have a problem for it to go from RFE to CR), the "responsible engineer" evaluates it and eventually gets it fixed, or not. When I worked at Sun I logged a lot of RFEs; only a few were accepted as bugs and fixed.
Click on the "new Search" link and look at the type and state menus. It gives you an idea of the states an RFE and a CR go through. It's probably documented somewhere, but I can't find it. Part of the joy of Sun putting out in public something most other vendors would not dream of doing.
Oh, and it doesn't help that both RFEs and CRs are labelled "bug" at http://bugs.opensolaris.org/
So, looking at your RFE:
It tells you which version of Nevada it was reported against (translating this into an OpenSolaris version is easy - NOT!).
Look at "Related Bugs 6612830".
This will tell you the "Responsible Engineer Richard Morris" and when it was fixed: "Release Fixed , solaris_10u6(s10u6_01) (Bug ID:2160894)".
Although, as nothing in life is guaranteed, it looks like another bug, 2160894, has been identified, and that's not yet on bugs.opensolaris.org.
Hope that helps.
Trevor
www.eagle.co.nz
This email is confidential and may be legally privileged. If received in error please destroy and immediately notify us.
For whatever it's worth to have someone post on a list... I would *really* like to see this improved as well. The time it takes to iterate over both thousands of filesystems and thousands of snapshots makes me very cautious about taking advantage of some of the built-in ZFS features in an HA environment.
-- This message posted from opensolaris.org
Dave,
This helps:
http://defect.opensolaris.org/bz/page.cgi?id=fields.html
The most common thing you will see is "Duplicate", as different people find the same problem at different times in different ways, and when they searched the database to see if it was "known" they could not find a bug description that seemed to match their problem. I logged quite a few of these :-)
The other common state is "Incomplete", typically because the submitter has not provided enough info for the evaluator to evaluate it.
Oh, and what other company would allow you to see this data?
http://defect.opensolaris.org/bz/reports.cgi (Old Charts is interesting)
Trevor
Thanks, Trevor. I understand the RFE/CR distinction. What I don't understand is how this is not a bug that should be fixed in all Solaris versions.
The related ID 6612830 says it was fixed in Sol 10 U6, which was a while ago. I am using OpenSolaris, so I would really appreciate confirmation that it has been fixed in OpenSolaris as well. I can't tell from the info in the bugs DB - it seems like it hasn't been fixed in OpenSolaris. If it has, then the status should reflect it as Fixed/Closed in the bug database...
-- Dave
Trevor Pretty wrote:
> Look at "*Related Bugs* 6612830 <http://bugs.opensolaris.org/bugdatabase/view_bug.do;jsessionid=e49afb42be7df0f5f17ec9c2d711?bug_id=6612830>"
Trevor Pretty wrote:
> "*Release Fixed* , solaris_10u6(s10u6_01) (*Bug ID:*2160894 <http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=2160894>) "
> Although as nothing in life is guaranteed it looks like another bug 2160894 has been identified and that's not yet on bugs.opensolaris.org
That isn't actually another bug but an implementation artefact of the multiple-release support in Bugster. Bug numbers beginning with 2* aren't actually real bugs but sub-CRs of the main one.
-- Darren J Moffat
On Aug 28, 2009, at 12:15 AM, Dave wrote:
> Thanks, Trevor. I understand the RFE/CR distinction. What I don't understand is how this is not a bug that should be fixed in all Solaris versions.
In a former life, I worked at Sun to identify things like this that affect availability and lobbied to get them fixed. There are opposing forces at work: the functionality is correct as designed, versus availability folks who think it should go faster. It is difficult to build the case that code changes should be made for availability when other workarounds exist. It will be more fruitful for you to examine the implementation and see if there is a better way to improve the efficiency of your snapshot processes. For example, the case can be made for a secondary data store containing long-term snapshots, which can allow you to further optimize the primary data store for performance and availability.
-- richard
Richard Elling wrote:
> It will be more fruitful for you to examine the implementation and see if there is a better way to improve the efficiencies of your snapshot processes.
> For example, the case can be made for a secondary data store containing long-term snapshots which can allow you to further optimize the primary data store for performance and availability.
This is unfortunate, but it seems this may be the only option if I want to import a pool within a reasonable amount of time. It's very frustrating to know that it can be fixed (as evidenced by the S10U6 fix), but won't be fixed in Nevada/OpenSolaris - or so it seems. It may be filed as an RFE, but in my opinion it is most definitely a bug.
-- Dave
On Fri, 28 Aug 2009, Dave wrote:
> Thanks, Trevor. I understand the RFE/CR distinction. What I don't understand is how this is not a bug that should be fixed in all Solaris versions.
Just to get the terminology right: "CR" means Change Request, and can refer to Defects ("bugs") or RFEs. Defects have higher priority than RFEs, even though sometimes what makes something a defect vs. an RFE can be a bit subjective. But both bugs/defects and RFEs are CRs.
> Oh and it doesn't help both RFEs and CR are labelled "bug" at http://bugs.opensolaris.org/
That's not true. Bugs or Defects are distinct from RFEs. There's a "Type" pulldown on that site that lets you choose. But I would agree with the assertion that it doesn't help to have RFEs labelled with a "Bug ID" number.
Regards,
markm
Hi,
I'd like to have it fixed as well. I'm having the same problem with 20 zvols, which are Windows XP images exported through iSCSI. They are auto-snapshotted every hour/day/month; right now I've got nearly 1500 snapshots, and booting this 4-core Xeon with 8 GB of RAM and 8 disks in four mirrored pairs takes around 15-20 minutes (the last time I booted it was two months ago).
So it is definitely a bug, in my opinion, that such a PC takes 15 minutes to handle 1500 snapshots during boot, when it would take a few seconds at worst to create the same number of snapshots.
On Thu, Aug 27, 2009 at 01:37:11PM -0600, Dave wrote:
> Can anyone from Sun comment on the status/priority of bug ID 6761786? Seems like this would be a very high priority bug, but it hasn't been updated since Oct 2008.
> Has anyone else with thousands of volume snapshots experienced the hours long import process?
It might not be directly ZFS's fault. I tried to reproduce this on FreeBSD and I was able to import a pool with ~2000 ZVOLs and ~10000 ZVOL snapshots in a few minutes. Those were empty ZVOLs and empty snapshots, so keep that in mind. All in all, creating /dev/ entries might be slow in Solaris, and that may be why you experience this behaviour when importing a ZFS pool with many ZVOLs and many ZVOL snapshots (note that every ZVOL snapshot is a device entry in /dev/zvol/, unlike file systems, where snapshots are mounted on .zfs/snapshot/<name> lookup and not at import time).
--
Pawel Jakub Dawidek http://www.wheel.pl
pjd at FreeBSD.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
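A reproduction along the lines Pawel describes could be scripted roughly as follows. This is a sketch under stated assumptions, not his actual method, which the message does not show: the pool name "tank", the volume naming, the 10m size, and the choice of 5 snapshots per volume (to reach ~10000 snapshots from ~2000 ZVOLs) are all hypothetical. The script only builds and prints the command list; it does not run anything.

```python
from typing import List


def build_commands(pool: str, volumes: int, snaps_per_volume: int,
                   size: str = "10m") -> List[str]:
    """Return the zfs/zpool commands for the test, without running them."""
    cmds = []
    for v in range(volumes):
        vol = f"{pool}/vol{v}"
        # -s makes the volume sparse, so empty ZVOLs reserve no space.
        cmds.append(f"zfs create -s -V {size} {vol}")
        for s in range(snaps_per_volume):
            cmds.append(f"zfs snapshot {vol}@snap{s}")
    # Export and re-import; timing the import is the actual measurement.
    cmds.append(f"zpool export {pool}")
    cmds.append(f"zpool import {pool}")
    return cmds


if __name__ == "__main__":
    # Print rather than execute: review the list, then run it as root
    # against a scratch pool only (hypothetical pool name "tank").
    for cmd in build_commands("tank", volumes=2000, snaps_per_volume=5):
        print(cmd)
```

Wrapping the final `zpool import` in `time` on both FreeBSD and Solaris would give a like-for-like comparison, with the caveat Pawel raises: empty ZVOLs and empty snapshots may not behave like a production pool.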
On 08/31/09 19:54, Pawel Jakub Dawidek wrote:
> All in all creating /dev/ entries might be slow in Solaris that's why experience this behaviour when importing ZFS pool with many ZVOLs and many ZVOL snapshots
Indeed, that (devfsadm taking a long time) is probably 6822622 'zpool import with a large number of zvols is very slow'. Alas, the information available on b.o.o is extremely thin.
I ran into this slow import with lots of snapshots of ZVOLs myself some builds ago. A boot would take around 20 minutes. A good way to see if you are suffering from this problem is to temporarily comment out the line '/usr/sbin/zfs volinit' in /lib/svc/method/devices-local. Booting should be much faster then. (I have since disabled automatic snapshots on ZVOLs and my system boots in a reasonable time again.)
Menno
-- Menno Lageman - Sun Microsystems - http://blogs.sun.com/menno
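The diagnostic Menno suggests amounts to prefixing one line of the method script with a comment marker. A minimal sketch of that text edit is below; the helper name is made up, and the sample script body is a hypothetical stand-in, since the real contents of /lib/svc/method/devices-local are not shown in this thread. The function works on a string only and never touches the file itself.

```python
def comment_out(script_text: str,
                needle: str = "/usr/sbin/zfs volinit") -> str:
    """Prefix every not-yet-commented line containing `needle` with '#'."""
    out = []
    for line in script_text.splitlines(keepends=True):
        # Skip lines that are already comments so the edit is repeatable.
        if needle in line and not line.lstrip().startswith("#"):
            line = "#" + line
        out.append(line)
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical stand-in for the script body, for demonstration only.
    sample = "echo before\n/usr/sbin/zfs volinit\necho after\n"
    print(comment_out(sample), end="")
```

Since this is meant as a temporary test, keeping a copy of the original script and restoring it afterwards seems prudent.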