Juergen Nickelsen
2010-Jan-05 21:53 UTC
[zfs-discuss] (Practical) limit on the number of snapshots?
Is there any limit on the number of snapshots in a file system?

The documentation -- manual page, admin guide, troubleshooting guide -- does not mention any. That seems to confirm my assumption that there is probably no fixed limit, but there may still be a practical one, just as there is no limit on the number of file systems in a pool, yet nobody would find having a million file systems practical.

I have tried creating snapshots in a file system for a few hours. An otherwise unloaded X4250 with a nearly empty RAID-Z2 pool of six built-in disks (146 GB, 10K rpm) managed to create a few snapshots per second in an empty file system.

It had not visibly slowed down when it reached 36051 snapshots after hours and I stopped it; to my surprise, destroying the file system (with all these snapshots in it) took about as long. With "iostat -xn 1" I could see that disk utilization was still low, at about 13% IIRC.

So 36000 snapshots in an empty file system is not a problem. Is it different with a file system that is, say, 70% full? Or on a bigger pool? Or with a significantly larger number of snapshots, say, a million? I am asking for real experience here, not for theory.

Regards, Juergen.
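For reference, this kind of stress test is little more than a loop around "zfs snapshot". A minimal sketch follows, assuming a hypothetical pool "tank" and file system "tank/snaptest"; the loop count is arbitrary:

    #!/bin/sh
    # Hypothetical reproduction of the snapshot-creation stress test.
    # Pool and file system names are made up; adjust counts to taste.
    zfs create tank/snaptest

    i=1
    while [ "$i" -le 36000 ]; do
        zfs snapshot "tank/snaptest@stress_$i"
        i=$((i + 1))
    done

    # Count what was created, then time the cleanup.
    zfs list -t snapshot -r tank/snaptest | wc -l
    time zfs destroy -r tank/snaptest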
Ian Collins
2010-Jan-06 01:17 UTC
[zfs-discuss] (Practical) limit on the number of snapshots?
Juergen Nickelsen wrote:
> So 36000 snapshots in an empty file system is not a problem. Is it
> different with a file system that is, say, 70% full? Or on a bigger
> pool? Or with a significantly larger number of snapshots, say, a
> million? I am asking for real experience here, not for theory.

The most I ever had was about 240000 on a 2TB pool (~1000 filesystems x 60 days x 4 per day). There wasn't any noticeable performance impact, except when I built a tree of snapshots (via libzfs) to work out which ones had to be replicated. Deleting 50 days' worth of them took a very long time!

-- 
Ian.
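Ian's tree-building step went through libzfs; a rough shell equivalent of the same idea -- comparing the snapshot names on the source and the replica to find what still needs sending -- might look like the sketch below. The dataset names ("tank/data", "backup/data") are made up, and sorting names through comm(1) merely stands in for the libzfs tree walk he describes:

    #!/bin/sh
    # Hypothetical sketch: find snapshots of tank/data that do not yet
    # exist on the replica dataset backup/data.
    zfs list -H -o name -t snapshot -r tank/data   | sed 's|.*@||' | sort > /tmp/src.list
    zfs list -H -o name -t snapshot -r backup/data | sed 's|.*@||' | sort > /tmp/dst.list

    # Snapshot names present on the source but missing on the target.
    comm -23 /tmp/src.list /tmp/dst.list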
Lutz Schumann
2010-Jan-06 11:03 UTC
[zfs-discuss] (Practical) limit on the number of snapshots?
Snapshots do not impact write performance. Deletion of snapshots also seems to scale linearly (time taken = number of snapshots x some time per snapshot).

However, see http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6761786. When importing a pool with many snapshots (which also happens during reboot), the import may take a long time (example: 10000 snapshots ~ 1-2 days).

I've not tested the new release of Solaris (svn_125++) which fixes this issue, so a test with osol 125++ would be nice :)
Robert Milkowski
2010-Jan-07 00:54 UTC
[zfs-discuss] (Practical) limit on the number of snapshots?
On 06/01/2010 11:03, Lutz Schumann wrote:
> Snapshots do not impact write performance. Deletion of snapshots also
> seems to scale linearly (time taken = number of snapshots x some time
> per snapshot).

That's not entirely true. By having a snapshot you are not releasing the space, forcing zfs to allocate new space from other parts of the disk drive. This may lead (depending on workload) to more fragmentation and less localized data (more and longer seeks).

And the time it takes to delete a snapshot depends mostly on how many blocks need to be freed.

-- 
Robert Milkowski
http://milek.blogspot.com
Juergen Nickelsen
2010-Jan-08 07:21 UTC
[zfs-discuss] (Practical) limit on the number of snapshots?
Lutz Schumann <presales at storageconcepts.de> writes:
> When importing a pool with many snapshots (which also happens during
> reboot), the import may take a long time (example: 10000 snapshots
> ~ 1-2 days).
>
> I've not tested the new release of Solaris (svn_125++) which fixes
> this issue, so a test with osol 125++ would be nice :)

That is indeed significant. I do not know which software version the storage platform for our customers runs on, but this is something to look out for. Thanks to you and the others for the answers!

-- 
Hello, IT... Have you tried turning it off and on again?
                                            -- "The IT Crowd"
Peter van Gemert
2010-Jan-08 12:40 UTC
[zfs-discuss] (Practical) limit on the number of snapshots?
> By having a snapshot you are not releasing the space, forcing zfs to
> allocate new space from other parts of the disk drive. This may lead
> (depending on workload) to more fragmentation and less localized data
> (more and longer seeks).

ZFS uses COW (copy on write) during writes. This means that it first has to find a new location for the data, and once that data is written, the original block is released. When using snapshots, the original block is not released.

I don't think the use of snapshots will alter the way data is fragmented or localized on disk.

---
PeterVG
Robert Milkowski
2010-Jan-08 13:51 UTC
[zfs-discuss] (Practical) limit on the number of snapshots?
On 08/01/2010 12:40, Peter van Gemert wrote:
> ZFS uses COW (copy on write) during writes. This means that it first
> has to find a new location for the data, and once that data is
> written, the original block is released. When using snapshots, the
> original block is not released.
>
> I don't think the use of snapshots will alter the way data is
> fragmented or localized on disk.

Well, it will (depending on workload).

For example - let's say you have an 80GB disk drive as a pool with a single db file which is 1GB in size. Now no snapshots are created and you are constantly modifying logical blocks in the file. ZFS will release each old block and re-use it later on, so all current data should stay roughly within the first 2GB of the disk drive and therefore be highly localized.

Now if you create a snapshot while modifying data, then another one and another one, you end up in a situation where free blocks are available further and further into the disk drive. By the time you have almost filled the disk drive, even if you then delete all snapshots, your active data will be scattered all over the disk (assuming you were not modifying 100% of the data between snapshots). It won't be highly localized anymore.

-- 
Robert Milkowski
http://milek.blogspot.com
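Robert's scenario can be approximated on a throwaway pool. The sketch below only reproduces the write pattern he describes -- the pool name "scratch", the file size and the iteration count are all made up, and inspecting the resulting on-disk layout would need something like zdb:

    #!/bin/sh
    # Rough illustration (hypothetical names/sizes): rewrite blocks of a
    # 1GB file in place while snapshots pin the old copies, so new writes
    # have to be allocated further and further into the pool.
    zfs create scratch/dbtest
    dd if=/dev/urandom of=/scratch/dbtest/db.img bs=1024k count=1024

    i=1
    while [ "$i" -le 100 ]; do
        # Overwrite a rotating 1MB region without truncating the file.
        off=$((i % 1024))
        dd if=/dev/urandom of=/scratch/dbtest/db.img bs=1024k count=1 \
           seek="$off" conv=notrunc 2>/dev/null
        # Every 10th pass take a snapshot, which keeps the old blocks referenced.
        [ $((i % 10)) -eq 0 ] && zfs snapshot "scratch/dbtest@frag_$i"
        i=$((i + 1))
    done

    zpool list scratch   # allocated space grows because snapshots pin old blocks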
David Dyer-Bennet
2010-Jan-08 14:50 UTC
[zfs-discuss] (Practical) limit on the number of snapshots?
On Fri, January 8, 2010 07:51, Robert Milkowski wrote:
> Well, it will (depending on workload).
>
> For example - let's say you have an 80GB disk drive as a pool with a
> single db file which is 1GB in size. Now no snapshots are created and
> you are constantly modifying logical blocks in the file. ZFS will
> release each old block and re-use it later on, so all current data
> should stay roughly within the first 2GB of the disk drive and
> therefore be highly localized.

I thought block re-use was delayed to allow for TXG rollback, though? They'll certainly get reused eventually, but I think they get reused later rather than sooner.

-- 
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
Robert Milkowski
2010-Jan-08 15:20 UTC
[zfs-discuss] (Practical) limit on the number of snapshots?
On 08/01/2010 14:50, David Dyer-Bennet wrote:
> I thought block re-use was delayed to allow for TXG rollback, though?
> They'll certainly get reused eventually, but I think they get reused
> later rather than sooner.

Yes, there is a delay, but IIRC it is only several transactions, while the scenario above in practice usually means a snapshot a day, keeping 30 of them.

-- 
Robert Milkowski
http://milek.blogspot.com
Bob Friesenhahn
2010-Jan-08 15:46 UTC
[zfs-discuss] (Practical) limit on the number of snapshots?
On Fri, 8 Jan 2010, Peter van Gemert wrote:
> I don't think the use of snapshots will alter the way data is
> fragmented or localized on disk.

What happens after a snapshot is deleted?

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
Lutz Schumann
2010-Jan-11 18:00 UTC
[zfs-discuss] (Practical) limit on the number of snapshots?
Ok, tested this myself ... (same hardware used for both tests).

OpenSolaris svn_104 (actually Nexenta Core 2):

1) 100 snaps
------------
root@nexenta:/volumes# time for i in $(seq 1 100); do zfs snapshot ssd/vol@test1_$i; done

real    0m24.991s
user    0m0.297s
sys     0m0.679s

Import:
root@nexenta:/volumes# time zpool import ssd

real    0m25.053s
user    0m0.031s
sys     0m0.216s

2) 500 snaps (400 created, 500 imported)
----------------------------------------
root@nexenta:/volumes# time for i in $(seq 101 500); do zfs snapshot ssd/vol@test1_$i; done

real    3m6.257s
user    0m1.190s
sys     0m2.896s

root@nexenta:/volumes# time zpool import ssd

real    3m59.206s
user    0m0.091s
sys     0m0.956s

3) 1500 snaps (1000 created, 1500 imported)
-------------------------------------------
root@nexenta:/volumes# time for i in $(seq 501 1500); do zfs snapshot ssd/vol@test1_$i; done

real    22m23.206s
user    0m3.041s
sys     0m8.785s

root@nexenta:/volumes# time zpool import ssd

real    36m26.765s
user    0m0.233s
sys     0m4.545s

.... you see where this goes - it's exponential!!

Now with svn_130 (same pool, still 1500 snaps on it) ... now we are booting OpenSolaris svn_130.

Sun Microsystems Inc.   SunOS 5.11      snv_130 November 2008

rah@osol_dev130:~# zpool import
  pool: ssd
    id: 16128137881522033167
 state: ONLINE
status: The pool is formatted using an older on-disk version.
action: The pool can be imported using its name or numeric identifier, though
        some features will not be available without an explicit 'zpool upgrade'.
config:

        ssd         ONLINE
          c9d1      ONLINE

rah@osol_dev130:~# time zpool import ssd

real    0m0.756s
user    0m0.014s
sys     0m0.056s

rah@osol_dev130:~# zfs list -t snapshot | wc -l
    1502

rah@osol_dev130:~# time zpool export ssd

real    0m0.425s
user    0m0.003s
sys     0m0.029s

I like this one :)

... just for fun ... (5K snaps)

rah@osol_dev130:~# time for i in $(seq 1501 5000); do zfs snapshot ssd/vol@test1_$i; done

real    1m18.977s
user    0m9.889s
sys     0m19.969s

rah@osol_dev130:~# zpool export ssd
rah@osol_dev130:~# time zpool import ssd

real    0m0.421s
user    0m0.014s
sys     0m0.055s

... just for fun ... (10K snaps)

rah@osol_dev130:~# time for i in $(seq 5001 10000); do zfs snapshot ssd/vol@test1_$i; done

real    2m6.242s
user    0m14.107s
sys     0m28.573s

rah@osol_dev130:~# time zpool import ssd

real    0m0.405s
user    0m0.014s
sys     0m0.057s

Very nice, so volume import is solved.

.. however ... a lot of snaps still have an impact on system performance. After the import of the 10000-snap volume, I saw "devfsadm" eating up all CPU:

load averages:  5.00,  3.32,  1.58;            up 0+00:18:12        18:50:05
99 processes: 95 sleeping, 2 running, 2 on cpu
CPU states:  0.0% idle,  4.6% user, 95.4% kernel,  0.0% iowait,  0.0% swap
Kernel: 409 ctxsw, 14 trap, 47665 intr, 1223 syscall
Memory: 8190M phys mem, 5285M free mem, 4087M total swap, 4087M free swap

   PID USERNAME NLWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
   167 root        6  22    0   25M   13M run      3:14 49.41% devfsadm

...
A truss showed that it is the device node allocation eating up the CPU:

/5:  0.0010 xstat(2, "/devices/pseudo/zfs@0:8941,raw", 0xFE32FCE0)        = 0
/5:  0.0005 fcntl(7, F_SETLK, 0xFE32FED0)                                 = 0
/5:  0.0000 close(7)                                                      = 0
/5:  0.0001 lwp_unpark(3)                                                 = 0
/3:  0.0200 lwp_park(0xFE61EF58, 0)                                       = 0
/3:  0.0000 time()                                                        = 1263232337
/5:  0.0001 open("/etc/dev/.devfsadm_dev.lock", O_RDWR|O_CREAT, 0644)     = 7
/5:  0.0001 fcntl(7, F_SETLK, 0xFE32FEF0)                                 = 0
/5:  0.0000 read(7, "A7\0\0\0", 4)                                        = 4
/5:  0.0001 getpid()                                                      = 167 [1]
/5:  0.0000 getpid()                                                      = 167 [1]
/5:  0.0001 open("/devices/pseudo/devinfo@0:devinfo", O_RDONLY)           = 10
/5:  0.0000 ioctl(10, DINFOIDENT, 0x00000000)                             = 57311
/5:  0.0138 ioctl(10, 0xDF06, 0xFE32FA60)                                 = 2258109
/5:  0.0027 ioctl(10, DINFOUSRLD, 0x086CD000)                             = 2260992
/5:  0.0001 close(10)                                                     = 0
/5:  0.0015 modctl(MODGETNAME, 0xFE32F060, 0x00000401, 0xFE32F05C, 0xFD1E0008) = 0
/5:  0.0010 xstat(2, "/devices/pseudo/zfs@0:8941", 0xFE32FCE0)            = 0
/5:  0.0005 fcntl(7, F_SETLK, 0xFE32FED0)                                 = 0
/5:  0.0000 close(7)                                                      = 0
/5:  0.0001 lwp_unpark(3)                                                 = 0
/3:  0.0201 lwp_park(0xFE61EF58, 0)                                       = 0
/3:  0.0001 time()                                                        = 1263232337
/5:  0.0001 open("/etc/dev/.devfsadm_dev.lock", O_RDWR|O_CREAT, 0644)     = 7
/5:  0.0001 fcntl(7, F_SETLK, 0xFE32FEF0)                                 = 0
/5:  0.0000 read(7, "A7\0\0\0", 4)                                        = 4
/5:  0.0001 getpid()                                                      = 167 [1]
/5:  0.0000 getpid()                                                      = 167 [1]
/5:  0.0001 open("/devices/pseudo/devinfo@0:devinfo", O_RDONLY)           = 10
/5:  0.0000 ioctl(10, DINFOIDENT, 0x00000000)                             = 57311
/5:  0.0138 ioctl(10, 0xDF06, 0xFE32FA60)                                 = 2258109
/5:  0.0027 ioctl(10, DINFOUSRLD, 0x086CD000)                             = 2260992
/5:  0.0001 close(10)                                                     = 0
/5:  0.0015 modctl(MODGETNAME, 0xFE32F060, 0x00000401, 0xFE32F05C, 0xFD1E0008) = 0

After 5 minutes all devices were created.

.. another strange issue:
rah@osol_dev130:/dev/zvol/dsk/ssd# ls -al

load averages:  1.16,  2.17,  1.50;            up 0+00:22:07        18:54:00
99 processes: 97 sleeping, 2 on cpu
CPU states: 49.1% idle,  0.1% user, 50.8% kernel,  0.0% iowait,  0.0% swap
Kernel: 257 ctxsw, 1 trap, 607 intr, 282 syscall
Memory: 8190M phys mem, 5280M free mem, 4087M total swap, 4087M free swap

   PID USERNAME NLWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
   860 rah         2  59    0   41M   14M sleep    0:01  0.05% gnome-netstatus
   866 root        1  59    0 1656K 1080K sleep    0:00  0.02% gnome-netstatus

.. it seems to be an issue related to /dev ...

.. so having 10000 snaps of a single zvol is not nice :)

Robert
Richard Elling
2010-Jan-11 19:48 UTC
[zfs-discuss] (Practical) limit on the number of snapshots?
comment below...

On Jan 11, 2010, at 10:00 AM, Lutz Schumann wrote:
> Ok, tested this myself ... (same hardware used for both tests).
>
> [snip: timing results for snapshot creation and pool import on
> svn_104 and svn_130]
>
> Very nice, so volume import is solved.

cool

> .. however ... a lot of snaps still have an impact on system
> performance. After the import of the 10000-snap volume, I saw
> "devfsadm" eating up all CPU:

If you are snapshotting ZFS volumes, then each will create an entry in the device tree. In other words, if these were file systems instead of volumes, you would not see devfsadm so busy.

> .. another strange issue:
> rah@osol_dev130:/dev/zvol/dsk/ssd# ls -al
> [...]
> .. it seems to be an issue related to /dev ...

I don't see the issue, could you elaborate?

> .. so having 10000 snaps of a single zvol is not nice :)

AIUI, devfsadm creates a database. Could you try the last experiment again, since the database should now be updated?

CR 6903071 seems to offer some insight in the comments related to the device tree and COMSTAR, though it is closed as a dupe (perhaps unfortunate?)
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6903071

There may be some improvement in devfsadm that would help.
-- richard
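One way to see the device-tree effect Richard describes is to compare the snapshot count of the zvol with the number of device nodes devfsadm maintains for it. The paths below are a guess based on the pool/zvol naming in Lutz's test, not output from his machine:

    # Hypothetical check: each snapshot of a zvol gets its own device
    # node, so the node count should track the snapshot count.
    zfs list -H -t snapshot -r ssd/vol | wc -l
    ls /dev/zvol/dsk/ssd  | wc -l
    ls /dev/zvol/rdsk/ssd | wc -l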
Damon Atkins
2010-Jan-12 02:20 UTC
[zfs-discuss] (Practical) limit on the number of snapshots?
One thing which may help: zpool import used to be single-threaded, i.e. it opened and processed every disk (or slice) one at a time; as of build 128b it is multi-threaded, i.e. it opens and processes N disks/slices at once, where N is the number of threads it decides to use.
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6844191

This most likely (or maybe) caused other parts of the process to become multi-threaded as well.

It would be nice to no longer need /etc/zfs/zpool.cache now that pool import is fast enough (which is a second reason I lodged the bug).
Lutz Schumann
2010-Jan-12 05:38 UTC
[zfs-discuss] (Practical) limit on the number of snapshots?
Since you mention the fixes / bugs, I have a more general question.

Is there a way to see all commits to OSOL that are related to a bug report?

Background: I'm interested in how e.g. the zfs import bug was fixed.
Lutz Schumann
2010-Jan-12 05:45 UTC
[zfs-discuss] (Practical) limit on the number of snapshots?
> > .. however ... a lot of snaps still have an impact on system
> > performance. After the import of the 10000-snap volume, I saw
> > "devfsadm" eating up all CPU:
>
> If you are snapshotting ZFS volumes, then each will create an entry
> in the device tree. In other words, if these were file systems
> instead of volumes, you would not see devfsadm so busy.

Ok, nice to know that. For our use case we are focused on zvols (COMSTAR iSCSI to virtualized hosts). And still it works fine for a reasonable number of zvols with a sensible backup/snapshot cycle (~ 12 (5 min) + 24 (hourly) + 7 (daily) + 4 (weekly) + 12 (monthly) + 1 for each year -> ~ 60 snaps for each zvol).

> > .. another strange issue:
> > rah@osol_dev130:/dev/zvol/dsk/ssd# ls -al
> > [...]
>
> I don't see the issue, could you elaborate?

How? (I know how to "truss", but am not so familiar with debugging at the kernel level :)

> > .. so having 10000 snaps of a single zvol is not nice :)
>
> AIUI, devfsadm creates a database. Could you try the last experiment
> again, since the database should now be updated?

Will try, but have no access to test equipment right now. Thanks for the feedback.
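A cycle like this is typically driven from cron, with one create-and-prune pass per tier. A minimal sketch of one such pass is below; the dataset name, the "frequent_" prefix and the keep count are all hypothetical:

    #!/bin/sh
    # Hypothetical pruning pass for one tier: keep the newest 12
    # "frequent" snapshots of tank/vm01 and destroy anything older.
    DATASET=tank/vm01
    PREFIX=frequent_
    KEEP=12

    # Take the new snapshot with a sortable timestamp in its name.
    zfs snapshot "${DATASET}@${PREFIX}$(date +%Y%m%d-%H%M%S)"

    # List this tier newest-first and destroy everything past KEEP.
    zfs list -H -o name -t snapshot -r "$DATASET" \
      | grep "@${PREFIX}" | sort -r | tail -n +$((KEEP + 1)) \
      | while read snap; do
            zfs destroy "$snap"
        done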
Lutz,

On Mon, Jan 11, 2010 at 09:38:16PM -0800, Lutz Schumann wrote:
> Since you mention the fixes / bugs, I have a more general question.
>
> Is there a way to see all commits to OSOL that are related to a bug
> report?

You can go to src.opensolaris.org, enter the bug-id in the history field, select the ON gate, and search. That should list all the files that were modified by the fix for that bug.

Now for each file you can go to the history and get a diff between the version where the fix was integrated and the previous version.

Hope that helps.

Regards,
Sanjeev
--
----------------
Sanjeev Bagewadi
Solaris RPE
Bangalore, India