Hi guys, when ZFS starts it ends up hanging the system.

We have a raidz over 5 x 2TB disks with 5 ZFS filesystems. (The root filesystem is on separate disks.)

# uname -a
FreeBSD fortinbras.XXX 8.2-STABLE FreeBSD 8.2-STABLE #0: Wed Oct 19 09:20:04 SAST 2011 root at storage.XXX:/usr/obj/usr/src/sys/GENERIC amd64

ZFS filesystem version 5
ZFS storage pool version 28

The setup was working great until we decided to try out deduplication. After testing it on a few files I set deduplication on one of the filesystems and moved around 1.5TB of data into it. There was basically no space saving, just a big performance hit, so we decided to take it off. While deleting a directory on this filesystem in preparation, the system hung. After an abnormally long half-hour bootup everything was fine, and a scrub was clean. We then decided to rather just destroy this filesystem, as it happened to have disposable data on it and would be 100 times quicker(?). I ran the zfs destroy command, which sat there for about 40 hours while the free space on the pool gradually increased, at which point the system hung again. Rebooting took hours, stuck at ZFS initialisation, after which the console returned:

pid 37 (zfs), uid 0, was killed: out of swap space
pid 38 (sh), uid 0, was killed: out of swap space

This is now the state I'm stuck in. I can naturally boot up without ZFS, but once I start it manually the disks in the pool start flashing for an hour or two and the system hangs before they finish doing whatever they're doing.

The system has 6GB of RAM and a 10GB swap partition. I added a 30GB swap file but this hasn't helped.

# sysctl hw.physmem
hw.physmem: 6363394048

# sysctl vfs.zfs.arc_max
vfs.zfs.arc_max: 5045088256

(I lowered arc_max to 1GB but it hasn't helped.)

I've included the output when starting ZFS after setting vfs.zfs.debug=1 at the bottom. There's no more ZFS output for the next few hours while the disks are flashing and the system is responsive.

A series of top outputs after starting ZFS:

last pid: 1536; load averages: 0.00, 0.02, 0.08
21 processes: 1 running, 20 sleeping
CPU: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
Mem: 12M Active, 7344K Inact, 138M Wired, 44K Cache, 10M Buf, 5678M Free
Swap: 39G Total, 39G Free

last pid: 1567; load averages: 0.13, 0.05, 0.08
25 processes: 1 running, 24 sleeping
CPU: 0.0% user, 0.0% nice, 2.3% system, 2.1% interrupt, 95.6% idle
Mem: 14M Active, 7880K Inact, 328M Wired, 44K Cache, 13M Buf, 5485M Free
Swap: 39G Total, 39G Free

last pid: 1632; load averages: 0.06, 0.04, 0.06
25 processes: 1 running, 24 sleeping
CPU: 0.0% user, 0.0% nice, 0.5% system, 0.1% interrupt, 99.4% idle
Mem: 14M Active, 8040K Inact, 2421M Wired, 40K Cache, 13M Buf, 3392M Free
Swap: 39G Total, 39G Free

last pid: 1693; load averages: 0.11, 0.10, 0.08
25 processes: 1 running, 24 sleeping
CPU: 0.0% user, 0.0% nice, 0.3% system, 0.1% interrupt, 99.5% idle
Mem: 14M Active, 8220K Inact, 4263M Wired, 40K Cache, 13M Buf, 1550M Free
Swap: 39G Total, 39G Free

last pid: 1767; load averages: 0.00, 0.00, 0.00
25 processes: 1 running, 24 sleeping
CPU: 0.0% user, 0.0% nice, 27.6% system, 0.0% interrupt, 72.4% idle
Mem: 14M Active, 8212K Inact, 4380M Wired, 40K Cache, 13M Buf, 1433M Free
Swap: 39G Total, 39G Free

*sudden system freeze*

Whether it ends up utilising the swap space and/or thrashing I don't know - I only know that the zpool disks' fancy LEDs have stopped, as well as the general disk-access LED.
At this point I can still ping the host, and get a response from telnetting to port 22, but I can't ssh in or access the console.

Any suggestions would be appreciated, we're rather fond of the data on the other 4 innocent filesystems ;)

The debug output:

Dec 1 15:56:07 fortinbras kernel: ZFS filesystem version 5
Dec 1 15:56:07 fortinbras kernel: zvol_init:1700[1]: ZVOL Initialized.
Dec 1 15:56:07 fortinbras kernel: ZFS storage pool version 28
Dec 1 15:56:07 fortinbras kernel: vdev_geom_open_by_path:384[1]: Found provider by name /dev/ada2.
Dec 1 15:56:07 fortinbras kernel: vdev_geom_attach:95[1]: Attaching to ada2.
Dec 1 15:56:07 fortinbras kernel: vdev_geom_attach:116[1]: Created geom and consumer for ada2.
Dec 1 15:56:07 fortinbras kernel: vdev_geom_read_guid:239[1]: Reading guid from ada2...
Dec 1 15:56:07 fortinbras kernel: vdev_geom_read_guid:273[1]: guid for ada2 is 12202424374202010419
Dec 1 15:56:07 fortinbras kernel: vdev_geom_open_by_path:399[1]: guid match for provider /dev/ada2.
Dec 1 15:56:07 fortinbras kernel: vdev_geom_open_by_path:384[1]: Found provider by name /dev/ada1.
Dec 1 15:56:07 fortinbras kernel: vdev_geom_attach:95[1]: Attaching to ada1.
Dec 1 15:56:07 fortinbras kernel: vdev_geom_attach:136[1]: Created consumer for ada1.
Dec 1 15:56:07 fortinbras kernel: vdev_geom_read_guid:239[1]: Reading guid from ada1...
Dec 1 15:56:07 fortinbras kernel: vdev_geom_read_guid:273[1]: guid for ada1 is 1791271857894800867
Dec 1 15:56:07 fortinbras kernel: vdev_geom_open_by_path:399[1]: guid match for provider /dev/ada1.
Dec 1 15:56:07 fortinbras kernel: vdev_geom_open_by_path:384[1]: Found provider by name /dev/ada3.
Dec 1 15:56:07 fortinbras kernel: vdev_geom_attach:95[1]: Attaching to ada3.
Dec 1 15:56:07 fortinbras kernel: vdev_geom_attach:136[1]: Created consumer for ada3.
Dec 1 15:56:07 fortinbras kernel: vdev_geom_read_guid:239[1]: Reading guid from ada3...
Dec 1 15:56:07 fortinbras kernel: vdev_geom_read_guid:273[1]: guid for ada3 is 2075211652913351343
Dec 1 15:56:07 fortinbras kernel: vdev_geom_open_by_path:399[1]: guid match for provider /dev/ada3.
Dec 1 15:56:07 fortinbras kernel:
Dec 1 15:56:07 fortinbras kernel: vdev_geom_open_by_path:384[1]: Found provider by name /dev/ada4.
Dec 1 15:56:07 fortinbras kernel: vdev_geom_attach:95[1]: Attaching to ada4.
Dec 1 15:56:07 fortinbras kernel: vdev_geom_attach:136[1]: Created consumer for ada4.
Dec 1 15:56:07 fortinbras kernel: vdev_geom_read_guid:239[1]: Reading guid from ada4...
Dec 1 15:56:07 fortinbras kernel: vdev_geom_read_guid:273[1]: guid for ada4 is 6076447326626437945
Dec 1 15:56:07 fortinbras kernel: vdev_geom_open_by_path:399[1]: guid match for provider /dev/ada4.
Dec 1 15:56:07 fortinbras kernel: vdev_geom_open_by_path:384[1]: Found provider by name /dev/ada5.
Dec 1 15:56:07 fortinbras kernel: vdev_geom_attach:95[1]: Attaching to ada5.
Dec 1 15:56:07 fortinbras kernel: vdev_geom_attach:136[1]: Created consumer for ada5.
Dec 1 15:56:07 fortinbras kernel: vdev_geom_read_guid:239[1]: Reading guid from ada5...
Dec 1 15:56:07 fortinbras kernel: vdev_geom_read_guid:273[1]: guid for ada5 is 16505750022304911009
Dec 1 15:56:07 fortinbras kernel:
Dec 1 15:56:07 fortinbras kernel: vdev_geom_open_by_path:399[1]: guid match for provider /dev/ada5.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_detach:156[1]: Closing access to ada2.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_detach:160[1]: Destroyed consumer to ada2.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_detach:156[1]: Closing access to ada1.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_detach:160[1]: Destroyed consumer to ada1.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_detach:156[1]: Closing access to ada3.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_detach:160[1]: Destroyed consumer to ada3.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_detach:156[1]: Closing access to ada4.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_detach:160[1]: Destroyed consumer to ada4.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_detach:156[1]: Closing access to ada5.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_detach:160[1]: Destroyed consumer to ada5.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_detach:168[1]: Destroyed geom zfs::vdev.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_open_by_path:384[1]: Found provider by name /dev/ada2.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_attach:95[1]: Attaching to ada2.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_attach:116[1]: Created geom and consumer for ada2.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_read_guid:239[1]:
Dec 1 15:56:08 fortinbras kernel: Reading guid from ada2...
Dec 1 15:56:08 fortinbras kernel: vdev_geom_read_guid:273[1]: guid for ada2 is 12202424374202010419
Dec 1 15:56:08 fortinbras kernel:
Dec 1 15:56:08 fortinbras kernel: vdev_geom_open_by_path:399[1]: guid match for provider /dev/ada2.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_open_by_path:384[1]: Found provider by name /dev/ada1.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_attach:95[1]: Attaching to ada1.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_attach:136[1]: Created consumer for ada1.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_read_guid:239[1]: Reading guid from ada1...
Dec 1 15:56:08 fortinbras kernel: vdev_geom_read_guid:273[1]: guid for ada1 is 1791271857894800867
Dec 1 15:56:08 fortinbras kernel: vdev_geom_open_by_path:399[1]: guid match for provider /dev/ada1.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_open_by_path:384[1]: Found provider by name /dev/ada3.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_attach:95[1]: Attaching to ada3.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_attach:136[1]: Created consumer for ada3.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_read_guid:239[1]: Reading guid from ada3...
Dec 1 15:56:08 fortinbras kernel: vdev_geom_read_guid:273[1]: guid for ada3 is 2075211652913351343
Dec 1 15:56:08 fortinbras kernel: vdev_geom_open_by_path:399[1]: guid match for provider /dev/ada3.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_open_by_path:384[1]: Found provider by name /dev/ada4.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_attach:95[1]: Attaching to ada4.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_attach:136[1]: Created consumer for ada4.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_read_guid:239[1]: Reading guid from ada4...
Dec 1 15:56:08 fortinbras kernel: vdev_geom_read_guid:273[1]: guid for ada4 is 6076447326626437945
Dec 1 15:56:08 fortinbras kernel: vdev_geom_open_by_path:399[1]: guid match for provider /dev/ada4.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_open_by_path:384[1]: Found provider by name /dev/ada5.
Dec 1 15:56:08 fortinbras kernel:
Dec 1 15:56:08 fortinbras kernel: vdev_geom_attach:95[1]: Attaching to ada5.
Dec 1 15:56:08 fortinbras kernel:
Dec 1 15:56:08 fortinbras kernel: vdev_geom_attach:136[1]: Created consumer for ada5.
Dec 1 15:56:08 fortinbras kernel: vdev_geom_read_guid:239[1]: Reading guid from ada5...
Dec 1 15:56:08 fortinbras kernel: vdev_geom_read_guid:273[1]: guid for ada5 is 16505750022304911009
Dec 1 15:56:08 fortinbras kernel: vdev_geom_open_by_path:399[1]: guid match for provider /dev/ada5.
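For reference, the ARC cap mentioned above is normally applied as a boot-time tunable on FreeBSD rather than a live sysctl. A minimal sketch follows; the sizes are placeholders chosen for illustration, not a recommendation, and arc_meta_limit may or may not be tunable on this particular ZFS version.

# /boot/loader.conf -- illustrative values only; adjust for the RAM actually installed
vfs.zfs.arc_max="1024M"          # hard ceiling on ARC size (wired memory)
vfs.zfs.arc_meta_limit="768M"    # metadata share of the ARC; the dedup table counts as metadata

A reboot is required for these to take effect.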
You have just learned the hard way that dedup is *highly* toxic if misused.

If you have a backup of your data, then you should delete the *pool*. Trying to destroy the dataset (the zfs-level filesystem) will probably never succeed unless you have it located on an SSD or you have an enormous amount of RAM (maybe 100GB? I haven't done the math on your system). There really isn't any other solution to this that I'm aware of. (Destroying the filesystem means a *lot* of random I/O; your drives are probably completely swamped by the workload.)

In general, deleting data (especially filesystems) should almost never be done in the face of dedup, and you should not use dedup unless you know that your data has a lot of natural redundancy in it *and* you have adequate memory. In general, dedup is *wrong* for typical home/hobbyist users. It can make sense when hosting a lot of VM images, or in some situations like backups with a lot of redundant copies.

I really wish we made it harder for end-users to enable dedup. For the first year or so after Nexenta shipped it, it was the single most frequent source of support calls. If you've not done the analysis already, and you're not a storage administrator, you probably should not enable dedup.

- Garrett

On Dec 1, 2011, at 1:43 PM, Gareth de Vaux wrote:

> Hi guys, when ZFS starts it ends up hanging the system.
> [...]
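A rough sketch of the "delete the pool" route described above, assuming the pool can be imported long enough to read the surviving filesystems (which, per the rest of the thread, first required more RAM); the pool name "tank", the dataset names, and "otherhost" are placeholders:

# Copy each surviving filesystem off the box, one at a time.
zfs snapshot tank/data@evac
zfs send tank/data@evac | ssh otherhost "zfs receive backup/data"
# ...repeat for the other filesystems, verify the copies, then:
zpool destroy tank                                  # discards the pool, DDT and all
zpool create tank raidz ada1 ada2 ada3 ada4 ada5    # recreate the raidz (leave dedup off)
ssh otherhost "zfs send backup/data@evac" | zfs receive tank/data

The point of the destroy/recreate is that the pool-wide dedup table goes away with the pool, so nothing ever has to walk it block by block.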
> The system has 6GB of RAM and a 10GB swap partition. I added a 30GB
> swap file but this hasn't helped.

ZFS doesn't use swap for the ARC (it's wired, i.e. unswappable, memory), and ZFS keeps the dedupe tables in the ARC.

You will need to find a lot of extra RAM to stuff into that machine in order for it to boot correctly, load the dedupe tables into ARC, process the intent log, and then import the pool. And you'll need that extra RAM in order to destroy the ZFS filesystem that has dedupe enabled.

Basically, your DDT (dedupe table) is running you out of ARC space and livelocking (or is it deadlocking? never can keep those terms straight) the box.

You can remove the RAM once you have things working again. Just don't re-enable dedupe until you have at least 16 GB of RAM in the box that can be dedicated to ZFS, and be sure to add a cache device to the pool.

I just went through something similar with an 8 GB ZFS box (RAM is on order, but the purchasing dept ordered from the wrong supplier, so we're stuck waiting for it to arrive) where I tried to destroy a dedupe'd filesystem. Exact same results as you. Stole RAM out of a different server temporarily to get things working on this box again.

> # sysctl hw.physmem
> hw.physmem: 6363394048
>
> # sysctl vfs.zfs.arc_max
> vfs.zfs.arc_max: 5045088256
>
> (I lowered arc_max to 1GB but it hasn't helped)

DO NOT LOWER THE ARC WHEN DEDUPE IS ENABLED!

--
Freddie Cash
fjwcash at gmail.com
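If the pool can be imported at all, a few read-only commands give a rough picture of how much of the (wired) ARC the DDT is consuming; "tank" is a placeholder pool name:

# All read-only checks; "tank" is a placeholder.
sysctl kstat.zfs.misc.arcstats.size    # current ARC size in bytes (wired memory)
sysctl vfs.zfs.arc_max                 # configured ARC ceiling
zpool status -D tank                   # dedup table histogram: entries on disk and in core
zdb -DD tank                           # more detailed DDT statistics, including per-entry sizes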
"Hung-Sheng Tsao (Lao Tsao 老曹) Ph.D."
2011-Dec-01 22:50 UTC
[zfs-discuss] ZFS not starting
FYI: http://www.oracle.com/technetwork/articles/servers-storage-admin/o11-113-size-zfs-dedup-1354231.html

Never too late :-(

On 12/1/2011 5:19 PM, Freddie Cash wrote:
> ZFS doesn't use swap for the ARC (it's wired, i.e. unswappable, memory), and
> ZFS keeps the dedupe tables in the ARC.
> [...]

--
Hung-Sheng Tsao Ph.D.
Founder & Principal
HopBit GridComputing LLC
cell: 9734950840
http://laotsao.wordpress.com/
http://laotsao.blogspot.com/
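As a back-of-envelope illustration of what that sizing guidance means for the 1.5TB that was deduped here: the commonly cited figure is on the order of a few hundred bytes of DDT per unique block. The block size and per-entry cost below are assumptions, not measurements from this pool:

# Assumed figures: 128K average block size, ~320 bytes of DDT per unique block.
blocks=$(( 1500 * 1024 * 1024 / 128 ))      # 1.5 TB expressed in 128K blocks  -> ~12.3 million
ddt_mb=$(( blocks * 320 / 1024 / 1024 ))    # -> roughly 3750 MB of DDT, all of it wired ARC
echo "${blocks} unique blocks, ~${ddt_mb} MB of DDT"

Smaller average blocks multiply the cost. Before ever enabling dedup, zdb -S <pool> will simulate the DDT for an existing pool and show whether the dedup ratio is worth it.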
On Thu 2011-12-01 (14:19), Freddie Cash wrote:
> You will need to find a lot of extra RAM to stuff into that machine in
> order for it to boot correctly, load the dedupe tables into ARC, process
> the intent log, and then import the pool.

Thanks guys, managed to get 24GB together and it made it (looks like it used 12GB of that).

> And, you'll need that extra RAM in order to destroy the ZFS filesystem that
> has dedupe enabled.

That filesystem's gone, and it seems like I've got mostly the right amount of free space. Waiting to see what a scrub has to say ..
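For anyone following along, a quick way to confirm the pool really is back to normal once the scrub finishes; "tank" is a placeholder pool name:

# Read-only sanity checks after the scrub completes.
zpool status -v tank     # scrub result and any per-file errors
zpool list tank          # the DEDUP column should be back to 1.00x
zfs get -r dedup tank    # confirm no remaining dataset still has dedup=on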