BJ Quinn
2008-Sep-30 20:53 UTC
[zfs-discuss] Segmentation fault / core dump with recursive send/recv
Is there more information that I need to post in order to help diagnose this problem?
Richard Elling
2008-Sep-30 21:16 UTC
[zfs-discuss] Segmentation fault / core dump with recursive send/recv
BJ Quinn wrote:
> Is there more information that I need to post in order to help diagnose this problem?

Segmentation faults should be correctly handled by the software. Please file a bug and attach the core.
http://bugs.opensolaris.org
-- richard
BJ Quinn
2008-Sep-30 22:01 UTC
[zfs-discuss] Segmentation fault / core dump with recursive send/recv
Please forgive my ignorance. I'm fairly new to Solaris (Linux convert), and although I recognize that Linux has the same concept of segmentation faults / core dumps, I believe my typical response to a segmentation fault was to upgrade the kernel, and that always fixed the problem (i.e. somebody else filed the bug and fixed the problem before I got around to doing it myself).

So - I'm running stock OpenSolaris 2008.05. Even if the bug was fixed, I imagine it would require a Solaris kernel upgrade anyway, right? Perhaps I could simply try that first? Are the kernel upgrades "stable"? I know for a while there, before the 2008.05 release, Solaris just released a new "development" kernel every two weeks. I don't think I want to just haphazardly upgrade to some random bi-weekly development kernel. Are there actually "stable" kernel upgrades for OS, and how would I go about upgrading it if there are?
Richard Elling
2008-Sep-30 22:16 UTC
[zfs-discuss] Segmentation fault / core dump with recursive send/recv
BJ Quinn wrote:
> Please forgive my ignorance. I'm fairly new to Solaris (Linux convert), and although I recognize that Linux has the same concept of segmentation faults / core dumps, I believe my typical response to a segmentation fault was to upgrade the kernel, and that always fixed the problem (i.e. somebody else filed the bug and fixed the problem before I got around to doing it myself).
>
> So - I'm running stock OpenSolaris 2008.05. Even if the bug was fixed, I imagine it would require a Solaris kernel upgrade anyway, right? Perhaps I could simply try that first? Are the kernel upgrades "stable"? I know for a while there, before the 2008.05 release, Solaris just released a new "development" kernel every two weeks. I don't think I want to just haphazardly upgrade to some random bi-weekly development kernel. Are there actually "stable" kernel upgrades for OS, and how would I go about upgrading it if there are?

If there was a bug already filed and fixed, then it should be in the bugs database, which is searchable at:
http://bugs.opensolaris.org
-- richard
BJ Quinn
2008-Sep-30 23:03 UTC
[zfs-discuss] Segmentation fault / core dump with recursive send/recv
True, but a search for zfs "segmentation fault" returns 500 bugs. It's possible one of those is related to my issue, but it would take all day to find out. If it's not "flaky" or "unstable", I'd like to try upgrading to the newest kernel first, unless my Linux mindset is truly out of place here, or if it's not relatively easy to do. Are these kernels truly considered stable? How would I upgrade?
Richard Elling
2008-Sep-30 23:54 UTC
[zfs-discuss] Segmentation fault / core dump with recursive send/recv
BJ Quinn wrote:
> True, but a search for zfs "segmentation fault" returns 500 bugs. It's possible one of those is related to my issue, but it would take all day to find out. If it's not "flaky" or "unstable", I'd like to try upgrading to the newest kernel first, unless my Linux mindset is truly out of place here, or if it's not relatively easy to do. Are these kernels truly considered stable? How would I upgrade?

Searching bug databases can be an art... Project Indiana is where notifications of package repository changes are made. b98 is available, with instructions posted recently:
http://www.opensolaris.org/jive/thread.jspa?threadID=75115&tstart=15
Be sure to read the release notes:
http://opensolaris.org/os/project/indiana/resources/rn3/image-update/
-- richard
Bob Friesenhahn
2008-Sep-30 23:55 UTC
[zfs-discuss] Segmentation fault / core dump with recursive send/recv
On Tue, 30 Sep 2008, BJ Quinn wrote:
> True, but a search for zfs "segmentation fault" returns 500 bugs. It's possible one of those is related to my issue, but it would take all day to find out. If it's not "flaky" or "unstable", I'd like to try upgrading to the newest kernel first, unless my Linux mindset is truly out of place here, or if it's not relatively easy to do. Are these kernels truly considered stable? How would I upgrade?

Linux and Solaris are quite different when it comes to kernel strategies. Linux documents and stabilizes its kernel interfaces while Solaris does not document its kernel interfaces, but focuses on stable shared library interfaces. Most Linux system APIs have a direct kernel API equivalent, but Solaris often uses a completely different kernel interface. Segmentation faults in user applications are generally due to user-space bugs rather than due to the kernel.

Bob
=====================================
Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Joerg Schilling
2008-Oct-01 13:18 UTC
[zfs-discuss] Segmentation fault / core dump with recursive send/recv
Bob Friesenhahn <bfriesen@simple.dallas.tx.us> wrote:
> On Tue, 30 Sep 2008, BJ Quinn wrote:
> > True, but a search for zfs "segmentation fault" returns 500 bugs. It's possible one of those is related to my issue, but it would take all day to find out. If it's not "flaky" or "unstable", I'd like to try upgrading to the newest kernel first, unless my Linux mindset is truly out of place here, or if it's not relatively easy to do. Are these kernels truly considered stable? How would I upgrade?
>
> Linux and Solaris are quite different when it comes to kernel
> strategies. Linux documents and stabilizes its kernel interfaces

Linux does not implement stable kernel interfaces. It may be that there is an intention to do so, but I've seen problems on Linux resulting from self-incompatibility on a regular basis.

Jörg
--
EMail: joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
js@cs.tu-berlin.de (uni)
schilling@fokus.fraunhofer.de (work)
Blog: http://schily.blogspot.com/
URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
Fajar A. Nugraha
2008-Oct-01 13:49 UTC
[zfs-discuss] Segmentation fault / core dump with recursive send/recv
Next "stable" (as in fedora or ubuntu releases) opensolaris version will be 2008.11. In my case I found 2008.05 is simply unusable (my main interest is xen/xvm), but upgrading to the latest available build with OS''s pkg, (similar to apt-get) fixed the problem. If you installed the original OS 2008.05, upgrading is somewhat harder because it requires some additional steps (see OS website for details). Once you''re running current build, upgrading is just a simple command. In OS, when you upgrade, you get to keep you old version as well, so you can easily rollback if something went wrong. On 10/1/08, BJ Quinn <bjquinn at seidal.com> wrote:> True, but a search for zfs "segmentation fault" returns 500 bugs. It''s > possible one of those is related to my issue, but it would take all day to > find out. If it''s not "flaky" or "unstable", I''d like to try upgrading to > the newest kernel first, unless my Linux mindset is truly out of place here, > or if it''s not relatively easy to do. Are these kernels truly considered > stable? How would I upgrade? > -- > This message posted from opensolaris.org > _______________________________________________ > zfs-discuss mailing list > zfs-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss >
David G. Bustos
2008-Oct-01 18:08 UTC
[zfs-discuss] Segmentation fault / core dump with recursive send/recv
The problem could be in the zfs command or in the kernel. Run "pstack" on the core dump and search the bug database for the functions it lists. If you can't find a bug that matches your situation and your stack, file a new bug and attach the core. If the engineers find a duplicate bug, they'll just close it as a duplicate, and the bug database will show a pointer to the original bug.

David
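A minimal sketch of that workflow with the standard Solaris tools (core file name and location assumed):

  $ file core                  # confirm which binary dumped core (zfs itself, or something else)
  $ pstack core                # print the stack; the function names are your search terms
  $ pstack core | grep -i zfs  # narrow to the zfs/libzfs frames worth searching on

If the faulting binary is /sbin/zfs, the problem is in userland (the zfs command or libzfs); a kernel bug would normally surface as a panic rather than a user-process core.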
BJ Quinn
2008-Oct-09 01:50 UTC
[zfs-discuss] Segmentation fault / core dump with recursive send/recv
Ok, I'm taking a step back here. Forgetting the incremental for a minute (which is the part causing the segmentation fault), I'm simply trying to use zfs send -R to get a whole filesystem and all of its snapshots. I ran the following, after creating a compressed pool called backup:

zfs send -R datapool/shares@BACKUP20081008 | zfs recv -dv backup

datapool/shares has three snapshots - BACKUP081007, BACKUP20081008, and just BACKUP, in that age order. However, the command above creates backup/shares and backup/shares@BACKUP081007, and the contents of backup/shares seem to be from that (the oldest) snapshot. The newer (and explicitly specified) snapshot, BACKUP20081008, doesn't get copied over as a snapshot, and its contents don't get copied over into backup/shares as the "current" state.

Am I doing something wrong here? Possibly having that additional, newer, generic BACKUP snapshot (I just wanted a snapshot to always represent the newest backup, so I created BACKUP20081008 and then BACKUP immediately afterwards) messed it up? Can I not transfer over snapshots from a certain point backwards in time, or does zfs send -R require sending from the newest snapshot backwards, or am I off altogether? Thanks!
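For reference, a sketch of the replication pattern being attempted here, with a dry run first (dataset and snapshot names from the post; zfs recv -n prints what would be received without writing anything):

  $ zfs list -t snapshot -r datapool/shares     # confirm which snapshots exist
  $ zfs send -R datapool/shares@BACKUP20081008 | zfs recv -nv -d backup   # dry run
  $ zfs send -R datapool/shares@BACKUP20081008 | zfs recv -dv backup

Note that a -R stream carries the named snapshot plus the snapshots older than it; a snapshot created after the one named (the generic BACKUP here) is not part of the stream.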
BJ Quinn
2008-Oct-09 05:49 UTC
[zfs-discuss] Segmentation fault / core dump with recursive send/recv
Oh, and I had been doing this remotely, so I didn't notice the following error before:

receiving incremental stream of datapool/shares@BACKUP20081008 into backup/shares@BACKUP20081008
cannot receive incremental stream: destination backup/shares has been modified since most recent snapshot

This is reported after the first snapshot, BACKUP081007, gets copied, and then it quits. I don't see why it would have been modified. I guess it's possible I cd'ed into the backup directory at some point during the send/recv, but I don't think so. Should I set the readonly property on the backup FS or something?
Brent Jones
2008-Oct-09 07:17 UTC
[zfs-discuss] Segmentation fault / core dump with recursive send/recv
On Wed, Oct 8, 2008 at 10:49 PM, BJ Quinn <bjquinn@seidal.com> wrote:
> Oh, and I had been doing this remotely, so I didn't notice the following error before:
>
> receiving incremental stream of datapool/shares@BACKUP20081008 into backup/shares@BACKUP20081008
> cannot receive incremental stream: destination backup/shares has been modified since most recent snapshot
>
> This is reported after the first snapshot, BACKUP081007, gets copied, and then it quits. I don't see why it would have been modified. I guess it's possible I cd'ed into the backup directory at some point during the send/recv, but I don't think so. Should I set the readonly property on the backup FS or something?

Correct, the other side should be set read-only; that way nothing at all is modified when the other host tries to zfs send.

-- Brent Jones
brent@servuhome.net
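In practice that is one property on the receiving side (dataset names from the thread):

  $ zfs set readonly=on backup/shares    # keep the backup tree read-only between receives
  $ zfs get readonly backup/shares       # verify

As a later message in this thread found, atime updates alone can count as modifications, so atime=off on the destination may be needed as well.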
Christian Heßmann
2008-Oct-09 08:39 UTC
[zfs-discuss] Segmentation fault / core dump with recursive send/recv
On 09.10.2008, at 09:17, Brent Jones wrote:
> Correct, the other side should be set read-only; that way nothing at all is modified when the other host tries to zfs send.

Since I use the receiving side for backup purposes only, which means that any change would be accidental - shouldn't a recv -F do the trick just as well?

Best regards,
Christian
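For context, -F rolls the destination back to its most recent snapshot before applying the stream, roughly equivalent to doing the rollback by hand first (names from the thread):

  $ zfs rollback backup/shares@BACKUP081007     # discard changes made since the last snapshot
  $ zfs send -R datapool/shares@BACKUP20081008 | zfs recv -dvF backup

On a backup-only target where any change is accidental, that is the usual answer; the cost is that anything accidentally written to the backup is silently discarded.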
BJ Quinn
2008-Oct-09 22:56 UTC
[zfs-discuss] Segmentation fault / core dump with recursive
Yeah, -F should probably work fine (I'm trying it as we speak, but it takes a little while), but it makes me a bit nervous. I mean, it should only be necessary if (as the error message suggests) something HAS actually changed, right?

So, here's what I tried - first of all, I set the backup FS to readonly. That resulted in the same error message. Strange - how could something have changed since the last snapshot if I CONSCIOUSLY didn't change anything or cd into it or anything AND it was set to readonly? Oh well, so I tried another idea - I had been setting compression to gzip-1 on my backup, but my source filesystem had compression=gzip-1 and recordsize=16k. So, I set both of those settings on my backup FS (and readonly OFF). Now the only difference between my backup and my backup source should be the fact that they have different FS names (datapool/shares and backup/shares), but they would kinda have to, wouldn't they?

At any rate, I'm trying -F now, but that makes me a bit uncomfortable. Why does zfs think something has changed? Am I truly creating a backup that could be restored and won't have something screwed up either with some of the older snapshots or with the current, "promoted" version of the FS? If something really has changed between snapshots, or the incrementals aren't copied over just right, I could end up with my best backup being all corrupted. (My other backup methods, of course, contain only current data, not the whole series of snapshots.)
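One way to check that nothing but the name differs is to compare the properties side by side (dataset names from the post):

  $ zfs get -r compression,recordsize,readonly,atime datapool/shares backup/shares

A -R stream carries dataset properties with it, so re-setting them by hand on the backup should not normally be necessary.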
BJ Quinn
2008-Oct-10 20:16 UTC
[zfs-discuss] Segmentation fault / core dump with recursive
Ok, in addition to my "why do I have to use -F" post above, now I've tried it with -F, but after the first in the series of snapshots gets sent, it gives me a "cannot mount '/backup/shares': failed to create mountpoint".
Scott Williamson
2008-Oct-10 20:59 UTC
[zfs-discuss] Segmentation fault / core dump with recursive
On Thu, Oct 9, 2008 at 6:56 PM, BJ Quinn <bjquinn@seidal.com> wrote:
> So, here's what I tried - first of all, I set the backup FS to readonly. That resulted in the same error message. Strange - how could something have changed since the last snapshot if I CONSCIOUSLY didn't change anything or cd into it or anything AND it was set to readonly?

On this point: when I 'zfs send' differential snapshots to another pool on Sol10U5, I see this message on some file systems and not others, with no cd etc. The only solution I found was to set the parent file system to mountpoint=legacy and not mount it.
BJ Quinn
2008-Oct-10 21:29 UTC
[zfs-discuss] Segmentation fault / core dump with recursive
You've seen -F be necessary on some systems and not on others? Also, was the mountpoint=legacy suggestion for my problem with not wanting to use -F, or for my "cannot create mountpoint" problem? Or both? If you use legacy mountpoints, does that mean that mounting the parent filesystem doesn't actually mount each sub-filesystem, and you have to mount them all individually?
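For what it's worth, a sketch of how legacy mounts behave (mountpoint path assumed):

  $ zfs set mountpoint=legacy backup/shares    # zfs stops mounting it automatically
  $ mkdir -p /mnt/backup
  $ mount -F zfs backup/shares /mnt/backup     # mount by hand only when needed
  $ umount /mnt/backup

With mountpoint=legacy nothing mounts automatically; each filesystem is mounted individually via mount(1M) or /etc/vfstab, and children are not mounted just because the parent is.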
I found setting atime=off was enough to get zfs receive working for me, but the readonly property should work as well. I chose not to set the pool readonly, as I want to be able to use my backup pool as a replacement easily, without changing any settings. Not using -F means that as soon as the backup system gets used and data gets written to it, that data is protected from being overwritten by the next backup.
BJ Quinn
2008-Oct-11 15:47 UTC
[zfs-discuss] Segmentation fault / core dump with recursive
readonly=on worked (at least with -F), but then it got the error creating a mountpoint I mentioned above. So I took away readonly=on, and it got past that part; however, the snapshots past the first one take an eternity. I left it overnight and it managed to get from 21MB copied for the second snapshot to 26MB. Admittedly, I'm trying to back up to a thumb drive (a fast one), but the first snapshot it copies over, which contains 13GB, doesn't take very long. It slows to a crawl after the first snapshot is copied. iostat shows the thumb drive at 100% busy, but it's not actually doing much.

I'll try turning off atime and not using -F, and possibly also not auto-mounting the backup drive (mountpoint=legacy or mountpoint=none, I guess) so as to eliminate other possible sources of changed data. Anyone know why it would go so slow after the first snapshot?
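Two standard ways to watch what the receive is actually doing while it runs:

  $ zpool iostat -v backup 5    # per-device ops and bandwidth on the receiving pool
  $ iostat -xn 5                # device service times and %busy

A device that is 100% busy while moving almost no data usually means many small synchronous writes, which would fit a flash thumb drive on the receiving end of lots of small incremental changes.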
BJ Quinn
2008-Oct-13 19:20 UTC
[zfs-discuss] Segmentation fault / core dump with recursive
Ok, so I left the thumb drive to try to back up all weekend. It got *most* of the first snapshot copied over, about 50MB, and that's it. So I tried an external USB hard drive today, and it actually bothered to copy over the snapshots, but it does so very slowly. It copied over the first snapshot (the "full stream") at about 25MB/s (uncompressed). Seems silly that it won't retain compression during the copy, which would speed up my transfer, but oh well. However, once it gets to the incrementals, it slows down to about 1MB-5MB/s. Then, once it got to the last incremental snapshot, it copied it over and then hung. (By the way, when I say incrementals, I don't mean I'm doing a zfs send -i, just a zfs send -R to a clean drive.)

Now if I try to do a zfs list on my backup pool, it hangs and never comes back. My zfs send -R command has now also hung. iostat and the little green light on the external drive both show no further activity on the external drive. This is crazy. I feel like I'm beta testing a brand new feature or something. Nothing works. Does anybody actually use this?
BJ Quinn
2008-Oct-14 18:45 UTC
[zfs-discuss] Segmentation fault / core dump with recursive
Well, I haven't solved everything yet, but I do feel better now that I realize that it was setting mountpoint=none that caused the zfs send/recv to hang. Allowing the default mountpoint setting fixed that problem. I'm now trying with mountpoint=legacy, because I'd really rather leave it unmounted, especially during the backup itself, to prevent changes happening while the incrementals are copying over, and also in the end to hopefully let me avoid using -F.

The incrementals (copying all the snapshots beyond the first one copied) are really slow, however. Is there anything that can be done to speed that up? I'm using compression (gzip-1) on the source filesystem. I wanted the backup to retain the same compression. Can ZFS copy the compressed version over to the backup, or does it really have to uncompress it and recompress it? That takes time and lots of CPU cycles. I'm dealing with highly compressible data (at least 6.5:1).
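On the compression question: zfs send in these builds emits the data uncompressed, so the receiving side does have to recompress it. The stream itself can be compressed in the pipe, though, which helps whenever the link between the two ends is the bottleneck. A sketch (backuphost is hypothetical; over a plain local pipe the trick buys nothing, since the data is decompressed again before zfs recv writes it):

  $ zfs send -R datapool/shares@BACKUP20081008 | gzip -1 | \
      ssh backuphost 'gunzip | zfs recv -dvF backup'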