I've got a really unique problem. Running xeno-bk changeset 1364, I get
random filesystem/bash globbing problems in dom0. Same system under 2.4.26
without xeno, no issues at all.

In this case the dom0 distribution is slackware, running ext3, on an Adaptec
AIC-7892P U160/m scsi controller.

The problem manifests itself most easily with 'ls':

"""
root@durandal:~# ls /var/adm/packages/
aaa_base-10.0.0-noarch-1 guile-1.6.4-i486-1 pango-1.4.0-i486-1
aaa_elflibs-9.2.0-i486-1 gzip-1.3.3-i386-2 pciutils-2.1.11-i486-5
acpid-1.0.3-i486-1 hdparm-5.5-i486-1 pcre-4.5-i486-2
...
"""

works as expected.

"""
root@durandal:~# ls /var/adm/packages/*
: : No such file or directory
: : No such file or directory
...
"""

fails, as does:

"""
root@durandal:~# cd /var/log/packages/
root@durandal:/var/log/packages# ls *
: : No such file or directory
...
"""

Now things get really strange:

"""
root@durandal:/var/log/packages# for x in *; do echo $x; done
aaa_base-10.0.0-noarch-1
aaa_elflibs-9.2.0-i486-1
acpid-1.0.3-i486-1
at-3.1.8-i486-2
...
root@durandal:/var/log/packages# echo *
aaa_base-10.0.0-noarch-1 aaa_elflibs-9.2.0-i486-1 acpid-1.0.3-i486-1 ...
root@durandal:/var/log/packages# ls `echo *`
: : No such file or directory
: : No such file or directory
...
root@durandal:/var/log/packages# strace ls *
: : command not found
root@durandal:/var/log/packages# cd
root@durandal:# strace ls /var/log/packages/*
: : command not found
root@durandal:~# ls *
/usr/bin/ls: *: No such file or directory
root@durandal:~# strace ls *
execve("/usr/bin/ls", ["ls", "*"], [/* 28 vars */]) = 0
brk(0) = 0x805a000
...
stat64("*", 0x805b6f4) = -1 ENOENT (No such file or directory)
lstat64("*", 0x805b6f4) = -1 ENOENT (No such file or directory)
write(2, "ls: ", 4ls: ) = 4
write(2, "*", 1*) = 1
open("/usr/share/locale/en_US/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/en/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
write(2, ": No such file or directory", 27: No such file or directory) = 27
write(2, "\n", 1
"""

So strace only works in SOME of the directories that exhibit this problem,
and when it fails, it's really trying to stat a literal '*' - the bash globbing
fails randomly.

Normally I'd accept this as being a borked install that I'm trying to run as
dom0, but none of this behavior exhibits itself under the 2.4.26 stock kernel.

I did a make clean and a fresh make world to be sure I didn't have any old
object files laying around, no change in the behavior. The filesystem fsck -f's
clean under both xeno and non-xeno.

-m

-------------------------------------------------------
This SF.net email is sponsored by: IT Product Guide on ITManagersJournal
Use IT products in your business? Tell us what you think of them. Give us
Your Opinions, Get Free ThinkGeek Gift Certificates! Click to find out more
http://productguide.itmanagersjournal.com/guidepromo.tmpl
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xen-devel
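A note on why `echo *` and the `for` loop succeed in the report above while `ls *` fails: in bash, `echo` and `for` are handled inside the shell itself, so the expanded names never pass through execve(), whereas `ls` is an external program and receives them as argv. The following is a minimal sketch for checking whether an argument list survives that hand-off. `check_exec_args` is a hypothetical helper name, not something from the thread, and the sketch assumes `printf` is a builtin in both the calling shell and `sh` (true for bash).

```shell
# check_exec_args: hypothetical helper, not part of the original report.
# Compares the arguments as the current shell holds them with the
# arguments a freshly exec'd child process actually receives.
check_exec_args() {
    # Checksum computed without putting the arguments through execve():
    # printf is a builtin, and cksum reads the list on stdin, not in argv.
    before=$(printf '%s\n' "$@" | cksum)
    # Same checksum, computed by a child that got the arguments via execve().
    after=$(sh -c 'printf "%s\n" "$@" | cksum' sh "$@")
    if [ "$before" = "$after" ]; then
        echo "ok: $# arguments survived exec intact"
    else
        echo "MISMATCH: argument list changed across exec"
        return 1
    fi
}
```

Run from inside a suspect directory as `check_exec_args *`. On a healthy kernel this should always report "ok"; a mismatch would point at the exec path rather than at bash's globbing.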
> I've got a really unique problem. Running xeno-bk changeset 1364, I get
> random filesystem/bash globbing problems in dom0. Same system under 2.4.26
> without xeno, no issues at all.
>
> In this case the dom0 distribution is slackware, running ext3, on an Adaptec
> AIC-7892P U160/m scsi controller.
>
> root@durandal:~# ls /var/adm/packages/*
> : : No such file or directory
> : : No such file or directory

Bizarre. It's hard to figure out how Xen/Linux could be doing
this, unless there's some terrible memory corruption going on
that would cause things to be segfaulting or kernel oops. Xen
just doesn't do subtle bugs ;-)

Can you do an ldd on /bin/ls just to see what libraries you're
using. Are there alternative libraries under /lib (e.g. non tls
or non i686) that you could try?

Finally, just make sure that there's nothing in your environment
that is switching on something subtle returned from uname or
arch...

Ian
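The library check requested above can be narrowed to the TLS question with a small filter over `ldd` output. This is only a sketch: `tls_paths` is a hypothetical helper name, and it assumes that TLS library variants are installed under a directory named `tls/` (for example `/lib/tls`), which was a common glibc layout at the time but is not confirmed for this system in the thread.

```shell
# tls_paths: hypothetical helper. Reads `ldd` output on stdin and prints
# the resolved path of every library that lives under a tls/ directory.
# Assumption: TLS variants are installed under a path containing /tls/.
tls_paths() {
    # ldd lines look like: "libc.so.6 => /lib/tls/libc.so.6 (0x...)",
    # so the resolved path is the third whitespace-separated field.
    awk '$3 ~ /\/tls\// { print $3 }'
}
```

Usage would be `ldd /bin/ls | tls_paths`; empty output means no library was resolved from a `tls/` directory under that assumption.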
> So strace only works in SOME of the directories that exhibit this problem,
> and when it fails, it's really trying to stat a literal '*' - the bash globbing
> fails randomly.
>
> Normally I'd accept this as being a borked install that I'm trying to run as
> dom0, but none of this behavior exhibits itself under the 2.4.26 stock kernel.
>
> I did a make clean and a fresh make world to be sure I didn't have any old
> object files laying around, no change in the behavior. The filesystem fsck -f's
> clean under both xeno and non-xeno.

Very odd! Just a random thought - you aren't using TLS libraries are
you? If you were then you'd get a pretty clear warning during boot, so
it's unlikely.

 -- Keir
urmk@reason.marist.edu
2004-Oct-06 14:49 UTC
Re: [Xen-devel] Very odd filesystem problem in dom0
Yeah, I'm inclined to agree - it's a very subtle, very weird situation.

I can confirm that it ISN'T system libraries:

"""
root@durandal:~# ldd bash-static
not a dynamic executable
root@durandal:~# ./bash-static
root@durandal:~# ls /var/adm/packages/*
: : No such file or directory
: : No such file or directory
: : No such file or directory
: : No such file or directory
...
root@durandal:~# ldd ./ls-static
not a dynamic executable
root@durandal:~# ./ls-static /var/adm/packages/*
: : No such file or directory
: : No such file or directory
: : No such file or directory
"""

Static ls fails from both inside static bash and inside dynamic bash.

It -definitely- seems to be tied to those directories - I can move the old one,
create a new one of the same name, and the new one is fine - but I can't copy
the files from the old one over:

"""
root@durandal:/var/adm# cp packages/* tem/
: `': specified destination directory does not exist
"""

I can't find any variables being set that have anything to do with the uname,
and I've run uname-variant kernels on the same slack10 base installs before
with no odd issues.

I've tried booting off the old kernel (and the install cd) and re-copying the
/var directory (mv /var /old-var, cp -av /old-var /var) so that if it's some
sort of low-level fs corruption, that SHOULD get past it: No luck.

set -o xtrace shows bash is expanding the globbing correctly:

"""
root@durandal:/var/log# ls packages/*
+ /usr/bin/ls --color=auto -F -b -T 0 packages/aaa_base-10.0.0-noarch-1 packages/aaa_elflibs-9.2.0-i486-1 ...
"""

Turning off color-ls doesn't do any good, either.

Ah-hah. It was just suggested to me that the size of the directory might matter:
It's not quite the number of files, it's the length of the argument character
string. (Sorry, writing this email stream-of-thought as I test different
things)

Filling a test directory with 100 files (for x in `seq 0 100`; do touch $x; done)
doesn't trigger it, but filling a test dir with 100 files with long names
(touch aaaaaaaaaaaaaaaaaaaaaa-$x) does.

Is xeno changing the size of the exec string buffer in the kernel?

-m

On Tue, Oct 05, 2004 at 10:24:53PM +0100, Ian Pratt wrote:
> Bizarre. It's hard to figure out how Xen/Linux could be doing
> this, unless there's some terrible memory corruption going on
> that would cause things to be segfaulting or kernel oops. Xen
> just doesn't do subtle bugs ;-)
>
> Can you do an ldd on /bin/ls just to see what libraries you're
> using. Are there alternative libraries under /lib (e.g. non tls
> or non i686) that you could try?
>
> Finally, just make sure that there's nothing in your environment
> that is switching on something subtle returned from uname or
> arch...
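The long-filename test described above can be turned into a repeatable check. The following is a rough sketch, not the poster's actual procedure: the function names are hypothetical, and it compares the file count from an in-shell glob (no exec involved) with the count an external `ls` reports after receiving the same names through execve(). It assumes filenames contain no newlines and that the directory is not empty (an unmatched glob would be counted as one literal argument).

```shell
# All function names below are hypothetical helpers for illustration.

# Create COUNT empty files named PREFIX-0 .. PREFIX-(COUNT-1) in DIR.
make_long_name_dir() {
    dir=$1; count=$2; prefix=$3
    mkdir -p "$dir" || return 1
    i=0
    while [ "$i" -lt "$count" ]; do
        : > "$dir/$prefix-$i"
        i=$((i + 1))
    done
}

# Count entries using only the shell's own glob expansion (no exec).
count_via_glob() {
    set -- "$1"/*
    echo "$#"
}

# Count entries as seen by an external ls, which gets the expanded
# names as argv. `command` sidesteps aliases such as color-ls.
count_via_ls() {
    command ls -d "$1"/* 2>/dev/null | wc -l | tr -d ' '
}
```

For example: `make_long_name_dir /tmp/globtest 100 aaaaaaaaaaaaaaaaaaaaaa`, then compare `count_via_glob /tmp/globtest` with `count_via_ls /tmp/globtest`. The two numbers should agree on a healthy kernel; the path `/tmp/globtest` is only an illustration.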
urmk@reason.marist.edu
2004-Oct-06 15:25 UTC
Re: [Xen-devel] Very odd filesystem problem in dom0
OK, this is definitely related to the length of the exec string, and the exec
getting corrupted. I can't recreate it with the obvious echo `perl print 'a'x100`,
but I can create it with the length of filenames in the directory -- I've even
gotten it to corrupt to the point that argv[0] is showing up later in the arg
string:

"""
root@durandal:~/temp# ls *
: : No such file or directory
: : No such file or directory
: : No such file or directory
/bin/ls*
2222222222
3333333333
4444444444
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa0
....
"""

/bin/ls was obviously not in that directory.

I'll keep working on narrowing it down more and looking at the kernel code.

-m

On Wed, Oct 06, 2004 at 10:49:42AM -0400, urmk@reason.marist.edu wrote:
> Ah-hah. It was just suggested to me that the size of the directory might matter:
> It's not quite the number of files, it's the length of the argument character
> string. (Sorry, writing this email stream-of-thought as I test different
> things)
>
> Filling a test directory with 100 files (for x in `seq 0 100`; do touch $x; done)
> doesn't trigger it, but filling a test dir with 100 files with long names
> (touch aaaaaaaaaaaaaaaaaaaaaa-$x) does.
>
> Is xeno changing the size of the exec string buffer in the kernel?
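Since the trigger appears to be the total length of the argument strings rather than the number of files, it may help to measure that length directly while narrowing things down. This is a small sketch under stated assumptions: `arg_bytes` is a hypothetical helper name, and it counts characters, which equals bytes only for ASCII filenames like the ones in this thread.

```shell
# arg_bytes: hypothetical helper. Prints the total size of its arguments
# as they would be laid out for exec: each string plus its terminating
# NUL byte. ${#a} counts characters, so this matches the byte count only
# for single-byte (e.g. ASCII) names.
arg_bytes() {
    total=0
    for a in "$@"; do
        total=$((total + ${#a} + 1))
    done
    echo "$total"
}
```

Running `arg_bytes *` in a directory that fails and in one that works gives two numbers to bracket the threshold. `getconf ARG_MAX` and `getconf PAGESIZE` can be printed alongside as reference points only; nothing in the thread so far establishes which limit, if any, is involved.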
Just to confirm: you are seeing this problem on 2.6.8.1-xen?

I can reproduce on 2.6 but not on 2.4. Looks like the arguments go
into execve() but get lost sometime during the exec syscall...

 -- Keir

> OK, this is definitely related to the length of the exec string, and the exec
> getting corrupted. I can't recreate it with the obvious echo `perl print 'a'x100`,
> but I can create it with the length of filenames in the directory -- I've even
> gotten it to corrupt to the point that argv[0] is showing up later in the arg
> string:
>
> """
> root@durandal:~/temp# ls *
> : : No such file or directory
> : : No such file or directory
> : : No such file or directory
> /bin/ls*
> 2222222222
> 3333333333
> 4444444444
> aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa0
> ....
> """
>
> /bin/ls was obviously not in that directory.
>
> I'll keep working on narrowing it down more and looking at the kernel code.
>
> -m
>
> On Wed, Oct 06, 2004 at 10:49:42AM -0400, urmk@reason.marist.edu wrote:
> > Ah-hah. It was just suggested to me that the size of the directory might matter:
> > It's not quite the number of files, it's the length of the argument character
> > string. (Sorry, writing this email stream-of-thought as I test different
> > things)
> >
> > Filling a test directory with 100 files (for x in `seq 0 100`; do touch $x; done)
> > doesn't trigger it, but filling a test dir with 100 files with long names
> > (touch aaaaaaaaaaaaaaaaaaaaaa-$x) does.
> >
> > Is xeno changing the size of the exec string buffer in the kernel?
> OK, this is definitely related to the length of the exec string, and the exec
> getting corrupted. I can't recreate it with the obvious echo `perl print 'a'x100`,
> but I can create it with the length of filenames in the directory -- I've even
> gotten it to corrupt to the point that argv[0] is showing up later in the arg
> string:

Building without writable pagetables gets rid of the problem.

Kind of good news in a way -- I have a reproducible test case where
writable p.t.'s always screw up. It's now worth my while to spend some
"quality time" with that code tomorrow. :-)

Hopefully any code bug(s) I find will also be responsible for the
other crashes/weirdnesses that people have been seeing (e.g., the rmap
crash [Flavio] and the crash in Xen [Dave Becker]).

 -- Keir