nic@cray.com
2007-Jan-09 16:13 UTC
[Lustre-devel] [Bug 11283] 1.4.9.pre: SEGV in libcfs_debug_vmsg2: format1 == NULL
Please don't reply to lustre-devel. Instead, comment in Bugzilla by
using the following link:
https://bugzilla.lustre.org/show_bug.cgi?id=11283
So -- There is one major issue with the patch and a few small nits:
The two changes to lustre/include/linux/lustre_net.h re-add the annoying
newlines to the debug messages. Ick :)
The big one is the change from "buf[256]" to "buf[4047]". To start, 4047 is an
awfully strange value to just pick; at minimum, CFS needs to add a comment
explaining it. The worst part is that lputs() has a hard limit of 256 chars
that it will actually print to the console.
Here is a short example:
nic@guppy1:~> cat lputs_test.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>   /* strlen() */

int main(int argc, char *argv[]) {
    int size = 0;
    int i;
    char *buf = NULL;

    if (argc < 2) {
        fprintf(stderr, "usage: %s size\n", argv[0]);
        exit(1);
    }
    size = atoi(argv[1]);
    buf = (char *)malloc((sizeof(char) * size) + 1);
    if (buf == NULL) {
        perror("malloc");
        exit(1);
    }
    /* Fill buf with a repeating a..z pattern, "size" chars long. */
    for (i = 0; i < size; i++) {
        buf[i] = 'a' + (i % 26);
    }
    buf[size] = '\0';
    printf("len: %d buf: %s\n", (int)strlen(buf), buf);
    lputs(buf);   /* the console print routine discussed above */
    return 0;
}
I'll run it twice:
nic@guppy1:~> yod -np 1 ./qk_lputs_test 252
len: 252 buf:
abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqr
nic@guppy1:~> yod -np 1 ./qk_lputs_test 253
len: 253 buf:
abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrs
From the console log (notice that the leading "0- " takes up 4 of the 256
possible chars):
[2007-01-09 16:56:05][c0-0c0s3n2]0- ******* _cstart2(), yod_pid=30159 rank=0
lognid=0 physnid=0xe pid=5
[2007-01-09 16:56:05][c0-0c0s3n2]0-
abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqr
[2007-01-09 16:56:06][c0-0c0s3n2]0- received final app termination, pid=5
[2007-01-09 16:56:08][c0-0c0s3n2]0- ******* _cstart2(), yod_pid=30162 rank=0
lognid=0 physnid=0xe pid=2
[2007-01-09 16:56:08][c0-0c0s3n2]0-
abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqr
On the second run we are missing the very last "s".
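
The arithmetic behind that missing "s" can be sketched in a few lines of C.
This is only a model of the behaviour observed above, not the real lputs()
implementation: the names CONSOLE_LIMIT, PREFIX_COST, visible_len(),
dropped_len() and model_console_line() are made up for illustration, and the
two constants are simply the 256-char limit and the 4 chars attributed to the
leading "0- " in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed values, taken from the observations in this message. */
#define CONSOLE_LIMIT 256  /* chars lputs() will actually print per call */
#define PREFIX_COST     4  /* chars attributed to the leading "0- "      */

/* Number of message chars that would reach the console. */
static size_t visible_len(size_t msg_len)
{
    size_t room = CONSOLE_LIMIT - PREFIX_COST;   /* 252 */
    return msg_len < room ? msg_len : room;
}

/* Number of message chars that would be silently dropped. */
static size_t dropped_len(size_t msg_len)
{
    return msg_len - visible_len(msg_len);
}

/*
 * Copy the part of msg that would survive into out (out_size bytes,
 * including the terminating NUL). Returns the number of chars copied.
 */
static size_t model_console_line(const char *msg, char *out, size_t out_size)
{
    size_t n;

    assert(msg != NULL && out != NULL && out_size > 0);
    n = visible_len(strlen(msg));
    if (n > out_size - 1)
        n = out_size - 1;
    memcpy(out, msg, n);
    out[n] = '\0';
    return n;
}
```

Under these assumptions, visible_len(252) is 252 and visible_len(253) is also
252, which matches the two runs above. A message that filled a buf[4047]
(4046 chars plus the NUL) would lose dropped_len(4046) = 3794 chars.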
So -- at best we are going to get badly truncated error messages from Lustre,
which is enough for me to light this on fire & hand it back to you folks :)