I am having the exact same problems with my Linux box.

I have a 486, 24 MB of RAM, and a RAID 1 partition. The box runs dhcp,
xinetd, telnetd, and samba. It currently only serves the RAID 1
partition via samba to a single Win98 box. Under any large file
transfer to the Linux box, regardless of which machine initiates the
transfer, the kernel panics and the box dies. So far, I have tried
running many kernels, upgrading to a different glibc and a rawhide
kernel, and I even scrapped RH and built an LFS partition. I have also
used multiple versions of samba, and I have tried using 3Com's driver
instead of the kernel driver. Nothing has fixed this problem. Recently,
I thought the network load was filling the physical memory, because the
error was usually an "Unable to handle kernel virtual memory paging
request"; however, I have tried both stock kernels and RH kernels with
the rmap VM system, and both crash. I also increased
/proc/sys/vm/freepages to such large numbers that there was always over
a megabyte of free memory while a file transfer was running, but the
box still crashed.

The stack dump used to come from many different programs, but since I
increased freepages it seems to always come from the 3c59x module. I
have tried 3Com's driver instead. It held up better under load, but
still crashed after about 15 minutes of file transfer via samba.

I do not believe this is a RAID problem, as I get the same problem with
RAID turned off and sharing a regular non-RAID directory with samba.

I now feel that, because two separate drivers fail, this must be some
issue with the hardware. As I have just found this problem reported
here, and it seems you have a 3Com card in the same family as mine, I
am now very confident of that. If you search the mailing list archives
for the 3c59x driver, you will find this problem many times over. I
don't yet know the solution, but, if you can, you might try a different
network card. I am going to swap the NICs between my Linux box and my
Win98 box and see how that goes. Hopefully it's not an issue between
the two cards.

Hi TJ,

Thanks for all that input. I am not yet very familiar with the
internals of the kernel or with the tools used to diagnose this.

I also noticed that the 3c59x module was constantly showing up in the
oops and other messages reported by the kernel. Since that machine has
two network cards, I switched to the other one, and I do have fewer
problems. It is a VIA network card (via-rhine kernel driver). I will
have to run a few stress tests with this card, as I had other problems
while recovering from the messed-up file system.

Is there any work going on for the 3c59x driver to fix these problems?
Or are these network cards considered obsolete?

Pascal

TJ wrote:

> I am having the exact same problems with my linux box.
>
> I have a 486, 24Mb of ram, and a RAID 1 partition. The box runs
> dhcp, xinetd, telnetd, and samba. It currently only serves the RAID 1
> partition via samba to a single Win98 box.
> Under any large file transfer to the linux box, regardless of which
> machine initiates the x-fer, the linux box kernel panics and dies.
> Currently, I have tried running many kernels, upgrading to a different
> glibc and rawhide kernel, and even scrapped RH and made an LFS
> partition. I have also used multiple versions of samba and I have
> tried using 3com's driver instead of the kernel driver. Nothing has
> fixed this problem. Recently, I thought the network load was filling
> the physical memory because the error was usually a "Unable to handle
> kernel virtual memory paging request," however, I have tried both
> stock kernels and RH kernels with the rmap VM system and both crash.
> I also increased /proc/sys/vm/freepages to such large numbers that
> there was always over a meg of free mem while a file transfer was
> going, but the box still crashed.
>
> The stack dump used to be from many different programs, but, after I
> increased freepages, it seems to always be from the 3c59x module. I
> have tried 3com's driver instead. It held up better under load, but
> still crashed after about 15 minutes of file transfer via samba.
>
> I do not believe this is a RAID problem as I get the same problem
> with RAID turned off and sharing a regular non-RAID directory with
> samba.
>
> I now feel that, because two separate drivers fail, this must be
> some issue with the hardware. As I just found this problem and it
> seems as if you have 3com card in the same family as mine, I am very
> confident now. If you search the mail list archives for the 3c59x
> driver, you will find this problem many times over. I don't yet know
> the solution, but, if you can, you might try a different networking
> card. I am going to switch out the NIC cards between my Linux box and
> Win98 box and see how that does. Hopefully it's not an issue between
> the two cards.

On Feb 28, 2002 23:07 -0500, TJ wrote:
> I have a 486, 24Mb of ram, and a RAID 1 partition. The box runs
> dhcp, xinetd, telnetd, and samba. It currently only serves the RAID 1
> partition via samba to a single Win98 box. Under any large file
> transfer to the linux box, regardless of which machine initiates the
> x-fer, the linux box kernel panics and dies.

> The stack dump used to be from many different programs, but, after
> I increased freepages, it seems to always be from the 3c59x module. I
> have tried 3com's driver instead. It held up better under load, but
> still crashed after about 15 minutes of file transfer via samba.

> I now feel that, because two separate drivers fail, this must be
> some issue with the hardware. As I just found this problem and it
> seems as if you have 3com card in the same family as mine, I am very
> confident now. If you search the mail list archives for the 3c59x
> driver, you will find this problem many times over. I don't yet know
> the solution, but, if you can, you might try a different networking
> card. I am going to switch out the NIC cards between my Linux box and
> Win98 box and see how that does. Hopefully it's not an issue between
> the two cards.

It is good to hear that you have tracked down this problem. Just a few
helpful hints for future bug reports and postings to Linux mailing
lists:

1) You didn't mention the kernel version anywhere in your email. This
   makes it very hard for anyone to know whether the problem could have
   been fixed in a more recent kernel. In general, you shouldn't report
   a bug unless you are running the most recent stable kernel.

2) If you are getting a kernel oops (what you call a panic or stack
   dump), then you need to take this, run it through the ksymoops
   program, and include the decoded output with your bug report.

3) The ext3 mailing list is not the right place to post this. Not once
   in your email did you mention that ext3 is in use, and since you
   found the problem to be in a network driver, you should probably
   post it to a network-related list and/or send it to the maintainers
   of this driver.

4) Please use the <enter> key at the end of each line, or set up your
   mail program to insert linefeeds after 72 characters or so. This
   makes it much easier for others to quote only part of your email.

5) If you have found that there is a common problem in the oops
   messages, it may be useful to include some of the URLs of these
   other bug reports. This makes it easier for the person trying to fix
   the problem to see if there are patterns in the problems. Since you
   have already done the work to find these other reports, you may as
   well save another person the effort of doing the same.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/
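
As a rough illustration of hints 1 and 4 above, here is a minimal Python
sketch of how a report could be prepared before posting. It is not
something anyone in this thread used: the function names
(report_header, wrap_body, quote) are hypothetical helpers, and the
72-column width is simply the figure suggested in hint 4. It only
relies on the standard library modules platform and textwrap.

```python
import platform
import textwrap


def report_header() -> str:
    """Return a one-line summary of the running kernel (hint 1)."""
    u = platform.uname()
    # On Linux, u.release is the kernel version string reported by uname.
    return f"Kernel: {u.system} {u.release} ({u.machine})"


def wrap_body(body: str, width: int = 72) -> str:
    """Re-wrap each blank-line-separated paragraph at `width` columns (hint 4)."""
    wrapped = []
    for paragraph in body.split("\n\n"):
        # Collapse existing whitespace, then re-fill the paragraph.
        # Long tokens such as URLs are left unbroken, so a line holding
        # one may exceed `width`.
        text = " ".join(paragraph.split())
        wrapped.append(
            textwrap.fill(
                text,
                width=width,
                break_long_words=False,
                break_on_hyphens=False,
            )
        )
    return "\n\n".join(wrapped)


def quote(text: str, prefix: str = "> ") -> str:
    """Prefix every line with a quote marker, as in an email reply."""
    return "\n".join((prefix + line).rstrip() for line in text.splitlines())


if __name__ == "__main__":
    body = (
        "Under any large file transfer to the Linux box, regardless of "
        "which machine initiates the transfer, the kernel panics.\n\n"
        "The stack dump seems to always come from the 3c59x module."
    )
    print(report_header())
    print()
    print(wrap_body(body))
    print()
    print(quote(wrap_body(body, width=70)))
```

The wrapped text can then be pasted into the mail body, with the decoded
oops output from hint 2 appended unwrapped so that its columns stay
intact.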