Hello,

As I wrote in #11742 [1], I experienced a kernel panic on the MDS after
doing heavy I/O on the 1.6.5rc2 cluster. Since nobody has answered that
bug so far (and I think in other cases the Lustre team is _really_ fast
(thanks for that :))), I fear it has not been noticed by anybody.

This kernel panic seems to be somehow related to the bug mentioned above
(#11742), as that bug number shows up in the dmesg output when the node
died. Furthermore, right before it started to fail there were several
messages like the following:

LustreError: 3342:0:(osc_request.c:678:osc_announce_cached()) dirty 81108992 > dirty_max 33554432

This behaviour is described in #13344 [2].

Any ideas?

Greetings
Patrick Winnertz

[1]: https://bugzilla.lustre.org/show_bug.cgi?id=11742
[2]: https://bugzilla.lustre.org/show_bug.cgi?id=13344

-- 
Patrick Winnertz
Tel.: +49 (0) 2161 / 4643 - 0

credativ GmbH, HRB Mönchengladbach 12080
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz
On May 16, 2008 12:45 +0200, Patrick Winnertz wrote:
> As I wrote in #11742 [1], I experienced a kernel panic on the MDS after
> doing heavy I/O on the 1.6.5rc2 cluster. Since nobody has answered that
> bug so far (and I think in other cases the Lustre team is _really_ fast
> (thanks for that :))), I fear it has not been noticed by anybody.
>
> This kernel panic seems to be somehow related to the bug mentioned above
> (#11742), as that bug number shows up in the dmesg output when the node
> died. Furthermore, right before it started to fail there were several
> messages like the following:
>
> LustreError: 3342:0:(osc_request.c:678:osc_announce_cached()) dirty
> 81108992 > dirty_max 33554432
>
> This behaviour is described in #13344 [2].

Sorry, I don't have net access right now, so I can't see your comments
in the bug, but the above message is definitely unusual and an
indication of some kind of code bug.

The client imposes a limit on the amount of dirty data that it can
cache (in /proc/fs/lustre/osc/*/max_dirty_mb, default 32MB), on a
per-OST basis. This ensures that in case of lock cancellation there
isn't 5TB of dirty data on the client that would take 30 minutes to
flush to the OST. It seems that either the accounting of the number of
dirty pages on the client has gone badly wrong, or the client has
actually dirtied far more data (80MB) than it should have (32MB).

Could you please explain the type of IO that the client is doing? Is
this normal write(), or writev(), pwrite(), O_DIRECT, mmap, other?
Were there IO errors, or IO resends, or some other unusual problem?
The entry points for this IO into Lustre are all slightly different,
and it wouldn't be the first time there was an accounting error
somewhere.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
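As an aside on the tunable Andreas mentions: the per-OST limit can be
read back from the /proc path given in his mail. Below is a minimal
standalone sketch in plain C (illustrative only, not Lustre code; it
assumes a client node with the Lustre /proc tree present, and the file
name and the use of glob() are just for the example):

/*
 * check_dirty_limits.c - illustrative only, not part of Lustre.
 * Print the per-OST max_dirty_mb tunable that the OSC uses to cap
 * the amount of dirty data cached on the client.  The /proc path
 * below is the one given in Andreas' mail.
 */
#include <glob.h>
#include <stdio.h>

int main(void)
{
        glob_t g;
        size_t i;

        if (glob("/proc/fs/lustre/osc/*/max_dirty_mb", 0, NULL, &g) != 0) {
                fprintf(stderr, "no OSC devices found - is Lustre mounted?\n");
                return 1;
        }

        for (i = 0; i < g.gl_pathc; i++) {
                FILE *f = fopen(g.gl_pathv[i], "r");
                char buf[64];

                if (f == NULL)
                        continue;
                /* each file contains a single number: the limit in MB */
                if (fgets(buf, sizeof(buf), f) != NULL)
                        printf("%s: %s", g.gl_pathv[i], buf);
                fclose(f);
        }

        globfree(&g);
        return 0;
}

Comparing the value it prints (in MB, default 32) with the byte counts
in the LustreError line quoted above shows how far past the limit the
client had gone: 81108992 bytes is roughly 77MB against a 32MB cap.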
Hello!

On May 16, 2008, at 6:45 AM, Patrick Winnertz wrote:
> As I wrote in #11742 [1], I experienced a kernel panic on the MDS after
> doing heavy I/O on the 1.6.5rc2 cluster. Since nobody has answered that
> bug so far (and I think in other cases the Lustre team is _really_ fast
> (thanks for that :))), I fear it has not been noticed by anybody.

I just looked into the logs; you have out-of-memory issues, at the very
least, during that I/O, and also checksum errors. The log you uploaded
does not contain the actual crash info, only these messages (which do
not cause a crash by themselves), followed by OOM and checksum error
messages.

I do not see any panic messages in your logs. Any chance you have a
serial console or some other way to capture the actual panic, complete
with a stack trace and other useful info (ideally a crashdump)?

Bug 11742, which was referenced, is just related to checksum error
problems.

How do you do your heavy I/O? Just regular writes, or mmap writes, or
something else?

Bye,
    Oleg
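Only as a sketch, since the exact setup depends on the distribution and
hardware: a serial console for catching panics like this is usually
enabled by booting the kernel with parameters along the lines of
console=tty0 console=ttyS0,115200 and attaching another machine or a
terminal server to the serial port. The port name and speed here are
assumptions, not details of Patrick's setup, and whether a crashdump
mechanism (e.g. kdump) is also available depends on the kernel in use.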
Hello Andreas,

Sorry for my late answer.

> Could you please explain the type of IO that the client is doing? Is
> this normal write(), or writev(), pwrite(), O_DIRECT, mmap, other?
> Were there IO errors, or IO resends, or some other unusual problem?
> The entry points for this IO into Lustre are all slightly different,
> and it wouldn't be the first time there was an accounting error
> somewhere.

I'm doing the I/O with fsstress. What kind of I/O it generates exactly
I don't know, but if you like I can publish the sources on our company
server so that you can download it in order to test and debug this
better.

Greetings
Patrick Winnertz
-- 
Patrick Winnertz
Tel.: +49 (0) 2161 / 4643 - 0

credativ GmbH, HRB Mönchengladbach 12080
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz
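For context on what kind of I/O that implies: fsstress (the stress tool
available in LTP and the xfstests suite) issues a pseudo-random mix of
data and metadata operations (creates, renames, unlinks, truncates,
plain read()/write(), and in some versions direct I/O), so it can hit
several of the entry points Andreas listed. A typical invocation looks
something like

    fsstress -d /mnt/lustre/testdir -p 4 -n 10000 -s 1234

where -d is the working directory, -p the number of processes, -n the
number of operations per process, and -s the random seed; the values
shown here are placeholders, not the ones Patrick actually used.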
Hello Oleg,

Sorry for this late answer...

> I do not see any panic messages in your logs. Any chance you have a
> serial console or some other way to capture the actual panic, complete
> with a stack trace and other useful info (ideally a crashdump)?

After the MGS crashes there is no chance to get a console there; the
only possible action is to hit the reset button and reboot the server.
But I'll try to create a coredump of the kernel and send it to you or
place it somewhere else where you can download it. (Maybe a bug report?
If yes, which one? One of the two I mentioned, or a new one?)

> Bug 11742, which was referenced, is just related to checksum error
> problems.
> How do you do your heavy I/O? Just regular writes, or mmap writes, or
> something else?

I'm using fsstress for the I/O tests. I'll send it to you.

Greetings
Patrick Winnertz
-- 
Patrick Winnertz
Tel.: +49 (0) 2161 / 4643 - 0

credativ GmbH, HRB Mönchengladbach 12080
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz
Hello!

On May 27, 2008, at 5:20 AM, Patrick Winnertz wrote:
>> I do not see any panic messages in your logs. Any chance you have a
>> serial console or some other way to capture the actual panic,
>> complete with a stack trace and other useful info (ideally a
>> crashdump)?
>
> After the MGS crashes there is no chance to get a console there; the
> only possible action is to hit the reset button and reboot the server.
> But I'll try to create a coredump of the kernel and send it to you or
> place it somewhere else where you can download it. (Maybe a bug
> report? If yes, which one? One of the two I mentioned, or a new one?)

Thanks. No need to send the kernel crashdump to us. Just obtain a
backtrace of the crashed thread and post an email with that (and the
panic message) to lustre-discuss, and also indicate that you have a
complete crashdump. We'll ask you for more details if we need anything
else.

>> Bug 11742, which was referenced, is just related to checksum error
>> problems.
>> How do you do your heavy I/O? Just regular writes, or mmap writes,
>> or something else?
>
> I'm using fsstress for the I/O tests. I'll send it to you.

Ok, thanks.

Bye,
    Oleg