[LU-914] Client panic on ptlrpc_free_req() Created: 12/Dec/11 Updated: 07/Apr/12 Resolved: 05/Apr/12 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.1.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | Marek Magrys | Assignee: | WC Triage |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Lustre 2.1RC2 on servers, mix of 2.1 and 2.1RC2 on clients. |
||
| Attachments: |
|
| Severity: | 3 |
| Rank (Obsolete): | 6506 |
| Description |
|
Some of our clients die with: We do not have a reproducer and probably will not get one, as the LBUG is triggered by many different kinds of binaries. I've attached panic logs from three clients; we have had at least 5 crashes caused by this bug so far. |
| Comments |
| Comment by Oleg Drokin [ 08/Feb/12 ] |
|
Hm, is this still an issue for you? Can you reproduce it with the dlmtrace and dentry debug levels added and collect a Lustre debug log, please? |
| Comment by Marek Magrys [ 08/Feb/12 ] |
|
The issue hasn't hit us for a long time now (since we opened the ticket), so I guess you can close it for now, and if it strikes again I'll ask for it to be reopened. We don't have a reproducer, so for now I don't think we can do anything here. |
| Comment by Andreas Dilger [ 05/Apr/12 ] |
|
Closing per last comment that it cannot be reproduced. |
| Comment by Marek Magrys [ 06/Apr/12 ] |
|
Today it struck again on our login node: Servers are on 2.1, clients on 2.1.1, both Scientific Linux 5. One of our users claims that his 'grep' might have caused the crash, which would be more than odd. However, I'm not sure whether you should reopen this bug, as we still don't have a reproducer. |
| Comment by Oleg Drokin [ 06/Apr/12 ] |
|
Do you have kernel crashdumping installed and set up? Can you print the request content and backtrace? |
| Comment by Marek Magrys [ 07/Apr/12 ] |
|
No, we don't, but we'll enable crashdumps on our login node. I don't have any detailed logs, as for some reason there's no /tmp/lustre-log file. I will try to set some more verbose debugging options here. Do you have any tips on how to obtain as much information as possible without heavily affecting performance? |
| Comment by Oleg Drokin [ 07/Apr/12 ] |
|
Unfortunately, extensive debugging will slow things down. But having reliably working crashdumps is always a very good idea and will not affect your performance (other than eating a bit of RAM for the crash kernel image). |