[LU-1702] LustreError: 3218:0:(mdt_open.c:1035:mdt_reconstruct_open()) LBUG Created: 02/Aug/12 Updated: 29/May/17 Resolved: 29/May/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.2.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Michael Di Domenico | Assignee: | Bob Glossman (Inactive) |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Redhat 6.2 x86_64 with infiniband |
||
| Severity: | 3 |
| Rank (Obsolete): | 4058 |
| Description |
|
Message from syslogd@metal at Aug 1 09:46:26 ...
Message from syslogd@metal at Aug 1 09:46:26 ...
Message from syslogd@metal at Aug 1 09:46:26 ...
Aug 1 09:46:26 metal kernel: LustreError: 3218:0:(mdt_open.c:1035:mdt_reconstruct_open()) ASSERTION ( (!(rc < 0) || (lustre_msg_get_transno(req->rq_repmsg) == 0)) ) failed:
Message from syslogd@metal at Aug 1 09:46:26 ...
Message from syslogd@metal at Aug 1 09:46:26 ...
Aug 1 09:46:26 metal kernel: LustreError: 3218:0:(mdt_open.c:1035:mdt_reconstruct_open()) LBUG
Message from syslogd@metal at Aug 1 09:46:26 ...
kernel:Kernel panic - not syncing: LBUG |
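Read as an implication, the failed assertion says that a negative rc must come with a zero transaction number in the reply. The following is a minimal Python restatement of that condition only. The name reply_invariant_holds is hypothetical, and plain integers stand in for rc and for the value returned by lustre_msg_get_transno(); it is not Lustre code and says nothing about why the condition was violated here.

```python
def reply_invariant_holds(rc: int, transno: int) -> bool:
    """Restate the asserted condition: !(rc < 0) || (transno == 0).

    Hypothetical helper for illustration; `transno` stands in for the
    value of lustre_msg_get_transno() on the reply message.
    """
    return not (rc < 0) or transno == 0
```

Under this reading, the LBUG corresponds to the one failing case: a negative rc together with a non-zero transno.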
| Comments |
| Comment by Peter Jones [ 07/Aug/12 ] |
|
Bob will look into this one |
| Comment by Bob Glossman (Inactive) [ 07/Aug/12 ] |
|
Michael, |
| Comment by Michael Di Domenico [ 07/Aug/12 ] |
|
The bug seems to be related to load on the machine caused by heavy scanning of the filesystem. I don't have an exact reproducer, but the machine has crashed several times during heavy IO periods. I'm not able to pull syslog/dmesg entries en masse from the system. If there's something specific you're looking for, I can search around. I do have Lustre logs in /tmp, but because they're binary, I am not able to remove them from the system. If you have commands that will let me extract the data you need from the log and print it, I can scan it back in and send it over.
| Comment by Bob Glossman (Inactive) [ 07/Aug/12 ] |
If the only thing stopping you from sending us lustre logs is that they are binary, you can convert them to human readable text with 'lctl df'. Would that permit you to send them? |
| Comment by Michael Di Domenico [ 08/Aug/12 ] |
|
Yes, converting them to ASCII definitely helps. However, the excerpt of lines during the kernel panic is 155k, so I'll need to pare that down before I can pull the data from the system. Can you tell me if there are recurring lines that I can grep out? I'll ask for the whole file, but I suspect it'll get declined since the result is currently 37MB.
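One way to pare down a converted text log is to keep only the lines that mention the failing code path, plus a few lines of context before each hit. The sketch below assumes the log has already been converted to text with 'lctl df'. The function name filter_log_lines and the marker strings are illustrative choices, not something requested in this ticket, and the markers that actually matter would need to be confirmed by whoever is analysing the log.

```python
from collections import deque
from typing import Iterable, Iterator, Sequence


def filter_log_lines(
    lines: Iterable[str],
    markers: Sequence[str],
    context: int = 0,
) -> Iterator[str]:
    """Yield lines containing any marker, plus `context` lines before each hit.

    Hypothetical helper: plain substring matching, no assumptions about
    the layout of individual log fields.
    """
    # Holds up to `context` non-matching lines seen since the last hit.
    recent: deque = deque(maxlen=context)
    for line in lines:
        if any(marker in line for marker in markers):
            # Emit the buffered context first, then the matching line.
            yield from recent
            recent.clear()
            yield line
        else:
            recent.append(line)


def filter_log_file(path: str, markers: Sequence[str], context: int = 0) -> list:
    """Apply filter_log_lines to a text file and return the kept lines."""
    with open(path, errors="replace") as handle:
        return list(filter_log_lines(handle, markers, context))
```

For example, calling filter_log_file on the converted log with the markers "mdt_reconstruct_open" and "LBUG" and a context of 5 would keep each matching line and up to five lines preceding it, which may be small enough to transcribe.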
| Comment by Colin Faber [X] (Inactive) [ 29/Aug/12 ] |
|
Hi, we've experienced this as well and believe it to be related more to recovery issues than to general load. I'll try to get some more data together and post it here so this ticket is more complete. -cf
| Comment by James Beal [ 26/Sep/14 ] |
|
I have seen this on 2.2 |
| Comment by Andreas Dilger [ 29/May/17 ] |
|
Close old ticket. |