[LU-7332] LustreError: 201113:0:(osd_internal.h:1101:osd_trans_exec_check()) LBUG Created: 23/Oct/15 Updated: 18/May/16 Resolved: 26/Oct/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | Vinayak (Inactive) | Assignee: | WC Triage |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
We are hitting this issue very frequently while running
I am attaching the logs to this ticket.

LustreError: 201113:0:(osd_internal.h:1101:osd_trans_exec_check()) LBUG
Pid: 201113, comm: mdt03_000
Call Trace:
libcfs_debug_dumpstack+0x55/0x80 [libcfs]
lbug_with_loc+0x47/0xb0 [libcfs]
osd_xattr_set+0x5d8/0x6c0 [osd_ldiskfs]
? ldiskfs_xattr_inode_get+0xdb/0xf0 [ldiskfs]
lod_sub_object_xattr_set+0x223/0x460 [lod]
lod_xattr_set_internal+0x126/0x2b0 [lod]
lod_xattr_set+0x101/0x430 [lod]
? mdd_env_info+0x25/0x70 [mdd]
mdd_links_write+0x235/0x2e0 [mdd]
mdd_links_rename+0x312/0x620 [mdd]
mdd_link+0x104c/0x10f0 [mdd]
mdt_reint_link+0x9b1/0xb40 [mdt]
? mdt_root_squash+0x2c/0x3f0 [mdt]
? __req_capsule_get+0x162/0x6e0 [ptlrpc]
mdt_reint_rec+0x5d/0x200 [mdt]
mdt_reint_internal+0x62b/0xb80 [mdt]
mdt_reint+0x6b/0x120 [mdt]
tgt_request_handle+0x8bc/0x12e0 [ptlrpc]
ptlrpc_main+0xe41/0x1910 [ptlrpc]
? ptlrpc_main+0x0/0x1910 [ptlrpc]
kthread+0x96/0xa0
child_rip+0xa/0x20
? kthread+0x0/0xa0
? child_rip+0x0/0x20 |
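For readers triaging similar traces, the following is a minimal sketch (not part of the original report) of how a flattened call trace like the one above could be split into frames. The names `FRAME_RE` and `parse_frames` are hypothetical helpers introduced only for illustration. The sketch assumes the usual kernel frame shape `symbol+0xOFFSET/0xSIZE [module]`, where a leading `?` marks a frame the unwinder considers unreliable and the module suffix is absent for core-kernel symbols.

```python
import re

# One frame looks like "osd_xattr_set+0x5d8/0x6c0 [osd_ldiskfs]".
# A leading "?" marks a frame the kernel unwinder is not sure about.
# The "[module]" suffix is optional (e.g. "kthread+0x96/0xa0" has none).
FRAME_RE = re.compile(
    r"(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(trace):
    """Return one dict per frame found in a (possibly single-line) trace."""
    frames = []
    for m in FRAME_RE.finditer(trace):
        frames.append({
            "symbol": m.group("symbol"),
            "offset": int(m.group("offset"), 16),
            "size": int(m.group("size"), 16),
            "module": m.group("module"),  # None for core-kernel symbols
            "reliable": m.group("unreliable") is None,
        })
    return frames


# A shortened excerpt of the trace from this ticket.
TRACE = (
    "libcfs_debug_dumpstack+0x55/0x80 [libcfs] "
    "lbug_with_loc+0x47/0xb0 [libcfs] "
    "osd_xattr_set+0x5d8/0x6c0 [osd_ldiskfs] "
    "? ldiskfs_xattr_inode_get+0xdb/0xf0 [ldiskfs] "
    "lod_sub_object_xattr_set+0x223/0x460 [lod] "
    "kthread+0x96/0xa0 child_rip+0xa/0x20"
)

frames = parse_frames(TRACE)
# Modules of the reliable frames, top of stack first.
reliable_modules = [f["module"] for f in frames if f["reliable"] and f["module"]]
```

Filtering on `reliable` gives the call path libcfs, osd_ldiskfs, lod for this excerpt, which makes it easier to see that the LBUG was raised from `osd_xattr_set`.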
| Comments |
| Comment by Vinayak (Inactive) [ 23/Oct/15 ] |
|
Hello Andreas, Initially we thought that this issue was closely related to
We found this issue on the latest Intel master. Please also help me correct the Affects Version field. |
| Comment by Peter Jones [ 23/Oct/15 ] |
|
I am assuming that by "Latest Intel master" you mean the tip of the community tree master. |
| Comment by Vinayak (Inactive) [ 23/Oct/15 ] |
|
Yes Peter. I meant the same. Thanks, |
| Comment by Alex Zhuravlev [ 23/Oct/15 ] |
|
This is because of a huge LINKEA. Please try http://review.whamcloud.com/#/c/12412/ |
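As a rough illustration of how a link EA can become "huge", here is a minimal sketch that estimates its size from the number of hard links. It is not taken from this ticket or from the referenced patch. The byte sizes below are assumptions recalled from Lustre's on-disk link EA layout (a fixed header plus one record per link holding a record length, the parent FID, and the link name) and should be checked against the Lustre source before being relied on. `link_ea_bytes` is a hypothetical helper.

```python
# Assumed sizes, NOT confirmed by this ticket; verify against the Lustre headers.
LINK_EA_HEADER_BYTES = 24            # assumed fixed header size
LINK_EA_ENTRY_FIXED_BYTES = 2 + 16   # assumed: 2-byte record length + 16-byte parent FID


def link_ea_bytes(name_lengths):
    """Estimate link EA size for one file given the length of each link name."""
    return LINK_EA_HEADER_BYTES + sum(
        LINK_EA_ENTRY_FIXED_BYTES + n for n in name_lengths
    )


# One link with an 8-character name versus 1000 such links.
small = link_ea_bytes([8])
large = link_ea_bytes([8] * 1000)
```

Under these assumptions the attribute grows linearly with the link count, so a test that creates a very large number of hard links to one file would make each later `xattr_set` of the link EA progressively more expensive.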
| Comment by Vinayak (Inactive) [ 23/Oct/15 ] |
|
Thanks for pointing me to the solution, Alex. I will try it and let you know. |
| Comment by Andreas Dilger [ 23/Oct/15 ] |
|
Please report back on whether that patch fixes your problem, so that we can prioritize landing it. |
| Comment by Vinayak (Inactive) [ 24/Oct/15 ] |
|
Hello Andreas, Alex, We tried the patch, and sanity test_51e passes in the initial run on a 4-node setup (2 clients, 1 MDS, 1 OSS). I have submitted the test for a multi-run on the same setup and have also asked our testing team to verify the fix in the environment where the issue is reproducible (a production environment with 10+ nodes). I will update you as soon as I hear back from our testing team. Thanks, |
| Comment by Vinayak (Inactive) [ 26/Oct/15 ] |
|
The multi-run passed in all 100 test instances. |
| Comment by Peter Jones [ 26/Oct/15 ] |
|
Thanks! We'll close this out as a duplicate of |
| Comment by Vinayak (Inactive) [ 27/Oct/15 ] |
|
Sure, Peter. I will keep you posted on any updates. Thanks, |