Loading...

XML

Word

Printable

Details

Type: Bug
Resolution: Unresolved
Priority: Major
Fix Version/s: None
Affects Version/s: Lustre 2.10.8
Labels:
None
Environment:
Clients: 2.12.0, CentOS 7.6

Severity:
3
Rank (Obsolete):
9223372036854775807

Description

LBUG today on oak-MDT0000, never seem this one before. We have had some big data transfers using dsync going on on Sherlock (2.12.0 clients). Might be related, or not.

[4954375.921845] LustreError: 15102:0:(tgt_handler.c:628:process_req_last_xid()) @@@ Unexpected xid 5d6425ffe4140 vs. last_xid 5d6425ffe418f
  req@ffffa1597f41f200 x1642955450237248/t0(0) o101->98bbe778-4f70-8a89-d80e-d6a8120c693b@10.8.2.23@o2ib6:663/0 lens 736/0 e 0 to 0 dl 1567111883 ref 1 fl Interpret:/2/ffffffff rc 0/-1
[4954542.487326] LustreError: 15290:0:(mdt_lib.c:961:mdt_attr_valid_xlate()) Unknown attr bits: 0x60000
[4954542.517377] LustreError: 15290:0:(mdt_lib.c:961:mdt_attr_valid_xlate()) Skipped 3754300 previous similar messages
[4954874.316190] LustreError: 15347:0:(lod_object.c:3919:lod_ah_init()) ASSERTION( !lod_obj_is_striped(child) ) failed: 
[4954874.351112] LustreError: 15347:0:(lod_object.c:3919:lod_ah_init()) LBUG
[4954874.373452] Pid: 15347, comm: mdt01_049 3.10.0-862.14.4.el7_lustre.x86_64 #1 SMP Mon Oct 8 11:21:37 PDT 2018
[4954874.406359] Call Trace:
[4954874.414973]  [<ffffffffc08af7cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
[4954874.437035]  [<ffffffffc08af87c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[4954874.459664]  [<ffffffffc135a89f>] lod_ah_init+0x23f/0xde0 [lod]
[4954874.479751]  [<ffffffffc13d306b>] mdd_object_make_hint+0xcb/0x190 [mdd]
[4954874.502388]  [<ffffffffc13bed50>] mdd_create_data+0x330/0x730 [mdd]
[4954874.523606]  [<ffffffffc129140c>] mdt_mfd_open+0xc5c/0xe70 [mdt]
[4954874.544523]  [<ffffffffc1291b9b>] mdt_finish_open+0x57b/0x690 [mdt]
[4954874.565743]  [<ffffffffc1293478>] mdt_reint_open+0x17c8/0x3190 [mdt]
[4954874.587229]  [<ffffffffc1288cb3>] mdt_reint_rec+0x83/0x210 [mdt]
[4954874.607567]  [<ffffffffc126a19b>] mdt_reint_internal+0x5fb/0x9c0 [mdt]
[4954874.630197]  [<ffffffffc126a6c2>] mdt_intent_reint+0x162/0x430 [mdt]
[4954874.651677]  [<ffffffffc126d4cb>] mdt_intent_opc+0x1eb/0xaf0 [mdt]
[4954874.672619]  [<ffffffffc1275d68>] mdt_intent_policy+0x138/0x320 [mdt]
[4954874.694668]  [<ffffffffc0be82dd>] ldlm_lock_enqueue+0x38d/0x980 [ptlrpc]
[4954874.719320]  [<ffffffffc0c11c03>] ldlm_handle_enqueue0+0xa83/0x1670 [ptlrpc]
[4954874.743104]  [<ffffffffc0c977f2>] tgt_enqueue+0x62/0x210 [ptlrpc]
[4954874.764026]  [<ffffffffc0c9b72a>] tgt_request_handle+0x92a/0x1370 [ptlrpc]
[4954874.787245]  [<ffffffffc0c4404b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
[4954874.813872]  [<ffffffffc0c47792>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[4954874.835628]  [<ffffffff8babdf21>] kthread+0xd1/0xe0
[4954874.852252]  [<ffffffff8c1255f7>] ret_from_fork_nospec_end+0x0/0x39
[4954874.873448]  [<ffffffffffffffff>] 0xffffffffffffffff
[4954874.890366] Kernel panic - not syncing: LBUG

I do have a crash dump if you're interested. MDT failover was smooth so not a big deal:

Aug 29 14:04:49 oak-md1-s1 kernel: Lustre: oak-MDT0000: Recovery over after 0:55, of 1464 clients 1464 recovered and 0 were evicted.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

foreach_bt-crash-oak-md1-s2-2019-08-31-18-02-59.log
03/Sep/19 4:24 AM
1.28 MB
Stephane Thiell
vmcore-dmesg-oak-md1-s2-2019-08-31-18-02-59.txt
03/Sep/19 4:26 AM
692 kB
Stephane Thiell

Activity

People

Assignee:: Lai Siyao

Reporter:: Stephane Thiell

Votes:: 0 Vote for this issue

Watchers:: 7 Start watching this issue

Dates

Created:: 29/Aug/19 9:24 PM

Updated:: 30/Apr/20 11:28 PM