Details
-
Bug
-
Resolution: Fixed
-
Blocker
-
Lustre 2.9.0, Lustre 2.10.3
-
None
-
Lustre-master 2.8.56_68_gd4a4c07 build 3430, RHEL7, Spirit performance cluster Revision: d4a4c0795b3befb87d47a5bf441adeba3b1c36f8
-
3
Description
Attempted an IOR test on both sets of OSTs.
The test made no progress.
The client and OST logs are full of:
OST
LustreError: 5245:0:(events.c:449:server_bulk_callback()) event type 3, status -113, desc ffff88104dc53200
LustreError: 9071:0:(events.c:449:server_bulk_callback()) event type 3, status -103, desc ffff880b0c956200
LustreError: 5245:0:(events.c:449:server_bulk_callback()) event type 5, status -113, desc ffff880b13e95200
LustreError: 5245:0:(events.c:449:server_bulk_callback()) event type 3, status -113, desc ffff880b13e95200
LustreError: 5245:0:(events.c:449:server_bulk_callback()) event type 5, status -113, desc ffff881032b5a800
LustreError: 5245:0:(events.c:449:server_bulk_callback()) event type 3, status -113, desc ffff881032b5a800
Client
[12378.313197] LustreError: 3138:0:(events.c:203:client_bulk_callback()) event type 1, status -5, desc ffff8807442f9c00
[12378.313223] LustreError: 3136:0:(events.c:203:client_bulk_callback()) event type 1, status -5, desc ffff8807442f9800
[12378.337649] LustreError: 3136:0:(events.c:203:client_bulk_callback()) event type 1, status -5, desc ffff8810102c4200
[12378.337671] LustreError: 3136:0:(events.c:203:client_bulk_callback()) event type 1, status -5, desc ffff880f55547400
[12378.337677] LustreError: 3138:0:(events.c:203:client_bulk_callback()) event type 1, status -5, desc ffff880729970000
[12378.362167] LustreError: 3137:0:(events.c:203:client_bulk_callback()) event type 1, status -5, desc ffff880804b78800
[12378.374324] LustreError: 3137:0:(events.c:203:client_bulk_callback()) event type 1, status -5, desc ffff8807299e5200
[12378.410886] LustreError: 3137:0:(events.c:203:client_bulk_callback()) event type 1, status -5, desc ffff880f53ce1c00
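For triage, the bulk-callback messages on both sides can be tallied by callback, event type, and status. The sketch below is only an illustration: the function name `summarize` and the regular expression are assumptions made for this example, not part of Lustre. The status values are negated Linux errno codes (5 is EIO, 103 is ECONNABORTED, 113 is EHOSTUNREACH); the event type numbers are left undecoded.

```python
import re
from collections import Counter

# Linux errno values for the statuses seen in this report.
LINUX_ERRNO = {5: "EIO", 103: "ECONNABORTED", 113: "EHOSTUNREACH"}

# Matches e.g. "(events.c:449:server_bulk_callback()) event type 3, status -113".
# The pattern is an assumption based on the log lines quoted above.
PATTERN = re.compile(
    r"\(events\.c:\d+:(\w+)\(\)\) event type (\d+), status (-?\d+)"
)

def summarize(log_text):
    """Count (callback, event type, errno name) triples in a log dump."""
    counts = Counter()
    for callback, ev_type, status in PATTERN.findall(log_text):
        code = abs(int(status))
        # Fall back to the bare number for codes not in the small table.
        name = LINUX_ERRNO.get(code, str(code))
        counts[(callback, int(ev_type), name)] += 1
    return counts

# Three lines copied from the OST and client logs in this report.
sample = (
    "LustreError: 5245:0:(events.c:449:server_bulk_callback()) "
    "event type 3, status -113, desc ffff88104dc53200 "
    "LustreError: 9071:0:(events.c:449:server_bulk_callback()) "
    "event type 3, status -103, desc ffff880b0c956200 "
    "[12378.313197] LustreError: 3138:0:(events.c:203:client_bulk_callback()) "
    "event type 1, status -5, desc ffff8807442f9c00"
)

for key, n in sorted(summarize(sample).items()):
    print(key, n)
```

Because the pattern does not depend on line breaks, it works on the raw console dump as well as on a log with one message per line.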
Eventually, one client hits an LBUG:
[12377.942289] LustreError: 3153:0:(niobuf.c:319:ptlrpc_register_bulk()) ASSERTION( desc->bd_md_count == 0 ) failed:
[12378.564210] LustreError: 3153:0:(niobuf.c:319:ptlrpc_register_bulk()) LBUG
[12378.564210] LustreError: 3153:0:(niobuf.c:319:ptlrpc_register_bulk()) LBUG
[12378.571890] Pid: 3153, comm: ptlrpcd_01_02
[12378.576468] Call Trace:
[12378.580868] [<ffffffffa08557d3>] libcfs_debug_dumpstack+0x53/0x80 [libcfs]
[12378.588652] [<ffffffffa0855d75>] lbug_with_loc+0x45/0xc0 [libcfs]
[12378.595597] [<ffffffffa0c3e661>] ptlrpc_register_bulk+0x831/0x9c0 [ptlrpc]
[12378.603389] [<ffffffffa08cace2>] ? LNetMDUnlink+0xe2/0x180 [lnet]
[12378.610322] [<ffffffffa0c6be76>] ? sptlrpc_import_sec_ref+0x36/0x40 [ptlrpc]
[12378.618321] [<ffffffffa0c3f1af>] ptl_send_rpc+0x1ff/0xda0 [ptlrpc]
[12378.625361] [<ffffffffa0c39256>] ptlrpc_check_set.part.23+0x1896/0x1dd0 [ptlrpc]
[12378.633743] [<ffffffffa0c397eb>] ptlrpc_check_set+0x5b/0xe0 [ptlrpc]
[12378.640976] [<ffffffffa0c643fb>] ptlrpcd_check+0x4eb/0x5e0 [ptlrpc]
[12378.648095] [<ffffffffa0c647ab>] ptlrpcd+0x2bb/0x560 [ptlrpc]
[12378.654613] [<ffffffff810b8910>] ? default_wake_function+0x0/0x20
[12378.661550] [<ffffffffa0c644f0>] ? ptlrpcd+0x0/0x560 [ptlrpc]
[12378.668067] [<ffffffff810a5b2f>] kthread+0xcf/0xe0
[12378.673515] [<ffffffff810a5a60>] ? kthread+0x0/0xe0
[12378.679058] [<ffffffff81646a98>] ret_from_fork+0x58/0x90
[12378.685086] [<ffffffff810a5a60>] ? kthread+0x0/0xe0
[12378.690626]
[12378.692396] Kernel panic - not syncing: LBUG
[12378.697164] CPU: 9 PID: 3153 Comm: ptlrpcd_01_02 Tainted: G OE ------------ 3.10.0-327.28.3.el7.x86_64 #1
[12378.709098] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.01.0002.082220131453 08/22/2013
[12378.720548] ffffffffa0872def 000000003210ab53 ffff881000c1ba80 ffffffff81636453
[12378.728852] ffff881000c1bb00 ffffffff8162fce7 ffffffff00000008 ffff881000c1bb10
[12378.737165] ffff881000c1bab0 000000003210ab53 ffffffffa0c90d30 0000000000000246
[12378.745463] Call Trace:
[12378.748204] [<ffffffff81636453>] dump_stack+0x19/0x1b
[12378.753939] [<ffffffff8162fce7>] panic+0xd8/0x1e7
[12378.759295] [<ffffffffa0855ddb>] lbug_with_loc+0xab/0xc0 [libcfs]
[12378.766235] [<ffffffffa0c3e661>] ptlrpc_register_bulk+0x831/0x9c0 [ptlrpc]
[12378.774009] [<ffffffffa08cace2>] ? LNetMDUnlink+0xe2/0x180 [lnet]
[12378.780949] [<ffffffffa0c6be76>] ? sptlrpc_import_sec_ref+0x36/0x40 [ptlrpc]
[12378.788950] [<ffffffffa0c3f1af>] ptl_send_rpc+0x1ff/0xda0 [ptlrpc]
[12378.795981] [<ffffffffa0c39256>] ptlrpc_check_set.part.23+0x1896/0x1dd0 [ptlrpc]
[12378.804368] [<ffffffffa0c397eb>] ptlrpc_check_set+0x5b/0xe0 [ptlrpc]
[12378.811596] [<ffffffffa0c643fb>] ptlrpcd_check+0x4eb/0x5e0 [ptlrpc]
[12378.818725] [<ffffffffa0c647ab>] ptlrpcd+0x2bb/0x560 [ptlrpc]
[12378.825237] [<ffffffff810b8910>] ? wake_up_state+0x20/0x20
[12378.831492] [<ffffffffa0c644f0>] ? ptlrpcd_check+0x5e0/0x5e0 [ptlrpc]
[12378.838778] [<ffffffff810a5b2f>] kthread+0xcf/0xe0
[12378.844224] [<ffffffff810a5a60>] ? kthread_create_on_node+0x140/0x140
[12378.851510] [<ffffffff81646a98>] ret_from_fork+0x58/0x90
[12378.857534] [<ffffffff810a5a60>] ? kthread_create_on_node+0x140/0x140
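When reading a trace like the one above, it can help to drop the frames marked with `?`, which the kernel prints for return addresses found on the stack that the unwinder could not confirm. The sketch below is a minimal illustration; the function name `reliable_frames` and the regular expression are assumptions made for this example, and frames without a module tag are labelled "kernel" here purely for display.

```python
import re

# One stack frame: "[<addr>] [? ]symbol+0xoff/0xlen [module]".
# The module tag is optional; timestamps like "[12378.603389]" do not
# match it because they contain a dot.
FRAME = re.compile(
    r"\[<[0-9a-f]+>\] (\? )?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?"
)

def reliable_frames(trace_text):
    """Return (symbol, module) for frames not marked '?', innermost first."""
    frames = []
    for unreliable, symbol, module in FRAME.findall(trace_text):
        if unreliable:
            continue  # skip frames the kernel flagged as unconfirmed
        frames.append((symbol, module or "kernel"))
    return frames

# A few frames copied from the first call trace in this report.
sample = (
    "[12378.588652] [<ffffffffa0855d75>] lbug_with_loc+0x45/0xc0 [libcfs] "
    "[12378.595597] [<ffffffffa0c3e661>] ptlrpc_register_bulk+0x831/0x9c0 [ptlrpc] "
    "[12378.603389] [<ffffffffa08cace2>] ? LNetMDUnlink+0xe2/0x180 [lnet] "
    "[12378.618321] [<ffffffffa0c3f1af>] ptl_send_rpc+0x1ff/0xda0 [ptlrpc] "
    "[12378.625361] [<ffffffffa0c39256>] ptlrpc_check_set.part.23+0x1896/0x1dd0 [ptlrpc] "
    "[12378.668067] [<ffffffff810a5b2f>] kthread+0xcf/0xe0"
)

for symbol, module in reliable_frames(sample):
    print(f"{module:8s} {symbol}")
```

On the sample, this leaves the chain from `kthread` through `ptl_send_rpc` into `ptlrpc_register_bulk`, where the assertion fires.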
A vmcore is available on Spirit for analysis.
Issue Links
- is duplicated by
  - LU-11647 niobuf.c:330:ptlrpc_register_bulk()) ASSERTION( desc->bd_md_count == 0 ) failed: (Resolved)
  - LU-11692 lustre kernel panic - (niobuf.c:330:ptlrpc_register_bulk()) LBUG (Resolved)
- is related to
  - LU-7650 ko2iblnd map_on_demand can't negotitate when page sizes are different between nodes. (Resolved)