Details
-
Bug
-
Resolution: Fixed
-
Blocker
-
None
-
Lustre 2.4.0
-
Lustre 2.3.56-2chaos (github.com/chaos/lustre), includes new unstable-page limiting patches
-
3
-
5628
Description
Hit the following while running ior at large scale on Sequoia.
98304 tasks, command line:
ior -F -e -g -C -t 1m -b 512m -o /p/lsfull/morrone/f
2012-11-20 13:17:16.901289 {DefaultControlEventListener} [mmcs]{623}.13.1: Lustre: lsfull-MDT0000-mdc-c0000003ea31d400: Connection to lsfull-MDT0000 (at 172.20.5.1@o2ib500) was lost; in progress operations using this service will wait for recovery to complete
2012-11-20 13:17:16.948873 {DefaultControlEventListener} [mmcs]{623}.1.0: Lustre: lsfull-MDT0000-mdc-c0000003ea31d400: Connection restored to lsfull-MDT0000 (at 172.20.5.1@o2ib500)
2012-11-20 13:17:17.841116 {DefaultControlEventListener} [mmcs]{623}.4.1: LustreError: 3722:0:(client.c:2250:__ptlrpc_free_req()) ASSERTION( cfs_list_empty(&request->rq_list) ) failed: req c0000003edfad400
2012-11-20 13:17:17.882162 {DefaultControlEventListener} [mmcs]{623}.4.1: LustreError: 3722:0:(client.c:2250:__ptlrpc_free_req()) LBUG
2012-11-20 13:17:17.920585 {DefaultControlEventListener} [mmcs]{623}.4.1: Call Trace:
2012-11-20 13:17:17.959831 {DefaultControlEventListener} [mmcs]{623}.4.1: [c0000003ee0338d0] [c000000000008160] .show_stack+0x7c/0x184 (unreliable)
2012-11-20 13:17:17.998895 {DefaultControlEventListener} [mmcs]{623}.4.1: [c0000003ee033980] [8000000000a70cb8] .libcfs_debug_dumpstack+0xd8/0x150 [libcfs]
2012-11-20 13:17:18.039425 {DefaultControlEventListener} [mmcs]{623}.4.1: [c0000003ee033a30] [8000000000a71480] .lbug_with_loc+0x50/0xc0 [libcfs]
2012-11-20 13:17:18.042457 {DefaultControlEventListener} [mmcs]{623}.4.1: [c0000003ee033ac0] [8000000003a056c0] .__ptlrpc_req_finished+0x980/0xb60 [ptlrpc]
2012-11-20 13:17:18.082353 {DefaultControlEventListener} [mmcs]{623}.4.1: [c0000003ee033b80] [8000000006a13ac4] .ll_fsync+0x4e4/0xc50 [lustre]
2012-11-20 13:17:18.128703 {DefaultControlEventListener} [mmcs]{623}.4.1: [c0000003ee033c80] [c0000000000fc094] .vfs_fsync_range+0xb0/0x104
2012-11-20 13:17:18.133202 {DefaultControlEventListener} [mmcs]{623}.4.1: [c0000003ee033d30] [c0000000000fc18c] .do_fsync+0x3c/0x6c
2012-11-20 13:17:18.135223 {DefaultControlEventListener} [mmcs]{623}.4.1: [c0000003ee033dc0] [c0000000000fc1fc] .SyS_fsync+0x18/0x28
2012-11-20 13:17:18.173757 {DefaultControlEventListener} [mmcs]{623}.4.1: [c0000003ee033e30] [c000000000000580] syscall_exit+0x0/0x2c
2012-11-20 13:17:18.211524 {DefaultControlEventListener} [mmcs]{623}.4.1: Kernel panic - not syncing: LBUG
2012-11-20 13:17:18.258280 {DefaultControlEventListener} [mmcs]{623}.4.1: Call Trace:
2012-11-20 13:17:18.262113 {DefaultControlEventListener} [mmcs]{623}.4.1: [c0000003ee0338f0] [c000000000008160] .show_stack+0x7c/0x184 (unreliable)
2012-11-20 13:17:18.263117 {DefaultControlEventListener} [mmcs]{623}.4.1: [c0000003ee0339a0] [c000000000432c0c] .panic+0x80/0x1a8
2012-11-20 13:17:18.284729 {DefaultControlEventListener} [mmcs]{623}.4.1: [c0000003ee033a30] [8000000000a714e0] .lbug_with_loc+0xb0/0xc0 [libcfs]
2012-11-20 13:17:18.285323 {DefaultControlEventListener} [mmcs]{623}.4.1: [c0000003ee033ac0] [8000000003a056c0] .__ptlrpc_req_finished+0x980/0xb60 [ptlrpc]
2012-11-20 13:17:18.287243 {DefaultControlEventListener} [mmcs]{623}.4.1: [c0000003ee033b80] [8000000006a13ac4] .ll_fsync+0x4e4/0xc50 [lustre]
2012-11-20 13:17:18.320182 {DefaultControlEventListener} [mmcs]{623}.4.1: [c0000003ee033c80] [c0000000000fc094] .vfs_fsync_range+0xb0/0x104
2012-11-20 13:17:18.359030 {DefaultControlEventListener} [mmcs]{623}.4.1: [c0000003ee033d30] [c0000000000fc18c] .do_fsync+0x3c/0x6c
2012-11-20 13:17:18.403154 {DefaultControlEventListener} [mmcs]{623}.4.1: [c0000003ee033dc0] [c0000000000fc1fc] .SyS_fsync+0x18/0x28
2012-11-20 13:17:18.439477 {DefaultControlEventListener} [mmcs]{623}.4.1: [c0000003ee033e30] [c000000000000580] syscall_exit+0x0/0x2c
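
For a sense of scale, the ior command line above can be turned into rough numbers. The sketch below is only a back-of-the-envelope estimate and rests on a few assumptions that the report does not state: that the `m` suffix means MiB, that ior uses its default of one segment per task (no `-s` was given), and that `-F` means one file per task. The helper name `ior_volume` is made up for illustration. Note also that `-e` asks ior to fsync after writing, which appears consistent with the `.ll_fsync` frame in the trace.

```python
# Rough sizing of the ior run from the report.
# Assumptions (not stated in the report): "m" means MiB, and ior uses
# its default of one segment per task because -s was not given.
MIB = 1024 ** 2
TIB = 1024 ** 4

def ior_volume(tasks, block_size, transfer_size, segments=1):
    """Hypothetical helper: estimate I/O volume for a file-per-process run."""
    per_task = block_size * segments
    return {
        "files": tasks,                                   # -F: one file per task
        "transfers_per_task": per_task // transfer_size,  # write calls per task
        "bytes_per_task": per_task,
        "total_bytes": per_task * tasks,
    }

summary = ior_volume(
    tasks=98304,             # task count given in the report
    block_size=512 * MIB,    # -b 512m
    transfer_size=1 * MIB,   # -t 1m
)
print(summary["files"], "files,",
      summary["transfers_per_task"], "transfers per task,",
      summary["total_bytes"] / TIB, "TiB total")
```

Under these assumptions the write phase covers 98304 files of 512 MiB each, about 48 TiB in total, with each task issuing an fsync on its own file when it finishes writing.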