Details
Bug
Resolution: Cannot Reproduce
Minor
None
Lustre 2.4.0
None
3
5196
Description
Hit this while running replay-single in a loop (with reformats in between):
Oct 12 17:26:34 centos6-1 kernel: [78449.889549] Lustre: DEBUG MARKER: == replay-single test 38: test recovery from unlink llog (test llog_gen_rec) == 17:26:34 (1350077194)
Oct 12 17:26:36 centos6-1 kernel: [78451.413482] Turning device loop0 (0x700000) read-only
Oct 12 17:26:36 centos6-1 kernel: [78451.438247] Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000
Oct 12 17:26:36 centos6-1 kernel: [78451.445453] Lustre: DEBUG MARKER: local REPLAY BARRIER on lustre-MDT0000
Oct 12 17:26:36 centos6-1 kernel: [78451.758180] Removing read-only on unknown block (0x700000)
Oct 12 17:26:42 centos6-1 kernel: [78457.516097] Lustre: 26795:0:(client.c:1909:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1350077176/real 1350077176] req@ffff880040aacbf0 x1415657775176175/t0(0) o38->lustre-MDT0000-osp-OST0000@0@lo:12/10 lens 400/544 e 0 to 1 dl 1350077202 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Oct 12 17:26:42 centos6-1 kernel: [78457.517268] Lustre: 26795:0:(client.c:1909:ptlrpc_expire_one_request()) Skipped 42 previous similar messages
Oct 12 17:26:46 centos6-1 kernel: [78462.119988] LDISKFS-fs (loop0): recovery complete
Oct 12 17:26:46 centos6-1 kernel: [78462.122225] LDISKFS-fs (loop0): mounted filesystem with ordered data mode. quota=on. Opts:
Oct 12 17:27:26 centos6-1 kernel: [78502.136534] LNet: Service thread pid 12732 was inactive for 40.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Oct 12 17:27:26 centos6-1 kernel: [78502.137174] Pid: 12732, comm: ll_mgs_0000
Oct 12 17:27:26 centos6-1 kernel: [78502.137349]
Oct 12 17:27:26 centos6-1 kernel: [78502.137349] Call Trace:
Oct 12 17:27:26 centos6-1 kernel: [78502.137662] [<ffffffff814fb054>] ? _spin_lock_irqsave+0x24/0x30
Oct 12 17:27:26 centos6-1 kernel: [78502.137869] [<ffffffff814f8ad1>] schedule_timeout+0x191/0x2e0
Oct 12 17:27:26 centos6-1 kernel: [78502.138069] [<ffffffff8107bcd0>] ? process_timeout+0x0/0x10
Oct 12 17:27:26 centos6-1 kernel: [78502.138289] [<ffffffffa116b550>] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc]
Oct 12 17:27:26 centos6-1 kernel: [78502.138639] [<ffffffffa0e94781>] cfs_waitq_timedwait+0x11/0x20 [libcfs]
Oct 12 17:27:26 centos6-1 kernel: [78502.138877] [<ffffffffa116f60d>] ldlm_completion_ast+0x48d/0x730 [ptlrpc]
Oct 12 17:27:26 centos6-1 kernel: [78502.139093] [<ffffffff81057d60>] ? default_wake_function+0x0/0x20
Oct 12 17:27:26 centos6-1 kernel: [78502.139300] [<ffffffff814faf3e>] ? _spin_unlock+0xe/0x10
Oct 12 17:27:26 centos6-1 kernel: [78502.139496] [<ffffffffa068a0d5>] mgs_completion_ast_config+0x55/0x140 [mgs]
Oct 12 17:27:26 centos6-1 kernel: [78502.139731] [<ffffffffa116ee06>] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc]
Oct 12 17:27:26 centos6-1 kernel: [78502.139960] [<ffffffffa068a080>] ? mgs_completion_ast_config+0x0/0x140 [mgs]
Oct 12 17:27:26 centos6-1 kernel: [78502.140206] [<ffffffffa116dde0>] ? ldlm_blocking_ast+0x0/0x180 [ptlrpc]
Oct 12 17:27:26 centos6-1 kernel: [78502.140422] [<ffffffffa0689da6>] mgs_revoke_lock+0x136/0x2a0 [mgs]
Oct 12 17:27:26 centos6-1 kernel: [78502.140671] [<ffffffffa116dde0>] ? ldlm_blocking_ast+0x0/0x180 [ptlrpc]
Oct 12 17:27:26 centos6-1 kernel: [78502.140954] [<ffffffffa068a080>] ? mgs_completion_ast_config+0x0/0x140 [mgs]
Oct 12 17:27:26 centos6-1 kernel: [78502.141195] [<ffffffffa068aa98>] mgs_handle_target_reg+0x788/0xe10 [mgs]
Oct 12 17:27:26 centos6-1 kernel: [78502.141422] [<ffffffffa068cd4b>] mgs_handle+0x8eb/0x11e0 [mgs]
Oct 12 17:27:26 centos6-1 kernel: [78502.141642] [<ffffffffa0ea46d1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
Oct 12 17:27:26 centos6-1 kernel: [78502.141889] [<ffffffffa11a6483>] ptlrpc_server_handle_request+0x463/0xe70 [ptlrpc]
Oct 12 17:27:26 centos6-1 kernel: [78502.142257] [<ffffffffa0e9466e>] ? cfs_timer_arm+0xe/0x10 [libcfs]
Oct 12 17:27:26 centos6-1 kernel: [78502.142495] [<ffffffffa119f171>] ? ptlrpc_wait_event+0xb1/0x2a0 [ptlrpc]
Oct 12 17:27:26 centos6-1 kernel: [78502.142733] [<ffffffff81051f73>] ? __wake_up+0x53/0x70
Oct 12 17:27:26 centos6-1 kernel: [78502.142956] [<ffffffffa11a901a>] ptlrpc_main+0xb9a/0x1960 [ptlrpc]
Oct 12 17:27:26 centos6-1 kernel: [78502.143194] [<ffffffffa11a8480>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
Oct 12 17:27:26 centos6-1 kernel: [78502.143416] [<ffffffff8100c14a>] child_rip+0xa/0x20
Oct 12 17:27:26 centos6-1 kernel: [78502.143633] [<ffffffffa11a8480>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
Oct 12 17:27:26 centos6-1 kernel: [78502.143876] [<ffffffffa11a8480>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
Oct 12 17:27:26 centos6-1 kernel: [78502.144097] [<ffffffff8100c140>] ? child_rip+0x0/0x20
Oct 12 17:27:26 centos6-1 kernel: [78502.144299]
Oct 12 17:27:26 centos6-1 kernel: [78502.144448] LustreError: dumping log to /tmp/lustre-log.1350077246.12732
Oct 12 17:27:26 centos6-1 kernel: [78502.172308] LNet: Service thread pid 12733 was inactive for 40.03s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Oct 12 17:27:26 centos6-1 kernel: [78502.172919] Pid: 12733, comm: ll_mgs_0001
Oct 12 17:27:26 centos6-1 kernel: [78502.173124]
Oct 12 17:27:26 centos6-1 kernel: [78502.173124] Call Trace:
Oct 12 17:27:26 centos6-1 kernel: [78502.173440] [<ffffffffa0ea46d1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
Oct 12 17:27:26 centos6-1 kernel: [78502.173686] [<ffffffffa11a6483>] ptlrpc_server_handle_request+0x463/0xe70 [ptlrpc]
Oct 12 17:27:26 centos6-1 kernel: [78502.174055] [<ffffffffa0e9466e>] ? cfs_timer_arm+0xe/0x10 [libcfs]
Oct 12 17:27:26 centos6-1 kernel: [78502.174285] [<ffffffffa119f171>] ? ptlrpc_wait_event+0xb1/0x2a0 [ptlrpc]
Oct 12 17:27:26 centos6-1 kernel: [78502.174509] [<ffffffff81051f73>] ? __wake_up+0x53/0x70
Oct 12 17:27:26 centos6-1 kernel: [78502.174730] [<ffffffffa11a901a>] ptlrpc_main+0xb9a/0x1960 [ptlrpc]
Oct 12 17:27:26 centos6-1 kernel: [78502.174963] [<ffffffffa11a8480>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
Oct 12 17:27:26 centos6-1 kernel: [78502.175177] [<ffffffff8100c14a>] child_rip+0xa/0x20
Oct 12 17:27:26 centos6-1 kernel: [78502.175387] [<ffffffffa11a8480>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
Oct 12 17:27:26 centos6-1 kernel: [78502.175616] [<ffffffffa11a8480>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
Oct 12 17:27:26 centos6-1 kernel: [78502.175839] [<ffffffff8100c140>] ? child_rip+0x0/0x20
Oct 12 17:27:26 centos6-1 kernel: [78502.176035]
Oct 12 17:28:04 centos6-1 kernel: [78540.060524] BUG: soft lockup - CPU#2 stuck for 67s! [ll_mgs_0001:12733]
Oct 12 17:28:04 centos6-1 kernel: [78540.060524] Modules linked in: lustre ofd osp lod ost mdt osd_ldiskfs fsfilt_ldiskfs ldiskfs mdd mds mgs lquota obdecho mgc lov osc mdc lmv fid fld ptlrpc obdclass lvfs ksocklnd lnet libcfs exportfs jbd2 jbd sha512_generic sha256_generic mbcache ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables virtio_balloon virtio_console i2c_piix4 i2c_core virtio_blk virtio_net virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache nfs_acl auth_rpcgss sunrpc be2iscsi bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: libcfs]
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] CPU 2
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] Modules linked in: lustre ofd osp lod ost mdt osd_ldiskfs fsfilt_ldiskfs ldiskfs mdd mds mgs lquota obdecho mgc lov osc mdc lmv fid fld ptlrpc obdclass lvfs ksocklnd lnet libcfs exportfs jbd2 jbd sha512_generic sha256_generic mbcache ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables virtio_balloon virtio_console i2c_piix4 i2c_core virtio_blk virtio_net virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache nfs_acl auth_rpcgss sunrpc be2iscsi bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: libcfs]
Oct 12 17:28:04 centos6-1 kernel: [78540.061006]
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] Pid: 12733, comm: ll_mgs_0001 Not tainted 2.6.32-debug #6 Bochs Bochs
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] RIP: 0010:[<ffffffffa0fd3e67>] [<ffffffffa0fd3e67>] llog_osd_next_block+0x327/0xa60 [obdclass]
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] RSP: 0018:ffff88006a265b60 EFLAGS: 00010287
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] RAX: ffff8800568b9f18 RBX: ffff88006a265bf0 RCX: ffff880046299f20
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] RDX: 0000000010620000 RSI: 0000000000000000 RDI: ffff8800568ba238
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] RBP: ffffffff8100bc0e R08: 0000000000000000 R09: ffff8800568b9f20
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] R10: 0000000000000002 R11: ffff88007d2b4928 R12: ffff88004cf6fdf0
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] R13: ffff8800568b8238 R14: ffff880067ccaef0 R15: ffff88007d2b4928
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] FS: 00007fbf95ae7700(0000) GS:ffff880006300000(0000) knlGS:0000000000000000
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] CR2: ffff880046299f28 CR3: 000000005bcaa000 CR4: 00000000000006e0
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] Process ll_mgs_0001 (pid: 12733, threadinfo ffff88006a264000, task ffff88006a4fe240)
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] Stack:
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] ffffffffa123aba0 ffff88006a265c40 ffff88006a265bb0 ffffffffa1197a96
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] <d> ffffffff00001ce8 000001ac710d3ef0 ffff88004cf6fed0 ffff8800568b8230
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] <d> 0000000058f92bb8 ffff8800568b8228 000001ad6a265bc0 ffff8800710d3ef0
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] Call Trace:
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] [<ffffffffa1197a96>] ? lustre_pack_reply_flags+0xb6/0x210 [ptlrpc]
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] [<ffffffffa11b405c>] ? llog_origin_handle_next_block+0x55c/0x780 [ptlrpc]
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] [<ffffffffa068cf73>] ? mgs_handle+0xb13/0x11e0 [mgs]
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] [<ffffffffa0ea46d1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] [<ffffffffa11a6483>] ? ptlrpc_server_handle_request+0x463/0xe70 [ptlrpc]
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] [<ffffffffa0e9466e>] ? cfs_timer_arm+0xe/0x10 [libcfs]
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] [<ffffffffa119f171>] ? ptlrpc_wait_event+0xb1/0x2a0 [ptlrpc]
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] [<ffffffff81051f73>] ? __wake_up+0x53/0x70
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] [<ffffffffa11a901a>] ? ptlrpc_main+0xb9a/0x1960 [ptlrpc]
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] [<ffffffffa11a8480>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] [<ffffffff8100c14a>] ? child_rip+0xa/0x20
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] [<ffffffffa11a8480>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] [<ffffffffa11a8480>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] [<ffffffff8100c140>] ? child_rip+0x0/0x20
Oct 12 17:28:04 centos6-1 kernel: [78540.061006] Code: 86 7f 02 00 00 41 8b 45 08 25 ff f0 00 00 3d 10 60 00 00 0f 84 0b 02 00 00 48 63 c9 49 8d 44 0d f8 8b 10 48 29 d1 49 8d 4c 0d 00 <8b> 51 08 81 e2 ff f0 00 00 81 fa 10 60 00 00 0f 84 c4 01 00 00
Oct 12 17:29:05 centos6-1 kernel: [78600.477548] INFO: task ll_cfg_requeue:28122 blocked for more than 120 seconds.
Oct 12 17:29:05 centos6-1 kernel: [78600.477921] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 12 17:29:05 centos6-1 kernel: [78600.478272] ll_cfg_requeu D 0000000000000002 3936 28122 2 0x00000080
Oct 12 17:29:05 centos6-1 kernel: [78600.478501] ffff880074775ae0 0000000000000046 ffff8800004896b0 0000000000000286
Oct 12 17:29:05 centos6-1 kernel: [78600.478885] 0000000000800500 0000000000000000 0000000000800500 ffff88003648c340
Oct 12 17:29:05 centos6-1 kernel: [78600.479249] ffff880037dbc778 ffff880074775fd8 000000000000fba8 ffff880037dbc778
Oct 12 17:29:05 centos6-1 kernel: [78600.479610] Call Trace:
Oct 12 17:29:05 centos6-1 kernel: [78600.479779] [<ffffffff814f8b55>] schedule_timeout+0x215/0x2e0
Oct 12 17:29:05 centos6-1 kernel: [78600.480016] [<ffffffffa0fa4f10>] ? llog_process_thread_daemonize+0x0/0x80 [obdclass]
Oct 12 17:29:05 centos6-1 kernel: [78600.480376] [<ffffffff8100c0e2>] ? kernel_thread+0x82/0xe0
Oct 12 17:29:05 centos6-1 kernel: [78600.480608] [<ffffffffa0fa4f10>] ? llog_process_thread_daemonize+0x0/0x80 [obdclass]
Oct 12 17:29:05 centos6-1 kernel: [78600.480989] [<ffffffff814f87cb>] wait_for_common+0x12b/0x180
Oct 12 17:29:05 centos6-1 kernel: [78600.481193] [<ffffffff81057d60>] ? default_wake_function+0x0/0x20
Oct 12 17:29:05 centos6-1 kernel: [78600.481411] [<ffffffffa0e9b7ca>] ? cfs_create_thread+0x7a/0xa0 [libcfs]
Oct 12 17:29:05 centos6-1 kernel: [78600.481674] [<ffffffffa0fe8210>] ? class_config_llog_handler+0x0/0x1800 [obdclass]
Oct 12 17:29:05 centos6-1 kernel: [78600.482055] [<ffffffffa0fe8210>] ? class_config_llog_handler+0x0/0x1800 [obdclass]
Oct 12 17:29:05 centos6-1 kernel: [78600.482406] [<ffffffff814f88dd>] wait_for_completion+0x1d/0x20
Oct 12 17:29:05 centos6-1 kernel: [78600.482624] [<ffffffffa0fa67ab>] llog_process_or_fork+0x2ab/0x540 [obdclass]
Oct 12 17:29:05 centos6-1 kernel: [78600.482882] [<ffffffffa0fa6a54>] llog_process+0x14/0x20 [obdclass]
Oct 12 17:29:05 centos6-1 kernel: [78600.483109] [<ffffffffa0fddc54>] class_config_parse_llog+0x1e4/0x340 [obdclass]
Oct 12 17:29:05 centos6-1 kernel: [78600.483457] [<ffffffffa03f1a32>] mgc_process_cfg_log+0x4f2/0x1500 [mgc]
Oct 12 17:29:05 centos6-1 kernel: [78600.483688] [<ffffffffa0fcf627>] ? class_handle2object+0x97/0x180 [obdclass]
Oct 12 17:29:05 centos6-1 kernel: [78600.483930] [<ffffffffa03f2e73>] mgc_process_log+0x433/0x1330 [mgc]
Oct 12 17:29:05 centos6-1 kernel: [78600.484158] [<ffffffffa0e94bde>] ? cfs_free+0xe/0x10 [libcfs]
Oct 12 17:29:05 centos6-1 kernel: [78600.484363] [<ffffffffa03eda80>] ? mgc_blocking_ast+0x0/0x680 [mgc]
Oct 12 17:29:05 centos6-1 kernel: [78600.484614] [<ffffffffa116f180>] ? ldlm_completion_ast+0x0/0x730 [ptlrpc]
Oct 12 17:29:05 centos6-1 kernel: [78600.484853] [<ffffffffa03f48d8>] mgc_requeue_thread+0x348/0x7a0 [mgc]
Oct 12 17:29:05 centos6-1 kernel: [78600.485068] [<ffffffff81057d60>] ? default_wake_function+0x0/0x20
Oct 12 17:29:05 centos6-1 kernel: [78600.485278] [<ffffffffa03f4590>] ? mgc_requeue_thread+0x0/0x7a0 [mgc]
Oct 12 17:29:05 centos6-1 kernel: [78600.485494] [<ffffffff8100c14a>] child_rip+0xa/0x20
Oct 12 17:29:05 centos6-1 kernel: [78600.485702] [<ffffffffa03f4590>] ? mgc_requeue_thread+0x0/0x7a0 [mgc]
Oct 12 17:29:05 centos6-1 kernel: [78600.485941] [<ffffffffa03f4590>] ? mgc_requeue_thread+0x0/0x7a0 [mgc]
Oct 12 17:29:05 centos6-1 kernel: [78600.486156] [<ffffffff8100c140>] ? child_rip+0x0/0x20
Oct 12 17:29:17 centos6-1 kernel: [78612.816541] LustreError: 0:0:(ldlm_lockd.c:374:waiting_locks_callback()) ### lock callback timer expired after 151s: evicting client at 0@lo ns: MGS lock: ffff8800664b7db8/0x3b4f61b5b2dfb580 lrc: 3/0,0 mode: CR/CR res: 111542254400876/0 rrc: 2 type: PLN flags: 0x20 nid: 0@lo remote: 0x3b4f61b5b2dfb56b expref: 9 pid: 12733 timeout 4314545333
Oct 12 17:29:17 centos6-1 kernel: [78612.818123] LNet: Service thread pid 12732 completed after 150.68s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
Oct 12 17:29:17 centos6-1 kernel: [78612.818948] LustreError: 166-1: MGC192.168.10.211@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail
Oct 12 17:29:17 centos6-1 kernel: [78612.819367] LustreError: Skipped 17 previous similar messages
Oct 12 17:29:17 centos6-1 kernel: [78612.819691] LustreError: 15c-8: MGC192.168.10.211@tcp: The configuration from log 'lustre-MDT0000' failed (-5). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
Oct 12 17:29:17 centos6-1 kernel: [78612.820086] Lustre: Evicted from MGS (at MGC192.168.10.211@tcp_0) after server handle changed from 0x3b4f61b5b2dfb572 to 0x3b4f61b5b2dfb595
Oct 12 17:29:17 centos6-1 kernel: [78612.820182] Lustre: MGC192.168.10.211@tcp: Reactivating import
Oct 12 17:29:17 centos6-1 kernel: [78612.820183] Lustre: Skipped 37 previous similar messages
Oct 12 17:29:17 centos6-1 kernel: [78612.820200] Lustre: MGC192.168.10.211@tcp: Connection restored to MGS (at 0@lo)
Oct 12 17:29:17 centos6-1 kernel: [78612.820201] Lustre: Skipped 23 previous similar messages
Oct 12 17:29:17 centos6-1 kernel: [78612.821981] LustreError: 12716:0:(obd_mount.c:1850:server_start_targets()) failed to start server lustre-MDT0000: -5
Oct 12 17:29:17 centos6-1 kernel: [78612.822548] LustreError: 12716:0:(obd_mount.c:2397:server_fill_super()) Unable to start targets: -5
Oct 12 17:29:17 centos6-1 kernel: [78612.822937] LustreError: 12716:0:(obd_mount.c:1350:lustre_disconnect_osp()) Can't end config log lustre
Oct 12 17:29:17 centos6-1 kernel: [78612.823318] LustreError: 12716:0:(obd_mount.c:2110:server_put_super()) lustre-MDT0000: Fail to disconnect osp-on-ost!
Oct 12 17:29:17 centos6-1 kernel: [78612.823753] LustreError: 12716:0:(obd_mount.c:2140:server_put_super()) no obd lustre-MDT0000
Oct 12 17:29:28 centos6-1 kernel: [78624.060033] BUG: soft lockup - CPU#2 stuck for 67s! [ll_mgs_0001:12733]
Oct 12 17:29:28 centos6-1 kernel: [78624.060033] Modules linked in: lustre ofd osp lod ost mdt osd_ldiskfs fsfilt_ldiskfs ldiskfs mdd mds mgs lquota obdecho mgc lov osc mdc lmv fid fld ptlrpc obdclass lvfs ksocklnd lnet libcfs exportfs jbd2 jbd sha512_generic sha256_generic mbcache ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables virtio_balloon virtio_console i2c_piix4 i2c_core virtio_blk virtio_net virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache nfs_acl auth_rpcgss sunrpc be2iscsi bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: libcfs]
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] CPU 2
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] Modules linked in: lustre ofd osp lod ost mdt osd_ldiskfs fsfilt_ldiskfs ldiskfs mdd mds mgs lquota obdecho mgc lov osc mdc lmv fid fld ptlrpc obdclass lvfs ksocklnd lnet libcfs exportfs jbd2 jbd sha512_generic sha256_generic mbcache ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables virtio_balloon virtio_console i2c_piix4 i2c_core virtio_blk virtio_net virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache nfs_acl auth_rpcgss sunrpc be2iscsi bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: libcfs]
Oct 12 17:29:28 centos6-1 kernel: [78624.061524]
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] Pid: 12733, comm: ll_mgs_0001 Not tainted 2.6.32-debug #6 Bochs Bochs
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] RIP: 0010:[<ffffffffa0fd3e67>] [<ffffffffa0fd3e67>] llog_osd_next_block+0x327/0xa60 [obdclass]
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] RSP: 0018:ffff88006a265b60 EFLAGS: 00010287
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] RAX: ffff8800568b9f18 RBX: ffff88006a265bf0 RCX: ffff880046299f20
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] RDX: 0000000010620000 RSI: 0000000000000000 RDI: ffff8800568ba238
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] RBP: ffffffff8100bc0e R08: 0000000000000000 R09: ffff8800568b9f20
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] R10: 0000000000000002 R11: ffff88007d2b4928 R12: ffff88004cf6fdf0
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] R13: ffff8800568b8238 R14: ffff880067ccaef0 R15: ffff88007d2b4928
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] FS: 00007fbf95ae7700(0000) GS:ffff880006300000(0000) knlGS:0000000000000000
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] CR2: ffff880046299f28 CR3: 000000005bcaa000 CR4: 00000000000006e0
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] Process ll_mgs_0001 (pid: 12733, threadinfo ffff88006a264000, task ffff88006a4fe240)
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] Stack:
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] ffffffffa123aba0 ffff88006a265c40 ffff88006a265bb0 ffffffffa1197a96
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] <d> ffffffff00001ce8 000001ac710d3ef0 ffff88004cf6fed0 ffff8800568b8230
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] <d> 0000000058f92bb8 ffff8800568b8228 000001ad6a265bc0 ffff8800710d3ef0
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] Call Trace:
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] [<ffffffffa1197a96>] ? lustre_pack_reply_flags+0xb6/0x210 [ptlrpc]
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] [<ffffffffa11b405c>] ? llog_origin_handle_next_block+0x55c/0x780 [ptlrpc]
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] [<ffffffffa068cf73>] ? mgs_handle+0xb13/0x11e0 [mgs]
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] [<ffffffffa0ea46d1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] [<ffffffffa11a6483>] ? ptlrpc_server_handle_request+0x463/0xe70 [ptlrpc]
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] [<ffffffffa0e9466e>] ? cfs_timer_arm+0xe/0x10 [libcfs]
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] [<ffffffffa119f171>] ? ptlrpc_wait_event+0xb1/0x2a0 [ptlrpc]
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] [<ffffffff81051f73>] ? __wake_up+0x53/0x70
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] [<ffffffffa11a901a>] ? ptlrpc_main+0xb9a/0x1960 [ptlrpc]
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] [<ffffffffa11a8480>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] [<ffffffff8100c14a>] ? child_rip+0xa/0x20
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] [<ffffffffa11a8480>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] [<ffffffffa11a8480>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] [<ffffffff8100c140>] ? child_rip+0x0/0x20
Oct 12 17:29:28 centos6-1 kernel: [78624.061524] Code: 86 7f 02 00 00 41 8b 45 08 25 ff f0 00 00 3d 10 60 00 00 0f 84 0b 02 00 00 48 63 c9 49 8d 44 0d f8 8b 10 48 29 d1 49 8d 4c 0d 00 <8b> 51 08 81 e2 ff f0 00 00 81 fa 10 60 00 00 0f 84 c4 01 00 00