Console [hyperion-agb20] log at 2013-10-12 11:00:00 PDT.
2013-10-12 11:13:33 LustreError: 6291:0:(service.c:2864:ptlrpc_start_thread()) cannot start thread 'll_ost01_009': rc -2816
2013-10-12 11:13:33 LustreError: 6321:0:(service.c:2467:ptlrpc_main()) ASSERTION( svcpt->scp_nthrs_starting == 1 ) failed:
2013-10-12 11:13:33 LustreError: 6321:0:(service.c:2467:ptlrpc_main()) LBUG
2013-10-12 11:13:33 Pid: 6321, comm: ll_ost01_010
2013-10-12 11:13:33
2013-10-12 11:13:33 Call Trace:
2013-10-12 11:13:33 [] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
2013-10-12 11:13:33 [] lbug_with_loc+0x47/0xb0 [libcfs]
2013-10-12 11:13:33 [] ptlrpc_main+0x153c/0x1740 [ptlrpc]
2013-10-12 11:13:33 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc]
2013-10-12 11:13:33 [] kthread+0x96/0xa0
2013-10-12 11:13:33 [] child_rip+0xa/0x20
2013-10-12 11:13:33 [] ? kthread+0x0/0xa0
2013-10-12 11:13:33 [] ? child_rip+0x0/0x20
2013-10-12 11:13:33
2013-10-12 11:14:28 LNetError: 4985:0:(o2iblnd_cb.c:3012:kiblnd_check_txs_locked()) Timed out tx: active_txs, 1 seconds
2013-10-12 11:14:28 LNetError: 4985:0:(o2iblnd_cb.c:3075:kiblnd_check_conns()) Timed out RDMA with 192.168.120.5@o2ib (60): c: 4, oc: 0, rc: 8
2013-10-12 11:14:28 LNetError: 4985:0:(o2iblnd_cb.c:3012:kiblnd_check_txs_locked()) Timed out tx: active_txs, 1 seconds
2013-10-12 11:14:28 LNetError: 4985:0:(o2iblnd_cb.c:3075:kiblnd_check_conns()) Timed out RDMA with 192.168.120.5@o2ib1 (76): c: 2, oc: 0, rc: 8
2013-10-12 11:14:30 LNetError: 4985:0:(o2iblnd_cb.c:3012:kiblnd_check_txs_locked()) Timed out tx: active_txs, 7 seconds
2013-10-12 11:14:30 LNetError: 4985:0:(o2iblnd_cb.c:3075:kiblnd_check_conns()) Timed out RDMA with 192.168.124.10@o2ib (57): c: 6, oc: 0, rc: 8
2013-10-12 11:14:52 Lustre: 5868:0:(client.c:1897:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1381601692/real 1381601692] req@ffff8804ecd26800 x1448711817269336/t0(0) o400->lustre-MDT0000-lwp-OST000e@192.168.120.5@o2ib1:12/10 lens 224/224 e 0 to 1 dl 1381601798 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
2013-10-12 11:14:52 Lustre: lustre-MDT0000-lwp-OST0005: Connection to lustre-MDT0000 (at 192.168.120.5@o2ib1) was lost; in progress operations using this service will wait for recovery to complete
2013-10-12 11:14:52 Lustre: 5868:0:(client.c:1897:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
2013-10-12 11:14:52 Lustre: lustre-MDT0000-lwp-OST000e: Connection to lustre-MDT0000 (at 192.168.120.5@o2ib1) was lost; in progress operations using this service will wait for recovery to complete
2013-10-12 11:14:52 Lustre: 5862:0:(client.c:1897:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1381601692/real 1381601692] req@ffff8804cabaec00 x1448711817269348/t0(0) o38->lustre-MDT0000-lwp-OST000e@192.168.120.5@o2ib1:12/10 lens 400/544 e 0 to 1 dl 1381601742 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
2013-10-12 11:14:56 BUG: soft lockup - CPU#4 stuck for 67s! [ll_ost01_008:6313]
2013-10-12 11:14:56 BUG: soft lockup - CPU#5 stuck for 67s!
[ll_ost01_002:5939] 2013-10-12 11:14:56 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:14:56 CPU 5 2013-10-12 11:14:56 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:14:56 2013-10-12 11:14:56 Pid: 5939, comm: ll_ost01_002 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:14:56 RIP: 0010:[] [] _spin_lock+0x1e/0x30 2013-10-12 11:14:56 RSP: 0018:ffff880ccc9d5cb0 EFLAGS: 00000297 2013-10-12 11:14:56 RAX: 00000000000032af RBX: ffff880ccc9d5cb0 RCX: 000000000000002f 2013-10-12 11:14:56 RDX: 00000000000032ae RSI: ffff8805eef22c00 RDI: ffff880813eda430 2013-10-12 11:14:56 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:14:56 R10: 00000000000957c0 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000000 2013-10-12 11:14:56 R13: ffffffffa0f70433 R14: ffff880ccc9d5d60 R15: ffff8805eef22fa0 2013-10-12 11:14:56 FS: 0000000000000000(0000) GS:ffff8800446a0000(0000) knlGS:0000000000000000 2013-10-12 11:14:56 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:14:56 CR2: 00007ffff7ff7000 CR3: 00000010347bb000 CR4: 00000000000407e0 2013-10-12 11:14:56 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:14:57 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:14:57 Process ll_ost01_002 (pid: 5939, threadinfo ffff880ccc9d4000, task ffff880ccc9ca040) 2013-10-12 11:14:57 Stack: 2013-10-12 11:14:57 ffff880ccc9d5d10 ffffffffa0a41340 ffff880ccc9d5d10 ffffffffa0a78b6b 2013-10-12 11:14:57 ffff880813f11c40 0000000000000000 0000000000000000 ffff8805eef22c00 2013-10-12 11:14:57 ffff880813eda400 ffff880813eda4d0 0000000000001967 000000005259914d 2013-10-12 11:14:57 Call Trace: 2013-10-12 11:14:57 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:14:57 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:14:57 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:14:57 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:14:57 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:14:57 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:14:57 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:14:57 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:14:57 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:14:57 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:14:57 [] ? kthread+0x96/0xa0 2013-10-12 11:14:57 [] ? child_rip+0xa/0x20 2013-10-12 11:14:57 [] ? kthread+0x0/0xa0 2013-10-12 11:14:57 [] ? child_rip+0x0/0x20 2013-10-12 11:14:57 Code: 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 2013-10-12 11:14:57 Call Trace: 2013-10-12 11:14:57 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:14:57 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:14:57 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:14:57 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:14:57 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:14:57 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:14:57 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:14:57 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:14:57 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:14:57 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:14:57 [] ? kthread+0x96/0xa0 2013-10-12 11:14:57 [] ? child_rip+0xa/0x20 2013-10-12 11:14:57 [] ? kthread+0x0/0xa0 2013-10-12 11:14:57 [] ? child_rip+0x0/0x20 2013-10-12 11:14:57 BUG: soft lockup - CPU#6 stuck for 67s! 
[ll_ost01_006:6292] 2013-10-12 11:14:57 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:14:57 CPU 6 2013-10-12 11:14:57 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:14:57 2013-10-12 11:14:57 Pid: 6292, comm: ll_ost01_006 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:14:57 RIP: 0010:[] [] _spin_lock+0x1e/0x30 2013-10-12 11:14:57 RSP: 0018:ffff880e05805cb0 EFLAGS: 00000283 2013-10-12 11:14:57 RAX: 00000000000032b1 RBX: ffff880e05805cb0 RCX: 000000000000003d 2013-10-12 11:14:57 RDX: 00000000000032ae RSI: ffff88067a044800 RDI: ffff880813eda430 2013-10-12 11:14:57 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:14:57 R10: 00000000000958d5 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000400 2013-10-12 11:14:57 R13: ffffffffa0f70433 R14: ffff880e05805d60 R15: ffff88067a044ba0 2013-10-12 11:14:57 FS: 0000000000000000(0000) GS:ffff8800446c0000(0000) knlGS:0000000000000000 2013-10-12 11:14:57 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:14:57 CR2: 00007ffff7feb000 CR3: 000000082827b000 CR4: 00000000000407e0 2013-10-12 11:14:57 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:14:57 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:14:57 Process ll_ost01_006 (pid: 6292, threadinfo ffff880e05804000, task ffff880fd403c080) 2013-10-12 11:14:57 Stack: 2013-10-12 11:14:57 ffff880e05805d10 ffffffffa0a41340 ffff880e05805d10 ffffffffa0a78b6b 2013-10-12 11:14:57 ffff880813f11c40 0000000000000000 0000000000000000 ffff88067a044800 2013-10-12 11:14:57 ffff880813eda400 ffff880813eda4d0 00000000000014e8 000000005259914d 2013-10-12 11:14:57 Call Trace: 2013-10-12 11:14:57 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:14:57 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:14:57 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:14:57 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:14:57 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:14:57 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:14:57 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:14:57 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:14:57 [] ? __wake_up_common+0x59/0x90 2013-10-12 11:14:57 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:14:57 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:14:57 [] ? kthread+0x96/0xa0 2013-10-12 11:14:57 [] ? child_rip+0xa/0x20 2013-10-12 11:14:57 [] ? kthread+0x0/0xa0 2013-10-12 11:14:57 [] ? child_rip+0x0/0x20 2013-10-12 11:14:57 Code: 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 2013-10-12 11:14:57 Call Trace: 2013-10-12 11:14:57 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:14:57 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:14:57 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:14:57 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:14:57 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:14:57 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:14:57 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:14:57 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:14:57 [] ? __wake_up_common+0x59/0x90 2013-10-12 11:14:57 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:14:57 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:14:57 [] ? kthread+0x96/0xa0 2013-10-12 11:14:57 [] ? child_rip+0xa/0x20 2013-10-12 11:14:57 [] ? kthread+0x0/0xa0 2013-10-12 11:14:57 [] ? child_rip+0x0/0x20 2013-10-12 11:14:57 BUG: soft lockup - CPU#7 stuck for 67s! 
[ll_ost01_009:6315] 2013-10-12 11:14:57 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:14:57 CPU 7 2013-10-12 11:14:57 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:14:57 2013-10-12 11:14:57 Pid: 6315, comm: ll_ost01_009 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:14:57 RIP: 0010:[] [] _spin_lock+0x1c/0x30 2013-10-12 11:14:57 RSP: 0018:ffff880df8e01cb0 EFLAGS: 00000283 2013-10-12 11:14:57 RAX: 00000000000032b0 RBX: ffff880df8e01cb0 RCX: 0000000000000054 2013-10-12 11:14:57 RDX: 00000000000032ae RSI: ffff8805eef22400 RDI: ffff880813eda430 2013-10-12 11:14:57 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:14:57 R10: 00000000000958b3 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000000 2013-10-12 11:14:57 R13: ffffffffa0f70433 R14: ffff880df8e01d60 R15: ffff8805eef227a0 2013-10-12 11:14:58 FS: 0000000000000000(0000) GS:ffff8800446e0000(0000) knlGS:0000000000000000 2013-10-12 11:14:58 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:14:58 CR2: 00007ffff7feb000 CR3: 0000000001a85000 CR4: 00000000000407e0 2013-10-12 11:14:58 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:14:58 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:14:58 Process ll_ost01_009 (pid: 6315, threadinfo ffff880df8e00000, task ffff880dfffff500) 2013-10-12 11:14:58 Stack: 2013-10-12 11:14:58 ffff880df8e01d10 ffffffffa0a41340 ffff880df8e01d10 ffffffffa0a78b6b 2013-10-12 11:14:58 ffff880813f11c40 0000000000000000 0000000000000000 ffff8805eef22400 2013-10-12 11:14:58 ffff880813eda400 ffff880813eda4d0 00000000000014e0 000000005259914d 2013-10-12 11:14:58 Call Trace: 2013-10-12 11:14:58 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:14:58 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:14:58 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:14:58 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:14:58 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:14:58 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:14:58 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:14:58 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:14:58 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:14:58 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:14:58 [] ? kthread+0x96/0xa0 2013-10-12 11:14:58 [] ? child_rip+0xa/0x20 2013-10-12 11:14:58 [] ? kthread+0x0/0xa0 2013-10-12 11:14:58 [] ? child_rip+0x0/0x20 2013-10-12 11:14:58 Code: 81 2f 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e 90 0f b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 2013-10-12 11:14:58 Call Trace: 2013-10-12 11:14:58 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:14:58 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:14:58 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:14:58 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:14:58 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:14:58 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:14:58 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:14:58 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:14:58 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:14:58 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:14:58 [] ? kthread+0x96/0xa0 2013-10-12 11:14:58 [] ? child_rip+0xa/0x20 2013-10-12 11:14:58 [] ? kthread+0x0/0xa0 2013-10-12 11:14:58 [] ? 
child_rip+0x0/0x20 2013-10-12 11:14:58 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:14:58 CPU 4 2013-10-12 11:14:58 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:14:58 2013-10-12 11:14:58 Pid: 6313, comm: ll_ost01_008 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:14:58 RIP: 0010:[] [] _spin_lock+0x1e/0x30 2013-10-12 11:14:58 RSP: 0018:ffff880dffff7cb0 EFLAGS: 00000287 2013-10-12 11:14:58 RAX: 00000000000032b2 RBX: ffff880dffff7cb0 RCX: 0000000000000054 2013-10-12 11:14:58 RDX: 00000000000032ae RSI: ffff8802fddb8000 RDI: ffff880813eda430 2013-10-12 11:14:58 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:14:58 R10: 00000000000a3eda R11: 5a5a5a5a5a5a5a5a R12: ffff880834502080 2013-10-12 11:14:58 R13: ffffffffa0f70433 R14: ffff880dffff7d60 R15: ffff8802fddb83a0 2013-10-12 11:14:58 FS: 0000000000000000(0000) GS:ffff880044680000(0000) knlGS:0000000000000000 2013-10-12 11:14:58 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:14:58 CR2: 00007ffffffdc4d8 CR3: 00000008314d7000 CR4: 00000000000407e0 2013-10-12 11:14:58 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:14:58 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:14:58 Process ll_ost01_008 (pid: 6313, threadinfo ffff880dffff6000, task ffff880dfffecae0) 2013-10-12 11:14:58 Stack: 2013-10-12 11:14:58 ffff880dffff7d10 ffffffffa0a41340 ffff880dffff7d10 ffffffffa0a78b6b 2013-10-12 11:14:58 ffff880813f11c40 0000000000000000 0000000000000000 ffff8802fddb8000 2013-10-12 11:14:58 ffff880813eda400 ffff880813eda4d0 000000000000f53b 000000005259914d 2013-10-12 11:14:58 Call Trace: 2013-10-12 11:14:58 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:14:58 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:14:58 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:14:58 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:14:58 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:14:58 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:14:58 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:14:58 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:14:58 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:14:58 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:14:58 [] ? kthread+0x96/0xa0 2013-10-12 11:14:58 [] ? child_rip+0xa/0x20 2013-10-12 11:14:58 [] ? kthread+0x0/0xa0 2013-10-12 11:14:58 [] ? child_rip+0x0/0x20 2013-10-12 11:14:58 Code: 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 2013-10-12 11:14:58 Call Trace: 2013-10-12 11:14:58 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:14:58 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:14:58 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:14:58 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:14:58 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:14:58 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:14:58 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:14:58 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:14:58 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:14:58 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:14:58 [] ? kthread+0x96/0xa0 2013-10-12 11:14:58 [] ? child_rip+0xa/0x20 2013-10-12 11:14:58 [] ? kthread+0x0/0xa0 2013-10-12 11:14:58 [] ? 
child_rip+0x0/0x20
2013-10-12 11:15:23 Lustre: 5875:0:(client.c:1897:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1381601617/real 0] req@ffff88070d25ec00 x1448711817269300/t0(0) o400->lustre-MDT0000-lwp-OST000e@192.168.120.5@o2ib1:12/10 lens 224/224 e 0 to 1 dl 1381601723 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
2013-10-12 11:15:23 LustreError: 166-1: MGC192.168.120.5@o2ib: Connection to MGS (at 192.168.120.5@o2ib) was lost; in progress operations using this service will fail
2013-10-12 11:15:23 Lustre: 5875:0:(client.c:1897:ptlrpc_expire_one_request()) Skipped 1 previous similar message
2013-10-12 11:15:43 LNetError: 4985:0:(o2iblnd_cb.c:3012:kiblnd_check_txs_locked()) Timed out tx: active_txs, 1 seconds
2013-10-12 11:15:43 LNetError: 4985:0:(o2iblnd_cb.c:3012:kiblnd_check_txs_locked()) Skipped 1 previous similar message
2013-10-12 11:15:43 LNetError: 4985:0:(o2iblnd_cb.c:3075:kiblnd_check_conns()) Timed out RDMA with 192.168.120.5@o2ib (60): c: 5, oc: 0, rc: 8
2013-10-12 11:15:43 LNetError: 4985:0:(o2iblnd_cb.c:3075:kiblnd_check_conns()) Skipped 1 previous similar message
2013-10-12 11:15:48 Lustre: 5864:0:(client.c:1897:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1381601642/real 0] req@ffff8807d2f2c000 x1448711817269316/t0(0) o400->lustre-MDT0000-lwp-OST0005@192.168.120.5@o2ib1:12/10 lens 224/224 e 0 to 1 dl 1381601748 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
2013-10-12 11:15:48 Lustre: 5864:0:(client.c:1897:ptlrpc_expire_one_request()) Skipped 1 previous similar message
2013-10-12 11:15:55 LNetError: 4985:0:(o2iblnd_cb.c:3012:kiblnd_check_txs_locked()) Timed out tx: active_txs, 0 seconds
2013-10-12 11:15:55 LNetError: 4985:0:(o2iblnd_cb.c:3075:kiblnd_check_conns()) Timed out RDMA with 192.168.118.106@o2ib (150): c: 7, oc: 0, rc: 8
2013-10-12 11:16:01 LNetError: 4985:0:(o2iblnd_cb.c:3012:kiblnd_check_txs_locked()) Timed out tx: active_txs, 1 seconds
2013-10-12 11:16:01 LNetError: 4985:0:(o2iblnd_cb.c:3012:kiblnd_check_txs_locked()) Skipped 7 previous similar messages
2013-10-12 11:16:01 LNetError: 4985:0:(o2iblnd_cb.c:3075:kiblnd_check_conns()) Timed out RDMA with 192.168.124.31@o2ib (156): c: 7, oc: 0, rc: 8
2013-10-12 11:16:01 LNetError: 4985:0:(o2iblnd_cb.c:3075:kiblnd_check_conns()) Skipped 7 previous similar messages
2013-10-12 11:16:10 LNetError: 4985:0:(o2iblnd_cb.c:3012:kiblnd_check_txs_locked()) Timed out tx: active_txs, 1 seconds
2013-10-12 11:16:10 LNetError: 4985:0:(o2iblnd_cb.c:3012:kiblnd_check_txs_locked()) Skipped 10 previous similar messages
2013-10-12 11:16:10 LNetError: 4985:0:(o2iblnd_cb.c:3075:kiblnd_check_conns()) Timed out RDMA with 192.168.124.2@o2ib (157): c: 6, oc: 0, rc: 8
2013-10-12 11:16:10 LNetError: 4985:0:(o2iblnd_cb.c:3075:kiblnd_check_conns()) Skipped 10 previous similar messages
2013-10-12 11:16:12 Lustre: 5862:0:(client.c:1897:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1381601717/real 0] req@ffff88054d4ad400 x1448711817269356/t0(0) o38->lustre-MDT0000-lwp-OST000e@192.168.120.5@o2ib1:12/10 lens 400/544 e 0 to 1 dl 1381601772 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
2013-10-12 11:16:20 BUG: soft lockup - CPU#4 stuck for 67s! [ll_ost01_008:6313]
2013-10-12 11:16:20 BUG: soft lockup - CPU#5 stuck for 67s!
[ll_ost01_002:5939] 2013-10-12 11:16:20 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:16:20 CPU 5 2013-10-12 11:16:20 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:16:20 2013-10-12 11:16:20 Pid: 5939, comm: ll_ost01_002 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:16:20 RIP: 0010:[] [] _spin_lock+0x1e/0x30 2013-10-12 11:16:20 RSP: 0018:ffff880ccc9d5cb0 EFLAGS: 00000297 2013-10-12 11:16:20 RAX: 00000000000032af RBX: ffff880ccc9d5cb0 RCX: 000000000000002f 2013-10-12 11:16:20 RDX: 00000000000032ae RSI: ffff8805eef22c00 RDI: ffff880813eda430 2013-10-12 11:16:20 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:16:20 R10: 00000000000957c0 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000000 2013-10-12 11:16:20 R13: ffffffffa0f70433 R14: ffff880ccc9d5d60 R15: ffff8805eef22fa0 2013-10-12 11:16:20 FS: 0000000000000000(0000) GS:ffff8800446a0000(0000) knlGS:0000000000000000 2013-10-12 11:16:20 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:16:20 CR2: 00007ffff7ff7000 CR3: 00000010347bb000 CR4: 00000000000407e0 2013-10-12 11:16:20 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:16:20 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:16:20 Process ll_ost01_002 (pid: 5939, threadinfo ffff880ccc9d4000, task ffff880ccc9ca040) 2013-10-12 11:16:20 Stack: 2013-10-12 11:16:20 ffff880ccc9d5d10 ffffffffa0a41340 ffff880ccc9d5d10 ffffffffa0a78b6b 2013-10-12 11:16:20 ffff880813f11c40 0000000000000000 0000000000000000 ffff8805eef22c00 2013-10-12 11:16:20 ffff880813eda400 ffff880813eda4d0 0000000000001967 000000005259914d 2013-10-12 11:16:20 Call Trace: 2013-10-12 11:16:20 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:16:21 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:16:21 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:16:21 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:16:21 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:16:21 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:16:21 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:16:21 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:16:21 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:16:21 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:16:21 [] ? kthread+0x96/0xa0 2013-10-12 11:16:21 [] ? child_rip+0xa/0x20 2013-10-12 11:16:21 [] ? kthread+0x0/0xa0 2013-10-12 11:16:21 [] ? child_rip+0x0/0x20 2013-10-12 11:16:21 Code: 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 2013-10-12 11:16:21 Call Trace: 2013-10-12 11:16:21 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:16:21 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:16:21 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:16:21 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:16:21 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:16:21 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:16:21 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:16:21 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:16:21 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:16:21 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:16:21 [] ? kthread+0x96/0xa0 2013-10-12 11:16:21 [] ? child_rip+0xa/0x20 2013-10-12 11:16:21 [] ? kthread+0x0/0xa0 2013-10-12 11:16:21 [] ? child_rip+0x0/0x20 2013-10-12 11:16:21 BUG: soft lockup - CPU#6 stuck for 67s! 
[ll_ost01_006:6292] 2013-10-12 11:16:21 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:16:21 CPU 6 2013-10-12 11:16:21 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:16:21 2013-10-12 11:16:21 Pid: 6292, comm: ll_ost01_006 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:16:21 RIP: 0010:[] [] _spin_lock+0x1e/0x30 2013-10-12 11:16:21 RSP: 0018:ffff880e05805cb0 EFLAGS: 00000283 2013-10-12 11:16:21 RAX: 00000000000032b1 RBX: ffff880e05805cb0 RCX: 000000000000003d 2013-10-12 11:16:21 RDX: 00000000000032ae RSI: ffff88067a044800 RDI: ffff880813eda430 2013-10-12 11:16:21 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:16:21 R10: 00000000000958d5 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000400 2013-10-12 11:16:21 R13: ffffffffa0f70433 R14: ffff880e05805d60 R15: ffff88067a044ba0 2013-10-12 11:16:21 FS: 0000000000000000(0000) GS:ffff8800446c0000(0000) knlGS:0000000000000000 2013-10-12 11:16:21 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:16:21 CR2: 00007ffff7feb000 CR3: 000000082827b000 CR4: 00000000000407e0 2013-10-12 11:16:21 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:16:21 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:16:21 Process ll_ost01_006 (pid: 6292, threadinfo ffff880e05804000, task ffff880fd403c080) 2013-10-12 11:16:21 Stack: 2013-10-12 11:16:21 ffff880e05805d10 ffffffffa0a41340 ffff880e05805d10 ffffffffa0a78b6b 2013-10-12 11:16:21 ffff880813f11c40 0000000000000000 0000000000000000 ffff88067a044800 2013-10-12 11:16:21 ffff880813eda400 ffff880813eda4d0 00000000000014e8 000000005259914d 2013-10-12 11:16:21 Call Trace: 2013-10-12 11:16:21 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:16:21 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:16:21 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:16:21 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:16:21 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:16:21 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:16:21 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:16:21 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:16:21 [] ? __wake_up_common+0x59/0x90 2013-10-12 11:16:21 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:16:21 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:16:21 [] ? kthread+0x96/0xa0 2013-10-12 11:16:21 [] ? child_rip+0xa/0x20 2013-10-12 11:16:21 [] ? kthread+0x0/0xa0 2013-10-12 11:16:21 [] ? child_rip+0x0/0x20 2013-10-12 11:16:21 Code: 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 2013-10-12 11:16:21 Call Trace: 2013-10-12 11:16:21 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:16:21 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:16:21 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:16:21 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:16:21 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:16:21 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:16:21 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:16:21 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:16:21 [] ? __wake_up_common+0x59/0x90 2013-10-12 11:16:21 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:16:21 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:16:21 [] ? kthread+0x96/0xa0 2013-10-12 11:16:21 [] ? child_rip+0xa/0x20 2013-10-12 11:16:21 [] ? kthread+0x0/0xa0 2013-10-12 11:16:21 [] ? child_rip+0x0/0x20 2013-10-12 11:16:21 BUG: soft lockup - CPU#7 stuck for 67s! 
[ll_ost01_009:6315] 2013-10-12 11:16:21 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:16:21 CPU 7 2013-10-12 11:16:21 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:16:21 2013-10-12 11:16:21 Pid: 6315, comm: ll_ost01_009 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:16:21 RIP: 0010:[] [] _spin_lock+0x1e/0x30 2013-10-12 11:16:21 RSP: 0018:ffff880df8e01cb0 EFLAGS: 00000283 2013-10-12 11:16:21 RAX: 00000000000032b0 RBX: ffff880df8e01cb0 RCX: 0000000000000054 2013-10-12 11:16:21 RDX: 00000000000032ae RSI: ffff8805eef22400 RDI: ffff880813eda430 2013-10-12 11:16:21 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:16:21 R10: 00000000000958b3 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000000 2013-10-12 11:16:21 R13: ffffffffa0f70433 R14: ffff880df8e01d60 R15: ffff8805eef227a0 2013-10-12 11:16:22 FS: 0000000000000000(0000) GS:ffff8800446e0000(0000) knlGS:0000000000000000 2013-10-12 11:16:22 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:16:22 CR2: 00007ffff7feb000 CR3: 0000000001a85000 CR4: 00000000000407e0 2013-10-12 11:16:22 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:16:22 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:16:22 Process ll_ost01_009 (pid: 6315, threadinfo ffff880df8e00000, task ffff880dfffff500) 2013-10-12 11:16:22 Stack: 2013-10-12 11:16:22 ffff880df8e01d10 ffffffffa0a41340 ffff880df8e01d10 ffffffffa0a78b6b 2013-10-12 11:16:22 ffff880813f11c40 0000000000000000 0000000000000000 ffff8805eef22400 2013-10-12 11:16:22 ffff880813eda400 ffff880813eda4d0 00000000000014e0 000000005259914d 2013-10-12 11:16:22 Call Trace: 2013-10-12 11:16:22 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:16:22 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:16:22 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:16:22 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:16:22 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:16:22 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:16:22 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:16:22 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:16:22 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:16:22 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:16:22 [] ? kthread+0x96/0xa0 2013-10-12 11:16:22 [] ? child_rip+0xa/0x20 2013-10-12 11:16:22 [] ? kthread+0x0/0xa0 2013-10-12 11:16:22 [] ? child_rip+0x0/0x20 2013-10-12 11:16:22 Code: 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 2013-10-12 11:16:22 Call Trace: 2013-10-12 11:16:22 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:16:22 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:16:22 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:16:22 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:16:22 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:16:22 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:16:22 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:16:22 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:16:22 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:16:22 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:16:22 [] ? kthread+0x96/0xa0 2013-10-12 11:16:22 [] ? child_rip+0xa/0x20 2013-10-12 11:16:22 [] ? kthread+0x0/0xa0 2013-10-12 11:16:22 [] ? 
child_rip+0x0/0x20 2013-10-12 11:16:22 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:16:22 CPU 4 2013-10-12 11:16:22 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:16:22 2013-10-12 11:16:22 Pid: 6313, comm: ll_ost01_008 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:16:22 RIP: 0010:[] [] _spin_lock+0x1c/0x30 2013-10-12 11:16:22 RSP: 0018:ffff880dffff7cb0 EFLAGS: 00000287 2013-10-12 11:16:22 RAX: 00000000000032b2 RBX: ffff880dffff7cb0 RCX: 0000000000000054 2013-10-12 11:16:22 RDX: 00000000000032ae RSI: ffff8802fddb8000 RDI: ffff880813eda430 2013-10-12 11:16:22 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:16:22 R10: 00000000000a3eda R11: 5a5a5a5a5a5a5a5a R12: ffff880834502080 2013-10-12 11:16:22 R13: ffffffffa0f70433 R14: ffff880dffff7d60 R15: ffff8802fddb83a0 2013-10-12 11:16:22 FS: 0000000000000000(0000) GS:ffff880044680000(0000) knlGS:0000000000000000 2013-10-12 11:16:22 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:16:22 CR2: 00007ffffffdc4d8 CR3: 0000000001a85000 CR4: 00000000000407e0 2013-10-12 11:16:22 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:16:22 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:16:22 Process ll_ost01_008 (pid: 6313, threadinfo ffff880dffff6000, task ffff880dfffecae0) 2013-10-12 11:16:22 Stack: 2013-10-12 11:16:22 ffff880dffff7d10 ffffffffa0a41340 ffff880dffff7d10 ffffffffa0a78b6b 2013-10-12 11:16:22 ffff880813f11c40 0000000000000000 0000000000000000 ffff8802fddb8000 2013-10-12 11:16:22 ffff880813eda400 ffff880813eda4d0 000000000000f53b 000000005259914d 2013-10-12 11:16:22 Call Trace: 2013-10-12 11:16:22 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:16:22 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:16:22 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:16:22 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:16:22 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:16:22 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:16:22 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:16:22 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:16:22 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:16:22 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:16:22 [] ? kthread+0x96/0xa0 2013-10-12 11:16:22 [] ? child_rip+0xa/0x20 2013-10-12 11:16:22 [] ? kthread+0x0/0xa0 2013-10-12 11:16:22 [] ? child_rip+0x0/0x20 2013-10-12 11:16:22 Code: 81 2f 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e 90 0f b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 2013-10-12 11:16:22 Call Trace: 2013-10-12 11:16:22 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:16:22 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:16:22 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:16:22 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:16:22 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:16:22 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:16:22 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:16:22 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:16:22 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:16:22 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:16:22 [] ? kthread+0x96/0xa0 2013-10-12 11:16:22 [] ? child_rip+0xa/0x20 2013-10-12 11:16:22 [] ? kthread+0x0/0xa0 2013-10-12 11:16:22 [] ? child_rip+0x0/0x20 2013-10-12 11:16:46 INFO: task kthreadd:6322 blocked for more than 120 seconds. 2013-10-12 11:16:46 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
2013-10-12 11:16:46 kthreadd D 000000000000000c 0 6322 2 0x00000000 2013-10-12 11:16:46 ffff880df8e23ee0 0000000000000046 0000000000000000 ffff880df8e23ea4 2013-10-12 11:16:46 0000000000000000 ffff88083fe82800 ffff880044696740 0000000000000400 2013-10-12 11:16:46 ffff880df8e19058 ffff880df8e23fd8 000000000000fb88 ffff880df8e19058 2013-10-12 11:16:46 Call Trace: 2013-10-12 11:16:46 [] ? libcfs_debug_dumplog_thread+0x0/0x30 [libcfs] 2013-10-12 11:16:46 [] kthread+0x77/0xa0 2013-10-12 11:16:46 [] child_rip+0xa/0x20 2013-10-12 11:16:46 [] ? kthread+0x0/0xa0 2013-10-12 11:16:46 [] ? child_rip+0x0/0x20 2013-10-12 11:16:53 LNet: Service thread pid 5939 was inactive for 200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: 2013-10-12 11:16:53 Pid: 5939, comm: ll_ost01_002 2013-10-12 11:16:53 2013-10-12 11:16:53 Call Trace: 2013-10-12 11:16:53 [] ? kiblnd_check_sends+0x2ba/0x610 [ko2iblnd] 2013-10-12 11:16:53 [] ? kiblnd_queue_tx+0x4a/0x60 [ko2iblnd] 2013-10-12 11:16:53 [] ? kiblnd_launch_tx+0xf7/0xa80 [ko2iblnd] 2013-10-12 11:16:53 [] ? kiblnd_send+0x2a3/0x9b0 [ko2iblnd] 2013-10-12 11:16:53 [] ? lnet_ni_send+0x4b/0xf0 [lnet] 2013-10-12 11:16:53 [] ? lnet_send+0x6e6/0xb70 [lnet] 2013-10-12 11:16:53 [] ? LNetPut+0x31a/0x860 [lnet] 2013-10-12 11:16:53 [] ? ptl_send_buf+0x1e0/0x550 [ptlrpc] 2013-10-12 11:16:53 [] ? at_measured+0x108/0x380 [ptlrpc] 2013-10-12 11:16:53 [] ? null_authorize+0x75/0x100 [ptlrpc] 2013-10-12 11:16:53 [] ? ptlrpc_send_reply+0x28e/0x7f0 [ptlrpc] 2013-10-12 11:16:53 [] ? call_rcu_sched+0x15/0x20 2013-10-12 11:16:53 [] ? target_send_reply_msg+0x54/0x190 [ptlrpc] 2013-10-12 11:16:53 [] ? target_send_reply+0x3e6/0x720 [ptlrpc] 2013-10-12 11:16:53 [] ? oti_to_request+0x75/0xc0 [ost] 2013-10-12 11:16:53 [] ? ost_handle+0x203/0x44d0 [ost] 2013-10-12 11:16:53 [] ? apic_timer_interrupt+0xe/0x20 2013-10-12 11:16:53 [] ? _spin_lock+0x1e/0x30 2013-10-12 11:16:53 [] ? 
ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:16:53 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:16:53 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:16:53 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:16:53 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:16:53 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:16:53 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:16:53 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:16:53 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:16:53 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:16:53 [] ? kthread+0x96/0xa0 2013-10-12 11:16:53 [] ? child_rip+0xa/0x20 2013-10-12 11:16:53 [] ? kthread+0x0/0xa0 2013-10-12 11:16:53 [] ? child_rip+0x0/0x20 2013-10-12 11:16:53 2013-10-12 11:16:53 LustreError: dumping log to /tmp/lustre-log.1381601813.5939 2013-10-12 11:16:53 LNet: Service thread pid 6223 was inactive for 200.21s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: 2013-10-12 11:16:53 Pid: 6223, comm: ll_ost01_004 2013-10-12 11:16:53 2013-10-12 11:16:53 Call Trace: 2013-10-12 11:16:53 [] schedule_timeout+0x192/0x2e0 2013-10-12 11:16:53 [] ? process_timeout+0x0/0x10 2013-10-12 11:16:53 [] ptlrpc_set_wait+0x2da/0x860 [ptlrpc] 2013-10-12 11:16:53 [] ? default_wake_function+0x0/0x20 2013-10-12 11:16:53 [] ? ldlm_work_gl_ast_lock+0x0/0x290 [ptlrpc] 2013-10-12 11:16:53 [] ldlm_run_ast_work+0x1bb/0x470 [ptlrpc] 2013-10-12 11:16:53 [] ldlm_glimpse_locks+0x3b/0x100 [ptlrpc] 2013-10-12 11:16:53 [] ofd_intent_policy+0x516/0x7d0 [ofd] 2013-10-12 11:16:53 [] ldlm_lock_enqueue+0x361/0x8c0 [ptlrpc] 2013-10-12 11:16:53 [] ldlm_handle_enqueue0+0x4ef/0x10a0 [ptlrpc] 2013-10-12 11:16:53 [] ldlm_handle_enqueue+0x5e/0x70 [ptlrpc] 2013-10-12 11:16:53 [] ? ldlm_server_completion_ast+0x0/0x6c0 [ptlrpc] 2013-10-12 11:16:53 [] ? 
ost_blocking_ast+0x0/0x1020 [ost] 2013-10-12 11:16:53 [] ? ldlm_server_glimpse_ast+0x0/0x3b0 [ptlrpc] 2013-10-12 11:16:53 [] ost_handle+0x1d1d/0x44d0 [ost] 2013-10-12 11:16:53 [] ? ptlrpc_update_export_timer+0x4b/0x560 [ptlrpc] 2013-10-12 11:16:53 [] ptlrpc_server_handle_request+0x385/0xc00 [ptlrpc] 2013-10-12 11:16:53 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:16:53 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:16:54 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:16:54 [] ? __wake_up_common+0x59/0x90 2013-10-12 11:16:54 [] ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:16:54 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:16:54 [] kthread+0x96/0xa0 2013-10-12 11:16:54 [] child_rip+0xa/0x20 2013-10-12 11:16:54 [] ? kthread+0x0/0xa0 2013-10-12 11:16:54 [] ? child_rip+0x0/0x20 2013-10-12 11:16:54 2013-10-12 11:16:54 Pid: 6313, comm: ll_ost01_008 2013-10-12 11:16:54 2013-10-12 11:16:54 Call Trace: 2013-10-12 11:16:54 [] ? kiblnd_check_sends+0x2ba/0x610 [ko2iblnd] 2013-10-12 11:16:54 [] ? kiblnd_queue_tx+0x4a/0x60 [ko2iblnd] 2013-10-12 11:16:54 [] ? kiblnd_launch_tx+0xf7/0xa80 [ko2iblnd] 2013-10-12 11:16:54 [] ? kiblnd_send+0x2a3/0x9b0 [ko2iblnd] 2013-10-12 11:16:54 [] ? lnet_ni_send+0x4b/0xf0 [lnet] 2013-10-12 11:16:54 [] ? lnet_send+0x6e6/0xb70 [lnet] 2013-10-12 11:16:54 [] ? LNetPut+0x31a/0x860 [lnet] 2013-10-12 11:16:54 [] ? ptl_send_buf+0x1e0/0x550 [ptlrpc] 2013-10-12 11:16:54 [] ? at_measured+0x108/0x380 [ptlrpc] 2013-10-12 11:16:54 [] ? null_authorize+0x75/0x100 [ptlrpc] 2013-10-12 11:16:54 [] ? ptlrpc_send_reply+0x28e/0x7f0 [ptlrpc] 2013-10-12 11:16:54 [] ? call_rcu_sched+0x15/0x20 2013-10-12 11:16:54 [] ? target_send_reply_msg+0x54/0x190 [ptlrpc] 2013-10-12 11:16:54 [] ? target_send_reply+0x3e6/0x720 [ptlrpc] 2013-10-12 11:16:54 [] ? oti_to_request+0x75/0xc0 [ost] 2013-10-12 11:16:54 [] ? ost_handle+0x203/0x44d0 [ost] 2013-10-12 11:16:54 [] ? apic_timer_interrupt+0xe/0x20 2013-10-12 11:16:54 [] ? 
_spin_lock+0x1e/0x30 2013-10-12 11:16:54 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:16:54 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:16:54 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:16:54 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:16:54 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:16:54 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:16:54 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:16:54 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:16:54 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:16:54 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:16:54 [] ? kthread+0x96/0xa0 2013-10-12 11:16:54 [] ? child_rip+0xa/0x20 2013-10-12 11:16:54 [] ? kthread+0x0/0xa0 2013-10-12 11:16:54 [] ? child_rip+0x0/0x20 2013-10-12 11:16:54 2013-10-12 11:16:54 Pid: 6295, comm: ll_ost01_007 2013-10-12 11:16:54 2013-10-12 11:16:54 Call Trace: 2013-10-12 11:16:54 [] schedule_timeout+0x192/0x2e0 2013-10-12 11:16:54 [] ? process_timeout+0x0/0x10 2013-10-12 11:16:54 [] ptlrpc_set_wait+0x2da/0x860 [ptlrpc] 2013-10-12 11:16:54 [] ? default_wake_function+0x0/0x20 2013-10-12 11:16:54 [] ? ldlm_work_gl_ast_lock+0x0/0x290 [ptlrpc] 2013-10-12 11:16:54 [] ldlm_run_ast_work+0x1bb/0x470 [ptlrpc] 2013-10-12 11:16:54 [] ldlm_glimpse_locks+0x3b/0x100 [ptlrpc] 2013-10-12 11:16:54 [] ofd_intent_policy+0x516/0x7d0 [ofd] 2013-10-12 11:16:54 [] ldlm_lock_enqueue+0x361/0x8c0 [ptlrpc] 2013-10-12 11:16:54 [] ldlm_handle_enqueue0+0x4ef/0x10a0 [ptlrpc] 2013-10-12 11:16:54 [] ldlm_handle_enqueue+0x5e/0x70 [ptlrpc] 2013-10-12 11:16:54 [] ? ldlm_server_completion_ast+0x0/0x6c0 [ptlrpc] 2013-10-12 11:16:54 [] ? ost_blocking_ast+0x0/0x1020 [ost] 2013-10-12 11:16:54 [] ? ldlm_server_glimpse_ast+0x0/0x3b0 [ptlrpc] 2013-10-12 11:16:54 [] ost_handle+0x1d1d/0x44d0 [ost] 2013-10-12 11:16:54 [] ? 
ptlrpc_update_export_timer+0x4b/0x560 [ptlrpc] 2013-10-12 11:16:54 [] ptlrpc_server_handle_request+0x385/0xc00 [ptlrpc] 2013-10-12 11:16:54 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:16:54 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:16:54 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:16:54 [] ? __wake_up_common+0x59/0x90 2013-10-12 11:16:54 [] ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:16:54 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:16:54 [] kthread+0x96/0xa0 2013-10-12 11:16:54 [] child_rip+0xa/0x20 2013-10-12 11:16:54 [] ? kthread+0x0/0xa0 2013-10-12 11:16:54 [] ? child_rip+0x0/0x20 2013-10-12 11:16:54 2013-10-12 11:16:54 LNet: Service thread pid 6315 was inactive for 200.74s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: 2013-10-12 11:16:54 LNet: Skipped 2 previous similar messages 2013-10-12 11:16:54 Pid: 6315, comm: ll_ost01_009 2013-10-12 11:16:54 2013-10-12 11:16:54 Call Trace: 2013-10-12 11:16:54 [] ? kiblnd_check_sends+0x2ba/0x610 [ko2iblnd] 2013-10-12 11:16:54 [] ? kiblnd_queue_tx+0x4a/0x60 [ko2iblnd] 2013-10-12 11:16:54 [] ? kiblnd_launch_tx+0xf7/0xa80 [ko2iblnd] 2013-10-12 11:16:54 [] ? kiblnd_send+0x2a3/0x9b0 [ko2iblnd] 2013-10-12 11:16:54 [] ? lnet_ni_send+0x4b/0xf0 [lnet] 2013-10-12 11:16:54 [] ? lnet_send+0x6e6/0xb70 [lnet] 2013-10-12 11:16:54 [] ? LNetPut+0x31a/0x860 [lnet] 2013-10-12 11:16:54 [] ? ptl_send_buf+0x1e0/0x550 [ptlrpc] 2013-10-12 11:16:54 [] ? at_measured+0x108/0x380 [ptlrpc] 2013-10-12 11:16:54 [] ? null_authorize+0x75/0x100 [ptlrpc] 2013-10-12 11:16:54 [] ? ptlrpc_send_reply+0x28e/0x7f0 [ptlrpc] 2013-10-12 11:16:54 [] ? call_rcu_sched+0x15/0x20 2013-10-12 11:16:54 [] ? target_send_reply_msg+0x54/0x190 [ptlrpc] 2013-10-12 11:16:54 [] ? target_send_reply+0x3e6/0x720 [ptlrpc] 2013-10-12 11:16:54 [] ? oti_to_request+0x75/0xc0 [ost] 2013-10-12 11:16:54 [] ? ost_handle+0x203/0x44d0 [ost] 2013-10-12 11:16:54 [] ? 
reschedule_interrupt+0xe/0x20 2013-10-12 11:16:54 [] ? _spin_lock+0x21/0x30 2013-10-12 11:16:54 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:16:54 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:16:54 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:16:54 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:16:54 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:16:54 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:16:54 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:16:54 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:16:54 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:16:54 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:16:54 [] ? kthread+0x96/0xa0 2013-10-12 11:16:54 [] ? child_rip+0xa/0x20 2013-10-12 11:16:54 [] ? kthread+0x0/0xa0 2013-10-12 11:16:54 [] ? child_rip+0x0/0x20 2013-10-12 11:16:54 2013-10-12 11:16:54 LNet: Service thread pid 6292 was inactive for 200.96s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 2013-10-12 11:16:54 LNet: Service thread pid 5938 was inactive for 200.97s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 2013-10-12 11:17:11 Lustre: lustre-OST000e: haven't heard from client 52ab5f25-6714-c947-512b-c3acf49a6c09 (at 192.168.118.114@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88054dbeb000, cur 1381601831 expire 1381601681 last 1381601604 2013-10-12 11:17:11 Lustre: lustre-OST000e: haven't heard from client a663b67f-86c2-1fd1-bcda-f2a0b8b8e129 (at 192.168.118.119@o2ib) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8802fdd7dc00, cur 1381601831 expire 1381601681 last 1381601604 2013-10-12 11:17:14 LNetError: 4985:0:(o2iblnd_cb.c:3012:kiblnd_check_txs_locked()) Timed out tx: active_txs, 0 seconds 2013-10-12 11:17:14 LNetError: 4985:0:(o2iblnd_cb.c:3012:kiblnd_check_txs_locked()) Skipped 8 previous similar messages 2013-10-12 11:17:14 LNetError: 4985:0:(o2iblnd_cb.c:3075:kiblnd_check_conns()) Timed out RDMA with 192.168.124.10@o2ib (154): c: 8, oc: 0, rc: 8 2013-10-12 11:17:14 LNetError: 4985:0:(o2iblnd_cb.c:3075:kiblnd_check_conns()) Skipped 8 previous similar messages 2013-10-12 11:17:33 Lustre: 5862:0:(client.c:1897:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1381601798/real 0] req@ffff88077979e400 x1448711817269380/t0(0) o38->lustre-MDT0000-lwp-OST000e@192.168.120.5@o2ib1:12/10 lens 400/544 e 0 to 1 dl 1381601853 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1 2013-10-12 11:17:33 Lustre: 5862:0:(client.c:1897:ptlrpc_expire_one_request()) Skipped 6 previous similar messages 2013-10-12 11:17:44 BUG: soft lockup - CPU#4 stuck for 67s! [ll_ost01_008:6313] 2013-10-12 11:17:44 BUG: soft lockup - CPU#5 stuck for 67s! 
[ll_ost01_002:5939] 2013-10-12 11:17:44 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:17:44 CPU 5 2013-10-12 11:17:44 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:17:44 2013-10-12 11:17:44 Pid: 5939, comm: ll_ost01_002 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:17:44 RIP: 0010:[] [] _spin_lock+0x1e/0x30 2013-10-12 11:17:44 RSP: 0018:ffff880ccc9d5cb0 EFLAGS: 00000297 2013-10-12 11:17:44 RAX: 00000000000032af RBX: ffff880ccc9d5cb0 RCX: 000000000000002f 2013-10-12 11:17:44 RDX: 00000000000032ae RSI: ffff8805eef22c00 RDI: ffff880813eda430 2013-10-12 11:17:44 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:17:44 R10: 00000000000957c0 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000000 2013-10-12 11:17:44 R13: ffffffffa0f70433 R14: ffff880ccc9d5d60 R15: ffff8805eef22fa0 2013-10-12 11:17:44 FS: 0000000000000000(0000) GS:ffff8800446a0000(0000) knlGS:0000000000000000 2013-10-12 11:17:44 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:17:44 CR2: 00007ffff7ff7000 CR3: 00000010347bb000 CR4: 00000000000407e0 2013-10-12 11:17:44 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:17:44 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:17:44 Process ll_ost01_002 (pid: 5939, threadinfo ffff880ccc9d4000, task ffff880ccc9ca040) 2013-10-12 11:17:44 Stack: 2013-10-12 11:17:44 ffff880ccc9d5d10 ffffffffa0a41340 ffff880ccc9d5d10 ffffffffa0a78b6b 2013-10-12 11:17:44 ffff880813f11c40 0000000000000000 0000000000000000 ffff8805eef22c00 2013-10-12 11:17:44 ffff880813eda400 ffff880813eda4d0 0000000000001967 000000005259914d 2013-10-12 11:17:44 Call Trace: 2013-10-12 11:17:44 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:17:45 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:17:45 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:17:45 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:17:45 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:17:45 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:17:45 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:17:45 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:17:45 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:17:45 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:17:45 [] ? kthread+0x96/0xa0 2013-10-12 11:17:45 [] ? child_rip+0xa/0x20 2013-10-12 11:17:45 [] ? kthread+0x0/0xa0 2013-10-12 11:17:45 [] ? child_rip+0x0/0x20 2013-10-12 11:17:45 Code: 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 2013-10-12 11:17:45 Call Trace: 2013-10-12 11:17:45 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:17:45 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:17:45 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:17:45 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:17:45 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:17:45 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:17:45 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:17:45 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:17:45 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:17:45 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:17:45 [] ? kthread+0x96/0xa0 2013-10-12 11:17:45 [] ? child_rip+0xa/0x20 2013-10-12 11:17:45 [] ? kthread+0x0/0xa0 2013-10-12 11:17:45 [] ? child_rip+0x0/0x20 2013-10-12 11:17:45 BUG: soft lockup - CPU#6 stuck for 67s! 
[ll_ost01_006:6292] 2013-10-12 11:17:45 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:17:45 CPU 6 2013-10-12 11:17:45 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:17:45 2013-10-12 11:17:45 Pid: 6292, comm: ll_ost01_006 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:17:45 RIP: 0010:[] [] _spin_lock+0x21/0x30 2013-10-12 11:17:45 RSP: 0018:ffff880e05805cb0 EFLAGS: 00000283 2013-10-12 11:17:45 RAX: 00000000000032b1 RBX: ffff880e05805cb0 RCX: 000000000000003d 2013-10-12 11:17:45 RDX: 00000000000032ae RSI: ffff88067a044800 RDI: ffff880813eda430 2013-10-12 11:17:45 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:17:45 R10: 00000000000958d5 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000400 2013-10-12 11:17:45 R13: ffffffffa0f70433 R14: ffff880e05805d60 R15: ffff88067a044ba0 2013-10-12 11:17:45 FS: 0000000000000000(0000) GS:ffff8800446c0000(0000) knlGS:0000000000000000 2013-10-12 11:17:45 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:17:45 CR2: 00007ffff7feb000 CR3: 000000082827b000 CR4: 00000000000407e0 2013-10-12 11:17:45 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:17:45 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:17:45 Process ll_ost01_006 (pid: 6292, threadinfo ffff880e05804000, task ffff880fd403c080) 2013-10-12 11:17:45 Stack: 2013-10-12 11:17:45 ffff880e05805d10 ffffffffa0a41340 ffff880e05805d10 ffffffffa0a78b6b 2013-10-12 11:17:45 ffff880813f11c40 0000000000000000 0000000000000000 ffff88067a044800 2013-10-12 11:17:45 ffff880813eda400 ffff880813eda4d0 00000000000014e8 000000005259914d 2013-10-12 11:17:45 Call Trace: 2013-10-12 11:17:45 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:17:45 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:17:45 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:17:45 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:17:45 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:17:45 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:17:45 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:17:45 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:17:45 [] ? __wake_up_common+0x59/0x90 2013-10-12 11:17:45 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:17:45 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:17:45 [] ? kthread+0x96/0xa0 2013-10-12 11:17:45 [] ? child_rip+0xa/0x20 2013-10-12 11:17:45 [] ? kthread+0x0/0xa0 2013-10-12 11:17:45 [] ? child_rip+0x0/0x20 2013-10-12 11:17:45 Code: 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 0f b7 17 f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 e5 0f 1f 2013-10-12 11:17:45 Call Trace: 2013-10-12 11:17:45 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:17:45 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:17:45 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:17:45 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:17:45 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:17:45 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:17:45 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:17:45 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:17:45 [] ? __wake_up_common+0x59/0x90 2013-10-12 11:17:45 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:17:45 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:17:45 [] ? kthread+0x96/0xa0 2013-10-12 11:17:45 [] ? child_rip+0xa/0x20 2013-10-12 11:17:45 [] ? kthread+0x0/0xa0 2013-10-12 11:17:45 [] ? child_rip+0x0/0x20 2013-10-12 11:17:45 BUG: soft lockup - CPU#7 stuck for 67s! 
[ll_ost01_009:6315] 2013-10-12 11:17:45 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:17:45 CPU 7 2013-10-12 11:17:45 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:17:45 2013-10-12 11:17:45 Pid: 6315, comm: ll_ost01_009 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:17:45 RIP: 0010:[] [] _spin_lock+0x1c/0x30 2013-10-12 11:17:45 RSP: 0018:ffff880df8e01cb0 EFLAGS: 00000283 2013-10-12 11:17:45 RAX: 00000000000032b0 RBX: ffff880df8e01cb0 RCX: 0000000000000054 2013-10-12 11:17:45 RDX: 00000000000032ae RSI: ffff8805eef22400 RDI: ffff880813eda430 2013-10-12 11:17:45 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:17:45 R10: 00000000000958b3 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000000 2013-10-12 11:17:45 R13: ffffffffa0f70433 R14: ffff880df8e01d60 R15: ffff8805eef227a0 2013-10-12 11:17:46 FS: 0000000000000000(0000) GS:ffff8800446e0000(0000) knlGS:0000000000000000 2013-10-12 11:17:46 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:17:46 CR2: 00007ffff7feb000 CR3: 0000000001a85000 CR4: 00000000000407e0 2013-10-12 11:17:46 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:17:46 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:17:46 Process ll_ost01_009 (pid: 6315, threadinfo ffff880df8e00000, task ffff880dfffff500) 2013-10-12 11:17:46 Stack: 2013-10-12 11:17:46 ffff880df8e01d10 ffffffffa0a41340 ffff880df8e01d10 ffffffffa0a78b6b 2013-10-12 11:17:46 ffff880813f11c40 0000000000000000 0000000000000000 ffff8805eef22400 2013-10-12 11:17:46 ffff880813eda400 ffff880813eda4d0 00000000000014e0 000000005259914d 2013-10-12 11:17:46 Call Trace: 2013-10-12 11:17:46 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:17:46 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:17:46 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:17:46 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:17:46 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:17:46 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:17:46 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:17:46 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:17:46 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:17:46 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:17:46 [] ? kthread+0x96/0xa0 2013-10-12 11:17:46 [] ? child_rip+0xa/0x20 2013-10-12 11:17:46 [] ? kthread+0x0/0xa0 2013-10-12 11:17:46 [] ? child_rip+0x0/0x20 2013-10-12 11:17:46 Code: 81 2f 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e 90 0f b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 2013-10-12 11:17:46 Call Trace: 2013-10-12 11:17:46 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:17:46 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:17:46 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:17:46 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:17:46 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:17:46 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:17:46 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:17:46 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:17:46 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:17:46 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:17:46 [] ? kthread+0x96/0xa0 2013-10-12 11:17:46 [] ? child_rip+0xa/0x20 2013-10-12 11:17:46 [] ? kthread+0x0/0xa0 2013-10-12 11:17:46 [] ? 
child_rip+0x0/0x20 2013-10-12 11:17:46 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:17:46 CPU 4 2013-10-12 11:17:46 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:17:46 2013-10-12 11:17:46 Pid: 6313, comm: ll_ost01_008 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:17:46 RIP: 0010:[] [] _spin_lock+0x1e/0x30 2013-10-12 11:17:46 RSP: 0018:ffff880dffff7cb0 EFLAGS: 00000287 2013-10-12 11:17:46 RAX: 00000000000032b2 RBX: ffff880dffff7cb0 RCX: 0000000000000054 2013-10-12 11:17:46 RDX: 00000000000032ae RSI: ffff8802fddb8000 RDI: ffff880813eda430 2013-10-12 11:17:46 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:17:46 R10: 00000000000a3eda R11: 5a5a5a5a5a5a5a5a R12: ffff880834502080 2013-10-12 11:17:46 R13: ffffffffa0f70433 R14: ffff880dffff7d60 R15: ffff8802fddb83a0 2013-10-12 11:17:46 FS: 0000000000000000(0000) GS:ffff880044680000(0000) knlGS:0000000000000000 2013-10-12 11:17:46 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:17:46 CR2: 00007ffffffdc4d8 CR3: 0000000001a85000 CR4: 00000000000407e0 2013-10-12 11:17:46 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:17:46 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:17:46 Process ll_ost01_008 (pid: 6313, threadinfo ffff880dffff6000, task ffff880dfffecae0) 2013-10-12 11:17:46 Stack: 2013-10-12 11:17:46 ffff880dffff7d10 ffffffffa0a41340 ffff880dffff7d10 ffffffffa0a78b6b 2013-10-12 11:17:46 ffff880813f11c40 0000000000000000 0000000000000000 ffff8802fddb8000 2013-10-12 11:17:46 ffff880813eda400 ffff880813eda4d0 000000000000f53b 000000005259914d 2013-10-12 11:17:46 Call Trace: 2013-10-12 11:17:46 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:17:46 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:17:46 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:17:46 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:17:46 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:17:46 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:17:46 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:17:46 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:17:46 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:17:46 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:17:46 [] ? kthread+0x96/0xa0 2013-10-12 11:17:46 [] ? child_rip+0xa/0x20 2013-10-12 11:17:46 [] ? kthread+0x0/0xa0 2013-10-12 11:17:46 [] ? child_rip+0x0/0x20 2013-10-12 11:17:46 Code: 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 2013-10-12 11:17:46 Call Trace: 2013-10-12 11:17:46 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:17:46 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:17:46 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:17:46 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:17:46 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:17:46 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:17:46 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:17:46 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:17:46 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:17:46 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:17:46 [] ? kthread+0x96/0xa0 2013-10-12 11:17:46 [] ? child_rip+0xa/0x20 2013-10-12 11:17:46 [] ? kthread+0x0/0xa0 2013-10-12 11:17:46 [] ? 
child_rip+0x0/0x20 2013-10-12 11:17:53 Lustre: 5862:0:(client.c:1897:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1381601873/real 1381601873] req@ffff8801670f7400 x1448711817269388/t0(0) o250->MGC192.168.120.5@o2ib@192.168.120.5@o2ib:26/25 lens 400/544 e 0 to 1 dl 1381601928 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 2013-10-12 11:17:53 Lustre: 5862:0:(client.c:1897:ptlrpc_expire_one_request()) Skipped 2 previous similar messages 2013-10-12 11:18:39 LNetError: 4985:0:(o2iblnd_cb.c:3012:kiblnd_check_txs_locked()) Timed out tx: active_txs, 0 seconds 2013-10-12 11:18:39 LNetError: 4985:0:(o2iblnd_cb.c:3012:kiblnd_check_txs_locked()) Skipped 2 previous similar messages 2013-10-12 11:18:39 LNetError: 4985:0:(o2iblnd_cb.c:3075:kiblnd_check_conns()) Timed out RDMA with 192.168.118.106@o2ib (158): c: 8, oc: 0, rc: 8 2013-10-12 11:18:39 LNetError: 4985:0:(o2iblnd_cb.c:3075:kiblnd_check_conns()) Skipped 2 previous similar messages 2013-10-12 11:18:46 INFO: task kthreadd:6322 blocked for more than 120 seconds. 2013-10-12 11:18:46 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 2013-10-12 11:18:46 kthreadd D 000000000000000c 0 6322 2 0x00000000 2013-10-12 11:18:46 ffff880df8e23ee0 0000000000000046 0000000000000000 ffff880df8e23ea4 2013-10-12 11:18:46 0000000000000000 ffff88083fe82800 ffff880044696740 0000000000000400 2013-10-12 11:18:46 ffff880df8e19058 ffff880df8e23fd8 000000000000fb88 ffff880df8e19058 2013-10-12 11:18:46 Call Trace: 2013-10-12 11:18:46 [] ? libcfs_debug_dumplog_thread+0x0/0x30 [libcfs] 2013-10-12 11:18:46 [] kthread+0x77/0xa0 2013-10-12 11:18:46 [] child_rip+0xa/0x20 2013-10-12 11:18:46 [] ? kthread+0x0/0xa0 2013-10-12 11:18:46 [] ? 
child_rip+0x0/0x20 2013-10-12 11:18:48 Lustre: 5862:0:(client.c:1897:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1381601873/real 0] req@ffff88070d25e400 x1448711817269392/t0(0) o38->lustre-MDT0000-lwp-OST000e@192.168.120.5@o2ib1:12/10 lens 400/544 e 0 to 1 dl 1381601928 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1 2013-10-12 11:19:08 BUG: soft lockup - CPU#4 stuck for 67s! [ll_ost01_008:6313] 2013-10-12 11:19:08 BUG: soft lockup - CPU#5 stuck for 67s! [ll_ost01_002:5939] 2013-10-12 11:19:08 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:19:08 CPU 5 2013-10-12 11:19:08 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg 
sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:19:08 2013-10-12 11:19:08 Pid: 5939, comm: ll_ost01_002 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:19:08 RIP: 0010:[] [] _spin_lock+0x1e/0x30 2013-10-12 11:19:08 RSP: 0018:ffff880ccc9d5cb0 EFLAGS: 00000297 2013-10-12 11:19:08 RAX: 00000000000032af RBX: ffff880ccc9d5cb0 RCX: 000000000000002f 2013-10-12 11:19:08 RDX: 00000000000032ae RSI: ffff8805eef22c00 RDI: ffff880813eda430 2013-10-12 11:19:08 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:19:08 R10: 00000000000957c0 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000000 2013-10-12 11:19:08 R13: ffffffffa0f70433 R14: ffff880ccc9d5d60 R15: ffff8805eef22fa0 2013-10-12 11:19:08 FS: 0000000000000000(0000) GS:ffff8800446a0000(0000) knlGS:0000000000000000 2013-10-12 11:19:08 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:19:08 CR2: 00007ffff7ff7000 CR3: 00000010347bb000 CR4: 00000000000407e0 2013-10-12 11:19:08 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:19:09 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:19:09 Process ll_ost01_002 (pid: 5939, threadinfo ffff880ccc9d4000, task ffff880ccc9ca040) 2013-10-12 11:19:09 Stack: 2013-10-12 11:19:09 ffff880ccc9d5d10 ffffffffa0a41340 ffff880ccc9d5d10 ffffffffa0a78b6b 
2013-10-12 11:19:09 ffff880813f11c40 0000000000000000 0000000000000000 ffff8805eef22c00 2013-10-12 11:19:09 ffff880813eda400 ffff880813eda4d0 0000000000001967 000000005259914d 2013-10-12 11:19:09 Call Trace: 2013-10-12 11:19:09 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:19:09 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:19:09 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:19:09 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:19:09 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:19:09 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:19:09 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:19:09 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:19:09 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:19:09 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:19:09 [] ? kthread+0x96/0xa0 2013-10-12 11:19:09 [] ? child_rip+0xa/0x20 2013-10-12 11:19:09 [] ? kthread+0x0/0xa0 2013-10-12 11:19:09 [] ? child_rip+0x0/0x20 2013-10-12 11:19:09 Code: 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 2013-10-12 11:19:09 Call Trace: 2013-10-12 11:19:09 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:19:09 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:19:09 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:19:09 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:19:09 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:19:09 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:19:09 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:19:09 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:19:09 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:19:09 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:19:09 [] ? 
kthread+0x96/0xa0 2013-10-12 11:19:09 [] ? child_rip+0xa/0x20 2013-10-12 11:19:09 [] ? kthread+0x0/0xa0 2013-10-12 11:19:09 [] ? child_rip+0x0/0x20 2013-10-12 11:19:09 BUG: soft lockup - CPU#6 stuck for 67s! [ll_ost01_006:6292] 2013-10-12 11:19:09 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:19:09 CPU 6 2013-10-12 11:19:09 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath 
dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:19:09 2013-10-12 11:19:09 Pid: 6292, comm: ll_ost01_006 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:19:09 RIP: 0010:[] [] _spin_lock+0x1c/0x30 2013-10-12 11:19:09 RSP: 0018:ffff880e05805cb0 EFLAGS: 00000283 2013-10-12 11:19:09 RAX: 00000000000032b1 RBX: ffff880e05805cb0 RCX: 000000000000003d 2013-10-12 11:19:09 RDX: 00000000000032ae RSI: ffff88067a044800 RDI: ffff880813eda430 2013-10-12 11:19:09 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:19:09 R10: 00000000000958d5 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000400 2013-10-12 11:19:09 R13: ffffffffa0f70433 R14: ffff880e05805d60 R15: ffff88067a044ba0 2013-10-12 11:19:09 FS: 0000000000000000(0000) GS:ffff8800446c0000(0000) knlGS:0000000000000000 2013-10-12 11:19:09 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:19:09 CR2: 00007ffff7feb000 CR3: 000000082827b000 CR4: 00000000000407e0 2013-10-12 11:19:09 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:19:09 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:19:09 Process ll_ost01_006 (pid: 6292, threadinfo ffff880e05804000, task ffff880fd403c080) 2013-10-12 11:19:09 Stack: 2013-10-12 11:19:09 ffff880e05805d10 ffffffffa0a41340 ffff880e05805d10 ffffffffa0a78b6b 2013-10-12 11:19:09 ffff880813f11c40 0000000000000000 0000000000000000 ffff88067a044800 2013-10-12 11:19:09 ffff880813eda400 ffff880813eda4d0 00000000000014e8 000000005259914d 2013-10-12 11:19:09 Call Trace: 2013-10-12 11:19:09 [] ? 
ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:19:09 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:19:09 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:19:09 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:19:09 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:19:09 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:19:09 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:19:09 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:19:09 [] ? __wake_up_common+0x59/0x90 2013-10-12 11:19:09 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:19:09 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:19:09 [] ? kthread+0x96/0xa0 2013-10-12 11:19:09 [] ? child_rip+0xa/0x20 2013-10-12 11:19:09 [] ? kthread+0x0/0xa0 2013-10-12 11:19:09 [] ? child_rip+0x0/0x20 2013-10-12 11:19:09 Code: 81 2f 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e <f3> 90 0f b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 2013-10-12 11:19:09 Call Trace: 2013-10-12 11:19:09 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:19:09 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:19:09 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:19:09 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:19:09 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:19:09 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:19:09 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:19:09 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:19:09 [] ? __wake_up_common+0x59/0x90 2013-10-12 11:19:09 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:19:09 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:19:09 [] ? kthread+0x96/0xa0 2013-10-12 11:19:09 [] ? child_rip+0xa/0x20 2013-10-12 11:19:09 [] ? kthread+0x0/0xa0 2013-10-12 11:19:09 [] ? 
child_rip+0x0/0x20 2013-10-12 11:19:09 BUG: soft lockup - CPU#7 stuck for 67s! [ll_ost01_009:6315] 2013-10-12 11:19:09 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:19:09 CPU 7 2013-10-12 11:19:09 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd 
fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:19:09 2013-10-12 11:19:09 Pid: 6315, comm: ll_ost01_009 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:19:09 RIP: 0010:[] [] _spin_lock+0x1e/0x30 2013-10-12 11:19:09 RSP: 0018:ffff880df8e01cb0 EFLAGS: 00000283 2013-10-12 11:19:09 RAX: 00000000000032b0 RBX: ffff880df8e01cb0 RCX: 0000000000000054 2013-10-12 11:19:09 RDX: 00000000000032ae RSI: ffff8805eef22400 RDI: ffff880813eda430 2013-10-12 11:19:09 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:19:09 R10: 00000000000958b3 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000000 2013-10-12 11:19:09 R13: ffffffffa0f70433 R14: ffff880df8e01d60 R15: ffff8805eef227a0 2013-10-12 11:19:10 FS: 0000000000000000(0000) GS:ffff8800446e0000(0000) knlGS:0000000000000000 2013-10-12 11:19:10 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:19:10 CR2: 00007ffff7feb000 CR3: 0000000001a85000 CR4: 00000000000407e0 2013-10-12 11:19:10 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:19:10 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:19:10 Process ll_ost01_009 (pid: 6315, threadinfo ffff880df8e00000, task ffff880dfffff500) 2013-10-12 11:19:10 Stack: 2013-10-12 11:19:10 ffff880df8e01d10 ffffffffa0a41340 ffff880df8e01d10 ffffffffa0a78b6b 2013-10-12 11:19:10 ffff880813f11c40 0000000000000000 0000000000000000 ffff8805eef22400 2013-10-12 11:19:10 ffff880813eda400 ffff880813eda4d0 00000000000014e0 000000005259914d 2013-10-12 11:19:10 Call Trace: 2013-10-12 11:19:10 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:19:10 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:19:10 [] ? 
ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:19:10 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:19:10 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:19:10 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:19:10 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:19:10 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:19:10 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:19:10 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:19:10 [] ? kthread+0x96/0xa0 2013-10-12 11:19:10 [] ? child_rip+0xa/0x20 2013-10-12 11:19:10 [] ? kthread+0x0/0xa0 2013-10-12 11:19:10 [] ? child_rip+0x0/0x20 2013-10-12 11:19:10 Code: 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 2013-10-12 11:19:10 Call Trace: 2013-10-12 11:19:10 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:19:10 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:19:10 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:19:10 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:19:10 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:19:10 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:19:10 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:19:10 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:19:10 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:19:10 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:19:10 [] ? kthread+0x96/0xa0 2013-10-12 11:19:10 [] ? child_rip+0xa/0x20 2013-10-12 11:19:10 [] ? kthread+0x0/0xa0 2013-10-12 11:19:10 [] ? 
child_rip+0x0/0x20 2013-10-12 11:19:10 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:19:10 CPU 4 2013-10-12 11:19:10 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:19:10 2013-10-12 11:19:10 Pid: 6313, comm: ll_ost01_008 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:19:10 RIP: 0010:[] [] _spin_lock+0x21/0x30 2013-10-12 11:19:10 RSP: 0018:ffff880dffff7cb0 EFLAGS: 00000287 2013-10-12 11:19:10 RAX: 00000000000032b2 RBX: ffff880dffff7cb0 RCX: 0000000000000054 2013-10-12 11:19:10 RDX: 00000000000032ae RSI: ffff8802fddb8000 RDI: ffff880813eda430 2013-10-12 11:19:10 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:19:10 R10: 00000000000a3eda R11: 5a5a5a5a5a5a5a5a R12: ffff880834502080 2013-10-12 11:19:10 R13: ffffffffa0f70433 R14: ffff880dffff7d60 R15: ffff8802fddb83a0 2013-10-12 11:19:10 FS: 0000000000000000(0000) GS:ffff880044680000(0000) knlGS:0000000000000000 2013-10-12 11:19:10 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:19:10 CR2: 00007ffffffdc4d8 CR3: 0000000001a85000 CR4: 00000000000407e0 2013-10-12 11:19:10 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:19:10 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:19:10 Process ll_ost01_008 (pid: 6313, threadinfo ffff880dffff6000, task ffff880dfffecae0) 2013-10-12 11:19:10 Stack: 2013-10-12 11:19:10 ffff880dffff7d10 ffffffffa0a41340 ffff880dffff7d10 ffffffffa0a78b6b 2013-10-12 11:19:10 ffff880813f11c40 0000000000000000 0000000000000000 ffff8802fddb8000 2013-10-12 11:19:10 ffff880813eda400 ffff880813eda4d0 000000000000f53b 000000005259914d 2013-10-12 11:19:10 Call Trace: 2013-10-12 11:19:10 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:19:10 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:19:10 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:19:10 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:19:10 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:19:10 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:19:10 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:19:10 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:19:10 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:19:10 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:19:10 [] ? kthread+0x96/0xa0 2013-10-12 11:19:10 [] ? child_rip+0xa/0x20 2013-10-12 11:19:10 [] ? kthread+0x0/0xa0 2013-10-12 11:19:10 [] ? child_rip+0x0/0x20 2013-10-12 11:19:10 Code: 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 0f b7 17 <eb> f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 e5 0f 1f 2013-10-12 11:19:10 Call Trace: 2013-10-12 11:19:10 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:19:10 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:19:10 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:19:10 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:19:10 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:19:10 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:19:10 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:19:10 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:19:10 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:19:10 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:19:10 [] ? kthread+0x96/0xa0 2013-10-12 11:19:10 [] ? child_rip+0xa/0x20 2013-10-12 11:19:10 [] ? kthread+0x0/0xa0 2013-10-12 11:19:10 [] ? 
child_rip+0x0/0x20 2013-10-12 11:19:59 LNetError: 4985:0:(o2iblnd_cb.c:3012:kiblnd_check_txs_locked()) Timed out tx: active_txs, 0 seconds 2013-10-12 11:19:59 LNetError: 4985:0:(o2iblnd_cb.c:3012:kiblnd_check_txs_locked()) Skipped 29 previous similar messages 2013-10-12 11:19:59 LNetError: 4985:0:(o2iblnd_cb.c:3075:kiblnd_check_conns()) Timed out RDMA with 192.168.124.20@o2ib (162): c: 8, oc: 0, rc: 8 2013-10-12 11:19:59 LNetError: 4985:0:(o2iblnd_cb.c:3075:kiblnd_check_conns()) Skipped 29 previous similar messages 2013-10-12 11:20:28 Lustre: 5862:0:(client.c:1897:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1381601973/real 0] req@ffff8806e73b2800 x1448711817269416/t0(0) o38->lustre-MDT0000-lwp-OST000e@192.168.120.5@o2ib1:12/10 lens 400/544 e 0 to 1 dl 1381602028 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1 2013-10-12 11:20:28 Lustre: 5862:0:(client.c:1897:ptlrpc_expire_one_request()) Skipped 4 previous similar messages 2013-10-12 11:20:32 BUG: soft lockup - CPU#4 stuck for 67s! [ll_ost01_008:6313] 2013-10-12 11:20:32 BUG: soft lockup - CPU#5 stuck for 67s! 
[ll_ost01_002:5939] 2013-10-12 11:20:32 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:20:32 CPU 5 2013-10-12 11:20:32 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:20:32 2013-10-12 11:20:32 Pid: 5939, comm: ll_ost01_002 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:20:32 RIP: 0010:[] [] _spin_lock+0x1e/0x30 2013-10-12 11:20:32 RSP: 0018:ffff880ccc9d5cb0 EFLAGS: 00000297 2013-10-12 11:20:32 RAX: 00000000000032af RBX: ffff880ccc9d5cb0 RCX: 000000000000002f 2013-10-12 11:20:32 RDX: 00000000000032ae RSI: ffff8805eef22c00 RDI: ffff880813eda430 2013-10-12 11:20:32 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:20:32 R10: 00000000000957c0 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000000 2013-10-12 11:20:32 R13: ffffffffa0f70433 R14: ffff880ccc9d5d60 R15: ffff8805eef22fa0 2013-10-12 11:20:32 FS: 0000000000000000(0000) GS:ffff8800446a0000(0000) knlGS:0000000000000000 2013-10-12 11:20:32 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:20:32 CR2: 00007ffff7ff7000 CR3: 00000010347bb000 CR4: 00000000000407e0 2013-10-12 11:20:32 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:20:32 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:20:32 Process ll_ost01_002 (pid: 5939, threadinfo ffff880ccc9d4000, task ffff880ccc9ca040) 2013-10-12 11:20:32 Stack: 2013-10-12 11:20:32 ffff880ccc9d5d10 ffffffffa0a41340 ffff880ccc9d5d10 ffffffffa0a78b6b 2013-10-12 11:20:32 ffff880813f11c40 0000000000000000 0000000000000000 ffff8805eef22c00 2013-10-12 11:20:32 ffff880813eda400 ffff880813eda4d0 0000000000001967 000000005259914d 2013-10-12 11:20:32 Call Trace: 2013-10-12 11:20:32 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:20:33 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:20:33 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:20:33 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:20:33 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:20:33 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:20:33 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:20:33 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:20:33 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:20:33 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:20:33 [] ? kthread+0x96/0xa0 2013-10-12 11:20:33 [] ? child_rip+0xa/0x20 2013-10-12 11:20:33 [] ? kthread+0x0/0xa0 2013-10-12 11:20:33 [] ? child_rip+0x0/0x20 2013-10-12 11:20:33 Code: 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 2013-10-12 11:20:33 Call Trace: 2013-10-12 11:20:33 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:20:33 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:20:33 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:20:33 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:20:33 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:20:33 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:20:33 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:20:33 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:20:33 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:20:33 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:20:33 [] ? kthread+0x96/0xa0 2013-10-12 11:20:33 [] ? child_rip+0xa/0x20 2013-10-12 11:20:33 [] ? kthread+0x0/0xa0 2013-10-12 11:20:33 [] ? child_rip+0x0/0x20 2013-10-12 11:20:33 BUG: soft lockup - CPU#6 stuck for 67s! 
[ll_ost01_006:6292] 2013-10-12 11:20:33 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:20:33 CPU 6 2013-10-12 11:20:33 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:20:33 2013-10-12 11:20:33 Pid: 6292, comm: ll_ost01_006 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:20:33 RIP: 0010:[] [] _spin_lock+0x1e/0x30 2013-10-12 11:20:33 RSP: 0018:ffff880e05805cb0 EFLAGS: 00000283 2013-10-12 11:20:33 RAX: 00000000000032b1 RBX: ffff880e05805cb0 RCX: 000000000000003d 2013-10-12 11:20:33 RDX: 00000000000032ae RSI: ffff88067a044800 RDI: ffff880813eda430 2013-10-12 11:20:33 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:20:33 R10: 00000000000958d5 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000400 2013-10-12 11:20:33 R13: ffffffffa0f70433 R14: ffff880e05805d60 R15: ffff88067a044ba0 2013-10-12 11:20:33 FS: 0000000000000000(0000) GS:ffff8800446c0000(0000) knlGS:0000000000000000 2013-10-12 11:20:33 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:20:33 CR2: 00007ffff7feb000 CR3: 000000082827b000 CR4: 00000000000407e0 2013-10-12 11:20:33 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:20:33 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:20:33 Process ll_ost01_006 (pid: 6292, threadinfo ffff880e05804000, task ffff880fd403c080) 2013-10-12 11:20:33 Stack: 2013-10-12 11:20:33 ffff880e05805d10 ffffffffa0a41340 ffff880e05805d10 ffffffffa0a78b6b 2013-10-12 11:20:33 ffff880813f11c40 0000000000000000 0000000000000000 ffff88067a044800 2013-10-12 11:20:33 ffff880813eda400 ffff880813eda4d0 00000000000014e8 000000005259914d 2013-10-12 11:20:33 Call Trace: 2013-10-12 11:20:33 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:20:33 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:20:33 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:20:33 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:20:33 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:20:33 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:20:33 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:20:33 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:20:33 [] ? __wake_up_common+0x59/0x90 2013-10-12 11:20:33 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:20:33 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:20:33 [] ? kthread+0x96/0xa0 2013-10-12 11:20:33 [] ? child_rip+0xa/0x20 2013-10-12 11:20:33 [] ? kthread+0x0/0xa0 2013-10-12 11:20:33 [] ? child_rip+0x0/0x20 2013-10-12 11:20:33 Code: 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 2013-10-12 11:20:33 Call Trace: 2013-10-12 11:20:33 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:20:33 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:20:33 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:20:33 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:20:33 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:20:33 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:20:33 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:20:33 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:20:33 [] ? __wake_up_common+0x59/0x90 2013-10-12 11:20:33 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:20:33 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:20:33 [] ? kthread+0x96/0xa0 2013-10-12 11:20:33 [] ? child_rip+0xa/0x20 2013-10-12 11:20:33 [] ? kthread+0x0/0xa0 2013-10-12 11:20:33 [] ? child_rip+0x0/0x20 2013-10-12 11:20:33 BUG: soft lockup - CPU#7 stuck for 67s! 
[ll_ost01_009:6315] 2013-10-12 11:20:33 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:20:33 CPU 7 2013-10-12 11:20:33 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:20:33 2013-10-12 11:20:33 Pid: 6315, comm: ll_ost01_009 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:20:33 RIP: 0010:[] [] _spin_lock+0x1c/0x30 2013-10-12 11:20:33 RSP: 0018:ffff880df8e01cb0 EFLAGS: 00000283 2013-10-12 11:20:33 RAX: 00000000000032b0 RBX: ffff880df8e01cb0 RCX: 0000000000000054 2013-10-12 11:20:33 RDX: 00000000000032ae RSI: ffff8805eef22400 RDI: ffff880813eda430 2013-10-12 11:20:33 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:20:33 R10: 00000000000958b3 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000000 2013-10-12 11:20:33 R13: ffffffffa0f70433 R14: ffff880df8e01d60 R15: ffff8805eef227a0 2013-10-12 11:20:34 FS: 0000000000000000(0000) GS:ffff8800446e0000(0000) knlGS:0000000000000000 2013-10-12 11:20:34 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:20:34 CR2: 00007ffff7feb000 CR3: 0000000001a85000 CR4: 00000000000407e0 2013-10-12 11:20:34 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:20:34 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:20:34 Process ll_ost01_009 (pid: 6315, threadinfo ffff880df8e00000, task ffff880dfffff500) 2013-10-12 11:20:34 Stack: 2013-10-12 11:20:34 ffff880df8e01d10 ffffffffa0a41340 ffff880df8e01d10 ffffffffa0a78b6b 2013-10-12 11:20:34 ffff880813f11c40 0000000000000000 0000000000000000 ffff8805eef22400 2013-10-12 11:20:34 ffff880813eda400 ffff880813eda4d0 00000000000014e0 000000005259914d 2013-10-12 11:20:34 Call Trace: 2013-10-12 11:20:34 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:20:34 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:20:34 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:20:34 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:20:34 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:20:34 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:20:34 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:20:34 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:20:34 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:20:34 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:20:34 [] ? kthread+0x96/0xa0 2013-10-12 11:20:34 [] ? child_rip+0xa/0x20 2013-10-12 11:20:34 [] ? kthread+0x0/0xa0 2013-10-12 11:20:34 [] ? child_rip+0x0/0x20 2013-10-12 11:20:34 Code: 81 2f 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e <f3> 90 0f b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 2013-10-12 11:20:34 Call Trace: 2013-10-12 11:20:34 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:20:34 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:20:34 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:20:34 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:20:34 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:20:34 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:20:34 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:20:34 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:20:34 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:20:34 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:20:34 [] ? kthread+0x96/0xa0 2013-10-12 11:20:34 [] ? child_rip+0xa/0x20 2013-10-12 11:20:34 [] ? kthread+0x0/0xa0 2013-10-12 11:20:34 [] ? 
child_rip+0x0/0x20 2013-10-12 11:20:34 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:20:34 CPU 4 2013-10-12 11:20:34 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:20:34 2013-10-12 11:20:34 Pid: 6313, comm: ll_ost01_008 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:20:34 RIP: 0010:[] [] _spin_lock+0x1e/0x30 2013-10-12 11:20:34 RSP: 0018:ffff880dffff7cb0 EFLAGS: 00000287 2013-10-12 11:20:34 RAX: 00000000000032b2 RBX: ffff880dffff7cb0 RCX: 0000000000000054 2013-10-12 11:20:34 RDX: 00000000000032ae RSI: ffff8802fddb8000 RDI: ffff880813eda430 2013-10-12 11:20:34 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:20:34 R10: 00000000000a3eda R11: 5a5a5a5a5a5a5a5a R12: ffff880834502080 2013-10-12 11:20:34 R13: ffffffffa0f70433 R14: ffff880dffff7d60 R15: ffff8802fddb83a0 2013-10-12 11:20:34 FS: 0000000000000000(0000) GS:ffff880044680000(0000) knlGS:0000000000000000 2013-10-12 11:20:34 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:20:34 CR2: 00007ffffffdc4d8 CR3: 0000000001a85000 CR4: 00000000000407e0 2013-10-12 11:20:34 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:20:34 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:20:34 Process ll_ost01_008 (pid: 6313, threadinfo ffff880dffff6000, task ffff880dfffecae0) 2013-10-12 11:20:34 Stack: 2013-10-12 11:20:34 ffff880dffff7d10 ffffffffa0a41340 ffff880dffff7d10 ffffffffa0a78b6b 2013-10-12 11:20:34 ffff880813f11c40 0000000000000000 0000000000000000 ffff8802fddb8000 2013-10-12 11:20:34 ffff880813eda400 ffff880813eda4d0 000000000000f53b 000000005259914d 2013-10-12 11:20:34 Call Trace: 2013-10-12 11:20:34 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:20:34 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:20:34 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:20:34 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:20:34 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:20:34 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:20:34 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:20:34 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:20:34 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:20:34 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:20:34 [] ? kthread+0x96/0xa0 2013-10-12 11:20:34 [] ? child_rip+0xa/0x20 2013-10-12 11:20:34 [] ? kthread+0x0/0xa0 2013-10-12 11:20:34 [] ? child_rip+0x0/0x20 2013-10-12 11:20:34 Code: 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 2013-10-12 11:20:34 Call Trace: 2013-10-12 11:20:34 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:20:34 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:20:34 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:20:34 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:20:34 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:20:34 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:20:34 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:20:34 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:20:34 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:20:34 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:20:34 [] ? kthread+0x96/0xa0 2013-10-12 11:20:34 [] ? child_rip+0xa/0x20 2013-10-12 11:20:34 [] ? kthread+0x0/0xa0 2013-10-12 11:20:34 [] ? child_rip+0x0/0x20 2013-10-12 11:20:46 INFO: task kthreadd:6322 blocked for more than 120 seconds. 2013-10-12 11:20:46 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
2013-10-12 11:20:46 kthreadd D 000000000000000c 0 6322 2 0x00000000 2013-10-12 11:20:46 ffff880df8e23ee0 0000000000000046 0000000000000000 ffff880df8e23ea4 2013-10-12 11:20:46 0000000000000000 ffff88083fe82800 ffff880044696740 0000000000000400 2013-10-12 11:20:46 ffff880df8e19058 ffff880df8e23fd8 000000000000fb88 ffff880df8e19058 2013-10-12 11:20:46 Call Trace: 2013-10-12 11:20:46 [] ? libcfs_debug_dumplog_thread+0x0/0x30 [libcfs] 2013-10-12 11:20:46 [] kthread+0x77/0xa0 2013-10-12 11:20:46 [] child_rip+0xa/0x20 2013-10-12 11:20:46 [] ? kthread+0x0/0xa0 2013-10-12 11:20:46 [] ? child_rip+0x0/0x20 2013-10-12 11:21:43 Lustre: 5862:0:(client.c:1897:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1381602048/real 0] req@ffff88081435d000 x1448711817269428/t0(0) o38->lustre-MDT0000-lwp-OST000e@192.168.120.5@o2ib1:12/10 lens 400/544 e 0 to 1 dl 1381602103 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1 2013-10-12 11:21:43 Lustre: 5862:0:(client.c:1897:ptlrpc_expire_one_request()) Skipped 2 previous similar messages 2013-10-12 11:21:56 BUG: soft lockup - CPU#4 stuck for 67s! [ll_ost01_008:6313] 2013-10-12 11:21:56 BUG: soft lockup - CPU#5 stuck for 67s! 
[ll_ost01_002:5939] 2013-10-12 11:21:56 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:21:56 CPU 5 2013-10-12 11:21:56 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:21:56 2013-10-12 11:21:56 Pid: 5939, comm: ll_ost01_002 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:21:56 RIP: 0010:[] [] _spin_lock+0x21/0x30 2013-10-12 11:21:56 RSP: 0018:ffff880ccc9d5cb0 EFLAGS: 00000297 2013-10-12 11:21:56 RAX: 00000000000032af RBX: ffff880ccc9d5cb0 RCX: 000000000000002f 2013-10-12 11:21:56 RDX: 00000000000032ae RSI: ffff8805eef22c00 RDI: ffff880813eda430 2013-10-12 11:21:56 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:21:56 R10: 00000000000957c0 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000000 2013-10-12 11:21:56 R13: ffffffffa0f70433 R14: ffff880ccc9d5d60 R15: ffff8805eef22fa0 2013-10-12 11:21:56 FS: 0000000000000000(0000) GS:ffff8800446a0000(0000) knlGS:0000000000000000 2013-10-12 11:21:56 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:21:56 CR2: 00007ffff7ff7000 CR3: 00000010347bb000 CR4: 00000000000407e0 2013-10-12 11:21:56 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:21:56 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:21:56 Process ll_ost01_002 (pid: 5939, threadinfo ffff880ccc9d4000, task ffff880ccc9ca040) 2013-10-12 11:21:56 Stack: 2013-10-12 11:21:56 ffff880ccc9d5d10 ffffffffa0a41340 ffff880ccc9d5d10 ffffffffa0a78b6b 2013-10-12 11:21:56 ffff880813f11c40 0000000000000000 0000000000000000 ffff8805eef22c00 2013-10-12 11:21:56 ffff880813eda400 ffff880813eda4d0 0000000000001967 000000005259914d 2013-10-12 11:21:56 Call Trace: 2013-10-12 11:21:56 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:21:57 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:21:57 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:21:57 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:21:57 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:21:57 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:21:57 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:21:57 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:21:57 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:21:57 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:21:57 [] ? kthread+0x96/0xa0 2013-10-12 11:21:57 [] ? child_rip+0xa/0x20 2013-10-12 11:21:57 [] ? kthread+0x0/0xa0 2013-10-12 11:21:57 [] ? child_rip+0x0/0x20 2013-10-12 11:21:57 Code: 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 0f b7 17 f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 e5 0f 1f 2013-10-12 11:21:57 Call Trace: 2013-10-12 11:21:57 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:21:57 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:21:57 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:21:57 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:21:57 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:21:57 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:21:57 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:21:57 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:21:57 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:21:57 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:21:57 [] ? kthread+0x96/0xa0 2013-10-12 11:21:57 [] ? child_rip+0xa/0x20 2013-10-12 11:21:57 [] ? kthread+0x0/0xa0 2013-10-12 11:21:57 [] ? child_rip+0x0/0x20 2013-10-12 11:21:57 BUG: soft lockup - CPU#6 stuck for 67s! 
[ll_ost01_006:6292] 2013-10-12 11:21:57 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:21:57 CPU 6 2013-10-12 11:21:57 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:21:57 2013-10-12 11:21:57 Pid: 6292, comm: ll_ost01_006 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:21:57 RIP: 0010:[] [] _spin_lock+0x21/0x30 2013-10-12 11:21:57 RSP: 0018:ffff880e05805cb0 EFLAGS: 00000283 2013-10-12 11:21:57 RAX: 00000000000032b1 RBX: ffff880e05805cb0 RCX: 000000000000003d 2013-10-12 11:21:57 RDX: 00000000000032ae RSI: ffff88067a044800 RDI: ffff880813eda430 2013-10-12 11:21:57 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:21:57 R10: 00000000000958d5 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000400 2013-10-12 11:21:57 R13: ffffffffa0f70433 R14: ffff880e05805d60 R15: ffff88067a044ba0 2013-10-12 11:21:57 FS: 0000000000000000(0000) GS:ffff8800446c0000(0000) knlGS:0000000000000000 2013-10-12 11:21:57 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:21:57 CR2: 00007ffff7feb000 CR3: 000000082827b000 CR4: 00000000000407e0 2013-10-12 11:21:57 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:21:57 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:21:57 Process ll_ost01_006 (pid: 6292, threadinfo ffff880e05804000, task ffff880fd403c080) 2013-10-12 11:21:57 Stack: 2013-10-12 11:21:57 ffff880e05805d10 ffffffffa0a41340 ffff880e05805d10 ffffffffa0a78b6b 2013-10-12 11:21:57 ffff880813f11c40 0000000000000000 0000000000000000 ffff88067a044800 2013-10-12 11:21:57 ffff880813eda400 ffff880813eda4d0 00000000000014e8 000000005259914d 2013-10-12 11:21:57 Call Trace: 2013-10-12 11:21:57 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:21:57 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:21:57 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:21:57 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:21:57 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:21:57 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:21:57 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:21:57 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:21:57 [] ? __wake_up_common+0x59/0x90 2013-10-12 11:21:57 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:21:57 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:21:57 [] ? kthread+0x96/0xa0 2013-10-12 11:21:57 [] ? child_rip+0xa/0x20 2013-10-12 11:21:57 [] ? kthread+0x0/0xa0 2013-10-12 11:21:57 [] ? child_rip+0x0/0x20 2013-10-12 11:21:57 Code: 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 0f b7 17 f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 e5 0f 1f 2013-10-12 11:21:57 Call Trace: 2013-10-12 11:21:57 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:21:57 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:21:57 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:21:57 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:21:57 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:21:57 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:21:57 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:21:57 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:21:57 [] ? __wake_up_common+0x59/0x90 2013-10-12 11:21:57 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:21:57 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:21:57 [] ? kthread+0x96/0xa0 2013-10-12 11:21:57 [] ? child_rip+0xa/0x20 2013-10-12 11:21:57 [] ? kthread+0x0/0xa0 2013-10-12 11:21:57 [] ? child_rip+0x0/0x20 2013-10-12 11:21:57 BUG: soft lockup - CPU#7 stuck for 67s! 
[ll_ost01_009:6315] 2013-10-12 11:21:57 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:21:57 CPU 7 2013-10-12 11:21:57 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:21:57 2013-10-12 11:21:57 Pid: 6315, comm: ll_ost01_009 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:21:57 RIP: 0010:[] [] _spin_lock+0x1c/0x30 2013-10-12 11:21:57 RSP: 0018:ffff880df8e01cb0 EFLAGS: 00000283 2013-10-12 11:21:57 RAX: 00000000000032b0 RBX: ffff880df8e01cb0 RCX: 0000000000000054 2013-10-12 11:21:57 RDX: 00000000000032ae RSI: ffff8805eef22400 RDI: ffff880813eda430 2013-10-12 11:21:57 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:21:57 R10: 00000000000958b3 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000000 2013-10-12 11:21:57 R13: ffffffffa0f70433 R14: ffff880df8e01d60 R15: ffff8805eef227a0 2013-10-12 11:21:58 FS: 0000000000000000(0000) GS:ffff8800446e0000(0000) knlGS:0000000000000000 2013-10-12 11:21:58 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:21:58 CR2: 00007ffff7feb000 CR3: 0000000001a85000 CR4: 00000000000407e0 2013-10-12 11:21:58 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:21:58 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:21:58 Process ll_ost01_009 (pid: 6315, threadinfo ffff880df8e00000, task ffff880dfffff500) 2013-10-12 11:21:58 Stack: 2013-10-12 11:21:58 ffff880df8e01d10 ffffffffa0a41340 ffff880df8e01d10 ffffffffa0a78b6b 2013-10-12 11:21:58 ffff880813f11c40 0000000000000000 0000000000000000 ffff8805eef22400 2013-10-12 11:21:58 ffff880813eda400 ffff880813eda4d0 00000000000014e0 000000005259914d 2013-10-12 11:21:58 Call Trace: 2013-10-12 11:21:58 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:21:58 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:21:58 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:21:58 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:21:58 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:21:58 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:21:58 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:21:58 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:21:58 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:21:58 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:21:58 [] ? kthread+0x96/0xa0 2013-10-12 11:21:58 [] ? child_rip+0xa/0x20 2013-10-12 11:21:58 [] ? kthread+0x0/0xa0 2013-10-12 11:21:58 [] ? child_rip+0x0/0x20 2013-10-12 11:21:58 Code: 81 2f 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e 90 0f b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 2013-10-12 11:21:58 Call Trace: 2013-10-12 11:21:58 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:21:58 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:21:58 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:21:58 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:21:58 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:21:58 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:21:58 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:21:58 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:21:58 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:21:58 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:21:58 [] ? kthread+0x96/0xa0 2013-10-12 11:21:58 [] ? child_rip+0xa/0x20 2013-10-12 11:21:58 [] ? kthread+0x0/0xa0 2013-10-12 11:21:58 [] ? 
child_rip+0x0/0x20 2013-10-12 11:21:58 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:21:58 CPU 4 2013-10-12 11:21:58 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:21:58 2013-10-12 11:21:58 Pid: 6313, comm: ll_ost01_008 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:21:58 RIP: 0010:[] [] _spin_lock+0x1c/0x30 2013-10-12 11:21:58 RSP: 0018:ffff880dffff7cb0 EFLAGS: 00000287 2013-10-12 11:21:58 RAX: 00000000000032b2 RBX: ffff880dffff7cb0 RCX: 0000000000000054 2013-10-12 11:21:58 RDX: 00000000000032ae RSI: ffff8802fddb8000 RDI: ffff880813eda430 2013-10-12 11:21:58 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:21:58 R10: 00000000000a3eda R11: 5a5a5a5a5a5a5a5a R12: ffff880834502080 2013-10-12 11:21:58 R13: ffffffffa0f70433 R14: ffff880dffff7d60 R15: ffff8802fddb83a0 2013-10-12 11:21:58 FS: 0000000000000000(0000) GS:ffff880044680000(0000) knlGS:0000000000000000 2013-10-12 11:21:58 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:21:58 CR2: 00007ffffffdc4d8 CR3: 0000000001a85000 CR4: 00000000000407e0 2013-10-12 11:21:58 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:21:58 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:21:58 Process ll_ost01_008 (pid: 6313, threadinfo ffff880dffff6000, task ffff880dfffecae0) 2013-10-12 11:21:58 Stack: 2013-10-12 11:21:58 ffff880dffff7d10 ffffffffa0a41340 ffff880dffff7d10 ffffffffa0a78b6b 2013-10-12 11:21:58 ffff880813f11c40 0000000000000000 0000000000000000 ffff8802fddb8000 2013-10-12 11:21:58 ffff880813eda400 ffff880813eda4d0 000000000000f53b 000000005259914d 2013-10-12 11:21:58 Call Trace: 2013-10-12 11:21:58 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:21:58 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:21:58 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:21:58 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:21:58 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:21:58 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:21:58 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:21:58 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:21:58 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:21:58 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:21:58 [] ? kthread+0x96/0xa0 2013-10-12 11:21:58 [] ? child_rip+0xa/0x20 2013-10-12 11:21:58 [] ? kthread+0x0/0xa0 2013-10-12 11:21:58 [] ? child_rip+0x0/0x20 2013-10-12 11:21:58 Code: 81 2f 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e 90 0f b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 2013-10-12 11:21:58 Call Trace: 2013-10-12 11:21:58 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:21:58 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:21:58 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:21:58 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:21:58 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:21:58 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:21:58 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:21:58 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:21:58 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:21:58 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:21:58 [] ? kthread+0x96/0xa0 2013-10-12 11:21:58 [] ? child_rip+0xa/0x20 2013-10-12 11:21:58 [] ? kthread+0x0/0xa0 2013-10-12 11:21:58 [] ? 
child_rip+0x0/0x20 2013-10-12 11:22:15 LNetError: 4985:0:(o2iblnd_cb.c:3012:kiblnd_check_txs_locked()) Timed out tx: active_txs, 0 seconds 2013-10-12 11:22:15 LNetError: 4985:0:(o2iblnd_cb.c:3012:kiblnd_check_txs_locked()) Skipped 3 previous similar messages 2013-10-12 11:22:15 LNetError: 4985:0:(o2iblnd_cb.c:3075:kiblnd_check_conns()) Timed out RDMA with 192.168.118.119@o2ib (150): c: 8, oc: 0, rc: 8 2013-10-12 11:22:15 LNetError: 4985:0:(o2iblnd_cb.c:3075:kiblnd_check_conns()) Skipped 3 previous similar messages 2013-10-12 11:22:46 INFO: task kthreadd:6322 blocked for more than 120 seconds. 2013-10-12 11:22:46 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 2013-10-12 11:22:46 kthreadd D 000000000000000c 0 6322 2 0x00000000 2013-10-12 11:22:46 ffff880df8e23ee0 0000000000000046 0000000000000000 ffff880df8e23ea4 2013-10-12 11:22:46 0000000000000000 ffff88083fe82800 ffff880044696740 0000000000000400 2013-10-12 11:22:46 ffff880df8e19058 ffff880df8e23fd8 000000000000fb88 ffff880df8e19058 2013-10-12 11:22:46 Call Trace: 2013-10-12 11:22:46 [] ? libcfs_debug_dumplog_thread+0x0/0x30 [libcfs] 2013-10-12 11:22:46 [] kthread+0x77/0xa0 2013-10-12 11:22:46 [] child_rip+0xa/0x20 2013-10-12 11:22:46 [] ? kthread+0x0/0xa0 2013-10-12 11:22:46 [] ? child_rip+0x0/0x20 2013-10-12 11:23:20 BUG: soft lockup - CPU#4 stuck for 67s! [ll_ost01_008:6313] 2013-10-12 11:23:20 BUG: soft lockup - CPU#5 stuck for 67s! 
[ll_ost01_002:5939] 2013-10-12 11:23:20 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:23:20 CPU 5 2013-10-12 11:23:20 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:23:20 2013-10-12 11:23:20 Pid: 5939, comm: ll_ost01_002 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:23:20 RIP: 0010:[] [] _spin_lock+0x1e/0x30 2013-10-12 11:23:20 RSP: 0018:ffff880ccc9d5cb0 EFLAGS: 00000297 2013-10-12 11:23:20 RAX: 00000000000032af RBX: ffff880ccc9d5cb0 RCX: 000000000000002f 2013-10-12 11:23:20 RDX: 00000000000032ae RSI: ffff8805eef22c00 RDI: ffff880813eda430 2013-10-12 11:23:20 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:23:20 R10: 00000000000957c0 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000000 2013-10-12 11:23:20 R13: ffffffffa0f70433 R14: ffff880ccc9d5d60 R15: ffff8805eef22fa0 2013-10-12 11:23:20 FS: 0000000000000000(0000) GS:ffff8800446a0000(0000) knlGS:0000000000000000 2013-10-12 11:23:20 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:23:20 CR2: 00007ffff7ff7000 CR3: 00000010347bb000 CR4: 00000000000407e0 2013-10-12 11:23:20 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:23:20 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:23:20 Process ll_ost01_002 (pid: 5939, threadinfo ffff880ccc9d4000, task ffff880ccc9ca040) 2013-10-12 11:23:20 Stack: 2013-10-12 11:23:20 ffff880ccc9d5d10 ffffffffa0a41340 ffff880ccc9d5d10 ffffffffa0a78b6b 2013-10-12 11:23:20 ffff880813f11c40 0000000000000000 0000000000000000 ffff8805eef22c00 2013-10-12 11:23:20 ffff880813eda400 ffff880813eda4d0 0000000000001967 000000005259914d 2013-10-12 11:23:20 Call Trace: 2013-10-12 11:23:20 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:23:21 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:23:21 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:23:21 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:23:21 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:23:21 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:23:21 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:23:21 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:23:21 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:23:21 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:23:21 [] ? kthread+0x96/0xa0 2013-10-12 11:23:21 [] ? child_rip+0xa/0x20 2013-10-12 11:23:21 [] ? kthread+0x0/0xa0 2013-10-12 11:23:21 [] ? child_rip+0x0/0x20 2013-10-12 11:23:21 Code: 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 2013-10-12 11:23:21 Call Trace: 2013-10-12 11:23:21 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:23:21 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:23:21 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:23:21 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:23:21 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:23:21 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:23:21 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:23:21 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:23:21 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:23:21 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:23:21 [] ? kthread+0x96/0xa0 2013-10-12 11:23:21 [] ? child_rip+0xa/0x20 2013-10-12 11:23:21 [] ? kthread+0x0/0xa0 2013-10-12 11:23:21 [] ? child_rip+0x0/0x20 2013-10-12 11:23:21 BUG: soft lockup - CPU#6 stuck for 67s! 
[ll_ost01_006:6292] 2013-10-12 11:23:21 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:23:21 CPU 6 2013-10-12 11:23:21 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:23:21 2013-10-12 11:23:21 Pid: 6292, comm: ll_ost01_006 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:23:21 RIP: 0010:[] [] _spin_lock+0x1e/0x30 2013-10-12 11:23:21 RSP: 0018:ffff880e05805cb0 EFLAGS: 00000283 2013-10-12 11:23:21 RAX: 00000000000032b1 RBX: ffff880e05805cb0 RCX: 000000000000003d 2013-10-12 11:23:21 RDX: 00000000000032ae RSI: ffff88067a044800 RDI: ffff880813eda430 2013-10-12 11:23:21 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:23:21 R10: 00000000000958d5 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000400 2013-10-12 11:23:21 R13: ffffffffa0f70433 R14: ffff880e05805d60 R15: ffff88067a044ba0 2013-10-12 11:23:21 FS: 0000000000000000(0000) GS:ffff8800446c0000(0000) knlGS:0000000000000000 2013-10-12 11:23:21 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:23:21 CR2: 00007ffff7feb000 CR3: 000000082827b000 CR4: 00000000000407e0 2013-10-12 11:23:21 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:23:21 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:23:21 Process ll_ost01_006 (pid: 6292, threadinfo ffff880e05804000, task ffff880fd403c080) 2013-10-12 11:23:21 Stack: 2013-10-12 11:23:21 ffff880e05805d10 ffffffffa0a41340 ffff880e05805d10 ffffffffa0a78b6b 2013-10-12 11:23:21 ffff880813f11c40 0000000000000000 0000000000000000 ffff88067a044800 2013-10-12 11:23:21 ffff880813eda400 ffff880813eda4d0 00000000000014e8 000000005259914d 2013-10-12 11:23:21 Call Trace: 2013-10-12 11:23:21 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:23:21 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:23:21 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:23:21 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:23:21 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:23:21 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:23:21 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:23:21 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:23:21 [] ? __wake_up_common+0x59/0x90 2013-10-12 11:23:21 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:23:21 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:23:21 [] ? kthread+0x96/0xa0 2013-10-12 11:23:21 [] ? child_rip+0xa/0x20 2013-10-12 11:23:21 [] ? kthread+0x0/0xa0 2013-10-12 11:23:21 [] ? child_rip+0x0/0x20 2013-10-12 11:23:21 Code: 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 2013-10-12 11:23:21 Call Trace: 2013-10-12 11:23:21 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:23:21 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:23:21 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:23:21 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:23:21 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:23:21 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:23:21 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:23:21 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:23:21 [] ? __wake_up_common+0x59/0x90 2013-10-12 11:23:21 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:23:21 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:23:21 [] ? kthread+0x96/0xa0 2013-10-12 11:23:21 [] ? child_rip+0xa/0x20 2013-10-12 11:23:21 [] ? kthread+0x0/0xa0 2013-10-12 11:23:21 [] ? child_rip+0x0/0x20 2013-10-12 11:23:21 BUG: soft lockup - CPU#7 stuck for 67s! 
[ll_ost01_009:6315] 2013-10-12 11:23:21 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:23:21 CPU 7 2013-10-12 11:23:21 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:23:21 2013-10-12 11:23:21 Pid: 6315, comm: ll_ost01_009 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:23:21 RIP: 0010:[] [] _spin_lock+0x21/0x30 2013-10-12 11:23:21 RSP: 0018:ffff880df8e01cb0 EFLAGS: 00000283 2013-10-12 11:23:21 RAX: 00000000000032b0 RBX: ffff880df8e01cb0 RCX: 0000000000000054 2013-10-12 11:23:21 RDX: 00000000000032ae RSI: ffff8805eef22400 RDI: ffff880813eda430 2013-10-12 11:23:21 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:23:21 R10: 00000000000958b3 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000000 2013-10-12 11:23:21 R13: ffffffffa0f70433 R14: ffff880df8e01d60 R15: ffff8805eef227a0 2013-10-12 11:23:22 FS: 0000000000000000(0000) GS:ffff8800446e0000(0000) knlGS:0000000000000000 2013-10-12 11:23:22 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:23:22 CR2: 00007ffff7feb000 CR3: 0000000001a85000 CR4: 00000000000407e0 2013-10-12 11:23:22 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:23:22 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:23:22 Process ll_ost01_009 (pid: 6315, threadinfo ffff880df8e00000, task ffff880dfffff500) 2013-10-12 11:23:22 Stack: 2013-10-12 11:23:22 ffff880df8e01d10 ffffffffa0a41340 ffff880df8e01d10 ffffffffa0a78b6b 2013-10-12 11:23:22 ffff880813f11c40 0000000000000000 0000000000000000 ffff8805eef22400 2013-10-12 11:23:22 ffff880813eda400 ffff880813eda4d0 00000000000014e0 000000005259914d 2013-10-12 11:23:22 Call Trace: 2013-10-12 11:23:22 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:23:22 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:23:22 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:23:22 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:23:22 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:23:22 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:23:22 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:23:22 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:23:22 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:23:22 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:23:22 [] ? kthread+0x96/0xa0 2013-10-12 11:23:22 [] ? child_rip+0xa/0x20 2013-10-12 11:23:22 [] ? kthread+0x0/0xa0 2013-10-12 11:23:22 [] ? child_rip+0x0/0x20 2013-10-12 11:23:22 Code: 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 0f b7 17 <eb> f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 e5 0f 1f 2013-10-12 11:23:22 Call Trace: 2013-10-12 11:23:22 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:23:22 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:23:22 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:23:22 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:23:22 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:23:22 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:23:22 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:23:22 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:23:22 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:23:22 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:23:22 [] ? kthread+0x96/0xa0 2013-10-12 11:23:22 [] ? child_rip+0xa/0x20 2013-10-12 11:23:22 [] ? kthread+0x0/0xa0 2013-10-12 11:23:22 [] ? 
child_rip+0x0/0x20 2013-10-12 11:23:22 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:23:22 CPU 4 2013-10-12 11:23:22 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:23:22 2013-10-12 11:23:22 Pid: 6313, comm: ll_ost01_008 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:23:22 RIP: 0010:[] [] _spin_lock+0x1e/0x30 2013-10-12 11:23:22 RSP: 0018:ffff880dffff7cb0 EFLAGS: 00000287 2013-10-12 11:23:22 RAX: 00000000000032b2 RBX: ffff880dffff7cb0 RCX: 0000000000000054 2013-10-12 11:23:22 RDX: 00000000000032ae RSI: ffff8802fddb8000 RDI: ffff880813eda430 2013-10-12 11:23:22 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:23:22 R10: 00000000000a3eda R11: 5a5a5a5a5a5a5a5a R12: ffff880834502080 2013-10-12 11:23:22 R13: ffffffffa0f70433 R14: ffff880dffff7d60 R15: ffff8802fddb83a0 2013-10-12 11:23:22 FS: 0000000000000000(0000) GS:ffff880044680000(0000) knlGS:0000000000000000 2013-10-12 11:23:22 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:23:22 CR2: 00007ffffffdc4d8 CR3: 0000000001a85000 CR4: 00000000000407e0 2013-10-12 11:23:22 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:23:22 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:23:22 Process ll_ost01_008 (pid: 6313, threadinfo ffff880dffff6000, task ffff880dfffecae0) 2013-10-12 11:23:22 Stack: 2013-10-12 11:23:22 ffff880dffff7d10 ffffffffa0a41340 ffff880dffff7d10 ffffffffa0a78b6b 2013-10-12 11:23:22 ffff880813f11c40 0000000000000000 0000000000000000 ffff8802fddb8000 2013-10-12 11:23:22 ffff880813eda400 ffff880813eda4d0 000000000000f53b 000000005259914d 2013-10-12 11:23:22 Call Trace: 2013-10-12 11:23:22 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:23:22 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:23:22 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:23:22 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:23:22 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:23:22 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:23:22 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:23:22 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:23:22 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:23:22 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:23:22 [] ? kthread+0x96/0xa0 2013-10-12 11:23:22 [] ? child_rip+0xa/0x20 2013-10-12 11:23:22 [] ? kthread+0x0/0xa0 2013-10-12 11:23:22 [] ? child_rip+0x0/0x20 2013-10-12 11:23:22 Code: 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 2013-10-12 11:23:22 Call Trace: 2013-10-12 11:23:22 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:23:22 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:23:22 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:23:22 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:23:22 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:23:22 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:23:22 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:23:22 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:23:22 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:23:22 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:23:22 [] ? kthread+0x96/0xa0 2013-10-12 11:23:22 [] ? child_rip+0xa/0x20 2013-10-12 11:23:22 [] ? kthread+0x0/0xa0 2013-10-12 11:23:22 [] ? 
child_rip+0x0/0x20 2013-10-12 11:24:38 Lustre: 5862:0:(client.c:1897:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1381602223/real 0] req@ffff88067a042c00 x1448711817269464/t0(0) o38->lustre-MDT0000-lwp-OST000e@192.168.120.5@o2ib1:12/10 lens 400/544 e 0 to 1 dl 1381602278 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1 2013-10-12 11:24:38 Lustre: 5862:0:(client.c:1897:ptlrpc_expire_one_request()) Skipped 8 previous similar messages 2013-10-12 11:24:44 BUG: soft lockup - CPU#4 stuck for 67s! [ll_ost01_008:6313] 2013-10-12 11:24:44 BUG: soft lockup - CPU#5 stuck for 67s! [ll_ost01_002:5939] 2013-10-12 11:24:44 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:24:44 CPU 5 2013-10-12 11:24:44 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) 
zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:24:44 2013-10-12 11:24:44 Pid: 5939, comm: ll_ost01_002 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:24:44 RIP: 0010:[] [] _spin_lock+0x21/0x30 2013-10-12 11:24:44 RSP: 0018:ffff880ccc9d5cb0 EFLAGS: 00000297 2013-10-12 11:24:44 RAX: 00000000000032af RBX: ffff880ccc9d5cb0 RCX: 000000000000002f 2013-10-12 11:24:44 RDX: 00000000000032ae RSI: ffff8805eef22c00 RDI: ffff880813eda430 2013-10-12 11:24:44 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:24:44 R10: 00000000000957c0 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000000 2013-10-12 11:24:44 R13: ffffffffa0f70433 R14: ffff880ccc9d5d60 R15: ffff8805eef22fa0 2013-10-12 11:24:44 FS: 0000000000000000(0000) GS:ffff8800446a0000(0000) knlGS:0000000000000000 2013-10-12 11:24:44 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:24:44 CR2: 00007ffff7ff7000 CR3: 00000010347bb000 CR4: 00000000000407e0 2013-10-12 11:24:44 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:24:44 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:24:44 Process ll_ost01_002 (pid: 5939, threadinfo ffff880ccc9d4000, task ffff880ccc9ca040) 
2013-10-12 11:24:44 Stack: 2013-10-12 11:24:44 ffff880ccc9d5d10 ffffffffa0a41340 ffff880ccc9d5d10 ffffffffa0a78b6b 2013-10-12 11:24:44 ffff880813f11c40 0000000000000000 0000000000000000 ffff8805eef22c00 2013-10-12 11:24:44 ffff880813eda400 ffff880813eda4d0 0000000000001967 000000005259914d 2013-10-12 11:24:44 Call Trace: 2013-10-12 11:24:44 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:24:45 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:24:45 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:24:45 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:24:45 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:24:45 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:24:45 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:24:45 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:24:45 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:24:45 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:24:45 [] ? kthread+0x96/0xa0 2013-10-12 11:24:45 [] ? child_rip+0xa/0x20 2013-10-12 11:24:45 [] ? kthread+0x0/0xa0 2013-10-12 11:24:45 [] ? child_rip+0x0/0x20 2013-10-12 11:24:45 Code: 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 0f b7 17 <eb> f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 e5 0f 1f 2013-10-12 11:24:45 Call Trace: 2013-10-12 11:24:45 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:24:45 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:24:45 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:24:45 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:24:45 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:24:45 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:24:45 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:24:45 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:24:45 [] ? 
ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:24:45 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:24:45 [] ? kthread+0x96/0xa0 2013-10-12 11:24:45 [] ? child_rip+0xa/0x20 2013-10-12 11:24:45 [] ? kthread+0x0/0xa0 2013-10-12 11:24:45 [] ? child_rip+0x0/0x20 2013-10-12 11:24:45 BUG: soft lockup - CPU#6 stuck for 67s! [ll_ost01_006:6292] 2013-10-12 11:24:45 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:24:45 CPU 6 2013-10-12 11:24:45 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm 
ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:24:45 2013-10-12 11:24:45 Pid: 6292, comm: ll_ost01_006 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:24:45 RIP: 0010:[] [] _spin_lock+0x1e/0x30 2013-10-12 11:24:45 RSP: 0018:ffff880e05805cb0 EFLAGS: 00000283 2013-10-12 11:24:45 RAX: 00000000000032b1 RBX: ffff880e05805cb0 RCX: 000000000000003d 2013-10-12 11:24:45 RDX: 00000000000032ae RSI: ffff88067a044800 RDI: ffff880813eda430 2013-10-12 11:24:45 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:24:45 R10: 00000000000958d5 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000400 2013-10-12 11:24:45 R13: ffffffffa0f70433 R14: ffff880e05805d60 R15: ffff88067a044ba0 2013-10-12 11:24:45 FS: 0000000000000000(0000) GS:ffff8800446c0000(0000) knlGS:0000000000000000 2013-10-12 11:24:45 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:24:45 CR2: 00007ffff7feb000 CR3: 000000082827b000 CR4: 00000000000407e0 2013-10-12 11:24:45 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:24:45 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:24:45 Process ll_ost01_006 (pid: 6292, threadinfo ffff880e05804000, task ffff880fd403c080) 2013-10-12 11:24:45 Stack: 2013-10-12 11:24:45 ffff880e05805d10 ffffffffa0a41340 ffff880e05805d10 ffffffffa0a78b6b 2013-10-12 11:24:45 ffff880813f11c40 0000000000000000 0000000000000000 ffff88067a044800 2013-10-12 11:24:45 ffff880813eda400 ffff880813eda4d0 00000000000014e8 
000000005259914d 2013-10-12 11:24:45 Call Trace: 2013-10-12 11:24:45 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:24:45 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:24:45 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:24:45 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:24:45 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:24:45 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:24:45 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:24:45 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:24:45 [] ? __wake_up_common+0x59/0x90 2013-10-12 11:24:45 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:24:45 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:24:45 [] ? kthread+0x96/0xa0 2013-10-12 11:24:45 [] ? child_rip+0xa/0x20 2013-10-12 11:24:45 [] ? kthread+0x0/0xa0 2013-10-12 11:24:45 [] ? child_rip+0x0/0x20 2013-10-12 11:24:45 Code: 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 2013-10-12 11:24:45 Call Trace: 2013-10-12 11:24:45 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:24:45 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:24:45 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:24:45 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:24:45 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:24:45 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:24:45 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:24:45 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:24:45 [] ? __wake_up_common+0x59/0x90 2013-10-12 11:24:45 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:24:45 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:24:45 [] ? kthread+0x96/0xa0 2013-10-12 11:24:45 [] ? 
child_rip+0xa/0x20 2013-10-12 11:24:45 [] ? kthread+0x0/0xa0 2013-10-12 11:24:45 [] ? child_rip+0x0/0x20 2013-10-12 11:24:45 BUG: soft lockup - CPU#7 stuck for 67s! [ll_ost01_009:6315] 2013-10-12 11:24:45 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:24:45 CPU 7 2013-10-12 11:24:45 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac 
edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:24:45 2013-10-12 11:24:45 Pid: 6315, comm: ll_ost01_009 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:24:45 RIP: 0010:[] [] _spin_lock+0x1c/0x30 2013-10-12 11:24:45 RSP: 0018:ffff880df8e01cb0 EFLAGS: 00000283 2013-10-12 11:24:45 RAX: 00000000000032b0 RBX: ffff880df8e01cb0 RCX: 0000000000000054 2013-10-12 11:24:45 RDX: 00000000000032ae RSI: ffff8805eef22400 RDI: ffff880813eda430 2013-10-12 11:24:45 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:24:45 R10: 00000000000958b3 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000000 2013-10-12 11:24:45 R13: ffffffffa0f70433 R14: ffff880df8e01d60 R15: ffff8805eef227a0 2013-10-12 11:24:46 FS: 0000000000000000(0000) GS:ffff8800446e0000(0000) knlGS:0000000000000000 2013-10-12 11:24:46 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:24:46 CR2: 00007ffff7feb000 CR3: 0000000001a85000 CR4: 00000000000407e0 2013-10-12 11:24:46 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:24:46 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:24:46 Process ll_ost01_009 (pid: 6315, threadinfo ffff880df8e00000, task ffff880dfffff500) 2013-10-12 11:24:46 Stack: 2013-10-12 11:24:46 ffff880df8e01d10 ffffffffa0a41340 ffff880df8e01d10 ffffffffa0a78b6b 2013-10-12 11:24:46 ffff880813f11c40 0000000000000000 0000000000000000 ffff8805eef22400 2013-10-12 11:24:46 ffff880813eda400 ffff880813eda4d0 00000000000014e0 000000005259914d 2013-10-12 11:24:46 Call Trace: 2013-10-12 11:24:46 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:24:46 [] ? 
nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:24:46 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:24:46 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:24:46 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:24:46 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:24:46 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:24:46 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:24:46 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:24:46 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:24:46 [] ? kthread+0x96/0xa0 2013-10-12 11:24:46 [] ? child_rip+0xa/0x20 2013-10-12 11:24:46 [] ? kthread+0x0/0xa0 2013-10-12 11:24:46 [] ? child_rip+0x0/0x20 2013-10-12 11:24:46 Code: 81 2f 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e 90 0f b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 2013-10-12 11:24:46 Call Trace: 2013-10-12 11:24:46 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:24:46 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:24:46 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:24:46 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:24:46 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:24:46 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:24:46 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:24:46 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:24:46 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:24:46 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:24:46 [] ? kthread+0x96/0xa0 2013-10-12 11:24:46 [] ? child_rip+0xa/0x20 2013-10-12 11:24:46 [] ? kthread+0x0/0xa0 2013-10-12 11:24:46 [] ? 
child_rip+0x0/0x20 2013-10-12 11:24:46 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:24:46 CPU 4 2013-10-12 11:24:46 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:24:46 2013-10-12 11:24:46 Pid: 6313, comm: ll_ost01_008 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:24:46 RIP: 0010:[] [] _spin_lock+0x1e/0x30 2013-10-12 11:24:46 RSP: 0018:ffff880dffff7cb0 EFLAGS: 00000287 2013-10-12 11:24:46 RAX: 00000000000032b2 RBX: ffff880dffff7cb0 RCX: 0000000000000054 2013-10-12 11:24:46 RDX: 00000000000032ae RSI: ffff8802fddb8000 RDI: ffff880813eda430 2013-10-12 11:24:46 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:24:46 R10: 00000000000a3eda R11: 5a5a5a5a5a5a5a5a R12: ffff880834502080 2013-10-12 11:24:46 R13: ffffffffa0f70433 R14: ffff880dffff7d60 R15: ffff8802fddb83a0 2013-10-12 11:24:46 FS: 0000000000000000(0000) GS:ffff880044680000(0000) knlGS:0000000000000000 2013-10-12 11:24:46 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:24:46 CR2: 00007ffffffdc4d8 CR3: 0000000001a85000 CR4: 00000000000407e0 2013-10-12 11:24:46 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:24:46 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:24:46 Process ll_ost01_008 (pid: 6313, threadinfo ffff880dffff6000, task ffff880dfffecae0) 2013-10-12 11:24:46 Stack: 2013-10-12 11:24:46 ffff880dffff7d10 ffffffffa0a41340 ffff880dffff7d10 ffffffffa0a78b6b 2013-10-12 11:24:46 ffff880813f11c40 0000000000000000 0000000000000000 ffff8802fddb8000 2013-10-12 11:24:46 ffff880813eda400 ffff880813eda4d0 000000000000f53b 000000005259914d 2013-10-12 11:24:46 Call Trace: 2013-10-12 11:24:46 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:24:46 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:24:46 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:24:46 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:24:46 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:24:46 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:24:46 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:24:46 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:24:46 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:24:46 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:24:46 [] ? kthread+0x96/0xa0 2013-10-12 11:24:46 [] ? child_rip+0xa/0x20 2013-10-12 11:24:46 [] ? kthread+0x0/0xa0 2013-10-12 11:24:46 [] ? child_rip+0x0/0x20 2013-10-12 11:24:46 Code: 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 2013-10-12 11:24:46 Call Trace: 2013-10-12 11:24:46 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:24:46 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:24:46 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:24:46 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:24:46 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:24:46 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:24:46 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:24:46 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:24:46 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:24:46 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:24:46 [] ? kthread+0x96/0xa0 2013-10-12 11:24:46 [] ? child_rip+0xa/0x20 2013-10-12 11:24:46 [] ? kthread+0x0/0xa0 2013-10-12 11:24:46 [] ? child_rip+0x0/0x20 2013-10-12 11:24:46 INFO: task kthreadd:6322 blocked for more than 120 seconds. 2013-10-12 11:24:46 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
2013-10-12 11:24:46 kthreadd D 000000000000000c 0 6322 2 0x00000000 2013-10-12 11:24:46 ffff880df8e23ee0 0000000000000046 0000000000000000 ffff880df8e23ea4 2013-10-12 11:24:46 0000000000000000 ffff88083fe82800 ffff880044696740 0000000000000400 2013-10-12 11:24:46 ffff880df8e19058 ffff880df8e23fd8 000000000000fb88 ffff880df8e19058 2013-10-12 11:24:46 Call Trace: 2013-10-12 11:24:46 [] ? libcfs_debug_dumplog_thread+0x0/0x30 [libcfs] 2013-10-12 11:24:46 [] kthread+0x77/0xa0 2013-10-12 11:24:46 [] child_rip+0xa/0x20 2013-10-12 11:24:46 [] ? kthread+0x0/0xa0 2013-10-12 11:24:46 [] ? child_rip+0x0/0x20 2013-10-12 11:26:08 BUG: soft lockup - CPU#4 stuck for 67s! [ll_ost01_008:6313] 2013-10-12 11:26:08 BUG: soft lockup - CPU#5 stuck for 67s! [ll_ost01_002:5939] 2013-10-12 11:26:08 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:26:08 CPU 5 2013-10-12 11:26:08 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) 
fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:26:08 2013-10-12 11:26:08 Pid: 5939, comm: ll_ost01_002 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:26:08 RIP: 0010:[] [] _spin_lock+0x21/0x30 2013-10-12 11:26:08 RSP: 0018:ffff880ccc9d5cb0 EFLAGS: 00000297 2013-10-12 11:26:08 RAX: 00000000000032af RBX: ffff880ccc9d5cb0 RCX: 000000000000002f 2013-10-12 11:26:08 RDX: 00000000000032ae RSI: ffff8805eef22c00 RDI: ffff880813eda430 2013-10-12 11:26:08 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:26:08 R10: 00000000000957c0 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000000 2013-10-12 11:26:08 R13: ffffffffa0f70433 R14: ffff880ccc9d5d60 R15: ffff8805eef22fa0 2013-10-12 11:26:08 FS: 0000000000000000(0000) GS:ffff8800446a0000(0000) knlGS:0000000000000000 2013-10-12 11:26:08 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:26:08 CR2: 00007ffff7ff7000 CR3: 00000010347bb000 CR4: 00000000000407e0 2013-10-12 11:26:08 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:26:08 DR3: 0000000000000000 DR6: 
00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:26:08 Process ll_ost01_002 (pid: 5939, threadinfo ffff880ccc9d4000, task ffff880ccc9ca040) 2013-10-12 11:26:08 Stack: 2013-10-12 11:26:08 ffff880ccc9d5d10 ffffffffa0a41340 ffff880ccc9d5d10 ffffffffa0a78b6b 2013-10-12 11:26:08 ffff880813f11c40 0000000000000000 0000000000000000 ffff8805eef22c00 2013-10-12 11:26:08 ffff880813eda400 ffff880813eda4d0 0000000000001967 000000005259914d 2013-10-12 11:26:08 Call Trace: 2013-10-12 11:26:08 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:26:09 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:26:09 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:26:09 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:26:09 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:26:09 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:26:09 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:26:09 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:26:09 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:26:09 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:26:09 [] ? kthread+0x96/0xa0 2013-10-12 11:26:09 [] ? child_rip+0xa/0x20 2013-10-12 11:26:09 [] ? kthread+0x0/0xa0 2013-10-12 11:26:09 [] ? child_rip+0x0/0x20 2013-10-12 11:26:09 Code: 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 0f b7 17 f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 e5 0f 1f 2013-10-12 11:26:09 Call Trace: 2013-10-12 11:26:09 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:26:09 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:26:09 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:26:09 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:26:09 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:26:09 [] ? 
cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:26:09 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:26:09 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:26:09 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:26:09 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:26:09 [] ? kthread+0x96/0xa0 2013-10-12 11:26:09 [] ? child_rip+0xa/0x20 2013-10-12 11:26:09 [] ? kthread+0x0/0xa0 2013-10-12 11:26:09 [] ? child_rip+0x0/0x20 2013-10-12 11:26:09 BUG: soft lockup - CPU#6 stuck for 67s! [ll_ost01_006:6292] 2013-10-12 11:26:09 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:26:09 CPU 6 2013-10-12 11:26:09 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) 
zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:26:09 2013-10-12 11:26:09 Pid: 6292, comm: ll_ost01_006 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:26:09 RIP: 0010:[] [] _spin_lock+0x21/0x30 2013-10-12 11:26:09 RSP: 0018:ffff880e05805cb0 EFLAGS: 00000283 2013-10-12 11:26:09 RAX: 00000000000032b1 RBX: ffff880e05805cb0 RCX: 000000000000003d 2013-10-12 11:26:09 RDX: 00000000000032ae RSI: ffff88067a044800 RDI: ffff880813eda430 2013-10-12 11:26:09 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:26:09 R10: 00000000000958d5 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000400 2013-10-12 11:26:09 R13: ffffffffa0f70433 R14: ffff880e05805d60 R15: ffff88067a044ba0 2013-10-12 11:26:09 FS: 0000000000000000(0000) GS:ffff8800446c0000(0000) knlGS:0000000000000000 2013-10-12 11:26:09 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:26:09 CR2: 00007ffff7feb000 CR3: 000000082827b000 CR4: 00000000000407e0 2013-10-12 11:26:09 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:26:09 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:26:09 Process ll_ost01_006 (pid: 6292, threadinfo ffff880e05804000, task ffff880fd403c080) 2013-10-12 11:26:09 Stack: 2013-10-12 11:26:09 ffff880e05805d10 ffffffffa0a41340 
ffff880e05805d10 ffffffffa0a78b6b 2013-10-12 11:26:09 ffff880813f11c40 0000000000000000 0000000000000000 ffff88067a044800 2013-10-12 11:26:09 ffff880813eda400 ffff880813eda4d0 00000000000014e8 000000005259914d 2013-10-12 11:26:09 Call Trace: 2013-10-12 11:26:09 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:26:09 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:26:09 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:26:09 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:26:09 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:26:09 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:26:09 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:26:09 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:26:09 [] ? __wake_up_common+0x59/0x90 2013-10-12 11:26:09 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:26:09 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:26:09 [] ? kthread+0x96/0xa0 2013-10-12 11:26:09 [] ? child_rip+0xa/0x20 2013-10-12 11:26:09 [] ? kthread+0x0/0xa0 2013-10-12 11:26:09 [] ? child_rip+0x0/0x20 2013-10-12 11:26:09 Code: 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 0f b7 17 f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 e5 0f 1f 2013-10-12 11:26:09 Call Trace: 2013-10-12 11:26:09 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:26:09 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:26:09 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:26:09 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:26:09 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:26:09 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:26:09 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:26:09 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:26:09 [] ? 
__wake_up_common+0x59/0x90 2013-10-12 11:26:09 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:26:09 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:26:09 [] ? kthread+0x96/0xa0 2013-10-12 11:26:09 [] ? child_rip+0xa/0x20 2013-10-12 11:26:09 [] ? kthread+0x0/0xa0 2013-10-12 11:26:09 [] ? child_rip+0x0/0x20 2013-10-12 11:26:09 BUG: soft lockup - CPU#7 stuck for 67s! [ll_ost01_009:6315] 2013-10-12 11:26:09 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:26:09 CPU 7 2013-10-12 11:26:09 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf 
ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:26:09 2013-10-12 11:26:09 Pid: 6315, comm: ll_ost01_009 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:26:09 RIP: 0010:[] [] _spin_lock+0x1e/0x30 2013-10-12 11:26:09 RSP: 0018:ffff880df8e01cb0 EFLAGS: 00000283 2013-10-12 11:26:09 RAX: 00000000000032b0 RBX: ffff880df8e01cb0 RCX: 0000000000000054 2013-10-12 11:26:09 RDX: 00000000000032ae RSI: ffff8805eef22400 RDI: ffff880813eda430 2013-10-12 11:26:09 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:26:09 R10: 00000000000958b3 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000000 2013-10-12 11:26:09 R13: ffffffffa0f70433 R14: ffff880df8e01d60 R15: ffff8805eef227a0 2013-10-12 11:26:10 FS: 0000000000000000(0000) GS:ffff8800446e0000(0000) knlGS:0000000000000000 2013-10-12 11:26:10 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:26:10 CR2: 00007ffff7feb000 CR3: 0000000001a85000 CR4: 00000000000407e0 2013-10-12 11:26:10 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:26:10 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:26:10 Process ll_ost01_009 (pid: 6315, threadinfo ffff880df8e00000, task ffff880dfffff500) 2013-10-12 11:26:10 Stack: 2013-10-12 11:26:10 ffff880df8e01d10 ffffffffa0a41340 ffff880df8e01d10 ffffffffa0a78b6b 2013-10-12 11:26:10 ffff880813f11c40 0000000000000000 0000000000000000 ffff8805eef22400 2013-10-12 11:26:10 
ffff880813eda400 ffff880813eda4d0 00000000000014e0 000000005259914d 2013-10-12 11:26:10 Call Trace: 2013-10-12 11:26:10 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:26:10 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:26:10 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:26:10 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:26:10 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:26:10 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:26:10 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:26:10 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:26:10 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:26:10 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:26:10 [] ? kthread+0x96/0xa0 2013-10-12 11:26:10 [] ? child_rip+0xa/0x20 2013-10-12 11:26:10 [] ? kthread+0x0/0xa0 2013-10-12 11:26:10 [] ? child_rip+0x0/0x20 2013-10-12 11:26:10 Code: 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 2013-10-12 11:26:10 Call Trace: 2013-10-12 11:26:10 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:26:10 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:26:10 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:26:10 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:26:10 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:26:10 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:26:10 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:26:10 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:26:10 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:26:10 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:26:10 [] ? kthread+0x96/0xa0 2013-10-12 11:26:10 [] ? child_rip+0xa/0x20 2013-10-12 11:26:10 [] ? 
kthread+0x0/0xa0 2013-10-12 11:26:10 [] ? child_rip+0x0/0x20 2013-10-12 11:26:10 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:26:10 CPU 4 2013-10-12 11:26:10 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb 
dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:26:10 2013-10-12 11:26:10 Pid: 6313, comm: ll_ost01_008 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:26:10 RIP: 0010:[] [] _spin_lock+0x21/0x30 2013-10-12 11:26:10 RSP: 0018:ffff880dffff7cb0 EFLAGS: 00000287 2013-10-12 11:26:10 RAX: 00000000000032b2 RBX: ffff880dffff7cb0 RCX: 0000000000000054 2013-10-12 11:26:10 RDX: 00000000000032ae RSI: ffff8802fddb8000 RDI: ffff880813eda430 2013-10-12 11:26:10 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:26:10 R10: 00000000000a3eda R11: 5a5a5a5a5a5a5a5a R12: ffff880834502080 2013-10-12 11:26:10 R13: ffffffffa0f70433 R14: ffff880dffff7d60 R15: ffff8802fddb83a0 2013-10-12 11:26:10 FS: 0000000000000000(0000) GS:ffff880044680000(0000) knlGS:0000000000000000 2013-10-12 11:26:10 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:26:10 CR2: 00007ffffffdc4d8 CR3: 0000000001a85000 CR4: 00000000000407e0 2013-10-12 11:26:10 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:26:10 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:26:10 Process ll_ost01_008 (pid: 6313, threadinfo ffff880dffff6000, task ffff880dfffecae0) 2013-10-12 11:26:10 Stack: 2013-10-12 11:26:10 ffff880dffff7d10 ffffffffa0a41340 ffff880dffff7d10 ffffffffa0a78b6b 2013-10-12 11:26:10 ffff880813f11c40 0000000000000000 0000000000000000 ffff8802fddb8000 2013-10-12 11:26:10 ffff880813eda400 ffff880813eda4d0 000000000000f53b 000000005259914d 2013-10-12 11:26:10 Call Trace: 2013-10-12 11:26:10 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:26:10 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:26:10 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:26:10 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:26:10 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:26:10 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:26:10 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:26:10 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:26:10 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:26:10 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:26:10 [] ? kthread+0x96/0xa0 2013-10-12 11:26:10 [] ? child_rip+0xa/0x20 2013-10-12 11:26:10 [] ? kthread+0x0/0xa0 2013-10-12 11:26:10 [] ? child_rip+0x0/0x20 2013-10-12 11:26:10 Code: 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 0f b7 17 f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 e5 0f 1f 2013-10-12 11:26:10 Call Trace: 2013-10-12 11:26:10 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:26:10 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:26:10 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:26:10 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:26:10 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:26:10 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:26:10 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:26:10 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:26:10 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:26:10 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:26:10 [] ? kthread+0x96/0xa0 2013-10-12 11:26:10 [] ? child_rip+0xa/0x20 2013-10-12 11:26:10 [] ? kthread+0x0/0xa0 2013-10-12 11:26:10 [] ? 
child_rip+0x0/0x20 2013-10-12 11:26:32 LNetError: 4985:0:(o2iblnd_cb.c:3012:kiblnd_check_txs_locked()) Timed out tx: active_txs, 1 seconds 2013-10-12 11:26:32 LNetError: 4985:0:(o2iblnd_cb.c:3012:kiblnd_check_txs_locked()) Skipped 48 previous similar messages 2013-10-12 11:26:32 LNetError: 4985:0:(o2iblnd_cb.c:3075:kiblnd_check_conns()) Timed out RDMA with 192.168.124.28@o2ib (155): c: 8, oc: 0, rc: 8 2013-10-12 11:26:32 LNetError: 4985:0:(o2iblnd_cb.c:3075:kiblnd_check_conns()) Skipped 48 previous similar messages 2013-10-12 11:26:46 INFO: task kthreadd:6322 blocked for more than 120 seconds. 2013-10-12 11:26:46 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 2013-10-12 11:26:46 kthreadd D 000000000000000c 0 6322 2 0x00000000 2013-10-12 11:26:46 ffff880df8e23ee0 0000000000000046 0000000000000000 ffff880df8e23ea4 2013-10-12 11:26:46 0000000000000000 ffff88083fe82800 ffff880044696740 0000000000000400 2013-10-12 11:26:46 ffff880df8e19058 ffff880df8e23fd8 000000000000fb88 ffff880df8e19058 2013-10-12 11:26:46 Call Trace: 2013-10-12 11:26:46 [] ? libcfs_debug_dumplog_thread+0x0/0x30 [libcfs] 2013-10-12 11:26:46 [] kthread+0x77/0xa0 2013-10-12 11:26:46 [] child_rip+0xa/0x20 2013-10-12 11:26:46 [] ? kthread+0x0/0xa0 2013-10-12 11:26:46 [] ? child_rip+0x0/0x20 2013-10-12 11:27:32 BUG: soft lockup - CPU#4 stuck for 67s! [ll_ost01_008:6313] 2013-10-12 11:27:32 BUG: soft lockup - CPU#5 stuck for 67s! 
[ll_ost01_002:5939] 2013-10-12 11:27:32 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:27:32 CPU 5 2013-10-12 11:27:32 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:27:32 2013-10-12 11:27:32 Pid: 5939, comm: ll_ost01_002 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:27:32 RIP: 0010:[] [] _spin_lock+0x1e/0x30 2013-10-12 11:27:32 RSP: 0018:ffff880ccc9d5cb0 EFLAGS: 00000297 2013-10-12 11:27:32 RAX: 00000000000032af RBX: ffff880ccc9d5cb0 RCX: 000000000000002f 2013-10-12 11:27:32 RDX: 00000000000032ae RSI: ffff8805eef22c00 RDI: ffff880813eda430 2013-10-12 11:27:32 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:27:32 R10: 00000000000957c0 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000000 2013-10-12 11:27:32 R13: ffffffffa0f70433 R14: ffff880ccc9d5d60 R15: ffff8805eef22fa0 2013-10-12 11:27:32 FS: 0000000000000000(0000) GS:ffff8800446a0000(0000) knlGS:0000000000000000 2013-10-12 11:27:32 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:27:32 CR2: 00007ffff7ff7000 CR3: 00000010347bb000 CR4: 00000000000407e0 2013-10-12 11:27:32 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:27:32 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:27:32 Process ll_ost01_002 (pid: 5939, threadinfo ffff880ccc9d4000, task ffff880ccc9ca040) 2013-10-12 11:27:32 Stack: 2013-10-12 11:27:32 ffff880ccc9d5d10 ffffffffa0a41340 ffff880ccc9d5d10 ffffffffa0a78b6b 2013-10-12 11:27:32 ffff880813f11c40 0000000000000000 0000000000000000 ffff8805eef22c00 2013-10-12 11:27:32 ffff880813eda400 ffff880813eda4d0 0000000000001967 000000005259914d 2013-10-12 11:27:32 Call Trace: 2013-10-12 11:27:32 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:27:33 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:27:33 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:27:33 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:27:33 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:27:33 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:27:33 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:27:33 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:27:33 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:27:33 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:27:33 [] ? kthread+0x96/0xa0 2013-10-12 11:27:33 [] ? child_rip+0xa/0x20 2013-10-12 11:27:33 [] ? kthread+0x0/0xa0 2013-10-12 11:27:33 [] ? child_rip+0x0/0x20 2013-10-12 11:27:33 Code: 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 2013-10-12 11:27:33 Call Trace: 2013-10-12 11:27:33 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:27:33 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:27:33 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:27:33 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:27:33 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:27:33 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:27:33 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:27:33 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:27:33 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:27:33 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:27:33 [] ? kthread+0x96/0xa0 2013-10-12 11:27:33 [] ? child_rip+0xa/0x20 2013-10-12 11:27:33 [] ? kthread+0x0/0xa0 2013-10-12 11:27:33 [] ? child_rip+0x0/0x20 2013-10-12 11:27:33 BUG: soft lockup - CPU#6 stuck for 67s! 
[ll_ost01_006:6292] 2013-10-12 11:27:33 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:27:33 CPU 6 2013-10-12 11:27:33 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:27:33 2013-10-12 11:27:33 Pid: 6292, comm: ll_ost01_006 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:27:33 RIP: 0010:[] [] _spin_lock+0x1e/0x30 2013-10-12 11:27:33 RSP: 0018:ffff880e05805cb0 EFLAGS: 00000283 2013-10-12 11:27:33 RAX: 00000000000032b1 RBX: ffff880e05805cb0 RCX: 000000000000003d 2013-10-12 11:27:33 RDX: 00000000000032ae RSI: ffff88067a044800 RDI: ffff880813eda430 2013-10-12 11:27:33 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:27:33 R10: 00000000000958d5 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000400 2013-10-12 11:27:33 R13: ffffffffa0f70433 R14: ffff880e05805d60 R15: ffff88067a044ba0 2013-10-12 11:27:33 FS: 0000000000000000(0000) GS:ffff8800446c0000(0000) knlGS:0000000000000000 2013-10-12 11:27:33 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:27:33 CR2: 00007ffff7feb000 CR3: 000000082827b000 CR4: 00000000000407e0 2013-10-12 11:27:33 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:27:33 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:27:33 Process ll_ost01_006 (pid: 6292, threadinfo ffff880e05804000, task ffff880fd403c080) 2013-10-12 11:27:33 Stack: 2013-10-12 11:27:33 ffff880e05805d10 ffffffffa0a41340 ffff880e05805d10 ffffffffa0a78b6b 2013-10-12 11:27:33 ffff880813f11c40 0000000000000000 0000000000000000 ffff88067a044800 2013-10-12 11:27:33 ffff880813eda400 ffff880813eda4d0 00000000000014e8 000000005259914d 2013-10-12 11:27:33 Call Trace: 2013-10-12 11:27:33 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:27:33 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:27:33 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:27:33 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:27:33 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:27:33 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:27:33 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:27:33 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:27:33 [] ? __wake_up_common+0x59/0x90 2013-10-12 11:27:33 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:27:33 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:27:33 [] ? kthread+0x96/0xa0 2013-10-12 11:27:33 [] ? child_rip+0xa/0x20 2013-10-12 11:27:33 [] ? kthread+0x0/0xa0 2013-10-12 11:27:33 [] ? child_rip+0x0/0x20 2013-10-12 11:27:33 Code: 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 2013-10-12 11:27:33 Call Trace: 2013-10-12 11:27:33 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:27:33 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:27:33 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:27:33 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:27:33 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:27:33 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:27:33 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:27:33 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:27:33 [] ? __wake_up_common+0x59/0x90 2013-10-12 11:27:33 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:27:33 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:27:33 [] ? kthread+0x96/0xa0 2013-10-12 11:27:33 [] ? child_rip+0xa/0x20 2013-10-12 11:27:33 [] ? kthread+0x0/0xa0 2013-10-12 11:27:33 [] ? child_rip+0x0/0x20 2013-10-12 11:27:33 BUG: soft lockup - CPU#7 stuck for 67s! 
[ll_ost01_009:6315] 2013-10-12 11:27:33 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:27:33 CPU 7 2013-10-12 11:27:33 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:27:33 2013-10-12 11:27:33 Pid: 6315, comm: ll_ost01_009 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:27:33 RIP: 0010:[] [] _spin_lock+0x1c/0x30 2013-10-12 11:27:33 RSP: 0018:ffff880df8e01cb0 EFLAGS: 00000283 2013-10-12 11:27:33 RAX: 00000000000032b0 RBX: ffff880df8e01cb0 RCX: 0000000000000054 2013-10-12 11:27:33 RDX: 00000000000032ae RSI: ffff8805eef22400 RDI: ffff880813eda430 2013-10-12 11:27:33 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:27:33 R10: 00000000000958b3 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000000 2013-10-12 11:27:33 R13: ffffffffa0f70433 R14: ffff880df8e01d60 R15: ffff8805eef227a0 2013-10-12 11:27:34 FS: 0000000000000000(0000) GS:ffff8800446e0000(0000) knlGS:0000000000000000 2013-10-12 11:27:34 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:27:34 CR2: 00007ffff7feb000 CR3: 0000000001a85000 CR4: 00000000000407e0 2013-10-12 11:27:34 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:27:34 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:27:34 Process ll_ost01_009 (pid: 6315, threadinfo ffff880df8e00000, task ffff880dfffff500) 2013-10-12 11:27:34 Stack: 2013-10-12 11:27:34 ffff880df8e01d10 ffffffffa0a41340 ffff880df8e01d10 ffffffffa0a78b6b 2013-10-12 11:27:34 ffff880813f11c40 0000000000000000 0000000000000000 ffff8805eef22400 2013-10-12 11:27:34 ffff880813eda400 ffff880813eda4d0 00000000000014e0 000000005259914d 2013-10-12 11:27:34 Call Trace: 2013-10-12 11:27:34 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:27:34 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:27:34 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:27:34 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:27:34 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:27:34 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:27:34 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:27:34 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:27:34 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:27:34 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:27:34 [] ? kthread+0x96/0xa0 2013-10-12 11:27:34 [] ? child_rip+0xa/0x20 2013-10-12 11:27:34 [] ? kthread+0x0/0xa0 2013-10-12 11:27:34 [] ? child_rip+0x0/0x20 2013-10-12 11:27:34 Code: 81 2f 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e <f3> 90 0f b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 2013-10-12 11:27:34 Call Trace: 2013-10-12 11:27:34 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:27:34 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:27:34 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:27:34 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:27:34 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:27:34 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:27:34 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:27:34 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:27:34 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:27:34 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:27:34 [] ? kthread+0x96/0xa0 2013-10-12 11:27:34 [] ? child_rip+0xa/0x20 2013-10-12 11:27:34 [] ? kthread+0x0/0xa0 2013-10-12 11:27:34 [] ? 
child_rip+0x0/0x20 2013-10-12 11:27:34 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:27:34 CPU 4 2013-10-12 11:27:34 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core 
be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:27:34 2013-10-12 11:27:34 Pid: 6313, comm: ll_ost01_008 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:27:34 RIP: 0010:[] [] _spin_lock+0x1e/0x30 2013-10-12 11:27:34 RSP: 0018:ffff880dffff7cb0 EFLAGS: 00000287 2013-10-12 11:27:34 RAX: 00000000000032b2 RBX: ffff880dffff7cb0 RCX: 0000000000000054 2013-10-12 11:27:34 RDX: 00000000000032ae RSI: ffff8802fddb8000 RDI: ffff880813eda430 2013-10-12 11:27:34 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:27:34 R10: 00000000000a3eda R11: 5a5a5a5a5a5a5a5a R12: ffff880834502080 2013-10-12 11:27:34 R13: ffffffffa0f70433 R14: ffff880dffff7d60 R15: ffff8802fddb83a0 2013-10-12 11:27:34 FS: 0000000000000000(0000) GS:ffff880044680000(0000) knlGS:0000000000000000 2013-10-12 11:27:34 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:27:34 CR2: 00007ffffffdc4d8 CR3: 0000000001a85000 CR4: 00000000000407e0 2013-10-12 11:27:34 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:27:34 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:27:34 Process ll_ost01_008 (pid: 6313, threadinfo ffff880dffff6000, task ffff880dfffecae0) 2013-10-12 11:27:34 Stack: 2013-10-12 11:27:34 ffff880dffff7d10 ffffffffa0a41340 ffff880dffff7d10 ffffffffa0a78b6b 2013-10-12 11:27:34 ffff880813f11c40 0000000000000000 0000000000000000 ffff8802fddb8000 2013-10-12 11:27:34 ffff880813eda400 ffff880813eda4d0 000000000000f53b 000000005259914d 2013-10-12 11:27:34 Call Trace: 2013-10-12 11:27:34 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:27:34 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:27:34 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:27:34 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:27:34 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:27:34 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:27:34 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:27:34 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:27:34 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:27:34 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:27:34 [] ? kthread+0x96/0xa0 2013-10-12 11:27:34 [] ? child_rip+0xa/0x20 2013-10-12 11:27:34 [] ? kthread+0x0/0xa0 2013-10-12 11:27:34 [] ? child_rip+0x0/0x20 2013-10-12 11:27:34 Code: 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 2013-10-12 11:27:34 Call Trace: 2013-10-12 11:27:34 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:27:34 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:27:34 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:27:34 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:27:34 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:27:34 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:27:34 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:27:34 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:27:34 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:27:34 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:27:34 [] ? kthread+0x96/0xa0 2013-10-12 11:27:34 [] ? child_rip+0xa/0x20 2013-10-12 11:27:34 [] ? kthread+0x0/0xa0 2013-10-12 11:27:34 [] ? child_rip+0x0/0x20 2013-10-12 11:28:46 INFO: task kthreadd:6322 blocked for more than 120 seconds. 2013-10-12 11:28:46 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
2013-10-12 11:28:46 kthreadd D 000000000000000c 0 6322 2 0x00000000 2013-10-12 11:28:46 ffff880df8e23ee0 0000000000000046 0000000000000000 ffff880df8e23ea4 2013-10-12 11:28:46 0000000000000000 ffff88083fe82800 ffff880044696740 0000000000000400 2013-10-12 11:28:46 ffff880df8e19058 ffff880df8e23fd8 000000000000fb88 ffff880df8e19058 2013-10-12 11:28:46 Call Trace: 2013-10-12 11:28:46 [] ? libcfs_debug_dumplog_thread+0x0/0x30 [libcfs] 2013-10-12 11:28:46 [] kthread+0x77/0xa0 2013-10-12 11:28:47 [] child_rip+0xa/0x20 2013-10-12 11:28:47 [] ? kthread+0x0/0xa0 2013-10-12 11:28:47 [] ? child_rip+0x0/0x20 2013-10-12 11:28:56 BUG: soft lockup - CPU#4 stuck for 66s! [ll_ost01_008:6313] 2013-10-12 11:28:56 BUG: soft lockup - CPU#5 stuck for 66s! [ll_ost01_002:5939] 2013-10-12 11:28:56 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:28:56 CPU 5 2013-10-12 11:28:56 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) 
fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:28:56 2013-10-12 11:28:56 Pid: 5939, comm: ll_ost01_002 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:28:56 RIP: 0010:[] [] _spin_lock+0x21/0x30 2013-10-12 11:28:56 RSP: 0018:ffff880ccc9d5cb0 EFLAGS: 00000297 2013-10-12 11:28:56 RAX: 00000000000032af RBX: ffff880ccc9d5cb0 RCX: 000000000000002f 2013-10-12 11:28:56 RDX: 00000000000032ae RSI: ffff8805eef22c00 RDI: ffff880813eda430 2013-10-12 11:28:56 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:28:56 R10: 00000000000957c0 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000000 2013-10-12 11:28:56 R13: ffffffffa0f70433 R14: ffff880ccc9d5d60 R15: ffff8805eef22fa0 2013-10-12 11:28:56 FS: 0000000000000000(0000) GS:ffff8800446a0000(0000) knlGS:0000000000000000 2013-10-12 11:28:56 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:28:56 CR2: 00007ffff7ff7000 CR3: 00000010347bb000 CR4: 00000000000407e0 2013-10-12 11:28:57 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:28:57 DR3: 0000000000000000 DR6: 
00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:28:57 Process ll_ost01_002 (pid: 5939, threadinfo ffff880ccc9d4000, task ffff880ccc9ca040) 2013-10-12 11:28:57 Stack: 2013-10-12 11:28:57 ffff880ccc9d5d10 ffffffffa0a41340 ffff880ccc9d5d10 ffffffffa0a78b6b 2013-10-12 11:28:57 ffff880813f11c40 0000000000000000 0000000000000000 ffff8805eef22c00 2013-10-12 11:28:57 ffff880813eda400 ffff880813eda4d0 0000000000001967 000000005259914d 2013-10-12 11:28:57 Call Trace: 2013-10-12 11:28:57 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:28:57 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:28:57 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:28:57 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:28:57 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:28:57 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:28:57 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:28:57 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:28:57 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:28:57 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:28:57 [] ? kthread+0x96/0xa0 2013-10-12 11:28:57 [] ? child_rip+0xa/0x20 2013-10-12 11:28:57 [] ? kthread+0x0/0xa0 2013-10-12 11:28:57 [] ? child_rip+0x0/0x20 2013-10-12 11:28:57 Code: 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 0f b7 17 <eb> f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 e5 0f 1f 2013-10-12 11:28:57 Call Trace: 2013-10-12 11:28:57 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:28:57 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:28:57 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:28:57 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:28:57 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:28:57 [] ? 
cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:28:57 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:28:57 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:28:57 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:28:57 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:28:57 [] ? kthread+0x96/0xa0 2013-10-12 11:28:57 [] ? child_rip+0xa/0x20 2013-10-12 11:28:57 [] ? kthread+0x0/0xa0 2013-10-12 11:28:57 [] ? child_rip+0x0/0x20 2013-10-12 11:28:57 BUG: soft lockup - CPU#6 stuck for 66s! [ll_ost01_006:6292] 2013-10-12 11:28:57 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:28:57 CPU 6 2013-10-12 11:28:57 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) 
zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:28:57 2013-10-12 11:28:57 Pid: 6292, comm: ll_ost01_006 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:28:57 RIP: 0010:[] [] _spin_lock+0x21/0x30 2013-10-12 11:28:57 RSP: 0018:ffff880e05805cb0 EFLAGS: 00000283 2013-10-12 11:28:57 RAX: 00000000000032b1 RBX: ffff880e05805cb0 RCX: 000000000000003d 2013-10-12 11:28:57 RDX: 00000000000032ae RSI: ffff88067a044800 RDI: ffff880813eda430 2013-10-12 11:28:57 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:28:57 R10: 00000000000958d5 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000400 2013-10-12 11:28:57 R13: ffffffffa0f70433 R14: ffff880e05805d60 R15: ffff88067a044ba0 2013-10-12 11:28:57 FS: 0000000000000000(0000) GS:ffff8800446c0000(0000) knlGS:0000000000000000 2013-10-12 11:28:57 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:28:57 CR2: 00007ffff7feb000 CR3: 000000082827b000 CR4: 00000000000407e0 2013-10-12 11:28:57 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:28:57 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:28:57 Process ll_ost01_006 (pid: 6292, threadinfo ffff880e05804000, task ffff880fd403c080) 2013-10-12 11:28:57 Stack: 2013-10-12 11:28:57 ffff880e05805d10 ffffffffa0a41340 
ffff880e05805d10 ffffffffa0a78b6b 2013-10-12 11:28:57 ffff880813f11c40 0000000000000000 0000000000000000 ffff88067a044800 2013-10-12 11:28:57 ffff880813eda400 ffff880813eda4d0 00000000000014e8 000000005259914d 2013-10-12 11:28:57 Call Trace: 2013-10-12 11:28:57 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:28:57 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:28:57 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:28:57 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:28:57 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:28:57 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:28:57 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:28:57 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:28:57 [] ? __wake_up_common+0x59/0x90 2013-10-12 11:28:57 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:28:57 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:28:57 [] ? kthread+0x96/0xa0 2013-10-12 11:28:57 [] ? child_rip+0xa/0x20 2013-10-12 11:28:57 [] ? kthread+0x0/0xa0 2013-10-12 11:28:57 [] ? child_rip+0x0/0x20 2013-10-12 11:28:57 Code: 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 0f b7 17 f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 e5 0f 1f 2013-10-12 11:28:57 Call Trace: 2013-10-12 11:28:57 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:28:57 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:28:57 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:28:57 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:28:57 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:28:57 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:28:57 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:28:57 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:28:57 [] ? 
__wake_up_common+0x59/0x90 2013-10-12 11:28:57 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:28:57 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:28:57 [] ? kthread+0x96/0xa0 2013-10-12 11:28:57 [] ? child_rip+0xa/0x20 2013-10-12 11:28:57 [] ? kthread+0x0/0xa0 2013-10-12 11:28:57 [] ? child_rip+0x0/0x20 2013-10-12 11:28:57 BUG: soft lockup - CPU#7 stuck for 66s! [ll_ost01_009:6315] 2013-10-12 11:28:57 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:28:57 CPU 7 2013-10-12 11:28:57 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf 
ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:28:57 2013-10-12 11:28:57 Pid: 6315, comm: ll_ost01_009 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:28:57 RIP: 0010:[] [] _spin_lock+0x21/0x30 2013-10-12 11:28:57 RSP: 0018:ffff880df8e01cb0 EFLAGS: 00000283 2013-10-12 11:28:57 RAX: 00000000000032b0 RBX: ffff880df8e01cb0 RCX: 0000000000000054 2013-10-12 11:28:57 RDX: 00000000000032ae RSI: ffff8805eef22400 RDI: ffff880813eda430 2013-10-12 11:28:57 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:28:57 R10: 00000000000958b3 R11: 5a5a5a5a5a5a5a5a R12: 0000000000000000 2013-10-12 11:28:57 R13: ffffffffa0f70433 R14: ffff880df8e01d60 R15: ffff8805eef227a0 2013-10-12 11:28:58 FS: 0000000000000000(0000) GS:ffff8800446e0000(0000) knlGS:0000000000000000 2013-10-12 11:28:58 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:28:58 CR2: 00007ffff7feb000 CR3: 0000000001a85000 CR4: 00000000000407e0 2013-10-12 11:28:58 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:28:58 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:28:58 Process ll_ost01_009 (pid: 6315, threadinfo ffff880df8e00000, task ffff880dfffff500) 2013-10-12 11:28:58 Stack: 2013-10-12 11:28:58 ffff880df8e01d10 ffffffffa0a41340 ffff880df8e01d10 ffffffffa0a78b6b 2013-10-12 11:28:58 ffff880813f11c40 0000000000000000 0000000000000000 ffff8805eef22400 2013-10-12 11:28:58 
ffff880813eda400 ffff880813eda4d0 00000000000014e0 000000005259914d 2013-10-12 11:28:58 Call Trace: 2013-10-12 11:28:58 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:28:58 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:28:58 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:28:58 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:28:58 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:28:58 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:28:58 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:28:58 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:28:58 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:28:58 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:28:58 [] ? kthread+0x96/0xa0 2013-10-12 11:28:58 [] ? child_rip+0xa/0x20 2013-10-12 11:28:58 [] ? kthread+0x0/0xa0 2013-10-12 11:28:58 [] ? child_rip+0x0/0x20 2013-10-12 11:28:58 Code: 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 0f b7 17 f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 e5 0f 1f 2013-10-12 11:28:58 Call Trace: 2013-10-12 11:28:58 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:28:58 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:28:58 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:28:58 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:28:58 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:28:58 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:28:58 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:28:58 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:28:58 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:28:58 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:28:58 [] ? kthread+0x96/0xa0 2013-10-12 11:28:58 [] ? child_rip+0xa/0x20 2013-10-12 11:28:58 [] ? 
kthread+0x0/0xa0 2013-10-12 11:28:58 [] ? child_rip+0x0/0x20 2013-10-12 11:28:58 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:28:58 CPU 4 2013-10-12 11:28:58 Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ldiskfs(U) jbd2 mbcache ko2iblnd(U) lnet(U) sha512_generic sha256_generic crc32c_intel libcfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) dm_round_robin spl(U) zlib_deflate scsi_dh_rdac sg sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod vhost_net macvtap macvlan tun kvm sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support wmi ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc igb 
dca ptp pps_core mlx4_en mlx4_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] 2013-10-12 11:28:58 2013-10-12 11:28:58 Pid: 6313, comm: ll_ost01_008 Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1 appro 512x/S2600JF 2013-10-12 11:28:58 RIP: 0010:[] [] _spin_lock+0x1e/0x30 2013-10-12 11:28:58 RSP: 0018:ffff880dffff7cb0 EFLAGS: 00000287 2013-10-12 11:28:58 RAX: 00000000000032b2 RBX: ffff880dffff7cb0 RCX: 0000000000000054 2013-10-12 11:28:58 RDX: 00000000000032ae RSI: ffff8802fddb8000 RDI: ffff880813eda430 2013-10-12 11:28:58 RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a 2013-10-12 11:28:58 R10: 00000000000a3eda R11: 5a5a5a5a5a5a5a5a R12: ffff880834502080 2013-10-12 11:28:58 R13: ffffffffa0f70433 R14: ffff880dffff7d60 R15: ffff8802fddb83a0 2013-10-12 11:28:58 FS: 0000000000000000(0000) GS:ffff880044680000(0000) knlGS:0000000000000000 2013-10-12 11:28:58 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b 2013-10-12 11:28:58 CR2: 00007ffffffdc4d8 CR3: 0000000001a85000 CR4: 00000000000407e0 2013-10-12 11:28:58 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2013-10-12 11:28:58 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2013-10-12 11:28:58 Process ll_ost01_008 (pid: 6313, threadinfo ffff880dffff6000, task ffff880dfffecae0) 2013-10-12 11:28:58 Stack: 2013-10-12 11:28:58 ffff880dffff7d10 ffffffffa0a41340 ffff880dffff7d10 ffffffffa0a78b6b 2013-10-12 11:28:58 ffff880813f11c40 0000000000000000 0000000000000000 ffff8802fddb8000 2013-10-12 11:28:58 ffff880813eda400 ffff880813eda4d0 000000000000f53b 000000005259914d 2013-10-12 11:28:58 Call Trace: 2013-10-12 11:28:58 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:28:58 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:28:58 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:28:58 [] ? 
ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:28:58 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:28:58 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:28:58 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:28:58 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:28:58 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:28:58 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:28:58 [] ? kthread+0x96/0xa0 2013-10-12 11:28:58 [] ? child_rip+0xa/0x20 2013-10-12 11:28:58 [] ? kthread+0x0/0xa0 2013-10-12 11:28:58 [] ? child_rip+0x0/0x20 2013-10-12 11:28:58 Code: 00 00 00 01 74 05 e8 f2 2d d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89 2013-10-12 11:28:58 Call Trace: 2013-10-12 11:28:58 [] ? ptlrpc_server_drop_request+0x80/0x300 [ptlrpc] 2013-10-12 11:28:58 [] ? nrs_resource_put_safe+0x9b/0xf0 [ptlrpc] 2013-10-12 11:28:58 [] ? ptlrpc_server_finish_request+0xf9/0x160 [ptlrpc] 2013-10-12 11:28:58 [] ? ptlrpc_server_finish_active_request+0xf8/0x150 [ptlrpc] 2013-10-12 11:28:58 [] ? ptlrpc_server_handle_request+0x1b2/0xc00 [ptlrpc] 2013-10-12 11:28:58 [] ? cfs_timer_arm+0xe/0x10 [libcfs] 2013-10-12 11:28:58 [] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 2013-10-12 11:28:58 [] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc] 2013-10-12 11:28:58 [] ? ptlrpc_main+0xaed/0x1740 [ptlrpc] 2013-10-12 11:28:58 [] ? ptlrpc_main+0x0/0x1740 [ptlrpc] 2013-10-12 11:28:58 [] ? kthread+0x96/0xa0 2013-10-12 11:28:58 [] ? child_rip+0xa/0x20 2013-10-12 11:28:58 [] ? kthread+0x0/0xa0 2013-10-12 11:28:58 [] ? 
child_rip+0x0/0x20
2013-10-12 11:29:13 Lustre: 5862:0:(client.c:1897:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1381602498/real 0] req@ffff8801672b1000 x1448711817269520/t0(0) o38->lustre-MDT0000-lwp-OST0005@192.168.120.5@o2ib1:12/10 lens 400/544 e 0 to 1 dl 1381602553 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
2013-10-12 11:29:13 Lustre: 5862:0:(client.c:1897:ptlrpc_expire_one_request()) Skipped 13 previous similar messages