Details
-
Bug
-
Resolution: Unresolved
-
Minor
-
None
-
Lustre 2.10.0
-
None
-
Full - RHEL7.3 Server/Client DNE
master, build# 3486
-
3
-
9223372036854775807
Description
This issue was created by maloo for Saurabh Tandan <saurabh.tandan@intel.com>
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/04f714d8-d4d2-11e6-8b52-5254006e85c2.
The sub-test test_1 failed with the following error:
test failed to respond and timed out
test_log:
[48194.781895] LustreError: 3746:0:(ldlm_resource.c:882:ldlm_resource_complain()) lustre-OST0007-osc-ffff88005a894800: namespace resource [0x76:0x0:0x0].0x0 (ffff880046c1e300) refcount nonzero (1) after lock cleanup; forcing cleanup. [48194.788720] LustreError: 3746:0:(ldlm_resource.c:1463:ldlm_resource_dump()) --- Resource: [0x76:0x0:0x0].0x0 (ffff880046c1e300) refcount = 2 [48194.794573] LustreError: 3746:0:(ldlm_resource.c:1466:ldlm_resource_dump()) Granted locks (in reverse order): [48194.797833] LustreError: 3746:0:(ldlm_resource.c:1469:ldlm_resource_dump()) ### ### ns: lustre-OST0007-osc-ffff88005a894800 lock: ffff8800080d8e00/0x932697c069dde9c4 lrc: 3/0,1 mode: PW/PW res: [0x76:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 28672->18446744073709551615) flags: 0x526400020000 nid: local remote: 0x8ba51921be08b8ef expref: -99 pid: 16890 timeout: 0 lvb_type: 1 [48229.039814] LustreError: 11-0: lustre-OST0005-osc-ffff88004abe5800: operation ost_write to node 10.2.4.190@tcp failed: rc = -107 [48229.046636] LustreError: 167-0: lustre-OST0005-osc-ffff88004abe5800: This client was evicted by lustre-OST0005; in progress operations using this service will fail. [48229.062748] Lustre: 18207:0:(llite_lib.c:2622:ll_dirty_page_discard_warn()) lustre: dirty page discard: 10.2.4.189@tcp:/lustre/fid: [0x240000405:0x45e:0x0]/ may get corrupted (rc -108) [48229.070255] LustreError: 12523:0:(ldlm_resource.c:882:ldlm_resource_complain()) lustre-OST0005-osc-ffff88004abe5800: namespace resource [0xbf:0x0:0x0].0x0 (ffff88001d24f000) refcount nonzero (1) after lock cleanup; forcing cleanup. [48229.076674] LustreError: 12523:0:(ldlm_resource.c:1463:ldlm_resource_dump()) --- Resource: [0xbf:0x0:0x0].0x0 (ffff88001d24f000) refcount = 2 [48229.082456] LustreError: 12523:0:(ldlm_resource.c:1466:ldlm_resource_dump()) Granted locks (in reverse order): [48229.085666] LustreError: 12523:0:(ldlm_resource.c:1469:ldlm_resource_dump()) ### ### ns: lustre-OST0005-osc-ffff88004abe5800 lock: ffff88000e771600/0x932697c069e0446c lrc: 3/0,1 mode: PW/PW res: [0xbf:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 28672->18446744073709551615) flags: 0x526400020000 nid: local remote: 0x8ba51921be096af1 expref: -99 pid: 10779 timeout: 0 lvb_type: 1 [48240.216240] INFO: task truncate:16890 blocked for more than 120 seconds. [48240.219339] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [48240.222359] truncate D ffff880079664140 0 16890 27039 0x00000080 [48240.225411] ffff88001823bbf0 0000000000000082 ffff8800086d9f60 ffff88001823bfd8 [48240.228469] ffff88001823bfd8 ffff88001823bfd8 ffff8800086d9f60 ffff880079664138 [48240.231446] ffff88007966413c ffff8800086d9f60 00000000ffffffff ffff880079664140 [48240.234380] Call Trace: [48240.236808] [<ffffffff8168c989>] schedule_preempt_disabled+0x29/0x70 [48240.239556] [<ffffffff8168a5e5>] __mutex_lock_slowpath+0xc5/0x1c0 [48240.242242] [<ffffffff81689a4f>] mutex_lock+0x1f/0x2f [48240.244850] [<ffffffffa0cb7a6d>] vvp_io_setattr_start+0x16d/0x240 [lustre] [48240.247608] [<ffffffffa0803035>] cl_io_start+0x65/0x130 [obdclass] [48240.250248] [<ffffffffa08053d5>] cl_io_loop+0xa5/0x190 [obdclass] [48240.252893] [<ffffffffa0cb2610>] cl_setattr_ost+0x240/0x3a0 [lustre] [48240.255505] [<ffffffffa0c8c709>] ll_setattr_raw+0x1299/0x1340 [lustre] [48240.258095] [<ffffffffa0c8c813>] ll_setattr+0x63/0xc0 [lustre] [48240.260529] [<ffffffff8121b879>] notify_change+0x279/0x3d0 [48240.262914] [<ffffffff812a8c16>] ? security_inode_need_killpriv+0x16/0x20 [48240.265369] [<ffffffff811fc3f5>] do_truncate+0x75/0xc0 [48240.267646] [<ffffffff811fc5a4>] vfs_truncate+0x164/0x190 [48240.269914] [<ffffffff811fc65c>] do_sys_truncate+0x8c/0xb0 [48240.272156] [<ffffffff811fc80e>] SyS_truncate+0xe/0x10 [48240.274338] [<ffffffff816967c9>] system_call_fastpath+0x16/0x1b [48240.276546] INFO: task lfs:17858 blocked for more than 120 seconds. [48240.278740] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [48240.281035] lfs D ffff880079664140 0 17858 27046 0x00000080 [48240.283285] ffff880046c37c20 0000000000000086 ffff880052811f60 ffff880046c37fd8 [48240.285563] ffff880046c37fd8 ffff880046c37fd8 ffff880052811f60 ffff880079664138 [48240.287811] ffff88007966413c ffff880052811f60 00000000ffffffff ffff880079664140 [48240.290058] Call Trace: [48240.291829] [<ffffffff8168c989>] schedule_preempt_disabled+0x29/0x70 [48240.293883] [<ffffffff8168a5e5>] __mutex_lock_slowpath+0xc5/0x1c0 [48240.295889] [<ffffffff81689a4f>] mutex_lock+0x1f/0x2f [48240.298771] [<ffffffff812c9b66>] ima_file_check+0xa6/0x1b0 [48240.357072] [<ffffffff8120cffa>] do_last+0x58a/0x12a0 [48240.358931] [<ffffffff8120ddd2>] path_openat+0xc2/0x490 [48240.360794] [<ffffffff811806bb>] ? unlock_page+0x2b/0x30 [48240.362601] [<ffffffff8120ff4b>] do_filp_open+0x4b/0xb0 [48240.364355] [<ffffffff8121cbc7>] ? __alloc_fd+0xa7/0x130 [48240.366071] [<ffffffff811fd453>] do_sys_open+0xf3/0x1f0 [48240.367747] [<ffffffff81691fc5>] ? do_page_fault+0x35/0x90 [48240.369406] [<ffffffff811fd56e>] SyS_open+0x1e/0x20 [48240.370970] [<ffffffff816967c9>] system_call_fastpath+0x16/0x1b [48240.372593] INFO: task lfs:19133 blocked for more than 120 seconds. [48240.374235] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [48240.375993] lfs D ffff880079664140 0 19133 27081 0x00000080 [48240.377717] ffff880037e73c20 0000000000000086 ffff88001c880000 ffff880037e73fd8 [48240.379462] ffff880037e73fd8 ffff880037e73fd8 ffff88001c880000 ffff880079664138 [48240.381186] ffff88007966413c ffff88001c880000 00000000ffffffff ffff880079664140 [48240.382886] Call Trace: [48240.384118] [<ffffffff8168c989>] schedule_preempt_disabled+0x29/0x70 [48240.385672] [<ffffffff8168a5e5>] __mutex_lock_slowpath+0xc5/0x1c0 [48240.387215] [<ffffffff81689a4f>] mutex_lock+0x1f/0x2f [48240.388672] [<ffffffff812c9b66>] ima_file_check+0xa6/0x1b0 [48240.390145] [<ffffffff8120cffa>] do_last+0x58a/0x12a0 [48240.391576] [<ffffffff8120ddd2>] path_openat+0xc2/0x490 [48240.393031] [<ffffffff811bae71>] ? page_add_new_anon_rmap+0x91/0x150 [48240.394562] [<ffffffff8120ff4b>] do_filp_open+0x4b/0xb0 [48240.395999] [<ffffffff8121cbc7>] ? __alloc_fd+0xa7/0x130 [48240.397438] [<ffffffff811fd453>] do_sys_open+0xf3/0x1f0 [48240.398865] [<ffffffff81691fc5>] ? do_page_fault+0x35/0x90 [48240.400311] [<ffffffff811fd56e>] SyS_open+0x1e/0x20 [48240.401749] [<ffffffff816967c9>] system_call_fastpath+0x16/0x1b [48240.403238] INFO: task file_concat.sh:21235 blocked for more than 120 seconds. [48240.404831] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [48240.406503] file_concat.sh D ffff880079664140 0 21235 27053 0x00000080 [48240.408141] ffff880046d93c20 0000000000000086 ffff88000e75ce70 ffff880046d93fd8 [48240.409850] ffff880046d93fd8 ffff880046d93fd8 ffff88000e75ce70 ffff880079664138 [48240.411585] ffff88007966413c ffff88000e75ce70 00000000ffffffff ffff880079664140 [48240.413307] Call Trace: [48240.414561] [<ffffffff8168c989>] schedule_preempt_disabled+0x29/0x70 [48240.416137] [<ffffffff8168a5e5>] __mutex_lock_slowpath+0xc5/0x1c0 [48240.417693] [<ffffffff81689a4f>] mutex_lock+0x1f/0x2f [48240.419140] [<ffffffff812c9b66>] ima_file_check+0xa6/0x1b0 [48240.420614] [<ffffffff8120cffa>] do_last+0x58a/0x12a0 [48240.422047] [<ffffffff8120ddd2>] path_openat+0xc2/0x490 [48240.423496] [<ffffffff811806bb>] ? unlock_page+0x2b/0x30 [48240.424934] [<ffffffff8120ff4b>] do_filp_open+0x4b/0xb0 [48240.426375] [<ffffffff8121cbc7>] ? __alloc_fd+0xa7/0x130 [48240.427796] [<ffffffff811fd453>] do_sys_open+0xf3/0x1f0 [48240.429212] [<ffffffff81691fc5>] ? do_page_fault+0x35/0x90 [48240.430639] [<ffffffff811fd56e>] SyS_open+0x1e/0x20 [48240.432021] [<ffffffff816967c9>] system_call_fastpath+0x16/0x1b [48269.199829] 14[24902]: segfault at 8 ip 00007f9f028f93b8 sp 00007fffad27a740 error 4 in ld-2.17.so[7f9f028ee000+20000] [48360.468235] INFO: task dir_create.sh:26010 blocked for more than 120 seconds. [48360.470096] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [48360.471815] dir_create.sh D ffff880039df0140 0 26010 26005 0x00000080 [48360.473516] ffff88003dee7c40 0000000000000082 ffff88005a8c8000 ffff88003dee7fd8 [48360.475277] ffff88003dee7fd8 ffff88003dee7fd8 ffff88005a8c8000 ffff880039df0138 [48360.477018] ffff880039df013c ffff88005a8c8000 00000000ffffffff ffff880039df0140 [48360.478758] Call Trace: [48360.480057] [<ffffffff8168c989>] schedule_preempt_disabled+0x29/0x70 [48360.481667] [<ffffffff8168a5e5>] __mutex_lock_slowpath+0xc5/0x1c0 [48360.483248] [<ffffffff81689a4f>] mutex_lock+0x1f/0x2f [48360.484754] [<ffffffff8168349c>] lookup_slow+0x33/0xa7 [48360.486250] [<ffffffff8120bacf>] link_path_walk+0x80f/0x8b0 [48360.487777] [<ffffffff81099f63>] ? __sigqueue_free.part.11+0x33/0x40 [48360.489367] [<ffffffff8120ddb3>] path_openat+0xa3/0x490 [48360.490837] [<ffffffff8120ff4b>] do_filp_open+0x4b/0xb0 [48360.492305] [<ffffffff8121cbc7>] ? __alloc_fd+0xa7/0x130 [48360.493778] [<ffffffff811fd453>] do_sys_open+0xf3/0x1f0 [48360.495245] [<ffffffff811fd56e>] SyS_open+0x1e/0x20 [48360.496664] [<ffffffff816967c9>] system_call_fastpath+0x16/0x1b [48360.498201] INFO: task dir_create.sh:26028 blocked for more than 120 seconds. [48360.499819] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.