Details
-
Bug
-
Resolution: Won't Fix
-
Minor
-
None
-
Lustre 2.11.0
-
onyx, full DNE
servers: el7.4, ldiskfs, branch master, v2.10.56, b3678
clients: el7.4, branch master, v2.10.56, b3678
-
3
-
9223372036854775807
Description
session: https://testing.hpdd.intel.com/test_sessions/45ec3e40-419a-47db-95d1-7dbe1c6a0b66
test set: https://testing.hpdd.intel.com/test_sets/bd99b32a-e0a6-11e7-9c63-52540065bddc
After parallel-scale-nfsv3 times out there are 10 stack traces, and the top frames of the dd and ln traces are identical:
From console log:
[17022.391464] nfs: server onyx-30vm4 not responding, still trying
[17040.277998] INFO: task dd:9525 blocked for more than 120 seconds.
[17040.278758] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[17040.279567] dd D ffffffff816a76b0 0 9525 3854 0x00000080
[17040.280435] ffff88006b797bd0 0000000000000082 ffff88004ef3bf40 ffff88006b797fd8
[17040.281271] ffff88006b797fd8 ffff88006b797fd8 ffff88004ef3bf40 ffff88007fd16cc0
[17040.282108] 0000000000000000 7fffffffffffffff ffff88007ff682e8 ffffffff816a76b0
[17040.282991] Call Trace:
[17040.283302] [<ffffffff816a76b0>] ? bit_wait+0x50/0x50
[17040.283823] [<ffffffff816a9589>] schedule+0x29/0x70
[17040.284335] [<ffffffff816a7099>] schedule_timeout+0x239/0x2c0
[17040.285022] [<ffffffff81062efe>] ? kvm_clock_get_cycles+0x1e/0x20
[17040.285681] [<ffffffff810e93ac>] ? ktime_get_ts64+0x4c/0xf0
[17040.286276] [<ffffffff81062efe>] ? kvm_clock_get_cycles+0x1e/0x20
[17040.286954] [<ffffffff810e93ac>] ? ktime_get_ts64+0x4c/0xf0
[17040.287534] [<ffffffff816a76b0>] ? bit_wait+0x50/0x50
[17040.288113] [<ffffffff816a8c0d>] io_schedule_timeout+0xad/0x130
[17040.288743] [<ffffffff816a8ca8>] io_schedule+0x18/0x20
[17040.289291] [<ffffffff816a76c1>] bit_wait_io+0x11/0x50
[17040.289862] [<ffffffff816a71e5>] __wait_on_bit+0x65/0x90
[17040.290411] [<ffffffff81181cc1>] wait_on_page_bit+0x81/0xa0
[17040.291041] [<ffffffff810b19e0>] ? wake_bit_function+0x40/0x40
[17040.291656] [<ffffffff81181df1>] __filemap_fdatawait_range+0x111/0x190
[17040.292335] [<ffffffff81181e84>] filemap_fdatawait_range+0x14/0x30
[17040.293040] [<ffffffff81183dc6>] filemap_write_and_wait_range+0x56/0x90
[17040.293751] [<ffffffffc04e3516>] nfs_file_fsync+0x86/0x110 [nfs]
[17040.294387] [<ffffffff812333cb>] vfs_fsync+0x2b/0x40
[17040.294982] [<ffffffffc04e3956>] nfs_file_flush+0x46/0x60 [nfs]
[17040.295583] [<ffffffff811fe294>] filp_close+0x34/0x80
[17040.296148] [<ffffffff81220388>] __close_fd+0x78/0xa0
[17040.296709] [<ffffffff811ffd03>] SyS_close+0x23/0x50
[17040.297241] [<ffffffff816b5089>] system_call_fastpath+0x16/0x1b
and
[17040.339627] ln D ffffffff816a76b0 0 9837 3840 0x00000080
[17040.340461] ffff880046a7bb20 0000000000000086 ffff88007b6eeeb0 ffff880046a7bfd8
[17040.341300] ffff880046a7bfd8 ffff880046a7bfd8 ffff88007b6eeeb0 ffff88007fd16cc0
[17040.342135] 0000000000000000 7fffffffffffffff ffff88007ff682e8 ffffffff816a76b0
[17040.343018] Call Trace:
[17040.343304] [<ffffffff816a76b0>] ? bit_wait+0x50/0x50
[17040.343836] [<ffffffff816a9589>] schedule+0x29/0x70
[17040.344344] [<ffffffff816a7099>] schedule_timeout+0x239/0x2c0
[17040.345048] [<ffffffff81062efe>] ? kvm_clock_get_cycles+0x1e/0x20
[17040.345696] [<ffffffff810e93ac>] ? ktime_get_ts64+0x4c/0xf0
[17040.346285] [<ffffffff81062efe>] ? kvm_clock_get_cycles+0x1e/0x20
[17040.346973] [<ffffffff810e93ac>] ? ktime_get_ts64+0x4c/0xf0
[17040.347563] [<ffffffff816a76b0>] ? bit_wait+0x50/0x50
[17040.348104] [<ffffffff816a8c0d>] io_schedule_timeout+0xad/0x130
[17040.348751] [<ffffffff816a8ca8>] io_schedule+0x18/0x20
[17040.349299] [<ffffffff816a76c1>] bit_wait_io+0x11/0x50
[17040.349880] [<ffffffff816a71e5>] __wait_on_bit+0x65/0x90
[17040.350431] [<ffffffff81181cc1>] wait_on_page_bit+0x81/0xa0
[17040.351069] [<ffffffff810b19e0>] ? wake_bit_function+0x40/0x40
[17040.351685] [<ffffffff81181df1>] __filemap_fdatawait_range+0x111/0x190
[17040.352376] [<ffffffff81181e84>] filemap_fdatawait_range+0x14/0x30
[17040.353073] [<ffffffff81181ec7>] filemap_fdatawait+0x27/0x30
[17040.353681] [<ffffffff81183cfc>] filemap_write_and_wait+0x4c/0x80
[17040.354327] [<ffffffffc04f4910>] nfs_wb_all+0x20/0x100 [nfs]
[17040.354982] [<ffffffffc04e7b7b>] nfs_getattr+0x1bb/0x250 [nfs]
[17040.355571] [<ffffffff812062c6>] vfs_getattr+0x46/0x80
[17040.356112] [<ffffffff812063f5>] vfs_fstatat+0x75/0xc0
[17040.356705] [<ffffffff8120694e>] SYSC_newstat+0x2e/0x60
[17040.357262] [<ffffffff816b0456>] ? trace_do_page_fault+0x56/0x150
[17040.357940] [<ffffffff816afaea>] ? do_async_page_fault+0x1a/0xd0
[17040.358556] [<ffffffff816ac5f8>] ? async_page_fault+0x28/0x30
[17040.359218] [<ffffffff81206c2e>] SyS_newstat+0xe/0x10
[17040.359763] [<ffffffff816b5089>] system_call_fastpath+0x16/0x1b
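The observation that the dd and ln traces share their top frames can be checked mechanically. The following is a minimal sketch, not part of the original report: the helper names `frames` and `common_top` are hypothetical, the regular expression assumes the `[<address>] symbol+0xoff/0xlen` frame format seen in the console log, and the two inputs are abbreviated excerpts of the traces rather than the full log.

```python
import re
from itertools import takewhile

# Matches one frame such as "[<ffffffff816a9589>] schedule+0x29/0x70" or
# "[<ffffffff816a76b0>] ? bit_wait+0x50/0x50" and captures only the symbol name.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

def frames(trace):
    """Return the symbol names found in a call trace, top frame first."""
    return FRAME_RE.findall(trace)

def common_top(a, b):
    """Return the leading frames that two traces have in common."""
    return [x for x, _ in takewhile(lambda pair: pair[0] == pair[1], zip(a, b))]

# Abbreviated excerpts of the dd and ln traces (hypothetical variable names).
dd_trace = """
[<ffffffff816a71e5>] __wait_on_bit+0x65/0x90
[<ffffffff81181cc1>] wait_on_page_bit+0x81/0xa0
[<ffffffff810b19e0>] ? wake_bit_function+0x40/0x40
[<ffffffff81181df1>] __filemap_fdatawait_range+0x111/0x190
[<ffffffff81181e84>] filemap_fdatawait_range+0x14/0x30
[<ffffffff81183dc6>] filemap_write_and_wait_range+0x56/0x90
[<ffffffffc04e3516>] nfs_file_fsync+0x86/0x110 [nfs]
"""
ln_trace = """
[<ffffffff816a71e5>] __wait_on_bit+0x65/0x90
[<ffffffff81181cc1>] wait_on_page_bit+0x81/0xa0
[<ffffffff810b19e0>] ? wake_bit_function+0x40/0x40
[<ffffffff81181df1>] __filemap_fdatawait_range+0x111/0x190
[<ffffffff81181e84>] filemap_fdatawait_range+0x14/0x30
[<ffffffff81181ec7>] filemap_fdatawait+0x27/0x30
[<ffffffff81183cfc>] filemap_write_and_wait+0x4c/0x80
[<ffffffffc04f4910>] nfs_wb_all+0x20/0x100 [nfs]
"""

shared = common_top(frames(dd_trace), frames(ln_trace))
print(shared)
```

On these excerpts the two traces agree through `filemap_fdatawait_range` and then diverge: dd continues into `nfs_file_fsync` on close, while ln reaches `nfs_wb_all` through `nfs_getattr`. Both are waiting on NFS page writeback.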