[LU-3923] Interop 2.3.0<->2.5 failure on test suite sanity-quota test_18c Created: 11/Sep/13 Updated: 22/Dec/17 Resolved: 22/Dec/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.5.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | | |
| Environment: | server: 2.3.0 |
| Issue Links: | |
| Severity: | 3 |
| Rank (Obsolete): | 10368 |
| Description |
|
This issue was created by maloo for sarah <sarah@whamcloud.com>. This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/66dda284-19fa-11e3-8fec-52540035b04c. The sub-test test_18c failed with the following error:
client console:
14:01:38:Lustre: 6763:0:(client.c:1896:ptlrpc_expire_one_request()) Skipped 3 previous similar messages
14:01:38:INFO: task tee:11217 blocked for more than 120 seconds.
14:01:39:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
14:01:39:tee D 0000000000000000 0 11217 8082 0x00000080
14:01:40: ffff88003e9719c8 0000000000000082 00000000ffffffff 00005a151837ff05
14:01:41: ffff880063084080 ffff880037e04210 000000000174696e ffffffffadd267a6
14:01:44: ffff880063084638 ffff88003e971fd8 000000000000fb88 ffff880063084638
14:01:44:Call Trace:
14:01:45: [<ffffffff810a2431>] ? ktime_get_ts+0xb1/0xf0
14:01:46: [<ffffffffa0369440>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
14:01:47: [<ffffffff8150e8c3>] io_schedule+0x73/0xc0
14:01:47: [<ffffffffa036944e>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
14:01:48: [<ffffffff8150f27f>] __wait_on_bit+0x5f/0x90
14:01:49: [<ffffffffa0369440>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
14:01:49: [<ffffffff8150f328>] out_of_line_wait_on_bit+0x78/0x90
14:01:50: [<ffffffff81096de0>] ? wake_bit_function+0x0/0x50
14:01:51: [<ffffffff81119c1e>] ? find_get_page+0x1e/0xa0
14:01:51: [<ffffffffa036942f>] nfs_wait_on_request+0x2f/0x40 [nfs]
14:01:52: [<ffffffffa036fe2a>] nfs_updatepage+0x20a/0x4e0 [nfs]
14:01:53: [<ffffffffa035e552>] nfs_write_end+0x152/0x2b0 [nfs]
14:01:53: [<ffffffff81119582>] ? iov_iter_copy_from_user_atomic+0x92/0x130
14:01:54: [<ffffffff8111a81a>] generic_file_buffered_write+0x18a/0x2e0
14:01:54: [<ffffffff8106327e>] ? try_to_wake_up+0x24e/0x3e0
14:01:55: [<ffffffff8111c210>] __generic_file_aio_write+0x260/0x490
14:01:55: [<ffffffff8111c4c8>] generic_file_aio_write+0x88/0x100
14:01:56: [<ffffffffa035df8e>] nfs_file_write+0xde/0x1f0 [nfs]
14:01:56: [<ffffffff8118106a>] do_sync_write+0xfa/0x140
14:01:57: [<ffffffff81096da0>] ? autoremove_wake_function+0x0/0x40
14:02:00: [<ffffffff8100bb8e>] ? apic_timer_interrupt+0xe/0x20
14:02:00: [<ffffffff8121bed6>] ? security_file_permission+0x16/0x20
14:02:01: [<ffffffff81181368>] vfs_write+0xb8/0x1a0
14:02:02: [<ffffffff8118266b>] ? fget_light+0x3b/0x90
14:02:02: [<ffffffff81181c61>] sys_write+0x51/0x90
14:02:02: [<ffffffff810dc685>] ? __audit_syscall_exit+0x265/0x290
14:02:03: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
14:02:04:INFO: task tee:20761 blocked for more than 120 seconds.
14:02:04:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
14:02:04:tee D 0000000000000000 0 20761 11534 0x00000080
14:02:04: ffff880063a8b9c8 0000000000000086 00000000ffffffff 00005a151837ff05
14:02:04: ffff88007cfa4aa0 ffff88004a0dea60 000000000174696c ffffffffadd267a6
14:02:06: ffff88007cfa5058 ffff880063a8bfd8 000000000000fb88 ffff88007cfa5058 |
| Comments |
| Comment by Oleg Drokin [ 13/Sep/13 ] |
|
The stack trace does not look related to Lustre, but rather to NFS. |
| Comment by Niu Yawei (Inactive) [ 16/Sep/13 ] |
directio on /mnt/lustre/d0.sanity-quota/d18/f.sanity-quota.18c for 100x1048576 bytes PASS
pdsh@client-32vm2: gethostbyname("client-32vm1") failed
sanity-quota test_18c: @@@@@@ FAIL: post-failover df: 1

I didn't find anything abnormal in the dmesg. Is there something wrong with the test environment? It looks like pdsh failed to resolve the hostname of client-32vm1 (which is the other client). Can this bug be reproduced? |
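Since the only visible failure is the pdsh gethostbyname("client-32vm1") error, one way to tell whether this is a test-environment problem is to repeat the same lookup by hand on client-32vm2, outside of pdsh. A minimal sketch in C, assuming only the hostname taken from the log above (the program itself is illustrative, not part of the test suite):

/* Minimal sketch: repeat the gethostbyname() lookup that pdsh reported as
 * failing, outside of the test harness. The hostname comes from the log
 * above; nothing here is Lustre-specific. */
#include <stdio.h>
#include <netdb.h>
#include <arpa/inet.h>

int main(void)
{
        const char *host = "client-32vm1";
        struct hostent *he = gethostbyname(host);

        if (he == NULL) {
                herror(host);   /* prints the resolver error, e.g. "Unknown host" */
                return 1;
        }
        printf("%s resolves to %s\n", host,
               inet_ntoa(*(struct in_addr *)he->h_addr_list[0]));
        return 0;
}

If this lookup also fails outside of pdsh, the problem is name resolution on the test node rather than anything in Lustre.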
| Comment by Andreas Dilger [ 22/Dec/17 ] |
|
Close old bug that has not been seen in a long time. |