[LU-3329] Failure on test suite sanity test_18: test failed to respond and timed out Created: 13/May/13 Updated: 17/Apr/17 Resolved: 17/Apr/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | Zhenyu Xu |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Environment: |
tag-2.3.65 |
||
| Severity: | 3 |
| Rank (Obsolete): | 8211 |
| Description |
|
This issue was created by maloo for sarah <sarah@whamcloud.com>. It relates to the following test suite run: http://maloo.whamcloud.com/test_sets/68dcb62c-ba7f-11e2-b1a3-52540035b04c. The sub-test test_18 failed with the following error:
Found some D processes on OST side:
kjournald D 0000000000000000 0 354 2 0x00000000
ffff880037bf3c30 0000000000000046 00000010ffffffff 00000d55bc99ffca
ffff88000002bb08 ffff88007969a9f0 00000000000877bc ffffffffae72b5ca
ffff880037b5faf8 ffff880037bf3fd8 000000000000fb88 ffff880037b5faf8
Call Trace:
[<ffffffff810a1ac9>] ? ktime_get_ts+0xa9/0xe0
[<ffffffff811b60c0>] ? sync_buffer+0x0/0x50
[<ffffffff8150e723>] io_schedule+0x73/0xc0
[<ffffffff811b6100>] sync_buffer+0x40/0x50
[<ffffffff8150f0df>] __wait_on_bit+0x5f/0x90
[<ffffffff811b60c0>] ? sync_buffer+0x0/0x50
[<ffffffff8150f188>] out_of_line_wait_on_bit+0x78/0x90
[<ffffffff81096ce0>] ? wake_bit_function+0x0/0x50
[<ffffffff811b60b6>] __wait_on_buffer+0x26/0x30
[<ffffffff811b70d1>] __sync_dirty_buffer+0x71/0xf0
[<ffffffffa00683c5>] journal_commit_transaction+0xe35/0x1310 [jbd]
[<ffffffff81080fcc>] ? lock_timer_base+0x3c/0x70
[<ffffffff81081a5b>] ? try_to_del_timer_sync+0x7b/0xe0
[<ffffffffa006d768>] kjournald+0xe8/0x250 [jbd]
[<ffffffff81096ca0>] ? autoremove_wake_function+0x0/0x40
[<ffffffffa006d680>] ? kjournald+0x0/0x250 [jbd]
[<ffffffff81096936>] kthread+0x96/0xa0
[<ffffffff8100c0ca>] child_rip+0xa/0x20
[<ffffffff810968a0>] ? kthread+0x0/0xa0
[<ffffffff8100c0c0>] ? child_rip+0x0/0x20
auditd D 0000000000000000 0 1104 1 0x00000000
ffff88007db6be08 0000000000000086 ffff88007db6bd88 ffff880037bf3e80
0000000000000000 ffff8800374208d0 0000000000000000 0000000000000000
ffff880037f6dab8 ffff88007db6bfd8 000000000000fb88 ffff880037f6dab8
Call Trace:
[<ffffffff81096f8e>] ? prepare_to_wait+0x4e/0x80
[<ffffffffa006d545>] log_wait_commit+0xc5/0x140 [jbd]
[<ffffffff81096ca0>] ? autoremove_wake_function+0x0/0x40
[<ffffffffa00855e6>] ext3_sync_file+0x126/0x1a0 [ext3]
[<ffffffff8111a618>] ? filemap_write_and_wait_range+0x78/0x90
[<ffffffff811b1b11>] vfs_fsync_range+0xa1/0xe0
[<ffffffff811b1bbd>] vfs_fsync+0x1d/0x20
[<ffffffff811b1bfe>] do_fsync+0x3e/0x60
[<ffffffff811b1c50>] sys_fsync+0x10/0x20
[<ffffffff8100b072>] system_call_fastpath+0x16/0x1b |
| Comments |
| Comment by Peter Jones [ 14/May/13 ] |
|
Bobijam, could you please look into this one? Thanks, Peter |
| Comment by Zhenyu Xu [ 15/May/13 ] |
|
The MDS syslog shows that, for some unknown reason, ntpd on the MDS set its clock forward by around 40 minutes.
|
| Comment by Sarah Liu [ 14/Jun/13 ] |
|
another instance: |
| Comment by Andreas Dilger [ 17/Apr/17 ] |
|
Close old issue. |