[LU-7386] Failover: recovery-random-scale test_fail_client_mds: test failed to respond and timed out Created: 04/Nov/15  Updated: 14/May/16  Resolved: 14/May/16

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Maloo Assignee: WC Triage
Resolution: Duplicate Votes: 0
Labels: None
Environment:

Server/Client: RHEL 6.7 - ZFS


Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

This issue was created by maloo for Saurabh Tandan <saurabh.tandan@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/5f264fc6-7d96-11e5-82ee-5254006e85c2.

The sub-test test_fail_client_mds failed with the following error:

test failed to respond and timed out

Client 3 dmesg shows the following:

tar           D 0000000000000000     0  6157   6155 0x00000080
 ffff8800532178b8 0000000000000086 0000000000000000 000006532680c895
 0000000200000000 ffff880078bc3130 0000026b113572b0 ffffffffa9d0f8ae
 000000001ca71ef2 000000010023fe39 ffff880078917068 ffff880053217fd8
Call Trace:
 [<ffffffff81127540>] ? sync_page+0x0/0x50
 [<ffffffff81538d33>] io_schedule+0x73/0xc0
 [<ffffffff8112757d>] sync_page+0x3d/0x50
 [<ffffffff815397ff>] __wait_on_bit+0x5f/0x90
 [<ffffffff811277b3>] wait_on_page_bit+0x73/0x80
 [<ffffffff810a14e0>] ? wake_bit_function+0x0/0x50
 [<ffffffffa09b0ee2>] vvp_page_assume+0x32/0xa0 [lustre]
 [<ffffffffa053c3e8>] cl_page_invoid+0x68/0x160 [obdclass]
 [<ffffffffa053efc6>] cl_page_assume+0x56/0x210 [obdclass]
 [<ffffffffa09a3fff>] ll_write_begin+0xff/0x750 [lustre]
 [<ffffffff81127ee3>] generic_file_buffered_write+0x123/0x2e0
 [<ffffffff8107e987>] ? current_fs_time+0x27/0x30
 [<ffffffff81129940>] __generic_file_aio_write+0x260/0x490
 [<ffffffff8153ace6>] ? down_read+0x16/0x30
 [<ffffffffa09154e3>] ? lov_object_maxbytes+0x33/0x40 [lov]
 [<ffffffffa09b3a0d>] vvp_io_write_start+0x15d/0x5c0 [lustre]
 [<ffffffffa0540f1b>] ? cl_lock_request+0x7b/0x200 [obdclass]
 [<ffffffffa0541eda>] cl_io_start+0x6a/0x140 [obdclass]
 [<ffffffffa05450e4>] cl_io_loop+0xb4/0x1b0 [obdclass]
 [<ffffffffa095c1a7>] ll_file_io_generic+0x317/0xab0 [lustre]
 [<ffffffffa095e68b>] ll_file_aio_write+0x20b/0x860 [lustre]
 [<ffffffffa095ee0b>] ll_file_write+0x12b/0x260 [lustre]
 [<ffffffff81191a98>] vfs_write+0xb8/0x1a0
 [<ffffffff81192f86>] ? fget_light_pos+0x16/0x50
 [<ffffffff811925d1>] sys_write+0x51/0xb0
 [<ffffffff810e884e>] ? __audit_syscall_exit+0x25e/0x290
 [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
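The tar process is stuck in uninterruptible sleep (state "D") inside the client buffered-write path: cl_page_assume() -> vvp_page_assume() ends up in wait_on_page_bit(), presumably waiting on a page that stays locked because the I/O needed to release it is stalled across the MDS failover. As a rough illustration only, here is a minimal sketch of the RHEL 6 (2.6.32-era) wait path that matches the top of the trace; this is reconstructed from that kernel generation, not taken from this ticket's logs:

void wait_on_page_bit(struct page *page, int bit_nr)
{
        DEFINE_WAIT_BIT(wait, &page->flags, bit_nr);

        /* Sleep until the bit (e.g. PG_locked) clears and the waiter is
         * woken; the task sits in TASK_UNINTERRUPTIBLE, i.e. state "D". */
        if (test_bit(bit_nr, &page->flags))
                __wait_on_bit(page_waitqueue(page), &wait, sync_page,
                              TASK_UNINTERRUPTIBLE);
}

/* sync_page() is the wait "action" callback: it kicks the backing device
 * and then calls io_schedule(), which is why io_schedule()/sync_page()
 * appear at the top of the stack above. If the page is never unlocked,
 * the task never leaves this wait and the test eventually times out. */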


 Comments   
Comment by Saurabh Tandan (Inactive) [ 20/Jan/16 ]

Another instance found for hardfailover: EL7 Server/Client
https://testing.hpdd.intel.com/test_sets/28e655fc-bc00-11e5-a592-5254006e85c2

Comment by Saurabh Tandan (Inactive) [ 20/Jan/16 ]

Another instance found for hardfailover: EL7 Server/SLES11 SP3 Client
build# 3303
https://testing.hpdd.intel.com/test_sets/8da9c094-bb2b-11e5-b3d5-5254006e85c2

Comment by Saurabh Tandan (Inactive) [ 09/Feb/16 ]

Another instance found for hardfailover: EL7 Server/Client, tag 2.7.66, master build 3314
https://testing.hpdd.intel.com/test_sessions/8d13249a-ca8f-11e5-9609-5254006e85c2

Comment by Sarah Liu [ 14/May/16 ]

dup of LU-4621
