[LU-5872] sanity-quota test_6: dd not finished in 240 secs Created: 05/Nov/14  Updated: 23/Nov/21  Resolved: 13/Oct/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: WC Triage
Resolution: Cannot Reproduce
Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 16423

 Description   

This issue was created by maloo for John Hammond <john.hammond@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/27a469b2-6513-11e4-ab6d-5254006e85c2.

The sub-test test_6 failed with the following error:

dd not finished in 240 secs

Please provide additional information about the failure here.

Info required for matching: sanity-quota 6



 Comments   
Comment by Niu Yawei (Inactive) [ 06/Nov/14 ]

When approaching the hard limit, the client switches to sync writes (page by page), so the write becomes extremely slow. I checked the test log: there are no errors, and the dd was still in progress after 240 seconds.

It looks like 240 seconds isn't enough for a 2MB sync write (4KB per OST write) to finish on the current autotest system. Given that the test script checks whether the file is still growing every 30 seconds, to avoid waiting too long when dd is hung, I think we can safely bump the 240 to a larger value.
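
The reasoning above (an overall timeout combined with a 30-second growth check) can be sketched as follows. This is only an illustration in Python, not the actual sanity-quota script; the function name wait_for_writer and the is_running, file_size, and sleep callbacks are hypothetical.

```python
import time


def wait_for_writer(is_running, file_size, timeout, check_interval=30,
                    sleep=time.sleep):
    """Wait for a background writer (e.g. dd) to finish.

    Hypothetical helper, not taken from the Lustre test suite.
    Returns "done" if the writer exited, "hung" if the file stopped
    growing between two checks, or "timeout" if the writer is still
    making progress when the time budget runs out.
    """
    elapsed = 0
    last_size = file_size()
    while is_running():
        if elapsed >= timeout:
            return "timeout"
        sleep(check_interval)
        elapsed += check_interval
        size = file_size()
        if size == last_size:
            # No growth since the last check: either the writer just
            # finished, or it is stuck and waiting longer will not help.
            return "hung" if is_running() else "done"
        last_size = size
    return "done"
```

Because a stalled writer is reported after at most two check intervals, raising the timeout argument should only lengthen the wait for a dd that is slow but still progressing, which is the case seen in this failure.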

Comment by Andreas Dilger [ 06/Nov/14 ]

Even better would be to make the test run more quickly, if possible, rather than accepting that it will take more time to complete. 2MB in 4KB chunks is 512 sync writes. With a real disk this should only take about 5s to complete, so it is sad that it takes so much longer.
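
For reference, the arithmetic behind those numbers, as a small Python sketch. It assumes "2MB" and "4KB" are binary units (2 MiB and 4 KiB), which is what yields exactly 512 writes; the variable names are illustrative only.

```python
TOTAL_BYTES = 2 * 1024 * 1024   # 2MB written by dd (assumed to mean 2 MiB)
CHUNK_BYTES = 4 * 1024          # one 4KB page per sync write

num_writes = TOTAL_BYTES // CHUNK_BYTES

# Per-write latency implied by the "about 5s on a real disk" estimate.
expected_ms_per_write = 5.0 / num_writes * 1000

# Average per-write latency that would just fit in the 240s timeout.
budget_ms_per_write = 240.0 / num_writes * 1000

print(num_writes, round(expected_ms_per_write, 2), budget_ms_per_write)
```

So the test only passes if sync writes average under roughly 470 ms each, against an expected cost of about 10 ms each on a real disk.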

Comment by Niu Yawei (Inactive) [ 07/Nov/14 ]

It looks like the OSTs were quite busy handling statfs & ping requests, and each OST_WRITE took a very long time, for example:

00000100:00100000:1.0:1415181131.383051:0:583:0:(service.c:2116:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc ll_ost_io00_042:c453fba3-d5c5-1944-5f99-be5cc0651185+7:29501:x1483920018695096:12345-10.1.5.70@tcp:4 Request procesed in 1869583us (1869643us total) trans 90194313389 rc 0/0
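
To put that sample in context, here is a rough Python sketch that pulls the handling time out of the tail of the log line above and extrapolates it to all 512 writes. It assumes every write is as slow as this single sample and that the writes are fully serialized, so the result is only an order-of-magnitude estimate; the helper name parse_handled_us is hypothetical.

```python
import re

# Tail of the ptlrpc_server_handle_request() line quoted above.
LOG_TAIL = ("Request procesed in 1869583us (1869643us total) "
            "trans 90194313389 rc 0/0")

# "proces+ed" matches the spelling in this log as well as "processed".
PATTERN = re.compile(r"Request proces+ed in (\d+)us \((\d+)us total\)")


def parse_handled_us(line):
    """Return (handling_us, total_us) from a 'Handled RPC' log line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("no request timing found in line")
    return int(match.group(1)), int(match.group(2))


handled_us, total_us = parse_handled_us(LOG_TAIL)

NUM_WRITES = 512  # 2MB in 4KB sync writes
estimated_seconds = NUM_WRITES * handled_us / 1_000_000

print(f"{handled_us / 1e6:.2f}s per write, "
      f"about {estimated_seconds:.0f}s for {NUM_WRITES} writes")
```

Under those assumptions the full dd would need far longer than the 240-second limit, which is consistent with the test log showing dd still in progress rather than failing.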

Comment by Emoly Liu [ 10/Jul/15 ]

Another instance: https://testing.hpdd.intel.com/test_sets/f878df9a-2661-11e5-8b33-5254006e85c2
BTW, both test_1 and test_3 failed due to LU-5245 in this test.

Comment by Artem Blagodarenko (Inactive) [ 23/Nov/21 ]

+1 https://testing.whamcloud.com/test_sets/5a5c171b-264f-4d1d-a843-60df8e84db41

Generated at Sat Feb 10 01:55:15 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.