[LU-14920] Regular failures in runtests Created: 09/Aug/21  Updated: 08/Dec/21  Resolved: 08/Dec/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.15.0
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Oleg Drokin Assignee: Patrick Farrell
Resolution: Fixed Votes: 0
Labels: None

Attachments: PNG File image-2021-08-09-13-04-55-376.png    
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

Since about the beginning of July, I have been seeing regular runtests failures in my test rig that look like this:

1(can't cp /etc/hosts to /mnt/lustre/hosts.8162)  

I did some tracing, and it seems the issue started with LU-13798 (https://review.whamcloud.com/39436). First the patch itself failed a few times, and then, once it landed on master on June 30th, master started to fail like this. Here's the frequency:



 Comments   
Comment by Patrick Farrell [ 10/Aug/21 ]

Oleg,

Any updates on testing with this and trying to get more debug?

Comment by Gerrit Updater [ 11/Aug/21 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/44604
Subject: LU-14920 tests: Additional debug and testing
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: ea135bcae2fa28d98a5b95cfabef2ef508ab581a

Comment by Gerrit Updater [ 11/Aug/21 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/44605
Subject: LU-14920 tests: Additional debug and testing
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: cf7d0c8018a9f188eae212be0f3ab34b886ae5d1

Comment by Gerrit Updater [ 11/Aug/21 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/44606
Subject: LU-14920 tests: Additional debug and testing
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: c9b32309a2993025eef4679aa672059ee6517c71

Comment by Gerrit Updater [ 11/Aug/21 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/44607
Subject: LU-14920 tests: Additional debug and testing
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 35fa5a0909ea3360b5a36e7caaefa39ad8d2a1fc

Comment by Gerrit Updater [ 11/Aug/21 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/44608
Subject: LU-14920 tests: Additional debug and testing
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: ad215aef72d74dfbe4b696f3b2ccd948e665f243

Comment by Gerrit Updater [ 11/Aug/21 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/44609
Subject: LU-14920 tests: Additional debug and testing
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: d9f07c22eb899170c9640748af7b9fda92fb4b1c

Comment by Patrick Farrell [ 11/Aug/21 ]

Yes, this stream of patches is deliberate - it's the best way to get runtests run over and over in Oleg's testing.

Comment by Gerrit Updater [ 12/Aug/21 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/44645
Subject: LU-14920 llite: Reverts
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 8d78c9c85e3a02cf24ea3190a5ab1631ec71d5a5

Comment by Gerrit Updater [ 12/Aug/21 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/44646
Subject: LU-14920 llite: Disable parallel dio by default
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 6970da5afa1c9df86df60c975e0496933354c769

Comment by Gerrit Updater [ 12/Aug/21 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/44647
Subject: LU-14920 tests: Additional debug and testing
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 56ebd3201797ecf0f468892e7b45ba0d4702e225

Comment by Gerrit Updater [ 23/Aug/21 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/44734
Subject: LU-14920 llite: Disable parallel dio except aio
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 041a56034aa5420c044b39f104b1283662967097

Comment by Gerrit Updater [ 08/Sep/21 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/44868
Subject: LU-14920 llite: Additional debug
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: bf29b7419dcd0e4c7362115e747c15ec9668113d

Comment by Gerrit Updater [ 08/Sep/21 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/44869
Subject: LU-14920 llite: Additional debug
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 4106afc93bf6ad9c4e34d5e77aa162304091658c

Comment by Gerrit Updater [ 08/Sep/21 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/44870
Subject: LU-14920 llite: Further testing
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 8bd9b4982255b01526e53b14e8e6fdb073a73722

Comment by Gerrit Updater [ 09/Sep/21 ]

"Patrick Farrell <pfarrell@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/44877
Subject: LU-14920 llite: Remove flag but leave PDIO enabled
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 1fe9e8f4b8e67d23302d929fd6e3261ba26b8187

Comment by Peter Jones [ 08/Dec/21 ]

As per Oleg, not happening anymore
