[LU-17082] sanity test_235: process hangs on a deadlock Created: 04/Sep/23  Updated: 21/Oct/23  Resolved: 21/Oct/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: Lai Siyao
Resolution: Cannot Reproduce
Votes: 0
Labels: None

Severity: 3

 Description   

This issue was created by maloo for eaujames <eaujames@ddn.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/19ecce8c-0af7-4b28-8e33-1f74507cc0fe

test_235 failed with the following error:

process hangs on a deadlock

Test session details:
clients: https://build.whamcloud.com/job/lustre-reviews/97550 - 4.18.0-477.15.1.el8_8.x86_64
servers: https://build.whamcloud.com/job/lustre-reviews/97550 - 4.18.0-477.15.1.el8_lustre.x86_64

== sanity test 235: LU-1715: flock deadlock detection does not work properly ========================================================== 19:23:37 (1693596217)
953461: taking lock1 [100, 200]
953461: done
953461 sleeping 2
953461: putting lock1 [100, 200]
953461: done
953461 Exit
lock timeout
953460: taking lock0 [0, 100]
953460: done
953460 sleeping 1
953460: taking lock3 [100, 300]
953459: sleeping 1
953459: taking lock2 [200, 300]
953459: done
953459: taking lock0 [0, 100]
953459: done
953459: putting lock0 [0, 100]
953459: done
953459: putting lock2 [200, 300]
953459: done
953459 Exit
 sanity test_235: @@@@@@ FAIL: process hangs on a deadlock 

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
sanity test_235 - process hangs on a deadlock



 Comments   
Comment by Etienne Aujames [ 04/Sep/23 ]

The first occurrence was on 2023-08-31:
https://testing.whamcloud.com/sub_tests/d4e8c069-e5be-4e6a-814b-192a924c4f48

This may be a regression due to https://review.whamcloud.com/46733 ("LU-15526 mdt: enable remote PDO lock"):

commit 7270e16fcbe52ad89634b2e1e033e983248d0566
...
Commit:     Oleg Drokin <green@whamcloud.com>
CommitDate: Thu Aug 31 06:23:22 2023 +0000

    LU-15526 mdt: enable remote PDO lock

Comment by Peter Jones [ 04/Sep/23 ]

Lai

Does this seem related to the LU-15526 change?

Peter

Comment by Lai Siyao [ 05/Sep/23 ]

This is unlikely to be caused by https://review.whamcloud.com/46733, because sanity test_235 doesn't create any directories, while that patch is about locking of remote directories.

Comment by Lai Siyao [ 11/Oct/23 ]

This test has not failed since Sep 15, so the issue was probably fixed by a commit that landed around that date.
