[LU-6785] Interop 2.7.0<->master sanity test_56w: cannot swap layouts: Device or resource busy
Created: 01/Jul/15  Updated: 28/Oct/20  Resolved: 10/Sep/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: Lustre 2.8.0

Type: Bug
Priority: Critical
Reporter: Maloo
Assignee: Henri Doreau (Inactive)
Resolution: Fixed
Votes: 0
Labels: None
Environment:

server: 2.7.0
client: lustre-master build # 3071 EL7


Issue Links:
Related
is related to LU-4840 Deadlock when truncating file during... Resolved
is related to LU-6475 race between open and migration Resolved
is related to LU-14084 change 'lfs migrate' to use 'MIGRATIO... Open
Severity: 3

Description

This issue was created by maloo for sarah_lw <wei3.liu@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/a260a392-1237-11e5-bd2d-5254006e85c2.

The sub-test test_56w failed with the following error:

/usr/bin/lfs migrate -i 0 /mnt/lustre/d56w.sanityw/migr_1_ost failed
yes: standard output: Broken pipe
yes: write error
/usr/bin/lfs_migrate -y -c 6 /mnt/lustre/d56w.sanityw/file1
/mnt/lustre/d56w.sanityw/file1: /usr/bin/lfs: /mnt/lustre/d56w.sanityw/file1: cannot swap layouts: Device or resource busy
cannot put lease: No locks available (37)
error: migrate: migrate stripe file '/mnt/lustre/d56w.sanityw/file1' failed
falling back to rsync-based migration
done


Comments
Comment by Andreas Dilger [ 03/Sep/15 ]

It seems that this interop regression was introduced by http://review.whamcloud.com/10013 "LU-4840 lfs: Use file lease to implement migration", which landed on 2015-05-28. That is after the 2.7.54 tag was made (2015-05-17), so the change first appeared in the 2.7.55 tag (2015-06-10).

I'd guess that the new client is always trying to use the file lease lock, but that doesn't exist in 2.7.0 and earlier (interop failures are seen with 2.5.x also). Is there some mechanism by which the client can determine if the MDS has the file lease capability, and fall back to the previous group lock mechanism?

Comment by Henri Doreau (Inactive) [ 04/Sep/15 ]

The file_lease_supported variable in lfs was supposed to address this. It is set automatically on failed attempts to get a file lease, and it determines whether to fall back to the group lock or not. Obviously something is fishy here, since I can reproduce the failure easily. I'm having a look.
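
For readers unfamiliar with the pattern, the intended fallback can be sketched roughly as below. This is only an illustration in Python, not the lfs code (which is C): the MigrationLocker, OldServer and NewServer classes and their get_lease()/get_group_lock() methods are hypothetical stand-ins, and the use of EOPNOTSUPP as the "lease not supported" signal is an assumption.

```python
import errno


class MigrationLocker:
    """Illustrative stand-in for the client-side lock choice (hypothetical)."""

    def __init__(self):
        # Assume leases work until a server tells us otherwise.
        self.file_lease_supported = True

    def acquire(self, server):
        """Take a lease if possible, otherwise fall back to a group lock."""
        if self.file_lease_supported:
            try:
                server.get_lease()
                return "lease"
            except OSError as exc:
                # Assumption: only "operation not supported" means the server
                # predates file leases; any other error is a genuine failure.
                if exc.errno != errno.EOPNOTSUPP:
                    raise
                # Remember the result so later calls skip the lease attempt.
                self.file_lease_supported = False
        server.get_group_lock()
        return "group lock"


class OldServer:
    """Hypothetical server without lease support."""

    def __init__(self):
        self.lease_attempts = 0

    def get_lease(self):
        self.lease_attempts += 1
        raise OSError(errno.EOPNOTSUPP, "file lease not supported")

    def get_group_lock(self):
        return None


class NewServer:
    """Hypothetical server with lease support."""

    def get_lease(self):
        return None

    def get_group_lock(self):
        return None
```

In this sketch, the first acquire() against an OldServer fails the lease attempt, clears the flag and takes the group lock; later calls go straight to the group lock without retrying the lease.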

Comment by Gerrit Updater [ 04/Sep/15 ]

Henri Doreau (henri.doreau@cea.fr) uploaded a new patch: http://review.whamcloud.com/16238
Subject: LU-6785 utils: compatibility fix for lfs migrate
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 0e4820031647405d939ccf43be8bc76426310b9f

Comment by Gerrit Updater [ 10/Sep/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/16238/
Subject: LU-6785 utils: compatibility fix for lfs migrate
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: a99e42c9fa47677cd2468abfa9378d776cc40803

Comment by Joseph Gmitter (Inactive) [ 10/Sep/15 ]

Landed for 2.8.0
