[LU-13346] Fix link and rename race on zfs osd Created: 09/Mar/20  Updated: 13/Aug/21  Resolved: 13/Aug/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Shaun Tancheff
Assignee: Shaun Tancheff
Resolution: Duplicate
Votes: 0
Labels: None

Issue Links:
Related
is related to LU-11549: Unattached inodes after 3 min racer run (Resolved)
is related to LU-12848: Add test case for LU-11549 (Resolved)
Severity: 3

 Description   

The test case added in LU-12848 (https://jira.whamcloud.com/browse/LU-12848) uncovered an issue on ZFS with sanityn test_105:

== sanityn test 105: A racy rename/link an open file should not cause fs corruption ================== 13:15:42 (1574273742)
fail_loc=0x8000018a
/home/green/git/lustre-release/lustre/tests/sanityn.sh: line 4905: 10460 Terminated              $MULTIOP $DIR2/$tdir/mdt0dir/foodir/file2 Ow4096_c
rm: cannot remove '/mnt/lustre/d105.sanityn/mdt1dir/file2x': No such file or directory
 sanityn test_105: @@@@@@ FAIL: Removing test dir failed 
  Trace dump:
  = /home/green/git/lustre-release/lustre/tests/test-framework.sh:6108:error()
  = /home/green/git/lustre-release/lustre/tests/sanityn.sh:4906:test_105()
  = /home/green/git/lustre-release/lustre/tests/test-framework.sh:6410:run_one()
  = /home/green/git/lustre-release/lustre/tests/test-framework.sh:6449:run_one_logged()
  = /home/green/git/lustre-release/lustre/tests/test-framework.sh:6280:run_test()
  = /home/green/git/lustre-release/lustre/tests/sanityn.sh:4908:main()
Dumping lctl log to /tmp/testlogs//sanityn.test_105.*.1574273747.log
oleg256-server: Warning: Permanently added 'oleg256-client.virtnet' (ECDSA) to the list of known hosts.
oleg256-server: rsync: chown "/tmp/testlogs/.sanityn.test_105.debug_log.oleg256-server.1574273747.log.Gd6d36" failed: Operation not permitted (1)
oleg256-server: rsync: chown "/tmp/testlogs/.sanityn.test_105.dmesg.oleg256-server.1574273747.log.uwPdAz" failed: Operation not permitted (1)
oleg256-server: rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1178) [sender=3.1.2]
pdsh@oleg256-client: oleg256-server: ssh exited with exit code 23
Resetting fail_loc on all nodes...done.
FAIL 105 (6s)
cleanup: ======================================================
== sanityn test complete, duration 21 sec ============================================================ 13:15:51 (1574273751)
sanityn: FAIL: test_105 Removing test dir failed
rm: cannot remove '/mnt/lustre/d105.sanityn/mdt1dir': Directory not empty
 sanityn test_105: @@@@@@ FAIL: remove sub-test dirs failed 
  Trace dump:
  = /home/green/git/lustre-release/lustre/tests/test-framework.sh:6108:error()
  = /home/green/git/lustre-release/lustre/tests/test-framework.sh:5593:check_and_cleanup_lustre()
  = /home/green/git/lustre-release/lustre/tests/sanityn.sh:4920:main()
Dumping lctl log to /tmp/testlogs//sanityn.test_105.*.1574273752.log
oleg256-server: rsync: chown "/tmp/testlogs/.sanityn.test_105.debug_log.oleg256-server.1574273752.log.65aqHO" failed: Operation not permitted (1)
oleg256-server: rsync: chown "/tmp/testlogs/.sanityn.test_105.dmesg.oleg256-server.1574273752.log.rhRRav" failed: Operation not permitted (1)
oleg256-server: rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1178) [sender=3.1.2]
pdsh@oleg256-client: oleg256-server: ssh exited with exit code 23


 Comments   
Comment by Gerrit Updater [ 09/Mar/20 ]

Shaun Tancheff (shaun.tancheff@hpe.com) uploaded a new patch: https://review.whamcloud.com/37836
Subject: LU-13346 osd-zfs: set attr stores LUSTRE_LMA_FL_MASKS
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: ad57c033738be3b9bafcf1f0359fbeee12e2bdb1
