[LU-12848] Add test case for LU-11549 Created: 11/Oct/19 Updated: 25/Aug/21 Resolved: 25/Aug/21 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.15.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Andreas Dilger | Assignee: | Alexander Zarochentsev |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
The test case in patch https://review.whamcloud.com/35991 " |
| Comments |
| Comment by Alexander Zarochentsev [ 20/Nov/19 ] |
|
The test reveals problems with the ZFS backend (it is test #104 in our branch and test #105 in the patch for master):

== sanityn test 104: rename to an open file and link race should not cause fs corruption ============= 15:06:58 (1574089618)
fail_loc=0x8000018a
/usr/lib64/lustre/tests/sanityn.sh: line 4633: 17769 Terminated $MULTIOP $DIR2/$tdir/mdt0dir/foodir/file2 Ow4096_c
rm: cannot remove '/mnt/lustre/d104.sanityn/mdt1dir/file2x': No such file or directory
sanityn test_104: @@@@@@ FAIL: Removing test dir failed
Trace dump:
= /usr/lib64/lustre/tests/../tests/test-framework.sh:5988:error()
= /usr/lib64/lustre/tests/sanityn.sh:4634:test_104()
= /usr/lib64/lustre/tests/../tests/test-framework.sh:6272:run_one()
= /usr/lib64/lustre/tests/../tests/test-framework.sh:6311:run_one_logged()
= /usr/lib64/lustre/tests/../tests/test-framework.sh:6107:run_test()
= /usr/lib64/lustre/tests/sanityn.sh:4636:main()
Dumping lctl log to /tmp/test_logs/1574089606/sanityn.test_104.*.1574089621.log
Resetting fail_loc on all nodes...done.
FAIL 104 (5s)
sanityn: FAIL: test_104 Removing test dir failed
Dumping lctl log to /tmp/test_logs/1574089606/sanityn..*.1574089624.log
Resetting fail_loc on all nodes...done.

The same failure is seen in Oleg's testing, http://testing.linuxhacker.ru:3333/lustre-reports/4579/results.html :

== sanityn test 105: A racy rename/link an open file should not cause fs corruption ================== 13:15:42 (1574273742)
fail_loc=0x8000018a
/home/green/git/lustre-release/lustre/tests/sanityn.sh: line 4905: 10460 Terminated $MULTIOP $DIR2/$tdir/mdt0dir/foodir/file2 Ow4096_c
rm: cannot remove '/mnt/lustre/d105.sanityn/mdt1dir/file2x': No such file or directory
sanityn test_105: @@@@@@ FAIL: Removing test dir failed
Trace dump:
= /home/green/git/lustre-release/lustre/tests/test-framework.sh:6108:error()
= /home/green/git/lustre-release/lustre/tests/sanityn.sh:4906:test_105()
= /home/green/git/lustre-release/lustre/tests/test-framework.sh:6410:run_one()
= /home/green/git/lustre-release/lustre/tests/test-framework.sh:6449:run_one_logged()
= /home/green/git/lustre-release/lustre/tests/test-framework.sh:6280:run_test()
= /home/green/git/lustre-release/lustre/tests/sanityn.sh:4908:main()
Dumping lctl log to /tmp/testlogs//sanityn.test_105.*.1574273747.log
oleg256-server: Warning: Permanently added 'oleg256-client.virtnet' (ECDSA) to the list of known hosts.
oleg256-server: rsync: chown "/tmp/testlogs/.sanityn.test_105.debug_log.oleg256-server.1574273747.log.Gd6d36" failed: Operation not permitted (1)
oleg256-server: rsync: chown "/tmp/testlogs/.sanityn.test_105.dmesg.oleg256-server.1574273747.log.uwPdAz" failed: Operation not permitted (1)
oleg256-server: rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1178) [sender=3.1.2]
pdsh@oleg256-client: oleg256-server: ssh exited with exit code 23
Resetting fail_loc on all nodes...done.
FAIL 105 (6s)
cleanup: ======================================================
== sanityn test complete, duration 21 sec ============================================================ 13:15:51 (1574273751)
sanityn: FAIL: test_105 Removing test dir failed
rm: cannot remove '/mnt/lustre/d105.sanityn/mdt1dir': Directory not empty
sanityn test_105: @@@@@@ FAIL: remove sub-test dirs failed
Trace dump:
= /home/green/git/lustre-release/lustre/tests/test-framework.sh:6108:error()
= /home/green/git/lustre-release/lustre/tests/test-framework.sh:5593:check_and_cleanup_lustre()
= /home/green/git/lustre-release/lustre/tests/sanityn.sh:4920:main()
Dumping lctl log to /tmp/testlogs//sanityn.test_105.*.1574273752.log
oleg256-server: rsync: chown "/tmp/testlogs/.sanityn.test_105.debug_log.oleg256-server.1574273752.log.65aqHO" failed: Operation not permitted (1)
oleg256-server: rsync: chown "/tmp/testlogs/.sanityn.test_105.dmesg.oleg256-server.1574273752.log.rhRRav" failed: Operation not permitted (1)
oleg256-server: rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1178) [sender=3.1.2]
pdsh@oleg256-client: oleg256-server: ssh exited with exit code 23

I believe this indicates filesystem corruption, but ZFS has no tool to check for it. |
| Comment by Alexander Zarochentsev [ 20/Nov/19 ] |
|
The sanityn test 105 that illustrates the problem: https://review.whamcloud.com/#/c/35991/ |
| Comment by Gerrit Updater [ 25/Aug/21 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/35991/ |
| Comment by Peter Jones [ 25/Aug/21 ] |
|
Landed for 2.15 |