Lustre / LU-13346

Fix link and rename race on zfs odb

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Minor

    Description

      LU-12848 (https://jira.whamcloud.com/browse/LU-12848) uncovered an issue on ZFS with sanityn test 105. Cleanup of the test directory fails:

      == sanityn test 105: A racy rename/link an open file should not cause fs corruption ================== 13:15:42 (1574273742)
      fail_loc=0x8000018a
      /home/green/git/lustre-release/lustre/tests/sanityn.sh: line 4905: 10460 Terminated              $MULTIOP $DIR2/$tdir/mdt0dir/foodir/file2 Ow4096_c
      rm: cannot remove '/mnt/lustre/d105.sanityn/mdt1dir/file2x': No such file or directory
       sanityn test_105: @@@@@@ FAIL: Removing test dir failed 
        Trace dump:
        = /home/green/git/lustre-release/lustre/tests/test-framework.sh:6108:error()
        = /home/green/git/lustre-release/lustre/tests/sanityn.sh:4906:test_105()
        = /home/green/git/lustre-release/lustre/tests/test-framework.sh:6410:run_one()
        = /home/green/git/lustre-release/lustre/tests/test-framework.sh:6449:run_one_logged()
        = /home/green/git/lustre-release/lustre/tests/test-framework.sh:6280:run_test()
        = /home/green/git/lustre-release/lustre/tests/sanityn.sh:4908:main()
      Dumping lctl log to /tmp/testlogs//sanityn.test_105.*.1574273747.log
      oleg256-server: Warning: Permanently added 'oleg256-client.virtnet' (ECDSA) to the list of known hosts.
      oleg256-server: rsync: chown "/tmp/testlogs/.sanityn.test_105.debug_log.oleg256-server.1574273747.log.Gd6d36" failed: Operation not permitted (1)
      oleg256-server: rsync: chown "/tmp/testlogs/.sanityn.test_105.dmesg.oleg256-server.1574273747.log.uwPdAz" failed: Operation not permitted (1)
      oleg256-server: rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1178) [sender=3.1.2]
      pdsh@oleg256-client: oleg256-server: ssh exited with exit code 23
      Resetting fail_loc on all nodes...done.
      FAIL 105 (6s)
      cleanup: ======================================================
      == sanityn test complete, duration 21 sec ============================================================ 13:15:51 (1574273751)
      sanityn: FAIL: test_105 Removing test dir failed
      rm: cannot remove '/mnt/lustre/d105.sanityn/mdt1dir': Directory not empty
       sanityn test_105: @@@@@@ FAIL: remove sub-test dirs failed 
        Trace dump:
        = /home/green/git/lustre-release/lustre/tests/test-framework.sh:6108:error()
        = /home/green/git/lustre-release/lustre/tests/test-framework.sh:5593:check_and_cleanup_lustre()
        = /home/green/git/lustre-release/lustre/tests/sanityn.sh:4920:main()
      Dumping lctl log to /tmp/testlogs//sanityn.test_105.*.1574273752.log
      oleg256-server: rsync: chown "/tmp/testlogs/.sanityn.test_105.debug_log.oleg256-server.1574273752.log.65aqHO" failed: Operation not permitted (1)
      oleg256-server: rsync: chown "/tmp/testlogs/.sanityn.test_105.dmesg.oleg256-server.1574273752.log.rhRRav" failed: Operation not permitted (1)
      oleg256-server: rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1178) [sender=3.1.2]
      pdsh@oleg256-client: oleg256-server: ssh exited with exit code 23
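
The invariant that test 105 checks can be illustrated on a local filesystem. The following is a minimal sketch, not the sanityn.sh test itself and not a reproducer for the ZFS failure: it uses no Lustre, no remote MDT directories and no fail_loc. The function name `race_link_rename` and the `otherdir` and `file2link` names are hypothetical; `foodir`, `file2` and `file2x` are borrowed from the paths in the log above. It races a rename and a hard link against a file that is held open, then checks that every name the directories list can be removed, which is the step that fails in the log.

```python
import os
import tempfile
import threading


def race_link_rename(base_dir):
    """Race rename() and link() on an open file; return names left in base_dir."""
    src_dir = os.path.join(base_dir, "foodir")
    dst_dir = os.path.join(base_dir, "otherdir")  # hypothetical name
    os.makedirs(src_dir)
    os.makedirs(dst_dir)

    src = os.path.join(src_dir, "file2")
    renamed = os.path.join(dst_dir, "file2x")
    linked = os.path.join(dst_dir, "file2link")  # hypothetical name

    unexpected = []

    def attempt(op, *args):
        try:
            op(*args)
        except FileNotFoundError:
            # Losing the race is acceptable: the source name was already moved.
            pass
        except OSError as exc:
            unexpected.append(exc)

    # Keep the file open while both operations run, as the test title describes.
    with open(src, "wb") as held_open:
        held_open.write(b"\0" * 4096)
        held_open.flush()
        workers = [
            threading.Thread(target=attempt, args=(os.rename, src, renamed)),
            threading.Thread(target=attempt, args=(os.link, src, linked)),
        ]
        for worker in workers:
            worker.start()
        for worker in workers:
            worker.join()

    if unexpected:
        raise unexpected[0]

    # Every listed name must be removable, and each directory must then be
    # empty. A listed name that cannot be unlinked would raise here.
    for directory in (src_dir, dst_dir):
        for name in os.listdir(directory):
            os.unlink(os.path.join(directory, name))
        os.rmdir(directory)

    return os.listdir(base_dir)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        print("leftover entries:", race_link_rename(base))
```

Whichever operation wins the race, a consistent filesystem should end with no leftover entries. In the failure above, `file2x` cannot be removed ("No such file or directory") while its parent is still reported as "Directory not empty".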
      

            People

              Assignee: stancheff Shaun Tancheff
              Reporter: stancheff Shaun Tancheff