    Description

      Several minor issues have been identified during the review of the initial version of the HSM posix copytool, such as calling select() on regular files.

          Activity

            [LU-3971] CLONE - Posix copytool cleanup

            jlevi Jodi Levi (Inactive) added a comment -

            Per last comment in ticket, patch has landed and ticket can be closed.

            hdoreau Henri Doreau (Inactive) added a comment -

            Patch http://review.whamcloud.com/#/c/7583/ has been merged. This ticket can be closed.

            jhammond John Hammond added a comment -

            See LU-3973 for a bug in cleanup_large_files().

            adegremont Aurelien Degremont (Inactive) added a comment -

            > I wonder if the "/usr/lib64/lustre/tests/sanity-hsm.sh: line 394: [: /mnt/lustre: integer expression expected" (from cleanup_large_files()/make_large_for_progress() ??...) error could be the root cause of this ...

            I think this is due to cleanup_large_files(), which was introduced by another patch in September.
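
            For context on the message quoted above: bash's `[` builtin prints "integer expression expected" when an operand of a numeric operator such as -gt is not an integer, which suggests that a path ended up where a number was expected. The contents of line 394 of sanity-hsm.sh are not shown in this ticket, so the following is only a minimal sketch of the failure mode and of one possible guard. The helpers is_uint and exceeds are hypothetical names used for illustration, not functions from the Lustre test framework.

```shell
#!/bin/bash
# Hypothetical helper (not from sanity-hsm.sh): succeeds only when the
# argument is a non-empty string of decimal digits.
is_uint() {
	case "$1" in
	''|*[!0-9]*) return 1 ;;
	*) return 0 ;;
	esac
}

# Hypothetical guarded comparison: succeeds only when both arguments are
# unsigned integers and $1 is strictly greater than $2.
exceeds() {
	is_uint "$1" && is_uint "$2" && [ "$1" -gt "$2" ]
}

# Unguarded form: with a path as operand, `[` reports
# "integer expression expected" on stderr (silenced here) and returns a
# non-zero status, so the "then" branch is skipped.
value=/mnt/lustre
if [ "$value" -gt 0 ] 2>/dev/null; then
	echo "unguarded: greater"
else
	echo "unguarded: test failed or false"
fi

# Guarded form: the non-integer operand is rejected before `[` sees it.
exceeds "$value" 0 || echo "guarded: '$value' is not an integer greater than 0"
exceeds 39000000 0 && echo "guarded: 39000000 is greater than 0"
```

            Run with bash, the unguarded test falls through to the else branch, while the guarded helper rejects the path and accepts 39000000 without printing any error.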

            bfaccini Bruno Faccini (Inactive) added a comment -

            Patch-set #6 failed in sanity-hsm/test_104 (LU-4022 has been created to address this error) with the following log:

            == sanity-hsm test 104: Copy tool data field == 07:05:05 (1380377105)
            CMD: wtm-11vm1 pkill -CONT -x lhsmtool_posix
            Purging archive on wtm-11vm1
            CMD: wtm-11vm1 rm -rf /home/cgearing/.autotest/shared_dir/2013-09-27/185645-70126231448960/arc1/*
            Starting copytool agt1 on wtm-11vm1
            CMD: wtm-11vm1 mkdir -p /home/cgearing/.autotest/shared_dir/2013-09-27/185645-70126231448960/arc1
            CMD: wtm-11vm1 lhsmtool_posix  --daemon --hsm-root /home/cgearing/.autotest/shared_dir/2013-09-27/185645-70126231448960/arc1 --bandwidth 1 /mnt/lustre < /dev/null > /logdir/test_logs/2013-09-27/lustre-reviews-el6-x86_64--review--1_2_1__18438__-70126231448960-185644/sanity-hsm.test_104.copytool_log.wtm-11vm1.log 2>&1
            /usr/lib64/lustre/tests/sanity-hsm.sh: line 394: [: /mnt/lustre: integer expression expected
            39+0 records in
            39+0 records out
            39000000 bytes (39 MB) copied, 5.79256 s, 6.7 MB/s
            CMD: wtm-11vm7 /usr/sbin/lctl set_param mdt.lustre-MDT0000.hsm_control=disabled
            mdt.lustre-MDT0000.hsm_control=disabled
            CMD: wtm-11vm7 /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm_control
            CMD: wtm-11vm7 /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.agent_actions | grep 0x200002341:0x77:0x0 | cut -f16 -d=
            CMD: wtm-11vm7 /usr/sbin/lctl set_param mdt.lustre-MDT0000.hsm_control=enabled
            mdt.lustre-MDT0000.hsm_control=enabled
            CMD: wtm-11vm7 /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm_control
             sanity-hsm test_104: @@@@@@ FAIL: Data field in records is () and not ([434541])
              Trace dump:
              = /usr/lib64/lustre/tests/test-framework.sh:4266:error_noexit()
              = /usr/lib64/lustre/tests/test-framework.sh:4293:error()
              = /usr/lib64/lustre/tests/sanity-hsm.sh:2297:test_104()
              = /usr/lib64/lustre/tests/test-framework.sh:4547:run_one()
              = /usr/lib64/lustre/tests/test-framework.sh:4580:run_one_logged()
              = /usr/lib64/lustre/tests/test-framework.sh:4435:run_test()
              = /usr/lib64/lustre/tests/sanity-hsm.sh:2301:main()
            Dumping lctl log to /logdir/test_logs/2013-09-27/lustre-reviews-el6-x86_64--review--1_2_1__18438__-70126231448960-185644/sanity-hsm.test_104.*.1380377113.log
            CMD: wtm-11vm1,wtm-11vm2.rosso.whamcloud.com,wtm-11vm7,wtm-11vm8 /usr/sbin/lctl dk > /logdir/test_logs/2013-09-27/lustre-reviews-el6-x86_64--review--1_2_1__18438__-70126231448960-185644/sanity-hsm.test_104.debug_log.\$(hostname -s).1380377113.log;
                     dmesg > /logdir/test_logs/2013-09-27/lustre-reviews-el6-x86_64--review--1_2_1__18438__-70126231448960-185644/sanity-hsm.test_104.dmesg.\$(hostname -s).1380377113.log
            CMD: wtm-11vm1 pkill -INT -x lhsmtool_posix
            Copytool is stopped on wtm-11vm1

            I wonder if the "/usr/lib64/lustre/tests/sanity-hsm.sh: line 394: [: /mnt/lustre: integer expression expected" (from cleanup_large_files()/make_large_for_progress() ??...) error could be the root cause of this ...

            hdoreau Henri Doreau (Inactive) added a comment -

            Thanks Bruno,

            patch rebased and re-pushed.

            bfaccini Bruno Faccini (Inactive) added a comment -

            Change #7583 hit a number of auto-test errors, starting with one in conf-sanity/test_61. That failure is addressed by LU-3938, whose patch (http://review.whamcloud.com/7671) has just landed.

            So can the patch be rebased and resubmitted?

            Thanks!

            jlevi Jodi Levi (Inactive) added a comment -

            http://review.whamcloud.com/7568 landed in 2.5
            http://review.whamcloud.com/#/c/7583/ also needed, but not critical for 2.5

            People

              jlevi Jodi Levi (Inactive)
              hdoreau Henri Doreau (Inactive)