
sanity test_56a: @@@@@@ FAIL: /usr/bin/lfs getstripe --obd wrong: found 6, expected 3

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major
    • Affects Version/s: Lustre 2.5.0
    • Severity: 3

    Description

      This problem is similar to LU-3846 and LU-3858. The test suite should wait a few seconds after it clears the directory's default stripe setting; otherwise, newly created entries under the directory will have a stripe count of 2 rather than 1.
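      As an illustration only (not part of sanity.sh): a minimal bash sketch of the kind of wait the description suggests, assuming the test framework's $LFS and error() conventions; the helper name, retry count, and expected value are hypothetical.

      wait_for_default_stripe_count() {
              # Hypothetical helper: poll until the directory's default
              # stripe count (-d reports the directory default, -c prints
              # only the count) matches the expected value, rather than
              # assuming the change is visible immediately.
              local dir=$1 expected=$2 retries=${3:-10}
              local i count

              for ((i = 0; i < retries; i++)); do
                      count=$($LFS getstripe -c -d "$dir")
                      [ "$count" = "$expected" ] && return 0
                      sleep 1
              done
              return 1
      }

      # Example usage after resetting the default stripe on a test directory
      # (the expected value of 1 is illustrative):
      #   $LFS setstripe -c 1 $DIR/$tdir
      #   wait_for_default_stripe_count $DIR/$tdir 1 ||
      #           error "default stripe count did not settle"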

    Activity

            adilger Andreas Dilger made changes -
            Resolution: Cannot Reproduce
            Status: Open → Resolved

            adilger Andreas Dilger added a comment - Haven't seen this in a long time.
            pjones Peter Jones made changes -
            End date: 01/Sep/15
            Start date: 05/Sep/13
            adilger Andreas Dilger made changes -
            Link: This issue is duplicated by LU-7071

            jamesanunez James Nunez (Inactive) added a comment - I've hit this problem with lustre-master tag 2.6.92. Results at https://testing.hpdd.intel.com/test_sets/37e63f92-9f0d-11e4-91b3-5254006e85c2

            lixi Li Xi (Inactive) added a comment - Yeah, I hit the 'found 6, expected 3' problem every time I run sanity.sh.
            emoly.liu Emoly Liu added a comment -

            The problem you found with run.sh is probably related to the following code:

            When we set the stripe for the root (mount point), set_default is enabled in ll_dir_ioctl():

                    case LL_IOC_LOV_SETSTRIPE: {
                            ...
                            int set_default = 0;
                            ...
                            if (inode->i_sb->s_root == file->f_dentry)
                                    set_default = 1;

                            /* in v1 and v3 cases lumv1 points to data */
                            rc = ll_dir_setstripe(inode, lumv1, set_default);

            Then, in ll_dir_setstripe(), if set_default = 1, ll_send_mgc_param() is called to set the information asynchronously:

                    if (set_default && mgc->u.cli.cl_mgc_mgsexp) {
                            /* Set root stripesize */
                            /* Set root stripecount */
                            /* Set root stripeoffset */
                    }

            Since run.sh runs setstripe very frequently and many times, the config log queue can become very long (a bottleneck), and the MGS will take more time to process it.

            BTW, can you hit this problem if you don't use run.sh and just run sanity.sh normally?
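
            (Editorial sketch, not part of the original comment.) To make the asynchronous path above concrete, the following bash fragment sets a default stripe count on the mount point and polls newly created files until the MGS-pushed value takes effect; $LFS, $MOUNT, the probe file names, and the 30-attempt cap are assumptions in the style of the Lustre test framework.

            # Set a default stripe count on the filesystem root; for the root
            # this goes through ll_send_mgc_param() / the MGS config log
            # rather than taking effect immediately.
            $LFS setstripe -c 2 $MOUNT

            # Poll: create a probe file each second and check which stripe
            # count newly created files actually get.
            for i in $(seq 1 30); do
                    touch $MOUNT/stripe-probe.$i
                    count=$($LFS getstripe -c $MOUNT/stripe-probe.$i)
                    echo "attempt $i: new file stripe count = $count"
                    [ "$count" = "2" ] && break
                    sleep 1
            done

            # Clean up the probe files and delete the default layout again.
            rm -f $MOUNT/stripe-probe.*
            $LFS setstripe -d $MOUNT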

            emoly.liu Emoly Liu made changes -
            Link: This issue is related to LU-3846
            emoly.liu Emoly Liu made changes -
            Link: This issue is related to LU-3858
            emoly.liu Emoly Liu added a comment - Yes, this time I hit that. I will investigate it.

    People

      • Assignee: emoly.liu Emoly Liu
      • Reporter: lixi Li Xi (Inactive)
      • Votes: 0
      • Watchers: 5
