Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • Lustre 2.1.0
    • Lustre 2.1.0, Lustre 1.8.6
    • None
    • 2.6.18-194.17.1.el5
    • 3
    • 18,169
    • 4066

    Description

      I hit this problem several times while running sanityn:

      == sanityn test 4: fstat validation on multiple mount points ========================================= 19:19:59 (1303179599)
      Mtimes don't match 1303179601, 1303179727
      sanityn test_4: @@@@@@ FAIL: test_4 failed with 1

      test result on maloo:
      https://maloo.whamcloud.com/test_sets/4d51c720-6a37-11e0-b32b-52540025f9af

      This is the same as bug 18169.

          Activity

            [LU-221] sanityN.sh test_4 failed

            Integrated in lustre-master » i686,server,el5,ofa #231
            LU-221 don't use a/c/m time for newly allocated object in OST

            Oleg Drokin : 414251797ed178eec5d431e1f5aa4a889d2b159f
            Files :

            • lustre/obdfilter/filter.c
            hudson Build Master (Inactive) added a comment

            Integrated in lustre-master » i686,server,el6,inkernel #231
            LU-221 don't use a/c/m time for newly allocated object in OST

            Oleg Drokin : 414251797ed178eec5d431e1f5aa4a889d2b159f
            Files :

            • lustre/obdfilter/filter.c
            hudson Build Master (Inactive) added a comment

            Integrated in lustre-master » x86_64,server,el6,inkernel #231
            LU-221 don't use a/c/m time for newly allocated object in OST

            Oleg Drokin : 414251797ed178eec5d431e1f5aa4a889d2b159f
            Files :

            • lustre/obdfilter/filter.c
            hudson Build Master (Inactive) added a comment

            Integrated in lustre-master » x86_64,client,el5,inkernel #231
            LU-221 don't use a/c/m time for newly allocated object in OST

            Oleg Drokin : 414251797ed178eec5d431e1f5aa4a889d2b159f
            Files :

            • lustre/obdfilter/filter.c
            hudson Build Master (Inactive) added a comment

            Integrated in lustre-master » x86_64,client,el5,ofa #231
            LU-221 don't use a/c/m time for newly allocated object in OST

            Oleg Drokin : 414251797ed178eec5d431e1f5aa4a889d2b159f
            Files :

            • lustre/obdfilter/filter.c
            hudson Build Master (Inactive) added a comment

            There is an error in the Maloo test run for this patch:
            https://maloo.whamcloud.com/test_sets/b3babb64-b1fb-11e0-b33f-52540025f9af

            A Bugzilla bug (https://bugzilla.lustre.org/show_bug.cgi?id=23161) tracked this issue.

            I'm not yet sure whether the new occurrence is related to the patch.

            hongchao.zhang Hongchao Zhang added a comment

            The patch is at http://review.whamcloud.com/#change,1084

            Some notes about the patch:
            1. S_ISUID and S_ISGID can't be used to determine whether the inode's a/c/m times are valid, because users can set
            each of them individually (say, with "touch"); test_39m in sanity.sh tests exactly this case.

            In this patch, the a/c/m times are set to the minimum value of inode->i_a(m,c)time (LONG_MIN). Clients will ignore
            them, because only the newest time is used.

            2. In this patch, the a/c/m times are initialized after the inode is created. This step can be moved into ldiskfs if it
            degrades creation performance.

            hongchao.zhang Hongchao Zhang added a comment
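
The LONG_MIN idea in the notes above can be sketched as follows. This is a minimal illustration, not the code from the patch: the struct and the function names (sketch_obj_times, sketch_times_init, sketch_times_merge) are hypothetical, and it assumes that a client simply keeps the larger of the two values for each timestamp.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the a/m/c times of an object. */
struct sketch_obj_times {
    long atime;
    long mtime;
    long ctime;
};

/* Mark a newly allocated object: LONG_MIN means "no meaningful time yet". */
static void sketch_times_init(struct sketch_obj_times *t)
{
    assert(t != NULL);
    t->atime = LONG_MIN;
    t->mtime = LONG_MIN;
    t->ctime = LONG_MIN;
}

static long sketch_newest(long a, long b)
{
    return a > b ? a : b;
}

/*
 * Assumed client-side merge: keep the newest value of each timestamp.
 * A LONG_MIN value can never be newer than a real time, so it is ignored.
 */
static void sketch_times_merge(struct sketch_obj_times *dst,
                               const struct sketch_obj_times *src)
{
    assert(dst != NULL && src != NULL);
    dst->atime = sketch_newest(dst->atime, src->atime);
    dst->mtime = sketch_newest(dst->mtime, src->mtime);
    dst->ctime = sketch_newest(dst->ctime, src->ctime);
}
```

With this scheme, merging a freshly initialized object into times the client already holds leaves those times unchanged, while any real time from another object still wins over LONG_MIN.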

            I think that we need to make fixing this issue a priority. As Andreas says, it causes regular autotest failures.

            pjones Peter Jones added a comment

            The initial patch has been created (some improvement is still needed), but work on it was suspended because several
            other high-priority bugs needed investigation. I'll complete it soon.

            hongchao.zhang Hongchao Zhang added a comment

            Any progress on this bug?

            I see this is still failing in Maloo, e.g. https://maloo.whamcloud.com/test_sets/fa0a9a24-a78f-11e0-bd2a-52540025f9af. Fixing the failing regression tests makes our testing much more efficient.

            I think the proposed fix is relatively straightforward to implement: in filter_commitrw_write(), if both SUID and SGID are still set (i.e. this is the first time this was done), then OBD_MD_FLMTIME|OBD_MD_FLCTIME|OBD_MD_FLATIME should be OR'd into the "i" flag passed to iattr_from_obdo(), and ATTR_MTIME | ATTR_CTIME | ATTR_ATIME should be left in the ia_valid mask, so that the values sent from the client overwrite those on disk in the call to fsfilt_setattr().

            adilger Andreas Dilger added a comment

            It should already be possible to handle this today by checking the SUID/SGID flags on the inode, and if both of them are set then the object a/m/ctime should be ignored (return 0 for all of them). I'd prefer not to set those values to 0 on disk because some parts of the code consider ctime = 0 an invalid inode.

            It is definitely correct that the times should be based on the client and not the server. While it is desirable to have clocks in sync on all nodes, this is not required.

            adilger Andreas Dilger added a comment

            People

              hongchao.zhang Hongchao Zhang
              niu Niu Yawei (Inactive)