Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-13374

open-close operation can lead to loss of file time set performed in the middle

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.16.0
    • None
    • None
    • 9223372036854775807

    Description

      the original request in LU-12026 was to update *time as a part of LSOM update, i.e. to be stored in LSOM ea, but implemented as a regular time update on close, what is obviously wrong because it breaks set-in-past functionality (and in general due to the reasons we have LSOM but not SOM) - close always sends *time but does not bother accurately re-gather them from servers, while the local cache could be already stale. in other words this will not work:
      1. client1 open
      2. client2 set-in-past
      3. client1 close

      the reproducer is attached:

      — set-in-past on opened file
      File: '/mnt/lustre/f26c.sanityn'
      Modify: 2020-03-19 22:31:05.000000000 +0300
      File: '/mnt/lustre2/f26c.sanityn'
      Modify: 2020-03-19 22:31:05.000000000 +0300

      — set-in-past on closed file
      File: '/mnt/lustre/f26c.sanityn'
      Modify: 2000-12-31 14:12:59.000000000 +0300
      File: '/mnt/lustre2/f26c.sanityn'
      Modify: 2000-12-31 14:12:59.000000000 +0300

      Attachments

        Issue Links

          Activity

            [LU-13374] open-close operation can lead to loss of file time set performed in the middle
            pjones Peter Jones added a comment -

            Merged for 2.16

            pjones Peter Jones added a comment - Merged for 2.16

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/54450/
            Subject: LU-13374 mdd: fix close time update race with set-in-past
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: bcb954c1403fb1c0fed2ae1d4ed817e59a93a0b7

            gerrit Gerrit Updater added a comment - "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/54450/ Subject: LU-13374 mdd: fix close time update race with set-in-past Project: fs/lustre-release Branch: master Current Patch Set: Commit: bcb954c1403fb1c0fed2ae1d4ed817e59a93a0b7

            "Vitaly Fertman <vitaly.fertman@hpe.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/54450
            Subject: LU-13374 mdd: fix close time update race with set-in-past
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: fb50062d86f6ed215fbfe8ed20a010e99a8b92b8

            gerrit Gerrit Updater added a comment - "Vitaly Fertman <vitaly.fertman@hpe.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/54450 Subject: LU-13374 mdd: fix close time update race with set-in-past Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: fb50062d86f6ed215fbfe8ed20a010e99a8b92b8

            ah, it seems to be not so complex

            vitaly_fertman Vitaly Fertman added a comment - ah, it seems to be not so complex
            vitaly_fertman Vitaly Fertman added a comment - - edited

            I think this is what we want? If no size change, then LSOM would not be updated and set-in-past would be kept?

            yes, set-in-past problem would be fixed. however, in this case for the first update on close, size will be changed from 0 to the actual one. if no more size changes - no more timestamps updated on any close. i.e. the purpose of LU-12026 will be mostly lost. Or was it added only for a simplification of size change detection? the ticket does not explain the purpose well enough... 

            vitaly_fertman Vitaly Fertman added a comment - - edited I think this is what we want? If no size change, then LSOM would not be updated and set-in-past would be  kept ? yes, set-in-past problem would be fixed. however, in this case for the first update on close, size will be changed from 0 to the actual one. if no more size changes - no more timestamps updated on any close. i.e. the purpose of LU-12026 will be mostly lost. Or was it added only for a simplification of size change detection? the ticket does not explain the purpose well enough... 

            Andreas, do I understand you right that your last proposal also means to extend LSOM by timestamps, correct? or by saying 'timestamps in LSOM' you still mean to update in the regular file attribute but only along when LSOM ea update?

            I mean to keep the timestamps in the regular inode, but only when LSOM is updated because of size.

            if no size change, no LSOM ea update, so the set-in-past would be dropped.

            I think this is what we want? If no size change, then LSOM would not be updated and set-in-past would be kept?

            adilger Andreas Dilger added a comment - Andreas, do I understand you right that your last proposal also means to extend LSOM by timestamps, correct? or by saying 'timestamps in LSOM' you still mean to update in the regular file attribute but only along when LSOM ea update? I mean to keep the timestamps in the regular inode, but only when LSOM is updated because of size. if no size change, no LSOM ea update, so the set-in-past would be dropped. I think this is what we want? If no size change, then LSOM would not be updated and set-in-past would be kept ?

            Andreas, do I understand you right that your last proposal also means to extend LSOM by timestamps, correct? or by saying 'timestamps in LSOM' you still mean to update in the regular file attribute but only along when LSOM ea update?

            If another client has set the timestamp to the past, then it will already have done an open/close during that call, which should have also updated LSOM to the current size.

            why? if no size change, no LSOM ea update, so the set-in-past would be dropped.

            vitaly_fertman Vitaly Fertman added a comment - Andreas, do I understand you right that your last proposal also means to extend LSOM by timestamps, correct? or by saying 'timestamps in LSOM' you still mean to update in the regular file attribute but only along when LSOM ea update? If another client has set the timestamp to the past, then it will already have done an open/close during that call, which should have also updated LSOM to the current size. why? if no size change, no LSOM ea update, so the set-in-past would be dropped.

            vitaly_fertman, what about only updating the timestamps in LSOM when the size and/or blocks are also being updated? That should limit the timestamp updates to the case when LSOM is stale/empty. If another client has set the timestamp to the past, then it will already have done an open/close during that call, which should have also updated LSOM to the current size. If the file size grows after the timestamp was updated, then I think that is a legitimate/correct case to update the timestamps again.

            adilger Andreas Dilger added a comment - vitaly_fertman , what about only updating the timestamps in LSOM when the size and/or blocks are also being updated? That should limit the timestamp updates to the case when LSOM is stale/empty. If another client has set the timestamp to the past, then it will already have done an open/close during that call, which should have also updated LSOM to the current size. If the file size grows after the timestamp was updated, then I think that is a legitimate/correct case to update the timestamps again.

            my original thought was to extend the LSOM EA for these timestamps. Change the logic when the EA is to be written on disk. Add the changelog with the needed flags. it seems there is no problems with old ondisk EAs… it seems they will be automatically replaced on the next write, we just need to distinguish the one we have read is of the previous version/shirt size…

            vitaly_fertman Vitaly Fertman added a comment - my original thought was to extend the LSOM EA for these timestamps. Change the logic when the EA is to be written on disk. Add the changelog with the needed flags. it seems there is no problems with old ondisk EAs… it seems they will be automatically replaced on the next write, we just need to distinguish the one we have read is of the previous version/shirt size…

            I'm thinking the "set time to past" call should revoke DLM locks from all of the clients that have the file open? Then we could check if the client has a lock (with LVB+timestamps) when it is doing the close before sending the LSOM attributes to the MDS. That should avoid this particular ordering issue I think?

            On the one hand, I want LSOM to be updated easily from a client (any open+stat+close) so that it can be used by statx() and ldiskfs scanning tools. On the other hand, this bug has existed for over 4 years (since 2.12.4) and hasn't really come up as an issue with users AFAIK.

            adilger Andreas Dilger added a comment - I'm thinking the "set time to past" call should revoke DLM locks from all of the clients that have the file open? Then we could check if the client has a lock (with LVB+timestamps) when it is doing the close before sending the LSOM attributes to the MDS. That should avoid this particular ordering issue I think? On the one hand, I want LSOM to be updated easily from a client (any open+stat+close) so that it can be used by statx() and ldiskfs scanning tools. On the other hand, this bug has existed for over 4 years (since 2.12.4) and hasn't really come up as an issue with users AFAIK.

            People

              vitaly_fertman Vitaly Fertman
              vitaly_fertman Vitaly Fertman
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: