[LU-5951] sanity test_39k: mtime is lost on close

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.8.0
    • Affects Version/s: Lustre 2.7.0, Lustre 2.8.0, Lustre 2.5.4
    • Severity: 3
    • Rank: 16613

    Description

      This issue was created by maloo for John Hammond <john.hammond@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/01943f56-7150-11e4-b80a-5254006e85c2.

      The sub-test test_39k failed with the following error:

      mtime is lost on close: 1416505386, should be 1384969360
      

      I ran 39k in a loop locally and saw the same failure in 2 out of 256 runs.
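
For readers who want to see the shape of the check outside the Lustre test framework, here is a minimal sketch in C. It assumes the failing pattern is "write to an open file, set its mtime to a value in the past, close, then stat"; that reading is inferred from the error message above and is not copied from sanity.sh. The helper names `mtime_survives_close` and `count_lost_mtime` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: write to a file, set its mtime to `want` by path
 * while the file is still open, close it, then check what stat() reports.
 * Returns 1 if the mtime survived the close, 0 if it was lost, -1 on error.
 */
static int mtime_survives_close(const char *path, time_t want)
{
	struct timespec ts[2];
	struct stat st;
	int fd;

	assert(path != NULL);

	fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0644);
	if (fd < 0)
		return -1;

	/* Dirty the file so that close has data to flush. */
	if (write(fd, "x", 1) != 1) {
		close(fd);
		return -1;
	}

	ts[0].tv_sec = want;	/* atime */
	ts[0].tv_nsec = 0;
	ts[1].tv_sec = want;	/* mtime */
	ts[1].tv_nsec = 0;
	if (utimensat(AT_FDCWD, path, ts, 0) != 0) {
		close(fd);
		return -1;
	}

	if (close(fd) != 0)
		return -1;
	if (stat(path, &st) != 0)
		return -1;

	return st.st_mtime == want;
}

/* Repeat the check `runs` times; return how many runs lost the mtime. */
static int count_lost_mtime(const char *path, time_t want, int runs)
{
	int lost = 0;

	for (int i = 0; i < runs; i++) {
		int rc = mtime_survives_close(path, want);

		if (rc < 0)
			return -1;
		if (rc == 0)
			lost++;
	}
	return lost;
}
```

Calling `count_lost_mtime(path, 1384969360, 256)` against a file on a Lustre mount would mirror the 256-run loop described above. Whether this sketch triggers the same race as test_39k is not confirmed.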

      Here are all the instances from maloo:

      https://testing.hpdd.intel.com/sub_tests/7e8dab70-069d-11e2-9e80-52540035b04c ~2012-09-24
      https://testing.hpdd.intel.com/sub_tests/9f92a846-0c4e-11e2-8132-52540035b04c ~2012-10-01

      https://testing.hpdd.intel.com/sub_tests/bed3a506-06c6-11e4-9c81-5254006e85c2 2014-07-08 01:42:48 UTC
      https://testing.hpdd.intel.com/sub_tests/127d427e-0c86-11e4-8fe6-5254006e85c2 2014-07-15 21:43:59 UTC
      https://testing.hpdd.intel.com/sub_tests/ab1c3752-2a76-11e4-8657-5254006e85c2 2014-08-23 00:04:17 UTC

      https://testing.hpdd.intel.com/sub_tests/1791dfdc-6d01-11e4-8bd3-5254006e85c2 2014-11-14 08:35:54 UTC
      https://testing.hpdd.intel.com/sub_tests/545725c0-6db6-11e4-a728-5254006e85c2 2014-11-15 20:02:55 UTC
      https://testing.hpdd.intel.com/sub_tests/9c2721d8-7078-11e4-a6ba-5254006e85c2 2014-11-19 16:37:26 UTC
      https://testing.hpdd.intel.com/sub_tests/bb99d730-712d-11e4-9495-5254006e85c2 2014-11-20 17:12:53 UTC
      https://testing.hpdd.intel.com/sub_tests/2a99b35e-7150-11e4-b80a-5254006e85c2 2014-11-20 17:12:53 UTC
      https://testing.hpdd.intel.com/sub_tests/1db886d4-7177-11e4-89a9-5254006e85c2 2014-11-21 00:07:35 UTC

      Info required for matching: sanity 39k

          Activity

            jgmitter Joseph Gmitter (Inactive) added a comment:

            The fixVersion has been updated to 2.9.0 to properly address the issue that was being addressed by http://review.whamcloud.com/15473/
            This issue is not something we are currently seeing as a failure on master.

            jgmitter Joseph Gmitter (Inactive) added a comment:

            Reopening as the recent landing had caused LU-7252. The recent landing has been reverted.

            gerrit Gerrit Updater added a comment:

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/16734/
            Subject: Revert "LU-5951 ptlrpc: track unreplied requests"
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: c250f40ef3222dbeb92d7914a0d9f38a3525d2fb

            gerrit Gerrit Updater added a comment:

            Oleg Drokin (oleg.drokin@intel.com) uploaded a new patch: http://review.whamcloud.com/16734
            Subject: Revert "LU-5951 ptlrpc: track unreplied requests"
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: e12c29a48b33a0ae7bd4147dab57dae5597954aa

            jgmitter Joseph Gmitter (Inactive) added a comment:

            Landed for 2.8.0

            gerrit Gerrit Updater added a comment:

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/15473/
            Subject: LU-5951 ptlrpc: track unreplied requests
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: c77e504fdac12d3be7d19a652d6c7da497018c76

            jamesanunez James Nunez (Inactive) added a comment (edited):

            I've seen this issue again:
            2015-07-02 18:54:42 - https://testing.hpdd.intel.com/test_sets/4ee3283c-2102-11e5-8eb6-5254006e85c2
            2015-07-03 18:45:11 - https://testing.hpdd.intel.com/test_sets/f4ce726e-21e9-11e5-a388-5254006e85c2
            2015-07-10 05:07:18 - https://testing.hpdd.intel.com/test_sets/0d7b77d2-26fc-11e5-925d-5254006e85c2

            niu Niu Yawei (Inactive) added a comment:

            Ok, I'll update the patch to maintain an unreplied xid list for each import.

            bzzz Alex Zhuravlev added a comment (edited):

            Well, if we don't track that, then it's very easy to "lose" some slots: at moment X we used 8 slots, then later we were using 2 slots at most. Using tags we can reuse only those 2 slots, but we can't report that the other slots can be reused. There is no strong need to maintain that absolutely up to date.
            Technically it should be possible (and not very complex) to introduce another list: ptlrpc_next_xid() (or its callers) atomically puts the RPC on the list, and after_reply() and __ptlrpc_req_free() delete the RPC from the list.
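
The list described in this comment can be sketched roughly as follows. This is an illustration only, not the code from http://review.whamcloud.com/15473: the struct, the `ul_*` function names, and the fixed-size array are all hypothetical, and locking is left out. It assumes XIDs are handed out in increasing order, so appending keeps the list sorted and the first entry is the lowest unreplied XID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UL_MAX_INFLIGHT 64	/* illustrative cap, not a Lustre constant */

/*
 * Hypothetical per-import tracker: XIDs that were assigned but have not
 * been replied to yet, kept in ascending order.
 */
struct unreplied_list {
	uint64_t next_xid;
	uint64_t xids[UL_MAX_INFLIGHT];
	size_t count;
};

static void ul_init(struct unreplied_list *ul, uint64_t first_xid)
{
	assert(first_xid != 0);	/* 0 is reserved as the "list full" result */
	ul->next_xid = first_xid;
	ul->count = 0;
}

/*
 * Assign an XID and put it on the list in one step. Real code would do
 * this under a lock so the two actions are atomic. Returns 0 if full.
 */
static uint64_t ul_assign_xid(struct unreplied_list *ul)
{
	uint64_t xid;

	if (ul->count == UL_MAX_INFLIGHT)
		return 0;
	xid = ul->next_xid++;
	ul->xids[ul->count++] = xid;	/* XIDs only grow, so order is kept */
	return xid;
}

/*
 * Drop an XID from the list, either because its reply arrived or because
 * the request was freed without ever being sent. Returns 1 if found.
 */
static int ul_remove(struct unreplied_list *ul, uint64_t xid)
{
	for (size_t i = 0; i < ul->count; i++) {
		if (ul->xids[i] == xid) {
			memmove(&ul->xids[i], &ul->xids[i + 1],
				(ul->count - i - 1) * sizeof(ul->xids[0]));
			ul->count--;
			return 1;
		}
	}
	return 0;
}

/*
 * Lowest XID still waiting for a reply. With an empty list, every XID
 * below next_xid has been accounted for, so next_xid is returned.
 */
static uint64_t ul_lowest_unreplied(const struct unreplied_list *ul)
{
	return ul->count ? ul->xids[0] : ul->next_xid;
}
```

In this sketch, `ul_assign_xid()` stands in for the ptlrpc_next_xid() step, and `ul_remove()` would be called from both the reply path and the free path. Removing on free is what keeps a never-sent request from pinning the lowest unreplied XID.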

            adilger Andreas Dilger added a comment:

            It isn't totally clear that we need the change from http://review.whamcloud.com/14793 in order for the multi-slot code to work. While it would make the tracking of unreplied RPCs a bit more complex, having an atomic XID assignment set at "send" time is not quite the same as "unreplied" so there still needs to be a mechanism used to track which RPCs have replies.

            The one major difference would be that there needs to be some mechanism to track RPC XIDs which are never sent, so that they don't permanently get stuck as the lowest unreplied XID. It would seem possible to do this in __ptlrpc_req_free() I think?

            gerrit Gerrit Updater added a comment:

            Niu Yawei (yawei.niu@intel.com) uploaded a new patch: http://review.whamcloud.com/15473
            Subject: LU-5951 osc: set ioepoch to ost setattr/punch/write
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: fafc374824db7d69bed1c527989ea60d825200dd

            People

              Assignee: niu Niu Yawei (Inactive)
              Reporter: maloo Maloo