Details

    • Type: Technical task
    • Resolution: Unresolved
    • Priority: Blocker

    Description

      For a file or directory flagged with the Protect (P) state under the protection of an EX WBC lock, the open() system call does not need to communicate with the MDS; it can be executed locally in the MemFS of the Lustre WBC.

      However, Lustre is a stateful filesystem: each open keeps state on the MDS. We must preserve transparency for applications once the EX WBC lock is cancelled. To achieve this, each local open will be recorded in the inode's open list (or perhaps maintained per dentry?); when the EX WBC lock is being cancelled, the files in the open list must be reopened from the MDS, as sketched below.

      For regular files, once they are reopened from the MDS, metadata and data I/O can use the reopened file handle, so the process is transparent to applications.
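
      A minimal sketch of the intended bookkeeping follows, assuming hypothetical names (wbc_inode, wbc_open_item, wbc_reopen_from_mds) rather than the actual patch interfaces: each local open appends an entry to a per-inode list, and lock cancellation drains the list by reopening each file from the MDS.

      #include <linux/fs.h>
      #include <linux/list.h>
      #include <linux/slab.h>
      #include <linux/spinlock.h>

      /* All structures and helpers below are illustrative, not the patch. */
      struct wbc_inode {
              spinlock_t        wbci_lock;       /* protects wbci_open_list */
              struct list_head  wbci_open_list;  /* locally opened files */
      };

      struct wbc_open_item {
              struct list_head  woi_list;  /* linkage in wbci_open_list */
              struct file      *woi_file;  /* locally opened file */
              fmode_t           woi_mode;  /* open mode to replay */
      };

      /* Record a local open performed in MemFS under the EX WBC lock. */
      static int wbc_record_open(struct wbc_inode *wbci, struct file *file)
      {
              struct wbc_open_item *item;

              item = kzalloc(sizeof(*item), GFP_NOFS);
              if (item == NULL)
                      return -ENOMEM;

              item->woi_file = file;
              item->woi_mode = file->f_mode;

              spin_lock(&wbci->wbci_lock);
              list_add_tail(&item->woi_list, &wbci->wbci_open_list);
              spin_unlock(&wbci->wbci_lock);
              return 0;
      }

      /* On EX WBC lock cancellation, reopen each recorded file from the
       * MDS (wbc_reopen_from_mds() is a placeholder for that step). */
      static int wbc_reopen_files(struct wbc_inode *wbci)
      {
              struct wbc_open_item *item, *tmp;
              int rc = 0;

              list_for_each_entry_safe(item, tmp, &wbci->wbci_open_list,
                                       woi_list) {
                      rc = wbc_reopen_from_mds(item->woi_file, item->woi_mode);
                      if (rc != 0)
                              break;
                      list_del(&item->woi_list);
                      kfree(item);
              }
              return rc;
      }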

          Activity

            [LU-13010] WBC: Reopen the file when WBC EX lock revoking
            qian_wc Qian Yingjin added a comment -

            For directories, the ->readdir() call must be handled carefully.
            Currently the mechanism adopted by MemFS (tmpfs) is simply to scan the directory's in-memory sub-dentries in the dcache linearly to fill the content returned to the readdir call: ->dcache_readdir().
            Lustre's new readdir implementation, however, is much more complex: it performs readdir in hash order and uses the hash of a file name as the telldir/seekdir cookie stored in the file handle.
            Thus, we must first find a method to bridge the two implementations.
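
            One naive bridge, sketched here for illustration only, is to dispatch the directory iterator on the WBC state of the directory: use the linear dcache scan while the directory is fully cached under the EX lock, and Lustre's hash-ordered readdir otherwise. This does not yet translate an in-flight telldir/seekdir cookie across a lock cancellation, which is the hard part; wbc_inode_protected() is a hypothetical predicate and ll_readdir() stands in for Lustre's directory iterator.

            #include <linux/fs.h>

            static int ll_wbc_readdir(struct file *file, struct dir_context *ctx)
            {
                    struct inode *inode = file_inode(file);

                    /* Directory still fully cached in MemFS under the EX WBC
                     * lock: a tmpfs-style linear scan of the cached
                     * sub-dentries suffices. */
                    if (wbc_inode_protected(inode))
                            return dcache_readdir(file, ctx);

                    /* Otherwise use Lustre's hash-ordered readdir, where the
                     * telldir/seekdir cookie is the file name hash. */
                    return ll_readdir(file, ctx);
            }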


            gerrit Gerrit Updater added a comment -

            Yingjin Qian (qian@ddn.com) uploaded a new patch: https://review.whamcloud.com/38423
            Subject: LU-13010 wbc: reopen files when root WBC EX lock is revoking
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 72e3461f56db0a9de513aba4547518b4f1d6e943


            adilger Andreas Dilger added a comment -

            It makes sense to align the implementation of MDS_OPEN RPC generation with the "Simplified Interoperability MDS_OPEN request replay" architecture. In particular, change the client over to generating new MDS_OPEN RPCs from the VFS file handles for all recovery so that there is a single unified mechanism for handling this. This will also allow simplifying the client RPC replay code to remove mod_open_req, mod_close_req, rq_replay, and a considerable amount of related complexity in ptlrpc and mdc code.

            There may need to be a small amount of extra saved state in the Lustre file handle (e.g. rq_transno) after the RPC is committed on the MDS and the RPC is dropped from imp_replay_list in order to generate a new MDS_OPEN RPC during replay. However, it may also be enough for the MDS_OPEN RPC to generate a fake rq_transno < exp_last_committed so that it is before any other real RPC sent during recovery.

            Once this cleanup is done, then WBC open file handles would just generate MDS_OPEN RPCs directly during WBC cache flush, without the need to save a lot of extra state on the client if many files are open.
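
            A hedged sketch of that unified mechanism, with hypothetical names (exp_open_handles, fd_list, mdc_build_open_rpc) standing in for whatever interface the cleanup would actually introduce: rather than pinning the original open requests for replay, walk the open file handles and build fresh MDS_OPEN RPCs from them.

            /* Regenerate MDS_OPEN RPCs from the saved open file handles,
             * instead of keeping the original requests pinned via rq_replay. */
            static int mdc_replay_opens_from_handles(struct obd_export *exp)
            {
                    struct ll_file_data *fd;
                    int rc = 0;

                    list_for_each_entry(fd, &exp->exp_open_handles, fd_list) {
                            struct ptlrpc_request *req;

                            /* Build a fresh MDS_OPEN request from the handle
                             * state; the original RPC is no longer needed. */
                            req = mdc_build_open_rpc(exp, fd);
                            if (IS_ERR(req)) {
                                    rc = PTR_ERR(req);
                                    break;
                            }
                            /* Replayed opens should sort before real recovery
                             * RPCs, e.g. via a transno below
                             * exp_last_committed, as noted above. */
                            rc = ptlrpc_queue_wait(req);
                            ptlrpc_req_finished(req);
                            if (rc != 0)
                                    break;
                    }
                    return rc;
            }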

            adilger Andreas Dilger added a comment - It makes sense to align the implementation of MDS_OPEN RPC generation with the " Simplified Interoperability MDS_OPEN request replay " architecture. In particular, change the client over to generating new MDS_OPEN RPCs from the VFS file handles for all recovery so that there is a single unified mechanism for handling this. This will also allow simplifying the client RPC replay code to remove mod_open_req , mod_close_req , rq_replay , and a considerable amount of related complexity in ptlrpc and mdc code. There may need to be a small amount of extra saved state in the Lustre file handle (e.g. rq_transno ) after the RPC is committed on the MDS and the RPC is dropped from imp_replay_list in order to generate a new MDS_OPEN RPC during replay. However, it may also be enough for the MDS_OPEN RPC to generate a fake rq_transno < exp_last_committed so that it is before any other real RPC sent during recovery. Once this cleanup is done, then WBC open file handles would just generate MDS_OPEN RPCs directly during WBC cache flush, without the need to save a lot of extra state on the client if many files are open.

            adilger Andreas Dilger added a comment -

            Ideally, it would make sense to create the files on the MDS with an open call, with a small change to allow passing the number of openers on the client. That avoids sending an extra RPC to the MDS.
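
            For illustration only, assuming a hypothetical mrc_open_count field, the create record could carry the open count so that a single RPC both creates the file and instantiates the open state for all local openers:

            /* Illustrative extension of the on-wire create record; only
             * mrc_open_count is new, and its name is hypothetical. */
            struct mdt_rec_create_open {
                    struct mdt_rec_create  mrc_create;      /* existing create body */
                    __u32                  mrc_open_count;  /* local openers to register */
            };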


            People

              Assignee: wc-triage WC Triage
              Reporter: qian_wc Qian Yingjin
              Votes: 0
              Watchers: 6
