Details

    • Improvement
    • Resolution: Fixed
    • Minor
    • Lustre 2.16.0

    Description

      Eviction is a standard mechanism for Lustre targets to protect themselves against dead or misbehaving clients.

      On a live filesystem, evictions eventually happen, and it would be useful for a sysadmin to have an exact counter to monitor them and take action if needed.

      I will propose a patch that adds an eviction counter to obd_device, increments it each time an eviction occurs, and exposes it through lctl get_param.
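
To illustrate how such a counter could be used for monitoring, here is a minimal Python sketch that compares two samples and reports the targets whose count grew. It assumes the counter appears as a per-target parameter named `eviction_count` and that `lctl get_param` prints one `name=value` line per target; the parameter name, target names, and values below are hypothetical and not taken from this ticket.

```python
def parse_get_param(text):
    """Parse `name=value` lines (the usual `lctl get_param` layout) into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        name, _, value = line.partition("=")
        counters[name.strip()] = int(value.strip())
    return counters


def eviction_deltas(previous, current):
    """Return the targets whose counter grew since the previous sample."""
    deltas = {}
    for name, value in current.items():
        grown = value - previous.get(name, 0)
        if grown > 0:
            deltas[name] = grown
    return deltas


# Hypothetical samples: the parameter name and the values are illustrative only.
SAMPLE_BEFORE = """\
obdfilter.lustre-OST0000.eviction_count=3
obdfilter.lustre-OST0001.eviction_count=0
"""
SAMPLE_AFTER = """\
obdfilter.lustre-OST0000.eviction_count=5
obdfilter.lustre-OST0001.eviction_count=0
"""

if __name__ == "__main__":
    before = parse_get_param(SAMPLE_BEFORE)
    after = parse_get_param(SAMPLE_AFTER)
    for target, grown in eviction_deltas(before, after).items():
        print(f"{target}: {grown} new eviction(s)")
```

A counter that goes down between samples (for example if it were reset when a target restarts, which is an assumption here) is simply not reported by this sketch.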

          Activity

            [LU-14111] Report per-target eviction count

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/52098/
            Subject: LU-14111 tests: only support recovery-small test 146 for 2.15.54+
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: b034dd27dd39483e40f91ea82d3f5c62b514ec54

            gerrit Gerrit Updater added a comment

            "James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52098
            Subject: LU-14111 tests: only support recovery-small test 146 for 2.15.54+
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 77f9f3232f685b10518596ac69f2961ba7c342fa

            gerrit Gerrit Updater added a comment

            I see a discussion of YAML output for various parameters. I'm working on a YAML netlink version for the debugfs issue. It should provide the ability to express any stats in YAML format when requested.

            simmonsja James A Simmons added a comment

            New maloo test fails in interop with 2.15

            simmonsja James A Simmons added a comment

            Landed for 2.16

            pjones Peter Jones added a comment

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/40528/
            Subject: LU-14111 obdclass: count eviction per obd_device
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 3c69d46e1766480c0ffd1bef840b4e167b4cf88e

            gerrit Gerrit Updater added a comment

            I don't know if there was ever any "official" documentation to that effect, but at WC YAML has definitely been adopted as the standard format for new "complex" parameter files ever since IML started to be developed.

            From my experience, it is possible to create YAML-compliant files (I use http://yaml-online-parser.appspot.com/ to verify this) that are both machine readable and human readable. Examples of "new" complex files include osc.*.import and obdfilter.*.exports.*.export, obdfilter.*.job_stats, obdfilter.*.lfsck_status, and others. Also, there is a "lfs getstripe --yaml" option for dumping file layouts in YAML format, and "lctl --device MGS llog_print $fsname-client" and "lctl --device MGS llog_print params" to dump the config records.

            Ideally, we could also convert old "complex" files (e.g. brw_stats, but with a new filename) over to YAML format as well, but that hasn't happened yet.

            adilger Andreas Dilger added a comment

            (moving this out of the patch review as this is not related)

            The upstream kernel folks are just starting to come to this realization, and trying to do crazy things like adding a syscall to read from an array of fd's at the same time, but it is just more efficient to have all of the values in a single file that is formatted for easy parsing (YAML).

            For a while I have been thinking that many get_param entries have a YAML-like or almost YAML-compatible syntax and that this was a good path forward, but I have never seen any commitment or official recommendation that this is the way to go. Should those params be made YAML-compatible as much as possible?

             

            degremoa Aurelien Degremont (Inactive) added a comment

            But the patch tracks that on the server side, not the client side.

             

            I was wondering where a good place to report that data on the server side would be. As I understood it, the move away from /proc is pushing in a direction where new data should have its own /sys entry rather than being added to a more complex output file.

             

            degremoa Aurelien Degremont (Inactive) added a comment - edited

            Actually, this information is already in osc.*.import:

                connection:
                   failover_nids: [ 192.168.20.1@tcp ]
                   current_connection: 192.168.20.1@tcp
                   connection_attempts: 2623
                   generation: 2623
                   in-progress_invalidations: 0
                   idle: 0 sec
            

            so it makes sense to include this information there.

            adilger Andreas Dilger added a comment
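
As a rough illustration of how a script could read the `connection:` block quoted in the comment above, the following Python sketch parses that one-level layout into a dictionary. The parser is hand-rolled with the standard library only and handles just this simple shape, not full YAML; a real tool would more likely use a YAML library. The function name `parse_flat_block` is made up for this example.

```python
def parse_flat_block(text):
    """Parse a `section:` line followed by indented `key: value` lines.

    Only the simple one-level shape of the quoted block is handled;
    values are kept as strings and this is not a general YAML parser.
    """
    result = {}
    section = None
    for raw in text.splitlines():
        if not raw.strip():
            continue
        key, _, value = raw.strip().partition(":")
        value = value.strip()
        if not raw[0].isspace():
            # Unindented line: starts a new section such as `connection:`.
            section = result.setdefault(key, {})
        elif section is not None:
            # Indented line: a key/value pair inside the current section.
            section[key] = value
    return result


# The block quoted in the comment above, copied as-is.
IMPORT_SAMPLE = """\
connection:
   failover_nids: [ 192.168.20.1@tcp ]
   current_connection: 192.168.20.1@tcp
   connection_attempts: 2623
   generation: 2623
   in-progress_invalidations: 0
   idle: 0 sec
"""

if __name__ == "__main__":
    connection = parse_flat_block(IMPORT_SAMPLE)["connection"]
    print("connection attempts:", int(connection["connection_attempts"]))
    print("generation:", int(connection["generation"]))
```

Splitting on the first colon only keeps values that themselves contain colons intact.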

            People

              degremoa Aurelien Degremont (Inactive)
              degremoa Aurelien Degremont (Inactive)