Details

    • Improvement
    • Resolution: Unresolved
    • Major
    • None
    • None
    • 9223372036854775807

    Description

      We would like a way to dump current tbf stats.
      For example print all the UID being tracked and what are each UID bucket usage.

      Attachments

        Issue Links

          Activity

            [LU-13037] print stats for NRS TBF rules

            We have been using the patch. But I think it needs addition work to be more useful. We will need to think about how it could improved.

            mhanafi Mahmoud Hanafi added a comment - We have been using the patch. But I think it needs addition work to be more useful. We will need to think about how it could improved.
            pjones Peter Jones added a comment -

            mhanafi jaylan you have not provided any feedback as to whether this patch meets your requirements. However, rumour has it that you are carrying this patch - does this mean that you can now provide us some feedback as to whether this patch is useful and should proceed with landing?

            pjones Peter Jones added a comment - mhanafi jaylan you have not provided any feedback as to whether this patch meets your requirements. However, rumour has it that you are carrying this patch - does this mean that you can now provide us some feedback as to whether this patch is useful and should proceed with landing?
            pjones Peter Jones added a comment -

            jaylan any updates on testing this patch?

            pjones Peter Jones added a comment - jaylan any updates on testing this patch?
            qian_wc Qian Yingjin added a comment -

            I build it based on the latest master branch.
            It seems that tc_rpc_rate is __u32, while in 2.12.4, it was __u64

            To make it pass the build for 2.12.4, you just need to modify it with:

            seq_printf(p, "%llu\n", cli->tc_rpc_rate);
            

            Regards,
            Qian

            qian_wc Qian Yingjin added a comment - I build it based on the latest master branch. It seems that tc_rpc_rate is __u32, while in 2.12.4, it was __u64 To make it pass the build for 2.12.4, you just need to modify it with: seq_printf(p, "%llu\n" , cli->tc_rpc_rate); Regards, Qian

            I had compilation error in lustre-2.12.4 against CentOS 7.7 kernel :

            Making all in .
            /tmp/rpmbuild-lustre-jlan-UDyILlEP/BUILD/lustre-2.12.4/lustre/ptlrpc/nrs_tbf.c: In function 'nrs_tbf_stats_seq_show':
            /tmp/rpmbuild-lustre-jlan-UDyILlEP/BUILD/lustre-2.12.4/lustre/ptlrpc/nrs_tbf.c:3882:2: error: format '%u' expects argument of type 'unsigned int', but argument 3 has type '__u64' [-Werror=format=]
            seq_printf(p, "%u\n", cli->tc_rpc_rate);
            ^
            /tmp/rpmbuild-lustre-jlan-UDyILlEP/BUILD/lustre-2.12.4/lustre/ptlrpc/nrs_tbf.c: At top level:
            cc1: error: unrecognized command line option "-Wno-stringop-overflow" [-Werror]
            cc1: error: unrecognized command line option "-Wno-stringop-truncation" [-Werror]
            cc1: error: unrecognized command line option "-Wno-format-truncation" [-Werror]
            cc1: all warnings being treated as errors
            make[7]: *** [/tmp/rpmbuild-lustre-jlan-UDyILlEP/BUILD/lustre-2.12.4/lustre/ptlrpc/nrs_tbf.o] Error 1

            I do not know how I got those "unrecgonized command line option". Before applying this patch it was compiled fine.

            jaylan Jay Lan (Inactive) added a comment - I had compilation error in lustre-2.12.4 against CentOS 7.7 kernel : Making all in . /tmp/rpmbuild-lustre-jlan-UDyILlEP/BUILD/lustre-2.12.4/lustre/ptlrpc/nrs_tbf.c: In function 'nrs_tbf_stats_seq_show': /tmp/rpmbuild-lustre-jlan-UDyILlEP/BUILD/lustre-2.12.4/lustre/ptlrpc/nrs_tbf.c:3882:2: error: format '%u' expects argument of type 'unsigned int', but argument 3 has type '__u64' [-Werror=format=] seq_printf(p, "%u\n", cli->tc_rpc_rate); ^ /tmp/rpmbuild-lustre-jlan-UDyILlEP/BUILD/lustre-2.12.4/lustre/ptlrpc/nrs_tbf.c: At top level: cc1: error: unrecognized command line option "-Wno-stringop-overflow" [-Werror] cc1: error: unrecognized command line option "-Wno-stringop-truncation" [-Werror] cc1: error: unrecognized command line option "-Wno-format-truncation" [-Werror] cc1: all warnings being treated as errors make [7] : *** [/tmp/rpmbuild-lustre-jlan-UDyILlEP/BUILD/lustre-2.12.4/lustre/ptlrpc/nrs_tbf.o] Error 1 I do not know how I got those "unrecgonized command line option". Before applying this patch it was compiled fine.
            qian_wc Qian Yingjin added a comment -

            Hi Mahmoud,
            I just updated the patch (https://review.whamcloud.com/#/c/36950/), could you please try it again?

            Thanks,
            Qian

            qian_wc Qian Yingjin added a comment - Hi Mahmoud, I just updated the patch ( https://review.whamcloud.com/#/c/36950/ ), could you please try it again? Thanks, Qian
            lixi_wc Li Xi added a comment -

            mhanafi Sorry for the misunderstanding. I found a bug of the patch. Not sure whether that is the cause of the duplicated outputs. The patch will be refreshed soon anyway.

            lixi_wc Li Xi added a comment - mhanafi Sorry for the misunderstanding. I found a bug of the patch. Not sure whether that is the cause of the duplicated outputs. The patch will be refreshed soon anyway.

            I understand that we have hp and reg. But We get 2 request for the same cpt and queue_type type

             - uid:             929411059
              cpt:             0
              queue_type:      reg
              refs:            1
              rule:            default
            --
            - uid:             929411059
              cpt:             0
              queue_type:      reg
              refs:            1
              rule:            default
            -- 

            Yes Peter we would like to get this landed.

            mhanafi Mahmoud Hanafi added a comment - I understand that we have hp and reg. But We get 2 request for the same cpt and queue_type type - uid: 929411059 cpt: 0 queue_type: reg refs: 1 rule: default -- - uid: 929411059 cpt: 0 queue_type: reg refs: 1 rule: default -- Yes Peter we would like to get this landed.
            lixi_wc Li Xi added a comment - - edited

            Sorry for late reply.

            Why do we get more than 1 stats for a specific cpt and queue_type.

            Understood comparing to this, a single stats would be easier to understand. However, this is determined by the internal design and implementation of request handling of Lustre which have good reasons too. And TBF has no other choice but to use and depend on it.

            Lustre seperate requests into two different types, regular (reg) requests and high priority (hp) requests. And handling of these two types of requests are seperated in order to make sure high priority requests won't be blocked by many regular requests.

            And Lustre divide CPUs into different patitions (cpt). Each patition handles RPC requests independently.

            Because of these existing mechansim, TBF have to set the RPC rate limitations seperately (maybe with the same values), and the stats are seperated for each cpt and request type.

            lixi_wc Li Xi added a comment - - edited Sorry for late reply. Why do we get more than 1 stats for a specific cpt and queue_type. Understood comparing to this, a single stats would be easier to understand. However, this is determined by the internal design and implementation of request handling of Lustre which have good reasons too. And TBF has no other choice but to use and depend on it. Lustre seperate requests into two different types, regular (reg) requests and high priority (hp) requests. And handling of these two types of requests are seperated in order to make sure high priority requests won't be blocked by many regular requests. And Lustre divide CPUs into different patitions (cpt). Each patition handles RPC requests independently. Because of these existing mechansim, TBF have to set the RPC rate limitations seperately (maybe with the same values), and the stats are seperated for each cpt and request type.
            pjones Peter Jones added a comment -

            mhanafi I noticed this week that you are carrying this patch in your distribution. Sorry that we missed your question above. Have you had any other questions/comments about using this change? Do you think that we should proceed with landing it in its current form or is more work required?

            pjones Peter Jones added a comment - mhanafi I noticed this week that you are carrying this patch in your distribution. Sorry that we missed your question above. Have you had any other questions/comments about using this change? Do you think that we should proceed with landing it in its current form or is more work required?

            People

              lixi_wc Li Xi
              mhanafi Mahmoud Hanafi
              Votes:
              0 Vote for this issue
              Watchers:
              10 Start watching this issue

              Dates

                Created:
                Updated: