lctl: error invoking upcall /usr/sbin/lctl set_param *.*.lbug_on_grant_miscount=1

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.15.0
    • Upstream
    • None
    • 3
    • 9223372036854775807

    Description

      I'm getting many messages like this in the logs:
      lctl: error invoking upcall /usr/sbin/lctl set_param *.*.lbug_on_grant_miscount=1
      This was introduced in bb5d81ea95 ("LU-14543 target: prevent overflowing of tgd->tgd_tot_granted").
      I don't quite understand why this clearly debugging tunable needs to be persistent in the config logs.
      IMO, it would be better to have a node-wide, non-persistent tunable like osd's ldiskfs_track_declares_assert.

          Activity

            [LU-15095] lctl: error invoking upcall /usr/sbin/lctl set_param *.*.lbug_on_grant_miscount=1

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/46185/
            Subject: LU-15095 tests: skip lbug_on_grant_miscount on client
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 49e29f38343ce0389df0aecf308b0986de94c029

            (comment added by gerrit Gerrit Updater)

            "Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/46185
            Subject: LU-15095 tests: skip lbug_on_grant_miscount on client
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: c168c309860dbf9745af1105ed3b236c0e2ce89c

            (comment added by gerrit Gerrit Updater)

            Moving this patch to a module parameter is causing RHEL7.9 testing to fail 100% with:
            https://testing.whamcloud.com/test_sets/8a2e1a74-c60f-4e33-bbfa-2c3a7efa7e13

            ptlrpc/ptlrpc options: 'lbug_on_grant_miscount=1'
            [console] Lustre: Lustre: Build Version: 2.14.56_68_g5914687
            [console] ptlrpc: Unknown parameter `lbug_on_grant_miscount'
            modprobe: ERROR: could not insert 'ptlrpc': Unknown symbol in module, or unknown parameter (see dmesg)
            

            The build version is the same on the client and server, so it isn't a case of an old build being used on the client.

            I think the problem is that this is a client-only build being tested, and the module parameter is only for the server, so it just doesn't exist on the el7.9 client. I think the test-framework needs to be changed to only set this parameter on the OSS and MDS and not the client nodes.

            (comment added by adilger Andreas Dilger)
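
The suggestion above (pass the option only on OSS and MDS nodes, never on clients) comes down to a per-role decision. The Lustre test-framework itself is shell; the C sketch below only illustrates that decision under stated assumptions. The enum node_role and the function ptlrpc_opts_for() are hypothetical names, not part of Lustre.

```c
#include <string.h>

/* Hypothetical node roles; the real test-framework tracks these differently. */
enum node_role { NODE_CLIENT, NODE_MDS, NODE_OSS };

/*
 * Return the option string to pass to the ptlrpc module on a node of the
 * given role. The server-only parameter is omitted on clients, where a
 * client-only build does not define it and modprobe would fail.
 */
const char *ptlrpc_opts_for(enum node_role role)
{
	switch (role) {
	case NODE_MDS:
	case NODE_OSS:
		return "lbug_on_grant_miscount=1";
	default:
		return "";
	}
}
```

With this shape, a client node gets an empty option string and never sees the unknown parameter.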
            pjones Peter Jones added a comment -

            Landed for 2.15


            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45521/
            Subject: LU-15095 target: lbug_on_grant_miscount module parameter
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 2c787065441ee60c6c163dc77851d0964f81a89c

            (comment added by gerrit Gerrit Updater)

            "Vladimir Saveliev <vlaidimir.saveliev@hpe.com>" uploaded a new patch: https://review.whamcloud.com/45521
            Subject: LU-15095 target: lbug_on_grant_miscount module parameter
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 91f914fe01c71280523c5fee3bf2d31db593c9e5

            (comment added by gerrit Gerrit Updater)

            Do you mean to make it a parameter of the ptlrpc module?

            yes

            (comment added by bzzz Alex Zhuravlev)

            In some cases this parameter is set (written) into the config a few times.

            To reproduce, it should be enough to run llmount.sh locally:

            [   30.040858] Lustre: Modifying parameter general.*.*.lbug_on_grant_miscount in log params
            [   30.161330] LustreError: 5029:0:(obd_config.c:1326:process_param2_config()) lctl: error invoking upcall /usr/sbin/lctl set_param *.*.lbug_on_grant_miscount=1: rc = -2; time 191us
            
            (comment added by bzzz Alex Zhuravlev)
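
The rc = -2 in the console message above is a negated errno value. On Linux, errno 2 is ENOENT (no such file or directory), so the upcall apparently could not find something it needed. The thread does not say whether that was the lctl binary or the parameter itself. Below is a minimal userspace sketch for decoding such return codes. The helper names rc_name() and print_rc() are made up for illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Map a kernel-style negative return code to a symbolic name (a few common cases only). */
const char *rc_name(int rc)
{
	switch (-rc) {
	case ENOENT: return "ENOENT";
	case EACCES: return "EACCES";
	case EINVAL: return "EINVAL";
	default:     return "unknown";
	}
}

/* Print the code, its symbolic name, and the C library's description of it. */
void print_rc(int rc)
{
	printf("rc = %d (%s: %s)\n", rc, rc_name(rc), strerror(-rc));
}
```

Calling print_rc(-2) prints the symbolic name next to the C library's description, which is quicker than looking the number up by hand.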

            Do we really need to save this to the log? Why not use a variable like cfs_fail_loc or ldiskfs_track_declares_assert?

            Do you mean to make it a parameter of the ptlrpc module?

            int ldiskfs_track_declares_assert;
            module_param(ldiskfs_track_declares_assert, int, 0644);
            

            It sounds like a good idea, thanks.

            (comment added by vsaveliev Vladimir Saveliev)
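
In the kernel, module_param(name, type, perm) exposes a variable as a module option, and with permissions 0644 it also appears under /sys/module/<module>/parameters/. Kernel code cannot run standalone, so the userspace sketch below only models the idea discussed here: a node-wide flag, held in memory rather than in the config logs, decides whether a detected grant miscount is fatal. The function grant_check(), the enum grant_action, and the comparison itself are assumptions for illustration and are not taken from the Lustre source.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the module parameter: node-wide, and reset whenever the module is reloaded. */
int lbug_on_grant_miscount;

enum grant_action { GRANT_OK, GRANT_WARN, GRANT_LBUG };

/*
 * Hypothetical check: compare a cached total against a recomputed sum.
 * On mismatch, the flag selects between a warning and a fatal assertion
 * (modelled here as a return value instead of an actual LBUG).
 */
enum grant_action grant_check(uint64_t tot_granted, uint64_t recomputed)
{
	if (tot_granted == recomputed)
		return GRANT_OK;

	fprintf(stderr, "grant miscount: tot_granted=%llu recomputed=%llu\n",
		(unsigned long long)tot_granted,
		(unsigned long long)recomputed);

	return lbug_on_grant_miscount ? GRANT_LBUG : GRANT_WARN;
}
```

Because the flag lives only in memory, it has to be set again after a failover or module reload, which is the trade-off raised earlier in the thread.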
            vsaveliev Vladimir Saveliev added a comment (edited)

            Do we want lbug_on_grant_miscount set across reboots?

            Without that the parameter lbug_on_grant_miscount would get turned off in tests which include failover.
            That is, it was made permanent intentionally.

            I did not get "lctl: error invoking upcall" in my tests and will debug the issue.

             


            People

              Assignee: vsaveliev Vladimir Saveliev
              Reporter: bzzz Alex Zhuravlev