LU-10869

conf-sanity test 76a fails with 'error while apply max_dirty_mb'

    Description

      conf-sanity test_76a fails to apply ‘/usr/sbin/lctl set_param -P osc.*.max_dirty_mb=64’ on the client. The last output in the test_log is:

      Change MGS params
      max_dirty_mb: 32
      new_max_dirty_mb: 64
      CMD: trevis-50vm10 /usr/sbin/lctl set_param -P osc.*.max_dirty_mb=64
      CMD: trevis-52vm7.trevis.hpdd.intel.com /usr/sbin/lctl get_param -n osc.*.max_dirty_mb |
                         head -1
      CMD: trevis-52vm7.trevis.hpdd.intel.com /usr/sbin/lctl get_param -n osc.*.max_dirty_mb |
                         head -1
      Waiting 90 secs for update
      CMD: trevis-52vm7.trevis.hpdd.intel.com /usr/sbin/lctl get_param -n osc.*.max_dirty_mb |
                         head -1
      …
      CMD: trevis-52vm7.trevis.hpdd.intel.com /usr/sbin/lctl get_param -n osc.*.max_dirty_mb |
                         head -1
      Update not seen after 90s: wanted '64' got '32'
      32
       conf-sanity test_76a: @@@@@@ FAIL: error while apply max_dirty_mb
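
The "Waiting 90 secs for update" and "Update not seen after 90s" lines are what a poll-until-match loop prints when the client value never changes. Below is a minimal sketch of that pattern, assuming a one-second poll interval. The helper name wait_for_value and its arguments are hypothetical and are not the test framework's own code.

```shell
# Hypothetical helper: poll a command until the first line of its output
# equals the wanted value, or give up after MAX_SECS seconds.
# usage: wait_for_value WANTED MAX_SECS CMD [ARGS...]
wait_for_value() {
    local wanted=$1 max=$2 waited=0 got
    shift 2
    while :; do
        # Same shape as the test log: run the query and keep the first line.
        got=$("$@" | head -1)
        if [ "$got" = "$wanted" ]; then
            echo "Updated after ${waited}s: wanted '$wanted' got '$got'"
            return 0
        fi
        if [ "$waited" -ge "$max" ]; then
            echo "Update not seen after ${max}s: wanted '$wanted' got '$got'"
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
}
```

On a client, a call such as `wait_for_value 64 90 /usr/sbin/lctl get_param -n 'osc.*.max_dirty_mb'` would follow the same shape as the failing check above.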
      
      

       

      In the MDS console log and dmesg, we see that the command is issued on the MGS/MDS:

      [33955.881468] Lustre: DEBUG MARKER: trevis-50vm9.trevis.hpdd.intel.com: executing set_default_debug -1 all 4
      [33965.425226] Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P osc.*.max_dirty_mb=64
      [34056.950846] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  conf-sanity test_76a: @@@@@@ FAIL: error while apply max_dirty_mb
      

       

      This test previously failed with the same error while testing the patch for LU-9325, https://review.whamcloud.com/#/c/30539/.

       

      Logs for this failure are at

      https://testing.hpdd.intel.com/test_sets/a77f2e70-3543-11e8-95c0-52540065bddc

       

          Activity

            gerrit Gerrit Updater added a comment -

            John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/32471/
            Subject: LU-10869 build: package configuration files for Ubuntu / Debian
            Project: fs/lustre-release
            Branch: b2_10
            Current Patch Set:
            Commit: 74d56ecdf3d87c9baa1b8b9d67cd11617d5bea9c

            gerrit Gerrit Updater added a comment -

            Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/32471
            Subject: LU-10869 build: package configuration files for Ubuntu / Debian
            Project: fs/lustre-release
            Branch: b2_10
            Current Patch Set: 1
            Commit: d4004c95ef7b3cbbbe9e539524068a863abbdf11

            pjones Peter Jones added a comment -

            Landed for 2.12

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/31850/
            Subject: LU-10869 build: package configuration files for Ubuntu / Debian
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 8f221cf65b644d798493da489674abd2e2b7f23f

            utopiabound Nathaniel Clark added a comment -

            The patch looked good when I tested it locally.

            simmonsja James A Simmons added a comment -

            Nathaniel, can you take another look, please?

            simmonsja James A Simmons added a comment -

            Now that the proper udev rules are packaged for Ubuntu, the test should pass. Note that for lctl set_param -P to work, you need the following basic udev rule in 99-lustre.rules:

            SUBSYSTEM=="lustre", ACTION=="change", ENV{PARAM}=="?*", RUN+="/usr/sbin/lctl set_param $env{PARAM}=$env{SETTING}"
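
To make the rule's effect concrete, here is a small sketch of the command line that its RUN+= key would produce for this test. It assumes the change event carries PARAM=osc.*.max_dirty_mb and SETTING=64, which is inferred from the test command and not confirmed by the logs in this ticket. The helper name expand_lustre_rule is hypothetical.

```shell
# Hypothetical helper mirroring the RUN+= substitution in the udev rule.
# $1 stands in for $env{PARAM}, $2 for $env{SETTING}.
expand_lustre_rule() {
    param=$1
    setting=$2
    # ENV{PARAM}=="?*" only matches when PARAM is non-empty.
    [ -n "$param" ] || return 1
    printf '/usr/sbin/lctl set_param %s=%s\n' "$param" "$setting"
}

# Assumed event values for conf-sanity test_76a (quoted to avoid globbing).
expand_lustre_rule 'osc.*.max_dirty_mb' 64
```

Without a matching rule installed on the client, nothing runs this command in response to the event, which is consistent with the value staying at 32 in the failing run.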

            gerrit Gerrit Updater added a comment -

            James Simmons (uja.ornl@yahoo.com) uploaded a new patch: https://review.whamcloud.com/31850
            Subject: LU-10869 build: package configuration files for Ubuntu / Debian
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: d664a1f3b00d98a631dc57d262d9f9e89d7096ea

            utopiabound Nathaniel Clark added a comment -

            It looks like the client code does receive the parameter from the MGS:

            Sat Mar 31 05:54:52 2018 [18642] (obd_config.c:1057:process_param2_config()) Process entered
            Sat Mar 31 05:54:52 2018 [18642] (llog_swab.c:361:print_lustre_cfg()) Process entered
            Sat Mar 31 05:54:52 2018 [18642] (llog_swab.c:366:print_lustre_cfg()) lustre_cfg: ffff88004532f600
            Sat Mar 31 05:54:52 2018 [18642] (llog_swab.c:367:print_lustre_cfg())   lcfg->lcfg_version: 0x1cf60001
            Sat Mar 31 05:54:52 2018 [18642] (llog_swab.c:369:print_lustre_cfg())   lcfg->lcfg_command: 0xce032
            Sat Mar 31 05:54:52 2018 [18642] (llog_swab.c:370:print_lustre_cfg())   lcfg->lcfg_num: 0x0
            Sat Mar 31 05:54:52 2018 [18642] (llog_swab.c:371:print_lustre_cfg())   lcfg->lcfg_flags: 0x0
            Sat Mar 31 05:54:52 2018 [18642] (llog_swab.c:372:print_lustre_cfg())   lcfg->lcfg_nid: 0@<0:0>
            Sat Mar 31 05:54:52 2018 [18642] (llog_swab.c:374:print_lustre_cfg())   lcfg->lcfg_bufcount: 3
            Sat Mar 31 05:54:52 2018 [18642] (llog_swab.c:379:print_lustre_cfg())   lcfg->lcfg_buflens[0]: 19 *-ffff880049c30000
            Sat Mar 31 05:54:52 2018 [18642] (llog_swab.c:379:print_lustre_cfg())   lcfg->lcfg_buflens[1]: 22 osc.*.max_dirty_mb=64
            Sat Mar 31 05:54:52 2018 [18642] (llog_swab.c:379:print_lustre_cfg())   lcfg->lcfg_buflens[2]: 5 lctl
            Sat Mar 31 05:54:52 2018 [18642] (llog_swab.c:382:print_lustre_cfg()) Process leaving
            Sat Mar 31 05:54:52 2018 [18642] (obd_config.c:1090:process_param2_config()) Process leaving (rc=0 : 0 : 0)

            This may be a subtle regression from https://review.whamcloud.com/30143 (LU-9431), since the failing case takes the new code path. Or it may be an issue or difference with kobjects in Linux 4.4.

            pjones Peter Jones added a comment -

            Nathaniel

            Can you please investigate?

            Thanks

            Peter

            People

              utopiabound Nathaniel Clark
              jamesanunez James Nunez (Inactive)