[LU-10869] conf-sanity test 76a fails with 'error while apply max_dirty_mb' Created: 01/Apr/18  Updated: 19/Jun/18  Resolved: 19/Apr/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.11.0
Fix Version/s: Lustre 2.12.0, Lustre 2.10.5

Type: Bug Priority: Minor
Reporter: James Nunez (Inactive) Assignee: Nathaniel Clark
Resolution: Fixed Votes: 0
Labels: ubuntu

Issue Links:
Related
is related to LU-6063 conf-sanity test_76a fails on RHEL7, ... Resolved
is related to LU-9431 class_process_proc_param can't handle... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

conf-sanity test_76a fails to apply ‘/usr/sbin/lctl set_param -P osc.*.max_dirty_mb=64’: the client still reports the old value after 90 seconds. The last thing seen in the test_log is:

Change MGS params
max_dirty_mb: 32
new_max_dirty_mb: 64
CMD: trevis-50vm10 /usr/sbin/lctl set_param -P osc.*.max_dirty_mb=64
CMD: trevis-52vm7.trevis.hpdd.intel.com /usr/sbin/lctl get_param -n osc.*.max_dirty_mb |
                   head -1
CMD: trevis-52vm7.trevis.hpdd.intel.com /usr/sbin/lctl get_param -n osc.*.max_dirty_mb |
                   head -1
Waiting 90 secs for update
CMD: trevis-52vm7.trevis.hpdd.intel.com /usr/sbin/lctl get_param -n osc.*.max_dirty_mb |
                   head -1
…
CMD: trevis-52vm7.trevis.hpdd.intel.com /usr/sbin/lctl get_param -n osc.*.max_dirty_mb |
                   head -1
Update not seen after 90s: wanted '64' got '32'
32
 conf-sanity test_76a: @@@@@@ FAIL: error while apply max_dirty_mb
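
The repeated get_param calls in the log above amount to a poll-until-match loop with a timeout. The following is a minimal sketch of that pattern, not the actual Lustre test-framework code: the helper name wait_for_value, its arguments, and the success message are hypothetical and exist only for illustration.

```shell
# Hypothetical helper (NOT the Lustre test-framework implementation):
# poll a command once per second until its output equals the wanted
# value, or give up after the timeout.
wait_for_value() {
    cmd=$1        # command line to evaluate on each poll
    wanted=$2     # value expected once the update has propagated
    timeout=$3    # maximum number of seconds to keep polling
    elapsed=0
    while :; do
        got=$(eval "$cmd" 2>/dev/null)
        if [ "$got" = "$wanted" ]; then
            echo "Updated after ${elapsed}s: wanted '$wanted' got '$got'"
            return 0
        fi
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "Update not seen after ${timeout}s: wanted '$wanted' got '$got'"
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
}
```

On a client, a call such as `wait_for_value "/usr/sbin/lctl get_param -n 'osc.*.max_dirty_mb' | head -1" 64 90` would mirror the check that timed out here.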

 

In the MDS console log and dmesg, we see that the command was issued on the MGS/MDS:

[33955.881468] Lustre: DEBUG MARKER: trevis-50vm9.trevis.hpdd.intel.com: executing set_default_debug -1 all 4
[33965.425226] Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P osc.*.max_dirty_mb=64
[34056.950846] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  conf-sanity test_76a: @@@@@@ FAIL: error while apply max_dirty_mb

 

This test previously failed with the same error while testing the patch for LU-9325, https://review.whamcloud.com/#/c/30539/.

 

Logs for this failure are at

https://testing.hpdd.intel.com/test_sets/a77f2e70-3543-11e8-95c0-52540065bddc

 



 Comments   
Comment by James A Simmons [ 01/Apr/18 ]

Oh crap. I think I know why this fails. No udev rule is being packaged for Ubuntu.

Comment by Peter Jones [ 01/Apr/18 ]

Nathaniel

Can you please investigate?

Thanks

Peter

Comment by Nathaniel Clark [ 02/Apr/18 ]

It looks like the client code does get the info from the MGS:

Sat Mar 31 05:54:52 2018 [18642] (obd_config.c:1057:process_param2_config()) Process entered
Sat Mar 31 05:54:52 2018 [18642] (llog_swab.c:361:print_lustre_cfg()) Process entered
Sat Mar 31 05:54:52 2018 [18642] (llog_swab.c:366:print_lustre_cfg()) lustre_cfg: ffff88004532f600
Sat Mar 31 05:54:52 2018 [18642] (llog_swab.c:367:print_lustre_cfg())   lcfg->lcfg_version: 0x1cf60001
Sat Mar 31 05:54:52 2018 [18642] (llog_swab.c:369:print_lustre_cfg())   lcfg->lcfg_command: 0xce032
Sat Mar 31 05:54:52 2018 [18642] (llog_swab.c:370:print_lustre_cfg())   lcfg->lcfg_num: 0x0
Sat Mar 31 05:54:52 2018 [18642] (llog_swab.c:371:print_lustre_cfg())   lcfg->lcfg_flags: 0x0
Sat Mar 31 05:54:52 2018 [18642] (llog_swab.c:372:print_lustre_cfg())   lcfg->lcfg_nid: 0@<0:0>
Sat Mar 31 05:54:52 2018 [18642] (llog_swab.c:374:print_lustre_cfg())   lcfg->lcfg_bufcount: 3
Sat Mar 31 05:54:52 2018 [18642] (llog_swab.c:379:print_lustre_cfg())   lcfg->lcfg_buflens[0]: 19 *-ffff880049c30000
Sat Mar 31 05:54:52 2018 [18642] (llog_swab.c:379:print_lustre_cfg())   lcfg->lcfg_buflens[1]: 22 osc.*.max_dirty_mb=64
Sat Mar 31 05:54:52 2018 [18642] (llog_swab.c:379:print_lustre_cfg())   lcfg->lcfg_buflens[2]: 5 lctl
Sat Mar 31 05:54:52 2018 [18642] (llog_swab.c:382:print_lustre_cfg()) Process leaving
Sat Mar 31 05:54:52 2018 [18642] (obd_config.c:1090:process_param2_config()) Process leaving (rc=0 : 0 : 0)

This may be a subtle regression from https://review.whamcloud.com/30143 (LU-9431), since the failing case takes the new code path. Or maybe it is an issue or difference with kobjects in Linux 4.4?

Comment by Gerrit Updater [ 02/Apr/18 ]

James Simmons (uja.ornl@yahoo.com) uploaded a new patch: https://review.whamcloud.com/31850
Subject: LU-10869 build: package configuration files for Ubuntu / Debian
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: d664a1f3b00d98a631dc57d262d9f9e89d7096ea

Comment by James A Simmons [ 02/Apr/18 ]

Now that the proper udev rules are being packaged for Ubuntu, it should pass. Note that for lctl set_param -P to work, you need the following basic udev rule:

SUBSYSTEM=="lustre", ACTION=="change", ENV{PARAM}=="?*", RUN+="/usr/sbin/lctl set_param $env{PARAM}=$env{SETTING}"

This rule goes in 99-lustre.rules.
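
To make the rule's effect concrete, here is a minimal sketch of the matching and substitution it describes. It assumes, as the rule implies, that the event carries PARAM and SETTING in its environment. udev does the real work; the function name udev_rule_command is hypothetical and only mirrors the rule's match keys and RUN+= expansion.

```shell
# Hypothetical illustration only: udev itself performs this matching and
# substitution. The function mirrors the rule so the resulting command
# line can be inspected.
udev_rule_command() {
    subsystem=$1   # SUBSYSTEM of the event
    action=$2      # ACTION of the event
    param=$3       # ENV{PARAM}
    setting=$4     # ENV{SETTING}
    # Match keys: SUBSYSTEM=="lustre", ACTION=="change", ENV{PARAM}=="?*"
    [ "$subsystem" = "lustre" ] || return 1
    [ "$action" = "change" ] || return 1
    [ -n "$param" ] || return 1    # "?*" matches any non-empty string
    # RUN+= with $env{PARAM} and $env{SETTING} substituted
    echo "/usr/sbin/lctl set_param ${param}=${setting}"
}

# Print the command the rule would run for the parameter in this ticket.
udev_rule_command lustre change 'osc.*.max_dirty_mb' 64
```

If no such rule is installed on the client, nothing runs in response to the event, which would be consistent with the value staying at 32 in the failing test.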

Comment by James A Simmons [ 10/Apr/18 ]

Nathaniel, can you take another look, please?

Comment by Nathaniel Clark [ 11/Apr/18 ]

The patch looked good when I tested it locally.

Comment by Gerrit Updater [ 19/Apr/18 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/31850/
Subject: LU-10869 build: package configuration files for Ubuntu / Debian
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 8f221cf65b644d798493da489674abd2e2b7f23f

Comment by Peter Jones [ 19/Apr/18 ]

Landed for 2.12

Comment by Gerrit Updater [ 21/May/18 ]

Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/32471
Subject: LU-10869 build: package configuration files for Ubuntu / Debian
Project: fs/lustre-release
Branch: b2_10
Current Patch Set: 1
Commit: d4004c95ef7b3cbbbe9e539524068a863abbdf11

Comment by Gerrit Updater [ 11/Jun/18 ]

John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/32471/
Subject: LU-10869 build: package configuration files for Ubuntu / Debian
Project: fs/lustre-release
Branch: b2_10
Current Patch Set:
Commit: 74d56ecdf3d87c9baa1b8b9d67cd11617d5bea9c

Generated at Sat Feb 10 02:38:54 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.