[LU-12048] nrs_policies not being set after reboot. Created: 06/Mar/19  Updated: 27/Mar/19  Resolved: 27/Mar/19

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.12.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Mahmoud Hanafi
Assignee: Emoly Liu
Resolution: Fixed
Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

Setting nrs_policies with "lctl set_param -P" on the MGS does not reapply the setting when a server is rebooted. It seems to take effect only on the MGS node.

 

lctl set_param -P ost.OSS.ost_io.nrs_policies="tbf"

I checked the params log file, and it contains the setting:

 # llog_reader /tmp/params 
rec #1 type=10620000 len=224 offset 8192
rec #2 type=10620000 len=104 offset 8416
rec #3 type=10620000 len=224 offset 8520
rec #4 type=10620000 len=224 offset 8744
rec #5 type=10620000 len=104 offset 8968
rec #6 type=10620000 len=224 offset 9072
rec #7 type=10620000 len=224 offset 9296
rec #8 type=10620000 len=120 offset 9520
rec #9 type=10620000 len=224 offset 9640
rec #10 type=10620000 len=224 offset 9864
rec #11 type=10620000 len=128 offset 10088
rec #12 type=10620000 len=224 offset 10216
rec #13 type=10620000 len=224 offset 10440
rec #14 type=10620000 len=120 offset 10664
rec #15 type=10620000 len=224 offset 10784
Header size : 8192
Time : Wed Mar  6 11:05:31 2019
Number of records: 15
Target uuid : 
-----------------------
#01 (224)SKIP START marker   2 (flags=0x05, v2.12.0.0) general         'at_min' Tue Feb 26 12:28:57 2019-Tue Feb 26 12:29:08 2019
#02 (104)SKIP set_param 0:general  1:at_min=200  2:lctl  
#03 (224)SKIP END   marker   2 (flags=0x06, v2.12.0.0) general         'at_min' Tue Feb 26 12:28:57 2019-Tue Feb 26 12:29:08 2019
#04 (224)marker   3 (flags=0x01, v2.12.0.0) general         'at_min' Tue Feb 26 12:29:08 2019-
#05 (104)set_param 0:general  1:at_min=200  2:lctl  
#06 (224)END   marker   3 (flags=0x02, v2.12.0.0) general         'at_min' Tue Feb 26 12:29:08 2019-
#07 (224)SKIP START marker   2 (flags=0x05, v2.12.0.0) OSS             'ost.OSS.ost_io.nrs_policies' Wed Mar  6 10:48:11 2019-Wed Mar  6 10:50:01 2019
#08 (120)SKIP set_param 0:OSS  1:ost.OSS.ost_io.nrs_policies=tbf  2:lctl  
#09 (224)SKIP END   marker   2 (flags=0x06, v2.12.0.0) OSS             'ost.OSS.ost_io.nrs_policies' Wed Mar  6 10:48:11 2019-Wed Mar  6 10:50:01 2019
#10 (224)SKIP START marker   3 (flags=0x05, v2.12.0.0) OSS             'ost.OSS.ost_io.nrs_policies' Wed Mar  6 10:50:01 2019-Wed Mar  6 10:50:50 2019
#11 (128)SKIP set_param 0:OSS  1:ost.OSS.ost_io.nrs_policies=fifio  2:lctl  
#12 (224)SKIP END   marker   3 (flags=0x06, v2.12.0.0) OSS             'ost.OSS.ost_io.nrs_policies' Wed Mar  6 10:50:01 2019-Wed Mar  6 10:50:50 2019
#13 (224)marker   4 (flags=0x01, v2.12.0.0) OSS             'ost.OSS.ost_io.nrs_policies' Wed Mar  6 10:50:50 2019-
#14 (120)set_param 0:OSS  1:ost.OSS.ost_io.nrs_policies=tbf  2:lctl  
#15 (224)END   marker   4 (flags=0x02, v2.12.0.0) OSS             'ost.OSS.ost_io.nrs_policies' Wed Mar  6 10:50:50 2019-
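
The dump above is easier to read once the superseded (SKIP) records are filtered out. The following is a minimal sketch, not part of the original report, that extracts the set_param records from llog_reader text output and keeps the last non-skipped value for each parameter. The function names and the regular expression are illustrative assumptions based only on the line layout shown in this ticket; other llog_reader versions may print records differently.

```python
import re

# Matches llog_reader set_param lines as they appear in this ticket, e.g.:
#   #14 (120)set_param 0:OSS  1:ost.OSS.ost_io.nrs_policies=tbf  2:lctl
# The layout is inferred from the dump above and is an assumption.
SET_PARAM_RE = re.compile(
    r"^#(?P<index>\d+)\s*\(\d+\)\s*(?P<skip>SKIP\s+)?set_param\s+"
    r"0:(?P<target>\S+)\s+1:(?P<key>[^=\s]+)=(?P<value>.*?)\s+2:\S+\s*$"
)

# A few records copied from the dump in this ticket.
SAMPLE = """\
#02 (104)SKIP set_param 0:general  1:at_min=200  2:lctl
#05 (104)set_param 0:general  1:at_min=200  2:lctl
#08 (120)SKIP set_param 0:OSS  1:ost.OSS.ost_io.nrs_policies=tbf  2:lctl
#11 (128)SKIP set_param 0:OSS  1:ost.OSS.ost_io.nrs_policies=fifio  2:lctl
#14 (120)set_param 0:OSS  1:ost.OSS.ost_io.nrs_policies=tbf  2:lctl
"""


def parse_set_params(text):
    """Return one dict per set_param record found in llog_reader output."""
    records = []
    for line in text.splitlines():
        match = SET_PARAM_RE.match(line.strip())
        if match:
            records.append({
                "index": int(match.group("index")),
                "skipped": match.group("skip") is not None,
                "target": match.group("target"),
                "key": match.group("key"),
                "value": match.group("value"),
            })
    return records


def active_params(text):
    """Map each parameter to its last value that is not marked SKIP."""
    active = {}
    for record in parse_set_params(text):
        if not record["skipped"]:
            active[record["key"]] = record["value"]
    return active
```

Calling active_params(SAMPLE) returns only the at_min and nrs_policies entries that are still in effect, which matches the reading of the dump: the earlier "tbf" and "fifio" records are marked SKIP, and record #14 carries the current value.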

Also, running the following on the MGS should push the settings out to all servers:

lctl set_param -P ost.OSS.ost_io.nrs_policies="tbf"
lctl set_param -P ost.OSS.ost_io.nrs_tbf_rule="start css gid={1128} rate=10000"

 



 Comments   
Comment by James A Simmons [ 06/Mar/19 ]

I just tried it manually, without a reboot, and it looks like it works. Does the problem only appear after a reboot?

Comment by Mahmoud Hanafi [ 06/Mar/19 ]

What do you mean by "manually"?

Those settings should be applied correctly on all servers after a reboot.

Comment by Mahmoud Hanafi [ 06/Mar/19 ]

Is this related?

https://jira.whamcloud.com/browse/LU-10937

Comment by Emoly Liu [ 06/Mar/19 ]

Hi mhanafi,

I tried this locally; after unmounting and remounting my OST, it still works. Could you please try the following steps on your system?

  • run 'lctl set_param -P ost.OSS.ost_io.nrs_policies="tbf"' on the MGS
  • run 'lctl get_param ost.OSS.ost_io.nrs_policies' to make sure "tbf" is enabled correctly. This may take a few seconds.
  • reboot or umount/mount your OST and see if "tbf" is still enabled.

Please post the output here and upload MGS/OST logs if you still hit the issue.
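
Because the setting may take a few seconds to propagate, the check in the second step can be wrapped in a small polling loop. This is a minimal sketch under stated assumptions: wait_until and read_nrs_policies are hypothetical helper names, the only external command used is the "lctl get_param" invocation quoted above, and the test for whether "tbf" is enabled is left to a caller-supplied predicate because the get_param output format is not shown in this ticket.

```python
import subprocess
import time
from typing import Callable


def read_nrs_policies() -> str:
    """Return the raw output of the get_param command quoted in the steps above."""
    result = subprocess.run(
        ["lctl", "get_param", "ost.OSS.ost_io.nrs_policies"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def wait_until(read_value: Callable[[], str],
               accept: Callable[[str], bool],
               timeout_s: float = 30.0,
               interval_s: float = 1.0) -> bool:
    """Poll read_value() until accept() returns True or timeout_s elapses.

    The value is always read at least once, even when timeout_s is 0.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if accept(read_value()):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

On an OSS this could be called as wait_until(read_nrs_policies, my_check), where my_check is whatever test of the output you trust to mean that "tbf" is active. Running the same call again after the reboot or remount covers the third step.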

Thanks.

Comment by Mahmoud Hanafi [ 06/Mar/19 ]

OK, now I can't reproduce it. Is there a way I can clear out the params file on the MGS and retest this?

 

Comment by Emoly Liu [ 07/Mar/19 ]

mhanafi,

You can erase the configuration logs with the writeconf command. After that, when the servers restart, the configuration logs are regenerated and stored on the MGS (as in a new file system). To do this, however, you must unmount all the clients, MDTs, and OSTs, run the writeconf command on the MGS, MDT, and OST devices respectively, and then mount them again. Please note that all the parameters will be cleared, so this is a dangerous operation.

Are you sure you want to do this? Or would you rather just use "lctl set_param" to set nrs_policies back to fifo, and then redo the tbf test?

 

Comment by Mahmoud Hanafi [ 26/Mar/19 ]

We can close this case

Comment by Emoly Liu [ 27/Mar/19 ]

Thanks.

Comment by James A Simmons [ 27/Mar/19 ]

lctl set_param -P -d "cmd" will delete parameters on the MGS server.
