[LU-12759] parameter grant_shrink gets reset to 1 after client reconnects
Created: 13/Sep/19  Updated: 17/Jan/20  Resolved: 12/Nov/19

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.14.0, Lustre 2.12.4

Type: Bug
Priority: Minor
Reporter: Alexander Zarochentsev
Assignee: Alexander Zarochentsev
Resolution: Fixed
Votes: 0
Labels: None

Issue Links:
Related
is related to LU-12120 LustreError: 15069:0:(tgt_grant.c:561... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

It is impossible to keep osc.*.grant_shrink set to 0: the value gets reset to 1 after the client reconnects to an OST, regardless of whether it was set with plain lctl set_param or with lctl set_param -P.

[root@devvm1 tests]# ../utils/lctl get_param osc.*.grant_shrink
osc.lustre-OST0000-osc-ffff880043e43800.grant_shrink=1
osc.lustre-OST0001-osc-ffff880043e43800.grant_shrink=1
[root@devvm1 tests]# ../utils/lctl set_param osc.*.grant_shrink=0
osc.lustre-OST0000-osc-ffff880043e43800.grant_shrink=0
osc.lustre-OST0001-osc-ffff880043e43800.grant_shrink=0

Wait some time for the idle OSTs to get disconnected,
then write to files to get the OSTs connected again:

[root@devvm1 tests]# dd if=/dev/zero bs=1M count=1 of=/mnt/lustre/foo1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00950013 s, 110 MB/s
[root@devvm1 tests]# dd if=/dev/zero bs=1M count=1 of=/mnt/lustre/foo2
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.0131701 s, 79.6 MB/s

Now the previous settings are gone:

[root@devvm1 tests]# ../utils/lctl get_param osc.*.grant_shrink
osc.lustre-OST0000-osc-ffff880043e43800.grant_shrink=1
osc.lustre-OST0001-osc-ffff880043e43800.grant_shrink=1
[root@devvm1 tests]# 
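
The reproduction steps above can be scripted roughly as follows. This is a minimal sketch, not part of the original report: the function names `check_grant_shrink_off` and `reproduce`, the `LCTL`, `MNT`, and `IDLE_WAIT` variables, and the 30-second default wait are assumptions for illustration. The wait only needs to exceed the idle disconnect time on the client.

```shell
#!/bin/bash
# Sketch: check whether grant_shrink stays disabled across an idle reconnect.
# LCTL, MNT and IDLE_WAIT are illustrative defaults, not values from the report.
LCTL=${LCTL:-lctl}
MNT=${MNT:-/mnt/lustre}
IDLE_WAIT=${IDLE_WAIT:-30}   # seconds; must exceed the OSC idle disconnect time

# Read "name=value" lines (as printed by lctl get_param) on stdin.
# Print every grant_shrink parameter whose value is not 0.
# Returns 0 only if at least one parameter was seen and all of them are 0.
check_grant_shrink_off() {
    awk -F= '
        $1 ~ /\.grant_shrink$/ {
            seen = 1
            if ($2 != "0") { print $1; bad = 1 }
        }
        END { exit (bad || !seen) }'
}

# Follow the steps from the description: disable grant shrinking,
# let the idle OSCs disconnect, write to force a reconnect, then re-check.
# Returns 2 if a step could not be run at all.
reproduce() {
    "$LCTL" set_param 'osc.*.grant_shrink=0' || return 2
    sleep "$IDLE_WAIT"
    dd if=/dev/zero bs=1M count=1 of="$MNT/foo1" || return 2
    dd if=/dev/zero bs=1M count=1 of="$MNT/foo2" || return 2
    "$LCTL" get_param 'osc.*.grant_shrink' | check_grant_shrink_off
}
```

Running `reproduce` as root on a client would be expected to return non-zero and list the affected OSC parameters when the bug is present, and return 0 when the setting survives the reconnect.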


 Comments   
Comment by Gerrit Updater [ 13/Sep/19 ]

Alexander Zarochentsev (c17826@cray.com) uploaded a new patch: https://review.whamcloud.com/36177
Subject: LU-12759 osc: don't re-enable grant shrink on reconnect
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: a1a723ce63dab1cf3e0305ee3556eaa308655f0b

Comment by Alexander Zarochentsev [ 13/Sep/19 ]

With the fix, grant shrinking remains disabled after reconnecting to the OSTs:

[root@devvm1 tests]# ../utils/lctl get_param osc.*.grant_shrink
osc.lustre-OST0000-osc-ffff880045d29800.grant_shrink=1
osc.lustre-OST0001-osc-ffff880045d29800.grant_shrink=1
[root@devvm1 tests]# ../utils/lctl set_param osc.*.grant_shrink=0
osc.lustre-OST0000-osc-ffff880045d29800.grant_shrink=0
osc.lustre-OST0001-osc-ffff880045d29800.grant_shrink=0
[root@devvm1 tests]# ../utils/lctl get_param osc.*.grant_shrink
osc.lustre-OST0000-osc-ffff880045d29800.grant_shrink=0
osc.lustre-OST0001-osc-ffff880045d29800.grant_shrink=0
[root@devvm1 tests]# dmesg | tail 
[14492.674788] Lustre: lustre-OST0001: new disk, initializing
[14494.139279] Lustre: lustre-MDT0000: Connection restored to f7ad677a-8d53-4 (at 192.168.56.101@tcp)
[14494.139286] Lustre: Skipped 6 previous similar messages
[14494.168439] Lustre: Mounted lustre-client
[14495.540432] Lustre: DEBUG MARKER: Using TIMEOUT=20
[14502.268274] Lustre: lustre-OST0001: Connection restored to f7ad677a-8d53-4 (at 192.168.56.101@tcp)
[14502.268282] Lustre: Skipped 3 previous similar messages
[14502.286601] Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x00000002c0000400-0x0000000300000400]:1:ost
[14502.286799] Lustre: cli-lustre-OST0001-super: Allocated super-sequence [0x00000002c0000400-0x0000000300000400]:1:ost]
[14517.305391] Lustre: lustre-OST0000-osc-ffff880045d29800: disconnect after 22s idle
[root@devvm1 tests]# dd if=/dev/zero bs=1M count=1 of=/mnt/lustre/foo1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.0110673 s, 94.7 MB/s
[root@devvm1 tests]# dd if=/dev/zero bs=1M count=1 of=/mnt/lustre/foo2
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.014042 s, 74.7 MB/s
[root@devvm1 tests]# dmesg | tail 
[14495.540432] Lustre: DEBUG MARKER: Using TIMEOUT=20
[14502.268274] Lustre: lustre-OST0001: Connection restored to f7ad677a-8d53-4 (at 192.168.56.101@tcp)
[14502.268282] Lustre: Skipped 3 previous similar messages
[14502.286601] Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x00000002c0000400-0x0000000300000400]:1:ost
[14502.286799] Lustre: cli-lustre-OST0001-super: Allocated super-sequence [0x00000002c0000400-0x0000000300000400]:1:ost]
[14517.305391] Lustre: lustre-OST0000-osc-ffff880045d29800: disconnect after 22s idle
[14542.047684] Lustre: lustre-OST0000-osc-ffff880045d29800: reconnect after 25s idle
[14542.047940] Lustre: lustre-OST0000: Connection restored to f7ad677a-8d53-4 (at 192.168.56.101@tcp)
[14542.047944] Lustre: Skipped 1 previous similar message
[14543.718663] Lustre: lustre-OST0001-osc-ffff880045d29800: reconnect after 26s idle
[root@devvm1 tests]# ../utils/lctl get_param osc.*.grant_shrink
osc.lustre-OST0000-osc-ffff880045d29800.grant_shrink=0
osc.lustre-OST0001-osc-ffff880045d29800.grant_shrink=0
[root@devvm1 tests]# 
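
The check above relies on the kernel log to confirm that the OSCs really went through an idle disconnect and reconnect before grant_shrink was read back. A small helper can pull those reconnect events out of the log. This is a sketch only: the function name `reconnected_oscs` is hypothetical, and the message format is assumed to match the "reconnect after Ns idle" lines in the transcript above.

```shell
#!/bin/bash
# Sketch: list the OSC devices that logged an idle reconnect.
# Reads kernel log text on stdin and prints each matching device name once.
# The pattern is derived from the dmesg lines in the transcript, for example:
#   Lustre: lustre-OST0000-osc-ffff880045d29800: reconnect after 25s idle
reconnected_oscs() {
    sed -n 's/^.*Lustre: \([^ ]*-osc-[^ :]*\): reconnect after [0-9]*s idle.*$/\1/p' |
        sort -u
}
```

For example, `dmesg | reconnected_oscs` would print one line per OSC that reconnected. If the list is empty, the grant_shrink value read afterwards does not yet show whether the setting survives a reconnect.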
Comment by Gerrit Updater [ 12/Nov/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36177/
Subject: LU-12759 osc: don't re-enable grant shrink on reconnect
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: efa3425c5f5a6763ea834408b982e4df5a90c914

Comment by Peter Jones [ 12/Nov/19 ]

Landed for 2.14

Comment by Gerrit Updater [ 07/Jan/20 ]

Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/37152
Subject: LU-12759 osc: don't re-enable grant shrink on reconnect
Project: fs/lustre-release
Branch: b2_12
Current Patch Set: 1
Commit: 270fd0dc37e83fb19ccde7db1dfd197cd20d5c0a

Comment by Gerrit Updater [ 17/Jan/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37152/
Subject: LU-12759 osc: don't re-enable grant shrink on reconnect
Project: fs/lustre-release
Branch: b2_12
Current Patch Set:
Commit: 9ede55bc906fd033ccb9f40fb0a6a625e836bef3
