[LU-11968] tbf QOS gid rules not being enforced Created: 13/Feb/19 Updated: 02/Mar/19 Resolved: 28/Feb/19 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.12.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Mahmoud Hanafi | Assignee: | Li Xi |
| Resolution: | Not a Bug | Votes: | 0 |
| Labels: | None | ||
| Attachments: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Setting up TBF QoS for ost_io like this:
lctl set_param ost.OSS.ost_io.nrs_policies="tbf gid"
lctl set_param ost.OSS.ost_io.nrs_tbf_rule="start viz gid={1128} rate=10000"
lctl set_param ost.OSS.ost_io.nrs_tbf_rule="start css gid={1125} rate=10000"
lctl set_param ost.OSS.ost_io.nrs_tbf_rule="change default rate=100"
Running IOR as a user in group css, writes are correctly throttled by the css rule, but reads are throttled by the default rate: lowering the default rate decreases read bandwidth, and raising it increases read bandwidth. IOR gets 8 GB/s for writes and 1.4 GB/s for reads. It should be 8 GB/s for both reads and writes.
srv2 /sys/kernel/debug/lustre/ost/OSS/ost_io # cat nrs_tbf_rule
regular_requests:
CPT 0:
css {1125} 10000, ref 1
viz {1128} 10000, ref 0
default {*} 100, ref 1
CPT 1:
css {1125} 10000, ref 1
viz {1128} 10000, ref 0
default {*} 100, ref 1
CPT 2:
css {1125} 10000, ref 1
viz {1128} 10000, ref 0
default {*} 100, ref 1
CPT 3:
css {1125} 10000, ref 1
viz {1128} 10000, ref 0
default {*} 100, ref 1
CPT 4:
css {1125} 10000, ref 1
viz {1128} 10000, ref 0
default {*} 100, ref 1
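The per-CPT listing above can be condensed with a small helper. This is only a sketch: the function name `tbf_rule_refs` is hypothetical, and it assumes each rule line has the form `<name> <id-list> <rate>, ref <count>` as shown in this ticket. It also assumes the `ref` column counts active users of a rule, which the ticket does not state explicitly.

```shell
# Sum the "ref" counters per rule across all CPTs.
# Reads nrs_tbf_rule output on stdin; prints "<rule> <total_ref>" per line,
# in the order the rules first appear.
tbf_rule_refs() {
    awk '
        $4 == "ref" {
            if (!($1 in total)) order[++n] = $1   # remember first-seen order
            total[$1] += $5
        }
        END {
            for (i = 1; i <= n; i++) print order[i], total[order[i]]
        }
    '
}

# Example (path taken from the ticket):
#   tbf_rule_refs < /sys/kernel/debug/lustre/ost/OSS/ost_io/nrs_tbf_rule
```

Under those assumptions, a non-zero total for `default` while only css traffic is running would be consistent with the symptom reported here, where some requests fall through to the default rule.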
|
| Comments |
| Comment by Mahmoud Hanafi [ 13/Feb/19 ] |
|
I have attached debug logs that show the switch from write to read and how it picks the wrong rule. |
| Comment by Peter Jones [ 14/Feb/19 ] |
|
Li Xi, could you please investigate? Thanks, Peter |
| Comment by Li Xi [ 14/Feb/19 ] |
osc_build_rpc
cl_req_attr_set
coo_req_attr_set
vvp_req_attr_set
obdo_from_inode
dst->o_gid = from_kgid(&init_user_ns, src->i_gid);
osc_brw_prep_request
body->oa.o_uid = oa->o_uid;
body->oa.o_gid = oa->o_gid;
nrs_tbf_id_cli_set
ost_tbf_id_cli_set
id->ti_uid = body->oa.o_uid;
id->ti_gid = body->oa.o_gid;
obdo_from_inode
dst->o_uid = from_kuid(&init_user_ns, src->i_uid);
obdo_from_la
I am trying to reproduce this and am writing a debug patch. |
| Comment by Gerrit Updater [ 14/Feb/19 ] |
|
Li Xi (lixi@ddn.com) uploaded a new patch: https://review.whamcloud.com/34257 |
| Comment by Li Xi [ 14/Feb/19 ] |
|
Hi Mahmoud, would you mind applying patch 34257 and checking whether the rule is matched as expected? As far as I can test, there is nothing strange in my environment:
# lctl set_param ost.OSS.ost_io.nrs_policies="tbf gid"
# lctl set_param ost.OSS.ost_io.nrs_tbf_rule="start viz gid={1000} rate=10000"
# su - test
$ dd if=/mnt/lustre/file of=/dev/null bs=1048576
Feb 14 23:07:43 server17-el7-vm1 kernel: LustreError: 14327:0:(vvp_object.c:218:vvp_req_attr_set()) cra_type 0, gid 1000
Feb 14 23:07:43 server17-el7-vm1 kernel: LustreError: 14327:0:(osc_request.c:1383:osc_brw_prep_request()) gid 1000
Feb 14 23:07:43 server17-el7-vm1 kernel: LustreError: 14431:0:(nrs_tbf.c:1534:ost_tbf_id_cli_set()) ost_tbf_id_cli_set gid: 1000
Feb 14 23:07:43 server17-el7-vm1 kernel: LustreError: 14431:0:(nrs_tbf.c:263:nrs_tbf_rule_match()) rule [viz] matches ID [1000 |
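To compare the gid seen at each stage of the request path, the debug messages can be reduced to function/gid pairs. The helper below is a sketch with a hypothetical name (`gid_trace`). It assumes one kernel message per line, as in syslog, and that the messages follow the `(file:line:function()) ... gid[:] N` shape printed by the debug patch in this comment. These messages are not part of stock Lustre logging.

```shell
# Print "<function> <gid>" for every debug line that reports a gid.
# Lines without a "gid N" or "gid: N" field are skipped.
gid_trace() {
    sed -n 's/.*:\([A-Za-z_0-9]*\)()).*gid:\{0,1\} \([0-9][0-9]*\).*/\1 \2/p'
}

# Example:
#   dmesg | gid_trace
```

If the gid printed by the client-side functions differs from the one printed by `ost_tbf_id_cli_set`, or the server-side line is missing for reads, that would point at where the ID is lost before rule matching.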
| Comment by Mahmoud Hanafi [ 28/Feb/19 ] |
|
This is not a bug and can be closed. I was using a 2.11 client.
|
| Comment by Peter Jones [ 28/Feb/19 ] |
|
ok thanks |