[LU-3468] Add UID/GID into RPC request - Whamcloud Community JIRA

Details

Type: Improvement
Resolution: Won't Fix
Priority: Minor
Fix Version/s: None
Affects Version/s: None
Labels:
- ptr

Rank (Obsolete):
8678

Description

We are interested in implementing UID/GID-based NRS policies to see what we can gain. To do this, the UID/GID of the process that triggers each RPC must be added to the request body. We implemented this by storing the UID/GID in the padding of the request body, and then built a 'UID/GID Round Robin' policy by modifying the CRRN policy (see the attached patch). We know this is not a good implementation, though it works fine for testing. We also know that a good one is not easy to implement, because user IDs must be handled globally across the entire cluster. Any advice or ideas? Thanks!
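For readers unfamiliar with the idea, the round-robin behaviour described above can be sketched as a toy model. This is not the attached patch and not Lustre code: the `Request` class, its `uid`/`gid` fields, and the `UidRoundRobin` class are hypothetical names chosen for illustration, and the sketch assumes each request already carries the UID of the process that issued it.

```python
from collections import OrderedDict, deque
from dataclasses import dataclass


@dataclass
class Request:
    """Toy stand-in for an RPC; the uid/gid fields are hypothetical."""
    uid: int
    gid: int
    name: str


class UidRoundRobin:
    """Serve one request per UID in turn, FIFO within each UID."""

    def __init__(self):
        # uid -> deque of pending requests; dict order is the rotation order
        self._queues = OrderedDict()

    def enqueue(self, req):
        self._queues.setdefault(req.uid, deque()).append(req)

    def dequeue(self):
        if not self._queues:
            return None
        # Take the UID at the front of the rotation.
        uid, queue = next(iter(self._queues.items()))
        req = queue.popleft()
        if queue:
            # This UID still has work: send it to the back of the rotation.
            self._queues.move_to_end(uid)
        else:
            del self._queues[uid]
        return req


def drain(policy):
    """Dequeue until empty and return the request names in service order."""
    order = []
    while (req := policy.dequeue()) is not None:
        order.append(req.name)
    return order


policy = UidRoundRobin()
for uid, name in [(1000, "a1"), (1000, "a2"), (1000, "a3"),
                  (2000, "b1"), (2000, "b2")]:
    policy.enqueue(Request(uid=uid, gid=uid, name=name))

served = drain(policy)
print(served)
```

In this model a user who queues many requests cannot starve a user who queues few, which is the fairness property the policy is after. The hard part raised in the description, agreeing on what a UID means across the whole cluster, is outside what the sketch covers.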

Attachments

lustre-nrs-urr.patch
5 kB
13/Jun/13 4:51 AM

Issue Links

is related to

LU-16077 Cannot use tbf to filter brw request per effective uid/gid, inode attr ids is used instead

Resolved

Activity

Peter Jones added a comment - 14/Jan/14 2:04 AM

ok thanks Li Xi!

Li Xi (Inactive) added a comment - 14/Jan/14 12:40 AM

Hi Jeff,

This is an earlier ticket than TBF. TBF now has JobStats support, which I think can cover most use cases of a UID/GID-based RPC scheduler. I would be fine with this ticket being closed.

Thank you!

Jeff Layton (Inactive) added a comment - 13/Jan/14 4:21 PM

Has there been any further development on this patch? How does it compare to TBF (LU-3558)? Thanks!

Li Xi (Inactive) added a comment - 14/Jun/13 1:30 AM

Hi Andreas,

Thank you so much for the advice! It is really helpful!

Andreas Dilger added a comment - 13/Jun/13 9:50 AM

I would suggest a couple of different things:

- The JobStats information would be a very good way of handling this, and it would allow prioritizing RPC processing between different batch jobs as well as between batch and interactive (e.g. with JobID==batch and without==interactive).
- The OST and MDT RPCs already contain space for the UID/GID in each of the RPCs (struct obdo and struct mdt_body). That makes it a bit more complex to process the RPCs for NRS, but the ORR policy is already looking into the RPC request to determine the OST object ID and offsets. I'm not sure if the uid/gid fields are always filled in for all OST/MDT RPCs, but they could be.
- Alternately, it might be enough to do round-robin over the UID/GID of the objects being accessed? It wouldn't be 100% fair in every case, but would work for the large majority of cases and would avoid the need to change the network protocol just for this.

In the long term, I'd prefer to develop only a small number of policies that are more sophisticated. Having separate policies for each "parameter" means that it will be difficult to get the best overall performance. Separate UID/GID policies will allow load balancing between users, but will not optimize the IO ordering like ORR.

It would be better to have a single NRS policy that can do many things at once, like balance between nodes, users, jobs, sort RPCs within objects, both round-robin and constrained with upper and lower limits for bandwidth or IOPS.
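As an illustration only, the idea of one policy doing several things at once can be sketched as a toy model that balances round-robin across a configurable class (user, job, and so on) while sorting requests by object and offset within each class. Nothing here comes from Lustre or from this ticket: `IoRequest`, `BalancedSortedPolicy`, and the `classify` parameter are hypothetical names, and the bandwidth/IOPS limits mentioned above are left out to keep the sketch short.

```python
import heapq
from collections import OrderedDict
from typing import Callable, Hashable, NamedTuple, Optional


class IoRequest(NamedTuple):
    """Toy I/O request; all fields are hypothetical."""
    uid: int
    job_id: str
    object_id: int
    offset: int


class BalancedSortedPolicy:
    """Round-robin across classes, (object_id, offset) order within a class."""

    def __init__(self, classify: Callable[[IoRequest], Hashable]):
        self._classify = classify    # picks the balancing key: uid, job_id, ...
        self._heaps = OrderedDict()  # class key -> heap of pending requests
        self._seq = 0                # tie-breaker so requests are never compared

    def enqueue(self, req: IoRequest) -> None:
        heap = self._heaps.setdefault(self._classify(req), [])
        heapq.heappush(heap, (req.object_id, req.offset, self._seq, req))
        self._seq += 1

    def dequeue(self) -> Optional[IoRequest]:
        if not self._heaps:
            return None
        # Take the class at the front of the rotation.
        key, heap = next(iter(self._heaps.items()))
        # Serve that class's lowest (object_id, offset) request.
        req = heapq.heappop(heap)[-1]
        if heap:
            self._heaps.move_to_end(key)
        else:
            del self._heaps[key]
        return req


# Balance between users; swapping the lambda for `lambda r: r.job_id`
# would balance between jobs with the same code.
by_user = BalancedSortedPolicy(classify=lambda r: r.uid)
for req in [
    IoRequest(1000, "job-a", 7, 8192),
    IoRequest(1000, "job-a", 7, 0),
    IoRequest(2000, "job-b", 3, 4096),
    IoRequest(1000, "job-a", 7, 4096),
]:
    by_user.enqueue(req)

order = []
while (r := by_user.dequeue()) is not None:
    order.append((r.uid, r.offset))
print(order)
```

The point of the design is that the balancing key and the within-class ordering are independent, so fairness between users does not have to give up the I/O ordering that a policy like ORR provides.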

Li Xi (Inactive) added a comment - 13/Jun/13 8:21 AM

Is JobStats suitable for this purpose?
https://jira.hpdd.intel.com/browse/LU-694
