[LU-3555] job_stats "setattr" stats are 2x what they should be
Created: 03/Jul/13  Updated: 05/Jun/15  Resolved: 11/Jul/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.3.0, Lustre 2.4.1, Lustre 2.5.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Andreas Dilger
Assignee: Niu Yawei (Inactive)
Resolution: Not a Bug
Votes: 0
Labels: None
Environment:

"FSTYPE=zfs llmount.sh" configuration with 3 OSTs. Not sure if this is at all relevant to the problem or not.


Severity: 3
Rank (Obsolete): 8954

 Description   

It seems that the /proc/fs/lustre/obdfilter//job_stats output for "setattr" is double what is reported for the regular /proc/fs/lustre/obdfilter//stats output:

# grep setattr testfs-OST0000/*stats 
testfs-OST0000/job_stats:  setattr: { samples: 344, unit:  reqs }
testfs-OST0000/stats:setattr                   593 samples [reqs]
# grep setattr testfs-OST0000/*stats 
testfs-OST0000/job_stats:  setattr: { samples: 373, unit:  reqs }
testfs-OST0000/stats:setattr                   651 samples [reqs]

As you can see, the job_stats setattr count went up 2x the amount for the regular stats setattr count (this is during a simple "cp -av /etc /mnt/testfs" operation from a single client). Conversely, the "write" stats were the same:

# grep write testfs-OST0000/*stats
testfs-OST0000/job_stats:  write:   { samples: 307, unit: bytes, min: 11, max:  222390, sum: 2001024 }
testfs-OST0000/stats:write_bytes               307 samples [bytes] 11 222390 2001024

It is likely this is a bug in the job_stats handling, since the total number of write operations ~= total number of setattr ~= total regular files copied, but job_stats is almost 2x this number at the end.

# find /etc -type f | wc -l
1281
# grep setattr testfs-OST000*/*stats 
testfs-OST0000/job_stats:  setattr: { samples: 426, unit:  reqs }
testfs-OST0000/stats:setattr                   758 samples [reqs]
testfs-OST0001/job_stats:  setattr: { samples: 427, unit:  reqs }
testfs-OST0001/stats:setattr                   761 samples [reqs]
testfs-OST0002/job_stats:  setattr: { samples: 427, unit:  reqs }
testfs-OST0002/stats:setattr                   752 samples [reqs]
# grep write testfs-OST000*/*stats 
testfs-OST0000/job_stats:  write:   { samples: 418, unit: bytes, min: 4, max:  644275, sum: 3295088 }
testfs-OST0000/stats:write_bytes               418 samples [bytes] 4 644275 3295088
testfs-OST0001/job_stats:  write:   { samples: 425, unit: bytes, min: 5, max: 1048576, sum: 8450420 }
testfs-OST0001/stats:write_bytes               425 samples [bytes] 5 1048576 8450420
testfs-OST0002/job_stats:  write:   { samples: 421, unit: bytes, min: 1, max: 1048576, sum: 9396075 }
testfs-OST0002/stats:write_bytes               421 samples [bytes] 1 1048576 9396075
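
For anyone repeating this comparison, the sample counts can be pulled out of the grep output mechanically. The following is a minimal sketch, assuming the two line formats shown above; the names `parse_counts`, `JOB_RE`, and `STATS_RE` are made up for illustration and are not part of Lustre.

```python
import re

# Patterns for the two line formats that "grep ... testfs-OST*/*stats" prints above.
# These are assumptions based only on the output shown in this report.
JOB_RE = re.compile(
    r"^(?P<ost>[^/]+)/job_stats:\s*(?P<op>\w+):\s*\{\s*samples:\s*(?P<n>\d+)"
)
STATS_RE = re.compile(r"^(?P<ost>[^/]+)/stats:(?P<op>\w+)\s+(?P<n>\d+)\s+samples")


def parse_counts(text):
    """Return {(ost, source, op): samples} for every recognized line."""
    counts = {}
    for line in text.splitlines():
        for source, pattern in (("job_stats", JOB_RE), ("stats", STATS_RE)):
            m = pattern.match(line.strip())
            if m:
                counts[(m["ost"], source, m["op"])] = int(m["n"])
    return counts


# Lines copied from the output above.
SAMPLE = """\
testfs-OST0000/job_stats:  setattr: { samples: 426, unit:  reqs }
testfs-OST0000/stats:setattr                   758 samples [reqs]
testfs-OST0000/job_stats:  write:   { samples: 418, unit: bytes, min: 4, max:  644275, sum: 3295088 }
testfs-OST0000/stats:write_bytes               418 samples [bytes] 4 644275 3295088
"""

counts = parse_counts(SAMPLE)
for key in sorted(counts):
    print(key, counts[key])
```

Note that the write counter is named "write" in job_stats and "write_bytes" in stats, so the two sources have to be matched by hand for that operation.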


 Comments   
Comment by Niu Yawei (Inactive) [ 04/Jul/13 ]

It looks like the job_stats 'setattr' count is 1/2 what it should be, not 2x. I think this is because some of the setattr requests on the OST come from the MDS, and those requests are sent by OSP, which doesn't have OBD_CONNECT_JOBSTATS enabled for its connection.

That means only the setattr requests that come from the client are logged in jobstats, which I think is reasonable.
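
As a rough check of this explanation against the counters in the description, the sketch below computes the fraction of setattr requests that appear in job_stats and the number that carry no job tag. Attributing the untagged requests to the MDS is the hypothesis stated above, not something the counters prove; the helper name `untagged` is made up for illustration.

```python
# Final setattr counters copied from the grep output in the description.
final = {
    "OST0000": {"job_stats": 426, "stats": 758},
    "OST0001": {"job_stats": 427, "stats": 761},
    "OST0002": {"job_stats": 427, "stats": 752},
}


def untagged(counters):
    """setattr requests counted in stats but not in job_stats."""
    return counters["stats"] - counters["job_stats"]


for name, c in final.items():
    fraction = c["job_stats"] / c["stats"]
    print(f"{name}: job_stats/stats = {fraction:.2f}, untagged setattr = {untagged(c)}")

# Growth between the first two samples taken on OST0000.
job_delta = 373 - 344    # job_stats setattr increase
stats_delta = 651 - 593  # regular stats setattr increase
print(f"job_stats grew by {job_delta}, stats grew by {stats_delta}")
```

Between the first two samples the regular stats counter grows by exactly twice the job_stats increase, and over the whole run job_stats holds a little over half of the total.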

Comment by Andreas Dilger [ 04/Jul/13 ]

Hmm, I guess this means instead that we are sending 2x as many RPCs to the OST as we should be.

Comment by Niu Yawei (Inactive) [ 05/Jul/13 ]

Are there chown/chgrp operations? The setattr for chown/chgrp comes from the MDS, but the setattr for truncate and for atime/mtime updates comes from the client. So I guess half of the setattr requests come from chown/chgrp, and half come from atime/mtime changes?

Comment by Andreas Dilger [ 11/Jul/13 ]

Since the POSIX API does not have a single syscall for both timestamp and uid/gid updates, there is no way to have only a single RPC for these operations.
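
To illustrate the point, here is a minimal sketch of what a copy tool that preserves ownership and timestamps has to do at the syscall level. The function name `copy_owner_and_times` is made up, and the exact sequence of calls that "cp -a" issues is an assumption; the sketch only shows that ownership and timestamps are set by two separate calls.

```python
import os


def copy_owner_and_times(src: str, dst: str) -> None:
    """Apply the owner, group, and timestamps of src to dst."""
    st = os.stat(src)
    # Ownership update: one call in the chown family.
    os.chown(dst, st.st_uid, st.st_gid)
    # Timestamp update: a second, separate call in the utime family.
    os.utime(dst, ns=(st.st_atime_ns, st.st_mtime_ns))
```

Each of the two calls reaches the filesystem as its own attribute update, which is consistent with seeing two setattr requests per copied file on the OST.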
