[LU-3555] job_stats "setattr" stats are 2x what they should be Created: 03/Jul/13 Updated: 05/Jun/15 Resolved: 11/Jul/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.3.0, Lustre 2.4.1, Lustre 2.5.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Andreas Dilger | Assignee: | Niu Yawei (Inactive) |
| Resolution: | Not a Bug | Votes: | 0 |
| Labels: | None | ||
| Environment: |
"FSTYPE=zfs llmount.sh" configuration with 3 OSTs. Not sure if this is at all relevant to the problem or not. |
||
| Severity: | 3 |
| Rank (Obsolete): | 8954 |
| Description |
|
It seems that the /proc/fs/lustre/obdfilter//job_stats output for "setattr" is double what is reported for the regular /proc/fs/lustre/obdfilter//stats output:
# grep setattr testfs-OST0000/*stats
testfs-OST0000/job_stats: setattr: { samples: 344, unit: reqs }
testfs-OST0000/stats:setattr 593 samples [reqs]
# grep setattr testfs-OST0000/*stats
testfs-OST0000/job_stats: setattr: { samples: 373, unit: reqs }
testfs-OST0000/stats:setattr 651 samples [reqs]
As you can see, the job_stats setattr count went up 2x the amount for the regular stats setattr count (this is during a simple "cp -av /etc /mnt/testfs" operation from a single client). Conversely, the "write" stats were the same:
# grep write testfs-OST0000/*stats
testfs-OST0000/job_stats: write: { samples: 307, unit: bytes, min: 11, max: 222390, sum: 2001024 }
testfs-OST0000/stats:write_bytes 307 samples [bytes] 11 222390 2001024
It is likely this is a bug in the job_stats handling, since the total number of write operations ~= total number of setattr ~= total regular files copied, but job_stats is almost 2x this number at the end.
# find /etc -type f | wc -l
1281
# grep setattr testfs-OST000*/*stats
testfs-OST0000/job_stats: setattr: { samples: 426, unit: reqs }
testfs-OST0000/stats:setattr 758 samples [reqs]
testfs-OST0001/job_stats: setattr: { samples: 427, unit: reqs }
testfs-OST0001/stats:setattr 761 samples [reqs]
testfs-OST0002/job_stats: setattr: { samples: 427, unit: reqs }
testfs-OST0002/stats:setattr 752 samples [reqs]
# grep write testfs-OST000*/*stats
testfs-OST0000/job_stats: write: { samples: 418, unit: bytes, min: 4, max: 644275, sum: 3295088 }
testfs-OST0000/stats:write_bytes 418 samples [bytes] 4 644275 3295088
testfs-OST0001/job_stats: write: { samples: 425, unit: bytes, min: 5, max: 1048576, sum: 8450420 }
testfs-OST0001/stats:write_bytes 425 samples [bytes] 5 1048576 8450420
testfs-OST0002/job_stats: write: { samples: 421, unit: bytes, min: 1, max: 1048576, sum: 9396075 }
testfs-OST0002/stats:write_bytes 421 samples [bytes] 1 1048576 9396075
|
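The counts quoted in the description can be compared directly. The following is a minimal sketch in Python, not part of the original report: it parses the final "grep setattr" output above and computes the job_stats/stats ratio per OST. The function names (`setattr_counts`, `ratios`) and the regular expressions are assumptions made for illustration and only match the two line formats shown in this ticket.

```python
import re

# Sample lines copied from the final "grep setattr" output in the description.
SAMPLE = """\
testfs-OST0000/job_stats: setattr: { samples: 426, unit: reqs }
testfs-OST0000/stats:setattr 758 samples [reqs]
testfs-OST0001/job_stats: setattr: { samples: 427, unit: reqs }
testfs-OST0001/stats:setattr 761 samples [reqs]
testfs-OST0002/job_stats: setattr: { samples: 427, unit: reqs }
testfs-OST0002/stats:setattr 752 samples [reqs]
"""

# Hypothetical patterns: they match only the two formats quoted in this ticket.
JOB_RE = re.compile(
    r"^(?P<ost>[^/]+)/job_stats:\s+setattr:\s+\{\s*samples:\s*(?P<n>\d+)"
)
STATS_RE = re.compile(r"^(?P<ost>[^/]+)/stats:setattr\s+(?P<n>\d+)\s+samples")


def setattr_counts(text):
    """Return {ost: {"job_stats": n, "stats": n}} parsed from grep output."""
    counts = {}
    for line in text.splitlines():
        for key, pattern in (("job_stats", JOB_RE), ("stats", STATS_RE)):
            match = pattern.match(line)
            if match:
                per_ost = counts.setdefault(match.group("ost"), {})
                per_ost[key] = int(match.group("n"))
    return counts


def ratios(counts):
    """Return {ost: job_stats samples / stats samples} for setattr."""
    return {ost: c["job_stats"] / c["stats"] for ost, c in counts.items()}


if __name__ == "__main__":
    for ost, ratio in sorted(ratios(setattr_counts(SAMPLE)).items()):
        print(f"{ost}: job_stats/stats = {ratio:.2f}")
```

On the sample above each ratio comes out between 0.5 and 0.6, that is, job_stats records a little more than half of the setattr requests counted in the regular stats.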
| Comments |
| Comment by Niu Yawei (Inactive) [ 04/Jul/13 ] |
|
It looks like the job_stats 'setattr' count is 1/2 of what it should be, not 2x. I think this is because some of the setattr requests on the OST come from the MDS, and those requests are sent by OSP, which doesn't have OBD_CONNECT_JOBSTATS enabled for its connection. That means only the setattr requests that come from clients are logged in jobstats, which I think is reasonable. |
| Comment by Andreas Dilger [ 04/Jul/13 ] |
|
Hmm, I guess this means instead that we are sending 2x as many RPCs to the OST as we should be. |
| Comment by Niu Yawei (Inactive) [ 05/Jul/13 ] |
|
Are there chown/chgrp operations? The setattr for chown/chgrp comes from the MDS, but the setattr for truncate and for setting atime/mtime comes from the client. So I guess half of the setattr requests come from chown/chgrp, and half of them come from the atime/mtime changes? |
| Comment by Andreas Dilger [ 11/Jul/13 ] |
|
Since the POSIX API does not have a single syscall for both timestamp and uid/gid updates, there is no way to have only a single RPC for these operations. |
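To illustrate the point about the POSIX API, here is a minimal Python sketch (an illustration only, not Lustre code). The helper name `copy_owner_and_times` is hypothetical. `os.chown` and `os.utime` wrap separate system calls, so a tool such as "cp -a" that preserves both ownership and timestamps has to issue at least two attribute updates per file. How those updates map to setattr RPCs on the OST is as described in the comments above; the sketch does not demonstrate that part.

```python
import os
import tempfile


def copy_owner_and_times(path, uid, gid, atime, mtime):
    """Apply ownership and timestamps to path using two separate calls.

    POSIX offers no single call that updates uid/gid and atime/mtime
    together, so an attribute-preserving copy needs both steps.
    """
    os.chown(path, uid, gid)        # ownership update (chown family)
    os.utime(path, (atime, mtime))  # timestamp update (utimes family)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        st = os.stat(tmp.name)
        # Reuse the file's current owner so no extra privileges are needed.
        copy_owner_and_times(tmp.name, st.st_uid, st.st_gid,
                             1_000_000_000, 1_000_000_000)
        print(int(os.stat(tmp.name).st_mtime))
```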