[LU-7733] ptlrpc_lprocfs_svc_req_history_show (Seq_printf) time resolution should be higher than 1 second. Created: 02/Feb/16 Updated: 19/Sep/16 Resolved: 16/May/16 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.5.3 |
| Fix Version/s: | Lustre 2.9.0 |
| Type: | Improvement | Priority: | Major |
| Reporter: | Mahmoud Hanafi | Assignee: | Bob Glossman (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||
| Rank (Obsolete): | 9223372036854775807 | ||||
| Description |
|
We would like to see a higher time resolution in ptlrpc_lprocfs_svc_req_history_show output. Here we see it is only printing up to a second. seq_printf(s, LPD64":%s:%s:x"LPU64":%d:%s:%ld:%lds(%+lds) ", req->rq_history_seq, libcfs_nid2str(req->rq_self), libcfs_id2str(req->rq_peer), req->rq_xid, req->rq_reqlen, ptlrpc_rqphase2str(req), req->rq_arrival_time.tv_sec, req->rq_sent - req->rq_arrival_time.tv_sec, req->rq_sent - req->rq_deadline); |
| Comments |
| Comment by Oleg Drokin [ 03/Feb/16 ] |
|
its trivial to add extra resolution, the biggest question is that of compatibility. While we d o not have any tools that parse it, there might be something in the community that is not expecting anything but an integer there. Perhaps we need to ask around in lustre-devel? |
| Comment by Peter Jones [ 03/Feb/16 ] |
|
Bob Can you push a provisional change here while we assess the impact of making this change? Thanks Peter |
| Comment by Bob Glossman (Inactive) [ 03/Feb/16 ] |
|
not at all clear what is wanted here. I can easily print a higher resolution value in place of req->rq_arrival_time.tv_sec, just convert to jiffies or microseconds or something. not very human readable compared to seconds IMHO. don't know what to do about the diff values, req->rq_sent - req->rq_arrival_time.tv_sec and req->rq_sent - req->rq_deadline. rq_sent is in seconds so I think only seconds diffs are easy to come by. |
| Comment by Mahmoud Hanafi [ 04/Feb/16 ] |
|
We would be interested at lest microseconds. In the rpctrace debug dump the completion times are in micro-seconds. |
| Comment by Gerrit Updater [ 05/Feb/16 ] |
|
Bob Glossman (bob.glossman@intel.com) uploaded a new patch: http://review.whamcloud.com/18335 |
| Comment by Bob Glossman (Inactive) [ 05/Feb/16 ] |
|
still not sure I'm really addressing the request exactly, but have pushed a mod with my best guess. it's at least a starting point for discussion. |
| Comment by Mahmoud Hanafi [ 11/Feb/16 ] |
|
I agree with Andreas comment, what we want is seconds.microsecond. I did try out the patch. One the MDS it is working but not the output we were looking for... 6250131278572683267:10.151.26.117@o2ib:12345-10.151.26.120@o2ib:x1525908159598524:224:Complete:1455222088536099:-536099us(-192000000us) opc 400 6250131278572683271:10.151.26.117@o2ib:12345-10.151.26.120@o2ib:x1525908159598504:224:Complete:1455222088536106:-536106us(-192000000us) opc 400 6250127494744440832:10.151.26.117@o2ib:12345-10.151.27.19@o2ib:x1525733058816312:264:Complete:1455221207545364:-545364us(-192000000us) opc 700 There is something wrong on the OSS as it doesn't print anything. nbp1-oss1 /proc/fs/lustre/ost/OSS/ost_io # cat req_buffer_history_max 5000 nbp1-oss1 /proc/fs/lustre/ost/OSS/ost_io # cat req_buffer_history_len 1250 nbp1-oss1 /proc/fs/lustre/ost/OSS/ost_io # cat req_history nbp1-oss1 /proc/fs/lustre/ost/OSS/ost_io # |
| Comment by Mahmoud Hanafi [ 25/Mar/16 ] |
|
I don't understand the negative numbers here.... 6266059553153941504:10.151.27.60@o2ib:12345-10.151.7.174@o2ib:x1529670166131456:328:Complete:1458930678.979710:-1.020290s(-192.0s) 6266059553091026945:10.151.27.60@o2ib:12345-10.151.53.147@o2ib:x1529728924728644:328:Complete:1458930678.964339:-1.035661s(-192.0s) 6266059553440268289:10.151.27.60@o2ib:12345-10.151.54.82@o2ib:x1529750902839948:328:Complete:1458930679.001036:-1.998964s(-192.0s) 6266059553444003841:10.151.27.60@o2ib:12345-10.151.54.82@o2ib:x1529750902839960:328:Complete:1458930679.001944:-1.998056s(-192.0s) 6266059553454292994:10.151.27.60@o2ib:12345-10.151.40.144@o2ib:x1529023858938888:328:Complete:1458930679.004457:-1.995543s(-192.0s) 6266059553458880514:10.151.27.60@o2ib:12345-10.151.40.144@o2ib:x1529023858938904:328:Complete:1458930679.005578:-1.994422s(-192.0s) 6266059553481883650:10.151.27.60@o2ib:12345-10.151.40.144@o2ib:x1529023858938924:328:Complete:1458930679.011186:-1.988814s(-192.0s) 6266059553533132803:10.151.27.60@o2ib:12345-10.153.10.116@o2ib233:x1529703480218204:328:Complete:1458930679.023701:-1.976299s(-192.0s) |
| Comment by Bob Glossman (Inactive) [ 25/Mar/16 ] |
|
the negative numbers are because the values shown are deltas, not absolute times. if I'm not mistaken the values shown without the patch were always 0 or negative too. |
| Comment by Jay Lan (Inactive) [ 19/Apr/16 ] |
|
Bob, the review #18335 has conflict with b2_7_fe. 'Git blame' points to |
| Comment by Bob Glossman (Inactive) [ 19/Apr/16 ] |
|
Jay, |
| Comment by Jay Lan (Inactive) [ 19/Apr/16 ] |
|
Hi Bob, I can do the back port myself, given that Intel tells me that this patch would still work as expected without |
| Comment by Bob Glossman (Inactive) [ 19/Apr/16 ] |
|
Jay, I can't promise 100% that We don't usually port to branches until after a mod lands in master. This one hasn't. Hasn't even gotten any feedback from the originator of the request if it meets his needs. |
| Comment by Jay Lan (Inactive) [ 19/Apr/16 ] |
|
Oops, sorry, Bob. I did not read the contents of this ticket. Yes the mod was simply to increase the time resolution in print messages. Actually I already cherry-picked the mod to b2_5_fe before. |
| Comment by Gerrit Updater [ 16/May/16 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/18335/ |
| Comment by Peter Jones [ 16/May/16 ] |
|
Landed for 2.9 |