[LU-8747] OSC_READ/WRITE replacement after LU-6943 Created: 22/Oct/16 Updated: 26/Oct/16 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Question/Request | Priority: | Major |
| Reporter: | Gabriele Paciucci (Inactive) | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
In
  snapshot_time 1454067781.979713 secs.usecs
  read_bytes 100000 samples [bytes] 4096 4096 409600000
  write_bytes 100000 samples [bytes] 4096 4096 409600000
  osc_read 401 samples [bytes] 4096 1048576 409600000
  osc_write 391 samples [bytes] 655360 1048576 409600000
  open 3 samples [regs]
  close 3 samples [regs]
  seek 2 samples [regs]
  truncate 1 samples [regs]
  getxattr 100000 samples [regs]
  inode_permission 7 samples [regs]
After the
  osc.zlfs2-OST001f-osc-ffff8820254c4000.stats=
  snapshot_time 1477130565.324299 secs.usecs
  req_waittime 1584 samples [usec] 170 7238 2291201 3700860537
  req_active 1584 samples [reqs] 1 2 1618 1686
  read_bytes 432 samples [bytes] 983 1048576 306500567 291188566310545
  write_bytes 1148 samples [bytes] 983 1048576 1181807521 1239149817647387
  ost_read 432 samples [usec] 170 4809 575487 981550847
  ost_write 1148 samples [usec] 255 6154 1707192 2666192812
  ldlm_cancel 1 samples [usec] 264 264 264 69696
  obd_ping 2 samples [usec] 247 773 1020 658538
I'm assuming that the old osc_read equals SUM(read_bytes in osc.*.stats). Could you please confirm?
Normally, SUM(read_bytes in osc.*.stats) should equal read_bytes in llite.*.stats when the application reads data over the network, and should be smaller when the application reads data that is already in the VFS cache. In theory, SUM(read_bytes in osc.*.stats) > read_bytes in llite.*.stats should not be possible.
However, I ran some experiments and found that SUM(read_bytes in osc.*.stats) > read_bytes in llite.*.stats. Could you explain why? |
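The counter lines quoted in the description share one layout: name, sample count, the word "samples", a unit in brackets, then optional min, max, sum and sum-of-squares columns. A minimal Python sketch of a parser for that layout follows. The names `StatLine` and `parse_stat_line` are hypothetical helpers for illustration, not part of Lustre, and the column meanings are inferred from the numbers shown in this ticket.

```python
from typing import NamedTuple, Optional


class StatLine(NamedTuple):
    name: str
    samples: int
    unit: str
    minimum: Optional[int]
    maximum: Optional[int]
    total: Optional[int]        # the "sum" column
    sum_squares: Optional[int]  # absent in the older llite output


def parse_stat_line(line: str) -> StatLine:
    """Parse one counter line, e.g.
    'read_bytes 432 samples [bytes] 983 1048576 306500567 291188566310545'."""
    fields = line.split()
    name, samples, unit = fields[0], int(fields[1]), fields[3].strip("[]")
    # Counters such as 'open 3 samples [regs]' carry no numeric columns,
    # so pad with None before unpacking min, max, sum, sum of squares.
    numbers = [int(f) for f in fields[4:]] + [None] * 4
    return StatLine(name, samples, unit, *numbers[:4])


old = parse_stat_line("osc_read 401 samples [bytes] 4096 1048576 409600000")
new = parse_stat_line(
    "read_bytes 432 samples [bytes] 983 1048576 306500567 291188566310545"
)
print(old.total, new.total)
```

With the sum column extracted this way, comparing the old osc_read value with a sum of per-OSC read_bytes values becomes a matter of adding `total` fields.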
| Comments |
| Comment by Gabriele Paciucci (Inactive) [ 22/Oct/16 ] |
|
I did a simple test:
lctl set_param llite.*.stats=0; lctl set_param osc.*.stats=0
dd if=/dev/zero of=pippo bs=1M count=1000
[root@broadwell1 SIDRA]# lctl get_param llite.*.stats |grep write_bytes
write_bytes 1000 samples [bytes] 1048576 1048576 1048576000
lctl get_param osc.*.stats |grep write_bytes
write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056
write_bytes 32 samples [bytes] 1048576 1048576 33554432 35184372088832
write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056
write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056
write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056
write_bytes 32 samples [bytes] 1048576 1048576 33554432 35184372088832
write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056
write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056
write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056
write_bytes 32 samples [bytes] 1048576 1048576 33554432 35184372088832
write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056
write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056
write_bytes 32 samples [bytes] 1048576 1048576 33554432 35184372088832
write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056
write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056
write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056
write_bytes 32 samples [bytes] 1048576 1048576 33554432 35184372088832
write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056
write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056
write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056
write_bytes 32 samples [bytes] 1048576 1048576 33554432 35184372088832
write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056
write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056
write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056
write_bytes 32 samples [bytes] 1048576 1048576 33554432 35184372088832
write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056
write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056
write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056
write_bytes 32 samples [bytes] 1048576 1048576 33554432 35184372088832
write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056
write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056
write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056
lctl set_param llite.*.stats=0; lctl set_param osc.*.stats=0
[root@broadwell1 SIDRA]# dd of=/dev/null if=pippo bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 0.288931 s, 3.6 GB/s
[root@broadwell1 SIDRA]# lctl get_param llite.*.stats|grep read_bytes
read_bytes 1000 samples [bytes] 1048576 1048576 1048576000
lctl get_param osc.*.stats|grep read_bytes
empty.... |
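The write half of this test can be checked arithmetically: the grep output lists 32 OSC counters, 24 with 31 samples and 8 with 32 samples. The Python sketch below sums the "sum" column (the seventh whitespace-separated field) across those lines and compares it with the llite value. `sum_counter` is a hypothetical helper, and the sample text is rebuilt from the counts above rather than copied in its original order.

```python
def sum_counter(stats_text: str, counter: str) -> int:
    """Add up the 'sum' column of every line whose first field is `counter`."""
    total = 0
    for line in stats_text.splitlines():
        fields = line.split()
        # Layout: name count "samples" [unit] min max sum [sumsq]
        if len(fields) >= 7 and fields[0] == counter:
            total += int(fields[6])
    return total


# Rebuilt from the ticket: 24 OSCs saw 31 writes of 1 MiB, 8 OSCs saw 32.
osc_output = "\n".join(
    ["write_bytes 31 samples [bytes] 1048576 1048576 32505856 34084860461056"] * 24
    + ["write_bytes 32 samples [bytes] 1048576 1048576 33554432 35184372088832"] * 8
)
llite_output = "write_bytes 1000 samples [bytes] 1048576 1048576 1048576000"

osc_total = sum_counter(osc_output, "write_bytes")
llite_total = sum_counter(llite_output, "write_bytes")
print(osc_total, llite_total, osc_total == llite_total)
```

For this dd run the two totals agree at 1048576000 bytes, so every byte counted by llite for writes also shows up in the per-OSC counters. The read half differs only because the data was served from the client cache.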
| Comment by Gabriele Paciucci (Inactive) [ 22/Oct/16 ] |
|
So the application in my first question was reading data with a call other than read(), and this explains why SUM(read_bytes in osc.*.stats) > read_bytes in llite.*.stats. |
| Comment by John Hammond [ 24/Oct/16 ] |
|
In llite.*.stats the {read,write}_bytes stats only count the number of bytes returned from some variant of read() or write(). If those reads hit the cache then the corresponding osc stats will remain unchanged. |
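The two behaviours discussed in this ticket can be illustrated with a toy model in Python. This is not Lustre code: `ToyClient`, its counters and the 4096-byte page size are assumptions made for illustration, and `mmap_touch` stands in for "a call other than read()" (the ticket does not name the call, so a memory-mapped access is only a guess).

```python
class ToyClient:
    """Toy model of a client with a page cache and two byte counters."""

    PAGE = 4096  # assumed page size, for illustration only

    def __init__(self):
        self.cached = set()        # page numbers currently in the cache
        self.llite_read_bytes = 0  # bytes returned by read()-style calls
        self.osc_read_bytes = 0    # bytes fetched from the servers

    def _fetch(self, page):
        # Only a cache miss generates server traffic.
        if page not in self.cached:
            self.osc_read_bytes += self.PAGE
            self.cached.add(page)

    def read(self, first_page, npages):
        for page in range(first_page, first_page + npages):
            self._fetch(page)
        # Counted whether or not the pages were already cached.
        self.llite_read_bytes += npages * self.PAGE

    def mmap_touch(self, first_page, npages):
        # Access that bypasses read(): may fetch pages, never counted by llite.
        for page in range(first_page, first_page + npages):
            self._fetch(page)


client = ToyClient()
client.cached.update(range(256))  # pages left in cache by an earlier write
client.read(0, 256)               # cache hits: llite grows, osc does not
after_cached_read = (client.llite_read_bytes, client.osc_read_bytes)
client.mmap_touch(256, 256)       # uncached pages: osc grows, llite does not
after_mmap = (client.llite_read_bytes, client.osc_read_bytes)
print(after_cached_read, after_mmap)
```

In this model a cached read() gives llite > osc, matching the dd test above, while traffic from non-read() accesses can push the osc total above the llite total, as in the original question.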