Affects Version/s: None
Fix Version/s: None
When Lustre is used as the backing file system below NFS, we observe that the write
requests reaching Lustre are not page aligned, even when the application sends
them page aligned. Mostly it is the first and last pages that are not aligned.
Observation from code analysis is that the function below is causing the issue:
i.e. the 0th vector is filled with the min of buflen and wr_head, and the rest are filled differently.
Because of this, the first and last pages are not aligned.
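To make the effect concrete, here is a minimal userspace sketch of the splitting behaviour described above. It is not the actual kernel function: the helper name split_payload, its parameters, and the 3000-byte first segment used in the note below are assumptions chosen for illustration only.

```c
#include <stddef.h>

/*
 * Hypothetical helper (NOT the real NFS server code): splits a payload of
 * buflen bytes into segment lengths the way the report describes.
 * Segment 0 receives min(buflen, head_len); the remainder is cut into
 * page-sized pieces, with any leftover bytes in the final segment.
 * Returns the number of segment lengths written to lens (at most max).
 */
static size_t split_payload(size_t buflen, size_t head_len, size_t page_size,
                            size_t *lens, size_t max)
{
    size_t n = 0;
    size_t first = buflen < head_len ? buflen : head_len;

    if (max == 0 || page_size == 0)
        return 0;

    /* 0th vector: whatever fits in the head, not a full page. */
    lens[n++] = first;
    buflen -= first;

    /* Remaining vectors: full pages, then one trailing partial piece. */
    while (buflen > 0 && n < max) {
        size_t chunk = buflen < page_size ? buflen : page_size;

        lens[n++] = chunk;
        buflen -= chunk;
    }
    return n;
}
```

Under these assumptions, a 1 MiB payload with 4096-byte pages is split into a 3000-byte first segment, 255 full pages, and a 1096-byte last segment, so only the first and last segments are partial even though the total length is a multiple of the page size.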
Interestingly, when such a request hits the Lustre write path, the first page being
unaligned (partial) makes the whole write unaligned, which degrades
performance. The degradation seems to come mostly from the unaligned
write causing read RPCs.
As we can see here, the first partial-size vector causes all the following page-size
vectors to be treated as partial, which in turn causes read requests/RPCs.
I think we can avoid those read requests, since we know the first partial vector
is causing them. We can detect this by scanning the vector (we are already
iterating over it). Based on this approach, I have cooked up a prototype patch.
I am seeking feedback on this approach; any questions, suggestions, or concerns are welcome.
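One possible form of the detection step is sketched below in userspace C. This is not the prototype patch: the function name looks_page_aligned and the exact criteria (only the first and last segments may be partial, every middle segment is exactly one page, and the total length is a multiple of the page size) are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/uio.h>

/*
 * Hypothetical check (NOT the prototype patch): scan the vector once and
 * report whether the payload is page aligned as a whole even though its
 * first and last segments are partial.
 */
static bool looks_page_aligned(const struct iovec *iov, size_t n,
                               size_t page_size)
{
    size_t total = 0;

    if (n == 0 || page_size == 0)
        return false;

    for (size_t i = 0; i < n; i++) {
        size_t len = iov[i].iov_len;
        bool edge = (i == 0 || i == n - 1);

        /* No segment may exceed a page. */
        if (len > page_size)
            return false;
        /* Middle segments must be exactly one page. */
        if (!edge && len != page_size)
            return false;
        total += len;
    }

    /* The partial head and tail must add up to whole pages overall. */
    return total % page_size == 0;
}
```

For example, with 4096-byte pages a vector of lengths 3000, 4096, 1096 passes the check (8192 bytes in total), while 3000, 4096, 1000 does not. A write path that sees such a vector could, under these assumptions, treat the request as a full-page write instead of issuing reads for the partial pages.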
The following stats were collected without the patch (first set of results) and with the patch (second set), using this command:
- salloc -N 8 --ntasks-per-node=16 mpirun --allow-run-as-root /work/tools/bin/ior -a POSIX -w -r -vv -e -t 1m -b 2g -C -Q 21 -F -o /scratch1_nfs/file
Max Write: 379.60 MiB/sec (398.03 MB/sec)
Max Read: 4618.64 MiB/sec (4842.99 MB/sec)
lustre-2.10.4 with prototype patch
Max Write: 3817.34 MiB/sec (4002.77 MB/sec)
Max Read: 4474.30 MiB/sec (4691.64 MB/sec)