Details
Bug
Resolution: Fixed
Minor
Description
Sub directory mount allows mounting a sub directory of the filesystem on a different mount point on the client.
If that directory is not on MDT0 but is assigned another MDT index (e.g. MDT1) as its root directory, the remote directory is mounted on the client as "ROOT", although it is not the actual "ROOT" of the filesystem.
This is a problem because it causes many unnecessary lock grants/cancels when an application traverses the Lustre tree (e.g. for permission checks) many times from the same client.
In general, when a client sends a lookup request to the server, the lock granted with it can be kept on the client until it is canceled. However, if a remote directory is mounted on the client as "ROOT", those locks are canceled and the MDS needs to revalidate every time.
Here is a simple reproducer.
Create /exafs/mdt0 with MDT0 and /exafs/mdt1 with MDT1.
[root@ec01 ~]# lfs mkdir -i 0 /exafs/mdt0
[root@ec01 ~]# lfs mkdir -i 1 /exafs/mdt1
[root@ec01 ~]# lfs setdirstripe -D -c 1 -i 0 --max-inherit=-1 /exafs/mdt0
[root@ec01 ~]# lfs setdirstripe -D -c 1 -i 1 --max-inherit=-1 /exafs/mdt1
[root@ec01 ~]# mkdir /mnt/exafs/mdt0 -p
[root@ec01 ~]# mkdir /mnt/exafs/mdt1 -p

Mount /exafs/mdt0 and /exafs/mdt1 on the client as sub directory mounts.
[root@ec01 ~]# mount -t lustre 10.0.11.208@o2ib12:10.0.11.209@o2ib12:/exafs/mdt0 /mnt/exafs/mdt0
[root@ec01 ~]# mount -t lustre 10.0.11.208@o2ib12:10.0.11.209@o2ib12:/exafs/mdt1 /mnt/exafs/mdt1
[root@ec01 ~]# df -t lustre
Filesystem                                         1K-blocks    Used   Available Use% Mounted on
10.0.11.208@o2ib12:10.0.11.209@o2ib12:/exafs       56230674432 2433444 55659026780   1% /exafs
10.0.11.208@o2ib12:10.0.11.209@o2ib12:/exafs/mdt0  56230674432 2433444 55659026780   1% /mnt/exafs/mdt0
10.0.11.208@o2ib12:10.0.11.209@o2ib12:/exafs/mdt1  56230674432 2433444 55659026780   1% /mnt/exafs/mdt1
[root@ec01 ~]# lctl set_param ldlm.namespaces.*.lru_size=1000000
1st case (mdtest to MDT0 with sub dir mount)
[root@ec01 ~]# lctl set_param mdc.*.stats=clear
[root@ec01 ~]# mpirun -np 16 --allow-run-as-root /work/tools/bin/mdtest -n 10000 -C -E -r -F -u -d /mnt/exafs/mdt0
SUMMARY rate: (of 1 iterations)
   Operation            Max         Min        Mean     Std Dev
   ---------            ---         ---        ----     -------
   File creation   17871.676   17871.676   17871.676       0.000
   File stat           0.000       0.000       0.000       0.000
   File read       27612.328   27612.328   27612.328       0.000
   File removal    24975.434   24975.434   24975.434       0.000
   Tree creation     373.126     373.126     373.126       0.000
   Tree removal       81.089      81.089      81.089       0.000
[root@ec01 ~]# lctl get_param mdc.*MDT0000*.stats
mdc.exafs-MDT0000-mdc-ffff8cedc293e800.stats=
snapshot_time             1147669.149937342 secs.nsecs
start_time                0.000000000 secs.nsecs
elapsed_time              1147669.149937342 secs.nsecs
req_waittime              960054 samples [usec] 22 638501 146216676 449927615916
req_active                960054 samples [reqs] 1 19 7969292 69391178
ldlm_ibits_enqueue        320018 samples [reqs] 1 1 320018 320018
mds_close                 320000 samples [usec] 27 20104 35563148 6982285958
ldlm_cancel               160000 samples [usec] 22 19544 14834323 3810788927
obd_ping                  1 samples [usec] 64 64 64 4096
seq_query                 1 samples [usec] 143 143 143 20449
2nd case (mdtest to remote directory with sub dir mount)
[root@ec01 ~]# lctl set_param mdc.*.stats=clear
[root@ec01 ~]# mpirun -np 16 --allow-run-as-root /work/tools/bin/mdtest -n 10000 -C -E -r -F -u -d /mnt/exafs/mdt1
SUMMARY rate: (of 1 iterations)
   Operation            Max         Min        Mean     Std Dev
   ---------            ---         ---        ----     -------
   File creation   12583.511   12583.511   12583.511       0.000
   File stat           0.000       0.000       0.000       0.000
   File read       17565.493   17565.493   17565.493       0.000
   File removal    15153.482   15153.482   15153.482       0.000
   Tree creation     393.056     393.056     393.056       0.000
   Tree removal      235.953     235.953     235.953       0.000
[root@ec01 ~]# lctl get_param mdc.*MDT0001*.stats
mdc.exafs-MDT0001-mdc-ffff8cee701cf800.stats=
snapshot_time             1147812.226056545 secs.nsecs
start_time                0.000000000 secs.nsecs
elapsed_time              1147812.226056545 secs.nsecs
req_waittime              1920132 samples [usec] 23 36225 460375703 180834738289
req_active                1920132 samples [reqs] 1 58 28583891 443285545
ldlm_ibits_enqueue        800057 samples [reqs] 1 1 800057 800057
mds_close                 320000 samples [usec] 31 19646 57650756 14258855424
ldlm_cancel               640037 samples [usec] 23 36225 131679890 44212017584
obd_ping                  2 samples [usec] 89 115 204 21146
seq_query                 2 samples [usec] 69 179 248 36802
The 2nd case (remote directory mounted as "ROOT") issues 2.5x as many ldlm_ibits_enqueue requests (800057 vs. 320018) and 4x as many ldlm_cancel requests (640037 vs. 160000) as the 1st case.
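For anyone re-running the reproducer, the comparison can be scripted rather than read off by eye. The following is only a sketch: the helper names `samples` and `ratio` and the two shell variables are made up for illustration, and the counter lines are copied from the two stats dumps above. It assumes the stats format shown in this ticket, where the second field of each counter line is the sample count.

```shell
#!/bin/sh
# samples NAME: read an "lctl get_param mdc.*.stats" dump on stdin and print
# the sample count (second field) of the counter called NAME.
samples() {
    awk -v name="$1" '$1 == name { print $2; exit }'
}

# ratio A B: print A/B rounded to one decimal place.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Counter lines copied from the 1st case (MDT0) and 2nd case (MDT1) above.
# The variable names are hypothetical; a real run would pipe in lctl output.
mdt0_stats='ldlm_ibits_enqueue 320018 samples [reqs] 1 1 320018 320018
ldlm_cancel 160000 samples [usec] 22 19544 14834323 3810788927'
mdt1_stats='ldlm_ibits_enqueue 800057 samples [reqs] 1 1 800057 800057
ldlm_cancel 640037 samples [usec] 23 36225 131679890 44212017584'

for counter in ldlm_ibits_enqueue ldlm_cancel; do
    remote=$(printf '%s\n' "$mdt1_stats" | samples "$counter")
    local0=$(printf '%s\n' "$mdt0_stats" | samples "$counter")
    echo "$counter: $remote / $local0 = $(ratio "$remote" "$local0")x"
done
```

On a live client the same helpers could be fed directly, e.g. `lctl get_param mdc.*MDT0001*.stats | samples ldlm_cancel`, after clearing the stats and running mdtest as in the two cases above.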