[LU-16026] Proper ROOT dir handles in remote directory Created: 19/Jul/22  Updated: 13/Jul/23  Resolved: 19/Jan/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.16.0

Type: Bug
Priority: Minor
Reporter: Shuichi Ihara
Assignee: Lai Siyao
Resolution: Fixed
Votes: 0
Labels: None

Issue Links:
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

Subdirectory mount allows a subdirectory of the filesystem to be mounted at a different mount point on the client.
If that directory is not on MDT0 but is assigned another MDT index (e.g. MDT1), the remote directory is mounted on the client as "ROOT", but it is not the actual "ROOT" of the filesystem.

This is a problem because it causes many unnecessary lock grants and cancels when an application traverses the Lustre tree (e.g. for permission checks) many times from the same client.
In general, when a client sends lookup requests with locks to the server, the granted locks can be kept on the client until they are canceled. However, if a remote directory is mounted on the client as "ROOT", those locks are canceled and the MDS has to revalidate every time.

Here is a simple reproducer.

Create /exafs/mdt0 with MDT0 and /exafs/mdt1 with MDT1.

[root@ec01 ~]# lfs mkdir -i 0 /exafs/mdt0
[root@ec01 ~]# lfs mkdir -i 1 /exafs/mdt1

[root@ec01 ~]# lfs setdirstripe -D -c 1 -i 0 --max-inherit=-1 /exafs/mdt0
[root@ec01 ~]# lfs setdirstripe -D -c 1 -i 1 --max-inherit=-1 /exafs/mdt1

[root@ec01 ~]# mkdir /mnt/exafs/mdt0 -p
[root@ec01 ~]# mkdir /mnt/exafs/mdt1 -p

Mount /exafs/mdt0 and /exafs/mdt1 on the client as subdirectory mounts.
[root@ec01 ~]# mount -t lustre 10.0.11.208@o2ib12:10.0.11.209@o2ib12:/exafs/mdt0 /mnt/exafs/mdt0
[root@ec01 ~]# mount -t lustre 10.0.11.208@o2ib12:10.0.11.209@o2ib12:/exafs/mdt1 /mnt/exafs/mdt1

[root@ec01 ~]# df -t lustre
Filesystem                                          1K-blocks    Used   Available Use% Mounted on
10.0.11.208@o2ib12:10.0.11.209@o2ib12:/exafs      56230674432 2433444 55659026780   1% /exafs
10.0.11.208@o2ib12:10.0.11.209@o2ib12:/exafs/mdt0 56230674432 2433444 55659026780   1% /mnt/exafs/mdt0
10.0.11.208@o2ib12:10.0.11.209@o2ib12:/exafs/mdt1 56230674432 2433444 55659026780   1% /mnt/exafs/mdt1

[root@ec01 ~]# lctl set_param ldlm.namespaces.*.lru_size=1000000

 

1st case (mdtest to MDT0 with sub dir mount)

[root@ec01 ~]# lctl set_param mdc.*.stats=clear
[root@ec01 ~]# mpirun -np 16 --allow-run-as-root /work/tools/bin/mdtest -n 10000 -C -E -r -F -u -d /mnt/exafs/mdt0

SUMMARY rate: (of 1 iterations)
   Operation                     Max            Min           Mean        Std Dev
   ---------                     ---            ---           ----        -------
   File creation               17871.676      17871.676      17871.676          0.000
   File stat                       0.000          0.000          0.000          0.000
   File read                   27612.328      27612.328      27612.328          0.000
   File removal                24975.434      24975.434      24975.434          0.000
   Tree creation                 373.126        373.126        373.126          0.000
   Tree removal                   81.089         81.089         81.089          0.000

[root@ec01 ~]# lctl get_param mdc.*MDT0000*.stats
mdc.exafs-MDT0000-mdc-ffff8cedc293e800.stats=
snapshot_time             1147669.149937342 secs.nsecs
start_time                0.000000000 secs.nsecs
elapsed_time              1147669.149937342 secs.nsecs
req_waittime              960054 samples [usec] 22 638501 146216676 449927615916
req_active                960054 samples [reqs] 1 19 7969292 69391178
ldlm_ibits_enqueue        320018 samples [reqs] 1 1 320018 320018
mds_close                 320000 samples [usec] 27 20104 35563148 6982285958
ldlm_cancel               160000 samples [usec] 22 19544 14834323 3810788927
obd_ping                  1 samples [usec] 64 64 64 4096
seq_query                 1 samples [usec] 143 143 143 20449

2nd case (mdtest to remote directory with sub dir mount)

[root@ec01 ~]# lctl set_param mdc.*.stats=clear
[root@ec01 ~]# mpirun -np 16 --allow-run-as-root /work/tools/bin/mdtest -n 10000 -C -E -r -F -u -d /mnt/exafs/mdt1

SUMMARY rate: (of 1 iterations)
   Operation                     Max            Min           Mean        Std Dev
   ---------                     ---            ---           ----        -------
   File creation               12583.511      12583.511      12583.511          0.000
   File stat                       0.000          0.000          0.000          0.000
   File read                   17565.493      17565.493      17565.493          0.000
   File removal                15153.482      15153.482      15153.482          0.000
   Tree creation                 393.056        393.056        393.056          0.000
   Tree removal                  235.953        235.953        235.953          0.000
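
For a quick comparison of the two mdtest summaries, the file-level mean rates can be expressed relative to the MDT0 run. This is only a sketch: the numbers are copied by hand from the two SUMMARY tables above, the helper name relative_rates is made up for illustration, and each run is a single iteration, so the percentages are indicative rather than statistically meaningful.

```python
# Mean rates (ops/sec) copied from the two mdtest SUMMARY tables above.
MDT0_RATES = {
    "File creation": 17871.676,
    "File read": 27612.328,
    "File removal": 24975.434,
}
MDT1_RATES = {
    "File creation": 12583.511,
    "File read": 17565.493,
    "File removal": 15153.482,
}


def relative_rates(baseline, other):
    """Return other/baseline for every operation in the baseline table."""
    return {op: other[op] / baseline[op] for op in baseline}


rel = relative_rates(MDT0_RATES, MDT1_RATES)
for op, fraction in rel.items():
    # Print each remote-directory rate as a percentage of the MDT0 rate.
    print(f"{op}: {fraction:.1%} of the MDT0 rate")
```

With these inputs, all three file operations on the remote directory mount come out at roughly 60% to 70% of the MDT0 rate.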


[root@ec01 ~]# lctl get_param mdc.*MDT0001*.stats
mdc.exafs-MDT0001-mdc-ffff8cee701cf800.stats=
snapshot_time             1147812.226056545 secs.nsecs
start_time                0.000000000 secs.nsecs
elapsed_time              1147812.226056545 secs.nsecs
req_waittime              1920132 samples [usec] 23 36225 460375703 180834738289
req_active                1920132 samples [reqs] 1 58 28583891 443285545
ldlm_ibits_enqueue        800057 samples [reqs] 1 1 800057 800057
mds_close                 320000 samples [usec] 31 19646 57650756 14258855424
ldlm_cancel               640037 samples [usec] 23 36225 131679890 44212017584
obd_ping                  2 samples [usec] 89 115 204 21146
seq_query                 2 samples [usec] 69 179 248 36802

The 2nd case (remote directory mounted as "ROOT") shows 2.5x the ldlm_ibits_enqueue count (800057 vs. 320018) and 4x the ldlm_cancel count (640037 vs. 160000) of the 1st case.
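
The ratios can be recomputed from the stats output itself. The sketch below assumes each counter line has the form "<name> <count> samples [unit] ..." as in the output above; the helper names sample_counts and count_ratios are hypothetical, and the input text is pasted from the two cases rather than read from lctl.

```python
# Counter lines copied from the two `lctl get_param mdc.*.stats` outputs above.
CASE1_MDT0 = """\
ldlm_ibits_enqueue        320018 samples [reqs] 1 1 320018 320018
mds_close                 320000 samples [usec] 27 20104 35563148 6982285958
ldlm_cancel               160000 samples [usec] 22 19544 14834323 3810788927
"""
CASE2_MDT1 = """\
ldlm_ibits_enqueue        800057 samples [reqs] 1 1 800057 800057
mds_close                 320000 samples [usec] 31 19646 57650756 14258855424
ldlm_cancel               640037 samples [usec] 23 36225 131679890 44212017584
"""


def sample_counts(stats_text):
    """Map each counter name to its sample count (the second column)."""
    counts = {}
    for line in stats_text.splitlines():
        fields = line.split()
        # Only lines shaped like "<name> <count> samples ..." are counters.
        if len(fields) >= 3 and fields[2] == "samples":
            counts[fields[0]] = int(fields[1])
    return counts


def count_ratios(baseline, other):
    """Return other/baseline for counters present (and nonzero) in both."""
    return {
        name: other[name] / baseline[name]
        for name in baseline
        if name in other and baseline[name]
    }


ratios = count_ratios(sample_counts(CASE1_MDT0), sample_counts(CASE2_MDT1))
for name, ratio in sorted(ratios.items()):
    print(f"{name}: {ratio:.2f}x")
```

mds_close is included as a sanity check: it is 320000 in both runs, so the workload itself was the same and only the lock traffic differs.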



 Comments   
Comment by Gerrit Updater [ 25/Aug/22 ]

"Lai Siyao <lai.siyao@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/48339
Subject: LU-16026 mdt: improve remote subdir mount
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: c15410d86ca88f42751ff7874507d8fe3b37bccb

Comment by Gerrit Updater [ 13/Sep/22 ]

"Lai Siyao <lai.siyao@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/48535
Subject: LU-16026 llite: always enable remote subdir mount
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 4ba8c1f91bcbc2495aadbfbf06c6e8d89aff26a6

Comment by Shuichi Ihara [ 21/Dec/22 ]

The patch worked as expected; results are below. I've confirmed there are no more extra ldlm_cancel and ldlm_ibits_enqueue requests even when a remote directory is mounted as ROOT.

sub directory mount with MDT0.

root@vm0:~# lctl get_param version 
version=2.15.52_47_g2679de4
root@vm0:~# lctl set_param ldlm.namespaces.*.lru_size=1000000
root@vm0:~# lctl set_param mdc.*.stats=clear
root@vm0:~# /usr/mpi/gcc/openmpi-4.1.2a1/bin/mpirun --oversubscribe -np 16 --allow-run-as-root /work/tools/bin/mdtest -n 10000 -C -E -r -F -u -d /mnt/exafs/mdt0
root@vm0:~# lctl get_param mdc.*MDT0000*.stats
mdc.exafs-MDT0000-mdc-ffff994209fb7000.stats=
snapshot_time             220.561610624 secs.nsecs
start_time                0.000000000 secs.nsecs
elapsed_time              220.561610624 secs.nsecs
req_waittime              960084 samples [usecs] 29 197986 180872606 399329041328
req_active                960084 samples [reqs] 1 14 6856669 50885133
ldlm_ibits_enqueue        320034 samples [reqs] 1 1 320034 320034
mds_close                 320000 samples [usecs] 45 197927 49927409 119532239021
ldlm_cancel               160014 samples [usecs] 29 58710 14269882 4844727860
obd_ping                  1 samples [usecs] 138 138 138 19044
seq_query                 1 samples [usecs] 189 189 189 35721

sub directory mount with remote directory (MDT1)

root@vm0:~# lctl set_param mdc.*.stats=clear
root@vm0:~# /usr/mpi/gcc/openmpi-4.1.2a1/bin/mpirun --oversubscribe -np 16 --allow-run-as-root /work/tools/bin/mdtest -n 10000 -C -E -r -F -u -d /mnt/exafs/mdt1
root@vm0:~# lctl get_param mdc.*MDT0001*.stats
mdc.exafs-MDT0001-mdc-ffff99420fcff800.stats=
snapshot_time             391.139182776 secs.nsecs
start_time                0.000000000 secs.nsecs
elapsed_time              391.139182776 secs.nsecs
req_waittime              960076 samples [usecs] 31 197107 180467199 396472739267
req_active                960076 samples [reqs] 1 22 6840395 50493549
ldlm_ibits_enqueue        320030 samples [reqs] 1 1 320030 320030
mds_close                 320000 samples [usecs] 43 64606 49595362 18409222060
ldlm_cancel               160010 samples [usecs] 31 64333 14363529 5541388115
obd_ping                  1 samples [usecs] 192 192 192 36864
seq_query                 1 samples [usecs] 156 156 156 24336
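
As a rough check that the two post-patch runs now generate the same lock traffic, the sample counts can be compared with a small tolerance. This is a sketch only: the counts are copied from the two outputs above, and the 0.1% tolerance and the helper names are arbitrary choices for illustration, not anything defined by Lustre.

```python
# (MDT0 run, MDT1 run) sample counts copied from the post-patch outputs above.
POST_PATCH = {
    "ldlm_ibits_enqueue": (320034, 320030),
    "mds_close": (320000, 320000),
    "ldlm_cancel": (160014, 160010),
}


def relative_difference(a, b):
    """Absolute difference scaled by the larger value; 0.0 when equal."""
    if a == b:
        return 0.0
    return abs(a - b) / max(a, b)


def counters_match(pairs, tolerance=0.001):
    """True when every counter pair differs by at most `tolerance`."""
    return all(
        relative_difference(a, b) <= tolerance for a, b in pairs.values()
    )


print(counters_match(POST_PATCH))
```

The same check applied to the pre-patch counts from the description (320018 vs. 800057 enqueues, 160000 vs. 640037 cancels) would fail by a wide margin.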
 
Comment by Gerrit Updater [ 17/Jan/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/48535/
Subject: LU-16026 llite: always enable remote subdir mount
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 6f490275b0e0455a431707775d685fb3df1d322d

Comment by Peter Jones [ 19/Jan/23 ]

Landed for 2.16
