Details
Bug
Resolution: Fixed
Critical
Lustre 2.5.0
2
11951
Description
The lustre-2.5.52 client (and maybe older clients as well) causes a metadata performance regression when unlinking files in a single shared directory.
Here are test results for lustre-2.5.52 clients and lustre-2.4.1 clients. lustre-2.5.52 is running on all servers.
1 x MDS, 4 x OSS (32 x OST) and 16 clients (64 processes, 20000 files per process)
lustre-2.4.1 client 4.1-take2.log
-- started at 12/09/2013 07:31:29 --
mdtest-1.9.1 was launched with 64 total task(s) on 16 node(s)
Command line used: /work/tools/bin/mdtest -d /lustre/dir.0 -n 20000 -F -i 3
Path: /lustre
FS: 1141.8 TiB   Used FS: 0.0%   Inodes: 50.0 Mi   Used Inodes: 0.0%
64 tasks, 1280000 files
SUMMARY: (of 3 iterations)
   Operation               Max          Min         Mean      Std Dev
   ---------               ---          ---         ----      -------
   File creation  :   58200.265    56783.559    57589.448      594.589
   File stat      :  123351.857   109571.584   114223.612     6455.043
   File read      :  109917.788    83891.903    99965.718    11472.968
   File removal   :   60825.889    59066.121    59782.774      754.599
   Tree creation  :    4048.556     1971.934     3082.293      853.878
   Tree removal   :      21.269       15.069       18.204        2.532
-- finished at 12/09/2013 07:34:53 --
lustre-2.5.52 client
-- started at 12/09/2013 07:13:42 --
mdtest-1.9.1 was launched with 64 total task(s) on 16 node(s)
Command line used: /work/tools/bin/mdtest -d /lustre/dir.0 -n 20000 -F -i 3
Path: /lustre
FS: 1141.8 TiB   Used FS: 0.0%   Inodes: 50.0 Mi   Used Inodes: 0.0%
64 tasks, 1280000 files
SUMMARY: (of 3 iterations)
   Operation               Max          Min         Mean      Std Dev
   ---------               ---          ---         ----      -------
   File creation  :   58286.631    56689.423    57298.286      705.112
   File stat      :  127671.818   116429.261   121610.854     4631.684
   File read      :  173527.817   158205.242   166676.568     6359.445
   File removal   :   46818.194    45638.851    46118.111      506.151
   Tree creation  :    3844.458     2576.354     3393.050      578.560
   Tree removal   :      21.383       18.329       19.844        1.247
-- finished at 12/09/2013 07:17:07 --
File removal: 46K ops/sec (lustre-2.5.52) vs 60K ops/sec (lustre-2.4.1). That is a performance drop of roughly 23% on lustre-2.5.52 compared to lustre-2.4.1.
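To make the comparison reproducible, here is a minimal sketch that computes the relative change of each file operation from the mean values in the two mdtest SUMMARY blocks above. The numbers are copied from those logs; the names `MEAN_2_4_1`, `MEAN_2_5_52`, `percent_change` and `compare` are made up for this illustration and are not part of mdtest or Lustre.

```python
# Mean ops/sec copied from the two mdtest SUMMARY blocks in this ticket.
# The dictionary and function names are illustrative only.
MEAN_2_4_1 = {
    "File creation": 57589.448,
    "File stat": 114223.612,
    "File read": 99965.718,
    "File removal": 59782.774,
}
MEAN_2_5_52 = {
    "File creation": 57298.286,
    "File stat": 121610.854,
    "File read": 166676.568,
    "File removal": 46118.111,
}


def percent_change(baseline: float, candidate: float) -> float:
    """Relative change of candidate versus baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


def compare(baseline: dict, candidate: dict) -> dict:
    """Per-operation percent change for operations present in baseline."""
    return {op: percent_change(baseline[op], candidate[op]) for op in baseline}


if __name__ == "__main__":
    # Negative values mean the 2.5.52 client is slower than the 2.4.1 client.
    for op, delta in compare(MEAN_2_4_1, MEAN_2_5_52).items():
        print(f"{op:<14} {delta:+7.1f}%")
```

With these inputs, file removal comes out about 23% lower on the 2.5.52 client, while file creation is essentially unchanged and file read is considerably higher.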
First, I tried only the 10398 patch, but the build fails because OBD_CONNECT_UNLINK_CLOSE is defined in the 10426 patch. So I needed both patches applied together to compile.
BTW, here is the same mdtest benchmark on the same hardware, but with Lustre version 2.5.2RC2.
Unique Directory Operation
Shared Directory Operation
For metadata operations in unique directories, the results of master + 10398 + 10426 patches are, overall, close to the 2.5.2RC2 results, except for directory stat (for the stat operation, 2.5 is better than master).
However, for metadata operations in a shared directory, most of the 2.5.2RC2 numbers are still much higher than the results of master or master + 10398 + 10426 patches. That is the original issue in this ticket, and there is still a big performance gap. For the read operation, the master branch is much improved compared to the 2.5 branch.