[LU-9982] Clients striping from mapped FID in nodemap Created: 13/Sep/17  Updated: 17/Sep/21

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: New Feature Priority: Minor
Reporter: Jean-Baptiste Riaux (Inactive) Assignee: Etienne Aujames
Resolution: Unresolved Votes: 0
Labels: cea

Attachments: Microsoft Word Lustre Striping by Client-0.7.docx    
Issue Links:
Related
is related to LU-11656 "lfs getstripe" on directory does not... Resolved
is related to LU-11510 preserve PFL/FLR/DoM layout with lfs_... Resolved
is related to LU-11586 sanity-pfl test_10: FAIL: /mnt/lustre... Resolved
is related to LU-10345 allow specifying file layout by filen... Open
is related to LU-11234 Choose lustre pool to place file’s ob... Open
is related to LU-11739 Don't inherit default layout from roo... Resolved
Rank (Obsolete): 9223372036854775807

 Description   

The main idea is to bind a template object to a nodemap. The template is given either as a path or as a FID (formatted with or without brackets).
When the associated set of clients (range) creates files or directories, they inherit striping information from the template FID whenever possible. Striping priorities:

  1. Explicit striping
  2. Parent striping
  3. Template striping
  4. Filesystem default

Since the nodemap can be queried for a given export, the MDT stack can obtain this information and pass the template object down to the lower layers. When a template exists, a new field in dt_allocation_hint points to the template object; lod_ah_init reads it and fills lod_layout_component accordingly.
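
The four striping priorities listed in the description can be sketched as a simple fallback chain. This is only an illustrative user-space sketch, not the code from the patch: struct stripe_info, struct create_hint and pick_layout are hypothetical names, and the real dt_allocation_hint and lod_layout_component structures carry far more state.

```c
#include <stddef.h>

/* Simplified stand-in for a striping description; not the real Lustre type. */
struct stripe_info {
    unsigned int stripe_count;
};

/*
 * Hypothetical, reduced view of the inputs available at create time.
 * A NULL pointer means "not set" at that level.
 */
struct create_hint {
    const struct stripe_info *explicit_layout;  /* given by the caller */
    const struct stripe_info *parent_layout;    /* parent default striping */
    const struct stripe_info *template_layout;  /* from the nodemap template */
    const struct stripe_info *fs_default;       /* filesystem-wide default */
};

/* Return the first layout that is set, following the priority order. */
const struct stripe_info *pick_layout(const struct create_hint *h)
{
    if (h->explicit_layout != NULL)
        return h->explicit_layout;
    if (h->parent_layout != NULL)
        return h->parent_layout;
    if (h->template_layout != NULL)
        return h->template_layout;
    return h->fs_default;
}
```

With only template_layout and fs_default set, pick_layout returns the template layout; setting parent_layout as well makes the parent striping win, which matches the ordering given in the description.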

Example of usage:

  • Start lustre
  • On a client (normal striping)
    [root@client ~]# mount -t lustre 192.168.10.134@tcp:/testfs /lustre/testfs2
    [root@client ~]# cd /lustre/testfs2/
    [root@client testfs2]# echo "test" > testfile
    [root@client testfs2]# lfs getstripe -c testfile
    1
    [root@client testfs2]# lfs getstripe -c my_template
    2
    [root@client testfs2]# lfs path2fid my_template
    [0x200000402:0x1:0x0]
    
  • On the MGS node (creating the nodemap + adding client or range)
    [root@mds1]# lctl nodemap_add SG1
    [root@mds1]# lctl nodemap_add_range --name SG1 --range 192.168.10.[130-140]@tcp
    [root@mds1]# lctl set_param -P nodemap.SG1.template=0x200000402:0x1:0x0
    or
    [root@mds1]# lctl nodemap_set_template --name SG1 --template /lustre/testfs2/my_template
    or
    [root@mds1]# lctl nodemap_set_template --name SG1 --template 0x200000402:0x1:0x0
    or
    [root@mds1]# lctl nodemap_set_template --name SG1 --template "[0x200000402:0x1:0x0]"
    [root@mds1]# lctl nodemap_modify --name SG1 --property admin --value 1 # working as root
    [root@mds1]# lctl nodemap_activate 1
    [root@mds1]# cat /proc/fs/lustre/nodemap/SG1/template
    0x200000402:0x1:0x0
    
  • On the client:
    [root@client testfs2]# echo "test" > testfile2
    [root@client testfs2]# lfs getstripe -c testfile2
    2
    

To do in the next few days:

  • Add a check that the FID format is valid when the template is bound.
  • I intend to change the type of the record from char[FID_LEN] to lu_fid directly (adding a nodemap record). This will avoid the call to sscanf in mdt_reint_open.
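
As an illustration of the two to-do items, here is a minimal user-space sketch that validates a FID string (with or without brackets) once, at bind time, and stores it in binary form so that no parsing is needed later on the open path. The names struct fid and fid_parse are hypothetical and only mirror the seq:oid:ver layout of a FID; this is not the code from the patch.

```c
#include <stdint.h>
#include <stdio.h>

/* Local stand-in mirroring the three fields of a FID (seq:oid:ver). */
struct fid {
    uint64_t seq;
    uint32_t oid;
    uint32_t ver;
};

/*
 * Parse "0xSEQ:0xOID:0xVER" or "[0xSEQ:0xOID:0xVER]" into *out.
 * Returns 0 on success, -1 if the string is not a well-formed FID.
 * Note: sscanf is lenient (it skips leading whitespace, for example);
 * a stricter check may be wanted in real code.
 */
int fid_parse(const char *str, struct fid *out)
{
    unsigned long long seq;
    unsigned int oid, ver;
    int bracketed = (str[0] == '[');
    int used = 0;

    if (bracketed)
        str++;

    /* %n records how many characters were consumed by the three fields. */
    if (sscanf(str, "%llx:%x:%x%n", &seq, &oid, &ver, &used) != 3)
        return -1;

    /* Require a matching ']' when bracketed, and no trailing characters. */
    if (bracketed) {
        if (str[used] != ']' || str[used + 1] != '\0')
            return -1;
    } else if (str[used] != '\0') {
        return -1;
    }

    out->seq = seq;
    out->oid = oid;
    out->ver = ver;
    return 0;
}
```

Both "[0x200000402:0x1:0x0]" and "0x200000402:0x1:0x0" are accepted and yield the same binary value, while a string with an unbalanced bracket or a missing field is rejected.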


 Comments   
Comment by Gerrit Updater [ 13/Sep/17 ]

Jean-Baptiste Riaux (riaux.jb@intel.com) uploaded a new patch: https://review.whamcloud.com/28972
Subject: LU-9982 lustre: Clients striping from mapped FID in nodemap
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 827a92af1f3cba7b1a4105c77e35973e85702a87

Comment by Jean-Baptiste Riaux (Inactive) [ 13/Sep/17 ]

I'll fix the checkpatch issues before the end of the week.

Comment by Jean-Baptiste Riaux (Inactive) [ 16/Sep/17 ]

Code cleaned.
Added internal lock on template object in mdt_reint_open
Changed nodemap template code to match existing features (like fileset for example, see LU-7846, LU-9289, LU-8258).

I think I will recreate this as a set of 3 patches instead of 1 big patch:

  • nodemap : add template info to nodemap
  • mdt: templated create for dir and files
  • test: test for nodemap template in sanity-sec

Comment by Jean-Baptiste Riaux (Inactive) [ 21/Nov/17 ]

Added templated directory.
A templated directory must be created as a directory with default striping information:

lfs setdirstripe -D -c stripe_count template_directory

Then, as for a template file, it is bound to the nodemap like this:

lctl set_param -P nodemap.SG1.template=fidoftemplatedir

Comment by Jean-Baptiste Riaux (Inactive) [ 05/Feb/18 ]

New version of the patch available (still https://review.whamcloud.com/#/c/28972) using dt_allocation_hint.

Comment by Gerrit Updater [ 05/Feb/18 ]

Jean-Baptiste Riaux (riaux.jb@intel.com) uploaded a new patch: https://review.whamcloud.com/31171
Subject: LU-9982 lustre: Clients striping from mapped FID in nodemap
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: fba1cf5a2e8bf664d2caa430073f34b2805703c3

Comment by Jean-Baptiste Riaux (Inactive) [ 05/Feb/18 ]

Added tests in file sanity-nodemap.sh (dependency on https://review.whamcloud.com/#/c/28972)

Comment by Jean-Baptiste Riaux (Inactive) [ 15/Mar/18 ]

Design document updated.

Code updated: memory leaks fixed (they appeared after LU-8998, PFL, landed), plus other fixes (a leak when a nodemap was destroyed, and missing nodemap_putref calls added), plus stripe offset and ost-list support added for the template (see lod_ah_init in patch 28972).

I am very interested in feedback on lod_ah_init in lod_object.c, because I am fairly sure there is a smarter or more elegant way to transfer the template lod_layout_component into the created object.

Comment by Jean-Baptiste Riaux (Inactive) [ 19/Mar/18 ]

Performance impact measurements: the worst case is a 1.9% degradation with the feature enabled and in use (multiple runs done).

I will add the performance impact in my next commit message.

Test: creating 500K files in a single directory on a single-node setup.

                 Master     LU-9982    %diff
stripe count 1   33.245s    33.770s    +1.579%
stripe count 2   33.865s    34.520s    +1.934%
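
For reference, the %diff column is the relative change of the LU-9982 run time against master. A minimal sketch of that arithmetic (pct_diff is a hypothetical helper name, not part of any test script):

```c
/* Relative change of `patched` versus `baseline`, in percent. */
double pct_diff(double baseline, double patched)
{
    return (patched - baseline) / baseline * 100.0;
}
```

For stripe count 1 this gives (33.770 - 33.245) / 33.245 * 100, about +1.579%, and for stripe count 2 it gives (34.520 - 33.865) / 33.865 * 100, about +1.934%.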

I also found a bug when creating files in bursts with SELinux enforcing (no problem when SELinux is set to permissive). Currently investigating (lod_ah_init).

There is also 1 minor bug to fix (a memory leak when changing the template: nodemap->fidtpl is not freed).

Comment by Andreas Dilger [ 24/Feb/20 ]

A useful enhancement to this would be to default to using the root directory of the fileset as the template. That would work the same way as the root directory of the whole filesystem does today, making the virtualization of Lustre more transparent.

Generated at Sat Feb 10 02:31:00 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.