[LU-10114] Feasibility of increasing upper limit of maximum HSM backends registered with MDT Created: 10/Oct/17  Updated: 29/Nov/18  Resolved: 29/Nov/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.12.0

Type: Improvement Priority: Minor
Reporter: Matt Rásó-Barnett (Inactive) Assignee: Teddy
Resolution: Fixed Votes: 0
Labels: HSM

Issue Links:
Related
is related to LU-10092 PCC: Lustre Persistent Client Cache Resolved

 Description   

Hello,
As mentioned in Xue Wei's LAD'17 talk: https://www.eofs.eu/_media/events/lad17/08_li_xi_lcoc_lad_2017.pdf (LCOC: Lustre Cache on Client based on SSD – Xue Wei, NSCC-Wuxi and Li Xi, DDN), there is currently an upper limit of 32 HSM archives that can be registered.

We also have a potential HSM use case that would benefit greatly from a higher limit. For example, we could then allocate an HSM archive to each individual customer project and so colocate that project's archived files more logically on our particular HSM backend (a tape filesystem), which would greatly improve our ability to restore large numbers of files.

My question is simply: is this upper limit something that is relatively simple to increase without impacting much else?

I've emailed Li Xi from DDN, who was also listed in the talk, to ask whether he or Xue Wei could comment on this too (I haven't managed to find an email address for Xue Wei at NSCC-Wuxi). I'll update this ticket if I hear back from them.

Thanks,
Matt



 Comments   
Comment by John Hammond [ 10/Oct/17 ]

Unfortunately the limit of 32 archives is part of the wire protocol:

struct req_msg_field RMF_MDS_HSM_ARCHIVE =
        DEFINE_MSGF("hsm_archive", 0,
                    sizeof(__u32), lustre_swab_generic_32s, NULL);
EXPORT_SYMBOL(RMF_MDS_HSM_ARCHIVE);

So it can be changed, but it will take some time for the change to appear in a production release.
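
To illustrate why a 32-bit field caps the archive count, here is a minimal sketch that assumes the field is used as a bitmap with one bit per archive and that archive IDs are numbered from 1. The names demo_mask_has_archive and DEMO_MAX_ARCHIVES are hypothetical and not part of Lustre; only the 32-bit width comes from the definition above.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name: the number of archives a one-bit-per-archive
 * encoding can describe in a 32-bit field. */
enum { DEMO_MAX_ARCHIVES = sizeof(uint32_t) * CHAR_BIT };

/* Return true if archive_id is present in mask.
 * Assumption: IDs start at 1, so ID n maps to bit n - 1. */
static bool demo_mask_has_archive(uint32_t mask, unsigned int archive_id)
{
        if (archive_id < 1 || archive_id > DEMO_MAX_ARCHIVES)
                return false;   /* no bit exists for this ID */
        return (mask >> (archive_id - 1)) & 1u;
}
```

Under these assumptions, any ID above 32 has no bit to occupy, so it cannot be expressed without changing the field.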

Comment by Andreas Dilger [ 10/Oct/17 ]

John, typically it should be possible to increase the size of a buffer without breaking backward compatibility, so long as old clients aren't expected to access any of the fields beyond the old size of the buffer. Unfortunately, the MDS_HSM_ARCHIVE RPC is used between the client and MDS, so there needs to be some compatibility in place (probably struct obd_connect_data feature flag and max archive count) to negotiate the limits between them.
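
The negotiation suggested here could look roughly like the sketch below. It is not the Lustre implementation: struct demo_connect_data, the flag bit DEMO_CONNECT_ARCHIVE_ID_ARRAY and the function name are all hypothetical stand-ins for a struct obd_connect_data feature flag and max archive count. The idea is that the old limit of 32 stays in force unless both peers advertise the new feature.

```c
#include <stdint.h>

/* Hypothetical flag bit and legacy limit, for illustration only. */
#define DEMO_CONNECT_ARCHIVE_ID_ARRAY (UINT64_C(1) << 0)
#define DEMO_LEGACY_MAX_ARCHIVES      32u

/* Simplified stand-in for the data exchanged at connect time. */
struct demo_connect_data {
        uint64_t dcd_flags;         /* feature bits this peer supports */
        uint32_t dcd_max_archives;  /* largest archive count it accepts */
};

/* Return the archive limit both peers can safely use. */
static uint32_t
demo_negotiated_max_archives(const struct demo_connect_data *client,
                             const struct demo_connect_data *server)
{
        uint64_t common = client->dcd_flags & server->dcd_flags;

        /* An old peer never sets the flag, so fall back to 32. */
        if (!(common & DEMO_CONNECT_ARCHIVE_ID_ARRAY))
                return DEMO_LEGACY_MAX_ARCHIVES;

        /* Both peers are new: use the smaller of the two maxima. */
        return client->dcd_max_archives < server->dcd_max_archives ?
               client->dcd_max_archives : server->dcd_max_archives;
}
```

With this shape, an old client talking to a new MDS keeps the existing behaviour, and only a pair of upgraded peers ever sees more than 32 archives.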

I also see that the archive ID is used as a 32-bit value in a few structs, in particular those in lustre_user.h:

struct hsm_user_state {
        /** Current HSM states, from enum hsm_states. */
        __u32                   hus_states;
        __u32                   hus_archive_id;
        /**  The current undergoing action, if there is one */
        __u32                   hus_in_progress_state;
        __u32                   hus_in_progress_action;
        struct hsm_extent       hus_in_progress_location;
        char                    hus_extended_info[];
};

struct hsm_request {
        __u32 hr_action;        /* enum hsm_user_action */
        __u32 hr_archive_id;    /* archive id, used only with HUA_ARCHIVE */
        __u64 hr_flags;         /* request flags */
        __u32 hr_itemcount;     /* item count in hur_user_item vector */
        __u32 hr_data_len;
};

struct hsm_action_list {
        __u32 hal_version;
        __u32 hal_count;       /* number of hai's to follow */
        __u64 hal_compound_id; /* returned by coordinator */
        __u64 hal_flags;
        __u32 hal_archive_id; /* which archive backend */
        __u32 padding1;
        char  hal_fsname[0];   /* null-terminated */
        /* struct hsm_action_item[hal_count] follows, aligned on 8-byte
           boundaries. See hai_zero */
} __attribute__((packed));

struct hsm_user_import {
        __u64           hui_size;
        __u64           hui_atime;
        __u64           hui_mtime;
        __u32           hui_atime_ns;
        __u32           hui_mtime_ns;
        __u32           hui_uid;
        __u32           hui_gid;
        __u32           hui_mode;
        __u32           hui_archive_id;
};

but these all appear to be used as integer values and not bitmaps, so they should be OK without any changes.
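
As a small illustration of that point, the sketch below copies the field layout of struct hsm_request into a userspace struct (with stdint.h types standing in for __u32 and __u64) and stores an archive ID well above 32. The helper demo_archive_request is hypothetical, and the action value is left to the caller rather than guessed.

```c
#include <stdint.h>

/* Same field layout as struct hsm_request above, using <stdint.h>
 * types in place of the kernel's __u32/__u64. */
struct demo_hsm_request {
        uint32_t hr_action;
        uint32_t hr_archive_id;
        uint64_t hr_flags;
        uint32_t hr_itemcount;
        uint32_t hr_data_len;
};

/* Hypothetical helper: build a request for the given archive ID.
 * Fields not named in the initializer are zero-initialized. */
static struct demo_hsm_request
demo_archive_request(uint32_t action, uint32_t archive_id)
{
        struct demo_hsm_request req = {
                .hr_action = action,
                /* A plain number, not one bit per archive. */
                .hr_archive_id = archive_id,
        };

        return req;
}
```

Because the field holds the ID itself, a value such as 1000 fits without any change to the struct layout.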

Also, sanity-hsm.sh test_50() and test_51() are testing that up to 32 archives can be used, so these tests would need to be updated if we allow more archives.

Comment by Andreas Dilger [ 10/Oct/17 ]

Also, the copytool registration to the kernel passes only a 32-bit lk_data mask to indicate which archives it is in charge of:

struct lustre_kernelcomm {
        __u32 lk_wfd;
        __u32 lk_rfd;
        __u32 lk_uid;
        __u32 lk_group;
        __u32 lk_data;
        __u32 lk_flags;
} __attribute__((packed));
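
A rough sketch of the copytool side of this problem follows. It assumes that archive ID n (numbered from 1) maps to bit n - 1 of a 32-bit mask like lk_data; the function name demo_archive_mask is hypothetical and this is not the liblustreapi code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Build a 32-bit archive mask from a list of archive IDs.
 * Returns 0 on success, or -EINVAL if an ID has no bit in the mask,
 * in which case *mask is left untouched. */
static int demo_archive_mask(const unsigned int *ids, size_t count,
                             uint32_t *mask)
{
        uint32_t m = 0;

        for (size_t i = 0; i < count; i++) {
                if (ids[i] < 1 || ids[i] > 32)
                        return -EINVAL;
                m |= UINT32_C(1) << (ids[i] - 1);
        }

        *mask = m;
        return 0;
}
```

A copytool serving archives 1, 3 and 32 fits in the mask, but one that also wants archive 33 cannot register it this way, which is why the registration path needs a different encoding to go beyond 32.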

Comment by John Hammond [ 12/Oct/17 ]

Hi Li Xi, could you make Xue Wei aware of this ticket?

Comment by Gerrit Updater [ 29/Apr/18 ]

Teddy Zheng (jjkky@yahoo.com) uploaded a new patch: https://review.whamcloud.com/32197
Subject: LU-10114 hsm: increasing upper limit of maximum HSM backends registered with MDT
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 733344c1d871ae7925822de8135b32900ea2c776

Comment by Gerrit Updater [ 09/Aug/18 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/32806/
Subject: LU-10114 hsm: add OBD_CONNECT2_ARCHIVE_ID_ARRAY to pass archive_id lists in array
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 1c7e7d1243f78c72210a0ba3c22d5c84838a416e

Comment by Gerrit Updater [ 09/Nov/18 ]

John L. Hammond (jhammond@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33631
Subject: LU-10114 hsm: noop chaneg for introp baselining
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 2b02bf4747f0e6cb5d523471a4a6df226c7a5e86

Comment by Gerrit Updater [ 29/Nov/18 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/32197/
Subject: LU-10114 hsm: increase upper limit of maximum HSM backends registered with MDT
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 3bfb6107ba4e92d8aa02e842502bc44bac7b8b43

Comment by Peter Jones [ 29/Nov/18 ]

Landed for 2.12
