Details
- Bug
- Resolution: Fixed
- Minor
- Lustre 2.4.0
- 3
- 5233
Description
This issue was created by maloo for Li Wei <liwei@whamcloud.com>.
This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/6f4fd31a-1735-11e2-afe1-52540035b04c.
The sub-test test_49 failed with the following error:
Different LDLM_TIMEOUT:6 20 20
Info required for matching: conf-sanity 49
Attachments
Activity
Oh, sorry, that looks like code cleanup in 5671, not critical.
Patch landed to master. Please reopen, or let me know if I should reopen, if more work needs to be done.
> storing the individual fields from "lustre_disk_data" (e.g. ldd_flags) is not in question here.
Ahh good, that was part of my misunderstanding. That's my fault for not actually looking at the patch. Then, to be perfectly clear, the following lustre_disk_data entries will be left as properties:
```
/* Persistent mount data is stored in these user attributes */
#define LDD_VERSION_PROP   "lustre:version"   /* ldd->ldd_config_ver */
#define LDD_FLAGS_PROP     "lustre:flags"     /* ldd->ldd_flags */
#define LDD_INDEX_PROP     "lustre:index"     /* ldd->ldd_svindex */
#define LDD_FSNAME_PROP    "lustre:fsname"    /* ldd->ldd_fsname */
#define LDD_SVNAME_PROP    "lustre:svname"    /* ldd->ldd_svname */
#define LDD_UUID_PROP      "lustre:uuid"      /* ldd->ldd_uuid */
#define LDD_USERDATA_PROP  "lustre:userdata"  /* ldd->ldd_userdata */
#define LDD_MOUNTOPTS_PROP "lustre:mountopts" /* ldd->ldd_mount_opts */
```
> "ldd_params" is intended for a limited number of parameters (e.g. mgsnode, failnode, etc) that need to be set on a per-target basis when the filesystem is mounted...
Correct me if I'm wrong, but the only reason to make this distinction in the first place (which is subtle and totally non-obvious to an end user) is so these options can be passed as a string to mount(2). Exactly how the lustre_disk_data gets stored on disk for a particular OSD should be handled by that OSD. We wouldn't want the limitations of any one OSD to dictate to the others how this data must be stored. That's why the osd_write_ldd() and osd_read_ldd() wrapper functions were added in the first place.
It sounds to me like the problem here isn't that they are stored on disk differently, it's that several were omitted in the initial implementation. Let me propose a new patch (today/tomorrow) which, instead of collapsing these individual ZFS properties into ldd_params, adds all the missing PARAM_*s in a generic fashion. The LDD_MGSNODE_PROP, LDD_FAILNODE_PROP, LDD_FAILMODE_PROP, and LDD_IDENTITY_UPCALL_PROP defines will be removed, and the ZFS property names for params can be dynamically generated as 'lustre:<param key>' (see the sketch below). This way, if you add a new PARAM_XYZ you won't need to update the zfs_read_ldd() or zfs_write_ldd() functions. That should address your maintenance concerns. For LDD_FAILNODE_PROP->PARAM_FAILNODE and LDD_FAILMODE_PROP->PARAM_FAILMODE, where the names currently differ, we can rename the property on our existing file systems to avoid the need for compatibility code. Presumably there aren't other sites running ZFS just yet (are there?).
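To illustrate the dynamic name generation, here is a minimal sketch (the helper name ldd_param_to_prop() is hypothetical, not part of the proposed patch):
```
#include <stdio.h>
#include <string.h>

/* Build a ZFS property name "lustre:<param key>" from a "key=value"
 * parameter string, so that adding a new PARAM_XYZ requires no new
 * per-key #define or changes to zfs_read_ldd()/zfs_write_ldd().
 * Returns 0 on success, -1 if the output buffer is too small. */
static int ldd_param_to_prop(const char *param, char *prop, size_t proplen)
{
	const char *eq = strchr(param, '=');
	size_t keylen = eq != NULL ? (size_t)(eq - param) : strlen(param);

	if (snprintf(prop, proplen, "lustre:%.*s",
		     (int)keylen, param) >= (int)proplen)
		return -1;
	return 0;
}

int main(void)
{
	char prop[64];

	/* e.g. "mgsnode=10.0.0.1@tcp" yields the property "lustre:mgsnode" */
	if (ldd_param_to_prop("mgsnode=10.0.0.1@tcp", prop, sizeof(prop)) == 0)
		printf("%s\n", prop);
	return 0;
}
```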
On a related but separate note, it would be awfully nice to someday pass a versioned 'struct lustre_disk_data' as the mount data instead of this string. This would effectively eliminate a lot of the current parsing headaches. Or better yet, the lustre_disk_data could be read from disk in the context of mount(2). But that is of course a much more ambitious bit of development work.
Brian,
storing the individual fields from "lustre_disk_data" (e.g. ldd_flags) is not in question here. The original issue is that while the lustre_disk_data struct has an "ldd_params" field for storing arbitrary configuration strings, it was split into separate fields for ZFS only. This patch restores the original behaviour of it just containing a string of parameters that don't need to be handled individually and explicitly in userspace, and keeps the handling uniform between ZFS and ldiskfs. Without this, there were a number of parameters (e.g. "timeout", "network", etc.) that were not explicitly handled, and this would add ongoing extra maintenance effort.
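As a minimal sketch of that uniform string handling (the params_find() helper here is purely illustrative, not the actual mount_utils code):
```
#include <stdio.h>
#include <string.h>

/* Find "key=" in a flat, space-separated ldd_params string and return
 * a pointer to the start of its value (which runs to the next space),
 * or NULL if not present.  A single helper like this handles any
 * parameter without per-key userspace code. */
static const char *params_find(const char *params, const char *key)
{
	size_t keylen = strlen(key);
	const char *p = params;

	while ((p = strstr(p, key)) != NULL) {
		/* match only whole keys at a token boundary */
		if ((p == params || p[-1] == ' ') && p[keylen] == '=')
			return p + keylen + 1;
		p += keylen;
	}
	return NULL;
}

int main(void)
{
	const char *params = "mgsnode=10.0.0.1@tcp failnode=10.0.0.2@tcp timeout=50";

	printf("timeout=%s\n", params_find(params, "timeout"));
	return 0;
}
```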
I have no objection to restoring the 4 properties that were disabled by the http://review.whamcloud.com/4732 patch:
```
#define LDD_MGSNODE_PROP         "lustre:mgsnode"
#define LDD_FAILNODE_PROP        "lustre:failnode"
#define LDD_FAILMODE_PROP        "lustre:failmode"
#define LDD_IDENTITY_UPCALL_PROP "lustre:identity_upcall"
```
since they are the only ones that could have been broken by this patch (though strictly only the first three are needed, as "identity_upcall" could be set from the MGS for all MDTs).
"ldd_params" is intended for a limited number of parameters (e.g. mgsnode, failnode, etc) that need to be set on a per-target basis when the filesystem is mounted and do not have /proc/fs/lustre equivalents. The remaining global parameters can be fetched from the MGS config log for all OSTs, or they would have to be updated manually for every MDT or OST in the filesystem, and applied for every new target that is added.
I now see that the "network" option is not explicitly supported by the "lustre:*" parsing, which would need to be added in addition to explicit "timeout" and potentially other tunables.
Separately, if there is a desire for setting arbitrary parameters in lprocfs for a particular ZFS target (i.e. anything in /proc/fs/lustre/{type}/{name}/{parameter}), a generic mechanism could be created to find and parse ZFS properties of the form "lustre:param:{parameter}" for that target.
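A rough sketch of how such a mechanism might map property names to lprocfs paths (the prefix define and helper here are illustrative assumptions, not an existing interface):
```
#include <stdio.h>
#include <string.h>

#define LUSTRE_PARAM_PREFIX "lustre:param:"

/* Map a ZFS property name of the form "lustre:param:{parameter}" to
 * the corresponding /proc/fs/lustre/{type}/{name}/{parameter} path for
 * a given target.  Returns 0 on success, -1 if the property does not
 * match the prefix or the buffer is too small. */
static int prop_to_proc_path(const char *prop, const char *type,
			     const char *name, char *path, size_t pathlen)
{
	size_t plen = strlen(LUSTRE_PARAM_PREFIX);

	if (strncmp(prop, LUSTRE_PARAM_PREFIX, plen) != 0)
		return -1;
	if (snprintf(path, pathlen, "/proc/fs/lustre/%s/%s/%s",
		     type, name, prop + plen) >= (int)pathlen)
		return -1;
	return 0;
}

int main(void)
{
	char path[128];

	if (prop_to_proc_path("lustre:param:degraded", "obdfilter",
			      "testfs-OST0000", path, sizeof(path)) == 0)
		printf("%s\n", path);
	return 0;
}
```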
I agree that the Lustre configuration code could be improved, and there is a desire to do this for Chroma as well, but I don't think this bug is the right place to do that design.
Note that for ldiskfs, unlike ZFS, there is no space for arbitrary parameter storage beyond the "ldd_params" field, so it wouldn't be possible to add or remove individual parameters for ldiskfs without essentially writing helpers to extract them and concatenate them again in mount_utils_ldiskfs.c and in the kernel, and even then the storage wouldn't be shared between kernel and userspace the way the ZFS dataset properties are today.
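To make that extra effort concrete, here is the kind of extract-and-reconcatenate helper that would be needed (hypothetical; nothing like this exists in mount_utils_ldiskfs.c today):
```
#include <stdio.h>
#include <string.h>

/* Remove one "key=value" token from a flat, space-separated params
 * string in place, shifting the remainder down.  Splitting individual
 * parameters out of ldd_params for ldiskfs would require helpers like
 * this on both the extract and reassemble sides. */
static void params_remove(char *params, const char *key)
{
	size_t keylen = strlen(key);
	char *p = params;

	while ((p = strstr(p, key)) != NULL) {
		char *end;

		/* skip substring matches inside other keys or values */
		if ((p != params && p[-1] != ' ') || p[keylen] != '=') {
			p += keylen;
			continue;
		}
		end = strchr(p, ' ');
		if (end == NULL) {
			/* last token: trim it and any preceding space */
			if (p != params)
				p--;
			*p = '\0';
			return;
		}
		memmove(p, end + 1, strlen(end + 1) + 1);
		return;
	}
}

int main(void)
{
	char params[] = "mgsnode=10.0.0.1@tcp timeout=50 failnode=10.0.0.2@tcp";

	params_remove(params, "timeout");
	printf("%s\n", params);
	return 0;
}
```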
Andreas, I think for ZFS OSDs we absolutely need to store all the Lustre parameters of interest to an administrator this way. Certainly everything the administrator might need to modify, and potentially the rest made visible as read-only properties. It makes it far easier to 1) know what options are available, 2) verify what they are set to, and 3) change a particular option. As a nice bonus we can even cleanly register callbacks so that Lustre is notified when one of them changes and they can be modified online. This is an ideal interface for Lustre to use. If it needs to be extended in some way for Lustre we should look into that. Frankly, the way it was done for ldiskfs was a constant pain any time something needed to change, so I'd like to avoid that entirely with ZFS if we can.
In fact, doing it this way has already saved us a lot of trouble in one instance where we needed to clear a single bit in the ldd flags on all the servers due to a registration bug. We were able to trivially clear the single bit we needed and verify that change across all the servers. It was soooooooo nice.
Chris or Prakash, can you please comment on which specific parameters you were setting with these dataset properties before the patch landed that broke your mounting?
Are you looking to be able to store arbitrary configuration parameters on the ZFS dataset as properties, or only a specific limited number of parameters that are needed during mount that cannot be fetched from the MGS config?
I pushed a patch to revert that commit here: http://review.whamcloud.com/5144
OK, then let's reopen but lower the priority to track the code cleanup patch landing.