[LU-6013] Separate mount helpers for client and server Created: 10/Dec/14 Updated: 26/Feb/21 Resolved: 18/Nov/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.7.0, Lustre 2.5.3 |
| Fix Version/s: | Lustre 2.8.0 |
| Type: | Improvement | Priority: | Minor |
| Reporter: | Olaf Faaland | Assignee: | Olaf Faaland |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Rank (Obsolete): | 16757 |
| Description |
|
Currently a single mount helper, mount.lustre, is used both to mount Lustre on a client and to start a Lustre server (e.g. an OST). This makes it harder to see bugs in which client-side mount code is executed in a server context, or vice versa, and causes undesirable side effects.

An example seen at LLNL recently was with osd_init(). Client nodes could not mount a Lustre filesystem because mount called osd_init(), which attempted to load the mount_osd_{zfs,ldiskfs} backfs modules. One of the modules failed to load, causing the mount to fail, even though the client does not use those modules at all. That specific bug is being corrected by change http://review.whamcloud.com/#/c/12550/ for

The proposal here is to move server-side code from mount_lustre.c to mount_lustre_server.c, modify the build system to generate two separate binaries, mount.lustre and mount.lustre_server, and then update the startup scripts, spec file, and documentation accordingly. |
| Comments |
| Comment by Andreas Dilger [ 10/Dec/14 ] |
|
This would also require that all Lustre server filesystems change their type from "lustre" to "lustre_server", and use "mount -t lustre_server" everywhere. That is a pretty impactful change for only potential future problems, since it will cause all existing filesystems to fail to mount on upgrade or downgrade, and invalidate all existing documentation/tutorials/presentations on using Lustre.

A very short-term fix (possibly for 2.7.0 and suitable for 2.5.x) would be to patch mount_lustre.c to skip osd_init() entirely if a client device name (with ":/" in it) is given on the command line. That call should move somewhere after parse_options() is called, maybe right before parse_ldd(), since that appears to be the first place that uses the osd_*() functions. This avoids even trying to access the shared libraries on clients and easily avoids a whole class of bugs. It is also worthwhile to read some of the discussion in

I think a reasonable step beyond that is to allow filesystems to be mounted with type "lustre_server" (i.e. create a link from mount.lustre_server to mount.lustre, and register the "lustre_server" filesystem type in the kernel).

I'm also not at all against separating the client and server mount code and creating separate binaries, as I expect that there isn't a lot of overlap in terms of mount options or handling between clients and servers. It should still be possible to have mount.lustre fork/exec /sbin/mount.lustre_server (or whatever it is called) if it detects that the device name doesn't have ":/" in it, but is instead a local block device. That allows the separation of code and the ability to mount "lustre_server" filesystems directly, without the immediate requirement to move over to a separate filesystem type.

Once the ability to mount "lustre_server" filesystems has been available in new releases, and possibly backported to maintenance releases (2.5.5 maybe?), for a couple of years, we can think about removing the old mount support completely. |
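The short-term fix described in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patch that was uploaded: `source_is_client()`, `osd_init_stub()`, and `init_backend_if_server()` are hypothetical names, and the stub stands in for the real osd_init(), whose signature is not shown in this ticket.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from mount_lustre.c: a client mount
 * source such as "mgsnode@tcp:/fsname" contains ":/", while a server
 * target is a local block device or dataset name without it. */
static bool source_is_client(const char *source)
{
	return source != NULL && strstr(source, ":/") != NULL;
}

/* Counts calls so the sketch can show when backend setup is skipped. */
static int osd_init_calls;

/* Stand-in for the real osd_init(); its signature is assumed here. */
static int osd_init_stub(void)
{
	osd_init_calls++;
	return 0;
}

/* Run backend initialisation only for server mounts. Returns 0 on
 * success, or when initialisation is skipped because the source
 * looks like a client device name. */
static int init_backend_if_server(const char *source)
{
	if (source_is_client(source))
		return 0;
	return osd_init_stub();
}
```

With a check of this shape, a client never touches the backend modules, so a failure to load one of them cannot break a client mount.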
| Comment by Gerrit Updater [ 10/Dec/14 ] |
|
Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: http://review.whamcloud.com/13019 |
| Comment by Olaf Faaland [ 11/Dec/14 ] |
|
I agree re: changing the code so that osd_init() is not called on clients. I created an equivalent patch, then found that Bruno F. also has a patch in review that does that, http://review.whamcloud.com/#/c/12550/. So that is covered. Thanks for the pointer to

I like the idea of the mount helper for the server being a different binary from the one for the client, both because it puts one step in place towards changing the mount type on the server, and because using separate binaries makes it more convincing that the client and server mount code are really operating independently (no subtle inter-dependencies between client and server mount code on global variables, for example).

Since the same package is used for a client or server install, I can't think of a better way to allow that than your suggestion of exec'ing the server binary from within the client one, based on the device name. So I'll proceed that way for now, and change if a better idea comes up. |
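A minimal sketch of that exec-based dispatch, assuming the helper is invoked as `mount.lustre <device> <mountpoint> ...` so that the device name is `argv[1]`. The function names are hypothetical, and the helper path is passed in as a parameter because the final name and location of the server binary were still open at this point in the discussion.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical check: client sources look like "mgsnode@tcp:/fsname". */
static bool device_is_client(const char *device)
{
	return device != NULL && strstr(device, ":/") != NULL;
}

/* Hand a server mount over to a separate helper binary.
 *
 * Returns 0 if the caller should carry on with the client mount path
 * (client device name, or no device given so the caller can report
 * the usage error). For a server device the process image is replaced
 * by the helper and this function does not return; if the exec fails
 * it reports the error and returns -1. */
static int dispatch_server_mount(int argc, char **argv, const char *helper)
{
	const char *device = argc > 1 ? argv[1] : NULL;

	if (device == NULL || device_is_client(device))
		return 0;

	/* Pass the original arguments through unchanged. */
	execv(helper, argv);
	perror(helper);
	return -1;
}
```

A caller would invoke this early in `main()`, for example with `"/sbin/mount.lustre_server"` as the helper path, and exit with an error if it returns -1. Because the server code then lives in its own process image, it cannot share global state with the client code.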
| Comment by Andreas Dilger [ 11/Dec/14 ] |
|
Note that there is a separate client-only package that could include just the mount.lustre binary and not the server-only mount.lustre_server binary. The server package should contain both. |
| Comment by Gerrit Updater [ 11/Dec/14 ] |
|
Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: http://review.whamcloud.com/13029 |
| Comment by Andreas Dilger [ 11/Dec/14 ] |
|
I abandoned the master version of my osd_init() patch, but it probably still makes sense for b2_5 if Bruno's patch doesn't land there, to avoid problems mounting the client. |
| Comment by Gerrit Updater [ 13/Nov/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/13019/ |
| Comment by Joseph Gmitter (Inactive) [ 18/Nov/15 ] |
|
Landed for 2.8 |
| Comment by Gerrit Updater [ 26/Feb/21 ] |
|
Neil Brown (neilb@suse.de) uploaded a new patch: https://review.whamcloud.com/41767 |