[LU-1430] Changing network address without --writeconf option Created: 21/May/12  Updated: 29/May/17  Resolved: 29/May/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.4.0

Type: New Feature Priority: Minor
Reporter: Artem Blagodarenko (Inactive) Assignee: Jian Yu
Resolution: Fixed Votes: 0
Labels: patch
Environment:

Lustre 2.x


Issue Links:
Related
is related to LU-2200 Test failure on test suite conf-sanit... Resolved
is related to LUDOC-98 Document new lctl replace_nids feature Closed
Rank (Obsolete): 7651

 Description   

Currently, after a device's network address changes, tunefs.lustre must be
run with the "--writeconf" option. This erases the configuration logs for
the filesystem that the server is part of and regenerates them, which is
very dangerous. All clients must be unmounted and all servers for the
filesystem must be stopped. All targets (OSTs/MDTs) must then be restarted
to regenerate the logs, and no clients should be started until all targets
have restarted. The order in which the targets are started also matters:
the wrong order can cause a failure.
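
The ordering constraint described above can be illustrated with a small sketch. It only models the restart order and does not touch a real filesystem. The order used here (MGS, then MDTs, then OSTs, then clients) is an assumption taken from the usual writeconf procedure rather than from this ticket, and all node names are hypothetical.

```python
# Illustrative only: models the restart ordering after a writeconf.
# Assumed order (not stated in this ticket): MGS, MDTs, OSTs, then clients.
ROLE_PRIORITY = {"mgs": 0, "mdt": 1, "ost": 2, "client": 3}


def restart_order(nodes):
    """Return node names sorted into the assumed safe restart order.

    `nodes` is a list of (name, role) tuples; the names are hypothetical.
    """
    unknown = sorted({role for _, role in nodes if role not in ROLE_PRIORITY})
    if unknown:
        raise ValueError(f"unknown role(s): {unknown}")
    # sorted() is stable, so nodes with the same role keep their input order.
    ordered = sorted(nodes, key=lambda node: ROLE_PRIORITY[node[1]])
    return [name for name, _ in ordered]


if __name__ == "__main__":
    example = [
        ("client1", "client"),
        ("testfs-OST0001", "ost"),
        ("testfs-MDT0000", "mdt"),
        ("MGS", "mgs"),
        ("testfs-OST0000", "ost"),
    ]
    for name in restart_order(example):
        print(name)
```

A helper like this could guard a restart script against mounting clients before every target is back, which is the failure mode the description warns about.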



 Comments   
Comment by Artem Blagodarenko (Inactive) [ 24/May/12 ]

patch in RB
http://review.whamcloud.com/2896

Comment by Andreas Dilger [ 04/Sep/12 ]

Happy to see this feature being landed.

Have you given any thought to reviving Nathan's patches to clean up conf_param to match the set_param syntax in https://bugzilla.lustre.org/show_bug.cgi?id=17471, to simplify other user configuration tasks?

Comment by Artem Blagodarenko (Inactive) [ 04/Sep/12 ]

I have forwarded this question to Nathan. Changing conf_param to match set_param is a good idea, but Nathan should set the priority for this, because I have some other tasks in progress now.

Thanks for the comments on the review board.

Comment by Nathan Rutman [ 04/Sep/12 ]

Hi Andreas -
actually the other day I had an idea which may be both easy and effective: replace the whole conf_param "direct" proc access with a simple upcall-type mechanism from the MGC.

  1. on MGS: lctl conf_param random-proc-string=random-value-string
  2. gets added to all the config logs, or alternatively a single special param log
  3. param log updates get pulled by all the clients and servers
  4. MGC executes via upcall-type mechanism to userspace (i.e. an ioctl) a local lctl set_param string=value

Benefits:

  • identical conf_param / set_param
  • "permanent" wildcarding in strings
  • no unimplemented conf_param paths (e.g. ptlrpc services)
  • simpler implementation

We currently have this sitting in our backlog pile, although personally this bugs the hell out of me.
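
The four-step flow proposed above can be sketched as a toy model. This is not Lustre code: the class names ParamLog and Node, the parameter names, and the use of Python's fnmatch for wildcard expansion are all assumptions made for illustration. It only shows how a single param log, replayed locally through set_param, would give identical conf_param / set_param behaviour and keep wildcards "permanent".

```python
import fnmatch


class ParamLog:
    """Stand-in for a single special param log kept on the MGS (step 2)."""

    def __init__(self):
        self.records = []

    def conf_param(self, pattern, value):
        # Step 1: the admin records an arbitrary "pattern=value" string.
        self.records.append((pattern, value))


class Node:
    """Stand-in for a client or server running an MGC."""

    def __init__(self, params):
        self.params = dict(params)  # hypothetical local tunables
        self.applied = 0            # index of the next unread log record

    def set_param(self, pattern, value):
        # Step 4: local equivalent of `lctl set_param pattern=value`.
        # The wildcard is expanded on the node, not on the MGS.
        matched = fnmatch.filter(sorted(self.params), pattern)
        for name in matched:
            self.params[name] = value
        return matched

    def pull(self, log):
        # Step 3: fetch records added since the last pull and replay them.
        for pattern, value in log.records[self.applied:]:
            self.set_param(pattern, value)
        self.applied = len(log.records)


if __name__ == "__main__":
    log = ParamLog()
    log.conf_param("osc.*.max_dirty_mb", "64")
    node = Node({
        "osc.testfs-OST0000.max_dirty_mb": "32",
        "osc.testfs-OST0001.max_dirty_mb": "32",
        "llite.testfs.max_cached_mb": "128",
    })
    node.pull(log)
    for name, value in sorted(node.params.items()):
        print(name, value)
```

Because the log stores the pattern rather than its expansion, the record can simply be replayed on every node that pulls the log, and each node expands the wildcard against its own parameters.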

Comment by Nathan Rutman [ 09/Nov/12 ]

Xyratex MRP-397

Comment by Nathan Rutman [ 09/Nov/12 ]

Can this be landed?

Comment by Richard Henwood (Inactive) [ 13/Nov/12 ]

Documentation for this feature is being tracked on:

http://jira.whamcloud.com/browse/LUDOC-98

This ticket needs to be allocated to someone closer to this work than me.

Resources are available to introduce the documentation change workflow (it is pretty much identical to code changes):
http://wiki.whamcloud.com/display/PUB/Making+changes+to+the+Lustre+Manual
Feedback and comments to improve these resources are most welcome.

Comment by Richard Henwood (Inactive) [ 16/Nov/12 ]

This feature needs an entry in the User Manual. Nathan or Artem, can you prepare the documentation against LUDOC-98?

Comment by Andreas Dilger [ 16/Nov/12 ]

Nathan,
regarding your question (which I didn't originally see, because I wasn't CC'd on the bug), I think this wouldn't be a bad idea. Unifying the conf_param and set_param code would be great.

I also like the idea of being able to permanently keep the "wildcard" for the parameters.

That should definitely be a separate bug from this one.

For this bug, it would be great if "lctl replace_nids" could be used to fix LU-2200 (conf-sanity.sh test_32a upgrade test) so that the filesystem images can have their NIDs rewritten to work with o2ib networking. I don't necessarily think that is a prerequisite for landing, but it definitely would provide a more robust test than the current "replace the NIDs with the same NIDs" test that is included in the patch.

Comment by Nathan Rutman [ 19/Nov/12 ]

>Nathan or Artem, can you prepare the documentation against LUDOC-98?

Done.

Comment by Jian Yu [ 16/Jan/13 ]

Patch in http://review.whamcloud.com/2896 was landed on master branch.

Comment by Nathan Rutman [ 16/Jan/13 ]

>regarding your question (which I didn't originally see, because I wasn't CC'd on the bug), I think this wouldn't be a bad idea. Unifying the conf_param and set_param code would be great.
>I also like the idea of being able to permanently keep the "wildcard" for the parameters.
>That should definitely be a separate bug from this one.

Added as LU-2629

Comment by Artem Blagodarenko (Inactive) [ 11/Apr/13 ]

>Have you given any thought to reviving Nathan's patches to clean up conf_param to match the set_param syntax in https://bugzilla.lustre.org/show_bug.cgi?id=17471, to simplify other user configuration tasks?

This idea is implemented here:
https://jira.hpdd.intel.com/browse/LU-3155

Comment by Jinshan Xiong (Inactive) [ 03/Sep/13 ]

I tried `lctl replace_nids' several days ago and it didn't work in my case, where the MGS NID also has to be changed. I followed the instructions. From my understanding, a command has to be run on the targets to change the MGS NID.

Please let me know if I am misusing the command.

Generated at Sat Feb 10 01:16:33 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.