[LU-57] Allow OSTs to be created with no primary node, only failnodes Created: 25/Jan/11 Updated: 28/Jun/11 Resolved: 03/Jun/11 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 1.8.6 |
| Fix Version/s: | Lustre 2.1.0, Lustre 1.8.6 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Kit Westneat (Inactive) | Assignee: | Zhenyu Xu |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | None | ||
| Severity: | 3 |
| Bugzilla ID: | 19,064 |
| Rank (Obsolete): | 5008 |
| Description |
|
One result of 22656 is that any system that has all its nodes defined as failnodes will be unable to mount after a writeconf. Because DDN currently lists both partner nodes as failnodes, this could waste a lot of time on needless debugging and issue resolution at customer sites. DDN currently reverts 22656 in its releases, but a solution like the one proposed in 19064 would allow us to use a version of Lustre with the 22656 patch. |
| Comments |
| Comment by Peter Jones [ 25/Jan/11 ] |
|
Bobijam, could you please look into this one? Thanks, Peter |
| Comment by Zhenyu Xu [ 30/Jan/11 ] |
|
I agree with adding --noprimnode to mkfs.lustre as bz 19064 proposed; it won't affect bz 22656 usage if the option is not specified during mkfs. What is your opinion, Andreas? |
| Comment by Andreas Dilger [ 09/Feb/11 ] |
|
Brian, any thoughts on this? I haven't been involved in the config code for a while, so maybe you have a better understanding of the usability of this approach. Bobijam, you may also want to contact Nathan Rutman <Nathan_Rutman@xyratex.com> to see if he is interested in discussing this. |
| Comment by Peter Jones [ 09/Feb/11 ] |
|
Bobijam, if Nathan is not responsive, you might also try asking Emoly/YuJian, as I think that at least one of them (or maybe both) has also worked in this area recently. Regards, Peter |
| Comment by Brian Murrell (Inactive) [ 09/Feb/11 ] |
|
Generally I agree with the sentiment that we really should not care which node is "primary" and which node(s) is (are) "failover" (to use two terms which should die). Technical details aside, I think that all nodes which have been designated to provide service for a target should be treated and handled identically by the configuration engine. This includes being apathetic about which node actually registers a target, too. If we were to make such a change, I think some of the syntax should be changed/extended too. If we were to embrace the deprecation of the concept of "primary" and "failover" nodes, patches like https://bugzilla.lustre.org/attachment.cgi?id=31800 would need to be reworked to remove their "primary node"ness. Without having looked at the patch, it is likely that its intention is good; it just needs to fully embrace the deprecation of "primary" and "failover" and remove the specific syntax dealing with it. |
| Comment by Zhenyu Xu [ 13/Feb/11 ] |
|
Posted a patch at http://review.whamcloud.com/234 |
| Comment by Peter Jones [ 15/Mar/11 ] |
|
DDN have confirmed that this fix appears to meet their needs so let's proceed with landing this patch for our next 1.8.x release |
| Comment by Zhenyu Xu [ 16/Mar/11 ] |
|
Hi Johann, Please check the review inline comment. The basic proposal is to change the misleading --servicenode param name to --standbynode |
| Comment by Brian Murrell (Inactive) [ 17/Mar/11 ] |
|
I can't speak for DDN, who I think are the proponents of this change, but by my understanding of what they are looking for, --standbynode doesn't seem to embrace the concept of being able to list all of the nodes that will provide service to a target without any implication of disposition about the nodes' relationship (i.e. master/slave, primary/secondary, ???/standby) to the target. Maybe I am reading more into what the goal of this work is than it really is. That said, I have maintained for a long time (at least secretly for at least some of that time) that for a given target (OST, MDT, etc.) all nodes should be considered "equal" and none should be considered a master/primary/etc. If that indeed is the goal here, it would be nice if the nomenclature enforced that. Maybe it's also that I am reading more into standby than I should but standby to me means secondary. |
| Comment by Kit Westneat (Inactive) [ 17/Mar/11 ] |
|
Ultimately what we'd like is to have the same behavior that 1.8.3 had, where we could put all the nodes on the mkfs line and start the OSTs on any node. I guess I prefer servicenode to standbynode, but I understand Johann's objections. Limiting the startup to just the nodes specified as servicenodes would be fine, if that's the direction you all want to take. |
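The behavior requested above (list every node at mkfs time, then allow the target to start on any of them and nowhere else) can be pictured with a small toy model. This is only an illustrative sketch, not Lustre's configuration code: the `Target` class, the `may_start` function, and the NID strings are all hypothetical names invented for this example, and the fallback branch is just one reading of the older primary/failnode arrangement discussed in this ticket.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Target:
    """Hypothetical stand-in for an OST/MDT record; not a Lustre structure."""
    name: str
    primary: Optional[str] = None                           # legacy "primary" NID, if any
    failnodes: List[str] = field(default_factory=list)      # legacy failover NIDs
    servicenodes: List[str] = field(default_factory=list)   # symmetric list of NIDs


def may_start(target: Target, nid: str) -> bool:
    """Return True if the node identified by `nid` may start `target`.

    Assumption: when service nodes are listed, they are all equal and
    startup is limited to exactly that list. Otherwise fall back to the
    legacy primary-plus-failnodes model.
    """
    if target.servicenodes:
        return nid in target.servicenodes
    return nid == target.primary or nid in target.failnodes


# Example: both partner nodes are listed, with no primary designated.
ost = Target("ost0", servicenodes=["10.0.0.1@o2ib", "10.0.0.2@o2ib"])
```

With this model, `may_start(ost, "10.0.0.2@o2ib")` is true for either partner node and false for any node that was not listed, which matches the "start the OSTs on any node" behavior described in the comment above.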
| Comment by Johann Lombardi (Inactive) [ 17/Mar/11 ] |
|
I don't like --standbynode either. I think my position is quite simple. We should choose between one of those 2 options: #2 is very easy and #1 requires a bit more work although it is still doable. |
| Comment by Zhenyu Xu [ 17/Mar/11 ] |
|
Updated the patch to adopt Johann's solution #1, which also fits DDN's requirement as Kit commented at 17/Mar/11 8:12 AM. |
| Comment by Build Master (Inactive) [ 17/Mar/11 ] |
|
Integrated in Bobi Jam : 433ac83bf8cea7204d562a6f16c4645f796f5361
|
| Comment by Peter Jones [ 30/Mar/11 ] |
|
Bobijam, landing permission has been granted by Oracle for this change. Can you please send the patch to lustre-gate-18@sun.com? Thanks, Peter |
| Comment by Peter Jones [ 30/Mar/11 ] |
|
Patch landed upstream by Oracle. Bobijam, is this patch relevant for master? |
| Comment by Zhenyu Xu [ 30/Mar/11 ] |
|
Yes, I'll push it for master review. |
| Comment by Build Master (Inactive) [ 31/Mar/11 ] |
|
Integrated in Bobi Jam : 651f7e3504bcb4cbb4737ab1826e0b45603baf31
|
| Comment by Build Master (Inactive) [ 01/Apr/11 ] |
|
Integrated in Terry Rutledge : deb1d73396f404fba0e277fd79b8e97cd17903b1
|
| Comment by Build Master (Inactive) [ 04/Apr/11 ] |
|
Integrated in Terry Rutledge : deb1d73396f404fba0e277fd79b8e97cd17903b1
|
| Comment by Build Master (Inactive) [ 06/Apr/11 ] |
|
Integrated in Terry Rutledge : deb1d73396f404fba0e277fd79b8e97cd17903b1
|
| Comment by Build Master (Inactive) [ 03/Jun/11 ] |
|
Integrated in Oleg Drokin : 80ac0f4ee600d5a0b8d818843562d9328fef2ef0
|
| Comment by Peter Jones [ 03/Jun/11 ] |
|
Landed for 1.8.6 and 2.1 |