
With ZFS can only declare a single mgsnode for MDT or OST

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.7.0, Lustre 2.5.4
    • Affects Version/s: Lustre 2.4.1, Lustre 2.6.0, Lustre 2.5.2
    • Environment: SL6.4
    • Severity: 1
    • Rank: 11857

    Description

      When declaring two mgsnodes for HA, only one entry is ever accepted when using --backfstype=zfs. All of the examples below fail; if the --mgsnode=nid:nid syntax worked, all might be okay.

      mkfs.lustre --reformat --fsname=RSF1 --ost --index=2 --mgsnode=10.82.0.9@tcp1 --mgsnode=10.82.0.9@tcp1:10.82.0.10@tcp1 --servicenode=10.82.0.11@tcp1:10.82.0.12@tcp1 --backfstype=zfs OST2/ost

      mkfs.lustre --reformat --fsname=RSF1 --ost --index=2 --mgsnode=10.82.0.9@tcp1:10.82.0.10@tcp1 --servicenode=10.82.0.11@tcp1:10.82.0.12@tcp1 --backfstype=zfs OST2/ost

          Activity

            [LU-4334] With ZFS can only declare a single mgsnode for MDT or OST

            chris Chris Gearing (Inactive) added a comment -

            Does

            zfs set lustre:mgsnode=192.168.139.10@tcp:192.168.139.70@tcp mdt/mdt1

            imply that ':' is the standard separator for ZFS properties? For example, in LU-4749, would failnode be split the same way?

            pjones Peter Jones added a comment -

            Landed for 2.7


            utopiabound Nathaniel Clark added a comment -

            The following change will fix setting multiple mgsnode properties on ZFS:
            http://review.whamcloud.com/11161

            Workaround for older systems:

            Instead of

            tunefs.lustre --mgsnode=192.168.139.10@tcp --mgsnode=192.168.139.70@tcp mdt/mdt1

            use the following:

            zfs set lustre:mgsnode=192.168.139.10@tcp:192.168.139.70@tcp mdt/mdt1
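
A minimal sketch of the workaround above, for anyone scripting it: it joins several MGS NIDs with ':' into one value and builds the corresponding `zfs set` argument list. The function name `zfs_set_mgsnode_cmd` is hypothetical and only for illustration; the sketch builds the command and does not run it.

```python
def zfs_set_mgsnode_cmd(dataset, nids):
    """Build the `zfs set` workaround command for several MGS NIDs.

    The NIDs are joined with ':' into one value, because a ZFS dataset
    stores a single value per property name.
    """
    if not nids:
        raise ValueError("at least one MGS NID is required")
    value = ":".join(nids)
    return ["zfs", "set", f"lustre:mgsnode={value}", dataset]


cmd = zfs_set_mgsnode_cmd(
    "mdt/mdt1", ["192.168.139.10@tcp", "192.168.139.70@tcp"]
)
print(" ".join(cmd))
```

The printed line can be compared against the command in the workaround before it is run by hand on the server.
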
            utopiabound Nathaniel Clark added a comment - - edited

            It seems like the right idea would be to store NID information in a single ZFS property <server1ip1>@tcp,<server1ip2>@tcp:<server2ip1>@tcp,<server2ip2>@tcp similar to how it can be input on the command line.

            This will apply to mgsnode, failnode, and servicenode.

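
To illustrate the proposed single-property layout, here is a rough sketch that splits and rebuilds such a value, assuming ':' separates servers and ',' separates the NIDs of one server as in the comment above. The function names and the example addresses are hypothetical; this is not the parsing code used by Lustre.

```python
def parse_nid_property(value):
    """Split 'a1,a2:b1,b2' into [['a1', 'a2'], ['b1', 'b2']].

    Outer split on ':' gives one entry per server; inner split on ','
    gives that server's NIDs.
    """
    return [node.split(",") for node in value.split(":") if node]


def format_nid_property(nodes):
    """Inverse of parse_nid_property: join NIDs with ',' and servers with ':'."""
    return ":".join(",".join(nids) for nids in nodes)


# Hypothetical addresses: two servers with two interfaces each.
value = "10.0.0.1@tcp,10.0.1.1@tcp:10.0.0.2@tcp,10.0.1.2@tcp"
nodes = parse_nid_property(value)
print(nodes)
print(format_nid_property(nodes) == value)
```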

            morrone Christopher Morrone (Inactive) added a comment -

            As part of fixing this issue, we need to make certain that the relevant OSD documentation is updated to clearly define the APIs and expectations for the OSD developer.

            utopiabound Nathaniel Clark added a comment - - edited

            It looks like there are two separate issues here:

            1) Listing failover or mgsnode NIDs in the form --mgsnode=NID1 --mgsnode=NID2 or --mgsnode=NID1,NID2 results in only NID2 being recorded.

            This seems to be due to how metadata is stored on ZFS: property names are unique, so setting the same property twice simply overwrites the first value with the second.

            2) Listing NIDs in the form --mgsnode=NID1:NID2 results in only NID1 being used.

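
The overwrite behaviour in issue 1 can be modelled with a plain dictionary. This is only a sketch of the "one value per property name" semantics described above, not the actual mkfs.lustre/tunefs.lustre code; `write_properties` and `merge_repeated` are hypothetical names, and joining repeated values with ':' is one possible approach rather than a statement of how the fix works.

```python
def write_properties(params):
    """Model storing (name, value) pairs as per-dataset properties.

    Property names are unique, so a later value for the same name
    replaces the earlier one.
    """
    props = {}
    for name, value in params:
        props[f"lustre:{name}"] = value
    return props


def merge_repeated(params, sep=":"):
    """Combine repeated names into a single value before writing."""
    merged = {}
    for name, value in params:
        merged[name] = merged[name] + sep + value if name in merged else value
    return list(merged.items())


params = [("mgsnode", "NID1"), ("mgsnode", "NID2")]
print(write_properties(params))                  # second value wins
print(write_properties(merge_repeated(params)))  # both values kept
```
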
            ekolb Eric Kolb added a comment -

            Hello,

            Yes, the --mgsnode=nid:nid setting can be applied to the MDTs and OSTs, but failover does not occur. The Lustre components seem to use only the first NID in the specified list, and upon failover of the MGS they will not use the second NID specified.

            jslandry JS Landry added a comment -

            Hi, this syntax works: --mgsnode=node1:node2

            # tunefs.lustre lustre1-ost4/ost0
            checking for existing Lustre data: found

            Read previous values:
            Target: lustre1-OST0004
            Index: 4
            Lustre FS: lustre1
            Mount type: zfs
            Flags: 0x1022
            (OST first_time no_primnode )
            Persistent mount opts:
            Parameters: mgsnode=10.225.8.3@o2ib failover.node=10.225.4.4@o2ib

            Permanent disk data:
            Target: lustre1:OST0004
            Index: 4
            Lustre FS: lustre1
            Mount type: zfs
            Flags: 0x1022
            (OST first_time no_primnode )
            Persistent mount opts:
            Parameters: mgsnode=10.225.8.3@o2ib failover.node=10.225.4.4@o2ib

            Writing lustre1-ost4/ost0 properties
            lustre:version=1
            lustre:flags=4130
            lustre:index=4
            lustre:fsname=lustre1
            lustre:svname=lustre1:OST0004
            lustre:mgsnode=10.225.8.3@o2ib
            lustre:failover.node=10.225.4.4@o2ib

            # tunefs.lustre --mgsnode=mds1-225@o2ib:mds2-225@o2ib lustre1-ost4/ost0
            checking for existing Lustre data: found

            Read previous values:
            Target: lustre1-OST0004
            Index: 4
            Lustre FS: lustre1
            Mount type: zfs
            Flags: 0x1022
            (OST first_time no_primnode )
            Persistent mount opts:
            Parameters: mgsnode=10.225.8.3@o2ib failover.node=10.225.4.4@o2ib

            Permanent disk data:
            Target: lustre1:OST0004
            Index: 4
            Lustre FS: lustre1
            Mount type: zfs
            Flags: 0x1022
            (OST first_time no_primnode )
            Persistent mount opts:
            Parameters: mgsnode=10.225.8.3@o2ib failover.node=10.225.4.4@o2ib mgsnode=10.225.8.2@o2ib:10.225.8.3@o2ib

            Writing lustre1-ost4/ost0 properties
            lustre:version=1
            lustre:flags=4130
            lustre:index=4
            lustre:fsname=lustre1
            lustre:svname=lustre1:OST0004
            lustre:mgsnode=10.225.8.3@o2ib
            lustre:failover.node=10.225.4.4@o2ib
            lustre:mgsnode=10.225.8.2@o2ib:10.225.8.3@o2ib

            # tunefs.lustre lustre1-ost4/ost0
            checking for existing Lustre data: found

            Read previous values:
            Target: lustre1-OST0004
            Index: 4
            Lustre FS: lustre1
            Mount type: zfs
            Flags: 0x1022
            (OST first_time no_primnode )
            Persistent mount opts:
            Parameters: mgsnode=10.225.8.2@o2ib:10.225.8.3@o2ib failover.node=10.225.4.4@o2ib

            Permanent disk data:
            Target: lustre1:OST0004
            Index: 4
            Lustre FS: lustre1
            Mount type: zfs
            Flags: 0x1022
            (OST first_time no_primnode )
            Persistent mount opts:
            Parameters: mgsnode=10.225.8.2@o2ib:10.225.8.3@o2ib failover.node=10.225.4.4@o2ib

            Writing lustre1-ost4/ost0 properties
            lustre:version=1
            lustre:flags=4130
            lustre:index=4
            lustre:fsname=lustre1
            lustre:svname=lustre1:OST0004
            lustre:mgsnode=10.225.8.2@o2ib:10.225.8.3@o2ib
            lustre:failover.node=10.225.4.4@o2ib

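
For readers comparing the two representations in the output above: the "Flags:" line and the lustre:flags property hold the same number, in hex and in decimal (0x1022 = 4130). A small sketch of that correspondence follows. The bit value assigned to each name is an assumption chosen to be consistent with the two flag sets printed in this ticket (0x1022 and 0x1002); it is not taken from the Lustre headers.

```python
# Assumed bit-to-name mapping, inferred from the transcripts in this ticket.
FLAG_NAMES = {
    0x0002: "OST",
    0x0020: "first_time",
    0x1000: "no_primnode",
}


def decode_flags(flags):
    """Return (known flag names, leftover bits not covered by FLAG_NAMES)."""
    names = [name for bit, name in sorted(FLAG_NAMES.items()) if flags & bit]
    known_mask = sum(FLAG_NAMES)  # OR of all known bits (they do not overlap)
    return names, flags & ~known_mask


print(int("0x1022", 16))   # decimal form as stored in lustre:flags
print(decode_flags(4130))
print(decode_flags(4098))
```
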
            ekolb Eric Kolb added a comment -

            Perhaps the below sequence displays the issue more clearly.

            $ tunefs.lustre OST2/ost
            checking for existing Lustre data: found

            Read previous values:
            Target: RSF1-OST0002
            Index: 2
            Lustre FS: RSF1
            Mount type: zfs
            Flags: 0x1002
            (OST no_primnode )
            Persistent mount opts:
            Parameters: failover.node=10.82.0.11@tcp1:10.82.0.12@tcp1 mgsnode=10.82.0.9@tcp1

            Permanent disk data:
            Target: RSF1-OST0002
            Index: 2
            Lustre FS: RSF1
            Mount type: zfs
            Flags: 0x1002
            (OST no_primnode )
            Persistent mount opts:
            Parameters: failover.node=10.82.0.11@tcp1:10.82.0.12@tcp1 mgsnode=10.82.0.9@tcp1

            Writing OST2/ost properties
            lustre:version=1
            lustre:flags=4098
            lustre:index=2
            lustre:fsname=RSF1
            lustre:svname=RSF1-OST0002
            lustre:failover.node=10.82.0.11@tcp1:10.82.0.12@tcp1
            lustre:mgsnode=10.82.0.9@tcp1

            $ tunefs.lustre --mgsnode=10.82.0.9@tcp1 --mgsnode=10.82.0.10@tcp1 OST2/ost
            checking for existing Lustre data: found

            Read previous values:
            Target: RSF1-OST0002
            Index: 2
            Lustre FS: RSF1
            Mount type: zfs
            Flags: 0x1002
            (OST no_primnode )
            Persistent mount opts:
            Parameters: failover.node=10.82.0.11@tcp1:10.82.0.12@tcp1 mgsnode=10.82.0.9@tcp1

            Permanent disk data:
            Target: RSF1-OST0002
            Index: 2
            Lustre FS: RSF1
            Mount type: zfs
            Flags: 0x1002
            (OST no_primnode )
            Persistent mount opts:
            Parameters: failover.node=10.82.0.11@tcp1:10.82.0.12@tcp1 mgsnode=10.82.0.9@tcp1 mgsnode=10.82.0.10@tcp1

            Writing OST2/ost properties
            lustre:version=1
            lustre:flags=4098
            lustre:index=2
            lustre:fsname=RSF1
            lustre:svname=RSF1-OST0002
            lustre:failover.node=10.82.0.11@tcp1:10.82.0.12@tcp1
            lustre:mgsnode=10.82.0.9@tcp1
            lustre:mgsnode=10.82.0.10@tcp1

            $ tunefs.lustre OST2/ost
            checking for existing Lustre data: found

            Read previous values:
            Target: RSF1-OST0002
            Index: 2
            Lustre FS: RSF1
            Mount type: zfs
            Flags: 0x1002
            (OST no_primnode )
            Persistent mount opts:
            Parameters: failover.node=10.82.0.11@tcp1:10.82.0.12@tcp1 mgsnode=10.82.0.10@tcp1

            Permanent disk data:
            Target: RSF1-OST0002
            Index: 2
            Lustre FS: RSF1
            Mount type: zfs
            Flags: 0x1002
            (OST no_primnode )
            Persistent mount opts:
            Parameters: failover.node=10.82.0.11@tcp1:10.82.0.12@tcp1 mgsnode=10.82.0.10@tcp1

            Writing OST2/ost properties
            lustre:version=1
            lustre:flags=4098
            lustre:index=2
            lustre:fsname=RSF1
            lustre:svname=RSF1-OST0002
            lustre:failover.node=10.82.0.11@tcp1:10.82.0.12@tcp1
            lustre:mgsnode=10.82.0.10@tcp1

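
One way to check a target for this problem is to compare the mgsnode entries on the "Parameters:" line before and after a re-read. The sketch below parses such a line as printed by tunefs.lustre in the sequence above; the function names are hypothetical, and splitting mgsnode values on ':' is an assumption based on the syntax discussed in this ticket.

```python
def parse_parameters(line):
    """Parse a 'Parameters: k=v k=v ...' line into a list of (key, value)."""
    prefix = "Parameters:"
    text = line.strip()
    if not text.startswith(prefix):
        raise ValueError("not a Parameters line")
    pairs = []
    for token in text[len(prefix):].split():
        key, _, value = token.partition("=")
        pairs.append((key, value))
    return pairs


def mgs_nids(line):
    """Collect every MGS NID on the line, across repeated mgsnode entries."""
    nids = []
    for key, value in parse_parameters(line):
        if key == "mgsnode":
            nids.extend(value.split(":"))
    return nids


written = ("Parameters: failover.node=10.82.0.11@tcp1:10.82.0.12@tcp1 "
           "mgsnode=10.82.0.9@tcp1 mgsnode=10.82.0.10@tcp1")
reread = ("Parameters: failover.node=10.82.0.11@tcp1:10.82.0.12@tcp1 "
          "mgsnode=10.82.0.10@tcp1")
print(mgs_nids(written))
print(mgs_nids(reread))
```

With the lines from the sequence above, the first call lists both NIDs and the second lists only the last one, which is the symptom being reported.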

            People

              Assignee: utopiabound Nathaniel Clark
              Reporter: ekolb Eric Kolb