Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-13306

allow clients to accept mgs_nidtbl_entry with IPv6 NIDs

Details

    • Improvement
    • Resolution: Fixed
    • Major
    • Lustre 2.16.0
    • None
    • 9223372036854775807

    Description

      Currently, if a client receives an mgs_nidtbl_entry with mne_nid_type != 0 or mne_nid_size != sizeof(lnet_nid_t)=8 the client will stop processing all later records, essentially cutting the client off from OST/MDT IR logs that have even one IPv6 addresses configured.

      To add IPv6 NID support, but preserve compatibility with older clients, there should be a new connect flag sent to the MGS at connect time:

      #define OBD_CONNECT2_MNE_TYPE         0x1000000ULL /* mne_nid_type flags */
      

      that should be saved in the client export on the MGS and used by mgs_nidtbl_read() to decide what type of NIDs should be sent to the client. The client should also send the flags that it understands MNE_NID_SUPPORTED = (MNE_NID_TYPE_* | (MNR_REC_FLAG_* << 8)) in struct obd_connect_data.padding0 renamed to ocd_mne_type so that the MGS can also avoid sending future record types to a client that it doesn't understand. The MGS reply should mask out any flags in ocd_mne_type that it doesn't support itself.  If a client export does not have OBD_CONNECT2_MNE_TYPE set, the MGS should only send compatible lnet_nid_t records with mne_nid_type=0 and mne_nid_size=sizeof(lnet_nid_t)=8.

      To allow future expansion, the mne_rec_type field should be treated as a bitmask of features present in the mgs_nidtbl_entry instead of a enumerated type. The first bit set in mne_nid_type=MNE_NID_TYPE_RECORD=0x01 would be used for expanding the record size, with the larger per-NID records as follows, with mne_nid_size=24, and a per-record mnr_rec_flags=MNR_REC_FLAG_NIDv6=0x01 to allow positively identifying the record content rather than depending on "all zero" fields to detect the difference:

      struct mne_nid_record {         /* when MNE_NID_TYPE_RECORD is set */
              __u8                    mnr_rec_flags;  /* MNR_REC_FLAG_* flags */
              __u8                    mnr_padding1;
              __u16                   mnr_padding2;
              struct lnet_nid_v6      mnr_nid_v6;
              /* other fields depending on mne_nid_type, mne_nid_size, mnr_rec_flags */
      };
      
      struct mgs_nidtbl_entry {
              __u64           mne_version;    /* table version of this entry */
              __u32           mne_instance;   /* target instance # */
              __u32           mne_index;      /* target index */
              __u32           mne_length;     /* length of this entry - by bytes */
              __u8            mne_type;       /* target type LDD_F_SV_TYPE_OST/MDT */
              __u8            mne_nid_type;   /* MNE_NID_TYPE_* flags */
              __u8            mne_nid_size;   /* size of each NID, by bytes */
              __u8            mne_nid_count;  /* # of NIDs in buffer */
              union {         /* don't walk the union array directly, use mne_nid_size */
                      lnet_nid_t             mne_nids[];     /* when mne_nid_type == 0 */
                      struct mnr_nid_record mne_records[];  /* if MNE_NID_TYPE_RECORD */
              };
      };
      

      The 4 extra bytes of fields at the start plus the 20-byte mnr_nid_v6 field keep all records with 64-bit size and alignment. Since lnet_nid_v6 can also encapsulate an IPv4 NID (based on .nid_type) there is no need to handle lnet_nid_t directly within mnr_nid_record. However, MNR_REC_FLAG_NIDv6 should be verified in case there is some other kind of NID in the record.

      When processing mne_nids or mne_records, it should be using mne_nid_size to walk the records, rather than having a loop that processes mne_nids[i] as an array, otherwise this will break if mne_nid_size changes in the future (ignoring any potential changes from LU-10360):

                      void *ptr;
      
                      /* iterate all nids to find one UUID by NID */
                      rc = -ENOENT;
      
                      for (i = 0, ptr = entry->mne_records; i < entry->mne_nid_count; i++, ptr += entry->mne_nid_size) {
                              lnet_nid_v6 nid_v6;
      
                              if (entry->mne_nid_type & MNE_NID_TYPE_RECORD) {
                                      struct mnr_nid_record *rec = ptr;
        
                                      if (!(rec->mnr_rec_flags & MNR_REC_FLAG_NIDv6))
                                              continue;
                                      nid_v6 = rec->mnr_nid_v6;
                              } else {
                                      lnet_nid2nidv6(*(lnet_nid_t *)ptr, &nid_v6);
                              }
                  
                              rc = client_import_find_conn(obd->u.cli.cl_import, nid_v6,
                                                           (struct obd_uuid *)uuid);
                              if (rc == 0)
                                      break;
                      }
      
      

      Attachments

        Issue Links

          Activity

            [LU-13306] allow clients to accept mgs_nidtbl_entry with IPv6 NIDs
            pjones Peter Jones added a comment -

            All patches seem to be merged for 2.16

            pjones Peter Jones added a comment - All patches seem to be merged for 2.16

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50750/
            Subject: LU-13306 mgc: handle large NID formats
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: e4d2d4ff74a103e4288d7581b40020eaff22b139

            gerrit Gerrit Updater added a comment - "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50750/ Subject: LU-13306 mgc: handle large NID formats Project: fs/lustre-release Branch: master Current Patch Set: Commit: e4d2d4ff74a103e4288d7581b40020eaff22b139

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/52053/
            Subject: LU-13306 mgs: support large NID for mgs_write_log_osc_to_lov
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 576928b2cd7dcdb6fcb6000ef593ecfb03f66500

            gerrit Gerrit Updater added a comment - "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/52053/ Subject: LU-13306 mgs: support large NID for mgs_write_log_osc_to_lov Project: fs/lustre-release Branch: master Current Patch Set: Commit: 576928b2cd7dcdb6fcb6000ef593ecfb03f66500

            "James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52053
            Subject: LU-13306 mgs: support large NID for mgs_write_log_osc_to_lov
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: e28695e90051d33bc5129d3512cdd9e01297f4d8

            gerrit Gerrit Updater added a comment - "James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52053 Subject: LU-13306 mgs: support large NID for mgs_write_log_osc_to_lov Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: e28695e90051d33bc5129d3512cdd9e01297f4d8

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50896/
            Subject: LU-13306 mgs: use large NIDS in the nid table on the MGS
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: c0cb747ebe95d3fefa2a70c09c2ae0c4fa873b47

            gerrit Gerrit Updater added a comment - "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50896/ Subject: LU-13306 mgs: use large NIDS in the nid table on the MGS Project: fs/lustre-release Branch: master Current Patch Set: Commit: c0cb747ebe95d3fefa2a70c09c2ae0c4fa873b47

            "James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51185
            Subject: LU-13306 test: base code
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: ba68cc5f0b71cccc60b3f240cbcacba82796a29c

            gerrit Gerrit Updater added a comment - "James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51185 Subject: LU-13306 test: base code Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: ba68cc5f0b71cccc60b3f240cbcacba82796a29c

            "James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50896
            Subject: LU-13306 mgs: use large NIDS in the nid table on the MGS
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 6ea3f9538d8ce242f87d96731e2ee4d0900c3983

            gerrit Gerrit Updater added a comment - "James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50896 Subject: LU-13306 mgs: use large NIDS in the nid table on the MGS Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: 6ea3f9538d8ce242f87d96731e2ee4d0900c3983

            "James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50750
            Subject: LU-13306 mgc: handle large NIDs gracefully
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 86d7f923d7caec756f0354d774e511482c138f50

            gerrit Gerrit Updater added a comment - "James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50750 Subject: LU-13306 mgc: handle large NIDs gracefully Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: 86d7f923d7caec756f0354d774e511482c138f50
            adilger Andreas Dilger added a comment - - edited

            The patch https://review.whamcloud.com/39373 LU-10391 lnet: extended nids introduces struct lnet_nid_x (IMHO, lnet_nid_v6) to encapsulate larger NIDs:

            static inline int nidv6_is_legacy(const struct lnet_nid_v6 *nid_v6)
            {
                   return nid_v6->nid_type < 256;
            }
            
            struct lnet_nid_v6 {
                   __u16       nid_type;
                   __u16       nid_num;
                   union {
                           __u32        nid_addr0;
                           __be32       nid_addr[4];
                   };
            };
            

            The added mnr_rec_flags and padding fields are used since it only needs 4-byte alignment and is 20 bytes in size. The nid_is_legacy() check can be used to encapsulate IPv4 addresses without the need for external identification.

            adilger Andreas Dilger added a comment - - edited The patch https://review.whamcloud.com/39373 LU-10391 lnet: extended nids introduces struct lnet_nid_x (IMHO, lnet_nid_v6 ) to encapsulate larger NIDs: static inline int nidv6_is_legacy( const struct lnet_nid_v6 *nid_v6) { return nid_v6->nid_type < 256; } struct lnet_nid_v6 { __u16 nid_type; __u16 nid_num; union { __u32 nid_addr0; __be32 nid_addr[4]; }; }; The added mnr_rec_flags and padding fields are used since it only needs 4-byte alignment and is 20 bytes in size. The nid_is_legacy() check can be used to encapsulate IPv4 addresses without the need for external identification.

            It would be very useful to get this code fixed to skip "unknown" NIDs in mgs_nidtbl_entry, so that we have some kind of interop with older clients. Otherwise, as soon as any IPv6 NID is listed in the IR table the client will no longer be able to use the IR list throwing a big wrench into LU-10360.

            adilger Andreas Dilger added a comment - It would be very useful to get this code fixed to skip "unknown" NIDs in mgs_nidtbl_entry , so that we have some kind of interop with older clients. Otherwise, as soon as any IPv6 NID is listed in the IR table the client will no longer be able to use the IR list throwing a big wrench into LU-10360 .

            People

              simmonsja James A Simmons
              adilger Andreas Dilger
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: