
Lustre 2.9 + zfs 0.7 + draid = OSS hangup

Details

    • Bug
    • Resolution: Not a Bug
    • Minor
    • None
    • Lustre 2.9.0
    • CentOS 7 in a Hyper-V vm

    Description

      I'm trying to build a dRAID-based OST for Lustre.

      Initially reported as thegreatgazoo/zfs issue #2: https://github.com/thegreatgazoo/zfs/issues/2

      A generic Lustre MGS/MDT is up and running.

      The OSS is a fresh VM (4 CPUs, 4 GB RAM) with a CentOS 7 "minimal" install and 18 SCSI disks (images).

      Run yum -y update and reboot, then run setup-node.sh NODE from the workstation.
      SSH to NODE and run ./mkzpool.sh:

      [root@node26 ~]# ./mkzpool.sh 
      + zpool list
      + grep -w 'no pools available'
      + zpool destroy oss3pool
      + zpool list
      + grep -w 'no pools available'
      no pools available
      + '[' -f 17.nvl ']'
      + draidcfg -r 17.nvl
      dRAID1 vdev of 17 child drives: 3 x (4 data + 1 parity) and 2 distributed spare
      Using 32 base permutations
        15, 2, 8, 7,10, 5, 4,16, 1,13,14, 9,11,12, 3, 6, 0,
         5,15,14, 9, 0,11,13, 4, 3,12, 8,10, 7, 1, 6, 2,16,
        10,11,14, 5,15, 2,13, 6, 1, 3, 4, 7,12,16, 9, 0, 8,
        13, 2,12,14, 8, 0, 7, 4, 9,15,11, 6, 3,16, 1, 5,10,
        13, 5, 2,16, 6, 0, 4, 8,10, 1, 3,14, 9,11,12, 7,15,
         8,12, 3,14, 0, 4,16, 6, 2,11, 1, 7, 9,15,13, 5,10,
        16,14, 2, 9, 7, 4,11, 0, 6,12,10, 8, 1,13,15, 5, 3,
         5,16, 6, 1,10,15,11, 3, 8,14, 2,12, 0, 7, 9, 4,13,
         4,12, 8,10,14, 9, 6,11,15, 0, 3,13, 7, 2, 5,16, 1,
        10,14,16,11,12, 2, 5, 3, 4, 7, 0, 1, 6, 9,13, 8,15,
         2, 1,11,15,16, 6,12, 3,10,13, 8, 5, 4, 0, 7, 9,14,
        15,14, 1, 5,16, 2,12, 8, 9, 6,11,10, 3, 0, 7, 4,13,
         1, 5,10, 9, 2, 8, 4,16, 7,11, 3,12, 6,14, 0,13,15,
         3, 7,16,10,13, 2, 6, 8,14,15,12,11, 0, 9, 1, 4, 5,
        15, 2,14, 8, 5,16, 3,13, 4, 1, 9,12,10, 0, 6, 7,11,
        14,12,11,15,16,10, 2, 9, 8, 4, 3, 1,13, 5, 7, 0, 6,
         7,13, 2,11,14, 0, 1, 8, 9,10,16, 4, 6,12, 5, 3,15,
        16, 1,11, 4, 3, 9, 6,13, 5, 7,10,15,14,12, 2, 0, 8,
         0, 5, 2,10,16,12, 6, 3,11,14, 1, 9, 7,15, 4, 8,13,
         8,13,11, 4,10, 6, 7,16, 5,12, 9,14, 2, 3, 0,15, 1,
         9, 6,12,16, 4, 7, 3, 0, 2,15,13, 8,11,14, 5,10, 1,
         8,12, 0, 6,15, 7, 4,13,14,10, 1, 9, 5, 3,11, 2,16,
         5,15, 9,10,16, 6,11, 0, 7,13, 8,14, 3, 4, 1,12, 2,
        15,14, 2, 9, 4,11, 7, 1, 6,10, 5, 0, 8,12,13,16, 3,
        15,16, 0,10, 3,12,11, 7, 1, 8, 6,13, 4, 5, 9, 2,14,
        15, 4, 7,13,14, 2, 9,10,16, 1,11,12, 8, 0, 3, 5, 6,
        15, 8,13, 0, 4, 7, 3,14, 5,12, 2, 9,10,11, 6,16, 1,
         0, 7, 5, 3, 1,14,16, 4, 2,15,12, 8,10, 6, 9,11,13,
         7, 6, 0,15,16,11, 8, 1, 5,12,13,14,10, 9, 3, 2, 4,
        14,16,10, 6, 4,13, 3, 1,15,12,11, 8, 9, 5, 0, 7, 2,
         9, 3, 5,15,10,11, 8, 7, 2,14, 6,13, 0, 4, 1,12,16,
         4, 6, 7,14, 5, 3,12, 1,13, 9,16, 2, 0,10, 8,11,15,
      + zpool create -f oss3pool draid1 cfg=17.nvl /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr
      + zpool list
      NAME       SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
      oss3pool  14,7G   612K  14,7G         -     0%     0%  1.00x  ONLINE  -
      + zpool status
        pool: oss3pool
       state: ONLINE
        scan: none requested
      config:
      
      	NAME            STATE     READ WRITE CKSUM
      	oss3pool        ONLINE       0     0     0
      	  draid1-0      ONLINE       0     0     0
      	    sdb         ONLINE       0     0     0
      	    sdc         ONLINE       0     0     0
      	    sdd         ONLINE       0     0     0
      	    sde         ONLINE       0     0     0
      	    sdf         ONLINE       0     0     0
      	    sdg         ONLINE       0     0     0
      	    sdh         ONLINE       0     0     0
      	    sdi         ONLINE       0     0     0
      	    sdj         ONLINE       0     0     0
      	    sdk         ONLINE       0     0     0
      	    sdl         ONLINE       0     0     0
      	    sdm         ONLINE       0     0     0
      	    sdn         ONLINE       0     0     0
      	    sdo         ONLINE       0     0     0
      	    sdp         ONLINE       0     0     0
      	    sdq         ONLINE       0     0     0
      	    sdr         ONLINE       0     0     0
      	spares
      	  $draid1-0-s0  AVAIL   
      	  $draid1-0-s1  AVAIL   
      
      errors: No known data errors
      + grep oss3pool
      + mount
      oss3pool on /oss3pool type zfs (rw,xattr,noacl)
      + mkfs.lustre --reformat --ost --backfstype=zfs --fsname=ZFS01 --index=3 --mgsnode=mgs@tcp0 oss3pool/ZFS01
      
         Permanent disk data:
      Target:     ZFS01:OST0003
      Index:      3
      Lustre FS:  ZFS01
      Mount type: zfs
      Flags:      0x62
                    (OST first_time update )
      Persistent mount opts: 
      Parameters: mgsnode=172.17.32.220@tcp
      
      mkfs_cmd = zfs create -o canmount=off -o xattr=sa oss3pool/ZFS01
      Writing oss3pool/ZFS01 properties
        lustre:version=1
        lustre:flags=98
        lustre:index=3
        lustre:fsname=ZFS01
        lustre:svname=ZFS01:OST0003
        lustre:mgsnode=172.17.32.220@tcp
      + '[' -d /lustre/ZFS01/. ']'
      + mount -v -t lustre oss3pool/ZFS01 /lustre/ZFS01
      arg[0] = /sbin/mount.lustre
      arg[1] = -v
      arg[2] = -o
      arg[3] = rw
      arg[4] = oss3pool/ZFS01
      arg[5] = /lustre/ZFS01
      source = oss3pool/ZFS01 (oss3pool/ZFS01), target = /lustre/ZFS01
      options = rw
      checking for existing Lustre data: found
      Writing oss3pool/ZFS01 properties
        lustre:version=1
        lustre:flags=34
        lustre:index=3
        lustre:fsname=ZFS01
        lustre:svname=ZFS01:OST0003
        lustre:mgsnode=172.17.32.220@tcp
      mounting device oss3pool/ZFS01 at /lustre/ZFS01, flags=0x1000000 options=osd=osd-zfs,,mgsnode=172.17.32.220@tcp,virgin,update,param=mgsnode=172.17.32.220@tcp,svname=ZFS01-OST0003,device=oss3pool/ZFS01
      mount.lustre: mount oss3pool/ZFS01 at /lustre/ZFS01 failed: Address already in use retries left: 0
      mount.lustre: mount oss3pool/ZFS01 at /lustre/ZFS01 failed: Address already in use
      The target service's index is already in use. (oss3pool/ZFS01)
      [root@node26 ~]# mount -v -t lustre oss3pool/ZFS01 /lustre/ZFS01
      arg[0] = /sbin/mount.lustre
      arg[1] = -v
      arg[2] = -o
      arg[3] = rw
      arg[4] = oss3pool/ZFS01
      arg[5] = /lustre/ZFS01
      source = oss3pool/ZFS01 (oss3pool/ZFS01), target = /lustre/ZFS01
      options = rw
      checking for existing Lustre data: found
      mounting device oss3pool/ZFS01 at /lustre/ZFS01, flags=0x1000000 options=osd=osd-zfs,,mgsnode=172.17.32.220@tcp,virgin,param=mgsnode=172.17.32.220@tcp,svname=ZFS01-OST0003,device=oss3pool/ZFS01
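
The decimal lustre:flags properties in the transcript above appear to be the same bit field that mkfs.lustre prints in hex (98 = 0x62, 34 = 0x22). The sketch below decodes such a value. The bit meanings are an assumption inferred only from output in this ticket (0x62 is printed as "OST first_time update", and 0x42 as "OST update" in a later comment); it is not a complete flag table, and decode_lustre_flags is a hypothetical helper name.

```shell
# Decode a lustre:flags value (decimal, or hex with a 0x prefix) into flag names.
# ASSUMPTION: bit meanings are inferred from this ticket's mkfs.lustre output only.
decode_lustre_flags() (
    flags=$(( $1 ))
    names=""
    # 0x02: target type OST
    [ $(( $flags & 0x02 )) -ne 0 ] && names="OST"
    # 0x20: printed as "first_time" by mkfs.lustre, "virgin" in the mount options
    [ $(( $flags & 0x20 )) -ne 0 ] && names="${names:+$names }first_time"
    # 0x40: printed as "update"
    [ $(( $flags & 0x40 )) -ne 0 ] && names="${names:+$names }update"
    printf '0x%x (%s)\n' "$flags" "$names"
)
```

Calling decode_lustre_flags 34 shows that the property written during the first mount attempt still carries the first_time bit but no longer the update bit.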
      
      

        Activity

          [LU-9370] Lustre 2.9 + zfs 0.7 + draid = OSS hangup

          Thanks for the support, folks!

          I'll try to contact the Hyper-V admin and re-create the disk set to work this out.
          But it may take a while.

          jno jno (Inactive) added a comment

          I understand, I have been there before. Please let us know if we can close this ticket. If you have any dRAID testing questions please keep in contact with us.

          jsalians_intel John Salinas (Inactive) added a comment

          Wow!

          It was a Hyper-V host out of my control...

          Thanks for the pointer, I'll go dig.

          jno jno (Inactive) added a comment

          It looks like you are getting further, but the mount fails early because of underlying errors:

          [ 1946.710418] LNet: HW CPU cores: 4, npartitions: 1
          [ 1946.718890] alg: No test for adler32 (adler32-zlib)
          [ 1946.719309] alg: No test for crc32 (crc32-table)
          [ 1951.762047] sha512_ssse3: Using AVX optimized SHA-512 implementation
          [ 1954.979635] Lustre: Lustre: Build Version: 2.9.0
          [ 1955.302507] LNet: Added LNI 172.17.32.226@tcp [8/256/0/180]
          [ 1955.302711] LNet: Accept secure, port 988
          [ 1959.056006] GPT:disk_guids don't match.
          [ 1959.056034] GPT:partition_entry_array_crc32 values don't match: 0x5d3c877c != 0x443a8464
          [ 1959.056037] GPT: Use GNU Parted to correct GPT errors.
          [ 1959.056059]  sdb: sdb1 sdb9
          [ 1959.230444]  sdb: sdb1 sdb9
          [ 1959.406374] GPT:disk_guids don't match.
          [ 1959.406384] GPT:partition_entry_array_crc32 values don't match: 0x29fa53ae != 0xa19347b0
          [ 1959.406387] GPT: Use GNU Parted to correct GPT errors.
          [ 1959.406408]  sdc: sdc1 sdc9
          [ 1959.610229]  sdc: sdc1 sdc9
          [ 1959.821418] Alternate GPT is invalid, using primary GPT.
          [ 1959.821444]  sdd: sdd1 sdd9
          [ 1959.903091]  sdd: sdd1 sdd9
          [ 1960.088271] GPT:disk_guids don't match.
          [ 1960.088279] GPT:partition_entry_array_crc32 values don't match: 0xda543dc7 != 0xdb0d75f4
          [ 1960.088281] GPT: Use GNU Parted to correct GPT errors.
          [ 1960.088302]  sde: sde1 sde9
          [ 1960.324063]  sde: sde1 sde9
          [ 1960.347788]  sde: sde1 sde9
          [ 1960.515198] Alternate GPT is invalid, using primary GPT.
          [ 1960.515225]  sdf: sdf1 sdf9
          [ 1960.845503]  sdf: sdf1 sdf9
          [ 1960.869365]  sdf: sdf1 sdf9
          [ 1961.018646] GPT:disk_guids don't match.
          [ 1961.018654] GPT:partition_entry_array_crc32 values don't match: 0xf42c8d7b != 0x97a63590
          [ 1961.018657] GPT: Use GNU Parted to correct GPT errors.
          [ 1961.018679]  sdg: sdg1 sdg9
          [ 1961.349725]  sdg: sdg1 sdg9
          [ 1961.373959]  sdg: sdg1 sdg9
          [ 1961.524544] Alternate GPT is invalid, using primary GPT.
          [ 1961.524569]  sdh: sdh1 sdh9
          [ 1961.655219]  sdh: sdh1 sdh9
          [ 1961.814506] GPT:disk_guids don't match.
          [ 1961.814515] GPT:partition_entry_array_crc32 values don't match: 0x3d5540f9 != 0x85f3e2e6
          [ 1961.814517] GPT: Use GNU Parted to correct GPT errors.
          [ 1961.814537]  sdi: sdi1 sdi9
          [ 1961.867240]  sdi: sdi1 sdi9
          [ 1962.081393] Alternate GPT is invalid, using primary GPT.
          [ 1962.081420]  sdj: sdj1 sdj9
          [ 1962.261463]  sdj: sdj1 sdj9
          [ 1962.485817] Alternate GPT is invalid, using primary GPT.
          [ 1962.485841]  sdk: sdk1 sdk9
          [ 1962.617151]  sdk: sdk1 sdk9
          [ 1962.828196] GPT:disk_guids don't match.
          [ 1962.828206] GPT:partition_entry_array_crc32 values don't match: 0x7cf05c31 != 0xbfb68e7
          [ 1962.828208] GPT: Use GNU Parted to correct GPT errors.
          [ 1962.828232]  sdl: sdl1 sdl9
          [ 1962.990115]  sdl: sdl1 sdl9
          [ 1963.188994] GPT:disk_guids don't match.
          [ 1963.189028] GPT:partition_entry_array_crc32 values don't match: 0x38ff1612 != 0xc53037f5
          [ 1963.189031] GPT: Use GNU Parted to correct GPT errors.
          [ 1963.189055]  sdm: sdm1 sdm9
          [ 1963.453695]  sdm: sdm1 sdm9
          [ 1963.622171] GPT:disk_guids don't match.
          [ 1963.622179] GPT:partition_entry_array_crc32 values don't match: 0x6577aef4 != 0x1624515d
          [ 1963.622182] GPT: Use GNU Parted to correct GPT errors.
          [ 1963.622202]  sdn: sdn1 sdn9
          [ 1963.927932]  sdn: sdn1 sdn9
          [ 1964.131710] Alternate GPT is invalid, using primary GPT.
          [ 1964.131737]  sdo: sdo1 sdo9
          [ 1964.304545]  sdo: sdo1 sdo9
          [ 1964.537353] Alternate GPT is invalid, using primary GPT.
          [ 1964.537380]  sdp: sdp1 sdp9
          [ 1964.608130]  sdp: sdp1 sdp9
          [ 1964.861371] Alternate GPT is invalid, using primary GPT.
          [ 1964.861397]  sdq: sdq1 sdq9
          [ 1964.988531]  sdq: sdq1 sdq9
          [ 1965.295413] GPT:disk_guids don't match.
          [ 1965.295421] GPT:partition_entry_array_crc32 values don't match: 0x2cd0988d != 0x828d383e
          [ 1965.295424] GPT: Use GNU Parted to correct GPT errors.
          [ 1965.295458]  sdr: sdr1 sdr9
          [ 1965.577126]  sdr: sdr1 sdr9 
          

          I have seen this before when disks are exact clones of each other and have identical UUID/WWIDs, but there are probably other reasons as well.
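
One way to test the "identical UUID/WWID" theory is to list one identifier per disk and look for repeats. A minimal sketch, assuming the identifiers arrive on stdin as "<device> <identifier>" lines; find_duplicate_ids is a hypothetical helper, not part of any Lustre or ZFS tool.

```shell
# Print every identifier that is shared by more than one device.
# Input: lines of "<device> <identifier>"; lines without an identifier are skipped.
find_duplicate_ids() {
    awk 'NF >= 2 {
             # Remember the devices seen for each identifier, in input order.
             if (count[$2]++) devs[$2] = devs[$2] "," $1
             else devs[$2] = $1
         }
         END {
             for (id in count)
                 if (count[id] > 1) print id ": " devs[id]
         }' | sort
}
```

Something like lsblk -dn -o NAME,WWN | find_duplicate_ids could feed it, assuming the installed lsblk supports the WWN column and the hypervisor exposes a WWN at all; any other per-disk identifier can be checked the same way.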

          jsalians_intel John Salinas (Inactive) added a comment
          jno jno (Inactive) added a comment - edited

          Well, same result on a different day, this time with a fairly arbitrary set of modules loaded up front:

          [root@node26 ~]# ./mkzpool.sh 
          + modules=(spl zfs lnet lustre ost osd_zfs)
          + typeset -a modules
          + ./collect-info.sh
           adding: debug_info.20170502_050611.727878977_0400-3460-node26/ (stored 0%)
           adding: debug_info.20170502_050611.727878977_0400-3460-node26/Now (deflated 51%)
           adding: debug_info.20170502_050611.727878977_0400-3460-node26/OUTPUT.script.log (deflated 89%)
           adding: debug_info.20170502_050611.727878977_0400-3460-node26/OUTPUT.rpm-qa.txt (deflated 69%)
           adding: debug_info.20170502_050611.727878977_0400-3460-node26/OUTPUT.lsmod.txt (deflated 66%)
           adding: debug_info.20170502_050611.727878977_0400-3460-node26/OUTPUT.lsblk.txt (deflated 79%)
           adding: debug_info.20170502_050611.727878977_0400-3460-node26/OUTPUT.df.txt (deflated 54%)
           adding: debug_info.20170502_050611.727878977_0400-3460-node26/OUTPUT.mount.txt (deflated 74%)
           adding: debug_info.20170502_050611.727878977_0400-3460-node26/OUTPUT.show_kernelmod_params.txt (deflated 67%)
           adding: debug_info.20170502_050611.727878977_0400-3460-node26/OUTPUT.kernel_debug_trace.txt (deflated 57%)
           adding: debug_info.20170502_050611.727878977_0400-3460-node26/OUTPUT.dmesg.txt (deflated 73%)
           adding: debug_info.20170502_050611.727878977_0400-3460-node26/OUTPUT.zpool_events.txt (deflated 61%)
           adding: debug_info.20170502_050611.727878977_0400-3460-node26/OUTPUT.zpool_events_verbose.txt (deflated 79%)
           adding: debug_info.20170502_050611.727878977_0400-3460-node26/OUTPUT.lctl_dl.txt (stored 0%)
           adding: debug_info.20170502_050611.727878977_0400-3460-node26/OUTPUT.lctl_dk.txt (deflated 12%)
           adding: debug_info.20170502_050611.727878977_0400-3460-node26/OUTPUT.messages (deflated 84%)
          + modFilter=
          + for module in '${modules[*]}'
          + echo '+ [spl]'
          + [spl]
          ++ test -z ''
          ++ echo ''
          + modFilter=spl
          + modprobe -v spl
          + for module in '${modules[*]}'
          + echo '+ [zfs]'
          + [zfs]
          ++ test -z spl
          ++ echo 'spl|'
          + modFilter='spl|zfs'
          + modprobe -v zfs
          + for module in '${modules[*]}'
          + echo '+ [lnet]'
          + [lnet]
          ++ test -z 'spl|zfs'
          ++ echo 'spl|zfs|'
          + modFilter='spl|zfs|lnet'
          + modprobe -v lnet
          insmod /lib/modules/3.10.0-514.16.1.el7.x86_64/extra/kernel/net/lustre/libcfs.ko 
          insmod /lib/modules/3.10.0-514.16.1.el7.x86_64/extra/kernel/net/lustre/lnet.ko 
          + for module in '${modules[*]}'
          + echo '+ [lustre]'
          + [lustre]
          ++ test -z 'spl|zfs|lnet'
          ++ echo 'spl|zfs|lnet|'
          + modFilter='spl|zfs|lnet|lustre'
          + modprobe -v lustre
          insmod /lib/modules/3.10.0-514.16.1.el7.x86_64/extra/kernel/fs/lustre/obdclass.ko 
          insmod /lib/modules/3.10.0-514.16.1.el7.x86_64/extra/kernel/fs/lustre/ptlrpc.ko 
          insmod /lib/modules/3.10.0-514.16.1.el7.x86_64/extra/kernel/fs/lustre/fld.ko 
          insmod /lib/modules/3.10.0-514.16.1.el7.x86_64/extra/kernel/fs/lustre/fid.ko 
          insmod /lib/modules/3.10.0-514.16.1.el7.x86_64/extra/kernel/fs/lustre/lov.ko 
          insmod /lib/modules/3.10.0-514.16.1.el7.x86_64/extra/kernel/fs/lustre/mdc.ko 
          insmod /lib/modules/3.10.0-514.16.1.el7.x86_64/extra/kernel/fs/lustre/lmv.ko 
          insmod /lib/modules/3.10.0-514.16.1.el7.x86_64/extra/kernel/fs/lustre/lustre.ko 
          + for module in '${modules[*]}'
          + echo '+ [ost]'
          + [ost]
          ++ test -z 'spl|zfs|lnet|lustre'
          ++ echo 'spl|zfs|lnet|lustre|'
          + modFilter='spl|zfs|lnet|lustre|ost'
          + modprobe -v ost
          insmod /lib/modules/3.10.0-514.16.1.el7.x86_64/extra/kernel/fs/lustre/ost.ko 
          + for module in '${modules[*]}'
          + echo '+ [osd_zfs]'
          + [osd_zfs]
          ++ test -z 'spl|zfs|lnet|lustre|ost'
          ++ echo 'spl|zfs|lnet|lustre|ost|'
          + modFilter='spl|zfs|lnet|lustre|ost|osd_zfs'
          + modprobe -v osd_zfs
          insmod /lib/modules/3.10.0-514.16.1.el7.x86_64/extra/kernel/fs/lustre/lquota.ko 
          insmod /lib/modules/3.10.0-514.16.1.el7.x86_64/extra/kernel/fs/lustre/osd_zfs.ko 
          + lsmod
          + grep -E 'spl|zfs|lnet|lustre|ost|osd_zfs'
          osd_zfs 252589 0 
          lquota 354067 1 osd_zfs
          ost 14991 0 
          lustre 816649 0 
          lmv 222021 1 lustre
          mdc 173180 1 lustre
          lov 295937 1 lustre
          fid 90581 2 mdc,osd_zfs
          fld 85860 3 fid,lmv,osd_zfs
          ptlrpc 2129791 8 fid,fld,lmv,mdc,lov,ost,lquota,lustre
          obdclass 1909130 20 fid,fld,lmv,mdc,lov,ost,lquota,lustre,ptlrpc,osd_zfs
          lnet 444969 4 lustre,obdclass,ptlrpc,ksocklnd
          libcfs 405310 13 fid,fld,lmv,mdc,lov,ost,lnet,lquota,lustre,obdclass,ptlrpc,osd_zfs,ksocklnd
          zfs 4026085 1 osd_zfs
          zunicode 331170 1 zfs
          zavl 19839 1 zfs
          icp 299501 1 zfs
          zcommon 77836 2 zfs,osd_zfs
          znvpair 93348 3 zfs,zcommon,osd_zfs
          spl 130321 6 icp,zfs,zavl,zcommon,znvpair,osd_zfs
          zlib_deflate 26914 1 spl
          + zpool list
          + grep -w 'no pools available'
          + zpool destroy oss3pool
          + zpool list
          + grep -w 'no pools available'
          no pools available
          + '[' -f 17.nvl ']'
          + draidcfg -r 17.nvl
          dRAID1 vdev of 17 child drives: 3 x (4 data + 1 parity) and 2 distributed spare
          Using 32 base permutations
           15, 2, 8, 7,10, 5, 4,16, 1,13,14, 9,11,12, 3, 6, 0,
           5,15,14, 9, 0,11,13, 4, 3,12, 8,10, 7, 1, 6, 2,16,
           10,11,14, 5,15, 2,13, 6, 1, 3, 4, 7,12,16, 9, 0, 8,
           13, 2,12,14, 8, 0, 7, 4, 9,15,11, 6, 3,16, 1, 5,10,
           13, 5, 2,16, 6, 0, 4, 8,10, 1, 3,14, 9,11,12, 7,15,
           8,12, 3,14, 0, 4,16, 6, 2,11, 1, 7, 9,15,13, 5,10,
           16,14, 2, 9, 7, 4,11, 0, 6,12,10, 8, 1,13,15, 5, 3,
           5,16, 6, 1,10,15,11, 3, 8,14, 2,12, 0, 7, 9, 4,13,
           4,12, 8,10,14, 9, 6,11,15, 0, 3,13, 7, 2, 5,16, 1,
           10,14,16,11,12, 2, 5, 3, 4, 7, 0, 1, 6, 9,13, 8,15,
           2, 1,11,15,16, 6,12, 3,10,13, 8, 5, 4, 0, 7, 9,14,
           15,14, 1, 5,16, 2,12, 8, 9, 6,11,10, 3, 0, 7, 4,13,
           1, 5,10, 9, 2, 8, 4,16, 7,11, 3,12, 6,14, 0,13,15,
           3, 7,16,10,13, 2, 6, 8,14,15,12,11, 0, 9, 1, 4, 5,
           15, 2,14, 8, 5,16, 3,13, 4, 1, 9,12,10, 0, 6, 7,11,
           14,12,11,15,16,10, 2, 9, 8, 4, 3, 1,13, 5, 7, 0, 6,
           7,13, 2,11,14, 0, 1, 8, 9,10,16, 4, 6,12, 5, 3,15,
           16, 1,11, 4, 3, 9, 6,13, 5, 7,10,15,14,12, 2, 0, 8,
           0, 5, 2,10,16,12, 6, 3,11,14, 1, 9, 7,15, 4, 8,13,
           8,13,11, 4,10, 6, 7,16, 5,12, 9,14, 2, 3, 0,15, 1,
           9, 6,12,16, 4, 7, 3, 0, 2,15,13, 8,11,14, 5,10, 1,
           8,12, 0, 6,15, 7, 4,13,14,10, 1, 9, 5, 3,11, 2,16,
           5,15, 9,10,16, 6,11, 0, 7,13, 8,14, 3, 4, 1,12, 2,
           15,14, 2, 9, 4,11, 7, 1, 6,10, 5, 0, 8,12,13,16, 3,
           15,16, 0,10, 3,12,11, 7, 1, 8, 6,13, 4, 5, 9, 2,14,
           15, 4, 7,13,14, 2, 9,10,16, 1,11,12, 8, 0, 3, 5, 6,
           15, 8,13, 0, 4, 7, 3,14, 5,12, 2, 9,10,11, 6,16, 1,
           0, 7, 5, 3, 1,14,16, 4, 2,15,12, 8,10, 6, 9,11,13,
           7, 6, 0,15,16,11, 8, 1, 5,12,13,14,10, 9, 3, 2, 4,
           14,16,10, 6, 4,13, 3, 1,15,12,11, 8, 9, 5, 0, 7, 2,
           9, 3, 5,15,10,11, 8, 7, 2,14, 6,13, 0, 4, 1,12,16,
           4, 6, 7,14, 5, 3,12, 1,13, 9,16, 2, 0,10, 8,11,15,
          + zpool create -f oss3pool draid1 cfg=17.nvl /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr
          + zpool list
          NAME       SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
          oss3pool  14,7G   612K  14,7G         -     0%     0%  1.00x  ONLINE  -
          + zpool status
            pool: oss3pool
           state: ONLINE
            scan: none requested
          config:
          
          	NAME            STATE     READ WRITE CKSUM
          	oss3pool        ONLINE       0     0     0
          	  draid1-0      ONLINE       0     0     0
          	    sdb         ONLINE       0     0     0
          	    sdc         ONLINE       0     0     0
          	    sdd         ONLINE       0     0     0
          	    sde         ONLINE       0     0     0
          	    sdf         ONLINE       0     0     0
          	    sdg         ONLINE       0     0     0
          	    sdh         ONLINE       0     0     0
          	    sdi         ONLINE       0     0     0
          	    sdj         ONLINE       0     0     0
          	    sdk         ONLINE       0     0     0
          	    sdl         ONLINE       0     0     0
          	    sdm         ONLINE       0     0     0
          	    sdn         ONLINE       0     0     0
          	    sdo         ONLINE       0     0     0
          	    sdp         ONLINE       0     0     0
          	    sdq         ONLINE       0     0     0
          	    sdr         ONLINE       0     0     0
          	spares
          	  $draid1-0-s0  AVAIL   
          	  $draid1-0-s1  AVAIL   
          
          errors: No known data errors
          + mount
          + grep oss3pool
          oss3pool on /oss3pool type zfs (rw,xattr,noacl)
          + ./collect-info.sh
           adding: debug_info.20170502_050643.778657696_0400-4594-node26/ (stored 0%)
           adding: debug_info.20170502_050643.778657696_0400-4594-node26/Now (deflated 51%)
           adding: debug_info.20170502_050643.778657696_0400-4594-node26/OUTPUT.script.log (deflated 90%)
           adding: debug_info.20170502_050643.778657696_0400-4594-node26/OUTPUT.rpm-qa.txt (deflated 69%)
           adding: debug_info.20170502_050643.778657696_0400-4594-node26/OUTPUT.lsmod.txt (deflated 67%)
           adding: debug_info.20170502_050643.778657696_0400-4594-node26/OUTPUT.lsblk.txt (deflated 79%)
           adding: debug_info.20170502_050643.778657696_0400-4594-node26/OUTPUT.df.txt (deflated 55%)
           adding: debug_info.20170502_050643.778657696_0400-4594-node26/OUTPUT.mount.txt (deflated 74%)
           adding: debug_info.20170502_050643.778657696_0400-4594-node26/OUTPUT.show_kernelmod_params.txt (deflated 67%)
           adding: debug_info.20170502_050643.778657696_0400-4594-node26/OUTPUT.kernel_debug_trace.txt (deflated 57%)
           adding: debug_info.20170502_050643.778657696_0400-4594-node26/OUTPUT.dmesg.txt (deflated 73%)
           adding: debug_info.20170502_050643.778657696_0400-4594-node26/OUTPUT.zpool_events.txt (deflated 69%)
           adding: debug_info.20170502_050643.778657696_0400-4594-node26/OUTPUT.zpool_events_verbose.txt (deflated 85%)
           adding: debug_info.20170502_050643.778657696_0400-4594-node26/OUTPUT.lctl_dl.txt (stored 0%)
           adding: debug_info.20170502_050643.778657696_0400-4594-node26/OUTPUT.lctl_dk.txt (deflated 68%)
           adding: debug_info.20170502_050643.778657696_0400-4594-node26/OUTPUT.messages (deflated 84%)
          + mkfs.lustre --reformat --replace --ost --backfstype=zfs --fsname=ZFS01 --index=3 --mgsnode=mgs@tcp0 oss3pool/ZFS01
          
             Permanent disk data:
          Target:     ZFS01-OST0003
          Index:      3
          Lustre FS:  ZFS01
          Mount type: zfs
          Flags:      0x42
                        (OST update )
          Persistent mount opts: 
          Parameters: mgsnode=172.17.32.220@tcp
          
          mkfs_cmd = zfs create -o canmount=off -o xattr=sa oss3pool/ZFS01
          Writing oss3pool/ZFS01 properties
           lustre:version=1
           lustre:flags=66
           lustre:index=3
           lustre:fsname=ZFS01
           lustre:svname=ZFS01-OST0003
           lustre:mgsnode=172.17.32.220@tcp
          + ./collect-info.sh
           adding: debug_info.20170502_050655.040550023_0400-5778-node26/ (stored 0%)
           adding: debug_info.20170502_050655.040550023_0400-5778-node26/Now (deflated 51%)
           adding: debug_info.20170502_050655.040550023_0400-5778-node26/OUTPUT.script.log (deflated 90%)
           adding: debug_info.20170502_050655.040550023_0400-5778-node26/OUTPUT.rpm-qa.txt (deflated 69%)
           adding: debug_info.20170502_050655.040550023_0400-5778-node26/OUTPUT.lsmod.txt (deflated 67%)
           adding: debug_info.20170502_050655.040550023_0400-5778-node26/OUTPUT.lsblk.txt (deflated 79%)
           adding: debug_info.20170502_050655.040550023_0400-5778-node26/OUTPUT.df.txt (deflated 56%)
           adding: debug_info.20170502_050655.040550023_0400-5778-node26/OUTPUT.mount.txt (deflated 74%)
           adding: debug_info.20170502_050655.040550023_0400-5778-node26/OUTPUT.show_kernelmod_params.txt (deflated 67%)
           adding: debug_info.20170502_050655.040550023_0400-5778-node26/OUTPUT.kernel_debug_trace.txt (deflated 57%)
           adding: debug_info.20170502_050655.040550023_0400-5778-node26/OUTPUT.dmesg.txt (deflated 73%)
           adding: debug_info.20170502_050655.040550023_0400-5778-node26/OUTPUT.zpool_events.txt (deflated 69%)
           adding: debug_info.20170502_050655.040550023_0400-5778-node26/OUTPUT.zpool_events_verbose.txt (deflated 85%)
           adding: debug_info.20170502_050655.040550023_0400-5778-node26/OUTPUT.lctl_dl.txt (stored 0%)
           adding: debug_info.20170502_050655.040550023_0400-5778-node26/OUTPUT.lctl_dk.txt (deflated 9%)
           adding: debug_info.20170502_050655.040550023_0400-5778-node26/OUTPUT.messages (deflated 84%)
          + '[' -d /lustre/ZFS01/. ']'
          + mount -v -t lustre oss3pool/ZFS01 /lustre/ZFS01
          arg[0] = /sbin/mount.lustre
          arg[1] = -v
          arg[2] = -o
          arg[3] = rw
          arg[4] = oss3pool/ZFS01
          arg[5] = /lustre/ZFS01
          source = oss3pool/ZFS01 (oss3pool/ZFS01), target = /lustre/ZFS01
          options = rw
          checking for existing Lustre data: found
          Writing oss3pool/ZFS01 properties
           lustre:version=1
           lustre:flags=2
           lustre:index=3
           lustre:fsname=ZFS01
           lustre:svname=ZFS01-OST0003
           lustre:mgsnode=172.17.32.220@tcp
          mounting device oss3pool/ZFS01 at /lustre/ZFS01, flags=0x1000000 options=osd=osd-zfs,,mgsnode=172.17.32.220@tcp,update,param=mgsnode=172.17.32.220@tcp,svname=ZFS01-OST0003,device=oss3pool/ZFS01
          

          Debug info will arrive in a few minutes...
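
The draidcfg summary in the transcript above ("17 child drives: 3 x (4 data + 1 parity) and 2 distributed spare") can be sanity-checked with a little arithmetic, and each base permutation row should contain every child index from 0 to 16 exactly once. A rough sketch of both checks; draid_children and is_permutation are hypothetical helper names, and the row format is assumed to be the comma-separated one that draidcfg -r prints here.

```shell
# children = groups * (data + parity) + spares
# Arguments: groups data parity spares
draid_children() (
    echo $(( $1 * ($2 + $3) + $4 ))
)

# Succeed if the comma-separated row holds each of 0..n-1 exactly once.
# Arguments: n row
is_permutation() (
    n=$1
    row=$2
    printf '%s\n' "$row" | tr ',' '\n' | tr -d ' ' | grep -v '^$' | sort -n |
        awk -v n="$n" '
            # After a numeric sort, line k of a valid permutation must equal k-1.
            $1 != NR - 1 { bad = 1 }
            END { exit (bad || NR != n) ? 1 : 0 }'
)
```

For this pool, draid_children 3 4 1 2 gives the child count that has to match the number of devices passed to zpool create, and is_permutation 17 "<row>" can be run over each line of the draidcfg output.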


          BTW, there are quite a few modules (37) here:

          [root@node26 ~]# find . -name '*.ko' 
          ./spl/module/spl/spl.ko
          ./spl/module/splat/splat.ko
          ./zfs/module/avl/zavl.ko
          ./zfs/module/icp/icp.ko
          ./zfs/module/nvpair/znvpair.ko
          ./zfs/module/unicode/zunicode.ko
          ./zfs/module/zcommon/zcommon.ko
          ./zfs/module/zfs/zfs.ko
          ./zfs/module/zpios/zpios.ko
          ./lustre-release/libcfs/libcfs/libcfs.ko
          ./lustre-release/lnet/klnds/o2iblnd/ko2iblnd.ko
          ./lustre-release/lnet/klnds/socklnd/ksocklnd.ko
          ./lustre-release/lnet/lnet/lnet.ko
          ./lustre-release/lnet/selftest/lnet_selftest.ko
          ./lustre-release/lustre/fid/fid.ko
          ./lustre-release/lustre/fld/fld.ko
          ./lustre-release/lustre/lfsck/lfsck.ko
          ./lustre-release/lustre/llite/llite_lloop.ko
          ./lustre-release/lustre/llite/lustre.ko
          ./lustre-release/lustre/lmv/lmv.ko
          ./lustre-release/lustre/lod/lod.ko
          ./lustre-release/lustre/lov/lov.ko
          ./lustre-release/lustre/mdc/mdc.ko
          ./lustre-release/lustre/mdd/mdd.ko
          ./lustre-release/lustre/mdt/mdt.ko
          ./lustre-release/lustre/mgc/mgc.ko
          ./lustre-release/lustre/mgs/mgs.ko
          ./lustre-release/lustre/obdclass/obdclass.ko
          ./lustre-release/lustre/obdclass/llog_test.ko
          ./lustre-release/lustre/obdecho/obdecho.ko
          ./lustre-release/lustre/ofd/ofd.ko
          ./lustre-release/lustre/osc/osc.ko
          ./lustre-release/lustre/osd-zfs/osd_zfs.ko
          ./lustre-release/lustre/osp/osp.ko
          ./lustre-release/lustre/ost/ost.ko
          ./lustre-release/lustre/ptlrpc/ptlrpc.ko
          ./lustre-release/lustre/quota/lquota.ko
          

          and it's not obvious to me which of them to load, or when.
          In debug_info.20170424_044934.896525307_0400-5362-node26.zip one can see that the spl and zfs modules are loaded.

          Ok, I'll try to load all or some of them now and re-try.
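For reference, a minimal sketch of loading a fixed set of modules in order and building an lsmod filter from the same list. The list (spl zfs lnet lustre ost osd_zfs) is the one that appears in the mkzpool.sh trace in this ticket; whether it is the right or minimal set for an OSS is an assumption. The helper names `join_filter` and `load_modules` are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: module list copied from the mkzpool.sh trace in this ticket.
# join_filter and load_modules are hypothetical helper names.

# join_filter MOD... -> prints "mod1|mod2|..." for use with grep -E
join_filter() {
    local IFS='|'
    printf '%s\n' "$*"
}

# load_modules LOADER MOD... -> runs "LOADER MOD" for each module, in order,
# and stops at the first failure. LOADER is deliberately left unquoted so that
# "modprobe -v" (real use) or "echo modprobe -v" (dry run) splits into words.
load_modules() {
    local loader=$1 module
    shift
    for module in "$@"; do
        $loader "$module" || return 1
    done
}

modules=(spl zfs lnet lustre ost osd_zfs)

# Dry run: only prints the commands that would be executed.
load_modules "echo modprobe -v" "${modules[@]}"

# On a real node (as root) this would be:
#   load_modules "modprobe -v" "${modules[@]}"
#   lsmod | grep -E "$(join_filter "${modules[@]}")"
```

modprobe resolves dependencies on its own (the trace shows `modprobe -v lustre` pulling in obdclass, ptlrpc and others), so only the top-level modules need to be named.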

          jno jno (Inactive) added a comment
          jsalians_intel John Salinas (Inactive) added a comment - - edited

          Right, but look at your lsmod output: it does not appear that lustre or lnet are loaded.

          $ grep lustre OUTPUT.lsmod.txt
          $

          This is why all of your Lustre commands are failing, for example:

          invalid parameter 'dump_kernel'
          open(dump_kernel) failed: No such file or directory

          Could you please load the Lustre and lnet kernel modules and try this again? Also, I do not see any output from the MDS. If there are still issues, that output would be helpful.

          Thank you
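The lsmod check above can be sketched as a small helper that reports which modules are absent from a saved lsmod listing. It matches on the first column only, so a module name that merely appears in another module's "Used by" column does not count. The function name `missing_modules` and the sample listing are made up for illustration; the sample is not taken from the attached archives.

```shell
#!/usr/bin/env bash
# missing_modules FILE MOD... -> prints each MOD that has no entry in FILE,
# where FILE holds saved `lsmod` output (module name in the first column).
# Returns 0 if nothing is missing, 1 otherwise. Hypothetical helper name.
missing_modules() {
    local file=$1 module rc=0
    shift
    for module in "$@"; do
        if ! awk -v m="$module" '$1 == m { found = 1 } END { exit !found }' "$file"; then
            printf '%s\n' "$module"
            rc=1
        fi
    done
    return "$rc"
}

# Demo on a small, made-up lsmod excerpt with only spl and zfs loaded.
sample=$(mktemp)
printf '%s\n' 'Module Size Used by' 'zfs 4026085 0' 'spl 130321 1 zfs' > "$sample"
missing_modules "$sample" spl zfs lnet lustre || echo "some modules are not loaded" >&2
rm -f "$sample"
```

Against one of the collected archives the call would look like `missing_modules OUTPUT.lsmod.txt lnet lustre ost osd_zfs`.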


          Hi there,

           

          Yes, it is installed, from a source build (make install).

          For example, tab completion shows:

          [root@node26 ~]# lustre_
          lustre_req_history lustre_routes_config lustre_rsync
          lustre_rmmod lustre_routes_conversion lustre_start
          [root@node26 ~]# lustre_
          
          jno jno (Inactive) added a comment

          Greetings,

          Maybe I am not looking at this right, but it does not look like Lustre is installed on the OSS node. Can you confirm? I didn't see the Lustre RPMs in the rpm list, and the Lustre commands did not appear to be able to run.

          jsalians_intel John Salinas (Inactive) added a comment
          jno jno (Inactive) added a comment - - edited

          I've added calls to collect-info.sh directly into the mkzpool.sh script (with some sleep/sync/sleep magic so that the last zip is kept).
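A rough sketch of what such a sleep/sync/sleep wrapper could look like. The actual code in mkzpool.sh is not shown in this ticket, so the function name `collect_and_flush` and the 2-second default delay are assumptions, and syncing before a hang only improves the odds that the last archive reaches disk.

```shell
#!/usr/bin/env bash
# collect_and_flush COLLECTOR [DELAY] -> runs COLLECTOR, then sleep/sync/sleep
# so the archive it wrote has a chance to reach disk before the next step.
# Hypothetical helper; DELAY (seconds, default 2) is an arbitrary choice.
collect_and_flush() {
    local collector=$1 delay=${2:-2} rc=0
    "$collector" || rc=$?   # remember the collector's exit status
    sleep "$delay"
    sync                    # ask the kernel to flush dirty data to disk
    sleep "$delay"
    return "$rc"
}

# Intended use in a script like mkzpool.sh, right before a risky step:
#   collect_and_flush ./collect-info.sh
#   mount -v -t lustre oss3pool/ZFS01 /lustre/ZFS01
```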

          Here we are:

          • debug_info.20170424_044901.648221747_0400-3235-node26.zip
          • debug_info.20170424_044924.231035970_0400-4268-node26.zip
          • debug_info.20170424_044934.896525307_0400-5362-node26.zip
          • - console at hang (I dunno what "dcla" means here)
             
            [root@node26 ~]# ./mkzpool.sh 
            + ./collect-info.sh
              adding: debug_info.20170424_044901.648221747_0400-3235-node26/ (stored 0%)
              adding: debug_info.20170424_044901.648221747_0400-3235-node26/Now (deflated 51%)
              adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.script.log (deflated 89%)
              adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.rpm-qa.txt (deflated 69%)
              adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.lsmod.txt (deflated 66%)
              adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.lsblk.txt (deflated 79%)
              adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.df.txt (deflated 54%)
              adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.mount.txt (deflated 74%)
              adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.show_kernelmod_params.txt (deflated 67%)
              adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.kernel_debug_trace.txt (deflated 57%)
              adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.dmesg.txt (deflated 73%)
              adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.zpool_events.txt (deflated 62%)
              adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.zpool_events_verbose.txt (deflated 79%)
              adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.lctl_dl.txt (stored 0%)
              adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.lctl_dk.txt (deflated 12%)
              adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.messages (deflated 83%)
            + zpool list
            + grep -w 'no pools available'
            + zpool destroy oss3pool
            + zpool list
            + grep -w 'no pools available'
            no pools available
            + '[' -f 17.nvl ']'
            + draidcfg -r 17.nvl
            dRAID1 vdev of 17 child drives: 3 x (4 data + 1 parity) and 2 distributed spare
            Using 32 base permutations
              15, 2, 8, 7,10, 5, 4,16, 1,13,14, 9,11,12, 3, 6, 0,
               5,15,14, 9, 0,11,13, 4, 3,12, 8,10, 7, 1, 6, 2,16,
              10,11,14, 5,15, 2,13, 6, 1, 3, 4, 7,12,16, 9, 0, 8,
              13, 2,12,14, 8, 0, 7, 4, 9,15,11, 6, 3,16, 1, 5,10,
              13, 5, 2,16, 6, 0, 4, 8,10, 1, 3,14, 9,11,12, 7,15,
               8,12, 3,14, 0, 4,16, 6, 2,11, 1, 7, 9,15,13, 5,10,
              16,14, 2, 9, 7, 4,11, 0, 6,12,10, 8, 1,13,15, 5, 3,
               5,16, 6, 1,10,15,11, 3, 8,14, 2,12, 0, 7, 9, 4,13,
               4,12, 8,10,14, 9, 6,11,15, 0, 3,13, 7, 2, 5,16, 1,
              10,14,16,11,12, 2, 5, 3, 4, 7, 0, 1, 6, 9,13, 8,15,
               2, 1,11,15,16, 6,12, 3,10,13, 8, 5, 4, 0, 7, 9,14,
              15,14, 1, 5,16, 2,12, 8, 9, 6,11,10, 3, 0, 7, 4,13,
               1, 5,10, 9, 2, 8, 4,16, 7,11, 3,12, 6,14, 0,13,15,
               3, 7,16,10,13, 2, 6, 8,14,15,12,11, 0, 9, 1, 4, 5,
              15, 2,14, 8, 5,16, 3,13, 4, 1, 9,12,10, 0, 6, 7,11,
              14,12,11,15,16,10, 2, 9, 8, 4, 3, 1,13, 5, 7, 0, 6,
               7,13, 2,11,14, 0, 1, 8, 9,10,16, 4, 6,12, 5, 3,15,
              16, 1,11, 4, 3, 9, 6,13, 5, 7,10,15,14,12, 2, 0, 8,
               0, 5, 2,10,16,12, 6, 3,11,14, 1, 9, 7,15, 4, 8,13,
               8,13,11, 4,10, 6, 7,16, 5,12, 9,14, 2, 3, 0,15, 1,
               9, 6,12,16, 4, 7, 3, 0, 2,15,13, 8,11,14, 5,10, 1,
               8,12, 0, 6,15, 7, 4,13,14,10, 1, 9, 5, 3,11, 2,16,
               5,15, 9,10,16, 6,11, 0, 7,13, 8,14, 3, 4, 1,12, 2,
              15,14, 2, 9, 4,11, 7, 1, 6,10, 5, 0, 8,12,13,16, 3,
              15,16, 0,10, 3,12,11, 7, 1, 8, 6,13, 4, 5, 9, 2,14,
              15, 4, 7,13,14, 2, 9,10,16, 1,11,12, 8, 0, 3, 5, 6,
              15, 8,13, 0, 4, 7, 3,14, 5,12, 2, 9,10,11, 6,16, 1,
               0, 7, 5, 3, 1,14,16, 4, 2,15,12, 8,10, 6, 9,11,13,
               7, 6, 0,15,16,11, 8, 1, 5,12,13,14,10, 9, 3, 2, 4,
              14,16,10, 6, 4,13, 3, 1,15,12,11, 8, 9, 5, 0, 7, 2,
               9, 3, 5,15,10,11, 8, 7, 2,14, 6,13, 0, 4, 1,12,16,
               4, 6, 7,14, 5, 3,12, 1,13, 9,16, 2, 0,10, 8,11,15,
            + zpool create -f oss3pool draid1 cfg=17.nvl /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr
            + zpool list
            NAME       SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
            oss3pool  14,7G   612K  14,7G         -     0%     0%  1.00x  ONLINE  -
            + zpool status
              pool: oss3pool
             state: ONLINE
              scan: none requested
            config:
            
            	NAME            STATE     READ WRITE CKSUM
            	oss3pool        ONLINE       0     0     0
            	  draid1-0      ONLINE       0     0     0
            	    sdb         ONLINE       0     0     0
            	    sdc         ONLINE       0     0     0
            	    sdd         ONLINE       0     0     0
            	    sde         ONLINE       0     0     0
            	    sdf         ONLINE       0     0     0
            	    sdg         ONLINE       0     0     0
            	    sdh         ONLINE       0     0     0
            	    sdi         ONLINE       0     0     0
            	    sdj         ONLINE       0     0     0
            	    sdk         ONLINE       0     0     0
            	    sdl         ONLINE       0     0     0
            	    sdm         ONLINE       0     0     0
            	    sdn         ONLINE       0     0     0
            	    sdo         ONLINE       0     0     0
            	    sdp         ONLINE       0     0     0
            	    sdq         ONLINE       0     0     0
            	    sdr         ONLINE       0     0     0
            	spares
            	  $draid1-0-s0  AVAIL   
            	  $draid1-0-s1  AVAIL   
            
            errors: No known data errors
            + grep oss3pool
            + mount
            oss3pool on /oss3pool type zfs (rw,xattr,noacl)
            + ./collect-info.sh
              adding: debug_info.20170424_044924.231035970_0400-4268-node26/ (stored 0%)
              adding: debug_info.20170424_044924.231035970_0400-4268-node26/Now (deflated 51%)
              adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.script.log (deflated 89%)
              adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.rpm-qa.txt (deflated 69%)
              adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.lsmod.txt (deflated 66%)
              adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.lsblk.txt (deflated 79%)
              adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.df.txt (deflated 55%)
              adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.mount.txt (deflated 74%)
              adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.show_kernelmod_params.txt (deflated 67%)
              adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.kernel_debug_trace.txt (deflated 57%)
              adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.dmesg.txt (deflated 73%)
              adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.zpool_events.txt (deflated 71%)
              adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.zpool_events_verbose.txt (deflated 85%)
              adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.lctl_dl.txt (stored 0%)
              adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.lctl_dk.txt (deflated 12%)
              adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.messages (deflated 83%)
            + mkfs.lustre --reformat --replace --ost --backfstype=zfs --fsname=ZFS01 --index=3 --mgsnode=mgs@tcp0 oss3pool/ZFS01
            
               Permanent disk data:
            Target:     ZFS01-OST0003
            Index:      3
            Lustre FS:  ZFS01
            Mount type: zfs
            Flags:      0x42
                          (OST update )
            Persistent mount opts: 
            Parameters: mgsnode=172.17.32.220@tcp
            
            mkfs_cmd = zfs create -o canmount=off -o xattr=sa oss3pool/ZFS01
            Writing oss3pool/ZFS01 properties
              lustre:version=1
              lustre:flags=66
              lustre:index=3
              lustre:fsname=ZFS01
              lustre:svname=ZFS01-OST0003
              lustre:mgsnode=172.17.32.220@tcp
            + ./collect-info.sh
              adding: debug_info.20170424_044934.896525307_0400-5362-node26/ (stored 0%)
              adding: debug_info.20170424_044934.896525307_0400-5362-node26/Now (deflated 51%)
              adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.script.log (deflated 89%)
              adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.rpm-qa.txt (deflated 69%)
              adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.lsmod.txt (deflated 66%)
              adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.lsblk.txt (deflated 79%)
              adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.df.txt (deflated 55%)
              adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.mount.txt (deflated 74%)
              adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.show_kernelmod_params.txt (deflated 67%)
              adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.kernel_debug_trace.txt (deflated 57%)
              adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.dmesg.txt (deflated 73%)
              adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.zpool_events.txt (deflated 72%)
              adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.zpool_events_verbose.txt (deflated 85%)
              adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.lctl_dl.txt (stored 0%)
              adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.lctl_dk.txt (deflated 12%)
              adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.messages (deflated 83%)
            + '[' -d /lustre/ZFS01/. ']'
            + mount -v -t lustre oss3pool/ZFS01 /lustre/ZFS01
            arg[0] = /sbin/mount.lustre
            arg[1] = -v
            arg[2] = -o
            arg[3] = rw
            arg[4] = oss3pool/ZFS01
            arg[5] = /lustre/ZFS01
            source = oss3pool/ZFS01 (oss3pool/ZFS01), target = /lustre/ZFS01
            options = rw
            checking for existing Lustre data: found
            Writing oss3pool/ZFS01 properties
              lustre:version=1
              lustre:flags=2
              lustre:index=3
              lustre:fsname=ZFS01
              lustre:svname=ZFS01-OST0003
              lustre:mgsnode=172.17.32.220@tcp
            mounting device oss3pool/ZFS01 at /lustre/ZFS01, flags=0x1000000 options=osd=osd-zfs,,mgsnode=172.17.32.220@tcp,update,param=mgsnode=172.17.32.220@tcp,svname=ZFS01-OST0003,device=oss3pool/ZFS01
            
            
            
            
            

          And yes, after fixing my mistake with --index, it now crashes on the first mount.

          jno jno (Inactive) added a comment - - edited I've added calls to collect-info.sh right into mkzpool.sh script (with sleep/sync/sleep magic to have the last zip kept). Here we are: debug_info.20170424_044901.648221747_0400-3235-node26.zip debug_info.20170424_044924.231035970_0400-4268-node26.zip debug_info.20170424_044934.896525307_0400-5362-node26.zip - console at hang (I dunno what "dcla" means here)   [root@node26 ~]# ./mkzpool.sh + ./collect-info.sh adding: debug_info.20170424_044901.648221747_0400-3235-node26/ (stored 0%) adding: debug_info.20170424_044901.648221747_0400-3235-node26/Now (deflated 51%) adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.script.log (deflated 89%) adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.rpm-qa.txt (deflated 69%) adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.lsmod.txt (deflated 66%) adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.lsblk.txt (deflated 79%) adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.df.txt (deflated 54%) adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.mount.txt (deflated 74%) adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.show_kernelmod_params.txt (deflated 67%) adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.kernel_debug_trace.txt (deflated 57%) adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.dmesg.txt (deflated 73%) adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.zpool_events.txt (deflated 62%) adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.zpool_events_verbose.txt (deflated 79%) adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.lctl_dl.txt (stored 0%) adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.lctl_dk.txt (deflated 12%) adding: debug_info.20170424_044901.648221747_0400-3235-node26/OUTPUT.messages (deflated 83%) + zpool 
list
+ grep -w 'no pools available'
+ zpool destroy oss3pool
+ zpool list
+ grep -w 'no pools available'
no pools available
+ '[' -f 17.nvl ']'
+ draidcfg -r 17.nvl
dRAID1 vdev of 17 child drives: 3 x (4 data + 1 parity) and 2 distributed spare
Using 32 base permutations
  15, 2, 8, 7,10, 5, 4,16, 1,13,14, 9,11,12, 3, 6, 0,
   5,15,14, 9, 0,11,13, 4, 3,12, 8,10, 7, 1, 6, 2,16,
  10,11,14, 5,15, 2,13, 6, 1, 3, 4, 7,12,16, 9, 0, 8,
  13, 2,12,14, 8, 0, 7, 4, 9,15,11, 6, 3,16, 1, 5,10,
  13, 5, 2,16, 6, 0, 4, 8,10, 1, 3,14, 9,11,12, 7,15,
   8,12, 3,14, 0, 4,16, 6, 2,11, 1, 7, 9,15,13, 5,10,
  16,14, 2, 9, 7, 4,11, 0, 6,12,10, 8, 1,13,15, 5, 3,
   5,16, 6, 1,10,15,11, 3, 8,14, 2,12, 0, 7, 9, 4,13,
   4,12, 8,10,14, 9, 6,11,15, 0, 3,13, 7, 2, 5,16, 1,
  10,14,16,11,12, 2, 5, 3, 4, 7, 0, 1, 6, 9,13, 8,15,
   2, 1,11,15,16, 6,12, 3,10,13, 8, 5, 4, 0, 7, 9,14,
  15,14, 1, 5,16, 2,12, 8, 9, 6,11,10, 3, 0, 7, 4,13,
   1, 5,10, 9, 2, 8, 4,16, 7,11, 3,12, 6,14, 0,13,15,
   3, 7,16,10,13, 2, 6, 8,14,15,12,11, 0, 9, 1, 4, 5,
  15, 2,14, 8, 5,16, 3,13, 4, 1, 9,12,10, 0, 6, 7,11,
  14,12,11,15,16,10, 2, 9, 8, 4, 3, 1,13, 5, 7, 0, 6,
   7,13, 2,11,14, 0, 1, 8, 9,10,16, 4, 6,12, 5, 3,15,
  16, 1,11, 4, 3, 9, 6,13, 5, 7,10,15,14,12, 2, 0, 8,
   0, 5, 2,10,16,12, 6, 3,11,14, 1, 9, 7,15, 4, 8,13,
   8,13,11, 4,10, 6, 7,16, 5,12, 9,14, 2, 3, 0,15, 1,
   9, 6,12,16, 4, 7, 3, 0, 2,15,13, 8,11,14, 5,10, 1,
   8,12, 0, 6,15, 7, 4,13,14,10, 1, 9, 5, 3,11, 2,16,
   5,15, 9,10,16, 6,11, 0, 7,13, 8,14, 3, 4, 1,12, 2,
  15,14, 2, 9, 4,11, 7, 1, 6,10, 5, 0, 8,12,13,16, 3,
  15,16, 0,10, 3,12,11, 7, 1, 8, 6,13, 4, 5, 9, 2,14,
  15, 4, 7,13,14, 2, 9,10,16, 1,11,12, 8, 0, 3, 5, 6,
  15, 8,13, 0, 4, 7, 3,14, 5,12, 2, 9,10,11, 6,16, 1,
   0, 7, 5, 3, 1,14,16, 4, 2,15,12, 8,10, 6, 9,11,13,
   7, 6, 0,15,16,11, 8, 1, 5,12,13,14,10, 9, 3, 2, 4,
  14,16,10, 6, 4,13, 3, 1,15,12,11, 8, 9, 5, 0, 7, 2,
   9, 3, 5,15,10,11, 8, 7, 2,14, 6,13, 0, 4, 1,12,16,
   4, 6, 7,14, 5, 3,12, 1,13, 9,16, 2, 0,10, 8,11,15,
+ zpool create -f oss3pool draid1 cfg=17.nvl /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr
+ zpool list
NAME       SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
oss3pool  14,7G   612K  14,7G         -     0%     0%  1.00x  ONLINE  -
+ zpool status
  pool: oss3pool
 state: ONLINE
  scan: none requested
config:

        NAME            STATE     READ WRITE CKSUM
        oss3pool        ONLINE       0     0     0
          draid1-0      ONLINE       0     0     0
            sdb         ONLINE       0     0     0
            sdc         ONLINE       0     0     0
            sdd         ONLINE       0     0     0
            sde         ONLINE       0     0     0
            sdf         ONLINE       0     0     0
            sdg         ONLINE       0     0     0
            sdh         ONLINE       0     0     0
            sdi         ONLINE       0     0     0
            sdj         ONLINE       0     0     0
            sdk         ONLINE       0     0     0
            sdl         ONLINE       0     0     0
            sdm         ONLINE       0     0     0
            sdn         ONLINE       0     0     0
            sdo         ONLINE       0     0     0
            sdp         ONLINE       0     0     0
            sdq         ONLINE       0     0     0
            sdr         ONLINE       0     0     0
        spares
          $draid1-0-s0  AVAIL
          $draid1-0-s1  AVAIL

errors: No known data errors
+ grep oss3pool
+ mount
oss3pool on /oss3pool type zfs (rw,xattr,noacl)
+ ./collect-info.sh
  adding: debug_info.20170424_044924.231035970_0400-4268-node26/ (stored 0%)
  adding: debug_info.20170424_044924.231035970_0400-4268-node26/Now (deflated 51%)
  adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.script.log (deflated 89%)
  adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.rpm-qa.txt (deflated 69%)
  adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.lsmod.txt (deflated 66%)
  adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.lsblk.txt (deflated 79%)
  adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.df.txt (deflated 55%)
  adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.mount.txt (deflated 74%)
  adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.show_kernelmod_params.txt (deflated 67%)
  adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.kernel_debug_trace.txt (deflated 57%)
  adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.dmesg.txt (deflated 73%)
  adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.zpool_events.txt (deflated 71%)
  adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.zpool_events_verbose.txt (deflated 85%)
  adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.lctl_dl.txt (stored 0%)
  adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.lctl_dk.txt (deflated 12%)
  adding: debug_info.20170424_044924.231035970_0400-4268-node26/OUTPUT.messages (deflated 83%)
+ mkfs.lustre --reformat --replace --ost --backfstype=zfs --fsname=ZFS01 --index=3 --mgsnode=mgs@tcp0 oss3pool/ZFS01

   Permanent disk data:
Target:     ZFS01-OST0003
Index:      3
Lustre FS:  ZFS01
Mount type: zfs
Flags:      0x42
              (OST update )
Persistent mount opts:
Parameters: mgsnode=172.17.32.220@tcp

mkfs_cmd = zfs create -o canmount=off -o xattr=sa oss3pool/ZFS01
Writing oss3pool/ZFS01 properties
  lustre:version=1
  lustre:flags=66
  lustre:index=3
  lustre:fsname=ZFS01
  lustre:svname=ZFS01-OST0003
  lustre:mgsnode=172.17.32.220@tcp
+ ./collect-info.sh
  adding: debug_info.20170424_044934.896525307_0400-5362-node26/ (stored 0%)
  adding: debug_info.20170424_044934.896525307_0400-5362-node26/Now (deflated 51%)
  adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.script.log (deflated 89%)
  adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.rpm-qa.txt (deflated 69%)
  adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.lsmod.txt (deflated 66%)
  adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.lsblk.txt (deflated 79%)
  adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.df.txt (deflated 55%)
  adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.mount.txt (deflated 74%)
  adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.show_kernelmod_params.txt (deflated 67%)
  adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.kernel_debug_trace.txt (deflated 57%)
  adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.dmesg.txt (deflated 73%)
  adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.zpool_events.txt (deflated 72%)
  adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.zpool_events_verbose.txt (deflated 85%)
  adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.lctl_dl.txt (stored 0%)
  adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.lctl_dk.txt (deflated 12%)
  adding: debug_info.20170424_044934.896525307_0400-5362-node26/OUTPUT.messages (deflated 83%)
+ '[' -d /lustre/ZFS01/. ']'
+ mount -v -t lustre oss3pool/ZFS01 /lustre/ZFS01
arg[0] = /sbin/mount.lustre
arg[1] = -v
arg[2] = -o
arg[3] = rw
arg[4] = oss3pool/ZFS01
arg[5] = /lustre/ZFS01
source = oss3pool/ZFS01 (oss3pool/ZFS01), target = /lustre/ZFS01
options = rw
checking for existing Lustre data: found
Writing oss3pool/ZFS01 properties
  lustre:version=1
  lustre:flags=2
  lustre:index=3
  lustre:fsname=ZFS01
  lustre:svname=ZFS01-OST0003
  lustre:mgsnode=172.17.32.220@tcp
mounting device oss3pool/ZFS01 at /lustre/ZFS01, flags=0x1000000 options=osd=osd-zfs,,mgsnode=172.17.32.220@tcp,update,param=mgsnode=172.17.32.220@tcp,svname=ZFS01-OST0003,device=oss3pool/ZFS01

And yes, it now (after fixing my mistake with --index) crashes on the 1st mount.
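
For anyone reproducing this, a quick sanity check on the layout that draidcfg reports: 3 groups of (4 data + 1 parity) plus 2 distributed spares should account for all 17 child drives, and 17 is also the number of devices passed to zpool create (/dev/sdb through /dev/sdr). The sketch below is only illustrative arithmetic. The function name draid_children is made up for this note and is not part of draidcfg, ZFS or Lustre.

```shell
#!/bin/sh
# Hypothetical helper (NOT part of draidcfg or ZFS): number of child
# drives implied by a dRAID layout.
# usage: draid_children GROUPS DATA PARITY SPARES
draid_children() {
    groups=$1 data=$2 parity=$3 spares=$4
    echo $(( groups * (data + parity) + spares ))
}

# Layout from the draidcfg output: 3 x (4 data + 1 parity) + 2 spares.
expected=$(draid_children 3 4 1 2)

# Device letters given to zpool create in the trace: /dev/sdb .. /dev/sdr.
set -- b c d e f g h i j k l m n o p q r
actual=$#

echo "layout needs $expected drives, trace passes $actual"
```

If the two numbers differ, the .nvl file and the device list do not describe the same vdev, which would be worth ruling out before looking at the mount crash itself.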

          People

            wc-triage WC Triage
            jno jno (Inactive)