Lustre / LU-2963

fail to create large stripe count file with -ENOSPC error

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major
    • Affects Version/s: Lustre 2.4.0

    Attachments

      1. hostlist.sh (3 kB)
      2. ior-lsc.pbs (1.0 kB)
      3. new-build.sh (4 kB)
      4. new-lustre-start.sh (6 kB)
      5. testfs-barry-224.conf (0.6 kB)

        Activity

          [LU-2963] fail to create large stripe count file with -ENOSPC error
          pjones Peter Jones added a comment -

          That is excellent news - thanks James!

simmonsja James A Simmons added a comment -

Excellent news. The patch from LU-4791 fixes this issue. You can close this ticket.

jamesanunez James Nunez (Inactive) added a comment -

James,

          The patch for LU-4791 has landed to master and there is a b2_4 patch available that has not landed yet. If you are able, please test with the LU-4791 patch and see if it fixes this issue.

          Thank you.

simmonsja James A Simmons added a comment -

The problem was that large_xattr was not set on the MDS; that has been resolved. What is not resolved is that when large stripe count support is not enabled, the default LOV_MAX_STRIPE is not 160 but something less, due to changes in the data being sent over the wire.

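A minimal sketch of how the large_xattr feature can be checked and enabled on the MDT backing device, for reference; /dev/mdt_dev is a placeholder, the MDT must be unmounted, and this assumes an e2fsprogs build that accepts the large_xattr feature name (newer releases call the same feature ea_inode). It is not necessarily the exact procedure used on this system.

    # check whether the MDT was formatted with large_xattr
    tune2fs -l /dev/mdt_dev | grep -i features

    # enable it on an existing, unmounted MDT (placeholder device path)
    tune2fs -O large_xattr /dev/mdt_dev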

jamesanunez James Nunez (Inactive) added a comment -

James,

          Have you run this large stripe job recently and, if so, are you still seeing this problem?

          Thanks,
          James

simmonsja James A Simmons added a comment -

For the last test shot we had to reformat the file system due to the changes in the FID format. After mounting the file system, I always run the large stripe job first.

adilger Andreas Dilger added a comment -

James,
          this ENOSPC problem may only be related to your test configuration, if there are individual OSTs that are out of space for some reason. Creating a file with specific striping will fail if it can't allocate at least 3/4 of the requested stripes (some margin is allowed so that applications don't get failures when a small number of OSTs are offline).

          Is it possible that earlier in your testing that some OSTs were filled up?

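For reference, a minimal sketch of how per-OST space can be checked from a client to rule this out; /mnt/lustre is a placeholder mount point.

    # free space and free inodes per OST
    lfs df -h /mnt/lustre
    lfs df -i /mnt/lustre

    # check whether any OSTs are deactivated on this client (active=0)
    lctl get_param osc.*.active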

simmonsja James A Simmons added a comment -

Sorry about the confusion with this ticket. When I opened this ticket after our first test shot, the problem had only been observed during our hero wide-stripe test, with 367 OSTs at the time. After that test shot I prepared a scaling job that creates directories with powers-of-two stripe counts. For the second test shot we ran this scaling job and discovered that the failure happens around 128 stripes, which is below the old 160-stripe limit. For this last test shot we again saw the problem not only at larger stripe counts (128 stripes again) but also for a single shared file that was striped across 4 OSTs. That shared file was being written to by 18K nodes. So I don't think we are seeing a general wide-stripe problem, but some other issue. We thought it might have been a grant issue, since the OSTs are only 250 GB in size, but Oleg told me during LUG that this is unlikely to be the case.

P.S. I can't seem to find the MDS ldump. I will talk to the admin tomorrow.

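A minimal sketch of a scaling job of this kind, for reference; /mnt/lustre is a placeholder mount point and this is not the actual job script used for these test shots.

    #!/bin/bash
    # Create one directory per power-of-two stripe count and check that a
    # file created inside it inherits that stripe count.
    MNT=/mnt/lustre    # placeholder mount point
    for c in 1 2 4 8 16 32 64 128 256; do
        dir=$MNT/stripe_$c
        mkdir -p $dir
        lfs setstripe -c $c $dir || { echo "setstripe -c $c failed"; continue; }
        touch $dir/testfile || echo "file create failed at $c stripes (ENOSPC?)"
        lfs getstripe -c $dir/testfile
    done
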
yujian Jian Yu added a comment - edited

          Lustre Branch: master
          Lustre Build: http://build.whamcloud.com/job/lustre-master/1441/
          Distro/Arch: RHEL6.4/x86_64 (kernel version: 2.6.32-358.2.1.el6)
          Network: TCP (1GigE)
          OSSCOUNT=4
          OSTCOUNT=224 (with 56 OSTs per OSS)

          MDSOPT="--mkfsoptions='-O large_xattr'"
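
For comparison, a minimal sketch of how this option maps onto formatting an MDT directly with mkfs.lustre; the fsname, MGS NID, index, and device path are placeholders.

    mkfs.lustre --mdt --fsname=testfs --index=0 --mgsnode=mgs@tcp0 \
        --mkfsoptions='-O large_xattr' /dev/mdt_dev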

          The parallel-scale test iorssf passed with 224 OSTs:
          https://maloo.whamcloud.com/test_sessions/ce2253de-af21-11e2-8f8e-52540035b04c

          As per run_ior() in lustre/tests/functions.sh, "$LFS setstripe $testdir -c -1" was performed before running the IOR command.

          Another test run with MDSOPT="--mkfsoptions='-O large_xattr -J size=1024'" also passed:
          https://maloo.whamcloud.com/test_sessions/94998110-af57-11e2-8f8e-52540035b04c

          + /usr/bin/lfs setstripe /mnt/lustre/d0.ior.ssf -c -1
          + /usr/bin/lfs getstripe -d /mnt/lustre/d0.ior.ssf
          stripe_count:   -1 stripe_size:    1048576 stripe_offset:  -1 
          + /usr/bin/IOR -a POSIX -C -g -b 1g -o /mnt/lustre/d0.ior.ssf/iorData -t 4m -v -e -w -r -i 5 -k
          

          More tests passed:

          # ls -l /mnt/lustre/
          total 0
          # lfs setstripe -c 224 /mnt/lustre/file
          # lfs getstripe -i -c -s /mnt/lustre/file
          lmm_stripe_count:   224
          lmm_stripe_size:    1048576
          lmm_stripe_offset:  133
          # yes | dd bs=1024 count=1048576 of=/mnt/lustre/file
          1048576+0 records in
          1048576+0 records out
          1073741824 bytes (1.1 GB) copied, 1288.4 s, 833 kB/s
          # lfs getstripe -i -c -s /mnt/lustre/file
          lmm_stripe_count:   224
          lmm_stripe_size:    1048576
          lmm_stripe_offset:  133
          
          # mkdir /mnt/lustre/dir
          # lfs getstripe -d /mnt/lustre/dir
          stripe_count:   1 stripe_size:    1048576 stripe_offset:  -1
          # lfs setstripe -c 224 /mnt/lustre/dir
          # lfs getstripe -d /mnt/lustre/dir
          stripe_count:   224 stripe_size:    1048576 stripe_offset:  -1
          # touch /mnt/lustre/dir/file
          # lfs getstripe -i -c -s /mnt/lustre/dir/file
          lmm_stripe_count:   224
          lmm_stripe_size:    1048576
          lmm_stripe_offset:  189
          # yes | dd bs=1024 count=1048576 of=/mnt/lustre/dir/file
          1048576+0 records in
          1048576+0 records out
          1073741824 bytes (1.1 GB) copied, 1359.48 s, 790 kB/s
          # lfs getstripe -i -c -s /mnt/lustre/dir/file
          lmm_stripe_count:   224
          lmm_stripe_size:    1048576
          lmm_stripe_offset:  189
          # lfs getstripe -d /mnt/lustre/dir
          stripe_count:   224 stripe_size:    1048576 stripe_offset:  -1
          
          di.wang Di Wang added a comment -

James: I just checked the debug log, but I did not find the MDS log there. Just to confirm: is the bug you hit in the last test still -ENOSPC when you try to create a file with 224 stripes?

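For reference, a minimal sketch of capturing the MDS-side debug log for a run like this; the debug mask and output path are placeholders.

    # on the MDS, before the run: widen the debug mask and clear the buffer
    lctl set_param debug=-1
    lctl clear

    # after reproducing the failure: dump the kernel debug buffer to a file
    lctl dk /tmp/mds-debug.log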

          People

  Assignee: yujian Jian Yu
  Reporter: simmonsja James A Simmons
  Votes: 0
  Watchers: 8
