
conf-sanity test_69: error: File too large

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version: Lustre 2.8.0
    • Affects Versions: Lustre 2.6.0, Lustre 2.4.2, Lustre 2.5.1, Lustre 2.7.0, Lustre 2.8.0
    • Environment: client and server: lustre-master build #1783 RHEL6.4 ldiskfs
    • Severity: 3
    • 11869

    Description

      This issue was created by maloo for sarah <sarah@whamcloud.com>

      This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/77c48f3e-59f3-11e3-98fc-52540035b04c.

      The sub-test test_69 failed with the following error:

      create file after reformat

      test log shows:

      CMD: client-16vm3 lctl get_param -n osc.lustre-OST0000-osc-MDT0000.prealloc_last_id
       - created 10000 (time 1385749407.25 total 47.32 last 47.32)
       - created 20000 (time 1385749454.94 total 95.01 last 47.69)
       - created 30000 (time 1385749503.05 total 143.12 last 48.11)
       - created 40000 (time 1385749551.55 total 191.62 last 48.50)
      open(/mnt/lustre/d0.conf-sanity/d69/f.conf-sanity.69-49787) error: File too large
      total: 49787 creates in 240.39 seconds: 207.11 creates/second
      stop ost1 service on client-16vm4
      CMD: client-16vm4 grep -c /mnt/ost1' ' /proc/mounts
      Stopping /mnt/ost1 (opts:-f) on client-16vm4
      CMD: client-16vm4 umount -d -f /mnt/ost1
      CMD: client-16vm4 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: client-16vm4 grep -c /mnt/ost1' ' /proc/mounts
      CMD: client-16vm4 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      CMD: client-16vm4 mkfs.lustre --mgsnode=client-16vm3@tcp --fsname=lustre --ost --index=0 --param=sys.timeout=20 --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-E lazy_itable_init\" --reformat --replace /dev/lvm-Role_OSS/P1
      
         Permanent disk data:
      Target:     lustre-OST0000
      Index:      0
      Lustre FS:  lustre
      Mount type: ldiskfs
      Flags:      0x42
                    (OST update )
      Persistent mount opts: errors=remount-ro
      Parameters: mgsnode=10.10.4.122@tcp sys.timeout=20
      
      device size = 2048MB
      formatting backing filesystem ldiskfs on /dev/lvm-Role_OSS/P1
      	target name  lustre-OST0000
      	4k blocks     50000
      	options        -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize=4290772992,lazy_journal_init -F
      mkfs_cmd = mke2fs -j -b 4096 -L lustre-OST0000  -I 256 -q -O extents,uninit_bg,dir_nlink,quota,huge_file,flex_bg -G 256 -E lazy_itable_init,resize=4290772992,lazy_journal_init -F /dev/lvm-Role_OSS/P1 50000
      Writing CONFIGS/mountdata
      start ost1 service on client-16vm4
      CMD: client-16vm4 mkdir -p /mnt/ost1
      CMD: client-16vm4 test -b /dev/lvm-Role_OSS/P1
      Starting ost1:   /dev/lvm-Role_OSS/P1 /mnt/ost1
      CMD: client-16vm4 mkdir -p /mnt/ost1; mount -t lustre   		                   /dev/lvm-Role_OSS/P1 /mnt/ost1
      CMD: client-16vm4 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/openmpi/bin:/usr/bin:/bin:/sbin:/usr/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh set_default_debug \"-1\" \"all -lnet -lnd -pinger\" 4 
      CMD: client-16vm4 e2label /dev/lvm-Role_OSS/P1 2>/dev/null
      Started lustre-OST0000
      CMD: client-16vm3 /usr/sbin/lctl get_param -n version
      CMD: client-16vm3 /usr/sbin/lctl get_param -n version
      CMD: client-16vm3 lctl list_param osc.lustre-OST*-osc             > /dev/null 2>&1
      CMD: client-16vm3 lctl get_param -n at_min
      can't get osc.lustre-OST0000-osc-MDT0000.ost_server_uuid by list_param in 40 secs
      Go with osc.lustre-OST0000-osc-MDT0000.ost_server_uuid directly
      CMD: client-16vm3 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/openmpi/bin:/usr/bin:/bin:/sbin:/usr/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh wait_import_state FULL osc.lustre-OST0000-osc-MDT0000.ost_server_uuid 40 
      client-16vm3: osc.lustre-OST0000-osc-MDT0000.ost_server_uuid in FULL state after 0 sec
      touch: cannot touch `/mnt/lustre/d0.conf-sanity/d69/f.conf-sanity.69-last': File too large
       conf-sanity test_69: @@@@@@ FAIL: create file after reformat 
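      The failing step in the log above is a createmany-style create loop that aborts with EFBIG ("File too large") once the OST can no longer supply objects. As a rough illustration only (the real createmany is a C utility in lustre/tests; the directory layout and count here are hypothetical), the pattern is:

      ```shell
      #!/bin/sh
      # Hypothetical sketch of the create loop behind the log above:
      # create numbered files until the target count is reached or
      # open() fails, e.g. with EFBIG once the OST runs out of
      # objects/inodes.
      create_until_error() {
          dir=$1
          count=$2
          i=0
          while [ "$i" -lt "$count" ]; do
              # ':' with a redirect is a portable "create empty file"
              if ! : > "$dir/f.conf-sanity.69-$i" 2>/dev/null; then
                  echo "create failed at file $i" >&2
                  return 1
              fi
              i=$((i + 1))
          done
          echo "created $i files"
      }
      ```

      On a healthy filesystem the loop runs to completion; in the log above it aborted at file 49787.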
      

          Activity

            [LU-4340] conf-sanity test_69: error: File too large

            gerrit Gerrit Updater added a comment -

            James Nunez (james.a.nunez@intel.com) uploaded a new patch: http://review.whamcloud.com/15966
            Subject: LU-4340 tests: Adding debug to conf-sanity test 69
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: cca19d41a6b39ae2020655923290939cbe21bade
            yujian Jian Yu added a comment -

            The failure occurred consistently on master branch:
            https://testing.hpdd.intel.com/test_sets/1f034d6c-34ea-11e5-be21-5254006e85c2
            https://testing.hpdd.intel.com/test_sets/7a5d164a-34ed-11e5-b875-5254006e85c2
            https://testing.hpdd.intel.com/test_sets/ba0dd0aa-34a6-11e5-a9b3-5254006e85c2
            https://testing.hpdd.intel.com/test_sets/3aaaa0b4-3437-11e5-be70-5254006e85c2
            pjones Peter Jones added a comment -

            Landed for 2.8

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/15487/
            Subject: LU-4340 tests: Fix test_69 of conf-sanity test
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 34f94efa4847ebd84b2fa42b7a0fc85bd7f6f8e3

            gerrit Gerrit Updater added a comment -

            Ashish Purkar (ashish.purkar@seagate.com) uploaded a new patch: http://review.whamcloud.com/15487
            Subject: LU-4340 tests: Fix test_69 of conf-sanity test
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: abc7765b941fcebe7c489137af0e27d2212c71de
            jamesanunez James Nunez (Inactive) added a comment - Hit this issue with 2.7.0-RC4. Results are at: https://testing.hpdd.intel.com/test_sessions/193dce6a-c42f-11e4-a0ef-5254006e85c2
            yujian Jian Yu added a comment -

            More failure instances on master branch:
            https://testing.hpdd.intel.com/test_sets/1261560a-9bdc-11e4-a352-5254006e85c2
            https://testing.hpdd.intel.com/test_sets/3e12c382-9bec-11e4-afb8-5254006e85c2
            https://testing.hpdd.intel.com/test_sets/79089440-9bbe-11e4-b679-5254006e85c2
            https://testing.hpdd.intel.com/test_sets/e379ea16-9b9d-11e4-9d4a-5254006e85c2
            jamesanunez James Nunez (Inactive) added a comment - edited

            I experienced this bug with lustre-master tag 2.6.91 build #2771. Results are at https://testing.hpdd.intel.com/test_sets/f326bd9e-8618-11e4-ac52-5254006e85c2
            jamesanunez James Nunez (Inactive) added a comment - Hit this bug on master, tag 2.6.90. Test results at: https://testing.hpdd.intel.com/test_sessions/7f927c9a-6ccf-11e4-a452-5254006e85c2
            sarah Sarah Liu added a comment -

            Hit the error again in b2_6 build #2 (2.6.0-RC2)
            server and client: RHEL6 ldiskfs

            MDSCOUNT=1

            https://testing.hpdd.intel.com/test_sets/181e5ee6-0c47-11e4-b749-5254006e85c2
            jamesanunez James Nunez (Inactive) added a comment - edited

            I'm hitting this "File too large" error even though there is enough space on OST0 and the MDS; the problem is that I've used up all of the OST inodes:

            # lfs df -i
            UUID                      Inodes       IUsed       IFree IUse% Mounted on
            lscratch-MDT0000_UUID      100000       49642       50358  50% /lustre/scratch[MDT:0]
            lscratch-MDT0001_UUID      100000         201       99799   0% /lustre/scratch[MDT:1]
            lscratch-MDT0002_UUID      100000         201       99799   0% /lustre/scratch[MDT:2]
            lscratch-MDT0003_UUID      100000         201       99799   0% /lustre/scratch[MDT:3]
            lscratch-OST0000_UUID       50016       50016           0 100% /lustre/scratch[OST:0]

            filesystem summary:       400000       50245      349755  13% /lustre/scratch

            We use the small OST and MDS sizes for all tests in conf-sanity, so why doesn't this error always occur? From conf-sanity:

            # use small MDS + OST size to speed formatting time
            # do not use too small MDSSIZE/OSTSIZE, which affect the default journal size
            # STORED_MDSSIZE is used in test_18
            STORED_MDSSIZE=$MDSSIZE
            STORED_OSTSIZE=$OSTSIZE
            MDSSIZE=200000
            OSTSIZE=200000
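            The inode exhaustion described above is easy to spot mechanically. A small sketch, assuming the `lfs df -i` column layout quoted in the comment (UUID, Inodes, IUsed, IFree, IUse%, mount point), that prints any target whose inodes are 100% used:

            ```shell
            #!/bin/sh
            # Print targets that have exhausted their inodes, given
            # `lfs df -i` style input on stdin ($5 is the IUse% column
            # in that layout; the header line never matches).
            full_targets() {
                awk '$5 == "100%" { print $1 }'
            }

            # Sample rows taken from the output quoted in this ticket;
            # on a live filesystem you would run: lfs df -i | full_targets
            full_targets <<'EOF'
            UUID                      Inodes       IUsed       IFree IUse% Mounted on
            lscratch-MDT0000_UUID      100000       49642       50358  50% /lustre/scratch[MDT:0]
            lscratch-OST0000_UUID       50016       50016           0 100% /lustre/scratch[OST:0]
            EOF
            ```

            Run against the sample above, only lscratch-OST0000_UUID is reported, matching the diagnosis that the OST, not the MDTs, ran dry.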

            People

              Assignee: jamesanunez James Nunez (Inactive)
              Reporter: maloo Maloo
              Votes: 0
              Watchers: 11
