LU-6739: EL7 mds-survey test_1: mds-survey failed

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Minor
    • Affects Version/s: Lustre 2.8.0
    • Environment: server and client: lustre-master build # 3071 EL7
    • Severity: 3

    Description

      This issue was created by maloo for sarah_lw <wei3.liu@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/ea948bde-135d-11e5-b4b0-5254006e85c2.

      The sub-test test_1 failed with the following error:

      mds-survey failed
      

      There is no log for the MDS at all, and this failure blocks all subsequent tests from running. The test log shows:

      ====> Destroy 4 directories on onyx-42vm3:lustre-MDT0000_ecc
      ssh_exchange_identification: Connection closed by remote host
      Mon Jun 15 05:04:43 PDT 2015 /usr/bin/mds-survey from onyx-42vm6.onyx.hpdd.intel.com
      mdt 1 file  103011 dir    4 thr    4 create 18976.80 [ 11998.67, 23975.59] lookup 397752.68 [ 397752.68, 397752.68] md_getattr 294970.99 [ 294970.99, 294970.99] setxattr 1069.79 [    0.00, 7999.06] destroy             ERROR 
      mdt 1 file  103011 dir    4 thr    8 create             ERROR lookup             ERROR md_getattr             ERROR setxattr             ERROR destroy             ERROR 
      starting run for config:  test: create  file: 103011 threads: 4  directories: 4
      starting run for config:  test: lookup  file: 103011 threads: 4  directories: 4
      starting run for config:  test: md_getattr  file: 103011 threads: 4  directories: 4
      starting run for config:  test: setxattr  file: 103011 threads: 4  directories: 4
      starting run for config:  test: destroy  file: 103011 threads: 4  directories: 4
      starting run for config:  test: create  file: 103011 threads: 8  directories: 4
      starting run for config:  test: lookup  file: 103011 threads: 8  directories: 4
      starting run for config:  test: md_getattr  file: 103011 threads: 8  directories: 4
      starting run for config:  test: setxattr  file: 103011 threads: 8  directories: 4
      starting run for config:  test: destroy  file: 103011 threads: 8  directories: 4
       mds-survey test_1: @@@@@@ FAIL: mds-survey failed 
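
      For reference, the failing runs above correspond to a standard mds-survey invocation from lustre-iokit. The sketch below shows how a comparable run could be launched by hand, assuming the environment-variable interface described in the Lustre manual (thrlo, thrhi, file_count, dir_count, tests_str); the values are taken from the log output above, and the variables actually set by the automated test framework may differ.

      # Hedged reproduction sketch, not the test framework's exact invocation;
      # verify the variable names against the mds-survey script in your build.
      thrlo=4 thrhi=8 file_count=103011 dir_count=4 \
          tests_str="create lookup md_getattr setxattr destroy" \
          sh /usr/bin/mds-survey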
      


          Activity


            adilger Andreas Dilger added a comment - The console logs are under "lustre-provisioning" from before mds-survey:

            12:06:51:[ 1721.271559] WARNING: at lustre-2.7.55/ldiskfs/ext4_jbd2.c:260 __ldiskfs_handle_dirty_metadata+0x1c2/0x220 [ldiskfs]()
            12:06:51:[ 1721.294943] CPU: 1 PID: 5479 Comm: lctl Tainted: GF          O--------------   3.10.0-229.4.2.el7_lustre.x86_64 #1
            12:06:51:[ 1721.297354] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
            12:06:52:[ 1721.306555] Call Trace:
            12:06:52:[ 1721.308459]  [<ffffffff816050da>] dump_stack+0x19/0x1b
            12:06:52:[ 1721.310520]  [<ffffffff8106e34b>] warn_slowpath_common+0x6b/0xb0
            12:06:52:[ 1721.312659]  [<ffffffff8106e49a>] warn_slowpath_null+0x1a/0x20
            12:06:52:[ 1721.314897]  [<ffffffffa05616b2>] __ldiskfs_handle_dirty_metadata+0x1c2/0x220 [ldiskfs]
            12:06:52:[ 1721.319529]  [<ffffffffa0584659>] ldiskfs_free_blocks+0x5c9/0xb90 [ldiskfs]
            12:06:53:[ 1721.321814]  [<ffffffffa0578f75>] ldiskfs_xattr_release_block+0x275/0x330 [ldiskfs]
            12:06:53:[ 1721.324060]  [<ffffffffa057c1ab>] ldiskfs_xattr_delete_inode+0x2bb/0x300 [ldiskfs]
            12:06:53:[ 1721.326316]  [<ffffffffa0576ad5>] ldiskfs_evict_inode+0x1b5/0x610 [ldiskfs]
            12:06:53:[ 1721.328683]  [<ffffffff811e23d7>] evict+0xa7/0x170
            12:06:53:[ 1721.330790]  [<ffffffff811e2c15>] iput+0xf5/0x180
            12:06:53:[ 1721.332864]  [<ffffffffa0ba3e73>] osd_object_delete+0x1d3/0x300 [osd_ldiskfs]
            12:06:53:[ 1721.335175]  [<ffffffffa07586ad>] lu_object_free.isra.30+0x9d/0x1a0 [obdclass]
            12:06:53:[ 1721.337494]  [<ffffffffa0758872>] lu_object_put+0xc2/0x320 [obdclass]
            12:06:54:[ 1721.339735]  [<ffffffffa0f2a6d7>] echo_md_destroy_internal+0xe7/0x520 [obdecho]
            12:06:54:[ 1721.342007]  [<ffffffffa0f3217a>] echo_md_handler.isra.43+0x191a/0x2250 [obdecho]
            12:06:54:[ 1721.348581]  [<ffffffffa0f34766>] echo_client_iocontrol+0x1146/0x1d10 [obdecho]
            12:06:54:[ 1721.354898]  [<ffffffffa0724d1c>] class_handle_ioctl+0x1b3c/0x22b0 [obdclass]
            12:06:54:[ 1721.358813]  [<ffffffffa070a5e2>] obd_class_ioctl+0xd2/0x170 [obdclass]
            12:06:54:[ 1721.360799]  [<ffffffff811da2c5>] do_vfs_ioctl+0x2e5/0x4c0
            12:06:54:[ 1721.364564]  [<ffffffff811da541>] SyS_ioctl+0xa1/0xc0
            12:06:55:[ 1721.366357]  [<ffffffff81615029>] system_call_fastpath+0x16/0x1b
            12:06:55:[ 1721.368204] ---[ end trace aed93badbc88e370 ]---
            12:06:55:[ 1721.370058] LDISKFS-fs: ldiskfs_free_blocks:5107: aborting transaction: error 28 in __ldiskfs_handle_dirty_metadata
            12:06:55:[ 1721.372395] LDISKFS: jbd2_journal_dirty_metadata failed: handle type 5 started at line 240, credits 3/0, errcode -28
            12:06:55:[ 1721.385026] LDISKFS-fs error (device dm-0) in ldiskfs_free_blocks:5123: error 28
            

            Looks like the same journal size problem as LU-6722.
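
            Error 28 here is ENOSPC, i.e. the journal handle running out of credits/space rather than the filesystem filling up. A minimal sketch of how the theory could be checked and worked around, assuming the MDT backing device is /dev/dm-0 as in the console log above; the mkfs values below are illustrative, not the ones used by this test configuration:

            # Inspect journal-related fields on the (unmounted) MDT device.
            dumpe2fs -h /dev/dm-0 | grep -i journal
            debugfs -R 'stat <8>' /dev/dm-0 | grep -i -e size -e blocks

            # If the journal is undersized, reformat with a larger one
            # (destructive; fsname/index shown here are assumed, adjust to match).
            mkfs.lustre --reformat --mdt --fsname=lustre --index=0 \
                --mkfsoptions='-J size=2048' /dev/dm-0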


            People

              Assignee: WC Triage (wc-triage)
              Reporter: Maloo (maloo)
              Votes: 0
              Watchers: 2
