Lustre / LU-3114

Test failure on test suite sanity-lfsck, subtest test_5

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Blocker

    Description

      This issue was created by maloo for Bob Glossman <bob.glossman@intel.com>

      This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/cfaf9a1a-9de0-11e2-8513-52540035b04c.

      The sub-test test_5 failed with the following error:

      test failed to respond and timed out

      This failure is happening a lot lately:

      Failure Rate: 54.76% of last 42 executions [all branches]

      The MDS panics during a mount command in test_5, with fail_loc set to 0x1504. From the MDS console log:

      19:11:31:Lustre: DEBUG MARKER: mkdir -p /mnt/mds1; mount -t lustre -o user_xattr,acl /dev/lvm-MDS/P1 /mnt/mds1
      19:11:31:LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts: 
      19:11:31:Lustre: *** cfs_fail_loc=1504, val=0***
      19:11:31:Lustre: Skipped 3 previous similar messages
      19:11:31:Lustre: Setting parameter lustre-MDT0000-mdtlov.lov.stripesize in log lustre-MDT0000
      19:11:31:Lustre: Skipped 9 previous similar messages
      19:11:31:Lustre: *** cfs_fail_loc=1504, val=0***
      19:11:31:LustreError: 14342:0:(mdd_compat.c:361:mdd_compat_fixes()) lustre-MDD0000: [0x200000007:0x1:0x0] is used on ldiskfs?!
      19:11:31:LustreError: 14342:0:(obd_mount_server.c:1698:server_fill_super()) Unable to start targets: -524
      19:11:31:LustreError: 11352:0:(lod_dev.c:813:lod_device_free()) ASSERTION( atomic_read(&lu->ld_ref) == 0 ) failed: 
      19:11:31:LustreError: 11352:0:(lod_dev.c:813:lod_device_free()) LBUG
      19:11:31:Pid: 11352, comm: obd_zombid
      19:11:31:
      19:11:31:Call Trace:
      19:11:31: [<ffffffffa04b2895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      19:11:33: [<ffffffffa04b2e97>] lbug_with_loc+0x47/0xb0 [libcfs]
      19:11:33: [<ffffffffa0e544bb>] lod_device_free+0x1eb/0x220 [lod]
      19:11:33: [<ffffffffa061b6fd>] class_decref+0x46d/0x580 [obdclass]
      19:11:33: [<ffffffffa05f9399>] obd_zombie_impexp_cull+0x309/0x5d0 [obdclass]
      19:11:33: [<ffffffffa05f9725>] obd_zombie_impexp_thread+0xc5/0x1c0 [obdclass]
      19:11:33: [<ffffffff8105fa40>] ? default_wake_function+0x0/0x20
      19:11:33: [<ffffffffa05f9660>] ? obd_zombie_impexp_thread+0x0/0x1c0 [obdclass]
      19:11:33: [<ffffffff8100c0ca>] child_rip+0xa/0x20
      19:11:33: [<ffffffffa05f9660>] ? obd_zombie_impexp_thread+0x0/0x1c0 [obdclass]
      19:11:33: [<ffffffffa05f9660>] ? obd_zombie_impexp_thread+0x0/0x1c0 [obdclass]
      19:11:33: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
      19:11:33:
      19:11:33:LustreError: 14342:0:(mdt_handler.c:4587:mdt_fini()) ASSERTION( atomic_read(&d->ld_ref) == 0 ) failed: 
      19:11:33:LustreError: 14342:0:(mdt_handler.c:4587:mdt_fini()) LBUG
      19:11:33:Pid: 14342, comm: mount.lustre
      19:11:33:
      19:11:33:Call Trace:
      19:11:33: [<ffffffffa04b2895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      19:11:33: [<ffffffffa04b2e97>] lbug_with_loc+0x47/0xb0 [libcfs]
      19:11:33: [<ffffffffa0d98979>] mdt_device_fini+0xc49/0xd80 [mdt]
      19:11:34: [<ffffffffa0622117>] class_cleanup+0x577/0xda0 [obdclass]
      19:11:34: [<ffffffffa05f7bb6>] ? class_name2dev+0x56/0xe0 [obdclass]
      19:11:34: [<ffffffffa06239fc>] class_process_config+0x10bc/0x1c80 [obdclass]
      19:11:34: [<ffffffffa061d223>] ? lustre_cfg_new+0x353/0x7e0 [obdclass]
      19:11:34: [<ffffffffa0624739>] class_manual_cleanup+0x179/0x6f0 [obdclass]
      19:11:34: [<ffffffffa05f7bb6>] ? class_name2dev+0x56/0xe0 [obdclass]
      19:11:34: [<ffffffffa065977c>] server_put_super+0x5bc/0xf00 [obdclass]
      19:11:34: [<ffffffffa065dfd8>] server_fill_super+0x658/0x1570 [obdclass]
      19:11:34: [<ffffffffa062eb38>] lustre_fill_super+0x1d8/0x530 [obdclass]
      19:11:34: [<ffffffffa062e960>] ? lustre_fill_super+0x0/0x530 [obdclass]
      19:11:34: [<ffffffff8117930f>] get_sb_nodev+0x5f/0xa0
      19:11:34: [<ffffffffa06265e5>] lustre_get_sb+0x25/0x30 [obdclass]
      19:11:34: [<ffffffff81178f6b>] vfs_kern_mount+0x7b/0x1b0
      19:11:34: [<ffffffff81179112>] do_kern_mount+0x52/0x130
      19:11:34: [<ffffffff811975c2>] do_mount+0x2d2/0x8d0
      19:11:34: [<ffffffff81197c50>] sys_mount+0x90/0xe0
      19:11:34: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      19:11:34:
      19:11:34:Kernel panic - not syncing: LBUG
      

      Info required for matching: sanity-lfsck 5

Attachments

Issue Links

Activity
            standan Saurabh Tandan (Inactive) made changes -
            Remote Link New: This issue links to "Page (HPDD Community Wiki)" [ 15809 ]
            adilger Andreas Dilger made changes -
            Link New: This issue duplicates LU-3039 [ LU-3039 ]
            jlevi Jodi Levi (Inactive) made changes -
            Resolution New: Duplicate [ 3 ]
            Status Original: Open [ 1 ] New: Closed [ 6 ]
            keith Keith Mannthey (Inactive) made changes -
            Priority Original: Minor [ 4 ] New: Blocker [ 1 ]
            bogl Bob Glossman (Inactive) made changes -
            Description: updated to add the failure-rate statistics (the full before/after text duplicates the description above and is omitted here)
            maloo Maloo created issue -

People

    Assignee: WC Triage (wc-triage)
    Reporter: Maloo (maloo)
    Votes: 0
    Watchers: 4
