LU-8161: sanity-quota test_7a: lockup during mount on OSS

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Minor
    • Severity: 3

    Description

      This issue was created by maloo for Bob Glossman <bob.glossman@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/6b2f55d2-1c93-11e6-952a-5254006e85c2.

      The sub-test test_7a failed with the following error:

      test failed to respond and timed out
      
      stack trace from OSS console log:
      
      13:20:51:Lustre: DEBUG MARKER: mkdir -p /mnt/ost1; mount -t lustre   		                   lustre-ost1/ost1 /mnt/ost1
      13:20:51:LustreError: 137-5: lustre-OST0000_UUID: not available for connect from 10.2.4.151@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
      13:20:51:LustreError: 137-5: lustre-OST0000_UUID: not available for connect from 10.2.4.151@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
      13:20:51:LustreError: Skipped 2 previous similar messages
      13:20:51:LustreError: 137-5: lustre-OST0000_UUID: not available for connect from 10.2.4.151@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
      13:20:51:LustreError: Skipped 5 previous similar messages
      13:20:51:LustreError: 137-5: lustre-OST0000_UUID: not available for connect from 10.2.4.149@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
      13:20:51:LustreError: Skipped 10 previous similar messages
      13:20:51:BUG: soft lockup - CPU#1 stuck for 67s! [mount.lustre:4919]
      13:20:51:Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) osd_zfs(U) lquota(U) lustre(U) lov(U) mdc(U) fid(U) lmv(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) sha512_generic libcfs(U) nfsd exportfs autofs4 nfs lockd fscache auth_rpcgss nfs_acl sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 zfs(P)(U) zcommon(P)(U) znvpair(P)(U) spl(U) zlib_deflate zavl(P)(U) zunicode(P)(U) microcode virtio_balloon 8139too 8139cp mii i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
      13:20:51:CPU 1 
      13:20:51:Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) osd_zfs(U) lquota(U) lustre(U) lov(U) mdc(U) fid(U) lmv(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) sha512_generic libcfs(U) nfsd exportfs autofs4 nfs lockd fscache auth_rpcgss nfs_acl sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 zfs(P)(U) zcommon(P)(U) znvpair(P)(U) spl(U) zlib_deflate zavl(P)(U) zunicode(P)(U) microcode virtio_balloon 8139too 8139cp mii i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
      13:20:51:
      13:20:51:Pid: 4919, comm: mount.lustre Tainted: P           -- ------------    2.6.32-573.26.1.el6_lustre.x86_64 #1 Red Hat KVM
      13:20:51:RIP: 0010:[<ffffffff8129e8a9>]  [<ffffffff8129e8a9>] __write_lock_failed+0x9/0x20
      13:20:51:RSP: 0018:ffff88001f9b3890  EFLAGS: 00000287
      13:20:51:RAX: 0000000000000000 RBX: ffff88001f9b3898 RCX: ffff880058673dd8
      13:20:51:RDX: 0000000000000000 RSI: ffff8800618a2000 RDI: ffff8800618a20d8
      13:20:51:RBP: ffffffff8100bc0e R08: dead000000200200 R09: dead000000100100
      13:20:51:R10: dead000000200200 R11: 0000000000000000 R12: ffff880077c82000
      13:20:51:R13: ffff88001f9b3808 R14: ffffffffa0243e45 R15: ffff88001f9b3818
      13:20:51:FS:  00007efe3f9be7a0(0000) GS:ffff880002300000(0000) knlGS:0000000000000000
      13:20:51:CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      13:20:51:CR2: 00000030118e90c0 CR3: 0000000062137000 CR4: 00000000000006e0
      13:20:51:DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      13:20:51:DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      13:20:51:Process mount.lustre (pid: 4919, threadinfo ffff88001f9b0000, task ffff880078dc4040)
      13:20:51:Stack:
      13:20:51: ffffffff8153d007 ffff88001f9b38e8 ffffffffa0f457b4 ffff88001f9b3958
      13:20:51:<d> ffff8800618a2000 ffff88001f9b3908 ffff8800618a2000 ffff88001f9b3958
      13:20:51:<d> 00000000fffffff0 ffff88006022ab40 ffff88001f9b3958 ffff88001f9b3908
      13:20:51:Call Trace:
      13:20:51: [<ffffffff8153d007>] ? _write_lock+0x17/0x20
      13:20:51: [<ffffffffa0f457b4>] ? osd_oi_fini+0x44/0x820 [osd_zfs]
      13:20:51: [<ffffffffa0f35c4c>] ? osd_device_fini+0x12c/0x530 [osd_zfs]
      13:20:51: [<ffffffffa0f368b0>] ? osd_device_alloc+0x2e0/0x480 [osd_zfs]
      13:20:51: [<ffffffffa08a71af>] ? obd_setup+0x1bf/0x290 [obdclass]
      13:20:51: [<ffffffffa08a7488>] ? class_setup+0x208/0x870 [obdclass]
      13:20:51: [<ffffffffa08b054c>] ? class_process_config+0xc6c/0x1ad0 [obdclass]
      13:20:51: [<ffffffff8117904c>] ? __kmalloc+0x21c/0x230
      13:20:51: [<ffffffffa08b78ad>] ? do_lcfg+0x61d/0x750 [obdclass]
      13:20:51: [<ffffffffa08b7a74>] ? lustre_start_simple+0x94/0x200 [obdclass]
      13:20:51: [<ffffffffa08f18d1>] ? server_fill_super+0xfd1/0x1a6a [obdclass]
      13:20:51: [<ffffffff8117904c>] ? __kmalloc+0x21c/0x230
      13:20:51: [<ffffffffa08bc084>] ? lustre_fill_super+0xb64/0x2120 [obdclass]
      13:20:51: [<ffffffffa08bb520>] ? lustre_fill_super+0x0/0x2120 [obdclass]
      13:20:51: [<ffffffff81195a5f>] ? get_sb_nodev+0x5f/0xa0
      13:20:51: [<ffffffffa08b3105>] ? lustre_get_sb+0x25/0x30 [obdclass]
      13:20:51: [<ffffffff8119509b>] ? vfs_kern_mount+0x7b/0x1b0
      13:20:51: [<ffffffff81195242>] ? do_kern_mount+0x52/0x130
      13:20:51: [<ffffffff811a7f82>] ? vfs_ioctl+0x22/0xa0
      13:20:51: [<ffffffff811b71db>] ? do_mount+0x2fb/0x930
      13:20:51: [<ffffffff811b78a0>] ? sys_mount+0x90/0xe0
      13:20:51: [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b
      13:20:51:Code: 00 00 48 8b 5b 20 48 83 eb 07 48 39 d9 73 06 48 89 01 31 c0 c3 b8 f2 ff ff ff c3 90 90 90 90 90 90 90 f0 81 07 00 00 00 01 f3 90 <81> 3f 00 00 00 01 75 f6 f0 81 2f 00 00 00 01 0f 85 e2 ff ff ff 
      13:20:51:Call Trace:
      13:20:51: [<ffffffff8153d007>] ? _write_lock+0x17/0x20
      13:20:51: [<ffffffffa0f457b4>] ? osd_oi_fini+0x44/0x820 [osd_zfs]
      13:20:51: [<ffffffffa0f35c4c>] ? osd_device_fini+0x12c/0x530 [osd_zfs]
      13:20:51: [<ffffffffa0f368b0>] ? osd_device_alloc+0x2e0/0x480 [osd_zfs]
      13:20:51: [<ffffffffa08a71af>] ? obd_setup+0x1bf/0x290 [obdclass]
      13:20:51: [<ffffffffa08a7488>] ? class_setup+0x208/0x870 [obdclass]
      13:20:51: [<ffffffffa08b054c>] ? class_process_config+0xc6c/0x1ad0 [obdclass]
      13:20:51: [<ffffffff8117904c>] ? __kmalloc+0x21c/0x230
      13:20:51: [<ffffffffa08b78ad>] ? do_lcfg+0x61d/0x750 [obdclass]
      13:20:51: [<ffffffffa08b7a74>] ? lustre_start_simple+0x94/0x200 [obdclass]
      13:20:51: [<ffffffffa08f18d1>] ? server_fill_super+0xfd1/0x1a6a [obdclass]
      13:20:51: [<ffffffff8117904c>] ? __kmalloc+0x21c/0x230
      13:20:51: [<ffffffffa08bc084>] ? lustre_fill_super+0xb64/0x2120 [obdclass]
      13:20:51: [<ffffffffa08bb520>] ? lustre_fill_super+0x0/0x2120 [obdclass]
      13:20:51: [<ffffffff81195a5f>] ? get_sb_nodev+0x5f/0xa0
      13:20:51: [<ffffffffa08b3105>] ? lustre_get_sb+0x25/0x30 [obdclass]
      13:20:51: [<ffffffff8119509b>] ? vfs_kern_mount+0x7b/0x1b0
      13:20:51: [<ffffffff81195242>] ? do_kern_mount+0x52/0x130
      13:20:51: [<ffffffff811a7f82>] ? vfs_ioctl+0x22/0xa0
      13:20:51: [<ffffffff811b71db>] ? do_mount+0x2fb/0x930
      13:20:51: [<ffffffff811b78a0>] ? sys_mount+0x90/0xe0
      13:20:51: [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b
      13:20:51:Kernel panic - not syncing: softlockup: hung tasks
      13:20:51:Pid: 4919, comm: mount.lustre Tainted: P           --L------------    2.6.32-573.26.1.el6_lustre.x86_64 #1
      13:20:51:Call Trace:
      13:20:51: <IRQ>  [<ffffffff81539407>] ? panic+0xa7/0x16f
      13:20:51: [<ffffffff810ed943>] ? watchdog_timer_fn+0x223/0x230
      13:20:51: [<ffffffff810ed720>] ? watchdog_timer_fn+0x0/0x230
      13:20:51: [<ffffffff810a60ae>] ? __run_hrtimer+0x8e/0x1d0
      13:20:51: [<ffffffff810a6446>] ? hrtimer_interrupt+0xe6/0x260
      13:20:51: [<ffffffff81035dfd>] ? local_apic_timer_interrupt+0x3d/0x70
      13:20:51: [<ffffffff81543de5>] ? smp_apic_timer_interrupt+0x45/0x60
      13:20:51: [<ffffffff8100bc13>] ? apic_timer_interrupt+0x13/0x20
      13:20:51: <EOI>  [<ffffffff8129e8a9>] ? __write_lock_failed+0x9/0x20
      13:20:51: [<ffffffff8153d007>] ? _write_lock+0x17/0x20
      13:20:51: [<ffffffffa0f457b4>] ? osd_oi_fini+0x44/0x820 [osd_zfs]
      13:20:51: [<ffffffffa0f35c4c>] ? osd_device_fini+0x12c/0x530 [osd_zfs]
      13:20:51: [<ffffffffa0f368b0>] ? osd_device_alloc+0x2e0/0x480 [osd_zfs]
      13:20:51: [<ffffffffa08a71af>] ? obd_setup+0x1bf/0x290 [obdclass]
      13:20:51: [<ffffffffa08a7488>] ? class_setup+0x208/0x870 [obdclass]
      13:20:51: [<ffffffffa08b054c>] ? class_process_config+0xc6c/0x1ad0 [obdclass]
      13:20:51: [<ffffffff8117904c>] ? __kmalloc+0x21c/0x230
      13:20:51: [<ffffffffa08b78ad>] ? do_lcfg+0x61d/0x750 [obdclass]
      13:20:51: [<ffffffffa08b7a74>] ? lustre_start_simple+0x94/0x200 [obdclass]
      13:20:51: [<ffffffffa08f18d1>] ? server_fill_super+0xfd1/0x1a6a [obdclass]
      13:20:51: [<ffffffff8117904c>] ? __kmalloc+0x21c/0x230
      13:20:51: [<ffffffffa08bc084>] ? lustre_fill_super+0xb64/0x2120 [obdclass]
      13:20:51: [<ffffffffa08bb520>] ? lustre_fill_super+0x0/0x2120 [obdclass]
      13:20:51: [<ffffffff81195a5f>] ? get_sb_nodev+0x5f/0xa0
      13:20:51: [<ffffffffa08b3105>] ? lustre_get_sb+0x25/0x30 [obdclass]
      13:20:51: [<ffffffff8119509b>] ? vfs_kern_mount+0x7b/0x1b0
      13:20:51: [<ffffffff81195242>] ? do_kern_mount+0x52/0x130
      13:20:51: [<ffffffff811a7f82>] ? vfs_ioctl+0x22/0xa0
      13:20:51: [<ffffffff811b71db>] ? do_mount+0x2fb/0x930
      13:20:51: [<ffffffff811b78a0>] ? sys_mount+0x90/0xe0
      13:20:51: [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b
      

      Please provide additional information about the failure here.

      Info required for matching: sanity-quota 7a

      Activity
            yujian Jian Yu added a comment -

            This is a duplicate of LU-8147.


      People

        Assignee: WC Triage
        Reporter: Maloo
        Votes: 0
        Watchers: 2
