Lustre / LU-8477

insanity: general protection fault in qsd_reint_main


Details

    • Bug
    • Resolution: Cannot Reproduce
    • Minor
    • None
    • Lustre 2.9.0
    • None
    • 3
    • 9223372036854775807

    Description

      This issue was created by maloo for Bob Glossman <bob.glossman@intel.com>

      This looks a lot like LU-7707, but that is marked FIXED.
      Raising a new ticket so that an expert can decide whether this is a new problem or a recurrence of the old one.

      The insanity test suite hangs while unmounting ost3 after all tests have completed successfully. The suite_stdout log shows:

      04:59:27:CMD: trevis-34vm8 grep -c /mnt/lustre-ost3' ' /proc/mounts
      04:59:27:Stopping /mnt/lustre-ost3 (opts:-f) on trevis-34vm8
      04:59:27:CMD: trevis-34vm8 umount -d -f /mnt/lustre-ost3
      05:58:33:********** Timeout by autotest system **********
      

      The test_complete log for trevis-34vm8 shows:

      04:59:34:[23974.321587] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-ost3
      04:59:34:[23980.437263] LustreError: 9346:0:(client.c:1168:ptlrpc_import_delay_req()) @@@ IMP_CLOSED   req@ffff88004a027c00 x1541682514006912/t0(0) o101->lustre-MDT0000-lwp-OST0002@10.9.5.180@tcp:23/10 lens 456/496 e 0 to 0 dl 0 ref 2 fl Rpc:/0/ffffffff rc 0/-1
      04:59:34:[23980.444227] LustreError: 9346:0:(qsd_reint.c:56:qsd_reint_completion()) lustre-OST0002: failed to enqueue global quota lock, glb fid:[0x200000006:0x20000:0x0], rc:-5
      04:59:34:[23980.447641] LustreError: 9346:0:(qsd_reint.c:56:qsd_reint_completion()) Skipped 3 previous similar messages
      04:59:34:[23980.455644] general protection fault: 0000 [#1] SMP 
      04:59:34:[23980.456007] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) sha512_generic crypto_null libcfs(OE) ldiskfs(OE) dm_mod rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xprtrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic crct10dif_common ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ppdev pcspkr virtio_balloon parport_pc i2c_piix4 parport nfsd nfs_acl lockd grace auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 ata_generic pata_acpi cirrus syscopyarea sysfillrect sysimgblt virtio_blk drm_kms_helper ttm 8139too ata_piix drm serio_raw virtio_pci virtio_ring virtio 8139cp mii i2c_core libata floppy
      04:59:34:[23980.456007] CPU: 0 PID: 9347 Comm: qsd_reint_1.lus Tainted: G           OE  ------------   3.10.0-327.28.2.el7_lustre.x86_64 #1
      04:59:34:[23980.456007] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
      04:59:34:[23980.456007] task: ffff880055fe4500 ti: ffff8800497dc000 task.ti: ffff8800497dc000
      04:59:34:[23980.456007] RIP: 0010:[<ffffffff810af04b>]  [<ffffffff810af04b>] __wake_up_common+0x2b/0x90
      04:59:34:[23980.456007] RSP: 0018:ffff8800497dfd88  EFLAGS: 00010086
      04:59:34:[23980.456007] RAX: 0000000000000246 RBX: ffff88005ec4eea0 RCX: 0000000000000000
      04:59:34:[23980.456007] RDX: 5a5a5a5a5a5a5a5a RSI: 0000000000000003 RDI: ffff88005ec4eea0
      04:59:34:[23980.456007] RBP: ffff8800497dfdc0 R08: 0000000000000000 R09: 0000000000000000
      04:59:34:[23980.456007] R10: 0000000000000000 R11: 0000000000000003 R12: ffff88005ec4eea8
      04:59:34:[23980.456007] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000003
      04:59:34:[23980.456007] FS:  0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
      04:59:34:[23980.456007] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      04:59:34:[23980.456007] CR2: 00007f2272e02000 CR3: 0000000078cb4000 CR4: 00000000000006f0
      04:59:34:[23980.456007] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      04:59:34:[23980.456007] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      04:59:34:[23980.456007] Stack:
      04:59:34:[23980.456007]  0000000100000000 0000000000000000 ffff88005ec4eea0 0000000000000246
      04:59:34:[23980.456007]  0000000000000003 0000000000000001 0000000000000000 ffff8800497dfdf8
      04:59:34:[23980.456007]  ffffffff810b0dc9 ffff88005ec4ee00 ffff88005ec4ee00 5a5a5a5a5a5a5a5a
      04:59:34:[23980.456007] Call Trace:
      04:59:34:[23980.456007]  [<ffffffff810b0dc9>] __wake_up+0x39/0x50
      04:59:34:[23980.456007]  [<ffffffffa0bdbd86>] qsd_reint_main+0x86/0xe40 [lquota]
      04:59:34:[23980.456007]  [<ffffffff8163b838>] ? __schedule+0x2d8/0x900
      04:59:34:[23980.456007]  [<ffffffffa0bdbd00>] ? qsd_reconciliation+0xa90/0xa90 [lquota]
      04:59:34:[23980.456007]  [<ffffffff810a5b2f>] kthread+0xcf/0xe0
      04:59:34:[23980.456007]  [<ffffffff810a5a60>] ? kthread_create_on_node+0x140/0x140
      04:59:34:[23980.456007]  [<ffffffff81646dd8>] ret_from_fork+0x58/0x90
      04:59:34:[23980.456007]  [<ffffffff810a5a60>] ? kthread_create_on_node+0x140/0x140
      04:59:34:[23980.456007] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 89 f7 41 56 41 89 ce 41 55 41 54 4c 8d 67 08 53 48 83 ec 10 89 55 cc 48 8b 57 08 4c 89 45 d0 <48> 8b 0a 49 39 d4 48 8d 42 e8 4c 8d 69 e8 75 0b eb 3b 0f 1f 00 
      04:59:34:[23980.456007] RIP  [<ffffffff810af04b>] __wake_up_common+0x2b/0x90
      04:59:34:[23980.456007]  RSP <ffff8800497dfd88>
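
A note on reading the dump above: RDX holds 5a5a5a5a5a5a5a5a, and the faulting bytes `<48> 8b 0a` appear to decode to a load through RDX, which would be consistent with `__wake_up_common` walking a wait queue whose memory had already been freed and poisoned. That is an interpretation, not something the ticket confirms. The sketch below is a minimal, hypothetical helper (the function name, the regular expression, and the poison-byte labels are my own assumptions, not part of Lustre or the autotest tooling) that scans oops text for registers filled with a single repeated poison byte.

```python
import re

# Fill bytes commonly associated with poisoned memory. 0x5a is the pattern
# seen in RDX above; 0x6b is the usual slab free-poison byte. The labels are
# informal assumptions for illustration only.
POISON_BYTES = {
    0x5A: "0x5a fill (poisoned or freed memory)",
    0x6B: "0x6b fill (slab free poison)",
}

# Matches "RDX: 5a5a5a5a5a5a5a5a" style pairs. Entries such as
# "RSP: 0018:ffff..." carry a segment prefix and are deliberately not matched.
REG_RE = re.compile(r"\b([A-Z][A-Z0-9]{1,2}): ([0-9a-f]{16})\b")


def find_poisoned_registers(oops_text):
    """Return {register: hex value} for registers holding one repeated poison byte."""
    hits = {}
    for name, value in REG_RE.findall(oops_text):
        raw = bytes.fromhex(value)
        # A poisoned pointer consists of the same byte repeated eight times.
        if len(set(raw)) == 1 and raw[0] in POISON_BYTES:
            hits[name] = value
    return hits


# Two register lines copied from the dump in this ticket.
sample = """
[23980.456007] RAX: 0000000000000246 RBX: ffff88005ec4eea0 RCX: 0000000000000000
[23980.456007] RDX: 5a5a5a5a5a5a5a5a RSI: 0000000000000003 RDI: ffff88005ec4eea0
"""

print(find_poisoned_registers(sample))
```

Run against the two sample lines, the helper reports only RDX. A poisoned register alone does not say which object was freed; here the call trace points at the wake-up issued from qsd_reint_main, so the wait queue used there would be the first thing to check.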
      

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/0a88c552-5a18-11e6-906c-5254006e85c2.

          People

            wc-triage WC Triage
            maloo Maloo