Lustre / LU-3638

GPF crash in osp_key_exit


Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.5.0
    • Labels: None
    • Severity: 3
    • Rank: 9371

    Description

      Just hit this running recent master:

      <4>[113366.463322] Lustre: DEBUG MARKER: == replay-single test 0c: check replay-barrier == 10:40:38 (1374676838)
      <3>[113367.223979] LustreError: 22867:0:(osd_handler.c:1191:osd_ro()) *** setting lustre-MDT0000 read-only ***
      <4>[113367.225361] Turning device loop0 (0x700000) read-only
      <4>[113367.303612] Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000
      <4>[113367.345628] Lustre: DEBUG MARKER: local REPLAY BARRIER on lustre-MDT0000
      <4>[113367.526705] Lustre: Unmounted lustre-client
      <4>[113367.907894] Lustre: Failing over lustre-MDT0000
      <1>[113368.092078] BUG: unable to handle kernel paging request at ffff8800b6966c68
      <1>[113368.092813] IP: [<ffffffffa0936019>] osp_key_exit+0x9/0x20 [osp]
      <4>[113368.093486] PGD 1a26063 PUD 501067 PMD 6b6067 PTE 80000000b6966060
      <4>[113368.094166] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
      <4>[113368.094774] last sysfs file: /sys/devices/system/cpu/possible
      <4>[113368.095424] CPU 2 
      <4>[113368.095515] Modules linked in:
      <3>[113368.096298] LustreError: 11-0: lustre-MDT0000-lwp-OST0001: Communicating with 0@lo, operation obd_ping failed with -107.
      <4>[113368.096304] Lustre: lustre-MDT0000-lwp-OST0001: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
      <4>[113368.096034]  lustre ofd osp lod ost mdt osd_ldiskfs fsfilt_ldiskfs ldiskfs mdd
      <3>[113368.116346] LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 0@lo (no target)
      <4>[113368.096034]  mgs lquota lfsck obdecho mgc lov osc mdc lmv fid fld ptlrpc obdclass lvfs ksocklnd lnet libcfs exportfs jbd sha512_generic sha256_generic ext4 mbcache jbd2 virtio_balloon i2c_piix4 i2c_core virtio_console virtio_blk virtio_net virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache auth_rpcgss nfs_acl sunrpc be2iscsi bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: libcfs]
      <4>[113368.096034] 
      <4>[113368.096034] Pid: 22149, comm: mdt00_002 Not tainted 2.6.32-rhe6.4-debug #2 Red Hat KVM
      <4>[113368.096034] RIP: 0010:[<ffffffffa0936019>]  [<ffffffffa0936019>] osp_key_exit+0x9/0x20 [osp]
      <4>[113368.096034] RSP: 0018:ffff8800973dfe10  EFLAGS: 00010282
      <4>[113368.096034] RAX: ffffffffa0936010 RBX: 00000000000000c8 RCX: 0000000000000000
      <4>[113368.096034] RDX: ffff8800b6966bf0 RSI: ffffffffa095ac00 RDI: ffff8800b7121610
      <4>[113368.096034] RBP: ffff8800973dfe10 R08: 0000000000000001 R09: 0000000000000000
      <4>[113368.096034] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800b7121610
      <4>[113368.096034] R13: ffff88008a49af30 R14: ffff88006bbafef0 R15: ffff8800b52aac20
      <4>[113368.096034] FS:  0000000000000000(0000) GS:ffff880006280000(0000) knlGS:0000000000000000
      <4>[113368.096034] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      <4>[113368.096034] CR2: ffff8800b6966c68 CR3: 0000000001a25000 CR4: 00000000000006e0
      <4>[113368.096034] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      <4>[113368.096034] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      <4>[113368.096034] Process mdt00_002 (pid: 22149, threadinfo ffff8800973de000, task ffff8800973dc380)
      <4>[113368.096034] Stack:
      <4>[113368.096034]  ffff8800973dfe30 ffffffffa0ff4488 ffff8800b69677f0 ffff8800b52aabf0
      <4>[113368.096034] <d> ffff8800973dfed0 ffffffffa1190899 ffff8800973dfe50 ffff880000000000
      <4>[113368.096034] <d> ffff8800b52aac80 00000000973dffd8 ffff88006bbafef0 00000000973dc938
      <4>[113368.096034] Call Trace:
      <4>[113368.096034]  [<ffffffffa0ff4488>] lu_context_exit+0x58/0xa0 [obdclass]
      <4>[113368.096034]  [<ffffffffa1190899>] ptlrpc_main+0x9d9/0x1650 [ptlrpc]
      <4>[113368.096034]  [<ffffffffa118fec0>] ? ptlrpc_main+0x0/0x1650 [ptlrpc]
      <4>[113368.096034]  [<ffffffff81094606>] kthread+0x96/0xa0
      <4>[113368.096034]  [<ffffffff8100c10a>] child_rip+0xa/0x20
      <4>[113368.096034]  [<ffffffff81094570>] ? kthread+0x0/0xa0
      <4>[113368.096034]  [<ffffffff8100c100>] ? child_rip+0x0/0x20
      <4>[113368.096034] Code: <48> c7 42 78 00 00 00 00 c9 c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 
      

      Crashdump and modules are in /exports/crashdumps/192.168.10.221-2013-07-24-10\:40\:42
      source branch in my tree: master-20130723

      Attachments

        Activity

          People

            Assignee: WC Triage (wc-triage)
            Reporter: Oleg Drokin (green)
            Votes: 0
            Watchers: 4
