  1. Lustre
  2. LU-6049

General Protection Fault at echo_session_key_fini+0xa9

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version: Lustre 2.8.0
    • Affects Versions: Lustre 2.1.6, Lustre 2.5.3
    • Environment: kernel 2.6.32-279.5.2
    • Severity: 3
    • Rank: 16855

    Description

      General Protection Fault at echo_session_key_fini+0xa9

      A crash on an OSS with the following trace was encountered with Lustre 2.1.6 and kernel 2.6.32-279.5.2:

      2013-10-18 07:57:21 general protection fault: 0000 [#1] SMP
      2013-10-18 07:57:21 last sysfs file: /sys/devices/pci0000:80/0000:80:02.2/0000:84:00.0/host10/port-10:3/end_device-10:3/target10:0:3/10:0:3:4/state
      2013-10-18 07:57:21 CPU 0
      2013-10-18 07:57:21 Modules linked in: nfs fscache obdecho(U) obdfilter(U) fsfilt_ldiskfs(U) ost(U) mgc(U) ldiskfs(U) lustre(U) lov(U) osc(U) mdc(U) lquota(U) fid(U) fld(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc rdma_ucm(U) ib_sdp(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_ipoib(U) ib_cm(U) ib_sa(U) ipv6 ib_uverbs(U) ib_umad(U) mlx4_ib(U) mlx4_core(U) ib_mthca(U) ib_mad(U) ib_core(U) dm_mirror dm_region_hash dm_log dm_round_robin scsi_dh_rdac dm_multipath dm_mod uinput usbhid hid sg mpt2sas scsi_transport_sas raid_class sb_edac edac_core i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma igb dca ext4 mbcache jbd2 ehci_hcd sd_mod crc_t10dif ahci megaraid_sas [last unloaded: scsi_wait_scan]
      2013-10-18 07:57:21
      2013-10-18 07:57:21 Pid: 14573, comm: obd_zombid Not tainted 2.6.32-279.5.2.bl6.Bull.35_restricted.x86_64 #1 Bull SAS bullx R/X9DRH-7TF/7F/iTF/iF
      2013-10-18 07:57:21 RIP: 0010:[<ffffffffa0a2dc49>] [<ffffffffa0a2dc49>] echo_session_key_fini+0xa9/0x100 [obdecho]
      2013-10-18 07:57:21 RSP: 0018:ffff88105451bce0 EFLAGS: 00010246
      2013-10-18 07:57:21 RAX: 5a5a5a5a5a5a5a5a RBX: 5a5a5a5a5a5a5a5a RCX: 0000000000000002
      2013-10-18 07:57:21 RDX: ffff8808587c7540 RSI: 5a5a5a5a5a5a5a5a RDI: 0000000000000000
      2013-10-18 07:57:21 RBP: ffff88105451bcf0 R08: ffffffff81ac5ee8 R09: 0000000000000140
      2013-10-18 07:57:21 R10: 0000000000000000 R11: 000000000000000c R12: ffff880f1ad9a2f8
      2013-10-18 07:57:21 R13: 0000000000000010 R14: ffff88105451be20 R15: ffff8802f65a46d0
      2013-10-18 07:57:21 FS: 00007f14eea7f700(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
      2013-10-18 07:57:21 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      2013-10-18 07:57:21 CR2: 00007f5cb4fc8000 CR3: 0000000001a06000 CR4: 00000000000406f0
      2013-10-18 07:57:21 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      2013-10-18 07:57:21 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      2013-10-18 07:57:21 Process obd_zombid (pid: 14573, threadinfo ffff881054518000, task ffff881068e0c080)
      2013-10-18 07:57:21 Stack:
      2013-10-18 07:57:21 0000000000000010 ffffffffa0a3cf40 ffff88105451bd20 ffffffffa0554d90
      2013-10-18 07:57:21 <d> ffff88105451bd20 ffffffffa0a3cf40 ffff880f1ad9a2f8 ffff88071cd302c0
      2013-10-18 07:57:21 <d> ffff88105451bd40 ffffffffa055514b 0000000000000000 ffff8802f65a46c8
      2013-10-18 07:57:21 Call Trace:
      2013-10-18 07:57:21 [<ffffffffa0554d90>] key_fini+0x60/0x1e0 [obdclass]
      2013-10-18 07:57:21 [<ffffffffa055514b>] lu_context_key_quiesce+0x5b/0x90 [obdclass]
      2013-10-18 07:57:21 [<ffffffffa05551d9>] lu_context_key_quiesce_many+0x59/0x80 [obdclass]
      2013-10-18 07:57:21 [<ffffffffa0a2d910>] echo_type_stop+0x20/0x30 [obdecho]
      2013-10-18 07:57:21 [<ffffffffa0554322>] lu_device_fini+0x52/0xd0 [obdclass]
      2013-10-18 07:57:21 [<ffffffffa0a2fed7>] echo_device_free+0x247/0x510 [obdecho]
      2013-10-18 07:57:22 [<ffffffffa053871d>] class_decref+0x46d/0x590 [obdclass]
      2013-10-18 07:57:22 [<ffffffffa0523bae>] obd_zombie_impexp_cull+0x31e/0x620 [obdclass]
      2013-10-18 07:57:22 [<ffffffffa0523f75>] obd_zombie_impexp_thread+0xc5/0x1c0 [obdclass]
      2013-10-18 07:57:22 [<ffffffff81048df0>] ? default_wake_function+0x0/0x20
      2013-10-18 07:57:22 [<ffffffffa0523eb0>] ? obd_zombie_impexp_thread+0x0/0x1c0 [obdclass]
      2013-10-18 07:57:22 [<ffffffff8100412a>] child_rip+0xa/0x20
      2013-10-18 07:57:22 [<ffffffffa0523eb0>] ? obd_zombie_impexp_thread+0x0/0x1c0 [obdclass]
      2013-10-18 07:57:22 [<ffffffffa0523eb0>] ? obd_zombie_impexp_thread+0x0/0x1c0 [obdclass]
      2013-10-18 07:57:22 [<ffffffff81004120>] ? child_rip+0x0/0x20
      2013-10-18 07:57:22 Code: 99 02 00 00 48 c7 05 d3 49 01 00 00 00 00 00 c7 05 c1 49 01 00 10 00 00 00 e8 14 74 a2 ff 48 b8 5a 5a 5a 5a 5a 5a 5a 5a 48 89 de <48> 89 03 48 8b 3d d5 69 01 00 e8 98 bb a1 ff 48 83 c4 08 5b c9
      2013-10-18 07:57:22 RIP [<ffffffffa0a2dc49>] echo_session_key_fini+0xa9/0x100 [obdecho]
      2013-10-18 07:57:22 RSP <ffff88105451bce0>
      2013-10-18 07:57:22 Initializing cgroup subsys cpuset
      2013-10-18 07:57:22 Initializing cgroup subsys cpu
      2013-10-18 07:57:22 Linux version 2.6.32-279.5.2.bl6.Bull.35_restricted.x86_64 (derr@atlas.frec.bull.fr) (gcc version 4.4.6 20120305 (Bull 4.4.6-4) (GCC) ) #1 SMP Wed Apr 24 14:29:54 CEST 2013
      

      The crash occurred in echo_session_key_fini().
      key_fini() called key->lct_fini(), which points to echo_session_key_fini(), passing ctx->lc_value[index] as the third parameter, but this array contained poisoned values (0x5a5a5a5a5a5a5a5a, as seen in RAX, RBX and RSI).
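
As a side note on why a poisoned pointer produces a general protection fault rather than an ordinary page fault: on x86_64 with 48-bit virtual addresses, a memory access through a non-canonical address raises #GP. The sketch below checks that property for the poison value from the trace. The helper name `is_canonical_48` and the macro `POISON_PTR` are hypothetical, introduced only for this illustration, and the check assumes 4-level paging (48-bit addresses), which is what this kernel generation uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Poison pattern seen in RAX, RBX and RSI in the trace (illustrative name). */
#define POISON_PTR UINT64_C(0x5a5a5a5a5a5a5a5a)

/*
 * Assumption: 48-bit virtual addresses (4-level paging).
 * An address is canonical only if bits 63..47 are all zeros or all ones.
 */
static bool is_canonical_48(uint64_t addr)
{
        uint64_t top = addr >> 47;      /* the 17 bits 63..47 */

        return top == 0 || top == 0x1ffff;
}
```

Under that assumption, `is_canonical_48(POISON_PTR)` is false, while the kernel stack pointer from the trace (0xffff88105451bce0) passes the check. A store through the poisoned pointer therefore faults before any page-table lookup, which is consistent with CR2 holding an unrelated user-space address.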

      static void echo_session_key_fini(...)
      {
              struct echo_session_info *session = data;
              OBD_SLAB_FREE_PTR(session, echo_session_kmem);    <================ GPF here, session is 0x5a5a5a5a5a5a5a5a
      }
      
      After OBD_SLAB_FREE_PTR() expansion:
      ------------------------------------
      
      static void echo_session_key_fini(...)
      {
              struct echo_session_info *session = data;
              LASSERT(session);
              lprocfs_counter_sub(obd_memory, OBD_MEMORY_STAT, (long)(sizeof *(session)));
              CDEBUG(D_MALLOC, name " '" #session "': %d at %p.\n", (int)(sizeof *(session)), session);
              memset(session, 0x5a, sizeof *(session));         <================ GPF here, session is 0x5a5a5a5a5a5a5a5a
              cfs_mem_cache_free(echo_session_kmem, session);
              (session) = (void *)0xdeadbeef;
      }
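
The expansion above poisons an object with 0x5a bytes just before freeing it. A minimal user-space model, assuming the same poison-on-free convention also applied to the array holding the per-key values, shows how a stale slot then reads back as a pointer made entirely of 0x5a bytes. Every name here (`toy_context`, `toy_poison_values`, `toy_slot_is_poisoned`) is hypothetical. This is a sketch of the symptom, not the root cause analysis or the fix for this ticket.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_POISON_BYTE 0x5a    /* same byte the expanded macro writes */
#define TOY_NR_KEYS     4

/* Hypothetical stand-in for a context with one value slot per key. */
struct toy_context {
        void *values[TOY_NR_KEYS];
};

/*
 * Model "poison on free" for the slot array. The memory stays owned by
 * the caller, so reading it afterwards is well-defined in this model.
 */
static void toy_poison_values(struct toy_context *ctx)
{
        memset(ctx->values, TOY_POISON_BYTE, sizeof(ctx->values));
}

/* True if every byte of the stored pointer equals the poison byte. */
static bool toy_slot_is_poisoned(const struct toy_context *ctx, int idx)
{
        unsigned char raw[sizeof(void *)];
        size_t i;

        memcpy(raw, &ctx->values[idx], sizeof(raw));
        for (i = 0; i < sizeof(raw); i++)
                if (raw[i] != TOY_POISON_BYTE)
                        return false;
        return true;
}
```

In this model, storing a valid pointer in `values[1]` and then calling `toy_poison_values()` leaves a slot for which `toy_slot_is_poisoned()` returns true. A finalizer that later receives that slot as its `data` argument would dereference the poison pattern, as in the trace.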
      

          People

            Assignee: emoly.liu Emoly Liu
            Reporter: patrick.valentin Patrick Valentin (Inactive)