  1. Lustre
  2. LU-10451

sptlrpc_ctxs_lprocfs_seq_show crash in recovery-small test 57

Details

    • Bug
    • Resolution: Unresolved
    • Minor

    Description

      After LU-10224 landed, it seemed that all the crashes in this test had been fixed.

      Well, I just had a very similar crash in a different place:

      [128715.132365] Lustre: DEBUG MARKER: == recovery-small test 57: read procfs entries causes kernel crash =================================== 10:05:20 (1514387120)
      [128717.357108] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
      [128717.358256] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_zfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) zlib_deflate jbd2 syscopyarea sysfillrect sysimgblt ttm ata_generic drm_kms_helper pata_acpi drm floppy i2c_piix4 virtio_console pcspkr virtio_balloon serio_raw virtio_blk ata_piix i2c_core libata nfsd ip_tables rpcsec_gss_krb5 [last unloaded: libcfs]
      [128717.371008] CPU: 3 PID: 20280 Comm: lctl Tainted: P           OE  ------------   3.10.0-debug #2
      [128717.372286] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [128717.372932] task: ffff8802cd528a80 ti: ffff8800a6838000 task.ti: ffff8800a6838000
      [128717.382186] RIP: 0010:[<ffffffffa05b8a47>]  [<ffffffffa05b8a47>] sptlrpc_ctxs_lprocfs_seq_show+0x27/0x100 [ptlrpc]
      [128717.383566] RSP: 0018:ffff8800a683be78  EFLAGS: 00010203
      [128717.384213] RAX: 6b6b6b6b6b6b6b6b RBX: ffff8802a1586700 RCX: 0000000000000004
      [128717.385411] RDX: fffffffffffffff4 RSI: 0000000000000001 RDI: ffffffffa0636425
      [128717.389241] RBP: ffff8800a683be90 R08: 0000000000000001 R09: ffff8802f092f000
      [128717.390434] R10: 0000000000000000 R11: 0000000000000246 R12: ffff8800a284ef00
      [128717.391627] R13: 0000000000000001 R14: ffff8800a683bf48 R15: ffff8800a284ef00
      [128717.394109] FS:  00007f93a5430740(0000) GS:ffff88033e460000(0000) knlGS:0000000000000000
      [128717.395374] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [128717.396187] CR2: 00007f93a4aa7000 CR3: 0000000095e38000 CR4: 00000000000006e0
      [128717.403348] Lustre: Unmounted lustre-client
      [128717.413399] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [128717.414405] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [128717.415645] Stack:
      [128717.416340]  0000000000000000 ffff880096912e00 0000000000000001 ffff8800a683bf00
      [128717.417877]  ffffffff81212c85 0000000000001000 0000000001e19af0 ffff8800a284ef38
      [128717.419405]  0000000000001000 0000000000000000 ffff880096912e00 0000000001ff0c13
      [128717.420901] Call Trace:
      [128717.421503]  [<ffffffff81212c85>] seq_read+0x105/0x3e0
      [128717.422165]  [<ffffffff811ed1dc>] vfs_read+0x9c/0x170
      [128717.422623]  [<ffffffff811edd44>] SyS_read+0x84/0xf0
      [128717.423149]  [<ffffffff8170fc49>] system_call_fastpath+0x16/0x1b
      [128717.423877] Code: 1f 44 00 00 0f 1f 44 00 00 55 b9 04 00 00 00 48 89 e5 41 55 41 54 49 89 fc 53 48 8b 9f d8 00 00 00 48 c7 c7 25 64 63 a0 48 8b 03 <4c> 8b 68 40 4c 89 ee f3 a6 75 42 48 8b bb 58 08 00 00 48 85 ff 
      [128717.425876] RIP  [<ffffffffa05b8a47>] sptlrpc_ctxs_lprocfs_seq_show+0x27/0x100 [ptlrpc]
      
      (gdb) l *(sptlrpc_ctxs_lprocfs_seq_show+0x27)
      0x8ca47 is in sptlrpc_ctxs_lprocfs_seq_show (/home/green/git/lustre-release/lustre/ptlrpc/sec_lproc.c:122).
      117	{
      118	        struct obd_device *dev = seq->private;
      119	        struct client_obd *cli = &dev->u.cli;
      120	        struct ptlrpc_sec *sec = NULL;
      121
      122		LASSERT(strcmp(dev->obd_type->typ_name, LUSTRE_OSC_NAME) == 0 ||
      123			strcmp(dev->obd_type->typ_name, LUSTRE_MDC_NAME) == 0 ||
      124			strcmp(dev->obd_type->typ_name, LUSTRE_MGC_NAME) == 0 ||
      125			strcmp(dev->obd_type->typ_name, LUSTRE_LWP_NAME) == 0 ||
      126			strcmp(dev->obd_type->typ_name, LUSTRE_OSP_NAME) == 0);
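
      A note on the register dump above: per the Code: line, the faulting instruction dereferences RAX, and RAX holds 6b6b6b6b6b6b6b6b. That is the value a 64-bit load returns from memory filled with the kernel's slab free-poison byte (POISON_FREE, 0x6b, in include/linux/poison.h). This suggests, but does not prove, that dev pointed at already-freed memory when dev->obd_type was read. A minimal sketch of that check follows; is_free_poison is a hypothetical helper for illustration, not a Lustre or kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Slab free-poison byte; mirrors POISON_FREE in include/linux/poison.h. */
#define POISON_FREE_BYTE 0x6bu

/*
 * Hypothetical helper (not part of Lustre or the kernel): returns true
 * when every byte of a 64-bit register value equals the free-poison byte.
 */
bool is_free_poison(uint64_t val)
{
	for (int i = 0; i < 8; i++) {
		/* Extract byte i and compare it against the poison pattern. */
		if (((val >> (8 * i)) & 0xffu) != POISON_FREE_BYTE)
			return false;
	}
	return true;
}
```

      Passing the RAX value from the dump (0x6b6b6b6b6b6b6b6b) returns true, while an ordinary kernel pointer such as the RBX value (0xffff8802a1586700) returns false.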
      

      It's not as frequent as all those other failures, but it still needs to be looked at, I guess.

          People

            wc-triage WC Triage
            green Oleg Drokin