Lustre / LU-2400

BUG: unable to handle kernel paging request in lprocfs_srch+0x48/0x80


Details

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Minor
    • None
    • Affects Version/s: Lustre 2.4.0
    • 3
    • 5695

    Description

      I was running the replay-single test in a loop and eventually hit this:

      BUG: unable to handle kernel paging request at ffffffffffffffde
      IP: [<ffffffffa12d4f28>] lprocfs_srch+0x48/0x80 [obdclass]
      PGD 1a87067 PUD 1a88067 PMD 0 
      Oops: 0000 [#1] SMP 
      last sysfs file: /sys/module/ptlrpc/initstate
      CPU 0 
      Modules linked in: lustre(U) ofd(U) osp(U) lod(U) ost(U) mdt(U) mdd(U) mgs(U) osd_zfs(U) lquota(U) obdecho(U) mgc(U) lov(U) osc(U) mdc(U) lmv(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ksocklnd(U) lnet(U) libcfs(U) jbd sha512_generic sha256_generic ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables bridge stp llc fuse autofs4 nfs lockd fscache nfs_acl auth_rpcgss sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) spl(U) zlib_deflate vhost_net macvtap macvlan tun kvm virtio_balloon snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc virtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk pata_acpi ata_generic ata_piix virtio_pci virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod [last unloaded: libcfs]
      
      Pid: 2886, comm: mount.lustre Tainted: P           ---------------    2.6.32-279.9.1.1chaos.ch5.1.x86_64 #1 Bochs Bochs
      RIP: 0010:[<ffffffffa12d4f28>]  [<ffffffffa12d4f28>] lprocfs_srch+0x48/0x80 [obdclass]
      RSP: 0018:ffff88001cc599a8  EFLAGS: 00010286
      RAX: ffffffffa135ccc0 RBX: 0000000000000000 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: ffff88001cc59ab8 RDI: ffffffffa135ccc0
      RBP: ffff88001cc599c8 R08: 00000000fffffffe R09: 0000000000000000
      R10: 000000000000000f R11: 000000000000000f R12: ffff88001cc59ab8
      R13: ffffffffffffff8e R14: 0000000000000000 R15: ffff88001cc59ab8
      FS:  00007fa9dee12700(0000) GS:ffff880002200000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: ffffffffffffffde CR3: 000000001bf30000 CR4: 00000000000006f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process mount.lustre (pid: 2886, threadinfo ffff88001cc58000, task ffff880002053540)
      Stack:
       ffffffffffffff8e ffff88001cc59ab8 ffffffffffffff8e 0000000000000000
      <d> ffff88001cc59a18 ffffffffa12d7084 ffff88001cc599e8 ffff8800195f6000
      <d> ffff88001cc59a88 ffffffff8127d0a6 ffff88001c874000 0000000000000006
      Call Trace:
       [<ffffffffa12d7084>] lprocfs_register+0x34/0x100 [obdclass]
       [<ffffffff8127d0a6>] ? vsnprintf+0x2b6/0x5f0
       [<ffffffffa109e3ce>] lprocfs_register_mountpoint+0x12e/0xb80 [lustre]
       [<ffffffffa108f226>] client_common_fill_super+0x1a6/0x4e70 [lustre]
       [<ffffffff8127d420>] ? sprintf+0x40/0x50
       [<ffffffffa1094c64>] ll_fill_super+0xd74/0x1500 [lustre]
       [<ffffffffa130fb6c>] lustre_fill_super+0x12c/0x1ae0 [obdclass]
       [<ffffffff8117cb16>] ? set_anon_super+0x56/0x100
       [<ffffffff8116a03c>] ? pcpu_alloc+0x3ac/0xa50
       [<ffffffff8127a24a>] ? strlcpy+0x4a/0x60
       [<ffffffff8117d9e3>] ? sget+0x3e3/0x480
       [<ffffffff8117cac0>] ? set_anon_super+0x0/0x100
       [<ffffffffa130fa40>] ? lustre_fill_super+0x0/0x1ae0 [obdclass]
       [<ffffffff8117df2f>] get_sb_nodev+0x5f/0xa0
       [<ffffffffa12fb185>] lustre_get_sb+0x25/0x30 [obdclass]
       [<ffffffff8117db8b>] vfs_kern_mount+0x7b/0x1b0
       [<ffffffff8117dd32>] do_kern_mount+0x52/0x130
       [<ffffffff8119c442>] do_mount+0x2d2/0x8d0
       [<ffffffff8119cad0>] sys_mount+0x90/0xe0
       [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
      Code: a1 49 89 f4 e8 ca b0 22 e0 4d 85 ed 75 1d 48 c7 c7 c0 cc 35 a1 e8 69 2b dc df 48 83 c4 08 48 89 d8 5b 41 5c 41 5d c9 c3 0f 1f 00 <49> 8b 5d 50 48 85 db 75 10 eb d8 0f 1f 44 00 00 48 8b 5b 40 48 
      RIP  [<ffffffffa12d4f28>] lprocfs_srch+0x48/0x80 [obdclass]
       RSP <ffff88001cc599a8>
      CR2: ffffffffffffffde
      ---[ end trace 0ce3c640bfa74514 ]---
      Kernel panic - not syncing: Fatal exception
      Pid: 2886, comm: mount.lustre Tainted: P      D    ---------------    2.6.32-279.9.1.1chaos.ch5.1.x86_64 #1
      Call Trace:
       [<ffffffff814fdceb>] ? panic+0xa0/0x168
       [<ffffffff81501e84>] ? oops_end+0xe4/0x100
       [<ffffffff81043bfb>] ? no_context+0xfb/0x260
       [<ffffffff81043e85>] ? __bad_area_nosemaphore+0x125/0x1e0
       [<ffffffff81043f53>] ? bad_area_nosemaphore+0x13/0x20
       [<ffffffff810446b1>] ? __do_page_fault+0x321/0x480
       [<ffffffff8127c23d>] ? pointer+0x8d/0x830
       [<ffffffffa06ff799>] ? _ldlm_lock_debug+0x189/0x5d0 [ptlrpc]
       [<ffffffff81503e3e>] ? do_page_fault+0x3e/0xa0
       [<ffffffff815011f5>] ? page_fault+0x25/0x30
       [<ffffffffa12d4f28>] ? lprocfs_srch+0x48/0x80 [obdclass]
       [<ffffffffa12d4f06>] ? lprocfs_srch+0x26/0x80 [obdclass]
       [<ffffffffa12d7084>] ? lprocfs_register+0x34/0x100 [obdclass]
       [<ffffffff8127d0a6>] ? vsnprintf+0x2b6/0x5f0
       [<ffffffffa109e3ce>] ? lprocfs_register_mountpoint+0x12e/0xb80 [lustre]
       [<ffffffffa108f226>] ? client_common_fill_super+0x1a6/0x4e70 [lustre]
       [<ffffffff8127d420>] ? sprintf+0x40/0x50
       [<ffffffffa1094c64>] ? ll_fill_super+0xd74/0x1500 [lustre]
       [<ffffffffa130fb6c>] ? lustre_fill_super+0x12c/0x1ae0 [obdclass]
       [<ffffffff8117cb16>] ? set_anon_super+0x56/0x100
       [<ffffffff8116a03c>] ? pcpu_alloc+0x3ac/0xa50
       [<ffffffff8127a24a>] ? strlcpy+0x4a/0x60
       [<ffffffff8117d9e3>] ? sget+0x3e3/0x480
       [<ffffffff8117cac0>] ? set_anon_super+0x0/0x100
       [<ffffffffa130fa40>] ? lustre_fill_super+0x0/0x1ae0 [obdclass]
       [<ffffffff8117df2f>] ? get_sb_nodev+0x5f/0xa0
       [<ffffffffa12fb185>] ? lustre_get_sb+0x25/0x30 [obdclass]
       [<ffffffff8117db8b>] ? vfs_kern_mount+0x7b/0x1b0
       [<ffffffff8117dd32>] ? do_kern_mount+0x52/0x130
       [<ffffffff8119c442>] ? do_mount+0x2d2/0x8d0
       [<ffffffff8119cad0>] ? sys_mount+0x90/0xe0
       [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
      

      The reproducer:

      $ while true; do sudo FSTYPE=zfs ONLY="59 60" sh ./lustre/tests/replay-single.sh; done
      

      It took about 6 hours to hit in my single node VM.

      There are also some other messages on the console just prior to the crash which I don't fully understand, but I'll post them here for completeness:

      SLAB: cache with size 256 has lost its name
      ... snip ...
      SLAB: cache with size 1152 has lost its name
      LustreError: 958:0:(lprocfs_status.c:492:lprocfs_register())  Lproc: Attempting to register llite more than once 
      SLAB: cache with size 256 has lost its name
      ... snip ...
      SLAB: cache with size 1152 has lost its name
      kmem_cache_create: duplicate cache lustre_inode_cache
      SLAB: cache with size 256 has lost its name
      ... snip ...
      SLAB: cache with size 1152 has lost its name
      LustreError: 2522:0:(lprocfs_status.c:492:lprocfs_register())  Lproc: Attempting to register llite more than once 
      SLAB: cache with size 256 has lost its name
      ... snip ...
      SLAB: cache with size 1152 has lost its name
      LustreError: 11-0: lustre-MDT0000-osp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11
      LustreError: 11-0: lustre-MDT0000-osp-OST0000: Communicating with 0@lo, operation mds_connect failed with -11
      
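
      One possible reading of the register dump, offered as a hypothesis rather than a confirmed root cause: R13 is 0xffffffffffffff8e, which is -114 (-EALREADY), and the faulting address 0xffffffffffffffde is exactly R13 + 0x50, matching the `mov 0x50(%r13),%rbx` at the faulting instruction. That would be consistent with the earlier "Attempting to register llite more than once" failure producing an error-encoded pointer that later gets dereferenced without a check. The userspace sketch below reimplements the kernel's ERR_PTR/IS_ERR helpers to illustrate the pattern; it is not the actual lprocfs code path.

      #include <stdio.h>

      /* Userspace copies of the kernel's ERR_PTR helpers (include/linux/err.h),
       * for illustration only. */
      #define MAX_ERRNO 4095
      #define EALREADY  114

      static inline void *ERR_PTR(long error) { return (void *)error; }
      static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
      static inline int IS_ERR(const void *ptr)
      {
              return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
      }

      int main(void)
      {
              /* Hypothesis: a duplicate registration returns ERR_PTR(-EALREADY),
               * i.e. the raw value 0xffffffffffffff8e seen in R13. */
              void *entry = ERR_PTR(-EALREADY);

              /* Dereferencing a field at offset 0x50 of such a pointer would
               * fault at -114 + 0x50 = -34 = 0xffffffffffffffde, the CR2 value
               * in the oops. The defensive pattern is to check first: */
              if (IS_ERR(entry)) {
                      printf("registration failed: %ld\n", PTR_ERR(entry));
                      return 0;
              }
              return 1;
      }

      If this reading is right, the caller that stores the result of the failed registration would need an IS_ERR() check before passing the pointer on to lprocfs_srch.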


          People

            Assignee: Alex Zhuravlev (bzzz)
            Reporter: Prakash Surya (prakash) (Inactive)
            Votes: 0
            Watchers: 5
