Lustre / LU-2510

general protection fault (ptlrpc_lprocfs_svc_req_history_seek+0x54/0x130) caused by "/proc/fs/lustre/mds/MDS/*/req_history"

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Minor
    • Affects Version: 2.4

    Description

      I hit this when I tried to cat /proc/fs/lustre/mds/MDS/*/req_history on an MDS under heavy load.

      Though it happened on the DNE branch, DNE does not change this part of the code at all, so it should be a master bug.

      Lustre: 7109:0:(service.c:1831:ptlrpc_server_handle_req_in()) @@@ Slow req_in handling 10s req@ffff880601b67450 x1421800558756026/t0(0) o101->1ed09f68-b097-0a37-85f8-9d43775754c0@192.168.2.123@o2ib:0/0 lens 576/0 e 0 to 0 dl 0 ref 1 fl New:/0/ffffffff rc 0/-1
      Lustre: 8557:0:(client.c:1842:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1355943962/real 0] req@ffff8805b29b9800 x1421768869502499/t0(0) o13->dnelust-OST0002-osc-MDT0000@192.168.2.131@o2ib:7/4 lens 224/368 e 0 to 1 dl 1355943969 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
      Lustre: dnelust-OST0006-osc-MDT0000: Connection to dnelust-OST0006 (at 192.168.2.131@o2ib) was lost; in progress operations using this service will wait for recovery to complete
      Lustre: Skipped 1 previous similar message
      Lustre: 8557:0:(client.c:1842:ptlrpc_expire_one_request()) Skipped 1 previous similar message
      Lustre: dnelust-OST0002-osc-MDT0000: Connection to dnelust-OST0002 (at 192.168.2.131@o2ib) was lost; in progress operations using this service will wait for recovery to complete
      Lustre: 6833:0:(service.c:1831:ptlrpc_server_handle_req_in()) @@@ Slow req_in handling 21s req@ffff8804968ab450 x1421800282870490/t0(0) o400->dnelust-MDT0000-osp-OST0006_UUID@192.168.2.131@o2ib:0/0 lens 224/0 e 0 to 0 dl 0 ref 1 fl New:/0/ffffffff rc 0/-1
      general protection fault: 0000 1 SMP
      last sysfs file: /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map
      CPU 7
      Modules linked in: obdecho(U) osp(U) lod(U) mdt(U) mgs(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) mdd(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic sha256_generic libcfs(U) ldiskfs(U) jbd2 nfsd exportfs nfs lockd fscache nfs_acl auth_rpcgss sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa microcode mlx4_ib ib_mad ib_core mlx4_en mlx4_core serio_raw i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma i7core_edac edac_core ses enclosure sg igb dca ext3 jbd mbcache sr_mod cdrom sd_mod crc_t10dif pata_acpi ata_generic ata_piix mpt2sas scsi_transport_sas raid_class dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]

      Pid: 7221, comm: cat Not tainted 2.6.32-279.14.1.el6_lustre.g5fd2de9.x86_64 #1 Supermicro X8DTH-i/6/iF/6F/X8DTH
      RIP: 0010:[<ffffffffa084ac34>] [<ffffffffa084ac34>] ptlrpc_lprocfs_svc_req_history_seek+0x54/0x130 [ptlrpc]
      RSP: 0018:ffff88078092ddb8 EFLAGS: 00010206
      RAX: dead000000100100 RBX: ffff880dafbe53c0 RCX: dead0000001000d8
      RDX: 50d210227c2b0001 RSI: ffff880dafbe53c0 RDI: 50d210227c2b0000
      RBP: ffff88078092ddc8 R08: ffff8805ed337290 R09: 00000000fffffffd
      R10: 0000000000000000 R11: 00000000000000c8 R12: 0000000000000001
      R13: ffff8805ed337230 R14: ffff88082a4c7380 R15: ffff88078092de58
      FS: 00007f376b89b700(0000) GS:ffff88085c460000(0000) knlGS:0000000000000000
      CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00007f377b2c3000 CR3: 0000000f9cc62000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process cat (pid: 7221, threadinfo ffff88078092c000, task ffff880499cc4040)
      Stack:
      ffff880700000000 ffff8802868d4050 ffff88078092de18 ffffffffa084ade1
      <d> 00050d1e8cf7e524 ffff8805ed337200 ffff88078092de78 ffff880cb9397240
      <d> ffff881023f8c300 ffff880dafbe53c0 0000000000000e86 ffff88078092de58
      Call Trace:
      [<ffffffffa084ade1>] ptlrpc_lprocfs_svc_req_history_next+0x71/0x1b0 [ptlrpc]
      [<ffffffff8119e19a>] seq_read+0x24a/0x3f0
      [<ffffffff811e11fe>] proc_reg_read+0x7e/0xc0
      [<ffffffff8117bee5>] vfs_read+0xb5/0x1a0
      [<ffffffff810d6d42>] ? audit_syscall_entry+0x272/0x2a0
      [<ffffffff8117c021>] sys_read+0x51/0x90
      [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
      Code: 3b 48 58 75 5e 4c 8d 87 90 00 00 00 4c 39 87 90 00 00 00 0f 84 86 00 00 00 48 83 c0 28 eb 18 0f 1f 84 00 00 00 00 00 48 8d 48 d8 <48> 8b 79 58 48 39 fa 76 13 48 8b 00 49 39 c0 75 eb b8 fe ff ff
      RIP [<ffffffffa084ac34>] ptlrpc_lprocfs_svc_req_history_seek+0x54/0x130 [ptlrpc]
      RSP <ffff88078092ddb8>

            People

              wc-triage WC Triage
              di.wang Di Wang
              Votes: 0
              Watchers: 3
