  1. Lustre
  2. LU-12246

sanity test 28 crashes with ‘Unable to handle kernel paging request’

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.12.1
    • 3

    Description

      We’ve seen sanity test_28 crash three times this year, all on PPC and all with 2.12.1. The first time we saw this crash was with 2.12.0.78.

      Looking at a recent crash, with logs at https://testing.whamcloud.com/test_sets/661044aa-668f-11e9-8bb1-52540065bddc , we see:

      ============================================ 00:57:30 (1555981050)
      [ 3427.381628] Lustre: DEBUG MARKER: == sanity test 28: create/mknod/mkdir with bad file types ============================================ 00:57:30 (1555981050)
      [ 3427.761602] Lustre: DEBUG MARKER: lctl set_param -n fail_loc=0 	    fail_val=0 2>/dev/null
      [ 3428.823676] Unable to handle kernel paging request for data at address 0x406ef778000000c0
      [ 3428.823785] Faulting instruction address: 0xc000000000337754
      [ 3428.823844] Oops: Kernel access of bad area, sig: 11 [#1]
      [ 3428.823880] SMP NR_CPUS=2048 NUMA pSeries
      [ 3428.823925] Modules linked in: lnet_selftest(OE) lustre(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 nfsv4 dns_resolver nfs lockd grace fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic crct10dif_common ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core virtio_balloon auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 virtio_net virtio_blk virtio_pci virtio_ring virtio
      [ 3428.824556] CPU: 0 PID: 29737 Comm: bash Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.10.1.el7.ppc64 #1
      [ 3428.824630] task: c0000000758571c0 ti: c00000007536c000 task.ti: c00000007536c000
      [ 3428.824683] NIP: c000000000337754 LR: c0000000003378d4 CTR: c0000000003376a0
      [ 3428.824737] REGS: c00000007536f650 TRAP: 0300   Tainted: G           OE  ------------    (3.10.0-957.10.1.el7.ppc64)
      [ 3428.824807] MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI>  CR: 28242488  XER: 00000000
      [ 3428.824931] CFAR: 0000000000002494 DAR: 406ef778000000c0 DSISR: 40000000 SOFTE: 1 
      GPR00: c0000000003378d4 c00000007536f8d0 c0000000016cd800 0000000000000000 
      GPR04: 00000000000080d0 0000000067533256 c0000000495df140 c000000003421ae0 
      GPR08: 00000000004073b0 0000000000000000 00000000024f0000 d000000000a6cf58 
      GPR12: c0000000003376a0 c000000007b80000 0000000000000008 0000000022000000 
      GPR16: 0000000000000000 c00000007945a4d0 0000000010133584 c000000075d34000 
      GPR20: 000000000000001f c00000003e7e2600 0000000000000000 000000000000000e 
      GPR24: 0000000000000000 d000000000a73b88 c00000007e01fa00 d0000000009f4410 
      GPR28: 000000000000004f 00000000000080d0 406ef778000000c0 c00000007e01fa00 
      [ 3428.825686] NIP [c000000000337754] .__kmalloc+0xb4/0x350
      [ 3428.825722] LR [c0000000003378d4] .__kmalloc+0x234/0x350
      [ 3428.825758] Call Trace:
      [ 3428.825778] [c00000007536f8d0] [c0000000003378d4] .__kmalloc+0x234/0x350 (unreliable)
      [ 3428.825862] [c00000007536f980] [d0000000009f4410] .ext4_htree_store_dirent+0x50/0x1b0 [ext4]
      [ 3428.825934] [c00000007536fa20] [d000000000a0ce10] .htree_dirblock_to_tree+0x1a0/0x230 [ext4]
      [ 3428.826005] [c00000007536fb00] [d000000000a0e5b8] .ext4_htree_fill_tree+0x1c8/0x3e0 [ext4]
      [ 3428.826076] [c00000007536fc20] [d0000000009f3e1c] .ext4_readdir+0x95c/0xbc0 [ext4]
      [ 3428.826142] [c00000007536fd60] [c000000000396d3c] .SyS_getdents+0x1fc/0x2b0
      [ 3428.826196] [c00000007536fe30] [c00000000000a284] system_call+0x38/0xfc
      [ 3428.826249] Instruction dump:
      [ 3428.826277] 7f5fd378 e94d0040 e93f0000 7ce95214 e9070008 7fc9502a e9270010 2fbe0000 
      [ 3428.826368] 41de006c 2fa90000 419e0064 e93f0022 <7f3e482a> 39200000 88cd02a2 992d02a2 
      [ 3428.826469] ---[ end trace 8c2d758ee3e2fa9b ]---
      [ 3428.828981] 
      [ 3428.829009] Sending IPI to other CPUs
      [ 3428.830044] IPI complete
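
      As a reading aid for the register dump above, here is a minimal Python sketch that cross-checks the faulting instruction against the DAR. It is not part of the test logs or of any Lustre tooling. It assumes the word marked with <...> in the instruction dump is the faulting instruction and decodes it as a Power ISA X-form instruction (primary opcode 31, extended opcode 21 is ldx). The helper names parse_gprs and decode_x_form are made up for this illustration.

```python
import re

# Lines copied from the oops above (only the ones this sketch needs).
OOPS = """\
CFAR: 0000000000002494 DAR: 406ef778000000c0 DSISR: 40000000 SOFTE: 1
GPR00: c0000000003378d4 c00000007536f8d0 c0000000016cd800 0000000000000000
GPR04: 00000000000080d0 0000000067533256 c0000000495df140 c000000003421ae0
GPR08: 00000000004073b0 0000000000000000 00000000024f0000 d000000000a6cf58
GPR12: c0000000003376a0 c000000007b80000 0000000000000008 0000000022000000
GPR16: 0000000000000000 c00000007945a4d0 0000000010133584 c000000075d34000
GPR20: 000000000000001f c00000003e7e2600 0000000000000000 000000000000000e
GPR24: 0000000000000000 d000000000a73b88 c00000007e01fa00 d0000000009f4410
GPR28: 000000000000004f 00000000000080d0 406ef778000000c0 c00000007e01fa00
41de006c 2fa90000 419e0064 e93f0022 <7f3e482a> 39200000 88cd02a2 992d02a2
"""


def parse_gprs(text):
    """Return {register number: value} from the 'GPRnn: ...' lines."""
    regs = {}
    for m in re.finditer(r"GPR(\d\d):((?: [0-9a-f]{16})+)", text):
        base = int(m.group(1))  # register numbers are decimal (00, 04, ...)
        for offset, value in enumerate(m.group(2).split()):
            regs[base + offset] = int(value, 16)
    return regs


def decode_x_form(word):
    """Split a 32-bit instruction word into Power ISA X-form fields."""
    return {
        "opcode": word >> 26,
        "rt": (word >> 21) & 0x1F,
        "ra": (word >> 16) & 0x1F,
        "rb": (word >> 11) & 0x1F,
        "xo": (word >> 1) & 0x3FF,
    }


def effective_address(fields, regs):
    """EA for an indexed load: (RA|0) + (RB), truncated to 64 bits."""
    base = regs[fields["ra"]] if fields["ra"] != 0 else 0
    return (base + regs[fields["rb"]]) & 0xFFFFFFFFFFFFFFFF


regs = parse_gprs(OOPS)
dar = int(re.search(r"DAR: ([0-9a-f]{16})", OOPS).group(1), 16)
word = int(re.search(r"<([0-9a-f]{8})>", OOPS).group(1), 16)
fields = decode_x_form(word)
ea = effective_address(fields, regs)

if __name__ == "__main__":
    is_ldx = fields["opcode"] == 31 and fields["xo"] == 21
    print("fields:", fields, "(ldx)" if is_ldx else "(not ldx)")
    print("EA  = 0x%016x" % ea)
    print("DAR = 0x%016x" % dar)
    print("EA matches DAR:", ea == dar)
```

      For this oops the sketch decodes the marked word as ldx with RT=25, RA=30, RB=9, and the computed effective address (r30 + r9) equals the DAR. In other words, the load inside __kmalloc used the value in r30 as a pointer. That value does not look like the kernel addresses seen elsewhere in the dump (which begin with 0xc or 0xd), but the sketch by itself does not say where the bad value came from.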
      

      Logs for the other failures are at:
      https://testing.whamcloud.com/test_sets/02902ee0-6266-11e9-8bb1-52540065bddc
      https://testing.whamcloud.com/test_sets/d0ff8300-5acf-11e9-a256-52540065bddc

            People

              ys Yang Sheng
              jamesanunez James Nunez (Inactive)