Lustre / LU-11716

sanity-scrub test_1b test: BUG: unable to handle kernel NULL pointer dereference IP: [<ffffffff816b68a9>] _raw_read_lock+0x9/0x20

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.13.0
    • Affects Version/s: Lustre 2.11.0
    • Severity: 3

    Description

      [  880.747314] Lustre: *** cfs_fail_loc=198, val=0***
      [  880.747550] BUG: unable to handle kernel NULL pointer dereference at 0000000000000298
      [  880.747600] IP: [<ffffffff816b68a9>] _raw_read_lock+0x9/0x20
      [  880.747603] PGD 0 
      [  880.747605] Oops: 0002 [#1] SMP 
      [  880.747650] Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache cirrus ttm drm_kms_helper syscopyarea sysfillrect sysimgblt ppdev fb_sys_fops drm virtio_balloon joydev pcspkr i2c_piix4 parport_pc i2c_core parport nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 ata_generic pata_acpi virtio_net virtio_blk ata_piix libata serio_raw virtio_pci virtio_ring virtio floppy
      [  880.747654] CPU: 1 PID: 11067 Comm: mdt_rdpg00_002 Tainted: G           OE  ------------   3.10.0-693.21.1.x3.1.143.x86_64 #1
      [  880.747655] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [  880.747656] task: ffff88008d7abf40 ti: ffff8800a7b70000 task.ti: ffff8800a7b70000
      [  880.747660] RIP: 0010:[<ffffffff816b68a9>]  [<ffffffff816b68a9>] _raw_read_lock+0x9/0x20
      [  880.747667] RSP: 0018:ffff8800a7b737b0  EFLAGS: 00010213
      [  880.747668] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      [  880.747669] RDX: ffff8800a7b73810 RSI: 0000000000000000 RDI: 0000000000000298
      [  880.747670] RBP: ffff8800a7b737b0 R08: ffff8800a7b73914 R09: 0000000000000000
      [  880.747671] R10: ffffffffc0ad529e R11: 0000000000000000 R12: 0000000000000000
      [  880.747672] R13: ffff8800a7b73810 R14: ffff8800a7b73880 R15: ffffffffc0699500
      [  880.747674] FS:  0000000000000000(0000) GS:ffff88013fd00000(0000) knlGS:0000000000000000
      [  880.747676] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [  880.747677] CR2: 0000000000000298 CR3: 00000000360ce000 CR4: 00000000000006e0
      [  880.747704] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  880.747724] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [  880.747725] Call Trace:
      [  880.747830]  [<ffffffffc050181a>] ldiskfs_es_lookup_extent+0x2a/0x180 [ldiskfs]
      [  880.747841]  [<ffffffffc04cf06d>] ldiskfs_map_blocks+0x5d/0x700 [ldiskfs]
      [  880.747918]  [<ffffffffc0a8c35c>] ? qsd_op_end+0x7c/0x6e0 [lquota]
      [  880.748111]  [<ffffffffc0645169>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
      [  880.748122]  [<ffffffffc04cf775>] ldiskfs_getblk+0x65/0x200 [ldiskfs]
      [  880.748131]  [<ffffffffc04cf937>] ldiskfs_bread+0x27/0xc0 [ldiskfs]
      [  880.748206]  [<ffffffffc0aefe16>] iam_node_read+0x66/0x100 [osd_ldiskfs]
      [  880.748222]  [<ffffffffc0af2d7d>] iam_lfix_guess+0x2d/0xd0 [osd_ldiskfs]
      [  880.748225]  [<ffffffff816b24b2>] ? mutex_lock+0x12/0x2f
      [  880.748235]  [<ffffffffc0aef5bc>] iam_container_setup+0x5c/0x120 [osd_ldiskfs]
      [  880.748248]  [<ffffffffc0ad552c>] osd_index_try+0x49c/0x690 [osd_ldiskfs]
      [  880.748255]  [<ffffffffc0a75abd>] lquota_disk_slv_find_create+0x71d/0x850 [lquota]
      [  880.748271]  [<ffffffffc0a9b3c5>] qmt_pool_new_conn+0x2f5/0x360 [lquota]
      [  880.748280]  [<ffffffffc0a93dcc>] qmt_intent_policy+0x65c/0xe50 [lquota]
      [  880.748608]  [<ffffffffc089e0d0>] ? lustre_msg_buf_v2+0x1b0/0x1b0 [ptlrpc]
      [  880.748709]  [<ffffffffc0c860da>] mdt_intent_opc+0x21a/0xae0 [mdt]
      [  880.748754]  [<ffffffffc08a2550>] ? lustre_swab_ldlm_policy_data+0x30/0x30 [ptlrpc]
      [  880.748776]  [<ffffffffc0645169>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
      [  880.748795]  [<ffffffffc0c8df63>] mdt_intent_policy+0x1a3/0x360 [mdt]
      [  880.748824]  [<ffffffffc0851f0e>] ldlm_lock_enqueue+0x34e/0xa50 [ptlrpc]
      [  880.748887]  [<ffffffffc043d67e>] ? cfs_hash_add+0xbe/0x1a0 [libcfs]
      [  880.748922]  [<ffffffffc087a753>] ldlm_handle_enqueue0+0x8f3/0x13e0 [ptlrpc]
      [  880.748959]  [<ffffffffc08a25d0>] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc]
      [  880.749059]  [<ffffffffc0900b32>] tgt_enqueue+0x62/0x210 [ptlrpc]
      [  880.749103]  [<ffffffffc09044da>] tgt_request_handle+0x92a/0x13b0 [ptlrpc]
      

      I think the following sequence of events occurred:

      • sanity-scrub test_1b called scrub_prep.
      • scrub_prep set fail_loc on the MDT to 0x198.
      • That prevented the osd object from being inserted into the OI, which broke the osd_fid_lookup logic for finding the quota slave (slv) file.
      • osd_fid_lookup returned an osd object with oo_inode equal to NULL (0x0).
      • Shortly afterwards, osd_index_try hit the BUG because it tried to access &LDISKFS_I(inode)->i_es_lock.
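
The last two steps can be cross-checked against the register dump in the oops. The following is a minimal Python sketch, not part of the ticket: it parses a few lines copied from the log and checks that the faulting address (CR2) is a small non-zero value that matches the first argument register (RDI), which is what one would expect if a lock embedded in a structure was addressed through a NULL base pointer. The helper names (`registers`, `decode_error_code`, `triage`) are hypothetical, and the sketch assumes 4 KiB pages and the x86-64 convention that RDI carries the first function argument.

```python
import re

# Lines copied from the oops above (timestamps dropped).
OOPS = """\
BUG: unable to handle kernel NULL pointer dereference at 0000000000000298
IP: [<ffffffff816b68a9>] _raw_read_lock+0x9/0x20
Oops: 0002 [#1] SMP
RDX: ffff8800a7b73810 RSI: 0000000000000000 RDI: 0000000000000298
CR2: 0000000000000298 CR3: 00000000360ce000 CR4: 00000000000006e0
"""

PAGE_SIZE = 4096  # assumption: 4 KiB pages, the usual x86_64 configuration


def registers(text):
    """Return {name: value} for every 'NAME: <16 hex digits>' pair in text."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]{1,2}): ([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}


def decode_error_code(code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "page_present": bool(code & 0x1),  # 0 means the page was not mapped
        "write": bool(code & 0x2),         # 1 means the access was a write
        "user_mode": bool(code & 0x4),     # 0 means the fault came from kernel mode
    }


def triage(text):
    """Summarise the fault: where it happened and what the address suggests."""
    regs = registers(text)
    code = int(re.search(r"Oops: ([0-9a-f]{4})", text).group(1), 16)
    symbol = re.search(r"IP: \[<[0-9a-f]+>\] (\w+)\+", text).group(1)
    fault = regs["CR2"]
    return {
        "symbol": symbol,
        "fault_address": fault,
        # The lock pointer passed to the faulting function is the bad address.
        "first_arg_is_fault_address": regs["RDI"] == fault,
        # A non-zero address inside the first page hints at NULL + member offset.
        "looks_like_null_base": 0 < fault < PAGE_SIZE,
        **decode_error_code(code),
    }


if __name__ == "__main__":
    for key, value in triage(OOPS).items():
        print(f"{key}: {value}")
```

The check is only a heuristic: it shows that the dump is consistent with a NULL inode being used to compute the address of i_es_lock, but it does not prove which pointer was NULL.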

          People

            Assignee: aboyko Alexander Boyko
            Reporter: aboyko Alexander Boyko