Lustre / LU-4721

sanity-scrub test 14: Oops: IP: [<ffffffff81294840>] list_del+0x10/0xa0


    Description

      While running sanity-scrub test 14, the MDS hit the following Oops:

      23:43:27:Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-MDT0000.quota_slave.enabled
      23:43:27:Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.mdt=ug3
      23:43:27:Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.ost=ug3
      23:43:27:BUG: unable to handle kernel NULL pointer dereference at (null)
      23:43:27:IP: [<ffffffff81294840>] list_del+0x10/0xa0
      23:43:27:PGD 0 
      23:43:27:Oops: 0000 [#1] SMP 
      23:43:27:last sysfs file: /sys/devices/system/cpu/online
      23:43:27:CPU 0 
      23:43:27:Modules linked in: osp(U) mdd(U) lfsck(U) lod(U) mdt(U) mgs(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic sha256_generic libcfs(U) ldiskfs(U) jbd2 nfsd exportfs nfs lockd fscache auth_rpcgss nfs_acl sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa ib_mad ib_core microcode virtio_balloon 8139too 8139cp mii i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: obdecho]
      23:43:27:
      23:43:27:Pid: 16460, comm: mdt00_000 Not tainted 2.6.32-431.5.1.el6_lustre.x86_64 #1 Red Hat KVM
      23:43:27:RIP: 0010:[<ffffffff81294840>]  [<ffffffff81294840>] list_del+0x10/0xa0
      23:43:27:RSP: 0018:ffff88007a9419c0  EFLAGS: 00010286
      23:43:27:RAX: 0000000000000000 RBX: ffff88006c6a0ec0 RCX: ffff88007c35c040
      23:43:27:RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88006c6a0ec0
      23:43:27:RBP: ffff88007a9419d0 R08: ffff88007a940000 R09: 00000000ffffffff
      23:43:27:R10: 0000000000000000 R11: 0000000000000000 R12: ffff880037c79480
      23:43:27:R13: ffff88005755e180 R14: ffff88005755e208 R15: ffffffffa06251c0
      23:43:27:FS:  0000000000000000(0000) GS:ffff880002200000(0000) knlGS:0000000000000000
      23:43:27:CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      23:43:27:CR2: 0000000000000000 CR3: 000000007d6af000 CR4: 00000000000006f0
      23:43:27:DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      23:43:27:DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      23:43:27:Process mdt00_000 (pid: 16460, threadinfo ffff88007a940000, task ffff88007c8bc080)
      23:43:27:Stack:
      23:43:27: ffff88006c6a0ec0 ffff88006c6a0ec0 ffff88007a941a00 ffffffffa05d50de
      23:43:27:<d> ffff880037c79480 ffff880037aaa1c0 ffff880057590000 ffff880037aaa268
      23:43:27:<d> ffff88007a941a70 ffffffffa0c12351 0000000001020000 ffff880074c873c0
      23:43:27:Call Trace:
      23:43:27: [<ffffffffa05d50de>] local_oid_storage_fini+0x7e/0x280 [obdclass]
      23:43:27: [<ffffffffa0c12351>] lquota_disk_find_create+0x1a1/0x2d0 [lquota]
      23:43:27: [<ffffffffa0c12fcc>] lquota_disk_slv_find_create+0x21c/0x4b0 [lquota]
      23:43:27: [<ffffffffa0c36510>] qmt_pool_new_conn+0x2e0/0x3a6 [lquota]
      23:43:27: [<ffffffffa077c600>] ? lustre_swab_quota_body+0x0/0x50 [ptlrpc]
      23:43:27: [<ffffffffa0c2ef95>] qmt_intent_policy+0x545/0x9e0 [lquota]
      23:43:27: [<ffffffffa077af80>] ? lustre_swab_ldlm_intent+0x0/0x20 [ptlrpc]
      23:43:27: [<ffffffffa0d83ba8>] mdt_intent_policy+0x548/0x770 [mdt]
      23:43:27: [<ffffffffa072f511>] ldlm_lock_enqueue+0x361/0x8c0 [ptlrpc]
      23:43:27: [<ffffffffa0758abf>] ldlm_handle_enqueue0+0x4ef/0x10a0 [ptlrpc]
      23:43:27: [<ffffffffa0d83ed6>] mdt_enqueue+0x46/0xe0 [mdt]
      23:43:27: [<ffffffffa0d8abca>] mdt_handle_common+0x52a/0x1470 [mdt]
      23:43:27: [<ffffffffa0dc54f5>] mds_regular_handle+0x15/0x20 [mdt]
      23:43:28: [<ffffffffa0788a45>] ptlrpc_server_handle_request+0x385/0xc00 [ptlrpc]
      23:43:28: [<ffffffffa04963df>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
      23:43:28: [<ffffffffa07800e9>] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc]
      23:43:28: [<ffffffffa0789dad>] ptlrpc_main+0xaed/0x1740 [ptlrpc]
      23:43:28: [<ffffffffa07892c0>] ? ptlrpc_main+0x0/0x1740 [ptlrpc]
      23:43:28: [<ffffffff8109aee6>] kthread+0x96/0xa0
      23:43:28: [<ffffffff8100c20a>] child_rip+0xa/0x20
      23:43:28: [<ffffffff8109ae50>] ? kthread+0x0/0xa0
      23:43:28: [<ffffffff8100c200>] ? child_rip+0x0/0x20
      23:43:28:Code: 89 95 fc fe ff ff e9 ab fd ff ff 4c 8b ad e8 fe ff ff e9 db fd ff ff 90 90 90 90 55 48 89 e5 53 48 89 fb 48 83 ec 08 48 8b 47 08 <4c> 8b 00 4c 39 c7 75 39 48 8b 03 4c 8b 40 08 4c 39 c3 75 4c 48 
      23:43:28:RIP  [<ffffffff81294840>] list_del+0x10/0xa0
      23:43:28: RSP <ffff88007a9419c0>
      23:43:28:CR2: 0000000000000000
      

      Maloo report: https://maloo.whamcloud.com/test_sets/ffffb224-a49f-11e3-8fba-52540035b04c

        Activity
          yujian Jian Yu added a comment -

          Lustre client build: https://build.hpdd.intel.com/job/lustre-b2_5/74/ (2.5.2 RC2)
          Lustre server build: https://build.hpdd.intel.com/job/lustre-b2_4/73/ (2.4.3)

          The same failure occurred while running sanity-scrub test 1b:
          https://maloo.whamcloud.com/test_sets/c2d42a94-f91b-11e3-9283-52540035b04c


          yong.fan nasf (Inactive) added a comment -

          The patch for Lustre-2.5 has been landed to lustre-2.5.2.

          yong.fan nasf (Inactive) added a comment -

          The patch for master has been landed, and the patch for Lustre-2.5 will be landed later.

          yong.fan nasf (Inactive) added a comment -

          The patch has been landed to master, but we still need to land the patch to Lustre-2.5.x.

          tappro Mikhail Pershin added a comment -

          Well, in fact, after step 2 in your scenario it does not matter which thread reaches the L2 point first: both threads will see a zero reference count and wrongly pass that check, so one of them will free the data first and the other will then fail on the freed data. Yes, it looks like a possible scenario.
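
          As an aside, the interleaving can be replayed deterministically with a tiny, purely hypothetical userspace sketch (not Lustre code); the L1/L2/L3 labels refer to the markers in the local_oid_storage_fini() listing in the next comment. It only mimics the buggy refcount ordering with plain integers:

          #include <stdio.h>

          /* Stand-in for struct local_oid_storage: just a refcount and a flag
           * that records whether OBD_FREE_PTR() has already happened. */
          struct los_sim {
                  int refcount;
                  int freed;
          };

          int main(void)
          {
                  struct los_sim los = { .refcount = 1, .freed = 0 };

                  /* thread1, L1: drops what it thinks is the last reference,
                   * then loses the race for ls_los_mutex. */
                  los.refcount--;
                  printf("thread1 passed L1, refcount=%d\n", los.refcount);

                  /* thread2, local_oid_storage_init(): still finds the los on
                   * ls_los_list and takes a new reference. */
                  los.refcount++;

                  /* thread2, L1 + L2: drops its reference, wins the mutex first,
                   * sees a zero refcount and frees the los. */
                  los.refcount--;
                  if (los.refcount == 0) {
                          printf("thread2 passed L1 and L2, freeing the los\n");
                          los.freed = 1;
                  }

                  /* thread1 finally gets the mutex, but the recheck at L2 and
                   * the list_del() at L3 now touch freed memory. */
                  if (los.freed)
                          printf("thread1 reaches L2/L3 on freed memory: use-after-free\n");

                  return 0;
          }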

          yong.fan nasf (Inactive) added a comment -

          There is a race condition in local_oid_storage_fini() which may cause this failure; consider the following scenario:

          void local_oid_storage_fini(const struct lu_env *env,
                                      struct local_oid_storage *los)
          {
                  struct ls_device *ls;
          
          L1==>        if (!atomic_dec_and_test(&los->los_refcount))
                          return;
          
                  LASSERT(env);
                  LASSERT(los->los_dev);
                  ls = dt2ls_dev(los->los_dev);
          
          L2==>        mutex_lock(&ls->ls_los_mutex);
                  if (atomic_read(&los->los_refcount) > 0) {
                          mutex_unlock(&ls->ls_los_mutex);
                          return;
                  }
          
                  if (los->los_obj)
                          lu_object_put_nocache(env, &los->los_obj->do_lu);
          L3==>        cfs_list_del(&los->los_list);
          L4==>        OBD_FREE_PTR(los);
                  mutex_unlock(&ls->ls_los_mutex);
                  ls_device_put(env, ls);
          }
          

          1) Thread1 calls local_oid_storage_fini(). It is the last user of the "los", so it passes 'L1', but then blocks at 'L2' waiting for the mutex. At this point the "los" is still on ls::ls_los_list, so other threads can still find it.

          2) Thread2 calls local_oid_storage_init() for the same sequence. It obtains the mutex before thread1 does, so it finds the "los" that thread1 is about to destroy and takes a new reference on it.

          3) After thread2 releases the mutex, thread1 still has NOT obtained it, because of scheduling delay or other contention on the mutex.

          4) Thread2 then calls local_oid_storage_fini(). Thread2 also believes it is the last user of the "los", and it happens to obtain the mutex before thread1 once again, so it passes 'L1' and 'L2' and frees the "los" at 'L4'.

          5) Now it is thread1's turn, but the "los" has already been freed, and if the freed RAM has been reallocated to someone else, thread1 may crash at 'L3' on a NULL list_head.
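
          One way the window above could be closed is sketched below. This is only an illustrative sketch, under the assumption that local_oid_storage_init() takes its reference on an existing "los" while holding ls_los_mutex; it is not necessarily what the patches referenced in this ticket do. The idea is to fold the final reference drop and the mutex acquisition into a single step with the kernel's atomic_dec_and_mutex_lock(), so that only the thread which really drops the last reference proceeds to unlink and free the "los", and it does so with the mutex already held:

          /* Sketch only, not the actual fix from the patches in this ticket. */
          void local_oid_storage_fini(const struct lu_env *env,
                                      struct local_oid_storage *los)
          {
                  struct ls_device *ls;

                  LASSERT(env);
                  LASSERT(los->los_dev);
                  ls = dt2ls_dev(los->los_dev);

                  /* Drop the reference and take ls_los_mutex atomically: this
                   * returns true, with the mutex held, only for the thread whose
                   * decrement makes the counter reach zero. A concurrent
                   * local_oid_storage_init() that bumps the refcount under the
                   * same mutex either runs before the counter can hit zero or
                   * after the los has been unlinked from ls_los_list. */
                  if (!atomic_dec_and_mutex_lock(&los->los_refcount,
                                                 &ls->ls_los_mutex))
                          return;

                  if (los->los_obj)
                          lu_object_put_nocache(env, &los->los_obj->do_lu);
                  cfs_list_del(&los->los_list);
                  OBD_FREE_PTR(los);
                  mutex_unlock(&ls->ls_los_mutex);
                  ls_device_put(env, ls);
          }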


          adilger Andreas Dilger added a comment -

          The above patch is for b2_5; the patch for master is: http://review.whamcloud.com/9573

          yong.fan nasf (Inactive) added a comment -

          Here is the patch: http://review.whamcloud.com/#/c/9572/

          yong.fan nasf (Inactive) added a comment -

          I found some improper code in the current implementation which may cause invalid memory access or an invalid free, but I am not yet sure whether it is the real cause of this failure. I cannot reproduce the failure myself. Anyway, I will make a patch to fix the issues I found.
          yujian Jian Yu added a comment -

          Here is a patch trying to reproduce the failure: http://review.whamcloud.com/9525


          People

            yong.fan nasf (Inactive)
            yujian Jian Yu