[LU-16246] NULL pointer at lod_lookup+0x24/0x38 Created: 18/Oct/22  Updated: 15/Nov/22

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Jason Feng Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None
Environment:

lustre servers:
10 nodes, each node has kunpeng920 96core *2, memory 512GB, nvme 3.2T*4
centos 8.4.2105
kernel 5.10.0-60.18.0.50.aarch64 (openeuler 22.03 kernel)
lustre 0c68b13a5eeb408862bad795aaf9a24a11a14b6a

lustre clients:
10 nodes intel 6266C*2, memory 372GB
centos 8.4.2105
kernel 4.18.0-372.9.1.el8.x86_64

IO500 tag:io500-sc21


Issue Links:
Related
is related to LU-16245 __osd_init_iobuf()) ASSERTION( iobuf-... Open
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   
[32261.214407] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[32261.223858] Mem abort info:
[32261.227340]   ESR = 0x96000004
[32261.231077]   EC = 0x25: DABT (current EL), IL = 32 bits
[32261.237060]   SET = 0, FnV = 0
[32261.240797]   EA = 0, S1PTW = 0
[32261.244621] Data abort info:
[32261.248185]   ISV = 0, ISS = 0x00000004
[32261.252702]   CM = 0, WnR = 0
[32261.256354] user pgtable: 4k pages, 48-bit VAs, pgdp=0000202681405000
[32261.263462] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
[32261.270918] Internal error: Oops: 96000004 [#1] SMP
[32261.276466] Modules linked in: ofd(OE) ost(OE) osd_zfs(POE) osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) osd_ldiskfs(OE) lquota(OE) ldiskfs(OE) mbcache jbd2 lustre(OE) obdecho(OE) mgc(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ko2iblnd(OE) lnet(OE) crc32_generic libcfs(OE) dm_flakey dm_mod vfio_pci vfio_virqfd vfio_iommu_type1 vfio cuse rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) rfkill sunrpc nls_cp437 vfat fat zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) aes_ce_blk zcommon(POE) znvpair(POE) crypto_simd zavl(POE) ipmi_ssif cryptd icp(POE) aes_ce_cipher ghash_ce spl(OE) sha1_ce acpi_ipmi sbsa_gwdt ipmi_si ipmi_devintf ipmi_msghandler hisi_uncore_hha_pmu hisi_uncore_ddrc_pmu hisi_uncore_l3c_pmu hisi_uncore_pmu sch_fq_codel binfmt_misc knem(OE) xfs libcrc32c sd_mod sg hclge mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) mlx5_core(OE) mlxfw(OE) hisi_sas_v3_hw tls hisi_sas_main psample sha2_ce libsas nvme ahci
[32261.276555]  hibmc_drm mlxdevm(OE) sha256_arm64 nvme_core hns3 libahci scsi_transport_sas drm_vram_helper auxiliary(OE) t10_pi mlx_compat(OE) drm_ttm_helper libata hnae3 ttm megaraid_sas host_edma_drv i2c_designware_platform i2c_designware_core xpmem(OE) fuse
[32261.386429] CPU: 49 PID: 52372 Comm: mdt02_000 Kdump: loaded Tainted: P           OE     5.10.0-60.18.0.50.aarch64 #1
[32261.397678] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDDA, BIOS 1.35 04/30/2020
[32261.406595] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[32261.413307] pc : lod_lookup+0x24/0x38 [lod]
[32261.418192] lr : __mdd_lookup.isra.3+0x314/0x5b8 [mdd]
[32261.423997] sp : ffff8000650ab4d0
[32261.427987] x29: ffff8000650ab4d0 x28: ffff2042c84c8820
[32261.433966] x27: ffff80000912d000 x26: 00000000000034e0
[32261.439945] x25: ffff8000650ab6e0 x24: ffff2023d2d16c50
[32261.445924] x23: ffff2023d1ae0080 x22: ffff0045aaf12e60
[32261.451904] x21: ffff2023d1ae0080 x20: ffff80000912d000
[32261.457882] x19: 0000000000000000 x18: 0000000000000001
[32261.463861] x17: 0000000000000000 x16: ffff80000a7df920
[32261.469841] x15: ffffffffffffffff x14: ffffffffffffffff
[32261.475819] x13: 0000000000000018 x12: ffffffffffffffff
[32261.481798] x11: 0000000000000040 x10: 7f7f7f7f7f7f7f7f
[32261.487777] x9 : ffff80000ac39fc4 x8 : 0000000000000001
[32261.493757] x7 : 0000000000000b20 x6 : 0000000000004000
[32261.499737] x5 : ffff80000912d000 x4 : 0000000000000000
[32261.505716] x3 : ffff2023d2d16c50 x2 : ffff8000650ab6e0
[32261.511695] x1 : ffff2042d17dff00 x0 : ffff2023d1ae0080
[32261.517675] Call trace:
[32261.520818]  lod_lookup+0x24/0x38 [lod]
[32261.525337]  __mdd_lookup.isra.3+0x314/0x5b8 [mdd]
[32261.530806]  mdd_lookup+0x108/0x208 [mdd]
[32261.535524]  mdt_reint_open+0xffc/0x3810 [mdt]
[32261.540656]  mdt_reint_rec+0x170/0x390 [mdt]
[32261.545614]  mdt_reint_internal+0x6fc/0xf98 [mdt]
[32261.551004]  mdt_intent_open+0x17c/0x470 [mdt]
[32261.556134]  mdt_intent_opc+0x194/0x1040 [mdt]
[32261.561265]  mdt_intent_policy+0x23c/0x438 [mdt]
[32261.566662]  ldlm_lock_enqueue+0x5f0/0xbc0 [ptlrpc]
[32261.572276]  ldlm_handle_enqueue0+0x6ec/0x23e0 [ptlrpc]
[32261.578230]  tgt_enqueue+0xd4/0x2f0 [ptlrpc]
[32261.583232]  tgt_handle_request0+0xd4/0x9b0 [ptlrpc]
[32261.588922]  tgt_request_handle+0x7cc/0x1a30 [ptlrpc]
[32261.594701]  ptlrpc_server_handle_request+0x3bc/0x1218 [ptlrpc]
[32261.601342]  ptlrpc_main+0xdfc/0x16c8 [ptlrpc]
[32261.606462]  kthread+0x130/0x138
[32261.610369]  ret_from_fork+0x10/0x18
[32261.614621] Code: f9400c24 d1006084 aa0403e1 f9401c84 (f9400084)
[32261.621429] SMP: stopping secondary CPUs
[32261.628375] Starting crashdump kernel...
[32261.632977] Bye!


 Comments   
Comment by Andreas Dilger [ 18/Oct/22 ]

I may not be able to help much here, since I suspect this issue is somehow specific to the ARM servers (what are the PAGE_SIZE and endianness?), but some things of note:

  • Lustre version "0c68b13a5eeb408862bad795aaf9a24a11a14b6a" is v2_15_52, which is a development branch that is landing new features and has not been tested extensively. You would be better off running the b2_15 branch, which is the Long Term Support (LTS) maintenance branch and is only getting bug fixes.
  • IO500 tag: io500-sc21 is old; you should be using io500-sc22 if you are planning to submit a result for the upcoming IO500 list at SC'22.
  • lod_lookup+0x24/0x38: are you able to decode this address in GDB and/or add printk() to this function to see which pointer is NULL?
Comment by Jason Feng [ 19/Oct/22 ]

Thanks for the comment.

I will try b2_15 and the newer io500-sc22 tag.

static int lod_lookup(const struct lu_env *env, struct dt_object *dt,
                      struct dt_rec *rec, const struct dt_key *key)
{
        struct dt_object *next = dt_object_child(dt);

        return next->do_index_ops->dio_lookup(env, next, rec, key);
}

It shows that next == NULL here. If next == NULL, would it be OK to return -1 to avoid the NULL pointer dereference?
Comment by Andreas Dilger [ 19/Oct/22 ]
static int lod_lookup(const struct lu_env *env, struct dt_object *dt,
                      struct dt_rec *rec, const struct dt_key *key)
{
        struct dt_object *next = dt_object_child(dt);

        return next->do_index_ops->dio_lookup(env, next, rec, key);
}

It shows that next == NULL here. If next == NULL, would it be OK to return -1 to avoid the NULL pointer dereference?

That might be OK for debugging (I would suggest returning something like -ENOENT or -EINVAL), but I suspect it will still not work properly because there is likely a problem elsewhere in the code.

The "dt" object is a directory, and the mdd_lookup() caller should have initialized the object correctly before calling lod_lookup(). I suspect some larger problem here, like the locking being broken or similar.

Comment by Jason Feng [ 19/Oct/22 ]

The directory was not deleted during the test, so this may be caused by a memory problem. I will try to reproduce the problem and capture a complete vmcore file for further analysis.

Comment by Aurelien Degremont (Inactive) [ 09/Nov/22 ]

For the record, AWS reproduced a very similar crash in `lod_lookup()` on AWS-specific Graviton ARM processors (4K pages), but there it was 'do_index_ops' and not 'next' that was NULL. The crash dump shows that the dt_object structure in memory is correct and do_index_ops has the correct value, but the register was NULL and the system crashed. This happened several times. This was running 2.12.9 + backports.

Comment by Jason Feng [ 09/Nov/22 ]

Because a timing (ordering) problem cannot be identified from the kdump alone, can we add logging to help locate the problem?

 

Generated at Sat Feb 10 03:25:18 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.