[LU-114] WARNING: at fs/namei.c:1332 lookup_one_len+0xf1/0x110()
Created: 04/Mar/11  Updated: 21/Mar/11  Resolved: 21/Mar/11

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.1.0
Fix Version/s: Lustre 2.1.0

Type: Bug
Priority: Minor
Reporter: Ned Bass
Assignee: Johann Lombardi (Inactive)
Resolution: Fixed
Votes: 0
Labels: None
Environment: RHEL6 x86_64

Severity: 3
Rank (Obsolete): 5095

 Description   

The following kernel warning happens when mounting the MDT for the first time after booting the node.

2011-02-24 15:50:06 ------------[ cut here ]------------
2011-02-24 15:50:06 WARNING: at fs/namei.c:1332 lookup_one_len+0xf1/0x110() (Not tainted)
2011-02-24 15:50:06 Hardware name: X8DTH-i/6/iF/6F
2011-02-24 15:50:06 Modules linked in: cmm osd_ldiskfs mdt mdd mds 
fsfilt_ldiskfs exportfs mgs mgc ext4 ldiskfs lustre lov osc lquota mdc fid fld
ko2iblnd ptlrpc obdclass lvfs lnet libcfs mbcache jbd2 ib_ipoib rdma_ucm ib_ucm
ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ib_sa mlx4_ib ib_mad ib_core sg 
sd_mod crc_t10dif dm_mirror dm_region_hash dm_log dm_mod video output sbs sbshc
power_meter hwmon acpi_pad parport serio_raw i2c_i801 i2c_core ata_generic
pata_acpi ata_piix iTCO_wdt iTCO_vendor_support ioatdma i7core_edac edac_core
mpt2sas scsi_transport_sas raid_class ipv6 nfs lockd fscache nfs_acl
auth_rpcgss sunrpc mlx4_core igb dca [last unloaded: ldiskfs]
2011-02-24 15:50:06 Pid: 8906, comm: llog_process_th Not tainted 2.6.32-14chaos #1
2011-02-24 15:50:06 Call Trace:
2011-02-24 15:50:06  [<ffffffff8106b8f7>] warn_slowpath_common+0x87/0xc0
2011-02-24 15:50:06  [<ffffffff8106b94a>] warn_slowpath_null+0x1a/0x20
2011-02-24 15:50:06  [<ffffffff81178cf1>] lookup_one_len+0xf1/0x110
2011-02-24 15:50:06  [<ffffffffa0567a62>] sptlrpc_target_local_copy_conf+0xc2/0xeb0 [ptlrpc]
2011-02-24 15:50:06  [<ffffffffa033154e>] ? cfs_timer_arm+0xe/0x10 [libcfs]
2011-02-24 15:50:06  [<ffffffffa0568c0b>] sptlrpc_conf_target_get_rules+0x3bb/0x5d0 [ptlrpc]
2011-02-24 15:50:06  [<ffffffffa09a8ceb>] ? mdd_llog_ctxt_get+0x7b/0x140 [mdd]
2011-02-24 15:50:06  [<ffffffffa09cf2c5>] mdt_adapt_sptlrpc_conf+0x45/0x110 [mdt]
2011-02-24 15:50:06  [<ffffffffa0a5f933>] ? cmm_llog_ctxt_get+0x53/0x120 [cmm]
2011-02-24 15:50:06  [<ffffffffa09e30b6>] mdt_device_alloc+0x1d06/0x25f0 [mdt]
2011-02-24 15:50:06  [<ffffffffa041ceaf>] obd_setup+0x1ff/0x330 [obdclass]
2011-02-24 15:50:06  [<ffffffffa09cf9b8>] ? mdt_init_export+0x1c8/0x1e0 [mdt]
2011-02-24 15:50:06  [<ffffffffa041d1e9>] class_setup+0x209/0xa50 [obdclass]
2011-02-24 15:50:06  [<ffffffffa0403b06>] ? class_name2dev+0x56/0xd0 [obdclass]
2011-02-24 15:50:06  [<ffffffffa0424d2c>] class_process_config+0xd6c/0x1fd0 [obdclass]
2011-02-24 15:50:06  [<ffffffffa0331983>] ? cfs_alloc+0x63/0x90 [libcfs]
2011-02-24 15:50:06  [<ffffffffa041fb6b>] ? lustre_cfg_new+0x33b/0x880 [obdclass]
2011-02-24 15:50:06  [<ffffffffa0427068>] class_config_llog_handler+0x948/0x16b0 [obdclass]
2011-02-24 15:50:06  [<ffffffff81096c9f>] ? up+0x2f/0x50
2011-02-24 15:50:06  [<ffffffffa03f0583>] llog_process_thread+0x9a3/0xe70 [obdclass]
2011-02-24 15:50:06  [<ffffffff810141ca>] child_rip+0xa/0x20
2011-02-24 15:50:06  [<ffffffffa03efbe0>] ? llog_process_thread+0x0/0xe70 [obdclass]
2011-02-24 15:50:06  [<ffffffff810141c0>] ? child_rip+0x0/0x20
2011-02-24 15:50:07 ---[ end trace 608329aca724c429 ]---

This is triggered by calling lookup_one_len() without holding the parent directory's inode mutex. This patch avoids the warning:

diff --git a/lustre/ptlrpc/sec_config.c b/lustre/ptlrpc/sec_config.c
index a8b9630..1a3ec29 100644
--- a/lustre/ptlrpc/sec_config.c
+++ b/lustre/ptlrpc/sec_config.c
@@ -1031,8 +1031,10 @@ int sptlrpc_target_local_copy_conf(struct obd_device *obd,
 
         push_ctxt(&saved, &obd->obd_lvfs_ctxt, NULL);
 
+        LOCK_INODE_MUTEX(cfs_fs_pwd(current->fs)->d_inode);
         dentry = lookup_one_len(MOUNT_CONFIGS_DIR, cfs_fs_pwd(current->fs),
                                 strlen(MOUNT_CONFIGS_DIR));
+        UNLOCK_INODE_MUTEX(cfs_fs_pwd(current->fs)->d_inode);
         if (IS_ERR(dentry)) {
                 rc = PTR_ERR(dentry);
                 CERROR("cannot lookup %s directory: rc = %d\n",

But I'd like someone with better knowledge of this code to weigh in. Specifically, would it be better to use ll_lookup_one_len(), or is there a reason we don't want to take the lock here? Thanks.



 Comments   
Comment by Johann Lombardi (Inactive) [ 04/Mar/11 ]

You are right, we should just use ll_lookup_one_len() instead of lookup_one_len(), as done in bugzilla ticket 23645.
I have checked the other places where lookup_one_len() is used and we should be fine now.
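
For illustration, the locking-wrapper pattern discussed here can be sketched in user space. This is a minimal mock, not the Lustre or kernel code: struct mock_dir, mock_lookup_one_len() and mock_ll_lookup_one_len() are hypothetical names, and a boolean flag stands in for the inode mutex. The plain helper counts a warning when it is called without the lock, mimicking the kernel warning reported above; the wrapper takes the lock around the call.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a parent directory and its inode mutex. */
struct mock_dir {
        bool        lock_held;  /* true while the mock "inode mutex" is held */
        int         warnings;   /* number of lookups done without the lock */
        const char *child;      /* the single entry this directory contains */
};

/* Plain lookup: expects the caller to hold the lock and warns otherwise. */
static const char *mock_lookup_one_len(const char *name, struct mock_dir *dir,
                                       size_t len)
{
        if (!dir->lock_held) {
                dir->warnings++;
                fprintf(stderr, "WARNING: lookup without parent lock held\n");
        }
        if (strlen(dir->child) == len && strncmp(dir->child, name, len) == 0)
                return dir->child;
        return NULL;
}

/* Wrapper: takes the parent lock around the plain lookup, then drops it. */
static const char *mock_ll_lookup_one_len(const char *name,
                                          struct mock_dir *dir, size_t len)
{
        const char *found;

        dir->lock_held = true;                  /* "lock" the parent */
        found = mock_lookup_one_len(name, dir, len);
        dir->lock_held = false;                 /* "unlock" the parent */
        return found;
}
```

In this mock, calling mock_lookup_one_len() directly on an unlocked directory bumps the warning counter, while going through mock_ll_lookup_one_len() does not.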

Comment by Oleg Drokin [ 04/Mar/11 ]

Johann, I wonder if you can submit a patch for that?
Thanks.

Comment by Ned Bass [ 04/Mar/11 ]

I also came across this line in the course of researching this bug
at lustre/obdfilter/filter.c:1495:filter_fid2dentry():

        dchild = /*ll_*/lookup_one_len(name, dparent, len);

The comment looks like a debugging artifact since the locking in that
function is handled by filter_parent_lock() and filter_parent_unlock().
It would be nice to clean that up while we're at it. Thanks.
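
A possible reason the plain lookup_one_len() is the right call in filter_fid2dentry() is that the parent is already locked by the caller, so a wrapper that takes the same lock again would deadlock. The sketch below is a user-space mock of that situation, under the assumption that filter_parent_lock() and the wrapper would take the same non-recursive lock. All mock_* names are hypothetical, and the mock lock returns -EDEADLK where a real mutex would simply hang.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a parent directory with a non-recursive lock. */
struct mock_parent {
        bool        lock_held;
        const char *child;      /* the single entry in this directory */
};

/* Returns 0 on success, -EDEADLK if the lock is already held. */
static int mock_parent_lock(struct mock_parent *p)
{
        if (p->lock_held)
                return -EDEADLK;
        p->lock_held = true;
        return 0;
}

static void mock_parent_unlock(struct mock_parent *p)
{
        p->lock_held = false;
}

/* Plain lookup: does no locking of its own. */
static const char *mock_lookup(const char *name, struct mock_parent *p,
                               size_t len)
{
        if (strlen(p->child) == len && strncmp(p->child, name, len) == 0)
                return p->child;
        return NULL;
}

/* Locking wrapper: fails with -EDEADLK if the caller already holds the lock. */
static const char *mock_locking_lookup(const char *name,
                                       struct mock_parent *p, size_t len,
                                       int *err)
{
        const char *found;

        *err = mock_parent_lock(p);
        if (*err != 0)
                return NULL;
        found = mock_lookup(name, p, len);
        mock_parent_unlock(p);
        return found;
}

/* Caller-locked pattern: lock the parent once, then use the plain lookup. */
static const char *mock_fid2dentry(const char *name, struct mock_parent *p,
                                   size_t len)
{
        const char *found;

        if (mock_parent_lock(p) != 0)
                return NULL;
        found = mock_lookup(name, p, len);
        mock_parent_unlock(p);
        return found;
}
```

Here mock_fid2dentry() succeeds with the plain lookup, whereas calling mock_locking_lookup() while the parent lock is already held reports -EDEADLK.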

Comment by Peter Jones [ 04/Mar/11 ]

Assign to Johann

Comment by Johann Lombardi (Inactive) [ 07/Mar/11 ]

I've pushed a patch for review:
http://review.whamcloud.com/#change,303

Comment by Build Master (Inactive) [ 07/Mar/11 ]

Integrated in reviews-centos5 #404
LU-114 use ll_lookup_one_len() instead of lookup_one_len() in sptlrpc_target_local_copy_conf() should lock the parent dir when doing lookup

Johann Lombardi : 39dc64eb2034bf8fea7f9752ac2b33ff557c442f
Files :

  • lustre/ptlrpc/sec_config.c
  • lustre/obdfilter/filter.c

Comment by Build Master (Inactive) [ 17/Mar/11 ]

Integrated in reviews-centos5 #493
LU-114 use ll_lookup_one_len() instead of lookup_one_len() in sptlrpc_target_local_copy_conf() should lock the parent dir when doing lookup

Johann Lombardi : 1d2ac42932a0c7cbcc2f6447750a507d86308580
Files :

  • lustre/obdfilter/filter.c
  • lustre/ptlrpc/sec_config.c

Comment by Build Master (Inactive) [ 17/Mar/11 ]

Integrated in lustre-master-centos5 #153
LU-114 use ll_lookup_one_len() instead of lookup_one_len() in sptlrpc_target_local_copy_conf() should lock the parent dir when doing lookup

Oleg Drokin : d0912bb3a0bf5a14002fb96047c3ea4ce1bfbc0e
Files :

  • lustre/ptlrpc/sec_config.c
  • lustre/obdfilter/filter.c

Comment by Johann Lombardi (Inactive) [ 21/Mar/11 ]

Patch landed for 2.1. Close the bug.
