[LU-993] 1.8<->2.1.54 Test failure on test suite sanity :osd_attr_set()) ASSERTION((oh)->ot_declare_attr_set > 0) failed Created: 14/Jan/12  Updated: 12/Mar/12  Resolved: 09/Feb/12

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.2.0
Fix Version/s: Lustre 2.2.0

Type: Bug Priority: Blocker
Reporter: Maloo Assignee: Niu Yawei (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Attachments: Text File mds3.crash.txt    
Issue Links:
Related
is related to LU-1190 Test failure on test suite sanity, su... Resolved
Severity: 3
Rank (Obsolete): 4238

 Description   

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/30d15550-3e73-11e1-b417-5254004bbbd3.



 Comments   
Comment by Di Wang [ 14/Jan/12 ]

03:36:47:Lustre: DEBUG MARKER: == sanity test 51b: mkdir .../t-0 --- .../t-70000 ====================== 03:35:35 (1326454535)
03:37:33:LustreError: 4720:0:(osd_handler.c:1573:osd_attr_set()) ASSERTION((oh)->ot_declare_attr_set > 0) failed
03:37:33:LustreError: 4720:0:(osd_handler.c:1573:osd_attr_set()) LBUG
03:37:33:Pid: 4720, comm: mdt_01
03:37:33:
03:37:33:Call Trace:
03:37:33: [<ffffffffa0370855>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
03:37:33: [<ffffffffa0370e95>] lbug_with_loc+0x75/0xe0 [libcfs]
03:37:33: [<ffffffffa037bd86>] libcfs_assertion_failed+0x66/0x70 [libcfs]
03:37:33: [<ffffffffa08b12d7>] osd_attr_set+0x4a7/0x510 [osd_ldiskfs]
03:37:33: [<ffffffffa08b011a>] ? osd_xattr_get+0x1ea/0x270 [osd_ldiskfs]
03:37:33: [<ffffffffa0802222>] ? __mdd_xattr_set+0xb2/0x330 [mdd]
03:37:33: [<ffffffffa08015a4>] mdd_attr_set_internal+0xb4/0x310 [mdd]
03:37:34: [<ffffffffa0801b55>] mdd_attr_check_set_internal+0x355/0x390 [mdd]
03:37:34: [<ffffffffa080f2ae>] ? mdd_lov_set_md+0x4be/0x610 [mdd]
03:37:34: [<ffffffffa0801bf9>] mdd_attr_check_set_internal_locked+0x69/0x180 [mdd]
03:37:34: [<ffffffffa08239e8>] mdd_create+0x1ec8/0x2470 [mdd]
03:37:34: [<ffffffffa04838ad>] ? htable_lookup+0xed/0x190 [obdclass]
03:37:38: [<ffffffffa037f599>] ? cfs_hash_bd_add_locked+0x29/0x90 [libcfs]
03:37:38: [<ffffffffa0579565>] ? lustre_msg_buf+0x85/0x90 [ptlrpc]
03:37:38: [<ffffffffa08cb11f>] ? cml_lookup+0x8f/0x1f0 [cmm]
03:37:39: [<ffffffffa05a74fb>] ? __req_capsule_get+0x14b/0x6b0 [ptlrpc]
03:37:39: [<ffffffffa08cc52c>] cml_create+0xbc/0x280 [cmm]
03:37:39: [<ffffffffa08676a6>] ? mdt_version_save+0x96/0x170 [mdt]
03:37:39: [<ffffffffa08686e1>] mdt_md_create+0x481/0x6a0 [mdt]
03:37:39: [<ffffffffa0554991>] ? ldlm_request_cancel+0x351/0x420 [ptlrpc]
03:37:39: [<ffffffffa03de98c>] ? lprocfs_counter_add+0x12c/0x196 [lvfs]
03:37:39: [<ffffffffa0869448>] mdt_reint_create+0x118/0x740 [mdt]
03:37:39: [<ffffffffa0828346>] ? md_ucred+0x26/0x60 [mdd]
03:37:39: [<ffffffffa0828346>] ? md_ucred+0x26/0x60 [mdd]
03:37:39: [<ffffffffa086676f>] mdt_reint_rec+0x3f/0x100 [mdt]
03:37:39: [<ffffffffa057ba44>] ? lustre_msg_get_flags+0x34/0xa0 [ptlrpc]
03:37:39: [<ffffffffa085ebf4>] mdt_reint_internal+0x6d4/0x9f0 [mdt]
03:37:39: [<ffffffffa08547e6>] ? mdt_reint_opcode+0x96/0x160 [mdt]
03:37:39: [<ffffffffa085ef5c>] mdt_reint+0x4c/0x120 [mdt]
03:37:39: [<ffffffffa057b518>] ? lustre_msg_check_version+0xc8/0xe0 [ptlrpc]
03:37:43: [<ffffffffa0851ec5>] mdt_handle_common+0x8d5/0x1810 [mdt]
03:37:44: [<ffffffffa05791a4>] ? lustre_msg_get_opc+0x94/0x100 [ptlrpc]
03:37:44: [<ffffffffa0852ed5>] mdt_regular_handle+0x15/0x20 [mdt]
03:37:44: [<ffffffffa0589f29>] ptlrpc_main+0xbc9/0x19a0 [ptlrpc]
03:37:44: [<ffffffffa0589360>] ? ptlrpc_main+0x0/0x19a0 [ptlrpc]
03:37:44: [<ffffffff8100c1ca>] child_rip+0xa/0x20
03:37:44: [<ffffffffa0589360>] ? ptlrpc_main+0x0/0x19a0 [ptlrpc]
03:37:44: [<ffffffffa0589360>] ? ptlrpc_main+0x0/0x19a0 [ptlrpc]
03:37:44: [<ffffffff8100c1c0>] ? child_rip+0x0/0x20
03:37:44:
03:37:44:Kernel panic - not syncing: LBUG
03:37:44:Pid: 4720, comm: mdt_01 Not tainted 2.6.32-131.17.1.el6_lustre.x86_64 #1
03:37:44:Call Trace:
03:37:44: [<ffffffff814dac78>] ? panic+0x78/0x143
03:37:44: [<ffffffffa0370eeb>] ? lbug_with_loc+0xcb/0xe0 [libcfs]
03:37:44: [<ffffffffa037bd86>] ? libcfs_assertion_failed+0x66/0x70 [libcfs]
03:37:44: [<ffffffffa08b12d7>] ? osd_attr_set+0x4a7/0x510 [osd_ldiskfs]
03:37:45: [<ffffffffa08b011a>] ? osd_xattr_get+0x1ea/0x270 [osd_ldiskfs]
03:37:45: [<ffffffffa0802222>] ? __mdd_xattr_set+0xb2/0x330 [mdd]
03:37:45: [<ffffffffa08015a4>] ? mdd_attr_set_internal+0xb4/0x310 [mdd]
03:37:46: [<ffffffffa0801b55>] ? mdd_attr_check_set_internal+0x355/0x390 [mdd]
03:37:46: [<ffffffffa080f2ae>] ? mdd_lov_set_md+0x4be/0x610 [mdd]
03:37:46: [<ffffffffa0801bf9>] ? mdd_attr_check_set_internal_locked+0x69/0x180 [mdd]
03:37:46: [<ffffffffa08239e8>] ? mdd_create+0x1ec8/0x2470 [mdd]
03:37:46: [<ffffffffa04838ad>] ? htable_lookup+0xed/0x190 [obdclass]
03:37:46: [<ffffffffa037f599>] ? cfs_hash_bd_add_locked+0x29/0x90 [libcfs]
03:37:46: [<ffffffffa0579565>] ? lustre_msg_buf+0x85/0x90 [ptlrpc]
03:37:46: [<ffffffffa08cb11f>] ? cml_lookup+0x8f/0x1f0 [cmm]
03:37:46: [<ffffffffa05a74fb>] ? __req_capsule_get+0x14b/0x6b0 [ptlrpc]
03:37:46: [<ffffffffa08cc52c>] ? cml_create+0xbc/0x280 [cmm]
03:37:46: [<ffffffffa08676a6>] ? mdt_version_save+0x96/0x170 [mdt]
03:37:46: [<ffffffffa08686e1>] ? mdt_md_create+0x481/0x6a0 [mdt]
03:37:46: [<ffffffffa0554991>] ? ldlm_request_cancel+0x351/0x420 [ptlrpc]
03:37:46: [<ffffffffa03de98c>] ? lprocfs_counter_add+0x12c/0x196 [lvfs]
03:37:46: [<ffffffffa0869448>] ? mdt_reint_create+0x118/0x740 [mdt]
03:37:46: [<ffffffffa0828346>] ? md_ucred+0x26/0x60 [mdd]
03:37:46: [<ffffffffa0828346>] ? md_ucred+0x26/0x60 [mdd]
03:37:47: [<ffffffffa086676f>] ? mdt_reint_rec+0x3f/0x100 [mdt]
03:37:47: [<ffffffffa057ba44>] ? lustre_msg_get_flags+0x34/0xa0 [ptlrpc]
03:37:47: [<ffffffffa085ebf4>] ? mdt_reint_internal+0x6d4/0x9f0 [mdt]
03:37:47: [<ffffffffa08547e6>] ? mdt_reint_opcode+0x96/0x160 [mdt]
03:37:47: [<ffffffffa085ef5c>] ? mdt_reint+0x4c/0x120 [mdt]
03:37:47: [<ffffffffa057b518>] ? lustre_msg_check_version+0xc8/0xe0 [ptlrpc]
03:37:47: [<ffffffffa0851ec5>] ? mdt_handle_common+0x8d5/0x1810 [mdt]
03:37:47: [<ffffffffa05791a4>] ? lustre_msg_get_opc+0x94/0x100 [ptlrpc]
03:37:47: [<ffffffffa0852ed5>] ? mdt_regular_handle+0x15/0x20 [mdt]
03:37:47: [<ffffffffa0589f29>] ? ptlrpc_main+0xbc9/0x19a0 [ptlrpc]
03:37:47: [<ffffffffa0589360>] ? ptlrpc_main+0x0/0x19a0 [ptlrpc]
03:37:47: [<ffffffff8100c1ca>] ? child_rip+0xa/0x20
03:37:47: [<ffffffffa0589360>] ? ptlrpc_main+0x0/0x19a0 [ptlrpc]
03:37:47: [<ffffffffa0589360>] ? ptlrpc_main+0x0/0x19a0 [ptlrpc]
03:37:47: [<ffffffff8100c1c0>] ? child_rip+0x0/0x20

Server: 2.1.54
Client: 1.8.6 (Sigh, Jenkins does not support the 1.8.7 client yet. TT-308)

Comment by Peter Jones [ 15/Jan/12 ]

Niu

Could you look into this one please?

Thanks

Peter

Comment by Niu Yawei (Inactive) [ 16/Jan/12 ]

The sanity test_51b of master only creates 70 sub-directories; see the comment in the test script:

#export NUMTEST=70000
# FIXME: I select a relatively small number to do basic test.
# large number may give panic(). debugging on this is going on.
export NUMTEST=70
test_51b() {

However, the b1_8 sanity test creates 70000 sub-directories, which is why this bug only appeared in interoperability testing. See the log:

== sanity test 51b: mkdir .../t-0 --- .../t-70000 ====================== 03:35:35 (1326454535)

Although sanity 51b of b1_8 isn't a valid test for master, it revealed an OSD API porting defect:

When more than LDISKFS_LINK_MAX (65000) sub-directories are created, mdd_attr_set_internal() is called to support the large directory. However, this attr_set operation wasn't declared during mdd_declare_create(), so ASSERTION((oh)->ot_declare_attr_set > 0) is eventually triggered in osd_attr_set().
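To make the declare/execute mismatch concrete, here is a minimal toy model in C. It is only a sketch, not Lustre code: every toy_* name, the TOY_LINK_MAX constant, and the "nlink == 1 means no longer tracked" convention are assumptions made for illustration. The only things taken from the analysis above are the 65000 limit and the rule that an operation must be declared on the handle before it is executed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for LDISKFS_LINK_MAX from the analysis above. */
#define TOY_LINK_MAX 65000

/* Toy transaction handle: counts operations declared up front. */
struct toy_handle {
	int declared_ref_add;
	int declared_attr_set;
};

/* Toy parent directory: only its link count is modelled. */
struct toy_dir {
	unsigned int nlink;
};

/* Declare phase for a mkdir; with_fix models the extra declaration. */
static void toy_declare_create(struct toy_handle *h, bool with_fix)
{
	h->declared_ref_add++;
	if (with_fix)
		h->declared_attr_set++;
}

/* Execute phase. Returns false where the real code would hit the LBUG. */
static bool toy_create(struct toy_handle *h, struct toy_dir *parent)
{
	if (h->declared_ref_add <= 0)
		return false;
	h->declared_ref_add--;

	/* In this toy, nlink == 1 means "no longer tracked". */
	if (parent->nlink != 1)
		parent->nlink++;

	if (parent->nlink > TOY_LINK_MAX) {
		/* Large-directory path: needs an attr_set on the parent. */
		if (h->declared_attr_set <= 0)
			return false; /* undeclared operation */
		h->declared_attr_set--;
		parent->nlink = 1;
	}
	return true;
}

/* Create n sub-directories; return how many succeeded before a failure. */
int toy_mkdir_many(int n, bool with_fix)
{
	struct toy_dir parent = { .nlink = 2 };

	for (int i = 0; i < n; i++) {
		struct toy_handle h = { 0, 0 };

		toy_declare_create(&h, with_fix);
		if (!toy_create(&h, &parent))
			return i;
	}
	return n;
}
```

In this model, toy_mkdir_many(70, false) completes because the link limit is never reached, toy_mkdir_many(70000, false) stops early at the undeclared attr_set, and toy_mkdir_many(70000, true) completes. That matches the observation that only the b1_8 variant of test 51b exposes the problem.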

Alex, could you take a look to see if I'm right? Do you have any ideas on how to fix it? Thanks in advance.

Comment by Alex Zhuravlev [ 16/Jan/12 ]

I'd think the following should be enough:

diff --git a/lustre/mdd/mdd_dir.c b/lustre/mdd/mdd_dir.c
index a2693a3..2cff838 100644
--- a/lustre/mdd/mdd_dir.c
+++ b/lustre/mdd/mdd_dir.c
@@ -1843,6 +1843,8 @@ static int mdd_declare_create(const struct lu_env *env,
0, handle);
if (rc == 0)
rc = mdo_declare_ref_add(env, p, handle);
+ if (rc == 0)
+ rc = mdo_declare_attr_set(env, p, &ma->ma_attr, handle);
}
if (rc)
GOTO(out, rc);

Comment by Niu Yawei (Inactive) [ 16/Jan/12 ]

Thanks, Alex. http://review.whamcloud.com/1971

Comment by Cliff White (Inactive) [ 17/Jan/12 ]

We have seen this issue on Hyperion, panic attached

Comment by Peter Jones [ 02/Feb/12 ]

Niu

LU-1048 has landed, so can you please rebase your patch?

Thanks

Peter

Comment by Build Master (Inactive) [ 09/Feb/12 ]

Integrated in lustre-master » x86_64,client,el5,ofa #462
LU-993 osd: code cleanup for directory nlink count (Revision ec20be97b9f977d3f4944523baaffb1bf95cf76c)

Result = SUCCESS
Oleg Drokin : ec20be97b9f977d3f4944523baaffb1bf95cf76c
Files :

  • lustre/mdd/mdd_object.c
  • lustre/mdd/mdd_internal.h
  • lustre/tests/sanity.sh
  • lustre/osd-ldiskfs/osd_handler.c
  • lustre/mdd/mdd_dir.c
Comment by Build Master (Inactive) [ 09/Feb/12 ]

Integrated in lustre-master » x86_64,client,el5,inkernel #462
LU-993 osd: code cleanup for directory nlink count (Revision ec20be97b9f977d3f4944523baaffb1bf95cf76c)

Result = SUCCESS
Oleg Drokin : ec20be97b9f977d3f4944523baaffb1bf95cf76c
Files :

  • lustre/mdd/mdd_internal.h
  • lustre/tests/sanity.sh
  • lustre/mdd/mdd_object.c
  • lustre/osd-ldiskfs/osd_handler.c
  • lustre/mdd/mdd_dir.c
Comment by Build Master (Inactive) [ 09/Feb/12 ]

Integrated in lustre-master » x86_64,server,el5,inkernel #462
LU-993 osd: code cleanup for directory nlink count (Revision ec20be97b9f977d3f4944523baaffb1bf95cf76c)

Result = SUCCESS
Oleg Drokin : ec20be97b9f977d3f4944523baaffb1bf95cf76c
Files :

  • lustre/tests/sanity.sh
  • lustre/mdd/mdd_internal.h
  • lustre/mdd/mdd_object.c
  • lustre/osd-ldiskfs/osd_handler.c
  • lustre/mdd/mdd_dir.c
Comment by Build Master (Inactive) [ 09/Feb/12 ]

Integrated in lustre-master » x86_64,client,ubuntu1004,inkernel #462
LU-993 osd: code cleanup for directory nlink count (Revision ec20be97b9f977d3f4944523baaffb1bf95cf76c)

Result = SUCCESS
Oleg Drokin : ec20be97b9f977d3f4944523baaffb1bf95cf76c
Files :

  • lustre/osd-ldiskfs/osd_handler.c
  • lustre/mdd/mdd_object.c
  • lustre/tests/sanity.sh
  • lustre/mdd/mdd_dir.c
  • lustre/mdd/mdd_internal.h
Comment by Build Master (Inactive) [ 09/Feb/12 ]

Integrated in lustre-master » x86_64,server,el6,inkernel #462
LU-993 osd: code cleanup for directory nlink count (Revision ec20be97b9f977d3f4944523baaffb1bf95cf76c)

Result = SUCCESS
Oleg Drokin : ec20be97b9f977d3f4944523baaffb1bf95cf76c
Files :

  • lustre/mdd/mdd_internal.h
  • lustre/mdd/mdd_dir.c
  • lustre/tests/sanity.sh
  • lustre/osd-ldiskfs/osd_handler.c
  • lustre/mdd/mdd_object.c
Comment by Build Master (Inactive) [ 09/Feb/12 ]

Integrated in lustre-master » i686,server,el5,inkernel #462
LU-993 osd: code cleanup for directory nlink count (Revision ec20be97b9f977d3f4944523baaffb1bf95cf76c)

Result = SUCCESS
Oleg Drokin : ec20be97b9f977d3f4944523baaffb1bf95cf76c
Files :

  • lustre/tests/sanity.sh
  • lustre/osd-ldiskfs/osd_handler.c
  • lustre/mdd/mdd_object.c
  • lustre/mdd/mdd_internal.h
  • lustre/mdd/mdd_dir.c
Comment by Build Master (Inactive) [ 09/Feb/12 ]

Integrated in lustre-master » x86_64,client,sles11,inkernel #462
LU-993 osd: code cleanup for directory nlink count (Revision ec20be97b9f977d3f4944523baaffb1bf95cf76c)

Result = SUCCESS
Oleg Drokin : ec20be97b9f977d3f4944523baaffb1bf95cf76c
Files :

  • lustre/osd-ldiskfs/osd_handler.c
  • lustre/mdd/mdd_dir.c
  • lustre/mdd/mdd_object.c
  • lustre/tests/sanity.sh
  • lustre/mdd/mdd_internal.h
Comment by Build Master (Inactive) [ 09/Feb/12 ]

Integrated in lustre-master » x86_64,server,el5,ofa #462
LU-993 osd: code cleanup for directory nlink count (Revision ec20be97b9f977d3f4944523baaffb1bf95cf76c)

Result = SUCCESS
Oleg Drokin : ec20be97b9f977d3f4944523baaffb1bf95cf76c
Files :

  • lustre/mdd/mdd_object.c
  • lustre/mdd/mdd_dir.c
  • lustre/mdd/mdd_internal.h
  • lustre/osd-ldiskfs/osd_handler.c
  • lustre/tests/sanity.sh
Comment by Build Master (Inactive) [ 09/Feb/12 ]

Integrated in lustre-master » i686,server,el5,ofa #462
LU-993 osd: code cleanup for directory nlink count (Revision ec20be97b9f977d3f4944523baaffb1bf95cf76c)

Result = SUCCESS
Oleg Drokin : ec20be97b9f977d3f4944523baaffb1bf95cf76c
Files :

  • lustre/tests/sanity.sh
  • lustre/mdd/mdd_dir.c
  • lustre/mdd/mdd_object.c
  • lustre/osd-ldiskfs/osd_handler.c
  • lustre/mdd/mdd_internal.h
Comment by Build Master (Inactive) [ 09/Feb/12 ]

Integrated in lustre-master » x86_64,client,el6,inkernel #462
LU-993 osd: code cleanup for directory nlink count (Revision ec20be97b9f977d3f4944523baaffb1bf95cf76c)

Result = SUCCESS
Oleg Drokin : ec20be97b9f977d3f4944523baaffb1bf95cf76c
Files :

  • lustre/mdd/mdd_object.c
  • lustre/osd-ldiskfs/osd_handler.c
  • lustre/mdd/mdd_dir.c
  • lustre/mdd/mdd_internal.h
  • lustre/tests/sanity.sh
Comment by Build Master (Inactive) [ 09/Feb/12 ]

Integrated in lustre-master » i686,client,el5,inkernel #462
LU-993 osd: code cleanup for directory nlink count (Revision ec20be97b9f977d3f4944523baaffb1bf95cf76c)

Result = SUCCESS
Oleg Drokin : ec20be97b9f977d3f4944523baaffb1bf95cf76c
Files :

  • lustre/mdd/mdd_dir.c
  • lustre/mdd/mdd_internal.h
  • lustre/mdd/mdd_object.c
  • lustre/tests/sanity.sh
  • lustre/osd-ldiskfs/osd_handler.c
Comment by Build Master (Inactive) [ 09/Feb/12 ]

Integrated in lustre-master » i686,client,el5,ofa #462
LU-993 osd: code cleanup for directory nlink count (Revision ec20be97b9f977d3f4944523baaffb1bf95cf76c)

Result = SUCCESS
Oleg Drokin : ec20be97b9f977d3f4944523baaffb1bf95cf76c
Files :

  • lustre/tests/sanity.sh
  • lustre/mdd/mdd_object.c
  • lustre/mdd/mdd_dir.c
  • lustre/osd-ldiskfs/osd_handler.c
  • lustre/mdd/mdd_internal.h
Comment by Peter Jones [ 09/Feb/12 ]

Landed for 2.2

Comment by Build Master (Inactive) [ 09/Feb/12 ]

Integrated in lustre-master » i686,client,el6,inkernel #462
LU-993 osd: code cleanup for directory nlink count (Revision ec20be97b9f977d3f4944523baaffb1bf95cf76c)

Result = SUCCESS
Oleg Drokin : ec20be97b9f977d3f4944523baaffb1bf95cf76c
Files :

  • lustre/tests/sanity.sh
  • lustre/osd-ldiskfs/osd_handler.c
  • lustre/mdd/mdd_internal.h
  • lustre/mdd/mdd_dir.c
  • lustre/mdd/mdd_object.c
Comment by Build Master (Inactive) [ 09/Feb/12 ]

Integrated in lustre-master » i686,server,el6,inkernel #462
LU-993 osd: code cleanup for directory nlink count (Revision ec20be97b9f977d3f4944523baaffb1bf95cf76c)

Result = SUCCESS
Oleg Drokin : ec20be97b9f977d3f4944523baaffb1bf95cf76c
Files :

  • lustre/tests/sanity.sh
  • lustre/mdd/mdd_dir.c
  • lustre/osd-ldiskfs/osd_handler.c
  • lustre/mdd/mdd_internal.h
  • lustre/mdd/mdd_object.c
Comment by Build Master (Inactive) [ 17/Feb/12 ]

Integrated in lustre-master » x86_64,server,el6,ofa #480
LU-993 osd: code cleanup for directory nlink count (Revision ec20be97b9f977d3f4944523baaffb1bf95cf76c)

Result = FAILURE
Oleg Drokin : ec20be97b9f977d3f4944523baaffb1bf95cf76c
Files :

  • lustre/mdd/mdd_dir.c
  • lustre/osd-ldiskfs/osd_handler.c
  • lustre/mdd/mdd_object.c
  • lustre/tests/sanity.sh
  • lustre/mdd/mdd_internal.h
Comment by Build Master (Inactive) [ 17/Feb/12 ]

Integrated in lustre-master » x86_64,client,el6,ofa #480
LU-993 osd: code cleanup for directory nlink count (Revision ec20be97b9f977d3f4944523baaffb1bf95cf76c)

Result = FAILURE
Oleg Drokin : ec20be97b9f977d3f4944523baaffb1bf95cf76c
Files :

  • lustre/mdd/mdd_dir.c
  • lustre/tests/sanity.sh
  • lustre/osd-ldiskfs/osd_handler.c
  • lustre/mdd/mdd_object.c
  • lustre/mdd/mdd_internal.h
Comment by Build Master (Inactive) [ 17/Feb/12 ]

Integrated in lustre-master » i686,client,el6,ofa #480
LU-993 osd: code cleanup for directory nlink count (Revision ec20be97b9f977d3f4944523baaffb1bf95cf76c)

Result = ABORTED
Oleg Drokin : ec20be97b9f977d3f4944523baaffb1bf95cf76c
Files :

  • lustre/mdd/mdd_internal.h
  • lustre/mdd/mdd_dir.c
  • lustre/tests/sanity.sh
  • lustre/mdd/mdd_object.c
  • lustre/osd-ldiskfs/osd_handler.c
Generated at Sat Feb 10 01:12:28 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.