[LU-2668] mdd_create() failed at error path with osd_object_ref_del() ASSERTION( (oh)->ot_declare_ref_del > 0 ) failed Created: 23/Jan/13  Updated: 24/Jan/13  Resolved: 23/Jan/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: None

Type: Bug Priority: Critical
Reporter: Alexander Boyko Assignee: WC Triage
Resolution: Duplicate Votes: 0
Labels: patch

Issue Links:
Duplicate
duplicates LU-2640 deactivate OSD_EXEC_OP() operation ac... Resolved
Severity: 3
Rank (Obsolete): 6224

 Description   

logs

...
00000004:00000001:1.0:1358933146.590333:0:31202:0:(osd_handler.c:3569:osd_index_ea_insert()) Process entered
00000020:00000001:1.0:1358933146.590333:0:31202:0:(lustre_fid.h:582:fid_flatten32()) Process leaving (rc=4194447 : 4194447 : 40008f)
00000004:00000001:1.0:1358933146.590340:0:31202:0:(osd_handler.c:3213:__osd_ea_add_rec()) Process leaving (rc=18446744073709551588 : -28 : ffffffffffffffe4)
00000020:00000001:1.0:1358933146.590341:0:31202:0:(lustre_fid.h:582:fid_flatten32()) Process leaving (rc=4194447 : 4194447 : 40008f)
00000004:00000001:1.0:1358933146.590341:0:31202:0:(osd_handler.c:3587:osd_index_ea_insert()) Process leaving (rc=18446744073709551588 : -28 : ffffffffffffffe4)
00000004:00000001:1.0:1358933146.590342:0:31202:0:(mdd_dir.c:523:__mdd_index_insert_only()) Process leaving (rc=18446744073709551588 : -28 : ffffffffffffffe4)
00000004:00000001:1.0:1358933146.590343:0:31202:0:(mdd_dir.c:540:__mdd_index_insert()) Process leaving (rc=18446744073709551588 : -28 : ffffffffffffffe4)
00000004:00000001:1.0:1358933146.590344:0:31202:0:(mdd_dir.c:1794:mdd_create()) Process leaving via cleanup (rc=18446744073709551588 : -28 : 0xffffffffffffffe4)
00000004:00040000:1.0:1358933146.590345:0:31202:0:(osd_handler.c:2320:osd_object_ref_del()) ASSERTION( (oh)->ot_declare_ref_del > 0 ) failed:
00000004:00040000:1.0:1358933146.590539:0:31202:0:(osd_handler.c:2320:osd_object_ref_del()) LBUG

stack trace

04:37:01:LustreError: 15712:0:(osd_handler.c:2320:osd_object_ref_del()) ASSERTION( (oh)->ot_declare_ref_del > 0 ) failed:
04:37:01:LustreError: 15712:0:(osd_handler.c:2320:osd_object_ref_del()) LBUG
04:37:01:Pid: 15712, comm: mdt00_004
04:37:01:
04:37:01:Call Trace:
04:37:02: [<ffffffffa04de905>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
04:37:02: [<ffffffffa04def07>] lbug_with_loc+0x47/0xb0 [libcfs]
04:37:02: [<ffffffffa0d0c170>] osd_object_ref_add+0x0/0x270 [osd_ldiskfs]
04:37:02: [<ffffffffa0eb732b>] lod_ref_del+0x3b/0xd0 [lod]
04:37:02: [<ffffffffa0c4c5fd>] mdo_ref_del+0xad/0xb0 [mdd]
04:37:02: [<ffffffffa0c52ce2>] mdd_create+0xd02/0x1500 [mdd]
04:37:02: [<ffffffffa04ef351>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
04:37:02: [<ffffffffa0e362e1>] mdt_reint_open+0x1191/0x1900 [mdt]
04:37:02: [<ffffffffa04ef351>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
04:37:02: [<ffffffffa0e21161>] mdt_reint_rec+0x41/0xe0 [mdt]
04:37:02: [<ffffffffa0e1a803>] mdt_reint_internal+0x4e3/0x7d0 [mdt]
04:37:02: [<ffffffffa0e1adbd>] mdt_intent_reint+0x1ed/0x4f0 [mdt]
04:37:02: [<ffffffffa0e164be>] mdt_intent_policy+0x3ae/0x750 [mdt]
04:37:02: [<ffffffffa07a9351>] ldlm_lock_enqueue+0x361/0x8d0 [ptlrpc]
04:37:02: [<ffffffffa07ceff7>] ldlm_handle_enqueue0+0x4f7/0x1080 [ptlrpc]
04:37:02: [<ffffffffa0e16996>] mdt_enqueue+0x46/0x110 [mdt]
04:37:02: [<ffffffffa0e09a72>] mdt_handle_common+0x8e2/0x1680 [mdt]
04:37:02: [<ffffffffa0e40865>] mds_regular_handle+0x15/0x20 [mdt]
04:37:02: [<ffffffffa080050c>] ptlrpc_server_handle_request+0x41c/0xdf0 [ptlrpc]
04:37:02: [<ffffffffa04df64e>] ? cfs_timer_arm+0xe/0x10 [libcfs]
04:37:02: [<ffffffffa07f7989>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
04:37:02: [<ffffffff810602c0>] ? default_wake_function+0x0/0x20
04:37:02: [<ffffffffa0801a96>] ptlrpc_main+0xbb6/0x1950 [ptlrpc]
04:37:02: [<ffffffffa0800ee0>] ? ptlrpc_main+0x0/0x1950 [ptlrpc]
04:37:02: [<ffffffff8100c14a>] child_rip+0xa/0x20
04:37:02: [<ffffffffa0800ee0>] ? ptlrpc_main+0x0/0x1950 [ptlrpc]
04:37:02: [<ffffffffa0800ee0>] ? ptlrpc_main+0x0/0x1950 [ptlrpc]
04:37:02: [<ffffffff8100c140>] ? child_rip+0x0/0x20

This fault occurs with the patch http://review.whamcloud.com/#change,5140 applied, but the issue also exists on the master branch in the error handling path.
I have found that mdd_create() tries to execute mdo_ref_del() without a previous call to mdo_declare_ref_del(). This can happen after an error in mdd_create() when the variable created=1.
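
The mismatch described above can be illustrated with a small model. This is only a sketch, not the Lustre code: `toy_handle`, `toy_declare_ref_del()`, `toy_ref_del()` and `toy_create_cleanup()` are hypothetical names, and the model returns -EINVAL at the point where the real osd_object_ref_del() hits the assertion.

```c
#include <errno.h>

/* Hypothetical stand-in for a transaction handle; not the real OSD handle. */
struct toy_handle {
	int declared_ref_del;	/* ref_del operations reserved in the declare phase */
};

/* Declare phase: reserve one ref_del before the transaction starts. */
static void toy_declare_ref_del(struct toy_handle *th)
{
	th->declared_ref_del++;
}

/*
 * Execute phase: consume one reserved ref_del.
 * Returns 0 on success, or -EINVAL where the real code would assert.
 */
static int toy_ref_del(struct toy_handle *th)
{
	if (th->declared_ref_del <= 0)
		return -EINVAL;
	th->declared_ref_del--;
	return 0;
}

/*
 * Toy version of the error path: the object has already been created
 * (created = 1) when the index insert fails with insert_rc, so the
 * cleanup code drops the reference again.
 */
static int toy_create_cleanup(struct toy_handle *th, int insert_rc)
{
	int created = 1;

	if (insert_rc != 0 && created)
		return toy_ref_del(th);
	return 0;
}
```

In this model, calling toy_create_cleanup() with -ENOSPC on a handle that never went through toy_declare_ref_del() fails, while the same call succeeds once the ref_del has been declared up front.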



 Comments   
Comment by Alex Zhuravlev [ 23/Jan/13 ]

a dup of LU-2640

Comment by Alexander Boyko [ 23/Jan/13 ]

I created http://review.whamcloud.com/5150 for this, but it is probably not needed given LU-2640.

Comment by Alexander Boyko [ 23/Jan/13 ]

With the patch http://review.whamcloud.com/5138 applied,
I get the same kind of issue:

00000004:00000001:1.0:1358954122.724124:0:11557:0:(osd_handler.c:3307:__osd_ea_add_rec()) Process leaving (rc=18446744073709551588 : -28 : ffffffffffffffe4)
00000020:00000001:1.0:1358954122.724125:0:11557:0:(lustre_fid.h:614:fid_flatten32()) Process leaving (rc=12583057 : 12583057 : c00091)
00000004:00000001:1.0:1358954122.724126:0:11557:0:(osd_handler.c:3694:osd_index_ea_insert()) Process leaving (rc=18446744073709551588 : -28 : ffffffffffffffe4)
00000004:00000001:1.0:1358954122.724127:0:11557:0:(mdd_dir.c:523:__mdd_index_insert_only()) Process leaving (rc=18446744073709551588 : -28 : ffffffffffffffe4)
00000004:00000001:1.0:1358954122.724127:0:11557:0:(mdd_dir.c:540:__mdd_index_insert()) Process leaving (rc=18446744073709551588 : -28 : ffffffffffffffe4)
00000004:00000001:1.0:1358954122.724128:0:11557:0:(mdd_dir.c:1794:mdd_create()) Process leaving via cleanup (rc=18446744073709551588 : -28 : 0xffffffffffffffe4)
00000004:00040000:1.0:1358954122.724130:0:11557:0:(osd_internal.h:412:osd_trans_exec_op()) ASSERTION( oh->ot_declare_ops_rb[rb] > 0 ) failed:
00000004:00040000:1.0:1358954122.724336:0:11557:0:(osd_internal.h:412:osd_trans_exec_op()) LBUG

Comment by Andreas Dilger [ 24/Jan/13 ]

Alexander, what are you using for testing this problem? It looks like the MDT is running out of space and fails in the error cleanup path.
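
For reference, the out-of-space reading follows from the rc values in the logs, which print the same return code as unsigned, signed, and hex. A minimal sketch of that decoding, assuming Linux errno numbering (where ENOSPC is 28); `rc_as_signed()` is a hypothetical helper, not a Lustre function:

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/*
 * Reinterpret the unsigned 64-bit rc printed in the debug log as a
 * signed value, so 18446744073709551588 (0xffffffffffffffe4) can be
 * compared against a negative errno such as -ENOSPC.
 */
static int64_t rc_as_signed(uint64_t raw)
{
	int64_t rc;

	memcpy(&rc, &raw, sizeof(rc));
	return rc;
}
```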

Comment by Alexander Boyko [ 24/Jan/13 ]

Andreas, I use the patch http://review.whamcloud.com/#change,5140 and sanity.sh test 129. This test checks the max_dir_size limit.
