Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-3022

osd_handler.c:2534:osd_object_ref_del()) ASSERTION( inode->i_nlink > 0 ) failed

Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • None
    • Lustre 2.4.0
    • 3
    • 7350

    Description

      There are multiple hits of this assertion in Jira, but all closed or have landed patches, so this might be something new.

      This is in current master with http://review.whamcloud.com/#change,5802 applied.

      [101057.524432] LustreError: 22066:0:(fld_handler.c:170:fld_server_lookup()) srv-lustre-MDT0000: Can not found the seq 0x1100000002000004
      [101057.525015] LustreError: 22066:0:(osd_handler.c:2016:osd_fld_lookup()) lustre-MDT0000-osd can not find [0x1100000002000004:0x111:0xd3000000]: rc = -5
      [101057.525579] LustreError: 22066:0:(osd_handler.c:3196:osd_remote_fid()) lustre-MDT0000-osd: Can not lookup fld for [0x1100000002000004:0x111:0xd3000000]
      [101057.599333] LustreError: 9254:0:(lov_object.c:185:lov_init_sub()) header@ffff88003f204ec0[0x0, 2, [0x100000000:0x165b:0x0] hash]{ 
      [101057.599926] LustreError: 9254:0:(lov_object.c:185:lov_init_sub()) ....lovsub@ffff88003f204f98[0]
      [101057.600385] LustreError: 9254:0:(lov_object.c:185:lov_init_sub()) ....osc@ffff8800a646deb0id: 5723 gr: 0 idx: 0 gen: 0 kms_valid: 0 kms 0 rc: 0 force_sync: 0 min_xid: 0 size: 0 mtime: 0 atime: 0 ctime: 0 blocks: 0
      [101057.604547] LustreError: 9254:0:(lov_object.c:185:lov_init_sub()) } header@ffff88003f204ec0
      [101057.604979] LustreError: 9254:0:(lov_object.c:185:lov_init_sub()) stripe 0 is already owned.
      [101057.605415] LustreError: 9254:0:(lov_object.c:186:lov_init_sub()) header@ffff88001940dea8[0x0, 1, [0x200000401:0x10c76:0x0] hash]{ 
      [101057.605917] LustreError: 9254:0:(lov_object.c:186:lov_init_sub()) ....vvp@ffff88001940df80(- 0 0) inode: ffff88004a3b9b08 144115205272570998/33554436 100644 1 1 ffff88001940df80 [0x200000401:0x10c76:0x0]
      [101057.606651] LustreError: 9254:0:(lov_object.c:186:lov_init_sub()) ....lov@ffff8800a646eed0stripes: 1, valid, lsm{ffff88003511aa60 0x0BD10BD0 1 1 0}: 
      [101057.607166] LustreError: 9254:0:(lov_object.c:186:lov_init_sub()) header@ffff88003f204ec0[0x0, 2, [0x100000000:0x165b:0x0] hash]{ 
      [101057.607650] LustreError: 9254:0:(lov_object.c:186:lov_init_sub()) ....lovsub@ffff88003f204f98[0]
      [101057.610941] LustreError: 9254:0:(lov_object.c:186:lov_init_sub()) ....osc@ffff8800a646deb0id: 5723 gr: 0 idx: 0 gen: 0 kms_valid: 0 kms 0 rc: 0 force_sync: 0 min_xid: 0 size: 0 mtime: 0 atime: 0 ctime: 0 blocks: 0
      [101057.611709] LustreError: 9254:0:(lov_object.c:186:lov_init_sub()) } header@ffff88003f204ec0
      [101057.612160] LustreError: 9254:0:(lov_object.c:186:lov_init_sub()) 
      [101057.612440] LustreError: 9254:0:(lov_object.c:186:lov_init_sub()) } header@ffff88001940dea8
      [101057.612916] LustreError: 9254:0:(lov_object.c:186:lov_init_sub()) owned.
      [101057.613171] LustreError: 9254:0:(lov_object.c:187:lov_init_sub()) header@ffff880062842ea8[0x0, 1, [0x200000400:0x111d3:0x0]]
      [101057.620565] LustreError: 9254:0:(lov_object.c:187:lov_init_sub()) try to own.
      [101057.624245] LustreError: 9254:0:(lcommon_cl.c:1201:cl_file_inode_init()) Failure to initialize cl object [0x200000400:0x111d3:0x0]: -5
      [101057.624794] LustreError: 9254:0:(llite_lib.c:2176:ll_prep_inode()) new_inode -fatal: rc -5
      [101058.178146] LustreError: 23316:0:(ldlm_resource.c:1161:ldlm_resource_get()) lvbo_init failed for resource 5722: rc -2
      [101058.228406] LustreError: 9294:0:(lov_object.c:185:lov_init_sub()) header@ffff88003f204ec0[0x0, 4, [0x100000000:0x165b:0x0] hash]{ 
      [101058.228962] LustreError: 9294:0:(lov_object.c:185:lov_init_sub()) ....lovsub@ffff88003f204f98[0]
      [101058.229465] LustreError: 9294:0:(lov_object.c:185:lov_init_sub()) ....osc@ffff8800a646deb0id: 5723 gr: 0 idx: 0 gen: 0 kms_valid: 1 kms 5 rc: 0 force_sync: 0 min_xid: 0 size: 5 mtime: 1364179625 atime: 0 ctime: 1364179625 blocks: 0
      [101058.230330] LustreError: 9294:0:(lov_object.c:185:lov_init_sub()) } header@ffff88003f204ec0
      [101058.230794] LustreError: 9294:0:(lov_object.c:185:lov_init_sub()) stripe 0 is already owned.
      [101058.231272] LustreError: 9294:0:(lov_object.c:186:lov_init_sub()) header@ffff88001940dea8[0x0, 3, [0x200000401:0x10c76:0x0] hash]{ 
      [101058.231808] LustreError: 9294:0:(lov_object.c:186:lov_init_sub()) ....vvp@ffff88001940df80(+ 0 0) inode: ffff88004a3b9b08 144115205272570998/33554436 100644 1 0 ffff88001940df80 [0x200000401:0x10c76:0x0]
      [101058.232634] LustreError: 9294:0:(lov_object.c:186:lov_init_sub()) ....lov@ffff8800a646eed0stripes: 1, valid, lsm{ffff88003511aa60 0x0BD10BD0 1 1 0}: 
      [101058.233199] LustreError: 9294:0:(lov_object.c:186:lov_init_sub()) header@ffff88003f204ec0[0x0, 4, [0x100000000:0x165b:0x0] hash]{ 
      [101058.233729] LustreError: 9294:0:(lov_object.c:186:lov_init_sub()) ....lovsub@ffff88003f204f98[0]
      [101058.234214] LustreError: 9294:0:(lov_object.c:186:lov_init_sub()) ....osc@ffff8800a646deb0id: 5723 gr: 0 idx: 0 gen: 0 kms_valid: 1 kms 5 rc: 0 force_sync: 0 min_xid: 0 size: 5 mtime: 1364179625 atime: 0 ctime: 1364179625 blocks: 0
      [101058.235057] LustreError: 9294:0:(lov_object.c:186:lov_init_sub()) } header@ffff88003f204ec0
      [101058.235524] LustreError: 9294:0:(lov_object.c:186:lov_init_sub()) 
      [101058.235795] LustreError: 9294:0:(lov_object.c:186:lov_init_sub()) } header@ffff88001940dea8
      [101058.236277] LustreError: 9294:0:(lov_object.c:186:lov_init_sub()) owned.
      [101058.236561] LustreError: 9294:0:(lov_object.c:187:lov_init_sub()) header@ffff88006d993ea8[0x0, 1, [0x200000400:0x111d3:0x0]]
      [101058.237083] LustreError: 9294:0:(lov_object.c:187:lov_init_sub()) try to own.
      [101058.248962] LustreError: 9294:0:(lcommon_cl.c:1201:cl_file_inode_init()) Failure to initialize cl object [0x200000400:0x111d3:0x0]: -5
      [101058.249501] LustreError: 9294:0:(llite_lib.c:2176:ll_prep_inode()) new_inode -fatal: rc -5
      [101058.300655] LustreError: 9259:0:(lov_object.c:185:lov_init_sub()) header@ffff88003f204ec0[0x0, 4, [0x100000000:0x165b:0x0] hash]{ 
      [101058.301600] LustreError: 9259:0:(lov_object.c:185:lov_init_sub()) ....lovsub@ffff88003f204f98[0]
      [101058.302233] LustreError: 9259:0:(lov_object.c:185:lov_init_sub()) ....osc@ffff8800a646deb0id: 5723 gr: 0 idx: 0 gen: 0 kms_valid: 1 kms 5 rc: 0 force_sync: 0 min_xid: 0 size: 5 mtime: 1364179625 atime: 0 ctime: 1364179625 blocks: 0
      [101058.303091] LustreError: 9259:0:(lov_object.c:185:lov_init_sub()) } header@ffff88003f204ec0
      [101058.303543] LustreError: 9259:0:(lov_object.c:185:lov_init_sub()) stripe 0 is already owned.
      [101058.303975] LustreError: 9259:0:(lov_object.c:186:lov_init_sub()) header@ffff88001940dea8[0x0, 3, [0x200000401:0x10c76:0x0] hash]{ 
      [101058.304470] LustreError: 9259:0:(lov_object.c:186:lov_init_sub()) ....vvp@ffff88001940df80(+ 0 0) inode: ffff88004a3b9b08 144115205272570998/33554436 100644 1 0 ffff88001940df80 [0x200000401:0x10c76:0x0]
      [101058.305222] LustreError: 9259:0:(lov_object.c:186:lov_init_sub()) ....lov@ffff8800a646eed0stripes: 1, valid, lsm{ffff88003511aa60 0x0BD10BD0 1 1 0}: 
      [101058.305737] LustreError: 9259:0:(lov_object.c:186:lov_init_sub()) header@ffff88003f204ec0[0x0, 4, [0x100000000:0x165b:0x0] hash]{ 
      [101058.312954] LustreError: 9259:0:(lov_object.c:186:lov_init_sub()) ....lovsub@ffff88003f204f98[0]
      [101058.313461] LustreError: 9259:0:(lov_object.c:186:lov_init_sub()) ....osc@ffff8800a646deb0id: 5723 gr: 0 idx: 0 gen: 0 kms_valid: 1 kms 5 rc: 0 force_sync: 0 min_xid: 0 size: 5 mtime: 1364179625 atime: 0 ctime: 1364179625 blocks: 0
      [101058.314344] LustreError: 9259:0:(lov_object.c:186:lov_init_sub()) } header@ffff88003f204ec0
      [101058.314842] LustreError: 9259:0:(lov_object.c:186:lov_init_sub()) 
      [101058.315132] LustreError: 9259:0:(lov_object.c:186:lov_init_sub()) } header@ffff88001940dea8
      [101058.315741] LustreError: 9259:0:(lov_object.c:186:lov_init_sub()) owned.
      [101058.316179] LustreError: 9259:0:(lov_object.c:187:lov_init_sub()) header@ffff880045dd6ea8[0x0, 1, [0x200000400:0x111d3:0x0]]
      [101058.317021] LustreError: 9259:0:(lov_object.c:187:lov_init_sub()) try to own.
      [101059.562642] LustreError: 4927:0:(ost_handler.c:1640:ost_blocking_ast()) Error -2 syncing data on lock cancel
      [101062.023713] LustreError: 23478:0:(osd_handler.c:2534:osd_object_ref_del()) ASSERTION( inode->i_nlink > 0 ) failed: 
      [101062.024156] LustreError: 23478:0:(osd_handler.c:2534:osd_object_ref_del()) LBUG
      [101062.024499] Pid: 23478, comm: mdt01_009
      [101062.024696] 
      [101062.024697] Call Trace:
      [101062.024994]  [<ffffffffa0b69915>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      [101062.025298]  [<ffffffffa0b69f17>] lbug_with_loc+0x47/0xb0 [libcfs]
      [101062.025577]  [<ffffffffa0581089>] osd_object_ref_del+0x239/0x310 [osd_ldiskfs]
      [101062.025936]  [<ffffffffa072646b>] lod_ref_del+0x3b/0xd0 [lod]
      [101062.026116]  [<ffffffffa0464ac5>] mdo_ref_del+0x35/0xc0 [mdd]
      [101062.026368]  [<ffffffffa046c246>] mdd_unlink+0x6d6/0xe30 [mdd]
      [101062.026660]  [<ffffffffa066b3a8>] mdo_unlink+0x18/0x50 [mdt]
      [101062.026945]  [<ffffffffa066e669>] mdt_reint_unlink+0x739/0xfd0 [mdt]
      [101062.027248]  [<ffffffffa066b061>] mdt_reint_rec+0x41/0xe0 [mdt]
      [101062.027535]  [<ffffffffa06646e3>] mdt_reint_internal+0x4e3/0x7d0 [mdt]
      [101062.027844]  [<ffffffffa0664a14>] mdt_reint+0x44/0xe0 [mdt]
      [101062.028177]  [<ffffffffa0652fa8>] mdt_handle_common+0x648/0x1660 [mdt]
      [101062.028510]  [<ffffffffa068e855>] mds_regular_handle+0x15/0x20 [mdt]
      [101062.028862]  [<ffffffffa1305fd3>] ptlrpc_server_handle_request+0x443/0xe40 [ptlrpc]
      [101062.029355]  [<ffffffffa0b6a65e>] ? cfs_timer_arm+0xe/0x10 [libcfs]
      [101062.029547]  [<ffffffffa12fef91>] ? ptlrpc_wait_event+0xb1/0x2a0 [ptlrpc]
      [101062.029730]  [<ffffffff81051f73>] ? __wake_up+0x53/0x70
      [101062.029918]  [<ffffffffa1308a9d>] ptlrpc_main+0xafd/0x17f0 [ptlrpc]
      [101062.030118]  [<ffffffffa1307fa0>] ? ptlrpc_main+0x0/0x17f0 [ptlrpc]
      [101062.030301]  [<ffffffff8100c14a>] child_rip+0xa/0x20
      [101062.030507]  [<ffffffffa1307fa0>] ? ptlrpc_main+0x0/0x17f0 [ptlrpc]
      [101062.030725]  [<ffffffffa1307fa0>] ? ptlrpc_main+0x0/0x17f0 [ptlrpc]
      [101062.030918]  [<ffffffff8100c140>] ? child_rip+0x0/0x20
      [101062.031505] 
      [101062.046287] Kernel panic - not syncing: LBUG
      

      Crashdump in /exports/crashdumps/192.168.10.220-2013-03-24-22\:47\:11
      branch in my tree is master-20130324

      Attachments

        Issue Links

          Activity

            [LU-3022] osd_handler.c:2534:osd_object_ref_del()) ASSERTION( inode->i_nlink > 0 ) failed

            Patch has been landed to Master.
            ~ jfc.

            jfc John Fuchs-Chesney (Inactive) added a comment - Patch has been landed to Master. ~ jfc.

            in "ldiskfs_add_dot_dotdot", the inode->i_nlink is set as "2", which is not under the protection "osd_object->oo_guard", could it cause this problem?

            hongchao.zhang Hongchao Zhang added a comment - in "ldiskfs_add_dot_dotdot", the inode->i_nlink is set as "2", which is not under the protection "osd_object->oo_guard", could it cause this problem?
            yujian Jian Yu added a comment -

            Lustre Branch: b2_4
            Lustre Build: http://build.whamcloud.com/job/lustre-b2_4/27/
            Distro/Arch: RHEL6.4/x86_64 + FC18/x86_64 (Server + Client)

            The racer test hit the following failure on MDS:

            18:40:59:Lustre: DEBUG MARKER: == racer test 1: racer on clients: client-32vm1,client-32vm2.lab.whamcloud.com DURATION=900 ========== 18:40:49 (1375494049)
            18:40:59:Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000200000400-0x0000000240000400):0:mdt
            18:41:40:LustreError: 11543:0:(osd_handler.c:2581:osd_object_ref_del()) ASSERTION( inode->i_nlink > 0 ) failed: 
            18:41:40:LustreError: 11543:0:(osd_handler.c:2581:osd_object_ref_del()) LBUG
            18:41:40:Pid: 11543, comm: mdt00_012
            18:41:40:
            18:41:40:Call Trace:
            18:41:40: [<ffffffffa06e5895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
            18:41:40: [<ffffffffa06e5e97>] lbug_with_loc+0x47/0xb0 [libcfs]
            18:41:40: [<ffffffffa0e7a7e7>] osd_object_ref_del+0x1e7/0x220 [osd_ldiskfs]
            18:41:40: [<ffffffffa1038f9b>] lod_ref_del+0x3b/0xd0 [lod]
            18:41:40: [<ffffffffa0da9095>] mdo_ref_del+0x35/0xc0 [mdd]
            18:41:40: [<ffffffffa0db3858>] mdd_rename+0x2028/0x24f0 [mdd]
            18:41:40: [<ffffffffa0dad2fe>] ? __mdd_lookup_locked+0x1ee/0x490 [mdd]
            18:41:40: [<ffffffffa0f8bb62>] mdt_reint_rename+0x13d2/0x1b00 [mdt]
            18:41:40: [<ffffffffa082ce60>] ? lu_ucred+0x20/0x30 [obdclass]
            18:41:40: [<ffffffffa0f868f1>] mdt_reint_rec+0x41/0xe0 [mdt]
            18:41:40: [<ffffffffa0f6bb03>] mdt_reint_internal+0x4c3/0x780 [mdt]
            18:41:40: [<ffffffffa0f6be04>] mdt_reint+0x44/0xe0 [mdt]
            18:41:40: [<ffffffffa0f70ab7>] mdt_handle_common+0x647/0x16d0 [mdt]
            18:41:40: [<ffffffffa0faa1a5>] mds_regular_handle+0x15/0x20 [mdt]
            18:41:40: [<ffffffffa09a8388>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
            18:41:40: [<ffffffffa06e65de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
            18:41:40: [<ffffffffa06f7d9f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
            18:41:40: [<ffffffffa099f6e9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
            18:41:40: [<ffffffff81055ab3>] ? __wake_up+0x53/0x70
            18:41:40: [<ffffffffa09a971e>] ptlrpc_main+0xace/0x1700 [ptlrpc]
            18:41:40: [<ffffffffa09a8c50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
            18:41:40: [<ffffffff8100c0ca>] child_rip+0xa/0x20
            18:41:40: [<ffffffffa09a8c50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
            18:41:40: [<ffffffffa09a8c50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
            18:41:40: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
            18:41:40:
            18:41:40:Kernel panic - not syncing: LBUG
            

            Maloo report: https://maloo.whamcloud.com/test_sets/a5ffcb6a-fcc0-11e2-9fdb-52540035b04c

            yujian Jian Yu added a comment - Lustre Branch: b2_4 Lustre Build: http://build.whamcloud.com/job/lustre-b2_4/27/ Distro/Arch: RHEL6.4/x86_64 + FC18/x86_64 (Server + Client) The racer test hit the following failure on MDS: 18:40:59:Lustre: DEBUG MARKER: == racer test 1: racer on clients: client-32vm1,client-32vm2.lab.whamcloud.com DURATION=900 ========== 18:40:49 (1375494049) 18:40:59:Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000200000400-0x0000000240000400):0:mdt 18:41:40:LustreError: 11543:0:(osd_handler.c:2581:osd_object_ref_del()) ASSERTION( inode->i_nlink > 0 ) failed: 18:41:40:LustreError: 11543:0:(osd_handler.c:2581:osd_object_ref_del()) LBUG 18:41:40:Pid: 11543, comm: mdt00_012 18:41:40: 18:41:40:Call Trace: 18:41:40: [<ffffffffa06e5895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs] 18:41:40: [<ffffffffa06e5e97>] lbug_with_loc+0x47/0xb0 [libcfs] 18:41:40: [<ffffffffa0e7a7e7>] osd_object_ref_del+0x1e7/0x220 [osd_ldiskfs] 18:41:40: [<ffffffffa1038f9b>] lod_ref_del+0x3b/0xd0 [lod] 18:41:40: [<ffffffffa0da9095>] mdo_ref_del+0x35/0xc0 [mdd] 18:41:40: [<ffffffffa0db3858>] mdd_rename+0x2028/0x24f0 [mdd] 18:41:40: [<ffffffffa0dad2fe>] ? __mdd_lookup_locked+0x1ee/0x490 [mdd] 18:41:40: [<ffffffffa0f8bb62>] mdt_reint_rename+0x13d2/0x1b00 [mdt] 18:41:40: [<ffffffffa082ce60>] ? lu_ucred+0x20/0x30 [obdclass] 18:41:40: [<ffffffffa0f868f1>] mdt_reint_rec+0x41/0xe0 [mdt] 18:41:40: [<ffffffffa0f6bb03>] mdt_reint_internal+0x4c3/0x780 [mdt] 18:41:40: [<ffffffffa0f6be04>] mdt_reint+0x44/0xe0 [mdt] 18:41:40: [<ffffffffa0f70ab7>] mdt_handle_common+0x647/0x16d0 [mdt] 18:41:40: [<ffffffffa0faa1a5>] mds_regular_handle+0x15/0x20 [mdt] 18:41:40: [<ffffffffa09a8388>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc] 18:41:40: [<ffffffffa06e65de>] ? cfs_timer_arm+0xe/0x10 [libcfs] 18:41:40: [<ffffffffa06f7d9f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs] 18:41:40: [<ffffffffa099f6e9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc] 18:41:40: [<ffffffff81055ab3>] ? __wake_up+0x53/0x70 18:41:40: [<ffffffffa09a971e>] ptlrpc_main+0xace/0x1700 [ptlrpc] 18:41:40: [<ffffffffa09a8c50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc] 18:41:40: [<ffffffff8100c0ca>] child_rip+0xa/0x20 18:41:40: [<ffffffffa09a8c50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc] 18:41:40: [<ffffffffa09a8c50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc] 18:41:40: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20 18:41:40: 18:41:40:Kernel panic - not syncing: LBUG Maloo report: https://maloo.whamcloud.com/test_sets/a5ffcb6a-fcc0-11e2-9fdb-52540035b04c

            the patch has been updated and tracked at http://review.whamcloud.com/#change,4675

            hongchao.zhang Hongchao Zhang added a comment - the patch has been updated and tracked at http://review.whamcloud.com/#change,4675

            I think my patch in http://review.whamcloud.com/4675 may fix the i_nlink assertion problem. Hongchao, could you please update this patch under LU-2311 to fix whatever minor issue remains with the patch.

            I don't think this relates to the FLD sequence lookup errors that are included in the log, so I'd prefer to repurpose this bug to understand/fix the FLD/SEQ issues reported by Oleg.

            adilger Andreas Dilger added a comment - I think my patch in http://review.whamcloud.com/4675 may fix the i_nlink assertion problem. Hongchao, could you please update this patch under LU-2311 to fix whatever minor issue remains with the patch. I don't think this relates to the FLD sequence lookup errors that are included in the log, so I'd prefer to repurpose this bug to understand/fix the FLD/SEQ issues reported by Oleg.

            People

              hongchao.zhang Hongchao Zhang
              green Oleg Drokin
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: