LU-931

fsstress - LBUG (lustre_idl.h:766:lu_fid_eq()) ASSERTION(fid_is_igif(f0) || fid_ver(f0) == 0)

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Fix Version/s: Lustre 2.2.0, Lustre 2.1.2
    • Affects Version/s: Lustre 2.2.0
    • Labels: None
    • Environment: LLNL Hyperion
    • Severity: 3
    • Rank: 4755

    Description

      The MDS hits an LBUG in lu_fid_eq() and panics during fsstress testing
      -------

      2011-12-15 10:34:18 Lustre: 8786:0:(cmm_object.c:689:cml_rename_warn()) cml_rename failed for mdo_rename, should revoke: [mo_po [0x200000402:0x13:0x0]] [mo_pn [0x200000402:0x13:0x0]] [lf [0x200000402:0x13e8e:0x0]] [sname fstest_60e48179a98fb92fa7071967001e58b6] [mo_t [0x200000402:0x13e97:0x0]] [tname fstest_88b31a7611fd2bb789377f1366b2ce5b] [err -39]
      2011-12-15 10:34:18 Lustre: 8786:0:(cmm_object.c:689:cml_rename_warn()) Skipped 14 previous similar messages
      2011-12-15 10:41:34 Lustre: 8787:0:(cmm_object.c:689:cml_rename_warn()) cml_rename failed for mdo_rename, should revoke: [mo_po [0x200000408:0x16:0x0]] [mo_pn [0x200000408:0x16:0x0]] [lf [0x200000408:0x13bf3:0x0]] [sname fstest_4a6a5646e7b447858249641047b14d4a] [mo_t [0x200000408:0x13bfa:0x0]] [tname fstest_568da43f014966bf12ba58244b28bb2f] [err -39]
      2011-12-15 10:50:18 Lustre: mdt: This server is not able to keep up with request traffic (cpu-bound).
      2011-12-15 10:50:18 Lustre: 8761:0:(service.c:1186:ptlrpc_at_check_timed()) earlyQ=0 reqQ=0 recA=29, svcEst=19, delay=0(jiff)
      2011-12-15 10:50:43 Lustre: mdt: This server is not able to keep up with request traffic (cpu-bound).
      2011-12-15 10:50:43 Lustre: 8994:0:(service.c:1186:ptlrpc_at_check_timed()) earlyQ=1 reqQ=0 recA=1, svcEst=31, delay=0(jiff)
      2011-12-15 10:50:43 Lustre: 8994:0:(service.c:983:ptlrpc_at_send_early_reply()) @@@ Already past deadline (18s), not sending early reply. Consider increasing at_early_margin (5)? req@ffff8801a63ac050 x1388238484166088/t0(0) o401->LOV_OSC_UUID@192.168.127.61@o2ib1:0/0 lens 4096/0 e 0 to 0 dl 1323975025 ref 2 fl Interpret:/0/ffffffff rc 0/-1
      2011-12-15 10:50:43 Lustre: 8981:0:(service.c:1732:ptlrpc_server_handle_request()) @@@ Request x1388238484166088 took longer than estimated (6:18s); client may timeout. req@ffff8801a63ac050 x1388238484166088/t0(0) o401->LOV_OSC_UUID@192.168.127.61@o2ib1:0/0 lens 4096/192 e 0 to 0 dl 1323975025 ref 1 fl Complete:/0/0 rc 0/0
      2011-12-15 10:50:43 Lustre: Skipped 1 previous similar message
      2011-12-15 10:52:33 LustreError: 8763:0:(lustre_idl.h:766:lu_fid_eq()) ASSERTION(fid_is_igif(f0) || fid_ver(f0) == 0) failed: [0x5a5a5a5a5a5a5a5a:0x5a5a5a5a:0x5a5a5a5a]
      2011-12-15 10:52:33 LustreError: 8763:0:(lustre_idl.h:766:lu_fid_eq()) LBUG
      2011-12-15 10:52:33 Pid: 8763, comm: mdt_46
      2011-12-15 10:52:33
      2011-12-15 10:52:33 Dec 15 10:52:33 hyperion-rst6 kernel: LustreError: 8763:0:(lustre_idl.h:766:lu_fid_eq()) ASSERTION(fid_is_igif(f0) || fid_ver(f0) == 0) failed: [0x5a5a5a5a5a5a5a5a:0x5a5a5a5a:0x5a5a5a5a]
      2011-12-15 10:52:33 Dec 15 10:52:33 hyperion-rst6 kernel: LustreError: 8763:0:(lustre_idl.h:766:lu_fid_eq()) LBUG
      2011-12-15 10:52:33 Call Trace:
      2011-12-15 10:52:33 [<ffffffffa03e3855>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      2011-12-15 10:52:33 [<ffffffffa03e3e95>] lbug_with_loc+0x75/0xe0 [libcfs]
      2011-12-15 10:52:33 [<ffffffffa09d8b5a>] mdd_attr_set_internal+0x30a/0x310 [mdd]
      2011-12-15 10:52:33 [<ffffffffa09d8eb5>] mdd_attr_check_set_internal+0x355/0x390 [mdd]
      2011-12-15 10:52:33 [<ffffffffa09d635d>] ? mdd_la_get+0xad/0xb0 [mdd]
      2011-12-15 10:52:33 [<ffffffffa09d96f9>] mdd_attr_check_set_internal_locked+0x69/0x180 [mdd]
      2011-12-15 10:52:33 [<ffffffffa09fefd0>] ? md_capainfo+0x20/0x30 [mdd]
      2011-12-15 10:52:33 [<ffffffffa09f2b56>] ? mdd_object_capa+0x16/0x190 [mdd]
      2011-12-15 10:52:33 [<ffffffffa09fc345>] mdd_rename+0x1865/0x2220 [mdd]
      2011-12-15 10:52:33 [<ffffffffa03f2bcf>] ? cfs_hash_bd_from_key+0x3f/0xc0 [libcfs]
      2011-12-15 10:52:33 [<ffffffffa0ac5239>] ? cmm_mode_get+0x109/0x320 [cmm]
      2011-12-15 10:52:33 [<ffffffffa0ac5d5a>] cml_rename+0x33a/0xbb0 [cmm]
      2011-12-15 10:52:33 [<ffffffffa03f2fe7>] ? cfs_hash_bd_get+0x37/0x90 [libcfs]
      2011-12-15 10:52:33 [<ffffffffa0ac54bd>] ? cmm_is_subdir+0x6d/0x2f0 [cmm]
      2011-12-15 10:52:33 [<ffffffffa04c55e2>] ? lu_object_put+0x92/0x210 [obdclass]
      2011-12-15 10:52:33 [<ffffffffa0a50276>] mdt_reint_rename+0x1f96/0x23e0 [mdt]
      2011-12-15 10:52:33 [<ffffffffa03f993b>] ? upcall_cache_get_entry+0x28b/0xa14 [libcfs]
      2011-12-15 10:52:33 [<ffffffffa0a4882f>] ? mdt_rename_unpack+0x44f/0x6a0 [mdt]
      2011-12-15 10:52:33 [<ffffffffa09ff006>] ? md_ucred+0x26/0x60 [mdd]
      2011-12-15 10:52:33 [<ffffffffa0a48abf>] mdt_reint_rec+0x3f/0x100 [mdt]
      2011-12-15 10:52:33 [<ffffffffa05bbf94>] ? lustre_msg_get_flags+0x34/0xa0 [ptlrpc]
      2011-12-15 10:52:33 [<ffffffffa0a40f64>] mdt_reint_internal+0x6d4/0x9f0 [mdt]
      2011-12-15 10:52:33 [<ffffffffa0a36a86>] ? mdt_reint_opcode+0x96/0x160 [mdt]
      2011-12-15 10:52:33 [<ffffffffa0a412cc>] mdt_reint+0x4c/0x120 [mdt]
      2011-12-15 10:52:33 [<ffffffffa05bba68>] ? lustre_msg_check_version+0xc8/0xe0 [ptlrpc]
      2011-12-15 10:52:33 [<ffffffffa0a33955>] mdt_handle_common+0x8d5/0x1810 [mdt]
      2011-12-15 10:52:33 [<ffffffffa05b96f4>] ? lustre_msg_get_opc+0x94/0x100 [ptlrpc]
      2011-12-15 10:52:33 [<ffffffffa0a34965>] mdt_regular_handle+0x15/0x20 [mdt]
      2011-12-15 10:52:33 [<ffffffffa05ca39e>] ptlrpc_main+0xb8e/0x1900 [ptlrpc]
      2011-12-15 10:52:33 [<ffffffffa05c9810>] ? ptlrpc_main+0x0/0x1900 [ptlrpc]
      2011-12-15 10:52:33 [<ffffffff8100c1ca>] child_rip+0xa/0x20
      2011-12-15 10:52:33 [<ffffffffa05c9810>] ? ptlrpc_main+0x0/0x1900 [ptlrpc]
      2011-12-15 10:52:33 [<ffffffffa05c9810>] ? ptlrpc_main+0x0/0x1900 [ptlrpc]
      2011-12-15 10:52:33 [<ffffffff8100c1c0>] ? child_rip+0x0/0x20
      2011-12-15 10:52:33
      2011-12-15 10:52:33 Kernel panic - not syncing: LBUG
      2011-12-15 10:52:33 Pid: 8763, comm: mdt_46 Tainted: G ---------------- T 2.6.32-131.6.1.el6_lustre.x86_64 #1
      2011-12-15 10:52:33 Dec 15 10:52:33 hyperion-rst6 kernel: Kernel panic - not syncing: LBUG
      2011-12-15 10:52:33 Call Trace:
      2011-12-15 10:52:33 [<ffffffff814da878>] ? panic+0x78/0x143
      2011-12-15 10:52:33 [<ffffffffa03e3eeb>] ? lbug_with_loc+0xcb/0xe0 [libcfs]
      2011-12-15 10:52:33 [<ffffffffa09d8b5a>] ? mdd_attr_set_internal+0x30a/0x310 [mdd]
      2011-12-15 10:52:33 [<ffffffffa09d8eb5>] ? mdd_attr_check_set_internal+0x355/0x390 [mdd]
      2011-12-15 10:52:33 [<ffffffffa09d635d>] ? mdd_la_get+0xad/0xb0 [mdd]
      2011-12-15 10:52:33 [<ffffffffa09d96f9>] ? mdd_attr_check_set_internal_locked+0x69/0x180 [mdd]
      2011-12-15 10:52:33 [<ffffffffa09fefd0>] ? md_capainfo+0x20/0x30 [mdd]
      2011-12-15 10:52:33 [<ffffffffa09f2b56>] ? mdd_object_capa+0x16/0x190 [mdd]
      2011-12-15 10:52:33 [<ffffffffa09fc345>] ? mdd_rename+0x1865/0x2220 [mdd]
      2011-12-15 10:52:33 [<ffffffffa03f2bcf>] ? cfs_hash_bd_from_key+0x3f/0xc0 [libcfs]
      2011-12-15 10:52:33 [<ffffffffa0ac5239>] ? cmm_mode_get+0x109/0x320 [cmm]
      2011-12-15 10:52:33 [<ffffffffa0ac5d5a>] ? cml_rename+0x33a/0xbb0 [cmm]
      2011-12-15 10:52:33 [<ffffffffa03f2fe7>] ? cfs_hash_bd_get+0x37/0x90 [libcfs]
      2011-12-15 10:52:33 [<ffffffffa0ac54bd>] ? cmm_is_subdir+0x6d/0x2f0 [cmm]
      2011-12-15 10:52:33 [<ffffffffa04c55e2>] ? lu_object_put+0x92/0x210 [obdclass]
      2011-12-15 10:52:33 [<ffffffffa0a50276>] ? mdt_reint_rename+0x1f96/0x23e0 [mdt]
      2011-12-15 10:52:33 [<ffffffffa03f993b>] ? upcall_cache_get_entry+0x28b/0xa14 [libcfs]
      2011-12-15 10:52:33 [<ffffffffa0a4882f>] ? mdt_rename_unpack+0x44f/0x6a0 [mdt]
      2011-12-15 10:52:33 [<ffffffffa09ff006>] ? md_ucred+0x26/0x60 [mdd]
      2011-12-15 10:52:33 [<ffffffffa0a48abf>] ? mdt_reint_rec+0x3f/0x100 [mdt]
      2011-12-15 10:52:33 [<ffffffffa05bbf94>] ? lustre_msg_get_flags+0x34/0xa0 [ptlrpc]
      2011-12-15 10:52:33 [<ffffffffa0a40f64>] ? mdt_reint_internal+0x6d4/0x9f0 [mdt]
      2011-12-15 10:52:33 [<ffffffffa0a36a86>] ? mdt_reint_opcode+0x96/0x160 [mdt]
      2011-12-15 10:52:33 [<ffffffffa0a412cc>] ? mdt_reint+0x4c/0x120 [mdt]
      2011-12-15 10:52:33 [<ffffffffa05bba68>] ? lustre_msg_check_version+0xc8/0xe0 [ptlrpc]
      2011-12-15 10:52:33 [<ffffffffa0a33955>] ? mdt_handle_common+0x8d5/0x1810 [mdt]
      2011-12-15 10:52:33 [<ffffffffa05b96f4>] ? lustre_msg_get_opc+0x94/0x100 [ptlrpc]
      2011-12-15 10:52:33 [<ffffffffa0a34965>] ? mdt_regular_handle+0x15/0x20 [mdt]
      2011-12-15 10:52:33 [<ffffffffa05ca39e>] ? ptlrpc_main+0xb8e/0x1900 [ptlrpc]
      2011-12-15 10:52:33 [<ffffffffa05c9810>] ? ptlrpc_main+0x0/0x1900 [ptlrpc]
      2011-12-15 10:52:33 [<ffffffff8100c1ca>] ? child_rip+0xa/0x20
      2011-12-15 10:52:33 [<ffffffffa05c9810>] ? ptlrpc_main+0x0/0x1900 [ptlrpc]
      2011-12-15 10:52:33 [<ffffffffa05c9810>] ? ptlrpc_main+0x0/0x1900 [ptlrpc]
      2011-12-15 10:52:33 [<ffffffff8100c1c0>] ? child_rip+0x0/0x20
      2011-12-15 10:52:33 Initializing cgroup subsys cpuset
      2011-12-15 10:52:33 Initializing cgroup subsys cpu
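
      The FID printed by the failed assertion, [0x5a5a5a5a5a5a5a5a:0x5a5a5a5a:0x5a5a5a5a], consists entirely of 0x5a bytes. A uniform fill like this usually points at poisoned (freed or never-initialized) memory rather than a real on-disk FID, although the log alone does not prove that. The sketch below is a simplified, user-space illustration of the asserted condition, not the code from lustre_idl.h: the struct layout, the IGIF sequence bounds, and the helper names fid_passes_check() and fid_is_poisoned() are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a Lustre FID: sequence, object id, version. */
struct fid {
    uint64_t f_seq;
    uint32_t f_oid;
    uint32_t f_ver;
};

/* Assumed IGIF sequence range; check lustre_idl.h for the real values. */
#define SEQ_IGIF_MIN 12ULL
#define SEQ_IGIF_MAX 0xffffffffULL

static int fid_is_igif(const struct fid *f)
{
    return f->f_seq >= SEQ_IGIF_MIN && f->f_seq <= SEQ_IGIF_MAX;
}

/* The condition from the assertion text: fid_is_igif(f0) || fid_ver(f0) == 0. */
static int fid_passes_check(const struct fid *f)
{
    return fid_is_igif(f) || f->f_ver == 0;
}

/* Hypothetical helper: true when every field is filled with 0x5a bytes. */
static int fid_is_poisoned(const struct fid *f)
{
    struct fid p;

    memset(&p, 0x5a, sizeof(p));
    return f->f_seq == p.f_seq && f->f_oid == p.f_oid && f->f_ver == p.f_ver;
}

/* Self-check using one FID from the rename warnings and the asserted FID. */
static void fid_self_check(void)
{
    struct fid from_log = { 0x200000402ULL, 0x13e8e, 0x0 };
    struct fid asserted;

    memset(&asserted, 0x5a, sizeof(asserted));
    assert(fid_passes_check(&from_log));   /* version 0: accepted */
    assert(!fid_passes_check(&asserted));  /* not IGIF, version != 0 */
    assert(fid_is_poisoned(&asserted));
}
```

      Under these assumptions, a normal FID from the earlier rename warnings satisfies the condition because its version is 0, while the all-0x5a value fails both halves of it. That is consistent with mdd_rename() handing lu_fid_eq() a FID read from memory that no longer holds a valid object.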

            People

              Assignee: Hongchao Zhang (hongchao.zhang)
              Reporter: Cliff White (cliffw, inactive)