Lustre / LU-8662

osd_fid_lookup()) ASSERTION( tid->oii_ino == id->oii_ino && tid->oii_gen == id->oii_gen ) failed: OI mapping changed(2):

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.10.0
    • None
    • None
    • 3

    Description

      This issue was created by maloo for Oleg Drokin <green@whamcloud.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/de8e19f6-88ee-11e6-ad53-5254006e85c2.

      The sub-test test_0a failed with the following error:

      test failed to respond and timed out
      

      This run used the dm_flakey patch, which might have introduced some unexpected failure behavior. Even so, it shows that this assertion can actually trigger, so we had better look at it in more detail.

      The assertion was introduced by FanYong's patch, commit cecde8bdb4913fd4405d425b0bf3aead03181e9d, for LU-8218.

      MDS crashed with:

      15:13:56:[ 1124.877616] LustreError: 167-0: lustre-MDT0000-lwp-MDT0002: This client was evicted by lustre-MDT0000; in progress operations using this service will fail.
      15:13:56:[ 1124.908734] LDISKFS-fs error (device dm-6): mb_free_blocks:1453: group 4, block 68664:freeing already freed block (bit 1560); block bitmap corrupt.
      15:13:56:[ 1124.914666] Aborting journal on device dm-6-8.
      15:13:56:[ 1124.978308] LDISKFS-fs (dm-6): Remounting filesystem read-only
      15:13:56:[ 1124.982226] LDISKFS-fs warning (device dm-6): ldiskfs_mb_generate_buddy:761: group 4: 15957 blocks in bitmap, 15959 in bb, 16445 in gd, 1 pa's block bitmap corrupt
      15:13:56:[ 1124.987219] LDISKFS-fs error (device dm-6) in osd_trans_stop:1834: IO failure
      15:13:56:[ 1124.990016] LustreError: 6145:0:(osd_handler.c:1837:osd_trans_stop()) lustre-MDT0000: failed to stop transaction: rc = -5
      15:13:56:[ 1124.992521] Lustre: lustre-MDT0000: Recovery over after 0:04, of 5 clients 5 recovered and 0 were evicted.
      15:13:56:[ 1125.992942] Lustre: DEBUG MARKER: /usr/sbin/lctl mark mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 4 sec
      15:13:56:[ 1126.229054] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 4 sec
      15:13:56:[ 1126.241517] Lustre: DEBUG MARKER: /usr/sbin/lctl mark mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 4 sec
      15:13:56:[ 1126.476588] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 4 sec
      15:13:56:[ 1127.807784] format at osd_handler.c:1182:osd_fid_lookup doesn't end in newline
      15:13:56:[ 1127.812857] LustreError: 800:0:(osd_handler.c:1182:osd_fid_lookup()) ASSERTION( tid->oii_ino == id->oii_ino && tid->oii_gen == id->oii_gen ) failed: OI mapping changed(2): 139/537564674 => 139/1570612830
      15:13:56:[ 1127.815844] LustreError: 800:0:(osd_handler.c:1182:osd_fid_lookup()) LBUG
      15:13:56:[ 1127.819571] Pid: 800, comm: mdt00_001
      15:13:56:[ 1127.821424] 
      15:13:56:[ 1127.821424] Call Trace:
      15:13:56:[ 1127.824770]  [<ffffffffa06987d3>] libcfs_debug_dumpstack+0x53/0x80 [libcfs]
      15:13:56:[ 1127.826840]  [<ffffffffa0698d75>] lbug_with_loc+0x45/0xc0 [libcfs]
      15:13:56:[ 1127.828862]  [<ffffffffa0c250bb>] osd_fid_lookup+0x175b/0x1800 [osd_ldiskfs]
      15:13:56:[ 1127.830930]  [<ffffffffa0c251b5>] osd_object_init+0x55/0xf0 [osd_ldiskfs]
      15:13:56:[ 1127.833042]  [<ffffffffa07d68cf>] lu_object_alloc+0xdf/0x310 [obdclass]
      15:13:56:[ 1127.835107]  [<ffffffffa07d6ccc>] lu_object_find_try+0x16c/0x2b0 [obdclass]
      15:13:56:[ 1127.837173]  [<ffffffffa07d6ebc>] lu_object_find_at+0xac/0xe0 [obdclass]
      15:13:56:[ 1127.839279]  [<ffffffffa0e9bb65>] ? lod_index_lookup+0x25/0x30 [lod]
      15:13:56:[ 1127.841285]  [<ffffffffa0ef5f4f>] ? __mdd_lookup.isra.17+0x26f/0x450 [mdd]
      15:13:56:[ 1127.843369]  [<ffffffffa07d6f06>] lu_object_find+0x16/0x20 [obdclass]
      15:13:56:[ 1127.845403]  [<ffffffffa0dbff5b>] mdt_object_find+0x4b/0x170 [mdt]
      15:13:56:[ 1127.847443]  [<ffffffffa0dc3ab6>] mdt_getattr_name_lock+0x746/0x1900 [mdt]
      15:13:56:[ 1127.849494]  [<ffffffffa0dc4f20>] mdt_intent_getattr+0x2b0/0x480 [mdt]
      15:13:56:[ 1127.851464]  [<ffffffffa0dc8b3c>] mdt_intent_policy+0x5bc/0xbb0 [mdt]
      15:13:56:[ 1127.853443]  [<ffffffffa09ae1e7>] ldlm_lock_enqueue+0x387/0x970 [ptlrpc]
      15:13:56:[ 1127.855381]  [<ffffffffa09d7363>] ldlm_handle_enqueue0+0x9c3/0x1680 [ptlrpc]
      15:13:56:[ 1127.857316]  [<ffffffffa09fef40>] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc]
      15:13:56:[ 1127.859274]  [<ffffffffa0a57bf2>] tgt_enqueue+0x62/0x210 [ptlrpc]
      15:13:56:[ 1127.861113]  [<ffffffffa0a5c055>] tgt_request_handle+0x915/0x1320 [ptlrpc]
      15:13:56:[ 1127.862983]  [<ffffffffa0a07fdb>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
      15:13:56:[ 1127.864882]  [<ffffffffa0a05b98>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
      15:13:56:[ 1127.866693]  [<ffffffff810b8952>] ? default_wake_function+0x12/0x20
      15:13:56:[ 1127.868447]  [<ffffffff810af0b8>] ? __wake_up_common+0x58/0x90
      15:13:56:[ 1127.870236]  [<ffffffffa0a0c090>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
      15:13:56:[ 1127.871995]  [<ffffffffa0a0b5f0>] ? ptlrpc_main+0x0/0x1de0 [ptlrpc]
      15:13:56:[ 1127.873738]  [<ffffffff810a5b8f>] kthread+0xcf/0xe0
      15:13:56:[ 1127.875343]  [<ffffffff810a5ac0>] ? kthread+0x0/0xe0
      15:13:56:[ 1127.876914]  [<ffffffff81646b98>] ret_from_fork+0x58/0x90
      15:13:56:[ 1127.878478]  [<ffffffff810a5ac0>] ? kthread+0x0/0xe0
      15:13:56:[ 1127.879973] 
      15:13:56:[ 1127.883734] Kernel panic - not syncing: LBUG
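
      To make the assertion message above easier to read when triaging similar logs, here is a minimal sketch that pulls the two pairs out of an "OI mapping changed" line and reports which half differs. It assumes the pairs are inode/generation in the order tid => id, which is inferred from the field names in the assertion (oii_ino, oii_gen) and is not confirmed by this report. The names parse_oi_changed and summarize are hypothetical helpers, not part of Lustre.

```python
import re

# Assumption: the message prints "<ino>/<gen> => <ino>/<gen>" for tid and id,
# matching the fields compared in the assertion. The number in parentheses is
# kept as an opaque "site" tag.
OI_CHANGED = re.compile(
    r"OI mapping changed\((?P<site>\d+)\): "
    r"(?P<old_ino>\d+)/(?P<old_gen>\d+) => (?P<new_ino>\d+)/(?P<new_gen>\d+)"
)


def parse_oi_changed(line):
    """Return the numeric fields of an 'OI mapping changed' line, or None."""
    match = OI_CHANGED.search(line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


def summarize(fields):
    """Describe which of inode number and generation differ between the pairs."""
    parts = []
    if fields["old_ino"] == fields["new_ino"]:
        parts.append("inode unchanged ({})".format(fields["old_ino"]))
    else:
        parts.append("inode changed ({} -> {})".format(
            fields["old_ino"], fields["new_ino"]))
    if fields["old_gen"] == fields["new_gen"]:
        parts.append("generation unchanged ({})".format(fields["old_gen"]))
    else:
        parts.append("generation changed ({} -> {})".format(
            fields["old_gen"], fields["new_gen"]))
    return ", ".join(parts)


# The assertion line from the MDS console log in this report.
SAMPLE = (
    "LustreError: 800:0:(osd_handler.c:1182:osd_fid_lookup()) ASSERTION( "
    "tid->oii_ino == id->oii_ino && tid->oii_gen == id->oii_gen ) failed: "
    "OI mapping changed(2): 139/537564674 => 139/1570612830"
)

if __name__ == "__main__":
    print(summarize(parse_oi_changed(SAMPLE)))
```

      Under that reading, the line in this crash has the same inode number (139) on both sides and only the generation differs.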
      

      Info required for matching: replay-single 0a

          People

            yong.fan nasf (Inactive)
            maloo Maloo