LU-10738

mdd: LBUG() from mdd_changelog_data_store_by_fid

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Minor
    • Severity: 3

    Description

      When running sanity-hsm test_26d on Maloo with this patch applied, the MDS hits an LBUG(). This looks a lot like LU-10454: mdd_changelog_data_store_by_fid() tries to access a structure that is no longer available because the client has just been evicted.

      A test instance that triggers the bug is available (MDS log file). The relevant part of the log, where the administrative eviction ends in a failed assertion in osd_trans_exec_op(), is:

      [ 2604.113084] Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n mdt.lustre-MDT0000.evict_client 16a29cf5-5198-cefe-15d1-1c432b2ce629
      [ 2604.264461] Lustre: 3825:0:(genops.c:1759:obd_export_evict_by_uuid()) lustre-MDT0000: evicting 16a29cf5-5198-cefe-15d1-1c432b2ce629 at adminstrative request
      [ 2604.269447] Lustre: 3825:0:(osd_internal.h:1139:osd_trans_exec_op()) lustre-MDT0000: opcode 7: credits = 0, rollback = 7
      [ 2604.272240] Lustre: 3825:0:(osd_handler.c:1723:osd_trans_dump_creds())   create: 0/0/0, destroy: 0/0/0
      [ 2604.274895] Lustre: 3825:0:(osd_handler.c:1730:osd_trans_dump_creds())   attr_set: 0/0/0, xattr_set: 1/89/0
      [ 2604.277555] Lustre: 3825:0:(osd_handler.c:1740:osd_trans_dump_creds())   write: 0/0/0, punch: 0/0/0, quota 0/0/0
      [ 2604.280293] Lustre: 3825:0:(osd_handler.c:1747:osd_trans_dump_creds())   insert: 0/0/0, delete: 0/0/0
      [ 2604.282859] Lustre: 3825:0:(osd_handler.c:1754:osd_trans_dump_creds())   ref_add: 0/0/0, ref_del: 0/0/0
      [ 2604.285402] LustreError: 3825:0:(osd_internal.h:1141:osd_trans_exec_op()) ASSERTION( !ldiskfs_track_declares_assert ) failed: 
      [ 2604.289787] LustreError: 3825:0:(osd_internal.h:1141:osd_trans_exec_op()) LBUG
      [ 2604.292129] Pid: 3825, comm: lctl
      [ 2604.294136] 
      [ 2604.294136] Call Trace:
      [ 2604.297725]  [<ffffffffc068d7ae>] libcfs_call_trace+0x4e/0x60 [libcfs]
      [ 2604.299849]  [<ffffffffc068d83c>] lbug_with_loc+0x4c/0xb0 [libcfs]
      [ 2604.301974]  [<ffffffffc0d54df1>] osd_write+0x5a1/0x5b0 [osd_ldiskfs]
      [ 2604.304053]  [<ffffffffc08c74e9>] dt_record_write+0x39/0x120 [obdclass]
      [ 2604.306141]  [<ffffffffc0888697>] llog_osd_write_rec+0xbf7/0x1460 [obdclass]
      [ 2604.308201]  [<ffffffffc087b3d9>] llog_write_rec+0xc9/0x520 [obdclass]
      [ 2604.310265]  [<ffffffffc0880370>] llog_cat_add_rec+0x220/0x8b0 [obdclass]
      [ 2604.312289]  [<ffffffffc08784fa>] llog_add+0x7a/0x1a0 [obdclass]
      [ 2604.314286]  [<ffffffff810ec7ba>] ? __getnstimeofday64+0x3a/0xd0
      [ 2604.316248]  [<ffffffffc1055b22>] mdd_changelog_store+0x1a2/0x5f0 [mdd]
      [ 2604.318314]  [<ffffffffc1063f8e>] mdd_changelog_data_store_by_fid+0x1ae/0x320 [mdd]
      [ 2604.320430]  [<ffffffffc1064564>] mdd_changelog_data_store_xattr+0x104/0x230 [mdd]
      [ 2604.322589]  [<ffffffffc106c17e>] mdd_xattr_set+0x95e/0x17f0 [mdd]
      [ 2604.324613]  [<ffffffffc0f12532>] mdt_hsm_attr_set+0xa2/0x230 [mdt]
      [ 2604.326673]  [<ffffffffc0efc5b1>] mdt_add_dirty_flag+0x1d1/0x250 [mdt]
      [ 2604.328694]  [<ffffffffc0ed0e3d>] mdt_ctxt_add_dirty_flag.isra.70+0xdd/0x1a0 [mdt]
      [ 2604.330870]  [<ffffffffc0ed3618>] mdt_obd_disconnect+0x3c8/0x670 [mdt]
      [ 2604.332913]  [<ffffffffc08986a9>] class_fail_export+0x279/0x580 [obdclass]
      [ 2604.334988]  [<ffffffffc089b72f>] obd_export_evict_by_uuid+0x12f/0x220 [obdclass]
      [ 2604.337030]  [<ffffffff8118483b>] ? unlock_page+0x2b/0x30
      [ 2604.338946]  [<ffffffffc08de96f>] lprocfs_evict_client_seq_write+0x1cf/0x290 [obdclass]
      [ 2604.340974]  [<ffffffffc0f0f086>] mdt_mds_evict_client_write+0x416/0x6a0 [mdt]
      [ 2604.342948]  [<ffffffff8127267d>] proc_reg_write+0x3d/0x80
      [ 2604.344734]  [<ffffffff81202ced>] vfs_write+0xbd/0x1e0
      [ 2604.346467]  [<ffffffff81203aff>] SyS_write+0x7f/0xe0
      [ 2604.348143]  [<ffffffff816b8929>] ? system_call_after_swapgs+0x156/0x214
      [ 2604.349917]  [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
      [ 2604.351609]  [<ffffffff816b889d>] ? system_call_after_swapgs+0xca/0x214
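
      When comparing this trace with the one in LU-10454, it can help to reduce each trace to its function names and the sequence of modules it crosses. The following is a minimal sketch of that, not part of any Lustre tooling: the names `FRAME_RE`, `parse_frames`, `module_path` and `SAMPLE` are hypothetical, and the regular expression assumes the frame layout shown in the log above (`[<address>] function+0xoff/0xsize [module]`, with `?` marking frames the unwinder is unsure about).

```python
import re

# Matches kernel call-trace frames such as:
#   [ 2604.316248]  [<ffffffffc1055b22>] mdd_changelog_store+0x1a2/0x5f0 [mdd]
# A leading "?" marks a frame the unwinder considers unreliable.
# Assumption: frames follow the layout seen in the log excerpt above.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(log_text):
    """Return (function, module, reliable) tuples, innermost frame first."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append((
                m.group("func"),
                m.group("module") or "kernel",  # no [module] suffix: core kernel
                m.group("unreliable") is None,
            ))
    return frames


def module_path(frames):
    """Collapse reliable frames into the ordered list of modules crossed."""
    path = []
    for _func, module, reliable in frames:
        if reliable and (not path or path[-1] != module):
            path.append(module)
    return path


# A few frames copied from the trace above, used as sample input.
SAMPLE = """\
[ 2604.301974]  [<ffffffffc0d54df1>] osd_write+0x5a1/0x5b0 [osd_ldiskfs]
[ 2604.304053]  [<ffffffffc08c74e9>] dt_record_write+0x39/0x120 [obdclass]
[ 2604.314286]  [<ffffffff810ec7ba>] ? __getnstimeofday64+0x3a/0xd0
[ 2604.316248]  [<ffffffffc1055b22>] mdd_changelog_store+0x1a2/0x5f0 [mdd]
[ 2604.318314]  [<ffffffffc1063f8e>] mdd_changelog_data_store_by_fid+0x1ae/0x320 [mdd]
[ 2604.330870]  [<ffffffffc0ed0e3d>] mdt_ctxt_add_dirty_flag.isra.70+0xdd/0x1a0 [mdt]
[ 2604.332913]  [<ffffffffc0ed3618>] mdt_obd_disconnect+0x3c8/0x670 [mdt]
[ 2604.346467]  [<ffffffff81203aff>] SyS_write+0x7f/0xe0
"""
```

      Calling `module_path(parse_frames(text))` on the full log text gives the order in which the eviction path passes through the modules, which is easier to compare across tickets than raw addresses and offsets.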
      
      

            People

              sbuisson Sebastien Buisson (Inactive)
              cealustre CEA