Lustre / LU-7138

LBUG: (osd_handler.c:1017:osd_trans_start()) ASSERTION( get_current()->journal_info == ((void *)0) ) failed:

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Blocker
    • Affects Version: Lustre 2.7.0
    • Severity: 1

    Description

      This evening we hit this LBUG on the MDT of our production file system. The file system is currently down, as we hit the same bug every time we attempt to bring the MDT back, as soon as recovery finishes.

      <0>LustreError: 722:0:(osd_handler.c:1017:osd_trans_start()) ASSERTION( get_current()->journal_info == ((void *)0) ) failed:
      <0>LustreError: 722:0:(osd_handler.c:1017:osd_trans_start()) LBUG
      <4>Pid: 722, comm: mdt01_017
      <4>
      <4>Call Trace:
      <4> [<ffffffffa065f895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      <4> [<ffffffffa065fe97>] lbug_with_loc+0x47/0xb0 [libcfs]
      <4> [<ffffffffa17df24d>] osd_trans_start+0x25d/0x660 [osd_ldiskfs]
      <4> [<ffffffffa09b9b4a>] llog_osd_destroy+0x42a/0xd40 [obdclass]
      <4> [<ffffffffa09b2edc>] llog_cat_new_log+0x1ec/0x710 [obdclass]
      <4> [<ffffffffa09b350a>] llog_cat_add_rec+0x10a/0x450 [obdclass]
      <4> [<ffffffffa09ab1e9>] llog_add+0x89/0x1c0 [obdclass]
      <4> [<ffffffffa17f1976>] ? osd_attr_set+0x166/0x460 [osd_ldiskfs]
      <4> [<ffffffffa0d914e2>] mdd_changelog_store+0x122/0x290 [mdd]
      <4> [<ffffffffa0da4d0c>] mdd_changelog_data_store+0x16c/0x320 [mdd]
      <4> [<ffffffffa0dad9b3>] mdd_attr_set+0x12f3/0x1730 [mdd]
      <4> [<ffffffffa088a551>] mdt_reint_setattr+0xf81/0x13a0 [mdt]
      <4> [<ffffffffa087be1c>] ? mdt_root_squash+0x2c/0x3f0 [mdt]
      <4> [<ffffffffa08801dd>] mdt_reint_rec+0x5d/0x200 [mdt]
      <4> [<ffffffffa086423b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
      <4> [<ffffffffa08649ab>] mdt_reint+0x6b/0x120 [mdt]
      <4> [<ffffffffa0c6f56e>] tgt_request_handle+0x8be/0x1000 [ptlrpc]
      <4> [<ffffffffa0c1f5a1>] ptlrpc_main+0xe41/0x1960 [ptlrpc]
      <4> [<ffffffff8106c4f0>] ? pick_next_task_fair+0xd0/0x130
      <4> [<ffffffffa0c1e760>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
      <4> [<ffffffff8109e66e>] kthread+0x9e/0xc0
      <4> [<ffffffff8100c20a>] child_rip+0xa/0x20
      <4> [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
      <4> [<ffffffff8100c200>] ? child_rip+0x0/0x20
      <4>
      <0>Kernel panic - not syncing: LBUG
      

      The stack trace doesn't quite seem to be the same as for LU-6634 (which in any case doesn't have any suggested fix).


          Activity


            ferner Frederik Ferner (Inactive) added a comment - edited

            Alex,

            could you double-check the bug number that this is a duplicate of? LU-6636 doesn't look right, did you mean LU-6634?

            Also, what would be the best way for us to get a fix/patch for Lustre 2.7?

            Cheers,
            Frederik

            bzzz Alex Zhuravlev added a comment - edited

            a duplicate of LU-6634


            bzzz Alex Zhuravlev added a comment -

            ah, even better - there is llog_trans_destroy() in the master branch already. I'll cook the patch.

            bfaccini Bruno Faccini (Inactive) added a comment -

            Oleg, based on the size of changelog_catalog, which indicates it has reached its maximum size, I believe this ticket is a combination of LU-6556 (catalog no longer able to wrap around) and LU-6634 (wrong call to llog_destroy() to destroy a plain llog on the catalog-full condition while a journal transaction has already been started).
            My master patch for LU-6556 has still not landed, and I think Mike has already started working on a solution for LU-6634.
            So at the moment, I think there was no other solution than to remove the changelog_catalog file to allow the filesystem to be restarted.
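
            (A hedged aside, not from the ticket itself: assuming the llog header occupies one 8192-byte chunk and each catalog entry, a struct llog_logid_rec, is 64 bytes - both assumptions about this Lustre version - the 4,153,280-byte changelog_catalog reported in the ls -l output below works out to a completely full catalog, which supports the "reached its max size" reading even though 4MB does not look big in absolute terms.)

            # rough back-of-the-envelope check; the header and record sizes are assumed values
            SIZE=4153280                          # bytes, from the ls -l output below
            HDR=8192                              # assumed llog header chunk size
            REC=64                                # assumed size of one catalog record (llog_logid_rec)
            echo "catalog entries: $(( (SIZE - HDR) / REC ))"    # prints 64767, i.e. the bitmap is full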

            ferner Frederik Ferner (Inactive) added a comment - edited

            Changelogs are indeed used for a robinhood instance. It appears to still be alive and has managed to keep up with the rate of changes as far as I can see.

            changelog_catalog isn't that big as far as I can see (4MB), and there are indeed not many files below O:

            [bnh65367@cs04r-sc-mds03-01 mdt]$ sudo find O -type f | wc -l
            35
            [bnh65367@cs04r-sc-mds03-01 mdt]$ ls -l changelog_*
            -rw-r--r-- 1 root root 4153280 Aug 19  2014 changelog_catalog
            -rw-r--r-- 1 root root    8384 Aug 19  2014 changelog_users

            Removing changelog_* did indeed allow us to bring the file system back successfully (so far).

            Immediate crisis over, but we'd be very interested in the root cause and how to avoid this in the future (and on our other file systems).

            green Oleg Drokin added a comment -

            Ok, apparently the LOGS/ dir is old-style llogs.

            Anyway, the current theory is that your changelog catalog has overflowed. If you check the size of the "changelog_catalog" file with the MDT filesystem mounted as ldiskfs, it is quite big, I imagine?

            What do you use changelogs for? A robinhood install or some other eager user that immediately consumes all changelogs as they are generated? Is it still alive and well? (As in: was it consuming records while the MDS was up, and did it not happen to wedge itself ages ago?)
            This should also mean that the number of files (not dirs) in the "O" dir is small.

            If you don't care all that much about these changelogs because they have already been consumed, you can just remove the "changelog_catalog" and "changelog_users" files, and then the MDS should be able to start. You will then need to re-enable your changelogs and re-register all consumers.

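            (A hedged sketch of the procedure Oleg describes, using hypothetical names - /dev/mdtdev, /mnt/mdt and lustre03-MDT0000 - adapt them to the real device, mount point and MDT target, only do this if the changelog records have indeed already been consumed, and keep a copy of the files rather than deleting them outright.)

            # with the MDT target stopped, mount the backing device as ldiskfs
            mount -t ldiskfs /dev/mdtdev /mnt/mdt
            mkdir -p /root/changelog-backup
            mv /mnt/mdt/changelog_catalog /mnt/mdt/changelog_users /root/changelog-backup/
            umount /mnt/mdt

            # restart the MDT as usual, then re-register each changelog consumer
            lctl --device lustre03-MDT0000 changelog_register
            # robinhood (or any other consumer) then needs to be pointed at the new cl<N> user id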

            ferner Frederik Ferner (Inactive) added a comment -

            The LOGS/ dir on the MDT is empty, no files at all; the directory is owned by root:root and world-writable.

            green Oleg Drokin added a comment -

            Can you also check that all the files in the LOGS/ dir are root-owned? What are the sizes of the 10 most recently written (based on mtime) ones?

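            (A hedged sketch of those two checks, assuming the MDT is mounted as ldiskfs at a hypothetical /mnt/mdt.)

            find /mnt/mdt/LOGS -maxdepth 1 -type f ! -user root   # should print nothing if all files are root-owned
            ls -lt /mnt/mdt/LOGS | head -n 11                     # sizes of the 10 most recently modified files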

            ferner Frederik Ferner (Inactive) added a comment - edited

            quota shows no quota for any user on the MDT when mounted as ldiskfs, and yes, I can still write a file:

            [bnh65367@cs04r-sc-mds03-01 ~]$ sudo dd if=/dev/zero of=/lustre/lustre03/mdt/LOGS/aaa bs=1024k count=1
            1+0 records in
            1+0 records out
            1048576 bytes (1.0 MB) copied, 0.000997444 s, 1.1 GB/s
            [bnh65367@cs04r-sc-mds03-01 ~]$

            (Just to be sure, I've also checked the output of df -hl and df -hil on all OSS nodes, and all OSTs have space as well.)

            green Oleg Drokin added a comment -

            If you mount as ldiskfs, you should see quota info with the quota tools, I imagine, e.g. quota.

            Also, can you write files in there? E.g. can you do dd if=/dev/zero of=/mnt/mdt-mountpoint/LOGS/aaa bs=1024k count=1 ?
            (Remove the /mnt/mdt-mountpoint/LOGS/aaa afterwards.)
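
            (The same check written out as a small session, with the same hypothetical mount point; the test file is removed again afterwards.)

            mount -t ldiskfs /dev/mdtdev /mnt/mdt-mountpoint      # /dev/mdtdev is hypothetical
            quota -u root                                         # any quota limits visible via the quota tools?
            dd if=/dev/zero of=/mnt/mdt-mountpoint/LOGS/aaa bs=1024k count=1
            rm /mnt/mdt-mountpoint/LOGS/aaa                       # clean up the test file
            umount /mnt/mdt-mountpoint
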
            green Oleg Drokin added a comment -

            Aha, reading into llog_osd_write_rec I see this nice piece:

                    /* if it's the last idx in log file, then return -ENOSPC */
                    if (loghandle->lgh_last_idx >= LLOG_BITMAP_SIZE(llh) - 1)
                            RETURN(-ENOSPC);
            

            I wonder if this is what you are hitting.... Hmmm....
            In fact, looking at the backtrace, we are coming from llog_cat_add_rec, where we have:

                    rc = llog_write_rec(env, loghandle, rec, reccookie, LLOG_NEXT_IDX, th);
                    if (rc < 0)
                            CDEBUG_LIMIT(rc == -ENOSPC ? D_HA : D_ERROR,
                                         "llog_write_rec %d: lh=%p\n", rc, loghandle);
                    up_write(&loghandle->lgh_lock);
                    if (rc == -ENOSPC) {
                            /* try to use next log */
                            loghandle = llog_cat_current_log(cathandle, th);
                            LASSERT(!IS_ERR(loghandle));
                            /* new llog can be created concurrently */
                            if (!llog_exist(loghandle)) {
                                    rc = llog_cat_new_log(env, cathandle, loghandle, th);
                                    if (rc < 0) {
                                            up_write(&loghandle->lgh_lock);
                                            RETURN(rc);
                                    }
                            }
            

            So we caught ENOSPC and went into the creation of a new llog, but got another ENOSPC there as well, where the llog should already be the new one, so it could not be the end of the previous llog - the writes appear to be genuinely failing.

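            (For anyone else hitting this assertion: a hedged way to check the catalog state offline is llog_reader from the lustre utilities, run against the catalog file with the MDT mounted read-only as ldiskfs; the device and mount point below are hypothetical.)

            mount -t ldiskfs -o ro /dev/mdtdev /mnt/mdt
            llog_reader /mnt/mdt/changelog_catalog | head -n 20   # header and first records show how many slots are in use
            umount /mnt/mdt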

            People

              Assignee: green Oleg Drokin
              Reporter: ferner Frederik Ferner (Inactive)
              Votes: 0
              Watchers: 9
