  Lustre / LU-7138

LBUG: (osd_handler.c:1017:osd_trans_start()) ASSERTION( get_current()->journal_info == ((void *)0) ) failed:

Details

    • Type: Bug
    • Priority: Blocker
    • Resolution: Duplicate
    • Affects Version/s: Lustre 2.7.0

    Description

      This evening we hit this LBUG on the MDT in our production file system. The file system is currently down: we hit the same bug every time we attempt to bring the MDT back, as soon as recovery finishes.

      <0>LustreError: 722:0:(osd_handler.c:1017:osd_trans_start()) ASSERTION( get_current()->journal_info == ((void *)0) ) failed:
      <0>LustreError: 722:0:(osd_handler.c:1017:osd_trans_start()) LBUG
      <4>Pid: 722, comm: mdt01_017
      <4>
      <4>Call Trace:
      <4> [<ffffffffa065f895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      <4> [<ffffffffa065fe97>] lbug_with_loc+0x47/0xb0 [libcfs]
      <4> [<ffffffffa17df24d>] osd_trans_start+0x25d/0x660 [osd_ldiskfs]
      <4> [<ffffffffa09b9b4a>] llog_osd_destroy+0x42a/0xd40 [obdclass]
      <4> [<ffffffffa09b2edc>] llog_cat_new_log+0x1ec/0x710 [obdclass]
      <4> [<ffffffffa09b350a>] llog_cat_add_rec+0x10a/0x450 [obdclass]
      <4> [<ffffffffa09ab1e9>] llog_add+0x89/0x1c0 [obdclass]
      <4> [<ffffffffa17f1976>] ? osd_attr_set+0x166/0x460 [osd_ldiskfs]
      <4> [<ffffffffa0d914e2>] mdd_changelog_store+0x122/0x290 [mdd]
      <4> [<ffffffffa0da4d0c>] mdd_changelog_data_store+0x16c/0x320 [mdd]
      <4> [<ffffffffa0dad9b3>] mdd_attr_set+0x12f3/0x1730 [mdd]
      <4> [<ffffffffa088a551>] mdt_reint_setattr+0xf81/0x13a0 [mdt]
      <4> [<ffffffffa087be1c>] ? mdt_root_squash+0x2c/0x3f0 [mdt]
      <4> [<ffffffffa08801dd>] mdt_reint_rec+0x5d/0x200 [mdt]
      <4> [<ffffffffa086423b>] mdt_reint_internal+0x4cb/0x7a0 [mdt]
      <4> [<ffffffffa08649ab>] mdt_reint+0x6b/0x120 [mdt]
      <4> [<ffffffffa0c6f56e>] tgt_request_handle+0x8be/0x1000 [ptlrpc]
      <4> [<ffffffffa0c1f5a1>] ptlrpc_main+0xe41/0x1960 [ptlrpc]
      <4> [<ffffffff8106c4f0>] ? pick_next_task_fair+0xd0/0x130
      <4> [<ffffffffa0c1e760>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
      <4> [<ffffffff8109e66e>] kthread+0x9e/0xc0
      <4> [<ffffffff8100c20a>] child_rip+0xa/0x20
      <4> [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
      <4> [<ffffffff8100c200>] ? child_rip+0x0/0x20
      <4>
      <0>Kernel panic - not syncing: LBUG
      

      The stack trace doesn't quite seem to match the one in LU-6634 (which in any case has no suggested fix).

      Attachments

      Issue Links

      Activity

            green Oleg Drokin added a comment -

            "lop_write_rec error -28" means you are out of free space.
            Can ou check that you have enough inodes too? (df -i). Is there a quota set for the root user by any chance?
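
            For reference, a quick way to check both might look like this (the client mount point /mnt/lustre03 is an example, not taken from this report; the MDT mount point is from the df output further down):

            # inode usage on the MDT backing filesystem, run on the MDS
            df -i /lustre/lustre03/mdt

            # quota for the root user, run from any client of this filesystem
            lfs quota -u root /mnt/lustre03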


            ferner Frederik Ferner (Inactive) added a comment - edited

            After updating Lustre and mounting the MDT again, I got this on the console:

            Lustre: lustre03-MDT0000: Denying connection for new client lustre03-MDT0000-lwp-OST0001_UUID (at 10.144.144.31@o2ib), waiting for all 354 known clients (139 recovered, 170 in progress, and 0 evicted) to recove0
            Lustre: Skipped 19 previous similar messages
            Lustre: Skipped 3 previous similar messages
            Lustre: lustre03-MDT0000: Client 211c78af-7674-5b7e-5f34-3d38b509f421 (at 10.144.148.31@o2ib) reconnecting, waiting for 354 clients in recovery for 2:44
            Lustre: Skipped 1895 previous similar messages
            LustreError: 10692:0:(ldlm_lib.c:1748:check_for_next_transno()) lustre03-MDT0000: waking for gap in transno, VBR is OFF (skip: 370152455116, ql: 37, comp: 317, conn: 354, next: 370152455496, last_committed: 370)
            LustreError: 10692:0:(llog.c:726:llog_write_rec()) lop_write_rec error -28
            LustreError: 10692:0:(llog.c:726:llog_write_rec()) Skipped 13 previous similar messages
            LustreError: 10692:0:(osd_handler.c:1017:osd_trans_start()) ASSERTION( get_current()->journal_info == ((void *)0) ) failed: 
            LustreError: 10692:0:(osd_handler.c:1017:osd_trans_start()) LBUG
            

            I'm also not having much luck with the vmcore: it appears I can't find the right "NAMELIST" (vmlinux? Do I need the kernel-debuginfo rpm for the kernel we've been running?), so I'm thinking of uploading it somewhere in the hope that you could look at it.

            vmcore file now at ftp://ftpanon.diamond.ac.uk/LU-7138/vmcore-LU_7138

            I need to get some sleep now, but I would appreciate it if this could still be looked at. I'll be back in approximately 5h, and users will start getting very impatient shortly after that, so an idea of how to get this file system back up (sanely) would be really good at that point.

            green Oleg Drokin added a comment -

            You can use the crash tool to load the vmcore, then use "mod -S" to load the Lustre module symbol information, and the xbt crash extension to see local variable contents.
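
            A session along these lines should work (paths are examples; the vmlinux with debug symbols comes from the kernel-debuginfo rpm matching the crashed kernel, and the exact xbt invocation may differ depending on the extension version):

            # load the dump against the matching debug vmlinux
            crash /usr/lib/debug/lib/modules/<kernel-version>/vmlinux vmcore

            crash> mod -S          # load debuginfo for all loaded modules, including Lustre
            crash> bt              # backtrace of the panicking task
            crash> extend xbt.so   # load the xbt extension
            crash> xbt             # walk the frames and dump local variables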


            ferner Frederik Ferner (Inactive) added a comment -

            Ok, I've started the build for a Lustre version with the debug patch.

            I have a vmcore from the initial LBUG kernel panic; is this potentially useful? (At the moment I'm not sure how I would go about extracting information from it, though.)

            green Oleg Drokin added a comment - edited

            If you apply a patch like the one below, it won't really fix anything, but at least we'll know where the error comes from and what it was (and if you have a crashdump from the crash, there's a chance you can extract the information from it too).

            diff --git a/lustre/obdclass/llog.c b/lustre/obdclass/llog.c
            index c34e179..680dae1 100644
            --- a/lustre/obdclass/llog.c
            +++ b/lustre/obdclass/llog.c
            @@ -702,12 +702,16 @@ int llog_write_rec(const struct lu_env *env, struct llog_handle *handle,
             	ENTRY;
             
             	rc = llog_handle2ops(handle, &lop);
            -	if (rc)
            +	if (rc) {
            +		CERROR("llog_handle2ops error %d\n", rc);
             		RETURN(rc);
            +	}
             
             	LASSERT(lop);
            -	if (lop->lop_write_rec == NULL)
            +	if (lop->lop_write_rec == NULL) {
            +		CERROR("lop->lop_write_rec == NULL\n");
             		RETURN(-EOPNOTSUPP);
            +	}
             
             	buflen = rec->lrh_len;
             	LASSERT(cfs_size_round(buflen) == buflen);
            @@ -718,6 +722,8 @@ int llog_write_rec(const struct lu_env *env, struct llog_handle *handle,
             	rc = lop->lop_write_rec(env, handle, rec, logcookies, idx, th);
             	if (!raised)
             		cfs_cap_lower(CFS_CAP_SYS_RESOURCE);
            +	if (rc)
            +		CERROR("lop_write_rec error %d\n", rc);
             	RETURN(rc);
             }
             
            

            The patch is only needed on the MDS, btw.
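
            If it helps, a typical sequence for applying it would be along these lines (llog-debug.patch is a placeholder name, and the configure options depend on your build setup):

            cd lustre-release
            git apply llog-debug.patch       # or: patch -p1 < llog-debug.patch
            sh autogen.sh && ./configure --with-linux=/usr/src/kernels/<kernel-version>
            make rpms                        # then install the rebuilt server RPMs on the MDS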


            ferner Frederik Ferner (Inactive) added a comment - edited

            There's nothing obvious in syslog. The full contents of /var/log/messages before the crash/reboot are these:

            Sep  7 08:52:49 cs04r-sc-mds03-01 kernel: Lustre: lustre03-MDT0000-osd: FID [0x2000115a3:0x662:0x0] != self_fid [0x200010dc1:0xd167:0x0]
            Sep  7 11:25:06 cs04r-sc-mds03-01 kernel: Lustre: lustre03-MDT0000-osd: FID [0x200010dbe:0x9e03:0x0] != self_fid [0x200010dbe:0x9e09:0x0]
            Sep  7 12:03:51 cs04r-sc-mds03-01 kernel: Lustre: lustre03-MDT0000-osd: FID [0x200011586:0x283f:0x0] != self_fid [0x200011582:0x41a3:0x0]
            Sep  7 13:14:15 cs04r-sc-mds03-01 kernel: Lustre: lustre03-MDT0000-osd: FID [0x200010dab:0x198d5:0x0] != self_fid [0x200011602:0x1c22e:0x0]
            Sep  7 13:30:42 cs04r-sc-mds03-01 kernel: Lustre: lustre03-MDT0000-osd: FID [0x200011580:0x9afc:0x0] != self_fid [0x200010da9:0x16251:0x0]
            Sep  7 14:10:07 cs04r-sc-mds03-01 kernel: Lustre: lustre03-MDT0000-osd: FID [0x20001157b:0xf7a6:0x0] != self_fid [0x200011581:0x152fe:0x0]
            Sep  7 15:47:13 cs04r-sc-mds03-01 kernel: Lustre: lustre03-MDT0000-osd: FID [0x2000115b2:0x4cab:0x0] != self_fid [0x2000115b1:0x4cab:0x0]
            Sep  7 18:17:46 cs04r-sc-mds03-01 kernel: Lustre: MGS: Client d6e80b71-1eca-2fca-7854-d8ad02c40814 (at 172.23.116.33@tcp) reconnecting
            Sep  7 18:17:46 cs04r-sc-mds03-01 kernel: Lustre: Skipped 1 previous similar message
            Sep  7 18:18:46 cs04r-sc-mds03-01 kernel: Lustre: MGS: Client d6e80b71-1eca-2fca-7854-d8ad02c40814 (at 172.23.116.33@tcp) reconnecting
            Sep  8 11:05:29 cs04r-sc-mds03-01 kernel: Lustre: lustre03-MDT0000: Client a1185e16-56ed-6244-ef01-edb7c59bbde2 (at 172.23.146.38@tcp) reconnecting
            Sep  8 11:05:29 cs04r-sc-mds03-01 kernel: Lustre: Skipped 2 previous similar messages
            Sep  8 11:06:11 cs04r-sc-mds03-01 kernel: Lustre: MGS: Client 847ec63e-7411-2e53-d492-aca9e17e6c84 (at 172.23.146.38@tcp) reconnecting
            Sep  8 11:06:11 cs04r-sc-mds03-01 kernel: Lustre: Skipped 1 previous similar message
            Sep  8 11:06:25 cs04r-sc-mds03-01 kernel: Lustre: lustre03-MDT0000: Client a1185e16-56ed-6244-ef01-edb7c59bbde2 (at 172.23.146.38@tcp) reconnecting
            Sep  8 11:07:19 cs04r-sc-mds03-01 kernel: Lustre: lustre03-MDT0000: Client a1185e16-56ed-6244-ef01-edb7c59bbde2 (at 172.23.146.38@tcp) reconnecting
            Sep  8 11:07:19 cs04r-sc-mds03-01 kernel: Lustre: Skipped 1 previous similar message
            Sep  8 11:11:41 cs04r-sc-mds03-01 kernel: Lustre: MGS: Client 847ec63e-7411-2e53-d492-aca9e17e6c84 (at 172.23.146.38@tcp) reconnecting
            Sep  8 11:17:00 cs04r-sc-mds03-01 kernel: Lustre: MGS: Client 847ec63e-7411-2e53-d492-aca9e17e6c84 (at 172.23.146.38@tcp) reconnecting
            Sep  8 16:15:16 cs04r-sc-mds03-01 kernel: Lustre: lustre03-MDT0000-osd: FID [0x2000115a4:0xa9e9:0x0] != self_fid [0x200011626:0x1d353:0x0]
            Sep  8 17:23:12 cs04r-sc-mds03-01 kernel: Lustre: lustre03-MDT0000: Client a1185e16-56ed-6244-ef01-edb7c59bbde2 (at 172.23.146.38@tcp) reconnecting
            Sep  8 17:23:27 cs04r-sc-mds03-01 kernel: Lustre: lustre03-MDT0000: Client a1185e16-56ed-6244-ef01-edb7c59bbde2 (at 172.23.146.38@tcp) reconnecting
            Sep  8 18:33:55 cs04r-sc-mds03-01 kernel: Lustre: lustre03-MDT0000-osd: FID [0x200010d9e:0x196cf:0x0] != self_fid [0x20001157c:0x10445:0x0]
            Sep  8 20:37:36 cs04r-sc-mds03-01 kernel: Lustre: lustre03-MDT0000-osd: FID [0x20001163d:0x29e3:0x0] != self_fid [0x200010db6:0x1c6f6:0x0]
            Sep  9 00:32:12 cs04r-sc-mds03-01 kernel: Lustre: lustre03-MDT0000-osd: FID [0x200011640:0x407d:0x0] != self_fid [0x20001157f:0x1411a:0x0]
            Sep  9 01:07:50 cs04r-sc-mds03-01 kernel: Lustre: lustre03-MDT0000: Client a1185e16-56ed-6244-ef01-edb7c59bbde2 (at 172.23.146.38@tcp) reconnecting
            Sep  9 02:36:05 cs04r-sc-mds03-01 kernel: Lustre: lustre03-MDT0000-osd: FID [0x2000115bf:0x1b19:0x0] != self_fid [0x200011570:0x1d5c:0x0]
            Sep 10 10:21:18 cs04r-sc-mds03-01 kernel: Lustre: lustre03-MDT0000-osd: FID [0x200011645:0x8b5e:0x0] != self_fid [0x20001163a:0xcf49:0x0]
            Sep 10 10:29:03 cs04r-sc-mds03-01 kernel: Lustre: lustre03-MDT0000-osd: FID [0x20001163d:0x1dc2d:0x0] != self_fid [0x20001163d:0x1dc37:0x0]
            Sep 10 16:48:08 cs04r-sc-mds03-01 kernel: Lustre: lustre03-MDT0000-osd: FID [0x200011668:0x739b:0x0] != self_fid [0x20001167c:0x1501f:0x0]
            

            Mounting the MDT as ldiskfs works (see the example after the df output below) and there is lots of space free:

            [bnh65367@cs04r-sc-mds03-01 127.0.0.1-2015-09-10-22:05:04]$ df -h
            /dev/mapper/vg_lustre03-mgs
                                  469M  2.7M  441M   1% /lustre/mgs
            /dev/mapper/vg_lustre03-mdt
                                  1.5T  103G  1.3T   8% /lustre/lustre03/mdt
            [bnh65367@cs04r-sc-mds03-01 127.0.0.1-2015-09-10-22:05:04]$ df -hi
            Filesystem           Inodes IUsed IFree IUse% Mounted on
            /dev/mapper/vg_lustre03-mdt
                                   484M  128M  357M   27% /lustre/lustre03/mdt
            /dev/mapper/vg_lustre03-mgs
                                   125K   218  125K    1% /lustre/mgs
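
            For context, inspecting the MDT backing device directly means mounting it as plain ldiskfs; something like the following, read-only for safety (the device path and mount point are taken from the df output above, and the ldiskfs module from the server RPMs must be loaded):

            mount -t ldiskfs -o ro /dev/mapper/vg_lustre03-mdt /lustre/lustre03/mdt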
            
            

            Is there another option to get the file system back online without the "at your own risk" patch? I'd rather not find out that half-written logs are completely trashing our file system...

            green Oleg Drokin added a comment - edited

            So the next puzzle is why llog init or llog write failed. You are not out of space or anything like that, are you? No messages in the logs pointing at other sources of failure?

            You can comment out the llog_destroy call like this as a temporary workaround, though if writes keep failing, I am sure something else would hit too:

            diff --git a/lustre/obdclass/llog_cat.c b/lustre/obdclass/llog_cat.c
            index 981af6f..6e51088 100644
            --- a/lustre/obdclass/llog_cat.c
            +++ b/lustre/obdclass/llog_cat.c
            @@ -116,7 +116,7 @@ static int llog_cat_new_log(const struct lu_env *env,
                    loghandle->lgh_hdr->llh_cat_idx = rec->lid_hdr.lrh_index;
                    RETURN(0);
             out_destroy:
            -       llog_destroy(env, loghandle);
            +//     llog_destroy(env, loghandle);
                    RETURN(rc);
             }
            

            It's also unclear what would happen when the system encounters these half-written logs later, so it's really a "try at your own risk" patch.

            green Oleg Drokin added a comment -

            So, looking at llog_cat_new_log, we can see that on error it cleans up with:

            out_destroy:
                    llog_destroy(env, loghandle);
                    RETURN(rc);
            

            This gets translated into llog_osd_destroy, which insists on creating its own transaction; but we are already in a transaction, which causes this assertion failure.

            It's trivial to reproduce with this patch (causing a 100% crash on startup):

            diff --git a/lustre/obdclass/llog_cat.c b/lustre/obdclass/llog_cat.c
            index 981af6f..3617e3b 100644
            --- a/lustre/obdclass/llog_cat.c
            +++ b/lustre/obdclass/llog_cat.c
            @@ -105,8 +105,8 @@ static int llog_cat_new_log(const struct lu_env *env,
                     * assigned to the record and updated in rec header */
                    rc = llog_write_rec(env, cathandle, &rec->lid_hdr,
                                        &loghandle->u.phd.phd_cookie, LLOG_NEXT_IDX, th);
            -       if (rc < 0)
            -               GOTO(out_destroy, rc);
            +//     if (rc < 0)
            +               GOTO(out_destroy, rc = -5);
             
                    CDEBUG(D_OTHER, "new recovery log "DOSTID":%x for index %u of catalog"
                           DOSTID"\n", POSTID(&loghandle->lgh_id.lgl_oi),
            

            If we think about it some more, it's actually totally logical that we do not need to clean up in a different transaction: we must clean up in the same one, so that the whole operation is atomic in the end.
            So I imagine we need to use a different destroy method that would reuse the same transaction.
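
            To illustrate the direction only (a sketch, not a tested fix): the cleanup could reuse the transaction handle th that llog_cat_new_log already holds, assuming a destroy variant that accepts an existing thandle — llog_trans_destroy here is hypothetical:

            --- a/lustre/obdclass/llog_cat.c
            +++ b/lustre/obdclass/llog_cat.c
            @@ static int llog_cat_new_log(const struct lu_env *env,
              out_destroy:
            -        llog_destroy(env, loghandle);
            +        /* destroy inside the transaction we already hold instead of
            +         * letting llog_osd_destroy start its own */
            +        llog_trans_destroy(env, loghandle, th);
                     RETURN(rc);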

            pjones Peter Jones added a comment -

            Oleg is looking into this


            People

              Assignee: green Oleg Drokin
              Reporter: ferner Frederik Ferner (Inactive)
              Votes: 0
              Watchers: 9
