[LU-3659] dt_declare_insert() ASSERTION( dt->do_index_ops ) failed Created: 29/Jul/13  Updated: 05/Sep/13  Resolved: 05/Sep/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: Lustre 2.5.0

Type: Bug Priority: Major
Reporter: Ned Bass Assignee: Niu Yawei (Inactive)
Resolution: Fixed Votes: 0
Labels: yuc2
Environment:

x86_64 MDS with ZFS backend
Kernel 2.6.32-358.6.1.3chaos.ch5.1.x86_64
Lustre 2.4.0-RC2_5chaos


Severity: 3
Rank (Obsolete): 9440

 Description   

MDS was up and running a production MDT and MGS. I created and mounted a second MDT dataset in the pool for testing purposes. mount.lustre crashed with the following backtrace.

PID: 91831  TASK: ffff8800a1031500  CPU: 2   COMMAND: "mount.lustre"
 #0 [ffff881010da97a0] machine_kexec at ffffffff81035bfb
 #1 [ffff881010da9800] crash_kexec at ffffffff810c0932
 #2 [ffff881010da98d0] panic at ffffffff8150d943
 #3 [ffff881010da9950] lbug_with_loc at ffffffffa05abf4b [libcfs]
 #4 [ffff881010da9970] llog_osd_declare_create at ffffffffa070147f [obdclass]
 #5 [ffff881010da99d0] llog_declare_create at ffffffffa06cbd41 [obdclass]
 #6 [ffff881010da9a10] llog_open_create at ffffffffa06cc99f [obdclass]
 #7 [ffff881010da9a60] mdd_prepare at ffffffffa0ddab4c [mdd]
 #8 [ffff881010da9af0] mdt_prepare at ffffffffa0e3013a [mdt]
 #9 [ffff881010da9b50] server_start_targets at ffffffffa074ab26 [obdclass]
#10 [ffff881010da9c90] server_fill_super at ffffffffa074c104 [obdclass]
#11 [ffff881010da9d70] lustre_fill_super at ffffffffa071c818 [obdclass]
#12 [ffff881010da9da0] get_sb_nodev at ffffffff81183ecf
#13 [ffff881010da9de0] lustre_get_sb at ffffffffa0714285 [obdclass]
#14 [ffff881010da9e00] vfs_kern_mount at ffffffff811834eb
#15 [ffff881010da9e50] do_kern_mount at ffffffff81183692
#16 [ffff881010da9ea0] do_mount at ffffffff811a3892
#17 [ffff881010da9f20] sys_mount at ffffffff811a3f20
#18 [ffff881010da9f80] system_call_fastpath at ffffffff8100b072
    RIP: 00002aaaab1d809a  RSP: 00007fffffff9068  RFLAGS: 00010202
    RAX: 00000000000000a5  RBX: ffffffff8100b072  RCX: 0000000001000000
    RDX: 000000000040b45f  RSI: 00007fffffffc0d8  RDI: 000000000061d450
    RBP: 0000000000000000   R8: 000000000061d470   R9: 0000000000000000
    R10: 0000000001000000  R11: 0000000000000202  R12: 0000000000610c98
    R13: 0000000000610c90  R14: 000000000061d470  R15: 0000000000000000
    ORIG_RAX: 00000000000000a5  CS: 0033  SS: 002b
Lustre: ctl-ackack-MDT0000: No data found on store. Initialize space
Lustre: srv-ackack-MDT0000: No data found on store. Initialize space
Lustre: ackack-MDT0000: Initializing new disk
LustreError: 11-0: ackack-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.
Lustre: ackack-MDT0000: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-450
LustreError: 91831:0:(dt_object.h:1301:dt_declare_insert()) ASSERTION( dt->do_index_ops ) failed: 
LustreError: 91831:0:(dt_object.h:1301:dt_declare_insert()) LBUG
Pid: 91831, comm: mount.lustre


 Comments   
Comment by Cliff White (Inactive) [ 29/Jul/13 ]

You were mounting the new MDT on the same server as the existing MDT? Did you set an --index value? Was the fsname the same?

Comment by Ned Bass [ 29/Jul/13 ]

fsname was different, i.e. I was creating a new filesystem for testing purposes. index was set to zero.

Comment by Cliff White (Inactive) [ 29/Jul/13 ]

Okay, just needed to clarify, thank you.

Comment by Keith Mannthey (Inactive) [ 30/Jul/13 ]

I did a quick test just now with a setup I had around (2.4.50).
The following worked just fine for me in my basic ldiskfs setup.

1 MGS
2 MDTs (testfs1 and testfs2, both index 0)
2 OSTs (one each)
2 mounted Lustre FSs (testfs1, testfs2).

I will retry with 2.4 GA ZFS tomorrow and dig into this issue some more.

Do you have any other details you can share?
Were you seeing any memory allocation issues from mkfs to mount time?

Comment by Ned Bass [ 30/Jul/13 ]

Hi Keith,

There were no memory allocation issues or other unusual messages. I do have a crash dump we can dig through. I tried to extract the lustre log from it but our crash plugin failed. I suspect the plugin needs a compatibility refresh.

We've actually run two production filesystems in this concurrent manner before, which is why I felt confident enough to do this. But I guess the Lustre gods decided to punish me for doing testing on a production resource.

Comment by Keith Mannthey (Inactive) [ 31/Jul/13 ]

I was not able to trigger the issue today with my basic setup with 2.4 GA ZFS.

I have started looking at a crashdump.

In general, the main area of interest is in llog_osd_declare_create:

        if (res->lgh_name) {
                struct dt_object *llog_dir;

                llog_dir = llog_osd_dir_get(env, res->lgh_ctxt);
                if (IS_ERR(llog_dir))
                        RETURN(PTR_ERR(llog_dir));
                logid_to_fid(&res->lgh_id, &lgi->lgi_fid);
                rc = dt_declare_insert(env, llog_dir,
                                       (struct dt_rec *)&lgi->lgi_fid,
                                       (struct dt_key *)res->lgh_name, th);
                lu_object_put(env, &llog_dir->do_lu);
                if (rc)
                        CERROR("%s: can't declare named llog %s: rc = %d\n",
                               o->do_lu.lo_dev->ld_obd->obd_name,
                               res->lgh_name, rc);
        }

In general we know llog_dir is not fully valid, and I am trying to sort out what could have happened.
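
For context, the assertion fires in the dt_declare_insert() inline wrapper in dt_object.h, which dispatches through the object's index operations. Roughly (paraphrased from the 2.4-era source; exact lines may differ):

static inline int dt_declare_insert(const struct lu_env *env,
                                    struct dt_object *dt,
                                    const struct dt_rec *rec,
                                    const struct dt_key *key,
                                    struct thandle *th)
{
        LASSERT(dt);
        /* the LBUG above: llog_dir was handed in with no index ops set up */
        LASSERT(dt->do_index_ops);
        LASSERT(dt->do_index_ops->dio_declare_insert);
        return dt->do_index_ops->dio_declare_insert(env, dt, rec, key, th);
}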

Comment by Keith Mannthey (Inactive) [ 01/Aug/13 ]

There are no error-time registers recorded, so progress is slower than I had anticipated.

Comment by Keith Mannthey (Inactive) [ 02/Aug/13 ]

I was able to identify res->lgh_obj / o, a dt_object for the ZFS osd_device. I am looking to see if I can get a pointer to more of the local data being used. It is not clear to me that this object becomes the struct dt_object *llog_dir that has the issue.

What is certain is that in the ZFS dt_object res->lgh_obj, do_index_ops = 0x0, so if it does become llog_dir it is simply never set; but as I said, it is not yet clear that is the case.

I opened an LU to track the need for dumping local registers at error time. It is a doable thing, and as a future improvement it would greatly speed up debugging issues like this.

Comment by Keith Mannthey (Inactive) [ 06/Aug/13 ]

static int llog_osd_declare_create(const struct lu_env *env,
                                   struct llog_handle *res, struct thandle *th)

The struct llog_handle *res has been found. Its name is "changelog_catalog"; the data structure is intact and leads to a wealth of valid information.

I am still unable to concretely identify the struct lu_env *env and struct thandle *th that were passed into the function.

Just knowing *res is helpful, but there are a lot of contextual look-ups derived from the lu_env and thandle.

It seems the core mystery is what happened in the function below:

struct dt_object *llog_osd_dir_get(const struct lu_env *env,  <==== we do not have this data. 
                                   struct llog_ctxt *ctxt)  <=== res->lgh_ctxt (we have this data)
{
        struct dt_device        *dt;
        struct dt_thread_info   *dti = dt_info(env);
        struct dt_object        *dir;
        int                      rc;

        dt = ctxt->loc_exp->exp_obd->obd_lvfs_ctxt.dt;
        if (ctxt->loc_dir == NULL) {                    <====== loc_dir was NULL and we took this path
                rc = dt_root_get(env, dt, &dti->dti_fid);
                if (rc)                                 <====== We know this did not happen or we would have hit an error handler before the panic. 
                        return ERR_PTR(rc);
                dir = dt_locate(env, dt, &dti->dti_fid);  <====== The call that returned a dt_object without do_index_ops set. Info is lost for all but the dt passed in.
        } else {
                lu_object_get(&ctxt->loc_dir->do_lu);
                dir = ctxt->loc_dir;
        }

        return dir;
}

I am not too sure how much more useful progress I can make on this, as I am not an expert in this area of code.

Mike, do you have any ideas?

Comment by Peter Jones [ 07/Aug/13 ]

Niu

Could you please help out on this one?

Thanks

Peter

Comment by Niu Yawei (Inactive) [ 08/Aug/13 ]

I think the problem could be that we missed calling dt_try_as_dir() in llog_osd_dir_get(). For an MDT that has an MGS on the same OSD, dt_try_as_dir() for the root is called during MGS filesystem setup, so that case is fine; but when we start an MDT without an MGS, there is a problem. I'll cook up a patch soon.
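
A minimal sketch of that idea in the ctxt->loc_dir == NULL branch of llog_osd_dir_get() (assuming the fix is to call dt_try_as_dir() on the located root object; the actual change is in the patch linked below and may differ):

        if (ctxt->loc_dir == NULL) {
                rc = dt_root_get(env, dt, &dti->dti_fid);
                if (rc)
                        return ERR_PTR(rc);
                dir = dt_locate(env, dt, &dti->dti_fid);

                /* sketch: initialize index ops on the root object so that
                 * later dt_declare_insert() callers find do_index_ops set */
                if (!IS_ERR(dir) && !dt_try_as_dir(env, dir)) {
                        lu_object_put(env, &dir->do_lu);
                        return ERR_PTR(-ENOTDIR);
                }
        } else {
                lu_object_get(&ctxt->loc_dir->do_lu);
                dir = ctxt->loc_dir;
        }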

Comment by Niu Yawei (Inactive) [ 08/Aug/13 ]

http://review.whamcloud.com/7267

Comment by Jodi Levi (Inactive) [ 23/Aug/13 ]

Reducing from blocker as patch landed to master.

Comment by Niu Yawei (Inactive) [ 03/Sep/13 ]

Should we cherry-pick it to b2_4? If not, I think this ticket can be closed.

Comment by Peter Jones [ 05/Sep/13 ]

If we do land it to 2.4.1, we will track this separately.
