Details
Bug
Resolution: Fixed
Major
Lustre 2.4.0
None
single node with 4 or 8 cores
3
4390
Description
Two of my nodes running sanity in a loop crashed overnight with the LBUG below.
I have a crash dump from one of the occurrences if anybody needs something from it, but tell me soon, while I still have the modules and vmlinux for it.
[ 4343.790166] LDISKFS-fs (loop0): mounted filesystem with ordered data mode. quota=on. Opts:
[ 4343.832370] Lustre: MGC192.168.10.210@tcp: Reactivating import
[ 4343.834877] Lustre: Found index 0 for lustre-MDT0000, updating log
[ 4343.841331] Lustre: Modifying parameter lustre-MDT0000-mdtlov.lov.stripesize in log lustre-MDT0000
[ 4343.847226] LustreError: 20504:0:(mgc_request.c:248:do_config_log_add()) failed processing sptlrpc log: -2
[ 4343.874879] Lustre: lustre-MDT0000: used disk, loading
[ 4343.875534] LustreError: 20552:0:(sec_config.c:1024:sptlrpc_target_local_copy_conf()) missing llog context
[ 4344.061652] Lustre: 20552:0:(mdt_lproc.c:418:lprocfs_wr_identity_upcall()) lustre-MDT0000: identity upcall set to /home/green/git/lustre-release/lustre/utils/l_getidentity
[ 4344.072933] Lustre: lustre-MDT0000: Temporarily refusing client connection from 0@lo
[ 4344.073894] LustreError: 11-0: an error occurred while communicating with 0@lo. The mds_connect operation failed with -11
[ 4344.080580] LustreError: 20504:0:(mdd_device.c:219:changelog_user_init_cb()) ASSERTION( rec->cur_hdr.lrh_type == CHANGELOG_USER_REC ) failed:
[ 4344.081572] LustreError: 20504:0:(mdd_device.c:219:changelog_user_init_cb()) LBUG
[ 4344.082363] Pid: 20504, comm: mount.lustre
[ 4344.082776]
[ 4344.082776] Call Trace:
[ 4344.083461] [<ffffffffa0b24915>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[ 4344.083980] [<ffffffffa0b24f27>] lbug_with_loc+0x47/0xb0 [libcfs]
[ 4344.084564] [<ffffffffa05a9d57>] changelog_user_init_cb+0x127/0x170 [mdd]
[ 4344.086120] [<ffffffffa044b568>] llog_reverse_process+0x5d8/0x9c0 [obdclass]
[ 4344.086707] [<ffffffffa05a9c30>] ? changelog_user_init_cb+0x0/0x170 [mdd]
[ 4344.087253] [<ffffffffa044e18e>] llog_cat_reverse_process_cb+0x17e/0x260 [obdclass]
[ 4344.088137] [<ffffffffa044b568>] llog_reverse_process+0x5d8/0x9c0 [obdclass]
[ 4344.088699] [<ffffffffa044e010>] ? llog_cat_reverse_process_cb+0x0/0x260 [obdclass]
[ 4344.089540] [<ffffffffa044da30>] ? cat_cancel_cb+0x0/0x5e0 [obdclass]
[ 4344.090117] [<ffffffffa044cdd8>] llog_cat_reverse_process+0x78/0x260 [obdclass]
[ 4344.090802] [<ffffffffa05a9c30>] ? changelog_user_init_cb+0x0/0x170 [mdd]
[ 4344.091060] [<ffffffffa044ca54>] ? llog_process+0x14/0x20 [obdclass]
[ 4344.091301] [<ffffffffa05af69a>] mdd_prepare+0xe2a/0x1140 [mdd]
[ 4344.091948] [<ffffffffa0c147da>] mdt_prepare+0x5a/0x14a0 [mdt]
[ 4344.092209] [<ffffffffa04a2ade>] server_start_targets+0x147e/0x1d90 [obdclass]
[ 4344.092599] [<ffffffffa048e050>] ? class_config_llog_handler+0x0/0x1800 [obdclass]
[ 4344.092967] [<ffffffffa04a4798>] lustre_fill_super+0x13a8/0x1af0 [obdclass]
[ 4344.093195] [<ffffffff8117d060>] ? set_anon_super+0x0/0x110
[ 4344.093411] [<ffffffffa04a33f0>] ? lustre_fill_super+0x0/0x1af0 [obdclass]
[ 4344.093660] [<ffffffff8117e4cf>] get_sb_nodev+0x5f/0xa0
[ 4344.093874] [<ffffffffa048f955>] lustre_get_sb+0x25/0x30 [obdclass]
[ 4344.094090] [<ffffffff8117e12b>] vfs_kern_mount+0x7b/0x1b0
[ 4344.094293] [<ffffffff8117e2d2>] do_kern_mount+0x52/0x130
[ 4344.094504] [<ffffffff8119c992>] do_mount+0x2d2/0x8c0
[ 4344.094714] [<ffffffff8119d010>] sys_mount+0x90/0xe0
[ 4344.094911] [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
[ 4344.095118]
[ 4344.096049] Kernel panic - not syncing: LBUG
[ 4344.096051] Pid: 20504, comm: mount.lustre Not tainted 2.6.32-debug #6
[ 4344.096053] Call Trace:
[ 4344.096059] [<ffffffff814f75e4>] ? panic+0xa0/0x168
[ 4344.096069] [<ffffffffa0b24f7b>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
[ 4344.096076] [<ffffffffa05a9d57>] ? changelog_user_init_cb+0x127/0x170 [mdd]
[ 4344.096092] [<ffffffffa044b568>] ? llog_reverse_process+0x5d8/0x9c0 [obdclass]
[ 4344.096097] [<ffffffffa05a9c30>] ? changelog_user_init_cb+0x0/0x170 [mdd]
[ 4344.096112] [<ffffffffa044e18e>] ? llog_cat_reverse_process_cb+0x17e/0x260 [obdclass]
[ 4344.096127] [<ffffffffa044b568>] ? llog_reverse_process+0x5d8/0x9c0 [obdclass]
[ 4344.096142] [<ffffffffa044e010>] ? llog_cat_reverse_process_cb+0x0/0x260 [obdclass]
[ 4344.096157] [<ffffffffa044da30>] ? cat_cancel_cb+0x0/0x5e0 [obdclass]
[ 4344.096171] [<ffffffffa044cdd8>] ? llog_cat_reverse_process+0x78/0x260 [obdclass]
[ 4344.096176] [<ffffffffa05a9c30>] ? changelog_user_init_cb+0x0/0x170 [mdd]
[ 4344.096191] [<ffffffffa044ca54>] ? llog_process+0x14/0x20 [obdclass]
[ 4344.096196] [<ffffffffa05af69a>] ? mdd_prepare+0xe2a/0x1140 [mdd]
[ 4344.096205] [<ffffffffa0c147da>] ? mdt_prepare+0x5a/0x14a0 [mdt]
[ 4344.096223] [<ffffffffa04a2ade>] ? server_start_targets+0x147e/0x1d90 [obdclass]
[ 4344.096241] [<ffffffffa048e050>] ? class_config_llog_handler+0x0/0x1800 [obdclass]
[ 4344.096258] [<ffffffffa04a4798>] ? lustre_fill_super+0x13a8/0x1af0 [obdclass]
[ 4344.096260] [<ffffffff8117d060>] ? set_anon_super+0x0/0x110
[ 4344.096276] [<ffffffffa04a33f0>] ? lustre_fill_super+0x0/0x1af0 [obdclass]
[ 4344.096278] [<ffffffff8117e4cf>] ? get_sb_nodev+0x5f/0xa0
[ 4344.096295] [<ffffffffa048f955>] ? lustre_get_sb+0x25/0x30 [obdclass]
[ 4344.096297] [<ffffffff8117e12b>] ? vfs_kern_mount+0x7b/0x1b0
[ 4344.096299] [<ffffffff8117e2d2>] ? do_kern_mount+0x52/0x130
[ 4344.096301] [<ffffffff8119c992>] ? do_mount+0x2d2/0x8c0
[ 4344.096303] [<ffffffff8119d010>] ? sys_mount+0x90/0xe0
[ 4344.096306] [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b