Details
- Bug
- Resolution: Duplicate
- Minor
- Lustre 2.1.0
- None
- A local CentOS 5 i686 VM, with a separate MGS device.
- 3
- 7879
Description
Commit: adc0fa37a44fce26e4c161176612c3c360a4dfbf
I was trying to mount Lustre with a separate MGS device on my VM:
[root@h221f tests]# MGSDEV=/tmp/lustre-mgs ./llmount.sh
Stopping clients: h221f /mnt/lustre (opts:)
Stopping clients: h221f /mnt/lustre2 (opts:)
Loading modules from /root/lustre-release/lustre/tests/..
debug=0x33f0404
subsystem_debug=0xffb7e3ff
../lnet/lnet/lnet options: 'networks=tcp(eth1) accept=all'
gss/krb5 is not supported
quota/lquota options: 'hash_lqs_cur_bits=3'
Formatting mgs, mds, osts
Permanent disk data:
Target: MGS
Index: unassigned
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x74 (MGS needs_index first_time update )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters:
formatting backing filesystem ldiskfs on /dev/loop0
target name MGS
4k blocks 50000
options -q -O uninit_bg,dir_nlink,huge_file,flex_bg -E lazy_journal_init -F
mkfs_cmd = mke2fs -j -b 4096 -L MGS -q -O uninit_bg,dir_nlink,huge_file,flex_bg -E lazy_journal_init -F /dev/loop0 50000
Writing CONFIGS/mountdata
Format mds1: /tmp/lustre-mdt1
Format ost1: /tmp/lustre-ost1
Format ost2: /tmp/lustre-ost2
Checking servers environments
Checking clients h221f environments
Loading modules from /root/lustre-release/lustre/tests/..
debug=0x33f0404
subsystem_debug=0xffb7e3ff
gss/krb5 is not supported
Setup mgs, mdt, osts
Starting mgs: -o loop,user_xattr,acl /tmp/lustre-mgs /mnt/mgs
Read from remote host 192.168.56.4: Connection reset by peer
Connection to 192.168.56.4 closed.
I'll keep the crash dump for a few days. From the crash log:
Lustre: 2828:0:(debug.c:323:libcfs_debug_str2mask()) You are trying to use a numerical value for the mask - this will be deprecated in a future release.
Lustre: OBD class driver, http://wiki.whamcloud.com/
Lustre: Lustre Version: 2.0.66
Lustre: Build Version: ../lustre/scripts--PRISTINE-2.6.18-238.12.1.el5.2943701
Lustre: Lustre LU module (e0eb6020).
Lustre: Added LNI 192.168.56.4@tcp [8/256/0/180]
Lustre: Accept all, port 988
Lustre: Lustre OSC module (e1157ee0).
Lustre: Lustre LOV module (e11f3e40).
init dynlocks cache
ldiskfs created from ext4-2.6-rhel5
Lustre: Lustre client module (e15a5be0).
LDISKFS-fs (loop0): warning: maximal mount count reached, running e2fsck is recommended
LDISKFS-fs (loop0): mounted filesystem with ordered data mode
LDISKFS-fs (loop0): warning: maximal mount count reached, running e2fsck is recommended
LDISKFS-fs (loop0): mounted filesystem with ordered data mode
LDISKFS-fs (loop0): warning: maximal mount count reached, running e2fsck is recommended
LDISKFS-fs (loop0): mounted filesystem with ordered data mode
LDISKFS-fs (loop0): warning: maximal mount count reached, running e2fsck is recommended
LDISKFS-fs (loop0): mounted filesystem with ordered data mode
Lustre: 3578:0:(debug.c:323:libcfs_debug_str2mask()) You are trying to use a numerical value for the mask - this will be deprecated in a future release.
Lustre: 3578:0:(debug.c:323:libcfs_debug_str2mask()) Skipped 1 previous similar message
LDISKFS-fs (loop0): mounted filesystem with ordered data mode
LDISKFS-fs (loop0): mounted filesystem with ordered data mode
Lustre: MGS MGS started
Lustre: 3702:0:(sec.c:1474:sptlrpc_import_sec_adapt()) import MGC192.168.56.4@tcp->MGC192.168.56.4@tcp_0 netid 90000: select flavor null
Lustre: 3727:0:(ldlm_lib.c:874:target_handle_connect()) MGS: connection from 739be4f1-ebe7-82f6-16d5-337bd19bdfcd@0@lo t0 exp 00000000 cur 1312188896 last 0
LustreError: 3727:0:(pack_generic.c:800:lustre_msghdr_get_flags()) ASSERTION(0) failed: incorrect message magic: 00000000
LustreError: 3727:0:(pack_generic.c:800:lustre_msghdr_get_flags()) LBUG
Pid: 3727, comm: ll_mgs_02
Call Trace:
[<00000000e0be15b0>] libcfs_debug_dumpstack+0x50/0x70 [libcfs]
[<00000000e0be1d4d>] lbug_with_loc+0x6d/0xd0 [libcfs]
[<00000000e0f40a00>] reply_in_callback+0x0/0x850 [ptlrpc]
[<00000000e0f37e22>] lustre_msghdr_get_flags+0x82/0x90 [ptlrpc]
[<00000000e0f40dc0>] reply_in_callback+0x3c0/0x850 [ptlrpc]
[<00000000e1203851>] ldiskfs_mark_iloc_dirty+0x341/0x560 [ldiskfs]
[<00000000e0f40a00>] reply_in_callback+0x0/0x850 [ptlrpc]
[<00000000e0f3f367>] ptlrpc_master_callback+0x47/0xa0 [ptlrpc]
[<00000000e0c33a0a>] lnet_enq_event_locked+0x5a/0xb0 [lnet]
[<00000000e0c33ad8>] lnet_finalize+0x78/0x200 [lnet]
[<00000000e0c42fcf>] lolnd_recv+0x5f/0x100 [lnet]
[<00000000e0c37e09>] lnet_ni_recv+0xf9/0x260 [lnet]
[<00000000e0c38059>] lnet_recv_put+0xe9/0x130 [lnet]
[<00000000e0c3e560>] lnet_parse+0x14e0/0x2620 [lnet]
[<00000000c048ca3d>] dput+0x72/0xed
[<00000000e0db3baf>] llog_free_handle+0x9f/0x330 [obdclass]
[<00000000c0490402>] mntput_no_expire+0x11/0x6a
[<00000000e0b914f5>] pop_ctxt+0xe5/0x320 [lvfs]
[<00000000e0dcb810>] __llog_ctxt_put+0x20/0x2e0 [obdclass]
[<00000000e0db5c82>] llog_close+0x72/0x440 [obdclass]
[<00000000e0c430b1>] lolnd_send+0x41/0x90 [lnet]
[<00000000e0c37c9b>] lnet_ni_send+0x4b/0xc0 [lnet]
[<00000000e0c3a04c>] lnet_send+0x1fc/0xd90 [lnet]
[<00000000e0dcb810>] __llog_ctxt_put+0x20/0x2e0 [obdclass]
[<00000000e0c40665>] LNetPut+0x565/0xef0 [lnet]
[<00000000e0f2d764>] ptl_send_buf+0x1f4/0xab0 [ptlrpc]
[<00000000e0f3ec66>] lustre_msg_set_timeout+0x96/0x110 [ptlrpc]
[<00000000e0f2e26c>] ptlrpc_send_reply+0x24c/0x8b0 [ptlrpc]
[<00000000e0ee0874>] target_send_reply+0x94/0x910 [ptlrpc]
[<00000000e0f3db6c>] lustre_msg_get_conn_cnt+0xfc/0x1e0 [ptlrpc]
[<00000000e124b51e>] mgs_handle+0x31e/0x1f10 [mgs]
[<00000000e0f3718c>] lustre_msg_get_opc+0x10c/0x1f0 [ptlrpc]
[<00000000e0f51b67>] ptlrpc_main+0x1217/0x27b0 [ptlrpc]
[<00000000c044cf34>] audit_syscall_exit+0x2d4/0x2ea
[<00000000e0f50950>] ptlrpc_main+0x0/0x27b0 [ptlrpc]
[<00000000c0405c87>] kernel_thread_helper+0x7/0x10
<IRQ> Kernel panic - not syncing: LBUG
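For anyone comparing this trace against the one in the linked duplicate, it can help to split the call trace into individual frames. The following is only a minimal sketch, assuming each frame has the form `[<address>] symbol+0xOFFSET/0xSIZE [module]` as in the log above; the names `FRAME_RE`, `parse_frames` and `SAMPLE` are hypothetical helpers written for this illustration and are not part of Lustre or its test suite.

```python
import re

# Assumed frame layout, taken from the crash log in this report:
#   [<00000000e0f37e22>] lustre_msghdr_get_flags+0x82/0x90 [ptlrpc]
# The trailing "[module]" is absent for symbols in the core kernel.
FRAME_RE = re.compile(
    r"\[<([0-9a-fA-F]+)>\]\s+"                   # return address
    r"(\w+)\+0x([0-9a-fA-F]+)/0x([0-9a-fA-F]+)"  # symbol+offset/size
    r"(?:\s+\[(\w+)\])?"                         # optional module name
)


def parse_frames(trace):
    """Return one dict per stack frame found in a (possibly flattened) trace."""
    frames = []
    for match in FRAME_RE.finditer(trace):
        addr, symbol, offset, size, module = match.groups()
        frames.append({
            "addr": int(addr, 16),
            "symbol": symbol,
            "offset": int(offset, 16),
            "size": int(size, 16),
            "module": module,  # None means the frame is in the core kernel
        })
    return frames


# Three frames copied from the trace above, deliberately left on one line
# to show that the parser does not depend on line breaks.
SAMPLE = (
    "[<00000000e0be1d4d>] lbug_with_loc+0x6d/0xd0 [libcfs] "
    "[<00000000e0f37e22>] lustre_msghdr_get_flags+0x82/0x90 [ptlrpc] "
    "[<00000000c048ca3d>] dput+0x72/0xed"
)

if __name__ == "__main__":
    for frame in parse_frames(SAMPLE):
        print(frame["module"] or "kernel", frame["symbol"], hex(frame["offset"]))
```

Because the pattern uses `finditer` rather than line-by-line matching, it works on the flattened log as well as on a trace with one frame per line.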
Issue Links
- duplicates: LU-539 small size for RMF_CONNECT_DATA caused out of bound memory crash (Resolved)