Details
- Bug
- Resolution: Fixed
- Minor
- Lustre 2.12.0
- Ubuntu 18.04 clients/RHEL 7.6 servers
- 3
- 9223372036854775807
Description
conf-sanity test_28A hangs because of a client problem mounting the file system. We have seen this issue only when testing with Ubuntu 18.04 clients.
For the test suite hang at https://testing.whamcloud.com/test_sets/d8edfc48-fdd4-11e8-a97c-52540065bddc, the client test_log is empty. The Client 1 (vm1) console log shows the issue:
[ 2626.155389] Lustre: DEBUG MARKER: == conf-sanity test 28A: permanent parameter setting ================================================= 02:34:15 (1544495655)
[ 2626.772885] Lustre: Lustre: Build Version: 2.12.0_RC2
[ 2626.836890] LNet: Added LNI 10.9.4.224@tcp [8/256/0/180]
[ 2626.842963] LNet: Accept all, port 7988
[ 2628.417559] Lustre: 3169:0:(gss_svc_upcall.c:1199:gss_init_svc_upcall()) Init channel is not opened by lsvcgssd, following request might be dropped until lsvcgssd is active
[ 2628.419207] Key type lgssc registered
[ 2628.519146] Lustre: Echo OBD driver; http://www.lustre.org/
[ 2639.110774] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre
[ 2639.121362] Lustre: DEBUG MARKER: mount -t lustre -o user_xattr,flock trevis-19vm4@tcp:/lustre /mnt/lustre
[ 2650.553084] Lustre: Mounted lustre-client
[ 2651.652270] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n llite.lustre-*.max_read_ahead_whole_mb
[ 2651.665867] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n llite.lustre-*.max_read_ahead_whole_mb
[ 2652.000321] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n llite.lustre-*.max_read_ahead_whole_mb
[ 2652.011374] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n llite.lustre-*.max_read_ahead_whole_mb
[ 2653.024019] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n llite.lustre-*.max_read_ahead_whole_mb
[ 2653.764392] BUG: unable to handle kernel paging request at 0000000080d48269
[ 2653.765275] IP: class_process_config+0x1cf8/0x27b0 [obdclass]
[ 2653.765911] PGD 0 P4D 0
[ 2653.766218] Oops: 0000 [#1] SMP PTI
[ 2653.766619] Modules linked in: lustre(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd input_leds joydev mac_hid serio_raw sch_fq_codel sunrpc ip_tables x_tables autofs4 psmouse virtio_blk floppy 8139too 8139cp mii pata_acpi i2c_piix4 [last unloaded: libcfs]
[ 2653.771057] CPU: 1 PID: 3729 Comm: llog_process_th Tainted: G W OE 4.15.0-32-generic #35-Ubuntu
[ 2653.772039] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 2653.772668] RIP: 0010:class_process_config+0x1cf8/0x27b0 [obdclass]
[ 2653.773327] RSP: 0018:ffffb4fd4247bc58 EFLAGS: 00010246
[ 2653.773892] RAX: 0000000080d47e61 RBX: ffff9d7f744c0880 RCX: 0000000000000000
[ 2653.774624] RDX: 0000000000000018 RSI: ffffffffc0768d04 RDI: ffff9d7f744c08c6
[ 2653.775363] RBP: ffffb4fd4247bd08 R08: 00000000ffffffff R09: 0000000000000024
[ 2653.776103] R10: ffffffffc07594b0 R11: f000000000000000 R12: ffffffffc0759680
[ 2653.776838] R13: ffffffffc0761b50 R14: 0000000000000000 R15: ffffb4fd4247bd40
[ 2653.777571] FS: 0000000000000000(0000) GS:ffff9d7fbfd00000(0000) knlGS:0000000000000000
[ 2653.778413] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2653.779026] CR2: 0000000080d48269 CR3: 0000000010a0a002 CR4: 00000000000606e0
[ 2653.779769] Call Trace:
[ 2653.780100] ? libcfs_debug_msg+0x50/0x70 [libcfs]
[ 2653.780624] ? libcfs_debug_msg+0x50/0x70 [libcfs]
[ 2653.781174] class_config_llog_handler+0x7cb/0x14c0 [obdclass]
[ 2653.781828] llog_process_thread+0x651/0x1580 [obdclass]
[ 2653.782413] llog_process_thread_daemonize+0x9f/0xe0 [obdclass]
[ 2653.783071] kthread+0x121/0x140
[ 2653.783461] ? llog_backup+0x4d0/0x4d0 [obdclass]
[ 2653.783981] ? kthread_create_worker_on_cpu+0x70/0x70
[ 2653.784534] ret_from_fork+0x35/0x40
[ 2653.784951] Code: 8b 40 38 48 85 c0 0f 84 e5 08 00 00 48 89 da be 20 00 00 00 4c 89 ef e8 b7 41 ed c9 41 89 c4 e9 d9 f4 ff ff 48 8b 85 58 ff ff ff <48> 8b 80 08 04 00 00 48 8b 50 10 81 3a 03 bd ac bd 0f 85 87 06
[ 2653.786833] RIP: class_process_config+0x1cf8/0x27b0 [obdclass] RSP: ffffb4fd4247bc58
[ 2653.787628] CR2: 0000000080d48269
[ 0.439151] Kernel panic - not syncing: Out of memory and no killable processes...
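As a reading aid for the oops above, the following is a minimal sketch that checks how the faulting address relates to the register dump. It assumes the bytes starting at the `<48>` marker in the Code line encode `mov rax, [rax + disp32]` (REX.W, opcode 8B, ModRM 0x80), and it uses the RAX and CR2 values copied from the log. The names `OOPS`, `FAULTING_BYTES`, and `decode_mov_rax_disp32` are made up for this illustration. This is an interpretation of the log, not a confirmed root cause.

```python
# Values copied from the register dump in the console log above.
OOPS = {
    "RAX": 0x0000000080D47E61,
    "CR2": 0x0000000080D48269,  # address that triggered the page fault
}

# Bytes from the "Code:" line, starting at the faulting instruction (<48>).
FAULTING_BYTES = bytes.fromhex("48 8b 80 08 04 00 00")


def decode_mov_rax_disp32(code: bytes) -> int:
    """Return the 32-bit displacement of `mov rax, [rax + disp32]`.

    Only this one encoding (48 8B 80 <disp32>) is handled; this is not a
    general x86 decoder.
    """
    if code[:3] != bytes([0x48, 0x8B, 0x80]):
        raise ValueError("not the expected mov rax, [rax+disp32] encoding")
    return int.from_bytes(code[3:7], "little", signed=True)


disp = decode_mov_rax_disp32(FAULTING_BYTES)
fault_addr = OOPS["RAX"] + disp

# If the assumption holds, RAX + displacement equals CR2.
print(hex(disp), hex(fault_addr), fault_addr == OOPS["CR2"])
```

Under that assumption, RAX plus the 0x408 displacement matches CR2, so the fault appears to come from dereferencing the value in RAX at offset 0x408. RAX (0000000080d47e61) does not look like a kernel virtual address, unlike the ffff... pointers in the other registers, which would be consistent with class_process_config following an invalid pointer.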
Logs for other hangs are at:
https://testing.whamcloud.com/test_sets/c185cb70-f713-11e8-b67f-52540065bddc
https://testing.whamcloud.com/test_sets/b5f76e48-f778-11e8-b67f-52540065bddc
https://testing.whamcloud.com/test_sets/05386eea-fa11-11e8-8a18-52540065bddc