Details
Bug
Resolution: Duplicate
Minor
None
Lustre 2.12.0
Ubuntu 18.04 clients
3
Description
No sanity-hsm tests run because the clients fail to mount the Lustre file system. The sanity-dom and racer test suites hit the same problem. So far, we have only seen this in Ubuntu 18.04 client testing.
Looking at the client test_log for https://testing.whamcloud.com/test_sets/d791f818-fdd4-11e8-a97c-52540065bddc , we see that neither of the two clients is able to mount the file system:
-----============= acceptance-small: sanity-hsm ============----- Tue Dec 11 00:37:27 UTC 2018
Running: bash /usr/lib64/lustre/tests/sanity-hsm.sh
CMD: trevis-19vm4 /usr/sbin/lctl get_param -n version 2>/dev/null || /usr/sbin/lctl lustre_build_version 2>/dev/null || /usr/sbin/lctl --version 2>/dev/null | cut -d' ' -f2
Starting client trevis-19vm1.trevis.whamcloud.com,trevis-19vm2: -o user_xattr,flock trevis-19vm4@tcp:/lustre /mnt/lustre2
CMD: trevis-19vm1.trevis.whamcloud.com,trevis-19vm2 running=\$(mount | grep -c /mnt/lustre2' '); rc=0; if [ \$running -eq 0 ] ; then mkdir -p /mnt/lustre2; mount -t lustre -o user_xattr,flock trevis-19vm4@tcp:/lustre /mnt/lustre2; rc=\$?; fi; exit \$rc
trevis-19vm1: mount.lustre: mount trevis-19vm4@tcp:/lustre at /mnt/lustre2 failed: File exists
trevis-19vm2: mount.lustre: mount trevis-19vm4@tcp:/lustre at /mnt/lustre2 failed: File exists
Both clients have the following stack traces in their console logs:
[ 362.502293] sysfs: cannot create duplicate filename '/devices/virtual/bdi/lustre- (ptrval)'
[ 362.503296] WARNING: CPU: 1 PID: 2060 at /build/linux-wuhukg/linux-4.15.0/fs/sysfs/dir.c:31 sysfs_warn_dup+0x56/0x70
[ 362.504411] Modules linked in: mgc(OE) lustre(OE) lmv(OE) mdc(OE) fid(OE) osc(OE) lov(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd joydev input_leds mac_hid serio_raw sch_fq_codel sunrpc ip_tables x_tables autofs4 psmouse virtio_blk floppy i2c_piix4 8139too 8139cp pata_acpi mii
[ 362.508527] CPU: 1 PID: 2060 Comm: mount.lustre Tainted: G OE 4.15.0-32-generic #35-Ubuntu
[ 362.509518] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 362.510149] RIP: 0010:sysfs_warn_dup+0x56/0x70
[ 362.510657] RSP: 0018:ffffafd10081f970 EFLAGS: 00010286
[ 362.511235] RAX: 0000000000000000 RBX: ffff97c0f4a54000 RCX: 0000000000000006
[ 362.512039] RDX: 0000000000000007 RSI: 0000000000000082 RDI: ffff97c13fd16490
[ 362.512830] RBP: ffffafd10081f988 R08: 0000000000000000 R09: 000000000000020e
[ 362.513597] R10: 0000000000000001 R11: ffffffff8471ca60 R12: ffff97c138186880
[ 362.514358] R13: ffff97c13298fdd0 R14: ffff97c135b31000 R15: 0000000000000000
[ 362.515125] FS: 00007fcb13183740(0000) GS:ffff97c13fd00000(0000) knlGS:0000000000000000
[ 362.515988] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 362.516616] CR2: 00005616fd546338 CR3: 000000007861a003 CR4: 00000000000606e0
[ 362.517389] Call Trace:
[ 362.517716] sysfs_create_dir_ns+0x77/0x90
[ 362.518202] kobject_add_internal+0xac/0x2b0
[ 362.518697] kobject_add+0x71/0xd0
[ 362.519104] ? _cond_resched+0x19/0x40
[ 362.519556] device_add+0x12c/0x680
[ 362.519973] device_create_groups_vargs+0xe4/0xf0
[ 362.520515] device_create_vargs+0x16/0x20
[ 362.520996] bdi_register_va.part.11+0x28/0x190
[ 362.521532] bdi_register_va+0x1b/0x20
[ 362.521978] super_setup_bdi_name+0x87/0xe0
[ 362.522485] ll_fill_super+0x1ce/0x1230 [lustre]
[ 362.523035] ? lustre_start_mgc+0x30e/0x2710 [obdclass]
[ 362.523664] ? libcfs_debug_msg+0x50/0x70 [libcfs]
[ 362.524209] ? libcfs_debug_msg+0x50/0x70 [libcfs]
[ 362.524769] lustre_fill_super+0x98d/0x2a10 [obdclass]
[ 362.525345] ? sget_userns+0x419/0x490
[ 362.525788] ? sget+0x7d/0xa0
[ 362.526171] ? lustre_common_put_super+0xbe0/0xbe0 [obdclass]
[ 362.526818] mount_nodev+0x4f/0xa0
[ 362.527237] lustre_mount+0x38/0x50 [obdclass]
[ 362.527752] mount_fs+0x37/0x150
[ 362.528140] vfs_kern_mount.part.23+0x5d/0x110
[ 362.528652] do_mount+0x5ed/0xce0
[ 362.529047] ? copy_mount_options+0x2c/0x220
[ 362.529538] SyS_mount+0x98/0xe0
[ 362.529963] do_syscall_64+0x73/0x130
[ 362.530405] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 362.530976] RIP: 0033:0x7fcb122823ca
[ 362.531402] RSP: 002b:00007ffeaea5d1c8 EFLAGS: 00000286 ORIG_RAX: 00000000000000a5
[ 362.532216] RAX: ffffffffffffffda RBX: 00005616fd543f10 RCX: 00007fcb122823ca
[ 362.532983] RDX: 00005616fb57d005 RSI: 00007ffeaea5d238 RDI: 00005616fd542260
[ 362.533753] RBP: 00007ffeaea5d238 R08: 00005616fd543f10 R09: 0000000000000001
[ 362.534519] R10: 0000000001000000 R11: 0000000000000286 R12: 0000000000000000
[ 362.535281] R13: 00000000fffffff5 R14: 0000000000000000 R15: 00005616fb7895e0
[ 362.536054] Code: 85 c0 48 89 c3 74 12 b9 00 10 00 00 48 89 c2 31 f6 4c 89 ef e8 ac c6 ff ff 4c 89 e2 48 89 de 48 c7 c7 00 5c 2f 85 e8 ea 76 d8 ff <0f> 0b 48 89 df e8 80 09 f4 ff 5b 41 5c 41 5d 5d c3 66 0f 1f 84
[ 362.537994] ---[ end trace 39c378564360064a ]---
[ 362.538553] ------------[ cut here ]------------
[ 362.539325] kobject_add_internal failed for lustre- (ptrval) with -EEXIST, don't try to register things with the same name in the same directory.
[ 362.540764] WARNING: CPU: 1 PID: 2060 at /build/linux-wuhukg/linux-4.15.0/lib/kobject.c:240 kobject_add_internal+0x26e/0x2b0
[ 362.541940] Modules linked in: mgc(OE) lustre(OE) lmv(OE) mdc(OE) fid(OE) osc(OE) lov(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd joydev input_leds mac_hid serio_raw sch_fq_codel sunrpc ip_tables x_tables autofs4 psmouse virtio_blk floppy i2c_piix4 8139too 8139cp pata_acpi mii
[ 362.546100] CPU: 1 PID: 2060 Comm: mount.lustre Tainted: G W OE 4.15.0-32-generic #35-Ubuntu
[ 362.547086] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 362.547719] RIP: 0010:kobject_add_internal+0x26e/0x2b0
[ 362.548286] RSP: 0018:ffffafd10081f9c0 EFLAGS: 00010282
[ 362.548867] RAX: 0000000000000000 RBX: ffff97c135b31010 RCX: 0000000000000006
[ 362.549677] RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffff97c13fd16490
[ 362.550438] RBP: ffffafd10081f9f0 R08: 0000000000000000 R09: 0000000000000243
[ 362.551197] R10: ffffdbedc0d29400 R11: ffffffff8471ca60 R12: ffff97c13290c2a0
[ 362.551964] R13: 00000000ffffffef R14: ffff97c135b31000 R15: 0000000000000000
[ 362.552736] FS: 00007fcb13183740(0000) GS:ffff97c13fd00000(0000) knlGS:0000000000000000
[ 362.553598] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 362.554240] CR2: 00005616fd546338 CR3: 000000007861a003 CR4: 00000000000606e0
[ 362.555015] Call Trace:
[ 362.555327] kobject_add+0x71/0xd0
[ 362.555736] ? _cond_resched+0x19/0x40
[ 362.556172] device_add+0x12c/0x680
[ 362.556590] device_create_groups_vargs+0xe4/0xf0
[ 362.557120] device_create_vargs+0x16/0x20
[ 362.557594] bdi_register_va.part.11+0x28/0x190
[ 362.558104] bdi_register_va+0x1b/0x20
[ 362.558549] super_setup_bdi_name+0x87/0xe0
[ 362.559041] ll_fill_super+0x1ce/0x1230 [lustre]
[ 362.559588] ? lustre_start_mgc+0x30e/0x2710 [obdclass]
[ 362.560174] ? libcfs_debug_msg+0x50/0x70 [libcfs]
[ 362.560720] ? libcfs_debug_msg+0x50/0x70 [libcfs]
[ 362.561276] lustre_fill_super+0x98d/0x2a10 [obdclass]
[ 362.561886] ? sget_userns+0x419/0x490
[ 362.562325] ? sget+0x7d/0xa0
[ 362.562711] ? lustre_common_put_super+0xbe0/0xbe0 [obdclass]
[ 362.563346] mount_nodev+0x4f/0xa0
[ 362.563770] lustre_mount+0x38/0x50 [obdclass]
[ 362.564276] mount_fs+0x37/0x150
[ 362.564668] vfs_kern_mount.part.23+0x5d/0x110
[ 362.565170] do_mount+0x5ed/0xce0
[ 362.565575] ? copy_mount_options+0x2c/0x220
[ 362.566061] SyS_mount+0x98/0xe0
[ 362.566452] do_syscall_64+0x73/0x130
[ 362.566883] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 362.567451] RIP: 0033:0x7fcb122823ca
[ 362.567868] RSP: 002b:00007ffeaea5d1c8 EFLAGS: 00000286 ORIG_RAX: 00000000000000a5
[ 362.568686] RAX: ffffffffffffffda RBX: 00005616fd543f10 RCX: 00007fcb122823ca
[ 362.569463] RDX: 00005616fb57d005 RSI: 00007ffeaea5d238 RDI: 00005616fd542260
[ 362.570224] RBP: 00007ffeaea5d238 R08: 00005616fd543f10 R09: 0000000000000001
[ 362.570990] R10: 0000000001000000 R11: 0000000000000286 R12: 0000000000000000
[ 362.571762] R13: 00000000fffffff5 R14: 0000000000000000 R15: 00005616fb7895e0
[ 362.572530] Code: 49 89 c4 48 85 ff 0f 84 41 fe ff ff 48 83 c7 18 e9 fc fd ff ff 48 8b 13 48 c7 c6 f0 e3 11 85 48 c7 c7 f0 09 3a 85 e8 72 3f 71 ff <0f> 0b e9 8f fe ff ff 0f 0b eb a5 0f 0b eb 98 48 89 de 48 c7 c7
[ 362.574470] ---[ end trace 39c378564360064b ]---
[ 362.576724] Lustre: Unmounted lustre-client
[ 362.577624] LustreError: 2060:0:(obd_mount.c:1608:lustre_fill_super()) Unable to mount (-17)
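The three symptoms above line up: the "Unable to mount (-17)" return code, the "-EEXIST" in the kobject warning, and the "File exists" message from mount.lustre all refer to errno 17 on Linux. The following is a minimal triage sketch in Python, not part of the Lustre test framework; the function name summarize_mount_failure and the two regular expressions are assumptions made for illustration, and the sample text is copied from the console log above.

```python
import errno
import re

# Matches the LustreError line that reports the failed mount's return code.
MOUNT_RC = re.compile(r"Unable to mount \((-?\d+)\)")
# Matches the sysfs warning that names the path which already existed.
SYSFS_DUP = re.compile(r"sysfs: cannot create duplicate filename '([^']+)'")


def summarize_mount_failure(console_log):
    """Return (errno name, duplicate sysfs path); either may be None."""
    rc = MOUNT_RC.search(console_log)
    dup = SYSFS_DUP.search(console_log)
    # Kernel return codes are negative errno values, so take the magnitude.
    name = errno.errorcode.get(abs(int(rc.group(1)))) if rc else None
    path = dup.group(1) if dup else None
    return name, path


# Two lines taken from the console log in this ticket.
sample = (
    "[ 362.502293] sysfs: cannot create duplicate filename "
    "'/devices/virtual/bdi/lustre- (ptrval)'\n"
    "[ 362.577624] LustreError: 2060:0:(obd_mount.c:1608:lustre_fill_super()) "
    "Unable to mount (-17)\n"
)

print(summarize_mount_failure(sample))
```

Running a helper like this over the console logs of other failed sessions could help confirm whether they show the same duplicate bdi name before marking them as instances of this issue.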
All of these failures take place after a node-reset/lustre-initialization.
There are several failures like this, for example at:
https://testing.whamcloud.com/test_sets/04303b9a-fa11-11e8-8a18-52540065bddc
https://testing.whamcloud.com/test_sets/64026ecc-fdd5-11e8-93ea-52540065bddc