Details
Bug
Resolution: Duplicate
Minor
None
Lustre 2.11.0
Ubuntu clients
3
9223372036854775807
Description
conf-sanity tests 0, 1, 2, 3, 4, 5a/b/c/d and many others fail with the following error when trying to shut down the file system:
stop mds service on onyx-50vm9
CMD: onyx-50vm9 grep -c /mnt/lustre-mds1' ' /proc/mounts || true
Stopping /mnt/lustre-mds1 (opts:-f) on onyx-50vm9
CMD: onyx-50vm9 umount -d -f /mnt/lustre-mds1
CMD: onyx-50vm9 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' || true
CMD: onyx-50vm6.onyx.hpdd.intel.com lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' || true
rmmod: ERROR: Module lustre is in use
conf-sanity test_0: @@@@@@ FAIL: cleanup failed with 203
Unmounting the OSTs and MDT appears to succeed, but the rmmod call on the client fails because the lustre module is still in use. The output above is from the suite_log for https://testing.hpdd.intel.com/test_sets/5d846520-287c-11e8-9e0e-52540065bddc.
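When rmmod reports that a module is in use, a first check is whether the module still has a nonzero use count and whether any Lustre mounts remain. The following is a minimal sketch, not part of the test framework: the helper names `module_refcount` and `lustre_mounts` are hypothetical, and the helpers only parse text in the standard `lsmod` and `/proc/mounts` formats.

```shell
# Hypothetical helper: print the "Used by" count (third column) for a
# module, reading lsmod-format text on stdin. Prints nothing if the
# module is not listed.
module_refcount() {
    awk -v mod="$1" '$1 == mod { print $3 }'
}

# Hypothetical helper: print the mount points whose filesystem type is
# "lustre", reading /proc/mounts-format text on stdin.
lustre_mounts() {
    awk '$3 == "lustre" { print $2 }'
}

# On a live client these would be fed from the real sources:
#   lsmod | module_refcount lustre
#   lustre_mounts < /proc/mounts
```

A nonzero count with no remaining Lustre mounts would suggest that something other than a mounted file system is holding a reference to the module.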
Looking at the client console (vm6), we see:
Ubuntu 16.04.2 LTS trevis-4vm3.trevis.hpdd.intel.com ttyS0
trevis-4vm3 login: [ 8.165539] audit: type=1400 audit(1521039871.976:11): apparmor="ALLOWED" operation="open" profile="/usr/sbin/sssd" name="/etc/gss/mech.d/" pid=547 comm="sssd_be" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
[ 81.009062] random: nonblocking pool is initialized
[ 138.440162] libcfs: module verification failed: signature and/or required key missing - tainting kernel
We don’t see this during RHEL 7 testing.
In the client dmesg log, we see an error:
[24874.129276] Lustre: DEBUG MARKER: grep -c /mnt/lustre' ' /proc/mounts
[24874.137005] Lustre: DEBUG MARKER: lsof -t /mnt/lustre
[24880.900494] LustreError: 167-0: lustre-MDT0000-mdc-ffff880061827800: This client was evicted by lustre-MDT0000; in progress operations using this service will fail.
[24880.902374] LustreError: 8353:0:(file.c:4213:ll_inode_revalidate_fini()) lustre: revalidate FID [0x200000007:0x1:0x0] error: rc = -5
[24880.905171] Lustre: lustre-MDT0000-mdc-ffff880061827800: Connection restored to 10.2.9.244@tcp (at 10.2.9.244@tcp)
[24881.073111] Lustre: DEBUG MARKER: umount /mnt/lustre 2>&1
[24881.110161] Lustre: Unmounted lustre-client
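To compare eviction times across the failing runs, the eviction messages can be pulled out of each dmesg log. This is a rough sketch that assumes the message format shown above; the function name `evictions` is hypothetical. Whether the eviction is related to the rmmod failure is not established here.

```shell
# Hypothetical helper: for each "This client was evicted by ..." message
# in dmesg-format text on stdin, print the kernel timestamp and the name
# of the evicting target.
evictions() {
    sed -n 's/^\[ *\([0-9.]*\)\] LustreError: 167-0: [^:]*: This client was evicted by \([^;]*\);.*/\1 \2/p'
}

# On a live client this would be fed from the kernel log:
#   dmesg | evictions
```

For the log above, this prints the timestamp 24880.900494 and the target lustre-MDT0000, about 0.2 seconds before the client umount is issued.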
So far, this issue has been seen only when testing Ubuntu clients; the first failure was on 2018-02-27 22:03:52 UTC.
Logs for the failures are at:
https://testing.hpdd.intel.com/test_sets/f75808be-1cb5-11e8-a7cd-52540065bddc
https://testing.hpdd.intel.com/test_sets/4aeef8ce-1de8-11e8-bd91-52540065bddc
https://testing.hpdd.intel.com/test_sets/cf7f2d1a-1f29-11e8-b046-52540065bddc
https://testing.hpdd.intel.com/test_sets/a268caba-2894-11e8-b3c6-52540065bddc