Lustre / LU-10827

conf-sanity test 0 fails with ‘rmmod: ERROR: Module lustre is in use’

Details

    • Bug
    • Resolution: Duplicate
    • Minor
    • None
    • Lustre 2.11.0
    • Ubuntu clients
    • 3
    • 9223372036854775807

    Description

      conf-sanity tests 0, 1, 2, 3, 4, 5a/b/c/d, and many others fail with the following error when trying to shut down the file system:

      stop mds service on onyx-50vm9
      CMD: onyx-50vm9 grep -c /mnt/lustre-mds1' ' /proc/mounts || true
      Stopping /mnt/lustre-mds1 (opts:-f) on onyx-50vm9
      CMD: onyx-50vm9 umount -d -f /mnt/lustre-mds1
      CMD: onyx-50vm9 lsmod | grep lnet > /dev/null &&
      lctl dl | grep ' ST ' || true
      CMD: onyx-50vm6.onyx.hpdd.intel.com lsmod | grep lnet > /dev/null &&
      lctl dl | grep ' ST ' || true
      rmmod: ERROR: Module lustre is in use
      conf-sanity test_0: @@@@@@ FAIL: cleanup failed with 203
      

      The output above is from the suite_log for https://testing.hpdd.intel.com/test_sets/5d846520-287c-11e8-9e0e-52540065bddc. Unmounting the OSTs and MDT seems to work, but the subsequent rmmod on the client fails.
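
      rmmod refuses to unload a module whose use count is nonzero, so one way to narrow this down is to read the "Used by" column of lsmod on the client right after the unmount. The following is only a minimal sketch: module_refcount is a hypothetical helper name, not part of the Lustre test framework, and it assumes the usual three-column lsmod layout (module, size, use count).

```shell
# Hypothetical helper (not part of the Lustre test suite).
# Reads lsmod-style text on stdin and prints the use count for module $1.
# Returns non-zero if the module is not listed at all.
module_refcount() {
    awk -v mod="$1" '
        $1 == mod { print $3; found = 1 }
        END { if (!found) exit 1 }
    '
}
```

      On a client this could be run as "lsmod | module_refcount lustre"; a result other than 0 after the client is unmounted would be consistent with the "Module lustre is in use" error.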

       

      Looking at the client console (vm6), we see:

      Ubuntu 16.04.2 LTS trevis-4vm3.trevis.hpdd.intel.com ttyS0
      
      trevis-4vm3 login: [    8.165539] audit: type=1400 audit(1521039871.976:11): apparmor="ALLOWED" operation="open" profile="/usr/sbin/sssd" name="/etc/gss/mech.d/" pid=547 comm="sssd_be" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
      [   81.009062] random: nonblocking pool is initialized
      [  138.440162] libcfs: module verification failed: signature and/or required key missing - tainting kernel
      

      We don’t see this during RHEL 7 testing.

       

      In the client dmesg log, we see an error:

      [24874.129276] Lustre: DEBUG MARKER: grep -c /mnt/lustre' ' /proc/mounts
      [24874.137005] Lustre: DEBUG MARKER: lsof -t /mnt/lustre
      [24880.900494] LustreError: 167-0: lustre-MDT0000-mdc-ffff880061827800: This client was evicted by lustre-MDT0000; in progress operations using this service will fail.
      [24880.902374] LustreError: 8353:0:(file.c:4213:ll_inode_revalidate_fini()) lustre: revalidate FID [0x200000007:0x1:0x0] error: rc = -5
      [24880.905171] Lustre: lustre-MDT0000-mdc-ffff880061827800: Connection restored to 10.2.9.244@tcp (at 10.2.9.244@tcp)
      [24881.073111] Lustre: DEBUG MARKER: umount /mnt/lustre 2>&1
      [24881.110161] Lustre: Unmounted lustre-client
      

       

      So far, this issue has been seen only when testing Ubuntu clients; it first appeared on 2018-02-27 22:03:52 UTC.

       

      Logs for the failures are at:

      https://testing.hpdd.intel.com/test_sets/f75808be-1cb5-11e8-a7cd-52540065bddc

      https://testing.hpdd.intel.com/test_sets/4aeef8ce-1de8-11e8-bd91-52540065bddc

      https://testing.hpdd.intel.com/test_sets/cf7f2d1a-1f29-11e8-b046-52540065bddc

      https://testing.hpdd.intel.com/test_sets/a268caba-2894-11e8-b3c6-52540065bddc

            People

              Assignee: Yang Sheng (ys)
              Reporter: James Nunez (jamesanunez, Inactive)