
ldlm_cli_enqueue and ll_inode_revalidate_fini LustreError messages on 2.4.1 clients

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.6.0
    • Affects Version/s: Lustre 2.4.1
    • None
    • 3
    • 12365

    Description

      From time to time, KIT are seeing ldlm_cli_enqueue and ll_inode_revalidate_fini LustreError messages. They mainly appear on login nodes of our clusters.

      Here is an example of the error messages:
      Dec 17 10:24:28 ic2n988 kernel: [512992.139174] LustreError: 11865:0:(mdc_locks.c:840:mdc_enqueue()) ldlm_cli_enqueue: -13
      Dec 17 10:24:28 ic2n988 kernel: [512992.139183] LustreError: 11865:0:(mdc_locks.c:840:mdc_enqueue()) Skipped 2 previous similar messages
      Dec 17 10:24:28 ic2n988 kernel: [512992.139202] LustreError: 11865:0:(file.c:2716:ll_inode_revalidate_fini()) pfs2wor1: revalidate FID [0x1d080001:0x86b7421d:0x0] error: rc = -13
      Dec 17 10:24:28 ic2n988 kernel: [512992.139208] LustreError: 11865:0:(file.c:2716:ll_inode_revalidate_fini()) Skipped 2 previous similar messages
      Dec 17 10:24:29 ic2n988 kernel: [512993.347645] LustreError: 13000:0:(file.c:2716:ll_inode_revalidate_fini()) pfs2wor1: revalidate FID [0x1d080001:0x86b7421d:0x0] error: rc = -13
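
      The trailing numbers in these messages are negated Linux errno values. As a minimal sketch (the helper name decode_rc and the regular expression are illustrative, not part of Lustre), the code can be translated with the Python standard library:

```python
import errno
import os
import re

# Matches the trailing code in "ldlm_cli_enqueue: -13" and "rc = -13".
# The pattern is an assumption based only on the sample lines above.
RC_PATTERN = re.compile(r"(?:rc = |ldlm_cli_enqueue: )(-\d+)\s*$")


def decode_rc(line):
    """Return (rc, errno_name, description) for a LustreError line, or None."""
    match = RC_PATTERN.search(line)
    if match is None:
        return None  # e.g. "Skipped 2 previous similar messages"
    rc = int(match.group(1))
    code = -rc  # kernel return codes are negated errno values
    return rc, errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


sample = (
    "LustreError: 11865:0:(file.c:2716:ll_inode_revalidate_fini()) "
    "pfs2wor1: revalidate FID [0x1d080001:0x86b7421d:0x0] error: rc = -13"
)
print(decode_rc(sample))
```

      For the lines above, rc = -13 corresponds to EACCES, i.e. a permission error.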

      The Lustre client in this case is at version 2.4.1 plus the patch for LU-3645. The servers are also at version 2.4.1.

      What do the messages mean, and how can we get rid of these error messages?

          Activity

            [LU-4522] ldlm_cli_enqueue and ll_inode_revalidate_fini LustreError messages on 2.4.1 clients

            adilger Andreas Dilger added a comment -

            Patch 8828 was landed to master for 2.6.0.

            orentas Oz Rentas (Inactive) added a comment -

            We can go ahead and close this one. Thanks.

            From the customer:
            The update from Andreas Dilger at 24/Jan/14 5:30 AM provided what we expected from this case: improving the error message or getting rid of unnecessary LustreError messages. Therefore, you can close this case.

            adilger Andreas Dilger added a comment -

            In Lustre 1.8 we used to return -EIDRM (Identifier Removed) so that it was clear this was an issue with missing users in the MDS user database, rather than some other kind of permission problem.

            We should also avoid printing these messages on the console, since they are just a distraction. I pushed http://review.whamcloud.com/8988 to quiet the client error messages for the common -EACCES error message.
            green Oleg Drokin added a comment -

            Could it be that the login nodes have some users that are not known to the MDS, similar to LU-4084 for example?
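
            One way to check this hypothesis is to compare the accounts visible on a login node with those visible on the MDS. The sketch below assumes that passwd-format dumps from both hosts are available as text (for example from getent passwd) and that the MDS resolves identities by numeric UID. The function names and the sample accounts are made up for illustration.

```python
def parse_passwd(text):
    """Map user name -> numeric UID from passwd-format text."""
    users = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        # passwd format: name:password:UID:GID:gecos:home:shell
        if len(fields) < 3 or not fields[2].isdigit():
            continue
        users[fields[0]] = int(fields[2])
    return users


def users_unknown_to_mds(client_passwd, mds_passwd):
    """Return sorted (name, uid) pairs whose UID exists on the client only."""
    mds_uids = set(parse_passwd(mds_passwd).values())
    client_users = parse_passwd(client_passwd)
    return sorted(
        (name, uid) for name, uid in client_users.items() if uid not in mds_uids
    )


# Hypothetical sample data, not taken from the affected systems.
login_node = """\
alice:x:1001:100::/home/alice:/bin/bash
bob:x:1002:100::/home/bob:/bin/bash
carol:x:1003:100::/home/carol:/bin/bash
"""
mds = """\
alice:x:1001:100::/home/alice:/bin/bash
bob:x:1002:100::/home/bob:/bin/bash
"""
print(users_unknown_to_mds(login_node, mds))
```

            Any account reported by such a comparison is a candidate for the permission errors described above when that user touches the file system.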

            People

              adilger Andreas Dilger
              orentas Oz Rentas (Inactive)