Lustre / LU-1597

Reads and Writes failing with -13 (-EACCES)

Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • Lustre 2.3.0, Lustre 2.1.4
    • None
    • None
    • Client: lustre-modules-2.1.1-13chaos_2.6.32_220.17.1.3chaos.ch5.x86_64.x86_64
      Server: lustre-modules-2.1.1-4chaos_2.6.32_220.7.1.7chaos.ch5.x86_64.x86_64
    • 3
    • 3982

    Description

      We're currently seeing a user's reads and writes fail with -13 (-EACCES) errors. The errors come from a set of clients in a single cluster, but they affect multiple different filesystems. From what I can tell, the -EACCES is returned from this part of the server code:

      filter_capa.c:
      
      138         if (capa == NULL) {
      139                 if (fid)
      140                         CERROR("seq/fid/opc "LPU64"/"DFID"/"LPX64
      141                                ": no capability has been passed\n",
      142                                seq, PFID(fid), opc);
      143                 else
      144                         CERROR("seq/opc "LPU64"/"LPX64
      145                                ": no capability has been passed\n",
      146                                seq, opc);
      147                 RETURN(-EACCES);
      148         }
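
To make the failing branch concrete, here is a minimal standalone sketch of the NULL-capability check quoted above. It is not the Lustre code: `struct demo_capa` and `demo_auth_capa()` are hypothetical names invented for illustration, and only the `capa == NULL` path is modeled. It shows that the value handed back is `-EACCES`, which is -13 on Linux and matches the code in the client message.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a capability; not the real Lustre structure. */
struct demo_capa {
        unsigned long long opc;   /* operation bits the capability would allow */
};

/*
 * Models only the branch quoted from filter_capa.c: when the request
 * carries no capability at all, the request is refused with -EACCES.
 * Any non-NULL capability is accepted here, which the real code does not do.
 */
int demo_auth_capa(const struct demo_capa *capa)
{
        if (capa == NULL)
                return -EACCES;   /* -13 on Linux */
        return 0;
}
```

Calling `demo_auth_capa(NULL)` yields `-EACCES`, while passing a pointer to any `struct demo_capa` yields 0 in this simplified model.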
      

      The message on the client is:

      Jul  3 13:26:50 ansel242 kernel: LustreError: 11-0: lsc-OST00b4-osc-ffff8806244c3800: Communicating with 172.19.1.113@o2ib100, operation ost_read failed with -13.
      Jul  3 13:26:50 ansel242 kernel: LustreError: Skipped 3495061 previous similar messages
      

      And there are corresponding messages on the server:

      Jul  3 13:26:51 sumom13 kernel: LustreError: 24607:0:(filter_capa.c:146:filter_auth_capa()) seq/opc 0/0x40: no capability has been passed
      Jul  3 13:26:51 sumom13 kernel: LustreError: 24607:0:(filter_capa.c:146:filter_auth_capa()) Skipped 3495057 previous similar messages
      

      It appears that for each "ost_{read|write} failed" message on the client, there is a "no capability" message on the server.
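
One way to check that pairing across larger logs is to pull the relevant fields out of each message. The sketch below is only an illustration under stated assumptions: the function names `parse_client_failure()` and `parse_server_no_capa()` are hypothetical, and the `sscanf` patterns are derived solely from the two sample messages quoted in this report, so other Lustre versions may format these lines differently.

```c
#include <stdio.h>
#include <string.h>

/*
 * Extract the operation name and return code from a client line such as
 * "... operation ost_read failed with -13."
 * Returns 0 on success, -1 if the line does not match.
 */
int parse_client_failure(const char *line, char *op, size_t op_len, int *rc)
{
        char buf[64];
        const char *p = strstr(line, "operation ");

        if (p == NULL)
                return -1;
        if (sscanf(p, "operation %63s failed with %d", buf, rc) != 2)
                return -1;
        snprintf(op, op_len, "%s", buf);
        return 0;
}

/*
 * Extract seq and opc from a server line such as
 * "... seq/opc 0/0x40: no capability has been passed".
 * Returns 0 on success, -1 if the line does not match.
 */
int parse_server_no_capa(const char *line, unsigned long long *seq,
                         unsigned long long *opc)
{
        const char *p = strstr(line, "seq/opc ");

        if (p == NULL || strstr(p, "no capability has been passed") == NULL)
                return -1;
        if (sscanf(p, "seq/opc %llu/%llx", seq, opc) != 2)
                return -1;
        return 0;
}
```

Counting the lines accepted by each parser over the same time window, on the client and on the server respectively, would give a rough check of the one-to-one correspondence described above.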

      I'm unsure why the capability isn't being set by the client, but it seems that is causing the -EACCES error to get propagated to the clients.

      Lustre versions:

      Client: lustre-modules-2.1.1-13chaos_2.6.32_220.17.1.3chaos.ch5.x86_64.x86_64
      Server: lustre-modules-2.1.1-4chaos_2.6.32_220.7.1.7chaos.ch5.x86_64.x86_64

            People

              bobijam Zhenyu Xu
              prakash Prakash Surya (Inactive)