mds llog_write_rec 'No space left on device'

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.4.0
    • Affects Version/s: Lustre 2.2.0, Lustre 2.1.2
    • Severity: 3
    • Rank (Obsolete): 5568

    Description

      I noticed that our MDS logs this message once in a while:
      LustreError: 3354:0:(llog_cat.c:298:llog_cat_add_rec()) llog_write_rec -28: lh=ffff8809f8de6780
      LustreError: 20199:0:(llog_cat.c:298:llog_cat_add_rec()) llog_write_rec -28: lh=ffff8809c6b299c0
      LustreError: 32022:0:(llog_cat.c:298:llog_cat_add_rec()) llog_write_rec -28: lh=ffff8804dd524d80
      LustreError: 32015:0:(llog_cat.c:298:llog_cat_add_rec()) llog_write_rec -28: lh=ffff8804dd76af00
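The -28 in these messages is a negated errno value. On Linux, errno 28 is ENOSPC, whose text is the "No space left on device" in the ticket title. As a minimal sketch, assuming the console line format shown above, the PID and return code can be pulled out of a line and decoded like this. The helpers `parse_llog_error` and `llog_rc_text` are hypothetical and not part of Lustre.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: extract the PID and the return code from one
 * "LustreError: <pid>:... llog_write_rec <rc>: ..." console line.
 * Returns 0 on success, -1 if the line does not match.
 */
static int parse_llog_error(const char *line, int *pid, int *rc)
{
    const char *p = strstr(line, "LustreError: ");

    if (p == NULL || sscanf(p, "LustreError: %d:", pid) != 1)
        return -1;

    p = strstr(line, "llog_write_rec ");
    if (p == NULL || sscanf(p, "llog_write_rec %d", rc) != 1)
        return -1;

    return 0;
}

/* Hypothetical helper: turn a negated errno such as -28 into its text. */
static const char *llog_rc_text(int rc)
{
    return strerror(rc < 0 ? -rc : rc);
}
```

Calling `parse_llog_error` on the first line above yields PID 3354 and return code -28, and `llog_rc_text(-28)` gives the platform's message for that errno.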

        Activity

          [LU-2336] mds llog_write_rec 'No space left on device'

          jfc John Fuchs-Chesney (Inactive) added a comment - Patch provided and landed.

          jlevi Jodi Levi (Inactive) added a comment - With change 5146 landed, can this ticket be closed?
          hongchao.zhang Hongchao Zhang added a comment - The patch is tracked at http://review.whamcloud.com/#change,5146

          prakash Prakash Surya (Inactive) added a comment - We're seeing this in production here at LLNL, and it is confusing our admins. If the message is harmless, it really should be removed (or at least masked behind a CDEBUG flag).
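To illustrate the request, here is a user-space sketch of a rollover that stays quiet when the current log is merely full and reports only failures that persist after switching to a new log. This is an assumption about one possible approach, not necessarily what change 5146 does. All `toy_*` names are hypothetical stand-ins and do not exist in Lustre.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for one plain llog file with a fixed capacity. */
struct toy_log {
    int used;
    int capacity;
};

/* Hypothetical stand-in for the catalog: it tracks only the current log. */
struct toy_catalog {
    struct toy_log current;
    int logs_created;    /* number of rollovers to a fresh log */
    int errors_reported; /* number of messages an admin would see */
};

/* Append one record; a full log returns -ENOSPC, as in the ticket. */
static int toy_write_rec(struct toy_log *log)
{
    if (log->used >= log->capacity)
        return -ENOSPC;
    log->used++;
    return 0;
}

/* Add a record, rolling over to a fresh log when the current one is full. */
static int toy_cat_add_rec(struct toy_catalog *cat)
{
    int rc = toy_write_rec(&cat->current);

    /* Assumption: a full log on the first attempt is expected, so stay quiet. */
    if (rc < 0 && rc != -ENOSPC) {
        fprintf(stderr, "toy_write_rec %d\n", rc);
        cat->errors_reported++;
    }

    if (rc == -ENOSPC) {
        /* Start a new, empty log and retry once. */
        cat->current.used = 0;
        cat->logs_created++;
        rc = toy_write_rec(&cat->current);
        if (rc < 0) {
            /* A failure on the fresh log is worth reporting. */
            fprintf(stderr, "toy_write_rec %d\n", rc);
            cat->errors_reported++;
        }
    }
    return rc;
}
```

With a capacity of two records, five calls to `toy_cat_add_rec` roll over twice and report no errors, which is the behaviour the admins would presumably prefer to see in the console.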

          hongchao.zhang Hongchao Zhang added a comment - No, this should not be a problem. The PIDs in these log messages are all different. A new llog file is created when the current one is full. Two consecutive messages with the same PID could indicate a real problem. The related code is:

                  /* now let's try to add the record */
                  rc = llog_write_rec(env, loghandle, rec, reccookie, 1, buf, -1, th);
                  if (rc < 0)
                          CERROR("llog_write_rec %d: lh=%p\n", rc, loghandle);
                  cfs_up_write(&loghandle->lgh_lock);
                  if (rc == -ENOSPC) {
                          /* try to use next log */
                          loghandle = llog_cat_current_log(cathandle, th);
                          LASSERT(!IS_ERR(loghandle));
                          /* new llog can be created concurrently */
                          if (!llog_exist(loghandle)) {
                                  rc = llog_cat_new_log(env, cathandle, loghandle, th);
                                  if (rc < 0) {
                                          cfs_up_write(&loghandle->lgh_lock);
                                          RETURN(rc);
                                  }
                          }
                          /* now let's try to add the record */
                          rc = llog_write_rec(env, loghandle, rec, reccookie, 1, buf,
                                              -1, th);
                          if (rc < 0)
                                  CERROR("llog_write_rec %d: lh=%p\n", rc, loghandle);
                          cfs_up_write(&loghandle->lgh_lock);
                  }
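The rule of thumb in this comment (two messages in a row from the same PID may point to a real problem) can be checked mechanically. The following is a minimal sketch, assuming the PIDs have already been extracted from the console log in order. `count_repeated_pids` is a hypothetical helper, not a Lustre function.

```c
#include <stddef.h>

/*
 * Hypothetical helper: count adjacent message pairs that share a PID.
 * A result of 0 matches the harmless pattern described in the comment.
 */
static size_t count_repeated_pids(const int *pids, size_t n)
{
    size_t repeats = 0;

    for (size_t i = 1; i < n; i++) {
        if (pids[i] == pids[i - 1])
            repeats++;
    }
    return repeats;
}
```

For the four messages in the description (PIDs 3354, 20199, 32022, 32015) the count is zero, which is consistent with the log rollover explanation.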
          pjones Peter Jones added a comment - Hongchao, could you please comment on this one? Thanks, Peter

          ethz.support ETHz Support (Inactive) added a comment - Could this be a problem? It is not related to the MDS crashes that I described in LU-2323. Thanks in advance.

          People

            Assignee: hongchao.zhang Hongchao Zhang
            Reporter: ethz.support ETHz Support (Inactive)
            Votes: 0
            Watchers: 8