Client eviction on lock callback timeout

    Description

      Our testing has revealed that Lustre 2.1 is far more likely than 1.8 to return short reads and writes (the return value of read() or write() reports fewer bytes transferred than were requested).

      So far, the reproducer that triggers this frequently is IOR on a shared single file, with a transfer size of 128MB and a block size of 256MB, on 32 client nodes with 512 tasks spread evenly over the clients (16 tasks per node).

      The file is striped over only 2 OSTs.

      When the read() or write() return value is less than the requested amount, the returned size has been a multiple of 1MB in every instance that I have seen thus far.

      I suspect that other loads will show the same problem. Our more common large-transfer workloads come from our file-per-process apps, though, so we'll run some tests to see whether it is easy to reproduce there as well.
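
For anyone who wants to check their own workload for the behavior described above, here is a minimal sketch of how a short write could be detected and classified. It is not the attached reproducer.c; the names `classify_io`, `write_fully`, and `struct io_result` are hypothetical, and "1MB" is assumed to mean 1 MiB (1048576 bytes). It uses only POSIX write() semantics, under which a return value smaller than the request is legal and the caller is expected to retry the remainder.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: the "1MB" in the report means 1 MiB. */
#define MIB (1024L * 1024L)

/* Hypothetical summary of a single read()/write() return value. */
struct io_result {
    size_t  requested;   /* bytes asked for */
    ssize_t returned;    /* bytes reported by the call, or -1 on error */
    int     short_io;    /* 1 if 0 <= returned < requested */
    int     mib_aligned; /* 1 if returned is a whole multiple of 1 MiB */
};

/* Classify one return value without performing any I/O. */
static struct io_result classify_io(size_t requested, ssize_t returned)
{
    struct io_result r;
    r.requested = requested;
    r.returned = returned;
    r.short_io = (returned >= 0 && (size_t)returned < requested);
    r.mib_aligned = (returned >= 0 && returned % MIB == 0);
    return r;
}

/*
 * Write len bytes, retrying after short writes and EINTR.
 * Each short write is logged to stderr so it can be counted later.
 * Returns the number of bytes written, or -1 on a hard error.
 */
static ssize_t write_fully(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    assert(buf != NULL);
    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;      /* interrupted before any data moved */
            return -1;
        }
        if (n == 0)
            break;             /* no progress; avoid spinning forever */

        struct io_result r = classify_io(len - done, n);
        if (r.short_io)
            fprintf(stderr, "short write: %zd of %zu bytes (%s)\n",
                    n, len - done,
                    r.mib_aligned ? "multiple of 1 MiB"
                                  : "not a multiple of 1 MiB");
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

Calling `write_fully(fd, buf, 128 * MIB)` in place of a bare write() would both complete the transfer and leave a log line for each short return, which makes it easy to compare how often the condition occurs on different Lustre versions. A read-side equivalent would follow the same pattern, with the extra caveat that a return of 0 from read() means end of file.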

      Attachments

        1. 874pdf.pdf
          35 kB
        2. 874pdf2.pdf
          88 kB
        3. 874pdf2.pdf
          88 kB
        4. lc3-OST001_brw_stats.txt
          8 kB
        5. LU-874.lustre-log.oss.1322741854.6037.gz
          4.58 MB
        6. reproducer.c
          1 kB
        7. zwicky3_brw_stats.txt
          22 kB

          Activity

            [LU-874] Client eviction on lock callback timeout
            adilger Andreas Dilger made changes -
            Resolution New: Fixed [ 1 ]
            Status Original: Open [ 1 ] New: Resolved [ 5 ]
            pjones Peter Jones made changes -
            End date New: 26/Jun/14
            Start date New: 22/Nov/11
            morrone Christopher Morrone (Inactive) made changes -
            Labels Original: ptr sequoia New: llnl ptr
            jlevi Jodi Levi (Inactive) made changes -
            Priority Original: Blocker [ 1 ] New: Major [ 3 ]
            morrone Christopher Morrone (Inactive) made changes -
            Labels Original: ptr sequoia topsequoia New: ptr sequoia
            morrone Christopher Morrone (Inactive) made changes -
            Fix Version/s New: Lustre 2.4.0 [ 10154 ]
            morrone Christopher Morrone (Inactive) made changes -
            Link New: This issue is duplicated by LU-2683 [ LU-2683 ]
            morrone Christopher Morrone (Inactive) made changes -
            Priority Original: Major [ 3 ] New: Blocker [ 1 ]
            morrone Christopher Morrone (Inactive) made changes -
            Labels Original: ptr New: ptr sequoia topsequoia
            morrone Christopher Morrone (Inactive) made changes -
            Affects Version/s New: Lustre 2.4.0 [ 10154 ]

            People

              bobijam Zhenyu Xu
              morrone Christopher Morrone (Inactive)
              Votes: 4
              Watchers: 40

              Dates

                Created:
                Updated:
                Resolved: