Lustre / LU-15882

Lustre 2.15 GPUDirect Storage testing: I/O failed

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.15.0
    • Labels: None
    • Environment: NVIDIA DGX A100
    • Severity: 3

    Description

      The GDS (GPUDirect Storage) sanity test run hit a failed I/O:

       

      /usr/local/gds/tools/gdsio -V -D /data/sanity/tests// -d 0  -w 8 -s 1G -i 32K:1024K:1K -x 0 -I 1 -o 1
      Error: IO failed stopping traffic, fd :51 ret:-5 errno :1
      Error: IO failed stopping traffic, fd :49 ret:-5 errno :1
      Error: IO failed stopping traffic, fd :52 ret:-5 errno :1
      Error: IO failed stopping traffic, fd :56 ret:-5 errno :1
      Error: IO failed stopping traffic, fd :54 ret:-5 errno :1
      Error: IO failed stopping traffic, fd :55 ret:-5 errno :1
      FAILED

       

      dmesg:

      [Mon May 23 15:43:05 2022] nvidia-fs:write unable to flush dirty pages :-5
      [Mon May 23 15:43:05 2022] nvidia-fs:write unable to flush dirty pages :-5
      [Mon May 23 15:43:05 2022] nvidia-fs:write unable to flush dirty pages :-5
      [Mon May 23 15:43:05 2022] nvidia-fs:write unable to flush dirty pages :-5
      [Mon May 23 15:43:05 2022] nvidia-fs:write unable to flush dirty pages :-5
      [Mon May 23 15:43:05 2022] nvidia-fs:write unable to flush dirty pages :-5 
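      For triage, it may help to decode the numeric codes in the output above. The following is a minimal sketch, not part of gdsio or nvidia-fs: the helper names `parse_gdsio_error` and `errno_name` are hypothetical, and treating `ret` as a negated Linux errno value is an assumption based on the kernel convention visible in the dmesg lines (`:-5`). Under that assumption, `-5` maps to EIO and the reported `errno :1` maps to EPERM.

```python
import errno
import os
import re

# Matches the gdsio error lines shown above, e.g.
# "Error: IO failed stopping traffic, fd :51 ret:-5 errno :1"
GDSIO_ERR = re.compile(r"fd\s*:(\d+)\s+ret:(-?\d+)\s+errno\s*:(\d+)")


def errno_name(code):
    """Return the symbolic errno name for a positive or negated errno value."""
    n = abs(code)
    return errno.errorcode.get(n, f"unknown({n})")


def parse_gdsio_error(line):
    """Extract (fd, ret, errno) as integers, or None if the line does not match."""
    m = GDSIO_ERR.search(line)
    if m is None:
        return None
    fd, ret, err = (int(g) for g in m.groups())
    return fd, ret, err


if __name__ == "__main__":
    line = "Error: IO failed stopping traffic, fd :51 ret:-5 errno :1"
    fd, ret, err = parse_gdsio_error(line)
    # Assumption: ret is a negated errno, as in the nvidia-fs dmesg messages.
    print(f"fd={fd} ret={ret} -> {errno_name(ret)} ({os.strerror(abs(ret))})")
    print(f"errno={err} -> {errno_name(err)} ({os.strerror(err)})")
```

      The mismatch between the two codes (EIO from the write path, EPERM in `errno`) is only an observation from the logs; the sketch does not explain which layer set which value.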

       

      **************************************************
      Testsuite : 181 / 182 tests passed
      done tests:Mon May 23 22:02:56 UTC 2022 

      ddn@a100-01:~$ lctl get_param version
      version=2.15.50_13_gc524079_dirty

      NVIDIA-SMI 515.43.04    Driver Version: 515.43.04    CUDA Version: 11.7

      Kernel: 5.4.0-109-generic

       

          People

            wc-triage WC Triage
            okulachenko Oleg Kulachenko (Inactive)