Lustre / LU-13470

sysfs ping write creates a flood-ping situation that could not be normally stopped


Details

    • Bug
    • Status: Resolved
    • Minor
    • Resolution: Fixed
    • Lustre 2.14.0, Lustre 2.12.5
    • Lustre 2.14.0
    • None
    • 3

    Description

      It looks like when the ping file was migrated to sysfs, a bug was unfortunately introduced in the write path:

      ssize_t ping_store(struct kobject *kobj, struct attribute *attr,
                         const char *buffer, size_t count)
      {
              return ping_show(kobj, attr, (char *)buffer);
      }
      

      What it really should be doing is returning count; otherwise the outer logic thinks it's a short write that needs to be retried (errno = 0) and enters a loop that you cannot really break, short of disconnecting from the server:

      [root@centos6-16 ~]# cat /sys/fs/lustre/mdc/lustre-MDT0000-mdc-ffff880387d67800/ping
      [root@centos6-16 ~]# echo blahblah > /sys/fs/lustre/mdc/lustre-MDT0000-mdc-ffff880387d67800/ping
      
      ^Z
      ^C
      
      

      We can now see how the CPU is eaten by all the retries and pings:

        PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
       2529 root      20   0       0      0      0 R  27.8  0.0  20:07.59 socknal_sd+
      12488 root      20   0  115568   2124   1612 S  27.8  0.0  12:48.58 bash       
       2530 root      20   0       0      0      0 S  27.5  0.0  20:09.36 socknal_sd+
       3861 root      20   0       0      0      0 S  14.2  0.0   4:11.05 mdt03_002  
      16784 root      20   0       0      0      0 S  10.6  0.0   4:04.16 mdt03_004  
       4410 root      20   0       0      0      0 S   5.0  0.0   4:08.74 mdt03_003  
       3859 root      20   0       0      0      0 S   2.6  0.0   4:11.23 mdt03_000  
         55 root      20   0       0      0      0 S   0.3  0.0   0:22.51 rcuos/6    
       3860 root      20   0       0      0      0 S   0.3  0.0   3:58.34 mdt03_001  
      15467 green     20   0  162104   2408   1524 R   0.3  0.0   0:00.05 top        
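
The retry behaviour described above can be reproduced in user space. The following is only a minimal sketch, not the actual Lustre patch: `fake_ping_show`, `buggy_ping_store`, `fixed_ping_store` and `write_all` are hypothetical names, the kobject arguments are dropped, and it assumes that `ping_show()` returns 0 on a successful ping (consistent with the empty `cat` output above). The retry loop is capped so that the demonstration terminates.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for ping_show(): assume the ping succeeded and
 * produced no output, so 0 bytes are reported. */
static ssize_t fake_ping_show(void)
{
	return 0;
}

/* Mirrors the handler quoted in the description: it forwards the
 * result of ping_show(), so a successful write reports 0 bytes. */
static ssize_t buggy_ping_store(const char *buffer, size_t count)
{
	(void)buffer;
	(void)count;
	return fake_ping_show();
}

/* One possible fix (an assumption, not the merged change): pass
 * errors through, otherwise report the whole buffer as consumed. */
static ssize_t fixed_ping_store(const char *buffer, size_t count)
{
	ssize_t rc = fake_ping_show();

	(void)buffer;
	return rc < 0 ? rc : (ssize_t)count;
}

typedef ssize_t (*store_fn)(const char *buffer, size_t count);

/* Simplified "write everything" loop, as a writer such as a shell
 * might run it. Stops after max_calls attempts so the buggy case
 * cannot spin forever. Returns the number of store calls made. */
static int write_all(store_fn store, const char *buffer, size_t count,
		     int max_calls)
{
	size_t done = 0;
	int calls = 0;

	while (done < count && calls < max_calls) {
		ssize_t n = store(buffer + done, count - done);

		calls++;
		if (n < 0)
			break;		/* a real error ends the loop */
		done += (size_t)n;	/* n == 0 makes no progress */
	}
	return calls;
}
```

With the fixed handler, `write_all(fixed_ping_store, "blahblah\n", 9, 1000)` finishes after a single call. With the buggy handler every call reports zero bytes written and no error, so the loop only stops at the artificial cap; in the real case each of those calls also sends a ping to the server.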
      


            People

              wc-triage WC Triage
              green Oleg Drokin
