'Text file busy' error when creating executable on NFS share and then running it on Lustre node.

Details

    • Bug
    • Resolution: Fixed
    • Major
    • None
    • Lustre 1.8.7
    • None

    Description

      We have exported a Lustre file system via NFS (from a node we'll call n1).

      The attached script 'repr.sh' creates the executable 'test.sh' in the Lustre folder, then connects to n1 over ssh and runs 'test.sh' there.

      Running 'repr.sh' always fails with the error "bash: /folder_on_lustre/test.sh: Text file busy".

      Reproduction script:
      cd /folder_on_lustre
      rm -rf test.sh
      echo "ls -l" >> test.sh
      chmod +x test.sh
      echo "ls" >> test.sh
      ssh -o 'StrictHostKeyChecking no' n1 '/folder_on_lustre/test.sh'
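
      For background: "Text file busy" is the message for ETXTBSY, which on Linux execve(2) generally returns when the target file is still open for writing by some process. The sketch below reproduces that condition on a local file system only; it does not involve Lustre or NFS, and the temporary path and the use of fd 3 are assumptions made for illustration.

```shell
#!/bin/sh
# Local sketch (no Lustre or NFS involved): execute a script while a write
# descriptor on it is still open, then again after the descriptor is closed.
dir=$(mktemp -d)
script="$dir/test.sh"
printf '#!/bin/sh\nexit 0\n' > "$script"
chmod +x "$script"

exec 3>>"$script"                         # keep the file open for writing on fd 3
busy_status=0
"$script" 2>/dev/null || busy_status=$?   # expected to fail on Linux (ETXTBSY)

exec 3>&-                                 # close the writer
idle_status=0
"$script" || idle_status=$?               # should now run normally

echo "exec while open for writing: exit status $busy_status"
echo "exec after closing writer:   exit status $idle_status"
rm -rf "$dir"
```

      If the first status is non-zero and the second is zero, the failure depends on an outstanding writer rather than on the script's contents.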

          Activity

            [LU-872] 'Text file busy' error when creating executable on NFS share and then running it on Lustre node.
            pjones Peter Jones added a comment -

            No problem.

            evg1 Evgeny Repekto added a comment - - edited

            Lai,

            I checked with our IT team and they confirmed that n1 is not our MDS. We'll retest this issue as soon as our MDS is upgraded to 1.8.7.

            I think this ticket can be closed. If it reproduces after the upgrade I'll reopen it (I think I have permission to?).

            Thank you, and sorry for the bother.

            laisiyao Lai Siyao added a comment -

            Evgeny, I tested the same setup you described. I'm afraid you didn't upgrade the MDS to Lustre 1.8.7; LU-146 is a fix for the MDS, so upgrading only the client won't help.


            evg1 Evgeny Repekto added a comment -

            Sorry if I confused you.

            Clarifying our case:

            1) n1 is a Lustre node running version 1.8.7 that holds the folder /folder_on_lustre.
            2) Besides n1, this folder is also present on other Lustre nodes running version 1.8.2 (we deployed Lustre 1.8.7 on n1 specifically to see whether the LU-146 fix resolves our problem).
            3) c1 is the remote client; it mounts the Lustre folder '/folder_on_lustre' from n1 over NFS at a mountpoint with the same name, i.e. '/folder_on_lustre'.
            4) When we run repr.sh on c1 (from /folder_on_lustre), we get the behaviour described.
            5) This is what we have in /etc/fstab on c1:
            n1_ip:/folder_on_lustre /folder_on_lustre nfs rw,hard,intr 0 0

            laisiyao Lai Siyao added a comment -

            I made ssh password-less, and it showed the same result:

            [root@vivaldi tests]# mount|grep /mnt/lustre
            vivaldi@tcp:/lustre on /mnt/lustre type lustre (rw,user_xattr,acl,flock)
            [root@vivaldi tests]# ssh chopin mount|grep /mnt/lustre
            vivaldi:/mnt/lustre on /mnt/lustre type nfs (rw,nolock,addr=192.168.111.129)
            [root@vivaldi tests]# cat /tmp/repr.sh 
            cd /mnt/lustre
            rm -rf test.sh
            echo "ls -l" >> test.sh
            chmod +x test.sh
            echo "ls" >> test.sh
            ssh chopin '/mnt/lustre/test.sh'
            [root@vivaldi tests]# sh /tmp/repr.sh
            total 56
            -rw-------. 1 root root  1891 Nov 30 05:35 anaconda-ks.cfg
            -rw-r--r--. 1 root root 40354 Nov 30 05:35 install.log
            -rw-r--r--. 1 root root  8168 Nov 30 05:34 install.log.syslog
            anaconda-ks.cfg
            install.log
            install.log.syslog
            

            I want to verify one thing: on node 'n1', is /folder_on_lustre an NFS mountpoint or a Lustre mountpoint? Did your test use any NFS share?
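
            One way to check which kind of mount a path is would be to read the file system type from /proc/mounts on each node. The helper below is a minimal sketch: the function name fstype_of and its optional second argument are hypothetical and not part of Lustre or NFS tooling.

```shell
#!/bin/sh
# fstype_of MOUNTPOINT [MOUNTS_FILE]
# Print the file system type recorded for MOUNTPOINT in a /proc/mounts-style
# file (default: /proc/mounts). If several entries match, the last one wins,
# since later mounts shadow earlier ones. Returns non-zero if none match.
fstype_of() {
    awk -v mp="$1" '
        $2 == mp { fstype = $3; found = 1 }
        END { if (found) print fstype; else exit 1 }
    ' "${2:-/proc/mounts}"
}
```

            Running `fstype_of /folder_on_lustre` on n1 and again on c1 would be expected to print `lustre` on a node that mounts Lustre directly and `nfs` on a node that mounts the NFS export, assuming the setup is as described in this ticket.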


            evg1 Evgeny Repekto added a comment -

            Yes, we are using this fix as part of the 1.8.7 release. In case it matters, 1.8.7 is deployed on n1 only; the other Lustre nodes run 1.8.2.

            The difference I see from your scenario output is that we use a password-less ssh connection.

            laisiyao Lai Siyao added a comment -
            [root@vivaldi tests]# cat /tmp/repr.sh 
            cd /mnt/lustre
            rm -rf test.sh
            echo "ls -l" >> test.sh
            chmod +x test.sh
            echo "ls" >> test.sh
            ssh chopin '/mnt/lustre/test.sh'
            [root@vivaldi tests]# sh /tmp/repr.sh 
            root@chopin's password: 
            total 56
            -rw-------. 1 root root  1891 Nov 30 05:35 anaconda-ks.cfg
            -rw-r--r--. 1 root root 40354 Nov 30 05:35 install.log
            -rw-r--r--. 1 root root  8168 Nov 30 05:34 install.log.syslog
            anaconda-ks.cfg
            install.log
            install.log.syslog
            

            I tested this in my environment and it passed. Evgeny, could you verify that http://review.whamcloud.com/#change,1259 is included in your Lustre code?


            People

              laisiyao Lai Siyao
              evg1 Evgeny Repekto