[LU-10457] open_by_handle_at() in write mode triggers ETXTBSY Created: 04/Jan/18  Updated: 09/Jan/20  Resolved: 09/Jan/20

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Diego Moreno (Inactive) Assignee: Oleg Drokin
Resolution: Duplicate Votes: 0
Labels: None

Attachments: File open-test.c    
Issue Links:
Duplicate
duplicates LU-8585 All Lustre test suites should pass wi... Open
is duplicated by LU-10667 Open by handle and normal open result... Resolved
Related
is related to LU-4398 mdt_object_open_lock() may not flush ... Resolved
is related to LU-12661 sanity test_817: FAIL: failed to exec... Open
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

If open_by_handle_at() is called in O_WRONLY or O_RDWR mode and then the file descriptor is closed, other lustre clients will still report ETXTBSY.

Example:

On cn16
=======
bschubert@cn16 ~>sudo ~/src/test/open-test /mnt/lustre_client-ES24/bschubert/ime/test7 1
Opened /mnt/lustre_client-ES24/bschubert/ime/test7/test7, fd: 4
Closed d: 4

Now on cn41
=========
bschubert@cn41 ~>/mnt/lustre_client-ES24/bschubert/ime//test7
-bash: /mnt/lustre_client-ES24/bschubert/ime//test7: Text file busy

test7 is just any file which has the execution bit set.
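For reference, the core of a reproducer in the spirit of the attached open-test.c (a hypothetical sketch; the real attachment may differ in argument handling and output) is just the name_to_handle_at()/open_by_handle_at() pair, with the file re-opened for write and then closed:

/* Hypothetical sketch, not the attached open-test.c verbatim. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <libgen.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    struct file_handle *fh;
    char dirbuf[4096];
    int mount_id, mount_fd, fd;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <path>\n", argv[0]);
        return 1;
    }

    fh = malloc(sizeof(*fh) + MAX_HANDLE_SZ);
    fh->handle_bytes = MAX_HANDLE_SZ;

    /* Encode a file handle; this goes through the filesystem's
     * NFS-style export operations. */
    if (name_to_handle_at(AT_FDCWD, argv[1], fh, &mount_id, 0) < 0) {
        perror("name_to_handle_at");
        return 1;
    }

    /* open_by_handle_at() needs a descriptor on the same mount;
     * the parent directory will do. */
    strncpy(dirbuf, argv[1], sizeof(dirbuf) - 1);
    dirbuf[sizeof(dirbuf) - 1] = '\0';
    mount_fd = open(dirname(dirbuf), O_RDONLY | O_DIRECTORY);

    /* Re-open the file by handle in write mode (needs
     * CAP_DAC_READ_SEARCH, hence the sudo in the example above). */
    fd = open_by_handle_at(mount_fd, fh, O_WRONLY);
    if (fd < 0) {
        perror("open_by_handle_at");
        return 1;
    }
    printf("Opened %s, fd: %d\n", argv[1], fd);

    close(fd);
    printf("Closed fd: %d\n", fd);
    return 0;
}

After the final close() returns, executing the same file from another client still fails with ETXTBSY as shown above.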



 Comments   
Comment by Li Xi (Inactive) [ 05/Jan/18 ]

I reproduced the problem easily. Interestingly, the first run returns ETXTBSY, but the next run succeeds.

[root@server17-el7-vm1 LU-10457]# ./open-test /mnt/lustre/file 1
Opened /mnt/lustre/file/file, fd: 4
Closed d: 4

[root@server17-el7-vm3 lustre]# ./file
-bash: ./file: /bin/bash: bad interpreter: Text file busy
[root@server17-el7-vm3 lustre]# ./file
hello!

Comment by Li Xi (Inactive) [ 05/Jan/18 ]

As expected, mdt_write_deny() returns -ETXTBSY when the failure happens:

 
00000004:00000001:0.0:1515116333.246172:0:12336:0:(mdt_open.c:342:mdt_mfd_open()) Process entered
00000004:00000001:0.0:1515116333.246173:0:12336:0:(mdt_open.c:174:mdt_write_deny()) Process entered
00000004:00000001:0.0:1515116333.246174:0:12336:0:(mdt_open.c:181:mdt_write_deny()) Process leaving (rc=18446744073709551590 : -26 : ffffffffffffffe6)
00000004:00000001:0.0:1515116333.246175:0:12336:0:(mdt_open.c:391:mdt_mfd_open()) Process leaving (rc=18446744073709551590 : -26 : ffffffffffffffe6)
00000004:00000001:0.0:1515116333.246176:0:12336:0:(mdt_open.c:615:mdt_finish_open()) Process leaving (rc=18446744073709551590 : -26 : ffffffffffffffe6)
00000004:00000001:0.0:1515116333.246178:0:12336:0:(mdt_open.c:1589:mdt_reint_open()) Process leaving

Comment by Li Xi (Inactive) [ 05/Jan/18 ]

When open-test closes the file, no mdt_mfd_close() is called on the MDS side. And the trace on the client side confirms that ll_md_close() doesn't call ll_md_real_close() because md_lock_match() returns 8. So this program holds some LDLM lock when closing?

00000002:00000001:0.0:1515116985.549766:0:22146:0:(mdc_locks.c:149:mdc_lock_match()) Process leaving (rc=8 : 8 : 8)
00800000:00000001:0.0:1515116985.549766:0:22146:0:(obd_class.h:1740:md_lock_match()) Process leaving (rc=8 : 8 : 8)
00800000:00000001:0.0:1515116985.549767:0:22146:0:(lmv_obd.c:2965:lmv_lock_match()) Process leaving (rc=8 : 8 : 8)
00000080:00000001:0.0:1515116985.549768:0:22146:0:(obd_class.h:1740:md_lock_match()) Process leaving (rc=8 : 8 : 8)
00000080:00000001:0.0:1515116985.549771:0:22146:0:(file.c:314:ll_md_close()) Process leaving (rc=0 : 0 : 0)

Comment by Li Xi (Inactive) [ 05/Jan/18 ]

So the following things happened in sequence:

  1. open-test got the LDLM lock of the file ./test
  2. open-test closed the file, yet did not call ll_md_real_close() because it was holding the LDLM lock
  3. The ./test program tried to open the file with EXEC yet got -ETXTBSY in mdt_write_deny()
  4. mdt_blocking_ast() was triggered; it cancelled the LDLM lock held by open-test, and mdt_mfd_close()/mdt_write_put() were called to release the reference
  5. The next run of ./test succeeds because of step 4.
Comment by Li Xi (Inactive) [ 05/Jan/18 ]

open-test gets the LDLM lock when opening the file because ldd->lld_nfs_dentry is set to 1 in ll_iget_for_nfs().

 

Comment by Oleg Drokin [ 05/Jan/18 ]

Hm, I tested this on the latest master on RHEL 7.2 (disregard the centos6 in the hostname) and don't see any problems. What version are you testing on, and what kernel?

[root@centos6-16 tests]# cat /tmp/test.sh
#!/bin/bash

cp /bin/ls /mnt/lustre
/mnt/lustre/ls -ld .
/tmp/open-test /mnt/lustre/ls 2

TIME=0
while ! /mnt/lustre2/ls -ld . ; do echo nope ; TIME=$((TIME + 1)) ; sleep 1 ; done

echo Waited $TIME seconds for the open to clear
[root@centos6-16 tests]# bash /tmp/test.sh
drwxrwxr-x 12 green green 12288 Jan  4 01:37 .
Opened /mnt/lustre/ls/ls, fd: 4
Closed d: 4
/tmp/test.sh: line 8: /mnt/lustre2/ls: Text file busy
nope
drwxrwxr-x 12 green green 12288 Jan  4 01:37 .
Waited 1 seconds for the open to clear
Comment by Oleg Drokin [ 05/Jan/18 ]

I guess I did not read it far enough; yes, there's one ETXTBSY report due to the open lock.

It appears that name_to_handle_at()/open_by_handle_at() use the NFS-encoded export operations, leading the NFS-detection logic to trigger, so the system sort of operates as designed.

It's going to be tricky to separate real NFS from these users, I guess, and we don't want the extra lock hit when opening the file. The new downgrade logic might help us here: take a bigger lock and then just drop the unneeded bit.

Comment by Bernd Schubert [ 05/Jan/18 ]

Hmm, I can't imagine how this works as it is supposed to, even in the NFS case. Maybe I should have pointed this out in more detail, but in my initial example I used two different nodes.
For NFS or any other overlay file system, one can expect multiple nodes to be involved. NFS users typically create/modify files on their desktop and later execute them natively on Lustre.
For the IME use case, the file is opened for multiple reasons in RW mode on the IME server, but users also later want to use the files natively on Lustre.

Comment by Bernd Schubert [ 05/Jan/18 ]

Ah, actually Li was also using two different nodes. Sorry, I only saw server17-el7 and didn't notice the distinction between -vm1 and -vm3.

Which Lustre version is this with? On the systems I tested with, executing the file on the other node would never succeed until either:

  • the node that had opened the file executed it itself,
  • the node that had opened the file unmounted Lustre, or
  • I was patient and waited for a very long time (> 30 min).
Comment by Oleg Drokin [ 05/Jan/18 ]

Please note that /mnt/lustre and /mnt/lustre2 are different mountpoints, so they act the same as two different nodes, just more convenient to test.

Comment by John Hammond [ 05/Jan/18 ]

This is resolved by https://review.whamcloud.com/#/c/9063/ (LU-4398 mdt: acquire an open lock for write or execute). But that change was reverted from master due to the metadata performance impact described by DDN in LU-5197.

Perhaps the existing functionality of open leases could be used in the open-by-handle path to address this issue without incurring the performance drop.

Comment by Li Xi (Inactive) [ 08/Jan/18 ]

> Which Lustre version is this with? 

I was testing on the master branch. I guess you are using IEEL3 (2.7)? Something might have changed between them.

Comment by Gu Zheng (Inactive) [ 08/Jan/18 ]

It seems https://review.whamcloud.com/#/c/9063/ (LU-4398 mdt: acquire an open lock for write or execute) can resolve the problem; after applying it back on the latest master, I never reproduced the issue.
[root@vm3 ~]# ./open_test /mnt/lustre/file 1
Opened /mnt/lustre/file/file, fd: 4
Closed d: 4

[root@vm6 ~]# /mnt/lustre/file
hello lustre
[root@vm6 ~]# /mnt/lustre/file
hello lustre
[root@vm6 ~]# /mnt/lustre/file
hello lustre
[root@vm6 ~]# /mnt/lustre/file
hello lustre
[root@vm6 ~]# /mnt/lustre/file
hello lustre

Comment by Patrick Farrell (Inactive) [ 15/Feb/18 ]

Oleg pointed me at this. I reported a duplicate and contributed a patch and test case:
https://review.whamcloud.com/#/c/31304/

If we limited my patch to executable files as Oleg suggested, that might fit the bill. Curious what others think. I'll refresh tomorrow.

Comment by Gu Zheng (Inactive) [ 17/Apr/18 ]

Hi all, I just resubmitted the LU-4398 patch (https://review.whamcloud.com/32020) as Jinshan suggested. With it applied, the problem is gone, and in some simple tests no significant regression was found. Still, please feel free to try it and test it more, thanks.

Comment by Bernd Schubert [ 31/May/18 ]

Hi all, I think there is another implication of this issue. Our customer is complaining that quotas are not correctly released. We have mostly worked around the ETXTBSY issue, but I don't think we can do anything about quotas on our side.
Looking at the patches, I think https://review.whamcloud.com/32020 will not help, as it will only try to release conflicting locks on an O_EXEC attempt. The alternative patch from Patrick, https://review.whamcloud.com/#/c/31304/, should work, as it always sends an MDS close from the client if the file was opened in write mode. Is there any side effect? It should just remove an NFS optimization?

Comment by Patrick Farrell (Inactive) [ 31/May/18 ]

It's a pretty significant optimization, but that aside, yes, I think it should be OK.

We have (at least for now) decided to live with the problem.

Why do you think this affects quotas, though? I'm a little puzzled about the connection.

Comment by Bernd Schubert [ 31/May/18 ]

Without any proof that the root cause of LU-10457 is also the root cause of our quota issue, I would assume that a file which gets unlinked but is still open on the MDS will not release its data, and so will also not release quota.
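As a plain-POSIX illustration of the mechanism being assumed here (a hypothetical sketch, not Lustre-specific; the path is a placeholder): as long as any open reference to an unlinked file remains, its blocks stay allocated and keep counting against quota, and they are only released when the last reference goes away.

/* Hypothetical sketch: an unlinked file keeps its blocks (and quota)
 * until the last open reference goes away. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    struct stat st;
    /* Placeholder path on a Lustre client mount. */
    int fd = open("/mnt/lustre/quota-test", O_CREAT | O_WRONLY, 0644);
    if (fd < 0)
        return 1;

    /* Allocate some space that is charged against the user's quota. */
    posix_fallocate(fd, 0, 16 << 20);

    /* Unlink the file: st_nlink drops to 0, but the blocks remain
     * allocated because the open descriptor keeps the inode alive. */
    unlink("/mnt/lustre/quota-test");
    fstat(fd, &st);
    printf("nlink=%ju blocks=%ju\n",
           (uintmax_t)st.st_nlink, (uintmax_t)st.st_blocks);

    /* Only when the last reference is dropped is the space (and quota)
     * released; if the MDS still counts the file as open, that release
     * is deferred accordingly. */
    close(fd);
    return 0;
}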

Comment by Patrick Farrell (Inactive) [ 31/May/18 ]

Ah.  Interesting, I see your point.  Yes, that seems possible.

Comment by Shuichi Ihara (Inactive) [ 01/Jun/18 ]

There is a similar fix in https://review.whamcloud.com/32265 for LU-4398, and I've already confirmed that the patch solves the ETXTBSY issue without performance regressions.

Here are the test results:

[root@c01 ~]#  cat /scratch1/test 
echo hello

[root@c02 ~]# /scratch1/a.out /scratch1/test 1
Opened /scratch1/test/test, fd: 4
Closed d: 4

[root@c01 ~]# /scratch1/test 
hello
[root@c01 ~]# /scratch1/test 
hello
[root@c01 ~]# /scratch1/test 
hello

16 clients, 256 processes (mdtest -n 2000 -u -vv -F -d /scratch1/mdtest.out/ -i 3 -p 10)

master
SUMMARY: (of 3 iterations)
   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   File creation     :      89389.726      78571.859      83605.731       4448.114
   File stat         :     263650.433     221026.947     238222.946      18348.631
   File read         :     113141.781     111882.782     112494.749        514.582
   File removal      :     121785.424     109912.532     114749.674       5090.305
   Tree creation     :        204.674         27.510        140.893         80.383
   Tree removal      :         30.096         28.858         29.363          0.531
V-1: Entering timestamp...

master + patch https://review.whamcloud.com/32265
SUMMARY: (of 3 iterations)
   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   File creation     :      84851.987      82996.117      83819.233        772.021
   File stat         :     262610.064     215623.544     244595.023      20687.611
   File read         :     115494.747     112069.322     113774.145       1398.468
   File removal      :     121395.003     115276.620     118372.989       2498.373
   Tree creation     :        223.484         65.453        156.498         66.722
   Tree removal      :         28.673         17.037         22.827          4.751
V-1: Entering timestamp...
Comment by Shuichi Ihara (Inactive) [ 01/Jun/18 ]

I was missing a test of the quota issue. As Bernd mentioned above, in the end we might need Patrick's patch to solve both problems...

Comment by Oleg Drokin [ 03/Jun/18 ]

The unlink issue is somewhat trivially fixable by just ensuring that unlink also revokes the open bit (we already revoke the lookup bit). The slowdown would only happen for files that actually have an open lock cached on a client.

Comment by Andreas Dilger [ 09/Jan/20 ]

This might have been fixed by patch https://review.whamcloud.com/36641 "LU-8585 llite: don't cache MDS_OPEN_LOCK for volatile files".
