[LU-376] Client hangs when listing big directory with ls -la Created: 30/May/11 Updated: 28/Jun/11 Resolved: 08/Jun/11 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.1.0, Lustre 1.8.6 |
| Fix Version/s: | Lustre 2.1.0, Lustre 1.8.6 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Lukasz Flis | Assignee: | nasf (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Client: 1.8.5 |
||
| Severity: | 3 |
| Epic: | client, hang, interoperability, ls, server |
| Rank (Obsolete): | 4996 |
| Description |
|
We have noticed an interoperability issue between 1.8.5 clients and a 2.0.59 server (no other versions tested).
How to reproduce the problem: On the client node issue:
The symptom is simple: the client hangs. When 2.0.59 is used, this kind of listing takes ~4s.
The problem is interconnect independent: tested with @tcp as well as with @o2ib.
Possible log message related to the issue:
00010000:00010000:10:1306772139.242230:0:3591:0:(ldlm_lock.c:597:ldlm_lock_decref_internal_nolock()) ### ldlm_lock_decref(PR) ns: scratch-MDT0000-mdc-ffff81041677b800 lock: ffff8103f56ec200/0xf6a4fad9013fdffb lrc: 3/1,0 mode: PR/PR res: 8589937616/1 bits 0x3 rrc: 2 type: IBT flags: 0x0 remote: 0x3b122fd677c9380d expref: -99 pid: 1905 timeout: 0
I can provide more information and do testing when needed. |
| Comments |
| Comment by Andreas Dilger [ 30/May/11 ] |
|
This sounds similar to |
| Comment by Lukasz Flis [ 31/May/11 ] |
|
I have tested the latest 1.8.6 client version from git (branch b1_8 origin/b1_8):
[root@n6-4-12 tmp]# dmesg | grep -e "Lustre Version"
It seems that the problem still exists. ls is hanging in S+ state forever.
[root@n6-4-4 ~]# time { ls -la /mnt/lustre/bigdir > /dev/null; }
real 0m3.809s
Here are some logs from lctl dk (default logging settings +rpctrace +dlmtrace):
00000100:00100000:8:1306833305.702655:0:31652:0:(client.c:2098:ptlrpc_queue_wait()) Sending RPC pname:cluuid:pid:xid:nid:opc ls:36a510e9-44ff-f153-1bb1-1436b094d361:31652:x1370313399880721:172.16.193.1@o2ib:37
We will also try a newer server version in a moment. |
| Comment by Marek Magrys [ 01/Jun/11 ] |
|
Hello, With servers: 2.0.61-jenkins-g80841af-PRISTINE-2.6.18-238.9.1.el5_lustre.gc66d831 (RPMs from Whamcloud build server, stand-alone OFED) we get:
Client log: Server logs say nothing. To be clear: our InfiniBand fabric is sane; we can communicate between servers and clients over IB. We tried clients with versions: Everything works fine with the 2.0.61 client; however, we'd like to make clients compatible with both servers: 1.8 and 2.0/2.1. |
| Comment by James A Simmons [ 01/Jun/11 ] |
|
The testing at ORNL also ran into this issue. I tried 1.8 clients with the |
| Comment by James A Simmons [ 01/Jun/11 ] |
|
Just confirmed it was |
| Comment by Peter Jones [ 01/Jun/11 ] |
|
In that case I will assign this to FanYong for comment |
| Comment by nasf (Inactive) [ 01/Jun/11 ] |
|
Oops. The patch for -#define DIR_END_OFF 0xfffffffffffffffeULL But we forgot to land the related patch to lustre-1.8, so the b1_8 client cannot detect the end of the directory hash and loops forever. Will fix it soon. |
| Comment by James A Simmons [ 01/Jun/11 ] |
|
Please make sure it works with the Oracle 1.8 branch |
| Comment by nasf (Inactive) [ 01/Jun/11 ] |
|
Andreas, according to our discussion about fixing "DIR_END_OFF" on master in |
| Comment by Andreas Dilger [ 01/Jun/11 ] |
|
Fan Yong, can you please explain the incompatibility in detail? It definitely seems that a patch should be made to our 1.8.6 git repo immediately to fix up the DIR_END_OFF values, so that we don't have conflicts for interop between 1.8.6 and 2.1. I don't think the 32/64-bit hash fixes went into the Oracle 1.8.6 release, so there shouldn't be any interoperability issues if this is done soon.
Is it possible to make different values of DIR_END_OFF visible to the client depending on whether OBD_CONNECT_64BIT_HASH is used? There would never be anything appearing with hash > 0x7fffffffffffffff and < 0xfffffffffffffffe.
It is a bit confusing that the client cares about the value of DIR_END_OFF, yet this value is defined only in lustre/llite/dir.c. Is this value passed over the network in some form, or is it only internal to the client? If it is passed over the network and affects the protocol, then it should really have been defined in lustre_idl.h and checked in wirecheck.h/wiretest.h, so that it is clear that changing this value will break the wire protocol. |
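The question about exposing a different DIR_END_OFF per client could be sketched roughly as below. This is only an illustration of the idea being asked about, not code from any Lustre branch: the flag bit `EXAMPLE_CONNECT_64BIT_HASH`, the two macro names, and the helper `dir_end_off_for_client()` are hypothetical, and the assumption that clients advertising the flag expect the positive marker is made purely for the example.

```c
#include <stdint.h>

/* The two end-of-directory markers discussed in this ticket. */
#define DIR_END_OFF_LEGACY 0xfffffffffffffffeULL
#define DIR_END_OFF_64BIT  0x7fffffffffffffffULL

/*
 * Placeholder bit for illustration only. This is NOT the real value of
 * OBD_CONNECT_64BIT_HASH from lustre_idl.h.
 */
#define EXAMPLE_CONNECT_64BIT_HASH (1ULL << 0)

/*
 * Hypothetical server-side helper: choose the marker to send based on
 * the connect flags negotiated with the client. Which clients expect
 * which marker is an assumption of this sketch.
 */
static uint64_t dir_end_off_for_client(uint64_t connect_flags)
{
        if (connect_flags & EXAMPLE_CONNECT_64BIT_HASH)
                return DIR_END_OFF_64BIT;
        return DIR_END_OFF_LEGACY;
}
```

Because no real hash falls strictly between 0x7fffffffffffffff and 0xfffffffffffffffe, either marker stays distinguishable from a directory entry hash.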
| Comment by nasf (Inactive) [ 01/Jun/11 ] |
|
Unlike b1_8, b2_x uses "DIR_END_OFF" as the flag that tells the client it has reached the tail of the directory, so the client-side readdir() always checks for this value to decide whether to stop. With the current algorithm, ldiskfs does not generate hash/offset values larger than "0x7ffffffffffffffe", so in theory any other value can be used as "DIR_END_OFF". Originally "DIR_END_OFF" was defined as "0xfffffffffffffffe", which a caller such as llseek() may interpret as a negative value and treat as a failure. So we chose the positive "0x7fffffffffffffff" as the new "DIR_END_OFF". Unfortunately, b1_8 still uses the negative definition "0xfffffffffffffffe", so a b1_8 client never finds the tail of the directory and never stops readdir().
"DIR_END_OFF" acts like a flag transferred from the MDS to the client, so it is part of the wire protocol. In b2_x it is defined in lustre_idl.h, but not in b1_8; I will fix that. On the other hand, we should try to make b2_x interoperate with the Oracle 1.8.6 release, even if our patches cannot go into Oracle's branch. So you are right: we should make the MDS return a different "DIR_END_OFF" depending on the client type. A similar situation applies to the already released lustre-2.0. |
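The two effects described in this comment can be illustrated with a small standalone model. It is a simplified sketch, not the actual llite readdir code: `looks_negative()` and `pages_until_end()` are hypothetical helpers, and the per-page "next hash" array stands in for what the MDS would return.

```c
#include <stddef.h>
#include <stdint.h>

#define OLD_DIR_END_OFF 0xfffffffffffffffeULL /* still used by b1_8 */
#define NEW_DIR_END_OFF 0x7fffffffffffffffULL /* chosen for b2_x */

/* A caller that treats the offset as a signed 64-bit value sees a
 * negative number whenever the top bit is set. */
static int looks_negative(uint64_t hash)
{
        return (hash >> 63) != 0;
}

/*
 * Hypothetical model of a client readdir loop: walk the "next hash"
 * values returned page by page and stop when the end marker that this
 * client expects shows up. Returns the number of pages consumed, or -1
 * if the marker was never seen within max_pages (a real client would
 * simply keep asking for more pages).
 */
static int pages_until_end(const uint64_t *next_hash, size_t max_pages,
                           uint64_t client_end_off)
{
        size_t i;

        for (i = 0; i < max_pages; i++) {
                if (next_hash[i] == client_end_off)
                        return (int)(i + 1);
        }
        return -1;
}
```

In this model a server that ends its listing with NEW_DIR_END_OFF is understood by a client comparing against the same value, while a client still comparing against OLD_DIR_END_OFF never sees a match, which corresponds to the hang reported here.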
| Comment by nasf (Inactive) [ 01/Jun/11 ] |
|
Another way to fix the dir name hash/offset issues: we can keep the original definition of "DIR_END_OFF" as "0xfffffffffffffffeULL" unchanged. In addition, we introduce a new "LL_DIR_END_OFF" defined as "0x7fffffffffffffffULL", which is used on the client side only, to report "f_pos" to upper-layer callers such as llseek(), telldir(), and so on. The advantages are:
1) No wire data is changed, so there are no "DIR_END_OFF" related interoperability issues between any clients and servers, including new 2.x client and old 2.0 server, liblustre client and new 2.x server, Oracle 1.8 client and new 2.x server, and so on.
2) All upper-layer callers get a positive hash/offset for a successful call.
Andreas, what do you think? |
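A minimal sketch of this proposal, assuming the translation happens at the boundary between the wire value and the position handed to callers. The helper names `wire_hash_to_f_pos()` and `f_pos_to_wire_hash()` are hypothetical and are not taken from the landed patches.

```c
#include <stdint.h>

#define DIR_END_OFF    0xfffffffffffffffeULL /* wire value, unchanged */
#define LL_DIR_END_OFF 0x7fffffffffffffffULL /* client-side f_pos only */

/* Hypothetical helper: convert a hash received from the MDS into the
 * position reported to callers such as llseek() or telldir(). */
static uint64_t wire_hash_to_f_pos(uint64_t wire_hash)
{
        return wire_hash == DIR_END_OFF ? LL_DIR_END_OFF : wire_hash;
}

/* Hypothetical inverse: convert a position supplied by a caller back
 * into the hash that would be sent to the MDS. */
static uint64_t f_pos_to_wire_hash(uint64_t f_pos)
{
        return f_pos == LL_DIR_END_OFF ? DIR_END_OFF : f_pos;
}
```

Since only the end marker is remapped and ordinary hashes pass through untouched, the server never sees a new value, which is the property the first advantage above relies on.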
| Comment by Andreas Dilger [ 02/Jun/11 ] |
|
I'm all in favour of keeping the wire protocol unchanged, since that will definitely keep the interop simpler. One concern is that we must not return hash values from ll_dir_llseek() in the range [0x7ff..ff-0xfff..ff]. It probably makes sense to add some check in the MDD code or on the client to verify that the hash value returned to the client is in that range [0-0x7ff..ff] or DIR_END_OFF. There should also be some comments added to the code to describe this requirement. I suspect that for the ZFS OSD we may need to downshift the 64-bit hash value by 1 to keep it in the correct range. |
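The range check and the downshift suggested here might look something like the following. This is a sketch under stated assumptions: `DIR_HASH_MAX`, `dir_hash_is_valid()` and `dir_hash_downshift()` are hypothetical names, and "downshift by 1" is read as a right shift by one bit.

```c
#include <stdint.h>

#define DIR_END_OFF  0xfffffffffffffffeULL
#define DIR_HASH_MAX 0x7fffffffffffffffULL /* hypothetical name for the upper bound */

/* A hash handed to the client is acceptable if it lies in
 * [0, 0x7ff..ff] or is exactly the end-of-directory marker. */
static int dir_hash_is_valid(uint64_t hash)
{
        return hash <= DIR_HASH_MAX || hash == DIR_END_OFF;
}

/* Drop the lowest bit so that a full 64-bit hash lands in the range
 * [0, 0x7ff..ff]. Distinct inputs that differ only in the lowest bit
 * map to the same output. */
static uint64_t dir_hash_downshift(uint64_t hash64)
{
        return hash64 >> 1;
}
```

A shifted hash always passes the check, at the cost of one bit of hash resolution.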
| Comment by James A Simmons [ 02/Jun/11 ] |
|
On our test bed we have generic Oracle 1.8.6 clients. Rolling back our Lustre 2.X servers to just before the 1.8.6 Client: Lustre: Client lustre-client has started 2.0.61.0 MDS server: Lustre: 5600:0:(ldlm_lib.c:871:target_handle_connect()) lustre-MDT0000: connection from 635e51f0-3b73-a4c2-fd3d-2a58a698d038@10.37.248.70@o2ib1 t0 exp 0000000000000000 cur 1307023742 last 0 |
| Comment by nasf (Inactive) [ 02/Jun/11 ] |
|
patch for master: patch for b1_8: |
| Comment by James A Simmons [ 02/Jun/11 ] |
|
If you apply the Lustre 2.1 patch on the server side, do you still need the patch for Lustre 1.8 clients? |
| Comment by nasf (Inactive) [ 02/Jun/11 ] |
|
Not necessary; the 2.x patch makes it work with the 1.8 client. The patch for 1.8 lets the 1.8 client work with 2.0 client and process llseek() properly. |
| Comment by Marek Magrys [ 03/Jun/11 ] |
|
It looks like the patches didn't do the job: Lustre jenkins-g4344280-PRISTINE-2.6.18-238.9.1.el5_lustre.gc66d831 (build #755), which has the patch integrated, still doesn't work with 1.8.5/1.8.6 clients. The ls operation ends with "Input/Output error". 2.0.61-jenkins-g6ca1679-PRISTINE-2.6.18-238.9.1.el5 - This one works ok. We are working on tests with a patched client. |
| Comment by nasf (Inactive) [ 03/Jun/11 ] |
|
The expected work modes are as follows (patched 2.x server/client means: http://review.whamcloud.com/#change,886 set 2 build# 769):
1) patched 2.x server + un-patched 1.8.5 client
2) patched 2.x server + patched 1.8.6 client
3) patched 2.x server + un-patched Oracle 1.8 client (without http://review.whamcloud.com/#change,410)
4) patched 2.x server + un-patched 2.0 client
5) patched 2.x server + patched 2.x client
6) un-patched 2.0 server + patched 2.x client |
| Comment by Build Master (Inactive) [ 06/Jun/11 ] |
|
Integrated in Johann Lombardi : 20ffb12f5b57df325aa4a2e2b4dca4f9b44ed320
|
| Comment by nasf (Inactive) [ 06/Jun/11 ] |
|
The verified work modes (for 'ls -l') are: 1) patched 2.x server + patched 2.x client 5) unpatched 2.0 server + patched 2.x client |
| Comment by Andreas Dilger [ 06/Jun/11 ] |
|
Excellent. |
| Comment by Andreas Dilger [ 06/Jun/11 ] |
|
Oleg, the patch in http://review.whamcloud.com/#change,886 needs to land to maintain compatibility with 2.1 and older Lustre releases, per previous comments in this bug. |
| Comment by Build Master (Inactive) [ 07/Jun/11 ] |
|
Integrated in Johann Lombardi : 83dae8a8d91f1f04bb89da4b0758229be6b03f19
|
| Comment by Build Master (Inactive) [ 07/Jun/11 ] |
|
Integrated in Oleg Drokin : dea1dfafba827572dc1be042de4332e8962f1c14
|
| Comment by nasf (Inactive) [ 08/Jun/11 ] |
|
Patches have been landed to lustre-1.8.6 and lustre-2.1.0 |
| Comment by James A Simmons [ 08/Jun/11 ] |
|
Tested with our unpatched Oracle 1.8.6 clients. It worked like a charm. Thank you.