[LU-3360] Interop 2.1.5<-> 2.4 failure on test suite runtests: Stale file handle Created: 20/May/13 Updated: 16/Oct/13 Resolved: 31/May/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.4.0, Lustre 2.5.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | Hongchao Zhang |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
server: 2.1.5 |
||
| Severity: | 3 |
| Rank (Obsolete): | 8322 |
| Description |
|
This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/335f5472-bf03-11e2-88e0-52540035b04c.

client-16vm3: debug=0x33f0404
client-16vm4: debug=0x33f0404
client-16vm3: subsystem_debug=0xffb7e3ff
client-16vm3: debug_mb=32
client-16vm4: subsystem_debug=0xffb7e3ff
client-16vm4: debug_mb=32
touching /mnt/lustre at Thu May 16 10:37:20 PDT 2013
create an empty file /mnt/lustre/hosts.12774
copying /etc/hosts to /mnt/lustre/hosts.12774
cp: cannot create regular file `/mnt/lustre/hosts.12774': Stale file handle |
| Comments |
| Comment by Sarah Liu [ 20/May/13 ] |
|
another failure in sanity.sh |
| Comment by Peter Jones [ 21/May/13 ] |
|
Hongchao

Could you please look into this one?

Thanks

Peter |
| Comment by Sarah Liu [ 21/May/13 ] |
|
sanity-benchmark test_dbench hit similar error: https://maloo.whamcloud.com/test_sets/34ce8152-bf03-11e2-88e0-52540035b04c

10:37:48:Lustre: DEBUG MARKER: == sanity-benchmark test dbench: dbench == 10:37:47 (1368725867)
10:37:48:LustreError: 4924:0:(filter.c:1484:filter_fid2dentry()) fatal: invalid object id 0
10:37:48:LustreError: 4924:0:(filter.c:3129:__filter_oa2dentry()) filter_setattr error looking up object: 0:2
10:37:48:Lustre: DEBUG MARKER: /usr/sbin/lctl mark sanity-benchmark test_dbench: @@@@@@ FAIL: dbench failed!

client console shows:

10:37:55:Lustre: DEBUG MARKER: == sanity-benchmark test dbench: dbench == 10:37:47 (1368725867)
10:37:55:LustreError: 11-0: lustre-OST0005-osc-ffff88007ace2800: Communicating with 10.10.4.123@tcp, operation ost_destroy failed with -71.
10:37:55:LustreError: 22071:0:(vvp_io.c:1086:vvp_io_commit_write()) Write page 512 of inode ffff88007c71cb38 failed -116
10:37:55:LustreError: 22071:0:(vvp_io.c:1086:vvp_io_commit_write()) Write page 512 of inode ffff88007c71cb38 failed -116
10:37:55:Lustre: DEBUG MARKER: /usr/sbin/lctl mark sanity-benchmark test_dbench: @@@@@@ FAIL: dbench failed!
10:37:55:Lustre: DEBUG MARKER: sanity-benchmark test_dbench: @@@@@@ FAIL: dbench failed!
10:37:55:Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /logdir/test_logs/2013-05-16/lustre-b2_1-el6-x86_64-vs-lustre-master-el6-x86_64--full--1_4_1__1501__-70235162192180-100105/sanity-benchmark.test_dbench.debug_log.$(hostname -s).1368725868.log;
10:37:55: dmesg > /logdir/test_logs/2013-05 |
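For readers decoding the return codes in the client console log above: on Linux, errno 71 is EPROTO and errno 116 is ESTALE, the latter matching the "Stale file handle" message in the description. A minimal sketch of that mapping follows; `sketch_errno_name` and `sketch_errno_text` are hypothetical helpers written for this note, not Lustre functions.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper: name the two error codes seen in the log.
 * Kernel-style return codes are negative, so the sign is dropped first. */
static const char *sketch_errno_name(int rc)
{
    int err = rc < 0 ? -rc : rc;

    if (err == EPROTO)
        return "EPROTO";   /* -71 in the ost_destroy message on Linux */
    if (err == ESTALE)
        return "ESTALE";   /* -116 in the vvp_io_commit_write() messages on Linux */
    return "unknown";
}

/* Hypothetical helper: human-readable text for the same code via strerror(). */
static const char *sketch_errno_text(int rc)
{
    return strerror(rc < 0 ? -rc : rc);
}
```

The numeric values are platform specific, which is why the sketch compares against the `errno.h` constants rather than hard-coding 71 and 116.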
| Comment by Hongchao Zhang [ 22/May/13 ] |
|
it could be related to the patch in

static inline void lustre_set_wire_obdo(struct obd_connect_data *ocd,
                                        struct obdo *wobdo,
                                        struct obdo *lobdo)
{
        memcpy(wobdo, lobdo, sizeof(*lobdo));
        wobdo->o_flags &= ~OBD_FL_LOCAL_MASK;
        if (ocd == NULL)
                return;

        if (unlikely(!(ocd->ocd_connect_flags & OBD_CONNECT_FID)) &&
            fid_seq_is_echo(fid_seq(&lobdo->o_oi.oi_fid))) {
                /* Currently OBD_FL_OSTID will only be used when 2.4 echo
                 * client communicate with pre-2.4 server */
                wobdo->o_oi.oi.oi_id = fid_oid(&lobdo->o_oi.oi_fid);
                wobdo->o_oi.oi.oi_seq = fid_seq(&lobdo->o_oi.oi_fid);
        }
}

if the group (oi_seq) is 0 and the id (oi_id) is 2, then the obdo sent to OST will be changed to group(oi_seq)=fid_seq(&lobdo->o_oi.oi_fid)=2, |
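The aliasing described in the comment above can be sketched in isolation. This is a simplified illustration, not the Lustre source: the `sketch_*` names are hypothetical, the struct layouts are assumed (an id/seq pair sharing storage with a FID whose first field is the sequence), the echo sequence value of 2 is assumed, and the `server_has_connect_fid` parameter stands in for the `OBD_CONNECT_FID` check.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed value of the echo sequence; hypothetical name for this sketch. */
#define SKETCH_FID_SEQ_ECHO 2ULL

/* Assumed simplified layouts: both views share the same 16 bytes. */
struct sketch_ostid { uint64_t oi_id; uint64_t oi_seq; };
struct sketch_fid   { uint64_t f_seq; uint32_t f_oid; uint32_t f_ver; };

union sketch_oi {
    struct sketch_ostid oi;     /* id/group view used by a pre-2.4 server */
    struct sketch_fid   oi_fid; /* FID view: f_seq overlaps oi_id */
};

/* Mirrors the shape of the conversion quoted above: when the server lacks
 * FID support and the FID view looks like an echo sequence, rewrite id/seq. */
static void sketch_set_wire_oi(union sketch_oi *wire,
                               const union sketch_oi *local,
                               int server_has_connect_fid)
{
    assert(sizeof(struct sketch_ostid) == sizeof(struct sketch_fid));

    memcpy(wire, local, sizeof(*local));
    if (!server_has_connect_fid &&
        local->oi_fid.f_seq == SKETCH_FID_SEQ_ECHO) {
        /* An ordinary object with oi_id == 2 also satisfies the test,
         * because oi_id and f_seq occupy the same bytes. */
        wire->oi.oi_id  = local->oi_fid.f_oid;
        wire->oi.oi_seq = local->oi_fid.f_seq;
    }
}
```

Under these assumptions, a local object with oi_id=2 and oi_seq=0 goes out on the wire as oi_id=0 and oi_seq=2 when the server lacks FID support, which appears consistent with the "invalid object id 0" and "error looking up object: 0:2" messages in the earlier log.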
| Comment by James Nunez (Inactive) [ 22/May/13 ] |
|
I'm seeing something similar with sanityn, but no OST nor client logs to look at: https://maloo.whamcloud.com/test_sets/313058ec-c294-11e2-b2eb-52540035b04c |
| Comment by Hongchao Zhang [ 23/May/13 ] |
|
the patch is tracked at http://review.whamcloud.com/#change,6426 |
| Comment by James Nunez (Inactive) [ 24/May/13 ] |
|
Ran with the 6426 patch and interop testing. Tests ran with some subtest failures; some known, but still going through failures. The patch allows runtests to run with no stale file handle error.

2.1 clients with 6426 patched master servers:
6426 patched master clients with 2.1 servers: |
| Comment by Jodi Levi (Inactive) [ 31/May/13 ] |
|
Patch landed to master. Let me know if more patches are needed and I will reopen the ticket. |