[LU-4812] Interop 2.5.1<->2.6 failure on test suite sanity-hsm test_12c: request on 0x200000bd0:0x1c:0x0 is not SUCCEED on mds1 Created: 25/Mar/14 Updated: 14/Dec/21 Resolved: 14/Dec/21 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.6.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | Bruno Faccini (Inactive) |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Environment: |
server: 2.5.1 |
||
| Severity: | 3 |
| Rank (Obsolete): | 13239 |
| Description |
|
This issue was created by maloo for sarah <sarah@whamcloud.com>. This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/bce807b0-b242-11e3-a93f-52540035b04c. The sub-test test_12c failed with the following error:
Update not seen after 100s: wanted 'SUCCEED' got ''
sanity-hsm test_12c: @@@@@@ FAIL: request on 0x200000bd0:0x1c:0x0 is not SUCCEED on mds1 |
| Comments |
| Comment by Jodi Levi (Inactive) [ 25/Mar/14 ] |
|
Bruno, |
| Comment by Bruno Faccini (Inactive) [ 27/Mar/14 ] |
|
The subtest log indicates that setstripe fails with ENODATA and that a later stat() fails with ERANGE:
........
error on ioctl 0x4008669a for '/mnt/lustre/d12c.sanity-hsm/f12c.sanity-hsm' (3): No data available
error: setstripe: create stripe file '/mnt/lustre/d12c.sanity-hsm/f12c.sanity-hsm' failed
5+0 records in
5+0 records out
5242880 bytes (5.2 MB) copied, 1.11764 s, 4.7 MB/s
Cannot stat /mnt/lustre/d12c.sanity-hsm/f12c.sanity-hsm: Numerical result out of range
CMD: client-13vm3 /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | awk '/'0x200000bd0:0x1c:0x0'.*action='ARCHIVE'/ {print \$13}' | cut -f2 -d=
........
Having looked at the Lustre debug logs for both the MDS and the client, it seems that the MDS returns ERANGE from mdt_getxattr, but many debug traces are missing there:
00000100:00100000:0.0:1395452120.445468:0:14249:0:(nrs_fifo.c:182:nrs_fifo_req_get()) NRS start fifo request from 12345-10.10.4.213@tcp, seq: 516
00000100:00100000:0.0:1395452120.445527:0:14249:0:(service.c:2011:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc mdt00_000:4a4a5079-10da-02d9-9c8b-596c927d64af+70:26950:x1463237424976160:12345-10.10.4.213@tcp:49
00000100:00100000:0.0:1395452120.445594:0:14249:0:(service.c:2055:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc mdt00_000:4a4a5079-10da-02d9-9c8b-596c927d64af+70:26950:x1463237424976160:12345-10.10.4.213@tcp:49 Request procesed in 121us (294us total) trans 0 rc -34/-34
00000100:00100000:0.0:1395452120.445600:0:14249:0:(nrs_fifo.c:244:nrs_fifo_req_stop()) NRS stop fifo request from 12345-10.10.4.213@tcp, seq: 516
Sarah, do you have more details about the 2.5.1 build, so that I can use it to reproduce? |
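For readers who want to check the ERANGE reading of the "Handled RPC" line above, here is a minimal sketch, not part of the original ticket. It assumes the "rc" values are negated Linux errno numbers and decodes the first number of the pair (the ticket does not say how the two numbers differ). The helper name decode_rc is hypothetical.

```python
import errno
import re

# Matches the "rc <a>/<b>" pair found in a "Handled RPC" debug line.
RC_PATTERN = re.compile(r"\brc (-?\d+)/(-?\d+)")


def decode_rc(log_line):
    """Return the symbolic errno name for the first rc value, or None.

    Hypothetical helper: assumes rc is a negated Linux errno value.
    """
    match = RC_PATTERN.search(log_line)
    if match is None:
        return None  # no rc pair on this line
    rc = int(match.group(1))
    if rc == 0:
        return "OK"
    # errno.errorcode maps positive errno numbers to their names.
    return errno.errorcode.get(abs(rc), "unknown errno %d" % abs(rc))


# Tail of the "Handled RPC" line quoted in the comment above.
sample = "Request procesed in 121us (294us total) trans 0 rc -34/-34"
print(decode_rc(sample))
```

On Linux, errno 34 is ERANGE ("Numerical result out of range"), which matches the message the client printed for the failed stat().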
| Comment by Bruno Faccini (Inactive) [ 31/Mar/14 ] |
|
Sarah, I added you as a watcher for this ticket since I need your help to better qualify the platform/problem ... |
| Comment by Sarah Liu [ 31/Mar/14 ] |
|
Hello Bruno, here is the link I used for 2.5.1; I used the RHEL6 x86_64 server build: |