[LU-2851] Interop 2.3.0<->2.4 failure on test suite runtests: timeout when doing cp Created: 22/Feb/13 Updated: 23/Nov/17 Resolved: 23/Nov/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Environment: |
server: lustre-2.3.0 |
||
| Attachments: |
|
| Severity: | 3 |
| Rank (Obsolete): | 6902 |
| Description |
|
This issue was created by maloo for sarah <sarah@whamcloud.com>.
This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/88f6b4e2-7571-11e2-93d9-52540035b04c
The sub-test runtests failed with the following error:
CMD: client-27vm7 lctl dl | grep ' IN osc ' 2>/dev/null | wc -l
CMD: client-27vm2.lab.whamcloud.com lctl dl | grep ' IN osc ' 2>/dev/null | wc -l
enable jobstats, set job scheduler as procname_uid
CMD: client-27vm7 /usr/sbin/lctl conf_param lustre.sys.jobid_var=procname_uid
CMD: client-27vm2.lab.whamcloud.com /usr/sbin/lctl get_param -n jobid_var
enable quota as required
CMD: client-27vm7 /usr/sbin/lctl get_param -n version
CMD: client-27vm1,client-27vm7,client-27vm8 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/sbin:/usr/sbin:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/openmpi/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin: NAME=autotest_config sh rpc.sh set_default_debug \"0x33f0404\" \" 0xffb7e3ff\" 32
touching /mnt/lustre at Mon Feb 11 15:50:22 PST 2013
create an empty file /mnt/lustre/hosts.12675
copying /etc/hosts to /mnt/lustre/hosts.12675 |
| Comments |
| Comment by Jodi Levi (Inactive) [ 22/Feb/13 ] |
|
Sarah, |
| Comment by Sarah Liu [ 25/Feb/13 ] |
|
Jodi, I cannot find any logs beyond those above. Here is another instance, found between a 2.1.4 server and a 2.4 client, still with no useful logs: https://maloo.whamcloud.com/test_sets/e246fd9e-7d7e-11e2-85d0-52540035b04c
I will try to run the test manually to get more information. |
| Comment by Jodi Levi (Inactive) [ 05/Mar/13 ] |
|
Sarah, |
| Comment by Sarah Liu [ 12/Mar/13 ] |
|
I can reproduce it manually, here is the client trace
cp R running task 0 7510 5875 0x00000080
 ffff8803241b42a0 ffffffffa050e355 ffff8803245b4e78 ffff8803245b4cb8
 ffff88031e9f84e0 0000000000000010 ffff8803234d8f08 ffffffffa052e086
 ffff88031e9f84e0 ffff88031eb8ed60 0000000000000000 ffffffffa0511da5
Call Trace:
 [<ffffffffa0505995>] ? cl_env_info+0x15/0x20 [obdclass]
 [<ffffffffa094aafa>] ? lov_io_rw_iter_init+0x19a/0x2f0 [lov]
 [<ffffffffa05193c5>] ? cl_io_lock+0x485/0x560 [obdclass]
 [<ffffffffa0519542>] ? cl_io_loop+0xa2/0x1b0 [obdclass]
 [<ffffffffa0a14528>] ? ll_file_io_generic+0x428/0x570 [lustre]
 [<ffffffffa0a158e2>] ? ll_file_aio_write+0x142/0x2c0 [lustre]
 [<ffffffffa0a15bcc>] ? ll_file_write+0x16c/0x2a0 [lustre]
 [<ffffffff81176588>] ? vfs_write+0xb8/0x1a0
 [<ffffffff81176e81>] ? sys_write+0x51/0x90
 [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b |
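For anyone comparing this trace against other runs, a minimal sketch of splitting call-trace frames into fields may help. It is not part of the Lustre test framework; the names `Frame`, `FRAME_RE`, `parse_call_trace` and `SAMPLE` are hypothetical, and the regular expression assumes the `[<address>] ? symbol+0xoffset/0xsize [module]` layout seen in the trace above. The `uncertain` flag records the `?` marker, which the kernel generally prints for addresses it found on the stack but could not verify as real return addresses.

```python
import re
from typing import List, NamedTuple


class Frame(NamedTuple):
    address: int     # return address, parsed from hex
    symbol: str      # function name
    offset: int      # offset into the function
    size: int        # function size
    module: str      # kernel module name, "" for core-kernel symbols
    uncertain: bool  # True when the frame carries the "?" marker


# Assumed frame layout: [<addr>] ? symbol+0xoff/0xsize [module]
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<q>\?\s+)?"
    r"(?P<sym>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<mod>\w+)\])?"
)

# Three frames copied from the client trace in this ticket.
SAMPLE = (
    "[<ffffffffa0519542>] ? cl_io_loop+0xa2/0x1b0 [obdclass] "
    "[<ffffffffa0a15bcc>] ? ll_file_write+0x16c/0x2a0 [lustre] "
    "[<ffffffff81176588>] ? vfs_write+0xb8/0x1a0"
)


def parse_call_trace(text: str) -> List[Frame]:
    """Return every frame found in text, in the order it appears."""
    frames = []
    for m in FRAME_RE.finditer(text):
        frames.append(
            Frame(
                address=int(m["addr"], 16),
                symbol=m["sym"],
                offset=int(m["off"], 16),
                size=int(m["size"], 16),
                module=m["mod"] or "",
                uncertain=m["q"] is not None,
            )
        )
    return frames


if __name__ == "__main__":
    for frame in parse_call_trace(SAMPLE):
        print(frame.module or "kernel", frame.symbol, hex(frame.offset))
```

Because the parser uses `finditer`, it works on both the flattened single-line form and a trace with one frame per line.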
| Comment by Sarah Liu [ 12/Mar/13 ] |
|
client debug log |
| Comment by Keith Mannthey (Inactive) [ 13/Aug/13 ] |
|
A simple patch for runtests has been applied to master and it lets logging work so you can see what has really happened. http://review.whamcloud.com/7014 |
| Comment by Andreas Dilger [ 23/Nov/17 ] |
|
Close old test issues that haven't been seen recently. |