[LU-3930] 1.8.9<->2.4.1 interop: parallel-scale-nfsv3 test iorssf: ERROR: Input/output error
Created: 11/Sep/13  Updated: 13/Sep/13  Resolved: 13/Sep/13

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 1.8.9
Fix Version/s: None

Type: Bug
Priority: Blocker
Reporter: Jian Yu
Assignee: WC Triage
Resolution: Duplicate
Votes: 0
Labels: None
Environment:

Lustre client: http://build.whamcloud.com/job/lustre-b1_8/258/ (1.8.9-wc1)
Lustre server: http://build.whamcloud.com/job/lustre-b2_4/45/ (2.4.1 RC2)


Issue Links:
Duplicate
duplicates LU-3052 Interop 1.8.9<->2.4 failure on test s... Resolved
Severity: 3
Rank (Obsolete): 10383

 Description   

parallel-scale-nfsv3 test iorssf failed as follows:

** error **
ERROR in aiori-POSIX.c (line 256): transfer failed.
ERROR: Input/output error
** exiting **

The console log on the client node showed:

22:26:56:Lustre: DEBUG MARKER: == parallel-scale-nfsv3 test iorssf: iorssf == 22:26:46 (1378704406)
22:31:04:INFO: task IOR:18593 blocked for more than 120 seconds.
22:31:05:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
22:31:05:IOR           D 0000000000000000     0 18593  18589 0x00000080
22:31:06: ffff88007b5e3c88 0000000000000086 ffff88007c55e0b8 ffff880002216680
22:31:07: ffff88007b5e3c48 ffffffff81053a60 ffff88007b5e3c28 ffffffff8106210b
22:31:07: ffff88007a9fa5f8 ffff88007b5e3fd8 000000000000fb88 ffff88007a9fa5f8
22:31:07:Call Trace:
22:31:07: [<ffffffff81053a60>] ? check_preempt_wakeup+0x1c0/0x260
22:31:07: [<ffffffff8106210b>] ? enqueue_task_fair+0xfb/0x100
22:31:07: [<ffffffff8104e0fc>] ? check_preempt_curr+0x7c/0x90
22:31:08: [<ffffffff814eb2ae>] __mutex_lock_slowpath+0x13e/0x180
22:31:08: [<ffffffff814eb14b>] mutex_lock+0x2b/0x50
22:31:08: [<ffffffff811123b9>] generic_file_aio_write+0x59/0xe0
22:31:08: [<ffffffffa03bac00>] ? nfs_file_open+0x0/0xc0 [nfs]
22:31:08: [<ffffffffa03baebe>] nfs_file_write+0xde/0x1f0 [nfs]
22:31:09: [<ffffffff8117628a>] do_sync_write+0xfa/0x140
22:31:09: [<ffffffff81090990>] ? autoremove_wake_function+0x0/0x40
22:31:10: [<ffffffff8120ca26>] ? security_file_permission+0x16/0x20
22:31:10: [<ffffffff81176588>] vfs_write+0xb8/0x1a0
22:31:11: [<ffffffff81176e81>] sys_write+0x51/0x90
22:31:11: [<ffffffff810d3a75>] ? __audit_syscall_exit+0x265/0x290
22:31:11: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b

The console log on the MDS showed:

22:26:49:Lustre: DEBUG MARKER: == parallel-scale-nfsv3 test iorssf: iorssf == 22:26:46 (1378704406)
22:26:49:Lustre: DEBUG MARKER: lfs setstripe /mnt/lustre/d0.ior.ssf -c -1
22:27:12:Lustre: MGS: Client 94cbca82-3d34-1ab9-851b-9caa3d6176ea (at 10.10.4.127@tcp) reconnecting
22:27:13:Lustre: lustre-MDT0000: Client lustre-MDT0000-lwp-OST0004_UUID (at 10.10.4.127@tcp) reconnecting
22:34:11:LustreError: 4777:0:(vvp_io.c:1088:vvp_io_commit_write()) Write page 61252 of inode ffff88003f151bb8 failed -122
22:34:11:LustreError: 4778:0:(vvp_io.c:1088:vvp_io_commit_write()) Write page 319049 of inode ffff88003f151bb8 failed -122
22:34:11:LustreError: 4783:0:(vvp_io.c:1088:vvp_io_commit_write()) Write page 320841 of inode ffff88003f151bb8 failed -122
22:34:11:LustreError: 4783:0:(vvp_io.c:1088:vvp_io_commit_write()) Skipped 49 previous similar messages
22:34:13:LustreError: 4777:0:(vvp_io.c:1088:vvp_io_commit_write()) Write page 833784 of inode ffff88003f151bb8 failed -122
22:34:13:LustreError: 4777:0:(vvp_io.c:1088:vvp_io_commit_write()) Skipped 64 previous similar messages
22:34:14:LustreError: 4777:0:(vvp_io.c:1088:vvp_io_commit_write()) Write page 564568 of inode ffff88003f151bb8 failed -122
22:34:14:LustreError: 4777:0:(vvp_io.c:1088:vvp_io_commit_write()) Skipped 154 previous similar messages
22:34:26:LustreError: 4781:0:(vvp_io.c:1088:vvp_io_commit_write()) Write page 574040 of inode ffff88003f151bb8 failed -122
22:34:26:LustreError: 4781:0:(vvp_io.c:1088:vvp_io_commit_write()) Skipped 270 previous similar messages
22:34:26:LustreError: 4778:0:(vvp_io.c:1088:vvp_io_commit_write()) Write page 586899 of inode ffff88003f151bb8 failed -122
22:34:26:LustreError: 4778:0:(vvp_io.c:1088:vvp_io_commit_write()) Skipped 595 previous similar messages
22:34:39:Lustre: DEBUG MARKER: /usr/sbin/lctl mark  parallel-scale-nfsv3 test_iorssf: @@@@@@ FAIL: ior failed! 1
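To see how many write failures the MDS console log records and which return codes they carry, the log can be tallied with a short script. The following is a minimal sketch, not part of the test suite: the function name `summarize_write_failures` and both regular expressions are assumptions based only on the message format visible in the log above.

```python
import re
from collections import Counter

# Patterns inferred from the console log lines in this ticket (an assumption,
# not an official Lustre log grammar).
WRITE_FAIL = re.compile(r"Write page (\d+) of inode ([0-9a-f]+) failed (-?\d+)")
SKIPPED = re.compile(r"Skipped (\d+) previous similar messages")


def summarize_write_failures(lines):
    """Return (Counter of return codes from logged failures, suppressed message total)."""
    codes = Counter()
    skipped = 0
    for line in lines:
        match = WRITE_FAIL.search(line)
        if match:
            codes[int(match.group(3))] += 1
            continue
        match = SKIPPED.search(line)
        if match:
            skipped += int(match.group(1))
    return codes, skipped


# Three lines copied from the MDS console log above.
sample = [
    "22:34:11:LustreError: 4777:0:(vvp_io.c:1088:vvp_io_commit_write()) Write page 61252 of inode ffff88003f151bb8 failed -122",
    "22:34:11:LustreError: 4783:0:(vvp_io.c:1088:vvp_io_commit_write()) Write page 320841 of inode ffff88003f151bb8 failed -122",
    "22:34:11:LustreError: 4783:0:(vvp_io.c:1088:vvp_io_commit_write()) Skipped 49 previous similar messages",
]

codes, skipped = summarize_write_failures(sample)
print(dict(codes), skipped)
```

Counting the "Skipped N previous similar messages" lines separately matters because the kernel rate-limits repeated messages, so the explicitly logged failures understate the real total.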

Maloo reports:
https://maloo.whamcloud.com/test_sets/c137f0b4-19ef-11e3-a95b-52540035b04c
https://maloo.whamcloud.com/test_sets/c9c00352-19ef-11e3-a95b-52540035b04c



 Comments   
Comment by Oleg Drokin [ 12/Sep/13 ]

-122 is EDQUOT. Was quota enabled from some previous run?
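The mapping from a negative kernel return code to its symbolic name can be checked with Python's standard `errno` module. This is a minimal sketch; the helper name `decode_kernel_rc` is hypothetical. On Linux EDQUOT has the value 122, but the number differs on other platforms, so the example looks the constant up instead of hard-coding it.

```python
import errno
import os


def decode_kernel_rc(rc):
    """Map a negative kernel-style return code to (symbolic name, message)."""
    code = abs(rc)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# Decode the two errors that appear in this ticket: the quota error seen in
# the Lustre log and the I/O error reported by IOR.
name, message = decode_kernel_rc(-errno.EDQUOT)
print(name, message)
print(*decode_kernel_rc(-errno.EIO))
```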

Comment by Jian Yu [ 13/Sep/13 ]

Quoting Oleg Drokin: "-122 is EDQUOT. Was quota enabled from some previous run?"

This was caused by LU-3052.

When the tests were run without specifying ENABLE_QUOTA=yes, both parallel-scale-nfsv3 and parallel-scale-nfsv4 passed:
https://maloo.whamcloud.com/test_sessions/02aa89e2-1c22-11e3-bede-52540035b04c
https://maloo.whamcloud.com/test_sessions/7f8b03ae-1c51-11e3-ae26-52540035b04c
