[LU-11559] sanity-sec test_21: req wrong generation: Created: 23/Oct/18 Updated: 19/Dec/23 Resolved: 31/Oct/21 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.12.5, Lustre 2.12.8 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This issue was created by maloo for Andreas Dilger <adilger@whamcloud.com>

This issue relates to the following test suite run:

test_21 failed with the following error in the test logs, not sure if it is related:

CMD: onyx-30vm6.onyx.whamcloud.com runas -u60004 -g60004 -G60004 lfs quota -q /mnt/lustre
running as uid/gid/euid/egid 60004/60004/60004/60004, groups: 60004
[lfs] [quota] [-q] [/mnt/lustre]
/usr/lib64/lustre/tests/sanity-sec.sh: line 1273: [0]: syntax error: operand expected (error token is "[0]")
CMD: onyx-30vm9 /usr/sbin/lctl nodemap_del c0
CMD: onyx-30vm9 /usr/sbin/lctl nodemap_del c1
test_21 returned 1

In the console logs for the client I found:

[ 9492.526296] LustreError: 16852:0:(client.c:1193:ptlrpc_import_delay_req()) @@@ req wrong generation: req@ffff989925543000 x1615054606938112/t0(0) o19->lustre-OST0001-osc-ffff9899209db800@10.2.4.100@tcp:28/4 lens 336/336 e 0 to 0 dl 0 ref 2 fl Rpc:NW/0/ffffffff rc 0/-1
[ 9492.528661] LustreError: 16852:0:(osc_quota.c:293:osc_quotactl()) ptlrpc_queue_wait failed, rc: -5 |
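For reference, the bash diagnostic above is what bash prints when an empty string ends up inside an arithmetic expression. A minimal sketch that reproduces the same message under that assumption (hypothetical variable names, not the actual sanity-sec.sh code):

#!/bin/bash
# Hypothetical reproduction of the failure mode: 'lfs quota -q' printed nothing
# because the quotactl RPC failed with -EIO, so the value the test parses out
# of its output is the empty string.
quota_val=""          # stands in for: $(runas ... lfs quota -q /mnt/lustre | awk ...)

# With the empty value, bash is left evaluating just "[0]" and reports the same
# error seen in the test log:
#   line N: [0]: syntax error: operand expected (error token is "[0]")
echo $((${quota_val}[0]))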
| Comments |
| Comment by Sebastien Buisson [ 23/Oct/18 ] |
|
Hit again: this time in test_22, and without the "req wrong generation" error message in the console (maybe skipped?). It looks like the 'lfs quota' command, as run by user 60005, returns nothing, hence the script error at sanity-sec.sh:1273. |
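If the empty output is indeed the trigger, a guard before the arithmetic would turn this into a clearer failure. A sketch only, not a patch against sanity-sec.sh; error() and $MOUNT are assumed to be the usual Lustre test-framework helpers:

# Check that the quota query actually produced output before parsing it
# (hypothetical variable names):
quota_out=$(runas -u60005 -g60005 -G60005 lfs quota -q $MOUNT)
[ -n "$quota_out" ] ||
	error "lfs quota -q returned no output (quotactl RPC failed?)"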
| Comment by James Nunez (Inactive) [ 22/Feb/19 ] |
|
I'm seeing something similar in sanityn test 77j:

jobid_var=procname_uid
ost.OSS.ost_io.nrs_policies=tbf opcode
ost.OSS.ost_io.nrs_tbf_rule=start ost_r opcode={ost_read} rate=5
ost.OSS.ost_io.nrs_tbf_rule=start ost_w opcode={ost_write} rate=20
CMD: trevis-24vm1.trevis.whamcloud.com,trevis-24vm2 dd if=/dev/zero of=/mnt/lustre/d77j.sanityn/nrs_r_trevis-24vm1.trevis.whamcloud.com bs=1M count=16
trevis-24vm2: dd: error writing ‘/mnt/lustre/d77j.sanityn/nrs_r_trevis-24vm1.trevis.whamcloud.com’: Input/output error
trevis-24vm2: 3+0 records in
trevis-24vm2: 2+0 records out
trevis-24vm2: 2097152 bytes (2.1 MB) copied, 0.0333487 s, 62.9 MB/s
trevis-24vm1: 16+0 records in
trevis-24vm1: 16+0 records out
trevis-24vm1: 16777216 bytes (17 MB) copied, 0.0697378 s, 241 MB/s
sanityn test_77j: @@@@@@ FAIL: dd at 0 on client failed (1)
We do see an error in the client (vm2) console log:

[38020.602925] Lustre: DEBUG MARKER: == sanityn test 77j: check TBF-OPCode NRS policy ===================================================== 10:28:04 (1549189684)
[38021.702293] Lustre: DEBUG MARKER: dd if=/dev/zero of=/mnt/lustre/d77j.sanityn/nrs_r_trevis-24vm1.trevis.whamcloud.com bs=1M count=16
[38021.901674] LustreError: 15575:0:(client.c:1193:ptlrpc_import_delay_req()) @@@ req wrong generation: req@ffff9b73c0197600 x1624403585735904/t0(0) o101->lustre-OST0003-osc-ffff9b742883f800@10.9.5.44@tcp:28/4 lens 328/400 e 0 to 0 dl 0 ref 2 fl Rpc:W/0/ffffffff rc 0/-1
[38022.511219] Lustre: DEBUG MARKER: /usr/sbin/lctl mark sanityn test_77j: @@@@@@ FAIL: dd at 0 on client failed \(1\) |
| Comment by Andreas Dilger [ 22/Mar/19 ] |
|
I suspect this is the same as |
| Comment by Sarah Liu [ 02/Jun/20 ] |
|
Seeing a similar error in b2_12 testing:

CMD: trevis-33vm11.trevis.whamcloud.com chmod 200 /mnt/lustre/d21.sanity-sec
CMD: trevis-33vm11.trevis.whamcloud.com chmod 300 /mnt/lustre/d21.sanity-sec
CMD: trevis-33vm11.trevis.whamcloud.com chmod 400 /mnt/lustre/d21.sanity-sec
CMD: trevis-33vm11.trevis.whamcloud.com chmod 500 /mnt/lustre/d21.sanity-sec
CMD: trevis-33vm11.trevis.whamcloud.com chmod 600 /mnt/lustre/d21.sanity-sec
CMD: trevis-33vm11.trevis.whamcloud.com chmod 700 /mnt/lustre/d21.sanity-sec
CMD: trevis-33vm11.trevis.whamcloud.com chmod 303 /mnt/lustre/d21.sanity-sec
CMD: trevis-33vm11.trevis.whamcloud.com chmod 777 /mnt/lustre/d21.sanity-sec
CMD: trevis-33vm12 runas -u60006 -g60006 -G60006 lfs quota -q /mnt/lustre
trevis-33vm12: running as uid/gid/euid/egid 60006/60006/60006/60006, groups: 60006
trevis-33vm12: [lfs] [quota] [-q] [/mnt/lustre]
pdsh@trevis-33vm11: trevis-33vm12: ssh exited with exit code 2
/usr/lib64/lustre/tests/sanity-sec.sh: line 1267: [8]: syntax error: operand expected (error token is "[8]")
CMD: trevis-47vm10 /usr/sbin/lctl nodemap_del c0
CMD: trevis-47vm10 /usr/sbin/lctl nodemap_del c1
CMD: trevis-47vm10 /usr/sbin/lctl nodemap_modify --name default --property admin --value 0
CMD: trevis-47vm10 /usr/sbin/lctl get_param -n nodemap.active
CMD: trevis-47vm10 /usr/sbin/lctl list_nids | grep tcp | cut -f 1 -d @
CMD: trevis-47vm10 /usr/sbin/lctl get_param nodemap.default.admin_nodemap
On MGS 10.9.3.130, default.admin_nodemap = nodemap.default.admin_nodemap=0 |
| Comment by Alena Nikitenko [ 23/Nov/21 ] |
|
Might have encountered something similar in sanity-quota test_37 (2.12.8):

https://testing.whamcloud.com/test_sets/96c30097-9119-47df-a3ab-58781f980762

[64651.546456] LustreError: 3888:0:(client.c:1228:ptlrpc_import_delay_req()) @@@ req wrong generation: req@ffffa023604d2880 x1716956121891456/t0(0) o19->lustre-OST0004-osc-ffffa0237b2b5000@10.240.24.13@tcp:28/4 lens 336/336 e 0 to 0 dl 0 ref 2 fl Rpc:NW/0/ffffffff rc 0/-1
[64651.551120] LustreError: 3888:0:(osc_quota.c:308:osc_quotactl()) ptlrpc_queue_wait failed, rc: -5 |