[LU-1725] Test failure on test suite parallel-scale-nfsv4, subtest test_compilebench Created: 08/Aug/12  Updated: 29/May/17  Resolved: 29/May/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.3.0
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Maloo Assignee: Zhenyu Xu
Resolution: Cannot Reproduce Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 5783

 Description   

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/bd3f01d2-dcd3-11e1-8744-52540035b04c.

The sub-test test_compilebench failed with the following error:

compilebench failed: 1

The following error message was found in the stdout log:

IOError: [Errno 13] Permission denied: '/mnt/lustre/d0.compilebench/native-0/COPYING'
 parallel-scale-nfsv4 test_compilebench: @@@@@@ FAIL: compilebench failed: 1 


 Comments   
Comment by Jodi Levi (Inactive) [ 09/Aug/12 ]

This is likely a problem in the test environment, which needs to be fixed to grant the required permissions.

Comment by Peter Jones [ 10/Aug/12 ]

Bobijam

What do you advise here?

Peter

Comment by Zhenyu Xu [ 11/Aug/12 ]

07:50:18:client-27vm6: lsof: WARNING: can't stat() nfs file system /mnt/lustre
07:50:18:client-27vm6: Output information may be incomplete.
07:50:18:client-27vm6: lsof: status error on /mnt/lustre: Permission denied
07:50:18:client-27vm6: lsof 4.82

This looks like a test environment issue: the NFS client nodes cannot access /mnt/lustre.
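A quick way to confirm this kind of access problem on a client node is to check whether the mount point can be stat'd and written to before running the benchmark. The following is only a minimal sketch: the helper name `check_mount_access` is hypothetical and not part of the Lustre test framework, and `/mnt/lustre` is simply the path taken from the log above.

```python
import errno
import os
import tempfile


def check_mount_access(path):
    """Return (ok, detail) describing whether `path` can be stat'd and written to.

    Hypothetical helper for illustration; not part of the Lustre test suite.
    """
    try:
        # lsof reported "can't stat()", so try the same call first.
        os.stat(path)
    except OSError as exc:
        return False, "stat failed: " + errno.errorcode.get(exc.errno, str(exc.errno))
    try:
        # Create and remove a scratch file, much as compilebench creates files.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
    except OSError as exc:
        return False, "write failed: " + errno.errorcode.get(exc.errno, str(exc.errno))
    return True, "ok"


if __name__ == "__main__":
    # Path assumed from the log output in this ticket.
    print(check_mount_access("/mnt/lustre"))
```

A permission problem like the one reported here would be expected to show up as an `EACCES` detail from either the stat or the write step.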

Comment by Jian Yu [ 13/Aug/12 ]

The MDS console log showed the following:

08:09:44:Lustre: DEBUG MARKER: == parallel-scale-nfsv4 test compilebench: compilebench == 08:09:42 (1343920182)
08:09:44:Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000200000400-0x0000000240000400):0:mdt
08:09:44:Lustre: DEBUG MARKER: /usr/sbin/lctl mark .\/compilebench -D \/mnt\/lustre\/d0.compilebench -i 4         -r 4 --makej
08:09:45:Lustre: DEBUG MARKER: ./compilebench -D /mnt/lustre/d0.compilebench -i 4 -r 4 --makej
08:11:25:LustreError: 2571:0:(mdd_lov.c:464:mdd_lov_create()) ASSERTION( parent != ((void *)0) ) failed: 
08:11:25:LustreError: 2571:0:(mdd_lov.c:464:mdd_lov_create()) LBUG
08:11:25:Pid: 2571, comm: mdt00_002
08:11:25:
08:11:25:Call Trace:
08:11:25: [<ffffffffa04de905>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
08:11:25: [<ffffffffa04def17>] lbug_with_loc+0x47/0xb0 [libcfs]
08:11:26: [<ffffffffa0bf978a>] mdd_lov_create+0x1e2a/0x2030 [mdd]
08:11:26: [<ffffffffa0c0054d>] mdd_create_data+0x2ed/0x550 [mdd]
08:11:26: [<ffffffffa0806896>] ? __req_capsule_get+0x176/0x750 [ptlrpc]
08:11:26: [<ffffffffa0914437>] cml_create_data+0x97/0x200 [cmm]
08:11:26: [<ffffffffa0c8126d>] mdt_finish_open+0x124d/0x1790 [mdt]
08:11:26: [<ffffffffa0c82ae6>] mdt_open_by_fid+0x2e6/0x410 [mdt]
08:11:26: [<ffffffffa0c82eeb>] mdt_reint_open+0x2db/0x18a0 [mdt]
08:11:26: [<ffffffffa0c0be4e>] ? md_ucred+0x1e/0x60 [mdd]
08:11:26: [<ffffffffa0c51235>] ? mdt_ucred+0x15/0x20 [mdt]
08:11:26: [<ffffffffa0c6897c>] ? mdt_root_squash+0x2c/0x3e0 [mdt]
08:11:26: [<ffffffffa0806896>] ? __req_capsule_get+0x176/0x750 [ptlrpc]
08:11:26: [<ffffffffa07da656>] ? lustre_pack_reply_flags+0xb6/0x210 [ptlrpc]
08:11:26: [<ffffffffa0c6d281>] mdt_reint_rec+0x41/0xe0 [mdt]
08:11:26: [<ffffffffa0c66ada>] mdt_reint_internal+0x50a/0x810 [mdt]
08:11:26: [<ffffffffa0c670ad>] mdt_intent_reint+0x1ed/0x500 [mdt]
08:11:26: [<ffffffffa0c632c1>] mdt_intent_policy+0x371/0x6a0 [mdt]
08:11:26: [<ffffffffa0792881>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc]
08:11:27: [<ffffffffa07ba7ef>] ldlm_handle_enqueue0+0x48f/0xf70 [ptlrpc]
08:11:27: [<ffffffffa0c63636>] mdt_enqueue+0x46/0x130 [mdt]
08:11:27: [<ffffffffa0c5a932>] mdt_handle_common+0x922/0x1740 [mdt]
08:11:27: [<ffffffffa0c5b825>] mdt_regular_handle+0x15/0x20 [mdt]
08:11:27: [<ffffffffa07ea83d>] ptlrpc_server_handle_request+0x40d/0xea0 [ptlrpc]
08:11:27: [<ffffffffa04df65e>] ? cfs_timer_arm+0xe/0x10 [libcfs]
08:11:27: [<ffffffffa07e1cc7>] ? ptlrpc_wait_event+0xa7/0x2a0 [ptlrpc]
08:11:27: [<ffffffff810533f3>] ? __wake_up+0x53/0x70
08:11:27: [<ffffffffa07ebe29>] ptlrpc_main+0xb59/0x1860 [ptlrpc]
08:11:27: [<ffffffffa07eb2d0>] ? ptlrpc_main+0x0/0x1860 [ptlrpc]
08:11:27: [<ffffffff8100c14a>] child_rip+0xa/0x20
08:11:27: [<ffffffffa07eb2d0>] ? ptlrpc_main+0x0/0x1860 [ptlrpc]
08:11:27: [<ffffffffa07eb2d0>] ? ptlrpc_main+0x0/0x1860 [ptlrpc]
08:11:27: [<ffffffff8100c140>] ? child_rip+0x0/0x20
08:11:27:
08:11:27:Kernel panic - not syncing: LBUG

Comment by Zhenyu Xu [ 13/Aug/12 ]

Yujian,

The following assertion failure is fixed by LU-1697:

08:11:25:LustreError: 2571:0:(mdd_lov.c:464:mdd_lov_create()) ASSERTION( parent != ((void *)0) ) failed:
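To check whether other console logs contain the same assertion, the LustreError lines can be picked out with a regular expression. This is a rough sketch, assuming the line layout seen in the MDS console log above; the names `ASSERT_RE` and `find_assertions` are made up for illustration.

```python
import re

# Pattern assumed from the console log lines quoted in this ticket:
#   LustreError: <pid>:<n>:(<file>:<line>:<func>()) ASSERTION( <expr> ) failed
ASSERT_RE = re.compile(
    r"LustreError: (?P<pid>\d+):\d+:"
    r"\((?P<file>[^:]+):(?P<line>\d+):(?P<func>\w+)\(\)\) "
    r"ASSERTION\( (?P<expr>.*?) \) failed"
)


def find_assertions(log_text):
    """Return one dict per failed ASSERTION line found in `log_text`."""
    return [m.groupdict() for m in ASSERT_RE.finditer(log_text)]


if __name__ == "__main__":
    sample = (
        "08:11:25:LustreError: 2571:0:(mdd_lov.c:464:mdd_lov_create()) "
        "ASSERTION( parent != ((void *)0) ) failed: \n"
        "08:11:25:LustreError: 2571:0:(mdd_lov.c:464:mdd_lov_create()) LBUG\n"
    )
    for hit in find_assertions(sample):
        print(hit["file"], hit["line"], hit["func"], hit["expr"])
```

Only the ASSERTION line is reported; the LBUG line that follows it does not match the pattern.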

Comment by Zhenyu Xu [ 14/Aug/12 ]

I think it's a random test environment setup problem (the NFS client node cannot access its NFS-mounted directory). We can close it for now and reopen it if we hit the same issue again.

Comment by Sarah Liu [ 21/Aug/12 ]

I did not see this error on tag-2.2.93:
https://maloo.whamcloud.com/test_sessions/384b54a8-e99e-11e1-881a-52540035b04c

Comment by Peter Jones [ 21/Aug/12 ]

Dropping the priority because this is no longer happening.

Comment by Andreas Dilger [ 29/May/17 ]

Close old ticket.

Generated at Sat Feb 10 01:19:09 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.