[LU-4394] LBUG: (osc_lock.c:497:osc_lock_upcall()) ASSERTION( lock->cll_state >= CLS_QUEUING ) failed
Created: 18/Dec/13  Updated: 19/Dec/13  Resolved: 18/Dec/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.2
Fix Version/s: None

Type: Bug
Priority: Blocker
Reporter: Jian Yu
Assignee: WC Triage
Resolution: Duplicate
Votes: 0
Labels: None
Environment:

Lustre Tag: v2_4_2_RC1
Lustre Build: http://build.whamcloud.com/job/lustre-b2_4/69/
Distro/Arch: RHEL6.4/x86_64


Issue Links:
Duplicate
duplicates LU-3889 LBUG: (osc_lock.c:497:osc_lock_upcal... Resolved
Severity: 3
Rank (Obsolete): 12057

Description

During testing of Lustre 2.4.2 RC1, the parallel-scale-nfsv3 test iorssf failed as follows:

** error **
ERROR in aiori-POSIX.c (line 256): transfer failed.
ERROR: Permission denied
** exiting **

The console log on the MDS node showed:

01:30:43:Lustre: DEBUG MARKER: == parallel-scale-nfsv3 test iorssf: iorssf == 01:27:43 (1387358863)
01:30:43:Lustre: DEBUG MARKER: lfs setstripe /mnt/lustre/d0.ior.ssf -c -1
01:30:43:LustreError: 20587:0:(osc_lock.c:497:osc_lock_upcall()) ASSERTION( lock->cll_state >= CLS_QUEUING ) failed: 
01:30:43:LustreError: 20587:0:(osc_lock.c:497:osc_lock_upcall()) LBUG
01:30:43:Pid: 20587, comm: ptlrpcd_0
01:30:43:
01:30:43:Call Trace:
01:30:43: [<ffffffffa0456895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
01:30:43: [<ffffffffa0456e97>] lbug_with_loc+0x47/0xb0 [libcfs]
01:30:43: [<ffffffffa091b1ea>] osc_lock_upcall+0x44a/0x5f0 [osc]
01:30:43: [<ffffffffa091ada0>] ? osc_lock_upcall+0x0/0x5f0 [osc]
01:30:43: [<ffffffffa08fb876>] osc_enqueue_fini+0x106/0x240 [osc]
01:30:43: [<ffffffffa09002f2>] osc_enqueue_interpret+0xe2/0x1e0 [osc]
01:30:43: [<ffffffffa0754ebc>] ptlrpc_check_set+0x2ac/0x1b20 [ptlrpc]
01:30:43: [<ffffffffa078267b>] ptlrpcd_check+0x53b/0x560 [ptlrpc]
01:30:43: [<ffffffffa0782ba3>] ptlrpcd+0x233/0x390 [ptlrpc]
01:30:43: [<ffffffff81063990>] ? default_wake_function+0x0/0x20
01:30:43: [<ffffffffa0782970>] ? ptlrpcd+0x0/0x390 [ptlrpc]
01:30:43: [<ffffffff8100c0ca>] child_rip+0xa/0x20
01:30:43: [<ffffffffa0782970>] ? ptlrpcd+0x0/0x390 [ptlrpc]
01:30:43: [<ffffffffa0782970>] ? ptlrpcd+0x0/0x390 [ptlrpc]
01:30:43: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
01:30:43:
01:30:43:Kernel panic - not syncing: LBUG

Maloo report: https://maloo.whamcloud.com/test_sets/78c39230-67cd-11e3-846d-52540035b04c



Comments
Comment by Jian Yu [ 18/Dec/13 ]

This failure was previously reported in LU-3889. Because it did not occur on Lustre b2_4 build #67 (across more than 6 full group test runs), I am not sure whether it is a regression in Lustre b2_4 build #69, so I created this new ticket to investigate.

Comment by Peter Jones [ 18/Dec/13 ]

Oleg and Bobijam agree that this is a duplicate of LU-3889.
