[LU-4988] seq_client_rpc() ASSERTION( exp != ((void *)0) && !IS_ERR(exp) ) failed Created: 01/May/14  Updated: 01/May/14  Resolved: 01/May/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.6.0
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: John Hammond Assignee: Di Wang
Resolution: Duplicate Votes: 0
Labels: dne2

Severity: 3
Rank (Obsolete): 13816

 Description   
# cd ~/lustre-release
# git describe
2.5.58-48-g72f0d50
# git diff
diff --git a/lustre/tests/sanity.sh b/lustre/tests/sanity.sh
index e2034b8..536cda5 100644
--- a/lustre/tests/sanity.sh
+++ b/lustre/tests/sanity.sh
@@ -12411,6 +12411,10 @@ test_striped_dir() {
        local stripe_count
        local stripe_index

+       cleanup || exit 23
+       setup || exit 42
+       set -x
+
        mkdir -p $DIR/$tdir
        $LFS setdirstripe -i $mdt_index -c 2 -t all_char $DIR/$tdir/striped_dir ||
                error "set striped dir error"
# export MDSCOUNT=4
# llmount.sh
...
# ONLY=300a sh lustre/tests/sanity.sh
Logging to shared log directory: /tmp/test_logs/1398913203
t: Checking config lustre mounted on /mnt/lustre
Checking servers environments
Checking clients t environments
Using TIMEOUT=20
disable quota as required
osd-ldiskfs.track_declares_assert=1
running as uid/gid/euid/egid 500/500/500/500, groups:
 [touch] [/mnt/lustre/d0_runas_test/f26079]
excepting tests: 76 42a 42b 42c 42d 45 51d 68b
skipping tests SLOW=no: 24o 27m 64b 68 71 77f 78 115 124b 230d
preparing for tests involving mounts
mke2fs 1.42.7.wc2 (07-Nov-2013)

debug=-1
resend_count is set to 4 4
resend_count is set to 4 4
resend_count is set to 4 4
resend_count is set to 4 4
resend_count is set to 4 4

== sanity test 300a: basic striped dir sanity test == 22:00:09 (1398913209)
cln..Stopping clients: t /mnt/lustre (opts:)
Stopping client t /mnt/lustre opts:
Stopping clients: t /mnt/lustre2 (opts:)
Stopping /mnt/mds1 (opts:-f) on t
Stopping /mnt/mds2 (opts:-f) on t
Stopping /mnt/mds3 (opts:-f) on t
Stopping /mnt/mds4 (opts:-f) on t
Stopping /mnt/ost1 (opts:-f) on t
Stopping /mnt/ost2 (opts:-f) on t
waited 0 for 31 ST ost OSS OSS_uuid 0
modules unloaded.
mnt..Loading modules from /root/lustre-release/lustre
detected 4 online CPUs by sysfs
Force libcfs to create 2 CPU partitions
../libcfs/libcfs/libcfs options: 'cpu_npartitions=2'
debug=vfstrace rpctrace dlmtrace neterror ha config ioctl super
subsystem_debug=all -lnet -lnd -pinger
gss/krb5 is not supported
quota/lquota options: 'hash_lqs_cur_bits=3'
Checking servers environments
Checking clients t environments
Loading modules from /root/lustre-release/lustre
detected 4 online CPUs by sysfs
Force libcfs to create 2 CPU partitions
debug=vfstrace rpctrace dlmtrace neterror ha config ioctl super
subsystem_debug=all -lnet -lnd -pinger
gss/krb5 is not supported
Setup mgs, mdt, osts
Starting mds1:   -o loop /tmp/lustre-mdt1 /mnt/mds1
Started lustre-MDT0000
Starting mds2:   -o loop /tmp/lustre-mdt2 /mnt/mds2
Started lustre-MDT0001
Starting mds3:   -o loop /tmp/lustre-mdt3 /mnt/mds3
Started lustre-MDT0002
Starting mds4:   -o loop /tmp/lustre-mdt4 /mnt/mds4
Started lustre-MDT0003
Starting ost1:   -o loop /tmp/lustre-ost1 /mnt/ost1
Started lustre-OST0000
Starting ost2:   -o loop /tmp/lustre-ost2 /mnt/ost2
Started lustre-OST0001
Starting client: t: -o user_xattr,flock t@tcp:/lustre /mnt/lustre
Starting client t: -o user_xattr,flock t@tcp:/lustre /mnt/lustre
Started clients t:
t@tcp:/lustre on /mnt/lustre type lustre (rw,user_xattr,flock)
Using TIMEOUT=20
disable quota as required
done
+ mkdir -p /mnt/lustre/d300a.sanity
+ /root/lustre-release/lustre/utils/lfs setdirstripe -i 0 -c 2 -t all_char /mnt/lustre/d300a.sanity/striped_dir

Message from syslogd@t at Apr 30 22:01:18 ...
 kernel:[ 1468.977712] LustreError: 29243:0:(fid_request.c:72:seq_client_rpc()) ASSERTION( exp != ((void *)0) && !IS_ERR(exp) ) failed:

Message from syslogd@t at Apr 30 22:01:18 ...
 kernel:[ 1468.981246] LustreError: 29243:0:(fid_request.c:72:seq_client_rpc()) LBUG


[ 1468.977712] LustreError: 29243:0:(fid_request.c:72:seq_client_rpc()) ASSERTION( exp != ((void *)0) && !IS_ERR(exp) ) failed:
[ 1468.981246] LustreError: 29243:0:(fid_request.c:72:seq_client_rpc()) LBUG
[ 1468.982681] Pid: 29243, comm: mdt00_002
[ 1468.983420]
[ 1468.983421] Call Trace:
[ 1468.984216]  [<ffffffffa0a498c5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[ 1468.985536]  [<ffffffffa0a49ec7>] lbug_with_loc+0x47/0xb0 [libcfs]
[ 1468.986732]  [<ffffffffa029a28d>] seq_client_rpc+0x7cd/0x940 [fid]
[ 1468.987922]  [<ffffffff810b777d>] ? trace_hardirqs_on+0xd/0x10
[ 1468.989032]  [<ffffffffa029a817>] seq_client_alloc_seq+0x417/0x490 [fid]
[ 1468.990321]  [<ffffffffa0299443>] ? seq_fid_alloc_prep+0x43/0xc0 [fid]
[ 1468.991553]  [<ffffffffa029ad2f>] seq_client_alloc_fid+0xef/0x490 [fid]
[ 1468.992803]  [<ffffffff81067bc0>] ? default_wake_function+0x0/0x20
[ 1468.993987]  [<ffffffff81538b40>] ? kmemleak_alloc+0x20/0xd0
[ 1468.995069]  [<ffffffffa09b57d2>] osp_fid_alloc+0xe2/0xf0 [osp]
[ 1468.996194]  [<ffffffffa02f9b0a>] lod_declare_xattr_set_lmv+0x7ba/0x1a30 [lod]
[ 1468.997556]  [<ffffffffa02ec221>] ? lod_ea_store_resize+0x1e1/0x7e0 [lod]
[ 1468.998841]  [<ffffffffa03001af>] lod_dir_striping_create_internal+0x24f/0x12d0 [lod]
[ 1469.000315]  [<ffffffffa03016e7>] lod_declare_object_create+0x227/0x390 [lod]
[ 1469.001685]  [<ffffffffa096eb54>] mdd_declare_object_create_internal+0xb4/0x1e0 [mdd]
[ 1469.003158]  [<ffffffffa09632e3>] mdd_create+0x813/0x18a0 [mdd]
[ 1469.004325]  [<ffffffffa03b0355>] ? lustre_msg_buf+0x55/0x60 [ptlrpc]
[ 1469.005591]  [<ffffffffa0d1ec43>] mdt_reint_create+0xac3/0xfa0 [mdt]
[ 1469.006807]  [<ffffffffa0d1621c>] ? mdt_root_squash+0x2c/0x410 [mdt]
[ 1469.008059]  [<ffffffffa03d7ca6>] ? __req_capsule_get+0x166/0x6e0 [ptlrpc]
[ 1469.009398]  [<ffffffffa03b138e>] ? lustre_pack_reply_flags+0xae/0x1f0 [ptlrpc]
[ 1469.010800]  [<ffffffffa0d1a151>] mdt_reint_rec+0x41/0xe0 [mdt]
[ 1469.011938]  [<ffffffffa0cffdc3>] mdt_reint_internal+0x4c3/0x7c0 [mdt]
[ 1469.013179]  [<ffffffffa0d0064b>] mdt_reint+0x6b/0x120 [mdt]
[ 1469.014309]  [<ffffffffa0412155>] tgt_request_handle+0x245/0xad0 [ptlrpc]
[ 1469.015625]  [<ffffffffa03c2701>] ptlrpc_main+0xce1/0x1970 [ptlrpc]
[ 1469.016861]  [<ffffffffa03c1a20>] ? ptlrpc_main+0x0/0x1970 [ptlrpc]
[ 1469.018074]  [<ffffffff8109eab6>] kthread+0x96/0xa0
[ 1469.019005]  [<ffffffff8100c30a>] child_rip+0xa/0x20
[ 1469.019954]  [<ffffffff81554710>] ? _spin_unlock_irq+0x30/0x40
[ 1469.021053]  [<ffffffff8100bb10>] ? restore_args+0x0/0x30
[ 1469.022096]  [<ffffffff8109ea20>] ? kthread+0x0/0xa0
[ 1469.023040]  [<ffffffff8100c300>] ? child_rip+0x0/0x20
[ 1469.024022]


 Comments   
Comment by Di Wang [ 01/May/14 ]

This seems to be a duplicate of LU-4850. I suspect http://review.whamcloud.com/#/c/9883 should fix this problem.

Comment by Di Wang [ 01/May/14 ]

Just tested it: http://review.whamcloud.com/#/c/9883 fixes the problem. I also added this test case in 9883. I will close this ticket for now.

Comment by Di Wang [ 01/May/14 ]

Duplicate of LU-4850.
