[LU-272] Got LBUG when running replay-single test_74 Created: 03/May/11  Updated: 21/Oct/11  Resolved: 13/Jun/11

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.1.0
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Sarah Liu Assignee: Lai Siyao
Resolution: Fixed Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 5010

 Description   

Running replay-single on the latest lustre-master build (RHEL5/x86_64) triggered an LBUG:

Trace from the OST:
------------------
May 2 23:36:09 fat-intel-2 kernel: Lustre: DEBUG MARKER: == replay-single test 74: Ensure applications don't fail waiting for OST recovery ===================== 23:36:09 (1304404569)
May 2 23:36:09 fat-intel-2 xinetd[5855]: EXIT: shell status=0 pid=28690 duration=0(sec)
May 2 23:36:09 fat-intel-2 xinetd[5855]: START: shell pid=28711 from=192.168.4.5
May 2 23:36:09 fat-intel-2 rshd[28720]: root@client-5-ib.lab.whamcloud.com as root: cmd='(PATH=$PATH:/usr/lib64/lustre/utils:/usr/lib64/lustre/tests:/sbin:/usr/sbin; cd /usr/lib64/lustre/tests; LUSTRE="/usr/lib64/lustre" sh -c "grep -c /mnt/ost1' ' /proc/mounts");echo XXRETCODE:$?'
May 2 23:36:09 fat-intel-2 xinetd[5855]: EXIT: shell status=0 pid=28711 duration=0(sec)
May 2 23:36:09 fat-intel-2 xinetd[5855]: START: shell pid=28740 from=192.168.4.5
May 2 23:36:09 fat-intel-2 rshd[28745]: root@client-5-ib.lab.whamcloud.com as root: cmd='(PATH=$PATH:/usr/lib64/lustre/utils:/usr/lib64/lustre/tests:/sbin:/usr/sbin; cd /usr/lib64/lustre/tests; LUSTRE="/usr/lib64/lustre" sh -c "umount -d /mnt/ost1");echo XXRETCODE:$?'
May 2 23:36:09 fat-intel-2 kernel: Lustre: Failing over lustre-OST0000
May 2 23:36:09 fat-intel-2 kernel: LustreError: 28629:0:(llog_cat.c:485:llog_cat_process_thread()) llog_cat_process() failed -4
May 2 23:36:09 fat-intel-2 kernel: LustreError: 28629:0:(llog_cat.c:485:llog_cat_process_thread()) Skipped 1 previous similar message
May 2 23:36:11 fat-intel-2 kernel: Lustre: lustre-OST0000: shutting down for failover; client state will be preserved.
May 2 23:36:11 fat-intel-2 kernel: LustreError: 28243:0:(lprocfs_status.c:430:lprocfs_remove_proc_entry()) ASSERTION(parent != NULL) failed
May 2 23:36:11 fat-intel-2 kernel: LustreError: 28243:0:(lprocfs_status.c:430:lprocfs_remove_proc_entry()) LBUG
May 2 23:36:11 fat-intel-2 kernel: Pid: 28243, comm: ll_cfg_requeue
May 2 23:36:11 fat-intel-2 kernel:
May 2 23:36:11 fat-intel-2 kernel: Call Trace:
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff886f65f1>] libcfs_debug_dumpstack+0x51/0x60 [libcfs]
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff886f6b2a>] lbug_with_loc+0x7a/0xd0 [libcfs]
May 2 23:36:11 fat-intel-2 kernel: Lustre: OST lustre-OST0000 has stopped.
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff88701960>] cfs_tracefile_init+0x0/0x10a [libcfs]
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff887ba968>] lprocfs_remove_proc_entry+0x38/0x60 [obdclass]
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff88c3dea0>] filter_cleanup+0xe0/0x4a0 [obdfilter]
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff887c52cf>] class_decref+0x43f/0x5b0 [obdclass]
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff88c40c10>] filter_set_info_async+0x5b0/0x1270 [obdfilter]
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff88797b0a>] llog_close+0x1aa/0x230 [obdclass]
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff887ac215>] obd_devlist_next+0x65/0x80 [obdclass]
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff887ae00c>] class_notify_sptlrpc_conf+0x46c/0x490 [obdclass]
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff888e7593>] sptlrpc_conf_get+0x33/0x270 [ptlrpc]
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff888e745e>] logname2fsname+0xae/0xd0 [ptlrpc]
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff887a28d7>] __llog_ctxt_put+0x27/0x270 [obdclass]
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff88bc585b>] mgc_process_log+0x233b/0x2630 [mgc]
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff80062ff8>] thread_return+0x62/0xfe
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff800939a9>] daemonize+0x2fa/0x304
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff88bc6c90>] mgc_blocking_ast+0x0/0x460 [mgc]
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff8888a390>] ldlm_completion_ast+0x0/0x780 [ptlrpc]
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff8004b113>] try_to_del_timer_sync+0x7f/0x88
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff88bc5c24>] do_requeue+0xd4/0x170 [mgc]
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff88bca6ef>] mgc_requeue_thread+0x3ef/0x68c [mgc]
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff8008cf99>] default_wake_function+0x0/0xe
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff88bca300>] mgc_requeue_thread+0x0/0x68c [mgc]
May 2 23:36:11 fat-intel-2 kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
May 2 23:36:11 fat-intel-2 kernel:
May 3 10:48:42 fat-intel-2 syslogd 1.4.1: restart.
May 3 10:48:42 fat-intel-2 kernel: klogd 1.4.1, log source = /proc/kmsg started.
May 3 10:48:42 fat-intel-2 kernel: Linux version 2.6.18-194.17.1.el5_lustre.g1cc5fa2 (hudson@bld-centos5.whamcloud.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)) #1 SMP Tue Apr 12 15:23:31 MDT 2011
May 3 10:48:42 fat-intel-2 kernel: Command line: ro root=LABEL=/ console=ttyS0,115200
May 3 10:48:42 fat-intel-2 kernel: BIOS-provided physical RAM map:
May 3 10:48:42 fat-intel-2 kernel: BIOS-e820: 0000000000010000 - 000000000009cc00 (usable)
May 3 10:48:42 fat-intel-2 kernel: BIOS-e820: 000000000009cc00 - 00000000000a0000 (reserved)


Info required for matching: replay-single test_74 74



 Comments   
Comment by Peter Jones [ 03/May/11 ]

Lai

Oleg thinks that this may be related to LU-106, which you are already working on. Could you please review and comment?

Thanks

Peter

Comment by Lai Siyao [ 04/May/11 ]

This is not the same as LU-106, though it is also an lprocfs bug:
obd->obd_proc_exports_entry may be NULL (see filter_setup()), but the code in filter_cleanup() assumes it is not NULL.

Peter, should I fix it in LU-106 or separately?
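
The mismatch described above (a pointer that setup may leave NULL, and a cleanup path that passes it to a function asserting it is non-NULL) can be illustrated with a minimal, self-contained sketch. This is not the Lustre code and not necessarily the change that landed in lustre/obdfilter/filter.c: the struct names, remove_proc_entry_sketch(), cleanup_sketch() and the "example_entry" name are all hypothetical stand-ins, and only the field name obd_proc_exports_entry and the ASSERTION(parent != NULL) condition come from this ticket.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a proc directory entry; not the kernel type. */
struct proc_entry_sketch {
    const char *name;
};

/* Hypothetical stand-in for the OBD device; only the field name is from the ticket. */
struct obd_device_sketch {
    struct proc_entry_sketch *obd_proc_exports_entry; /* may be NULL after setup */
};

/* Counts successful removals so the behaviour can be observed in a test. */
static int removed_count;

/*
 * Mimics the condition reported in the trace: the removal helper
 * asserts that the parent entry is non-NULL.
 */
static void remove_proc_entry_sketch(const char *name,
                                     struct proc_entry_sketch *parent)
{
    assert(parent != NULL);
    (void)name;
    removed_count++;
}

/*
 * Guarded cleanup: call the removal helper only when setup actually
 * created the entry, then clear the pointer so a repeated cleanup
 * does nothing.
 */
static void cleanup_sketch(struct obd_device_sketch *obd)
{
    if (obd->obd_proc_exports_entry != NULL) {
        remove_proc_entry_sketch("example_entry", obd->obd_proc_exports_entry);
        obd->obd_proc_exports_entry = NULL;
    }
}
```

With the guard in place, calling cleanup_sketch() on a device whose obd_proc_exports_entry was never set returns without reaching the assertion. An equally plausible alternative would be to make setup fail when the entry cannot be created; the ticket does not say which approach the actual patch took.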

Comment by Peter Jones [ 04/May/11 ]

If it is distinct, then fix it separately. At least you have been looking at this code recently.

Comment by Lai Siyao [ 13/May/11 ]

Review is at http://review.whamcloud.com/#change,497

Comment by Build Master (Inactive) [ 01/Jun/11 ]

Integrated in lustre-master » x86_64,client,el5,inkernel #145
LU-272 LBUG in replay-single test_74

Oleg Drokin : eb7129339587376e18b812c9f866b8c2aeaedfb2
Files :

  • lustre/obdfilter/filter.c
Comment by Build Master (Inactive) [ 01/Jun/11 ]

Integrated in lustre-master » x86_64,client,sles11,inkernel #145
LU-272 LBUG in replay-single test_74

Oleg Drokin : eb7129339587376e18b812c9f866b8c2aeaedfb2
Files :

  • lustre/obdfilter/filter.c
Comment by Build Master (Inactive) [ 01/Jun/11 ]

Integrated in lustre-master » i686,client,el5,inkernel #145
LU-272 LBUG in replay-single test_74

Oleg Drokin : eb7129339587376e18b812c9f866b8c2aeaedfb2
Files :

  • lustre/obdfilter/filter.c
Comment by Build Master (Inactive) [ 01/Jun/11 ]

Integrated in lustre-master » i686,client,el5,ofa #145
LU-272 LBUG in replay-single test_74

Oleg Drokin : eb7129339587376e18b812c9f866b8c2aeaedfb2
Files :

  • lustre/obdfilter/filter.c
Comment by Build Master (Inactive) [ 01/Jun/11 ]

Integrated in lustre-master » x86_64,client,ubuntu1004,inkernel #145
LU-272 LBUG in replay-single test_74

Oleg Drokin : eb7129339587376e18b812c9f866b8c2aeaedfb2
Files :

  • lustre/obdfilter/filter.c
Comment by Build Master (Inactive) [ 01/Jun/11 ]

Integrated in lustre-master » x86_64,client,el6,inkernel #145
LU-272 LBUG in replay-single test_74

Oleg Drokin : eb7129339587376e18b812c9f866b8c2aeaedfb2
Files :

  • lustre/obdfilter/filter.c
Comment by Build Master (Inactive) [ 01/Jun/11 ]

Integrated in lustre-master » i686,client,el6,inkernel #145
LU-272 LBUG in replay-single test_74

Oleg Drokin : eb7129339587376e18b812c9f866b8c2aeaedfb2
Files :

  • lustre/obdfilter/filter.c
Comment by Build Master (Inactive) [ 01/Jun/11 ]

Integrated in lustre-master » x86_64,server,el5,inkernel #145
LU-272 LBUG in replay-single test_74

Oleg Drokin : eb7129339587376e18b812c9f866b8c2aeaedfb2
Files :

  • lustre/obdfilter/filter.c
Comment by Build Master (Inactive) [ 01/Jun/11 ]

Integrated in lustre-master » x86_64,client,ubuntu1004,ofa #145
LU-272 LBUG in replay-single test_74

Oleg Drokin : eb7129339587376e18b812c9f866b8c2aeaedfb2
Files :

  • lustre/obdfilter/filter.c
Comment by Build Master (Inactive) [ 01/Jun/11 ]

Integrated in lustre-master » x86_64,server,el6,inkernel #145
LU-272 LBUG in replay-single test_74

Oleg Drokin : eb7129339587376e18b812c9f866b8c2aeaedfb2
Files :

  • lustre/obdfilter/filter.c
Comment by Build Master (Inactive) [ 01/Jun/11 ]

Integrated in lustre-master » i686,server,el5,inkernel #145
LU-272 LBUG in replay-single test_74

Oleg Drokin : eb7129339587376e18b812c9f866b8c2aeaedfb2
Files :

  • lustre/obdfilter/filter.c
Comment by Build Master (Inactive) [ 01/Jun/11 ]

Integrated in lustre-master » i686,server,el5,ofa #145
LU-272 LBUG in replay-single test_74

Oleg Drokin : eb7129339587376e18b812c9f866b8c2aeaedfb2
Files :

  • lustre/obdfilter/filter.c
Comment by Build Master (Inactive) [ 01/Jun/11 ]

Integrated in lustre-master » i686,server,el6,inkernel #145
LU-272 LBUG in replay-single test_74

Oleg Drokin : eb7129339587376e18b812c9f866b8c2aeaedfb2
Files :

  • lustre/obdfilter/filter.c
Comment by Build Master (Inactive) [ 01/Jun/11 ]

Integrated in lustre-master » x86_64,client,el5,ofa #145
LU-272 LBUG in replay-single test_74

Oleg Drokin : eb7129339587376e18b812c9f866b8c2aeaedfb2
Files :

  • lustre/obdfilter/filter.c
Comment by Build Master (Inactive) [ 01/Jun/11 ]

Integrated in lustre-master » x86_64,server,el5,ofa #145
LU-272 LBUG in replay-single test_74

Oleg Drokin : eb7129339587376e18b812c9f866b8c2aeaedfb2
Files :

  • lustre/obdfilter/filter.c
Comment by Peter Jones [ 13/Jun/11 ]

Landed for 2.1

Generated at Sat Feb 10 01:05:25 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.