[LU-8637] conf-sanity test_71c failed: class_export_put+0x18/0x310 [obdclass] Created: 25/Sep/16  Updated: 13/Oct/17  Resolved: 05/Oct/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.9.0

Type: Bug Priority: Major
Reporter: nasf (Inactive) Assignee: nasf (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Duplicate
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

Description
== conf-sanity test 71c: start OST0, OST1, MDT1, MDT0 ================================================ 08:11:30 (1473408690)
Loading modules from /usr/lib64/lustre/tests/..
detected 2 online CPUs by sysfs
Force libcfs to create 2 CPU partitions
../libcfs/libcfs/libcfs options: 'cpu_npartitions=2 cpu_npartitions=2'
debug=-1
subsystem_debug=all -lnet -lnd -pinger
../lnet/lnet/lnet options: 'accept=all'
../lnet/klnds/socklnd/ksocklnd options: 'sock_timeout=10'
gss/krb5 is not supported
quota/lquota options: 'hash_lqs_cur_bits=3'
start ost1 service on fre1318
Starting ost1: -o user_xattr  /dev/vdb /mnt/ost1
Started lustre-OST0000
start ost2 service on fre1318
Starting ost2: -o user_xattr  /dev/vdc /mnt/ost2
Started lustre-OST0001
start mds service on fre1317
Starting mds2: -o rw,user_xattr  /dev/vdd /mnt/mds2
Started lustre-MDT0001
start mds service on fre1317
Starting mds1: -o rw,user_xattr  /dev/vdc /mnt/mds1
Started lustre-MDT0000
mount lustre on /mnt/lustre.....
Starting client: fre1319:  -o user_xattr,flock fre1317@tcp:/lustre /mnt/lustre
umount lustre on /mnt/lustre.....
Stopping client fre1319 /mnt/lustre (opts:)
stop mds service on fre1317
Stopping /mnt/mds1 (opts:-f) on fre1317
stop mds service on fre1317
Stopping /mnt/mds2 (opts:-f) on fre1317
stop ost1 service on fre1318
Stopping /mnt/ost1 (opts:-f) on fre1318
stop ost2 service on fre1318
Stopping /mnt/ost2 (opts:-f) on fre1318

Stack trace:

[ 3122.379501] Lustre: DEBUG MARKER: == conf-sanity test 71c: start OST0, OST1, MDT1, MDT0 ================================================ 08:11:30 (1473408690)
[ 3122.773761] LDISKFS-fs (vdb): mounted filesystem with ordered data mode. Opts: errors=remount-ro,user_xattr,no_mbcache
[ 3123.560675] LDISKFS-fs (vdc): mounted filesystem with ordered data mode. Opts: errors=remount-ro
[ 3123.569798] LDISKFS-fs (vdc): mounted filesystem with ordered data mode. Opts: errors=remount-ro,user_xattr,no_mbcache
[ 3124.465480] LustreError: 10753:0:(fid_handler.c:283:__seq_server_alloc_meta()) srv-lustre-OST0001: Can't allocate super-sequence, rc -115
[ 3128.613157] Lustre: 7753:0:(client.c:2093:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1473408699/real 1473408699]  req@ffff88007ab1ad00 x1544977752527532/t0(0) o38->lustre-MDT0001-lwp-OST0001@192.168.113.17@tcp:12/10 lens 520/544 e 0 to 1 dl 1473408704 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
[ 3135.465450] LustreError: 10753:0:(fid_handler.c:283:__seq_server_alloc_meta()) srv-lustre-OST0001: Can't allocate super-sequence, rc -115
[ 3138.614931] LustreError: 11-0: lustre-MDT0001-lwp-OST0001: operation obd_ping to node 192.168.113.17@tcp failed: rc = -107
[ 3138.621801] LustreError: Skipped 1 previous similar message
[ 3138.626747] Lustre: lustre-MDT0001-lwp-OST0001: Connection to lustre-MDT0001 (at 192.168.113.17@tcp) was lost; in progress operations using this service will wait for recovery to complete
[ 3138.634090] Lustre: Skipped 1 previous similar message
[ 3147.467503] LustreError: 10967:0:(ofd_fs.c:506:ofd_register_lwp_callback()) lustre-OST0001: cannot update controller: rc = -5
[ 3147.472893] general protection fault: 0000 [#1] SMP 
[ 3147.473831] Modules linked in: lod(OE) mdt(OE) mdd(OE) mgs(OE) obdecho(OE) osc(OE) osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) ldiskfs(OE) lustre(OE) lmv(OE) mdc(OE) lov(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) sha512_generic crypto_null libcfs(OE) rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache ppdev parport_pc pcspkr virtio_balloon parport i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 ata_generic pata_acpi virtio_net virtio_blk cirrus syscopyarea sysfillrect sysimgblt drm_kms_helper ttm ata_piix serio_raw drm virtio_pci virtio_ring i2c_core virtio libata floppy
[ 3147.477233] CPU: 1 PID: 10967 Comm: lwp_notify_lust Tainted: G           OE  ------------   3.10.0-327.13.1.x3.0.80.x86_64 #1
[ 3147.477233] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[ 3147.477233] task: ffff88007948b980 ti: ffff880077e40000 task.ti: ffff880077e40000
[ 3147.477233] RIP: 0010:[<ffffffffa0522e18>]  [<ffffffffa0522e18>] class_export_put+0x18/0x310 [obdclass]
[ 3147.477233] RSP: 0018:ffff880077e43e40  EFLAGS: 00010206
[ 3147.477233] RAX: ffff8800551b2f10 RBX: 5a5a5a5a5a5a5a5a RCX: 000000018040002b
[ 3147.477233] RDX: 000000018040002c RSI: ffffea0001e59f40 RDI: 5a5a5a5a5a5a5a5a
[ 3147.477233] RBP: ffff880077e43e50 R08: ffff88007967d000 R09: 000000018040002b
[ 3147.477233] R10: ffffea0001e59f40 R11: ffffffffa0db8685 R12: ffff880077ea800c
[ 3147.477233] R13: ffff8800553b9000 R14: ffff8800551b2f10 R15: 0000000000000000
[ 3147.477233] FS:  0000000000000000(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
[ 3147.477233] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 3147.477233] CR2: 0000000000448bb0 CR3: 0000000001946000 CR4: 00000000000006e0
[ 3147.477233] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3147.477233] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 3147.477233] Stack:
[ 3147.477233]  ffff880054ded880 ffff880077ea800c ffff880077e43e68 ffffffffa057bc6a
[ 3147.477233]  ffff880054ded880 ffff880077e43e98 ffffffffa057c042 ffff8800553b9000
[ 3147.477233]  ffff880059b3a000 ffff880059b3a0b0 0000000000000000 ffff880077e43ec0
[ 3147.477233] Call Trace:
[ 3147.477233]  [<ffffffffa057bc6a>] lustre_put_lwp_item+0x3a/0x2b0 [obdclass]
[ 3147.477233]  [<ffffffffa057c042>] lustre_notify_lwp_list+0xc2/0x100 [obdclass]
[ 3147.477233]  [<ffffffffa0e38c64>] lwp_notify_main+0x54/0xb0 [osp]
[ 3147.477233]  [<ffffffffa0e38c10>] ? lwp_import_event+0xb0/0xb0 [osp]
[ 3147.477233]  [<ffffffff810a5acf>] kthread+0xcf/0xe0
[ 3147.477233]  [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140
[ 3147.477233]  [<ffffffff816442d8>] ret_from_fork+0x58/0x90
[ 3147.477233]  [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140
[ 3147.477233] Code: c7 c7 a0 37 5b a0 e8 58 f1 ec ff 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 85 ff 48 89 e5 41 54 53 48 89 fb 0f 84 27 02 00 00 <8b> 4f 40 8d 41 ff 3d 58 5a 5a 5a 0f 87 e4 01 00 00 f6 05 6c 26 
[ 3147.477233] RIP  [<ffffffffa0522e18>] class_export_put+0x18/0x310 [obdclass]
[ 3147.477233]  RSP <ffff880077e43e40>


Comments
Comment by Gerrit Updater [ 25/Sep/16 ]

Fan Yong (fan.yong@intel.com) uploaded a new patch: http://review.whamcloud.com/22724
Subject: LU-8637 obdclass: LWP callback hold export reference
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 073c2a5218a5bd90132607ad017823183f3b8a7e

Comment by Gerrit Updater [ 05/Oct/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/22724/
Subject: LU-8637 obdclass: LWP callback hold export reference
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: acf46c8846d6c3893a52f5caba1eabea67c1bdba

Comment by Peter Jones [ 05/Oct/16 ]

Landed for 2.9

Generated at Sat Feb 10 02:19:17 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.