[LU-6849] sanity-quota test 22 LBUG with “ASSERTION( orro->oo_ref == 0 ) failed: Busy NRS TRR policy object for OST” Created: 14/Jul/15 Updated: 02/Nov/23 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0, Lustre 2.9.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | James Nunez (Inactive) | Assignee: | Li Xi |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Environment: |
review-zfs-part-1 test group in autotest |
||
| Attachments: |
|
||||||||||||
| Issue Links: |
|
||||||||||||
| Severity: | 3 | ||||||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||||||
| Description |
|
sanity-quota test 22 failed with 'server shadow-14vm11 environments are insane!'. Logs for this failure are: From the OST console: 17:35:27:Lustre: DEBUG MARKER: umount -d -f /mnt/ost2 17:35:27:Lustre: 4055:0:(client.c:2018:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1436808903/real 1436808903] req@ffff88007da416c0 x1506591584180392/t0(0) o250->MGC10.1.4.170@tcp@10.1.4.170@tcp:26/25 lens 520/544 e 0 to 1 dl 1436808914 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 17:35:27:Lustre: server umount lustre-OST0001 complete 17:35:27:LustreError: 4041:0:(nrs_orr.c:483:nrs_trr_hop_exit()) ASSERTION( orro->oo_ref == 0 ) failed: Busy NRS TRR policy object for OST with index 1, with 1 refs 17:35:27:LustreError: 4041:0:(nrs_orr.c:483:nrs_trr_hop_exit()) LBUG 17:35:27:Pid: 4041, comm: obd_zombid 17:35:27: 17:35:27:Call Trace: 17:35:27: [<ffffffffa0709875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs] 17:35:27: [<ffffffffa0709e77>] lbug_with_loc+0x47/0xb0 [libcfs] 17:35:27: [<ffffffffa0ada51c>] nrs_trr_hop_exit+0x1cc/0x1d0 [ptlrpc] 17:35:27: [<ffffffffa071bf46>] cfs_hash_putref+0x1a6/0x480 [libcfs] 17:35:27: [<ffffffffa0721dbb>] ? cfs_binheap_destroy+0x3db/0x810 [libcfs] 17:35:27: [<ffffffffa0adb61f>] nrs_orr_stop+0x6f/0x440 [ptlrpc] 17:35:27: [<ffffffffa0ad11af>] nrs_policy_stop0+0x4f/0x1f0 [ptlrpc] 17:35:27: [<ffffffffa0ad14f8>] nrs_policy_stop_locked+0x1a8/0x230 [ptlrpc] 17:35:27: [<ffffffffa0ad2854>] nrs_policy_unregister+0xb4/0x670 [ptlrpc] 17:35:27: [<ffffffffa0ad2f32>] ptlrpc_service_nrs_cleanup+0x122/0x5b0 [ptlrpc] 17:35:27: [<ffffffffa0a9a617>] ptlrpc_unregister_service+0x417/0xe10 [ptlrpc] 17:35:27: [<ffffffff8105bd73>] ? __wake_up+0x53/0x70 17:35:27: [<ffffffffa0fdc12a>] ost_cleanup+0x6a/0x1b0 [ost] 17:35:27: [<ffffffffa084cf55>] class_decref+0x1e5/0x4d0 [obdclass] 17:35:27: [<ffffffffa0715c01>] ? libcfs_debug_msg+0x41/0x50 [libcfs] 17:35:27: [<ffffffffa083753c>] obd_zombie_impexp_cull+0x61c/0xac0 [obdclass] 17:35:27: [<ffffffffa0837a45>] obd_zombie_impexp_thread+0x65/0x190 [obdclass] 17:35:27: [<ffffffff81064be0>] ? default_wake_function+0x0/0x20 17:35:27: [<ffffffffa08379e0>] ? obd_zombie_impexp_thread+0x0/0x190 [obdclass] 17:35:27: [<ffffffff8109e78e>] kthread+0x9e/0xc0 17:35:27: [<ffffffff8100c28a>] child_rip+0xa/0x20 17:35:27: [<ffffffff8109e6f0>] ? kthread+0x0/0xc0 17:35:27: [<ffffffff8100c280>] ? child_rip+0x0/0x20 17:35:27: 17:35:27:Kernel panic - not syncing: LBUG 17:35:27:Pid: 4041, comm: obd_zombid Tainted: P --------------- 2.6.32-504.23.4.el6_lustre.g2382eb0.x86_64 #1 17:35:27:Call Trace: 17:35:27: [<ffffffff81529bbc>] ? panic+0xa7/0x16f 17:35:27: [<ffffffffa0709ecb>] ? lbug_with_loc+0x9b/0xb0 [libcfs] 17:35:27: [<ffffffffa0ada51c>] ? nrs_trr_hop_exit+0x1cc/0x1d0 [ptlrpc] 17:35:27: [<ffffffffa071bf46>] ? cfs_hash_putref+0x1a6/0x480 [libcfs] 17:35:27: [<ffffffffa0721dbb>] ? cfs_binheap_destroy+0x3db/0x810 [libcfs] 17:35:27: [<ffffffffa0adb61f>] ? nrs_orr_stop+0x6f/0x440 [ptlrpc] 17:35:27: [<ffffffffa0ad11af>] ? nrs_policy_stop0+0x4f/0x1f0 [ptlrpc] 17:35:27: [mory mce=off disable_cpu_apicid=0 memmap=exactmap memmap=627K@4K memmap=131449K@49779K elfcorehdr=181228K memmap=4K$0K memmap=9K$631K memmap=64K$960K memmap=12K$2097140K memmap=272K$4194032K 17:35:27:KERNEL supported cpus: |
| Comments |
| Comment by Oleg Drokin [ 05/Aug/15 ] |
|
I just hit this too on current master in sanity test 77e: <4>[53304.002633] Lustre: DEBUG MARKER: == sanityn test 77e: check TBF NID nrs policy == 06:44:27 (1438771467)
<0>[53304.068281] LustreError: 7480:0:(nrs_orr.c:483:nrs_trr_hop_exit()) ASSERTION( orro->oo_ref == 0 ) failed: Busy NRS TRR policy object for OST with index 1, with 1 refs
<0>[53304.069257] LustreError: 7480:0:(nrs_orr.c:483:nrs_trr_hop_exit()) LBUG
<4>[53304.069586] Pid: 7480, comm: lctl
<4>[53304.069913]
<4>[53304.069914] Call Trace:
<4>[53304.070424] [<ffffffffa0771885>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
<4>[53304.070763] [<ffffffffa0771e87>] lbug_with_loc+0x47/0xb0 [libcfs]
<4>[53304.071034] [<ffffffffa0fa3956>] nrs_trr_hop_exit+0x126/0x130 [ptlrpc]
<4>[53304.071376] [<ffffffffa0784076>] cfs_hash_putref+0x1a6/0x490 [libcfs]
<4>[53304.071714] [<ffffffffa078a05b>] ? cfs_binheap_destroy+0x3db/0x810 [libcfs]
<4>[53304.072071] [<ffffffffa0fa49bd>] nrs_orr_stop+0x6d/0x330 [ptlrpc]
<4>[53304.072360] [<ffffffffa0f9b33e>] nrs_policy_stop0+0x4e/0x1f0 [ptlrpc]
<4>[53304.072598] [<ffffffffa0f9b8a8>] nrs_policy_stop_primary+0x198/0x1d0 [ptlrpc]
<4>[53304.072988] [<ffffffffa0f9bed8>] nrs_policy_start_locked+0x458/0x690 [ptlrpc]
<4>[53304.073510] [<ffffffffa0f9c2fc>] nrs_policy_ctl+0x1ec/0x280 [ptlrpc]
<4>[53304.073755] [<ffffffffa0f9dbd2>] ptlrpc_nrs_policy_control+0xe2/0x2a0 [ptlrpc]
<4>[53304.074265] [<ffffffffa0f7f276>] ptlrpc_lprocfs_nrs_seq_write+0x3e6/0x600 [ptlrpc]
<4>[53304.074822] [<ffffffffa0f7ee90>] ? ptlrpc_lprocfs_nrs_seq_write+0x0/0x600 [ptlrpc]
<4>[53304.075350] [<ffffffff811fb8c5>] proc_reg_write+0x85/0xc0
<4>[53304.075629] [<ffffffff8118f4a8>] vfs_write+0xb8/0x1a0
<4>[53304.075862] [<ffffffff8118fde1>] sys_write+0x51/0x90
<4>[53304.076099] [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
<4>[53304.076380]
<0>[53304.095801] Kernel panic - not syncing: LBUG
I have a crashdump and modules in /exports/crashdumps/192.168.10.224-2015-08-05-06\:44\:28/ |
| Comment by Oleg Drokin [ 08/Aug/15 ] |
|
and I just hit it once again. |
| Comment by Oleg Drokin [ 30/Sep/15 ] |
|
and just hit it once more |
| Comment by Oleg Drokin [ 07/Oct/15 ] |
|
Hit once again |
| Comment by James Nunez (Inactive) [ 11/Jan/16 ] |
|
More failures on master. This time in sanityn test_77e. Logs at: |
| Comment by Emoly Liu [ 05/Jul/16 ] |
|
Hit this issue when running sanityn.sh 77e. lixi, could you please have a look? Thanks. |
| Comment by Li Xi (Inactive) [ 05/Jul/16 ] |
|
Should we change oo_ref of struct nrs_orr_object to atomic_t? I feel that there are some places which could have race condition since the reference is not always protected by the same lock. |
| Comment by nasf (Inactive) [ 11/Dec/16 ] |
|
+1 on master: |
| Comment by Oleg Drokin [ 11/Jan/17 ] |
|
I just hit this one again in master, note how it's closely located with the sleeping while atomic warning (I think there's a ticket about this, but cannot find it right away), possibly related? [65131.137381] Lustre: DEBUG MARKER: == sanityn test 77d: check TRR nrs policy ============================================================ 17:59:22 (1484089162) [65134.786924] BUG: sleeping function called from invalid context at /home/green/git/lustre-release/libcfs/libcfs/hash.c:1123 [65134.793042] in_atomic(): 1, irqs_disabled(): 0, pid: 25850, name: lctl [65134.793574] INFO: lockdep is turned off. [65134.794029] CPU: 7 PID: 25850 Comm: lctl Tainted: G W OE ------------ 3.10.0-debug #1 [65134.794924] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [65134.795507] ffff880093aa8940 0000000076eac45a ffff880073a9bd18 ffffffff816fd400 [65134.796569] ffff880073a9bd30 ffffffff810b0109 0000000000000004 ffff880073a9bd80 [65134.797496] ffffffffa02989d3 ffffffffa029d486 ffff8800749b5f80 0000000000000004 [65134.798352] Call Trace: [65134.799008] [<ffffffff816fd400>] dump_stack+0x19/0x1b [65134.799696] [<ffffffff810b0109>] __might_sleep+0xe9/0x110 [65134.800431] [<ffffffffa02989d3>] cfs_hash_putref+0x233/0x500 [libcfs] [65134.801010] [<ffffffffa029d486>] ? cfs_binheap_destroy+0x396/0x830 [libcfs] [65134.801498] [<ffffffffa0951aa9>] nrs_orr_stop+0x69/0x270 [ptlrpc] [65134.802537] [<ffffffffa0949e59>] nrs_policy_stop0+0x39/0x1c0 [ptlrpc] [65134.803040] [<ffffffffa094a7c8>] nrs_policy_stop_primary.isra.9+0x78/0x1f0 [ptlrpc] [65134.803823] [<ffffffffa094bb04>] nrs_policy_start_locked+0x624/0x670 [ptlrpc] [65134.804591] [<ffffffffa094bc7b>] nrs_policy_ctl+0x12b/0x2b0 [ptlrpc] [65134.805015] [<ffffffffa094e25f>] ptlrpc_nrs_policy_control+0x10f/0x2d0 [ptlrpc] [65134.805809] [<ffffffffa092ea35>] ptlrpc_lprocfs_nrs_seq_write+0x3a5/0x560 [ptlrpc] [65134.806600] [<ffffffff8125d3ed>] proc_reg_write+0x3d/0x80 [65134.806992] [<ffffffff811ed38d>] vfs_write+0xbd/0x1e0 [65134.807398] [<ffffffff811fe3fd>] ? putname+0x3d/0x60 [65134.807785] [<ffffffff811ede54>] SyS_write+0x84/0xf0 [65134.808180] [<ffffffff8170fc89>] system_call_fastpath+0x16/0x1b [65134.808583] BUG: scheduling while atomic: lctl/25850/0x10000002 [65134.808958] INFO: lockdep is turned off. [65134.809361] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) osc(OE) mdc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) brd ext4 loop mbcache jbd2 sha512_generic crypto_null rpcsec_gss_krb5 syscopyarea ata_generic sysfillrect sysimgblt pata_acpi ttm drm_kms_helper ata_piix drm i2c_piix4 libata pcspkr serio_raw virtio_balloon i2c_core virtio_console floppy virtio_blk nfsd ip_tables [last unloaded: libcfs] [65134.814034] CPU: 7 PID: 25850 Comm: lctl Tainted: G W OE ------------ 3.10.0-debug #1 [65134.814810] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [65134.815505] ffff880093aa8940 0000000076eac45a ffff880073a9bc90 ffffffff816fd400 [65134.816785] ffff880073a9bca8 ffffffff816f9258 ffff880073a9bfd8 ffff880073a9bd08 [65134.817813] ffffffff817041c7 ffff880093aa8940 ffff880073a9bfd8 ffff880073a9bfd8 [65134.818721] Call Trace: [65134.819087] [<ffffffff816fd400>] dump_stack+0x19/0x1b [65134.819528] [<ffffffff816f9258>] __schedule_bug+0x59/0x68 [65134.819984] [<ffffffff817041c7>] __schedule+0x807/0x950 [65134.820387] [<ffffffff810b4956>] __cond_resched+0x26/0x30 [65134.821032] [<ffffffff817045fa>] _cond_resched+0x3a/0x50 [65134.821624] [<ffffffffa02989d8>] cfs_hash_putref+0x238/0x500 [libcfs] [65134.822213] [<ffffffffa029d486>] ? cfs_binheap_destroy+0x396/0x830 [libcfs] [65134.822837] [<ffffffffa0951aa9>] nrs_orr_stop+0x69/0x270 [ptlrpc] [65134.823464] [<ffffffffa0949e59>] nrs_policy_stop0+0x39/0x1c0 [ptlrpc] [65134.824058] [<ffffffffa094a7c8>] nrs_policy_stop_primary.isra.9+0x78/0x1f0 [ptlrpc] [65134.825188] [<ffffffffa094bb04>] nrs_policy_start_locked+0x624/0x670 [ptlrpc] [65134.826299] [<ffffffffa094bc7b>] nrs_policy_ctl+0x12b/0x2b0 [ptlrpc] [65134.826910] [<ffffffffa094e25f>] ptlrpc_nrs_policy_control+0x10f/0x2d0 [ptlrpc] [65134.827989] [<ffffffffa092ea35>] ptlrpc_lprocfs_nrs_seq_write+0x3a5/0x560 [ptlrpc] [65134.829059] [<ffffffff8125d3ed>] proc_reg_write+0x3d/0x80 [65134.829648] [<ffffffff811ed38d>] vfs_write+0xbd/0x1e0 [65134.830389] [<ffffffff811fe3fd>] ? putname+0x3d/0x60 [65134.830966] [<ffffffff811ede54>] SyS_write+0x84/0xf0 [65134.831935] [<ffffffff8170fc89>] system_call_fastpath+0x16/0x1b [65134.832463] LustreError: 25850:0:(nrs_orr.c:483:nrs_trr_hop_exit()) ASSERTION( orro->oo_ref == 0 ) failed: Busy NRS TRR policy object for OST with index 0, with 1 refs [65134.834538] LustreError: 25850:0:(nrs_orr.c:483:nrs_trr_hop_exit()) LBUG [65134.835210] Pid: 25850, comm: lctl [65134.835825] Call Trace: [65134.836973] [<ffffffffa02857d3>] libcfs_debug_dumpstack+0x53/0x80 [libcfs] [65134.837671] [<ffffffffa0285841>] lbug_with_loc+0x41/0xb0 [libcfs] [65134.838296] [<ffffffffa09516ab>] nrs_trr_hop_exit+0xfb/0x130 [ptlrpc] [65134.838884] [<ffffffffa0298982>] cfs_hash_putref+0x1e2/0x500 [libcfs] [65134.839519] [<ffffffffa029d486>] ? cfs_binheap_destroy+0x396/0x830 [libcfs] [65134.840295] [<ffffffffa0951aa9>] nrs_orr_stop+0x69/0x270 [ptlrpc] [65134.840990] [<ffffffffa0949e59>] nrs_policy_stop0+0x39/0x1c0 [ptlrpc] [65134.841729] [<ffffffffa094a7c8>] nrs_policy_stop_primary.isra.9+0x78/0x1f0 [ptlrpc] [65134.842903] [<ffffffffa094bb04>] nrs_policy_start_locked+0x624/0x670 [ptlrpc] [65134.843961] [<ffffffffa094bc7b>] nrs_policy_ctl+0x12b/0x2b0 [ptlrpc] [65134.844570] [<ffffffffa094e25f>] ptlrpc_nrs_policy_control+0x10f/0x2d0 [ptlrpc] [65134.845658] [<ffffffffa092ea35>] ptlrpc_lprocfs_nrs_seq_write+0x3a5/0x560 [ptlrpc] [65134.846795] [<ffffffff8125d3ed>] proc_reg_write+0x3d/0x80 [65134.847391] [<ffffffff811ed38d>] vfs_write+0xbd/0x1e0 [65134.847958] [<ffffffff811fe3fd>] ? putname+0x3d/0x60 [65134.848539] [<ffffffff811ede54>] SyS_write+0x84/0xf0 [65134.849076] [<ffffffff8170fc89>] system_call_fastpath+0x16/0x1b [65134.849640] [65134.850122] Kernel panic - not syncing: LBUG |
| Comment by Oleg Drokin [ 29/Jul/17 ] |
|
just hit this once again and again closely related to sleepign while atomic: Hm this comment was really long and jira hated it. [22075.470864] Lustre: DEBUG MARKER: == sanityn test 77b: check CRR-N NRS policy ========================================================== 09:02:25 (1501333345)
[22085.359953] BUG: sleeping function called from invalid context at /home/green/git/lustre-release/libcfs/libcfs/hash.c:1120
[22085.362133] in_atomic(): 1, irqs_disabled(): 0, pid: 8995, name: lctl
[22085.362674] 3 locks held by lctl/8995:
[22085.363514] #0: (sb_writers#3){......}, at: [<ffffffff811ed45b>] vfs_write+0x1ab/0x1e0
[22085.364569] #1: (&nrs_core.nrs_mutex){......}, at: [<ffffffffa05e46eb>] ptlrpc_lprocfs_nrs_seq_write+0x38b/0x560 [ptlrpc]
[22085.365679] #2: (&(&nrs->nrs_lock)->rlock){......}, at: [<ffffffffa05ffb0a>] nrs_policy_ctl+0x3a/0x2b0 [ptlrpc]
...
[22120.733831] BUG: scheduling while atomic: lctl/12741/0x10000002
[22120.734746] INFO: lockdep is turned off.
[22120.735614] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_zfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) osc(OE) mdc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) brd ext4 mbcache loop zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) zlib_deflate jbd2 syscopyarea sysfillrect sysimgblt ttm ata_generic drm_kms_helper pata_acpi drm i2c_piix4 ata_piix serio_raw i2c_core virtio_balloon libata virtio_console virtio_blk pcspkr floppy nfsd ip_tables rpcsec_gss_krb5 [last unloaded: libcfs]
[22120.763431] CPU: 9 PID: 12741 Comm: lctl Tainted: P W OE ------------ 3.10.0-debug #2
[22120.769006] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[22120.770687] ffff880073434b80 00000000ca53a485 ffff88008057bc90 ffffffff816fd3e4
[22120.774839] ffff88008057bca8 ffffffff816f9218 ffff88008057bfd8 ffff88008057bd08
[22120.777191] ffffffff817041a7 ffff880073434b80 ffff88008057bfd8 ffff88008057bfd8
[22120.780385] Call Trace:
[22120.781840] [<ffffffff816fd3e4>] dump_stack+0x19/0x1b
[22120.783524] [<ffffffff816f9218>] __schedule_bug+0x59/0x68
[22120.785190] [<ffffffff817041a7>] __schedule+0x807/0x950
[22120.786814] [<ffffffff810b4936>] __cond_resched+0x26/0x30
[22120.788015] [<ffffffff817045da>] _cond_resched+0x3a/0x50
[22120.790636] [<ffffffffa02000a8>] cfs_hash_putref+0x238/0x500 [libcfs]
[22120.794345] [<ffffffffa0204b66>] ? cfs_binheap_destroy+0x396/0x830 [libcfs]
[22120.795397] [<ffffffffa06059c9>] nrs_orr_stop+0x69/0x270 [ptlrpc]
[22120.797393] [<ffffffffa05fddd9>] nrs_policy_stop0+0x39/0x1c0 [ptlrpc]
[22120.799836] [<ffffffffa05fe748>] nrs_policy_stop_primary.isra.9+0x78/0x1f0 [ptlrpc]
[22120.801265] [<ffffffffa05ffa84>] nrs_policy_start_locked+0x624/0x670 [ptlrpc]
[22120.803191] [<ffffffffa05ffbfb>] nrs_policy_ctl+0x12b/0x2b0 [ptlrpc]
[22120.803997] [<ffffffffa06021ef>] ptlrpc_nrs_policy_control+0x10f/0x2d0 [ptlrpc]
[22120.828023] [<ffffffffa05e4705>] ptlrpc_lprocfs_nrs_seq_write+0x3a5/0x560 [ptlrpc]
[22120.837729] [<ffffffff8125d3cd>] proc_reg_write+0x3d/0x80
[22120.848944] [<ffffffff811ed36d>] vfs_write+0xbd/0x1e0
[22120.849644] [<ffffffff811fe3dd>] ? putname+0x3d/0x60
[22120.850441] [<ffffffff811ede34>] SyS_write+0x84/0xf0
[22120.851186] [<ffffffff8170fc49>] system_call_fastpath+0x16/0x1b
[22120.851897] LustreError: 12741:0:(nrs_orr.c:483:nrs_trr_hop_exit()) ASSERTION( orro->oo_ref == 0 ) failed: Busy NRS TRR policy object for OST with index 1, with 1 refs
[22120.856255] LustreError: 12741:0:(nrs_orr.c:483:nrs_trr_hop_exit()) LBUG
[22120.856969] Pid: 12741, comm: lctl
[22120.857575]
Call Trace:
[22120.858750] [<ffffffffa01ec7ce>] libcfs_call_trace+0x4e/0x60 [libcfs]
[22120.859460] [<ffffffffa01ec85c>] lbug_with_loc+0x4c/0xb0 [libcfs]
[22120.860231] [<ffffffffa06055cb>] nrs_trr_hop_exit+0xfb/0x130 [ptlrpc]
[22120.861629] [<ffffffffa0200052>] cfs_hash_putref+0x1e2/0x500 [libcfs]
[22120.862123] [<ffffffffa0204b66>] ? cfs_binheap_destroy+0x396/0x830 [libcfs]
[22120.862821] [<ffffffffa06059c9>] nrs_orr_stop+0x69/0x270 [ptlrpc]
[22120.863589] [<ffffffffa05fddd9>] nrs_policy_stop0+0x39/0x1c0 [ptlrpc]
[22120.864315] [<ffffffffa05fe748>] nrs_policy_stop_primary.isra.9+0x78/0x1f0 [ptlrpc]
[22120.865614] [<ffffffffa05ffa84>] nrs_policy_start_locked+0x624/0x670 [ptlrpc]
[22120.866922] [<ffffffffa05ffbfb>] nrs_policy_ctl+0x12b/0x2b0 [ptlrpc]
[22120.872716] [<ffffffffa06021ef>] ptlrpc_nrs_policy_control+0x10f/0x2d0 [ptlrpc]
[22120.874257] [<ffffffffa05e4705>] ptlrpc_lprocfs_nrs_seq_write+0x3a5/0x560 [ptlrpc]
[22120.875521] [<ffffffff8125d3cd>] proc_reg_write+0x3d/0x80
[22120.876212] [<ffffffff811ed36d>] vfs_write+0xbd/0x1e0
[22120.900486] [<ffffffff811fe3dd>] ? putname+0x3d/0x60
[22120.901959] [<ffffffff811ede34>] SyS_write+0x84/0xf0
[22120.920647] [<ffffffff8170fc49>] system_call_fastpath+0x16/0x1b
[22120.921334]
[22120.921903] Kernel panic - not syncing: LBUG
[22120.922521] CPU: 9 PID: 12741 Comm: lctl Tainted: P W OE ------------ 3.10.0-debug #2
[22120.923760] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[22120.924463] ffffffffa020bed2 00000000ca53a485 ffff88008057bc70 ffffffff816fd3e4
[22120.925713] ffff88008057bcf0 ffffffff816f8c34 ffffffff00000008 ffff88008057bd00
[22120.927552] ffff88008057bca0 00000000ca53a485 00000000ca53a485 ffff88033e52d948
[22120.929142] Call Trace:
[22120.929738] [<ffffffff816fd3e4>] dump_stack+0x19/0x1b
[22120.930405] [<ffffffff816f8c34>] panic+0xd8/0x1e7
[22120.931066] [<ffffffffa01ec874>] lbug_with_loc+0x64/0xb0 [libcfs]
[22120.931861] [<ffffffffa06055cb>] nrs_trr_hop_exit+0xfb/0x130 [ptlrpc]
[22120.932566] [<ffffffffa0200052>] cfs_hash_putref+0x1e2/0x500 [libcfs]
[22120.933266] [<ffffffffa0204b66>] ? cfs_binheap_destroy+0x396/0x830 [libcfs]
[22120.934002] [<ffffffffa06059c9>] nrs_orr_stop+0x69/0x270 [ptlrpc]
[22120.934708] [<ffffffffa05fddd9>] nrs_policy_stop0+0x39/0x1c0 [ptlrpc]
[22120.935433] [<ffffffffa05fe748>] nrs_policy_stop_primary.isra.9+0x78/0x1f0 [ptlrpc]
[22120.936713] [<ffffffffa05ffa84>] nrs_policy_start_locked+0x624/0x670 [ptlrpc]
[22120.937994] [<ffffffffa05ffbfb>] nrs_policy_ctl+0x12b/0x2b0 [ptlrpc]
[22120.938714] [<ffffffffa06021ef>] ptlrpc_nrs_policy_control+0x10f/0x2d0 [ptlrpc]
[22120.939994] [<ffffffffa05e4705>] ptlrpc_lprocfs_nrs_seq_write+0x3a5/0x560 [ptlrpc]
[22120.941251] [<ffffffff8125d3cd>] proc_reg_write+0x3d/0x80
[22120.941916] [<ffffffff811ed36d>] vfs_write+0xbd/0x1e0
[22120.942587] [<ffffffff811fe3dd>] ? putname+0x3d/0x60
[22120.943243] [<ffffffff811ede34>] SyS_write+0x84/0xf0
[22120.943899] [<ffffffff8170fc49>] system_call_fastpath+0x16/0x1b
Please take a look at it. |
| Comment by Arshad Hussain [ 20/Oct/19 ] |
|
Another instance. http://testing.linuxhacker.ru:3333/lustre-reports/3782/results.html |
| Comment by Andreas Dilger [ 06/Nov/19 ] |
|
+1 on master in sanityn.sh test_77d https://testing.whamcloud.com/test_sets/580c04d6-f963-11e9-a0ba-52540065bddc |