Details
-
Bug
-
Resolution: Cannot Reproduce
-
Critical
-
None
-
Lustre 2.4.0
-
None
-
1 Lustre client
2 MDS servers, configured for failover
2 OSS servers, configured for failover
-
3
-
12055
Description
While deleting a large number of small files (close to 800,000), we simulated a crash of one OSS. While the other OSS was taking over for the crashed one, the surviving OSS printed a call trace and crashed as well.
The call trace is as follows:
LustreError: 137-5: test-OST0000_UUID: not available for connect from 192.168.22.196@tcp (no target)
LustreError: Skipped 7 previous similar messages
LustreError: 137-5: test-OST0002_UUID: not available for connect from 192.168.22.239@tcp (no target)
LustreError: 137-5: test-OST0000_UUID: not available for connect from 192.168.22.239@tcp (no target)
LustreError: 137-5: test-OST0006_UUID: not available for connect from 192.168.22.239@tcp (no target)
LustreError: 137-5: test-OST0004_UUID: not available for connect from 192.168.22.239@tcp (no target)
LustreError: Skipped 4 previous similar messages
LustreError: Skipped 4 previous similar messages
LustreError: Skipped 4 previous similar messages
LDISKFS-fs (sdc): recovery complete
LustreError: 137-5: test-OST0002_UUID: not available for connect from 192.168.22.196@tcp (no target)
LustreError: Skipped 5 previous similar messages
LDISKFS-fs (sdc): mounted filesystem with ordered data mode. quota=on. Opts:
LustreError: 11-0: test-MDT0000-lwp-OST0002: Communicating with 192.168.22.239@tcp, operation mds_connect failed with -114.
Lustre: test-OST0002: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-450
LDISKFS-fs warning (device sdi): ldiskfs_multi_mount_protect: MMP interval 42 higher than expected, please wait.
Lustre: test-OST0002: Will be in recovery for at least 2:30, or until 4 clients reconnect
general protection fault: 0000 [#1] SMP
last sysfs file: /sys/class/nt_neg/info
CPU 0
Modules linked in: osp(U) ofd(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) ldiskfs(U) lquota(U) mdd(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic sha256_generic libcfs(U) device(F)(U) tcmfc(F)(U) libpmcfc(U) lundev(F)(U) iscsi_target_mod(F)(U) target_core_mod(F)(U) replication(F)(U) snapshot(F)(U) hotcache(F)(U) cache(F)(U) mmgt(F)(U) copy(F)(U) raid5(F)(U) raid10(F)(U) raid1(F)(U) raid0(F)(U) data_record(F)(U) raid_common(F)(U) ACpoweroff(F)(U) pm80xx(F)(U) sys_disk(U) disk_noop(F)(U) disk_deadline(F)(U) disk_elevator(F)(U) lib_sas(F)(U) scsi_transport_sas(F)(U) lib_ata(F)(U) odsp_sas(F)(U) disk_adapter(F)(U) disk_vault(F)(U) disk_error(F)(U) odsp_scsi(F)(U) disk_err_stub(U) disk_arch(F)(U) cell(F)(U) ld_worker(F)(U) comm_duplock(F)(U) iodir(F)(U) pthread(F)(U) kmempool(F)(U) common(U) comm(F)(U) dma(F)(U) bugon(F)(U) driver_adapter(F)(U) event(F)(U) debug(U) nt_memcpy(U) drv_dma(F)(U) plxdma(F)(U) async_tx(U) xor(U) nt_neg_resume(F)(U) Plx8000(U) msgpio(U) bonding ipv6 sunrpc i2c_ismt(U) i2c_dev 8021q garp stp llc iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dm_mod i2c_i801 i2c_core sg iTCO_wdt iTCO_vendor_support e1000e ixgbe hwmon dca mdio ext4 mbcache jbd2 sd_mod crc_t10dif ahci [last unloaded: scsi_wait_scan]
Pid: 31104, comm: jbd2/sdc-8 Tainted: GF --------------- 2.6.32-358.6.2.l2.08 #2 To be filled by O.E.M. To be filled by O.E.M./To be filled by O.E.M.
RIP: 0010:[<ffffffff8123b2f6>] [<ffffffff8123b2f6>] string+0xa6/0x100
RSP: 0018:ffff8803219c1a00 EFLAGS: 00010212
RAX: 000000000000000c RBX: c3ddeeb188fd5000 RCX: 0000000000000074
RDX: c3ddeeb188fd4061 RSI: 00000000fffffffe RDI: 000000000000000c
RBP: ffff8803219c1a30 R08: 0000000000000073 R09: 0000000000000000
R10: 5a5a5a5a5a5a5a5a R11: 0000000000000004 R12: c3ddeeb188fd4055
R13: ffff880320154044 R14: 00000000ffffffff R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007f6fd62f10a0 CR3: 00000003926ee000 CR4: 00000000000407f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process jbd2/sdc-8 (pid: 31104, threadinfo ffff8803219c0000, task ffff88038d13a000)
Stack:
0000000000000000 c3ddeeb188fd4055 ffffffffa0b379aa ffffffffa0b379a8
<d> ffff8803219c1b80 c3ddeeb188fd5000 ffff8803219c1ad0 ffffffff8123c6d8
<d> 0000000000000004 0000000affffffff ffffffffffffffff 0000000000000400
Call Trace:
[<ffffffff8123c6d8>] vsnprintf+0x218/0x5e0
[<ffffffffa083327b>] ? cfs_set_ptldebug_header+0x2b/0xc0 [libcfs]
[<ffffffffa0844e4f>] libcfs_debug_vmsg2+0x25f/0x890 [libcfs]
[<ffffffffa08454c1>] libcfs_debug_msg+0x41/0x50 [libcfs]
[<ffffffffa0b0a303>] tgt_cb_last_committed+0x303/0x410 [ptlrpc]
[<ffffffffa0fab8a4>] osd_trans_commit_cb+0xb4/0x2b0 [osd_ldiskfs]
[<ffffffffa0f68119>] ldiskfs_journal_commit_callback+0x89/0xc0 [ldiskfs]
[<ffffffffa0026cb7>] jbd2_journal_commit_transaction+0x13f7/0x1700 [jbd2]
[<ffffffff8104be0e>] ? try_to_wake_up+0x24e/0x3e0
[<ffffffff81067349>] ? try_to_del_timer_sync+0x79/0xd0
[<ffffffffa002cbf7>] kjournald2+0xb7/0x210 [jbd2]
[<ffffffff81076000>] ? autoremove_wake_function+0x0/0x40
[<ffffffffa002cb40>] ? kjournald2+0x0/0x210 [jbd2]
[<ffffffff81075ca6>] kthread+0x96/0xa0
[<ffffffff81003f1a>] child_rip+0xa/0x20
[<ffffffff81075c10>] ? kthread+0x0/0xa0
[<ffffffff81003f10>] ? child_rip+0x0/0x20
Code: f1 4f 8d 64 04 01 44 01 ce 85 ff 7e 2b 8d 50 ff 49 8d 54 14 01 eb 0a 66 0f 1f 44 00 00 49 83 c5 01 4c 39 e3 76 09 41 0f b6 4d 00 <41> 88 0c 24 49 83 c4 01 49 39 d4 75 e5 39 f7 7d 29 f7 d0 8d 14
RIP [<ffffffff8123b2f6>] string+0xa6/0x100
RSP <ffff8803219c1a00>
--[ end trace 67847978ef517e38 ]--
Kernel panic - not syncing: Fatal exception
Pid: 31104, comm: jbd2/sdc-8 Tainted: GF D --------------- 2.6.32-358.6.2.l2.08 #2
Call Trace:
[<ffffffff81476fa7>] ? panic+0xa1/0x163
[<ffffffff8147b2fc>] ? oops_end+0xdc/0xf0
[<ffffffff81006e3b>] ? die+0x5b/0x90
[<ffffffff8147ae02>] ? do_general_protection+0x152/0x160
[<ffffffff81050f80>] ? find_busiest_group+0x250/0xbb0
[<ffffffff8147a65f>] ? general_protection+0x1f/0x30
[<ffffffff8123b2f6>] ? string+0xa6/0x100
[<ffffffff8123c6d8>] ? vsnprintf+0x218/0x5e0
[<ffffffffa083327b>] ? cfs_set_ptldebug_header+0x2b/0xc0 [libcfs]
[<ffffffffa0844e4f>] ? libcfs_debug_vmsg2+0x25f/0x890 [libcfs]
[<ffffffffa08454c1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
[<ffffffffa0b0a303>] ? tgt_cb_last_committed+0x303/0x410 [ptlrpc]
[<ffffffffa0fab8a4>] ? osd_trans_commit_cb+0xb4/0x2b0 [osd_ldiskfs]
[<ffffffffa0f68119>] ? ldiskfs_journal_commit_callback+0x89/0xc0 [ldiskfs]
[<ffffffffa0026cb7>] ? jbd2_journal_commit_transaction+0x13f7/0x1700 [jbd2]
[<ffffffff8104be0e>] ? try_to_wake_up+0x24e/0x3e0
[<ffffffff81067349>] ? try_to_del_timer_sync+0x79/0xd0
[<ffffffffa002cbf7>] ? kjournald2+0xb7/0x210 [jbd2]
[<ffffffff81076000>] ? autoremove_wake_function+0x0/0x40
[<ffffffffa002cb40>] ? kjournald2+0x0/0x210 [jbd2]
[<ffffffff81075ca6>] ? kthread+0x96/0xa0
[<ffffffff81003f1a>] ? child_rip+0xa/0x20
[<ffffffff81075c10>] ? kthread+0x0/0xa0
[<ffffffff81003f10>] ? child_rip+0x0/0x20
*******show para for nt_memcpy16*******
src: ffff880338f74440, dst: ffffc903f32bde60, len: 48
*******show para for panic done*******
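As a reading aid for the register dump above, here is a minimal Python sketch that flags registers holding non-canonical x86-64 addresses, assuming 48-bit virtual addressing (bits 63..47 must all be equal). The helper names `is_canonical`, `parse_registers` and `non_canonical` are illustrative and not part of Lustre or the kernel. The faulting bytes marked in the Code line, `<41> 88 0c 24`, decode to `mov %cl,(%r12)`, so a non-canonical R12 would be consistent with a general protection fault on the store into the output buffer. This is only an interpretation of the dump, not a confirmed root cause.

```python
import re

# General-purpose register values copied verbatim from the oops above.
REGISTER_DUMP = """
RAX: 000000000000000c RBX: c3ddeeb188fd5000 RCX: 0000000000000074
RDX: c3ddeeb188fd4061 RSI: 00000000fffffffe RDI: 000000000000000c
RBP: ffff8803219c1a30 R08: 0000000000000073 R09: 0000000000000000
R10: 5a5a5a5a5a5a5a5a R11: 0000000000000004 R12: c3ddeeb188fd4055
R13: ffff880320154044 R14: 00000000ffffffff R15: 0000000000000000
"""


def is_canonical(addr: int) -> bool:
    """Return True if bits 63..47 are all 0 or all 1.

    Assumption: 48-bit virtual addressing, as used by x86-64 kernels
    of this generation.
    """
    top = addr >> 47  # the upper 17 bits of a 64-bit value
    return top == 0 or top == 0x1FFFF


def parse_registers(text: str) -> dict:
    """Extract 'NAME: 16-hex-digit value' pairs from an oops register dump."""
    pairs = re.findall(r"\b([A-Z0-9]{2,3}): ([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}


def non_canonical(regs: dict) -> list:
    """Return the sorted names of registers whose value is not canonical."""
    return sorted(name for name, value in regs.items() if not is_canonical(value))


if __name__ == "__main__":
    regs = parse_registers(REGISTER_DUMP)
    for name in non_canonical(regs):
        print(f"{name} = {regs[name]:#018x} is not a canonical address")
```

Run against this dump, the sketch reports R10, R12, RBX and RDX. R13, which the preceding instruction reads the source byte from, holds a canonical kernel address, which points at the destination buffer pointer rather than the format argument as the suspect value.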
Issue Links
- is related to LU-4390 cfs_trace_get_tage+0x1c6/0x300 [libcfs] (Resolved)