LU-7430: General protection fault: 0000 upon mounting MDT

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Affects Version/s: Lustre 2.8.0
    • Fix Version/s: Lustre 2.8.0
    • Environment: lola
      build: tip of master (df6cf859bbb29392064e6ddb701f3357e01b3a13) + patches
    • Severity: 3

    Description

      The error occurred during soak testing of build '20151113' (see https://wiki.hpdd.intel.com/pages/viewpage.action?title=Soak+Testing+on+Lola&spaceKey=Releases#SoakTestingonLola-20151113). DNE is enabled. OSTs have been formatted with zfs, MDTs with ldiskfs as backend. The MDSes are configured in an active-active HA failover configuration.

      During mount of mdt-2 the following error messages were printed:

      Nov 13 16:27:52 lola-9 kernel: LDISKFS-fs (dm-9): mounted filesystem with ordered data mode. quota=on. Opts: 
      Nov 13 16:27:53 lola-9 kernel: LustreError: 6485:0:(tgt_lastrcvd.c:1458:tgt_clients_data_init()) soaked-MDT0002: duplicate export for client generation 1
      Nov 13 16:27:53 lola-9 kernel: LustreError: 6485:0:(obd_config.c:575:class_setup()) setup soaked-MDT0002 failed (-114)
      Nov 13 16:27:53 lola-9 kernel: LustreError: 6485:0:(obd_config.c:1663:class_config_llog_handler()) MGC192.168.1.108@o2ib10: cfg command failed: rc = -114
      Nov 13 16:27:53 lola-9 kernel: Lustre:    cmd=cf003 0:soaked-MDT0002  1:soaked-MDT0002_UUID  2:2  3:soaked-MDT0002-mdtlov  4:f  
      Nov 13 16:27:53 lola-9 kernel: 
      Nov 13 16:27:53 lola-9 kernel: LustreError: 15c-8: MGC192.168.1.108@o2ib10: The configuration from log 'soaked-MDT0002' failed (-114). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
      Nov 13 16:27:53 lola-9 kernel: LustreError: 6298:0:(obd_mount_server.c:1306:server_start_targets()) failed to start server soaked-MDT0002: -114
      Nov 13 16:27:53 lola-9 kernel: LustreError: 6298:0:(obd_mount_server.c:1794:server_fill_super()) Unable to start targets: -114
      Nov 13 16:27:53 lola-9 kernel: LustreError: 6298:0:(obd_config.c:622:class_cleanup()) Device 4 not setup
      

      before crashing with

      <4>general protection fault: 0000 [#1] SMP
      <4>last sysfs file: /sys/module/lfsck/initstate
      <4>CPU 25
      <4>Modules linked in: mdd(U) lod(U) mdt(U) lfsck(U) mgc(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) mdc(U) fid(U) lmv(U) fld(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) sha512_generic crc32c_intel libcfs(U) ldiskfs(U) jbd2 8021q garp stp llc nfsd exportfs nfs lockd fscache auth_rpcgss nfs_acl sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm scsi_dh_rdac dm_round_robin dm_multipath microcode iTCO_wdt iTCO_vendor_support zfs(P)(U) zcommon(P)(U) znvpair(P)(U) spl(U) zlib_deflate zavl(P)(U) zunicode(P)(U) sb_edac edac_core lpc_ich mfd_core i2c_i801 ioatdma sg igb dca i2c_algo_bit i2c_core ptp pps_core ext3 jbd mbcache sd_mod crc_t10dif ahci isci libsas wmi mpt2sas scsi_transport_sas raid_class mlx4_ib ib_sa ib_mad ib_core ib_addr ipv6 mlx4_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
      <4>
      <4>Pid: 6329, comm: obd_zombid Tainted: P           ---------------    2.6.32-504.30.3.el6_lustre.gb64632c.x86_64 #1 Intel Corporation S2600GZ ........../S2600GZ
      <4>RIP: 0010:[<ffffffffa0c4a6ed>]  [<ffffffffa0c4a6ed>] tgt_client_free+0x25d/0x610 [ptlrpc]
      <4>RSP: 0018:ffff8808337fddd0  EFLAGS: 00010206
      <4>RAX: 5a5a5a5a5a5a5a5a RBX: ffff8803b80c2400 RCX: ffff8803b80c6ec0
      <4>RDX: 0000000000000007 RSI: 5a5a5a5a5a5a5a5a RDI: 0000000000000282
      <4>RBP: ffff8808337fde00 R08: 5a5a5a5a5a5a5a5a R09: 5a5a5a5a5a5a5a5a
      <4>R10: 5a5a5a5a5a5a5a5a R11: 0000000000000000 R12: ffff8803b630d0b0
      <4>R13: 5a5a5a5a5a5a5a5a R14: 5a5a5a5a5a5a5a5a R15: 5a5a5a5a5a5a5a5a
      <4>FS:  0000000000000000(0000) GS:ffff88044e520000(0000) knlGS:0000000000000000
      <4>CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      <4>CR2: 0000003232070df0 CR3: 0000000001a85000 CR4: 00000000000407e0
      <4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      <4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      <4>Process obd_zombid (pid: 6329, threadinfo ffff8808337fc000, task ffff880834c75520)
      <4>Stack:
      <4> ffff8803b6308038 ffff8803b80c2400 0000370000000000 ffff8803b80c2400
      <4><d> ffff8803b6308038 ffff880834c75520 ffff8808337fde20 ffffffffa126ff81
      <4><d> ffff8803b6308078 0000000000000000 ffff8808337fde60 ffffffffa099a350
      <4>Call Trace:
      <4> [<ffffffffa126ff81>] mdt_destroy_export+0x71/0x220 [mdt]
      <4> [<ffffffffa099a350>] obd_zombie_impexp_cull+0x5e0/0xac0 [obdclass]
      <4> [<ffffffffa099a895>] obd_zombie_impexp_thread+0x65/0x190 [obdclass]
      <4> [<ffffffff81064c00>] ? default_wake_function+0x0/0x20
      <4> [<ffffffffa099a830>] ? obd_zombie_impexp_thread+0x0/0x190 [obdclass]
      <4> [<ffffffff8109e78e>] kthread+0x9e/0xc0
      <4> [<ffffffff8100c28a>] child_rip+0xa/0x20
      <4> [<ffffffff8109e6f0>] ? kthread+0x0/0xc0
      <4> [<ffffffff8100c280>] ? child_rip+0x0/0x20
      <4>Code: 00 00 48 c7 83 c8 02 00 00 00 00 00 00 85 d2 78 4a 4d 85 e4 0f 84 4e 02 00 00 49 8b 84 24 18 03 00 00 48 85 c0 0f 84 3d 02 00 00 <f0> 0f b3 10 19 d2 85 d2 0f 84 23 03 00 00 f6 83 6f 01 00 00 02 
      <1>RIP  [<ffffffffa0c4a6ed>] tgt_client_free+0x25d/0x610 [ptlrpc]
      <4> RSP <ffff8808337fddd0>
      

      Attached files: console and messages logs of node lola-9

      Attachments

        1. console-lola-9.log.gz
          880 kB
        2. dump_today.out
          48 kB
        3. messages-lola-9.log.bz2
          659 kB

      Activity

            jgmitter Joseph Gmitter (Inactive) added a comment - Landed for 2.8

            gerrit Gerrit Updater added a comment -
            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/17424/
            Subject: LU-7430 mdt: better handle MDT recovery error path
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 0d3a07a8aa46bd190813b6e6e3da0e12c61a9d09

            gerrit Gerrit Updater added a comment -
            Grégoire Pichon (gregoire.pichon@bull.net) uploaded a new patch: http://review.whamcloud.com/17424
            Subject: LU-7430 mdt: better handle MDT recovery error path
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: b3341ed8f0b2568eebdec1677bcb074e3da9d410

            pichong Gregoire Pichon added a comment -

            I have reproduced the error-handling problem in MDT recovery that leads to the GPF.

            In tgt_clients_data_init(), a new export is created for each valid client area found in the last_rcvd file; a valid client area is identified by a non-zero lcd_uuid field. The creation path calls class_new_export(), which initializes the obd_export structure and adds it to the hash tables, and tgt_client_add(), which updates the lu_target related fields (lut_client_bitmap).

            When an error is encountered, the export deletions are postponed and handled later by the obd_zombid thread. The export deletion path calls tgt_client_free(), which updates the lu_target related fields (in particular the lut_client_bitmap).

            However, the error from tgt_clients_data_init() is reported back up to the tgt_server_data_init() and tgt_init() routines, which free the lu_target data, including the lut_client_bitmap.

            When the obd_zombid thread later calls tgt_client_free(), the lut_client_bitmap has already been freed and poisoned, which makes the MDS crash.

            I am going to look at how to handle that error path correctly, and push a patch.
            Note that the issue was already present before the multi-slot RPC feature, but the error path was probably never taken before.

            About the duplicate generation in the last_rcvd file, I have no idea for now what could lead to that situation.
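
            To make the ordering above easier to follow, here is a minimal user-space C sketch of the sequence: the export cleanup is deferred, the setup error frees the target data, and the deferred cleanup then touches freed, poisoned memory. The structure and function names (struct target, struct export, client_free(), target_fini()) are simplified stand-ins, not the actual Lustre code; this illustrates the pattern only, not the fix itself.

            /*
             * Illustrative user-space sketch only; the structure and function
             * names below are simplified stand-ins, not the real Lustre code.
             */
            #include <stdlib.h>
            #include <string.h>

            #define BITMAP_BYTES 128

            struct target {                        /* stands in for struct lu_target  */
                    unsigned char *client_bitmap;  /* stands in for lut_client_bitmap */
            };

            struct export {                        /* stands in for struct obd_export */
                    struct target *tgt;
                    int            client_idx;     /* bit owned by this client        */
            };

            /* Deferred per-export cleanup, run later by the zombie thread
             * (models the bitmap update done during export destruction). */
            static void client_free(struct export *exp)
            {
                    unsigned char *bm = exp->tgt->client_bitmap;  /* may be dangling */
                    bm[exp->client_idx / 8] &= ~(1u << (exp->client_idx % 8));
            }

            /* Target teardown on the setup error path: releases the target data,
             * including the client bitmap. */
            static void target_fini(struct target *tgt)
            {
                    memset(tgt->client_bitmap, 0x5a, BITMAP_BYTES);  /* mimic slab poison */
                    free(tgt->client_bitmap);
                    /* tgt->client_bitmap is now a dangling pointer */
            }

            int main(void)
            {
                    struct target tgt = { .client_bitmap = calloc(1, BITMAP_BYTES) };
                    struct export exp = { .tgt = &tgt, .client_idx = 3 };

                    /* 1. the last_rcvd scan fails: destruction of exp is queued, not run */
                    /* 2. the error bubbles up and the target data is freed               */
                    target_fini(&tgt);
                    /* 3. the deferred cleanup finally runs and touches freed (poisoned)  */
                    /*    memory, which corresponds to the general protection fault above */
                    client_free(&exp);
                    return 0;
            }

            In the crash above this shows up as the 0x5a5a5a5a5a5a5a5a poison pattern in the registers, with the RIP inside tgt_client_free().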

            bzzz Alex Zhuravlev added a comment - Well, it should be unique, as it's used to bind the replies to the clients. Essentially it's a unique client id, growing monotonically, that is recalculated from last_rcvd on every boot. Given that 1 was duplicated, it looks like the whole last_rcvd scanning process was skipped for some reason.
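
            For illustration, the following self-contained C sketch shows the kind of duplicate-generation check that produces the "duplicate export for client generation 1" message. The structure and function names are hypothetical and this is not the actual tgt_clients_data_init() logic; the one detail taken from the logs is that the setup aborted with -114, which is -EALREADY on Linux.

            /* Hedged, simplified sketch; not the actual Lustre last_rcvd code. */
            #include <errno.h>
            #include <stdio.h>

            #define MAX_GEN 1024            /* illustrative bound, not from Lustre */

            struct client_slot {            /* stands in for a last_rcvd client area */
                    char     uuid[40];      /* empty uuid means the slot is unused   */
                    unsigned generation;    /* per-client generation                 */
            };

            /* Scan the client slots and reject a generation that appears twice,
             * returning -EALREADY (-114) like the "duplicate export" message above. */
            static int clients_data_init(const struct client_slot *slots, int nr)
            {
                    unsigned char seen[MAX_GEN] = { 0 };

                    for (int i = 0; i < nr; i++) {
                            if (slots[i].uuid[0] == '\0')           /* skip unused slots */
                                    continue;
                            if (slots[i].generation < MAX_GEN &&
                                seen[slots[i].generation]++) {
                                    fprintf(stderr,
                                            "duplicate export for client generation %u\n",
                                            slots[i].generation);
                                    return -EALREADY;               /* -114 on Linux */
                            }
                            /* otherwise an export would be created for this client */
                    }
                    return 0;
            }

            int main(void)
            {
                    struct client_slot slots[] = {
                            { "client-a-uuid", 1 },
                            { "client-b-uuid", 1 },                 /* duplicate generation */
                            { "",              0 },                 /* unused slot          */
                    };
                    return -clients_data_init(slots, 3);            /* exits with 114 here  */
            }

            In the report above, generation 1 appearing in two client slots is what produced the -114 that aborted the soaked-MDT0002 setup.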

            adilger Andreas Dilger added a comment - Grégoire, Alex, is there a reason that the generation needs to be unique between different clients? If yes, then we need to examine the code that assigns the generation, as well as the recovery code, to ensure that there aren't old slots on disk that are not cleared after recovery is completed.

            heckes Frank Heckes (Inactive) added a comment -

            Alex: No, it isn't. For some reason multipathd queries the devices of the EMC² (or is it Dell now?) controller although they are blacklisted in the multipath configuration file. If you refer to this:

            end_request: critical target error, dev sdh, sector 0
            sd 0:0:1:7: [sdh]  Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
            sd 0:0:1:7: [sdh]  Sense Key : Illegal Request [current] 
            sd 0:0:1:7: [sdh]  Add. Sense: Logical block address out of range
            sd 0:0:1:7: [sdh] CDB: Read(10): 28 00 00 00 00 00 00 02 00 00
            end_request: critical target error, dev sdh, sector 0
            __ratelimit: 310 callbacks suppressed
            Buffer I/O error on device sdh, logical block 0
            Buffer I/O error on device sdh, logical block 1
            Buffer I/O error on device sdh, logical block 2
            Buffer I/O error on device sdh, logical block 3
            Buffer I/O error on device sdh, logical block 4
            Buffer I/O error on device sdh, logical block 5
            Buffer I/O error on device sdh, logical block 6
            Buffer I/O error on device sdh, logical block 7
            Buffer I/O error on device sdh, logical block 8
            Buffer I/O error on device sdh, logical block 9
            

            the messages can be ignored.

            di.wang Di Wang added a comment - Here you are.

            pichong Gregoire Pichon added a comment - How can I have a look at the Lustre logs of the failing MDS?
            di.wang Di Wang added a comment - Yes, but this has happened several times (> 5 times), especially at this point in the soak tests, so I doubt this is because of the I/O errors.

            People

              Assignee: pichong Gregoire Pichon
              Reporter: heckes Frank Heckes (Inactive)
              Votes: 0
              Watchers: 7
