Details
-
Bug
-
Resolution: Duplicate
-
Blocker
-
None
-
Lustre 2.9.0
-
None
-
Spirit
build: https://build.hpdd.intel.com/job/lustre-reviews/41416/ CentOS-7.2
-
3
-
9223372036854775807
Description
Error happened during performance testing of lustre-reviews build #41416 on cluster Spirit.
Configuration reads as:
1 MDS with single MDT formatted with zfs
2 OSS with 2 and 3 OSTs respectively, formatted with zfs
16 Lustre clients
Besides running the performance test, the purpose is to verify the patch for LU-8573.
The issue might be related to LU-8508 (an OST mount problem is also reported in that ticket).
- MDT/MGT mount completes successfully
- Mounting the first OST fails (reproducibly) with the following error messages:
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16660:0:(mgc_request.c:253:do_config_log_add()) MGC192.168.1.3@o2ib: failed processing log, type 4: rc = -22
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(nodemap_storage.c:368:nodemap_idx_nodemap_add_update()) cannot add nodemap config to non-existing MGS.
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(nodemap_storage.c:1324:nodemap_fs_init()) zfstest-OST0000: error loading nodemap config file, file must be removed via ldiskfs: rc = -22
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(ofd_dev.c:248:ofd_stack_fini()) header@ffff88080be2b540[0x0, 1, [0x1:0x0:0x0] hash exist]{
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(ofd_dev.c:248:ofd_stack_fini()) ....local_storage@ffff88080be2b590
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(ofd_dev.c:248:ofd_stack_fini()) ....osd-zfs@ffff880802e12378osd-zfs-object@ffff880802e12378
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(ofd_dev.c:248:ofd_stack_fini()) } header@ffff88080be2b540
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(ofd_dev.c:248:ofd_stack_fini()) header@ffff88080be2ed80[0x0, 1, [0x200000003:0x0:0x0] hash exist]{
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(ofd_dev.c:248:ofd_stack_fini()) ....local_storage@ffff88080be2edd0
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(ofd_dev.c:248:ofd_stack_fini()) ....osd-zfs@ffff880035ce6128osd-zfs-object@ffff880035ce6128
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(ofd_dev.c:248:ofd_stack_fini()) } header@ffff88080be2ed80
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(ofd_dev.c:248:ofd_stack_fini()) header@ffff88080be2b480[0x0, 1, [0x200000003:0x2:0x0] hash exist]{
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(ofd_dev.c:248:ofd_stack_fini()) ....local_storage@ffff88080be2b4d0
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(ofd_dev.c:248:ofd_stack_fini()) ....osd-zfs@ffff880802e124a0osd-zfs-object@ffff880802e124a0
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(ofd_dev.c:248:ofd_stack_fini()) } header@ffff88080be2b480
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(ofd_dev.c:248:ofd_stack_fini()) header@ffff880822f44840[0x0, 1, [0x200000003:0x3:0x0] hash exist]{
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(ofd_dev.c:248:ofd_stack_fini()) ....local_storage@ffff880822f44890
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(ofd_dev.c:248:ofd_stack_fini()) ....osd-zfs@ffff88081fdc5720osd-zfs-object@ffff88081fdc5720
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(ofd_dev.c:248:ofd_stack_fini()) } header@ffff880822f44840
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(ofd_dev.c:248:ofd_stack_fini()) header@ffff88080be2bd80[0x0, 1, [0xa:0x0:0x0] hash exist]{
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(ofd_dev.c:248:ofd_stack_fini()) ....local_storage@ffff88080be2bdd0
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(ofd_dev.c:248:ofd_stack_fini()) ....osd-zfs@ffff880802e12cb8osd-zfs-object@ffff880802e12cb8
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(ofd_dev.c:248:ofd_stack_fini()) } header@ffff88080be2bd80
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(ofd_dev.c:248:ofd_stack_fini()) header@ffff8808378be3c0[0x0, 1, [0xa:0x8:0x0] hash exist]{
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(ofd_dev.c:248:ofd_stack_fini()) ....local_storage@ffff8808378be410
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(ofd_dev.c:248:ofd_stack_fini()) ....osd-zfs@ffff88080be70818osd-zfs-object@ffff88080be70818
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(ofd_dev.c:248:ofd_stack_fini()) } header@ffff8808378be3c0
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(obd_config.c:578:class_setup()) setup zfstest-OST0000 failed (-22)
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16788:0:(obd_config.c:1671:class_config_llog_handler()) MGC192.168.1.3@o2ib: cfg command failed: rc = -22
Sep 14 08:07:17 spirit-aeon-1 kernel: Lustre: cmd=cf003 0:zfstest-OST0000 1:dev 2:0 3:f
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 15b-f: MGC192.168.1.3@o2ib: The configuration from log 'zfstest-OST0000' failed from the MGS (-22). Make sure this client and the MGS are running compatible versions of Lustre.
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16660:0:(obd_mount_server.c:1352:server_start_targets()) failed to start server zfstest-OST0000: -22
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16660:0:(lu_object.c:1243:lu_device_fini()) ASSERTION( atomic_read(&d->ld_ref) == 0 ) failed: Refcount is 1
Sep 14 08:07:17 spirit-aeon-1 kernel: LustreError: 16660:0:(lu_object.c:1243:lu_device_fini()) LBUG
Sep 14 08:07:17 spirit-aeon-1 kernel: Pid: 16660, comm: mount.lustre
Sep 14 08:07:17 spirit-aeon-1 kernel: #012Call Trace:
Sep 14 08:07:17 spirit-aeon-1 kernel: [<ffffffffa0bcb7d3>] libcfs_debug_dumpstack+0x53/0x80 [libcfs]
Sep 14 08:07:17 spirit-aeon-1 kernel: [<ffffffffa0bcbd75>] lbug_with_loc+0x45/0xc0 [libcfs]
Sep 14 08:07:17 spirit-aeon-1 kernel: [<ffffffffa0d1ec78>] lu_device_fini+0xb8/0xc0 [obdclass]
Sep 14 08:07:17 spirit-aeon-1 kernel: [<ffffffffa0d03d72>] ls_device_put+0x82/0x2a0 [obdclass]
Sep 14 08:07:17 spirit-aeon-1 kernel: [<ffffffffa0d0406d>] local_oid_storage_fini+0xdd/0x210 [obdclass]
Sep 14 08:07:17 spirit-aeon-1 kernel: [<ffffffffa0fab281>] mgc_set_info_async+0x951/0x1630 [mgc]
Sep 14 08:07:17 spirit-aeon-1 kernel: [<ffffffffa0d181c9>] ? lustre_process_log+0x9e9/0xc00 [obdclass]
Sep 14 08:07:17 spirit-aeon-1 kernel: [<ffffffffa0bd6957>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
Sep 14 08:07:18 spirit-aeon-1 kernel: [<ffffffffa0d42bf4>] server_start_targets+0x794/0x2d20 [obdclass]
Sep 14 08:07:18 spirit-aeon-1 kernel: [<ffffffffa0d1b32d>] ? lustre_start_mgc+0x20d/0x2490 [obdclass]
Sep 14 08:07:18 spirit-aeon-1 kernel: [<ffffffffa0d14030>] ? class_config_llog_handler+0x0/0x1b60 [obdclass]
Sep 14 08:07:18 spirit-aeon-1 kernel: [<ffffffffa0d4620d>] server_fill_super+0x108d/0x184c [obdclass]
Sep 14 08:07:18 spirit-aeon-1 kernel: [<ffffffffa0d1e058>] lustre_fill_super+0x328/0x950 [obdclass]
Sep 14 08:07:18 spirit-aeon-1 kernel: [<ffffffffa0d1dd30>] ? lustre_fill_super+0x0/0x950 [obdclass]
Sep 14 08:07:18 spirit-aeon-1 kernel: [<ffffffff811e235d>] mount_nodev+0x4d/0xb0
Sep 14 08:07:18 spirit-aeon-1 kernel: [<ffffffffa0d15f88>] lustre_mount+0x38/0x60 [obdclass]
Sep 14 08:07:18 spirit-aeon-1 kernel: [<ffffffff811e2d09>] mount_fs+0x39/0x1b0
Sep 14 08:07:18 spirit-aeon-1 kernel: [<ffffffff811fe5df>] vfs_kern_mount+0x5f/0xf0
Sep 14 08:07:18 spirit-aeon-1 kernel: [<ffffffff81200b2e>] do_mount+0x24e/0xa40
Sep 14 08:07:18 spirit-aeon-1 kernel: [<ffffffff8116e30e>] ? __get_free_pages+0xe/0x50
Sep 14 08:07:18 spirit-aeon-1 kernel: [<ffffffff812013b6>] SyS_mount+0x96/0xf0
Sep 14 08:07:18 spirit-aeon-1 kernel: [<ffffffff81646d89>] system_call_fastpath+0x16/0x1b
Sep 14 08:07:18 spirit-aeon-1 kernel:
Correlated error message on MDS:
Sep 14 08:07:17 spirit-3 kernel: LustreError: 140-5: Server zfstest-OST0000 requested index 0, but that index is already in use. Use --writeconf to force
Sep 14 08:07:17 spirit-3 kernel: LustreError: 95443:0:(mgs_handler.c:531:mgs_target_reg()) Failed to write zfstest-OST0000 log (-98)
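For reference, the return codes in these messages are negated Linux errno values: -22 on the OSS is EINVAL and -98 on the MGS is EADDRINUSE, which matches the "index is already in use" message. A minimal sketch for pulling the trailing code out of a log line, assuming Linux errno numbering; `LINUX_ERRNO`, `RC_PATTERN` and `decode_rc` are hypothetical helper names and not part of Lustre:

```python
import re

# Linux errno numbers for the two codes seen in this ticket (assumption:
# the kernel reports standard negated errno values).
LINUX_ERRNO = {22: "EINVAL", 98: "EADDRINUSE"}

# Matches a trailing "rc = -22", "(-22)" or ": -22" at the end of a message.
RC_PATTERN = re.compile(r"(?:rc = |\(|: )-(\d+)\)?\s*$")


def decode_rc(line):
    """Return (code, name) for a trailing negative return code, else None."""
    match = RC_PATTERN.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    return -code, LINUX_ERRNO.get(code, "unknown")


if __name__ == "__main__":
    samples = [
        "setup zfstest-OST0000 failed (-22)",
        "Failed to write zfstest-OST0000 log (-98)",
        "failed to start server zfstest-OST0000: -22",
    ]
    for sample in samples:
        print(sample, "->", decode_rc(sample))
```

Codes not listed in the table are reported as "unknown" rather than guessed.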
- The FS was formatted with the help of the script framework. No errors occurred during reformat, and formatting worked fine with the EE-3.1 version three days ago (framework configuration unchanged).
- One error message on the OSS states that the Lustre versions might differ. Double-checked that all nodes are installed with the build specified above.
- Recorded incidents at 'Sep 13 12:53:17', 'Sep 14 08:07:17'
No crash dump was written, although panic on LBUG was set.
Attached files: console and messages logs of the affected node (spirit-aeon-1) and the MDS (spirit-3)
Attachments
Issue Links
- duplicates
-
LU-8508 kernel:LustreError: 3842:0:(lu_object.c:1243:lu_device_fini()) ASSERTION( atomic_read(&d->ld_ref) == 0 ) failed: Refcount is 1
- Resolved