[LU-8608] Rolling upgrade between 2.8.x and master failed: Upon upgrading OSS, OSS restarts when mounted Created: 13/Sep/16  Updated: 21/Oct/16  Resolved: 21/Oct/16

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.9.0

Type: Bug Priority: Minor
Reporter: Saurabh Tandan (Inactive) Assignee: Kit Westneat
Resolution: Fixed Votes: 0
Labels: None
Environment:

Rolling Upgrade: Old version- b2_8_fe build# 25
New version- master build# 3431


Attachments: Text File debug_log_mds.txt     Text File mgs.log     Text File oss.log    
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

During rolling upgrade testing, the OSS restarted when its OST was mounted after the upgrade.
The following steps were taken:
1. The OSS, MDS and 2 clients were built with b2_8_fe build# 25 and the Lustre system was set up.
2. The OST was unmounted and the OSS was upgraded to master build# 3431.
3. After the upgrade on the OSS was complete, the target was mounted again.

Upon mounting, the OSS restarted abruptly.
The following is the OSS console log from when the mount command was run.

[root@onyx-26 ~]# mount -t lustre -o acl,user_xattr /dev/sdb1 /mnt/ost0
mount.lustre: increased /sys/block/sdb/queue/max_sectors_kb from 512 to 16384
mount.lustre: change scheduler of /sys/block/sdb/queue/scheduler from cfq to deadline
[   79.285538] libcfs: module verification failed: signature and/or required key missing - tainting kernel
[   79.302042] LNet: HW CPU cores: 32, npartitions: 4
[   79.311423] alg: No test for adler32 (adler32-zlib)
[   79.318433] alg: No test for crc32 (crc32-table)
[   87.529705] Lustre: Lustre: Build Version: 2.8.57
[   87.721568] LNet: Added LNI 10.2.4.56@tcp [8/256/0/180]
[   87.728741] LNet: Accept secure, port 988
[   88.022628] LDISKFS-fs (sdb1): file extents enabled, maximum tree depth=5
[   88.426512] LDISKFS-fs (sdb1): recovery complete
[   88.485928] LDISKFS-fs (sdb1): mounted filesystem with ordered data mode. Opts: acl,user_xattr,,errors=remount-ro,no_mbcache
[   88.864640] LustreError: 3112:0:(mgc_request.c:257:do_config_log_add()) MGC10.2.4.47@tcp: failed processing log, type 4: rc = -22
[   88.971376] LustreError: 3368:0:(nodemap_storage.c:368:nodemap_idx_nodemap_add_update()) cannot add nodemap config to non-existing MGS.
[   88.988471] LustreError: 3368:0:(nodemap_storage.c:1313:nodemap_fs_init()) lustre-OST0000: error loading nodemap config file, file must be removed via ldiskfs: rc = -22
[   89.067996] LustreError: 3368:0:(ofd_dev.c:248:ofd_stack_fini()) header@ffff8800b67832c0[0x0, 1, [0x1:0x0:0x0] hash exist]{
[   89.067996] 
[   89.085810] LustreError: 3368:0:(ofd_dev.c:248:ofd_stack_fini()) ....local_storage@ffff8800b6783310
[   89.085810] 
[   89.101070] LustreError: 3368:0:(ofd_dev.c:248:ofd_stack_fini()) ....osd-ldiskfs@ffff880035899c00osd-ldiskfs-object@ffff880035899c00(i:ffff880410851e88:81/3977440011)[plain]
[   89.101070] 
[   89.125243] LustreError: 3368:0:(ofd_dev.c:248:ofd_stack_fini()) } header@ffff8800b67832c0
[   89.125243] 
[   89.139953] LustreError: 3368:0:(ofd_dev.c:248:ofd_stack_fini()) header@ffff880823297380[0x0, 1, [0x200000003:0x0:0x0] hash exist]{
[   89.139953] 
[   89.159766] LustreError: 3368:0:(ofd_dev.c:248:ofd_stack_fini()) ....local_storage@ffff8808232973d0
[   89.159766] 
[   89.174780] LustreError: 3368:0:(ofd_dev.c:248:ofd_stack_fini()) ....osd-ldiskfs@ffff880426ff8500osd-ldiskfs-object@ffff880426ff8500(i:ffff880426368d88:80/3977439977)[plain]
[   89.174780] 
[   89.198510] LustreError: 3368:0:(ofd_dev.c:248:ofd_stack_fini()) } header@ffff880823297380
[   89.198510] 
[   89.213998] LustreError: 3368:0:(ofd_dev.c:248:ofd_stack_fini()) header@ffff8800b6782b40[0x0, 1, [0xa:0x0:0x0] hash exist]{
[   89.213998] 
[   89.231128] LustreError: 3368:0:(ofd_dev.c:248:ofd_stack_fini()) ....local_storage@ffff8800b6782b90
[   89.231128] 
[   89.245899] LustreError: 3368:0:(ofd_dev.c:248:ofd_stack_fini()) ....osd-ldiskfs@ffff880035899400osd-ldiskfs-object@ffff880035899400(i:ffff88041085af88:82/3977440045)[plain]
[   89.245899] 
[   89.269322] LustreError: 3368:0:(ofd_dev.c:248:ofd_stack_fini()) } header@ffff8800b6782b40
[   89.269322] 
[   89.283367] LustreError: 3368:0:(ofd_dev.c:248:ofd_stack_fini()) header@ffff880802cd1c80[0x0, 1, [0x200000003:0x8:0x0] hash exist]{
[   89.283367] 
[   89.302572] LustreError: 3368:0:(ofd_dev.c:248:ofd_stack_fini()) ....local_storage@ffff880802cd1cd0
[   89.302572] 
[   89.317098] LustreError: 3368:0:(ofd_dev.c:248:ofd_stack_fini()) ....osd-ldiskfs@ffff880823d2d900osd-ldiskfs-object@ffff880823d2d900(i:ffff8808163400c8:98/2123498910)[lfix]
[   89.317098] 
[   89.340058] LustreError: 3368:0:(ofd_dev.c:248:ofd_stack_fini()) } header@ffff880802cd1c80
[   89.340058] 
[   89.355308] LustreError: 3368:0:(ofd_dev.c:248:ofd_stack_fini()) header@ffff8800b67829c0[0x0, 1, [0xa:0xa:0x0] hash exist]{
[   89.355308] 
[   89.372243] LustreError: 3368:0:(ofd_dev.c:248:ofd_stack_fini()) ....local_storage@ffff8800b6782a10
[   89.372243] 
[   89.386873] LustreError: 3368:0:(ofd_dev.c:248:ofd_stack_fini()) ....osd-ldiskfs@ffff880035899f00osd-ldiskfs-object@ffff880035899f00(i:ffff88041085b808:83/2755944006)[plain]
[   89.386873] 
[   89.410071] LustreError: 3368:0:(ofd_dev.c:248:ofd_stack_fini()) } header@ffff8800b67829c0
[   89.410071] 
[   89.424408] LustreError: 3368:0:(ofd_dev.c:248:ofd_stack_fini()) header@ffff8800b6782c00[0x0, 1, [0x200000001:0x1017:0x0] hash exist]{
[   89.424408] 
[   89.443666] LustreError: 3368:0:(ofd_dev.c:248:ofd_stack_fini()) ....local_storage@ffff8800b6782c50
[   89.443666] 
[   89.458132] LustreError: 3368:0:(ofd_dev.c:248:ofd_stack_fini()) ....osd-ldiskfs@ffff880035899d00osd-ldiskfs-object@ffff880035899d00(i:ffff880035f15a08:12/2606405092)[plain]
[   89.458132] 
[   89.481146] LustreError: 3368:0:(ofd_dev.c:248:ofd_stack_fini()) } header@ffff8800b6782c00
[   89.481146] 
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Initializing cgroup subsys cpuacct
[    0.000000] Linux version 3.10.0-327.28.2.el7_lustre.x86_64 (jenkins@onyx-1-sdh1-el7-x8664.onyx.hpdd.intel.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #1 SMP Thu Sep 1 10:55:39 PDT 2016

Not sure whether it is related to LU-8498.



 Comments   
Comment by Saurabh Tandan (Inactive) [ 13/Sep/16 ]

I also tried the same steps with master build# 3437, which includes LU-8498, but the issue persists. The following are the OSS logs with that build.

[root@onyx-26 ~]# mount -t lustre -o acl,user_xattr /dev/sdb1 /mnt/ost0
mount.lustre: increased /sys/block/sdb/queue/max_sectors_kb from 512 to 16384
mount.lustre: change scheduler of /sys/block/sdb/queue/scheduler from cfq to deadline
[  117.134758] libcfs: module verification failed: signature and/or required key missing - tainting kernel
[  117.150964] LNet: HW CPU cores: 32, npartitions: 4
[  117.160981] alg: No test for adler32 (adler32-zlib)
[  117.168217] alg: No test for crc32 (crc32-table)
[  125.396239] Lustre: Lustre: Build Version: 2.8.57_50_g2fd1081
[  125.431861] LNet: Added LNI 10.2.4.56@tcp [8/256/0/180]
[  125.439130] LNet: Accept secure, port 988
[  125.499505] LDISKFS-fs (sdb1): file extents enabled, maximum tree depth=5
[  125.527499] LDISKFS-fs (sdb1): mounted filesystem with ordered data mode. Opts: acl,user_xattr,,errors=remount-ro,no_mbcache
[  125.840774] LustreError: 9640:0:(mgc_request.c:253:do_config_log_add()) MGC10.2.4.47@tcp: failed processing log, type 4: rc = -22
[  125.920049] LustreError: 11501:0:(nodemap_storage.c:368:nodemap_idx_nodemap_add_update()) cannot add nodemap config to non-existing MGS.
[  125.936801] LustreError: 11501:0:(nodemap_storage.c:1324:nodemap_fs_init()) lustre-OST0000: error loading nodemap config file, file must be removed via ldiskfs: rc = -22
[  126.009549] LustreError: 11501:0:(ofd_dev.c:248:ofd_stack_fini()) header@ffff880825f99200[0x0, 1, [0x1:0x0:0x0] hash exist]{
[  126.009549] 
[  126.027440] LustreError: 11501:0:(ofd_dev.c:248:ofd_stack_fini()) ....local_storage@ffff880825f99250
[  126.027440] 
[  126.042772] LustreError: 11501:0:(ofd_dev.c:248:ofd_stack_fini()) ....osd-ldiskfs@ffff880826527800osd-ldiskfs-object@ffff880826527800(i:ffff8803fa4d55c8:81/3977440011)[plain]
[  126.042772] 
[  126.067001] LustreError: 11501:0:(ofd_dev.c:248:ofd_stack_fini()) } header@ffff880825f99200
[  126.067001] 
[  126.081779] LustreError: 11501:0:(ofd_dev.c:248:ofd_stack_fini()) header@ffff8808232dd380[0x0, 1, [0x200000003:0x0:0x0] hash exist]{
[  126.081779] 
[  126.101690] LustreError: 11501:0:(ofd_dev.c:248:ofd_stack_fini()) ....local_storage@ffff8808232dd3d0
[  126.101690] 
[  126.116828] LustreError: 11501:0:(ofd_dev.c:248:ofd_stack_fini()) ....osd-ldiskfs@ffff8800b5d71100osd-ldiskfs-object@ffff8800b5d71100(i:ffff8803fa4cc4c8:80/3977439977)[plain]
[  126.116828] 
[  126.140682] LustreError: 11501:0:(ofd_dev.c:248:ofd_stack_fini()) } header@ffff8808232dd380
[  126.140682] 
[  126.156197] LustreError: 11501:0:(ofd_dev.c:248:ofd_stack_fini()) header@ffff880825f98840[0x0, 1, [0xa:0x0:0x0] hash exist]{
[  126.156197] 
[  126.173455] LustreError: 11501:0:(ofd_dev.c:248:ofd_stack_fini()) ....local_storage@ffff880825f98890
[  126.173455] 
[  126.188219] LustreError: 11501:0:(ofd_dev.c:248:ofd_stack_fini()) ....osd-ldiskfs@ffff880826526300osd-ldiskfs-object@ffff880826526300(i:ffff8803fa4de6c8:82/3977440045)[plain]
[  126.188219] 
[  126.211805] LustreError: 11501:0:(ofd_dev.c:248:ofd_stack_fini()) } header@ffff880825f98840
[  126.211805] 
[  126.226004] LustreError: 11501:0:(ofd_dev.c:248:ofd_stack_fini()) header@ffff880424c200c0[0x0, 1, [0x200000003:0x8:0x0] hash exist]{
[  126.226004] 
[  126.245393] LustreError: 11501:0:(ofd_dev.c:248:ofd_stack_fini()) ....local_storage@ffff880424c20110
[  126.245393] 
[  126.260065] LustreError: 11501:0:(ofd_dev.c:248:ofd_stack_fini()) ....osd-ldiskfs@ffff880826527900osd-ldiskfs-object@ffff880826527900(i:ffff8803fa4df388:98/2123498910)[lfix]
[  126.260065] 
[  126.283156] LustreError: 11501:0:(ofd_dev.c:248:ofd_stack_fini()) } header@ffff880424c200c0
[  126.283156] 
[  126.298736] LustreError: 11501:0:(ofd_dev.c:248:ofd_stack_fini()) header@ffff880424c20f00[0x0, 1, [0xa:0xc:0x0] hash exist]{
[  126.298736] 
[  126.315798] LustreError: 11501:0:(ofd_dev.c:248:ofd_stack_fini()) ....local_storage@ffff880424c20f50
[  126.315798] 
[  126.330523] LustreError: 11501:0:(ofd_dev.c:248:ofd_stack_fini()) ....osd-ldiskfs@ffff880826525000osd-ldiskfs-object@ffff880826525000(i:ffff8803fa4def48:83/2922743499)[plain]
[  126.330523] 
[  126.353865] LustreError: 11501:0:(ofd_dev.c:248:ofd_stack_fini()) } header@ffff880424c20f00
[  126.353865] 
[  126.367878] LustreError: 11501:0:(ofd_dev.c:248:ofd_stack_fini()) header@ffff880825f99ec0[0x0, 1, [0x200000001:0x1017:0x0] hash exist]{
[  126.367878] 
[  126.387236] LustreError: 11501:0:(ofd_dev.c:248:ofd_stack_fini()) ....local_storage@ffff880825f99f10
[  126.387236] 
[  126.401810] LustreError: 11501:0:(ofd_dev.c:248:ofd_stack_fini()) ....osd-ldiskfs@ffff880826525a00osd-ldiskfs-object@ffff880826525a00(i:ffff8803fa4c55c8:12/2606405092)[plain]
[  126.401810] 
[  126.424933] LustreError: 11501:0:(ofd_dev.c:248:ofd_stack_fini()) } header@ffff880825f99ec0
[  126.424933] 
[  126.475139] LustreError: 11501:0:(obd_config.c:578:class_setup()) setup lustre-OST0000 failed (-22)
[  126.488291] LustreError: 11501:0:(obd_config.c:1671:class_config_llog_handler()) MGC10.2.4.47@tcp: cfg command failed: rc = -22
[  126.507069] Lustre:    cmd=cf003 0:lustre-OST0000  1:dev  2:0  3:f  
[  126.507069] 
[  126.521361] LustreError: 15b-f: MGC10.2.4.47@tcp: The configuration from log 'lustre-OST0000'failed from the MGS (-22).  Make sure this client and the MGS are running compatible versions of Lustre.
[  126.547018] LustreError: 9640:0:(obd_mount_server.c:1352:server_start_targets()) failed to start server lustre-OST0000: -22
[  126.562626] LustreError: 9640:0:(lu_object.c:1243:lu_device_fini()) ASSERTION( atomic_read(&d->ld_ref) == 0 ) failed: Refcount is 1
[  126.581424] LustreError: 9640:0:(lu_object.c:1243:lu_device_fini()) LBUG
[  126.591487] Pid: 9640, comm: mount.lustre
[  126.598335] 
[  126.598335] Call Trace:
[  126.607357]  [<ffffffffa072c7d3>] libcfs_debug_dumpstack+0x53/0x80 [libcfs]
[  126.617283]  [<ffffffffa072cd75>] lbug_with_loc+0x45/0xc0 [libcfs]
[  126.626242]  [<ffffffffa0867c78>] lu_device_fini+0xb8/0xc0 [obdclass]
[  126.635477]  [<ffffffffa084cd72>] ls_device_put+0x82/0x2a0 [obdclass]
[  126.644565]  [<ffffffffa084d06d>] local_oid_storage_fini+0xdd/0x210 [obdclass]
[  126.654470]  [<ffffffffa0806281>] mgc_set_info_async+0x951/0x1630 [mgc]
[  126.663594]  [<ffffffffa08611c9>] ? lustre_process_log+0x9e9/0xc00 [obdclass]
[  126.673310]  [<ffffffffa0737957>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
[  126.682342]  [<ffffffffa088bbf4>] server_start_targets+0x794/0x2d20 [obdclass]
[  126.692039]  [<ffffffffa0864ab6>] ? lustre_start_mgc+0x996/0x2490 [obdclass]
[  126.701460]  [<ffffffffa085d030>] ? class_config_llog_handler+0x0/0x1b60 [obdclass]
[  126.711580]  [<ffffffffa088f20d>] server_fill_super+0x108d/0x184c [obdclass]
[  126.720983]  [<ffffffffa0867058>] lustre_fill_super+0x328/0x950 [obdclass]
[  126.730180]  [<ffffffffa0866d30>] ? lustre_fill_super+0x0/0x950 [obdclass]
[  126.739418]  [<ffffffff811e235d>] mount_nodev+0x4d/0xb0
[  126.746774]  [<ffffffffa085ef88>] lustre_mount+0x38/0x60 [obdclass]
[  126.755355]  [<ffffffff811e2d09>] mount_fs+0x39/0x1b0
[  126.762525]  [<ffffffff811fe5df>] vfs_kern_mount+0x5f/0xf0
[  126.770221]  [<ffffffff81200b2e>] do_mount+0x24e/0xa40
[  126.777478]  [<ffffffff8116e30e>] ? __get_free_pages+0xe/0x50
[  126.785472]  [<ffffffff812013b6>] SyS_mount+0x96/0xf0
[  126.792693]  [<ffffffff81646d89>] system_call_fastpath+0x16/0x1b
[  126.800984] 

Message from syslogd@onyx-26 at Sep 12 17:52:37 ...
 kernel:LustreError: 9640:0:(lu_object.c:1243:lu_device_fini()) ASSERTION( atomic_read(&d->ld_ref) == 0 ) failed: Refcount is 1

Message from syslogd@onyx-26 at Sep 12 17:52:37 ...
 kernel:LustreError: 9640:0:(lu_object.c:1243:lu_device_fini()) LBUG

[  126.804371] Kernel panic - not syncing: LBUG
[  126.811718] CPU: 19 PID: 9640 Comm: mount.lustre Tainted: G          IOE  ------------   3.10.0-327.28.3.el7_lustre.x86_64 #1
[  126.827584] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.99.99.x045.022820121209 02/28/2012
[  126.842017]  ffffffffa0749def 000000007f8e88de ffff880824ef79e8 ffffffff8163667b
[  126.852647]  ffff880824ef7a68 ffffffff8162ff0a ffffffff00000008 ffff880824ef7a78
[  126.863183]  ffff880824ef7a18 000000007f8e88de ffffffffa08981d5 0000000000000000
[  126.873670] Call Trace:
[  126.878607]  [<ffffffff8163667b>] dump_stack+0x19/0x1b
[  126.886490]  [<ffffffff8162ff0a>] panic+0xd8/0x1e7
[  126.893930]  [<ffffffffa072cddb>] lbug_with_loc+0xab/0xc0 [libcfs]
[  126.903113]  [<ffffffffa0867c78>] lu_device_fini+0xb8/0xc0 [obdclass]
[  126.912379]  [<ffffffffa084cd72>] ls_device_put+0x82/0x2a0 [obdclass]
[  126.921613]  [<ffffffffa084d06d>] local_oid_storage_fini+0xdd/0x210 [obdclass]
[  126.931643]  [<ffffffffa0806281>] mgc_set_info_async+0x951/0x1630 [mgc]
[  126.941011]  [<ffffffffa08611c9>] ? lustre_process_log+0x9e9/0xc00 [obdclass]
[  126.950943]  [<ffffffffa0737957>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
[  126.960275]  [<ffffffffa088bbf4>] server_start_targets+0x794/0x2d20 [obdclass]
[  126.970247]  [<ffffffffa0864ab6>] ? lustre_start_mgc+0x996/0x2490 [obdclass]
[  126.979979]  [<ffffffffa085d030>] ? class_config_dump_handler+0xb30/0xb30 [obdclass]
[  126.990495]  [<ffffffffa088f20d>] server_fill_super+0x108d/0x184c [obdclass]
[  127.000180]  [<ffffffffa0867058>] lustre_fill_super+0x328/0x950 [obdclass]
[  127.009642]  [<ffffffffa0866d30>] ? lustre_common_put_super+0x270/0x270 [obdclass]
[  127.019858]  [<ffffffff811e235d>] mount_nodev+0x4d/0xb0
[  127.027464]  [<ffffffffa085ef88>] lustre_mount+0x38/0x60 [obdclass]
[  127.036232]  [<ffffffff811e2d09>] mount_fs+0x39/0x1b0
[  127.043618]  [<ffffffff811fe5df>] vfs_kern_mount+0x5f/0xf0
[  127.051489]  [<ffffffff81200b2e>] do_mount+0x24e/0xa40
[  127.058959]  [<ffffffff8116e30e>] ? __get_free_pages+0xe/0x50
[  127.067116]  [<ffffffff812013b6>] SyS_mount+0x96/0xf0
[  127.074491]  [<ffffffff81646d89>] system_call_fastpath+0x16/0x1b
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Initializing cgroup subsys cpuacct
[    0.000000] Linux version 3.10.0-327.28.3.el7_lustre.x86_64 (jenkins@onyx-5-sdh1-el7-x8664.onyx.hpdd.intel.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #1 SMP Fri Sep 9 20:39:59 PDT 2016
Comment by Peter Jones [ 13/Sep/16 ]

Kit

What do you advise here?

Peter

Comment by Kit Westneat [ 13/Sep/16 ]

Hi Peter,

This looks like a dupe of the second issue in LU-8508:
https://jira.hpdd.intel.com/browse/LU-8508?focusedCommentId=162247&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-162247

Is it possible to test with this patch?
http://review.whamcloud.com/#/c/22004/

Thanks,
Kit

Comment by Saurabh Tandan (Inactive) [ 13/Sep/16 ]

I will try it out with this patch.

Comment by Saurabh Tandan (Inactive) [ 13/Sep/16 ]

Hi Kit,
I tested with the patch mentioned above. The mount worked and the system did not restart this time. However, I still see a LustreError message in the logs while the OST is mounting. Is any extra work needed for this?

[root@onyx-26 ~]# mount -t lustre -o acl,user_xattr /dev/sdb1 /mnt/ost0
mount.lustre: increased /sys/block/sdb/queue/max_sectors_kb from 512 to 16384
mount.lustre: change scheduler of /sys/block/sdb/queue/scheduler from cfq to deadline
[ 2836.318943] libcfs: module verification failed: signature and/or required key missing - tainting kernel
[ 2836.333593] LNet: HW CPU cores: 32, npartitions: 4
[ 2836.343150] alg: No test for adler32 (adler32-zlib)
[ 2836.348967] alg: No test for crc32 (crc32-table)
[ 2844.384607] Lustre: Lustre: Build Version: 2.8.57_22_g5cb1549
[ 2844.422845] LNet: Added LNI 10.2.4.56@tcp [8/256/0/180]
[ 2844.428845] LNet: Accept secure, port 988
[ 2844.498034] LDISKFS-fs (sdb1): file extents enabled, maximum tree depth=5
[ 2844.525873] LDISKFS-fs (sdb1): mounted filesystem with ordered data mode. Opts: acl,user_xattr,,errors=remount-ro,no_mbcache
[ 2844.883460] LustreError: 38233:0:(mgc_request.c:253:do_config_log_add()) MGC10.2.4.47@tcp: failed processing log, type 4: rc = -22
[ 2845.382949] Lustre: lustre-OST0000: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-450
[root@onyx-26 ~]#
[ 2852.091748] Lustre: lustre-OST0000: Will be in recovery for at least 2:30, or until 3 clients reconnect
[ 2852.102484] Lustre: lustre-OST0000: Connection restored to b0ab0605-5282-cb64-ddd3-483f2393ac20 (at 10.2.4.36@tcp)
[ 2853.948292] Lustre: lustre-OST0000: Connection restored to lustre-MDT0000-mdtlov_UUID (at 10.2.4.47@tcp)
[ 2895.399155] Lustre: lustre-OST0000: Connection restored to 15ce59bd-a3c6-167b-84dd-730a88c0fe5f (at 10.2.4.37@tcp)
[ 2895.801444] Lustre: lustre-OST0000: Recovery over after 0:44, of 3 clients 3 recovered and 0 were evicted.
[ 2895.830113] Lustre: lustre-OST0000: deleting orphan objects from 0x0:4 to 0x0:33

Thanks!
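
For reference, the `rc = -22` in the remaining LustreError line corresponds to -EINVAL on Linux. The sketch below is one possible way to pull the source file, function name and return code out of a saved console capture like the ones in this ticket. The function name `summarize_lustre_errors` and the log path in the usage line are hypothetical, and the pattern only matches LustreError lines that end in an `rc = N` suffix.

```shell
#!/bin/sh
# Sketch only: print "file function rc" for every LustreError line that
# carries an "rc = -N" suffix in a saved console log.
# Usage: summarize_lustre_errors LOGFILE
summarize_lustre_errors() {
    # The capture groups are: source file, function name, return code.
    sed -n '/LustreError/s/.*(\([a-z_0-9]*\.c\):[0-9]*:\([a-z_0-9]*\)()).*rc = \(-[0-9]*\).*/\1 \2 \3/p' "$1"
}
```

For example, `summarize_lustre_errors /tmp/oss-console.log` would print one line per matching error, which makes it easier to compare the errors seen before and after applying a patch.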

Comment by Kit Westneat [ 19/Sep/16 ]

Hi Saurabh,

Sorry for the delay in responding. Do you have the -1 debug logs (or trace and info) from the MGS and the OSS? I'm not sure why it'd be returning an error.

Thanks,
Kit

Comment by Saurabh Tandan (Inactive) [ 26/Sep/16 ]

Hi Kit,
I have attached the log files for both the MGS and the OSS above. I also still have the system set up. Please let me know in case you need any more information.
Thanks!

Comment by Kit Westneat [ 11/Oct/16 ]

Hi Saurabh,

Ah, these look like the dmesg logs. Do you have the Lustre debug logs? I mean the logs that are generated by the lctl debug_kernel command. I'll need the trace and info log levels enabled in order to see what's going on.

Thanks,
Kit
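
A minimal sketch of how such a debug log could be captured, assuming the Lustre modules are already loaded so that `lctl` can reach the debug settings. The function name `capture_debug_log`, the `LCTL` override variable and the output path are hypothetical; only `lctl set_param debug=+FLAG`, `lctl clear` and `lctl debug_kernel FILE` are taken from the standard Lustre tooling.

```shell
#!/bin/sh
# Sketch only: capture a Lustre kernel debug log around one command.
# LCTL is overridable so the sequence can be previewed with LCTL="echo lctl".
LCTL=${LCTL:-lctl}

# Usage: capture_debug_log OUTPUT_FILE COMMAND [ARGS...]
capture_debug_log() {
    out=$1
    shift
    # Add the trace and info flags to the current debug mask.
    $LCTL set_param debug=+trace || return 1
    $LCTL set_param debug=+info || return 1
    # Empty the kernel debug buffer so the dump covers only this run.
    $LCTL clear || return 1
    # Run the operation under investigation; a failing run is still dumped.
    "$@"
    status=$?
    # Write the kernel debug buffer to the output file.
    $LCTL debug_kernel "$out" || return 1
    return "$status"
}
```

A possible invocation on the OSS would be `capture_debug_log /tmp/oss-debug.log mount -t lustre -o acl,user_xattr /dev/sdb1 /mnt/ost0`. With trace enabled the dump can become large, so the output file may need to be compressed before it is attached.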

Comment by Saurabh Tandan (Inactive) [ 13/Oct/16 ]

The MDS debug_log file is attached. I have the OSS debug_log file as well, but it is too big to attach. In case you want that too, please let me know and I will send it to you some other way.

Comment by Kit Westneat [ 14/Oct/16 ]

Actually I think this is to be expected if the MGS is at 2.8. Does the OSS mount ok or are there functionality problems?

Comment by Saurabh Tandan (Inactive) [ 14/Oct/16 ]

No, there are no functionality problems; the OSS mounts okay.

Comment by Saurabh Tandan (Inactive) [ 21/Oct/16 ]

As the error message is expected, I am closing the ticket.

Generated at Sat Feb 10 02:19:02 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.