[LU-2059] mgc to backup configuration on osd-based llogs Created: 01/Oct/12 Updated: 29/May/17 Resolved: 29/May/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0, Lustre 2.4.1, Lustre 2.5.0, Lustre 2.6.0 |
| Fix Version/s: | Lustre 2.4.0, Lustre 2.5.0, Lustre 2.6.0, Lustre 2.5.4 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Alex Zhuravlev | Assignee: | John Fuchs-Chesney (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | yuc2 | | |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 4308 |
| Comments |
| Comment by Alex Zhuravlev [ 01/Oct/12 ] |
|
mgc_process_cfg_log() does not make a local copy of the configuration if ZFS is in use. This is because lsi_srv_mnt == NULL in:

/* Copy the setup log locally if we can. Don't mess around if we're running an MGS though (logs are already local). */
if (lctxt && lsi && IS_SERVER(lsi) && (lsi->lsi_srv_mnt == cli->cl_mgc_vfsmnt) && !IS_MGS(lsi) && lsi->lsi_srv_mnt) {

The check is there because mgc_copy_llog() still relies on lvfs; see mgc_llog_init():

rc = llog_setup(NULL, obd, olg, LLOG_CONFIG_ORIG_CTXT, tgt, &llog_lvfs_ops);

This should be changed to llog_osd_ops. |
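A minimal sketch of the direction suggested above, not the actual patch: the MGC's local config-log context would be set up with the OSD-based llog operations instead of the lvfs ones, so that making the local copy no longer depends on lsi_srv_mnt. The llog_setup() call and symbol names are taken from the comment; how the landed change actually wires up the OSD backend may differ.

/* Hypothetical sketch in mgc_llog_init(): use the OSD-backed llog
 * operations so mgc_copy_llog() can work for both ldiskfs and ZFS targets. */
rc = llog_setup(NULL, obd, olg, LLOG_CONFIG_ORIG_CTXT, tgt, &llog_osd_ops);
if (rc)
        RETURN(rc);

/* With an OSD-based context, the lvfs mount check in mgc_process_cfg_log()
 * (lsi->lsi_srv_mnt == cli->cl_mgc_vfsmnt) would no longer be needed to
 * decide whether a local backup copy of the config log can be made. */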
| Comment by Andreas Dilger [ 09/Oct/12 ] |
|
http://review.whamcloud.com/4234 skips insanity.sh test_2 and test_4 for ZFS configs, and removes skip for conf-sanity.sh test_5d, test_19b, test_21b, test_27a for ldiskfs. |
| Comment by Andreas Dilger [ 09/Nov/12 ] |
|
Moving over to OSD API for the MGC backup will also allow us to delete a few thousand lines of obsolete code in lustre/lvfs/fsfilt*.c and lustre/lvfs/lvfs*.c. |
| Comment by Mikhail Pershin [ 18/Jan/13 ] |
| Comment by James A Simmons [ 11/Feb/13 ] |
|
Patch needs to be rebased. |
| Comment by Mikhail Pershin [ 11/Feb/13 ] |
|
Done |
| Comment by Nathaniel Clark [ 26/Feb/13 ] |
|
Tests are listed for zfs as ALWAYS_EXCEPTED under bug |
| Comment by Mikhail Pershin [ 27/Feb/13 ] |
|
Maloo failed at replay-ost-single test_7. As far as I can see, that happens because the config logs are 44 blocks, while the test expects no more than 40. We need to find out whether the logs have genuinely become that big (and the test must be fixed) or whether this is an issue with the patch. |
| Comment by James A Simmons [ 04/Mar/13 ] |
|
Mikhail, does http://review.whamcloud.com/#change,5569 replace http://review.whamcloud.com/#change,5049? |
| Comment by Mikhail Pershin [ 04/Mar/13 ] |
|
James, it is the same as 5049 but with debug in replay-ost-single to find out how big config logs can be. See my previous comment. |
| Comment by James A Simmons [ 22/Mar/13 ] |
|
Any updates on this? |
| Comment by Mikhail Pershin [ 30/Mar/13 ] |
|
James, I've found object leaks with the latest patch. I am figuring out what is wrong. |
| Comment by James A Simmons [ 01/Apr/13 ] |
|
Excellent, looking forward to your patch. |
| Comment by Peter Jones [ 01/Apr/13 ] |
|
Is this a must-have fix for 2.4.0 or can it be tidied up in 2.4.1? |
| Comment by Peter Jones [ 02/Apr/13 ] |
|
dropping priority after chatting with Mike. |
| Comment by Christopher Morrone [ 02/Apr/13 ] |
|
Could you elaborate? This doesn't seem to me to be the kind of thing we should be doing on maintenance branches. |
| Comment by Peter Jones [ 02/Apr/13 ] |
|
Chris, Mike basically said that this ticket was created to track some technical-debt-type cleanup and that it would be OK to live with the present situation for 2.4. If, as you say, the changes are too large when ready, then they would end up in 2.5 rather than 2.4.1. Peter |
| Comment by James A Simmons [ 24/Apr/13 ] |
|
With the latest test I'm getting this error. I do have a few other patches running with this one, so tomorrow I will try a stock 2.3.64 with only this patch to see if the problem still exists.
Apr 24 17:05:13 barry-oss1 kernel: [ 481.702376] INFO: task mount.lustre:8891 blocked for more than 120 seconds. |
| Comment by James A Simmons [ 25/Apr/13 ] |
|
Running plain 2.3.64 with this patch alone still produces this error. |
| Comment by Mikhail Pershin [ 25/Apr/13 ] |
|
James, how do I reproduce this error? Is there a specific test or sequence of actions? Is this the first mount? |
| Comment by James A Simmons [ 25/Apr/13 ] |
|
This is the first mount. |
| Comment by Mikhail Pershin [ 25/Apr/13 ] |
|
James, is that with ZFS, or does it not matter? |
| Comment by James A Simmons [ 25/Apr/13 ] |
|
ldiskfs. No zfs setup here. |
| Comment by Mikhail Pershin [ 25/Apr/13 ] |
|
OK, I can reproduce that after rebasing the patch over the latest master. |
| Comment by James A Simmons [ 25/Apr/13 ] |
|
Time for git bisect |
| Comment by Mikhail Pershin [ 27/Apr/13 ] |
|
OK, I found the conflict and solved it. Please try again; the patch has been re-pushed. |
| Comment by James A Simmons [ 29/Apr/13 ] |
|
Yes, the patch is working great now. One note for testers of this patch: you need to unmount the file system and reformat the MGT. After you remount, the file system should be good to go. |
| Comment by Christopher Morrone [ 29/Apr/13 ] |
|
That last comment about reformatting is disturbing. Hopefully that was only necessary because you were running a previous version of change 5049? |
| Comment by James A Simmons [ 01/May/13 ] |
|
The format of the changelog is altered by this patch, and because of this you need to delete the MGT change logs. Yes, reformatting my MGT was a bit heavy-handed. You still need to unmount your file system and then use tunefs.lustre --writeconf ... to erase the old configuration data. Then remount, and the file system should be good to go. |
| Comment by Christopher Morrone [ 01/May/13 ] |
|
We aren't going to land something that requires all existing users of Lustre to writeconf their filesystems, right? |
| Comment by James A Simmons [ 20/May/13 ] |
|
Looks like this is a 2.5 thing. Peter, can you change the Affected Version to 2.5? |
| Comment by Peter Jones [ 20/May/13 ] |
|
ok done |
| Comment by Mikhail Pershin [ 20/May/13 ] |
|
Christopher, the format of the changelog was not changed in this patch; the patch was just rebased over the latest master. |
| Comment by James A Simmons [ 05/Jul/13 ] |
|
Testing with the latest patch, I'm getting the following error when attempting to bring up a file system.
Jul 5 11:36:30 barry-oss1 kernel: [ 530.526152] LDISKFS-fs (dm-9): recovery complete |
| Comment by Nathaniel Clark [ 25/Jul/13 ] |
|
Re-enable all tests EXCEPTed or skipped: |
| Comment by Jian Yu [ 06/Sep/13 ] |
|
Lustre build: http://build.whamcloud.com/job/lustre-b2_4/44/ (2.4.1 RC1)
insanity test 1 failed as follows:
Starting mds2: -o user_xattr,acl lustre-mdt2/mdt2 /mnt/mds2
CMD: wtm-29vm7 mkdir -p /mnt/mds2; mount -t lustre -o user_xattr,acl lustre-mdt2/mdt2 /mnt/mds2
wtm-29vm7: mount.lustre: mount lustre-mdt2/mdt2 at /mnt/mds2 failed: Input/output error
wtm-29vm7: Is the MGS running?
Start of lustre-mdt2/mdt2 on mds2 failed 5
insanity test_1: @@@@@@ FAIL: test_1 failed with 2
Console log on wtm-29vm7:
12:50:34:Lustre: DEBUG MARKER: mkdir -p /mnt/mds2; mount -t lustre -o user_xattr,acl lustre-mdt2/mdt2 /mnt/mds2
12:50:34:LustreError: 16886:0:(client.c:1052:ptlrpc_import_delay_req()) @@@ send limit expired req@ffff88007073e000 x1445162105843372/t0(0) o253->MGC10.10.17.33@tcp@10.10.17.33@tcp:26/25 lens 4768/4768 e 0 to 0 dl 0 ref 2 fl Rpc:W/0/ffffffff rc 0/-1
12:50:34:LustreError: 16886:0:(obd_mount_server.c:1124:server_register_target()) lustre-MDT0001: error registering with the MGS: rc = -5 (not fatal)
12:50:34:LustreError: 16886:0:(client.c:1052:ptlrpc_import_delay_req()) @@@ send limit expired req@ffff8800740bb400 x1445162105843380/t0(0) o101->MGC10.10.17.33@tcp@10.10.17.33@tcp:26/25 lens 328/344 e 0 to 0 dl 0 ref 2 fl Rpc:W/0/ffffffff rc 0/-1
12:50:34:LustreError: 16886:0:(client.c:1052:ptlrpc_import_delay_req()) @@@ send limit expired req@ffff8800740bb400 x1445162105843384/t0(0) o101->MGC10.10.17.33@tcp@10.10.17.33@tcp:26/25 lens 328/344 e 0 to 0 dl 0 ref 2 fl Rpc:W/0/ffffffff rc 0/-1
12:50:34:LustreError: 15c-8: MGC10.10.17.33@tcp: The configuration from log 'lustre-MDT0001' failed (-5). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
12:50:34:LustreError: 16886:0:(obd_mount_server.c:1258:server_start_targets()) failed to start server lustre-MDT0001: -5
12:50:34:LustreError: 16886:0:(obd_mount_server.c:1700:server_fill_super()) Unable to start targets: -5
12:50:34:LustreError: 16886:0:(obd_mount_server.c:849:lustre_disconnect_lwp()) lustre-MDT0000-lwp-MDT0001: Can't end config log lustre-client.
Maloo report: https://maloo.whamcloud.com/test_sets/059ee4a8-164c-11e3-aa2a-52540035b04c |
| Comment by Jodi Levi (Inactive) [ 18/Oct/13 ] |
|
Re-enabling the tests is being tracked in |
| Comment by Mikhail Pershin [ 18/Oct/13 ] |
|
The patch was merged; is there more work to do in the context of this bug? |
| Comment by Nathaniel Clark [ 18/Nov/13 ] |
|
Skipping insanity/10 because it fails in the same way as the other always-excepted tests fail. |
| Comment by Andreas Dilger [ 18/Nov/13 ] |
|
Reopening this bug, since it doesn't seem that the landed patch actually allows the OST to start when the MGS is down. I guess the config log might be replicated, but the MGC is somehow blocked waiting for the MGS? |
| Comment by Jodi Levi (Inactive) [ 18/Mar/14 ] |
|
Mike, |
| Comment by Mikhail Pershin [ 11/Apr/14 ] |
|
yes, I have to investigate this first |
| Comment by Andreas Dilger [ 30/Apr/14 ] |
|
I think the first step is to run a local test to verify manually that the OST is actually creating a local copy of the config llog on the first mount, and then check that the OST can start from the local llog if the MGS is not running. |
| Comment by Mikhail Pershin [ 05/May/14 ] |
|
Well, I certainly did that when I made the initial patch for this ticket, which creates the local config llogs. Nevertheless, I agree this is the first step to take now, just to make sure it still works. Meanwhile, I tend to think you are right that the problem might be in locking or something else that causes the MGC to wait for the MGS. |
| Comment by Alex Zhuravlev [ 05/May/14 ] |
|
Hmm, there were tests in conf-sanity to verify that the OST can start with an inactive MGS. |
| Comment by Andreas Dilger [ 07/May/14 ] |
|
I noticed that several conf-sanity.sh tests are being skipped for ZFS because of this bug: conf-sanity.sh 5d, 19b, 21b, 27a. There is also conf-sanity.sh test_50h, which doesn't give a reason, and sanity-quota.sh test_7d. It would be good to submit a patch or two to enable these tests, along with a Test-parameters: line that runs the relevant test ten times to see what the success rate is. |
| Comment by Mikhail Pershin [ 13/May/14 ] |
|
The local config logs are created as expected. Meanwhile, there was a problem with the 'params' log: it can be empty, and that case shouldn't be critical. I've enabled all excepted tests and they passed with both ldiskfs and zfs except insanity test_1, which failed only with zfs as follows:
May 13 14:56:02 nodez kernel: LustreError: 13691:0:(mgc_request.c:516:do_requeue()) failed processing log: -5
May 13 14:56:02 nodez kernel: LustreError: 13691:0:(mgc_request.c:516:do_requeue()) Skipped 2 previous similar messages
May 13 14:56:15 nodez kernel: INFO: task mount.lustre:15680 blocked for more than 120 seconds.
May 13 14:56:15 nodez kernel: Tainted: P --------------- 2.6.32 #11
May 13 14:56:15 nodez kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 13 14:56:15 nodez kernel: mount.lustre D ffff880001e0f7e0 3592 15680 15679 0x00000080
May 13 14:56:15 nodez kernel: ffff8800241167f0 0000000000000082 0000000000000000 ffffffff814ffbf6
May 13 14:56:15 nodez kernel: ffff88002d7a3b38 ffffffff81a2d020 ffff880024116ac0 000000000000f7e0
May 13 14:56:15 nodez kernel: ffff88001eaabfd8 ffff880024116ac0 ffff8800241167f0 ffff88001eaab6c8
May 13 14:56:15 nodez kernel: Call Trace:
May 13 14:56:15 nodez kernel: [<ffffffff814ffbf6>] ? __mutex_lock_slowpath+0x1f6/0x2a0
May 13 14:56:15 nodez kernel: [<ffffffff814ff325>] ? schedule_timeout+0x1c5/0x280
May 13 14:56:15 nodez kernel: [<ffffffff81048804>] ? task_rq_lock+0x54/0xa0
May 13 14:56:15 nodez kernel: [<ffffffff810512cd>] ? set_cpus_allowed_ptr+0xad/0x1b0
May 13 14:56:15 nodez kernel: [<ffffffff814ff325>] ? schedule_timeout+0x1c5/0x280
May 13 14:56:15 nodez kernel: [<ffffffff814fe319>] ? wait_for_common+0x139/0x1a0
May 13 14:56:15 nodez kernel: [<ffffffff81050f80>] ? default_wake_function+0x0/0x20
May 13 14:56:15 nodez kernel: [<ffffffffa0eed770>] ? client_lwp_config_process+0x0/0x1540 [obdclass]
May 13 14:56:15 nodez kernel: [<ffffffffa0e72d9c>] ? llog_process_or_fork+0x2bc/0x4c0 [obdclass]
May 13 14:56:15 nodez kernel: [<ffffffffa0e72faf>] ? llog_process+0xf/0x20 [obdclass]
May 13 14:56:15 nodez kernel: [<ffffffffa0ea9f79>] ? class_config_parse_llog+0x1e9/0x340 [obdclass]
May 13 14:56:15 nodez kernel: [<ffffffffa0e78b02>] ? __llog_ctxt_put+0xf2/0x150 [obdclass]
May 13 14:56:15 nodez kernel: [<ffffffffa027948c>] ? mgc_process_cfg_log+0x56c/0xe90 [mgc]
May 13 14:56:15 nodez kernel: [<ffffffffa027bdfa>] ? mgc_process_log+0x59a/0x1390 [mgc]
May 13 14:56:15 nodez kernel: [<ffffffffa0dc7f6f>] ? libcfs_debug_msg+0x3f/0x50 [libcfs]
May 13 14:56:15 nodez kernel: [<ffffffffa027ccc5>] ? do_config_log_add+0xd5/0x5b0 [mgc]
May 13 14:56:15 nodez kernel: [<ffffffffa0276dd0>] ? mgc_blocking_ast+0x0/0x800 [mgc]
May 13 14:56:15 nodez kernel: [<ffffffffa0300240>] ? ldlm_completion_ast+0x0/0x910 [ptlrpc]
May 13 14:56:15 nodez kernel: [<ffffffffa027db8f>] ? mgc_process_config+0x69f/0x1290 [mgc]
May 13 14:56:15 nodez kernel: [<ffffffffa0ec1dc9>] ? lustre_process_log+0x219/0x1000 [obdclass]
May 13 14:56:15 nodez kernel: [<ffffffffa0ec2106>] ? lustre_process_log+0x556/0x1000 [obdclass]
May 13 14:56:15 nodez kernel: [<ffffffff811283e2>] ? kmem_cache_alloc_trace+0x1c2/0x1d0
May 13 14:56:15 nodez kernel: [<ffffffffa0ee89c6>] ? lustre_start_lwp+0x146/0x700 [obdclass]
May 13 14:56:15 nodez kernel: [<ffffffffa0ee8a1f>] ? lustre_start_lwp+0x19f/0x700 [obdclass]
May 13 14:56:15 nodez kernel: [<ffffffffa0eead66>] ? server_start_targets+0xeb6/0x1d30 [obdclass]
May 13 14:56:15 nodez kernel: [<ffffffffa0ebe4c6>] ? lustre_start_mgc+0x456/0x29b0 [obdclass]
May 13 14:56:15 nodez kernel: [<ffffffffa0eb6af0>] ? class_config_llog_handler+0x0/0x1e50 [obdclass]
May 13 14:56:15 nodez kernel: [<ffffffffa0ef1a3f>] ? server_fill_super+0xbff/0x15c0 [obdclass]
May 13 14:56:15 nodez kernel: [<ffffffffa0ec0a20>] ? lustre_fill_super+0x0/0x520 [obdclass]
May 13 14:56:15 nodez kernel: [<ffffffffa0ec0bf8>] ? lustre_fill_super+0x1d8/0x520 [obdclass]
May 13 14:56:15 nodez kernel: [<ffffffffa0ec0a20>] ? lustre_fill_super+0x0/0x520 [obdclass]
May 13 14:56:15 nodez kernel: [<ffffffffa0ec0a20>] ? lustre_fill_super+0x0/0x520 [obdclass]
May 13 14:56:15 nodez kernel: [<ffffffff8113c9a9>] ? get_sb_nodev+0x59/0xb0
May 13 14:56:15 nodez kernel: [<ffffffffa0eb8b5c>] ? lustre_get_sb+0x1c/0x30 [obdclass]
May 13 14:56:15 nodez kernel: [<ffffffff8113bf16>] ? vfs_kern_mount+0x76/0x1b0
May 13 14:56:15 nodez kernel: [<ffffffff8113c0c3>] ? do_kern_mount+0x53/0x130
May 13 14:56:15 nodez kernel: [<ffffffff81157d32>] ? do_mount+0x302/0x8f0
May 13 14:56:15 nodez kernel: [<ffffffff811583ba>] ? sys_mount+0x9a/0xf0
May 13 14:56:15 nodez kernel: [<ffffffff81009f42>] ? system_call_fastpath+0x16/0x1b
May 13 14:57:14 nodez kernel: LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
May 13 14:57:14 nodez kernel: LustreError: Skipped 129 previous similar messages
May 13 14:57:24 nodez kernel: LustreError: 11-0: lustre-MDT0000-osp-MDT0001: Communicating with 0@lo, operation mds_connect failed with -19.
May 13 14:57:24 nodez kernel: LustreError: Skipped 51 previous similar messages
May 13 14:58:15 nodez kernel: INFO: task ll_cfg_requeue:13691 blocked for more than 120 seconds.
May 13 14:58:15 nodez kernel: Tainted: P --------------- 2.6.32 #11
May 13 14:58:15 nodez kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 13 14:58:15 nodez kernel: ll_cfg_requeu D ffff880001e0f7e0 4760 13691 2 0x00000080
May 13 14:58:15 nodez kernel: ffff88001f96c040 0000000000000046 0000000000000000 ffff880001e0f920
May 13 14:58:15 nodez kernel: 0000000000000018 ffffffff81a2d020 ffff88001f96c310 000000000000f7e0
May 13 14:58:15 nodez kernel: ffff880028f99fd8 ffff88001f96c310 ffff88001f96c040 ffff880001e0f920
May 13 14:58:15 nodez kernel: Call Trace:
May 13 14:58:15 nodez kernel: [<ffffffff812bc54a>] ? string+0x3a/0xf0
May 13 14:58:15 nodez kernel: [<ffffffff812bdbf7>] ? vsnprintf+0x357/0x600
May 13 14:58:15 nodez kernel: [<ffffffff814ffb17>] ? __mutex_lock_slowpath+0x117/0x2a0
May 13 14:58:15 nodez kernel: [<ffffffff8150031f>] ? mutex_lock+0x1f/0x40
May 13 14:58:15 nodez kernel: [<ffffffffa027b8c3>] ? mgc_process_log+0x63/0x1390 [mgc]
May 13 14:58:15 nodez kernel: [<ffffffff81501377>] ? _spin_lock_irqsave+0x17/0x20
May 13 14:58:15 nodez kernel: [<ffffffff81068353>] ? lock_timer_base+0x33/0x70
May 13 14:58:15 nodez kernel: [<ffffffff8106899e>] ? try_to_del_timer_sync+0x6e/0xd0
May 13 14:58:15 nodez kernel: [<ffffffffa0dc7f6f>] ? libcfs_debug_msg+0x3f/0x50 [libcfs]
May 13 14:58:15 nodez kernel: [<ffffffffa027ea8d>] ? mgc_requeue_thread+0x30d/0x7a0 [mgc]
May 13 14:58:15 nodez kernel: [<ffffffff81050f80>] ? default_wake_function+0x0/0x20
May 13 14:58:15 nodez kernel: [<ffffffffa027e780>] ? mgc_requeue_thread+0x0/0x7a0 [mgc]
May 13 14:58:15 nodez kernel: [<ffffffff81077e9e>] ? kthread+0x8e/0xa0
May 13 14:58:15 nodez kernel: [<ffffffff8100b0ea>] ? child_rip+0xa/0x20
May 13 14:58:15 nodez kernel: [<ffffffff81077e10>] ? kthread+0x0/0xa0
May 13 14:58:15 nodez kernel: [<ffffffff8100b0e0>] ? child_rip+0x0/0x20
That looks like a deadlock; I'm not sure why it happens with ZFS only, though. |
| Comment by Mikhail Pershin [ 13/May/14 ] |
|
Patch in Gerrit with the last fixes related to 'params' log handling and the enabled tests: The test_1 in insanity is still excluded; it failed due to a deadlock of some kind and needs more investigation. Maybe it is worth filing another bug for that or reopening |
| Comment by Jodi Levi (Inactive) [ 23/May/14 ] |
|
Patch landed to Master. Please reopen ticket if more work is needed. |
| Comment by Nathaniel Clark [ 20/Jun/14 ] |
|
Port to b2_4: |
| Comment by Andreas Dilger [ 01/Oct/14 ] |
|
sanity-quota test_7d is still being skipped for ZFS filesystems, reportedly because of this bug. |
| Comment by Andreas Dilger [ 01/Oct/14 ] |
|
insanity.sh test_1 is also being skipped for ZFS filesystems due to this bug. |
| Comment by Jian Yu [ 25/Oct/14 ] |
|
Here are the back-ported patches for Lustre b2_5 branch: |
| Comment by Jian Yu [ 11/Nov/14 ] |
|
The above patches were merged into Lustre b2_5 branch for 2.5.4 release. |
| Comment by Andreas Dilger [ 28/Nov/14 ] |
|
The conf-sanity test_57b is still being skipped, and patch http://review.whamcloud.com/10357, which enables it, is failing testing. This bug is definitely not fixed. |
| Comment by Mikhail Pershin [ 16/Dec/14 ] |
|
Andreas, the patch you mentioned passed test_57b successfully. Did I miss something? |
| Comment by Mikhail Pershin [ 16/Dec/14 ] |
|
as for insanity test_1, it is not |