[LU-6954] LustreError: 12934:0:(mdd_device.c:305:mdd_changelog_llog_init()) fsrzb-MDD0000: changelog init failed: rc = -5 Created: 04/Aug/15  Updated: 19/Feb/16  Resolved: 19/Feb/16

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.4
Fix Version/s: None

Type: Bug Priority: Critical
Reporter: Olaf Faaland Assignee: Bruno Faccini (Inactive)
Resolution: Fixed Votes: 0
Labels: llnl
Environment:

lustre-2.5.4-4chaos_2.6.32_504.16.2.1chaos.ch5.3.x86_64.x86_64


Attachments: HTML File changelog_catalog     File changelog_catalog.stout     HTML File changelog_users    
Issue Links:
Related
is related to LU-6556 changelog catalog corruption if all p... Resolved
is related to LU-6634 (osd_handler.c:901:osd_trans_start())... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

lustre-2.5.4-4chaos_2.6.32_504.16.2.1chaos.ch5.3.x86_64.x86_64

The mds service on both porter and stout fails to start. Able to import zfs pool on both systems with no problem. The mgs device mounts with no problem but the mdt on both systems fails to mount. Doing a "writeconf" on the stout mds did not help. The following console messages were reported on stout-mds1 console:

2015-08-02 16:38:26 Lustre: Lustre: Build Version: 2.5.4-4chaos-4chaos--PRISTINE-2.6.32-504.16.2.1chaos.ch5.3.x86_64
2015-08-02 16:38:27 Lustre: MGC172.21.1.99@o2ib200: Connection restored to MGS (at 0@lo)
2015-08-02 16:38:28 Lustre: MGS: Logs for fs fsrzb were removed by user request.  All servers must be restarted in order to regenerate the logs.
2015-08-02 16:38:30 LustreError: 11-0: fsrzb-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.
2015-08-02 16:38:31 Lustre: 12934:0:(llog_cat.c:718:llog_cat_reverse_process()) catalog 0x2:10 crosses index zero
2015-08-02 16:38:31 LustreError: 12934:0:(mdd_device.c:305:mdd_changelog_llog_init()) fsrzb-MDD0000: changelog init failed: rc = -5
2015-08-02 16:38:31 LustreError: 12934:0:(mdd_device.c:380:mdd_changelog_init()) fsrzb-MDD0000: changelog setup during init failed: rc = -5
2015-08-02 16:38:31 LustreError: 12934:0:(mdd_device.c:963:mdd_prepare()) fsrzb-MDD0000: failed to initialize changelog: rc = -5
2015-08-02 16:38:31 Lustre: fsrzb-MDT0000: Unable to start target: -5
2015-08-02 16:38:31 Lustre: Failing over fsrzb-MDT0000
2015-08-02 16:38:32 Lustre: server umount fsrzb-MDT0000 complete
2015-08-02 16:38:32 LustreError: 12934:0:(obd_mount.c:1331:lustre_fill_super()) Unable to mount  (-5)

A workaround was found to allow the MDT to mount:
Mounting the MDT via ZPL
Delete the changelog_catalog and changelog_users files
Unmount
Mount the MDT via lustre in the normal manner



 Comments   
Comment by Bruno Faccini (Inactive) [ 04/Aug/15 ]

Hello Olaf,
Did you keep a copy of changelog_catalog and changelog_users files that you can provide ?

Comment by Olaf Faaland [ 04/Aug/15 ]

Bruno,
Sorry, yes, attached now.

Comment by Bruno Faccini (Inactive) [ 04/Aug/15 ]

Thanks Olaf, and here is what I can tell after analyzing the changelog_catalog file you have provided.
I am not able to confirm that Lustre v2.5.4, you run with, contains or not patch from LU-4528 (http://review.whamcloud.com/#/c/10108/, Commit 7c243a561ffe8503a6abf5c4cafef0c3566192bc). Can you check this for me?
But if this is the case and since your changelog_catalog had just reached its end and was about to loop-back, I think you likely encountered the same kind of regression described in LU-6556.

Comment by Christopher Morrone [ 04/Aug/15 ]

I find it exceptionally unlikely that two different filesystems had both independently "just reached" the end of the changelog_catalog and were just about to loop back at exactly the same time. More explanation is needed.

Comment by Bruno Faccini (Inactive) [ 04/Aug/15 ]

Hello Chris, thanks to warn me about the fact that 2 FSs are affected, I should have better read the description text for this ticket, sorry about that.

But, if the same symptoms/msgs have occurred for both Filesystems failures, I can already confirm that the "crosses index zero" msg is an indication of a Catalog loop-back, and I will also need the 2nd changelog_catalog file for the 2nd filesystem to analyze it.
We may have end up in a situation where both Catalog have loop-back, when only one/1st was just doing so ... And this since last Filesystems restarts.

And BTW, I have double-checked the 1st Catalog you have provided and I can also confirm you that it shows the same corruption (Catalog records written past normal end, leading to a Catalog size > header+bitmap+records) than what has been found for LU-6556.

Concerning the fact I used the "just reached" comment, this may come from the fact that, for an unexplained reason at the moment, bits at the beginning of the bitmap have been cleared (or may be never set).

Comment by Olaf Faaland [ 05/Aug/15 ]

Bruno,

I confirmed that both filesystems produced the same sequence of error messages when attempting to start the MDT, including the "crosses index zero" and "changelog init failed" messages, same rc's.

We do have the LU-4528 patch in our build.

I'll attach the second changelog_catalog. The one you've already seen is from porter.

Comment by Olaf Faaland [ 14/Aug/15 ]

Bruno,

Do you have any updates on this? I see that the stout catalog file contains 67272 records, and it looks like the bitmap has only 64767 bits for tracking the status of the non- llog_log_hdr records. So it does seem to me that the changelog_catalog file is corrupt.

The records that appear after that have indices in the range 12197 - 14701, which seems odd. The code in llog_osd_prev_block() appears to me to assume that the records within a block have monotonically increasing indices, since only lrt_index is generally checked before deciding whether to read another block from disk or not. Am I correct that requirement for increasing indices?

thanks,
Olaf

Comment by Bruno Faccini (Inactive) [ 15/Aug/15 ]

Olaf,
Thanks to have provided the second ChangeLog catalog file for stout.
You are correct in the analysis of his already looped-back and corrupted content, which is again the same than for porter, and the one that has already been investigated in LU-6556.

So the scenario which is likely to have happen for both filesystems is that prior to start running with patch for LU-4528 in, both ChangeLog catalogs had already looped-back, but the patch has caused new records to be put at the end of each files instead of updating their corresponding records in-place. The corruption was detected at next FSs restart.

Am I right when I suspect that problem has occured during the 2nd restart for each FS since upgrading with LU-4528 in ?

Comment by Olaf Faaland [ 19/Aug/15 ]

Bruno,

Yes, it is likely we encountered this error on the 2nd restart after upgrading with LU-4528.

I am looking at the code in llog_osd_write_rec() and your patch for LU-6556. I see that in the old code, lgi_off is set to la_size unconditionally.

thanks,
Olaf

Comment by Olaf Faaland [ 19/Aug/15 ]

Bruno,

I can see how we could end up with the changelog_catalog file corruption if, before we upgraded to LU-4528 code, our changelog_catalog was already wrapped around, so that lgh_last_idx == 12196 and changelog_catalog size == 4,153,280. I think this is what you are saying happened.

However, in the LU-4528 patch, and in the previous code it applied to, I don't see something implementing changelog_catalog wrap-around - setting lgh_last_idx in some way other than incrementing or setting to 0 when creating changelog_catalog for the first time. Do you?

thanks,
Olaf

Comment by Olaf Faaland [ 19/Aug/15 ]

I guess what I'm really asking is, did changelog_catalogs wrap around prior to the LU-4528 patch? Some comment made me think so, maybe I misunderstood.

thanks,
Olaf

Comment by Olaf Faaland [ 19/Aug/15 ]

Bruno,

Looks to me like this code in llog_cat_new_log() implemented wrap-around before LU-4528. Please confirm I'm not misreading.

76         bitmap_size = LLOG_BITMAP_SIZE(llh);
77
78         index = (cathandle->lgh_last_idx + 1) % bitmap_size;
...
118         cathandle->lgh_last_idx = index;
119         llh->llh_tail.lrt_index = index;

thanks,
Olaf

Comment by Bruno Faccini (Inactive) [ 19/Aug/15 ]

Olaf,
To be quick, Catalog wrap-around was working before LU-4528 patch and no longer after, and also situations where Catalog has already wrapped-around will lead to corruption with new records being written past of expected Catalog's normal end of file size.

Comment by Bruno Faccini (Inactive) [ 08/Dec/15 ]

Hello Olaf,
Do you agree that this ticket can be closed as a dup of LU-6556?
Thanks again and in advance for your help and answer.

Comment by Olaf Faaland [ 09/Dec/15 ]

Hi Bruno,
Yes, I agree this should be closed as a dup of LU-6556. Thank you.
-Olaf

Comment by John Fuchs-Chesney (Inactive) [ 11/Dec/15 ]

Thanks Bruno and Olaf.
~ jfc.

Generated at Sat Feb 10 02:04:44 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.