Type: Bug
Resolution: Fixed
Priority: Blocker
Fix Version/s: Lustre 2.3.0, Lustre 2.4.0, Lustre 2.1.4, Lustre 1.8.9
Affects Version/s: Lustre 2.3.0, Lustre 2.1.3
Labels: None
Severity: 3
Rank (Obsolete): 4476

Currently, get_mmp_update_interval() in mmp.sh is implemented as follows:

get_mmp_update_interval() {
    local facet=$1
    local device=$2
    local interval

    interval=$(do_facet $facet "$DEBUGFS -c -R dump_mmp $device 2>/dev/null \
                | grep 'MMP Update Interval' | cut -d' ' -f4")
    [ -z "$interval" ] && interval=1

    echo $interval
}

The 'MMP Update Interval' string no longer matches, because debugfs now prints the field as 'update_interval':

# debugfs -c -R dump_mmp /dev/vda5
debugfs 1.42.3.wc3 (15-Aug-2012)
/dev/vda5: catastrophic mode - not reading inode or group bitmaps
block_number: 16416
update_interval: 5
check_interval: 5
sequence: ff4d4d50
time: 1345213198 -- Fri Aug 17 07:19:58 2012
node_name: client-16vm3
device_name: /dev/vda5
magic: 0x4d4d50

Also, since the patch for LU-264 landed, the default MMP update interval is 5 seconds instead of 1 second.

The 'MMP Check Interval' string in get_mmp_check_interval() also needs to be updated to 'check_interval'.
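The following is a minimal sketch of how the parsing could be updated, not necessarily the patch that landed. parse_mmp_field is a hypothetical helper introduced here for illustration; it reads dump_mmp output on stdin and matches the new "name: value" layout. The fallback of 5 is an assumption taken from the new default mentioned above. do_facet and $DEBUGFS come from the existing test framework, as in the current function.

```shell
# Hypothetical helper (not part of mmp.sh): print the value of one
# field from "debugfs -R dump_mmp" output read on stdin, or the given
# default if the field is absent.
parse_mmp_field() {
    local field=$1
    local default=$2
    local value

    # Lines look like "update_interval: 5"; split on ": " and match
    # the field name exactly.
    value=$(awk -F': ' -v f="$field" '$1 == f { print $2; exit }')
    [ -z "$value" ] && value=$default

    echo "$value"
}

# Sketch of the updated function; the default of 5 is an assumption
# based on the post-LU-264 default update interval.
get_mmp_update_interval() {
    local facet=$1
    local device=$2

    do_facet $facet "$DEBUGFS -c -R dump_mmp $device 2>/dev/null" |
        parse_mmp_field update_interval 5
}
```

The same helper could serve get_mmp_check_interval() by passing check_interval as the field name.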

In addition, after looking into e2fsck/unix.c and ldiskfs/kernel_patches/patches/ext4-mmp-rhel6.patch, I found a simpler and more reliable way to fix the issue in LU-1689 than using an extra expect script:

We can just run "tune2fs -E mmp_update_interval=$interval $device" to lengthen the "sleep(2 * mmp_check_interval + 1)" in ext2fs_mmp_start() (which e2fsck calls via try_open_fs()->ext2fs_open2()). A new sequence number is written into the MMP block before that sleep. So, once e2fsck has entered ext2fs_mmp_start() and written the new sequence number, a mount attempt will always fail until e2fsck reaches ext2fs_mmp_stop() and writes EXT4_MMP_SEQ_CLEAN into the MMP block.
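As a rough illustration of the window this creates, the sleep expression quoted above can be evaluated for a given check interval. mmp_start_sleep is a hypothetical helper name used only for this sketch; it mirrors the arithmetic and does not touch any device.

```shell
# Hypothetical helper: seconds e2fsck sleeps in ext2fs_mmp_start(),
# per the sleep(2 * mmp_check_interval + 1) expression quoted above.
mmp_start_sleep() {
    local check_interval=$1

    echo $((2 * check_interval + 1))
}

# With the check_interval of 5 shown in the dump_mmp output above:
mmp_start_sleep 5
```

For a check interval of 5 this gives 11 seconds, during which a concurrent mount attempt is expected to fail.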

Is duplicated by:

LU-1837 2.1.3<->2.3 Test failure on test suite mmp, subtest test_8 (Resolved)
LU-2578 Test failure on test suite mmp: test_8 failed with 1 (Closed)

Assignee: Jian Yu
Reporter: Jian Yu
Votes: 0
Watchers: 3
Created: 17/Aug/12 11:23 AM
Updated: 22/Feb/13 11:21 AM
Resolved: 26/Aug/12 1:00 AM
