[LU-1762] mmp.sh gets wrong MMP update interval Created: 17/Aug/12  Updated: 22/Feb/13  Resolved: 26/Aug/12

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.3.0, Lustre 2.1.3
Fix Version/s: Lustre 2.3.0, Lustre 2.4.0, Lustre 2.1.4, Lustre 1.8.9

Type: Bug
Priority: Blocker
Reporter: Jian Yu
Assignee: Jian Yu
Resolution: Fixed
Votes: 0
Labels: None

Issue Links:
Duplicate
is duplicated by LU-1837 2.1.3<->2.3 Test failure on test suit... Resolved
is duplicated by LU-2578 Test failure on test suite mmp: test_... Closed
Severity: 3
Rank (Obsolete): 4476

Description

Currently, the get_mmp_update_interval() in mmp.sh is implemented as follows:

get_mmp_update_interval() {
    local facet=$1
    local device=$2
    local interval

    interval=$(do_facet $facet "$DEBUGFS -c -R dump_mmp $device 2>/dev/null \
                | grep 'MMP Update Interval' | cut -d' ' -f4")
    [ -z "$interval" ] && interval=1

    echo $interval
}

The 'MMP Update Interval' string no longer matches anything, because debugfs now prints the field as 'update_interval':

# debugfs -c -R dump_mmp /dev/vda5
debugfs 1.42.3.wc3 (15-Aug-2012)
/dev/vda5: catastrophic mode - not reading inode or group bitmaps
block_number: 16416
update_interval: 5
check_interval: 5
sequence: ff4d4d50
time: 1345213198 -- Fri Aug 17 07:19:58 2012
node_name: client-16vm3
device_name: /dev/vda5
magic: 0x4d4d50

Also, since the patch for LU-264 landed, the default MMP update interval is 5 seconds instead of 1 second.

The 'MMP Check Interval' string in get_mmp_check_interval() also needs to be updated to 'check_interval'.
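
As a rough illustration of matching the new field names, here is a minimal sketch, not the landed patch. The helper name parse_mmp_field is hypothetical and does not exist in mmp.sh; it matches on the exact 'key: value' layout shown in the debugfs output above.

```shell
# Hypothetical helper (not part of mmp.sh): print the value of one
# "key: value" field from "debugfs -R dump_mmp" output read on stdin.
parse_mmp_field() {
    local field=$1
    awk -F': ' -v key="$field" '$1 == key { print $2; exit }'
}

# Sample taken from the dump_mmp output quoted in this issue.
sample='block_number: 16416
update_interval: 5
check_interval: 5
sequence: ff4d4d50'

printf '%s\n' "$sample" | parse_mmp_field update_interval   # prints 5
printf '%s\n' "$sample" | parse_mmp_field check_interval    # prints 5
```

A helper like this could stand in for the grep/cut stage of get_mmp_update_interval() and get_mmp_check_interval(). When the field is missing it prints nothing, so the existing empty-string check still applies, with the fallback presumably raised to the new default of 5.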

In addition, after looking into e2fsck/unix.c and ldiskfs/kernel_patches/patches/ext4-mmp-rhel6.patch, I found a simpler and more reliable way to fix the issue in LU-1689 than using an extra expect script:

We can simply run "tune2fs -E mmp_update_interval=$interval $device" to lengthen the "sleep(2 * mmp_check_interval + 1)" in ext2fs_mmp_start(), which e2fsck calls via try_open_fs()->ext2fs_open2(). A new sequence number is written into the MMP block before that sleep. So, once e2fsck has entered ext2fs_mmp_start() and written the new sequence number successfully, any mount attempt will fail until e2fsck reaches ext2fs_mmp_stop() and writes EXT4_MMP_SEQ_CLEAN into the MMP block.
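
To make the size of that window concrete, the sleep expression can be evaluated for a given check interval. This is only a sketch of the arithmetic; the function name mmp_start_sleep_seconds is hypothetical, and the check interval is passed in directly rather than derived from the update interval.

```shell
# Hypothetical helper: length of the sleep in ext2fs_mmp_start(),
# following the expression sleep(2 * mmp_check_interval + 1).
mmp_start_sleep_seconds() {
    local check_interval=$1
    echo $((2 * check_interval + 1))
}

mmp_start_sleep_seconds 5    # prints 11
mmp_start_sleep_seconds 10   # prints 21
```

With the check_interval of 5 shown in the dump_mmp output, the window is 11 seconds; a larger interval widens it, which gives the test more time to attempt the mount while e2fsck still holds the MMP block.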



Comments
Comment by Jian Yu [ 21/Aug/12 ]

Patch for b2_1 branch: http://review.whamcloud.com/3733
Patch for b2_3 branch: http://review.whamcloud.com/3746
Patch for master branch: http://review.whamcloud.com/3743

Comment by Peter Jones [ 26/Aug/12 ]

Landed for 2.3 and 2.4

Comment by Jian Yu [ 04/Jan/13 ]

Patch for b1_8 branch: http://review.whamcloud.com/4953
