[LU-2008] After hardware reboot (using pm) the node cannot be accessed Created: 23/Sep/12 Updated: 15/Oct/13 Resolved: 08/Apr/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.3.0, Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.4.0, Lustre 2.4.1, Lustre 2.5.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Maloo | Assignee: | Jian Yu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | HB |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 4110 |
| Description |
|
This issue was created by maloo for bobijam <bobijam@whamcloud.com>.

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/02a8d976-05c6-11e2-b6a7-52540035b04c. The sub-test test_0b failed with the following error:

Info required for matching: replay-ost-single 0b

11:42:23:== replay-ost-single test 0b: empty replay =========================================================== 11:42:21 (1348425741) |
| Comments |
| Comment by Zhenyu Xu [ 23/Sep/12 ] |
|
Chris, is this related to TT-868, or is it another issue? The test failed over an OSS (client-25vm4) and lost it forever. The test parameters are:

Test-Parameters: fortestonly envdefinitions=SLOW=yes clientcount=4 osscount=2 mdscount=2 austeroptions=-R failover=true useiscsi=true testlist=replay-ost-single |
| Comment by Jodi Levi (Inactive) [ 25/Sep/12 ] |
|
This is blocking the work on |
| Comment by Peter Jones [ 27/Sep/12 ] |
|
Dropping priority as this is not preventing any testing from happening |
| Comment by Sarah Liu [ 10/Jan/13 ] |
|
hit this issue again during 2.4 testing: https://maloo.whamcloud.com/test_sessions/6e837946-4809-11e2-8cdc-52540035b04c |
| Comment by Keith Mannthey (Inactive) [ 10/Jan/13 ] |
|
I took a quick look at the test failure from https://maloo.whamcloud.com/test_sets/8b557c90-4809-11e2-8cdc-52540035b04c and two things really pop out at me.

"Duration: 86400s": my calculator tells me that is 24 hours.

"Failure Rate: 78.00% of last 100 executions [all branches]": I am learning not to trust this metric, but as testing is taking days right now this could be part of the issue.

From the syslog of the test session, the start:

Dec 15 02:39:30 fat-intel-3vm5 kernel: Lustre: DEBUG MARKER: /usr/sbin/lctl mark Starting failover on mds1
Dec 15 02:39:35 fat-intel-3vm5 xinetd[1587]: EXIT: shell status=0 pid=7402 duration=5(sec)
Dec 15 02:40:33 fat-intel-3vm5 xinetd[1587]: START: shell pid=7430 from=::ffff:10.10.4.86
Dec 15 02:40:33 fat-intel-3vm5 rshd[7431]: root@fat-intel-3vm1.lab.whamcloud.com as root: cmd='(PATH=$PATH:/usr/lib64/lustre/utils:/usr/lib64/lustre/tests:/sbin:/usr/sbin; cd /usr/lib64/lustre/tests; LUSTRE="/usr/lib64/lustre" USE_OFD=yes MGSFSTYPE=ldiskfs MDSFSTYPE=ldiskfs OSTFSTYPE=ldiskfs FSTYPE=ldiskfs sh -c "/usr/sbin/lctl mark Starting failover on mds1");echo XXRETCODE:$?'
Dec 15 02:40:33 fat-intel-3vm5 kernel: Lustre: DEBUG MARKER: Starting failover on mds1
Dec 15 02:40:33 fat-intel-3vm5 xinetd[1587]: EXIT: shell status=0 pid=7430 duration=0(sec)
Dec 15 02:41:20 fat-intel-3vm5 kernel: LustreError: 7107:0:(vvp_io.c:1075:vvp_io_commit_write()) Write page 465052 of inode ffff8800765bd638 failed -28
Dec 15 02:43:50 fat-intel-3vm5 kernel: LustreError: 7459:0:(vvp_io.c:1075:vvp_io_commit_write()) Write page 465147 of inode ffff8800765bd638 failed -28
Dec 15 02:43:51 fat-intel-3vm5 kernel: LustreError: 7471:0:(vvp_io.c:1075:vvp_io_commit_write()) Write page 0 of inode ffff8800765bd638 failed -28
Dec 15 02:46:20 fat-intel-3vm5 kernel: LustreError: 7475:0:(vvp_io.c:1075:vvp_io_commit_write()) Write page 464979 of inode ffff8800765bd638 failed -28
Dec 15 02:48:50 fat-intel-3vm5 kernel: LustreError: 7487:0:(vvp_io.c:1075:vvp_io_commit_write()) Write page 465103 of inode ffff8800765bd638 failed -28

A day or so of "write page failed" errors follows. The end:

Dec 16 02:33:30 fat-intel-3vm5 kernel: LustreError: 6617:0:(vvp_io.c:1075:vvp_io_commit_write()) Write page 464781 of inode ffff8800765bd638 failed -28
Dec 16 02:33:30 fat-intel-3vm5 kernel: LustreError: 6617:0:(vvp_io.c:1075:vvp_io_commit_write()) Skipped 5 previous similar messages
Dec 16 02:39:32 fat-intel-3vm5 xinetd[1587]: START: shell pid=6666 from=::ffff:10.10.4.86
Dec 16 02:39:32 fat-intel-3vm5 rshd[6667]: autotest@fat-intel-3vm1.lab.whamcloud.com as root: cmd='/home/autotest/.autotest/dynamic_bash/70041011263340'
Dec 16 02:39:34 fat-intel-3vm5 kernel: SysRq : Show State
Dec 16 02:39:34 fat-intel-3vm5 kernel: task PC stack pid father

I am going to dig into this a bit more, but something seems really broken here. |
| Comment by Keith Mannthey (Inactive) [ 10/Jan/13 ] |
|
This seems to be a script variable issue. From the test output (just part of it):

FAIL CLIENT fat-intel-3vm6...
+ pm -h powerman --off fat-intel-3vm6
pm: warning: server version (2.3.5) != client (2.3.12)
Command completed successfully
Starting failover on mds1
CMD: fat-intel-3vm3 /usr/sbin/lctl dl
Failing on
+ pm -h powerman --off
Usage: pm [action] [targets]
-1,--on targets Power on targets
-0,--off targets Power off targets
-c,--cycle targets Power cycle targets

So we fail client fat-intel-3vm6 just fine. Then we start on the MDS ("Starting failover on mds1") and we get "Failing on ": there is nothing there. pm is confused and prints its usage info because we do not pass in a host.

lustre/tests/recovery-random-scale.sh calls:

log "Starting failover on $serverfacet"
facet_failover "$serverfacet" || exit 1

In test-framework.sh, facet_failover() does:

for ((index=0; index<$total; index++)); do
    facet=$(echo ${affecteds[index]} | tr -s " " | cut -d"," -f 1)
    local host=$(facet_active_host $facet)
    echo "Failing ${affecteds[index]} on $host"
    shutdown_facet $facet
done

$host is not set and ${affecteds[index]} is empty. Something in the framework is not quite right, and I don't think I am quite the right person to dig all that out. I will put a sanity check into shutdown_facet() to keep errors like this from wasting a day of cycles (see the sketch below).
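As a rough illustration of the kind of guard mentioned above, here is a minimal sketch of a sanity check at the top of shutdown_facet(). The helper names (facet_active_host, error) follow the existing framework, but this is an assumption about how such a check could look, not the change that actually landed:

# Hypothetical guard at the top of shutdown_facet() in test-framework.sh.
shutdown_facet() {
    local facet=$1
    local host=$(facet_active_host $facet)

    # If the facet or its active host cannot be resolved, fail fast instead
    # of calling $POWER_DOWN with no target and waiting out a long timeout.
    if [ -z "$facet" ] || [ -z "$host" ]; then
        error "shutdown_facet: facet='$facet' host='$host' cannot be resolved"
        return 1
    fi

    # ... existing HARD/SOFT shutdown logic would follow here ...
}
|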
| Comment by Keith Mannthey (Inactive) [ 10/Jan/13 ] |
|
With the lack of return codes and arguments, it is reasonably complicated to properly detect the error and skip the timeout. I noticed there were widespread timeout issues with the total run: 7 tests ran to timeout. http://jira.whamcloud.com/browse/TT-1016 seems very related. The root issue may be occurring at a reasonably high rate, and it is eating up lots of cycles on the test cluster. When things go off the rails like this we need a good way of bailing out. |
| Comment by Chris Gearing (Inactive) [ 01/Feb/13 ] |
|
Below is the configuration used by one of the runs. It looks OK to me, but someone else might like to review it and work out what is wrong.

#!/bin/bash
#Auto Generated By Whamcloud Autotest
#Key Exports
export mgs_HOST=fat-intel-3vm3
export mds_HOST=fat-intel-3vm3
export MGSDEV=/dev/lvm-MDS/P1
export MDSDEV=/dev/lvm-MDS/P1
export mds1_HOST=fat-intel-3vm3
export MDSDEV1=/dev/lvm-MDS/P1
export MDSCOUNT=1
export MDSSIZE=10485760
export MGSSIZE=10485760
export MDSFSTYPE=ldiskfs
export MGSFSTYPE=ldiskfs
export mdsfailover_HOST=fat-intel-3vm7
export mds1failover_HOST=fat-intel-3vm7
export MGSNID=fat-intel-3vm3:fat-intel-3vm7
export FAILURE_MODE=HARD
export POWER_DOWN="pm -h powerman --off"
export POWER_UP="pm -h powerman --on"
export ost_HOST=fat-intel-3vm4
export ostfailover_HOST=fat-intel-3vm8
export ost1_HOST=fat-intel-3vm4
export OSTDEV1=/dev/lvm-OSS/P1
export ost1failover_HOST=fat-intel-3vm8
export ost2_HOST=fat-intel-3vm4
export OSTDEV2=/dev/lvm-OSS/P2
export ost2failover_HOST=fat-intel-3vm8
export ost3_HOST=fat-intel-3vm4
export OSTDEV3=/dev/lvm-OSS/P3
export ost3failover_HOST=fat-intel-3vm8
export ost4_HOST=fat-intel-3vm4
export OSTDEV4=/dev/lvm-OSS/P4
export ost4failover_HOST=fat-intel-3vm8
export ost5_HOST=fat-intel-3vm4
export OSTDEV5=/dev/lvm-OSS/P5
export ost5failover_HOST=fat-intel-3vm8
export ost6_HOST=fat-intel-3vm4
export OSTDEV6=/dev/lvm-OSS/P6
export ost6failover_HOST=fat-intel-3vm8
export ost7_HOST=fat-intel-3vm4
export OSTDEV7=/dev/lvm-OSS/P7
export ost7failover_HOST=fat-intel-3vm8
# some setup for conf-sanity test 24a, 24b, 33a
export fs2mds_DEV=/dev/lvm-MDS/S1
export fs2ost_DEV=/dev/lvm-OSS/S1
export fs3ost_DEV=/dev/lvm-OSS/S2
export RCLIENTS="fat-intel-3vm6 fat-intel-3vm5"
export OSTCOUNT=7
export NETTYPE=tcp
export OSTSIZE=2097152
export OSTFSTYPE=ldiskfs
export FSTYPE=ldiskfs
export SHARED_DIRECTORY=/home/autotest/.autotest/shared_dir/2012-12-15/011204-70041009461540
export SLOW=yes
# Adding contents of /home/autotest/autotest/mecturk-standalone.sh
VERBOSE=true
# Entries above here come are created by configurecluster.rb
# Entries below here come from mecturk.h
FSNAME=lustre
TMP=${TMP:-/tmp}
DAEMONSIZE=${DAEMONSIZE:-500}
MDSOPT=${MDSOPT:-""}
MGSOPT=${MGSOPT:-""}
# sgpdd-survey requires these to be set. They apprarently have no side affect.
SGPDD_YES=true
REFORMAT=true
# some bits for liblustre tcp connecttions
export LNET_ACCEPT_PORT=7988
export ACCEPTOR_PORT=7988
OSTOPT=${OSTOPT:-""}
STRIPE_BYTES=${STRIPE_BYTES:-1048576}
STRIPES_PER_OBJ=${STRIPES_PER_OBJ:-0}
SINGLEMDS=${SINGLEMDS:-"mds1"}
TIMEOUT=${TIMEOUT:-20}
PTLDEBUG=${PTLDEBUG:-0x33f0404}
DEBUG_SIZE=${DEBUG_SIZE:-32}
SUBSYSTEM=${SUBSYSTEM:- 0xffb7e3ff}
MKFSOPT=""
MOUNTOPT=""
[ "x$MDSJOURNALSIZE" != "x" ] && MKFSOPT=$MKFSOPT" -J size=$MDSJOURNALSIZE"
[ "x$MDSISIZE" != "x" ] && MKFSOPT=$MKFSOPT" -i $MDSISIZE"
[ "x$MKFSOPT" != "x" ] && MKFSOPT="--mkfsoptions=\\\"$MKFSOPT\\\""
[ "x$MDSCAPA" != "x" ] && MKFSOPT="--param mdt.capa=$MDSCAPA"
[ "$MDSFSTYPE" = "ldiskfs" ] && MDSOPT=$MDSOPT" --mountfsoptions=errors=remount-ro,iopen_nopriv,user_xattr,acl"
[ "x$mdsfailover_HOST" != "x" ] && MDSOPT=$MDSOPT" --failnode=`h2$NETTYPE $mdsfailover_HOST`"
[ "x$STRIPE_BYTES" != "x" ] && MOUNTOPT=$MOUNTOPT" --param lov.stripesize=$STRIPE_BYTES"
[ "x$STRIPES_PER_OBJ" != "x" ] && MOUNTOPT=$MOUNTOPT" --param lov.stripecount=$STRIPES_PER_OBJ"
[ "x$L_GETIDENTITY" != "x" ] && MOUNTOPT=$MOUNTOPT" --param mdt.identity_upcall=$L_GETIDENTITY"
# Check for wide stripping
[ $OSTCOUNT -gt 160 ] && MDSOPT=$MDSOPT" --mkfsoptions=-O large_xattr -J size=4096"
MDS_MKFS_OPTS="--mdt --fsname=$FSNAME $MKFSOPT $MDSOPT"
[ "$MDSFSTYPE" = "ldiskfs" ] && MDS_MKFS_OPTS=$MDS_MKFS_OPTS" --param sys.timeout=$TIMEOUT --device-size=$MDSSIZE"
[ "$MDSFSTYPE" = "zfs" ] && MDS_MKFS_OPTS=$MDS_MKFS_OPTS" --vdev-size=$MDSSIZE"
if combined_mgs_mds ; then
    [ "$MDSCOUNT" = "1" ] && MDS_MKFS_OPTS="--mgs $MDS_MKFS_OPTS"
else
    MDS_MKFS_OPTS="--mgsnode=$MGSNID $MDS_MKFS_OPTS"
    [ "$MGSFSTYPE" = "ldiskfs" ] && MGS_MKFS_OPTS="--mgs --device-size=$MGSSIZE"
    [ "$MGSFSTYPE" = "zfs" ] && MGS_MKFS_OPTS="--mgs --vdev-size=$MGSSIZE"
fi
if [ "$MDSDEV1" != "$MGSDEV" ]; then
    if [ "$MGSFSTYPE" == "ldiskfs" ]; then
        MGS_MOUNT_OPTS=${MGS_MOUNT_OPTS:-"-o loop"}
    else
        MGS_MOUNT_OPTS=${MGS_MOUNT_OPTS:-""}
    fi
else
    MGS_MOUNT_OPTS=${MGS_MOUNT_OPTS:-$MDS_MOUNT_OPTS}
fi
MKFSOPT=""
MOUNTOPT=""
[ "x$OSTJOURNALSIZE" != "x" ] && MKFSOPT=$MKFSOPT" -J size=$OSTJOURNALSIZE"
[ "x$MKFSOPT" != "x" ] && MKFSOPT="--mkfsoptions=\\\"$MKFSOPT\\\""
[ "x$OSSCAPA" != "x" ] && MKFSOPT="--param ost.capa=$OSSCAPA"
[ "x$ostfailover_HOST" != "x" ] && OSTOPT=$OSTOPT" --failnode=`h2$NETTYPE $ostfailover_HOST`"
OST_MKFS_OPTS="--ost --fsname=$FSNAME --mgsnode=$MGSNID $MKFSOPT $OSTOPT"
[ "$OSTFSTYPE" = "ldiskfs" ] && OST_MKFS_OPTS=$OST_MKFS_OPTS" --param sys.timeout=$TIMEOUT --device-size=$OSTSIZE"
[ "$OSTFSTYPE" = "zfs" ] && OST_MKFS_OPTS=$OST_MKFS_OPTS" --vdev-size=$OSTSIZE"
MDS_MOUNT_OPTS=${MDS_MOUNT_OPTS:-"-o user_xattr,acl"}
OST_MOUNT_OPTS=${OST_MOUNT_OPTS:-""}
# TT-430
SERVER_FAILOVER_PERIOD=$((60 * 15))
#RUNAS_ID=840000017
#client
MOUNT=${MOUNT:-/mnt/${FSNAME}}
MOUNT1=${MOUNT1:-$MOUNT}
MOUNT2=${MOUNT2:-${MOUNT}2}
MOUNTOPT=${MOUNTOPT:-"-o user_xattr,acl,flock"}
[ "x$RMTCLIENT" != "x" ] && MOUNTOPT=$MOUNTOPT",remote_client"
DIR=${DIR:-$MOUNT}
DIR1=${DIR:-$MOUNT1}
DIR2=${DIR2:-$MOUNT2}
if [ $UID -ne 0 ]; then
    log "running as non-root uid $UID"
    RUNAS_ID="$UID"
    RUNAS_GID=`id -g $USER`
    RUNAS=""
else
    RUNAS_ID=${RUNAS_ID:-500}
    RUNAS_GID=${RUNAS_GID:-$RUNAS_ID}
    RUNAS=${RUNAS:-"runas -u $RUNAS_ID"}
fi
PDSH="pdsh -t 120 -S -Rrsh -w"
#PDSH="pdsh -t 120 -S -Rmrsh -w"
export RSYNC_RSH=rsh
FAILURE_MODE=${FAILURE_MODE:-SOFT} # or HARD
POWER_DOWN=${POWER_DOWN:-"powerman --off"}
POWER_UP=${POWER_UP:-"powerman --on"}
SLOW=${SLOW:-no}
FAIL_ON_ERROR=${FAIL_ON_ERROR:-true}
# error: conf_param: No such device" issue in every test suite logs
# sanity-quota test_32 hash_lqs_cur_bits isnt set properly
QUOTA_TYPE=${QUOTA_TYPE:-"ug3"}
QUOTA_USERS=${QUOTA_USERS:-"quota_usr quota_2usr sanityusr sanityusr1"}
LQUOTAOPTS=${LQUOTAOPTS:-"hash_lqs_cur_bits=3"}
# SKIP: parallel-scale test_compilebench compilebench not found
# SKIP: parallel-scale test_connectathon connectathon dir not found
# ------
cbench_DIR=/usr/bin
cnt_DIR=/opt/connectathon
MPIRUN=$(which mpirun 2>/dev/null) || true
MPIRUN_OPTIONS="-mca boot ssh"
MPI_USER=${MPI_USER:-mpiuser}
SINGLECLIENT=$(hostname)
#cbench_DIR=/data/src/benchmarks/compilebench.hg
#cnt_DIR=/data/src/benchmarks/cthon04
# For multiple clients testing, we need use the cfg/ncli.sh config file, and
# only need specify the "RCLIENTS" variable. The "CLIENTS" and "CLIENTCOUNT"
# variables are defined in init_clients_lists(), which is called from cfg/ncli.sh.
# So, if we add the contents of cfg/ncli.sh into autotest_config.sh, we would not
# need specify "CLIENTS" and "CLIENTCOUNT", and the above two issues (#3 and #4) would also be fixed.
# Start of contents of cfg/ncli.sh
CLIENT1=${CLIENT1:-`hostname`}
SINGLECLIENT=$CLIENT1
RCLIENTS=${RCLIENTS:-""}
init_clients_lists
[ -n "$RCLIENTS" -a "$PDSH" = "no_dsh" ] && \
    error "tests for remote clients $RCLIENTS needs pdsh != do_dsh " || true
[ -n "$FUNCTIONS" ] && . $FUNCTIONS || true
# for recovery scale tests
# default boulder cluster iozone location
export PATH=/opt/iozone/bin:$PATH
LOADS=${LOADS:-"dd tar dbench iozone"}
for i in $LOADS; do
    [ -f $LUSTRE/tests/run_${i}.sh ] || \
        error "incorrect load: $i"
done
CLIENT_LOADS=($LOADS)
# End of contents of cfg/ncli.sh |
| Comment by Chris Gearing (Inactive) [ 01/Feb/13 ] |
|
Has anyone set this test up manually and run it? |
| Comment by Peter Jones [ 19/Mar/13 ] |
|
Yu Jian will investigate the problems in this area |
| Comment by Jian Yu [ 21/Mar/13 ] |
|
There is a common issue in the lustre-initialization-1 reports in the above test sessions. After formatting all of the server targets, mounting them hit the following issues:

11:37:12:Setup mgs, mdt, osts
11:37:12:CMD: client-25vm3 mkdir -p /mnt/mds1
11:37:12:CMD: client-25vm3 test -b /dev/lvm-MDS/P1
11:37:12:Starting mds1: -o user_xattr,acl /dev/lvm-MDS/P1 /mnt/mds1
11:37:12:CMD: client-25vm3 mkdir -p /mnt/mds1; mount -t lustre -o user_xattr,acl /dev/lvm-MDS/P1 /mnt/mds1
11:37:43: e2label: MMP: device currently active while trying to open /dev/dm-0
11:37:43: MMP error info: last update: Sun Sep 23 11:37:37 2012
11:37:43: node: client-25vm3.lab.whamcloud.com device: dm-0
11:37:43:CMD: client-25vm3 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/openmpi/1.4-gcc/bin:/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin: NAME=autotest_config sh rpc.sh set_default_debug \"0x33f0404\" \" 0xffb7e3ff\" 32
11:37:43:CMD: client-25vm3 e2label /dev/lvm-MDS/P1 2>/dev/null
11:37:43:Started lustre:MDT0000
11:37:43:CMD: client-25vm4 mkdir -p /mnt/ost1
11:37:43:CMD: client-25vm4 test -b /dev/lvm-OSS/P1
11:37:43:Starting ost1: /dev/lvm-OSS/P1 /mnt/ost1
11:37:43:CMD: client-25vm4 mkdir -p /mnt/ost1; mount -t lustre /dev/lvm-OSS/P1 /mnt/ost1
11:38:14: e2label: MMP: device currently active while trying to open /dev/dm-0
11:38:14: MMP error info: last update: Sun Sep 23 11:38:09 2012
11:38:15: node: client-25vm4.lab.whamcloud.com device: dm-0
11:38:15:CMD: client-25vm4 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/openmpi/1.4-gcc/bin:/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin: NAME=autotest_config sh rpc.sh set_default_debug \"0x33f0404\" \" 0xffb7e3ff\" 32
11:38:15:CMD: client-25vm4 e2label /dev/lvm-OSS/P1 2>/dev/null
11:38:15:Started lustre:OST0000

The labels of the server target devices were "lustre:MDT0000", "lustre:OST0000", etc., instead of "lustre-MDT0000", "lustre-OST0000", which caused facet_up() to always return false and affected_facets() to always return an empty list under the HARD failure mode.
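To illustrate the mismatch, here is a simplified sketch (not the actual facet_up()/affected_facets() code from test-framework.sh): the framework expects a label of the form <fsname>-<target>, so the ':'-separated label never matches and the facet is treated as down.

# Simplified illustration only; the exact check in test-framework.sh may differ.
expected_label="lustre-MDT0000"                       # <fsname>-<target> form the framework expects
actual_label=$(e2label /dev/lvm-MDS/P1 2>/dev/null)   # "lustre:MDT0000" in the runs above

if [ "$actual_label" = "$expected_label" ]; then
    echo "mds1 is up"
else
    # With the mismatched label, facet_up() reports the facet as down, so
    # affected_facets() stays empty and the HARD failover path powers off
    # nothing, leaving the node unreachable.
    echo "mds1 looks down"
fi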
|
| Comment by Jian Yu [ 27/Mar/13 ] |
|
This is a Lustre issue on the master branch. Mounting an ldiskfs server target with the MMP feature enabled fails at ldiskfs_label_lustre(), which uses e2label:

[root@fat-amd-2 ~]# mkfs.lustre --mgsnode=client-1@tcp:client-3@tcp --fsname=lustre --ost --index=0 --failnode=fat-amd-3@tcp --param=sys.timeout=20 --backfstype=ldiskfs --device-size=16000000 --quiet --reformat /dev/disk/by-id/scsi-1IET_00020001
Permanent disk data:
Target: lustre:OST0000
Index: 0
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x62
(OST first_time update )
Persistent mount opts: errors=remount-ro
Parameters: mgsnode=10.10.4.1@tcp:10.10.4.3@tcp failover.node=10.10.4.134@tcp sys.timeout=20
[root@fat-amd-2 ~]# e2label /dev/disk/by-id/scsi-1IET_00020001
lustre:OST0000
[root@fat-amd-2 ~]# mkdir -p /mnt/ost1; mount -t lustre /dev/disk/by-id/scsi-1IET_00020001 /mnt/ost1
e2label: MMP: device currently active while trying to open /dev/sdf
MMP error info: last update: Wed Mar 27 07:29:05 2013
node: fat-amd-2 device: sdf
[root@fat-amd-2 ~]# e2label /dev/disk/by-id/scsi-1IET_00020001
lustre:OST0000
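
For context, a minimal reproduction sketch of the sequence shown in the transcript above; this only exercises the externally visible behaviour, not the mount.lustre/ldiskfs_label_lustre() internals:

# Once the ldiskfs target is mounted, its MMP block is active, so a plain
# e2label of the same device fails; the temporary "lustre:OST0000" label is
# therefore never rewritten to the "lustre-OST0000" form the test framework expects.
dev=/dev/disk/by-id/scsi-1IET_00020001   # device from the transcript above
mount -t lustre "$dev" /mnt/ost1
e2label "$dev" || echo "e2label failed: MMP is active while the target is mounted"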
|
| Comment by Jian Yu [ 28/Mar/13 ] |
|
The issue was introduced by http://review.whamcloud.com/3611. Patch for master branch is in http://review.whamcloud.com/5867. |
| Comment by Peter Jones [ 08/Apr/13 ] |
|
Landed for 2.4 |