conman logs ->

<ConMan> Console [hamster8] log at 2012-06-19 10:59:59 EST.
2012-06-19 11:42:37 
2012-06-19 11:42:37 LDISKFS-fs error (device md6): ldiskfs_valid_block_bitmap: Invalid block bitmap - block_group = 57173, block = 1873444866
2012-06-19 11:42:37 Aborting journal on device md6-8.
2012-06-19 11:42:37 LustreError: 24331:0:(ost_handler.c:1414:ost_blocking_ast()) Error -30 syncing data on lock cancel
2012-06-19 11:42:37 LDISKFS-fs error (device md6): ldiskfs_journal_start_sb: <3>LustreError: 9018:0:(ost_handler.c:1414:ost_blocking_ast()) Error -30 syncing data on lock cancel
2012-06-19 11:42:37 Detected aborted journal
2012-06-19 11:42:37 LDISKFS-fs (md6): Remounting filesystem read-only
2012-06-19 11:42:37 LustreError: 25130:0:(fsfilt-ldiskfs.c:501:fsfilt_ldiskfs_brw_start()) can't get handle for 136 credits: rc = -30
2012-06-19 11:42:37 LustreError: 25130:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 11:42:37 LDISKFS-fs (md6): Remounting filesystem read-only
2012-06-19 11:42:37 LDISKFS-fs error (device md6) in ldiskfs_ext_new_extent_cb: Journal has aborted
2012-06-19 11:42:37 LustreError: 25035:0:(fsfilt-ldiskfs.c:1327:fsfilt_ldiskfs_write_record()) can't start transaction for 37 blocks (128 bytes)
2012-06-19 11:42:37 LustreError: 25035:0:(filter.c:198:filter_finish_transno()) wrote trans 34390635984 for client a186348d-92d3-fe78-abbb-6d7b75e5abbb at #97: err = -30
2012-06-19 11:42:37 LustreError: 25035:0:(filter_io_26.c:534:filter_direct_io()) can't close transaction: -30
2012-06-19 11:42:37 LustreError: 26095:0:(obd.h:1399:obd_transno_commit_cb()) data-OST001e: transno 34390635984 commit error: 2
2012-06-19 11:42:37 LustreError: 26095:0:(obd.h:1399:obd_transno_commit_cb()) data-OST001e: transno 34390635983 commit error: 2
2012-06-19 11:42:37 journal commit I/O error
2012-06-19 11:42:38 LustreError: 25009:0:(fsfilt-ldiskfs.c:501:fsfilt_ldiskfs_brw_start()) can't get handle for 266 credits: rc = -30
2012-06-19 11:42:38 LustreError: 25009:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 11:42:38 LustreError: 25048:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 11:42:38 LustreError: 25082:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 11:42:38 LustreError: 25043:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 11:42:38 LustreError: 25029:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 11:42:38 LustreError: 25072:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 11:42:38 LustreError: 25038:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 11:42:38 LustreError: 25103:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 11:42:38 LustreError: 25078:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 11:42:38 LustreError: 25025:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 11:42:38 LustreError: 25036:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 11:42:38 LustreError: 25028:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 11:42:38 LustreError: 25027:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 11:42:38 LustreError: 25121:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 11:42:38 LustreError: 25119:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 11:42:38 LustreError: 25056:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 11:42:38 LustreError: 25013:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 11:42:38 LustreError: 25089:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 11:42:39 LustreError: 25081:0:(fsfilt-ldiskfs.c:501:fsfilt_ldiskfs_brw_start()) can't get handle for 582 credits: rc = -30
2012-06-19 11:42:39 LustreError: 25081:0:(fsfilt-ldiskfs.c:501:fsfilt_ldiskfs_brw_start()) Skipped 17 previous similar messages
2012-06-19 11:42:39 LustreError: 25081:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
...
log edited here: many repeated lines like the following were deleted ->
2012-06-19 11:42:39 LustreError: 25089:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
...
2012-06-19 11:42:40 LustreError: 25108:0:(fsfilt-ldiskfs.c:501:fsfilt_ldiskfs_brw_start()) can't get handle for 266 credits: rc = -30
2012-06-19 11:42:40 LustreError: 25108:0:(fsfilt-ldiskfs.c:501:fsfilt_ldiskfs_brw_start()) Skipped 17 previous similar messages
2012-06-19 11:42:40 LustreError: 25108:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 11:42:40 LustreError: 25005:0:(fsfilt-ldiskfs.c:367:fsfilt_ldiskfs_start()) error starting handle for op 8 (106 credits): rc -30
2012-06-19 11:42:41 LustreError: 25018:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
...
2012-06-19 11:42:41 LustreError: 25019:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 11:42:42 LDISKFS-fs warning (device md6): kmmpd: kmmpd being stopped since filesystem has been remounted as readonly.
2012-06-19 11:42:42 LustreError: 25072:0:(fsfilt-ldiskfs.c:501:fsfilt_ldiskfs_brw_start()) can't get handle for 86 credits: rc = -30
2012-06-19 11:42:42 LustreError: 25072:0:(fsfilt-ldiskfs.c:501:fsfilt_ldiskfs_brw_start()) Skipped 17 previous similar messages
2012-06-19 11:42:42 LustreError: 25072:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
...

and then the umount of the OST was started...
...
2012-06-19 12:24:53 LustreError: 25118:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 12:25:00 LustreError: 25110:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 12:25:08 LustreError: 25032:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 12:25:17 LustreError: 25007:0:(filter_io_26.c:706:filter_commitrw_write()) error starting transaction: rc = -30
2012-06-19 12:25:21 Lustre: Failing over data-OST001e
2012-06-19 12:25:21 LustreError: 24952:0:(ldlm_lib.c:1919:target_send_reply_msg()) @@@ processing error (-107)  req@ffff8107e17ac000 x1405009648444885/t0 o400-><?>@<?>:0/0 lens 192/0 e 0 to 0 dl 1340072737 ref 1 fl Interpret:H/0/0 rc -107/0
2012-06-19 12:25:21 LustreError: 24952:0:(ldlm_lib.c:1919:target_send_reply_msg()) Skipped 1 previous similar message
2012-06-19 12:25:21 LustreError: 137-5: UUID 'data-OST001e_UUID' is not available  for connect (stopping)
2012-06-19 12:25:21 LustreError: Skipped 5661 previous similar messages
2012-06-19 12:25:23 Lustre: data-OST001e: shutting down for failover; client state will be preserved.
2012-06-19 12:25:23 LustreError: 19766:0:(fsfilt-ldiskfs.c:1327:fsfilt_ldiskfs_write_record()) can't start transaction for 37 blocks (512 bytes)
2012-06-19 12:25:23 LustreError: 19766:0:(filter.c:757:filter_update_server_data()) error writing lr_server_data: rc = -30
2012-06-19 12:25:23 LustreError: 19766:0:(filter.c:1298:filter_post()) error writing server data: rc = -30
2012-06-19 12:25:23 LustreError: 19766:0:(fsfilt-ldiskfs.c:1327:fsfilt_ldiskfs_write_record()) can't start transaction for 37 blocks (8 bytes)
2012-06-19 12:25:23 LustreError: 19766:0:(filter.c:785:filter_update_last_objid()) error writing group 0 last objid: rc = -30
2012-06-19 12:25:23 LustreError: 19766:0:(filter.c:1304:filter_post()) error writing group 0 lastobjid: rc = -30
2012-06-19 12:25:23 LustreError: 19766:0:(filter.c:785:filter_update_last_objid()) error writing group 1 last objid: rc = -30
2012-06-19 12:25:23 LustreError: 19766:0:(filter.c:1304:filter_post()) error writing group 1 lastobjid: rc = -30
2012-06-19 12:25:23 VFS: cannot write quota structure on device md6 (error -30). Quota may get out of sync!
2012-06-19 12:25:24 LDISKFS-fs error (device md6): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 57173corrupted: 4634 blocks free in bitmap, 4602 - in gd
2012-06-19 12:25:24 
2012-06-19 12:25:24 LDISKFS-fs error (device md6): ldiskfs_discard_preallocations: Error loading buddy information for 57173
2012-06-19 12:25:30 LustreError: 24973:0:(ldlm_lib.c:1919:target_send_reply_msg()) @@@ processing error (-107)  req@ffff810bc7cf6000 x1404593604784122/t0 o400-><?>@<?>:0/0 lens 192/0 e 0 to 0 dl 1340072746 ref 1 fl Interpret:H/0/0 rc -107/0
2012-06-19 12:25:30 LustreError: 24973:0:(ldlm_lib.c:1919:target_send_reply_msg()) Skipped 1173 previous similar messages
2012-06-19 12:25:39 LDISKFS-fs error (device md6): ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 57173corrupted: 4634 blocks free in bitmap, 4602 - in gd
2012-06-19 12:25:39 
2012-06-19 12:25:39 LDISKFS-fs error (device md6): ldiskfs_discard_preallocations: Error loading buddy information for 57173
2012-06-19 12:25:40 LDISKFS-fs error (device md6): ldiskfs_put_super: Couldn't clean up the journal
2012-06-19 12:25:40 Lustre: server umount data-OST001e complete
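
The return codes in the console log above (rc = -30, rc -107) look like negated Linux errno values, which fits the "Remounting filesystem read-only" messages. A minimal sketch for decoding them follows; the table and the decode_rc helper are illustrative only and are not part of Lustre or ldiskfs. The values are hard-coded from the Linux errno list so the result does not depend on the machine running the script.

```python
# Linux errno numbers seen in the log, hard-coded on purpose:
# errno numbering differs between operating systems.
LINUX_ERRNO = {
    30: ("EROFS", "Read-only file system"),
    107: ("ENOTCONN", "Transport endpoint is not connected"),
}


def decode_rc(rc):
    """Return a readable description of a negative kernel return code."""
    name, text = LINUX_ERRNO.get(abs(rc), ("UNKNOWN", "not in table"))
    return f"{rc}: {name} ({text})"


for rc in (-30, -107):
    print(decode_rc(rc))
```

Read this way, every "rc = -30" line is a write being refused because md6 had already been remounted read-only, and the -107 replies appear only once the OST is being stopped.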


e2fsck -n

[root@hamster8 ~]# e2fsck -n -v  /dev/md6
e2fsck 1.41.90.wc3 (28-May-2011)
MMP interval is 10 seconds and total wait time is 42 seconds. Please wait...
Warning: skipping journal recovery because doing a read-only filesystem check.
data-OST001e contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences:  -(1110823654--1110824959) -(1110829798--1110831103) +(1195704960--1195705756) +(1195715584--1195715963) +(1195727104--1195727359) +(1195730432--1195730687)
Fix? no

Free blocks count wrong for group #36415 (18373, counted=7620).
Fix? no

Free blocks count wrong for group #36416 (16601, counted=12248).
Fix? no

Free blocks count wrong for group #36440 (23324, counted=21020).
Fix? no

Free blocks count wrong for group #36441 (17960, counted=15911).
Fix? no

Free blocks count wrong for group #36456 (16872, counted=16616).
Fix? no

Free blocks count wrong (615605791, counted=615607598).
Fix? no


data-OST001e: ********** WARNING: Filesystem still has errors **********


  267770 inodes used (0.05%)
   71148 non-contiguous files (26.6%)
      32 non-contiguous directories (0.0%)
         # of inodes with ind/dind/tind blocks: 32/0/0
         Extent depth histogram: 224765/42780/177
1337115105 blocks used (68.47%)
       0 bad blocks
     340 large files

  267722 regular files
      39 directories
       0 character device files
       0 block device files
       0 fifos
       0 links
       0 symbolic links (0 fast symbolic links)
       0 sockets
--------
  267761 files



e2fsck -v

[root@hamster8 ~]# e2fsck  -v  /dev/md6
e2fsck 1.41.90.wc3 (28-May-2011)
MMP interval is 10 seconds and total wait time is 42 seconds. Please wait...
data-OST001e: recovering journal
data-OST001e has gone 566 days without being checked, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

data-OST001e: ***** FILE SYSTEM WAS MODIFIED *****

  267770 inodes used (0.05%)
   71148 non-contiguous files (26.6%)
      32 non-contiguous directories (0.0%)
         # of inodes with ind/dind/tind blocks: 32/0/0
         Extent depth histogram: 224765/42780/177
1337115009 blocks used (68.47%)
       0 bad blocks
     340 large files

  267722 regular files
      39 directories
       0 character device files
       0 block device files
       0 fifos
       0 links
       0 symbolic links (0 fast symbolic links)
       0 sockets
--------
  267761 files



[root@hamster8 ~]# tune2fs -l /dev/md6
tune2fs 1.41.90.wc3 (28-May-2011)
device /dev/md6 mounted by lustre per /proc/fs/lustre/obdfilter/data-OST001e/mntdev
Filesystem volume name:   data-OST001e
Last mounted on:          /
Filesystem UUID:          aa530250-c675-4a20-a805-07972cf3626a
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent mmp sparse_super large_file uninit_bg
Filesystem flags:         signed_directory_hash
Default mount options:    (none)
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              488185856
Block count:              1952720896
Reserved block count:     97636044
Free blocks:              615605887
Free inodes:              487918086
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      558
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
Filesystem created:       Thu Mar 11 01:18:50 2010
Last mount time:          Tue Jun 19 13:39:49 2012
Last write time:          Tue Jun 19 13:39:49 2012
Mount count:              2
Maximum mount count:      38
Last checked:             Tue Jun 19 13:16:30 2012
Check interval:           15552000 (6 months)
Next check after:         Sun Dec 16 14:16:30 2012
Lifetime writes:          117 GB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
Default directory hash:   half_md4
Directory Hash Seed:      bbd64b14-78f0-49be-bf19-91d91311ec16
Journal backup:           inode blocks
MMP block number:         1545
MMP update interval:      5
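
As a cross-check of the numbers above, the sketch below maps the block from the first LDISKFS-fs error to its block group and recomputes the free block counts from the e2fsck summaries. It assumes the plain ext3/ext4 layout (group = (block - first block) / blocks per group), which seems reasonable here since the feature list shows neither flex_bg nor bigalloc; the helper name block_group is made up for illustration.

```python
BLOCKS_PER_GROUP = 32768    # "Blocks per group" from tune2fs -l
BLOCK_COUNT = 1952720896    # "Block count" from tune2fs -l
FIRST_BLOCK = 0             # "First block" from tune2fs -l


def block_group(block):
    """Return (group number, offset inside the group) for a block number."""
    return divmod(block - FIRST_BLOCK, BLOCKS_PER_GROUP)


# Block reported by ldiskfs_valid_block_bitmap in the console log.
group, offset = block_group(1873444866)
print(group, offset)

# Free blocks = block count - "blocks used" from each e2fsck run.
free_read_only = BLOCK_COUNT - 1337115105   # e2fsck -n -v
free_after_fix = BLOCK_COUNT - 1337115009   # e2fsck -v
print(free_read_only, free_after_fix)
```

Under these assumptions the block falls in group 57173, the same group named in the kernel messages, and the two free counts come out as 615605791 and 615605887, matching the "Free blocks count wrong" line of the read-only run and the "Free blocks" field of tune2fs respectively.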


uname -a
  Linux hamster8 2.6.18-274.3.1.el5-1.8.7-wc1.n #1 SMP Fri Mar 2 14:19:45 EST 2012 x86_64 x86_64 x86_64 GNU/Linux
