Details
- Type: Bug
- Resolution: Fixed
- Priority: Minor
- None
- None
- 3
- 4779
Description
== replay-dual test 0b: lost client during waiting for next transno ================================== 11:11:19 (1314285079)
Filesystem 1K-blocks Used Available Use% Mounted on
10.37.248.61@o2ib1:/lustre
22047088 922544 20002752 5% /lustre/barry
Failing mds1 on node barry-mds1
Stopping /tmp/mds1 (opts on barry-mds1
affected facets: mds1
Failover mds1 to barry-mds1
11:11:34 (1314285094) waiting for barry-mds1 network 900 secs ...
11:11:34 (1314285094) network interface is UP
Starting mds1: -o user_xattr,acl /dev/md5 /tmp/mds1
Started lustre-MDT0000
Starting client: spoon01: -o user_xattr,acl,flock 10.37.248.61@o2ib1:/lustre /lustre/barry
mount.lustre: mount 10.37.248.61@o2ib1:/lustre at /lustre/barry failed: File exists
replay-dual test_0b: @@@@@@ FAIL: mount1 fais
Client dmesg
Lustre: DEBUG MARKER: == replay-dual test 0b: lost client during waiting for next transno ================================== 11:11:19 (1314285079)
Lustre: DEBUG MARKER: local REPLAY BARRIER on lustre-MDT0000
LustreError: 31491:0:(ldlm_request.c:1172:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway
LustreError: 31491:0:(ldlm_request.c:1799:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108
Lustre: client ffff810173cb2400 umount complete
Lustre: setting import lustre-MDT0000_UUID INACTIVE by administrator request
Lustre: setting import lustre-OST0000_UUID INACTIVE by administrator request
LustreError: 31613:0:(ldlm_request.c:1172:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway
LustreError: 31613:0:(ldlm_request.c:1799:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108
Lustre: client ffff81017c9d6800 umount complete
LustreError: 31623:0:(genops.c:304:class_newdev()) Device MGC10.37.248.61@o2ib1 already exists, won't add
LustreError: 31623:0:(obd_config.c:327:class_attach()) Cannot create device MGC10.37.248.61@o2ib1 of type mgc : -17
LustreError: 31623:0:(obd_mount.c:512:lustre_start_simple()) MGC10.37.248.61@o2ib1 attach error -17
LustreError: 31623:0:(obd_mount.c:2160:lustre_fill_super()) Unable to mount (-17)
Lustre: DEBUG MARKER: replay-dual test_0b: @@@@@@ FAIL: mount1 fais
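Editor's note on the client-side failure chain above (not part of the original report): `class_attach()` returns -17 because the MGC device `MGC10.37.248.61@o2ib1` is still registered from the previous mount, and -17 is `-EEXIST`, which `mount.lustre` surfaces via `strerror()` as the "File exists" message seen in the console log. A minimal sketch of that errno mapping:

```python
import errno
import os

# class_attach() returned -17: the MGC device already existed,
# so lustre_fill_super() failed with -EEXIST.
rc = -17
assert -rc == errno.EEXIST          # EEXIST == 17 on Linux
print(os.strerror(-rc))             # prints "File exists"
```

This is why the mount error reads "File exists" even though no filesystem path actually exists twice; the collision is on the in-kernel obd device name.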
MDS dmesg
Lustre: DEBUG MARKER: == replay-dual test 0b: lost client during waiting for next transno ================================== 11:11:19 (1314285079)
LustreError: 10361:0:(osd_handler.c:938:osd_ro()) *** setting device osd-ldiskfs read-only ***
Turning device md5 (0x900005) read-only
Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000
Lustre: Failing over lustre-MDT0000
Lustre: 10460:0:(quota_master.c:793:close_quota_files()) quota[0] is off already
Lustre: 10460:0:(quota_master.c:793:close_quota_files()) Skipped 1 previous similar message
Lustre: Failing over mdd_obd-lustre-MDT0000
Lustre: mdd_obd-lustre-MDT0000: shutting down for failover; client state will be preserved.
Removing read-only on unknown block (0x900005)
Lustre: server umount lustre-MDT0000 complete
LDISKFS-fs (md5): recovery complete
LDISKFS-fs (md5): mounted filesystem with ordered data mode
JBD: barrier-based sync failed on md5-8 - disabling barriers
LDISKFS-fs (md5): mounted filesystem with ordered data mode
Lustre: Enabling ACL
Lustre: Enabling user_xattr
Lustre: lustre-MDT0000: used disk, loading
Lustre: 10592:0:(ldlm_lib.c:1903:target_recovery_init()) RECOVERY: service lustre-MDT0000, 66 recoverable clients, last_transno 4294967297
LustreError: 10599:0:(ldlm_lib.c:1740:target_recovery_thread()) lustre-MDT0000: started recovery thread pid 10599
LustreError: 10601:0:(mdt_handler.c:2785:mdt_recovery()) operation 400 on unconnected MDS from 12345-10.37.248.45@o2ib1
LustreError: 10601:0:(ldlm_lib.c:2128:target_send_reply_msg()) @@@ processing error (107) req@ffff81040ca1c400 x1378037410570255/t0(0) o-1><?>@<?>:0/0 lens 192/0 e 0 to 0 dl 1314285137 ref 1 fl Interpret:H/ffffffff/ffffffff rc -107/-1
LustreError: 10601:0:(ldlm_lib.c:2128:target_send_reply_msg()) Skipped 1 previous similar message
LustreError: 137-5: UUID 'lustre-MDT0000_UUID' is not available for connect (not set up)
Lustre: 10592:0:(mdt_lproc.c:257:lprocfs_wr_identity_upcall()) lustre-MDT0000: identity upcall set to /usr/sbin/l_getidentity
Lustre: 10592:0:(mds_lov.c:1004:mds_notify()) MDS mdd_obd-lustre-MDT0000: add target lustre-OST0000_UUID
Lustre: 10592:0:(mds_lov.c:1004:mds_notify()) Skipped 4 previous similar messages
JBD: barrier-based sync failed on md5-8 - disabling barriers
Lustre: 5799:0:(mds_lov.c:1024:mds_notify()) MDS mdd_obd-lustre-MDT0000: in recovery, not resetting orphans on lustre-OST0000_UUID
Lustre: 5799:0:(mds_lov.c:1024:mds_notify()) MDS mdd_obd-lustre-MDT0000: in recovery, not resetting orphans on lustre-OST0001_UUID
LustreError: 10601:0:(mdt_handler.c:2785:mdt_recovery()) operation 400 on unconnected MDS from 12345-10.37.248.44@o2ib1
Lustre: lustre-MDT0000: temporarily refusing client connection from 10.37.248.44@o2ib1
Lustre: Skipped 1 previous similar message
LNet: 10801:0:(debug.c:326:libcfs_debug_str2mask()) You are trying to use a numerical value for the mask - this will be deprecated in a future release.
Lustre: DEBUG MARKER: replay-dual test_0b: @@@@@@ FAIL: mount1 fais
LustreError: 10601:0:(mdt_handler.c:2785:mdt_recovery()) operation 400 on unconnected MDS from 12345-10.37.248.4@o2ib1
Lustre: 10601:0:(ldlm_lib.c:2029:target_queue_recovery_request()) Next recovery transno: 4294967298, current: 4294967306, replaying
Lustre: 10601:0:(ldlm_lib.c:2029:target_queue_recovery_request()) Next recovery transno: 4294967298, current: 4294967303, replaying
LustreError: 10606:0:(mdt_handler.c:2785:mdt_recovery()) operation 400 on unconnected MDS from 12345-10.37.248.16@o2ib1
LustreError: 10606:0:(mdt_handler.c:2785:mdt_recovery()) Skipped 58 previous similar messages
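Editor's note on the large transaction numbers in the recovery messages (a sketch, assuming Lustre's usual 64-bit transno layout in which the high 32 bits carry the recovery epoch and the low 32 bits a per-epoch sequence): `last_transno 4294967297` decodes to epoch 1, sequence 1.

```python
# Hypothetical decode of the transnos in the log, assuming the high
# 32 bits hold the recovery epoch and the low 32 bits the sequence.
EPOCH_BITS = 32

def split_transno(transno):
    """Split a 64-bit transno into (epoch, sequence)."""
    return transno >> EPOCH_BITS, transno & ((1 << EPOCH_BITS) - 1)

for t in (4294967297, 4294967298, 4294967306):
    print(t, split_transno(t))
# 4294967297 -> (1, 1); the "Next recovery transno" 4294967298 is (1, 2)
```

Under this reading, the MDS is waiting for replay of sequence 2 while clients hold requests up to sequence 10 in the same epoch, which matches the "Next recovery transno: 4294967298, current: 4294967306, replaying" lines.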
Info required for matching: replay-dual test_0b 0b
Changelog 2.2, version 2.2.0
Support for networks: o2iblnd (OFED 1.5.4)
Server support for kernels: 2.6.32-220.4.2.el6 (RHEL6)
Client support for unpatched kernels: 2.6.18-274.18.1.el5 (RHEL5), 2.6.32-220.4.2.el6 (RHEL6), 2.6.32.360....