[LU-441] ll_fill_super()) Unable to process log: -108 Created: 21/Jun/11 Updated: 07/May/15 Resolved: 07/May/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 1.8.8, Lustre 1.8.7, Lustre 1.8.6 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Jian Yu | Assignee: | WC Triage |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Lustre Branch: v1_8_6_RC2
MGS/MDS Nodes: client-10-ib(active), client-12-ib(passive)
OSS Nodes: fat-amd-1-ib(active), fat-amd-2-ib(active)
Client Nodes: fat-amd-3-ib, client-6-ib |
||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Bugzilla ID: | 20997 | ||||||||
| Rank (Obsolete): | 5211 | ||||||||
| Description |
|
replay-single test 0c failed as follows:
== test 0c: expired recovery with no clients == 22:09:59
Filesystem 1K-blocks Used Available Use% Mounted on
client-10-ib@o2ib:client-12-ib@o2ib:/lustre
11811168 485956 10724828 5% /mnt/lustre
Failing mds on node client-12-ib
+ pm -h powerman --off client-12
Command completed successfully
affected facets: mds
+ pm -h powerman --on client-12
Command completed successfully
df pid is 14399
Failover mds to client-10-ib
22:10:44 (1308633044) waiting for client-10-ib network 900 secs ...
22:10:44 (1308633044) network interface is UP
Starting mds: -o user_xattr,acl /dev/disk/by-id/scsi-1IET_00010001 /mnt/mds
client-10-ib: lnet.debug=0x33f1504
client-10-ib: lnet.subsystem_debug=0xffb7e3ff
client-10-ib: lnet.debug_mb=48
Started lustre-MDT0000
Starting client: fat-amd-3-ib: -o user_xattr,acl,flock client-10-ib@o2ib:client-12-ib@o2ib:/lustre /mnt/lustre
mount.lustre: mount client-10-ib@o2ib:client-12-ib@o2ib:/lustre at /mnt/lustre failed: Cannot send after transport endpoint shutdown
replay-single test_0c: @@@@@@ FAIL: mount fails
Dmesg on the client node fat-amd-3:
Lustre: 9996:0:(client.c:1487:ptlrpc_expire_one_request()) @@@ Request x1372200800092968 sent from MGC192.168.4.10@o2ib to NID 192.168.4.10@o2ib 0s ago has failed due to network error (5s prior to deadline). req@ffff8800d409d800 x1372200800092968/t0 o250->MGS@MGC192.168.4.10@o2ib_0:26/25 lens 368/584 e 0 to 1 dl 1308633093 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 9996:0:(client.c:1487:ptlrpc_expire_one_request()) Skipped 6 previous similar messages
LustreError: 14511:0:(client.c:858:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff8800d409d400 x1372200800092971/t0 o501->MGS@MGC192.168.4.10@o2ib_1:26/25 lens 264/432 e 0 to 1 dl 0 ref 1 fl Rpc:/0/0 rc 0/0
LustreError: 15c-8: MGC192.168.4.10@o2ib: The configuration from log 'lustre-client' failed (-108). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
LustreError: 14511:0:(llite_lib.c:1095:ll_fill_super()) Unable to process log: -108
Lustre: client ffff8801192e9800 umount complete
LustreError: 14511:0:(obd_mount.c:2065:lustre_fill_super()) Unable to mount (-108)
Lustre: DEBUG MARKER: replay-single test_0c: @@@@@@ FAIL: mount fails
Maloo report: https://maloo.whamcloud.com/test_sets/ed5b5ff0-9bca-11e0-9a27-52540025f9af
This is a known issue on the Lustre b1_8 branch: bug 20997 |
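For triage across many Maloo runs, it can help to pull the source location and return code out of console lines like the ones above. The following is only a sketch: the line layout is inferred from the messages in this ticket, and the names `LINE_RE`, `RC_RE` and `parse_console_line` are hypothetical helpers, not part of Lustre or its test framework.

```python
import re

# Assumed layout, inferred from the dmesg lines in this ticket:
#   <level>: <pid>:<second number>:(<file>:<line>:<function>()) <message>
LINE_RE = re.compile(
    r"^(?P<level>Lustre|LustreError): "
    r"(?P<pid>\d+):(?P<ext>\d+):"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\) "
    r"(?P<msg>.*)$"
)

# A negative return code at the very end of the message, optionally
# wrapped in parentheses, e.g. "... log: -108" or "... mount (-108)".
RC_RE = re.compile(r"\(?(-\d+)\)?\s*$")


def parse_console_line(text):
    """Return a dict of fields, or None if the line has a different layout."""
    m = LINE_RE.match(text)
    if m is None:
        return None
    rec = m.groupdict()
    rec["pid"] = int(rec["pid"])
    rec["line"] = int(rec["line"])
    rc = RC_RE.search(rec["msg"])
    rec["rc"] = int(rc.group(1)) if rc else None
    return rec


if __name__ == "__main__":
    sample = ("LustreError: 14511:0:(llite_lib.c:1095:ll_fill_super()) "
              "Unable to process log: -108")
    print(parse_console_line(sample))
```

Lines without the `pid:n:(file:line:func())` prefix, such as the `LustreError: 15c-8:` message, are deliberately returned as `None` rather than guessed at.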
| Comments |
| Comment by Jian Yu [ 22/Jun/11 ] |
|
recovery-small test 57 failed with the same issue (the subsequent sub-tests failed due to test 57 failure):
replay-vbr test 0c failed with the same issue (the subsequent sub-tests failed due to test 0c failure): |
| Comment by Jian Yu [ 23/Jun/11 ] |
|
Lustre Branch: v1_8_6_RC3
MGS/MDS Nodes: client-10-ib(active), client-12-ib(passive)
\ /
1 combined MGS/MDT
OSS Nodes: fat-amd-1-ib(active), fat-amd-2-ib(active)
\ /
OST1 (active in fat-amd-1-ib)
OST2 (active in fat-amd-2-ib)
OST3 (active in fat-amd-1-ib)
OST4 (active in fat-amd-2-ib)
OST5 (active in fat-amd-1-ib)
OST6 (active in fat-amd-2-ib)
Client Nodes: fat-amd-3-ib,client-[6,7,16,21,24]-ib
After running recovery-double-scale test, mounting the local client on fat-amd-3-ib failed as follows:
++ sh -c 'mount -t lustre -o user_xattr,acl,flock client-10-ib@o2ib:client-12-ib@o2ib:/lustre /mnt/lustre'
mount.lustre: mount client-10-ib@o2ib:client-12-ib@o2ib:/lustre at /mnt/lustre failed: Cannot send after transport endpoint shutdown
+ return 108
Maloo report: https://maloo.whamcloud.com/test_sets/83e9914e-9d65-11e0-9a27-52540025f9af |
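The `-108` in the kernel messages, the `return 108` from the mount wrapper, and the text "Cannot send after transport endpoint shutdown" all refer to the same Linux errno, ESHUTDOWN. A minimal sketch of that mapping follows; the table is hand-written for the usual Linux numbering and covers only a few codes, and `describe_rc` is a hypothetical helper name.

```python
# Hand-written subset of Linux errno values (asm-generic numbering).
# Only 108 appears in this ticket; the other entries are for contrast.
LINUX_ERRNO = {
    5: ("EIO", "Input/output error"),
    108: ("ESHUTDOWN", "Cannot send after transport endpoint shutdown"),
    110: ("ETIMEDOUT", "Connection timed out"),
}


def describe_rc(rc):
    """Map a kernel-style negative errno or a positive exit status to (name, text)."""
    return LINUX_ERRNO.get(abs(rc), ("UNKNOWN", "not in this table"))


if __name__ == "__main__":
    for rc in (-108, 108):
        name, text = describe_rc(rc)
        print(rc, name, text)
```

Taking `abs(rc)` lets the same lookup serve both the kernel log convention (negative) and the shell exit status (positive).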
| Comment by Jian Yu [ 05/Aug/11 ] |
|
A clean upgrade from Lustre 1.8.6-wc1 to 2.0.66.0 also hit this issue: |
| Comment by Jian Yu [ 04/Sep/11 ] |
|
A clean upgrade from Lustre 1.8.5/1.8.6-wc1 to 2.1.0 also hit this issue: |
| Comment by Jian Yu [ 13/Oct/11 ] |
|
Lustre Tag: v1_8_7_WC1_RC1
recovery-double-scale test: https://maloo.whamcloud.com/test_sets/625e6856-f53f-11e0-908b-52540025f9af |
| Comment by Jian Yu [ 24/Feb/12 ] |
|
A clean upgrade from Lustre 1.8.7-wc1 to 2.1.1 also hit this issue: |
| Comment by Jian Yu [ 14/May/12 ] |
|
Lustre Tag: v1_8_8_WC1_RC1
recovery-double-scale test: https://maloo.whamcloud.com/test_sets/b1c8d1a8-9d8f-11e1-a1d8-52540035b04c |
| Comment by Isaac Huang (Inactive) [ 31/Aug/12 ] |
|
This is likely a dup of |
| Comment by Andreas Dilger [ 07/May/15 ] |
|
Haven't seen this in a long time. |