[LU-4241] Test failure on test recovery-small test_101: import is not in FULL state Created: 11/Nov/13  Updated: 12/Aug/22  Resolved: 12/Aug/22

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.6.0, Lustre 2.5.1, Lustre 2.7.0, Lustre 2.8.0, Lustre 2.10.0, Lustre 2.10.5
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Maloo Assignee: WC Triage
Resolution: Cannot Reproduce Votes: 0
Labels: None

Issue Links:
Related
is related to LU-6725 replay-single test_0a: FAIL: import i... Resolved
Severity: 3
Rank (Obsolete): 11545

Description

This issue was created by maloo for Nathaniel Clark <nathaniel.l.clark@intel.com>

This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/af3aacb8-48e7-11e3-b8b9-52540035b04c.

The sub-test test_101 failed with the following error:

import is not in FULL state

Info required for matching: recovery-small 101



Comments
Comment by Jian Yu [ 09/Jan/14 ]

While verifying patch http://review.whamcloud.com/#/c/8729/ on the Lustre b2_5 branch, the same failure occurred in a review-zfs test session:
https://maloo.whamcloud.com/test_sets/91181414-78b7-11e3-809c-52540035b04c

Comment by James Nunez (Inactive) [ 10/Jul/15 ]

Hit this issue again:
2015-07-10 02:34:57 - https://testing.hpdd.intel.com/test_sets/c778c684-26c1-11e5-8cf5-5254006e85c2
2015-07-30 16:30:15 - https://testing.hpdd.intel.com/test_sets/b7c5dba2-36f0-11e5-84a8-5254006e85c2

Comment by Di Wang [ 05/Aug/15 ]

Hit again:
https://testing.hpdd.intel.com/test_sets/c6cd44f2-3b53-11e5-95fa-5254006e85c2

Comment by Di Wang [ 06/Aug/15 ]

https://testing.hpdd.intel.com/test_sets/33fdd074-3b3a-11e5-95fa-5254006e85c2
On the OST, I saw this:

05:01:24:Lustre: lustre-OST0000: Will be in recovery for at least 1:00, or until 3 clients reconnect
05:01:24:Lustre: lustre-OST0000: Client d57edd96-7283-285b-d594-fd90ad5b6b79 (at 10.1.4.207@tcp) reconnecting, waiting for 3 clients in recovery for 0:46
05:01:24:Lustre: lustre-OST0000: Client lustre-MDT0000-mdtlov_UUID (at 10.1.4.201@tcp) reconnecting, waiting for 3 clients in recovery for 0:44
05:01:24:Lustre: lustre-OST0000: Client d57edd96-7283-285b-d594-fd90ad5b6b79 (at 10.1.4.207@tcp) reconnecting, waiting for 3 clients in recovery for 0:03
05:01:24:Lustre: lustre-OST0000: Client lustre-MDT0000-mdtlov_UUID (at 10.1.4.201@tcp) reconnecting, waiting for 3 clients in recovery for 0:01
05:01:24:Lustre: lustre-OST0000: recovery is timed out, evict stale exports
05:01:24:Lustre: lustre-OST0000: disconnecting 1 stale clients
05:01:24:Lustre: 16475:0:(ldlm_lib.c:1883:target_recovery_overseer()) recovery is aborted by hard timeout
05:01:24:Lustre: 16475:0:(ldlm_lib.c:1893:target_recovery_overseer()) recovery is aborted, evict exports in recovery
05:01:24:Lustre: 16475:0:(ldlm_lib.c:1893:target_recovery_overseer()) Skipped 2 previous similar messages
05:01:24:Lustre: lustre-OST0000: Recovery over after 1:33, of 3 clients 2 recovered and 1 was evicted.
05:01:24:Lustre: lustre-OST0000: deleting orphan objects from 0x0:11395 to 0x0:11457
05:01:24:Lustre: DEBUG MARKER: /usr/sbin/lctl mark osc.lustre-OST0000-osc-*.ost_server_uuid in FULL state after 99 sec
05:01:24:Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-*.ost_server_uuid in FULL state after 99 sec
05:01:24:Lustre: DEBUG MARKER: /usr/sbin/lctl mark  rpc : @@@@@@ FAIL: can\'t put import for osc.lustre-OST0000-osc-*.ost_server_uuid into FULL state after 1475 sec, have DISCONN 
05:01:24:Lustre: DEBUG MARKER: rpc : @@@@@@ FAIL: can't put import for osc.lustre-OST0000-osc-*.ost_server_uuid into FULL state after 1475 sec, have DISCONN
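
When triaging further occurrences, it may help to pull the recovery summary and the final FAIL marker out of the OST console log automatically. The following is only a rough sketch: the function name `summarize` and both regular expressions are hypothetical, written against the exact message wording quoted above, and other Lustre versions may word these messages differently. It also assumes the recovery duration is printed as minutes:seconds.

```python
import re

# Hypothetical patterns modelled on the console messages quoted in this ticket.
RECOVERY_RE = re.compile(
    r"(?P<target>\S+): Recovery over after (?P<min>\d+):(?P<sec>\d+), "
    r"of (?P<total>\d+) clients (?P<recovered>\d+) recovered "
    r"and (?P<evicted>\d+) (?:was|were) evicted"
)
# The "lctl mark" echo escapes the apostrophe (can\'t), so allow an optional backslash.
FAIL_RE = re.compile(
    r"FAIL: can\\?'t put import for (?P<param>\S+) into (?P<want>\w+) state "
    r"after (?P<secs>\d+) sec, have (?P<have>\w+)"
)


def summarize(lines):
    """Return (recovery, failure) dicts; each is None if no matching line is seen."""
    recovery = failure = None
    for line in lines:
        m = RECOVERY_RE.search(line)
        if m:
            recovery = {
                "target": m["target"],
                # Assumption: the duration is formatted as minutes:seconds.
                "duration_sec": int(m["min"]) * 60 + int(m["sec"]),
                "total": int(m["total"]),
                "recovered": int(m["recovered"]),
                "evicted": int(m["evicted"]),
            }
        m = FAIL_RE.search(line)
        if m:
            failure = {
                "param": m["param"],
                "want": m["want"],
                "secs": int(m["secs"]),
                "have": m["have"],
            }
    return recovery, failure


# Two lines copied from the log above.
sample = [
    "05:01:24:Lustre: lustre-OST0000: Recovery over after 1:33, "
    "of 3 clients 2 recovered and 1 was evicted.",
    "05:01:24:Lustre: DEBUG MARKER: rpc : @@@@@@ FAIL: can't put import for "
    "osc.lustre-OST0000-osc-*.ost_server_uuid into FULL state after 1475 sec, have DISCONN",
]
recovery, failure = summarize(sample)
print(recovery)
print(failure)
```

Run against the two sample lines, this reports one evicted client out of three on lustre-OST0000 and an import left in DISCONN rather than FULL, which is the combination seen in this failure.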