[LU-964] df hangs when attempted from clients (NETAPP) Created: 04/Jan/12  Updated: 13/Jul/12  Resolved: 04/Jan/12

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 1.8.6
Fix Version/s: None

Type: Bug Priority: Critical
Reporter: Brent VanDyke (Inactive) Assignee: WC Triage
Resolution: Fixed Votes: 1
Labels: None
Environment:

Virtualized environment involving 4 OSS's, 1 MDT, and 4 clients


Attachments: Text File messages_MDT.log     Text File messages_MDT.log     Text File messages_OSS.log     Text File messages_OSS.log     Text File messages_client.log     Text File messages_client.log    
Severity: 2
Rank (Obsolete): 6497

 Description   

Running fdisk -l on the MDT shows sdb at the correct size for our Lustre config, with no partition table. Running df from ANY of the clients results in a hang.

From the MDT, we've captured the last few lines of the dmesg output:
Lustre: MDS lustre-MDT0000: lustre-OST0010_UUID now active, resetting orphans
Lustre: 3334:0:(quota_master.c:1718:mds_quota_recovery()) Only 0/32 OSTs are active, abort quota recovery
Lustre: lustre-OST0012-osc: Connection restored to service lustre-OST0012 using nid 192.168.1.31@tcp.
Lustre: MDS lustre-MDT0000: lustre-OST0012_UUID now active, resetting orphans
Lustre: 3334:0:(quota_master.c:1718:mds_quota_recovery()) Only 0/32 OSTs are active, abort quota recovery
Lustre: 3334:0:(quota_master.c:1718:mds_quota_recovery()) Skipped 4 previous similar messages
Lustre: lustre-OST0014-osc: Connection restored to service lustre-OST0014 using nid 192.168.1.31@tcp.
Lustre: Skipped 4 previous similar messages
Lustre: MDS lustre-MDT0000: lustre-OST0014_UUID now active, resetting orphans
Lustre: Skipped 4 previous similar messages
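For triage, "now active" lines like the ones above can be filtered to list which OSTs the MDS has reconnected to. This is a minimal sketch, assuming the dmesg output is piped in on stdin; the function name list_active_osts is hypothetical and is not a Lustre tool.

```shell
# Hypothetical helper: read kernel log text on stdin and print the name of
# every OST that the MDS reported as "now active", sorted and de-duplicated.
list_active_osts() {
    sed -n 's/.*: \([^ ]*-OST[0-9a-fA-F]*\)_UUID now active.*/\1/p' | sort -u
}
```

It could be used as `dmesg | list_active_osts` on the MDS. Lines that do not match the pattern, such as the "Skipped 4 previous similar messages" notices, are ignored, so the list may be incomplete when the kernel has suppressed repeated messages.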


 Comments   
Comment by Brent VanDyke (Inactive) [ 04/Jan/12 ]

Please provide client logs from one of the clients experiencing the issue, the MDS, and one of the OSTs.

Comment by Brent VanDyke (Inactive) [ 04/Jan/12 ]

Attached are all three files

Comment by Brent VanDyke (Inactive) [ 04/Jan/12 ]

Attached are all 3 files

Comment by Michael Amos (Inactive) [ 04/Jan/12 ]

Make sure you run this on the MDS and all clients:

# for i in 0 1 2 3 4 5 6 7 8 9 a b c d e f; do lctl conf_param lustre-OST000${i}.osc.active=0; done
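Before running a loop like this against a live filesystem, it can help to print the commands it would issue and review them first. The following is a dry-run sketch only: the function name gen_deactivate_cmds is hypothetical, and the filesystem name is passed in as a parameter (it is "lustre" in this ticket). It generates the same sixteen targets, OST0000 through OST000f, as the loop above and executes nothing.

```shell
# Hypothetical dry-run helper: print, but do not execute, the lctl commands
# that would mark OST0000..OST000f inactive for the given filesystem name.
gen_deactivate_cmds() {
    fsname="$1"
    for i in 0 1 2 3 4 5 6 7 8 9 a b c d e f; do
        printf 'lctl conf_param %s-OST000%s.osc.active=0\n' "$fsname" "$i"
    done
}
```

Running `gen_deactivate_cmds lustre` prints one command per line for review. It is also worth checking the Lustre manual for the node on which conf_param settings are expected to be issued before executing them.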

Comment by Brent VanDyke (Inactive) [ 04/Jan/12 ]

Thanks, case closed

Comment by Cliff White (Inactive) [ 04/Jan/12 ]

The brilliant team at NetApp fixed the bug.

Generated at Sat Feb 10 01:12:10 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.