[LU-964] df hangs when attempted from clients (NETAPP) Created: 04/Jan/12 Updated: 13/Jul/12 Resolved: 04/Jan/12 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 1.8.6 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | Brent VanDyke (Inactive) | Assignee: | WC Triage |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | None | ||
| Environment: |
Virtualized environment with 4 OSSs, 1 MDT, and 4 clients |
||
| Attachments: |
|
| Severity: | 2 |
| Rank (Obsolete): | 6497 |
| Description |
|
Running fdisk -l from the MDT shows sdb at the correct size for our Lustre configuration and shows no partition table. Running df from ANY of the clients results in a hang. From the MDT, we've captured the last few lines of the dmesg output:
Lustre: MDS lustre-MDT0000: lustre-OST0010_UUID now active, resetting orphans
Lustre: 3334:0:(quota_master.c:1718:mds_quota_recovery()) Only 0/32 OSTs are active, abort quota recovery
Lustre: lustre-OST0012-osc: Connection restored to service lustre-OST0012 using nid 192.168.1.31@tcp.
Lustre: MDS lustre-MDT0000: lustre-OST0012_UUID now active, resetting orphans
Lustre: 3334:0:(quota_master.c:1718:mds_quota_recovery()) Only 0/32 OSTs are active, abort quota recovery
Lustre: 3334:0:(quota_master.c:1718:mds_quota_recovery()) Skipped 4 previous similar messages
Lustre: lustre-OST0014-osc: Connection restored to service lustre-OST0014 using nid 192.168.1.31@tcp.
Lustre: Skipped 4 previous similar messages
Lustre: MDS lustre-MDT0000: lustre-OST0014_UUID now active, resetting orphans
Lustre: Skipped 4 previous similar messages |
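When reading a log like the one above, it can help to pull out just the OSTs that the MDS reports as "now active". The following is a minimal sketch, not part of the original report: the function name `active_osts_from_log` is hypothetical, and it assumes the message wording shown in this ticket (`<fsname>-OSTxxxx_UUID now active`).

```shell
# Hypothetical helper: list the OSTs that an MDS log reports as "now active".
# Reads dmesg-style text on stdin and prints one OST name per line,
# sorted and de-duplicated. Assumes the message format seen in this ticket.
active_osts_from_log() {
    # grep -o emits every match, even when several messages share one line.
    grep -o '[A-Za-z0-9_]*-OST[0-9a-fA-F]\{4\}_UUID now active' |
        sed 's/_UUID now active//' |
        sort -u
}
```

A possible use on the MDS would be `dmesg | active_osts_from_log`.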
| Comments |
| Comment by Brent VanDyke (Inactive) [ 04/Jan/12 ] |
|
Please provide client logs from one of the clients experiencing the issue, the MDS, and one of the OSTs. |
| Comment by Brent VanDyke (Inactive) [ 04/Jan/12 ] |
|
Attached are all three files |
| Comment by Brent VanDyke (Inactive) [ 04/Jan/12 ] |
|
Attached are all 3 files |
| Comment by Michael Amos (Inactive) [ 04/Jan/12 ] |
|
Make sure you run this on the MDS and all clients:
# for i in 0 1 2 3 4 5 6 7 8 9 a b c d e f; do lctl conf_param lustre-OST000${i}.osc.active=0; done |
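Before running a loop like the one in the comment above, it may be worth previewing the exact commands it would issue. The sketch below only prints the `lctl conf_param` lines and does not execute them; the function name `print_deactivate_cmds` and its arguments are hypothetical, and the filesystem name `lustre` and the index range 0 to 15 are taken from the comment.

```shell
# Hypothetical helper: print (do not run) one "lctl conf_param" command per OST.
# Arguments: filesystem name, first index, last index (both decimal).
# Each index is formatted as the 4-digit hex suffix used in OST names.
print_deactivate_cmds() {
    fsname=$1
    i=$2
    last=$3
    while [ "$i" -le "$last" ]; do
        printf 'lctl conf_param %s-OST%04x.osc.active=0\n' "$fsname" "$i"
        i=$((i + 1))
    done
}

# Preview the 16 commands covered by the loop in the comment
# (lustre-OST0000 through lustre-OST000f).
print_deactivate_cmds lustre 0 15
```

Printing first keeps the change reviewable; once the list looks right, the original loop from the comment can be run as posted.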
| Comment by Brent VanDyke (Inactive) [ 04/Jan/12 ] |
|
Thanks, case closed |
| Comment by Cliff White (Inactive) [ 04/Jan/12 ] |
|
The brilliant team at NetApp fixed the bug.