[LU-8100] Missing MDTs in /proc/fs/lustre/lmv/lustre-clilmv-.../target_obd Created: 04/May/16 Updated: 27/Feb/17 Resolved: 21/Sep/16 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | Lustre 2.9.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Giuseppe Di Natale (Inactive) | Assignee: | Lai Siyao |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | llnl, patch | ||
| Environment: |
lustre-2.8.0_14_gd0cbf68-1.x86_64 |
||
| Severity: | 3 | ||||
| Rank (Obsolete): | 9223372036854775807 | ||||
| Description |
|
Our test setup contains 16 MDTs, but our clients only ever see 10 of them. All output below is from a client with the file system mounted. All MDTs were formatted at the same time, with the same script and the same MGS NID. Attempts to use any MDT that is not listed fail (e.g. lfs mkdir --index=10 will fail).

[root@catalyst320:mdc]# lfs mkdir --index=12 /p/lustre/dinatale/testdir
error on LL_IOC_LMV_SETSTRIPE '/p/lustre/dinatale/testdir' (3): No such device
error: mkdir: create stripe dir '/p/lustre/dinatale/testdir' failed
[root@catalyst320:mdc]# lfs mkdir --index=8 /p/lustre/dinatale/testdir
[root@catalyst320:mdc]# lfs getdirstripe /p/lustre/dinatale/testdir/
/p/lustre/dinatale/testdir/
lmv_stripe_count: 0 lmv_stripe_offset: 8
[root@catalyst320:mdc]# cat /proc/fs/lustre/lmv/lustre-clilmv-ffff881003e14400/target_obd
0: lustre-MDT0000_UUID ACTIVE
1: lustre-MDT0001_UUID ACTIVE
2: lustre-MDT0002_UUID ACTIVE
3: lustre-MDT0003_UUID ACTIVE
4: lustre-MDT0004_UUID ACTIVE
5: lustre-MDT0005_UUID ACTIVE
6: lustre-MDT0006_UUID ACTIVE
7: lustre-MDT0007_UUID ACTIVE
8: lustre-MDT0008_UUID ACTIVE
9: lustre-MDT0009_UUID ACTIVE
ls /proc/fs/lustre/lmv/lustre-clilmv-ffff881003e14400/target_obds/
lustre-MDT0000-mdc-ffff881003e14400 lustre-MDT0006-mdc-ffff881003e14400 lustre-MDT0012-mdc-ffff881003e14400
lustre-MDT0001-mdc-ffff881003e14400 lustre-MDT0007-mdc-ffff881003e14400 lustre-MDT0013-mdc-ffff881003e14400
lustre-MDT0002-mdc-ffff881003e14400 lustre-MDT0008-mdc-ffff881003e14400 lustre-MDT0014-mdc-ffff881003e14400
lustre-MDT0003-mdc-ffff881003e14400 lustre-MDT0009-mdc-ffff881003e14400 lustre-MDT0015-mdc-ffff881003e14400
lustre-MDT0004-mdc-ffff881003e14400 lustre-MDT0010-mdc-ffff881003e14400
lustre-MDT0005-mdc-ffff881003e14400 lustre-MDT0011-mdc-ffff881003e14400
[root@catalyst320:mdc]# grep current_state */state
lustre-MDT0000-mdc-ffff881003e14400/state:current_state: FULL
lustre-MDT0001-mdc-ffff881003e14400/state:current_state: FULL
lustre-MDT0002-mdc-ffff881003e14400/state:current_state: FULL
lustre-MDT0003-mdc-ffff881003e14400/state:current_state: FULL
lustre-MDT0004-mdc-ffff881003e14400/state:current_state: FULL
lustre-MDT0005-mdc-ffff881003e14400/state:current_state: FULL
lustre-MDT0006-mdc-ffff881003e14400/state:current_state: FULL
lustre-MDT0007-mdc-ffff881003e14400/state:current_state: FULL
lustre-MDT0008-mdc-ffff881003e14400/state:current_state: FULL
lustre-MDT0009-mdc-ffff881003e14400/state:current_state: FULL
lustre-MDT0010-mdc-ffff881003e14400/state:current_state: FULL
lustre-MDT0011-mdc-ffff881003e14400/state:current_state: FULL
lustre-MDT0012-mdc-ffff881003e14400/state:current_state: FULL
lustre-MDT0013-mdc-ffff881003e14400/state:current_state: FULL
lustre-MDT0014-mdc-ffff881003e14400/state:current_state: FULL
lustre-MDT0015-mdc-ffff881003e14400/state:current_state: FULL |
| Comments |
| Comment by Peter Jones [ 05/May/16 ] |
|
Lai, could you please advise on this issue? Thanks, Peter |
| Comment by Olaf Faaland [ 10/May/16 ] |
|
Note that when we formatted these MDTs, we accidentally named them 0000-0009 and 0010-0015 instead of 0000-000F, so there's a gap in the numbering we used for the names. That shouldn't matter, as far as I'm aware, but I mention it because of the correlation with the symptom we saw. |
| Comment by Lai Siyao [ 10/May/16 ] |
|
Okay, I'll try to format a system like yours and reproduce the issue. In the meantime, could you remount a client, run `lfs mkdir --index=12 /p/lustre/dinatale/testdir`, and then collect debug logs of the whole process? |
| Comment by Giuseppe Di Natale (Inactive) [ 10/May/16 ] |
|
Unfortunately, the file system we were testing with was only a temporary setup on one of our clusters. It no longer exists, so we can't collect any client logs for that specific file system. I may be able to reproduce the issue today and get you logs if I am successful. |
| Comment by Giuseppe Di Natale (Inactive) [ 10/May/16 ] |
|
Was able to reproduce the issue. Collected a log from a client involving a mount and the lfs command you requested. Let me know if you need anything else. The log file is called "debug_client_missing_mdts.log". |
| Comment by Giuseppe Di Natale (Inactive) [ 10/May/16 ] |
|
I was able to confirm Olaf's speculation on the naming convention. It appears the naming may be the source of the problem. I went ahead and redeployed a test file system where the MDT names ranged from 0000-000F and the client was able to connect to all MDTs. For completeness, lots of info below.

[root@catalyst100:~]# lfs mkdir --index=12 /p/lustre/dinatale/testdir
[root@catalyst100:~]# lfs getdirstripe /p/lustre/dinatale/testdir
/p/lustre/dinatale/testdir
lmv_stripe_count: 0 lmv_stripe_offset: 12
[root@catalyst100:mdc]# grep current_state */state
lustre-MDT0000-mdc-ffff880fbb04f400/state:current_state: FULL
lustre-MDT0001-mdc-ffff880fbb04f400/state:current_state: FULL
lustre-MDT0002-mdc-ffff880fbb04f400/state:current_state: FULL
lustre-MDT0003-mdc-ffff880fbb04f400/state:current_state: FULL
lustre-MDT0004-mdc-ffff880fbb04f400/state:current_state: FULL
lustre-MDT0005-mdc-ffff880fbb04f400/state:current_state: FULL
lustre-MDT0006-mdc-ffff880fbb04f400/state:current_state: FULL
lustre-MDT0007-mdc-ffff880fbb04f400/state:current_state: FULL
lustre-MDT0008-mdc-ffff880fbb04f400/state:current_state: FULL
lustre-MDT0009-mdc-ffff880fbb04f400/state:current_state: FULL
lustre-MDT000a-mdc-ffff880fbb04f400/state:current_state: FULL
lustre-MDT000b-mdc-ffff880fbb04f400/state:current_state: FULL
lustre-MDT000c-mdc-ffff880fbb04f400/state:current_state: FULL
lustre-MDT000d-mdc-ffff880fbb04f400/state:current_state: FULL
lustre-MDT000e-mdc-ffff880fbb04f400/state:current_state: FULL
lustre-MDT000f-mdc-ffff880fbb04f400/state:current_state: FULL
[root@catalyst100:~]# ls /proc/fs/lustre/lmv/lustre-clilmv-ffff880fbb04f400/target_obds/
lustre-MDT0000-mdc-ffff880fbb04f400 lustre-MDT0004-mdc-ffff880fbb04f400 lustre-MDT0008-mdc-ffff880fbb04f400 lustre-MDT000c-mdc-ffff880fbb04f400
lustre-MDT0001-mdc-ffff880fbb04f400 lustre-MDT0005-mdc-ffff880fbb04f400 lustre-MDT0009-mdc-ffff880fbb04f400 lustre-MDT000d-mdc-ffff880fbb04f400
lustre-MDT0002-mdc-ffff880fbb04f400 lustre-MDT0006-mdc-ffff880fbb04f400 lustre-MDT000a-mdc-ffff880fbb04f400 lustre-MDT000e-mdc-ffff880fbb04f400
lustre-MDT0003-mdc-ffff880fbb04f400 lustre-MDT0007-mdc-ffff880fbb04f400 lustre-MDT000b-mdc-ffff880fbb04f400 lustre-MDT000f-mdc-ffff880fbb04f400
[root@catalyst100:~]# lfs mdts
MDTS:
0: lustre-MDT0000_UUID ACTIVE
1: lustre-MDT0001_UUID ACTIVE
2: lustre-MDT0002_UUID ACTIVE
3: lustre-MDT0003_UUID ACTIVE
4: lustre-MDT0004_UUID ACTIVE
5: lustre-MDT0005_UUID ACTIVE
6: lustre-MDT0006_UUID ACTIVE
7: lustre-MDT0007_UUID ACTIVE
8: lustre-MDT0008_UUID ACTIVE
9: lustre-MDT0009_UUID ACTIVE
10: lustre-MDT000a_UUID ACTIVE
11: lustre-MDT000b_UUID ACTIVE
12: lustre-MDT000c_UUID ACTIVE
13: lustre-MDT000d_UUID ACTIVE
14: lustre-MDT000e_UUID ACTIVE
15: lustre-MDT000f_UUID ACTIVE |
| Comment by Lai Siyao [ 11/May/16 ] |
|
This looks to be working as designed, because during format "--index" specifies the target index in the system, and the target name shows that index in hexadecimal. So in your original setup, `lfs mkdir --index 10 ...` will fail, but `lfs mkdir --index 16 ...` should succeed: no MDT with index 10 (which would be named MDT000a) exists, but index 16 (MDT0010) does. |
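The relationship between target names and indices described above can be illustrated with a small sketch. It assumes only the name format visible in the listings in this ticket (`<fsname>-MDT<4 hex digits>` followed by a suffix); `mdt_index` is a hypothetical helper written for illustration, not part of Lustre.

```python
import re

# Assumed name format, taken from the listings in this ticket:
# "<fsname>-MDT<4 hex digits>", followed by a suffix such as "_UUID" or "-mdc-...".
_MDT_NAME = re.compile(r"^(?P<fs>[^-]+)-MDT(?P<idx>[0-9a-fA-F]{4})")


def mdt_index(target_name: str) -> int:
    """Return the numeric MDT index encoded (in hex) in a target name.

    Hypothetical helper for illustration only.
    """
    match = _MDT_NAME.match(target_name)
    if match is None:
        raise ValueError(f"not an MDT target name: {target_name!r}")
    return int(match.group("idx"), 16)


if __name__ == "__main__":
    for name in ("lustre-MDT0009_UUID", "lustre-MDT000a_UUID", "lustre-MDT0010_UUID"):
        print(name, "->", mdt_index(name))
```

Under this reading, the names 0010-0015 in the original setup correspond to indices 16-21, which is why indices 10-15 were not usable there.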
| Comment by Olaf Faaland [ 11/May/16 ] |
|
Lai, I see what you mean. That explains why our mkdir failed. However, the proc files do not seem to be consistent, which looks like a separate problem. On the client, MDTs with indexes 0x10-0x15 are missing from the listing in /proc/fs/lustre/lmv/lustre-clilmv-ffff881003e14400/target_obd, even though they are all present in /proc/fs/lustre/lmv/lustre-clilmv-ffff881003e14400/target_obds/. You can see this in the description, above. Why would that be? Thanks. |
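One way to spot this inconsistency on a client is to compare the two views directly. The sketch below is an illustration only: it assumes the text format of `target_obd` and the entry names under `target_obds/` are exactly as shown in the description, and all function names are hypothetical.

```python
import re


def listed_targets(target_obd_text):
    """Extract names like 'lustre-MDT0008' from target_obd-style text."""
    return set(re.findall(r"(\S+-MDT[0-9a-fA-F]{4})_UUID", target_obd_text))


def connected_targets(dir_entries):
    """Extract names like 'lustre-MDT0008' from target_obds/ entry names."""
    names = set()
    for entry in dir_entries:
        match = re.match(r"(\S+?-MDT[0-9a-fA-F]{4})-mdc-", entry)
        if match:
            names.add(match.group(1))
    return names


def missing_from_listing(target_obd_text, dir_entries):
    """Targets that have an MDC entry but do not appear in the listing."""
    return sorted(connected_targets(dir_entries) - listed_targets(target_obd_text))


if __name__ == "__main__":
    # Data shaped like the description: 0000-0009 listed, 0000-0009 and 0010-0015 connected.
    listing = "\n".join(f"{i}: lustre-MDT{i:04d}_UUID ACTIVE" for i in range(10))
    suffixes = [f"{i:04d}" for i in range(10)] + [f"{i:04d}" for i in range(10, 16)]
    entries = [f"lustre-MDT{s}-mdc-ffff881003e14400" for s in suffixes]
    print(missing_from_listing(listing, entries))
```

On a live client the inputs would come from reading the `target_obd` file and listing the `target_obds/` directory; the sketch keeps them as plain arguments so it can be run anywhere.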
| Comment by Giuseppe Di Natale (Inactive) [ 11/May/16 ] |
|
Lai, I just launched a new test setup to do more testing. lfs mkdir was successful like you suggested it would be. Would this mean that there is a bug somewhere in the proc handler for /proc/fs/lustre/lmv/lustre-clilmv-.../target_obd since it doesn't contain all active MDTs in this case? |
| Comment by Lai Siyao [ 12/May/16 ] |
|
Yes. The current LMV code stores targets in an array and assumes that a target's index cannot exceed the total target count, so only the targets whose index is below the total count are listed. I'll make a fix later. |
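The behaviour described here can be reproduced with a toy model. This is a simulation under stated assumptions, not the actual LMV code or the eventual patch: the table size of 32 and all function names are made up for illustration.

```python
def build_table(indices, size=32):
    """Place each target in a fixed-size table slot equal to its index.

    The table size of 32 is an arbitrary value chosen for this sketch.
    """
    table = [None] * size
    for idx in indices:
        table[idx] = f"lustre-MDT{idx:04x}_UUID"
    return table


def list_bounded_by_count(table):
    """Listing that only visits slots below the number of targets.

    This models the reported symptom: targets at sparse, higher indices
    are never reached.
    """
    count = sum(1 for target in table if target is not None)
    return [(i, table[i]) for i in range(count) if table[i] is not None]


def list_all_slots(table):
    """Listing that visits every slot, so sparse indices are not skipped."""
    return [(i, target) for i, target in enumerate(table) if target is not None]


if __name__ == "__main__":
    # Indices from the original setup: 0-9 and 0x10-0x15 (16-21).
    table = build_table(list(range(10)) + list(range(16, 22)))
    print(len(list_bounded_by_count(table)), "targets listed when bounded by count")
    print(len(list_all_slots(table)), "targets listed when scanning every slot")
```

With 16 targets at indices 0-9 and 16-21, the count-bounded listing stops at slot 15 and reports only the first 10, which matches what the client showed in `target_obd`.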
| Comment by Giuseppe Di Natale (Inactive) [ 12/May/16 ] |
|
I found the portion of the code you are talking about. I might already have a patch. |
| Comment by Giuseppe Di Natale (Inactive) [ 17/May/16 ] |
|
Just to make sure no one else is working on this, I'll be submitting a patch soon. |
| Comment by Lai Siyao [ 18/May/16 ] |
|
No, nobody else is working on it. Thanks for your work! |
| Comment by Gerrit Updater [ 19/May/16 ] |
|
Giuseppe Di Natale (dinatale2@llnl.gov) uploaded a new patch: http://review.whamcloud.com/20336 |
| Comment by Giuseppe Di Natale (Inactive) [ 19/May/16 ] |
|
Went ahead and submitted a patch. It works on a VM setup I have, but I'm not 100% sure it's correct. I would also like to introduce a test to check this file. The test would require MDTs to have non-consecutive indices. Is it possible within the test framework to manually assign indices to MDTs, or can someone point me to an example test? |
| Comment by Andreas Dilger [ 26/May/16 ] |
|
It should be possible to generate non-consecutive MDT indices. See conf-sanity test_56 for an example of this with large OST indices. Maybe renaming that to test_56a and adding test_56b for large MDT indices is the right way to go? |
| Comment by Gerrit Updater [ 01/Jun/16 ] |
|
Giuseppe Di Natale (dinatale2@llnl.gov) uploaded a new patch: http://review.whamcloud.com/20546 |
| Comment by Gerrit Updater [ 02/Jun/16 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/20336/ |
| Comment by Gerrit Updater [ 21/Sep/16 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/20546/ |
| Comment by Peter Jones [ 21/Sep/16 ] |
|
Landed for 2.9 |