Description
On a 2.8 client, the user sees:
ls: cannot access /p/ldne/faaland1: Input/output error
After unmounting and remounting on the 2.8 clients, the problem went away.
Client console log shows:
2016-03-23 16:44:33 LustreError: 32645:0:(llite_lib.c:2309:ll_prep_inode()) new_inode -fatal: rc -5
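For reference, the `rc -5` in the client log and the "Input/output error" reported by `ls` look like the same failure seen at two layers. Below is a minimal sketch, assuming the kernel-style convention that a negative return code is a negated errno value. The helper name `describe_rc` is hypothetical and is not part of Lustre.

```python
import errno
import os


def describe_rc(rc: int) -> str:
    """Map a negative kernel-style return code to its errno name and message.

    Assumption: rc is a negated errno value, as is conventional in kernel code.
    """
    code = abs(rc)
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


# rc -5 from ll_prep_inode(); on Linux with glibc the message text is
# "Input/output error", matching what ls printed.
print(describe_rc(-5))
```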
MDS console log shows:
2016-03-23 16:44:33 LustreError: 44914:0:(mdt_handler.c:1376:mdt_getattr_name_lock()) ldne-MDT0002: parent [0x2c0000403:0x2:0x0] is on r
2016-03-23 16:44:33 LustreError: 44914:0:(mdt_handler.c:1376:mdt_getattr_name_lock()) Skipped 20 previous similar messages
2016-03-23 16:45:22 LustreError: 44914:0:(mdt_handler.c:1376:mdt_getattr_name_lock()) ldne-MDT0002: parent [0x2c0000403:0x2:0x0] is on r
2016-03-23 16:45:22 LustreError: 44914:0:(mdt_handler.c:1376:mdt_getattr_name_lock()) Skipped 20 previous similar messages
2016-03-23 16:45:28 LustreError: 44195:0:(mdt_handler.c:1376:mdt_getattr_name_lock()) ldne-MDT0002: parent [0x2c0000403:0x2:0x0] is on r
2016-03-23 16:45:28 LustreError: 44195:0:(mdt_handler.c:1376:mdt_getattr_name_lock()) Skipped 19 previous similar messages
2016-03-23 16:46:03 LustreError: 44199:0:(mdt_handler.c:1376:mdt_getattr_name_lock()) ldne-MDT0002: parent [0x2c0000403:0x2:0x0] is on r
2016-03-23 16:51:42 LustreError: 44199:0:(mdt_handler.c:1376:mdt_getattr_name_lock()) ldne-MDT0002: parent [0x2c0000403:0x2:0x0] is on r
2016-03-23 16:51:42 LustreError: 44199:0:(mdt_handler.c:1376:mdt_getattr_name_lock()) Skipped 1 previous similar message
2016-03-23 16:54:19 LustreError: 44199:0:(mdt_handler.c:1376:mdt_getattr_name_lock()) ldne-MDT0002: parent [0x2c0000403:0x2:0x0] is on r
2016-03-23 16:54:19 LustreError: 44199:0:(mdt_handler.c:1376:mdt_getattr_name_lock()) Skipped 3 previous similar messages
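Because the MDS console rate-limits repeated messages, the printed lines understate how often the error occurred. The sketch below estimates the real count from a console log excerpt, under the assumption that each "Skipped N previous similar messages" line stands for N suppressed occurrences of the same error. The regular expressions are inferred only from the lines quoted above, and `count_events` is a hypothetical helper, not a Lustre tool.

```python
import re

# Layout inferred from the excerpt above:
#   <timestamp> LustreError: <pid>:<n>:(<file>:<line>:<func>()) <message>
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) LustreError: "
    r"(?P<pid>\d+):\d+:\((?P<file>[^:]+):(?P<line>\d+):(?P<func>\w+)\(\)\) "
    r"(?P<msg>.*)$"
)
SKIPPED_RE = re.compile(r"^Skipped (\d+) previous similar messages?$")


def count_events(lines):
    """Count printed errors plus the occurrences reported as skipped."""
    total = 0
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if match is None:
            continue  # not a LustreError line in the expected layout
        skipped = SKIPPED_RE.match(match.group("msg"))
        # A "Skipped N" line contributes N; any other error line contributes 1.
        total += int(skipped.group(1)) if skipped else 1
    return total


# Four lines copied from the MDS console log in this report.
SAMPLE = [
    "2016-03-23 16:45:22 LustreError: 44914:0:(mdt_handler.c:1376:mdt_getattr_name_lock()) ldne-MDT0002: parent [0x2c0000403:0x2:0x0] is on r",
    "2016-03-23 16:45:22 LustreError: 44914:0:(mdt_handler.c:1376:mdt_getattr_name_lock()) Skipped 20 previous similar messages",
    "2016-03-23 16:45:28 LustreError: 44195:0:(mdt_handler.c:1376:mdt_getattr_name_lock()) ldne-MDT0002: parent [0x2c0000403:0x2:0x0] is on r",
    "2016-03-23 16:45:28 LustreError: 44195:0:(mdt_handler.c:1376:mdt_getattr_name_lock()) Skipped 19 previous similar messages",
]

print(count_events(SAMPLE))  # 1 + 20 + 1 + 19 = 41
```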
The following sequence of steps produced the corrupt directory entry:
1. formatted filesystem on servers running 2.8 RC4
2. mounted on nodes running 2.8 RC4 clients
3. mounted on node running 2.55 client
4. on 2.55 client, created directories under root including "faaland1"
5. on 2.8 client zwicky80, renamed faaland1 to faaland1.old
6. on 2.8 client zwicky80, lfs mkdir faaland1 --count=4; lfs setdirstripe -D --count=4 faaland1
7. On all 2.8 clients other than zwicky80 (e.g. zwicky82), attempts to access faaland1 failed. I thought I saw ENOTSUPP in the error output somewhere, but cannot find it now.
8. After unmounting on all the clients and then remounting on the 2.8 clients, access to the directory appeared to work normally again.
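Steps 5 and 6 can be sketched as a small script to make the reproduction easier to repeat. This is only a sketch under assumptions: the mount point `/p/ldne` is taken from the `ls` error above, `mv` is assumed to be how the rename was done, and the function names are hypothetical. It defaults to a dry run that only prints the commands. Steps 1 through 4 (formatting, mounting, and creating the directory from the older client) are not covered.

```python
import shlex
import subprocess


def repro_commands(mount="/p/ldne", name="faaland1", count=4):
    """Build the commands for steps 5 and 6, to be run on one 2.8 client."""
    path = f"{mount}/{name}"
    return [
        # Step 5: rename the directory created by the older client
        # (assumption: the rename was done with mv).
        ["mv", path, f"{path}.old"],
        # Step 6: recreate it as a striped directory, then set the default
        # directory striping, using the same options as in the report.
        ["lfs", "mkdir", path, f"--count={count}"],
        ["lfs", "setdirstripe", "-D", f"--count={count}", path],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


# Dry run: prints the three commands without touching the filesystem.
run(repro_commands())
```

After running it for real on one client, step 7 amounts to running `ls` on the same path from a different 2.8 client and checking for the Input/output error.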