This is actually a different problem; it seems related to OSP rather than LWP.
It seems MDT1 is being evicted by MDT0 (which is restarted during the test).
According to the debug log:
1. MDT1 is in the final stage of recovery at 1407570696, so it sends the final PING to MDT0.
2. MDT0 queues the ping at the same time, 1407570696.
3. MDT0 processes the final ping request 24 seconds later, and it should reply to MDT1 with "RECOVERY complete" to signal that recovery is done.
4. For some unknown reason, MDT1 gets the "RECOVERY complete" reply from MDT0 only after 20 seconds.
5. In the meantime, the ping_evictor on MDT0 evicts the export from MDT1, because MDT1 cannot ping MDT0 during the recovery stage, i.e. while the import state is not FULL.
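To make the timing concrete, here is a rough sketch of the timeline in Python. It is not Lustre code. It assumes the 20 seconds in step 4 are counted from step 3, that the export's idle clock starts when the ping is queued, and that the evictor fires once the idle time exceeds a threshold. The helper name `evicted_before` and the threshold values are made up for illustration; the real eviction timeout is not in the log excerpt.

```python
# Timestamps taken from the debug log summary (seconds since the epoch).
PING_QUEUED = 1407570696               # steps 1-2: final ping sent and queued
PING_PROCESSED = PING_QUEUED + 24      # step 3: MDT0 handles the ping
REPLY_RECEIVED = PING_PROCESSED + 20   # step 4: assumed to count from step 3


def evicted_before(last_request_time, deadline, evict_timeout):
    """Return the first second at which an idle-time check would evict.

    The check modelled here is `now - last_request_time > evict_timeout`.
    Returns None if no eviction happens up to and including `deadline`.
    `evict_timeout` is a hypothetical value, not one read from the log.
    """
    evict_at = last_request_time + evict_timeout + 1
    return evict_at if evict_at <= deadline else None


# MDT1 is silent for 44 seconds in total under the assumptions above.
print(REPLY_RECEIVED - PING_QUEUED)                        # 44
# With a hypothetical 30 s threshold the export is evicted before the reply.
print(evicted_before(PING_QUEUED, REPLY_RECEIVED, 30))     # 1407570727
# With a hypothetical 60 s threshold it would have survived.
print(evicted_before(PING_QUEUED, REPLY_RECEIVED, 60))     # None
```

The point is only that any threshold shorter than the silent window leads to an eviction while MDT1 is still waiting for the reply.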
I am not sure why we are seeing this now; it is probably because of some recent changes, but I have not dug into it yet. I think the way to fix this might be to update exp_last_request_time in step 3, because the client cannot ping the server while it is waiting for the "final recovery" signal. So we should refresh exp_last_request_time once the server is ready to accept pings and other requests.
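A small simulation of the proposed fix, again as a sketch and not the actual patch. The `Export` class, `ping_evictor` function, and `run` driver are hypothetical stand-ins, only the field name exp_last_request_time comes from the discussion above, and the 30 second threshold is an assumed value. It also assumes the 20 second reply delay is counted from the moment the final ping is processed.

```python
from dataclasses import dataclass


@dataclass
class Export:
    """Hypothetical stand-in for the server-side export of MDT1 on MDT0."""
    exp_last_request_time: int
    evicted: bool = False


def ping_evictor(exp, now, evict_timeout):
    """Evict the export once it has been idle for longer than evict_timeout."""
    if now - exp.exp_last_request_time > evict_timeout:
        exp.evicted = True


def run(refresh_on_final_ping, evict_timeout=30):
    """Replay the logged timeline second by second.

    evict_timeout is an assumed threshold; the real one is not in the log.
    """
    queued = 1407570696          # final ping queued on MDT0
    processed = queued + 24      # final ping handled, recovery completes
    reply_seen = processed + 20  # MDT1 finally sees "RECOVERY complete"
    exp = Export(exp_last_request_time=queued)
    for now in range(queued, reply_seen + 1):
        if now == processed and refresh_on_final_ping:
            # Proposed fix: restart the idle clock when the server is
            # ready to accept pings and other requests again.
            exp.exp_last_request_time = now
        ping_evictor(exp, now, evict_timeout)
    return exp.evicted


print(run(refresh_on_final_ping=False))  # True: evicted while waiting
print(run(refresh_on_final_ping=True))   # False: idle time never exceeds 30 s
```

With the refresh, the longest idle stretch in this model drops from 44 seconds to 23, so the same threshold no longer triggers an eviction.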
Patches have landed on master.