The situation appears to be as follows:
1) When you upgraded your MDS with the patch http://review.whamcloud.com/#patch,sidebyside,3467,7,lustre/osd-ldiskfs/osd_handler.c, the 64bithash/32bithash issue was introduced into your system, because the "osd_thread_info" is reused without being fully reset when a service thread switches from processing one RPC to processing another.
2) For an old client, whether 32-bit or 64-bit, as long as it did NOT claim the OBD_CONNECT_64BITHASH flag when it connected to the MDS, a readdir RPC from that client would cause "FMODE_32BITHASH" to be set in "osd_thread_info::oti_it_ea::oie_file::f_mode". Once such a readdir RPC had been served even once, the "FMODE_32BITHASH" flag on the related "osd_thread_info" would not be cleared until the RPC service was restarted on the MDS.
3) As long as "FMODE_32BITHASH" was set, the dir-hash produced by that RPC service thread would use only the major hash, which is 32 bits. That is why we saw 32-bit dirhash values returned.
4) Not all of the RPC service threads have "FMODE_32BITHASH" set in "osd_thread_info::oti_it_ea::oie_file::f_mode"; it depends on whether old clients have caused those threads to serve readdir RPCs. If an RPC service thread did not have "FMODE_32BITHASH" set, it would generate 64bithash; that is why we also saw some 64-bit dirhash values returned.
5) A readdir RPC from a client can be served by any RPC (readpage) service thread. So sometimes the readdir RPC was served by a thread that had "FMODE_32BITHASH" set, and sometimes by a thread that did NOT. For a large directory, one "ls -l dir" command may trigger several readdir RPCs. If these RPCs were handled by different service threads, some with "FMODE_32BITHASH" set and some without, then when the client sent a 32bithash to a thread that did NOT have "FMODE_32BITHASH" set, that thread would interpret the 32bithash (from the client) as "major = 0, minor = 32bithash", which is wrong. So it could not locate the right position.
6) For a 2.x client, one readdir RPC fetches up to 256 pages, but a 1.8 client fetches only a single page per RPC. So the number of readdir RPCs needed for a directory of the same size differs, and more readdir RPCs mean a higher chance of failure. That is why the failure is easier to reproduce on a 1.8 client than on a 2.x client.
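To make points 2) to 5) concrete, here is a small standalone sketch in C. It is NOT the Lustre or ldiskfs code: "sketch_thread", "SKETCH_32BITHASH" and all the "sketch_*" functions are hypothetical names made up for illustration, the cookie layout (major hash in the high 32 bits, minor hash in the low 32 bits) is a simplification, and "sketch_prepare_reset" only shows one possible way to reset the flag per RPC, not necessarily what the real fix does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the FMODE_32BITHASH bit; the value is arbitrary. */
#define SKETCH_32BITHASH 0x1u

/* Hypothetical stand-in for the per-thread state (osd_thread_info). */
struct sketch_thread {
    unsigned int f_mode;   /* survives from one RPC to the next */
};

struct sketch_hash {
    uint32_t major;
    uint32_t minor;
};

/* Behaviour described above: the flag is only ever set, never cleared. */
static void sketch_prepare_buggy(struct sketch_thread *t, bool client_64bithash)
{
    assert(t != NULL);
    if (!client_64bithash)
        t->f_mode |= SKETCH_32BITHASH;
}

/* One possible remedy (an assumption): recompute the flag for every RPC. */
static void sketch_prepare_reset(struct sketch_thread *t, bool client_64bithash)
{
    assert(t != NULL);
    if (client_64bithash)
        t->f_mode &= ~SKETCH_32BITHASH;
    else
        t->f_mode |= SKETCH_32BITHASH;
}

/* Encode a directory position as the cookie returned to the client. */
static uint64_t sketch_hash_to_cookie(const struct sketch_thread *t,
                                      struct sketch_hash h)
{
    if (t->f_mode & SKETCH_32BITHASH)
        return h.major;                          /* major hash only */
    return ((uint64_t)h.major << 32) | h.minor;  /* major and minor */
}

/* Decode a cookie that the client sends back in its next readdir RPC. */
static struct sketch_hash sketch_cookie_to_hash(const struct sketch_thread *t,
                                                uint64_t cookie)
{
    struct sketch_hash h;

    if (t->f_mode & SKETCH_32BITHASH) {
        h.major = (uint32_t)cookie;
        h.minor = 0;
    } else {
        h.major = (uint32_t)(cookie >> 32);
        h.minor = (uint32_t)cookie;
    }
    return h;
}
```

In this model, once "sketch_prepare_buggy" has run on a thread for an old client, that thread keeps returning major-only cookies even to clients that asked for 64bithash. If such a cookie is then decoded by a second thread that never had the flag set, the result is "major = 0, minor = 32bithash", which is the mismatch described in 5).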
It is NOT true that every lustre-1.8.5 release supports 64bithash. I have checked your branches and found that the oldest branch which supports 64bithash is lustre-1.8.5.0-6chaos. The earlier versions, lustre-1.8.5.0-{1/2/3/4/5}chaos, do NOT support 64bithash.