Details
-
Bug
-
Resolution: Fixed
-
Blocker
-
Lustre 2.1.0, Lustre 1.8.6
-
None
-
Client: 1.8.5
Server: 2xMDS, 8xOSS, 24xOST, Lustre 2.0.59, RHEL 5.6
-
3
-
4996
Description
We have noticed an interoperability issue between 1.8.5 clients and a 2.0.59 server (no other versions were tested).
Clients running 2.0.59 are not affected by the problem.
How to reproduce the problem:
On a client node, run:
cd /mnt/lustre
mkdir somebigdir
cd somebigdir
for i in `seq 1 10000`; do touch file.$i; done;
ls -la
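The steps above can also be wrapped in a small script that bounds the listing with a time limit, so a hang shows up as a failure instead of blocking the shell. This is only a sketch: the function name `repro_listing` and its arguments are made up for illustration, and it assumes `timeout` from GNU coreutils is available on the client. A process stuck inside the kernel may not react to the signal sent by `timeout`, so the limit is best effort.

```sh
#!/bin/sh
# Hypothetical helper, not part of Lustre.
# Usage: repro_listing DIR COUNT LIMIT_SECONDS
repro_listing() {
    dir=$1
    count=$2
    limit=$3

    mkdir -p "$dir" || return 2

    # Create COUNT empty files named file.1 .. file.COUNT inside DIR.
    i=1
    while [ "$i" -le "$count" ]; do
        : > "$dir/file.$i"
        i=$((i + 1))
    done

    # timeout(1) exits non-zero (124) if the listing exceeds the limit.
    if timeout "$limit" ls -la "$dir" > /dev/null; then
        echo "listing completed"
    else
        echo "listing failed or timed out"
        return 1
    fi
}
```

For example, `repro_listing /mnt/lustre/somebigdir 10000 60` would create the files and report whether the listing finished within 60 seconds.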
The symptom is simple: the 1.8.5 client hangs. With a 2.0.59 client, the same listing takes about 4 s.
The problem is independent of the interconnect: it was tested with @tcp as well as with @o2ib.
Log messages possibly related to the issue:
00010000:00010000:10:1306772139.242230:0:3591:0:(ldlm_lock.c:597:ldlm_lock_decref_internal_nolock()) ### ldlm_lock_decref(PR) ns: scratch-MDT0000-mdc-ffff81041677b800 lock: ffff8103f56ec200/0xf6a4fad9013fdffb lrc: 3/1,0 mode: PR/PR res: 8589937616/1 bits 0x3 rrc: 2 type: IBT flags: 0x0 remote: 0x3b122fd677c9380d expref: -99 pid: 1905 timeout: 0
00010000:00010000:10:1306772139.242239:0:3591:0:(ldlm_lock.c:580:ldlm_lock_addref_internal_nolock()) ### ldlm_lock_addref(PR) ns: scratch-MDT0000-mdc-ffff81041677b800 lock: ffff8103f56ec200/0xf6a4fad9013fdffb lrc: 2/1,0 mode: PR/PR res: 8589937616/1 bits 0x3 rrc: 3 type: IBT flags: 0x0 remote: 0x3b122fd677c9380d expref: -99 pid: 1905 timeout: 0
00010000:00010000:10:1306772139.242244:0:3591:0:(ldlm_lock.c:1088:ldlm_lock_match()) ### matched (0 0) ns: scratch-MDT0000-mdc-ffff81041677b800 lock: ffff8103f56ec200/0xf6a4fad9013fdffb lrc: 2/1,0 mode: PR/PR res: 8589937616/1 bits 0x3 rrc: 2 type: IBT flags: 0x0 remote: 0x3b122fd677c9380d expref: -99 pid: 1905 timeout: 0
00000080:00200000:10:1306772139.242252:0:3591:0:(dir.c:594:ll_dir_readpage_20()) VFS Op:inode=144115238810157057/0(ffff8103f56ef920) off 3590582044
00000100:00100000:10:1306772139.242259:0:3591:0:(client.c:2084:ptlrpc_queue_wait()) Sending RPC pname:cluuid:pid:xid:nid:opc ls:9a637513-e3b6-abe7-b530-d8d413e552d9:3591:x1370249573210902:172.16.193.1@o2ib:37
00000100:00100000:10:1306772139.242811:0:3591:0:(client.c:2189:ptlrpc_queue_wait()) Completed RPC pname:cluuid:pid:xid:nid:opc ls:9a637513-e3b6-abe7-b530-d8d413e552d9:3591:x1370249573210902:172.16.193.1@o2ib:37
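To see how long each RPC in such a log took, the "Sending RPC" and "Completed RPC" lines can be paired by their xid. The sketch below assumes, based only on the sample lines above, that the fourth colon-separated field is a seconds.microseconds timestamp and that the xid is the field of the form `x` followed by digits. The function name `rpc_latency_us` is made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: print "<xid> <microseconds>" for each Sending/Completed pair.
# Reads the named log files, or standard input when no file is given.
rpc_latency_us() {
    awk -F: '
    /Sending RPC|Completed RPC/ {
        # Assumed layout: field 4 is sec.usec; split it to avoid float rounding.
        split($4, t, ".")

        # Find the xid: a field made of "x" followed only by digits.
        xid = ""
        for (i = 5; i <= NF; i++) {
            if ($i ~ /^x[0-9]+$/) { xid = $i; break }
        }
        if (xid == "") next

        if ($0 ~ /Sending RPC/) {
            sec[xid] = t[1]
            usec[xid] = t[2]
        } else if (xid in sec) {
            printf "%s %d\n", xid, (t[1] - sec[xid]) * 1000000 + (t[2] - usec[xid])
            delete sec[xid]
            delete usec[xid]
        }
    }' "$@"
}
```

Applied to the two `ptlrpc_queue_wait()` lines above, this reports 552 microseconds for xid x1370249573210902, so that particular RPC completed quickly.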
I can provide more information and run tests as needed.
Best Regards
–
Lukasz Flis