
Client hangs when listing big directory with ls -la

Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • Lustre 2.1.0, Lustre 1.8.6
    • Lustre 2.1.0, Lustre 1.8.6
    • None
    • Client: 1.8.5
      Server: 2xMDS, 8xOSS, 24xOST, Lustre 2.0.59, RHEL 5.6

    Description

      We have noticed an interoperability issue between 1.8.5 clients and a 2.0.59 server (no other versions were tested).
      Clients running 2.0.59 are not affected by the problem.

      How to reproduce the problem:

      On a client node, run:
      cd /mnt/lustre
      mkdir somebigdir
      cd somebigdir
      for i in `seq 1 10000`; do touch file.$i; done
      ls -la

      The symptom is simple: the 1.8.5 client hangs. With a 2.0.59 client, the same listing takes ~4s.
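The reproduction steps above can be wrapped so that a hung listing is reported instead of blocking the terminal. This is a minimal sketch, not part of the original report: it assumes a coreutils version that ships `timeout`, and the function name `repro_ls` and its arguments are hypothetical. If `ls` is stuck in uninterruptible kernel sleep, `timeout` may be unable to kill it, so treat this only as a convenience.

```shell
#!/bin/sh
# Hypothetical helper: populate a directory, then time-box "ls -la" on it.
# Usage: repro_ls DIR COUNT LIMIT_SECONDS
repro_ls() {
    dir=$1
    count=$2
    limit=$3

    mkdir -p "$dir" || return 2

    # Create COUNT empty files named file.1 .. file.COUNT
    i=1
    while [ "$i" -le "$count" ]; do
        : > "$dir/file.$i" || return 2
        i=$((i + 1))
    done

    # GNU timeout exits with status 124 when the time limit is reached
    if timeout "$limit" ls -la "$dir" > /dev/null; then
        echo "listing completed"
    else
        status=$?
        if [ "$status" -eq 124 ]; then
            echo "listing timed out after ${limit}s"
        else
            echo "listing failed with status $status"
        fi
        return 1
    fi
}
```

For the case in this report, a call might look like `repro_ls /mnt/lustre/somebigdir 10000 60`; the 60 second limit is an arbitrary choice, well above the ~4s seen on an unaffected client.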

      The problem is interconnect-independent: it was tested with @tcp as well as with @o2ib.

      Log messages possibly related to the issue:

      00010000:00010000:10:1306772139.242230:0:3591:0:(ldlm_lock.c:597:ldlm_lock_decref_internal_nolock()) ### ldlm_lock_decref(PR) ns: scratch-MDT0000-mdc-ffff81041677b800 lock: ffff8103f56ec200/0xf6a4fad9013fdffb lrc: 3/1,0 mode: PR/PR res: 8589937616/1 bits 0x3 rrc: 2 type: IBT flags: 0x0 remote: 0x3b122fd677c9380d expref: -99 pid: 1905 timeout: 0
      00010000:00010000:10:1306772139.242239:0:3591:0:(ldlm_lock.c:580:ldlm_lock_addref_internal_nolock()) ### ldlm_lock_addref(PR) ns: scratch-MDT0000-mdc-ffff81041677b800 lock: ffff8103f56ec200/0xf6a4fad9013fdffb lrc: 2/1,0 mode: PR/PR res: 8589937616/1 bits 0x3 rrc: 3 type: IBT flags: 0x0 remote: 0x3b122fd677c9380d expref: -99 pid: 1905 timeout: 0
      00010000:00010000:10:1306772139.242244:0:3591:0:(ldlm_lock.c:1088:ldlm_lock_match()) ### matched (0 0) ns: scratch-MDT0000-mdc-ffff81041677b800 lock: ffff8103f56ec200/0xf6a4fad9013fdffb lrc: 2/1,0 mode: PR/PR res: 8589937616/1 bits 0x3 rrc: 2 type: IBT flags: 0x0 remote: 0x3b122fd677c9380d expref: -99 pid: 1905 timeout: 0
      00000080:00200000:10:1306772139.242252:0:3591:0:(dir.c:594:ll_dir_readpage_20()) VFS Op:inode=144115238810157057/0(ffff8103f56ef920) off 3590582044
      00000100:00100000:10:1306772139.242259:0:3591:0:(client.c:2084:ptlrpc_queue_wait()) Sending RPC pname:cluuid:pid:xid:nid:opc ls:9a637513-e3b6-abe7-b530-d8d413e552d9:3591:x1370249573210902:172.16.193.1@o2ib:37
      00000100:00100000:10:1306772139.242811:0:3591:0:(client.c:2189:ptlrpc_queue_wait()) Completed RPC pname:cluuid:pid:xid:nid:opc ls:9a637513-e3b6-abe7-b530-d8d413e552d9:3591:x1370249573210902:172.16.193.1@o2ib:37
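The last two log lines bracket a single RPC (same xid), so their timestamps give its round-trip time. The sketch below pairs "Sending RPC" and "Completed RPC" lines by xid and prints the difference in microseconds. It is an illustration only: the function name `rpc_latency` is hypothetical, and it assumes the colon-separated layout seen in the excerpt, with the timestamp in the fourth field and a six-digit fractional part.

```shell
#!/bin/sh
# Hypothetical helper: print per-xid RPC round-trip time from Lustre debug log lines.
# Reads the named files, or standard input when no file is given.
rpc_latency() {
    awk -F: '
    /ptlrpc_queue_wait\(\)\) (Sending|Completed) RPC/ {
        # Field 4 is "seconds.microseconds"; use integer math to avoid float rounding
        split($4, t, ".")
        us = t[1] * 1000000 + t[2]

        # The xid appears as ":x<digits>:" inside the message
        if (match($0, /:x[0-9]+:/)) {
            xid = substr($0, RSTART + 1, RLENGTH - 2)
            if ($0 ~ /Sending RPC/)
                sent[xid] = us
            else if (xid in sent)
                printf "%s %d us\n", xid, us - sent[xid]
        }
    }' "$@"
}
```

Fed the two `ptlrpc_queue_wait()` lines above, this prints `x1370249573210902 552 us`, which suggests that at least this request was answered promptly by the server.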

      I can provide more information and run tests as needed.
      Best Regards

      Lukasz Flis

        Activity

          [LU-376] Client hangs when listing big directory with ls -la
          pjones Peter Jones made changes -
          Affects Version/s New: Lustre 1.8.6 [ 10022 ]
          Affects Version/s Original: Lustre 1.8.x [ 10010 ]

          simmonsja James A Simmons added a comment - tested with our unpatched Oracle 1.8.6 clients. It worked like a charm. Thank you
          yong.fan nasf (Inactive) made changes -
          Resolution New: Fixed [ 1 ]
          Status Original: In Progress [ 3 ] New: Resolved [ 5 ]

          yong.fan nasf (Inactive) added a comment - Patches have been landed to lustre-1.8.6 and lustre-2.1.0

          Integrated in lustre-master » i686,server,el5,inkernel #158
          LU-376 Positive LL_DIR_END_OFF to indicate the tail of dir hash/offset

          Oleg Drokin : dea1dfafba827572dc1be042de4332e8962f1c14
          Files :

          • lustre/include/lustre_lite.h
          • lustre/ptlrpc/wiretest.c
          • lustre/include/lustre/lustre_idl.h
          • lustre/liblustre/dir.c
          • lustre/utils/wiretest.c
          • lustre/llite/llite_lib.c
          • lustre/mdd/mdd_object.c
          • lustre/llite/dir.c
          • lustre/include/lclient.h
          • lustre/lclient/lcommon_cl.c
          • lustre/llite/statahead.c
          • lustre/llite/file.c
          • lustre/lmv/lmv_obd.c
          • lustre/llite/llite_internal.h
          hudson Build Master (Inactive) added a comment

          Integrated in lustre-master » x86_64,server,el5,ofa #158
          LU-376 Positive LL_DIR_END_OFF to indicate the tail of dir hash/offset

          Oleg Drokin : dea1dfafba827572dc1be042de4332e8962f1c14
          Files :

          • lustre/include/lustre/lustre_idl.h
          • lustre/ptlrpc/wiretest.c
          • lustre/llite/llite_internal.h
          • lustre/lmv/lmv_obd.c
          • lustre/include/lustre_lite.h
          • lustre/liblustre/dir.c
          • lustre/mdd/mdd_object.c
          • lustre/include/lclient.h
          • lustre/lclient/lcommon_cl.c
          • lustre/utils/wiretest.c
          • lustre/llite/dir.c
          • lustre/llite/statahead.c
          • lustre/llite/llite_lib.c
          • lustre/llite/file.c
          hudson Build Master (Inactive) added a comment

          Integrated in lustre-master » x86_64,client,el5,ofa #158
          LU-376 Positive LL_DIR_END_OFF to indicate the tail of dir hash/offset

          Oleg Drokin : dea1dfafba827572dc1be042de4332e8962f1c14
          Files :

          • lustre/llite/llite_internal.h
          • lustre/utils/wiretest.c
          • lustre/liblustre/dir.c
          • lustre/include/lustre_lite.h
          • lustre/llite/dir.c
          • lustre/lclient/lcommon_cl.c
          • lustre/llite/llite_lib.c
          • lustre/llite/statahead.c
          • lustre/lmv/lmv_obd.c
          • lustre/ptlrpc/wiretest.c
          • lustre/include/lclient.h
          • lustre/mdd/mdd_object.c
          • lustre/include/lustre/lustre_idl.h
          • lustre/llite/file.c
          hudson Build Master (Inactive) added a comment

          Integrated in lustre-master » i686,server,el5,ofa #158
          LU-376 Positive LL_DIR_END_OFF to indicate the tail of dir hash/offset

          Oleg Drokin : dea1dfafba827572dc1be042de4332e8962f1c14
          Files :

          • lustre/utils/wiretest.c
          • lustre/include/lustre/lustre_idl.h
          • lustre/mdd/mdd_object.c
          • lustre/llite/llite_lib.c
          • lustre/liblustre/dir.c
          • lustre/llite/file.c
          • lustre/llite/statahead.c
          • lustre/lclient/lcommon_cl.c
          • lustre/lmv/lmv_obd.c
          • lustre/include/lustre_lite.h
          • lustre/include/lclient.h
          • lustre/llite/llite_internal.h
          • lustre/ptlrpc/wiretest.c
          • lustre/llite/dir.c
          hudson Build Master (Inactive) added a comment

          Integrated in lustre-master » i686,server,el6,inkernel #158
          LU-376 Positive LL_DIR_END_OFF to indicate the tail of dir hash/offset

          Oleg Drokin : dea1dfafba827572dc1be042de4332e8962f1c14
          Files :

          • lustre/liblustre/dir.c
          • lustre/llite/file.c
          • lustre/include/lustre_lite.h
          • lustre/lclient/lcommon_cl.c
          • lustre/lmv/lmv_obd.c
          • lustre/mdd/mdd_object.c
          • lustre/llite/llite_lib.c
          • lustre/include/lustre/lustre_idl.h
          • lustre/llite/statahead.c
          • lustre/llite/llite_internal.h
          • lustre/utils/wiretest.c
          • lustre/include/lclient.h
          • lustre/ptlrpc/wiretest.c
          • lustre/llite/dir.c
          hudson Build Master (Inactive) added a comment

          Integrated in lustre-master » x86_64,server,el5,inkernel #158
          LU-376 Positive LL_DIR_END_OFF to indicate the tail of dir hash/offset

          Oleg Drokin : dea1dfafba827572dc1be042de4332e8962f1c14
          Files :

          • lustre/include/lustre/lustre_idl.h
          • lustre/llite/dir.c
          • lustre/lmv/lmv_obd.c
          • lustre/include/lustre_lite.h
          • lustre/llite/llite_internal.h
          • lustre/llite/file.c
          • lustre/mdd/mdd_object.c
          • lustre/ptlrpc/wiretest.c
          • lustre/utils/wiretest.c
          • lustre/liblustre/dir.c
          • lustre/llite/statahead.c
          • lustre/llite/llite_lib.c
          • lustre/include/lclient.h
          • lustre/lclient/lcommon_cl.c
          hudson Build Master (Inactive) added a comment

          People

            yong.fan nasf (Inactive)
            lflis Lukasz Flis