
NFS server not responding when running parallel-scale test_iorssf

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Fix Version/s: Lustre 2.2.0, Lustre 2.1.2
    • Affects Version/s: Lustre 2.2.0
    • Labels: None
    • Environment: server/client: 2.1.55 RHEL6-x86_64
    • Severity: 3
    • Rank: 4710

    Description

      Hit the following error when running iorssf over NFS v3

      Lustre: DEBUG MARKER: == parallel-scale test iorssf: iorssf == 13:45:23 (1329342323)
      nfs: server 10.10.4.15 not responding, still trying
      nfs: server 10.10.4.15 not responding, still trying
      nfs: server 10.10.4.15 not responding, still trying
      nfs: server 10.10.4.15 not responding, still trying
      nfs: server 10.10.4.15 not responding, still trying

          Activity

            [LU-1109] NFS server not responding when running parallel-scale test_iorssf

            hudson Build Master (Inactive) added a comment -

            Integrated in lustre-master » i686,client,el5,ofa #493
            LU-1109 llite: do splice read stripe by stripe (Revision 211b00d651bbc57d9ab9d24d6d7e94b013957cf1)

            Result = SUCCESS
            Oleg Drokin : 211b00d651bbc57d9ab9d24d6d7e94b013957cf1
            Files :

            • lustre/llite/vvp_io.c

            hudson Build Master (Inactive) added a comment -

            Integrated in lustre-master » x86_64,client,el6,ofa #493
            LU-1109 llite: do splice read stripe by stripe (Revision 211b00d651bbc57d9ab9d24d6d7e94b013957cf1)

            Result = SUCCESS
            Oleg Drokin : 211b00d651bbc57d9ab9d24d6d7e94b013957cf1
            Files :

            • lustre/llite/vvp_io.c

            hudson Build Master (Inactive) added a comment -

            Integrated in lustre-master » x86_64,client,ubuntu1004,inkernel #493
            LU-1109 llite: do splice read stripe by stripe (Revision 211b00d651bbc57d9ab9d24d6d7e94b013957cf1)

            Result = SUCCESS
            Oleg Drokin : 211b00d651bbc57d9ab9d24d6d7e94b013957cf1
            Files :

            • lustre/llite/vvp_io.c

            hudson Build Master (Inactive) added a comment -

            Integrated in lustre-master » x86_64,client,el5,ofa #493
            LU-1109 llite: do splice read stripe by stripe (Revision 211b00d651bbc57d9ab9d24d6d7e94b013957cf1)

            Result = SUCCESS
            Oleg Drokin : 211b00d651bbc57d9ab9d24d6d7e94b013957cf1
            Files :

            • lustre/llite/vvp_io.c

            hudson Build Master (Inactive) added a comment -

            Integrated in lustre-master » i686,server,el5,ofa #493
            LU-1109 llite: do splice read stripe by stripe (Revision 211b00d651bbc57d9ab9d24d6d7e94b013957cf1)

            Result = SUCCESS
            Oleg Drokin : 211b00d651bbc57d9ab9d24d6d7e94b013957cf1
            Files :

            • lustre/llite/vvp_io.c

            hudson Build Master (Inactive) added a comment -

            Integrated in lustre-master » x86_64,client,el5,inkernel #493
            LU-1109 llite: do splice read stripe by stripe (Revision 211b00d651bbc57d9ab9d24d6d7e94b013957cf1)

            Result = SUCCESS
            Oleg Drokin : 211b00d651bbc57d9ab9d24d6d7e94b013957cf1
            Files :

            • lustre/llite/vvp_io.c

            pjones Peter Jones added a comment -

            Landed for 2.2

            hudson Build Master (Inactive) added a comment -

            Integrated in lustre-master » x86_64,server,el5,inkernel #493
            LU-1109 llite: do splice read stripe by stripe (Revision 211b00d651bbc57d9ab9d24d6d7e94b013957cf1)

            Result = SUCCESS
            Oleg Drokin : 211b00d651bbc57d9ab9d24d6d7e94b013957cf1
            Files :

            • lustre/llite/vvp_io.c

            sarah Sarah Liu added a comment -

            Hi, I reran the test on Toro instead of Juelich; both NFSv3 and NFSv4 passed:
            https://maloo.whamcloud.com/test_sets/395bfe5a-61d7-11e1-b462-5254004bbbd3
            https://maloo.whamcloud.com/test_sets/00c195ae-61d8-11e1-b462-5254004bbbd3

            sarah Sarah Liu added a comment -

            I didn't run a recovery test at that time. I will keep you updated if I have any more information.

            jay Jinshan Xiong (Inactive) added a comment -

            Hi Sarah,

            This does not look like the same problem. Was a recovery test running while IOR was running? The suspicious log entries are these:

            LustreError: 31719:0:(file.c:2221:ll_inode_revalidate_fini()) failure -95 inode 1045761
            LustreError: 31768:0:(mdc_locks.c:719:mdc_enqueue()) ldlm_cli_enqueue: -95
            LustreError: 31768:0:(file.c:2221:ll_inode_revalidate_fini()) failure -95 inode 1045761

            But I can't be sure until I have the full log.

            Could you please rerun the test with the following debug settings on the NFS server (Lustre client):
            1. lctl set_param debug=-1
            2. lctl set_param debug=-trace
            3. lctl set_param debug_mb=200
            4. lctl mark "XXXX IOR test starting..."

            Once you notice that nfsd is hung, please do the following in addition to collecting the Lustre logs:
            5. echo t > /proc/sysrq-trigger
            6. dmesg > dmesg.txt, then upload the dmesg.txt file.

            Thanks.
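
The six debugging steps requested in the comment above could be wrapped in a small shell helper along the following lines. This is only a sketch: the function names `run`, `enable_lustre_debug`, and `collect_hang_info`, the `DRY_RUN` and `LUSTRE_LOG` variables, and the default log file name are hypothetical and do not come from the ticket. The comment also does not say how the Lustre logs should be collected; dumping the debug buffer with `lctl dk` is an assumption here.

```shell
#!/bin/sh
# Sketch of the debug-collection steps from the comment above.
# By default commands are only printed; set DRY_RUN=0 (as root, on the
# NFS server / Lustre client) to actually execute them.

# Hypothetical helper: print the command in dry-run mode, else run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Steps 1-4: enable full debugging minus trace, enlarge the debug
# buffer to 200 MB, and drop a marker into the log before IOR starts.
enable_lustre_debug() {
    run lctl set_param debug=-1
    run lctl set_param debug=-trace
    run lctl set_param debug_mb=200
    run lctl mark "XXXX IOR test starting..."
}

# Steps 5-6, to be called once nfsd appears hung.
collect_hang_info() {
    # Assumption: dump the Lustre debug buffer with 'lctl dk'; the
    # ticket only says "collecting lustre logs".
    run lctl dk "${LUSTRE_LOG:-lustre-debug.log}"
    # Step 5: ask the kernel to log the stack of every task.
    run sh -c 'echo t > /proc/sysrq-trigger'
    # Step 6: save the kernel log for upload.
    run sh -c 'dmesg > dmesg.txt'
}
```

One possible way to use it: source the file, call `enable_lustre_debug` before starting the IOR run, and call `collect_hang_info` when the "not responding" messages appear. Keeping dry-run as the default makes it easy to review the exact commands before running them with `DRY_RUN=0`.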

            People

              jay Jinshan Xiong (Inactive)
              sarah Sarah Liu
              Votes: 0
              Watchers: 5
