
Cannot send after transport endpoint shutdown

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.10.6
    • Component/s: None
    • Environment: CentOS 7.4
    • Severity: 3

    Description

      When running multiple rm commands on files, we get the following error in the shell:

      /bin/rm: cannot remove '</some/file/path>': Cannot send after transport endpoint shutdown

      These coincide with the following error in /var/log/messages:

       Dec 24 11:13:09 foxtrot2 kernel: LustreError: 11-0: foxtrot-MDT0000-mdc-ffff883ff6b12800: operation mds_close to node 10.21.22.10@tcp failed: rc = -107
       Dec 24 11:13:09 foxtrot2 kernel: Lustre: foxtrot-MDT0000-mdc-ffff883ff6b12800: Connection to foxtrot-MDT0000 (at 10.21.22.10@tcp) was lost; in progress operations using this service will wait for recovery to complete
       Dec 24 11:13:09 foxtrot2 kernel: LustreError: 167-0: foxtrot-MDT0000-mdc-ffff883ff6b12800: This client was evicted by foxtrot-MDT0000; in progress operations using this service will fail.
       Dec 24 11:13:09 foxtrot2 kernel: LustreError: 3598:0:(mdc_locks.c:1211:mdc_intent_getattr_async_interpret()) ldlm_cli_enqueue_fini: -5
       Dec 24 11:13:09 foxtrot2 kernel: LustreError: 3598:0:(mdc_locks.c:1211:mdc_intent_getattr_async_interpret()) Skipped 37 previous similar messages
       Dec 24 11:13:09 foxtrot2 kernel: LustreError: Skipped 50 previous similar messages
       Dec 24 11:13:09 foxtrot2 kernel: LustreError: 39322:0:(llite_lib.c:1512:ll_md_setattr()) md_setattr fails: rc = -5
       Dec 24 11:13:09 foxtrot2 kernel: LustreError: 38248:0:(file.c:172:ll_close_inode_openhandle()) foxtrot-clilmv-ffff883ff6b12800: inode [0x200030875:0x5d11:0x0] mdc close failed: rc = -107
       Dec 24 11:13:09 foxtrot2 kernel: LustreError: 38248:0:(file.c:172:ll_close_inode_openhandle()) Skipped 743 previous similar messages
       Dec 24 11:13:09 foxtrot2 kernel: LustreError: 41760:0:(vvp_io.c:1474:vvp_io_init()) foxtrot: refresh file layout [0x2000302ba:0x103db:0x0] error -108.
       Dec 24 11:13:09 foxtrot2 kernel: LustreError: 41760:0:(vvp_io.c:1474:vvp_io_init()) Skipped 310070 previous similar messages
       Dec 24 11:13:09 foxtrot2 kernel: LustreError: 44300:0:(mdc_request.c:1329:mdc_read_page()) foxtrot-MDT0000-mdc-ffff883ff6b12800: [0x20002cfcf:0x5a20:0x0] lock enqueue fails: rc = -108
       Dec 24 11:13:09 foxtrot2 kernel: LustreError: 39322:0:(llite_lib.c:1512:ll_md_setattr()) Skipped 5 previous similar messages
       Dec 24 11:13:09 foxtrot2 kernel: LustreError: 12816:0:(vvp_io.c:1474:vvp_io_init()) foxtrot: refresh file layout [0x200030766:0x18539:0x0] error -108.
       Dec 24 11:13:09 foxtrot2 kernel: LustreError: 39252:0:(vvp_io.c:1474:vvp_io_init()) foxtrot: refresh file layout [0x2000302ba:0x10403:0x0] error -108.
       Dec 24 11:13:09 foxtrot2 kernel: LustreError: 39252:0:(vvp_io.c:1474:vvp_io_init()) Skipped 143616 previous similar messages
       Dec 24 11:13:09 foxtrot2 kernel: LustreError: 44302:0:(file.c:172:ll_close_inode_openhandle()) foxtrot-clilmv-ffff883ff6b12800: inode [0x20000070c:0x2ea9:0x0] mdc close failed: rc = -108
       Dec 24 11:13:09 foxtrot2 kernel: LustreError: 44302:0:(file.c:172:ll_close_inode_openhandle()) Skipped 815 previous similar messages
       Dec 24 11:13:09 foxtrot2 kernel: LustreError: 12816:0:(vvp_io.c:1474:vvp_io_init()) Skipped 2986 previous similar messages
       Dec 24 11:13:10 foxtrot2 kernel: Lustre: foxtrot-MDT0000-mdc-ffff883ff6b12800: Connection restored to 10.21.22.10@tcp (at 10.21.22.10@tcp)
      


          Activity

            [LU-11826] Cannot send after transport endpoint shutdown
            green Oleg Drokin added a comment -

            hm... How were these logs produced? I don't see any evictions in there?

            I guess I failed to mention that when you have the dump-on-eviction setup, /tmp/lustre-log-TIMESTAMP files are dropped for every event. They are binary and need to be passed through "lctl df" to be turned into text. It's those files that we need. Hopefully they are still there?

            The lctl df step is not necessary; we can run that ourselves.
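
            For reference, a minimal sketch of that conversion step (assuming the binary dumps are still under /tmp and lctl is in the PATH) could look like:

            # convert each binary debug dump to text with "lctl df" (debug_file)
            for f in /tmp/lustre-log-*; do
                lctl df "$f" "${f}.txt"
            done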


            Hi Oleg,

            Did you get a chance to take a look at those logs, and if so, were you able to determine what the issue was? 

            Kind regards,

            Campbell


            Hi Oleg,

            Two of the three clients lost connection with the MDS. I have dumped the logs out and FTP'ed them up. The file names are:

            fmds1-lustre.log.220119.gz

            foxtrot2-lustre.log.220119.gz

            foxtrot3-lustre.log.220119.gz

            Regards,

            Campbell


            Thank you very much, Oleg, that's a very handy parameter. I have set this on the clients and the MDS and resumed the deletes. It might take a little while to trigger; I will send the results through if it disconnects again.

            green Oleg Drokin added a comment -

            I just noticed that you only capture 5 minutes of log in 3 GB. That might be enough, since a normal lock timeout is around 200s and the AST timeout is only about 7s.

            green Oleg Drokin added a comment -

            it's weird that your logs literally start with an eviction:

            00010000:00020000:51.1F:1545412061.748721:0:0:0:(ldlm_lockd.c:333:waiting_locks_callback())
            ### lock callback timer expired after 0s: evicting client at 10.21.22.32@tcp
            ns: mdt-foxtrot-MDT0000_UUID lock: ffff88268bb39400/0x13866bc8bdb565ba lrc: 4/0,0 mode: PR/PR
            res: [0x20002a5ed:0x16aa8:0x0].0x0 bits 0x1b rrc: 3 type: IBT flags: 0x60200400000020
            nid: 10.21.22.32@tcp remote: 0xc849fe3bc093e756 expref: 63841 pid: 20017 timeout: 0 lvb_type: 0
            

            It makes no sense to run days of logs on the clients waiting for the eviction, but you can set the "dump logs on eviction" flag so that every time a client is evicted it dumps some logs; the first dump will contain all the history.
            You can do this by setting dump_on_eviction=1 with lctl set_param on every client and on the MDT.
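
            For example, a minimal sketch of that setting (run as root on every client and on the MDS):

            # enable automatic debug-log dumps when this node is evicted
            lctl set_param dump_on_eviction=1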

            Also, I don't see any evictions at the 1547661781 timestamp in the logs. The preceding evictions are at 1545663407, which is clearly too long ago, and the next evictions are at 1547669576, which is 2+ hours away from your specified time (a timezone calculation problem?) and does not match the eviction profile we saw in the initial report (AST timeout). In fact, at no time do we see the "failed to reply to blocking AST" message in the MDT log.

            Unfortunately the logs you gathered had already become somewhat fragmented by the time you captured them, so it would be really great if you could capture something from the clients and servers using the dump-on-eviction method above (Lustre has three-tier logging, so even when a log starts at a particular timestamp, that does not mean there are no gaps between the entries).


            Hi,

            Did you find anything of interest in the log I sent? Or do you need the client log? I'm finding it difficult to capture the log on the client, as it may take a few days to trigger and I can't watch it all the time. Even with 4 GB of memory assigned to the debug log size, it only seems to capture about 5 minutes' worth (though the log is 3 GB in size).

            Regards,

            Campbell


            Hi,

            I've uploaded the MDS log (uploads/LU-11826/fmds1-lustre.log.gz) to ftp.whamcloud.com. The time of the disconnection was 1547661781 in Unix time (16th Jan 22:27:22). I'm running another rm in the hope that I can capture some useful logs on the clients as well.

            Regards,

            Campbell

            pjones Peter Jones added a comment -

            Sometimes the diagnostic data collected as part of Lustre troubleshooting is too large to be attached to a JIRA ticket. For these cases, Whamcloud provides an anonymous write-only FTP upload service. In order to use this service, you'll need an FTP client (e.g. ncftp, ftp, etc.) and a JIRA issue. Use the 'uploads' directory and create a new subdirectory using your Jira issue as a name.

            In the following example, there are three debug logs in a single directory and the JIRA issue LU-4242 has been created. After completing the upload, please update the relevant issue with a note mentioning the upload, so that our engineers know where to find your logs.

            $ ls -lh
            total 333M
            -rw-r--r-- 1 mjmac mjmac  98M Feb 23 17:36 mds-debug
            -rw-r--r-- 1 mjmac mjmac 118M Feb 23 17:37 oss-00-debug
            -rw-r--r-- 1 mjmac mjmac 118M Feb 23 17:37 oss-01-debug
            $ ncftp ftp.whamcloud.com
            NcFTP 3.2.2 (Sep 04, 2008) by Mike Gleason (http://www.NcFTP.com/contact/).
            Connecting to 99.96.190.235...
            (vsFTPd 2.2.2)
            Logging in...
            Login successful.
            Logged in to ftp.whamcloud.com.
            ncftp / > cd uploads
            Directory successfully changed.
            ncftp /uploads > mkdir LU-4242
            ncftp /uploads > cd LU-4242
            Directory successfully changed.
            ncftp /uploads/LU-4242 > put *
            mds-debug: 97.66 MB 11.22 MB/s
            oss-00-debug: 117.19 MB 11.16 MB/s
            oss-01-debug: 117.48 MB 11.18 MB/s
            ncftp /uploads/LU-4242 >
            

            Please note that this is a WRITE-ONLY FTP service, so you will not be able to see (with ls) the files or directories you've created, nor will you (or anyone other than Whamcloud staff) be able to see or read them.


            Hi Oleg,

            There was a disconnection at around 22:27 last night, about 5 hours after I started the rm's. Unfortunately, I was not able to dump the log until I got in this morning, and the timestamp at the beginning of the client logs is after this time; however, the MDS debug log has that period covered. I'll restart the rm's and see if I can capture the error. Should I increase the debug_mb value on the clients? In the meantime, I can send you the MDS log, which, compressed, is about 22 MB. Is there somewhere I can send it?

            Regards,

            Campbell

            green Oleg Drokin added a comment -

            You could also use the "+..." notation to just add flags to what you already have.
            Don't change anything in subsystem_debug; it's already fully enabled. You only need to adjust the debug setting.

            In your case you are best advised to increase the debug log size to something like 1 GB (note that this will consume 1 GB of RAM):

            lctl set_param debug_mb=1024
            

            and then add dlmtrace, info, and rpctrace with:

            lctl set_param debug="+dlmtrace +info +rpctrace"
            

            Run these two commands on all the clients you expect the problem might hit and then also on the MDS.

            Once the problem hits, run "lctl dk >/tmp/${HOSTNAME}-lustre.log" on all affected nodes, collect the files, and upload them somewhere.
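
            Put together, a minimal collection sketch for one affected node (the gzip step and the file naming are just an example, not a required convention) could be:

            # dump the in-memory Lustre debug log to a per-host file, then compress it for upload
            lctl dk > /tmp/${HOSTNAME}-lustre.log
            gzip /tmp/${HOSTNAME}-lustre.log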

            Thanks.


            People

              Assignee: Oleg Drokin
              Reporter: Campbell Mcleay (Inactive)
              Votes: 0
              Watchers: 4
