
mdc_reint.c:57:mdc_reint()) error in handling -17 encountered on power8 node

Details

    • Type: Bug
    • Resolution: Not a Bug
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.9.0
    • Labels: None
    • Environment: Power8 running RHEL 7.2 with Lustre version 2.8.56
    • Severity: 3
    • Rank: 9223372036854775807

    Description

      During my latest testing, after I had also updated my test file system to the latest version, I found that an IOR single-shared-file job run on one client always fails. Nothing shows up in the kernel logs, but I have gathered Lustre debug logs that show the problem. I have attached those logs here.

          Activity

            [LU-8567] mdc_reint.c:57:mdc_reint()) error in handling -17 encountered on power8 node

            simmonsja James A Simmons added a comment -

            Sorry, this doesn't appear to be the source of the bug. The true source is LU-7321; it's just that the bug now appears only on my servers. Sorry for the noise.

            bobijam Zhenyu Xu added a comment - edited

            What's the failed job output? Can you strace it?

            The log shows that mkdir of "jsimmons" under [0x200000410:0x4:0x0] failed because the directory already exists (rc -17 is EEXIST):

            00000080:00000001:1.0:1472488456.442910:1344:93034:0:(namei.c:1235:ll_mkdir()) Process entered
            00000080:00200000:1.0:1472488456.442910:1344:93034:0:(namei.c:1238:ll_mkdir()) VFS Op:name=jsimmons, dir=[0x200000410:0x4:0x0](c0000007eb866c90)
            00000080:00000001:1.0:1472488456.442912:1696:93034:0:(namei.c:982:ll_new_node()) Process entered
            ...
            00000080:00000001:1.0:1472488456.444020:1792:93034:0:(namei.c:1008:ll_new_node()) Process leaving via err_exit (rc=18446744073709551599 : -17 : 0xffffffffffffffef)
            ...
            00000080:00000001:1.0:1472488456.444031:1456:93034:0:(namei.c:1249:ll_mkdir()) Process leaving (rc=18446744073709551599 : -17 : ffffffffffffffef)
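
The rc values in the "Process leaving" lines above are printed three ways: as an unsigned 64-bit number, as a signed number, and in hex. As a rough sketch of how to read them, the snippet below pulls the function name and rc out of such a line and maps the negative value to its errno name. The regular expression and the helper names (`LEAVING`, `to_signed64`, `decode_leaving`) are illustrative assumptions written against the lines quoted in this ticket, not part of Lustre or its tooling.

```python
import errno
import re
from typing import Optional, Tuple

# Assumed pattern, derived only from the log lines quoted in this ticket:
# "(file:line:function()) Process leaving ... (rc=<unsigned> : <signed> : <hex>)"
LEAVING = re.compile(
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s+"
    r"Process leaving.*?"
    r"\(rc=(?P<raw>\d+) : (?P<signed>-?\d+) : (?P<hex>(?:0x)?[0-9a-f]+)\)"
)


def to_signed64(raw: int) -> int:
    """Reinterpret an unsigned 64-bit value as two's-complement signed."""
    return raw - (1 << 64) if raw >= (1 << 63) else raw


def decode_leaving(line: str) -> Optional[Tuple[str, int, str]]:
    """Return (function, signed rc, errno name), or None if the line is not a leave record."""
    m = LEAVING.search(line)
    if m is None:
        return None
    rc = to_signed64(int(m.group("raw")))
    # Kernel-style return codes are negated errno values; 0 means success.
    name = errno.errorcode.get(-rc, "UNKNOWN") if rc < 0 else "OK"
    return m.group("func"), rc, name


if __name__ == "__main__":
    sample = (
        "00000080:00000001:1.0:1472488456.444020:1792:93034:0:"
        "(namei.c:1008:ll_new_node()) Process leaving via err_exit "
        "(rc=18446744073709551599 : -17 : 0xffffffffffffffef)"
    )
    print(decode_leaving(sample))
```

Run over the attached debug log, a filter like this would list every function that returned an error, which makes it easier to see that the -17 originates from the mkdir path and not from the IOR file I/O itself.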
            
            pjones Peter Jones added a comment -

            Bobijam

            Could you please assist with this issue?

            Thanks

            Peter


            People

              Assignee: bobijam Zhenyu Xu
              Reporter: simmonsja James A Simmons
              Votes: 0
              Watchers: 3
