Lustre / LU-3230

conf-sanity fails to start run: umount of OST fails

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.6.0, Lustre 2.5.1
    • Affects Version/s: Lustre 2.4.0, Lustre 2.4.1, Lustre 2.5.0, Lustre 2.4.2, Lustre 2.5.1
    • 3
    • 7893

    Description

      This issue was created by maloo for Nathaniel Clark <nathaniel.l.clark@intel.com>

      This issue relates to the following test suite runs:
      http://maloo.whamcloud.com/test_sets/bbe080da-ad17-11e2-bd7c-52540035b04c
      http://maloo.whamcloud.com/test_sets/51e42416-ad76-11e2-b72d-52540035b04c
      http://maloo.whamcloud.com/test_sets/842709fa-ad73-11e2-b72d-52540035b04c

      The sub-test conf-sanity failed with the following error:

      test failed to respond and timed out

      Info required for matching: conf-sanity conf-sanity
      Info required for matching: replay-single test_90

    Activity

            utopiabound Nathaniel Clark added a comment - back-port to b2_4 http://review.whamcloud.com/8591
            utopiabound Nathaniel Clark added a comment - edited

            It looks like this bug is fixed with the landing of #7995. Should I create a Gerrit patch to port it to b2_4 and b2_5?
            It will cherry-pick cleanly to the current heads of both branches.

            yujian Jian Yu added a comment - edited

            More instances on Lustre b2_4 branch:
            https://maloo.whamcloud.com/test_sets/dcb5daa6-6579-11e3-8518-52540035b04c
            https://maloo.whamcloud.com/test_sets/6c3ab5e4-6358-11e3-8c76-52540035b04c
            https://maloo.whamcloud.com/test_sets/d4b0f714-6281-11e3-a8fd-52540035b04c
            yujian Jian Yu added a comment -

            Lustre Build: http://build.whamcloud.com/job/lustre-b2_4/63/
            Distro/Arch: RHEL6.4/x86_64 (server), SLES11SP2/x86_64 (client)

            replay-dual test 3 hit this failure:
            https://maloo.whamcloud.com/test_sets/20b3d072-5c98-11e3-956b-52540035b04c

            yujian Jian Yu added a comment -

            Lustre build: http://build.whamcloud.com/job/lustre-b2_4/58/
            Distro/Arch: RHEL6.4/x86_64

            FSTYPE=zfs
            MDSCOUNT=1
            MDSSIZE=2097152
            OSTCOUNT=2
            OSTSIZE=2097152

            obdfilter-survey test 3a hit the same failure:
            https://maloo.whamcloud.com/test_sets/19556f3e-5608-11e3-8e94-52540035b04c

            utopiabound Nathaniel Clark added a comment - http://review.whamcloud.com/7995
            yujian Jian Yu added a comment -

            Lustre build: http://build.whamcloud.com/job/lustre-b2_4/47/
            Distro/Arch: RHEL6.4/x86_64

            FSTYPE=zfs
            MDSCOUNT=1
            MDSSIZE=2097152
            OSTCOUNT=2
            OSTSIZE=2097152

            obdfilter-survey test 3a hit the same failure:
            https://maloo.whamcloud.com/test_sets/a488f632-4453-11e3-8472-52540035b04c


            utopiabound Nathaniel Clark added a comment -

            Debugging patch, to see whether 6988 was on the right track but not broad enough:
            http://review.whamcloud.com/7995

            utopiabound Nathaniel Clark added a comment -

            There have been two "recent" (Sept 2013) non-conf-sanity failures (both in replay-single):

            replay-single/74 https://maloo.whamcloud.com/test_sets/f441c460-227f-11e3-af6a-52540035b04c
            A review-dne-zfs failure on OST0000

            21:28:53:Lustre: DEBUG MARKER: umount -d /mnt/ost1
            21:28:53:Lustre: Failing over lustre-OST0000
            21:28:53:LustreError: 15640:0:(ost_handler.c:1782:ost_blocking_ast()) Error -2 syncing data on lock cancel
            21:28:53:Lustre: 15640:0:(service.c:2030:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (50:74s); client may timeout.  req@ffff880046d72c00 x1446662193136696/t0(0) o103->cea0ffc2-1873-4321-a1a2-348391764373@10.10.16.253@tcp:0/0 lens 328/192 e 0 to 0 dl 1379651120 ref 1 fl Complete:H/0/0 rc -19/-19
            21:28:53:LustreError: 7671:0:(ost_handler.c:1782:ost_blocking_ast()) Error -2 syncing data on lock cancel
            21:28:53:Lustre: lustre-OST0000: Not available for connect from 10.10.17.1@tcp (stopping)
            21:28:53:Lustre: Skipped 5 previous similar messages
            21:28:53:Lustre: lustre-OST0000 is waiting for obd_unlinked_exports more than 8 seconds. The obd refcount = 7. Is it stuck?
            21:28:53:Lustre: lustre-OST0000 is waiting for obd_unlinked_exports more than 16 seconds. The obd refcount = 7. Is it stuck?
            21:28:53:Lustre: lustre-OST0000 is waiting for obd_unlinked_exports more than 32 seconds. The obd refcount = 7. Is it stuck?
            21:40:22:Lustre: lustre-OST0000 is waiting for obd_unlinked_exports more than 64 seconds. The obd refcount = 7. Is it stuck?
            

            The other is review run replay-single/53e https://maloo.whamcloud.com/test_sets/ddb85db2-208b-11e3-b9bc-52540035b04c (NOT ZFS)
            The MGS fails:

            03:55:06:Lustre: DEBUG MARKER: umount -d /mnt/mds1
            03:55:06:LustreError: 166-1: MGC10.10.4.154@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail
            03:55:07:Lustre: MGS is waiting for obd_unlinked_exports more than 8 seconds. The obd refcount = 5. Is it stuck?
            03:55:31:Lustre: MGS is waiting for obd_unlinked_exports more than 16 seconds. The obd refcount = 5. Is it stuck?
            03:56:05:Lustre: MGS is waiting for obd_unlinked_exports more than 32 seconds. The obd refcount = 5. Is it stuck?
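
            A note on the doubling intervals in these messages: during server umount the OBD device waits for all of its exports to be released, re-checking and complaining with a timeout that doubles each round (8 s, 16 s, 32 s, 64 s, ...), which is why the console shows exactly that progression while the refcount stays pinned. Below is a minimal stand-alone C sketch of that back-off pattern only; time is simulated, the names and numbers are made up for illustration, and this is not the actual Lustre export-cleanup code.

            /* Toy model: complain about a stuck export refcount at doubling
             * intervals, echoing the "is waiting for obd_unlinked_exports
             * more than N seconds" console messages above.  Time is simulated
             * so the program finishes instantly; the real server sleeps on a
             * waitqueue instead. */
            #include <stdio.h>

            int main(void)
            {
                int refcount = 7;   /* value reported in the console message     */
                int warn_at  = 8;   /* first complaint after 8 simulated seconds */
                int waited   = 0;

                /* Pretend the stuck exports are released after 70 "seconds";
                 * in the hung runs above this never happens, so the messages
                 * just keep doubling until the test run times out. */
                while (waited < 70) {
                    waited++;
                    if (waited >= warn_at) {
                        printf("lustre-OST0000 is waiting for obd_unlinked_exports "
                               "more than %d seconds. The obd refcount = %d. "
                               "Is it stuck?\n", warn_at, refcount);
                        warn_at <<= 1;  /* 8 -> 16 -> 32 -> 64, as in the logs */
                    }
                }
                printf("exports released after %d seconds; umount can proceed\n",
                       waited);
                return 0;
            }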
            
            utopiabound Nathaniel Clark added a comment - edited

            sanity/132 failures appear to be LU-4019.


            utopiabound Nathaniel Clark added a comment -

            The sanity/132 failures seem to share the following OST logs:

            15:51:18:Lustre: DEBUG MARKER: == sanity test 132: som avoids glimpse rpc == 15:50:26 (1380581426)
            15:51:18:LustreError: 23533:0:(ost_handler.c:1775:ost_blocking_ast()) Error -2 syncing data on lock cancel
            15:51:18:Lustre: lustre-OST0006: Client lustre-MDT0000-mdtlov_UUID (at 10.10.16.120@tcp) reconnecting
            15:51:18:Lustre: lustre-OST0006: Client lustre-MDT0000-mdtlov_UUID (at 10.10.16.120@tcp) refused reconnection, still busy with 1 active RPCs
            15:51:18:Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n ost.OSS.ost.stats
            15:51:18:Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n ost.OSS.ost.stats
            15:51:18:Lustre: lustre-OST0006: Client lustre-MDT0000-mdtlov_UUID (at 10.10.16.120@tcp) reconnecting
            15:51:18:Lustre: lustre-OST0006: Client lustre-MDT0000-mdtlov_UUID (at 10.10.16.120@tcp) refused reconnection, still busy with 1 active RPCs
            15:51:18:LustreError: 11-0: lustre-MDT0000-lwp-OST0001: Communicating with 10.10.16.120@tcp, operation obd_ping failed with -107.
            15:51:18:Lustre: lustre-MDT0000-lwp-OST0001: Connection to lustre-MDT0000 (at 10.10.16.120@tcp) was lost; in progress operations using this service will wait for recovery to complete
            

            Then a umount of OST0006, which never completes:

            15:52:09:Lustre: 7404:0:(client.c:1897:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1380581484/real 1380581484]  req@ffff8800634d5800 x1447637766224616/t0(0) o250->MGC10.10.16.120@tcp@10.10.16.120@tcp:26/25 lens 400/544 e 0 to 1 dl 1380581500 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
            15:52:09:Lustre: lustre-OST0006 is waiting for obd_unlinked_exports more than 8 seconds. The obd refcount = 5. Is it stuck?
            

            From the MDT console log:

            16:51:27:Lustre: DEBUG MARKER: == sanity test 132: som avoids glimpse rpc == 15:50:26 (1380581426)
            16:51:27:LustreError: 11-0: lustre-OST0006-osc-MDT0000: Communicating with 10.10.16.121@tcp, operation ost_connect failed with -16.
            16:51:27:Lustre: DEBUG MARKER: /usr/sbin/lctl get_param mdt.*.som
            16:51:27:LustreError: 11-0: lustre-OST0006-osc-MDT0000: Communicating with 10.10.16.121@tcp, operation ost_connect failed with -16.
            16:51:27:Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.mdt.som=enabled
            16:51:27:Lustre: Setting parameter lustre-MDT0000.mdt.som in log lustre-MDT0000
            16:51:27:Lustre: Skipped 5 previous similar messages
            16:51:27:Lustre: DEBUG MARKER: grep -c /mnt/mds1' ' /proc/mounts
            16:51:27:Lustre: DEBUG MARKER: umount -d -f /mnt/mds1
            16:51:27:LustreError: 3509:0:(client.c:1076:ptlrpc_import_delay_req()) @@@ IMP_CLOSED   req@ffff88004efcd000 x1447637735940204/t0(0) o13->lustre-OST0000-osc-MDT0000@10.10.16.121@tcp:7/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1
            16:51:27:LustreError: 3509:0:(client.c:1076:ptlrpc_import_delay_req()) @@@ IMP_CLOSED   req@ffff88004efcd000 x1447637735940208/t0(0) o13->lustre-OST0002-osc-MDT0000@10.10.16.121@tcp:7/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1
            16:51:27:LustreError: 3509:0:(client.c:1076:ptlrpc_import_delay_req()) @@@ IMP_CLOSED   req@ffff88004efcd000 x1447637735940216/t0(0) o6->lustre-OST0003-osc-MDT0000@10.10.16.121@tcp:28/4 lens 664/432 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1
            16:51:27:LustreError: 3509:0:(client.c:1076:ptlrpc_import_delay_req()) Skipped 1 previous similar message
            16:51:27:Lustre: lustre-MDT0000: Not available for connect from 10.10.16.121@tcp (stopping)
            16:51:27:Lustre: lustre-MDT0000: Not available for connect from 10.10.16.121@tcp (stopping)
            16:51:27:LustreError: 3508:0:(client.c:1076:ptlrpc_import_delay_req()) @@@ IMP_CLOSED   req@ffff8800569b5400 x1447637735940228/t0(0) o13->lustre-OST0004-osc-MDT0000@10.10.16.121@tcp:7/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1
            16:51:27:Lustre: 15981:0:(client.c:1897:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1380581444/real 1380581444]  req@ffff8800569b5400 x1447637735940248/t0(0) o251->MGC10.10.16.120@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1380581450 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
            16:51:27:LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 10.10.16.121@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
            16:51:27:Lustre: server umount lustre-MDT0000 complete
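
            A decoding aid for the error numbers scattered through these console logs: they are ordinary Linux errno values, so the "-16" in "ost_connect failed with -16" is EBUSY and matches the OST-side "refused reconnection, still busy with 1 active RPCs" message, the "-2" from ost_blocking_ast() is ENOENT, "-19" is ENODEV, and "-107" is ENOTCONN. The following small, self-contained C snippet (no Lustre headers; the list is just the codes seen in this ticket) prints them with their descriptions:

            /* Print the errno values that appear in the logs above. */
            #include <errno.h>
            #include <stdio.h>
            #include <string.h>

            int main(void)
            {
                const int codes[] = { ENOENT, EBUSY, ENODEV, EINVAL, ENOTCONN };

                for (size_t i = 0; i < sizeof(codes) / sizeof(codes[0]); i++)
                    printf("-%d: %s\n", codes[i], strerror(codes[i]));
                return 0;
            }

            On Linux this prints -2, -16, -19, -22 and -107 together with their strerror() descriptions.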
            

            From debug log on OST:

            ...
            1380410772.384659:(ldlm_lock.c:454:lock_handle_free()) slab-freed 'lock': 504 at ffff880025067c80.
            1380410772.386661:(ldlm_lock.c:454:lock_handle_free()) slab-freed 'lock': 504 at ffff88002583e380.
            1380410831.744886:(ofd_objects.c:563:ofd_attr_get()) Process entered
            1380410831.744887:(ofd_objects.c:588:ofd_attr_get()) Process leaving (rc=18446744073709551614 : -2 : fffffffffffffffe)
            1380410831.744889:(lprocfs_jobstats.c:217:lprocfs_job_stats_log()) Process entered
            1380410831.744890:(lprocfs_jobstats.c:224:lprocfs_job_stats_log()) Process leaving (rc=18446744073709551594 : -22 : ffffffffffffffea)
            1380410831.744891:(ofd_obd.c:1456:ofd_sync()) Process leaving
            1380410831.744892:(lustre_fid.h:719:fid_flatten32()) Process leaving (rc=4279240389 : 4279240389 : ff1006c5)
            1380410831.744893:(lustre_fid.h:719:fid_flatten32()) Process leaving (rc=4279240389 : 4279240389 : ff1006c5)
            1380410831.744897:(ofd_dev.c:285:ofd_object_free()) Process entered
            1380410831.744897:(ofd_dev.c:289:ofd_object_free()) object free, fid = [0x100000000:0x17c5:0x0]
            1380410831.744898:(ofd_dev.c:293:ofd_object_free()) slab-freed '(of)': 160 at ffff880026e3e9f0.
            1380410831.744899:(ofd_dev.c:294:ofd_object_free()) Process leaving
            1380410831.744899:(obd_class.h:1326:obd_sync()) Process leaving (rc=18446744073709551614 : -2 : fffffffffffffffe)
            1380410831.744900:(ost_handler.c:1775:ost_blocking_ast()) Error -2 syncing data on lock cancel
            1380410831.745806:(ost_handler.c:1777:ost_blocking_ast()) slab-freed '((oa))': 208 at ffff88002690ca40.
            1380410831.745808:(ost_handler.c:1778:ost_blocking_ast()) kfreed 'oinfo': 112 at ffff880026b61140.
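
            For reading the "(rc=18446744073709551614 : -2 : fffffffffffffffe)" triples in the debug log above: they are one return value printed three ways, as an unsigned 64-bit integer, as a signed integer, and in hex, i.e. (u64)(-2) = 2^64 - 2. A tiny stand-alone C check of that relationship (nothing Lustre-specific):

            #include <inttypes.h>
            #include <stdio.h>

            int main(void)
            {
                int64_t rc = -2;    /* -ENOENT, as returned by ofd_attr_get() above */

                /* Reproduces the debug-log rendering
                 * "rc=18446744073709551614 : -2 : fffffffffffffffe". */
                printf("rc=%" PRIu64 " : %" PRId64 " : %" PRIx64 "\n",
                       (uint64_t)rc, rc, (uint64_t)rc);
                return 0;
            }

            The rc=18446744073709551594 : -22 line from lprocfs_job_stats_log() is the same encoding for -EINVAL (2^64 - 22).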
            

            People

              Assignee: Nathaniel Clark (utopiabound)
              Reporter: Maloo (maloo)
              Votes: 0
              Watchers: 13
