Lustre / LU-6670

Hard Failover recovery-small test_28: post-failover df: 1

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: Lustre 2.8.0
    • Fix Version/s: None
    • Labels:
      None
    • Environment:
      lustre-master build #3029 SLES11 SP3
    • Severity:
      3
    • Rank (Obsolete):
      9223372036854775807

      Description

      This issue was created by maloo for sarah_lw <wei3.liu@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/20c5cc8e-ff3d-11e4-a4ed-5254006e85c2.

      The sub-test test_28 failed with the following error:

      post-failover df: 1
      

      client dmesg

      [192506.215454] Lustre: DEBUG MARKER: == recovery-small test 28: handle error adding new clients (bug 6086) ================================ 04:02:23 (1432119743)
      [192506.310965] Lustre: DEBUG MARKER: mcreate /mnt/lustre/f28.recovery-small
      [192506.324746] Lustre: DEBUG MARKER: lctl set_param ldlm.namespaces.*.early_lock_cancel=0
      [192506.331351] Lustre: DEBUG MARKER: lctl set_param fail_loc=0x80000305
      [192506.338932] Lustre: DEBUG MARKER: chmod 0777 /mnt/lustre/f28.recovery-small
      [192506.349402] Lustre: *** cfs_fail_loc=305, val=0***
      [192506.349407] Lustre: Skipped 2 previous similar messages
      [192506.377737] Lustre: DEBUG MARKER: lctl set_param fail_loc=0
      [192506.388914] Lustre: DEBUG MARKER: lctl set_param fail_val=0
      [192506.398826] Lustre: DEBUG MARKER: lctl set_param ldlm.namespaces.*.early_lock_cancel=1
      [192506.635438] LustreError: 167-0: lustre-MDT0000-mdc-ffff880070b36000: This client was evicted by lustre-MDT0000; in progress operations using this service will fail.
      [192506.635441] LustreError: Skipped 8 previous similar messages
      [192506.635853] LustreError: 7641:0:(vvp_io.c:1444:vvp_io_init()) lustre: refresh file layout [0x200029440:0x28af:0x0] error -5.
      [192506.655993] LustreError: 15736:0:(ldlm_resource.c:776:ldlm_resource_complain()) lustre-MDT0000-mdc-ffff880070b36000: namespace resource [0x200029440:0x299b:0x0].0 (ffff88005f6ee1c0) refcount nonzero (1) after lock cleanup; forcing cleanup.
      [192506.655998] LustreError: 15736:0:(ldlm_resource.c:1369:ldlm_resource_dump()) --- Resource: [0x200029440:0x299b:0x0].0 (ffff88005f6ee1c0) refcount = 2
      [192517.136125] LustreError: 166-1: MGC10.1.6.246@tcp: Connection to MGS (at 10.1.6.250@tcp) was lost; in progress operations using this service will fail
      [192537.136108] LustreError: 23405:0:(mgc_request.c:527:do_requeue()) failed processing log: -5
      [192557.136202] LustreError: 23405:0:(mgc_request.c:527:do_requeue()) failed processing log: -5
      [192565.200662] Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/mpi/gcc/openmpi/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/l
      [192565.453472] Lustre: DEBUG MARKER: lctl get_param -n at_max
      [192565.516470] Lustre: DEBUG MARKER: /usr/sbin/lctl mark mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
      [192565.692596] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
      [192577.140250] Lustre: Evicted from MGS (at 10.1.6.246@tcp) after server handle changed from 0x7860df5639e24149 to 0xd44806558079509b
      [192582.157200] Lustre: 17892:0:(client.c:2755:ptlrpc_replay_interpret()) @@@ Version mismatch during replay
      [192582.157203]   req@ffff88005f4a1cc0 x1501485024619144/t665719930885(665719930885) o36->lustre-MDT0000-mdc-ffff880070b36000@10.1.6.246@tcp:12/10 lens 504/424 e 0 to 0 dl 1432119950 ref 2 fl Interpret:R/4/0 rc -75/-75
      [192713.156228] Lustre: 17892:0:(import.c:1293:completed_replay_interpret()) lustre-MDT0000-mdc-ffff880070b36000: version recovery fails, reconnecting
      [192713.158878] LustreError: 16488:0:(lmv_obd.c:1474:lmv_statfs()) can't stat MDS #0 (lustre-MDT0000-mdc-ffff880070b36000), error -5
      [192713.158895] LustreError: 16488:0:(llite_lib.c:1762:ll_statfs_internal()) md_statfs fails: rc = -5
      [192713.159013] LustreError: 7641:0:(vvp_io.c:1444:vvp_io_init()) lustre: refresh file layout [0x200029440:0x2e5a:0x0] error -5.
      [192713.159022] LustreError: 7641:0:(vvp_io.c:1444:vvp_io_init()) Skipped 115 previous similar messages
      [192713.184101] LustreError: 16492:0:(ldlm_resource.c:776:ldlm_resource_complain()) lustre-MDT0000-mdc-ffff880070b36000: namespace resource [0x200029440:0x2f6b:0x0].0 (ffff88005f6f1300) refcount nonzero (1) after lock cleanup; forcing cleanup.
      [192713.184105] LustreError: 16492:0:(ldlm_resource.c:1369:ldlm_resource_dump()) --- Resource: [0x200029440:0x2f6b:0x0].0 (ffff88005f6f1300) refcount = 1
      [192713.184261] Lustre: lustre-MDT0000-mdc-ffff880070b36000: Connection restored to lustre-MDT0000 (at 10.1.6.246@tcp)
      [192713.184263] Lustre: Skipped 20 previous similar messages
      [192713.272121] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  recovery-small test_28: @@@@@@ FAIL: post-failover df: 1 
      [192713.469987] Lustre: DEBUG MARKER: recovery-small test_28: @@@@@@ FAIL: post-failover df: 1
      [192713.728348] Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /logdir/test_logs/2015-05-18/lustre-master-el6_6-x86_64-vs-lustre-master-sles11sp3-x86_64--failover--2_9_1__3029__-70061678429280-004551/recovery-small.test_28.debug_log.$(hostname -s).1432119951.log;
      [192713.728351]          dmesg > /logdir/test_logs
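
The negative return codes in the dmesg excerpt above are kernel errno values. As a reading aid, here is a minimal sketch that pulls the timestamp and return code out of a log line and maps the code to its symbolic name. The helper name `decode_line`, the regular expressions, and the two-entry table are illustrative assumptions, not part of Lustre or its test framework; the table uses Linux errno numbering and covers only the codes seen in this log.

```python
import re

# Linux errno numbering (assumed); only the codes that appear in this log.
ERRNO_NAMES = {5: "EIO", 75: "EOVERFLOW"}

# Leading dmesg timestamp, e.g. "[192713.158878]".
TS_RE = re.compile(r"^\s*\[\s*(\d+\.\d+)\]")
# Negative return code following "error", "rc", "rc =" or "log:".
RC_RE = re.compile(r"(?:error|rc(?: =)?|log:)\s+-(\d+)")


def decode_line(line):
    """Return (timestamp, [errno names]) for one dmesg line, or None
    if the line has no timestamp or no recognizable return code."""
    ts = TS_RE.match(line)
    codes = RC_RE.findall(line)
    if not ts or not codes:
        return None
    names = [ERRNO_NAMES.get(int(c), "errno %s" % c) for c in codes]
    return float(ts.group(1)), names


if __name__ == "__main__":
    sample = ("[192713.158895] LustreError: 16488:0:"
              "(llite_lib.c:1762:ll_statfs_internal()) md_statfs fails: rc = -5")
    print(decode_line(sample))
```

Read this way, the statfs failure that makes the post-failover df fail is an I/O error (-5), while the replayed request flagged with "Version mismatch during replay" returns -75.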
      
      

              People

              • Assignee:
                Hongchao Zhang (hongchao.zhang)
              • Reporter:
                Maloo (maloo)
              • Votes:
                0
              • Watchers:
                6
