LU-12833

lustre_lwp_add_conn can't find lwp device

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.14.0
    • Affects Version/s: Lustre 2.13.0
    • Environment: After upgrade from 2.5 version
    • Severity: 3

    Description

      LWP uses the fsname-client config log to create its devices.
      The error occurred while processing record #25:

      00000040:00001000:2.0:1569263809.706616:0:96423:0:(llog.c:603:llog_process_thread()) lrh_index: 25 lrh_len: 112 (4304 remains)
      00000020:00000001:2.0:1569263809.706616:0:96423:0:(obd_mount_server.c:804:client_lwp_config_process()) Process entered
      00000020:00000001:2.0:1569263809.706617:0:96423:0:(obd_mount_server.c:738:lustre_lwp_add_conn()) Process entered
      00000020:00000001:2.0:1569263809.706618:0:96423:0:(obd_mount_server.c:700:lustre_find_lwp()) Process entered
      00000020:00000010:2.0:1569263809.706618:0:96423:0:(obd_mount_server.c:705:lustre_find_lwp()) kmalloced '*lwpname': 64 at ffff8807c129bd40.
      00000020:00000001:2.0:1569263809.706619:0:96423:0:(obd_mount_server.c:346:tgt_name2lwp_name()) Process entered
      00000020:00000010:2.0:1569263809.706619:0:96423:0:(obd_mount_server.c:348:tgt_name2lwp_name()) kmalloced 'fsname': 64 at ffff8807c129b780.
      00000020:00000001:2.0:1569263809.706620:0:96423:0:(obd_mount_server.c:372:tgt_name2lwp_name()) Process leaving via cleanup (rc=0 : 0 : 0x0)
      00000020:00000010:2.0:1569263809.706621:0:96423:0:(obd_mount_server.c:376:tgt_name2lwp_name()) kfreed 'fsname': 64 at ffff8807c129b780.
      00000020:00000001:2.0:1569263809.706656:0:96423:0:(obd_mount_server.c:727:lustre_find_lwp()) Process leaving (rc=18446744073709551614 : -2 : fffffffffffffffe)
      00000020:00020000:2.0:1569263809.706657:0:96423:0:(obd_mount_server.c:742:lustre_lwp_add_conn()) xxx1116-OST0000: can't find lwp device.
      00000020:00000001:2.0:1569263809.706660:0:96423:0:(obd_mount_server.c:743:lustre_lwp_add_conn()) Process leaving via out (rc=18446744073709551614 : -2 : 0xfffffffffffffffe)
      00000020:00000010:2.0:1569263809.706661:0:96423:0:(obd_mount_server.c:772:lustre_lwp_add_conn()) kfreed 'lwpname': 64 at ffff8807c129bd40.
      00000020:00000001:2.0:1569263809.706662:0:96423:0:(obd_mount_server.c:773:lustre_lwp_add_conn()) Process leaving (rc=18446744073709551614 : -2 : fffffffffffffffe)
      00000020:00000001:2.0:1569263809.706663:0:96423:0:(obd_mount_server.c:902:client_lwp_config_process()) Process leaving (rc=18446744073709551614 : -2 : fffffffffffffffe)
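
Each "Process leaving" line in the trace prints the return code three ways: as an unsigned 64-bit value, as a signed value, and in hex. The following is a minimal sketch of that conversion, not part of Lustre; the helper name `to_signed64` is made up for illustration. It shows that 18446744073709551614 is -2, i.e. -ENOENT on Linux.

```python
import errno


def to_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit integer as a two's-complement signed one."""
    value &= 0xFFFFFFFFFFFFFFFF
    return value - (1 << 64) if value >= (1 << 63) else value


# The rc printed by lustre_find_lwp() and its callers in the trace.
rc = to_signed64(18446744073709551614)
print(rc)

# Kernel code returns negative errno values, so look up the positive number.
print(errno.errorcode.get(-rc))
```

On Linux the lookup resolves to ENOENT ("No such file or directory"), which matches the "can't find lwp device" message.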
      

      Based on the config log, this record should have been skipped:

      #19 (224)END marker 7 (flags=0x02, v2.5.1.0) xxx1116-client 'mount opts' Thu Oct 1 15:29:07 2015-
      #20 (224)SKIP START marker 11 (flags=0x05, v2.5.1.0) xxx1116-MDT0002 'add mdc' Thu Oct 1 15:55:50 2015-Thu Oct 1 16:22:50 2015
      #21 (088)SKIP add_uuid nid=10.10.10.8@o2ib(0x500000a956a08) 0: 1:10.10.10.8@o2ib
      #22 (128)SKIP attach 0:xxx1116-MDT0002-mdc 1:mdc 2:xxx1116-clilmv_UUID
      #23 (144)SKIP setup 0:xxx1116-MDT0002-mdc 1:xxx1116-MDT0002_UUID 2:10.10.10.8@o2ib
      #24 (088)SKIP add_uuid nid=10.10.10.7@o2ib(0x500000a956a07) 0: 1:10.10.10.7@o2ib
      #25 (112)SKIP add_conn 0:xxx1116-MDT0002-mdc 1:10.10.10.7@o2ib
      #26 (088)SKIP add_uuid nid=10.10.10.7@o2ib(0x500000a956a07) 0: 1:10.10.10.7@o2ib
      #27 (112)SKIP add_conn 0:xxx1116-MDT0002-mdc 1:10.10.10.7@o2ib
      #28 (168)SKIP modify_mdc_tgts add 0:xxx1116-clilmv 1:xxx1116-MDT0002_UUID 2:2 3:1 4:xxx1116-MDT0002-mdc_UUID
      #29 (224)SKIP END marker 11 (flags=0x06, v2.5.1.0) xxx1116-MDT0002 'add mdc' Thu Oct 1 15:55:50 2015-Thu Oct 1 16:22:50 2015
      #30 (224)marker 12 (flags=0x01, v2.5.1.0) xxx1116-client 'mount opts' Thu Oct 1 15:55:50 2015-
      

      It looks like client_lwp_config_process() has a bug: it processes an add_conn record without having processed the preceding add_uuid record.
      For a marker record, it skips the record if the SKIP flag is set. For add_uuid, it relies on the flags set during marker processing, so that record is skipped too. For add_conn, however, it processes the record, tries to find the LWP device, and fails, because the device is only added by the add_uuid record.
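
The behaviour described above can be illustrated with a small Python model. This is a simplified sketch, not the Lustre source: the `Record` type, the `first_failure` function and the `honor_skip_for_add_conn` switch are hypothetical, and the flag bits (0x01 start, 0x02 end, 0x04 skip) are inferred from the flags=0x05 and flags=0x06 markers in the dump. Following the description, the model treats add_uuid as the step that makes the device known.

```python
from dataclasses import dataclass
from typing import List, Optional

# Marker flag bits, inferred from the dump: 0x05 = SKIP START, 0x06 = SKIP END.
CM_START, CM_END, CM_SKIP = 0x01, 0x02, 0x04


@dataclass
class Record:
    index: int
    cmd: str            # "marker", "add_uuid" or "add_conn"
    flags: int = 0      # only meaningful for markers
    device: str = ""    # simplified: the device this record relates to


def first_failure(records: List[Record], honor_skip_for_add_conn: bool) -> Optional[int]:
    """Return the index of the first add_conn that cannot find its device."""
    devices = set()     # stands in for the devices known so far
    skipping = False
    for rec in records:
        if rec.cmd == "marker":
            if rec.flags & CM_START:
                skipping = bool(rec.flags & CM_SKIP)
            elif rec.flags & CM_END:
                skipping = False
        elif rec.cmd == "add_uuid":
            if not skipping:            # skipped, so the device never appears
                devices.add(rec.device)
        elif rec.cmd == "add_conn":
            if skipping and honor_skip_for_add_conn:
                continue                # the behaviour the report expects
            if rec.device not in devices:
                return rec.index        # models the -ENOENT failure
    return None


records = [
    Record(20, "marker", flags=CM_START | CM_SKIP),
    Record(24, "add_uuid", device="xxx1116-MDT0002"),
    Record(25, "add_conn", device="xxx1116-MDT0002"),
    Record(29, "marker", flags=CM_END | CM_SKIP),
]

print(first_failure(records, honor_skip_for_add_conn=False))  # 25
print(first_failure(records, honor_skip_for_add_conn=True))   # None
```

With the skip state ignored for add_conn, the model fails at record #25, the same record as in the trace; honoring it lets the whole skipped block pass.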

      The workaround is to clean the SKIP records out of the client config.
      The lctl clear_conf command or a writeconf should also help.

      Only SKIP records for the 'add mdc' command break LWP config processing.
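
To check whether a client config log is affected, the dump can be scanned for skipped 'add mdc' blocks. The sketch below assumes the textual dump format shown in this report (`#index (length)[SKIP ]record`); the function name `skipped_add_mdc_records` and the regular expression are illustrative, not part of any Lustre tool.

```python
import re

# "#25 (112)SKIP add_conn ..." -> index, length, optional SKIP, record body
RECORD_RE = re.compile(r"^#(\d+)\s+\((\d+)\)(SKIP\s+)?(.*)$")


def skipped_add_mdc_records(dump: str):
    """Return indices of SKIP records inside a skipped 'add mdc' marker block."""
    hits = []
    in_block = False
    for line in dump.splitlines():
        m = RECORD_RE.match(line.strip())
        if not m:
            continue
        index, skipped, body = int(m.group(1)), bool(m.group(3)), m.group(4)
        if skipped and body.startswith("START marker") and "'add mdc'" in body:
            in_block = True
        if in_block and skipped:
            hits.append(index)
        if skipped and body.startswith("END marker"):
            in_block = False
    return hits


sample = """\
#19 (224)END marker 7 (flags=0x02, v2.5.1.0) xxx1116-client 'mount opts' Thu Oct 1 15:29:07 2015-
#20 (224)SKIP START marker 11 (flags=0x05, v2.5.1.0) xxx1116-MDT0002 'add mdc' Thu Oct 1 15:55:50 2015-Thu Oct 1 16:22:50 2015
#24 (088)SKIP add_uuid nid=10.10.10.7@o2ib(0x500000a956a07) 0: 1:10.10.10.7@o2ib
#25 (112)SKIP add_conn 0:xxx1116-MDT0002-mdc 1:10.10.10.7@o2ib
#29 (224)SKIP END marker 11 (flags=0x06, v2.5.1.0) xxx1116-MDT0002 'add mdc' Thu Oct 1 15:55:50 2015-Thu Oct 1 16:22:50 2015
#30 (224)marker 12 (flags=0x01, v2.5.1.0) xxx1116-client 'mount opts' Thu Oct 1 15:55:50 2015-
"""

print(skipped_add_mdc_records(sample))  # [20, 24, 25, 29]
```

An empty result means the dump contains no skipped 'add mdc' block in this format.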

          People

            aboyko Alexander Boyko