
sanityn test_8: sanityn fails: Protocol error

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Minor
    • Severity: 3

    Description

      This issue was created by maloo for Bob Glossman <bob.glossman@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/9e3e0324-ef35-11e4-9a16-5254006e85c2.

      The sub-test test_8 failed with the following error:

      opendevunlink /mnt/lustre/f8.sanityn /mnt/lustre2/f8.sanityn
      

      This looks a bit like LU-4164, but it wasn't seen during interop tests,
      so I'm raising it as a new bug. An expert may decide it's a duplicate.

      Info required for matching: sanityn 8

          Activity

            [LU-6549] sanityn test_8: sanityn fails: Protocol error
            Yang Sheng added a comment:

            Client sent a getattr intent lock:

            dc-ffff88006d706c00@10.1.5.42@tcp:12/10 lens 576/536 e 0 to 0 dl 1440136071 ref 1 fl Complete:R/0/0 rc 301/301
            00000080:00200000:1.0:1440136064.527419:0:27534:0:(file.c:3269:__ll_inode_revalidate_it()) VFS Op:inode=144115473691181067/33554498(ffff88006913a678),name=f8.sanityn
            00000002:00010000:1.0:1440136064.527423:0:27534:0:(mdc_locks.c:1180:mdc_intent_lock()) (name: ,[0x200004280:0xb:0x0]) in obj [0x200004280:0xb:0x0], intent: getattr flags 00
            

            Server side:

            00000100:00100000:1.0:1440136064.528072:0:7507:0:(service.c:2032:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc mdt00_000:0c7867ac-7f59-ff9b-f284-8f7aa2fff179+15:27534:x1510092010835688:12345-10.1.5.40@tcp:101
            00010000:00010000:1.0:1440136064.528078:0:7507:0:(ldlm_lockd.c:1185:ldlm_handle_enqueue0()) ### server-side enqueue handler START
            00010000:00010000:1.0:1440136064.528085:0:7507:0:(ldlm_lockd.c:1273:ldlm_handle_enqueue0()) ### server-side enqueue handler, new lock created ns: mdt-lustre-MDT0000_UUID lock: ffff88006d0c85c0/0xf49afa6507fd3114 lrc: 2/0,0 mode: --/CR res: [0x200004280:0xb:0x0].0 bits 0x0 rrc: 1 type: IBT flags: 0x40000000000000 nid: local remote: 0x2e111d232a85683a expref: -99 pid: 7507 timeout: 0 lvb_type: 0
            00010000:00010000:1.0:1440136064.528127:0:7507:0:(ldlm_lockd.c:1411:ldlm_handle_enqueue0()) ### server-side enqueue handler, sending reply(err=0, rc=-71) ns: mdt-lustre-MDT0000_UUID lock: ffff88006d0c85c0/0xf49afa6507fd3114 lrc: 1/0,0 mode: --/CR res: [0x200004280:0xb:0x0].0 bits 0x2 rrc: 1 type: IBT flags: 0x44000000000000 nid: 10.1.5.40@tcp remote: 0x2e111d232a85683a expref: 16 pid: 7507 timeout: 0 lvb_type: 0
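
The debug-log lines above share a colon-separated prefix followed by a `(file:line:function())` tag and a free-form message. As a reading aid, here is a minimal Python sketch that splits such a line into fields. The field names (`subsys`, `mask`, `cpu`, `stack`, `ext_pid`) are assumptions inferred from the sample lines, and `parse_line` is a hypothetical helper, not part of any Lustre tool.

```python
import re
from typing import NamedTuple, Optional


class DebugRecord(NamedTuple):
    subsys: str      # hex subsystem mask (assumed meaning)
    mask: str        # hex debug mask (assumed meaning)
    cpu: str         # CPU field as printed, e.g. "1.0"
    timestamp: float # seconds.microseconds
    stack: int       # assumed meaning: stack usage
    pid: int         # thread/process id
    ext_pid: int     # assumed meaning: external pid
    file: str
    line: int
    func: str
    msg: str


LINE_RE = re.compile(
    r"^(?P<subsys>[0-9a-f]{8}):(?P<mask>[0-9a-f]{8}):(?P<cpu>[\d.]+):"
    r"(?P<ts>\d+\.\d+):(?P<stack>\d+):(?P<pid>\d+):(?P<ext_pid>\d+):"
    r"\((?P<file>[^:]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s?(?P<msg>.*)$"
)


def parse_line(text: str) -> Optional[DebugRecord]:
    """Return a DebugRecord, or None if the line does not match the layout."""
    m = LINE_RE.match(text.strip())
    if m is None:
        return None
    return DebugRecord(
        subsys=m["subsys"], mask=m["mask"], cpu=m["cpu"],
        timestamp=float(m["ts"]), stack=int(m["stack"]),
        pid=int(m["pid"]), ext_pid=int(m["ext_pid"]),
        file=m["file"], line=int(m["line"]), func=m["func"], msg=m["msg"],
    )


# Shortened copy of the last server-side line quoted in this comment.
sample = (
    "00010000:00010000:1.0:1440136064.528127:0:7507:0:"
    "(ldlm_lockd.c:1411:ldlm_handle_enqueue0()) "
    "### server-side enqueue handler, sending reply(err=0, rc=-71)"
)
rec = parse_line(sample)
print(rec.func, rec.pid, rec.msg)
```

Filtering parsed records by `pid` (7507 here) is one way to follow a single service thread through the server log.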
            

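            So it looks like the real error code is hidden in mdt_intent_opc(): the enqueue reply carries only rc=-71 (EPROTO, the "Protocol error" in the title). We may be able to catch it after http://review.whamcloud.com/#/c/11650/.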
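
For reference, the `rc=-71` in the server reply is a negated Linux errno. The sketch below pulls `err` and `rc` out of a "sending reply" message and names the errno. The lookup table is hard-coded with Linux values on purpose, since errno numbering differs between operating systems; `decode_reply` and `LINUX_ERRNO` are hypothetical names used only for illustration.

```python
import re
from typing import Optional, Tuple

# Small, deliberately incomplete table of Linux errno values.
LINUX_ERRNO = {2: "ENOENT", 5: "EIO", 71: "EPROTO"}

REPLY_RE = re.compile(r"sending reply\(err=(-?\d+), rc=(-?\d+)\)")


def decode_reply(msg: str) -> Optional[Tuple[int, int, Optional[str]]]:
    """Return (err, rc, errno_name) from a 'sending reply' message.

    errno_name is None when rc is non-negative or not in the table.
    """
    m = REPLY_RE.search(msg)
    if m is None:
        return None
    err, rc = int(m.group(1)), int(m.group(2))
    name = LINUX_ERRNO.get(-rc) if rc < 0 else None
    return err, rc, name


result = decode_reply(
    "### server-side enqueue handler, sending reply(err=0, rc=-71) "
    "ns: mdt-lustre-MDT0000_UUID"
)
print(result)
```

This only names the code seen in the reply; it says nothing about which path inside mdt_intent_opc() produced it.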


            People

              Assignee: WC Triage
              Reporter: Maloo