Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-10267

Wrong poll() returned revents for changelog device

    XMLWordPrintable

Details

    • 3
    • 9223372036854775807

    Description

      The following program failed to poll on the changelog device:

           rc = llapi_changelog_start(&chglog_hdlr,
                                         CHANGELOG_FLAG_BLOCK | CHANGELOG_FLAG_JOBID,
                                         mdtname, 0);
           
              fd = llapi_changelog_get_fd(chglog_hdlr);
      
              fds[0].fd = fd;
              fds[0].events = POLLIN;
              rc = llapi_changelog_recv(chglog_hdlr, &rec);
              rc = poll(fds, 1, CHLG_POLL_INTV);
      

      The returned revents are always POLLERR.

      The Lustre log output caused this failure is:

      00000002:20000000:1.0:1511318935.046970:0:110785:0:(mdc_changelog.c:153:chlg_read_cat_process_cb()) 30 13TRUNC 1622762232318656574 0xe t=[0x200000401:0x1:0x0] p=[0x0:0x0:0x0] 
      00000040:00001000:1.0:1511318935.046987:0:110785:0:(llog.c:513:llog_process_thread()) processing rec 0xffff880012e1ae10 type 0x10660000
      00000040:00001000:1.0:1511318935.046988:0:110785:0:(llog.c:519:llog_process_thread()) after swabbing, type=0x10660000 idx=31
      00000040:00001000:1.0:1511318935.046988:0:110785:0:(llog.c:595:llog_process_thread()) lrh_index: 31 lrh_len: 120 (4592 remains)
      00000002:20000000:1.0:1511318935.047002:0:110785:0:(mdc_changelog.c:153:chlg_read_cat_process_cb()) 31 11CLOSE 1622762232427964466 0x242 t=[0x200000401:0x1:0x0] p=[0x0:0x0:0x0] 
      00000040:00001000:1.0:1511318935.047004:0:110785:0:(llog.c:513:llog_process_thread()) processing rec 0xffff880012e1ae88 type 0x10660000
      00000040:00001000:1.0:1511318935.047004:0:110785:0:(llog.c:519:llog_process_thread()) after swabbing, type=0x10660000 idx=32
      00000040:00001000:1.0:1511318935.047004:0:110785:0:(llog.c:595:llog_process_thread()) lrh_index: 32 lrh_len: 120 (4472 remains)
      00000002:20000000:1.0:1511318935.047004:0:110785:0:(mdc_changelog.c:153:chlg_read_cat_process_cb()) 32 13TRUNC 1622762281381438112 0xe t=[0x200000401:0x1:0x0] p=[0x0:0x0:0x0] 
      00000040:00001000:1.0:1511318935.047005:0:110785:0:(llog.c:513:llog_process_thread()) processing rec 0xffff880012e1af00 type 0x10660000
      00000040:00001000:1.0:1511318935.047005:0:110785:0:(llog.c:519:llog_process_thread()) after swabbing, type=0x10660000 idx=33
      00000040:00001000:1.0:1511318935.047006:0:110785:0:(llog.c:595:llog_process_thread()) lrh_index: 33 lrh_len: 120 (4352 remains)
      00000002:20000000:1.0:1511318935.047006:0:110785:0:(mdc_changelog.c:153:chlg_read_cat_process_cb()) 33 11CLOSE 1622762281387196858 0x242 t=[0x200000401:0x1:0x0] p=[0x0:0x0:0x0] 
      00000040:00001000:1.0:1511318935.047007:0:110785:0:(llog.c:513:llog_process_thread()) processing rec 0xffff880012e1af78 type 0x0
      00000040:00001000:1.0:1511318935.047007:0:110785:0:(llog.c:519:llog_process_thread()) after swabbing, type=0x0 idx=0
      00000040:00001000:1.0:1511318935.047009:0:110785:0:(llog.c:513:llog_process_thread()) processing rec 0xffff880012ce4040 type 0x0
      00000040:00001000:1.0:1511318935.047009:0:110785:0:(llog.c:519:llog_process_thread()) after swabbing, type=0x0 idx=0
      00000040:00001000:1.0:1511318935.047010:0:110785:0:(llog.c:556:llog_process_thread()) Re-read last llog buffer for new records, index 2, last 1
      00000040:00001000:0.0:1511318935.047146:0:85780:0:(llog_osd.c:859:llog_osd_next_block()) looking for log index 2 (cur idx 1 off8192), size 8256
      00000040:00001000:1.0:1511318935.047240:0:110785:0:(llog.c:483:llog_process_thread()) cur_offset 8192, chunk_offset 8192, buf_offset 64, rc = -5
      00000002:00020000:1.0:1511318935.047242:0:110785:0:(mdc_changelog.c:244:chlg_load()) lustre-MDT0000-mdc-ffff88000cabe000: fail to process llog: rc = -5
      

      And this bug can be easy reproduced by the following commands:

      lctl --device lustre-MDT0000 changelog_register
      dd if=/dev/zero of=/mnt/lustre/test bs=1M count=2
      lfs changelog lustre-MDT0000
      

      ,

      Attachments

        Issue Links

          Activity

            People

              bfaccini Bruno Faccini (Inactive)
              qian Qian Yingjin (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: