
ENOSPC when using conf_param on fs with a lot of OSTs

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Minor
    • None
    • Lustre 2.4.1
    • None
    • 3
    • 11621

    Description

      At TACC we are running into an issue where the config log is overflowing due to the large number of OSTs, and conf_param on the MDT and clients is failing with ENOSPC. I tracked the issue down to llog_osd_write_rec:

      00000040:00000001:3.0:1384532145.376864:0:12518:0:(llog_osd.c:336:llog_osd_write_rec()) Process entered
      00000040:00001000:3.0:1384532145.376865:0:12518:0:(llog_osd.c:346:llog_osd_write_rec()) new record 10620000 to [0xa:0x419:0x0]
      00000040:00000001:3.0:1384532145.376867:0:12518:0:(llog_osd.c:440:llog_osd_write_rec()) Process leaving (rc=18446744073709551588 : -28 : ffffffffffffffe4)
      

      It looks like it's hitting this:

          /* if it's the last idx in log file, then return -ENOSPC */
          if (loghandle->lgh_last_idx >= LLOG_BITMAP_SIZE(llh) - 1)
              RETURN(-ENOSPC);
      

      Here is the CONFIGs directory for reference:

      total 31212
      -rw-r--r-- 1 root root 11674048 Sep 24 13:39 gsfs-client
      -rw-r--r-- 1 root root 11980704 Sep 24 13:39 gsfs-MDT0000
      -rw-r--r-- 1 root root     9432 Sep 24 13:48 gsfs-OST0000
      -rw-r--r-- 1 root root     9432 Sep 24 13:49 gsfs-OST0001
      ...
      -rw-r--r-- 1 root root     9432 Oct  1 12:15 gsfs-OST029f
      -rw-r--r-- 1 root root    12288 Sep 24 13:39 mountdata
      

      Is there a way to increase the BITMAP_SIZE? It looks like the bitmap itself is sized from the CHUNK_SIZE, but the LLOG_BITMAP_SIZE macro doesn't reference it at all.

      Thanks.

      Attachments

        Activity

          [LU-4258] ENOSPC when using conf_param on fs with a lot of OSTs
          pjones Peter Jones added a comment -

          ok thanks Manish


          manish Manish Patel (Inactive) added a comment -

          Hi,

          Since the customer has moved to Lustre v2.4.3, this ticket can be closed.

          Thank You,
          Manish


          kitwestneat Kit Westneat (Inactive) added a comment -

          This site also has about 12 OST pools. How do the OST pools get stored? Is it one record per pool, or one record per OST in the pool? I am assuming that set_param -P won't include these.


          adilger Andreas Dilger added a comment -

          It looks like there are 672 OSTs in the filesystem. That would mean ~100 tunables set per OST to hit the ~65000-record limit of BITMAP_SIZE (slightly less than the number of bits that fit in an 8kB block). Unfortunately, it isn't possible to increase this easily, since the 8kB CHUNK_SIZE is a network protocol limit for the llog code.

          As a workaround it would be possible to run --writeconf on all the servers (to erase the config log on the MGS and then regenerate it), but that isn't a real solution to the problem. In Lustre 2.5+ it is possible to use "lctl set_param -P" to store permanent parameters in a separate log from the --writeconf log (though this is only usable by 2.5+ clients), so this would give you at least some more space. It would be better to allow a larger configuration log, since this has been hit by others in the past as well.

          green Oleg Drokin added a comment -

          No, I have not seen this before.

          "new record 10620000" - what are you doing to have so many config records? Changing some parameters frequently?

          Meanwhile if you run lctl write_conf, that should clear the config logs.

          pjones Peter Jones added a comment -

          Oleg

          Is this something that you have seen?

          Peter


          kitwestneat Kit Westneat (Inactive) added a comment -

          The title should say "lot of OSTs"; typo on my part.


          People

            green Oleg Drokin
            manish Manish Patel (Inactive)
            Votes: 0
            Watchers: 5
