Lustre / LU-1910

OSS kernel panics after upgrade

Details

    • Bug
    • Resolution: Not a Bug
    • Major
    • None
    • Lustre 1.8.8
    • None
    • Sun Fire x4540 server, 48 internal 1TB disks, Lustre-patched kernel (kernel-2.6.18-308.4.1.el5), Lustre 1.8.8
    • 3
    • 10643

    Description

      Since our recent upgrade to 1.8.8, we've been experiencing problems with the md subsystem. Our OSTs are built as 8+2 RAID6 metadevices using the mdadm utility.
      Every Sunday morning, cron.weekly runs the raid.check scripts, which start a re-sync. If the re-sync hits a medium error, the md subsystem hangs; for example, "cat /proc/mdstat" hangs. The load on the server immediately starts climbing until the server becomes unusable and we have to reboot the OSS.
      What could be causing this, and should we be running raid.check on the OST metadevices?
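
      To help correlate the hang with the weekly check, here is a minimal sketch that reports which md arrays show a check or re-sync in progress, given the text of /proc/mdstat. The function name, the regular expressions, and the sample text are illustrative assumptions and are not taken from this report. Because reading /proc/mdstat itself hung in the situation described above, the sketch works on a saved copy of the output instead of opening the file.

```python
import re

# Progress line printed by md while a sync-type operation is running, e.g.
#   [==>......]  check = 12.3% (1200/9767) finish=300.1min speed=47000K/sec
PROGRESS = re.compile(r"\b(check|resync|recovery|reshape)\s*=\s*([\d.]+)%")
# Start of an array stanza, e.g. "md10 : active raid6 ..."
ARRAY = re.compile(r"^(md\S*)\s*:")

# Hypothetical sample for illustration only; not captured from the reporter's OSS.
SAMPLE_MDSTAT = """\
Personalities : [raid6] [raid5] [raid4]
md10 : active raid6 sdk[0] sdl[1] sdm[2] sdn[3] sdo[4] sdp[5] sdq[6] sdr[7] sds[8] sdt[9]
      7814099968 blocks level 6, 64k chunk, algorithm 2 [10/10] [UUUUUUUUUU]
      [==>..................]  check = 12.3% (120160000/976762496) finish=300.1min speed=47000K/sec

md11 : active raid6 sdu[0] sdv[1] sdw[2] sdx[3] sdy[4] sdz[5] sdaa[6] sdab[7] sdac[8] sdad[9]
      7814099968 blocks level 6, 64k chunk, algorithm 2 [10/10] [UUUUUUUUUU]

unused devices: <none>
"""


def sync_activity(mdstat_text):
    """Return {array_name: (operation, percent_done)} for arrays that are busy."""
    busy = {}
    current = None
    for line in mdstat_text.splitlines():
        stanza = ARRAY.match(line)
        if stanza:
            # Remember which array the following indented lines belong to.
            current = stanza.group(1)
            continue
        progress = PROGRESS.search(line)
        if progress and current:
            busy[current] = (progress.group(1), float(progress.group(2)))
    return busy


if __name__ == "__main__":
    for name, (operation, percent) in sorted(sync_activity(SAMPLE_MDSTAT).items()):
        print(f"{name}: {operation} at {percent}%")
```

      Running the same function over periodic snapshots saved before the hang would show which metadevice was being checked at the time, which may help match it to the disk that logged the medium error.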

          People

            Assignee: Oleg Drokin (green)
            Reporter: Hellen (hellenn, Inactive)
            Votes: 0
            Watchers: 3