  LU-6695

Jobstats breaks when "Too long env variable." errors occur

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.8.0
    • Affects Version/s: Lustre 2.5.3
    • Labels: None
    • Severity: 3
    • 9223372036854775807

    Description

      We have "Too long env variable" errors on a Lustre cluster at Stanford leading to broken JobStats report (using SLURM_JOB_ID). Jobids associated with processes reporting these errors are just ignored:

      LNetError: 15288:0:(linux-curproc.c:241:cfs_get_environ()) Too long env variable.
      LNetError: 15288:0:(linux-curproc.c:241:cfs_get_environ()) Skipped 2097 previous similar messages
      

      In our case, the user process environ size is a bit more than 32K.
      The problem seems to come from lustre_get_jobid(), which reads the jobid from the process environ when jobstats is enabled, but cfs_get_environ() is not able to handle a large environ (which may be wise). However, we think a user shouldn't be able to disable jobstats like that. A change to cfs_get_environ() might not be enough. Please advise.
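
      To illustrate the failure mode we suspect, here is a minimal userspace sketch (this is not the cfs_get_environ() kernel code; the 32K buffer size and the helper names are assumptions chosen to mirror our observation): a lookup that requires the whole environ to fit into a single fixed buffer gives up as soon as the environ outgrows that buffer, so the jobid variable is never found even though it is set.

      /* Illustrative userspace sketch only -- NOT cfs_get_environ(). */
      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>
      #include <sys/types.h>

      #define ENV_BUF_SIZE (32 * 1024)  /* assumed fixed buffer, mirroring ~32K */

      /* Look up "name" in /proc/<pid>/environ using one fixed buffer.
       * Returns 0 and copies the value into "val" on success, -1 otherwise. */
      static int lookup_env_var(pid_t pid, const char *name, char *val, size_t vlen)
      {
          char path[64];
          static char buf[ENV_BUF_SIZE];
          size_t n, off = 0, namelen = strlen(name);
          FILE *f;

          snprintf(path, sizeof(path), "/proc/%d/environ", (int)pid);
          f = fopen(path, "r");
          if (f == NULL)
              return -1;

          n = fread(buf, 1, sizeof(buf), f);
          /* Environ larger than the buffer: give up, like the reported error. */
          if (n == sizeof(buf) && fgetc(f) != EOF) {
              fclose(f);
              fprintf(stderr, "environ larger than %d bytes, giving up\n", ENV_BUF_SIZE);
              return -1;
          }
          fclose(f);

          /* environ holds NUL-separated "NAME=value" entries */
          while (off < n) {
              char *entry = buf + off;
              char *nul = memchr(entry, '\0', n - off);
              size_t entlen = nul ? (size_t)(nul - entry) : n - off;

              if (entlen > namelen && entry[namelen] == '=' &&
                  strncmp(entry, name, namelen) == 0) {
                  size_t vallen = entlen - namelen - 1;

                  if (vallen >= vlen)
                      vallen = vlen - 1;
                  memcpy(val, entry + namelen + 1, vallen);
                  val[vallen] = '\0';
                  return 0;
              }
              off += entlen + 1;
          }
          return -1;
      }

      int main(int argc, char **argv)
      {
          char jobid[64];

          if (argc < 2) {
              fprintf(stderr, "usage: %s <pid>\n", argv[0]);
              return 1;
          }
          if (lookup_env_var((pid_t)atoi(argv[1]), "SLURM_JOB_ID", jobid, sizeof(jobid)) == 0)
              printf("SLURM_JOB_ID=%s\n", jobid);
          else
              printf("SLURM_JOB_ID not found; jobstats would miss this process\n");
          return 0;
      }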

      Please find below the commands used to track the issue:

      [root@gpu-13-1 ~]# ps uw -q 15288
      USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
      suuser   15288 98.4  6.2 108826468 4144960 ?   Sl   13:55 235:46 terachem run.in
      
      [root@gpu-13-1 ~]# cat /proc/15288/environ | wc -c
      32936
      
      [root@gpu-13-1 ~]# scontrol pidinfo 15288
      Slurm job id 2376464 ends at Sun Jun 07 13:55:09 2015
      slurm_get_rem_time is 159433
      
      [root@gpu-13-1 ~]# squeue -j 2376464
                   JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
                 2376464      slac temp800_   suuser  R    3:43:25      1 gpu-13-1
      
      [root@gpu-13-1 ~]# lsof -p 15288 | grep /scratch
      terachem 15288 suuser    1w   REG 2395,496332    386348 144116383972642817 /scratch/users/suuser/FeC2_catalyst/temp800_noFeECP_nanoFeC2/chunk_0080/run.out
      terachem 15288 suuser    2w   REG 2395,496332        43 144116383972642818 /scratch/users/suuser/FeC2_catalyst/temp800_noFeECP_nanoFeC2/chunk_0080/run.err
      
      [root@gpu-13-1 ~]# ls -l /scratch/users/suuser/FeC2_catalyst/temp800_noFeECP_nanoFeC2/chunk_0080/run.out
      -rw-r--r-- 1 suuser sugrp 386636 Jun  5 17:40 /scratch/users/suuser/FeC2_catalyst/temp800_noFeECP_nanoFeC2/chunk_0080/run.out
      [root@gpu-13-1 ~]# date
      Fri Jun  5 17:40:12 PDT 2015
      

      The fsname is regal, mounted on /scratch.
      No job_stats entries are seen for this job:

      [root@rcf-mgnt ~]# clush -w regal-oss[00-07] lctl get_param obdfilter.*.job_stats \| grep 2376464
      clush: regal-oss07: exited with exit code 1
      clush: regal-oss06: exited with exit code 1
      clush: regal-oss00: exited with exit code 1
      clush: regal-oss01: exited with exit code 1
      clush: regal-oss04: exited with exit code 1
      clush: regal-oss03: exited with exit code 1
      clush: regal-oss02: exited with exit code 1
      clush: regal-oss05: exited with exit code 1
      
      [root@regal-mds1 ~]# lctl get_param mdt.regal-MDT0000.job_stats | grep 2376464
      [root@regal-mds1 ~]# 
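
      To spot other processes on a node that are likely affected, here is a small companion sketch (again a userspace illustration; the 32K threshold mirrors the observation above and is an assumption, not a documented Lustre constant) that scans /proc and lists PIDs whose environ exceeds the limit:

      /* List PIDs whose /proc/<pid>/environ exceeds ~32 KiB (assumed threshold). */
      #include <ctype.h>
      #include <dirent.h>
      #include <stdio.h>

      #define ENV_LIMIT (32 * 1024)

      int main(void)
      {
          DIR *proc = opendir("/proc");
          struct dirent *de;
          char path[280], buf[4096];

          if (proc == NULL) {
              perror("opendir /proc");
              return 1;
          }
          while ((de = readdir(proc)) != NULL) {
              size_t total = 0, n;
              FILE *f;

              if (!isdigit((unsigned char)de->d_name[0]))
                  continue;    /* only numeric entries are PIDs */
              snprintf(path, sizeof(path), "/proc/%s/environ", de->d_name);
              f = fopen(path, "r");
              if (f == NULL)
                  continue;    /* process gone or permission denied */
              while ((n = fread(buf, 1, sizeof(buf), f)) > 0)
                  total += n;
              fclose(f);
              if (total > ENV_LIMIT)
                  printf("pid %s: environ is %zu bytes (over %d)\n",
                         de->d_name, total, ENV_LIMIT);
          }
          closedir(proc);
          return 0;
      }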
      

          People

            Assignee: Niu Yawei (Inactive)
            Reporter: Stephane Thiell
            Votes: 0
            Watchers: 10
