Andreas,
I also tried to reproduce the problem on test hardware by creating a filesystem with exactly the same 2.5.2 version of Lustre that is installed on our /scratch filesystem, and I was unable to reproduce it either. There must be something else going on with our /scratch filesystem, either its large scale (348 OSTs) or the upgrade from the 2.1.5 version we were running, so I'm going to compare the setup of the two filesystems and see if I can find any differences.
Regarding the debug output: we could not wait to put the system back into production, so we developed a manual process that distributes files across the OSTs by setting the stripe offset to a random OST for active user directories. We are also cycling the first two active OSTs so that files created in directories where stripe_offset is still set to -1 get distributed as well. This is neither efficient nor as fast, but at least it lets us run jobs for users and distribute files across all the OSTs. I can certainly generate the debug output, but I'm afraid it would be polluted with the activity from all the users. In addition, we had to deactivate the first 16 OSTs since they reached > 93% capacity. We have maintenance scheduled for next Tuesday and can collect data on a quiet system then. I've included the prealloc output in case it is useful. I noticed that two OSTs had -5 as the prealloc_status; those OSTs are in the list of inactive OSTs, which is in the attached file as well. Note that while looking through the prealloc output, I found these three sets of messages corresponding to those OSTs:
Oct 22 00:43:29 mds5 kernel: Lustre: setting import scratch-OST001d_UUID INACTIVE by administrator request
Oct 22 00:43:29 mds5 kernel: Lustre: Skipped 8 previous similar messages
Oct 22 00:43:29 mds5 kernel: LustreError: 22062:0:(osp_precreate.c:464:osp_precreate_send()) scratch-OST001d-osc-MDT0000: can't precreate: rc = -5
Oct 22 00:43:29 mds5 kernel: LustreError: 22062:0:(osp_precreate.c:968:osp_precreate_thread()) scratch-OST001d-osc-MDT0000: cannot precreate objects: rc = -5
Oct 22 01:04:06 mds5 kernel: Lustre: setting import scratch-OST0021_UUID INACTIVE by administrator request
Oct 22 01:04:06 mds5 kernel: LustreError: 22070:0:(osp_precreate.c:464:osp_precreate_send()) scratch-OST0021-osc-MDT0000: can't precreate: rc = -5
Oct 22 01:04:06 mds5 kernel: LustreError: 22070:0:(osp_precreate.c:968:osp_precreate_thread()) scratch-OST0021-osc-MDT0000: cannot precreate objects: rc = -5
Oct 22 15:07:21 mds5 kernel: Lustre: setting import scratch-OST0024_UUID INACTIVE by administrator request
Oct 22 15:07:21 mds5 kernel: Lustre: Skipped 5 previous similar messages
Oct 22 15:07:21 mds5 kernel: LustreError: 22084:0:(osp_precreate.c:464:osp_precreate_send()) scratch-OST0026-osc-MDT0000: can't precreate: rc = -5
Oct 22 15:07:21 mds5 kernel: LustreError: 22084:0:(osp_precreate.c:968:osp_precreate_thread()) scratch-OST0026-osc-MDT0000: cannot precreate objects: rc = -5
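For reference, the manual workaround described above (a random stripe offset per active user directory) can be sketched roughly as follows. This is only an illustration, not the script we actually run: the function names, the example directory, and the assumption that exactly OST indices 0-15 are inactive are all made up for the example. It only builds the lfs setstripe command line and does not execute anything.

```python
import random


def pick_stripe_offset(total_osts, inactive, rng=random):
    """Return a random OST index that is not in the inactive set."""
    candidates = [i for i in range(total_osts) if i not in inactive]
    if not candidates:
        raise ValueError("no active OSTs available")
    return rng.choice(candidates)


def setstripe_command(directory, ost_index, stripe_count=2):
    """Build (but do not run) an lfs setstripe command for a directory.

    -c sets the stripe count, -i sets the starting OST index.
    """
    return ["lfs", "setstripe", "-c", str(stripe_count),
            "-i", str(ost_index), directory]


if __name__ == "__main__":
    # Assumption for illustration: 348 OSTs, with the first 16 deactivated.
    inactive = set(range(16))
    index = pick_stripe_offset(348, inactive)
    # "/scratch/example_user" is a hypothetical directory name.
    print(" ".join(setstripe_command("/scratch/example_user", index)))
```

The real list of inactive OSTs is in the attached file, so a real run would need to read that list instead of assuming the first 16.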
Thanks Oleg. We're going to try to quiet down the system a bit (it is already partly drained while waiting to schedule a 32K-core job) and collect the MDS trace with all the OSTs active again, to see if that provides more debug information.
We compared our test filesystem with the current /scratch filesystem and found one significant difference: the test filesystem has an lwp service running that /scratch does not (there are additional lwp entries in lctl dl on the MDS and OSSes). The test filesystem went through the same 2.1.5 -> 2.5.2 upgrade process; however, we also ran tunefs.lustre --writeconf on the test filesystem. We did not run it on /scratch, in case we encountered a major issue and needed to roll back to the previous version. That appears to be the only difference in the upgrade process between the two filesystems. I didn't find much information about what lwp does from a quick search. I'm not sure whether having the lwp service running would affect file layout and creation on the OSTs.
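In case it helps anyone repeat the lctl dl comparison described above, here is a minimal sketch of how the device types from two saved lctl dl listings could be diffed. It assumes the usual column layout (index, state, type, name, UUID, refcount); the sample lines below are invented for illustration and are not copied from our systems.

```python
def device_types(lctl_dl_output):
    """Collect the device type column (third field) from lctl dl text."""
    types = set()
    for line in lctl_dl_output.splitlines():
        fields = line.split()
        # Expect at least: index, state, type, name
        if len(fields) >= 4 and fields[0].isdigit():
            types.add(fields[2])
    return types


# Hypothetical sample listings, made up for this example.
TEST_FS = """\
  0 UP osd-ldiskfs testfs-MDT0000-osd testfs-MDT0000-osd_UUID 10
  1 UP mdt testfs-MDT0000 testfs-MDT0000_UUID 7
  2 UP lwp testfs-MDT0000-lwp-MDT0000 testfs-MDT0000-lwp-MDT0000_UUID 5
"""

SCRATCH = """\
  0 UP osd-ldiskfs scratch-MDT0000-osd scratch-MDT0000-osd_UUID 10
  1 UP mdt scratch-MDT0000 scratch-MDT0000_UUID 7
"""

if __name__ == "__main__":
    only_on_test = device_types(TEST_FS) - device_types(SCRATCH)
    print("device types only on the test filesystem:", sorted(only_on_test))
```

With real captures, the two strings would be replaced by the saved output from each MDS or OSS.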
In answer to your question, we use a default stripe count of 2, a stripe size of 1MB, and a stripe offset of -1. We have not enabled OST pools.