RE: fluka job crashing

From: Alfredo Ferrari <alfredo.ferrari_at_cern.ch>
Date: Thu, 4 Apr 2013 12:50:58 +0200

... change the number of primaries (START card) to a very small number.
Looking into the .err files of the runs which crashed you should be able
to see how long on average one event lasts.
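
As a minimal sketch of such a test run, assuming a fixed-format input file
(card name in columns 1-10, WHAT(1) in columns 11-20), the START card below
requests 5 primaries. The value 5 is only an illustrative choice and does not
come from this thread; the first line is a comment ruler that shows the
column alignment.

```
*...+....1....+....2....+....3....+....4....+....5....+....6....+....7....+....8
START            5.0
```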

             Alfredo



+----------------------------------------------------------------------+
| Alfredo Ferrari || Tel.: +41.22.76.76119 |
| CERN-EN/STI || Fax.: +41.22.76.69474 |
| 1211 Geneva 23 || e-mail: Alfredo.Ferrari_at_cern.ch |
| Switzerland || |
+----------------------------------------------------------------------+

On Thu, 4 Apr 2013, Sudeshna Banerjee wrote:

> This could be my problem too. However, it has happened every time I have submitted the batch job, though only after several hours of running. If I try with a simplified geometry, the job completes properly and quickly.
>
> That is why I wanted to know if there is a possibility to run for only a few events (a few collisions in my case), so that the job (with the full CMS geometry) has a chance to finish soon.
>
> Sudeshna Banerjee
> ________________________________________
> From: Alfredo Ferrari [alfredo.ferrari_at_cern.ch]
> Sent: 04 April 2013 11:54
> To: Sudeshna Banerjee
> Cc: fluka-discuss_at_fluka.org
> Subject: Re: fluka job crashing
>
> .... bus errors are usually problems connected with the operating
> system/hardware of the computer you are running on and normally they have
> nothing to do with Fluka. On our cluster we get similar errors (very rare)
> if the node the batch job is running on temporarily loses the nfs
> connection to the master node, but I do not know if this applies to your
> case as well.
>
> Alfredo
>
> On Thu, 4 Apr 2013, Sudeshna Banerjee wrote:
>
>> Hello,
>>
>> I am trying to run fluka with a geometry file for the CMS detector. But
>> my batch jobs are failing after several hours. A core file is created but
>> the *.err, *.out and *.log files do not show any error messages. The only
>> error I see is in the batch job submission log file. It says -
>>
>> ======================= Running FLUKA for cycle # 1 =======================
>> /afs/cern.ch/user/b/bhat/scratch1/sudeshna/fluka/flutil/rfluka: line 358: 5268 Bus error (core dumped) "${EXE}" < "$INPN" 2> "$LOGF" > "$LOGF"
>>
>> ____________________________________________________________________________
>> How do I find out what went wrong?
>> Also, is there a way to control the number of collisions that are generated?
>> I am guessing that if I can run the job for 1 or 2 collisions, then I will
>> not have to wait too long to find out if the job is going to fail.
>>
>> Thanks
>> Sudeshna Banerjee
Received on Thu Apr 04 2013 - 20:48:05 CEST
