Hi Chris,
I have solved the crash problem (or at least I hope so) by running the
batch job on a machine with Scientific Linux 3.0.5 (32 bit) and
2 Intel(R) Xeon(TM) CPUs at 2.40GHz! Note that the executable was
compiled on a machine with Scientific Linux 3.0.5 (32 bit) and 2
AMD Opteron(tm) Processor 244 CPUs at 1.8GHz. It seems that the problem is
due exclusively to the machine used to run the batch job: on
the Intel Xeon machine the run ended successfully (I hope it was not a
... random event!).
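
For anyone who wants to check a node before submitting to it, here is a
minimal sketch, assuming a POSIX shell and a Linux-style /proc/cpuinfo. The
function names cpu_vendor and is_risky_node are my own, and the test for
"AMD CPU + 32 bit kernel" only mirrors the combination discussed in this
thread; it is not a confirmed diagnosis of the crash.

```shell
#!/bin/sh
# Hypothetical pre-flight check (names are illustrative, not part of FLUKA):
# flag nodes that match the combination reported in this thread,
# i.e. an AMD CPU running a 32-bit Linux kernel.

# Print the vendor_id of the first CPU listed in a cpuinfo-style file.
# $1: path to the file (defaults to /proc/cpuinfo)
cpu_vendor() {
    awk -F': *' '/^vendor_id/ { sub(/[ \t]+$/, "", $2); print $2; exit }' \
        "${1:-/proc/cpuinfo}" 2>/dev/null
}

# Return 0 when the vendor/machine pair matches the suspicious combination.
# $1: CPU vendor string, $2: machine type as printed by `uname -m`
is_risky_node() {
    [ "$1" = "AuthenticAMD" ] || return 1
    case "$2" in
        i?86) return 0 ;;   # 32-bit x86 kernel (i386 ... i686)
        *)    return 1 ;;   # x86_64 or anything else
    esac
}

# Only warn; whether to abort the job is left to the caller.
if is_risky_node "$(cpu_vendor)" "$(uname -m)"; then
    echo "warning: AMD CPU under 32-bit Linux; crashes were reported on such nodes" >&2
fi
```

Calling this at the top of the batch script at least records in the job
output which kind of node the run landed on.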
Cheers,
noemi
Chris Theis wrote:
>Hi Noemi,
>
>the symptoms you describe are identical to a case I was dealing with
>just a couple of weeks ago. It seems that these mysterious FLUKA crashes
>occur on AMD chips with K8/K10 core architectures running under a 32 bit
>linux. This is independent of the distribution as I tried several
>different ones, with different math libs etc. Currently, the reason is
>not yet fully clear, so I would suggest following the steps that Paola
>described in her mail from yesterday, as this would help to pinpoint
>the problem to a specific scenario.
>
>In the meantime I can only offer a remedy which solved this problem in
>our case. If you have the possibility, you should try to switch to a 64
>bit version of linux. I ran your input on a machine that we previously
>had problems with and that is now running the 64 bit version of linux.
>Your input finished gracefully, following exactly the same random number
>sequence as on your machine, just as expected.
>
>All 32 bit programs will run under the 64 bit OS as well, but you
>might need to install a couple of additional libraries for FLUKA to
>work. Some hints can be found in the following presentation by a
>colleague:
>
>http://info-fluka-discussion.web.cern.ch/info-fluka-discussion/talks/Ludovic_From32To64_190407.ppt
>
>I saw that you're sending your jobs to an LSF batch system and probably
>cannot change the OS there. In that case you might investigate
>whether certain architectures (I'd suggest AMD Opteron and Phenom CPUs)
>can be excluded from the scheduling process.
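
A rough sketch of how such an exclusion might look, assuming the cluster
defines host models that can be used in an LSF resource requirement string
(bsub -R "select[...]"). The model names Opteron244 and Opteron280 and the
script name run_fluka_job.sh are hypothetical; the real model names are
site-specific and appear in the output of lshosts. The helper
exclude_models is my own name, and the command is only printed here, not
submitted.

```shell
#!/bin/sh
# Build an LSF select[] string that excludes the given host models.
# Model names are site-specific; the ones used below are placeholders.
exclude_models() {
    req=""
    for m in "$@"; do
        # Join the individual conditions with a logical AND.
        req="${req:+$req && }model!=$m"
    done
    printf 'select[%s]\n' "$req"
}

# Dry run: print the submission command instead of executing it.
echo bsub -R "\"$(exclude_models Opteron244 Opteron280)\"" ./run_fluka_job.sh
```

If the site does not expose usable model names, listing the acceptable
hosts explicitly with bsub -m is the other option I am aware of.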
>Cheers
>
>Chris
>
>
>From: Finetti Noemi [mailto:noemi.finetti_at_aquila.infn.it]
>Sent: 03 December 2008 10:34
>To: Chris Theis
>Subject: Re: FLUKA crash
>
>
>Hi Christian,
>1) I am using a 32 bit version of SLC3;
>2) when I run my FLUKA job the crash on the batch machines occurs at
>different points (sometimes it occurs earlier and sometimes later).
>3) -bash-3.1$ more /proc/cpuinfo
>processor : 0
>vendor_id : AuthenticAMD
>cpu family : 15
>model : 5
>model name : AMD Opteron(tm) Processor 244
>stepping : 10
>cpu MHz : 1800.000
>cache size : 1024 KB
>fdiv_bug : no
>hlt_bug : no
>f00f_bug : no
>coma_bug : no
>fpu : yes
>fpu_exception : yes
>cpuid level : 1
>wp : yes
>flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow ts fid vid ttp
>bogomips : 3617.79
>
>processor : 1
>vendor_id : AuthenticAMD
>cpu family : 15
>model : 5
>model name : AMD Opteron(tm) Processor 244
>stepping : 10
>cpu MHz : 1800.000
>cache size : 1024 KB
>fdiv_bug : no
>hlt_bug : no
>f00f_bug : no
>coma_bug : no
>fpu : yes
>fpu_exception : yes
>cpuid level : 1
>wp : yes
>flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow ts fid vid ttp
>bogomips : 3616.16
>
>Thanks in advance,
>noemi
>
>
>Chris Theis wrote:
>
>Hi Noemi,
>
>I'm currently looking into your problem and I would need some more
>information. Could you please answer the following questions which would
>help me to check whether this problem is similar to one that I have seen
>just recently:
>
>- Are you using the 32 bit or the 64 bit version of SLC3?
>
>- When you run your FLUKA job does the crash on the batch machines
>  always occur at the same point or does it sometimes occur earlier and
>  sometimes later?
>
>- Could you please run the following command on the execution PC and
>  send me the output:
>  "cat /proc/cpuinfo"
>
>Ciao
>Chris
>
>------------------------------------------------------------------------
>Chris Theis
>CERN/SC-RP - European Organization for Nuclear Research
>1211 Geneva 23, Switzerland
>Phone: +41 22 767 8069 Office: 892-2A-015
>e-mail: Christian.Theis@cern.ch www: http://www.cern.ch/theis
>------------------------------------------------------------------------
>
>
> -----Original Message-----
> From: owner-fluka-discuss_at_mi.infn.it
> [mailto:owner-fluka-discuss_at_mi.infn.it] On Behalf Of Finetti Noemi
> Sent: 27 November 2008 15:31
> To: fluka-discuss_at_fluka.org
> Subject: FLUKA crash
>
> Hi all,
> I have installed FLUKA 2008.3 on a linux machine (Model: 2 AMD Opteron(tm)
> Processor 244 - 1.8GHz; Operating system: Linux - Scientific Linux 3.0.5;
> with g77) where I have compiled my executable file (myfluka), which calls
> the user routines USRINI, HISTIN, SOURCE, USRMED, USROUT (see
> user_routines.tar.gz). When I executed the job in batch (for 399528
> primaries) the run crashed, while running the same job interactively
> (for 100 primaries) everything was fine. What could be the reasons?
>
> I point out that the batch job was executed on a 2 Dual Core AMD
> Opteron(tm) Processor 280 - 2.4 GHz with Linux - Scientific Linux 3.0.8.
>
> Attached are the LSF message, the gdb output (file fluka_gdb.out), and
> the files .err, .log, .out, fort.1 and fort.2.
>
> Thanks in advance,
> noemi
>
> --
> ---------------------------------------------------------------------
> * Address: Dott.ssa Noemi Finetti
>   c/o Dipartimento di Fisica dell'Universita' degli Studi dell'Aquila
>   Via Vetoio - 67010 Coppito - L'Aquila - Italy
> * Phone: +39-0862-433051 (Office); +39-0862-433043 (Laboratory)
> * Fax: +39-0862-433033 (Department).
> ---------------------------------------------------------------------
detected a possible fraud attempt from "schemas.microsoft.com" claiming to be</b></font> "http://schemas.microsoft.com/sharepoint/soap/workflow/"</a> = xmlns:mver=3D<a class="moz-txt-link-rfc2396E" href="http://schemas.openxmlformats.org/markup-compatibility/2006="><font color="red"><b>MailScanner has detected a possible fraud attempt from "schemas.openxmlformats.org" claiming to be</b></font> "http://schemas.openxmlformats.org/markup-compatibility/2006= "</a> xmlns:m=3D<a class="moz-txt-link-rfc2396E" href="http://schemas.microsoft.com/office/2004/12/omml"><font color="red"><b>MailScanner has detected a possible fraud attempt from "schemas.microsoft.com" claiming to be</b></font> "http://schemas.microsoft.com/office/2004/12/omml"</a> = xmlns:mrels=3D<a class="moz-txt-link-rfc2396E" href="http://schemas.openxmlformats.org/package/2006/relationshi=ps"><font color="red"><b>MailScanner has detected a possible fraud attempt from "schemas.openxmlformats.org" claiming to be</b></font> "http://schemas.openxmlformats.org/package/2006/relationshi= ps"</a> = xmlns:ex12t=3D<a class="moz-txt-link-rfc2396E" href="http://schemas.microsoft.com/exchange/services/2006/types"><font color="red"><b>MailScanner has detected a possible fraud attempt from "schemas.microsoft.com" claiming to be</b></font> "http://schemas.microsoft.com/exchange/services/2006/types"</a>= = xmlns:ex12m=3D<a class="moz-txt-link-rfc2396E" href="http://schemas.microsoft.com/exchange/services/2006/messag=es"><font color="red"><b>MailScanner has detected a possible fraud attempt from "schemas.microsoft.com" claiming to be</b></font> "http://schemas.microsoft.com/exchange/services/2006/messag= es"</a> xmlns:Z=3D"urn:schemas-microsoft-com:" xmlns:st=3D"&#1;" = xmlns=3D<a class="moz-txt-link-rfc2396E" href="http://www.w3.org/TR/REC-html40"><font color="red"><b>MailScanner has detected a possible fraud attempt from "www.w3.org" claiming to be</b></font> "http://www.w3.org/TR/REC-html40"</a>> <head> <meta 
http-equiv=3DContent-Type content=3D"text/html; = charset=3Dus-ascii"> <meta name=3DGenerator content=3D"Microsoft Word 12 (filtered medium)"> <style> <!-- /* Font Definitions */ @font-face {font-family:Calibri; panose-1:2 15 5 2 2 2 4 3 2 4;} @font-face {font-family:Tahoma; panose-1:2 11 6 4 3 5 4 4 2 4;} @font-face {font-family:Consolas; panose-1:2 11 6 9 2 2 4 3 2 4;} /* Style Definitions */ p.MsoNormal, li.MsoNormal, div.MsoNormal {margin:0cm; margin-bottom:.0001pt; font-size:12.0pt; font-family:"Times New Roman","serif"; color:black;} a:link, span.MsoHyperlink {mso-style-priority:99; color:blue; text-decoration:underline;} a:visited, span.MsoHyperlinkFollowed {mso-style-priority:99; color:purple; text-decoration:underline;} pre {mso-style-priority:99; mso-style-link:"HTML Preformatted Char"; margin:0cm; margin-bottom:.0001pt; font-size:10.0pt; font-family:"Courier New"; color:black;} span.HTMLPreformattedChar {mso-style-name:"HTML Preformatted Char"; mso-style-priority:99; mso-style-link:"HTML Preformatted"; font-family:Consolas; color:black;} span.EmailStyle19 {mso-style-type:personal-reply; font-family:"Calibri","sans-serif"; color:#1F497D;} .MsoChpDefault {mso-style-type:export-only; font-size:10.0pt;} @page Section1 {size:612.0pt 792.0pt; margin:72.0pt 72.0pt 72.0pt 72.0pt;} div.Section1 {page:Section1;} --> </style> <!--[if gte mso 9]><xml> <o:shapedefaults v:ext=3D"edit" spidmax=3D"1026" /> </xml><![endif]--><!--[if gte mso 9]><xml> <o:shapelayout v:ext=3D"edit"> <o:idmap v:ext=3D"edit" data=3D"1" /> </o:shapelayout></xml><![endif]--> </head> <body bgcolor=3Dwhite lang=3DEN-GB link=3Dblue vlink=3Dpurple> <div class=3DSection1> <p class=3DMsoNormal><span = style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif"; color:#1F497D'>Hi Noemi,<o:p></o:p></span></p> <p class=3DMsoNormal><span = style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif"; color:#1F497D'><o:p>&nbsp;</o:p></span></p> <p class=3DMsoNormal><span = 
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif"; color:#1F497D'>the symptoms you describe are identical to a case I was = dealing with just a couple of weeks ago. It seems that these mysterious FLUKA = crashes occur on AMD chips with K8/K10 core architectures running under a 32 bit = linux. This is independent of the distribution as I tried several different = ones, with different math libs etc. Currently, the reason is not yet fully clear so = I would suggest to follow the steps that Paola described in her mail from yesterday as it would be helpful to pinpoint the problem to a specific scenario.<o:p></o:p></span></p> <p class=3DMsoNormal><span = style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif"; color:#1F497D'><o:p>&nbsp;</o:p></span></p> <p class=3DMsoNormal><span = style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif"; color:#1F497D'>In the meantime I can only offer a remedy which solved = this problem in our case. If you have the possibility you should try to = switch to a 64 bit version of linux. I ran your input on such a machine we = previously had problems with and which is now running the 64 bit version of linux. Your = input finished gracefully following exactly the same random number sequence like on = your machine, just as expected.<o:p></o:p></span></p> <p class=3DMsoNormal><span = style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif"; color:#1F497D'><o:p>&nbsp;</o:p></span></p> <p class=3DMsoNormal><span = style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif"; color:#1F497D'>All 32 bit programs will be running under the 64 bit OS = as well but you might need to install a couple of additional libraries for FLUKA = to work. 
Some hints can be found in the following presentation of a = colleague which could be helpful:<o:p></o:p></span></p> <p class=3DMsoNormal><span = style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif"; color:#1F497D'><o:p>&nbsp;</o:p></span></p> <p class=3DMsoNormal><span = style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif"; color:#1F497D'><a href=3D<a class="moz-txt-link-rfc2396E" href="http://info-fluka-discussion.web.cern.ch/info-fluka-discussion/ta=lks/Ludovic_From32To64_190407.ppt">"http://info-fluka-discussion.web.cern.ch/info-fluka-discussion/ta= lks/Ludovic_From32To64_190407.ppt"</a>><a class="moz-txt-link-freetext" href="http://info-fluka-discussion.web.cern.=">http://info-fluka-discussion.web.cern.=</a> ch/info-fluka-discussion/talks/Ludovic_From32To64_190407.ppt</a><o:p></o:= p></span></p> <p class=3DMsoNormal><span = style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif"; color:#1F497D'><o:p>&nbsp;</o:p></span></p> <p class=3DMsoNormal><span = style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif"; color:#1F497D'>I saw that you&#8217;re sending your jobs to a LSF batch = system and probably cannot change the OS there. In that case you might try to = investigate the possibilities to exclude certain architectures (I&#8217;d suggest = AMD Opteron and Phenom CPUs) from the scheduling process. 
= <o:p></o:p></span></p> <p class=3DMsoNormal><span = style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif"; color:#1F497D'><o:p>&nbsp;</o:p></span></p> <p class=3DMsoNormal><span = style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif"; color:#1F497D'>Cheers<o:p></o:p></span></p> <p class=3DMsoNormal><span = style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif"; color:#1F497D'>Chris<o:p></o:p></span></p> <p class=3DMsoNormal><span = style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif"; color:#1F497D'><o:p>&nbsp;</o:p></span></p> <p class=3DMsoNormal><span = style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif"; color:#1F497D'><o:p>&nbsp;</o:p></span></p> <div style=3D'border:none;border-left:solid blue 1.5pt;padding:0cm 0cm = 0cm 4.0pt'> <div> <div style=3D'border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt = 0cm 0cm 0cm'> <p class=3DMsoNormal><b><span lang=3DEN-US = style=3D'font-size:10.0pt;font-family: "Tahoma","sans-serif";color:windowtext'>From:</span></b><span = lang=3DEN-US style=3D'font-size:10.0pt;font-family:"Tahoma","sans-serif";color:windowt= ext'> Finetti Noemi [<a class="moz-txt-link-freetext" href="mailto:noemi.finetti_at_aquila.infn.it">mailto:noemi.finetti_at_aquila.infn.it</a>] <br> <b>Sent:</b> 03 December 2008 10:34<br> <b>To:</b> Chris Theis<br> <b>Subject:</b> Re: FLUKA crash<o:p></o:p></span></p> </div> </div> <p class=3DMsoNormal><o:p>&nbsp;</o:p></p> <p class=3DMsoNormal>Hi Christian,<br> 1) I am using a 32 bit version of SLC3;<br> 2) when I run my FLUKA job the crash on the batch machines occurs at = differnt points&nbsp; (sometimes it occurs earlier and sometimes later).<br> 3)&nbsp; -bash-3.1$ more /proc/cpuinfo<br> processor&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 0<br> vendor_id&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : AuthenticAMD<br> cpu family&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 15<br> model&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : = 5<br> model name&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
: AMD Opteron(tm) Processor = 244<br> stepping&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 10<br> cpu MHz&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 1800.000<br> cache size&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 1024 KB<br> fdiv_bug&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : no<br> hlt_bug&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : no<br> f00f_bug&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : no<br> coma_bug&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : no<br> fpu&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbs= p; : yes<br> fpu_exception&nbsp;&nbsp; : yes<br> cpuid level&nbsp;&nbsp;&nbsp;&nbsp; : 1<br> wp&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp= ;&nbsp; : yes<br> flags&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : fpu = vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat<br> pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow ts = fid vid ttp<br> bogomips&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 3617.79<br> &nbsp;<br> processor&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 1<br> vendor_id&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : AuthenticAMD<br> cpu family&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 15<br> model&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : = 5<br> model name&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : AMD Opteron(tm) Processor = 244<br> stepping&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 10<br> cpu MHz&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 1800.000<br> cache size&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 1024 KB<br> fdiv_bug&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : no<br> hlt_bug&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : no<br> f00f_bug&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : no<br> coma_bug&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : no<br> fpu&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbs= p; : yes<br> fpu_exception&nbsp;&nbsp; : yes<br> cpuid level&nbsp;&nbsp;&nbsp;&nbsp; : 1<br> 
wp&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp= ;&nbsp; : yes<br> flags&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : fpu = vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat<br> pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow ts = fid vid ttp<br> bogomips&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 3616.16<br> <br> Thanks in advance,<br> noemi<br> <br> <br> Chris Theis wrote:<br> <br> <o:p></o:p></p> <pre>Hi Noemi,<o:p></o:p></pre><pre><o:p>&nbsp;</o:p></pre><pre>I'm = currently looking into your problem and I would need some = more<o:p></o:p></pre><pre>information. Could you please answer the = following questions which would<o:p></o:p></pre><pre>help me to check = whether this problem is similar to one that I have = seen<o:p></o:p></pre><pre>just = recently:<o:p></o:p></pre><pre><o:p>&nbsp;</o:p></pre><pre>- Are you = using the 32 bit or the 64 bit version of = SLC3?<o:p></o:p></pre><pre><o:p>&nbsp;</o:p></pre><pre>- When you run = your FLUKA job does the crash on the batch = machines<o:p></o:p></pre><pre>always occur at the = same<o:p></o:p></pre><pre>&nbsp; point or does it sometimes occur = earlier and sometimes = later?<o:p></o:p></pre><pre><o:p>&nbsp;</o:p></pre><pre>- Could you = please run the following command on the execution PC = and<o:p></o:p></pre><pre>send me the output<o:p></o:p></pre><pre>&nbsp; = &quot;cat = /proc/cpuinfo&quot;<o:p></o:p></pre><pre><o:p>&nbsp;</o:p></pre><pre>Ciao= <o:p></o:p></pre><pre>Chris<o:p></o:p></pre><pre><o:p>&nbsp;</o:p></pre><= pre>---------------------------------------------------------------------= ---<o:p></o:p></pre><pre>Chris Theis<o:p></o:p></pre><pre>CERN/SC-RP - = European Organization for Nuclear Research<o:p></o:p></pre><pre>1211 = Geneva 23, Switzerland<o:p></o:p></pre><pre>Phone: +41 22 767 = 8069&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nb= sp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Office: = 
892-2A-015<o:p></o:p></pre><pre>e-mail: <a href=3D<a class="moz-txt-link-rfc2396E" href="mailto:Christian.Theis_at_cern.ch">"mailto:Christian.Theis_at_cern.ch"</a>><a class="moz-txt-link-abbreviated" href="mailto:Christian.Theis_at_cern.ch">Christian.Theis_at_cern.ch</a></a>&nbsp;= &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; www: <a href=3D<a class="moz-txt-link-rfc2396E" href="http://www.cern.ch/theis">"http://www.cern.ch/theis"</a>><a class="moz-txt-link-freetext" href="http://www.cern.ch/theis">http://www.cern.ch/theis</a></a><o:p></o:p>= </pre><pre>--------------------------------------------------------------= ----------<o:p></o:p></pre><pre><o:p>&nbsp;</o:p></pre><pre><o:p>&nbsp;</= o:p></pre><pre>&nbsp; <o:p></o:p></pre> <blockquote = style=3D'margin-top:5.0pt;margin-bottom:5.0pt'><pre>-----Original = Message-----<o:p></o:p></pre><pre>From: <a href=3D<a class="moz-txt-link-rfc2396E" href="mailto:owner-fluka-discuss_at_mi.infn.it">"mailto:owner-fluka-discuss_at_mi.infn.it"</a>><a class="moz-txt-link-abbreviated" href="mailto:owner-fluka-discuss_at_mi.inf=">owner-fluka-discuss_at_mi.inf=</a> n.it</a> [<a href=3D<a class="moz-txt-link-rfc2396E" href="mailto:owner-fluka">"mailto:owner-fluka"</a>><a class="moz-txt-link-freetext" href="mailto:owner-fluka">mailto:owner-fluka</a></a>-<o:p></o:p></pre><pre>= <a href=3D<a class="moz-txt-link-rfc2396E" href="mailto:discuss_at_mi.infn.it">"mailto:discuss_at_mi.infn.it"</a>><a class="moz-txt-link-abbreviated" href="mailto:discuss_at_mi.infn.it">discuss_at_mi.infn.it</a></a>] On Behalf Of = Finetti Noemi<o:p></o:p></pre><pre>Sent: 27 November 2008 = 15:31<o:p></o:p></pre><pre>To: <a href=3D<a class="moz-txt-link-rfc2396E" href="mailto:fluka-discuss_at_fluka.org">"mailto:fluka-discuss_at_fluka.org"</a>><a class="moz-txt-link-abbreviated" href="mailto:fluka-discuss_at_fluka.org">fluka-discuss_at_fluka.org</a></a><o:p><= /o:p></pre><pre>Subject: FLUKA = crash<o:p></o:p></pre><pre><o:p>&nbsp;</o:p></pre><pre>Hi = 
all,<o:p></o:p></pre><pre>I have installed FLUKA 2008.3 on a linux = machine (Model: 2 AMD Opteron<o:p></o:p></pre><pre>(tm) Processor 244 - = 1.8GHz; Operating system: Linux - = Scientific<o:p></o:p></pre><pre>&nbsp;&nbsp;&nbsp; = <o:p></o:p></pre></blockquote> <pre>Linux<o:p></o:p></pre><pre>&nbsp; <o:p></o:p></pre> <blockquote style=3D'margin-top:5.0pt;margin-bottom:5.0pt'><pre>3.0.5; = with g77) where I have compiled my executable file = (myfluka)<o:p></o:p></pre><pre>which calls the user routines USRINI, = HISTIN,SOURCE, USRMED, USROUT<o:p></o:p></pre><pre>(see = user_routines.tar.gz). Executing the job in batch (for = 399528<o:p></o:p></pre><pre>primaries) the run crashed while running the = same job interactively<o:p></o:p></pre><pre>(for = 100<o:p></o:p></pre><pre>primaries) every thing was fine. What could be = the reasons?<o:p></o:p></pre><pre><o:p>&nbsp;</o:p></pre><pre>I point = out that the batch job was executed on a 2 Dual Core = AMD<o:p></o:p></pre><pre>Opteron(tm) Processor 280 - 2.4 GHz with Linux = - Scientific = Linux<o:p></o:p></pre><pre>3.0.8.<o:p></o:p></pre><pre><o:p>&nbsp;</o:p><= /pre><pre>In attachment the LSF message, the gdb output (file = fluka_gdb.out),<o:p></o:p></pre><pre>&nbsp;&nbsp;&nbsp; = <o:p></o:p></pre></blockquote> <pre>the<o:p></o:p></pre><pre>&nbsp; <o:p></o:p></pre> <blockquote style=3D'margin-top:5.0pt;margin-bottom:5.0pt'><pre>files = .err, .log, .out, fort.1 and = fort.2.<o:p></o:p></pre><pre><o:p>&nbsp;</o:p></pre><pre>Thanks in = advance,<o:p></o:p></pre><pre>noemi<o:p></o:p></pre><pre><o:p>&nbsp;</o:p= </pre> <blockquote type="cite"> <pre wrap=""></pre><pre>--<o:p></o:p></pre><pre>-------------------------------------= </pre> </blockquote> <pre wrap=""><!---->--------------------------------<o:p></o:p></pre><pre>* Address: = Dott.ssa Noemi Finetti<o:p></o:p></pre><pre>&nbsp;&nbsp; c/o = Dipartimento di Fisica dell'Universita' degli Studi = dell'Aquila<o:p></o:p></pre><pre>&nbsp;&nbsp; Via Vetoio - 67010 Coppito = - 
L'Aquila - Italy<o:p></o:p></pre><pre>* Phone: +39-0862-433051 = (Office); +39-0862-433043 (Laboratory)<o:p></o:p></pre><pre>* Fax: = +39-0862-433033 = (Department).<o:p></o:p></pre><pre>--------------------------------------= -------------------------------<o:p></o:p></pre><pre><o:p>&nbsp;</o:p></p= re><pre><o:p>&nbsp;</o:p></pre><pre><o:p>&nbsp;</o:p></pre><pre>&nbsp;&nb= sp;&nbsp; <o:p></o:p></pre></blockquote> <pre><o:p>&nbsp;</o:p></pre><pre><o:p>&nbsp;</o:p></pre><pre>&nbsp; = <o:p></o:p></pre> <p class=3DMsoNormal><br> <br> <o:p></o:p></p> <pre>-- = <o:p></o:p></pre><pre>---------------------------------------------------= ------------------<o:p></o:p></pre><pre>* Address: Dott.ssa Noemi = Finetti<o:p></o:p></pre><pre>&nbsp; c/o Dipartimento di Fisica = dell'Universita' degli Studi dell'Aquila<o:p></o:p></pre><pre>&nbsp; Via = Vetoio - 67010 Coppito - L'Aquila - Italy<o:p></o:p></pre><pre>* Phone: = +39-0862-433051 (Office); +39-0862-433043 = (Laboratory)<o:p></o:p></pre><pre>* Fax: +39-0862-433033 = (Department).<o:p></o:p></pre><pre>--------------------------------------= -------------------------------<o:p></o:p></pre><pre><o:p>&nbsp;</o:p></p= re><pre><o:p>&nbsp;</o:p></pre></div> </div> </body> </html> ------_=_NextPart_001_01C95537.0C95CDE5-- </pre> </blockquote> <br> <pre class="moz-signature" cols="72">-- --------------------------------------------------------------------- * Address: Dott.ssa Noemi Finetti c/o Dipartimento di Fisica dell'Universita' degli Studi dell'Aquila Via Vetoio - 67010 Coppito - L'Aquila - Italy * Phone: +39-0862-433051 (Office); +39-0862-433043 (Laboratory) * Fax: +39-0862-433033 (Department). --------------------------------------------------------------------- </pre> </body> </html> --------------010701090504090103050503--Received on Thu Dec 04 2008 - 16:51:15 CET
This archive was generated by hypermail 2.2.0 : Thu Dec 04 2008 - 16:51:15 CET