RE: Strange FLUKA crashes on AMD CPUs

From: Chris Theis <Christian.Theis_at_cern.ch>
Date: Fri, 7 Nov 2008 11:09:21 +0100

Hi Alfredo,

Thanks for the quick reply. Unfortunately, I forgot to mention in my
previous mail that I had already run some FPU trapping tests, and they
looked successful. However, it's hard to judge, because on other
platforms I have already seen non-trapped FPEs which only occurred when
the FPU was in a specific state.
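
To illustrate why a non-trapped FPE is so hard to spot, here is a minimal
Python sketch. It is a generic IEEE 754 illustration, not FLUKA code, and
the variable names are made up for the example. An invalid operation with
no trap installed just yields NaN, which then spreads into every dependent
result and slips past ordinary range checks:

```python
import math

# inf - inf is an IEEE 754 "invalid operation". With no trap installed
# it simply produces NaN and execution continues.
bad = math.inf - math.inf

# The NaN contaminates every later result that depends on it
# ("energy" is a hypothetical quantity, chosen only for illustration).
energy = 2.0 * bad + 1.0

# Ordered comparisons with NaN are always false, so neither a
# "negative value" check nor a "non-negative value" check catches it.
caught_by_negative_check = energy < 0.0
caught_by_positive_check = energy >= 0.0

# Only an explicit NaN test reports the problem.
is_nan = math.isnan(energy)

print(caught_by_negative_check, caught_by_positive_check, is_nan)
```

Only the explicit `math.isnan` test flags the value, so the first visible
symptom can appear far away from the expression that produced it.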

I reran an old input with FPE trapping disabled and in the error file I
get the following messages:

------------------

NEXT SEEDS: 10A0 0 0 0 0 0 181CD
3039 0 0
  Eventv: ekin+am < pla,ij,igreyt 6.62005989 6.67840451 14 3
  *** Kbar: P_min > 1.8 GeV/c: 25 2.05994622
  *** Kaon: P_min > 1.5 GeV/c: 24 2.87135628
  *** Kaon: P_min > 1.5 GeV/c: 24 3.79837287
       2000 98000 98000
5.1187210E-03 1.0000000E+30 635
 NEXT SEEDS: C17B73 0 0 0 0 0 181CD
3039 0 0
  *** Vacuum stopping: Ij, Pla, Ekin 1 0. -5.55111512E-17

   GEODEN: negative Econtr: -0.55511E-16 Icall 701 Ij 208
Mreg,Newreg 1 1 Iest 0

  *** FIXKIN: ECHECK,PXCHCK,PYCHCK,PZCHCK -6133.11383 -6.86972235
-10.5638196
  6142.69895
  **** dE/dx: P < 0, IJ, P, MMAT -6 NAN 6
  Eventv: ekin+am < pla,ij,igreyt 14.0274245 14.0439455 13 2

  *** Emfgeo: Ustep, Dnear, Nreg, Newreg, Ichemf(npemf), X, Y, Z, Ddnear
 -1.E+20 0.0201797805 2 3 -1 NAN NAN NAN 0.
       6000 94000 94000
3.5056473E-01 1.0000000E+30 1847
 NEXT SEEDS: 25A7B5F 0 0 0 0 0 181CD
3039 0 0
  **** dE/dx: P < 0, IJ, P, MMAT 13 NAN 6
  **** dE/dx: P < 0, IJ, P, MMAT 13 NAN 6
 BLCMAX = NAN < BLC = 1.50E+02
  **** dE/dx: P < 0, IJ, P, MMAT 13 NAN 6
  **** dE/dx: P < 0, IJ, P, MMAT 13 NAN 6
  **** dE/dx: P < 0, IJ, P, MMAT 13 0. 6
Stepop, Pla < Pthrij!! ij, mmat, ekin, Pla, Pthrij 13 6 NAN 0.
  0.00528431547
  *** Stepop: Trange < 0, Ij,Pla,Pthr 13 0. 0.00528431547
  **** dE/dx: P < 0, IJ, P, MMAT 13 -9.47719933E-14 6

------------------

This shows that, at least on our Phenom systems, we're seeing some FPEs
that don't show up on our other systems. However, the other FP accuracy
tests which I ran on these machines did not show this erratic behavior,
so I have not yet been able to narrow it down to a specific expression
or calculation that could be the cause.
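
One way to help narrow this down is to find the first message in the
error file that carries a NaN. The following is a minimal Python sketch;
the function name `find_nan_lines` is hypothetical, and the sample lines
are copied from the output quoted in this mail. It assumes NaNs are
printed as a standalone `NAN` token, as they are here:

```python
import re

# NaN printed as a standalone token, e.g. "... MMAT -6 NAN 6".
NAN_TOKEN = re.compile(r"\bNAN\b", re.IGNORECASE)


def find_nan_lines(lines):
    """Return (line_number, text) pairs for lines containing a NaN token."""
    hits = []
    for number, text in enumerate(lines, start=1):
        if NAN_TOKEN.search(text):
            hits.append((number, text.rstrip()))
    return hits


# Three lines taken from the error output quoted in this mail.
sample = [
    "  *** Vacuum stopping: Ij, Pla, Ekin 1 0. -5.55111512E-17",
    "  **** dE/dx: P < 0, IJ, P, MMAT -6 NAN 6",
    " -1.E+20 0.0201797805 2 3 -1 NAN NAN NAN 0.",
]

hits = find_nan_lines(sample)
first_nan = hits[0] if hits else None
print(first_nan)
```

The function accepts any iterable of lines, so an open file object for the
error file can be passed in directly. The first hit only marks where the
NaN first became visible, not necessarily where it was created.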

In any case, it would be very useful to have a G95 version to try, so
that we can rule out a strange (maybe optimizer-related) feature in g77
as the cause.

Cheers
Chris

------------------------------------------------------------------------
Chris Theis
CERN/SC-RP - European Organization for Nuclear Research
1211 Geneva 23, Switzerland
Phone: +41 22 767 8069 Office: 892-2A-015
e-mail: Christian.Theis@cern.ch www: http://www.cern.ch/theis
------------------------------------------------------------------------
Received on Fri Nov 07 2008 - 12:17:50 CET
