RE: Workstation Computers for FLUKA from Chris Theis on 2010-11-27 (fluka discuss archive)

From: Chris Theis <Christian.Theis_at_cern.ch>
Date: Sat, 27 Nov 2010 14:20:25 +0000

Hi Makis,

> Are we sure that Fluka depends only, or at least mainly, on the double precision performance of the (C,G)PU?

My personal answer is no, not at all. But I'm sure the FLUKA developers
can provide more insight and probably numbers on this.

> I faintly remember a presentation stating that Geant4 performance
> followed the SPECint numbers more closely than the SPECfp numbers, with
> the explanation that the decisions an MC code has to make are as
> expensive as the actual calculations. I wonder whether this is also
> true for Fluka.

Such performance tests mainly benchmark calculations that are done by
very small, isolated routines. Their results therefore give only limited
evidence of how complex programs perform. In practice, performance
depends strongly on branching, which is often a killer in terms of
speed. For GPGPUs, too, most of the speed benefit is determined by how
well your algorithm/problem lends itself to stream processing rather
than by FP performance alone.
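
The point about branching and stream processing can be illustrated with
a toy cost model. The sketch below is only an illustration under stated
assumptions, not a measurement of FLUKA or of any real GPU: it assumes
that the lanes of one SIMT group ("warp") run in lockstep, so that when
lanes disagree on a branch, both sides are executed one after the other.
The function name warp_steps and the cost numbers are hypothetical.

```python
def warp_steps(takes_branch, cost_if, cost_else):
    """Toy lockstep cost model (an assumption, not a real GPU timing).

    takes_branch: one boolean per lane, True if the lane takes the 'if' side.
    Each side of the branch is paid for once if at least one lane needs it.
    """
    any_if = any(takes_branch)          # at least one lane takes the 'if' side
    any_else = not all(takes_branch)    # at least one lane takes the 'else' side
    steps = 0
    if any_if:
        steps += cost_if
    if any_else:
        steps += cost_else
    return steps


# All 32 lanes agree: only one side of the branch is executed.
uniform = [True] * 32
# Lanes alternate: both sides are executed, one after the other.
divergent = [lane % 2 == 0 for lane in range(32)]

print(warp_steps(uniform, 10, 10))    # 10
print(warp_steps(divergent, 10, 10))  # 20
```

In this model the peak arithmetic rate never enters. The cost doubles
purely because the lanes disagree, which is the situation an MC code
with many per-particle decisions tends to produce.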

If you happen to have a link to the presentation that you mentioned, I'd
appreciate it if you could send it to me.

Cheers
Chris

________________________________________
From: Chrysostomos Valderanis
Sent: 27 November 2010 13:13
To: Chris Theis; Nicholas Bolibruch; turgaykorkut_at_hotmail.com; fluka-discuss_at_fluka.org
Subject: RE: Workstation Computers for FLUKA

Hi Chris,

I have been following this discussion, which has diverged from the
original question, and I would like to return to it for a moment. Are we
sure that Fluka depends only, or at least mainly, on the double
precision performance of the (C,G)PU?
I faintly remember a presentation stating that Geant4 performance
followed the SPECint numbers more closely than the SPECfp numbers, with
the explanation that the decisions an MC code has to make are as
expensive as the actual calculations. I wonder whether this is also true
for Fluka.
Does anyone out there have actual numbers?

Thank you,
Makis

________________________________________
From: owner-fluka-discuss_at_mi.infn.it [owner-fluka-discuss_at_mi.infn.it] on behalf of Chris Theis [Christian.Theis_at_cern.ch]
Sent: 26 November 2010 16:31
To: Nicholas Bolibruch; turgaykorkut_at_hotmail.com; fluka-discuss_at_fluka.org
Subject: RE: Workstation Computers for FLUKA

Hi Nicholas,

there is a lot of hype recently regarding GPUs (some of it certainly
justified), but there are some subtle issues especially in connection
with MC simulations, which I would like to comment on.

> I personally would love to see a version of Fluka take advantage of modern
> graphics processors that have a large number of cores, and are now showing
> good performance for double precision operations.

It is true that support for double precision has become a lot better
lately. However, in comparison to native single precision calculations
the performance benefit is currently significantly lower. The GPU vs.
CPU benchmarks usually shown are based on single precision!
Probably the more important point is that different vendors have allowed
themselves a bit of freedom in implementing the IEEE floating point
standard. It is not guaranteed that the GPU you are running your program
on will yield the same result as your colleague's if that one comes from
a different vendor. In terms of debugging and quality control this is a
very important aspect that is commonly overlooked!
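
One general reason why bit-identical results are hard to guarantee, even
between fully IEEE-compliant implementations, is that floating-point
addition is not associative: a different evaluation order (for example
in a parallel reduction) can change the last bits of a result. The
minimal sketch below shows this on the CPU in plain Python, and also how
a value changes when it is rounded to single precision. The helper names
accumulate and to_float32 are made up for illustration, and nothing here
reproduces the behaviour of any particular GPU or vendor.

```python
import struct


def accumulate(values):
    """Add the values strictly one after another, in the given order."""
    total = 0.0
    for v in values:
        total += v
    return total


def to_float32(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


values = [0.1, 0.2, 0.3]

forward = accumulate(values)             # (0.1 + 0.2) + 0.3
backward = accumulate(reversed(values))  # (0.3 + 0.2) + 0.1

# Same numbers, different order, different double-precision result.
print(forward == backward)   # False
print(forward, backward)

# Rounding to single precision changes the stored value itself.
print(to_float32(0.1) == 0.1)  # False
```

A difference in the last bit is harmless for a single sum, but in a
Monte Carlo run it can flip a comparison and send a history down a
different branch, which is why comparing results across platforms is
harder than it looks.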

Furthermore, using a proprietary platform like CUDA ties the program to
one vendor, Nvidia in this case. With portability in mind one will
probably have to go for something like OpenCL in the end, where you
sacrifice a bit of speed on the altar of portability.

> There are some utilities to convert Fortran 95 code to CUDA, obviously with a lot of manual work to
> specify what resources to allocate for particular code segments. Unfortunately, I have yet to find anything that can do this for Fortran 77.

I'm afraid that the chances of finding something suitable for F77 are
realistically below zero.
The F77 standard is more than 30 years old, which makes it practically a
dinosaur given the short turnover times seen in recent years.
Furthermore, maintenance of the popular G77 compiler was stopped many
years ago, so I would not count on anybody making the effort to provide
F77 support for modern GPUs.

Cheers
Chris
Received on Sun Nov 28 2010 - 14:56:52 CET
