RE: Workstation Computers for FLUKA

From: Nicholas Bolibruch <>
Date: Fri, 26 Nov 2010 11:15:53 -0600

I don't know if you've looked at the performance of GPU cards lately,
but the latest Tesla cards can perform double precision at a 515 GFLOPS
peak, whereas an i7 980X (top of the line i7) scores about 106 GFLOPS.
Hence a computer with 4 of these cards can easily top 2 TFLOPS.
Following Moore's Law, using a similar configuration, in 2 years that
will be 4 TFLOPS, and in 4 years it will be 8 TFLOPS.
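
The arithmetic behind these figures can be sketched as follows. This is only a back-of-the-envelope illustration: the two peak numbers are the ones quoted above, while the two-year doubling period is the assumed reading of Moore's Law used in this message, and the helper name projected_gflops is hypothetical. Peak figures say nothing about sustained throughput on a real Monte Carlo code.

```python
# Peak double-precision figures quoted in the message (GFLOPS).
TESLA_GFLOPS = 515.0
I7_980X_GFLOPS = 106.0
CARDS = 4  # cards per workstation, as in the message


def projected_gflops(base, years, doubling_period_years=2.0):
    """Project peak throughput, ASSUMING it doubles every doubling_period_years."""
    return base * 2.0 ** (years / doubling_period_years)


# Aggregate peak of one workstation with four cards.
node_gflops = CARDS * TESLA_GFLOPS

for years in (0, 2, 4):
    tflops = projected_gflops(node_gflops, years) / 1000.0
    print(f"in {years} years: {tflops:.2f} TFLOPS peak")

print(f"one Tesla vs. one i7 980X: {TESLA_GFLOPS / I7_980X_GFLOPS:.1f}x")
```

With these inputs the four-card peak comes out at 2.06 TFLOPS today, which the message rounds to 2, 4 and 8 TFLOPS for the three time points.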

Also, I'm finding sources that cite the Tesla card as offering IEEE 754
double precision compliance. I don't know about the AMD (formerly ATI)
cards, but I imagine they are offering this compliance as well. Last
time I looked at the AMD cards they were offering similar double
precision performance to the Nvidia offering.

I'm just pointing out that the potential performance payoff is great.

On Fri, 2010-11-26 at 15:31 +0000, Chris Theis wrote:
> Hi Nicholas,
> there has been a lot of hype recently regarding GPUs (some of it certainly justified), but there are some subtle issues, especially
> in connection with MC simulations, which I would like to comment on.
> > I personally would love to see a version of Fluka take advantage of modern
> > graphics processors that have a large number of cores, and are now showing
> > good performance for double precision operations.
> It is true that support for double precision has become a lot better lately.
> However, in comparison to native single precision calculations the performance
> benefit is currently significantly lower. The GPU vs. CPU benchmarks usually shown are based on single precision!
> Probably the more important point is that different vendors have allowed themselves a bit of freedom with
> respect to the implementation of the IEEE floating point standard. It is not guaranteed that the
> GPU that you are running your program on will yield the same result as your colleague's if that one is from a different
> vendor. In terms of debugging & quality control this is a very important
> aspect that is commonly overlooked!
> Furthermore, using proprietary platforms like CUDA limits the program to one vendor,
> like Nvidia GPUs in this case. With portability in mind, one will
> probably have to go for something like OpenCL in the end, where you will sacrifice
> a bit of speed on the altar of portability.
> > There are some utilities to convert Fortran 95 code to CUDA, obviously with a lot of manual work to
> > specify what resources to allocate for particular code segments. Unfortunately, I have yet to find anything that can do this for Fortran 77.
> I'm afraid that the chances of finding something suitable for F77 are realistically below zero.
> The F77 standard is more than 30 years old, which renders it practically a dinosaur given the short turnover times seen in
> recent years. Furthermore, maintenance of the popular G77 compiler was stopped many years ago, so I would not count
> on anybody making the effort to provide support for modern GPUs with respect to F77.
> Cheers
> Chris
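
The reproducibility concern raised in the quoted reply can be illustrated, in a limited way, without any GPU. The sketch below does not reproduce vendor-specific deviations from IEEE 754. It shows a related and simpler mechanism: double-precision addition is not associative, so the same numbers summed in a different order (as a parallel reduction may do) can give a different result. The function name sum_in_order and the sample values are hypothetical.

```python
import math


def sum_in_order(values):
    """Accumulate left to right in double precision, like a naive serial loop."""
    total = 0.0
    for v in values:
        total += v
    return total


# The same three numbers, supplied in two different orders.
serial_order = [1e16, 1.0, -1e16]
reordered = [1e16, -1e16, 1.0]

a = sum_in_order(serial_order)   # the 1.0 is lost when added to 1e16
b = sum_in_order(reordered)      # the large terms cancel first, so 1.0 survives
exact = math.fsum(serial_order)  # correctly rounded sum, independent of order

print("serial order:", a)
print("reordered:   ", b)
print("math.fsum:   ", exact)
print("bitwise identical:", a == b)
```

For a Monte Carlo code, a discrepancy of this kind between two machines makes it harder to tell a genuine bug from an ordering effect, which is why bit-for-bit comparison between platforms is a useful quality-control tool when it is available.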
Received on Sat Nov 27 2010 - 13:28:26 CET
