Re: Random seed (RANDOMIZE)

From: Alberto Fasso' <fasso_at_slac.stanford.edu>
Date: Fri, 1 Jul 2011 15:15:24 -0700 (PDT)

Chris,

we all know your enthusiasm for the latest fashions in computer science.
But the oldest paradigm in this field, one that has not been made obsolete
so far, is KISS = "Keep It Simple, Stupid". It holds for computer programs
as well as for many other technologies. The old Volkswagen "Beetle" and the
old FIAT 500 (my first car) were so popular because they were simple and
never broke down. There was no water cooling, so no risk of water leaks.
Many are still running after 40 years.
Of course, other cars are designed in a different way, often sophisticated,
and they work well. But you would never dream of putting a Porsche or a
Ferrari engine inside a FIAT 500.
The good thing (or the bad thing, depending on taste) about FLUKA is that it
is simple. Have a look at
http://willworkforscience.blogspot.com/2010/10/monte-carlo-programs-in-particle.html
FLUKA works with an ASCII input file, is written in obsolete but fast F77,
works on simple PCs, and doesn't require its users to be Formula 1 drivers
(ooops, sorry, computer scientists). And, like a good family car, it takes you
from Point A to Point B without surprises.

If you do a little Google search on parallel Monte Carlo, you find a lot of
papers on M.C. used for numerical integration, or for integration over a
many-dimensional space WITH A FIXED NUMBER OF DIMENSIONS: very large, but
fixed. This is not the case for M.C. particle transport, where each
particle history can have a different number of steps and each interaction
a different number of output channels (the same reason why particle
transport cannot be done with Quasi-Monte Carlo: a pity, because it would
be much more efficient).
On Google, you will also find a certain number of attempts to do parallel
particle transport, but they ALL fall into one of the following cases:
- unsatisfactory results
- need for special computers not available to the average user. Often these
    computers are no longer available, like the vector RISCs, and this
    teaches us a lesson to remember (FLUKA worked on Vax, IBM VM, RISC UNIX,
    and it is still there)
- programs designed from scratch to be run in parallel (these are the
    Porsches and the Ferraris: good for them!). But we don't intend to
    redesign FLUKA from scratch.
- very simple MC codes (PENELOPE is a very good code, but can transport only
    3 different particles, each of which can have max. 3 different kinds of
    interactions)
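
To illustrate the point above about histories with a varying number of
steps, here is a minimal Python sketch. It is a toy model, not FLUKA code:
the function history_length, the absorption probability and the maximum
number of output channels are assumptions made only for this illustration.
It simply counts how many sampling calls each history makes.

```python
import random


def history_length(rng, absorb_prob=0.3, max_channels=3):
    """Follow one toy history and return the number of sampling calls made.

    All parameters are hypothetical; no real physics is modelled.
    """
    draws = 0
    while True:
        draws += 1
        if rng.random() < absorb_prob:
            # The particle is absorbed: the history ends here.
            return draws
        # Otherwise an interaction occurs with a variable number of
        # output channels, each of which needs its own sample.
        draws += 1
        n_channels = rng.randint(1, max_channels)
        for _ in range(n_channels):
            rng.random()
            draws += 1


# A fixed seed keeps the whole sequence reproducible.
rng = random.Random(12345)
lengths = [history_length(rng) for _ in range(1000)]
print("shortest history:", min(lengths), "sampling calls")
print("longest history: ", max(lengths), "sampling calls")
```

The number of sampling calls differs from one history to the next, so there
is no fixed-length loop to hand out to parallel workers and no fixed
dimension for a Quasi-Monte Carlo point set. Re-running with the same seed
gives the same list of lengths.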

If you remember, in a previous answer in this thread, I told you that
"every Monte Carlo code is designed according to a particular
philosophy, that you may like or not. Suggestions to do modifications
are ok, but not if they are incompatible with that philosophy".
That said, in a few words, what I have just told you in many more words.
And now even you yourself say: "The feasibility strongly depends on the
underlying code design". FLUKA has simply not been designed for fancy things
such as asynchronous SPMD, and I don't recommend that we spend precious human
resources on it.

However, if we are talking about SIMPLE improvements such as the one
suggested by Paola, which do not affect the structure of the program: "an
automatic way of preparing, running, summing up several parallel runs", that
can certainly be done (probably, I think, by modifying the rfluka script).
And the nice thing is that everybody can try it: I already know of several
people who have developed their own script for this purpose, and I am sure
that with your special computer skills you can invent something better.
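
As a rough idea of what such a script could do, here is a minimal Python
sketch of the "preparing" and "summing up" steps. It rests on several
assumptions: that cards are fixed-format with 10-character fields, that
WHAT(2) of RANDOMIZe selects the random sequence (as mentioned elsewhere in
this thread), and that each run yields one mean value of the scored
quantity. The helpers randomiz_card, make_inputs and combine are
hypothetical names, the template is placeholder text rather than a valid
input file, and launching the jobs and reading real output files are left
out.

```python
import math


def randomiz_card(seed):
    """Build a RANDOMIZ card (assumed layout: name, WHAT(1), WHAT(2))."""
    # Assumption: 10-character fields, WHAT(1) = 1.0, WHAT(2) = seed.
    return f"{'RANDOMIZ':<10}{1.0:>10.1f}{float(seed):>10.1f}"


def make_inputs(template, n_runs, first_seed=1):
    """Return {seed: input_text}: one copy of the template per run.

    Keeping the seed as the key records which sequence each run used,
    so any single run can be reproduced later.
    """
    inputs = {}
    for seed in range(first_seed, first_seed + n_runs):
        lines = [
            randomiz_card(seed) if line.startswith("RANDOMIZ") else line
            for line in template.splitlines()
        ]
        inputs[seed] = "\n".join(lines) + "\n"
    return inputs


def combine(run_means):
    """Average equally sized independent runs.

    Returns the mean and its standard error estimated from the spread
    between runs.
    """
    n = len(run_means)
    mean = sum(run_means) / n
    variance = sum((x - mean) ** 2 for x in run_means) / (n - 1)
    return mean, math.sqrt(variance / n)


# Placeholder template: only the RANDOMIZ line matters for this sketch.
template = "TITLE\nToy example\nRANDOMIZ         1.0\nSTOP\n"
inputs = make_inputs(template, n_runs=5, first_seed=1)

# Made-up per-run results, standing in for values read from output files.
mean, err = combine([1.02, 0.98, 1.01, 0.99, 1.00])
print(f"{len(inputs)} inputs prepared, combined result {mean:.3f} +/- {err:.3f}")
```

The seeds are chosen deterministically and stored with each input, so the
runs are independent of each other and every one of them stays reproducible.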

Alberto

On Fri, 1 Jul 2011, Chris Theis wrote:

> Dear Paola,
>
> Parallelising transport MC is surely not trivial from a technical point of
> view, but it can be done applying for example asynchronous SPMD approaches as
> demonstrated by PENELOPE, MCGPU or also many other codes in the CG domain. The
> feasibility strongly depends on the underlying code design.
>
> Unfortunately I won't be able to join you on Monday but I'm surely willing to
> contribute within the boundary conditions set by my other professional
> obligations.
>
> Ciao
> Chris
>
>
>
> On 1 Jul 2011, at 16:54, "Paola Sala" <paola.sala_at_mi.infn.it> wrote:
>
>> Dear Chris,
>> maybe I'm naive too,
>> but to my poor knowledge transport Monte Carlo codes are not so easy to
>> parallelize. You can surely find literature on this subject, but the
>> trivial reason is that there is no fixed length loop to
>> parallelize. Every history, and every substep in a history, involves
>> an unpredictable and widely varying number of different operations.
>>
>> Fake parallelization, i.e. parallel runs on different CPUs, is of
>> course not so elegant, but it is straightforward, reliable and efficient.
>>
>> Now, of course one can think of an automatic way of preparing,
>> running, summing up several parallel runs. This is one of the
>> technical improvements that we wish to implement. I hope you'll be able to
>> participate in the next collaboration meeting, where the item will
>> be discussed, and of course any contribution from your side would be
>> welcome.
>> Ciao
>> Paola
>>
>>> Alberto,
>>>
>>> thanks for your quick answer on this quite interesting & lively
>>> discussion.
>>>
>>> I take your point on the matter of philosophy & compatibility of
>>> suggestions. But I would like to stress once again that my proposal does
>>> not at all imply any change of philosophy but only a reversal of what is
>>> considered as default/optional and a bit of automated help for the user!
>>>
>>> As I wrote in each previous e-mail, the reproducibility of pseudo-random
>>> numbers is crucial - this is a point that I have never questioned.
>>> However, I stand by my opinion that for a user this is in most cases not
>>> of concern. If the code generated the initial random seeds by itself,
>>> while outputting the first one, then the user would not have to worry
>>> about correctly initiating FLUKA's way of parallelization himself, while
>>> retaining the possibility to fully reproduce the random sequence which
>>> was "chosen" by the machine. The way that the seed is generated
>>> originally is something which can be done via hardware - like I said,
>>> with VIA mobos for example - or deterministic master-slave mechanisms,
>>> as long as the disjointness of parallel sequences is preserved and the
>>> initial seed is available for later use to reproduce the pseudo-random
>>> sequence.
>>>
>>> Whether this approach, for which I do not claim authorship but which has
>>> been devised and discussed in a plethora of mathematical treatments over
>>> the last decades, is naive is something that might lie in the eye of the
>>> beholder. Yet, it is a popular and widely successful approach implemented
>>> in various Monte Carlo codes, a technique that extends far beyond the
>>> domain of radiation transport.
>>> So what I do object to is the statement that it is un-workable.
>>>
>>>> Referring to what I have said above, your idea of what a default should
>>>> be conflicts strongly with the FLUKA philosophy that all jobs should
>>>> be reproducible.
>>>
>>> Well, this is a philosophical design question which can be argued about.
>>> But you are absolutely right that in any case it is the developer's
>>> decision, no matter if the users like it or not.
>>>
>>> However, this discussion hides a probably much more interesting and
>>> important aspect regarding parallelization. In times of N-core CPUs and
>>> GPGPUs, found even in cell-phones, the question is rather why a user has
>>> to do parallelization & distribution himself. This surely is a
>>> philosophical question on which I would be interested to hear your
>>> opinion.
>>>
>>> Ciao
>>> Chris
>>>
>>> ________________________________________
>>> From: Alberto Fasso' [fasso_at_slac.stanford.edu]
>>> Sent: 29 June 2011 19:13
>>> To: fluka-discuss_at_fluka.org
>>> Cc: Chris Theis; Alfredo Ferrari; denis bertini
>>> Subject: RE: Random seed (RANDOMIZE)
>>>
>>> Two answers, one to Chris and one to Denis. In both cases, I ask you to
>>> keep in mind that every Monte Carlo code is designed according to a
>>> particular "philosophy", that you may like or not. Suggestions to do
>>> modifications are ok, but not if they are incompatible with that
>>> philosophy.
>>>
>>> On Wed, 29 Jun 2011, Chris Theis wrote:
>>>
>>>> with all due respect, I beg to differ on this point. The approach that
>>>> I describe is nothing new and is actually implemented in various Monte
>>>> Carlo codes, admittedly in the CG domain, but this is not of concern
>>>> for this discussion.
>>>>
>>>> The approach is simply to reverse the default behavior that FLUKA
>>>> currently exhibits. So mathematically nothing changes at all. More
>>>> specifically, it would mean deterministic selection of predefined seeds
>>>> as an option for debugging and development, while starting from
>>>> (pseudo)random seeds as a default. However, this does not mean that the
>>>> user would be prevented from selecting a seed himself, but he would not
>>>> be forced to do so to get independent runs. It is true that
>>>> mathematically it cannot be fully excluded that the same seed would be
>>>> re-used if generated in a pseudo-random manner, but given the
>>>> periodicity of modern RNGs or the application of, for example, a VIA
>>>> mobo, the probability can be kept at an almost zero level.
>>>
>>> Chris, your proposal would make it impossible for anybody to help a user
>>> who has got a problem, or to find a rare programming bug in the code.
>>> Reproducibility of random numbers is as important as their independence,
>>> and probably more. This was realized by Monte Carlo developers from the
>>> very early times: the first generators produced completely independent
>>> numbers, obtained from electronic noise measurements, and they were soon
>>> abandoned because they were not reproducible.
>>> Referring to what I have said above, your idea of what a default should be
>>> conflicts strongly with the FLUKA philosophy that all jobs should be
>>> reproducible
>>> (see for instance the care taken about overlapping ORs in the geometry).
>>>
>>>> Alfredo Ferrari wrote:
>>>
>>>> I believe Denis' question is different: if I understand correctly, he
>>>> would like to have a way, through a user routine, to input something
>>>> equivalent to what is now input in what(2) of RANDOMIZe. Am I correct?
>>>
>>> It is difficult to understand why making a copy of an input file should be
>>> more complicated than writing, compiling and linking a user routine.
>>> The fact that GEANT does it this way is coherent with the fact that all
>>> GEANT input is programmed by the user. FLUKA has been written with the
>>> opposite philosophy: that the user should program nothing or as little
>>> as possible.
>>> All that can be done with an input file shall be done with it.
>>>
>>> Alberto
>>>
>>
>>
>> Paola Sala
>> INFN Milano
>> tel. Milano +39-0250317374
>> tel. CERN +41-227679148
>>
>

-- 
Alberto Fasso`
SLAC-RP, MS 48, 2575 Sand Hill Road, Menlo Park CA 94025
Phone: (1 650) 926 4762   Fax: (1 650) 926 3569
fasso_at_slac.stanford.edu
Received on Sat Jul 02 2011 - 13:23:26 CEST
