RE: Combining data from runs on different machines/architectures

From: <Roger.rhaelg_at_phys.ethz.ch>
Date: Fri, 05 Jun 2009 16:58:25 +0200

Hi Chris

Thanks for your answer.
Do you start several cycles in parallel or just different runs in
parallel?
I thought of using the condor submit file in a way like
....
args=-N 0 -M 1
Queue

args=-N 1 -M 1
Queue

args=-N 2 -M 1
Queue
....
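To avoid typing such entries by hand for many jobs, they could be generated with a few lines of Python. This is only a sketch that reproduces the pattern shown above; the function name `make_submit_entries` and its parameters are made up for illustration, and the rest of the submit file (executable, output files, and so on) is left out.

```python
def make_submit_entries(n_jobs, last_cycle=1):
    """Return submit-file text with one 'args=... / Queue' pair per job.

    Hypothetical helper: it only mirrors the pattern from the post
    (args=-N <i> -M <last_cycle>), nothing more.
    """
    blocks = []
    for i in range(n_jobs):
        # One entry per job, each followed by its own Queue statement.
        blocks.append(f"args=-N {i} -M {last_cycle}\nQueue\n")
    # Separate the entries with a blank line, as in the example above.
    return "\n".join(blocks)


if __name__ == "__main__":
    print(make_submit_entries(3))
```

Calling `make_submit_entries(3)` gives the three entries listed above, ready to be pasted into the submit file.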

It seems to me that the random seed (i.e. raninput001) for the next
cycle is written just when the cycle starts. Is this right? If so, it
would be possible to submit cycles as separate condor jobs in parallel.
But the comment in the manual makes me cautious (again, the cluster uses
different architectures such as Intel, AMD, ...):

"It is MANDATORY to use only seeds output information as written by the
program in earlier runs ON THE SAME COMPUTER PLATFORM. Otherwise the
randomness of the number sequence would not be guaranteed."

What do you think about that?

For the statistical evaluation, is there a difference between
calculating many runs with different initial seeds and just one cycle
each, and calculating fewer runs with several cycles each?
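To make the question concrete, here is how I would combine the results in a simple batch-means picture. This is only a sketch under my own assumptions: every batch (one cycle or one run) uses the same number of primaries and an independent random number sequence, and the function name `combine_batches` is made up. It is not the FLUKA post-processing itself.

```python
import math


def combine_batches(batch_means):
    """Combine per-batch results into a mean and its standard error.

    Assumes equally weighted, statistically independent batches
    (hypothetical helper, not a FLUKA tool).
    """
    n = len(batch_means)
    if n < 2:
        raise ValueError("need at least two batches to estimate an error")
    mean = sum(batch_means) / n
    # Unbiased sample variance of the batch results.
    variance = sum((x - mean) ** 2 for x in batch_means) / (n - 1)
    # Standard error of the combined mean.
    return mean, math.sqrt(variance / n)


if __name__ == "__main__":
    mean, error = combine_batches([1.0, 2.0, 3.0])
    print(mean, error)
```

In this picture the estimator only sees a list of batches, so whether a batch came from a separate run or from a further cycle of the same run should not matter, as long as the independence assumption holds.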

Greetz

Roger

On Thu, 2009-05-28 at 17:22 +0200, Chris Theis wrote:
> Hi Roger,
>
> > I plan to use a condor cluster to enhance the CPU power for my
> > simulations. The cluster consists of machines with different
> > architectures (Intel, AMD, AMD64, x86_64). I want to start several runs
> > with one or more cycles on many machines in parallel using different
> > initial seeds and combine the simulated data in the end.
>
> This is exactly the environment in which we're running FLUKA simulations
> in the RP group at CERN without any problems. The publicly available
> version of FLUKA is a native 32-bit build, so 64-bit machines will run
> it in either compatibility or legacy mode. Therefore, combining
> data from different machines & architectures will be no problem.
>
>
> There is only one thing that you might want to keep in mind and this is
> not FLUKA specific but rather due to different CPU architectures. So far
> I've seen that in some cases floating point rounding can be slightly
> different on Intel & AMD even if the FPU has been set to strict
> double-precision instead of the default extended-precision mode.
> However, this behavior is not limited to FLUKA but you can also find it
> in other programs. As a consequence you should make sure that you re-run
> the simulation on the same architecture in case you want to fully
> reproduce results starting from the same random seed. Otherwise the
> particle histories might diverge at one point which would lead to
> slightly different results as it is in the nature of MC simulations.
>
> Cheers
> Chris
>
>
> ------------------------------------------------------------------------
> Chris Theis
> CERN/DG-SCR - European Organization for Nuclear Research
> 1211 Geneva 23, Switzerland
> Phone: +41 22 767 8069 Office: 892-2A-015
> e-mail: Christian.Theis@cern.ch www: http://www.cern.ch/theis
> ------------------------------------------------------------------------
>
>
> > -----Original Message-----
> > From: owner-fluka-discuss_at_mi.infn.it
> > [mailto:owner-fluka-discuss_at_mi.infn.it] On Behalf Of Roger Hälg
> > Sent: 28 May 2009 14:25
> > To: fluka-discuss_at_fluka.org
> > Subject: Combining data from runs on different machines/architectures
> >
> > Dear FLUKA experts
> >
> > I plan to use a condor cluster to enhance the CPU power for my
> > simulations. The cluster consists of machines with different
> > architectures (Intel, AMD, AMD64, x86_64). I want to start several runs
> > with one or more cycles on many machines in parallel using different
> > initial seeds and combine the simulated data in the end. Inevitably the
> > runs will be performed on Linux PCs with different architectures.
> >
> > I recall reading something here on the mailing list about
> > problems with using data simulated on different machines, mainly
> > concerning the use of the random number generator. Unfortunately I
> > can't find the post anymore.
> >
> > So my question is whether it is possible to combine the data from the
> > different machines without adulterating the results. Or do I have to
> > account for something in addition?
> > Any further explanation of this topic would be greatly appreciated.
> >
> > Thanks
> >
> > Roger
Received on Sat Jun 06 2009 - 11:40:14 CEST
