DISPATCH Documentation¶
Overview¶
DISPATCH
- is a code framework, rather than just another MHD/HD code
- is a system for task-based computing with, in principle, unlimited scaling
- supports different kinds of (co-existing) tasks (e.g., cell-, ray- & particle-based), using HD, MHD, RMHD, non-ideal MHD, and / or particle-in-cell solvers
- can support and boost the performance of both existing and new solvers
- relieves solvers from dealing with MPI communication and OpenMP parallelization
- makes it trivial to implement new solvers, which are required to perform only two tasks:
- choose a time step size, based on local variables
- update its state, given guard cell or other neighbor information, provided by the framework
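Schematically, this contract amounts to something like the following (a Python pseudocode sketch only; the actual solver interface is in Fortran, and all names here are illustrative):
# Pseudocode sketch of the minimal solver contract; not the actual API.
class MinimalSolver:
    def dtime(self):
        """Choose a time step size, based on local variables only."""
        return self.courant * (self.ds / self.max_signal_speed()).min()

    def update(self):
        """Advance the local state one time step; guard cell (or other
        neighbor) information has already been provided by the framework."""
        ...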
Presentations¶
- Short presentation at From Stars to Planets II, Gothenburg 2019
- Invited talk at Zooming in on Star Formation, Nafplio, Greece 2019
- Invited talk at Origins of Solar Systems, Gordon Research Conference 2019
- Invited talk at Modelling High-Mass Stellar Feedback, Tübingen 2020
Publications¶
- Pan, L., Padoan, P., & Nordlund, Å., “Inaccuracy of Spatial Derivatives in Riemann Solver Simulations of Supersonic Turbulence” , The Astrophysical Journal 876, 90 (2019). arXiv:1902.00079
- Pan, L., Padoan, P., & Nordlund, Å., “The Probability Distribution of Density Fluctuations in Supersonic Turbulence” , arXiv e-prints arXiv:1905.00923 (2019). arXiv:1905.00923
- Popovas, A., Nordlund, Å., & Ramsey, J. P., “Pebble dynamics and accretion on to rocky planets - II. Radiative models” , Monthly Notices of the Royal Astronomical Society 482, L107 (2019). arXiv:1810.07048
- Popovas, A., Nordlund, Å., Ramsey, J. P., & Ormel, C. W., “Pebble dynamics and accretion on to rocky planets - I. Adiabatic and convective models” , Monthly Notices of the Royal Astronomical Society 479, 5136 (2018). arXiv:1801.07707
- Ramsey, J. P., Haugbølle, T., & Nordlund, Å., “A simple and efficient solver for self-gravity in the DISPATCH astrophysical simulation framework” , Journal of Physics Conference Series 1031, 012021 (2018). arXiv:1806.10098
- Nordlund, Å., Ramsey, J. P., Popovas, A., & Küffmeier, M., “DISPATCH: a numerical simulation framework for the exa-scale era - I. Fundamentals” , Monthly Notices of the Royal Astronomical Society 477, 624 (2018). arXiv:1705.10774
License¶
DISPATCH is licensed under a BSD 3-Clause license:
Copyright (c) 2019 Åke Nordlund, Jon Ramsey, Andrius Popovas, Troels Haugbølle, Michael Kuffmeier.
All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted
provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this list of
conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of
conditions and the following disclaimer in the documentation and/or other materials provided
with the distribution.
3. Neither the name of the copyright holder nor the names of its contributors may be used to
endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.
Collaborations¶
We welcome collaborations, in which we give access to the development version of DISPATCH, or provide support in connection with use of the public repository. Such collaborations are expected to result in contributions in the form of added solvers and/or experiment setups. After some time, such additions are expected to migrate to the public repository.
In most cases such collaborative development efforts run smoothest if forks of the development repository (or possibly of the public repository) are used, with pull requests used to merge validated and stable methods into the main repository.
In cases where our contributions to the use of DISPATCH are significant, those of us who participate may also expect co-authorship.
User Guide¶
Installing¶
To clone a working copy of the code from Bitbucket, do
git clone https://aanordlund@bitbucket.org/aanordlund/dispatch.git
To work with GIT behind a firewall – including cloning – see the section below.
To contribute back to the repository, use pull requests from a branch, or (better) set up a fork on Bitbucket (search for “Forking a repository” on Bitbucket Support), and make pull requests from the fork.
To get direct write access (mainly for core developers and collaborators), set up an account on Bitbucket if you don’t have one, and ask to be added to the access list. Then use the URL git@bitbucket.org:aanordlund/dispatch to clone or fork.
GIT behind a firewall¶
To use GIT with Bitbucket on a cluster behind a firewall, you can use git over SSH, and “tunnel” port 2222 on the remote machine to port 22 at bitbucket.org, via your laptop. You also need to have an account at bitbucket.org (useful in any case); cf. the Bitbucket account section below. When this is arranged, do
ssh -R 2222:bitbucket.org:22 your_login@host_behind_firewall
From the remote host's point of view, the repository is available on the local port (2222), and the clone command therefore looks like this:
git clone ssh://git@localhost:2222/aanordlund/dispatch.git
Once cloned, git pull and other GIT commands that need access to the repository will work, as long as the SSH tunnel is connected.
If you want to avoid having to add the extra options to the SSH command, you can instead add the tunneling setup to your $(HOME)/.ssh/config file on the host or laptop that you use to log in to the cluster:
Host clusteralias
HostName cluster.domain
User username
RemoteForward 2222 bitbucket.org:22
For more on how to set up Bitbucket for SSH, see Atlassian's instructions, including how to add an SSH key to your Bitbucket profile.
Bitbucket account¶
To create your own repository, or a fork, or to use the ssh: method behind a firewall, you need to have an account at bitbucket.org.
After registering, use ssh-keygen on the cluster frontend to create an SSH key. Then use the “A” icon in the lower left corner of the Bitbucket web page to access “Bitbucket Settings” and “SSH keys”, and store the public (.pub) part of the key there.
To access a repository or fork of yours from your laptop, or another host, add the corresponding SSH keys in the same way.
Execution environment¶
To include the utilities/python/ directory in the Python search path, and (optionally) set default options for compilation, add these lines to your ~/.bashrc startup file:
export DISPATCH=${HOME}/codes/dispatch
export PYTHONPATH=${DISPATCH}/utilities/python:${PYTHONPATH}
# optionally:
export HOST=NameForYourLaptop
To set compile options, add a config/host/NameForYourLaptop/Makefile containing, for example:
COMPILER=ifort
SEPARATE_BUILDS=on
Experiments¶
A few of the (many more) experiments from the development
repository are made available in the public version, mainly
to serve as examples of how to construct experiments, and how
to implement solvers.
Assuming you have the gfortran compiler and MPI (if not, see Compiling), downloading, compiling, and running for example the heat_diffusion demo requires only these commands:
git clone https://bitbucket.org/aanordlund/dispatch
cd dispatch/experiments/heat_diffusion
make -j
./dispatch.x
Heat diffusion¶
The heat_diffusion/ experiment demonstrates how to set up an experiment that refers to the very simple solver solvers/heat_diffusion/solver_mod.f90, and by extension demonstrates how to add a solver (any solver) to the DISPATCH code framework.
The entire solver and experiment setup is defined by the files:
solvers/heat_diffusion/Makefile
solvers/heat_diffusion/solver_mod.f90
experiments/heat_diffusion/Makefile
experiments/heat_diffusion/experiment_mod.f90
experiments/heat_diffusion/input.nml
MHD shock¶
The mhd_shock/ experiment runs a classical MHD shock tube experiment, where the initial left and right states are set by the input namelist file, and the choice of solver is made by setting the SOLVER macro in the Makefile.
Pan¶
The pan/ experiment corresponds to the runs behind the results reported in the series of papers by Liubin Pan, Paolo Padoan, and collaborators (see Publications, or search ADS for an up-to-date list).
Stellar atmospheres¶
The stellar_atmospheres experiment demonstrates how to set up models of stellar atmospheres, using tabular equations of state, and including diffuse radiative transfer.
Truelove¶
The truelove/ experiment corresponds to the runs behind the results reported in the paper:
- Ramsey, J. P., Haugbølle, T., & Nordlund, Å., “A simple and efficient solver for self-gravity in the DISPATCH astrophysical simulation framework” , Journal of Physics Conference Series 1031, 012021 (2018). arXiv:1806.10098
Turbulence¶
The turbulence/ experiment demonstrates how to set up the driving in forced turbulence experiments, where the details of the driving may need to depend on the value of the SOLVER makefile macro.
The hierarchical makefile setup (which may be followed by inspecting the config/Makefile and its sinclude statements) allows one to override the default choice of the force_mod.f90 or forces_mod.f90 by placing corresponding files in subdirectories under the experiments/turbulence/ directory. The specific driver in the case where SOLVER=ramses/hydro, for example, resides in experiments/turbulence/ramses/hydro/.
Alternatively, as illustrated by the directory experiments/turbulence/stagger2e/, one may place a specialized experiment in a subdirectory, where the TOP makefile macro is correspondingly defined as TOP=../../..
Compiling¶
Typically, you need to use a module command to get access to a compiler and an MPI library. This could be, for example:
module load intel openmpi/intel
To compile, go to one of the experiments/ directories, and use make info to see the macro names and values that would be used by make:
cd experiments/turbulence
make info
If the automatically selected make options shown are OK, just do:
make -j
If MPI is not available, add MPI= to the make command, and if the (gfortran) compiler chosen by default is not appropriate, add for example COMPILER=ifort. The directory config/compiler/ shows which compiler configurations are available.
See also the Execution environment section.
Command line tools¶
Whether you run on Linux, MacOS, or Windows, you need to have a few basic command line tools:
git
make
gfortran (or ifort)
Windows: Cygwin¶
In Windows, a simple way to get access to gfortran and other tools that are needed (git, make, ..) is to install Cygwin (cygwin.org). Choose the 64-bit version, and be sure to include:
gcc-core
gcc-fortran
gcc-debuginfo
openmpi
openmpi-debuginfo
libopenmpi-devel
libopenmpi40
libopenmpifh40
libopenmpiuse08_40
libopenmpiusetkr40
make
git
and maybe more (shells, SSH, …)
GUI tools¶
It may also be nice to have GUI (Graphical User Interface) versions of some tools, especially to work with GIT, e.g.:
SourceTree
Option Groups¶
Supported compilers come with predefined option groups. To see the actual compiler options they include, do:
make OPTS=full_debug info
make OPTS=debug info
make OPTS= info
make OPTS=optimized info
The bundled options are chosen so that
- full_debug: generates very slow code that traps many problems
- debug: generates faster code that traps fewer problems
- (blank): compromise between compile time and code speed
- optimized: maximizes speed, at the cost of increased compile time
Note that local options in config/host/$HOST/Makefile and in config/compiler/$HOST/$OPTS.mkf may modify the option bundles.
Overriding Options¶
If you prefer to compile with another set of options, do for example:
make OPT="-O3 -xHost"
Any other of the macro names shown by make info can also be replaced:
make PAR=
make MPI=
would compile without OpenMP and without MPI, respectively.
Running¶
To run with the default input file (input.nml), with data output into the ./data/ directory:
./dispatch.x
To run with another input file:
./dispatch.x run.nml
In this case, data output goes to the ./data/run/ directory.
OpenMP¶
OpenMP is fundamental in DISPATCH, so to take advantage of task parallelization on a single node, set the OMP_NUM_THREADS environment variable before starting. For example:
export OMP_NUM_THREADS=20
./dispatch.x
To see how many threads your compute node supports:
grep -c processor /proc/cpuinfo
That file contains other detailed information about the cores, which might be relevant when choosing compiler options:
more /proc/cpuinfo
Running under MPI¶
Use the relevant cluster documentation to find out how to run under MPI. To use SLURM to run a job with 64 MPI processes, with two processes per node and 10 cores per process, you may need something similar to this (which would work on HPC.KU.DK):
#!/bin/bash
#SBATCH --ntasks=64 --ntasks-per-node=2 --cpus-per-task=10
#SBATCH --mem-per-cpu=3g --exclusive
export OMP_NUM_THREADS=10
./dispatch.x
Hyper-threading¶
If the system supports hyper-threading, e.g., 2 threads per core, with 10 cores per process:
#!/bin/bash
#SBATCH --ntasks=64 --ntasks-per-node=2 --cpus-per-task=10 --ntasks-per-core=2
#SBATCH --mem-per-cpu=3g --exclusive
export OMP_NUM_THREADS=20
./dispatch.x
Data Access¶
To access the data, use Python (cf. the Python support and Execution environment sections for setup details). A brief overview of the data structure and a few simple examples of Python access are given below.
Data Structure¶
Data from execution of experiments ends up under the subdirectory data/. It is often convenient to let data/ be a softlink to a directory on a dedicated disk system (e.g. a Lustre file system on a cluster, or an external disk drive on a workstation or laptop).
When using the default input file (input.nml), data is stored directly in data/, while if using e.g. run.nml, the data is stored in data/run/.
Metadata describing the snapshots is stored as namelists in files such as
data/params.nml
data/00000/snapshot.nml
data/00000/rank_00000_patches.nml
The binary data may, depending on I/O method, reside in files such as:
data/snapshots.dat
data/00000/snapshot.dat
data/00000/00000.dat
Simple examples¶
These examples illustrate how to access and display data from snapshot 2 of a run with input.nml as input file (see Execution environment for the assumed setup):
# Open snapshot 2 metadata
import dispatch
sn=dispatch.snapshot(2)
...
# Plot the horizontal average of density
import dispatch.select as dse
z,f=dse.haver(sn,iv='d',dir=2)
import matplotlib.pyplot as plt
plt.plot(z,f)
...
# Display a horizontal plane using YT
import dispatch.yt as dyt
ds=dyt.snapshot(2)
import yt
slc=yt.SlicePlot(ds,fields='density',axis=2,center=[.5,.5,.5])
slc.show()
...
# Display a horizontal plane using dispatch.graphics
import dispatch.graphics as dgr
dgr.amr_plane(sn,iv='d',z=0.5)
For more general cases, see the Python support documentation.
Data output¶
Several output formats are available; they are described in detail below. In addition, one can optionally add HDF5 output, and “auxiliary” output, which is incorporated seamlessly into the Python data access methods.
Auxiliary data¶
One can add auxiliary data, which will be automatically added to the set of variables accessed via the dispatch Python module, by adding these lines in the source code:
USE task_mod
USE aux_mod
...
...
call task%aux%register ('name', array)
array should be a pointer to a 1-D, 2-D, 3-D, or 4-D real array. The aux data type may be added to any other sub-class of the task_t data type (e.g. as a patch%aux or an extras%aux).
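On the Python side, a registered variable is then expected to show up among the ordinary patch variables. A minimal sketch, assuming string-based variable access as used elsewhere in this guide (the exact access path for aux variables is an assumption here):
import dispatch

sn = dispatch.snapshot(1)
p = sn.patches[0]
a = p.var('name')   # hypothetical: access by the string passed to task%aux%register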
Python support¶
It is highly recommended to use Anaconda as the Python 3 installation, also adding Spyder, and installing the add-on that allows running Jupyter Notebooks inside Spyder.
To import the Python modules, add the full path of the DISPATCH directory utilities/python/ to the Python path (see Execution environment), and do:
import dispatch
import dispatch.select
import dispatch.graphics
import dispatch.yt
Documentation¶
Several examples of how to access and visualize DISPATCH data are given in the Jupyter Notebooks section.
The module APIs are self-documented, using the standard Python documentation mechanisms, cf.
help(dispatch)
help(dispatch.select)
dispatch.snapshot(<TAB>
The source code may be found in utilities/python/dispatch/, and is also available via the auto-generated documentation (the “Files” menu contains source code).
dispatch module¶
Try for example:
import dispatch
help(dispatch)
help(dispatch.snapshot)
sn=dispatch.snapshot(<TAB>
To read the 1st snapshot from data/, data/run, and ../data/run, respectively, do:
sn=dispatch.snapshot(1)
sn=dispatch.snapshot(1,run='run')
sn=dispatch.snapshot(1,'run',data='../data')
The resulting sn is an instance of the snapshot class, where many of the Fortran data type attributes from the source code are available as Python object attributes. To see what is offered, do:
dir(sn)
dispatch.select module¶
dispatch.select contains support functions for choosing selections of snapshot data.
Try for example:
s,f=dispatch.select.haver(sn,dir=2)
plot(s,f)
which returns in f the “horizontal” averages of the density when “up” is axis 2 (the z-axis); s contains the coordinate values (z in this case). Or try plotting horizontal min- and max-values, using:
s,f0,f1=dispatch.select.hminmax(sn,dir=2)
plot(s,f0); plot(s,f1)
dispatch.graphics module¶
dispatch.graphics contains support functions for visualizing snapshot data.
Documentation:
help(dispatch.graphics)
help(dispatch.graphics.imshow)
Note for example dispatch.graphics.imshow(im), which is similar to matplotlib.pyplot.imshow(im), but uses Fortran index conventions.
dispatch.yt module¶
dispatch.yt is an interface to the YT Project Python package.
Try for example:
import yt
import dispatch.yt
To read the 1st snapshot from ../data/run, do:
help(dispatch.yt.snapshot)
ds=dispatch.yt.snapshot(1,run='run',data='../data')
The resulting ds is an instance of the YT dataset class. For documentation of the various YT commands below, use help(command).
To make a slice plot, try:
fld='density'
slc=yt.SlicePlot(ds,fields=fld,axis=2,center=[.5,.5,.5])
slc.set_log(fld,True)
slc.set_colorbar_label(fld,'Density')
slc.set_xlabel('')
slc.set_ylabel('')
slc.annotate_grids(edgecolors='white',draw_ids=False)
slc.show()
slc.save('slice_plot.png')
Jupyter Notebooks¶
Example / template notebooks:
Reading data and 1-D plotting with DISPATCH¶
NB: Remember to set the PYTHONPATH environment variable to $DISPATCH_DIR/utilities/python, where $DISPATCH_DIR is the location of your DISPATCH repository.
This notebook assumes you have already compiled DISPATCH for the 1-D MHD shock experiment (make) and run the code (./dispatch.x) successfully. The data can be found in the data directory.
[1]:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import dispatch
import dispatch.select
import itertools
First, read a snapshot.
[2]:
datadir = '../../../experiments/mhd_shock/data'
snap = dispatch.snapshot(iout=9,verbose=1,data=datadir)
parsing ../../../experiments/mhd_shock/data/00009/snapshot.nml
parsing io_nml
adding property io
parsing snapshot_nml
parsing idx_nml
adding property idx
parsing units_nml
adding property units
parsing cartesian_params
adding property cartesian
parsing ../../../experiments/mhd_shock/data/00009/rank_00000_patches.nml
added 3 patches
timer:
snapshot metadata: 0.008 sec
_patches: 0.003 sec
[3]:
for p in snap.patches: print(p.id, p.position, p.size, p.time, p.gn, p.ds, p.level)
1 [0.16666667 0.005 0.005 ] [0.33333333 0.01 0.01 ] 0.09 [66 1 1] [0.00564972 0.01 0.01 ] 7
2 [0.5 0.005 0.005] [0.33333333 0.01 0.01 ] 0.09 [66 1 1] [0.00564972 0.01 0.01 ] 7
3 [0.83333333 0.005 0.005 ] [0.33333333 0.01 0.01 ] 0.09 [66 1 1] [0.00564972 0.01 0.01 ] 7
Printed from left to right are the patch ID, the centre of the patch in Cartesian coordinates, the patch size, the time of the patch in the current snapshot, the dimensions of the patch data, the cell sizes, and the refinement level.
In this case, although we are dealing with a 1-D MHD shock tube, the solver employed does not permit true 1-D or 2-D calculations and thus there are a few dummy zones in the y- and z-directions.
[4]:
indices = snap.patches[0].idx.vars.copy()
print(indices)
print("Patch kind:",snap.patches[0].kind)
{0: 'd', 4: 'e', 1: 'px', 2: 'py', 3: 'pz', 5: 'bx', 6: 'by', 7: 'bz'}
Patch kind: stagger2_mhd_pat
Note: From here on, we assume that the shock tube runs along the x-direction.
[5]:
fig = plt.figure(figsize=(14.0,6.5))
fig.clf()
ncols, nrows = 4, 2
gs = fig.add_gridspec(ncols=ncols,nrows=nrows)
axes = []
for j,i in itertools.product(range(ncols),range(nrows)):
cax = fig.add_subplot(gs[i,j])
axes.append(cax)
if 'et' in indices.values(): indices.pop(snap.patches[0].idx.et) # If stored, don't plot total energy.
for cax, v in zip(axes, indices):
colrs = itertools.cycle(['r','g','b','c','m','y','k'])
for p in snap.patches:
jslice, kslice = 0, 0
if p.kind == 'ramses_mhd_patch': jslice, kslice = 4,4
cax.plot(p.x[p.li[0]:p.ui[0]],p.var(v)[p.li[0]:p.ui[0],jslice,kslice],marker='o',color=next(colrs),zorder=0.1*p.level)
cax.set_xlabel(r'$x$')
cax.set_ylabel(p.idx.vars[v])
axes[0].set_title(r't = {0:.03g}'.format(snap.patches[0].time))
fig.tight_layout()
plt.savefig(snap.patches[0].kind.strip())
plt.show()

Here are the MHD variables stored in this patch and their associated indices in the data. px is the x-component of momentum, etc. You don’t have to remember these indices, because you can always retrieve them using aliases, e.g., patch.idx.d.
Here’s the density at t = 0.09. Different colours have been used for each patch.
Now what if you want to see all of the data as one single array (which can be useful for analysis)?
[6]:
x, rho = dispatch.select.values_along(snap.patches,[0.0,0.0,0.0],dir=0,iv=snap.patches[0].idx.d)
[7]:
fig2 = plt.figure()
fig2.clf()
plt.plot(x,rho,'co')
for p in snap.patches:
edge = p.position[0] - 0.5*p.size[0]
plt.plot([edge,edge],[-10,10],'k--')
plt.axis(ymin=0.15,ymax=1.05,xmin=0.0,xmax=1.0)
plt.xlabel(r'$x$')
plt.ylabel(r'$\rho$')
[7]:
Text(0, 0.5, '$\\rho$')

Note that I’ve manually added vertical lines to denote patch boundaries.
[8]:
print(rho.shape)
(180,)
As you can see, the data is now available as a single numpy array.
Data access demo¶
Python packages
[1]:
import dispatch
import dispatch.select
import dispatch.graphics
Data and run directories (relative to ../notebooks):
[2]:
data='../data'
run='512m32_hllc'
io=10
Using, from the DISPATCH Python packages, dispatch.snapshot() to read in metadata, dispatch.select.unigrid_plane() to get an array with a slice, and dispatch.graphics.imshow() to render it:
[3]:
s=dispatch.snapshot(io,run,data)
slc=dispatch.select.unigrid_plane(s)
import matplotlib.pyplot as pl
pl.figure(figsize=(10,10))
dispatch.graphics.imshow(slc,vmin=-5,vmax=5)
pl.title('t={:.2f}'.format(s.time));
timer:
snapshot metadata: 0.008 sec
_patches: 11.706 sec

AMR slice animation with YT¶
Module import
[1]:
import os
import sys
import time
import yt
import dispatch.yt
import numpy as np
Optionally, turn off YT info lines
[2]:
yt.funcs.mylog.setLevel(40)
Data directory, run directory, and data source field
[3]:
data='../data'
run='amr_7'
fld='density'
[4]:
z=0.6
ds=dispatch.yt.snapshot(30,run,data)
slc=yt.SlicePlot(ds,axis=2,fields=fld,center=[.5,.5,z])
slc.set_log(fld,True)
slc.set_colorbar_label(fld,"Density")
slc.set_xlabel("")
slc.set_ylabel("")
slc.annotate_grids(edgecolors="white")
slc.show()
Make a sub-directory for images
[5]:
os.makedirs(data+'/'+run+'/im',exist_ok=True)
Loop over a number of timesteps, producing images for an animation:
[6]:
z=0.6 # slice plane
i0=5 # 1st snapshot
i1=50 # last snapshot
im=(i0-5)*2+1 # first frame number
for i in range(i0,i1+1):
ds=dispatch.yt.snapshot(i,run,'../data') # data source
slc=yt.SlicePlot(ds,axis=2,fields=fld,center=[.5,.5,z]) # slice it
slc.set_log(fld,True) # use log density
slc.set_colorbar_label(fld,"Density") # replace label on colbar
slc.set_xlabel(""); slc.set_ylabel("") # blank out x- and y-labels
slc.annotate_grids(edgecolors="white") # color of grid annotation
for j in (0,1): # double frames
file='{}/{}/im/{:03d}.png'.format(data,run,im) # .png file
sys.stdout.write('{:03d}'.format(im)) # progress
c='\n' if im%20 == 0 else ' ' # new line or not
sys.stdout.write(c)
im+=1
slc.save(file)
001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020
021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040
041 042 043 044 045 046 047 048 049 050 051 052 053 054 055 056 057 058 059 060
061 062 063 064 065 066 067 068 069 070 071 072 073 074 075 076 077 078 079 080
081 082 083 084 085 086 087 088 089 090 091 092
Setting up experiments¶
To construct a new experiment (experiment is essentially synonymous with simulation or project) in DISPATCH, one can often start out from an existing experiment: clone it into a new experiments/whatever/ directory, and modify or replace the existing files. In fact, an experiment is completely defined by the (relatively few) files that are present in the specific experiments/whatever/ directory.
The main files present there, and their functionalities, are:
- Makefile: select the main solver responsible for evolving the experiment
- experiment_mod.f90: specify initial conditions (ICs) and boundary conditions (BCs)
- scaling_mod.f90: specify the scaling between physical and code units
- extras_mod.f90: select additional functionalities from the extras/ directory, such as selfgravity, radiative transfer, auxiliary HDF5 output, etc.
- run.nml: specify, in Fortran namelist input files (*.nml), the parameters of specific runs of the experiment
- python/*.{py,ipynb}: provide Python support; e.g. for preparation of input and/or analysis of output
Makefile¶
The Makefile should initially be copied from another experiment directory, and usually requires only minor editing of a mostly obvious nature, such as choosing a different solver, changing the default compilation options, and adding or modifying the extras_mod.o: dependencies.
experiment_mod.f90¶
The experiment_mod.f90 module defines the highest level data type, which extends (inherits) the solver_t data type, and optionally overloads (or “intercepts”) calls to the standard init, update, output, and possibly other procedures. The main purposes of these are:
- init: read in and use input parameters from an input namelist, by convention called experiment_params.
- update: do whatever specific actions are required (or not) before and after calling the solver update procedure to update the state of a single patch (or, more generally, task).
- output: this is called immediately before the call to the update procedure (because that’s the point where guard zones have been loaded, boundary conditions have been applied, etc.).
Any or all of these procedures may be omitted, in which case calls are instead caught by the corresponding solver_t procedures. One can also pass on calls to these, with for example:
SUBROUTINE output (self)
class(experiment_t):: self
!...........................................................................
call self%solver_t%output
... whatever else one may want to do, e.g. via the HDF5 interface
END SUBROUTINE output
scaling_mod.f90¶
The scaling_mod.f90 file specifies the relations between physical and code units. To create it, copy the template file microphysics/scaling_mod.f90 (or copy an existing experiments/*/scaling_mod.f90 file).
The contents should be essentially self-explanatory. By choosing three scaling units, for example length, time, and mass, all other units are defined.
The three basic units may be expressed in CGS or SI units; both unit systems are present in the template file.
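For example (plain arithmetic, not the actual module contents; the numbers are illustrative placeholders, not DISPATCH defaults): choosing units for length, time, and mass fixes every derived code unit:
# Derived units follow from the three basic scaling choices (CGS here).
u_l = 1.0e16            # length unit [cm]
u_t = 3.156e7           # time unit [s] (about one year)
u_m = 1.989e33          # mass unit [g] (one solar mass)

u_v = u_l / u_t         # velocity unit [cm/s]
u_d = u_m / u_l**3      # mass density unit [g/cm^3]
u_p = u_d * u_v**2      # pressure unit [erg/cm^3]
u_e = u_m * u_v**2      # energy unit [erg]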
extras_mod.f90 options¶
The extras/ directory, and the related extras_mod.f90 template file, provide optional extra features, which are only linked into the executable if explicitly selected.
In practice, this is done by copying the extras/extras_mod.f90 template file to the experiments/whatever/ directory, uncommenting lines related to the selected extra features, and adding a line listing those features in experiments/whatever/Makefile.
*.nml input files¶
input.nml is a template input file, which serves a dual purpose:
- Define a simple test case, which should not take more than about a minute to run on an average laptop
- Act as a template file, to be copied to input files whatever.nml, whose names also define the location of output files, which appear in data/whatever/ directories. Here whatever may be chosen arbitrarily, and could for example be run1, run2, 2019-08-22a, or L=30px_levels=6-9.
python/*.{py,ipynb} files¶
Use a python/ sub-directory to store *.py and *.ipynb (Jupyter Notebook) files. This is where to put the experiment-specific Python files.
Functionalities that may be used in any context should instead be added to utilities/python/, or amended to, for example, utilities/python/dispatch/_dispatch.py or other files there.
For details about data access via Python, see the Data Access section.
Technical Details¶
Note 1: some of the issues and features described here may for a period of time only be available in the development repository.
Note 2: some of the technical discussion may not be up-to-date – they are left visible to encourage updates.
Adaptive Mesh Refinement¶
Refine event sequence¶
- In refine_t%make_child_patch():
  - create the new patch
  - add its parent patch as the only nbor, and prolong
  - set bits%init_nbors
  - add it to the task list
  - add it to the ready queue
  - switch perspective to the first time it enters refine_t%check_current()
- In refine_t%check_current():
  - detect bits%init_nbors
  - generate a new nbor list
  - possibly set bits%init_nbors on the nbors
  - clear bits%init_nbors
  - continue the usual business
- In task_mesg_t%unpack() for virtual tasks:
  - must do whatever an active task does, using list_t%add_new_link for that
  - must also detect bits%init_nbors, and act accordingly
A schematic sketch of this flow is given below.
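The following Python pseudocode renders the refine-side part of the sequence (illustrative only; the actual implementation is Fortran, and helper names such as new_child and find_nbors are invented for the sketch):
# Illustrative pseudocode of the refine event sequence; not DISPATCH code.
INIT_NBORS = 1                                # stands in for bits%init_nbors

def make_child_patch(parent, task_list, ready_queue):
    child = parent.new_child()                # create the new patch
    child.nbors = [parent]                    # parent is the only nbor ...
    parent.prolong_into(child)                # ... and provides initial data
    child.status |= INIT_NBORS                # request a proper nbor list
    task_list.append(child)
    ready_queue.append(child)                 # new AMR tasks are ready at once

def check_current(task, task_list):
    if task.status & INIT_NBORS:
        task.nbors = task_list.find_nbors(task)  # generate a new nbor list
        for nbor in task.nbors:                  # nbors may need new lists too
            nbor.status |= INIT_NBORS
        task.status &= ~INIT_NBORS
    # ... continue the usual business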
Derefine event sequence¶
- Call list_t%remove_and_reset() to:
  - set bits%remove, which hides the task in check_ready() (task and nbor)
  - remove the task from the task list (it will remain in the garbage bin while needed)
  - use the existing nbor list to set bits%init_nbors on nbors
  - now that the task is removed, call check_nbors() to see if nbors have become ready
  - move the task to the garbage bin
- When the task is no longer needed (no longer a member of any nbor list), it is deallocated and deleted by list_t%garbage_remove()
- In task_mesg_t%unpack():
  - detect bits%remove and call list_t%remove_and_reset()
This is identical to what is done on the owner rank, and is taken care of by the list_t%remove_and_reset() procedure.
New AMR tasks¶
A new AMR task (A) is ready to update immediately, since it was spawned from a task that was ready to update at that specific time. To update, with guard cells filled, it needs to have an nbor list. The nbor list must have the required nbors, and the nbor links must have needed and download both set to true.
When that task's (A) nbor list is created, and indeed when any new nbor list is created, that very action should trigger an init_nbors() in the nbors of task A (but not in nbors of nbors!). So, the caller of init_nbors() must also call set_init_nbors(), which sets bits%init_nbors in all nbors of the first task.
One of those nbors (B) may be virtual, and that task should also get a new nbor list, both on the rank (a) that triggers the chain of events, and on the rank (b) that owns task (B).
The rank (b) that owns the virtual task (B) also has (or will get) a copy of the task (A) that triggered the event chain, and since the principle is that virtual tasks behave exactly as their originals do, we need only ensure that the thread on rank (b) that unpacks the virtual copy of (A) also performs a set_init_nbors() call. This then triggers an init_nbors() in the next update of the boundary task (B) on rank (b), which then gets sent in copy to rank (a), as an update of its virtual task (B), where it gets a new nbor list, provided that the bits%init_nbors that caused the thread on rank (b) to generate a new nbor list is still set when the copy of the task arrives at rank (a).
Hence, when a thread updating a boundary task discovers bits%init_nbors, it should not clear that bit until after the send_to_vnbors() call.
Adding a new task¶
When a thread adds a new task, either as a consequence of AMR on the same rank, or because of AMR on another rank, or because of load balancing, it must always take these actions:
- Make sure that class(patch_t) tasks have a link back to the task link
- Add an nbor list to the task link
- Cause the new task to be added to the nbor lists of the nbors, by calling list_t%set_init_nbors(), which sets bits%init_nbors in all nbor tasks
- Increment the task total and task level counts on the rank
- Call check_nbors() on the new task link, which then (as always) runs check_ready() on all the nbors first, and then runs check_ready() on the task link itself (unless it is a virtual task)
These actions are taken in a procedure list_t%add_new_task(), which is called both from the AMR procedure that creates new tasks, and from the task_mesg_t%unpack() procedure that creates virtual copies of new tasks, or that creates a new virtual task because that task became a boundary task on the rank that owns it.
Removing a task¶
When a thread removes a task, either as a consequence of AMR on the same rank, or because of AMR on another rank, or because of load balancing, it must always take these actions:
- Immediately set the bits%move bit in the task status (under a brief lock). That bit should make the task effectively removed, immediately. Hence, even if it still exists, and is in the nbor lists of some tasks, it is ignored in check_ready() calls.
- While the task is being processed for removal, it is placed on a garbage list, as it may still be needed in task updates. While that is going on, the task may still be used in dnload() procedures.
- When the access count (n_needed) reaches zero, because the task has been removed from all nbor lists where it was present, the garbage collection routine is free to actually deallocate everything, and remove the last traces of the task.
To be precise on a particular aspect here: as long as a task that used to have the removed task in its nbor list still has it there, the removed task needs to be able to provide guard cell data. But when the nbor list has been updated and the task in question is no longer there, the task list presumably contains other tasks that provide the same guard cell coverage, and hence the task may actually be removed. Hence, the use in dnload() is tied one-to-one to presence in the nbor list, so download_link() should not take the remove bit into account.
The functionality described above is taken care of by list_t%remove_and_reset()
Garbage collection¶
When tasks are being added or removed, it is convenient if threads that are still assuming that the tasks exist can finish their work, without errors or task locking. This can be achieved by maintaining a count task_t%n_needed, which keeps track of how many threads currently hold a lock on an nbor list where the task is a member.
If the number is larger than zero, a task that should be deleted is instead added to a list of garbage tasks, from which it is removed and deleted when the count reaches zero.
There are only three procedures that create or delete nbor lists: copy_nbor_list, sort_nbors_by_level, and remove_nbor_list. These maintain the n_needed count, using atomic increments and decrements.
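The idea can be illustrated with a small sketch (Python, with a lock standing in for the atomic increments and decrements; all names apart from n_needed are invented):
import threading

class Task:
    def __init__(self, tid):
        self.id = tid
        self.n_needed = 0                # nbor list references to this task
        self._lock = threading.Lock()

    def incr(self):                      # called when an nbor list referencing
        with self._lock:                 # this task is created or copied
            self.n_needed += 1

    def decr(self):                      # called when such an nbor list
        with self._lock:                 # is removed
            self.n_needed -= 1

garbage = []

def delete_task(task):
    if task.n_needed > 0:
        garbage.append(task)             # still referenced: park it
    # else: deallocate immediately

def garbage_remove():
    # free tasks whose reference count has dropped to zero
    garbage[:] = [t for t in garbage if t.n_needed > 0]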
Nbor list dead-lock¶
A deadlock could occur if a task A depends on another task B being updated, because B is in the nbor list of A and the logical flag needed is true for the B entry in A’s nbor list, while at the same time A is not in the nbor list of B, and hence B will not include A when doing check_nbors(). This could cause a deadlock because the mechanism that puts task A in the ready queue is that a thread working on B, updating the time past the one that task A is waiting for, then calls check_ready() on the A-task.
But if the reason that task A is not on the nbor list of B is just a time delay (e.g. because of latency in MPI communication), then one just needs to make sure that the check_ready() call actually happens when A, after the delay, gets put on the nbor list of B. This is ensured if the adding of a task to an nbor list is always accompanied by a check_ready() call. One call too many does not hurt.
So, a guarantee against nbor-list-caused deadlocking is to add a check_nbors() call into the init_nbors() call. This will work, as long as init_nbors() is the method used to update the nbor lists from task_mesg_t%unpack(). If the method is changed, the new method needs to do something similar.
Nbor list handling¶
If threads were allowed to update the nbor lists of tasks that another thread might be working on, the situation would become complicated. The thread would need to lock the nbor list of the other task while updating it, but it cannot be allowed to lock its own nbor list during that time, since the thread working on that task might happen to be running the same procedure at the same time, and a deadlock could then occur.
If threads only ever change the nbor list of their active task, and leave changing the nbor lists of other tasks to the threads updating those tasks, then much of the nbor list locking and copying can be avoided.
When the active task accesses the nbor list of another task (this happens primarily in check_nbors()), it needs to be sure that another thread doesn’t change the nbor list in the meantime, so it needs to lock it. That locking is only effective if a thread that updates its own nbor list actually locks it when it does so. Hence, even for read-only access to work safely, the task-owning thread must lock the nbor list link while changing the nbor list.
Currently, the generation of new nbor lists is primarily done by list_t%init_nbors(). If or when other procedures are used to modify nbor lists, they must lock the task link while doing so.
Nbor list protocol¶
The nbor list handling is based on these principles:
- Threads need to lock their own nbor lists while updating them. Locking also removes the need to copy the nbor list during the update (copying would only serve to keep the lock time as short as possible). Considering how relatively seldom nbor lists need to be updated, locking during update is the simplest and safest option.
- While accessing the nbor lists of other tasks (e.g. during check_nbors()), threads must acquire a lock on the nbor list while using it, to ensure it isn’t being changed in the meantime. However, they must not modify the nbor lists of other tasks.
- The nbor lists presented to other tasks need not be completely up-to-date, but they must be valid, in the sense that the nbors exist and have the expected time advance.
- If at some point in (wall clock) time a task (A) has an nbor list that contains an nbor (B) that does not have task A in its nbor list, then to avoid deadlocks, a check_ready() on task A must happen as a consequence of it being added to the nbor list of task B.
The lines with task memory locking should be marked with “TASK LOCK”, while the nbor list locking should be marked with “LINK LOCK”. A sketch of the locking protocol is given below.
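In Python terms, the protocol looks roughly like this (threading.Lock stands in for the OpenMP link lock; all other names are invented for the sketch):
import threading

class Link:
    """A task link: the nbor list plus its lock ("LINK LOCK")."""
    def __init__(self):
        self.lock = threading.Lock()
        self.nbors = []

def update_own_nbor_list(link, new_nbors):
    # Only the thread updating the task itself modifies its nbor list,
    # and it holds the link lock while doing so.
    with link.lock:
        link.nbors = list(new_nbors)

def read_nbor_list(other_link):
    # Other threads may read under the same lock, but never modify.
    with other_link.lock:
        return list(other_link.nbors)    # work on a private copy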
Nbor task access protocol¶
In addition to accessing the nbor lists of nbor tasks, the thread updating a task also needs to access and modify nbor task data structures. Setting bits in the status word is inherently (by construction) atomic, so that should cause no problem, and should not require task locking. Changes of patch_t%mem should be protected by setting task_t%lock. This includes:
- task_t%rotate(), where the memory slots are rotated, to create a circular buffer
- timestep_t%update() for stagger2/ solvers, and corresponding sections of other solvers, where their patch_t%mem arrays are being modified
The direct costs of applying such locks by the owner thread are negligible. Locking other tasks while accessing their memory could incur a more significant cost, and should be avoided when possible.
A relevant example is the access of patch_t%mem from other tasks, in the form of source%mem in download_mod.f90. In principle, one should apply a lock there, but the chance that something bad happens is very small, since most of the time the access will be to two pairs of memory slots that have essentially zero chance of being modified during the access.
Since the access time in download_mod is a tiny fraction (of order 0.1%) of a task update time, the update is not likely to finish during the brief access time, even if the source task happens to be under update (the chance is a few percent, since most tasks are dormant at any one time). Therefore, in this case it is necessary to lock the source task during access only when the oldest slot that is accessed corresponds to task_t%new in the source.
Handling bits%init_nbors¶
When a thread updating a boundary task discovers bits%init_nbors, it should not clear that bit until after the send_to_vnbors() call, since the bit needs to be propagated to virtual copies of the task.
The nbor list should not be updated too early, since the thread is updating the task because it was deemed ready to update based on the existing nbor list, which has then been used (before the call to task%update()) to load data into the guard zones. The existing nbor list might also be used for some unspecified reason as part of the update, so it should not be touched until after the task has updated.
Hence the init_nbors() call should be done by task_list_t%update(), after its call to task%update(), and before the call to send_to_vnbors().
The call to send_to_vnbors() is immediately preceded by a call to load_balance(), which means the boundary task that will have a new nbor list generated could possibly be reassigned to another rank. Since it arrives with bits%init_nbors set, it will in any case have a new nbor list generated also on the other rank, so the result is that both the first rank and the other rank end up with consistent nbor lists for the task.
The state of that task after task%update() and init_nbors() is that of a “dormant” task, ready to be checked as a candidate for the ready queue by any thread updating one of its (possibly new) nbors. That is as it should be.
Consistency with MPI¶
The removal or addition of a task that is a boundary task should be reflected in the nbor lists of nbors that are virtual tasks, and there could possibly be a significant delay before that happens. The sequence of events should be:
- A thread always sends a boundary task to rank nbors, after updating or creating it. At that time, a bit (bits%init_nbors) must be set, which triggers the receiving rank to update the nbor lists of the task; either creating it, if it is a new task, or removing it and the task, if the task is to be removed.
  - So, when should an init_nbors() call be made from task_mesg_t%unpack()? Clearly, when a new task is being created, but also when a task that is an nbor is removed, or when one that should become an nbor is created. The first type causes a call automatically (should it?), while the second type is triggered by bits%init_nbors being set.
  - The first type occurs when a task has just been created by AMR; for a boundary task it happens both on the owner rank and on virtual nbor ranks.
- As a consequence of that, all nbors of that (virtual) task on the nbor rank should also have their nbor lists renewed. This includes those boundary tasks that appear as virtual tasks on the first rank.
- As the nbor lists of those virtual tasks are being modified, the thread doing the modifications must, correspondingly, take the appropriate action (i.e. calling check_nbors() for new tasks, as well as for tasks that are being removed).
MPI action items¶
Action items to ensure correct handling of AMR task under MPI:
- Ensure that new AMR tasks are created with a valid nbor list, i.e. one that may be used to fill their guard zones in the first update. [x]
- Ensure that new AMR tasks are immediately added to the ready queue, since there is no other mechanism to put them there, and since they are indeed ready to be updated. [x]
- Ensure proper handling of bits%init_nbors, which should trigger a call to init_nbors() from task_list_t%update(), after its call to task%update() and before its call to load_balance(), for active tasks, and a similar call to init_nbors() from the task_mesg_t%unpack() procedure for virtual tasks. [x]
- Ensure that the nbors of a new task, as well as of a task to be removed, get their bits%init_nbors set, and that that bit travels with boundary tasks and is acted upon by the task_mesg_t%unpack() procedure when it unpacks virtual tasks. [x]
- Ensure that any add_nbor_xxx() call is accompanied by a check_ready() call on that task link. This is done by adding a check_nbors() call to the end of the init_nbors() call that possibly added a new task. [x]
- Ensure that no locking of other task links occurs. The choice is made by not setting omp_lock%links in task%refine, and to make this choice permanent, all sections of the code with if (omp_lock%links) ... should be removed. [x]
- In addition, one should search explicitly for %lock%set in the code. [ ]
MPI I/O for AMR¶
Currently, only the method='legacy' file format works with AMR, and then only for writing; restarts have not been implemented yet.
A new method='snapshot' will be implemented for efficient saving of AMR data in a single data/run/NNNNN/snapshot.dat file per snapshot.
The structure of data on disk should be optimized for fast reading, and should have a format that is independent of the MPI geometry. This is achieved by (temporarily) numbering all patches in a rank 1...np, where np is the number of patches at the time of writing. Such blocks from all ranks are then arranged with each variable filling a contiguous piece of the snapshot.dat file; viz:
r1p1...r1pN, r2p1...r2pN, . . ., rMp1...rMpN
where M is the number of ranks and N is the number of patches per rank (which in general is allowed to differ from rank to rank). A metadata file gives, for each sequence number, the offset into the file, and the previous task number:
seq rank var task offset
1 0 0 1 xx
.
N 0 0
. .
N M 0
. . .
N M V
A restart does not need to result in the same tasks residing on the same ranks as before, but should achieve that whenever possible. After or while reading the metadata, each rank reads in its share of the load.
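One possible reading of that layout, as a sketch (the block size and the rank/variable ordering are assumptions here; the authoritative information is always in the metadata file itself):
def offsets(n_ranks, n_vars, np_per_rank, bytes_per_patch):
    """Enumerate (seq, rank, var, offset) entries for a layout where,
    within each rank, each variable fills a contiguous piece."""
    table, offset = [], 0
    for rank in range(n_ranks):
        for var in range(n_vars):
            for seq in range(1, np_per_rank[rank] + 1):
                table.append((seq, rank, var, offset))
                offset += bytes_per_patch
    return table

# Example: 2 ranks with 3 and 2 patches, 8 variables, 66 32-bit reals per patch
entries = offsets(2, 8, [3, 2], 4*66)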
Snapshot save¶
A saved AMR snapshot must contain – effectively – a list of tasks to read in, initialize, and fill with values from the snapshot.
In the case of, for example, the ppd_zoom experiment, the task list consists of more than one piece, with different types of tasks in each piece. Sufficient info about this must be carried by the snapshot metadata.
In general, a “component” that is part of the simulation setup (“scene”) may need to have its own save/restore (output/input). If so, it also means that patches and tasks need to be identified as belonging to a component, either by being part of a partial task list, or by having an identifier string (or integer). The former is not convenient, since on the one hand we want to have all tasks in a rank-global task list, and on the other hand we need to be able to add and remove tasks in only one place, not in the global and partial lists separately.
One could possibly rely on the fact that tasks needing special I/O treatment probably also need extra, non-generic parameters, and hence need to belong to a certain type of task, which would then naturally have its own input/output procedure.
Snapshot restore¶
When reading in AMR tasks, one needs to know the task type to be able to create the task into which the data is to be read. The sequence of steps needs to be:
- read the meta-data of a task
- create the task
- load data into the task
Clearly, the meta-data needs to contain information both about the task type, and about the location of the task data on disk.
It would be nice if this could be semi-automatic, so detection of task type was done by a required data type method.
A part of this issue is also the arrangement of different task types. We should be able to allow different solvers to exist, without insisting that one must be the extension of another. They should all be extensions of task_t, and some should also be extensions of patch_t:
task_t
/ | \
/ | \
task1_t | task2_t
|
patch_t
/ | \
/ | \
patch1_t | patch2_t
PDF I/O for AMR¶
The pdf_io_mod.f90 module outputs snapshots of the density probability distribution function (PDF) at regular intervals (out_time in the namelist pdf_io_params). The PDF is accumulated as patches progress past the next output time (patch_t%pdf_next), at which time the density data is interpolated to the exact output time and its contribution is accumulated in pdf_io_t%counts.
When a new AMR task is created, pdf_next is set to the next multiple of out_time, thus triggering a call to pdf_io_t%update when the task reaches that time. The number of tasks that need to be accumulated before the PDF is complete is (provisionally; cf. below) equal to io%nwrite, which is also the number of tasks that contribute to AMR snapshots of the patch values.
When one snapshot of the PDF finishes, the count of the number of tasks expected in the next snapshot is set equal to the number of tasks existing at that time, which is the value of io%nwrite at that time. If or when new AMR tasks are added, with pdf_next set to the next multiple of out_time, this increases the number of tasks that should be output by one; the counter that keeps track of tasks remaining to be output should thus be incremented by one. Correspondingly, if an AMR task is deleted, the count of remaining tasks to accumulate should be decremented by one.
There may be borderline cases, where a task is created just after the time when an output should happen, but these should really not cause problems, if the procedure above is followed.
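The counter logic can be summarized in a few lines (a sketch only; the class and method names are invented, and only io%nwrite and pdf_next come from the text above):
class PdfCounter:
    """Tracks how many tasks must still contribute to the current PDF."""
    def __init__(self, nwrite):
        self.remaining = nwrite       # initially the value of io%nwrite

    def task_added(self):             # new AMR task, pdf_next at this output
        self.remaining += 1

    def task_deleted(self):           # AMR task removed before contributing
        self.remaining -= 1

    def task_contributed(self):       # a task reached pdf_next and accumulated
        self.remaining -= 1
        return self.remaining == 0    # True: the PDF snapshot is complete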
Level support¶
The AMR actions on a task (refine and derefine) should only be done by the thread that is about to update the task, since any other alternative would lead to serious synchronization problems. The new level L-1 tasks that may be needed to support a new level L task can thus not be created until the next time the level L-2 task that needs to be refined is updated.
The refine_t data type has two methods related to checking for AMR support (i.e., checking that all patches at level L have level L-1 nbors to get guard zone values from):
- check_support() checks if a new (child) patch actually has support. If not, it issues a WARNING in the rank log file. The task then takes guard zone values from L-2 nbors, until the relevant L-1 nbors have been created.
- need_to_support() checks if a given L-2 level patch needs to be refined, to provide support for existing level L patches. If so, it creates such child patches, along with any other refinement that refine_t%needed detects a need for.
Both methods use 2-byte integer maps to detect lack of support – this method will work unmodified for moving patches.
A status bit (bits%support) is available, and is used to communicate to nbor patches about the need for support.
A new child patch is automatically sent to nbor ranks when it is created, and virtual patches are created there, as when any task arrives with an id that doesn’t exist yet on the rank. The bits%support bit informs the rank that this is a new AMR child patch.
Any new task arriving at a rank generates a call to list_t%init_nbor_nbors(), which uses position information to first create an nbor list for the task itself, and then generates new nbor lists also for the nbors of the task, since both sides of the nbor-nbor relation need to have consistent nbor lists.
Flux consistency¶
For a property such as mass to be conserved across patch boundaries, it is necessary that the two tasks agree, at least on average, on the mass flux passing through the boundary between the tasks. In the guard zones, task A can only estimate the flux computed and used by task B, especially if they have different resolutions and/or timesteps, since the flux for example depends on slope limiters that act differently for different resolution, and since flux calculations are based on forward predictor steps that result in different values at different times.
The solution is that tasks communicate the flux that was used in previous timesteps, so nbor tasks can compensate – a posteriori – for differences. The flux must be defined as exactly the flux that was used in the update. To keep and communicate it, one needs an array size of 6*N*N per time slice and per variable.
Since MPI bandwidth is usually not a bottleneck, one may, as a simple temporary solution, increase mw by one and communicate the flux values for the whole patch. Optimization is straightforward and can be done later; the main goal is to demonstrate that flux consistency can be achieved, and how it should be done.
To explore this, the sub-sections below progress from the simplest case to the most general case:
Same resolution and timestep¶
Assume two tasks with the same resolution and the same timestep size are being updated by one thread:
0------------+-----------------+----------------+------------+--- task A
|----fA1-----|------fB2--------|
|----fA1-----|------fB2--------|
0------------+-----------------+----------------+------------+--- task B
Assume the thread updates task A first, using an estimate of the flux through the interface between them computed based on internal values from A and values in the guard cells obtained from task B. The flux is saved, and is made available to task B.
Then the thread updates task B, and would in fact compute exactly the same flux at the interface, since it would have the values from task A in its guard cells. Hence it doesn’t matter which of the fluxes it uses, and there is no need for flux corrections.
The same holds true if the two tasks are updated simultaneously by two threads.
Same resolution, different timesteps¶
Assume two tasks with the same resolution and different timesteps are being updated by one thread:
tA1 tA2 tA3
|<---dtA1------->|<----dtA2------->|
0----------------+-----------------+----------------+------------+--- task A
| ´ | | | |
|-----fA1----|fA1|----fB2----|-fA2-|
| | | | |
0------------+---------------+--------------+------------+----------- task B
|<--dtB1---->|<----dtB2----->|<---dtB3----->|
tB1 tB2 tB3
Assume the thread updates task A first, and that its timestep is a bit longer than that of task B. It makes the flux it used (fA1) available to task B.
Then the thread updates task B, and computes a somewhat different flux than task A (because the flux is based on prediction of the state at a different time). If it uses that flux, the mass flux between the two tasks becomes inconsistent, and mass is either gained or lost.
To use consistent fluxes, without changing the state of task A a posteriori (and there is no reason to do that – it made an accurate estimate), task B should use the flux from task A (fA1) in the 1st time step.
The thread will then select task B again, for a 2nd update, because task A is ahead of task B. Here, task B should use the remaining part of the task A flux, and then use the flux (fB2) it computed (possibly even correctly time centered), for the remaining part of the time step.
The next task to be updated is task A, which should use flux fB2 from task B for the first part of the interval, and a flux (fA2) that it estimates itself for the last part of the timestep.
Correctly centering the partial-timestep fluxes would mean using predictor values at, for example, tB3 + (dtA2-(tB3-tA2))/2 instead of at tA2 + dtA2/2.
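The bookkeeping itself is plain arithmetic; the toy numbers below (not from any actual run) show that task B then transports exactly the same mass across the interface as task A did over A's first step:
# Toy illustration of the flux hand-over described above.
dtA1 = 0.12                 # task A's first time step
dtB1, dtB2 = 0.10, 0.11     # task B's first two time steps
fA1, fB2 = 2.0, 2.3         # flux used by A; flux later computed by B

# B's first step lies entirely inside A's first step, so B adopts fA1:
mass_B1 = fA1 * dtB1

# B's second step straddles tA2 = dtA1: use fA1 up to tA2, then fB2:
overlap = dtA1 - dtB1
mass_B2 = fA1 * overlap + fB2 * (dtB2 - overlap)

# Mass transported up to tA2 agrees exactly between the two tasks:
assert abs((mass_B1 + fA1 * overlap) - fA1 * dtA1) < 1e-12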
Same resolution, time extrapolation¶
Assume two tasks with the same resolution and different timesteps are being updated by several threads, with a grace value giving an allowed window for time extrapolation:
tA1 tA2 tA3
|<---dtA1------->|<----dtA2------->|
0----------------+-----------------+----------------+------------+--- task A
|-----fA1--------|------fA2--------|-------fA3------|
|-----fA1----|fA1|----fB2----|-fA2-|---fB3--|
0------------+---------------+--------------+------------+----------- task B
|<--dtB1---->|<----dtB2----->|<---dtB3----->|
tB1 tB2 tB3
Assume that after task A and task B have updated as in the previous case, one thread takes task B, to do its 2nd time step, while another thread takes task A, which has also been deemed ready to update, because it is within the grace interval ahead of task B.
Task B will update as in the previous example, using first the fA1 flux from task A, and then its own fB2 flux. Task A will use its own flux (fA2), since task B has not yet made fB2 available. This creates an inconsistency, which needs to be corrected subsequently.
To avoid imposing any new constraints on which task updates before the other, we assume that task A is still ahead by less than the grace value, and hence the two tasks could happen to be updated simultaneously again.
The inconsistency in the fB2 interval may be rectified if task A adds (fB2-fA2) times the length of the time interval. Task B is free to use fA2 in the overlapping time interval, and then its own fB3 value. Task A cannot use new information from task B regarding fluxes, since none is available, and hence it will use fA3 over the whole interval. The situation after the update is thus the same as after the previous update, and may be corrected in the same way, in the next round of updates.
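The a posteriori correction described above is just an add-back of the flux difference over the interval; a minimal sketch with illustrative names and values::

program flux_fix_demo
  implicit none
  ! fluxes used by the two tasks in the fB2 interval (illustrative values)
  real:: fA2=1.1, fB2=1.3
  real:: dt=0.10                ! length of the inconsistent time interval
  real:: mass=1.0               ! conserved quantity in the boundary cell
  ! task A compensates by adding (fB2-fA2) times the interval length
  mass = mass + (fB2-fA2)*dt
  print *, 'corrected mass:', mass
end program flux_fix_demo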
Different resolution¶
If the two tasks differ in resolution, with a constant ratio, and have aligned cells, then the only difference is that the flux contributions need to be normalized by the cell area, and need to be summed over the same areas.
Maintaining consistency¶
To maintain consistency each task must
- Maintain a time for each interface, up to which it has achieved flux consistency.
- Have access to the corresponding information from the task on the other side of the interface.
Each task must be able to rely on both tasks making the same assumptions about update order and about the method used to resolve flux inconsistencies. To achieve this without a need for negotiations, we can use the simple rules that 1) the “consistency marker” must always move forward in time, and 2) the task that lags behind in consistency time has “the right-of-way”; it can move its marker ahead of the other task.
Task B cannot know (and should not need to know) if task A will be updated during its own update (it could check the bits%busy bit, but that might be set only after the task B update has started).
If the consistency time of task A is ahead of that of task B then, by the first rule above, both task A and task B can trust that fluxes agree up to the earlier of the two consistency times. By the 2nd rule, the leading task can trust that the other task respects its flux choice(s) in the time interval between the two consistency times.
So, the rules holding at an interface are:
- the task which leads in consistency time can assume that the other task will adjust the fluxes used, up to the leading time
- the task which lags in consistency time has the obligation to fix consistency up to the leading time, and should then move its own consistency time ahead, to the end of its time step
|<---dtA1------->|<----dtA2------->|
0----------------+-----------------+----------------+------------+--- task A
|-----fA1--------|------fA2--------|-------fA3------|
|-----fA1----|fA1|----fB2----|-fA2-|---fB3--|
0------------+---------------+--------------+------------+---------- task B
|<--dtB1---->|<----dtB2----->|<---dtB3----->|
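A minimal standalone sketch of the two rules at a single interface (names and values are illustrative; the actual DISPATCH data structures differ)::

program consistency_demo
  implicit none
  ! consistency markers at one interface (illustrative values, cf. figure)
  real:: tcons_A=0.26, tcons_B=0.12
  real:: time_B=0.12, dtime_B=0.14
  if (tcons_B < tcons_A) then
    ! rule 2: the lagging task (B) adjusts its fluxes up to the leading time
    ! ... here task B would replace its saved fluxes with task A's ...
    ! rule 1: the marker only ever moves forward in time
    tcons_B = max(tcons_A, time_B + dtime_B)
  end if
  print *, 'new consistency time of task B:', tcons_B
end program consistency_demo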
Conserving div(B)¶
The divergence of the magnetic field should remain equal to zero at all times and at all locations. This condition is a constraint on the magnetic field components passing through a cell, from which one can determine the flux through one face from the fluxes through the other five faces. This may be used to enforce div(B)=0 at the interface between two DISPATCH patches, with components centered like so:
| || |
Bx Bz Bx Bz Bx
| || |
=====+===By===++===By===+=======
| || |
Bx Bz Bx Bz Bx
| || |
-----+---By---++---By---+-----
| || |
Bx Bz Bx Bz Bx
| || |
-----+---By---++---By---+-----
| || |
Bx Bz Bx Bz Bx
| || |
=====+===By===++===By===+=======
| || |
Bx Bz Bx Bz Bx
| || |
The double line symbolizes the patch boundary, with the Bx magnetic field component defined at the same location by two different tasks, generally at a sequence of different times.
The continuity of By(x) and Bz(x) through the face is guaranteed, since any glitch would correspond to an electric current, which would generate a counter-acting force.
The continuity of Bx(x) may then be enforced by simply computing the Bx component at the interface from the other 5 known face flux values: Bx(face) = Bx(internal) - delta(By) - delta(Bz) (where the delta() has to be taken in the appropriate sense).
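A minimal standalone sketch of this reconstruction, on a uniform staggered mesh (array names and metric factors are illustrative, not the actual DISPATCH data layout)::

program divb_demo
  implicit none
  integer, parameter:: n=4
  real, parameter:: dx=1.0, dy=1.0, dz=1.0
  real:: bx(n+1,n,n), by(n,n+1,n), bz(n,n,n+1), divb
  integer:: i, j, k
  call random_number (by)
  call random_number (bz)
  bx = 0.0
  ! sweep in x, computing each downstream Bx face from the other five faces
  do k=1,n
    do j=1,n
      do i=1,n
        bx(i+1,j,k) = bx(i,j,k) - dx*((by(i,j+1,k)-by(i,j,k))/dy &
                                    + (bz(i,j,k+1)-bz(i,j,k))/dz)
      end do
    end do
  end do
  ! verify that div(B) now vanishes in every cell
  divb = 0.0
  do k=1,n
    do j=1,n
      do i=1,n
        divb = max(divb, abs((bx(i+1,j,k)-bx(i,j,k))/dx &
                           + (by(i,j+1,k)-by(i,j,k))/dy &
                           + (bz(i,j,k+1)-bz(i,j,k))/dz))
      end do
    end do
  end do
  print *, 'max |div(B)| =', divb
end program divb_demo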
Initially, the values of By(z) at the upper y-edge, and the Bz(y) values at the upper z-edge are not known, but they are subsequently constructed, so after a few iterations also the edge and corner values are consistent with div(B)=0, and the whole cube, including the ghost zones around it, is divergence free (if it was to begin with).
Child task management¶
Steps when a new AMR task is created
- The task is created, with no nbor list, and w/o being present in any nbor list, with only interior interpolated data from the parent
- It is sent directly to the ready queue
- The thread calls check_current, notices the bits%init_nbor bit, and generates a new nbor list with tlist%init_nbor (link)
- That is enough to download guard cell data
- Some cells may need to be interpolated from L-2 nbors, which must be there, since otherwise the parent patch would not have had support
- This is quite OK, since the data are interpolated in any case
- The procedure also sets bits%init_nbor in the nbor tasks, but does not itself generate new nbor lists for its nbors
- When the nbor tasks come up for updating, they too will generate new nbor lists, because of the bits%init_nbor status bit
- With the nbor lists initialized, everything has arrived at a new order, and the next check_nbors() call will include testing of the new child patch.
Using this strategy, we are able to avoid calling init_nbor_nbors() after creating a new AMR task, and we are also able to avoid calling check_nbor_nbors(), since the nbors will take care of their own init_nbors().
Event sequences¶
If threads only ever change the nbor list of their active task, and leave changing the nbor lists of other tasks to the threads updating those tasks, then much of the nbor list locking and copying can be avoided. The choice is made by not setting omp_lock%links in task%refine.
When the active task accesses the nbor list of another task – this happens primarily in check_nbors() – it needs to be sure that another thread doesn’t change the nbor list in the meantime, so it needs to lock it. However, it does not need to lock its own nbor list during that time.
Each thread needs to lock its own active task’s nbor list in init_nbors(), while it is being changed, and then needs to lock the nbors’ nbor lists when they are being used (in check_nbors()).
Each thread needs to lock its own task memory while it is being changed (inside hd_t%update or in timestep_t%update), and then needs to lock the memory of nbors as it accesses it (in download_t%download_link()).
nbor list consistency¶
The nbor lists of two tasks must always be consistent, in the sense that no deadlock can occur because of the state of the nbor lists of two tasks that may or may not be nbors.
A deadlock could occur if a task A depends on another task B being updated, because B is in the nbor list of A and the logical flag needed is true for the B entry in A’s nbor list, while at the same time A is not in the nbor list of B, and hence B will not include A when doing check_nbors(). This could cause a deadlock because the mechanism that puts task A in the ready queue relies on a thread working on B, when it updates the time past the one that task A is waiting for, calling check_ready on the A-task.
Hence, we must either have the thread that updates nbor relations take care of updating both nbor lists “at the same time” (under lock), or we must make sure that an inconsistency does not have negative consequences.
If an nbor task (B) is present in the nbor list of a task (A), but task (A) is not present in the nbor list of the nbor (B), then task (A) might find that it cannot update because B is not up-to-date, while B, when it updates, does not check if A becomes ready, because A is not on its nbor list. That problem could be handled by letting the addition of a new task to an nbor list always trigger a check_nbors() run on the nbor list (or at least a check_ready() on the new nbor).
That simple fix might make it OK to delay the update of nbor lists until the thread that takes on an update can do it.
The implementation would then rely on these two mechanisms:
- When adding or removing a task from the nbor list of the task a thread is working on, it must set a bit in the other task, triggering the next thread that starts to work on that task to update its nbor list accordingly; i.e., either remove the task that set the status bit, or add it. This means the actual nbor relation must match the action; if the action was “add”, then the task must actually be a new, overlapping task, and if the action was “remove”, then the task must actually either a) no longer exist, or b) have a flag set that indicates it is about to be removed.
- After taking such an action, the thread must call check_ready() on a task that was added to an nbor list, and must run check_nbors() on the nbors of a task that is being removed – in case they will become updateable because the task is being removed. Again, the check_ready() must then recognize the bit that indicates a task is about to be removed.
Nbor list renewal¶
Nbor lists are used for these purposes, in logical order:
- list_t%check_nbors() uses the nbor list to determine which nbor tasks to run list_t%check_ready() on.
- In list_t%check_ready() the nbor list is used to figure out if a task is ready to be updated, by comparing the states of tasks with the function task_t%is_ahead_of().
- After the task gets picked from the ready queue, e.g. by dispatcher0_t%update, the nbor list is used by task_t%dnload to pick up the information from its nbors that it needs, in order to update.
- After the task has updated, and if it is a boundary task, the nbor list is used in list_t%send_to_vnbors(), the main purpose of which is to make the use of MPI transparent / “invisible”. To accomplish this, first the virtual copy of the active task must be updated, and then check_ready() must be used on the nbor tasks, by a check_nbors() being executed on the virtual copy of the active task.
At which point can one abandon an existing nbor list, and adopt a new one, without creating a chance of deadlock? Clearly not after step 1), since the nbor list used in %dnload() must be the same as the one used to check if the task was ready to update. Also, clearly not after step 2), since the nbor list at that point still carries information about which tasks were blocked by the active task not yet having been updated.
If a new task has been added by AMR, which happens between step 2 and 3, the requirements for being “ready” will change, for all nbors of the active task that will have the new task as a new nbor. All the previous nbors of the active task, and the active task itself, need to have new nbor lists generated.
Likewise, if a task is removed by AMR, all tasks in the nbor list of the active task need to have new nbor lists generated. The best point to do this is just after the task has updated, since if it is done before, there is no longer consistency between the nbor list and the task update.
With task addition, there is no problem doing the next check_nbors()
with
the old nbor list, since the new task by definition carries the same information
as the parent task, and hence does not add any new constraint on “readiness”,
nor would adding the new task in the nbor list before the dnload()
add any
new information.
Both with task addition and with task removal, we require that the nbor list is generated by the thread assigned to update the task. By the reasoning above, this must be done at the very end of the update.
In the case of task removal, which is decided before the task update, there is no reason to actually remove the task until after it has updated; the update can by definition take place with the existing nbor list, and some nbors of the removed task may at that time already have been added to the ready queue, and need access to the task as it was at the time.
Whether one removes the task before it is updated or after it has updated, one needs to trigger new nbor lists to be generated for all of the nbor tasks of the removed task. Those new nbor lists will not be generated until after those tasks have updated.
Nbor list use¶
Nbor lists are used for these purposes, in logical order:
- In list_t%check_ready() the nbor list is used to figure out if a task is ready to be updated, by comparing the states of tasks with the function task_t%is_ahead_of().
- After the task gets picked from the ready queue, e.g. by dispatcher0_t%update, the nbor list is used by task_t%dnload to pick up the information from its nbors that it needs, in order to update.
- After the task has updated, list_t%check_nbors() uses the nbor list to determine which nbor tasks to run list_t%check_ready() on.
- The nbor list of a task that depends on the active task is used to check if that task became ready to update after the current task updates. Such tasks are on the nbor list of the active task, and have the flag needs_me set.
Task handling¶
A new AMR task:
- isn’t engaged in check_ready() of nearby tasks until it has been added to their nbor lists.
- also isn’t involved in check_ready() performed by other threads until it has been added to the nbor lists of those nearby tasks.
- is generally born ready to update, since it is formed as part of a parent patch that was about to be updated, and is given the same task time.
- may be added to the ready queue on an ad hoc basis, w/o necessarily having been given an nbor list at all
- does not need to be sent to the nbor ranks until after it has updated, at which time it will have a valid nbor list on the owner rank, and will be given one automatically – as for all new tasks on a node – when arriving on another rank
This argues in favor of delaying the call to init_nbors (task%link) until the next time it has been picked by a thread for updating. At that time, as part of refine_t%check_current(), it needs to generate an nbor list, for use in task%dnload.
- At the time when it comes to calling check_nbors, it knows who the nbors are, but they may not yet know about the new task, and instead may rely on the still existing parent task.
- If the nbors don’t have the new AMR task in their nbor lists yet, they cannot be denied updating based on the new AMR task, and thus at some point they will end up in the ready queue, and perhaps only then will they generate their own, new nbor lists
- but if a task generates a new nbor list after being selected, it may find that some of the new nbors may not be up-to-date, so this argues for generating new nbor lists at the end of a task update, rather than at the start
- except: a new AMR task needs to have a first nbor list, so it can get authoritative guard zone values. It is not necessary that the virtual copies get their nbor lists in any special way; they get theirs when created.
Using this strategy, we should be able to avoid calling init_nbor_nbors() after creating a new AMR task, and we should also be able to avoid calling check_nbor_nbors(), since the nbors will take care of their own.
Coding standard¶
The common coding standard has two separate purposes:
- To ensure code consistency, which makes it easier to read and understand
- Specific advantages when editing, such as ease of search, etc.
Modules¶
Modules should start with a description, with comment lines !> being automatically interpreted by doxygen at readthedocs.org:
!===============================================================================
!> Module description ...
!===============================================================================
MODULE some_mod
Then follows USE of the modules needed, and an implicit none (that applies also to all procedures in the module):
USE io_mod
USE kinds_mod
USE ...
implicit none
All variables and procedures in the module should by default be private, with the exception of the data type definition, and a static instance of the data type, the purpose of which is to provide access to static parameters, set for example from a namelist, and given as default values to instances of the data type::
private
type, public:: some_t
integer:: verbose=0
contains
procedure:: init
procedure:: update
end type
type(some_t), public:: some
CONTAINS
!===============================================================================
!> Initialization
!===============================================================================
SUBROUTINE init (self)
class(some_t):: self
...
END SUBROUTINE init
!===============================================================================
!> Update
!===============================================================================
SUBROUTINE update (self)
class(some_t):: self
...
END SUBROUTINE update
...
END MODULE some_mod
Procedures¶
Procedures should follow this template, with a dotted line separating argument declarations from local variable declarations, and with dashed lines separating code and comment blocks::
!===============================================================================
!> Procedure description (picked up by doxygen)
!===============================================================================
SUBROUTINE update (self, aux)
class(some_t):: self
integer:: aux
!..............................................................................
integer:: local_variables, ...
real(KindScalarVar), dimension(:,:,:), pointer:: d, ...
real(KindScalarVar), dimension(:,:,:), allocatable:: w, ...
!------------------------------------------------------------------------------
call trace%begin('some_t%update')
...
!------------------------------------------------------------------------------
! Comment block
!------------------------------------------------------------------------------
...
var = expression ! in-line comment
...
call trace%end ()
END SUBROUTINE update
or, if the procedures should be timed::
!===============================================================================
!> Procedure description ...
!===============================================================================
SUBROUTINE update (self, aux)
class(some_t):: self
integer:: aux
!..............................................................................
integer:: local_variables, ...
real(KindScalarVar), dimension(:,:,:):: d, ...
integer, save:: itimer
!------------------------------------------------------------------------------
call trace%begin('some_t%update', itimer=itimer)
...
call trace%end (itimer)
END SUBROUTINE update
Capitalization¶
Capitalization of some keywords, such as:
MODULE ...
USE xxx
...
CONTAINS
SUBROUTINE sub
END SUBROUTINE sub
FUNCTION fun
END FUNCTION fun
serves the purpose of allowing editor search patterns such as 'USE xxx', 'E sub', or 'N fun'.
Code¶
For consistency, and to maintain good readability even with many levels of indentation, code should be indented with (only) two characters per level, as in this template::
SUBROUTINE proc (self, arg1, arg2, ...)
class(type_t):: self
...
select type (arg1)
class is (solver_t)
n = ...
class is (extras_t)
n = ...
class default
n = ...
end select
do i=1,n
a(i) = ...
if (a(i) > 0.) then
b(i) = ...
else
b(i) = ...
end if
end do
Comments¶
Comments in procedures should be either multi-line::
!------------------------------------------------------------------------------
! The dashed lines should end in column 80
!------------------------------------------------------------------------------
or single-line::
!--- This is a single line comment ---
Blank lines should be avoided (to maximize the visible code in editor windows), but if deemed absolutely essential to increase readability, they should have a leading comment sign, flush with surrounding lines (to allow fast skipping to the next procedure in editors)::
code ...
!
code ..
Debug printout¶
For largely historic reasons the code still contains a lot of debugging and diagnostic print messages, typically controlled by a local verbose parameter, similar to:
if (verbose > 0) then
write (io_unit%log,'(....)') &
wallclock(), ...
if (verbose > 1) then
...
end if
end if
Because many such code blocks can be very detrimental to code readability, this is discouraged (meaning most of those constructs will be gradually removed), in favor of adding ad hoc statements of the type (fully left-adjusted, to stand out)::
write (stderr,*) wallclock(), mpi%rank, omp%thread, 'message', task%id
specifically tracing some action through various pieces of the code, to be removed after filling their function. The lines should be removed in a dedicated commit, so they are recoverable via a simple git revert, w/o introducing other changes.
Compilation¶
Compilation is controlled by a hierarchy of Makefiles. For each experiments/whatever/ directory there should be a Makefile similar to those in parallel directories; when making a new experiment, use a Makefile from another experiment as a template.
Explanation:
The experiment Makefile includes $(TOP)/config/Makefile, which in turn includes the Makefiles in the various subdirectories, including the ones in the config/compiler/ hierarchy, which determine compiler option settings.
Normally, it is not necessary to change the make configuration, except possibly to overrule the choice of compiler, with for example make COMPILER=ifort.
See below for information on compiler option bundles, special make targets, and tailoring the make system:
make options¶
The default compiler options for gfortran and ifort are chosen as a compromise between compile time and performance; they generally give close to optimal performance, so are suitable for shorter runs and for development.
To get the best performance, compile with make clean; make OPTS=optimized, which typically gives a gain of a few percent in performance, at the cost of increased compilation time.
If the code crashes, try recompiling with make clean; make OPTS=full_debug or make clean; make OPTS=debug. The first provides comprehensive debug settings, at a significant cost in performance, while the 2nd avoids the particular options that impact performance the most.
The make clean part is only needed the 1st time after changing the OPTS bundle.
make targets¶
The config/Makefile specifies a few special targets, which may be useful while developing code.
To see the values of the various options (make macros) chosen, do make OPTS=xxxx info.
Individual make macros listed may be overruled on the command line, using for example make OPT=-O0 for a quick compilation; e.g. for checking syntax.
To get a list of source files compiled for a specific experiment, do make source (this requires that the code is compiled first). This may be particularly useful when looking for particular strings in the code; e.g. with:
make source | xargs egrep -n '(pattern1|pattern2)'
Some make macros are chosen based on for example the host name, or based on other macro values. To reveal where a particular option gets its values, do for example:
make showinclude PATTERN=COMPILER
make showinclude PATTERN=OPTS
make showinclude PATTERN=OPT
make showinclude PATTERN=FC
Makefile configuration¶
Several of the make options are chosen based on the value of the environment variable $HOST. To influence those settings, create a file config/hosts/$HOST, and put the options you prefer there. A typical content on a cluster host would be:
COMPILER=ifort
You may choose to commit the file to the repository, but make sure to avoid collisions with existing files. Having the $HOST file in the repository makes sense on a cluster, to configure common settings used by several people, while committing a laptop config/hosts/$HOST to the repository is usually pointless.
To implement particular host-dependent options or option bundles for a compiler, create for example these files:
config/compiler/gfortran/$HOST/Makefile
config/compiler/gfortran/$HOST/optimized.mkf
config/compiler/gfortran/$HOST/debug.mkf
config/compiler/gfortran/$HOST/full_debug.mkf
config/compiler/gfortran/$HOST/personal.mkf
Note that these files only need to contain the lines that differ from the corresponding lines in the config/compiler/gfortran/ files.
See also the note about make showinclude ... in the make targets section above.
Separate builds¶
Especially when developing, it saves time to have separate build_*/ directories; one for each combination of SOLVER and OPTS.
This may be enabled by setting the make macro SEPARATE_BUILDS, e.g. in a $(TOP)/options.mkf file (which might also be used to set the COMPILER and other macros). The file could thus contain, for example:
COMPILER=ifort
SEPARATE_BUILDS=on
dispatch.readthedocs.io¶
The dispatch.readthedocs.io documentation is generated automatically when pushing new commits to bitbucket.org/aanordlund/dispatch.git, but in order to proofread before pushing major updates it is very useful to generate a local copy. This is currently achieved by doing, on a laptop:
cd docs
csh laptop.csh
This currently rsyncs the docs directory to astro08, and runs make.csh there, which uses the local Python installation to generate the HTML, and then rsyncs it to ursa.astro.ku.dk/~aake/dispatch/, so it may be accessed at www.astro.ku.dk/~aake/dispatch/.
NOTE: With a sufficiently complete local Python installation, the generation can be done locally, on any laptop, so the resulting HTML may be proofread there, by just doing (in the docs directory):
make html
Dust¶
Modeling the dynamics and growth of dust requires a convenient representation of the dust, and a convenient arrangement for computing the influence of the gas on the dust, and of the dust on the gas. These issues are addressed below.
Representation¶
Updating the dust dynamics particle-by-particle is most easily done by keeping the particle information in a linked list (similar to the representation in the particle-in-cell plasma simulation setup).
We can use the same basic data type, extending it with the attributes that are particular to the dust-gas interaction.
Part of the representation is that the integer cell address is always known. The cost of computing the number density of each species is thus linear in the number of particles; as one moves the particles, their weights are summed up in each cell, and one thus arrives at their respective number density in each cell.
Given the number density of each species, one computes the rates of transformation from one species to the other using a kernel, and the resulting rates are summed up in each cell.
The next time the particles are moved over a certain time interval, their respective weights are changed according to the rates for that species.
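As a sketch of such a representation, the following standalone example (with assumed, illustrative type and variable names) keeps particles on a linked list and sums their weights cell-by-cell into a number density array, at a cost linear in the number of particles::

program dust_deposit_demo
  implicit none
  type:: dust_t
    integer:: cell(3)                   ! integer cell address, always known
    real:: weight=1.0                   ! statistical weight of the particle
    type(dust_t), pointer:: next => null()
  end type
  integer, parameter:: n=8, npart=100
  real:: density(n,n,n), r(3)
  type(dust_t), pointer:: head, p
  integer:: i
  nullify (head)
  ! build a linked list of particles with random integer cell addresses
  do i=1,npart
    allocate (p)
    call random_number (r)
    p%cell = 1 + int(r*n)
    p%next => head
    head => p
  end do
  ! one linear pass over the list accumulates the weights cell-by-cell
  density = 0.0
  p => head
  do while (associated(p))
    density(p%cell(1),p%cell(2),p%cell(3)) = &
    density(p%cell(1),p%cell(2),p%cell(3)) + p%weight
    p => p%next
  end do
  print *, 'total weight:', sum(density)
end program dust_deposit_demo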
Forces¶
Forces are now part of the extras_mod, so to add forcing take a copy of extras/extras_mod.f90 into the experiment/xxx/ dir, and uncomment the lines marked … ! forces.
The new forces_mod does NOT collide with the old force_mod modules, so for compatibility one can construct new forces_mod modules by using force%selected to load values into patch%force_per_unit_mass.
However, in the future forces_mod.f90 files should be written directly, since this alleviates the need to stick to a definitive (and long) argument list for the force%selected functions.
Google docs¶
Google docs allow documentation containing tables, drawings, graphics, and animations, and make it easy to collect, edit and maintain such documents. This is much more efficient and simple than creating such documentation as part of the auto-generated HTML hierarchy, which is exclusively based on the GIT repository contents.
- Link to (for now private and partially out-dated) Google docs
GIT best practices¶
The Bitbucket web site has lots of advice on best practices for code maintenance, so here we just add advice specific to DISPATCH.
Note that many of the operations shown with command line syntax in the examples below may be more easily carried out with GUI applications such as SourceTree.
Consolidating updates¶
When developing your project – either just adding input files and analysis scripts, or when doing actual development of new code – it is important to commit snapshots to your local working copy of DISPATCH often (essentially after each file edit), but also important to reorganize those commits before pushing to the on-line repository, which is typically a fork you may be sharing with collaborators.
The dilemma is that pushing all the small incremental commits that you make to the local working copy generates too much “noise” – both literally, in the form of emails, and in the sense that those commits are probably too fine-grained to be useful for others. On the other hand, refraining from making frequent commits forfeits one of the main advantages of code maintenance: the ability to “backtrack”, to find out when and what went wrong, after the fact.
The way out of that dilemma is to reset to the start of the series of edits when done, after saving the detailed edits as a temporary branch (e.g. named yy-mm-dd for year-month-day), and then reorder and clean up, making a smaller number of well-commented commits before pushing those to the Bitbucket repository.
One of the advantages with this is that debug statements that were first added and then removed disappear, reducing the “noise” in the commits that are pushed to the net.
Detailed examples:
Using rebase¶
# -- at start of the day --
git checkout master # get master branch
git pull # pull in changes from elsewhere
# -- during the day --
... edit ... # a small number of edits
git commit -a -m comment # GIT snapshot, with only terse comment
... edit ... # a small number of edits
git commit -a -m comment # GIT snapshot, with only terse comment
...
# -- merge in edits from elsewhere --
git fetch # get updates from elsewhere
git rebase origin/master # your edits become relative to that
... may require some edits + commits ... # in case of collisions
# -- consolidate --
git branch yy-mm-dd # create a backup, temporary branch
git reset origin/master # reset, to get all edits merged together
git add -p # add _selected_, related edits
git commit -m "DBG comrehensive comment" # possibly use 'commit -m' and an editor
git add -p # add _selected_, related edits
git commit -m "ORG comrehensive comment" # possibly use 'commit -m' and an editor
git add -p # add _selected_, related edits
git commit -m "DIA comrehensive comment" # possibly use 'commit -m' and an editor
git push # push your consolidated commits
git branch -D yy-mm-dd # optionally, delete the temporary branch
GIT forks¶
A GIT fork is a GIT repository (e.g. at bitbucket.org), which is created as a copy of another repository, and which maintains a two-way connection, via which one can propagate updates both from and to the original.
Read about forks for example at the Bitbucket web site.
We suggest that you create a fork of the public repository on bitbucket.org, and work with private working copies of your fork, on laptops, as well as on clusters.
You may want (other) students to do the same or, alternatively, to make a fork of your fork, to get an arrangement such as this one::
private_version
| \
public_version private_forks
/ | | \
Ralph Rolf | Others...
| |
student student
Logging¶
Log changes, showing statistics for each file::
git log --stat
Log changes, showing differences for each file::
git log --patch
Log the last 10 changes to a file, on all branches::
git log -n 10 --all --follow --stat -- dir/file
Show the last 10 changes to a file, on all branches::
git log -n 10 --all --follow --patch -- dir/file
Searching¶
Show the commit messages of files (in any branch) that contain “string”::
git log -G'\<string\>' --all
Show the commit messages of files (in any branch) that added or removed “string”::
git log -S'\<string\>' --pickaxe-regex --all
Rebase¶
To avoid a possibly confusing git log that contains two parallel lines of development, followed by a merge, it may be better to use git rebase.
To move your committed changes past new commits from a pull::
git rebase origin/master
Some merging of changes may be required after that.
Object hierarchy¶
Main object hierarchy::
task_t ! Basic task attributes; time, position, ...
patch_t ! Mesh based task attributes; mesh, dimensions, ...
gpatch_t ! A place to stick in IC value calls
extras_t ! Layer with selectable, extra features
mhd_t ! MHD layer
solver_t ! Generic layer, to present to experiments
experiment_t ! Experiment layer, BCs, ICs, forces, ...
Call hierarchy¶
The basic call hierarchy, for quick reference::
dispatch
mpi_mod::init ! initialize MPI
omp_mod::init ! initialize OMP
io_mod::init ! initialize I/O
cartesian_mod::init ! initialize patches
mpi_cords_mod::init ! determine MPI placement
task_list_mod::init ! initialize task list
list_mod::init ! initialize a list
do
patch_mod::set ! set bits
patch_mod::init ! initialize patch
task_list_mod::append ! append to task list
end do
task_list_mod::execute ! prepare and run the job
list_mod::init_all_nbors ! initialize neighbor lists
list_mod::reset_status ! set boundary and virtual bits
do
task_mod::clear ! clear ready bit
task_list_mod::check_ready ! check if patch is ready
list_mod::queue_by_time ! add to ready queue
end do
tic ! initialize performance counters
timer_mod::print ! initialization time statistics
do
task_list_mod::update
task_list_mod::check_flags ! check for flag files
task_list_mod::check_mpi ! check for MPI messages
mpi_mesh_mod::check ! check mesh lists
MPI_IMPROBE
MPI_GET_COUNT
MPI_IRECV
recv_list::add ! add to recv_list
MPI_TEST
unpack_list::add ! move to unpack_list
mpi_mesh_mod::get ! get an incoming mesg
task_list_mod::unpack ! handle unpacking
task_list_mod::find_task ! find the link with the task
patch_mod::unpack ! patch unpack procedure
anonymous_copy ! copy header into place
patch_mod::header_to_patch ! unpack header
anonymous_copy ! copy mem into place
deallocate ! deallocate mesg%buffer and mesg
download_mod::download_link ! fill guard zones
experiment_mod::output ! snapshots
experiment_mod::update ! task update
patch_mod::rotate ! time slot rotate
task_mod::pack ! pack boundary patches
link_mod::send_to_vnbors ! send to virtual neighbors
task_mod::clear ! remove check bit
list_mod::check_nbors ! check nbors and self
end do
timer_mod::print ! execution time statistics
toc ! performance cost summary
buffered_output_mod::close ! close output files
mpi_mod::end
Initialization calls¶
experiment_t%init
solver_t%init
mhd_t%init
self%idx%init (obsolete!)
self%initial%init (obsolete!)
timestep%init [only stagger2]
gpatch_t%init
self%idx%init (new)
self%initial%init
self%idx%init (new)
patch_t%init
task_t%init
force_t%init (obsolete!)
extras_t%init
forces_t%init
force_t%init
validate%init
As much as possible should be inside framework files, avoiding requiring that all $(SOLVER)%init contain a chain of specific calls. We should thus consider moving calls to self%idx%init to gpatch_t, and doing it both before and after the call to self%initial%init, as is done in stagger2 (to pick up changes of the %mhd switch).
Any calls to self%initial%init and force%init in experiment_mod files, or in mhd_mod files, should be considered obsolete.
jobcontrol.csh¶
The job script hierarchy consists of SUBMIT scripts, JOB scripts, and CANNED scripts.
The SUBMIT scripts contain scheduler control commands, but are otherwise static, since they are only read at submission.
The JOB scripts contain whatever the user wants, and may be changed during the runs – e.g. before restarting a process via jobcontrol.csh.
The CANNED scripts are helpful for, for example, archiving copies of executables, input files, and log files.
Submit scripts:¶
long.sh
short.sh
...
These contain control syntax such as #PBS or #SBATCH. Since changes after submission have no effect, they call other scripts where changes do have effect. The call is via the jobcontrol.csh script, with submit script syntax such as
source $HOME/codes/dispatch/config/startup.sh jobcontrol.csh job.sh
The jobcontrol.csh script runs the script given as argument repeatedly, as long as there is a restart.flag file present in the run directory.
Job scripts:¶
long.csh
short.csh
These can be anything, including calls to standard scripts for local systems, with syntax such as
ivy.sh input.nml
Canned scripts:¶
config/host/steno/ivy.sh
Since we often want to use standard ways of handling executables, input files, and log files, we keep a number of canned scripts, such as ivy.sh, which places a copy of the executable, the input file, and the log file in the data/run/ directory.
The script search path is set up by the config/startup.sh script, with directories searched in this order:
./
config/host/$HOST/
utilities/scripts/$SCHEDULER/
utilities/scripts/
$PATH
This means one can override the default scripts by placing scripts with the same name (e.g. ivy.sh) in the run directory.
SCHEDULER is set by a file config/host/$HOST/env.{csh,sh}, if it exists, or may be set in the environment. Possible values are “pbs”, “slurm”, etc.
The utilities/scripts/$SCHEDULER/ directories contain scheduler-specific scripts, used e.g. by jobcontrol.csh. Some examples are:
jobid # return the id of the current job
jobkill id # cancel a job
jobmail # send a mail to $USER
Locking issues¶
Memory locking for guard zones¶
When omp_lock%tasks is set, external read access to task memory is locked while task%update is actively modifying the task memory (this occupies only a small fraction of the time to do task%update), so the procedure is fully thread-safe. One can optimize this, at the cost of an extremely small chance of getting relatively bad guard cell values on occasion:
When download_link wants to access a source task, it would do::
!-----------------------------------------------------------------------------
! Acquire the task in locked mode, so nothing can change while we decide
! whether to keep it. While we have the lock, we get the slot indices ``jt``
! and weights ``pt`` -- these are not going to change if we release the lock,
! and the source task updates -- unless it updates several times and those
! slots are overwritten with newer time values.
!-----------------------------------------------------------------------------
call source%lock%set ('guard cells')
call source%time_slots (target%time, jt, pt)
!-----------------------------------------------------------------------------
! If the target%time is in the first available time slot,
! which is source%t(source%iit(1:2)), that first slot could possibly be
! overwritten, in some very unusual cases, when the source task is just about
! to switch in that slot as "new", and if the current task by chance is delayed
! enough. In that case, keep the lock until after guard zones are loaded.
!-----------------------------------------------------------------------------
if (target%time < source%t(source%iit(2))) then
call target%get_guard_zones_from (source, jt, pt)
call source%lock%unset ('guard cells')
else
call source%lock%unset ('guard cells')
call target%get_guard_zones_from (source, jt, pt)
end if
Note that the get_guard_zones_from (source) procedure must rely only on the source%it information.
Neighbor list synchronization¶
The link_t%nbor list is now changed nearly atomically, in that the new list is first constructed separately, and is then connected with a single instruction, pointing link%nbor to it. The procedure list_t%init_nbors() creates
- a cached copy of the nbor list, which is used in list_t%send_to_vnbors
- a cached copy sorted by level, which is used in download_t%download_link()
The nbor lists are also used in
- list_t%check_ready(), which checks those nbors that have nbor%needed set, to see if their tasks are sufficiently advanced in time to allow the linked task to be added to the ready queue
- list_t%check_nbors(), which checks those nbors that have nbor%is_needed set, using check_ready() on each of them. This is done after every active and virtual task update, since those are the only points in time when new tasks could become ready to update
The task link, which contains link%nbor, must be protected by its OMP lock, link%lock, whenever the nbor list is updated or referenced. The cached nbor lists are used to keep these lock periods as short as possible.
Lock timing¶
TODO: One could generalize the current pair of calls::
call object%lock%set ('label')
...
call object%lock%unset ('label')
to (with a mechanism very similar to trace%begin / trace%end):
integer, save:: ilock=0
...
call object%lock%set ('label', ilock=ilock)
...
call object%lock%unset (ilock)
and then inside lock%set measure the time it took to get the lock, and inside lock%unset measure the time the lock was held::
SUBROUTINE set (self, label, ilock)
  ...
  lock%start(thread) = wallclock()
  ... set lock ...
  lock%acquire(thread,ilock) = &
    lock%acquire(thread,ilock) + (wallclock()-lock%start(thread))
  lock%start(thread) = wallclock()

SUBROUTINE unset (self, label, ilock)
  ...
  ... unset lock ...
  lock%held(thread,ilock) = &
    lock%held(thread,ilock) + (wallclock()-lock%start(thread))
OpenMP locking time and contention¶
A task typically has 26 nbors – possibly more in AMR situations. To get the guard cell values takes of order 10-30% of the update time, so at most of order 1% per source task.
To update the array memory takes a small fraction of the update time, so perhaps again of order 1% of the update time.
The life cycle of a task has a period when the task is not busy, which typically is 50 times longer than the task update time. Hence the locking periods are extremely short in comparison to the task update cadence, and the chance that several target tasks are asking for the lock on a source task while it is holding the lock is extremely small.
Hence, conservatively locking the tasks while changing the task memory should have hardly any measurable impact on speed, while not doing it opens a small but non-zero chance of memory conflicts.
NOTE 1: The task does not need to prevent external memory access while doing things such as task%pack and task%courant_condition, since these procedures are performed by the thread that owns the task.
NOTE 2: On the other hand, while a virtual task is being unpacked, it should lock the memory, to prevent collisions with target tasks using it as source in guard cell loading.
Task mem array synchronization¶
During a task update, the only memory slot that is written to is task_t%new, while potentially all memory slots are needed to produce the new values.
Before a task update, all memory slots iit(1:nt-1) of nbor (source) tasks may in principle be needed, when computing guard zone values for the (target) task that the thread is going to update.
If the nbor task is dormant (with bits%busy not set), it could possibly start computing values after its time slot info has been acquired, but since the download for a single task takes less than 1% of the update, there is no chance whatsoever that the task has time to both finish one update, and start producing new mem array values for the next time slot (number 1 in terms of the iit array).
If the task is active (with bits%busy set), and is just about to finish its update, it might have time to do that, changing in the process its time slot indices and its task_t%time, but with no chance to finish one more update.
Hence, after retrieving (atomically) the time slot information from a source task, using the task_t%timeslots() procedure, the thread working on computing guard zone values for the target task can be sure that it can access slots (1:nt-1) without conflicting with updates to the source task.
The only possible exception would be if the task was suspended in the middle of the download_t%different() or download_t%same() procedure call, and did not wake up until the source task had updated several times. This might possibly (but still with very low probability) happen if one uses hyperthreading with a large factor (such as using 20 threads on 2 cores).
It is important, however, to make sure that the source%t and source%iit values are retrieved only once, and that they are consistent, but if the local iit() values are based on a source%it that is updated as the last thing in task_t%rotate(), then even if the slot information is being changed by some thread executing source%rotate() during the target%timeslots() call, the target will still receive consistent iit(1:nt-1) memory slot values.
Consider the situation before the source update finishes::
iit(1) iit(2) iit(3) iit(4) iit(5)
4 5 1 2 3
(it) (new)
t(1) t(2) t(3) t(4) t(5)
0.3 0.4 0.0 0.1 0.2
(1) (it) (new) (4) (5)
All that happens when the source update finishes is that t(3) is set to 0.5, the new mem slot is filled with updated variable values, and the iit(:) array is shifted left, so it reads:
iit(1) iit(2) iit(3) iit(4) iit(5)
5 1 2 3 4
(it) (new)
The shift of iit(:) may be replicated by knowing only the value of it, which is set atomically. So, if it is incremented (cyclically) right after the time slot that it corresponds to has been updated, then a thread downloading to a target gets a set of times to interpolate between that is consistent with the content of the corresponding memory slots. Should the thread pick up the value of it a microsecond too early, it still gets useful data, since the check_ready test, which typically happened a long time ago, already judged that the task was ready to be updated.
Locking timeline¶
Schematic timeline and call hierarchy showing lock set & unset:
task_t%update
refine_t%check_current
refine_needed
need_to_support
tlist%lock%set +|
link%lock%set +| |
.. | |
link%lock%unset -| |
tlist%lock%unset -|
selective_refine
make_child_patch
parent%lock%set +|
parent%lock%unset -|
child%link%init
child%init
download_link
tlist%lock%set +|
tlist%append_link |
tlist%init_nbors (parent%link) |
link%lock%set .
do .
link%add_nbor_by_rank .
link%set_status_and_flags
link%copy_nbor_list (... nbors)
link%remove_nbor_list (... %nbors_by_level)
link%sort_nbors_by_level (... %nbors_by_level)
link%increment_needed (... %nbors_by_level)
link%decrement_needed (nbors)
link%remove_nbor_list (old_head)
init_nbors (child_link)
ditto
init_nbor_nbors (child_link)
link%lock%set
self%init_nbors (link) +|
copy_nbor_list (link%nbor, nbors) |
link%lock%unset -|
for nbor in link%nbor
init_nbors (nbor%link)
link%remove_nbor_list (nbors)
reset_status
check_support()
tlist%lock%set
do link in tlist
...
tlist%lock%unset
send_to_vnbors
check_nbor_nbors
for nbor
check_nbors (nbor%link) .
... .
check_ready .
check_ready (link) |
tlist%lock%unset -|
remove_patch
check_support(tlist,link)
task%dnload
download_link
link%lock%set
link%copy_nbor_list (nbors)
link%increment_needed (nbors)
link%lock%unset
do
source%lock%set
same
different
source%lock%unset
end do
link%decrement_needed (nbors)
link%remove_nbor_list (nbors)
task%update
task%rotate
task%info
send_to_vnbors
check_nbors
link%lock%set
copy_nbor_list (... nbors)
increment_needed (nbors)
link%lock%unset
for nbor ..
check_ready (nbor%link)
link%lock%set
do .. nbor
..
link%lock%unset
queue_by_time (link)
tlist%lock%set
task%has_finished
Timings of relevance¶
In order to estimate the contention on nbor list locks it is useful to have estimates of the time spent in relevant routines.
Some of the central routines require approximately the times below, on a T430s laptop:
copy_nbor_list() uses about 100 mus/call for 26 nbors, or about 4 mus/nbor
remove_nbor_list() uses about half the time, or about 2 mus/nbor
check_ready() uses about 25 mus with 26 nbors, so about 1 mus/nbor
check_nbors() uses about half that time, or about 0.5 mus/nbor
init_nbors() uses about 1000 mus for 500 tasks, or about 2 mus/task
Maintenance¶
[ This and related posts are mainly intended for the developers that maintain the development repository ]
The maintenance instructions assume that the public and private repositories are checked out side-by-side, and are called dispatch/{private,public}. Other arrangements are handled correspondingly.
It is a good idea to add a local working copy of the private bitbucket repository as upstream, rather than the bitbucket repository itself – this reduces noise from the commits that would otherwise be needed.
4public branch¶
The 4public branch in the development repository is used to communicate changes between the development repository and the public repository. To prepare for using it:
- Start with a clean working copy of the public repository, connect the development repository as the upstream remote, and checkout the 4public branch:
cd dispatch/public
git remote add upstream ../development
git fetch upstream
git checkout -b 4public upstream/4public
Make sure to never push the 4public branch to the public repository, since that would create a very confusing situation. Always think of the 4public branch as a branch in the development repository.
Common files¶
Occasionally, we may want to update all (or most) of the common files shared with the master branch in the development repository. This should not be done lightly, since it may break existing codes, and (therefore) requires extensive testing.
To check differences and possibly pull in updates of common files, do::
# in the development working copy
cd dispatch/development
git checkout 4public
git pull
git diff --stat master -- `git ls-tree -r 4public --name-only`
... check carefully which files should really be included ...
git checkout master -- `git ls-tree -r 4public --name-only`
git checkout -- file1 file2 # cancel individual updates
... test carefully ...
git commit -m "... comprehensive comment ..."
# in the public working copy
cd ../public # upstream = private
git checkout 4public # branch connected to upstream/4public
git pull # pull in updates
git checkout beta # beta branch on public
git rebase master # sync with master on public
... test carefully ... # test also that other experiments work
git commit --amend # amend the commit message (cf. below!)
git push # push to private repository
In the amended commit message you should alert people about this new update of the `beta` branch, and invite them to try it out. Only after confirmation that there is no problem should the commit be carried over to the `master` branch of the public repository.
Pulling updates¶
To pull in updates from the development repository do::
cd dispatch/development # working copy of private
git checkout 4public # private branch
git pull # make sure it is up-to-date
git cherry-pick [ hash ] # get the new feature
... test carefully ... # test also that other experiments work
cd ../public # upstream = private
git checkout 4public # branch connected to upstream/4public
git pull # pull in updates
git checkout beta # beta branch on public
git rebase master # sync with master on public
git cherry-pick 4public # import feature
... test carefully ... # test also that other experiments work
git commit --amend # amend the commit message (cf. below!)
git push # push to private repository
In the amended commit message you should alert people about this new update on the `beta` branch, and invite them to try it out. Only after confirmation that there is no problem should the commit be cherry-picked over to the `master` branch on the public repository.
Pushing updates¶
To push updates from the public to the development repository do::
cd dispatch/development # working copy of private
git checkout master # must NOT be in 4public
cd ../public # upstream = private
git checkout 4public # private branch
git pull # make sure it is up-to-date
git cherry-pick [ hash ] # get the new feature
... test carefully ... # test also that other experiments work
git push # push to private repository
cd dispatch/development # working copy of private
git checkout 4public # branch connected to upstream/4public
git rebase master # sync with master on public
... test carefully ... # test also that other experiments work
git checkout master # private master branch
git cherry-pick 4public # import feature
git commit --amend # amend the commit message (cf. below!)
git push # push to private repos
Make clear in the amended commit message that this is imported from the public repos.
MPI¶
Overview¶
MPI communication is split into send and receive procedures, described in the sub-sections below. Both are arranged so that each thread can handle its own communication, avoiding the need for OMP critical regions or locks.
Send procedures¶
Send requests are started with MPI_ISEND() calls, and the requests are held on lists until each set of sends (each task is in general sent to multiple ranks) is completed (as tested via MPI_TESTALL()), at which time the send buffer is deallocated and the send request is removed from the sent_list and deleted.
It is an advantage to arrange this so that each thread maintains a thread-private sent_list – this avoids the need for OMP critical regions or locks.
Data type and procedures::
task_mesg_t%check_mpi
mpi_mesg_t%check_sent
mesg_t%send
mesg_t%test_all
mesg_t%wait_all
Receive procedures¶
In order not to be limited to using the initially existing task numbers when sending MPI messages, the most general receiving method uses MPI_IMPROBE to learn which task a message is aimed for, the size of the message, and its sequence number. After receiving a reply from MPI_IMPROBE, a buffer is allocated, a non-blocking MPI_IMRECV is issued, and the message is added to a recv_list of outstanding receive messages. The messages in the list are checked regularly (once per task update), and when a message is complete the message object is moved to an unpk_list for unpacking.
A more efficient method, implemented in check_active() and check_virtual(), which should scale to any number of OpenMP threads, is to split communication between the threads, so each thread handles a limited set of virtual tasks, issuing an MPI_IRECV to each initially, and re-issuing an MPI_IRECV each time a package has been received. In most or all cases the packages then arrive in the order being sent, but if they do not, it is simple to add an out-of-order package to an unpack list, just as in the case above.
Data type and procedures::
task_mesg_t%check_mpi
task_mesg_t%check_priv
task_mesg_t%unpack
mpi_mesg_t%get
mpi_mesg_t%add
mpi_mesg_t%remove
mpi_mesg_t%delete
mesg_t%is_in_order
task_mesg_t%check_virtual
mesg_t%irecv
mesg_t%is_complete
mesg_t%is_in_order
Receiving order¶
A rank sends packages to several nbor ranks, so each rank also receives packages from several nbor ranks. It is important that these packages are received in the same order as they were sent from the sender rank, so the MPI protocol ensures that.
Each package sent from a rank to another rank contains a tag, which encodes
the task ID and the sequence number. Each rank keeps track of the sequence
number from other ranks in an atomic fashion.
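A minimal sketch of such a tag encoding; the actual bit layout and
sequence-number range in DISPATCH may differ::

integer:: tag, id, seq              ! id and seq are set by the sender
integer, parameter:: max_seq=1000   ! assumed range of sequence numbers
tag = id*max_seq + mod(seq,max_seq) ! pack task ID and sequence number
id  = tag/max_seq                   ! unpack on the receiving side
seq = mod(tag,max_seq)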
Sequence checking is handled by task_mesg_t%check_priv(), where out-of-order
messages are detected when looping over the list of messages ready for
unpacking. The procedure makes sure to call task_mesg_t%unpack() in
sequential order. This is totally transparent to the underlying procedures
(e.g. patch_t%unpack()), and detection of out-of-order messages is thus not
needed there.
Python implementation¶
The technical details of the Python interface are described in the sub-pages below:
Python dispatch module¶
The `dispatch.snapshot()` procedure returns an object whose most important
attribute is `snapshot.patches`, a list of patch objects (`p`), carrying
attributes (e.g. `p.size`) that generally have the same name as the
corresponding variables in the code.
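A brief usage sketch; the argument convention assumed here (a snapshot
number) is an assumption, so consult the module itself for the authoritative
signature::

import dispatch
snap = dispatch.snapshot(1)      # assumed: select snapshot number 1
for p in snap.patches:
    print(p.id, p.size)          # attributes mirror the code variables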
Collecting patch metadata¶
The list of patches is collected from the data/run/SSSSS/rank_RRRRR_patches.nml
files (here SSSSS is a 5-digit snapshot number and RRRRR is a 5-digit MPI
rank). These files contain the metadata for all patches.
The actual data resides either in data/run/snapshots.dat (one file for all
snapshots), in data/run/SSSSS/snapshot.dat (one file per snapshot), or in
data/run/SSSSS/RRRRR_PPPPP.dat (one file per patch).
Auxiliary data¶
The auxiliary data resides in data/run/SSSSS/RRRRR_PPPPP.aux files (one file per patch).
To add data fields to it, register a link to a 1-D, 2-D, 3-D, or 4-D data array, with::
USE aux_mod
...
type (aux_t):: aux
...
call aux%register ('name', array)
Expression handling¶
Expression handling takes place on four levels, in the most general case:
- An arbitrary Python expression is broken down into words, which may consist of
  - the left hand sides of other expressions
  - variable names
  - Python words – built-in or module objects
- If evaluation of such a word leads to no result, the word is given to the var() function
- The var() function checks if the word belongs to the known keywords, which represent compound variables, such as ‘T’ for temperature, where each solver may require a different numerical expression, needing as many as 8 primitive variables (in the case of computing temperature when total energy is stored). Expressions in the var() function use the mem() function to get actual values (which are in fact memory maps into the disk files)
- Inside the mem() function, alphabetic keys (such as ‘px’) are translated to integer indices, which in turn determine the offsets of the memory maps into the disk files
The first two steps take place in the expression_parser(), the third step takes place in the var() function (possibly calling itself recursively), and the fourth step takes place in the mem() function.
In either the var() or the mem() function, two aspects in which the actual disk data may differ must be compensated for (cf. the sketch below):
- The different solvers use different centering of some of the variables
- The io%format parameter determines, for example, whether density is stored as log density or as linear density.
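A minimal sketch of this evaluation cascade; the function names are from the
text above, but the bodies and the attribute names (patch.idx, patch.data,
patch.io_format_logd, and the temperature() helper) are assumptions::

import numpy as np

def mem(key, patch):
    # translate an alphabetic key to an integer index, then return the
    # corresponding memory-mapped array (an np.memmap into the .dat file)
    i = patch.idx[key]                 # e.g. 'px' -> integer index
    data = patch.data[i]
    if key == 'd' and patch.io_format_logd:
        data = np.exp(data)            # compensate for log-density storage
    return data

def var(word, patch):
    # resolve compound keywords; fall back to raw memory access
    if word == 'T':
        return temperature(patch)      # hypothetical per-solver expression
    return mem(word, patch)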
The source code may be found in utilities/python/dispatch/, and is also
available via the auto-generated documentation (the “Files” menu contains
source code).
Radiative Transfer¶
Several radiative transfer solvers are implemented in the development version of DISPATCH. Based on that experience, we recommend the short characteristics method for general use, and include it in the public version.
The experiments/stellar_atmosphere/ directory demonstrates the use of this
solver in the context of stellar atmospheres. The method is general, however,
with the choice of angles, opacity data, and boundary conditions defining the
specific case.
Neighbor relations¶
The nbors_t%init procedure in the solver/rt/$(RT_SOLVER)/nbors_mod.f90 file
establishes the necessary nbor (dependency) lists and flags. It is called in
this context::
!dispatcher_t%execute
! rt_t%init_task_list (task_list)
! rt_t%nbors%init
! init_nbor_pair (task1, needs, task2, needs_me, download)
! rt_t%prepend_tasks_to (task_list)
! rt_t%init
! task_list%prepend_link (rt%link)
! task_list%prepend_link (rt%omega%link)
!
Task list manipulation¶
The task list is first constructed by the component/ procedure setting up the
arrangement of MHD tasks. In the common case with a Cartesian arrangement of
patch_t tasks, the relevant file is components/cartesian_mod.f90.
The sequence of calls is::
!cartesian%init
! task_list%init
! experiment_t%init
! task_list%append (experiment)
which results in a global task_list containing only the experiment tasks,
which in this case are solver_t tasks, where the task structure has been
extended in extras_t%init with a solver_t%rt RT task, which in turn has
extended itself with a set of `solver_t%rt%omega(:)` tasks, one for each
ray direction.
These extra tasks are referred to as “RT sub-tasks”, and have not yet been
added to the task list, since cartesian_t%init() only adds the experiment_t
tasks (cf. above).
To give all tasks access to the task list, before calling the specific
dispatcher_t%method() the dispatcher_t%execute procedure calls each task
with::
!call task%init_task_list (task_list)
This provides the opportunity to add additional tasks to the task list (to
avoid a potential recursive disaster the new tasks are prepended rather than
appended).
In the RT case, rt_t%init_task_list(self,task_list) first adds the RT
sub-tasks to the task list, and then sets up the proper nbor relations
between the new tasks and the already existing MHD tasks.
Short characteristics¶
Short characteristics radiative transfer solvers typically solve the radiative transfer equations across only a single cell (or in some cases even a fraction of a cell) at a time. In three dimensions one can nevertheless achieve excellent performance, by parallelizing over perpendicular directions – effectively solving for the radiation over parallel planes, progressing from one plane to the next, starting from a boundary where values are known, either from physical boundary conditions, or from boundary values taken from an adjacent (“up-stream”) domain.
Because one is looping over the two perpendicular directions, the cost can be reduced significantly, since these loops allow the compiler to use loop vectorization (cf. the sketch below).
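A minimal sketch of one such plane-by-plane sweep, for a ray with direction
cosine mu, using the integral (formal) solution across one cell; all array
and variable names here are assumptions, not the DISPATCH API::

! arrays alpha (absorption coefficient), src (source function), and rad
! (intensity) are assumed allocated with shape (nx,ny,nz); rad(:,:,1)
! holds the known up-stream boundary values
do iz=2,nz
  do iy=1,ny
    !$omp simd private(dtau,ex)
    do ix=1,nx
      ! optical depth across the cell, along the inclined ray
      dtau = 0.5*(alpha(ix,iy,iz-1) + alpha(ix,iy,iz))*dz/mu
      ex = exp(-dtau)
      ! attenuated up-stream intensity plus local source contribution
      rad(ix,iy,iz) = rad(ix,iy,iz-1)*ex + src(ix,iy,iz)*(1.0-ex)
    end do
  end do
end do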
The solvers/rt/short_characteristics/ directory contains the following
modules, used to perform various parts of such solutions::
radau_mod.f90 ! Radau integration -- a modified Gauss integration
rt_integral_mod.f90 ! integral method solver
rt_mod.f90 ! RT data type definitions
rt_nbors.f90 ! RT neighbor setup
rt_solver_mod.f90 ! RT solver data type
Radau integration¶
Radau integration differs from normal Gauss integration only in that one of the interval end points is always included. Here, the integration is performed over \(\mu =\) the cosine of the inclination relative to the normal, and the point \(\mu=1\) is always included.
This has the advantage that no \(\phi\)-integration (integration over azimuth) is needed for that inclination, which also generally carries the largest specific radiation intensity.
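In standard form, an \(n\)-point Radau rule over \(\mu\in[0,1]\) with the fixed node \(\mu_n=1\) reads \(\int_0^1 f(\mu)\,d\mu \approx \sum_{i=1}^{n-1} w_i f(\mu_i) + w_n f(1)\), where the nodes \(\mu_i\) and weights \(w_i\) are chosen to make the rule exact for polynomials of degree \(2n-2\) (the specific normalization used in radau_mod.f90 may differ).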
Scheduling¶
Radiative transfer tasks need not be updated with the same cadence as their MHD “host” tasks. The scheduling relations are illustrated below.
Phase 1: MHD updates are done, until mhd%time >= mhd%rt_next (= omega%time for all omega)::
------------- BC --------------------------------------
------------- MHD1 ------------+ |
--------------RT1 ---------------+
|
------------- MHD2 -------------+ |
--------------RT2 ---------------+
Phase 2: EOS2 calculations, at time = mhd%rt_next, setting eos_time to this
time. Here, EOS2 has advanced, and with MHD2 time ahead, only lack of
upstream RT prevents RT2 update:
------------- BC --------------------------------------
------------- MHD1 -------------+ |
--------------RT1 ---------------+
|
------------- MHD2 ---------------|-+
--------------EOS2 ---------------+
--------------RT2 ---------------+
Phase 3: EOS1 calculations, at time = mhd%rt_next, setting eos_time to this
time. Here, EOS1 has also advanced, and with MHD1 time ahead, only lack of
upstream RT prevents RT1 update:
------------- BC --------------------------------------
------------- MHD1 ---------------|+
--------------RT1 ---------------+
|
------------- MHD2 ---------------|-+
--------------EOS2 ---------------+
--------------RT2 ---------------+
Phase 4: RT1 is now detected as “ready”, while RT2 is waiting:
------------- BC --------------------------------------
------------- MHD1 ---------------|+
--------------RT1 ---------------|--------+
|
------------- MHD2 ---------------|-+
--------------EOS2 ---------------+
--------------RT2 ---------------|--------+
If the RT update time rt_time is smaller than mhd%dtime, it could happen that
the next time the MHD task is updated, it finds itself still ahead of the RT
time, and hence that MHD task (but not necessarily all of them) advances its
RT time again.
A simple solution would be to limit the MHD update time interval to rt_time,
for all patches that do RT. This could impact the cost a bit, but only when
the choice of RT time is made very conservative.
Tasks¶
This is what the class hierarchy looks like when radiative transfer (RT) is included::
!experiment_t -> solver_t -> mhd_t -> extras_t -> gpatch_t -> patch_t -> task_t
! |---rt_t -> gpatch_t -> patch_t -> task_t
! |-grav_t -> gpatch_t -> patch_t -> task_t
The task list consists of 1) a number of “MHD tasks” (experiment_t tasks),
and 2) a number of rt_t tasks, which subdivide into 2a) one “main RT task”
(an rt_t task) for each MHD task, and 2b) n_omega “RT sub-tasks” (also
rt_t tasks).
Each MHD task has connections to the rt_t tasks via the extras_t class, and
each main RT task has connections to the RT sub-tasks, which are allocated as
an array of rt_t tasks inside the main rt_t task.
Each rt_t task also has a connection to the patch_t part of the MHD task, and
can thus access the MHD time and other attributes.
The MHD task update procedure is called directly from task_list_t%update(), and
calls extras_t%pre_update before the MHD update, and extras_t%post_update after it.
Through that call path, rt_t%post_update is (only) called when the MHD task
has updated. It is not called as part of the normal rt_t%update. This is the
correct point to detect that a new EOS calculation is needed.
The RT tasks are also called directly from task_list_t%update(), and as
illustrated above, they do not go through the extras_t layer. Those calls go
only to rt_t%update().
Point source radiation¶
Point source radiative transfer is available in the development repository, on a collaborative basis. Contact aake@nbi.ku.dk if you are interested in this.
Sink-particles¶
A sink particle is represented by a task data type (sink_patch_t) that is
primarily an extension of a gpatch_t data type, with a patch%mem that has a
limited extent (~8 cell radius).
Sink particles are formed at the same resolution level as the patch they are
formed in.
From the point of view of the dispatcher task_list, the task appears as a
normal patch, but with special case handling that causes all interior
values to be downloaded from and to its normal MHD patch nbors.
The particle aspect of a sink particle is kept in a particle_list_t data
type, which stores several positions in time for the particle, making it
possible to interpolate its position to any given time when computing
interactions (forces).
The sink particle position is updated by an N-body solver, which gets access
to the other sink particle histories by being an extension on top of the
task_list_t data type.
The particle position update method is always one particle at a time, since particle updates use variable time steps, which differ from particle to particle. A particle “update” in the DISPATCH context thus consists of these steps (cf. the sketch below):
- pre_update: estimate the mid-point position to use, in order to make the update time-reflexive
- compute the forces at some point in time (e.g. the initial time for the 1st K step in KDK) and update the velocity
- update the position (e.g. with D step in KDK)
- compute the forces at some other point in time (e.g. at final time for 2nd K step in KDK) and update the velocity
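A minimal kick-drift-kick (KDK) sketch of steps 2-4, where dt, x, v, and the
force routine accel() are assumed names (step 1, the pre_update mid-point
extrapolation, is omitted)::

v = v + 0.5*dt*accel(x,t)       ! 1st K: kick, using forces at the initial time
x = x + dt*v                    ! D: drift to the new position
v = v + 0.5*dt*accel(x,t+dt)    ! 2nd K: kick, using forces at the final time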
Ad hoc sink particles¶
To allow testing, one can create ad hoc sink particles, using namelist entries similar to::
&sink_patch_params verbose=2 on=t n_ad_hoc=2 /
&sink_params x=0.52 y=0.52 z=0.52 vx=0.1 mass=1 /
&sink_params x=0.48 y=0.48 z=0.48 vx=0.2 mass=1 /
Whereas sink particles are normally created and added to the task list by one
of the refinement criteria, the ad hoc particles are created after the task
list with normal MHD patches has been created, via a call from
extras_t%init_task_list, which gets called once for every patch, just before
updates of the task list start.
The call has the current task list as an argument, which makes it possible to
append the new tasks corresponding to the ad hoc sink particles.
Forces¶
The forces that involve sink particles are of two types:
Gravitational forces between particles. These can be computed very efficiently, by first caching the particles from list form to array form, and then using vectorization over the particles. The cost per particle is low enough to allow, for each particle, direct computation for hundreds of the most significant sink particles. Each force computation is used to accumulate both the force of A on B, and the force of B on A.
Gravitational forces between gas and particles may, likewise, be computed by direct summation over particles, costing of order a few nanoseconds per sink particle, and thus allowing hundreds of sink particles without more than doubling the cost per cell.
As with the particle-particle force, one can accumulate, without additional cost, the force of the gas on the particles, using the action = reaction principle. This yields, for each patch and time, a force field from all the direct-summation sink particles.
To use this information in the calculation of the forces acting on each
particle requires (only) that one interpolates in time, for each patch. For
each patch we know the force on each particle at a discrete set of times.
From that information one can interpolate to the exact particle time, using
high order time interpolation (lagrange_mod.f90). To quickly find the right
patch_force_t instance, given the particle and MHD patch IDs, we use a hash
table with (particle%id, patch%id) as a 2-dimensional key, and an anonymous
pointer with the address of the patch_force_t data type as value.
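A minimal sketch of such a lookup; the key packing and the hash_table%get
interface shown here are assumptions::

integer(8):: key
key = int(particle%id,8)*2_8**32 + int(patch%id,8) ! pack the two 32-bit IDs
call hash_table%get (key, ptr)                     ! ptr -> patch_force_t instance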
Data type hierarchy¶
The sink_patch_t data type is an extension of the standard gpatch_t data
type, with access to the task list.
The particle_solver_t is an extension of the particle_list_t data type, which
is a task_t extension that holds particle positions at several times. The
particle_solver_t is kept as an attribute of the sink_patch_t data type, so
the data type hierarchy looks like this:
task_list_t
/ | |
experiment_t | |
solver_t | |
mhd_t | |
| refine_t |
| / |
extras_t |
| |
sink_patch_t |
/ | |
particle_solver_t | |
| | |
| gpatch_t |
| / | \ |
| / | list_t
particle_list_t | / |
/ | patch_t |
particle_t dll_t / \ |
/ link_t
/ |
connect_t task_t
Reflexive time steps¶
To make the particle evolution near-symplectic, the most important aspect is
that the time step determination should be reflexive, in the sense that when
taking a timestep forward from t_A to t_B, the timestep should be evaluated
in such a way that starting from t_B and moving backwards in time one should
end up exactly at t_A. (Aiming for exact symplectic expressions would be
overkill when there are weak and non-conservative perturbations from the
moving gas.)
A simple and efficient approximate method to achieve this is to extrapolate the position forward one half (time index) step, using the previous positions, which are stored in the particle history (cf. the sketch below). Details of the extrapolation may differ, e.g. in using or ignoring speed information – the main goal should be to obtain an estimate that is both cheap and accurate.
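A minimal sketch, with an assumed array layout (x(:,it) holding the position
history) and a hypothetical timestep_criterion()::

real:: xm(3), dt
xm = x(:,it) + 0.5*(x(:,it) - x(:,it-1)) ! half (time index) step extrapolation
dt = timestep_criterion (xm)             ! evaluate the criterion at the mid-point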
Task update sequence¶
When a sink particle task (data type sink_patch_t) reaches the head of the
ready queue and is taken by one of the threads, its call to the normal
task_list_t%update() procedure results in calls to these procedures
(indentation indicates call level, and the name of the file containing the
procedure is obtained with the substitution _t -> _mod.f90)::
experiment_t%dnload ! generic download call
sink_task_t%dnload ! sink patch download
download_t%download_link (..., all_cells=.true., ...) ! values for accretion
task%update ! generic update call
sink_task%update ! sink patch update
particle_solver_t%force_field ! fall through
particle_list_t%force_field ! compute forces
hash_table%get ! get patch_forces
sink_task_t%accrete ! accrete mass
sink_task_t%courant_condition ! set timestep
sink_task_t%move ! organize particle move
particle_solver_t%update ! particle solver update
particle_solver_t%courant_time ! particle courant time
patch_t%rotate ! patch periodicity etc
task_t%rotate ! rotate time slots
list_t%send_to_vnbors [ if the task is a boundary task ] ! send boundary tasks
Stellar winds¶
The procedure to take the stellar wind from a sink particle into account should be an extras procedure, executed by the %update procedure of the patch that is receiving the input of mass and momentum from the sink particle.
Stationary Lagrangian¶
The Stationary Lagrangian Method combines Eulerian mesh placement and Lagrangian dynamics, by making a shift in velocity space that cancels most of the bulk motion.
Refinement considerations¶
One could choose to refine from either the %it state – aligned with the mesh, but with velocities shifted by %x_shift
Task lists¶
Task lists are fundamental to DISPATCH, and the list nodes (data type link_t) contain several types of pointers that define relations between tasks; e.g. a subset with time order, or a subset of active tasks.
The task list should not be required to know any specifics of tasks, which are referred to with pointers of type experiment_t; the anonymous top tier task, which may be anything from the top of a hierarchy with many layers (e.g. task_t -> patch_t -> gpatch_t -> mhd_t -> rmhd_t -> solver_t) to just two layers (task_t -> experiment_t).
The task_list_t data type extends the list_t data type with procedures that can handle tasks in the context of task lists; e.g. specifying what happens in the context of unpacking a task MPI package from another MPI process.
Procedures¶
As is, the task_list_t data type contains a number of procedures that have to do with message sending and receiving. These could possibly be split off into a task_mesg_t data type. Alternatively, the procedures that in effect implement dispatcher method=0 could be split off into a dispatcher_method0_t data type, leaving task_list_t to effectively be the task_mesg_t.
The list_t data type contains (or should contain only) procedures that manipulate lists of link_t data types; inserting or deleting nodes in lists, etc.
As is, the list_t data type also contains procedures that rely on tasks having meshes, e.g. for constructing neighbor lists. These could perhaps with advantage be split off into a patch_list_t data type.
For different types of tasks, one should be able to define task-type-specific relations that determine when a source task is ahead of a target task. This requires that the functions that call is_ahead_of are at a level where they are aware of experiment_t and all sub-levels::
dispatcher_t dispatcherX_mod
task_list_t check_ready task_list_mod
experiment_t is_ahead_of refine experiment_mod
solver_t solver_mod
gpatch_t gpatch_mod
patch_t patch_mod
task_t task_mod
For task refinement, a similar desire exists. The procedure that defines if a task should be “refined” (whatever that means) should be aware of all levels of the hierarchy. Or else, one should be able to overload the refine procedure itself, at any level.
Refinement¶
How should refinement procedures and the dispatchers really work together? Currently, the task list updater calls the refine procedure, as part of the task_list_t%update procedure, but this is awkward, since it means that one allows a procedure that really deals with a single link in a task list to affect the task list itself. It would be safer and more consistent if the level that loops over task list events (i.e., the dispatcher level) were the one that also considers refinement.
But if refine_mod should be able to manipulate the task list with dispatcher method=0, it needs to know about the task list, which creates a catch-22, since it is also called from inside task_list_mod::
dispatcher_t dispatcherX_mod
task_list_t task_list_mod
list_t check_ready list_mod
experiment_t experiment_mod
rt_solver_t rt_solver_mod
rt_t (is_ahead_of) rt_mod
refine_t ---------------------------------------------------- refine_mod
solver_t | solver_mod
mhd_t | mhd_mod
gpatch_t | gpatch_mod
patch_t | patch_mod
link_t | link_mod
task_t is_ahead_of task_mod
Task locking¶
DISPATCH uses OpenMP nested locks, with the API defined in omp/omp_locks_mod.f90.
A basic rule that needs to be respected with locks is to avoid two objects of the same class being locked at the same time by a single thread – the situation below can clearly lead to a deadlock::
thread 1:
lock A(1)
lock & unlock A(2)
unlock A(1)
thread 2:
lock A(2)
lock & unlock A(1)
unlock A(2)
On the other hand, if a thread takes one type of lock, and then a whole set of another type of locks inside the first one, this cannot lead to a deadlock: either the outer locks are the same, in which case only one thread at a time can take the inner set, or they are different, in which case the inner sets do not conflict. Hence locking the task list first avoids any deadlock that could otherwise be triggered.
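A minimal illustration of that pattern, using the OpenMP lock API directly
(omp_locks_mod.f90 wraps this; the locks are assumed to have been initialized
elsewhere with omp_init_nest_lock)::

use omp_lib
integer(omp_nest_lock_kind):: list_lock, task_lock(2)
!
call omp_set_nest_lock (list_lock)    ! outer lock: only one thread at a time
call omp_set_nest_lock (task_lock(1)) ! the inner set is then taken without
call omp_set_nest_lock (task_lock(2)) ! risk of lock-order reversal
call omp_unset_nest_lock (task_lock(2))
call omp_unset_nest_lock (task_lock(1))
call omp_unset_nest_lock (list_lock)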
Timeslots¶
Each task has associated with it two sets of information that are also accessed by other threads:
- the task%t(:), task%dt(:), task%time, and task%it (equivalently task%iit) info about time slots
- the link%nbor nbor lists, which list the nbor tasks that the task depends on
The 1st set of values is accessed twice: first in list_t%check_nbors() and
check_ready(), which use task_t%time and task_t%dtime to determine if the
nbors of a task are sufficiently advanced in time to consider the task ready
to update. Later, when the guard zone values are downloaded, all of the first
(nt-1) values of task_t%t(:) and task_t%dt(:) are needed.
The 2nd set of values (or pointers) is used in both of these steps as well. However, since the nbor lists are either not changed or else cached, there is no need for locking in this context, except very briefly while making the sorted cache copy and, correspondingly, when changing the nbor lists in connection with refine / derefine.
Auto-generated documentation¶
The code is documented with internal comment blocks such as this one:
!======================================================================
!> Comment test
!======================================================================
at the top of modules and module procedures. Such comment blocks are processed by Doxygen at ReadTheDocs, and turned into a set of HTML pages, showing the relations between modules, data types, and module data and procedures.