DISPATCH
Dispatcher method that relies on all threads maintaining a "ready queue" of tasks that are ready for updating. Each thread picks the task at the head of the queue, and checks all nbors for tasks that are ready, putting them back in the queue. The method is similar to method5, except that the state of each task is given by the value of task_tstate rather than by status bits in task_tstatus, and that locks are used to protect the ready queue on the one hand and each individual task on the other.
Data Types
type | dispatcher6_t
Variables
integer, dimension(0:4), save | n_state = 0
type(dispatcher6_t), public | dispatcher6
READY QUEUE: The ready queue consists of links, starting with task_listqueue and continuing with linknext_time. Any change to one of these quantities requires the thread to first acquire the task_listlock. All threads compete for this lock, so the time it is held must be as short as possible.
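The locking discipline for the ready queue can be sketched as follows. This is a minimal illustration, not the module's actual code: the types link_t and task_list_t, the tail pointer, and the routines init, push and pop_head are hypothetical names chosen for the example, and the locks are plain OpenMP locks from omp_lib (compile with OpenMP enabled, e.g. gfortran -fopenmp). The tail pointer is an assumption added so that both operations hold the lock for a constant, short time.

```fortran
module ready_queue_sketch
  use omp_lib
  implicit none
  type :: link_t                       ! hypothetical link type
    integer :: id = 0
    type(link_t), pointer :: next_time => null()
  end type link_t
  type :: task_list_t                  ! hypothetical list type
    type(link_t), pointer :: queue => null()   ! head of the ready queue
    type(link_t), pointer :: tail => null()    ! assumption: keeps push O(1)
    integer(omp_lock_kind) :: lock
  end type task_list_t
contains
  subroutine init (list)
    type(task_list_t), intent(inout) :: list
    nullify (list%queue, list%tail)
    call omp_init_lock (list%lock)
  end subroutine init
  ! Append a link at the end of the queue; only pointer updates are locked
  subroutine push (list, item)
    type(task_list_t), intent(inout) :: list
    type(link_t), pointer :: item
    nullify (item%next_time)
    call omp_set_lock (list%lock)
    if (associated(list%tail)) then
      list%tail%next_time => item
    else
      list%queue => item
    end if
    list%tail => item
    call omp_unset_lock (list%lock)
  end subroutine push
  ! Detach the head of the queue; head is returned disassociated if empty
  subroutine pop_head (list, head)
    type(task_list_t), intent(inout) :: list
    type(link_t), pointer :: head
    call omp_set_lock (list%lock)
    head => list%queue
    if (associated(head)) then
      list%queue => head%next_time
      if (.not. associated(list%queue)) nullify (list%tail)
      nullify (head%next_time)
    end if
    call omp_unset_lock (list%lock)
  end subroutine pop_head
end module ready_queue_sketch

program ready_queue_demo
  use ready_queue_sketch
  implicit none
  type(task_list_t) :: list
  type(link_t), target :: links(8)
  type(link_t), pointer :: p
  integer :: i, n_popped
  call init (list)
  do i = 1, size(links)
    links(i)%id = i
    p => links(i)
    call push (list, p)
  end do
  n_popped = 0
  ! All threads compete for the head of the queue until it is empty
  !$omp parallel private(p) reduction(+:n_popped)
  do
    call pop_head (list, p)
    if (.not. associated(p)) exit
    n_popped = n_popped + 1
  end do
  !$omp end parallel
  if (n_popped /= size(links)) error stop 'queue lost or duplicated a link'
  call omp_destroy_lock (list%lock)
  print *, 'links popped:', n_popped
end program ready_queue_demo
```

The task update itself would happen between pop_head and the next push, outside the list lock, so the lock only covers a handful of pointer assignments.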
TASK LOCKS: Each task can change state from state 0 (dormant) to state 1 (ready to be updated), state 2 (busy = being updated), state 3 (just updated), and state 9 (finished). Each such change of state must be done by first acquiring the tasklock, then changing the state variable, and then releasing the lock. The state of a task needs to be checked in various contexts, and to get a unique answer from such a check the tasklock must be held during the check. Some of these checks are done in connection with changing the state, so the lock is already held. Other checks (e.g. the check on whether the task is ahead of or behind another task) do not occur in connection with a change of state, and then the tasklock must be explicitly acquired before the check and released afterwards. Each lock set/unset may take of the order of 0.1-0.3 microseconds, so up to several dozen of these can be done without significantly increasing the computing time, since each task update typically takes several tens of milliseconds. A tasklock is typically acquired as many times as there are nbors in the nbor list (or at most that many times), so the set/unset of locks should not take significant time, and since a specific tasklock has no impact on other tasks, this should scale to any number of threads per process.
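A change of state combined with a check of the current state might look like the sketch below. The type task_t, its components, and the routine try_transition are hypothetical names for illustration only; the state values are the ones listed above, and the lock is an OpenMP lock from omp_lib (compile with e.g. gfortran -fopenmp). The demo lets all threads try to claim the same ready task and verifies that exactly one of them succeeds.

```fortran
module task_state_sketch
  use omp_lib
  implicit none
  ! State values as listed in the text
  integer, parameter :: dormant = 0, ready = 1, busy = 2, updated = 3, &
                        finished = 9
  type :: task_t                       ! hypothetical task type
    integer :: state = dormant
    integer(omp_lock_kind) :: lock
  end type task_t
contains
  ! Change state from "from" to "to" under the task lock. The check of the
  ! current state is done while the lock is held, so its answer is unique;
  ! ok is .false. if the task was not in state "from".
  subroutine try_transition (task, from, to, ok)
    type(task_t), intent(inout) :: task
    integer, intent(in) :: from, to
    logical, intent(out) :: ok
    call omp_set_lock (task%lock)
    ok = (task%state == from)
    if (ok) task%state = to
    call omp_unset_lock (task%lock)
  end subroutine try_transition
end module task_state_sketch

program task_state_demo
  use task_state_sketch
  implicit none
  type(task_t) :: task
  logical :: ok
  integer :: n_claimed
  call omp_init_lock (task%lock)
  call try_transition (task, dormant, ready, ok)
  if (.not. ok) error stop 'task should have been dormant'
  n_claimed = 0
  ! Every thread tries to take the ready task; only one may get it
  !$omp parallel private(ok) reduction(+:n_claimed)
  call try_transition (task, ready, busy, ok)
  if (ok) n_claimed = n_claimed + 1
  !$omp end parallel
  if (n_claimed /= 1) error stop 'task claimed by more than one thread'
  call try_transition (task, busy, updated, ok)
  if (.not. ok) error stop 'task should have been busy'
  call omp_destroy_lock (task%lock)
  print *, 'final state:', task%state
end program task_state_demo
```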
When should a task be locked? The most conservative choice is that a task is locked the entire time it is being updated, i.e. while it is in state 2. The least conservative choice is that a task is locked only when it changes state and when its state is being tested. The task cycle consists of a long dormant time, a short (e.g. a few percent) update time, and the very brief times when it changes state. The main outside reference to task data is when the task time is compared to that of another task. The task time should only be updated while the task is locked, when going from state 2 to state 3. While the task is in state 0, 1, or 2, it may still have a sufficiently advanced time to be able to serve guard zones, so there is no reason to exclude any state, or to require a particular state, while it is being tested. It just needs to be locked by the thread performing the comparison.
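The least conservative choice described above could be sketched like this. All names (task_t, finish_update, locked_time, is_ahead) are hypothetical and not taken from the module; the sketch assumes OpenMP locks from omp_lib and a compiler with OpenMP enabled. The time is advanced only under the lock, together with the change from state 2 to state 3, and the comparison locks each task in turn, regardless of its state, never holding two locks at once.

```fortran
module task_time_sketch
  use omp_lib
  use iso_fortran_env, only: real64
  implicit none
  type :: task_t                       ! hypothetical task type
    integer :: state = 0
    real(real64) :: time = 0.0_real64
    integer(omp_lock_kind) :: lock
  end type task_t
contains
  ! Advance the task time while going from state 2 (busy) to state 3
  ! (just updated); both changes happen under the same lock
  subroutine finish_update (task, dt)
    type(task_t), intent(inout) :: task
    real(real64), intent(in) :: dt
    call omp_set_lock (task%lock)
    task%time = task%time + dt
    task%state = 3
    call omp_unset_lock (task%lock)
  end subroutine finish_update
  ! Read the time of a task under its lock, whatever state it is in
  function locked_time (task) result (t)
    type(task_t), intent(inout) :: task
    real(real64) :: t
    call omp_set_lock (task%lock)
    t = task%time
    call omp_unset_lock (task%lock)
  end function locked_time
  ! True if nbor has reached at least the time of self
  function is_ahead (nbor, self) result (ahead)
    type(task_t), intent(inout) :: nbor, self
    logical :: ahead
    real(real64) :: t_nbor, t_self
    t_nbor = locked_time (nbor)
    t_self = locked_time (self)
    ahead = (t_nbor >= t_self)
  end function is_ahead
end module task_time_sketch

program task_time_demo
  use task_time_sketch
  implicit none
  type(task_t) :: self, nbor
  call omp_init_lock (self%lock)
  call omp_init_lock (nbor%lock)
  self%time = 0.5_real64
  nbor%time = 1.0_real64
  if (.not. is_ahead (nbor, self)) error stop 'nbor should be ahead'
  self%state = 2
  call finish_update (self, 1.0_real64)
  if (is_ahead (nbor, self)) error stop 'nbor should now be behind'
  call omp_destroy_lock (self%lock)
  call omp_destroy_lock (nbor%lock)
  print *, 'state and time of self:', self%state, self%time
end program task_time_demo
```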
When a task is used as a source for guard zone loading it may be wise and conservative to lock it, so the time slots cannot be rotated during the guard zone data acquisition. In principle this may not be necessary, at least if one does NOT make use of the indirect addressing array iit(), which is changed during a rotate. However, it may still be risky, since a time slot that is being used during guard zone interpolations might be overwritten with new data during the computations if the task is not locked. Since the time a task is needed as a source for guard zone values is only 5-10% of the update time, locking it briefly during this time may be ok. There are, however, about 26 such accesses per time step, so that could start to become a problem if the source task is locked during the whole interpolation time. A better approach is to lock briefly, compute the memory slot needed, check if there is a risk that it will be overwritten (this would be the case only if the slot is nr 1 in the iit() array), and keep the task locked only if this is the case.
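The "lock briefly" approach might be sketched as follows. This is an illustration under stated assumptions, not the module's implementation: task_t, nt, acquire_source, release_source and rotate are hypothetical names, the caller is assumed to already know the position j in iit() of the time slot it needs, and rotate is assumed to be a circular shift of iit() done under the task lock. OpenMP locks from omp_lib are used (compile with e.g. gfortran -fopenmp).

```fortran
module guard_source_sketch
  use omp_lib
  implicit none
  integer, parameter :: nt = 5         ! assumed number of time slots
  type :: task_t                       ! hypothetical task type
    integer :: iit(nt) = [1, 2, 3, 4, 5]   ! indirect addressing of slots
    integer(omp_lock_kind) :: lock
  end type task_t
contains
  ! Assumed form of a rotate: shift iit() circularly, under the task lock
  subroutine rotate (task)
    type(task_t), intent(inout) :: task
    call omp_set_lock (task%lock)
    task%iit = cshift (task%iit, 1)
    call omp_unset_lock (task%lock)
  end subroutine rotate
  ! Lock briefly, look up the memory slot at position j in iit(), and keep
  ! the lock only if that slot is nr 1 in iit(), i.e. at risk of being
  ! overwritten. On return, locked tells whether the lock is still held.
  subroutine acquire_source (src, j, slot, locked)
    type(task_t), intent(inout) :: src
    integer, intent(in) :: j
    integer, intent(out) :: slot
    logical, intent(out) :: locked
    call omp_set_lock (src%lock)
    slot = src%iit(j)
    locked = (j == 1)
    if (.not. locked) call omp_unset_lock (src%lock)
  end subroutine acquire_source
  ! Release the lock if acquire_source left it held
  subroutine release_source (src, locked)
    type(task_t), intent(inout) :: src
    logical, intent(inout) :: locked
    if (locked) call omp_unset_lock (src%lock)
    locked = .false.
  end subroutine release_source
end module guard_source_sketch

program guard_source_demo
  use guard_source_sketch
  implicit none
  type(task_t) :: src
  integer :: slot
  logical :: locked
  call omp_init_lock (src%lock)
  ! A slot that is not nr 1: the lock is released right away
  call acquire_source (src, 3, slot, locked)
  if (slot /= 3 .or. locked) error stop 'expected slot 3, not locked'
  ! Slot nr 1: the lock stays held during the (omitted) interpolation
  call acquire_source (src, 1, slot, locked)
  if (slot /= 1 .or. .not. locked) error stop 'expected slot 1, locked'
  call release_source (src, locked)
  ! With the lock released, the owner of the task can rotate its slots
  call rotate (src)
  if (src%iit(1) /= 2) error stop 'unexpected iit() after rotate'
  call omp_destroy_lock (src%lock)
  print *, 'iit after rotate:', src%iit
end program guard_source_demo
```

With this pattern most of the roughly 26 accesses per time step hold the source lock only for the slot lookup, and only accesses that read the slot at risk block a rotate for the duration of the interpolation.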