Friday, February 19, 2010

Hierarchical job distribution support in Taskfs

This is a quick post to report that hierarchical task distribution in taskfs is complete. Until now, taskfs worked by distributing tasks to the devcmd2 of remote nodes, which then executed the jobs. That is a flat hierarchy with a single level. In the hierarchical system, a client taskfs mounts a remote taskfs, which in turn mounts the next level of taskfs, and so on.

To implement this, local task execution support was added to taskfs by merging it with devcmd2. As a result, every taskfs can now behave as a leaf node, an internal node, or the root node. With these taskfs instances as nodes and mount links as edges, one can build a directed graph.

A task given to the root taskfs is distributed equally among its immediate children. These children in turn distribute their share of the load among their own immediate children. Input is propagated the same way, from the root to the leaf nodes through the tree, and output is aggregated at each internal node before being passed to its parent.
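
To make the distribution and aggregation concrete, here is a minimal sketch in Python. It is only a model of the idea, not the taskfs code: the Node class and its run method are hypothetical, a leaf simply reports its own name once per task in place of real command output, and the handling of a task count that does not divide evenly (earlier children get one extra task) is my assumption, since only the equal split is described here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for one taskfs instance in the tree."""
    name: str
    children: List["Node"] = field(default_factory=list)

    def run(self, ntasks: int) -> List[str]:
        """Run ntasks below this node and return the aggregated output."""
        if ntasks <= 0:
            return []
        if not self.children:
            # Leaf node: execute every task locally.
            return [self.name] * ntasks
        # Internal node: split the load equally among immediate children.
        # Assumption: a remainder is handed to the first children.
        base, extra = divmod(ntasks, len(self.children))
        output: List[str] = []
        for i, child in enumerate(self.children):
            share = base + (1 if i < extra else 0)
            # Aggregate each child's output before returning to the parent.
            output.extend(child.run(share))
        return output


# Node names are illustrative labels only.
layer3 = Node("layer3")
first = Node("first-child", [layer3])
second = Node("second-child")
root = Node("root", [first, second])
print(root.run(4))
```

With 4 tasks, each child of the root receives 2. The first child delegates both of its tasks to its only child, and the second child, being a leaf, runs its 2 tasks itself.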

This directed graph is created automatically, based on which remote taskfs filesystems are mounted. A taskfs that has no other taskfs mounted treats itself as a leaf node. One can also force local execution of a task by sending the res 0 command, or by sending the exec command directly without making a reservation. In other words, taskfs executes a job locally if there was no prior reservation command or if the prior reservation command requested zero remote resources.
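
The rule for local execution can be written down as a small predicate. This is a sketch of the rule as stated in this post, not code taken from taskfs; the function name and its parameters are hypothetical.

```python
from typing import Optional


def runs_locally(mounted_children: int, reserved: Optional[int]) -> bool:
    """Decide whether a taskfs would execute a job itself.

    mounted_children: number of remote taskfs filesystems mounted here.
    reserved: value of the prior "res" command, or None if "exec"
              was sent without any reservation.
    """
    if mounted_children == 0:
        # Nothing mounted below: this taskfs treats itself as a leaf.
        return True
    if reserved is None:
        # exec sent directly, with no prior reservation.
        return True
    # "res 0" requests zero remote resources and forces local execution.
    return reserved == 0


print(runs_locally(mounted_children=2, reserved=0))
print(runs_locally(mounted_children=2, reserved=4))
```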

Revision 103b7aa87c is the first revision providing this support. One can use the same methodology described in the previous blog post to execute jobs. The only visible difference is that when resources are reserved, not all of them may appear in the session directory; only the immediate children of this taskfs appear there.
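
For readers who do not have the previous post at hand, the sequence can be sketched in Python roughly as follows. It mirrors the steps that the cloneUserPause debug output prints (read the session number from clone, write res and exec to ctl, read stdio), but it is not that program. The function name run_session is hypothetical, and opening clone read-only and writing each command as a single write without a trailing newline are assumptions on my part.

```python
import os


def run_session(remote_dir: str, ntasks: int, command: str) -> str:
    """Reserve ntasks resources, execute command, return the output."""
    # Keep clone open for the whole session; the session is tied to it.
    clone_fd = os.open(os.path.join(remote_dir, "clone"), os.O_RDONLY)
    try:
        # Reading clone yields the session number, e.g. "0".
        session = os.read(clone_fd, 64).decode().strip()
        session_dir = os.path.join(remote_dir, session)

        ctl_fd = os.open(os.path.join(session_dir, "ctl"), os.O_WRONLY)
        try:
            # One write per command (assumed framing).
            os.write(ctl_fd, f"res {ntasks}".encode())
            os.write(ctl_fd, f"exec {command}".encode())
        finally:
            os.close(ctl_fd)

        # The aggregated output of all tasks is read from stdio.
        with open(os.path.join(session_dir, "stdio"), "rb") as out:
            return out.read().decode()
    finally:
        os.close(clone_fd)
```

Against a mounted tree this would be called as run_session("mpoint/remote", 4, "pwd").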


Now, let's look at a live example of the new taskfs and what this tree looks like. The execution is similar to the one described in the previous blog post, so the steps are not repeated here. This tree was created for a request of 4 tasks.

$ 9pfuse localhost:5555 mpoint
$ tree mpoint
mpoint
`-- remote
    |-- arch [error opening dir]
    |-- clone
    |-- env [error opening dir]
    |-- fs [error opening dir]
    |-- ns [error opening dir]
    `-- status [error opening dir]

6 directories, 1 file
$ ./cloneUserPause mpoint/remote/clone 4 "pwd"
[[DEBUG]] command is [pwd] and input file [(null)]
[[DEBUG]] clone returned [0]
[[DEBUG]] path to ctl file is [mpoint/remote//0/ctl]
[[DEBUG]] Writing command [res 4]
[[DEBUG]] Writing command [exec pwd]
[[DEBUG]] input file path [mpoint/remote//0/stdio]
[[DEBUG]] opening [mpoint/remote//0/stdio] for reading output
/home/pravin/inferno/layer3/hg
/home/pravin/inferno/layer3/hg
/home/pravin/inferno/pravin2/hg
/home/pravin/inferno/pravin2/hg
$ tree mpoint
mpoint
`-- remote
    |-- 0
    |   |-- 0
    |   |   |-- 0
    |   |   |   |-- 0
    |   |   |   |   |-- args
    |   |   |   |   |-- ctl
    |   |   |   |   |-- env
    |   |   |   |   |-- ns
    |   |   |   |   |-- status
    |   |   |   |   |-- stderr
    |   |   |   |   |-- stdin
    |   |   |   |   |-- stdio
    |   |   |   |   |-- stdout
    |   |   |   |   `-- wait
    |   |   |   |-- 1
    |   |   |   |   |-- args
    |   |   |   |   |-- ctl
    |   |   |   |   |-- env
    |   |   |   |   |-- ns
    |   |   |   |   |-- status
    |   |   |   |   |-- stderr
    |   |   |   |   |-- stdin
    |   |   |   |   |-- stdio
    |   |   |   |   |-- stdout
    |   |   |   |   `-- wait
    |   |   |   |-- args
    |   |   |   |-- ctl
    |   |   |   |-- env
    |   |   |   |-- ns
    |   |   |   |-- status
    |   |   |   |-- stderr
    |   |   |   |-- stdin
    |   |   |   |-- stdio
    |   |   |   |-- stdout
    |   |   |   `-- wait
    |   |   |-- args
    |   |   |-- ctl
    |   |   |-- env
    |   |   |-- ns
    |   |   |-- status
    |   |   |-- stderr
    |   |   |-- stdin
    |   |   |-- stdio
    |   |   |-- stdout
    |   |   `-- wait
    |   |-- 1
    |   |   |-- 0
    |   |   |   |-- args
    |   |   |   |-- ctl
    |   |   |   |-- env
    |   |   |   |-- ns
    |   |   |   |-- status
    |   |   |   |-- stderr
    |   |   |   |-- stdin
    |   |   |   |-- stdio
    |   |   |   |-- stdout
    |   |   |   `-- wait
    |   |   |-- 1
    |   |   |   |-- args
    |   |   |   |-- ctl
    |   |   |   |-- env
    |   |   |   |-- ns
    |   |   |   |-- status
    |   |   |   |-- stderr
    |   |   |   |-- stdin
    |   |   |   |-- stdio
    |   |   |   |-- stdout
    |   |   |   `-- wait
    |   |   |-- args
    |   |   |-- ctl
    |   |   |-- env
    |   |   |-- ns
    |   |   |-- status
    |   |   |-- stderr
    |   |   |-- stdin
    |   |   |-- stdio
    |   |   |-- stdout
    |   |   `-- wait
    |   |-- args
    |   |-- ctl
    |   |-- env
    |   |-- ns
    |   |-- status
    |   |-- stderr
    |   |-- stdin
    |   |-- stdio
    |   |-- stdout
    |   `-- wait
    |-- arch [error opening dir]
    |-- clone
    |-- env [error opening dir]
    |-- fs [error opening dir]
    |-- ns [error opening dir]
    `-- status [error opening dir]

14 directories, 81 files


The root had two children, one at /home/pravin/inferno/pravin/inferno-rtasks and the other at /home/pravin/inferno/pravin2/hg. The first child had its own child running at /home/pravin/inferno/layer3/hg. As the output shows, the actual execution was done only by the leaf taskfs instances; the internal taskfs instances only delegated the work and aggregated the output. The tree above helps to visualize this delegation of work.
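
One way to read the delegation off the mounted tree is to walk a session directory and collect the sessions that have no numbered subdirectories. The sketch below does that in Python. The function name leaf_sessions is hypothetical, and it assumes, as in the listing above, that nested sessions are the only entries with purely numeric names.

```python
import os
from typing import List


def leaf_sessions(session_dir: str) -> List[str]:
    """Return the session directories below session_dir that have
    no numbered subdirectories of their own."""
    subs = sorted(
        (e for e in os.listdir(session_dir)
         if e.isdigit() and os.path.isdir(os.path.join(session_dir, e))),
        key=int,
    )
    if not subs:
        # No nested session: nothing was delegated further from here.
        return [session_dir]
    leaves: List[str] = []
    for sub in subs:
        leaves.extend(leaf_sessions(os.path.join(session_dir, sub)))
    return leaves
```

For the tree above, leaf_sessions("mpoint/remote/0") would list 0/0/0/0, 0/0/0/1, 0/1/0 and 0/1/1 under mpoint/remote, one per requested task. It has to be run while the session is still alive.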


Note: The tree shown above is available only while the user is still holding the root clone file open. As soon as this file is closed, the entire tree representing the remote resources is released. The following is the tree after the ./cloneUserPause program terminates.

$ tree mpoint
mpoint
`-- remote
    |-- 0
    |   |-- args
    |   |-- ctl
    |   |-- env
    |   |-- ns
    |   |-- status
    |   |-- stderr
    |   |-- stdin
    |   |-- stdio
    |   |-- stdout
    |   `-- wait
    |-- arch [error opening dir]
    |-- clone
    |-- env [error opening dir]
    |-- fs [error opening dir]
    |-- ns [error opening dir]
    `-- status [error opening dir]

7 directories, 11 files
