Next enhancement is to implement selective reservation using information available from status file. So, now resource reservations can also specify what type of resources are needed for their jobs. Reservations will be made only on those resources which have requested OS and architecture. If no OS or architecture is given or if wildcard is given instead, then any available resource will be used. Following is the new format of res command. res
Now, lets try out one demo. Python is used bellow to provide interactive job management with taskfs.
$ python
Python 2.6.4 (r264:75706, Dec 7 2009, 18:45:15)
[GCC 4.4.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys, os
>>> os.system ("9pfuse localhost:5555 ./mpoint")
0
>>> os.system ("tree ./mpoint")
./mpoint
`-- remote
|-- arch [error opening dir]
|-- clone
|-- env [error opening dir]
|-- fs [error opening dir]
|-- ns [error opening dir]
`-- status
5 directories, 2 files
0
>>> topStatusFile = open ( "./mpoint/remote/status", "r", 0)
>>> print topStatusFile.read()
Linux 386 4
>>> topStatusFile.close()
Till this point, we have mounted the taskfs, and we have checked the status of top level status file. It shows that there are 4 nodes, each with Linux running on 386 architecture. Now, lets do some reservation and demo job execution.
>>> ctlFile = open ( "./mpoint/remote/clone", "r+", 0)
>>> print ctlFile.read()
0
>>> os.system ("tree ./mpoint")
./mpoint
`-- remote
|-- 0
| |-- args
| |-- ctl
| |-- env
| |-- ns
| |-- status
| |-- stderr
| |-- stdin
| |-- stdio
| |-- stdout
| `-- wait
|-- arch [error opening dir]
|-- clone
|-- env [error opening dir]
|-- fs [error opening dir]
|-- ns [error opening dir]
`-- status
6 directories, 12 files
0
>>> sFile = open ("./mpoint/remote/0/status", "r", 0)
>>> print sFile.read()
cmd/0 2 Unreserved /home/pravin/projects/inferno/pravin//inferno-rtasks/ ''
>>> sFile.close()
>>> ctlFile.write ("res 5 Linux 386")
>>> sFile = open ("./mpoint/remote/0/status", "r", 0)
>>> print sFile.read()
cmd/1 2 Closed /home/pravin/inferno/layer3//hg/ ''
cmd/2 2 Closed /home/pravin/inferno/layer3//hg/ ''
cmd/3 2 Closed /home/pravin/inferno/layer3//hg/ ''
cmd/1 2 Closed /home/pravin/inferno/pravin2//hg/ ''
cmd/2 2 Closed /home/pravin/inferno/pravin2//hg/ ''
>>> sFile.close()
>>> ctlFile.write ("exec hostname")
>>> ioFile = open ("./mpoint/remote/0/stdio", "rr", 0)
>>> print ioFile.read()
inferno-test
inferno-test
inferno-test
inferno-test
inferno-test
>>> ioFile.close()
>>> sFile = open ("./mpoint/remote/0/status", "r", 0)
>>> print sFile.read()
cmd/1 2 Done /home/pravin/inferno/layer3//hg/ hostname
cmd/2 2 Done /home/pravin/inferno/layer3//hg/ hostname
cmd/3 2 Done /home/pravin/inferno/layer3//hg/ hostname
cmd/1 2 Done /home/pravin/inferno/pravin2//hg/ hostname
cmd/2 2 Done /home/pravin/inferno/pravin2//hg/ hostname
>>> sFile.close()
>>> topStatusFile = open ("./mpoint/remote/status", "r", 0)
>>> print topStatusFile.read()
Linux 386 4
Following the demo of how resource re-claimation works once clone file is closed.
>>> os.system ("tree -d ./mpoint")
./mpoint
`-- remote
|-- 0
| |-- 0
| | `-- 0
| | |-- 0
| | |-- 1
| | `-- 2
| `-- 1
| |-- 0
| `-- 1
|-- arch [error opening dir]
|-- env [error opening dir]
|-- fs [error opening dir]
`-- ns [error opening dir]
14 directories
0
>>> ctlFile.close ()
>>> os.system ("tree -d ./mpoint")
./mpoint
`-- remote
|-- 0
|-- arch [error opening dir]
|-- env [error opening dir]
|-- fs [error opening dir]
`-- ns [error opening dir]
6 directories
0
>>>
Unfortunately, I do not have testbed with different OS and architectures running. So, selective repeat is practically useless in this case. But it is quite handy when one is dealing with diverse mix of compute nodes.
The code revision ae33f01fea onwards, selective reservation works.
No comments:
Post a Comment