Tuesday, February 16, 2010

Supporting clone file semantics in taskfs

This is another blog post describing something that is already implemented but not yet properly documented: the taskfs support for clone file semantics. The revision 6feb9d0431 referred to in this post contains many bugfixes and optimizations, but those are not the main concern here. This revision is also tagged as v0.0.1 and is hopefully stable.

Let's start with what clone file semantics means. When you read from the clone file, it returns the name of the resource allocated for the current session, and it converts the clone file into the ctl file of that session's resource. After this read, you are supposed to treat the clone file channel/descriptor as the ctl file channel/descriptor. Whenever this descriptor is closed, taskfs assumes that the session is no longer needed by the user: it frees the remote resources and also releases the taskfs session for future session requests. The main reason for this behavior is that taskfs can easily reclaim the resources and do garbage collection once the clone file descriptor is closed.

This semantic of releasing the resources when the clone and ctl files are closed is a little unnatural for POSIX and command-prompt users. It also means that you cannot use input/output redirection or the cat command on these files from the shell, as the shell automatically closes them once the command completes. For example, consider the following commands.

$ cd mpoint/remote
$ cat clone
0
$ cat clone
0
$

Here, we requested a new taskfs session two times, and it returned the same session both times. This may look erroneous to POSIX users, but realize that cat clone reads and shows the resource reserved by taskfs and then closes the clone file. This close triggers the release of the taskfs session resource (in this case, resource 0). The released resource is then allocated again to the next user who requests it by opening and reading from clone.

Now let's see one more example, this time writing to the ctl file of a taskfs session resource.

$ cat clone
0
$ cd 0
$ echo res 4 > ctl

This command writes res 4 to the ctl file and closes the file afterwards. As a result, taskfs allocates 4 remote resources and then releases them upon receiving the close call. This is not the desired behavior for command-line users. The take-home message here is: don't use input redirection or the cat command on the clone and ctl files.
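If you still want to drive taskfs from an interactive shell, one possible workaround (an untested sketch; the paths are the ones used in this post) is to hold a single descriptor open with the shell's exec builtin, so the session is not released after every command:

```shell
# Untested sketch: keep the clone descriptor open across several
# commands, instead of letting each command close the file.
# $1 is the path to the clone file (mpoint/remote/clone in this post).
reserve() {
    exec 3<> "$1" || return 1  # open clone read/write as fd 3
    read session <&3           # the first read returns the session name
    echo "res 4" >&3           # the same descriptor now acts as ctl
    echo "exec hostname" >&3
    echo "$session"            # session stays alive while fd 3 is open
}

# reserve mpoint/remote/clone   # prints the session name, e.g. 0
# ... work with mpoint/remote/<session>/stdio here ...
# exec 3>&-                     # closing fd 3 releases the session
```

The point of the exec builtin here is that fd 3 belongs to the shell itself, so the descriptor (and the taskfs session behind it) survives until you close it explicitly.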


The proper way (i.e., the clone filesystem way) to use this filesystem is:

  1. Open the clone file.

  2. Read the resource name from the clone file.

  3. Write res n into the same clone file descriptor, where n is the number of resources needed.

  4. Write exec cmd into the same clone file descriptor, where cmd is the command to be executed.

  5. Open the stdio file in write mode, write the input data into this file descriptor, and then close the descriptor once all input is written.

  6. Open the stdio file in read mode, read the data from this file descriptor, and then close the descriptor once all data is read. This is the standard output of your command.

  7. Open the stderr file in read mode, read the data from this file descriptor, and then close the descriptor once all data is read. This is the standard error of your command.

  8. Close the clone file descriptor. This will automatically release all resources allocated for this taskfs session, including the remote resources reserved by res n.
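The eight steps above can be sketched as a small shell function (an untested sketch against taskfs itself; the res 4 count, paths, and file names follow this post):

```shell
# Untested sketch of the eight steps above.
# $1 = path to the remote directory (e.g. mpoint/remote),
# $2 = command to execute, $3 = optional input file.
run_task() {
    remote=$1; cmd=$2; input=$3
    exec 3<> "$remote/clone" || return 1        # 1. open clone, keep fd open
    read session <&3                            # 2. read the session name
    echo "res 4" >&3                            # 3. reserve remote resources
    echo "exec $cmd" >&3                        # 4. start the command
    if [ -n "$input" ]; then
        cat "$input" > "$remote/$session/stdio" # 5. write stdin, then close it
    fi
    cat "$remote/$session/stdio"                # 6. read standard output
    cat "$remote/$session/stderr" >&2           # 7. read standard error
    exec 3>&-                                   # 8. close clone: release all
}

# run_task mpoint/remote hostname
# run_task mpoint/remote "wc -l" cloneUser.c
```

Note that closing the per-session stdio and stderr files (steps 5 to 7) is fine; only the clone/ctl descriptor must stay open until you are done with the session.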


Here is the small C program, cloneUser.c, that I wrote to perform all of the above steps.

Now let's see an example run showing how it can be used.

$ 9pfuse localhost:5555 mpoint
$
$ tree mpoint
mpoint
`-- remote
    |-- arch [error opening dir]
    |-- clone
    |-- env [error opening dir]
    |-- fs [error opening dir]
    |-- ns [error opening dir]
    `-- status [error opening dir]

6 directories, 1 file
$
$ ./cloneUser mpoint/remote/clone 4 "hostname"
[[DEBUG]] command is [hostname] and input file [(null)]
[[DEBUG]] clone returned [0]
[[DEBUG]] path to ctl file is [mpoint/remote//0/ctl]
[[DEBUG]] Writing command [res 4]
[[DEBUG]] Writing command [exec hostname]
[[DEBUG]] input file path [mpoint/remote//0/stdio]
[[DEBUG]] opening [mpoint/remote//0/stdio] for reading output
BlackPearl
inferno-test
inferno-test
BlackPearl
$
$ tree mpoint
mpoint
`-- remote
    |-- 0
    |   |-- args
    |   |-- ctl
    |   |-- env
    |   |-- ns
    |   |-- status
    |   |-- stderr
    |   |-- stdin
    |   |-- stdio
    |   |-- stdout
    |   `-- wait
    |-- arch [error opening dir]
    |-- clone
    |-- env [error opening dir]
    |-- fs [error opening dir]
    |-- ns [error opening dir]
    `-- status [error opening dir]

7 directories, 11 files
$
$ ./cloneUser mpoint/remote/clone 4 "wc -l" cloneUser.c
[[DEBUG]] command is [wc -l] and input file [cloneUser.c]
[[DEBUG]] clone returned [0]
[[DEBUG]] path to ctl file is [mpoint/remote//0/ctl]
[[DEBUG]] Writing command [res 4]
[[DEBUG]] Writing command [exec wc -l]
[[DEBUG]] input file path [mpoint/remote//0/stdio]
[[DEBUG]] opening [mpoint/remote//0/stdio] for reading output
192
192
192
192
$
$ tree mpoint
mpoint
`-- remote
    |-- 0
    |   |-- args
    |   |-- ctl
    |   |-- env
    |   |-- ns
    |   |-- status
    |   |-- stderr
    |   |-- stdin
    |   |-- stdio
    |   |-- stdout
    |   `-- wait
    |-- arch [error opening dir]
    |-- clone
    |-- env [error opening dir]
    |-- fs [error opening dir]
    |-- ns [error opening dir]
    `-- status [error opening dir]

7 directories, 11 files
$


In the above test run, cloneUser is the compiled cloneUser.c program mentioned above. In the first execution, 4 remote nodes are used to execute the hostname command. You can observe that after cloneUser terminates, all remote resources are released, which is why they are not visible in the directory structure. The second invocation demonstrates the use of an input file with cloneUser: it sends the cloneUser.c file itself as input to the wc -l program, which returns the number of lines in the code. All four remote resources report the correct line count, showing the expected behavior. You can also observe that the second execution of cloneUser reuses taskfs session 0, which was released when the first execution completed. This test run is therefore a good example of how resource reclamation works in taskfs.
