Here are some ideas:
Sleigh is the part of NetWorkSpaces that allows you to execute tasks in parallel. NetWorkSpaces also includes other classes and methods that are used, for example, for communicating between different scripts.
NetWorkSpaces is rather similar to Linda, but was generally designed to be simpler, and to work well with scripting languages in particular. The primary simplification in NetWorkSpaces is in the matching rules. Linda has powerful, but somewhat complex rules for tuple matching. NetWorkSpaces uses named variables that can have zero or more values. The only "matching" that NetWorkSpaces uses is on the name of the variable.
Also, NetWorkSpaces allows you to easily define the order of the values of a variable. In Linda, if a tuple query can match multiple tuples, the actual value returned is not defined, requiring the programmer to include an extra field or fields to define the order. Linda's matching rules are very powerful, but NetWorkSpaces makes it trivial to do simple things, like returning the values in "first-in first-out" order, for example.
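The matching and ordering rules described above can be illustrated with a small in-memory toy model. This is only a sketch of the idea, not the NetWorkSpaces implementation or its API: the class name ToyWorkspace and its store and fetch methods are chosen for illustration, and nothing here talks to an NWS server.

```python
from collections import defaultdict, deque


class ToyWorkspace:
    """Toy model: each variable name maps to a FIFO queue of values."""

    def __init__(self):
        self._vars = defaultdict(deque)

    def store(self, name, value):
        # Append a value to the named variable; earlier values are kept.
        self._vars[name].append(value)

    def fetch(self, name):
        # Remove and return the oldest value. The only "matching" is on the name.
        if not self._vars[name]:
            raise KeyError("no values stored for %r" % name)
        return self._vars[name].popleft()


ws = ToyWorkspace()
for task in ("first", "second", "third"):
    ws.store("task", task)

# Values come back in the order they were stored (first-in first-out).
order = [ws.fetch("task") for _ in range(3)]
print(order)
```

In a tuple space, getting the same ordering would typically mean adding a sequence field to every tuple and matching on it explicitly.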
NetWorkSpaces also tries to simplify running tasks in parallel by providing high-level methods for executing tasks: the Sleigh methods, eachElem and eachWorker. These methods make writing embarrassingly parallel programs trivial, often without having to modify your existing code at all. And yet, you can also use eachWorker in much the same way that you use eval in Linda, allowing you to write more sophisticated parallel programs in NetWorkSpaces as well.
The best way to find that out is to time it yourself on your network and your machines. There are two example programs that act as simple benchmarks. One passes a token around a ring of all of the workers (ring.py) and one passes data in a star pattern (pping.py). The operations are timed, and a per-operation time is printed at the end of the program.
(Probably ought to include some example results)
Use strings. Strings are sent to the NWS server as plain ASCII text, and all clients can read them. That allows you to use XML or YAML to pass other data types between clients, since they can be encoded as strings.
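As a rough sketch of the string-encoding idea, the following Python code turns a list of numbers into an XML string and back, using only the standard library. The element names "numbers" and "n" and the two helper functions are made up for this example. Storing the resulting string in a workspace is not shown, and a client written in another language would need its own XML parser to decode it.

```python
import xml.etree.ElementTree as ET


def encode_numbers(values):
    # Build <numbers><n>...</n>...</numbers> and return it as a plain string.
    root = ET.Element("numbers")
    for v in values:
        item = ET.SubElement(root, "n")
        item.text = repr(v)
    return ET.tostring(root, encoding="unicode")


def decode_numbers(text):
    # Parse the XML string and convert each <n> element back to a float.
    root = ET.fromstring(text)
    return [float(item.text) for item in root.findall("n")]


payload = encode_numbers([1.5, 2.0, 3.25])
print(payload)
print(decode_numbers(payload))
```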
Failed to load application: No module named web
You installed twisted, but not twisted-web. The NWS server needs both. See the INSTALL file for more information.
The simplest way is to start R, and type in the command:
> install.packages("nws")
On the Windows version, you can use the Packages > Install Package(s)... menu item to install NetWorkSpaces.
Just make sure that you do this from an account with the privileges to write to the R installation.
If you don't have permission to write to the directory, please see the answer to "I don't have permission to write to R-2.2.1\library directory"
"path_to_R\bin\Rgui.exe" HOME=p:/ R_LIBS=p:/myRlib
R_LIBS=p:/myRlib
The simplest way is to set the NWS_SERVER_PORT environment variable when you start the server.
NWS_SERVER_PORT, NWS_WEB_PORT, NWS_INTERFACE, NWS_WEB_SERVED_DIR, and NWS_TMP_DIR.
It is a Twisted Application Configuration file that tells the Twisted framework how to construct an application. Twisted considers this to be a configuration file, although TAC files are actually Python code. To configure the NetWorkSpaces server, you can edit nws.tac if you wish, but many users will prefer to use environment variables.
On Posix systems, install the nws script as an init script.
On Windows systems, install the NWS Service, using the NwsService.py script.
Use eachWorker. Give it a function that does the initialization. After eachWorker returns, you can execute eachElem, knowing that every worker has executed your initialization function.
Execute an "eval" task using eachWorker. In Python, it would look like:
>>> s.eachWorker("import time")
Use the global variable "SleighArgs". For example:
>>> s.eachElem("'argument: ' + str(SleighArgs[0])", range(4))
Yes. There is a global variable called SleighRank that is available to your task function.
>>> s.eachWorker("'My rank is: %d' % SleighRank")
That isn't usually needed, but can be important for MPI-style parallel programs. One method is to fetch the nodeList variable, split it into a list by whitespace, and get the size of the resulting list. But it's much easier to simply pass the number of workers as one of the arguments to the worker function, since the master process knows (or can easily find out) the number of workers associated with a Sleigh.
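The first method amounts to a whitespace split. A minimal sketch, assuming the nodeList value has already been fetched as a string (the sample value and the helper name count_workers are hypothetical):

```python
def count_workers(node_list_value):
    # split() with no arguments splits on any run of whitespace
    # and ignores leading and trailing whitespace.
    return len(node_list_value.split())


# Hypothetical nodeList contents; the real value comes from the sleigh workspace.
sample = "node1 node2  node3\nnode4"
print(count_workers(sample))
```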
Please refer to the user manual's "Getting Started" chapter to set up an rsh server on Windows. Once that's done, specify an appropriate launch function using the "launch" argument to the Sleigh constructor.
In R:
> s = sleigh(launch=rshcmd)
In Python:
>>> from nws.sleigh import rshcmd
>>> s = Sleigh(launch=rshcmd)
Use the sshforwardcmd launch function.
(More details, please)
Just submit your Sleigh program using qsub (or bsub), specifying the number of nodes to use. When the batch queueing system runs your script, it tells you what nodes it has allocated for you to run on using an environment variable (PBS_NODEFILE for PBS, LSB_HOSTS for LSF). Use that information to compute the appropriate nodeList when constructing your Sleigh. See the batchqueueing.py example program for an example of this technique.
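A sketch of how the node list might be computed from those environment variables is shown below. It is not the code from batchqueueing.py. It assumes that PBS_NODEFILE names a file with one host name per line and that LSB_HOSTS holds a space-separated list of host names. The function name nodes_from_batch_env is made up for this example.

```python
import os


def nodes_from_batch_env(env=None):
    # Use the real environment unless a dictionary is passed in (handy for testing).
    env = os.environ if env is None else env

    # PBS: the variable names a file, assumed to list one host per line.
    node_file = env.get("PBS_NODEFILE")
    if node_file:
        with open(node_file) as f:
            return [line.strip() for line in f if line.strip()]

    # LSF: the variable itself is assumed to hold the host names.
    hosts = env.get("LSB_HOSTS")
    if hosts:
        return hosts.split()

    # Not running under either batch system.
    return []


print(nodes_from_batch_env({"LSB_HOSTS": "n1 n1 n2"}))
```

The resulting list would then be passed as the nodeList argument when constructing the Sleigh. A host may appear more than once if the batch system allocated several processors on it.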
You can use rsh, ssh, or the "web launch" method to start Sleigh workers. To use rsh or ssh, you have to run an rsh or ssh server on each of the worker machines. However, we don't really recommend using ssh to start Sleigh workers, since the quoting style used by UNIX-like ssh servers (such as Cygwin or copSSH) differs from that of the Windows platform, and the resulting conflicts are hard to debug.
Yes, you can. You need to have an SSH client installed on the Windows machine, and you need to set up password-less login to the Linux cluster. Instructions for setting up password-less login are available in the User Guide's Getting Started chapter, or see "How do I stop ssh from asking me for my password?". Once these two steps are done, you can simply create a sleigh using the ssh launch method. For example, in R:
> s = sleigh(launch=sshcmd, nodeList=c('linux1'), scriptName='RNWSSleighWorker.sh',
+            scriptDir='/usr/local/lib64/R/library/nws/bin', scriptExec=envcmd, workingDir='/home/user', rprog='R')
Use the workingDir argument to the Sleigh constructor:
>>> s = Sleigh(workingDir='/tmp')
Currently, there isn't a simple method of setting the working directory differently on different workers.
Use the modulePath argument to the Sleigh constructor, or set PYTHONPATH in your shell startup script on the worker machines (which is pretty simple if you have a common, NFS-mounted home directory on each of the worker machines).
The following shows one way to set up password-less ssh login.
To test the password-less login, type the following command:
% ssh hostname date
If everything is set up correctly, you should not be asked for a password, and the current date on the remote machine will be returned.
Currently, you cannot execute more than one eachElem or eachWorker job at the same time on the same Sleigh object, even using non-blocking mode. Non-blocking mode is only intended to allow the script to perform other operations while a long running job executes, not to allow multiple jobs. To execute multiple jobs concurrently, you must create multiple Sleigh objects.
Set the verbose argument to true when constructing your Sleigh. See "Where are the debug/log files created for the Sleigh workers?" for more information.
>>> s = Sleigh(verbose=True)
If the verbose argument is set to true in the Sleigh constructor, the workers will create log files in the directory specified by the logDir argument. If logDir is not set, it defaults to a system specific temporary directory. On Posix systems, this is /tmp, but on Windows, the easiest thing to do is to look at the "worker info" variable in the sleigh workspace using the web interface, which includes among other things the full path of the log file for each worker.
Actually, the log messages for each of the workers are also put into the "logDebug" variable in the sleigh workspace, so you can view them directly from the web interface (even if the babelfish isn't running). Also, error messages are put in the "logError" variable, even if verbose is false.
>>> s = Sleigh(verbose=True, logDir="/home/joe/tmp")