Getting Started

This chapter provides step-by-step procedures for setting up NetWorkSpaces and Sleigh for R.

2.1  Prerequisites

NetWorkSpaces and Sleigh run on all Linux and Windows platforms. To use them, you must install the following software:
  1. R 2.1.1 or above
  2. NetWorkSpaces server http://nws-r.sourceforge.net
    The NetWorkSpaces server also requires the following:
    1. Python 2.4 http://www.python.org
      (We recommend ActiveState Python http://www.activestate.com for Windows platforms.)
    2. Twisted Framework http://twistedmatrix.com

2.2  NetWorkSpaces Server

2.2.1  Starting the Server

There are three ways to start a NetWorkSpaces server. Once started, the server listens on localhost at port 8765.
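
Whichever way the server is started, it can be useful to confirm that something is actually listening on port 8765 before starting a client. The following is a minimal sketch in Python, not part of NetWorkSpaces; the function name port_is_open is hypothetical, and a successful connection only shows that some process accepts TCP connections on that port, not that it is a NetWorkSpaces server.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 8765 is the server port named in this section.
    print("Server port reachable:", port_is_open("localhost", 8765))
```

The same check can be pointed at another host name when the server runs on a different machine.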

2.2.2  Stopping the Server

There are also three ways to stop a NetWorkSpaces server:

2.3  NetWorkSpaces Client

2.3.1  Installing the Client

To install the NetWorkSpaces source distribution on UNIX:
  1. R CMD INSTALL nws_version.tar.gz
To install NetWorkSpaces on Windows XP:

2.3.2  Start NetWorkSpaces Client

Once you've got a NetWorkSpaces server up and running, you're ready to use NetWorkSpaces.
  1. Start up an R session.
  2. Type the following:
     > library(nws)
     > ws = netWorkSpace('R space')
     > nwsStore(ws, 'x', 1)
     
This step loads the nws package, creates a workspace named 'R space', and stores a variable x with value 1 in the workspace.

You can also view what's in the workspace using a web interface. To do this, point your browser to http://server_host_name:8766, where server_host_name is the machine on which the NetWorkSpaces server runs.

To examine values that you've created in a workspace using the server's web interface, you also need a babelfish. The babelfish translates values into a human-readable format so they can be displayed in a web browser. If a value is a string, the web interface simply displays the contents of the string, without any help from the babelfish. But if the value is any other type of R object, it needs help from the R babelfish. To start up the babelfish, execute the following command in another terminal:
 % R CMD BATCH babelfish.R
 
Note: this command will not return until you exit R.
On Windows, you can also start babelfish as a Windows service by following these steps:
  1. Open a command prompt.
  2. cd to the directory where the NWS client package is installed.
  3. cd to the bin directory.
  4. Install the R Babelfish Service:
    python RBabelfishService.py install
    
  5. Start the R Babelfish Service:
    python RBabelfishService.py start
    
For more examples on using NetWorkSpaces, see the Tutorials chapter.

2.4  Starting Sleigh

Sleigh is an R class, built on top of NetWorkSpaces, that makes it very easy to write simple parallel programs. Sleigh has the concept of one master and multiple workers. The master sends jobs to workers, which may or may not be on the same machine as the master. To enable the master to communicate with workers, Sleigh supports several mechanisms for launching workers to run jobs. For the remote launch mechanisms, make sure R is in the PATH on the worker machines.

2.4.1  Local Launch Mechanism

The local launch mechanism is the default option for starting workers. By default, it starts three workers on the local machine. You can change the number of workers by setting the workerCount argument in the sleigh constructor. Local launch is useful on SMP machines, where multiple cores or processors are available. It is also useful for debugging parallel programs locally before running them on a large cluster.

To create a sleigh object, simply load the nws package and type the following:
> s = sleigh()
This is equivalent to:
> s = sleigh(launch='local')

2.4.2  SSH Launch Mechanism

To start up workers on different machines, a remote login mechanism, such as an SSH client, is needed. The remote worker machines also need to run an SSH server in order to accept requests.

For Windows users, we recommend Cygwin's ssh server or copSSH from ITef!x, http://www.itefix.no/phpws/. For setting up the Cygwin ssh server, http://ncyoung.com/entry/389 provides a nice tutorial. copSSH installs straight out of the box. After installation, remember to activate users.

Next, we need to set up password-less ssh login.

Setting Up a Password-less SSH Login
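
Once password-less login has been configured, it is worth confirming that each worker machine really accepts a login without prompting, since a password prompt would stall the worker launch. The sketch below is one way to check this from Python and is not part of the nws package. It assumes an OpenSSH-compatible ssh client on the PATH and a remote shell that provides the true command; the function names are hypothetical.

```python
import subprocess


def ssh_check_command(host, timeout_s=5):
    """Build an ssh command that fails instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                  # never prompt; fail if a password is needed
        "-o", f"ConnectTimeout={timeout_s}",    # give up quickly on unreachable hosts
        host,
        "true",                                 # assumption: the remote shell has 'true'
    ]


def passwordless_login_works(host):
    """Return True if ssh to host succeeds without any interactive prompt."""
    try:
        result = subprocess.run(
            ssh_check_command(host),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ssh is not installed, or the connection hung.
        return False
    return result.returncode == 0
```

For example, calling passwordless_login_works('node1') for every name you intend to pass in nodeList shows which machines still need their keys set up.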

To start up sleigh workers using SSH, simply load the nws package and type the following:
> s = sleigh(launch=sshcmd)
This creates three workers on the local machine.

To start up sleigh workers on multiple machines, type:
> s = sleigh(nodeList=c('node1', 'node2'), launch=sshcmd)

2.4.3  RSH Launch Mechanism on Windows

On Windows machines where SSH is not available, users can use RSH instead. Windows 2000/XP comes with an RSH client, but in order to communicate with other Windows machines, users must have an RSH server running on the machine. To do so, first download and install a copy of Windows Services for UNIX (SFU), which is available for free from the Microsoft website. SFU allows users to start an RSH server as a Windows service or as a UNIX daemon. This section focuses on how to start the RSH server as a Windows service.
  1. Install SFU on the local machine.
  2. Log in to the machine as an Administrator.
  3. Open up a command prompt.
  4. Go to SFU's common directory.
  5. Install Windows Remote Shell Service.
    rshsvc.exe -install
    
  6. Start Windows Remote Shell Service.
    rshsvc.exe -start
    
    Users can also enable automatic or manual start of the Windows Remote Shell Service through the Services console in Administrative Tools.
  7. Add a .rhosts file.
    The .rhosts file resides in the location specified by the registry entry:
    HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\RshSvc\RhostsPath
    This value is usually C:\Windows\System32\Drivers\etc
    The format of the .rhosts file is:
    machine1 user1         # user1 can log in from machine1
    + user2                # user2 can log in from any machine
    machine2 user1 user2   # user1 and user2 can log in from machine2
    machine3 +             # all users on machine3 can log in
    + +                    # any user from any machine can log in
    
  8. Test RSH from a command prompt:
    rsh localhost set
    
    If everything is set up correctly, this command should return a list of environment variables set locally.
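
The .rhosts format shown in step 7 can be illustrated with a small Python sketch that reads entries and decides whether a given user on a given machine would be allowed to log in. This is only a model of the rules as described above, not how the Windows Remote Shell Service itself evaluates the file. It assumes that '#' starts a comment (as in the annotated example) and that names match exactly; the function names are hypothetical.

```python
def parse_rhosts(text):
    """Parse .rhosts text into a list of (host, users) entries."""
    entries = []
    for line in text.splitlines():
        # Assumption: everything after '#' is a comment.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue  # skip blank, comment-only, or incomplete lines
        entries.append((fields[0], fields[1:]))
    return entries


def is_allowed(entries, host, user):
    """Return True if any entry permits `user` logging in from `host`."""
    for entry_host, users in entries:
        host_ok = entry_host == "+" or entry_host == host   # '+' means any machine
        user_ok = "+" in users or user in users             # '+' means any user
        if host_ok and user_ok:
            return True
    return False


EXAMPLE = """\
machine1 user1         # user1 can log in from machine1
machine2 user1 user2   # user1 and user2 can log in from machine2
machine3 +             # all users on machine3 can log in
"""

if __name__ == "__main__":
    entries = parse_rhosts(EXAMPLE)
    print(is_allowed(entries, "machine1", "user1"))
    print(is_allowed(entries, "machine1", "user2"))
```

Note that a single "+ +" line makes every check succeed, which is why that entry is best reserved for testing on a trusted network.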
To start up sleigh workers using RSH, simply load the nws package and type the following:
> s = sleigh(launch=rshcmd)

2.4.4  Web Launch Mechanism

The web launch mechanism allows users to start up workers on different machines in an ad hoc way, without setting up a remote login mechanism such as SSH or RSH.

To use the web launch option, follow the steps below.
  1. Create an instance of Sleigh:
      > s = sleigh(launch='web')
      
    The Sleigh constructor does not return until it gets a signal that all workers have started and are ready to accept jobs.

  2. Log in to a remote machine.
  3. Start an R session.
  4. Open a web browser and point it to http://server_host_name:8766
  5. Click on the newly created Sleigh workspace, and read the value of the variable 'runMe'. It usually has a value similar to:
    webLaunch('sleigh_ride_0000000004_tmp1a6c0h', 'mercury', 8765);
  6. Copy the 'runMe' value into the R session.
  7. Repeat steps 2-6 for each worker that needs to be started.
  8. Once all workers have started, delete the 'DeleteMeWhenAllWorkersStarted' variable from the Sleigh workspace. This signals the Sleigh master that the workers have started and are ready to accept work.
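
Before pasting a 'runMe' value into a worker's R session, it can help to see which server the worker is being pointed at. The Python sketch below splits a value of the form shown in step 5 into its three arguments. Reading the first argument as the Sleigh workspace name and the other two as the server host and port is an interpretation of the example, not something this chapter states; the function name is hypothetical.

```python
import re

# Matches: webLaunch('<first>', '<host>', <port>)
RUN_ME_PATTERN = re.compile(
    r"webLaunch\(\s*'([^']+)'\s*,\s*'([^']+)'\s*,\s*(\d+)\s*\)"
)


def parse_run_me(value):
    """Return the three webLaunch arguments as a dict, or None if no match."""
    match = RUN_ME_PATTERN.search(value)
    if match is None:
        return None
    workspace, host, port = match.groups()
    # Assumption: the arguments are workspace name, server host, server port.
    return {"workspace": workspace, "host": host, "port": int(port)}


if __name__ == "__main__":
    example = "webLaunch('sleigh_ride_0000000004_tmp1a6c0h', 'mercury', 8765);"
    print(parse_run_me(example))
```

If the host name reported here cannot be resolved from the worker machine, the worker is unlikely to reach the server.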
Now you're ready to send jobs to remote workers. See the Sleigh for R Tutorial section in the Tutorials chapter for more information.

