statcomp2.emp.vumc.io Information
Description
Statcomp2 is a 64-bit Linux server equipped with 2 Intel Xeon X5647 processors running at 2.93GHz, 96GB of memory, and approximately 2TB of hard drive space. Its purpose is to run statistical applications that work on large datasets and/or are compute-intensive. It has a non-uniform memory architecture (NUMA).
statcomp2 ...
- runs Ubuntu Linux as its operating system
- is intended for jobs that are unsuitable for running on a local workstation due to large memory requirements or long run times
- has R installed
- has no windowing software installed; all work is done via the command line
Connecting to the server; transferring files
Note: The programs mentioned here (scp, WinSCP, ssh, putty) are just suggestions. There are many other tools available for transferring files and logging on to remote computers.
- Develop and test your program on your local workstation. If your program processes a large data set, do your testing with a small subset of the data.
- When your program is ready, use something like scp (on Linux or Mac) or WinSCP (Windows) to copy your program and data to statcomp2.
- Use ssh (Linux and Mac) or PuTTY (Windows) to log on to statcomp2. The hostname of statcomp2 is "statcomp2.emp.vumc.io".
- Run your program on statcomp2. If you expect the program to run a long time, run your job as a batch job. See the Frequently Asked Questions (FAQ) topic (in the "How do I run my R program as a batch job?" section) for details.
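On Linux or Mac, the steps above might look like the following. This is a sketch, not a site-specific recipe: the username `jdoe` and the file names `analysis.R` and `data_subset.csv` are hypothetical, and the `nohup R CMD BATCH` line is only one common way to run a long R job, which may differ from what the FAQ recommends. With `DRY_RUN=1` (the default here) the commands are only printed; set `DRY_RUN=0` to actually run them.

```shell
#!/bin/bash
# Hypothetical account name; replace with your own.
REMOTE_USER="jdoe"
REMOTE_HOST="statcomp2.emp.vumc.io"   # hostname given on this page
REMOTE="${REMOTE_USER}@${REMOTE_HOST}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# 1. Copy the program and a data file (hypothetical names) to your
#    home directory on statcomp2.
run scp analysis.R data_subset.csv "${REMOTE}:"

# 2. Start the job in the background so it keeps running after logout
#    (assumed approach; check the FAQ for the recommended one).
run ssh "$REMOTE" "nohup R CMD BATCH analysis.R analysis.Rout > /dev/null 2>&1 < /dev/null &"

# 3. Log on interactively to check on the job.
run ssh "$REMOTE"
```
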
If you'd rather use a graphical user interface to transfer files to and from the statcomp2 server, have a look at
Account Setup
Please email biostat-it@list.vanderbilt.edu to request an account on statcomp2.
Understanding CPU Utilization
With top, a user can determine which CPU is running a particular job by enabling the "j" column. Its heading is "P", and it is displayed right next to the "COMMAND" column. To turn this on, follow these steps:
- 1. start top
- 2. type the key "f"
- 3. type the key "j"
- 4. hit the enter key
You should now see a column with the heading "P". Each entry holds a CPU number, such as 0 or 1, identifying the CPU the job is running on.
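For a non-interactive view of similar information, `ps` can print the processor each process last executed on. A minimal sketch, assuming the procps version of `ps` found on most Linux systems, where the `psr` output field holds the processor number:

```shell
# List your own processes with the processor they last ran on (PSR),
# their CPU usage (%CPU), and the command name.
cpu_table="$(ps -u "$(id -un)" -o pid,psr,pcpu,comm)"
echo "$cpu_table"
```
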
The SUMMARY area just above the table of processes lists the CPU utilization percentages.
- type the key "1" to toggle between aggregate CPU and per-CPU utilization
Also, to save all the nifty changes you made to your top display:
- type the key "W" to save this preference to $HOME/.toprc
View Screenshot of top
R Installed Packages and Personal Packages
When R looks for packages to load via the library() or require() function, it searches directory paths in the order they are specified in the shell environment variable R_LIBS. On the server the default paths are specified in the file /etc/R/Renviron, and the paths are /usr/local/lib/R/site-library, /usr/lib/R/site-library, and /usr/lib/R/library. Only root can add packages to these directories.
If you would like to use a package that's not installed on the server, and you feel it could be useful for others, please send an email request to biostat-it@list.vanderbilt.edu and we will install it for you. Otherwise, you can install it yourself into a directory in your home area and set the R_LIBS environment variable to point to that directory. Some reasons for doing this:
- Your package contains secret research that no one else should see
- You'd like to install a different version of an already installed package
- The administrators are dragging their feet on your package install request (sorry, we can get busy).
Note that you don't have to add the additional paths specified in /etc/R/Renviron, as R will do that for you.
To install a package into a directory in your home area:
$ R CMD INSTALL -l ~/Rlib PackageName.tar.gz
Notice that the ~ (tilde) character expands to your home area path, which is in the environment variable HOME.
Then set the R_LIBS environment variable BEFORE running R:
$ export R_LIBS=~/Rlib
You should place the above line in your .bashrc file so that the next time you log in, the variable will be set.
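Taken together, a personal library could be set up roughly as follows. This is a sketch under assumptions: the directory name `~/Rlib` follows the example above, the `grep` guard that avoids duplicate `.bashrc` entries is an addition not described on this page, and the final check only runs if `Rscript` is found on the PATH.

```shell
# Create the personal library directory if it does not exist yet.
mkdir -p "$HOME/Rlib"

# Set R_LIBS for the current session.
export R_LIBS="$HOME/Rlib"

# Persist the setting for future logins, adding the line only once.
# Single quotes keep $HOME unexpanded so it is resolved at login time.
LINE='export R_LIBS="$HOME/Rlib"'
touch "$HOME/.bashrc"
grep -qxF "$LINE" "$HOME/.bashrc" || echo "$LINE" >> "$HOME/.bashrc"

# Optional check: print the library search path R will use.
# The personal library should appear in the list.
if command -v Rscript > /dev/null 2>&1; then
    Rscript -e '.libPaths()'
fi
```

Packages can then be installed into that directory with the `R CMD INSTALL -l ~/Rlib` command shown earlier.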