Cluster Install Policy

This page is an effort to demystify and standardize how software is installed on the cluster, for current and future lab members, so that it is (hopefully) clear where packages should go and we stop duplicating effort. Note that some software currently lives in non-standard locations (for instance, Sun Java in /nfs/src); this should be corrected eventually.

This is very much a work in progress; please feel free to modify or update the page, or to add comments and questions.

Install Locations

  • Bioinformatics packages: In /nfs/src. This directory should be largely reserved for applications that are either under active development or may need to be updated frequently. Note that the contents of this directory are backed up to tape. (TODO: write up instructions for excluding NFS directories from backup? -LEB)
  • Generic software/utilities: On individual cluster nodes. Things like GraphViz, Java, LaTeX: basically anything that is available through yum/rpm, or that won't need to be updated frequently.
  • Perl modules: In /nfs/lib/perl5.

Compiling on the Cluster

All nodes have gcc 4.1.1 installed as /usr/bin/gcc4. When compiling in /nfs/src, try to compile on a node other than sheridan so as not to bog down our submit/work node. Update the Cluster Software page when you're done!
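One way to follow the "not on sheridan" advice is to pick the build node from a host list. This is only a sketch: it assumes a file with one short hostname per line, and the function name pick_build_node and the package directory mypackage are made up for illustration. It also assumes the package uses a configure script that honors CC.

```shell
# Print the first host in a host list that is not the submit node.
# Usage: pick_build_node HOSTFILE [SUBMIT_NODE]
pick_build_node() {
    local hostfile=$1
    local submit=${2:-sheridan}
    # -v: drop matching lines, -x: match the whole line, -F: literal string
    grep -vxF -- "$submit" "$hostfile" | head -n 1
}

# Example (not run here): build a hypothetical package with gcc4 on that node.
#   node=$(pick_build_node ~avu/hostnames)
#   ssh "$node" 'cd /nfs/src/mypackage && CC=/usr/bin/gcc4 ./configure && make'
```

Because /nfs/src is shared, the build output is visible from every node no matter which one did the compiling.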

When installing on nodes individually, try to use yum/rpm wherever possible. This makes it easier to update packages and to figure out what is installed.
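Before building something by hand, it may be worth checking whether the package is already installed or can be pulled in through yum. A minimal sketch follows; the function name package_status is hypothetical, and it relies on rpm -q and yum list available returning a non-zero exit status when nothing matches, which is the behavior I'd expect but which is worth confirming on our yum version.

```shell
# Report whether a package is installed, installable through yum, or neither.
# Usage: package_status NAME
package_status() {
    local pkg=$1
    if rpm -q "$pkg" >/dev/null 2>&1; then
        echo "installed"
    elif yum list available "$pkg" >/dev/null 2>&1; then
        echo "available"
    else
        echo "missing"
    fi
}

# Example (not run here):
#   package_status graphviz
```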

(sort of) Automating Installation

An example installing GraphViz from Mitch Skinner:

Usually I'll try it out on sheridan first, then do the for loop:


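eval "$(ssh-agent)"
ssh-add    # when prompted, enter the password from the board
# Try the install locally first (e.g. on sheridan)
rpm -Uvh /nfs/tmp/graphviz-*
# Then repeat it on every node listed in ~avu/hostnames
for x in $(cat ~avu/hostnames); do ssh "$x" 'rpm -Uvh /nfs/tmp/graphviz-*'; done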

It's easier if the package is available through yum (I think I needed a more recent graphviz version for the localization stuff).

The main caveat is that if some machines are down then they miss out. In the past, I've written scripts that query RPM on each cluster node for the list of installed packages; then I did some diffing and fixed things up. Point being, there's some danger of it being fiddly, but usually it's fairly automatable once you're using ssh-agent.
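The "query RPM on each node, then diff" approach mentioned above could look roughly like this. It is not Mitch's actual script, just a sketch of the idea: the function name missing_packages and the /tmp/rpms.* file names are made up, and it assumes each node's rpm -qa output has already been saved to a file.

```shell
# List packages present in a reference node's list but absent from another's.
# Both arguments are files holding `rpm -qa` output, one package per line.
# Usage: missing_packages REFERENCE_LIST NODE_LIST
missing_packages() {
    local ref_sorted node_sorted
    ref_sorted=$(mktemp) && node_sorted=$(mktemp) || return 1
    sort -u "$1" > "$ref_sorted"
    sort -u "$2" > "$node_sorted"
    # comm -23 keeps only the lines unique to the first (reference) file
    comm -23 "$ref_sorted" "$node_sorted"
    rm -f "$ref_sorted" "$node_sorted"
}

# Example (not run here): collect lists over ssh, then compare to sheridan.
#   for x in $(cat ~avu/hostnames); do ssh "$x" 'rpm -qa' > "/tmp/rpms.$x"; done
#   missing_packages /tmp/rpms.sheridan /tmp/rpms.somenode
```

A node that was down during the collection loop ends up with an empty list, so every reference package shows up as missing for it, which is a handy reminder to revisit that machine.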

For those unfamiliar, here are some excellent tutorials on SSH public-key authentication and ssh-agent.

CPAN

Our CPAN modules are installed on the NFS in the directories /nfs/lib and /nfs/lib64. These should now be in the default PERL5LIB path for all cluster users.

Both sheridan and lorien are set up so that the default cpan configuration installs to the NFS. Note that installing as root should be done on lorien to avoid squashed permissions.
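If a module installed through cpan isn't being found, a quick check is whether the NFS directories are really in your PERL5LIB. A small sketch, with a hypothetical helper name in_perl5lib; it assumes the entries are exactly /nfs/lib and /nfs/lib64, whereas the real default may list subdirectories instead (running perl -e 'print "$_\n" for @INC' shows what Perl actually searches).

```shell
# Succeed if DIR is one of the colon-separated entries in PERL5LIB.
# Usage: in_perl5lib DIR
in_perl5lib() {
    case ":${PERL5LIB:-}:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example (not run here): prepend the NFS directories if they are absent.
#   in_perl5lib /nfs/lib || export PERL5LIB="/nfs/lib:/nfs/lib64${PERL5LIB:+:$PERL5LIB}"
```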

Using CentOS Alternatives

See an example of how Java is installed on the cluster with alternatives here.
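For a rough idea of what registering a JDK with alternatives involves, here is a sketch. It is not a record of how Java was actually set up on the cluster: the helper names run and register_java, the default priority of 100, and the JDK path in the example are all assumptions. Setting DRY_RUN prints the commands instead of running them, which is useful for reviewing before doing anything as root.

```shell
# Run a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Register a JDK's java binary with alternatives and make it the default.
# Usage: register_java JAVA_BINARY [PRIORITY]   (run as root)
register_java() {
    local java_bin=$1 priority=${2:-100}
    # alternatives --install <link> <name> <path> <priority>
    run alternatives --install /usr/bin/java java "$java_bin" "$priority" &&
    run alternatives --set java "$java_bin"
}

# Example (not run here; the JDK path is hypothetical):
#   DRY_RUN=1 register_java /usr/java/jdk/bin/java
#   alternatives --display java
```

Afterwards, alternatives --display java shows which binary /usr/bin/java currently points to and the priorities of the registered candidates.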

-- Lars Barquist - 18 Jun 2008