Both Ace and Turing have three (or more) Python distributions installed. The main system Python (located in /usr/bin) is used for system utilities, administration, or basic Python scripts. Production installations of Python 2.7 and 3.6 are available as modules, and can be loaded the same as all other standard software on the cluster:
srpruitt@turing:~$ module load python/gcc-4.8.5/3.6.3
Currently available distributions can be viewed using the module avail command on the cluster.
In general, on a system where you do not have root (sudo) access, you can still install software in your home directory. This can be cumbersome if you want to install an entire Python distribution or a full MPI library.
Python handles packages like numpy and Tensorflow based on name, not version number, so only one version of a given package can live in a given installation location. This complicates supporting multiple versions of packages like Tensorflow that are in active development (and tend to break backward compatibility). Supporting one user with Tensorflow r1.0 code and another user with r1.5 code from a single shared installation is difficult at best.
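Because an import name resolves to a single installed version, a script can check which version it actually picked up before relying on version-specific behavior. The following is a minimal sketch, not part of the cluster tooling: the helper name `installed_version` is hypothetical, and it assumes the package exposes a `__version__` attribute, which numpy and Tensorflow do but which is a convention rather than a guarantee.

```python
import importlib


def installed_version(name):
    """Return the __version__ string of an importable package, or None.

    Hypothetical helper for illustration. Note that this imports the
    package, which can be slow for large packages such as tensorflow.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        # The package is not installed for this interpreter.
        return None
    # Not every package defines __version__, so fall back to None.
    return getattr(module, "__version__", None)


# Report what this interpreter would use for a few package names.
for package in ("numpy", "tensorflow"):
    print(package, installed_version(package))
```

A package that is not installed, or that does not define `__version__`, is reported as `None`.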
To prevent each user on the cluster from having to install their own distribution of Python, while maintaining flexibility to allow each user to install their own Python packages and specific versions, the following rules are followed on both Ace and Turing:
- The system administrators have installed and will maintain "pure" Python distributions that can be loaded by all users as modules.
- The system administrators have installed and will maintain all essential libraries (e.g. CUDA, cuDNN, math libraries, MPI, etc.) that can be loaded by all users as modules.
- Specific Python packages (e.g. numpy, pandas, tensorflow-gpu) can be installed by any user, after loading the production Python distribution of choice, using pip/pip3 with the --user flag.
A Python3 example installing the latest version of Tensorflow:
srpruitt@turing:~$ module load cuda80/toolkit/8.0.61
srpruitt@turing:~$ module load cuda80/blas/8.0.61
srpruitt@turing:~$ module load cudnn/6.0
srpruitt@turing:~$ which python3
/cm/shared/spack/opt/spack/linux-rhel7-x86_64/gcc-4.8.5/python-3.6.3-jaapiswyomewtlo5ngyxtdaorzddg7pf/bin/python3
srpruitt@turing:~$ which pip3
/cm/shared/spack/opt/spack/linux-rhel7-x86_64/gcc-4.8.5/python-3.6.3-jaapiswyomewtlo5ngyxtdaorzddg7pf/bin/pip3
srpruitt@turing:~$ pip3 install tensorflow-gpu --user
srpruitt@turing:~$ python3
Python 3.6.3 (default, Jan 16 2018, 09:06:52)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-11)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
>>>
Other Python packages can be installed in the same way (using the flag --user).
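After a --user install it can be useful to confirm that the package Python finds really is the copy in your home directory. The sketch below is one way to check this using only the standard library. The helper names `package_location` and `is_in_user_site` are hypothetical, and the check assumes the default per-user site-packages directory reported by `site.getusersitepackages()`.

```python
import importlib.util
import os
import site


def package_location(name):
    """Return the file a top-level package would be loaded from, or None.

    Uses importlib.util.find_spec, which locates a top-level package
    without importing it.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


def is_in_user_site(name):
    """Return True if the package resolves to the per-user site-packages."""
    origin = package_location(name)
    if not origin:
        # Not installed, or a namespace package with no single origin.
        return False
    user_site = os.path.realpath(site.getusersitepackages())
    return os.path.realpath(origin).startswith(user_site + os.sep)


# Example: check where tensorflow would be loaded from, if it is installed.
print("location:", package_location("tensorflow"))
print("from --user install:", is_in_user_site("tensorflow"))
```

If the package is not installed for the loaded Python distribution, the location is reported as `None`.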
There is an order of precedence for what Python will load if there are multiple versions of a package installed. From lowest to highest priority:
- Direct distribution installations
- Packages loaded as modules from the system
- Packages installed in the user $HOME directory
Anything installed using the --user flag in your home directory takes top priority and will supersede any other duplicates installed elsewhere.
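The search order can be inspected directly from the interpreter, since Python looks through the entries of `sys.path` from first to last. The following is a rough sketch for checking where the per-user directory sits relative to the distribution's own site-packages. The function name `describe_search_order` is hypothetical, and the result depends on the environment: the per-user directory only appears in `sys.path` once it exists, and it is disabled inside virtual environments.

```python
import site
import sys
import sysconfig


def describe_search_order():
    """Return (user_site, dist_site, user_index, dist_index).

    Each index is the position of that directory in sys.path, or None
    if the directory is not on the search path. A lower index means
    the directory is searched earlier.
    """
    user_site = site.getusersitepackages()
    dist_site = sysconfig.get_paths()["purelib"]

    def index_of(path):
        return sys.path.index(path) if path in sys.path else None

    return user_site, dist_site, index_of(user_site), index_of(dist_site)


user_site, dist_site, user_index, dist_index = describe_search_order()
print("user site:        ", user_site, "-> sys.path index", user_index)
print("distribution site:", dist_site, "-> sys.path index", dist_index)
```

When both directories are on the path, the one with the lower index wins for any package installed in both places.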
The process above will be the same if you are using Python 2.7, except you will load the Python 2.7 distribution:
srpruitt@turing:~$ module load python/gcc-4.8.5/2.7.14
and then use pip instead of pip3 when you perform the --user based install.