Python is an interpreted high-level programming language for general-purpose programming. It’s design philosophy emphasizes code readability using significant whitespace. Python features a dynamic type system and automatic memory management. It supports multiple programming paradigms, including object-oriented, imperative, functional and procedural, and has a large and comprehensive standard library. Documentation for Python can be found on its official website. Currently, versions 2.7 and 3.0 of Python are available on the cluster.

Basics

After loading the module, Python can be launched through the command line by simply typing python. Python files end in *.py, and can be run from the command line using python filename.py where filename is the name of the Python file.

Python Virtual Environment

To use Python through the command line, you must first initialize a Python virtual environment. Virtual environments are isolated environments for projects, so that each project can have its own dependencies and packages installed, regardless of what dependencies every other project has. To create the virtual environment, run the following commands:

        module load python3/anaconda/5.2.0
        conda create -n python-environment python-essentials python-base
        source activate python-environment
    
Once the virtual environment is created, it can be launched at any time by ensuring that the python3 module is loaded, using the command

module load python3/anaconda/5.2.0

and then launching the environment by using

source activate python-environment

Next, you can install any packages you need inside this environment. These packages will only be available within this environment. Python packages and dependencies can be installed in your virtual environment by either using pip or the conda package manager.

Pip

Pip is a program that installs Python packages. It can be used to install packages and any other Python packages that are dependencies with the command:

pip install package-name

where package-name is the name of the package you wish to install. For example, to install SQLAlchemy, a Python SQL database library, you can use the command

pip install SQLAlchemy

Once downloaded, you will be able to use the SQLAlchemy library in your Python programs within the created virtual environment.

Conda

The conda package manager is similar to pip, but also installs non-Python packages and dependencies. Packages can be installed using the command

conda install package-name

where package-name is the name of the package you wish to install. Many conda packages are used in scientific computing and data analysis. For example, NumPy, a useful scientific computing package for Python that contains an N-dimensional array object, tools for integrating C++ and Fortran code, and useful linear algebra and random number capabilities, can be installed through Conda using the command

conda install NumPy

To exit the Python virtual environment, use the command

source deactivate

Running Python through a job script

1. Ensure that you have a virtual environment created, following the steps described above. 2. Create a Python script. The linked repository provides a simple script, test.py, which demonstrates some of Python’s basic features.

test.py


import random

print("Hello World!")

a = 10
b = 3
print("a is %d" % a)
print("b is %d" % b)
print("a*b is %d" % (a*b))

print("Random list: ")
rand_list = []
for x in range(10):
       rand_list.append(random.randint(1,100))
print(rand_list)
print("List sorted: ")
print(sorted(rand_list))
        
3. Prepare the submission script, which is the script that is submitted to the Slurm scheduler as a job in order to run the Python script. The linked repository provides the script job.sh as an example.

job.sh


#!/bin/sh

#SBATCH --job-name=py_test
#SBATCH -o py_out%j.out
#SBATCH -e py_err%j.err
#SBATCH -N 1
#SBATCH --ntasks-per-node=1

echo -e '\nsubmitted Python job'
echo 'hostname'
hostname

# creates a python virtual environment
module load python3/anaconda/5.2.0
source activate python-environment

# run python script
python3 test.py

# exit the virtual environment
source deactivate
        

4. Submit the job using: sbatch job.sh

5. Examine the results.