Managing virtual environments on the cluster
Virtual environments are self-contained directories that hold the libraries and dependencies for a specific project, so you can work on multiple projects without conflicts between their requirements. They isolate dependencies, pin versions, and keep different software stacks from interfering with each other. This is especially useful on WATGPU, where individual research projects often require specific library versions.
The default environment manager for all users on the WATGPU cluster is conda, though pip virtual environments are also supported.
Creating and activating conda environments is simple on WATGPU:
(base) $ conda create -n <conda-venv-name>
(base) $ conda activate <conda-venv-name>
(<conda-venv-name>) $
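An environment created with only a name starts out empty. The sketch below shows one way to create an environment with a pinned Python version and check it without activating it. The name `my-project` and the version `3.11` are placeholders chosen for illustration, not WATGPU defaults.

```shell
# Placeholder values: choose a name and Python version that suit your project.
ENV_NAME="my-project"
PY_VERSION="3.11"

if command -v conda >/dev/null 2>&1; then
    # -y skips the confirmation prompt; "python=3.11" pins the interpreter version.
    conda create -y -n "$ENV_NAME" "python=$PY_VERSION"
    # conda run executes a command inside the environment without activating it.
    conda run -n "$ENV_NAME" python --version
else
    echo "conda was not found on PATH" >&2
fi
```

Using `conda run` avoids `conda activate`, which usually needs conda's shell initialization and so tends to fail inside non-interactive scripts.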
You can create a pip virtual environment while a conda environment is active. Once it has been created, deactivate the conda environment and activate the pip virtual environment as usual:
(base) $ python -m venv <venv-name>
(base) $ conda deactivate
$ source <venv-name>/bin/activate
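To confirm that a pip virtual environment is really the one in use, you can check which interpreter `python` resolves to after activation. The following is a minimal sketch that assumes `python3` is on your PATH; the `demo-venv` directory is a throwaway placeholder created under a temporary directory.

```shell
# Placeholder location: a throwaway directory so the sketch leaves your projects untouched.
VENV_DIR="$(mktemp -d)/demo-venv"

python3 -m venv "$VENV_DIR"     # create the virtual environment
. "$VENV_DIR/bin/activate"      # equivalent to: source "$VENV_DIR/bin/activate"

# While the environment is active, `python` resolves to the copy inside it.
VENV_PYTHON="$(command -v python)"
echo "python resolves to: $VENV_PYTHON"

deactivate                      # return to the previous shell environment
```

If the printed path does not end in `demo-venv/bin/python`, the environment was not activated in the current shell.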
Installing useful tools through conda
Tools like nvcc and nvtop can be installed using conda since conda is also a package manager.
To perform a basic install of all CUDA Toolkit components using conda, run:
conda install cuda -c nvidia
You can install previous CUDA releases by following the instructions detailed in the Conda Installation section of NVIDIA's online documentation.
To install nvtop, which can be used to better monitor GPU utilization and GPU memory usage, run:
conda install --channel conda-forge nvtop
Further information about nvtop can be found in the nvtop project documentation.
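After installing tools this way, it can help to check that they resolve from the environment you expect. The helper below, `check_tool`, is a hypothetical name introduced for this sketch; it only reports where a command is found on your PATH.

```shell
# check_tool is a hypothetical helper: it prints where a command resolves from,
# or a short note if the command is not on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: $(command -v "$1")"
    else
        echo "$1: not found on PATH"
    fi
}

check_tool nvcc
check_tool nvtop
```

A "not found" result often just means the conda environment that holds the tool is not currently active.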
If a tool's instructions suggest installing it with apt or apt-get, please check whether the same tool is available through conda first. If it isn't, send an email to watgpu-admin@lists.uwaterloo.ca and we can look into installing the tool for you.
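One way to check whether a tool is available through conda is to search a channel for it. The sketch below uses `htop` purely as an example name and assumes the conda-forge channel; note that the conda package name may differ from the apt package name.

```shell
# Placeholder: the tool you were told to install with apt or apt-get.
TOOL="htop"

if command -v conda >/dev/null 2>&1; then
    # conda search normally exits non-zero when no matching package is found.
    if conda search --channel conda-forge "$TOOL" >/dev/null 2>&1; then
        echo "$TOOL is available from conda-forge"
    else
        echo "$TOOL was not found on conda-forge"
    fi
else
    echo "conda was not found on PATH" >&2
fi
```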