Welcome to the WATGPU research cluster
Introduction
Welcome to WATGPU, a University of Waterloo School of Computer Science GPU cluster aiming to facilitate access to computing resources for research purposes. This documentation serves as a comprehensive guide to understanding and utilizing WATGPU, a cluster managed through the Slurm workload manager.
Download the PDF presentation here: 2024/07/25 version.
View the recording of the seminar held on 2024/07/25.
Getting access
- Faculty (CS or cross-appointed): Please contact watgpu-admin@lists.uwaterloo.ca
- Student¹: With the agreement of the faculty member you are working with, send an email to watgpu-admin@lists.uwaterloo.ca with them in CC.
Before making an account request, please upload your SSH public key at https://authman.uwaterloo.ca
Contact
If you require assistance while using WATGPU, you can contact the following:
- Admin support: watgpu-admin@lists.uwaterloo.ca
Shared GPU Resources
WATGPU offers shared access to GPUs owned by the university and generous researchers. When these GPUs are idle, they contribute to a shared pool, providing users with enhanced computational capabilities.
Slurm: How it works
Slurm simplifies the user experience by allowing you to submit, monitor, and manage your computational jobs seamlessly. Through straightforward command-line interfaces, you can submit batch jobs, specify resource requirements, and monitor job progress. Slurm ensures fair resource allocation, allowing you to focus on your research without the complexity of manual resource management.
If you're new to shared GPU computing and Slurm, getting started with WATGPU is a breeze! Think of WATGPU as a computing powerhouse whose GPUs are made available to other users whenever their owners leave them idle. To begin, follow these simple steps:
- Login: Access watgpu.cs using your credentials.
- Submit a Job: Utilize the `sbatch` command to submit the script that you wish to run. Think of it like asking the server to perform specific computations for you using specific resources (like how many GPUs, how much memory, ...).
- Monitor Progress: Check the status of your job with the `squeue` command and see how it's progressing. You can use `squeue` to view the job queue and monitor job details.
- Enjoy: Your job will be run by the server as soon as the requested resources are available.
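The submit and monitor steps above can be sketched as follows. This is a minimal illustration, not the cluster's official template: the file name `example_job.sh`, the job name, and every resource value (one GPU, two CPU cores, 8 GB of memory, 30 minutes) are assumptions chosen for the example, and WATGPU may require additional options such as a partition. The `#SBATCH` options shown are standard Slurm flags.

```shell
# Write a minimal batch script. All names and resource values below are
# illustrative assumptions, not WATGPU defaults.
cat > example_job.sh <<'EOF'
#!/bin/bash
# Hypothetical job name shown in the queue
#SBATCH --job-name=example
# Request one GPU
#SBATCH --gres=gpu:1
# Request two CPU cores
#SBATCH --cpus-per-task=2
# Request 8 GB of memory
#SBATCH --mem=8G
# Wall-clock limit of 30 minutes
#SBATCH --time=00:30:00
# Write output to a file named after the job ID (%j)
#SBATCH --output=slurm-%j.out

# Placeholder workload: print the GPU(s) allocated to this job
nvidia-smi
EOF

# Submit and monitor only where Slurm is installed (i.e. on the cluster).
if command -v sbatch >/dev/null 2>&1; then
    # Submit the script; Slurm prints the ID of the new job
    sbatch example_job.sh
    # List your own pending and running jobs
    squeue -u "$USER"
else
    echo "sbatch not found: run this on the cluster after logging in"
fi
```

Replace the `nvidia-smi` line with the commands your research code needs. Requesting only the resources the job actually uses generally helps it start sooner, since Slurm runs a job as soon as the requested resources are free.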
For more in-depth information, visit this page.
Thank you for choosing WATGPU. We're here to simplify your computational tasks and enhance your research.
Happy Computing!
¹ The WATGPU cluster is a research resource. Access will only be granted to students actively involved in a School of Computer Science or cross-appointed research group.