Galaxy Admin Training
Overview
Questions:
How do I organise a Galaxy Admin Training (GAT)?
What do I need to set up?
What should I know during the training?
Objectives:
Interact with the UseGalaxy.eu admins to arrange for infrastructure
Run a great training!
Requirements:
- Galaxy Server administration
- Ansible: slides, hands-on tutorial
- Deploying a compute cluster in OpenStack via Terraform: slides, hands-on tutorial
Time estimation: 60 minutes
Last modification: Nov 18, 2020
Introduction
Setting up and running a Galaxy Admin Training is not a very complicated process thanks to a significant amount of work that has been put into making it easy and quick.
This tutorial has multiple audiences, who are all addressed within this one tutorial. We encourage everyone involved in hosting a GAT event to read it, so that you are aware of all of the moving parts such an event requires.
Planning Your Training
First consider the requirements for your training:
- How many days will the training be?
- What topics do you want to cover?
- How many helpers will you need?
- Will the training be online or in person? Teaching online is slower, so you may not be able to cover as many topics as in person; you may also need more helpers.
- Do you have your own infrastructure, or do you need to request infrastructure?
We recommend checking out an example schedule from the GAT repository and modifying it based on your needs.
Requesting Infrastructure
If you do not have a cloud available that can, at minimum:
- set up VMs which are accessible from where your students are
- expose ports 22 (SSH), 80, 443 (optional), 8080 (optional but nice)
Then consider contacting UseGalaxy.eu to let us know you’d like to host a GAT event. Tell us the number of students and the dates of the event.
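If you are unsure whether your own cloud meets the requirements above, a quick reachability test from a student-like network can help. The following is a minimal Python sketch; `check_ports` is a hypothetical helper written for this illustration and is not part of any GAT tooling. It only tests whether a TCP connection can be opened, not whether the right service is listening.

```python
import socket


def check_ports(host, ports, timeout=3.0):
    """Return {port: True/False} depending on whether a TCP connection succeeds."""
    results = {}
    for port in ports:
        try:
            # create_connection resolves the host and performs the TCP handshake.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts, and DNS failures.
            results[port] = False
    return results
```

For example, `check_ports("192.0.2.1", [80, 443, 8080])` returns a dictionary mapping each port to `True` or `False`; pass whichever ports from the list above you plan to use.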
Provisioning the Infrastructure
Skip to the setup section below that applies to you.
[EU admins] Setting up VMs
Hands-on: EU: Setting up VMs
Open a PR editing the count variable and increasing it to the desired number of VMs.
Merge
Once deployed, clone/pull the repository locally and run ./bin/process-training-output.sh
$ ./bin/process-training-output.sh
ubuntu gat-0.training.galaxyproject.eu ...
ubuntu gat-1.training.galaxyproject.eu ...
Place this in the GAT Machines spreadsheet in the correct tab.
[Non-EU admins] Setting up the VMs
Hands-on: Non-EU: Setting up VMs
If you have your own cloud, set up one VM per student. We strongly recommend:
- Ubuntu 18.04+
- 8 GB RAM
- 2 vCPUs
If you’re using Terraform, you can take inspiration from UseGalaxy.eu’s Terraform configuration.
Place the list of usernames, IPs, and passwords in the GAT Machines spreadsheet in the correct tab.
[Everyone] Bootstrapping the VMs
Once your VMs are running, great! Now you’ll need to bootstrap the instances and prepare them for student use.
The GAT team maintains some infrastructure to handle the bootstrapping in the GAT repository.
We use a more complicated hosts file in the project, as we have multiple pools of VMs across global regions. This is seen in the workshop_eu and workshop_oz groups, and the corresponding commands in the Makefile. However, this distinction of regions is not necessary. The simplest hosts file looks like:
# Your Machines
[workshop_instances]
192.0.2.1
192.0.2.2
192.0.2.3
...
# Some variables for those machines
[workshop_instances:vars]
ansible_host_key_checking = false
ansible_user = ubuntu
ansible_become = true
ansible_ssh_private_key_file = ~/admintraining.key
set_password = true # Generate a random password
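If your machine list lives elsewhere, for example in a spreadsheet, a file like the one above can be generated rather than typed by hand. The following is a minimal Python sketch under that assumption; the function name `render_hosts` and its default arguments are illustrative and not part of the GAT repository.

```python
def render_hosts(addresses, user="ubuntu", key_file="~/admintraining.key"):
    """Render a minimal INI inventory with a single [workshop_instances] group."""
    lines = ["# Your Machines", "[workshop_instances]"]
    # One IP address or DNS name per line.
    lines.extend(addresses)
    lines += [
        "",
        "# Some variables for those machines",
        "[workshop_instances:vars]",
        "ansible_host_key_checking = false",
        f"ansible_user = {user}",
        "ansible_become = true",
        f"ansible_ssh_private_key_file = {key_file}",
        "set_password = true",
    ]
    return "\n".join(lines) + "\n"
```

Writing the result to disk is then a one-liner, for example `open("hosts", "w").write(render_hosts(["192.0.2.1", "192.0.2.2"]))`.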
Hands-on: Everyone: Setting up VMs
git clone https://github.com/galaxyproject/admin-training/ and change into that repo
cd bootstrap-instances
In the hosts file, edit the [workshop_eu] section to list all of your IPs or DNS entries.
You can specify ansible_user=something if it uses a different username, and ansible_password=password for each machine. E.g.
A range of DNS entries (gat-0 to gat-39)
[workshop_]
gat-[0:39].training.galaxyproject.eu
Some IPs
[workshop_eu]
192.0.2.1
192.0.2.2
192.0.2.3
192.0.2.4
Some IPs with additional information
[workshop_eu]
192.0.2.1 ansible_password=2121
192.0.2.2 ansible_password=1212
192.0.2.3 ansible_password=4321
192.0.2.4 ansible_password=1235
[workshop_instances:vars]
ansible_user = myuser
Remove the workshop_oz from the top of the file under [workshop_instances:children]
Set up a Python virtualenv and activate it.
Install the requirements (listed in requirements.txt)
Run make all
The last command runs the playbook.yml, which in turn does a large number of things:
- Bootstraps Python on the machine if it isn’t available already (Ubuntu stopped including it for some reason.)
- (Generates if needed) and copies an SSH key to all of the machines to make login easier (this is stored in id_rsa and id_rsa.pub in the same directory as the Makefile.)
  - This key is set as the ubuntu user’s key, and added to their authorized_keys, to permit them to run Ansible easily on that machine (if they don’t configure the local connection correctly.)
- (When pulsar is in use) the pulsar machines are provisioned identically to the ones where Galaxy is setup, so the students can login passwordlessly to their pulsar machine.
- Updates packages (slow)
- Installs basic dependencies (emacs, vim, nano, git, etc.)
- Adds the gat-cli script to /usr/bin/gat
- Optionally sets a password on the machines
  - If you set set_password=true in the hosts file, you can set a password on machines.
  - This is useful when your machines only have an SSH key on them, and no password set for students.
- Reboots the machines
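When the hosts file uses a range entry such as gat-[0:39].training.galaxyproject.eu, it can be handy to see exactly which hostnames it covers, for example to cross-check them against the GAT Machines spreadsheet. The sketch below is a simplified Python illustration, not Ansible's own parser: `expand_range` is a hypothetical helper that handles a single numeric `[start:end]` range, treats both ends as inclusive as Ansible does, and ignores zero-padding and step values.

```python
import re

# Matches one numeric range such as [0:39].
RANGE = re.compile(r"\[(\d+):(\d+)\]")


def expand_range(pattern):
    """Expand a single inclusive [start:end] range into a list of hostnames."""
    match = RANGE.search(pattern)
    if match is None:
        # No range present: the entry is already a single host.
        return [pattern]
    start, end = int(match.group(1)), int(match.group(2))
    prefix, suffix = pattern[:match.start()], pattern[match.end():]
    # end + 1 because both ends of the range are included.
    return [f"{prefix}{n}{suffix}" for n in range(start, end + 1)]
```

Calling `len(expand_range("gat-[0:39].training.galaxyproject.eu"))` gives the number of machines the entry describes, which should match the number of VMs you requested.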
Testing
Once your VMs are ready, you should test ALL of the lessons you intend to teach. Most of the training should be fine, but changes such as a new Galaxy version or newer versions of some Ansible modules can break individual steps. Test the training as you plan to teach it, and update the training materials where relevant.
Starting Your Training
We recommend providing a website similar to our GitHub repository (or using our repository! Ask us for a branch.) with at minimum the following links:
- Q&A pointing to a Google document where students can ask questions.
  - In our experience this is an excellent format for discussion: it allows students to ask as many questions as they have, and responses can be given in real-time right below each question.
  - Additionally, images and complex formatting are easy.
  - Lastly, it’s anonymous, which many students prefer.
- Chat pointing to https://gxy.io/gatchat (or your preferred channel)
- VM List pointing to https://gxy.io/gatmachines (or your own spreadsheet)
- The slides and tutorials for your training.
During Your Training
During planning for BCC2020 we found that monitoring student progress would be extremely difficult. This was not a situation we had encountered before, as it was the first remote GAT during the pandemic. So we developed a small utility, the gat-cli, which assists in monitoring students’ progress.
$ gat
Galaxy Admin Training (gat) tool:
gat status-ansible [Admin] Check status of ansible training
gat status-galaxy [Admin] Check status of ansible-galaxy training
gat status-cvmfs [Admin] Check status of cvmfs training
gat status-pulsar [Admin] Check status of pulsar training
Each of these commands runs a few checks on the local machine. For example, the status-galaxy command checks:
- Check that the postgresql service is running
- Does a db named galaxy exist
- Does http://localhost:8080/api/version respond with some content
- Check that the galaxy service is running
- Check that the nginx service is running
- Does https://localhost:443/api/version respond with some content
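To make these checks concrete, here is a rough Python sketch of how the service and HTTP checks could be performed. It is not the gat-cli implementation: the helper names are invented for this illustration, the service check assumes `systemctl` is available, the database check is omitted, and the HTTPS check uses Python's default certificate verification, so it reports a failure for certificates the system does not trust.

```python
import http.client
import subprocess
import urllib.request

# Ignore proxy settings from the environment; the checks target localhost.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def service_is_running(name):
    """True if systemd reports the unit as active (assumes systemctl exists)."""
    try:
        result = subprocess.run(
            ["systemctl", "is-active", "--quiet", name], check=False
        )
    except FileNotFoundError:
        # systemctl is not installed on this machine.
        return False
    return result.returncode == 0


def responds_with_content(url, timeout=5.0):
    """True if the URL answers successfully with a non-empty body."""
    try:
        with _OPENER.open(url, timeout=timeout) as response:
            return len(response.read()) > 0
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, TLS failure, or HTTP error status.
        return False


def galaxy_status():
    """Run the checks listed above, except the database check."""
    return {
        "postgresql service": service_is_running("postgresql"),
        "galaxy (http)": responds_with_content("http://localhost:8080/api/version"),
        "galaxy service": service_is_running("galaxy"),
        "nginx service": service_is_running("nginx"),
        "galaxy (https)": responds_with_content("https://localhost:443/api/version"),
    }
```

Run on a student VM, `galaxy_status()` returns a dictionary of check names mapped to `True` or `False`.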
However, the gat command is only available on the VMs, so we’ve written an Ansible task (wrapped in the Makefile) which SSHs into every machine and runs the gat command for you:
$ make check-galaxy-eu
gat-19 postgres ✘ ✘ galaxy(http) ✘ SysD-gxy ✘ SysD-nginx ✘ galaxy(ssl) ✘
gat-22 postgres ✔ ✘ galaxy(http) ✘ SysD-gxy ✘ SysD-nginx ✘ galaxy(ssl) ✘
gat-2 postgres ✔ ✔ galaxy(http) ✔ SysD-gxy ✘ SysD-nginx ✘ galaxy(ssl) ✘
gat-15 postgres ✔ ✘ galaxy(http) ✘ SysD-gxy ✘ SysD-nginx ✘ galaxy(ssl) ✘
gat-24 postgres ✔ ✘ galaxy(http) ✘ SysD-gxy ✘ SysD-nginx ✘ galaxy(ssl) ✘
gat-6 postgres ✘ ✘ galaxy(http) ✘ SysD-gxy ✘ SysD-nginx ✘ galaxy(ssl) ✘
You can see a checkmark reported for every step completed by students, giving you a nice overview of how many students have completed each step and whether you’re ready to move on. You also know precisely which students to check in with if they aren’t progressing. The -eu suffix again refers to the specific pool of machines; if you’re using a different hosts file with different group names, you may need to edit the Makefile accordingly.
There are other make check- commands for each of the gat status commands. Run make to list all of them.
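With many students, counting checkmarks by eye gets tedious. The following Python sketch tallies, per step, how many machines show only checkmarks. It assumes the output looks exactly like the make check-galaxy-eu sample above (a hostname followed by labels, each followed by one or more ✔ or ✘ marks); real output may contain extra Ansible lines or colour codes that would need filtering first. The function names are illustrative.

```python
from collections import Counter

OK, FAIL = "✔", "✘"


def parse_status_line(line):
    """Split one status line into (host, {label: [bool, ...]})."""
    host, *tokens = line.split()
    checks = {}
    label = None
    for token in tokens:
        if token in (OK, FAIL):
            # A mark belongs to the most recent label.
            if label is not None:
                checks[label].append(token == OK)
        else:
            label = token
            checks[label] = []
    return host, checks


def count_completed(lines):
    """Count, per label, the hosts whose marks for that label are all ✔."""
    done = Counter()
    for line in lines:
        if not line.strip():
            continue
        _, checks = parse_status_line(line)
        for label, marks in checks.items():
            if marks and all(marks):
                done[label] += 1
    return done
```

Feeding it the saved output, for example `count_completed(open("status.txt", encoding="utf-8"))` where status.txt is a hypothetical file holding the lines above, gives a per-step count you can compare with the number of students.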
After Your Training
Once your training is concluded, go through the questions students have asked in the Google Doc, and consider contributing them back to the training materials with Tips and Question boxes covering these student questions. Here is an example pull request when we did this after BCC2020.
Conclusion
Key points
Infrastructure is available for running GATs for free from UseGalaxy.eu.
This can be very convenient and easy to use.
EU provides appropriate DNS entries so you can run trainings with Interactive Tools (ITs).
Frequently Asked Questions
Have questions about this tutorial? Check out the FAQ page for the Teaching and Hosting Galaxy training topic to see if your question is listed there. If not, please ask your question on the GTN Gitter Channel or the Galaxy Help Forum.
Feedback
Did you use this material as an instructor? Feel free to give us feedback on how it went.
Citing this Tutorial
- Helena Rasche, 2020 Galaxy Admin Training (Galaxy Training Materials). https://training.galaxyproject.org/training-material/topics/instructors/tutorials/galaxy-admin-training/tutorial.html Online; accessed TODAY
- Batut et al., 2018 Community-Driven Data Analysis Training for Biology Cell Systems 10.1016/j.cels.2018.05.012
BibTeX
@misc{instructors-galaxy-admin-training,
  author = "Helena Rasche",
  title = "Galaxy Admin Training (Galaxy Training Materials)",
  year = "2020",
  month = "11",
  day = "18",
  url = "\url{https://training.galaxyproject.org/training-material/topics/instructors/tutorials/galaxy-admin-training/tutorial.html}",
  note = "[Online; accessed TODAY]"
}
@article{Batut_2018,
  doi = {10.1016/j.cels.2018.05.012},
  url = {https://doi.org/10.1016%2Fj.cels.2018.05.012},
  year = 2018,
  month = {jun},
  publisher = {Elsevier {BV}},
  volume = {6},
  number = {6},
  pages = {752--758.e1},
  author = {B{\'{e}}r{\'{e}}nice Batut and Saskia Hiltemann and Andrea Bagnacani and Dannon Baker and Vivek Bhardwaj and Clemens Blank and Anthony Bretaudeau and Loraine Brillet-Gu{\'{e}}guen and Martin {\v{C}}ech and John Chilton and Dave Clements and Olivia Doppelt-Azeroual and Anika Erxleben and Mallory Ann Freeberg and Simon Gladman and Youri Hoogstrate and Hans-Rudolf Hotz and Torsten Houwaart and Pratik Jagtap and Delphine Larivi{\`{e}}re and Gildas Le Corguill{\'{e}} and Thomas Manke and Fabien Mareuil and Fidel Ram{\'{\i}}rez and Devon Ryan and Florian Christoph Sigloch and Nicola Soranzo and Joachim Wolff and Pavankumar Videm and Markus Wolfien and Aisanjiang Wubuli and Dilmurat Yusuf and James Taylor and Rolf Backofen and Anton Nekrutenko and Björn Grüning},
  title = {Community-Driven Data Analysis Training for Biology},
  journal = {Cell Systems}
}