SQL Educational Game - Murder Mystery
Overview
Questions:Objectives:
Who did the crime?
Requirements:
Explore SQL City and discover who committed the murder
Reinforce your experiences with SQL such as querying, filtering, and joining data.
- Foundations of Data Science
- Introduction to SQL: tutorial hands-on
- Advanced SQL: tutorial hands-on
Time estimation: 2 hoursLevel: Intermediate IntermediateSupporting Materials:Last modification: Dec 21, 2021
Best viewed in a Jupyter Notebook
This tutorial is best viewed in a Jupyter notebook! You can load this notebook one of the following ways
Launching the notebook in Jupyter in Galaxy
- Instructions to Launch JupyterLab
- Open a Terminal in JupyterLab with File -> New -> Terminal
- Run
wget https://training.galaxyproject.org/training-material/topics/data-science/tutorials/sql-game/data-science-sql-game.ipynb
- Select the notebook that appears in the list of files on the left.
Downloading the notebook
- Right click one of these links: Jupyter Notebook (With Solutions), Jupyter Notebook (Without Solutions)
- Save Link As..
This is not a tutorial like most GTN content but a fun exercise for you to play around and learn a bit about SQL in a more ‘practical’, and hopefully re-inforce the skills you covered in Basic and Advanced SQL skills. It makes use of the NUKnightLab/sql-mysteries SQL murder mystery project and released under open licenses:
Original code for NUKnightLab/sql-mysteries is released under the MIT License. Original text and other content is released under Creative Commons CC BY-SA 4.0.
Download the database and connector:
# This preamble sets up the sql "magic" for jupyter. Use %%sql in your cells to write sql!
!python3 -m pip install ipython-sql sqlalchemy
!wget -c https://github.com/NUKnightLab/sql-mysteries/raw/master/sql-murder-mystery.db
Setup the database connection:
import sqlalchemy
engine = sqlalchemy.create_engine("sqlite:///sql-murder-mystery.db")
%load_ext sql
%sql sqlite:///sql-murder-mystery.db
%config SqlMagic.displaycon=False
Tables
Which tables are available to you? What columns do they contain? Here’s a handy reference for you:
import pandas as pd
from sqlalchemy import MetaData
m = MetaData()
m.reflect(engine)
results = []
for table in m.tables.values():
results.append([table.name, ', '.join([c.name for c in table.c])])
pd.set_option('display.max_colwidth', None)
pd.DataFrame(results, columns=["Table", "Columns"])
The search!
Begin your search for the truth
A crime has taken place and the police are both useless and corrupt and it is up to you and your community to solve the mystery. They failed to secure their database, and now their crime scene reports are public, it is time to figure out who the murderer was.
You know that the crime was a murder that occurred sometime on Jan.15, 2018 and that it took place in SQL City.
All the clues to this mystery are buried in a huge database, and you need to use SQL to navigate through this vast network of information. Your first step to solving the mystery is to retrieve the corresponding crime scene report from the police department’s database. From there, you can use your SQL skills to find the murderer.
%%sql
select * from crime_scene_report limit 8;
Try using the ‘Insert → Cell Below’ functionality to keep track of important query results as you go!
Solution
Write the following queries in your SQL environment to check whether you’ve found the right murderer:
%%sql
INSERT INTO solution VALUES (1, "Insert the name of the person you found here");
SELECT value FROM solution;
Key points
Learning SQL can be fun!
Frequently Asked Questions
Have questions about this tutorial? Check out the tutorial FAQ page or the FAQ page for the Foundations of Data Science topic to see if your question is listed there. If not, please ask your question on the GTN Gitter Channel or the Galaxy Help ForumFeedback
Did you use this material as an instructor? Feel free to give us feedback on how it went.
Did you use this material as a learner or student? Click the form below to leave feedback.
Citing this Tutorial
- Helena Rasche, NU Knight Lab, Erasmus+ Programme, Avans Hogeschool, 2021 SQL Educational Game - Murder Mystery (Galaxy Training Materials). https://training.galaxyproject.org/training-material/topics/data-science/tutorials/sql-game/tutorial.html Online; accessed TODAY
- Batut et al., 2018 Community-Driven Data Analysis Training for Biology Cell Systems 10.1016/j.cels.2018.05.012
details BibTeX
@misc{data-science-sql-game, author = "Helena Rasche and NU Knight Lab and Erasmus+ Programme and Avans Hogeschool", title = "SQL Educational Game - Murder Mystery (Galaxy Training Materials)", year = "2021", month = "12", day = "21" url = "\url{https://training.galaxyproject.org/training-material/topics/data-science/tutorials/sql-game/tutorial.html}", note = "[Online; accessed TODAY]" } @article{Batut_2018, doi = {10.1016/j.cels.2018.05.012}, url = {https://doi.org/10.1016%2Fj.cels.2018.05.012}, year = 2018, month = {jun}, publisher = {Elsevier {BV}}, volume = {6}, number = {6}, pages = {752--758.e1}, author = {B{\'{e}}r{\'{e}}nice Batut and Saskia Hiltemann and Andrea Bagnacani and Dannon Baker and Vivek Bhardwaj and Clemens Blank and Anthony Bretaudeau and Loraine Brillet-Gu{\'{e}}guen and Martin {\v{C}}ech and John Chilton and Dave Clements and Olivia Doppelt-Azeroual and Anika Erxleben and Mallory Ann Freeberg and Simon Gladman and Youri Hoogstrate and Hans-Rudolf Hotz and Torsten Houwaart and Pratik Jagtap and Delphine Larivi{\`{e}}re and Gildas Le Corguill{\'{e}} and Thomas Manke and Fabien Mareuil and Fidel Ram{\'{\i}}rez and Devon Ryan and Florian Christoph Sigloch and Nicola Soranzo and Joachim Wolff and Pavankumar Videm and Markus Wolfien and Aisanjiang Wubuli and Dilmurat Yusuf and James Taylor and Rolf Backofen and Anton Nekrutenko and Björn Grüning}, title = {Community-Driven Data Analysis Training for Biology}, journal = {Cell Systems} }
Congratulations on successfully completing this tutorial!
Do you want to extend your knowledge? Follow one of our recommended follow-up trainings: