{
  "metadata": {
  },
  "nbformat": 4,
  "nbformat_minor": 5,
  "cells": [
    {
      "id": "metadata",
      "cell_type": "markdown",
      "source": "<div style=\"border: 2px solid #8A9AD0; margin: 1em 0.2em; padding: 0.5em;\">\n\n# Python - Introductory Graduation\n\nby [Helena Rasche](https://training.galaxyproject.org/hall-of-fame/hexylena/), [Donny Vrins](https://training.galaxyproject.org/hall-of-fame/dirowa/), [Bazante Sanders](https://training.galaxyproject.org/hall-of-fame/bazante1/)\n\nCC-BY licensed content from the [Galaxy Training Network](https://training.galaxyproject.org/)\n\n**Questions**\n\n- What have I learned so far?\n\n**Objectives**\n\n- Recap all previous modules.\n- Use exercises to ensure that all previous knowledge is sufficiently covered.\n\n**Time Estimation: 1H30M**\n</div>\n",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-0",
      "source": "<p>This module provides a recap of everything covered by the modular Python introductory curriculum. It serves as a graduation into the Intermediate tutorials, which cover more advanced topics.</p>\n<blockquote class=\"agenda\" style=\"border: 2px solid #86D486;display: none; margin: 1em 0.2em\">\n<h3 id=\"agenda\">Agenda</h3>\n<p>In this tutorial, we will cover:</p>\n<ol id=\"markdown-toc\">\n<li><a href=\"#review\" id=\"markdown-toc-review\">Review</a>    <ol>\n<li><a href=\"#math\" id=\"markdown-toc-math\">Math</a></li>\n</ol>\n</li>\n</ol>\n</blockquote>\n<h1 id=\"review\">Review</h1>\n<p>This section recapitulates the main points from all of the previous modular tutorials.</p>\n<h2 id=\"math\">Math</h2>\n<p>Math in Python works a lot like math in real life (from algebra onwards). Variables can be assigned and then used in place of numbers:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-1",
      "source": [
        "x = 1\n",
        "y = 2\n",
        "z = x * y"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "python"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-2",
      "source": "<p>We can use familiar math operators:</p>\n<table>\n<thead>\n<tr>\n<th>Operator</th>\n<th>Operation</th>\n</tr>\n</thead>\n<tbody>\n<tr>\n<td><code style=\"color: inherit\">+</code></td>\n<td>Addition</td>\n</tr>\n<tr>\n<td><code style=\"color: inherit\">-</code></td>\n<td>Subtraction</td>\n</tr>\n<tr>\n<td><code style=\"color: inherit\">*</code></td>\n<td>Multiplication</td>\n</tr>\n<tr>\n<td><code style=\"color: inherit\">/</code></td>\n<td>Division (<code style=\"color: inherit\">//</code> for floor division, which rounds the result down to a whole number)</td>\n</tr>\n</tbody>\n</table>\n<p>Some other familiar operations require the use of the <code style=\"color: inherit\">math</code> module:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-3",
      "source": [
        "import math\n",
        "print(math.pow(2, 8))\n",
        "print(math.sqrt(9))"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "python"
          ],
          "id": ""
        }
      }
    },
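    {
      "id": "cell-3-division-note",
      "source": "<p>As a small illustration of the difference between <code style=\"color: inherit\">/</code> and <code style=\"color: inherit\">//</code> from the table above, here is a minimal sketch. The variable names <code style=\"color: inherit\">total</code> and <code style=\"color: inherit\">groups</code> are made up for this example.</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-3-division-code",
      "source": [
        "# Hypothetical values, chosen only to show the two kinds of division\n",
        "total = 7\n",
        "groups = 2\n",
        "\n",
        "print(total / groups)   # true division gives a float: 3.5\n",
        "print(total // groups)  # floor division rounds down: 3\n",
        "print(-total // groups) # rounding is always downwards: -4"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "python"
          ],
          "id": ""
        }
      }
    },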
    {
      "id": "cell-4",
      "source": "<h2 id=\"functions\">Functions</h2>\n<p>Functions are similarly analogous to functions in algebra. We can express <code style=\"color: inherit\">f(x, y) = x * (y * 2)</code>, where <code style=\"color: inherit\">y</code> defaults to 3, in Python as:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-5",
      "source": [
        "def f(x, y=3):\n",
        "    z = y * 2\n",
        "    return x * z"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "python"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-6",
      "source": "<p>There are a few basic parts of a function:</p>\n<ul>\n<li><code style=\"color: inherit\">def</code> starts a function <em>definition</em></li>\n<li>it needs a name, here it’s <code style=\"color: inherit\">f</code></li>\n<li>between the parentheses are the arguments\n<ul>\n<li>there are positional arguments, which are just a variable name (<code class=\"language-plaintext highlighter-rouge\">x</code>)</li>\n<li>and keyword arguments, which have a variable name and a default value (<code class=\"language-plaintext highlighter-rouge\">y=3</code>)</li>\n</ul>\n</li>\n<li>the function body\n<ul>\n<li>with one or more lines</li>\n<li>usually ending in a <code style=\"color: inherit\">return</code></li>\n</ul>\n</li>\n</ul>\n<p>We can also nest functions using function composition, just like in math. In math, function composition is written <code style=\"color: inherit\">f(g(x))</code>, and in Python it’s exactly the same:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-7",
      "source": [
        "print(math.sqrt(math.pow(2, 4)))"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "python"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-8",
      "source": "<p>Here we’ve nested three different functions (print is a function!). To read this we start in the middle (math.pow) and move outwards (math.sqrt, print).</p>\n<h2 id=\"types\">Types</h2>\n<p>There are lots of different datatypes in Python! The basic types are <code style=\"color: inherit\">bool</code>, <code style=\"color: inherit\">int</code>, <code style=\"color: inherit\">float</code>, and <code style=\"color: inherit\">str</code>. Then we have more complex datatypes like <code style=\"color: inherit\">list</code> and <code style=\"color: inherit\">dict</code>, which can contain the basic types (as well as other nested lists and dicts).</p>\n<table>\n<thead>\n<tr>\n<th>Data type</th>\n<th>Examples</th>\n<th>When to use it</th>\n<th>When <strong>not</strong> to use it</th>\n</tr>\n</thead>\n<tbody>\n<tr>\n<td>Boolean (<code class=\"language-plaintext highlighter-rouge\">bool</code>)</td>\n<td><code style=\"color: inherit\">True</code>, <code style=\"color: inherit\">False</code></td>\n<td>If there are only two possible states, true or false</td>\n<td>If your data is not binary</td>\n</tr>\n<tr>\n<td>Integer (<code class=\"language-plaintext highlighter-rouge\">int</code>)</td>\n<td>1, 0, -1023, 42</td>\n<td>Countable, singular items. How many patients are there, how many events did you record, how many variants are there in the sequence</td>\n<td>If doubling or halving the value would not make sense: do not use for e.g. patient IDs, or phone numbers. If these are integers you might accidentally do math on the value.</td>\n</tr>\n<tr>\n<td>Float (<code class=\"language-plaintext highlighter-rouge\">float</code>)</td>\n<td>123.49, 3.14159, -3.33334</td>\n<td>If you need more precision or partial values. Recording distance between places, height, mass, etc.</td>\n<td> </td>\n</tr>\n<tr>\n<td>String (<code class=\"language-plaintext highlighter-rouge\">str</code>)</td>\n<td>‘patient_12312’, ‘Jane Doe’, ‘火锅’</td>\n<td>To store free text, identifiers, sequence IDs, etc.</td>\n<td>If it’s truly a numeric value you can do calculations with, like adding or subtracting or doing statistics.</td>\n</tr>\n<tr>\n<td>List / Array (<code class=\"language-plaintext highlighter-rouge\">list</code>)</td>\n<td><code style=\"color: inherit\">['A', 1, 3.4, ['Nested']]</code></td>\n<td>If you need to store a list of items, like sequences from a file. Especially if you’re reading in a table of data from a file.</td>\n<td>If you want to retrieve individual values and there are clear identifiers, it might be better as a dict.</td>\n</tr>\n<tr>\n<td>Dictionary / Associative Array / map (<code class=\"language-plaintext highlighter-rouge\">dict</code>)</td>\n<td><code style=\"color: inherit\">{\"weight\": 3.4, \"age\": 12, \"name\": \"Fluffy\"}</code></td>\n<td>When you have identifiers for your data, and want to look them up by that value. E.g. looking up sequences by an identifier, or data about students based on their name. Counting values.</td>\n<td>If you just have a list of items without identifiers, it makes more sense to just use a list.</td>\n</tr>\n</tbody>\n</table>\n<p>There are a couple more datatypes we didn’t cover in detail: <code style=\"color: inherit\">set</code>, <code style=\"color: inherit\">tuple</code>, <code style=\"color: inherit\">None</code>, <code style=\"color: inherit\">enum</code>, <code style=\"color: inherit\">bytes</code>, all of which <a href=\"https://docs.python.org/3/library/datatypes.html\">can be read about in Python’s documentation.</a></p>\n<h2 id=\"comparators\">Comparators</h2>\n<p>We have a couple of comparators available to use specifically for numeric values:</p>\n<ul>\n<li><code style=\"color: inherit\">&gt;</code>: greater than</li>\n<li><code style=\"color: inherit\">&lt;</code>: less than</li>\n<li><code style=\"color: inherit\">&gt;=</code>: greater than or equal to</li>\n<li><code style=\"color: inherit\">&lt;=</code>: less than or equal to</li>\n</ul>\n<p>And a couple that can be used with numbers and strings (or other values!):</p>\n<ul>\n<li><code style=\"color: inherit\">==</code>: equal to</li>\n<li><code style=\"color: inherit\">!=</code>: does not equal</li>\n</ul>\n<h2 id=\"iterables\">Iterables</h2>\n<p>Iterables are values we can step through one item at a time, such as a string, a list, or the lines in a file. They are what we loop over in the Loops section below.</p>\n<h2 id=\"flow-control\">Flow Control</h2>\n<p>Basic flow control looks like <code style=\"color: inherit\">if</code>, <code style=\"color: inherit\">elif</code>, <code style=\"color: inherit\">else</code>:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-9",
      "source": [
        "blood_oxygen_saturation = 92.3\n",
        "altitude = 0\n",
        "\n",
        "if blood_oxygen_saturation > 96 and altitude == 0:\n",
        "    print(\"Healthy individual at sea level\")\n",
        "elif blood_oxygen_saturation > 92 and altitude >= 1000:\n",
        "    print(\"Healthy value above sea level\")\n",
        "elif blood_oxygen_saturation == 0:\n",
        "    print(\"Monitor failure\")\n",
        "else:\n",
        "    print(\"Not good\")"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "python"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-10",
      "source": "<p>We must start with an <code style=\"color: inherit\">if</code>, then we can have zero or more <code style=\"color: inherit\">elif</code>s, and zero or one <code style=\"color: inherit\">else</code> to end our clause. If there is no <code style=\"color: inherit\">else</code> and none of the <code style=\"color: inherit\">if</code> or <code style=\"color: inherit\">elif</code> conditions match, nothing happens.</p>\n<p>We can use <code style=\"color: inherit\">and</code> to check that both conditions are true, and <code style=\"color: inherit\">or</code> to check that at least one condition is true.</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-11",
      "source": [
        "# And\n",
        "for i in (True, False):\n",
        "    for j in (True, False):\n",
        "        print(f\"{i} AND {j} => {i and j}\")\n",
        "# Or\n",
        "for i in (True, False):\n",
        "    for j in (True, False):\n",
        "        print(f\"{i} OR {j} => {i or j}\")"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "python"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-12",
      "source": "<p>And if needed, we can invert conditions with <code style=\"color: inherit\">not</code>.</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-13",
      "source": [
        "# Not\n",
        "for i in (True, False):\n",
        "    print(f\"NOT {i} => {not i}\")"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "python"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-14",
      "source": "<p>All of these components (<code class=\"language-plaintext highlighter-rouge\">if/elif/else</code>, <code style=\"color: inherit\">and</code>, <code style=\"color: inherit\">or</code>, <code style=\"color: inherit\">not</code>, numerical and value comparators) let us build up complex conditions that decide which code gets run.</p>\n<h2 id=\"loops\">Loops</h2>\n<p>Loops let us <em>loop</em> over the contents of something iterable (a string, a list, lines in a file). We write:</p>\n<div class=\"language-plaintext highlighter-rouge\"><div><pre style=\"color: inherit; background: transparent\"><code style=\"color: inherit\">for loopVariable in myIterable:\n    # Do something\n    print(loopVariable)\n</code></pre></div></div>\n<p>Each loop has:</p>\n<ul>\n<li><code style=\"color: inherit\">for</code>, a keyword to start the loop</li>\n<li>a loop variable, here named <code style=\"color: inherit\">loopVariable</code>, which is set automatically on every iteration of the loop</li>\n<li><code style=\"color: inherit\">in</code>, a keyword that separates the loop variable from the iterable</li>\n<li>something we want to iterate over, like a list, a string (which behaves like a sequence of single characters), or the lines in a file</li>\n<li>a loop body where all the action happens</li>\n</ul>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-15",
      "source": [
        "a = 0\n",
        "c = 0\n",
        "t = 0\n",
        "g = 0\n",
        "\n",
        "for base in 'ACTGATGCYGGCA':\n",
        "    if base == 'A':\n",
        "        a = a + 1\n",
        "    elif base == 'C':\n",
        "        c = c + 1\n",
        "    elif base == 'T':\n",
        "        t = t + 1\n",
        "    elif base == 'G':\n",
        "        g = g + 1\n",
        "    else:\n",
        "        print(\"Unexpected base!\")\n",
        "\n",
        "print(f\"a={a} c={c} t={t} g={g}\")"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "python"
          ],
          "id": ""
        }
      }
    },
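    {
      "id": "cell-15-dict-note",
      "source": "<p>The same tally can also be kept in a <code style=\"color: inherit\">dict</code>, which the Types table suggests for counting values. This is only a sketch of one possible approach: the variable name <code style=\"color: inherit\">counts</code> is made up for this example, and unexpected bases are reported rather than counted.</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-15-dict-code",
      "source": [
        "# One entry per expected base, each starting at zero\n",
        "counts = {'A': 0, 'C': 0, 'T': 0, 'G': 0}\n",
        "\n",
        "for base in 'ACTGATGCYGGCA':\n",
        "    if base in counts:\n",
        "        # Look up the running total for this base and add one\n",
        "        counts[base] = counts[base] + 1\n",
        "    else:\n",
        "        print(f\"Unexpected base: {base}\")\n",
        "\n",
        "print(counts)"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "python"
          ],
          "id": ""
        }
      }
    },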
    {
      "id": "cell-16",
      "source": "<h2 id=\"files\">Files</h2>\n<p>In Python you must <code style=\"color: inherit\">open()</code> a file handle, using one of the three basic modes (read, write, or append). Normally you must also later <code style=\"color: inherit\">close()</code> that file, but your life can be a bit easier if you use a <code style=\"color: inherit\">with</code> block, which closes the file for you:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-17",
      "source": [
        "with open('out.txt', 'w') as handle:\n",
        "    handle.write(\"Здравствуйте \")\n",
        "    handle.write(\"世界!\\n\")\n",
        "    handle.write(\"Welcome!\\n\")\n",
        "\n",
        "# Can no longer handle.write(), once we've exited the with block, the file is automatically closed for us."
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "python"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-18",
      "source": "<p>There are several basic parts:</p>\n<ul>\n<li>We use <code style=\"color: inherit\">with</code> to indicate we want to use the context manager for file opening (this is what automatically closes the file afterwards)</li>\n<li><code style=\"color: inherit\">open(path, mode)</code> opens a file</li>\n<li><code style=\"color: inherit\">as</code> is a keyword that gives the opened file a name</li>\n<li><code style=\"color: inherit\">handle</code> is the name of a file handle, something that represents the file which we can write to, or read from.</li>\n</ul>\n<p>Additionally, if you need a newline in your file, you <em>must</em> write it yourself with a <code style=\"color: inherit\">\\n</code>.</p>\n<p>The above code is equivalent to the code below. This version is not recommended: it’s a bit harder to read, and it is very common to forget to close files.</p>\n<div class=\"language-plaintext highlighter-rouge\"><div><pre style=\"color: inherit; background: transparent\"><code style=\"color: inherit\">handle = open('out.txt', 'w')\nhandle.write(\"Здравствуйте \")\nhandle.write(\"世界!\\n\")\nhandle.write(\"Welcome!\\n\")\nhandle.close()\n\n# Can no longer write.\n</code></pre></div></div>\n<p>You can also read from a file:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-19",
      "source": [
        "# Read the entire file as one giant string\n",
        "with open('out.txt', 'r') as handle:\n",
        "    print(handle.read())\n",
        "\n",
        "# Or read it as separate lines.\n",
        "with open('out.txt', 'r') as handle:\n",
        "    print(handle.readlines())"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "python"
          ],
          "id": ""
        }
      }
    },
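    {
      "id": "cell-19-lines-note",
      "source": "<p>Since the lines in a file are iterable, a file handle can also be looped over one line at a time, combining this section with the Loops section. A minimal sketch, assuming <code style=\"color: inherit\">out.txt</code> was written by the earlier cell; the variable name <code style=\"color: inherit\">line_number</code> is made up for this example.</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-19-lines-code",
      "source": [
        "line_number = 0\n",
        "\n",
        "with open('out.txt', 'r') as handle:\n",
        "    for line in handle:\n",
        "        line_number = line_number + 1\n",
        "        # Each line still ends with its newline, so strip it before printing\n",
        "        print(line_number, line.strip())"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "python"
          ],
          "id": ""
        }
      }
    },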
    {
      "id": "cell-20",
      "source": "<h2 id=\"exceptions\">Exceptions</h2>\n<p>Sometimes things go wrong!</p>\n<ul>\n<li>You divide by zero (<code class=\"language-plaintext highlighter-rouge\">ZeroDivisionError</code>)</li>\n<li>You try to add a string and a number (<code class=\"language-plaintext highlighter-rouge\">TypeError</code>)</li>\n<li>You have incorrect indentation (<code class=\"language-plaintext highlighter-rouge\">IndentationError</code>)</li>\n<li>Files are unreadable</li>\n<li>Files have the wrong permissions</li>\n</ul>\n<p>If you expect that something might go wrong, you can guard against it with a <code style=\"color: inherit\">try</code>/<code class=\"language-plaintext highlighter-rouge\">except</code>:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-21",
      "source": [
        "try:\n",
        "    print(1 / 0)\n",
        "except ZeroDivisionError:\n",
        "    print(\"Nope! I expected that!\")"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "python"
          ],
          "id": ""
        }
      }
    },
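    {
      "id": "cell-21-valueerror-note",
      "source": "<p>The same pattern guards against other exception types. As a hedged sketch, converting text that is not a number with <code style=\"color: inherit\">int()</code> raises a <code style=\"color: inherit\">ValueError</code>. The variable <code style=\"color: inherit\">user_text</code> and its value are invented for illustration.</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-21-valueerror-code",
      "source": [
        "# Hypothetical input, standing in for something a user typed\n",
        "user_text = \"twelve\"\n",
        "\n",
        "try:\n",
        "    age = int(user_text)\n",
        "    print(f\"Age is {age}\")\n",
        "except ValueError:\n",
        "    # int() could not interpret the text as a whole number\n",
        "    print(f\"Could not convert '{user_text}' to a number\")"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "python"
          ],
          "id": ""
        }
      }
    },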
    {
      "id": "cell-22",
      "source": "<p>One of the most common reasons to do this is when you’re processing user input data. Users often input invalid data, unfortunately.</p>\n<h1 id=\"exercises\">Exercises</h1>\n<h2 id=\"series-approximation\">Series Approximation</h2>\n<p>Write a program that computes the following alternating series, which gets closer to π as N grows:</p>\n\n\\[4\\cdot\\sum_{k=1}^{N} \\dfrac{(-1)^{k+1}}{2k-1}\\]\n<p>Use that expression and calculate the sum for various values of N like <code style=\"color: inherit\">10</code>, <code style=\"color: inherit\">1000</code>, <code class=\"language-plaintext highlighter-rouge\">1000000</code>.</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-23",
      "source": [
        "# Write your approximation here!\n",
        "def calculate_sum(N):\n",
        "    value = 0  # running total; starts at zero\n",
        "    # for k in the range [1, N] (inclusive!)\n",
        "    #     calculate the expression (the bit after the Sigma)\n",
        "    #     and add it to value\n",
        "    # finally, multiply the total by 4\n",
        "    return value\n",
        "\n",
        "print(calculate_sum(10))"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "python"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-24",
      "source": "<h2 id=\"monte-carlo-simulation\">Monte Carlo Simulation</h2>\n<p>You can use a Monte Carlo simulation to estimate the value of π. An easy way to do this is to take the region <code style=\"color: inherit\">x = [0, 1], y = [0, 1]</code>, and fill it with random points. For each point, calculate the distance to the origin; points with a distance of at most 1 lie inside the quarter circle. Calculate the ratio of the inside points to the total points, and multiply the value by 4 to estimate π.</p>\n<p>You can use the <code style=\"color: inherit\">random</code> module to generate random values:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-25",
      "source": [
        "import random\n",
        "\n",
        "print(random.random())\n",
        "print(random.random())\n",
        "print(random.random())"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "python"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-26",
      "source": "<p>Using <code style=\"color: inherit\">random.random()</code> to generate x and y coordinates, write a function that:</p>\n<ul>\n<li>generates N random points</li>\n<li>calculates their distance to the origin</li>\n<li>counts the number of points that have <code style=\"color: inherit\">distance&lt;=1</code></li>\n<li>divides that by <code style=\"color: inherit\">N</code>, and multiplies by <code style=\"color: inherit\">4</code>.</li>\n</ul>\n<figure id=\"figure-1\"><img src=\"https://upload.wikimedia.org/wikipedia/commons/8/84/Pi_30K.gif\" alt=\"Gif of the unit square 0, 1 on the x and y axes. Points are being randomly generated and points inside the unit circle are marked in red, outside in blue. A quarter of a circle starts to appear, and the numbers at the top show as N increases, the approximation of Pi improves.\" loading=\"lazy\" /><figcaption><span class=\"figcaption-prefix\">Figure 1:</span> Gif from the Wikipedia Article for Monte Carlo Method</figcaption></figure>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-27",
      "source": [
        "# Write your code here!\n",
        "import math\n",
        "import random\n",
        "\n",
        "# Just a suggestion: write a function to calculate the distance to the origin.\n",
        "def distance(x, y):\n",
        "    # ... placeholder: replace 0 with the real distance calculation\n",
        "    return 0\n",
        "\n",
        "def generate_random_point():\n",
        "    # ... placeholders: replace these with random values between 0 and 1\n",
        "    x = 0\n",
        "    y = 0\n",
        "    return [x, y]\n",
        "\n",
        "def approximate(N=1000):\n",
        "    # Generate N random points and\n",
        "    #   check whether each point's distance to the origin is at most 1\n",
        "\n",
        "    # Find the ratio of how many are distance<=1\n",
        "    # and return 4 times that ratio.\n",
        "    ratio = 0  # placeholder: replace with (points inside) / N\n",
        "    return 4 * ratio\n",
        "\n",
        "# Try it with a couple of N values like 1, 100, 100000.\n",
        "n = 10000\n",
        "# Since we're using a random number function, the result is different every\n",
        "# time we run the simulation.\n",
        "print(approximate(n))\n",
        "print(approximate(n))\n",
        "print(approximate(n))"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "python"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-28",
      "source": "<h2 id=\"sixpack\">Sixpack</h2>\n<p>Sixpack is an old EMBOSS program which takes in a DNA sequence, and then for every frame, for both strands, emits every Open Reading Frame (ORF) that it sees.</p>\n<div class=\"language-plaintext highlighter-rouge\"><div><pre style=\"color: inherit; background: transparent\"><code style=\"color: inherit\">  G  R  G  F  W  C  L  G  G  K  A  A  K  N  Y  R  E  K  S  V  D  V  A  G  Y  D  X   F1\n   G  V  A  S  G  A  W  A  V  K  R  Q  K  T  T  V  K  S  R  W  M  W  R  V  M  M     F2\n    A  W  L  L  V  P  G  R  *  S  G  K  K  L  P  *  K  V  G  G  C  G  G  L  *  X    F3\n1 GGGCGTGGCTTCTGGTGCCTGGGCGGTAAAGCGGCAAAAAACTACCGTGAAAAGTCGGTGGATGTGGCGGGTTATGATG 79\n  ----:----|----:----|----:----|----:----|----:----|----:----|----:----|----:----\n1 CCCGCACCGAAGACCACGGACCCGCCATTTCGCCGTTTTTTGATGGCACTTTTCAGCCACCTACACCGCCCAATACTAC 79\n   P  R  P  K  Q  H  R  P  P  L  A  A  F  F  *  R  S  F  D  T  S  T  A  P  *  S     F6\n  X  A  H  S  R  T  G  P  R  Y  L  P  L  F  S  G  H  F  T  P  P  H  P  P  N  H  H   F5\n    P  T  A  E  P  A  Q  A  T  F  R  C  F  V  V  T  F  L  R  H  I  H  R  T  I  I    F4\n</code></pre></div></div>\n<p>Here we see a DNA sequence <code style=\"color: inherit\">GGGCGTGGCTTCTGGTGCCTGGGCGGTAAAGCGGCAAAAAACTACCGTGAAAAGTCGGTGGATGTGGCGGGTTATGATG</code> which you’ll use as input. Above it is the translation of the sequence to protein for each of the three forward frames (F1-F3). Below it is the reverse complement of the sequence, and its three-frame translation (F4-F6).</p>\n<p>What sixpack does is:</p>\n<div class=\"language-plaintext highlighter-rouge\"><div><pre style=\"color: inherit; background: transparent\"><code style=\"color: inherit\">orfs = []\n\nfor sequence in [forward, reverse_complement(forward)]:\n    for frame in [sequence, sequence[1:], sequence[2:]]:\n        # Remembering\n        for potential start_codon:\n            # accumulate until it sees a stop codon\n            # and append it to the orfs array once it does.\n</code></pre></div></div>\n<p>Here are some variables for your convenience:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-29",
      "source": [
        "start_codons = ['TTG', 'CTG', 'ATG']\n",
        "stop_codons = ['TAA', 'TAG', 'TGA']\n",
        "\n",
        "# And some convenience functions\n",
        "def is_start_codon(codon):\n",
        "    return codon in start_codons\n",
        "\n",
        "def is_stop_codon(codon):\n",
        "    return codon in stop_codons"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "python"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-30",
      "source": "<p>It’s a good exercise to rewrite <code style=\"color: inherit\">sixpack</code> in a very simplified version without most of the features in sixpack:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
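    {
      "id": "cell-31",
      "source": [
        "# Write your code here!\n",
        "\n",
        "# The input sequence shown in the sixpack output above\n",
        "forward = 'GGGCGTGGCTTCTGGTGCCTGGGCGGTAAAGCGGCAAAAAACTACCGTGAAAAGTCGGTGGATGTGGCGGGTTATGATG'\n",
        "\n",
        "# Some recommendations:\n",
        "def reverse_complement(sequence):\n",
        "    # ... placeholder: returns the input unchanged until you implement it\n",
        "    return sequence\n",
        "\n",
        "orfs = []\n",
        "\n",
        "for sequence in [forward, reverse_complement(forward)]:\n",
        "    for frame in [sequence, sequence[1:], sequence[2:]]:\n",
        "        # For every potential start codon in this frame:\n",
        "        #     accumulate codons until you see a stop codon\n",
        "        #     and append the result to the orfs list once you do.\n",
        "        pass  # replace this with your own code\n",
        "\n",
        "print(orfs)"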
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "python"
          ],
          "id": ""
        }
      }
    },
    {
      "cell_type": "markdown",
      "id": "final-ending-cell",
      "metadata": {
        "editable": false,
        "collapsed": false
      },
      "source": [
        "# Key Points\n\n",
        "\n# Congratulations on successfully completing this tutorial!\n\n",
        "Please [fill out the feedback on the GTN website](https://training.galaxyproject.org/training-material/topics/data-science/tutorials/python-basics-recap/tutorial.html#feedback) and check there for further resources!\n"
      ]
    }
  ]
}