Code Monkey home page Code Monkey logo

b1_03_align-linear-gap's Introduction

Albert-Ludwigs-Universität Freiburg

Lehrstuhl für Bioinformatik - Institut für Informatik - http://www.bioinf.uni-freiburg.de


Bioinformatics 1

WS 2021/2022
Exercise sheet 3: Sequence Alignment

Exercise 4 - Programming assignment: Implementation of Needleman-Wunsch algorithm

The programming assignment will use a scoring function as parameters which is a dictionary with 3 entries. This will look as follows:

scoring = {"match": -1, "mismatch": 1, "gap_introduction": 0}

You do not need to write the scoring function by yourself. However, make sure you use the correct keys in your implementation. An example will be provided in the skeleton of part a).

Despite most implementations of Needleman Wunsch will use a maximization at the optimization step, we will stick to the lecture and use minimization (score(match) < score(gap)). Also, have a look at return typehints in the skeleton functions and read carefully how the results should be returned.

a) Implement the function zero_init() which takes two sequences S1 and S2 and creates the Needleman-Wunsch matrix and initiates all the matrix values with zeroes. Hereby S1 should be represented by the rows and S2 by the columns.

b) Implement the function nw_init() which takes two sequences S1 and S2 as well as the scoring function and fills in the values for the first row and first column of the matrix with the correct values. Utilize a) in your implementation.

c) Implement the function nw_forward() which takes the two sequences S1 and S2 and the scoring function and output the complete matrix filled with the Needleman-Wunsch approach.

The following steps will help you with implementing the traceback.

d) Implement the function previous_cells() which takes two sequences S1 and S2, scoring function, the filled in recursion matrix from the step c) and the cell coordinates as a tuple (row, column). The function should output a list of tuples of all possible previous cells. The tuples should be again structured like (row, column).

e) Implement the function which builds all possible traceback paths. This function should return a list of possible paths which themselves are a list of tuples (row, column). The ordering must be decreasing. Meaning paths should start in the lower right corner of the matrix.

f) Implement the function build_alignment() which takes two sequences and a path as a list of tuples. This function should return an alignment tuple. Meaning two strings of same length with introduced gaps.


Preparing for the Exam

The exam_practice_nw.py file provides the possibility to create training examples for the exam. You can use it as a command line tool e.g. like:

python exam_practice_nw.py --first AAATGT --second AATTAAAA --match -1 --mismatch 0 --gap_introduction 1 --file_name matrix.csv

This will produce a csv file that you can import into Excel, Libre Office Calc, etc., where you can fill in the forward values by yourself.

To check if you provided the correct result, you can use the --check True flag and run it similarily.

python exam_practice_nw.py --check True --first AAATGT --second AATTAAAA --match -1 --mismatch 0 --gap_introduction 1 --file_name matrix_correct.csv

Recommended good practices

Here we have included some best practices helping you solve the exercises as efficiently as possible. First, clone the assignment repository.

$ git clone [email protected]:Bioinformatics-teaching/exercise-sheet-3-sequence-alignment-userID.git

Do not forget to use your own user ID. Now, answers the questions.

$ cd exercise-sheet-3-sequence-alignment-userID
$ emacs -nw exercise_sheet3.py

Include the changes and make a commit describing the modifications.

$ git add exercise_sheet3.py
$ git commit -m "Description of your modifications"

Send your results.

$ git push

Done! But, it would be nice to know something about the score, wouldn't it? Let's check the autograding results. This PR will also be used by the teachers to include specific comments.

Autograding

b1_03_align-linear-gap's People

Contributors

alexander-mitrofanov avatar domonik avatar gallardoalba avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.