CBRG - Computational Biochemistry Research Group
 
  

Lab Rotations

Lab rotations (transfers) are roughly 90-hour projects that give students an opportunity to work on topics relevant to our research group.

If you are interested in any of the topics below, or have any questions, please contact a member of the group.

Transition Darwin to Maple

Recently, we have been in discussions with MapleSoft, the makers of the well-known computer algebra system, about the possible integration of Darwin functions as the official bioinformatics library for Maple.

We have started with the most useful functions in Darwin, which consist both of

These lab rotations are an opportunity to contribute to a project that might impact many users and that gives you insight into state-of-the-art bioinformatics algorithms and methods.

Projects

Graph Algorithms: evaluation and porting of Maximum Edge Weight Clique and Traveling Salesman Problem.
tRNA Pairing Index (TPI): port code of the TPI function and associated biorecipe.

Documentation: design a modular structure for the functionality that will be ported, and verify already ported code by designing unit tests.

Biorecipes

Hash functions - A hash function is a mathematical function that returns an integer computed from a given expression. The hash value is guaranteed to be the same for identical expressions, but it is not guaranteed to be unique: different expressions can share a hash value. Hash functions are useful for finding duplicate records, finding similar records, finding similar substrings, and membership testing. Once we have a hash value for a key, we can rapidly test whether we may have seen the same key before, because only keys with the same hash value need to be compared in full. In this way the number of comparisons made is very small. Your task is to write a biorecipe describing several hash functions.
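The duplicate-detection idea described above can be sketched in a few lines of Python. This is only an illustration, not Darwin code: the names poly_hash and find_duplicates, the polynomial hash itself, and the constants 257 and 1000003 are assumptions chosen for the example.

```python
from collections import defaultdict


def poly_hash(key, base=257, modulus=1_000_003):
    """Illustrative polynomial hash: identical keys always hash identically,
    but different keys may collide (the value is not unique)."""
    value = 0
    for ch in key:
        value = (value * base + ord(ch)) % modulus
    return value


def find_duplicates(records, modulus=1_000_003):
    """Return the records that repeat an earlier record, in input order.

    Records are grouped by hash value, so a full comparison is only made
    against earlier records that landed in the same bucket.
    """
    buckets = defaultdict(list)  # hash value -> distinct records seen so far
    duplicates = []
    for record in records:
        bucket = buckets[poly_hash(record, modulus=modulus)]
        if record in bucket:      # full comparison, within one bucket only
            duplicates.append(record)
        else:
            bucket.append(record)
    return duplicates
```

With a deliberately tiny modulus such as 7, the keys "A" and "H" collide, yet find_duplicates(["A", "H", "A"], modulus=7) still reports only "A", because keys in the same bucket are compared in full.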

Eve's tree - Mitochondrial DNA is inherited clonally through the mother (i.e. you have the same mitochondrial DNA as your mother). Therefore its phylogenetic signal is uncontaminated by recombination. This biorecipe will take over 1000 human mitochondrial DNA sequences and build a phylogenetic tree showing the relationships between many different human populations.

Sudoku - Write code and a biorecipe that use several methods to solve a sudoku.
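As a possible starting point, one of the simplest solving methods is backtracking search: fill the first empty cell with a digit that does not conflict with its row, column, or 3x3 box, recurse, and undo the choice if it leads to a dead end. The Python sketch below is an illustration only; the function names and the sample puzzle are assumptions, and a biorecipe would use Darwin instead.

```python
def is_allowed(grid, row, col, digit):
    """True if digit does not yet occur in the row, column, or 3x3 box."""
    if digit in grid[row]:
        return False
    if any(grid[r][col] == digit for r in range(9)):
        return False
    top, left = 3 * (row // 3), 3 * (col // 3)
    return all(grid[r][c] != digit
               for r in range(top, top + 3)
               for c in range(left, left + 3))


def solve(grid):
    """Solve the grid in place (0 marks an empty cell); return True on success."""
    for row in range(9):
        for col in range(9):
            if grid[row][col] == 0:
                for digit in range(1, 10):
                    if is_allowed(grid, row, col, digit):
                        grid[row][col] = digit
                        if solve(grid):
                            return True
                        grid[row][col] = 0   # undo and try the next digit
                return False                 # no digit fits: backtrack
    return True                              # no empty cell left


# Sample puzzle used only for illustration.
puzzle = [
    [5, 3, 0, 0, 7, 0, 0, 0, 0],
    [6, 0, 0, 1, 9, 5, 0, 0, 0],
    [0, 9, 8, 0, 0, 0, 0, 6, 0],
    [8, 0, 0, 0, 6, 0, 0, 0, 3],
    [4, 0, 0, 8, 0, 3, 0, 0, 1],
    [7, 0, 0, 0, 2, 0, 0, 0, 6],
    [0, 6, 0, 0, 0, 0, 2, 8, 0],
    [0, 0, 0, 4, 1, 9, 0, 0, 5],
    [0, 0, 0, 0, 8, 0, 0, 7, 9],
]
solved = [row[:] for row in puzzle]   # keep the original puzzle intact
found = solve(solved)
```

Other methods worth comparing in the biorecipe include constraint propagation and exact-cover formulations; backtracking is shown here only because it is the shortest to write down.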

Evaluation, testing, and technical writing 

A lab rotation in this group will consist of three closely related parts:

  1. Functional evaluation: for the assigned group of functions, the student should evaluate the existing functionality and suggest missing modules or alternative implementations of the data structures.
  2. Functional testing: for the assigned group of functions, the student should design tests that subject the code to extreme cases.
  3. Technical documentation: the documentation should be scrutinized, corrected, improved, updated, or completed (where it is missing).
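To give an idea of what functional testing with extreme cases can look like, here is a minimal Python sketch. The function hamming_distance is a hypothetical stand-in for a library function under test, not a Darwin function, and the chosen cases (empty input, identical input, very long input, invalid input) are only examples of the kind of boundaries worth probing.

```python
def hamming_distance(a, b):
    """Hypothetical function under test: number of differing positions."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def check_extreme_cases():
    """Exercise the function at the edges of its input domain."""
    # Empty input: the smallest valid case.
    assert hamming_distance("", "") == 0
    # Identical input: the distance must be exactly zero.
    assert hamming_distance("ACGT", "ACGT") == 0
    # Very long input where every position differs.
    n = 100_000
    assert hamming_distance("A" * n, "C" * n) == n
    # Invalid input must be rejected, not silently truncated.
    try:
        hamming_distance("ACG", "AC")
    except ValueError:
        pass
    else:
        raise AssertionError("unequal lengths were accepted")
    return True
```

Calling check_extreme_cases() returns True when all cases pass and raises an AssertionError at the first failing one.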

Projects

Secondary structure prediction: These functions encode a system for job control. The task is to fix the code, document it, include appropriate references, and build a working example in a biorecipe (Predict.drw, AllAlpha, AllBeta, AllSi, Parses).

Build system for Darwin's help pages and Bio-Recipes: integrate them with the new build system based on CMake, automatically verify results, and report regressions and missing topics.

Probabilistic Finite Automata (PFA): These objects are represented in 3 possible ways. The representations are largely equivalent and usually convert into one another seamlessly. There are several methods to build and improve PFAs. In addition, the function EvolutionaryOptimization, which was written with PFAs in mind, should also be considered.

Minimization functions: There are several functions that deal with different optimization problems (functions with different properties). Some of these functions have been superseded by others. The efficiency of these functions should be quantified (and some functions should perhaps be removed).

Graph representation and Partial Orders: Undirected graphs are implemented. Directed graph representation should be considered.

Help system: The classes used for function and class descriptions should be generalized and tightened. Several issues about automatic generation of text/LaTeX/HTML help files should be studied. The safety of the whole process should be increased.

Gnuplot integration: design, implement, and document a way to incorporate Gnuplot as a scientific data visualizer into Darwin.

Statistical Tests: You should improve the current abilities of Darwin to perform statistical hypothesis selection, including different model selection criteria and the development of (non-)parametric statistical tests. This includes the implementation of complex mathematical functions.

Simulation Framework Alf: As part of the development of a simulator for genome evolution, we implemented various functions that could be generally useful. These include a method for tree sampling, parametric codon models, etc. This project aims at identifying these functions, integrating them into the Darwin library, and documenting their use.

Phylogenetic trees: The functions either construct a tree or operate on trees. Included are, for example, the computation of distances between two trees or the length of a tree. The tree reconstruction methods (e.g. FastME, PhyML, PAUP) have already been coded in some other language and need to be integrated (using wrapper functions) into the Darwin system.
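As a small illustration of the two operations mentioned above (the length of a tree and a distance between two trees), here is a Python sketch. The Node class and the function names are hypothetical and do not reflect Darwin's tree data structure. The distance shown is a simple count of clades found in only one of two rooted trees, chosen for brevity; it is not necessarily the measure implemented in Darwin.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str = ""                 # leaf label (empty for internal nodes)
    length: float = 0.0            # length of the branch leading to this node
    children: list = field(default_factory=list)


def tree_length(node):
    """Sum of all branch lengths in the subtree rooted at node."""
    return node.length + sum(tree_length(child) for child in node.children)


def leaf_sets(node, collected):
    """Return the leaf names below node; record each internal node's clade."""
    if not node.children:
        return frozenset([node.name])
    leaves = frozenset().union(*(leaf_sets(c, collected) for c in node.children))
    collected.add(leaves)
    return leaves


def clade_distance(tree1, tree2):
    """Number of clades that occur in exactly one of the two rooted trees."""
    clades1, clades2 = set(), set()
    leaf_sets(tree1, clades1)
    leaf_sets(tree2, clades2)
    return len(clades1 ^ clades2)


# Two rooted trees on the same four leaves: ((A,B),(C,D)) and ((A,C),(B,D)).
t1 = Node(children=[
    Node(length=0.5, children=[Node("A", 1.0), Node("B", 2.0)]),
    Node(length=0.5, children=[Node("C", 1.0), Node("D", 1.0)]),
])
t2 = Node(children=[
    Node(length=0.5, children=[Node("A", 1.0), Node("C", 1.0)]),
    Node(length=0.5, children=[Node("B", 2.0), Node("D", 1.0)]),
])
```

Here tree_length(t1) adds up the six branch lengths, and clade_distance(t1, t2) counts the four clades ({A,B}, {C,D}, {A,C}, {B,D}) that appear in only one of the trees.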

Compatibility with the community: The functions convert multiple sequence alignments and trees between the Darwin internal representation and the formats used in the community.

Multiple sequence alignment utilities: The functions compute distances between alignment structures or score an alignment.

Codon-based analysis: measurement of codon bias and tRNA economy, including functions evaluating codon models on a tree or for pairs of sequences.
Simulating sequences under Markov codon evolution models.

 


© 2012 ETH Zurich | Imprint | Disclaimer | 18 March 2010