mas's People

Contributors

felbecker, luc4sq, mariostanke, mauricerad, moritzgrillo, sebastianbierb, timonkapischke

mas's Issues

Sequence simulation

Simulate S sequences of length L over the alphabet {A,C,G,T} and store them in a file.

Possible distributions:

  • iid
  • master + mutations

The simulation should take place in a separate program with its own main-function.
The sequences are stored in multi-FASTA file format.
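
A minimal sketch of the iid case, assuming a uniform distribution over the four bases. The function names simulate, simulateIid and toFasta, the header scheme ">seq0", ">seq1", ... and the explicit seed parameter are illustrative assumptions, not part of the issue.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Draw one sequence of length L with i.i.d. uniform characters from {A,C,G,T}.
std::string simulateIid(std::size_t L, std::mt19937& rng) {
    static const char alphabet[] = {'A', 'C', 'G', 'T'};
    std::uniform_int_distribution<int> pick(0, 3);
    std::string seq(L, 'A');
    for (char& c : seq) c = alphabet[pick(rng)];
    return seq;
}

// Simulate S sequences of length L; the seed is explicit so runs are reproducible.
std::vector<std::string> simulate(std::size_t S, std::size_t L, unsigned seed) {
    assert(S > 0 && L > 0);
    std::mt19937 rng(seed);
    std::vector<std::string> seqs;
    for (std::size_t s = 0; s < S; ++s) seqs.push_back(simulateIid(L, rng));
    return seqs;
}

// Format sequences as multi-FASTA: a ">" header line followed by the sequence.
// The header names are an assumption of this sketch.
std::string toFasta(const std::vector<std::string>& seqs) {
    std::string out;
    for (std::size_t s = 0; s < seqs.size(); ++s)
        out += ">seq" + std::to_string(s) + "\n" + seqs[s] + "\n";
    return out;
}
```

The separate simulation program would then only need a main-function that parses S and L and writes toFasta(simulate(S, L, seed)) to a std::ofstream.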

sE graph layout

An algorithm that computes new coordinates for nodes on their horizontal axis. Design a criterion for the placement such that the game position is clear, e.g. long near-horizontal edges are avoided. The criterion could be force-directed, e.g. chosen edges pull their connected nodes together and nodes push each other away. A possible solution is using Graphviz, possibly as a library.

Main parts:

  • research existing tools/libraries or own solutions
  • decision on which researched method/library to use has been made
  • a successful first implementation of this idea in raw form (i.e. with dot, as described below the checkboxes)
  • if not done already, check consistency with the doxygen and coding conventions

suggestion:

  • dot: conversion of graph and output coordinates
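
As an alternative to dot, the force-directed idea could be sketched roughly as follows. This is not Graphviz and not a full force model: edges pull the x-coordinates of their endpoints together, and instead of a repulsive force a minimum gap between neighbours in the same row is enforced (which only ever pushes nodes to the right). The names layoutX, LayoutEdge, rows, minGap and pull are assumptions of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical edge type: indices of the two nodes it connects.
struct LayoutEdge { std::size_t u, v; };

// x[n]    : initial horizontal coordinate of node n
// rows[r] : node ids of row r (one row per sequence) in left-to-right order
// pull    : fraction of the horizontal edge length removed per iteration
std::vector<double> layoutX(std::vector<double> x,
                            const std::vector<std::vector<std::size_t>>& rows,
                            const std::vector<LayoutEdge>& edges,
                            double minGap, int iterations, double pull) {
    assert(minGap >= 0.0 && pull > 0.0 && pull <= 1.0);
    for (int it = 0; it < iterations; ++it) {
        // Attraction: each edge moves both endpoints towards each other.
        std::vector<double> shift(x.size(), 0.0);
        for (const LayoutEdge& e : edges) {
            const double d = x[e.v] - x[e.u];
            shift[e.u] += 0.5 * pull * d;
            shift[e.v] -= 0.5 * pull * d;
        }
        for (std::size_t n = 0; n < x.size(); ++n) x[n] += shift[n];
        // Separation: keep the order within a row and a minimum gap.
        for (const auto& row : rows)
            for (std::size_t k = 1; k < row.size(); ++k)
                x[row[k]] = std::max(x[row[k]], x[row[k - 1]] + minGap);
    }
    return x;
}
```

Nodes with many edges may overshoot when pull is large, so a small value is the safer choice here.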

sDt4 SimplifyGraphRenderer

Rewrite graph renderer and simplify its usage.
3 methods:

  • update(float delta) where delta is a time value (in seconds) with the elapsed time since the last update call
  • handleInput(sf::Event event) that is called inside the event polling loop for each polled event
  • render(sf::RenderWindow& window) that renders the graph in its current state

Alignment state + actions

Suppose the set E of edges in the graph defined in #3 is known (which corresponds to the set of possible k-mer alignments in subsequent sequences).

Let a state S denote the currently selected and consistent subset of chosen edges. Write a class to encapsulate such a state. Add a "possibleActions" method that outputs a list of valid moves given the current state.
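
One possible shape for such a class, as a sketch only: edges are referred to by their index in E, and the pairwise consistency test is passed in as a callable, because its exact definition is not fixed here. Everything except the name possibleActions (select, isSelected, the Consistent callback) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

class State {
public:
    // Pairwise consistency test on edge indices (supplied by the caller).
    using Consistent = std::function<bool(std::size_t, std::size_t)>;

    State(std::size_t numEdges, Consistent consistent)
        : selected_(numEdges, false), consistent_(std::move(consistent)) {}

    bool isSelected(std::size_t e) const { return selected_[e]; }

    // Add edge e to the selection; the caller is expected to pick e
    // from possibleActions().
    void select(std::size_t e) {
        assert(e < selected_.size() && !selected_[e]);
        selected_[e] = true;
    }

    // All unselected edges that are consistent with every selected edge.
    std::vector<std::size_t> possibleActions() const {
        std::vector<std::size_t> actions;
        for (std::size_t e = 0; e < selected_.size(); ++e) {
            if (selected_[e]) continue;
            bool ok = true;
            for (std::size_t f = 0; f < selected_.size() && ok; ++f)
                if (selected_[f] && !consistent_(e, f)) ok = false;
            if (ok) actions.push_back(e);
        }
        return actions;
    }

private:
    std::vector<bool> selected_;
    Consistent consistent_;
};
```

This scan is quadratic in the number of edges per call; caching the result and updating it after each select would be a natural refinement.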

sF animated transition between placements

Given two placements (coordinates) of all nodes in a graph, make a smooth animation (with acceleration and slowing down) to go from the current one to the new one. This will e.g. be used when the player adds an edge to her edge choice, then a new placement (sE) is chosen and the transition should be animated rather than instantaneous.
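
A sketch of one way to get acceleration and slowing down: interpolate linearly between the two placements, but pass the time fraction through an ease-in/ease-out curve first. Smoothstep is used here as one common choice; the names Point, easeInOut and interpolate are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2D coordinate of a node.
struct Point { float x, y; };

// Smoothstep easing: 0 at t = 0, 1 at t = 1, with zero slope at both ends,
// so the motion starts slowly, speeds up, and slows down again.
float easeInOut(float t) {
    if (t <= 0.f) return 0.f;
    if (t >= 1.f) return 1.f;
    return t * t * (3.f - 2.f * t);
}

// Placement at time `elapsed` of an animation that lasts `duration` seconds.
std::vector<Point> interpolate(const std::vector<Point>& from,
                               const std::vector<Point>& to,
                               float elapsed, float duration) {
    assert(from.size() == to.size() && duration > 0.f);
    const float w = easeInOut(elapsed / duration);
    std::vector<Point> out(from.size());
    for (std::size_t n = 0; n < from.size(); ++n) {
        out[n].x = from[n].x + w * (to[n].x - from[n].x);
        out[n].y = from[n].y + w * (to[n].y - from[n].y);
    }
    return out;
}
```

An update(float delta) method could accumulate delta into elapsed and call interpolate once per frame until elapsed reaches duration.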

step 4

sJt3 monitoring loss

Compute the loss on a batch or on training data (set of pairs of x,y). The loss should be output from time to time during the optimization, so one can monitor the progress of the optimization.

Fix documentation

The documentation (https://mslehre.github.io/MAS/) still needs some work here and there. I will assign you to this issue, if your code is related.

  • Please keep doxygen comments in header files only.
  • Please do not "overdocument". Class- and public member/method descriptions are important for others as well as comments for function arguments if they are not self-explanatory. Typically you don't include private member variables or local variables in your doxygen documentation.
  • Test code for your features typically does not need doxygen comments.

CPU hogging

When the game is idle, it requires ~50% of CPU time on my laptop. Maybe reconfigure the wait in the main event loop?
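
One way to reconfigure the wait, sketched without SFML: after each pass through the loop body, sleep until the next frame is due instead of spinning. The function name runLimited and its parameters are assumptions. SFML's own sf::Window::setFramerateLimit, or blocking on sf::Window::waitEvent while nothing is animating, may be the simpler fix in the actual game.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Run `body` `frames` times at no more than `fps` iterations per second.
// The thread sleeps for the remainder of each frame, so an idle loop
// uses very little CPU time.
template <typename Body>
void runLimited(int frames, int fps, Body body) {
    assert(fps > 0);
    using clock = std::chrono::steady_clock;
    const auto frameTime = std::chrono::duration_cast<clock::duration>(
        std::chrono::duration<double>(1.0 / fps));
    auto next = clock::now();
    for (int f = 0; f < frames; ++f) {
        body();                               // poll events, update, render
        next += frameTime;                    // time at which the next frame is due
        std::this_thread::sleep_until(next);  // returns at once if already late
    }
}
```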

random number generation

I just tested the sequence simulation code (at home under Windows). It produces the same sequence every time I run the program. Also, the sequences are not mutated at all. I suspect that this is a seed initialization problem. Do you get different sequences for each execution of the program on your system? Do mutations work?

I researched a bit and found that std::random_device is deterministic under MinGW/Windows but non-deterministic under Linux, which is an ugly thing, as we want code to be portable and work the same on every system.

As a solution I would try to not use std::random_device and set the seed using the current system time instead:

#include <chrono>
#include <random>

std::mt19937 rng(std::chrono::high_resolution_clock::now().time_since_epoch().count());

(and remove the line with the random device)

The issue with the mutations not working might be unrelated to this. Since you generate random doubles there, you should use std::uniform_real_distribution
(https://en.cppreference.com/w/cpp/numeric/random/uniform_real_distribution) for the random mutation probability, not the random device as you currently do.
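
Put together, the suggestion might look like the following sketch. The function names makeRng and mutate, and the rule that a mutated position always receives a different base, are assumptions and not taken from the existing simulation code.

```cpp
#include <cassert>
#include <chrono>
#include <random>
#include <string>

// Seed the generator from the system clock instead of std::random_device.
std::mt19937 makeRng() {
    const auto ticks =
        std::chrono::high_resolution_clock::now().time_since_epoch().count();
    return std::mt19937(static_cast<std::mt19937::result_type>(ticks));
}

// Copy `master` and replace each character, independently with probability p,
// by a different base from {A,C,G,T}.
std::string mutate(const std::string& master, double p, std::mt19937& rng) {
    assert(p >= 0.0 && p <= 1.0);
    static const std::string alphabet = "ACGT";
    std::uniform_real_distribution<double> coin(0.0, 1.0);  // values in [0,1)
    std::uniform_int_distribution<int> pick(0, 3);
    std::string out = master;
    for (char& c : out) {
        if (coin(rng) < p) {
            char replacement;
            do { replacement = alphabet[pick(rng)]; } while (replacement == c);
            c = replacement;
        }
    }
    return out;
}
```

Passing the generator by reference keeps a single stream of random numbers for the whole program, which also makes it easy to switch to a fixed seed for tests.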

sJt2 stochastic gradient descent

Write the code that iterates over batches of training data x and corresponding outputs y. In the example code mnist.cpp this is the for loop over the epochs and it calls optimizer.step().
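
The structure of that loop, sketched in plain C++ without PyTorch so that it is self-contained: a one-dimensional linear model y = w*x + b, a hand-written gradient in place of backward(), and a manual parameter update in place of optimizer.step(). The names Sample, Params and train are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

struct Sample { double x, y; };
struct Params { double w = 0.0, b = 0.0; };

// Mini-batch stochastic gradient descent on the mean squared error.
Params train(std::vector<Sample> data, std::size_t batchSize,
             int epochs, double lr, unsigned seed) {
    assert(!data.empty() && batchSize > 0);
    std::mt19937 rng(seed);
    Params p;
    for (int epoch = 0; epoch < epochs; ++epoch) {
        std::shuffle(data.begin(), data.end(), rng);  // new batch split per epoch
        for (std::size_t start = 0; start < data.size(); start += batchSize) {
            const std::size_t end = std::min(start + batchSize, data.size());
            double gw = 0.0, gb = 0.0;
            for (std::size_t i = start; i < end; ++i) {
                const double r = p.w * data[i].x + p.b - data[i].y;  // residual
                gw += 2.0 * r * data[i].x;
                gb += 2.0 * r;
            }
            const double n = static_cast<double>(end - start);
            p.w -= lr * gw / n;  // this update plays the role of optimizer.step()
            p.b -= lr * gb / n;
        }
    }
    return p;
}
```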

sCt3 agent efficiency changes

Refactoring of Agent according to comments in sH branch pull request.
Efficiency changes to getEpisode, setting LearnedPolicy as default and changing LearnedPolicy accordingly.

sL GUI

Figure out how to poll mouse-events with SFML, create a simple GUI with buttons like "load game" or "close game" that perform an action when clicked.

Graph construction

Write a class that:

  • reads multiple fasta files
  • splits the sequences into k-mers
  • computes exact matches between k-mers of subsequent sequences
  • eventually builds a graph data structure with all k-mers with at least one match as nodes and where an edge represents an exact match of 2 k-mers

More precisely:
Let equation for equation denote the number of k-mers with at least one match in string j.

Consider the graph equation such that equation and equation.
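
The matching step could be sketched as follows, with the sequences already read into memory. It assumes that a k-mer starts at every position (overlapping k-mers) and that the Node fields i and j are the sequence index and the start position; both are assumptions of this sketch, as is the function name kmerMatches. The nodes of the graph are then the endpoints of the returned edges.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

struct Node { std::size_t i; std::size_t j; };  // sequence index, k-mer start
struct Edge { Node first; Node second; };       // exact match of two k-mers

// All exact k-mer matches between each sequence and the one that follows it.
std::vector<Edge> kmerMatches(const std::vector<std::string>& seqs, std::size_t k) {
    assert(k > 0);
    std::vector<Edge> edges;
    for (std::size_t s = 0; s + 1 < seqs.size(); ++s) {
        const std::string& a = seqs[s];
        const std::string& b = seqs[s + 1];
        if (a.size() < k || b.size() < k) continue;
        // Index every k-mer start position of the following sequence.
        std::unordered_map<std::string, std::vector<std::size_t>> index;
        for (std::size_t j = 0; j + k <= b.size(); ++j)
            index[b.substr(j, k)].push_back(j);
        // Look up every k-mer of the current sequence.
        for (std::size_t j = 0; j + k <= a.size(); ++j) {
            const auto hit = index.find(a.substr(j, k));
            if (hit == index.end()) continue;
            for (std::size_t jb : hit->second)
                edges.push_back({{s, j}, {s + 1, jb}});
        }
    }
    return edges;
}
```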

Visualization

Use the SFML-Library to write code that:

  • opens a window
  • renders a graph as described in #3 (make sure interfaces match)

You may want to start with an empty window, proceed with a single rectangle that optionally contains a string and then end up in plotting the whole k-mer graph.

This issue will define the main-function for the core program.

sJt1 loss and gradient

Define the sum-of-squares loss we use for regression:
the sum of squared differences between predicted and sample values.
Implement the part that tells PyTorch to run the backward pass, i.e. to compute the gradient of the loss with respect to the parameters.
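
Written out, with the model f_theta, its parameters theta and the number of samples n introduced here only as notation, the loss and the gradient that the backward pass has to deliver are:

```latex
L(\theta) = \sum_{i=1}^{n} \bigl(f_\theta(x_i) - y_i\bigr)^2,
\qquad
\nabla_\theta L(\theta)
  = 2 \sum_{i=1}^{n} \bigl(f_\theta(x_i) - y_i\bigr)\, \nabla_\theta f_\theta(x_i)
```

In PyTorch the second expression is not coded by hand; calling backward() on the loss tensor fills in the gradients of the parameters.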

sLt2 - Gamemaster

We need a class Gamemaster. This class creates a Graph, a State and a DrawNodes vector for the game.

sF smooth animation of replacement

Given two placements (coordinates) of all nodes in a graph, make a smooth animation (with acceleration and slowing down) to go from the current one to the new one. This will e.g. be used when the player adds an edge to her edge choice, then a new placement (sE) is chosen and the transition should be animated rather than instantaneous.

step:

sLt1 finish Button class

Finish the Button class with a function handler.
"The button class could have a general 'setFunction' method that assigns a method that is invoked on button click."

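A sketch of such a handler using std::function, kept free of SFML so that it stands alone. Only the name setFunction comes from the quote; the rectangle geometry, contains and handleClick are assumptions about how the Button class might look.

```cpp
#include <cassert>
#include <functional>
#include <utility>

class Button {
public:
    Button(float x, float y, float width, float height)
        : x_(x), y_(y), w_(width), h_(height) {
        assert(width > 0.f && height > 0.f);
    }

    // Assign the function that is invoked when the button is clicked.
    void setFunction(std::function<void()> f) { onClick_ = std::move(f); }

    // True if the point (px, py) lies inside the button rectangle.
    bool contains(float px, float py) const {
        return px >= x_ && px <= x_ + w_ && py >= y_ && py <= y_ + h_;
    }

    // Call from the event loop for a mouse press at (px, py).
    // Returns true if the button was hit; the handler runs if one is set.
    bool handleClick(float px, float py) {
        if (!contains(px, py)) return false;
        if (onClick_) onClick_();
        return true;
    }

private:
    float x_, y_, w_, h_;
    std::function<void()> onClick_;
};
```

A "close game" button would then be wired up with something like setFunction([&window] { window.close(); }), where window is the game's render window.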
bug in edge consistency

The boundary case is wrong: no two selected edges may share a node. In other words: each equivalence class of characters aligned with each other can contain at most one character from each sequence.

if((edges[left].first.j<edges[i].first.j && edges[left].second.j>edges[i].second.j)

This is a matter of changing > to >= or likewise.
Also: please make the consistency check for a pair of edges a separate function, e.g.
bool consistent(Edge &e, Edge &f);
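
A sketch of what the separate function might look like. It assumes that Node has a sequence index i and a position j, and it only compares edges that connect the same pair of sequences; treating all other pairs as consistent is a simplification of this sketch. The arguments are taken by const reference here, unlike the signature above.

```cpp
#include <cassert>
#include <cstddef>

struct Node { std::size_t i; std::size_t j; };  // sequence index, position
struct Edge { Node first; Node second; };

// Two edges between the same pair of sequences are consistent only if they
// are strictly ordered the same way at both ends. The strict comparisons
// cover the boundary case: edges that share a node are inconsistent.
bool consistent(const Edge& e, const Edge& f) {
    // An edge joins k-mers of two different sequences.
    assert(e.first.i != e.second.i && f.first.i != f.second.i);
    if (e.first.i != f.first.i || e.second.i != f.second.i)
        return true;  // assumption: not checked in this sketch
    return (e.first.j < f.first.j && e.second.j < f.second.j) ||
           (e.first.j > f.first.j && e.second.j > f.second.j);
}
```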

sLt5 - layout improvement

If we start a game, the graph is not displayed correctly. We must increase the line index by one.

sK RL: learning of v-values

Create a training set for the ML method from sJ:

  1. sample trajectories from the current policy.
  2. create a training set of learning tuples (s,r) and learn new parameters of ML (sJ); r is the cumulative reward (paid only at the end) and s is any state on the policy episode (rollout).
  3. goto 1.
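
Step 2 could be sketched like this: because the reward is paid only at the end, every state visited in an episode is paired with that episode's final reward. The types StateVec, Episode and TrainingTuple and the function makeTrainingSet are assumptions and need not match the project's Episode.h.

```cpp
#include <cassert>
#include <vector>

using StateVec = std::vector<bool>;  // hypothetical encoding: one flag per edge

struct Episode {
    std::vector<StateVec> states;    // states visited during the rollout
    double finalReward;              // cumulative reward, paid at the end
};

struct TrainingTuple { StateVec s; double r; };

// Turn sampled episodes into learning tuples (s, r).
std::vector<TrainingTuple> makeTrainingSet(const std::vector<Episode>& episodes) {
    std::vector<TrainingTuple> tuples;
    for (const Episode& ep : episodes) {
        assert(!ep.states.empty());
        for (const StateVec& s : ep.states)
            tuples.push_back({s, ep.finalReward});
    }
    return tuples;
}
```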

step: 5

sJt4 training performance test

Test the performance of the training algorithm to achieve its objective: minimization of loss.
For the case of linear regression this could be done by comparing to the closed-form solution or by reporting the maximum absolute value of the gradient after the training loop has finished.
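
Both checks, sketched for one-dimensional linear regression y = w*x + b in plain C++. The names Sample, Line, closedForm and maxAbsGradient are assumptions; in the project the trained parameters would come from the PyTorch model.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Sample { double x, y; };
struct Line { double w, b; };

// Closed-form least-squares solution: w = S_xy / S_xx, b = mean(y) - w * mean(x).
Line closedForm(const std::vector<Sample>& data) {
    assert(data.size() >= 2);
    const double n = static_cast<double>(data.size());
    double mx = 0.0, my = 0.0;
    for (const Sample& s : data) { mx += s.x; my += s.y; }
    mx /= n;
    my /= n;
    double sxy = 0.0, sxx = 0.0;
    for (const Sample& s : data) {
        sxy += (s.x - mx) * (s.y - my);
        sxx += (s.x - mx) * (s.x - mx);
    }
    assert(sxx > 0.0);  // the x values must not all be equal
    const double w = sxy / sxx;
    return {w, my - w * mx};
}

// Largest absolute component of the gradient of the sum-of-squares loss.
// After successful training this should be close to zero.
double maxAbsGradient(const std::vector<Sample>& data, Line p) {
    double gw = 0.0, gb = 0.0;
    for (const Sample& s : data) {
        const double r = p.w * s.x + p.b - s.y;
        gw += 2.0 * r * s.x;
        gb += 2.0 * r;
    }
    return std::max(std::fabs(gw), std::fabs(gb));
}
```

A test could then require that the trained parameters lie within a small tolerance of closedForm(data), or that maxAbsGradient falls below a chosen threshold.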

sHt3 valueMLmodel efficiency changes

Update calcValueEstimates and related functions so that actions are not copied as boolean vectors.
Also change Episode.h accordingly, and some refactoring for RLDataset and valueMLmodel.
