
Comments (3)

sreichl commented on June 22, 2024

Idea 1: pre-generate all feature list names

  • Make the feature-list-generating rule a checkpoint, followed by an aggregation rule that creates a CSV similar to the input annotation of the enrichment analysis (name, path, background,…) for each analysis. The target rule then requires this CSV instead of the feature_list folder.
  • This solves the missing-input problem without using the internal data, and annotating the enrichment analysis module becomes less cumbersome.
  • -> enables a run from A to Z
  • Need to explicitly determine the exact filenames before execution and then instruct the rules accordingly -> Is this actually possible?! In genome_track I did not manage to make outputs conditional, only inputs, using input functions.
  • This requires the function dmatrix from the library patsy, which in turn requires the Global Workflow Dependency functionality of Snakemake 8.
  • Need to make empty files for groups without DEGs.
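
A minimal sketch of what the aggregation step could do once the checkpoint has finished, in plain Python so it can be tried outside Snakemake. The function name write_enrichment_annotation, the glob pattern and the single shared background file are assumptions for illustration; only the columns name, path and background come from the note above, and any further columns are left out.

```python
import csv
from pathlib import Path


def write_enrichment_annotation(feature_list_dir, background_path, out_csv):
    """Write one CSV row (name, path, background) per feature list found.

    Hypothetical helper: the real annotation may need more columns.
    """
    feature_list_dir = Path(feature_list_dir)
    rows = []
    # Only lists that actually exist after the checkpoint are picked up;
    # "*_features.txt" deliberately skips the "*_features_annot.txt" files.
    for path in sorted(feature_list_dir.glob("*_features.txt")):
        rows.append(
            {
                "name": path.stem,
                "path": str(path),
                "background": str(background_path),
            }
        )
    with open(out_csv, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["name", "path", "background"])
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

In a workflow this would run in the rule that depends on the checkpoint, so the CSV becomes the single file requested by the target rule.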

Idea 2: use checkpoints

Idea 3: use for loops around the rule

  • Check whether for loops around rules are supported. If so, define one rule per analysis, with the respective expand for the result files.

Idea 4: input = output?

  • Can I have a rule that has its input as output?!

Idea 5: adapt enrichment_analysis input

  • Change the enrichment analysis input to a pattern of the output directory of the differential analysis. Think it through before testing and implementing.

Idea 6: Split up the feature list generation per group

  • Con: waste of resources as the result is loaded over and over
  • Pro: specific outputs supported by Snakemake
  • Request all predetermined feature lists in the final target rule and use wildcards for each group within each analysis.
  • Solves the problem without checkpoints or other problems (but requires Snakemake 8)
  • To save resources, the explicit rule could take its input from the checkpoint, select only the lists for its analysis, and then copy or touch them?
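
A rough sketch of the copy-or-touch step from the last bullet, again in plain Python. The helper name materialize_group_lists and its arguments are hypothetical, and the "down" filename is assumed by analogy with the "up" one. Groups without DEGs simply get empty files, so every predetermined output exists.

```python
import shutil
from pathlib import Path


def materialize_group_lists(checkpoint_dir, target_dir, groups, directions=("up", "down")):
    """Copy each expected per-group list from the checkpoint output,
    or create an empty file if the checkpoint did not produce it."""
    checkpoint_dir, target_dir = Path(checkpoint_dir), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for group in groups:
        for direction in directions:
            name = f"{group}_{direction}_features.txt"
            source, target = checkpoint_dir / name, target_dir / name
            if source.exists():
                shutil.copyfile(source, target)
            else:
                # Group without DEGs in this direction: empty placeholder.
                target.touch()
            created.append(target)
    return created
```

This keeps the expensive result loading in the checkpoint and leaves only cheap file operations in the per-group rule.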

from dea_limma.

sreichl commented on June 22, 2024

Goal: Run analyses from rAw/reAds to pathwayZ/enrichmentZ, i.e., close the gap between dea_limma/_seurat and the enrichment_analysis module

If file names are pre-generated explicitly, then Snakemake 8 is required:

  • install Snakemake 8
  • setup & document SLURM executor for CeMM HPC
  • change module to work with Snakemake 8 and SLURM executor (e.g., move partition from param to resource)
    • change & test all other modules, then switch min_version to 8.X.X
  • add a global workflow dependency, i.e., envs/global.yaml with the library patsy for the function dmatrix
  • develop a function that generates the file names using patsy
  • add it to the target rule all as the final outcome
  • add a rule that touches (or copies?) the respective files per group from the checkpoint, or call a new rule/script for feature list generation per group:
    input:
        get_feature_lists,
    output:
        up = os.path.join(result_path,'{analysis}','feature_lists','{group}_up_features.txt'),
        up_annot = os.path.join(result_path,'{analysis}','feature_lists','{group}_up_features_annot.txt') if config["feature_annotation"]["path"]!="" else [],
        # same for down and featureScores.csv
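
A sketch of the file-name function for the target rule, under the assumption that the groups per analysis are already known. Deriving them with patsy's dmatrix is not shown here; the mapping is passed in instead, and expected_feature_lists is a hypothetical name. The "up" patterns follow the snippet above; the "down" patterns are assumed by analogy, and featureScores.csv is omitted because its exact filename is not given.

```python
import os


def expected_feature_lists(result_path, groups_per_analysis, annotated=False):
    """Return all feature list paths that the target rule would request.

    groups_per_analysis maps an analysis name to its list of group names.
    """
    suffixes = ["features.txt"]
    if annotated:
        # Mirrors the config["feature_annotation"]["path"] != "" condition.
        suffixes.append("features_annot.txt")
    paths = []
    for analysis, groups in groups_per_analysis.items():
        for group in groups:
            for direction in ("up", "down"):
                for suffix in suffixes:
                    paths.append(
                        os.path.join(
                            result_path,
                            analysis,
                            "feature_lists",
                            f"{group}_{direction}_{suffix}",
                        )
                    )
    return paths
```

The returned list could be handed to the input of rule all; the per-group rule above would then produce each path via the {analysis} and {group} wildcards.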


sreichl commented on June 22, 2024

Potential problem with predetermining result names
It requires looking into annotation/metadata that is generated upstream by, e.g., spilterlize or scRNAseq processing… hence it can't be used for a real A to Z run… But isn't that then a general problem? Think it through thoroughly before testing, then test simply, without heavy development.
Which brings me back to checkpoints between modules being the solution?!?!

