galaxy-tricks

Useful tricks and tips for Galaxy users

"You must feel the Force around you." Yoda

Text processing

  • Convert comma-separated files into tab-separated files
    Convert delimiters to TAB
  • FASTA files with unique sequences
    FASTA-to-Tabular → Unique occurrences of each record (advanced parameters) → Tabular-to-FASTA
  • Remove sequences with N or any other character
    FASTA-to-Tabular → Filter data on any column using simple expressions with
    (condition: c2.find('N') == -1, which keeps only the sequences without an N) → Tabular-to-FASTA
  • Extracting the 3rd column from a 5 column file
    Cut columns from a table with c3
  • Reorder columns or column swap
    Cut columns from a table with c3,c2,c1
  • Count how often one entry appears in column 1
    Datamash with Group by fields: 1 and Operation to perform: count
  • Group all rows where column 1, 4 and 5 are identical
    Datamash with Group by fields: 1,4,5
  • Column-to-rows and rows-to-columns (transpose matrix)
    Transpose rows/columns
  • Make your files smaller, e.g. for testing; subsampling of files
    Select random lines from a file
  • Make your sequence files smaller, e.g. for testing; subsampling sequences
    Sub-sample sequences files
  • Merge two files together according to one column in every file
    Join two files
  • Add unique column
    Add column to an existing dataset with iterate: Yes
  • Get rid of all rows where column 2 has values greater than 0
    Filter data on any column using simple expressions with c2<=0 (Filter keeps the rows for which the condition is true)
  • Get all rows where column 4 starts with hsa
    Filter data on any column using simple expressions with c4.startswith('hsa')
  • Get rid of all rows where the sum of column 2 and 3 is greater than 10
    Filter data on any column using simple expressions with c2+c3<=10
  • Get rid of all rows where the length of the text in column 2 is greater than 10
    Filter data on any column using simple expressions with len(c2)<=10
  • Create new rows for every comma separated value in column 3; Unfolding
    Unfold columns from a table with Column 3 and Comma
  • Split the first four characters of a line into their own column
    Replace Text in entire line with Find Pattern: ^(.{4}) and Replace Pattern: &\t
  • Add the base pairs "TA" to the end of each sequence
    FASTA-to-Tabular → Add column with TA → Merge Columns → Cut columns → Tabular-to-FASTA
  • Add a quotation mark to every row
    Compute an expression on every row with chr(34) (34 is the ASCII code for ")
  • Count the columns whose value is not 0. Useful if you want to calculate the mean but exclude all columns that are 0.
    Compute an expression on every row with bool(c1) + bool(c2) + bool(c3) ...
  • Calculate log2 (not log10) of a column (e.g. c1), adding a new column
    Compute an expression on every row with log(c1,2)
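
The Filter and Compute expressions above read like ordinary Python expressions evaluated once per row. The sketch below mimics that behaviour outside Galaxy, assuming Filter keeps the rows for which the expression is true. The helper filter_rows and the sample table are hypothetical and serve only as illustration; this is not Galaxy's implementation.

```python
def filter_rows(rows, condition):
    """Keep only the rows for which condition(row) is true."""
    return [row for row in rows if condition(row)]


# Hypothetical four-column table; row[0] plays the role of c1, row[1] of c2, etc.
rows = [
    ("geneA", 0, 3, "hsa-let-7a"),
    ("geneB", 5, 9, "mmu-let-7a"),
    ("geneC", 2, 1, "hsa-miR-21"),
]

# c2>0 keeps the rows with a positive value in column 2 ...
positive_c2 = filter_rows(rows, lambda r: r[1] > 0)

# ... so dropping those rows means writing the negated condition, c2<=0.
without_positive_c2 = filter_rows(rows, lambda r: r[1] <= 0)

# c4.startswith('hsa') keeps rows whose fourth column starts with "hsa".
human_only = filter_rows(rows, lambda r: r[3].startswith("hsa"))

# bool(c2) + bool(c3) counts the non-zero values per row (True counts as 1).
nonzero_counts = [bool(r[1]) + bool(r[2]) for r in rows]

print(without_positive_c2)
print(nonzero_counts)
```

The point of the sketch is the direction of the test: a condition selects what stays, so a "get rid of" task needs the opposite comparison.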

HTS

  • Map RNA-seq data
    HISAT or TopHat
  • Map DNA-seq data
    Bowtie or BWA
  • Map methylC-seq data
    Bismark
  • Downsample BAM/SAM files
    BAM/SAM Mapping Stats gives you the number of reads or read pairs in your BAM file, in case you don't know it already. Divide the number of reads you want to keep by the number of reads you have, and use this fraction as the probability in Picard – Downsample SAM/BAM.
  • Get all genes that are covered by reads
    htseq-count with a gene annotation GTF file on your BAM file → Filter data on any column using simple expressions with c2>0
  • Extract sequences from interval files such as GFF, BED or GTF, returning a FASTA file
    Extract Genomic DNA using coordinates from assembled/unassembled genomes
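
The downsampling fraction described above is a simple division. A minimal sketch of the arithmetic follows; the function name downsample_probability and the read counts are hypothetical, and the result is the value you would enter as the probability in the downsampling tool.

```python
def downsample_probability(total_reads, target_reads):
    """Return the fraction of reads to keep, capped at 1.0."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if target_reads < 0:
        raise ValueError("target_reads must not be negative")
    # Asking for more reads than are available keeps everything.
    return min(1.0, target_reads / total_reads)


# Example: a BAM file with 20 million reads, downsampled to about 5 million.
probability = downsample_probability(20_000_000, 5_000_000)
print(probability)
```

Because each read is kept with this probability, the number of reads in the result is approximate rather than exact.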

Workflows

More Resources

Disclaimer

All tools mentioned here are available from the Galaxy Tool Shed. Kindly ask your Galaxy Administrator to get access to them.
