genetherapypatientreportmaker's People

Contributors

anatolydryga, cnobles, esherm, helixscript, yhwu

Forkers

heath1210 jbdoto

genetherapypatientreportmaker's Issues

how to treat VCN=0 vs VCN=NA

There is a small general issue I wanted to bring up before I forget: in both the runstats and the reports, samples with VCN = 0 are being given a VCN of NA. I realize this is probably small beans right now, but it will be useful to make the correction eventually.
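A minimal R sketch of the distinction, using a hypothetical table (the `samples` data frame and its column names are assumptions, not taken from the runstats or report code). The comparison shown as conflating the two cases is only one way this can happen; the issue does not state the actual cause.

```r
# Hypothetical sample table: a VCN of 0 is a real measurement, NA means "not measured".
samples <- data.frame(
  GTSP = c("GTSP0001", "GTSP0002", "GTSP0003"),
  VCN  = c(0, NA, 1.5)
)

# A "greater than zero" test turns the measured 0 into NA as well.
conflated <- ifelse(samples$VCN > 0, samples$VCN, NA)

# Testing explicitly with is.na() keeps the zero and only leaves true gaps as NA.
preserved <- ifelse(is.na(samples$VCN), NA, samples$VCN)
```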

genemarks

Put the genemark code in a function and add an on-the-fly check that the genemarks really mean what they say. Yinghua's job.

estAbunProp

Add an on-the-fly check that the sum of estAbunProp over unique posid for each trial, patient, time, and cell is close to 1. Chris's job?
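Such a check could look roughly like the sketch below. The `sites` table, its column names, the helper `check_estAbunProp`, and the tolerance are all assumptions for illustration; the real data structure in the report code may differ.

```r
# Hypothetical sites table; column names are assumptions for illustration.
sites <- data.frame(
  trial = "t1", patient = "p1", timepoint = "m6",
  celltype = c("PBMC", "PBMC", "Tcells", "Tcells"),
  posid = c("chr1+100", "chr2-200", "chr1+100", "chr3+300"),
  estAbunProp = c(0.25, 0.75, 0.4, 0.6)
)

# Hypothetical helper: stop if any group's proportions do not sum to ~1.
check_estAbunProp <- function(df, tol = 1e-6) {
  keys <- c("trial", "patient", "timepoint", "celltype")
  # Keep one row per posid within each group so each site is counted once.
  df <- df[!duplicated(df[c(keys, "posid")]), ]
  sums <- aggregate(df["estAbunProp"], by = df[keys], FUN = sum)
  bad <- sums[abs(sums$estAbunProp - 1) > tol, ]
  if (nrow(bad) > 0) {
    stop("estAbunProp does not sum to 1 for ", nrow(bad), " group(s)")
  }
  invisible(TRUE)
}

check_estAbunProp(sites)
```

Calling it right after estAbunProp is computed would make a bad group fail loudly instead of surfacing later in a plot.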

Barplot Total legend

If the legend is extensive, it pushes the plots to the side instead of spreading out. We need to set a dynamic height for the legend, possibly based on cell type.

passing unique breakpoints into estAbund

As Yinghua has communicated with Chuck, we assume that any repeats of a distinct width for a specific integration site are PCR replicates and a result of amplification, so we should only pass unique lengths from a single site and replicate into estAbund. I think this could be done by just wrapping dplyr::distinct() around the df made at the beginning of the estimateAbundances script.

What are your thoughts?
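A small sketch of the idea, with a made-up `reads` table (the column names are assumptions). Base R `unique()` stands in for `dplyr::distinct()` here so the example runs without extra packages; on a plain data frame both drop fully duplicated rows.

```r
# Hypothetical fragment table: one row per read; column names are assumptions.
reads <- data.frame(
  location  = c("chr1+100", "chr1+100", "chr1+100", "chr2-200"),
  fragLen   = c(150, 150, 180, 90),
  replicate = c(1, 1, 1, 2)
)

# Drop rows that repeat the same site, fragment length, and replicate.
unique_frags <- unique(reads)

# Number of distinct fragment lengths per site after deduplication.
frag_counts <- table(unique_frags$location)
```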

Over calculating the number of actual unique sites.

For example, an in-depth analysis of pAK m6:

                |----- in-depth analysis ------|    report
    GTSP        uniSites  multiSites  totalSites  uniSites
    GTSP0470       19455         536       19991     20528
    GTSP0467        4908          39        4947      4982
    GTSP0469        9720        1010       10730     10137
    GTSP0471         953        1864        2817       969
    GTSP0473         249          91         340       530
    GTSP0472          56           9          65       155
    GTSP0468         334         113         447       354

The number of unique sites should be calculated per GTSP, not per replicate and then summed. That may already be what the code does, but one way to make sure is to compute length(unique(sites[GTSP]$posID)) after all the replicates have been standardized together.

Otherwise the issue may lie in the way we standardize sites: several sites near each other may be called unique because the tail for the original site was > 5 nt.
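To illustrate the first possibility, here is a toy comparison of the two counting strategies. The `sites` data frame below is hypothetical and much simpler than the real standardized object.

```r
# Hypothetical standardized sites: two replicates of one GTSP sharing one site.
sites <- data.frame(
  GTSP      = c("GTSP0470", "GTSP0470", "GTSP0470", "GTSP0470"),
  replicate = c(1, 1, 2, 2),
  posID     = c("chr1+100", "chr2-200", "chr1+100", "chr3+300")
)

# Counting per replicate and summing counts the shared site twice.
per_replicate <- tapply(sites$posID, list(sites$GTSP, sites$replicate),
                        function(x) length(unique(x)))
summed <- rowSums(per_replicate)

# Counting per GTSP after pooling replicates counts each site once.
per_gtsp <- tapply(sites$posID, sites$GTSP, function(x) length(unique(x)))
```

Here `summed` is 4 while `per_gtsp` is 3, the same direction of discrepancy as an inflated report count.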

for "p04409_10" "p04409-10" "p04409-25" "pWAS00002"

Error in NSBS(i, x, exact = exact, upperBoundIsStrict = !allow.append) :
subscript contains NAs or out-of-bounds indices
Calls: getAbundanceThreshold ... extractROWS -> normalizeSingleBracketSubscript -> NSBS -> NSBS
Execution halted

Dynamic height of embedded figures

The dynamic height of the figures is set incorrectly when there are 3, or a multiple of 3, cell types. This affects both the barplots by sample and the heatmaps. The height should adjust so that there is 1 row for up to 3 barplots, 2 rows for 4-6, and so on. Currently 3 barplots get 2 rows, leaving an entire row empty.

Heatmaps get extended if there are 6 or multiples of 6.
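The intended row count is a ceiling division. The helpers below are hypothetical (the names and the per-row height are assumptions, not the report's actual code). For comparison, a formula like `floor(n / 3) + 1` would give exactly the reported symptom of an extra row at multiples of 3, though whether that is the current implementation is a guess.

```r
# Hypothetical helper: rows needed to lay out n_panels with per_row panels per row.
n_rows <- function(n_panels, per_row = 3) ceiling(n_panels / per_row)

# Hypothetical figure height; row_height (inches) is a placeholder value.
fig_height <- function(n_panels, per_row = 3, row_height = 3) {
  n_rows(n_panels, per_row) * row_height
}

n_rows(1:7)               # 1 1 1 2 2 2 3
n_rows(6, per_row = 6)    # 1 (the heatmap case, assuming 6 per row)
```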

generating report for pSINSCID_UK01 was not successful

Generating report from the following sets
sampleName GTSP patient celltype timepoint
1 GTSP0436-1 GTSP0436 pSINSCID_UK01 PBMC m6
2 GTSP0436-2 GTSP0436 pSINSCID_UK01 PBMC m6
3 GTSP0436-3 GTSP0436 pSINSCID_UK01 PBMC m6
4 GTSP0436-4 GTSP0436 pSINSCID_UK01 PBMC m6
5 GTSP0437-1 GTSP0437 pSINSCID_UK01 Neutrophils m6
6 GTSP0437-2 GTSP0437 pSINSCID_UK01 Neutrophils m6
7 GTSP0437-3 GTSP0437 pSINSCID_UK01 Neutrophils m6
8 GTSP0437-4 GTSP0437 pSINSCID_UK01 Neutrophils m6
9 GTSP0438-1 GTSP0438 pSINSCID_UK01 Tcells m6
10 GTSP0438-2 GTSP0438 pSINSCID_UK01 Tcells m6
11 GTSP0438-3 GTSP0438 pSINSCID_UK01 Tcells m6
12 GTSP0438-4 GTSP0438 pSINSCID_UK01 Tcells m6
13 GTSP0440-1 GTSP0440 pSINSCID_UK01 NKcells m6
14 GTSP0440-2 GTSP0440 pSINSCID_UK01 NKcells m6
15 GTSP0440-3 GTSP0440 pSINSCID_UK01 NKcells m6
16 GTSP0440-4 GTSP0440 pSINSCID_UK01 NKcells m6
17 GTSP0441-1 GTSP0441 pSINSCID_UK01 Monocytes m6
18 GTSP0441-2 GTSP0441 pSINSCID_UK01 Monocytes m6
19 GTSP0441-3 GTSP0441 pSINSCID_UK01 Monocytes m6
20 GTSP0441-4 GTSP0441 pSINSCID_UK01 Monocytes m6
Error in rep(1:length(tframe.list), sapply(tframe.list, nrow)) :
invalid 'times' argument
Calls: lapply ... lapply -> FUN -> getEstimatedAbundance -> estAbund -> factor
Execution halted

sites.multi$estAbun not calculated

The code below is not working:

    if(use.sonicLength){
      estAbund.uniqueFragLen <- function(location, fragLen, replicate=NULL){
        if(is.null(replicate)){replicate <- 1}  #Need for downstream workflow
        dfr <- data.frame(location = location, fragLen = fragLen, 
                          replicate = replicate)
        dfr_dist <- distinct(dfr)
        site_list <- split(dfr_dist, dfr_dist$location)
        theta <- sapply(site_list, function(x){nrow(x)})
        theta <- theta[unique(dfr$location)]
        list(theta=theta)
      }
      estAbund <- estAbund.uniqueFragLen
    }

master had error, counts no error

Please generate all reports before merging.

p03712-08.log:Execution halted
p04409-18.log:Execution halted
p04409-26.log:Execution halted
p959-101.log:Execution halted
pFR03.log:Execution halted
pSINSCID_UK01.log:Execution halted
pWAS00002.log:Execution halted

Cloning repositories leads to a race condition when generating multiple reports

unlink("CancerGeneList", force=TRUE, recursive=TRUE)
cmd <- "git clone https://github.com/BushmanLab/CancerGeneList.git"
message(cmd)
stopifnot( system(cmd)==0 )
source("CancerGeneList/onco_genes.R")

unlink("intSiteCaller", force=TRUE, recursive=TRUE)
cmd <- "git clone https://github.com/BushmanLab/intSiteCaller.git"
message(cmd)
stopifnot( system(cmd)==0 )
source("intSiteCaller/hiReadsProcessor.R")
source("intSiteCaller/standardization_based_on_clustering.R")

This should be removed.
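If the clones cannot be dropped right away, one stopgap is to give each report process its own checkout directory, so that one process's unlink() never deletes a directory another process is reading. This is only a sketch: `clone_private` and its `run` argument are hypothetical names introduced here, not part of the existing code.

```r
# Hypothetical helper: clone a repository into a directory private to this R process.
# `run` defaults to system() and exists only so the command runner can be swapped out.
clone_private <- function(url, run = system) {
  dest <- tempfile(pattern = "clone_")  # fresh path under this session's tempdir()
  cmd <- paste("git clone", shQuote(url), shQuote(dest))
  message(cmd)
  stopifnot(run(cmd) == 0)
  dest
}

# Usage (not run here, needs network access):
# src <- clone_private("https://github.com/BushmanLab/CancerGeneList.git")
# source(file.path(src, "onco_genes.R"))
```

Since tempdir() is per R session, concurrent report runs would no longer share a working-directory checkout.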
