aroma.affymetrix's Introduction

aroma.affymetrix: Analysis of Large Affymetrix Microarray Data Sets

Please see The Aroma Project for more details on how to use the package and for joining the user forum where you can ask questions.

Installation

The R package aroma.affymetrix is available on CRAN. The easiest way to install the package and all of its dependencies (some of which are on Bioconductor) is to use

source('http://callr.org/install#aroma.affymetrix')

aroma.affymetrix's People

Contributors

gara-z, henrikbengtsson

aroma.affymetrix's Issues

ROBUSTNESS: Add explicit 'stringsAsFactors' arguments [data.frame]

$ for pkg in $pkgs; do echo "$pkg:"; (cd "$pkg"; grep -E "^[ \t]*[^#].*data[.]frame" -- */*.R | grep -vF stringsAsFactors;); echo; read -r -p "Press ENTER to continue ..."; done
aroma.affymetrix:
R/AffymetrixCdfFile.getUnitGroupCellMap.R:    map <- data.frame(unit=units, group=groups, cell=cells)
R/AffymetrixCdfFile.getUnitGroupNamesFromUgcMap.R:  res <- data.frame(
R/AffymetrixCdfFile.groupUnitsByDimension.R:  dims <- as.data.frame(dims)
R/AffymetrixCelFile.R:    class(cel) <- "data.frame"
R/AffymetrixNetAffxCsvFile.R:  data <- data.frame(unitName=data[[1]], fln)
R/AlleleSummation.R:        data <- as.data.frame(data[c("intensities", "stdvs", "pixels")])
R/AromaUfcFile.R:  ## 'data.frame':   1879547 obs. of  3 variables:
R/ChipEffectFile.R:    map <- data.frame(unit=integer(0), group=integer(0), cell=integer(0))
R/ChipEffectFile.R:  map <- data.frame(unit=units2, group=groups, cell=cells)
R/ChipEffectFile.R:  data <- as.data.frame(data)
R/ChromosomalModel.getPositionChipTypeUnit.R:  pcu <- data.frame(position=pos, chipType=chipType, unit=units)
R/CrlmmModel.R:  data <- data.frame(gender=rep("female", times=nbrOfArrays))
R/FirmaFile.R:    map <- data.frame(unit=integer(0), group=integer(0), cell=integer(0))
R/FirmaFile.R:  map <- data.frame(unit=units, group=groups, cell=cells)
R/FirmaFile.R:  data <- as.data.frame(data)
R/isUnitGroupCellMap.R:setMethodS3("isUnitGroupCellMap", "data.frame", function(this, ...) {
R/justRMA.R:  data <- data.frame(ScanDate=getTimestamps(csR))
R/MatNormalization.R:  #ss<-split(data.frame(resid),cuts)
R/ParameterCelFile.extractNnn.R:    ugcMap <- as.data.frame(ugcMap)
R/ParameterCelFile.extractNnn.R:    ugNames <- as.data.frame(ugNames)
R/ParameterCelSet.R:    ugcMap <- as.data.frame(ugcMap)
R/ParameterCelSet.R:    ugNames <- as.data.frame(ugNames)
R/readCfhHeader.R:  references <- data.frame(sd=as.double(references[,1]), sample=I(references[,2]))
R/ResidualFile.R:    map <- data.frame(unit=integer(0), group=integer(0), cell=integer(0))
R/ResidualFile.R:  map <- data.frame(unit=units, group=groups, cell=cells)
R/ResidualFile.R:  data <- as.data.frame(data)
R/SmoothMultiarrayModel.fit.R:  pcu <- data.frame(position=pos, chipType=chipType, unit=units)
R/SmoothMultiarrayModel.fit.R:        outData <- data.frame(cell=map[,"cell"], theta=rep(0, nrow(map)))
R/SnpChipEffectNnn.extractCNT.R:  dataHead <- data.frame(
R/UnitModel.fitCnProbes.R:      data <- data.frame(cell=cellsM, theta=y, sdTheta=sdTheta, outliers=FALSE)
R/UnitTypeScaleNormalization.R:  data <- data.frame(cell=NULL, y=NULL)
R/UnitTypeScaleNormalization.R:    dataKK <- data.frame(cell=cells, y=y)
R/WeightsFile.R:    map <- data.frame(unit=integer(0), group=integer(0), cell=integer(0))
R/WeightsFile.R:  map <- data.frame(unit=units, group=groups, cell=cells)
R/WeightsFile.R:  data <- as.data.frame(data)
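
As a minimal sketch of what the fix looks like for calls such as the ones listed above: before R 4.0.0, data.frame() and as.data.frame() converted character columns to factors by default, so passing the argument explicitly makes the result independent of the R version and of global options. The variable values below are made up for illustration and are not taken from the package.

```r
# Illustrative values only (not from aroma.affymetrix)
units <- c(1L, 1L, 2L)
groups <- c("A", "B", "A")

# Explicit argument: character columns stay character on every R version
map <- data.frame(unit = units, group = groups, stringsAsFactors = FALSE)
str(map)

# as.data.frame() on a list accepts the same argument
dims <- as.data.frame(list(name = c("x", "y"), n = c(2L, 3L)),
                      stringsAsFactors = FALSE)
str(dims)
```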

CLEANUP: Stop testing against deprecated future.BatchJobs

$ grep -iF batchjobs -r --include="*.R"
inst/testScripts/system/chipTypes/HG-U133_Plus_2/21.BackgroundCorrections.R:## setParallelEngine(aromaSettings, "BatchJobs")
inst/testScripts/futures/HG-U133_Plus_2/14.NormExpBackgroundCorrection.R:if (require("future.BatchJobs")) {
inst/testScripts/futures/HG-U133_Plus_2/14.NormExpBackgroundCorrection.R:  strategies <- c(strategies, "batchjobs_local")
inst/testScripts/futures/HG-U133_Plus_2/14.NormExpBackgroundCorrection.R:    strategies <- c(strategies, "batchjobs_torque")
inst/testScripts/futures/HG-U133_Plus_2/12.GcRmaBackgroundCorrection.R:if (require("future.BatchJobs")) {
inst/testScripts/futures/HG-U133_Plus_2/12.GcRmaBackgroundCorrection.R:  strategies <- c(strategies, "batchjobs_local")
inst/testScripts/futures/HG-U133_Plus_2/12.GcRmaBackgroundCorrection.R:    strategies <- c(strategies, "batchjobs_torque")
inst/testScripts/futures/HG-U133_Plus_2/13.RmaBackgroundCorrection.R:if (require("future.BatchJobs")) {
inst/testScripts/futures/HG-U133_Plus_2/13.RmaBackgroundCorrection.R:  strategies <- c(strategies, "batchjobs_local")
inst/testScripts/futures/HG-U133_Plus_2/13.RmaBackgroundCorrection.R:    strategies <- c(strategies, "batchjobs_torque")
inst/testScripts/futures/HG-U133_Plus_2/11.OpticalBackgroundCorrection.R:if (require("future.BatchJobs")) {
inst/testScripts/futures/HG-U133_Plus_2/11.OpticalBackgroundCorrection.R:  strategies <- c(strategies, "batchjobs_local")
inst/testScripts/futures/HG-U133_Plus_2/11.OpticalBackgroundCorrection.R:    strategies <- c(strategies, "batchjobs_torque")
inst/testScripts/futures/GenomeWideSNP_6/11.AllelicCrosstalkCalibration.R:if (require("future.BatchJobs")) {
inst/testScripts/futures/GenomeWideSNP_6/11.AllelicCrosstalkCalibration.R:  strategies <- c(strategies, "batchjobs_local")
inst/testScripts/futures/GenomeWideSNP_6/11.AllelicCrosstalkCalibration.R:    strategies <- c(strategies, "batchjobs_torque")
inst/testScripts/futures/GenomeWideSNP_6/11.BasePositionNormalization.R:if (require("future.BatchJobs")) {
inst/testScripts/futures/GenomeWideSNP_6/11.BasePositionNormalization.R:  strategies <- c(strategies, "batchjobs_local")
inst/testScripts/futures/GenomeWideSNP_6/11.BasePositionNormalization.R:    strategies <- c(strategies, "batchjobs_torque")
inst/testScripts/futures/Hs_PromPR_v02/11.MatNormalization.R:if (require("future.BatchJobs")) {
inst/testScripts/futures/Hs_PromPR_v02/11.MatNormalization.R:  strategies <- c(strategies, "batchjobs_local")
inst/testScripts/futures/Hs_PromPR_v02/11.MatNormalization.R:    strategies <- c(strategies, "batchjobs_torque")
inst/testScripts/futures/Mapping10K_Xba142/11.UniqueCDF.R:if (require("future.BatchJobs")) {
inst/testScripts/futures/Mapping10K_Xba142/11.UniqueCDF.R:  strategies <- c(strategies, "batchjobs_local")
inst/testScripts/futures/Mapping10K_Xba142/11.UniqueCDF.R:    strategies <- c(strategies, "batchjobs_torque")
inst/testScripts/futures/Mapping10K_Xba142/11.AllelicCrosstalkCalibration.R:if (require("future.BatchJobs")) {
inst/testScripts/futures/Mapping10K_Xba142/11.AllelicCrosstalkCalibration.R:  strategies <- c(strategies, "batchjobs_local")
inst/testScripts/futures/Mapping10K_Xba142/11.AllelicCrosstalkCalibration.R:    strategies <- c(strategies, "batchjobs_torque")
inst/testScripts/futures/Mapping10K_Xba142/11.BasePositionNormalization.R:if (require("future.BatchJobs")) {
inst/testScripts/futures/Mapping10K_Xba142/11.BasePositionNormalization.R:  strategies <- c(strategies, "batchjobs_local")
inst/testScripts/futures/Mapping10K_Xba142/11.BasePositionNormalization.R:    strategies <- c(strategies, "batchjobs_torque")

FUTURE: Use do.call(fcn, ...) - not do.call("fcn", ...)

When calling a function via do.call(), specify the function by itself and not by its name, e.g.

## GOOD
y <- do.call(backgroundCorrect, args=args)

## BAD
y <- do.call("backgroundCorrect", args=args)

The former causes backgroundCorrect() to be identified as a global object by the globals package, so that it is properly exported when the call is evaluated by a future.
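
The difference can be illustrated with a static code inspection. The sketch below uses codetools::findGlobals() as a stand-in for the analysis done by the globals package (an assumption made for illustration; the two do not necessarily agree in every case), and the backgroundCorrect() defined here is a hypothetical placeholder, not the package's method.

```r
library(codetools)

# Hypothetical placeholder for the real function
backgroundCorrect <- function(x, offset = 0) x - offset

good <- function(args) do.call(backgroundCorrect, args = args)
bad  <- function(args) do.call("backgroundCorrect", args = args)

# The symbol is visible to static analysis only when it is passed as an object
globals_good <- findGlobals(good)
globals_bad  <- findGlobals(bad)

print("backgroundCorrect" %in% globals_good)
print("backgroundCorrect" %in% globals_bad)
```

When the function is given as a string, it appears only as a character constant in the expression, so a static scan has no symbol to pick up.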

TESTS: "addons" tests for GSE12702 run 40 samples and take ~3 hours - can we use fewer to speed it up?

In:

  1. testScripts/addons/chipTypes/Mapping250K_Nsp/
    • 21.doASCRMAv2,CalMaTe.R
    • 31.doASCRMAv2,PSCBS.R
  2. testScripts/addons/chipTypes/Mapping250K_Nsp,Sty/
    • 11.doASCRMAv2,TCNs,500K.R

we use 20 tumor-normal pairs from the GEO dataset GSE12702.

The first set of tests takes nearly 3 hours to complete (sequentially). The AS-CRMAv2 step on the 40 arrays takes a good 2 hours. The CalMaTe processing takes ~40 minutes for the 40 arrays. (The PSCBS step is done on a single tumor-normal pair and a single chromosome and takes only a couple of minutes.) Can we cut this data set down to, say, 10 tumor-normal pairs to speed up these redundancy tests?

BTW, it turns out that I've disabled the 2nd set of tests for some reason; maybe because there were race conditions where the two of them tried to run the same AS-CRMAv2 pipeline on the same set of arrays at the same time?

R-devel / CRAN errors

CRAN servers report errors on R-devel ("3.6.0") with all internal tests enabled:

 [14s/14s] ERROR
Running the tests in ‘tests/AffymetrixCelSet.R’ failed.
Last 13 lines of output:
  Checksum on record: 307006493ef52faf137da0bd84c59ca8
  [2018-04-30 13:23:28] Exception: Failed to setup a data set for any of 1 data directories located. The following reasons were reported: (1) Do not know how to unwrap object: SampleAnnotationSet (while trying './rawData/FusionSDK_HG-Focus/HG-Focus').
  
    at #03. byName.AffymetrixCelSet(static, ...)
            - byName.AffymetrixCelSet() is in environment 'aroma.affymetrix'
            - originating from '<text>'
  
    at #02. byName(static, ...)
            - byName() is in environment 'R.filesets'
  
    at #01. AffymetrixCelSet$byName("FusionSDK_HG-Focus", cdf = cdf)
            - AffymetrixCelSet$byName() is local of the calling function
  
  Error: Failed to setup a data set for any of 1 data directories located. The following reasons were reported: (1) Do not know how to unwrap object: SampleAnnotationSet (while trying './rawData/FusionSDK_HG-Focus/HG-Focus').
  Execution halted
* checking PDF version of manual ... OK
* DONE

Status: 1 ERROR, 1 NOTE

doCRMAv2() error: Do not know how to unwrap object: AromaUnitTotalCnBinarySet

Since I cannot post any new questions at https://groups.google.com/forum/?oldui=1#!forum/aroma-affymetrix, I am asking for your help here.

When I tried to run the example (https://www.aroma-project.org/vignettes/PairedPSCBS-lowlevel/), an error occurred.

Here is the script:

library("aroma.affymetrix")
verbose <- Arguments$getVerbose(-10, timestamp=TRUE)
dataSet <- "GSE12702"
chipType <- "Mapping250K_Nsp"
csR <- AffymetrixCelSet$byName(dataSet, chipType=chipType)
pair <- c(T="GSM318736", N="GSM318737")
csR <- csR[indexOf(csR, pair)]
res <- doASCRMAv2(csR, verbose=verbose)

The error:

Exception: Do not know how to unwrap object: AromaUnitTotalCnBinarySet

  at #10. extract.default(ass, keep, onDuplicates = "error")
          - extract.default() is in environment 'R.utils'

  at #09. extract(ass, keep, onDuplicates = "error")
          - extract() is in environment 'R.utils'

  at #08. exportTotalAndFracB.SnpChipEffectSet(cesN, verbose = verbose, 
              fields = c("total", "fracB"))
          - exportTotalAndFracB.SnpChipEffectSet() is in environment 'aroma.affymetrix'

  at #07. NextMethod("exportTotalAndFracB", fields = fields)
          - NextMethod() is in environment 'base'

  at #06. exportTotalAndFracB.CnChipEffectSet(cesN, verbose = verbose)
          - exportTotalAndFracB.CnChipEffectSet() is in environment 'aroma.affymetrix'

  at #05. exportTotalAndFracB(cesN, verbose = verbose)
          - exportTotalAndFracB() is in environment 'aroma.affymetrix'

  at #04. doCRMAv2.AffymetrixCelSet(..., combineAlleles = FALSE)
          - doCRMAv2.AffymetrixCelSet() is in environment 'aroma.affymetrix'

  at #03. doCRMAv2(..., combineAlleles = FALSE)
          - doCRMAv2() is in environment 'aroma.affymetrix'

  at #02. doASCRMAv2.default(csR, verbose = verbose)
          - doASCRMAv2.default() is in environment 'aroma.affymetrix'

  at #01. doASCRMAv2(csR, verbose = verbose)
          - doASCRMAv2() is in environment 'aroma.affymetrix'

Error: Do not know how to unwrap object: AromaUnitTotalCnBinarySet
In addition: Warning messages:
1: In transform(y) : NaNs produced
2: In transform(y) : NaNs produced
3: In (function (...)  : NaNs produced

sessionInfo:

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] zh_CN.UTF-8/zh_CN.UTF-8/zh_CN.UTF-8/C/zh_CN.UTF-8/zh_CN.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] aroma.light_3.18.0     aroma.affymetrix_3.2.0 affxparser_1.60.0      aroma.core_3.2.1       R.devices_2.16.1      
[6] R.filesets_2.13.0      R.utils_2.10.1         R.oo_1.24.0            R.methodsS3_1.8.1     

loaded via a namespace (and not attached):
 [1] DNAcopy_1.62.0     splines_4.0.2      aroma.apd_0.6.0    R.cache_0.14.0     globals_0.13.1     tools_4.0.2       
 [7] parallel_4.0.2     R.huge_0.9.0       xfun_0.18          tinytex_0.26       PSCBS_0.65.0       matrixStats_0.57.0
[13] digest_0.6.27      R.rsp_0.44.0       base64enc_0.1-3    codetools_0.2-16   sfit_0.3.1         compiler_4.0.2    
[19] future_1.19.1      listenv_0.8.0   

traceback:

> traceback()
15: stop(cond)
14: throw.Exception(Exception(...))
13: throw(Exception(...))
12: throw.default("Do not know how to unwrap object: ", class(x)[1L])
11: throw("Do not know how to unwrap object: ", class(x)[1L])
10: extract.default(ass, keep, onDuplicates = "error")
9: extract(ass, keep, onDuplicates = "error")
8: exportTotalAndFracB.SnpChipEffectSet(cesN, verbose = verbose, 
       fields = c("total", "fracB"))
7: NextMethod("exportTotalAndFracB", fields = fields)
6: exportTotalAndFracB.CnChipEffectSet(cesN, verbose = verbose)
5: exportTotalAndFracB(cesN, verbose = verbose)
4: doCRMAv2.AffymetrixCelSet(..., combineAlleles = FALSE)
3: doCRMAv2(..., combineAlleles = FALSE)
2: doASCRMAv2.default(csR, verbose = verbose)
1: doASCRMAv2(csR, verbose = verbose)

file path:

----annotationData----chipTypes----Mapping250K_Nsp----
Mapping250K_Nsp,HB20080710.acs         Mapping250K_Nsp,monocell.CDF           Mapping250K_Nsp,monocell.CDF.md5       Mapping250K_Nsp,na31,HB20101007.ufl        Mapping250K_Nsp,na31,HB20101007.ugp    
Mapping250K_Nsp,na31,HB20101007.ugp.md5         Mapping250K_Nsp.cdf
----rawData----GSE12702----Mapping250K_Nsp----
GSM318736.CEL   GSM318737.CEL

I also tried CEL files from the GenomeWideSNP_6 array with the latest annotation files (GenomeWideSNP_6,Full.cdf, GenomeWideSNP_6,Full,na35,hg19,dbSNP141,HB20170323.ugp, GenomeWideSNP_6,Full,na35,hg19,dbSNP141,HB20170323.ufl, GenomeWideSNP_6,HB20080710.acs), but the same error occurred.

I don't know how to solve this problem. Please help me, thanks!

ROBUSTNESS: Add explicit 'stringsAsFactors' arguments [cbind, rbind]

$ for pkg in $pkgs; do echo "$pkg:"; (cd "$pkg"; grep -E "^[ \t]*[^#].*[cr]bind" -- */*.R | grep -vF stringsAsFactors;); echo; read -r -p "Press ENTER to continue ..."; done

aroma.affymetrix:
R/AffymetrixCdfFile.computeAffinities.R:      A <- cbind(charMtrx[1,,drop=TRUE] %*% affinity.basis.matrix,
R/AffymetrixCdfFile.computeAffinities.R:      A <- cbind(charMtrx[1,,drop=TRUE] %*% affinity.basis.matrix,
R/AffymetrixCdfFile.getAlleleProbePairs3.R:        data <- cbind(unitsChunk[uu], group, values)
R/AffymetrixCdfFile.getCellQuartets.R:        cellsUUMerged <- cbind(cellsUUMerged, cellsGG)
R/AffymetrixCdfFile.getProbeSequenceData.R:  data <- cbind(cell=cells, data)
R/AffymetrixCdfFile.getUnitGroupCellMap.R:  map <- cbind(ugcMap, cpMap)
R/AffymetrixCdfFile.MONOCELL.R:  mergedMap <- cbind(ugcMap[[1]], cellM=ugcMap[[2]][rr,"cell"])
R/AffymetrixCdfFile.UNIQUE.R:  mergedMap <- cbind(ugcMap[[1]], cellM=ugcMap[[2]][rr,"cell"])
R/AffymetrixCelFile.PLOT.R:  ma <- cbind(x=x, ma)
R/AffymetrixCelSet.writeSgr.R:    data <- cbind(ch, pos, data)
R/AffymetrixNetAffxCsvFile.XTRS.R:          dataKK <- Reduce(rbind, dataList[idxs])
R/AffymetrixNetAffxCsvFile.XTRS.R:      data <- cbind(unitName=unitNames, index=idxs, data)
R/AlleleSummation.R:        data <- cbind(cell=cells, data)
R/AlleleSummation.R:      data <- cbind(cell=cells, intensities=y, stdvs=sd)
R/AllelicCrosstalkCalibration.getSetsOfProbes.R:        snpsT$all <- cbind(snpsT$all, snps[[kk]])
R/ChipEffectFile.R:  map <- cbind(ugcMap, cpMap)
R/ChipEffectFile.R:  data <- cbind(map, data)
R/ChipEffectFile.xam.R:  xam <- cbind(x=x, am)
R/ChipEffectNnn.extractChromosomalDataFrame.R:  data <- cbind(dataHead, gp, dataTail)
R/ChipEffectSet.calculateBaseline.R:    ds <- cbind(intensities=ds, cell=cells)
R/ChipEffectSet.calculateBaseline.R:        ds <- cbind(intensities=ds, cell=cells)
R/ChipEffectSet.xam.R:  xam <- cbind(x=x, am)
R/ExonRmaPlm.calculateWeights.R:      y <- do.call(rbind, lapply(unit, FUN=.subset2, "eps"))
R/FirmaFile.R:  data <- cbind(map, data)
R/fitPlasqUnit.R:  paramIndMat <- cbind(
R/MatNormalization.R:  designMatrix <- cbind(nT, A, C, G, rowSums(A)^2, rowSums(C)^2, rowSums(G)^2, nT^2, log(as.integer(ms[,1])))
R/MatSmoothing.R:      nullX <- cbind(neg=matScoreNeg, pos=matScorePos)
R/ParameterCelFile.extractNnn.R:    data <- cbind(ugcMap, data)
R/ParameterCelFile.extractNnn.R:    data <- cbind(ugNames, data)
R/ParameterCelSet.R:    data <- cbind(ugcMap, data)
R/ParameterCelSet.R:    data <- cbind(ugNames, data)
R/ResidualFile.R:  data <- cbind(map, data)
R/SnpChipEffectNnn.extractCNT.R:    dataHead <- cbind(ProbeSet=unitNames, dataHead)
R/SnpChipEffectNnn.extractCNT.R:  data <- cbind(dataHead, data)
R/SnpInformation.R:    data <- cbind(data[,-cc,drop=FALSE], lss)
R/UnitTypeScaleNormalization.R:    data <- rbind(data, dataKK)
R/WeightsFile.R:  data <- cbind(map, data)

Subsetting by name on empty AffymetrixCelSet tries to read non-existing CEL file

> library("aroma.affymetrix")
> ds <- AffymetrixCelSet()
> # ds["foo"] or
> extract(ds, "foo")
Error in affxparser::readCelHeader(...) :
  Argument 'filename' should be a single file:
> traceback()
13: stop("Argument 'filename' should be a single file: ", paste(filename,
        collapse = ", "))
12: affxparser::readCelHeader(...)
11: .readCelHeader(pathname)
10: getHeader.AffymetrixCelFile(this)
9: getHeader(this)
8: getCdf.AffymetrixCelFile(getOneFile(this), ...)
7: getCdf(getOneFile(this), ...)
6: getCdf.AffymetrixCelSet(this)
5: getCdf(this)
4: clearCache.AffymetrixCelSet(res)
3: clearCache(res)
2: extract.GenericDataFileSet(ds, "foo")
1: extract(ds, "foo")

ROBUSTNESS: process() for MatNormalization does not create CEL files atomically

There are two steps in process() for MatNormalization that write CEL files. The first step creates the CEL files and writes some initial data, whereas the second step updates/rescales this data.

There is currently no way to know whether the existing CEL files are complete/valid or not. For instance, if there is an interrupt after the first step, there is nothing in the CEL files that tells us whether they're complete or not.

SUGGESTION: The first step should create CEL files with temporary filenames that the second step works from.
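
A minimal sketch of the suggested pattern, assuming plain binary files rather than CEL files: everything is written to a temporary pathname and the file is renamed to its final name only after the last step has finished, so the existence of the final file implies that it is complete. The helper write_atomically() and the ".tmp" suffix are hypothetical names chosen for this illustration.

```r
# Hypothetical helper: 'writer' does all writing against a temporary pathname
write_atomically <- function(pathname, writer) {
  tmp <- paste0(pathname, ".tmp")
  if (file.exists(tmp)) file.remove(tmp)  # leftover from an interrupted run
  writer(tmp)
  # Rename only after all steps succeeded
  if (!file.rename(tmp, pathname)) {
    stop("Failed to rename ", tmp, " to ", pathname)
  }
  invisible(pathname)
}

out <- file.path(tempdir(), "example,normalized.dat")
unlink(out)

write_atomically(out, function(p) {
  y <- as.double(1:10)
  writeBin(y, con = p)                       # step 1: initial data
  y <- readBin(p, what = "double", n = 10L)
  writeBin(y / max(y), con = p)              # step 2: rescale in place
})

values <- readBin(out, what = "double", n = 10L)
print(values)
```

If the process is interrupted between the two steps, only the ".tmp" file exists and the next run can safely start over.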

ROBUSTNESS: Add explicit 'stringsAsFactors' arguments [read.table]

$ (cd aroma.affymetrix; grep -A 1 -F "read.table" -- R/*.R;)
R/AffymetrixCsvGenomeInformation.R:  df <- read.table(pathname, colClasses=colClasses, header=TRUE, sep=",", quote="\"", fill=TRUE, check.names=FALSE, na.strings=c("---"), ...)
R/AffymetrixCsvGenomeInformation.R-
--
R/CnChipEffectSet.importFromDChip.R:    unitNames <- read.table(file=con, colClasses=colClasses, sep=sep, header=FALSE)
R/CnChipEffectSet.importFromDChip.R-    unitNames <- unlist(unitNames, use.names=FALSE)
--
R/CnChipEffectSet.importFromDChip.R:    data <- read.table(file=con, colClasses=colClasses, sep=sep,
R/CnChipEffectSet.importFromDChip.R-                 header=FALSE, comment.char="", quote="", fill=FALSE)

Updating from 3.0.0 to 3.1.0 failed

Hi,
I have tried to update my Aroma installation to the new 3.1.0 release by using:
'source('http://callr.org/install#aroma.affymetrix')'
But that failed.

Please see the R session info below:

source('http://callr.org/install#aroma.affymetrix')
Installing/updating package: ‘aroma.affymetrix’

Package updates:
Previous Current
aroma.affymetrix 3.0.0 3.0.0

Package installed/updated. Load it by:

library('aroma.affymetrix')

(These commands have been added to your commands history.)

sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] tools_3.2.1

MOUSEDIVm520650 CDF file

Dear Dr Bengtsson,

I don't suppose you know where to obtain the CDF file for MOUSEDIVm520650 array? The relevant .CEL files are at the web site (http://churchill-lab.jax.org/website/MDA), but I cannot see the relevant CDF file (which is also absent in bioconductor).

I ask on the off-chance you might know -- you use these arrays in your Aroma software package.

Many thanks!

TESTS: testScripts/futures/* to utilize future.BatchJobs

Update testScripts/futures/* to utilize future.BatchJobs, cf.

if (future::supportsMulticore()) strategies <- c(strategies, "multicore")
if (packageVersion("future") > "0.10.9") strategies <- c(strategies, "multisession")
if (require("async")) {
  strategies <- c(strategies, "batchjobs")
  async::backend("local")
}

BUG: R CMD check "Note: break used in wrong context: no loop is visible"

From https://www.r-project.org/nosvn/R.check/r-devel-linux-x86_64-debian-clang/aroma.affymetrix-00check.html:

using R Under development (unstable) (2018-04-03 r74516)
using platform: x86_64-pc-linux-gnu (64-bit)
using session charset: UTF-8
checking for file ‘aroma.affymetrix/DESCRIPTION’ ... OK
this is package ‘aroma.affymetrix’ version ‘3.1.0’
checking package namespace information ... OK
checking package dependencies ... OK
checking if this is a source package ... OK
checking if there is a namespace ... OK
checking for executable files ... OK
checking for hidden files and directories ... OK
checking for portable file names ... OK
checking for sufficient/correct file permissions ... OK
checking serialization versions ... OK
checking whether package ‘aroma.affymetrix’ can be installed ... WARNING
Found the following significant warnings:
Note: break used in wrong context: no loop is visible
See https://www.r-project.org/nosvn/R.check/r-devel-linux-x86_64-debian-clang/aroma.affymetrix-00install.html for details.
Information on the location(s) of code generating the ‘Note’s can be
obtained by re-running with environment variable R_KEEP_PKG_SOURCE set
to ‘yes’.
checking package directory ... OK
checking DESCRIPTION meta-information ... OK
[...]

CRAN: Errors in Windows due to failed file linking

For example, https://win-builder.r-project.org/bvgoYP5bMvLv/examples_and_tests/tests_x64/002.setupExampleData,rawData.Rout.fail:

> setupExampleData(aroma.affymetrix, dirs="rawData", mustWork=FALSE)
[2019-02-13 20:12:56] Exception: Failed to create file link: rawData/FusionSDK_HG-Focus/HG-Focus/HG-Focus-1-121502.CEL[.lnk] -> D:/temp/RtmpwJ0pox/RLIBS_2e6e859a46d90/AffymetrixDataTestFiles/rawData/FusionSDK_HG-Focus/HG-Focus/2.Calvin/HG-Focus-1-121502.CEL

  at #04. createLink.default(link = pathnameD, target = pathnameS, skip = TRUE)
          - createLink.default() is in environment 'R.utils'

  at #03. createLink(link = pathnameD, target = pathnameS, skip = TRUE)
          - createLink() is in environment 'R.utils'

  at #02. setupExampleData.AromaAffymetrix(aroma.affymetrix, dirs = "rawData", 
              mustWork = FALSE)
          - setupExampleData.AromaAffymetrix() is in environment 'aroma.affymetrix'

  at #01. setupExampleData(aroma.affymetrix, dirs = "rawData", mustWork = FALSE)
          - setupExampleData() is in environment 'aroma.affymetrix'

Error: Failed to create file link: rawData/FusionSDK_HG-Focus/HG-Focus/HG-Focus-1-121502.CEL[.lnk] -> D:/temp/RtmpwJ0pox/RLIBS_2e6e859a46d90/AffymetrixDataTestFiles/rawData/FusionSDK_HG-Focus/HG-Focus/2.Calvin/HG-Focus-1-121502.CEL
In addition: Warning message:
In system(cmd, intern = intern, wait = wait | intern, show.output.on.console = wait,  :
  running command 'C:\Windows\system32\cmd.exe /c mklink "rawData/FusionSDK_HG-Focus/HG-Focus/HG-Focus-1-121502.CEL" "D:/temp/RtmpwJ0pox/RLIBS_2e6e859a46d90/AffymetrixDataTestFiles/rawData/FusionSDK_HG-Focus/HG-Focus/2.Calvin/HG-Focus-1-121502.CEL"' had status 1
Execution halted

I can reproduce it on win-builder. These errors started to occur around mid 2018. But I don't understand why they all of a sudden started to appear, because these tests have been in place for many years without issues.

Task

  • Have R.utils::createLink() give (even) more informative error messages.
  • Reproduce these errors by adding corresponding tests to R.utils.

Command line: Turn doNnn() functions into CmdArgsFunction:s

Turn doNnn() functions into CmdArgsFunction:s, e.g.

doCRMAv2 <- CmdArgsFunction(doCRMAv2)

such that we can call these functions from the command line as:

Rscript -e aroma.affymetrix::doCRMAv2 --dataSet=HapMap270,testSet --chipType=GenomeWideSNP_6

Declare all S3 methods in NAMESPACE

Declare all S3 methods in NAMESPACE. Currently there are many missing ones, e.g.

* checking S3 generic/method consistency ... NOTE
Found the following apparent S3 methods exported but not registered:
  addColorMap.ArrayExplorer addColorMap.SpatialReporter
  addExclCells.DChipQuantileNormalization
...
[many!] 
...
  writeSignals.ProbeLevelTransform3 writeWig.CnChipEffectSet
See section ‘Registering S3 methods’ in the ‘Writing R Extensions’ manual.
* checking replacement functions ... OK
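
For reference, registering an S3 method is done with an S3method() directive in the NAMESPACE file. A fragment covering the first few methods named in the NOTE above might look as follows; it assumes that the generic is the part before the first dot and the class the part after it, and that the generics themselves are exported or imported elsewhere in the file.

```r
# NAMESPACE (fragment): one S3method(generic, class) directive per method
S3method(addColorMap, ArrayExplorer)
S3method(addColorMap, SpatialReporter)
S3method(addExclCells, DChipQuantileNormalization)
```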

TESTS: "system" on GenomeWideSNP_6 takes ~5.5 hours to complete

The redundancy tests for:

  • testScripts/system/chipTypes/GenomeWideSNP_6/
    • 11.justSNPRMA.R (~1h20m)
    • 21.doASCRMAv2,AlleleSummation.R (~2h30m)
    • 21.doASCRMAv2,ArrayExplorer.R (~0h35m)
    • 21.doASCRMAv2,GCNorm.R (~0h20m)
    • 21.doASCRMAv2,extract.R (~0h04m)
    • 21.doASCRMAv2,plotACC.R (~0h05m)
    • 21.doASCRMAv2,plots.R (~0h08m)
    • 21.doASCRMAv2,singleArray.R (~0h30m)
    • 21.doASCRMAv2.R (~0h01m)

take ~5.5 hours to complete. The data set is GSM337708 and consists of 14 CEL files. We should strip this down to processing, say, 6 CEL files.

Also, can/should we move the justSNPRMA() test to testScripts/replication/?

affyio::read.celfile.header() requires methods package to be attached?

Issue

It looks like affyio::read.celfile.header() produces an error if the methods package is not attached.

Example

> detach("package:methods")
> source("testScripts/system/chipTypes/HG-U133_Plus_2/81.doRMA,extractAffyBatch.R", echo = TRUE)

> library("aroma.affymetrix");
[...]
aroma.affymetrix v3.0.0-9000 (2016-09-04) successfully loaded. See ?aroma.affymetrix for help.

> verbose <- Arguments$getVerbose(-8, timestamp=TRUE);

> dataSet <- "GSE9890";

> chipType <- "HG-U133_Plus_2";

> csR <- AffymetrixCelSet$byName(dataSet, chipType=chipType);

> print(csR);
AffymetrixCelSet:
Name: GSE9890
Tags: 
Path: rawData/GSE9890/HG-U133_Plus_2
Platform: Affymetrix
Chip type: HG-U133_Plus_2
Number of arrays: 10
Names: GSM249671, GSM249672, GSM249673, ..., GSM249680 [10]
Time period: 2006-12-12 15:27:01 -- 2007-06-28 10:55:52
Total file size: 129.36MB

> ab <- extractAffyBatch(csR, verbose=verbose);
Loading required namespace: affy
Loading required package: hgu133plus2cdf

20170312 19:55:14|Creating AffyBatch from 10 CEL files...
20170312 19:55:14| Filenames: rawData/GSE9890/HG-U133_Plus_2/GSM249671.CEL, rawData/GSE9890/HG-U133_Plus_2/GSM249672.CEL, rawData/GSE9890/HG-U133_Plus_2/GSM249673.CEL, rawData/GSE9890/HG-U133_Plus_2/GSM249674.CEL, rawData/GSE9890/HG-U133_Plus_2/GSM249675.CEL, rawData/GSE9890/HG-U133_Plus_2/GSM249676.CEL, rawData/GSE9890/HG-U133_Plus_2/GSM249677.CEL, rawData/GSE9890/HG-U133_Plus_2/GSM249678.CEL, rawData/GSE9890/HG-U133_Plus_2/GSM249679.CEL, rawData/GSE9890/HG-U133_Plus_2/GSM249680.CEL
20170312 19:55:15| Sample names: GSM249671, GSM249672, GSM249673, GSM249674, GSM249675, GSM249676, GSM249677, GSM249678, GSM249679, GSM249680
1 reading rawData/GSE9890/HG-U133_Plus_2/GSM249671.CEL ...Error in read.celfile.header(filenames[i], info = "full") : 
  could not find function "is"
20170312 19:55:15|Creating AffyBatch from 10 CEL files...done
> traceback()
12: read.celfile.header(filenames[i], info = "full")
11: FUN(X[[i]], ...)
10: lapply(X = X, FUN = FUN, ...)
9: sapply(seq_len(length(filenames)), function(i) {
       sdate <- read.celfile.header(filenames[i], info = "full")[["ScanDate"]]
       if (is.null(sdate) || length(sdate) == 0) 
           NA_character_
       else sdate
   })
8: read.affybatch(filenames = l$filenames, phenoData = l$phenoData, 
       description = l$description, notes = notes, compress = compress, 
       rm.mask = rm.mask, rm.outliers = rm.outliers, rm.extra = rm.extra, 
       verbose = verbose, sd = sd, cdfname = cdfname)
7: ReadAffy(filenames = filenames, sampleNames = sampleNames, ..., 
       verbose = as.logical(verbose))
6: extractAffyBatch.AffymetrixCelSet(csR, verbose = verbose)
5: extractAffyBatch(csR, verbose = verbose) at 81.doRMA,extractAffyBatch.R#9
4: eval(expr, envir, enclos)
3: eval(ei, envir)
2: withVisible(eval(ei, envir))
1: source("testScripts/system/chipTypes/HG-U133_Plus_2/81.doRMA,extractAffyBatch.R", 
       echo = TRUE)

Session info

> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  base     

other attached packages:
 [1] hgu133plus2cdf_2.18.0       aroma.light_3.4.0          
 [3] aroma.affymetrix_3.0.0-9000 affxparser_1.46.0          
 [5] aroma.core_3.0.0-9000       R.devices_2.15.1           
 [7] R.filesets_2.11.0           R.utils_2.5.0              
 [9] R.oo_1.21.0                 R.methodsS3_1.7.1          
[11] R.archive_0.3.0            

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.9           AnnotationDbi_1.36.2  DNAcopy_1.48.0       
 [4] IRanges_2.8.1         BiocGenerics_0.20.0   zlibbioc_1.20.0      
 [7] aroma.apd_0.6.0       R.cache_0.12.0        globals_0.9.0-9000   
[10] tools_3.3.3           parallel_3.3.3        Biobase_2.34.0       
[13] R.huge_0.9.0          DBI_0.6               affy_1.52.0          
[16] PSCBS_0.62.0          matrixStats_0.51.0    digest_0.6.12        
[19] preprocessCore_1.36.0 affyio_1.44.0         R.rsp_0.40.0         
[22] S4Vectors_0.12.1      base64enc_0.1-4       codetools_0.2-15     
[25] memoise_1.0.0         RSQLite_1.1-2         BiocInstaller_1.24.0 
[28] methods_3.3.3         stats4_3.3.3          future_1.3.0-9000    
[31] listenv_0.6.0         Cairo_1.5-9      

Some Question About Installing this R package

Thanks for reading this question.
Thank you very much for your great R package.
I want to install and load this R package. However, I have encountered the following problem:

source('http://callr.org/install#aroma.affymetrix')
Error in file(filename, "r", encoding = encoding) :
cannot open the connection
In addition: Warning message:
In file(filename, "r", encoding = encoding) :
cannot open URL 'http://callr.org/install#aroma.affymetrix': HTTP status was '400 Bad Request'

The R version is 3.1.1.
Any help would be greatly appreciated! Thanks a lot!

Add checksum file generation for FIRMAscores.CEL files

Add checksum file generation for FIRMAscores.CEL files. Currently, only the CEL files themselves are written, with no accompanying checksum files:

firmaData/Affymetrix-HeartBrain,cerebellum+heart,RBC,QN,RMA,merged,coreR3,FIRMA,medres/HuEx-1_0-st-v2:
total 16704
-rw-rw---- 1 henrik cbc 2846948 Jan  7 12:19 huex_wta_cerebellum_A,FIRMAscores.CEL
-rw-rw---- 1 henrik cbc 2846948 Jan  7 12:20 huex_wta_cerebellum_B,FIRMAscores.CEL
-rw-rw---- 1 henrik cbc 2846948 Jan  7 12:20 huex_wta_cerebellum_C,FIRMAscores.CEL
-rw-rw---- 1 henrik cbc 2846948 Jan  7 12:20 huex_wta_heart_A,FIRMAscores.CEL
-rw-rw---- 1 henrik cbc 2846948 Jan  7 12:20 huex_wta_heart_B,FIRMAscores.CEL
-rw-rw---- 1 henrik cbc 2846948 Jan  7 12:20 huex_wta_heart_C,FIRMAscores.CEL
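
For illustration only, here is a minimal sketch of what generating such checksum files could look like using base R. The helper name write_checksum_files(), the use of MD5, and the ".md5" file name suffix are assumptions made for this example; they are not necessarily how the package does or should do it.

```r
# Hypothetical helper: write one '<file>.md5' next to each matching CEL file.
# Uses tools::md5sum() from base R; the '.md5' suffix is an assumption.
write_checksum_files <- function(path, pattern = ",FIRMAscores[.]CEL$") {
  files <- list.files(path, pattern = pattern, full.names = TRUE)
  for (file in files) {
    checksum <- unname(tools::md5sum(file))
    writeLines(checksum, sprintf("%s.md5", file))
  }
  # Return the names of the checksum files that were written
  invisible(sprintf("%s.md5", files))
}
```

Calling write_checksum_files() on the directory listed above would then add one checksum file per FIRMAscores.CEL file.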

ROBUSTNESS: Make sure search file reg exp patterns have trailing $

Make sure all regular expressions used for searching / setting up file sets have trailing $. For instance, instead of:

  pattern <- sprintf("%s[.]asb", typeTags);
  res <- AromaUnitFracBCnBinarySet$byPath(outPath, pattern=pattern);

use

  pattern <- sprintf("%s[.]asb$", typeTags);
  res <- AromaUnitFracBCnBinarySet$byPath(outPath, pattern=pattern);
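
To see why the trailing $ matters, here is a small base R sketch. The value of typeTags and the file names are made up for illustration.

```r
# Hypothetical tag and file names, for illustration only
typeTags <- "fracB"
files <- c("sample1,fracB.asb", "sample1,fracB.asb.bak", "sample1,fracBxasb")

loose     <- sprintf("%s[.]asb", typeTags)   # no anchor
strict    <- sprintf("%s[.]asb$", typeTags)  # anchored at end of name
unescaped <- sprintf("%s.asb$", typeTags)    # '.' matches any character

# Without '$', the backup file is picked up as well
loose_hits <- files[grepl(loose, files)]

# With '$' and an escaped dot, only the real *.asb file matches
strict_hits <- files[grepl(strict, files)]

# With an unescaped dot, 'fracBxasb' slips through
unescaped_hits <- files[grepl(unescaped, files)]
```

Only the anchored pattern with the escaped dot selects exactly the intended file.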

PARALLEL: PLM fit() and processing chunks in parallel

Background

For fit() of PLMs, we process chunks of units across all arrays one at a time. Each chunk loads all data, fits the model, and writes the results to output files (one per array) and some extra files.

Parallelization

Although each chunk can be read and fitted independently, it is not safe to store/write data independently/in parallel. There are two alternatives:

  1. Write the output of the chunks to temporary files. Then, in the main process, when all chunks are done (or once in a while), read-collect these files and write results to the final output data files. Delete temporary files.
  2. Since chunk output is typically smaller than input, we could run a few chunks in parallel and keep the results in memory and then write to file.

Both approaches are fairly easy to implement (the first one is the easiest) using a for loop, list environments, and futures.
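
A rough sketch of the second alternative, assuming the 'future' and 'listenv' packages are installed. The chunking and the "fit" below are placeholders (a square root per unit), not the actual PLM code; only the overall pattern is the point.

```r
library(future)
library(listenv)

plan(multisession, workers = 2)

# Placeholder chunks of unit indices (hypothetical; real chunks come from the CDF)
chunks <- split(1:12, rep(1:3, each = 4))

res <- listenv()
for (kk in seq_along(chunks)) {
  units <- chunks[[kk]]
  # Each chunk is read and fitted in its own future ...
  res[[kk]] %<-% {
    data.frame(unit = units, theta = sqrt(units))  # stand-in for the model fit
  }
}

# ... and results are collected in the main process, which is then the
# only process that writes to the output files. as.list() blocks until
# all futures are resolved.
fits <- do.call(rbind, as.list(res))

plan(sequential)
```

Because all writing happens after collection in the main process, no two workers ever touch the same output file.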

Update package title to use Title Case

From Writing R Extensions:

The mandatory 'Title' field should give a short description of the package. Some package listings may truncate the title to 65 characters. It should use title case (that is, use capitals for the principal words), not use any markup, not have any continuation lines, and not end in a period.
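
As a quick sanity check of a candidate title against these rules, something like the following could be used. The helper check_title() is hypothetical; tools::toTitleCase() is part of base R (3.2.0 and later), and comparing against it is only a heuristic.

```r
# Hypothetical helper: check a package title against the rules quoted above
check_title <- function(title) {
  c(
    within_65_chars = nchar(title) <= 65,
    single_line     = !grepl("\n", title, fixed = TRUE),
    no_final_period = !grepl("[.]$", title),
    # Heuristic only: does toTitleCase() leave the title unchanged?
    title_case      = identical(title, tools::toTitleCase(title))
  )
}

checks <- check_title("Analysis of Large Affymetrix Microarray Data Sets")
print(checks)
```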

Error: '%<=%' is defunct. Use '%<-%' instead.

I'm making a comeback to aroma.affymetrix after several years' absence.
I reused one of my formerly-working scripts, but it fails.
FWIW, it seems to fail because an operator I'm not familiar with, %<=%, is now defunct.
Output:

source("Analyse.R")
....libraries loading....
20170322 21:52:59|Background correcting data set...
20170322 21:52:59| Background correcting data set...
20170322 21:53:00| Number of arrays: 24
20170322 21:53:00| Array #1 ('GSM843555_JC-110') of 24...
20170322 21:53:00| Identifying cells to be updated...
20170322 21:53:10| Number of cells: 3095094
int [1:3095094] 13 15 17 20 24 28 29 33 34 35 ...
20170322 21:53:10| Identifying cells to be updated...done
Error: '%<=%' is defunct.
Use '%<-%' instead.
See help("Defunct")

20170322 21:53:10| Array #1 ('GSM843555_JC-110') of 24...done
20170322 21:53:10| Background correcting data set...done
20170322 21:53:10|Background correcting data set...done
script terminates here.

sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04 LTS

locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] aroma.light_3.0.0 aroma.affymetrix_3.0.0 affxparser_1.42.0
[4] aroma.core_3.0.0 R.devices_2.15.1 R.filesets_2.11.0
[7] R.utils_2.5.0 R.oo_1.21.0 R.methodsS3_1.7.1

loaded via a namespace (and not attached):
[1] DNAcopy_1.44.0 XVector_0.10.0 splines_3.2.3
[4] IRanges_2.4.8 BiocGenerics_0.16.1 zlibbioc_1.16.0
[7] aroma.apd_0.6.0 R.cache_0.12.0 globals_0.9.0
[10] tools_3.2.3 parallel_3.2.3 Biobase_2.30.0
[13] R.huge_0.9.0 gcrma_2.42.0 affy_1.48.0
[16] PSCBS_0.62.0 matrixStats_0.51.0 digest_0.6.12
[19] preprocessCore_1.32.0 affyio_1.40.0 R.rsp_0.40.0
[22] S4Vectors_0.8.11 base64enc_0.1-3 codetools_0.2-15
[25] BiocInstaller_1.20.3 Biostrings_2.38.4 stats4_3.2.3
[28] future_1.4.0 listenv_0.6.0

q()

I downloaded the aroma.affymetrix 3.0.0 source package, unpacked it, and grepped for instances of %<=%:
$ grep "%<=%" $(find . -name "*.R")
./aroma.affymetrix/R/LimmaBackgroundCorrection.R: res[[kk]] %<=% {
./aroma.affymetrix/R/LimmaBackgroundCorrection.R: } ## %<=%
./aroma.affymetrix/R/AllelicCrosstalkCalibration.R: res[[kk]] %<=% {
./aroma.affymetrix/R/AllelicCrosstalkCalibration.R: } ## %<=%
./aroma.affymetrix/R/AbstractProbeSequenceNormalization.R: res[[kk]] %<=% {
./aroma.affymetrix/R/AbstractProbeSequenceNormalization.R: } ## %<=%
./aroma.affymetrix/R/OpticalBackgroundCorrection.R: res[[ii]] %<=% {
./aroma.affymetrix/R/OpticalBackgroundCorrection.R: } ## %<=%
./aroma.affymetrix/R/MatNormalization.R: res[[ii]] %<=% {
./aroma.affymetrix/R/MatNormalization.R: } ## %<=%
./aroma.affymetrix/R/UnitModel.fitCnProbes.R: res[[kk]] %<=% {
./aroma.affymetrix/R/UnitModel.fitCnProbes.R: } # %<=%
./aroma.affymetrix/R/FragmentLengthNormalization.R: res[[kk]] %<=% {
./aroma.affymetrix/R/FragmentLengthNormalization.R: } ## %<=%
./aroma.affymetrix/R/AffymetrixCelSet.NORM.R: dataFiles[[kk]] %<=% {
./aroma.affymetrix/R/AffymetrixCelSet.convertToUnique.R: res[[kk]] %<=% {
./aroma.affymetrix/R/AffymetrixCelSet.convertToUnique.R: } ## %<=%
./aroma.affymetrix/R/RmaBackgroundCorrection.R: res[[ii]] %<=% {
./aroma.affymetrix/R/RmaBackgroundCorrection.R: } ## %<=%
./aroma.affymetrix/R/GcRmaBackgroundCorrection.R: dataFiles[[ii]] %<=% {
./aroma.affymetrix/R/GcRmaBackgroundCorrection.R: } ## %<=%

Hope this helps
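
As a stopgap for a local copy of the sources, the replacement named in the error message could be applied with a few lines of base R. This is only a sketch: the helper name patch_defunct_operator() is made up, and it assumes that swapping '%<=%' for '%<-%' is all that is needed, as the error message suggests.

```r
# Hypothetical helper: replace '%<=%' with '%<-%' in all *.R files under 'path'
patch_defunct_operator <- function(path = ".") {
  files <- list.files(path, pattern = "[.]R$", recursive = TRUE,
                      full.names = TRUE)
  patched <- character(0)
  for (file in files) {
    lines <- readLines(file, warn = FALSE)
    fixed <- gsub("%<=%", "%<-%", lines, fixed = TRUE)
    # Only rewrite files that actually contain the defunct operator
    if (!identical(lines, fixed)) {
      writeLines(fixed, file)
      patched <- c(patched, file)
    }
  }
  invisible(patched)
}
```

Running patch_defunct_operator("aroma.affymetrix/R") on the unpacked sources and reinstalling from that directory would be one way to try this out.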
