
nat.nblast's Introduction

natverse


The natverse package is a wrapper for all of the commonly used NeuroAnatomy Toolbox packages. This is convenient both for package installation and for loading/attaching these packages without many calls to library().

See http://natverse.org for more details.

Installation

The recommended approach to install the full natverse is to use the helper package natmanager, which is available on CRAN. You can therefore do:

install.packages("natmanager")
natmanager::install("natverse")

See http://natverse.org/install for more details.

Use

Once installed, you can load the natverse package:

library(natverse)
#> Loading required package: elmr
#> Loading required package: catmaid
#> Loading required package: httr
#> Warning: package 'httr' was built under R version 3.6.2
#> Loading required package: nat
#> Loading required package: rgl
#> Warning: package 'rgl' was built under R version 3.6.2
#> Registered S3 method overwritten by 'nat':
#>   method             from
#>   as.mesh3d.ashape3d rgl
#> 
#> Attaching package: 'nat'
#> The following object is masked from 'package:rgl':
#> 
#>     wire3d
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, union
#> Loading required package: nat.flybrains
#> Loading required package: nat.templatebrains
#> Loading required package: nat.nblast
# example 3D plot of some neurons
plot3d(kcs20, col=type)

Installation Details

Conflicts and Dependencies

The natverse package attaches many packages, so it is possible that there could be conflicts where functions in the natverse have the same name as functions in another package.

natverse_conflicts()
#> ── Conflicts ─────────────────────────────────────────────────────────────── natverse_conflicts() ──
#> x nat::intersect() masks base::intersect()
#> x nat::setdiff()   masks base::setdiff()
#> x nat::union()     masks base::union()
#> x nat::wire3d()    masks rgl::wire3d()

You can always choose the correct version by prefixing the function with the specific package name, e.g. nat::flip() will select the nat version rather than any other.
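For example, a small illustration of calling a masked base function explicitly (the inputs here are arbitrary):

# base::intersect() is masked by nat::intersect(), but can still be called directly
base::intersect(1:5, 3:7)
#> [1] 3 4 5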

Updates

Once installed, you can check the status of all natverse packages and their dependencies like so:

natverse_update()
#> 
#> The following packages are either locally installed or information about them is missing!
#> 
#>   blob, formattable, import, mockr, nycflights13, pingr, fafbsegdata, reticulate, nat
#> 
#> Please install them manually from their appropriate source locations
#> 
#> The following natverse dependencies are out-of-date, see details below:
#> 
#> We recommend updating them by running:
#> natverse_update(update=TRUE)
#> 
#> package         remote         local          source   repo                        status 
#> --------------  -------------  -------------  -------  --------------------------  -------
#> bit64           0.9-7.1        0.9-7          CRAN     https://cran.rstudio.com/   x      
#> data.table      1.13.0         1.12.8         CRAN     https://cran.rstudio.com/   x      
#> elmr            deb0e27df...   7a2be4537...   GitHub   natverse                    x      
#> ff              2.2-14.2       2.2-14         CRAN     https://cran.rstudio.com/   x      
#> flycircuit      1b7b48e29...   cc4594f47...   GitHub   natverse                    x      
#> git2r           0.27.1         0.26.1         CRAN     https://cran.rstudio.com/   x      
#> insectbrainr    6331b4df6...   8fef94a05...   GitHub   natverse                    x      
#> mouselightr     9c2ce1c31...   8e26b7702...   GitHub   natverse                    x      
#> nat.flybrains   28ff33213...   36c622a15...   GitHub   natverse                    x      
#> nat.jrcbrains   85ed4a791...   44c95667e...   GitHub   natverse                    x      
#> neuprintr       7403d3ce2...   8ab03b744...   GitHub   natverse                    x      
#> RCurl           1.98-1.2       1.98-1.1       CRAN     https://cran.rstudio.com/   x      
#> tibble          b4eec19dd...   3f4e5dfae...   GitHub   tidyverse                   x      
#> tidyr           1.1.0          1.0.3          CRAN     https://cran.rstudio.com/   x      
#> xfun            0.16           0.15           CRAN     https://cran.rstudio.com/   x      
#> XML             NA             3.99-0.3       CRAN     https://cran.rstudio.com/   x

You can then update like so:

natverse_update(update = TRUE)

However, if you are in a hurry and want to skip the interactive questions, use:

natverse_update(update=TRUE, upgrade = 'always')

If you want to upgrade the natverse package itself:

remotes::update_packages('natverse')

nat.nblast's People

Contributors

ajdm, alexanderbates, jdmanton, jefferis


nat.nblast's Issues

Add report generation

It would be useful to have functions that generate reports on NBLAST results, perhaps as a knitr document. For example, a histogram of scores and a 3D plot of top hits could be produced automagically, along with some clustering of those top hits.
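A rough sketch of the kind of content such a report could include, assuming scores is a named vector of NBLAST scores for one query against a target neuronlist db (both object names are hypothetical):

# histogram of scores and a 3D plot of the top 10 hits
hist(scores, breaks = 50, main = "NBLAST scores")
top <- names(sort(scores, decreasing = TRUE))[1:10]
plot3d(db[top])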

CRAN release?

Is there anything blocking a CRAN release? I notice that the spam package was recently updated – not sure if it fixed anything.

NeuriteBlast fails for FAFB neurons

Use appropriate credentials for FAFB login

library(catmaid)
source("../catmaid_fafb_login.R")
test_skids = c(21999,22132)
test_n=read.neurons.catmaid(test_skids, conn=conn)
r = NeuriteBlast(test_n[[1]], test_n[[2]])

The above script returns the following error:

Error in findDirectionVectorsFromParents(target, query, idxArray, ReturnAllIndices = TRUE, :
Some points do not have a parent: therefore impossible to calculate direction vector
In addition: Warning messages:
1: In .CleanupParentArray(d1[, "Parent"]) :
no points to choose in .CleanupParentArray using original value
2: In .CleanupParentArray(d2[, "Parent"]) :
no points to choose in .CleanupParentArray using original value

The error is from this function.

It seems wp=which(pa==p) will only work if the ids in d$Parent are row indices, which is not true for FAFB.

[BUG] in scaling when neurons don't have names

Example:

> testneurons <- readRDS('testdata/testneurons.rds')
> names(testneurons) <- NULL
> scoresaba <- nblast_allbyall(testneurons,
+                              version=2,
+                              normalisation = 'normalised')
> scoresaba
<0 x 0 matrix>
attr(,"scaled:scale")
numeric(0)
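A possible workaround until this is fixed, assuming that assigning any unique names restores the expected scoring:

# give the neurons placeholder names so the score matrix has dimnames
names(testneurons) <- paste0("neuron", seq_along(testneurons))
scoresaba <- nblast_allbyall(testneurons, version = 2, normalisation = 'normalised')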

Creating sparse matrices is rather slow

Creating a sparse matrix for 1,000 neurons from the 16,000-neuron full score matrix has been running for more than 90 minutes and still hasn't finished. This is with the full score matrix loaded into memory, so the slowness is not caused by disk access issues.
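For comparison, a minimal sketch of building a thresholded sparse matrix directly with the Matrix package (this is not the package's own routine; the subset, threshold and object names are illustrative):

library(Matrix)
dense <- full_scores[sel_1000, sel_1000]   # hypothetical 1,000-neuron subset of the full score matrix
dense[dense < 0] <- 0                      # drop scores below an arbitrary threshold
sparse_scores <- Matrix(dense, sparse = TRUE)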

nblast fails to find smat.fcwb if nblast package is not attached.

If nblast is used inside another function in a package (for example elmr) that imports but does not attach nat.nblast, it fails with this traceback:

 get(smat) 
2 nat.nblast::nblast(xdp, db, normalised = normalised, .parallel = .parallel, 
    ...) at nblast_fafb.R#64
1 nblast_fafb(27884, mirror = FALSE) 

The workaround is to attach the package. The fix will involve the scope of the get() call (it needs to be pointed at objects in the package namespace).
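Two possible workarounds sketched from the above (the second assumes the smat argument accepts a scoring matrix object rather than a name, and that the object names query_dps and target_dps stand in for real data):

library(nat.nblast)                              # workaround: attach the package
scores <- nblast(query_dps, target_dps,
                 smat = nat.nblast::smat.fcwb)   # or pass the scoring matrix explicitly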

optimise for memory for very large all by all NBLAST

  • Use a pattern of small (e.g. 100 x 100) blocks that might take tens of seconds to a few minutes to compute
  • this should work better than doing a whole row or column, which might have 20-50k neurons.
  • need to implement an x by y nblast function instead of an all-by-all NBLAST for each block (would the current NBLAST be OK? see the sketch after this list)
  • inputs could be neuronlistfh and read in for each process. I suspect that read time will be trivial compared with search time so long as blocks take tens of seconds to compute. This might work well for memory.
  • ideally we would parallelise across those blocks with progress
  • if doing mean scores, we might want to do forward and reverse scores at the same time since they use the same sets of neurons
  • we might wish to fill a sparse matrix with the results, applying a threshold
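A rough sketch of the blockwise idea, assuming dps is a dotprops neuronlist (or neuronlistfh); the block size is illustrative and the inner loop would be the unit to parallelise:

block <- 100
idx <- split(names(dps), ceiling(seq_along(dps) / block))
scores <- list()
for (i in seq_along(idx)) {
  for (j in seq_along(idx)) {
    # nblast() already supports an x by y comparison between two neuronlists
    scores[[paste(i, j)]] <- nblast(dps[idx[[i]]], dps[idx[[j]]], normalised = TRUE)
  }
}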

Multi-core performance and memory consumption

Dear all,

We have found nblast really helpful for our current project, especially when doing nblast against the FlyEM database.

On my laptop (6 cores, 12 threads), it takes about 4 minutes for a one-against-all NBLAST when running on a single core.

As I wanted to reduce this time, I used doParallel to define a multi-core backend and ran NBLAST with .parallel = TRUE. Interestingly, I could confirm that all 12 of my cores were running, with 100% RAM consumption, but the same task ended up taking more than 10 minutes.

Then I tried running NBLAST on only two cores to avoid the high memory consumption, and it took 5 minutes for the task.

Taking the longer time and high memory consumption into consideration, I am a little confused about how exactly nblast uses .parallel. Since I have a machine with 4 processors (40 cores, 80 threads) and 48 GB RAM, and my dps_flyEM object is 2.32 GB, would it be best to run NBLAST on only 16 cores rather than 80?

Best wishes,
Jiajun Zhang
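For reference, a sketch of the kind of parallel setup described above, assuming doParallel as the backend; the core count and the query object name are illustrative:

library(doParallel)
registerDoParallel(cores = 16)              # choose a core count that fits in RAM
scores <- nblast(query_dps, dps_flyEM, .parallel = TRUE, normalised = TRUE)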

Make documentation for nblast comprehensive

nblast() should be kept as simple as possible for novice users, with the more advanced arguments being dealt with in NeuriteBlast() and WeightedNNBasedLinesetMatching(). Links should be present to these more advanced methods in nblast()'s documentation.

add plot3d.nblastres function and give nblast option to return per segment scores

In order to satisfy one of the reviewer comments we should add a new function / example that shows which points are being matched for a pair of neurons and colours one of the neurons by the quality of the match. One way to do this would be to allow the nblast function to return per-segment scores (perhaps wrapping them in an object with a class like nblastres). A corresponding plot3d method could then be used to make a plot with sensible defaults.

Alternatively, a lower tech version would be to include an example in the nblast docs.

Collecting per-segment results could be done by playing with the NNDistFun argument (which gets passed down to WeightedNNBasedLinesetMatching.default).
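A hedged sketch of collecting per-segment information via that route (the object names are hypothetical; the NNDistFun = list usage follows the example in the next issue):

# with NNDistFun = list, per-segment distances/dot products are returned
# rather than a single summed score (assumption based on the issue above)
segdists <- WeightedNNBasedLinesetMatching.dotprops(target_dps, query_dps,
                                                    NNDistFun = list)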

add fitting functions for nat.nblast

We need fitting functions such as makeprobmat and scorematrix. Distances can be collected like this:

# collect nearest-neighbour distance information for every pair of DL2 neurons
DL2nnlist = list()
for (n1 in DL2names) {
  DL2nnlist[[n1]] = lapply(fcupndps[DL2names[DL2names != n1]],
                           WeightedNNBasedLinesetMatching.dotprops,
                           fcupndps[[n1]], NNDistFun = list)
}

fix imports for CRAN

We should submit a version to CRAN to accompany the paper. v1.5 is coming up with errors on r-devel due to stricter namespace checking. These buglets still exist.

https://www.r-project.org/nosvn/R.check/r-devel-osx-x86_64-clang/nat.nblast-00check.html

checking R code for possible problems ... NOTE
WeightedNNBasedLinesetDistFun : <anonymous>: no visible global function
  definition for ‘dnorm’
nhclust: no visible binding for global variable ‘as.dist’
nhclust: no visible global function definition for ‘hclust’
plot3d.hclust: no visible binding for global variable ‘rainbow’
show_similarity: no visible global function definition for
  ‘colorRampPalette’
sub_dist_mat: no visible global function definition for ‘as.dist’
Undefined global functions or variables:
  as.dist colorRampPalette dnorm hclust rainbow
Consider adding
  importFrom("grDevices", "colorRampPalette", "rainbow")
  importFrom("stats", "as.dist", "dnorm", "hclust")
to your NAMESPACE file.
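A minimal sketch of the fix, following the check NOTE's suggestion (shown as roxygen tags that generate the corresponding NAMESPACE directives):

#' @importFrom grDevices colorRampPalette rainbow
#' @importFrom stats as.dist dnorm hclust
NULL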

Have nblast work with Labels in given neuronlist/dotprops objects

Have a UseLabels option for nblast, which would work similarly to UseAlpha. Labels could be numeric (2) or character ("axon").
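A sketch of what the proposed interface might look like (purely hypothetical; the option does not exist yet, and the object names are illustrative):

# restrict the comparison to points labelled as axon, or to numeric label 2
scores_axon <- nblast(query_dps, target_dps, UseLabels = "axon")
scores_lab2 <- nblast(query_dps, target_dps, UseLabels = 2)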

Some sample dotprops data from the hemibrain: two sets of cell types that should match within each set but not between sets, also with differing axon/dendrite locations: PD2a, PD2b1, AV1a1, LHCENT1, LHCENT2, LHCENT3.

Code used to fetch:

library(hemibrainr)
db = hemibrain_neurons()
chosen = subset(db, type%in%c("LHPD2a1","LHPD2b1_a","LHPD2b1_b","LHAV1a1","LHCENT1","LHCENT2","LHCENT3"))
dps = dotprops(chosen)
table(chosen[[1]]$d$Label)
table(dps[[1]]$labels)

Output:

> table(chosen[[1]]$d$Label)                                                           

    0     2     3     4     7 
10809  3186 41315   213   161 
> table(dps[[1]]$labels)

    0     2     3     4     7 
10809  3186 41315   213   161 

Where:
0 - non-computing (no 'flow'), possibly erroneous
1 - soma
2 - axon
3 - dendrite
4 - linker
7 - cell body fibre

dps.rda.zip

Add nblast2

Currently, nblast() is equivalent to nblast1, which used the negative exponential weighting with sigma based on registration error.
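For reference, the algorithm version can now be selected explicitly via the version argument (as also used in the nblast_allbyall() example earlier on this page); the object names below are illustrative:

scores_v2 <- nblast(query_dps, target_dps, version = 2)   # scoring-matrix based algorithm
scores_v1 <- nblast(query_dps, target_dps, version = 1)   # older negative exponential weighting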

nblast for swc files

Hi, nice library

Recently, I used nblast to search for neurons similar to ones I produced myself. The neurons were saved as .swc files, and I read them in with the following code:

newskel1 = read.neuron.swc("/usr/skel_1.swc")
newskel2 = read.neuron.swc("/usr/skel_2.swc")

Then I combined the neurons into a neuronlist:

newskel = neuronlist(newskel1, newskel2)

and then ran nblast:

scores = nblast(newskel1, newskel)

but it gives me the following:

more than 1 point in .CleanupParentArray, choosing first from: 2 11
more than 1 point in .CleanupParentArray, choosing first from: 13 15
more than 1 point in .CleanupParentArray, choosing first from: 191 196
more than 1 point in .CleanupParentArray, choosing first from: 202 208
Warning messages:
1: In .CleanupParentArray(d1[, "Parent"]) :
2: In .CleanupParentArray(d1[, "Parent"]) :
3: In .CleanupParentArray(d1[, "Parent"]) :
4: In .CleanupParentArray(d1[, "Parent"]) :

So I am just wondering: is the neuronlist I created incorrect? What should I do?

Then I tried converting each neuron to a neuronlist first:

newskel1 = as.neuronlist(newskel1)
newskel2 = as.neuronlist(newskel2)
newskel = neuronlist(newskel1, newskel2)
scores = nblast(newskel1, newskel)

but it gives me the following:

Error in `[.data.frame`(df, i, j) : undefined columns selected
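A possible fix sketch, assuming the intended NBLAST workflow compares dotprops objects rather than raw skeletons (the resample and k values are illustrative and assume the coordinates are in microns):

newskel1 = read.neuron.swc("/usr/skel_1.swc")
newskel2 = read.neuron.swc("/usr/skel_2.swc")
# convert the skeletons to dotprops before running nblast
dps = dotprops(neuronlist(skel_1 = newskel1, skel_2 = newskel2), resample = 1, k = 5)
scores = nblast(dps[[1]], dps)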

Release nat.nblast 1.6.5

Prepare for release:

  • Check that description is informative
  • Check licensing of included files
  • usethis::use_cran_comments()
  • devtools::check()
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • Polish pkgdown reference index

Submit to CRAN:

  • usethis::use_version('patch')
  • Update cran-comments.md
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_github_release()
  • usethis::use_dev_version()
  • Update install instructions in README
  • Tweet
