natverse / nat.nblast
R package implementing the NBLAST neuron search algorithm, as an add-on for the NeuroAnatomy Toolbox (nat) R package.
Home Page: http://natverse.github.io/nat.nblast
is.matrix is false
and ideally give an example replicating a relevant figure from the paper. Note that Kohl 2013 immediately mentions a normalised score
Hi @jefferis et al. Could you please generate a Zenodo DOI for nat.nblast?
nblast() should be kept as simple as possible for novice users, with the more advanced arguments being dealt with in NeuriteBlast() and WeightedNNBasedLinesetMatching(). Links to these more advanced methods should be present in nblast()'s documentation.
For example, if just one group is selected, it should be possible to rainbow-colour its members. A lot of people expect to do this, and the current message is not very clear in this circumstance.
To simplify all by all blast
Example here:
https://groups.google.com/forum/#!topic/nat-user/Oe40-OjRDd8
Paavo also had the same error recently. It happens when nabor::knn is called with a matrix containing 0 rows. Maybe nat.nblast:::WeightedNNBasedLinesetMatching.default should return NA if either of the inputs has 0 rows.
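The guard suggested above could look roughly like the sketch below. It is not the package's code: match_or_na is a hypothetical wrapper, and score_fun stands in for whatever function actually performs the matching.

```r
# Hypothetical wrapper: return NA instead of calling the matching function
# when either point matrix is empty.
match_or_na <- function(target, query, score_fun) {
  if (nrow(target) == 0L || nrow(query) == 0L) {
    return(NA_real_)
  }
  score_fun(target, query)
}

# Stand-in scoring function, for illustration only.
dummy_score <- function(target, query) nrow(target) + nrow(query)

empty <- matrix(numeric(0), ncol = 3)  # 0 rows, 3 columns
pts   <- matrix(1:6, ncol = 3)         # 2 rows, 3 columns

match_or_na(empty, pts, dummy_score)  # NA
match_or_na(pts, pts, dummy_score)    # 4
```

Returning NA rather than raising an error keeps a long all-by-all run going when a single neuron turns out to be empty.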
spam seems a bit ...
but RcppEigen would require R class defs etc
Have a UseLabels option for nblast, which would work similarly to UseAlpha. Labels could be numeric (2) or character ("axon").
Some sample dotprops data from the hemibrain: two sets of cell types that should match to each other but not between, also with differing axon/dendrite locations: PD2a, PD2b1, AV1a1, LHCENT1, LHCENT2, LHCENT3:
Code used to fetch:
library(hemibrainr)
db = hemibrain_neurons()
chosen = subset(db, type%in%c("LHPD2a1","LHPD2b1_a","LHPD2b1_b","LHAV1a1","LHCENT1","LHCENT2","LHCENT3"))
dps = dotprops(chosen)
table(chosen[[1]]$d$Label)
table(dps[[1]]$labels)
Output:
> table(chosen[[1]]$d$Label)
    0     2     3     4     7
10809  3186 41315   213   161
> table(dps[[1]]$labels)
    0     2     3     4     7
10809  3186 41315   213   161
Where:
0 - non-computing (no 'flow'), possibly erroneous
1 - soma
2 - axon
3 - dendrite
4 - linker
7 - cell body fibre
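If labels may be given either as a number or as a name, a small lookup built from the codes listed above can normalise them before any matching is done. This is only a sketch: label_codes and as_label_code are hypothetical names, and nothing here is part of nat.nblast.

```r
# Label codes as listed above.
label_codes <- c("non-computing" = 0, soma = 1, axon = 2, dendrite = 3,
                 linker = 4, "cell body fibre" = 7)

# Hypothetical helper: accept a numeric code (2) or a name ("axon")
# and always return the numeric code.
as_label_code <- function(label) {
  if (is.character(label)) {
    code <- unname(label_codes[label])
    if (anyNA(code)) stop("unknown label: ", label)
    return(code)
  }
  label
}

# Keep only the points carrying the requested label.
point_labels <- c(0, 2, 2, 3, 3, 3, 7)
keep <- point_labels == as_label_code("axon")
which(keep)  # 2 3
```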
I was wondering if you had a dataset listing all the neurons belonging to each cluster/supercluster from online in your nblast R package, or whether this information is only accessible on the website. Thanks very much!
edit: Just found https://github.com/jefferislab/NBLAST_clusters_online --thanks!
Prepare for release:
usethis::use_cran_comments()
devtools::check()
devtools::check_win_devel()
rhub::check_for_cran()
Submit to CRAN:
usethis::use_version('patch')
cran-comments.md
devtools::submit_cran()
Wait for CRAN...
usethis::use_github_release()
usethis::use_dev_version()
Hi, nice library.
Recently I used nblast to search for similar neurons among ones that I produced myself. The neurons were saved as .swc, and I read them with the following code:
newskel1 = read.neuron.swc("/usr/skel_1.swc")
newskel2 = read.neuron.swc("/usr/skel_2.swc")
Then I combined the neurons into a neuronlist
newskel = neuronlist(skel1, skel2)
and then nblast
scores = nblast(skel1, newskel)
but it gives me the following:
more than 1 point in .CleanupParentArray, choosing first from: 2 11
more than 1 point in .CleanupParentArray, choosing first from: 13 15
more than 1 point in .CleanupParentArray, choosing first from: 191 196
more than 1 point in .CleanupParentArray, choosing first from: 202 208
Warning messages:
1: In .CleanupParentArray(d1[, "Parent"]) :
2: In .CleanupParentArray(d1[, "Parent"]) :
3: In .CleanupParentArray(d1[, "Parent"]) :
4: In .CleanupParentArray(d1[, "Parent"]) :
So I am wondering: is the neuronlist I created incorrect? What should I do?
Then I tried converting each neuron with as.neuronlist first:
newskel1 = as.neuronlist(newskel1)
newskel2 = as.neuronlist(newskel2)
newskel = neuronlist(skel1, skel2)
scores = nblast(skel1, newskel)
but it gives me the following:
Error in `[.data.frame`(df, i, j) : undefined columns selected
As the conversion happens in WeightedNNBasedLinesetMatching(), this means that the objects are converted for every comparison, not just once at the start, making everything much slower than it needs to be.
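The general remedy is to hoist the conversion out of the comparison loop. The sketch below illustrates the idea with stand-ins only: convert and compare are hypothetical placeholders, not functions from nat or nat.nblast, and the call counter is there just to make the saving visible.

```r
# Hypothetical stand-ins: `convert` counts how often it is called,
# `compare` is a placeholder for the real pairwise comparison.
n_conversions <- 0
convert <- function(x) {
  n_conversions <<- n_conversions + 1
  as.matrix(x)
}
compare <- function(a, b) sum(abs(colMeans(a) - colMeans(b)))

query   <- data.frame(X = 1:3, Y = 4:6, Z = 7:9)
targets <- replicate(5, data.frame(X = 1:3, Y = 4:6, Z = 7:9), simplify = FALSE)

# Convert every object exactly once, before any comparison is made.
query_m   <- convert(query)
targets_m <- lapply(targets, convert)

scores <- vapply(targets_m, function(t) compare(query_m, t), numeric(1))
n_conversions  # 6: one query plus five targets
```

Converting inside compare instead would cost two conversions per comparison, so ten for the same five comparisons.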
Is there anything blocking a CRAN release? I notice that there was recently a spam update – not sure if it fixed anything.
This presently only works for me when nat.as is loaded and plot3d.character is aliased to plot3dfc.
See natverse/nat#275
Since you need to have a different scoring matrix when using the alpha factor to scale the calculated dot products.
In order to satisfy one of the reviewer comments, we should add a new function / example that shows which points are being matched for a pair of neurons and colours one of the neurons by the quality of the match. One way to do this would be to allow the nblast function to return per-segment scores (perhaps wrapping them in an object with a class like nblastres). A corresponding plot3d method could then be used to make a plot with sensible defaults.
Alternatively, a lower-tech version would be to include an example in the nblast docs.
Collecting per-segment results could be done by playing with the NNDistFun argument (which gets passed down to WeightedNNBasedLinesetMatching.default).
makeprobmat, scorematrix. Distances like this:
DL2nnlist=list()
for(n1 in DL2names){
  DL2nnlist[[n1]]=lapply(fcupndps[DL2names[DL2names!=n1]],
    WeightedNNBasedLinesetMatching.dotprops,fcupndps[[n1]],NNDistFun=list)
}
Currently, nblast() is equivalent to nblast1, which used the negative exponential weighting with sigma based on registration error.
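The exact form of the weighting is not spelled out here. One plausible reading, suggested by the dnorm call that the check output further down this page reports for WeightedNNBasedLinesetDistFun, is a Gaussian weight on the nearest-neighbour distance, scaled so that a distance of 0 gives a weight of 1. The sketch below shows only that assumed form; distance_weight is a hypothetical name and sigma = 3 is an arbitrary illustrative value.

```r
# Sketch of a Gaussian distance weight, scaled so that distance 0 gives 1.
# sigma = 3 is an arbitrary illustrative value, not a package default.
distance_weight <- function(d, sigma) {
  dnorm(d, sd = sigma) / dnorm(0, sd = sigma)
}

d <- c(0, 1, 3, 10)
w <- distance_weight(d, sigma = 3)

# Equivalent closed form: exp(-d^2 / (2 * sigma^2))
all.equal(w, exp(-d^2 / (2 * 3^2)))  # TRUE
```

The weight falls monotonically with distance, so point pairs further apart than a few sigma contribute almost nothing.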
We're going to want to make these accessible in other contexts
Creating a sparse matrix for 1,000 neurons from the 16,000-neuron full score matrix has been running for more than 90 minutes and still hasn't finished. This is with the full score matrix loaded into memory, so the slowness is not caused by disk access issues.
Example:
> testneurons <- readRDS('testdata/testneurons.rds')
> names(testneurons) <- NULL
> scoresaba <- nblast_allbyall(testneurons,
+ version=2,
+ normalisation = 'normalised')
> scoresaba
<0 x 0 matrix>
attr(,"scaled:scale")
numeric(0)
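Since the empty result appears after the names were set to NULL, one possible workaround is to make sure the input carries names before it is passed on. Whether this avoids the empty matrix has not been checked here; ensure_names is a hypothetical helper and a plain list stands in for the neuronlist.

```r
# Hypothetical helper: give a list fallback names when it has none.
ensure_names <- function(x, prefix = "neuron") {
  if (is.null(names(x))) {
    names(x) <- paste0(prefix, seq_along(x))
  }
  x
}

unnamed <- list(1, 2, 3)          # stand-in for a neuronlist without names
named   <- ensure_names(unnamed)
names(named)  # "neuron1" "neuron2" "neuron3"
```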
... and list of dot prods/distances
or should we have a db argument and just name the neurons?
In general I think this approach would at least simplify naming while still providing a convenient entry point.
Dear all,
We have found nblast really helpful to our current project, especially when doing nblast against the FlyEM database.
On my laptop (6 cores, 12 threads), a one-against-all NBlast takes about 4 min when running on a single core.
As I wanted to reduce the time, I used doParallel to define a multi-core backend and ran NBlast with .parallel = TRUE. Interestingly, I could confirm that all my 12 cores were running, with RAM consumption at 100%, and the same task ended up taking more than 10 min.
Then I tried running NBlast on only two cores to avoid the high memory consumption, and the task took 5 min.
Given the longer run time and the high memory consumption, I am a little confused about how exactly nblast uses .parallel. As I have a machine with 4 processors (40 cores, 80 threads) and 48 GB RAM, and my dps_flyEM is 2.32 GB, would it be best to run NBlast on only 16 cores rather than 80?
Best wishes,
Jiajun Zhang
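A rough way to reason about the numbers in this question is to cap the worker count by memory. The sketch below assumes, without confirmation from nat.nblast, that each worker ends up holding its own full copy of the data; max_workers is a hypothetical helper and the 8 GB reserve is an arbitrary example.

```r
# Hypothetical rule of thumb: cap workers so that one data copy per worker
# still fits in RAM. Real memory use per worker may differ.
max_workers <- function(total_ram_gb, data_gb, n_threads, reserve_gb = 0) {
  by_memory <- floor((total_ram_gb - reserve_gb) / data_gb)
  max(1, min(n_threads, by_memory))
}

max_workers(48, 2.32, 80)                  # 20
max_workers(48, 2.32, 80, reserve_gb = 8)  # 17
```

Under that assumption the 80 threads are never the limiting factor on this machine; the 48 GB of RAM is.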
if indices = NULL we get an error
Currently ignored
It would be useful to have functions that generate reports on NBLAST results, perhaps as a knitr document. For example, a histogram of scores and a 3D plot of top hits could be produced automagically, along with some clustering of those top hits.
@dokato a placeholder
Use appropriate credentials for FAFB login
library(catmaid)
source("../catmaid_fafb_login.R")
test_skids = c(21999,22132)
test_n=read.neurons.catmaid(test_skids, conn=conn)
r = NeuriteBlast(test_n[[1]], test_n[[2]])
The above script returns the following error:
Error in findDirectionVectorsFromParents(target, query, idxArray, ReturnAllIndices = TRUE, :
Some points do not have a parent: therefore impossible to calculate direction vector
In addition: Warning messages:
1: In .CleanupParentArray(d1[, "Parent"]) :
no points to choose in .CleanupParentArray using original value
2: In .CleanupParentArray(d2[, "Parent"]) :
no points to choose in .CleanupParentArray using original value
The error is from this function.
It seems wp=which(pa==p) will only work if the ids in d$Parent are row indices, which is not true for FAFB.
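One way to make such a lookup independent of the id scheme is to translate parent ids into row indices first. The sketch below uses a toy table; the PointNo column name follows the usual SWC layout and is an assumption here, and this is not the fix adopted in the package.

```r
# Toy SWC-style table whose ids are arbitrary, as in the FAFB case above.
d <- data.frame(
  PointNo = c(1001, 1005, 1009, 1020),
  Parent  = c(-1, 1001, 1005, 1005)
)

# Translate each parent id into the row index of that parent.
# The root (Parent = -1) has no match and becomes NA.
parent_row <- match(d$Parent, d$PointNo)
parent_row  # NA 1 2 2

# Rows whose parent is row 2, found via row indices rather than raw ids.
which(parent_row == 2)  # 3 4
```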
for parallelisation and progress
These days it is much more likely that an error while searching is caused by the wrong invocation, resulting in thousands of error messages being sent to the console.
Make it optionally fault tolerant using e.g. nlapply
If using nblast inside another function when importing but not attaching nat.nblast (here from the elmr package):
get(smat)
2 nat.nblast::nblast(xdp, db, normalised = normalised, .parallel = .parallel,
...) at nblast_fafb.R#64
1 nblast_fafb(27884, mirror = FALSE)
The workaround is to attach the package. A fix will involve the scope of the get statement (pointing it at objects in the package).
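A lookup by name can be made independent of whether a package is attached by asking that package for the object directly. The sketch below shows the general technique only, not the change made in nat.nblast: tools and file_ext are stand-ins for a package that is installed but usually not attached and for an object it exports.

```r
# `tools` is used only as an example of a package that is installed
# but usually not attached; substitute the package that owns the object.
obj_name <- "file_ext"

# Look the name up in the package's exports instead of the search path.
f <- getExportedValue("tools", obj_name)
f("scores.rds")  # "rds"
```

A bare get(obj_name) searches the calling frames and the search path, which is why it only succeeds once the owning package has been attached.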
We should submit a version to CRAN to accompany the paper. v 1.5 is coming up with errors on r-devel due to stricter namespace checking. These buglets still exist.
https://www.r-project.org/nosvn/R.check/r-devel-osx-x86_64-clang/nat.nblast-00check.html
checking R code for possible problems ... NOTE
WeightedNNBasedLinesetDistFun : <anonymous>: no visible global function
definition for ‘dnorm’
nhclust: no visible binding for global variable ‘as.dist’
nhclust: no visible global function definition for ‘hclust’
plot3d.hclust: no visible binding for global variable ‘rainbow’
show_similarity: no visible global function definition for
‘colorRampPalette’
sub_dist_mat: no visible global function definition for ‘as.dist’
Undefined global functions or variables:
as.dist colorRampPalette dnorm hclust rainbow
Consider adding
importFrom("grDevices", "colorRampPalette", "rainbow")
importFrom("stats", "as.dist", "dnorm", "hclust")
to your NAMESPACE file.