Comments (15)
i think i know why this error happens. i am mixing matlab's mex memory
allocs with c library allocs; usually matlab mex works fine until
some allocs are not freed correctly, and after a while matlab will just
segfault. i think i have some allocs that are not deallocated correctly.
i will take a look at it and fix the bug. also, i found the solution to the bug
from the other issue and will post that too.
thanks a lot.
Original comment by abhirana
on 11 Nov 2011 at 8:03
- Changed state: Started
- Added labels: Priority-Critical
- Removed labels: Priority-Medium
from randomforest-matlab.
hi
can you check out the svn source? it has the fix for the memory issues.
i ran a few thousand iterations (parfor) with your dataset and parameters
and it did not crash, so i think it should be fine now. i used visual studio
2010 for compiling on 64-bit windows 7 and matlab 7.12 with 16gb ram.
thanks
Original comment by abhirana
on 11 Nov 2011 at 10:13
Hi, I checked for 15K iterations, and it worked well. Thanks!
If it doesn't work in my application, I'll let you know.
Original comment by [email protected]
on 14 Nov 2011 at 8:19
Hi,
I can't compile this new version (mxCalloc, mxFree) on a 64-bit Linux system.
Here is the error:
g++ -fpic -O2 -funroll-loops -msse3 -Wall -c src/classTree.cpp -o tempbuild/classTree.o
src/classTree.cpp:42:20: error: matrix.h: No such file or directory
src/classTree.cpp: In function ‘void catmax_(double*, double*, double*, int*, int*, int*, double*, int*, int*, int*, int*)’:
src/classTree.cpp:102: error: ‘mxCalloc’ was not declared in this scope
src/classTree.cpp:145: error: ‘mxFree’ was not declared in this scope
src/classTree.cpp: In function ‘void predictClassTree(double*, int, int, int*, int*, double*, int*, int*, int, int*, int, int*, int*, int)’:
src/classTree.cpp:226: error: ‘mxCalloc’ was not declared in this scope
src/classTree.cpp:258: error: ‘mxFree’ was not declared in this scope
Thanks,
MJ
Original comment by [email protected]
on 14 Mar 2012 at 8:44
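A note on the errors above: mxCalloc and mxFree are declared in MATLAB's matrix.h, so a plain g++ build that cannot find that header fails exactly as shown. One common way to keep a source file buildable both as a MEX file and as a standalone object is to route allocations through a small wrapper. The sketch below is an assumption for illustration, not the change that was made in the repository (the reply that follows only says the makefile was fixed). rf_calloc and rf_free are hypothetical names; MATLAB_MEX_FILE is the macro that the mex command defines during a MEX build.

```cpp
#include <cstddef>
#include <cstdlib>

#ifdef MATLAB_MEX_FILE
#include "matrix.h"  // declares mxCalloc / mxFree; only available in a MEX build
#endif

// Hypothetical wrapper (not part of randomforest-matlab):
// zero-initialized allocation from MATLAB's allocator in a MEX build,
// from the C library otherwise.
inline void* rf_calloc(std::size_t count, std::size_t size) {
#ifdef MATLAB_MEX_FILE
    return mxCalloc(count, size);
#else
    return std::calloc(count, size);
#endif
}

// Hypothetical wrapper: releases memory with the deallocator that matches
// rf_calloc, so the two allocator families are never mixed.
inline void rf_free(void* ptr) {
#ifdef MATLAB_MEX_FILE
    mxFree(ptr);
#else
    std::free(ptr);
#endif
}
```

Every buffer obtained from rf_calloc is released with rf_free, which also avoids the mixed mex/C allocation problem described at the top of this thread.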
hi m.seyed
could you try out the latest source? i fixed the makefile.
do tell if you still have any issues.
thanks for letting me know about the bug.
Original comment by abhirana
on 14 Mar 2012 at 9:16
Hi Abhirana,
Thanks for the quick response. It is now working.
Best,
MJ
Original comment by [email protected]
on 14 Mar 2012 at 9:21
Hi,
I still get a segmentation fault in MATLAB when I try to train a random forest
model on a 2000000 by 2000 matrix. I just thought I'd report the
problem here.
My inputs are integers (I converted them to double before passing them to
random forest) and they range from 0 to 3500.
Best,
MJ
Original comment by [email protected]
on 14 Mar 2012 at 10:22
Hi MJ
do you have enough RAM? i am guessing your dataset requires at least 32GB just
for storing the array and another 32-50GB for the internal working of RF
(considering 8 bytes for a double and 4 billion values, i.e. 2000000 x 2000).
also this might just be too large a dataset for RF to handle and finish in a
reasonable time.
Original comment by abhirana
on 14 Mar 2012 at 10:34
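The 32GB figure above can be checked with a few lines of arithmetic. This is only a sketch of the calculation: dense_double_bytes and bytes_to_gb are hypothetical helper names, and the 8 bytes per double is the assumption stated in the comment.

```cpp
#include <cstdint>

// Assumption taken from the comment above: one double occupies 8 bytes.
const std::uint64_t kBytesPerDouble = 8;

// Bytes needed to store a dense rows x cols matrix of doubles.
inline std::uint64_t dense_double_bytes(std::uint64_t rows, std::uint64_t cols) {
    return rows * cols * kBytesPerDouble;
}

// Convert bytes to decimal gigabytes (1 GB = 1e9 bytes).
inline double bytes_to_gb(std::uint64_t bytes) {
    return static_cast<double>(bytes) / 1e9;
}
```

For the matrix in question, dense_double_bytes(2000000, 2000) is 32,000,000,000 bytes, i.e. 32 GB, before any working copies made during training are counted.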
Yes, I have 2TB RAM; otherwise I would get a memory error from MATLAB.
I agree that this one is a huge dataset but I was curious about the performance
of RF with only a few trees (say 5 trees).
Anyway, thanks for your code. I tried it on other smaller datasets and it
worked fine.
Best,
MJ
Original comment by [email protected]
on 14 Mar 2012 at 10:38
@MJ
sorry i missed your post
yup, RF would work but i don't think this package will finish in any reasonable
time; currently it's non-threaded both at the tree level and node level;
multi-threading is on my todo list. maybe a version of RF threaded at
node-level would be able to scale to your dataset. and maybe it's not that bad
with a few trees. e.g. kinect uses a version of RF threaded at node-level
http://research.microsoft.com/pubs/145347/BodyPartRecognition.pdf
regards
Original comment by abhirana
on 20 Mar 2012 at 11:17
This is an interesting discussion. I am wondering, is there a rule of thumb for
how much memory the MATLAB function allocates for its internal
needs?
That is, how much memory is required as a function of the number of samples
and their dimensionality?
Thanks for your work and this great package!
Cheers!
Original comment by [email protected]
on 28 Mar 2012 at 10:31
Hi vladislavs
at least twice that of the training data, slightly more due to some temporary
variables. but there are 6 more variables that store the tree hierarchy and
that may consume more space than necessary. each of these 6 variables is of
size ntree x nrnodes, where nrnodes = 2*N + 1. so the total mem requirement is
2 x N x D + ntree x (2*N + 1), N = number of examples, D = number of features
for regression, the data is bagged, which creates a shadow copy of the
training data; then sorting is done for each feature, so regression scales
as N log(N) in terms of compute time. for classification, a presorting step is
done for each feature, which helps classification scale as N, but it still
creates a shadow copy of the training data. a todo is
to make regression do that presorting step so that it also scales as N in
execution time
Original comment by abhirana
on 29 Mar 2012 at 8:42
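The rule of thumb above can be written as a small function. This is a sketch under stated assumptions, not code from the package: rf_element_estimate is a hypothetical name, and it counts elements rather than bytes because the comment does not say which type the tree arrays use. The tree_arrays parameter defaults to 1, which reproduces the formula exactly as written; passing 6 counts all six ntree x nrnodes arrays that the comment mentions.

```cpp
#include <cstdint>

// Hypothetical helper (not part of randomforest-matlab).
// Element-count estimate following the comment above:
//   data term:  2 * N * D                        (training data plus shadow copy)
//   tree term:  tree_arrays * ntree * (2*N + 1)  (arrays holding the tree hierarchy)
inline std::uint64_t rf_element_estimate(std::uint64_t n_examples,
                                         std::uint64_t n_features,
                                         std::uint64_t n_trees,
                                         std::uint64_t tree_arrays = 1) {
    const std::uint64_t nrnodes = 2 * n_examples + 1;  // nrnodes as defined in the comment
    return 2 * n_examples * n_features + tree_arrays * n_trees * nrnodes;
}
```

For example, with N = 1000, D = 10 and ntree = 5, the formula as written gives 20000 + 5 x 2001 = 30005 elements, and counting all six tree arrays gives 20000 + 6 x 5 x 2001 = 80030. Temporary variables are not included in either figure.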
Hi Abhirana,
Thanks for your answer - it clears things up! This should be noted somewhere!
Keep up the good work!
Original comment by [email protected]
on 29 Mar 2012 at 9:43
Abhishek,
Do you have plans on releasing the fixed code?
Thanks,
—R
Original comment by [email protected]
on 5 Mar 2013 at 10:46
@romashapovalov
the code in the svn is the latest code. i just haven't put up a download link.
a zip file is available here
https://code.google.com/p/randomforest-matlab/issues/detail?id=41#c8
Original comment by abhirana
on 9 Mar 2013 at 7:15