Comments (5)
Hi @mnaylor5! Sorry for the late response to this issue.
It appears this is a bug caused by the polling frequency of the optimizer, which is controlled by a configuration value in src/optimizer.hpp:88.
The tick-duration member controls how many iterations the optimizer goes through before checking the time.
Since iterations were very fast for our experiments, 10000 iterations per check was a suitable balance between not spending too much time checking the clock and still stopping reasonably close to the desired time limit.
For the dataset you provided, it appears the iterations can be much slower, which is likely due to the large branching factor, so checking every 10000 iterations doesn't work very well. As an immediate solution, I was able to get a more reasonable stopping precision with a tick-duration of 10 iterations. (Simply change the 10000 to 10 in src/optimizer.hpp:88 and recompile the program.)
This should fix things for your specific case. I'll try to think about what might be a more general solution.
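To illustrate why the tick-duration matters, here is a minimal sketch of a loop that only checks the clock every `tick_duration` iterations. It is not the code in src/optimizer.hpp: the names `run_with_tick` and `RunResult`, and the simulated per-iteration cost, are assumptions made up for this example, and time is simulated with a counter so the behaviour is deterministic.

```cpp
#include <cassert>
#include <cstdint>

// Outcome of one simulated run: iterations executed, and how far past the
// time limit the loop ran before it noticed (in microseconds).
struct RunResult {
    std::uint64_t iterations;
    std::uint64_t overshoot_us;
};

// Hypothetical stand-in for an optimizer loop (not the library's code).
// Each iteration costs `iteration_cost_us` of simulated time, and the
// elapsed time is compared to the limit only every `tick_duration` iterations.
RunResult run_with_tick(std::uint64_t tick_duration,
                        std::uint64_t iteration_cost_us,
                        std::uint64_t time_limit_us) {
    // Both must be positive, otherwise the loop below never terminates
    // (or divides by zero).
    assert(tick_duration > 0 && iteration_cost_us > 0);

    std::uint64_t elapsed_us = 0;
    std::uint64_t iterations = 0;
    while (true) {
        elapsed_us += iteration_cost_us;  // one unit of search work
        ++iterations;
        // The time limit is only enforced on tick boundaries.
        if (iterations % tick_duration == 0 && elapsed_us >= time_limit_us) {
            break;
        }
    }
    return {iterations, elapsed_us - time_limit_us};
}
```

With a simulated cost of 10 ms per iteration and a 1 s limit, `run_with_tick(10000, 10000, 1000000)` does not stop until iteration 10000, which is 99 s past the limit, while `run_with_tick(10, 10000, 1000000)` stops at iteration 100 with no overshoot. In this sketch the worst-case overshoot is roughly one tick's worth of iterations, so slow iterations need a small tick-duration.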
from generalizedoptimalsparsedecisiontrees.
Hi @mnaylor5,
Did you manage to find a fix or workaround for this issue? I am also trying out the library and noticing that the time limit setting is ignored. I am running Ubuntu 10.04.
Hi @abhishek-ghose, sorry for the slow response. I have not figured out a workaround - I've been using other optimal tree libraries instead.
Thank you @mnaylor5!
Hey @Jimmy-Lin - thanks for the response! I made the change you suggested, and it seems to successfully enforce the time limit.
This seems to lead to a couple of other issues. First, there seems to be a memory leak or something causing excessive RAM usage. This dataset is pretty small (614 observations in the training set, 8 continuous features, and a binary classification target), but a training run with a 1hr time limit uses ~190GB of RAM. The second is that I'm still getting the exact same basic tree as the output of that 1hr run on a larger machine (32 cores and 208GB RAM) - is this expected? I would think that it should have improved from the initial tree within an hour of searching, but this doesn't seem to be the case.
Any advice would be greatly appreciated! Thanks again!