sigopt / sigopt-examples
Optimization Examples with SigOpt
License: MIT License
Hi,
I want to know how to restart my previous experiment. Here's what I did.
from sigopt_sklearn.search import SigOptSearchCV
from sklearn.metrics import get_scorer

client_token = "My Token"
net_parameters = {
    'parameters': (range),  # placeholder for the actual parameter ranges
}
clf = SigOptSearchCV(net, net_parameters, cv=train_test_splitter, client_token=client_token,
                     n_jobs=1, n_iter=50, scoring=get_scorer('neg_mean_squared_error'))
clf.fit(SDT_training, target_training)
I have the previous experiment's ID.
I would appreciate any help!
Thank you
I just tried re-setting up from scratch on a faster computer. setup_env.sh doesn't seem to actually take the "Darwin" path on a Mac (even though when I run uname -s on the command line I see 'Darwin').
Also, brew install gfortran appears to have been deprecated in favor of brew install gcc.
It seems that by running most of these steps manually and then re-running setup_env.sh I can get things to work, but I'm definitely not a bash expert, nor an expert on installing Python package requirements.
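For reference, OS detection in a setup script usually branches on the output of uname -s. A minimal sketch of what such a check might look like (the exact logic in setup_env.sh may differ; note the match must be exactly "Darwin", with no case or whitespace variation):

```shell
# Branch on OS name as reported by `uname -s`.
case "$(uname -s)" in
  Darwin)
    echo "macOS path: Homebrew-based installs would go here"
    ;;
  Linux)
    echo "Linux path: apt-get-based installs would go here"
    ;;
  *)
    echo "Unsupported OS: $(uname -s)" >&2
    ;;
esac
```

If the script compares against a lowercase string or an unquoted pattern, the Darwin branch can be silently skipped even when uname -s prints 'Darwin'.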
Another wrinkle: depending on your Homebrew version, you may have to supply the gcc version in the brew install:
Error: No available formula for gcc
GCC is now maintained in homebrew-versions, with major version
number in formula name as suffix. Please tap using:
brew tap homebrew/versions
and then install GCC based on its version, e.g., 'brew install gcc47'.
However, if I run brew update first, then the simple approach works: brew install gcc.
Probably not worth racking our brains over, but worth considering in future setup scripts.
Hi SigOpt,
I'm trying to run hyperparameter optimization using predefined (fixed) train/validation indices.
clf = SigOptSearchCV(net, net_parameters, cv=train_test_splitter, client_token=client_token,
                     n_jobs=1, n_iter=100, scoring=get_scorer('neg_mean_squared_error'), experiment=experiment)
This is what I used before, with train_test_splitter = ShuffleSplit() from sklearn.
How can I modify SigOptSearchCV to optimize hyperparameters for the fixed train/validation splits?
I also tried PredefinedSplit from sklearn with fixed train and validation indices, but the cross-validation generates a separate test set and divides train/val using the remaining samples.
The point is that I don't want to hold a test set out; I want to report scores on the validation set.
For example, if I have 1,000 samples, I would like to use the first 800 for training ONLY, and the remaining 200 for validation ONLY.
But what SigOptSearchCV did was split the dataset into 900 samples for training and 100 for testing, and then split the 900 into train and validation sets.
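For what it's worth, sklearn's PredefinedSplit can express exactly the split described above when built this way: entries of -1 in test_fold are always kept in training, and entries of 0 form the single validation fold, so no extra test set is carved out by the splitter itself (whether SigOptSearchCV adds its own holdout on top is a separate question). A minimal sketch, assuming scikit-learn is installed:

```python
import numpy as np
from sklearn.model_selection import PredefinedSplit

n = 1000
test_fold = np.full(n, -1)   # -1: these samples are ALWAYS in training
test_fold[800:] = 0          # fold 0: these samples are the validation set

splitter = PredefinedSplit(test_fold)

# Exactly one split: first 800 samples train, last 200 validate.
for train_idx, val_idx in splitter.split():
    print(len(train_idx), len(val_idx))  # prints: 800 200
```

A splitter built like this can then be passed as the cv argument, the same way ShuffleSplit() was passed above.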
Thank you!
Commands in docker build are executed in an sh subprocess. You cannot change environment variables from a subprocess. I suspect that everything works later because the default environment variables are being used.
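To illustrate the point: each RUN instruction in a Dockerfile gets its own shell, so an exported variable does not survive into the next instruction, while ENV persists for all later instructions and in the final image. A hypothetical fragment:

```dockerfile
FROM ubuntu:20.04

# FOO exists only inside this RUN's shell and is gone afterwards.
RUN export FOO=bar
# This prints an empty value, because the export above did not persist.
RUN echo "FOO is '$FOO'"

# ENV persists for every later instruction and at container runtime.
ENV FOO=bar
RUN echo "FOO is '$FOO'"
```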
When I run setup_env.sh in the sigopt-beats-vegas module, it just says "downloading season 1" and doesn't do anything. Are there any common problems that arise when executing this?
I tried to open a PR with a tiny bug fix for the file orchestrate/apache_spark/orchestrate.yml, but I got a 403 when I tried to push the branch, so I'll just post it here!
The following line:
repository: orchestrate/spark-example
should be replaced with:
aws:
  ecr:
    repository: orchestrate/spark-example
After that, the example works brilliantly.
As well, the run_example parameter list seems to require passing client_token explicitly. This fixes it. I'll open a PR once it finishes (it takes a while, it seems) and I can save the notebook:
from predictor.sigopt_creds import client_token
from predictor.stand_alone import run_example

# Warning, this can take a very long time (many hours)
EXPERIMENT_ID = run_example(client_token)
# EXPERIMENT_ID = run_example(sigopt_depth=0, sigopt_width=0)
# EXPERIMENT_ID = 1545  # Put your experiment ID here if you have already run stand_alone
Looks like I'm getting some import errors. I ran setup.sh as mentioned in the README (I did get some errors saying apt-get wasn't installed, since I'm on macOS I suppose, but everything else seemed fine).
I'm using Anaconda for Python and just ran the setup in a new env (in case that helps explain the error below).
So I went to run the first cell of the notebook (pasted below) and then got the error, which I've also pasted below:
from predictor.stand_alone import run_example
# Warning, this can take a very long time (many hours)
EXPERIMENT_ID = run_example(sigopt_depth=0, sigopt_width=0)
#EXPERIMENT_ID = 1545 # Put your experiment ID here if you have already run stand_alone
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-2-6a242b8ff34b> in <module>()
----> 1 from predictor.stand_alone import run_example
2 # Warning, this can take a very long time (many hours)
3 EXPERIMENT_ID = run_example(sigopt_depth=0, sigopt_width=0)
4 #EXPERIMENT_ID = 1545 # Put your experiment ID here if you have already run stand_alone
/Users/jlent/github_projects/sigopt-examples/sigopt-beats-vegas/predictor/stand_alone.py in <module>()
4
5 import bet_reader
----> 6 import evaluator
7 import read_data
8 from constant import SEASON_1314_END
/Users/jlent/github_projects/sigopt-examples/sigopt-beats-vegas/predictor/evaluator.py in <module>()
2
3 import bet_reader
----> 4 from model import get_features
5 from constant import SEASON_1415_START, SEASON_1415_END
6
/Users/jlent/github_projects/sigopt-examples/sigopt-beats-vegas/predictor/model.py in <module>()
----> 1 from sklearn import ensemble
2 import numpy
3
4 from team_stats import TeamStats
5
/Users/jlent/anaconda/envs/sigopt_vegas/lib/python2.7/site-packages/sklearn/ensemble/__init__.py in <module>()
5
6 from .base import BaseEnsemble
----> 7 from .forest import RandomForestClassifier
8 from .forest import RandomForestRegressor
9 from .forest import RandomTreesEmbedding
/Users/jlent/anaconda/envs/sigopt_vegas/lib/python2.7/site-packages/sklearn/ensemble/forest.py in <module>()
53 from ..externals.joblib import Parallel, delayed
54 from ..externals import six
---> 55 from ..feature_selection.from_model import _LearntSelectorMixin
56 from ..metrics import r2_score
57 from ..preprocessing import OneHotEncoder
/Users/jlent/anaconda/envs/sigopt_vegas/lib/python2.7/site-packages/sklearn/feature_selection/__init__.py in <module>()
5 """
6
----> 7 from .univariate_selection import chi2
8 from .univariate_selection import f_classif
9 from .univariate_selection import f_oneway
/Users/jlent/anaconda/envs/sigopt_vegas/lib/python2.7/site-packages/sklearn/feature_selection/univariate_selection.py in <module>()
13
14 from ..base import BaseEstimator
---> 15 from ..preprocessing import LabelBinarizer
16 from ..utils import (as_float_array, check_array, check_X_y, safe_sqr,
17 safe_mask)
/Users/jlent/anaconda/envs/sigopt_vegas/lib/python2.7/site-packages/sklearn/preprocessing/__init__.py in <module>()
6 from ._function_transformer import FunctionTransformer
7
----> 8 from .data import Binarizer
9 from .data import KernelCenterer
10 from .data import MinMaxScaler
/Users/jlent/anaconda/envs/sigopt_vegas/lib/python2.7/site-packages/sklearn/preprocessing/data.py in <module>()
23 from ..utils.sparsefuncs_fast import (inplace_csr_row_normalize_l1,
24 inplace_csr_row_normalize_l2)
---> 25 from ..utils.sparsefuncs import (inplace_column_scale,
26 mean_variance_axis, incr_mean_variance_axis,
27 min_max_axis)
ImportError: cannot import name inplace_column_scale