allennlp-optuna's People

Contributors

himkt, johngiorgi, magiasn, tatsuokun


allennlp-optuna's Issues

Clarify License

Hey there, I'm trying to repackage allennlp-optuna for conda-forge and was wondering what the license of the package is.

I couldn't find anything in the repo. Conda-forge will not package something without a license, so I need to ask. :)

jsonnet_evaluate_file

Hi,
I have encountered a problem when following this tutorial.
I used the allennlp CLI to tune with the command allennlp tune config/aa.jsonnet config/hparams.json --serialization-dir result/hpo --study-name test --skip-if-exists.

And I got the error:

/allennlp_optuna/commands/tune.py:55: ExperimentalWarning: AllenNLPExecutor is experimental (supported from v1.4.0). The interface can change in the future.
include_package=include_package,
Something went wrong during jsonnet_evaluate_file, please report this: [json.exception.parse_error.101] parse error at line 1, column 1: syntax error while parsing value - invalid literal; last read: 'a'
Aborted

I don't understand this jsonnet_evaluate_file error. I found it's the same error as the issue here, but so far there have been no replies.

Could you help me with this?
Thank you.
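For reference, one thing worth checking with this kind of parse error is whether config/hparams.json is valid JSON in the format allennlp-optuna expects. A minimal sketch of that format, written out from Python (the parameter names and ranges below are purely illustrative, not taken from this report):

import json

# Hedged sketch: allennlp-optuna reads hparams.json as a JSON list of
# search-space entries, each with a "type" and an "attributes" block that is
# forwarded to the corresponding Optuna suggest_* call.
hparams = [
    {"type": "int", "attributes": {"name": "embedding_dim", "low": 64, "high": 128}},
    {"type": "float", "attributes": {"name": "dropout", "low": 0.0, "high": 0.5}},
]

with open("config/hparams.json", "w") as f:
    json.dump(hparams, f, indent=2)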

KeyError: 'attributes' for optuna-param-path config file

Hi there!

Great work on this plugin!

I'm running a test here, using the following optuna-param-path config file. I was expecting to specify just the pruner and sampler, without specifying actual attributes for them (leaving the defaults). It looks like the tune command expects some attributes; is this really necessary?

Thanks!

{
    "pruner": {
        "type": "SuccessiveHalvingPruner"
    },
    "sampler": {
        "type": "TPESampler"
    }
}
2021-05-04 09:44:54,905 - INFO - allennlp.common.plugins - Plugin allennlp_models available
2021-05-04 09:44:54,971 - INFO - allennlp.common.plugins - Plugin allennlp_server available
2021-05-04 09:44:55,514 - INFO - allennlp.common.plugins - Plugin allennlp_optuna available
Traceback (most recent call last):
  File "/media/discoD/anaconda3/envs/allennlp/bin/allennlp", line 8, in <module>
    sys.exit(run())
  File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/allennlp/__main__.py", line 34, in run
    main(prog="allennlp")
  File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/allennlp/commands/__init__.py", line 119, in main
    args.func(args)
  File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/allennlp_optuna/commands/tune.py", line 66, in tune
    pruner = pruner_class(**optuna_config["pruner"]["attributes"])
KeyError: 'attributes'
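From the traceback, tune looks up optuna_config["pruner"]["attributes"] unconditionally, so a possible (untested) workaround is to add an empty attributes object for each component; pruner_class(**{}) would then fall back to Optuna's constructor defaults. A sketch that writes such a file from Python:

import json

# Hedged, untested sketch: supply empty "attributes" so the tune command has
# the key it expects while Optuna still uses its default arguments.
optuna_config = {
    "pruner": {"type": "SuccessiveHalvingPruner", "attributes": {}},
    "sampler": {"type": "TPESampler", "attributes": {}},
}

# Write to whatever file is passed as --optuna-param-path.
with open("optuna.json", "w") as f:
    json.dump(optuna_config, f, indent=2)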

Provide default/good hyperparameters to start search

Is there any way to provide a set of default hyperparameters to start the search?

The motivation would be that you already have a sense of "good" hyperparameters, so beginning the search with them may be far more efficient, especially when using a sampler.
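For what it's worth, plain Optuna supports this through Study.enqueue_trial, which queues a fixed set of parameters to be evaluated before the sampler takes over; whether allennlp-optuna exposes this from its CLI is a separate question. A minimal sketch (study/storage names and parameters are illustrative):

import optuna

# Hedged sketch: enqueue a known-good configuration so the first trial
# evaluates it before the sampler starts proposing its own candidates.
study = optuna.create_study(
    study_name="test",
    storage="sqlite:///allennlp_optuna.db",
    direction="maximize",
    load_if_exists=True,
)
study.enqueue_trial({"lr": 1e-3, "dropout": 0.3})  # illustrative parameter names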

retrain runtime error: fail to load study

Hi @himkt again ^^

I had a problem when trying to run allennlp retrain, and I wish to ask for your help.

The command that I used for retrain: allennlp retrain config/aa.jsonnet -s retrain_result/test --study-name test --include-package my_project --storage sqlite:///allennlp_optuna.db

And I got this error:

RuntimeError: Fail to load study. Perhaps you attempt to use AllenNLPPruningCallback without AllenNLPExecutor. If you want to use a callback without an executor, you have to instantiate a callback with `trial` and `monitor`. Please see the Optuna example: https://github.com/optuna/optuna/blob/master/examples/allennlp/allennlp_simple.py.

I don't understand why it asked for an AllenNLPExecutor here. I checked with optuna studies --storage sqlite:///allennlp_optuna.db, and the study is listed.
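As a sanity check independent of allennlp-optuna, the same study can be loaded directly with the Optuna API to confirm it contains a completed best trial (a hedged sketch using the names from the command above):

import optuna

# Load the study that `optuna studies` lists and inspect it.
study = optuna.load_study(study_name="test", storage="sqlite:///allennlp_optuna.db")
print(study.best_trial.number)  # raises ValueError if no trial has completed
print(study.best_params)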

Related to this error, I also have some questions, hope you don't mind.

  • I am not sure I understand the objective of retrain correctly. I first tuned on "train-dev" and obtained some optimised parameters, and then I use the same config on "train-test". Is that the aim of the retrain command here?
  • I followed the Python script mentioned in the error here. Line 109 says "# patience=None since it could conflict with AllenNLPPruningCallback". I therefore changed patience in my trainer to None (I had patience=5), but I still get the same error. I am a bit confused about the patience parameter in the trainer then. Can't I use it during tune?
  • I searched for retrain in the Optuna docs but couldn't find anything similar. Does it only exist in the CLI?

Thank you in advance!

Erroneous poetry run commands?

All the commands in the README are prepended with poetry run, but the install instructions say to install via pip. Are these typos?

Using SuccessiveHalvingPruner

Hi there!

I'm trying to use SuccessiveHalvingPruner, since some sources report it's a better alternative to Ray's PBT. I wanted to keep the default attributes, but per my previous issue (#38), I had to specify something to test. First I tried the following:

{
    "pruner": {
        "type": "SuccessiveHalvingPruner",
        "attributes": {
            "min_resource": "auto"
        }
    }
}

But it looks like it's getting parsed as None, so Optuna's class throws an exception here:

[screenshot of the exception omitted]

Then I tried the following, and it works:

{
    "pruner": {
        "type": "SuccessiveHalvingPruner",
        "attributes": {
            "min_resource": 5
        }
    }
}

But what exactly is a "resource"? I don't understand the min_resource explanation from optuna's documentation.
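For context (hedged, paraphrasing Optuna's documentation rather than quoting it): with the AllenNLP integration, a "resource" roughly corresponds to one intermediate report, i.e. one epoch, so min_resource is the training budget a trial is guaranteed before the pruner may stop it at the first rung. Instantiating the pruner directly looks like this:

import optuna

# Hedged sketch: min_resource=5 means a trial reports roughly five epochs of
# intermediate values before it can be halved away; "auto" is also accepted by
# Optuna itself and lets it infer the budget from the first trial's reported steps.
pruner = optuna.pruners.SuccessiveHalvingPruner(min_resource=5)
study = optuna.create_study(direction="maximize", pruner=pruner)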

Thanks!

Different results from `allennlp tune` and `allennlp retrain` with transformers

When I am tuning a transformer model, I get different results from allennlp tune and allennlp retrain with the same hyperparameters.

I found this is caused by the allennlp.common.cached_transformers module, which only constructs the model in the first trial (consuming some random numbers) and reuses the cached model in later trials (consuming no random numbers), leading to inconsistent results between tune and retrain.
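A toy illustration of the mechanism described above (nothing here uses the real allennlp API; it only mimics "construct once, then serve from a cache"):

import random

_cache = {}

def construct(name):
    # Toy stand-in for a cached constructor: it draws from the global RNG only
    # on a cache miss, which is exactly the asymmetry described above.
    if name not in _cache:
        _cache[name] = random.random()  # pretend this is weight initialization
    return _cache[name]

# First trial in a process: cache miss, so the RNG advances before training.
random.seed(42)
construct("bert-base-uncased")
first_trial_draw = random.random()

# A later trial in the same process: cache hit, so the RNG does not advance.
random.seed(42)
construct("bert-base-uncased")
later_trial_draw = random.random()

print(first_trial_draw == later_trial_draw)  # False: the trials see different randomness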

Question: hyperparameter tuner for allennlp with cross-validation

Hi @himkt!
I have a question about using cross-validation in optuna.

I have a very small dataset, so my idea is to do an n-fold cross-validation on the train-dev dataset: within one trial I train n times (once per fold) with the same hparams but different data. The result is the average over all folds.

Precisely I have:

  • nfold = 5
  • trials = 50

I optimized each of the 5 folds separately, with 50 trials each.
What I observed is that the hparams differ between folds, so I can't average the results (sometimes the first x trials have the same hparams across several folds, but never all of them).
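For reference, the usual Optuna pattern for this is to run all folds inside a single objective and return the average, so every fold within a trial shares the same hparams. A hedged sketch (train_and_evaluate is a hypothetical placeholder, not part of allennlp-optuna):

import optuna

def train_and_evaluate(fold: int, lr: float) -> float:
    # Hypothetical placeholder: train on the other folds with this lr and
    # return the validation metric for `fold`. Replace with real training.
    return 0.0

def objective(trial: optuna.Trial) -> float:
    lr = trial.suggest_float("lr", 1e-5, 1e-2, log=True)  # illustrative search space
    fold_scores = [train_and_evaluate(fold=fold, lr=lr) for fold in range(5)]
    return sum(fold_scores) / len(fold_scores)  # average over all folds

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)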

I thus have some questions.

I also found similar questions on Gitter by @PhilipMay, and an Optuna issue (#1875), but the responses were unclear to me...

Do you have any idea? Thank you!

retrain command not getting environment values

Hi @himkt !

I think I found another issue, now with the retrain command. It does not pick up any of the values set in environment variables, except the ones that come from the best params used for training. This works with the tune command, probably because it uses AllenNLPExecutor, which sets up the configuration properly, fetching all environment variables in its _build_params method.

Without this, I keep getting RUNTIME ERROR: Undefined external variable: x errors from jsonnet, for any environment variables that were expected in my jsonnet config.
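As context, the behaviour described above can be reproduced outside the CLI: AllenNLP evaluates the jsonnet config with external variables, and the tune path apparently feeds the process environment into them while retrain does not. A hedged sketch of loading a config with the environment exposed to std.extVar (this mirrors the described behaviour, not the exact _build_params code; the config path is illustrative):

import os
from allennlp.common.params import Params

# Hedged sketch: pass environment variables to jsonnet as external variables
# so std.extVar("x") resolves instead of raising "Undefined external variable".
ext_vars = {key: str(value) for key, value in os.environ.items()}
params = Params.from_file("config/aa.jsonnet", ext_vars=ext_vars)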

include package is not being passed during distributed training

When using distributed training configured like the following snippet:

  "distributed": {
    "cuda_devices": if num_gpus > 1 then std.range(0, num_gpus - 1) else 0,
  },

The --include-package option does not work for the allennlp tune command: my custom dataset_reader is not found. Using the fully qualified class name works around the issue, which suggests that include-package is not being passed to the training processes.

It works perfectly fine if I use the allennlp train command.
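One hedged workaround (not confirmed against this plugin) is to register the custom package explicitly rather than relying on --include-package, since --include-package ultimately just imports the named module. A sketch, assuming my_project is importable from the working directory:

from allennlp.common.util import import_module_and_submodules

# Hedged workaround sketch: import the custom package (the my_project mentioned
# above) so its registered components are visible, mirroring what
# --include-package normally does. This has to run in every worker process.
import_module_and_submodules("my_project")

Listing my_project in a local .allennlp_plugins file may achieve the same effect through AllenNLP's plugin discovery.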

Trial X failed because of the following error: ValueError('nan loss encountered')

I'm having an issue where some combinations of hyper-parameters result in 'nan loss encountered'. I suspect too high a learning rate coupled with AMP enabled.

The issue is that when this occurs, allennlp-optuna crashes. It would be preferable if such an exception simply resulted in the trial being marked as failed and the search continuing with the next trial.

2021-11-22 20:26:30,886 - CRITICAL - root - Uncaught exception
Traceback (most recent call last):
  File "bin/allennlp", line 8, in <module>
    sys.exit(run())
  File "site-packages/allennlp/__main__.py", line 46, in run
    main(prog="allennlp")
  File "site-packages/allennlp/commands/__init__.py", line 122, in main
    args.func(args)
  File "site-packages/allennlp_optuna/commands/tune.py", line 89, in tune
    study.optimize(objective, n_trials=n_trials, timeout=timeout)
  File "site-packages/optuna/study/study.py", line 400, in optimize
    _optimize(
  File "site-packages/optuna/study/_optimize.py", line 66, in _optimize
    _optimize_sequential(
  File "site-packages/optuna/study/_optimize.py", line 163, in _optimize_sequential
    trial = _run_trial(study, func, catch)
  File "site-packages/optuna/study/_optimize.py", line 264, in _run_trial
    raise func_err
  File "site-packages/optuna/study/_optimize.py", line 213, in _run_trial
    value_or_values = func(trial)
  File "site-packages/allennlp_optuna/commands/tune.py", line 57, in _objective
    return executor.run()
  File "site-packages/optuna/integration/allennlp/_executor.py", line 215, in run
    allennlp.commands.train.train_model(
  File "site-packages/allennlp/commands/train.py", line 254, in train_model
    model = _train_worker(
  File "site-packages/allennlp/commands/train.py", line 504, in _train_worker
    metrics = train_loop.run()
  File "site-packages/allennlp/commands/train.py", line 577, in run
    return self.trainer.train()
  File "site-packages/allennlp/training/gradient_descent_trainer.py", line 750, in train
    metrics, epoch = self._try_train()
  File "site-packages/allennlp/training/gradient_descent_trainer.py", line 773, in _try_train
    train_metrics = self._train_epoch(epoch)
  File "site-packages/allennlp/training/gradient_descent_trainer.py", line 495, in _train_epoch
    raise ValueError("nan loss encountered")
ValueError: nan loss encountered
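In plain Optuna, the requested behaviour is available through the catch argument of Study.optimize, which marks a trial as failed on the listed exception types instead of aborting the whole study. A hedged sketch of what the call at allennlp_optuna/commands/tune.py:89 could look like (not what the plugin currently does):

# Hedged sketch: `catch` makes Optuna record the trial as failed and move on
# to the next trial instead of re-raising and crashing the study.
study.optimize(
    objective,
    n_trials=n_trials,
    timeout=timeout,
    catch=(ValueError,),  # e.g. "nan loss encountered"
)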
