Comments (13)
Ah, I may have found the cause.
What happens if you simply remove std.parseJson from local optimizer = std.parseJson(std.extVar('optimizer'));?
(I suspect this is the problem because std.extVar already returns the optimizer name as a string, and std.parseJson fails on a bare value like adam, which is not valid JSON.)
21c21
< local optimizer = std.parseJson(std.extVar('optimizer'));
---
> local optimizer = std.extVar('optimizer');
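(To see the failure in isolation, a quick check with the Python jsonnet bindings; a minimal sketch, assuming the optimizer arrives as a plain string such as adam:)

import _jsonnet

# std.parseJson(std.extVar(...)) errors here: "adam" is a bare
# identifier, not valid JSON, so parsing it fails.
# _jsonnet.evaluate_snippet("t", "std.parseJson(std.extVar('optimizer'))",
#                           ext_vars={"optimizer": "adam"})

# std.extVar alone returns the string as-is and evaluates fine.
print(_jsonnet.evaluate_snippet("t", "std.extVar('optimizer')",
                                ext_vars={"optimizer": "adam"}))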
Hey @himkt,
I removed the std.parseJson part, and it works!!
Though I had some trouble with --include-package because it couldn't find my registered dataset_reader, but I figured it out :)
Without allennlp-optuna, I put my project repo name in .allennlp_plugins and it worked just fine. Now we need to write allennlp_optuna in the plugins file too, so everything can be found as before; a sketch of the file is below. (Sorry, this discussion is no longer under this issue, I was just wondering what this plugin setup is about ^^")
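(.allennlp_plugins is just a plain list of importable package names, one per line; in this sketch, my_project stands in for the actual project package:)

my_project
allennlp_optuna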
I'll continue with the pruning.
P.S.: the current version works fine with allennlp 2.0.2. Looking forward to the next version of allennlp-optuna!
Thanks again :)
@chuyuanli I released the new version of allennlp-optuna (v0.1.5).
https://github.com/himkt/allennlp-optuna/releases/tag/v0.1.5
You can upgrade the library with pip install -U allennlp-optuna.
Please give it a try.
Let me close this PR as we identified and solved the jsonnet error.
Feel free to open a new issue if you have something to discuss. Thank you @chuyuanli!
Hello @chuyuanli, thank you for trying allennlp-optuna!
Can you share your jsonnet configuration?
I think the problem happens because the config file is invalid jsonnet.
You can check whether your config file is valid by running the jsonnet command, filling in the hyperparameters with -V:
jsonnet config/aa.jsonnet \
-V dropout=1 \
-V embedding_dim=1 \
-V max_filter_size=1 \
-V num_filters=1 \
-V output_dim=1 \
-V lr=1
If you share your config, it would be helpful for investigation.
Hi @himkt, thanks for your prompt response!
Sure, you'll find my jsonnet config file below:
local ufo_path = "xxx/data/ufo/";
local MODE = "train-dev";
local flaubert_model = "flaubert/flaubert_base_cased";
local cuda_device = 0;
local epochs = 5;
local seed = 2;
local batch_size = 1;
local nfold = std.parseInt(std.extVar('nfold'));
local dropout = std.parseJson(std.extVar('dropout'));
local lr = std.parseJson(std.extVar('lr'));
local block_h = std.parseInt(std.extVar('block_h'));
local block_layer = std.parseInt(std.extVar('block_layer'));
local doc_h = std.parseInt(std.extVar('doc_h'));
local doc_layer = std.parseInt(std.extVar('doc_layer'));
local l1_coef = std.parseJson(std.extVar('l1_coef'));
local l3_coef = std.parseJson(std.extVar('l3_coef'));
local optimizer = std.parseJson(std.extVar('optimizer'));
{
  "dataset_reader": {
    "type": "dialog_reader",
    "tokenizer": {
      "type": "pretrained_transformer",
      "model_name": flaubert_model
    },
    "token_indexers": {
      "flaubert_tokens": {
        "type": "pretrained_transformer",
        "model_name": flaubert_model
      },
    },
    "params": {
      "encode_turns": false,
      "block": true,
      "tgt_spk": "stp",
      "win_size": 5,
      "seed": seed,
      "ufo_path": ufo_path,
      "MODE": MODE
    },
  },
  "train_data_path": "train0",
  "validation_data_path": "dev0",
  "model": {
    "type": "hierarchical_classifier",
    "embedder": {
      "token_embedders": {
        "flaubert_tokens": {
          "type": "pretrained_transformer",
          "model_name": flaubert_model,
          "last_layer_only": true,
          "train_parameters": false
        }
      }
    },
    "turn_encoder": {
      "type": "cls_pooler",
      "embedding_dim": 768
    },
    "block_encoder": {
      "type": "gru",
      "input_size": 768,
      "hidden_size": block_h, //50
      "num_layers": block_layer, //1
      "dropout": dropout,
      "bidirectional": false
    },
    "doc_encoder": {
      "type": "gru",
      "input_size": block_h, //50
      "hidden_size": doc_h, //50
      "num_layers": doc_layer, //1
      "dropout": dropout,
      "bidirectional": false
    },
    "use_flaubert": true,
    "l1_coef": l1_coef, //0.2
    "l3_coef": l3_coef, //0.4
    "cuda": cuda_device,
  },
  "data_loader": {
    "batch_size": batch_size,
    "shuffle": true
  },
  "trainer": {
    "optimizer": {
      "type": optimizer, //"adam", "sgd"
      "lr": lr
    },
    "num_epochs": epochs,
    "cuda_device": cuda_device,
    "patience": 5,
    // "epoch_callbacks": [
    //   {
    //     "type": "optuna_pruner",
    //   }
    // ]
  }
}
I tried the jsonnet command, but it shows a bash: jsonnet: command not found error. I checked with pip install jsonnet, and jsonnet is already installed :/
Thanks ^^
@chuyuanli Hmm... the jsonnet configuration seems to be valid.
Can you share hparams.json and sample data for train/valid as well?
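(By the way, pip install jsonnet installs the Python bindings rather than the standalone jsonnet binary, which is why bash can't find the command. As a workaround, you can evaluate the config from Python; a minimal sketch with placeholder values, assuming the file is config/aa.jsonnet:)

import _jsonnet

# Dummy hyperparameter values just so the template evaluates.
# All values must be strings; optimizer is JSON-quoted because the
# config runs it through std.parseJson.
ext_vars = {
    "nfold": "0",
    "dropout": "0.5",
    "lr": "0.001",
    "block_h": "64",
    "block_layer": "1",
    "doc_h": "32",
    "doc_layer": "1",
    "l1_coef": "0.2",
    "l3_coef": "0.1",
    "optimizer": '"adam"',
}
print(_jsonnet.evaluate_file("config/aa.jsonnet", ext_vars=ext_vars))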
Ah, it may need the Python script for the hierarchical_classifier class as well.
Yeah, I first used this jsonnet (with the exact values and not std.extVar, of course) without Optuna: allennlp train -s result/res config/aa.jsonnet. It worked out just fine. Then I changed to std.extVar and it didn't work.
Sure, here is my hparams.json file:
[
  {
    "type": "int",
    "attributes": {
      "name": "nfold",
      "low": 0,
      "high": 4
    }
  },
  {
    "type": "float",
    "attributes": {
      "name": "dropout",
      "low": 0.0,
      "high": 0.8
    }
  },
  {
    "type": "float",
    "attributes": {
      "name": "lr",
      "low": 1e-5,
      "high": 1e-3,
      "log": true
    }
  },
  {
    "type": "int",
    "attributes": {
      "name": "block_h",
      "low": 64,
      "high": 512
    }
  },
  {
    "type": "int",
    "attributes": {
      "name": "block_layer",
      "low": 1,
      "high": 3
    }
  },
  {
    "type": "int",
    "attributes": {
      "name": "doc_h",
      "low": 32,
      "high": 256
    }
  },
  {
    "type": "int",
    "attributes": {
      "name": "doc_layer",
      "low": 1,
      "high": 3
    }
  },
  {
    "type": "float",
    "attributes": {
      "name": "l1_coef",
      "low": 0.2,
      "high": 0.8
    }
  },
  {
    "type": "float",
    "attributes": {
      "name": "l3_coef",
      "low": 0.1,
      "high": 0.4
    }
  },
  {
    "type": "categorical",
    "attributes": {
      "name": "optimizer",
      "choices": ["adam", "sgd"]
    }
  }
]
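(For intuition, each entry corresponds to one of Optuna's suggest calls; a rough sketch of the mapping, not allennlp-optuna's actual code:)

def suggest_hparams(trial):
    # "int" -> suggest_int, "float" -> suggest_float (with log=True for
    # log-scale search), "categorical" -> suggest_categorical.
    return {
        "nfold": trial.suggest_int("nfold", 0, 4),
        "dropout": trial.suggest_float("dropout", 0.0, 0.8),
        "lr": trial.suggest_float("lr", 1e-5, 1e-3, log=True),
        "block_h": trial.suggest_int("block_h", 64, 512),
        "block_layer": trial.suggest_int("block_layer", 1, 3),
        "doc_h": trial.suggest_int("doc_h", 32, 256),
        "doc_layer": trial.suggest_int("doc_layer", 1, 3),
        "l1_coef": trial.suggest_float("l1_coef", 0.2, 0.8),
        "l3_coef": trial.suggest_float("l3_coef", 0.1, 0.4),
        "optimizer": trial.suggest_categorical("optimizer", ["adam", "sgd"]),
    }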
As for the Python script for hierarchical_classifier and the data samples, I made them into a zip file.
FYI, hierarchical means that I encode turn (each sentence) -> block (n sentences in a row) -> doc (whole document).
My dataset_reader.py is wrapped with another package called ufo. I don't want to mess you around with that ^^" So I put 2 small examples of what my input data looks like.
Ah, also, I just saw the version of allennlp-optuna. My code is written with allennlp 2.0.1, but allennlp-optuna 0.1.4 requires allennlp<2.0.0,>=1.0.0. So when I installed allennlp-optuna, it automatically downgraded my allennlp.
Any way to get around that? ^^"
> My code is written with allennlp 2.0.1, but allennlp-optuna 0.1.4 requires allennlp<2.0.0,>=1.0.0.
I'll release the next version of allennlp-optuna, which supports AllenNLP 2.x.x.
Sorry for the delay.
Hi!
I updated the library and it works fine ;)
I made one small modification in the jsonnet file though, because epoch_callbacks no longer exists in allennlp 2.x.x, I think:
"callbacks": [//if allennlp<2.x.x, use 'epoch_callbacks' as param
{
type: 'optuna_pruner',
}
],
Apart from that, all good!
A small question about study-name and the trials: how can I show all the studies that I've run so far?
Thank you!!
> because epoch_callbacks no longer exists in allennlp 2.x.x, I think
You're right. This change comes from AllenNLP itself.
> A small question about study-name and the trials: how can I show all the studies that I've run so far?
You have to save your studies to a persistent backend such as MySQL, SQLite3, Redis, etc.
You can then list the studies by writing a SQL query or something similar.
To save your study to a persistent backend, specify --storage when running allennlp tune.
(ref. https://github.com/himkt/allennlp-optuna#5-hyperparameter-optimization-at-scale)
For more information about allennlp-optuna, please refer to the AllenNLP guide.
(ref. https://guide.allennlp.org/hyperparameter-optimization)
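(For example, a minimal sketch using Optuna's Python API, assuming the study was saved with --storage sqlite:///optuna.db; the path is a placeholder:)

import optuna

# Print every study stored in the given backend.
for summary in optuna.get_all_study_summaries(storage="sqlite:///optuna.db"):
    print(summary.study_name, summary.n_trials)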