

himkt commented on May 24, 2024

@chuyuanli

Ah, I may have found the cause.
What happens if you simply remove std.parseJson from local optimizer = std.parseJson(std.extVar('optimizer'));?

21c21
< local optimizer = std.parseJson(std.extVar('optimizer'));
---
> local optimizer = std.extVar('optimizer');
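For context, the likely failure mode can be illustrated with a Python analogy (this is not allennlp-optuna's code): the categorical optimizer choice arrives in jsonnet as the bare string adam, which is not valid JSON, so std.parseJson fails on it, while numeric values such as "0.5" parse fine. Dropping parseJson keeps the raw string, which is exactly what a categorical value needs:

```python
import json

# Python analogy for std.parseJson(std.extVar('optimizer')).
# A bare word like "adam" is NOT valid JSON (the valid form would be the
# quoted string '"adam"'), so json.loads raises; falling back to the raw
# string is the equivalent of removing std.parseJson for categorical values.
def parse_ext_var(value: str):
    try:
        return json.loads(value)   # works for numbers like "0.5" or "1e-4"
    except json.JSONDecodeError:
        return value               # keep the bare string, e.g. "adam"

print(parse_ext_var("0.5"))   # parsed as a float
print(parse_ext_var("adam"))  # kept as a plain string
```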

from allennlp-optuna.

chuyuanli commented on May 24, 2024

Hey @himkt,
I removed the std.parseJson part, and it works!!
Though I had some trouble with --include-package because it couldn't find my registered dataset_reader, but I figured it out :)
Without allennlp-optuna, I put my project repo name in .allennlp_plugins and it worked just fine. Now allennlp_optuna also needs to be written in the plugins file, otherwise it isn't found as before. (sorry this discussion is no longer under this issue, just wondering what this plugin is about ^^")

I'll continue with the pruning.
P.S.: the current version works fine with allennlp 2.0.2. Looking forward to the next version of allennlp-optuna!

Thanks again :)

himkt commented on May 24, 2024

@chuyuanli I released the new version of allennlp-optuna (v0.1.5).
https://github.com/himkt/allennlp-optuna/releases/tag/v0.1.5

You can upgrade the library by pip install -U allennlp-optuna.
Please give it a try. 🙏

himkt commented on May 24, 2024

Let me close this issue, as we identified and solved the jsonnet error.
Feel free to open a new issue if you have something to discuss. Thank you @chuyuanli!

himkt commented on May 24, 2024

Hello @chuyuanli, thank you for trying allennlp-optuna!

Can you share your jsonnet configuration?
I think the problem happens because the config file is invalid for jsonnet.

You can check whether your config file is valid jsonnet by executing the jsonnet command,
filling in the hyperparameters with -V:

jsonnet config/aa \
  -V dropout=1 \
  -V embedding_dim=1 \
  -V max_filter_size=1 \
  -V num_filters=1 \
  -V output_dim=1 \
  -V lr=1

If you share your config, it would be helpful for the investigation.

chuyuanli commented on May 24, 2024

Hi @himkt, thanks for your prompt response!

Sure, you'll find my jsonnet config file below:

local ufo_path = "xxx/data/ufo/";
local MODE = "train-dev";
local flaubert_model = "flaubert/flaubert_base_cased";
local cuda_device = 0; 
local epochs = 5;
local seed = 2;
local batch_size = 1;

local nfold = std.parseInt(std.extVar('nfold'));
local dropout = std.parseJson(std.extVar('dropout'));
local lr = std.parseJson(std.extVar('lr'));
local block_h = std.parseInt(std.extVar('block_h'));
local block_layer = std.parseInt(std.extVar('block_layer'));
local doc_h = std.parseInt(std.extVar('doc_h'));
local doc_layer = std.parseInt(std.extVar('doc_layer'));
local l1_coef = std.parseJson(std.extVar('l1_coef'));
local l3_coef = std.parseJson(std.extVar('l3_coef'));
local optimizer = std.parseJson(std.extVar('optimizer'));

{
    "dataset_reader" : {
        "type": "dialog_reader",
        "tokenizer": {
          "type": "pretrained_transformer",
          "model_name": flaubert_model
        },
        "token_indexers":{
          "flaubert_tokens":{
            "type": "pretrained_transformer",
            "model_name": flaubert_model
          },
        },
        "params":{"encode_turns": false,
        "block": true,
        "tgt_spk": "stp",
        "win_size": 5,
        "seed": seed,
        "ufo_path": ufo_path,
        "MODE": MODE
        },  
    },
    "train_data_path": "train0",
    "validation_data_path": "dev0",
    "model": {
        "type": "hierarchical_classifier",
        "embedder": {
            "token_embedders": {
                "flaubert_tokens": {
                    "type": "pretrained_transformer",
                    "model_name": flaubert_model,
                    "last_layer_only": true,
                    "train_parameters": false
                }
            }
        },
        "turn_encoder": {
            "type": "cls_pooler",
            "embedding_dim": 768
        },
        "block_encoder": {
            "type": "gru",
            "input_size": 768,
            "hidden_size": block_h, //50
            "num_layers": block_layer, //1
            "dropout": dropout,
            "bidirectional": false
        },
        "doc_encoder": {
            "type": "gru",
            "input_size": block_h, //50
            "hidden_size": doc_h, //50
            "num_layers": doc_layer, //1,
            "dropout": dropout,
            "bidirectional": false
        },
        "use_flaubert": true, 
        "l1_coef": l1_coef, //0.2,
        "l3_coef": l3_coef, //0.4,
        "cuda": cuda_device,
    },

    "data_loader": {
        "batch_size": batch_size,
        "shuffle": true
    },

    "trainer": {
        "optimizer": {
          "type": optimizer, //"adam", "sgd"
          "lr": lr
        },
        "num_epochs": epochs,
        "cuda_device": cuda_device,
        "patience": 5,
        // "epoch_callbacks": [
        // {
        //     "type": "optuna_pruner",
        // }
        // ]
    }
}

I tried the jsonnet command, and it shows a bash: jsonnet: command not found error. I have checked with pip install jsonnet and jsonnet is already installed :/
Thanks ^^

himkt commented on May 24, 2024

@chuyuanli Hmm... Your jsonnet configuration seems to be valid.
Can you share hparams.json and sample data for train/valid as well?

himkt commented on May 24, 2024

Ah, I may also need the Python script for the hierarchical_classifier class.

chuyuanli commented on May 24, 2024

Yeah, I first used this jsonnet file (with the exact values instead of std.extVar, of course) without Optuna: allennlp train -s result/res config/aa.jsonnet. It worked just fine. Then I changed to std.extVar and it didn't work.

Sure, here is my hparams.json file:

[
    {
      "type": "int",
      "attributes": {
        "name": "nfold",
        "low": 0,
        "high": 4
      }
    },
    {
      "type": "float",
      "attributes": {
        "name": "dropout",
        "low": 0.0,
        "high": 0.8
      }
    },
    {
      "type": "float",
      "attributes": {
        "name": "lr",
        "low": 1e-5,
        "high": 1e-3,
        "log": true
      }
    },
    {
        "type": "int",
        "attributes": {
          "name": "block_h",
          "low": 64,
          "high": 512
        }
      },
      {
        "type": "int",
        "attributes": {
          "name": "block_layer",
          "low": 1,
          "high": 3
        }
      },
      {
        "type": "int",
        "attributes": {
          "name": "doc_h",
          "low": 32,
          "high": 256
        }
      },
      {
        "type": "int",
        "attributes": {
          "name": "doc_layer",
          "low": 1,
          "high": 3
        }
      },
      {
        "type": "float",
        "attributes": {
          "name": "l1_coef",
          "low": 0.2,
          "high": 0.8
        }
      },
      {
        "type": "float",
        "attributes": {
          "name": "l3_coef",
          "low": 0.1,
          "high": 0.4
        }
      },
      {
        "type": "categorical",
        "attributes": {
          "name": "optimizer",
          "choices": ["adam", "sgd"]
        }
      }
  ]
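(For intuition, each entry in hparams.json corresponds to one Optuna suggest call. A rough sketch of that mapping, illustrative only and not allennlp-optuna's actual implementation; the trial object is whatever Optuna passes to the objective:)

```python
# Rough sketch: how one hparams.json entry maps onto Optuna's suggest API.
# Illustrative only -- not allennlp-optuna's actual code.
def suggest(trial, hparam):
    attrs = dict(hparam["attributes"])
    name = attrs.pop("name")
    if hparam["type"] == "int":
        return trial.suggest_int(name, attrs["low"], attrs["high"])
    if hparam["type"] == "float":
        return trial.suggest_float(name, attrs["low"], attrs["high"],
                                   log=attrs.get("log", False))
    if hparam["type"] == "categorical":
        return trial.suggest_categorical(name, attrs["choices"])
    raise ValueError("unknown hyperparameter type: " + hparam["type"])
```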

As for the Python script for hierarchical_classifier and the data samples, I made them into a zip file.
FYI, hierarchical means that I encode turn (each sentence) -> block (n sentences in a row) -> doc (whole document).
My dataset_reader.py is wrapped in another package called ufo, and I don't want to mess you around with that ^^" So I put in 2 small examples of what my input data looks like.

allen-optuna-sample.zip

chuyuanli commented on May 24, 2024

Ah, also I just noticed the version of allennlp-optuna. My code is written for allennlp 2.0.1, but allennlp-optuna 0.1.4 requires allennlp<2.0.0,>=1.0.0. So when I installed allennlp-optuna it automatically downgraded my allennlp.
Any way to get around that? ^^"

himkt commented on May 24, 2024

My code is written in allennlp 2.0.1, but allennlp-optuna 0.1.4 requires allennlp<2.0.0,>=1.0.0

I'll release the next version of allennlp-optuna, which supports AllenNLP 2.x.x, later.
Sorry for the delay.

chuyuanli commented on May 24, 2024

Hi!
I updated the library and it works fine ;)
I made one small modification in the jsonnet file, though, since epoch_callbacks no longer exists in allennlp 2.x.x, I think.

        "callbacks": [ // if allennlp<2.x.x, use 'epoch_callbacks' as the param
            {
                "type": "optuna_pruner"
            }
        ],

Apart from that all good!

Small question about the study-name and the trials: how can I show all the studies I've made so far?
Thank you!!

himkt commented on May 24, 2024

@chuyuanli

cause the epoch_callbacks no longer exist in allennlp2.x.x I think.

You're right. This change comes from AllenNLP itself.

Small question about the study-name and the trials, how could I show all the studies that I've made so far?

You have to save a study to a persistent backend such as MySQL, SQLite3, Redis, etc.
Then you can see the list of studies by writing a SQL query or something.

To save your study to a persistent backend, specify --storage when running allennlp tune.
(ref. https://github.com/himkt/allennlp-optuna#5-hyperparameter-optimization-at-scale)
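(As a concrete sketch of the "SQL query or something" above: Optuna's default RDB schema keeps one row per study in a studies table with a study_name column, so with an assumed SQLite storage file, e.g. one created via --storage sqlite:///optuna.db, the study names can be listed with the Python standard library. The file path and table layout here are assumptions based on that default schema; with Optuna installed, its own helpers such as get_all_study_summaries can be used instead:)

```python
import sqlite3

def list_study_names(db_path):
    """List study names from an Optuna SQLite storage file.

    Assumes Optuna's default RDB schema, which stores one row per study
    in a table named `studies` with a `study_name` column.
    """
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute("SELECT study_name FROM studies").fetchall()
    return [name for (name,) in rows]

# e.g. list_study_names("optuna.db")
```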

For more information about allennlp-optuna, please refer to the allennlp guide.
(ref. https://guide.allennlp.org/hyperparameter-optimization)
