Comments (9)

ayulockin commented on August 15, 2024

@David-Biggs once I have my colab ready, I will share it here.

from examples.

David-Biggs commented on August 15, 2024

Hey @ayulockin

Thanks a lot!

I will try this out ASAP ;)

ayulockin commented on August 15, 2024

Hey @David-Biggs, thanks for the request. I will try to come up with something and let you know when I have an example.

Not sure if you are aware of this PR, recently merged into the dev branch of MMDetection, that provides some dedicated W&B support for MMDetection.

For sweeps it might require some workaround but a proper integration would be better. I will try to scope it out.

David-Biggs commented on August 15, 2024

Hey @ayulockin,

Great, thanks so much!

Looking forward to hearing from you.

Many thanks

ayulockin commented on August 15, 2024

Hey @David-Biggs, after thinking about this for some time, here's a rough solution for you if you wanna take a stab:

  • W&B requires a sweep config and a train function. You can see both in this intro to sweeps colab:
  • A sweep config is just a dict describing the hyperparameter space you wanna search over/tune. Below is an example sweep config:
import wandb
sweep_config = {
  "name" : "my-sweep",
  "method" : "random",
  "parameters" : {
    "epochs" : {
      "values" : [10, 20, 50]
    },
    "learning_rate" :{
      "min": 0.0001,
      "max": 0.1
    }
  }
}
  • You will then generate a sweep id by doing this: sweep_id = wandb.sweep(sweep_config).
  • The train function looks something like this:
def train():
    with wandb.init() as run:
        config = wandb.config  # values chosen by the sweep for this run
        model = make_model(config)  # placeholder for your own model-building code
        for epoch in range(config["epochs"]):
            loss = model.fit()  # your model training code here
            wandb.log({"loss": loss, "epoch": epoch})
  • The train function will have access to wandb.config. Its values come from sweep_config (for a hyperparameter given as a range, the value is selected according to the search method (grid, random, etc.)).
  • MMDetection also has a train_detector function that you can call from the train function. The interesting bit would be managing the MMDetection config.
  • You can load the MMDetection config inside the train function and update the required fields using wandb.config. Something like this:
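from mmcv import Config  # assumes MMDetection 2.x, where Config comes from mmcv
from mmdet.apis import train_detector

def train():
    with wandb.init() as run:
        config = wandb.config
        model = make_model(config)  # placeholder: build your detector here
        # MMDetection config
        config_file = 'mmdetection/configs/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_1x_coco.py'
        cfg = Config.fromfile(config_file)
        # Override the learning rate with the value chosen by the sweep
        cfg.optimizer.lr = config.learning_rate
        # datasets and meta are placeholders that you need to define yourself
        train_detector(model, datasets, cfg, distributed=False, validate=True, meta=meta)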

It should work. I will also try to write a colab and share with you.
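The steps above can be pulled together roughly as follows. This is only a sketch under stated assumptions: `apply_sweep_overrides`, `launch`, and the dotted-key convention are hypothetical helpers, not part of W&B or MMDetection, and a plain nested dict with illustrative values stands in for the MMDetection config so the snippet runs without either library installed. `launch` relies on `wandb.sweep` and `wandb.agent`, so calling it needs W&B installed and a logged-in account.

```python
import copy


def apply_sweep_overrides(cfg, overrides):
    """Return a copy of a nested dict with dotted-key overrides applied.

    Hypothetical helper: {"optimizer.lr": 0.01} sets cfg["optimizer"]["lr"].
    """
    new_cfg = copy.deepcopy(cfg)
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = new_cfg
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return new_cfg


def launch(sweep_config, train_fn, count=5):
    """Create a sweep and run `count` trials of train_fn in this process."""
    import wandb  # imported lazily so the helper above works without W&B

    sweep_id = wandb.sweep(sweep_config)
    wandb.agent(sweep_id, function=train_fn, count=count)


# Stand-in for the relevant part of an MMDetection config (illustrative values).
base_cfg = {"optimizer": {"type": "SGD", "lr": 0.02}, "runner": {"max_epochs": 12}}
trial_cfg = apply_sweep_overrides(
    base_cfg, {"optimizer.lr": 0.001, "runner.max_epochs": 10}
)
print(trial_cfg["optimizer"]["lr"])
```

Copying the config before applying overrides keeps each trial independent, which matters when one process runs several trials back to back.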

David-Biggs commented on August 15, 2024

Hi @ayulockin,

So I've been working on it for a while and I made some alterations and have some interesting observations.

Alterations:

  1. In train() you use model = make_model(config). I removed this and used model = build_detector(cfg.model) (from MMDetection), since I wasn't sure which one to use. My thinking was to use the values from the 'current' sweep config to update cfg, e.g. cfg.optimizer.lr = config.learning_rate, and then pass cfg.model into build_detector().
  2. I removed meta=meta. The default for meta is None and I wasn't sure where you defined your meta variable.

This works... sort of. The model trains and the sweep loops over the different values, but:

Observations:

  1. My training losses are all NaN.
  2. During training, I get this error: "The testing results of the whole dataset is empty". There are no validation results (neither mAP nor losses).

I removed all sweep related code and ran the MMDetection train_detector(model, datasets, cfg, distributed=False, validate=True) command and it worked perfectly fine. Losses were real values and I got validation results. I did some digging but could not resolve either of the two issues.
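One way to narrow down an all-NaN run like this is to fail fast on the first non-finite loss instead of letting the trial run to the end. The snippet below is a minimal sketch, assuming the per-step losses are available as plain floats; `check_finite` is a hypothetical helper, not an MMDetection or W&B API, and the loss names are only illustrative.

```python
import math


def check_finite(losses, step):
    """Raise if any loss value is NaN or infinite.

    `losses` is a dict of name -> float, shaped like what you would
    pass to wandb.log.
    """
    bad = {name: value for name, value in losses.items() if not math.isfinite(value)}
    if bad:
        raise FloatingPointError(f"non-finite loss at step {step}: {sorted(bad)}")
    return losses


# A healthy step passes through unchanged.
check_finite({"loss_cls": 0.7, "loss_bbox": 0.3}, step=0)

# A step with a NaN loss stops the trial and names the offending loss.
try:
    check_finite({"loss_cls": float("nan"), "loss_bbox": 0.3}, step=1)
except FloatingPointError as err:
    print(err)
```

Stopping at the first bad step shows how early the loss diverges, which helps separate a hyperparameter problem from a data or config problem.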

ayulockin commented on August 15, 2024

Thanks for trying it out @David-Biggs.

Sorry, I should have clarified that make_model was pseudocode and not an actual API. Glad it worked (sort of).

Were you able to resolve the NaN loss issue?

David-Biggs commented on August 15, 2024

Hi @ayulockin,

So I found that the cause of the NaN losses was the learning rate. All the values I chose were slightly too large. After reducing them I was able to run a successful sweep.

Thanks again for your help.
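For anyone hitting the same problem, one option is to search the learning rate on a log scale over a smaller range. The sketch below assumes a W&B version that supports the `log_uniform_values` distribution in sweep configs. The bounds are illustrative only, since the thread does not say which values worked, and `sample_log_uniform` is a hypothetical stdlib illustration of log-uniform sampling, not W&B's implementation.

```python
import math
import random

# Illustrative bounds only; tune them for your model and batch size.
sweep_config = {
    "name": "my-sweep",
    "method": "random",
    "parameters": {
        "epochs": {"values": [10, 20, 50]},
        "learning_rate": {
            "distribution": "log_uniform_values",
            "min": 1e-5,
            "max": 1e-2,
        },
    },
}


def sample_log_uniform(low, high, rng):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)
lr_spec = sweep_config["parameters"]["learning_rate"]
samples = [sample_log_uniform(lr_spec["min"], lr_spec["max"], rng) for _ in range(5)]
print(samples)
```

Sampling on a log scale spends as many trials between 1e-5 and 1e-4 as between 1e-3 and 1e-2, so small learning rates are not crowded out the way they are with a plain uniform range.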

ayulockin commented on August 15, 2024

Glad it worked for you. 👯‍♂️

Closing the issue since it's resolved.
