kamisoel commented on May 12, 2024

The problem seems to be in the way the parameters are grouped. For fastai.vision models, learn.opt.param_lists returns 3 parameter lists, and freeze() deactivates gradients for the first two. In tsai models all parameters end up in a single list, so freeze() operates on an empty slice and does nothing.

## Pseudocode for fastai's freeze implementation
def freeze():
  freeze_to(-1)

def freeze_to(n):
  # freeze the first n parameter groups; n=-1 leaves only the last group trainable
  frozen_idx = n if n >= 0 else len(learn.opt.param_lists) + n
  for p in learn.opt.all_params(slice(None, frozen_idx)):
    p.requires_grad = False
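
You can see the difference directly (a quick check, assuming `learn` is an existing Learner):

# compare how the optimizer groups the parameters
learn.create_opt()                 # make sure learn.opt exists
print(len(learn.opt.param_lists))  # 3 for fastai.vision models, 1 for tsai models
# with a single group, freeze_to(-1) ends up slicing an empty range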

EDIT:
I think I found the source of the problem. fastai.vision creates its models as a Sequential of (init, body, head), so the optimizer knows which parameter groups to freeze (init and body) and which to leave trainable (the head). At least that seems to be the case from fastai.vision.learner.
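
If that's the case, passing a custom splitter to the Learner might be the proper fix. A rough sketch (the `ts_splitter` name and the `backbone`/`head` attributes are my own assumptions, not existing tsai code):

from fastai.torch_core import params
from fastai.learner import Learner

def ts_splitter(model):
  # put backbone and head into separate parameter groups so freeze() can target them
  return [params(model.backbone), params(model.head)]

learn = Learner(dls, model, splitter=ts_splitter)  # dls/model: your data and model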

kamisoel commented on May 12, 2024

My workaround for now, in case anyone else wants to use TSBERT / fine-tuning:

def freeze(learn):
  "Freeze all parameters except those of the model head"
  assert hasattr(learn.model, "head"), "you can only use this with models that have a .head attribute"
  for p in learn.model.parameters():
    p.requires_grad = False
  for p in learn.model.head.parameters():
    p.requires_grad = True

def unfreeze(learn):
  "Make all model parameters trainable again"
  for p in learn.model.parameters():
    p.requires_grad = True

def fine_tune(learn, epochs, base_lr=2e-3, freeze_epochs=1, lr_mult=100,
              pct_start=0.3, div=5.0, **kwargs):
  "Fine tune with `freeze` for `freeze_epochs` then with `unfreeze` from `epochs` using discriminative LR"
  freeze(learn)
  learn.fit_one_cycle(freeze_epochs, slice(base_lr), pct_start=0.99, **kwargs)
  base_lr /= 2
  unfreeze(learn)
  learn.fit_one_cycle(epochs, slice(base_lr/lr_mult, base_lr), pct_start=pct_start, div=div, **kwargs)
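
With that, fine-tuning a pre-trained model looks like this (hypothetical usage; the Learner and data setup are omitted):

# learn wraps a model with pre-trained TSBERT weights already loaded
fine_tune(learn, epochs=10, freeze_epochs=2, base_lr=2e-3)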

oguiza commented on May 12, 2024

Hi @kamisoel,
Thanks for raising this issue. It's important to fix it now that we have a way to pretrain models. I'll look into it to check what needs to be updated in tsai archs to support fine-tuning.
In the meantime, have you seen any difference in performance when using your workaround?

kamisoel commented on May 12, 2024

Hi @oguiza,
Perfect, I hope my debugging work is of use for this :)
My workaround seems to work quite well and should have more or less the same performance, since it's pretty close to the fastai implementation. It's just less flexible in splitting the model into head and body, which shouldn't be a huge problem for TSBERT, since it has the same restriction (it only works for models with a head attribute).

oguiza commented on May 12, 2024

Hi @kamisoel,
It's taken me a while, but I've already fixed this issue.
From now on, all models that have 'Plus' in their name will be able to use pre-trained weights and be fine-tuned.
Unlike vision models, where the parameters are split into 3 groups, time series models have only 2 parameter groups (backbone and head). Vision models need 3 groups because the initial layers often have to be trained as well (especially when the number of input filters differs from 3).
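
For example, this is a minimal check of the new behavior (assuming a standard tsai setup with `ts_learner`; any 'Plus' model should behave the same):

learn = ts_learner(dls, InceptionTimePlus)
learn.freeze()
print(len(learn.opt.param_lists))                                       # -> 2 (backbone, head)
print(any(p.requires_grad for p in learn.model.backbone.parameters()))  # -> False
print(all(p.requires_grad for p in learn.model.head.parameters()))      # -> True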
Based on this I've re-run the TSBERT tutorial, and the results are practically identical. So there was no benefit in fine-tuning the model in this particular case.
It'd be good if you could test the change to make sure everything is working as expected.
Thanks again for raising this issue!

oguiza commented on May 12, 2024

I will close this issue due to lack of response. If the issue persists, please feel free to re-open.

kamisoel commented on May 12, 2024

Hi @oguiza,
Thanks for the fast fix - and sorry for my lack of response ^^' The change seems to work just fine! 👍

Just another small request: would it be possible to enable pre-training for the XCM model as well? It already has a separate head and works with TSBERT.

oguiza commented on May 12, 2024

Hi @kamisoel, I'm glad to hear the issue is now fixed.
As to your 2nd request, I've already uploaded a new XCMPlus model that you can pre-train. I haven't fully tested it, but it has the same structure as the rest, so I think it should work. It's already on GitHub, and I'll create a new pip release shortly (probably later today or tomorrow).
If you try it, please let me know if it works well.
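
A quick way to smoke-test it (a sketch; the module path and constructor arguments are assumptions and may need adjusting to your data):

from tsai.models.XCMPlus import XCMPlus

# c_in = number of variables, c_out = number of classes, seq_len = number of time steps
model = XCMPlus(c_in=3, c_out=2, seq_len=100)  # placeholder values
assert hasattr(model, 'backbone') and hasattr(model, 'head')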
