
Comments (8)

sileod commented on August 19, 2024

Sentence similarity is already supported: just use the tn.Classification template with a float y, and it should work off the shelf.
This code shows how to specialize the encoder for one of the training tasks:

def load_pipeline(model_name, task_name, adapt_task_embedding=True, multilingual=False):
    ...  # rest of the snippet truncated in the source

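As a minimal, stdlib-only sketch of the float-label convention described above (the class here is hypothetical and only mimics the behaviour; it is not tasknet's internals):

```python
# Hypothetical sketch: how a single "Classification" template can serve
# sentence similarity off the shelf when the labels are floats.
from dataclasses import dataclass

@dataclass
class ClassificationTemplate:
    """Picks a head type from the label dtype: float y -> regression head
    (similarity scores), int y -> ordinary classification."""
    labels: list

    @property
    def problem_type(self) -> str:
        if all(isinstance(y, float) for y in self.labels):
            return "regression"
        return "single_label_classification"

    @property
    def num_labels(self) -> int:
        # A regression head has one output; a classifier has one per class.
        if self.problem_type == "regression":
            return 1
        return len(set(self.labels))

sts = ClassificationTemplate(labels=[0.8, 0.1, 0.55])  # similarity scores
nli = ClassificationTemplate(labels=[0, 1, 2, 1])      # class ids
print(sts.problem_type, sts.num_labels)  # regression 1
print(nli.problem_type, nli.num_labels)  # single_label_classification 3
```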

sileod commented on August 19, 2024

Hi!
The library uses a shared encoder + "adapters" (task embeddings + task heads, e.g. classifiers), and it saves both the shared encoder and the adapters.

Currently, if you want to start again, you should load the saved encoder and fill in the adapter weights one by one with a loop.

Training is multi-task, but the model is typically used single-task. What is your use case?
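The "load the saved encoder, then fill in the adapter weights with a loop" workflow can be sketched with plain Python objects standing in for state dicts (all names here are illustrative, not tasknet's real API):

```python
# Stdlib-only sketch of the scheme described above: one shared encoder
# plus per-task adapters (task embedding + head), restored in a loop.

class Encoder:
    def __init__(self, weights):
        self.weights = weights  # shared across every task

class Adapter:
    def __init__(self, task_embedding, head):
        self.task_embedding = task_embedding
        self.head = head

# Pretend these were saved to disk after multi-task training.
saved_encoder = {"layer.0": [0.1, 0.2]}
saved_adapters = {
    "nli": {"task_embedding": [1.0], "head": [0.3]},
    "sts": {"task_embedding": [2.0], "head": [0.7]},
}

# Restore: load the encoder once, then fill in adapters with a loop.
encoder = Encoder(saved_encoder)
models = {}
for task_name, state in saved_adapters.items():
    adapter = Adapter(state["task_embedding"], state["head"])
    models[task_name] = (encoder, adapter)  # every task reuses `encoder`

# The encoder object is shared, so its weights exist only once in memory.
assert models["nli"][0] is models["sts"][0]
```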


thirsima commented on August 19, 2024

Thanks! I will try loading the encoder and adapters separately.

Eventually my use case will be to train a model that can do both sentence similarity and token classification, but at the moment I am just trying to find a multi-task training module that works without problems. So far tasknet looks the most promising.

I guess tasknet does not support sentence similarity at the moment, but looking at the currently supported task implementations, it should not be too hard to add.


thirsima commented on August 19, 2024

To clarify the use case: I eventually want to implement a microservice that loads the trained encoder and trained adapters from local files, so that the encoder is shared between the 2 tasks.


thirsima commented on August 19, 2024

Currently, if I call trainer.save_model(task_index) for 4 tasks, 4 different copies of the encoder are saved to disk and the files seem to have differences. And if I use load_pipeline() for all 4 tasks, I have 4 copies of the encoder in memory.

Is it possible to load the 4 tasks so that the encoder would be shared again? My aim is to avoid excessive memory consumption when I have multiple tasks that could use a shared encoder.

tasknet.Model.__init__() seems to have a warm_start parameter. Would it be feasible to first load the encoder from one task, and then warm-start tasknet.Model with that encoder?

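The warm-start idea raised above can be sketched stdlib-only (the `Model` class and its `warm_start_encoder` parameter here are hypothetical stand-ins, not tasknet's actual constructor):

```python
# Sketch: load the encoder from one task's checkpoint, then build a fresh
# multi-task model around it instead of keeping four encoder copies.

class Model:
    """Toy model: reuses an existing encoder when given one (warm start),
    otherwise creates a new one."""
    def __init__(self, tasks, warm_start_encoder=None):
        if warm_start_encoder is not None:
            self.encoder = warm_start_encoder  # aliased, not copied
        else:
            self.encoder = {"layer.0": [0.0]}
        self.heads = {t: object() for t in tasks}  # one head per task

# "Load" the encoder from the first task's saved pipeline...
encoder_from_task0 = {"layer.0": [0.1, 0.2]}

# ...and warm-start one model that serves all four tasks with it.
model = Model(tasks=["t0", "t1", "t2", "t3"],
              warm_start_encoder=encoder_from_task0)

assert model.encoder is encoder_from_task0  # no extra encoder copy
assert len(model.heads) == 4
```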

sileod commented on August 19, 2024

Currently, when the model is saved, it saves a single encoder + a set of adapters.
The adapter class is actually a collection of adapters; I'll try to clarify this, thanks.

Then, you can load the single encoder and the set of adapters, and use
model = adapter.adapt_model_to_task(model, task_name)
So you should save once, then call adapt_model_to_task once for each task.
If you do:
model_t1 = adapter.adapt_model_to_task(model, task_name1)
model_t2 = adapter.adapt_model_to_task(model, task_name2)
you will have different model objects, but they will share the same weights.
You can have hundreds of models; if they use the same weights, it will not use much more memory than one (that's how I trained deberta-tasksource-nli on a single GPU with tasknet).
The main concern is how to address task embeddings properly.
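The weight-sharing point can be demonstrated with a stdlib-only sketch: two model objects whose parameters alias the same storage, so an update made through one is visible through the other. (The class is illustrative; the real call in tasknet is adapter.adapt_model_to_task.)

```python
# Illustration: adapting one model to two tasks yields two distinct model
# objects whose encoder weights alias the same list, so many task-specialised
# models cost little more memory than one.

class SharedWeightsModel:
    def __init__(self, weights, head):
        self.weights = weights  # aliased, never copied
        self.head = head        # task-specific part

shared = [0.1, 0.2, 0.3]  # the single encoder's weights
model_t1 = SharedWeightsModel(shared, head="task1-head")
model_t2 = SharedWeightsModel(shared, head="task2-head")

model_t1.weights[0] = 9.9          # e.g. a fine-tuning update
assert model_t2.weights[0] == 9.9  # visible through both objects
assert model_t1 is not model_t2    # yet they are distinct models
```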
