
nlp_scripts's Introduction

nlp_scripts's People

Contributors

dhavaltaunk08

nlp_scripts's Issues

The test score on Kaggle seems quite low

I modified the script, ran predictions on test.tsv, and submitted them to the Kaggle sentiment competition. The score is quite low, much lower than with BERT. How many epochs would work better? For BERT I used just 3 epochs, but increasing the number of epochs for this RoBERTa script does not seem to help.

How to load the fine-tuned model after saving it locally?

I noticed that the fine-tuned RoBERTa script saves the fine-tuned model locally with:

model_to_save = model
torch.save(model_to_save, output_model_file)
tokenizer.save_vocabulary(output_vocab_file)

How do I load this model from the local folder?
I tried this:

from transformers import RobertaModel, RobertaTokenizer
# from the local folder
roberta_model = RobertaModel.from_pretrained("./")
roberta_tokenizer = RobertaTokenizer.from_pretrained('./', truncation=True, do_lower_case=True)

But it failed with this error: OSError: does not appear to have a file named config.json. Checkout 'https://huggingface.co/.//main' for available files.
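
The `OSError` is consistent with how the model was saved: `from_pretrained` expects a directory written by `save_pretrained` (which includes `config.json`), whereas `torch.save(model, path)` writes a single pickled object. The following is a minimal sketch of the matching load call, not the repository's own code. It assumes a recent PyTorch release that accepts the `weights_only` argument; a small `nn.Linear` stands in for the fine-tuned model, and the file name is hypothetical.

```python
import os
import tempfile

import torch
from torch import nn

# Stand-in for the fine-tuned model (assumption: any nn.Module is saved the same way).
model = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp_dir:
    output_model_file = os.path.join(tmp_dir, "pytorch_model.bin")  # hypothetical file name

    # Same style of save call as in the script: pickles the entire module object.
    torch.save(model, output_model_file)

    # Matching load call. weights_only=False is needed on recent PyTorch releases
    # to unpickle a full module; only use it on files you trust.
    loaded_model = torch.load(output_model_file, map_location="cpu", weights_only=False)

loaded_model.eval()  # switch to evaluation mode before predicting
same_weights = torch.equal(model.weight, loaded_model.weight)
print(same_weights)
```

When the saved object is an instance of a custom class, that class definition has to be importable in the process that calls `torch.load`. If the model is a plain `transformers` model, saving with `model.save_pretrained(folder)` and `tokenizer.save_pretrained(folder)` instead produces the files that `from_pretrained(folder)` looks for.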

Multi-GPU support

The current code does not support multi-GPU environments.

I parallelized the model using torch.nn.DataParallel(model).cuda()

I got this error when I tried it:
16032841374066409239620202407970
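
For reference, here is a minimal sketch of how `torch.nn.DataParallel` wrapping is commonly set up. It does not diagnose the specific error reported above. A small `nn.Linear` stands in for the actual model (an assumption), and the code falls back to a single device when fewer than two GPUs are visible.

```python
import torch
from torch import nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in model (assumption); move it to the primary device first.
model = nn.Linear(8, 3).to(device)

# Wrap only when more than one GPU is available.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)

# Inputs go to the primary device; DataParallel splits the batch across GPUs.
inputs = torch.randn(16, 8, device=device)
outputs = model(inputs)

# The wrapped network lives under .module; unwrap it before saving so the
# saved object does not depend on DataParallel.
model_to_save = model.module if isinstance(model, nn.DataParallel) else model
print(outputs.shape)
```

One thing to check in a custom training loop is that every tensor passed to the model is on the primary device and has the batch dimension first, since `DataParallel` splits inputs along dimension 0.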

When should gradients be cleared?

Thank you. In the Transformers multilabel DistilBERT notebook, the function "train(epoch)" clears gradients after the model's forward pass and before the backward pass on the loss, and it calls "optimizer.zero_grad()" twice. Will this clear the gradients that the backward calculation needs? Should it instead look like this, clearing gradients only once, before the forward pass:

    optimizer.zero_grad()
    outputs = model(ids, mask, token_type_ids)

    loss = loss_fn(outputs, targets)
    if _%5000==0:
        print(f'Epoch: {epoch}, Loss:  {loss.item()}')

    loss.backward()
    optimizer.step()
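
As a rough check of the question above: `optimizer.zero_grad()` resets the `.grad` buffers of the parameters, not the graph built during the forward pass, so calling it anywhere after the previous `optimizer.step()` and before `loss.backward()` should give the same gradients. The sketch below compares the two placements on a toy `nn.Linear` model; the model, data, and helper name are assumptions made for illustration.

```python
import torch
from torch import nn


def grads_for_one_step(zero_grad_before_forward: bool) -> torch.Tensor:
    """Run one training step and return the weight gradient used by the step."""
    torch.manual_seed(0)  # identical initial weights for both runs
    model = nn.Linear(3, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.MSELoss()
    inputs = torch.ones(2, 3)
    targets = torch.zeros(2, 1)

    # Leave stale gradients behind, as a previous iteration would.
    loss_fn(model(inputs), targets).backward()

    if zero_grad_before_forward:
        optimizer.zero_grad()
    outputs = model(inputs)
    loss = loss_fn(outputs, targets)
    if not zero_grad_before_forward:
        optimizer.zero_grad()  # after forward, before backward

    loss.backward()
    grad = model.weight.grad.clone()
    optimizer.step()
    return grad


grad_before = grads_for_one_step(True)
grad_between = grads_for_one_step(False)
print(torch.allclose(grad_before, grad_between))
```

A single call per iteration is enough. What would break training is clearing gradients between `loss.backward()` and `optimizer.step()`, because the step would then have no gradients to apply.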

About inference

To make predictions, how do I load the model that was saved during training?
After loading the model, I would like to run inference on the test data.
Thank you :)
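
Once a model object is available in memory, a prediction pass is typically run in evaluation mode with gradient tracking disabled. The sketch below illustrates that pattern only: a small `nn.Linear` and random features stand in for the fine-tuned model and the tokenized test set (both assumptions), and `argmax` is used as for a single-label classifier.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Stand-ins (assumptions): a 3-class classifier and 10 test examples.
model = nn.Linear(8, 3)
test_features = torch.randn(10, 8)
test_loader = DataLoader(TensorDataset(test_features), batch_size=4)

model.eval()  # disables dropout and similar training-only behavior
predictions = []
with torch.no_grad():  # no gradients are needed for inference
    for (batch,) in test_loader:
        logits = model(batch)
        predictions.extend(logits.argmax(dim=1).tolist())

print(len(predictions))
```

For a multilabel model, the `argmax` line would usually be replaced by a sigmoid followed by a per-label threshold.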

Why text = " ".join(text.split()) in __getitem__ in Transformers_multilabel_distilbert?

The code I'm referring to is in cell 8 in https://github.com/DhavalTaunk08/NLP_scripts/blob/master/Transformers_multilabel_distilbert.ipynb
I tried printing the text before and after this line, and I saw no difference.

print(text)
print(repr(text))
text = " ".join(text.split())
print(text)
print(repr(text))

Output:

proving there is no cure for stupidity.
'proving there is no cure for stupidity.'
proving there is no cure for stupidity.
'proving there is no cure for stupidity.'

Maybe it makes a difference in certain cases? Or can I just remove this line?
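
A short sketch of the cases where the line does change the text: `str.split()` with no arguments splits on any run of whitespace and drops leading and trailing whitespace, so joining with a single space normalizes tabs, newlines, and repeated spaces. The helper name and sample strings below are made up for illustration.

```python
def normalize_whitespace(text: str) -> str:
    """Collapse every run of whitespace into one space and trim both ends."""
    return " ".join(text.split())


samples = [
    "proving there is no cure for stupidity.",  # already clean: unchanged
    "  leading and trailing  ",
    "tabs\tand\nnewlines\r\nhere",
    "double  spaces   inside",
]

for sample in samples:
    print(repr(sample), "->", repr(normalize_whitespace(sample)))
```

Text that is already single-spaced, like the example in the question, passes through unchanged, so removing the line is only safe if the whole dataset is known to be clean.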
