Comments (3)
It's strange. Both use the same tokenizer.
Thank you for your quick response! The code with my dataset works with the bertweet-base model and the bertweet-base tokenizer, but it keeps giving me this index error when I use the tokenizer from bertweet-covid19-base-uncased. I guess I will use the bertweet-base tokenizer with the bertweet-covid19-base-uncased model after confirming that it is the same tokenizer.
Yeah, you can simply diff the tokenizer files (vocab.txt, bpe.codes, tokenizer.json). They're the same.
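If you would rather check this from Python than with a command-line diff, here is a minimal sketch. It assumes you have already downloaded both tokenizers' files into two local directories; the function names and the directory layout are illustrative and not part of BERTweet or any library.

```python
import hashlib
from pathlib import Path
from typing import Dict, Optional, Sequence

# File names taken from the comment above; adjust if your checkpoint ships others.
TOKENIZER_FILES = ("vocab.txt", "bpe.codes", "tokenizer.json")


def file_digest(path: Path) -> Optional[str]:
    """Return the SHA-256 hex digest of a file, or None if the file is missing."""
    if not path.is_file():
        return None
    return hashlib.sha256(path.read_bytes()).hexdigest()


def compare_tokenizer_dirs(
    dir_a: str, dir_b: str, names: Sequence[str] = TOKENIZER_FILES
) -> Dict[str, Optional[bool]]:
    """Compare same-named files in two directories.

    Each file name maps to True (byte-identical), False (different),
    or None (missing from at least one directory).
    """
    result: Dict[str, Optional[bool]] = {}
    for name in names:
        digest_a = file_digest(Path(dir_a) / name)
        digest_b = file_digest(Path(dir_b) / name)
        if digest_a is None or digest_b is None:
            result[name] = None
        else:
            result[name] = digest_a == digest_b
    return result
```

Call it as `compare_tokenizer_dirs("path/to/bertweet-base", "path/to/bertweet-covid19-base-uncased")`, where both paths are placeholders for wherever you saved the files. A value of `None` just means that file was not present in one of the directories, which is expected if a checkpoint does not ship all three.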