crux82 / ganbert
Enhancing the BERT training with Semi-supervised Generative Adversarial Networks
License: Apache License 2.0
Hi,
Thanks for the amazing research and the code implementation. Is there any way to use this package for multi-label intent classification?
The README shows example outputs in which, for each model (BERT, GANBERT), the values of eval_accuracy, eval_f1_micro, eval_precision, and eval_recall are all identical. For example, in the case of BERT:
eval_accuracy = 0.136
eval_f1_macro = 0.010410878
eval_f1_micro = 0.136
eval_loss = 3.7638452
eval_precision = 0.136
eval_recall = 0.136
I ran your model with sh run_experiment.sh and got numerically different results, but the same equality across eval_accuracy, eval_f1_micro, eval_precision, and eval_recall persists within each model. For example, for GANBERT I get:
eval_accuracy = 0.514
eval_f1_macro = 0.15001474
eval_f1_micro = 0.514
eval_loss = 2.1689985
eval_precision = 0.514
eval_recall = 0.514
global_step = 276
loss = 5.6168394
Is it because you're micro averaging and therefore "micro-F1 = micro-precision = micro-recall = accuracy"?
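For reference, this identity can be checked directly. In single-label multi-class evaluation, micro-averaging pools true positives, false positives, and false negatives over all classes; since every wrong prediction contributes exactly one false positive (for the predicted class) and one false negative (for the true class), micro-precision, micro-recall, micro-F1, and accuracy all collapse to the same number. A small self-contained sketch:

```python
def micro_metrics(y_true, y_pred):
    """Micro-averaged metrics for single-label multi-class predictions."""
    tp = sum(t == p for t, p in zip(y_true, y_pred))
    # Each error is exactly one FP and one FN when every example has
    # one true label and one predicted label, so FP_total == FN_total.
    errors = len(y_true) - tp
    precision = tp / (tp + errors)            # == tp / N == accuracy
    recall = tp / (tp + errors)               # == tp / N == accuracy
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp / len(y_true)
    return accuracy, precision, recall, f1

# 3 correct out of 5 -> all four metrics are 0.6
acc, p, r, f1 = micro_metrics([0, 1, 2, 2, 1], [0, 2, 2, 0, 1])
```

This is exactly why eval_accuracy, eval_f1_micro, eval_precision, and eval_recall coincide in the logs, while eval_f1_macro (an unweighted average of per-class F1 scores) differs.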
It's a mighty job done; I do appreciate all the effort. However, I just want to know: I want to run the code on Google Colab, and Google Colab supports only TensorFlow 2.x, while all the code is written for TensorFlow 1.x and sometimes uses 'contrib', which cannot be converted automatically to TensorFlow 2.
So what I want to ask is: is the GANBert code available for a TensorFlow 2.x version?
My apologies if I asked this in the wrong place.
Kind regards.
Using the data folder, I tried to train and evaluate GAN-BERT; the eval result is:
eval_accuracy = 0.516
eval_f1_macro = 0.1780866
eval_f1_micro = 0.516
eval_loss = 2.249187
eval_precision = 0.516
eval_recall = 0.516
global_step = 276
loss = 5.5095177
But when I try to predict on test.txt, the accuracy is low, maybe 0.1 (I set do_predict=true and LABEL_RATE="1").
How should I run prediction?
Hi. Thanks for the great repo. I am running GAN-BERT on my data, which is a binary classification task with labels 0 and 1. However, label_maps is causing issues when running on my data. I manually set the label_list and changed line 127 in data_processors.py, but I am still getting errors. Would you please advise on how to apply the code to other sequence classification tasks? Thank you.
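For anyone adapting the code to a new task: a minimal sketch of what a custom binary processor could look like, assuming the BERT-style DataProcessor convention of a get_labels() method returning label strings and a label map built by enumeration. The class and function names here are illustrative, not the repo's exact API:

```python
class BinaryTaskProcessor:
    """Illustrative processor for a two-class task; follows the
    BERT-style DataProcessor convention of string labels."""

    def get_labels(self):
        # The label strings must match exactly what appears in the data files.
        return ["0", "1"]

def build_label_map(label_list):
    # Map each label string to an integer id, in the same way BERT's
    # example-conversion code enumerates the label list.
    return {label: i for i, label in enumerate(label_list)}

label_map = build_label_map(BinaryTaskProcessor().get_labels())
# label_map == {"0": 0, "1": 1}
```

A common source of errors when porting to a new task is a mismatch between the strings in get_labels() and the strings actually present in the training/eval files, so that is worth checking first.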
I have a small dataset in French that I need to classify into three different classes. Since the data (sentences) are really small, is using this type of system really helpful, and if I use it, is it difficult to adapt for French classification?
Hi @crux82 ,
I have run GANBERT on a variety of datasets and the results are very promising. So, first of all, thank you for this amazing paper and repository.
For further validation of the results, I would also like to fetch the final predicted labels as a .txt or .csv file from the model.
I have tried modifying ganbert.py to save the "predictions" from line 531, but this is a Tensor and its evaluation is not as simple as predictions.eval(). I have tried many different methods but have not been able to extract the predicted labels.
Can you please guide me on how I can extract the predicted labels from the model?
Thanks and Regards,
Lakshay.
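For anyone with the same need: with a TF Estimator, estimator.predict(input_fn=...) yields plain NumPy values rather than graph Tensors, so the results can be iterated over and written to a file directly, without calling .eval(). A generic sketch of the write-out step (the iterable of class ids and the label list are placeholders standing in for the model's actual output, not ganbert.py's exact names):

```python
import csv

def save_predictions(predicted, label_list, out_path):
    """Write one predicted label per row; `predicted` is any iterable
    of integer class ids (e.g. what estimator.predict would yield)."""
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["index", "predicted_label"])
        for i, class_id in enumerate(predicted):
            writer.writerow([i, label_list[class_id]])

# Example with dummy class ids standing in for the model's output:
save_predictions([0, 2, 1], ["neg", "neu", "pos"], "predictions.csv")
```

The key point is to consume the prediction generator inside the driver script (after estimator training), where values are concrete NumPy arrays, rather than trying to evaluate the Tensor inside model_fn.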
Hi, in the discriminator code: logit = tf.layers.dense(layer_hidden, (num_labels + 1)). Why num_labels + 1?
############ Defining Discriminator ###########
def discriminator(x, d_hidden_size, dkp, is_training, num_labels, num_hidden_discriminator=1, reuse=False):
    with tf.compat.v1.variable_scope('Discriminator', reuse=reuse):
        layer_hidden = tf.nn.dropout(x, keep_prob=dkp)
        for i in range(num_hidden_discriminator):
            layer_hidden = tf.layers.dense(layer_hidden, d_hidden_size)
            layer_hidden = tf.nn.leaky_relu(layer_hidden)
            layer_hidden = tf.nn.dropout(layer_hidden, keep_prob=dkp)
        flatten5 = layer_hidden
        # num_labels real classes plus one extra class for generated ("fake") examples
        logit = tf.layers.dense(layer_hidden, (num_labels + 1))
        prob = tf.nn.softmax(logit)
        return flatten5, logit, prob
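In the GAN-BERT setup, the discriminator predicts over num_labels + 1 classes: the k real task categories plus one additional class reserved for examples produced by the generator. The supervised loss uses only the first k outputs, while the last output drives the real-vs-fake adversarial loss. A toy illustration of how the (k+1)-way softmax splits:

```python
import math

def softmax(logits):
    # Numerically stable softmax over a flat list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

num_labels = 3                       # k real task classes
logits = [2.0, 0.5, 0.1, -1.0]       # num_labels + 1 discriminator outputs
prob = softmax(logits)

real_class_probs = prob[:num_labels]  # used for the supervised loss
p_fake = prob[-1]                     # probability the input was generated
```

The example values here are arbitrary; the point is only that the final logit is the fake class, which is why the dense layer has num_labels + 1 units.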
Hi, excellent work!
But I am a little confused about [balance]: is it used to oversample instances?
And why do we need oversampling?
Line 280 in 32e9922
I only saw the training set and the test set in your data. Looking forward to your reply.
Interesting idea of using an adversarial method to leverage unlabeled data. I am trying to see how much unlabeled data can actually help.
In the plot below, I am comparing GanBert (orange), which trains on both labeled and unlabeled data, with a basic Bert+Classifier model (blue) that trains only on the 109 labeled examples of the TREC data.
The paper reports that the basic model should achieve around 40%, but I am getting 60%, which is very close to GanBert's result. Are you sure that the baseline discussed in the paper is a reasonable one?
Hey, I tried to use some other methods like word2vec, XLNet, and ERNIE, but when I run the last part of the code (the PyTorch version), it fails:
# Generate the output of the Discriminator for real and fake data.
# First, we put together the output of the tranformer and the generator
disciminator_input = torch.cat([hidden_states, gen_rep], dim=1)
RuntimeError: Tensors must have same number of dimensions: got 3 and 2
I checked the shapes below:
gen_rep.shape ----> torch.Size([64, 768])
hidden_states.shape ----> torch.Size([64, 64, 768])
torch.cat((gen_rep, hidden_states), dim=1)
If I run the BERT model it works, but when I try to use other NLP models like XLNet and ERNIE, the dimension issue appears. What can I do?
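The error message says the two tensors have different ranks: gen_rep is 2-D [batch, hidden] while the XLNet/ERNIE output is the full 3-D sequence [batch, seq_len, hidden]. BERT's pooled output is already 2-D, which is why BERT works. One possible fix (a sketch of the shape logic only, not tested against those models) is to reduce hidden_states to a single vector per example, e.g. its first-token slice, before concatenating; alternatively, gen_rep can be given a length-1 sequence axis. Illustrated with NumPy:

```python
import numpy as np

batch, seq_len, hidden = 64, 64, 768
hidden_states = np.zeros((batch, seq_len, hidden))  # 3-D transformer output
gen_rep = np.zeros((batch, hidden))                 # 2-D generator output

# Option A: collapse the sequence axis, e.g. keep the first-token vector,
# so both arrays are 2-D before concatenating along the batch axis.
cls_rep = hidden_states[:, 0, :]                          # (64, 768)
disc_in_a = np.concatenate([cls_rep, gen_rep], axis=0)    # (128, 768)

# Option B: give gen_rep a length-1 sequence axis instead.
gen_rep_3d = gen_rep[:, None, :]                          # (64, 1, 768)
disc_in_b = np.concatenate([hidden_states, gen_rep_3d], axis=1)  # (64, 65, 768)
```

Which option is appropriate depends on what representation the discriminator expects; Option A matches the common pattern of feeding it one fixed-size vector per real or generated example.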
Hello,
I would like to know if I can use ganbert to generate French data, and if so, how? Should I switch BERT to FlauBERT, the French version?
What would be the format of the data to run the model on a sentence-pair task like MNLI, mentioned in the paper? I would also like to try other sentence-pair tasks, so I was curious how I should format both the labelled and unlabelled data, and whether any changes to the code would be needed.
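One hypothetical approach, assuming the repo's single-sentence files are tab-separated with the label first and the text second: add the second sentence as a third tab-separated column, and adapt the processor to read it as text_b (this column layout and the adaptation are assumptions, not the repo's documented format):

```python
# Hypothetical line formatter for a sentence-pair file:
# label <TAB> premise <TAB> hypothesis. Unlabeled examples would keep
# the same shape, using whatever tag the repo reserves for unlabeled data.
def format_pair(label, premise, hypothesis):
    return f"{label}\t{premise}\t{hypothesis}"

line = format_pair("entailment", "A man is eating.", "Someone is eating.")
```

On the code side, the processor would then need to pass both columns through so that the tokenizer receives text_a and text_b and inserts the separator token between them, as standard BERT fine-tuning does for pair tasks.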