kevinscaria / instructabsa
Instructional learning for Aspect Based Sentiment Analysis
Home Page: https://arxiv.org/abs/2302.08624
License: MIT License
I tried to build datasets for Laptop and Rest14, and then conducted aspect term extraction experiments on Laptop. Different conditions were used to determine a true positive in the get_metrics function, and the results are as follows:

Complete (exact match):

    if pred_val.lower() == gt_val.lower():

Incomplete (partial/substring match):

    if pred_val.lower() in gt_val.lower() or gt_val.lower() in pred_val.lower():

Code (utils.py):

    for gt_val in gt_list:
        for pred_val in pred_list:
            # if pred_val.lower() == gt_val.lower() or gt_val.lower() == pred_val.lower():
            if pred_val.lower() in gt_val.lower() or gt_val.lower() in pred_val.lower():
                tp += 1
                break
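The difference between the two conditions is easy to see on a small example (a standalone sketch; the tp counting mirrors the loop above, and the aspect terms are made up):

```python
def count_tp(gt_list, pred_list, partial=False):
    """Count true positives; `partial` switches from exact to substring matching."""
    tp = 0
    for gt_val in gt_list:
        for pred_val in pred_list:
            if partial:
                match = (pred_val.lower() in gt_val.lower()
                         or gt_val.lower() in pred_val.lower())
            else:
                match = pred_val.lower() == gt_val.lower()
            if match:
                tp += 1
                break
    return tp

gt = ["battery life", "screen"]
pred = ["battery", "screen"]
print(count_tp(gt, pred))                # 1 (only "screen" matches exactly)
print(count_tp(gt, pred, partial=True))  # 2 ("battery" is a substring of "battery life")
```

So the substring condition systematically inflates tp whenever the model predicts a truncated aspect term, which explains why the two conditions give different scores.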
Hi @kevinscaria, is this released under an open source license? Would you be able to add a license declaration?
Thanks!
Hello, I was not able to reproduce your results since your data is not open-sourced. I have transformed the data following the format you requested in your README, as shown in the attached image.
However, the results I obtained were far from the results reported in your paper. Would it be possible for you to open-source the data or help point out if there is any mistake in the data format transformation that might have caused this issue?
Hi Kevin,
Such a great repo, thank you so much for the work. For this generative model, is there a parameter I can set to generate longer sequences rather than a single sentence? I found the model gives partial answers when I feed it a paragraph. I tried splitting the paragraph into sentences, but that is far too slow for API use. Do you have a better idea on how to do this?
Many thanks,
Bowen
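Two things may help here (hedged, since I have not checked this repo's wrappers): truncation is usually controlled by the `max_length` / `max_new_tokens` arguments of the Hugging Face `generate` call, and the per-sentence slowness can be reduced by batching all sentences into a single padded `generate` call instead of one call per sentence. A minimal sentence-splitting sketch (the regex splitter and names are my own assumptions):

```python
import re

def split_sentences(paragraph):
    """Naive sentence splitter: break after ., !, or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]

para = "The cab ride was amazing. The service was pricey!"
sentences = split_sentences(para)
print(sentences)  # ['The cab ride was amazing.', 'The service was pricey!']

# With an HF-style model, these sentences could then be tokenized with
# padding=True and run through one batched model.generate(..., max_new_tokens=...)
# call, rather than looping generate() over each sentence.
```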
Hello,
I have some confusion regarding the datasets provided in your Git repository. In your paper, you mentioned that the benchmarks are sourced from the original SemEval 14, 15, and 16. However, I noticed that you included Peng's datasets in your repository. While Peng's datasets are refined versions of the SemEval data, there are still some differences between the two. Therefore, I would like to know which resources your test results are based on. Thank you for your attention to this matter.
The model in T5Generator is AutoModelForSeq2SeqLM.
But the model in T5Classifier is T5ForConditionalGeneration.
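For T5 checkpoints the two are effectively the same class: AutoModelForSeq2SeqLM dispatches to T5ForConditionalGeneration. A quick check, assuming a recent transformers version (a tiny random-weight config is built locally so no download is needed):

```python
from transformers import AutoModelForSeq2SeqLM, T5Config, T5ForConditionalGeneration

# Tiny config so the model builds instantly with random weights (no download).
cfg = T5Config(vocab_size=64, d_model=32, d_kv=8, d_ff=64, num_layers=1, num_heads=2)
model = AutoModelForSeq2SeqLM.from_config(cfg)

print(type(model).__name__)  # T5ForConditionalGeneration
assert isinstance(model, T5ForConditionalGeneration)
```

So the difference between T5Generator and T5Classifier here is cosmetic rather than functional, as long as both load a T5 checkpoint.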
Thanks for the great work. I wonder whether you used any non-English datasets during training?
Hi! Great repo :). Would you mind adding instructions on how to make this work? For example, a requirements.txt file specifying which version each library should have, and the required Python version.
I can help if you don't know how to do it.
Thank you!
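A hypothetical sketch of such a file (the version pins below are my assumptions, not the repo's actual requirements; the torch floor follows the torch.tile report elsewhere in this thread):

```
# requirements.txt (hypothetical pins; replace with the versions actually used)
torch>=1.8.0          # torch.tile is only available from 1.8.0
transformers>=4.20.0
datasets>=2.0.0
pandas>=1.3.0
```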
Have you tried to use the instruction learning paradigm for Aspect Sentiment Triplet Extraction?
Hi. Thanks for sharing the code for your paper.
I'm trying to reproduce the training for the joint task, but this error appears:

    AttributeError: 'DatasetLoader' object has no attribute 'create_data_in_joint_task_format'

Is there any way to work around it with the other methods in the class?
Thanks!
    ValueError                                Traceback (most recent call last)
    <cell line: 4>()
          2 get_ipython().system('pip install -U accelerate')
          3 get_ipython().system('pip install -U transformers')
    ----> 4 model_trainer = t5_exp.train(id_tokenized_ds, **training_args)

    6 frames
    /usr/local/lib/python3.10/dist-packages/transformers/trainer.py in get_eval_dataloader(self, eval_dataset)
        886         """
        887         if eval_dataset is None and self.eval_dataset is None:
    --> 888             raise ValueError("Trainer: evaluation requires an eval_dataset.")
        889         eval_dataset = eval_dataset if eval_dataset is not None else self.eval_dataset
        890         data_collator = self.data_collator

    ValueError: Trainer: evaluation requires an eval_dataset.
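The guard raising here is simple (a pure-Python sketch of the check in transformers' trainer.py; the fix is to pass an eval dataset into the training call, or to disable evaluation in training_args):

```python
def get_eval_dataloader(eval_dataset=None, trainer_eval_dataset=None):
    """Sketch of the guard in transformers' Trainer.get_eval_dataloader."""
    if eval_dataset is None and trainer_eval_dataset is None:
        raise ValueError("Trainer: evaluation requires an eval_dataset.")
    return eval_dataset if eval_dataset is not None else trainer_eval_dataset

try:
    get_eval_dataloader()  # neither argument supplied -> the error above
except ValueError as e:
    print(e)  # Trainer: evaluation requires an eval_dataset.

print(get_eval_dataloader(eval_dataset=["example"]))  # ['example']
```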
Hello!
Could you share your dataset? I want to try to reproduce the experimental results in your paper. I would be very grateful!
Hi, I would like to know if you do any dataset processing in the joint task scenario. You mentioned in your paper that you have ignored the "conflict" label in ATSC task, so I want to know if you have deleted those sentences with "conflict" labels in the evaluation of the joint task. To be more specific, if the model did not predict the polarity as "conflict" in the joint task, would you ignore this wrong prediction? Because there are only examples for positive, neutral, and negative cases in the instruction, I was thinking about how you treat the "conflict" case in the joint task. Thank you.
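One common treatment (an assumption about the intended evaluation, not something the paper confirms) is to drop gold examples labeled "conflict" before scoring the joint task, so the model is never penalized for failing to predict a polarity it was never instructed about:

```python
examples = [
    {"aspect": "food", "polarity": "positive"},
    {"aspect": "waiter", "polarity": "conflict"},
    {"aspect": "price", "polarity": "negative"},
]

# Drop 'conflict' gold labels before computing joint-task metrics.
kept = [e for e in examples if e["polarity"] != "conflict"]
print([e["aspect"] for e in kept])  # ['food', 'price']
```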
I think there is a small separator mismatch between the create_data_in_joint_task_format function and the get_metrics function.

In create_data_in_joint_task_format, the labels are joined with ',':

    df['labels'] = df[aspect_col].apply(lambda x: ','.join([f"{i[key]}:{i[label_key]}" for i in x]))

In get_metrics, the ground truth is split on ', ':

    gt_list = gt.split(', ')

Can you help me check if there is a typo?
Many thanks,
Dan
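The mismatch is easy to demonstrate (a standalone sketch; the label strings are made up):

```python
labels = ["battery:positive", "screen:negative"]

joined = ','.join(labels)   # how create_data_in_joint_task_format joins
print(joined)               # battery:positive,screen:negative

print(joined.split(', '))   # ['battery:positive,screen:negative']  <- one item, not two!
print(joined.split(','))    # ['battery:positive', 'screen:negative']
```

Splitting on ', ' never fires because the joined string contains no space after the comma, so the whole label string is treated as a single ground-truth item.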
Does this mean that you used the test set as the dev set during training?
In my understanding, we should use a dev set different from the test set to assist training.
Since you didn't provide a dataset in your codebase, I'm not really sure.
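For reference, the usual practice is to carve the dev set out of the training data and leave the test set untouched (a minimal sketch using my own naming, not the repo's code):

```python
import random

def train_dev_split(data, dev_frac=0.1, seed=42):
    """Shuffle and carve a dev set out of the training data (test set untouched)."""
    rng = random.Random(seed)
    idx = list(range(len(data)))
    rng.shuffle(idx)
    n_dev = max(1, int(len(data) * dev_frac))
    dev = [data[i] for i in idx[:n_dev]]
    train = [data[i] for i in idx[n_dev:]]
    return train, dev

train, dev = train_dev_split(list(range(100)))
print(len(train), len(dev))  # 90 10
```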
Hello,
Using a Hugging Face pre-trained transformer, I am getting the output for
'The cab ride was amazing but the service was pricey' in the form of a Seq2SeqLMOutput. I want to convert this into (aspect, sentiment) form.
I tried using

    model.decode(tokenizer.decode(predicted_output.logits[0], skip_special_tokens=True))

but I am getting this error:

    argument 'ids': 'list' object cannot be interpreted as an integer
Google Colab link where I have implemented the model:
https://colab.research.google.com/drive/1gcHaM4ehqccX2zGIe8RbCeN6Q-hZh0bb?usp=sharing
Please help in this regard.
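The error occurs because tokenizer.decode expects integer token ids, while logits is a (sequence, vocab) float tensor. A hedged sketch of the usual fix (random logits stand in for the real model output; in practice model.generate(...) already returns ids directly):

```python
import torch

# Stand-in for predicted_output.logits[0]: one row of vocab scores per position.
logits = torch.randn(5, 120)

# Greedy decoding: take the most likely token id at each position.
token_ids = logits.argmax(dim=-1)
print(token_ids.shape, token_ids.dtype)  # torch.Size([5]) torch.int64

# These integer ids are what tokenizer.decode(token_ids, skip_special_tokens=True)
# expects; passing raw logits is what raises the 'list' object error above.
```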
Your requirements.txt states that the torch version should be over 1.3.
However, the torch.tile function was only added in 1.8.0.
Can you update the requirements?
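A quick way to check whether the installed torch is new enough (torch.tile exists only from 1.8.0, so this raises AttributeError on older versions):

```python
import torch

x = torch.tensor([1, 2])
y = torch.tile(x, (2,))  # AttributeError on torch < 1.8.0
print(y.tolist())        # [1, 2, 1, 2]
```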
This package is great, thanks!
Hey there. I am not sure how to load the model for inference. Is it possible that the model checkpoints are missing?
Greetings from Germany! :D