
tito-joker's People

Contributors

enzoampil

tito-joker's Issues

how to run

Hello everyone, I'm just confused about how to get this working. I ran setup.sh and tried all the commands, but it didn't work. How can I get it running?

How to measure the effectiveness of tag "controls" (e.g. sentiment)?

I was thinking that a simple methodology would be to generate sequences for sentiment spans and measure accuracy with some overlap measure, e.g. Jaccard or ROUGE (a toy sketch of such a metric is included after the dataset link below).

While imperfect, this would be an initial approach to communicating the effectiveness of using text generation controls derived from pre-trained supervised models.

Tweets would be a good start.

Potential dataset

https://www.kaggle.com/c/tweet-sentiment-extraction/data
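
A toy sketch of the overlap metric mentioned above (token-level Jaccard between a generated span and a gold span); the function and example strings are illustrative, not from the repo:

# Token-level Jaccard similarity between a generated span and a gold span.
def jaccard(generated: str, gold: str) -> float:
    a, b = set(generated.lower().split()), set(gold.lower().split())
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Example: compare a generated sentiment span against a labelled span,
# e.g. from the tweet-sentiment-extraction dataset.
print(jaccard("so excited for the weekend", "excited for the weekend"))  # 0.8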

Resurrect hosted version please. I got "ValueError: Connection error" when trying to run locally

Hi,
I wanted to play with the hosted version of the app, but the IP address hosting the website is down, and the pre-trained models are on a Google account that complains that billing has expired.

I tried running Jokes_GPT2_Finetuning2.ipynb and fixed a lot of "python module not found" errors, "file not found" errors, and other syntax errors, but when trying to run "Finetune the GPT2 model", I got this:

!python3 /Users/andrew/Downloads/tito-joker-master/experiments/transformers/examples/legacy/run_language_modeling.py \
--overwrite_output_dir \
--output_dir='./models/finetuned' \
--model_type=gpt2 \
--model_name_or_path=gpt2 \
--tokenizer_name="./models/pretrained" \
--do_train \
--train_data_file="./data/riddle_jokes.txt" \
--per_gpu_train_batch_size=4 \
--block_size=50


09/07/2022 15:55:11 - WARNING - __main__ - Process rank: -1, device: cpu, n_gpu: 0, distributed training: False, 16-bits training: False
09/07/2022 15:55:11 - INFO - __main__ - Training/evaluation parameters TrainingArguments(
_n_gpu=0,
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
bf16=False,
bf16_full_eval=False,
data_seed=None,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_pin_memory=True,
ddp_bucket_cap_mb=None,
ddp_find_unused_parameters=None,
debug=[],
deepspeed=None,
disable_tqdm=False,
do_eval=False,
do_predict=False,
do_train=True,
eval_accumulation_steps=None,
eval_delay=0,
eval_steps=None,
evaluation_strategy=IntervalStrategy.NO,
fp16=False,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
gradient_accumulation_steps=1,
gradient_checkpointing=False,
greater_is_better=None,
group_by_length=False,
half_precision_backend=auto,
hub_model_id=None,
hub_strategy=HubStrategy.EVERY_SAVE,
hub_token=<HUB_TOKEN>,
ignore_data_skip=False,
label_names=None,
label_smoothing_factor=0.0,
learning_rate=5e-05,
length_column_name=length,
load_best_model_at_end=False,
local_rank=-1,
log_level=-1,
log_level_replica=-1,
log_on_each_node=True,
logging_dir=./models/finetuned/runs/Sep07_15-55-11_voyager-528.local,
logging_first_step=False,
logging_nan_inf_filter=True,
logging_steps=500,
logging_strategy=IntervalStrategy.STEPS,
lr_scheduler_type=SchedulerType.LINEAR,
max_grad_norm=1.0,
max_steps=-1,
metric_for_best_model=None,
mp_parameters=,
no_cuda=False,
num_train_epochs=3.0,
optim=OptimizerNames.ADAMW_HF,
output_dir=./models/finetuned,
overwrite_output_dir=True,
past_index=-1,
per_device_eval_batch_size=8,
per_device_train_batch_size=8,
prediction_loss_only=False,
push_to_hub=False,
push_to_hub_model_id=None,
push_to_hub_organization=None,
push_to_hub_token=<PUSH_TO_HUB_TOKEN>,
remove_unused_columns=True,
report_to=[],
resume_from_checkpoint=None,
run_name=./models/finetuned,
save_on_each_node=False,
save_steps=500,
save_strategy=IntervalStrategy.STEPS,
save_total_limit=None,
seed=42,
sharded_ddp=[],
skip_memory_metrics=True,
tf32=None,
tpu_metrics_debug=False,
tpu_num_cores=None,
use_legacy_prediction_loop=False,
warmup_ratio=0.0,
warmup_steps=0,
weight_decay=0.0,
xpu_backend=None,
)
[INFO|configuration_utils.py:654] 2022-09-07 15:55:12,110 >> loading configuration file https://huggingface.co/gpt2/resolve/main/config.json from cache at /Users/andrew/.cache/huggingface/transformers/fc674cd6907b4c9e933cb42d67662436b89fa9540a1f40d7c919d0109289ad01.7d2e0efa5ca20cef4fb199382111e9d3ad96fd77b849e1d4bed13a66e1336f51
[INFO|configuration_utils.py:690] 2022-09-07 15:55:12,114 >> Model config GPT2Config {
  "_name_or_path": "gpt2",
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.1,
  "bos_token_id": 50256,
  "embd_pdrop": 0.1,
  "eos_token_id": 50256,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 768,
  "n_head": 12,
  "n_inner": null,
  "n_layer": 12,
  "n_positions": 1024,
  "reorder_and_upcast_attn": false,
  "resid_pdrop": 0.1,
  "scale_attn_by_inverse_layer_idx": false,
  "scale_attn_weights": true,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 50
    }
  },
  "transformers_version": "4.18.0",
  "use_cache": true,
  "vocab_size": 50257
}

Traceback (most recent call last):
  File "/Users/andrew/Downloads/tito-joker-master/experiments/transformers/examples/legacy/run_language_modeling.py", line 375, in <module>
    main()
  File "/Users/andrew/Downloads/tito-joker-master/experiments/transformers/examples/legacy/run_language_modeling.py", line 262, in main
    tokenizer = AutoTokenizer.from_pretrained(model_args.tokenizer_name, cache_dir=model_args.cache_dir)
  File "/Users/andrew/Library/Python/3.6/lib/python/site-packages/transformers/models/auto/tokenization_auto.py", line 471, in from_pretrained
    tokenizer_config = get_tokenizer_config(pretrained_model_name_or_path, **kwargs)
  File "/Users/andrew/Library/Python/3.6/lib/python/site-packages/transformers/models/auto/tokenization_auto.py", line 341, in get_tokenizer_config
    local_files_only=local_files_only,
  File "/Users/andrew/Library/Python/3.6/lib/python/site-packages/transformers/utils/hub.py", line 685, in get_file_from_repo
    use_auth_token=use_auth_token,
  File "/Users/andrew/Library/Python/3.6/lib/python/site-packages/transformers/utils/hub.py", line 290, in cached_path
    local_files_only=local_files_only,
  File "/Users/andrew/Library/Python/3.6/lib/python/site-packages/transformers/utils/hub.py", line 546, in get_from_cache
    "Connection error, and we cannot find the requested files in the cached path."
ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.

Could you upload the pre-trained models to the repository? Or resurrect the hosted version for a month?
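
For what it's worth, the traceback shows the script falling back to the Hugging Face Hub because ./models/pretrained does not exist locally. A possible stopgap, assuming the project's tokenizer is just the stock GPT-2 tokenizer (which may not hold if special tokens were added), is to save one to that path while online:

# Workaround sketch only, not the project's actual setup: cache a stock GPT-2
# tokenizer at the path the training script expects.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.save_pretrained("./models/pretrained")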

Refactor `run_generation` module

The current module still carries a lot of unnecessary code from the CLI implementation in transformers. This would be better organized as standalone functions that take default arguments from a separate config file (potentially config.yaml).

Potential output:

Create a new GenerationPipeline under HuggingFace's transformers.pipeline module.
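
A rough sketch of the standalone-function approach with defaults pulled from config.yaml (the file name and keys are placeholders, not existing repo files):

# Sketch only: standalone generation function with defaults loaded from a
# YAML config instead of CLI argument parsing.
import yaml
from transformers import AutoModelForCausalLM, AutoTokenizer

def load_config(path="config.yaml"):
    with open(path) as f:
        return yaml.safe_load(f)

def generate(prompt, config=None):
    cfg = config or load_config()
    tokenizer = AutoTokenizer.from_pretrained(cfg["model_name_or_path"])
    model = AutoModelForCausalLM.from_pretrained(cfg["model_name_or_path"])
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_length=cfg.get("max_length", 50),
        do_sample=cfg.get("do_sample", True),
        top_k=cfg.get("top_k", 50),
        top_p=cfg.get("top_p", 0.95),
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)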

Add custom context to control "topic" of answer

Both open-domain and closed-domain Q&A look promising as a framework. We could also extract an answer from Wikipedia, which is then used as additional input to generate the answer.

Idea: I can give the context as a long article (e.g. a Wikipedia page), and then apply the logic above. This allows us to give long context while still staying on a fairly specific topic (e.g. Donald Trump's Wikipedia page); a rough prompting sketch follows the links below.

https://docs.google.com/presentation/d/1A5wJEzFYGdNem7egJ-BTm6EMI3jGNe1lalyChYL54gw/edit#slide=id.g72fe29dbfb_0_96

Weaver paper: https://arxiv.org/pdf/1804.10490.pdf
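
One way to prototype this with the current GPT-2 setup (sketch only, not the repo's implementation) is to simply prepend the retrieved context to the question before generating; the context string below is a placeholder for a Wikipedia extract:

# Sketch: condition generation on a retrieved context passage by prepending it
# to the question.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

context = "Donald Trump is an American politician and businessman ..."
question = "Why did the president cross the road?"
inputs = tokenizer(context + "\n" + question, return_tensors="pt")

outputs = model.generate(**inputs, do_sample=True, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))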

Bias generated output based on entities found in the input

What if we bias the conditional word distribution of Tito Joker based on the entities found in the input? E.g., for an input beginning "Why did Donald Trump", the model would detect the name and focus the conditional output on it.
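
A toy sketch of one way to implement this with transformers' logits processors (the hard-coded entity, bias value, and class name are illustrative; in practice the entity would come from an NER model):

# Sketch: boost the logits of the detected entity's tokens so the model keeps
# referring to it during generation.
from transformers import AutoModelForCausalLM, AutoTokenizer, LogitsProcessor, LogitsProcessorList

class EntityBiasProcessor(LogitsProcessor):
    def __init__(self, entity_token_ids, bias=2.0):
        self.entity_token_ids = entity_token_ids
        self.bias = bias

    def __call__(self, input_ids, scores):
        # Add a constant bias to the entity's token ids at every step.
        scores[:, self.entity_token_ids] += self.bias
        return scores

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

entity_ids = tokenizer(" Donald Trump", add_special_tokens=False)["input_ids"]
inputs = tokenizer("Why did Donald Trump", return_tensors="pt")

outputs = model.generate(
    **inputs,
    do_sample=True,
    max_new_tokens=30,
    logits_processor=LogitsProcessorList([EntityBiasProcessor(entity_ids)]),
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))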

Joke type controls

These can either be explicit types from the dataset or implied from the input sequence.

Joke types can be inferred from the initial tokens of a joke.

For example, jokes starting with "yo mama" are obviously yo mama type jokes. This makes me wonder whether explicit type tokens will actually improve the model's accuracy in generating yo mama type jokes. I think yes, since the word distributions will now be focused on historical yo mama jokes.
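
A sketch of how explicit type tokens could be wired in (the <yo_mama> and <riddle> tokens and their handling are hypothetical, not part of the repo):

# Sketch: add hypothetical joke-type control tokens and prepend them to inputs
# so the word distributions condition on the joke type.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

tokenizer.add_special_tokens({"additional_special_tokens": ["<yo_mama>", "<riddle>"]})
model.resize_token_embeddings(len(tokenizer))

# At training time each joke would be prefixed with its type token, e.g.
#   "<yo_mama> Yo mama is so old ..."
# At inference time the same token steers generation toward that joke type:
inputs = tokenizer("<yo_mama> Yo mama", return_tensors="pt")
outputs = model.generate(**inputs, do_sample=True, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))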

Apply sentiment control (configure sentiment before generation)

Two approaches in mind so far:

  1. During training, append the sentiment score of the joke to the input vector (or the last one), so it can be added as a feature to contextualize the output.

The cool thing about this is that it gives us flexibility around sentiment, since the value is continuous, i.e., there is a concept of "very happy" versus "slightly happy".

E.g.

embedding_vector = [0.1, 0.2, 0.3]
sentiment_score = [0.8]
model_input = embedding_vector + sentiment_score  # [0.1, 0.2, 0.3, 0.8]
  2. Add sentiment tags to the dataset corresponding to the mood of the joke. These can be inferred from sentiment / toxicity predictions from pre-trained models.

E.g.

raw_input = "Why did the chicken cross the road?"
processed_input = "<sad> Why did the chicken cross the road?"

To create a dataset with sentiment tags, we can simply reuse existing sentiment analysis models and apply them to each joke in the dataset. We can start off with fine-tuned BERT models for sentiment analysis on full text (example), and then move towards span-level controls (example); a rough tagging sketch follows the span example below.

A span-level implementation would look like the example below:

E.g.

raw_input = "My dog died today and I am very sad"
output = "My dog died today and <sad> I am very sad </sad>"
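
A sketch of the full-text tagging step using an off-the-shelf sentiment pipeline (the tag names and the mapping from labels to tags are placeholders):

# Sketch: tag each joke with a sentiment control token predicted by a
# pre-trained sentiment model.
from transformers import pipeline

sentiment = pipeline("sentiment-analysis")

def tag_joke(joke: str) -> str:
    label = sentiment(joke)[0]["label"]  # "POSITIVE" or "NEGATIVE"
    tag = "<happy>" if label == "POSITIVE" else "<sad>"
    return f"{tag} {joke}"

print(tag_joke("Why did the chicken cross the road?"))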
