enzoampil / tito-joker Goto Github PK

View Code? Open in Web Editor NEW

45.0 45.0 5.0 6.14 MB

A humorous AI that uses state-of-the-art deep learning to tell jokes

Home Page: http://35.225.94.177:8501/

License: GNU General Public License v3.0

Jupyter Notebook 96.80% Python 3.14% Shell 0.06%

tito-joker's People

Contributors

Stargazers

Watchers

Forkers

aagarwal1012 persuaide rickasricky codeaudit l-thirat

tito-joker's Issues

Generate full standup comedy transcripts

Apply conditional joke reconstruction with the inclusion of tokens in the answer

This will allow us to control the topic of an answer. E.g., ensuring that Donald trump appears in the answer allows us to control the topic of a response.

Top of mind, a mask language modelling objective might be better than pure left to right LM.

how to run

Hello everyone, Just confused on how to get this working. I ran the setup.sh and tried all the commands. Didn't work. How can I get it working?

How to measure the effectiveness of tag "controls" (e.g. sentiment)?

I was thinking that simple methodology would be to generate sequences for sentiment spans and measure accuracy based on some overlap measure, e.g. Jaccard, or rougue.

While imperfect, this would be an initial approach to communicating the effectiveness of utilizing text generation controls, derived from pre-trained supervised models.

Tweets would be a good start.

Potential dataset

https://www.kaggle.com/c/tweet-sentiment-extraction/data

Apply emotion control to text generation

Reference notebook:

https://colab.research.google.com/drive/1nwCE6b9PXIKhv2hvbqf1oZKIGkXMTi1X

Resurrect hosted version please. I got "ValueError: Connection error" when trying to run locally

Hi,
I wanted to play with the hosted version of the app, but the IP address hosting the website is down. And the pre-trained models are on a Google account that complains the billing is expired.

I tried running Jokes_GPT2_Finetuning2.ipynb and fixed a lot of "python module not found" errors and "file not found" errors and other syntax errors but when trying to run "Finetune the GPT2 model", I got this:

!python3 /Users/andrew/Downloads/tito-joker-master/experiments/transformers/examples/legacy/run_language_modeling.py \
--overwrite_output_dir \
--output_dir='./models/finetuned' \
--model_type=gpt2\
--model_name_or_path=gpt2\
--tokenizer_name="./models/pretrained" \
--do_train\
--train_data_file="./data/riddle_jokes.txt" \
--per_gpu_train_batch_size=4 \
--block_size=50 


09/07/2022 15:55:11 - WARNING - __main__ - Process rank: -1, device: cpu, n_gpu: 0, distributed training: False, 16-bits training: False
09/07/2022 15:55:11 - INFO - __main__ - Training/evaluation parameters TrainingArguments(
_n_gpu=0,
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
bf16=False,
bf16_full_eval=False,
data_seed=None,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_pin_memory=True,
ddp_bucket_cap_mb=None,
ddp_find_unused_parameters=None,
debug=[],
deepspeed=None,
disable_tqdm=False,
do_eval=False,
do_predict=False,
do_train=True,
eval_accumulation_steps=None,
eval_delay=0,
eval_steps=None,
evaluation_strategy=IntervalStrategy.NO,
fp16=False,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
gradient_accumulation_steps=1,
gradient_checkpointing=False,
greater_is_better=None,
group_by_length=False,
half_precision_backend=auto,
hub_model_id=None,
hub_strategy=HubStrategy.EVERY_SAVE,
hub_token=<HUB_TOKEN>,
ignore_data_skip=False,
label_names=None,
label_smoothing_factor=0.0,
learning_rate=5e-05,
length_column_name=length,
load_best_model_at_end=False,
local_rank=-1,
log_level=-1,
log_level_replica=-1,
log_on_each_node=True,
logging_dir=./models/finetuned/runs/Sep07_15-55-11_voyager-528.local,
logging_first_step=False,
logging_nan_inf_filter=True,
logging_steps=500,
logging_strategy=IntervalStrategy.STEPS,
lr_scheduler_type=SchedulerType.LINEAR,
max_grad_norm=1.0,
max_steps=-1,
metric_for_best_model=None,
mp_parameters=,
no_cuda=False,
num_train_epochs=3.0,
optim=OptimizerNames.ADAMW_HF,
output_dir=./models/finetuned,
overwrite_output_dir=True,
past_index=-1,
per_device_eval_batch_size=8,
per_device_train_batch_size=8,
prediction_loss_only=False,
push_to_hub=False,
push_to_hub_model_id=None,
push_to_hub_organization=None,
push_to_hub_token=<PUSH_TO_HUB_TOKEN>,
remove_unused_columns=True,
report_to=[],
resume_from_checkpoint=None,
run_name=./models/finetuned,
save_on_each_node=False,
save_steps=500,
save_strategy=IntervalStrategy.STEPS,
save_total_limit=None,
seed=42,
sharded_ddp=[],
skip_memory_metrics=True,
tf32=None,
tpu_metrics_debug=False,
tpu_num_cores=None,
use_legacy_prediction_loop=False,
warmup_ratio=0.0,
warmup_steps=0,
weight_decay=0.0,
xpu_backend=None,
)
[INFO|configuration_utils.py:654] 2022-09-07 15:55:12,110 >> loading configuration file https://huggingface.co/gpt2/resolve/main/config.json from cache at /Users/andrew/.cache/huggingface/transformers/fc674cd6907b4c9e933cb42d67662436b89fa9540a1f40d7c919d0109289ad01.7d2e0efa5ca20cef4fb199382111e9d3ad96fd77b849e1d4bed13a66e1336f51
[INFO|configuration_utils.py:690] 2022-09-07 15:55:12,114 >> Model config GPT2Config {
  "_name_or_path": "gpt2",
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.1,
  "bos_token_id": 50256,
  "embd_pdrop": 0.1,
  "eos_token_id": 50256,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 768,
  "n_head": 12,
  "n_inner": null,
  "n_layer": 12,
  "n_positions": 1024,
  "reorder_and_upcast_attn": false,
  "resid_pdrop": 0.1,
  "scale_attn_by_inverse_layer_idx": false,
  "scale_attn_weights": true,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 50
    }
  },
  "transformers_version": "4.18.0",
  "use_cache": true,
  "vocab_size": 50257
}

Traceback (most recent call last):
  File "/Users/andrew/Downloads/tito-joker-master/experiments/transformers/examples/legacy/run_language_modeling.py", line 375, in <module>
    main()
  File "/Users/andrew/Downloads/tito-joker-master/experiments/transformers/examples/legacy/run_language_modeling.py", line 262, in main
    tokenizer = AutoTokenizer.from_pretrained(model_args.tokenizer_name, cache_dir=model_args.cache_dir)
  File "/Users/andrew/Library/Python/3.6/lib/python/site-packages/transformers/models/auto/tokenization_auto.py", line 471, in from_pretrained
    tokenizer_config = get_tokenizer_config(pretrained_model_name_or_path, **kwargs)
  File "/Users/andrew/Library/Python/3.6/lib/python/site-packages/transformers/models/auto/tokenization_auto.py", line 341, in get_tokenizer_config
    local_files_only=local_files_only,
  File "/Users/andrew/Library/Python/3.6/lib/python/site-packages/transformers/utils/hub.py", line 685, in get_file_from_repo
    use_auth_token=use_auth_token,
  File "/Users/andrew/Library/Python/3.6/lib/python/site-packages/transformers/utils/hub.py", line 290, in cached_path
    local_files_only=local_files_only,
  File "/Users/andrew/Library/Python/3.6/lib/python/site-packages/transformers/utils/hub.py", line 546, in get_from_cache
    "Connection error, and we cannot find the requested files in the cached path."
ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.

Could you upload the pre-trained models to the repository? Or resurrect the hosted version for a month?

Create Tito Joker API with FastAPI

By deploying Tito Joker as an API, the webapp can simply send requests to this API and it will also be possible to integrate Tito Joker to more platforms.

E.g.

Slack
Twitter
Email???

FastAPI: https://github.com/tiangolo/fastapi

Deploy Tito Joker V2 by updating streamlit app

This requires the following:

Update app with Tito Joker V2 link
Add drop down bar to choose model version

Initialize tests for the tito_joker module

Ensure that each model in the library will run a forward pass once setup

Utilize transformer's new `pipeline` API for language generation

Apparently, there's no Pipeline yet for sequence generation.

https://huggingface.co/transformers/_modules/transformers/pipelines.html#pipeline

Refactor `run_generation` module

The current module still has a lot of unnecessary code coming from the CLI implementation from transformers. These are better as standalone functions that take default arguments from a separate config file (potentially config.yaml).

Potential output:

Create new GenerationPipeline under HuggingFace's transormers.pipeline module.

Create TIto Joker twitter account and post jokes daily

This is dependent on the FastAPI deployment.

Tweepy examples.

Add preprocessing CLI usage to README

Tito Joker v2 sometimes returns answers missing the last letter

E.g.

Input: Why did the chicken cross the road?
Output: Why did the chicken cross the road ? To go to wor

Style transfer for standup comedy

[experiment] Apply generation techniques employed from “abstractive summarization” and “Answer generation from Q&A augmentation”

Foundational source for abstractive generation: https://arxiv.org/abs/1704.04368

Retrain model with proper registration of `special tokens`

Progress here: https://colab.research.google.com/drive/1q7f66VvC2KspiWJlAV1vqwPSIEn5gOgR#scrollTo=K3QfUrWfSNCt

Gather and train model on dataset of standup comedy monologues

Add custom context to control "topic" of answer

Both open domain and closed domain Q&A looks promising to use as a framework. Could also extract an answer from wikipedia which is then used an additional input to generate the answer.

Idea: I can give context as a long articles (e.g. wikipedia), and then apply the logic above. This allows us to give long context, while still being about a fairly specific topic (e.g. Donald trump's wikipedia).

https://docs.google.com/presentation/d/1A5wJEzFYGdNem7egJ-BTm6EMI3jGNe1lalyChYL54gw/edit#slide=id.g72fe29dbfb_0_96

Weaver paper: https://arxiv.org/pdf/1804.10490.pdf

Store `feedback` data into BigQuery (daily batch)

Train with dataset backwards so that we can prompt with a punch line

I imagine this is more straightforward to specify as a seq-to-seq problem similar to #18

Add new structured sources to model

Good additional source of jokes:

https://github.com/taivop/joke-dataset
https://upjoke.com/

These will have to be separated to "question and answer" format.

Submit model `transformers` model library

Create raw jokes sample in data dir for testing purposes

Bias generated output based entities found in the input

What if I bias the conditional word distribution of tito joker based on the existence of entities in the input. Eg “Why did “donald trump”, will detect the name and will focus on that name for the conditional output.

Joke type controls

These can either be explicit types from the dataset or implied from the input sequence.

Joke types can be inferred from the inital tokens from a jokes.

For example, jokes starting with "yo mama" are obviously yo mama type jokes. This makes me wonder if these will actually improve the model's accuracy in generating yo mama type jokes. I think yes since the word distributions will now be focused on historical yo mama jokes.

Integrate pun generation

Look for existing open source solutions

Apply sentiment control (configure sentiment before generation)

Two approaches in mind so far:

During training, append the sentiment score of the joke to the input vector (or the last one), so it can be added as a feature to contextualize the output.

The cool thing about this is that it creates versatility around sentiment, since the value is continuous. I.e., there is a concept of very happy and slightly happy.

E.g.

embedding_vector = [0.1, 0.2, 0.3]
sentiment_score = [0.8]
model_inpute = embedding_vector + sentiment_score = [0.1, 0.2, 0.3, 0.8]

Add sentiment tags to the dataset to correspond to the mood of the joke. These can be implied from actual sentiment / toxicity predictions from pre-trained models.

E.g.

raw_input = "Why did the chicken cross the road?"
processed_input = "<sad> Why did the chicken cross the road?"

The create a dataset with sentiment tags, we can simply reuse existing sentiment analysis models and apply them to each joke in the dataset. We can start off with fine-tuned BERT models for sentiment analysis on full text (example), and then move towards span level controls (example).

A span level implementation will look like below:

E.g.

raw_input = "My dog died today and I am very sad"
output = "My dog died today and <sad> I am very sad </sad>"

enzoampil / tito-joker Goto Github PK

tito-joker's People

Contributors

Stargazers

Watchers

Forkers

tito-joker's Issues

Recommend Projects

Recommend Topics

Recommend Org