
color4dial's People

Contributors

iwangjian

Forkers

jrangelg

color4dial's Issues

Generating Dialogue Issues

Thanks for sharing your code!

However, when I run bash scripts/durecdial_dialog_train.sh, the terminal reports the following error:

[screenshot: cbbea35da11b573c26a355e219cf871]

It seems that an extra variable is being passed that the handler cannot accept.

NaN losses during durecdial_planning_train_planner

Hello,

I am going through the pipeline and have trained the Brownian bridge. However, at the next step, training the planner, every loss is NaN, as shown below.

2023-10-16 10:03:53,449 [INFO] Total parameters: 319843609	Trainable parameters: 178636808
2023-10-16 10:03:53,449 [INFO] Total batches per epoch : 2151
2023-10-16 10:03:53,449 [INFO]
Epoch 1:
2023-10-16 10:04:42,779 [INFO] Train Step: 100	total_loss: nan lm_loss: nan trans_loss: nan kl_loss: nan
2023-10-16 10:05:30,786 [INFO] Train Step: 200	total_loss: nan lm_loss: nan trans_loss: nan kl_loss: nan
2023-10-16 10:06:18,475 [INFO] Train Step: 300	total_loss: nan lm_loss: nan trans_loss: nan kl_loss: nan
2023-10-16 10:07:06,691 [INFO] Train Step: 400	total_loss: nan lm_loss: nan trans_loss: nan kl_loss: nan
2023-10-16 10:07:56,195 [INFO] Train Step: 500	total_loss: nan lm_loss: nan trans_loss: nan kl_loss: nan
2023-10-16 10:08:44,817 [INFO] Train Step: 600	total_loss: nan lm_loss: nan trans_loss: nan kl_loss: nan
2023-10-16 10:09:33,643 [INFO] Train Step: 700	total_loss: nan lm_loss: nan trans_loss: nan kl_loss: nan
2023-10-16 10:10:21,695 [INFO] Train Step: 800	total_loss: nan lm_loss: nan trans_loss: nan kl_loss: nan
2023-10-16 10:11:10,581 [INFO] Train Step: 900	total_loss: nan lm_loss: nan trans_loss: nan kl_loss: nan
2023-10-16 10:11:58,976 [INFO] Train Step: 1000	total_loss: nan lm_loss: nan trans_loss: nan kl_loss: nan
2023-10-16 10:11:58,976 [INFO] Evaluating...

As you can see, the losses are NaN from the first logged step, so I'm not sure where the problem comes from. Since this step uses the model produced by the bridge training step, I thought that might be the cause, but training there seemed fine. Below is the end of the bridge training log:

Epoch 10:
2023-10-16 09:37:10,892 [INFO] Batch Step: 100	Avg loss: 11.296
2023-10-16 09:38:02,177 [INFO] Batch Step: 200	Avg loss: 10.941
2023-10-16 09:38:53,537 [INFO] Batch Step: 300	Avg loss: 11.168
2023-10-16 09:39:44,677 [INFO] Batch Step: 400	Avg loss: 10.457
2023-10-16 09:40:36,015 [INFO] Batch Step: 500	Avg loss: 9.152
2023-10-16 09:42:24,550 [INFO] Evaluation Average Similarity: 0.998
2023-10-16 09:42:24,551 [INFO] Epoch 10 training done.
2023-10-16 09:42:26,615 [INFO] Saved to [logs/DuRecDial2/checkpoints_bridge/bridge_model_epoch_10.bin]
2023-10-16 09:42:26,617 [INFO] Loading raw data from data/DuRecDial2/sample_test_seen.jsonl
2023-10-16 09:42:28,640 [INFO] Creating cache instances durecdial_plan_test_seen.pkl
/project/6000784/habashyk/Color4Dial/lib/python3.8/site-packages/transformers/models/bart/modeling_bart.py:549: FutureWarning: The class `PretrainedBartModel` has been depreciated, please use `BartPreTrainedModel` instead.
  warnings.warn(
100%|██████████| 6152/6152 [00:32<00:00, 191.96it/s]
2023-10-16 09:43:01,608 [INFO] Total of 6152 instances were cached.
2023-10-16 09:43:01,645 [INFO] Loading raw data from data/DuRecDial2/sample_test_unseen.jsonl
2023-10-16 09:43:02,734 [INFO] Creating cache instances durecdial_plan_test_unseen.pkl

100%|██████████| 3983/3983 [00:21<00:00, 185.63it/s]
2023-10-16 09:43:24,659 [INFO] Total of 3983 instances were cached.
2023-10-16 09:43:24,681 [INFO] Evaluate on test-seen ...
2023-10-16 09:45:15,136 [INFO] Saved to logs/DuRecDial2/brownian_bridge_sim/test_seen_2023-10-16-09-42-26.txt
2023-10-16 09:45:15,139 [INFO] Average similarity on test-seen: 0.9974333125222197
2023-10-16 09:45:15,139 [INFO] Evaluate on test-unseen ...
2023-10-16 09:46:39,721 [INFO] Saved to logs/DuRecDial2/brownian_bridge_sim/test_unseen_2023-10-16-09-42-26.txt
2023-10-16 09:46:39,724 [INFO] Average similarity on test-unseen: 0.9966462435209125
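
When narrowing down a NaN like the one in the planner log, it can help to confirm the first logged step at which any loss term becomes non-finite, and which terms are affected. The following is a minimal sketch, not part of the repository: the function name first_non_finite_step and the regular expressions are assumptions, written only against the "Train Step: ... total_loss: ..." line format shown in the planner log.

```python
import math
import re

# Assumed patterns, based only on the planner log lines quoted in this issue.
STEP = re.compile(r"Train Step:\s*(\d+)")
LOSS_PAIR = re.compile(r"(\w+_loss):\s*(\S+)")


def first_non_finite_step(log_lines):
    """Return (step, [loss names]) for the first logged step that has a
    NaN or infinite loss value, or None if every logged loss is finite."""
    for line in log_lines:
        step_match = STEP.search(line)
        if step_match is None:
            continue  # not a training-step line
        bad = []
        for name, value in LOSS_PAIR.findall(line):
            try:
                number = float(value)  # float("nan") and float("inf") parse fine
            except ValueError:
                continue  # skip anything that is not a number
            if not math.isfinite(number):
                bad.append(name)
        if bad:
            return int(step_match.group(1)), bad
    return None


# Two lines copied from the planner log above.
sample = [
    "2023-10-16 10:03:53,449 [INFO] Total batches per epoch : 2151",
    "2023-10-16 10:04:42,779 [INFO] Train Step: 100\ttotal_loss: nan "
    "lm_loss: nan trans_loss: nan kl_loss: nan",
]
result = first_non_finite_step(sample)
print(result)
```

On the log in this issue, all four loss terms are already NaN at step 100, the first step that is logged, so the log alone cannot show which term went non-finite first. Logging every step for the first few batches would probably be needed to tell them apart.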

Testing on user data

Would it be possible to add instructions to the README on how to try the model on arbitrary user utterances? I would like to see the model in action myself.

Thank you!
