chatbot-rnn

A toy chatbot powered by deep learning and trained on data from Reddit. Built on TensorFlow v1.4.0 and Python v3.5.1.

Here is a sample chat transcript (not cherry-picked). Lines that start with ">" are me; the lines without ">" are the chatbot's responses.

$ python3 chatbot.py
Creating model...
Restoring weights...

> Hi, how are you?
 I'm not the only one here that's having fun with you, it's all me.
> Who else is with you?
 Haha. I'm not the one who made the comment. I'm the one who said that and didn't think that much of anything. 
> Oh, I see. So what's new?
 Nothing. I just thought it was a bit of an overreaction. It's been a while since I've read it. 
> It's all good
 It's all good. It's one of those things that makes me smile :)
> That's great. This is actually my demo chat, so it's nice that you're keeping things positive!
 Thank you :) 
> You're welcome! It's a nice change from the way that you sometimes are...
 *blushes* Sorry, I should have clarified that I wasn't clear. I apologize for my ignorance.
> Sooo, where are you from?             
 I am from South America. Where you from?
> California. Do you like it in South America?
 I live in Australia, but south of Florida is one of the reasons I live in Florida. I've never been there, but I've never been there. 
> I see, a true citizen of the world
 Nope, I've never heard of it. I just spent a lot of time there. 
> Spend a lot of time where?
 I've never been there before, but I've heard it's about an hour and a half away. Maybe there's some sort of party called "Southern" but that's about it. 
> Maybe there is, that's a great observation.
 Not to mention it's all about being from another country.
> All right, guess I should wrap this up, nice talking with you
 I'm not going to argue with you. You're doing it wrong.
> Some might call that arguing. Any final words?
 Don't do that. I don't need to. 

Get Started

  • Install TensorFlow for Python 3. To run a pretrained model, the CPU-only installation should suffice. If you want to train your own models, you'll need the GPU installation of TensorFlow (and a powerful CUDA-compatible GPU).

  • Clone this project to your computer.
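
For example, on a typical setup (the TensorFlow version matches the one this README says the project was built on; the repository URL is assumed to be the project's GitHub home):

$ pip3 install tensorflow==1.4.0        # CPU-only build, enough to run the pre-trained model
$ pip3 install tensorflow-gpu==1.4.0    # GPU build, required for training
$ git clone https://github.com/pender/chatbot-rnn.git
$ cd chatbot-rnn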

Run my pre-trained model

  • Download my pre-trained model (2.3 GB). The zip file extracts into a folder named "reddit". Place that folder into the "models" directory of this project.

  • Run the chatbot. Open a terminal session and run python3 chatbot.py. Warning: this pre-trained model was trained on a diverse set of frequently off-color Reddit comments. It can (and eventually will) say things that are offensive, disturbing, bizarre or sexually explicit. It may insult minorities, it may call you names, it may accuse you of being a pedophile, it may try to seduce you. Please don't use the chatbot if these possibilities would distress you!
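
For example, assuming the downloaded archive is saved as reddit-model.zip (the actual filename may differ):

$ unzip reddit-model.zip -d models/     # extracts to models/reddit
$ python3 chatbot.py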

Try playing around with the arguments to chatbot.py to obtain better samples:

  • beam_width: By default, chatbot.py will use beam search with a beam width of 2 to sample responses. Set this higher for more careful, more conservative (and slower) responses, or set it to 1 to disable beam search.

  • temperature: At each step, the model ascribes a certain probability to each character, and temperature reshapes that distribution. 1.0 is neutral (and the default); lower values sharpen the distribution, boosting already-probable characters and making the choices more conservative, while higher values flatten it and make the choices more adventurous. Values outside the range of 0.5-1.5 are unlikely to give coherent results.

  • top-n: At each step, zero out the probability of all possible characters except the n most likely (set with --topn, as in the transcript below). Disabled by default.

  • relevance: Two models are run in parallel: the primary model and the mask model. At each step, the mask model's log-probabilities are scaled by the relevance value and combined with the primary model's according to equation 9 in Li, Jiwei, et al., "A diversity-promoting objective function for neural conversation models," arXiv preprint arXiv:1510.03055 (2015). The state of the mask model is reset upon each newline character. The net effect is that the model is encouraged to choose the line of dialogue most relevant to the prior line, even if a more generic response (e.g. "I don't know anything about that") would be more probable in absolute terms. Higher relevance values put more pressure on the model to produce relevant responses, at the cost of coherence; going much above 0.4 compromises the quality of the responses. Setting it to a negative value disables relevance, and this is the default, both because I'm not confident that it qualitatively improves the outputs and because it halves the speed of sampling. (A sketch of how temperature, top-n and relevance act on the distribution appears after this list.)
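
For example, a full invocation combining these arguments might look like this:

$ python3 chatbot.py --beam_width 3 --temperature 0.9 --topn 10 --relevance 0.25

To make the sampling knobs concrete, here is a minimal numpy sketch of how temperature, top-n filtering and the relevance combination act on one step's character distribution. It is illustrative only: the function and variable names are mine, not the ones used in chatbot.py.

import numpy as np

def apply_temperature(logits, temperature=1.0):
    # Dividing the logits by the temperature sharpens the distribution
    # when temperature < 1 and flattens it when temperature > 1.
    scaled = logits / temperature
    exp = np.exp(scaled - np.max(scaled))  # subtract max for numerical stability
    return exp / exp.sum()

def apply_top_n(probs, n):
    # Zero out the probability of every character except the n most likely,
    # then renormalize. n <= 0 leaves the distribution untouched (disabled).
    if n <= 0 or n >= probs.size:
        return probs
    cutoff = np.sort(probs)[-n]
    filtered = np.where(probs >= cutoff, probs, 0.0)
    return filtered / filtered.sum()

def combine_relevance(logp_primary, logp_mask, relevance):
    # Following equation 9 of Li et al. (2015): characters that the mask
    # model (whose state resets at each newline) already finds likely are
    # penalized, so generic continuations lose out to relevant ones.
    if relevance <= 0:
        return logp_primary
    return logp_primary - relevance * logp_mask

Each function returns the adjusted (log-)distribution from which the next character is sampled.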

These values can also be manipulated during a chat, and the model state can be reset, without restarting the chatbot:

$ python3 chatbot.py
Creating model...
Restoring weights...

> --temperature 1.3
[Temperature set to 1.3]

> --relevance 0.3
[Relevance set to 0.3]

> --relevance -1
[Relevance disabled]

> --topn 2
[Top-n filtering set to 2]

> --topn -1
[Top-n filtering disabled]

> --beam_width 5
[Beam width set to 5]

> --reset
[Model state reset]

Get training data

If you'd like to train your own model, you'll need training data. There are a few options here.

  • Use pre-formatted Reddit training data. This is what the pre-trained model was trained on.

    Download the training data (2.1 GB). Unzip the monolithic zip file. You'll be left with a folder named "reddit" containing 34 files named "output 1.bz2", "output 2.bz2" etc. Do not extract those individual bzip2 files. Instead, place the whole "reddit" folder that contains those files inside the data folder of the repo. The first time you run train.py on this data, it will convert the raw data into numpy tensors, compress them and save them back to disk, which will create files named "data0.npz" through "data34.npz" (as well as a "sizes.pkl" file and a "vocab.pkl" file). This will fill another ~5 GB of disk space, and will take about half an hour to finish.

  • Generate your own Reddit training data. If you would like to generate training data from raw Reddit archives, download a torrent of Reddit comments from the torrent links listed here. The comments are available in annual archives, and you can download any or all of them (~304 GB compressed in total). Do not extract the individual bzip2 (.bz2) files contained in these archives.

    Once you have your raw Reddit data, place it in the reddit-parse/reddit_data subdirectory and use the reddit-parse.py script included in the project to convert it into compressed text files of appropriately formatted conversations. This script chooses qualifying comments (must be under 200 characters, can't contain certain substrings such as 'http://', can't have been posted on certain subreddits) and assembles them into conversations of at least five lines. Coming up with good rules to curate conversations from raw Reddit data is more art than science. I encourage you to play around with the parameters in the included parser_config_standard.json file, or to mess around with the parsing script itself, to come up with an interesting data set. (A sketch of the comment filter appears after this list.)

    Please be aware that there is a lot of Reddit data included in the torrents. It is very easy to run out of memory or hard drive space. I used the entire archive (~304 GB compressed), and ran the reddit-parse.py script with the configuration I included as the default, which holds a million comments (several GB) in memory at a time, takes about a day to run on the entire archive, and produces 2.1 GB of bzip2-compressed output. When training the model, this raw data will be converted into numpy tensors, compressed, and saved back to disk, which consumes another ~5 GB of hard drive space. I acknowledge that this may be overkill relative to the size of the model.

  • Provide your own training data. Training data should be one or more newline-delimited text files. Each line of dialogue should begin with "> " and end with a newline (see the sample file after this list). You'll need a lot of it: several megabytes of uncompressed text is probably the minimum, and even that may not suffice if you want to train a large model. Text can be provided as raw .txt files or as bzip2-compressed (.bz2) files.

  • Simulate the United States Supreme Court. I've included a corpus of United States Supreme Court oral argument transcripts (2.7 MB compressed) in the project under the data/scotus directory.
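
As a concrete illustration of the comment filter described in the Reddit-parsing bullet above, here is a minimal Python sketch. The real, configurable logic lives in the parsing script; the thresholds below are the ones quoted in this README, while the function shape and argument names are my own:

def post_qualifies(comment, banned_subreddits, banned_substrings=('http://',)):
    # Keep comments under 200 characters...
    body = comment['body']
    if len(body) >= 200:
        return False
    # ...that don't contain blacklisted substrings such as 'http://'...
    if any(s in body for s in banned_substrings):
        return False
    # ...and that weren't posted on blacklisted subreddits.
    if comment['subreddit'] in banned_subreddits:
        return False
    return True

Qualifying comments are then threaded into conversations, and only conversations of at least five lines are kept.

And here is what a custom training data file might look like (the path data/mychat/input.txt is purely an example): each line of dialogue begins with "> " and ends with a newline.

> Hi, how was your day?
> Pretty good, I finally fixed that bug.
> Nice! What was the problem?
> An off-by-one error in the batch loader.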

Once you have training data in hand (and located in a subdirectory of the data directory):

Train your own model

  • Train. Use train.py to train the model. The default hyperparameters are the best that I've found, and are what I used to train the pre-trained model for a couple of months. These hyperparameters will just about fill the memory of a GTX 1080 Ti GPU (11 GB of VRAM), so if you have a smaller GPU, you will need to adjust them accordingly (for example, set --num_blocks to 2).

    Training can be interrupted with Ctrl-C at any time, and the model will be saved immediately when interrupted. Training can be resumed on a saved model and will automatically carry on from where it was interrupted.
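
For example (assuming train.py takes the data directory via a --data_dir flag; --num_blocks is the hyperparameter mentioned above, and the values shown are illustrative):

$ python3 train.py --data_dir data/scotus                  # train on the bundled SCOTUS corpus
$ python3 train.py --data_dir data/reddit --num_blocks 2   # reduced size for GPUs with less VRAM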

Thanks

Thanks to Andrej Karpathy for his char-rnn repo, and to Sherjil Ozair for his TensorFlow port of char-rnn, which this repo is based on.

chatbot-rnn's People

Contributors

julien-c, pender


chatbot-rnn's Issues

Out-of-memory error with 12 GB of GPU memory and 2 MB of training data?

I encountered the following error during the training stage. Does anyone have ideas about this?

W tensorflow/core/common_runtime/bfc_allocator.cc:274] ****************************************************************************************************
W tensorflow/core/common_runtime/bfc_allocator.cc:275] Ran out of memory trying to allocate 34.33MiB.  See logs for memory state.
W tensorflow/core/framework/op_kernel.cc:975] Resource exhausted: OOM when allocating tensor with shape[3000,3000]

Failure to read model in Tensorflow r1.0

Hey, I'm getting this error in TensorFlow 1.0 after modifying the code to get it working:

NotFoundError (see above for traceback): Tensor name "rnnlm/multi_rnn_cell/cell_0/gru_cell/candidate/biases" not found in checkpoint files models/reddit/model.ckpt-4682964
	 [[Node: save/RestoreV2_4 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_4/tensor_names, save/RestoreV2_4/shape_and_slices)]]

Unable to run chatbot.py. Throwing error on tensorflow

C:\chatbot-rnn-master>python chatbot.py
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 58, in
from tensorflow.python.pywrap_tensorflow_internal import *
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 28, in
_pywrap_tensorflow_internal = swig_import_helper()
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "C:\ProgramData\Anaconda3\lib\imp.py", line 242, in load_module
return load_dynamic(name, filename, file)
File "C:\ProgramData\Anaconda3\lib\imp.py", line 342, in load_dynamic
return _load(spec)
ImportError: DLL load failed: The specified module could not be found.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "chatbot.py", line 4, in
import tensorflow as tf
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_init_.py", line 24, in
from tensorflow.python import pywrap_tensorflow # pylint: disable=unused-import
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python_init_.py", line 49, in
from tensorflow.python import pywrap_tensorflow
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 74, in
raise ImportError(msg)
ImportError: Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 58, in
from tensorflow.python.pywrap_tensorflow_internal import *
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 28, in
_pywrap_tensorflow_internal = swig_import_helper()
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "C:\ProgramData\Anaconda3\lib\imp.py", line 242, in load_module
return load_dynamic(name, filename, file)
File "C:\ProgramData\Anaconda3\lib\imp.py", line 342, in load_dynamic
return _load(spec)
ImportError: DLL load failed: The specified module could not be found.

Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/errors

Multi GPUs

Hello,

How can I train this model on multiple GPUs?

Thx,
Catruk

Pickle TypeError: a bytes-like object is required, not 'str'

Getting this error:
Traceback (most recent call last):
File "chatbot.py", line 301, in
main()
File "chatbot.py", line 35, in main
sample_main(args)
File "chatbot.py", line 60, in sample_main
saved_args = cPickle.load(f)
TypeError: a bytes-like object is required, not 'str'

TypeError: a bytes-like object is required, not 'str'

I have been trying to get TensorFlow to work for a while, and the only way to fix it for me seems to be to use Python 3.5 instead of 2.7. Using 2.7 seems to cause this issue, and I have no idea how to fix it. Suggestions?

Reddit-parser

Reddit-parser doesn't work with Reddit data gathered between RC_2005-12 and RC_2007-10, or between RC_2015-08 and RC_2017-06. I am trying to edit reddit_parse.py myself, although it will take me some time to figure it out.

PS: The main reason it fails is that the JSON fields 'ups', 'downs', and 'name' are missing in the above-mentioned data files.

[screenshots of the 'downs' and 'name' errors omitted]

Could someone perhaps help me figure this out, please? :)

Training on GPU?

What changes do I have to make to train.py to train using my GTX 1080? (I have tensorflow-gpu installed, and it recognizes the card.)

No "models" directory in repo

Hi,
Hi,
In the README file, you say to "Place that folder into the 'models' directory of this project." However, there is no "models" directory in this repo. Maybe you could add it.

IndexError: index 0 is out of bounds for axis 0 with size 0

I get this error while trying to set custom rnn_size=50 and batch_size=10 to train my model (a set of .txt files), as suggested in one of the issues to solve the OOM error. I'm running a p2.xlarge instance and I've tried other combinations too.

File "/home/ubuntu/chatbot-rnn/utils.py", line 184, in _load_preprocessed
ydata[-1] = xdata[0]
IndexError: index 0 is out of bounds for axis 0 with size 0

Error

I got an error while trying to run it. Here's a picture:
[screenshot omitted]

Reddit parser

I have no experience with Reddit and don't know how torrents work. Does the Reddit parser in this project require downloading the torrent file or the .bz2 files? I assume it requires the .bz2 files. I followed the link, but I could not find any .bz2 file; I can only download a torrent file. Could you please give me some guidance on how to get the .bz2 files from https://www.reddit.com/r/datasets/comments/3bxlg7/i_have_every_publicly_available_reddit_comment/? Thank you very much!

Speed up inference

On the most basic AWS instance (CPU), it takes anywhere from 8-10 seconds to complete a reply. On a GPU node (K10), it's around 2-3 seconds (honestly not worth it considering the price difference between the two instances).
I was wondering, is there a way, or options to tweak, to speed up inference? I tried playing with the beam width, but it's not worth the dent it makes in the quality of the responses.

Still a beginner in ML, but would using a TPU make a difference for inference, compared to using a GPU?

Encountering an error

Traceback (most recent call last):
File "chatbot.py", line 325, in
main()
File "chatbot.py", line 40, in main
sample_main(args)
File "chatbot.py", line 60, in sample_main
model_path, config_path, vocab_path = get_paths(args.save_dir)
File "chatbot.py", line 56, in get_paths
raise ValueError('save_dir is not a valid path.')
ValueError: save_dir is not a valid path.

Problem with responses

I have used the pre-trained model and tested it with some prompts (hi, hello), but the responses are strange characters.
[screenshot omitted]

Run chatbot.py on Linux raspberrypi 4.9.80-v7+

Python 3.5.3
TensorFlow 1.4.0

Total/Free
Mem: 927/820
Swap: 2047/2024

Total Used Available Use%
/dev/root 7602088 6214644 1046052 86%

Creating model...
Restoring weights...
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1323, in _do_call
return fn(*args)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1302, in _run_fn
status, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 473, in exit
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.OutOfRangeError: models/reddit/model.ckpt-4735000.data-00000-of-00001; Value too large for defined data type
[[Node: save/RestoreV2_1 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_1/tensor_names, save/RestoreV2_1/shape_and_slices)]]

So I set the swap size to 3000, but I guess this doesn't work because of the limited space on the SD card, and the model cannot fit into memory. Is this correct?

Using new reddit training data - 'Model' object has no attribute

Not sure why it's erroring in this way; the model saved correctly.

Any ideas?

D:\bot\char-rnn-tensorflow-master>python chatbot.py --save_dir data/reddit
Creating model...
Traceback (most recent call last):
  File "chatbot.py", line 325, in <module>
    main()
  File "chatbot.py", line 40, in main
    sample_main(args)
  File "chatbot.py", line 79, in sample_main
    saver = tf.train.Saver(net.save_variables_list())
AttributeError: 'Model' object has no attribute 'save_variables_list'

Using this bot on discord

Hello, I'm sorry for submitting a useless issue, but I didn't know how else to contact you. Is there any way to put this bot in a Discord server somehow?

Empty tensor error when training with my own dataset

When executing the training script with OpenSubtitles data (like the example below), an empty tensor error occurs.
I am training with TF 1.2.1, Python 3.6.

[data]

U: i could come with you .
S: i could look out for you .
U: you going to shoot me ?
S: or can i come inside ?
U: take this to changuila 's mother .
S: tell her that next week ...

[error]
loading tensor data file 0
Traceback (most recent call last):
File "train.py", line 177, in
main()
File "train.py", line 42, in main
train(args)
File "train.py", line 97, in train
data_loader.cue_batch_pointer_to_epoch_fraction(global_epoch_fraction)
File "/data3/kenkim/chitchat/chatbot-rnn/utils.py", line 240, in cue_batch_pointer_to_epoch_fraction
self._cue_batch_pointer_to_step_count(step_target)
File "/data3/kenkim/chitchat/chatbot-rnn/utils.py", line 249, in _cue_batch_pointer_to_step_count
self._load_preprocessed(i)
File "/data3/kenkim/chitchat/chatbot-rnn/utils.py", line 179, in _load_preprocessed
self.tensor = self.tensor[:self.num_batches * self.batch_size * self.seq_length]
IndexError: too many indices for array

Preparing own content for training

I am trying to create Q&A bot type, how should be my training data format. its should be like
> Q: why?
> A: answer1
> Q: who?
> A: answer2

or it should be only answers to the conversation?
> answer1
>answer2
@pender

Use dialog segmentation information when preprocessing

Hi! First of all, thanks for sharing your library with us.
During training, I want to make sure there is no match between a message & reply from different dialogs.
Let's say there are two different dialogs, A and B, and input.txt contains dialogs like below:

A-user: message1
A-bot: reply1
A-user: message2
A-bot: reply2
B-user: message3
B-bot: reply3

In this case, does this code prevent matching reply2 & message3?

Issue with train.py - charset errors

Any thoughts? I am using Windows.

Preprocessing file 2/6 (reddit-parse/output\output 1.bz2)...
Traceback (most recent call last):
  File "train.py", line 190, in <module>
    main()
  File "train.py", line 49, in main
    train(args)
  File "train.py", line 55, in train
    data_loader = TextLoader(args.data_dir, args.batch_size, args.seq_length)
  File "D:\bot\utils.py", line 39, in __init__
    self._preprocess(self.input_files[i], self.tensor_file_template.format(i))
  File "D:\bot\utils.py", line 107, in _preprocess
    data = file_reference.read()
  File "D:\python\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 23267: character maps to <undefined>

Can't import name seq2seq with TensorFlow version 1.2.1

What version of tensorflow works for this project?

dilip@dilip-ThinkPad:~/chatbot-rnn$ python chatbot.py
Traceback (most recent call last):
File "chatbot.py", line 14, in
from model import Model
File "/home/dilip/chatbot-rnn/model.py", line 3, in
from tensorflow.python.ops import seq2seq
ImportError: cannot import name seq2seq

Questions on Training Dataset

Thanks for the great repo, quick questions though:

  • What is the minimum number of rows of data you would recommend for a new dataset?
  • What is the exact ideal format when reformatting a new dataset?
  • Can it be less negative with a different dataset (and lighten up a little)?

Character-breaking issue when decoding answers

Hello,
I've just run your code and got broken answers. What's wrong with it?
A character encoding/decoding problem?

I'm using tensorflow 1.3, python 3.6

What is your name?
P�$�X��_7z�z}�E�Z=��Mzc�l�|�Z=-{F;}�FHF���WO#��
����N�S#}Y��Z�}F��}}�s�=��M�F�7*}�
2�F�Z������}��=����}F����D�!HL}'�(PZM��Z=;PP�~}W�Z�z};}>#}�qqldM�P55')}
�9z�-Z!l|���-"���DZ�A'�!lMY'
$@�$**}('P}}>}'�z}&�}}CL}l�}�}D}z�};5GMl�Cz�TM-�N��Z�ZYH}}&!�Fl�=7z����@9s�#@��}r-��}�}}}��}����}F�]�����5G�2�h�r�}��}3����C�
}���J3��5�zJMMG�55�FZ�M�MMZ��M'!!�'�-()�;Xd=3o!7Z��}�����P�&P���Z��5~
ZG�����)}H�P�N-�f�n�M�Gldf�n�W>��(�s�=GD=-PD-}F�!�!-��l�3}P}"}�}Y��#�2
]�]Po��}��P���5�FGl5}!H!�zSv-����

Problem with loading the pre-trained model

Hi!
I was trying to run the pre-trained model you have provided and ran into an error.
Below is the full traceback
This might be a TensorFlow compatibility problem; however, you have not stated which version you were using to train it.
I used:
Ubuntu 16.04
Python 2.7.12
Tensorflow 0.12
CUDA 8.0
GeForce GTX 1080

$ python2 chatbot.py 
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so.8.0 locally
Creating model...
WARNING:tensorflow:From /home/aphex/Projects/chatbot-rnn/model.py:168 in __init__.: scalar_summary (from tensorflow.python.ops.logging_ops) is deprecated and will be removed after 2016-11-30.
Instructions for updating:
Please switch to tf.summary.scalar. Note that tf.summary.scalar uses the node name instead of the tag. This means that TensorFlow will automatically de-duplicate summary names based on the scope they are created in. Also, passing a tensor or list of tags to a scalar summary op is no longer supported.
WARNING:tensorflow:From /home/aphex/Projects/chatbot-rnn/model.py:183 in __init__.: merge_all_summaries (from tensorflow.python.ops.logging_ops) is deprecated and will be removed after 2016-11-30.
Instructions for updating:
Please switch to tf.summary.merge_all.
WARNING:tensorflow:From /home/aphex/.local/lib/python2.7/site-packages/tensorflow/python/ops/logging_ops.py:264 in merge_all_summaries.: merge_summary (from tensorflow.python.ops.logging_ops) is deprecated and will be removed after 2016-11-30.
Instructions for updating:
Please switch to tf.summary.merge.
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties: 
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.7335
pciBusID 0000:01:00.0
Total memory: 7.92GiB
Free memory: 6.98GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
WARNING:tensorflow:From chatbot.py:70 in sample_main.: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Use `tf.global_variables_initializer` instead.
Restoring weights...
W tensorflow/core/framework/op_kernel.cc:975] Not found: Tensor name "rnnlm/multi_rnn_cell/cell_1/gru_cell/gates/weights" not found in checkpoint files models/reddit/model.ckpt-4682964
	 [[Node: save/RestoreV2_11 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_11/tensor_names, save/RestoreV2_11/shape_and_slices)]]
W tensorflow/core/framework/op_kernel.cc:975] Not found: Tensor name "rnnlm/multi_rnn_cell/cell_1/gru_cell/gates/weights" not found in checkpoint files models/reddit/model.ckpt-4682964
	 [[Node: save/RestoreV2_11 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_11/tensor_names, save/RestoreV2_11/shape_and_slices)]]
W tensorflow/core/framework/op_kernel.cc:975] Not found: Tensor name "rnnlm/multi_rnn_cell/cell_1/gru_cell/gates/weights" not found in checkpoint files models/reddit/model.ckpt-4682964
	 [[Node: save/RestoreV2_11 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_11/tensor_names, save/RestoreV2_11/shape_and_slices)]]
W tensorflow/core/framework/op_kernel.cc:975] Not found: Tensor name "rnnlm/multi_rnn_cell/cell_1/gru_cell/gates/weights" not found in checkpoint files models/reddit/model.ckpt-4682964
	 [[Node: save/RestoreV2_11 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_11/tensor_names, save/RestoreV2_11/shape_and_slices)]]
W tensorflow/core/framework/op_kernel.cc:975] Not found: Tensor name "rnnlm/multi_rnn_cell/cell_1/gru_cell/gates/weights" not found in checkpoint files models/reddit/model.ckpt-4682964
	 [[Node: save/RestoreV2_11 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_11/tensor_names, save/RestoreV2_11/shape_and_slices)]]
W tensorflow/core/framework/op_kernel.cc:975] Not found: Tensor name "rnnlm/multi_rnn_cell/cell_1/gru_cell/gates/weights" not found in checkpoint files models/reddit/model.ckpt-4682964
	 [[Node: save/RestoreV2_11 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_11/tensor_names, save/RestoreV2_11/shape_and_slices)]]
Traceback (most recent call last):
  File "chatbot.py", line 301, in <module>
    main()
  File "chatbot.py", line 35, in main
    sample_main(args)
  File "chatbot.py", line 74, in sample_main
    saver.restore(sess, model_path)
  File "/home/aphex/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1416, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/home/aphex/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 766, in run
    run_metadata_ptr)
  File "/home/aphex/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 964, in _run
    feed_dict_string, options, run_metadata)
  File "/home/aphex/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1014, in _do_run
    target_list, options, run_metadata)
  File "/home/aphex/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1034, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Tensor name "rnnlm/multi_rnn_cell/cell_1/gru_cell/gates/weights" not found in checkpoint files models/reddit/model.ckpt-4682964
	 [[Node: save/RestoreV2_11 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_11/tensor_names, save/RestoreV2_11/shape_and_slices)]]
	 [[Node: save/RestoreV2_13/_21 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_36_save/RestoreV2_13", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

Caused by op u'save/RestoreV2_11', defined at:
  File "chatbot.py", line 301, in <module>
    main()
  File "chatbot.py", line 35, in main
    sample_main(args)
  File "chatbot.py", line 71, in sample_main
    saver = tf.train.Saver(net.save_variables_list())
  File "/home/aphex/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1028, in __init__
    self.build()
  File "/home/aphex/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1058, in build
    restore_sequentially=self._restore_sequentially)
  File "/home/aphex/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 652, in build
    restore_sequentially, reshape)
  File "/home/aphex/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 386, in _AddRestoreOps
    tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
  File "/home/aphex/.local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 225, in restore_op
    [spec.tensor.dtype])[0])
  File "/home/aphex/.local/lib/python2.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 441, in restore_v2
    dtypes=dtypes, name=name)
  File "/home/aphex/.local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "/home/aphex/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2371, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/aphex/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1258, in __init__
    self._traceback = _extract_stack()

NotFoundError (see above for traceback): Tensor name "rnnlm/multi_rnn_cell/cell_1/gru_cell/gates/weights" not found in checkpoint files models/reddit/model.ckpt-4682964
	 [[Node: save/RestoreV2_11 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_11/tensor_names, save/RestoreV2_11/shape_and_slices)]]
	 [[Node: save/RestoreV2_13/_21 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_36_save/RestoreV2_13", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

Training Time

How long is needed to train from scratch on fresh data, given the GTX 1080 Ti GPU setup?

Do you need any help with the repo?

Hello brother (gender not implied), I have navigated across multiple ML repos and tested them on the Reddit dataset, and I have to say yours is the most promising, as well as the best written and optimized. I am trying to build a personal assistant and I've just started scratching the surface of the ML business. So yeah, I would like to support your repo in any way necessary.

IndexError: too many indices for array

Traceback (most recent call last):
File "train.py", line 172, in
main()
File "train.py", line 41, in main
train(args)
File "train.py", line 93, in train
data_loader.cue_batch_pointer_to_epoch_fraction(global_epoch_fraction)
File "D:\chatbot-rnn-master\utils.py", line 233, in cue_batch_pointer_to_epoch_fraction
self._cue_batch_pointer_to_step_count(step_target)
File "D:\chatbot-rnn-master\utils.py", line 242, in _cue_batch_pointer_to_step_count
self._load_preprocessed(i)
File "D:\chatbot-rnn-master\utils.py", line 172, in _load_preprocessed
self.tensor = self.tensor[:self.num_batches * self.batch_size]
IndexError: too many indices for array

reddit_parse error

Reading from reddit_data\RS_2008-01.bz2
Traceback (most recent call last):
  File "reddit_parse.py", line 259, in <module>
    main()
  File "reddit_parse.py", line 37, in main
    parse_main(args)
  File "reddit_parse.py", line 87, in parse_main
    args.comment_cache_size, subreddit_dict, substring_blacklist, subreddit_whitelist, substring_blacklist)
  File "reddit_parse.py", line 105, in read_comments_into_cache
    subreddit_whitelist, substring_blacklist):
  File "reddit_parse.py", line 169, in post_qualifies
    body = json_object['body']
KeyError: 'body'
I'm on Windows with Python 3.6.7.

Enable multiple-GPU processing?

It seems that only one GPU can be used by the lib. Would it be possible for me to know whether there is a plan to develop a multiple-GPU version? How hard would that be?

Remove the beam search generator

Hi @pender, first of all, thank you so much for your repo. I found it really helpful for learning about RNNs: the code is clear enough, the Reddit-based model is well trained, and everything is cool.

I have only one curiosity: is it possible to generate the answer instantly from the model, without the character-after-character generation?
I think I understand that this generation effect comes from the beam search generator, but I can't fully get how it works.
It would be great if the model could write the answer in one instant, or if there were a parameter by which the user could decide what type of generation to use.

Thanks for your work and everything.
Alessandro.
