
tensorflowbook's Introduction

TensorFlow for Machine Intelligence


Welcome to the official book repository for TensorFlow for Machine Intelligence! Here, you'll find code from the book for easy testing on your own machine, as well as errata, and any additional content we can squeeze in down the line.

  • Code: You'll find code for each chapter inside of the chapters directory
  • Errata: Errata will be added to the errata directory as they are discovered. Send in a pull request if you have errata to report!

tensorflowbook's People

Contributors

arielscarpinelli, eerwitt, polaris-, rodneykeeling, samjabrahams, troymott


tensorflowbook's Issues

CNN Implementation

Would it be possible to have some code to run and evaluate the CNN Implementation model?

Thanks!

Michael

Typo on Chapter 3

There is a small typo in Chapter 3:

1-D Tensor with boolean data type

t_2 = np.array([[True, False, False], [False, False, True], [False, True, False]], dtype=np.bool)

This should have said 2-D Tensor and not 1-D.
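The rank can be checked by counting how deeply the literal is nested. The helper below is a minimal pure-Python sketch; the name `nested_shape` is hypothetical (it is not from the book), and it assumes a regular, non-ragged nested list.

```python
def nested_shape(value):
    """Return the shape of a regular (non-ragged) nested list as a tuple."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return tuple(shape)


# Same values as t_2 in the report, written as a plain nested list.
t_2 = [[True, False, False],
       [False, False, True],
       [False, True, False]]

shape = nested_shape(t_2)
rank = len(shape)
print(shape, rank)  # (3, 3) 2
```

Two levels of brackets give two dimensions, which is why the label should read 2-D.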

Sequence classification: GPU runs out of memory

Maybe the behaviour of dynamic_rnn changed, or the author used a graphics card with more memory than mine, but on a GTX 980 with 4GB, the sequence classification code doesn't run.

I think the reason is that the length of the longest review is around 2700, and 2700x300xbatch_size is too much for the card. It would be nice if the book addressed this issue.
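The estimate can be checked with simple arithmetic. The sketch below assumes float32 values (4 bytes each) and a hypothetical batch size of 128; neither figure is stated in the report. It counts only the padded input tensor, not the RNN activations kept for backpropagation, so real usage would be higher.

```python
BYTES_PER_FLOAT32 = 4  # assumption: inputs are stored as float32


def input_tensor_bytes(batch_size, max_length=2700, embedding_size=300):
    """Bytes for one padded float32 batch of shape
    [batch_size, max_length, embedding_size]."""
    return batch_size * max_length * embedding_size * BYTES_PER_FLOAT32


per_review = input_tensor_bytes(1)
per_batch = input_tensor_bytes(128)  # 128 is an assumed batch size

print(per_review)                   # 3240000
print(round(per_batch / 2**20, 1))  # 395.5 (MiB)
```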

I guess a solution would be to unroll the network to e.g. 50 steps, but still feed the batches... is it possible?
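One way to read this suggestion is to cut each long review into fixed-length chunks and feed them in order. The sketch below shows only the chunking step in pure Python; `split_into_chunks` is a hypothetical helper, and carrying the RNN state from one chunk to the next is deliberately left out.

```python
def split_into_chunks(sequence, chunk_length=50):
    """Split a sequence into consecutive chunks of at most chunk_length items."""
    if chunk_length <= 0:
        raise ValueError("chunk_length must be positive")
    return [sequence[i:i + chunk_length]
            for i in range(0, len(sequence), chunk_length)]


review = list(range(2700))  # stand-in for a 2700-token review
chunks = split_into_chunks(review)
print(len(chunks), len(chunks[0]), len(chunks[-1]))  # 54 50 50
```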

the first dimension of the kernel variable

I don't understand the following sentence in 05_object_recognition_and_classification:

In the example code, there is a single kernel which is the first dimension of the kernel variable.

According to the documentation, the shape of the kernel is [filter_height, filter_width, in_channels, out_channels], so the first dimension of the kernel is filter_height.

Ch4_softmax

File reading is broken. After fixing the file read as in the code below, the batch size breaks.

# Softmax example in TF using the classical Iris dataset
# Download iris.data from https://archive.ics.uci.edu/ml/datasets/Iris

import tensorflow as tf
import os

# this time weights form a matrix, not a column vector, one "weight vector" per class.
W = tf.Variable(tf.zeros([4, 3]), name="weights")
# so do the biases, one per class.
b = tf.Variable(tf.zeros([3], name="bias"))

def combine_inputs(X):
    return tf.matmul(X, W) + b

def inference(X):
    return tf.nn.softmax(combine_inputs(X))

def loss(X, Y):
    return tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(combine_inputs(X), Y))

def read_csv(batch_size, file_name, record_defaults):
    filename_queue = tf.train.string_input_producer([file_name])

    reader = tf.TextLineReader(skip_header_lines=1)
    key, value = reader.read(filename_queue)

    # decode_csv will convert a Tensor from type string (the text line) in
    # a tuple of tensor columns with the specified defaults, which also
    # sets the data type for each column
    decoded = tf.decode_csv(value, record_defaults=record_defaults)

    # batch actually reads the file and loads "batch_size" rows in a single tensor
    return tf.train.shuffle_batch(decoded,
                                  batch_size=batch_size,
                                  capacity=batch_size * 50,
                                  min_after_dequeue=batch_size)

def inputs():

    sepal_length, sepal_width, petal_length, petal_width, label =\
        read_csv(100, "./iris.data", [[0.0], [0.0], [0.0], [0.0], [""]])

    # convert class names to a 0 based class index.
    label_number = tf.to_int32(tf.argmax(tf.to_int32(tf.pack([
        tf.equal(label, ["Iris-setosa"]),
        tf.equal(label, ["Iris-versicolor"]),
        tf.equal(label, ["Iris-virginica"])
    ])), 0))

    # Pack all the features that we care about in a single matrix;
    # We then transpose to have a matrix with one example per row and one feature per column.
    features = tf.transpose(tf.pack([sepal_length, sepal_width, petal_length, petal_width]))

    return features, label_number

def train(total_loss):
    learning_rate = 0.01
    return tf.train.GradientDescentOptimizer(learning_rate).minimize(total_loss)

def evaluate(sess, X, Y):

    predicted = tf.cast(tf.arg_max(inference(X), 1), tf.int32)

    print sess.run(tf.reduce_mean(tf.cast(tf.equal(predicted, Y), tf.float32)))

# Launch the graph in a session, setup boilerplate
with tf.Session() as sess:

    tf.initialize_all_variables().run()

    X, Y = inputs()

    total_loss = loss(X, Y)
    train_op = train(total_loss)

    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    # actual training loop
    training_steps = 1000
    for step in range(training_steps):
        sess.run([train_op])
        # for debugging and learning purposes, see how the loss gets decremented thru training steps
        if step % 10 == 0:
            print "loss: ", sess.run([total_loss])

    evaluate(sess, X, Y)

    coord.request_stop()
    coord.join(threads)
    sess.close()

loss:

OutOfRangeError Traceback (most recent call last)
in ()
90 # for debugging and learning purposes, see how the loss gets decremented thru training steps
91 if step % 10 == 0:
---> 92 print "loss: ", sess.run([total_loss])
93
94 evaluate(sess, X, Y)

/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in run(self, fetches, feed_dict, options, run_metadata)
370 try:
371 result = self._run(None, fetches, feed_dict, options_ptr,
--> 372 run_metadata_ptr)
373 if run_metadata:
374 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in _run(self, handle, fetches, feed_dict, options, run_metadata)
634 try:
635 results = self._do_run(handle, target_list, unique_fetches,
--> 636 feed_dict_string, options, run_metadata)
637 finally:
638 # The movers are no longer used. Delete them.

/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
706 if handle is None:
707 return self._do_call(_run_fn, self._session, feed_dict, fetch_list,
--> 708 target_list, options, run_metadata)
709 else:
710 return self._do_call(_prun_fn, self._session, handle, feed_dict,

/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in _do_call(self, fn, *args)
726 except KeyError:
727 pass
--> 728 raise type(e)(node_def, op, message)
729
730 def _extend_graph(self):

OutOfRangeError: RandomShuffleQueue '_7_shuffle_batch_1/random_shuffle_queue' is closed and has insufficient elements (requested 100, current size 49)
[[Node: shuffle_batch_1 = QueueDequeueMany[_class=["loc:@shuffle_batch_1/random_shuffle_queue"], component_types=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](shuffle_batch_1/random_shuffle_queue, shuffle_batch_1/n)]]
Caused by op u'shuffle_batch_1', defined at:
File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
"main", fname, loader, pkg_name)
File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/usr/local/lib/python2.7/site-packages/ipykernel/main.py", line 3, in
app.launch_new_instance()
File "/usr/local/lib/python2.7/site-packages/traitlets/config/application.py", line 589, in launch_instance
app.start()
File "/usr/local/lib/python2.7/site-packages/ipykernel/kernelapp.py", line 442, in start
ioloop.IOLoop.instance().start()
File "/usr/local/lib/python2.7/site-packages/zmq/eventloop/ioloop.py", line 162, in start
super(ZMQIOLoop, self).start()
File "/usr/local/lib/python2.7/site-packages/tornado/ioloop.py", line 883, in start
handler_func(fd_obj, events)
File "/usr/local/lib/python2.7/site-packages/tornado/stack_context.py", line 275, in null_wrapper
return fn(_args, *_kwargs)
File "/usr/local/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 440, in _handle_events
self._handle_recv()
File "/usr/local/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 472, in _handle_recv
self._run_callback(callback, msg)
File "/usr/local/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 414, in _run_callback
callback(_args, *_kwargs)
File "/usr/local/lib/python2.7/site-packages/tornado/stack_context.py", line 275, in null_wrapper
return fn(_args, *_kwargs)
File "/usr/local/lib/python2.7/site-packages/ipykernel/kernelbase.py", line 276, in dispatcher
return self.dispatch_shell(stream, msg)
File "/usr/local/lib/python2.7/site-packages/ipykernel/kernelbase.py", line 228, in dispatch_shell
handler(stream, idents, msg)
File "/usr/local/lib/python2.7/site-packages/ipykernel/kernelbase.py", line 391, in execute_request
user_expressions, allow_stdin)
File "/usr/local/lib/python2.7/site-packages/ipykernel/ipkernel.py", line 199, in do_execute
shell.run_cell(code, store_history=store_history, silent=silent)
File "/usr/local/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2723, in run_cell
interactivity=interactivity, compiler=compiler, result=result)
File "/usr/local/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2825, in run_ast_nodes
if self.run_code(code, result):
File "/usr/local/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2885, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 78, in
X, Y = inputs()
File "", line 45, in inputs
sepal_length, sepal_width, petal_length, petal_width, label = read_csv(100, "iris.data", [[0.0], [0.0], [0.0], [0.0], [""]])
File "", line 40, in read_csv
min_after_dequeue=batch_size)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/input.py", line 779, in shuffle_batch
dequeued = queue.dequeue_many(batch_size, name=name)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/data_flow_ops.py", line 400, in dequeue_many
self._queue_ref, n=n, component_types=self._dtypes, name=name)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 465, in _queue_dequeue_many
timeout_ms=timeout_ms, name=name)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/op_def_library.py", line 704, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2260, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1230, in init
self._traceback = _extract_stack()
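The numbers in the error are consistent with a single pass over the file: iris.data holds 150 records, skip_header_lines=1 drops one of them, and 149 - 100 = 49. One possible explanation (an assumption, not confirmed in this report) is that a blank or malformed line makes the reader fail and close the queue. The sketch below uses only the standard library to look for such lines; `find_bad_rows` is a hypothetical helper and the sample text is made up for illustration.

```python
import csv
import io


def find_bad_rows(text, expected_fields=5):
    """Return 1-based row numbers whose field count differs from expected_fields."""
    bad = []
    reader = csv.reader(io.StringIO(text))
    for row_number, row in enumerate(reader, start=1):
        if len(row) != expected_fields:
            bad.append(row_number)
    return bad


# Two valid records followed by a blank line (illustrative sample only).
sample = ("5.1,3.5,1.4,0.2,Iris-setosa\n"
          "7.0,3.2,4.7,1.4,Iris-versicolor\n"
          "\n")
print(find_bad_rows(sample))  # [3]
```

To check a real file, pass it the contents returned by `open("iris.data").read()`.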

Chapter 6?

Is any code available for Chapter 6, particularly the Char RNN?

Converting images to TFRecords is rather slow. Is this normal?

This is my computing environment: Ubuntu 16.04, GTX 1080, CUDA 8.0 RC, Python 2, TensorFlow 0.10.0. I realize that your code targets Python 3.x, but that is not a problem. The real problem appeared when I tested the write_records_file function in Chapter 5 - 05 CNN Implementation.ipynb. At first the process ran well, but after a while it took almost 20 to 30 seconds to generate one TFRecord object! I checked my graphics card and only 2% of its memory was in use. I then switched to CPU-only mode and used all 28 threads to run the code, but nothing improved: it was still very slow, with CPU usage at 5% to 8%. Is this normal?

Where is the rest of the demo code for the CNN implementation?

In the Chapter 5 CNN implementation section, we split the dog pictures into training data (80%) and testing data (20%). What is the loss value after 100000 training iterations with the default settings? How can we evaluate this CNN model (its accuracy, and how to use the trained model)? Could someone share the rest of the demo code or some run results, including writing summaries to the TensorBoard graph and events?

H Image - Chapter 5, Page 175

I have copied the H image on page 175 to my computer (right click, "Save Image As") but when I run the code, Python just runs forever, and I can't interrupt it. I have the latest version of TensorFlow. Am I doing something wrong?

import tensorflow as tf

image_filename = "my_dir/H.jpg" # Location of the H image on my computer.
filename_queue = tf.train.string_input_producer([image_filename])

image_reader = tf.WholeFileReader()
_, image_file = image_reader.read(filename_queue)
image = tf.image.decode_jpeg(image_file)

sess = tf.Session()

sess.run(image)
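A likely cause, stated here as an assumption: `tf.train.string_input_producer` only builds a queue, and nothing fills it until the queue runner threads are started (the Chapter 4 softmax listing earlier on this page calls `tf.train.start_queue_runners` for that purpose). Without them, `sess.run(image)` waits on an empty queue forever. The sketch below is a standard-library analogy of that behaviour, not TensorFlow code; `read_one` is a hypothetical helper.

```python
import queue
import threading


def read_one(filename_queue, timeout):
    """Return the next item, or None if nothing arrives within the timeout."""
    try:
        return filename_queue.get(timeout=timeout)
    except queue.Empty:
        return None


filename_queue = queue.Queue()

# No producer thread yet: the read cannot complete.
# A get() without a timeout would block forever, like the hanging script.
before = read_one(filename_queue, timeout=0.1)

# Start a producer thread, analogous to starting the queue runners.
producer = threading.Thread(target=filename_queue.put, args=("my_dir/H.jpg",))
producer.start()
after = read_one(filename_queue, timeout=5.0)
producer.join()

print(before, after)  # None my_dir/H.jpg
```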

Chapter 5 code errors in cnn sample

I have already fixed some errors myself. However, the way parameters are passed to this method seems wrong, and I get the error "got an unexpected keyword argument":

layer_one = tf.contrib.layers.convolution2d(
    float_image_batch,
    num_output_channels=32,
    kernel_size=(5,5),
    activation_fn=tf.nn.relu,
    weight_init=tf.random_normal,
    stride=(2, 2),
    trainable=True)

More details are in my Stack Overflow question:
http://stackoverflow.com/questions/41539658/tensorflow-error-when-i-try-to-use-tf-contrib-layers-convolution2d/41540092#41540092

ch5 - pg 179 suggestion

Hey, Really enjoying the book!

A few pages earlier you suggested adding a folder for exporting the dog data set.
Right before the definition of write_records_file(...), it would be useful to mention creating the output/training-images and output/testing-images directories, to avoid an error when running the ipynb for this section (CNN Implementation).
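The directories can also be created from code before the records are written. This is a minimal sketch assuming Python 3 (for `exist_ok`); `ensure_output_dirs` is a hypothetical helper, and it is demonstrated in a temporary directory rather than in the notebook's working directory.

```python
import os
import tempfile


def ensure_output_dirs(base_dir):
    """Create output/training-images and output/testing-images under base_dir."""
    created = []
    for name in ("training-images", "testing-images"):
        path = os.path.join(base_dir, "output", name)
        os.makedirs(path, exist_ok=True)  # no error if it already exists
        created.append(path)
    return created


# Demonstrated in a temporary directory; in the notebook, base_dir would be ".".
with tempfile.TemporaryDirectory() as tmp:
    paths = ensure_output_dirs(tmp)
    all_exist = all(os.path.isdir(p) for p in paths)

print(all_exist)  # True
```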

Thanks!

Chapter 6 04_arxiv not suitable for tensorflow2.x version?

Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH:

rnn_cell = tf.nn.rnn_cell.GRUCell
AttributeError: module 'tensorflow._api.v2.nn' has no attribute 'rnn_cell'

It seems that the code is not suitable for tensorflow2.x version

the accuracy of chapter5

Can you show the accuracy of the Chapter 5 model on the test dataset?
I trained the model and the accuracy is quite low.

sp? on Chapter 4

Under Saving training checkpoints, the word proprietary is spelled wrong.

The iterative batch generation code is so strange to me in Chapter 6 04_arxiv preprocessing.py.

First, for i in range(0, len(text) - self.length + 1, self.max_length // 2):. I'm sorry, but what if len(text) is actually smaller than self.length (I assume it's the max_length)? And why would I need to do this process?

Second, assert all(len(x) == len(windows[0]) for x in windows). Why do I need to make every text the same length?

Next, the following while True. Isn't it going to loop infinitely?

Last, batch = windows[i: i + self.batch_size]. I don't think last batch generated will be the same size as previous ones in first dimension.
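Some of these questions can be explored with a small pure-Python sketch. It is not the code from preprocessing.py: it assumes a single `length` parameter for both the window size and the step, and the function names are hypothetical. It shows that a text shorter than the window yields no windows at all, that every window has the same length (so the windows can be stacked into a rectangular array), and that plain slicing leaves a shorter final batch. It does not cover the `while True` loop.

```python
def windows(text, length):
    """Half-overlapping windows of exactly `length` characters."""
    step = max(1, length // 2)
    return [text[i:i + length]
            for i in range(0, len(text) - length + 1, step)]


def batches(items, batch_size):
    """Consecutive slices of items; the last one may be shorter."""
    return [items[i:i + batch_size]
            for i in range(0, len(items), batch_size)]


print(windows("abcdefgh", 4))  # ['abcd', 'cdef', 'efgh']
print(windows("abc", 4))       # []  (text shorter than the window)
print([len(b) for b in batches(list(range(5)), 2)])  # [2, 2, 1]
```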

Hope someone could answer my questions:)

Running logistic regression throws error

The example code that can be found in https://raw.githubusercontent.com/backstopmedia/tensorflowbook/master/chapters/04_machine_learning_basics/logistic_regression.py throws an error.

Platform: Mac
Tensorflow version: 0.9


Traceback (most recent call last):
  File "logistic_regression.py", line 90, in <module>
    sess.run([train_op])
  File "/Library/Python/2.7/site-packages/tensorflow/python/client/session.py", line 372, in run
    run_metadata_ptr)
  File "/Library/Python/2.7/site-packages/tensorflow/python/client/session.py", line 636, in _run
    feed_dict_string, options, run_metadata)
  File "/Library/Python/2.7/site-packages/tensorflow/python/client/session.py", line 708, in _do_run
    target_list, options, run_metadata)
  File "/Library/Python/2.7/site-packages/tensorflow/python/client/session.py", line 728, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.OutOfRangeError: RandomShuffleQueue '_0_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 100, current size 0)
     [[Node: shuffle_batch = QueueDequeueMany[_class=["loc:@shuffle_batch/random_shuffle_queue"], component_types=[DT_FLOAT, DT_FLOAT, DT_INT32, DT_STRING, DT_STRING, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_FLOAT, DT_STRING, DT_STRING], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]
Caused by op u'shuffle_batch', defined at:
  File "logistic_regression.py", line 79, in <module>
    X, Y = inputs()
  File "logistic_regression.py", line 46, in inputs
    read_csv(100, "train.csv", [[0.0], [0.0], [0], [""], [""], [0.0], [0.0], [0.0], [""], [0.0], [""], [""]])
  File "logistic_regression.py", line 41, in read_csv
    min_after_dequeue=batch_size)
  File "/Library/Python/2.7/site-packages/tensorflow/python/training/input.py", line 779, in shuffle_batch
    dequeued = queue.dequeue_many(batch_size, name=name)
  File "/Library/Python/2.7/site-packages/tensorflow/python/ops/data_flow_ops.py", line 434, in dequeue_many
    self._queue_ref, n=n, component_types=self._dtypes, name=name)
  File "/Library/Python/2.7/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 465, in _queue_dequeue_many
    timeout_ms=timeout_ms, name=name)
  File "/Library/Python/2.7/site-packages/tensorflow/python/ops/op_def_library.py", line 704, in apply_op
    op_def=op_def)
  File "/Library/Python/2.7/site-packages/tensorflow/python/framework/ops.py", line 2260, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/Library/Python/2.7/site-packages/tensorflow/python/framework/ops.py", line 1230, in __init__
    self._traceback = _extract_stack()

chapter 5 working with images page 161

I downloaded the code of ch5's test-input-image.jpg and tried to replicate the results shown at the bottom of the page, array([[[..., but I couldn't get it to run.
import tensorflow as tf
import numpy as np
import Image
sess = tf.Session()
image_filename = "test-input-image.jpg"

# loaded the image directly in the folder of the script
# display images in python
img = Image.open(image_filename)
img.show()
# display images in python

print image_filename
filename_queue = tf.train.string_input_producer(tf.train.match_filenames_once(image_filename))
print filename_queue
image_reader = tf.WholeFileReader()
key,image_file = image_reader.read(filename_queue)
print key
print image_file
image = tf.image.decode_jpeg(image_file)
print image
sess.run(tf.global_variables_initializer())
sess.run(image)
print(sess.run(image))
and the results are
<tensorflow.python.ops.data_flow_ops.FIFOQueue object at 0x7f555b882110>
Tensor("ReaderRead:0", shape=(), dtype=string)
Tensor("ReaderRead:1", shape=(), dtype=string)
Tensor("DecodeJpeg:0", shape=(?, ?, ?), dtype=uint8)

Clearly, the image was not read in properly; otherwise, why is shape=(?, ?, ?) after decode_jpeg? I also used the Python code at the beginning to display the input and make sure the JPEG image was read in correctly, which it was.
I would appreciate it if someone could provide some insight into what the problem could be.
Thanks in advance for your time.

Ch 4 Logistic regression file address

This mechanism to create full path:

filename_queue = tf.train.string_input_producer([os.path.dirname(__file__) + "/" + file_name])

produces just '/train.csv' for me on Mac OSX.

That generates a confusing and scary stack trace, which misleads a Tensorflow novice for quite some time.

Finally, when I removed the path setting, the code worked just fine (this naturally requires that the input file is in the same folder):

filename_queue = tf.train.string_input_producer([file_name])
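The '/train.csv' result follows from how `os.path.dirname` treats a bare relative script name: it returns an empty string, so the concatenation starts at the filesystem root. A minimal sketch of a more robust alternative is shown below; `data_path` is a hypothetical helper, and the script name is passed in as a plain string so the example runs anywhere.

```python
import os


def data_path(script_file, file_name):
    """Build an absolute path to file_name next to script_file."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    return os.path.join(script_dir, file_name)


# With a bare relative script name, dirname() is the empty string,
# so plain concatenation yields a path at the filesystem root.
naive = os.path.dirname("logistic2.py") + "/" + "train.csv"
print(naive)  # /train.csv

robust = data_path("logistic2.py", "train.csv")
print(os.path.isabs(robust))  # True
```

Inside a script, `data_path(__file__, "train.csv")` would be the corresponding call.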

Stack trace:

E tensorflow/core/client/tensor_c_api.cc:485] /train.csv
     [[Node: ReaderRead = ReaderRead[_class=["loc:@TextLineReader", "loc:@input_producer"], _device="/job:localhost/replica:0/task:0/cpu:0"](TextLineReader, input_producer)]]
E tensorflow/core/client/tensor_c_api.cc:485] RandomShuffleQueue '_2_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 100, current size 0)
     [[Node: shuffle_batch = QueueDequeueMany[_class=["loc:@shuffle_batch/random_shuffle_queue"], component_types=[DT_FLOAT, DT_FLOAT, DT_INT32, DT_STRING, DT_STRING, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_FLOAT, DT_STRING, DT_STRING], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]
Traceback (most recent call last):
  File "logistic2.py", line 91, in <module>
    sess.run([train_op])
  File "/Users/villenahkuri/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 382, in run
    run_metadata_ptr)
  File "/Users/villenahkuri/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 655, in _run
    feed_dict_string, options, run_metadata)
  File "/Users/villenahkuri/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 723, in _do_run
    target_list, options, run_metadata)
  File "/Users/villenahkuri/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 743, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.OutOfRangeError: RandomShuffleQueue '_2_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 100, current size 0)
     [[Node: shuffle_batch = QueueDequeueMany[_class=["loc:@shuffle_batch/random_shuffle_queue"], component_types=[DT_FLOAT, DT_FLOAT, DT_INT32, DT_STRING, DT_STRING, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_FLOAT, DT_STRING, DT_STRING], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]
Caused by op u'shuffle_batch', defined at:
  File "logistic2.py", line 80, in <module>
    X, Y = inputs()
  File "logistic2.py", line 47, in inputs
    read_csv(100, "train.csv", [[0.0], [0.0], [0], [""], [""], [0.0], [0.0], [0.0], [""], [0.0], [""], [""]])
  File "logistic2.py", line 42, in read_csv
    min_after_dequeue=batch_size)
  File "/Users/villenahkuri/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/input.py", line 817, in shuffle_batch
    dequeued = queue.dequeue_many(batch_size, name=name)
  File "/Users/villenahkuri/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/data_flow_ops.py", line 435, in dequeue_many
    self._queue_ref, n=n, component_types=self._dtypes, name=name)
  File "/Users/villenahkuri/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 867, in _queue_dequeue_many
    timeout_ms=timeout_ms, name=name)
  File "/Users/villenahkuri/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 703, in apply_op
    op_def=op_def)
  File "/Users/villenahkuri/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2310, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/Users/villenahkuri/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1232, in __init__
    self._traceback = _extract_stack()

Ch. 6. Word Vector Embeddings - Python 2.7

Hi Guys,

I've been using the book over the last few days and have now arrived at Ch. 6. I am struggling to make the code from the Word Vector Embeddings section work with Python 2.7.

So far I have changed a few things:

  1. All the bz2.open() occurrences replaced with bz2.BZ2File()
  2. from urllib.request import urlopen replaced with from urllib import urlopen

Now I am stuck with the _read_pages(self, url) method... This is the error that I get.

Read pages

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-45449cf53ecf> in <module>()
    146     'enwiki-20161120-pages-meta-current1.xml-p000000010p000030303.bz2',
    147     './',
--> 148     params.vocabulary_size)
    149 
    150 # def skipgrams(pages, max_context)

<ipython-input-6-45449cf53ecf> in __init__(self, url, cache_dir, vocabulary_size)
      9         if not os.path.isfile(self._pages_path):
     10             print('Read pages')
---> 11             self._read_pages(url)
     12         if not os.path.isfile(self._vocabulary_path):
     13             print('Build vocabulaty')

<ipython-input-6-45449cf53ecf> in _read_pages(self, url)
     46                     continue
     47                 page = element.findtext('./{*}revision/{*}test')
---> 48                 words = self._tokenize(page)
     49                 pages.write(''.join(words) + '\n')
     50                 element.clear()

<ipython-input-6-45449cf53ecf> in _tokenize(cls, page)
     55         # *ERROR expected string or buffer
     56 
---> 57         words = cls.TOKEN_REGEX.findall(page)
     58 
     59         words = [x.lower() for x in words]

TypeError: expected string or buffer
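The TypeError means the regex received something that is not a string. `findtext` returns None when its path matches nothing, which would produce exactly this error. Two possible reasons, both assumptions: the path in the traceback ends in `test` rather than `text`, and the `{*}` namespace wildcard may not be understood by the ElementTree version shipped with Python 2.7. The sketch below, written for Python 3, shows the None case and a guard for it; `TOKEN_REGEX` here is a stand-in pattern, not the book's.

```python
import re
import xml.etree.ElementTree as ET

TOKEN_REGEX = re.compile(r"[A-Za-z]+")  # stand-in; the book's pattern may differ


def tokenize(page):
    """Lowercased word tokens; a missing page (None) yields an empty list."""
    if page is None:
        return []
    return [word.lower() for word in TOKEN_REGEX.findall(page)]


element = ET.fromstring(
    "<page><revision><text>Hello World</text></revision></page>")

print(tokenize(element.findtext("./revision/text")))  # ['hello', 'world']
print(tokenize(element.findtext("./revision/test")))  # []  (no such tag, so None)
```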

I copied the code to my repository <-- Thanks for helping out!
