Comments (6)
Thanks for the response. I just wanted to be sure this is how it works. I am making my own image network based on this code, and I want all workers to pitch in for one epoch because the processing is very heavy (3D images). I am using a shared queue of filenames (each file is an image) instead of examples, precisely to address the issue you mentioned.
from cloudml-samples.
@MiguelMonteiro if you find yourself performance constrained, you can also use a shared queue for the filenames and a local queue for the shuffle_batcher, and split up your epoch between num_worker files. Then only filenames will traverse the network, but workers will all work on different examples.
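To illustrate the idea of workers handling disjoint sets of files, here is a minimal sketch in plain Python. It is not the sample's code and does not use TensorFlow queues: `shard_filenames`, the file names, and the worker count are all hypothetical, and the round-robin split is just one simple way to divide an epoch. It ignores shuffling entirely.

```python
def shard_filenames(filenames, num_workers, worker_index):
    """Return the filenames assigned to one worker (round-robin split).

    Hypothetical helper for illustration; not part of cloudml-samples.
    """
    if not 0 <= worker_index < num_workers:
        raise ValueError("worker_index must be in [0, num_workers)")
    # Worker k takes every file whose position i satisfies i % num_workers == k.
    return [name for i, name in enumerate(filenames)
            if i % num_workers == worker_index]


# Hypothetical file list: one image per file.
filenames = ["image_{:03d}.tfrecord".format(i) for i in range(10)]
num_workers = 3

# One shard per worker; together they cover the epoch exactly once.
shards = [shard_filenames(filenames, num_workers, w)
          for w in range(num_workers)]

for worker_index, shard in enumerate(shards):
    print(worker_index, shard)
```

With a split like this, only the short filename strings need to be coordinated between workers, while each worker reads and parses its own files locally.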
Haha, didn't see your earlier comment. Looks like you are already doing exactly what I suggested =).
My understanding is that there's a performance trade-off here (which doesn't matter much since the census dataset is so small, but from a pedagogical perspective...). If shared_name is specified, input data must traverse the network several additional times: read as binary to the worker -> enqueued to a variable on a param-server -> read as binary to the worker -> enqueued as a parsed tensor to the param-server -> read as a parsed tensor to the worker. This makes the memory (for buffering) and network requirements of the cluster much larger. Now in this case, a single census example is quite small, so you are probably right: we should be using global variables for the queues. But for networks with large features (like image networks), this would normally be the right approach.
@puneith I think for the move into census we should use shared_queue=True most likely.
@MiguelMonteiro This code now uses tf.data (introduced in TF 1.4). Please see https://www.tensorflow.org/api_docs/python/tf/data/TextLineDataset#shard and use it appropriately in the code. Feel free to send a PR as well, since I see this as a generic sample feature.