Comments (3)
I think I found a problem and made an improvement in my fork.
Could you verify it? It gives a much faster result for resnet101, but I'm not sure it is really correct.
https://github.com/jeageun/pipedream/blob/master/graph/graph.py
Thanks
from pipedream.
Can you summarize what the intended purpose of the change is?
I think your implementation labels the depth of every layer using a depth-first search (DFS). That approach works when the graph has no merge points, but at a merge point, any depth assigned along a shorter path is invalid. Since the invalid path is never removed from the work queue, DFS has to follow it to its end, and only afterwards does the valid path overwrite the depth values.
In my modified code:
- I prune these redundant paths before they end. Since DFS cannot tell whether the current path is valid, I use BFS instead.
- I skip recomputing a node's depth when a previous depth result already exists.
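To make the idea concrete, here is a minimal sketch of BFS-style depth labeling that finalizes a merge node only after all of its predecessors have been processed. This is an illustration under my own assumptions, not the actual code in the linked graph.py; the names `label_depths_bfs`, `nodes`, and `edges` are hypothetical.

```python
from collections import deque

def label_depths_bfs(nodes, edges):
    """Label each node's depth as the length of the longest path from a source.

    Processing nodes in topological (BFS) order means a merge node is only
    enqueued once all its predecessors are finalized -- unlike naive DFS,
    which may first reach it via a shorter path and must later revisit it.
    `nodes`: iterable of hashable node ids.
    `edges`: dict mapping node id -> list of successor ids.
    (Hypothetical helper, not PipeDream's actual graph API.)
    """
    in_degree = {n: 0 for n in nodes}
    for src in edges:
        for dst in edges[src]:
            in_degree[dst] += 1

    depth = {n: 0 for n in nodes}
    queue = deque(n for n in nodes if in_degree[n] == 0)  # start at sources
    while queue:
        node = queue.popleft()
        for succ in edges.get(node, []):
            # Keep the longest-path depth, so a shorter merge path never wins.
            depth[succ] = max(depth[succ], depth[node] + 1)
            in_degree[succ] -= 1
            if in_degree[succ] == 0:  # all predecessors finalized
                queue.append(succ)
    return depth
```

For a diamond-shaped merge (a -> b -> c -> d, plus a shortcut a -> d), this assigns d a depth of 3 in a single pass, with no invalid shorter-path labeling to undo.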
Related Issues (20)
- Handling uneven number of batches per replicated instance of a layer
- GPU Peer2Peer communication via --num_ranks_in_server argument
- Resource temporarily unavailable
- To run PipeDream_2BW branch without --recompute_step
- The BLEU score of translation model seems abnormal. The model doesn't seem to train effectively.
- GPT2 355m model convergence with 2BW training
- Is there AllReduce in data parallelism?
- How is the Double-Buffered Weight Mechanism implemented?
- Supporting T5
- The arguments of self.start_helper_thread() should be more flexible instead of fixed as int64.
- Question about time complexity of PipeDream-2BW's planner algorithm
- Question about PipeDream's optimizer
- AttributeError: module 'models.resnet50.resnet50' has no attribute 'model'
- Is there any 2bw code that will run on the native GPU
- AttributeError: module 'torch.distributed' has no attribute 'P2POp'
- Running in docker will give you an error that you can't find a physical address
- what is the role of pre_hook_pytorch_latest.patch?
- When I was testing the pipedream code with version-updated torch, I encountered the following error (1.1.0 -> 1.11.0):
- optimizer got an empty parameter list when rank=1
- same train_loader but got different loader size