Comments (11)
I am working on a fix to at least get rid of check_call by using fork and IPC, though it might not have to be this complicated.
from nnsmith.
https://github.com/ise-uiuc/nnsmith/blob/main/nnsmith/graph_input_gen.py#L38
Made a commit to use a real fork.
We still need to understand why it hangs, so that we can further improve this into a single-process mode for best efficiency and engineering convenience.
BTW, inconsistency in code is bad, so we might want to stop using gen_model_and_range or migrate it into the new one. @lazycal
from nnsmith.
Though I'm not sure what functionality you want, one alternative I tried is to use multiprocess (you can toggle it by copy-pasting the forked function from https://gist.github.com/schlamar/2311116#gistcomment-3932763 into util.py and uncommenting this line: nnsmith/nnsmith/graph_input_gen.py, line 28 in 2a487d5). I ran into these problems:
- It is more difficult to reproduce a generation failure, since it is different from directly calling graph_gen.py.
- Previously I found it still hung, so I thought the forked process had crashed and the main process failed to detect it (this is indeed possible with that forked function). Now I think the hang is more likely caused by z3.
- check_call is convenient in that it has a timeout option as well. Not sure about multiprocess.
So if what you meant by fork and IPC is what I described, then the 3 problems above may be worth noting.
from nnsmith.
@lazycal What I mean by "stateful" is this: in coverage-guided fuzzing, for example, we need to let the generation function in the subprocess know about coverage changes. But passing the coverage information through the command line (i.e., check_call) is not a good fit.
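To make the "stateful" point concrete, here is a minimal sketch, assuming a POSIX system where the `fork` start method is available. `COVERAGE`, `generate`, and `generate_with_state` are hypothetical names for illustration, not nnsmith APIs. It only shows that a forked child sees the parent's in-memory state as of fork time, without anything being serialized onto a command line.

```python
import multiprocessing as mp

# Hypothetical in-memory fuzzer state (e.g. IDs of covered branches).
COVERAGE = set()


def generate(queue):
    # Runs in the forked child: COVERAGE is the parent's copy at fork time.
    queue.put(len(COVERAGE))


def generate_with_state():
    ctx = mp.get_context("fork")  # assumption: POSIX; not available on Windows
    queue = ctx.Queue()
    proc = ctx.Process(target=generate, args=(queue,))
    proc.start()
    seen = queue.get(timeout=10)  # read the child's answer before joining
    proc.join()
    return seen
```

Changes the child makes to `COVERAGE` stay in the child, so results still have to come back through some IPC channel such as the queue used here.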
from nnsmith.
@lazycal Yeah, my new implementation has the same functionality as launching a new process (e.g., timeout).
Check here: https://github.com/ise-uiuc/nnsmith/blob/main/nnsmith/graph_input_gen.py#L66
But it becomes much easier, as you can simply dispatch a subprocess to execute a Python function with real forking (e.g., copy-on-write, isolation, IPC).
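For readers who want to see the general shape of this approach, below is a rough sketch of running a Python function in a forked child with a timeout and crash detection. It is not the code at the link above; `run_forked` and its status strings are made up for illustration, and it assumes a POSIX system with the `fork` start method.

```python
import multiprocessing as mp
import os
import time


def run_forked(fn, args=(), timeout=None):
    """Run fn(*args) in a forked child and return (status, payload)."""
    ctx = mp.get_context("fork")  # assumption: POSIX; not available on Windows
    recv_end, send_end = ctx.Pipe(duplex=False)

    def child():
        recv_end.close()
        try:
            send_end.send(("ok", fn(*args)))
        except Exception as exc:
            send_end.send(("error", repr(exc)))
        finally:
            send_end.close()

    proc = ctx.Process(target=child)
    proc.start()
    # Close the parent's copy of the write end so that a child that dies
    # without sending anything shows up as EOF instead of a silent hang.
    send_end.close()
    try:
        if recv_end.poll(timeout):
            result = recv_end.recv()
        else:
            result = ("timeout", None)
    except EOFError:
        result = ("crashed", None)
    finally:
        recv_end.close()
        if proc.is_alive():
            proc.terminate()  # kill a child that is still running (e.g. stuck)
        proc.join()
    return result


if __name__ == "__main__":
    print(run_forked(lambda x: x + 1, args=(41,)))
    print(run_forked(time.sleep, args=(5,), timeout=0.2))
    print(run_forked(os._exit, args=(1,)))
```

Because the child is forked, `fn` does not need to be picklable; only the return value travels through the pipe.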
from nnsmith.
Data communication can be done using shared variables, so we don't have to write everything to disk, which is slow and inconvenient.
Check here: https://github.com/ise-uiuc/nnsmith/blob/main/nnsmith/graph_input_gen.py#L58
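As an illustration of the shared-variable idea (not the code at the link above), here is a small sketch using `multiprocessing.Value` and `multiprocessing.Array`. The names `worker`, `run_with_shared`, `n_nodes`, and `shape`, and the values written into them, are hypothetical; the `fork` start method is assumed.

```python
import multiprocessing as mp


def worker(n_nodes, shape):
    # Runs in the child: write results straight into shared memory.
    n_nodes.value = 7
    for i, dim in enumerate((1, 3, 224, 224)):
        shape[i] = dim


def run_with_shared(timeout=10):
    ctx = mp.get_context("fork")  # assumption: POSIX; not available on Windows
    n_nodes = ctx.Value("i", 0)   # shared C int, initially 0
    shape = ctx.Array("q", 4)     # shared array of four 64-bit ints, zeroed
    proc = ctx.Process(target=worker, args=(n_nodes, shape))
    proc.start()
    proc.join(timeout)
    if proc.is_alive():           # child exceeded the time budget
        proc.terminate()
        proc.join()
    return n_nodes.value, shape[:]
```

Shared values like these suit small, fixed-size results; larger or variable-size objects would more likely go through a pipe or queue.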
from nnsmith.
Just read your code. It looks like you are doing the same thing as the forked function I cited, but yours fixes Problems 2 and 3. Cool! If we don't worry about reproducibility, then it's fine.
from nnsmith.
Regarding reproducibility, do you mean the seed? I think I also adapted that, but I'm not sure it was done accurately.
from nnsmith.
BTW, it is important to accurately locate why it hangs.
If it is an issue with our operators, then we might need to come up with some ideas to solve it.
We should definitely report it to the z3 community if you believe it is a z3 bug.
In my last hang experience, it was due to the operators (i.e., Reshape5D).
You can try to diagnose it by printing the constraint expressions and then writing some simple z3 expressions by hand that include the same patterns, to see if they are slow.
Last time I tried "a*b*c*d = e*f*g", which was so slow that I concluded the issue was not z3 but the expressions themselves.
from nnsmith.
> Regarding reproducibility, do you mean the seed? I think I also adapted that, but I'm not sure it was done accurately.
Probably not because of the seed, but a z3 problem. When I tried the fork approach, I found some generation failures (timeouts), and when I fed the same seed into graph_gen.py, they usually passed. (Note that I can reproduce the successful generations, so it's not a seed issue.) Maybe it's because of the import order or something? Or maybe it's no longer an issue. Let's see.
> BTW, it is important to accurately locate why it hangs.
> If it is an issue with our operators, then we might need to come up with some ideas to solve it.
> We should definitely report it to the z3 community if you believe it is a z3 bug.
> In my last hang experience, it was due to the operators (i.e., Reshape5D).
> You can try to diagnose it by printing the constraint expressions and then writing some simple z3 expressions by hand that include the same patterns, to see if they are slow.
> Last time I tried "a*b*c*d = e*f*g", which was so slow that I concluded the issue was not z3 but the expressions themselves.
I think z3 is really not stable. I have run into many weird z3 problems before, so in my mind a hang in z3 is not surprising and is something we must handle. That's also why I am more inclined to believe it's a z3 issue and didn't prioritize locating the hangs... I will probably do it next week though.
from nnsmith.
I understand that z3 can be unstable if we use some uncommon features (z3.set_param(...)), such as setting a timeout. If the hang still exists without the timeout setting, then I really doubt it is z3's fault: I used to have the same feeling, but eventually found that the root cause was something else. It is unlikely that z3 produces wrong results or hangs unreasonably in the simplest use case (that is, if we don't set anything else).
I do think this is important to fix, as none of our results will be reliable if the logic in our code or in z3 is wrong. I will also help investigate this case to see whether it is a z3 issue or some slow operators.
from nnsmith.