Comments (9)
That is definitely not the expected behavior. Transform should cache its result if the inputs haven't changed, and Evaluator should not use cached results if the Trainer outputs have changed. I'll try to recreate this here in the lab.
from tfx.
I wasn't able to recreate this. I used the simple pipeline and ran it a few times. Except for the first run, or whenever I deleted the metadata DB, it would always use the cached artifacts. I also didn't see any difference in caching behavior between Transform and Evaluator. So, a few follow-up questions:
- are you working with TFX 0.12.0 or a clone from GitHub?
- what is the behavior when you run with the example?
- can you share your pipeline config?
One thing to mention: if taxi_utils.py is changed, the cache won't hit for Transform and Trainer.
Evaluator takes the ExampleGen output and the Trainer output as inputs; if those change, it shouldn't hit the cache. Could you check the pipeline output folder to see whether new outputs (new numbered subfolders under example_gen and trainer) were generated by ExampleGen or Trainer?
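To check for new executions, something like the following can list the numbered run subfolders under a component's output directory (the pipeline root path is illustrative; adjust it to your own setup):

```python
import os

def execution_ids(component_dir):
    """Return the numeric run-subfolder IDs under a component's output
    directory, sorted. Non-numeric entries are ignored."""
    if not os.path.isdir(component_dir):
        return []
    return sorted(int(name) for name in os.listdir(component_dir) if name.isdigit())

# Hypothetical pipeline root; replace with your own.
# for component in ("example_gen", "trainer", "evaluator"):
#     print(component, execution_ids(os.path.join("/tmp/pipeline_root", component)))
```

A new ID appearing under trainer but not under evaluator after a run would match the behavior reported in this thread.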
> are you working with TFX 0.12.0 or a clone from GitHub?
> what is the behavior when you run with the example?
I'm working with TFX 0.12.0. No problem when I run the example. I'll try to modify the Trainer in the example to see whether the Evaluator uses its cached results or not.
> One thing to mention: if taxi_utils.py is changed, the cache won't hit for Transform and Trainer.
Thanks, that's exactly what's happening: I've been modifying my model in the utils.py file. I separated the transformation and the model into 2 files + 1 utils file and now it works, although I had to add "sys.path.append(path_to_utils_folder)" in my pipeline definition to avoid a "no module named xxx" error. Is that why you made a single taxi_utils file in the example?
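For reference, the layout and workaround described above can be sketched like this (the folder structure and module names are hypothetical, not from the taxi example):

```python
import sys

# Hypothetical project layout:
#   my_pipeline/
#     pipeline.py            <- pipeline definition (this file)
#     modules/
#       transform_module.py  <- preprocessing code lives here
#       trainer_module.py    <- model/training code lives here
#       shared_utils.py      <- helpers imported by both modules

# Make the modules/ folder importable so "import shared_utils" resolves
# inside the module files when they are loaded at pipeline runtime.
sys.path.append("/path/to/my_pipeline/modules")
```

Splitting the code this way means editing the model no longer changes the Transform module file, so only the Trainer's input checksum changes.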
> Evaluator takes the ExampleGen output and the Trainer output as inputs; if those change, it shouldn't hit the cache. Could you check the pipeline output folder to see whether new outputs (new numbered subfolders under example_gen and trainer) were generated?
My ExampleGen output doesn't change (I don't generate new examples), but my Trainer output changes at each modification (IDs 44-47-49-51-53-59), while the Evaluator doesn't produce new output (ID 45).
I'll try with the Taxi example to see if it comes from my pipeline.
Thanks!
> One thing to mention: if taxi_utils.py is changed, the cache won't hit for Transform and Trainer.
> Thanks, that's exactly what's happening: I've been modifying my model in the utils.py file. I separated the transformation and the model into 2 files + 1 utils file and now it works, although I had to add "sys.path.append(path_to_utils_folder)" in my pipeline definition to avoid a "no module named xxx" error. Is that why you made a single taxi_utils file in the example?
OK, that sounds like the issue then. If anything about the component's inputs changes -- including the injected file's checksum -- then the component is rerun. If you're modifying the utils.py file and then running the pipeline, that will trigger a new execution of both Transform and Trainer.
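The behavior described above can be illustrated with a generic input-fingerprinting sketch (this is an illustration of the idea, not TFX's actual implementation): hash every input, including the module file's contents, and rerun the component whenever the combined fingerprint changes.

```python
import hashlib
import json

def component_fingerprint(input_artifacts, module_file_bytes):
    """Combine input artifact identifiers and the module file's checksum
    into one cache key. Any change -> different key -> cache miss."""
    module_checksum = hashlib.sha256(module_file_bytes).hexdigest()
    payload = json.dumps(
        {"inputs": sorted(input_artifacts), "module": module_checksum},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Editing the module file changes its checksum, so the key changes and
# the component reruns even if the input artifacts are identical.
key_a = component_fingerprint(["examples:12", "schema:3"], b"def preprocessing_fn(): ...")
key_b = component_fingerprint(["examples:12", "schema:3"], b"def preprocessing_fn(): pass")
```

Here `key_a != key_b` purely because the module bytes differ, which mirrors why editing utils.py invalidates the Transform and Trainer cache entries.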
> My ExampleGen doesn't change (I don't generate new examples), but my Trainer output changes at each modification (IDs 44-47-49-51-53-59) while the Evaluator doesn't produce new output (ID 45).
Evaluator should trigger a new execution, given that the Trainer is probably producing a new model on each iteration. Can you attach the Evaluator logs for the most recent runs?
> Ok, that sounds like the issue then. If anything about the component's input changes -- including the injected file's checksum -- then the component is rerun. If you're modifying the utils.py file and then running the pipeline, it will trigger a new execution of both transform and trainer.
Yes it's clearer now, thanks for this clarification.
> Can you attach the evaluator logs for the more recent runs?
The log is attached. There are indeed some weird lines in the log (lines 64-65), but I can still get the results in a Jupyter notebook and display them.
1.log
Hi, would it be possible to send us the entire pipeline log folder in a zip file? We need to check the caching logs and executor logs for each run.
The logs are attached.
The Evaluator's check-cache log detects a new Trainer output folder at each run but says the artifacts are the same.
Thanks for the help!
logs.zip
Thanks @loiccordone! This helped us find a bug in our codebase. I will have a PR up soon to address it.