Comments (3)
I don't fully understand the proposal, and I'm going to ask some questions that show I don't know much about this. Hope you don't mind.
Sometimes things blow up if you have different versions of something on client/scheduler/worker. Is that only true for a well-defined set of packages, so that version mismatches for other packages are fine? Or could any package X (in principle) cause bad things if you have version mismatch? If in principle anything could be a problem, then is the proposal to pin everything in the dependency tree?
Would we monitor everything pinned/everything in the dependency tree and release a new version of coiled-runtime to bump pins whenever there's a security patch (non-trivial bug fix?) in a dependency? (Not sure how often things in our dependencies have security patches since I'm not a full-time resident of this part of the Python ecosystem, but this sounds like a non-trivial responsibility.) If so, would we recommend that customers not pin to a patch version of coiled-runtime (and we'd do sufficient testing to be moderately confident that patch updates won't break stuff)?
from benchmarks.
I'm generally in support of this idea, since I believe it will lead to a much smoother user experience. The only potential drawbacks I can see would be the desire to allow coiled to float. This would mean that, for the life of any supported coiled-runtime release, coiled will be constrained by that dependency.
@ntabris -- there is additional burden, but it seems like that burden exists regardless; this just makes it explicit. Thoughts?
Is that only true for a well-defined set of packages, so that version mismatches for other packages are fine? Or could any package X (in principle) cause bad things if you have version mismatch?
Any package version mismatch could lead to things failing. Distributed will check a few libraries (e.g. Python, dask, distributed, etc.) that are common offenders for causing things to fail and emit a warning if one of those packages has a version mismatch. That doesn't mean things can't go wrong due to other packages, but it does help in lots of common cases.
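As a rough illustration of the kind of check described above, here is a minimal sketch that compares per-node version reports and flags packages whose versions differ. It is not distributed's actual implementation: the function name `find_mismatches`, the report layout, and the version numbers are all made up for the example. In distributed itself, `Client.get_versions(check=True)` is, as far as I know, the user-facing way to trigger a comparison like this.

```python
from collections import defaultdict


def find_mismatches(versions_by_node, packages):
    """Return {package: {version: [nodes]}} for packages whose version differs.

    `versions_by_node` maps a node name to a {package: version} dict.
    Hypothetical helper; only the packages listed in `packages` are compared.
    """
    mismatches = {}
    for pkg in packages:
        seen = defaultdict(list)
        for node, versions in versions_by_node.items():
            # A missing package is recorded as None so it also counts as a mismatch.
            seen[versions.get(pkg)].append(node)
        if len(seen) > 1:
            mismatches[pkg] = dict(seen)
    return mismatches


# Made-up version reports for a client, a scheduler, and one worker.
reports = {
    "client": {"python": "3.9.12", "dask": "2022.6.0", "distributed": "2022.6.0"},
    "scheduler": {"python": "3.9.12", "dask": "2022.6.0", "distributed": "2022.6.0"},
    "worker-0": {"python": "3.9.12", "dask": "2022.5.2", "distributed": "2022.6.0"},
}

mismatches = find_mismatches(reports, ["python", "dask", "distributed"])
for pkg, by_version in mismatches.items():
    print(f"version mismatch for {pkg}: {by_version}")
```

Because only the listed packages are compared, a mismatch in any other package goes unnoticed, which matches the caveat above.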
If in principle anything could be a problem, then is the proposal to pin everything in the dependency tree?
I'm just proposing we pin the specific packages distributed checks by default (which I think are all dependencies of distributed) to make sure coiled-runtime users don't get a red warning message.
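To make the idea of pinning only the checked packages concrete, here is a small sketch that turns a version report into exact pin strings. The helper name `exact_pins`, the pip-style `==` syntax, and the version numbers are assumptions for illustration; the actual coiled-runtime recipe may express its pins differently.

```python
def exact_pins(versions, skip=("python",)):
    """Build sorted 'name==version' pin strings from a {package: version} dict.

    Hypothetical helper. Python is skipped by default because the interpreter
    is usually constrained separately from ordinary package requirements.
    """
    return [
        f"{name}=={version}"
        for name, version in sorted(versions.items())
        if name not in skip
    ]


# Made-up versions standing in for the packages that get checked.
checked = {"python": "3.9.12", "dask": "2022.6.0", "distributed": "2022.6.0"}
pins = exact_pins(checked)
print("\n".join(pins))
```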
Would we monitor everything pinned/everything in the dependency tree and release a new version of coiled-runtime to bump pins whenever there's a security patch (non-trivial bug fix?) in a dependency?
I'm not too concerned about security patches here. For this part of the Python stack they have historically been very rare. We're still determining our version scheme and can factor this point in when making a final decision.
This would mean that, for the life of any supported coiled-runtime release, coiled will be constrained by that dependency.
Fair point, I'll have to look at the coiled dependencies to get a sense for how much of an issue this may be.