Comments (19)
Fixing the root cause sounds good, but currently the amazon provider cannot be installed on any configuration with Python >=3.11, or on a non-Windows machine with any Python version.
from airflow.
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.
from airflow.
The 8.21.0 release added xmlsec as a required dependency to apache-airflow-providers-amazon and pinned to a version that is 2 years old and doesn't have wheels for
It is restricted to the version which doesn't break the current installation. As you might find, there is an open issue to resolve this; if you know how to fix it, feel free to raise a PR.
BTW, many of the packages do not have a wheel and have to be built from source with the appropriate dev packages.
cc: @vincbeck Do you think we need this as a main dependency for the Amazon provider, or could we move it into some extra?
from airflow.
Preferably we should solve the root issue #39103
from airflow.
Just had a look into this and I agree with Taragolis's sentiment here:
BTW, many of the packages do not have a wheel and have to be built from source with the appropriate dev packages.
When I did a test install (same as your steps, create a new virtual env based on 3.11 and install the provider package) I had several packages that needed to be built from source, xmlsec among them:
Building wheels for collected packages: xmlsec, methodtools, python-nvd3, unicodecsv, wirerope
The xmlsec build did fail for me, but only because I was missing required dev packages to build the source. After installing libxmlsec1, libxmlsec1-dev and python3.11-dev my installation worked just fine and was able to build a wheel locally for xmlsec.
Perhaps you are missing some development packages as well on your system which are inhibiting the wheel build from completing?
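For anyone who wants to check their own machine before retrying the install, here is a minimal sketch. The function name `xmlsec_build_report` is made up for illustration, and the assumption that `xmlsec1-config` ships with the libxmlsec1 development package holds on Debian-style distros but may not hold on yours.

```python
import importlib.util
import shutil


def xmlsec_build_report() -> dict[str, bool]:
    """Report whether xmlsec is importable and whether common build helpers are on PATH."""
    return {
        # True if the xmlsec Python bindings are already installed in this environment.
        "xmlsec_importable": importlib.util.find_spec("xmlsec") is not None,
        # pkg-config is commonly used by source builds to locate system libraries.
        "pkg_config_on_path": shutil.which("pkg-config") is not None,
        # Assumption: xmlsec1-config is provided by the libxmlsec1 dev package.
        "xmlsec1_config_on_path": shutil.which("xmlsec1-config") is not None,
    }


if __name__ == "__main__":
    for check, ok in xmlsec_build_report().items():
        print(f"{check}: {'yes' if ok else 'no'}")
```

If the bindings are not importable and the helpers are missing, the dev packages are a likely suspect, though this does not prove the build will succeed once they are installed.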
from airflow.
I think @o-nikolas it's not as straightforward. I described it in #39103 and the problem is that xmlsec python bindings v1.3.14 expect libxmlsec2 to be installed and used, and not libxmlsec1. This might or might not be a problem for security at any point in time and it should be investigated if the dependency can be upgraded (or as @eladkal suggests - turned into an optional dependency, handling lack of xmlsec with an optional amazon provider dependency + helpful instructions on what to do). Especially since - as I understand - it's only needed for maybe one integration in the Amazon provider.
I am happy to guide anyone who would like to take on that task.
While yes - you can still build the wheel by installing the packages you mentioned, it's a blocker or unnecessary hurdle for people who would like to install the amazon provider on anything other than the Airflow PROD image - so while we currently have no problems with releasing airflow (because we limited xmlsec python bindings), it simply makes it more difficult for those who install the amazon provider on their own. Mostly up to the Amazon team to decide how to approach it.
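To make the "optional dependency + helpful instructions" idea concrete, here is a rough sketch of the pattern. This is not the provider's actual code; `require_optional` and `OptionalDependencyMissing` are hypothetical names used only for illustration.

```python
import importlib


class OptionalDependencyMissing(ImportError):
    """Hypothetical error type raised when an optional package is absent."""


def require_optional(module_name: str, extra: str):
    """Import module_name, or raise an error naming the extra that provides it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise OptionalDependencyMissing(
            f"'{module_name}' is not installed. It is only needed for features "
            f"behind the '{extra}' extra; install that extra (and any system "
            f"libraries it needs) to enable them."
        ) from exc
```

The import would then happen only in the code path that needs it, e.g. `xmlsec = require_optional("xmlsec", "python3-saml")`, so users who never touch that integration never need the system libraries.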
from airflow.
I think we just put the pin in the wrong spot to be honest.
- I don't see xmlsec being used in the AWS provider anywhere directly.
- It's in our CI image because of python3-saml.
- python3-saml is an optional extra for Amazon.
We should move the xmlsec pin to that extra instead I believe. PR #39528
from airflow.
Nope - it's been added specifically on request and by @vincbeck - #35488 to support authentication for AWS
from airflow.
And yes - it's likely a bit complex, but it means that libxmlsec1 (the system library) that is used and apparently needed by the Amazon provider is clashing with the libxml python bindings that are used by python3-saml - because they need libxmlsec2.
So technically speaking python3-saml should not have this limit (it will work with libxmlsec2 if installed) - but amazon needs libxml1 and it clashes with xmlsec python bindings >= 1.3.14 that need libxmlsec2.
from airflow.
The whole problem will be solved IF amazon internal authentication (I guess) can use libxmlsec2. But I have no idea if this is possible - if so - then a follow-up to #35488 should be done when we bump libxmlsec1 to 2 in our image and remove the limitation from the amazon provider.
from airflow.
Yep, that all makes sense. I'm not suggesting #39528 is the long term fix. This just brings us back to status-quo before the new xmlsec release broke stuff (xmlsec is optional, and tied with the python3-saml extra that uses it).
from airflow.
While yes - you can still build the wheel by installing the packages you mentioned, it's a blocker or unnecessary hurdle for people who would like to install the amazon provider on anything other than the Airflow PROD image - so while we currently have no problems with releasing airflow (because we limited xmlsec python bindings), it simply makes it more difficult for those who install the amazon provider on their own. Mostly up to the Amazon team to decide how to approach it.
I'm basically this persona. I install the provider, but I do not install the python3-saml additional extra. #39528 just moves the (hopefully) temporary pin to impact people who would actually hit the issue in the first place.
from airflow.
The question is whether the Amazon provider works with libxmlsec2 at all.
from airflow.
Because if it does not, this is not a solution to the problem. It will allow upgrading xmlsec to 1.3.14, but the specific authentication might not (and likely will not) work for Amazon. I guess there was a reason why we installed libxmlsec1 in the first place - from what I know the Amazon Linux distro contains only libxmlsec1 (last time I checked this issue).
from airflow.
And yeah. I have no idea what impact it will have - but if we "solve" it now without investigating and possibly bumping to libxmlsec2, it might mask the issue - that's all I have to say here. But I am ok with merging #39528 - if @vincbeck and @feruzzi are fine with the potential issue with libxmlsec2 compatibility :)
from airflow.
To state it another way, your concern is if folks are installing like this:
apache-airflow-providers-amazon
python3-saml
Not:
apache-airflow-providers-amazon[python3-saml]
it (likely) won't work, right? I feel like constraints cover us pretty well for this situation.
I guess I'll leave it up to the AWS folks to decide. I just don't like having to start installing xmlsec + libxml2 + libxmlsec1 for an optional dependency I don't use - #35488 (comment).
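If it helps to see which of the two install forms applies to a given environment, the sketch below lists the extras that an installed distribution declares in its metadata. `declared_extras` is a hypothetical helper name; it only reads metadata and does not tell you whether the extra's dependencies are actually installed.

```python
from importlib import metadata


def declared_extras(dist_name: str):
    """Return the extras an installed distribution declares, or None if it is not installed."""
    try:
        meta = metadata.metadata(dist_name)
    except metadata.PackageNotFoundError:
        return None
    # "Provides-Extra" appears once per declared extra; it may be absent entirely.
    return sorted(meta.get_all("Provides-Extra") or [])


if __name__ == "__main__":
    print(declared_extras("apache-airflow-providers-amazon"))
```

Any pin attached to an extra only takes effect for the bracketed form; installing the two packages side by side bypasses it unless constraints are used.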
from airflow.
it (likely) won't work, right? I feel like constraints cover us pretty well for this situation.
That's a guess based on the errors I saw 3 weeks ago when I added the limit. As explained - since then maybe something changed, so my proposal is to remove the limit altogether and see if the problem shows up in main.
from airflow.
Well, I guess let's give it a shot: #39534
from airflow.
Fixed by #39534
from airflow.