Code Monkey home page Code Monkey logo

Comments (10)

jmchilton avatar jmchilton commented on June 11, 2024 3

I know that you're tasked with replacing SQLAlchemy and this is an approach to accomplish a piece of that task - if Marius and I think it is not the right approach I think it is fair to say... "I've done what I think is the right approach" and put the ball in our courts to find some replacement for CBV. I don't mind doing the work to support the structure that I think is important - if you don't agree it is important. The people doing the work should have a dispropionate say in how it is done - but I think you should give us sometime to propose a less drastic alternative code structure.

from galaxy.

jdavcs avatar jdavcs commented on June 11, 2024 1

As per our discussion on the backend channel, I've looked at how class-based views are implemented. I've found that there's no performance gain by using class-based views: the number of calls to foo= depends(Foo) is not reduced. In fact, the number of calls increases with class-based views. (I tested one test function in the test_tours module: 13 calls to depends() for class-based views, 10 calls for pure functions). If we look at the implementation, we'll see that the cbv decorator modifies the __init__ method of the class it's decorating; it collects the class attributes that have type annotations and sets them as instance variables on the instance of the class. But the class is still instantiated once per endpoint in most cases.

from galaxy.

mvdbeek avatar mvdbeek commented on June 11, 2024

Try the fork, I don't think the rest is really an option.

from galaxy.

jdavcs avatar jdavcs commented on June 11, 2024

Unfortunately that doesn't work. The implementation of cbv in the fork is significantly different from the original (fastapi-utils). Switching to the fork will raise this error on startup: https://github.com/yuval9313/FastApi-RESTful/blob/master/fastapi_restful/cbv.py#L119. Here's why: the fork checks whether the routes belonging to the router are all unique (see line 118); however it uses only the router path and the http method to disambiguate between the routes. But there're many more route attributes, and among them - include_in_schema, which we use while adding almost identical routes (here). Furthermore, even if we were to convince the maintainers of the fork to alter their deduplication method by adding an extra attribute to the route "signature" (which I doubt), even then, down the road we might decide to add some other attribute to our own FrameworkRouter(APIRouter), which again would cause the duplicate route error to kick in. Finally, this is just one of the things the fork does differently; I have not checked the rest of the code - there may be other surprises.

In my opinion, given our very heavy use of FastAPI, we should stick to the core library. Furthermore, given that the cbv feature is really non-essential (whereas its implementation adds quite a bit of complexity and increases the chance of more bugs), I suggest we go with option 4: just specify the common dependencies for each route. It's clean, it's straightforward, and it's trivial to fix. To update a module with cbv, we simply add the dependency as an argument to the methods that actually use it, remove the class and convert its methods to functions (kill the self). That's all.
Here's what it would look like: https://github.com/galaxyproject/galaxy/compare/dev...jdavcs:galaxy:dev_no_cbv?expand=1

from galaxy.

mvdbeek avatar mvdbeek commented on June 11, 2024

Thanks for checking, this however was a minute detail of the discussion and does not change my view that we should not make these sweeping changes to cut down on one tiny dependency.

from galaxy.

jdavcs avatar jdavcs commented on June 11, 2024

Summarizing my point here.

  1. In our api modules we rely on class-based views, a minor convenience feature provided by an external package, fastapi-utls.
  2. fastapi-utils is incompatible with SQLAlchemy 2.0, thus causing a dependency conflict that prevents us from upgrading to SQLAlchemy 2.0.
  3. fastapi-utils is not actively maintained, so it is unlikely that it will be upgraded.
  4. The fastapi-restful fork of fastapi-utils is compatible with SQLAlchemy 2.0. However, that fork has a different implementation of class-based views which is incompatible with our code base. It may or may not be solved with a modification to the fork's implementation.
  5. We can submit a request to the fastapi-restful fork maintainers and hope that (1) they will modify their library's internal logic to accommodate our use case; and (2) this will be the only incompatibility between our code and their implementation, now and in the future. We can also maintain our own fork where we simply remove the SQLAlchemy requirement constraint, which will allow us to upgrade SQLAlchemy to 2.0, but will leave us with a half-broken fork of fastapi-utils, as well as the responsibility to maintain yet more code that we didn't even write.

Or we can just drop the dependency. Here's why I think it's a good solution:

  1. It is the simplest way to solve a major blocking issue without depending on a half-functioning package (our own fork without the 2.0 constraint) or relying on a different project to alter their library's internal logic to accommodate our (much more advanced) use case.

  2. I don't think we need class-based views: it is only a minor convenience feature. We can only factor out the common depends() arguments into class scope, but a typical endpoint has multiple arguments passed to it; so by using class-based views you'll simply have all those multiple arguments minus one. Besides, sometimes we factor out an argument that is only used in a subset of methods, and in other cases we don't - so we don't use it consistently either. Finally, I think leaving those arguments in the endpoint's function definition makes the code much more straightforward. Our code is simpler without this feature.

  3. Most importantly, dropping a non-essential dependency reduces the number of external libraries we depend upon. It is not a large dependency, but it does things we don't need, like session handling, or the now-redundant InferringRouter. It also it requires a particular version of SQLAclhemy, which is a core dependency for us; this leads to the classic "diamond" problem in "dependency hell": we depend on foo and bar, we require foo of version A, but bar requires foo of version B, and versions A and B are incompatible. With "foo" (SQLAlchemy) being at the core of our own system makes this particular example different from most other examples of some datatype dependency pulling in many others.

Dependency management is a hard problem and there's been a ton written on it. Of course we should rely on external software, and of course such reliance comes with risks. What's clear to me (from both literature and experience) is that a decision on including a dependency should not be made lightly.

In this case, there is no good reason to rely on fastapi-utils other than "[removing it is] a can of worms where we'll likely break things" (as per channel discussion). But I don't think removing it will break things: it is a lengthy but straightforward modification that is not at all complex compared to lots of other changes we make all the time.

from galaxy.

jmchilton avatar jmchilton commented on June 11, 2024

I really don't think of Galaxy as a FastAPI app - Galaxy has its own opinion about what the structure of web controllers looks like and we've sculpted it and documented it and refined it over many years. Dropping an external dependency and structuring the code more tightly toward FastAPI dependency management... actual ties us to FastAPI more strongly. External dependencies aren't just about the things you import - it is also about structuring your code to shield yourself from the external dependencies. When Marius took the big swing of bringing in FastAPI he was very cognizant of my concerns about the existing structure of the code and brought in class based views to make it feel more Galaxy and in addition to that kindness was kind and let me replace the dependency management features with a standalone library that was more shielded from FastAPI internals and was useful throughout the application and in non-FastAPI contexts. He used FastAPI to implement a Galaxy web framework - he did not make Galaxy a FastAPI application. We moved the code in direction I thought there was full consensus around and balanced everyone's concerns. This feels like another big shift but we're letting our dependencies dictate our code structure and move it in directions there isn't consensus around.

I understand the performance might be better as the result of this change, but it isn't in a portion of code we know to be critical and I do not think it is wise to take such a big swing on changes that your peers aren't a fan of.

from galaxy.

jdavcs avatar jdavcs commented on June 11, 2024

@jmchilton thank you for taking the time to read into the issue. I do appreciate the very detailed and constructive points that you've made. If we want to structure our endpoints in classes with dependencies defined at the method level factored out, #17205 is one way to do that: we only take the one module+test that we're actually using from the original external library and maintain it from now on. That way we don't have to rely on an incompatible external dependency most of which we don't need.

from galaxy.

jdavcs avatar jdavcs commented on June 11, 2024

@jmchilton On a related topic, the only other dependency conflict is graphene-sqlalchemy which is part of the new TS code, added in #15639. We are using version "3.0.0b3", which is pre-SQLAlchemy 2.0. So is "3.0.0b4". The current version is "3.0.0rc1", which is compatible with SQLAlchemy 2.0, however, it appears to come with breaking changes, so the previous beta releases don't work (e.g. ImportError: cannot import name 'convert_sqlalchemy_hybrid_property_type' from 'graphene_sqlalchemy.converter'). I haven't looked at the errors in detail. Do you have a quick solution to this? If not, I'll try to move that code to "3.0.0.rc1".

from galaxy.

jdavcs avatar jdavcs commented on June 11, 2024

☝️ I'm working on it. Breaking changes affecting conversion of hybrid properties introduced in graphql-python/graphene-sqlalchemy#371. @jmchilton

from galaxy.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.