

adalke commented on September 23, 2024

I developed a prototype using SQLAlchemy to generate the new fragment database instead of using hand-built SQL queries.

My test set was 100 SMILES. With low-level SQL queries and 4 processors it takes ~28.6 seconds. Going through SQLAlchemy's ORM took ~37.5 seconds, about 30% slower.

Profiling shows that most of the additional time is in session.commit(). However, even if I comment out the session.add() code (which tells SQLAlchemy to add the new objects to the database when appropriate), the overall time is still slower than the hand-written code.
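For readers who want to reproduce this kind of measurement, here is a minimal sketch of finding a hotspot with the standard library's cProfile. It does not use the actual mmpdb or SQLAlchemy code: `build_records`, `commit_records`, and `run` are hypothetical stand-ins for the record-building and commit steps.

```python
import cProfile
import io
import pstats


def build_records(n):
    # Hypothetical stand-in for creating fragment records.
    return [{"id": i, "smiles": "C" * (i % 5 + 1)} for i in range(n)]


def commit_records(records):
    # Hypothetical stand-in for an expensive commit step.
    return sorted(records, key=lambda r: (r["smiles"], -r["id"]))


def run(n):
    return commit_records(build_records(n))


def profile_report(func, *args, top=10):
    """Profile func(*args) and return the cumulative-time report as text."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return stream.getvalue()


report = profile_report(run, 20000)
print(report)
```

Sorting by cumulative time puts the functions that dominate the run near the top of the report, which is how a call like session.commit() would stand out.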

Further profiling shows about 15% of the additional time was in _initialize_instance. This is part of SQLAlchemy's ORM. I replaced mmpdb's existing fragment types with slightly modified versions that inherit from the registry.generate_base() base class. That base class has its own __init__, with a higher overhead than the basic Python-class-with-slot-definition I used.
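The constructor overhead can be illustrated without SQLAlchemy. The sketch below compares a plain slotted class against a class whose base supplies a generic keyword-driven __init__. `KeywordBase` is only a rough, hypothetical stand-in for an ORM base class, not SQLAlchemy's implementation, and the field names are made up for the example.

```python
import timeit


class SlottedFragment:
    # Plain Python class with a slot definition: no per-instance __dict__.
    __slots__ = ("id", "smiles", "num_heavies")

    def __init__(self, id, smiles, num_heavies):
        self.id = id
        self.smiles = smiles
        self.num_heavies = num_heavies


class KeywordBase:
    # Hypothetical stand-in for a base class with a generic __init__ that
    # validates and assigns each keyword argument in a loop.
    def __init__(self, **kwargs):
        cls = type(self)
        for name, value in kwargs.items():
            if not hasattr(cls, name):
                raise TypeError(f"{name!r} is not a field of {cls.__name__}")
            setattr(self, name, value)


class KeywordFragment(KeywordBase):
    id = None
    smiles = None
    num_heavies = None


slotted = SlottedFragment(1, "CCO", 3)
keyword = KeywordFragment(id=1, smiles="CCO", num_heavies=3)

n = 100_000
t_slotted = timeit.timeit(lambda: SlottedFragment(1, "CCO", 3), number=n)
t_keyword = timeit.timeit(
    lambda: KeywordFragment(id=1, smiles="CCO", num_heavies=3), number=n
)
print(f"slotted: {t_slotted:.3f}s  keyword-base: {t_keyword:.3f}s")
```

The absolute timings depend on the machine and Python version. The point is only that a generic constructor does more work per object than direct attribute assignment, which adds up when many records are created.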

There were some improvements to the API. I got to strip out some boilerplate code to return a FragmentRecord from the cache file. On the other hand, I had to learn how to "detach" objects to move them from one database to the other, and I had to learn the special query syntax.

When we get to things like merging multiple datasets into one, I think it would be easier to work directly with the SQLite connection object: for instance, attach the databases and do a cross-database INSERT with SELECT. I don't want to have to figure it out in SQLAlchemy.
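A minimal sketch of that attach-and-copy approach, using only the standard library's sqlite3 module. The `fragment` table, its columns, and the helper names `make_db` and `merge_into` are assumptions for illustration, not mmpdb's actual schema or API.

```python
import os
import sqlite3
import tempfile


def make_db(path, smiles_list):
    # Create a small database with a hypothetical one-table schema.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE fragment (id INTEGER PRIMARY KEY, smiles TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO fragment (smiles) VALUES (?)", [(s,) for s in smiles_list]
    )
    conn.commit()
    conn.close()


def merge_into(dest_path, src_path):
    # Attach the source database to the destination connection, then copy
    # rows across with a single INSERT ... SELECT.
    conn = sqlite3.connect(dest_path)
    try:
        conn.execute("ATTACH DATABASE ? AS src", (src_path,))
        conn.execute(
            "INSERT INTO fragment (smiles) SELECT smiles FROM src.fragment"
        )
        conn.commit()
        conn.execute("DETACH DATABASE src")
    finally:
        conn.close()


with tempfile.TemporaryDirectory() as tmp:
    dest = os.path.join(tmp, "dest.sqlite")
    src = os.path.join(tmp, "src.sqlite")
    make_db(dest, ["CCO", "CCN"])
    make_db(src, ["c1ccccc1"])
    merge_into(dest, src)

    conn = sqlite3.connect(dest)
    merged = [
        row[0] for row in conn.execute("SELECT smiles FROM fragment ORDER BY id")
    ]
    conn.close()

print(merged)  # ['CCO', 'CCN', 'c1ccccc1']
```

Because the copy happens inside SQLite, no rows pass through Python objects, so there is nothing to detach from one session and re-add to another.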

Long-term, I think there's still a place for SQLAlchemy, as with kzfm's example.

The most likely use is to define the database schema(s) in a better way than my shaky template system, use that definition to manage schema creation, and replace the vendored peewee code we use now. We would then write the tables with the more low-level SQLAlchemy calls rather than SQLAlchemy's ORM.

