Is your feature request related to a problem? Please describe. We

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

The honest truth of the matter is that the runtime overhead of <code class="notranslat

Generated queries have non-deterministic ordering of columns, which trash monitoring/logging/performance about fluent-kit HOT 7 CLOSED

Remi-C commented on May 30, 2024

Generated queries have non-deterministic ordering of columns, which trash monitoring/logging/performance

from fluent-kit.

Comments (7)

0xTim commented on May 30, 2024 1

@gwynne happy to add the above sorting? I think making it better for monitoring and logging is a decent use case and it shouldn't really hit performance. We could always add it behind a flag if we're worried about it.

@rausnitz in CI we set the SWIFT_DETERMINISTIC_HASHING flag which forces the hashing algorithm to produce consistent results. Great for tests but you don't want this in the real world

from fluent-kit.

rausnitz commented on May 30, 2024

If the package maintainers are receptive, I think adding .sorted { $0.description < $1.description } to the line https://github.com/vapor/fluent-kit/blob/main/Sources/FluentSQL/SQLQueryConverter.swift#L83 would be the only necessary change. I'd be happy to make a PR if so.

I'm curious how the test assertion https://github.com/vapor/fluent-kit/blob/main/Tests/FluentKitTests/FluentKitTests.swift#L706 is able to succeed consistently, considering the column order is not deterministic. 🤔 (It's failing locally for me.)

from fluent-kit.

gwynne commented on May 30, 2024

Typically, in a case such as this, I would advise users to drop down to SQLKit to construct their queries. Helpers exist in FluentKit's FluentSQL module which enable handling of Fluent models in SQLKit queries. In the near future, there will be a release of a considerably expanded API for SQLKit (or more precisely, an intermediary API that is conceptually more or less in-between FluentKit and SQLKit) which enables building SQLKit queries in a fashion largely similar to working with Fluent's API, combining the convenience of the higher-level API's syntax with the much greater control provided by the lower-level API. For example, in the current public SQLKit API, one of the INSERT queries above might be constructed something like this:

let returnedCol10Values = try await sqlDatabase.insert(into: MyTable.schema)
    .model(myModel)
        // ... except the model() helper does not actually work with Fluent models as written, because it
        // operates via Codable rather than Fluent's database property protocols. Instead, the various
        // pieces must be specified manually:
    .columns(SQLIdentifier(MyTable.path(for: \.$col1).first!.description), /* repeat for each column */)
    .values([
        SQLBind(myModel.col1), SQLBind(myModel.col2), SQLLiteral.null ,SQLBind(myModel.col4),
        // ... and so on ...
    ])
    // if the model has `@Timestamp` properties that would otherwise be automatically managed by Fluent,
    // they must be explicitly specified in the column list and included in the value list with either
    // `SQLBind(Date())` (slow and wasteful) or `SQLFunction("now")`
    .returning(MyTable.path(for: \.$col10).first!.description)
    .all(decoding: /* the type of MyTable.col10 */.self)

Using one of the prototype API designs for the new APIs, it would look more like this:

let returnedCol10Values = try await sqlDatabase.insert(into: MyTable.self)
    .modelValue(myModel) // this version of the helper is aware of Fluent models
    .all(decoding: \MyTable.$col10)
// The above very concise query unfortunately still incurs the full performance penalty involved in Fluent's
// use of Mirror for reflection, negating one of SQLKit's major advantages. But if one is willing to write things
// out a bit more explicitly, that overhead can be completely bypassed while still leveraging the definitions
// and type information on the Fluent model definition:
let returnedCol10Values = try await sqlDatabase.insert(into: MyTable.self)
    .columns(MyTable.self, \.$col1, \.$col2, \.$col3, \.$col4, /* et al */)
    // Note: I've made up a few various input values to show the usage of various helpers
    .values(.bind(myModel.col1), .bind(myModel.col2), .nullValue(), .now(), .number(5), .bool(true), /* and so on */)
    .all(decoding: \MyTable.$col10)

This only shows a very small subsection of the improved API, but the point I'm driving at is that while at current, SQLKit queries must usually be enormously verbose and unwieldy in order to avoid incurring FluentKit's overhead, this will soon no longer be the case. (And, I should add since I didn't mention it anywhere above yet, the resulting generated SQL is entirely deterministic as long as the inputs are, unlike queries generated by FluentKit.) Given that you cite performance as one of the concerns involved, the extreme performance improvements available simply by avoiding Fluent's querying APIs would presumably be a more attractive solution than slowing Fluent itself down even further (if only very slightly so).

from fluent-kit.

rausnitz commented on May 30, 2024

Another thought I had would be to replace the dictionary on this line https://github.com/vapor/fluent-kit/blob/main/Sources/FluentSQL/SQLQueryConverter.swift#L80 with an array, where each element is a tuple or some other data structure containing a column and a value-to-be-inserted.

I don't see what benefit the dictionary brings. In SQLQueryConverter.insert(_:) for example, the function gets the dictionary, makes an array of the dictionary's keys, and then looks up the dictionary value associated with each key. Seems like it would be more straightforward to start with an array of columns+values, then iterate over it.

from fluent-kit.

gwynne commented on May 30, 2024

@rausnitz It's funny you should mention that particular code - earlier today I streamlined that very method as part of a feature PR I'm nearly done with. Great minds think alike!

from fluent-kit.

gwynne commented on May 30, 2024

The honest truth of the matter is that the runtime overhead of Mirror is an order of magnitude or so greater than that of any other performance bottleneck in Fluent, as of the last time I profiled it a few months ago. The next worst bottleneck is Codable (especially for models containing columns which, when using an SQL database driver, are saved as JSON; usually structures and arrays), a far distant second. Even the support for property wrapper requirements in protocols that I'm working on wouldn't fully eliminate the Mirror overhead; it would only cut the average number of invocations by roughly half. SE-0385 would probably go most or all of the rest of the way, but a clean, performance-focused API design will still have to wait for Fluent 5.

In short, even if I could make SQLQueryConverter operate in zero time, it's not likely much difference would be noticed 😕

from fluent-kit.

rausnitz commented on May 30, 2024

I agree that this is not likely to move the needle significantly on performance. But monitoring and analytics are the primary issues for us.

Supposing we did want to drop down to SQLKit to improve certain queries (without rewriting every query), we currently would have a hard time figuring out which ones to prioritize, because of the aforementioned monitoring/analytics problem. 🙂

I opened this PR: #556.

from fluent-kit.

Generated queries have non-deterministic ordering of columns, which trash monitoring/logging/performance about fluent-kit HOT 7 CLOSED

Comments (7)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent