Code Monkey home page Code Monkey logo

Comments (25)

daveespo avatar daveespo commented on August 14, 2024 1

This pull request is the result of a lot of collaboration between @tfuda and myself. @tfuda will be submitting a separate pull request along the lines of what he and @capeterson talked about earlier in this thread. They are largely incompatible so we decided to submit them both so everyone could render an opinion on the direction as a whole. We'll call the two options:

LightweightQueryField vs. QueryFieldExorcism

Both solutions have been tested in our fairly large base of Apex and both unit test coverage and our ad hoc testing passes .. so at least in our use case, they are both viable

Test Methodology

@tfuda and I have been using the 2 selector classes in fflib-Optimization (Selector_Account and Selector_Contact) along with SelectorPerformance to kick off a run. There's a Deploy To Salesforce button on the repository home page if you want to try for yourself.

SelectorPerformance constructs 10 instances of QueryFactory, and calls selectFields on each of those 10 instances -- once with a Parent relationship, and again with a Grandparent relationship. It should be noted that each set of results below was preceded by one test run where we threw away the results. We witnessed abnormally poor performance numbers the first time code saved to server was exercised. On the second and subsequent runs, the results were fairly consistent.

The "Standard" logging level was enabled. That is:

Apex: DEBUG
Profiling: INFO

Like so: http://screencast.com/t/o5a0vKF3O1f

The following anonymous Apex was run:

SelectorPerformance.go();

The results of the tests on unadulterated fflib_QueryFactory (i.e. what's on master right now) is as follows:

10:51:58.58 (1507871505)|USER_DEBUG|[38]|DEBUG|Preload Elapsed:25
10:51:58.58 (1507948654)|USER_DEBUG|[39]|DEBUG|Parent Elapsed:429
10:51:58.58 (1508008488)|USER_DEBUG|[40]|DEBUG|Grandparent Elapsed:578
10:51:58.58 (1508073415)|USER_DEBUG|[41]|DEBUG|QF Elapsed (includes parent & grandparent):1119
10:51:58.58 (1508126379)|USER_DEBUG|[42]|DEBUG|SOQL Elapsed:347
10:51:58.58 (1508208194)|USER_DEBUG|[43]|DEBUG|Total Elapsed:1466

Summarizing:

Construction of QueryFactory (which implies selecting the fields defined in getSObjectFieldList()) was ~11ms/each ((1119 - 578 - 429) / 10) for a selector that's 54 fields 'wide'

Selecting an additional set of 45 Parent fields adds ~43ms (429 / 10)

Selecting an additional set of 45 Grandparent fields adds ~58ms (578 / 10)

Turning QueryFactory into a SOQL string takes ~35ms (347 / 10)

... which means that for a 'really wide' Selector, you spend ~147ms just constructing a SOQL string

The above numbers disregard all of the crazy business outlined in #79 -- if you start fooling around with the log levels to get more detail, these numbers go into the tank (if you can get your transaction to complete at all)

QueryFieldExorcism

Removal of QueryField

QueryField's current mission is to act as a vessel for a field path -- either a local field on the object that the QueryFactory is constructed for (i.e. Contact) or one of its ancestors (parent, grandparent, great-grandparent, etc).

Carrying around this vessel as a custom Apex class requires all of the machinery for deduping, sorting (compareTo) and hashing (hashCode/equals). However, the end goal is still to arrive at a string literal that can be used in a SOQL query. The summary on this issue already calls out the pain and suffering this causes.

The change in this pull request front-loads the effort for building the string literal up into the fflib_QueryFactory.getFieldToken method; in that method, the heavy lifting is already done to normalize the list of selected fields into SObjectField tokens and enforce CRUD/FLS. It didn't make sense to sock it away into a custom class only for it to be pulled apart later and turned into a String again.

The net effect, as the performance data shows, is that toSOQL has now become blisteringly fast. The performance penalty is now buried in selectFields (which delegates to getFieldToken)

Once the custom class was ditched, keeping a Set<String> wasn't nearly as costly as keeping the Set<QueryField>.

getFieldToken performance

There was a minor change to keep a method-local reference to the Describe result for the path component you're deduping. fflib_SObjectDescribe is good at what it does, but there is still a very minor performance penalty to call getDescribe() more than once

Use of getRelationshipName to build parent relationship paths

The code in QueryField.toString() had some logic in there to try to handle the 'special' SObjectField tokens -- there is a method on the DescribeFieldResult called getRelationshipName() that does this in a reliable way.

Collateral Damage

  • A bunch of methods on fflib_SObjectSelector were removed; these methods were already marked as deprecated so it feels like people can adapt to life without them. Our code base never made use of them to begin with so hopefully others will have the same experience.
  • The public method signature of fflib_QueryFactory.getSelectedFields now returns a Set<String> instead of a Set<QueryField>
  • The "sorted" order of the list of fields in the SELECT clause doesn't follow the "local fields first, followed by parent relationships" logic. It's just a straight alpha sort.

Results of QueryFieldExorcism

11:13:52.14 (854796968)|USER_DEBUG|[38]|DEBUG|Preload Elapsed:21
11:13:52.14 (854848048)|USER_DEBUG|[39]|DEBUG|Parent Elapsed:343
11:13:52.14 (854896699)|USER_DEBUG|[40]|DEBUG|Grandparent Elapsed:407
11:13:52.14 (854952159)|USER_DEBUG|[41]|DEBUG|QF Elapsed (includes parent & grandparent):811
11:13:52.14 (855004320)|USER_DEBUG|[42]|DEBUG|SOQL Elapsed:6
11:13:52.14 (855056614)|USER_DEBUG|[43]|DEBUG|Total Elapsed:817

Construction of QueryFactory (which implies selecting the fields defined in getSObjectFieldList()) was ~7ms/each ((817 - 407 - 343) / 10) for a selector that's 54 fields 'wide'

Selecting an additional set of 45 Parent fields adds ~34ms (343 / 10)

Selecting an additional set of 45 Grandparent fields adds ~41ms (407 / 10)

Turning QueryFactory into a SOQL string takes ~0ms (6 / 10)

... which means that for a 'really wide' Selector, you spend ~81ms just constructing a SOQL string -- a 44% improvement

Sadness

  • The result of all of this analysis is that getFieldToken is really just plain expensive -- Even after all of these changes, we only achieved a 44% increase. There's no optimization that can be done which preserves its valuable CRUD/FLS enforcement features. ISVs that need to enforce CRUD/FLS have no way around this. I had given some thought to how Platform Cache could play here -- but I'm not sure it would pass a security review since the TTL on the cache key would allow CRUD/FLS administration not be immediately reflected in Selector behavior

from fflib-apex-common.

daveespo avatar daveespo commented on August 14, 2024 1

Hi guys (@afawcett and crew) -- can we resurrect this thread and come to a decision on what, if anything can be merged into the fflib-apex-common/master branch? We've been living with the code changes in the QueryFieldExorcism branch for about 2 months and haven't had any regressions. Additionally, 2 of our packages went through security re-review in that period and passed so SFDC's security team is still satisfied with CRUD/FLS enforcement.

from fflib-apex-common.

rnisbet avatar rnisbet commented on August 14, 2024 1

We'd love to see this in fflib too, @afawcett. We're using our own workarounds because debugging is very difficult as QueryFactory uses so much of the debug log. Thanks.

from fflib-apex-common.

afawcett avatar afawcett commented on August 14, 2024 1

To confirm #107 has been merged. Thanks @daveespo 👍

from fflib-apex-common.

rsoesemann avatar rsoesemann commented on August 14, 2024 1

Great! I would love to see a blog post by @daveespo on AndyInTheCloud.com to explain how and when to use the new lightweight version.

from fflib-apex-common.

dfruddffdc avatar dfruddffdc commented on August 14, 2024

I ported QueryFactory optimisations to our internal copy of this repo in #99, but it sounds like I didn't go far enough.
I can see some very detailed analysis in #79, so again, thanks for the fine police work and if you want to raise a pull request we would be very grateful. I think you've put forward some very sensible suggestions, so I'd be keen to see how effective they are.

I can only speak for myself with regards to debugsauce, but I use the humble System.debug to find roughly where the bottle necks are. If the bottleneck is early enough in execution, a System.assert(false) to unwind the stack quickly helps keep log size down. But it sounds like your method is a bit more refined to be honest!

from fflib-apex-common.

capeterson avatar capeterson commented on August 14, 2024

I was the original author of most of the choices in question, although I'm no longer employed by FinancialForce and not an authority on the future direction of this repo (although I do want to see it thrive!). In other words, I can comment with historical knowledge, but my thoughts are not in any way gospel.

Your assumption that a Set was used to ensure there were no duplicate entries is spot-on. However, going to a List has issues with edge cases where different text values are referring to equal fields, which is why I introduced QueryField to encapsulate that (and a couple other things, but mainly that).

On the topic of "if you're passing duplicate fields into QueryFactory, you're doing something wrong": this might be true in many common selector use cases (which I didn't anticipate QueryFactory would end up powering initially!) but a huge value of qf is that you can do things like throw a field set at it and let it sort out dupes or not. That's actually the single use case that caused it to be written. There is serious non-selector use of queryfactory, and more than a few highly dynamic selectors, that rely on it to de-dupe effectively. With all that said, I still like the just-in-time dedupe idea. In the context of a highly dynamic query I don't expect the minor heap penalty would be a serious issue, while the CPU improvements seem very tempting. There's also the idea of making a subclass of QueryFactory that does less checking on input for use in selectors without dynamic components. This actually seems like a good idea considering that not every selector needs this, and it does add overhead.

As a compromise on the toSOQL() string building issue, why not toString into a List, and then String.join that? It seems to remove the excessive concatenation and assuming the apex list collection appending is fast (I haven't tested, but it seems a reasonable assumption as I write this).

I know there are internal use cases that removing de-duping would break, and I'd hate to see the internal and external versions diverge, so to me keeping the same functionality and API is highly valued, even if it does slightly compromise the performance boosts you might get. A subclass that adds those, so users can pick if they want faster or more accurate, would be a great way to address that if you do go that direction.

I'd offer to help in some capacity but I need to go get some approvals before I can, so in the interim godspeed.

from fflib-apex-common.

tfuda avatar tfuda commented on August 14, 2024

Hi @capeterson. I have been working with @daveespo on this investigation into QueryFactory performance. Yesterday I did some prototyping of some performance enhancement ideas we have. Specifically, I tried the idea of populating a List with the strings returned from QueryField.toString so that I could then use String.join in the toSOQL method. I found that this actually did NOT yield the performance improvements we had been hoping for. It yielded about a 10% improvement in the toSOQL execution time... maybe not worth the trouble.

I also had the same idea that you had, of subclassing QueryField. I changed QueryField and its methods to be virtual, and then declared a class called LightweightQueryField that extends QueryField. LightweightQueryField is a simple marker class that maintains the name of the field being queried as a String. It has its own, very streamlined implementations of toString, equals, compareTo, etc. It allowed me to continue to carry the fields around as a collection of QueryField objects. I then added a constructor to the fflib_QueryFactory class that allows you to specify that you want to use the LightweightQueryField objects to build your query as opposed to the more powerful, but also less performant QueryField. This yielded some significant performance improvements in query build time (where query build time represents the total time to construct the QueryFactory, set the fields, conditions and orderings and then call toSOQL to get back the SOQL string). In my case, I saw the query build time reduced from about 1000 ms to about 200 ms. My changes would still you to maintain the full functionality provided by QueryField, but offer a lightweight version to those that are more concerned about query build time.

from fflib-apex-common.

afawcett avatar afawcett commented on August 14, 2024

@tfuda What your describing sounds useful to me and i'm keen to see it if your happy to generate a PR. Meanwhile are you basing this on the latest code from the performance improvements @frup42 mentioned above.

from fflib-apex-common.

tfuda avatar tfuda commented on August 14, 2024

I'll get back to this this week, and clean things up and submit a proper pull request @afawcett

from fflib-apex-common.

afawcett avatar afawcett commented on August 14, 2024

👍

from fflib-apex-common.

afawcett avatar afawcett commented on August 14, 2024

Wow, thanks @daveespo @tfuda for this insight, i have to dash tonight, but i've marked this for a quality read through asap! Again, great work, really appreciate it! 👍

from fflib-apex-common.

daveespo avatar daveespo commented on August 14, 2024

So that everything's bundled together on this issue, I'm mentioning PR #108 and a the writeup on that PR from @tfuda that discusses the merits of LightweightQueryField

from fflib-apex-common.

afawcett avatar afawcett commented on August 14, 2024

Thanks @daveespo and @tfuda fine work from both of you, the gains are for sure very appealing. Wondering if there a table format or some way you can show the relevant features of the current QueryBuilder as rows and as columns these two PR's / options (LightweightQueryField and QueryFieldExorcism). So we can compare more easily the two of them. Maybe also add rows that relate to performance and other benefits or side effects, such as backwards compatibility? Sorry to ask for this, i see you documented this pretty well for one of them thanks for that, it would be just easier to progress this discussion if we had a something in more of a at a glance table summary form? My other thought here is perhaps to have a call between interested parties? Whats your thoughts?

from fflib-apex-common.

afawcett avatar afawcett commented on August 14, 2024

I'm also keen to get your views and results of any tests in respect to the new parameter here, https://github.com/financialforcedev/fflib-apex-common/blob/master/fflib/src/classes/fflib_SObjectSelector.cls#L122, to disable the sorting behaviour. I think this arrived since this discussion started.

from fflib-apex-common.

daveespo avatar daveespo commented on August 14, 2024

No problem; There are a lot of words on this issue and the related pull requests. Is this what you were looking for?

Concern Current Implementation QueryFieldExorcism LightweightQueryField
Dev Console / Developer experience With Apex log level at FINEST or NONE, Set<QueryField> performance becomes linear (see #79) and results in CPU timeouts Fixed; a Set<String> is used instead Fixed; a List<QueryField> is used instead
Debug Log output With Apex log level at FINEST, log file overflows 2MB with QueryField.equals() calls Fixed; inserting into Set<String> doesn't create log output Fixed; inserting into List<QueryField> doesn't call equals() override
CRUD/FLS enforcement (optional) Supported Supported Unsupported
Performance of building SOQL string in production environment 147ms 81ms 39ms
API backwards compatibility N/A Four 'deprecated' public methods on fflib_SObjectSelector were removed: getFieldListBuilder(), setFieldListBuilder(), getFieldListString(), getRelatedFieldListString(); Also, the method signature for fflib_QueryFactory.getSelectedFields now returns a Set<String> instead of a Set<QueryField> No API changes (The "sortFields" feature is now a no-op since deduplicating the List<QueryField> requires that the list be sorted then deduped)
Gracefully handles duplicate fields in Selector Yes Yes Yes

A few additional notes:

  • I hadn't given a ton of thought to creating shims for the 4 public methods removed in SObjectSelector -- if removing those methods is a big deal, I can ruminate on restoring them
  • I think we can actually "do both" (both QueryFieldExorcism and the Lightweight query building) with some refactoring -- The larger performance benefit seen by LightweightQueryField is largely due to bypassing of CRUD/FLS enforcment here: Line 91
  • For LightweightQueryField, we didn't get super hung up on the method signatures for the fflib_QueryFactory constructors; we realize that overloaded constructors can result in madness so if you want the additional constructor removed, we can refactor that to just have a setLightweight() method that initializes the implementation

Tom and I are US East Coasters and work in the same office so we can do a GHangout/Skype/Zoom/whatever generally between 8am and 4pm ET Monday to Thurs. This week, Tom's on holiday so next week would be the earliest we could chat. I think you know both my work and personal email addresses so you can send a meeting invite to either

Glad to see this moving forward and the great discussions on this and some of the other open tickets in this repo; lots of great stuff to think about!

from fflib-apex-common.

daveespo avatar daveespo commented on August 14, 2024

Sorry, one last thing: addressing your Q about whether the sortSelectFields feature that landed was part of our benchmarks:

Yes, we had this code in place and were constructing fflib_QueryFactory using sortSelectFields=false constructor (as you'll see in the fflib-Optimization repo). So the "before" benchmark numbers are all reflective of using the optimizations committed as part of #99

from fflib-apex-common.

rsoesemann avatar rsoesemann commented on August 14, 2024

Is there any decision here if or if not this PR gets merged. I am hardly suffering this issue and would love to see a solution inside the lib.

@afawcett @daveespo @tfuda @capeterson

from fflib-apex-common.

daveespo avatar daveespo commented on August 14, 2024

We'd love to see it merged in too! :-)

We've been using the "QueryFieldExorcism" branch in production now for 6+ months with no regrets and we'd like to get off of our fork and back onto the stock fflib distribution

from fflib-apex-common.

afawcett avatar afawcett commented on August 14, 2024

I'm going to look at this next week, promise! Thanks everyone for your patience. 👍

from fflib-apex-common.

rsoesemann avatar rsoesemann commented on August 14, 2024

@afawcett Is there work to do besides merging? If so maybe we can support you...

from fflib-apex-common.

rnisbet avatar rnisbet commented on August 14, 2024

Hi @afawcett, thank you for your quick reply. If we can merge this branch, it would give us exactly what we need.

from fflib-apex-common.

afawcett avatar afawcett commented on August 14, 2024

Thanks everyone, i have had a good look at this today and finally got comfortable committing it.

I spent a bit of time thinking about the sortSelectFields functionality, which was put in to improve performance. The comment above that indicates this was included in the before analysis confirms to me that the nullification of this behaviour, means that those using this feature currently to get performance improvement will not see a regression. Let me know though if i have misunderstood this or this is not the case.

Meanwhile, its now been merged and Travis says all is well with the world! 👍

So next it would be great if someone could write a blog or provide a wiki page PR for this repo. That describes the pros and cons of the two modes. We can add it to the README doc section and i'll of course promote on my channels, blog, twitter etc.

from fflib-apex-common.

rnisbet avatar rnisbet commented on August 14, 2024

And thank you @afawcett!

from fflib-apex-common.

afawcett avatar afawcett commented on August 14, 2024

@up2go-rsoesemann the lightweight version was not merged, only the general improvements with existing functionality. So in the end there is not much to write about, but plenty to be happy about! 👍

from fflib-apex-common.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.