
Comments (9)

alamb commented on August 17, 2024

Thanks @Rachelint -- I also have some time later this week to help with this issue (if it might help to have one or two examples as we work through #11151)

from arrow-datafusion.

Rachelint commented on August 17, 2024

> Sounds like it might work -- I think the only way to really know for sure would be to try it out

Thanks, I'll try it now.

Rachelint commented on August 17, 2024

>>> Sounds like it might work -- I think the only way to really know for sure would be to try it out

>> Thanks, I'll try it now.

> @Rachelint, if you stop working on this, could you please let me know?

Sorry for the delay. I am still working on it and will try to push the code today...

Rachelint commented on August 17, 2024

take

Rachelint commented on August 17, 2024

Hi @alamb, I have some questions about the alternatives mentioned in the issue.
It seems the interface should look like this:

trait AggregateExpr {
    fn output_from_stats(&self, stats: &Statistics) -> Option<ScalarValue> { None }
    ...
}

As I understand it, this is an optimization that takes effect when specific aggregate functions (such as max, min, count...) can get their results directly from stats?
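For illustration, here is a minimal, self-contained sketch of how such a hook might behave. It is not DataFusion's actual code: Statistics, ColumnStatistics, Precision and ScalarValue below are simplified stand-ins written for this example, and the Max, CountStar and Avg structs (including the column_index field) are hypothetical. The idea is only that the default returns None, and an aggregate opts in when the statistics are exact.

```rust
// Simplified stand-ins for illustration only; NOT DataFusion's real definitions.
#[derive(Debug, Clone, PartialEq)]
pub enum ScalarValue {
    Int64(Option<i64>),
}

#[derive(Debug, Clone, PartialEq)]
pub enum Precision<T> {
    Exact(T),
    Inexact(T),
    Absent,
}

pub struct ColumnStatistics {
    pub min_value: Precision<ScalarValue>,
    pub max_value: Precision<ScalarValue>,
}

pub struct Statistics {
    pub num_rows: Precision<usize>,
    pub column_statistics: Vec<ColumnStatistics>,
}

pub trait AggregateExpr {
    /// Default: this aggregate cannot be answered from statistics alone.
    fn output_from_stats(&self, _stats: &Statistics) -> Option<ScalarValue> {
        None
    }
}

/// Hypothetical MAX over a single column, identified by its index.
pub struct Max {
    pub column_index: usize,
}

impl AggregateExpr for Max {
    fn output_from_stats(&self, stats: &Statistics) -> Option<ScalarValue> {
        // Only an exact maximum is safe to return as the query result.
        match &stats.column_statistics.get(self.column_index)?.max_value {
            Precision::Exact(v) => Some(v.clone()),
            _ => None,
        }
    }
}

/// Hypothetical COUNT(*), answered from an exact row count.
pub struct CountStar;

impl AggregateExpr for CountStar {
    fn output_from_stats(&self, stats: &Statistics) -> Option<ScalarValue> {
        match &stats.num_rows {
            Precision::Exact(n) => Some(ScalarValue::Int64(Some(*n as i64))),
            _ => None,
        }
    }
}

/// Hypothetical AVG: keeps the default, so it is always computed normally.
pub struct Avg;

impl AggregateExpr for Avg {}
```

With this shape, an optimizer pass could call output_from_stats on each aggregate and fall back to normal execution whenever it gets None, for example when a statistic is only Inexact.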


alamb commented on August 17, 2024

> As I understand it, this is an optimization that takes effect when specific aggregate functions (such as max, min, count...) can get their results directly from stats?

Yes, that is the case.

I think output_from_stats would also be reasonable and potentially more general. One challenge is that the AggregateExpr might not have access to its argument exprs (and thus it might not know what the argument is) 🤔


Rachelint commented on August 17, 2024

>> As I understand it, this is an optimization that takes effect when specific aggregate functions (such as max, min, count...) can get their results directly from stats?

> Yes, that is the case.

> I think output_from_stats would also be reasonable and potentially more general. One challenge is that the AggregateExpr might not have access to its argument exprs (and thus it might not know what the argument is) 🤔

It seems an expressions function already exists. Could it help here?

fn expressions(&self) -> Vec<Arc<dyn PhysicalExpr>>;

And it seems to be what is currently used to get min/max from stats?

if casted_expr.expressions().len() == 1 {
    // TODO optimize with exprs other than Column
    if let Some(col_expr) = casted_expr.expressions()[0]
        .as_any()
        .downcast_ref::<expressions::Column>()
    {
        if let Precision::Exact(val) =
            &col_stats[col_expr.index()].max_value
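The excerpt above is cut off, so as a rough illustration of the same pattern, here is a self-contained sketch using only the standard library. Everything in it is an assumption made for the example: PhysicalExpr, Column, Literal, Precision and ColumnStatistics are simplified stand-ins rather than DataFusion's real types, and max_from_stats is a hypothetical helper name. It shows how an aggregate could recover its argument through expressions(), downcast it to a plain column, and then look up that column's exact maximum.

```rust
use std::any::Any;
use std::sync::Arc;

// Simplified stand-ins for illustration only; NOT DataFusion's real definitions.
pub trait PhysicalExpr {
    fn as_any(&self) -> &dyn Any;
}

pub struct Column {
    index: usize,
}

impl Column {
    pub fn new(index: usize) -> Self {
        Self { index }
    }

    pub fn index(&self) -> usize {
        self.index
    }
}

impl PhysicalExpr for Column {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// A non-column expression, used to show the case where the downcast fails.
pub struct Literal(pub i64);

impl PhysicalExpr for Literal {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

pub enum Precision<T> {
    Exact(T),
    Inexact(T),
    Absent,
}

pub struct ColumnStatistics {
    pub max_value: Precision<i64>,
}

/// Hypothetical helper: returns the exact maximum when the aggregate has
/// exactly one argument and that argument is a plain column reference.
pub fn max_from_stats(
    args: &[Arc<dyn PhysicalExpr>],
    col_stats: &[ColumnStatistics],
) -> Option<i64> {
    if args.len() != 1 {
        return None;
    }
    // Expressions other than Column (casts, literals, ...) are not handled here.
    let col = args[0].as_any().downcast_ref::<Column>()?;
    match &col_stats.get(col.index())?.max_value {
        Precision::Exact(v) => Some(*v),
        _ => None,
    }
}
```

The early returns mirror the nested if-let chain in the excerpt: any argument that is not a single Column, or any statistic that is not Exact, falls through to None so the caller can compute the aggregate normally.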


alamb commented on August 17, 2024

Sounds like it might work -- I think the only way to really know for sure would be to try it out

edmondop commented on August 17, 2024

>> Sounds like it might work -- I think the only way to really know for sure would be to try it out

> Thanks, I'll try it now.

@Rachelint, if you stop working on this, could you please let me know?
