Code Monkey home page Code Monkey logo

Comments (2)

alamb avatar alamb commented on July 4, 2024 1

It is a good observation that ScalarUDFImpl and AggregateUDFImpl don't share a common base trait and thus adding functionality that affects both requires duplication of code

I would need to introduce a more general trait for both function
Another alternative is having a duplicate function for scalar and aggregate for a related function

I agree with your analysis of the tradeoffs: a common base trait would result in less duplication in DataFusion

However, I personally prefer duplicating coerce_arguments_for_signature in each trait rather than introducing a common base trait because:

  1. It is backwards compatible (not an API change for the existing library of functions)
  2. Makes it slightly easier to implement ScalarUDF and AggregateUDF (especially when new to rust) -- rather than two impls for your function, you only need one

from arrow-datafusion.

jayzhan211 avatar jayzhan211 commented on July 4, 2024

I found that to have coerce_types for both scalarUDF and aggregateUDF, I would need to introduce a more general trait for both function

impl UDFImpl for T {
    fn name(&self) -> &str {
       &self.name
    }

    fn coerce_types(&self, data_types: &[DataType]) -> Result<Vec<DataType>> {
        not_impl_err!("Function {} does not implement coerce_types", self.name)
    }
}

impl ScalarUDFImpl: UDFImpl
impl AggregateUDFImpl: UDFImpl

Then we can have

fn coerce_arguments_for_signature(
    expressions: Vec<Expr>,
    schema: &DFSchema,
    signature: &Signature,
    func: Arc<dyn UDFImpl>,
) -> Result<Vec<Expr>> {}

Another alternative is having a duplicate function for scalar and aggregate for a related function

fn coerce_arguments_for_signature(
    expressions: Vec<Expr>,
    schema: &DFSchema,
    signature: &Signature,
    func: &ScalarUDF,
) -> Result<Vec<Expr>> {}

fn coerce_arguments_for_signature(
    expressions: Vec<Expr>,
    schema: &DFSchema,
    signature: &Signature,
    func: &AggregateUDF,
) -> Result<Vec<Expr>> {}

I think the first option is potentially beneficial in the long run(?) but the user now needs to define two traits. The second option only increases the maintenance cost.

What do you think about this @alamb
I also track if there is rust solution for this in https://users.rust-lang.org/t/inheritance-like-with-rust-trait/111102

from arrow-datafusion.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.