Comments (2)

roothsmz commented on May 28, 2024

@skrawcz ,

One thought is to provide for an overridable Callable object.

def get_dataframe_metadata(df: pd.DataFrame, metadata_func: Optional[Callable[[pd.DataFrame], Dict[str, Any]]] = None) -> Dict[str, Any]:
    """Gives metadata from loading a dataframe.

    Note: we reserve the right to change this schema. So if you're using this come
    chat so that we can make sure we don't break your code.

    This will default to include:
    - the number of rows
    - the number of columns
    - the column names
    - the data types
    
    If you provide an override, then your DataFrame will be passed into that 
    Callable and the resulting Dictionary will include your custom metadata.
    """
    if metadata_func is not None:
        return {DATAFRAME_METADATA: metadata_func(df)}
    else:
        return {
            DATAFRAME_METADATA: {
                "rows": len(df),
                "columns": len(df.columns),
                "column_names": list(df.columns),
                "datatypes": [str(t) for t in list(df.dtypes)],  # for serialization purposes
            }
        }
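
To make the proposal concrete, here is a minimal sketch of how a caller might use the override. The value of `DATAFRAME_METADATA`, the compact copy of the function, and the `null_count_metadata` helper are assumptions made only so the example runs on its own; they are not taken from the library.

```python
from typing import Any, Callable, Dict, Optional

import pandas as pd

# Assumed placeholder value; the real constant is defined in the library and may differ.
DATAFRAME_METADATA = "dataframe_metadata"


def get_dataframe_metadata(
    df: pd.DataFrame,
    metadata_func: Optional[Callable[[pd.DataFrame], Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Compact stand-in for the proposed function, repeated so the sketch is self-contained."""
    if metadata_func is not None:
        return {DATAFRAME_METADATA: metadata_func(df)}
    return {
        DATAFRAME_METADATA: {
            "rows": len(df),
            "columns": len(df.columns),
            "column_names": list(df.columns),
            "datatypes": [str(t) for t in df.dtypes],  # strings for serialization
        }
    }


def null_count_metadata(df: pd.DataFrame) -> Dict[str, Any]:
    """Hypothetical override: report the row count and the nulls per column."""
    return {
        "rows": len(df),
        "null_counts": {str(col): int(n) for col, n in df.isna().sum().items()},
    }


df = pd.DataFrame({"a": [1, 2, None], "b": ["x", "y", "z"]})

# Default schema: rows, columns, column_names, datatypes.
default_meta = get_dataframe_metadata(df)

# Custom schema: whatever the override returns, nested under the same key.
custom_meta = get_dataframe_metadata(df, metadata_func=null_count_metadata)
```

One consequence of this design is that the override replaces the default fields rather than extending them, so anything downstream that expects `rows` or `column_names` would have to tolerate their absence.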

Source code as currently implemented:

def get_dataframe_metadata(df: pd.DataFrame) -> Dict[str, Any]:
    """Gives metadata from loading a dataframe.

    Note: we reserve the right to change this schema. So if you're using this come
    chat so that we can make sure we don't break your code.

    This includes:
    - the number of rows
    - the number of columns
    - the column names
    - the data types
    """
    return {
        DATAFRAME_METADATA: {
            "rows": len(df),
            "columns": len(df.columns),
            "column_names": list(df.columns),
            "datatypes": [str(t) for t in list(df.dtypes)],  # for serialization purposes
        }
    }

from hamilton.

skrawcz commented on May 28, 2024

Hmm, this makes me think: do we need schemas by type? Or can we have one general schema for all dataframes, where some fields might not be easily populated?
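
As a rough illustration of the second option, the sketch below uses one shared schema in which every field is optional and `None` marks a field that could not be populated. The names `GeneralDataFrameMetadata`, `general_metadata`, and `_ColumnsOnlyFrame` are hypothetical, and the duck-typed attribute checks are an assumption about how different dataframe types might be probed, not an existing API.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional


@dataclass
class GeneralDataFrameMetadata:
    """Hypothetical shared schema; None means the field was not populated."""

    rows: Optional[int] = None
    columns: Optional[int] = None
    column_names: Optional[List[str]] = None
    datatypes: Optional[List[str]] = None


def general_metadata(df: Any) -> Dict[str, Any]:
    """Fill in whatever the object exposes and leave the rest as None."""
    meta = GeneralDataFrameMetadata()

    columns = getattr(df, "columns", None)
    if columns is not None:
        meta.column_names = [str(c) for c in columns]
        meta.columns = len(meta.column_names)

    dtypes = getattr(df, "dtypes", None)
    if dtypes is not None:
        meta.datatypes = [str(t) for t in dtypes]

    try:
        meta.rows = len(df)
    except TypeError:
        pass  # the object does not support len(), e.g. a lazily evaluated frame

    return asdict(meta)


class _ColumnsOnlyFrame:
    """Toy stand-in for a frame type that only knows its column names."""

    columns = ["a", "b"]


partial_meta = general_metadata(_ColumnsOnlyFrame())
```

The trade-off is that consumers get a stable set of keys but must handle `None` values, whereas per-type schemas would be complete but differ from one dataframe library to the next.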

