arkoudanotebooks's People

Contributors

ethan-debandi99, joshmarshall1, mhmerrill, reuster986, stress-tess

arkoudanotebooks's Issues

DataFrame_Demo2 Updates

The following out-of-date calls were identified in the notebook:

  • Drop Columns, line 2
    • Wrong: df.drop('userName')
    • Right: df.drop('userName', axis=1, inplace=True)
  • Reset Index, line 3
    • Wrong: slice_df = df[[1, 3, 5]]
    • Right: slice_df = df[ak.array([1, 3, 5])]
  • Reset Index, line 6
    • Wrong: slice_df.reset_index()
    • Right: slice_df.reset_index(inplace=True)
  • Column Renaming, line 2
    • Wrong: df.rename({'userName':'user_name', 'userID': 'user_id'})
    • Right: df.rename({'userName':'user_name', 'userID': 'user_id'}, axis=1, inplace=True)
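For comparison, pandas accepts the same `axis=1` and `inplace=True` arguments as the corrected calls above. The sketch below is a minimal pandas version, assuming pandas is installed; it is not Arkouda code, since the Arkouda calls need a running server. The column names come from the list above, and the data values are made up for illustration.

```python
import pandas as pd

# Toy data: column names follow the notebook, values are invented.
df = pd.DataFrame({
    "userName": ["Alice", "Bob", "Carol", "Dave", "Eve", "Frank"],
    "userID": [111, 222, 333, 444, 555, 666],
    "item": [0, 1, 2, 3, 4, 5],
})

# Rename columns in place; axis=1 targets column labels rather than the index.
df.rename({"userName": "user_name", "userID": "user_id"}, axis=1, inplace=True)

# Select rows 1, 3, 5 by position. pandas uses .iloc here, whereas the
# corrected Arkouda call indexes with an ak.array.
slice_df = df.iloc[[1, 3, 5]].copy()

# Without inplace=True, reset_index returns a new frame and leaves slice_df unchanged.
slice_df.reset_index(drop=True, inplace=True)

# Drop a column in place; without axis=1 pandas would look for a row label.
df.drop("user_name", axis=1, inplace=True)

print(list(df.columns))
print(slice_df)
```

One visible difference is positional row selection: pandas goes through `.iloc`, while the Arkouda fix passes an `ak.array` directly to `df[...]`.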

@pierce314159 looks like the same stuff as the DataFrame_Demo notebook!

Arkouda `DataFrame` Demo

DataFrame objects have recently been added to Arkouda. The intention is for these objects to function essentially the same as pandas.DataFrame. A notebook should be developed demonstrating:

  • Arkouda DataFrame functionality and how it compares to that of pandas.DataFrame
  • Any differences between Arkouda DataFrame and pandas.DataFrame. These may simply be features that have not yet been implemented.
  • Examples of Arkouda DataFrame in use

Add Arkouda Basics Notebook

I have a demo notebook that highlights some of the basics of Arkouda. The notebook compares Arkouda to NumPy/Pandas. It walks through the features listed below:

  • Array Creation
  • Array Functionality (Set Operations, GroupBy)
  • Creating DataFrames
  • DataFrame Functionality (GroupBy, Sorting)

We can add more examples here as needed. Because Arkouda has been updated since the notebook was created, I believe some adjustments may be required for it to run properly.

Update info methods in Registration_Example

Update Registration_Example.ipynb to cover the new info method functionality from Arkouda PR #782, including:

  • ak.information (including example to parse the JSON return string) and ak.pretty_print_information
  • Class level .info and .pretty_print_info methods for pdarray, Strings, and Categorical
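Parsing the JSON return string can be sketched with the standard `json` module. The sample string and its keys (`name`, `dtype`, `size`, `registered`) are assumptions made for illustration, and `registered_names` is a hypothetical helper; the actual output of `ak.information` may be shaped differently and should be checked against a running server.

```python
import json

# Hypothetical sample of a JSON string describing server-side objects.
# The real return value of ak.information may use different keys.
sample_info = """
[
  {"name": "id_1", "dtype": "int64", "size": 10, "registered": false},
  {"name": "my_array", "dtype": "float64", "size": 250, "registered": true}
]
"""


def registered_names(info_json):
    """Return the names of all entries flagged as registered."""
    entries = json.loads(info_json)
    return [entry["name"] for entry in entries if entry.get("registered")]


names = registered_names(sample_info)
print(names)
```

In the notebook, the string passed to the helper would be the one returned by the server rather than a hardcoded sample.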

Update Drop Functionality

PR #1177 in the arkouda repo updates the functionality of ak.DataFrame.drop() to include the ability to drop columns.

The DataFrame Demo notebooks should be updated to reflect these changes once PR #1177 is merged.

Link to PR for convenience:
Bears-R-Us/arkouda#1177

Discussion about NYC Taxi Notebook

This issue encapsulates discussion about the NYC Taxi data set example using Arkouda.
Notebook here

  • ideas of what to compute from the taxi data set
    • infer/recover probabilistic taxi entities (Kalman filter?)
    • PageRank on location ID graph
    • estimate number of taxis in-flight or waiting using different data fields
    • estimate paths of taxis using join-with-delta-time operation
    • other suggestions or crazier things?
  • examples of using arkouda operations
  • define helper function to interoperate with NumPy or Pandas
  • other suggestions
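As a starting point for the PageRank idea above, here is a minimal pure-Python sketch using power iteration. It assumes each trip contributes one directed edge from its pickup location ID to its dropoff location ID; the trip pairs are made up, and the `pagerank` function is illustrative rather than an Arkouda operation.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n if n else 0.0
        new_rank = {node: base for node in nodes}
        for src, targets in out_links.items():
            if not targets:
                continue
            # Each source splits its damped rank equally among its targets.
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Made-up (pickup location ID, dropoff location ID) pairs, one per trip.
trips = [(1, 2), (1, 3), (2, 3), (3, 1), (4, 3)]
ranks = pagerank(trips)
for location, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(location, round(score, 4))
```

An Arkouda version would presumably replace the Python loops with array operations such as GroupBy on the location ID columns; that translation is left open here.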

Yellow Trips Data Dictionary

NYC Yellow Taxi Trip Records Jan 2020

NYC Taxi Zone Lookup Table

Computationally Intensive Notebooks

Hello, thank you for the great presentation at the Arkouda hackathon a couple of weeks ago and for all of your work on this project. I went through the notebooks in this repo, and they seem to do an excellent job of detailing how Arkouda can be used in an interactive manner for typical data science use cases.

However, I was wondering if you have examples of notebooks which run more computationally intensive and algorithmically complex data science programs, on the level of your benchmarks or something along the lines of a Triangle Count algorithm which can run for quite a few iterations. I am trying to see what kind of messages are sent to the Chapel-backed server for these data science-oriented use cases. Thanks!

add pandas alignment folder

Add a pandas alignment folder for code examples demonstrating the similarities and differences between the arkouda API and pandas 2.0 API for specific functionalities.

Update Dataframe Notebooks - Indexing Issue

We have removed the ability to perform row indexing with Python lists when indexing into an Arkouda DataFrame. Row indices now need to be an Arkouda pdarray. Any row indexing in the notebooks will need to be updated to use pdarrays in place of lists.

DataFrame Demo Notebook Out of Date

Identified the following issues while testing out the DataFrame Demo notebook:

  • Drop: Need to specify the axis and inplace=True (line 2)

    • Wrong: df.drop('userName')
    • Right: df.drop('userName', axis=1, inplace=True)
  • Reset Index: indexing with a Python list of integers appears to be broken; an ak.array needs to be passed instead (line 7)

    • Wrong: df[[1,3,5]]
    • Right: df[ak.array([1, 3, 5])]
  • Reset Index: Doesn't actually work inplace without explicitly stating so (line 13)

    • Wrong: slice_df.reset_index()
    • Right: slice_df.reset_index(inplace=True)
  • Renaming Columns: Same issues as above, does not work inplace unless specified, also need to specify axis (line 9)

    • Wrong: df.rename({'userName':'user_name', 'userID': 'user_id'})
    • Right: df.rename({'userName':'user_name', 'userID': 'user_id'}, axis=1, inplace=True)

    @pierce314159

Update Registration_Example Notebook

With the addition of the register and attach components of GroupBy, the Registration_Example notebook should be updated to reflect current functionality.

  • Existing notebook examples should be verified as still functional or updated if no longer accurate
  • New examples should be added for the GroupBy, SegArray, and Categorical register and attach functionality
