bears-r-us / arkoudanotebooks
Place for notebooks and example uses of the Arkouda software package
License: MIT License
LANL Netflow data is available here: https://csr.lanl.gov/data/2017/
The following out-of-date calls were identified in the notebook; each is followed by its corrected form:
df.drop('userName')
df.drop('userName', axis=1, inplace=True)
slice_df = df[[1, 3, 5]]
slice_df = df[ak.array([1, 3, 5])]
slice_df.reset_index()
slice_df.reset_index(inplace=True)
df.rename({'userName':'user_name', 'userID': 'user_id'})
df.rename({'userName':'user_name', 'userID': 'user_id'}, axis=1, inplace=True)
@pierce314159 looks like the same stuff as the DataFrame_Demo notebook!
@Bears-R-Us/arkouda-core-dev update ak.info calls in notebooks to new interface
Develop Jupyter notebooks that detail a variety of Arkouda EDA workflows using Netflow data.
DataFrame objects have been added to arkouda recently. The intention is for these objects to function essentially the same as pandas.DataFrame. A notebook should be developed demonstrating:
- DataFrame functionality and how it compares to that of pandas.DataFrame
- DataFrame and pandas.DataFrame. This may simply be code that has not yet been implemented.
- DataFrame in use
I have a demo notebook that highlights some of the basics of Arkouda. The notebook compares Arkouda to NumPy/Pandas. It walks through the features listed below:
We can add additional examples here as needed. I believe some adjustments may be required for it to run properly, given the updates to Arkouda since it was created.
Create a new version of the NYCTaxi notebooks demonstrating the application of Arkouda DataFrames in place of Pandas.
Update Registration_Example.ipynb to cover new info method functionality from Arkouda PR#782 including:
- ak.information (including example to parse the JSON return string) and ak.pretty_print_information
- .info and .pretty_print_info methods for pdarray, Strings, and Categorical
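For the JSON-parsing example requested for ak.information, a minimal sketch could look like the following. It uses only Python's standard json module. The sample string and its field names (name, dtype, size, registered) are assumptions made for illustration, not the confirmed output of ak.information; in the notebook the string would come from the real call.

```python
import json

# Hypothetical stand-in for the JSON string returned by ak.information().
# The field names below are assumed for illustration only.
sample_info = (
    '[{"name": "id_1", "dtype": "int64", "size": 10, "registered": false},'
    ' {"name": "my_strings", "dtype": "str", "size": 3, "registered": true}]'
)


def parse_information(info_json):
    """Parse the JSON string into a dict keyed by each entry's name."""
    entries = json.loads(info_json)
    return {entry["name"]: entry for entry in entries}


def registered_names(info_json):
    """Return the names of entries flagged as registered."""
    parsed = parse_information(info_json)
    return [name for name, entry in parsed.items() if entry["registered"]]


info = parse_information(sample_info)
for name, entry in info.items():
    print(name, entry["dtype"], entry["size"])
```

Keying the result by name makes it easy to look up a single object; if the real return string is shaped differently, only parse_information would need to change.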
PR #1177 in the arkouda repo updates the functionality of ak.DataFrame.drop() to include the ability to drop columns.
The DataFrame Demo notebooks should be updated to reflect these changes once PR #1177 is merged.
Link to PR for convenience:
Bears-R-Us/arkouda#1177
This issue encapsulates discussion about the NYC Taxi data set example using Arkouda.
Notebook here
Add 2 notebooks contributed by @tgstevensonRedRocket to the repository
Add a notebook highlighting new bigint pdarray functionality
Hello, thank you for the great presentation at the Arkouda hackathon a couple of weeks ago and for all of your work on this project. I went through the notebooks in this repo, and they seem to do an excellent job of detailing how Arkouda can be used in an interactive manner for typical data science use cases.
However, I was wondering if you have examples of notebooks which run more computationally intensive and algorithmically complex data science programs, on the level of your benchmarks or something along the lines of a Triangle Count algorithm which can run for quite a few iterations. I am trying to see what kind of messages are sent to the Chapel-backed server for these data science-oriented use cases. Thanks!
Add pdf of ArrayView_index_math notebook presented at Arkouda Weekly Call
Add a pandas alignment folder for code examples demonstrating the similarities and differences between the arkouda API and pandas 2.0 API for specific functionalities.
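As one possible starting point for the pandas side of such a comparison, the sketch below runs the drop, row-selection, reset-index, and rename operations discussed elsewhere in these issues using plain pandas. The data values are made up for illustration. It shows only pandas behavior (each call returns a new object unless inplace=True is passed) and makes no claim about what the matching Arkouda calls do.

```python
import pandas as pd

# Made-up sample data; column names follow the ones used in these issues.
df = pd.DataFrame(
    {
        "userName": ["Alice", "Bob", "Carol", "Dave", "Eve", "Frank"],
        "userID": [111, 222, 333, 444, 555, 666],
    }
)

# drop returns a new DataFrame by default; df itself is unchanged.
dropped = df.drop("userID", axis=1)

# Positional row selection with a Python list goes through .iloc.
rows = df.iloc[[1, 3, 5]]

# reset_index also returns a new object; drop=True discards the old index.
rows_reset = rows.reset_index(drop=True)

# rename with axis=1 maps column names and returns a new DataFrame.
renamed = df.rename({"userName": "user_name", "userID": "user_id"}, axis=1)

print(list(dropped.columns), list(rows_reset.index), list(renamed.columns))
```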
We have removed the ability to perform row indexing using Python lists when indexing into an Arkouda DataFrame. Row indices now need to be an Arkouda pdarray in order to function. Any row indexing in the notebooks will need to be updated to use pdarrays in place of lists.
Identified the following issues while testing out the DataFrame Demo notebook:
Drop: Need to specify the axis and inplace=True (line 2)
df.drop('userName')
df.drop('userName', axis=1, inplace=True)
Reset Index: slicing on integers appears to be broken, need to specify ak array (line 7)
df[[1,3,5]]
df[ak.array([1, 3, 5])]
Reset Index: Doesn't actually work inplace without explicitly stating so (line 13)
slice_df.reset_index()
slice_df.reset_index(inplace=True)
Renaming Columns: Same issues as above, does not work inplace unless specified, also need to specify axis (line 9)
df.rename({'userName':'user_name', 'userID': 'user_id'})
df.rename({'userName':'user_name', 'userID': 'user_id'}, axis=1, inplace=True)
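The corrected calls above can be collected into one helper, sketched here under a few assumptions: df is an Arkouda DataFrame with userName and userID columns, ak is the imported arkouda module already connected to a server, and the signatures are exactly those quoted in this issue (they may differ in other Arkouda versions). The function name and the ordering of the steps are hypothetical. The rename runs first, so the drop targets the new column name.

```python
def apply_notebook_fixes(df, ak):
    """Apply the corrected DataFrame Demo calls and return the row slice.

    Assumes `df` behaves like an Arkouda DataFrame with 'userName' and
    'userID' columns, and `ak` is the arkouda module (hypothetical helper).
    """
    # Rename columns: axis and inplace must both be given explicitly.
    df.rename({"userName": "user_name", "userID": "user_id"}, axis=1, inplace=True)

    # Drop a column: again specify axis=1 and inplace=True.
    df.drop("user_name", axis=1, inplace=True)

    # Row indexing: pass an Arkouda array instead of a Python list.
    slice_df = df[ak.array([1, 3, 5])]

    # Reset the index of the slice in place.
    slice_df.reset_index(inplace=True)
    return slice_df


# Usage sketch (requires a running Arkouda server, so it is left commented out):
# import arkouda as ak
# ak.connect()
# slice_df = apply_notebook_fixes(df, ak)
```

Passing ak in as a parameter keeps the helper importable without a server connection; in a notebook the module-level import would normally be used directly.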
@pierce314159
With the addition of the register and attach components of GroupBy, the Registration_Example notebook should be updated to reflect current functionality.
GroupBy, SegArray, and Categorical register and attach functionality