Comments (2)
I envision creating an automatic report function (e.g. call it tour()) that automatically runs through recommended visualizations and generates a complete report. Similar to pandas-profiling but with more diagnostic-driven visualizations. pandas-profiling does a great job of providing the diagnostics, but it can be overwhelming since it doesn't distill the results to important insights like H2O does.
The ability to parallelize / scale the compute (#46) should be a strong requirement for the automatic visualization capability, as we don't want the user to be waiting a long time to get any results from the automatic visualization.
from data-describe.
I envision creating an automatic report function (e.g. call it tour()) that automatically runs through recommended visualizations and generates a complete report. Similar to pandas-profiling but with more diagnostic-driven visualizations. pandas-profiling does a great job of providing the diagnostics, but it can be overwhelming since it doesn't distill the results to important insights like H2O does.
If we want to do this, we need to work out which insights users are most interested in, rather than plotting everything for every variable.
The ability to parallelize / scale the compute (#46) should be a strong requirement for the automatic visualization capability, as we don't want the user to be waiting a long time to get any results from the automatic visualization.
Yes. Another benefit of automatic visualization is that we know in advance which plots and computations will be generated, so the aggregated/summary statistics only have to be calculated once.
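As a rough illustration of the two points above (compute the summary statistics once, then fan the diagnostics out in parallel), here is a minimal sketch. It is not data-describe's API: `tour()`, `summarize()`, `missing_report()` and `spread_report()` are hypothetical names, the input is a plain dict of columns rather than a DataFrame, and a thread pool stands in for whatever compute backend #46 ends up providing.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean, stdev


def summarize(columns):
    """Compute per-column summary statistics in one pass (hypothetical helper)."""
    summary = {}
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        summary[name] = {
            "count": len(present),
            "missing": len(values) - len(present),
            "mean": mean(present) if present else None,
            "stdev": stdev(present) if len(present) > 1 else None,
        }
    return summary


# Hypothetical diagnostics: each reads the shared summary instead of rescanning the data.
def missing_report(summary):
    """Columns that have missing values, with their missing counts."""
    return {name: s["missing"] for name, s in summary.items() if s["missing"] > 0}


def spread_report(summary):
    """Columns ranked by standard deviation, largest first."""
    ranked = [(name, s["stdev"]) for name, s in summary.items() if s["stdev"] is not None]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


def tour(columns, diagnostics=(missing_report, spread_report)):
    """Run every diagnostic against one shared summary and collect the results."""
    summary = summarize(columns)  # calculated once, reused by all diagnostics
    with ThreadPoolExecutor() as pool:
        futures = {d.__name__: pool.submit(d, summary) for d in diagnostics}
        return {name: future.result() for name, future in futures.items()}


report = tour({"a": [1.0, 2.0, None, 4.0], "b": [10.0, 10.0, 10.0, 10.0]})
print(report["missing_report"])
print(report["spread_report"])
```

Because the set of diagnostics is known up front, the summary is built a single time and each report only reads from it. Threads are used here purely to keep the sketch self-contained; CPU-heavy diagnostics would likely need a process pool or a distributed backend instead.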