Comments (9)
I prefer using the .pipe method with janitor functions, as opposed to jn.DataFrame with its methods, so this doesn't seem 100% natural to me. That said, I use the df.plot functionality a lot.
Also, missingno is already very easy to use,
import missingno as mn
mn.matrix(df)
This can be piped as well,
df.pipe(mn.matrix)
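For readers less familiar with .pipe, here is a minimal sketch of what the piped form does: df.pipe(f, ...) simply calls f(df, ...). To keep it runnable without a plotting backend, it uses two hypothetical stand-in functions (nullity_matrix and missing_fraction are made up for illustration and are not part of missingno or pyjanitor).

```python
import pandas as pd


def nullity_matrix(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for mn.matrix: True where a value is present."""
    return df.notna()


def missing_fraction(df: pd.DataFrame, axis: int = 0) -> pd.Series:
    """Hypothetical helper: fraction of missing values along the given axis."""
    return df.isna().mean(axis=axis)


df = pd.DataFrame({"a": [1.0, None, 3.0], "b": ["x", "y", None]})

# Calling the function directly and piping it are equivalent.
direct = nullity_matrix(df)
piped = df.pipe(nullity_matrix)

# Extra keyword arguments are forwarded to the piped function.
by_column = df.pipe(missing_fraction, axis=0)
```

The same pattern applies to any function that takes a DataFrame as its first argument, which is why df.pipe(mn.matrix) works without any wrapper.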
Thinking about this, it makes the most sense to add this if we wish to extend its functionality. I will have a think about how we could do this in janitor and post back.
from pyjanitor.
I am willing to work on this.
from pyjanitor.
Thanks @souravsingh! Looking forward to the PR.
from pyjanitor.
I assume there has been no PR for this? If so, I am happy to pick this up if still desired.
My only question is, how would this differ from just using missingno itself? What added functionality would we like to see? Thanks :)
from pyjanitor.
@JoshuaC3 thanks for pinging in on this one! At the moment, I haven't had the bandwidth to give this more thought, so I'm very open to discussing the most appropriate use cases. What ideas do you have?
A few points I can think of off the top of my head right now that might be relevant:
- Janitor DataFrames inherit directly from Pandas dataframes and merely add in data cleaning convenience methods, so janitorDFs and pandasDFs are extremely compatible.
- One thought I once had was "just wrap missingno": the key thing that Janitor provides is method chaining, BUT method chaining might not be a relevant pattern for data sanity checks.
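To make the second point concrete: a sanity check usually returns something other than a DataFrame (a plot, a number, nothing at all), so calling it mid-chain ends the chain. One possible workaround, sketched below under that assumption, is a pass-through helper that runs the check for its side effect and hands the DataFrame back. The names tap and record_missing are hypothetical and exist in neither pyjanitor nor missingno.

```python
import pandas as pd


def tap(df: pd.DataFrame, func, *args, **kwargs) -> pd.DataFrame:
    """Hypothetical helper: run func for its side effect, return df unchanged."""
    func(df, *args, **kwargs)
    return df


seen = []


def record_missing(df: pd.DataFrame) -> None:
    """Hypothetical check: record the total number of missing cells."""
    seen.append(int(df.isna().sum().sum()))


df = pd.DataFrame({"a": [1.0, None, 3.0], "b": ["x", "y", None]})

# The check runs before and after cleaning without breaking the chain.
cleaned = (
    df.pipe(tap, record_missing)
    .dropna()
    .pipe(tap, record_missing)
)
```

A plotting function such as mn.matrix could presumably be passed in place of record_missing in the same way, though whether that reads better than a plain function call is exactly the open question here.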
from pyjanitor.
This can be piped as well,
df.pipe(mn.matrix)
Oooh cool stuff! I didn't realize that, actually. Thanks for sharing this!
from pyjanitor.
@JoshuaC3 I think this should be shown inside an example Jupyter notebook! What do you think? We have a bunch of notebooks already present, and can show how we go from data that is dirty (and has nullity) to data that is clean and densely populated!
from pyjanitor.
Once the PR #6 in pyjanitor-examples is accepted, then I believe this issue can be closed as well.
from pyjanitor.
Thank you, @dancassin! 🎉
from pyjanitor.