Comments (5)
As a quick workaround:
Go to the location where the library is installed.
For me it was "C:\Users\user_name\Anaconda3\envs\project_3\Lib\site-packages\smogn". Open smoter.py in a text editor (Notepad or Notepad++) and replace the following:
    for j in range(d):
        data_new.iloc[:, j] = data_new.iloc[:, j].astype(feat_dtypes_orig[j])
with
    for j in range(d):
        data_new = data_new.fillna(data_new.median())
        data_new.iloc[:, j] = data_new.iloc[:, j].astype(feat_dtypes_orig[j])
This should work around the problem until the developer fixes it.
Note: I filled the NaNs with the column median because it worked for me.
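To illustrate why the patch helps, here is a minimal sketch on a toy DataFrame rather than the real smogn internals. The frame contents and the dtype list are made-up stand-ins; only the names data_new and feat_dtypes_orig come from the snippet above. It assumes the failure comes from casting a column that still contains NaN to an integer dtype, which pandas rejects with a ValueError (IntCastingNaNError in newer versions).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the synthetic frame built inside smoter.py.
data_new = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],   # originally an integer feature
    "b": [0.5, 1.5, np.nan],   # originally a float feature
})
# Hypothetical stand-in for the original feature dtypes.
feat_dtypes_orig = [np.dtype("int64"), np.dtype("float64")]

# Casting a column that still holds NaN to an integer dtype raises.
try:
    data_new.iloc[:, 0].astype(feat_dtypes_orig[0])
    cast_failed = False
except ValueError:
    cast_failed = True

# Fill the gaps with each column's median once, then restore the dtypes.
data_filled = data_new.fillna(data_new.median(numeric_only=True))
for j, dtype in enumerate(feat_dtypes_orig):
    col = data_filled.columns[j]
    data_filled[col] = data_filled[col].astype(dtype)

print(cast_failed)
print(data_filled.dtypes)
```

Filling once before the loop has the same effect as filling inside it. Whether the median is an appropriate fill value depends on the data, so treat this as a stopgap and not as a statistically sound fix.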
from smogn.
Unfortunately I could not solve the issue. I had to stay with my naive implementation of dropping columns containing nan values, which was only meant to be a backup solution.
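For reference, the backup approach of dropping NaN-containing columns can be sketched as follows. The DataFrame and its column names are hypothetical and only stand in for a resampled result.

```python
import numpy as np
import pandas as pd

# Hypothetical resampled frame with one column containing a missing value.
resampled = pd.DataFrame({
    "x": [1.0, 2.0, 3.0],
    "y": [0.1, np.nan, 0.3],
    "target": [10.0, 20.0, 30.0],
})

# List the affected columns first, so it is clear what gets discarded.
nan_columns = resampled.columns[resampled.isna().any()].tolist()

# Drop every column that contains at least one NaN.
cleaned = resampled.dropna(axis=1)

print(nan_columns)
print(cleaned.columns.tolist())
```

This discards whole features, which is why it is only a fallback.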
from smogn.
@mplutat Hello, and thank you for raising this issue. Is there something you could provide to reproduce the problem? This sounds like it might be an issue with the DataFrame. However, I cannot tell from this alone.
Also, I would not be able to responsibly comment on the effect of simply removing missing values, as I have not seen the distribution or the change in distribution.
from smogn.
I'm sorry for the delay. It took longer than expected to find and export the affected DataFrames.
Anyway, I identified one bin which, when over-sampled, contains missing values in the synthetic DataFrame.
I have attached the input and resampled bin DataFrames to this comment. In case the given DataFrames are not thorough enough, I have also attached the complete input DataFrame containing all values.
Thank you very much for your help.
dataframes.zip
from smogn.
Managed to solve?
I have the same problem.
from smogn.