Comments (8)
I'm learning too, so I can't be much help, but I think you can experiment with different settings and datasets and then see whether the results make sense to you.
As for memory, try Google Colab: you can add the line
!pip install ctgan
and get more memory plus a GPU for free.
Wow, that is totally amazing! THANKS!
Hello @apavlo89
The default number of epochs is 300.
If you want to know the default values for this and other arguments you can have a look at the API Reference section in our documentation: https://sdv-dev.github.io/CTGAN/api/ctgan.synthesizer.html#ctgan.synthesizer.CTGANSynthesizer
Thank you very much! Is there a specific reason for choosing 300 epochs as the default? Is there some kind of optimum metric for the number of epochs based on the database?
I assumed it was just a default number.
Not sure if this helps, but in the demo in the README you can set the number of epochs with ctgan.fit(data, discrete_columns, epochs=5),
or, in the CTGANSynthesizer class, you can adjust it within the fit function. Maybe play with the epochs to see what works best for your data?
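A rough sketch of what "playing with the epochs" could look like. Here `train_and_score` is a hypothetical function you would write yourself: it would fit the model with the given number of epochs (for example through the ctgan.fit(data, discrete_columns, epochs=...) call mentioned above) and return some quality score for the sampled data, higher meaning better. Neither `sweep_epochs` nor `train_and_score` is part of ctgan, and the stand-in scorer below is made up so the snippet runs on its own.

```python
def sweep_epochs(train_and_score, epoch_grid):
    """Try each epoch count and return (best_epochs, all_scores)."""
    scores = {}
    for epochs in epoch_grid:
        # One full training run per candidate value
        scores[epochs] = train_and_score(epochs)
    # Pick the epoch count with the highest score
    best_epochs = max(scores, key=scores.get)
    return best_epochs, scores


def fake_train_and_score(epochs):
    # Stand-in for real training: an invented curve that peaks at 100 epochs.
    # Replace with your own fit + evaluation code.
    return -abs(epochs - 100)


best, scores = sweep_epochs(fake_train_and_score, [5, 50, 100, 300])
print(best, scores)
```

Each value in the grid costs a full training run, so it probably makes sense to start with a few small values before trying anything near the default.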
I'm quite new to machine learning, especially neural network techniques, so would you say there's a pattern to look for during or after a few epochs? What am I aiming for? I'd say that after epoch 150 my Loss D and Loss G values were hovering within a specific range.
My computer then ran out of RAM at epoch 215. At that epoch the generator and discriminator losses were Loss G: 1.6974, Loss D: -77.0800.
It gave me the error DefaultCPUAllocator: not enough memory: you tried to allocate 2709625764 bytes. Buy new RAM! :(
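One way to put a number on "hovering around a specific range" is to compare the average loss over the most recent epochs with the average over the epochs just before them. This is only a heuristic sketch: `has_plateaued`, the window size, and the tolerance are all assumptions chosen for illustration, not something ctgan provides, and GAN losses are noisy enough that a flat loss does not by itself mean the synthetic data is good.

```python
from statistics import mean


def has_plateaued(losses, window=20, tolerance=0.05):
    """Return True if the mean of the last `window` losses is within
    `tolerance` (relative) of the mean of the `window` losses before it."""
    if len(losses) < 2 * window:
        return False  # not enough history to compare two windows
    recent = mean(losses[-window:])
    previous = mean(losses[-2 * window:-window])
    scale = max(abs(previous), 1e-12)  # avoid dividing by zero
    return abs(recent - previous) / scale <= tolerance


# Made-up loss history: falls for 50 epochs, then stays flat for 40
falling_then_flat = [100 - i for i in range(50)] + [50.0] * 40
still_falling = [100 - i for i in range(40)]

print(has_plateaued(falling_then_flat))
print(has_plateaued(still_falling))
```

You would run the same check separately on the recorded Loss G and Loss D values, and still inspect the sampled data itself.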
Thank you very much! Is there a specific reason for choosing 300 epochs as the default? Is there some kind of optimum metric for the number of epochs based on the database?
The default values for the model hyperparameters are, in most cases, the ones that were used to generate the results in the paper.
Regarding the value 300 in particular, the number was decided based on the performance obtained on the different datasets that were used for benchmarking, but different datasets might require different settings. In most cases, a lower number of epochs, just a few dozen, can be more than enough to explore a particular problem a bit faster and get an idea of what the model can do on your data. However, if you want to get the most out of the model, you will probably need to tweak it a little and find the optimal value for each dataset you work on.
It gave me the error DefaultCPUAllocator: not enough memory: you tried to allocate 2709625764 bytes. Buy new RAM! :(
Yeah, that's indeed quite an annoying error message to get, but it comes directly from PyTorch. There isn't much that we can do about it!
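For a sense of scale, the size in that message can be converted by hand. The short calculation below uses only the byte count quoted in the error; the float32 element count is an assumption (4 bytes per value, PyTorch's default floating-point type), since the message does not say what the tensor contained.

```python
requested_bytes = 2_709_625_764  # taken from the error message above

GIB = 1024 ** 3  # bytes in one gibibyte
requested_gib = requested_bytes / GIB

# Assumption: the tensor holds 4-byte float32 values
float32_elements = requested_bytes // 4

print(f"{requested_gib:.2f} GiB")              # 2.52 GiB
print(f"{float32_elements:,} float32 values")  # 677,406,441 float32 values
```

So this was a single allocation of roughly 2.5 GiB on top of whatever the process was already holding.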
Closing this, as the question has already been answered.