henrysky / astroNN
Deep Learning for Astronomers with Tensorflow
Home Page: http://astronn.readthedocs.io/
License: MIT License
Running the BNN test() method multiple times in a row (on the 7th run?) raises an odd error complaining that the shape or the number of dimensions is wrong. It can be reproduced on both CPU and GPU, on my Windows machine and on the astronomy department Linux server.
This bug was first discovered while running the open/globular clusters benchmark, because I need to run the BNN test() method for every cluster but was stopped by this bug.
ValueError Traceback (most recent call last)
<ipython-input-7-c36beede380f> in <module>()
57 print(np.sum(np.isnan(spec)))
58 print(name, ' and number of stars: ', indices.shape[0])
---> 59 pred, pred_var = bcnn.test(spec[1:])
60 means = np.mean(pred, axis=0)
61 mad_stds = mad_std(pred, axis=0)
d:\university\ast425\astronn\astroNN\models\BayesianCNNBase.py in test(self, input_data, inputs_err)
210 inputs_err[data_gen_shape:])
211 remainder_result = np.asarray(new.predict_generator(remainder_generator, steps=1))
--> 212 result = np.concatenate((result, remainder_result))
213
214 if result.ndim < 3: # in case only 1 test data point, in such case we need to add a dimension
ValueError: all the input arrays must have same number of dimensions
ValueError Traceback (most recent call last)
<ipython-input-8-b4056e9283f7> in <module>()
55 print(np.sum(np.isnan(spec)))
56 print(name, ' and number of stars: ', indices.shape[0])
---> 57 pred, pred_var = bcnn.test(spec[1:])
58 means = np.mean(pred, axis=0)
59 mad_stds = mad_std(pred, axis=0)
d:\university\ast425\astronn\astroNN\models\BayesianCNNBase.py in test(self, input_data, inputs_err)
218
219 predictions = result[:, :half_first_dim, 0] # mean prediction
--> 220 mc_dropout_uncertainty = result[:, :half_first_dim, 1] * (self.labels_std ** 2) # model uncertainty
221 predictions_var = np.exp(result[:, half_first_dim:, 0]) * (self.labels_std ** 2) # predictive uncertainty
222
ValueError: operands could not be broadcast together with shapes (1,5075) (25,)
The cause is unknown, but the BNN test_old() method is unaffected.
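The first traceback comes from np.concatenate receiving arrays with different numbers of dimensions. The sketch below reproduces only that message with made-up shapes and shows one defensive way to restore a missing batch axis. The helper name append_remainder and the array sizes are assumptions for illustration; this is not astroNN's code and does not claim to identify the actual cause.

```python
import numpy as np

def append_remainder(result, remainder):
    """Concatenate a leftover batch onto earlier predictions along axis 0.

    Hypothetical helper: if the leftover batch held a single sample and lost
    its leading axis, restore that axis before concatenating.
    """
    result = np.asarray(result)
    remainder = np.asarray(remainder)
    if remainder.ndim == result.ndim - 1:
        remainder = remainder[np.newaxis, ...]
    return np.concatenate((result, remainder), axis=0)

full_batches = np.zeros((4, 25, 2))   # 4 samples, 25 outputs, 2 channels (made-up sizes)
single_left = np.ones((25, 2))        # one leftover sample without a batch axis

try:
    np.concatenate((full_batches, single_left))
except ValueError as err:
    print("plain concatenate fails:", err)

combined = append_remainder(full_batches, single_left)
print(combined.shape)  # (5, 25, 2)
```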
I was considering doing a few demos with the Galaxy10 dataset but noticed that the Galaxy10.h5 file linked here has 21785 images and not the 25753 stated on the webpage. Was this a typo or are some images missing?
Thanks for assembling this fun toy dataset :)
But these introductory examples are buggy. As a beginner in deep learning, it is not obvious to me how to correct even simple bugs.
Those notebooks are very old and are not working anymore.
OS Platform and Distribution: macOS Big Sur (same on Binder)
astroNN (Build or Version): master
Did you try the latest astroNN commit?: I have done git clone from master
TensorFlow installed from (source or binary, official build?): pip install
TensorFlow version: tensorflow 2.12.0
Python version: Python 3.9.16
Exact command/script to reproduce (if applicable):
Describe the problem clearly here. Be sure to describe here why it's a bug in astroNN (instead of Tensorflow's problem) or a feature request.
Among the 4 examples
After a minor numpy format correction, I found that in Uncertainty_Demo_quad.ipynb the generator generate_train_batch(x, y, y_err) is not accepted by model.fit(); moreover, the proposed model.fit_generator() is no longer accepted by Tensorflow.
In the section Third, use a single model to get both epistemic and aleatoric uncertainty with variational inference
I tried to skip the generator by providing directly the data not involving any generator, but the data format was not accepted.
the_in,the_out = next(generator)
model.fit(the_in,the_out, epochs=20, max_queue_size=20, verbose=0,
steps_per_epoch= x.shape[0] // batch_size)
I have no deep knowledge in Tensorflow to understand the data format error.
TypeError: You are passing KerasTensor(type_spec=TensorSpec(shape=(), dtype=tf.float32, name=None), name='Placeholder:0', description="created by layer 'tf.cast_2'"), an intermediate Keras symbolic input/output, to a TF API that does not allow registering custom dispatchers, such as `tf.cond`, `tf.function`, gradient tapes, or `tf.map_fn`. Keras Functional model construction only supports TF API calls that *do* support dispatching, such as `tf.math.add` or `tf.reshape`. Other APIs cannot be called directly on symbolic Kerasinputs/outputs. You can work around this limitation by putting the operation in a custom Keras layer `call` and calling that layer on this symbolic input/output.
I hope you can quickly fix these simple examples so that I can start from a simple working example.
Many thanks.
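For readers hitting the generator part of this report: Model.fit in TensorFlow 2 accepts a Python generator that yields (inputs, targets) tuples. The sketch below shows only that data format with NumPy. The name batch_generator and the toy data are assumptions, it is not the notebook's generate_train_batch, and it does not address the KerasTensor TypeError quoted above.

```python
import numpy as np

def batch_generator(x, y, batch_size, seed=0):
    """Yield (inputs, targets) tuples forever, reshuffling on each pass."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    while True:
        order = rng.permutation(n)
        for start in range(0, n - batch_size + 1, batch_size):
            idx = order[start:start + batch_size]
            yield x[idx], y[idx]

# Toy data: a straight line, purely for illustration
x = np.linspace(0.0, 1.0, 100).reshape(-1, 1)
y = 2.0 * x + 1.0

gen = batch_generator(x, y, batch_size=20)
inputs, targets = next(gen)
print(inputs.shape, targets.shape)  # (20, 1) (20, 1)

# With a compiled tf.keras model (not built here), the TF 2 call would be:
# model.fit(gen, steps_per_epoch=x.shape[0] // 20, epochs=20, verbose=0)
```

steps_per_epoch matters here because the generator never ends; when plain arrays are passed to fit instead, batch_size is the argument to set.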
Since Tensorflow 1.5.0, Keras is an official part of the Tensorflow API (tensorflow.keras). astroNN should support both keras and tensorflow.keras.
What is done?
tensorflow
What is not done?
keras
keras
keras
A relevant discussion on Keras github
keras
or tensorflow.keras
)keras
or tensorflow.keras
When I run the odeint example on tensorflow 2.2.0 I get the error:
File "C:\Users\jhsmi\pp\astroNN\astroNN\neuralode\dop853.py", line 177, in dopri853core
if tf.equal(hmax, 0.0):
File "C:\Users\jhsmi\Miniconda3\envs\py37_tf_dev\lib\site-packages\tensorflow\python\framework\ops.py", line 778, in __bool__
self._disallow_bool_casting()
File "C:\Users\jhsmi\Miniconda3\envs\py37_tf_dev\lib\site-packages\tensorflow\python\framework\ops.py", line 545, in _disallow_bool_casting
"using a `tf.Tensor` as a Python `bool`")
File "C:\Users\jhsmi\Miniconda3\envs\py37_tf_dev\lib\site-packages\tensorflow\python\framework\ops.py", line 532, in _disallow_when_autograph_enabled
" decorating it directly with @tf.function.".format(task))
tensorflow.python.framework.errors_impl.OperatorNotAllowedInGraphError: using a `tf.Tensor` as a Python `bool` is not allowed: AutoGraph did not convert this function. Try decorating it directly with @tf.function.
It works fine for me on TF 2.1.0
If I have an ODE function for example like this:
import tensorflow as tf

class ODE(object):
    def __init__(self, k1, k2):
        self.k1, self.k2 = k1, k2

    def __call__(self, y, t):
        d_1 = - self.k1 * y[0] + self.k2 * y[1]
        d_2 = self.k1 * y[0] - self.k2 * y[1]
        return tf.stack([d_1, d_2])

ode_func = ODE(3., 5.)
And if I now would like to do this in parallel over k1, k2, would this be the way to do it?
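import numpy as np
import tensorflow as tf
# astroNN import paths below are assumed from the astroNN documentation
from astroNN.neuralode import odeint
from astroNN.shared.nn_tools import cpu_fallback, gpu_memory_manage

class ODE(object):
    def __init__(self, k1, k2):
        self.k1, self.k2 = k1, k2
        self.size = len(k1)

    def __call__(self, y, t):
        # y holds all first components, followed by all second components
        d_1 = - self.k1 * y[:self.size] + self.k2 * y[self.size:]
        d_2 = self.k1 * y[:self.size] - self.k2 * y[self.size:]
        return tf.concat([d_1, d_2], axis=0)

cpu_fallback()
gpu_memory_manage()
k1 = tf.constant(np.arange(1., 6), dtype=tf.float64)
k2 = tf.constant(np.arange(1., 6)[::-1], dtype=tf.float64)
ode_func = ODE(k1, k2)
NUM_SAMPLES = 100
# np.float was removed from recent NumPy releases; np.float64 is the explicit equivalent
y_init = tf.concat([np.ones(5, dtype=np.float64), np.zeros(5, dtype=np.float64)], axis=0)
t = tf.constant(np.linspace(0., 10., num=NUM_SAMPLES), dtype=tf.float64)
f = ODE(k1, k2)
y = odeint(f, y_init, t, precision=tf.float64)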
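One way to check the stacked layout before handing it to a solver is to compare the batched right-hand side against the single-pair version evaluated one (k1, k2) pair at a time. The sketch below does this in plain NumPy; rhs_single and rhs_batched are hypothetical stand-ins for the two ODE classes above, and nothing here calls astroNN's odeint.

```python
import numpy as np

def rhs_single(y, k1, k2):
    """Right-hand side for one (k1, k2) pair; y has two components."""
    d_1 = -k1 * y[0] + k2 * y[1]
    d_2 = k1 * y[0] - k2 * y[1]
    return np.array([d_1, d_2])

def rhs_batched(y, k1, k2):
    """Right-hand side for many pairs; y = [all first comps, all second comps]."""
    size = len(k1)
    first, second = y[:size], y[size:]
    d_1 = -k1 * first + k2 * second
    d_2 = k1 * first - k2 * second
    return np.concatenate([d_1, d_2], axis=0)

k1 = np.arange(1.0, 6.0)
k2 = np.arange(1.0, 6.0)[::-1]
y = np.concatenate([np.ones(5), np.zeros(5)])

batched = rhs_batched(y, k1, k2)
# Evaluate each pair separately; looped has shape (5, 2)
looped = np.array(
    [rhs_single(np.array([y[i], y[i + 5]]), k1[i], k2[i]) for i in range(5)]
)
matches = np.allclose(batched[:5], looped[:, 0]) and np.allclose(batched[5:], looped[:, 1])
print(matches)  # True
```

The derivatives agree, so the stacking itself is consistent. A solver with adaptive step size would presumably pick its steps for the stacked system as a whole rather than per pair, which is worth keeping in mind when the rate constants differ widely.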
Hi, Henry. I've got a well-trained astroNN model, but I want to do some transfer learning to adapt it to another survey. What I've done is remove the top dense layer of the base model and build a new dense layer, but now it can only be treated like an ordinary keras model. By the way, the base model itself is a custom model under the parent class "BayesianCNNBase".
I'm wondering:
Thank you!
The current .h5 dataset loading mechanism is problematic because astroNN loads the whole dataset into memory regardless of its size. This will eventually be a serious problem when the dataset is large and memory is limited (it is already a minor problem when loading the APOGEE training data, ~12GB, on my laptop and desktop with 16GB RAM).
Irrelevant
The neural network/data generator should talk to H5Loader directly, instead of H5Loader loading the whole dataset into memory and passing it to the neural network/data generator.
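A minimal sketch of the idea, assuming the data source supports len() and slicing: each batch is read only when the consumer asks for it. An h5py.Dataset behaves this way and reads just the requested rows from disk; the demo uses a NumPy array as a stand-in so that it runs without a file. The name iter_batches is hypothetical and this is not how H5Loader is implemented.

```python
import numpy as np

def iter_batches(dataset, batch_size):
    """Yield consecutive slices of `dataset` so only one batch is in memory.

    `dataset` can be anything that supports len() and slicing, for example
    an h5py.Dataset or, as in the demo below, a NumPy array.
    """
    n = len(dataset)
    for start in range(0, n, batch_size):
        yield np.asarray(dataset[start:start + batch_size])

demo = np.arange(20).reshape(10, 2)   # stand-in for an on-disk dataset
shapes = [batch.shape for batch in iter_batches(demo, batch_size=4)]
print(shapes)  # [(4, 2), (4, 2), (2, 2)]
```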
Hi, thanks for sharing this great implementation on github! Nice work.
I ran your notebook Uncertainty_Demo_MNIST.ipynb.
However, I cannot get the same results as shown in the notebook output. The losses I got are all nan.
Could you suggest why?
The output I got from the second cell (Train the neural network on MNIST training set):
Number of Training Data: 54000, Number of Validation Data: 6000
====Message from Normalizer====
You selected mode: 255
Featurewise Center: False
Datawise Center: False
Featurewise std Center: False
Datawise std Center: False
====Message ends====
====Message from Normalizer====
You selected mode: 0
Featurewise Center: False
Datawise Center: False
Featurewise std Center: False
Datawise std Center: False
====Message ends====
Sorry but there is a known issue of the loss not handling loss correctly. I will fix it in May-- Henry 19 April 2018
Epoch 1/5
- 163s - loss: nan - output_loss: nan - variance_output_loss: nan - output_categorical_accuracy: 0.0980 - val_loss: nan - val_output_loss: nan - val_variance_output_loss: nan - val_output_categorical_accuracy: 0.0991
Epoch 2/5
- 159s - loss: nan - output_loss: nan - variance_output_loss: nan - output_categorical_accuracy: 0.0987 - val_loss: nan - val_output_loss: nan - val_variance_output_loss: nan - val_output_categorical_accuracy: 0.1047
Epoch 00002: ReduceLROnPlateau reducing learning rate to 0.0024999999441206455.
Epoch 3/5
- 157s - loss: nan - output_loss: nan - variance_output_loss: nan - output_categorical_accuracy: 0.1001 - val_loss: nan - val_output_loss: nan - val_variance_output_loss: nan - val_output_categorical_accuracy: 0.0971
Epoch 00003: ReduceLROnPlateau reducing learning rate to 0.0012499999720603228.
Epoch 4/5
- 157s - loss: nan - output_loss: nan - variance_output_loss: nan - output_categorical_accuracy: 0.0967 - val_loss: nan - val_output_loss: nan - val_variance_output_loss: nan - val_output_categorical_accuracy: 0.1008
Epoch 00004: ReduceLROnPlateau reducing learning rate to 0.0006249999860301614.
Epoch 5/5
- 157s - loss: nan - output_loss: nan - variance_output_loss: nan - output_categorical_accuracy: 0.0998 - val_loss: nan - val_output_loss: nan - val_variance_output_loss: nan - val_output_categorical_accuracy: 0.1003
Epoch 00005: ReduceLROnPlateau reducing learning rate to 0.0003124999930150807.
Completed Training, 794.97s in total
Thanks!
astroNN's generator is already thread safe.
It is a known issue on Windows caused by Python. It will probably work on Linux/MacOS.
So far the only issue is that the CPU can't generate data fast enough for a fast GPU (GTX970 or above and a CPU with at least 4 threads).
Only necessary when you are using BCNN with GPU training.
Link: matterport/Mask_RCNN#13
Link: keras-team/keras#6582
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-2-17f261cd711f> in <module>()
2 bcnn = Apogee_BCNN()
3 bcnn.max_epochs = 75
----> 4 bcnn.train(x,y,x_err,y_err)
d:\university\ast425\astronn\astroNN\models\Apogee_BCNN.py in train(self, input_data, labels, inputs_err, labels_err)
111 validation_steps=self.val_num // self.batch_size,
112 epochs=self.max_epochs, verbose=2, workers=os.cpu_count(),
--> 113 callbacks=[reduce_lr, csv_logger], use_multiprocessing=True)
114
115 # Call the post training checklist to save parameters
~\Anaconda3\lib\site-packages\keras\legacy\interfaces.py in wrapper(*args, **kwargs)
89 warnings.warn('Update your `' + object_name +
90 '` call to the Keras 2 API: ' + signature, stacklevel=2)
---> 91 return func(*args, **kwargs)
92 wrapper._original_function = func
93 return wrapper
~\Anaconda3\lib\site-packages\keras\engine\training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
2097 val_enqueuer = GeneratorEnqueuer(validation_data,
2098 use_multiprocessing=use_multiprocessing,
-> 2099 wait_time=wait_time)
2100 val_enqueuer.start(workers=workers, max_queue_size=max_queue_size)
2101 validation_generator = val_enqueuer.get()
Detect user's OS and enable multiprocessing in fit_generator on MacOS and Linux
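The proposal above could look roughly like this, using only the standard library. The function name multiprocessing_allowed is an assumption and this is not astroNN's implementation; the resulting flag is the kind of value that would be passed as use_multiprocessing.

```python
import platform

def multiprocessing_allowed(system_name=None):
    """Return True on Linux and macOS, False on Windows and anything else."""
    name = platform.system() if system_name is None else system_name
    # platform.system() reports macOS as "Darwin"
    return name in ("Linux", "Darwin")

use_multiprocessing = multiprocessing_allowed()
print(use_multiprocessing)
```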
Hello, thank you for your work!
Does astroNN work with tensorflow 2.4.1?
Because whenever I import a module I get
cannot import name 'get_default_session' from 'tensorflow'
For example I am trying to do
from astroNN.models.apogee_models import ApogeeBCNN
thank you in advance, Lucia
astroNN Gaia DR2 parallax zero-point offset with deep learning
Gaia DR2 calculates it as −0.029 mas.
Sloan Digital Sky Survey Apogee calculates it as −0.0523 mas.
Modified parallax = parallax - zero point offset
Data model: apogee_astroNN provides spectro-photometric deep learning parsec distances.
Distance in parsecs to the Orion Nebula for star classes BA, Fd, GKd and GKg pretty much agree. But astroNN appears to produce 4-5 times larger distances for Md and Mg stars.
Parsecs calculated with parallax zero point offset options:
Parsec- no offset
Dist - Apogee Deep Learning
DistApogee - use Apogee offset
DistGaia - use Gaia offset
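The offset options listed above can be written out as a short calculation, assuming parallaxes in milliarcseconds and the simple inverse-parallax distance (distance in pc = 1000 / parallax in mas). The function name and the example parallax of 2.5 mas are illustrative assumptions, not values taken from the data discussed here.

```python
def corrected_distance_pc(parallax_mas, zero_point_mas=0.0):
    """Distance in parsecs from a parallax in milliarcseconds.

    Applies: modified parallax = parallax - zero point offset,
    then distance [pc] = 1000 / modified parallax [mas].
    """
    modified = parallax_mas - zero_point_mas
    if modified <= 0:
        raise ValueError("corrected parallax must be positive to invert")
    return 1000.0 / modified

parallax = 2.5  # mas, illustrative value only
for label, offset in [("no offset", 0.0), ("Gaia", -0.029), ("APOGEE", -0.0523)]:
    print(f"{label:>9}: {corrected_distance_pc(parallax, offset):.1f} pc")
```

Because both offsets are negative, subtracting them increases the parallax and shortens the inferred distance by a few parsecs at this range, which is far too small to explain a factor of 4-5.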
Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached. Try to provide a reproducible test case that is the bare minimum necessary to generate the problem.
Optional, if you have any idea how to fix the issue
Thank you for this lovely library first and foremost.
I am trying to access the Galaxy10 DECals dataset (as opposed to the SDSS one) without using the h5 reader as I want to use it as a colab demo.
I've both run ! pip install astroNN and tried cloning directly into the colab following your instructions on this commit: 9dcd394
Despite that, using load_galaxy10 still seems to be loading the SDSS dataset and not the DECals. Do you have any guidance?
I've looked at your code and I can't see why it's loading the old dataset.
Maybe the issue is in imports?
# Import statements
from astroNN.datasets import load_galaxy10
from tensorflow.keras import utils
# To load images and labels (will download automatically at the first time)
# labels corresponds to galaxy classes as specified by Galaxy Zoo
images, labels = load_galaxy10()
Thank you so much for your help!
"To load images and labels (will download automatically at the first time)"
"# First time downloading location will be ~/.astroNN/datasets/"
images, labels = load_galaxy10()
I am trying to load the galaxy10 dataset using astroNN but I am getting the following error:
URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate(_ssl.c:1131)>
Does anyone know why this is? Thanks in advance.
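This error usually means Python could not find a certificate authority bundle to verify the server against. A first diagnostic step, sketched below with the standard library only, is to print where the ssl module looks for certificates and whether those paths exist. The helper name describe_ca_paths is hypothetical, and the sketch only diagnoses; it does not change any settings.

```python
import os
import ssl

def describe_ca_paths():
    """Return the CA file/directory Python's ssl module checks by default."""
    paths = ssl.get_default_verify_paths()
    return {
        "cafile": paths.cafile,
        "cafile_exists": bool(paths.cafile) and os.path.exists(paths.cafile),
        "capath": paths.capath,
        "capath_exists": bool(paths.capath) and os.path.isdir(paths.capath),
    }

info = describe_ca_paths()
for key, value in info.items():
    print(f"{key}: {value}")
```

If neither path exists, the Python installation has no usable CA bundle. On python.org builds for macOS, running the bundled "Install Certificates.command" is the usual remedy.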
Hello and thank you for sharing your work.
I want to classify images with color depth with a Bayesian Neural Network.
Though, with this model, I am getting a dimensions error:
Input 0 of layer max_pooling1d_13 is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: (None, 75, 75, 3)
My input is a dataset loaded with
training_dataset = tf.keras.preprocessing.image_dataset_from_directory
and converted to tensors with
images, labels = next(iter(training_dataset))
so I am trying to train the model with
bcnn_net = ApogeeBCNN()
bcnn_net.fit(images, labels )
Why am I getting this error? Is there a specific way to pass the data?
Thank you, Lucia
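The error message says a MaxPooling1D layer received a 4D tensor, which suggests the model is built from 1D layers while the input is a batch of colour images. In Keras, 1D convolution and pooling layers take (batch, steps, channels) input and 2D ones take (batch, height, width, channels). The sketch below only illustrates that rank check; conv_family_for is a hypothetical helper and the spectrum length is made up.

```python
import numpy as np

def conv_family_for(batch):
    """Map a batch's rank to the Keras layer family that accepts it."""
    ndim = np.asarray(batch).ndim
    if ndim == 3:
        return "1D layers (Conv1D, MaxPooling1D): (batch, steps, channels)"
    if ndim == 4:
        return "2D layers (Conv2D, MaxPooling2D): (batch, height, width, channels)"
    return f"no standard convolution family for ndim={ndim}"

images = np.zeros((8, 75, 75, 3))   # colour images, as in the error message
spectra = np.zeros((8, 7514, 1))    # made-up 1D spectra with a channel axis
print(conv_family_for(images))
print(conv_family_for(spectra))
```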