Comments (8)
Now my code is:
Building
# imports this snippet relies on (assumed; they were not shown in the original post)
from tensorflow.keras.initializers import RandomNormal
from tensorflow.keras.layers import Dense, Input, concatenate
from tensorflow.keras.models import Model
from astroNN.models import load_folder
from astroNN.models.base_bayesian_cnn import BayesianCNNBase
from astroNN.nn.losses import mse_lin_wrapper, mse_var_wrapper

class Noah_transfer(BayesianCNNBase):
    def __init__(self, lr=0.0005, dropout_rate=0.2):
        super().__init__()
        self.initializer = RandomNormal(mean=0.0, stddev=0.05)
        self.max_epochs = 50
        self.lr = lr
        self.reduce_lr_epsilon = 0.00005
        self.reduce_lr_min = 1e-8
        self.reduce_lr_patience = 2
        self.l2 = 1e-9
        self.dropout_rate = dropout_rate
        self.input_norm_mode = 3
        self.task = 'regression'

    def model(self):
        input_tensor = Input(shape=self._input_shape['input'], name='input')
        labels_err_tensor = Input(shape=self._labels_shape['output'], name='labels_err')
        # load the pre-trained model and cut it off at the 'dense_1' layer
        noah = load_folder('Noah_giant')
        base_model = Model(inputs=noah.keras_model.input,
                           outputs=noah.keras_model.get_layer('dense_1').output)
        base_model.trainable = False
        x = base_model([input_tensor], training=False)
        # new prediction and predictive-variance heads on top of the frozen base
        output = Dense(units=self._labels_shape['output'],
                       activation='linear',
                       name='output')(x)
        variance_output = Dense(units=self._labels_shape['output'],
                                activation='linear',
                                name='variance_output')(x)
        model = Model(inputs=[input_tensor, labels_err_tensor], outputs=[output, variance_output])
        model_prediction = Model(inputs=[input_tensor], outputs=concatenate([output, variance_output]))
        variance_loss = mse_var_wrapper(output, labels_err_tensor)
        output_loss = mse_lin_wrapper(variance_output, labels_err_tensor)
        return model, model_prediction, output_loss, variance_loss
Training
noah_transfer = Noah_transfer()
noah_transfer.task = 'regression'
noah_transfer.fit(input_data=x_train,
                  labels=y_train,
                  inputs_err=x_train_err,
                  labels_err=y_train_err)
Both "model" and "model_prediction" print fine with summary(), but training raises an error:
Layer "model_2" expects 2 input(s), but it received 1 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=(None, 4500, 1) dtype=float32>]
Call arguments received:
• inputs={'input': 'tf.Tensor(shape=(None, 4500, 1), dtype=float32)', 'input_err': 'tf.Tensor(shape=(None, None, None), dtype=float32)', 'labels_err': 'tf.Tensor(shape=(None, 11), dtype=float32)'}
• training=True
• mask=None
It seems that the labels are not being passed into training, and model_2 (which is "model" in this code) received only one input (which seems to be x_train).
from astronn.
Sorry for the late reply.
I have added a function as a first step to solve your issue. The new function transfer_weights() should transfer all the weights to a new model (except the input and possibly the output layers) and set the transferred weights as non-trainable (so when you train on the other survey, only the input/output layers are trained; the middle layers are not). To use this new function, run git pull to pull the latest commit to your computer.
Here is an example:
from astroNN.models import ApogeeBCNN
# a model trained on the original survey
bneuralnet = ApogeeBCNN()
bneuralnet.fit(xdata, ydata)
# another astroNN model
bneuralnet2 = ApogeeBCNN()
# just to initialize the model with the correct input and output shape
bneuralnet2.max_epochs = 1
bneuralnet2.fit(xdata_another_survey, ydata_another_survey)
# transfer all the weights except layers with incompatible shape
bneuralnet2.transfer_weights(bneuralnet)
# training for real, the middle part of the model is not trainable
bneuralnet2.max_epochs = 60
bneuralnet2.fit(xdata_another_survey, ydata_another_survey)
# now bneuralnet2 is your new astroNN model transferred to another survey with the same architecture as the original survey
from astronn.
Thank you for your reply.
The two of us seem to have different ideas: your way is to transfer the weights of the base model, while mine is to transfer the whole base model. The transfer_weights() function is a clever and effective way to do transfer learning, and it should be enough for me for now.
But I still have some doubts:
- Why does the training step go wrong when all the associated models (noah, base_model, model, model_prediction) print fine?
- What if I want to splice two models together, or add new layers directly after a base model?
This may have something to do with your architecture and could be complicated to implement; I'm not sure. Anyway, thanks to your efforts it works now, and forgive me for irresponsibly leaving these doubts with you. I hope you can keep improving astroNN so that it benefits more users.
from astronn.
There are still some bugs.
When the output layer of my transferred model and that of the base model have the same number of nodes, the summary says that all of my params are non-trainable. But the weights of the transferred model's output layer should be trained.
The funny thing is that, if that were so, my loss should stay the same during training, but the loss kept getting smaller, which means the weights are still being trained. This behavior departs not only from what I want but also from the model summary.
On the other hand, when my output layer has a different number of nodes from the original model, the trainable params count is the sum of the params in the output and variance_output layers, which is right. But during training, it seems that all the params are still trained.
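The reasoning above (a model whose parameters are all frozen cannot change its loss) can be checked on a toy problem. The sketch below is plain Python, not astroNN or Keras code; the two-weight model, the `train` function, and the `trainable` flags are hypothetical names made up for illustration.

```python
# Toy model: prediction = w_out * (w_mid * x), trained with plain gradient descent.
# All names here are hypothetical; this is not how astroNN or Keras implement freezing.

def mse(weights, data):
    """Mean squared error of the toy model on (x, y) pairs."""
    w_mid, w_out = weights["mid"], weights["out"]
    return sum((w_out * w_mid * x - y) ** 2 for x, y in data) / len(data)


def train(weights, trainable, data, lr=0.01, epochs=20):
    """Run gradient descent, updating only the weights flagged as trainable.

    Returns the loss after each epoch (index 0 is the loss before training).
    """
    weights = dict(weights)  # work on a copy so the caller's weights are untouched
    history = [mse(weights, data)]
    n = len(data)
    for _ in range(epochs):
        w_mid, w_out = weights["mid"], weights["out"]
        # analytic gradients of the MSE with respect to each weight
        grad_mid = sum(2 * (w_out * w_mid * x - y) * w_out * x for x, y in data) / n
        grad_out = sum(2 * (w_out * w_mid * x - y) * w_mid * x for x, y in data) / n
        if trainable["mid"]:
            weights["mid"] -= lr * grad_mid
        if trainable["out"]:
            weights["out"] -= lr * grad_out
        history.append(mse(weights, data))
    return history


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # samples of y = 2x
start = {"mid": 1.0, "out": 0.5}

all_frozen = train(start, {"mid": False, "out": False}, data)  # nothing may move
head_only = train(start, {"mid": False, "out": True}, data)    # only the output weight moves
```

With every weight frozen, `all_frozen` holds the same loss value at every epoch, while `head_only` shrinks from epoch to epoch. A falling loss alongside a summary that reports zero trainable params is therefore a sign that the freeze is not being applied during training.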
from astronn.
Yes, it does seem that supposedly non-trainable parameters still get trained somehow. I am still investigating what is going on, but most likely I need to set them to be non-trainable before compiling the model.
As for the output layer, the current strategy is to transfer all weights with compatible shape (i.e. if the shape of the weights is the same for a layer, then transfer those weights). I think what you want is to train only the input layer? Or you can force a different output shape so that the output layers won't get transferred (i.e. maybe train on T_eff and Log(g) for one survey and fe_h for another survey so the output shapes are different). I think there could be a case where you have a small overlap between two surveys; then you can use the spectra from survey B but train only the input layer with labels from the original survey A?
Regarding your questions from a few days ago, what do you mean by the training step going wrong? And yes, splicing/adding layers probably requires more work, but it's not undoable per se; we just need to make the simplest case work correctly first...
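To make the compatible-shape strategy concrete, here is a rough sketch in plain Python. It is not astroNN's actual implementation: the dict-based model representation, `make_model`, and `transfer_compatible` are hypothetical, and pairing layers by position is an assumption made only for this illustration.

```python
# Hypothetical sketch of "transfer weights when shapes match"; not astroNN's code.
# A model is an ordered dict: layer name -> {"shape", "weights", "trainable"}.

def make_model(n_input, n_output, fill):
    """Build a three-layer toy model whose weights are all set to `fill`."""
    shapes = {"input_dense": (n_input, 4), "hidden": (4, 4), "output": (4, n_output)}
    return {
        name: {"shape": shape, "weights": [fill] * (shape[0] * shape[1]), "trainable": True}
        for name, shape in shapes.items()
    }


def transfer_compatible(source, target):
    """Copy weights layer by layer (paired by position) where the shapes match.

    Transferred layers are marked non-trainable; other layers are left alone.
    Returns the names of the target layers that received weights.
    """
    transferred = []
    for src, (tgt_name, tgt) in zip(source.values(), target.items()):
        if src["shape"] == tgt["shape"]:
            tgt["weights"] = list(src["weights"])  # copy so the models share no state
            tgt["trainable"] = False
            transferred.append(tgt_name)
    return transferred


survey_a = make_model(n_input=6, n_output=2, fill=1.0)

# different input size and different number of labels: only the middle is transferred
survey_b = make_model(n_input=8, n_output=3, fill=0.0)
moved_b = transfer_compatible(survey_a, survey_b)

# same number of labels: the output layer matches too, so it is transferred and frozen
survey_c = make_model(n_input=8, n_output=2, fill=0.0)
moved_c = transfer_compatible(survey_a, survey_c)
```

In this toy setup `moved_b` contains only the hidden layer, while `moved_c` also contains the output layer. That second case mirrors the situation reported in this thread, where matching output shapes cause the output weights to be transferred and locked.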
from astronn.
Thank you for your patience and reply.
The training step failed because of the model splicing from a few days ago, but as you said, we should make the simplest case work first, so let's talk about it later.
What is really important is that I want to train both the input layer and the output layer, whether the output layers have the same shape or not. (For now they are the same, so the weights are transferred and "locked".)
The case is that I have a model trained on spectra from survey A with labels from survey B, and now I want to transfer this model and train it on spectra from survey C with labels from survey B. I don't know if it will work, but I want to give it a try.
from astronn.
I think I have fixed the issue of weights still being trained even after setting trainable=False. I have also added an argument exclusion_output=False so you can exclude output weights when transferring with transfer_weights(). You can check out the latest commit to see if it is working for you.
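One way to check a fix like this on your own model is to snapshot the weights before and after training and list the layers that moved. The helper below is a hypothetical plain-Python sketch, not part of astroNN; it assumes each layer's weights have already been flattened into a list of floats (with Keras, such a snapshot could be built from each layer's get_weights()). The layer names and numbers are made up.

```python
# Hypothetical helper, not part of astroNN: compare two weight snapshots.
# A snapshot maps layer name -> flat list of floats.

def changed_layers(before, after, tol=0.0):
    """Return the names of layers whose weights differ by more than `tol`."""
    changed = []
    for name, old in before.items():
        new = after[name]
        if len(old) != len(new) or any(abs(a - b) > tol for a, b in zip(old, new)):
            changed.append(name)
    return changed


# made-up snapshots taken before and after a training run
before = {"conv": [0.1, 0.2], "dense": [0.3], "output": [0.5, 0.6]}
after = {"conv": [0.1, 0.2], "dense": [0.3], "output": [0.45, 0.62]}

moved = changed_layers(before, after)

# layers that were supposed to stay frozen; any overlap with `moved` is a leak
frozen = ["conv", "dense"]
leaked = [name for name in moved if name in frozen]
```

Here only the output layer shows up in `moved` and `leaked` stays empty, which is what a working freeze should look like when the output weights are left trainable.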
from astronn.
Thank you for all the effort, it works now.
from astronn.