noaa-owp / lstm
License: Other
We need an example that uses all of the static attributes and forcings, so that the BMI functionality can be tested on every one of those items. For instance, to test the set_value functionality for the two wind speed directions, we need an example that takes those as forcing data.
The current implementation of the LSTM_BMI does not strictly follow the template from the CSDMS heat model example (https://github.com/csdms/bmi-example-python/tree/master/heat) or some of the other BMI Python models. For instance, in the snow model example by @SnowHydrology, the model inputs are attributes of the model itself: https://github.com/SnowHydrology/snowBMI/blob/0886ac4140c4ede8300f92e42b039267ef2eaa40/snow/snow.py#L136, whereas in the LSTM model, the model inputs are passed in as arguments to the model's forward function:
lstm/lstm/nextgen_cuda_lstm.py
Line 19 in bd0ccf3
Is it expected that the LSTM model conform to the examples by the CSDMS? This would require significant modification to the LSTM model implementation itself. There was, at some point, an expectation that BMI not require modifications to the specific model implementation, and that the BMI wrapper could be developed in a manner requiring minimal, or no, changes to the model itself.
One option is to give the model a pre-allocated input layer, e.g.:
self._input_layer = torch.tensor([0,0,...,0])
Then, during the initialization of the LSTM_BMI and during the set_value function, an in-place operation would override the input layer. This would allow the reference val = self.get_value_ptr(var_name), which could then allow the in-place operation val[:] = src.reshape(val.shape).
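The pattern described above can be sketched end to end. This is an illustration only: it uses NumPy in place of torch, and TinyBmi is a hypothetical stand-in class, not code from the repository:

```python
import numpy as np

class TinyBmi:
    """Sketch of the in-place update pattern: set_value mutates a
    pre-allocated backing array rather than rebinding it."""

    def __init__(self, n_inputs=4):
        # Pre-allocated backing array, standing in for self._input_layer.
        self._values = {"input_layer": np.zeros(n_inputs)}

    def get_value_ptr(self, var_name):
        # Return a reference (not a copy) so callers can mutate in place.
        return self._values[var_name]

    def set_value(self, var_name, src):
        val = self.get_value_ptr(var_name)
        # In-place assignment preserves the backing array's identity.
        val[:] = np.asarray(src, dtype=float).reshape(val.shape)

bmi = TinyBmi()
ptr = bmi.get_value_ptr("input_layer")
bmi.set_value("input_layer", [1.0, 2.0, 3.0, 4.0])
```

Because set_value writes through `val[:]`, any reference obtained earlier from get_value_ptr sees the new values, which is the behavior the in-place proposal relies on.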
This link is broken. It looks like the link just needs to be updated to ../trained_neuralhydrology_models/hourly_all_attributes_and_forcings/config.yml
lstm/bmi_config_files/README.md
Line 6 in 63116cc
For longevity, it might be best to use a permalink though:
I'm happy to open a PR to fix this; just let me know which one you all prefer.
Can predict less than zero runoff
Should predict no less than zero runoff.
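A minimal post-processing guard could clamp predictions at zero before they are exposed through BMI. This is a sketch; clamp_runoff is a hypothetical helper name, not a function in the repository:

```python
import numpy as np

def clamp_runoff(pred):
    """Clip negative runoff predictions to zero, elementwise."""
    return np.maximum(np.asarray(pred, dtype=float), 0.0)

out = clamp_runoff([-0.3, 0.0, 1.2])
```

Clamping treats the symptom only; if large negative values appear, the underlying unit or scaling bug discussed below still needs to be found.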
Both notebooks contain plots comparing observed and simulated outputs. The LSTM-BMI outputs are clearly incorrect. Let's check (hardcoded) variable units, timestep size/units, etc. See section [4] in run-lstm-with-bmi.ipynb.
An LSTM can be trained with any combination of forcings, so don't hard-code them. Let the BMI determine which forcings to include from the training configuration. We can still initialize everything, and simply not include unused forcings in the input tensor.
Current (hard-coded per forcing):
def initialize_forcings(self):
    #------------------------------------------------------------
    if 'total_precipitation' in self.cfg_train['dynamic_inputs']:
        self.total_precipitation = 0
    #------------------------------------------------------------
    if 'temperature' in self.cfg_train['dynamic_inputs']:
        self.temperature = 0
Proposed (generic loop):
def initialize_forcings(self):
    for forcing_name in self.cfg_train['dynamic_inputs']:
        setattr(self, forcing_name, 0)
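Assuming cfg_train is a dict-like training configuration, the generic loop behaves as follows. ForcingsDemo is a stand-in class for illustration, not code from the repository:

```python
class ForcingsDemo:
    """Sketch: initialize one attribute per trained forcing,
    with no forcing names hard-coded."""

    def __init__(self, cfg_train):
        self.cfg_train = cfg_train
        self.initialize_forcings()

    def initialize_forcings(self):
        # Whatever the model was trained with becomes an attribute,
        # initialized to zero.
        for forcing_name in self.cfg_train['dynamic_inputs']:
            setattr(self, forcing_name, 0)

demo = ForcingsDemo({'dynamic_inputs': ['total_precipitation', 'temperature']})
```

Training an LSTM with a different forcing list then requires no BMI code changes, only a different configuration file.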
The training configuration includes the specific attributes used during training, and these need to be included in forward mode. If a model is trained with a specific set of attributes as inputs, the BMI should be able to provide those, as indicated in the training configuration file.
Current (hard-coded per attribute):
def set_static_attributes(self):
    #------------------------------------------------------------
    if 'elev_mean' in self.cfg_train['static_attributes']:
        self.elev_mean = self.cfg_bmi['elev_mean']
        self.all_lstm_input_values_dict['elev_mean'] = self.cfg_bmi['elev_mean']
    #------------------------------------------------------------
    if 'slope_mean' in self.cfg_train['static_attributes']:
        self.slope_mean = self.cfg_bmi['slope_mean']
        self.all_lstm_input_values_dict['slope_mean'] = self.cfg_bmi['slope_mean']
Proposed (generic loop):
def set_static_attributes(self):
    """Get the static attributes from the configuration file."""
    for attribute in self._static_attributes_list:
        if attribute in self.cfg_train['static_attributes']:
            # This is probably the right way to do it,
            setattr(self, attribute, self.cfg_bmi[attribute])
            # and this is just in case.
            self.all_lstm_input_values_dict[attribute] = self.cfg_bmi[attribute]
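With stand-in configurations, the generic loop behaves as below. StaticAttrDemo and the sample attribute values are illustrative only:

```python
class StaticAttrDemo:
    """Sketch: set only the static attributes the model was
    actually trained with."""

    _static_attributes_list = ['elev_mean', 'slope_mean', 'area_gages2']

    def __init__(self, cfg_train, cfg_bmi):
        self.cfg_train = cfg_train
        self.cfg_bmi = cfg_bmi
        self.all_lstm_input_values_dict = {}
        self.set_static_attributes()

    def set_static_attributes(self):
        for attribute in self._static_attributes_list:
            # Skip attributes not present in the training configuration.
            if attribute in self.cfg_train['static_attributes']:
                setattr(self, attribute, self.cfg_bmi[attribute])
                self.all_lstm_input_values_dict[attribute] = self.cfg_bmi[attribute]

demo = StaticAttrDemo(
    cfg_train={'static_attributes': ['elev_mean']},
    cfg_bmi={'elev_mean': 250.0, 'slope_mean': 12.0},
)
```

Here slope_mean exists in the BMI configuration but is never set, because the training configuration does not list it.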
The examples only include four LSTM inputs:
dynamic_inputs:
static_attributes:
An example that includes all currently possible inputs:
dynamic_inputs:
static_attributes:
The BMI claims it wants temperature in Kelvin, but the model actually wants Celsius.
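Wherever the conversion ends up living, it is a one-liner that the BMI layer could apply before handing the value to the model. A sketch; kelvin_to_celsius is a hypothetical helper name:

```python
def kelvin_to_celsius(temp_k):
    """Convert the BMI-facing Kelvin value to the Celsius the model expects."""
    return temp_k - 273.15

t_c = kelvin_to_celsius(293.15)
```

Doing the conversion inside set_value would let the public BMI interface keep its advertised Kelvin units while feeding the model Celsius.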
There are two values for basin area
There should only be one value for basin area.
A scaler file is required in order to use an LSTM model other than the ones saved in the repository. @jmframe, where do these get generated? It wasn't clear from NeuralHydrology whether it produces this file. Can you provide any guidance on the steps to bring in a new LSTM model for inclusion?
Reminder to confirm the appropriate term for the input variable precip: atmosphere_water__time_integral_of_precipitation_mass_flux vs. atmosphere_water__liquid_equivalent_precipitation_rate, as mentioned/adjusted in PR #34.
The example currently only runs in a notebook and can't be run from the terminal.
It should be runnable from the terminal.
Input values are currently stored in two places. As individual attributes of the BMI model:
self.total_precipitation or self.atmosphere_water__liquid_equivalent_precipitation_rate
And as part of a dictionary:
self.all_lstm_input_values_dict['atmosphere_water__liquid_equivalent_precipitation_rate']
The value should be stored in one place or the other.
Also, we don't need the all_lstm_input_values_dict dictionary; we can just include everything in _values.
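A single-store design along those lines could keep every value only in _values and route all access through getters/setters. SingleStoreDemo is a hypothetical stand-in, not repository code:

```python
class SingleStoreDemo:
    """Sketch: each input lives in exactly one place, the _values dict,
    keyed by its standard long name."""

    def __init__(self):
        self._values = {}

    def set_value(self, var_name, value):
        self._values[var_name] = value

    def get_value(self, var_name):
        return self._values[var_name]

store = SingleStoreDemo()
store.set_value('atmosphere_water__liquid_equivalent_precipitation_rate', 2.5)
```

With one store there is no risk of the attribute and the dictionary drifting out of sync between set_value calls and input-tensor construction.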
The model state space persists throughout the simulation time.
NeuralHydrology trains the LSTM to reset the state space and run a full sequence length of simulation at each time step.
Either
The only option is to save and load the model states:
save the state at some point, then
load in the saved state and start from there.
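A minimal sketch of that save/restore cycle, assuming the hidden and cell states are available as arrays (the file path, function names, and shapes here are illustrative):

```python
import os
import tempfile
import numpy as np

def save_states(h, c, path):
    """Persist hidden (h) and cell (c) states so a later run can restart warm."""
    np.savez(path, h=h, c=c)

def load_states(path):
    """Load previously saved hidden and cell states."""
    data = np.load(path)
    return data['h'], data['c']

# Illustrative shapes: (num_layers, batch, hidden_size).
h0 = np.ones((1, 1, 64))
c0 = np.zeros((1, 1, 64))
path = os.path.join(tempfile.gettempdir(), 'lstm_states.npz')
save_states(h0, c0, path)
h1, c1 = load_states(path)
```

In the actual BMI, torch tensors would need to be converted to and from NumPy (or saved with torch's own serialization) at these boundaries.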
We need one output for this model. We can provide it in units of either ['mm', 'ft3 s-1', 'm3 s-1']. So decide on that, and then make it available through get_value. That requires associating the chosen output unit with 'land_surface_water__runoff_volume_flux'.
Nothing is currently associated with 'land_surface_water__runoff_volume_flux'.
_var_name_units_map = {'land_surface_water__runoff_volume_flux':['streamflow_cfs','ft3 s-1'], ... etc.}
def scale_output(self):
    ...
    self._values['land_surface_water__runoff_volume_flux'] = self.streamflow_cms * (1/35.314)
...OR...
    self.land_surface_water__runoff_volume_flux = self.streamflow_cms * (1/35.314)
etc.
There are some issues with the current implementations of the getter and setter functions. These are things that don't properly adhere to the BMI specification.
get_value
First, the signature of this function isn't correct. Right now it is:
def get_value(self, var_name):
But in the specification it is defined as:
def get_value(self, name: str, dest: np.ndarray) -> np.ndarray:
This directly prevents it from behaving correctly. It is expected to copy the values of the desired variable into the (already appropriately initialized) dest NumPy array.
Also related: the function right now is returning the backing NumPy array from get_value_ptr. It probably needs to return dest once that's added. Regardless, it shouldn't return a direct reference to the backing variable.
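A spec-conformant get_value copies into the caller-supplied dest and returns dest, never the backing array. A minimal sketch; GetValueDemo and the stored value are hypothetical:

```python
import numpy as np

class GetValueDemo:
    """Sketch of a BMI-conformant get_value built on get_value_ptr."""

    def __init__(self):
        self._values = {'land_surface_water__runoff_volume_flux': np.array([0.7])}

    def get_value_ptr(self, name):
        # Internal helper: direct reference to the backing array.
        return self._values[name]

    def get_value(self, name, dest):
        # Copy into the caller's pre-allocated array; return dest, not
        # the backing array.
        dest[:] = self.get_value_ptr(name).flatten()
        return dest

demo = GetValueDemo()
dest = np.zeros(1)
result = demo.get_value('land_surface_water__runoff_volume_flux', dest)
```

Because the caller only ever receives dest, it cannot accidentally mutate the model's internal state through the return value.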
get_value_at_indices
This will need to be modified after the changes to get_value. I think it's also making an unnecessary copy of the entire backing NumPy array before copying the right individual values into the supplied dest array.
set_value
This function has its required second parameter for containing the values to set, but appears to make invalid assumptions based on something beyond the BMI specification: e.g., that it is a singleton array. That in particular also seems to be contradicted by there being valid get_value_at_indices and set_value_at_indices functions (i.e., why not raise an exception to indicate there is only index 0, and thus the at_indices functions are unnecessary and inappropriate?).
set_value_at_indices
Again, this seems to try to account for the possibility of singleton arrays in ways that contradict needing at_indices functions, and that aren't strictly consistent with BMI.
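Taken together, set_value and set_value_at_indices can be written without any singleton-array assumptions. A sketch; SetValueDemo and the variable name are hypothetical stand-ins:

```python
import numpy as np

class SetValueDemo:
    """Sketch: set_value replaces the whole backing array in place;
    set_value_at_indices writes only the requested indices.
    Both work for arrays of any length."""

    def __init__(self, n=3):
        self._values = {'soil_moisture': np.zeros(n)}

    def get_value_ptr(self, name):
        return self._values[name]

    def set_value(self, name, src):
        val = self.get_value_ptr(name)
        # In-place, shape-checked copy; no length-1 special cases.
        val[:] = np.asarray(src, dtype=float).reshape(val.shape)

    def set_value_at_indices(self, name, inds, src):
        # Fancy indexing updates exactly the requested positions.
        self.get_value_ptr(name)[inds] = src

demo = SetValueDemo()
demo.set_value('soil_moisture', [0.1, 0.2, 0.3])
demo.set_value_at_indices('soil_moisture', [2], [0.9])
```

With this shape-agnostic form, the at_indices variants are genuinely useful rather than contradictory, since a length-1 grid is just a special case.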
The set_value function, starting on line 792, copies the input to an internal array, down to line 805. So far, so good. But then the rest of the code that would actually set the value is commented out, so the current code cannot use input values.
There should be some mechanism to get the input value set into the model, which is in the commented-out code from lines 807-830. It looks like there has been some discussion on the correct mechanism to get the input value set, but any consensus may be obscured by the commented-out code.
See bmi.initialize()
definition,
def initialize(self, bmi_cfg_file=None):
    # NJF ensure this is a Path type so the following open works as expected
    # When used with NGen, the bmi_cfg_file is just a string...
    bmi_cfg_file = Path(bmi_cfg_file)
Per the #NJF comment, we should maintain bmi_config_file as a string for ngen compliance. The two options:
bmi_config_file is of type Path
bmi_config_file is of type string
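One way to satisfy both callers is to accept either type at the interface and coerce inside initialize, which is what the Path(bmi_cfg_file) line above already does. A sketch; coerce_config_path is a hypothetical helper name:

```python
from pathlib import Path

def coerce_config_path(bmi_cfg_file):
    """Accept either a str (as ngen passes) or a Path, and return a Path.

    Path() is idempotent: Path(Path(p)) == Path(p), so both caller
    conventions end up at the same type internally.
    """
    return Path(bmi_cfg_file)

p_from_str = coerce_config_path('/tmp/config.yml')
p_from_path = coerce_config_path(Path('/tmp/config.yml'))
```

The externally visible contract can then stay "string" for ngen, while internal file handling always works with a Path.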
Manually ran the unit testing workflow, and Python 3.7 fails for both the macOS and Linux runners. See job details here.
Reminder to look into.
We don't need a separate dictionary for what will be included in the input versus all possible input values. When we create the input tensor, we can organize the inputs from the full dictionary (_values); or, if we decide to store the input values as individual attributes, we can reference them from there.
Current:
def set_values_dictionary(self):
    self._values = {self._var_name_map_short_first[x]: self.all_lstm_input_values_dict[x] for x in self.all_lstm_inputs}
...
def create_scaled_input_tensor(self):
    self.set_values_dictionary()
Proposed:
def create_scaled_input_tensor(self):
    self.input_array = np.array([getattr(self, self._var_name_map_short_first[x]) for x in self.all_lstm_inputs])
... or...
    self.input_array = np.array([self._values[self._var_name_map_short_first[x]] for x in self.all_lstm_inputs])
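Assuming _values and a short-to-long name map are populated, the dictionary-backed variant behaves as below. InputTensorDemo and its sample variable names are stand-ins for illustration:

```python
import numpy as np

class InputTensorDemo:
    """Sketch: build the input array straight from _values,
    with no intermediate dictionary."""

    # Hypothetical short-name to long-name map.
    _var_name_map_short_first = {
        'precip': 'atmosphere_water__liquid_equivalent_precipitation_rate',
        'temp': 'land_surface_air__temperature',
    }

    def __init__(self):
        self.all_lstm_inputs = ['precip', 'temp']
        self._values = {
            'atmosphere_water__liquid_equivalent_precipitation_rate': 2.0,
            'land_surface_air__temperature': 15.0,
        }

    def create_scaled_input_tensor(self):
        # Order follows the trained input list; one lookup per input.
        self.input_array = np.array(
            [self._values[self._var_name_map_short_first[x]]
             for x in self.all_lstm_inputs])

demo = InputTensorDemo()
demo.create_scaled_input_tensor()
```

Dropping set_values_dictionary removes one synchronization step per timestep and one more place where the two stores could disagree.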