Bound the runoff to zero (lstm issue, closed)

noaa-owp commented on September 28, 2024

Bound the runoff to zero

from lstm.

Comments (10)

SnowHydrology commented on September 28, 2024

@madMatchstick, we could also consider more elegant solutions in the future. I can't track it down at the moment, but I believe one of the Kratzert papers suggested limiting runoff to an observed baseflow value. We could then predict this for ungaged basins using the CAMELS attribute sets.

For now, though, I agree with your suggestion on where to put the logic that prevents negative values.


    def scale_output(self):

        if self.cfg_train['target_variables'][0] == 'qobs_mm_per_hour':
            self.surface_runoff_mm = (self.lstm_output[0,0,0].numpy().tolist() * self.out_std + self.out_mean)

        elif self.cfg_train['target_variables'][0] == 'QObs(mm/d)':
            self.surface_runoff_mm = (self.lstm_output[0,0,0].numpy().tolist() * self.out_std + self.out_mean) * (1/24)
        
        if self.surface_runoff_mm < 0: self.surface_runoff_mm = 0
    
        self._values['land_surface_water__runoff_depth'] = self.surface_runoff_mm/1000.0
        setattr(self, 'land_surface_water__runoff_depth', self.surface_runoff_mm/1000.0)
        self.streamflow_cms = self.surface_runoff_mm * self.output_factor_cms

        self._values['land_surface_water__runoff_volume_flux'] = self.streamflow_cms
        setattr(self, 'land_surface_water__runoff_volume_flux', self.streamflow_cms)

madMatchstick commented on September 28, 2024

There are two BMI outputs in this LSTM model, land_surface_water__runoff_volume_flux and land_surface_water__runoff_depth. Both are set in scale_output(), which is called in bmi.update():

    def scale_output(self):

        if self.cfg_train['target_variables'][0] == 'qobs_mm_per_hour':
            self.surface_runoff_mm = (self.lstm_output[0,0,0].numpy().tolist() * self.out_std + self.out_mean)

        elif self.cfg_train['target_variables'][0] == 'QObs(mm/d)':
            self.surface_runoff_mm = (self.lstm_output[0,0,0].numpy().tolist() * self.out_std + self.out_mean) * (1/24)
            
        self._values['land_surface_water__runoff_depth'] = self.surface_runoff_mm/1000.0
        setattr(self, 'land_surface_water__runoff_depth', self.surface_runoff_mm/1000.0)
        self.streamflow_cms = self.surface_runoff_mm * self.output_factor_cms

        self._values['land_surface_water__runoff_volume_flux'] = self.streamflow_cms
        setattr(self, 'land_surface_water__runoff_volume_flux', self.streamflow_cms)

@peckhams, I believe the most appropriate place to apply the lower bound is just after the if-elif statement, i.e.:

    if self.surface_runoff_mm < 0: self.surface_runoff_mm = 0

This way, both outputs are guaranteed to be non-negative.
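The ordering described above can be sketched as a standalone function. This is only an illustration, not the repository's code: the function name `scale_runoff` and the numeric arguments are hypothetical, and only the idea of clamping before deriving both outputs comes from the snippet above.

```python
def scale_runoff(raw_output, out_mean, out_std, output_factor_cms):
    """Denormalize one LSTM output, clamp it at zero, then derive both outputs.

    All names here are hypothetical stand-ins for the attributes used in
    scale_output(); the units follow the thread (mm of runoff per time step).
    """
    runoff_mm = raw_output * out_std + out_mean

    # Clamp once, before either output is computed, so both inherit the bound.
    runoff_mm = max(runoff_mm, 0.0)

    runoff_depth_m = runoff_mm / 1000.0
    streamflow_cms = runoff_mm * output_factor_cms
    return runoff_depth_m, streamflow_cms


# Made-up values: the denormalized runoff here is negative before clamping.
depth, flow = scale_runoff(-2.0, out_mean=0.1, out_std=0.2, output_factor_cms=3.5)
```

Because the clamp sits ahead of both derived quantities, neither output needs its own check.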

madMatchstick commented on September 28, 2024

"I can't track it down at the moment, but I believe one of the Kratzert papers suggested limiting runoff to an observed baseflow value."

@SnowHydrology, is this the paper you are after? Section 3.2, "Using LSTMs as hydrological models", states, regarding Fig. 9:

"A rather simple solution for this issue is to introduce just one additional parameter and to limit the simulated discharge not to zero, but to the minimum observed flow from the calibration period"

Also, from "A glimpse into the Unobserved: Runoff simulation for ungauged catchments with LSTMs", Section 2.2, "Model":

"Furthermore, because negative predictions of the runoff are physically implausible, we clip negative predictions to 0."

To your point, let's just set lower bound to zero for now :)
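The two options quoted above differ only in the floor value, which the following sketch tries to make concrete. The helper `clip_to_floor` and all numbers are hypothetical, and the "minimum observed flow" is taken here simply as the minimum of a made-up calibration series.

```python
import numpy as np


def clip_to_floor(simulated, floor=0.0):
    """Return a copy of `simulated` with values below `floor` raised to `floor`."""
    return np.maximum(np.asarray(simulated, dtype=float), floor)


# Made-up example data, in arbitrary runoff units.
observed_calibration = np.array([0.8, 0.05, 0.3, 1.2])
simulated = np.array([-0.1, 0.02, 0.5])

# Option 1: clip negative predictions to zero.
zero_bounded = clip_to_floor(simulated)

# Option 2: clip to the minimum observed flow from the calibration period.
baseflow_bounded = clip_to_floor(simulated, floor=observed_calibration.min())
```

The second option needs an observed record, which is why it would have to be predicted from catchment attributes for ungaged basins.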

SnowHydrology commented on September 28, 2024

@madMatchstick thanks for tracking that down. And agreed on the fix for now.

peckhams commented on September 28, 2024

This seems fine here, but I would normally do this kind of thing with the following syntax:

    np.maximum(self.surface_runoff_mm, 0, self.surface_runoff_mm)

This numpy function replaces the value with whichever is bigger, the value or 0, and writes the new value "in-place" (in the same memory location). A numpy function is compiled C code, so a Python conditional (if-then-else) will likely be slower. Since the statement gets evaluated many times, this speed difference can add up.

peckhams commented on September 28, 2024

And this syntax works the same for variables that are scalars or n-dimensional arrays.
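One caveat may be worth checking before relying on the three-argument form: the third argument has to be an ndarray (a 0-d array counts), not a plain Python float. In the snippets in this thread the value comes from `.tolist()`, so it is probably a Python float. The sketch below uses hypothetical variable names and is meant only to show the difference.

```python
import numpy as np

# In-place clamp on an n-dimensional array: the result is written into `runoff`.
runoff = np.array([-0.3, 0.0, 1.7])
np.maximum(runoff, 0, out=runoff)

# A 0-d array also works as an output buffer, because it is still an ndarray.
runoff_0d = np.array(-0.3)
np.maximum(runoff_0d, 0, out=runoff_0d)

# A plain Python float cannot be an output buffer, so NumPy raises TypeError.
runoff_float = -0.3
try:
    np.maximum(runoff_float, 0, runoff_float)
    in_place_failed = False
except TypeError:
    in_place_failed = True

# For a Python float, assign the result instead (the built-in max is enough).
runoff_float = max(runoff_float, 0.0)
```

For a single scalar per time step, the built-in `max` or the original `if` statement is a reasonable choice; the in-place ufunc form pays off mainly on arrays.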

SnowHydrology commented on September 28, 2024

@peckhams Thanks for that explanation. A fair bit of Fortran code is written with that syntax, as well. Is it common for a compiled language's max function to be faster than its if-else statement?

peckhams commented on September 28, 2024

For compiled languages (vs. an interpreted language like Python), it may not matter. But to get maximum performance from Python code, you want to use compiled numpy (or other) functions whenever possible. Several of the numpy functions, like maximum, have an optional third argument, which is the memory location to place the result -- in this case, overwriting in-place. Note the syntax; it's not of the form: self.x = np.maximum(self.x, 0), but rather just: np.maximum(self.x, 0, self.x). When modifying a variable, especially if it's a big array, you want to just overwrite the same memory location, which is faster. Another example of this kind of in-place modification is: self.x[:] = self.x + dx; instead of self.x = self.x + dx.
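The difference between overwriting in place and rebinding a name can be checked with a second reference to the same array. This is a small sketch with made-up values; `alias` is introduced only to observe which array each statement touches.

```python
import numpy as np

x = np.array([-1.0, 2.0, 3.0])
alias = x          # second reference to the same underlying array
dx = 0.5

# Slice assignment writes into the existing buffer, so `alias` sees the change.
x[:] = x + dx
still_same_after_slice = alias is x

# The `out` argument also writes into the existing buffer.
np.maximum(x, 0, out=x)
still_same_after_maximum = alias is x
values_after_in_place = alias.copy()

# Plain assignment rebinds `x` to a newly allocated array; `alias` keeps the old one.
x = x + dx
rebound_to_new_array = alias is not x
```

Note that `x[:] = x + dx` still builds a temporary array for `x + dx` before copying it into `x`; `x += dx` or `np.add(x, dx, out=x)` avoids that temporary as well.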

madMatchstick commented on September 28, 2024

@peckhams Thanks for the suggestion. I'll use this method:

    np.maximum(self.surface_runoff_mm, 0, self.surface_runoff_mm)
