
econ_layers's Introduction

PyTorch Layers for Economics Applications


Features

  • Exponential layer
  • Flexible multi-layer neural network with optional nonlinear last layer
  • Affine rescaling of output by an input

Development

To publish a new release to PyPI:

  1. Ensure that the CI is passing.
  2. Modify setup.py to increment the minor version number (or the major version number once API stability is enforced).
  3. Choose "Releases" on the GitHub page, then "Draft a new release".
  4. Click "Choose a tag" and type a new release tag: the letter v followed by the version number you set in setup.py, so the two stay consistent.
  5. After you choose "Publish Release", the package is automatically pushed to PyPI, and you can change compatibility bounds in downstream packages as required.

Credits

This package was created with Cookiecutter and the giswqs/pypackage project template.

econ_layers's People

Contributors

jbrightuniverse, janrosa1, jlperla

Forkers

shizelong1985

econ_layers's Issues

cli.config will not be a dict after PL>=1.6

in test_jsonargparse.py, cli.config["model"]["ml_model"] will need to be converted to cli.config.as_dict()["model"]["ml_model"], or perhaps separated into two lines, once the release is out
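
A minimal sketch of the change, assuming the config keys stay the same:

# Before PL 1.6, cli.config behaves like a nested dict
ml_model = cli.config["model"]["ml_model"]

# From PL 1.6 on, convert the Namespace to a dict first (or split it into two lines)
config_dict = cli.config.as_dict()
ml_model = config_dict["model"]["ml_model"]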

Do we need Optional in the FlexibleSequential?

Is the Optional on https://github.com/HighDimensionalEconLab/econ_layers/blob/main/econ_layers/layers.py#L64 and https://github.com/HighDimensionalEconLab/econ_layers/blob/main/econ_layers/layers.py#L66 necessary for it to work? Typically we would only want Optional if None also makes sense (for example, the rescaling_layer needs it in https://github.com/HighDimensionalEconLab/econ_layers/blob/main/econ_layers/layers.py#L68). Not an important point, but something to check the next time you update the repo.

Implement trainable diagonal exponential rescaling functions

Might see something here: https://towardsdatascience.com/how-to-build-your-own-pytorch-neural-network-layer-from-scratch-842144d623f6

The key is that it needs to have its internal weights as trainable parameters for pytorch, rather than as fixed values.

If the input is x in R^N, then for all of these we want to map to an N x N matrix (diagonal in this case). We can then multiply that matrix by the input to rescale it. For the diagonal case, the multiplication is just a pointwise multiplication.

To start with, implement the pointwise f(x) = exp(D x) for a diagonal D and input x, i.e.

f(x_1) = exp(D_1 x_1)
f(x_2) = exp(D_2 x_2)
...
f(x_N) = exp(D_N x_N)

which is an N-parameter learnable function (i.e. self.D = torch.nn.Parameter(torch.zeros(N)), etc.)

Code that might not be that far off is

# Given inputs x and y, this calculates exp(D x) * y with a pointwise
# exponential and multiplication, where D is a learnable diagonal.
import torch
import torch.nn as nn

class DiagonalExponentialRescaling(nn.Module):
    def __init__(self, n_in):
        super().__init__()
        self.n_in = n_in
        self.weights = torch.nn.Parameter(torch.Tensor(n_in))
        self.reset_parameters()

    def reset_parameters(self):
        # Let's start at zero, but later this could be an option
        torch.nn.init.zeros_(self.weights)  # maybe this? Not entirely sure.

    def forward(self, x, y):
        exp_x = torch.exp(torch.mul(x, self.weights))  # exponential of the scaled input
        return torch.mul(exp_x, y)

Because this will be relatively low dimensional in parameters, we will need to make sure we start at the right place. Maybe D_n = 0 is even a better initial condition than something totally random.

Note that we would call this with two inputs

model = DiagonalExponentialRescaling(5)
x = torch.tensor([...])
y = torch.tensor([...]) # maybe coming out of another network
out = model(x, y)

For the unit tests:
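
A minimal sketch of one such test, assuming pytest; with the default zero initialization, exp(D x) = 1, so the layer should initially act as the identity on y:

import torch

def test_diagonal_exponential_rescaling_identity_at_init():
    # uses the DiagonalExponentialRescaling class sketched above
    model = DiagonalExponentialRescaling(5)
    x = torch.rand(5)
    y = torch.rand(5)
    assert torch.allclose(model(x, y), y)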

Add rescaling by dividing by a variable

This is a rescaling layer which takes in an index for the input and then divides by that value.

There are no trainable parameters in this one, which makes it a lot easier to implement. Something like

class InputRescaling(nn.Module):
    def __init__(self, rescale_index):
        super().__init__()
        self.rescale_index = rescale_index

    def forward(self, x, y):
        return y / x[self.rescale_index]

Or something like that. @Mekahou would that do it?

Then after this, we should hook it up into the flexible layers as implemented in #2

Then if you wanted a flexible neural network that rescales by the 0th argument (e.g. by k for a c(k, z) function), you could do something like

mod = FlexibleSequential(2, 1, layers = 3, hidden_dim = 128,
                         RescalingLayer = InputRescaling,
                         rescaling_layer_kwargs = {"rescale_index": 0})

@kahou this would do it for the recursive case, right?

Move pretraining callback into the package

The current code is

import pytorch_lightning as pl
import pytorch_lightning.callbacks

class ResetOptimizers(pl.Callback):
    def __init__(self, verbose):
        super().__init__()
        self.verbose = verbose

    def on_train_epoch_end(self, trainer, pl_module):
        if trainer.current_epoch == pl_module.hparams.pretrain_epochs - 1:
            if self.verbose:
                print("\nPretraining complete, resetting optimizers and schedulers")
            trainer.accelerator.setup_optimizers(trainer)
  • Add a new file called callbacks.py so it can be included with import econ_layers.callbacks etc. (a usage sketch is at the end of this issue).
  • I think we should change it to something where the field for the pretrain epochs is not hardcoded:
class ResetOptimizers(pl.Callback):
    def __init__(self, verbose: bool,
                 epoch_reset_field: str = "pretrain_epochs"):
        super().__init__()
        self.verbose = verbose
        self.epoch_reset_field = epoch_reset_field

    def on_train_epoch_end(self, trainer, pl_module):
        reset_epoch = getattr(pl_module.hparams, self.epoch_reset_field) - 1
        if trainer.current_epoch == reset_epoch:
            if self.verbose:
                print("\nPretraining complete, resetting optimizers and schedulers")
            trainer.accelerator.setup_optimizers(trainer)
  • Need to add pytorch-lightning as a dependency of this package.
  • For jsonargparse to work well, you should also add type information to the fields. I tried to add it above but may have made a mistake.
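
A minimal usage sketch once callbacks.py exists (the Trainer arguments here are just placeholders):

import pytorch_lightning as pl
from econ_layers.callbacks import ResetOptimizers  # assumed import location per the first bullet

trainer = pl.Trainer(
    max_epochs=200,  # placeholder
    callbacks=[ResetOptimizers(verbose=True, epoch_reset_field="pretrain_epochs")],
)
# the LightningModule is expected to have hparams.pretrain_epochs set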

More advanced rescaling by inputs

If we find that we want more control over which inputs are rescaled, then something like

import torch
import torch.nn as nn

class RescaleInputsbyInput(nn.Module):
    def __init__(self, rescale_index, inputs_to_rescale=None):
        super().__init__()
        self.rescale_index = rescale_index

        # if inputs_to_rescale is None, assume all inputs except the rescaling one;
        # the concrete indices are generated in forward, since the input dimension is not known here
        self.inputs_to_rescale = inputs_to_rescale

    def forward(self, x):
        rescale_scalar = 1 / x[self.rescale_index]
        if self.inputs_to_rescale is None:
            indices = [i for i in range(x.shape[-1]) if i != self.rescale_index]
        else:
            indices = self.inputs_to_rescale
        new_x = x.clone()
        new_x[..., indices] = new_x[..., indices] * rescale_scalar  # rescale only the selected indices
        return new_x

Add support for an input rescaling to complement the output rescaling

After #15, the current rescaling layer doesn't do anything to the network on the inside. So it can only represent the following, for a given index i:

x_i NN(x_1, ..., x_N)

But we also want to be able to do things like NN(x_1/x_i, x_2/x_i, ..., x_i, ..., x_N/x_i), or x_i NN(x_1/x_i, x_2/x_i, ..., x_i, ..., x_N/x_i), to represent functions that are "close to" homogeneous of degree 1. That is, it divides everything by x_i except the x_i term itself, and then rescales back by x_i. This is used in an "almost guess-and-verify" approach for homothetic problems. Almost, because it isn't forcing the NN to be independent of x_i; it just makes that easy.

Since there may be more elaborate types of rescaling, let's leave it relatively general and follow the pattern of the newly renamed output rescaling in #15.

To do this, the constructor would have something like

        self.input_rescaling_layer_args = input_rescaling_layer_kwargs
        self.output_rescaling_layer_args = output_rescaling_layer_kwargs
        self.InputRescalingLayer = InputRescalingLayer
        self.OutputRescalingLayer = OutputRescalingLayer

        if self.InputRescalingLayer is not None:
            self.rescale_input = InputRescalingLayer(**input_rescaling_layer_kwargs)
        else:
            self.rescale_input = None

        if self.OutputRescalingLayer is not None:
            self.rescale_output = OutputRescalingLayer(**output_rescaling_layer_kwargs)
        else:
            self.rescale_output = None
  • Then the forward is something like the following
    def forward(self, input):
        rescaled_input = input if self.rescale_input is None else self.rescale_input(input)
        out = self.model(rescaled_input)  # pass through to the stored net
        if self.rescale_output is not None:
            return self.rescale_output(input, out)  # note that the output rescaling uses the original inputs, not the rescaled ones
        else:
            return out
  • Then, we can come up with a simple network to do this in the "guess almost homothetic" case. Something like the following is a good place to start. It would assume there is only a single value that is not rescaled. Then you could do
class RescaleAllInputsbyInput(nn.Module):
    def __init__(self, rescale_index):
        super().__init__()
        self.rescale_index = rescale_index

    def forward(self, x):
        rescale_scalar = 1 / x[self.rescale_index]
        return torch.cat([x[0:self.rescale_index] * rescale_scalar,
                          x[self.rescale_index:self.rescale_index + 1],
                          x[self.rescale_index + 1:] * rescale_scalar])  # or whatever is correct for the batch layout
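
A hypothetical construction once both hooks are in place, using e.g. RescaleOutputsByInput on the output side (the keyword names below just mirror the attributes above and may end up different):

mod = FlexibleSequential(3, 1, layers = 3, hidden_dim = 128,
                         InputRescalingLayer = RescaleAllInputsbyInput,
                         input_rescaling_layer_kwargs = {"rescale_index": 0},
                         OutputRescalingLayer = RescaleOutputsByInput,
                         output_rescaling_layer_kwargs = {"rescale_index": 0})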

Add affine term in RescaleOutputByInput

https://github.com/HighDimensionalEconLab/econ_layers/blob/main/econ_layers/layers.py#L21-L31

Maybe something like

import torch
import torch.nn as nn

class RescaleOutputsByInput(nn.Module):
    def __init__(self, rescale_index: int = 0, bias: bool = False):
        super().__init__()
        self.rescale_index = rescale_index
        if bias:
            self.bias = torch.nn.Parameter(torch.Tensor(1))
        else:
            self.register_parameter('bias', None)
        self.reset_parameters()

    def reset_parameters(self):
        if self.bias is not None:
            torch.nn.init.zeros_(self.bias)

    def forward(self, x, y):
        bias = 0.0 if self.bias is None else self.bias
        if x.dim() == 1:
            return x[self.rescale_index] * y + bias
        else:
            return x[:, [self.rescale_index]] * y + bias
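
Hypothetical usage, just to illustrate the affine behavior (with bias=True the layer computes x[rescale_index] * y + b, where b starts at zero and is learned):

layer = RescaleOutputsByInput(rescale_index = 0, bias = True)
x = torch.tensor([[2.0, 3.0]])
y = torch.tensor([[1.5]])
out = layer(x, y)  # 2.0 * 1.5 + b, with b initialized to 0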

Check if the patch for the ReduceLRonPlateau works

i.e. Lightning-AI/pytorch-lightning#10850

If so, then I think what we do is put the class FutureLightningCLI(LightningCLI) into econ_layers, and then downstream packages can use that instead of LightningCLI directly. If that works, it is an easy patch for setups where we want to try the plateau scheduler, and we can swap back when the new release occurs.

@jbrightuniverse It might be worth trying this sooner rather than later if the symmetry paper ends up sensitive to the LR, but let's get it working with the step/exponential schedulers first.

Code to ignore certain lightning warnings

Right now we have

warnings.filterwarnings(
    "ignore",
    category=UserWarning,
    module="pytorch_lightning.trainer.data_loading",
    lineno=102,
)


warnings.filterwarnings(
    "ignore",
    category=UserWarning,
    module="pytorch_lightning.trainer.callback_hook",
    lineno=100,
)


warnings.filterwarnings(
    "ignore",
    category=UserWarning,
    module="torch.optim.lr_scheduler",
    lineno=129,
)

After a version bump we can revisit these and see what is still relevant.

The problem is that the lineno changes with each PL version (or torch version, for the lr_scheduler warning), so we could see whether it is possible to do an "if" and change the lineno depending on the installed version. Then, when new PL versions come out, we can just add a case to the if.
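
A rough sketch of that version-conditional approach; the cutoff version and the alternative lineno below are placeholders rather than real values:

import warnings

import pytorch_lightning as pl
from packaging import version

# placeholder line numbers for the data_loading warning under different PL versions
if version.parse(pl.__version__) >= version.parse("1.6.0"):
    data_loading_lineno = 110  # placeholder, not the real value
else:
    data_loading_lineno = 102

warnings.filterwarnings(
    "ignore",
    category=UserWarning,
    module="pytorch_lightning.trainer.data_loading",
    lineno=data_loading_lineno,
)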

Add utilities functions

  • Add a utilities.py
    For now, can just put in
import torch

def squeeze_cpu(x):
    return x.detach().squeeze().cpu().numpy() if torch.is_tensor(x) else x

def dict_to_cpu(d):
    return {name: squeeze_cpu(val).tolist() for (name, val) in d.items()}
  • Add in a simple unit test (a sketch follows below). Might need to add numpy and torch to the requirements for the package.
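
A minimal sketch of such a test, assuming pytest and that the functions live in econ_layers.utilities:

import numpy as np
import torch

from econ_layers.utilities import squeeze_cpu, dict_to_cpu  # assumed import location

def test_squeeze_cpu_and_dict_to_cpu():
    x = torch.tensor([[1.0], [2.0]], requires_grad=True)
    assert np.allclose(squeeze_cpu(x), np.array([1.0, 2.0]))
    assert squeeze_cpu(3.5) == 3.5  # non-tensors pass through unchanged
    assert dict_to_cpu({"x": x}) == {"x": [1.0, 2.0]}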

Add in a utility to run and test CLI model without the CLI

Maybe something like

def solve_cli_model(Model, args, config_file, default_seed = 123):
    sys.argv = ["dummy.py"] + [f"--{key}={val}" for key, val in args.items()]  # hack: overwrite argv

    cli = LightningCLI(
        Model,
        run=False,
        seed_everything_default=default_seed,
        save_config_overwrite=True,        
        parser_kwargs={"default_config_files": [config_file]},
    )
    # Solves the model
    trainer = cli.instantiate_trainer(
        logger=None,
        checkpoint_callback=None,
        callbacks=[],  # not using the early stopping/etc.
    )    
    trainer.fit(cli.model)

    # Calculates the "test" values for it
    trainer.test(cli.model)
    cli.model.eval()  # Turn off training mode, where it calculates gradients for every call.

    return cli.model, cli

Except maybe give a few options with defaults like the following (a sketch folding these in appears below):

  • use_logger = False: turns off the logger by default.
  • checkpoint_callback = False
  • callbacks = False: if False, zero them out; otherwise leave them be.
  • test = True: whether to run the test step or not.

Etc.
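
A sketch that folds those options in; the argument names follow the list above, and the exact semantics are an assumption:

import sys

from pytorch_lightning.utilities.cli import LightningCLI  # import path varies by PL version

def solve_cli_model(Model, args, config_file, default_seed=123,
                    use_logger=False, checkpoint_callback=False,
                    callbacks=False, test=True):
    sys.argv = ["dummy.py"] + [f"--{key}={val}" for key, val in args.items()]  # hack: overwrite argv

    cli = LightningCLI(
        Model,
        run=False,
        seed_everything_default=default_seed,
        save_config_overwrite=True,
        parser_kwargs={"default_config_files": [config_file]},
    )

    # Only override what the caller asked to zero out; otherwise leave the configured values alone
    trainer_kwargs = {}
    if not use_logger:
        trainer_kwargs["logger"] = None
    if not checkpoint_callback:
        trainer_kwargs["checkpoint_callback"] = None
    if callbacks is False:
        trainer_kwargs["callbacks"] = []  # drop early stopping etc.
    trainer = cli.instantiate_trainer(**trainer_kwargs)

    trainer.fit(cli.model)
    if test:
        trainer.test(cli.model)
    cli.model.eval()  # turn off training mode
    return cli.model, cli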

Add moments layer

A layer that takes R^1 -> R^M for the first M moments. We can get fancy later by making it work for inputs that are not one dimensional and for picking specific moments.

import torch
import torch.nn as nn

class Moments(nn.Module):
    def __init__(self, n_moments: int):
        super().__init__()
        self.n_moments = n_moments

    def forward(self, input):
        # concatenate input^1, ..., input^M along the feature dimension
        return torch.cat([input.pow(m) for m in range(1, self.n_moments + 1)], 1)  # or something like that

A few things:

  • Note that this has no learnable parameters.
  • Make sure to test it with dispatching over a batch, in the typical way required for the symmetry paper (a quick check is sketched below).
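
A quick batch check, assuming inputs of shape (batch, 1) and the Moments class sketched above:

import torch

layer = Moments(3)
x = torch.tensor([[2.0], [3.0]])
out = layer(x)  # each row gets its own moments
assert torch.allclose(out, torch.tensor([[2.0, 4.0, 8.0], [3.0, 9.0, 27.0]]))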

After that you can use this in a configuration file like

model:
  phi:
    class_path: econ_layers.layers.Moments
    init_args:
      n_moments: 3

@jbrightuniverse @arnavs

Hook up rescaling into the main flexible layers for the scalar exponential case

After #1 is complete, we can add it as an option into the FlexibleSequential

I already added in the https://github.com/HighDimensionalEconLab/econ_layers/blob/main/econ_layers/layers.py#L30-L31 function

What is needed is then to use the rescaling in the forward. I think basically https://github.com/HighDimensionalEconLab/econ_layers/blob/main/econ_layers/layers.py#L69-L74

Becomes something like

    def forward(self, input):
        out = self.model(input)  # pass through to the stored net
        if self.RescalingLayer is not None:
            return self.rescale(input, out)  # assuming the constructed rescaling layer is stored as self.rescale
        else:
            return out

And construction of the FlexibleSequential could be

mod = FlexibleSequential(2, 2, layers = 3, hidden_dim = 128,
                         RescalingLayer = ScalarExponentialRescaling,
                         rescaling_layer_kwargs = {})  # add kwargs if required, or whatever it is....

Or something like that.

For the naming of arguments, kwargs, etc., see the other classes and strive for consistency.

Implement trainable scalar rescaling

The scalar version is something like

# Scalar rescaling. Only one parameter.
import torch
import torch.nn as nn

class ScalarExponentialRescaling(nn.Module):
    def __init__(self, n_in):
        super().__init__()
        self.n_in = n_in
        self.weight = torch.nn.Parameter(torch.Tensor(1))  # only one parameter to "learn"
        self.reset_parameters()

    def reset_parameters(self):
        # Let's start at zero, but later this could be an option
        torch.nn.init.zeros_(self.weight)  # maybe this? Not entirely sure.

    def forward(self, x, y):
        exp_x = torch.exp(self.weight * x)  # exponential of the scaled input
        return torch.mul(exp_x, y)

Input rescaling: problem with the data/batch dimension

In InputRescaling, for def forward(self, x, y): return y * x[self.rescale_index], the rescaling layer multiplies the NN output by the first batch element (in this case the first tensor [z_0, k_0]) instead of by the chosen data coordinate (in this case z_i for the i-th batch element).
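
A minimal sketch of the indexing fix, assuming x has shape (batch, n_in) and keeping the multiply/divide as the layer currently has it:

    def forward(self, x, y):
        # select the data coordinate (column), not the batch element (row),
        # and keep the dimension so it broadcasts against y
        return y * x[:, [self.rescale_index]]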

See if a specialized trainer supporting pre_fit for cli/etc. is feasible

From Mauricio on Slack:

Just in case you find it useful I give a different idea which I would probably take if doing something like this. Note that LightningCLI can receive as input the trainer_class. The reason for this is for users to be able to extend the lightning trainer class when needed. A possibility could be to extend the fit method such that internally it would do some pretraining (disabling callbacks and loggers) and then call super().fit(). Another possibility would be to have a new method e.g. pre_fit which would implement this. Then the user would call first pre_fit and then fit. The reason why I would tend to go this way is that this fits more as training logic than cli logic
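
A bare skeleton of the trainer_class route, just to make the idea concrete; the pretraining body itself is application specific and left as a placeholder:

import pytorch_lightning as pl

class PreFitTrainer(pl.Trainer):
    def pre_fit(self, model):
        # application-specific pretraining would go here, e.g. a short fit on a
        # simpler objective with callbacks and loggers disabled, before fit() is called
        pass

# Downstream, the CLI would then be constructed with trainer_class=PreFitTrainer, and the
# user would call pre_fit on the trainer before calling fit.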

Rename rescaling to prepare for both input and output rescaling

The current rescalebyinputs layer doesn't do anything to the network on the inside. So it can only represent the following, for a given index i:

x_i NN(x_1, ..., x_N)

Let's rename things to prepare for allowing things like x_i NN(x_1/x_i, x_2/x_i, ..., x_i, ..., x_N/x_i).
