
A library for solving differential equations using neural networks based on PyTorch, used by multiple research groups around the world, including at Harvard IACS.

Home Page: http://pypi.org/project/neurodiffeq/

License: MIT License

Python 98.91% TeX 0.67% Dockerfile 0.23% Shell 0.14% Batchfile 0.05%
differential-equations neural-networks pytorch pde-solver odes artificial-intelligence deep-learning physics-informed-neural-networks pypi time-series

neurodiffeq's Introduction

neurodiffeq


Citation

@article{chen2020neurodiffeq,
  title={NeuroDiffEq: A Python package for solving differential equations with neural networks},
  author={Chen, Feiyu and Sondak, David and Protopapas, Pavlos and Mattheakis, Marios and Liu, Shuheng and Agarwal, Devansh and Di Giovanni, Marco},
  journal={Journal of Open Source Software},
  volume={5},
  number={46},
  pages={1931},
  year={2020}
}

🔥🔥🔥 Did you know that neurodiffeq supports solution bundles and can be used to solve inverse problems? See here!

🎓 Already familiar with neurodiffeq? 👇 Jump to FAQs.


Introduction

neurodiffeq is a package for solving differential equations with neural networks. Differential equations are equations that relate some function to its derivatives. They emerge in various scientific and engineering domains. Traditionally, these problems are solved by numerical methods (e.g., finite difference, finite element). While these methods are effective and adequate, their solutions are discrete or limited in differentiability by the underlying function representation. It would be attractive to compute solutions of differential equations that are continuous and differentiable over the whole domain.

As universal function approximators, artificial neural networks have been shown to have the potential to solve ordinary differential equations (ODEs) and partial differential equations (PDEs) with certain initial/boundary conditions. The aim of neurodiffeq is to implement these existing techniques for using ANNs to solve differential equations in a way that allows the software to be flexible enough to work on a wide range of user-defined problems.

Installation

Using pip

Like most standard libraries, neurodiffeq is hosted on PyPI. To install the latest stable release,

pip install -U neurodiffeq  # '-U' means update to latest version

Manually

Alternatively, you can install the library manually to get early access to our new features. This is the recommended way for developers who want to contribute to the library.

git clone https://github.com/NeuroDiffGym/neurodiffeq.git
cd neurodiffeq && pip install -r requirements.txt
pip install .  # To make changes to the library, use `pip install -e .`
pytest tests/  # Run tests. Optional.

Getting Started

We are happy to help you with any questions. In the meantime, you can check out the FAQs.

To view complete tutorials and documentation of neurodiffeq, please check Official Documentation.

In addition to the documentation, we have recently made a quick walkthrough Demo Video with slides.

Example Usages

Imports

import numpy as np
import torch

from neurodiffeq import diff
from neurodiffeq.solvers import Solver1D, Solver2D
from neurodiffeq.conditions import IVP, DirichletBVP2D
from neurodiffeq.networks import FCNN, SinActv

ODE System Example

Here we solve a non-linear system of two ODEs, known as the Lotka–Volterra equations. There are two unknown functions (u and v) and a single independent variable (t).

def ode_system(u, v, t):
    return [diff(u, t) - (u - u*v), diff(v, t) - (u*v - v)]

conditions = [IVP(t_0=0.0, u_0=1.5), IVP(t_0=0.0, u_0=1.0)]
nets = [FCNN(actv=SinActv), FCNN(actv=SinActv)]

solver = Solver1D(ode_system, conditions, t_min=0.1, t_max=12.0, nets=nets)
solver.fit(max_epochs=3000)
solution = solver.get_solution()

solution is a callable object. You can pass in NumPy arrays or PyTorch tensors to it, like

u, v = solution(t, to_numpy=True)  # t can be np.ndarray or torch.Tensor

Plotting u and v against their analytical solutions yields something like:

(Figure: ANN-based solutions of the Lotka–Volterra system plotted against reference solutions.)
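For reference, here is a minimal plotting sketch. It assumes matplotlib and scipy are installed; since the Lotka–Volterra system has no simple closed form, the reference curves here are obtained by numerically integrating the same system with scipy.

import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import solve_ivp

ts = np.linspace(0.1, 12.0, 500)
u_net, v_net = solution(ts, to_numpy=True)

# reference solution via numerical integration of the same system
lotka_volterra = lambda t, z: [z[0] - z[0] * z[1], z[0] * z[1] - z[1]]
ref = solve_ivp(lotka_volterra, (0.0, 12.0), [1.5, 1.0], t_eval=ts, rtol=1e-8)

plt.plot(ts, u_net, label='u (ANN)')
plt.plot(ts, v_net, label='v (ANN)')
plt.plot(ref.t, ref.y[0], '--', label='u (reference)')
plt.plot(ref.t, ref.y[1], '--', label='v (reference)')
plt.xlabel('t')
plt.legend()
plt.show()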

PDE System Example

Here we solve a Laplace equation with Dirichlet boundary conditions on a rectangle. Note that we chose the Laplace equation for the simplicity of computing its analytical solution. In practice, you can attempt any nonlinear, chaotic PDE, provided you tune the solver well enough.

Solving a 2-D PDE system is quite similar to solving ODEs, except that there are two variables: x and y for boundary value problems, or x and t for initial boundary value problems, both of which are supported.

def pde_system(u, x, y):
    return [diff(u, x, order=2) + diff(u, y, order=2)]

conditions = [
    DirichletBVP2D(
        x_min=0, x_min_val=lambda y: torch.sin(np.pi*y),
        x_max=1, x_max_val=lambda y: 0,                   
        y_min=0, y_min_val=lambda x: 0,                   
        y_max=1, y_max_val=lambda x: 0,                   
    )
]
nets = [FCNN(n_input_units=2, n_output_units=1, hidden_units=(512,))]

solver = Solver2D(pde_system, conditions, xy_min=(0, 0), xy_max=(1, 1), nets=nets)
solver.fit(max_epochs=2000)
solution = solver.get_solution()

The signature of solution for a 2D PDE is slightly different from that of an ODE. Again, it takes in either numpy arrays or torch tensors.

u = solution(x, y, to_numpy=True)

Evaluating u on [0,1] × [0,1] yields the following plots

(Figures: ANN-based solution and residual of the PDE, evaluated on [0,1] × [0,1].)
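A minimal sketch of how such plots can be produced is shown below. The analytical solution u(x, y) = sin(πy)·sinh(π(1−x))/sinh(π) for this particular set of boundary conditions follows from separation of variables; matplotlib is assumed.

import numpy as np
import matplotlib.pyplot as plt

xs, ys = np.linspace(0, 1, 101), np.linspace(0, 1, 101)
xx, yy = np.meshgrid(xs, ys)
u_net = solution(xx.ravel(), yy.ravel(), to_numpy=True).reshape(xx.shape)

# analytical solution for these Dirichlet conditions, obtained by separation of variables
u_ana = np.sin(np.pi * yy) * np.sinh(np.pi * (1 - xx)) / np.sinh(np.pi)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.pcolormesh(xx, yy, u_net, shading='auto')
ax1.set_title('ANN-based solution')
ax2.pcolormesh(xx, yy, u_net - u_ana, shading='auto')
ax2.set_title('residual')
plt.show()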

Using a Monitor

A monitor is a tool for visualizing PDE/ODE solutions as well as the history of loss and custom metrics during training. Jupyter Notebook users need to run the %matplotlib notebook magic. Jupyter Lab users should try %matplotlib widget.

from neurodiffeq.monitors import Monitor1D
...
monitor = Monitor1D(t_min=0.0, t_max=12.0, check_every=100)
solver.fit(..., callbacks=[monitor.to_callback()])

You should see the plots update every 100 epochs as well as on the last epoch, showing two plots: one visualizing the solution on the interval [0,12], the other showing the history of training and validation loss.

(Figure: monitor plots updating during training.)

Custom Networks

For convenience, we have implemented an FCNN (fully-connected neural network), whose hidden units and activation functions can be customized.

from neurodiffeq.networks import FCNN
# Default: n_input_units=1, n_output_units=1, hidden_units=[32, 32], activation=torch.nn.Tanh
net1 = FCNN(n_input_units=..., n_output_units=..., hidden_units=[..., ..., ...], activation=...) 
...
nets = [net1, net2, ...]

FCNN is usually a good starting point. For advanced users, solvers are compatible with any custom torch.nn.Module. The only constraints are:

  1. Each module takes in a tensor of shape (None, n_coords) and outputs a tensor of shape (None, 1) (see the sketch after this list).

  2. There must be a total of n_funcs modules in nets passed to solver = Solver(..., nets=nets), one for each unknown function.
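For example, a minimal custom module satisfying both constraints might look like the following sketch (the class name and layer sizes are illustrative):

import torch.nn as nn

class MyNet(nn.Module):
    def __init__(self, n_coords=1, n_hidden=64):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(n_coords, n_hidden),
            nn.Tanh(),
            nn.Linear(n_hidden, 1),
        )

    def forward(self, coords):
        # coords has shape (None, n_coords); output has shape (None, 1)
        return self.layers(coords)

# for a system with two unknown functions, pass one module per function
nets = [MyNet(n_coords=1), MyNet(n_coords=1)]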


Actually, neurodiffeq has a single_net feature that doesn't obey the above rules, which won't be covered here.

Read the PyTorch tutorial on building your own network (a.k.a module) architecture.

Transfer Learning

Transfer learning is easily done by serializing old_solver.nets (a list of torch modules) to disk and then loading them and passing to a new solver:

old_solver.fit(max_epochs=...)
# ... dump `old_solver.nets` to disk

# ... load the networks from disk, store them in some `loaded_nets` variable
new_solver = Solver(..., nets=loaded_nets)
new_solver.fit(max_epochs=...)
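Here is a minimal sketch of the dump/load step using plain PyTorch serialization, reusing the Lotka–Volterra example above. The file name is arbitrary, and the loaded networks must have the same architectures as the saved ones.

import torch

# dump: save the state dicts of all networks
torch.save([net.state_dict() for net in old_solver.nets], 'nets.pt')

# load: rebuild networks with the same architectures, then restore the weights
loaded_nets = [FCNN(actv=SinActv), FCNN(actv=SinActv)]
for net, state in zip(loaded_nets, torch.load('nets.pt')):
    net.load_state_dict(state)

new_solver = Solver1D(ode_system, conditions, t_min=0.1, t_max=12.0, nets=loaded_nets)
new_solver.fit(max_epochs=3000)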

We are currently working on wrapper functions to save/load networks and other internal variables of Solvers. In the meantime, you can read the PyTorch tutorial on saving and loading your networks.

Sampling Strategies

In neurodiffeq, the networks are trained by minimizing the loss (ODE/PDE residuals) evaluated on a set of points in the domain. The points are randomly resampled every time. To control the number, distribution, and bounding domain of sampled points, you can specify your own training/validation generators.

from neurodiffeq.generators import Generator1D

# Default t_min=0.0, t_max=1.0, method='uniform', noise_std=None
g1 = Generator1D(size=..., t_min=..., t_max=..., method=..., noise_std=...)
g2 = Generator1D(size=..., t_min=..., t_max=..., method=..., noise_std=...)

solver = Solver1D(..., train_generator=g1, valid_generator=g2)

Here are some sample distributions of a Generator1D.

(Figures: sample distributions of Generator1D(8192, 0.0, 1.0, method='uniform') and Generator1D(8192, -1.0, 0.0, method='log-spaced-noisy', noise_std=1e-3).)

Note that when both train_generator and valid_generator are specified, t_min and t_max can be omitted in Solver1D(...). In fact, even if you pass t_min, t_max, train_generator, and valid_generator together, t_min and t_max will still be ignored.

Combining Generators

Another nice feature of the generators is that you can concatenate them. For example:

g1 = Generator2D((16, 16), xy_min=(0, 0), xy_max=(1, 1))
g2 = Generator2D((16, 16), xy_min=(1, 1), xy_max=(2, 2))
g = g1 + g2

Here, g will be a generator that outputs the combined samples of g1 and g2.

(Figures: samples drawn from g1, g2, and g1 + g2.)

Sampling Higher Dimensions

You can use Generator2D, Generator3D, etc. for sampling points in higher dimensions. But there's also another way

g1 = Generator1D(1024, 2.0, 3.0, method='uniform')
g2 = Generator1D(1024, 0.1, 1.0, method='log-spaced-noisy', noise_std=0.001)
g = g1 * g2

Here, g will be a generator which yields 1024 points in the 2-D rectangle (2,3) × (0.1,1) every time. Their x-coordinates are drawn from (2,3) using the uniform strategy, and their y-coordinates are drawn from (0.1,1) using the log-spaced-noisy strategy.

(Figures: samples drawn from g1, g2, and g1 * g2.)

Solution Bundles and Inverse Problems

Sometimes, it is interesting to solve a bundle of equations at once. For example, you may want to solve differential equations of the form du/dt + λu = 0 under the initial condition u(0) = U0. You may want to solve this for all λ and U0 at once, by treating them as inputs to the neural networks.

One such application is chemical reactions, where the reaction rate is unknown. Different reaction rates correspond to different solutions, and only one solution matches observed data points. You may be interested in first solving for a bundle of solutions, and then determining the best reaction rates (a.k.a. equation parameters). The second step is known as the inverse problem.

Here's an example of how to do this using neurodiffeq:

  1. Let's say we have the equation du/dt + λu = 0 and initial condition u(0) = U0, where λ and U0 are unknown constants. We also have a set of observations t_obs and u_obs. We first import BundleSolver and BundleIVP, which are necessary for obtaining a solution bundle:

    from neurodiffeq.conditions import BundleIVP
    from neurodiffeq.solvers import BundleSolver1D
    
    import matplotlib.pyplot as plt
    import numpy as np
    import torch
    from neurodiffeq import diff
  2. We determine the domain of the input t, as well as the domains of the parameters λ and U0. We also need to decide on an ordering of the parameters, namely which should be the first parameter and which should be the second. For the purposes of this demo, we choose λ to be the first parameter (index 0) and U0 to be the second (index 1). It is very important to keep track of the indices of the parameters.

    T_MIN, T_MAX = 0, 1
    LAMBDA_MIN, LAMBDA_MAX = 3, 5  # first parameter,  index = 0
    U0_MIN, U0_MAX = 0.2, 0.6       # second parameter, index = 1
  3. We then define the conditions and solver as usual, except that we use BundleIVP and BundleSolver1D instead of IVP and Solver1D. The interface of these two is very similar to IVP and Solver1D. You can find out more in the API reference.

    # equation parameters come after inputs (usually temporal and spatial coordinates)
    diff_eq = lambda u, t, lmd: [diff(u, t) + lmd * u]
    
    # The keyword argument must be named "u_0" in BundleIVP. If you use anything else, e.g. `y0`, `u0`, etc., it won't work.
    conditions = [
        BundleIVP(t_0=0, u_0=None, bundle_param_lookup={'u_0': 1})  # u_0 has index 1
    ]
    
    solver = BundleSolver1D(
        ode_system=diff_eq,
        conditions=conditions,
        t_min=T_MIN, t_max=T_MAX, 
        theta_min=[LAMBDA_MIN, U0_MIN],  # λ has index 0; u_0 has index 1
        theta_max=[LAMBDA_MAX, U0_MAX],  # λ has index 0; u_0 has index 1
        eq_param_index=(0,),             # λ is the only equation parameter, which has index 0
        n_batches_valid=1,
    )

    Since λ is a parameter in the equation and U0 is a parameter in the initial condition, we must include λ in diff_eq and U0 in the condition. If a parameter is present in both the equation and the condition, it must be included in both places. All elements of conditions passed to BundleSolver1D must be Bundle* conditions, even if they don't have parameters.

  4. Now, we can train it and obtain the solution as we normally would.

    solver.fit(max_epochs=1000)
    solution = solver.get_solution(best=True)

    The solution expects three inputs: t, λ, and U0. All inputs must have the same shape. For example, if you are interested in fixing λ=4 and U0=0.4 and plotting the solution u against t ∈ [0,1], you can do the following:

    t = np.linspace(0, 1)
    lmd = 4 * np.ones_like(t)
    u0 = 0.4 * np.ones_like(t)
    
    u = solution(t, lmd, u0, to_numpy=True)
    
    import matplotlib.pyplot as plt
    plt.plot(t, u)
  5. Once you have a bundled solution, you can find the set of parameters (λ, U0) that most closely matches the observed data points (t_i, u_i). This is achieved using simple gradient descent. In the following toy example, we assume there are only three data points: u(0.2) = 0.273, u(0.5) = 0.129, and u(0.8) = 0.0609. The following is a classical PyTorch workflow.

    # observed data points
    t_obs = torch.tensor([0.2, 0.5, 0.8]).reshape(-1, 1)
    u_obs = torch.tensor([0.273, 0.129, 0.0609]).reshape(-1, 1)
    
    # random initialization of λ and U0; keep track of their gradients
    lmd_tensor = torch.rand(1) * (LAMBDA_MAX - LAMBDA_MIN) + LAMBDA_MIN
    u0_tensor = torch.rand(1) * (U0_MAX - U0_MIN) + U0_MIN
    adam = torch.optim.Adam([lmd_tensor.requires_grad_(True), u0_tensor.requires_grad_(True)], lr=1e-2)
    
    # run gradient descent for 10000 epochs
    for _ in range(10000):
        output = solution(t_obs, lmd_tensor * torch.ones_like(t_obs), u0_tensor * torch.ones_like(t_obs))
        loss = ((output - u_obs) ** 2).mean()
        loss.backward()
        adam.step()
        adam.zero_grad()
       
    print(f"λ = {lmd_tensor.item()}, U0={u0_tensor.item()}, loss = {loss.item()}")

FAQ

Q: How to use GPU for training?

Simple. When you import neurodiffeq, the library automatically detects whether CUDA is available on your machine. Since the library is based on PyTorch, it will set the default tensor type to torch.cuda.DoubleTensor if a compatible GPU device is found.
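A quick way to verify this (assuming a CUDA-enabled PyTorch build is installed):

import torch
import neurodiffeq  # importing the library configures the default tensor type

print(torch.cuda.is_available())  # True if a compatible GPU was found
print(torch.ones(1).device)       # tensors created afterwards live on the GPU in that case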

Q: How to use pretrained nets?

Refer to Sections Custom Networks and Transfer Learning.

Q: How to change the learning rate?

The standard PyTorch way.

  1. Build your networks as explained in Custom Networks: nets = [FCNN(), FCNN(), ...]

  2. Instantiate a custom optimizer and pass all parameters of these networks to it

    parameters = [p for net in nets for p in net.parameters()]  # list of parameters of all networks
    MY_LEARNING_RATE = 5e-3
    optimizer = torch.optim.Adam(parameters, lr=MY_LEARNING_RATE, ...)
  3. Pass BOTH your nets and your optimizer to the solver: solver = Solver1D(..., nets=nets, optimizer=optimizer)

Q: I got a bad solution.

Unlike traditional numerical methods (FEM, FVM, etc.), NN-based solutions require some hyperparameter tuning. The library offers the utmost flexibility to try any combination of hyperparameters.

  • To use a different network architecture, you can pass in your custom torch.nn.Modules.
  • To use a different optimizer, you can pass in your own optimizer to solver = Solver(..., optimizer=my_optim).
  • To use a different sampling distribution, you can use built-in generators or write your own generators from scratch.
  • To use a different sampling size, you can tweak the generators or change solver = Solver(..., n_batches_train=...).
  • To dynamically change hyperparameters during training, check out our callbacks feature.

Q: Any rules of thumb?

  • Don't use ReLU as the activation, because its second-order derivative is identically 0.
  • Re-scale your PDE/ODE into dimensionless form, preferably so that everything ranges in [0,1] (see the sketch after this list). Working with a domain like [0,1000000] is prone to failure because (a) PyTorch initializes module weights to be relatively small and (b) most activation functions (like Sigmoid, Tanh, Swish) are most nonlinear near 0.
  • If your PDE/ODE is too complicated, consider trying curriculum learning. Start training your networks on a smaller domain, and then gradually expand until the whole domain is covered.
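As an illustration of the re-scaling tip above, suppose you want to solve du/dt = -0.001·u, u(0) = 1 on t ∈ [0, 1000]. Substituting τ = t/1000 turns it into du/dτ = -u on τ ∈ [0, 1], which is much easier to train. A sketch, reusing the API from the examples above:

import numpy as np
from neurodiffeq import diff
from neurodiffeq.conditions import IVP
from neurodiffeq.networks import FCNN, SinActv
from neurodiffeq.solvers import Solver1D

# rescaled equation: du/dtau + u = 0 on tau in [0, 1], u(0) = 1
rescaled_eq = lambda u, tau: [diff(u, tau) + u]
conditions = [IVP(t_0=0.0, u_0=1.0)]
nets = [FCNN(actv=SinActv)]

solver = Solver1D(rescaled_eq, conditions, t_min=0.0, t_max=1.0, nets=nets)
solver.fit(max_epochs=1000)
solution = solver.get_solution()

t = np.linspace(0, 1000, 101)
u = solution(t / 1000, to_numpy=True)  # map back to the original time variable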

Contributing

Everyone is welcome to contribute to this project.

When contributing to this repository, please follow the process below:

  1. Open an issue to discuss the change you are planning to make.
  2. Go through Contribution Guidelines.
  3. Make the change on a forked repository and update the README.md if changes are made to the interface.
  4. Open a pull request.

neurodiffeq's People

Contributors

ashelywithmarvin, ashleighbi, at-chantada, codacy-badger, dependabot[bot], devanshkv, dsondak, feiyu-chen96, jdelpiano, joyparikh, lakshay-13, marco-digio, matinmoezzi, mecunha, mr-ravin, pasq-cat, sakthisreeee, sathvikbhagavan, shivasj, shuheng-liu


neurodiffeq's Issues

solve() function should return also the best solution

Currently, the solve() function returns the solution at the last epoch.
It could be useful to also return the best solution, i.e. the solution that achieves the lowest validation loss.
The best solution and the last solution don't always coincide.
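For reference, the newer solver-based API exposes this choice via get_solution; best=True is used in the bundle example above, and best=False is assumed here to be its counterpart:

solution_best = solver.get_solution(best=True)   # weights with the lowest validation loss
solution_last = solver.get_solution(best=False)  # weights from the last epoch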

less code duplication

I find myself always making parallel changes to the ode and pde modules, which involves a lot of copy-pasting. Is it possible to factor out the common logic in solve_system and solve2D_system?

pde module is slow

Especially when there is a derivative of the neural net in the trial solution, and especially when we use splines to approximate trial solutions for problems whose domains have complex shapes.

add validation set

Users should be able to specify a validation set (e.g. in the form of a 'generator', which can have a different density/discretization/distribution from the training generator). Accordingly, the loss on the validation set should be visualized and returned.

Model Saving & Reloading

Hi,

I am studying how transfer learning can enhance the training of physics-informed neural networks. NeuroDiffEq sparked my interest, and I was wondering whether it is possible to

  1. save a trained model, i.e. the parameters of the network and its architecture

  2. reload the saved model and continue training from that non-random state.

Oscillator equation in large domain

Dear Liu,
I am solving an oscillator equation on a large domain, but I am not getting the proper result. In particular, I am trying to solve

y'' + 0.2 (y^2 - 1) y' - y + y^3 = 0.53 cos x
y(0) = 0.1
y'(0) = -0.2
x_min = 0.0, x_max = 50.0

Unifying shapes of tensors throughout the library

Rethink the diff function

At the core of the neurodiffeq library is the diff(x, t) function, which computes the partial derivative ∂x/∂t evaluated at t. Usually, both tensor t and x have shapes of (n_samples, 1). When either x.shape or t.shape is malformed, however, there are cases where things could go wrong due to broadcasting. Such cases are so subtle that they have gone unnoticed for a long time.

All our generators (as defined in neurodiffeq.generator) currently return tensors with shapes (n_samples,) instead of (n_samples, 1). Efforts should be put into unifying the tensor shapes everywhere.

Here are two simple cases for review.

Case 1: Shapes don't matter

In this case, we try different combinations of x.shape and t.shape and check the shape of the output ∂x/∂t, namely:

  • [n, 1] and [n] --> [n]
  • [n] and [n]--> [n]
  • [n, 1] and [n, 1]--> [n,1]
  • [n] and [n, 1]--> [n,1]

To see this, run the following code. Note that d1, d2, d3, and d4, while having different shapes, hold the same values. This is the reason why we incorrectly believed in the soundness of the diff() function.

import torch
from neurodiffeq import diff

n = 10

t = torch.rand(n, requires_grad=True)
x = torch.sin(t)
d1 = diff(x.reshape(-1, 1), t)
d2 = diff(x.reshape(-1), t)

t = t.reshape(-1, 1)
x = torch.sin(t)
d3 = diff(x.reshape(-1, 1), t)
d4 = diff(x.reshape(-1), t)

Case 2: Shapes matter

In this second case, we examine two new operators – div and curl in spherical coordinates – and show that only when x.shape and t.shape are both (n, 1) will the vector identity div(curl(...)) == 0 hold.

Here is the definition of curl and divergence in spherical coordinates

# these two operators have been recently implemented in neurodiffeq.operators
import numpy as np
import torch
from torch import sin
from neurodiffeq import diff

def spherical_curl(u_r, u_theta, u_phi, r, theta, phi):
    d_r = lambda u: diff(u, r)
    d_theta = lambda u: diff(u, theta)
    d_phi = lambda u: diff(u, phi)

    curl_r = (d_theta(u_phi * sin(theta)) - d_phi(u_theta)) / (r * sin(theta))
    curl_theta = (d_phi(u_r) / sin(theta) - d_r(u_phi * r)) / r
    curl_phi = (d_r(u_theta * r) - d_theta(u_r)) / r

    return curl_r, curl_theta, curl_phi


def spherical_div(u_r, u_theta, u_phi, r, theta, phi):
    div_r = diff(u_r * r ** 2, r) / r ** 2
    div_theta = diff(u_theta * sin(theta), theta) / (r * sin(theta))
    div_phi = diff(u_phi, phi) / (r * sin(theta))
    return div_r + div_theta + div_phi

Here we define a vector field q by specifying the rule to compute q given coordinates (r, theta, phi)

def compute_q(r, theta, phi):
    r_theta_phi = torch.stack([r.flatten(), theta.flatten(), phi.flatten()], dim=1)
    W = torch.tensor([
        [.01, .04, .07],
        [.02, .05, .08],
        [.03, .06, .09],
    ])
    q = torch.matmul(r_theta_phi, W)
    q = torch.tanh(q)
    return q[:, 0], q[:, 1], q[:, 2]

We then test the vector identity div(curl(q)) == 0 for q

n = 10

# create r, theta, and phi with shape (n, 1)
r = torch.rand(n, 1, requires_grad=True) + 0.1
theta = torch.rand(n, 1, requires_grad=True) * np.pi
phi = torch.rand(n, 1, requires_grad=True)  * np.pi * 2
q_r, q_theta, q_phi = compute_q(r, theta, phi)

# bind the operators to the r, theta, phi created above
div = lambda u_r, u_theta, u_phi: spherical_div(u_r, u_theta, u_phi, r, theta, phi)
curl = lambda u_r, u_theta, u_phi: spherical_curl(u_r, u_theta, u_phi, r, theta, phi)

div_curl_q1 = div(*curl(q_r.reshape(-1, 1), q_theta.reshape(-1, 1), q_phi.reshape(-1, 1)))
div_curl_q2 = div(*curl(q_r.reshape(-1), q_theta.reshape(-1), q_phi.reshape(-1)))

# create r, theta, and phi with shape (n,)
r = r.reshape(-1)
theta = theta.reshape(-1)
phi = phi.reshape(-1)
q_r, q_theta, q_phi = compute_q(r, theta, phi)

# bind the operators to the r, theta, phi created above
div = lambda u_r, u_theta, u_phi: spherical_div(u_r, u_theta, u_phi, r, theta, phi)
curl = lambda u_r, u_theta, u_phi: spherical_curl(u_r, u_theta, u_phi, r, theta, phi)

div_curl_q3 = div(*curl(q_r.reshape(-1, 1), q_theta.reshape(-1, 1), q_phi.reshape(-1, 1)))
div_curl_q4 = div(*curl(q_r.reshape(-1), q_theta.reshape(-1), q_phi.reshape(-1)))

print(div_curl_q1, div_curl_q2, div_curl_q3, div_curl_q4, sep="\n")

Printing all four div_curl_qs will show that only div_curl_q1 is (approximately) equal to 0, which means both the dependent and independent variables must have shape (n, 1) for the differentiation to work correctly.

bump in Burgers' equation's solution

When solving Burgers' equation, a 'bump' always appears in the residual of the solution. The cause is unknown.

def test_burgers():

    nu = 1
    a = 2  # a parameter for the special case
    T = 0.1

    burgers = lambda u, x, t: diff(u, t) + u * diff(u, x) - nu * diff(u, x, order=2)

    ibvp = IBVP1D(
        x_min=0, x_min_val=lambda t: 0,
        x_max=1, x_max_val=lambda t: 0,
        t_min=0, t_min_val=lambda x: 2 * nu * np.pi * torch.sin(np.pi * x) / (a + torch.cos(np.pi * x))
    )
    net = FCNN(n_input_units=2, n_hidden_units=32, n_hidden_layers=1)

    solution_neural_net_burgers, _ = solve2D(
        pde=burgers, condition=ibvp, xy_min=[0, 0], xy_max=[1, T],
        net=net, max_epochs=300,
        train_generator=ExampleGenerator2D([50, 50], [0, 0], [1, T], method='equally-spaced-noisy'),
        batch_size=64,
        monitor=Monitor2D(check_every=10, xy_min=[0, 0], xy_max=[1, T])
    )

    def solution_analytical_burgers(x, t):
        numer = 2 * nu * np.pi * np.exp(-np.pi ** 2 * nu * t) * np.sin(np.pi * x)
        denom = a + np.exp(-np.pi ** 2 * nu * t) * np.cos(np.pi * x)
        return numer / denom

    xs = np.linspace(0, 1, 101)
    ts = np.linspace(0, T, 101)
    xx, tt = np.meshgrid(xs, ts)
    make_animation(solution_neural_net_burgers, xs, ts) # test animation
    sol_ana = solution_analytical_burgers(xx, tt)
    sol_net = solution_neural_net_burgers(xx, tt, as_type='np')
    assert isclose(sol_net, sol_ana, atol=0.1).all()
    print('Burgers test passed.')

Rename arguments in ode functions/classes

The following changes will be made (while providing backward compatibility)

  1. rename the argument x to u in neurodiffeq.conditions.{IVP,DirichletBVP}
  2. rename the argument x to u in neurodiffeq.neurodiffeq.{diff,safe_diff,unsafe_diff}

Performance of computing partial derivative

Hi,

I am pretty new to neurodiffeq, thank you very much for the excellent library.

I am interested in the way, and the computational speed, of computing partial derivatives w.r.t. the inputs.

Take the forward ODE solver (1D, 1 unknown variable) for example: the input is x, a batch of coordinates, and the output of the neural network is y, the approximated solution at these coordinates. If we view the neural network as a smooth function that simulates the solution and name it f, the forward part of training evaluates y = f(x); for each input element x_i, the neural network gives y_i = f(x_i), where i ranges from 0 to N-1 and N is the batch size. When constructing the loss function, one evaluates the residual of the ODE/PDE, which usually requires evaluating ∂y_i/∂x_i and higher-order derivatives.

My question is related to the way ∂y_i/∂x_i is evaluated. For example, x is an (N, 1) tensor and y is also an (N, 1) tensor, where N is the batch size. If you do autograd.grad(y, t, create_graph=True, grad_outputs=ones, allow_unused=True) as in the lines below
https://github.com/odegym/neurodiffeq/blob/718f226d40cfbcb9ed50d72119bd3668b0c68733/neurodiffeq/neurodiffeq.py#L21-L22
my understanding is that it will evaluate a Jacobian matrix of size (N, N) with elements equal to ∂y_i/∂x_j (i, j from 0 to N-1), regardless of the fact that y_i depends only on x_i, and thus the computation (and storage) of the off-diagonal elements is useless and unnecessary. In other words, the computation could be done by evaluating N gradients, but the current method does N * N.

My questions are:

  1. Is what I state above correct, to your understanding?
  2. If so, do you think this may affect computation speed?
  3. If 1 is correct, do you know of any way, and do you have any plan, to reduce the computation needed?

Thanks!

Tests are taking too long

Tests usually take ~40 mins to run. This delayed feedback can slow down development. Is there a better way to test?

Nonlinear ODE Solving?

I may have missed it, but does the library solve nonlinear ODEs? For example, can it solve dy/dx = 2*y - y^2 + 1 with y_0 = 0, a nonlinear Riccati equation? When I tried to solve it with the code as specified, the results were very wrong. Is there a trick to solving nonlinear equations with this library, or is there a method you can suggest?

Thank you very much.
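For what it's worth, here is a sketch of how this Riccati equation could be set up with the solver API described above; the hyperparameters are illustrative and may still need tuning.

import numpy as np
from neurodiffeq import diff
from neurodiffeq.conditions import IVP
from neurodiffeq.networks import FCNN, SinActv
from neurodiffeq.solvers import Solver1D

# dy/dx = 2y - y^2 + 1,  y(0) = 0
riccati = lambda y, x: [diff(y, x) - (2 * y - y ** 2 + 1)]
conditions = [IVP(t_0=0.0, u_0=0.0)]
nets = [FCNN(actv=SinActv)]

solver = Solver1D(riccati, conditions, t_min=0.0, t_max=2.0, nets=nets)
solver.fit(max_epochs=3000)
solution = solver.get_solution()

xs = np.linspace(0, 2, 101)
y_net = solution(xs, to_numpy=True)
# analytical solution for comparison: y = 1 + sqrt(2) * tanh(sqrt(2) * x + arctanh(-1 / sqrt(2)))
y_ana = 1 + np.sqrt(2) * np.tanh(np.sqrt(2) * xs + np.arctanh(-1 / np.sqrt(2)))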

Potential bug in the differential operator `diff`

Potential Bug?

I think I found an error in our implementation of the differential operator. I'll be drafting a commit to fix this shortly (PR #45). Honestly, I'm still confused about it. It's really strange that we should have such a fundamental bug. And I'm making the changes based on the following example.

Bug Description

Our differentiation function looks like this:

def diff(x, t, order=1):
    ones = torch.ones_like(t)
    der, = autograd.grad(x, t, create_graph=True, grad_outputs=ones)
    for i in range(1, order):
        der, = autograd.grad(der, t, create_graph=True, grad_outputs=ones)
    return der

Here, ones plays the role of the gradients flowing backward from elsewhere. However, in this implementation, we assume x to be of the same shape as t. An error will be raised if x and t have different shapes. For example:

>>> x = torch.arange(10, dtype=torch.float).requires_grad_(True)
>>> y = torch.arange(10, dtype=torch.float).requires_grad_(True)
>>> z = torch.dot(x, y)
>>> diff(z, x)
RuntimeError: Mismatch in shape: grad_output[0] has a shape of torch.Size([10]) and output[0] has a shape of torch.Size([]).

Proposed fix

Change the ones to be of the same shape as x instead of t.
i.e., change

ones = torch.ones_like(t)

to

ones = torch.ones_like(x)

Now, if we re-run the above code, we get

>>> diff(z, x)
tensor([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.], grad_fn=<MulBackward0>)
>>> diff(z, y)
tensor([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.], grad_fn=<MulBackward0>)

shuffle training set

Shuffling the training set (with numpy.random.shuffle or torch.permute) results in a PyTorch error.

About the temporal dimension

Up to now, it is treated as just another spatial dimension by the neural net. Maybe it should be treated differently, because the temporal dimension only has initial conditions and no boundary conditions?

remove tol keyword

The tol and maxiter keywords in the solve/solve_system functions are confusing when they coexist. For now, remove the tol keyword.

better definition of tolerance

Currently, we use the loss of the last epoch to determine whether training should be stopped. We could consider using a running average of the loss instead.
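A sketch of what such a criterion might look like (purely illustrative, not part of the library):

def should_stop(loss_history, window=50, tol=1e-4):
    """Stop when the running average of the loss stops improving."""
    if len(loss_history) < 2 * window:
        return False
    recent = sum(loss_history[-window:]) / window
    previous = sum(loss_history[-2 * window:-window]) / window
    return previous - recent < tol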

Rewrite unit tests

  1. use pytest.fixture instead of module-level variables
  2. wrap warnings with pytest.warns

Tidiness

Things that should be cleaned up:

  • how the library is broken into submodules and classes: Submodules like ode, pde, pde_spherical, tdim(time-dependent PDEs under development) are not mutually exclusive. Many of the functions and classes share very similar logic (e.g. ode.solve_system and pde.solve2D_system). This amplifies the work needed to make changes (e.g. every time I change pde.solve2D_system, I almost always need to make similar changes to ode.solve_system as well). We may want to extract these logics.
  • function signatures are not carefully thought out:
    • keywords whose values can potentially contradict each other, e.g. What if in pde.solve2D_system, the domain indicated by xy_min and xy_max is different from that indicated by train_generator?
    • keywords that come from different abstraction level, e.g. in pde.solve2D_system, there are keywords that come from the realm of differential equations (higher level) and keywords that come from the realm of deep learning (lower level). Maybe it's just me but that feels uncomfortable.
    • keywords that always appear together, e.g. nets and single_net in pde submodule. This suggests that they really should be made into one parameter object.
    • the names and order of keywords are not consistent across the package.
  • conversion between numpy.array and torch.tensor: numpy.array and torch.tensor are converted to each other everywhere and I find myself checking whether a sequence is a numpy.array or torch.tensor all the time. This makes working on / using the code less pleasant. Since the only reason numpy.array is there is because of matplotlib.pyplot and numpy.linalg, we should try to limit the use of numpy.array only to those sections and use torch.tensor in other places. Also, numpy.array and torch.tensor use different default precision, conversion can potentially introduce error.
  • the use of torch.tensor:
    • I was not mindful of the memory footprint of torch.tensor when I wrote the code, there may be cases where requires_grad is set to True when it really should be False. there may also be cases where we can reuse a tensor yet it is re-created.
    • I was not mindful when reshaping torch.tensor. I'm not sure the impact on performance.
  • naming things:
    • non-standard format: function names should not have uppercase letters (e.g. solve2D); upper case variable names are not reserved for constants and enumerations.
    • names that give false impressions: e.g. ExampleGenerator is not a python generator.
    • confusing names: e.g. What is an additional_loss? What is an FCNN? Why are there t_0 and t_1 in ode.DirichletBVP which has nothing to do with time?
  • the use of strings as categorical variables: should have used Enum instead, in other cases (e.g. the as_type keyword of pde.Solution.__call__) should have used a bool instead.

Coupled PDEs

Hello Everyone
Can we use neurodiffeq to solve coupled PDEs in thermoelasticity?

custom loss function

Currently, the loss function is assumed to be a function of the reparameterized output of the neural net alone. What if the user wants a loss function that also takes the independent variable directly as an input?

Solving ODEs - Skipping correct reparametrization

Hi team,

I imagine there is an issue when solving ODEs with an initial x_0_prime=0.

It appears that in this case, the correct reparametrization is skipped, and we suspect the code below from ode.py may be the cause.

        if self.x_0_prime:
            return self.x_0 + (t-self.t_0)*self.x_0_prime + ( (1-torch.exp(-t+self.t_0))**2 )*x
        else:
            return self.x_0 + (1-torch.exp(-t+self.t_0))*x

If self.x_0_prime is 0, then the if block is skipped, giving the second reparametrization.
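A sketch of the fix this reasoning suggests is to test against None explicitly instead of relying on truthiness, so that an explicitly supplied derivative of 0 still uses the first branch:

        if self.x_0_prime is not None:
            return self.x_0 + (t-self.t_0)*self.x_0_prime + ( (1-torch.exp(-t+self.t_0))**2 )*x
        else:
            return self.x_0 + (1-torch.exp(-t+self.t_0))*x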

Fix FCNN `n_hidden_layers`

What will be changed

Following the discussion last Friday, the arguments n_hidden_layers (int) and n_hidden_units (int) in the constructor of neurodiffeq.networks.FCNN will be deprecated in favor of hidden_units (Tuple[int]).

When to release

The change will be released in v0.2.2 instead of v0.3.0.

Main causes of test failures found

Causes of Test Failure

To understand the causes of the massive test failures, consider the following:

  1. PyTorch tensors are (at the low level) strictly typed and do not support dynamic casting. For example, one cannot add a Double tensor to a Float tensor without explicitly casting one to the other's type.

  2. PyTorch supports globally setting the default tensor type via torch.set_default_dtype(...), which will be used for all tensors created subsequently, until the current process exits (see the snippet after this list). If no call to torch.set_default_dtype is made, PyTorch works out the default tensor type using some unknown strategy.

  3. When Python evaluates a script (for import or any other purpose), it runs everything in that module verbatim. In other words, Python does not intelligently evaluate only bar when a user executes from foo import bar. Instead, it runs (evaluates) the entire file foo.py and registers foo.bar in the current namespace.

  4. [To be confirmed] When pytest is executed with no arguments, it searches for all files matching the test_*.py pattern, executes them one at a time (in the same process), registers all target functions beginning with test in all files, and runs all registered targets. Note that in this manner, if default types are specified multiple times in different .py files, only the last call takes effect.

  5. In our case, default tensor types differ among test files. Therefore, running any individual test file yields no error (the common workflow when using an IDE like PyCharm), while running all tests together in a single process results in failures (the case for our travis-ci build config).
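To illustrate points 1-2, here is a minimal snippet (independent of neurodiffeq) showing how the global default propagates to tensors created afterwards:

import torch

print(torch.ones(3).dtype)           # torch.float32 by default

torch.set_default_dtype(torch.float64)
print(torch.ones(3).dtype)           # torch.float64 from now on (unless changed again)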

Behavior

To verify our theory, consider the following ways of running tests:

  1. Scenario A: Running all test files together using
pytest tests/test_*.py 
# or, without arguments
pytest

causes some test cases to fail because of Float vs Double type mismatch

  2. Scenario B: Running each test file one at a time using
for file in tests/test_*.py; do
    pytest $file
done

yields no error (except for the technical debt in test_pde_spherical.py)

Solution

Modify the following line in .travis.yml

  - pytest --cov-report term --cov=neurodiffeq/

to do pytest one file at a time.

I'm not familiar with Travis config's syntax, but probably something like

  - pytest tests/test_function_basis.py
  - pytest tests/test_ode.py
  - pytest tests/test_pde.py
  - pytest tests/test_pde_spherical.py
  - pytest tests/test_temporal.py

Is there a more elegant way of doing this?

Solution diverges for simple 2nd-order linear ODE

I was solving the Taylor-Couette equation, which, under mild assumptions, simplifies to this 2nd-order linear ODE:

Equation Statement

y'' + y'/x - y/x^2 = 0

The solution to the above equation should be

y = A*x + B/x

where A and B are arbitrary constants.

Boundary Condition and Analytical Solution

Under the Dirichlet condition y(0.1) == y(10) == 10.1, the solution can be uniquely determined as y = x + 1/x, which decreases until x == 1 and increases afterwards.

Expected Behaviour

As training proceeds, the net should return a solution that first goes down until x == 1 and then goes back up for x > 1.

Actual Behaviour

However, using the following code, the network gives a solution that keeps straying away from the analytical solution.

import torch
import numpy as np
import matplotlib.pyplot as plt
from neurodiffeq import diff
from neurodiffeq.ode import solve, Monitor, ExampleGenerator
from neurodiffeq.ode import IVP, DirichletBVP
from neurodiffeq.networks import FCNN

net = FCNN(n_input_units=1, n_output_units=1, n_hidden_units=512, n_hidden_layers=0)
x_1 = 0.1
x_2 = 10.0
ode = lambda y, x: diff(y, x, order=2) + diff(y, x) / x - (y / x ** 2)
condition = DirichletBVP(x_1, 10.1, x_2, 10.1)
monitor = Monitor(t_min=x_1, t_max=x_2, check_every=50)
train_generator = ExampleGenerator(256, t_min=x_1, t_max=x_2, method='uniform')
valid_generator = ExampleGenerator(2048, t_min=x_1, t_max=x_2, method='equally-spaced')
monitor.check_every = 50

# solve the ODE
solution, loss, internals = solve(
    ode=ode,
    condition=condition, 
    t_min=x_1, 
    t_max=x_2, 
    monitor=monitor, 
    max_epochs=5000,
    return_internal=True,
    train_generator=train_generator,
    valid_generator=valid_generator,
    batch_size=train_generator.size,
)

Here is a GIF showing how the model performs. Note that not only does the network give a solution that looks drastically different from the analytical one, but the solution is also being scaled in the y-direction. The latter can be deduced from the change in the maximum value of the y-axis over time.

(Figure: animation of the diverging solution during training.)

solving simple ode

I am new to this. I am trying to solve the ODE dx/dt = t, x(0) = 1, but I found a huge difference between the ANN and analytical solutions. Please help me fix my problem. Here is the code:

from neurodiffeq import diff
from neurodiffeq.networks import FCNN
from neurodiffeq.ode import solve, IVP, Monitor, ExampleGenerator
import torch
from torch import nn, optim
import numpy as np
import matplotlib.pyplot as plt

ode = lambda x, t: diff(x,t) - t
t_min, t_max = 0.0, 1.0
N=5
fcnn = FCNN(n_hidden_units=5, n_hidden_layers=1, actv=nn.Tanh)
adam = optim.Adam(fcnn.parameters(), lr=0.001)
init_ode = IVP(t_0= t_min, x_0=1.0)
train_gen = ExampleGenerator(N, t_min= t_min, t_max= t_max, method="equally-spaced-noisy")

solution, loss_history = solve(
    ode=ode,
    condition=init_ode,
    train_generator=train_gen,
    t_min=t_min, t_max=t_max,
    net=fcnn,
    batch_size=N,
    max_epochs=1500,
    optimizer=adam,
    monitor=Monitor(t_min=t_min, t_max=t_max, check_every=100),
)

t_max2 = 4.0
N2 = 5
adam2 = optim.Adam(fcnn.parameters(), lr=0.001)
train_gen2 = ExampleGenerator(N2, t_min= t_min, t_max= t_max2, method="equally-spaced-noisy")

solution, _ = solve(
    ode=ode,
    condition=init_ode,
    train_generator=train_gen2,
    t_min=t_min, t_max=t_max2,
    net=fcnn,
    batch_size=N2,
    max_epochs=1000,
    optimizer=adam2,
    monitor=Monitor(t_min=t_min, t_max=t_max2, check_every=100),
)

ts = np.linspace(t_min, 4.0, 10)
x_ana = (ts**2)/2 +1
x_nn = solution(ts, as_type='np')
plt.figure()
plt.plot(ts, x_nn, label='ANN-based solution')
plt.plot(ts, x_ana, label='analytical solution')
plt.ylabel('x')
plt.xlabel('t')
plt.title('comparing solutions')
plt.legend()
plt.show()

ann solution
array([1. , 1.02371858, 1.15868232, 1.37826193, 1.70709725,
2.2127547 , 2.98944599, 4.11046892, 5.54030202, 7.00839809])

analytical solution
array([1. , 1.09876543, 1.39506173, 1.88888889, 2.58024691,
3.4691358 , 4.55555556, 5.83950617, 7.32098765, 9. ])
