Code Monkey home page Code Monkey logo

dj-barotropic-gyre's Introduction

Differentiable ocean model

We aim to create a differentiable barotropic gyre model (i.e. just a simple ocean model) in the Julia ecosystem. At this point there are two threads:

  1. Using the Julia packages Oceananigans and Enzyme. Oceananigans is an existing, efficient Julia package that contains all the code we need for a barotropic gyre model, at which point we would just differentiate this code. The files barotropic_gyre_exp.jl, barotropic_gyre_original.jl, and barotropic_gyre_singlestep.jl are explicit barotropic gyre solvers written using Oceananigans functions (thanks to Greg Wagner for creating these!). We have yet to add Enzyme successfully, but will continue to work on this.

  2. The second direction is a different barotropic gyre model using a fully explicit solver (in this case RK4). This code lives in the folder explicit_solver, and is based on Python code written by Milan Kloewer (found here: https://github.com/milankl/swm).

The explicit solver code is currently working with both Enzyme and Checkpointing. The file main_energy_chkp.jl runs a sample adjoint problem and computes the sensitivity of the final energy with respect to the initial conditions. The derivative was checked with a straightforward finite difference approximation. Caution should be used when selecting both grid resolution and the number of snaps (checkpoints) for Checkpointing to do, the integration time increases quite a bit with resolution size (in my experience.)

dj-barotropic-gyre's People

Contributors

glwagner avatar milankl avatar swilliamson7 avatar vchuravy avatar

Stargazers

 avatar  avatar  avatar

Watchers

 avatar

dj-barotropic-gyre's Issues

Enzyme and structures and Oceananigans

A potentially vague question/issue regarding how Enzyme works. In order to run the Oceananigans barotropic gyre model, we need to initialize this structure

model = HydrostaticFreeSurfaceModel(; grid, closure,
                                   free_surface = ExplicitFreeSurface(),
                                    momentum_advection = VectorInvariant(),
                                    coriolis = HydrostaticSphericalCoriolis(),
                                    boundary_conditions = (u=u_bcs, v=v_bcs),
                                    tracers = nothing,
                                    buoyancy = nothing)

which contains all the needed information about the model that we're running, and so needs to be given to time_step! to update the velocity fields and so on. The function that I call autodiff on currently takes this as input, and so we need to wrap it in Duplicated, but the vast majority of the entries in model are not something we want to take the derivative with respect to. Is there a way to tell Enzyme to take this into account?

Another weird thing is that the choices (such as ExplicitFreeSurface()) will impact the derivative, so we do still want Enzyme to go through those functions, meaning they shouldn't be ignored in their entirety (if any of that makes sense). I guess I'm a little lost about how best to set this up to try and make the MWE of Enzyme and Oceananigans not getting along, but I think understanding how to deal with this structure is at least a start.

Issue with enzyme/code

Tried to apply Enzyme to time_step!, got to an Enzyme compilation failure. Since time_step! has no obvious input/output, I tried to style a function around time_step! that would take a single step using given input. Trying this both with the original and explicit setups for the barotropic gyre led to the same failure.

Code:

    function ad_func(u_in, v_in, u_out, v_out)

    set!(model, u = copy(u_in), v = copy(v_in))

    time_step!(model, Δt)

    # @show temp_u_out
    # @show temp_v_out

    u_out = model.velocities.u.data
    v_out = model.velocities.v.data

    @show maximum(u_out)
    @show maximum(v_out)

    return nothing

    end

   @info "Before taking a time step"
   @show maximum(model.velocities.u)

   u, v = model.velocities
   A = zeros(67,66,7)
   B = zeros(66,67,7)
   d_u_in = OffsetArray(A, -2:64, -2:63, -2:4)
   d_v_in = OffsetArray(B, -2:63, -2:64, -2:4)
   autodiff(ad_func, Duplicated(u.data, d_u_in), Duplicated(v.data, d_v_in), 
       Duplicated(OffsetArray(zeros(67,66,7), -2:64, -2:63, -2:4), OffsetArray(ones(67,66,7), -2:64, -2:63, -2:4)), 
       Duplicated(OffsetArray(zeros(66,67,7), -2:63, -2:64, -2:4), OffsetArray(ones(66,67,7), -2:63, -2:64, -2:4)))

   # ad_func(u, v, d_u_in, d_v_in)

   # @info "After taking a time step"
   # @show maximum(model.velocities.u)

Potential speedups

  1. Diffusive terms only once

Move

https://github.com/DJ4Earth/DJ-barotropic-gyre/blob/7237a9fc785d545e1d46fdf014c18b5b8b716264/explicit_solver/compute_time_deriv.jl#LL32C1-L33C94

to the end of all substeps and time step forward with dt, in ShallowWaters.jl we do this like

https://github.com/milankl/ShallowWaters.jl/blob/d1609c0be39f279d23d237887870405156c4b0ca/src/time_integration.jl#L263-L265

so after the 4 RK4 substeps have been computed, compute the diffusive terms and add them to u,v with dt in an Euler forward fashion. The reason for this is that the current time step is constrained by the gravity wave speed and the diffusive terms can have larger time steps.

  1. Include PV advection in the RK substeps but only update them once every time step

https://github.com/milankl/ShallowWaters.jl/blob/d1609c0be39f279d23d237887870405156c4b0ca/src/time_integration.jl#L251C2-L257

Especially the Arakawa and Lamb scheme is expensive, so instead of calculating $qhu, qhv$ on every substep, you can do $qh_0u_0, qh_0v_0$ where the 0 is the state of the model at $t=t_0$ and therefore the term is kept constant till $t = t_0 + dt$. The reason is that because it's a fully explicit model you don't have to update the slow terms as often as those that contain the fast gravity waves.

  1. Move to a baroclinic system

in the barotropic system the time step is determined by resolving the gravity waves at speed $\sqrt{gH}$, so reducing g will slow down the gravity waves, but not affect the non-linear terms. In ShallowWaters.jl I went for $g=0.1m/s^2$, which reduces the speed from 70m/s to 7m/s, such that similar in speed as eddies. You basically just get more turbulence for the same compute time.

https://github.com/milankl/ShallowWaters.jl/blob/d1609c0be39f279d23d237887870405156c4b0ca/src/default_parameters.jl#LL15C1-L16C70

  1. Combinations

You can probably combine 1) and 3) but I'd be careful to combine 2) and 3) because with 3) the physical argument in 2) doesn't hold anymore

Explicit solver running slowly

After moving Milan's Python code over I'm finding that the Julia equivalent runs slowly. The script checking3.jl runs everything needed, specifically the function main(Tspinup, Trun, nx, ny) will integrate for the input Tspinup (given in days), on a designated grid size nx by ny. Keep nx, ny on the order of 50 to ensure that integration doesn't take too long, for example running

include("checking3.jl")
states, states_mat = main(365, 0, 20, 20);

shouldn't take too long to integrate, but increasing nx, ny will cause the slowdowns.

How to adjoint the barotropic gyre model from Oceananigans

As of right now, the steps to get to an adjoint-able barotropic gyre model are

  1. Create a "forward step" function. Right now the barotropic gyre model in Oceananigans takes in some initial inputs, and then just runs the model forward in its entirety to return the fields $u,$ $v,$ and $\eta$ at the end, if understood correctly. Ideally, we would instead take just one forward step:
    $$f(\mathbf{u}(t), \mathbf{v}(t), \mathbf{\eta}(t)) = (\mathbf{u}(t + \Delta t), \mathbf{v}(t + \Delta t), \mathbf{\eta}(t + \Delta t) )$$
    (not trying to say $f$ returns a vector, just returns the fields at the next time step.)

  2. Once we have that forward function, we first want to apply Enzyme to see if it works. This shouldn't be difficult at this point (hopefully). This step will also let us see if there are issues with Enzyme + Oceananigans that would need to be worked out.

  3. Then we can run a trial sensitivity analysis. Namely, per Patrick's suggestion we can choose the cost function $$J(\mathbf{u}(t_f), \mathbf{v}(t_f)) = \mathbf{u}^2(t_f) + \mathbf{v}^2(t_f) = \sum_{j, k} u_{jk}^2 + v_{jk}^2$$ (i.e. we sum $u^2 + v^2$ at every point on the grid and add them all together.) Then we can ask what happens to this quantity if we change the initial $u$ field a little bit. This will just be running the "adjoint equations" step-by-step backwards.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.