Code Monkey home page Code Monkey logo

kramersmoyal's Introduction

DOI PyPI - License PyPI PyPI - Python Version Build Status codecov Documentation Status

KramersMoyal

kramersmoyal is a python package designed to obtain the Kramers–Moyal coefficients, or conditional moments, from stochastic data of any dimension. It employs kernel density estimations, instead of a histogram approach, to ensure better results for low number of points as well as allowing better fitting of the results.

The paper is now officially published on JOSS. The paper is also available here, or you can find it in the ArXiv.

Installation

To install kramersmoyal, just use pip

pip install kramersmoyal

Then on your favourite editor just use

from kramersmoyal import km

Dependencies

The library depends on numpy and scipy.

A one-dimensional stochastic process

A Jupyter notebook with this example can be found here

The theory

Take, for example, the well-documented one-dimension Ornstein–Uhlenbeck process, also known as Vašíček process, see here. This process is governed by two main parameters: the mean-reverting parameter θ and the diffusion parameter σ

which can be solved in various ways. For our purposes, recall that the drift coefficient, i.e., the first-order Kramers–Moyal coefficient, is given by and the second-order Kramers–Moyal coefficient is , i.e., the diffusion.

Generate an exemplary Ornstein–Uhlenbeck process with your favourite integrator, e.g., the Euler–Maruyama or with a more powerful tool from JiTCSDE found on GitHub. For this example let's take θ=.3 and σ=.1, over a total time of 500 units, with a sampling of 1000 Hertz, and from the generated data series retrieve the two parameters, the drift -θy(t) and diffusion σ.

Integrating an Ornstein–Uhlenbeck process

Here is a short code on generating a Ornstein–Uhlenbeck stochastic trajectory with a simple Euler–Maruyama integration method

# integration time and time sampling
t_final = 500
delta_t = 0.001

# The parameters theta and sigma
theta = 0.3
sigma = 0.1

# The time array of the trajectory
time = np.arange(0, t_final, delta_t)

# Initialise the array y
y = np.zeros(time.size)

# Generate a Wiener process
dw = np.random.normal(loc=0, scale=np.sqrt(delta_t), size=time.size)

# Integrate the process
for i in range(1,time.size):
    y[i] = y[i-1] - theta*y[i-1]*delta_t + sigma*dw[i]

From here we have a plain example of an Ornstein–Uhlenbeck process, always drifting back to zero, due to the mean-reverting drift θ. The effect of the noise can be seen across the whole trajectory.

Using kramersmoyal

Take the timeseries y and let's study the Kramers–Moyal coefficients. For this let's look at the drift and diffusion coefficients of the process, i.e., the first and second Kramers–Moyal coefficients, with an epanechnikov kernel

# The kmc holds the results, where edges holds the binning space
kmc, edges = km(y, powers=2)

This results in

Notice here that to obtain the Kramers–Moyal coefficients you need to divide kmc by the timestep delta_t. This normalisation stems from the Taylor-like approximation, i.e., the Kramers–Moyal expansion (delta t → 0).

A two-dimensional diffusion process

A Jupyter notebook with this example can be found here

Theory

A two-dimensional diffusion process is a stochastic process that comprises two and allows for a mixing of these noise terms across its two dimensions.

2D-diffusion

where we will select a set of state-dependent parameters obeying

2D-diffusion

with and .

Choice of parameters

As an example, let's take the following set of parameters for the drift vector and diffusion matrix

# integration time and time sampling
t_final = 2000
delta_t = 0.001

# Define the drift vector N
N = np.array([2.0, 1.0])

# Define the diffusion matrix g
g = np.array([[0.5, 0.0], [0.0, 0.5]])

# The time array of the trajectory
time = np.arange(0, t_final, delta_t)

Integrating a 2-dimensional process

Integrating the previous stochastic trajectory with a simple Euler–Maruyama integration method

# Initialise the array y
y = np.zeros([time.size, 2])

# Generate two Wiener processes with a scale of np.sqrt(delta_t)
dW = np.random.normal(loc=0, scale=np.sqrt(delta_t), size=[time.size, 2])

# Integrate the process (takes about 20 secs)
for i in range(1, time.size):
    y[i,0] = y[i-1,0]  -  N[0] * y[i-1,0] * delta_t + g[0,0]/(1 + np.exp(y[i-1,0]**2)) * dW[i,0]  +  g[0,1] * dW[i,1]
    y[i,1] = y[i-1,1]  -  N[1] * y[i-1,1] * delta_t + g[1,0] * dW[i,0]  +  g[1,1]/(1 + np.exp(y[i-1,1]**2)) * dW[i,1]

The stochastic trajectory in 2 dimensions for 10 time units (10000 data points)

2D-diffusion

Back to kramersmoyal and the Kramers–Moyal coefficients

First notice that all the results now will be two-dimensional surfaces, so we will need to plot them as such

# Choose the size of your target space in two dimensions
bins = [100, 100]

# Introduce the desired orders to calculate, but in 2 dimensions
powers = np.array([[0,0], [1,0], [0,1], [1,1], [2,0], [0,2], [2,2]])
# insert into kmc:   0      1      2      3      4      5      6

# Notice that the first entry in [,] is for the first dimension, the
# second for the second dimension...

# Choose a desired bandwidth bw
bw = 0.1

# Calculate the Kramers−Moyal coefficients
kmc, edges = km(y, bw=bw, bins=bins, powers=powers)

# The K−M coefficients are stacked along the first dim of the
# kmc array, so kmc[1,...] is the first K−M coefficient, kmc[2,...]
# is the second. These will be 2-dimensional matrices

Now one can visualise the Kramers–Moyal coefficients (surfaces) in green and the respective theoretical surfaces in black. (Don't forget to normalise: kmc / delta_t).

2D-diffusion

Contributions

We welcome reviews and ideas from everyone. If you want to share your ideas or report a bug, open an issue here on GitHub, or contact us directly. If you need help with the code, the theory, or the implementation, do not hesitate to contact us, we are here to help. We abide to a Conduct of Fairness.

TODOs

Next on the list is

  • Include more kernels
  • Work through the documentation carefully

Changelog

  • Version 0.4.1 - Changing CI. Correcting kmc[0,:] normalisation. Various Simplifications. Bins as ints, powers as ints.
  • Version 0.4.0 - Added the documentation, first testers, and the Conduct of Fairness
  • Version 0.3.2 - Adding 2 kernels: triagular and quartic and extending the documentation and examples.
  • Version 0.3.1 - Corrections to the fft triming after convolution.
  • Version 0.3.0 - The major breakthrough: Calculates the Kramers–Moyal coefficients for data of any dimension.
  • Version 0.2.0 - Introducing convolutions and gaussian and uniform kernels. Major speed up in the calculations.
  • Version 0.1.0 - One and two dimensional Kramers–Moyal coefficients with an epanechnikov kernel.

Literature and Support

Literature

The study of stochastic processes from a data-driven approach is grounded in extensive mathematical work. From the applied perspective there are several references to understand stochastic processes, the Fokker–Planck equations, and the Kramers–Moyal expansion

  • Tabar, M. R. R. (2019). Analysis and Data-Based Reconstruction of Complex Nonlinear Dynamical Systems. Springer, International Publishing
  • Risken, H. (1989). The Fokker–Planck equation. Springer, Berlin, Heidelberg.
  • Gardiner, C.W. (1985). Handbook of Stochastic Methods. Springer, Berlin.

You can find and extensive review on the subject here1

History

This project was started in 2017 at the neurophysik by Leonardo Rydin Gorjão, Jan Heysel, Klaus Lehnertz, and M. Reza Rahimi Tabar. Francisco Meirinhos later devised the hard coding to python. The project is now supported by Dirk Witthaut and the Institute of Energy and Climate Research Systems Analysis and Technology Evaluation.

Funding

Helmholtz Association Initiative Energy System 2050 - A Contribution of the Research Field Energy and the grant No. VH-NG-1025 and STORM - Stochastics for Time-Space Risk Models project of the Research Council of Norway (RCN) No. 274410.


1 Friedrich, R., Peinke, J., Sahimi, M., Tabar, M. R. R. Approaching complexity by stochastic methods: From biological systems to turbulence, Phys. Rep. 506, 87–162 (2011).

kramersmoyal's People

Contributors

arfon avatar fmeirinhos avatar kthyng avatar lrydin avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

kramersmoyal's Issues

Errors in the readme [joss.01693]

I'm just going through your submission to JOSS and following through your readme I can't reproduce your results exactly.

  1. The code errors since kernel.epanechnikov is an unknown name (it should be kernels.epanechnikov).
  2. The code errors saying that the first power should be zero.

Fixing those two mistakes I then get the results that I plot using

plot(edges[0], kmc[1])  # the first coefficient

There is some distortion at the extremes (I presume to be expected due to low numbers of samples?) and it shows the linear trend demonstrated in the readme figures but the values are a factor of 1000 different.

A couple of minor points that don't affect the calculation but are more about style:

  1. y = np.zeros([time.size]) could simply be y = np.zeros(time.size), and
  2. size=[time.size,1] could simply be size=time.size.

Clarifying normalization

First, thank you so much for writing this package! Being able to quickly call a function to compute KM coefficients has been a considerable help in my current research.

I have a clarification to suggest in the main README.md regarding the normalization. Currently it is stated that one should multiply kmc by delta_t to obtain the actual Kramers-Moyal coefficients, but if I understand correctly one should in fact divide by delta_t.

Bandwidth?

Dear authors,

I'm sorry to pollute the github project, but can you comment a little bit on the bandwidth parameter?
It's the only free parameter and it seems to have an important influence on the results, but it is not discussed in the docs nor on the paper.

Thank you for this great work,
best

Community guidelines, code of conduct, issue and PR template

One of the JOSS requirements is to have community guidelines and code of conduct for interacting with the project. Authors can link pointers from the README to options under insights about:

Contribute to the software
Seek support for usage (authors have added this in README and can link it to the guidelines)
Report issues or PR (if the authors think users need to report in a specific style)

There are helpful templates created by GitHub to do these in Insights from the bar just under the repository name.

(Part of openjournals/joss-reviews#1693)

Help with Methodology (vectors' angles from Kramers-Moyal coefficients)

Hello, I came across your package in my effort to replicate the methodology from paper:
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1000061
where in the Data Analysis section they use the Kramers-Moyal coefficients to get the drift coefficients of the vectors and subsequently the angle of neighbor vectors in the phase space in order to make conclusions about the dynamics of the flow in the state space.
I don't know if you would be interested in helping me a bit with the code. I actually want to do it for a 3d space, but even in 2d i don't quite understand how to get the angles from the output of the km function since i get this bins x bins x bins vector.
I hope I was clear. Thanks a lot in advance and apologies if my issue fall out of the scope of the issues section :)

TypeError raised by test

I ran into the following error while executing one of the tests in test directory. Maybe I'm missing something and the authors can either fix or add instruction to specify usage. Other tests ran just fine.

run test/binning_test.py


TypeError Traceback (most recent call last)
~/Documents/review-JOSS/KramersMoyal/test/binning_test.py in
15
16 hist1 = [np.histogramdd(timeseries, bins=bins,
---> 17 weights=w, density=True)[0] for w in weights.T]
18
19 hist2 = histogramdd(timeseries, bins=bins,

~/Documents/review-JOSS/KramersMoyal/test/binning_test.py in (.0)
15
16 hist1 = [np.histogramdd(timeseries, bins=bins,
---> 17 weights=w, density=True)[0] for w in weights.T]
18
19 hist2 = histogramdd(timeseries, bins=bins,

TypeError: histogramdd() got an unexpected keyword argument 'density'

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.