Code Monkey home page Code Monkey logo

dbanomaly's Introduction

tspDB

tspDB enables predictive query functionality in PostgreSQL by building an additional “prediction index” for a collection of time-series of interest.

A prediction index will allow for fast data retrieval, but for entries that are:

1- At a future time step (i.e. forecasting);

2- Missing/corrupted by noise (i.e. imputation)

Our paper here provides more information about how tspDB works and its performance.  

This work has the following dependencies:

  • PostgreSQL 12+
  • Python 3.5+ with the libraries: (numpy, pandas, scipy, sklearn)

Installation

For insallation instruction, go to the installation page here

Getting Started

The main functionalities of tspDB is enabling predictive queries, which are enabled via creating a prediction index on your time series table. The index is created via the function create_pindex(). which you can use as follow:

select create_pindex('tablename','time_column_name','{"value_column_name"}','index_name')

To get you familiar with tspDB capabilities, we provided a testing function that will create a set of time series tables in your database. The test function will also create several prediction indices. Run the function from your postgres terminal

SELECT test_tspdb();

if you get at the last line

NOTICE:  Pindices successfully created

then the test has passed. Now we can check the predictive indices the test has created through

SELECT * FROM list_pindices();

You will see the three predictive indices created by the test. Now let's create our own predictive index, which we will call 'pindex1' on the time series table mixturets2. The prediction index is created on the time column time and the value column ts_7:

SELECT create_pindex('mixturets2','time','{"ts_7"}','pindex1');

we can see our newly created index by running list_pindices again:

SELECT * FROM list_pindices();

Let's now use that prediction index to produce some predictions! Let's for example predict at a time t that exists in the database. Effectively, we are denoising the existing observation or imputing a null observation. For example, at time 1, ts_7 has a null value as you can see by running:

SELECT ts_7 FROM mixturets2 WHERE time = 1;

Let's impute this point by running:

SELECT * FROM predict('mixturets2','ts_7',1,'pindex1');

Which will return predictions as well as upper an lower bound for a 95% confidence interval. We can get a tighter bound with lower confidence by changing the confidence interval to, say 80%:

SELECT * FROM predict('mixturets2','ts_7',1,'pindex1', c=> 80);

The prediction index also support forecasting queries using the same function. For example, you can forecast the value of column ts_7 at time 100010, ten points ahead of what exists in the database by running:

SELECT * FROM predict('mixturets2','ts_7',100010,'pindex1');

In a similar fashion, you can execute range predictive queries using predict(). For example, we can impute the first hundered points of ts_7 using:

SELECT * FROM predict('mixturets2','ts_7',0,100,'pindex1');

or forecast the next 10 points using:

SELECT * FROM predict('mixturets2','ts_7',100001,100010,'pindex1');

For further examples, check the python notebook examples here

Contributing

Please visit our Github page for more information

License

This work is licensed under the Apache 2.0 License.

dbanomaly's People

Contributors

abdullaho avatar adrienbrault avatar dependabot[bot] avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.