Code Monkey home page Code Monkey logo

pypcga's Introduction

pyPCGA

Python library for Principal Component Geostatistical Approach

version 0.1

updates

  • Exact preconditioner construction (inverse of cokriging/saddle-point matrix) using generalized eigendecomposition [Lee et al., WRR 2016, Saibaba et al, NLAA 2016]
  • Fast hyperparameter tuning and predictive model validation using cR/Q2 criteria [Kitanidis, Math Geol 1991] ([Lee et al., 2021 in preparation])
  • Fast posterior variance/std computation using exact preconditioner

version 0.2 will include

  • automatic covariance model parameter calibration with nearshore application example
  • link with FMM and HMatrix to support unstructured grids

Installation

python -m pip install git+https://github.com/jonghyunharrylee/pyPCGA.git

Courses

Example Notebooks

1D linear inversion example below will be helpful to understand how pyPCGA can be implemented. Please check Google Colab examples.

Credits

pyPCGA is based on Lee et al. [2016] and currently used for Stanford-USACE ERDC project led by EF Darve and PK Kitanidis and NSF EPSCoR `Ike Wai project.

Code contributors include:

  • Jonghyun Harry Lee
  • Matthew Farthing
  • Ty Hesser (STWAVE example)

FFT-based matvec code is adapted from Arvind Saibaba's work (https://github.com/arvindks/kle).

FMM-based code (https://arxiv.org/abs/1903.02153) will be incorporated in version 0.2

References

  • J Lee, H Yoon, PK Kitanidis, CJ Werth, AJ Valocchi, "Scalable subsurface inverse modeling of huge data sets with an application to tracer concentration breakthrough data from magnetic resonance imaging", Water Resources Research 52 (7), 5213-5231

  • AK Saibaba, J Lee, PK Kitanidis, Randomized algorithms for generalized Hermitian eigenvalue problems with application to computing Karhunen–Loève expansion, Numerical Linear Algebra with Applications 23 (2), 314-339

  • J Lee, PK Kitanidis, "Large‐scale hydraulic tomography and joint inversion of head and tracer data using the Principal Component Geostatistical Approach (PCGA)", WRR 50 (7), 5410-5427

  • PK Kitanidis, J Lee, Principal Component Geostatistical Approach for large‐dimensional inverse problems, WRR 50 (7), 5428-5443

Applications

  • T. Kadeethum, D. O'Malley, JN Fuhg, Y. Choi, J. Lee, HS Viswanathan and N. Bouklas, A framework for data-driven solution and parameter estimation of PDEs using conditional generative adversarial networks, Nature Computational Science, 819–829, 2021

  • J Lee, H Ghorbanidehno, M Farthing, T. Hesser, EF Darve, and PK Kitanidis, Riverine bathymetry imaging with indirect observations, Water Resources Research, 54(5): 3704-3727, 2018

  • J Lee, A Kokkinaki, PK Kitanidis, Fast large-scale joint inversion for deep aquifer characterization using pressure and heat tracer measurements, Transport in Porous Media, 123(3): 533-543, 2018

  • PK Kang, J Lee, X Fu, S Lee, PK Kitanidis, J Ruben, Improved Characterization of Heterogeneous Permeability in Saline Aquifers from Transient Pressure Data during Freshwater Injection, Water Resources Research, 53(5): 4444-458, 2017

  • S. Fakhreddine, J Lee, PK Kitanidis, S Fendorf, M Rolle, Imaging Geochemical Heterogeneities Using Inverse Reactive Transport Modeling: an Example Relevant for Characterizing Arsenic Mobilization and Distribution, Advances in Water Resources, 88: 186-197, 2016

pypcga's People

Contributors

antoinecollet5 avatar jonghyunharrylee avatar mahtaw avatar mfarthin avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

pypcga's Issues

problem using linear trend in stwave example

@jonghyunharrylee when I try to specify a linear trend using the 'drift':'linear' flag in example_inv_stwave.py I get the following traceback

***** Iteration 1 ******
computed Jacobian-Matrix products in 189.261032 secs
solve saddle point (co-kriging) systems with Levenberg-Marquardt
Traceback (most recent call last):
  File "example_inv_stwave.py", line 90, in <module>
    s_hat, simul_obs, post_diagv, iter_best = prob.Run()
  File "/Users/rdchlmwf/Public/code/pyPCGA/pyPCGA/pcga.py", line 1254, in Run
    s_hat, simul_obs, post_diagv, iter_best = self.GaussNewton()
  File "/Users/rdchlmwf/Public/code/pyPCGA/pyPCGA/pcga.py", line 1157, in GaussNewton
    s_cur, beta_cur, simul_obs_cur, obj = self.LinearIteration(s_past, simul_obs)
  File "/Users/rdchlmwf/Public/code/pyPCGA/pyPCGA/pcga.py", line 1071, in LinearIteration
    s_hat, beta, simul_obs_new = self.IterativeSolve(s_cur, simul_obs, precond = precond)
  File "/Users/rdchlmwf/Public/code/pyPCGA/pyPCGA/pcga.py", line 902, in IterativeSolve
    x, info = gmres(Afun, b, tol=itertol, restart=restart, maxiter=solver_maxiter, callback=callback, M=P)
  File "<decorator-gen-5>", line 2, in gmres
  File "/Users/rdchlmwf/anaconda2/envs/enkf/lib/python2.7/site-packages/scipy/_lib/_threadsafety.py", line 46, in caller
    return func(*a, **kw)
  File "/Users/rdchlmwf/anaconda2/envs/enkf/lib/python2.7/site-packages/scipy/sparse/linalg/isolve/iterative.py", line 460, in gmres
    work[slice1] = psolve(work[slice2])
  File "/Users/rdchlmwf/anaconda2/envs/enkf/lib/python2.7/site-packages/scipy/sparse/linalg/interface.py", line 219, in matvec
    y = self._matvec(x)
  File "/Users/rdchlmwf/anaconda2/envs/enkf/lib/python2.7/site-packages/scipy/sparse/linalg/interface.py", line 471, in _matvec
    return self.__matvec_impl(x)
  File "/Users/rdchlmwf/Public/code/pyPCGA/pyPCGA/pcga.py", line 892, in Pmv
    S = np.dot(HX.T, invPsi(HX)) # p by p matrix
  File "/Users/rdchlmwf/Public/code/pyPCGA/pyPCGA/pcga.py", line 878, in invPsi
    return np.multiply(np.multiply((1./alpha[i]),self.invR),v) - np.dot(Psi_U_i, np.multiply(Dvec[:Psi_U_i.shape[1]].reshape(Psi_UTv.shape), Psi_UTv))
ValueError: cannot reshape array of size 69 into shape (69,3)

I get a similar complaint execept the size is (69,2) when I try to manually set X. Do I need to set another flag? Thanks

Decide how we want to provide access to the adh executables?

The pre-compiled adh executables in the adh_red_inset application won't run on our blade server. We may want to point users to AdH website for the binary versions they distribute.

I suppose if we distribute in a docker container we would be ok (either just adh as a docker executable or the whole PCGA application).

No mf.hds output file

Amazing project!
I just compiled mf2005 on debian 11 and can run it just fine but I can't run the inversion in this example, because Modflow doesn't create a head file (I am using Python 3.9.13). All the input files are properly generated, but I think that the simulation doesn't terminate.
Have you noted this before?
Bernard

FileNotFoundError Traceback (most recent call last)
Cell In [2], line 1
----> 1 s_hat, simul_obs, post_diagv, iter_best = prob.Run()

File ~/anaconda3/envs/hlflopy/lib/python3.9/site-packages/pyPCGA/pcga.py:1334, in PCGA.Run(self)
1331 if self.priorU is None or self.priord is None:
1332 self.ComputePriorEig()
-> 1334 s_hat, simul_obs, post_diagv, iter_best = self.GaussNewton()
1335 #start = time()
1336 print("** Total elapsed time is %f secs" % (time()-start))

File ~/anaconda3/envs/hlflopy/lib/python3.9/site-packages/pyPCGA/pcga.py:1219, in PCGA.GaussNewton(self)
1216 print('##### 4. Start PCGA Inversion #####')
1217 print('-- evaluate initial solution')
-> 1219 simul_obs_init = self.ForwardSolve(s_init)
1220 self.simul_obs_init = simul_obs_init
1221 RMSE_init = np.linalg.norm(simul_obs_init-self.obs)/np.sqrt(n)

File ~/anaconda3/envs/hlflopy/lib/python3.9/site-packages/pyPCGA/pcga.py:379, in PCGA.ForwardSolve(self, s)
377 else:
378 with HiddenPrints():
--> 379 simul_obs = self.forward_model(s,par)
381 simul_obs = simul_obs.reshape(-1,1)
383 return simul_obs
...
--> 218 f = open(filename, "rb")
219 f.seek(0, 2)
220 totalbytes = f.tell()

FileNotFoundError: [Errno 2] No such file or directory: '/home/berna/modflow/pyPCGA/examples/modflow_flopy/simul/simul0000/mf.hds'

[ENH] Get reproducible results with PCGA (set the initial vector used by scipy.sparse.linalg.eigsh ...)

Up to now, when running pyPCGA twice or more with the same parameters, the results are slightly different each time.

This is what I attribute to the covariance matrix low-rank approximation relying on scipy.sparse.linalg.eigsh beauce the initial vector v0 is not provided and consequently chosen randomly.

The solution would be to let the possibility for the user set a seed (aka random_state) to generate a reproducible v0.

See: https://stackoverflow.com/a/52403508

Missing stwave_utils

Hi

I have been trying to run the stwave example and I cant find stwave_utils to install. Here is the trace back

Thanks

import stwave as st

File "stwave.py", line 4, in
from stwave_utils.stwave_forward_problem import STWaveProblem

ImportError: No module named stwave_utils.stwave_forward_problem

Release on pypi ?

Hi !

That's a great piece of code you have there. Do you consider adding a tag for v0.1.0 and pushing it to PyPI one day ?

model size

I have a 2d model which produces 11*11 result, I resize it to 1d.
I run and I get the following error

PCGA Inversion
1. Initialize forward and inversion parameters

------------ Inversion Parameters -------------------------
Number of unknowns : 121
Number of observations : 121
Number of principal components (n_pc) : 50
Prior model : def kernel(r): return (prior_std ** 2) * np.exp(-r)

Prior variance : 4.000000e-02
Prior scale (correlation) parameter : [71. 71.]
Posterior cov computation : diag
Posterior variance computation : Direct
Number of CPU cores (n_core) : 4
Maximum GN iterations : 10
machine precision (delta = sqrt(precision)) : 1.000000e-08
Tol for iterations (norm(sol_diff)/norm(sol)) : 1.000000e-02
Levenberg-Marquardt (LM) : True
LM solution range constraints (LM_smin, LM_smax) : None, None
Line search : True

2. Construct Prior Covariance Matrix
  • time for covariance matrix construction (m = 121) is 0 sec
3. Eigendecomposition of Prior Covariance
  • time for eigendecomposition with k = 50 is 0 sec
  • 1st eigv : 4.46991, 50-th eigv : 0.000505994, ratio: 0.0001132
4. Start PCGA Inversion

-- evaluate initial solution
obs. RMSE (norm(obs. diff.)/sqrt(nobs)): 18.3563, normalized obs. RMSE (norm(obs. diff./sqrtR)/sqrt(nobs)): 1835.63
***** Iteration 1 ******
Traceback (most recent call last):
File "/xxxx/scripts/inv_GaussPlume.py", line 57, in
s_hat, simul_obs, post_diagv, iter_best = prob.Run()
File "/usr/local/lib/python3.8/dist-packages/pyPCGA-0.1.0-py3.8.egg/pyPCGA/pcga.py", line 1332, in Run
s_hat, simul_obs, post_diagv, iter_best = self.GaussNewton()
File "/usr/local/lib/python3.8/dist-packages/pyPCGA-0.1.0-py3.8.egg/pyPCGA/pcga.py", line 1235, in GaussNewton
s_cur, beta_cur, simul_obs_cur, obj = self.LinearIteration(s_past, simul_obs)
File "/usr/local/lib/python3.8/dist-packages/pyPCGA-0.1.0-py3.8.egg/pyPCGA/pcga.py", line 1149, in LinearIteration
s_hat, beta, simul_obs_new = self.IterativeSolve(s_cur, simul_obs, precond = precond)
File "/usr/local/lib/python3.8/dist-packages/pyPCGA-0.1.0-py3.8.egg/pyPCGA/pcga.py", line 773, in IterativeSolve
HX, HZ, Hs, U_data = self.JacMat(s_cur, simul_obs, Z)
File "/usr/local/lib/python3.8/dist-packages/pyPCGA-0.1.0-py3.8.egg/pyPCGA/pcga.py", line 555, in JacMat
Htemp = self.JacVect(temp,s_cur,simul_obs,precision)
File "/usr/local/lib/python3.8/dist-packages/pyPCGA-0.1.0-py3.8.egg/pyPCGA/pcga.py", line 444, in JacVect
raise ValueError("size of simul_obs_purturbation (%d,%d) is not nruns %d" % (simul_obs_purturbation.shape[0], simul_obs_purturbation.shape[1], nruns))
ValueError: size of simul_obs_purturbation (121,1) is not nruns 52

problem with mf2005 file on Mac

Hi,

I have been trying to run the PCGA modflow example on Mac but have been getting an error. I'm not sure but from what I've seen about similar errors this is because the mf2005 binary is not compatible with my operating system (osx 10.13). Can I get information about which operating systems are compatible with this package?

Here is the traceback that I get.

Traceback (most recent call last):

File "", line 1, in
runfile('/Users/mahtag2/Downloads/pyPCGA-master/examples/modflow_flopy/example_inv_mf.py', wdir='/Users/mahtag2/Downloads/pyPCGA-master/examples/modflow_flopy')

File "/Users/mahtag2/anaconda3/lib/python3.6/site-packages/spyder/utils/site/sitecustomize.py", line 710, in runfile
execfile(filename, namespace)

File "/Users/mahtag2/anaconda3/lib/python3.6/site-packages/spyder/utils/site/sitecustomize.py", line 101, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)

File "/Users/mahtag2/Downloads/pyPCGA-master/examples/modflow_flopy/example_inv_mf.py", line 106, in
s_hat, simul_obs, post_diagv, iter_best = prob.Run()

File "/Users/mahtag2/anaconda3/lib/python3.6/site-packages/pyPCGA-0.1.0-py3.6.egg/pyPCGA/pcga.py", line 1309, in Run
s_hat, simul_obs, post_diagv, iter_best = self.GaussNewton()

File "/Users/mahtag2/anaconda3/lib/python3.6/site-packages/pyPCGA-0.1.0-py3.6.egg/pyPCGA/pcga.py", line 1196, in GaussNewton
simul_obs_init = self.ForwardSolve(s_init)

File "/Users/mahtag2/anaconda3/lib/python3.6/site-packages/pyPCGA-0.1.0-py3.6.egg/pyPCGA/pcga.py", line 375, in ForwardSolve
simul_obs = self.forward_model(s,par)

File "/Users/mahtag2/Downloads/pyPCGA-master/examples/modflow_flopy/example_inv_mf.py", line 81, in forward_model
simul_obs = model.run(s, parallelization)

File "/Users/mahtag2/Downloads/pyPCGA-master/examples/modflow_flopy/mf.py", line 176, in run
simul_obs.append(self(item))

File "/Users/mahtag2/Downloads/pyPCGA-master/examples/modflow_flopy/mf.py", line 184, in call
return self.run_model(args[0], args[1])

File "/Users/mahtag2/Downloads/pyPCGA-master/examples/modflow_flopy/mf.py", line 155, in run_model
simul_obs = self.run_model_single(HK, Q_loc, idx)

File "/Users/mahtag2/Downloads/pyPCGA-master/examples/modflow_flopy/mf.py", line 129, in run_model_single
success, buff = mymf.run_model(silent=True)

File "/Users/mahtag2/anaconda3/lib/python3.6/site-packages/flopy/mbase.py", line 976, in run_model
normal_msg=normal_msg)

File "/Users/mahtag2/anaconda3/lib/python3.6/site-packages/flopy/mbase.py", line 1401, in run_model
stdout=sp.PIPE, stderr=sp.STDOUT, cwd=model_ws)

File "/Users/mahtag2/anaconda3/lib/python3.6/subprocess.py", line 709, in init
restore_signals, start_new_session)

File "/Users/mahtag2/anaconda3/lib/python3.6/subprocess.py", line 1344, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)

OSError: [Errno 8] Exec format error: '/Users/mahtag2/Downloads/pyPCGA-master/examples/modflow_flopy/input_files/mf2005'

Wrong packaging as setup.py imports pyPCGA

The packaging seems wrong as setup.py import pyPCGA and consequently its dependencies before they are even installed:

$ pip install git+https://****@github.com/jonghyunharrylee/pyPCGA.git
Collecting git+https://****@github.com/jonghyunharrylee/pyPCGA.git
  Cloning https://****@github.com/jonghyunharrylee/pyPCGA.git to /tmp/pip-req-build-ko5042bo
  Running command git clone --filter=blob:none -q 'https://****@github.com/jonghyunharrylee/pyPCGA.git' /tmp/pip-req-build-ko5042bo
  Resolved https://****@github.com/jonghyunharrylee/pyPCGA.git to commit 8294452a1c67b6cdc0c36d492c7bb2a769f76fb3
  Preparing metadata (setup.py) ... error
  ERROR: Command errored out with exit status 1:
   command: /tmp/test/venv/bin/python -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-req-build-ko5042bo/setup.py'"'"'; __file__='"'"'/tmp/pip-req-build-ko5042bo/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-x9bt3f13
       cwd: /tmp/pip-req-build-ko5042bo/
  Complete output (9 lines):
  Traceback (most recent call last):
    File "<string>", line 1, in <module>
    File "/tmp/pip-req-build-ko5042bo/setup.py", line 8, in <module>
      from pyPCGA import __name__, __author__
    File "/tmp/pip-req-build-ko5042bo/pyPCGA/__init__.py", line 8, in <module>
      from .pcga import PCGA
    File "/tmp/pip-req-build-ko5042bo/pyPCGA/pcga.py", line 3, in <module>
      import numpy as np
  ModuleNotFoundError: No module named 'numpy'
  ----------------------------------------
WARNING: Discarding git+https://****@github.com/jonghyunharrylee/pyPCGA.git. Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

Workaround: define the author and the version in the setup.py and retrieve it in __init__.py

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.