
PyProximal


🚦 🚦 This library is under early development. Expect things to constantly change until version v1.0.0. 🚦 🚦

Objective

This Python library provides the building blocks needed to solve non-smooth convex optimization problems using so-called proximal algorithms.

Whereas gradient-based methods are first-order iterative optimization algorithms for solving unconstrained, smooth optimization problems, proximal algorithms can be viewed as an analogous tool for non-smooth and possibly constrained versions of these problems. Such algorithms sit at a higher level of abstraction than classical algorithms like steepest descent or Newton's method, and they require a basic operation to be performed at each iteration: the evaluation of the so-called proximal operator of the functional to be optimized.

Whilst evaluating a proximal operator does itself require solving a convex optimization problem, these subproblems often admit closed-form solutions or can be solved very quickly with ad-hoc specialized methods. Several such proximal operators are therefore implemented in this library.

Here is a simple example showing how to compute the proximal operator of the L1 norm of a vector:

import numpy as np
from pyproximal import L1

l1 = L1(sigma=1.)
x = np.arange(-5, 5, 0.1)
xp = l1.prox(x, 1)
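
Under the hood, the proximal operator of the (scaled) L1 norm is the well-known soft-thresholding operator. As a minimal sanity check, assuming the closed form prox(x) = sign(x) * max(|x| - tau*sigma, 0), the library output can be compared against it directly:

# soft-thresholding with threshold tau * sigma = 1. * 1.
xp_ref = np.sign(x) * np.maximum(np.abs(x) - 1., 0.)
print(np.allclose(xp, xp_ref))  # expected: True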

The same building blocks can then be used to solve a basic denoising problem of the form min_x 1/2 ||x - y||_2^2 + sigma ||Dx||_1:

import numpy as np
from pylops import FirstDerivative
from pyproximal import L1, L2
from pyproximal.optimization.primal import LinearizedADMM

np.random.seed(1)

# Create noisy data
nx = 101
x = np.zeros(nx)
x[:nx//2] = 10
x[nx//2:3*nx//4] = -5
n = np.random.normal(0, 2, nx)
y = x + n

# Define functionals
l2 = L2(b=y)
l1 = L1(sigma=5.)
Dop = FirstDerivative(nx, edge=True, kind='backward')

# Solve functional with L-ADMM
L = np.real((Dop.H * Dop).eigs(neigs=1, which='LM')[0])
tau = 1.
mu = 0.99 * tau / L
xladmm, _ = LinearizedADMM(l2, l1, Dop, tau=tau, mu=mu,
                           x0=np.zeros_like(x), niter=200)

Why another library for proximal algorithms?

Several other projects in the Python ecosystem provide implementations of proximal operators and/or algorithms, which present some clear overlap with this project.

A (possibly not exhaustive) list of other projects is:

All of these projects are self-contained, meaning that they implement both proximal and linear operators as needed to solve a variety of problems in different areas of science.

The main difference with PyProximal is that we have chosen not to entangle linear and proximal operators within the same library. We leverage the extensive set of linear operators provided by the PyLops project and focus only on the proximal part of the problem. This makes the codebase more concise, and easier to understand and extend. Moreover, many of the problems that are solved in PyLops can now also be solved by means of proximal algorithms!

Project structure

This repository is organized as follows:

  • pyproximal: Python library containing various orthogonal projections, proximal operators, and solvers
  • pytests: set of pytests
  • testdata: sample datasets used in pytests and documentation
  • docs: sphinx documentation
  • examples: set of Python script examples for each proximal operator to be embedded in documentation using sphinx-gallery
  • tutorials: set of Python script tutorials to be embedded in documentation using sphinx-gallery

Getting started

You need Python 3.8 or greater.

Note: Versions prior to v0.3.0 also work with Python 3.6 or greater; however, they require a scipy version lower than v1.8.0.

To get the most out of PyProximal straight out of the box, we recommend using conda to install it:

conda install -c conda-forge pyproximal

From PyPi

You can also install pyproximal with pip:

pip install pyproximal

From Github

Finally, you can also install directly from the main branch (although this is not recommended):

pip install git+https://git@github.com/PyLops/pyproximal.git@main

Contributing

Feel like contributing to the project? Adding new operators or tutorials?

We advise using the Anaconda Python distribution to ensure that all the dependencies are installed via the Conda package manager. Follow the instructions below and carefully read the CONTRIBUTING file before getting started.

1. Fork and clone the repository

Execute the following command in your terminal:

git clone https://github.com/your_name_here/pyproximal.git

2. Install PyProximal in a new Conda environment

To ensure that further development of PyProximal is performed within the same environment (i.e., same dependencies) as that defined by the requirements-dev.txt or environment-dev.yml files, we suggest working in a new Conda environment.

The first time you clone the repository run the following command:

make dev-install_conda

To ensure that everything has been setup correctly, run tests:

make tests

Make sure no tests fail; this guarantees that the installation has been successful.

Remember to always activate the conda environment every time you open a new terminal by typing:

source activate pyproximal

Documentation

The official documentation of PyProximal is available here.

Moreover, if you have installed PyProximal using the developer environment you can also build the documentation locally by typing the following command:

make doc

Once the documentation is created, you can make any change to the source code and rebuild the documentation by simply typing

make docupdate

Note that if a new example or tutorial is created (or if any change is made to a previously available example or tutorial), you are required to rebuild the entire documentation before your changes will be visible.

Citing

When using PyProximal in scientific publications, please cite the following paper:

  • Ravasi M, Örnhag M. V., Luiken N., Leblanc O. and Uruñuela E., 2024, PyProximal - scalable convex optimization in Python, Journal of Open Source Software, 9(95), 6326. doi: 10.21105/joss.06326 (link)

Contributors

  • Matteo Ravasi, mrava87
  • Nick Luiken, NickLuiken
  • Eneko Uruñuela, eurunuela
  • Marcus Valtonen Örnhag, marcusvaltonen
  • Olivier Leblanc, olivierleblanc


pyproximal's Issues

Let `epsg` in `AcceleratedProximalGradient` be a vector of values

Now that the pylops FISTA works on matrices of data, it makes sense that the regularization parameter can be a vector of values (one per vector of data in the data matrix).

Implementing this feature will require that the proximals work at the matrix level rather than the vector level.

Edit: I actually think Python would already take care of making proximals work with matrices.
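
For illustration only (not PyProximal code), NumPy broadcasting already lets a soft-thresholding-style prox act column-wise with one threshold per data vector:

import numpy as np

X = np.random.normal(0., 1., (10, 3))   # data matrix, one column per data vector
epsg = np.array([0.1, 0.5, 1.0])        # one regularization value per column
Xp = np.sign(X) * np.maximum(np.abs(X) - epsg, 0.)  # thresholds broadcast across columns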

Comments on the JOSS paper

For reference: openjournals/joss-reviews#6326

Here are some comments I have regarding the paper itself:

  • In the statement of need, it would be helpful to cite a few well-known packages in order to highlight the key distinguishing features of this package. This is already done in the repo README and can probably be reused.
  • One DOI is missing -- see here for the suggested DOI

Faster TV

Don't know what's going on behind the scenes but skimage TV is a fair bit quicker than your implementation. Could be looped in with something like the following:

from pyproximal.ProxOperator import _check_tau
from skimage.restoration import denoise_tv_chambolle, denoise_tv_bregman

class TV:
    def __init__(self, sigma, dims, isotropic=True, max_iter=1000):
        self.sigma = sigma
        self.isotropic = isotropic
        self.max_iter = max_iter
        self.count = 0
        self.dims = dims

    def _increment_count(func):
        """Increment counter"""

        def wrapped(self, *args, **kwargs):
            self.count += 1
            return func(self, *args, **kwargs)

        return wrapped

    @_increment_count
    @_check_tau
    def prox(self, x, tau):
        x = x.reshape(self.dims)
        if self.isotropic:
            return denoise_tv_chambolle(
                x, weight=tau * self.sigma, max_num_iter=self.max_iter
            ).ravel()
        else:
            return denoise_tv_bregman(
                x,
                weight=tau * self.sigma,
                isotropic=self.isotropic,
                max_num_iter=self.max_iter,
            ).ravel()
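
For completeness, a hypothetical usage of the sketch above on a 2D image, assuming the solver only needs the .prox() method:

import numpy as np

img = np.random.normal(0., 1., (64, 64))              # placeholder noisy image
tv = TV(sigma=0.1, dims=img.shape, isotropic=True)
img_dn = tv.prox(img.ravel(), 1.).reshape(img.shape)  # denoised image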

Combining proximal operators

Hi,

Great work on this package. I’ve actually started building around it here https://github.com/jameschapman19/scikit-prox using your operators to solve regularised versions of some scikit-learn models. I like how you’ve chosen the proximal operators as the level of abstraction; it has made the library really easy to work with.

My question is whether you have considered, or would consider, combining proximal operators via Dykstra's algorithm or otherwise. It’s been done in https://github.com/neurospin/pylearn-parsimony.
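
For concreteness, a rough sketch of what a Dykstra-like combination of two proximal operators could look like (the helper below is hypothetical, not part of PyProximal):

import numpy as np

def dykstra_prox(proxf, proxg, y, tau, niter=100):
    """Approximate the prox of f + g via Dykstra-like proximal splitting."""
    x, p, q = y.copy(), np.zeros_like(y), np.zeros_like(y)
    for _ in range(niter):
        z = proxf(x + p, tau)   # e.g., f.prox from PyProximal
        p = x + p - z
        x = proxg(z + q, tau)   # e.g., g.prox from PyProximal
        q = z + q - x
    return x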

Add support for non-convex penalties acting on the singular values

The proximal operator of the nuclear norm is the soft-thresholding operator acting on the singular values. This essentially means that even the larger singular values are penalized just as hard as the smaller ones, which leads to the phenomenon known as shrinking bias, studied in e.g. [1]. Ideally, the smaller singular values (likely stemming from noise) should be penalized proportionally harder, and this can be done in two ways:

Weighted nuclear norm (WNN)

The weighted nuclear norm adds a penalty weight w_i to the i-th singular value, where {w_i} is increasing [2].

Concave function acting on the singular value

There are many works that instead apply a concave map f to the i-th singular value, σ_i -> f(σ_i). These include:

  • SCAD [3]
  • Log [4]
  • MCP [5]
  • ETP [6]
  • Geman [7]
  • f_mu (special case of #51)

Proposition for enhancement

Implement the above penalties and their proximal operators.
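
As a rough sketch of one of these, the weighted nuclear norm admits a closed-form proximal operator via weighted singular-value thresholding when the weights are sorted in non-descending order (see [2]); the helper below is illustrative only, not a proposed API:

import numpy as np

def prox_weighted_nuclear(Y, w, tau):
    """Weighted singular-value thresholding (weights w sorted non-descending)."""
    U, s, Vh = np.linalg.svd(Y, full_matrices=False)   # singular values come out descending
    s_thr = np.maximum(s - tau * np.asarray(w), 0.)
    return (U * s_thr) @ Vh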

References

[1] Carlsson et al. "An unbiased approach to low rank recovery", https://arxiv.org/abs/1909.13363
[2] Gu et al. "Weighted Nuclear Norm Minimization with Application to Image Denoising" In the Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), 2014
[3] Fan and Li "Variable selection via nonconcave penalized likelihood and its oracle properties" Journal of the American Statistical Association, 96(456):1348–1360, 2001
[4] Friedman "Fast sparse regression and classification", International Journal of Forecasting, 28(3):722 – 738, 2012.
[5] Zhang et al. "Nearly unbiased variable selection under minimax concave penalty" The Annals of Statistics, 38(2):894–942, 2010
[6] Gao et al. "A feasible nonconvex relaxation approach to feature selection", In Proceedings of the Conference on Artificial Intelligence (AAAI), 2011
[7] Geman and Yang "Nonlinear image recovery with half-quadratic regularization", IEEE Transactions on Image Processing, 4:932 – 946, 08 1995

Categorize/structure penalties

There is now an increasing number of penalties being merged. Currently on main:

  • Box
  • Simplex
  • Intersection
  • AffineSet
  • Quadratic
  • Euclidean
  • EuclideanBall
  • L0Ball
  • L1
  • L1Ball
  • L2
  • L2Convolve
  • L21
  • L21_plus_L1
  • Nonlinear
  • Huber
  • Nuclear
  • NuclearBall
  • Orthogonal
  • VStack
  • SCAD
  • Log
  • ETP
  • Geman
  • QuadraticEnvelopeCard

I think it would be beneficial to categorize/group these penalties. Today we only have a flat structure, and new users could get confused about which penalties can be applied to which problems. There are many ways of structuring these, but as a first step I suggest:

  • First level: Separate between vector penalties and matrix-only penalties.
  • Second level (perhaps not yet): Separate between convex and non-convex penalties

Any thoughts? They should also be sorted in alphabetical order in the documentation.

Evaluation of proximal L2 operator slow when self.Op.explicit == True

Hi, and thank you for a nice project.

I actually started implementing a framework similar to yours, but I am lucky to have found pyproximal, especially considering the similarities in most of the design choices you have made. No need to write things twice, so I am instead hoping I can contribute a bit.

What I was mostly discouraged by was how slow a simple ADMM lasso implementation was using your framework. When compared to my own implementation it was 15x slower, but I could quickly make it faster using two things:

1.) Change scipy.linalg.solve -> np.linalg.solve for dense matrices in LinearOperator (in the twin pyLops library), and
2.) Cache the Op1 operator in L2.prox()

I understand that you might want to keep scipy.linalg.solve for backward compatibility, but I urge you to make it configurable in some way, since it is really slow. I would be happy to create PRs if you are interested.
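
As a rough, hypothetical illustration of the second point for a dense operator (this is not the actual L2.prox() code):

import numpy as np

class CachedL2Prox:
    """Prox of (sigma/2) * ||Ax - b||_2^2 with the system matrix cached across calls."""
    def __init__(self, A, b, sigma=1.):
        self.A, self.b, self.sigma = A, b, sigma
        self._tau, self._lhs = None, None

    def prox(self, x, tau):
        if tau != self._tau:  # rebuild the left-hand side only when tau changes
            n = self.A.shape[1]
            self._lhs = np.eye(n) + self.sigma * tau * self.A.T @ self.A
            self._tau = tau
        rhs = x + self.sigma * tau * self.A.T @ self.b
        return np.linalg.solve(self._lhs, rhs)  # np.linalg.solve as per the first point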

Again, thank you for a nice framework.

Stopping criteria for solvers

Motivation

PyProximal solvers are currently only stopped when the maximum number of iterations is reached.

It is desirable to introduce stopping criteria to prematurely stop a solver based on some condition. A first attempt at introducing a stopping criterion to ProximalPoint and ProximalGradient based on the change in objective function between two consecutive iterations can be found in #176

Definition of done

The same criterion should be implemented in all solvers.

  • More such criteria should be implemented (e.g., operating only on one term of an objective function instead of the overall objective function); a minimal sketch of a relative-change criterion is given below.
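
Minimal sketch of a relative objective-change criterion (hypothetical helper, not the actual PyProximal API):

def objective_change_stop(obj_prev, obj_curr, tol=1e-8):
    """Stop when the relative change in objective between iterations drops below tol."""
    return abs(obj_curr - obj_prev) <= tol * max(abs(obj_prev), 1.)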

JOSS Paper

Hello everyone,
given that you have contributed some code to PyProximal, I would like to give you the opportunity to be included as a co-author of this JOSS paper. I am writing it mostly to ensure PyProximal can be properly cited by researchers leveraging it in their published work going forward.

I kindly ask you to thumb up if you would like to be included in the author list. If I do not see a thumb up I will assume that you are not interested.

Moreover, to make the paper stronger, I would like to add a couple more use cases of PyProximal: ideally they should be backed up by peer-reviewed publications (and/or openly available code). It would be great if you could provide me with a use case from your team or other people you know in your field.

To access the paper please head over to https://github.com/mrava87/pyproximal/actions/runs/7271942518 and click on the paper text under Artifacts (the PDF will start downloading). The paper .md file is currently sitting in this branch: https://github.com/mrava87/pyproximal/tree/joss. I am happy to push it to the main repo in a branch with the same name if you want to act directly on it, or just take any of your comments/additions and add it to the paper myself.

Contributors:
@marcusvaltonen
@NickLuiken
@olivierleblanc
@eurunuela
@shnaqvi

Add support for second-order low-rank inducing bilinear optimization frameworks (and related penalties)

There is already support for BilinearOperator in pyproximal/pyproximal/utils/bilinear.py and the PALM optimizer; however,
they do not scale to second-order methods such as Levenberg-Marquardt (LM) and Variable Projection (VarPro). In recent years, such methods have become popular alternatives to splitting methods such as ADMM for certain types of problems (e.g. image analysis and computer vision).

Examples

As an example, the nuclear norm penalty has the variational formulation ||X||_* = min over factorizations X = U V^T of (||U||_F^2 + ||V||_F^2) / 2, which was utilized in [1] using a bilinear framework (no second-order method involved, though).

This also scales to the weighted nuclear norm, and it has been successfully implemented for vision tasks using second-order methods in e.g. [2, 7, 8]. Other non-convex penalties have been studied as well, e.g. [3, 4, 5, 6].
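
A quick numerical check of this identity (illustration only), using the SVD-based minimizer U = Us*sqrt(S), V = Vs*sqrt(S):

import numpy as np

X = np.random.normal(0., 1., (8, 5))
Us, s, Vh = np.linalg.svd(X, full_matrices=False)
U, V = Us * np.sqrt(s), Vh.T * np.sqrt(s)
print(np.allclose(U @ V.T, X))                                   # factorization reproduces X
print(np.isclose(s.sum(), 0.5 * ((U**2).sum() + (V**2).sum())))  # nuclear norm equals the bound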

Suggestion on enhancement

I suggest adding support for Levenberg-Marquardt (LM) and Variable Projection (VarPro). As a second step I suggest adding some of the mentioned regularizers.

References

[1] Cabral et al. "Unifying Nuclear Norm and Bilinear Factorization Approaches for Low-Rank Matrix Decomposition", In the Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2013
[2] Iglesias et al. "Accurate Optimization of Weighted Nuclear Norm for Non-Rigid Structure from Motion", In the Proceedings of the European Conference on Computer Vision (ECCV), 2020
[3] Valtonen Ornhag et al. "Bilinear parameterization for differentiable rank-regularization", In the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPR), 2020
[4] Valtonen Ornhag et al. "Differentiable Fixed-Rank Regularisation using Bilinear Parameterisation", In the Proceedings of the British Machine Vision Conference (BMVC), 2019
[5] Valtonen Ornhag et al. "Bilinear Parameterization for Non-Separable Singular Value Penalties", In the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021
[6] Xu et al. "A unified convex surrogate for the schatten-p norm". In Proceedings of the Conference on Artificial Intelligence (AAAI), 2017
[7] Kumar "Non-rigid structure from motion: Prior-free factorization method revisited" In IEEE Winter Conference on Applications of Computer Vision (WACV), 2020
[8] McDonald et al. "Spectral k-support norm regularization", In Advances in Neural Information Processing Systems (NIPS), 2014

Can pyproximal convolve variable?

Is it possible to apply two convolution filters to the variable X, compare the elements of the two arrays, and choose the larger value? And can pyproximal solve the problem?

Extend ADMM to general case with A and B

Currently the ADMM solver only accommodates the special case where A and B are identities.

We should extend the solver to more general cases and have A=None and B=None as defaults (which would mean identities).
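
For reference, the scaled-form ADMM updates for the general problem (standard textbook formulation, e.g. Boyd et al. 2011, not existing PyProximal code) would read:

min_{x,z} f(x) + g(z)   subject to   A x + B z = c

x_{k+1} = argmin_x f(x) + (rho/2) ||A x + B z_k - c + u_k||_2^2
z_{k+1} = argmin_z g(z) + (rho/2) ||A x_{k+1} + B z - c + u_k||_2^2
u_{k+1} = u_k + A x_{k+1} + B z_{k+1} - c

With A = I, B = -I, and c = 0 (the usual x = z splitting), the x- and z-updates reduce to the proximal operators of f and g, which is the special case currently implemented.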

Regarding compressed sensing solvers

I have a large-scale problem where I use the L1-regularized least-squares solver for compressed sensing by Stephen Boyd. However, the computation takes something like 170 seconds, which drove me to look for something more efficient. I found that ISTA, FISTA, and TwIST are also able to fulfill the same task, but I did not find an implementation of L1-regularized least squares in the proximal library.

Q1: Does L1-regularized least squares already exist in the library?
Q2: Are ISTA, FISTA, and TwIST capable of solving large-scale problems (A matrix shape (149600, 144))?
Q3: Is the ADMM solver capable of solving such problems? If yes, is it more efficient?
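
For reference, a rough sketch of how an L1-regularized least-squares (lasso) problem could be set up with PyProximal's accelerated proximal gradient solver; a smaller system is used here for illustration, and exact solver signatures may differ between versions:

import numpy as np
from pylops import MatrixMult
from pyproximal import L1, L2
from pyproximal.optimization.primal import AcceleratedProximalGradient

np.random.seed(0)
A = np.random.normal(0., 1., (1000, 144))      # smaller stand-in for the (149600, 144) system
xtrue = np.zeros(144)
xtrue[::12] = 1.
y = A @ xtrue + 0.01 * np.random.normal(0., 1., 1000)

Aop = MatrixMult(A)
l2 = L2(Op=Aop, b=y)                           # smooth data-fidelity term
l1 = L1(sigma=1e-1)                            # sparsity-promoting term
tau = 0.99 / np.real((Aop.H * Aop).eigs(neigs=1, which='LM')[0])  # step from Lipschitz constant
xinv = AcceleratedProximalGradient(l2, l1, x0=np.zeros(144), tau=tau, niter=100)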

Add support for proximal operator proposed in "Convex Low Rank Approximation" by Larsson and Olsson

In [1] a general low-rank inducing framework was introduced and shown to have many benefits compared to e.g. the nuclear norm when applied to common computer vision tasks. This was later generalized in [2].

Suggestion on enhancement

Implement [2] and treat the special cases R_g, R_mu, and R_i=k separately (see the papers for details).

References

[1] Larsson and Olsson, "Convex Low Rank Approximation", International Journal of Computer Vision (IJCV), 2016
[2] Valtonen Ornhag and Olsson, "A unified optimization framework for low-rank inducing penalties", In the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020

Consider making a new release since deprecation of Python 3.6 and 3.7

The deprecation of Python 3.6 and 3.7 (PR #46) and the bumping of the scipy version to >=1.8.0 are a big step. I would highly recommend doing a release at this point.

For example, I tried building on Python 3.10.2 with scipy==1.8.0 locally (Ubuntu 20.04) with pyproximal==0.2.0, but it failed since the scipy interface changed. Specifically, it was line 8 in pyproximal/proximal/VStack.py that failed:

from scipy.sparse.linalg.interface import _get_dtype

Furthermore, when testing my repository in a Docker container (specifically python:3.9.10-slim), I was for this reason forced to use the main branch; however, these standard Docker containers do not ship with git installed, so I had to customize the container to install it first before being able to run it.

Maybe not a big deal, but if everything works off the shelf then people are encouraged to use the library more.

Add tests for densesolver in L2

Once PyLops v1.18.1 has been released, add the following test to test_norms:

@pytest.mark.parametrize("par", [(par1), (par2)])
def test_L2_dense(par):
    """L2 norm of Op*x with dense Op and proximal/dual proximal
    """
    b = np.zeros(par['nx'], dtype=par['dtype'])
    d = np.random.normal(0., 1., par['nx']).astype(par['dtype'])
    l2 = L2(Op=MatrixMult(np.diag(d), dtype=par['dtype']),
            b=b, sigma=par['sigma'], densesolver='numpy')

    # norm
    x = np.random.normal(0., 1., par['nx']).astype(par['dtype'])
    assert l2(x) == (par['sigma'] / 2.) * np.linalg.norm(d * x) ** 2

    # prox: since Op is a Diagonal operator the denominator becomes
    # 1 + sigma*tau*d[i] for every i
    tau = 2.
    den = 1. + par['sigma'] * tau * d ** 2
    assert_array_almost_equal(l2.prox(x, tau), x / den, decimal=4)
