pinellolab / pyrovelocity

𝒫robabilistic modeling of RNA velocity ⬱

License: GNU Affero General Public License v3.0

Makefile 0.94% Python 96.44% Shell 1.68% HCL 0.50% Dockerfile 0.08% Just 0.37%

cell-fate-determination deep-generative-model developmental-trajectories multiomics probabilistic-models probabilistic-programming rna-velocity rna-velocity-estimation single-cell-genomics single-cell-rna-seq

pyrovelocity's People

Forkers

cameronraysmith qinqian yunbokai lijc0804 lylaatta123

pyrovelocity's Issues

update type annotations - calls from `fig2_pancreas_data.py`

update type annotations based on call stack from fig2_pancreas_data.py

cprofile stack trace of `fig2_pancreas_data.py`

$ nohup python -m cProfile -o output/fig2_pancreas_data.stats fig2/model1/fig2_pancreas_data.py > output/fig2_pancreas_data.log 2>&1 &

$ python -m pstats output/fig2_pancreas_data.stats

output/fig2_pancreas_data.stats% stats 12000 /pyrovelocity
Fri Nov  4 00:31:40 2022    output/fig2_pancreas_data.stats

         143286088 function calls (140907165 primitive calls) in 738.562 seconds

   Ordered by: cumulative time
   List reduced from 27286 to 12000 due to restriction <12000>
   List reduced from 12000 to 46 due to restriction <'pyrovelocity'>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    1.385    1.385  481.227  481.227 pyrovelocity/plot.py:650(vector_field_uncertainty)
        1    0.444    0.444  201.069  201.069 pyrovelocity/api.py:17(train_model)
        1    0.138    0.138  191.579  191.579 pyrovelocity/_trainer.py:179(train_faster)
     4260    0.360    0.000   33.885    0.008 pyrovelocity/_velocity_model.py:3169(forward)
    36000   16.486    0.000   19.296    0.001 pyrovelocity/_trainer.py:25(step)
        1    1.143    1.143   17.874   17.874 pyrovelocity/plot.py:839(plot_mean_vector_field)
     4260    0.667    0.000   13.285    0.003 pyrovelocity/_velocity_model.py:296(get_likelihood)
        1    0.001    0.001    8.680    8.680 pyrovelocity/_velocity.py:162(posterior_samples)
     8520    2.348    0.000    4.105    0.000 pyrovelocity/utils.py:74(mRNA)
     4260    0.375    0.000    3.867    0.001 pyrovelocity/_velocity_model.py:3062(get_rna)
        1    0.003    0.003    2.852    2.852 pyrovelocity/plot.py:393(plot_gene_ranking)
        1    0.004    0.004    2.633    2.633 pyrovelocity/data.py:25(load_data)
     8517    0.029    0.000    1.509    0.000 pyrovelocity/_velocity_model.py:64(create_plates)
     4260    0.016    0.000    0.961    0.000 pyrovelocity/_velocity_model.py:93(alpha)
     4260    0.081    0.000    0.948    0.000 pyrovelocity/_velocity_model.py:108(beta)
     4260    0.016    0.000    0.895    0.000 pyrovelocity/_velocity_model.py:119(gamma)
       30    0.759    0.025    0.876    0.029 pyrovelocity/plot.py:386(mae_per_gene)
     4260    0.013    0.000    0.876    0.000 pyrovelocity/_velocity_model.py:178(dt_switching)
        1    0.001    0.001    0.722    0.722 pyrovelocity/plot.py:1121(us_rainbowplot)
       30    0.116    0.004    0.300    0.010 pyrovelocity/cytotrace.py:82(compute_similarity2)
        1    0.000    0.000    0.277    0.277 pyrovelocity/data.py:72(setup_anndata_multilayers)
     8520    0.008    0.000    0.096    0.000 pyrovelocity/utils.py:17(inv)
        1    0.000    0.000    0.054    0.054 pyrovelocity/_velocity.py:216(save)
        1    0.000    0.000    0.001    0.001 pyrovelocity/_velocity_model.py:38(LogNormalModel)
        8    0.000    0.000    0.000    0.000 pyrovelocity/_velocity_model.py:246(_get_fn_args_from_batch)
        1    0.000    0.000    0.000    0.000 pyrovelocity/_trainer.py:137(VelocityTrainingMixin)
        1    0.000    0.000    0.000    0.000 pyrovelocity/_velocity_model.py:2664(LatentFactor)
        1    0.000    0.000    0.000    0.000 pyrovelocity/_velocity.py:33(PyroVelocity)
        1    0.000    0.000    0.000    0.000 pyrovelocity/_velocity_model.py:996(TimeEncoder2)
        1    0.000    0.000    0.000    0.000 pyrovelocity/_velocity_model.py:418(VelocityModel)
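The interactive `pstats` session above can also be reproduced non-interactively, which may be convenient for saving the filtered report next to the log. The following is a minimal sketch: the `busy` function and the in-memory profile are placeholders so that the snippet runs on its own. For the run above one would instead pass the path `output/fig2_pancreas_data.stats` to `pstats.Stats` and filter on `pyrovelocity`.

```python
import cProfile
import io
import pstats


def busy(n: int) -> int:
    """Placeholder workload standing in for fig2_pancreas_data.py."""
    return sum(i * i for i in range(n))


# Collect a small profile in memory so the example is self-contained.
profiler = cProfile.Profile()
profiler.enable()
busy(10_000)
profiler.disable()

# To load a saved file instead, pass its path as the first argument, e.g.
# pstats.Stats("output/fig2_pancreas_data.stats", stream=buffer).
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)

# Sort by cumulative time, keep at most 12000 entries, then keep only entries
# whose "filename:lineno(function)" string matches the regular expression.
stats.sort_stats("cumulative").print_stats(12000, "busy")

report = buffer.getvalue()
print(report)
```

Restrictions are applied in the order given, which is why the report above shows two "List reduced" lines.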

monkeytype runtime types for `fig2_pancreas_data.py`

$ nohup monkeytype run fig2/model1/fig2_pancreas_data.py > output/monkeytype_fig2_pancreas_data.log 2>&1 &

$ find ../../pyrovelocity/ -name '*.py' | xargs wc -l
    10 ../../pyrovelocity/__init__.py
    11 ../../pyrovelocity/pyrovelocity.py
   242 ../../pyrovelocity/api.py
   313 ../../pyrovelocity/_velocity.py
   371 ../../pyrovelocity/_trainer.py
   388 ../../pyrovelocity/utils.py
   395 ../../pyrovelocity/data.py
   517 ../../pyrovelocity/_velocity_module.py
  1131 ../../pyrovelocity/cytotrace.py
  1616 ../../pyrovelocity/_velocity_guide.py
  1955 ../../pyrovelocity/plot.py
  3314 ../../pyrovelocity/_velocity_model.py
 10263 total

##################

$ monkeytype list-modules
pyrovelocity.api
pyrovelocity._velocity
pyrovelocity._trainer
pyrovelocity.utils
pyrovelocity.data
pyrovelocity._velocity_module
pyrovelocity.cytotrace
pyrovelocity.plot
pyrovelocity._velocity_model

##################

$ monkeytype stub pyrovelocity.api
from anndata._core.anndata import AnnData
from numpy import ndarray
from pyrovelocity._velocity import PyroVelocity
from typing import (
    Dict,
    Optional,
    Tuple,
)


def train_model(
    adata: AnnData,
    guide_type: str = ...,
    model_type: str = ...,
    svi_train: bool = ...,
    batch_size: int = ...,
    train_size: float = ...,
    use_gpu: int = ...,
    likelihood: str = ...,
    num_samples: int = ...,
    log_every: int = ...,
    cell_state: str = ...,
    patient_improve: float = ...,
    patient_init: int = ...,
    seed: int = ...,
    lr: float = ...,
    max_epochs: int = ...,
    include_prior: bool = ...,
    library_size: bool = ...,
    offset: bool = ...,
    input_type: str = ...,
    cell_specific_kinetics: None = ...,
    kinetics_num: int = ...
) -> Tuple[PyroVelocity, Dict[str, ndarray]]: ...

##################

$ monkeytype stub pyrovelocity._velocity
from anndata._core.anndata import AnnData
from numpy import ndarray
from typing import (
    Dict,
    Optional,
    Sequence,
    Union,
)


class PyroVelocity:
    def __init__(
        self,
        adata: AnnData,
        input_type: str = ...,
        shared_time: bool = ...,
        model_type: str = ...,
        guide_type: str = ...,
        likelihood: str = ...,
        t_scale_on: bool = ...,
        plate_size: int = ...,
        latent_factor: str = ...,
        latent_factor_operation: str = ...,
        inducing_point_size: int = ...,
        latent_factor_size: int = ...,
        include_prior: bool = ...,
        use_gpu: int = ...,
        init: bool = ...,
        num_aux_cells: int = ...,
        only_cell_times: bool = ...,
        decoder_on: bool = ...,
        add_offset: bool = ...,
        correct_library_size: Union[bool, str] = ...,
        cell_specific_kinetics: Optional[str] = ...,
        kinetics_num: Optional[int] = ...
    ) -> None: ...
    def posterior_samples(
        self,
        adata: Optional[AnnData] = ...,
        indices: Optional[Sequence[int]] = ...,
        batch_size: Optional[int] = ...,
        num_samples: int = ...
    ) -> Dict[str, ndarray]: ...
    def save(self, dir_path: str, overwrite: bool = ..., save_anndata: bool = ..., **anndata_write_kwargs) -> None: ...

#####################

$ monkeytype stub pyrovelocity._trainer
from pyro.optim.optim import PyroOptim
from typing import (
    Any,
    Callable,
    Dict,
    List,
    Optional,
    Union,
)


def VelocityClippedAdam(optim_args: Dict[str, float]) -> PyroOptim: ...


class VelocityAdam:
    def step(self, closure: Optional[Callable] = ...) -> Optional[Any]: ...


class VelocityTrainingMixin:
    def train_faster(
        self,
        use_gpu: Optional[Union[str, int, bool]] = ...,
        seed: int = ...,
        lr: float = ...,
        max_epochs: int = ...,
        log_every: int = ...,
        patient_init: int = ...,
        patient_improve: float = ...
    ) -> List[float]: ...

####################

$ monkeytype stub pyrovelocity.utils
from torch import Tensor
from typing import Tuple


def inv(x: Tensor) -> Tensor: ...


def mRNA(
    tau: Tensor,
    u0: Tensor,
    s0: Tensor,
    alpha: Tensor,
    beta: Tensor,
    gamma: Tensor
) -> Tuple[Tensor, Tensor]: ...

#####################

$ monkeytype stub pyrovelocity.data
from anndata._core.anndata import AnnData
from typing import (
    List,
    Optional,
)


def load_data(
    data: str = ...,
    top_n: int = ...,
    min_shared_counts: int = ...,
    eps: float = ...,
    force: bool = ...
) -> AnnData: ...


def setup_anndata_multilayers(
    adata: AnnData,
    batch_key: Optional[str] = ...,
    labels_key: Optional[str] = ...,
    layer: Optional[str] = ...,
    protein_expression_obsm_key: Optional[str] = ...,
    protein_names_uns_key: Optional[str] = ...,
    categorical_covariate_keys: Optional[List[str]] = ...,
    continuous_covariate_keys: Optional[List[str]] = ...,
    copy: bool = ...,
    input_type: str = ...,
    n_aux_cells: int = ...,
    cluster: str = ...
) -> Optional[AnnData]: ...

#################

$ monkeytype stub pyrovelocity._velocity_module
from pyro.infer.autoguide.guides import AutoGuideList
from pyrovelocity._velocity_model import VelocityModelAuto
from typing import (
    Optional,
    Union,
)


class VelocityModule:
    def __init__(
        self,
        num_cells: int,
        num_genes: int,
        model_type: str = ...,
        guide_type: str = ...,
        likelihood: str = ...,
        shared_time: bool = ...,
        t_scale_on: bool = ...,
        plate_size: int = ...,
        latent_factor: str = ...,
        latent_factor_operation: str = ...,
        latent_factor_size: int = ...,
        inducing_point_size: int = ...,
        include_prior: bool = ...,
        use_gpu: int = ...,
        num_aux_cells: int = ...,
        only_cell_times: bool = ...,
        decoder_on: bool = ...,
        add_offset: bool = ...,
        correct_library_size: Union[bool, str] = ...,
        cell_specific_kinetics: Optional[str] = ...,
        kinetics_num: Optional[int] = ...,
        **initial_values
    ) -> None: ...
    @property
    def guide(self) -> AutoGuideList: ...
    @property
    def model(self) -> VelocityModelAuto: ...

####################

$ monkeytype stub pyrovelocity.cytotrace
from numpy import ndarray


def compute_similarity2(O: ndarray, P: ndarray) -> ndarray: ...


#####################

$ monkeytype stub pyrovelocity.plot
2 traces failed to decode; use -v for details
from anndata._core.anndata import AnnData
from matplotlib.figure import Figure
from numpy import ndarray
from pandas.core.indexes.base import Index
from typing import (
    Dict,
    List,
    Tuple,
)


def mae_per_gene(pred_counts: ndarray, true_counts: ndarray) -> ndarray: ...


def us_rainbowplot(
    genes: Index,
    adata: AnnData,
    pos: Dict[str, ndarray],
    data: List[str] = ...,
    cell_state: str = ...
) -> Figure: ...


def vector_field_uncertainty(
    adata: AnnData,
    pos: Dict[str, ndarray],
    basis: str = ...,
    n_jobs: int = ...,
    denoised: bool = ...
) -> Tuple[ndarray, ndarray, ndarray]: ...


#########################


$ monkeytype stub pyrovelocity._velocity_model
from pyro.distributions.torch import Poisson
from pyro.primitives import plate
from torch import Tensor
from typing import (
    Any,
    Dict,
    Optional,
    Tuple,
    Union,
)


class LogNormalModel:
    def __init__(self, num_cells: int, num_genes: int, likelihood: str = ..., plate_size: int = ...) -> None: ...
    @staticmethod
    def _get_fn_args_from_batch(
        tensor_dict: Dict[str, Tensor]
    ) -> Tuple[Tuple[Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, None, None], Dict[Any, Any]]: ...
    def create_plates(
        self,
        u_obs: Optional[Tensor] = ...,
        s_obs: Optional[Tensor] = ...,
        u_log_library: Optional[Tensor] = ...,
        s_log_library: Optional[Tensor] = ...,
        u_log_library_loc: Optional[Tensor] = ...,
        s_log_library_loc: Optional[Tensor] = ...,
        u_log_library_scale: Optional[Tensor] = ...,
        s_log_library_scale: Optional[Tensor] = ...,
        ind_x: Optional[Tensor] = ...,
        cell_state: Optional[Tensor] = ...,
        time_info: Optional[Tensor] = ...
    ) -> Tuple[plate, plate]: ...
    def get_likelihood(
        self,
        ut: Tensor,
        st: Tensor,
        u_log_library: Optional[Tensor] = ...,
        s_log_library: Optional[Tensor] = ...,
        u_scale: Optional[Tensor] = ...,
        s_scale: Optional[Tensor] = ...,
        u_read_depth: Optional[Tensor] = ...,
        s_read_depth: Optional[Tensor] = ...,
        u_cell_size_coef: None = ...,
        ut_coef: None = ...,
        s_cell_size_coef: None = ...,
        st_coef: None = ...
    ) -> Tuple[pyro.distributions.Poisson, pyro.distributions.Poisson]: ...


class VelocityModel:
    def __init__(
        self,
        num_cells: int,
        num_genes: int,
        likelihood: str = ...,
        shared_time: bool = ...,
        t_scale_on: bool = ...,
        plate_size: int = ...,
        latent_factor: str = ...,
        latent_factor_size: int = ...,
        latent_factor_operation: str = ...,
        include_prior: bool = ...,
        num_aux_cells: int = ...,
        only_cell_times: bool = ...,
        decoder_on: bool = ...,
        add_offset: bool = ...,
        correct_library_size: Union[bool, str] = ...,
        guide_type: bool = ...,
        cell_specific_kinetics: Optional[str] = ...,
        kinetics_num: Optional[int] = ...,
        **initial_values
    ) -> None: ...


class VelocityModelAuto:
    def __init__(self, *args, **kwargs) -> None: ...
    def forward(
        self,
        u_obs: Optional[Tensor] = ...,
        s_obs: Optional[Tensor] = ...,
        u_log_library: Optional[Tensor] = ...,
        s_log_library: Optional[Tensor] = ...,
        u_log_library_loc: Optional[Tensor] = ...,
        s_log_library_loc: Optional[Tensor] = ...,
        u_log_library_scale: Optional[Tensor] = ...,
        s_log_library_scale: Optional[Tensor] = ...,
        ind_x: Optional[Tensor] = ...,
        cell_state: Optional[Tensor] = ...,
        time_info: Optional[Tensor] = ...
    ) -> Tuple[Tensor, Tensor]: ...
    def get_rna(
        self,
        u_scale: Tensor,
        s_scale: Tensor,
        alpha: Tensor,
        beta: Tensor,
        gamma: Tensor,
        t: Tensor,
        u0: Tensor,
        s0: Tensor,
        t0: Tensor,
        switching: Optional[Tensor] = ...,
        u_inf: Optional[Tensor] = ...,
        s_inf: Optional[Tensor] = ...
    ) -> Tuple[Tensor, Tensor]: ...

update dependencies to latest stable versions

#101
#65

update poetry lock and add helper for gpu-compatible torch

add data download stage for pons oligodendrocyte data

add dependency management system

add dependency management with poetry

connect readthedocs to documentation

run model 2 on pons oligodendrocyte data

add preprocessing stage for pons oligodendrocyte data

ensure figure reproducibility scripts attempt to download source data if not present

conda-lock `1.4.1` fails to generate lock file from pyproject.toml with mixed conda-forge/PyPI dependencies

This issue is related to

Conda-lock configuration is specified near

pyrovelocity/pyproject.toml

Line 87 in 42c2475

[tool.conda-lock]

PyPI dependencies are marked explicitly as in

pyrovelocity/pyproject.toml

Line 43 in 42c2475

cospar = { version = "0.1.9", source = "pypi" }

A command similar to

conda-lock \
--conda mamba \
--extras dev \
--filter-extras \
--no-dev-dependencies \
--virtual-package-spec conda/virtual-packages.yml \
--log-level DEBUG \
-f pyproject.toml \
-p linux-64

from the repo root with conda-lock version 1.4.0 fails to produce a valid conda lock file because PyPI dependencies whose names differ by a hyphen versus an underscore cannot be resolved.

The specific error concerns the translation of package names between conda and PyPI: `matplotlib_base` vs `matplotlib-base` vs `matplotlib`.

conda-lock stderr

Traceback (most recent call last):
  File ".local/pipx/venvs/conda-lock/lib/python3.9/site-packages/conda_lock/src_parser/__init__.py", line 488, in seperator_munge_get
    return d[key]
KeyError: 'matplotlib-base'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File ".local/pipx/venvs/conda-lock/lib/python3.9/site-packages/conda_lock/src_parser/__init__.py", line 491, in seperator_munge_get
    return d[key.replace("-", "_")]
KeyError: 'matplotlib_base'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File ".local/bin/conda-lock", line 8, in <module>
    sys.exit(main())
  File ".local/pipx/venvs/conda-lock/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File ".local/pipx/venvs/conda-lock/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File ".local/pipx/venvs/conda-lock/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File ".local/pipx/venvs/conda-lock/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File ".local/pipx/venvs/conda-lock/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File ".local/pipx/venvs/conda-lock/lib/python3.9/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File ".local/pipx/venvs/conda-lock/lib/python3.9/site-packages/conda_lock/conda_lock.py", line 1353, in lock
    lock_func(
  File ".local/pipx/venvs/conda-lock/lib/python3.9/site-packages/conda_lock/conda_lock.py", line 1083, in run_lock
    make_lock_files(
  File ".local/pipx/venvs/conda-lock/lib/python3.9/site-packages/conda_lock/conda_lock.py", line 408, in make_lock_files
    lock_content = lock_content | create_lockfile_from_spec(
  File ".local/pipx/venvs/conda-lock/lib/python3.9/site-packages/conda_lock/conda_lock.py", line 801, in create_lockfile_from_spec
    deps = _solve_for_arch(
  File ".local/pipx/venvs/conda-lock/lib/python3.9/site-packages/conda_lock/conda_lock.py", line 737, in _solve_for_arch
    pip_deps = solve_pypi(
  File ".local/pipx/venvs/conda-lock/lib/python3.9/site-packages/conda_lock/pypi_solver.py", line 327, in solve_pypi
    src_parser._apply_categories(requested=pip_specs, planned=planned)
  File ".local/pipx/venvs/conda-lock/lib/python3.9/site-packages/conda_lock/src_parser/__init__.py", line 502, in _apply_categories
    for dep in seperator_munge_get(planned, item).dependencies
  File ".local/pipx/venvs/conda-lock/lib/python3.9/site-packages/conda_lock/src_parser/__init__.py", line 493, in seperator_munge_get
    return d[key.replace("_", "-")]
KeyError: 'matplotlib-base'
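The three chained `KeyError`s in the traceback correspond to three lookups: the key as given, the key with `-` replaced by `_`, and the key with `_` replaced by `-`. The sketch below mirrors that behavior to illustrate why separator swapping alone cannot succeed here. It is not conda-lock's code: `separator_munge_get_sketch` and the contents of `planned` are hypothetical, and it assumes the planned PyPI set is keyed by `matplotlib`.

```python
from typing import Dict, TypeVar

T = TypeVar("T")


def separator_munge_get_sketch(d: Dict[str, T], key: str) -> T:
    """Look up `key`, retrying with '-' and '_' swapped (as in the traceback)."""
    for candidate in (key, key.replace("-", "_"), key.replace("_", "-")):
        if candidate in d:
            return d[candidate]
    raise KeyError(key)


# Hypothetical planned packages keyed by name; values are placeholders.
planned = {"matplotlib": "planned-matplotlib", "scikit_misc": "planned-scikit-misc"}

# Swapping separators resolves a pure hyphen/underscore difference ...
found = separator_munge_get_sketch(planned, "scikit-misc")

# ... but cannot map a conda name onto a differently named PyPI package.
try:
    separator_munge_get_sketch(planned, "matplotlib-base")
    missing = None
except KeyError as err:
    missing = err.args[0]

print(found, missing)
```

Under this assumption, resolving the failure would require an explicit conda-to-PyPI name mapping rather than a change of separators.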

The approach described here would likely work, but since it would still require maintenance of two separate lists of dependencies it is not necessarily better than manually synchronizing pyproject.toml with environment.yml.

add Docker container

add hydra configuration management

make README links absolute for compatibility with PyPI description

add test stubs for each module

add test stubs for module load regression

set version using importlib rather than expecting hard-coded

pyrovelocity/pyrovelocity/pyrovelocity.py

Line 4 in aa05ef6

from . import __version__

expects the __version__ variable, which can be hard-coded in __init__.py or determined using importlib as in python-poetry/poetry#273 (comment). The latter is preferable to avoid a requirement for multiple updates on release that arises with the former.

importlib.metadata.version

from importlib import metadata

__version__ = metadata.version(__package__)

del metadata  # avoids polluting dir(__package__)
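One caveat with this approach is that `metadata.version` raises `PackageNotFoundError` when the package is imported from a source tree without having been installed. A minimal sketch of a guarded lookup follows; `resolve_version` and the `"unknown"` fallback are hypothetical names chosen for illustration, not part of pyrovelocity.

```python
from importlib import metadata


def resolve_version(distribution_name: str, fallback: str = "unknown") -> str:
    """Return the installed version of a distribution, or `fallback` if absent."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return fallback


# The name below is a placeholder that is not expected to be installed.
print(resolve_version("no-such-distribution-for-demo"))
```

In `__init__.py` the same pattern could be applied with `__package__` as the distribution name, as in the snippet above.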

uniformly preprocess macroscopic gene set by default

We want to analyze the PBMC data set with default settings of up to 2000 highly variable genes rather than the top 3 dynamical genes.

I added pipeline stages for pbmc68k and model 2 training with up to 2000 highly variable genes and collected metrics using the dvc pipeline.

add pre-commit lint automation

add lint automation with pre-commit

usage of model-side sequential enumeration restricts upgrade of pytorch

background

Pyro’s enumeration strategy for discrete latent variable models

issue

Recalling this comment from #6 (with minor edits for the present context)

We constrain pytorch=1.8.* because the latest pyro at the time of this writing (1.8.1+06911dc), combined with pytorch 1.12.1, does not support model-side sequential enumeration, as indicated in pyro util.py. This combination results in a memory leak and raises the error:

NotImplementedError: At site 'cell_gene_state', model-side sequential enumeration is not implemented. Try parallel enumeration or guide-side enumeration.

As above, this issue will be closed when pyro supports model-side sequential enumeration, and the corresponding hard-coded pyro version could be removed from pyrovelocity v0.1.0.

This is discussed in

In pyrovelocity v0.1.1, this is the relevant line of _velocity_model.py where model-side sequential enumeration is used. In order to upgrade pyro and pytorch to their respective latest stable versions, we need to experiment with restricting models requiring enumeration to parallel or guide-side enumeration as suggested.

add conda environment configuration

Example environment yaml files and conda lockfiles for cpu and gpu platforms are not currently included.
These will be added to a conda subfolder.

add release automation

reproduce suppfig5 and supp tables

pyrovelocity version: 0.1.0
Python version: 3.8
Operating System: CentOS

Description

We need to reproduce the remaining supplementary materials.

support gpu acceleration on multiple platforms

To support GPU utilization on Apple M1 or M2 ( https://pytorch.org/blog/introducing-accelerated-pytorch-training-on-mac/ ) pyrovelocity would need to support pytorch >=1.12. For reference the latest stable version is 1.12.1.

The library currently supports 1.8.*.

With, for example, the following channels (note nodefaults, which helps to avoid confusion of dependency versions between conda-forge and anaconda package repositories)

channels:
- conda-forge
- bioconda
- nodefaults

and inclusion of dependency on

  - pytorch=1.8.*

in the default environment configuration file, GPU acceleration should work by default on the M1/M2 once this constraint is relaxed to pytorch>=1.12 (that is, when the current dependencies are upgraded across the 1.12 threshold).

Relative to the above dependency in the package environment.yml,

environment.yml

name: pyrovelocity

channels:
  - conda-forge
  - bioconda
  # https://stackoverflow.com/a/71110028/446907
  # We want to have a reproducible setup, so we don't want default channels,
  # which may be different for different users. All required channels should
  # be listed explicitly here.
  - nodefaults

dependencies:
  - python=3.8
  - mamba=0.27.0
  - conda-lock=1.1.3
  - leidenalg=0.9.0
  - pyro-ppl=1.6.0
  - pip=22.2.2
  - seaborn=0.11.2
  - scvelo=0.2.4
  - scvi-tools=0.13.0
  - pytorch-lightning=1.3.0
#  - pytorch-gpu=1.8.*
#  - pytorch=1.8.*=*cuda*
  - pytorch=1.8.*
  - scikit-misc=0.1.4
  - torchmetrics=0.5.1
  - h5py=3.7.0
  - anndata=0.7.5
  - adjusttext=0.7.3
  - astropy=5.1
  - pip:
    - cospar==0.2.1

including a note such as the following

To enable GPU acceleration on linux platforms, you may attempt to execute a command similar to

conda install -y pytorch-gpu=1.8.* -c conda-forge

if your environment is compatible, for example, with 
[conda virtual package](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-virtual.html) 
`__cuda >= 11`. 

Note that the `pytorch-gpu` package pointer only works on linux platforms. 
An alternative is to specify constraints such as `pytorch=1.8.*=*cuda*`. 
Please see information regarding associated dependencies in

- [conda-forge/pytorch-gpu:1.8.0](https://anaconda.org/conda-forge/pytorch-gpu/files?version=1.8.0) 
- [conda-forge/cudatoolkit:11.2.2](https://anaconda.org/conda-forge/cudatoolkit/files?version=11.2.2)
- [nvidia cuda toolkit <--> driver version compatibility](https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#cuda-major-component-versions__table-cuda-toolkit-driver-versions)

should allow anyone to enable GPU acceleration on linux platforms. See references to one dependency chain from pytorch-gpu to cudatoolkit below.

add reproducibility pipeline for figure 2

pipeline stages

flowchart TD
        node1["data_download_pancreas"]
        node2["data_download_pbmc68k"]
        node3["figure2"]
        node5["preprocess_pancreas"]
        node6["preprocess_pbmc68k"]
        node7["train_pancreas_model1"]
        node8["train_pbmc68k_model1"]
        node1-->node5
        node2-->node6
        node5-->node7
        node6-->node8
        node7-->node3
        node8-->node3
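The stage graph above fixes the order in which stages can run. The pipeline tool resolves this ordering itself; the sketch below only illustrates the idea with the standard-library `graphlib` module (Python 3.9+), using the edges from the flowchart. The variable names are chosen for illustration.

```python
from graphlib import TopologicalSorter  # available in Python 3.9+

# Each stage maps to the set of stages it depends on (edges from the flowchart).
stage_dependencies = {
    "preprocess_pancreas": {"data_download_pancreas"},
    "preprocess_pbmc68k": {"data_download_pbmc68k"},
    "train_pancreas_model1": {"preprocess_pancreas"},
    "train_pbmc68k_model1": {"preprocess_pbmc68k"},
    "figure2": {"train_pancreas_model1", "train_pbmc68k_model1"},
}

# static_order() yields every stage after all of its dependencies.
order = list(TopologicalSorter(stage_dependencies).static_order())
print(order)
```

The pancreas and pbmc68k branches are independent until `figure2`, so several valid orders exist and the two branches could run in parallel.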

file dependencies

flowchart TD
        node1["data/external/endocrinogenesis_day15.h5ad"]
        node2["data/external/pbmc68k.h5ad"]
        node3["data/processed/pancreas_processed.h5ad"]
        node4["data/processed/pbmc_processed.h5ad"]
        node5["models/pancreas_model1/model"]
        node6["models/pancreas_model1/pyrovelocity.pkl"]
        node7["models/pancreas_model1/trained.h5ad"]
        node8["models/pbmc68k_model1/model"]
        node9["models/pbmc68k_model1/pyrovelocity.pkl"]
        node10["models/pbmc68k_model1/trained.h5ad"]
        node11["reports/fig2/fig2_pancreas_rainbow.pdf"]
        node12["reports/fig2/fig2_pancreas_shared_time.pdf"]
        node13["reports/fig2/fig2_pancreas_vector_field.pdf"]
        node14["reports/fig2/fig2_pancreas_volcano.pdf"]
        node15["reports/fig2/fig2_raw_gene_selection_model1.svg"]
        node16["reports/fig2/fig2_raw_gene_selection_model1.tif"]
        node1-->node3
        node2-->node4
        node3-->node5
        node3-->node6
        node3-->node7
        node4-->node8
        node4-->node9
        node4-->node10
        node6-->node11
        node6-->node12
        node6-->node13
        node6-->node14
        node6-->node15
        node6-->node16
        node7-->node11
        node7-->node12
        node7-->node13
        node7-->node14
        node7-->node15
        node7-->node16
        node9-->node15
        node9-->node16
        node10-->node15
        node10-->node16

support experiment tracking

update minor release version

write training metrics to log file

include supplementary figure for uncertainty analysis of kinetics parameters

pyrovelocity version: 0.1.1
Python version: 3.8
Operating System: Linux

This issue is used to generate the uncertainty boxplot of parameters.

convert documentation from rST to MyST

https://myst-parser.readthedocs.io/

meet an error about poisson distribution support

pyrovelocity version: 0.1.1
Python version: 3.8
Operating System: Linux

Description

I tried to run the model on my data but encountered the following error:

ValueError: Expected value argument (Tensor of shape (500, 2000)) to be within the support (IntegerGreaterThan(lower_bound=0)) of the distribution Poisson(rate: torch.Size([500, 2000])), but found invalid values: tensor([[0.0000, 0.0000, 0.0000, ..., 0.0000, 1.6010, 1.0000], [1.8628, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.0000],

...

st_norm dist 500 2000 | value 500 2000 | log_prob 500 2000 | u dist 500 2000 | value 500 2000 |

My code is like:

from pyrovelocity.api import train_model

# import os
# os.environ['CUDA_VISIBLE_DEVICES']='0'

# Model 1
num_epochs = 1000  # large data
# num_epochs = 4000  # small data

adata_model_pos = train_model(adata,
                              max_epochs=num_epochs, svi_train=True, log_every=100,
                              patient_init=45,
                              batch_size=500, use_gpu=False, cell_state='state_info',
                              include_prior=True,
                              offset=False,
                              library_size=True,
                              patient_improve=1e-3,
                              model_type='auto',
                              guide_type='auto_t0_constraint',
                              train_size=1.0)

adata is the data I have, with information like this

AnnData object with n_obs × n_vars = 3060 × 37007
    obs: 'cluster', 'label_time', 'nGene', 'nUMI', 'obs_names', 'time'
    var: 'gene_short_name', 'var_names'
    layers: 'matrix', 'spliced', 'unspliced'

Is there any constraint on the adata? I would appreciate it so much if someone could give me a hand.
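The error states that the observed values must lie in the support of a Poisson distribution, that is, non-negative integers, while the tensor printed in the message contains fractional entries such as 1.6010 and 1.8628. This suggests, though the issue does not confirm it, that the 'spliced' or 'unspliced' layers hold normalized or smoothed values rather than raw counts. Below is a minimal sketch of a check for such entries; `find_non_count_entries` is a hypothetical helper, and the toy matrix reuses values from the error message.

```python
from typing import List, Sequence, Tuple


def find_non_count_entries(
    matrix: Sequence[Sequence[float]], limit: int = 5
) -> List[Tuple[int, int, float]]:
    """Return up to `limit` (row, column, value) entries that are negative or non-integer."""
    offending: List[Tuple[int, int, float]] = []
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value < 0 or not float(value).is_integer():
                offending.append((i, j, value))
                if len(offending) == limit:
                    return offending
    return offending


# Toy values taken from the error message; a real check would iterate over the
# rows of the spliced and unspliced layers of the AnnData object.
toy_layer = [
    [0.0, 0.0, 1.6010, 1.0],
    [1.8628, 0.0, 0.0, 0.0],
]
bad = find_non_count_entries(toy_layer)
print(bad)
```

An empty result for both layers would indicate integer counts, in which case the cause of the error lies elsewhere.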

add reproducibility pipeline for figure S3

generate html callgraph

add framework for reproducibility pipelines

support continuous integration for model construction and evaluation experiments

configure workflow_dispatch trigger
add optional tmate debugging sessions for runner deployment and model training
restrict run on push to relevant branches and paths
define environment variables for github, mlflow, and gcp
test model training job
update workflow comments to reflect current status

run model 1 on pons oligodendrocyte data

requirement of anndata = "0.7.5" makes it impossible to load anndata saved with newer version (e.g. 0.8.0)

Hi,

I have problems loading anndata objects after installing the pyro velocity environment. It seems like you have a requirement of:

anndata = "0.7.5"

can you change this to anndata = 0.8.0? See this issue for example explaining that new anndata objects cannot be loaded with the old anndata packages:

scverse/anndata#698

Best wishes,

Alexander

"TypeError: Can't instantiate abstract class PyroVelocity with abstract methods setup_anndata" When I call train_model.

pyrovelocity version:
Python version: 3.8.13
Operating System: Windows10

Description

When I run pbmc.ipynb in pyrovelocity-master/docs/source/notebooks, everything works until training. At that point the call to train_model fails. What can I do? (See code cell 15 in https://pyrovelocity.readthedocs.io/en/latest/source/notebooks/pbmc.html#Reproduce-fully-mature-cell-type-problem)

What I Did

The train_model code.

adata_model_pos = train_model(adata_sub, max_epochs=4000, svi_train=False, log_every=100,
                              patient_init=45, batch_size=-1, use_gpu=1,
                              include_prior=True, offset=False, library_size=True,
                              cell_state='celltype',
                              patient_improve=1e-4, guide_type='auto_t0_constraint', train_size=0.67)

It fails with the following error:

TypeError                                 Traceback (most recent call last)
Cell In [14], line 1
----> 1 adata_model_pos = train_model(adata_sub, max_epochs=4000, svi_train=False, log_every=100,
      2                               patient_init=45, batch_size=-1, use_gpu=1,
      3                               include_prior=True, offset=False, library_size=True,
      4                               cell_state='celltype',
      5                               patient_improve=1e-4, guide_type='auto_t0_constraint', train_size=0.67)

File f:\ProgrammingEnvironment\Anaconda\Anaconda\envs\scvi-env\lib\site-packages\pyrovelocity-0.1.0-py3.8.egg\pyrovelocity\api.py:24, in train_model(adata, guide_type, model_type, svi_train, batch_size, train_size, use_gpu, likelihood, num_samples, log_every, cell_state, patient_improve, patient_init, seed, lr, max_epochs, include_prior, library_size, offset, input_type, cell_specific_kinetics, kinetics_num)
     13 def train_model(adata,
     14                 guide_type='auto',
     15                 model_type='auto',
   (...)
     22                 library_size=True, offset=False, input_type='raw',
     23                 cell_specific_kinetics=None, kinetics_num=2):
---> 24     model = PyroVelocity(adata, likelihood=likelihood,
     25                          model_type=model_type,
     26                          guide_type=guide_type, correct_library_size=library_size,
     27                          add_offset=offset,
     28                          include_prior=include_prior, input_type=input_type,
     29                          cell_specific_kinetics=cell_specific_kinetics, kinetics_num=kinetics_num)
     30     if svi_train and (guide_type=='velocity_auto' or guide_type == 'velocity_auto_t0_constraint'):
     31         if batch_size == -1:

TypeError: Can't instantiate abstract class PyroVelocity with abstract methods setup_anndata
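For readers unfamiliar with this kind of TypeError, the sketch below reproduces the general mechanism with hypothetical classes (`BaseModel` and `OldStyleModel` are illustrative names, not classes from pyrovelocity or its dependencies). Python raises this error when a subclass of an abstract base class does not implement every method marked abstract. One plausible cause in the report above, stated as an assumption and not confirmed there, is that the installed base-class library declares `setup_anndata` as abstract while the installed pyrovelocity 0.1.0 does not define it.

```python
from abc import ABC, abstractmethod


class BaseModel(ABC):
    """Hypothetical base class that requires subclasses to define setup_anndata."""

    @classmethod
    @abstractmethod
    def setup_anndata(cls, adata):
        """Register the data object with the model class."""


class OldStyleModel(BaseModel):
    """Hypothetical subclass written before setup_anndata became abstract."""


class UpdatedModel(BaseModel):
    """Hypothetical subclass that implements the required method."""

    @classmethod
    def setup_anndata(cls, adata):
        return adata


try:
    OldStyleModel()
except TypeError as err:
    # The exact wording differs between Python versions, but it names
    # the missing abstract method.
    message = str(err)
    print(message)

# Instantiation succeeds once the abstract method is implemented.
model = UpdatedModel()
```

Under that assumption, the error would be resolved either by installing dependency versions that match the pyrovelocity release or by using a pyrovelocity version that implements the method.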

update reproducibility scripts

The reproducibility scripts:

have many unused imports
have many code comments
import the wrong library name
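As a rough illustration of how the unused imports listed above could be located, the following sketch uses the standard-library `ast` module. The function name `find_unused_imports` is hypothetical, and the check is deliberately simple: it ignores star imports, `__all__`, re-exports, and names referenced only inside strings, so a dedicated linter would be more reliable in practice.

```python
import ast
from typing import List


def find_unused_imports(source: str) -> List[str]:
    """Return imported names that are never referenced in `source`."""
    tree = ast.parse(source)

    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds only the top-level name `a`.
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name == "*":
                    continue  # star imports cannot be checked this way
                imported.add(alias.asname or alias.name)

    # Attribute access such as `np.array` still contains a Name node for `np`.
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(imported - used)


example = (
    "import os\n"
    "import sys\n"
    "from math import sqrt, pi\n"
    "print(sys.argv, sqrt(4))\n"
)
print(find_unused_imports(example))
```

Applied to each script's source text, the returned names are candidates for removal and should be reviewed by hand before deleting anything.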

fix image links in documentation

Image links in readme.html are broken due to relative import in docs/readme.rst from README.rst without specification of :relative-images:.

bioconda and anaconda poetry fix

pyrovelocity version: 0.1.0
Python version: 3.8.8
Operating System: CentOS

Description

The poetry version is not compatible with conda-build.

What I Did

Fixed the poetry requirement in pyproject.yml and meta.yml, then ran:

conda-build conda/recipe

add IaC for ephemeral gpu-enabled development environment

testing some components of the codebase requires a GPU-enabled ephemeral environment
development will benefit from a consistent environment for prototyping and testing

Environment

pyrovelocity version: 0.1.0
Python version: 3.8
Operating System: vm image

Description

We need to update and save additional plots to support comparison of models 1 and 2.

What I Did

I added two figures to the dvc pipeline to allow model comparison based on marker selection and associated phase portraits.

update pyrovelocity installation

pyrovelocity version: 0.1.0
Python version: 3.8
Operating System: Linux

Description

We tried to replace the old manual installation instructions in the documentation with installation from the bioconda package.

What I Did

I uploaded our package both as a bioconda recipe and to my own channel.
