
neps's Introduction

Neural Pipeline Search (NePS)


NePS helps deep learning experts optimize the hyperparameters and/or architecture of their deep learning pipeline with:

  • Hyperparameter Optimization (HPO) (example)
  • Neural Architecture Search (NAS) (example, paper)
  • Joint Architecture and Hyperparameter Search (JAHS) (example, paper)

For efficiency and convenience NePS allows you to:

  • Add your intuition as priors for the search
  • Utilize many cheap evaluations via multi-fidelity search

Or all of the above for maximum efficiency!

Recent publications

Documentation

Please have a look at our documentation and examples.

Note

As indicated with the v0.x.x version number, NePS is early-stage code and APIs might change in the future.

Installation

Using pip

pip install neural-pipeline-search

Usage

Using neps always follows the same pattern:

  1. Define a run_pipeline function that evaluates architectures/hyperparameters for your problem
  2. Define a search space pipeline_space of architectures/hyperparameters
  3. Call neps.run to optimize run_pipeline over pipeline_space

In code, the usage pattern can look like this:

import neps
import logging


# 1. Define a function that accepts hyperparameters and computes the validation error
def run_pipeline(hyperparameter_a: float, hyperparameter_b: int):
    validation_error = -hyperparameter_a * hyperparameter_b
    return validation_error


# 2. Define a search space of hyperparameters; use the same names as in run_pipeline
pipeline_space = dict(
    hyperparameter_a=neps.FloatParameter(lower=0, upper=1),
    hyperparameter_b=neps.IntegerParameter(lower=1, upper=100),
)

# 3. Call neps.run to optimize run_pipeline over pipeline_space
logging.basicConfig(level=logging.INFO)
neps.run(
    run_pipeline=run_pipeline,
    pipeline_space=pipeline_space,
    root_directory="usage_example",
    max_evaluations_total=5,
)

For more details and features please have a look at our documentation and examples.

Analysing runs

See our documentation on analysing runs.

Contributing

Please see the documentation for contributors.

Citations

Please consider citing us if you use our tool!

Refer to our documentation on citations.

Alternatives

NePS does not cover your use-case? Have a look at some alternatives.

neps's People

Contributors

dastoll · dependabot[bot] · eddiebergman · herilalaina · hvarfner · karibbov · lumib · neeratyoy · nilskober · rubinxin · simonschrodi · tarekabouchakra · webalorn · worstseed


neps's Issues

Warning from Tensorboard package while running NePS

Shown Warning:

...python3.9/site-packages/torch/utils/tensorboard/__init__.py:4: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
  if not hasattr(tensorboard, "__version__") or LooseVersion(
      tensorboard.__version__
  ) < LooseVersion("1.15"):

Potential issues if multiple checkpoints for one config

ckpt_files = glob.glob(str(checkpoint_dir / "*.ckpt"))
if ckpt_files:
    # Load the checkpoint and retrieve necessary data
    checkpoint_path = ckpt_files[-1]
    checkpoint = torch.load(checkpoint_path)

This just grabs the last checkpoint returned by glob, which makes no guarantees about ordering. For now it might be safer to just error if multiple are found (see the sketch below). Not sure about a long-term solution.
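
As an illustration, a minimal sketch of that stricter behaviour, assuming the checkpoint_dir from the snippet above (the error type and message are mine):

import glob
from pathlib import Path

import torch

checkpoint_dir = Path("checkpoints")  # stands in for the directory used above
ckpt_files = sorted(glob.glob(str(checkpoint_dir / "*.ckpt")))
if len(ckpt_files) > 1:
    # Fail loudly instead of silently picking an arbitrary file
    raise RuntimeError(f"Expected at most one checkpoint, found: {ckpt_files}")
checkpoint = torch.load(ckpt_files[0]) if ckpt_files else None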

Treating Ordinals as purely Categorical may make optimizers weaker than they should be

See these lines when converting from ConfigSpace spaces to NePS spaces:

elif isinstance(hyperparameter, CS.CategoricalHyperparameter):
    parameter = CategoricalParameter(
        hyperparameter.choices,
        default=hyperparameter.default_value,
    )
elif isinstance(hyperparameter, CS.OrdinalHyperparameter):
    parameter = CategoricalParameter(
        hyperparameter.sequence,
        default=hyperparameter.default_value,
    )

This may be fine for small ordinals like ["small", "medium", "large"], just treating them as a categorical, but, for tabular benchmarks this may be an issue.

For example, consider a benchmark which only has tabular entries for hyperparameters x, y, i.e.

x = Ordinal([1, 1.5, 4.5, 16, 32.354, ..., 100])
y = Ordinal(["small", "medium", "large"])

An optimizer which takes their order into account (e.g. SMAC) should theoretically outperform one which doesn't.

One hacky solution would be to convert the ordinal to an integer representation that acts as an index, so the order information is preserved. A sketch of this follows.
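
A minimal sketch of that index encoding, assuming we search over indices with a plain IntegerParameter and decode inside the pipeline (the names here are illustrative):

import neps

ordinal_values = [1, 1.5, 4.5, 16, 32.354, 100]  # the Ordinal's sequence

# Encode: search over indices, which preserves the order information
pipeline_space = dict(
    x_index=neps.IntegerParameter(lower=0, upper=len(ordinal_values) - 1),
)

def run_pipeline(x_index: int):
    x = ordinal_values[x_index]  # decode back to the ordinal value
    ...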

[Doc] Seeding

I have no idea how to seed NePS, and there are no arguments to do so on the core run(), which is the main public-facing API. This is pretty important for any benchmarking or users of NePS. I'm sure it's in an example somewhere, but this belongs in the online docs. Just leaving this here until it's done.
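
Until run() exposes a seed argument, a common workaround (an assumption on my part, not a documented NePS API) is to set the global seeds before calling neps.run:

import random

import numpy as np
import torch

def set_seed(seed: int) -> None:
    # Seed every source of randomness a typical pipeline touches
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)

set_seed(42)  # call once before neps.run(...)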

[API] Requesting a formal `ask()` and `tell()` interface

Currently I have a version that basically hacks into the internals of metahyper to get an ask() and tell() interface to all that NePS has to offer in terms of optimizers. This implementation basically relieves NePS of actually having to evaluate anything; I just want the suggestions from the optimizers.

Updating to 0.10.0 gives a new warning:

WARNING:amltk.optimization.optimizers.neps:There are 1 configs that were sampled, but have no worker assigned. Sometimes this is due to a delay in the filesystem communication, but most likely some configs crashed during their execution or a jobtime-limit was reached.

I can't really complain, as NePS doesn't expose this. I'd like to keep NePS as an optional dependency for AMLTK, but I would need a stable API to build on.


class NEPSOptimizer(Optimizer[NEPSTrialInfo]):
    """An optimizer that uses SMAC to optimize a config space."""

    def __init__(
        self,
        *,
        space: SearchSpace,
        optimizer: BaseOptimizer,
        working_dir: Path,
        bucket: Bucket | None = None,
        ignore_errors: bool = True,
        loss_value_on_error: float | None = None,
        cost_value_on_error: float | None = None,
    ) -> None:
        """Initialize the optimizer.

        Args:
            space: The space to use.
            optimizer: The optimizer to use.
            working_dir: The directory to use for the optimization.
            bucket: The bucket to give to trials generated from this optimizer.
            ignore_errors: Whether the optimizers should ignore errors from trials.
            loss_value_on_error: The value to use for the loss if the trial fails.
            cost_value_on_error: The value to use for the cost if the trial fails.
        """
        super().__init__(bucket=bucket)
        self.space = space
        self.optimizer = optimizer
        self.working_dir = working_dir
        self.ignore_errors = ignore_errors
        self.loss_value_on_error = loss_value_on_error
        self.cost_value_on_error = cost_value_on_error

        self.optimizer_state_file = self.working_dir / "optimizer_state.yaml"
        self.base_result_directory = self.working_dir / "results"
        self.serializer = metahyper.utils.YamlSerializer(self.optimizer.load_config)

        self.working_dir.mkdir(parents=True, exist_ok=True)
        self.base_result_directory.mkdir(parents=True, exist_ok=True)

    @classmethod
    def create(  # noqa: PLR0913
        cls,
        *,
        space: (
            SearchSpace
            | ConfigurationSpace
            | Mapping[str, ConfigurationSpace | Parameter]
        ),
        bucket: Bucket | None = None,
        searcher: str | BaseOptimizer = "default",
        working_dir: str | Path = "neps",
        overwrite: bool = True,
        loss_value_on_error: float | None = None,
        cost_value_on_error: float | None = None,
        max_cost_total: float | None = None,
        ignore_errors: bool = True,
        searcher_kwargs: Mapping[str, Any] | None = None,
    ) -> Self:
        """Create a new NEPS optimizer.

        Args:
            space: The space to use.
            bucket: The bucket to give to trials generated by this optimizer.
            searcher: The searcher to use.
            working_dir: The directory to use for the optimization.
            overwrite: Whether to overwrite the working directory if it exists.
            loss_value_on_error: The value to use for the loss if the trial fails.
            cost_value_on_error: The value to use for the cost if the trial fails.
            max_cost_total: The maximum cost to use for the optimization.

                !!! warning

                    This only affects the optimization if the searcher utilizes the
                    budget for its actual suggestion of the next config. If the
                    searcher does not use the budget, this parameter has no effect.

                    The user is still expected to stop `ask()`'ing for configs when
                    they have reached some budget.

            ignore_errors: Whether the optimizers should ignore errors from trials
                or whether they should be taken into account. Please set `loss_value_on_error`
                and/or `cost_value_on_error` if you set this to `False`.
            searcher_kwargs: Additional kwargs to pass to the searcher.
        """
        space = _to_neps_space(space)
        searcher = _to_neps_searcher(
            space=space,
            searcher=searcher,
            loss_value_on_error=loss_value_on_error,
            cost_value_on_error=cost_value_on_error,
            max_cost_total=max_cost_total,
            ignore_errors=ignore_errors,
            searcher_kwargs=searcher_kwargs,
        )
        working_dir = Path(working_dir)
        if working_dir.exists() and overwrite:
            logger.info(f"Removing existing working directory {working_dir}")
            shutil.rmtree(working_dir)

        return cls(
            space=space,
            bucket=bucket,
            optimizer=searcher,
            working_dir=working_dir,
            loss_value_on_error=loss_value_on_error,
            cost_value_on_error=cost_value_on_error,
        )

    @override
    def ask(self) -> Trial[NEPSTrialInfo]:
        """Ask the optimizer for a new config.

        Returns:
            The trial info for the new config.
        """
        with self.optimizer.using_state(self.optimizer_state_file, self.serializer):
            (
                config_id,
                config,
                pipeline_directory,
                previous_pipeline_directory,
            ) = metahyper.api._sample_config(  # type: ignore
                optimization_dir=self.working_dir,
                sampler=self.optimizer,
                serializer=self.serializer,
                logger=logger,
            )

        if isinstance(config, SearchSpace):
            _config = config.hp_values()
        else:
            _config = {
                k: v.value if isinstance(v, Parameter) else v for k, v in config.items()
            }

        info = NEPSTrialInfo(
            name=str(config_id),
            config=deepcopy(_config),
            pipeline_directory=pipeline_directory,
            previous_pipeline_directory=previous_pipeline_directory,
        )
        trial = Trial(
            name=info.name,
            config=info.config,
            info=info,
            seed=None,
            bucket=self.bucket,
        )
        logger.debug(f"Asked for trial {trial.name}")
        return trial

    @override
    def tell(self, report: Trial.Report[NEPSTrialInfo]) -> None:
        """Tell the optimizer the result of the sampled config.

        Args:
            report: The report of the trial.
        """
        logger.debug(f"Telling report for trial {report.trial.name}")
        info = report.info
        assert info is not None

        # This is how NEPS handles errors
        result: Literal["error"] | dict[str, Any]
        if report.status in (Trial.Status.CRASHED, Trial.Status.FAIL):
            result = "error"
        else:
            result = report.results

        metadata: dict[str, Any] = {"time_end": report.time.end}
        if result == "error":
            if not self.ignore_errors:
                if self.loss_value_on_error is not None:
                    report.results["loss"] = self.loss_value_on_error
                if self.cost_value_on_error is not None:
                    report.results["cost"] = self.cost_value_on_error
        else:
            if (loss := result.get("loss")) is not None:
                report.results["loss"] = float(loss)
            else:
                raise ValueError(
                    "The 'loss' should be provided if the trial is successful"
                    f"\n{result=}",
                )

            if (cost := result.get("cost")) is not None:
                cost = float(cost)
                result["cost"] = cost
                account_for_cost = result.get("account_for_cost", True)

                if account_for_cost:
                    with self.optimizer.using_state(
                        self.optimizer_state_file,
                        self.serializer,
                    ):
                        self.optimizer.used_budget += cost

                metadata["budget"] = {
                    "max": self.optimizer.budget,
                    "used": self.optimizer.used_budget,
                    "eval_cost": cost,
                    "account_for_cost": account_for_cost,
                }
            elif self.optimizer.budget is not None:
                raise ValueError(
                    "'cost' should be provided when the optimizer has a budget"
                    f"\n{result=}",
                )

        # Dump results
        self.serializer.dump(result, info.pipeline_directory / "result")

        # Load and dump metadata
        config_metadata = self.serializer.load(info.pipeline_directory / "metadata")
        config_metadata.update(metadata)
        self.serializer.dump(config_metadata, info.pipeline_directory / "metadata")

    @override
    @classmethod
    def preferred_parser(cls) -> NEPSPreferredParser:
        """The preferred parser for this optimizer."""
        # TODO: We might want a custom one for neps.SearchSpace, for now we will
        # use config space but without conditions as NePs doesn't support conditionals
        return partial(configspace_parser, conditionals=False)
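
For context, a rough sketch of how the requested interface would be driven, based on the class above (evaluate is a user-defined stand-in, and the exact AMLTK report call may differ between versions):

# Hypothetical driver loop: NePS only suggests configs; we evaluate them ourselves.
optimizer = NEPSOptimizer.create(space=pipeline_space, working_dir="neps_ask_tell")

for _ in range(10):
    trial = optimizer.ask()
    loss = evaluate(trial.config)      # user-defined evaluation, not NePS
    report = trial.success(loss=loss)  # AMLTK-style success report
    optimizer.tell(report)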

Enforce Integer Constraints in Integer Hyperparameter Search Space Definition

class IntegerParameter(FloatParameter):
    def __init__(
        self,
        lower: float | int,
        upper: float | int,

The IntegerParameter class currently extends FloatParameter, allowing floats to be passed in instances where the intention is to define an integer hyperparameter search space. Although this is dealt with internally by rounding, should users be allowed to pass in floats when defining integers in the search space? A sketch of stricter validation follows.
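
A minimal sketch of the stricter validation being proposed (hypothetical, not the current NePS behaviour; it assumes the existing FloatParameter base class):

class IntegerParameter(FloatParameter):
    def __init__(self, lower: float | int, upper: float | int, **kwargs):
        # Reject non-integral bounds instead of silently rounding them
        if lower != int(lower) or upper != int(upper):
            raise TypeError(
                f"IntegerParameter bounds must be integers, got lower={lower}, upper={upper}"
            )
        super().__init__(lower=int(lower), upper=int(upper), **kwargs)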

Check restarting/handling of pending config when resuming a run

For potential reproducibility of the observed issue:

  • Running Random Search for 20 (max_evaluations_total) evaluations distributed across 4 workers
  • Midway through the run, killed a worker and restarted the worker soon enough
  • The overall run finished fine, but I noticed certain anomalies, described below:
  1. The process termination halted a config, for example, config ID 16
  2. On restarting, the 4 workers proceeded fine without errors but an extra config ID 21 was generated while config ID 16 was not re-evaluated or completed and remains pending forever

Some more observations:

  • For max_evaluations_total=20 we should have config IDs from 1-20 with each of them having their own result.yaml
  • Only config_16 does not have result.yaml whereas config_21 does
  • If I now re-run a worker as max_evaluations_total=21, it now satisfies that extra config required by sampling a new config config_22

Should a new worker re-evaluate pending configs as a priority?
Also, under this scenario the generated config IDs range over [1, n+1] when max_evaluations_total=n. A quick diagnostic sketch for spotting such stuck configs follows.
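
Something like the following, assuming the result.yaml layout described above, finds configs that never completed:

from pathlib import Path

results_dir = Path("neps_root_directory") / "results"  # the run's root_directory
pending = sorted(
    d.name for d in results_dir.iterdir()
    if d.is_dir() and not (d / "result.yaml").exists()
)
print(pending)  # e.g. ["config_16"]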

Regularized Evolution defaults conflicting

Regularized evolution by default expects assisted_zero_cost_proxy: Callable to be assigned, since the default value of assisted: bool is True, but the default assisted_zero_cost_proxy is None. So regularized evolution can't be run without changing at least one of the defaults.

Just changing the default of assisted to False would solve this. I just want to make sure this was not intended before I make the change.

Possibly drop GraKel to support `Python>3.7`

I did some cursory glancing at GraKel and its build system, and it looks like it won't be very actively maintained; the build setup is also a bit convoluted. I would recommend dropping it as soon as possible, or NePS will be locked to Python 3.7 for a long while.

I'm not sure how difficult this would be. It could be a HiWi job if you want to create a pure NumPy replacement; otherwise, maybe just switch to a different graph kernel that doesn't require being built.

I also checked to see if there are any community forks which actually fix this issue, but unfortunately there are none.

I'll keep in communication with them in this PR and see if I can get maintainer access, to switch to an easier and more maintainable build system using GitHub Actions.

[Doc] Add API doc generation

Just needs to be done at some point. I will try when I have some time, but if anyone feels like learning how to build documentation for libraries, feel free, and I'm happy to help guide you through it.

Enhancing Flexibility in Priorband Usage: Allowing Partial Knowledge Without Mandatory Default Parameters

It would be beneficial to be able to add partial knowledge to the search space, especially when using PriorBand, without having to set the "default" argument for all of the parameters.

Example:
pipeline_space = dict(
    hyperparameter_a=neps.FloatParameter(lower=1e-5, upper=1e-1, log=True),
    hyperparameter_b=neps.IntegerParameter(lower=1, upper=20, is_fidelity=True),
    hyperparameter_c=neps.IntegerParameter(
        lower=32, upper=128, default=64, default_confidence="medium"
    ),  # my knowledge
)

Accidental Prior Usage Activation Due to ConstantParameter in SearchSpace

Accidental Prior Usage Activation Due to ConstantParameter in SearchSpace if another Parameter has is_fidelity=True

Why does this error occur:

src/neps/search_spaces/hyperparameters/constant.py

class ConstantParameter(NumericalParameter):
    def __init__(self, value: Union[float, int, str], is_fidelity: bool = False):
        super().__init__()
        self.value = value
        self.is_fidelity = is_fidelity
        self.default = value  # causing the issue

src/neps/search_spaces/search_space.py

class SearchSpace:
    def __init__(self, ...):
        ...
        # Check if defaults exist to construct a prior from
        if hasattr(hyperparameter, "default") and hyperparameter.default is not None:
            self.has_prior = True
        elif hasattr(hyperparameter, "has_prior") and hyperparameter.has_prior:
            self.has_prior = True

Proposed fix:

if hasattr(hyperparameter, "default") and hyperparameter.default is not None:
    if not isinstance(hyperparameter, ConstantParameter):
        self.has_prior = True
elif hasattr(hyperparameter, "has_prior") and hyperparameter.has_prior:
    self.has_prior = True

But: there is still an issue (a warning) with the Hyperband searcher:

/src/neps/optimizers/multi_fidelity/successive_halving.py:240: FutureWarning: The behavior of DataFrame concatenation with empty or all-NA entries is deprecated. In a future version, this will no longer exclude empty or all-NA columns when determining the result dtypes. To retain the old behavior, exclude the relevant entries before the concat operation.
self.observed_configs = pd.concat(

Using `previous_pipeline_directory` should return an absolute path instead of a relative one if possible.

If running a pipeline like the one below, NePS will nicely inject the previous_pipeline_directory:

def run_pipeline(previous_pipeline_directory: Path, **config) -> dict:
    ...

However, this path is relative; essentially I get a path like neps_root_directory/results/config_2_0. It would make things a bit smoother to have this as an absolute path, e.g. for logging or more complex post-analysis behaviours.
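
In the meantime, a one-line user-side workaround (a sketch relying only on pathlib semantics):

from pathlib import Path

def run_pipeline(previous_pipeline_directory: Path | None = None, **config) -> dict:
    if previous_pipeline_directory is not None:
        # Make the injected relative path absolute for logging / post analysis
        previous_pipeline_directory = previous_pipeline_directory.resolve()
    ...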

Metahyper sampling extra config
