deepcave's Introduction

DeepCAVE

DeepCAVE is a visualization and analysis tool for AutoML, with a particular focus on hyperparameter optimization (HPO). Built on the Dash framework, it offers a fully interactive experience. The tool features a variety of plugins that enable efficient insight generation, aiding in understanding and debugging the application of HPO. Additionally, the powerful run interface and the modularized plugin structure allow extending the tool at any time effortlessly.

Installation

First, make sure you have redis-server installed on your computer.

Afterwards, follow the instructions to install DeepCAVE:

conda create -n DeepCAVE python=3.9
conda activate DeepCAVE
conda install -c anaconda swig
pip install DeepCAVE

If you want to contribute to DeepCAVE, use the following steps instead:

git clone https://github.com/automl/DeepCAVE.git
cd DeepCAVE
conda create -n DeepCAVE python=3.9
conda activate DeepCAVE
conda install -c anaconda swig
make install-dev

If you want to try the examples for recording your results in DeepCAVE format, run this after installing:

make install-examples

To load runs created with Optuna or the BOHB optimizer, you need to install the respective packages by running:

make install-optuna
make install-bohb

Please visit the documentation for further help (e.g. if you cannot install redis-server or if you are on macOS).

Visualizing and Evaluating

The webserver as well as the queue/workers can be started by simply running:

deepcave --open

If you specify --open, your web browser opens automatically at http://127.0.0.1:8050/. You can find more arguments and information (such as using custom configurations) in the documentation.

Example runs

DeepCAVE comes with some pre-evaluated runs so you can get a feeling for what it can do.

If you cloned the repository from GitHub via git clone https://github.com/automl/DeepCAVE.git, you can try out some examples by exploring the logs directory inside the DeepCAVE dashboard. For example, if you navigate to logs/DeepCAVE, you can view the run mnist_pytorch by hitting the + button to the left of it.

Features

Interactive Interface

  • Interactive Dashboard:
    The dashboard runs in a web browser and lets you analyze your optimization runs interactively.

  • Run Selection Interface:
    Easily select runs from your working directory directly within the interface.

  • Integrated Help and Documentation:
    Use help buttons and integrated documentation within the interface to better understand the plugins.

Comprehensive Analysis Tools

  • Extensive Plugin Collection:
    Explore a wide range of plugins for in-depth performance, hyperparameter, and budget analysis.

  • Analysis of Running Processes:
    Analyze and monitor optimization processes as they occur, with automatic detection of run changes.

  • Group Analysis:
    Choose groups of runs for combined analysis to gain deeper insights.

Flexible and Modular Architecture

  • Modular Plugin Architecture:
    Benefit from a modularized plugin structure with access to selected runs and groups, offering you maximum flexibility.

  • Asynchronous Execution:
    Utilize asynchronous execution of resource-intensive plugins and caching of results to improve performance.

Broad Optimizer Support

  • Optimizer Support:
    Work with many frameworks and optimizers using our converters, including converters for SMAC, BOHB, AMLTK, and Optuna.

  • Native Format Saving:
    Save AutoML runs from various frameworks in DeepCAVE's native format using the built-in recorder.

  • Flexible Data Loading:
    Alternatively, load AutoML runs from other frameworks by converting them into a Pandas DataFrame.

Developer and API Features

  • API Mode:
    Interact with the code directly through API mode, allowing you to bypass the graphical interface if preferred.

Citation

If you use DeepCAVE in one of your research projects, please cite our ReALML@ICML'22 workshop paper:

@misc{sass-realml2022,
    title = {DeepCAVE: An Interactive Analysis Tool for Automated Machine Learning},
    author = {Sass, René and Bergman, Eddie and Biedenkapp, André and Hutter, Frank and Lindauer, Marius},
    doi = {10.48550/ARXIV.2206.03493},
    url = {https://arxiv.org/abs/2206.03493},
    publisher = {arXiv},
    year = {2022},
    copyright = {arXiv.org perpetual, non-exclusive license}
}

Copyright (C) 2016-2022 AutoML Group.

deepcave's People

Contributors

alexandertornede, dwoiwode, eddiebergman, helegraf, keckelt, krissihub, lukasfehring, mlindauer, phmueller, renesass, sarah-segel

deepcave's Issues

New plugin: Hardness of AutoML problem

Hi,

We often wonder how hard an AutoML problem is. Can we add some metrics for that?
For example:

  • an eCDF plot for the cost distribution (i.e., a hard AutoML task should have only a few very well-performing configurations)
  • unimodality metric from the AutoML loss landscape paper (but evaluated on our surrogate models)
  • convexity metric from the AutoML loss landscape paper (but evaluated on our surrogate models)
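
The eCDF idea in the first bullet can be sketched in a few lines. This is only an illustration using the standard library: the function name ecdf and the cost values are hypothetical and are not part of DeepCAVE.

```python
from bisect import bisect_right


def ecdf(costs):
    """Return a function F where F(x) is the fraction of costs <= x."""
    ordered = sorted(costs)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one cost value")

    def F(x):
        # bisect_right counts how many sorted values are <= x.
        return bisect_right(ordered, x) / n

    return F


# Hypothetical costs of evaluated configurations (lower is better).
costs = [0.9, 0.8, 0.85, 0.1, 0.95, 0.7, 0.12, 0.88]
F = ecdf(costs)
print(F(0.2))  # fraction of configurations with cost <= 0.2
```

Under this reading, a small value of F at a low cost threshold means that only a few configurations perform very well, which is what the bullet describes as a hard task.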

Parallel Coordinates

Colouring based on config clusters (using e.g. HDBSCAN, which also allows "noisy observations" with no cluster affiliation)?
Each cluster could define a base color whose shading is determined by the final performance.

This might help identify commonalities between well-performing configurations.

ConfigCube Projections for higher dimensions.

When dealing with HP spaces of more than three dimensions, looking at 3D slices may become misleading. Maybe try some high-performing projection procedures? There is, for instance, UMAP and its successor. But be wary of the difference between transductive and inductive projections if you want to project sequentially.

Pre-commit hooks: Check and update

When doing a commit, error messages from the pre-commit hooks show up (with errors not regarding the changes made, but regarding the existing files). This needs to be checked and updated.

fANOVA shows nothing (NaN values from RF)

Hey,

first of all: thanks for that super nice tool.
It is really awesome.

Unfortunately, I have encountered a bug in the fANOVA plugin.

The values returned by the random forest (RF) are all NaN.
This might be due to some constant hyperparameters in the search space.

I've attached the results of an HpBandSter run to reproduce the error.

bohb_run.zip

Thanks in advance

[Question] Correct way of shutting down DeepCave

I assumed that when I close the DeepCAVE tab and kill the command-line application, DeepCAVE would shut down completely. In fact, the Dash port is still in use and is not freed within ~30 minutes (I am not sure when exactly it is freed; I just observed it still being taken quite a while afterwards). Is there a shutdown command of some sort, or a way to make this cleaner?
The issue right now is that if I want to restart, I have to change my config to a different port.
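
Until there is a clean shutdown, one way to see whether something is still listening on the port is a small check like the following. It is a sketch using only the standard library; the helper name is_port_in_use is hypothetical, and 8050 is simply the default address mentioned in the README. It only reports the state of the port and does not free it.

```python
import socket


def is_port_in_use(port, host="127.0.0.1", timeout=0.5):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 8050 is the port from the README's default URL.
    print(is_port_in_use(8050))
```

If this prints True after the command-line application was killed, a leftover process is probably still holding the port.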

Seaborn is missing as a dependency

Hi!
I just gave DeepCAVE a try (it is great) and noticed that seaborn seems to be missing as a dependency, as I got the following error after starting DeepCAVE:

Using config 'default'
Checking if redis-server is already running...
Could not connect to Redis at 127.0.0.1:6379: Connection refused
Redis server is not running. Starting...
Redis server successfully started.

-------------STARTING WORKER-------------

-------------STARTING SERVER-------------
Traceback (most recent call last):
  File "/home/klaus/miniconda3/envs/deepcave/lib/python3.9/site-packages/matplotlib/style/core.py", line 137, in use
    style = _rc_params_in_file(style)
  File "/home/klaus/miniconda3/envs/deepcave/lib/python3.9/site-packages/matplotlib/__init__.py", line 866, in _rc_params_in_file
    with _open_file_or_url(fname) as fd:
  File "/home/klaus/miniconda3/envs/deepcave/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/home/klaus/miniconda3/envs/deepcave/lib/python3.9/site-packages/matplotlib/__init__.py", line 843, in _open_file_or_url
    with open(fname, encoding='utf-8') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'seaborn'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/klaus/ws/DeepCAVE/deepcave/server.py", line 6, in <module>
    app.layout = MainLayout(config.PLUGINS)()
  File "/home/klaus/ws/DeepCAVE/deepcave/config.py", line 50, in PLUGINS
    from deepcave.plugins.hyperparameter.importances import Importances
  File "/home/klaus/ws/DeepCAVE/deepcave/plugins/hyperparameter/importances.py", line 13, in <module>
    from deepcave.utils.styled_plot import plt
  File "/home/klaus/ws/DeepCAVE/deepcave/utils/styled_plot.py", line 168, in <module>
    plt = StyledPlot()
  File "/home/klaus/ws/DeepCAVE/deepcave/utils/styled_plot.py", line 28, in __init__
    plt.style.use("seaborn")
  File "/home/klaus/miniconda3/envs/deepcave/lib/python3.9/site-packages/matplotlib/style/core.py", line 139, in use
    raise OSError(
OSError: 'seaborn' is not a valid package style, path of style file, URL of style file, or library style name (library styles are listed in `style.available`)
[1]    10519 terminated  deepcave --open

This is how I set it up:

git clone git@github.com:automl/DeepCAVE.git
cd DeepCAVE/
conda create -n DeepCAVE python=3.9
conda activate DeepCAVE
conda install -c anaconda swig
pip install -e .
sudo apt-get install redis-server
deepcave --open

After installing seaborn, everything worked:

conda install seaborn
deepcave --open

Seaborn is used here:

import seaborn as sns

and here:

plt.style.use("seaborn")

but is not part of the requirements.txt.

ModuleNotFoundError

I get the following error when trying to run deepcave --open: ModuleNotFoundError: No module named 'smac.epm.util_funcs'
I think SMAC renamed the module from util_funcs to utils.
The error occurred in DeepCAVE/deepcave/evaluators/epm/random_forest.py and should be fixed by changing the import.
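
If both module layouts need to be supported for a while, a fallback import is one option. The sketch below is generic: import_first_available is a hypothetical helper, and the demonstration uses a standard-library module so that it runs without SMAC installed. For the case above, the candidate list would be ["smac.epm.util_funcs", "smac.epm.utils"], where the second name is only the guess from this report and has not been verified.

```python
import importlib


def import_first_available(module_names):
    """Import and return the first module in module_names that exists."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ModuleNotFoundError:
            # Try the next candidate name.
            continue
    raise ModuleNotFoundError(
        f"none of these modules could be imported: {module_names}"
    )


# Demonstration with a made-up missing module followed by a stdlib module.
module = import_first_available(["no_such_module_xyz", "json"])
print(module.__name__)
```

Pinning the SMAC version in the requirements would be the simpler fix if only one layout has to work.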

Tests: Expand

Add more tests, especially for:

  • Run
  • Converters
  • Plugins (check API calls) and add correct typing

Add support for several other HPO tools

To reach as many potential users as possible, DeepCAVE should support more HPO tools. In particular, we should write converters for

  • SyneTune

  • BoTorch

  • HEBO

  • Ray Tune

  • Optuna

"state" Key missing in Run object when visualising BOHB runs with DeepCAVE

Hi, :)

I wanted to use the DeepCAVE framework to visualise some runs from HpBandSter. (I can't use SMAC because it has some issues with multi-node runs. I've already posted it.)
In deepcave.runs.converters.bohb, while creating the runs from the BOHB Result object, there is no "state" key in the info dict of the run objects returned by bohb.get_all_runs().

Line 66 in deepcave/runs/converters/bohb.py

status = bohb_run.info["state"]

The bohb.py code also assumes that the configspace is saved, but the library does not do this by default. I saved configspace.json separately, while hpbandster.core.result.json_result_logger saves results.json and configs.json.

Am I using the wrong version of HpBandSter? I used pip install hpbandster which installs version 0.7.4.

Warm Regards,

[Question] Example data?

Is it possible to include some example data so that first time users can try everything out without having to run anything?

Improve README

We would like to improve the README:

  • Add visualization GIFs
  • Add a very minimal example at the top
  • Add a very simple installation guide at the top

As part of this, we can also look into similar repos and what makes their READMEs great such that we can benefit from their ideas.

Documentation: installing redis without root access

The documentation includes instructions on how to install redis without root access, but one step that I needed is missing: after running 'make' in the redis directory, I also had to run 'make install' to be able to actually run the command.

PDF Report Feature

It would be very nice to have a feature that generates a PDF report from an analysis, covering in principle everything the tool has to offer. For this, a user would need to be able to define which analyses they would like to see and under which configuration.

UI error in Importances tab

Hi,

I've just started using the DeepCAVE tool and noticed that in a few screens, some fields remain active even if no run is selected. If you try editing these fields, an error is thrown. It only happens at the very beginning of operation, when no run is selected.

Workflow

  • Start DeepCAVE
  • Open the Parallel Coordinates Tab
  • Without selecting a Run, interact with the Show Important Hyperparameters, Limit Hyperparameters or Show Unsuccessful Configurations fields. Errors are thrown.

Similarly for the Importances tab when changing the Method, Trees or Limit Hyperparameters fields. Again, it only happens at the very beginning when no run has ever been selected for that session.

It doesn't affect functionality but I think it could be solved with a callback function (input being the id and value of the run menu dropdown and output being the id and visibility/enable property of the relevant fields).

Screens of the Parallel coordinates and Importance tabs where I saw this behavior.

UI_importance UI_ParallelCoord

Best,
Dipti

`type` object is not subscriptable

Will try to add from __future__ import annotations at the top.

File "/home/skantify/code/DeepCAVE/deepcave/evaluators/epm/random_forest.py", line 345, in RandomForest
    def _predict(self, X: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
TypeError: 'type' object is not subscriptable

Python 3.8.5
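
For context: subscripting built-in types such as tuple[...] in annotations only works at runtime from Python 3.9 on. Besides the from __future__ import annotations approach mentioned above, typing.Tuple also works on Python 3.8. The following is a minimal sketch with a hypothetical predict function that uses plain floats instead of NumPy arrays so that it is self-contained.

```python
from typing import List, Tuple


def predict(values: List[float]) -> Tuple[float, float]:
    """Return the mean and the (population) variance of values."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


print(predict([1.0, 3.0]))
```

Either approach avoids the TypeError on 3.8; which one fits better depends on whether annotations are inspected at runtime elsewhere in the code base.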

Enable using SMAC runs with multiple seeds

Currently, DeepCAVE does not support runs performed with deterministic=False. For example, when running examples/1_basics/1_quadratic_function.py with deterministic=False, loading the resulting run in DeepCAVE is not possible and gives the warning message "SMAC3v2: Multiple seeds are not supported..".

Hyperparameter Importance: Different values for same run

When using the API and calculating the importance of a run more than once, the importance values vary a lot. This happens for fANOVA and local importance (LPI). Example code is given here (execute it more than once and compare the values):

from deepcave.runs.converters.smac3v2 import SMAC3v2Run
from deepcave.evaluators.fanova import fANOVA
from deepcave.evaluators.lpi import LPI
import pandas as pd

run = SMAC3v2Run.from_path("../smac3_output/700b9ad7b27ba991278b31467cbe7fe6/700b9ad7b27ba991278b31467cbe7fe6/")
result = fANOVA(run)
result.calculate(objectives=run.get_objective('1-accuracy'), budget=max(run.get_budgets()), n_trees=10, seed=None)
df_importance = pd.DataFrame(result.get_importances(hp_names=None))
df_importance[df_importance>0].dropna(axis=1)

Improvements of Sidebar

We could add some more features to the sidebar, especially:

  • History of jobs -> list of jobs from static plugins that have been run already (currently disappear after they are clicked)

  • Favorites -> favorite runs to be used in all plugins (highlighted)

Display configids

When DeepCAVE is the first pillar of my HPO run, I might find some particular configs (or sets of configs) interesting and would like to examine them further. To do so, I need to be able to select configs and get their IDs back.

Cache

Right now, the cache is always reset after a few minutes. Saving results to your own JSON file is recommended.

Quality of Surrogate Models

Many of our analyses are based on surrogate models.
It would be fairly important to know how faithful these surrogate models actually are.
Could we add some insights regarding that? In the simplest case, we could start with the RMSE of the out-of-bag predictions.
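
As a starting point, the RMSE itself is simple to compute once out-of-bag predictions are available. The sketch below only covers that last step: rmse is a hypothetical helper, the numbers are made up, and how to obtain out-of-bag predictions depends on the random forest implementation and is not shown.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up observed costs and surrogate predictions for illustration.
observed = [0.0, 0.0]
predicted = [3.0, 4.0]
print(rmse(observed, predicted))
```

A value close to zero relative to the range of observed costs would suggest that the surrogate is faithful on those points.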

New plugin: Add symbolic explanations

Add a new plugin that applies symbolic regression (SR) to the meta-data gathered during HPO to obtain symbolic explanations for the dependency between hyperparameters and performance.

To be done:

  • Add Parsimony Hyperparameter
  • Possibly add other SR HPs
  • Add check if #OptimizedHPs != #ExplainedHPs (if so, run PDP before SR)
  • Replace X1 / X2 by HP name
