funsearch's Introduction

FunSearch

This repository accompanies the publication

Romera-Paredes, B. et al. Mathematical discoveries from program search with large language models. Nature (2023)

There are 6 independent directories:

  • cap_set contains functions discovered by FunSearch that construct large cap sets, and we also provide those cap sets in a numerical format for convenience.

  • admissible_set contains functions discovered by FunSearch that construct large admissible sets, and we also provide those admissible sets in a numerical format for convenience.

  • bin_packing contains heuristics discovered by FunSearch for online 1D bin packing problems, and an evaluation suite to reproduce the results reported in the paper.

  • cyclic_graphs contains functions discovered by FunSearch that construct large independent sets in strong products of cyclic graphs, and we also provide those sets in a numerical format for convenience.

  • corner_free_sets contains the discovered sets of indices, in numerical format, satisfying the combinatorial degeneration constraints described for the corners-free problem in the Supplementary Information.

  • implementation contains an implementation of the evolutionary algorithm, code manipulation routines, and a single-threaded implementation of the FunSearch pipeline. It does not contain language models for generating new programs, the sandbox for executing untrusted code, nor the infrastructure for running FunSearch on our distributed system. This directory is intended to be useful for understanding the details of our method, and for adapting it for use with any available language models, sandboxes, and distributed systems.
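As a rough illustration of how a heuristic plugs into online 1D bin packing, the sketch below packs items one at a time and lets a priority function choose among the bins that still have room. The names `online_pack` and `best_fit_priority` are hypothetical, and the classic best-fit rule stands in for a heuristic; it is not one of the heuristics discovered by FunSearch, and this is not the evaluation suite shipped in bin_packing.

```python
from typing import Callable, List, Sequence


def best_fit_priority(item: int, remaining: Sequence[int]) -> List[int]:
  """Best fit: a higher score means less free space would be left after placing the item."""
  return [-(space - item) for space in remaining]


def online_pack(
    items: Sequence[int],
    capacity: int,
    priority: Callable[[int, Sequence[int]], List[int]],
) -> int:
  """Packs items in arrival order and returns the number of bins used."""
  remaining: List[int] = []  # Free space in each open bin.
  for item in items:
    # Only bins with enough free space are candidates.
    feasible = [i for i, space in enumerate(remaining) if space >= item]
    if feasible:
      scores = priority(item, [remaining[i] for i in feasible])
      best = feasible[max(range(len(feasible)), key=scores.__getitem__)]
      remaining[best] -= item
    else:
      # No open bin fits the item, so open a new one.
      remaining.append(capacity - item)
  return len(remaining)


num_bins = online_pack([5, 7, 5, 3, 8, 2], capacity=10, priority=best_fit_priority)
print(num_bins)
```

Swapping `best_fit_priority` for another function with the same signature changes the packing policy without touching the packing loop, which is the kind of component a program search can evolve.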

Installation

No installation is required. All notebooks can be opened and run in Google Colab.

Usage

  • cap_set: The notebook cap_set.ipynb can be opened in Google Colab.

  • admissible_set: The notebook admissible_set.ipynb can be opened in Google Colab.

  • bin_packing: The notebook bin_packing.ipynb can be opened in Google Colab.

  • cyclic_graphs: The notebook cyclic_graphs.ipynb can be opened in Google Colab.

Citing this work

If you use the code or data in this package, please cite:

@Article{FunSearch2023,
  author  = {Romera-Paredes, Bernardino and Barekatain, Mohammadamin and Novikov, Alexander and Balog, Matej and Kumar, M. Pawan and Dupont, Emilien and Ruiz, Francisco J. R. and Ellenberg, Jordan and Wang, Pengming and Fawzi, Omar and Kohli, Pushmeet and Fawzi, Alhussein},
  journal = {Nature},
  title   = {Mathematical discoveries from program search with large language models},
  year    = {2023},
  doi     = {10.1038/s41586-023-06924-6}
}

License and disclaimer

Copyright 2023 DeepMind Technologies Limited

All software is licensed under the Apache License, Version 2.0 (Apache 2.0); you may not use this file except in compliance with the Apache 2.0 license. You may obtain a copy of the Apache 2.0 license at: https://www.apache.org/licenses/LICENSE-2.0

All other materials are licensed under the Creative Commons Attribution 4.0 International License (CC-BY). You may obtain a copy of the CC-BY license at: https://creativecommons.org/licenses/by/4.0/legalcode

Unless required by applicable law or agreed to in writing, all software and materials distributed here under the Apache 2.0 or CC-BY licenses are distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the licenses for the specific language governing permissions and limitations under those licenses.

This is not an official Google product.

funsearch's People

Contributors

eltociear, matejbalog

funsearch's Issues

Lack of Implementation for Large Language Models (LLMs)

In the published paper on FunSearch, there is a mention of using pre-trained large language models (LLMs) like Codey (based on the PaLM2 model family) and a reference to StarCoder, an open-source LLM, in the supplementary information. However, the current GitHub repository for FunSearch does not include implementations or integration guidelines for these LLMs.

This issue is particularly evident in the sampler.py file, where the LLM class seems to be a placeholder without an actual implementation:

class LLM:
  """Language model that predicts continuation of provided source code."""

  def __init__(self, samples_per_prompt: int) -> None:
    self._samples_per_prompt = samples_per_prompt

  def _draw_sample(self, prompt: str) -> str:
    """Returns a predicted continuation of `prompt`."""
    raise NotImplementedError('Must provide a language model.')

Suggested Resolution:

  • It would be greatly beneficial for the community if the repository could include a basic implementation or integration guide for an open-source LLM, especially StarCoder, which was referenced in the paper.
  • Providing such an implementation or guide would enhance the reproducibility and usability of the FunSearch project for researchers and developers looking to explore or build upon this work.
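A minimal sketch of what such an integration might look like, assuming only the `LLM` interface quoted above: a subclass that forwards each prompt to an arbitrary `prompt -> continuation` callable. `CallableLLM`, `complete_fn` and `fake_model` are hypothetical names introduced for illustration; `fake_model` returns a fixed string and stands in for a real model call, such as a request to a hosted StarCoder endpoint.

```python
from typing import Callable


class LLM:
  """Copy of the placeholder class quoted above, so this sketch runs on its own."""

  def __init__(self, samples_per_prompt: int) -> None:
    self._samples_per_prompt = samples_per_prompt

  def _draw_sample(self, prompt: str) -> str:
    """Returns a predicted continuation of `prompt`."""
    raise NotImplementedError('Must provide a language model.')


class CallableLLM(LLM):
  """Hypothetical subclass that delegates sampling to a user-supplied callable."""

  def __init__(self, samples_per_prompt: int,
               complete_fn: Callable[[str], str]) -> None:
    super().__init__(samples_per_prompt)
    self._complete_fn = complete_fn

  def _draw_sample(self, prompt: str) -> str:
    """Returns whatever continuation the wrapped callable produces."""
    return self._complete_fn(prompt)


def fake_model(prompt: str) -> str:
  """Stand-in for a real model: always proposes the same function body."""
  return '  return 0.0\n'


llm = CallableLLM(samples_per_prompt=2, complete_fn=fake_model)
sample = llm._draw_sample('def priority(item, bins):\n')
print(repr(sample))
```

Keeping the model behind a plain callable means the rest of the pipeline would not need to know which model or client library produces the continuation.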

Looking forward to any updates or guidance on this matter.

Evaluator sandbox function potentially missing 'sample' argument

In implementation/evaluator.py, we have
test_output, runs_ok = self._sandbox.run(program, self._function_to_run, current_input, self._timeout_seconds)

program = full original specification
self._function_to_run = name of the run function in program specification to evaluate and score
current_input = current test input to run on
self._timeout_seconds = timeout_seconds, default being 30 seconds

However, it seems to me that we want the sandbox to run the sample (new_function) instead of self._function_to_run? Unless self._function_to_run is updated to be the sample / new_function somewhere that I'm missing?

Happy to make a PR with this edit to make it easier and more clear to follow the implementation details, but just wanted to clarify if my understanding is correct.
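To make the question concrete, here is a toy stand-in for a sandbox with the same call shape as the line quoted above. It assumes, without confirming it against the repository, that `program` already contains the sampled function body spliced into the specification; under that assumption, calling the run function by name exercises the sample indirectly. `ToySandbox` is hypothetical, ignores the timeout, and is NOT safe for untrusted code because it executes the program in-process.

```python
from typing import Any, Dict, Tuple


class ToySandbox:
  """Hypothetical, insecure stand-in: executes the program with `exec`."""

  def run(self, program: str, function_to_run: str, test_input: Any,
          timeout_seconds: int) -> Tuple[Any, bool]:
    """Returns (output, True) on success and (None, False) on any error."""
    del timeout_seconds  # Not enforced in this sketch.
    namespace: Dict[str, Any] = {}
    try:
      exec(program, namespace)  # Defines every function in the program.
      return namespace[function_to_run](test_input), True
    except Exception:
      return None, False


# Assumed layout: `priority` plays the role of the sampled function and
# `evaluate` plays the role of the run function that scores it.
program = '''
def priority(x):
  return x * 2

def evaluate(n):
  return sum(priority(i) for i in range(n))
'''

test_output, runs_ok = ToySandbox().run(program, 'evaluate', 4, 30)
print(test_output, runs_ok)
```

In this layout the sandbox never needs the sample as a separate argument, because the run function reaches it through the shared namespace.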

No module named 'funsearch.implementation'; 'funsearch' is not a package

Is a requirements file missing? I keep getting an error saying that implementation.funsearch is missing.

Traceback (most recent call last):
  File "/Users/user1/Documents/fun/funsearch/implementation/funsearch.py", line 20, in <module>
    from funsearch.implementation import code_manipulation
  File "/Users/user1/Documents/fun/funsearch/implementation/funsearch.py", line 20, in <module>
    from funsearch.implementation import code_manipulation
ModuleNotFoundError: No module named 'funsearch.implementation'; 'funsearch' is not a package
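One plausible cause, offered as an assumption and not as a confirmed diagnosis: when funsearch.py is run directly as a script, its own directory comes first on the import path, so `import funsearch` resolves to funsearch.py itself, which is a plain module and not a package. The sketch below reproduces that situation in a temporary directory; the one-line script is a stand-in for illustration, not the repository file.

```python
import pathlib
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
  # A script named funsearch.py that imports from a `funsearch` package.
  script = pathlib.Path(tmp) / 'funsearch.py'
  script.write_text('from funsearch.implementation import code_manipulation\n')
  # Running the script puts its directory first on sys.path, so the name
  # `funsearch` resolves to this very file.
  result = subprocess.run(
      [sys.executable, str(script)], capture_output=True, text=True)

error_line = result.stderr.strip().splitlines()[-1]
print(error_line)
```

If this is what happened, running the code from the directory that contains the funsearch folder, or adjusting the import to match the actual directory layout, may resolve it; a requirements file would not.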

Why are n islands used?

Hello Team.

I really appreciate the work you did; it's very useful for solving math problems.

One doubt I have:
If the clusters within an island already store all the programs with the same score, why do we need n islands? Isn't one island enough?

Please help me understand this.

Thank you in advance.
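As a rough illustration of what several islands can offer that one cannot, the sketch below assumes an island-model scheme in which the weaker half of the islands is periodically restarted from programs found on the stronger half. With a single island there would be nothing to restart from, and the whole search could stall in one local optimum. `reset_weaker_islands` and its arguments are hypothetical names; this is a simplified sketch, not the repository's code.

```python
import random
from typing import List


def reset_weaker_islands(best_scores: List[float], best_programs: List[str],
                         rng: random.Random) -> List[int]:
  """Overwrites the worse half of the islands with a founder from the better half.

  Both lists are indexed by island and modified in place. Returns the ids of
  the islands that were reset.
  """
  num_islands = len(best_scores)
  # Island ids ordered from worst to best score.
  order = sorted(range(num_islands), key=lambda i: best_scores[i])
  num_reset = num_islands // 2
  reset_ids, keep_ids = order[:num_reset], order[num_reset:]
  for island_id in reset_ids:
    founder_id = rng.choice(keep_ids)  # Pick a surviving island at random.
    best_programs[island_id] = best_programs[founder_id]
    best_scores[island_id] = best_scores[founder_id]
  return reset_ids


scores = [1.0, 5.0, 3.0, 0.5]
programs = ['a', 'b', 'c', 'd']
reset_ids = reset_weaker_islands(scores, programs, random.Random(0))
print(reset_ids, scores, programs)
```

Between resets each island evolves independently, so separate islands can explore different families of programs even when their scores are similar.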
