{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "wEvqTaRNtS5z"
      },
      "source": [
        "[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/google/vizier/blob/main/docs/guides/benchmarks/running_benchmarks.ipynb)\n",
        "\n",
        "# Running Benchmarks\n",
        "This guide demonstrates how to use the Vizier benchmark runner pipeline."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Qi88APk7Qy4d"
      },
      "source": [
        "## Installation and reference imports"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "zHUVrnl9wnhO"
      },
      "outputs": [],
      "source": [
        "!pip install google-vizier[jax,algorithms]"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "eGzQYe6ZcP7z"
      },
      "outputs": [],
      "source": [
        "from vizier import algorithms as vza\n",
        "from vizier import benchmarks as vzb\n",
        "from vizier.algorithms import designers\n",
        "from vizier.benchmarks import experimenters"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "BEuMSlNlc_FX"
      },
      "source": [
        "We first define an example experimenter and designer factory, which are used throughout the rest of this guide."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "AIwgqoH_dAte"
      },
      "outputs": [],
      "source": [
        "experimenter = experimenters.NumpyExperimenter(\n",
        "    experimenters.bbob.Sphere, experimenters.bbob.DefaultBBOBProblemStatement(5)\n",
        ")\n",
        "\n",
        "designer_factory = designers.GridSearchDesigner.from_problem"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Erk6WFp7Q1Y4"
      },
      "source": [
        "## Algorithms and Experimenters\n",
        "Every study can be seen conceptually as a simple loop between an algorithm and objective. In terms of code, the algorithm corresponds to a `Designer`/`Policy` and objective to an `Experimenter`.\n",
        "\n",
        "Below is a simple sequential loop."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "sXEOi4Vhl7qL"
      },
      "outputs": [],
      "source": [
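        "designer = designer_factory(experimenter.problem_statement())\n",
        "\n",
        "for _ in range(100):\n",
        "  # Request a single suggestion and convert it into a trial.\n",
        "  suggestion = designer.suggest()[0]\n",
        "  trial = suggestion.to_trial()\n",
        "  # Evaluate the trial in place; the experimenter completes it with a measurement.\n",
        "  experimenter.evaluate([trial])\n",
        "  # Report the completed trial back to the designer.\n",
        "  completed_trials = vza.CompletedTrials([trial])\n",
        "  designer.update(completed_trials, vza.ActiveTrials())"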
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "wZlArEz1l4Kt"
      },
      "source": [
        "The loop above suggests and evaluates trials one at a time. One modification is\n",
        "to use variable batch sizes instead. More generally, several implementation\n",
        "questions may arise:\n",
        "\n",
        "*   How many parallel suggestions should the algorithm generate?\n",
        "*   How many suggestions can be evaluated at once?\n",
        "*   Should we use early stopping on certain unpromising trials?\n",
        "*   Should we use a custom stopping condition instead of a fixed for-loop?\n",
        "*   Can we swap in a different algorithm mid-loop?\n",
        "*   Can we swap in a different objective mid-loop?"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "tkgb2C0hPsvG"
      },
      "source": [
        "## API\n",
        "Simulating these real-life scenarios requires flexible code, which complicates\n",
        "matters because the evaluation benchmark may no longer be stateless. In order\n",
        "to broadly cover such scenarios, our [API](https://github.com/google/vizier/blob/main/vizier/benchmarks/__init__.py) introduces the `BenchmarkSubroutine`:\n",
        "\n",
        "```python\n",
        "class BenchmarkSubroutine(Protocol):\n",
        "  \"\"\"Abstraction for core benchmark routines.\n",
        "\n",
        "  Benchmark protocols are modular alterations of BenchmarkState by reference.\n",
        "  \"\"\"\n",
        "\n",
        "  def run(self, state: BenchmarkState) -\u003e None:\n",
        "    \"\"\"Abstraction to alter BenchmarkState by reference.\"\"\"\n",
        "```"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-QUBdrhgqFlB"
      },
      "source": [
        "All routines use and potentially modify a `BenchmarkState`, which holds information about the objective via an `Experimenter` and the algorithm itself wrapped by a `PolicySuggester`.\n",
        "\n",
        "```python\n",
        "class BenchmarkState:\n",
        "  \"\"\"State of a benchmark run. It is altered via benchmark protocols.\"\"\"\n",
        "\n",
        "  experimenter: Experimenter\n",
        "  algorithm: PolicySuggester\n",
        "```"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "zyOVOvGtqPrf"
      },
      "source": [
        "To chain multiple `BenchmarkSubroutine`s together, we can use the `BenchmarkRunner`:\n",
        "\n",
        "```python\n",
        "class BenchmarkRunner(BenchmarkSubroutine):\n",
        "  \"\"\"Run a sequence of subroutines, all repeated for a few iterations.\"\"\"\n",
        "\n",
        "  # A sequence of benchmark subroutines that alter BenchmarkState.\n",
        "  benchmark_subroutines: Sequence[BenchmarkSubroutine]\n",
        "  # Number of times to repeat applying benchmark_subroutines.\n",
        "  num_repeats: int\n",
        "\n",
        "  def run(self, state: BenchmarkState) -\u003e None:\n",
        "    \"\"\"Run algorithm with benchmark subroutines with repetitions.\"\"\"\n",
        "```"
      ]
    },
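    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As a minimal, library-free sketch of this pattern (not Vizier's implementation), the toy classes below mirror the roles of `BenchmarkState`, `BenchmarkSubroutine`, and `BenchmarkRunner`. All names prefixed with `Toy` are hypothetical, and the objective `x * x` and the uniform random suggestions are assumptions chosen only for illustration.\n",
        "\n",
        "```python\n",
        "import dataclasses\n",
        "import random\n",
        "from typing import Protocol, Sequence\n",
        "\n",
        "\n",
        "@dataclasses.dataclass\n",
        "class ToyState:\n",
        "  # Hypothetical stand-in for BenchmarkState: pending and evaluated points.\n",
        "  rng: random.Random\n",
        "  active: list = dataclasses.field(default_factory=list)\n",
        "  completed: list = dataclasses.field(default_factory=list)\n",
        "\n",
        "\n",
        "class ToySubroutine(Protocol):\n",
        "  # Anything with a run(state) method that alters the state by reference.\n",
        "  def run(self, state: ToyState) -> None:\n",
        "    ...\n",
        "\n",
        "\n",
        "@dataclasses.dataclass\n",
        "class ToySuggest:\n",
        "  # Appends num_suggestions random 1-D points to state.active.\n",
        "  num_suggestions: int = 1\n",
        "\n",
        "  def run(self, state: ToyState) -> None:\n",
        "    for _ in range(self.num_suggestions):\n",
        "      state.active.append(state.rng.uniform(-5.0, 5.0))\n",
        "\n",
        "\n",
        "@dataclasses.dataclass\n",
        "class ToyEvaluate:\n",
        "  # Scores every active point with x * x and moves it to state.completed.\n",
        "  def run(self, state: ToyState) -> None:\n",
        "    state.completed.extend((x, x * x) for x in state.active)\n",
        "    state.active.clear()\n",
        "\n",
        "\n",
        "@dataclasses.dataclass\n",
        "class ToyRunner:\n",
        "  # Applies the subroutines in order, num_repeats times.\n",
        "  subroutines: Sequence[ToySubroutine]\n",
        "  num_repeats: int\n",
        "\n",
        "  def run(self, state: ToyState) -> None:\n",
        "    for _ in range(self.num_repeats):\n",
        "      for subroutine in self.subroutines:\n",
        "        subroutine.run(state)\n",
        "\n",
        "\n",
        "state = ToyState(rng=random.Random(0))\n",
        "runner = ToyRunner([ToySuggest(num_suggestions=3), ToyEvaluate()], num_repeats=10)\n",
        "runner.run(state)\n",
        "print(len(state.completed), len(state.active))  # prints: 30 0\n",
        "```\n",
        "\n",
        "Because `ToyRunner` itself exposes `run(state)`, a runner can be listed as a subroutine of another runner. All mutable data lives in the state object, so each subroutine stays small and can be reordered or swapped without touching the others."
      ]
    },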
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "CVU_X3Wxo-8e"
      },
      "source": [
        "## Example usage\n",
        "Below is a typical example of simple suggestion and evaluation:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "sNZv5Bj6ou6n"
      },
      "outputs": [],
      "source": [
        "runner = vzb.BenchmarkRunner(\n",
        "    benchmark_subroutines=[\n",
        "        vzb.GenerateSuggestions(),\n",
        "        vzb.EvaluateActiveTrials(),\n",
        "    ],\n",
        "    num_repeats=100,\n",
        ")\n",
        "\n",
        "benchmark_state_factory = vzb.DesignerBenchmarkStateFactory(\n",
        "    experimenter=experimenter, designer_factory=designer_factory\n",
        ")\n",
        "benchmark_state = benchmark_state_factory()\n",
        "\n",
        "runner.run(benchmark_state)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "T4A24qM1rEM5"
      },
      "source": [
        "The evaluated trials can be retrieved from `benchmark_state`: its `algorithm`\n",
        "field (a `PolicySuggester`) holds a `PolicySupporter` that stores all trials:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "DkJ801PPrNLK"
      },
      "outputs": [],
      "source": [
        "all_trials = benchmark_state.algorithm.supporter.trials\n",
        "print(all_trials)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "D01yeY4XseNb"
      },
      "source": [
        "Note that this design exposes everything that has happened so far in the\n",
        "study. For instance, we may also query incomplete or unused suggestions\n",
        "using the `PolicySupporter`."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "eu-zL7_Bs9Kt"
      },
      "source": [
        "## References\n",
        "*   Benchmark runner implementations can be found [here](https://github.com/google/vizier/tree/main/vizier/_src/benchmarks/runners)."
      ]
    }
  ],
  "metadata": {
    "colab": {
      "name": "Running Benchmarks.ipynb",
      "private_outputs": true,
      "provenance": []
    },
    "gpuClass": "standard",
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}