OSS Vizier as a Backend
We demonstrate how OSS Vizier can be used as a distributed backend for PyGlove-based tuning tasks.
This assumes the user is already familiar with PyGlove primitives.
Installation and reference imports
!pip install google-vizier
!pip install pyglove
import multiprocessing
import multiprocessing.pool
import os
import pyglove as pg
from vizier import pyglove as pg_vizier
from vizier.service import servers
Preliminaries
In the original (non-distributed) PyGlove setting, one can perform evolutionary computation locally, for example:
search_space = pg.Dict(x=pg.floatv(0.0, 1.0), y=pg.floatv(0.0, 1.0))
algorithm = pg.evolution.regularized_evolution()
num_trials = 100
def evaluator(value: pg.Dict):
  return value.x**2 - value.y**2
for value, feedback in pg.sample(
    search_space,
    algorithm=algorithm,
    num_examples=num_trials,
    name='basic_run',
):
  reward = evaluator(value)
  feedback(reward=reward)
However, in many real-world scenarios, the evaluator can be far more expensive. In neural architecture search applications, for example, evaluating a single candidate may require running an entire neural network training pipeline.
This leads to the need for a backend, in order to:
Distribute the evaluations over multiple workers.
Store the valuable results reliably and handle worker faults.
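For concreteness, an expensive evaluator could look like the following hypothetical sketch, in which each evaluation launches a full training run (build_model and train_and_evaluate are illustrative placeholders, not part of PyGlove or OSS Vizier):
def expensive_evaluator(value: pg.Dict):
  # Hypothetical: each evaluation trains and evaluates a model, which may
  # take hours on dedicated accelerators.
  model = build_model(dropout=value.x, label_smoothing=value.y)  # illustrative helper
  return train_and_evaluate(model, num_steps=100_000)  # illustrative helper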
Initializing the OSS Vizier backend
The main initializer to call is vizier.pyglove.init(...), which should only be called once per process (not thread). This function sets global Python variables that determine values such as:
Prefix for study names.
Endpoint of the VizierService for storing data and handling requests.
Port for the PythiaService for computing suggestions.
In the local case, this can be called as-is:
pg_vizier.init('my_study')
Alternatively, if using a remote server, the endpoint can be specified as well:
server = servers.DefaultVizierServer() # Normally hosted on a remote machine.
pg_vizier.init('my_study', vizier_endpoint=server.endpoint)
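Once init has been called, subsequent pg.sample calls in the same process are served by the OSS Vizier backend, so the familiar evaluation loop works unchanged. A minimal single-worker sketch, reusing the search space, algorithm, and evaluator from above (the study name 'sanity_run' is illustrative):
# Runs a few trials against the OSS Vizier backend initialized above.
for value, feedback in pg.sample(
    search_space,
    algorithm=algorithm,
    num_examples=5,
    name='sanity_run',
):
  feedback(reward=evaluator(value))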
Parallelization
Because all workers share the OSS Vizier backend, they can conveniently use exactly the same evaluation loop to work on a study:
NUM_WORKERS = 10

def work_fn(worker_id):
  print(f"Worker ID: {worker_id}")
  for value, feedback in pg.sample(
      search_space,
      algorithm=algorithm,
      num_examples=num_trials // NUM_WORKERS,
      name="worker_run",
  ):
    reward = evaluator(value)
    feedback(reward=reward)
There are three common forms of parallelization over the evaluation computation:
Multiple threads, single process.
Multiple processes, single machine.
Multiple machines.
Each of these cases defines the “worker”, which can be a thread, a process, or a machine, respectively. We demonstrate each form of parallelization below.
Multiple threads, single process
with multiprocessing.pool.ThreadPool(NUM_WORKERS) as pool:
  pool.map(work_fn, range(NUM_WORKERS))
Multiple processes, single machine
processes = []
for worker_id in range(NUM_WORKERS):
  p = multiprocessing.Process(target=work_fn, args=(worker_id,))
  p.start()
  processes.append(p)

for p in processes:
  p.join()
Multiple machines
# Server Machine
server = servers.DefaultVizierServer()
# Worker Machine
worker_id = os.uname()[1]  # Use the hostname as the worker ID.
pg_vizier.init('my_study', vizier_endpoint=server.endpoint)
work_fn(worker_id)
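Note that the snippet above reads server.endpoint directly only because both cells run in the same demonstration runtime; a genuinely separate worker machine has no access to the server object and must receive the endpoint out of band. A minimal sketch, assuming the address is passed via a hypothetical VIZIER_ENDPOINT environment variable:
# Hypothetical: the server address (a 'host:port' string) is supplied via an
# environment variable or command-line flag instead of a shared Python object.
endpoint = os.environ['VIZIER_ENDPOINT']
pg_vizier.init('my_study', vizier_endpoint=endpoint)
work_fn(os.uname()[1])  # Use the hostname as the worker ID.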