Metadata
This guide covers common developer uses of the Metadata primitive.
OSS Vizier can store Metadata in both the ProblemStatement and each TrialSuggestion/Trial. Common use cases include:
Holding additional information that does not fit the standard parameter types.
Allowing user code to store small amounts of state information inside OSS Vizier, attached to the OSS Vizier study.
Wrapping search spaces and corresponding algorithms that are naturally incompatible with OSS Vizier’s default API, so that they can still use OSS Vizier as a distributed backend service.
Installation and reference imports
!pip install google-vizier
from vizier import pyvizier as vz
from google.protobuf import any_pb2
Metadata basics
Metadata is a key-value store, where:
Keys are UTF-8 strings.
Values can be strings or protocol buffers.
Values of type int, float, and more complex objects can also be stored, but the developer is responsible for serializing and deserializing them.
metadata = vz.Metadata()
metadata['proto'] = any_pb2.Any(...)
metadata['string'] = 'hello'
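The note above about serializing non-string values can be made concrete with a minimal sketch. It uses the standard-library json module as one possible encoding (an assumption; any string encoding would work), and a plain dict named store stands in for a vz.Metadata object so that the snippet runs without OSS Vizier installed. The keys and values are hypothetical.

```python
import json

# Plain dict used as a stand-in for vz.Metadata (assumption for illustration);
# the same string values could be assigned to a real Metadata object.
store = {}

# Serialize non-string values to strings before storing them.
store['num_epochs'] = json.dumps(30)
store['learning_rate'] = json.dumps(0.001)
store['augmentation'] = json.dumps({'flip': True, 'crop_size': 32})

# Deserialize when reading the values back.
num_epochs = json.loads(store['num_epochs'])
learning_rate = json.loads(store['learning_rate'])
augmentation = json.loads(store['augmentation'])
```

Keeping every stored value a string means the reader of the metadata only needs to know which encoding was used for each key.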
Additionally, Metadata can act as a “dictionary of dictionaries”, i.e. a hierarchy of dictionaries, through its namespace functionality. Calling .ns() returns another Metadata that shares data with the original.
child_metadata = metadata.ns('child')
grandchild_metadata = child_metadata.ns('child')
grandchild_metadata['string'] = 'goodbye'
assert metadata.ns('child').ns('child')['string'] == 'goodbye'
ProblemStatement Metadata
The ProblemStatement object contains a metadata attribute, intended for storing global metadata related to the study. Note that Metadata is not used in the optimization process unless a custom algorithm is configured to use it.
Below is a usage example for training an image classifier, where one may wish to store training-related attributes in Metadata.
problem_statement = vz.ProblemStatement()
problem_statement.metadata['dataset'] = 'cifar10'
problem_statement.metadata['architecture'] = 'resnet_18'
Trial Metadata
TrialSuggestion and its subclass Trial also contain a metadata attribute. In contrast, this should be used to store metadata related to the specific Trial.
In the image classification case, examples would be the type of GPU used for training and whether the training worker has been preempted.
trial = vz.Trial()
trial.metadata['gpu_used'] = 'P100'
trial.metadata['preempted'] = 'True'
OSS Vizier as a backend via Metadata
As an advanced developer use case, one may extend OSS Vizier’s search space capabilities using Metadata. Custom algorithms have full freedom to express more complex search spaces (e.g. graphs) using Metadata.
Example use cases:
Combinatorial optimization, where the search space may consist of graphs or multiple selection (e.g. \({N \choose K}\)) primitives. Algorithms commonly include evolutionary methods, which also require custom mutation operations.
Free-form textual data used for suggestions (and possibly even evaluation metrics), as is common in language-based applications.
# Setup combinatorial search space.
choose_problem = vz.ProblemStatement()
choose_problem.metadata = vz.Metadata({'N': '10', 'K': '3'})
# Example of a suggestion proposed by a custom algorithm.
suggestion = vz.TrialSuggestion()
suggestion.metadata['chosen_indices'] = '[0, 3, 7]'
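Because the selection is stored as a string, the code that evaluates a suggestion has to decode and check it. The sketch below shows one way to do that with the standard-library json module; decode_choice is a hypothetical helper, not part of the OSS Vizier API, and plain dicts stand in for the Metadata objects so that the snippet runs on its own.

```python
import json
from typing import List, Mapping


def decode_choice(problem_metadata: Mapping[str, str],
                  suggestion_metadata: Mapping[str, str]) -> List[int]:
    """Parses and validates a K-of-N selection stored as strings."""
    n = int(problem_metadata['N'])
    k = int(problem_metadata['K'])
    indices = json.loads(suggestion_metadata['chosen_indices'])
    # The selection must contain exactly K distinct indices.
    if len(indices) != k or len(set(indices)) != k:
        raise ValueError(f'Expected {k} distinct indices, got {indices}.')
    # Every index must refer to one of the N available items.
    if any(not 0 <= i < n for i in indices):
        raise ValueError(f'Indices must lie in [0, {n}), got {indices}.')
    return sorted(indices)


# Plain dicts stand in for the Metadata objects (assumption for illustration).
chosen = decode_choice({'N': '10', 'K': '3'},
                       {'chosen_indices': '[0, 3, 7]'})
```

Validating at decode time keeps malformed suggestions from a custom algorithm out of the evaluation code.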
The algorithm's behavior can even be changed mid-optimization by updating Metadata from a client. This is used extensively in our integrations with PyGlove to allow a running Pythia policy to change search spaces or mutations online.
# Original mutation rate.
mutation_problem = vz.ProblemStatement()
mutation_problem.metadata = vz.Metadata({'mutation_rate': '0.1'})
# ...
# Assume algorithm started running in the Pythia service.
# ...
# Set new mutation rate.
study_metadata = vz.Metadata({'mutation_rate': '0.2'})
# Prevent this trial from being used in the population.
trial_metadata = vz.Metadata({'use_in_population': 'False'})
trial_id = 1
# Create unit of metadata update.
metadata_delta = vz.MetadataDelta(
    on_study=study_metadata, on_trials={trial_id: trial_metadata})
Once we have a client, we can commit the metadata update:
client.update_metadata(metadata_delta)
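On the algorithm side, the updated values arrive as strings and need to be parsed before use. The following is a minimal sketch of how a custom algorithm might do so. read_mutation_rate, use_in_population, and DEFAULT_MUTATION_RATE are hypothetical names, not part of the OSS Vizier API, and plain dicts stand in for Metadata objects so that the snippet runs without a Pythia service.

```python
from typing import Mapping

# Hypothetical fallback used when the study carries no mutation rate.
DEFAULT_MUTATION_RATE = 0.1


def read_mutation_rate(study_metadata: Mapping[str, str]) -> float:
    """Returns the mutation rate stored as a string, or the default."""
    raw = study_metadata.get('mutation_rate')
    if raw is None:
        return DEFAULT_MUTATION_RATE
    rate = float(raw)
    if not 0.0 <= rate <= 1.0:
        raise ValueError(f'mutation_rate must be in [0, 1], got {rate}.')
    return rate


def use_in_population(trial_metadata: Mapping[str, str]) -> bool:
    """Trials without the flag are included by default (assumption)."""
    return trial_metadata.get('use_in_population', 'True') == 'True'


# Plain dicts stand in for the study and trial Metadata (assumption).
new_rate = read_mutation_rate({'mutation_rate': '0.2'})
keep_trial = use_in_population({'use_in_population': 'False'})
```

Re-reading these values each time the algorithm is asked for suggestions is what lets a client-side update take effect mid-optimization.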